/*	@(#)bzero.s	4.1	(ULTRIX)	7/3/90	*/
/* ------------------------------------------------------------------ */
/* | Copyright Unpublished, MIPS Computer Systems, Inc.  All Rights | */
/* | Reserved.  This software contains proprietary and confidential | */
/* | information of MIPS and its suppliers.  Use, disclosure or     | */
/* | reproduction is prohibited without the prior express written   | */
/* | consent of MIPS.                                               | */
/* ------------------------------------------------------------------ */
/* $Header: bzero.s,v 1.1 87/02/16 11:17:20 dce Exp $ */

#include <mips/regdef.h>
#include <mips/asm.h>

#define	NBPW	4

/*
 * bzero(dst, bcount)
 * Zero block of memory
 *
 * Calculating MINZERO, assuming 50% cache-miss on non-loop code:
 *	Overhead =~ 18 instructions => 63 (81) cycles
 *	Byte zero =~ 16 (24) cycles/word for 08M44 (08V11)
 *	Word zero =~ 3 (6) cycles/word for 08M44 (08V11)
 * If I-cache-miss nears 0, MINZERO ==> 4 bytes; otherwise, times are:
 *	breakeven (MEM) = 63 / (16 - 3) =~ 5 words
 *	breakeven (VME) = 81 / (24 - 6)  =~ 4.5 words
 * Since the overhead is pessimistic (worst-case alignment), and many calls
 * will be for well-aligned data, and since Word-zeroing at least leaves
 * the zero in the cache, we shade these values (18-20) down to 12
 */
#define	MINZERO	12

/* It turns out better to think of lwl/lwr and swl/swr as
   smaller-vs-bigger address rather than left-vs-right.
   Such a representation makes the code endian-independent. */

#ifdef MIPSEB
#    define LWS lwl
#    define LWB lwr
#    define SWS swl
#    define SWB swr
#else
#    define LWS lwr
#    define LWB lwl
#    define SWS swr
#    define SWB swl
#endif

LEAF(bzero)
XLEAF(blkclr)
	subu	v1,zero,a0		# number of bytes til aligned
	blt	a1,MINZERO,bytezero
	and	v1,NBPW-1
	subu	a1,v1			# BDSLOT
	beq	v1,zero,blkzero		# already aligned
	SWS	zero,0(a0)
	addu	a0,v1

/*
 * zero 32 byte, aligned block
 */
blkzero:
	and	v0,a1,31		# count after by-32-byte loop done
	subu	a3,a1,v0		# 32-byte chunks
	beq	a1,v0,wordzero		# less than a chunk to zero
	addu	a3,a0			# dst endpoint
1:	sw	zero,0(a0)
	sw	zero,4(a0)
	sw	zero,8(a0)
	sw	zero,12(a0)
	addu	a0,32
	sw	zero,-16(a0)
	sw	zero,-12(a0)
	sw	zero,-8(a0)
	sw	zero,-4(a0)
	bne	a0,a3,1b
	move	a1,v0			# set count after loop

wordzero:
	and	v0,a1,(NBPW-1)		# count after by-word loop done
	subu	a3,a1,v0		# word chunks
	beq	a1,v0,bytezero		# less than a word to zero
	addu	a3,a0			# dst endpoint
1:	addu	a0,NBPW
	sw	zero,-NBPW(a0)
	bne	a0,a3,1b
	move	a1,v0			# set count after loop

bytezero:
	ble	a1,zero,zerodone
	addu	a1,a0			# dst endpoint
1:	addu	a0,1
	sb	zero,-1(a0)
	bne	a0,a1,1b

zerodone:
	j	ra
.end bzero
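
The control flow of the routine above can be easier to follow in a higher-level form. The following is a minimal C sketch of the same four phases (align the head, zero 32-byte chunks, zero remaining words, zero trailing bytes). It is an illustrative model only, not part of the ULTRIX source: the name `bzero_model` and the helper `store_zero_word` are hypothetical, the count is taken as an unsigned `size_t` whereas the assembly compares it as a signed value, and the head is cleared byte by byte where the assembly uses a single `SWS` (partial-word) store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBPW    4   /* bytes per word, same value as in bzero.s */
#define MINZERO 12  /* below this count only byte stores are used */

/* Hypothetical helper: models one "sw zero,off(a0)" word store. */
static void store_zero_word(unsigned char *p)
{
    static const uint32_t zero_word = 0;
    memcpy(p, &zero_word, NBPW);
}

/* Hypothetical C model of the bzero.s control flow (not the original). */
void bzero_model(void *dst, size_t bcount)
{
    unsigned char *p = dst;
    unsigned char *end;
    size_t rem;

    if (bcount >= MINZERO) {
        /* Head: bytes needed to reach a word boundary, (-addr) & 3. */
        size_t head = (size_t)(-(uintptr_t)p) & (NBPW - 1);
        bcount -= head;
        while (head-- > 0)
            *p++ = 0;
        assert(((uintptr_t)p & (NBPW - 1)) == 0);

        /* blkzero: aligned 32-byte chunks, eight word stores per pass. */
        rem = bcount & 31;
        end = p + (bcount - rem);
        while (p != end) {
            for (int off = 0; off < 32; off += NBPW)
                store_zero_word(p + off);
            p += 32;
        }
        bcount = rem;

        /* wordzero: whole words left over after the chunk loop. */
        rem = bcount & (NBPW - 1);
        end = p + (bcount - rem);
        while (p != end) {
            store_zero_word(p);
            p += NBPW;
        }
        bcount = rem;
    }

    /* bytezero: short requests and the final 0..3 trailing bytes. */
    while (bcount-- > 0)
        *p++ = 0;
}
```

In this sketch a call such as `bzero_model(buf + 1, 50)` would clear 3 head bytes, one 32-byte chunk, 3 further words, and 3 trailing bytes, which mirrors how the assembly falls through from `blkzero` to `wordzero` to `bytezero`.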