mblock_sub44_sads_x86_h.c

From "Motion JPEG codec source code" · C source · 108 lines


/*
 *
 * mblock_sub44_sads_x86_h.c
 * Copyright (C) 2000 Andrew Stevens <as@comlab.ox.ac.uk>
 *
 * Fast block sum-absolute difference computation for a rectangular area 4*x
 * by y where y > h against a 4 by h block.
 *
 * Used for 4*4 sub-sampled motion compensation calculations.
 *
 *
 * This file is part of mpeg2enc, a free MPEG-2 video stream encoder
 * based on the original MSSG reference design
 *
 * mpeg2enc is free software; you can redistribute new parts
 * and/or modify under the terms of the GNU General Public License
 * as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * mpeg2enc is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * See the files for those sections (c) MSSG
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include <stdlib.h>

#define PREFETCH_OPT

/*
 *
 * Generates a vector sad's for 4*4 sub-sampled pel (qpel) data (with
 * co-ordinates and top-left qpel address) from specified rectangle
 * against a specified 16*h pel (4*4 qpel) reference block.  The
 * generated vector contains results only for those sad's that fall
 * below twice the running best sad and are aligned on 8-pel
 * boundaries
 *
 * Invariant: blk points to top-left sub-sampled pel for macroblock
 * at (ilow,ihigh)
 * i{low,high) j(low,high) must be multiples of 4.
 *
 * sad = Sum Absolute Differences
 *
 * NOTES: for best efficiency i{low,high) should be multiples of 16.
 *
 * */

int SIMD_SUFFIX(mblocks_sub44_mests)( uint8_t *blk,  uint8_t *ref,
                                      int ilow, int jlow,
                                      int ihigh, int jhigh,
                                      int h, int rowstride,
                                      int threshold,
                                      me_result_s *resvec)
{
    int32_t x,y;
    uint8_t *currowblk = blk;
    uint8_t *curblk;
    me_result_s *cres = resvec;
    int      gridrowstride = rowstride;
    int weight;

    SIMD_SUFFIX(init_qblock_sad)(ref, h, rowstride);
    for( y=jlow; y <= jhigh ; y+=4)
    {
        curblk = currowblk;
        // You'd think prefetching curblk+4*rowstride would help here.
        // I have found *NO* measurable increase in performance...

        for( x = ilow; x <= ihigh; x += 4)
        {
            if( (x & 15) == (ilow & 15) )
            {
                load_blk( curblk, rowstride, h );
                curblk += 4;
            }
            weight = SIMD_SUFFIX(qblock_sad)(ref, h, rowstride);
            shift_blk(8);
            if( weight <= threshold )
            {
                threshold = intmin(weight<<2,threshold);
                /* Rough and-ready absolute distance penalty */
                /* NOTE: This penalty is *vital* to correct operation
                   as otherwise the sub-mean filtering won't work on very
                   uniform images.
                 */
                cres->weight = (uint16_t)(weight+(intmax(abs(x),abs(y))<<2));
                cres->x = (uint8_t)x;
                cres->y = (uint8_t)y;
                ++cres;
            }
        }
        currowblk += gridrowstride;
    }
    emms();
    return cres - resvec;
}
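
The listing depends on SIMD helpers defined elsewhere (`init_qblock_sad`, `load_blk`, `qblock_sad`, `shift_blk`, `emms`), so it cannot be compiled on its own. To make the search logic easier to follow, here is a minimal portable sketch of what the loop appears to compute, written in plain scalar C. It is not the author's implementation. The names `me_result_sketch`, `qblock_sad_scalar` and `sub44_mests_scalar` are hypothetical, the result struct layout (including signed `x`/`y`) is an assumption, and the sketch assumes each candidate is a 4-byte-wide, `h`-row block that advances one byte per 4-pel step. Note that the header comment speaks of "twice the running best sad" while the code tightens the threshold to `weight<<2`; the sketch follows the code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for me_result_s; the real layout is not shown in the listing. */
typedef struct {
    uint16_t weight;
    int8_t   x, y;
} me_result_sketch;

/* Scalar SAD of a 4-byte-wide, h-row block against the reference block. */
static int qblock_sad_scalar(const uint8_t *blk, const uint8_t *ref,
                             int h, int rowstride)
{
    int sad = 0;
    for (int r = 0; r < h; ++r) {
        for (int c = 0; c < 4; ++c)
            sad += abs(blk[c] - ref[c]);
        blk += rowstride;
        ref += rowstride;
    }
    return sad;
}

/* Scans candidates on a 4-pel grid; keeps those whose SAD is within the
   running threshold and returns how many results were written to resvec. */
static int sub44_mests_scalar(const uint8_t *blk, const uint8_t *ref,
                              int ilow, int jlow, int ihigh, int jhigh,
                              int h, int rowstride, int threshold,
                              me_result_sketch *resvec)
{
    assert(h > 0 && rowstride >= 4);
    me_result_sketch *cres = resvec;
    const uint8_t *rowblk = blk;

    for (int y = jlow; y <= jhigh; y += 4) {
        const uint8_t *cur = rowblk;
        for (int x = ilow; x <= ihigh; x += 4) {
            int weight = qblock_sad_scalar(cur, ref, h, rowstride);
            ++cur;                      /* 4 pels == 1 sub-sampled byte */
            if (weight <= threshold) {
                int ax = abs(x), ay = abs(y);
                int tight = weight << 2;
                if (tight < threshold)  /* same as intmin(weight<<2, threshold) */
                    threshold = tight;
                /* Distance penalty, as in the listing. */
                cres->weight = (uint16_t)(weight + ((ax > ay ? ax : ay) << 2));
                cres->x = (int8_t)x;
                cres->y = (int8_t)y;
                ++cres;
            }
        }
        rowblk += rowstride;            /* next row of the sub-sampled grid */
    }
    return (int)(cres - resvec);
}
```

Because the threshold only ever shrinks, candidates visited after a very good match are rejected unless they are nearly as good, which keeps the result vector short. The MMX version reaches the same per-candidate blocks by loading several bytes at once and shifting by 8 bits per step instead of re-reading memory.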

