This directory contains the code to compute the rasterised Hausdorff
distance under translation.

The files are:
driver.c: main() routine which loads up images, computes the Hausdorff
distance, and prints the results
r-h.c: the actual Hausdorff distance computation code
trans.c, transhash.c: some support code.
Makefile: the Makefile.

Putting it together, you get the program "r-h". This program has
parameters:
        -f forward_thresh forward_frac
        -r reverse_thresh reverse_frac
        imagepoints [modelpoints...]
where
        forward_thresh is the threshold to be used in the forward
(model-to-image) distance computation. It is an integer, and scaled by 100
(i.e. a value of 141 means a distance of 1.41 pixels).
        forward_frac is the fraction to be used in this computation.
        reverse_thresh is the threshold to be used in the reverse
(image-to-model) distance computation. It is scaled by 100 in the same way
(i.e. a value of 200 means a distance of 2 pixels).
        reverse_frac is the fraction to be used in this computation.
        imagepoints is a PBM file containing the image.
        modelpoints is a PBM file containing the model. Several models can
be given; each will be matched against the image.

As an example,
        r-h -f 141 .9 -r 200 .8 im.pbm mod.pbm
means:
        find all translations where
                1) at least 90% of all the model points (non-zero pixels)
are within 1.41 pixels of some image point, and
                2) at least 80% of the image points lying under the
translated model are within 2 pixels of some model point.

This will produce some output like:

Translations for "mod.pbm"
( 100,  90) = (141, .953), (100, .87)
...

Each line after the first contains information on one match.
The line given as an example means:
        If mod.pbm is translated by 100 pixels in X and 90 pixels in Y,
then
                1) The 90th percentile value of the list of model point to
closest image point distances is 1.41 pixels (.9 is the forward_frac
parameter),
                2) 95.3% of the model points are within 1.41 pixels of some
image point,
                3) The 80th percentile value of the list of image point to
closest model point distances is 1 pixel, and
                4) 87% of the image points underlying the translated model
are within 2 pixels of some model point (the 2 pixels is the reverse_thresh
parameter).

Note well the distinction between 3 and 4. A translation will be rejected
unless the 80th percentile value of the list of image point to closest
model point distances is 2 pixels or less (i.e. 80% of the image points
underlying the translated model must be within 2 pixels of some model
point). What is reported is the actual 80th percentile value (known to be 2
or less) and the actual fraction of the points that were within 2 pixels
(87% in this case, known to be 80% or more).

The main entry point to the code is

findTransAllModels(image_info *image, model_info *models, unsigned nmodels,
                   int revstyle);

The parameters:
        image:  an image_info *, pointing to a structure defining the
image.
        models: a model_info *, pointing to an array of model_info
structures, each defining a single model.
        nmodels: how many models there are in this array.
        revstyle: determines how the reverse (image-to-model) distance
computation will be done. Possible values are:
                REVERSE_BOX: Compute the image-to-model distance values
based only on those image points which underlie the translated model. This
is the style used by driver.c.
                REVERSE_ALLIMAGE: Compute the image-to-model distance
values based on *all* the image points.
                REVERSE_NONE: Skip the image-to-model distance computation
entirely. The values returned in the reverse distance fields will be
undefined.

What this does:
        Each model is considered separately. Let model_info *model be one
of them.
        If model->trans is initially NULLLIST, then it will search the
space of all possible translations (within the provided bounds, see below)
and set model->trans to a List, each ListNode of which will have a
transval * pointer as its userdata (see list.c for an explanation of this).
Each transval will represent a valid translation.
        If model->trans is not initially NULLLIST, then it will assume that
it contains a list, each node of which contains a transval *. It will
look at each of these, and consider the translation in each one's transpos
field. It then determines which of these translations are valid (satisfy
the various threshold and fraction parameters). If one is not valid, it
will delete it from the list and free it. If one is valid, then the rest of
the fields in the transval will be filled out (the reverse_* fields will be
filled out according to the current revstyle; if this is REVERSE_NONE, they
will be unaffected).

The main structures this uses:

typedef struct {
    long x;
    long y;
    } point;

This represents a point in the plane.

typedef struct {
    point transpos;
    long forward_val;
    float forward_frac; /* What fraction are actually <= model_thresh */
    long reverse_val;   /* Ditto for image */
    float reverse_frac;
    long reverse_num;   /* How many image pixels lie under the model */
    } transval;

This represents a translation of the model with respect to the image. The
fields have the following meanings:
    transpos: the translation itself.
    forward_val: the forward distance value, as in the output of r-h above
    forward_frac: the forward fraction
    reverse_val: the reverse distance value
    reverse_frac: the reverse fraction
    reverse_num: The number of image points that were used to compute the
reverse distance.
If revstyle is REVERSE_BOX, this will be the number of
image points that actually lie under the translated model; if it is
REVERSE_ALLIMAGE, it will be the total number of image points.

transval structures are *not* allocated using the standard malloc() call.
Instead, malloc_trans() is used to allocate them, and free_trans() to free
them.

typedef struct {
    unsigned xsize;
    unsigned ysize;
    unsigned npts;
    point *pts;
    BinaryImage im;
    double model_frac;
    long model_thresh;
    double image_frac;
    long image_thresh;
    int leftborder;
    int topborder;
    int rightborder;
    int bottomborder;
    int stepx;
    int stepy;
    LongImage dtrans;
    List trans;
    void *userdata;
    } model_info;

These fields mean:
    Fields determining the model itself:
        xsize: this is the width of the box containing the model. All
points in the model must have X coordinates >= 0 and < xsize.
        ysize: similarly for the height of the box.
        npts: the number of model points.
        pts: a pointer to an array of all the model points.
        im: a BinaryImage of the model. This must be an image xsize by
ysize, and must be consistent with npts and pts.
    Fields determining how the model is matched against the image:
        model_frac, model_thresh, image_frac, image_thresh: these are the
parameters to the matching algorithm, as given to the -f and -r
command-line parameters of the r-h program.
        leftborder, topborder, rightborder, bottomborder: These fields
limit the range of translations. Any translation where the entire model
fits into the image expanded by these borders is considered; those where
the model box lies completely or partially outside this range are not. If
these are all zero, then only translations where the model lies entirely
inside the image are considered; increasing one will increase this
range.
Note that negative values can be used; for example, setting
leftborder to -3 means that if a translation brings the model closer than 3
pixels to the left edge of the image, then that translation will not be
considered.
        stepx, stepy: It is sometimes not necessary to consider every
possible translation in the range being searched. If you want to consider
only every other translation in X, say, set stepx to 2 (normally it is 1).
    Fields generated from this:
        dtrans: This contains the distance transform of the model. It
should initially be (LongImage)NULL. It will be filled out if it is
required. Note that any further calls to findTransAllModels() will use the
value of this field if it is not NULL; this avoids unnecessary
recalculation. If it is no longer needed, it should be freed.
        trans: As noted above, this is a List of transval * values.
    Misc fields:
        userdata: this is a void * field which may be set by the user. It
is not used at all in the r-h code.

typedef struct {
    unsigned xsize;
    unsigned ysize;
    unsigned npts;
    point *pts;
    BinaryImage im;
    unsigned xborder;
    unsigned yborder;
    LongImage dtrans;
    long dtrans_thresh;
    ShortImage plusx_dtrans;
    void *userdata;
    } image_info;

These fields mean:
    Fields determining the image itself:
        xsize, ysize, npts, pts, im: these have the same meanings as in a
model_info structure.
    Fields generated:
        xborder, yborder: These are derived from the model borders.
        dtrans: This is the image's distance transform. It should initially
be (LongImage)NULL, as in model_info. It should also be kept if the
image_info might be re-used, and freed if not.
        dtrans_thresh: an internal field, used to maintain consistency.
        plusx_dtrans: This is another of these re-usable fields. Again, it
should initially be NULL, and will be set as required, and re-used if the
image_info is re-used. Note that if the image changes at all, all of these
re-usable fields should be freed and set to NULL.
    Misc fields:
        userdata: this is a void * field which may be set by the user. It
is not used at all in the r-h code.
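
As an illustration of the value/fraction pairs described above (forward_val
and forward_frac, reverse_val and reverse_frac), here is a minimal sketch of
how such a pair could be derived from a list of closest-point distances. It
is not taken from r-h.c: the names rank_result and rank_distances are
hypothetical, and the choice of rank (the smallest K with K/n >= frac) is an
assumption, since this file does not say how r-h rounds the percentile.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type for this sketch; not part of the r-h code. */
typedef struct {
    long val;       /* ranked distance value, scaled by 100 */
    double frac;    /* fraction of distances that are <= thresh */
    int valid;      /* nonzero if the translation would be accepted */
    } rank_result;

/* Ascending comparison of two longs, for qsort(). */
static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/*
 * dists:  n closest-point distances, scaled by 100 (sorted in place)
 * frac:   required fraction, e.g. .9
 * thresh: distance threshold, scaled by 100, e.g. 141
 */
rank_result rank_distances(long *dists, size_t n, double frac, long thresh)
{
    rank_result r;
    size_t i, k, within = 0;

    assert(n > 0 && frac > 0.0 && frac <= 1.0);
    qsort(dists, n, sizeof dists[0], cmp_long);

    /* Assumed rank: the smallest k with k/n >= frac, i.e. ceil(frac * n). */
    k = (size_t)(frac * (double)n);
    if ((double)k < frac * (double)n)
        k++;
    if (k < 1)
        k = 1;

    /* The reported value is the k-th smallest distance. */
    r.val = dists[k - 1];

    /* The reported fraction counts every distance within the threshold. */
    for (i = 0; i < n; i++)
        if (dists[i] <= thresh)
            within++;
    r.frac = (double)within / (double)n;

    /* Accepted when the ranked value does not exceed the threshold. */
    r.valid = (r.val <= thresh);
    return r;
}
```

For example, given twenty distances of which five are 0, ten are 100, four
are 141 and one is 224, a call with frac = .9 and thresh = 141 gives a value
of 141 and a fraction of .95, and the translation is accepted. With
thresh = 100 the same list gives a fraction of .75 and the translation is
rejected. Under the rank assumed here, the two tests "ranked value <=
thresh" and "fraction within thresh >= frac" always agree.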