⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 diff.java

📁 JSPWiki,100%Java开发的一套完整WIKI程序
💻 JAVA
📖 第 1 页 / 共 3 页
字号:
/* $Log: Diff.java,v $/* Revision 1.2  2003/03/02 13:36:06  jalkanen/* Changed license text to refer to LGPL/*/* Revision 1.1  2002/05/22 19:24:21  jalkanen/* Initial import from BMSI source./* * Revision 1.3  2000/03/03  21:58:03  stuart * move discard_confusing_lines and shift_boundaries to class file_data * * Revision 1.2  2000/03/02  16:37:38  stuart * Add GPL and copyright * */package bmsi.util;import java.util.Hashtable;/** A class to compare vectors of objects.  The result of comparison is a list of <code>change</code> objects which form an edit script.  The objects compared are traditionally lines of text from two files.  Comparison options such as "ignore whitespace" are implemented by modifying the <code>equals</code> and <code>hashcode</code> methods for the objects compared. <p> The basic algorithm is described in: </br> "An O(ND) Difference Algorithm and its Variations", Eugene Myers, Algorithmica Vol. 1 No. 2, 1986, p 251. <p> This class outputs different results from GNU diff 1.15 on some inputs.  Our results are actually better (smaller change list, smaller total size of changes), but it would be nice to know why.  Perhaps there is a memory overwrite bug in GNU diff 1.15. @author Stuart D. Gathman, translated from GNU diff 1.15 Copyright (C) 2000  Business Management Systems, Inc. <p> This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version. <p> This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public License for more details. <p> You should have received a copy of the <a href=COPYING.txt> GNU Lesser General Public License</a> along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */public class Diff {    /** Prepare to find differences between two arrays.  Each element of     the arrays is translated to an "equivalence number" based on     the result of <code>equals</code>.  The original Object arrays     are no longer needed for computing the differences.  They will     be needed again later to print the results of the comparison as     an edit script, if desired.     */    public Diff(Object[] a, Object[] b) {        Hashtable h = new Hashtable(a.length + b.length);        filevec[0] = new file_data(a, h);        filevec[1] = new file_data(b, h);    }    /** 1 more than the maximum equivalence value used for this or its     sibling file. */    private int equiv_max = 1;    /** When set to true, the comparison uses a heuristic to speed it up.     With this heuristic, for files with a constant small density     of changes, the algorithm is linear in the file size.  */    public boolean heuristic = false;    /** When set to true, the algorithm returns a guarranteed minimal     set of changes.  This makes things slower, sometimes much slower. */    public boolean no_discards = false;    private int[] xvec, yvec;	/* Vectors being compared. */    private int[] fdiag;    /* Vector, indexed by diagonal, containing               the X coordinate of the point furthest               along the given diagonal in the forward               search of the edit matrix. */    private int[] bdiag;    /* Vector, indexed by diagonal, containing               the X coordinate of the point furthest               along the given diagonal in the backward               search of the edit matrix. */    private int fdiagoff, bdiagoff;    private final file_data[] filevec = new file_data[2];    private int cost;    /** Find the midpoint of the shortest edit script for a specified     portion of the two files.     We scan from the beginnings of the files, and simultaneously from the ends,     doing a breadth-first search through the space of edit-sequence.     When the two searches meet, we have found the midpoint of the shortest     edit sequence.     The value returned is the number of the diagonal on which the midpoint lies.     The diagonal number equals the number of inserted lines minus the number     of deleted lines (counting only lines before the midpoint).     The edit cost is stored into COST; this is the total number of     lines inserted or deleted (counting only lines before the midpoint).     This function assumes that the first lines of the specified portions     of the two files do not match, and likewise that the last lines do not     match.  The caller must trim matching lines from the beginning and end     of the portions it is going to specify.     Note that if we return the "wrong" diagonal value, or if     the value of bdiag at that diagonal is "wrong",     the worst this can do is cause suboptimal diff output.     It cannot cause incorrect diff output.  */    private int diag(int xoff, int xlim, int yoff, int ylim) {        final int[] fd = fdiag;	// Give the compiler a chance.        final int[] bd = bdiag;	// Additional help for the compiler.        final int[] xv = xvec;		// Still more help for the compiler.        final int[] yv = yvec;		// And more and more . . .        final int dmin = xoff - ylim;	// Minimum valid diagonal.        final int dmax = xlim - yoff;	// Maximum valid diagonal.        final int fmid = xoff - yoff;	// Center diagonal of top-down search.        final int bmid = xlim - ylim;	// Center diagonal of bottom-up search.        int fmin = fmid, fmax = fmid;	// Limits of top-down search.        int bmin = bmid, bmax = bmid;	// Limits of bottom-up search.        /* True if southeast corner is on an odd                         diagonal with respect to the northwest. */        final boolean odd = (fmid - bmid & 1) != 0;        fd[fdiagoff + fmid] = xoff;        bd[bdiagoff + bmid] = xlim;        for (int c = 1; ; ++c) {            int d;			/* Active diagonal. */            boolean big_snake = false;            /* Extend the top-down search by an edit step in each diagonal. */            if (fmin > dmin)                fd[fdiagoff + --fmin - 1] = -1;            else                ++fmin;            if (fmax < dmax)                fd[fdiagoff + ++fmax + 1] = -1;            else                --fmax;            for (d = fmax; d >= fmin; d -= 2) {                int x, y, oldx, tlo = fd[fdiagoff + d - 1], thi = fd[fdiagoff + d + 1];                if (tlo >= thi)                    x = tlo + 1;                else                    x = thi;                oldx = x;                y = x - d;                while (x < xlim && y < ylim && xv[x] == yv[y]) {                    ++x;                    ++y;                }                if (x - oldx > 20)                    big_snake = true;                fd[fdiagoff + d] = x;                if (odd && bmin <= d && d <= bmax && bd[bdiagoff + d] <= fd[fdiagoff + d]) {                    cost = 2 * c - 1;                    return d;                }            }            /* Similar extend the bottom-up search. */            if (bmin > dmin)                bd[bdiagoff + --bmin - 1] = Integer.MAX_VALUE;            else                ++bmin;            if (bmax < dmax)                bd[bdiagoff + ++bmax + 1] = Integer.MAX_VALUE;            else                --bmax;            for (d = bmax; d >= bmin; d -= 2) {                int x, y, oldx, tlo = bd[bdiagoff + d - 1], thi = bd[bdiagoff + d + 1];                if (tlo < thi)                    x = tlo;                else                    x = thi - 1;                oldx = x;                y = x - d;                while (x > xoff && y > yoff && xv[x - 1] == yv[y - 1]) {                    --x;                    --y;                }                if (oldx - x > 20)                    big_snake = true;                bd[bdiagoff + d] = x;                if (!odd && fmin <= d && d <= fmax && bd[bdiagoff + d] <= fd[fdiagoff + d]) {                    cost = 2 * c;                    return d;                }            }            /* Heuristic: check occasionally for a diagonal that has made               lots of progress compared with the edit distance.               If we have any such, find the one that has made the most               progress and return it as if it had succeeded.               With this heuristic, for files with a constant small density               of changes, the algorithm is linear in the file size.  */            if (c > 200 && big_snake && heuristic) {                int best = 0;                int bestpos = -1;                for (d = fmax; d >= fmin; d -= 2) {                    int dd = d - fmid;                    if ((fd[fdiagoff + d] - xoff) * 2 - dd > 12 * (c + (dd > 0 ? dd : -dd))) {                        if (fd[fdiagoff + d] * 2 - dd > best                                && fd[fdiagoff + d] - xoff > 20                                && fd[fdiagoff + d] - d - yoff > 20) {                            int k;                            int x = fd[fdiagoff + d];                            /* We have a good enough best diagonal;                               now insist that it end with a significant snake.  */                            for (k = 1; k <= 20; k++)                                if (xvec[x - k] != yvec[x - d - k])                                    break;                            if (k == 21) {                                best = fd[fdiagoff + d] * 2 - dd;                                bestpos = d;                            }                        }                    }                }                if (best > 0) {                    cost = 2 * c - 1;                    return bestpos;                }                best = 0;                for (d = bmax; d >= bmin; d -= 2) {                    int dd = d - bmid;                    if ((xlim - bd[bdiagoff + d]) * 2 + dd > 12 * (c + (dd > 0 ? dd : -dd))) {                        if ((xlim - bd[bdiagoff + d]) * 2 + dd > best                                && xlim - bd[bdiagoff + d] > 20                                && ylim - (bd[bdiagoff + d] - d) > 20) {                            /* We have a good enough best diagonal;                               now insist that it end with a significant snake.  */                            int k;                            int x = bd[bdiagoff + d];                            for (k = 0; k < 20; k++)                                if (xvec[x + k] != yvec[x - d + k])                                    break;                            if (k == 20) {                                best = (xlim - bd[bdiagoff + d]) * 2 + dd;                                bestpos = d;                            }                        }                    }                }                if (best > 0) {                    cost = 2 * c - 1;                    return bestpos;                }            }        }    }    /** Compare in detail contiguous subsequences of the two files     which are known, as a whole, to match each other.

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -