- Introduction

sis2: sparse SVD via subspace iteration using A'A eigensystems.

sis2.c is an ANSI-C code designed to find several of the largest
eigenvalues and eigenvectors of a real symmetric positive definite
matrix B. This is a modified version of the ritzit program (Algol)
originally designed by Rutishauser in 1970 (Num. Math. 16, 205-223 and
Handbook for Auto. Comp., Vol. ii - Linear Algebra, 284-302), and
recoded in Fortran by B. Garbow (Argonne National Lab).

The matrix B is assumed to be of the form B = A'A, where A is m by n
(m>>n) and sparse. Hence, the singular triplets of A are computed from
the eigenpairs of B. The eigenvalues of B are the squares of the
singular values of A, and the eigenvectors of B correspond to the right
singular vectors only. The left singular vectors of A are then
determined by

    u = (1/sigma) A*v,

where {u,sigma,v} is a singular triplet of A.

This particular implementation is discussed in "Multiprocessor Sparse
SVD Algorithms and Applications", Ph.D. Thesis by M. Berry, University
of Illinois at Urbana-Champaign, October 1990.

- Calling sequence

The calling sequence for ritzit is

    void ritzit( long n, long kp, long km, double eps,
                 void (*opb) (long, double *, double *),
                 void (*inf) (long, long, long, double *, long),
                 long kem, double **x, double *d, double *f,
                 double *cx, double *u, long *imem );

The user must specify as part of the parameter list:

    n    ... Order of matrix B for SVD problem {integer}.
    kp   ... Number of simultaneous iteration vectors {integer}.
             kp must not be greater than n.
    km   ... Maximum number of iterations to be performed {integer}.
             If starting vectors for the iteration vectors are
             available, km should be prefixed with a minus sign.
    eps  ... Tolerance for accepting eigenvectors {double}.
    opb  ... Name of the subroutine that defines the matrix B.
             opb is called with parameters (n,u,w) and must compute
             w=Bu without altering the vector u for the SVD problem.
    inf  ... Name of the subroutine that may be used to obtain
             information or exert control during execution. inf is
             called with parameters (ks,kg,kh,f,m), where
               ks is the number of the next iteration step,
               kg is the number of already accepted eigenvectors,
               kh is the number of already accepted eigenvalues,
               f  is the array of error quantities for the vectors
                  of x. An element of f has the value 4.0 until the
                  corresponding eigenvalue has been accepted.
               m  is the degree of the current Chebyshev polynomial.
             ks,kg,kh,m are {integer}.
             f is a 1-dim. array of length n {double}.
    kem  ... Number of eigenvalues and corresponding eigenvectors
             desired {integer}. kem must be less than kp.
    x    ... Contains, if km is negative, the starting values for
             the iteration vectors. Otherwise, its contents are
             ignored and random starting values are generated.

ritzit returns via its parameter list the following items:

    km   ... Unchanged.
    kem  ... Reset to the number of eigenvalues and eigenvectors
             actually accepted within the limit of km steps.
    imem ... Approximate number of bytes needed for this run.
    x    ... Contains in its first kem columns orthonormal
             eigenvectors of B corresponding to the eigenvalues in
             array d. The remaining columns contain approximations
             to further eigenvectors of B (singular vectors of
             matrix A). x is an n by kp 2-dim. array {double}.
    d    ... Contains in its first kem positions the absolutely
             largest eigenvalues of B (perturbed singular values of
             A). The remaining positions contain approximations to
             smaller eigenvalues of B. d is a 1-dim. array of
             length kp {double}.

The remaining parameters define temporary storage arrays:

    u    ... Temporary 1-dim. storage array of length n {double}.
    f    ... Temporary 1-dim. storage array of length kp {double}.
    cx   ... Temporary 1-dim. storage array of length kp {double}.

- User-supplied routines

For sis2.c, the user must specify multiplication by the matrices A and
B (subroutines opa and opb, respectively).
The specification of opb should look something like

    void opb(long n, double *x, double *y)

so that opb takes a vector x and returns y = B*x, where B is the
appropriate matrix (see above). The specification of opa should look
something like

    void opa(long n, double *x, double *y)

so that opa takes a vector x and returns y = A*x. Subroutines opa and
opb will be called with n always equal to the dimension of the
eigenproblem solved.

In sis2.c we use the Harwell-Boeing sparse matrix format for accessing
elements of the nrow by ncol sparse matrix A and its transpose (denoted
A'). Other sparse matrix formats can be used, of course.

- Information

Please address all questions, comments, or corrections to:

    M. W. Berry
    Department of Computer Science
    University of Tennessee
    107 Ayres Hall
    Knoxville, TN 37996-1301
    email: berry@cs.utk.edu
    phone: (615) 974-5067

- File descriptions

sis2.c requires the include files sisc.h and sisg.h for compilation.
Constants are defined in sisc.h and all global variables are defined in
sisg.h. The input and output files associated with sis2.c are listed
below.

    Code      Input           Output
    ------    ------------    ----------------
    sis2.c    sip2, matrix    sio2, sio5, siv2

The binary output file siv2, containing approximate left and right
singular vectors, will be created by sis2.c if it does not already
exist. If you are running on a Unix-based workstation you should
uncomment the line

    /* #define UNIX_CREAT */

in the declarations prior to main() in sis2.c. UNIX_CREAT specifies the
use of the UNIX "creat" system routine with the permissions defined by
the PERMS constant

    #define PERMS 0664

You may adjust PERMS for the desired permissions on the siv2 file
(default is Read/Write for user and group, and Read for others).
Subsequent runs will be able to open and overwrite this file with the
default permissions.

sis2.c obtains its parameters specifying the sparse SVD problem to be
solved from the input file sip2.
This parameter file contains the single line

    <name> em numextra km eps v

where

    <name>   is the name of the data set;
    em       is an integer specifying the number of desired triplets;
    numextra is an integer specifying the number of extra vectors to
             carry, so that the subspace dimension is em+numextra;
    km       is an integer specifying the maximum number of iterations;
    eps      is a double specifying the residual tolerance for
             approximated singular triplets of A;
    v        contains the string TRUE or FALSE to indicate when
             singular triplets are needed (TRUE) and when only
             singular values are needed (FALSE).

The current sis2.c code is designed to approximate the kem largest
singular triplets of A. Users interested in the kem smallest singular
triplets via subspace iteration should replace the given subroutine opb
with one that returns y = C*x, where C = [(alpha*alpha)*I - A'A] and
alpha is any upper bound for the largest singular value of the matrix
A. The eigenvalues of C are alpha*alpha - sigma*sigma, so the largest
eigenvalues of C belong to the smallest singular values sigma of A.

If the parameter "v" from the parameter file is set to "TRUE", the
unformatted output file siv2 will contain the approximate singular
vectors written in the order u[1], v[1], u[2], v[2], ..., u[kem],
v[kem]. Here u[i] and v[i] denote the left and right singular vectors,
respectively, corresponding to the i-th approximate singular value.

A sample inf routine called "intros" has been supplied in sis2.c. The
output from intros (called by ritzit) is written to the formatted
output file sio5.

- Sparse matrix format

sis2.c is designed to read input matrices that are stored in the
Harwell-Boeing sparse matrix format. The nonzeros of such matrices are
stored in a compressed column-oriented format. The row indices and
corresponding nonzero values are stored by columns, with a column start
index array whose entries contain pointers to the nonzero starting each
column. sis2.c reads the sparse matrix data from the input file called
"matrix".
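As a rough illustration of the compressed column storage described
above, the sketch below forms y = A'(A*x) and the shifted product
y = (alpha*alpha)*x - A'(A*x) directly from the three arrays. It is not
the code in sis2.c: the names csc_matrix, csc_opa, csc_opat, csc_opb and
csc_opb_shifted are hypothetical, and zero-based indices are assumed
(Harwell-Boeing files hold one-based indices, so 1 would have to be
subtracted when loading).

```c
#include <assert.h>

/* Hypothetical container for a compressed column matrix (0-based). */
typedef struct {
    long nrow, ncol;
    const long *pointr;   /* ncol+1 offsets: column j occupies       */
                          /* positions pointr[j] .. pointr[j+1]-1    */
    const long *rowind;   /* row index of each stored nonzero        */
    const double *value;  /* nonzero values, stored column by column */
} csc_matrix;

/* y = A*x; x has ncol entries, y has nrow entries. */
static void csc_opa(const csc_matrix *a, const double *x, double *y)
{
    long i, j, k;
    for (i = 0; i < a->nrow; i++) y[i] = 0.0;
    for (j = 0; j < a->ncol; j++)
        for (k = a->pointr[j]; k < a->pointr[j + 1]; k++)
            y[a->rowind[k]] += a->value[k] * x[j];
}

/* y = A'*x; x has nrow entries, y has ncol entries. */
static void csc_opat(const csc_matrix *a, const double *x, double *y)
{
    long j, k;
    for (j = 0; j < a->ncol; j++) {
        double s = 0.0;
        for (k = a->pointr[j]; k < a->pointr[j + 1]; k++)
            s += a->value[k] * x[a->rowind[k]];
        y[j] = s;
    }
}

/* y = B*x with B = A'A; work must hold nrow doubles; x is not altered. */
static void csc_opb(const csc_matrix *a, const double *x, double *y,
                    double *work)
{
    assert(a && x && y && work);
    csc_opa(a, x, work);
    csc_opat(a, work, y);
}

/* y = C*x with C = (alpha*alpha)*I - A'A, for the smallest triplets. */
static void csc_opb_shifted(const csc_matrix *a, double alpha,
                            const double *x, double *y, double *work)
{
    long j;
    csc_opb(a, x, y, work);
    for (j = 0; j < a->ncol; j++)
        y[j] = alpha * alpha * x[j] - y[j];
}
```

Because ritzit calls opb with only (n, x, y), a wrapper around these
routines would have to reach the matrix and the work array through
file-scope variables or a similar mechanism.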
Each input file "matrix" should begin with a four-line header record
followed by three more records containing, in order, the column-start
pointers, the row indices, and the nonzero numerical values.

The first line of the header consists of a 72-character title and an
8-character key by which the matrices are referenced. The second line
can be used for comments or to indicate record length for each index or
value array. Although this line is generally ignored, A CHARACTER MUST
BE PLACED ON THAT LINE. The third line contains a three-character
string denoting the matrix type and the three integers specifying the
number of rows, columns, and nonzeros. The fourth line, which usually
contains the input format for Fortran-77 I/O, is ignored by our ANSI-C
code. The exact format is

    "%72c %*s %*s %*s %d %d %d %*d"

for the first three lines of the header,

    line 1   <title>          <key>
             (col. 1 - 72)    (col. 73 - 80)
    line 2   <string>
    line 3   <matrix type> nrow ncol nnzero

and

    "%*s %*s %*s %*s"

for the last line of the header,

    line 4   <string1> <string2> <string3> <string4>

Even though only the title and the integers specifying the number of
rows, columns, and nonzero elements are read, the other strings of
input must be present in the indicated positions. Otherwise, the format
of the "fscanf" statements must be changed accordingly.

- References

Rutishauser, H., Simultaneous Iteration Method for Symmetric Matrices,
Num. Math. 16, 205-223 (1970). (Reprinted in Handbook for Automatic
Computation, Volume ii, Linear Algebra, J. H. Wilkinson - C. Reinsch,
Contribution ii/9, 284-302, Springer-Verlag, 1971.)

Rutishauser, H., Computational Aspects of F. L. Bauer's Simultaneous
Iteration Method, Num. Math. 13, 4-13 (1969).

Garbow, B. S. and Dongarra, J. J., Path Chart and Documentation for the
Eispack Package of Matrix Eigensystem Routines, Technical Memorandum
No. 250, Applied Mathematics Division, Argonne National Laboratory,
July 1974, updated August 1975.