.TH fmm l "12 November 1996" "PRISM version 1.0" "PRISM routine (version 1.0)"
.SH NAME
fmm - Winograd's variant of Strassen's algorithm to perform one of the matrix-matrix operations
.br
C := alpha*op( A )*op( B ) + beta*C.
.SH SYNOPSIS
.TP 6
fmm ( char transa, char transb, int m, int n, int k,
double alpha, double *a, int lda, double *b, int ldb,
double beta, double *c, int ldc, double *work,
.br
int lwork )
.SH PURPOSE
fmm performs one of the matrix-matrix operations
.br
C := alpha*op( A )*op( B ) + beta*C,
.br
where op( X ) is one of
.br
   op( X ) = X   or   op( X ) = X' (transpose of X),
.br
alpha and beta are scalars, and A, B and C are matrices, with op( A )
an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.
The fmm calling sequence is very similar to that of the Level 3 BLAS routine
DGEMM, except that it has two extra arguments to specify work space. It
uses Winograd's variant of Strassen's algorithm to reduce the
operation count relative to conventional matrix multiplication
(DGEMM). It has a C calling interface, but all arrays are stored in column
major format (Fortran style).
.SH PARAMETERS
.TP 8
transa  - char.
On entry, transa specifies the form of op( a ) to be used in
the matrix multiplication as follows:
.br
transa = 'N' or 'n',  op( a ) = a.
.br
transa = 'T' or 't',  op( a ) = a'.
.br
transa = 'C' or 'c',  op( a ) = a'.
.br
Unchanged on exit.
.TP 8
transb  - char.
On entry, transb specifies the form of op( b ) to be used in
the matrix multiplication as follows:
.br
transb = 'N' or 'n',  op( b ) = b.
.br
transb = 'T' or 't',  op( b ) = b'.
.br
transb = 'C' or 'c',  op( b ) = b'.
.br
Unchanged on exit.
.TP 8
m       - int.
On entry, m specifies the number of rows of the matrix
op( a ) and of the matrix c. m must be at least zero.
Unchanged on exit.
.TP 8
n       - int.
On entry, n specifies the number of columns of the matrix
op( b ) and the number of columns of the matrix c. n must be
at least zero.
Unchanged on exit.
.TP 8
k       - int.
On entry, k specifies the number of columns of the matrix
op( a ) and the number of rows of the matrix op( b ). k must
be at least zero.
Unchanged on exit.
.TP 8
alpha   - double.
On entry, alpha specifies the scalar alpha.
Unchanged on exit.
.TP 8
a       - double pointer.
a is logically an array of size ( lda, ka ), where ka is
k when transa = 'N' or 'n', and is m otherwise.
Before entry with transa = 'N' or 'n', the leading m by k
part of the array a must contain the matrix A, otherwise
the leading k by m part of the array a must contain the
matrix A.
Unchanged on exit.
.TP 8
lda     - int.
On entry, lda specifies the first dimension of a as "declared"
in the calling (sub) program. When transa = 'N' or 'n' then
lda must be at least max( 1, m ), otherwise lda must be at
least max( 1, k ).
Unchanged on exit.
.TP 8
b       - double pointer.
b is logically an array of size ( ldb, kb ), where kb is
n when transb = 'N' or 'n', and is k otherwise.
Before entry with transb = 'N' or 'n', the leading k by n
part of the array b must contain the matrix B, otherwise
the leading n by k part of the array b must contain the
matrix B.
Unchanged on exit.
.TP 8
ldb     - int.
On entry, ldb specifies the first dimension of b as "declared"
in the calling (sub) program. When transb = 'N' or 'n' then
ldb must be at least max( 1, k ), otherwise ldb must be at
least max( 1, n ).
Unchanged on exit.
.TP 8
beta    - double.
On entry, beta specifies the scalar beta. When beta is
supplied as zero then c need not be set on input.
Unchanged on exit.
.TP 8
c       - double pointer.
c is logically an array of size ( ldc, n ).
Before entry, the leading m by n part of the array c must
contain the matrix C, except when beta is zero, in which
case c need not be set on entry.
On exit, the array c is overwritten by the m by n matrix
( alpha*op( a )*op( b ) + beta*c ).
.TP 8
ldc     - int.
On entry, ldc specifies the first dimension of c as "declared"
in the calling (sub) program. ldc must be at least
max( 1, m ).
Unchanged on exit.
.TP 8
work    - double pointer.
work is logically an array of size ( lwork ) which is
temporary work space for use by the routine. Its size must be
at least lwork double words. Input values are ignored.
If work is NULL then fmm automatically allocates the temporary
space it needs.
On exit, the array work will be overwritten by values that
are only of internal use.
.TP 8
lwork   - int.
On entry, lwork specifies the size of the work space array work.
If lwork is too small for the work space needed then fmm automatically
allocates the temporary space it needs.
The size necessary can be calculated with the routine tmp_fmm.
Unchanged on exit.
.SH SEE ALSO
.BR DGEFMM (l),
.BR DGEFMM_MEM (l),
.BR TMP_DGEFMM_MEM (l),
.BR tmp_fmm (l).
.SH FURTHER INFORMATION
.LP
See the DGEFMM User's Guide provided in the doc/ directory of the distribution.
.SH BUGS
.LP
There are no known bugs.
.SH COPYRIGHT
See COPYRIGHT file provided with the software distribution.
.SH AUTHORS
.IP
Steven Huss-Lederman, University of Wisconsin
.br
Elaine M. Jacobson, Center for Computing Sciences
.br
Jeremy R. Johnson, Drexel University
.br
Anna Tsao, Center for Computing Sciences
.br
Thomas Turnbull, Center for Computing Sciences
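.SH EXAMPLE
.LP
The following is a minimal sketch, not part of the PRISM distribution, of what
the operation C := alpha*op( A )*op( B ) + beta*C means for column major arrays
with leading dimensions. The names ref_gemm and op_elem are hypothetical and
are introduced here only for illustration. The argument order mirrors fmm
without work and lwork, and the loop is the conventional O(m*n*k) product, not
Strassen's algorithm, so it may be useful mainly as a reference when checking
argument layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PRISM routine): element (i, j) of op(X),
   where X is stored column major with leading dimension ldx. */
static double op_elem(const double *x, int ldx, char trans, int i, int j)
{
    int notrans = (trans == 'N' || trans == 'n');
    /* op(X) = X reads X(i, j); op(X) = X' reads X(j, i). */
    return notrans ? x[(size_t)j * ldx + i] : x[(size_t)i * ldx + j];
}

/* Hypothetical naive reference for C := alpha*op(A)*op(B) + beta*C.
   Same leading arguments as fmm, but no work space and no Strassen. */
void ref_gemm(char transa, char transb, int m, int n, int k,
              double alpha, const double *a, int lda,
              const double *b, int ldb,
              double beta, double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(ldc >= (m > 1 ? m : 1));   /* ldc must be at least max(1, m) */

    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += op_elem(a, lda, transa, i, p)
                     * op_elem(b, ldb, transb, p, j);

            /* When beta is zero, c is never read, so it need not be set. */
            double *cij = &c[(size_t)j * ldc + i];
            *cij = (beta == 0.0) ? alpha * sum : alpha * sum + beta * *cij;
        }
    }
}
```

For a 2 by 3 matrix A stored by columns with lda = 2 and a 3 by 2 matrix B
with ldb = 3, a call such as ref_gemm('N', 'N', 2, 2, 3, 1.0, a, 2, b, 3, 0.0, c, 2)
fills the 2 by 2 array c by columns. A call to fmm would pass the same leading
arguments followed by work and lwork. As described under PARAMETERS, a NULL
work pointer makes fmm allocate the temporary space itself.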