.TH DGEFMM l "12 November 1996" "PRISM version 1.0" "PRISM routine (version 1.0)"
.SH NAME
DGEFMM - Winograd's variant of Strassen's algorithm to perform one of the matrix-matrix operations
.br
C := alpha*op( A )*op( B ) + beta*C.
.SH SYNOPSIS
.TP 20
SUBROUTINE DGEFMM
( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC )
.TP 22
.ti +4
CHARACTER*1
TRANSA, TRANSB
.TP 22
.ti +4
INTEGER
M, N, K, LDA, LDB, LDC
.TP 22
.ti +4
DOUBLE PRECISION
ALPHA, BETA
.TP 22
.ti +4
DOUBLE PRECISION
A( LDA, * ), B( LDB, * ), C( LDC, * )
.SH PURPOSE
DGEFMM performs one of the matrix-matrix operations
.br
   C := alpha*op( A )*op( B ) + beta*C,
.br
where op( X ) is one of
.br
   op( X ) = X   or   op( X ) = X' (transpose of X),
.br
alpha and beta are scalars, and A, B and C are matrices, with op( A )
an m by k matrix, op( B ) a k by n matrix and C an m by n matrix.
.LP
DGEFMM is designed to be a drop-in replacement for the Level 3 BLAS routine
DGEMM. It uses Winograd's variant of Strassen's algorithm to reduce
the operation count relative to conventional matrix multiplication
(DGEMM). It has a FORTRAN interface. All arrays are stored in column
major format. The necessary temporary memory is allocated dynamically
inside the routine.
.SH PARAMETERS
.TP 8
TRANSA  - CHARACTER*1.
On entry, TRANSA specifies the form of op( A ) to be used in
the matrix multiplication as follows:
.br
TRANSA = 'N' or 'n',  op( A ) = A.
.br
TRANSA = 'T' or 't',  op( A ) = A'.
.br
TRANSA = 'C' or 'c',  op( A ) = A'.
.br
Unchanged on exit.
.TP 8
TRANSB  - CHARACTER*1.
On entry, TRANSB specifies the form of op( B ) to be used in
the matrix multiplication as follows:
.br
TRANSB = 'N' or 'n',  op( B ) = B.
.br
TRANSB = 'T' or 't',  op( B ) = B'.
.br
TRANSB = 'C' or 'c',  op( B ) = B'.
.br
Unchanged on exit.
.TP 8
M       - INTEGER.
On entry, M specifies the number of rows of the matrix
op( A ) and of the matrix C. M must be at least zero.
.br
Unchanged on exit.
.TP 8
N       - INTEGER.
On entry, N specifies the number of columns of the matrix
op( B ) and the number of columns of the matrix C. N must be
at least zero.
.br
Unchanged on exit.
.TP 8
K       - INTEGER.
On entry, K specifies the number of columns of the matrix
op( A ) and the number of rows of the matrix op( B ). K must
be at least zero.
.br
Unchanged on exit.
.TP 8
ALPHA   - DOUBLE PRECISION.
On entry, ALPHA specifies the scalar alpha.
.br
Unchanged on exit.
.TP 8
A       - DOUBLE PRECISION array of DIMENSION ( LDA, ka ), where ka is
k when TRANSA = 'N' or 'n', and is m otherwise.
Before entry with TRANSA = 'N' or 'n', the leading m by k
part of the array A must contain the matrix A, otherwise
the leading k by m part of the array A must contain the
matrix A.
.br
Unchanged on exit.
.TP 8
LDA     - INTEGER.
On entry, LDA specifies the first dimension of A as declared
in the calling (sub) program. When TRANSA = 'N' or 'n' then
LDA must be at least max( 1, m ), otherwise LDA must be at
least max( 1, k ).
.br
Unchanged on exit.
.TP 8
B       - DOUBLE PRECISION array of DIMENSION ( LDB, kb ), where kb is
n when TRANSB = 'N' or 'n', and is k otherwise.
Before entry with TRANSB = 'N' or 'n', the leading k by n
part of the array B must contain the matrix B, otherwise
the leading n by k part of the array B must contain the
matrix B.
.br
Unchanged on exit.
.TP 8
LDB     - INTEGER.
On entry, LDB specifies the first dimension of B as declared
in the calling (sub) program. When TRANSB = 'N' or 'n' then
LDB must be at least max( 1, k ), otherwise LDB must be at
least max( 1, n ).
.br
Unchanged on exit.
.TP 8
BETA    - DOUBLE PRECISION.
On entry, BETA specifies the scalar beta. When BETA is
supplied as zero then C need not be set on input.
.br
Unchanged on exit.
.TP 8
C       - DOUBLE PRECISION array of DIMENSION ( LDC, n ).
Before entry, the leading m by n part of the array C must
contain the matrix C, except when beta is zero, in which
case C need not be set on entry.
On exit, the array C is overwritten by the m by n matrix
( alpha*op( A )*op( B ) + beta*C ).
.TP 8
LDC     - INTEGER.
On entry, LDC specifies the first dimension of C as declared
in the calling (sub) program. LDC must be at least
max( 1, m ).
.br
Unchanged on exit.
.SH SEE ALSO
.BR DGEFMM_MEM (l),
.BR fmm (l),
.BR TMP_DGEFMM_MEM (l),
.BR tmp_fmm (l).
.SH FURTHER INFORMATION
.LP
See the DGEFMM User's Guide provided in the doc/ directory of the distribution.
.SH BUGS
.LP
There are no known bugs.
.SH COPYRIGHT
See COPYRIGHT file provided with the software distribution.
.SH AUTHORS
.IP
Steven Huss-Lederman, University of Wisconsin
.br
Elaine M. Jacobson, Center for Computing Sciences
.br
Jeremy R. Johnson, Drexel University
.br
Anna Tsao, Center for Computing Sciences
.br
Thomas Turnbull, Center for Computing Sciences
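.SH EXAMPLE
.LP
The operation that this page specifies, C := alpha*op( A )*op( B ) + beta*C
on column major arrays with leading dimensions LDA, LDB and LDC, can be
sketched as a plain triple loop in C. This is a minimal reference sketch and
not the DGEFMM implementation: it uses conventional multiplication rather than
Winograd's variant of Strassen's algorithm, and the names ref_gemm and op_elem
are hypothetical helpers introduced here only for illustration. It shows how
the TRANSA and TRANSB flags and the leading dimensions select the array
elements that are read and written.
.nf

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of op(X), with X stored column major and leading
 * dimension ldx. Any flag other than 'N' or 'n' is treated as a
 * transpose, which covers 'T', 't', 'C' and 'c' for real data. */
static double op_elem(char trans, const double *x, int ldx, int i, int j)
{
    int no_trans = (trans == 'N' || trans == 'n');
    return no_trans ? x[(size_t)j * ldx + i] : x[(size_t)i * ldx + j];
}

/* Hypothetical reference routine with the same argument order as the
 * SYNOPSIS above, passed by value instead of by FORTRAN reference. */
void ref_gemm(char transa, char transb, int m, int n, int k,
              double alpha, const double *a, int lda,
              const double *b, int ldb,
              double beta, double *c, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= 1 && ldb >= 1);
    assert(ldc >= (m > 1 ? m : 1));

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += op_elem(transa, a, lda, i, p)
                     * op_elem(transb, b, ldb, p, j);

            double *cij = &c[(size_t)j * ldc + i];
            /* When beta is zero, C is written without being read,
             * so it need not be set on input. */
            *cij = (beta == 0.0) ? alpha * sum
                                 : alpha * sum + beta * *cij;
        }
    }
}
```

.fi
.LP
Because DGEFMM takes the same argument list as DGEMM, a small routine of this
kind can serve as an independent check of results on small matrices; floating
point results from a Strassen-type algorithm may differ from it in the last
digits.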