⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 spmatrixvec.br

📁 用于GPU通用计算的编程语言BrookGPU 0.4
💻 BR
字号:
/*  Sparse Matrix-Vector Multiply    Simple example of how to perform a sparse matrix vector multiplication  (y = Ax) where the sparse matrix A is represented in padded compressed  sparse row, aka ITPACK format.  Each row of the matrix is constrained  to have at most MAX_NZ_PER_ROW nonzero elements.  Example:    1  0  0  3  2  0  1  9  0  4  0  0  0  0  0  5  The representation of the above matrix, where MAX_NZ_PER_ROW = 3 and  NUM_ROWS = 4 is:  Anz:  1 3 0  2 1 9  4 0 0  5 0 0  Col:  0 3 0  0 2 3  1 0 0  3 0 0   Algorithm:  1. Use a gather to "arrange" the elements of vector x into the ordering  they will be referenced in the mat-vec multiply (ordering determined by  cols of nonzeros of A)  Following the above example, elements of the x vector would be gathered  using the following access sequence.  0 3 0  0 2 3  1 0 0  3 0 0  2. Component-wise multiply the gathered x vector with the nonzeros of A  3. Perform reduction on resulting stream, reducing via the sum operator  every MAX_NZ_PER_ROW elements (the rows of the matrix) to get y = Ax.*/#include <stdio.h>#include <stdlib.h>#define MAX_NZ_PER_ROW  5#define NUM_ROWS        10#define MAX_NZ          50/*** Kernels needed for Sparse Matrix-Vector multiplication *******************/kernel void gather(float index<>, float x[NUM_ROWS+1], out float result<>) {  result = x[index];}// componentwise multiplykernel void mult(float a<>, float b<>, out float c<>) {  c = a*b;}reduce void sumRows(float nzValues<>, reduce float result<>) {  result += nzValues;}/*****************************************************************************/// The data is read from disk in compressed sparse row (CSR) format.// This converts the structures read from disk into the padded ITPACK// representation that is easy to stream.void reshuffleData(float nz[MAX_NZ], int cols[MAX_NZ], int rowStart[NUM_ROWS+1],                   float Anz[MAX_NZ], float Acols[MAX_NZ]) {  int i,j;  for (i=0;i<NUM_ROWS;i++) {    int offset = 0;    for (j=rowStart[i];j<rowStart[i+1];j++) {      Anz[MAX_NZ_PER_ROW*i + offset] = nz[j];      Acols[MAX_NZ_PER_ROW*i + offset] = (float)cols[j];      offset++;    }      // must pad the rest of the row if less than MAX_NZ_PER_ROW nonzeros    while (offset < MAX_NZ_PER_ROW) {      Anz[MAX_NZ_PER_ROW*i + offset] = 0.0f;      Acols[MAX_NZ_PER_ROW*i + offset] = (float)0.0f;  // value is 0.0 so index doesn't matter      offset++;    }  }}int parseCmdLine(int argc, char** argv, char* filename) {    if (argc < 2) {	  strncpy(filename, "spMatrixVec.txt", 70);	  return 0;  }  strncpy(filename, argv[1], 70);   return 0;}int main(int argc, char** argv) {  int i,j;  char filename[80];  // stores compressed sparse row representation of the matrix on disk  float  csrNz[MAX_NZ];         // nonzero values  int    csrCols[MAX_NZ];       // corresponding column  // index into csrNz of first nz element in each row  // note: number of nz in row i = csrRowStart[i+1] - csrRowStart[i]  int    csrRowStart[NUM_ROWS+1];   // storage for ITPACK representation of the matrix  float  Anz[MAX_NZ];  float  cIdx[MAX_NZ];  float  x[NUM_ROWS];  float  y[NUM_ROWS];  float  AStrm<MAX_NZ>;           // nonzeros of A  float  cIdxStrm<MAX_NZ>;        // column indices  float  productsStrm<MAX_NZ>;      float  xGatherStrm<MAX_NZ>;     // "gathered" version of x  float  xStrm<NUM_ROWS>;  float  yStrm<NUM_ROWS>;  if (parseCmdLine(argc, argv, filename))    return 1;  // load the CSR matrix structure here //////////////////////////////////////////  FILE* data;  data = fopen(filename, "r");    if (!data) {    printf("Couldn't open CSR data file: %s\n", filename);    return 1;  }  for (i=0;i<NUM_ROWS;i++) {    fscanf(data, "%f", &x[i]);    printf("%.2f ", x[i]);  }  printf("\n");  for (i=0;i<NUM_ROWS+1;i++) {    fscanf(data, "%d", &csrRowStart[i]);    printf("%d ", csrRowStart[i]);  }  printf("\n");  for (i=0;i<csrRowStart[NUM_ROWS];i++)    fscanf(data, "%d", &csrCols[i]);  for (i=0;i<csrRowStart[NUM_ROWS];i++)    fscanf(data, "%f", &csrNz[i]);    printf("CSR Sparse Matrix Form:  [value(col)]\n");  for (i=0;i<NUM_ROWS;i++) {    for (j=csrRowStart[i];j<csrRowStart[i+1];j++)      printf("%.3f(%d)  ", csrNz[j], csrCols[j]);    printf("\n");  }  // convert from pure CSR to a padded scheme better suited for streaming, the following  // printf's should ilustrate how the data is laid out  reshuffleData(csrNz, csrCols, csrRowStart, Anz, cIdx);  printf("\n");  printf("Padded Non-zero values:\n");  for (i=0;i<NUM_ROWS;i++) {    for (j=0;j<MAX_NZ_PER_ROW;j++)      printf("%3.3f ", Anz[i*MAX_NZ_PER_ROW + j]);    printf("\n");  }  printf("\n");  printf("Padded Col Indices:\n");  for (i=0;i<NUM_ROWS;i++) {    for (j=0;j<MAX_NZ_PER_ROW;j++)      printf("%3.1f ", cIdx[i*MAX_NZ_PER_ROW + j]);    printf("\n");  }    printf("\n");  printf("x:\n");  for (i=0;i<NUM_ROWS;i++)    printf("%.3f ", x[i]);  // Now perform the matrix multiplication ///////////////////////////////////////////////  // read in the matrix data  streamRead(AStrm, Anz);  streamRead(cIdxStrm, cIdx);    for (i=0;i<NUM_ROWS;i++)    y[i] = 0.0f;  streamRead(xStrm, x);  // these four lines perform sparse matrix-vector multiply  streamRead(yStrm, y);  gather(cIdxStrm, xStrm, xGatherStrm);  mult(AStrm, xGatherStrm, productsStrm);  // reduction from (NUM_ROWS * MAX_NX_PER_ROW) to (NUM_ROWS)  // Note: in 2D, this can be thought of as a reduction along the rows of the matrix  sumRows(productsStrm, yStrm);         streamWrite(yStrm, y);  printf("\n");  printf("result: y = Ax\n");  for (i=0;i<NUM_ROWS;i++)    printf("%.3f\n", y[i]);  return 0;}

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -