Paul Wang’s scientific contributions

Publications (1)


Is RISC-V ready for HPC prime-time: Evaluating the 64-core Sophon SG2042 RISC-V CPU
  • Conference Paper
  • November 2023
  • 57 Reads · 14 Citations
  • Maurice Jamieson · Joseph Lee · Paul Wang

Citations (1)


... The pseudocode for the optimized algorithm for a lower triangular non-transposed matrix is given in Algorithm 4. It traverses the matrix "bottom-up". For the last columns, calculations are performed using the baseline algorithm (lines 4-14), and for the remaining columns, they are performed by an optimized algorithm that traverses along diagonals (lines 15-31).

        AXPY(length + 1, alpha * X[i], a + lda * i, Y + i);
     7: end for
     8: for (i = 0; i < iend; i += BLOCK_SIZE) do
     9:     y_copy = LOAD(y + i);
    10:     for (INT j = 0; j < k; j++) do
    11:         x_copy = LOAD(x + i + 1 + j);
    12:         diag_a = LOAD_WITH_STRIDE(a + 1 + j, STRIDE);
    13:         mul = MUL_VV(x_copy, diag_a);
    14:         y_copy = FMA_VF(y_copy, alpha, mul);
    15:     end for
    16:     SAVE(y + i, y_copy);
    17:     a += BLOCK_SIZE * lda;
    18: end for
    19: for (; i < n - k; i++) do
    20:     length = MIN(n - i - 1, k);
    21:     Y[i] += alpha * DOT(length, a + 1, X + i + 1);
    22:     a += lda;
    23: end for
    24: for (; i < n; i++) do
    25:     length = MIN(n - i - 1, k);
    26:     AXPY(length + 1, alpha * X[i], a + k - length,
    27:          Y + i - length);
    28:     Y[i] += alpha * DOT(length, a + 1, X + i + 1);
    29:     a += lda;
    30: end for
    31: }

Depending on the stored triangle of the matrix and whether the matrix is transposed or not, this operation is performed using DOT or AXPY. ...
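The excerpt is truncated and does not name the routine it optimizes, so the following is only a rough scalar sketch of the per-column AXPY plus DOT pattern it mentions, not the cited authors' implementation. It assumes a symmetric band matrix-vector update, y += alpha * A * x, with the lower triangle held in column-major band storage where a[l + j * lda] stores A(j + l, j). The names sbmv_lower_ref, axpy, and dot are hypothetical, and the vectorized diagonal traversal (LOAD_WITH_STRIDE, FMA_VF) is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C stand-in for BLAS AXPY with unit stride: y[i] += alpha * x[i]. */
static void axpy(size_t n, double alpha, const double *x, double *y) {
    for (size_t i = 0; i < n; i++) {
        y[i] += alpha * x[i];
    }
}

/* Plain-C stand-in for BLAS DOT with unit stride. */
static double dot(size_t n, const double *x, const double *y) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Assumed operation: y += alpha * A * x for a symmetric n x n matrix A
 * with k sub-diagonals. Only the lower triangle is stored, by columns:
 * a[l + j * lda] holds A(j + l, j) for l = 0..k. */
void sbmv_lower_ref(size_t n, size_t k, double alpha,
                    const double *a, size_t lda,
                    const double *x, double *y) {
    assert(lda >= k + 1);
    for (size_t i = 0; i < n; i++) {
        /* Number of stored sub-diagonal entries in column i. */
        size_t length = (n - i - 1 < k) ? (n - i - 1) : k;
        const double *col = a + i * lda;
        /* Column i (diagonal and below) scaled by x[i] updates y[i..i+length]. */
        axpy(length + 1, alpha * x[i], col, y + i);
        /* By symmetry, the same entries form row i to the right of the diagonal. */
        y[i] += alpha * dot(length, col + 1, x + i + 1);
    }
}
```

Each stored column is read twice, once as a column (AXPY) and once as a row (DOT), which is why only one triangle needs to be kept in memory. For a 4 x 4 tridiagonal matrix (k = 1) the band array has lda = 2 and eight entries, the last of which is unused padding.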

Reference:

Performance optimization of BLAS algorithms with band matrices for RISC-V processors
Is RISC-V ready for HPC prime-time: Evaluating the 64-core Sophon SG2042 RISC-V CPU
  • Citing Conference Paper
  • November 2023