Difference between revisions of "HPL.out"
Revision as of 11:18, 28 August 2007
First attempt
This is the first run I did. It probably needs tuning, because the performance is poor. This result was obtained with VSIPL as the math library. I am now working on getting BLAS installed so I can test against that instead.
Second Attempt
============================================================================
HPLinpack 1.0a -- High-Performance Linpack benchmark -- January 20, 2004
Written by A. Petitet and R. Clint Whaley, Innovative Computing Labs., UTK
============================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 8192
NB     : 16
PMAP   : Row-major process mapping
P      : 1
Q      : 1
PFACT  : Left Crout Right
NBMIN  : 2 4
NDIV   : 2
RFACT  : Left Crout Right
BCAST  : 1ring
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1 * N )
   2) ||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L2        8192    16     1     1            1554.50          2.358e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0625347 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0263812 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0043538 ...... PASSED
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L4        8192    16     1     1            1546.80          2.370e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0625347 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0263812 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0043538 ...... PASSED
Third Attempt
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L2        8192    32     1     1            1203.06          3.047e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0597090 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0251891 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0041571 ...... PASSED
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L4        8192    32     1     1            1202.94          3.048e-01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0597090 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0251891 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0041571 ...... PASSED
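The Gflops column in the runs above can be cross-checked from N and Time. The sketch below assumes the usual LU operation count for Linpack, (2/3)N^3 + (3/2)N^2 floating-point operations; that count and the helper name `hpl_gflops` are my assumptions, not something printed in the output. It also reports how much wall time the change from NB = 16 to NB = 32 saved.

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Rate in Gflop/s, assuming (2/3)N^3 + (3/2)N^2 operations (an assumption)."""
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


# Wall times (seconds) copied from the WR00L2L2 rows above; N = 8192 in both.
runs = {
    "second attempt, NB=16": 1554.50,
    "third attempt, NB=32": 1203.06,
}

for label, seconds in runs.items():
    print(f"{label}: {hpl_gflops(8192, seconds):.3e} Gflops")

# Ratio of wall times: how much faster the NB=32 run finished.
speedup = runs["second attempt, NB=16"] / runs["third attempt, NB=32"]
print(f"speedup from NB=16 to NB=32: {speedup:.2f}x")
```

If the computed rates agree with the reported Gflops column to the printed precision, the timing and problem size were read correctly.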
Fourth Attempt
============================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 8192
NB     : 16
PMAP   : Row-major process mapping
P      : 1
Q      : 1
PFACT  : Left Crout Right
NBMIN  : 2 4
NDIV   : 2
RFACT  : Left Crout Right
BCAST  : 1ring
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1 * N )
   2) ||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L2        8192    16     1     1            8794.08          4.169e-02
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0522414 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0220388 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0036372 ...... PASSED
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00L2L4        8192    16     1     1            8676.29          4.225e-02
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1 * N ) = 0.0522414 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1 * ||x||_1 ) = 0.0220388 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0036372 ...... PASSED
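The three scaled residual checks listed in the output can be reproduced on a toy system. This is only a rough sketch in plain Python: the helper names and the 2x2 example are hypothetical, and it is not how HPL itself computes the norms on a distributed matrix. It uses eps = 2^-53, which matches the printed value 1.110223e-16, and the pass threshold of 16.0 stated in the output.

```python
EPS = 2.0 ** -53        # 1.110223e-16, the eps value printed in the output
THRESHOLD = 16.0        # pass limit stated in the output


def vec_norm_1(v):
    return sum(abs(x) for x in v)


def vec_norm_inf(v):
    return max(abs(x) for x in v)


def mat_norm_1(a):
    # Largest absolute column sum.
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def mat_norm_inf(a):
    # Largest absolute row sum.
    return max(sum(abs(x) for x in row) for row in a)


def scaled_residuals(a, x, b):
    """Return the three scaled residuals in the order the output lists them."""
    n = len(a)
    r = [sum(a[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
    r_inf = vec_norm_inf(r)
    return (
        r_inf / (EPS * mat_norm_1(a) * n),
        r_inf / (EPS * mat_norm_1(a) * vec_norm_1(x)),
        r_inf / (EPS * mat_norm_inf(a) * vec_norm_inf(x)),
    )


# Hypothetical 2x2 system whose exact solution is x = (0.1, 0.6).
a = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
good = scaled_residuals(a, [0.1, 0.6], b)
bad = scaled_residuals(a, [1.0, 1.0], b)

for value in good:
    print(f"{value:.7f} ...... {'PASSED' if value < THRESHOLD else 'FAILED'}")
```

A solution that is accurate to rounding error gives residuals of order 1 or smaller, which is why values such as 0.05 in the runs above pass comfortably.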