SGEMM and DGEMM compute, in single and double precision, respectively:
C := alpha*op( A )*op( B ) + beta*C
where:
op( A ) = A
op( B ) = B
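For readers who want to see the operation spelled out, here is a minimal pure-Python sketch of the update with op( A ) = A and op( B ) = B. It is an illustration only, not any of the benchmarked libraries; the function name `gemm` and the list-of-rows storage are assumptions made for this example.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha*A*B + beta*C for matrices stored as lists of rows.

    Illustrative reference only: no transposition (op(X) = X), no blocking,
    and no call into a real BLAS library.
    """
    m = len(a)        # rows of A and C
    k = len(b)        # columns of A, rows of B
    n = len(b[0])     # columns of B and C
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("C must be m-by-n")

    result = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of A with column j of B.
            dot = sum(a[i][p] * b[p][j] for p in range(k))
            row.append(alpha * dot + beta * c[i][j])
        result.append(row)
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(gemm(2.0, a, b, 0.5, c))  # [[38.5, 44.5], [86.5, 100.5]]
```

The triple loop performs one multiply and one add per innermost step, which is where the usual operation count for a matrix product comes from.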
All cases were run on a single processor on one of the Hoffman2 Cluster compute nodes. The code is single-threaded and statically linked.
MFLOPS is calculated as:
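MFLOPS = (1x10**-6) * 2 * N**3 / (CPU seconds)
where N is the order of the square matrices; multiplying two N x N matrices takes about 2 * N**3 floating-point operations.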
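The rate calculation can be sketched in Python as follows. The helper names `gemm_flops` and `mflops` are hypothetical, and the operation count is an assumption: the conventional count for an M x K by K x N matrix product is 2*M*N*K, which is 2 * N**3 for square matrices of order N. The timing value in the example is made up for illustration and is not a measured result.

```python
def gemm_flops(m, n, k):
    """Conventional operation count for an (m x k) by (k x n) product:
    one multiply and one add per inner-loop step (assumed count)."""
    return 2 * m * n * k


def mflops(flop_count, cpu_seconds):
    """Convert an operation count and a CPU time into MFLOPS."""
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return flop_count / cpu_seconds / 1e6


# Example with a made-up timing: order-1000 matrices, 0.5 CPU seconds.
n = 1000
rate = mflops(gemm_flops(n, n, n), 0.5)
print(rate)  # 4000.0
```

In a real measurement, the CPU seconds could be taken with something like `time.process_time()` around the GEMM call, which counts processor time rather than wall-clock time.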
BLAS implementations compared: the reference BLAS library from the Netlib Repository, the ATLAS library, the Intel MKL library, the AMD ACML library, and GotoBLAS.