TestClusterCompilingSuccess


Progress of each group in compiling and running CUDA/OpenCL code on the test nodes

C - compile successfully
R - run successfully (with number of GPUs)
x - no port available

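Rows: group, application. Columns (node : device, API), left to right:
  alpha : Tesla C2050, OpenCL
  alpha : Tesla C2050, Cuda
  beta : ATI Radeon HD, OpenCL
  gamma : GeForce GTX 470 (x3), OpenCL
  gamma : GeForce GTX 470 (x3), Cuda
  gamma : Quadro 4000, OpenCL
  gamma : Quadro 4000, Cuda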
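Walker Homework MD (200k Atoms)
  alpha : Tesla C2050, OpenCL: CR (1float=26.78s) (1double=42.74s)
  alpha : Tesla C2050, Cuda: CR (1float=23.02s) (1double=38.72s)
  beta : ATI Radeon HD, OpenCL: CR (cpufloat=200.86s) (cpudouble=356.74s)
  gamma : GeForce GTX 470 (x3), OpenCL: CR (1float=29.18s) (1double=64.13s)
  gamma : GeForce GTX 470 (x3), Cuda: CR (1float=26.04s) (1double=57.06s)
  gamma : Quadro 4000, OpenCL: CR (1float=50.42s) (1double=85.80s)
  gamma : Quadro 4000, Cuda: CR (1float=51.91s) (1double=81.99s)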
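Walker Homework MD (1M Atoms)
  alpha : Tesla C2050, OpenCL: (1float=11:54.56) (1double=16:41.67)
  alpha : Tesla C2050, Cuda: (1float=11:47.84) (1double=16:25.28)
  beta : ATI Radeon HD, OpenCL: (cpufloat=1:23:37.61) (cpudouble=2:28:11.58)
  gamma : GeForce GTX 470 (x3), OpenCL: (1float=12:27.40) (1double=25:18.61)
  gamma : GeForce GTX 470 (x3), Cuda: (1float=12:26.98) (1double=24:10.6)
  gamma : Quadro 4000, OpenCL: (1float=25:19.88) (1double=34:13.78)
  gamma : Quadro 4000, Cuda: (1float=25:34.34) (1double=34:48.59)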
Walker LAMMPS x x x
Walker HOOMD (binary) x x x
Palmeri/Polyn OSNN1-OCL
Palmeri/Polyn OSNN2-CUDA C C
Meiler ANN
  R (1=58.296s) (double=100.183s)
  R (1=58.253s)
  R (1=69.07s) (cpu=working)
  R (1=53.777s) (2=27.152s) (3=18.189s) (3double=46.5s)
  R (1=53.226s) (2=27.09s) (3=18.059s)
  R (float=90.068s) (double=163.441s)
Meiler Matrix/Vector float (CPU/GPU) R (1) R (1) R (1) R (1) R (1) R (1)
Meiler Matrix/Vector double (CPU/GPU) R (1) R (1) R (1) R (1) R (1) R (1)
Meiler Pearson's product-moment coefficient for electron density CCC
  R (1 double=9.771s)
  R (1 double=9.503s)
  R (1 double=16.455s)
Kelly/Andreas
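
The timings above are easier to compare as ratios. The following is a minimal sketch that computes speedup and parallel efficiency from values copied out of the table: the Meiler ANN times for 1, 2 and 3 GPUs (taken from a single cell) and the Walker Homework MD (200k Atoms) float times. The helper names `speedup` and `parallel_efficiency` are hypothetical and are not part of any group's code.

```python
def speedup(baseline_s, measured_s):
    """Return how many times faster measured_s is than baseline_s."""
    return baseline_s / measured_s


def parallel_efficiency(t1_s, tn_s, n):
    """Speedup on n devices divided by n (1.0 would be ideal scaling)."""
    return speedup(t1_s, tn_s) / n


# Meiler ANN: wall-clock seconds for 1, 2 and 3 GPUs, copied from one table cell.
ann_times = {1: 53.777, 2: 27.152, 3: 18.189}
for n, t in ann_times.items():
    s = speedup(ann_times[1], t)
    e = parallel_efficiency(ann_times[1], t, n)
    print(f"{n} GPU(s): speedup {s:.2f}x, efficiency {e:.0%}")

# Walker Homework MD (200k Atoms), float: the cpufloat time against the
# first 1float time listed in the same row.
md_cpu_vs_gpu = speedup(200.86, 26.78)
print(f"MD 200k float, CPU vs GPU: {md_cpu_vs_gpu:.1f}x")
```

Times given as h:mm:ss in the 1M Atoms row would need to be converted to seconds before being passed to these helpers.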

Note: Meiler ANN results on beta used ATI Stream 2.1.
Note: Meiler Matrix/Vector gave wrong results on beta because the kernel assumed a warp size of 32 and that get_local_size(0) is less than or equal to 512, neither of which holds on the CPU/ATI GPU.
Note: Meiler Pearson's product-moment coefficient (cross correlation coefficient) was run on two sample sets with 92,000 elements for 10,000 iterations (serial CPU 25.230s) (OpenCL CPU 12min 31.555s).
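
For reference, the quantity being benchmarked can be written as a short serial routine. This is a minimal sketch of the textbook definition of Pearson's product-moment coefficient, not the Meiler group's kernel; the function name `pearson` and the sample data are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson's product-moment coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("coefficient is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Illustrative data only: a perfectly linear and a perfectly inverse relation.
print(pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

The three sums are independent reductions over the sample, which is the part a GPU implementation would typically parallelize.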
Note: Walker Homework MD uses a brute-force method; it is a homework problem from the HPC class.
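
To show why the brute-force timings grow so quickly from 200k to 1M atoms, here is a minimal sketch of an all-pairs force loop. It assumes that "brute force" means every pair of atoms is evaluated with no neighbor or cell lists, and it assumes a Lennard-Jones potential; the page does not state which potential the homework uses. The names `pair_count` and `lj_forces` are hypothetical.

```python
def pair_count(n_atoms):
    """Number of distinct pairs evaluated per step by an all-pairs loop."""
    return n_atoms * (n_atoms - 1) // 2


def lj_forces(positions, epsilon=1.0, sigma=1.0):
    """All-pairs Lennard-Jones forces (assumed potential), O(N^2) in atoms."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from atom j to atom i.
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            # F_i = 24 * eps * (2 * (s/r)^12 - (s/r)^6) / r^2 * d
            scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += scale * d[k]
                forces[j][k] -= scale * d[k]  # Newton's third law
    return forces


print(pair_count(200_000))    # pairs per step for the 200k Atoms case
print(pair_count(1_000_000))  # pairs per step for the 1M Atoms case
print(lj_forces([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]))
```

Under this assumption the 1M Atoms case evaluates roughly 25 times as many pairs per step as the 200k Atoms case, since the pair count scales with the square of the atom count.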