Wednesday, March 29, 2017

MATLAB backslash benchmarking on Dell T3600 + Tesla C2075

Dell T3600 (CPU: Intel Xeon E5-1607, quad-core, 3 GHz) +
(GPU: Tesla C2075, 14 multiprocessors x 32 cores per multiprocessor = 448 cores, 1.15 GHz)
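
For context, the theoretical peak of the C2075 can be worked out from the figures above. This is only a rough sketch, assuming each CUDA core retires one fused multiply-add (2 flops) per cycle and that double precision runs at half the single-precision rate on this card. The variable names are illustrative, and the two "best" values are copied from the log further down.

```matlab
% Theoretical peak throughput of the Tesla C2075 from the spec line above.
numMP       = 14;     % multiprocessors
coresPerMP  = 32;     % CUDA cores per multiprocessor
clockGHz    = 1.15;   % core clock in GHz
flopsPerFMA = 2;      % assumption: one fused multiply-add per core per cycle

peakSingle = numMP * coresPerMP * clockGHz * flopsPerFMA;  % GFlops
peakDouble = peakSingle / 2;   % assumption: double precision at half rate

% Highest GPU figures reported by the benchmark (largest matrix sizes)
bestSingle = 628.924035;
bestDouble = 302.010177;

fprintf('Single: peak %.1f GFlops, measured %.1f (%.0f%% of peak)\n', ...
        peakSingle, bestSingle, 100 * bestSingle / peakSingle);
fprintf('Double: peak %.1f GFlops, measured %.1f (%.0f%% of peak)\n', ...
        peakDouble, bestDouble, 100 * bestDouble / peakDouble);
```

Under these assumptions the backslash solve reaches roughly 60% of the card's theoretical peak in both precisions.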

>> paralleldemo_gpu_backslash
Warning: Support for GPU devices with Compute Capability 2.0 will be removed in a future MATLAB release.
To learn more about supported GPU devices, see www.mathworks.com/gpudevice. 
> In parallel.internal.gpu.selectDevice
  In parallel.gpu.GPUDevice.current (line 44)
  In gpuDevice (line 23)
  In paralleldemo_gpu_backslash (line 25) 

Starting benchmarks with 8 different single-precision matrices of sizes
ranging from 1024-by-1024 to 22528-by-22528.
Creating a matrix of size 1024-by-1024.
Gigaflops on CPU: 48.665627
Gigaflops on GPU: 88.271370
Creating a matrix of size 4096-by-4096.
Gigaflops on CPU: 100.882839
Gigaflops on GPU: 413.998827
Creating a matrix of size 7168-by-7168.
Gigaflops on CPU: 118.687441
Gigaflops on GPU: 509.776228
Creating a matrix of size 10240-by-10240.
Gigaflops on CPU: 133.868299
Gigaflops on GPU: 573.530695
Creating a matrix of size 13312-by-13312.
Gigaflops on CPU: 139.691629
Gigaflops on GPU: 599.239014
Creating a matrix of size 16384-by-16384.
Gigaflops on CPU: 137.981033
Gigaflops on GPU: 611.514780
Creating a matrix of size 19456-by-19456.
Gigaflops on CPU: 143.995683
Gigaflops on GPU: 620.246637
Creating a matrix of size 22528-by-22528.
Gigaflops on CPU: 149.399225
Gigaflops on GPU: 628.924035

Starting benchmarks with 6 different double-precision matrices of sizes
ranging from 1024-by-1024 to 16384-by-16384.
Creating a matrix of size 1024-by-1024.
Gigaflops on CPU: 29.380086
Gigaflops on GPU: 63.980649
Creating a matrix of size 4096-by-4096.
Gigaflops on CPU: 48.082489
Gigaflops on GPU: 227.926438
Creating a matrix of size 7168-by-7168.
Gigaflops on CPU: 61.270138
Gigaflops on GPU: 270.644244
Creating a matrix of size 10240-by-10240.
Gigaflops on CPU: 64.503818
Gigaflops on GPU: 291.146412
Creating a matrix of size 13312-by-13312.
Gigaflops on CPU: 68.655104
Gigaflops on GPU: 300.565164
Creating a matrix of size 16384-by-16384.
Gigaflops on CPU: 56.189737
Gigaflops on GPU: 302.010177

ans = 

  struct with fields:

         sizeSingle: [1024 4096 7168 10240 13312 16384 19456 22528]
    gflopsSingleCPU: [48.6656 100.8828 118.6874 133.8683 139.6916 137.9810 143.9957 149.3992]
    gflopsSingleGPU: [88.2714 413.9988 509.7762 573.5307 599.2390 611.5148 620.2466 628.9240]
         sizeDouble: [1024 4096 7168 10240 13312 16384]
    gflopsDoubleCPU: [29.3801 48.0825 61.2701 64.5038 68.6551 56.1897]
    gflopsDoubleGPU: [63.9806 227.9264 270.6442 291.1464 300.5652 302.0102]
[Figures: Single precision GFlops; Double precision GFlops; Speedup]
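
The speedup can be recomputed from the reported numbers. The sketch below assumes that speedup means the ratio of GPU to CPU gigaflops at each matrix size; the struct name `results` is hypothetical, and the values are copied from the output above.

```matlab
% Recompute GPU-over-CPU speedup from the benchmark output above.
% "results" is a hypothetical name for the struct printed as "ans".
results.sizeSingle      = [1024 4096 7168 10240 13312 16384 19456 22528];
results.gflopsSingleCPU = [48.6656 100.8828 118.6874 133.8683 139.6916 137.9810 143.9957 149.3992];
results.gflopsSingleGPU = [88.2714 413.9988 509.7762 573.5307 599.2390 611.5148 620.2466 628.9240];
results.sizeDouble      = [1024 4096 7168 10240 13312 16384];
results.gflopsDoubleCPU = [29.3801 48.0825 61.2701 64.5038 68.6551 56.1897];
results.gflopsDoubleGPU = [63.9806 227.9264 270.6442 291.1464 300.5652 302.0102];

% Assumption: speedup is the element-wise ratio GPU / CPU
speedupSingle = results.gflopsSingleGPU ./ results.gflopsSingleCPU;
speedupDouble = results.gflopsDoubleGPU ./ results.gflopsDoubleCPU;

% Print one row per matrix size
for k = 1:numel(results.sizeSingle)
    fprintf('single %5d-by-%-5d  speedup %.2fx\n', ...
            results.sizeSingle(k), results.sizeSingle(k), speedupSingle(k));
end
for k = 1:numel(results.sizeDouble)
    fprintf('double %5d-by-%-5d  speedup %.2fx\n', ...
            results.sizeDouble(k), results.sizeDouble(k), speedupDouble(k));
end
```

On this machine the GPU is about 2x faster at 1024-by-1024 and a little over 4x faster for the larger matrices. The jump in the last double-precision ratio comes from the CPU figure dropping at 16384-by-16384, not from the GPU getting faster.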
