lookitx.blogg.se

OpenCL benchmark Linux

The beauty of the vendor-independent standard OpenCL is that a single kernel language is sufficient to program many different architectures, ranging from dual-core CPUs over Intel's Many Integrated Cores (MIC) architecture to GPUs and even FPGAs. The kernels are just-in-time compiled during the program run, which has several advantages and disadvantages.

Advantage: Binary can be fully optimized for the underlying hardware.
Disadvantage: Just-in-time compilation induces overhead.
Disadvantage: No automatic performance portability.

Today's blog post is about just-in-time (jit) compilation overhead. Ideally, jit-compilation is infinitely fast. In reality, it is sufficient to keep the jit-compilation time small compared to the overall execution time. But what is 'small'?

Consider simple OpenCL kernels like the following:

    __kernel void kernel_1_2(__global float * x) { x[2] = 1; }

The kernel only sets the third entry of the buffer x to 1. Since the kernel is so simple, it is reasonable to expect that a jit-compiler only requires a fraction of a second to compile this kernel.

If you run a more involved OpenCL application, you may need a couple of different kernels. Taking the iterative solvers in ViennaCL as an example, one quickly needs about 10 different kernels for simple setups. More elaborate preconditioners quickly drive up the number to 15 or 20. Each of those kernels is more involved than the one shown above.

To mimic a realistic workload, consider the compilation of 64 kernels similar to the one above. By varying the kernel name (hence the _1_2 suffix) and the index used, we make sure that all the kernels are indeed distinct and that no caching optimizations in the jit-compilers are triggered.

How are the 64 kernels compiled? There are several options with OpenCL: One may put all kernels into a single OpenCL program (i.e. compilation unit) and call the jit-compiler only once. On the other hand, one may use one OpenCL program per kernel, resulting in 64 compilation units. Other combinations such as 2, 4, 8, 16, or 32 kernels per OpenCL program are also considered in this benchmark (always adding up to a total of 64 kernels).

For comparison we select the recent OpenCL SDKs from the major vendors. The benchmark is run on an openSUSE 13.2 machine with an AMD FirePro W9100 GPU to measure jit-compilation overhead for an AMD GPU. All other benchmarks are taken on a Linux Mint Maya machine with an AMD A10-5800K APU equipped with a discrete NVIDIA GeForce GTX 750 Ti GPU. The CPUs on the two machines are comparable: The deviation of the jit-compilation time when targeting the CPU with the AMD OpenCL SDK is below ten percent. Unfortunately, a suitable Intel GPU was not available for measuring the jit-overhead of the Intel OpenCL SDK when targeting GPUs.

The OpenCL JIT compilation benchmark source code and results are available for download and should answer any remaining questions on details.
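
To make the grouping of the 64 kernels into compilation units more concrete, here is a minimal sketch in Python that only generates the kernel source strings; it is not the benchmark's actual code. The helper names `kernel_source` and `build_programs` are hypothetical, and the naming scheme (kernel_<program>_<kernel>, writing to the index given by the second suffix) is an assumption inferred from the kernel_1_2 example.

```python
def kernel_source(program_idx, kernel_idx):
    """Return OpenCL C source for one trivial kernel with a unique name.

    Assumption: the suffix is <program>_<kernel> and the kernel writes to
    the entry given by the second suffix, as in kernel_1_2 writing x[2].
    """
    return ("__kernel void kernel_%d_%d(__global float * x) { x[%d] = 1; }"
            % (program_idx, kernel_idx, kernel_idx))


def build_programs(total_kernels=64, kernels_per_program=1):
    """Split total_kernels distinct kernels into equally sized programs.

    Each returned string is the source of one compilation unit.
    """
    if total_kernels % kernels_per_program != 0:
        raise ValueError("kernels_per_program must divide total_kernels")
    num_programs = total_kernels // kernels_per_program
    return ["\n".join(kernel_source(p, k) for k in range(kernels_per_program))
            for p in range(num_programs)]


if __name__ == "__main__":
    # The groupings named in the text: 1, 2, 4, 8, 16, 32, or 64 kernels per program.
    for per_program in (1, 2, 4, 8, 16, 32, 64):
        programs = build_programs(64, per_program)
        print(per_program, "kernels per program ->", len(programs), "programs")
```

In a real measurement, each generated string would presumably be handed to clCreateProgramWithSource and clBuildProgram, with a timer around the build calls; that part is omitted here because it requires an OpenCL runtime.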