How do I infer fan-out in an Altera OpenCL design?

I am currently implementing a 2-D PE (processing element) array in my Altera OpenCL design on an FPGA. In this implementation, each BRAM feeds multiple PEs at the same time, but it looks like this fan-out structure cannot be inferred by the compiler. Does anyone have an idea how this can be achieved in OpenCL C code?
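One pattern that is often used to make the fan-out explicit is to read the BRAM once per iteration into a private register and broadcast that value to the PEs with a fully unrolled loop, so the replication is visible to the compiler. The kernel below is only a sketch under that assumption; the sizes, names and arithmetic are illustrative and not taken from the design described above.

    #define NUM_PE 8
    #define TILE   256

    // Single work-item style sketch: the __local array is intended to map to
    // on-chip BRAM, and the single read per iteration is broadcast to all PEs
    // by the fully unrolled inner loop.
    __kernel void pe_row(__global const float *restrict in,
                         __global float *restrict out,
                         const int n)
    {
        __local float tile[TILE];
        for (int i = 0; i < TILE; ++i)
            tile[i] = in[i];

        float acc[NUM_PE];
        #pragma unroll
        for (int p = 0; p < NUM_PE; ++p)
            acc[p] = 0.0f;

        for (int i = 0; i < n; ++i) {
            float v = tile[i % TILE];          // one BRAM read per iteration ...
            #pragma unroll
            for (int p = 0; p < NUM_PE; ++p)
                acc[p] += v * (float)(p + 1);  // ... fanned out to NUM_PE multipliers
        }

        #pragma unroll
        for (int p = 0; p < NUM_PE; ++p)
            out[p] = acc[p];
    }

Whether the compiler then shares one BRAM port or replicates the memory still depends on its analysis, so it is worth checking the optimization report after making the unroll explicit.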

Related

Translate OpenCL SPIR-V to Vulkan SPIR-V

Is it possible to translate OpenCL-style SPIR-V to Vulkan-style SPIR-V?
I know that it is possible to use clspv to compile OpenCL C to Vulkan-style SPIR-V, but I haven't seen any indication that it also supports ingesting OpenCL-style SPIR-V.
Thank you for any suggestions if you know how to achieve this :)
clspv compiles to "OpenCL-style SPIR-V". In other words, it uses the OpenCL execution model and also the OpenCL memory model. The answer to your question is no (in general). The problem is that GLSL, for example, uses the logical memory model, which means pointers are abstract, so you cannot have pointers to pointers, while OpenCL allows this because it uses the physical memory model. There are other things in OpenCL as well that cannot be expressed in GLSL. You could try to write such a translator, and it might work for some very simple code, but that's about it.
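To make the memory-model point concrete, here is a contrived kernel that is legal OpenCL C (pointers are real addresses under the physical memory model) but has no direct equivalent under GLSL's logical memory model, because it stores a pointer in a variable and then takes a pointer to that pointer:

    __kernel void pick(__global const int *a,
                       __global const int *b,
                       __global int *out)
    {
        size_t i = get_global_id(0);
        __global const int *src = (i & 1) ? a : b;  // pointer held in a private variable
        __global const int **p  = &src;             // pointer to a pointer: legal in OpenCL C
        out[i] = (*p)[i];
    }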

Metal 2 vs OpenCL 1.2 for compute: what is Metal missing?

I have an OpenCL 1.2 application that I would like to run on iOS. So my only choice for GPGPU is Metal. I am curious about what is missing in Metal relative to OpenCL. My current app makes heavy use of OpenCL images, and of compute features such as popcnt.
I don't know OpenCL, but I doubt Metal is missing much, since it was designed much later. You can see from the Metal Shading Language Specification (PDF) that it provides the popcount() function.
Compute functions in Metal can read from and write to textures as well as buffers, if that's what OpenCL images are used for.
As warrenm points out, Metal does not support double-precision floating-point types.
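For reference, the OpenCL 1.2 features mentioned in the question look roughly like this on the OpenCL side; the kernel is purely illustrative. Metal's counterparts are its popcount() built-in and texture reads/writes from compute functions, but a kernel like this would still have to be ported by hand:

    __constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |
                               CLK_ADDRESS_CLAMP_TO_EDGE |
                               CLK_FILTER_NEAREST;

    __kernel void count_bits(read_only image2d_t src,
                             __global uint *out,
                             const int width)
    {
        int2 pos = (int2)(get_global_id(0), get_global_id(1));
        uint4 px = read_imageui(src, smp, pos);
        out[pos.y * width + pos.x] = popcount(px.x);  // popcount() is an OpenCL 1.2 built-in
    }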

Is there a general binary intermediate representation for OpenCL kernel programming?

As I understand it, OpenCL uses a modified C language (with added keywords such as __global) as the general-purpose way to define kernel functions. I am now writing a front end in the F# language, which has a code quotation feature that enables metaprogramming (you can think of it as a kind of reflection). So I would like to know whether there is a general binary intermediate representation for the kernel instead of a C source file.
I know that CUDA supports LLVM IR as a binary intermediate representation, so kernels can be created programmatically, and I want to do the same thing with OpenCL. But the documentation says that the binary format is not specified; each implementation can use its own binary format. So is there any general-purpose IR which can be generated by a program and can also run on the NVIDIA, AMD and Intel implementations of OpenCL?
Thanks.
No, not yet. Khronos is working on SPIR (the spec is still provisional), which would hopefully become this. As far as I can tell, none of the major implementations support it yet. Unless you want to bet your project on its success and possibly delay your project for a year or two, you should probably start with generating code in the C dialect.
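In practice, "generating code in the C dialect" just means assembling the kernel source as a string on the host and handing it to the runtime; clCreateProgramWithSource and clBuildProgram are the standard entry points. A minimal host-side sketch in C (error handling omitted; the generated kernel is a stand-in for whatever the F# front end would emit):

    #include <CL/cl.h>

    /* Build a kernel from a source string generated at run time (error checks omitted). */
    cl_kernel build_generated_kernel(cl_context ctx, cl_device_id dev)
    {
        /* In a real front end this string would be emitted by the code-quotation layer. */
        const char *src =
            "__kernel void add(__global const float *a,\n"
            "                  __global const float *b,\n"
            "                  __global float *c)\n"
            "{\n"
            "    size_t i = get_global_id(0);\n"
            "    c[i] = a[i] + b[i];\n"
            "}\n";

        cl_int err;
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, &err);
        clBuildProgram(prog, 1, &dev, "", NULL, NULL);
        return clCreateKernel(prog, "add", &err);
    }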

Kernels can invoke a broader number of functions than shaders

I read an article which stated that "Kernels can invoke a broader number of functions than shaders". How far is this true?
The link for that article is http://www.dyn-lab.com/articles/cl-gl.html
The difference is quite the opposite actually. If you compare Section 8 of the GLSL specification with Section 6.12 of the OpenCL specification, you can see that there is a large overlap concerning mathematical operations.
However, GLSL has far more bit- and image-related operations and provides matrix operations which do not exist in OpenCL 1.2. On the other hand, OpenCL has more synchronization primitives and work-group management functions that are not necessary in GLSL. Moreover, OpenCL provides smaller and larger integer types than GLSL.
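A contrived kernel that collects a few of those OpenCL-only items in one place (small integer types, work-group management functions and an explicit synchronization primitive), for illustration only:

    __kernel void reduce_uchar(__global const uchar *in,   // 8-bit integer type, not available in GLSL
                               __global uint *out,
                               __local uint *scratch)
    {
        size_t lid = get_local_id(0);                      // work-group management built-ins
        scratch[lid] = (uint)in[get_global_id(0)];

        barrier(CLK_LOCAL_MEM_FENCE);                      // explicit synchronization primitive

        if (lid == 0) {
            uint sum = 0;
            for (size_t i = 0; i < get_local_size(0); ++i)
                sum += scratch[i];
            out[get_group_id(0)] = sum;
        }
    }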
Also, in Appendix C of the AMD APP OpenCL Programming Guide, the number and types of available functions are not listed as a major difference between a shader and a kernel.

Any new ideas on using OpenCL with multiple GPUs?

My question is:
Has there been any new advancement (or perhaps a tool/library developed) for using OpenCL with multiple GPUs? I understand that if someone wants to write code in OpenCL with the goal of using multiple GPUs, they can, but I have been told that the way you arrange the communication between them is a little "primitive". What I want to know is whether there is something out there that can put a level of abstraction between the programmer and all that arrangement of communication between the GPUs.
I am working on stochastic simulations with pretty big lattices, and I would like to be able to break them up across different GPUs, each of which does its part of the computation and communicates when necessary. Writing this so that it is efficient is difficult enough, so if I can avoid all the low-level work of doing it the standard way through OpenCL, it would be a big help.
Thanks!
On the academic side, there is this paper from Seoul National University in South Korea:
Achieving a single compute device image in OpenCL for multiple GPUs, http://dl.acm.org/citation.cfm?id=1941591
The authors propose an automatic mechanism for dividing a kernel across multiple GPUs. Unfortunately, their framework has not been released yet.
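For comparison, the "primitive" standard approach the question refers to amounts to host code along these lines: enumerate the GPUs, create one command queue per device, partition the lattice by hand, and exchange boundary (halo) regions through host memory between iterations. A sketch in C with illustrative names, error handling omitted:

    #include <CL/cl.h>

    #define MAX_GPUS 8

    void setup_multi_gpu(cl_platform_id platform)
    {
        cl_device_id devs[MAX_GPUS];
        cl_uint num_devs = 0;
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, MAX_GPUS, devs, &num_devs);

        /* One context shared by all devices, one command queue per device. */
        cl_context ctx = clCreateContext(NULL, num_devs, devs, NULL, NULL, NULL);
        cl_command_queue queues[MAX_GPUS];
        for (cl_uint d = 0; d < num_devs; ++d)
            queues[d] = clCreateCommandQueue(ctx, devs[d], 0, NULL);

        /* The application itself must split the lattice into one sub-domain per
           queue and copy halo rows back through host memory between iterations;
           this is exactly the bookkeeping the question would like a library to
           abstract away. */
    }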
