How to plug an already written C/C++ layer into Chainer

I have already written a C++ layer that runs on the CPU, and I want to plug it into the Chainer framework. How can I do that? Can Chainer mix CPU and GPU layers together?

You can use Cython, pybind11, or any other tool that calls C++ code from Python to embed your C++ layer into Chainer. You will need to write a little glue code: converting NumPy array buffers to and from the data format used by the C++ layer, and adapting your layer's interface to a Chainer-style Function. The latter should be easy to do by writing a small Python class.
To mix CPU and GPU in your forward/backward computations, you can use F.copy(), which supports backprop (see https://docs.chainer.org/en/stable/reference/generated/chainer.functions.copy.html?highlight=copy).

Related

libuv vs asyncio (python)

I have been trying to find the differences in implementation between uvloop and the built-in asyncio that ships with Python. Apart from the fact that libuv, the base of uvloop, is written in C++, no other factor is mentioned on the web. I would like to know what other factors affect the performance difference between them.
Also, as a side note, this blog compares the performance of streams and plain asyncio. Aren't streams built from asyncio, and therefore dependent on it?
uvloop is written in Cython (which compiles to C) on top of libuv, which is itself a C library rather than C++.
Moving code from pure Python to Cython usually gives a noticeable speed boost, which is exactly what is happening here; no other difference is needed to explain it. It is much like NumPy performing operations faster than equivalent code written in plain Python.
For your other question: The difference between asyncio and asyncio-streams is that streams are built on top of the basic asyncio.
Asyncio uses transports and protocols, the first responsible for writing to the socket, and the second for handling data received by the socket.
Streams are simple constructs built on top of both, and have an easier to use interface that mimics regular files or sockets.
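To make the layering concrete, here is a minimal sketch, not taken from the blog or from uvloop, that uses only the standard library. The server side uses the low-level transport/protocol API, and the client side uses the higher-level streams API over the same event loop. The names EchoProtocol and echo_once are made up for this illustration.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    """Low-level API: the protocol handles received data, the transport writes it out."""

    def connection_made(self, transport):
        # Keep the transport so we can write to the socket later.
        self.transport = transport

    def data_received(self, data):
        # Echo whatever arrives straight back to the peer.
        self.transport.write(data)


async def echo_once(message: bytes) -> bytes:
    """Start a protocol-based echo server, then talk to it through streams."""
    loop = asyncio.get_running_loop()
    # Port 0 asks the OS for any free port (an assumption made for this demo).
    server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        # High-level API: a reader/writer pair that behaves like a file or socket.
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(message)
        await writer.drain()
        reply = await reader.readexactly(len(message))
        writer.close()
        await writer.wait_closed()
        return reply
    finally:
        server.close()
        await server.wait_closed()


if __name__ == "__main__":
    print(asyncio.run(echo_once(b"ping")))
```

Both halves run on the same loop, so replacing the default loop with uvloop would change the loop implementation underneath without changing either API.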

Reason to use Qt standard library function wrappers

Is there any reason to use Qt standard function wrappers like qstrncpy instead of strncpy?
I could not find any hint in documentation. And I'm curious if there is any functional difference. It looks like making code dependent on Qt, even in not mandatory places.
I found this: Qt wrapper for C libraries
But it doesn't answer my question.
These methods are part of Qt's efforts for platform-independence. Qt tries to hide platform differences and use the best each platform has to offer, replicating that functionality on platforms where it is not available. Here is what the documentation of qstrncpy has to say:
A safe strncpy() function.
Copies at most len bytes from src (stopping at len or the terminating '\0' whichever comes first) into dst and returns a pointer to dst. Guarantees that dst is '\0'-terminated. If src or dst is nullptr, returns nullptr immediately.
[…]
Note: When compiling with Visual C++ compiler version 14.00 (Visual C++ 2005) or later, internally the function strncpy_s will be used.
So qstrncpy is safer than strncpy.
The Qt wrappers for these functions are safer than the standard ones because they guarantee the destination string will always be null-terminated. strncpy() does not guarantee this.
C11 added strncpy_s() and other _s-suffixed functions as bounds-checked string functions (in the optional Annex K). However, they are not part of any C++ standard; they are C-only. The Qt wrappers fill this gap.

HEVC Deblocking with parallel processing on OpenCL

I have been working on HEVC for the past 2 years, and I was recently asked to port the x265 code to OpenCL for parallel processing. I am still at the starting stage and have some concerns: x265 uses many classes, and classes cannot be used in OpenCL kernels. Would it be possible to pass a structure instead, given that I have some function prototypes within the class? Is it possible to replicate the same on the GPU?
Yes. As you mentioned, you cannot pass a class to the kernel function. However, you should be able to include the prototypes in a structure and pass that to the GPU. You can refer to this link: passing parameters of an kernel function as C++ struct?

Is there a general binary intermediate representation for OpenCL kernel programming?

As I understand it, OpenCL uses a modified C language (adding keywords such as __global) for defining kernel functions. I am now writing a front end in F#, which has a code quotation feature that enables metaprogramming (you can think of it as a kind of reflection). So I would like to know whether there is a general binary intermediate representation for kernels, instead of C source files.
I know that CUDA supports LLVM IR as a binary intermediate representation, so kernels can be created programmatically, and I want to do the same thing with OpenCL. But the documentation says that the binary format is not specified and each implementation can use its own. So is there a general-purpose IR that can be generated by a program and that also runs on the NVIDIA, AMD, and Intel implementations of OpenCL?
Thanks.
No, not yet. Khronos is working on SPIR (the spec is still provisional), which would hopefully become this. As far as I can tell, none of the major implementations support it yet. Unless you want to bet your project on its success and possibly delay your project for a year or two, you should probably start with generating code in the C dialect.

Calling external functions when using OpenCL for CPU Device

I am evaluating the possibility for using OpenCL for just-in-time compilation of performance-critical mathematical expressions for CPU devices. I am currently using LLVM directly (or rather, I have a working proof-of-concept), but would find the abstraction offered by OpenCL very useful going forward.
I am now trying to figure out if there is some way to call functions with external linkage when using OpenCL for CPU devices, equivalent to the following in LLVM:
... = llvm::Function::Create(..., llvm::Function::ExternalLinkage, "...", ...);
Since my OpenCL implementation at least is built on top of LLVM, I was hoping that this would be possible somehow.
Does this function accomplish what you are after? http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clEnqueueNativeKernel.html
Edit: credit where credit is due: https://stackoverflow.com/a/10807728/717881

Resources