Are there any 'standard' plugins for detecting the CPU architecture in scons?
BTW, this question was asked already here in a more general form... just wondering if anyone has already taken the time to incorporate this information into scons.
Using i386 is rather compiler dependent, and won't detect non-x86 32-bit architectures. Assuming the Python interpreter used by scons runs on the CPU you are interested in (not always the case - think cross compilation), you can just use Python itself.
import platform
print(platform.machine())
print(platform.architecture())
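The raw strings returned by platform.machine() differ between operating systems (for example "AMD64" on Windows versus "x86_64" on Linux), so a build script usually wants to normalize them first. A minimal sketch of that, assuming a small hand-written alias table (the table, normalize_arch and host_arch are illustrative names, not part of scons):

```python
import platform

# Hypothetical alias table; extend it for the targets you care about.
_ARCH_ALIASES = {
    "x86_64": "x86_64", "amd64": "x86_64",
    "i386": "x86", "i686": "x86", "x86": "x86",
    "aarch64": "arm64", "arm64": "arm64",
}

def normalize_arch(machine):
    """Map a raw platform.machine() string to a canonical name."""
    key = machine.strip().lower()
    # Unknown names pass through lowercased; an empty string becomes "unknown".
    return _ARCH_ALIASES.get(key, key or "unknown")

def host_arch():
    # Describes the machine running the interpreter, not a cross-compilation target.
    return normalize_arch(platform.machine())

print(host_arch())
```

In an SConstruct the result could be stored in the environment or used to pick flags; it still says nothing about the target when cross compiling.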
If you need something more sophisticated, then maybe you will have to write your own configure function - but it may be better to deal with it in your code directly.
Something like this?
env = Environment()
conf = Configure(env)
if conf.CheckDeclaration("__i386__"):
    conf.Define("MY_ARCH", "blahblablah")
env = conf.Finish()
I'm working on some OpenCL code within a larger project. The code only gets compiled at run-time - but I don't want to deploy a version and start it up just for that. Is there some way for me to have the syntax of those kernels checked (even without consider), or even compile them, at least under some restrictions, to make it easier to catch errors earlier?
I will be targeting AMD and/or NVIDIA GPUs.
The type of program you are looking for is an "offline compiler" for OpenCL kernels - knowing this will hopefully help with your search. They exist for many OpenCL implementations, so you should check availability for the specific implementation you are using; otherwise, a quick web search suggests there are some generic open source ones which may or may not fit the bill for you.
If your build machine is also your deployment machine (i.e. your target OpenCL implementation is available on your build machine), you can of course also put together a very basic offline compiler yourself by simply wrapping clBuildProgram() and friends in a basic command line utility.
In Vala, is there a cross-platform way to determine the bitness of the system?
sizeof(void*) will be 8 on 64-bit systems and 4 on 32-bit systems. It would also be 2 on 16-bit systems, but I don't even know whether GLib will work there.
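The same pointer-size check can be sketched in Python (used here only for illustration, not Vala) with the standard struct module. Note that it reports the bitness of the running process, which is also what sizeof(void*) tells you: a 32-bit build on a 64-bit OS still yields 32. The function name pointer_bits is hypothetical.

```python
import struct

def pointer_bits():
    # "P" is the struct format code for a native void*;
    # calcsize returns its size in bytes.
    return struct.calcsize("P") * 8

print(pointer_bits())
```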
The whole point of GLib is to avoid having to do platform specific code.
However, according to your comment, you want to do something like download platform-specific packages.
First of all it would be better to use a system or user package manager to do that, since they already know how to achieve that (DRY principle).
If you absolutely must, you can also use tools like lsb_release -a or the more general uname -a (for the kernel and arch) or some other arguments to those tools.
You can invoke them with GLib's process spawning facilities.
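As a rough illustration of spawning such a tool and reading its output, here is a sketch in Python rather than Vala. It assumes a Unix-like system where uname is on the PATH and falls back to the interpreter's own view otherwise; machine_from_uname is a hypothetical helper name.

```python
import platform
import subprocess

def machine_from_uname():
    """Return the output of `uname -m`, falling back to platform.machine()."""
    try:
        result = subprocess.run(
            ["uname", "-m"], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # uname is missing (e.g. on Windows) or exited with an error.
        return platform.machine()

print(machine_from_uname())
```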
See also:
How to determine whether a given Linux is 32 bit or 64 bit?
And a related problem is OS detection:
Is OS detection possible with GLib?
Also since Vala is a compiled language, you could use your favorite build system to pass something like -DPlatformx64 or -DPlatformx86 to the Vala compiler (see the above link on OS detection for an example on how to use the preprocessor in Vala code).
This is a very strange situation. Why do I get the error
CL_PLATFORM_NOT_FOUND_KHR
when I'm calling this function:
clGetPlatformIDs(0, NULL, &platformCount);
This error did not occur before. I have installed the driver and SDK from Intel and Nvidia. Are there any suggestions?
The documentation explains why this error can occur. clGetPlatformIDs returns CL_SUCCESS if the function is executed successfully and there is a non-zero number of platforms available. Otherwise it can return CL_PLATFORM_NOT_FOUND_KHR if the cl_khr_icd extension is enabled and no platforms are found.
You are in luck. Well sort of... Seeing this is 3 years later.
Disclaimer: I HAVE NO CLUE WHY THIS WORKS:
Machine: x64 windows 10.
Graphics Card: Geforce GTX 960
Total Failure To Load Library : LoadLibraryA( "OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL.dll" )
CORRECT: LoadLibraryA( "OpenCL.dll" )
Here is the really insidious thing: Both of my "WRONG" answers will let you
grab function pointers, but when you call clGetPlatformIDs the return status
will be 0xFFFFFC17 ( CL_PLATFORM_NOT_FOUND_KHR ).
Then you'll be examining your function call correctness.
Maybe you'll even look at the calling convention. Maybe you'll check
the header files and make sure there are not any typos there. And yet,
you are looking in all the wrong places because the original problem happened
more steps back than you think.
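If the status shows up as an unsigned hex value like the one above, it helps to reinterpret it as a signed 32-bit integer before looking it up: CL_PLATFORM_NOT_FOUND_KHR is defined as -1001 by the cl_khr_icd extension. A small Python sketch of that conversion (to_signed32 is a hypothetical helper name):

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

CL_PLATFORM_NOT_FOUND_KHR = -1001  # value defined by the cl_khr_icd extension

status = to_signed32(0xFFFFFC17)
print(status, status == CL_PLATFORM_NOT_FOUND_KHR)  # -1001 True
```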
Because of this problem, I build into my programs code that reads a file:
"OPEN_CL_SEARCH_PATHS.TXT" so the user of the software can change what DLL file
the program attempts to load.
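The search-path idea can be sketched as follows. This is an illustration in Python with ctypes, not the original program: the file format (one library name or path per line, "#" for comments) and the helper names read_search_paths and load_first are assumptions.

```python
import ctypes

def read_search_paths(path):
    """Return non-empty, non-comment lines from the file; [] if it cannot be read."""
    try:
        with open(path, encoding="utf-8") as handle:
            lines = [line.strip() for line in handle]
    except OSError:
        return []
    return [line for line in lines if line and not line.startswith("#")]

def load_first(candidates, loader=ctypes.CDLL):
    """Try each candidate in order; return (name, library) or (None, None)."""
    for name in candidates:
        try:
            return name, loader(name)
        except OSError:
            continue  # not found or not loadable, try the next one
    return None, None

# Fall back to the plain system name when the file is absent or empty.
candidates = read_search_paths("OPEN_CL_SEARCH_PATHS.TXT") or ["OpenCL.dll"]
name, lib = load_first(candidates)
print(name)
```

As the answer points out, a library that loads is not necessarily the right one, so a real program would also want to call clGetPlatformIDs and check the status before settling on a candidate.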
While I am here, I would also like to add that there seems to be a bug with the
driver that makes it so OpenCL <==> OpenGL sharing is NOT a zero-copy share and
is incredibly laggy. Now I've got to go figure out Vulkan to make my fractal
rendering engine even though OpenCL's abstraction better suits the problem.
It is probably important to note that I am NOT using an SDK or any
validation layers. In fact, I am not even using
windows.h.
I wrote assembly code to grab the addresses of GetProcAddress and LoadLibrary by walking the PEB (Process Environment Block). I am also not using cl.h or cl_platform.h.
I reconstruct the structs I need from the documentation. I am also not
bothering with prototypes for function signatures either. For example,
I call "clGetPlatformIDs" by casting it to type "F_03" and then
calling it that way.
typedef void* (*F_03)( void*, void*, void* );
My machine doesn't have a GPU, so I had to use hashcat with OpenCL on the CPU alone. The machine has an Intel Core i3, so I downloaded the OpenCL software from the Intel website and installed it manually, and the error went away.
Source: https://youtu.be/AieYqNQ6ADM
I have OpenMP code written in C, which I ran on the Intel MIC on Stampede. I want to profile the code to find its hotspots so that I can optimize it further. I tried to use the profiler gprof, but I read somewhere that gprof cannot be used on the MIC directly. I then tried perf, following a tutorial. I got as far as the perf annotate step, but when I run it I get the error ")" unexpected. I don't know how to proceed with profiling my code. Can anybody help?
This is the site where I referred to the perf tutorial : sandsoftwaresound.net/perf/perf-tutorial-hot-spots/ .
80% of optimization for the Xeon Phi is the same as for the host (Xeon). Use gprof, printf, compiler options, and the rest of your toolkit and carry your optimization as far as you can executing your code on the host only. After you can do no more, then focus on specific Xeon Phi optimizations.
As you are on Stampede, I assume you are using the Intel compiler. The compiler has a lot of diagnostic capabilities to profile your code and even provide suggestions. I'd provide you with more specific URLs but am on vacation with limited bandwidth.
Though this isn't specific to your question, here are some other suggestions. If you aren't using the Intel compiler, you'll most likely get a substantial boost from switching to it. Intel compilers are danged good at optimizations, especially on Intel architectures. Also, you should use Intel MKL where possible. All of MKL's routines are optimized for the different IA architectures, and the most relevant to HPC are optimized specifically for MIC.
You have a few options.
The heavyweight approach is to use Intel VTune. First, add -g to your compiler flags.
I use VTune from the host command line quite a bit; here is the command I use to profile an application on the MIC. (This is executed on the host machine; VTune on the host uses ssh to launch the application on the MIC.)
amplxe-cl -collect knc-hotspots -source-search-dir=/mysrc/dir -search-dir=/mybin/dir -- ssh mic0 /home/me/myapp
This assumes the app on the MIC is at /home/me/myapp, with -source-search-dir and -search-dir pointing at the source and binary directories on the host. (With VTune update 15 at least, I need to specify both of these separately in order to get the VTune GUI to show me symbol info.)
Once your app has finished, run the VTune GUI on the host with amplxe-gui and open your result set.
There are also some simplified open source profiling tools developed by Intel that support the MIC: Speedometer and Overhead.
Hopefully this is enough info to get you started.
How do I make a makefile that works on AIX, Linux and SunOS and has the ability to provide different compiler options for each environment?
I have access to an environment variable which describes the OS, but the AIX make utility does not like ifeq, so I can't do something like:
ifeq ($(OS),AIX)
CFLAGS += <IBM compiler options>
endif
You can use a construct like this:
CFLAGS_AIX = AIX flags
CFLAGS_Linux = Linux flags
CFLAGS_SunOS = SunOS flags
CFLAGS = $(CFLAGS_$(OS))
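The construct above is a table lookup keyed by the OS name. For comparison, the same idea in a Python-based build script (such as an SConstruct) might look like the sketch below. The flag strings are the placeholders from the makefile, cflags_for is a hypothetical helper, and it assumes platform.system() reports the same names as the OS variable ("AIX", "Linux", "SunOS").

```python
import platform

# Placeholder flag strings, mirroring CFLAGS_AIX / CFLAGS_Linux / CFLAGS_SunOS.
CFLAGS_BY_OS = {
    "AIX": "AIX flags",
    "Linux": "Linux flags",
    "SunOS": "SunOS flags",
}

def cflags_for(os_name, default=""):
    """Look up the flags for an OS name; unknown names get the default."""
    return CFLAGS_BY_OS.get(os_name, default)

print(cflags_for(platform.system()))
```

As with the makefile version, an unrecognized OS simply yields empty flags rather than an error.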
The portability of a Makefile is not directly related to the operating system, but to the implementation of make on the platform in question. (So there is an indirect relationship, in that the implementation of make may be guessed, but NOT accurately, from the OS.) In general, this is a difficult problem for which many solutions have been proposed.
You might want to look into automake, which will generate portable makefiles for you. However, automake's solution to the problem of setting flags for different unixen may not appeal to you, as the solution is (basically) "don't do it". Rather than setting options based on the platform, the philosophy is to determine what flags are needed based on the functionality provided by the host or by the user at configure time.
One convenient autoconf/automake based solution for the problem of assigning flags based on platform is to have a distinct file for each of your platforms which assigns CFLAGS at configure time, and have the file be specified in the CONFIG_SITE environment variable of the user running configure. You can assign CONFIG_SITE in the login script of the user based on the platform (i.e., push the problem away from configure/make and into the login setup). This makes the assignment transparent to the user building the software (transparent, but easily overridden).