Build time slow with cgo dependencies - qt

I have a Go program that uses the Qt wrapper library https://github.com/therecipe/qt. Unfortunately, build times get extremely long with it (I assume the Go side is the culprit):
go build -i . // takes about 14 seconds
go run . // takes about 8 seconds
After running either of the above commands I get the pre-compiled dependencies in $GOPATH/pkg/linux_amd64/github.com/therecipe/qt as .a files, so they are not rebuilt each time.
I've tried using ccache and the gold linker (/usr/bin/ld.gold) as described in https://github.com/therecipe/qt/wiki/Faster-builds-(Linux), but neither improved anything. This Qt wrapper also ships with its own build tool, qtdeploy, which I tried, but the build time is roughly the same.
System I'm running on:
go version go1.14.4 linux/amd64
Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
16GB Ram
Would anyone know if it's possible to improve the build time at least a bit?
EDIT:
Running go build -x . shows that the biggest time consumer is the following command:
~/.go/pkg/tool/linux_amd64/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link -buildmode=exe -buildid=k8lYa6JYqRdCY9Gyt0jX/16myMybByG5X6rOfaRpS/WHdW2kCTfMCZs2I4x9WE/k8lYa6JYqRdCY9Gyt0jX -extld=g++ ~/.cache/go-build/b5/b5e47b7f77c2df06ba69937dc8b1399b1289b7c90d2e08b3341fdf13e119c860-d
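Since the link step is external (note the -extld=g++), one hedged thing worth trying is to make that external link use gold explicitly via the Go linker's -extldflags, rather than relying on the /usr/bin/ld swap from the wiki; this may or may not help, since the ld.gold swap apparently did not:
go build -ldflags="-extldflags=-fuse-ld=gold" . // sketch; assumes g++ and ld.gold are installed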

Related

Device is unable to BOOT or INSTALL with generated .hddimg

I have a device with the following configuration:
Chipset architecture - Intel NM10 express
Processor - Atom D2550 Dual Core
Display - DVI
Volatile Memory - 2GB DDR3
Storage - 16GB
Objective: the device should run a Yocto-built embedded OS successfully
What I have done:
Downloaded the three required Yocto layers for the warrior branch, i.e. 1. poky 2. meta-openembedded 3. meta-intel
Modified local.conf with MACHINE ??= "intel-core2-32"
Ran source poky/oe-init-build-env
Generated the .hddimg via bitbake core-image-minimal
Flashed the .hddimg to a thumb drive with the dd command
Attached the thumb drive to the device; I can see the BOOT and INSTALL options, but selecting either does nothing (not even logs), i.e. a blank screen
Troubleshooting I tried:
Tried to boot Lubuntu, which was successful
Replaced Lubuntu's kernel and initrd with Yocto's, and booting was still successful, which indicates there is no issue with the kernel or initrd in the .hddimg generated by Yocto
Tried some experiments with syslinux as well, but they didn't work out
The .hddimg image type is quite outdated these days, and meta-intel has also switched to wic. Their README includes very good information on how to create bootable and installable images here and here.
Short summary of it:
for booting, use the .wic file (see the sketch below)
for building an installer, set up the image and bootloader config according to the documentation, then use the .wic file
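As a rough sketch of the booting path (the image file name depends on your image and MACHINE setting, and /dev/sdX stands in for your thumb drive):
bitbake core-image-minimal
sudo dd if=tmp/deploy/images/intel-core2-32/core-image-minimal-intel-core2-32.wic of=/dev/sdX bs=4M status=progress conv=fsync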

cvxopt uses just one core, need to run on all / some

I call cvxopt.glpk.ilp in Python 3.6.6, cvxopt==1.2.3, for a boolean optimization problem with about 500k boolean variables. It is solved in 1.5 hours, but it seems to run on just one core! How can I make it run on all cores, or on a specific set of them?
The Linux (Ubuntu x86_64) server has 16 or 32 physical cores. My process affinity spans 64 logical cores (I assume due to hyperthreading).
> grep ^cpu\\scores /proc/cpuinfo | uniq
16
> grep -c ^processor /proc/cpuinfo
64
> taskset -cp <PID>
pid <PID> current affinity list: 0-63
However, top shows only 100% CPU for my process, and htop shows that only one core is 100% busy (some others are lightly loaded, presumably by other users).
I set OMP_NUM_THREADS=32 and started my program again, but it still uses one core. It's a bit difficult to restart the server itself, and I don't have root access to it.
I installed cvxopt from a company's internal repo which should be a mirror of PyPI. The following libs are installed in /usr/lib: liblapack, liblapack_atlas, libopenblas, libblas, libcblas, libatlas.
Here another SO user writes that GLPK is not multithreaded. GLPK is the solver used by default, as cvxopt has no MIP solver of its own.
As cvxopt only supports GLPK as its open-source mixed-integer programming solver, you are out of luck.
Alternatively, you can use CoinOR's Cbc, which is usually a much better solver than GLPK while still being open-source. It can also be compiled with parallelization support. See some benchmarks, which also indicate that GLPK really has no parallel support.
But as there is no support for it in cvxopt, you will need some alternative access point:
your own C/C++-based wrapper
pulp (binary install available)
python-mip (binary install available)
Google's ortools (binary install available)
cylp
cvxpy + cylp (binary install available for cvxpy; without a cylp build)
Those:
have very different modelling styles (from completely low-level, cylp, to very high-level, cvxpy)
I'm not sure whether all of those builds are compiled with enable-parallel (which is needed when compiling Cbc)
Furthermore: don't expect too much gain from multithreading. It's usually far from a linear speedup (as for all combinatorial-optimization problems that are not based on brute force). The sketch below shows one way to hand a thread count to Cbc.
(IMHO the GIL does not matter here, as all of those are C extensions where the GIL is not in the way.)
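For illustration, a minimal sketch (a toy model, not the asker's 500k-variable problem) of driving Cbc through pulp; the threads argument of PULP_CBC_CMD, available in recent pulp releases, is handed to Cbc so it can search the branch-and-bound tree in parallel:

import pulp

# Toy 0/1 program: maximize a weighted sum with at most 5 variables set.
prob = pulp.LpProblem("toy_bool", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(10)]
prob += pulp.lpSum((i % 3 + 1) * x[i] for i in range(10))  # objective
prob += pulp.lpSum(x) <= 5                                 # cardinality constraint
prob.solve(pulp.PULP_CBC_CMD(threads=8, msg=False))        # threads passed to Cbc
print(pulp.LpStatus[prob.status], [int(v.value()) for v in x])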

Building PETSc with Intel tools

I would like to build the PETSc library with the Intel compilers, OpenMP, MPI, and MKL. I am not sure how to put the configure invocation together properly. I have Intel Parallel Studio XE 2017 installed on my computer. I checked ./configure --help in the PETSc directory for options, but there are plenty of them, and I don't know how to match them against the Intel MKL Link Line Advisor.
Has anyone done this before?
I use these build lines for my Intel build. Unfortunately I stumbled upon your question while trying to fix my own problem (which is probably caused by something else), but this build worked like a charm for over two years.
Of course you'll have to change the PETSc dir to your source directory. The tees are there so that I can run a script that generates all the builds I need (with differing compilers, MPI implementations, debug/opt, etc.) and still keep the logs from each build in separate files.
./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-blas-lapack-dir=/opt/intel/mkl/lib/intel64/ --with-debugging=1 PETSC_ARCH=linux-intel-dbg all test | tee linux-intel-dbg/configure.log
make PETSC_DIR=~/opt/petsc/ PETSC_ARCH=linux-intel-dbg all | tee linux-intel-dbg/make.log
make PETSC_DIR=~/opt/petsc/ PETSC_ARCH=linux-intel-dbg test | tee linux-intel-dbg/test.log
make PETSC_DIR=~/opt/petsc/ PETSC_ARCH=linux-intel-dbg streams NPMAX=8 | tee linux-intel-dbg/streams.log
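To fold in the OpenMP and MKL pieces from the question, a hedged variant of the configure line could look like the following (option names as listed by PETSc's configure help; the MKL path follows a typical Parallel Studio 2017 layout and may differ on your system):
./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort \
    --with-blas-lapack-dir=/opt/intel/mkl --with-openmp=1 \
    --with-debugging=0 COPTFLAGS='-O3 -xHost' FOPTFLAGS='-O3 -xHost' \
    PETSC_ARCH=linux-intel-opt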

Cray aprun is adding an extra dash to program arguments - how can I stop this?

I have an MPI application with a command line option -ss to specify an argument. I've been running it successfully on various Cray machines, including ARCHER (www.archer.ac.uk), an XC30, for years. The OS was recently upgraded, and as part of this ALPS was upgraded to version 5.1.1-2.0501.8507.1.1.
Now when I launch the program on the compute nodes with aprun, the program receives the option as --ss.
Checking with a shell script instead of the full application
#!/bin/bash
echo $*
confirms that this option is getting double-dashed by aprun.
Clearly there is a bug in aprun (I've reported it), but how can I work around the issue until it is patched?
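One workaround sketch, until a fix lands: hide the option behind a wrapper script so it never appears on aprun's own command line (the application name and rank count below are placeholders; ALPS generally tolerates launching the binary through a shell wrapper):
#!/bin/bash
# wrapper.sh - aprun launches this script, which execs the real binary,
# so -ss is never among the arguments aprun parses
exec ./my_mpi_app -ss "$@"
Then launch with aprun -n 24 ./wrapper.sh instead of aprun -n 24 ./my_mpi_app -ss.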

Error -1001 in clGetPlatformIDs Call !

I am trying to start working with OpenCL. I have two NVIDIA graphics cards; I installed the "developer driver" as well as the SDK from the NVIDIA website. I compiled the demos, but when I run
./oclDeviceQuery
I see:
OpenCL SW Info:
Error -1001 in clGetPlatformIDs Call
!!!
How can I fix it? Does it mean my NVIDIA cards cannot be detected? I am running Ubuntu 10.10, and the X server works properly with the NVIDIA driver.
I am pretty sure the problem is not related to file permissions, as it doesn't work with sudo either.
In my case I solved it by installing the nvidia-modprobe package available in Ubuntu (utopic/multiverse). The driver itself (v346) was installed from https://launchpad.net/~mamarley/+archive/ubuntu/nvidia
Concretely, I installed nvidia-opencl-icd-346, nvidia-libopencl1-346, nvidia-346-uvm, nvidia-346 and libcuda1-346. I am not sure whether all of them are needed for OpenCL.
This is a result of not installing the ICD portion of NVIDIA's OpenCL runtime. The ICD registry tells your application about the different OpenCL implementations installed on the system, since multiple implementations from different vendors can coexist. When your application does not find the ICD information, it reports error -1001.
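For reference, an ICD registration is just a one-line text file under /etc/OpenCL/vendors/ that names the vendor library; on a typical NVIDIA install it looks like:
$ cat /etc/OpenCL/vendors/nvidia.icd
libnvidia-opencl.so.1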
Run your program as root. If that succeeds, your trouble is with the cl_khr_icd extension used to load the vendor driver.
If you are not running X11, you have to create the device files manually or via a (boot) script; see also: ERROR: clGetPlatformIDs -1001 when running OpenCL code (Linux)
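The script itself is not quoted above; as a hedged sketch, the classic device-node commands from NVIDIA's driver README (character major 195, run as root, one /dev/nvidiaN per GPU) are:
mknod -m 666 /dev/nvidiactl c 195 255
mknod -m 666 /dev/nvidia0 c 195 0
mknod -m 666 /dev/nvidia1 c 195 1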
Same problem for me on a Linux system. The solution is to add the user to the video group:
sudo usermod -aG video your-user-name
Since I just spent a couple of hours on this, I thought I would share:
I got the error because I was connected to the machine via Remote Desktop (mstsc). On the machine itself everything worked fine.
I have been told that it should work over TeamViewer, by the way.
Don't know if you ever solved this problem, but I had the same issue and solved it with this post: ERROR: clGetPlatformIDs -1001 when running OpenCL code (Linux)
Hope it helps!
I solved it on Ubuntu 13.10 (saucy) for Intel OpenCL by creating a link:
sudo ln -s /opt/intel/opencl-1.2-3.2.1.16712/etc/intel64.icd /etc/OpenCL/vendors/nvidia.icd
I just ran into this problem on Ubuntu 14.04 and could not find ANY working answers anywhere online, including this thread (though it was the first to show up on Google). What ended up working for me was to remove ALL previous NVIDIA software and then reinstall it using the .run file provided on the NVIDIA website. Installing the components through apt-get seems to fail for some reason.
1) Download CUDA .run file: https://developer.nvidia.com/cuda-downloads
2) Purge all previous NVIDIA packages
sudo apt-get purge nvidia-*
3) Install all run file components (you will likely have to stop X or restart in recovery mode to run this)
sudo sh cuda_X.X.XX_linux.run
This is because OpenCL has the same brain-damaged one-library-per-vendor setup that OpenGL has. A likely reason for the -1001 error is that you compiled against a different library than the one the dynamic linker is trying to load.
To see if this is the problem, run:
$ ldd oclDeviceQuery
...
libOpenCL.so.1 => important path here (0x00007fe2c17fb000)
...
Does the path point to the NVIDIA-provided libOpenCL.so.1 file? If it doesn't, you should recompile the program with an -L parameter pointing to the directory containing NVIDIA's libOpenCL.so.1. If you can't do that, you can override the linker's search path like this:
$ LD_LIBRARY_PATH=/path/to/nvidias/lib ./oclDeviceQuery
For me, the CUDA OpenCL library was missing; running sudo apt install cuda-opencl-dev-12-0 solved it.
You should first query the number of platforms, allocate memory for them, call clGetPlatformIDs again to fetch the platforms, and then create a context from one of them. There is a good example here:
http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=71
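In case that link goes stale, the two-call pattern it describes looks roughly like this in C (a sketch with minimal error handling):
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void) {
    cl_uint n = 0;
    /* First call: ask only for the number of platforms. */
    if (clGetPlatformIDs(0, NULL, &n) != CL_SUCCESS || n == 0) {
        fprintf(stderr, "no OpenCL platforms found\n");
        return 1;
    }
    /* Allocate, then fetch the actual platform IDs. */
    cl_platform_id *platforms = malloc(n * sizeof *platforms);
    clGetPlatformIDs(n, platforms, NULL);
    /* Create a context from the first platform. */
    cl_context_properties props[] = {
        CL_CONTEXT_PLATFORM, (cl_context_properties)platforms[0], 0 };
    cl_int err;
    cl_context ctx = clCreateContextFromType(props, CL_DEVICE_TYPE_ALL,
                                             NULL, NULL, &err);
    printf("platforms: %u, context error code: %d\n", n, err);
    if (err == CL_SUCCESS) clReleaseContext(ctx);
    free(platforms);
    return 0;
}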
This might be due to clGetPlatformIDs being called by multiple threads at the same time.
