Segmentation fault at clReleaseEvent(*events) - opencl

I have multiple OpenCL events in my code, defined as follows:
cl_event events0[4], events1[4];
When I release the events at the end of the code as shown below, I get a segmentation fault (core dumped) error:
clReleaseEvent(*events0);
clReleaseEvent(*events1);
Other OpenCL objects, including memory buffers, kernels, programs, command queues and contexts, are defined and released in the same manner without any errors. Am I doing something wrong? Am I missing something?
Thanks.

According to the OpenCL specification, clReleaseEvent() should return CL_INVALID_EVENT if the passed event is not a valid event object. But some OpenCL runtime implementations only check that the event is not NULL. Therefore, the following code can lead to a segmentation fault:
cl_event e = (cl_event)1;
clReleaseEvent(e);
This code
cl_event events0[4], events1[4];
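does not initialize the events. If they are not assigned valid events before clReleaseEvent() is called, the call receives an indeterminate handle, which can lead to a segmentation fault. Note also that *events0 is events0[0], so clReleaseEvent(*events0) releases only the first of the four events.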
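The pattern described above (zero-initialize the array, then release each non-null element individually) can be sketched as follows. This is only an illustration under stated assumptions: so that it compiles without the OpenCL headers, FakeEvent, event_handle, release_event and release_all are hypothetical stand-ins, not OpenCL names. In real code the array would be declared as cl_event events0[4] = {0}; and the loop would call clReleaseEvent(events0[i]).

```cpp
#include <cstddef>

// Hypothetical stand-ins so the sketch compiles without OpenCL headers.
// In real code these would be cl_event and clReleaseEvent().
struct FakeEvent { int refcount; };
using event_handle = FakeEvent*;

int release_event(event_handle e) {
    if (e == nullptr) return -1;   // plays the role of an "invalid event" error
    --e->refcount;
    return 0;
}

// Releases every non-null handle in the array, resets each slot to null,
// and returns how many handles were actually released.
std::size_t release_all(event_handle* events, std::size_t count) {
    std::size_t released = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (events[i] != nullptr && release_event(events[i]) == 0) {
            ++released;
        }
        events[i] = nullptr;       // guards against a second release later
    }
    return released;
}
```

Because unused slots stay null, the loop never hands an indeterminate value to the release function, and calling release_all twice on the same array is harmless.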

How to check which CUDA error arises in which asynchronous CUDA call?

Suppose we have the following situation:
launch_kernel_a<<<n_blocks, n_threads>>>(...);
launch_kernel_b<<<n_blocks, n_threads>>>(...);
cudaDeviceSynchronize();
if (cudaGetLastError() != cudaSuccess)
{
// Handle error
...
}
My understanding is that in the above, execution errors occurring during the asynchronous execution of either kernel may be returned by cudaGetLastError(). In that case, how do I figure out which kernel caused the error to occur during runtime?
"My understanding is that in the above, execution errors occurring during the asynchronous execution of either kernel may be returned by cudaGetLastError()."
That is correct. The runtime API will return the last error which was encountered. It isn't possible to know from which call in a sequence of asynchronous API calls an error was generated.
"In that case, how do I figure out which kernel caused the error to occur during runtime?"
You can't. You would need some kind of additional API call between the two kernel launches to determine the error. The crudest would be a cudaDeviceSynchronize() call, though that would serialize the operations if they actually did overlap (I see no stream usage, so that is probably not happening here).
As noted in comments -- most kernel runtime errors will result in context destruction, so if you got an error from the first kernel, the second kernel will abort or refuse to run anyway and that is probably fatal to your whole application.
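The idea of checking between launches can be sketched as below. This is a plain C++ model rather than CUDA code, so that it compiles without the CUDA toolkit: status_t, Stage and first_failing_stage are hypothetical names introduced here, and each run callback stands in for "launch one kernel, then return the result of cudaDeviceSynchronize()".

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for cudaError_t; 0 plays the role of cudaSuccess.
using status_t = int;

struct Stage {
    std::string name;                 // label used to attribute a failure
    std::function<status_t()> run;    // stand-in for "launch, then synchronize"
};

// Runs the stages in order, checking after each one, and returns the name
// of the first stage that reports an error (empty string if none does).
std::string first_failing_stage(const std::vector<Stage>& stages) {
    for (const Stage& stage : stages) {
        if (stage.run() != 0) {
            return stage.name;        // later stages are not started
        }
    }
    return std::string();
}
```

The cost is the one mentioned above: synchronizing after every stage removes any overlap between them, so a check like this is probably best kept to debug builds.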

Detect QApplication exit as normal or crash

I need to detect whether the application exited normally or crashed. QProcess has the finished() signal and can provide the exit code, but I need this exit code for a QApplication when the application crashes or closes.
When your process crashes, it's gone. The crash means that the process has finished because of an unhandled exception. Your job should be to prevent the crash from happening. In other words: handle the exceptions. Note that the exceptions may not be C++ exceptions, they may be low-level platform-specific mechanisms, such as native exceptions on Windows or signals on UNIX. You'd have to handle those, but recognize that the underlying issue is not fixed merely because you catch such an exception. You must assume that the state of your application has been corrupted, and the only safe thing to do is to exit ASAP anyway. For example, do not try to modify any files: you're likely to corrupt them.
I don't think this is something you can do just like that. Reading the value returned by QApplication::exec() is related to the Qt infrastructure:
"Enters the main event loop and waits until exit() is called, then returns the value that was set to exit() (which is 0 if exit() is called via quit())."
Usually your main looks like this:
#include <QApplication>
int main( int argc, char **argv )
{
QApplication a( argc, argv );
// Initialize your widget(s)
return a.exec(); // You can store this and check its value
}
However, if I'm not mistaken, this doesn't cover a crash of your application (segmentation fault, unhandled exception, etc.). On Linux, people usually use a script which starts the application and then reads its exit code after the application quits or crashes. In a bash script you can use echo $? (or its equivalent in a different shell) to read the exit code and then act on its value.
Note also that you can at least do some exception handling since some crashes result in exactly that - an exception that has been thrown for some reason and has not been processed properly. Unhandled exceptions in Qt get propagated to the top level (that is QCoreApplication).
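The launcher approach can also be written as a small parent process instead of a shell script. The sketch below assumes a POSIX system and is not Qt-specific; describe_wait_status and run_child are hypothetical helper names. The child branch is a stand-in for the real application, and a real launcher would call one of the exec functions there to start the Qt program.

```cpp
#include <csignal>
#include <cstdlib>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turns a wait status from waitpid() into a short description.
std::string describe_wait_status(int status) {
    if (WIFEXITED(status)) {
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "unknown";
}

// Forks a child that either exits with `code` or crashes via abort(),
// then waits for it and reports how it ended.
std::string run_child(bool crash, int code) {
    pid_t pid = fork();
    if (pid < 0) return "fork failed";
    if (pid == 0) {
        // Child: stand-in for the real application.
        if (crash) std::abort();      // raises SIGABRT
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return "waitpid failed";
    return describe_wait_status(status);
}
```

A normal exit shows up through WIFEXITED, while a crash such as a segmentation fault or abort() shows up through WIFSIGNALED. If the launcher is itself a Qt program, QProcess reports the same distinction through the QProcess::ExitStatus argument of finished() (NormalExit or CrashExit).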

RocksDB cryptic error message

Does anyone understand what this RocksDB error refers to?
/column_family.cc:275: rocksdb::ColumnFamilyData::~ColumnFamilyData():
Assertion `refs_ == 0' failed. Aborted (core dumped)
This is an assertion failure raised by RocksDB, and it intentionally terminates the execution of the program.
In general, assertions are used by programmers to ensure certain invariants in the program. Assertions have some runtime overhead, and therefore can be completely disabled. Often they are compiled into development or debug builds, but are omitted for production builds.
When an assertion fails, the program execution is intentionally aborted immediately by calling std::abort. This may lead to your OS writing a core dump (as it obviously did as the above message reveals), but if and where core dumps are written depends on the OS configuration.
In case of this specific assertion, the destructor of rocksdb::ColumnFamilyData raised the assertion because it requires its refs_ member to have a value of 0. refs_ is a reference counter and it makes sense to assert that no references are actually held when the object's destructor is called.
From just looking at the destructor code, it is unclear whether this is a bug in the RocksDB library itself, or an error caused by using it the wrong way, e.g. destroying column family objects when they are still in use by other objects.
For reference, here's the code part that raised the assertion (currently on line 365 in file rocksdb/db/column_family.cc):
ColumnFamilyData::~ColumnFamilyData() {
assert(refs_.load(std::memory_order_relaxed) == 0);
If the error persists, it may be useful if you provide the code that uses RocksDB here. Otherwise it may be impossible to find the error source.
The core dump may also provide useful information, because it contains the stack trace of the code that actually invoked the object's destructor.
I noticed that all of these column_family.cc errors (core dumped, memory_order_relaxed, etc.) occurred after an incorrect RocksDB installation. I found a setup that works in my Vagrant script.
Instead of following
https://github.com/facebook/rocksdb/blob/master/INSTALL.md
I created this script:
cd /opt
git clone https://github.com/facebook/rocksdb.git
cd rocksdb
git checkout tags/v4.1
PORTABLE=1 make shared_lib
export LD_LIBRARY_PATH=/opt/rocksdb
It is better to set LD_LIBRARY_PATH permanently in your environment (for example in .bashrc or /etc/environment).
A failed refs_ == 0 assertion in ~ColumnFamilyData() means that the reference count of a column family is not zero when the column family is destroyed. Most likely some column family handles were still undeleted when you closed the DB. All column family handles must be deleted before closing the DB; otherwise the assertion will fail.
// Before delete DB, you have to close All column families by calling
// DestroyColumnFamilyHandle() with all the handles.
static Status Open(const DBOptions& db_options, const std::string& name,
const std::vector<ColumnFamilyDescriptor>& column_families,
std::vector<ColumnFamilyHandle*>* handles, DB** dbptr);
To fix this assertion failure, make sure you delete all column family handles before closing the DB.
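The ordering requirement can be illustrated with a minimal model. This does not use the RocksDB API: Counted, Handle and refs_after_releasing_all are hypothetical stand-ins that only mimic a reference-counted object whose destructor asserts that no references remain.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted object such as a column family.
class Counted {
public:
    void Ref() { ++refs_; }
    void Unref() { assert(refs_ > 0); --refs_; }
    int refs() const { return refs_; }
    ~Counted() {
        // Same kind of invariant as in the destructor quoted earlier:
        // nobody may still hold a reference at destruction time.
        assert(refs_ == 0);
    }
private:
    int refs_ = 0;
};

// Hypothetical handle: takes a reference on creation, drops it on destruction.
class Handle {
public:
    explicit Handle(Counted& target) : target_(target) { target_.Ref(); }
    ~Handle() { target_.Unref(); }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
private:
    Counted& target_;
};

// Creates `n` handles, deletes them all, and returns the remaining count.
int refs_after_releasing_all(int n) {
    Counted cf;
    std::vector<std::unique_ptr<Handle>> handles;
    for (int i = 0; i < n; ++i) {
        handles.push_back(std::unique_ptr<Handle>(new Handle(cf)));
    }
    assert(cf.refs() == n);
    handles.clear();      // delete every handle first...
    return cf.refs();     // ...so the count is 0 before `cf` is destroyed
}
```

If a handle were leaked instead of deleted, the count would still be non-zero when cf goes out of scope, and the assertion in ~Counted() would abort the program in a build where assertions are enabled.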

I get error in Qt: SIGILL ERROR

I'm getting an error when trying to compare a QString with an empty string.
QString S = "abc";
if (S != "") // SIGILL on this line
{
qDebug("ok");
}
According to the signal man page (section 7), SIGILL indicates an illegal instruction and is raised if an attempt is made to execute an instruction that is invalid or ill-formed or if the instruction requires a higher privilege level than you run at.
Because comparing two strings should not require operations that need special privileges, your Qt version was likely compiled with support for instruction sets that your processor does not support (e.g. the library uses SSE 4.2, but your processor lacks it). To rule this out, recompile the Qt library after checking that every instruction set the compiler is allowed to use is supported by your processor.
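One way to check whether the processor supports a given instruction set is to ask at run time. The sketch below assumes GCC or Clang on an x86 target, where the __builtin_cpu_supports builtin is available; sse42_support is a hypothetical helper name, and on other compilers or architectures it simply reports "unknown".

```cpp
#include <string>

// Reports whether the CPU running this program supports SSE 4.2.
// Returns "yes" or "no" on GCC/Clang x86 builds, and "unknown" elsewhere,
// because the builtin used here is not available on every toolchain.
std::string sse42_support() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();   // initializes the CPU feature data used by the builtin
    return __builtin_cpu_supports("sse4.2") ? "yes" : "no";
#else
    return "unknown";
#endif
}
```

On Linux, the flags line in /proc/cpuinfo carries the same information (SSE 4.2 appears there as sse4_2).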

What does "QGLContext::makeCurrent() : wglMakeCurrent failed: The operation completed successfully" mean?

I am trying to make a multithreaded Qt application that uses QGLWidgets and I keep getting this error. (I am trying to paint from another thread using QPainter.)
And it also looks like I have a huge memory leak because of it.
The error is "QGLContext::makeCurrent() : wglMakeCurrent failed: The operation completed successfully"
I believe this is related to a rather old issue from the Qt mailing list as described here. In short, if the thread calling makeCurrent() does not equal the thread where the device context was retrieved, GetDC() is called. As outlined in the linked thread, the problem is that ReleaseDC() is not called accordingly, resulting in a handle leak, and triggering Windows to return NULL in the call to GetDC() at some point, which makes wglMakeCurrent() fail. I don't know, however, why GetLastError() claims "The operation completed successfully" in this case.
