Qt QSemaphore's release() does not immediately notify waiters?

I've written a Qt console application to try out QSemaphores and noticed some strange behavior. Consider a semaphore with 1 resource and two threads getting and releasing a single resource. Pseudocode:
QSemaphore sem(1); // init with 1 resource available
thread1()
{
    while(1)
    {
        if ( !sem.tryAcquire(1 resource, 1 second timeout) )
        {
            print "thread1 couldn't get a resource";
        }
        else
        {
            sem.release(1);
        }
    }
}
// basically the same thing
thread2()
{
    while(1)
    {
        if ( !sem.tryAcquire(1 resource, 1 second timeout) )
        {
            print "thread2 couldn't get a resource";
        }
        else
        {
            sem.release(1);
        }
    }
}
Seems straightforward, but the threads will often fail to get a resource. A way to fix this is to put the thread to sleep for a bit after sem.release(1). What this tells me is that release() does not give other threads waiting in tryAcquire() access to the semaphore before the current thread loops back to the top of while(1) and grabs the resource again.
This surprises me because similar testing with QMutex showed proper behavior... i.e. another thread hanging out in QMutex::tryLock(timeout) gets notified properly when QMutex::unlock() is called.
Any ideas?

I'm not able to fully test this out or find all of the supporting links at the moment, but here are a few observations...
First, the documentation for QSemaphore::tryAcquire indicates that the timeout value is in milliseconds, not seconds. So your threads are only waiting 1 millisecond for the resource to become free.
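In other words, a one-second wait needs the timeout expressed in milliseconds (a minimal sketch against the sem from the question):

// Wait up to 1000 ms (i.e. 1 second) for one resource.
if ( !sem.tryAcquire(1, 1000) )
{
    // still couldn't get a resource within a full second
}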
Secondly, I recall reading somewhere (I unfortunately can't remember where) a discussion of what happens when multiple threads try to acquire the same resource simultaneously. Although the behavior may vary by OS and situation, the typical result seems to be a free-for-all, with no thread given any more priority than another. As such, a thread waiting to acquire a resource has just as much chance of getting it as a thread that has just released it and is immediately attempting to reacquire it. I'm unsure whether the priority setting of a thread would affect this.
So, why might you get different results for a QSemaphore versus a QMutex? Well, I think a semaphore may be a more complicated system resource that would take more time to acquire and release than a mutex. I did some simple timing recently for mutexes and found that on average it was taking around 15-25 microseconds to lock or unlock one. In the 1 millisecond your threads are waiting, this would be at least 20 cycles of locking and unlocking, and the odds of the same thread always reacquiring the lock in that time are small. The waiting thread is likely to get at least one bite at the apple in the time that it is waiting, so you won't likely see any acquisition failures when using mutexes in your example.
If, however, releasing and acquiring a semaphore takes much longer (I haven't timed them but I'm guessing they might), then it's more likely that you could just by chance get a situation where one thread is able to keep reacquiring the resource repeatedly until the wait condition for the waiting thread runs out.
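If you want to test that guess, a rough sketch along these lines would give ballpark numbers (uncontended, single-threaded timing with QElapsedTimer; the function name and iteration count are arbitrary choices of mine):

#include <QDebug>
#include <QElapsedTimer>
#include <QMutex>
#include <QSemaphore>

void timePrimitives()
{
    const int iterations = 1000000; // large enough to average out noise
    QMutex mutex;
    QSemaphore sem(1);
    QElapsedTimer timer;

    timer.start();
    for (int i = 0; i < iterations; ++i) {
        mutex.lock();
        mutex.unlock();
    }
    qDebug() << "QMutex lock/unlock pair:"
             << timer.nsecsElapsed() / iterations << "ns";

    timer.restart();
    for (int i = 0; i < iterations; ++i) {
        sem.acquire(1);
        sem.release(1);
    }
    qDebug() << "QSemaphore acquire/release pair:"
             << timer.nsecsElapsed() / iterations << "ns";
}

If the semaphore pair comes out markedly slower than the mutex pair, that would fit the theory above.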

Related

QMutex::lock() taken by same QThread several times consecutively

I have a QThread spawned off from my main that is doing a synchronous read on a device. The read has a 1000ms timeout. The read is wrapped in a forever loop, and the read is protected with a QMutex. The basic code is:
Thread 1 - Read forever on a device
for (;;) {
    readMutex.lock();   // Lock so that system cannot change device parameters in middle of read
    read(&devicePtr, 1000);
    readMutex.unlock(); // Unlock so that thread 2 can grab lock if it's waiting
}
I have another method that runs under the main event loop and can set some parameters on the devicePtr. It's not safe to do that while the device is being read, so it tries to take the lock, then set the parameters, then unlock the mutex. This allows the Thread 1 read loop to continue. The basic code is:
Thread 2 - Set device params
void setParm() {
    readMutex.lock();   // Expects to take lock as soon as Thread 1 unlocks it
    setParam(&devicePtr, PARAM);
    readMutex.unlock(); // Unlock so thread 1 can resume its read forever loop
}
I have some qDebug in the code dumping the thread id when each thread takes the lock. What I see is that Thread 2 calls lock() and blocks while Thread 1 has the lock and is doing the read. Thread 1 completes the read and unlocks the mutex. Thread 1 moves on to the next iteration of the loop and takes the lock again. Thread 2 remains blocked on the readMutex.lock() call. This can repeat 5 or 6 times before Thread 2 is eventually allowed to take the lock and proceed.
I had assumed that QMutex queues up threads round-robin, but it doesn't seem so, since Thread 1 is able to take the lock back on the next iteration. I can sort of force Thread 2 to take the lock if I add a QThread::msleep(250) to the end of Thread 1's loop after the unlock. That sleep does the trick, and Thread 2 is able to take the lock right away and set the device parameters.
Am I doing something wrong, is this a priority thing? Any idea how I can make this round robin through the threads without using a msleep to put Thread 1 in the background?
There is no way to ensure the scheduler will favor one thread or the other. You could try changing the priority of the running threads to give one a better chance to execute than the other, but that guarantees nothing, and not all systems support priority changes (e.g. Linux does not).
You should use a QWaitCondition instead of the sleep (you could pass the sleep value as a timeout for that condition) and have each thread wake one pending thread waiting on that condition (QWaitCondition::wakeOne) before waiting on it itself (QWaitCondition::wait).
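A minimal sketch of that idea applied to the loops above (handoff is a name introduced here, and the 250 ms timeout mirrors the msleep(250) from the question):

QMutex readMutex;
QWaitCondition handoff;

// Thread 1 - read forever, yielding the lock between iterations
for (;;) {
    readMutex.lock();
    read(&devicePtr, 1000);
    // wait() atomically releases readMutex while waiting, giving thread 2
    // a window in which to take the lock; the timeout bounds the pause the
    // same way the msleep(250) hack did.
    handoff.wait(&readMutex, 250);
    readMutex.unlock();
}

// Thread 2 - set device params, then wake the read loop
void setParm() {
    readMutex.lock();
    setParam(&devicePtr, PARAM);
    handoff.wakeOne(); // read loop resumes without waiting out the full 250 ms
    readMutex.unlock();
}

The trade-off is that the read loop now pauses up to 250 ms per iteration even when nobody wants the lock; wakeOne() on a condition with no waiters is simply a no-op.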

Meteor.setTimeout() memory leak?

I've created a new project with just one file (server.js) on the server with this tiny piece of code that does nothing. But after running it, my node process is using about 1 GB of memory. Does anyone know why?
for (var i = 1000000; i >= 0; i--) {
    Meteor.setTimeout(function(){}, 1000);
}
Apparently the Meteor.setTimeout() function does or uses something (a closure?) that prevents the GC from reclaiming memory after the callback has executed. Any ideas?
Since you are calling this on the server side, Meteor.setTimeout is a lot more complex than it appears on the surface. Meteor.setTimeout wraps setTimeout with Meteor.bindEnvironment(), which is essentially binding the context of the current environment to the timeout callback. When that timeout triggers, it will pull in the context of when it was originally called.
A good example would be if you called a Meteor.method() on the server and used a Meteor.setTimeout() within it. Meteor.method() will keep track of the user who called the method. If you use Meteor.setTimeout() it will bind that environment to the callback for the timeout, increasing the amount of memory needed for an empty function().
As to why garbage collection isn't occurring on your server, it may not have hit its threshold yet. I tried running your test and my virtual memory hit around 1.2 GB, but it never went any higher, even after subsequent tests. Try running that code multiple times to see if memory consumption continues to increase linearly, or if it hits a ceiling and stops growing.

How exactly does set_timeout() work on a TcpStream?

I'm working with TcpStream. The basic structure I'm working with is:
loop {
    if /* new data in the stream */ { /* handle it */ }
    /* do a lot of other stuff */
}
So set_timeout() appears to be what I need, but I'm a little puzzled about how it works. The documentation says:
This function will set a timeout for all blocking operations (including reads and writes) on this stream. The timeout specified is a relative time, in milliseconds, into the future after which point operations will time out. This means that the timeout must be reset periodically to keep it from expiring.
So I would expect to have to reset the timeout each time before checking whether new data is available; otherwise I would only get Err(TimeOut) after some time.
But that appears not to be the case: if I set a very low timeout (like 10 ms) once and for all, the loop does exactly what I want. It returns new data if there is some, and returns Err(TimeOut) if there is none.
Am I misunderstanding the documentation? Is it safe for me to rely on this behavior?
I would have expected it to work like a socket timeout, the kind you have as a property of sockets in most operating systems and which is exposed in programming languages as SO_TIMEOUT or something similar. With such a socket timeout, the timer is started whenever you begin a blocking operation on the socket, like read, write, or connect. Either the operation succeeds within the time frame, or the timer fires and the operation fails with a timeout. The timeout is a property of the socket and not of the operation, so there is no need to set it again before each operation.
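For comparison, this is roughly what that conventional per-socket timeout looks like on a POSIX system (a C++ sketch; sockfd is assumed to be an already-connected socket descriptor):

#include <sys/socket.h>
#include <sys/time.h>

// After this call, any read on sockfd that sees no data for 10 seconds
// fails with EAGAIN/EWOULDBLOCK. The timeout is a property of the socket
// itself, so it is set once and never re-armed before individual reads.
void setReadTimeout(int sockfd)
{
    struct timeval tv;
    tv.tv_sec  = 10;
    tv.tv_usec = 0;
    setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}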
But according to the documentation, Rust implemented something completely different. If I interpret the documentation correctly, it doesn't set a timeout per operation but instead sets a deadline for all operations of this type on the socket. That is, when the timer is set to 10 seconds, you can have multiple reads within this time, but if a read is still active after 10 seconds it will be stopped.
For someone used to working with socket timeouts in other languages, this behavior is not the expected one, and it looks like the Rust developers have similar objections to this (experimental) API. In https://github.com/rust-lang/rust/issues/15802 they suggest renaming these kinds of functions from set..timeout to set..deadline so that the name reflects the behavior.

Qt application crashes when making 2 network requests from 2 threads

I have a Qt application that launches two threads from the main thread at start up. Both these threads make network requests using distinct instances of the QNetworkAccessManager object. My program keeps crashing about 50% of the time and I'm not sure which thread is crashing.
There is no data sharing or signalling occurring directly between the two threads. When a certain event occurs, one of the threads signals the main thread, which may in turn signal the second thread. However, from printed logs, I am pretty certain the crash doesn't occur during the signalling.
The structure of both threads is as follows. There's hardly any difference between the threads except for the URL etc.
MyThread::MyThread() : QThread() {
    moveToThread(this);
}

MyThread::~MyThread() {
    delete m_manager;
    delete m_request;
}

void MyThread::run() {
    m_manager = new QNetworkAccessManager();
    m_request = new QNetworkRequest(QUrl("..."));
    makeRequest();
    exec();
}

void MyThread::makeRequest() {
    m_reply = m_manager->get(*m_request);
    connect(m_reply, SIGNAL(finished()), this, SLOT(processReply()));
    // my log line
}

void MyThread::processReply() {
    if (!m_reply->error()) {
        QString data = QString(m_reply->readAll());
        emit signalToMainThread(data);
    }
    m_reply->deleteLater();
    exit(0);
}
Now the weird thing is that if I don't start one of the threads, the program runs fine, or at least doesn't crash in around 20 invocations. If both threads run one after the other, the program doesn't crash either. The program only crashes about half the time when I start and run both threads concurrently.
Another interesting thing I gathered from logs is that whenever the program crashes, the line labelled with the comment my log line is the last to be executed by both the threads. So I don't know which thread causes the crash. But it leads me to suspect that QNetworkAccessManager is somehow to blame.
I'm pretty blank about what's causing the crash. I will appreciate any suggestions or pointers. Thanks in advance.
First of all, you're doing it wrong! Fix your threading first.
// EDIT
From my own experience with this pattern, I know that it can lead to many unclear crashes. I would start by clearing this up, as it may straighten some things out and make the problem easier to find. Also, I don't know how you invoke makeRequest. As for QNetworkRequest: it is only a data structure, so you don't need to create it on the heap; stack construction is enough. You should also guard somehow against overwriting the m_reply pointer. Do you call makeRequest more than once? If you do, it can lead to a reply that is still being processed being deleted after another request finishes.
What happens if you call makeRequest twice:
The first call to makeRequest assigns the m_reply pointer.
The second call to makeRequest assigns m_reply again (replacing the stored pointer without deleting the object it points to).
The second request finishes before the first, so processReply is called; deleteLater is queued for the second reply.
Somewhere in the event loop the second reply is deleted, so from now on m_reply points at freed memory.
The first reply finishes, so processReply is called again, but it operates on an m_reply that points at garbage, so every access through m_reply can crash.
This is one possible scenario, which is why you don't get a crash every time.
I'm not sure why you call exit(0) when a reply finishes. It is also incorrect here if you make more than one call to makeRequest. Remember that QThread is an interface to a single thread, not a thread pool, so you can't call start() a second time on a thread instance while it is still running. Also, if you create the network access manager in the entry point run(), you should delete it in the same place, after exec(). Remember that exec() blocks, so your objects won't be deleted before your thread exits.
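As a sketch of one way to make processReply robust against the overwritten pointer, you can recover whichever reply actually finished via QObject::sender() instead of trusting the shared member (one option among several; per-request connections would work too):

void MyThread::processReply() {
    // Recover the reply that emitted finished() rather than trusting
    // m_reply, which a later makeRequest() call may have overwritten.
    QNetworkReply *reply = qobject_cast<QNetworkReply *>(sender());
    if (!reply)
        return;
    if (reply->error() == QNetworkReply::NoError)
        emit signalToMainThread(QString(reply->readAll()));
    reply->deleteLater();
}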

Critical section negative lock count

I am debugging a deadlock issue, and the call stack shows that threads are waiting on some events.
The code uses a critical section as the synchronization primitive, and I think there is some issue here.
The debugger also points to a critical section that is owned by some other thread, but its lock count is -2.
As per my understanding, a lock count > 0 means that the critical section is locked by one or more threads.
So is there any possibility that I am looking at the right critical section, and that it could be the culprit in the deadlock?
In what scenarios can a critical section have a negative lock count?
Beware: since Windows Server 2003 (for client OSes, Vista and newer) the meaning of LockCount has changed, and -2 is a completely normal value, commonly seen when a thread has entered a critical section without waiting and no other thread is waiting for the CS. See Displaying a Critical Section:
In Microsoft Windows Server 2003 Service Pack 1 and later versions of Windows, the LockCount field is parsed as follows:
The lowest bit shows the lock status. If this bit is 0, the critical section is locked; if it is 1, the critical section is not locked.
The next bit shows whether a thread has been woken for this lock. If this bit is 0, then a thread has been woken for this lock; if it is 1, no thread has been woken.
The remaining bits are the ones-complement of the number of threads waiting for the lock.
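Applying those rules to the -2 from the question (an illustrative decode; the helper below is mine, not a Windows API):

#include <cstdio>

// Decode a LockCount value using the Server 2003 SP1+ rules quoted above.
void decodeLockCount(int lockCount)
{
    bool locked      = (lockCount & 1) == 0; // lowest bit 0 => locked
    bool threadWoken = (lockCount & 2) == 0; // next bit 0 => a thread was woken
    int  waiters     = ~(lockCount >> 2);    // ones-complement of the rest
    std::printf("locked=%d woken=%d waiters=%d\n", locked, threadWoken, waiters);
}

decodeLockCount(-2) prints locked=1 woken=0 waiters=0: one thread owns the critical section, no thread has been woken, and nobody is waiting. That is a perfectly normal, uncontended state, not evidence of corruption.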
I am assuming that you are talking about the CCriticalSection class in MFC. I think you are looking at the right critical section. I have found that a critical section's lock count can go negative if the number of calls to Lock() is less than the number of Unlock() calls. I found that this generally happens in the following type of code:
void f()
{
    CSingleLock lock(&m_synchronizer, TRUE);
    // Some logic here
    m_synchronizer.Unlock();
}
At first glance this code looks perfectly safe. However, note that it calls CCriticalSection's Unlock() method directly instead of CSingleLock's Unlock() method. What happens is that when the function exits, CSingleLock's destructor calls the critical section's Unlock() again, and its lock count goes negative. After this the application is in bad shape and strange things start to happen. If you are using MFC critical sections, do check for this type of problem.
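A sketch of the repaired function, routing the early unlock through CSingleLock so its destructor sees the correct state:

void f()
{
    CSingleLock lock(&m_synchronizer, TRUE);
    // Some logic here
    lock.Unlock(); // unlock through CSingleLock; its destructor tracks this
                   // and will not unlock the critical section a second time
}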
