If I call WinUsb_AbortPipe() just as WinUsb_ReadPipe() starts, I end up in a deadlock. I ran the debug trace logging that is provided here. Below are the last five lines of the log, where the problem occurs. I think ReadPipe must have missed the signal, and AbortPipe is waiting for ReadPipe to complete.
[0]4E34.4B58::06/09/2015-15:42:12.528 - IOCTL_WINUSB_READ_PIPE
[0]4E34.4B58::06/09/2015-15:42:12.528 - PIPE129: (00000019) The read has been added to the raw io queue
[0]4E34.4B58::06/09/2015-15:42:12.528 - PIPE129: (00000019) The read is being handled
[2]4E34.4ECC::06/09/2015-15:42:12.529 - IOCTL_WINUSB_ABORT_PIPE
[2]4E34.4B58::06/09/2015-15:42:12.529 - PIPE129: (00000019) Reading 64 bytes from the device
In my design, I have the IN endpoints read asynchronously into buffers. I found that it is best to set the timeout of the read operation to infinite because the driver hates it when I cause STALLs to occur (ran into other issues with that). So I need to have the disconnect sequence cause the threads to wake up to realize that we need to close. Is there any way to safely do that?
My workaround for this is to call WinUsb_ResetPipe() instead. This causes WinUsb_ReadPipe() to unblock, and doesn't seem to lock up as WinUsb_AbortPipe() sometimes does. The only evidence I have that this works is several hours of successful test runs, so I can't guarantee that it is a solution.
Scenario: The server is in the middle of processing an HTTP request when it shuts down. The code could have reached any of several points by then. How are such cases typically handled? A typical example: some downstream HTTP calls had to be made as part of the incoming HTTP request. How do I find out whether those calls had been made when the shutdown occurred? I assume that it's not possible to persist every action in the code flow. Suggestions and views are welcome.
There are two kinds of shutdowns to consider here.
There are graceful shutdowns: when the execution environment politely asks your process to stop (e.g. systemd sends a SIGTERM) and expects it to exit on its own. If your process doesn’t exit within a few seconds, the environment proceeds to kill the process in a more forceful way.
A typical way to handle a graceful shutdown is:
1. listen for the signal from the environment
2. when you receive the signal, stop accepting new requests
3. then wait for all current requests to finish
Exactly how you do this depends on your platform/framework. For instance, Go’s standard net/http library provides a Server.Shutdown method.
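As a rough illustration of the "listen for the signal, stop accepting" part on a POSIX system, here is a minimal sketch in C. The names install_shutdown_handler and should_accept_new_requests are hypothetical helpers made up for this example, not part of any framework; a real server would call should_accept_new_requests() in its accept loop and then wait for in-flight requests before exiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Written by the signal handler, read by the server loop.
   volatile sig_atomic_t is the type that is safe for this. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Handler: only records that a shutdown was requested. */
static void on_sigterm(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Hypothetical helper: install the SIGTERM handler.
   Returns 0 on success, -1 on failure (as sigaction does). */
int install_shutdown_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0; /* no SA_RESTART, so a blocking accept() can fail with EINTR */
    return sigaction(SIGTERM, &sa, NULL);
}

/* Hypothetical helper: the accept loop checks this before taking
   a new request; once it returns 0, the server only drains the
   requests that are already in progress and then exits. */
int should_accept_new_requests(void)
{
    return !shutdown_requested;
}
```

The handler deliberately does nothing but set a flag; the actual draining happens in normal code, outside signal context.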
In a typical system, most shutdowns will be graceful. For example, when you need to restart your process to deploy a new version of code, you do a graceful shutdown.
There can also be unexpected shutdowns: e.g. when you suddenly lose power or network connectivity (a disconnected server is usually as good as a dead one). Such faults are harder to deal with. There’s an entire body of research dedicated to making distributed systems robust to arbitrary faults. In the simple case, when your server only writes to a single database, you can open a transaction at the beginning of a request and commit it before returning the response. This will guarantee that either all the changes are saved to the database or none of them are. But if you call multiple downstream services as part of one upstream HTTP request, you need to coordinate them, for example, with a saga.
For some applications, it may be OK to ignore unexpected shutdowns and simply deal with any inconsistencies manually if/when they arise. This depends on your application.
Calling send() on a TCP socket which has already been dropped by the client causes what appears to be a memory access violation. When I run a server application I made and then bombard it with requests from a browser, it crashes after serving between about 7 and 11 requests. Specifically, it accepts the connections and then sits for up to 10 seconds or so, then Windows throws up the "This program has stopped working..." message. No such crash happens if I remove the send() calls, leading me to believe that Microsoft's send() does not safely handle a socket being closed from the other end.
I am aware there are various ways to check whether the socket has in fact been closed, but I don't want to check then send, because there's still a chance a client could cut out between checking and sending.
Edit: I noticed "close() socket directly after send(): unsafe?" in the "Similar Questions" box, and although it doesn't quite fit my situation, I am now wondering if calling close() quickly after send() could be contributing to the problem.
If this is the case, a solution involving checking then closing would work as it does not have the implication stated above. However, I am unaware of how to check whether closesocket() would be safe.
Edit: I would also be fine with a way to detect that send() has in fact broken and prevent the entire application from crashing.
Edit: I thought I'd finally update this question, considering I figured out the issue a while ago and there may be curious people stumbling across this. As it turns out, the issue had nothing to do with the send function or anything else related to sockets. In fact, the problem was something incredibly stupid I was doing: calling free on invalid pointers (NULL and junk-data addresses alike). A couple of years ago I had finally updated my compiler from a very outdated version I was originally using, and I suppose the very outdated standard library implementation was what allowed me to get away with such a cringe-worthy practice, and it seems that what I saw as an issue with send was a side-effect of that.
I have been programming in WinSock for over a decade, and have never seen or heard of send() throwing an exception on failure (in fact, no Win32 API function throws an exception on failure). If the socket is closed, an appropriate error code is reported. So something else is going on in your code. Maybe the pointer you pass to the buf parameter is not pointing at a valid memory block, or maybe the value you pass to the len parameter is beyond the bounds of buf.
Like @RemyLebeau, I have been programming in Winsock for over a decade, in my case well over two decades, and I have never seen this either.
Microsoft's send() handles sending to a connection that has already been closed by the other end by returning SOCKET_ERROR (-1), with WSAGetLastError() returning WSAECONNRESET. The exception is a connection that was lost abnormally (network failure, etc.): in that case WinSock does not know the connection is gone, and send() happily keeps buffering outbound data until the socket's buffer fills up or the socket times out internally, at which point failures are reported.
The send/close question you refer to contains nothing about memory access errors, and in any case calling close() after send() can't possibly cause the prior send() to misbehave, unless you have managed to get time running backwards.
You have a bug in your code somewhere.
I'm using syslog to log data to a file - the data is pretty intensive, on the order of thousands of rows every few seconds. What I observe is that trace amounts of logs are being missed - less than 0.1 % most of the time - but they're still missing. I have no explanation for why this occurs.
It doesn't seem to correlate directly to the amount of data being written because increasing the amount of data being written did not increase the rate of missed logs.
I'm wondering about ways to debug this - how could we understand or confirm whether it is indeed syslog which is dropping data, and if so, why?
If you look at the source code for syslogd, you will see that the syslogd program only uses datagram sockets (type SOCK_DGRAM). These are by definition connectionless, but also not completely reliable in the sense that stream sockets are.
This is by design. Using stream sockets would mean that the syslog() call would have to wait for a confirmation that the message that it sent was received properly. So if syslogd was busy, every application that calls syslog() would block.
Syslogd was simply not designed with the volume of data that you are subjecting it to in mind. You could try enlarging the value of the sysctl variable kern.ipc.maxsockbuf, giving the logging socket a larger buffer.
If you want to make sure you capture everything, write to a file instead.
All port operations in Rebol 3 are asynchronous. The only way I can find to do synchronous communication is calling wait.
But the problem with calling wait in this case is that it will check events for all open ports (even if they are not in the port block passed to wait). Then they call their responding event handlers, but a read/write could be done in one of those event handlers. That could result in recursive calls to "wait".
How do I get around this?
Why don't you create a kind of "buffer" function to receive all messages from asynchronous entries and process them as FIFO (first-in, first-out)?
This way you can keep the async characteristics of your ports and process them in sync mode.
In cases where there are only asynchronous events and you need a synchronous reply, start a timer or sleep with a timeout. If the handler runs or the required objective is met before the timeout, return true; otherwise return false, and make sure the event gets cancelled or reset if that is critical.
I think that there are 2 design problems (maybe intrinsic to the tools / solutions at hand).
Wait is doing too much - it will check events for all open ports. In a sound environment, waiting should be implemented only where it is needed: per device, per port, per socket... Creating unnecessary inter-dependencies between shared resources cannot end well - especially knowing that shared resources (even without inter-dependencies) can create a lot of problems.
The event handlers may do too much. An event handler should be as short as possible, and it should only handle the event. If it does more, then the handler is doing too much - especially if it involves other shared resources. In many situations, the handler just saves the data which would be lost otherwise, and an asynchronous job does the more complex things.
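To make the "handler just saves the data" idea concrete, here is a minimal single-threaded sketch in C of a FIFO event queue. It is not Rebol and not tied to any port API; event_t, event_queue_t, queue_push and queue_pop are hypothetical names chosen for this illustration. The handler would only call queue_push, and a separate processing step would call queue_pop until the queue is empty.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 64

/* Hypothetical event record: which port fired and what happened. */
typedef struct {
    int port_id;
    int kind;
} event_t;

/* Fixed-size ring buffer; zero-initialise before first use. */
typedef struct {
    event_t items[QUEUE_CAPACITY];
    size_t head;   /* index of the oldest stored event */
    size_t count;  /* number of stored events */
} event_queue_t;

/* Called from the event handler: store the event and return at once.
   Returns 0 on success, -1 if the queue is full. */
int queue_push(event_queue_t *q, event_t ev)
{
    if (q->count == QUEUE_CAPACITY)
        return -1;
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = ev;
    q->count++;
    return 0;
}

/* Called from the processing step: take the oldest event.
   Returns 0 on success, -1 if the queue is empty. */
int queue_pop(event_queue_t *q, event_t *out)
{
    if (q->count == 0)
        return -1;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 0;
}
```

Because the handler never reads or writes a port itself, it cannot trigger a nested wait; the sketch has no locking, so it assumes handlers and processing run on the same thread.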
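You can just use a lock. Communication1 can set some global lock state, e.g. a variable (be sure that it's thread safe): locked = true. Then Communication2 can wait until it's unlocked:
loop do
  sleep 0.01            # poll every 10 ms
  break unless locked
end
# note: this check-then-set is not atomic; use a real mutex if threads are involved
locked = true           # take the lock
handle_communication()
locked = false          # release it so the other side can proceed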
I am designing and testing a client-server program based on TCP sockets (Internet domain). Currently, I am testing it on my local machine and am not able to understand the following about SIGPIPE.
*. SIGPIPE appears quite randomly. Can it be deterministic?
The first tests involved a single small (25 characters) send operation from the client and a corresponding receive at the server. The same code, on the same machine, runs successfully or fails with SIGPIPE, totally out of my control. The failure rate is about 45% (quite high). Can I tune the machine in any way to minimize this?
**. The second round of testing was to send 40000 small (25 characters) messages from the client to the server (1MB of total data), with the server then responding with the total size of the data it actually received. The client sends data in a tight loop and there is a SINGLE receive call at the server. It works only for a maximum of 1200 bytes of total data sent and, again, there are these non-deterministic SIGPIPEs, about 70% of the time now (really bad).
Can someone suggest some improvement to my design (probably it will be at the server)? The requirement is that the client shall be able to send a medium to very high amount of data (again about 25 characters per message) after a single socket connection has been made to the server.
I have a feeling that multiple sends against a single receive will always be lossy and very inefficient. Should we be combining the messages and sending them in one send() operation only? Is that the only way to go?
SIGPIPE is sent when you try to write to a pipe or socket whose other end has been closed. Ignoring the signal will make send() return an error (with errno set to EPIPE) instead:
signal(SIGPIPE, SIG_IGN);
Alternatively, you can disable SIGPIPE for a single socket (SO_NOSIGPIPE is available on BSD-derived systems such as macOS, but not on Linux):
int n = 1;
setsockopt(thesocket, SOL_SOCKET, SO_NOSIGPIPE, &n, sizeof(n));
Also, the data amounts you're mentioning are not very high. Likely there's a bug somewhere that causes your connection to close unexpectedly, giving a SIGPIPE.
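One plausible candidate for such a bug, given the description of a single receive call at the server, is that the server reads once and then closes while the client is still sending. TCP is a byte stream, so one recv() does not correspond to one send(); the receiver has to loop until it has everything it expects. The following is a minimal sketch under that assumption; recv_exact is a hypothetical helper name, not a standard function.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling recv() until len bytes have
   arrived, the peer closes the connection, or an error occurs.
   Returns the number of bytes actually read (less than len means
   the peer closed early), or -1 on error with errno set. */
ssize_t recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == 0)
            break;              /* orderly shutdown by the peer */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

With a loop like this on the server, the client can keep issuing many small send() calls; there is no need to merge them into one call for correctness.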
SIGPIPE is raised because you are attempting to write to a socket that has been closed. This does indicate a probable bug so check your application as to why it is occurring and attempt to fix that first.
Attempting to just mask SIGPIPE is not a good idea because you don't really know where the signal is coming from and you may mask other sources of this error. In multi-threaded environments, signals are a horrible solution.
In the rare cases where you cannot avoid this, you can mask the signal on send. If you set the MSG_NOSIGNAL flag on send()/sendto(), it will prevent SIGPIPE being raised. If you do trigger this error, send() returns -1 and errno will be set to EPIPE. Clean and easy. See man send for details.
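A minimal sketch of the MSG_NOSIGNAL approach, assuming a Linux (or other POSIX.1-2008) system where the flag is available. The wrapper name send_all_nosignal is hypothetical; it also loops over partial sends, which is an addition for illustration rather than something the answer requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send len bytes without ever raising SIGPIPE.
   Returns 0 when everything was sent, or -1 with errno set
   (EPIPE if the peer has already closed the connection). */
int send_all_nosignal(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* caller inspects errno, e.g. EPIPE */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller can then treat EPIPE like any other error return, closing the socket and cleaning up the connection state, without touching the process-wide signal disposition.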