Serial Comms dies in WinXP - networking

A bit of history: We have an application, which was originally written many years ago (1998 is the first date in PVCS but the app is about 5 years older than that as it originally was a DOS program). This application communicates with a piece of hardware via serial. When we got to Windows XP we started receiving reports of the app dying after a short time of running. It seems that the serial comms just 'died' and the app was left in a stuck state. The only way to recover from this situation was to restart the application.
The only information I can find about this problem suggests that the Windows message system would miss the notification that data had been received, the buffer would fill, and the system would get stuck. This snippet was left in an old Word document, but there's no evidence to back it up. It also mentions that the problem is only prevalent at high baud rates (115200+).
The solution was to provide customers with USB->Serial converters along with the hardware.
Today: We are working on a new version of the hardware that will run across a network as well as serial ports. To let me work on the network code without the actual hardware, we are using a VSCOM NetCom113 device. It also installs a virtual COM port on the user's (i.e. my) machine.
Now that I have the network code integrated with the app, it appears that the NetCom device exhibits the same behaviour as a physical COM port. This is undesirable as I need the app to run longer than ~30 seconds.
Google turns up no reports of the problem we are experiencing.
I was wondering:
Has anyone experienced this before? If so what did you do to fix/workaround the problem?
Does anyone have any suggestions as to whether the original author of the document is correct and what I can do to test the theory?
Unfortunately I can't post code as the serial code is tightly coupled with the rest of the system, though I can answer any questions you have about it.
Updates:
The code is written using Win32 Comm routines - so I am using CreateFile, ReadFile. There are also judicious calls to GetOverlappedResult.
It's not hanging per se, it's just that the comms stops. You can access the menus, click the buttons, but nothing can interact with the connected hardware. Using realterm you can see that no data is coming in or going out.
I think the reference to the Windows message is that the problem is internal to Windows. Data has arrived but the kernel has missed it and thus not told the rest of the system about it.
Flow control is not used.
Writing a 'simple' test is difficult because the code is tightly coupled and the underlying protocol is quite complex, so it would require a lot of work.

Are you using DOS-style serial code, or the Win32 CreateFile approach?
If the former, be very suspicious: if at all possible I'd convert to the latter.
If the latter, do you know on what kind of system call it's hanging? Are you in a blocking read call? or an overlapped I/O call? or waiting on an event? (I'm not sure I have enough experience to help, but those are the kinds of questions that come to mind)
You might also check into the queue size, which you can set with the SetupComm function.
I don't buy the "Windows Message system" stuff -- it sounds fishy; you can write good Win32 serial i/o code that never uses Windows messages.
edit: does your Overlapped I/O use events? I seem to remember something about auto-reset events occasionally missing their trigger... check your overlapped I/O calls very carefully to see whether you're handling the possible outcomes properly. Perhaps there's a way to make your code more robust by automatically cancelling the overlapped i/o and restarting another read. (I assume the problem is in the read half, not the write half?)
edit 2: A suggestion: assuming the win32 side has missed a byte or packet, and your devices are in deadlock because they're both expecting each other to respond to something, can you tweak the other side of the serial I/O to regularly send some type of "ping" packet with an incrementing counter? (and log the ping packets on the PC side; that way you can see whether you've missed any)
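The ping-counter idea can be checked on the PC side with very little logic. Below is a minimal sketch in Python, assuming the device sends a one-byte counter that wraps at 256; the function name `find_missed_pings` and the wrap value are assumptions for illustration, not part of the original protocol.

```python
def find_missed_pings(counters, modulus=256):
    """Return the counter values that never arrived.

    `counters` holds the ping counters in arrival order. The counter is
    assumed to wrap at `modulus` (a single byte in this sketch). A repeated
    counter cannot be told apart from a full wrap, so it is reported as
    modulus - 1 missed pings.
    """
    missed = []
    previous = None
    for value in counters:
        if previous is not None:
            expected = (previous + 1) % modulus
            # Walk forward from the expected value until we reach what arrived.
            while expected != value:
                missed.append(expected)
                expected = (expected + 1) % modulus
        previous = value
    return missed


# Counters taken from a hypothetical log of received ping packets.
print(find_missed_pings([1, 2, 3, 5, 6]))
```

Logging the result next to a timestamp would show whether the comms stop after a lost packet or stop without any gap in the sequence.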

Are you sure you have your flow control set up correctly? DTR, RTS, etc...
-Adam

I have written apps that use USB / Bluetooth serial ports and have never had an issue. With Bluetooth I have seen sustained bit rates of 800,000 bps for long periods of time. Most people don't implement the port properly.
My serial port

Not sure if this is a possibility for you, but if you could re-write the code using C#.NET you'd have access to the SerialPort class there. It might remedy your problem. I know a lot of legacy code based around the Win32 API for hardware I/O ports tended to fail in XP due to timing (had a small bit of experience with MIDI).
In addition, I don't know if you can use the Win32 method of Serial Port access in Vista, so that might shut out future MS OSes from being able to use your code.

Related

Why is reading and writing to serial so unreliable?

I've had to come back to a modem I have and send AT commands to it, and I need to do this programmatically. Sending AT commands works fine when using Minicom, but with any kind of programmatic method it's just super unreliable. I've tried echoes and redirection with bash, the atinout program, and the pyserial module in Python, but no matter what, sending and receiving commands is iffy at best. It is very rare that I run the same AT command twice and get consistent output back. I'll get the complete response one time, but then a partial response the next, or maybe no response.
Admittedly I don't know much about serial, so maybe it's my hardware, or maybe the protocol for reading and writing to serial is just unreliable. Can someone please explain how, in general, reading and writing output over a serial port may be unreliable, and any good techniques or libraries to help guarantee a stable flow of reading and writing?
There was another service on my system called ModemManager that was consuming the serial device at the same time I was running through my commands. Once that was disabled all of my programmatic efforts started producing reliable IO to the device.
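Separately from the ModemManager conflict, partial responses are also common when a program reads once and assumes the whole reply has arrived. A minimal sketch of the usual remedy is to keep reading lines until a final result code such as `OK` or `ERROR` appears. The function name `read_at_response` is hypothetical; `port` is assumed to be any object whose `readline()` returns bytes and returns `b""` on timeout, which is how a pyserial `Serial` opened with a timeout behaves.

```python
import io

# Final result codes that end an AT command response.
FINAL_CODES = ("OK", "ERROR", "NO CARRIER")


def read_at_response(port, max_lines=100):
    """Collect response lines until a final result code is seen.

    Returns (lines, complete). `complete` is False when the port stopped
    delivering data before a final result code arrived.
    """
    lines = []
    for _ in range(max_lines):
        raw = port.readline()
        if not raw:  # timeout or end of data
            break
        line = raw.decode("ascii", errors="replace").strip()
        if not line:  # skip the blank lines modems send between results
            continue
        lines.append(line)
        if line in FINAL_CODES or line.startswith("+CME ERROR"):
            return lines, True
    return lines, False


# Stand-in for a serial port, so the sketch runs without hardware.
fake_port = io.BytesIO(b"AT+CSQ\r\r\n+CSQ: 20,99\r\n\r\nOK\r\n")
print(read_at_response(fake_port))
```

A caller can treat `complete == False` as a signal to resend the command rather than parse a truncated reply.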

How to access the data from a website 50-100 times a second using Raspberry Pi?

I want to fetch the data of a stock. Since the data changes very fast, is there any way to pull the data like 50-100 times a second from trading websites?
And can we implement that using a Raspberry Pi 4 8 GB model?
RasPi4 should be more than adequate for this task. Both the Ethernet and WiFi hardware are capable of connections at these speeds (unless you’re running a bunch of other stuff on it). Consider where your bottlenecks may be (likely the ISP or other network traffic). Consider avoiding WiFi in favor of Cat5e or Cat6. Consider hanging this device off your router (edge) to keep LAN traffic lower, and consider QoS settings if you think this traffic may compete with other LAN traffic.
This appears to be a general question with no specific platform in mind. For stocks, there are lots of platforms to choose from.
APIs for trading platforms often include a method to open a stream. Instead of a full TCP conversation for each price check, a stream tells the server to just keep on sending data. There are timeout mechanisms of course, but it is good to close that stream gracefully (It’s polite since you’re consuming server resources at a different scale. I’ve seen some financial APIs monitor and throttle stream subscribers who leave sessions open.).
For some APIs/languages you can find solid classes already built on GitHub. Although, if simply pulling and reading a stream then the platform API doc code snippets should be enough to get you going.
Be sure to find out what other overhead may be implicated. For example, if an account or API key is needed to open a stream then either a session must be opened first or the creds must be passed with the stream being opened. The API docs will say. If you’re new to this sort of thing, just be a detective and try to infer what is needed. API docs usually try to be precise and technically correct with the absolute minimum word count.
Simply checking the stream should be easy. Depending on how that stream can be handled by your code/script, it may be harder to perform logic on the stream while it is being updated. That’s usually a thread issue or a variable scope issue depending on the script/code. For what you’re doing I would consider Python or PowerShell depending on your skill-set and other design parameters.
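One way to handle the thread issue mentioned above is to let a reader thread store only the most recent value behind a lock, so the decision logic never sees a half-updated variable. This is a minimal sketch, not tied to any trading platform: `LatestValue`, `consume`, and the fake price list are all hypothetical stand-ins for whatever the platform's stream API yields.

```python
import threading


class LatestValue:
    """Thread-safe holder for the most recent item read from a stream."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._updates = 0

    def set(self, value):
        with self._lock:
            self._value = value
            self._updates += 1

    def get(self):
        # Returns the newest value and how many updates have been seen.
        with self._lock:
            return self._value, self._updates


def consume(stream, latest):
    """Reader loop: copy every item from the stream into the holder."""
    for item in stream:
        latest.set(item)


latest = LatestValue()
fake_stream = iter([101.5, 101.7, 101.6])  # stand-in for a real price stream
reader = threading.Thread(target=consume, args=(fake_stream, latest))
reader.start()
reader.join()  # a real program would keep running its own logic instead
price, updates = latest.get()
print(price, updates)
```

With a real stream the main thread would call `latest.get()` whenever it needs a price, instead of joining the reader.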

RS232 Alicat and Labview communication drop

At the moment I have a problem I cannot pin down. Seemingly at random my communication with my RS232 Alicat Device will get held up. It will get held up somewhere in the read or write process and be unable to complete it. Upon closing the VI I will get a "Resetting VI" error in Labview 2020. I am using 7 of the 9 RS232 ports. My question is:
How do I fix this problem so that I do not get a communication drop OR (more likely)
How do I code the system such that I can catch and move through this problem or reset the connection. Something of a VISA read/write timeout? Open to ideas on how to move past the block
Here is what I have gathered about the problem:
Windows 10, I’ve tested everything on multiple computers. It happens no matter what.
It happens at random. It might happen twice within 20 minutes or not for a couple of hours.
I have never experienced the error when probing the line. I don’t know if that is a clue, or if that speaks to the randomness of the problem
Baud Rate = 9600, Prior to this I was running at 19,200 and experienced equivalent issues. The manufacturer recommended lowering the baud rate to reduce noise. I have also isolated the cable from other parts of the hardware. At this point noise on the connection is not an issue, but I am still experiencing the error.
My buffer size is 1000 bytes.
My termination character is \r. I cannot imagine a scenario where it fails to read a termination character due to the size of my buffer.
I'm querying it every 50ms. Far below the threshold of a standard timeout. Too much?
What I am currently testing.
Due to how my code block is setup I cannot yet confirm if it is getting locked up on the read or write block or both. I'm attempting to isolate the problem with only minor modifications to see if I can isolate it.
Attached is slimmed down version of my code that I isolated the error to.
I have experienced similar problems with some RS232 devices from different suppliers. The (quite bad) solution was to connect and disconnect for each communication command. The question would be what sample rate you need.
Another idea is to replace that device with an ethernet device. If I am not mistaken Alicat supplies those with Modbus (TCP).
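The connect-and-disconnect workaround, combined with the read timeout the question asks about, can be outlined in text form. This is a language-neutral sketch written in Python rather than LabVIEW; `query_with_reconnect` and `open_port` are hypothetical names, and the port object is assumed to offer `write()`, `read_until()` and `close()` and to return a short or empty reply when its read timeout expires (pyserial's `Serial` behaves this way when opened with a timeout).

```python
def query_with_reconnect(open_port, command, attempts=3, terminator=b"\r"):
    """Send one command over a fresh connection, retrying on a stalled read.

    `open_port` is a caller-supplied function that opens the port with a
    read timeout configured and returns the port object.
    """
    last_error = TimeoutError("no attempts were made")
    for _ in range(attempts):
        port = None
        try:
            port = open_port()
            port.write(command)
            reply = port.read_until(terminator)
            if reply.endswith(terminator):
                return reply
            # The read timed out before the termination character arrived.
            last_error = TimeoutError("incomplete reply: %r" % reply)
        except OSError as exc:
            last_error = exc
        finally:
            if port is not None:
                port.close()  # always release the port before retrying
    raise last_error
```

Opening the port for every command costs time, so whether this fits depends on the sample rate needed; at one query every 50 ms it may be too slow.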
The issue turned out to be specific to Windows/my laptop. There is a USB setting that disables inactive USB devices after a certain amount of time. The setting to disable the timeout was unavailable through the Control Panel on my laptop, though it was available on my coworker's. I had to use PowerShell commands to change the setting.

Connecting Rime network to the Internet

I have been playing with Contiki for some time now and have tried out various examples and wrote my own for both the simulation environment and the real hardware. I have only been experimenting with self contained networks that, for example, measure the temperature difference between two nodes and then communicate that data with other devices (PCs) over plain text RS232 link, blink LEDs, and such simple stuff.
Now I want to make a more complicated system where instead of just forwarding the data in plain text to be read on a console I would forward it to an application that would in turn post it to some sort of web service and, vice versa, receive data from web service to be delivered to nodes on the network. There is quite a lot of examples and tutorials describing this kind of setup but all of them (as far as I am aware) are focusing on the IP(v6) stack and SLIP to achieve this. The problem with this is that I have a really lousy programmer and uploading of a 50 kB image takes about 1.5 mins so the development cycle is pure hell. I am also out of luck with simulation since my platform is not really supported at the moment.
That's why I decided to try out the Rime stack: the image size is 1/3 of the IPv6 image and the development cycle is somewhat acceptable now (I really should get a decent JTAG programmer...). Meanwhile, I am having a bit of trouble wrapping my head around this new setup with a different network stack on which there is very little information around. Although it is pretty easy to understand by itself, I am not sure how I would go about connecting a Rime network to an IP network, or whether that is even possible or advised/intended by its designers.
I have some ideas in my head, ranging from ad hoc communication over a serial link between a server application running on the PC and the collector node, to a real Rime border router that is certainly outside of my league, for now.
How would you go about it? It would certainly work for my simple experimental case to just have a collector node that gathers the data from the Rime network and sends the aggregated data over a serial connection to the custom application that does the rest of the magic, but, I wouldn't want to be the guy that reinvented the wheel and I am quite sure that Rime wasn't designed to be used in a vacuum so there must be at least an advised way of doing this?
Rime is a really simple stack (by simple I mean it has little functionality), but it's quite a bit quicker for simple tasks.
You need to program the Rime stack on your gateway so that your board and the gateway communicate with the same stack. The data is then sent to your gateway, and the gateway can forward it over IP to whoever you want.
If you want more technical detail, then edit your question with more specific technical context.
Btw JTAG is a must have. (for industrial application)
Edit: Another solution is to simply broadcast your data from your board to your gateway. The gateway then takes the data and interprets it. The drawback of this method is that you have to somehow make sure your gateway interprets only the data from your boards (not from other boards).
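On the PC side of the gateway, the filtering concern can be handled by tagging every frame with a sender ID and dropping anything else. This is a minimal sketch that assumes a made-up plain-text frame format of `node_id,temperature` per line on the serial link; the format and the names `parse_frame` and `filter_frames` are hypothetical, not part of Rime or Contiki.

```python
def parse_frame(line):
    """Parse a hypothetical 'node_id,temperature' text frame.

    Returns (node_id, temperature), or None if the line is malformed.
    """
    try:
        node_text, temp_text = line.strip().split(",")
        return int(node_text), float(temp_text)
    except ValueError:
        return None


def filter_frames(lines, allowed_nodes):
    """Keep only well-formed frames sent by nodes we own."""
    readings = []
    for line in lines:
        frame = parse_frame(line)
        if frame is not None and frame[0] in allowed_nodes:
            readings.append(frame)
    return readings


# Lines as they might arrive from the collector node over the serial link.
received = ["3,21.5\n", "9,30.0\n", "garbage", "4,22.0\n"]
print(filter_frames(received, allowed_nodes={3, 4}))
```

The filtered readings are what the PC application would then post to the web service.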

best practices or solutions to writing resilient programs using unstable slow networks

A program is using a slow unstable network. Frequent timeouts, slow connection etc.
The program uses a few rest APIs and even ssh. The previous developer solved timeout problems by checking for error message and running the same instruction again until it worked. However, sometimes the network connection simply goes dark for a few hours and we have to wait.
We cannot always keep the program alive and waiting (to save power) and end up serializing the state and have another program check for network activity and then wake it up and resume from a serialized state.
The code quality is becoming more and more hackish due to these workarounds.
Are there any best practices or solutions when it comes to writing programs for unstable networks? I'm wondering whether someone has already solved this problem in a library, or whether there is a book you could recommend.
Thank you
PS: I have no control over the network infrastructure.
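One common replacement for "run the same instruction again until it works" is a bounded retry with exponential backoff and random jitter, kept in a single helper so the workaround does not spread through the code. This is a minimal sketch using only the standard library; `retry_with_backoff` and its parameters are hypothetical names, and it does not cover the multi-hour outages, which would still need the serialize-and-resume approach described above.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, retry_on=(OSError,), sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    The wait before retry n is a random value between 0 and
    min(max_delay, base_delay * 2**n), so retries spread out over time.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


# Stand-in for a REST or ssh call that fails twice before succeeding.
outcomes = iter([ConnectionError("timeout"), ConnectionError("timeout"), "ok"])


def flaky_call():
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result


waits = []  # record the waits instead of sleeping, to keep the demo fast
print(retry_with_backoff(flaky_call, sleep=waits.append))
```

Passing `sleep` as a parameter keeps the helper testable, and `retry_on` limits retries to errors that are plausibly transient.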

Resources