Arduino Bootloader

Can someone please explain how the Arduino bootloader works? I'm not looking for a high level answer here, I've read the code and I get the gist of it.
There's a bunch of protocol interaction that happens between the Arduino IDE and the bootloader code, ultimately resulting in a number of inline assembly instructions that self-program the flash with the program being transmitted over the serial interface.
What I'm not clear on is on line 270:
void (*app_start)(void) = 0x0000;
...which I recognize as the declaration, and initialization to NULL, of a function pointer. There are subsequent calls to app_start in places where the bootloader intends to delegate execution to the user-loaded code.
Surely, somehow app_start needs to get a non-NULL value at some point for this to all come together. I'm not seeing that in the bootloader code... is it magically linked by the program that gets loaded by the bootloader? I presume that main of the bootloader is the entry point into software after a reset of the chip.
Wrapped up in the 70 or so lines of assembly must be the secret decoder ring that tells the main program where app_start really is? Or perhaps it's some implicit knowledge being taken advantage of by the Arduino IDE? All I know is that if someone doesn't change app_start to point somewhere other than 0, the bootloader code would just spin on itself forever... so what's the trick?
Edit
I'm interested in trying to port the bootloader to a Tiny AVR that doesn't have a separate memory space for bootloader code. As it becomes apparent to me that the bootloader code relies on certain fuse settings and chip support, I guess what I'm really interested in knowing is: what does it take to port the bootloader to a chip that doesn't have those fuses and hardware support (but still has self-programming capability)?

On NULL
Address 0 does not a null pointer make. A "null pointer" is something more abstract: a special value that applicable functions should recognize as being invalid. C says the special value is 0, and while the language says dereferencing it is "undefined behavior", in the simple world of microcontrollers it usually has a very well-defined effect.
ATmega Bootloaders
Normally, on reset, the AVR's program counter (PC) is initialized to 0, thus the microcontroller begins executing code at address 0.
However, if the Boot Reset Fuse ("BOOTRST") is set, the program counter is instead initialized to an address of a block at the upper end of the memory (where that is depends on how the fuses are set, see a datasheet (PDF, 7 MB) for specifics). The code that begins there can do anything—if you really wanted you could put your own program there if you use an ICSP (bootloaders generally can't overwrite themselves).
Often though, it's a special program—a bootloader—that is able to read data from an external source (often via UART, I2C, CAN, etc.) to rewrite program code (stored in internal or external memory, depending on the micro). The bootloader will typically look for a "special event" which can literally be anything, but for development is most conveniently something on the data bus it will pull the new code from. (For production it might be a special logic level on a pin as it can be checked nearly-instantly.) If the bootloader sees the special event, it can enter bootloading-mode, where it will reflash the program memory, otherwise it passes control off to user code.
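In outline, that decision usually looks something like this (a minimal sketch; the helper names are hypothetical, and real bootloaders differ in how they detect the event and rewrite flash):

    void (*app_start)(void) = 0x0000;   /* user code begins at flash address 0 */

    /* Hypothetical helpers: event detection and reflashing vary by bootloader. */
    extern int  bootload_event_seen(void);
    extern void reflash_from_uart(void);

    int main(void)
    {
        if (bootload_event_seen())      /* e.g. a byte waiting on the UART */
            reflash_from_uart();        /* rewrite the application flash   */
        app_start();                    /* hand control to the user code   */
        return 0;                       /* never reached */
    }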
As an aside, the point of the bootloader fuse and upper memory block is to allow the use of a bootloader with no modifications to the original software (so long as it doesn't extend all the way up into the bootloader's address). Instead of flashing with just the original HEX and desired fuses, one can flash the original HEX, bootloader, and modified fuses, and presto, bootloader added.
Anyway, in the case of the Arduino, which I believe uses the protocol from the STK500, the bootloader attempts to communicate over the UART. If it gets no response in the allotted time:
uint32_t count = 0;
while (!(UCSRA & _BV(RXC))) {   // loop until a byte is received
    count++;
    if (count > MAX_TIME_COUNT) // 4 seconds or whatever
        app_start();
}
or if it accumulates too many errors from unexpected responses:
if (++error_count == MAX_ERROR_COUNT)
    app_start();
It passes control back to the main program, located at 0. In the Arduino source seen above, this is done by calling app_start();, defined as void (*app_start)(void) = 0x0000;.
Because it's couched as a C function call, before the PC hops over to 0 it will push the current PC value onto the stack, which also holds other variables used in the bootloader (e.g. count and error_count from above). Does this steal RAM from your program? Well, after the PC is set to 0, the operations that execute blatantly "violate" what a proper C function (one that would eventually return) should do. Among other initialization steps, the user program resets the stack pointer (effectively obliterating the call stack and all local variables), reclaiming RAM. Global/static variables are initialized to 0, and their addresses can freely overlap with whatever the bootloader was using, because the bootloader and user program were compiled independently.
The only lasting effects from the bootloader are modifications to hardware (peripheral) registers, which a good bootloader won't leave in a detrimental state (e.g. leaving peripherals enabled that waste power when you try to sleep). It's generally good practice to fully initialize the peripherals you will use anyway, so even if the bootloader did something strange, you'll set them up how you want.
ATtiny Bootloaders
On ATtinys, as you mentioned, there is no luxury of the bootloader fuses or protected boot memory, so your code will always start at address 0. You might be able to put your bootloader into some higher pages of memory and point your RESET vector at it; then, whenever you receive a new hex file to flash, take the instruction at address 0:1, replace it with a jump to the bootloader address, and store the replaced instruction somewhere else to call for normal execution. (If it's an RJMP ("relative jump"), the offset will obviously need to be recalculated, as in the sketch below.)
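For the RJMP case, the recalculation is simple arithmetic. A sketch (AVR's RJMP encodes a signed 12-bit word offset as 0b1100kkkkkkkkkkkk, and the PC has already advanced past the instruction when the jump executes, hence the -1; make_rjmp is a hypothetical helper):

    #include <stdint.h>

    /* Build an RJMP opcode that jumps from word address 'from' to word
     * address 'to'.  Operation is PC <- PC + k + 1, with k signed 12-bit. */
    static uint16_t make_rjmp(uint16_t from, uint16_t to)
    {
        int16_t k = (int16_t)(to - from - 1);
        return 0xC000 | ((uint16_t)k & 0x0FFF);
    }

    /* e.g. patch the reset vector (word 0) to enter a bootloader at word
     * address BOOT_START, after saving the original vector elsewhere:
     *     new_reset_vector = make_rjmp(0, BOOT_START);                  */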

Regarding the edit about porting to a Tiny AVR without those fuses: depending on your ultimate goal, it may be easier to create your own bootloader than to port an existing one. You really only need to learn three things for that:
1) UART transmit
2) UART receive
3) self-programming of the flash
These can be learned separately and then combined into a bootloader. You will want a part whose flash you can also write over SPI (or some other in-system interface), so that if your bootloader doesn't work, or whatever the part came with gets corrupted, you can still continue development.
Whether you port or roll your own, you will still need to understand those three basics with respect to that particular part.
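For the two UART pieces, a polled implementation on a classic AVR is only a few lines each. A sketch using ATmega8-style register names (an ATtiny with only a USI would need the USI hardware or a software UART instead):

    #include <avr/io.h>

    static void uart_tx(uint8_t b)
    {
        while (!(UCSRA & _BV(UDRE)))   /* wait for the data register to empty */
            ;
        UDR = b;
    }

    static uint8_t uart_rx(void)
    {
        while (!(UCSRA & _BV(RXC)))    /* wait for a byte to arrive */
            ;
        return UDR;
    }

The third piece, self-programming, is built around the SPM instruction; avr-libc wraps it in <avr/boot.h> (boot_page_fill, boot_page_erase, boot_page_write), which is a reasonable place to start reading.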

Related

What happens before the microcontroller startup code is executed? Or: what is the power-on/reset sequence?

What I know is as below (correct me if I'm wrong): for an automotive bootloader based on any microcontroller, we will have
Startup code (Flash)
Primary bootloader (Flash)
Secondary bootloader (RAM)
As far as the power-on sequence is concerned, I know that:
From the startup code (provided by the micro vendor: Freescale, ST Micro, etc.), control is transferred to the PBL (primary bootloader) using a jump or function pointer.
The PBL downloads the SBL (secondary bootloader) into RAM; the SBL contains the flash driver and is capable of downloading the application.
The SBL downloads the application into the flash area.
But what happens before the startup code is executed, i.e. just after power-on?
I know that each controller has some sort of code that executes after power-on, a POST (power-on self-test), but I'm still not clear on the sequence of operations up to the point where the bootloader starts executing.
It would be a great help if someone could describe the sequence of operations leading up to the startup code.
I find this not-uncommon confusion interesting.
POST is software in general, but your question is vague. Usually when someone talks about POST they mean their x86-based computer; that POST is just software, it happens well after the part you are confused about, and it is in no way whatsoever required for a computer/processor to run. It has a purpose and adds value, so it is there.
Microcontrollers in general have neither primary nor secondary bootloaders; they simply start running your application. Of the dozens/hundreds I have used or examined, I can't think of any off hand that have a primary or secondary. Certain brands do have bootloaders, usually programmed by the vendor; some you can't change and some you can. How you get into the bootloader varies by brand: often a strap pin, sometimes a non-volatile bit in a register.
First off, processors and the chips around them are dumb, very dumb. They only do what they are told by the humans. Incredibly simple machines. And while an MCU and a full-blown system are, at this level of view, pretty much identical, the MCUs are simpler and more reliable (for various reasons). The root of the answer starts with the processor, or processor core, or core, or whatever term helps you. In an MCU this is just one Lego block in the whole of the chip, not necessarily even the largest block in the chip. When you look at ARM-based chips like the STM32 and others with a Cortex-M (or older ones with an ARM7TDMI), that Lego block is purchased IP from ARM; the rest of the chip is either other purchased IP from one or more vendors or in-house logic. The SRAM certainly, and the flash probably, is IP that the chip vendor buys for the specific process at the specific foundry (just like other cell-library items: simple gates like AND, OR, NOT and more complicated ones).
Whatever processor core this is, it has an architecture and an instruction set. While we know some architectures are implemented using microcode, it is unlikely the MCUs are; it makes no sense for them. The more CISC-like ones might be, but the ARMs and MIPS and such definitely are not. For this understanding it doesn't matter whether it is microcoded or not: there are bit patterns that drive the processor, machine code. We have all heard that chips are made of transistors, and they are. The transistors are part of the simplicity; the basic AND, OR, NOT gates you can look up on Wikipedia, and you can (inefficiently) build the rest out of those fundamental blocks. A particular instruction tickles the logic, the transistors, in a certain way, causing a chain of events, ones and zeros in a specific sequence, that does the thing you asked. Logic is not limited to implementing processor instructions; most logic is not part of the decoding and execution of an instruction, and most of it is equally dumb. An SRAM is a lot of packed-in bits (four transistors wired up a certain way per bit) with an address bus and a data bus; the logic of an SRAM lights up rows and columns of these bits when writing or reading. Then there is more logic in front of that SRAM that decodes the address bus, and so on.
As mentioned in the other answer, when power comes up and reset is released, the flip-flop-based items in the chip (the registers we read about in the manual, plus countless others behind the scenes) are set to their reset values, which is done by the wiring of the transistors. A number of state machines start; these are similar to programs but are hardwired: wait for reset to go high; once reset goes high, if this input to the state machine is this and that input is that, then move to the next state. The rules for getting from one state to the next are implemented in logic. A chip with memory and flash, for example, might do a BIST on the RAM first (likely not in an MCU, where it doesn't make sense); this is logic, not software, doing the testing, and it is not the POST you know from your laptop/desktop/server. The flash or RAM or ADCs or other logic might require some number of clocks to settle before reset is released (the reset on the edge of the chip is not necessarily hardwired to every item in the chip; usually it is gated, delayed, etc.). So there is a power-on state machine that manages all this, and when the chip is ready the processor itself is released, which can be a few or dozens of clock cycles later. The clock itself has to settle, and the logic is designed to wait for that.
When the processor is released from reset, it again may take some number of clocks to settle things in its design; it has a state machine, or many, that start up its various blocks. Then, based on the architectural design of that processor, it does one of two things: it fetches its first instruction from a known address (an address within the processor's address space, which isn't necessarily the address in the chip's view), or it uses a vector-table approach, reading a value from a known address; the value read is the address of the first instruction, and it fetches that instruction. Up to the first fetch there is no software; it is all logic.
Then it depends on how the chip vendor has designed the chip and defined the address space. Understand that addressing within a chip or board design is not some flat universal thing; to the programmer it is, but in reality it isn't: there are many buses with addresses, and those address spaces are specific to that portion of the design. When you see the STM32 or others with a bootloader and a strap (BOOT0/BOOT1 pin), the logic on the other end of the processor bus may see a fetch at the well-known address (meaning both the folks who implement the logic and the folks who write software for it know this specific address is where things start, and if you don't put stuff there it won't boot/work), but as mentioned, the chip vendor can do whatever they want with that, and often does. As a programmer this is easy to understand, since logic isn't any more magical than software:
if strap == 0 return flash_bank_0[address&mask]
else return flash_bank_1[address&mask]
for a certain address range that is decoded in front of this logic. But both banks may also be directly addressable:
if address[24]==0 return flash_bank_0[address&mask]
else return flash_bank_1[address&mask]
This way you get what you see in the STM32s: both address 0x00000000 and 0x08000000 (or in other vendors' chips 0x00000000 and 0x01000000, for example) map to the same (flash) memory.
The reason is that the Cortex-M is vector based: there is a table of addresses that point you at code, rather than just instructions at known addresses (as on the full-sized ARMs: ARM7, ARM9, ARM11, Cortex-A). The way you use it is to make the reset address in the table 0x08000000-based, so when the processor reads at 0x000000xx it is told to fetch instructions from 0x0800xxxx, and it does. When the strap is set the other way, it finds a different flash, which may or may not have a fixed address space of its own; it may only be visible through the if-then-else. (This is pretty easy to see with a Cortex-M, an SWD debugger, and software.)
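Expressed as pseudo-C, the Cortex-M reset sequence the hardware performs is just two reads and two register loads:

    /* What the Cortex-M hardware does at reset (not code you would write): */
    uint32_t initial_sp    = *(volatile uint32_t *)0x00000000;  /* word 0 */
    uint32_t reset_handler = *(volatile uint32_t *)0x00000004;  /* word 1 */
    /* SP <- initial_sp; PC <- reset_handler; execution begins there. */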
The STM32s will have logic such that, if the strap is set to run the user application, it fetches (my guess) a few words; if the first one, or a specific one, is all ones (or for some chips all zeros; very often flash/ROM erases to ones because inverting the bit saves a transistor, so the stored bit is a zero but we see it as a one, though this is not a hard and fast rule, just very common), the logic/state machine decides there is no user application and runs the bootloader instead. Now, it is very possible the design actually always boots the bootloader, and software there looks at the application flash, but I think myself and others on this site decided that is not the case; none of us work there nor have visibility into the design, though. In either case the processor then starts executing what it finds, and it is very dumb: it is told to fetch from this address, and it does. The programmer has to make sure the right stuff is at that address, and each and every instruction has to be laid out in order properly, like train tracks; any gap or mistake and the train goes off the rails. Otherwise the train is stupid: it just follows the tracks. As humans we call the software POST or bootloader or application or whatever. It is just software. Once the processor has started, if some software loads and runs other software, the processor doesn't know; it is stupid, it just keeps performing the instructions it is fed as it rolls down the track.
Short answer:
Power ramps up to a chip-specified level. After a chip-specified time, reset should be released. This releases state machines that get the chip ready as needed and then release the processor. The processor, based on its design, either fetches its first instruction from a known place, or reads from a known place where a user-planted value gives the address of the first instruction. After that, per the architecture of the chip, the execution of that first instruction and the fetching of more based on it continue until the chip crashes, is turned off, or is put in reset.
There is no magic.
There are a number of good open cores out there that you can simulate with free tools, seeing (with free tools) the internal signals that make the chip work; you can watch the post-reset activity leading up to the first fetch and then all the execution from there.
Without knowing which microcontroller you are using, this should be general enough:
The hardware in the microcontroller resets several registers to their documented values. This includes the PC, the program counter.
If the microcontroller has configurable reset vectors the value can be chosen from a few alternatives, other controllers always use the same value.
The code at the location the PC points to is the startup code.
Note: It's always a good idea to read the data sheet of the controller!

How do interrupts work, and what is the function of the vectors in the MSP430?

Can someone explain to me how to write ISRs and how to set their priorities when there are many in one program?
What is the function of the vectors, and is it necessary to consider them when handling interrupts?
If possible, please provide some examples as well (C code).
Just like when a doorbell or phone rings at your home: you stop what you are doing, deal with the interruption, then, ideally, return to what you were doing.
Same with a processor (MSP430 or otherwise). There are ways to interrupt the processor for various reasons: I have a new byte in the UART for you, a timer has timed out, a GPIO pin has changed state, etc.; things you have configured to interrupt the processor when they happen.
Just like with the doorbell, the hardware has to have a way to stop and save something so it remembers what it was doing, find out what the interrupt is and handle it, then go back to what it was doing. Processors often quite literally interrupt between instructions: they finish the current instruction (with pipelines, "current" is a bit fuzzy). Then, based on the interrupt and the design of the processor, there is some place that the hardware and software agree upon (the hardware dictates it and the programmers use it) so the software can tell the processor where the code is that handles all interrupts, or that particular flavor of interrupt, depending on how the processor is designed. A common solution is an interrupt vector table: a list of addresses, usually set by the programmer, pointing to the code that handles each of those events or interrupts. Both the programmer and the hardware know that a particular interrupt causes a particular address in the memory space to be read, and the hardware assumes the value at that address points to the code for that interrupt.
So the processor gets an interrupt. It saves the state of the machine, which at a minimum is the program counter; depending on the design it may also save the status register and GPRs, but often the programmer is responsible for saving GPRs and such as needed. The hardware then, based on the interrupt/event, reads from an address, and usually that address contains the address of a handler. For example, 0xFFF8 might be the address of the interrupt vector (I don't know, I didn't look it up for the MSP430); 0xFFF8 is not where the code is, but the number stored at that address, maybe 0xD008 for example, is where the code is. It depends on the processor architecture, but when you finish handling the interrupt you need to tell the processor so it can return to what was interrupted; often that is a special return-from-interrupt instruction, but different processors have different solutions.
Priority, if any, is dictated by the hardware design. Something as simple as an MSP430 might not have a priority scheme (I'm not sure off hand) other than whoever gets here first, and the scheme might be that before you exit the handler you check whether any other interrupts came in while you were handling the one that interrupted you. If there is a priority scheme in the design, the hardware simply repeats the process: it saves the state (of the interrupt handler or foreground code being interrupted) and finds the entry point for the new handler, usually via the vector table. When the highest-priority handler finishes, it returns, control goes back to the next-highest-priority thing, and eventually back to the foreground task (assuming nothing else comes along).
In general, an ISR must not destroy anything the foreground task was using: preserve the state of the GPRs if needed, preserve the status register, and don't mess up the stack or memory the foreground task uses. Ideally, keep the ISR lean and mean; don't waste a lot of time in there. The vector table is just where you fill in the addresses of the entry points into the code: the reset handler, the interrupt handlers, and so on.
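As a concrete example of "lean and mean", a UART receive ISR can just capture the byte and set a flag for the foreground loop to act on. A sketch in the IAR #pragma style shown in the next answer, using USCI register and vector names as found on an MSP430G2553 (they vary by device):

    #include <msp430.h>

    volatile unsigned char rx_byte;
    volatile unsigned char rx_ready;    /* polled by the foreground loop */

    #pragma vector = USCIAB0RX_VECTOR
    __interrupt void USCI0RX_ISR(void)
    {
        rx_byte  = UCA0RXBUF;           /* reading RXBUF clears the interrupt flag */
        rx_ready = 1;                   /* defer the real work to main() */
    }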
An interrupt handler (also known as an interrupt service routine or ISR) is a piece of code that runs when an event (I/O) occurs that requires CPU attention. An interrupt event is typically asynchronous, hence the reason a handler must be registered for the event.
For example, in the case of Serial communication, data is received by the USCI peripheral (configured for UART) that needs to be processed. In this case, an interrupt will be issued by the USCI peripheral and the CPU will begin executing from the interrupt handler (addressed by the interrupt vector). Vectors are at fixed locations and are outlined in the datasheet of your device. When the end of the interrupt handler is reached, the CPU will go back to where it left off (or service another interrupt). A datasheet/user's guide will explain the default priorities of interrupts.
A typical interrupt handler using the IAR Embedded Workbench IDE will look like the following:
// Port 1 interrupt service routine
#pragma vector=PORT1_VECTOR
__interrupt void Port_1(void)
{
    P1OUT ^= 0x01;   // P1.0 = toggle
    P1IFG &= ~0x10;  // P1.4 IFG cleared
}

How to know whether a power-on reset or a software reset has occurred in an 8051 microcontroller

I am developing an application on the Atmel AT89C51, part of the 8051 family.
Could anyone suggest how to determine in code whether the reset was caused by a power cycle or by software?
According to the Atmel 8051 Microcontrollers Hardware Manual (PDF link), the power-off flag (POF / bit 4) in the power control register (PCON / 87h) is set by hardware when VCC rises from 0 to its nominal voltage. The power-off flag reset value will be 1 only after a power on (cold reset). A warm reset (e.g. software reset) does not affect the value of this bit.
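A sketch of the check, assuming a Keil-style toolchain where PCON is declared in reg51.h (bit 4 is POF per the manual above):

    #include <reg51.h>          /* Keil-style SFR declarations (PCON, etc.) */

    #define POF 0x10            /* power-off flag: PCON bit 4 */

    bit cold_start;             /* 8051 bit variable */

    void detect_reset_source(void)
    {
        cold_start = (PCON & POF) != 0;  /* 1 only after a power-on reset */
        PCON &= ~POF;                    /* clear it so the next warm     */
    }                                    /* reset reads back as 0         */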
I've often found that different vendors implement their own registers in the SFR space that can be taken advantage of for cases such as this. For example, Silicon Labs uses a power-on reset flag (PORSF) in their reset source register (RSTSRC).
It really depends on whether you want to tie yourself to a specific 8051 variant/vendor. It is best to use vendor-provided registers, but if you change vendor your code will break, or even worse, misbehave.
If you have external RAM in your system (and it is not battery backed), then you can write a sequence of bytes (like 0xAA, 0x55...) somewhere in a reserved part of the memory and check whether it is still there after start-up. If not, you had a cold start. Of course, you should modify the assembler start-up code to make sure it does not initialize this part of memory (or it would be zeroed at each start), and you should instruct your linker to exclude this memory from linkage so that it does not get used by anything else.
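With a Keil-style compiler that might look like the following (the reserved address is hypothetical, and as noted the range must be excluded from startup initialization and from the linker's allocation):

    #define MAGIC_ADDR ((unsigned char xdata *)0x7FFC)   /* reserved XDATA */

    static const unsigned char code magic[4] = {0xAA, 0x55, 0xA5, 0x5A};

    /* Returns 1 on a cold start (pattern missing), 0 on a warm reset. */
    unsigned char was_cold_start(void)
    {
        unsigned char i, cold = 0;
        for (i = 0; i < 4; i++)
            if (MAGIC_ADDR[i] != magic[i])
                cold = 1;                    /* RAM lost its contents       */
        if (cold)
            for (i = 0; i < 4; i++)
                MAGIC_ADDR[i] = magic[i];    /* plant pattern for next time */
        return cold;
    }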
Finally, include conditional compilation in your code so that if you have an 8051 variant with special registers, those are used; if not, fall back to plan B.
I have done this with a few bytes of internal 8051 memory (all my external RAM was battery backed), and I learned that not every 8051 variant has a consistent start-up policy: some initialize all their internal memory, some initialize only the SFRs and certain other specific areas, leaving a few bytes to play with for the procedure described.
I don't think there is a built-in method to determine how the reset occurred, because after a reset everything starts from the beginning on the 8051.
One method I guess would work is this: take a variable X, and before every software-triggered reset set X=1 (indicating a software reset), storing the variable in external non-volatile memory if you have some interfaced. At the beginning of every reset, check this variable X to see which reset occurred, then change X back to 0 for the next detection. If you do not have an external ROM, interface at least a D-latch.
I hope this works. Do tell me if it does.

Is it stable to change I/O direction on microcontroller repeatedly?

I'm new to microcontroller programming, and I have interfaced my microcontroller board to another device that returns a status based on the command sent to it, but this status is provided on the same I/O pins used to send the data. Basically, I have an 8-bit data bus that is an output from the microcontroller, but for certain commands I get a status back on one of the data lines if I choose to read it. So I would have to change the direction of that one line, from output to input to read the status, and then back to output. Is this acceptable practice, or will changing the I/O pin direction this frequently cause instability?
Thanks.
There should not be any problem with changing the direction of the I/O line to read the status returned by the peripheral, provided that you change the line to an input before the peripheral starts to drive it, and do not drive the line as an output until the peripheral stops driving it. What you must avoid is contention between the two drivers, i.e. the two ends being driven to opposite states by the processor and the peripheral. That would result in, at best, a large spike in power consumption, or at worst, blown pin-driver circuitry in the processor, the peripheral, or both.
You do not say what the processor or peripheral are, so I cannot tell whether there are any control bits in the interface that enable the remote device to output the status, so that you can know whether the peripheral is driving the line at any given time.
I've done this on digital I/O pins without any problems but I'm very far from an expert on this. It probably depends entirely on which microcontroller you are using though.
Yes, it's perfectly fine to change I/O direction on microcontroller repeatedly.
That's the standard method of communicating over open-collector buses such as I2C and the iButton. (see PICList: busses for links to assembly-language code examples).
transmit a 0 bit: set the output LATx bit to 0, then set the TRISx bit to OUTPUT.
transmit a 1 bit: keep the output LATx bit at 0, and set the TRISx bit to INPUT (let the external pull-up resistor pull the line high).
listen for a response from the peripheral: keep the output LATx bit at 0, and set the TRISx bit to INPUT. The external pull-up resistor pulls the line high when the peripheral is transmitting a 1, or the peripheral pulls the line low when it is transmitting a 0. Read the bit from the PORTx pin.
If both ends of the bus correctly follow this protocol (in particular, if neither end ever actively drives the line high), then you never have to worry about contention or current spikes.
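In code, those three cases reduce to a couple of lines each. A sketch with PIC18-style LATx/TRISx/PORTx register names under Microchip's XC8 compiler (any micro with separate latch and direction registers works the same way; the pin choice is arbitrary):

    #include <xc.h>

    #define BUS_LAT   LATBbits.LATB0     /* output latch         */
    #define BUS_TRIS  TRISBbits.TRISB0   /* direction (1 = input) */
    #define BUS_PIN   PORTBbits.RB0      /* pin read             */

    void bus_send_bit(unsigned char b)
    {
        BUS_LAT  = 0;   /* latch stays 0: we only ever drive low      */
        BUS_TRIS = b;   /* 1: release the line, pull-up makes it high */
    }                   /* 0: drive the line low                      */

    unsigned char bus_read_bit(void)
    {
        BUS_LAT  = 0;
        BUS_TRIS = 1;   /* release the line                             */
        return BUS_PIN; /* read what the peripheral or pull-up puts there */
    }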
It's important to remember that any high-speed I/O switching generates EMI.
Depending on the switching frequency, board layout, and device sensitivities, this EMI can affect the performance and reliability of your application.
If you are having problems in your application, use an oscilloscope to check for radiated EMI on your board traces.

Windows XP embedded - RS485 problems

We've got a system running XP embedded, with COM2 being a hardware RS485 port.
In my code, I'm setting up the DCB with RTS_CONTROL_TOGGLE. I'd assume that would do what it says... turn off RTS in kernel mode once the write empty interrupt happens. That should be virtually instant.
Instead, we see on a scope that the PC is driving the bus anywhere from 1-8 milliseconds longer than the end of the message. The device that we're talking to responds in about 1-5 milliseconds. So... communication corruption galore. No, there's no way to change the target's response time.
We've now hooked up to the RS232 port and connected the scope to the TX and RTS lines, and we're seeing the same thing. The RTS line stays high 1-8 milliseconds after the message is sent.
We've also tried turning off the FIFO, or setting the FIFO depths to 1, with no effect.
Any ideas? I'm about to try manually controlling the RTS line from user mode with REALTIME priority during the "SendFile, clear RTS" cycle. I don't have much hope that this will work either. This should not be done in user mode.
RTS_CONTROL_TOGGLE does not work on our embedded XP platform (it has a variable 1-15 millisecond delay before turning RTS off after transmit). It's possible I could get that down if I altered the time quantum to 1 ms using timeBeginPeriod(1), etc., but I doubt it would be reliable or enough to matter. (The device sometimes responds in about 1 millisecond.)
The final solution is really ugly but it works on this hardware. I would not use it on anything where the hardware is not fixed in stone.
Basically:
1) set the FIFOs on the serial port's device manager page to off or 1 character deep
2) send your message + 2 extra bytes using this code:
int WriteFile485(HANDLE hPort, void* pvBuffer, DWORD iLength, DWORD* pdwWritten, LPOVERLAPPED lpOverlapped)
{
    // Boost priority so nothing preempts us between the write and CLRRTS.
    int iOldClass = GetPriorityClass(GetCurrentProcess());
    int iOldPriority = GetThreadPriority(GetCurrentThread());
    SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

    EscapeCommFunction(hPort, SETRTS);   // assert RTS: drive the RS485 bus
    BOOL bRet = WriteFile(hPort, pvBuffer, iLength, pdwWritten, lpOverlapped);
    EscapeCommFunction(hPort, CLRRTS);   // release the bus immediately

    SetPriorityClass(GetCurrentProcess(), iOldClass);
    SetThreadPriority(GetCurrentThread(), iOldPriority);
    return bRet;
}
The WriteFile() returns when the last byte or two have been written to the serial port. They have NOT gone out the port yet, thus the need to send 2 extra bytes. One or both of them will get trashed when you do CLRRTS.
Like I said... it's ugly.
Any ideas?
You may find that there's source code for the serial port driver in the DDK, which would let you see how that option is supposed to be implemented: i.e. whether it's at interrupt-level, at DPC-level, or worse.
Other possibilities include rewriting the driver; using a 3rd-party RS485 driver if you can find one; or using 3rd-party RS485 hardware with its own driver (at least in the past, 3rd parties used to make "intelligent serial port boards" with 32 ports, deep buffers, and their own microprocessors; I expect RS485 is a problem that's been solved by someone).
8 milliseconds does seem like a disappointingly long time; I know that XP isn't an RTOS, but I'd expect it to (usually) do better than that. Another thing to look at is whether there are other high-priority threads running that may be interfering. If you've been boosting the priorities of some threads in your own application, perhaps you should instead be reducing the priorities of other threads.
I'm about to try manually controlling the RTS line from user mode with REALTIME priority during the "SendFile, clear RTS" cycle.
Don't let that thread spin out of control: IME a buggy thread like that can preempt every other user-mode thread forever.

Resources