BLE Missing Packets (Protocol / Spec question) - bluetooth-lowenergy

I have been learning the nuts and bolts of BLE lately, because I intend to do some development work using a BLE stack. I have learned a lot from the online documentation and the spec, but there is one aspect that I cannot seem to find.
BLE uses frequency hopping for communication. Once two devices are connected (one master and one slave), it looks like all communication is then initiated via the master and the slave responds to each packet. My question involves loss of packets in the air. There are two major cases I am concerned with:
Master sends a packet that is received by the slave and the slave sends a packet back to the master. The master doesn't receive the packet or if it does, it is corrupt.
Master sends a packet that is not received by the slave.
Case 1 to me is a "dont care" (I think). Basically the master doesn't get a reply but at the very least, the slave got the packet and can "sync" to it. The master does whatever and tries transmitting the packet at the next connection event.
Case 2 is the harder case. The slave doesn't receive the packet and therefore cannot "sync" its communication to the current frequency channel.
How exactly do devices synchronize the channel hopping sequence with each other when packets are lost in the air (specifically case 2)? Yes, there is a channel map, so the slave technically knows what frequency to jump to for the next connection event. However, the only way I can see all of this happening is via a "self timed" mechanism based on the connection parameters. Is this good enough? I mean, given the clock drift, there will be slight differences in the amount of time the master and slave are transmitting and receiving on the same channel... and eventually they will be off by 1 channel.. 2 channels, etc. Is this not really an issue, because for that to happen 'a lot' of time needs to pass based on the 500ppm clock spec? I understand there is a supervisor timer that would declare the connection dead after no valid data is transferred for some time. However, I still wonder about the "hopping drift", which brings me to the next point.
How much "self timing" is employed / mandated within the protocol? Do slave devices use a valid start of packet from the master every connection interval to re synchronize the channel hopping? For example if the (connection interval + some window) elapses, hop to the next channel, OR if packet received re sync / restart timeout timer. This would be a hop timer separate from the supervisor timer.
I can't really find this information within the core 5.2 spec. It's pretty dense at only over 3000+ pages... If somebody could point me to the relevant sections in the spec or somewhere else.. or even answer the questions, that would be great.

The slave knows the channel map. If one packet is not received from the master, it will listen again after one connection interval on the next channel. If that it also not received, it adds one extra connection interval and next channel.
The slave also stores a timestamp (or event counter) when the last received packet from the master was detected, regardless of if the crc was correct or not. This is called the anchor point. This is not the same time point used for supervision timeout.
The amount of time between the anchor point and the next expected packet is multiplied by the master + slave accuracy (for example 500 ppm) to get a receive window, plus 16 microseconds. So the slave listens this amount of time before and after the expected packet time of arrival.

Related

What happens when ethernet reception buffer is full

I have a quite newbie question : assume that I have two devices communication via Ethernet (TCP/IP) at 100Mbps. In one side, I will be feeding the device with data to transmit. At the other side, I will be consuming the received data. I have the ability to choose the adequate buffer size of both devices.
And now my question is : If data consumption rate from the second device, is slower than data feeding rate at the first one, what will happen then?
I found some, talking about overrun counter.
Is there anything in the ethernet communication indicating that a device is momently busy and can't receive new packets? so I can pause the transmission from the receiver device.
Can some one provide me with a document or documents that explain this issue in detail because I didn't find any.
Thank you by advance
Ethernet protocol runs on MAC controller chip. MAC has two separate RX-ring (for ingress packets) and TX-ring(for egress packets), this means its a full-duplex in nature. RX/TX-rings also have on-chip FIFO but the rings hold PDUs in host memory buffers. I have covered little bit of functionality in one of the related post
Now, congestion can happen but again RX and TX are two different paths and will be due to following conditions
Queue/de-queue of rx-buffers/tx-buffers is NOT fast compared to line rate. This happens when CPU is busy and not honer the interrupts fast enough.
Host memory is slower (ex: DRAM and not SRAM), or not enough memory(due to memory leak)
Intermediate processing of the buffers taking too long.
Now, about the peer device: Back-pressure can be taken care in the a standalone system and when that happens, we usually tail drop the packets. This is agnostics to the peer device, if peer device is slow its that device's problem.
Definition of overrun is: Number of times the receiver hardware was unable to handle received data to a hardware buffer because the input rate exceeded the receiver’s ability to handle the data.
I recommend pick any MAC controller's data-sheet (ex: Intel's ethernet Controller) and you will get all your questions covered. Or if you get to see device-driver for any MAC controller.
TCP/IP is upper layer stack sits inside kernel(this can be in user plane as well), whereas ARPA protocol (ethernet) is inside MAC controller hardware. If you understand this you will understand the difference between router and switches (where there is no TCP/IP stack).

Is DMA synchronous in network card drivers?

My understanding is that when a NIC adapter receives new packets, the top half handler uses DMA to copy data from the RX buffer to the main memory. I think this handler should not exit or release the INT pin before the transmission is completed, otherwise new packets would corrupt the old ones.
However, DMA is generally considered asynchronous and itself requires the interrupt mechanism to notify the CPU that data transmission is done. Thus my question, is DMA actually synchronous here, or interrupt can in fact happen within another interrupt handler?
In general, this synchronisation happens via ring descriptor between NIC(device driver) and host CPU. You will get packet path details here. I have explained the ring-descriptors below.
Edit:
Let me explain with Intel's ethernet Controller. If you look at section 3.2.3, where the RX descriptor format is given, it has status field which solves packet ownership problem. There are two major points to avoid contention and packet corruption as to who owns the packet (NIC driver or CPU).
DMA (from I/O device to Host memory): RX/TX Ring consists of 'hardware descriptors' and 'buffers' (carved from host memory). When we say DMA, controller transfer data, this happens from hardware FIFOs to this host memory.
Let us assume my ring buffers (of 512 bytes) are not big enough to hold the complete incoming packet(1500 or Jumbo packet), in that case the packet may span across multiple ring buffers and with EOP(End Of Packet) status field, indicates that the complete packet is now received (considering all the sanity checks/checksums are already done).
Second is who owns the packet now (driver or CPU for further consumption)? Now until the status flag DD (Descriptor Done) is set, it belongs to driver. Once set CPU can grab it for picking-and-poking.
This is specific to RX path. TX path is slightly different.
Consider it this way, there are multiple interrupts (IO, keyboard, mouse, etc) happening all the time in the system, but the time duration between two interrupts are so huge that CPU can do lot of other good stuff in between. And to further offload CPUs work DMA helps transferring data. So if an interrupt is raised and subroutine is called, all the subsequent interrupts can be masked as you are already inside that subroutine, but trust me these subroutine are very tiny they hardly consume any time until your next packet arrives. That means your packet arriving speed has to be higher than your processing speed.
Another Ex: for router/switches 99% time task is routing and switching hence subroutine and interrupt priorities are completely different, moreover all the time they are bombard with tons of packets and hence the subroutine in such cases will never come until there is another packet at bay. At least i have worked on such networking gears.

Does a BLE device reads advertising packets when not scanning? (autoconnect)

I read in some places that advertising packets are sent to every one in the distance range. However, should the other device be scanning to receive them or it will receive it anyways?
The problem:
let's say I'm establishing a piconet between 5 or 6 BLE devices. At some point I have some connections between the slaves and one master. Then if one of the devices get removed/shut off for a few days I would like it to reconnect back to the network as soon as turned on.
I read about the autoconnect feature but it seems when you set it true, the device creates a background scanning which is actually slower (in frequency) than the manual scanning. This makes me conclude that for the autoConnect to work the device which is being turned on again needs to advertise again, right? Therefore, if autoconnect really runs a slow scan on background so it seems to me that you can never receive the adv packets instantly unless you're scanning somehow. Does that make sense?
If so, is there any way around it? I mean, detect the device that is comming back to the range instantly?
Nothing is "Instant". You are talking about radio protocols with delays, timeouts, retransmits, jamming, etc. There are always delays. The important thing is what you consider acceptable for your application.
A radio transceiver is either receiving, sleeping or transmitting, on one given channel at a time. Transmitting and receiving implies power consumption.
When a Central is idle (not handling any connection at all), all it has to do is scanning. It can do it full time (even if spec says this should be duty cycled). You can expect to actually receive an advertising packet from peer Peripheral the first time it is transmitted.
When a Central is maintaining a connection to multiple peripherals, its transceiver time is shared between all the connections to maintain. Background scanning is considered low priority, and takes some of the remaining transceiver time. Then an advertising Peripheral may send its ADV packet while Central is not listening.
Here comes statistical magic:
Spec says interval between two advertising events must be augmented with a (pseudo-)random delay. This ensures Central (scanner) and Peripheral (advertiser) will manage to see each other at some point in time. Without this random delay, their timing allocations could become harmonic, but out of phase, and it could happen they never see each other.
Depending on the parameters used on Central and Peripheral (advInterval, advDelay, scanWindow, scanInterval) and radio link quality, you can compute the probability to be able to reach a node after a given time. This is left as an exercise to the reader... :)
In the end, the question you should ask yourself looks like "is it acceptable my Peripheral is reconnected to my Central after 300 ms in 95% of cases" ?

Reliable full-duplex serial comms

I'm designing a device that will encrypt a long (assume infinite) stream of data sent from the PC and send it back. I'm planning to use a single serial port on the device running full duplex with hardware handshaking and "block" the data, sending a CRC value after every block. The device will only buffer a limited number of blocks- ideally just one buffer accumulating the block being received and one buffer holding the block presently being sent, switching them over at each block boundary and using hardware handshaking to keep things in sync.
The problem I'm considering is what happens when there's corruption and there's a mismatch between the CRC value calculated by the receiver- which could be either the PC or the device- and the one sent. If the receiver detects an error, it sets a break condition on its transmit line- because although TX and RX are doing different things that's all we CAN do- and then we drop into a recovery sequence.
Recovery is easy when the error condition is detected before the data disappears from the sender, but particularly on the PC receiving there may be a significant amount of buffer space, and by the time the PC catches up and detects the corruption the data may have disappeared from the device and we can't simply retransmit. It's difficult to "rewind" cipher generation, so resending the source data and trying to pick things up in the middle is difficult- and indeed the source data may not be available to resend depending on where it's ultimately coming from.
I considered having each side send its "last frame successfully received" counter along with its last frame sent CRC value, and having the device drop RTS if there's too much unconfirmed data waiting at the output, but that would then deadlock- the device never gets the confirmation that the PC's receive thread has caught up.
I've also considered having the PC send a block and then not send another block until the first block's been confirmed processed and received back, but that's essentially going to half duplex or block-synchronous operation and the system runs slower than it can do. A compromise is to have a number of buffers in the device, the PC to know how many buffers and to throttle its own output based on what it thinks the device is doing, but having that degree of 'intelligence' needed in the PC side seems inelegant and hacky.
Serial comms is quite ancient tech. Surely there's a good way of doing this?
Designing a reliable protocol is not that easy. Some notes with what you've talked about so far:
Only use RTS to do what it is designed to do, avoid receive buffer overflow. It is not suitable to do more.
Strongly consider not having multiple un-acknowledged frames around. It is only important if the connection suffers from high latency, that is not a problem with serial ports.
Achieve full duplex operation by layering, use the OSI model as a guide.
Be sure to treat the input and output of your protocol as plain byte streams. Framing is only a detail of the protocol implementation, the actual frame size does not matter. If the app signals by using messages then that should be implemented on top of the protocol. Otherwise the automatic outcome of proper layering.
Keep in mind that a frame can do more than just transmit data, it can also include an ACK for a received frame. In other words, you only need a separate ACK frame if there isn't anything to transmit back.
And avoid reinventing the wheel, this has been done before. I can recommend RATP, the subject of RFC916. Widely ignored btw so you are not likely to find code you can copy. I've implemented it and had good success. It has only one flaw that I know of, it is not resilient to multiple connection attempts that are present in the receive buffer. Intentionally purging the buffer when you open the port is important.

Regarding CAN bus

I am using a 16-bit MCU PIC24HJ64GP504 to write a CAN based application. Basically it is communication between my board and one another node which continuously keeps on sending data to my board using CAN at 1 Mbit/s. I am configuring the ECAN module in my PIC24 to work at 1 Mbit/s. I have written the code in such a way that for the first 10 ms the ECAN module will accept all messages coming in from the other side and after that I have re-configured the ECAN module to accept only those messages with message ID 0x13.
Now here comes the issue... The other node and my board are powered up at the same instant. The other node starts transmitting messages after 40 ms or so after powerup. But I am not able to get any message from it on my board. Now if I power up my board first, give it some time to reconfigure the ECAN module with new filters and settle down and then power up the other node, then everything works perfectly.
Now the strangest part... If I have a CAN bus analyzer connected between my board and the other node and even if I power up both the nodes at the same time, everything works fine... No need to power up my board first. I have tried this with three different bus analyzers from different manufacturers and got the same results.
To me it appears that during re-configuration of the ECAN module, it takes some time to settle down. And with the introduction of the bus analyzer in the bus, this time is somehow cut short so that everything works perfectly. But I am not sure what exactly the problem might be.
The problem might be a missing ACK. The CAN-Analyzer might acknowledge frames and the device does not switch to error passive.
I would hold off sending until the whole bus is initialized.
Also sounds like missing ACKs to me.
Are you seeing any error frames (get the scope to trigger off 6 consecutive dominant bits) - the Tx node might be going off the bus or even into some application-error mode if it doesn't get acknowledged enough.
You might be able to coax it back on bus by transmitting a dummy message on the bus.
I've found a Saleae Logic very useful in these circumstances (as well as a scope) - hang it off the Rx pin of your physical layer (or even wire up a standalone PHY that you can use to monitor the bus). The Saleae software will interpret the CAN and show you what's happening. Sometimes it's useful to use the scope trigger out to trigger the Logic.
CAN Communication requires at least two active devices on bus to have successful communication. This is because, a CAN frame is not completed unless someone acknowledges it.
When you power up your board and other node, it seems your board is not getting ready in 40msec. If it is not ready, it leaves "Other node" to be the only member on the bus and voilates above stated rule. Other node will get Tx error and after 128 erros, That other node will go in error mode and stop sending messages -- Hence you are not getting anything.
When you power up your board first, give it time - your board is ready and will ACK every message sent by other node -- Hence communication is good!
When you add CANalyzer, even if your board is not powered up, there are two active nodes on the bus -- Hence communication is good!

Resources