Bit ordering in a byte, does it matter? - networking

Let's say we have two machines on a network, MA and MB.
MA treats the order of the bits in a byte as little-endian;
MB, on the contrary, treats it as big-endian.
How do MA and MB agree on which "endianness" to use for the bits in a byte during
communication over the network?
Is there a standard "network endianness" or what?
Do socket programmers have to take any action to ensure correct communication?
For example, HTTP is a text protocol, which means that machines send and receive bytes that represent characters. What if the two machines encode those characters with a different endianness of the bits?

Yes, the hardware protocols specify the bit order of bytes on all network links. This is generally handled automatically by the NIC hardware.
See, for example, this description of Ethernet frame format.
Ethernet transmission is strange, in that the byte order is big-endian (the leftmost byte is sent first), but the bit order is little-endian (the rightmost bit, or LSB (Least Significant Bit), of each byte is sent first).
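As a rough illustration of that ordering, here is a minimal Python sketch that lists the bits of a byte string in the order described above: bytes from first to last, and within each byte the LSB first. The function names `wire_bits` and `bits_to_bytes` are made up for this example; real NICs do this in hardware and software never sees it.

```python
def wire_bits(frame):
    """Return the bits of `frame` in transmission order:
    bytes first-to-last, each byte least significant bit first."""
    bits = []
    for byte in frame:
        for position in range(8):
            bits.append((byte >> position) & 1)
    return bits


def bits_to_bytes(bits):
    """Inverse of wire_bits: rebuild bytes from LSB-first groups of 8 bits."""
    out = bytearray()
    for start in range(0, len(bits), 8):
        value = 0
        for position, bit in enumerate(bits[start:start + 8]):
            value |= bit << position
        out.append(value)
    return bytes(out)


print(wire_bits(b"\x01\x80"))
```

Because both ends apply the same rule, the receiver reconstructs exactly the bytes that were sent, which is why application code never has to care about bit order.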

Check this page: http://www.comptechdoc.org/independent/networking/protocol/protlayers.html
It suggests that byte ordering is done at the Presentation Layer, which is quite high up. This, however, relates specifically to the application that you are using. I suspect data at lower levels (wrapping the higher levels) has a predetermined byte and bit order.

Related

Does Modbus RTU require a gap between characters when transmitting?

I'm a little confused about the inter-character gap in Modbus and whether it's required when a master sends a message to a slave. The protocol spec says you can't have more than a 3.5 character gap between bytes when transmitting, but is there any specific minimum amount of time you must have between bytes?
I've written a Modbus driver (master) that is able to communicate with a variety of devices, and most don't seem to care about any gap between characters when receiving messages. However, I've come across a couple of devices where I was unable to communicate reliably without putting in some kind of delay (measured in microseconds, and determined by the baud rate) between bytes.
Is the character gap an absolute requirement or does it depend on the manufacturer of the device and how they implement the Modbus protocol?
Does Modbus RTU require a gap between characters when transmitting?
No.
In fact the Modbus spec states in section 2.5.1.1 MODBUS Message RTU Framing that "[t]he entire message frame must be transmitted as a continuous stream of characters."
Requiring intercharacter gaps would be contradictory to specifying "a continuous stream".
The protocol spec says you can't have more than a 3.5 character gap between bytes when transmitting ...
You're misquoting the protocol spec.
Only a 1.5 character gap is tolerated between characters in an RTU message.
From the Modbus spec: "If a silent interval of more than 1.5 character times occurs between two characters, the message frame is declared incomplete and should be discarded by the receiver."
A silent (idle) line with a duration of 3.5 characters must precede a message.
IOW a gap of 2 (i.e. more than 1.5 and less than 3.5) characters would prematurely end the current message, and the following characters (of that malformed message) would not be considered the start of a new message and must be discarded (until the line goes idle for at least 3.5 characters).
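To put numbers on those intervals, here is a small Python sketch. It assumes the 11-bit RTU character (start bit, 8 data bits, parity, stop bit) and the fixed 750 µs / 1.750 ms values that the Modbus serial line spec recommends above 19200 baud; the names `rtu_timeouts` and `classify_gap` are invented for this illustration.

```python
BITS_PER_CHAR = 11  # assumed RTU character: 1 start + 8 data + 1 parity + 1 stop


def rtu_timeouts(baud):
    """Return (t1.5, t3.5) in seconds for the given baud rate."""
    if baud > 19200:
        # Fixed values recommended by the spec for fast links.
        return 750e-6, 1750e-6
    char_time = BITS_PER_CHAR / baud
    return 1.5 * char_time, 3.5 * char_time


def classify_gap(gap_seconds, baud):
    """Describe how a receiver treats an idle gap of the given length."""
    t15, t35 = rtu_timeouts(baud)
    if gap_seconds <= t15:
        return "same frame"
    if gap_seconds < t35:
        return "frame discarded"
    return "new frame may start"


t15, t35 = rtu_timeouts(9600)
print(f"9600 baud: t1.5 = {t15 * 1e3:.3f} ms, t3.5 = {t35 * 1e3:.3f} ms")
print(classify_gap(2 * BITS_PER_CHAR / 9600, 9600))  # a 2-character gap
```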
... is there any specific minimum amount of time you must have between bytes?
The Modbus spec does not mention any such requirement.
Such a requirement would be impractical.
UARTs (typically) do not have the capability to meter their output by inserting a delay between the transmission of character frames.
Adding such a delay puts an additional burden on the processor and ties up a timer.
On the contrary UARTs have evolved to transmit characters as fast as the baudrate allows with the least processor intervention, e.g. hardware FIFO and DMA.
A "minimum amount of time you must have between bytes" is simply a reduction in the effective data rate.
Therefore an appropriate reduction of the baudrate would accomplish the exact same data rate.
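The arithmetic behind that claim can be sketched in a few lines of Python. The 10-bit character (8N1) default and the function names are assumptions made for the example, not something taken from the Modbus spec.

```python
def chars_per_second(baud, gap_seconds=0.0, bits_per_char=10):
    """Characters per second when every character is followed by an idle gap."""
    char_time = bits_per_char / baud
    return 1.0 / (char_time + gap_seconds)


def equivalent_baud(baud, gap_seconds, bits_per_char=10):
    """Gap-free baud rate that yields the same character throughput."""
    return chars_per_second(baud, gap_seconds, bits_per_char) * bits_per_char


# At 10000 baud a 10-bit character takes 1 ms; adding a 1 ms gap after each
# character halves the throughput, the same as running gap-free at 5000 baud.
print(equivalent_baud(10000, 0.001))
```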
Is the character gap an absolute requirement or does it depend on the manufacturer of the device and how they implement the Modbus protocol?
No, you are probably using too fast a baudrate for the device/circumstances in question.
A microprocessor or microcontroller should be able to keep a UART busy and transmit without any intercharacter gaps.
A UART that requires gaps during receiving is IMO an overloaded system and is broken.
For reliable communication (without flow control) use a baudrate low enough so that metering the transmitted characters is not necessary.
Addendum
Apparently there is at least one UART (from Siemens) that can meter its output by holding the line idle for N bit-times between character frames.
It is at the end of the message that there should be a pause of 3.5 characters or longer.
Data transmission protocols usually include the number of bytes that follow in the first positions of the byte sequence, but Modbus RTU does not send that length; what determines when a message ends is the pause of 3.5 characters.
If you send the sequence of bytes at once, there should never be any pause between characters.
If you are writing a master you should not worry about this, since it is the slave that must wait 3.5 characters to know when the master's request is finished.
On the master side you simply wait for the slave to reply, since you know how many bytes the slave is going to send: in the request you already stated how many bits or 16-bit words you want to read.
And if you have communication problems with some devices, it is probably due to the combination of communication speed and poor line quality. Try a lower baud rate; adding a wait between characters doesn't make much sense to me.
I have encountered UARTs which require 2 stop bits on receive. If the device is set for 2 stop bits, this may explain the requirement for a gap, to extend the stop period beyond 2 stop bits.
Usually, only the first stop bit is checked on receive to determine framing error regardless of the stop bit setting.

GnuRadio tcp_sink data values are garbled

I'm developing a web front end for a GNU Radio application developed by a colleague.
I have a TCP client connecting to the output of two TCP Sink blocks, and the data encoding is not as I expect it to be.
One TCP Sink is sending complex data and the other is sending float data.
I'm decoding the data at the client by reading each 4-byte chunk as a float32 value. The server and the client are both little-endian systems, but I also tried byte swapping (with the GNU Radio Endian Swap block and also manually at the client), and the data is still not right. Actually it's much worse then, confirming there is no byte order mismatch.
When I execute the flow graph in GNU Radio Companion with appropriate GUI elements, the plots look correct. The data values are shown, as expected, to be between 0 and 10.
However the values decoded at the client are generally around 0.00xxxxx, and the plot looks like noise rather than showing a simple tone as is seen in GNU Radio. If I manually scale the data by multiplying by 1000 it still looks like noise.
I'll describe the pre-D path in GNU Radio since it's shorter, but I see the same problem on the post-D path, where a WBFM Receive and a Rational Resampler are added, followed by a Throttle block and then a TCP Sink block sending float data.
File Source (Output Type: complex, vector length: 1) =>
Throttle (vector length: 1) =>
Low Pass Filter (FIR Type: Complex->Complex (Decimating)) =>
Throttle (vector length: 1) =>
TCP Sink (input type: complex, vector length: 1).
This seems to be the correct way to specify the stream parameters (and indeed Companion shows errors if I make changes which mismatch the stream items), but I can find no way to decode the data correctly on the other end of the stream.
"the historic RFC 1700 (also known as Internet standard STD 2) has defined the network order for protocols in the Internet protocol suite to be big-endian, hence the use of the term 'network byte order' for big-endian byte order."
See https://en.wikipedia.org/wiki/Endianness
Note that the network order for protocols being big-endian says nothing about the byte order of the network payload itself.
Also note: Sun Microsystems made computers with big-endian native byte order (on which much Internet protocol development was done).
I am surprised the previous answer has gone this long without a lesson on network byte order versus native byte order.
GNURadio appears to assume native byte order from a UDP Source block.
Examining the datatype color codes in Help->Types of GNURadio Companion, the orange colored 'float' connections are float32.
To verify a computer's native byte order in Python, do:
from sys import byteorder
print(byteorder)
The result will be 'little' or 'big'.
It may be that no matter what type of floats you are sending, the bytes are ordered little-endian once they get onto the network. I had a similar problem with a UDP connection, and I solved it by parsing the floats as little-endian on the client side.
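A minimal sketch of parsing the stream explicitly as little-endian float32 on the client is shown below. It assumes the sink sends raw IEEE 754 float32 values with no framing, and that complex items arrive as interleaved (real, imaginary) float32 pairs of 8 bytes each; both are assumptions to check against your flow graph. The function names are hypothetical. Since a TCP read can end in the middle of an item, each function also returns the leftover bytes to prepend to the next read.

```python
import struct


def decode_floats(buffer, byte_order="<"):
    """Decode as many whole float32 values as `buffer` holds.
    Returns (values, leftover_bytes)."""
    usable = len(buffer) - (len(buffer) % 4)
    count = usable // 4
    values = list(struct.unpack(f"{byte_order}{count}f", buffer[:usable]))
    return values, buffer[usable:]


def decode_complex(buffer, byte_order="<"):
    """Decode interleaved (real, imag) float32 pairs (assumed layout).
    Returns (samples, leftover_bytes)."""
    usable = len(buffer) - (len(buffer) % 8)
    floats, _ = decode_floats(buffer[:usable], byte_order)
    samples = [complex(floats[i], floats[i + 1]) for i in range(0, len(floats), 2)]
    return samples, buffer[usable:]


# Usage with bytes that would normally come from sock.recv(...):
chunk = struct.pack("<3f", 1.0, 2.5, -4.0) + b"\x00"  # one stray trailing byte
values, leftover = decode_floats(chunk)
print(values, leftover)
```

Passing `">"` as `byte_order` decodes the same bytes as big-endian, which makes it easy to compare both interpretations against the values shown in GNU Radio Companion.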

Ethernet frames smaller than 64 bytes

By IEEE 802.3, an Ethernet frame has to carry a payload of at least 46 bytes. This is for collision detection: collisions of smaller frames may go undetected.
The question is: what if the payload to be carried is shorter? What kind of padding is used to bring the frame up to the slot size of 64 bytes?
TIA.
To quote from Data and Computer Network Communication (emphasis mine):
If the network layer wishes to send less than 46 bytes of data the MAC protocol adds sufficient number of zero bytes (0x00, is also known as null padding characters) to satisfy the requirement.
Some buggy drivers fail to do this though as noted by Adaptec.
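The padding rule itself is simple enough to sketch in Python. This only illustrates the arithmetic (14-byte header + 46-byte minimum payload + 4-byte FCS = 64 bytes); in practice the MAC hardware or driver does it, and `pad_payload` is a name invented for the example.

```python
HEADER_LEN = 14    # destination MAC + source MAC + EtherType/length
FCS_LEN = 4        # frame check sequence (CRC-32)
MIN_FRAME_LEN = 64
MIN_PAYLOAD_LEN = MIN_FRAME_LEN - HEADER_LEN - FCS_LEN  # 46 bytes


def pad_payload(payload):
    """Append 0x00 bytes until the payload reaches the 46-byte minimum."""
    shortfall = MIN_PAYLOAD_LEN - len(payload)
    if shortfall > 0:
        return payload + b"\x00" * shortfall
    return payload


padded = pad_payload(b"hello")
print(len(padded), HEADER_LEN + len(padded) + FCS_LEN)  # 46 64
```

Note that the receiver cannot tell padding from data by looking at the frame alone; the upper-layer protocol has to carry its own length field.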

Endianess of network data transmissions over TCP/IP

Here is a question I've been trying to answer for quite some time. It does not pertain to a particular language, although it is less relevant for languages whose VM specifies the endianness. I know, like 99.9999% of the people who use sockets to send data over TCP/IP, that the protocol specifies an endianness for its own transmission elements, like the destination address, port and such. The thing I don't know is if it requires the payload to be in a specific format to prevent incompatibilities.
For example, let's say I develop a protocol that is not a presentation layer, and that, due to the immense dominance of little-endian devices nowadays, I decide to make it little-endian (so that, for example, the positions of the players are transmitted in little-endian order). Think of a network module for a game engine, where latency matters and byte conversion would cost a noticeable amount of time. Of course the address, port and all the other protocol-related data would be specified in big-endian as is mandatory; I'm talking about the payload, and only that.
Would that protocol work out of the box (translating the contents as necessary, of course, once the transmission is received) on a big-endian machine? Or would the checksums of the IP protocol or something of the kind get computed wrong since the data is in a different order, given that the programmer has no control over them unless raw sockets are used?
Since the whole explanation can be misleading, feel free to ask for clarifications.
Thank you very much.
The thing I don't know is if it requires the payload to be in a specific format to prevent incompatibilities.
It doesn't, and it doesn't have a way of telling. To TCP it's just a byte-stream. It is up to the application protocol to decide endian-ness, and it is up to the implementors at each end to implement it correctly. There is a convention to use big-endian, but there's no compulsion.
Application-layer protocols dictate their own endianness. However, by convention, multi-byte integer values should be sent in network byte order (big-endian) for consistency across platforms, such as by using the platform-provided hton...() (host-to-network) and ntoh...() (network-to-host) functions in your code. On little-endian systems, they do the necessary byte swapping. On big-endian systems, they are no-ops. The functions provide an abstraction layer so your code does not have to worry about that.
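The same convention can be sketched in Python, where `struct`'s `!` prefix means network (big-endian) order and `socket.htonl()`/`socket.ntohl()` wrap the platform functions. The helper names below are made up for the example, and the little-endian variant is only there to show that a payload format may choose otherwise, as long as both ends agree.

```python
import socket
import struct
import sys


def encode_u32_network(value):
    """Pack a 32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def decode_u32_network(data):
    """Unpack a 32-bit unsigned integer sent in network byte order."""
    (value,) = struct.unpack("!I", data)
    return value


def encode_u32_little(value):
    """Pack the same integer little-endian, for a payload format defined that way."""
    return struct.pack("<I", value)


wire = encode_u32_network(0x01020304)
print(wire.hex())  # 01020304 on every platform
print(sys.byteorder, hex(socket.htonl(0x01020304)))  # swapped only on little-endian hosts
```

Either way, TCP and IP checksums are computed over the bytes as sent, so the payload's byte order has no effect on them.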

What is the difference between using mark/space parity and parity none?

What is the purpose of having three types of parity setting that all describe a state where the parity bit is, precisely, not used?
"If the parity bit is present but not used, it may be referred to as mark parity (when the parity bit is always 1) or space parity (the bit is always 0)" - Wikipedia
There is a very simple and very useful reason to have mark or space parity that appears to be left out here: node address flagging.
Very low-power and/or small embedded systems sometimes utilize an industrial serial bus like RS485 or RS422. Perhaps many very tiny processors may be attached to the same bus.
These tiny devices don't want to waste power or processing time looking at every single character that comes in over the serial port. Most of the time, it's not something they're interested in.
So, you design a bus protocol that uses, for example, 9 bits: 8 data bits and a mark/space parity bit. Each data packet contains exactly one byte or word (the node address) with the mark parity bit set. Everything else is space parity. Then, these tiny devices can simply wait around for a parity error interrupt. Once a device gets the interrupt, it checks that byte. Is that my address? No, go back to sleep.
It's a very power-efficient system... and only 10% wasteful on bandwidth. In many environments, that's a very good trade-off.
So... if you've then got a PC-class system trying to TALK to these tiny devices, you need to be able to set/clear that parity bit. So, you set MARK parity when you transmit the node addresses, and SPACE parity everywhere else.
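A toy Python model of that addressing scheme, with each transmitted word represented as a `(value, ninth_bit)` pair, might look like the sketch below. The function names and the packet layout (one address byte followed by data bytes) are assumptions made for illustration, not any particular bus standard.

```python
def frame_packet(address, payload):
    """Build the 9-bit words for one packet: the address byte carries the
    ninth bit set (mark), every data byte carries it cleared (space)."""
    return [(address, 1)] + [(byte, 0) for byte in payload]


def receive(my_address, words):
    """Collect only the data bytes that follow this node's own address."""
    selected = False
    collected = bytearray()
    for value, ninth_bit in words:
        if ninth_bit == 1:
            # Address byte: the only point where the node has to wake up.
            selected = (value == my_address)
        elif selected:
            collected.append(value)
    return bytes(collected)


bus = frame_packet(5, b"hi") + frame_packet(7, b"no")
print(receive(5, bus), receive(7, bus), receive(9, bus))
```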
So there are five possibilities, not three: no parity, mark, space, odd and even. With no parity the extra bit is just omitted in the frame, often selected when the protocol is already checking for errors with a checksum or CRC or data corruption is not deemed likely or critical.
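The five settings can be summarized in a short Python sketch that computes the extra bit for one data byte. `parity_bit` is a hypothetical helper; a real UART does this in hardware.

```python
def parity_bit(byte, mode):
    """Return the parity bit sent after `byte` for the given mode,
    or None when no parity bit is present in the frame."""
    ones = bin(byte & 0xFF).count("1")
    if mode == "none":
        return None          # the bit is omitted from the frame entirely
    if mode == "mark":
        return 1             # always 1, regardless of the data
    if mode == "space":
        return 0             # always 0, regardless of the data
    if mode == "even":
        return ones % 2      # makes the total number of 1 bits even
    if mode == "odd":
        return 1 - ones % 2  # makes the total number of 1 bits odd
    raise ValueError(f"unknown parity mode: {mode}")


for mode in ("none", "mark", "space", "odd", "even"):
    print(mode, parity_bit(0b00000111, mode))
```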
Nobody ever selects mark or space; that's just wasting bandwidth. The exception is some odd standard, like the 9-bit data protocols that hardware vendors favor to force you to buy their hardware, since you have no real shot at reprogramming the UART on the fly without writing a driver.
Setting mark or space parity is useful if you're generating data to send to hardware that requires a parity bit (perhaps because it has a hard coded word length built into the electronics) but doesn't care what its value is.
RS485 itself only defines the electrical layer, but multidrop protocols running over it often use 9-bit transmission, as described above. RS485 is widely used in industrial applications, whatever the 'size' of the controlled device (for instance, many air conditioners and refrigerators offer an RS485 interface, and they are not really 'tiny' things). RS485 allows throughput up to 10 Mbit/s or distances up to 4000 feet. Using the parity bit to distinguish address bytes from data bytes eases hardware implementation: each node on the network can have its own hardware that generates an interrupt only if an address byte on the wire matches the node's address.
Very clear and helpful answers and remarks.
For those who find the concept perverse, relax; the term is a problem of semantics rather than information theory or engineering, the difficulty arising from the use of the word "parity".
"Mark" and "space" bits are not parity bits in those applications; the term arises from the fact that they occupy the bit position in which a parity bit might be expected in other contexts. In reality they have nothing to do with parity, but are used for any relevant purpose where a constant bit value is needed, such as to mark the start of a byte or other signal, or as a delay, or to indicate the status of a signal as being data or address or the like.
Accordingly they sometimes are more logically called "stick (parity) bits", being stuck in "on" or "off" state. Sometimes they really are "don't cares".

Resources