64b/66b encoding - raw ethernet

There are a few things I don't understand about 64b/66b encoding, and I failed to find the answers on the web. Any help/links would be greatly appreciated:
i) how is the start of a frame recognised? I don't think it can be by the initial 10/01 bits (called the preamble on Wikipedia), because you cannot tell them apart (if an idle link is all 0s, then 0000 10 and 000 01 0 look rather similar). I expect the end of a frame is indicated by a control word, with the rest of the bits perhaps used for the CRC?
ii) how do the scramblers synchronise, and how do they avoid scrambling the same packet the same way? Or to put this another way, why is it not possible for a malicious user to induce substantial packet loss by carefully choosing a bad message?
iii) this might have been answered in ii), but if a packet is sent to a switch, and then on to another host, is it scrambled the same way both times?
Once again, many thanks in advance

Layers
First of all, the OSI model needs to be clear.
The Ethernet frame is a data link layer construct, while the 64b/66b encoding is part of the physical layer (more precisely, the PCS sublayer of the physical layer).
The physical layer doesn't know anything about the start of a frame; it sees only data. (The start of an Ethernet frame consists of data bytes which contain the preamble.)
64b/66b encoding
Now let's assume that the link is up and running.
In this case the idle link is not full of '0's. (If it were, the link wouldn't be self-synchronous.) Idle messages (idle characters and/or synchronization blocks, i.e. control information) are sent over the idle link, encoded with the 0b10 sync header (the two "preamble" bits mentioned in the question). This is why the emitted spectrum and power dissipation don't depend on whether the link is idle or not.
So the start of a new frame works as follows:
The link carries idle information (with the 0b10 sync header).
The upper layer (data link layer) sends the frame (in 64-bit chunks of data) to the physical layer.
The physical layer sends the data (with the 0b01 sync header) over the link.
(Note that the physical layer frequently inserts control (sync) symbols into the raw stream, even during a data burst.)
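For concreteness, here is a minimal sketch of the 66-bit block structure just described. The struct and helper names are illustrative only; real hardware packs these bits into a serial stream:

```cpp
#include <cstdint>

// A 66-bit block: a 2-bit sync header followed by 64 payload bits.
// The sync header itself is NOT scrambled; the 64-bit payload is.
struct Block66 {
    uint8_t  sync;     // 0b01 = all-data block, 0b10 = control block
    uint64_t payload;  // 64 scrambled bits
};

Block66 make_data_block(uint64_t scrambled) {
    return Block66{0b01, scrambled};
}

Block66 make_control_block(uint64_t scrambled) {
    // In a control block, the first payload byte is the block type
    // field that says which of the 64 bits carry control characters.
    return Block66{0b10, scrambled};
}
```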
Synchronization
Before data transmission, a 64b/66b encoded lane must be initialized. This initialization includes lane initialization as well as block synchronization. Xilinx's Aurora specification (p. 34) is an example of link initialization.
Briefly, the receiver tries to match the sync header at different bit positions, and when it matches enough times in a row it reports link-up.
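A sketch of that block-lock idea follows. The threshold of 64 consecutive valid sync headers mirrors the IEEE 802.3 lock state machine, but the class shape here is made up for illustration:

```cpp
#include <cstdint>

// Feed candidate 2-bit sync headers at some bit offset. 0b01 and 0b10
// are legal; 0b00 and 0b11 are not. Enough consecutive legal headers
// means we are aligned on block boundaries; a bad one means we should
// slip one bit position and try again.
class BlockLock {
public:
    bool feed(uint8_t sync2) {
        if (sync2 == 0b01 || sync2 == 0b10) {
            if (++good_ >= 64) locked_ = true;  // 64 good headers in a row
        } else {
            good_ = 0;
            locked_ = false;  // caller slips one bit and restarts the count
        }
        return locked_;
    }
private:
    int  good_   = 0;
    bool locked_ = false;
};
```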
Note that 64b/66b encoding uses a self-synchronous scrambler. This is why the scrambler itself doesn't need to know anything about where we are in the data stream: if you run a self-synchronous descrambler long enough, its state converges and it produces the decoded bit stream.
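Here is a minimal sketch of such a self-synchronous scrambler, using the x^58 + x^39 + 1 polynomial that 64b/66b specifies (IEEE 802.3 Clause 49). It works bit-at-a-time for clarity; real PHYs process 64-bit blocks in parallel, and the seeds and test pattern below are arbitrary:

```cpp
#include <cstdint>
#include <iostream>

struct SelfSyncScrambler {
    uint64_t state = 0;  // 58-bit shift register; seed is arbitrary

    int scramble(int in) {
        // out(t) = in(t) ^ out(t-39) ^ out(t-58)
        int out = in ^ int((state >> 38) & 1) ^ int((state >> 57) & 1);
        state = ((state << 1) | uint64_t(out)) & ((1ULL << 58) - 1);
        return out;
    }
    int descramble(int in) {
        int out = in ^ int((state >> 38) & 1) ^ int((state >> 57) & 1);
        // Key point: the RECEIVED bit is shifted into the state, so the
        // descrambler tracks the line after 58 bits, whatever its seed.
        state = ((state << 1) | uint64_t(in)) & ((1ULL << 58) - 1);
        return out;
    }
};

int main() {
    SelfSyncScrambler tx, rx;
    tx.state = 0x123456789ABCDEFULL & ((1ULL << 58) - 1);  // tx seed
    rx.state = 0;  // deliberately different seed on the receiver
    for (int i = 0; i < 200; ++i) {
        int bit = i & 1;  // arbitrary data pattern
        int decoded = rx.descramble(tx.scramble(bit));
        if (i >= 58 && decoded != bit)
            std::cout << "mismatch at bit " << i << '\n';
    }
    std::cout << "descrambler converged after 58 bits\n";
}
```

Note how the descrambler never needs an explicit seed exchange: after 58 received bits its shift register equals the transmitter's, which is exactly the self-synchronous property the answer describes.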
Maliciousness
Note that 64b/66b encoding is not encryption. The scrambling won't protect you from eavesdropping or tampering. (Encryption should be placed at a higher level of the OSI model.)
Same packet multiple times
Because the scrambler is in a different state when you send the same packet a second time, the two encoded packets will differ. (In theory one could craft packets that drive the scrambler's shift register back to a known state, but the interleaved control symbols would have to be accounted for, so in practice this is impossible.)

Related

How does a computer know what data to reassemble?

When a computer X sends data through a network to computer Y, the data goes down through the OSI layers. This is OK, I understand. But once the data is put on the medium as electric signals, how does computer Y know what to reassemble, given that the headers and trailers generated on the way down the OSI model no longer exist as distinct entities once the data is on the electrical medium at layer 1?
The physical layer is just 1's and 0's, as you say - the trick is that there is a pattern that tells the receiver that this is the start of a packet. This is usually referred to as 'framing'.
Once the receiver knows that, it simply reads in as many bits as it needs for the layer 2 header, then it has that header, and so on.
The headers are clear in typical OSI or networking diagrams, e.g. (https://www.ciscopress.com/articles/article.asp?p=2738463).
So the way the first two layers work on the receiver is:
layer 1 just recognises whether the signal is a one or a zero and produces the stream of ones and zeros.
layer 2 reads this stream, and when it recognises the start pattern it knows the following bits are the header and so on, and hence it can identify the frames.
You can see examples of start and stop patterns online e.g. (http://sinauonline.50webs.com/Cisco/Cisco%20Exploration%20Sem1Chap7.html):
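As a sketch of what layer 2 does with the recovered stream: the Ethernet preamble (seven 0x55 bytes) and SFD (0xD5) values below are real, but the function and buffering model are made up for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Scan a recovered byte stream for the Ethernet preamble + SFD and
// return the offset of the first byte of the MAC header, if found.
std::optional<std::size_t> find_frame_start(
        const std::vector<uint8_t>& stream) {
    for (std::size_t i = 0; i + 8 <= stream.size(); ++i) {
        bool preamble = true;
        for (std::size_t j = 0; j < 7; ++j)
            if (stream[i + j] != 0x55) { preamble = false; break; }
        if (preamble && stream[i + 7] == 0xD5)
            return i + 8;  // first byte of the layer 2 header
    }
    return std::nullopt;  // no frame start yet; keep buffering
}
```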

read a well-defined frame with QtSerialPort

I'm developing a desktop application with Qt which communicates with an STM32 to send and receive data.
The thing is, the data to transfer follows a well-defined shape, with previously defined fields. My problem is that I can't find out how read() or readAll() work, or how QSerialPort even treats the data. So my question is: how can I read data (in real time, whenever there is data in the buffer) and analyze it field by field (or byte by byte) in order to display it in the GUI?
There's nothing to it. read() and readAll() give you bytes, optionally wrapped in a QByteArray. How you deal with those bytes is up to you. The serial port doesn't "treat" or interpret the data in any way.
The major point of confusion is that somehow people think of a serial port as if it was a packet oriented interface. It's not. When the readyRead() signal fires, all that you're guaranteed is that there's at least one new byte available to read. You must cope with such fragmentation.
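Here is a minimal sketch of how that fragmentation is typically handled with QSerialPort. The frame layout (a 0xAA sync byte followed by a one-byte length field) is hypothetical; adapt it to whatever your STM32 actually sends:

```cpp
#include <QSerialPort>
#include <QByteArray>
#include <functional>

// Accumulates bytes from readyRead() and hands complete frames to a
// callback. readyRead() may fire with any number of bytes available,
// so we buffer until a whole frame is present.
class FrameReader {
public:
    FrameReader(QSerialPort *port, std::function<void(QByteArray)> onFrame)
        : m_port(port), m_onFrame(std::move(onFrame)) {
        QObject::connect(port, &QSerialPort::readyRead,
                         [this] { readChunk(); });
    }
private:
    void readChunk() {
        m_buffer.append(m_port->readAll());  // whatever has arrived so far
        for (;;) {
            int start = m_buffer.indexOf(char(0xAA));  // hunt for sync byte
            if (start < 0) { m_buffer.clear(); return; }
            m_buffer.remove(0, start);
            if (m_buffer.size() < 2) return;         // need sync + length
            const int len = quint8(m_buffer.at(1));  // payload length field
            if (m_buffer.size() < 2 + len) return;   // frame not complete yet
            m_onFrame(m_buffer.mid(2, len));         // deliver the payload
            m_buffer.remove(0, 2 + len);             // look for the next one
        }
    }
    QSerialPort *m_port;
    std::function<void(QByteArray)> m_onFrame;
    QByteArray m_buffer;
};
```

Field-by-field parsing of the payload then happens in the callback, where the frame is guaranteed to be complete.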

Difference between PACKETS and FRAMES

Two words are commonly used in the networking world - packets and frames.
Can anyone please explain the difference between these two words in detail?
It might sound silly, but is it as below?
A packet is the PDU (Protocol Data Unit) at layer 3 (the network layer - e.g. an IP packet) of the OSI model.
A frame is the PDU of layer 2 (data link) of the OSI model.
Packets and Frames are the names given to Protocol data units (PDUs) at different network layers
Segments/Datagrams are units of data in the Transport Layer.
In the case of the internet, the term segment typically refers to TCP, while datagram typically refers to UDP. However, datagram can also be used in a more general sense and refer to other layers (link):
Datagram
A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network.
Packets are units of data in the Network Layer (IP in case of the Internet)
Frames are units of data in the Link Layer (e.g. Wi-Fi, Bluetooth, Ethernet, etc.).
A packet is a general term for a formatted unit of data carried by a network. It is not necessarily connected to a specific OSI model layer.
For example, in the Ethernet protocol on the physical layer (layer 1), the unit of data is called an "Ethernet packet", which has an Ethernet frame (layer 2) as its payload. But the unit of data of the Network layer (layer 3) is also called a "packet".
A frame is also a unit of data transmission. In computer networking the term is only used in the context of the Data link layer (layer 2).
Another semantic difference between packet and frame is that a frame envelops your payload with a header and a trailer, just like a painting in a frame, while a packet usually only has a header.
But in the end they mean roughly the same thing and the distinction is used to avoid confusion and repetition when talking about the different layers.
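As a purely illustrative sketch of that header/trailer distinction, here are the two layouts side by side (field layouts from Ethernet II and IPv4; the struct packing is just for the picture):

```cpp
#include <cstdint>

#pragma pack(push, 1)
struct EthernetHeader {        // layer 2 header
    uint8_t  dst[6], src[6];   // MAC addresses
    uint16_t ethertype;
};
// ... payload (e.g. an IP packet) goes here ...
struct EthernetTrailer {       // layer 2 trailer: the frame's CRC-32 FCS
    uint32_t fcs;
};

struct Ipv4Header {            // layer 3: a header only, no trailer
    uint8_t  version_ihl, tos;
    uint16_t total_length;     // note: layer 3 carries its own length
    uint16_t id, flags_fragment;
    uint8_t  ttl, protocol;
    uint16_t header_checksum;
    uint32_t src, dst;
};
#pragma pack(pop)
```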
Actually, there are five words commonly used when we talk about the layers of reference models (or protocol stacks): data, segment, packet, frame and bit. The term PDU (Protocol Data Unit) is a generic term used to refer to the packets at different layers of the OSI model; it gives an abstract idea of the data packets. The PDU has a different meaning at each layer, yet we can use it as a common term.
When we come to your question, we can call all of them by the general term PDU, but if you want to name them specifically at a given layer:
Data: PDU of the Application, Presentation and Session Layers
Segment: PDU of the Transport Layer
Packet: PDU of the Network Layer
Frame: PDU of the Data Link Layer
Bit: PDU of the Physical Layer
Consider TCP over ATM. ATM uses 53-byte cells carrying 48 bytes of payload each, but clearly TCP packets can be bigger than that. A frame (or cell) is the chunk of data sent as a unit over the data link (Ethernet, ATM). A packet is the chunk of data sent as a unit by the layer above it (IP). If the data link is made specifically for IP, as Ethernet and WiFi are, these will be the same size and packets will correspond to frames.
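A rough sketch of that segmentation, assuming AAL5-style zero padding and ignoring the cell headers and AAL5 trailer entirely:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One layer-3 packet is chopped into fixed 48-byte cell payloads;
// each rides in its own 53-byte ATM cell (5-byte header + 48 payload).
std::vector<std::vector<uint8_t>> segment_into_cells(
        const std::vector<uint8_t>& packet) {
    std::vector<std::vector<uint8_t>> cells;
    for (std::size_t off = 0; off < packet.size(); off += 48) {
        std::size_t n = std::min<std::size_t>(48, packet.size() - off);
        std::vector<uint8_t> payload(packet.begin() + off,
                                     packet.begin() + off + n);
        payload.resize(48, 0);  // pad the final, partial cell
        cells.push_back(std::move(payload));
    }
    return cells;
}
```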
Packet
A packet is the unit of data that is routed between an origin and a destination on the Internet or any other packet-switched network. When any file (e-mail message, HTML file, Graphics Interchange Format file, Uniform Resource Locator request, and so forth) is sent from one place to another on the Internet, the Transmission Control Protocol (TCP) layer of TCP/IP divides the file into "chunks" of an efficient size for routing. Each of these packets is separately numbered and includes the Internet address of the destination. The individual packets for a given file may travel different routes through the Internet. When they have all arrived, they are reassembled into the original file (by the TCP layer at the receiving end).
Frame
1) In telecommunications, a frame is data transmitted between network points as a unit, complete with addressing and necessary protocol control information. A frame is usually transmitted serially, bit by bit, and contains a header field and a trailer field that "frame" the data. (Some control frames contain no data.)
2) In time-division multiplexing (TDM), a frame is a complete cycle of events within the time division period.
3) In film and video recording and playback, a frame is a single image in a sequence of images that are recorded and played back.
4) In computer video display technology, a frame is the image that is sent to the display image rendering devices. It is continuously updated or refreshed from a frame buffer, a highly accessible part of video RAM.
5) In artificial intelligence (AI) applications, a frame is a set of data with information about a particular object, process, or image. An example is the iris-print visual recognition system used to identify users of certain bank automated teller machines. This system compares the frame of data for a potential user with the frames in its database of authorized users.

Is there a good way to frame a protocol so data corruption can be detected in every case?

Background: I've spent a while working with a variety of device interfaces and have seen a lot of protocols, many serial and UDP, in which data integrity is handled at the application protocol level. I've been seeking to improve my receive routines' handling of protocols in general, and considering the "ideal" design of a protocol.
My question is: is there any protocol framing scheme out there that can definitively identify corrupt data in all cases? For example, consider the standard framing scheme of many protocols:
Field: Length in bytes
<SOH>: 1
<other framing information>: arbitrary, but fixed for a given protocol
<length>: 1 or 2
<data payload etc.>: based on length field (above)
<checksum/CRC>: 1 or 2
<ETX>: 1
For the vast majority of cases, this works fine. When you receive some data, you search for the SOH (or whatever your start byte sequence is), move forward a fixed number of bytes to your length field, and then move that number of bytes (plus or minus some fixed offset) to the CRC at the end of the packet; if that checks out, you know you have a valid packet. If you don't have enough bytes in your input buffer to find an SOH, or to check a CRC based on the length field, then you wait until you receive enough. Disregarding CRC collisions (not much we can do about that), this guarantees that your packet is well formed and uncorrupted.
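Here is a minimal sketch of that receive loop, assuming a 1-byte SOH (0x01), a 1-byte length field, a 2-byte CRC over the payload, and a 1-byte ETX (0x03); the sizes and the CRC polynomial are illustrative, not from any particular protocol:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

// Toy CRC-16/CCITT for the sketch; real protocols specify their own.
uint16_t crc16(const uint8_t *data, std::size_t n) {
    uint16_t crc = 0xFFFF;
    for (std::size_t i = 0; i < n; ++i) {
        crc ^= uint16_t(data[i]) << 8;
        for (int b = 0; b < 8; ++b)
            crc = (crc & 0x8000) ? uint16_t((crc << 1) ^ 0x1021)
                                 : uint16_t(crc << 1);
    }
    return crc;
}

// Try to pull one complete, validated frame out of the receive buffer.
std::optional<std::vector<uint8_t>> try_extract(std::deque<uint8_t> &buf) {
    while (!buf.empty() && buf.front() != 0x01) buf.pop_front();  // hunt SOH
    if (buf.size() < 2) return std::nullopt;        // need SOH + length
    const std::size_t len = buf[1];
    const std::size_t total = 1 + 1 + len + 2 + 1;  // SOH+len+payload+CRC+ETX
    if (buf.size() < total) return std::nullopt;    // wait for more bytes
    std::vector<uint8_t> frame(buf.begin(), buf.begin() + total);
    const uint16_t rx_crc =
        uint16_t(frame[total - 3]) << 8 | frame[total - 2];
    if (frame[total - 1] != 0x03 ||                 // bad ETX, or...
        rx_crc != crc16(frame.data() + 2, len)) {   // ...bad CRC:
        buf.pop_front();                            // false SOH, resync
        return std::nullopt;
    }
    buf.erase(buf.begin(), buf.begin() + total);
    return std::vector<uint8_t>(frame.begin() + 2, frame.begin() + 2 + len);
}
```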
However, if the length field itself is corrupt and has a high value (which I'm running into), then you can't check the (corrupt) packet's CRC until you fill up your input buffer with enough bytes to meet the corrupt length field's requirement.
So is there a deterministic way to get around this, either in the receive handler or in the protocol design itself? I can set a maximum packet length or a timeout to flush my receive buffer in the receive handler, which should solve the problem on a practical level, but I'm still wondering if there's a "pure" theoretical solution that works for the general case and doesn't require setting implementation-specific maximum lengths or timeouts.
Thanks!
The reason why all protocols I know of, including those handling "streaming" data, chop up the data stream into smaller transmission units, each with its own checks on board, is exactly to avoid the problems you describe. Probably the fundamental flaw in your protocol design is that the blocks are too big.
The accepted answer of this SO question contains a good explanation and a link to a very interesting (but rather heavy on math) paper about this subject.
So, in short, you should stick to smaller transmission units, not only for practical programming-related reasons but also because of the message length's role in determining the security offered by your CRC.
One way would be to encode the length parameter so that corruption of it is easily detected, saving you from filling a large buffer just to check the CRC.
For example, the XModem protocol embeds an 8-bit packet number followed by its one's complement.
It would mean doubling the size of your length field, but it's an option.
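A sketch of that idea, borrowing XModem's complement trick for a hypothetical one-byte length field:

```cpp
#include <cstdint>
#include <optional>

// Encode: two bytes on the wire for a one-byte length.
void encode_length(uint8_t len, uint8_t out[2]) {
    out[0] = len;
    out[1] = static_cast<uint8_t>(~len);  // one's complement check byte
}

// Decode: reject the header immediately if the pair doesn't match,
// instead of buffering a bogus number of bytes to check the CRC.
std::optional<uint8_t> decode_length(const uint8_t in[2]) {
    if (static_cast<uint8_t>(~in[0]) != in[1])
        return std::nullopt;  // corrupt length; resynchronize
    return in[0];
}
```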

How to determine the length of an Ethernet II frame?

The Ethernet II frame format does not contain a length field, and I'd like to understand how the end of a frame can be detected without it.
Unfortunately, I have no idea about the physics, but the following sounds reasonable to me: we assume that layer 1 (the physical layer) provides us with a way of transmitting raw bits such that it is possible to distinguish between the situation where bits are being sent and the situation where nothing is sent (if digital data were coded into analog signals via phase modulation, this would be true, for example - but I don't know if this is really what's done). In this case, an Ethernet card could simply wait until a certain time interval occurs in which no more bits are transmitted, and then decide that the frame transmission has finished.
Is this really what's happening?
If yes: where can I find these things documented, and what are common values for the length of this "certain time interval"? And why does IEEE 802.3 have a length field?
If not: how is it done instead?
Thank you for your help!
Hanno
Your assumption is right. The length field inside the frame is not needed by layer 1.
Layer 1 uses other means to detect the end of a frame, which vary depending on the type of physical layer:
with 10Base-T, a frame is followed by a TP_IDL waveform; the lack of further Manchester-coded data bits can be detected.
with 100Base-TX, a frame ends with an End-of-Stream Delimiter bit pattern that cannot occur in payload data (thanks to its 4B/5B encoding).
A rough description can be found here, for example:
http://ww1.microchip.com/downloads/en/AppNotes/01120a.pdf "Ethernet Theory of Operation"
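To make the 100Base-TX case concrete, here is a sketch using the standard 4B/5B data table; it checks that the End-of-Stream Delimiter control codes /T/ and /R/ never collide with any data nibble's code, which is why the delimiter cannot be mimicked by payload bytes:

```cpp
#include <cstdint>
#include <iostream>

// Standard 4B/5B data codes (FDDI / 100Base-TX), indexed by nibble.
const uint8_t FOUR_B_FIVE_B[16] = {
    0b11110, 0b01001, 0b10100, 0b10101,  // 0-3
    0b01010, 0b01011, 0b01110, 0b01111,  // 4-7
    0b10010, 0b10011, 0b10110, 0b10111,  // 8-B
    0b11010, 0b11011, 0b11100, 0b11101,  // C-F
};

int main() {
    const uint8_t T = 0b01101, R = 0b00111;  // ESD control codes /T/ /R/
    for (uint8_t code : FOUR_B_FIVE_B)
        if (code == T || code == R)
            std::cout << "data code collides with the ESD!\n";
    std::cout << "no data nibble encodes to /T/ or /R/\n";
}
```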
