what is bcount in network tuple received - tcp

I am trying to analyse network traffic received from fprobe. Sample tuples I'm receiving look like this:
{
"bcount": 52,
"protocol": 6,
"Timestamp": "2017-11-15 12:07:20.049+0530",
"etime": 1048514968,
"daddr": "172.24.24.60",
"tag": "netflow_data",
"pcount": 1,
"saddr": "172.24.24.62",
"dport": 53345,
"sport": 8775,
"stime": 1048514968
}
I am able to figure out the meaning of the rest of the fields except bcount. I googled it but couldn't find a proper answer. What I found while searching is that bcount may stand for block count or byte count. Does anyone know what exactly it is and what its significance is?

Regular NetFlow v5 records usually contain counters for the bytes and packets contained within a flow (Wikipedia). You didn't mention which flow collector software you're using (only fprobe for generation), so I can't verify this, but I'm pretty sure bcount in your example means that there were 52 bytes in the flow.
This, together with the pcount (number of packets in the flow), is significant for identifying the amount of data transferred between the communicating hosts.
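For reference, the standard NetFlow v5 flow record carries these counters as dPkts and dOctets. Below is a minimal C sketch of the relevant part of that record layout; the mapping of field names onto your tuple (e.g. dOctets -> bcount) is an assumption based on typical collector behaviour, not something fprobe documents under those names.

#include <stdint.h>

/* Relevant slice of a NetFlow v5 flow record (network byte order on the wire).
 * A collector typically maps dPkts -> pcount and dOctets -> bcount. */
struct netflow_v5_record {
    uint32_t srcaddr;   /* source IP               -> saddr   */
    uint32_t dstaddr;   /* destination IP          -> daddr   */
    uint32_t nexthop;   /* next-hop router IP                 */
    uint16_t input;     /* input interface index              */
    uint16_t output;    /* output interface index             */
    uint32_t dPkts;     /* packets in the flow     -> pcount  */
    uint32_t dOctets;   /* bytes in the flow       -> bcount  */
    uint32_t first;     /* sysUptime at flow start -> stime   */
    uint32_t last;      /* sysUptime at flow end   -> etime   */
    uint16_t srcport;   /* -> sport */
    uint16_t dstport;   /* -> dport */
    uint8_t  pad1;
    uint8_t  tcp_flags;
    uint8_t  prot;      /* -> protocol (6 = TCP) */
    uint8_t  tos;
    /* ...AS numbers, masks and padding follow in the full 48-byte record */
};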

Related

Developing Communication Protocol for XBee

I am using XBee Digimesh Modules in API-Mode to send data between different industrial machines allowing them to share data, information and commands.
The API-Mode offers some basic commands, mainly to perform addressing and talk with the XBee Module itself in order to do configuration, etc.
Sending user data is done via a corresponding XBee API command which allows sending user-defined data with a maximum payload of 72 bytes.
Since I want to expand this communication to allow integration of more machines, etc., I am thinking about how to implement a basic communication system that's tailored to the very small payload of just 72 bytes.
Coming from the web, I normally would use some sort of JSON here but that would fill up the payload very quickly.
Also, it's not possible to send a frame packed with lots of information, since this too fills up the payload very quickly.
So I came up with a different way of communicating. Instead of transmitting frames packed with information, what about sending some sort of Messages like this:
Machine-A Broadcasts: Who's there?
Machine-B Answers: It's me I am a xxx-Machine
Machine-C Answers: It's me I am a xxx-Machine
Machine-A now evaluates the replies and decides to work with Machine-B (because Machine-C does not match A's interface):
Machine-A to B: Hello B, Give me some Value, please!
Machine-B to A: There you go: 2.349590
This can be extended to different short messages. After each message the sender holds the type of message in a state and the reply will be evaluated in relation to the state / context.
What I was trying to avoid was defining a bit-based protocol (like MIDI) which defines all events as bit-based flags. Since we do not know what type of hardware will be added in the future, I want a communication protocol that's very flexible and does not need a coordinator or message broker, etc.
But since this is the first time I am thinking about communication protocols I am curious to know if there might be some existing frameworks that can handle complex communication on a light payload.
You might want to read through the ZigBee Cluster Library specification with a focus on the general commands. It describes a system of attribute discovery and retrieval. Each attribute has a 16-bit ID and a datatype (integers of various sizes, enumerated types, bitmaps) that determines its size.
It's a protocol designed for the small payloads of an 802.15.4 network, and you could potentially base your protocol on a subset of it. Other ZigBee specifications are simply a list of defined attributes (and commands) for a given 16-bit cluster ID.
Your master device can go through a discovery process to get a list of attribute IDs, and then send a request to get values for multiple IDs in one shot. The response will be packed tight with a 16-bit ID, 8-bit attribute type and then variable length data. Even if your master device doesn't know what the ID corresponds to, it can pass the data along to other systems (like a web server) that do know.
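As a rough illustration of that style of packing, here is a small C sketch that walks a buffer of [16-bit attribute ID][8-bit type][value] entries, the general shape of a read-attributes response. The type codes and the helper name are made up for the example and are not the actual ZCL definitions.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative type codes; the real ZCL defines its own 8-bit type IDs. */
enum { ATTR_U8 = 0x20, ATTR_U16 = 0x21, ATTR_FLOAT32 = 0x39 };

/* Walk a packed buffer of [16-bit id][8-bit type][value...] entries. */
static void walk_attributes(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    while (i + 3 <= len) {
        uint16_t id = (uint16_t)(buf[i] | (buf[i + 1] << 8));  /* little-endian id */
        uint8_t type = buf[i + 2];
        i += 3;
        size_t vlen;
        switch (type) {
        case ATTR_U8:      vlen = 1; break;
        case ATTR_U16:     vlen = 2; break;
        case ATTR_FLOAT32: vlen = 4; break;
        default:           return;          /* unknown type: stop parsing */
        }
        if (i + vlen > len)
            return;                          /* truncated value */
        printf("attribute 0x%04x, type 0x%02x, %zu value byte(s)\n", id, type, vlen);
        i += vlen;                           /* raw value can be forwarded elsewhere */
    }
}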

TCP: multiple messages in a row

Is it within the TCP standard that multiple messages, sent from server to client in a row, will be received by the client in the same order (and that the bytes of one message will not be scattered among other messages)?
TCP provides an in-order byte stream delivery service. The bytes won't arrive in another order but the number of writes need not be equal to the number of reads.
You will never read bytes in a different order than the one in which they were sent
You can make no assumptions about "messages". TCP doesn't know about messages, only bytes (see above). Both the sender and the receiver can coalesce and split such "messages"
TCP uses a sequence number to identify each byte of data. The sequence number identifies the order of the bytes sent from each computer so that the data can be reconstructed in order, regardless of any fragmentation, disordering, or packet loss that may occur during transmission.
I agree with #cnicutar.
How are you deserializing the objects? I suspect the problem lies there.
For example if your messages are like
ABCD followed 200 ms later by PQR. It may appear as:
ABC followed by PQR
or ABCDPQR
or even AB followed by CD followed by PQ followed by R.
Basically you cannot make assumptions based on time of receiving the data.
The deserialization logic should know the object boundaries within a stream of bytes. This information should be encoded into the stream by the serialization logic.
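As an illustration, here is a minimal C sketch of one common way to encode those boundaries: a length prefix per message, together with a receive loop that tolerates arbitrary splits by recv(). Error handling is trimmed and the helper names are mine, not a standard API.

#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/* Read exactly len bytes, looping because one send() may arrive as
 * several recv() calls (or several sends as one recv). */
static int recv_all(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;          /* peer closed or error */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive one length-prefixed message: 4-byte big-endian length, then payload. */
static ssize_t recv_message(int fd, void *buf, size_t maxlen)
{
    uint32_t netlen;
    if (recv_all(fd, &netlen, sizeof netlen) < 0)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > maxlen)
        return -1;              /* message too large for caller's buffer */
    if (recv_all(fd, buf, len) < 0)
        return -1;
    return (ssize_t)len;
}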
If you are using Java, you can use ObjectInputStream & ObjectOutputStream and not be bothered about serialization issues.
J2ME Polish has a good serialization utility that can be very easily ported to other platforms. I have used it myself in a live environment.

Is there a good way to frame a protocol so data corruption can be detected in every case?

Background: I've spent a while working with a variety of device interfaces and have seen a lot of protocols, many serial and UDP, in which data integrity is handled at the application protocol level. I've been seeking to improve my receive routines' handling of protocols in general, and considering the "ideal" design of a protocol.
My question is: is there any protocol framing scheme out there that can definitively identify corrupt data in all cases? For example, consider the standard framing scheme of many protocols:
Field: Length in bytes
<SOH>: 1
<other framing information>: arbitrary, but fixed for a given protocol
<length>: 1 or 2
<data payload etc.>: based on length field (above)
<checksum/CRC>: 1 or 2
<ETX>: 1
For the vast majority of cases, this works fine. When you receive some data, you search for the SOH (or whatever your start byte sequence is), move forward a fixed number of bytes to your length field, and then move that number of bytes (plus or minus some fixed offset) to the end of the packet to your CRC, and if that checks out you know you have a valid packet. If you don't have enough bytes in your input buffer to find an SOH or to have a CRC based on the length field, then you wait until you receive enough to check the CRC. Disregarding CRC collisions (not much we can do about that), this guarantees that your packet is well formed and uncorrupted.
However, if the length field itself is corrupt and has a high value (which I'm running into), then you can't check the (corrupt) packet's CRC until you fill up your input buffer with enough bytes to meet the corrupt length field's requirement.
So is there a deterministic way to get around this, either in the receive handler or in the protocol design itself? I can set a maximum packet length or a timeout to flush my receive buffer in the receive handler, which should solve the problem on a practical level, but I'm still wondering if there's a "pure" theoretical solution that works for the general case and doesn't require setting implementation-specific maximum lengths or timeouts.
Thanks!
The reason why all protocols I know of, including those handling "streaming" data, chop the data stream up into smaller transmission units, each with their own checks on board, is exactly to avoid the problems you describe. Probably the fundamental flaw in your protocol design is that the blocks are too big.
The accepted answer of this SO question contains a good explanation and a link to a very interesting (but rather heavy on math) paper about this subject.
So in short, you should stick to smaller transmission units, not only for practical programming-related reasons but also because of the message length's role in determining the protection offered by your CRC.
One way would be to encode the length parameter so that it would be easily detected to be corrupted, and save you from reading in the large buffer to check the CRC.
For example, the XModem protocol embeds an 8-bit packet number followed by its one's complement.
It could mean doubling the size of your length field, but it's an option.
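A small C sketch of that idea, assuming the length byte is immediately followed by its one's complement as in XModem: a corrupted length is rejected up front, before the receiver commits to waiting for a huge bogus payload.

#include <stdint.h>

/* Validate a protected length field: the length byte is followed by its
 * one's complement, so a corrupted length is caught before the receiver
 * starts waiting for that many payload bytes. */
static int read_protected_length(const uint8_t *hdr, uint8_t *len_out)
{
    uint8_t len  = hdr[0];
    uint8_t comp = hdr[1];
    if ((uint8_t)(len ^ comp) != 0xFF)
        return -1;      /* length corrupted: resync instead of waiting */
    *len_out = len;
    return 0;
}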

Data Error Checking

I've got a bit of an odd question. A friend of mine and I thought it would be funny to make a serial port kind of communication between computers using sound. Basically, computers emit a series of beeps to send data, and listen for beeps over a microphone to receive data. In short, the world's most annoying serial port. I have all of the basics worked out. I can filter out sounds of only one frequency and I have sent data from one computer to another. Although the transmission is fairly error free, being affected only by very loud noises, some issues still exist. My question is, what are some good ways to check the data for errors and, more importantly, recover from these errors.
My serial communication is very standard once you get past the fact that it uses sound waves. I use one start bit, 8 data bits, and one stop bit in every frame. I have already considered cyclic redundancy checks, and I plan to factor this into my error checking, but CRCs don't account for some of the more insidious issues. For example, consider sending two bytes of data. You send the first one, and it is received correctly, but just after the stop bit of the first byte and before the start bit of the next, a large book falls on the floor, which the receiver interprets as a start bit; now the true start bit is read as part of the data and the receiver could be reading garbage data for many bytes to come. Eventually, a pause in the data could get things back on track.
That isn't the worst of it though. Bits can be dropped too, and most error checking schemes I can think of rely on receiving a certain number of bytes. What happens when the receiver keeps waiting for bytes that may not come?
So, you can see the complexity of this question. If you can direct me to any resources, or just give me a few tips, I would greatly appreciate your help.
A CRC is just a part of the solution. You can check for bad data but then you have to do something about it. The transmitter has to re-send the data, it needs to be told to do that. A protocol.
The starting point is that you split up the data into packets. A common approach is a start byte that indicates the start of the packet, followed by a packet number, followed by a length byte that indicates the length of the packet. Followed by the data bytes and the CRC. The receiver sends an ACK or NAK back to indicate success.
This solves several problems:
you don't care about a bad start bit anymore, the pause you need to recover is always there
you start a timer when you receive the first bit or byte, declare failure when the timer expires before the entire packet is received
the packet number helps you recover from bad ACK/NAK returns. The transmitter times out and resends the packet, you can detect the duplicate
RFC 916 describes such a protocol in detail. I never heard of anybody actually implementing it (other than me). Works pretty well.
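A rough C sketch of the packet layout described above (start byte, packet number, length, payload, CRC). The start marker and the CRC-8 choice are placeholders for the example, not something RFC 916 prescribes.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PKT_START 0x7E   /* illustrative start-of-packet marker */

/* Small CRC-8 (polynomial 0x07) over header and payload; any CRC with
 * enough strength for the payload size would do. */
static uint8_t crc8(const uint8_t *p, size_t len)
{
    uint8_t crc = 0;
    while (len--) {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07) : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Build: [start][seq][len][payload...][crc]; returns the total frame size.
 * The caller must provide an out buffer of at least len + 4 bytes. */
static size_t build_packet(uint8_t *out, uint8_t seq,
                           const uint8_t *payload, uint8_t len)
{
    out[0] = PKT_START;
    out[1] = seq;        /* packet number lets the peer detect duplicates */
    out[2] = len;
    memcpy(&out[3], payload, len);
    out[3 + len] = crc8(&out[1], (size_t)len + 2);  /* CRC over seq, len, payload */
    return (size_t)len + 4;
}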

TCP send queue depth

How do I discover how many bytes have been sent to a TCP socket but have not yet been put on the wire?
Looking at the diagram here:
I would like to know the total of Categories 2, 3, and 4, or the total of 3 and 4. This is in C(++) and on both Windows and Linux. Ideally there would be an ioctl I could use, but there doesn't seem to be one.
Under Linux, see the man page for tcp(7).
It appears that you can get the number of untransmitted bytes with ioctl(sock, SIOCOUTQ, ...); SIOCINQ is the equivalent for the receive queue.
Other stats might be available from members of the structure given back by the TCP_INFO getsockopt() call.
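A short Linux-only sketch of both approaches: SIOCOUTQ reports bytes still queued in the send buffer, and TCP_INFO exposes further counters. How exactly these map onto Categories 2-4 in your diagram depends on the kernel, so treat this as a starting point.

#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/sockios.h>   /* SIOCOUTQ */

/* Print how much data is still queued on a connected TCP socket (Linux only). */
static void print_send_queue_depth(int sock)
{
    int outq = 0;
    /* SIOCOUTQ: bytes in the send queue that the peer has not yet
     * acknowledged (sent-but-unacked plus not-yet-sent). */
    if (ioctl(sock, SIOCOUTQ, &outq) == 0)
        printf("send queue: %d bytes\n", outq);

    struct tcp_info info;
    socklen_t len = sizeof info;
    /* TCP_INFO exposes further counters (retransmits, RTT, ...). */
    if (getsockopt(sock, IPPROTO_TCP, TCP_INFO, &info, &len) == 0)
        printf("unacked segments: %u, rtt: %u us\n",
               info.tcpi_unacked, info.tcpi_rtt);
}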
Some Unix flavors may have an API way to do this, but there is no way to do it that is portable across different variants.
If you want to determine whether to add more data or not: don't worry, send will block until the data is in the queue. If you don't want it to block, you can tell send(2) not to:
send(socket, buf, buflen, MSG_DONTWAIT);
But this only works on Linux.
You can also set the socket to non-blocking:
fcntl(socket, F_SETFL, fcntl(socket, F_GETFL, 0) | O_NONBLOCK);
This way write will return an error (EAGAIN) if the data cannot be written to the stream.
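Putting the pieces together, a small sketch of setting the socket non-blocking (note F_SETFL, since O_NONBLOCK is a file status flag, not a descriptor flag) and handling EAGAIN on write; the helper names are mine.

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Make the socket non-blocking: O_NONBLOCK is set with F_SETFL. */
static int set_nonblocking(int sock)
{
    int flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(sock, F_SETFL, flags | O_NONBLOCK);
}

/* Try to write; report how much was accepted, 0 if the send buffer is full. */
static ssize_t try_write(int sock, const void *buf, size_t len)
{
    ssize_t n = write(sock, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;            /* buffer full, retry later (e.g. via poll) */
    return n;                /* bytes queued, or -1 on a real error */
}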
