I have to describe the functioning of HEVC in few words - codec

It works by comparing different parts of a video frame in order to find the ones that are redundant within the subsequent frames. These areas are replaced with a short information, describing the original pixels.
Is this correct? How would you describe it?

if I did understand you correctly, you are asking about the general video encoding scheme, so yeah the question is not related to programming and you should google about some video encoding basics.
somehow your definition is right, but very wide. compression in general aims to remove redundancy in bit stream, thus replacing some part of it with shortest segments. in video encoding (avc, hevc or other) the encoder will try to remove special and temporal redundancy. the compression process in complicated and differ from encoder to other so you cannot describe hevc in just one paragraph.
just start looking to the hybrid video encoder, than inter/intra predication+ entropy coding , which are mainly the most important parts.

Related

QR Code Recognition in AGV (Auto Guided Vehicle)

I have some questions.
The first question is which equipment should be used to recognize QR Code.
I'm thinking of two things.
The first is the QR code Scanner used in the industrial field.
The second is the camera module. (opencv will be used)
However, the situation to consider is that it should be recognized at the speed of 50cm/s.
What do you think about?
And if I use a camera, is there a library that you can recommend to recognize QR Code? (C/C++ only)
Always start with the simplest solution and then go more complex if needed. If you're using ROS/OpenCV, OpenCV has a QR Code scanner, ex. Other options include ZBar, quirc, and more, found by searching github or the internet.
As for a camera, if you don't need the intrinsic matrix, then you only need to decide on the resolution: more resolution takes (non-linearly) longer to compute, but less resolution prohibits seeing the objects well.
Your comment about "recognize at 50cm/s" doesn't make much sense. I assume you mean that you want to be able to decode a QR code that's up-to 50 cm away, and do it in less than a second (to have time to stop). First you'll have to check if the algorithm, running on your hardware, can detect the QR code at different desired distances, and how that changes with scaling the image up/down in OpenCV. Then you'll have to time how long it takes to detect/decode it at those distances/resolutions/scales. If it fails to be good enough, you can try another algorithm, try different compilation settings, perhaps give it it's own thread, change the scaling on the image, accept the limitations, or change the hardware.

Will any QR scanners fail to read a code with a sub-optimal mask pattern choice?

The QR code specification requires an optimal choice of mask to avoid patterns that are difficult to scan. If the rules for this choice are ignored and one of the mask patterns is chosen arbitrarily, but the scanner is not confused by the pattern while scanning it, will there be any decoding problems?
Yes, sometimes. Due ISO/IEC 18004:2000 (8.8), masking is used for:
balancing dark and light modules,
avoiding Position Detection pattern.
The first one will affect on version detection, which is made provisionaly for versions of QR code less than 7.
The second one will affect on position patterns detection.
Both these steps should be done before reading Format Information and applying mask to QR code while decoding.
No. The mask that is used is recorded in the QR code itself, in the area around the finder patterns. As long as that is read correctly, the mask will be correctly removed before decoding. It doesn't matter whether it's actually the optimal pattern or not. Any mask may be used in theory.

QR detection parameters

What are the parameters/factors that a QR detector need to detect/check before(during) decoding the QR code itself.
From what I know:
1. it need to find/locate three finder patterns
2. need to locate alignment patterns (if there is any)
3. need to check luminance
Is there anything else that need to be determined/checked?
I suppose that there are many ways to detect a QR code, and it's not required to do it one particular way or the other as long as the detection succeeds. There is a reference algorithms in the QR code specification, though in my opinion it is too slow to be practical, though it's quite thorough.
I can tell you how zxing does it. Yes, it first locates the three finder patterns. This is done by looking for 1:1:3:1:1 black/white/black/white/black crossings horizontally and vertically. It figures out which one is which by looking at the vectors between them.
Then it needs a fourth point since four points are needed to correct for perspective distortion. It uses the location of the 3 finder patterns to guess about where it is and scans for it similarly (looking for 1:1:1:1:1). You don't need to find all alignment patterns, though doing so would allow you to correct for warping in the QR code, which is very rare.
Then you can sample the image to get the black/white modules by computing the perspective transform and reversing it. Then the decoding proceeds, the processing of those black/white modules, which is a fair bit of work too but nothing to do with detection or image processing anymore.
Looking at luminance is really a step before all this, so you even have a notion of black and white in the image to begin with. That's different.

Are binary protocols dead? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 6 years ago.
Improve this question
It seems like there used to be way more binary protocols because of the very slow internet speeds of the time (dialup). I've been seeing everything being replaced by HTTP and SOAP/REST/XML.
Why is this?
Are binary protocols really dead or are they just less popular? Why would they be dead or less popular?
You Just Can't Beat the Binary
Binary protocols will always be more space efficient than text protocols. Even as internet speeds drastically increase, so does the amount and complexity of information we wish to convey.
The text protocols you reference are outstanding in terms of standardization, flexibility and ease of use. However, there will always be applications where the efficiency of binary transport will outweigh those factors.
A great deal of information is binary in nature and will probably never be replaced by a text protocol. Video streaming comes to mind as a clear example.
Even if you compress a text-based protocol (e.g. with GZip), a general purpose compression algorithm will never be as efficient as a binary protocol designed around the specific data stream.
But Sometimes You Don't Have To
The reason you are seeing more text-based protocols is because transmission speeds and data storage capacity have indeed grown fast compared to the data size for a wide range of applications. We humans find it much easier to work with text protocols, so we designed our ubiquitous XML protocol around a text representation. Certainly we could have created XML as a binary protocol, if we really had to save every byte, and built common tools to visualize and work with the data.
Then Again, Sometimes You Really Do
Many developers are used to thinking in terms of multi-GB, multi-core computers. Even your typical phone these days puts my first IBM PC-XT to shame. Still, there are platforms such as embedded devices, that have rather strict limitations on processing power and memory. When dealing with such devices, binary may be a necessity.
A parallel with programming languages is probably very relevant.
While hi-level languages are the preferred tools for most programming jobs, and have been made possible (in part) by the increases in CPU speed and storage capactity, they haven't removed the need for assembly language.
In a similar fashion, non-binary protocols introduce more abstraction, more extensibility and are therefore the vehicle of choice particularly for application-level communication. They too have benefited from increases in bandwidth and storage capacity. Yet at lower level it is still impractical to be so wasteful.
Furthermore unlike with programming languages where there are strong incentives to "take the performance hit" in exchange for added simplicity, speed of development etc., the ability to structure communication in layers makes the complexity and "binary-ness" of lower layers rather transparent to the application level. For example so long as the SOAP messages one receives are ok, the application doesn't need to know that these were effectively compressed to transit over the wire.
Facebook, Last.fm, and Evernote use the Thrift binary protocol.
I rarely see this talked about but binary protocols, block protocols especially can greatly simplify the complexity of server architectures.
Many text protocols are implemented in such a way that the parser has no basis upon which to infer how much more data is necessary before a logical unit has been received (XML, and JSON can all provide minimum necessary bytes to finish, but can't provide meaningful estimates). This means that the parser may have to periodically cede to the socket receiving code to retrieve more data. This is fine if your sockets are in blocking mode, not so easy if they're not. It generally means that all parser state has to be kept on the heap, not the stack.
If you have a binary protocol where very early in the receive process you know exactly how many bytes you need to complete the packet, then your receiving operations don't need to be interleaved with your parsing operations. As a consequence, the parser state can be held on the stack, and the parser can execute once per message and run straight through without pausing to receive more bytes.
There will always be a need for binary protocols in some applications, such as very-low-bandwidth communications. But there are huge advantages to text-based protocols. For example, I can use Firebug to easily see exactly what is being sent and received from each HTTP call made by my application. Good luck doing that with a binary protocol :)
Another advantage of text protocols is that even though they are less space efficient than binary, text data compresses very well, so the data may be automatically compressed to get the best of both worlds. See HTTP Compression, for example.
Binary protocols are not dead. It is much more efficient to send binary data in many cases.
WCF supports binary encoding using TCP.
http://msdn.microsoft.com/en-us/library/ms730879.aspx
So far the answers all focus on space and time efficiency. No one has mentioned what I feel is the number one reason for so many text-based protocols: sharing of information. It's the whole point of the Internet and it's far easier to do with text-based, human-readable protocols that are also easily processed by machines. You rid yourself of language dependent, application-specific, platform-biased programming with text data interchange.
Link in whatever XML/JSON/*-parsing library you want to use, find out the structure of the information, and snip out the pieces of data you're interested in.
Some binary protocols I've seen on the wild for Internet Applications
Google Protocol Buffers which are used for internal communications but also on, for example Google Chrome Bookmark Syncing
Flash AMF which is used for communication with Flash and Flex applications. Both Flash and Flex have the capability of communicating via REST or SOAP, however the AMF format is much more efficient for Flex as some benchmarks prove
I'm really glad you have raised this question, as non-binary protocols have multiplied in usage many folds since the introduction of XML. Ten years ago, you would see virtually everybody touting their "compliance" with XML based communications. However, this approach, one of several approaches to binary protocols, has many deficiencies.
One of the values, for example, was readability. But readability is important for debugging, when humans should read the transaction. They are very inefficient when compared with binary transfers. This is due to the fact that XML itself is a binary stream, that has to be translated using another layer into textual fragments ("tokens"), and then back into binary with the contained data.
Another value people found was extensibility. But extensibility can be easily maintained if a protocol version number for the binary stream is used at the beginning of the transaction. Instead of sending XML tags, one could send binary indicators. If the version number is an unknown one, then the receiving end can download the "dictionary" of this unknown version. This dictionary could, for example, be an XML file. But downloading the dictionary is a one time operation, instead of every single transaction!
So efficiency could be kept together with extensibility, and very easily! There are a good number of "compiled XML" protocols out there which do just that.
Last, but not least, I have even heard people say that XML is a good way to overcome little-endian and big-endian types of binary systems. For example, Sun computers vs Intel computers. But this is incorrect: if both sides can accept XML (ASCII) in the right way, surely both sides can accept binary in the right way, as XML and ASCII are also transmitted binarically.......
Hope you find this interesting reading!
Binary protocols will continue to live wherever efficency is required. Mostly, they will live in the lower-levels, where hardware-implementation is more common than software implementations. Speed isn't the only factor - the simplicity of implementation is also important. Making a chip process binary data messages is much easier than parsing text messages.
Surely this depends entirely on the application? There have been two general types of example so far, xml/html related answers and video/audio. One is designed to be 'shared' as noted by Jonathon and the other efficient in its transfer of data (and without Matrix vision, 'reading' a movie would never be useful like reading a HTML document).
Ease of debugging is not a reason to choose a text protocol over a 'binary' one - the requirements of the data transfer should dictate that. I work in the Aerospace industry, where the majority of communications are high-speed, predictable data flows like altitude and radio frequencies, thus they are assigned bits on a stream and no human-readable wrapper is required. It is also highly efficient to transfer and, other than interference detection, requires no meta data or protocol processing.
So certainly I would say that they are not dead.
I would agree that people's choices are probably affected by the fact that they have to debug them, but will also heavily depend on the reliability, bandwidth, data type, and processing time required (and power available!).
They are not dead because they are the underlying layers of every communication system. Every major communication system's data link and network layers are based on some kind of "binary protocol".
Take the internet for example, you are now probably using Ethernet in your LAN, PPPoE to communicate with your ISP, IP to surf the web and maybe FTP to download a file. All of which are "binary protocols".
We are seeing this shift towards text-based protocols in the upper layers because they are much easier to develop and understand when compared to "binary protocols", and because most applications don't have strict bandwidth requirements.
depends on the application...
I think in real time environment (firewire, usb, field busses...) will always be a need for binary protocols
Are binary protocols dead?
Two answers:
Let's hope so.
No.
At least a binary protocol is better than XML, which provides all the readability of a binary protocol combined with all the efficiency of less efficiency than a well-designed ASCII protocol.
Eric J's answer pretty much says it, but here's some more food for thought and facts. Note that the stuff below is not about media protocols (videos, images). Some items may be clear to you, but I keep hearing myths every day so here you go ...
There is no difference in expressiveness between a binary protocol and a text protocol. You can transmit the same information with the same reliability.
For every optimum binary protocol, you can design an optimum text protocol that takes just around 15% more space, and that protocol you can type on your keyboard.
In practice (practical protocols is see every day), the difference is often even less significant due to the static nature of many binary protocols.
For example, take a number that can become very large (e.g., in 32 bit range) but is often very small. In binary, people model this usually as four bytes. In text, it's often done as printed number followed by colon. In this case, numbers below ten become two bytes and numbers below 100 three bytes. (You can of course claim that the binary encoding is bad and that you can use some size bits to make it more space efficient, but that's another thing that you have to document, implement on both sides, and be able to troubleshoot when it comes over your wire.)
For example, messages in binary protocols are often framed by length fields and/or terminators, while in text protocols, you just use a CRC.
In practice, the difference is often less significant due to required redundancy.
You want some level of redundancy, no matter if it's binary or text. Binary protocols often leave no room for error. You have to 100% correctly document every bit that you send, and since most of us are humans, that happens rarely and you can't read it well enough to make a safe conclusion what is correct.
So in summary: Binary protocols are theoretically more space and compute efficient, but the difference is in practice often less than you think and the deal is often not worth it. I am working in the Internet of Things area and have to deal nearly on daily base with custom, badly designed binary protocols which are really hard to troubleshoot, annoying to implement and not more space efficient. If you don't need to absolutely tweak the last milliampere out of your battery and calculate with microcontroller cycles (or transmit media), think twice.

What techniques can you use to encode data on a lossy one-way channel?

Imagine you have a channel of communication that is inherently lossy and one-way. That is, there is some inherent noise that is impossible to remove that causes, say, random bits to be toggled. Also imagine that it is one way - you cannot request retransmission.
But you need to send data over it regardless. What techniques can you use to send numbers and text over that channel?
Is it possible to encode numbers so that even with random bit twiddling they can still be interpreted as values close to the original (lossy transmittion)?
Is there a way to send a string of characters (ASCII, say) in a lossless fashion?
This is just for fun. I know you can use morse code or any very low frequency binary communication. I know about parity bits and checksums to detect errors and retrying. I know that you might as well use an analog signal. I'm just curious if there are any interesting computer-sciency techniques to send this stuff over a lossy channel.
Depending on some details that you don't supply about your lossy channel, I would recommend, first using a Gray code to ensure that single-bit errors result in small differences (to cover your desire for loss mitigation in lossy transmission), and then possibly also encoding the resulting stream with some "lossless" (==tries to be loss-less;-) encoding.
Reed-Solomon and variants thereof are particularly good if your noise episodes are prone to occur in small bursts (several bit mistakes within, say, a single byte), which should interoperate well with Gray coding (since multi-bit mistakes are the killers for the "loss mitigation" aspect of Gray, designed to degrade gracefully for single-bit errors on the wire). That's because R-S is intrinsically a block scheme, and multiple errors within one block are basically the same as a single error in it, from R-S's point of view;-).
R-S is particularly awesome if many of the errors are erasures -- to put it simply, an erasure is a symbol that has most probably been mangled in transmission, BUT for which you DO know the crucial fact that it HAS been mangled. The physical layer, depending on how it's designed, can often have hints about that fact, and if there's a way for it to inform the higher layers, that can be of crucial help. Let me explain erasures a bit...:
Say for a simplified example that a 0 is sent as a level of -1 volt and a 1 is send as a level of +1 volt (wrt some reference wave), but there's noise (physical noise can often be well-modeled, ask any competent communication engineer;-); depending on the noise model the decoding might be that anything -0.7 V and down is considered a 0 bit, anything +0.7 V and up is considered a 1 bit, anything in-between is considered an erasure, i.e., the higher layer is told that the bit in question was probably mangled in transmission and should therefore be disregarded. (I sometimes give this as one example of my thesis that sometimes abstractions SHOULD "leak" -- in a controlled and architected way: the Martelli corollary to Spolsky's Law of Leaky Abstractions!-).
A R-S code with any given redundancy ratio can be about twice as effective at correcting erasures (errors the decoder is told about) as it can be at correcting otherwise-unknown errors -- it's also possible to mix both aspects, correcting both some erasures AND some otherwise-unknown errors.
As the cherry on top, custom R-S codes can be (reasonably easily) designed and tailored to reduce the probability of uncorrected errors to below any required threshold θ given a precise model of the physical channel's characteristics in terms of both erasures and undetected errors (including both probability and burstiness).
I wouldn't call this whole area a "computer-sciency" one, actually: back when I graduated (MSEE, 30 years ago), I was mostly trying to avoid "CS" stuff in favor of chip design, system design, advanced radio systems, &c -- yet I was taught this stuff (well, the subset that was already within the realm of practical engineering use;-) pretty well.
And, just to confirm that things haven't changed all that much in one generation: my daughter just got her MS in telecom engineering (strictly focusing on advanced radio systems) -- she can't design just about any serious program, algorithm, or data structure (though she did just fine in the mandatory courses on C and Java, there was absolutely no CS depth in those courses, nor elsewhere in her curriculum -- her daily working language is matlab...!-) -- yet she knows more about information and coding theory than I ever learned, and that's before any PhD level study (she's staying for her PhD, but that hasn't yet begun).
So, I claim these fields are more EE-y than CS-y (though of course the boundaries are ever fuzzy -- witness the fact that after a few years designing chips I ended up as a SW guy more or less by accident, and so did a lot of my contemporaries;-).
This question is the subject of coding theory.
Probably one of the better-known methods is to use Hamming code. It might not be the best way of correcting errors on large scales, but it's incredibly simple to understand.
There is the redundant encoding used in optical media that can recover bit-loss.
ECC is also used in hard-disks and RAM
The TCP protocol can handle quite a lot of data loss with retransmissions.
Either Turbo Codes or Low-density parity-checking codes for general data, because these come closest to approaching the Shannon limit - see wikipedia.
You can use Reed-Solomon codes.
See also the Sliding Window Protocol (which is used by TCP).
Although this includes dealing with packets being re-ordered or lost altogether, which was not part of your problem definition.
As Alex Martelli says, there's lots of coding theory in the world, but Reed-Solomon codes are definitely a sweet spot. If you actually want to build something, Jim Plank has written a nice tutorial on Reed-Solomon coding. Plank has a professional interest in coding with a lot of practical expertise to back it up.
I would go for some of these suggestions, followed by multiple sendings of the same data. So that way you can hope for different errors to be introduced at different points in the stream, and you may be able to infer the desired number a lot easier.

Resources