Get four 16-bit numbers from a 64-bit hex value

I have been through these related questions:
How to convert numbers between hexadecimal and decimal in C#?
How to Convert 64bit Long Data Type to 16bit Data Type
Way to get value of this hex number
But I did not get an answer, probably because I do not understand 64-bit or 16-bit values.
I had posted a question about Picasa and face detection (Automatic Face detection using API), asking how to use the face detection that Picasa does to get individual pictures from a photo containing many faces.
In an answer, @Joel Martinez linked to an answer on Picasa help which said:
The number encased in rect64() is a 64-bit hexadecimal number.
Break that up into four 16-bit numbers.
Divide each by the maximum unsigned 16-bit number (65535) and you'll have four
numbers between 0 and 1.
The full text:
@oedious wrote: This is going to be somewhat technical, so hang on.
* The number encased in rect64() is a 64-bit hexadecimal number.
* Break that up into four 16-bit numbers.
* Divide each by the maximum unsigned 16-bit number (65535) and you'll have four numbers between 0 and 1.
* The four numbers remaining give you relative coordinates for the face rectangle: (left, top, right, bottom).
* If you want to end up with absolute coordinates, multiply the left and right by the image width and the top and bottom by the image height.
A sample picasa.ini file:
[1.jpg]
backuphash=65527
faces=rect64(5520c092dfb2f8d),615eec1bb18bdec5;rect64(dcc2ccf1fd63e93e),bc209d92a3388dc3;rect64(52524b7c785e6cf6),242908faa5044cb3
crop=rect64(0)
How do I get the 4 numbers from the 64 bit hex?
I am sorry people, currently I do not understand the answers. I guess I will have to learn some C++ (I am a PHP & Java web developer with a weakness in math) before I can jump in and write something which will cut up an image into multiple images with the help of some coordinates. I am looking into CodeLab and creating plugins for Paint.NET too.

If you want basics, say you have this hexadecimal number:
4444333322221111
We can split it into the 4 parts on paper, so all that's left is to extract them in code. This involves using an ffff mask to block out everything besides the part we want (f masks nothing, 0 masks everything) and sliding it over each part. So we have:
part 1: 4444333322221111 & ffff = 1111
part 2: 4444333322221111 & ffff0000 = 22220000
part 3: 4444333322221111 & ffff00000000 = 333300000000
part 4: 4444333322221111 & ffff000000000000 = 4444000000000000
All that's left is to remove the 0's at the end. All in all, in C, you'd write this as:
int GetPart(uint64 pack, int n) // where you define uint64 as your platform's unsigned 64-bit type
{                               // (unsigned __int64 in MSVC, or uint64_t from <stdint.h>)
    return (int)((pack & ((uint64)0xffff << (16 * n))) >> (16 * n));
}
So basically, you build the mask by shifting 0xffff (2 bytes) left by 16*n bits (0 for the first part, 16 for the 2nd, 32 for the 3rd and 48 for the 4th), apply it to the number to mask out everything but the part we're interested in, then shift the result back right by 16*n bits to clear out those 0's at the end.
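For instance, here is a minimal sketch in C (assuming, as the Ruby answer further down does, that the most significant 16 bits are the left coordinate) that turns the first faces rect64() value from the sample picasa.ini above into four relative coordinates:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* first faces rect64() value from the sample picasa.ini above */
    uint64_t pack = 0x5520c092dfb2f8dULL;

    /* extract the four 16-bit parts and normalise each to 0..1 */
    double left   = ((pack >> 48) & 0xffff) / 65535.0;
    double top    = ((pack >> 32) & 0xffff) / 65535.0;
    double right  = ((pack >> 16) & 0xffff) / 65535.0;
    double bottom = ( pack        & 0xffff) / 65535.0;

    printf("left=%f top=%f right=%f bottom=%f\n", left, top, right, bottom);
    return 0;
}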
Some additional reading: Bitwise operators in C.
Hope that helps!

Here is the algorithm:
The remainder of the division by 0x10000 (65536) will give you the first number.
Take the result, then divide by 0x10000 (65536) again; the remainder will give you the second number.
Take the result, then divide by 0x10000 (65536) again; the remainder will give you the third number.
The result is the fourth number.
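As a minimal sketch of that repeated divide-and-remainder in C (the function and variable names are placeholders):

#include <stdint.h>

/* Split a 64-bit value into its four 16-bit parts, lowest part first,
   using only division and remainder. */
void split64(uint64_t value, uint16_t parts[4])
{
    for (int i = 0; i < 4; i++) {
        parts[i] = (uint16_t)(value % 0x10000);  /* remainder = current 16-bit part */
        value   /= 0x10000;                      /* drop that part and continue     */
    }
}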

It depends on your programming language. In C#, for example, you can use the BitConverter class, which allows you to extract a number based on the byte position within a byte array.
UInt64 largeHexNumber = 420404334;
byte[] hexData = BitConverter.GetBytes(largeHexNumber);
UInt16 firstValue = BitConverter.ToUInt16(hexData, 0);
UInt16 secondValue = BitConverter.ToUInt16(hexData, 2);
UInt16 thirdValue = BitConverter.ToUInt16(hexData, 4);
UInt16 forthValue = BitConverter.ToUInt16(hexData, 6);

It depends on the language. For the C-family of languages, it can be done like this (in C#):
UInt64 number = 0x4444333322221111;
//to get the ones, use a mask
// 0x4444333322221111
const UInt64 mask1 = 0xFFFF;
UInt16 part1 = (UInt16)(number & mask1);
//to get the twos, use a mask then shift
// 0x4444333322221111
const UInt64 mask2 = 0xFFFF0000;
UInt16 part2 = (UInt16)((number & mask2) >> 16);
//etc.
// 0x4444333322221111
const UInt64 mask3 = 0xFFFF00000000;
UInt16 part3 = (UInt16)((number & mask3) >> 32);
// 0x4444333322221111
const UInt64 mask4 = 0xFFFF000000000000;
UInt16 part4 = (UInt16)((number & mask4) >> 48);

What I think you are being asked to do is take the 64 bits of data you have and treat them as 4 16-bit integers. From there you take the 16-bit values and convert them to percentages. Those percentages, when multiplied by the image width/height, give you the 4 coordinates.
How you do this depends on the language you're programming in.
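For example, a minimal sketch in C of that last step (the normalized values and image size below are made-up numbers, just for illustration):

/* Map normalized (0..1) face-rectangle coordinates to pixel coordinates. */
double left = 0.25, top = 0.10, right = 0.60, bottom = 0.55;  /* example values */
int imageWidth = 1600, imageHeight = 1200;                    /* example size   */

int leftPx   = (int)(left   * imageWidth);   /* 400 */
int topPx    = (int)(top    * imageHeight);  /* 120 */
int rightPx  = (int)(right  * imageWidth);   /* 960 */
int bottomPx = (int)(bottom * imageHeight);  /* 660 */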

I needed to convert the crop=rect64() values from picasa.ini file.
I created the following ruby method with the above information.
def coordinates(hex_num)
  [
    hex_num.divmod(65536)[1],
    hex_num.divmod(65536)[0].divmod(65536)[1],
    hex_num.divmod(65536)[0].divmod(65536)[0].divmod(65536)[1],
    hex_num.divmod(65536)[0].divmod(65536)[0].divmod(65536)[0].divmod(65536)[1]
  ].reverse
end
It works, but I needed to add the .reverse method on the array to achieve the desired result.

Related

How do I convert a signed 8-byte integer to normalised float? [duplicate]

I am trying to optimize a working compute shader. Its purpose is to create an image: find the right color (using a small palette) and call imageStore(image, ivec2, vec4).
The colors are indexed, in an array of uint, in a uniform buffer.
One color in this UBO is packed inside one uint, as {0-255, 0-255, 0-255, 0-255}.
Here is the code:
struct Entry
{
    // ...some other data...
    uint rgb;
};

layout(binding = 0) uniform SConfiguration
{
    Entry materials[MATERIAL_COUNT];
} configuration;

void main()
{
    Entry material = configuration.materials[currentMaterialId];
    float r = (material.rgb >> 16) / 255.;
    float g = ((material.rgb & G_MASK) >> 8) / 255.;
    float b = (material.rgb & B_MASK) / 255.;
    imageStore(outImage, ivec2(gl_GlobalInvocationID.xy), vec4(r, g, b, 0.0));
}
I would like to clean/optimize a bit, because this color conversion looks bad/useless in the shader (and should be precomputed). My question is:
Is it possible to directly pack a vec4(r, g, b, 0.0) inside the UBO, using 4 bytes (like a R8G8B8A8) ?
Is it possible to do it directly? No.
But GLSL does have a number of functions for packing/unpacking normalized values. In your case, you can pass the value as a single uint uniform, then use unpackUnorm4x8 to convert it to a vec4. So your code becomes:
vec4 color = unpackUnorm4x8(material.rgb);
This is, of course, a memory-vs-performance tradeoff. So if memory isn't an issue, you should probably just pass a vec4 (never use vec3) directly.
Is it possible to directly pack a vec4(r, g, b, 0.0) inside the UBO, using 4 bytes (like a R8G8B8A8) ?
There is no way to express this directly as 4 single-byte values; there is no appropriate data type in the shader that would allow you to declare this as a byte type.
However, why do you think you need to? Just upload it as 4 floats - it's a uniform so it's not like you are replicating it thousands of times, so the additional size is unlikely to be a problem in practice.

Failed conversion of a QImage image to a CV image

I am new to both Qt and OpenCV. What I am doing is converting a QImage image to an OpenCV Mat image, and then displaying both of them. Here is my code for this conversion:
i = new QImage("lena.png");
QImage lena = i->scaled(labW,labH,Qt::IgnoreAspectRatio);
//Original
QImage lenaRGB = lena.convertToFormat(QImage::Format_RGB888);
ui->imgWindow->setPixmap(QPixmap::fromImage(lena,Qt::AutoColor));
//method 1
Mat lena_cv, out;
QImage lena2 = lenaRGB.rgbSwapped();
QImage swapped = lena2;
swapped = swapped.rgbSwapped();
lena_cv = Mat(swapped.width(),swapped.height(),CV_8UC3, swapped.bits(),swapped.bytesPerLine()).clone();
namedWindow("CV Image");
imshow("CV Image", lena_cv);
//method 2
Mat out2,out3;
out2.create(Size(lena2.width(),lena2.height()),CV_8UC3);
int width = lena2.width();
int height = lena2.height();
memcpy(out2.data, lena2.bits(), sizeof(char)*width*height*3);
cvtColor(out2,out3,CV_RGB2GRAY);
namedWindow("CV Image2");
imshow("CV Image2",out3);
Neither of the above two conversions yields the desired image.
Note also that the conversion cannot proceed without using rgbSwapped, i.e. with:
lena_cv = Mat(lenaRGB.width(),lenaRGB.height(),CV_8UC3, lenaRGB.bits(),lenaRGB.bytesPerLine());
because the resulting image lena_cv cannot be displayed. If an additional step is added to convert lena_cv to BGR format using cvtColor before displaying the image, the following exception is thrown:
Exception at 0x7ffdff394008, code: 0xe06d7363: C++ exception, flags=0x1
(execution cannot be continued) (first chance) at c:\opencv-3.2.0
\sources\modules\core\src\opencl\runtime\opencl_core.cpp:278
This indicates that the post-conversion to BGR fails. I am not sure whether RGB to BGR conversion (of the QImage) is necessary for converting a QImage to a CV image.
Can anyone help identify the issue with the above code? Thanks :)
The "skew" in the third image is almost likely a result of assuming that each scan line occupies exactly width*3 bytes. There's typically a "stride" (or "steps") factor with each row in many image formats image such that the number of bytes per row is on some 4-byte or 16-byte boundary. Fortunately, QImage has a helper method called bytesPerLine that tells you how long each source row is.
So instead of this:
memcpy(out2.data, lena2.bits(), sizeof(char)*width*height*3);
Do this:
unsigned char* src = lena2.bits();
unsigned char* dst = out2.data;
int stride = lena2.bytesPerLine();
for (int row = 0; row < height; row++)
{
    memcpy(dst + width*3*row, src + row*stride, width*3); // copy a single row, accounting for stride bytes
}
All of this assumes it's the QImage that has the stride bytes and not the target Mat image you are transferring the bits to. If I have this backwards, then adjust the code to account for the step member of Mat. (I don't see you using this, so I'm willing to bet the above code is what you need.)
The "blue" image is most likely just the RGB color bytes needing to be swapped for every pixel. Not sure why you are calling rgbSwapped unless that was the effect you were going for. Oh wait, you're probably referring to that noise effect at the bottom of the image. I'm willing to bet you need to think about "stride" bytes here as well.

Simple low pass filter in fixed point

I have a simple circuit set up to read the light level via an LDR into an Arduino. I'm trying to implement a simple low-pass filter on the data read in. How best to tackle this, given that analogRead() returns an unsigned int?
I have tried to implement a simple fixed-point representation but am unsure if this is the correct approach.
Here's a code snippet:
#define WLPF 0.1
#define FIXED_SHIFT 4

ldr_val = ((int)analogRead(A0)) << FIXED_SHIFT;
while (true) {
    int newval = (int)analogRead(A0) << FIXED_SHIFT;
    ldr_val += WLPF*(newval - ldr_val);
    Serial.println(ldr_val >> FIXED_SHIFT, DEC);
}
Note the resolution of the ADC is 10 bits and I am working with an 8-bit Arduino Micro.
I'm paraphrasing from the book "Musical Applications of Microprocessors" by Hal Chamberlin, page 438:
If you allow large numbers in the accumulator, then you can make a first-order low-pass filter with one multiplication and some right-shifts.
out = accum >> k
accum = accum - out + in
Choose 'k' to change the cutoff frequency. The more shifts, the lower the low-pass cutoff, but the larger the value in the accumulator. With a 10-bit value from analogRead(), you can easily right-shift 4 places and still have 2 bits of headroom in the accumulator (as @datafiddler noted above).
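A minimal sketch of that accumulator filter in C (the shift count, function and variable names here are assumptions, not from the book):

#define K 4                    /* number of right-shifts; more shifts = lower cutoff */

static long accum = 0;         /* accumulator holds the output scaled by 2^K */

/* One step of the first-order low-pass from the pseudo-code above. */
int lowpass_step(int in)
{
    int out = accum >> K;      /* out = accum / 2^K */
    accum = accum - out + in;  /* fold the new sample into the accumulator */
    return out;
}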
Cypress has some app-notes for their PSOC chips with similar equations, and using shifts. I remember one had a nice table that related number of shifts to the cutoff frequency.
The approximate cutoff frequency is the sampling frequency divided by 2-pi times the gain factor:
f0 ~ fs / (2 pi a)
where 'a' is that power of two (a = 2^k).
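For instance, if the loop samples at roughly 1 kHz and you use k = 4 shifts (a = 16), the cutoff works out to about 1000 / (2 * 3.1416 * 16) ≈ 10 Hz.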
Keep smoothin' those signals!
On a device with no FPU, rather than multiplying by 0.1 (which in any case makes this a floating-point, not fixed-point, implementation) you should divide by 10:
#define WLPF_DIV 10
...
ldr_val += (newval - ldr_val) / WLPF_DIV;
However, division on an 8-bit processor is often expensive (although probably dwarfed by the execution time of Serial.println() in the loop, but that is a different issue). Instead it is more efficient to select a power of two so that the division can be performed with a right-shift.
#define WLPF_SHIFT 3 // divide by 8
...
ldr_val += (newval - ldr_val) >> WLPF_SHIFT ;
The use of a signed int is problematic since right-shift of a negative signed value is implementation-defined behaviour. In this case this can be resolved by changing the code to:
#define WLPF_DIV 8
...
ldr_val += (newval - ldr_val) / WLPF_DIV ;
The compiler will most likely spot the power-of-two constant and generate the code using an arithmetic-shift-right in any case. However you would probably do better to reconsider the data type.
You still have a right-shift in the Serial.println() call, but that too could be replaced with a divide-by-16:
#define WLPF_DIV 8
#define FIXED_MUL 16

ldr_val = (int)analogRead(A0) * FIXED_MUL ;
for(;;)
{
    int newval = (int)analogRead(A0) * FIXED_MUL ;
    ldr_val += (newval - ldr_val) / WLPF_DIV ;
    Serial.println(ldr_val / FIXED_MUL, DEC);
}
Outputting the data over the serial port on every sample is non-deterministic and will dominate the loop timing in any case, so you have little control over the sample rate; the frequency response will therefore be neither well-defined nor stable. It also makes the previous performance optimisations rather pointless. You may want to think about that if it is important in your application - but that is a different question.
Stick with integer arithmetic:
#define WLPF 9
filtered = ((long)filtered * WLPF + newValue) / (WLPF + 1);

Hacks for clamping integer to 0-255 and doubles to 0.0-1.0?

Are there any branch-less or similar hacks for clamping an integer to the interval of 0 to 255, or a double to the interval of 0.0 to 1.0? (Both ranges are meant to be closed, i.e. endpoints are inclusive.)
I'm using the obvious minimum-maximum check:
int value = (value < 0? 0 : value > 255? 255 : value);
but is there a way to get this faster -- similar to the "modulo" clamp value & 255? And is there a way to do similar things with floating points?
I'm looking for a portable solution, so preferably no CPU/GPU-specific stuff please.
This is a trick I use for clamping an int to a 0 to 255 range:
/**
 * Clamps the input to a 0 to 255 range.
 * @param v any int value
 * @return {@code v < 0 ? 0 : v > 255 ? 255 : v}
 */
public static int clampTo8Bit(int v) {
    // if out of range
    if ((v & ~0xFF) != 0) {
        // invert sign bit, shift to fill, then mask (generates 0 or 255)
        v = ((~v) >> 31) & 0xFF;
    }
    return v;
}
That still has one branch, but a handy thing about it is that you can test whether any of several ints are out of range in one go by ORing them together, which makes things faster in the common case that all of them are in range. For example:
/** Packs four 8-bit values into a 32-bit value, with clamping. */
public static int ARGBclamped(int a, int r, int g, int b) {
    if (((a | r | g | b) & ~0xFF) != 0) {
        a = clampTo8Bit(a);
        r = clampTo8Bit(r);
        g = clampTo8Bit(g);
        b = clampTo8Bit(b);
    }
    return (a << 24) + (r << 16) + (g << 8) + (b << 0);
}
Note that your compiler may already give you what you want if you code value = min (value, 255). This may be translated into a MIN instruction if it exists, or into a comparison followed by conditional move, such as the CMOVcc instruction on x86.
The following code assumes two's complement representation of integers, which is usually a given today. The conversion from Boolean to integer should not involve branching under the hood, as modern architectures either provide instructions that can directly be used to form the mask (e.g. SETcc on x86 and ISETcc on NVIDIA GPUs), or can apply predication or conditional moves. If all of those are lacking, the compiler may emit a branchless instruction sequence based on arithmetic right shift to construct a mask, along the lines of Boann's answer. However, there is some residual risk that the compiler could do the wrong thing, so when in doubt, it would be best to disassemble the generated binary to check.
int value, mask;
mask = 0 - (value > 255); // mask = all 1s if value > 255, all 0s otherwise
value = (255 & mask) | (value & ~mask);
On many architectures, use of the ternary operator ?: can also result in a branchless instruction sequences. The hardware may support select-type instructions which are essentially the hardware equivalent of the ternary operator, such as ICMP on NVIDIA GPUs. Or it provides CMOV (conditional move) as in x86, or predication as on ARM, both of which can be used to implement branch-less code for ternary operators. As in the previous case, one would want to examine the disassembled binary code to be absolutely sure the resulting code is without branches.
int value;
value = (value > 255) ? 255 : value;
In case of floating-point operands, modern floating-point units typically provide FMIN and FMAX instructions which map straight to the C/C++ standard math functions fmin() and fmax(). Alternatively fmin() and fmax() may be translated into a comparison followed by a conditional move. Again, it would be prudent to examine the generated code to make sure it is branchless.
double value;
value = fmax (fmin (value, 1.0), 0.0);
I use this thing, 100% branchless.
int clampU8(int val)
{
    val &= (val<0)-1;   // clamp < 0
    val |= -(val>255);  // clamp > 255
    return val & 0xFF;  // mask out
}
For those using C#, Kotlin or Java, this is the best I could do; it's nice and succinct, if somewhat cryptic:
(x & ~(x >> 31) | 255 - x >> 31) & 255
It only works on signed integers so that might be a blocker for some.
For clamping doubles, I'm afraid there's no language/platform-agnostic solution.
The problem with floating point is that compilers offer options ranging from fastest operations (MSVC /fp:fast, gcc -funsafe-math-optimizations) to fully precise and safe (MSVC /fp:strict, gcc -frounding-math -fsignaling-nans). In fully precise mode the compiler does not try to use any bit hacks, even if it could.
A solution that manipulates the bits of a double cannot be portable: endianness may differ, there may be no (efficient) way to get at the bits, and double is not necessarily IEEE 754 binary64 after all. Direct manipulation will also not raise signals for signaling NaNs when they are expected.
For integers most likely the compiler will do it right anyway, otherwise there are already good answers given.

Microcontroller, How to display decimal on LCD?

I have a microcontroller and I am sampling the values of an LM335 temperature sensor.
The LCD library that I have allows me to display the hexadecimal value sampled by the 10-bit ADC.
The 10-bit ADC gives me values from 0x0000 to 0x03FF.
What I am having trouble with is converting the hexadecimal value to a format that can be understood by regular humans.
Any leads would be greatly appreciated, since I am completely lost on the issue.
You could create a "string" into which you construct the decimal number like this (the constants depend on the actual range of the value, which I presume is 0-255, on whether you want it to be null-terminated, etc.):
char result[4];
char i = 3;
do {
    result[i] = '0' + value % 10;
    value /= 10;
    i--;
}
while (value > 0);
Basically, your problem is how to split a number into decimal digits so you can use your LCD library and send one digit to each cell.
If your LCD is based on 7-segment cells, then you need to output a value from 0 to 9 for each digit, not an ASCII code. The solution by @Roman Hocke is fine for this, provided that you don't add '0' to value % 10.
Another way to split a number into digits is to convert it into BCD. For that, there is an algorithm named "double dabble" which allows you to convert your number into BCD without using division or modulo operations, which can be nice if your microcontroller has no provision for the division operation, or if division is slower than you need.
The "double dabble" algorithm sounds perfect for microcontrollers without a division instruction. However, a quick look at the algorithm on Wikipedia shows an implementation that uses dynamic memory, which seems to be worse than a routine for division. Of course, there must be implementations out there that do not call malloc() and friends.
Just to point out that Roman Hocke's snippet has a little mistake. This version works OK for decimals in the range 0-255; it can easily be expanded to any range:
void dec2str(uint8_t val, char * res)
{
    uint8_t i = 2;
    do {
        res[i] = '0' + val % 10;
        val /= 10;
        i--;
    } while (val > 0);
    res[3] = 0;
}
