Why aren't color vectors composed of float4 instead of int4 or byte4? - fusetools

RGB values are 0-255 integers, so why was float4 chosen as the vector data type?
Seems to me that byte would be the ideal data type for a color in Fuse.

RGB values are only 1-byte (0-255) values in some contexts. There are numerous color spaces in common use which use more or less than 1 byte per channel to represent colors (such as compact 8-bit and 16-bit color spaces, or HDR color spaces using 16 or even 32 bits per channel). These aren't just theoretical uses: they come up when dealing with images and GL textures.
What's important is that each of these represents a range of values, from 0 (no intensity) to 1 (full intensity), for each channel. This is why float is used: it's the correct semantic type to represent a normalized range. It also happens to be what OpenGL, the default graphics backend for Fuse, uses to represent colors.
float has the advantage of being a continuous value, unlike a byte which has discrete increments. This is important for interpolation. Consider animation between two colors, having a linear gradient, changing the opacity, or decreasing saturation; all of these need to be done on a continuous value range, such as float.
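As a rough illustration of the interpolation point, here is a minimal Python sketch (not Fuse code). The helper names lerp_color and to_bytes are made up for this example; the idea is to interpolate on normalized floats and quantize to 0-255 only at the very end.

```python
import math

def lerp_color(a, b, t):
    """Linearly interpolate two float4 colors; t=0 gives a, t=1 gives b."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def to_bytes(color):
    """Quantize a normalized color to 0-255 integers (clamp, then round half up)."""
    return tuple(
        int(math.floor(min(max(c, 0.0), 1.0) * 255.0 + 0.5)) for c in color
    )

black = (0.0, 0.0, 0.0, 1.0)
white = (1.0, 1.0, 1.0, 1.0)

# The midpoint exists exactly as a float; as bytes it has to land on 127 or 128.
mid = lerp_color(black, white, 0.5)
print(mid, to_bytes(mid))
```

Every intermediate value of t yields a distinct float color, whereas a byte representation would force each animation step onto one of 256 levels.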
float also allows values to go above 1 and below 0. While these cannot be reflected in the final display, they play a role during calculations. If you're doing many color operations in sequence, you don't want to prematurely clamp your values.
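The effect of premature clamping can be sketched in a few lines of Python. This is an illustrative example only; scale and clamp01 are hypothetical helpers, and the "brighten, then darken" sequence is an assumed workload rather than anything specific to Fuse.

```python
def scale(color, factor):
    """Multiply every channel by a factor (e.g. brighten or darken)."""
    return tuple(c * factor for c in color)

def clamp01(color):
    """Clamp every channel to the displayable 0..1 range."""
    return tuple(min(max(c, 0.0), 1.0) for c in color)

original = (0.8, 0.4, 0.2)

# Clamp once, at the end: the red channel passes through 1.6 and comes back.
late = clamp01(scale(scale(original, 2.0), 0.5))

# Clamp after every step: red is cut to 1.0 and the information is lost.
early = clamp01(scale(clamp01(scale(original, 2.0)), 0.5))

print(late, early)
```

With the clamp deferred, the sequence returns the original color; with clamping after each step, the red channel ends up at 0.5 instead of 0.8.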
Don't worry about things like memory bandwidth or storage space. Actual stored color values are a minuscule fraction of what occupies memory.
Also, the common hex syntaxes for color notation are supported in Fuse. You can use a simple #FAA for a light red, or #AB74FD80 for a more precise half-transparent color.
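To show how hex notation and normalized floats relate, here is a small Python sketch. It is not Fuse's parser: parse_hex_color is a hypothetical function, and it assumes the usual CSS-style conventions (3- and 4-digit forms expand by doubling each digit, and a missing alpha means fully opaque).

```python
def parse_hex_color(text):
    """Convert '#RGB', '#RGBA', '#RRGGBB' or '#RRGGBBAA' to a float4 in 0..1."""
    digits = text.lstrip("#")
    if len(digits) in (3, 4):
        # Short form: each digit is doubled, so 'FAA' becomes 'FFAAAA'.
        digits = "".join(d * 2 for d in digits)
    if len(digits) == 6:
        digits += "FF"  # assumed default: fully opaque
    if len(digits) != 8:
        raise ValueError("unsupported hex color: %r" % text)
    return tuple(int(digits[i:i + 2], 16) / 255.0 for i in range(0, 8, 2))

print(parse_hex_color("#FAA"))
print(parse_hex_color("#AB74FD80"))
```

Under these assumptions the byte-oriented notation is just an input format; the value stored afterwards is the normalized float vector.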

First I'm going to assume that by float you mean a 4-byte value.
Four floats take up 16 bytes, 4 times as much memory as four bytes. This is important not just for the space but for the amount of time it takes to move the memory around, since memory bandwidth is limited.
You can't use bit mask operators and shift on floats (well, you can, but it's not common).
Most display tech is limited to 16M colors, which is 24-bit RGB. Even if you have a 12-bit or 16-bit/channel display tech, floats still take at least twice as much memory.
Not all platforms even have native support for floating point operations.
I could probably keep going but you get the idea.
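The size and bit-manipulation points can be checked with a short Python sketch. The names pack_rgba and unpack_rgba are made up for illustration, and the byte order (red in the top byte) is an arbitrary assumption.

```python
import struct

# Standard sizes: four 32-bit floats versus four 8-bit unsigned integers.
FLOAT4_BYTES = struct.calcsize("<4f")
BYTE4_BYTES = struct.calcsize("<4B")

def pack_rgba(r, g, b, a):
    """Pack four 0-255 channels into one 32-bit integer using shifts."""
    return (r << 24) | (g << 16) | (b << 8) | a

def unpack_rgba(packed):
    """Recover the four channels with shifts and a bit mask."""
    return (
        (packed >> 24) & 0xFF,
        (packed >> 16) & 0xFF,
        (packed >> 8) & 0xFF,
        packed & 0xFF,
    )

print(FLOAT4_BYTES, BYTE4_BYTES)
print(hex(pack_rgba(0xAB, 0x74, 0xFD, 0x80)))
```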

Related

Handle "Division by Zero" in Image Processing (or PRNU estimation)

I have the following equation, which I try to implement. The upcoming question is not necessarily about this equation, but more generally, on how to deal with divisions by zero in image processing:
Here, I is an image, W is the difference between the image and its denoised version (so W expresses the noise in the image), and K is an estimated fingerprint, obtained from d images of the same camera. All calculations are done pixel-wise, so the equation does not involve a matrix multiplication. For more on the idea of estimating digital fingerprints, consult the corresponding literature, such as the general Wikipedia article or scientific papers.
However, my problem arises when an image has a pixel with value zero, e.g. perfect black (let's say we only have one image, k=1, so the zero is not compensated by chance by a nonzero value at the same pixel in the next image). Then I have a division by zero, which is not defined.
How can I overcome this problem? One option I came up with was adding +1 to all pixels right before I even start the calculations. However this shifts the range of pixel values from [0|255] to [1|256], which then makes it impossible to work with data type uint8.
Other authors in papers I read on this topic often do not consider values close to the range borders. For example, they only calculate the equation for pixel values in [5|250]. Their reasoning is not the numerical problem; rather, they say that if an image region is totally saturated or totally black, the fingerprint cannot be estimated properly in that area.
But again, my main concern is not about how this algorithm performs best, but rather in general: How to deal with divisions by 0 in image processing?
One solution is to use subtraction instead of division; however, subtraction is not scale invariant, it is translation invariant.
[e.g. the ratio will always be a normalized value between 0 and 1 ; and if it exceeds 1 you can reverse it; you can have the same normalization in subtraction but you need to find the max values attained by the variables]
Eventually you will have to deal with division. Dividing a black image by itself is a legitimate case: you can translate the values to some other range and then transform back.
However, 5/8 is not the same as 55/58, so you can rely on this only in a relative sense. If you want to know the exact ratios, you had better stick with the original interval and handle the zeros as special cases, e.g. if denom==0, do something with it; if num==0 and denom==0 (0/0), that means we have an identity: it is exactly as if we had 1/1.
In PRNU and fingerprint estimation, if you check the MATLAB implementation on Jessica Fridrich's webpage, they basically create a mask to get rid of saturated and low-intensity pixels, as you mentioned. Then they convert the image matrix with single(I), which makes the image 32-bit floating point, add 1 to the image, and divide.
To your general question: in image processing, I like to create a mask and add one only to the zero-valued pixels.
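img = double(imread('my gray img'));  % convert to double so the arithmetic is not done in uint8
a_mat = rand(size(img));
mask = double(img == 0);              % 1 where the pixel is zero, 0 elsewhere
div = a_mat ./ (img + mask);          % element-wise division; zero pixels are divided by 1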
This will prevent division by zero error. (Not tested but it should work)
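A variation on the masking idea, written as a plain-Python sketch rather than MATLAB: instead of adding one to the zero pixels, it skips them and writes a caller-chosen fill value, so the excluded pixels stay recognizable. The function name safe_divide and the fill parameter are made up for illustration, and images are assumed to be nested lists of numbers.

```python
def safe_divide(numerator, image, fill=0.0):
    """Pixel-wise numerator / image; pixels where image == 0 receive `fill`."""
    result = []
    for num_row, img_row in zip(numerator, image):
        result.append([
            n / float(p) if p != 0 else fill
            for n, p in zip(num_row, img_row)
        ])
    return result

numerator = [[1.0, 2.0], [3.0, 4.0]]
image = [[0, 2], [4, 0]]
print(safe_divide(numerator, image))
```

Whether the masked pixels should be filled with 0, excluded from later averages, or treated in some other way depends on the algorithm; the sketch only shows the mechanics.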

Why do I not find a LAB color cube?

I use the R colorspace package to convert a three-dimensional point into a LAB color. The LAB color is defined with three coordinates, the first one ranges from 0 to 100 and the two other ones range from -100 to 100.
But searching with Google I do not find a cuboidal representation of the LAB color space. Why?
Short answer
The LAB color space contains colors that are impossible to reproduce in nature or on a screen (according to this page).
Elaboration on converting RGB to LAB
I guess the reason you ask is that you want to make some kind of printed material and want to be sure the colors turn out right. I am merely an enthusiastic amateur in this field, but I think this paragraph from the Wikipedia article on the Lab color space explains some of the complications.
There are no simple formulas for conversion between RGB or CMYK values
and L*a*b*, because the RGB and CMYK color models are device
dependent. The RGB or CMYK values first need to be transformed to a
specific absolute color space, such as sRGB or Adobe RGB. This
adjustment will be device dependent, but the resulting data from the
transform will be device independent, allowing data to be transformed
to the CIE 1931 color space and then transformed into L*a*b*.
That is, in order to create a LAB color cube, you must first find the transformation from your monitor-specific color space into an absolute color space. This is surprisingly difficult since the mapping is not linear or of any other simple form. The transformation is not likely to be perfect either, since the RGB and LAB spaces do not span the same subspace (speculating here). I once talked to a printmaker about this and he said that although the human eye only has 4 types of color receptors (RGB + light intensity), you need about 17 color components to generate the full spectrum of visible colors on paper. Both RGB and LAB compromise on that, optimized for different purposes.
Bottom line
You can calibrate your screen to set up the transformation needed to convert the RGB of the screen to the LAB colors of human eyes, and then go on to make a color cube. However, it will only apply to your very monitor and will not be perfect. You are best off test printing different color profiles and choosing the one you like best.
Because there is no such thing. The CIELAB colour space has a Cartesian representation (of infinite size), but the (finite) gamut that we can perceive is not cubic, it has a complicated shape. Varying the two coordinates a* and b* independently in a pre-defined range may seem convenient, but this is fundamentally not the way human perception works.
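One way to see that the displayable part of the coordinate cube is not a cube is to convert a few L*a*b* triples to linear sRGB and check whether every channel stays within 0..1. The Python sketch below does this under stated assumptions: it uses the sRGB gamut as a stand-in for "what a screen can show" (the perceivable gamut is larger), the commonly published D65 white point and XYZ-to-sRGB matrix, and helper names (lab_to_linear_srgb, in_srgb_gamut) invented for this example.

```python
# Assumed constants: D65 reference white (Y normalized to 1) and the
# commonly published XYZ -> linear sRGB matrix.
WHITE = (0.95047, 1.0, 1.08883)
XYZ_TO_RGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)
DELTA = 6.0 / 29.0

def f_inv(t):
    """Inverse of the CIELAB companding function."""
    return t ** 3 if t > DELTA else 3.0 * DELTA * DELTA * (t - 4.0 / 29.0)

def lab_to_linear_srgb(L, a, b):
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0
    xyz = tuple(w * f_inv(t) for w, t in zip(WHITE, (fx, fy, fz)))
    return tuple(sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_RGB)

def in_srgb_gamut(L, a, b, eps=1e-4):
    """True if all linear sRGB channels fall within 0..1 (small tolerance)."""
    return all(-eps <= c <= 1.0 + eps for c in lab_to_linear_srgb(L, a, b))

# A mid grey is displayable; a corner of the a*/b* square at the same L is not.
print(in_srgb_gamut(50, 0, 0), in_srgb_gamut(50, -100, -100))
```

Sweeping a and b over the full -100..100 square for a fixed L and plotting only the points that pass this test gives an irregular blob rather than a square, which is consistent with the answer above.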

Why does OpenGL provide support for mipmaps but not integral images?

I realize both mipmaps and integral images have the problem that the resulting pixel value is not the integral over an arbitrary polygon in original texture space. Integrating over an axis-aligned rectangle in texture coordinates using integral images requires 4 texture lookups. Using mipmaps, OpenGL interpolates across the 4 adjacent pixel values in the mipmap, so that is also 4 memory lookups. Using an integral image you need less memory (no extra pre-resized images, only an integral image instead of the original) and no level determination. Of course this can be implemented through shaders, but why was the (now being deprecated) fixed function pipeline ever designed with mipmap support and no integral image support?
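For readers unfamiliar with integral images, the four-lookup rectangle sum mentioned above can be sketched in plain Python. This is a CPU-side illustration, not shader code; the function names and the extra zero row and column are choices made for this example.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def rect_sum(sat, x0, y0, x1, y1):
    """Sum of img[y][x] for x0 <= x < x1, y0 <= y < y1, using four lookups."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
sat = integral_image(img)
print(rect_sum(sat, 1, 1, 3, 3))  # bottom-right 2x2 block: 5 + 6 + 8 + 9
```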
Using an integral image you need less memory
I very much doubt that this statement is true.
From what I understand the values of an integral image can get quite large, therefore requiring a floating point representation which will use a lot more space than a typical 24-bit mipmapped texture (a full mipmap chain adds only about one third to the size of an image) and/or be less precise and create noise during interpolation. Also, floating point images were not really used that often with the fixed function pipeline, and GPUs may have been a lot slower with floating point images.
If you used integers for the picture, then the bit depth required for the integral image would grow with the resolution: the largest entry for a white image is 255 × width × height, so bit depth = log2(width × height) + 8. A 256x256 image would need 24 bits per color channel, and a 4096x4096 image would need 32 bits per color channel, compared with 8 bits per channel for the source.
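The worst-case integer bit depth can be checked directly: for an all-white 8-bit image, the bottom-right entry of the integral image equals 255 × width × height. A small Python sketch, with integral_bits as a made-up helper name:

```python
def integral_bits(width, height, source_bits=8):
    """Bits per channel needed to hold the largest integral-image entry."""
    max_sum = (2 ** source_bits - 1) * width * height
    return max_sum.bit_length()

for size in (256, 1024, 4096):
    print(size, integral_bits(size, size))
```

So an integer integral image needs roughly three to four times the bits per channel of its 8-bit source at typical texture sizes, before any filtering precision is considered.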
but why was the (now being deprecated) fixed function pipeline ever designed with mipmap support and no integral image support?
Because the access and interpolation of mipmaps could be built as rather simple hardwired circuits. Ever wondered why texture dimensions had to be powers of two? It was so that mipmapping calculations could be implemented as a series of bit shifts and additions. Also, accessing the neighbouring elements in a Gaussian pyramid requires fewer memory accesses than evaluating the integral. And there's your main problem: fillrate, i.e. video memory bandwidth, has always been a bottleneck of GPUs.

How do browsers handle rgb(percentage); for strange numbers

This is related to CSS color codes:
For hexcode we can represent 16,777,216 colors from #000000 to #FFFFFF
According to W3C Specs, Valid RGB percentages fit in a range from (0.0% to 100.0%) essentially giving you 1,003,003,001 color combinations. (1001^3)
According to the specs:
Values outside the device gamut should be clipped or mapped into the gamut when the gamut is
known: the red, green, and blue values must be changed to fall within the range supported by
the device. Users agents may perform higher quality mapping of colors from one gamut to
another. For a typical CRT monitor, whose device gamut is the same as sRGB, the four rules
below are equivalent:
I'm doubtful whether browsers can actually render all these values. (But if they do, please tell me and ignore the rest of this post.)
I'm assuming there's some mapping from rgb(percentage) to hex. (But again, I'm not really sure how this works.)
Ideally I'd like to find out the function rgb(percentage)->HEX
If I had to guess it would probably be one of these 3.
1) Round to the nearest HEX
2) CEIL to the nearest HEX
3) FLOOR to the nearest HEX
Problem is I need to be accurate on the mapping and I have no idea where to search.
There's no way my eyes can differentiate color at that level, but maybe there's some clever way to test each of these 3.
It might also be browser dependent. Can this be tested?
EDIT:
Firefox seems to round from empirical testing.
EDIT:
I'm looking through Firefox's source code right now,
nsColor.h
// A color is a 32 bit unsigned integer with four components: R, G, B
// and A.
typedef PRUint32 nscolor;
It seems Firefox only has room for 256 values (0-255) for each of R, G and B, hinting that rounding might be the answer, but maybe something's being done with the alpha channel.
I think I found a solution for Firefox anyways, thought you might like a follow up:
Looking through the source code I found a file:
nsCSSParser.cpp
For each rgb percentage it does the following:
It takes the percentage component and multiplies it by 255.0f
Stores it in a float
Passes it into a function NSToIntRound
The result of NSToIntRound is stored into an 8 bit integer datatype, before it is combined with the other 2 components and an alpha channel
Looking for more detail on NSToIntRound:
nsCoord.h
inline PRInt32 NSToIntRound(float aValue)
{
  return NS_lroundf(aValue);
}
NSToIntRound is a wrapper function for NS_lroundf
nsMathUtils.h
inline NS_HIDDEN_(PRInt32) NS_lroundf(float x)
{
  return x >= 0.0f ? PRInt32(x + 0.5f) : PRInt32(x - 0.5f);
}
This function is actually very clever, took me a while to decipher (I don't really have a good C++ background).
Assuming x is positive
It adds 0.5f to x and then casts to an integer
If the fractional part of x was less than 0.5, adding 0.5 won't change the integer and the fractional part is truncated,
Otherwise the integer value is bumped by 1 and the fractional part is truncated.
So each component's percentage is first multiplied by 255.0f
Then Rounded and cast into a 32bit Integer
And then Cast again into an 8 bit Integer
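The steps described above can be mirrored in a few lines of Python. This is a sketch of the described behaviour, not Firefox code: ns_lround and percent_to_byte are names chosen here, Python floats are 64-bit rather than the 32-bit float used in the C++ source, and the final clamp to 0..255 is an assumption (the snippets quoted above do not show how out-of-range input is handled).

```python
def ns_lround(x):
    """Round half away from zero, like NS_lroundf: add or subtract 0.5, truncate."""
    # int() truncates toward zero, like the PRInt32(...) cast in the C++ code.
    return int(x + 0.5) if x >= 0.0 else int(x - 0.5)

def percent_to_byte(percent):
    """Map an rgb() percentage (0..100) to an 8-bit component."""
    value = percent / 100.0 * 255.0
    rounded = ns_lround(value)
    # Assumption of this sketch: clamp so the result fits an 8-bit channel.
    return min(max(rounded, 0), 255)

for p in (0, 49.9, 50, 100):
    print(p, percent_to_byte(p))
```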
I agree with most of you that say this appears to be a browser dependent issue, so I will do some further research on other browsers.
Thanks a bunch!
According to W3C Specs, Valid RGB percentages fit in a range from (0.0% to 100.0%) essentially giving you 1,003,003,001 color combinations. (1001^3)
No, more than that, because the precision is not limited to one decimal place. For example, this is valid syntax:
rgb(23.456% 78.90123456% 0%)
The reason for this is that, while 8 bits per component is common (hence hex codes) newer hardware supports 10 or 12 bits per component; and wider gamut colorspaces need more bits to avoid banding.
This bit-depth agnosticism is also why newer CSS color specifications use a 0 to 1 float range.
Having said which, the CSS Object Model still requires color values to be serialized at 8 bits per component. This is going to change, but the higher-precision replacement is still being discussed in the CSS working group. So for now, browsers don't let you get more than 8 bits per component of precision.
If you are converting a float or percentage form to hex (or to a 0 - 255 integer), the correct method is rounding. Floor or ceiling will not space the values evenly at the top or bottom of the range.
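The difference between the three candidate mappings can be made visible by counting how many evenly spaced inputs land on each 8-bit value. The following Python sketch is only an illustration; the sampling density and the helper name bucket_sizes are arbitrary choices.

```python
import math
from collections import Counter

SAMPLES_PER_STEP = 100  # hypothetical sampling density, chosen for illustration

def bucket_sizes(quantize):
    """Count how many evenly spaced inputs in [0, 1] land on each 8-bit value."""
    total = 255 * SAMPLES_PER_STEP
    # i / SAMPLES_PER_STEP is the input already scaled to the 0..255 range.
    return Counter(quantize(i / float(SAMPLES_PER_STEP)) for i in range(total + 1))

round_counts = bucket_sizes(lambda scaled: int(math.floor(scaled + 0.5)))
floor_counts = bucket_sizes(lambda scaled: int(math.floor(scaled)))
ceil_counts = bucket_sizes(lambda scaled: int(math.ceil(scaled)))

for name, counts in (("round", round_counts),
                     ("floor", floor_counts),
                     ("ceil", ceil_counts)):
    print(name, counts[0], counts[128], counts[255])
```

With rounding, 0 and 255 each receive about half a bucket of inputs, symmetrically. With floor, 255 is produced only by an input of exactly 1.0, and with ceiling, 0 is produced only by an input of exactly 0.0.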

In CSS, can HSL values be floats?

The CSS3 spec only specifies that:
The format of an HSLA color value in the functional notation is ‘hsla(’ followed by the hue in degrees, saturation and lightness as a percentage, and an <alphavalue>, followed by ‘)’.
So am I to understand that these values would be interpreted not as integers but as floats? Example:
hsla(200.2, 90.5%, 10.2%, .2)
That would dramatically expand the otherwise small (relative to RGB) range of colors covered by HSL.
It seems to render fine in Chrome, though I don't know if they simply parse it as an INT value or what.
HSL values are converted to hexadecimal RGB values before they are handed off to the system. It's up to the device to clip any resulting RGB value that is outside the "device gamut" (the range of colors that can be displayed) to a displayable value. RGB values are denoted in hexadecimal. This is the specified algorithm for browsers to convert HSL values to RGB values. Rounding behavior is not specified by the standard, and there are multiple ways of doing rounding, since older versions of C and C++ (before C99 and C++11) had no built-in rounding function.
HOW TO RETURN hsl.to.rgb(h, s, l):
   SELECT:
      l<=0.5: PUT l*(s+1) IN m2
      ELSE: PUT l+s-l*s IN m2
   PUT l*2-m2 IN m1
   PUT hue.to.rgb(m1, m2, h+1/3) IN r
   PUT hue.to.rgb(m1, m2, h    ) IN g
   PUT hue.to.rgb(m1, m2, h-1/3) IN b
   RETURN (r, g, b)
From the proposed recommendation
In other words, you should be able to represent the exact same range of colors in HSLA as you can represent in RGB using fractional values for HSLA.
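The quoted pseudocode translates fairly directly into Python. In the sketch below, hsl_to_rgb follows the pseudocode above; hue_to_rgb is not shown in this thread and is written from memory of the same CSS3 Color specification, so it should be checked against the spec before being relied on. The wrapper css_hsl_to_bytes, including its round-half-up quantization, is an assumption made for illustration, since the standard leaves rounding open.

```python
def hue_to_rgb(m1, m2, h):
    # Recalled from the CSS3 Color spec's hue.to.rgb; verify before relying on it.
    if h < 0:
        h += 1
    if h > 1:
        h -= 1
    if h * 6 < 1:
        return m1 + (m2 - m1) * h * 6
    if h * 2 < 1:
        return m2
    if h * 3 < 2:
        return m1 + (m2 - m1) * (2.0 / 3.0 - h) * 6
    return m1

def hsl_to_rgb(h, s, l):
    """h, s, l in 0..1; returns r, g, b in 0..1 (follows the pseudocode above)."""
    m2 = l * (s + 1) if l <= 0.5 else l + s - l * s
    m1 = l * 2 - m2
    return (
        hue_to_rgb(m1, m2, h + 1.0 / 3.0),
        hue_to_rgb(m1, m2, h),
        hue_to_rgb(m1, m2, h - 1.0 / 3.0),
    )

def css_hsl_to_bytes(hue_degrees, sat_percent, light_percent):
    """Hypothetical wrapper: CSS-style inputs to 8-bit RGB, rounding half up."""
    h = (hue_degrees % 360.0) / 360.0
    rgb = hsl_to_rgb(h, sat_percent / 100.0, light_percent / 100.0)
    return tuple(int(c * 255.0 + 0.5) for c in rgb)

print(css_hsl_to_bytes(200.2, 90.5, 10.2))
print(css_hsl_to_bytes(200, 90, 10))
```

Fractional inputs are carried through the whole calculation as floats, so two HSL triples that would be identical after integer truncation can still map to different 8-bit RGB values.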
AFAIK, every browser casts them to INTs. Maybe. If I'm wrong you won't be able to tell the difference anyway. If it really matters, why not just go take screenshots and open them in Photoshop or use an on-screen color meter? Nobody here is going to have a definitive answer without testing it, and it takes 2 minutes to test... so...
I wouldn't know exactly, but doesn't it make sense to just put in some floating point numbers and see if it works? It takes two seconds to try with a decimal, and without.
