Volume rendering: confusion with front-to-back compositing - math

In, for example, GPU Gems the front-to-back compositing equation (for colour) is
C'i = (1 - A'i-1)Ci + C'i-1
where C'i is the output accumulated colour value; A'i-1 is the accumulated alpha (opacity) value up to the previous voxel; Ci is the colour value of the current voxel; and C'i-1 is the accumulated colour value up to the previous voxel.
This formulation raises two questions to me:
Termination of front-to-back occurs once the accumulated opacity reaches approximately 1. What, then, to do about the colour channels (RGB) that go past the maximum before the opacity limit is reached? Do you just clamp the values between 0..255 (e.g. 500,1000,2000 would become 255,255,255), or look to the ratio between the channels (e.g. 500,1000,2000 would become 64,128,255).
The answer to the previous question possibly feeds into this. The colour output of the current voxel depends on one minus the accumulated opacity. What if the accumulated opacity is zero and the current voxel's opacity is zero? - the output is a completely opaque voxel, since (1 - A'i-1) = 1, even though it is supposedly a transparent voxel!?
Any hints much appreciated.

A and C should be in the range 0-1. (If you're using unsigned bytes as the representation, divide by 255.0, but note that for some volume rendering application areas this will give you insufficient control over small alpha/low opacity regions to really be satisfactory. These days it's generally just easier to compute using floats from the outset). It turns out that the alpha and color values can never escape outside this range using the your formulas.
The sequence for the ray alpha A' is A'(i) = (1-A'(i-1)).A(i) + A'(i-1) (where A(i) is the voxel alpha), so if your accumulated ray starts with A' zero, and passes through a transparent (zero A) voxel, the ray now has A' = (1-0)*0+0, which is still zero as expected.

A and C should be between 0 and 1. Use pre-multiplied alpha; you will have no overflow issues.

Hm, let's belive that C and A are between 0 and 1. as one can see sequense C'(i) = [1 - A'_(i+1)]C(i) + C'(i-1) is grows with grow of i. I think C is not color (of RGB or whatever model). Maybe it's 'greyness' of some voxel. I.e. if voxel has many voxels in front of it it should be more grey then top voxels.
So my assumption is that C_i does not describe color dirrectly. It tells us how grey we should make color of some voxel.
forgive me for my poor English and fill free to re-ask if something is not clear.
BTW: if you belive me C_0 (greyness of top voxel) should be 1, and A_0 should be 0.

Related

DirectX negative W

I really was trying to find an answer on this very basic (at first sight) question.
For simplicity depth test is disabled during further discussion (it doesn’t have a big deal).
For example, we have triangle (after transformation) with next float4 coordinates.
top CenterPoint: (0.0f, +0.6f, 0.6f, 1f)
basic point1: (+0.4f, -0.4f, 0.4f, 1f),
basic point2: (-0.4f, -0.4f, 0.4f, 1f),
I’m sending float4 for input and use straight VertexShader (without transforms), so I’m sure about input. And we have result is reasonable:
But what we will get if we'll start to move CenterPoint to point of camera position. In our case we don’t have camera so will move this point to minus infinity.
I'm getting quite reasonable results as long as w (with z) is positive.
For example, (0.0f, +0.006f, 0.006f, .01f) – look the same.
But what if I'll use next coordinates (0.0f, -0.6f, -1f, -1f).
(Note: we have to switch points or change rasterizer for culling preventing).
According to huge amount of resource I'll have test like: -w < z < w, so GPU should cut of that point. And yes, in principle, I don’t see point. But triangle still visible! OK, according to huge amount of other resource (and my personal understanding) we'll have division like (x/w, y/w, z/w) so result should be (0, 0.6, 1). But I'm getting
And even if that result have some sense (one point is somewhere far away behind as), how really DirectX (I think it is rather GPU) works in such cases (in case of infinite points and negative W)?
It seems that I don't know something very basic, but it seems that nobody know that.
[Added]: I want to note that point w < 0 - is not a real input.
In real life such points are result of transformation by matrices and according to the math (math that are used in standard Direct sdk and other places) corresponds to the point that appears behind the camera position.
And yes, that point is clipped, but questions is rather about strange triangle that contains such point.
[Brief answer]: Clipping is essentially not just z/w checking and division (see details below).
Theoretically, NDC depth is divided into two distinct areas. The following diagram shows these areas for znear = 1, zfar = 3. The horizontal axis shows view-space z and the vertical axis shows the resulting NDC depth for a standard projective transform:
We can see that the part between view-space z of 1 and 3 (znear, zmax) gets mapped to NDC depth 0 to 1. This is the part that we are actually interested in.
However, the part where view-space z is negative also produces positive NDC depth. However, those are parts that result from fold-overs. I.e., if you take a corner of your triangle and slowly decrease z (along with w), starting in the area between znear and zfar, you would observe the following:
we start between znear and zfar, everything is good
as soon as we pass znear, the point gets clipped because NDC depth < 0.
when we are at view-space z = 0, the point also has w = 0 and no valid projection.
as we decrease view-space z further, the point gets a valid projection again (starting at infinity) and comes back in with positive NDC depth.
However, this last part is the area behind the camera. So, homogeneous clipping is made, such that this part is also clipped away by znear clipping.
Check the old D3D9 documentation for the formulas and some more illustrative explanations here.

Function to scale linear range that accentuates peaks

I am monitoring an audio source and visualizing the power of each channel. I get a number out of the api (averagePowerForChannel, but the language/platform shouldn't be important for this problem).
When I add both numbers together I have a scale from -240...0. This makes sense as this is the decibel range.
I transform this scale to a linear representation of the same numbers from 0...1. (I understand that decibels are logarithmic, I leave that alone and just map the scale linearly)
I then give the 0...1 value to an alpha channel that nicely represents the audio being played.
The problem is it's not showing enough change aesthetically. The value shifts slightly and usually hovers around 80:
alpha: 0.820713937282562
alpha: 0.816978693008423
alpha: 0.830410122871399
...
As you might imagine, this just creates a mild flicker.
Instead I'd like to accentuate the peaks of the audio. I have thrown some different methods at it:
// var alpha = 1 / (1 + exp(1-linear)) // never gets fully bright, sits at about .45
// var alpha = 1 - exp2(-linear) // stays around .45
// var alpha = linear / linear + 1
These do not get me a good result, but then again I don't have any idea what I was trying to do.
Goal:
low values on the range get pushed to zero or near zero (could even shift the range down 0.2 after the curve is calculated)
Mid values are pushed lower
High values have their differences accentuated (eg: .83 is shifted very close to 1, but .81 is shifted to .5)
I think I might want an exponential curve? I'm not sure. This is a very specific problem with known inputs so a magic number solution is acceptable.
I get a satisfactory visual by shifting the range to an interesting area, then using an exponential curve to emphasize changes from there on:
var alpha = volume / maxVolume
alpha = alpha - 0.5 // Shift the range over to the area with interesting differences in our source tracks
alpha = pow(alpha, 3) // Emphases the changes in this range
alpha *= 10 // Fix the decimal place
Will accept a better/more pure answer--for this I just wiggled numbers until they got me a good visual result. I'm sorry for grossing out the CS folks here :)
The best answer may be frequency isolation, but there is enough interesting difference to make a good visual without it.
Not sure why you don't invert the logarthmic scale:
decibels = 10 * log10( value );
the inverse is just algebra:
value = pow(10.0, decibels/10);
Observe that since your decibels are between -240 and 0, the value is between 0 (exclusive) and 1 (inclusive). That should ensure your values are more sparsely distributed. However if it still isn't, then your audio configuration might not be detecting significant changes in average amplitude - a not-so-unlikely possibility. In that case you might have to look at decomposing the audio into particular frequencies and looking at the amplitude of each frequency.

Color similarity/distance in RGBA color space

How to compute similarity between two colors in RGBA color space? (where the background color is unknown of course)
I need to remap an RGBA image to a palette of RGBA colors by finding the best palette entry for each pixel in the image*.
In the RGB color space the most similar color can be assumed to be the one with the smallest euclidean distance. However, this approach doesn't work in RGBA, e.g., Euclidean distance from rgba(0,0,0,0) to rgba(0,0,0,50%) is smaller than to rgba(100%,100%,100%,1%), but the latter looks much better.
I'm using premultiplied RGBA color space:
r = r×a
g = g×a
b = b×a
and I've tried this formula (edit: See the answer below for better formula):
Δr² + Δg² + Δb² + 3 × Δa²
but it doesn't look optimal — in images with semitransparent gradients it finds wrong colors that cause discontinuities/sharp edges. Linear proportions between opaque colors and alpha seem fishy.
What's the optimal formula?
*) for simplicity of this question I'm ignoring error diffusion, gamma and psychovisual color spaces.
Slightly related: if you want to find nearest color in this non-Euclidean RGBA space, vp-trees are the best.
Finally, I've found it! After thorough testing and experimentation my conclusions are:
The correct way is to calculate maximum possible difference between the two colors.
Formulas with any kind of estimated average/typical difference had room for discontinuities.
I was unable to find a working formula that calculates the distance without blending RGBA colors with some backgrounds.
There is no need to take every possible background color into account. It can be simplified down to blending maximum and minimum separately for each of R/G/B channels:
blend the channel in both colors with channel=0 as the background, measure squared difference
blend the channel in both colors with channel=max as the background, measure squared difference
take higher of the two.
Fortunately blending with "white" and "black" is trivial when you use premultiplied alpha.
The complete formula for premultiplied alpha color space is:
rgb *= a // colors must be premultiplied
max((r₁-r₂)², (r₁-r₂ - a₁+a₂)²) +
max((g₁-g₂)², (g₁-g₂ - a₁+a₂)²) +
max((b₁-b₂)², (b₁-b₂ - a₁+a₂)²)
C Source including SSE2 implementation.
Several principles:
When two colors have same alpha, rgbaDistance = rgbDistance * ( alpha / 255). Compatible with RGB color distance algorithm when both alpha are 255.
All Colors with very low alpha are similar.
The rgbaDistance between two colors with same RGB is linearly dependent on delta Alpha.
double DistanceSquared(Color a, Color b)
{
int deltaR = a.R - b.R;
int deltaG = a.G - b.G;
int deltaB = a.B - b.B;
int deltaAlpha = a.A - b.A;
double rgbDistanceSquared = (deltaR * deltaR + deltaG * deltaG + deltaB * deltaB) / 3.0;
return deltaAlpha * deltaAlpha / 2.0 + rgbDistanceSquared * a.A * b.A / (255 * 255);
}
My idea is integrating once over all possible background colors and averaging the square error.
i.e. for each component calculate(using red channel as example here)
Integral from 0 to 1 ((r1*a1+rB*(1-a1))-(r2*a2+rB*(1-a2)))^2*drB
which if I calculated correctly evaluates to:
dA=a1-a2
dRA=r1*a1-r2*a2
errorR=dRA^2+dA*dRA+dA^2/3
And then sum these over R, G and B.
First of all, a very interesting problem :)
I don't have a full solution (at least not yet), but there are 2 obvious extreme cases we should consider:
When Δa==0 the problem is similiar to RGB space
When Δa==1 the problem is only on the alpha 1-dim space
So the formula (which is very similar to the one you stated) that would satisfy that is:
(Δr² + Δg² + Δb²) × (1-(1-Δa)²) + Δa² or (Δr² + Δg² + Δb²) × (1-Δa²) + Δa²
In any case, it would probably be something like (Δr² + Δg² + Δb²) × f(Δa) + Δa²
If I were you, I would try to simulate it with various RGBA pairs and various background colors to find the best f(Δa) function. Not very mathematic, but will give you a close enough answer
I've never done it, but theory and practice say that converting the RGB values in the image and the palette to luminance–chrominance will help you find the best matches. I'd leave the alpha channel alone, as transparency should have little to nothing to do with the 'looking better' part.
This xmass I made some photomosaics for presents using open-source software that matches fragments of the original image to a collection of images. That seems like a harder problem than the one you're trying to solve. One of them programs was metapixel.
Lastly, the best option should be to use an existing library to convert the image to a format, like PNG, in which you can control the palette.

Calculus? Need help solving for a time-dependent variable given some other variables

Long story short, I'm making a platform game. I'm not old enough to have taken Calculus yet, so I know not of derivatives or integrals, but I know of them. The desired behavior is for my character to automagically jump when there is a block to either side of him that is above the one he's standing on; for instance, stairs. This way the player can just hold left / right to climb stairs, instead of having to spam the jump key too.
The issue is with the way I've implemented jumping; I've decided to go mario-style, and allow the player to hold 'jump' longer to jump higher. To do so, I have a 'jump' variable which is added to the player's Y velocity. The jump variable increases to a set value when the 'jump' key is pressed, and decreases very quickly once the 'jump' key is released, but decreases less quickly so long as you hold the 'jump' key down, thus providing continuous acceleration up as long as you hold 'jump.' This also makes for a nice, flowing jump, rather than a visually jarring, abrupt acceleration.
So, in order to account for variable stair height, I want to be able to calculate exactly what value the 'jump' variable should get in order to jump exactly to the height of the stair; preferably no more, no less, though slightly more is permissible. This way the character can jump up steep or shallow flights of stairs without it looking weird or being slow.
There are essentially 5 variables in play:
h -the height the character needs to jump to reach the stair top<br>
j -the jump acceleration variable<br>
v -the vertical velocity of the character<br>
p -the vertical position of the character<br>
d -initial vertical position of the player minus final position<br>
Each timestep:<br>
j -= 1.5; //the jump variable's deceleration<br>
v -= j; //the jump value's influence on vertical speed<br>
v *= 0.95; //friction on the vertical speed<br>
v += 1; //gravity<br>
p += v; //add the vertical speed to the vertical position<br>
v-initial is known to be zero<br>
v-final is known to be zero<br>
p-initial is known<br>
p-final is known<br>
d is known to be p-initial minus p-final<br>
j-final is known to be zero<br>
j-initial is unknown<br>
Given all of these facts, how can I make an equation that will solve for j?
tl;dr How do I Calculus?
Much thanks to anyone who's made it this far and decides to plow through this problem.
Edit: Here's a graph I made of an example in Excel.
I want an equation that will let me find a value for A given a desired value for B.
Since the jump variable decreases over time, the position value isn't just a simple parabola.
There are two difficulties in play here. The first is that you don't actually have j -= 1.5, you have j = max(0, j - 1.5). That throws somewhat of a wrench into calculations. Also, your friction term v *= 0.95 makes direct solution difficult.
I would suggest using a lookup table for this. You can precalculate the desired a for each possible b, by trial and error (e.g. binary search on the values of a that give you the required b). Store the results in a table and just do a simple table lookup during the game.
After extensive use of Excel 2010 and its Seek Goal function, I was able to make a table of values, and Excel gave me an approximate trendline and equation for it, which I tweaked until it worked out. The equation is j = 3.35 * h ^ 0.196, where j is the initial jump force and h is the height required to jump. Thanks for your help.
If I neglect the friction term, and assume that j reaches zero before v reaches zero, I get after a page of calculations that:
b = 1/(8*(deceleration^2)*gravity)*j0^4 - 1/(6*deceleration^2)*j0^3
the solution to this is quite long, but equal approximately (for 10 < b < 400) to:
j0 = (10*(deceleration^2)*gravity*b)^0.25

Normal Vector of Three Points

Hey math geeks, I've got a problem that's been stumping me for a while now. It's for a personal project.
I've got three dots: red, green, and blue. They're positioned on a cardboard slip such that the red dot is in the lower left (0,0), the blue dot is in the lower right (1,0), and the green dot is in the upper left. Imagine stepping back and taking a picture of the card from an angle. If you were to find the center of each dot in the picture (let's say the units are pixels), how would you find the normal vector of the card's face in the picture (relative to the camera)?
Now a few things I've picked up about this problem:
The dots (in "real life") are always at a right angle. In the picture, they're only at a right angle if the camera has been rotated around the red dot along an "axis" (axis being the line created by the red and blue or red and green dots).
There are dots on only one side of the card. Thus, you know you'll never be looking at the back of it.
The distance of the card to the camera is irrelevant. If I knew the depth of each point, this would be a whole lot easier (just a simple cross product, no?).
The rotation of the card is irrelevant to what I'm looking for. In the tinkering that I've been doing to try to figure this one out, the rotation can be found with the help of the normal vector in the end. Whether or not the rotation is a part of (or product of) finding the normal vector is unknown to me.
Hope there's someone out there that's either done this or is a math genius. I've got two of my friends here helping me on it and we've--so far--been unsuccessful.
i worked it out in my old version of MathCAD:
Edit: Wording wrong in screenshot of MathCAD: "Known: g and b are perpendicular to each other"
In MathCAD i forgot the final step of doing the cross-product, which i'll copy-paste here from my earlier answer:
Now we've solved for the X-Y-Z of the
translated g and b points, your
original question wanted the normal of
the plane.
If cross g x b, we'll get the
vector normal to both:
| u1 u2 u3 |
g x b = | g1 g2 g3 |
| b1 b2 b3 |
= (g2b3 - b2g3)u1 + (b1g3 - b3g1)u2 + (g1b2 - b1g2)u3
All the values are known, plug them in
(i won't write out the version with g3
and b3 substituted in, since it's just
too long and ugly to be helpful.
But in practical terms, i think you'll have to solve it numerically, adjusting gz and bz so as to best fit the conditions:
g · b = 0
and
|g| = |b|
Since the pixels are not algebraically perfect.
Example
Using a picture of the Apollo 13 astronauts rigging one of the command module's square Lithium Hydroxide cannister to work in the LEM, i located the corners:
Using them as my basis for an X-Y plane:
i recorded the pixel locations using Photoshop, with positive X to the right, and positive Y down (to keep the right-hand rule of Z going "into" the picture):
g = (79.5, -48.5, gz)
b = (-110.8, -62.8, bz)
Punching the two starting formulas into Excel, and using the analysis toolpack to "minimize" the error by adjusting gz and bz, it came up with two Z values:
g = (79.5, -48.5, 102.5)
b = (-110.8, -62.8, 56.2)
Which then lets me calcuate other interesting values.
The length of g and b in pixels:
|g| = 138.5
|b| = 139.2
The normal vector:
g x b = (3710, -15827, -10366)
The unit normal (length 1):
uN = (0.1925, -0.8209, -0.5377)
Scaling normal to same length (in pixels) as g and b (138.9):
Normal = (26.7, -114.0, -74.7)
Now that i have the normal that is the same length as g and b, i plotted them on the same picture:
i think you're going to have a new problem: distortion introduced by the camera lens. The three dots are not perfectly projected onto the 2-dimensional photographic plane. There's a spherical distortion that makes straight lines no longer straight, makes equal lengths no longer equal, and makes the normals slightly off of normal.
Microsoft research has an algorithm to figure out how to correct for the camera's distortion:
A Flexible New Technique for Camera Calibration
But it's beyond me:
We propose a flexible new technique to
easily calibrate a camera. It is well
suited for use without specialized
knowledge of 3D geometry or computer
vision. The technique only requires
the camera to observe a planar pattern
shown at a few (at least two)
different orientations. Either the
camera or the planar pattern can be
freely moved. The motion need not be
known. Radial lens distortion is
modeled. The proposed procedure
consists of a closed-form solution,
followed by a nonlinear refinement
based on the maximum likelihood
criterion. Both computer simulation
and real data have been used to test
the proposed technique, and very good
results have been obtained. Compared
with classical techniques which use
expensive equipments such as two or
three orthogonal planes, the proposed
technique is easy to use and flexible.
It advances 3D computer vision one
step from laboratory environments to
real world use.
They have a sample image, where you can see the distortion:
(source: microsoft.com)
Note
you don't know if you're seeing the "top" of the cardboard, or the "bottom", so the normal could be mirrored vertically (i.e. z = -z)
Update
Guy found an error in the derived algebraic formulas. Fixing it leads to formulas that i, don't think, have a simple closed form. This isn't too bad, since it can't be solved exactly anyway; but numerically.
Here's a screenshot from Excel where i start with the two knowns rules:
g · b = 0
and
|g| = |b|
Writing the 2nd one as a difference (an "error" amount), you can then add both up and use that value as a number to have excel's solver minimize:
This means you'll have to write your own numeric iterative solver. i'm staring over at my Numerical Methods for Engineers textbook from university; i know it contains algorithms to solve recursive equations with no simple closed form.
From the sounds of it, you have three points p1, p2, and p3 defining a plane, and you want to find the normal vector to the plane.
Representing the points as vectors from the origin, an equation for a normal vector would be
n = (p2 - p1)x(p3 - p1)
(where x is the cross-product of the two vectors)
If you want the vector to point outwards from the front of the card, then ala the right-hand rule, set
p1 = red (lower-left) dot
p2 = blue (lower-right) dot
p3 = green (upper-left) dot
# Ian Boyd...I liked your explanation, only I got stuck on step 2, when you said to solve for bz. You still had bz in your answer, and I don't think you should have bz in your answer...
bz should be +/- square root of gx2 + gy2 + gz2 - bx2 - by2
After I did this myself, I found it very difficult to substitute bz into the first equation when you solved for gz, because when substituting bz, you would now get:
gz = -(gxbx + gyby) / sqrt( gx2 + gy2 + gz2 - bx2 - by2 )
The part that makes this difficult is that there is gz in the square root, so you have to separate it and combine the gz together, and solve for gz Which I did, only I don't think the way I solved it was correct, because when I wrote my program to calculate gz for me, I used your gx, and gy values to see if my answer matched up with yours, and it did not.
So I was wondering if you could help me out, because I really need to get this to work for one of my projects. Thanks!
Just thinking on my feet here.
Your effective inputs are the apparent ratio RB/RG [+], the apparent angle BRG, and the angle that (say) RB makes with your screen coordinate y-axis (did I miss anything). You need out the components of the normalized normal (heh!) vector, which I believe is only two independent values (though you are left with a front-back ambiguity if the card is see through).[++]
So I'm guessing that this is possible...
From here on I work on the assumption that the apparent angle of RB is always 0, and we can rotate the final solution around the z-axis later.
Start with the card positioned parallel to the viewing plane and oriented in the "natural" way (i.e. you upper vs. lower and left vs. right assignments are respected). We can reach all the interesting positions of the card by rotating by \theta around the initial x-axis (for -\pi/2 < \theta < \pi/2), then rotating by \phi around initial y-axis (for -\pi/2 < \phi < \pi/2). Note that we have preserved the apparent direction of the RB vector.
Next step compute the apparent ratio and apparent angle after in terms of \theta and \phi and invert the result.[+++]
The normal will be R_y(\phi)R_x(\theta)(0, 0, 1) for R_i the primitive rotation matrix around axis i.
[+] The absolute lengths don't count, because that just tells you the distance to card.
[++] One more assumption: that the distance from the card to view plane is much large than the size of the card.
[+++] Here the projection you use from three-d space to the viewing plane matters. This is the hard part, but not something we can do for you unless you say what projection you are using. If you are using a real camera, then this is a perspective projection and is covered in essentially any book on 3D graphics.
right, the normal vector does not change by distance, but the projection of the cardboard on a picture does change by distance (Simple: If you have a small cardboard, nothing changes.
If you have a cardboard 1 mile wide and 1 mile high and you rotate it so that one side is nearer and the other side more far away, the near side is magnified and the far side shortened on the picture. You can see that immediately that an rectangle does not remain a rectangle, but a trapeze)
The mostly accurate way for small angles and the camera centered on the middle is to measure the ratio of the width/height between "normal" image and angle image on the middle lines (because they are not warped).
We define x as left to right, y as down to up, z as from far to near.
Then
x = arcsin(measuredWidth/normWidth) red-blue
y = arcsin(measuredHeight/normHeight) red-green
z = sqrt(1.0-x^2-y^2)
I will calculate tomorrow a more exact solution, but I'm too tired now...
You could use u,v,n co-oridnates. Set your viewpoint to the position of the "eye" or "camera", then translate your x,y,z co-ordinates to u,v,n. From there you can determine the normals, as well as perspective and visible surfaces if you want (u',v',n'). Also, bear in mind that 2D = 3D with z=0. Finally, make sure you use homogenious co-ordinates.

Resources