How are Vector Graphics Shown "Real-Time" as Raster Graphics? - vector-graphics

So my monitor is using raster graphics and is therefore full of pixels.
However, I have heard that Adobe Illustrator uses vector graphics.
So how can vector graphics be shown "real-time" on my monitor that is pixel-based?
From articles like this one, it sounds as though vector and raster graphics are completely different. So why can they show each other, as if they were the same?

The fact that Adobe Illustrator is a "vector program" only means that it is designed to help users work with vectors... just as Audacity, for example, helps users to work with sound, or Notepad lets a user work with characters.
There is no difference between Adobe Illustrator and any other program as far as what the Operating System (OS) and/or hardware expects from it in terms of the way it represents graphics.
Take these three examples:
We can use the idea of a "+" symbol to show the difference between a raster and a vector:
RASTER: A 3 x 3 pixel, black-and-white RASTER of the "+" symbol:
0 1 0
1 1 1
0 1 0
VECTOR: The same symbol as a Vector:
[draw a line from point ( 1/3 X, 1/2 Y ) to point ( 2/3 X, 1/2 Y )]
[draw a line from point ( 1/2 X, 1/3 Y ) to point ( 1/2 X, 2/3 Y )]
These are abstract representations -- they still need to be coded, stored, and displayed.
You can literally see how a raster is coded, stored, and displayed -- as a discrete matrix of values... a mosaic, if you will.
A vector, on the other hand, is coded and stored as a set of instructions. The x and y coordinates are stored as fractions of the total canvas space available (relative rather than absolute), because the dimensions of the canvas are not yet known... and this is why you can scale a vector up indefinitely without losing resolution.
Now... if a vector is going to be displayed on an actual "vector display" monitor (very rare), then you could theoretically just send the vector instructions straight to the monitor. BUT... as you ask: "What happens if you're displaying a vector on a conventional monitor (a mosaic of pixels)?"
And... the answer to that is, once again: The same thing that happens when any other abstract concept is being illustrated by any program.
But... the vectors do end up on the screen, so here is a minimal example of how that happens:
Using the "plus symbol" example from above, imagine a really terrible monitor that is only 3 x 3 pixels in resolution. The OS would say to the program (Illustrator, presumably): "I need your raster output to be 3 x 3 pixels wide."
So the program would do this:
1) Draw a line from point ( 1/3 X, 1/2 Y ) to point ( 2/3 X, 1/2 Y )... but convert those points to points within that 3 x 3 pixel matrix... and draw the line by filling the first pixel, the last pixel, and all pixels in-between.
2) Do the same for the second instruction.
3) Hand the resulting 3 x 3 pixel matrix to the OS.
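To make those three steps concrete, here is a minimal sketch in Python (purely illustrative; this is not how Illustrator is actually implemented, and the mapping from fractional coordinates to pixel indices is just one possible convention) that rasterizes the two "+" instructions onto whatever pixel grid the OS asks for, rendered here at 9 x 9 to show that the same instructions scale to any resolution:

def rasterize(instructions, width, height):
    canvas = [[0] * width for _ in range(height)]

    def to_pixel(tx, ty):
        # One convention: map the fraction 0..1 onto pixel indices 0..size-1.
        return round(tx * (width - 1)), round(ty * (height - 1))

    for (x0, y0), (x1, y1) in instructions:
        px0, py0 = to_pixel(x0, y0)
        px1, py1 = to_pixel(x1, y1)
        # Fill the first pixel, the last pixel, and all pixels in between.
        steps = max(abs(px1 - px0), abs(py1 - py0), 1)
        for i in range(steps + 1):
            x = round(px0 + (px1 - px0) * i / steps)
            y = round(py0 + (py1 - py0) * i / steps)
            canvas[y][x] = 1
    return canvas

# The "+" symbol from above as resolution-independent instructions.
plus = [((1/3, 1/2), (2/3, 1/2)),   # horizontal stroke
        ((1/2, 1/3), (1/2, 2/3))]   # vertical stroke

# The same instructions can be rasterized at any resolution the OS requests.
for row in rasterize(plus, 9, 9):
    print("".join(".#"[v] for v in row))

Because the instructions are stored as fractions, the only thing that changes at a higher resolution is the size of the grid you hand back to the OS.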
PS - You ask how they can "show each other." A conventional monitor can show a vector that has been converted to pixels, but I don't think it's ever been done the other way around.

Related

Normalizing a vector with great accuracy and big numbers using fixed-point arithmetic

Why do I need this?
I'm creating a game about space. For a space game to work, it needs big and accurate numbers. Floating-point numbers are not great for such an application, because the farther you get from the world origin, the worse the precision becomes, so the physics won't behave the same, and so on.
What's the problem
In space there generally tend to be planets. So to create one I have to generate the sphere's (planet's) mesh. The problem is that for a planet like Jupiter (with a radius of ~69911 km), when I normalize a point and multiply it by the radius of the planet to "put" it on the surface, I don't have enough precision (the mesh isn't exactly round; there's an error of about 10-15 m). Here's some footage of that error (the flickering of the planet is due to video compression, and the side length of the smallest square in the mesh grid is about ~10 m). The problem isn't in the numbers, it's in the method I use to normalize the point.
In the video I am using 64-bit fixed-point numbers with 20 bits of precision. And I'm normalizing the points just by multiplying them by the fast inverse square root (using just one iteration of Newton's method; with two it's better, but still not nearly enough), for which I convert the fixed-point numbers to double-precision floats.
My thoughts and attempts
I think the only solution to get such accurate vector normalization is to use an iterative method:
Calculate the point normally (normalize it and multiply by the planet's radius)
Nudge it closer and closer to the radius until the error is small enough (I would like for it to be 0.01m)
Step 2 is the hard one. I don't have the math skills or any experience in this field of computer science to figure it out. We can iteratively increase or decrease a vector by some amount, but what amount, and how do we know we won't overshoot? I thought about trying Newton's method again, but on the actual coordinates rather than the inverse square root of the length. That requires 128-bit division to preserve the needed precision, which can't be done efficiently (remember, I have to do this maybe a million times, once for each vertex of the planet mesh).
Any thoughts on this?
And the scale doesn't have to stop there: what if I wanted to make a star? The radius of the Sun is about 10 times bigger than Jupiter's, and there are bigger stars.
I'm probably not the first one to ponder this question, since many people have tried making the Earth at real scale (remember that Earth's radius is about 10 times smaller than Jupiter's). And I probably won't be the last one to try, so there will be an answer sooner or later.
Update 1, after some comments
Yes, I'm using a 3D vector of 64-bit fixed-point numbers with 20 bits of precision. And I am obviously using a coordinate system local to the planet for creating the points (I need to, for the normalization to work as needed).
Using floating-point numbers is my last hope. They're really not made for this kind of application because of the precision difference at different scales.
The scale I want this to work at for celestial bodies (stars, planets) is at minimum the size of the Sun. For the position of objects I am using two 64-bit ints (effectively one 128-bit int), so the game world can be bigger than our observable universe if I want (thanks to fixed-point numbers). But I don't need 128-bit ints for the positions of a planet's vertices; 64-bit is good enough (though I can use 128-bit ints in the intermediate calculations). And as I mentioned, it would be best for the mesh vertices to have an error of less than 0.01 meters, where 1 coordinate unit is 1 meter (imagine walking on the surface of the planet; anything bigger than 0.01 m could be noticeable).
The reason I'm pushing fixed-point numbers so much is that I can't use floating-point numbers for object positions, because of the physics. And let's say we have a big planet, bigger than Jupiter. Because I need to calculate the vertices relative to the planet's position and then subtract the camera's position from them in the shaders (which are mostly limited to 32-bit floats), the errors would just add up and become noticeable. There are workarounds, but the core problem will always come up when using floats.
TL;DR
I need a way to calculate a point on a sphere with a radius that can be as big as 1000000000 = 10^9, but with an error less than 0.01, given the angles at which the point needs to rest. And this method needs to be relatively efficient.
My understanding is that you are
Making a 3D vector, represented in units of meters as 64-bit fixed-point numbers with 20 bits of precision.
Normalizing the vector to unit length, scaling by a fast inverse square root approximation.
Scaling the vector by Jupiter's radius 69911 km.
The challenge is that any error from the inverse square root approximation, or round-off in the normalized vector, is magnified when scaling by the planet radius. To achieve 0.01 m accuracy, the unit vector's length needs to be accurate to within
0.01 m / 69911 km = 1.43×10⁻¹⁰.
To get there, the inverse square root needs to be accurate to 11 digits or better, and the unit vector needs at least 32 fractional bits (for a round-off error of 2⁻³³ ≈ 1.16×10⁻¹⁰).
I suggest computing the surface points in a local coordinate system with origin at the center of the planet represented with 64-bit floats, then translating the points to the universe coordinate system. 64-bit floats are accurate to ~16 digits, enough to get the desired accuracy. This seems like the easiest and most efficient solution assuming 64-bit floats are an option.
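As a rough sketch of that suggestion (plain Python standing in for the engine's actual types; the 20-bit fixed-point format and the helper name are assumptions for illustration): build the unit vector and the surface offset in 64-bit floats in the planet-local frame, then convert to fixed point and translate into universe coordinates.

import math

FRAC_BITS = 20
ONE = 1 << FRAC_BITS    # fixed-point scale factor (20 fractional bits)

def surface_point_fixed(direction, radius_m, planet_center_fixed):
    # direction: any (dx, dy, dz) floats; radius_m: planet radius in meters;
    # planet_center_fixed: planet center in fixed-point universe coordinates.
    dx, dy, dz = direction
    inv_len = 1.0 / math.sqrt(dx * dx + dy * dy + dz * dz)   # ~16 significant digits
    local = (dx * inv_len * radius_m,
             dy * inv_len * radius_m,
             dz * inv_len * radius_m)
    # Convert the local offset to fixed point and translate to universe coordinates.
    return tuple(round(c * ONE) + pc for c, pc in zip(local, planet_center_fixed))

# Example: one vertex direction on a Jupiter-sized planet at the fixed-point origin.
p = surface_point_fixed((1.0, 2.0, -0.5), 69_911_000.0, (0, 0, 0))
print([c / ONE for c in p])   # back to meters for inspection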
Alternatively for an iterative "nudging" approach, you can do this:
R = planet radius
x, y, z = surface point with correct angle but possibly wrong length
for 1, 2, ...
scale = 0.5 * (R² / (x² + y² + z²) + 1)
x *= scale
y *= scale
z *= scale
This is a fixed point iteration that converges rapidly.
An example run, starting with a very inaccurate surface point x, y, z = [6991.1, 55928.8, -6991.1]:
# x y z Radius
-------------------------------------------------
0 6991.10 55928.80 -6991.10 56795.96489065047
1 8791.84 70334.70 -8791.84 71425.22857460588
2 8607.42 68859.40 -8607.42 69927.05096841766
3 8605.45 68843.60 -8605.45 69911.00184215968
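Here is a small sketch of that iteration in plain Python (the engine would run it on the fixed-point values instead of floats); it reproduces the example run above.

def nudge_to_radius(x, y, z, R, iterations=3):
    for i in range(iterations):
        # scale = 0.5 * (R^2 / |p|^2 + 1): a Babylonian/Newton-style step that
        # pulls the length of (x, y, z) toward R without needing a square root.
        scale = 0.5 * (R * R / (x * x + y * y + z * z) + 1.0)
        x, y, z = x * scale, y * scale, z * scale
        print(i + 1, x, y, z, (x * x + y * y + z * z) ** 0.5)
    return x, y, z

# Starting from the deliberately inaccurate surface point in the table above.
nudge_to_radius(6991.1, 55928.8, -6991.1, 69911.0)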
A further thought:
remember i have to do this maybe a million times, for each vertex of the planet mesh
It looks from the video like you are already applying a level-of-detail scheme to reduce the number of vertices when far away. The hierarchical mesh can be leveraged to reduce vertices near the surface as well: skip finer-level vertices when their parents are off screen, based on frustum culling.

DirectX negative W

I have really been trying to find an answer to this very basic (at first sight) question.
For simplicity, the depth test is disabled in the following discussion (it doesn't change much).
For example, we have a triangle (after transformation) with the following float4 coordinates:
top CenterPoint: (0.0f, +0.6f, 0.6f, 1f)
basic point1: (+0.4f, -0.4f, 0.4f, 1f),
basic point2: (-0.4f, -0.4f, 0.4f, 1f),
I'm sending float4 as input and using a pass-through vertex shader (no transforms), so I'm sure about the input. And the result is reasonable:
But what will we get if we start moving the CenterPoint toward the camera position? In our case we don't have a camera, so we'll move this point toward minus infinity.
I'm getting quite reasonable results as long as w (with z) is positive.
For example, (0.0f, +0.006f, 0.006f, .01f) looks the same.
But what if I use the following coordinates: (0.0f, -0.6f, -1f, -1f)?
(Note: we have to swap the points, or change the rasterizer state, to prevent culling.)
According to a huge number of resources, there is a test like -w < z < w, so the GPU should clip that point. And yes, in principle, I don't see the point itself. But the triangle is still visible! OK, according to a huge number of other resources (and my personal understanding), there is a division (x/w, y/w, z/w), so the result should be (0, 0.6, 1). But I'm getting
And even if that result makes some sense (one point is somewhere far away behind us), how does DirectX (or rather the GPU) really work in such cases (infinite points and negative w)?
It seems that I don't know something very basic, but it also seems that nobody knows it.
[Added]: I want to note that a point with w < 0 is not a real input.
In real life such points are the result of transformation by matrices, and according to the math (the math used in the standard DirectX SDK and elsewhere) they correspond to points that end up behind the camera position.
And yes, that point is clipped, but the question is really about the strange triangle that contains such a point.
[Brief answer]: Clipping is essentially not just a z/w check and division (see details below).
Theoretically, NDC depth is divided into two distinct areas. The following diagram shows these areas for znear = 1, zfar = 3. The horizontal axis shows view-space z and the vertical axis shows the resulting NDC depth for a standard projective transform:
We can see that the part between view-space z of 1 and 3 (znear, zfar) gets mapped to NDC depth 0 to 1. This is the part that we are actually interested in.
The part where view-space z is negative also produces positive NDC depth. However, those parts result from fold-overs. I.e., if you take a corner of your triangle and slowly decrease z (along with w), starting in the area between znear and zfar, you would observe the following:
we start between znear and zfar, everything is good
as soon as we pass znear, the point gets clipped because NDC depth < 0.
when we are at view-space z = 0, the point also has w = 0 and no valid projection.
as we decrease view-space z further, the point gets a valid projection again (starting at infinity) and comes back in with positive NDC depth.
However, this last part is the area behind the camera. So homogeneous clipping is performed such that this part is also clipped away by the znear clip.
Check the old D3D9 documentation for the formulas and some more illustrative explanations here.
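As a small schematic of that point (illustrative Python, not actual GPU or Direct3D code): the clip test runs on the homogeneous clip-space coordinates before any division, which is what rejects the w < 0 vertex even though dividing first would make it look "valid". Real hardware clips the triangle against the clip planes and creates new vertices, which is why the rest of the triangle can stay visible.

def inside_clip_volume(x, y, z, w):
    # D3D-style clip volume: -w <= x <= w, -w <= y <= w, 0 <= z <= w.
    return -w <= x <= w and -w <= y <= w and 0 <= z <= w

def perspective_divide(x, y, z, w):
    return (x / w, y / w, z / w)

center_behind = (0.0, -0.6, -1.0, -1.0)   # the questionable vertex with w < 0

print(inside_clip_volume(*center_behind))   # False: rejected before any divide
print(perspective_divide(*center_behind))   # (-0.0, 0.6, 1.0): the divide alone looks plausible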

Handle "Division by Zero" in Image Processing (or PRNU estimation)

I have the following equation, which I am trying to implement. The upcoming question is not necessarily about this equation, but more generally about how to deal with division by zero in image processing:
Here, I is an image, W is the difference between the image and its denoised version (so W expresses the noise in the image), and K is an estimated fingerprint, obtained from d images of the same camera. All calculations are done pixel-wise, so the equation does not involve matrix multiplication. For more on the idea of estimating digital fingerprints, consult the corresponding literature, such as the general Wikipedia article or scientific papers.
However, my problem arises when an image has a pixel with value zero, e.g. perfect black (let's say we only have one image, k = 1, so the zero does not happen to get compensated by the pixel value of the next image if that value is non-zero). Then I have a division by zero, which is not defined.
How can I overcome this problem? One option I came up with was adding +1 to all pixels right before I start the calculations. However, this shifts the range of pixel values from [0, 255] to [1, 256], which makes it impossible to keep working with the uint8 data type.
Other authors of papers I read on this topic often do not consider values close to the range borders. For example, they only evaluate the equation for pixel values in [5, 250]. They justify this not by the numerical problem, but by arguing that if an image region is totally saturated or totally black, the fingerprint cannot be estimated properly in that area anyway.
But again, my main concern is not how this algorithm performs best, but rather, in general: how do you deal with division by zero in image processing?
One solution is to use subtraction instead of division; however, subtraction is translation-invariant rather than scale-invariant.
[E.g. the ratio will always be a normalized value between 0 and 1, and if it exceeds 1 you can invert it; you can get the same normalization with subtraction, but you need to know the maximum values the variables can attain.]
Eventually you will have to deal with division. Dividing a black image by itself is a proper subject: you can translate the values to some other range, then transform back.
However, 5/8 is not the same as 55/58, so you can only take this in a relative way. If you want the exact ratios, you had better stick with the original interval and handle the zeros as special cases: e.g. if denom == 0, do something with it; if num == 0 and denom == 0, then 0/0 means we have an identity, exactly as if we had 1/1.
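For example, a small NumPy sketch of that special-case handling (the sample values are made up for illustration): treat 0/0 as 1, and flag a lone zero denominator for whatever handling you choose.

import numpy as np

num   = np.array([0.0, 5.0, 55.0, 3.0])
denom = np.array([0.0, 8.0, 58.0, 0.0])

ratio = np.empty_like(num)
both_zero  = (num == 0) & (denom == 0)
denom_zero = (denom == 0) & ~both_zero
ok = ~(both_zero | denom_zero)

ratio[both_zero]  = 1.0        # 0/0: treat as the identity 1/1
ratio[denom_zero] = np.nan     # undefined: mark for special handling later
ratio[ok] = num[ok] / denom[ok]

print(ratio)   # [1.0, 0.625, 0.948..., nan]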
In PRNU and fingerprint estimation, if you check the MATLAB implementation on Jessica Fridrich's webpage, they basically create a mask to get rid of saturated and low-intensity pixels, as you mentioned. Then they convert the image matrix with single(I), which makes the image 32-bit floating point, add 1 to the image, and divide.
To your general question: in image processing, I like to create a mask and add one only to the zero-valued pixels.
img = imread('my gray img');        % grayscale uint8 image
a_mat = rand(size(img));            % whatever you need to divide by the image
mask = uint8(img == 0);             % 1 where the pixel is zero, 0 elsewhere
div = a_mat ./ double(img + mask);  % element-wise division; zero pixels become 1
This will prevent the division-by-zero error. (Not tested, but it should work.)

3D to 2D - moving camera

I'm trying to make a 3D engine to see how it works (I love to know how things work exactly). I heard that they put the camera somewhere and move and rotate the whole world - that was easy to make; the only hard part was writing a function to multiply matrices.
But I want to move the camera and keep the world at its position - I saw some people doing it and I actually prefer it. When I tried to make it myself I ran into a simple mathematical problem.
To understand the problem, I will convert 2D to 1D instead of 3D to 2D (same thing).
Look at this picture:
Now I have a camera position (x, y) and its direction vector (the blue point), from 0 to 1.
And I made another vector from the camera to the object and divided it by the distance to get a vector from 0 to 1 (the white point).
Now the distance between the two vectors (d) is the displacement I need to draw a point on the screen - but the problem is which direction. Distance is always positive, so I have to determine whether the second point is to the right of the direction vector or to the left - it's very easy for the eye, but very hard in code.
When I tried to compare (y2-y1)/(x2-x1), I got incorrect results when one vector is in one quadrant and the other vector is in another quadrant (quadrants of the coordinate plane).
So how do I compare those two vectors to see where one lies relative to the other?
I also tried atan2 and got some incorrect results, and I think atan2 will be slow anyway, because I would have to calculate it twice for every 3D point.
If there is an easier way, please tell me.
I used many words to describe my question because I only know simple words in English.
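A sketch of the usual quadrant-safe test (this is not from the original thread): the sign of the 2D cross product between the camera's direction vector and the camera-to-object vector tells left from right, with no division or atan2.

def side(dir_x, dir_y, to_obj_x, to_obj_y):
    # 2D cross product; its sign says which side to_obj is on relative to dir.
    # With y pointing up and counterclockwise positive: > 0 is left, < 0 is right.
    # (In screen coordinates with y pointing down, the two sides swap.)
    cross = dir_x * to_obj_y - dir_y * to_obj_x
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "straight ahead (or directly behind)"

print(side(0.0, 1.0,  0.5, 1.0))   # right of a camera looking along +Y
print(side(0.0, 1.0, -0.5, 1.0))   # left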

Normal Vector of Three Points

Hey math geeks, I've got a problem that's been stumping me for a while now. It's for a personal project.
I've got three dots: red, green, and blue. They're positioned on a cardboard slip such that the red dot is in the lower left (0,0), the blue dot is in the lower right (1,0), and the green dot is in the upper left. Imagine stepping back and taking a picture of the card from an angle. If you were to find the center of each dot in the picture (let's say the units are pixels), how would you find the normal vector of the card's face in the picture (relative to the camera)?
Now a few things I've picked up about this problem:
The dots (in "real life") are always at a right angle. In the picture, they're only at a right angle if the camera has been rotated around the red dot along an "axis" (axis being the line created by the red and blue or red and green dots).
There are dots on only one side of the card. Thus, you know you'll never be looking at the back of it.
The distance of the card to the camera is irrelevant. If I knew the depth of each point, this would be a whole lot easier (just a simple cross product, no?).
The rotation of the card is irrelevant to what I'm looking for. In the tinkering that I've been doing to try to figure this one out, the rotation can be found with the help of the normal vector in the end. Whether or not the rotation is a part of (or product of) finding the normal vector is unknown to me.
Hope there's someone out there that's either done this or is a math genius. I've got two of my friends here helping me on it and we've--so far--been unsuccessful.
i worked it out in my old version of MathCAD:
Edit: Wording wrong in screenshot of MathCAD: "Known: g and b are perpendicular to each other"
In MathCAD i forgot the final step of doing the cross-product, which i'll copy-paste here from my earlier answer:
Now that we've solved for the X-Y-Z of the translated g and b points, your original question wanted the normal of the plane.
If we cross g x b, we'll get the vector normal to both:

        | u1 u2 u3 |
g x b = | g1 g2 g3 |
        | b1 b2 b3 |

      = (g2b3 - b2g3)u1 + (b1g3 - b3g1)u2 + (g1b2 - b1g2)u3

All the values are known; plug them in (i won't write out the version with g3 and b3 substituted in, since it's just too long and ugly to be helpful).
But in practical terms, i think you'll have to solve it numerically, adjusting gz and bz so as to best fit the conditions:
g · b = 0
and
|g| = |b|
since the pixel measurements are not algebraically perfect.
Example
Using a picture of the Apollo 13 astronauts rigging one of the command module's square lithium hydroxide canisters to work in the LEM, i located the corners:
Using them as my basis for an X-Y plane:
i recorded the pixel locations using Photoshop, with positive X to the right, and positive Y down (to keep the right-hand rule of Z going "into" the picture):
g = (79.5, -48.5, gz)
b = (-110.8, -62.8, bz)
Punching the two starting formulas into Excel, and using the analysis toolpack to "minimize" the error by adjusting gz and bz, it came up with two Z values:
g = (79.5, -48.5, 102.5)
b = (-110.8, -62.8, 56.2)
Which then lets me calculate other interesting values.
The length of g and b in pixels:
|g| = 138.5
|b| = 139.2
The normal vector:
g x b = (3710, -15827, -10366)
The unit normal (length 1):
uN = (0.1925, -0.8209, -0.5377)
Scaling normal to same length (in pixels) as g and b (138.9):
Normal = (26.7, -114.0, -74.7)
Now that i have the normal that is the same length as g and b, i plotted them on the same picture:
i think you're going to have a new problem: distortion introduced by the camera lens. The three dots are not perfectly projected onto the 2-dimensional photographic plane. There's a spherical distortion that makes straight lines no longer straight, makes equal lengths no longer equal, and makes the normals slightly off of normal.
Microsoft research has an algorithm to figure out how to correct for the camera's distortion:
A Flexible New Technique for Camera Calibration
But it's beyond me:
We propose a flexible new technique to easily calibrate a camera. It is well suited for use without specialized knowledge of 3D geometry or computer vision. The technique only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique, and very good results have been obtained. Compared with classical techniques which use expensive equipment such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one step from laboratory environments to real world use.
They have a sample image, where you can see the distortion:
(source: microsoft.com)
Note
you don't know if you're seeing the "top" of the cardboard, or the "bottom", so the normal could be mirrored vertically (i.e. z = -z)
Update
Guy found an error in the derived algebraic formulas. Fixing it leads to formulas that i don't think have a simple closed form. That isn't too bad, since it can't be solved exactly anyway, only numerically.
Here's a screenshot from Excel where i start with the two known rules:
g · b = 0
and
|g| = |b|
Writing the 2nd one as a difference (an "error" amount), you can then add both up and use that value as the number for Excel's solver to minimize:
This means you'll have to write your own numeric iterative solver. i'm staring over at my Numerical Methods for Engineers textbook from university; i know it contains algorithms to solve recursive equations with no simple closed form.
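For instance, here is a small sketch of that numeric approach in Python, using SciPy's general-purpose minimizer as a stand-in for a hand-written solver (the pixel values are the ones from the Apollo 13 example above):

from scipy.optimize import minimize

gx, gy = 79.5, -48.5      # measured green-dot offset in pixels
bx, by = -110.8, -62.8    # measured blue-dot offset in pixels

def error(p):
    gz, bz = p
    dot = gx * bx + gy * by + gz * bz                              # want g . b = 0
    len_diff = (gx**2 + gy**2 + gz**2) - (bx**2 + by**2 + bz**2)   # want |g| = |b|
    return dot**2 + len_diff**2

result = minimize(error, x0=[100.0, 50.0])   # rough positive initial guess
gz, bz = result.x
print(gz, bz)   # in the same ballpark as the Excel solver's ~102.5 and ~56.2

# The normal is then the cross product g x b.
g = (gx, gy, gz)
b = (bx, by, bz)
normal = (g[1] * b[2] - g[2] * b[1],
          g[2] * b[0] - g[0] * b[2],
          g[0] * b[1] - g[1] * b[0])
print(normal)

Note the sign ambiguity mentioned above: starting the solver from a negative initial guess finds the mirrored (z-negated) solution.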
From the sounds of it, you have three points p1, p2, and p3 defining a plane, and you want to find the normal vector to the plane.
Representing the points as vectors from the origin, an equation for a normal vector would be
n = (p2 - p1)x(p3 - p1)
(where x is the cross-product of the two vectors)
If you want the vector to point outwards from the front of the card, then ala the right-hand rule, set
p1 = red (lower-left) dot
p2 = blue (lower-right) dot
p3 = green (upper-left) dot
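A minimal sketch of that formula, with made-up 3D coordinates for the three dots (any three non-collinear points work):

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

p1 = (0.0, 0.0, 0.0)   # red (lower-left)
p2 = (1.0, 0.0, 0.0)   # blue (lower-right)
p3 = (0.0, 1.0, 0.0)   # green (upper-left)

n = cross(sub(p2, p1), sub(p3, p1))
print(n)   # (0.0, 0.0, 1.0): points out of the front of the card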
@Ian Boyd... I liked your explanation, only I got stuck on step 2, when you said to solve for bz. You still had bz in your answer, and I don't think you should have bz in your answer...
bz should be ±sqrt(gx² + gy² + gz² - bx² - by²)
After I did this myself, I found it very difficult to substitute bz into the first equation when you solved for gz, because when substituting bz, you would now get:
gz = -(gx·bx + gy·by) / sqrt(gx² + gy² + gz² - bx² - by²)
The part that makes this difficult is that there is gz inside the square root, so you have to separate it, combine the gz terms, and solve for gz. Which I did, only I don't think the way I solved it was correct, because when I wrote my program to calculate gz, I used your gx and gy values to see if my answer matched up with yours, and it did not.
So I was wondering if you could help me out, because I really need to get this to work for one of my projects. Thanks!
Just thinking on my feet here.
Your effective inputs are the apparent ratio RB/RG [+], the apparent angle BRG, and the angle that (say) RB makes with your screen-coordinate y-axis (did I miss anything?). You need to output the components of the normalized normal (heh!) vector, which I believe is only two independent values (though you are left with a front-back ambiguity if the card is see-through).[++]
So I'm guessing that this is possible...
From here on I work on the assumption that the apparent angle of RB is always 0, and we can rotate the final solution around the z-axis later.
Start with the card positioned parallel to the viewing plane and oriented in the "natural" way (i.e. your upper vs. lower and left vs. right assignments are respected). We can reach all the interesting positions of the card by rotating by θ around the initial x-axis (for -π/2 < θ < π/2), then rotating by φ around the initial y-axis (for -π/2 < φ < π/2). Note that we have preserved the apparent direction of the RB vector.
Next, compute the apparent ratio and apparent angle in terms of θ and φ, and invert the result.[+++]
The normal will be R_y(φ) R_x(θ) (0, 0, 1), where R_i is the primitive rotation matrix around axis i.
[+] The absolute lengths don't count, because they just tell you the distance to the card.
[++] One more assumption: that the distance from the card to the viewing plane is much larger than the size of the card.
[+++] Here the projection you use from 3D space to the viewing plane matters. This is the hard part, but not something we can do for you unless you say what projection you are using. If you are using a real camera, then it is a perspective projection, which is covered in essentially any book on 3D graphics.
Right, the normal vector does not change with distance, but the projection of the cardboard onto the picture does change with distance. (Simply put: if you have a small piece of cardboard, nothing changes.
If you have a piece of cardboard 1 mile wide and 1 mile high and you rotate it so that one side is nearer and the other side farther away, the near side is magnified and the far side shortened in the picture. You can see immediately that a rectangle does not remain a rectangle, but becomes a trapezoid.)
The most accurate way for small angles, with the camera centered on the middle, is to measure the ratio of the width/height between the "normal" image and the angled image along the middle lines (because they are not warped).
We define x as left to right, y as down to up, z as from far to near.
Then
x = arcsin(measuredWidth/normWidth) red-blue
y = arcsin(measuredHeight/normHeight) red-green
z = sqrt(1.0-x^2-y^2)
I will calculate a more exact solution tomorrow, but I'm too tired now...
You could use u,v,n coordinates. Set your viewpoint to the position of the "eye" or "camera", then translate your x,y,z coordinates to u,v,n. From there you can determine the normals, as well as perspective and visible surfaces if you want (u',v',n'). Also, bear in mind that 2D = 3D with z=0. Finally, make sure you use homogeneous coordinates.
