I have the following definition for calculating the gradient at a pixel using central difference:
Where h is small, f'(x)=f(x+0.5h)-f(x-0.5h)
• If we make h twice the distance between pixels,
• the above equation simply states that the derivative (the image gradient) at a pixel is the next (right) pixel's value minus the previous (left) pixel's value.
Why is it not necessary to divide by h to get the rate of change? Why does simply subtracting the left pixel's value from the right pixel's value give the derivative at the central pixel?
Your definition is wrong. You do need to divide by h to get a proper estimate of the derivative.
In image processing, we often see definitions for derivatives that are off by a scale factor, like the one you have here. In most applications the scaling is not important; what matters is comparing values in different parts of the image, for example to find the most salient edges. For these cases it is OK to use a simplified definition (which may also be cheaper to compute).
For example, the Sobel operator is usually defined in a way that it produces a value 8 times larger than the derivative it tries to estimate.
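As a rough illustration of the scaling point, here is a small Python sketch on a synthetic ramp image whose true slope is 3 grey levels per pixel. The helper names (`central_difference`, `sobel_x`) and the test image are made up for this example; they are not from any particular library.

```python
def central_difference(row, x, h=2):
    """Estimate df/dx at index x from two samples h pixels apart.

    With h = 2 this is (right neighbour - left neighbour) / 2.
    """
    half = h // 2
    return (row[x + half] - row[x - half]) / h


def sobel_x(img, y, x):
    """Apply the usual 3x3 horizontal Sobel kernel at pixel (y, x), with no normalisation."""
    kernel = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


# Synthetic ramp: the value grows by 3 per pixel in x, so the true derivative is 3.
ramp = [[3 * x for x in range(5)] for _ in range(5)]

unscaled = ramp[2][3] - ramp[2][1]        # right minus left, no division
scaled = central_difference(ramp[2], 2)   # divided by h = 2
sobel = sobel_x(ramp, 2, 2)               # unnormalised Sobel response
```

On this ramp the undivided difference comes out at twice the true slope and the Sobel response at eight times the true slope, while dividing by h recovers the slope itself. Since the factor is the same everywhere in the image, comparisons between pixels are unaffected.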
I have the following equation, which I try to implement. The upcoming question is not necessarily about this equation, but more generally, on how to deal with divisions by zero in image processing:
Here, I is an image, W is the difference between the image and its denoised version (so W expresses the noise in the image), and K is an estimated fingerprint, obtained from d images of the same camera. All calculations are done pixel-wise, so the equation does not involve a matrix multiplication. For more on the idea of estimating digital fingerprints, consult the corresponding literature, such as the general Wikipedia article or scientific papers.
However, my problem arises when an image has a pixel with value zero, e.g. perfect black (let's say we only have one image, k=1, so the zero is not offset by chance by a nonzero value of the same pixel in the next image). Then I have a division by zero, which is not defined.
How can I overcome this problem? One option I came up with was adding 1 to all pixels before I even start the calculations. However, this shifts the range of pixel values from [0|255] to [1|256], which then makes it impossible to work with data type uint8.
Other authors in papers I read on this topic often do not consider values close to the range borders. For example, they only evaluate the equation for pixel values in [5|250]. Their reason is not the numerical problem; rather, they say that if an image is totally saturated or totally black, the fingerprint cannot be estimated properly in that area anyway.
But again, my main concern is not about how this algorithm performs best, but rather in general: How to deal with divisions by 0 in image processing?
One solution is to use subtraction instead of division; however, subtraction is not scale invariant, it is translation invariant.
[e.g. the ratio will always be a normalized value between 0 and 1, and if it exceeds 1 you can invert it; you can have the same normalization with subtraction, but you need to find the maximum values attained by the variables]
Eventually you will have to deal with division. Dividing a black image by itself is a legitimate case - you can translate the values to some other range and then transform back.
However, 5/8 is not the same as 55/58, so a shifted range preserves ratios only in a relative sense. If you want to know the exact ratios, you had better stick with the original interval and handle zeros as special cases: e.g. if denom==0, do something with it; if num==0 and denom==0, the 0/0 means we have an identity - it is exactly as if we had 1/1.
In PRNU and fingerprint estimation, if you check the MATLAB implementation on Jessica Fridrich's webpage, they basically create a mask to get rid of saturated and low-intensity pixels, as you mentioned. Then they convert the image matrix with single(I), which makes the image 32-bit floating point, add 1 to the image, and divide.
To your general question: in image processing, I like to create a mask and add one only to the zero-valued pixels.
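img = imread('my gray img');           % grayscale image (uint8)
a_mat = rand(size(img));               % example numerator (double)
mask = (img == 0);                     % true where a pixel is exactly zero
div = a_mat ./ (double(img) + mask);   % element-wise division; zero pixels are divided by 1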
This will prevent division by zero error. (Not tested but it should work)
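For readers not using MATLAB, the same mask idea can be sketched in plain Python on nested lists. The function names `safe_divide` and `valid_mask` are invented for this sketch, and the default limits in `valid_mask` are simply the [5|250] example range mentioned in the question, not values taken from any reference implementation.

```python
def safe_divide(numerator, image):
    """Element-wise numerator / image, dividing by 1 wherever the pixel is 0."""
    result = []
    for num_row, img_row in zip(numerator, image):
        # Adding 1 only at zero-valued pixels leaves every other pixel untouched.
        result.append([n / (p + (1 if p == 0 else 0))
                       for n, p in zip(num_row, img_row)])
    return result


def valid_mask(image, low=5, high=250):
    """Mark pixels far enough from black and from saturation to be trusted."""
    return [[low <= p <= high for p in row] for row in image]


image = [[0, 10], [255, 100]]
numerator = [[3.0, 5.0], [51.0, 50.0]]

ratios = safe_divide(numerator, image)
mask = valid_mask(image)
```

Working in floating point avoids the uint8 overflow concern, and the mask lets later steps ignore the pixels whose ratio is only a placeholder.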
I'm writing a 2D game, in which I would like to have crate-like objects. These objects would move around, like real crates do. I have a hypothetical idea of how I would like to achieve that:
Basically I'd store the boxes' corners' coordinates with their force and velocity unit vectors, and in every update I'd basically do the following steps:
1. Apply the forces (gravity, collisions, etc.) accordingly.
2. Modify velocity vector based on the force.
3. Move every corner of the box, like so:
4. I repeat nr 3. for every corner, so I get the real movement of the cube.
My questions are: Is this approach heading in the right direction? Is this theory even correct? If not, what would be the correct way to move a box around based on vectors in a 2D environment?
Just to clarify: I'm only dragging corner "A" in the picture, but I want to repeat the dragging for every other corner, with their own vectors. By "dragging" I mean the algorithm I just stated.
Keeping each corner's coordinate and speed makes no sense as you would be storing lots of redundant information. Boxes are rigid objects, which means that there are constraints that must be satisfied at any time instant, namely the distance between any two given corners is fixed. This also translates to a constraint that links the velocities of all four corners and so they are not independent values. With rigid bodies the movement of any point is the sum of two independent movements - the linear movement of the centre of mass (CM) and the rotation around a fixed axis - often, but not always, chosen to be the one that goes through the CM. Hence you only need to store the position and the velocity of the crate's CM (which coincides with the geometric centre of the crate) as well as the angle of rotation and the rate of rotation around the CM.
As to the motion, the gravity field is a constant vector field and hence cannot induce rotation in symmetric objects like those rectangular crates. Instead it only produces accelerated vertical motion of the CM. The same goes for the linear effect of all external forces - one has to take their vector sum and apply it to the CM. Only external forces whose line of action does not go through the CM give torque and so cause rotation. Such forces are any external pushes/pulls or reaction forces that arise when crates collide with each other or hit the ground / a wall. Computing the torque due to external forces is easy, but computing reaction forces can be quite an involved process because of the constrained dynamics that has to be employed. Once the torque has been computed, one has to divide it by the moment of inertia of the crate in order to get the angular acceleration. Often it is more convenient to use another axis and not the one that goes through the CM - Steiner's theorem can be employed in this case in order to compute the moment of inertia around that other axis.
To summarise:
all forces acting on the crate are first added together (as vectors) and the resultant force (divided by the mass of the crate) determines the linear acceleration of the CM;
the torque of all forces is computed and then used to determine the angular acceleration around a given axis.
See here for some sample problems of rigid body motion and how the physics is actually worked out.
Given your algorithm, if by "velocity vector" you actually mean "the velocity of the CM", then 1 would be correct - all corners move in the same direction (the linear motion of the CM). But 2 would not always be correct - the proper angle of rotation depends on the time over which the torque was applied (e.g. the simulation timestep), and one has to take into account that the lever arm length changes in between as the crate rotates.
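To make the centre-of-mass description concrete, here is a minimal Python sketch of one update step. Everything in it is an assumption made for illustration: the `Crate` class, the `step` and `corners` helpers, the semi-implicit Euler integration, and the moment of inertia m(w^2 + h^2)/12 of a uniform rectangle about its centre. Collision and reaction forces are not handled.

```python
import math
from dataclasses import dataclass


@dataclass
class Crate:
    mass: float
    width: float
    height: float
    x: float = 0.0        # position of the centre of mass (CM)
    y: float = 0.0
    vx: float = 0.0       # velocity of the CM
    vy: float = 0.0
    angle: float = 0.0    # rotation about the CM, in radians
    omega: float = 0.0    # angular velocity, in radians per second

    @property
    def inertia(self):
        # Uniform rectangle rotating about its centre (an assumption of this sketch).
        return self.mass * (self.width ** 2 + self.height ** 2) / 12.0


def step(crate, forces, dt):
    """Advance the crate by dt. forces is a list of ((fx, fy), (px, py)),
    where (px, py) is the world-space point at which the force acts."""
    fx_total = fy_total = torque = 0.0
    for (fx, fy), (px, py) in forces:
        fx_total += fx
        fy_total += fy
        rx, ry = px - crate.x, py - crate.y   # lever arm from the CM
        torque += rx * fy - ry * fx           # 2D cross product r x F
    # Semi-implicit Euler: update velocities first, then positions.
    crate.vx += fx_total / crate.mass * dt
    crate.vy += fy_total / crate.mass * dt
    crate.omega += torque / crate.inertia * dt
    crate.x += crate.vx * dt
    crate.y += crate.vy * dt
    crate.angle += crate.omega * dt


def corners(crate):
    """Derive the four corner positions from the CM position and the angle."""
    c, s = math.cos(crate.angle), math.sin(crate.angle)
    hw, hh = crate.width / 2.0, crate.height / 2.0
    return [(crate.x + lx * c - ly * s, crate.y + lx * s + ly * c)
            for lx, ly in ((-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh))]
```

Gravity would be passed as a force acting at `(crate.x, crate.y)`, which produces no torque, while a push at a corner produces both linear and angular acceleration. The corners are never stored; they are recomputed from the four state values whenever the crate is drawn.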
If I have a mesh of triangles, how does one go about calculating the normals at each given vertex?
I understand how to find the normal of a single triangle. If I have triangles sharing vertices, I can partially find the answer by finding each triangle's respective normal, normalizing it, adding it to the total, and then normalizing the end result. However, this obviously does not take into account proper weighting of each normal (many tiny triangles can throw off the answer when linked with a large triangle, for example).
I think a good method should be using a weighted average but using angles instead of area as weights. This is in my opinion a better answer because the normal you are computing is a "local" feature so you don't really care about how big is the triangle that is contributing... you need a sort of "local" measure of the contribution and the angle between the two sides of the triangle on the specified vertex is such a local measure.
Using this approach, a lot of small (thin) triangles don't give you an unbalanced answer.
Using angles is the same as using an area-weighted average if you localize the computation by using the intersection of the triangles with a small sphere centered in the vertex.
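A small Python sketch of angle-weighted vertex normals follows, with area weighting and plain averaging available for comparison. The function `vertex_normals`, its `weighting` parameter, and the helper names are invented for this example; triangles are assumed to be index triples with consistent counter-clockwise winding.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _unit(v):
    length = math.sqrt(_dot(v, v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return (v[0] / length, v[1] / length, v[2] / length)


def vertex_normals(vertices, triangles, weighting="angle"):
    """Per-vertex normals; weighting is 'angle', 'area' or anything else for uniform."""
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    for tri in triangles:
        p = [vertices[i] for i in tri]
        raw = _cross(_sub(p[1], p[0]), _sub(p[2], p[0]))  # length is twice the area
        area = 0.5 * math.sqrt(_dot(raw, raw))
        if area == 0.0:
            continue  # skip degenerate triangles
        face = _unit(raw)
        for k in range(3):
            if weighting == "angle":
                # Interior angle of the triangle at corner k.
                e1 = _unit(_sub(p[(k + 1) % 3], p[k]))
                e2 = _unit(_sub(p[(k + 2) % 3], p[k]))
                w = math.acos(max(-1.0, min(1.0, _dot(e1, e2))))
            elif weighting == "area":
                w = area
            else:
                w = 1.0
            for axis in range(3):
                sums[tri[k]][axis] += w * face[axis]
    return [_unit(s) for s in sums]


# Corner of a cube at the origin: the z=0 face is split into two triangles
# at that corner, the other two faces contribute one triangle each.
verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 1)]
tris = [(0, 3, 2), (0, 2, 1), (0, 1, 4), (0, 4, 3)]

by_angle = vertex_normals(verts, tris, "angle")[0]
by_area = vertex_normals(verts, tris, "area")[0]
```

In this example the angle-weighted normal at the cube corner points along the diagonal (-1, -1, -1), regardless of how the z=0 face was split, whereas the area-weighted normal is pulled towards -z because that face contributes two triangles.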
The weighted average appears to be the best approach.
But be aware that, depending on your application, sharp corners could still give you problems. In that case, you can compute multiple vertex normals by averaging only those surface normals whose cross product has a magnitude below some threshold (i.e., normals that are close to parallel).
Search for "Offset triangular mesh using the multiple normal vectors of a vertex" by SJ Kim et al. for more details about this method.
This blog post outlines three different methods and gives a visual example of why the standard and simple method (area weighted average of the normals of all the faces joining at the vertex) might sometimes give poor results.
You can give more weight to big triangles by multiplying the normal by the area of the triangle.
Check out this paper: Discrete Differential-Geometry Operators for Triangulated 2-Manifolds.
In particular, the "Discrete Mean Curvature Normal Operator" (Section 3.5, Equation 7) gives a robust normal that is independent of tessellation, unlike the methods in the blog post cited by another answer here.
Obviously you need to use a weighted average to get a correct normal, but using the triangle's area won't give you what you need, since the area of each triangle has no relationship to the share of the weight that the triangle's normal represents at a given vertex.
If you base it on the angle between the two sides coming into the vertex, you should get the correct weight for every triangle coming into it. It might be convenient if you could convert it to 2d somehow so you could go off of a 360 degree base for your weights, but most likely just using the angle itself as your weight multiplier for calculating it in 3d space and then adding up all the normals produced that way and normalizing the final result should produce the correct answer.
I have a gray-scale image and I want to make a function that
closely follows the image
is always greater than the image
smooth at some given scale.
In other words, I want a smooth function that approximates the maximum of another function in a local region while overestimating that function at all points.
Any ideas?
My first pass at this amounted to picking the "high spots" (by comparing the image to a least-squares fit of a high order 2-D polynomial) and matching a 2-D polynomial to them and their slopes. As the first fit required more working space than I had address space, I think it's not going to work and I'm going to have to come up with something else...
What I did
My end target was to do a smooth adjustment on an image so that each local region uses the full range of values. The key realization was that an "almost perfect" function would do just fine for me.
The following procedure (that never has the max function explicitly) is what I ended up with:
Find the local mean and standard deviation at each point using a "blur" like function.
offset the image to get a zero mean. (image -= mean;)
divide each pixel by its stdev. (image /= stdev;)
most of the image should now be in [-1,1] (oddly enough, most of my test images have better than 99% in that range rather than the roughly 68% that would be expected for a normal distribution)
find the standard deviation of the whole image.
map some span +/- n*sigma to your output range.
With a little manipulation, that can be converted to find the Max function I was asking about.
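The procedure above can be sketched roughly as follows in plain Python. The box blur stands in for the unspecified "blur"-like function, and the names `box_blur` and `local_contrast_stretch`, the `eps` guard against zero deviation, and the default parameters are all assumptions of this sketch rather than what was actually used.

```python
import math


def box_blur(img, radius):
    """Mean over a (2*radius+1)^2 window, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def local_contrast_stretch(img, radius, n_sigma=2.0, out_max=255.0, eps=1e-6):
    # Local mean and standard deviation from blurred image and blurred squares.
    mean = box_blur(img, radius)
    mean_sq = box_blur([[p * p for p in row] for row in img], radius)
    norm = []
    for y, row in enumerate(img):
        norm_row = []
        for x, p in enumerate(row):
            var = max(mean_sq[y][x] - mean[y][x] ** 2, 0.0)
            # Zero local mean, unit local deviation (eps avoids dividing by 0).
            norm_row.append((p - mean[y][x]) / (math.sqrt(var) + eps))
        norm.append(norm_row)
    # Standard deviation of the whole normalised image.
    flat = [v for row in norm for v in row]
    g_mean = sum(flat) / len(flat)
    sigma = math.sqrt(sum((v - g_mean) ** 2 for v in flat) / len(flat))
    span = n_sigma * sigma or 1.0   # fall back to 1.0 for a perfectly flat image
    # Map [-span, +span] linearly onto [0, out_max], clipping the tails.
    return [[min(out_max, max(0.0, (v + span) / (2.0 * span) * out_max))
             for v in row] for row in norm]
```

The blur radius plays the role of the "given scale" from the question: larger radii adapt the contrast over larger neighbourhoods.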
Here's something that's easy; I don't know how good it is.
To get smooth, use your favorite blurring algorithm. E.g., average points within radius 5. Space cost is order the size of the image and time is the product of the image size with the square of the blurring radius.
Compute the difference between each pixel of the original image and the blurred image, find the maximum value of (original[i][j] - blurred[i][j]), and add that value to every pixel in the blurred image. The sum is guaranteed to overapproximate the original image. Time cost is proportional to the size of the image, with constant additional space (if you overwrite the blurred image after computing the max).
To do better (e.g., to minimize the square error under some set of constraints), you'll have to pick some class of smooth curves and do some substantial calculations. You could try quadratic or cubic splines, but in two dimensions splines are not much fun.
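A minimal Python sketch of the blur-and-offset idea from this answer is given below. The simple box blur and the function names `box_blur` and `smooth_upper_bound` are assumptions of this example; any other blurring algorithm could be substituted.

```python
def box_blur(img, radius):
    """Mean over a (2*radius+1)^2 window, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        ys = range(max(0, y - radius), min(h, y + radius + 1))
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def smooth_upper_bound(img, radius):
    """Blur the image, then lift it by the largest amount the original exceeds it."""
    blurred = box_blur(img, radius)
    offset = max(p - b
                 for row, blurred_row in zip(img, blurred)
                 for p, b in zip(row, blurred_row))
    return [[b + offset for b in blurred_row] for blurred_row in blurred]


image = [[0, 0, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 5]]
upper = smooth_upper_bound(image, 1)
```

The result touches the original at the pixel with the largest difference and lies above it everywhere else, but a single bright pixel lifts the whole surface, so it may not follow the image closely elsewhere.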
My quick and dirty answer would be to start with the original image, and repeat the following process for each pixel until no changes are made:
If an overlarge delta in value between this pixel and its neighbours can be resolved by increasing the value of the pixel, do so.
If an overlarge slope change around this pixel and its neighbours can be resolved by increasing the value of the pixel, do so.
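The 1D version (a single image row) would look something like this:
# One relaxation pass over the row; repeat until no pixel changes.
for x in range(1, len(img) - 1):
    # Raise the pixel if it lies too far below its left neighbour.
    d = img[x-1] - img[x]
    if d > DMAX:
        img[x] += d - DMAX
    # Same check against the right neighbour.
    d = img[x+1] - img[x]
    if d > DMAX:
        img[x] += d - DMAX
    # Raise the pixel if it sits in too sharp a valley.
    dleft = img[x-1] - img[x]
    dright = img[x] - img[x+1]
    d = dleft - dright    # = img[x-1] + img[x+1] - 2*img[x]
    if d > SLOPEMAX:
        img[x] += (d - SLOPEMAX) / 2    # brings d down to exactly SLOPEMAX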
Maximum filter the image with an RxR filter, then use an order R-1 B-spline smoothing on the maximum-filtered image. The convex hull properties of the B-spline guarantee that it will be above the original image.
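As a simplified sketch of this idea in Python, the B-spline smoothing is replaced here by a plain box average of the same radius as the maximum filter. That is a deliberate substitution, not the method described above, but it keeps the same kind of guarantee: every value being averaged is a windowed maximum that already covers the pixel in question. The function names are invented for the example.

```python
def _window(img, y, x, radius):
    """Values in the (2*radius+1)^2 neighbourhood of (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]


def max_filter(img, radius):
    return [[max(_window(img, y, x, radius)) for x in range(len(img[0]))]
            for y in range(len(img))]


def box_average(img, radius):
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            values = _window(img, y, x, radius)
            row.append(sum(values) / len(values))
        out.append(row)
    return out


def smooth_envelope(img, radius):
    """Windowed maximum followed by an average over the same window size.

    Each averaged value is a maximum over a window that contains the
    current pixel, so the average cannot fall below that pixel.
    """
    return box_average(max_filter(img, radius), radius)


image = [[1, 2, 8, 2],
         [0, 3, 1, 0],
         [4, 0, 0, 6]]
envelope = smooth_envelope(image, 1)
```

Unlike a global offset, this envelope stays close to the local maxima, at the cost of being only as smooth as the averaging window makes it.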
Can you clarify what you mean by your desire that it be "smooth" at some scale? Also, over how large of a "local region" do you want it to approximate the maximum?
Quick and dirty answer: weighted average of the source image and a windowed maximum.
I have an implicit scalar field defined in 2D: for every point in 2D I can compute an exact scalar value, but it's a somewhat complex computation.
I would like to draw an iso-line of that surface, say the line of the '0' value. The function itself is continuous but the '0' iso-line can have multiple continuous instances and it is not guaranteed that all of them are connected.
Calculating the value for each pixel is not an option because that would take too much time - in the order of a few seconds and this needs to be as real time as possible.
What I'm currently using is a recursive division of space, which can be thought of as a kind of quad-tree. I take an initial, very coarse sampling of the space, and if I find a square which contains a transition from positive to negative values, I recursively divide it into 4 smaller squares and check again, stopping at the pixel level. The positive-negative transition is detected by sampling a square at its 4 corners.
This works fairly well, except when it doesn't. The iso-lines which are drawn sometimes get cut, because the transition detection fails for transitions which happen in a small area of an edge and don't cross a corner of a square.
Is there a better way to do iso-line drawing in this setting?
I've had a lot of success with the algorithms described here http://web.archive.org/web/20140718130446/http://members.bellatlantic.net/~vze2vrva/thesis.html
which discuss adaptive contouring (similar to that which you describe), and also some other issues with contour plotting in general.
There is no general way to guarantee finding all the contours of a function without looking at every pixel. There could be a very small closed contour: a region only about the size of a pixel where the function is positive, inside a larger region where the function is generally negative. Unless you sample finely enough that you place a sample inside the positive region, there is no general way of knowing that it is there.
If your function is smooth enough, you may be able to guess where such small closed contours lie, because the modulus of the function gets small in a region surrounding them. The sampling could then be refined in these regions only.
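One way to turn "refine where the modulus is small" into a concrete rule is sketched below in Python. It rests on an extra assumption that is not in the question: a known bound `lipschitz` on how fast the function can change per unit distance. Under that assumption, a square whose centre value is larger in magnitude than `lipschitz` times the half-diagonal cannot contain a zero and can be skipped. The function name `find_zero_cells` and the test field are made up for the example.

```python
import math


def find_zero_cells(f, x0, y0, size, min_size, lipschitz):
    """Return squares (x, y, size) with size <= min_size that may contain f == 0.

    Assumes |f(p) - f(q)| <= lipschitz * distance(p, q) for all points p, q.
    """
    cx, cy = x0 + size / 2.0, y0 + size / 2.0
    half_diagonal = size * math.sqrt(2.0) / 2.0
    if abs(f(cx, cy)) > lipschitz * half_diagonal:
        return []                      # f cannot reach zero inside this square
    if size <= min_size:
        return [(x0, y0, size)]        # small enough: report as a candidate
    half = size / 2.0
    cells = []
    for dx in (0.0, half):
        for dy in (0.0, half):
            cells += find_zero_cells(f, x0 + dx, y0 + dy, half, min_size, lipschitz)
    return cells


def small_circle(x, y):
    """Signed distance to a circle of radius 0.05 centred at (0.37, 0.41)."""
    return math.hypot(x - 0.37, y - 0.41) - 0.05


# All four corners of the unit square are positive, so a corner-only sign
# test would never subdivide, yet the zero contour is still located.
cells = find_zero_cells(small_circle, 0.0, 0.0, 1.0, 1.0 / 64.0, lipschitz=1.0)
```

Each square costs one function evaluation instead of four, and squares far from any contour are rejected early, so the total work concentrates near the iso-line. If no such bound is known, the same test with a guessed constant becomes a heuristic and loses the guarantee.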