If I have a function that computes a 5x5 Gaussian pyramid kernel, how do I use this function to apply a Gaussian blur to an image?
I assume I have to loop over the whole image, but I am not sure how many times I have to call the function per row and per column.
If you're simply going to perform a blur, consider decomposing your 2D kernel into two 1D kernels for the sake of speed.
http://homepages.inf.ed.ac.uk/rbf/HIPR2/gsmooth.htm
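In case it helps, here is a minimal sketch of the separable version (hedged: it assumes a 2D grayscale numpy array, and uses the fact that the classic 5x5 pyramid kernel is the outer product of [1, 4, 6, 4, 1]/16 with itself, so two 1D passes give the same result as one 2D convolution):

```python
import numpy as np

# 1D factor of the classic 5x5 pyramid kernel; the full 2D kernel is
# the outer product of this vector with itself.
k1d = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0

def gaussian_blur(img):
    """Separable 5-tap Gaussian blur of a 2D grayscale array."""
    h, w = img.shape
    padded = np.pad(img, 2, mode='edge')  # pad by the kernel radius
    # horizontal pass: weighted sum of 5 shifted copies of each row
    tmp = sum(wt * padded[:, i:i + w] for i, wt in enumerate(k1d))
    # vertical pass over the result (still padded vertically)
    return sum(wt * tmp[i:i + h, :] for i, wt in enumerate(k1d))
```

This also answers the "how many calls" part of the question: each output pixel is one kernel-weighted sum of its neighbourhood, so there is exactly one kernel application per pixel per pass, not several per row or column.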
The "Pyramid" context may be a bit misleading here; if you only want to blur an image and perhaps display the blurred image, or process it further, then image pyramids aren't necessarily relevant.
I want to calculate the cavity volume inside a 3D object. The object is modeled as a triangular mesh and is hollow. A good example of what I want to calculate is the volume of a ventricular cavity of a human heart.
I have looked at calculating the volume of a mesh according to the answer and paper from this thread; however, that seems to calculate only the volume enclosed by the mesh. Depending on how my model is created, the hollow/cavity of the model would not be captured by this method. For example, if my model is layered with an inner and an outer surface, as seen in the image, would I be able to calculate the volume in red? I have thought of finding just the outer volume (world-space volume) and subtracting the mesh volume from it to get the cavity volume, but have so far been unsuccessful.
Maybe I am misunderstanding the paper and the previous thread, but I believe this cavity calculation is a separate problem. The method would also need to handle overhangs and inner objects that may cause complications. Let me know whether I am thinking about the previous thread/paper incorrectly, or whether there is an alternative method to look at.
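For reference, the mesh-volume method such threads usually describe is the divergence-theorem sum of signed tetrahedron volumes. A minimal sketch (assuming a numpy array of triangles from a closed, consistently oriented mesh; the function name is illustrative):

```python
import numpy as np

def signed_volume(triangles):
    """Volume enclosed by a closed, consistently oriented triangle mesh.

    triangles: (n, 3, 3) array -- n triangles, 3 vertices, xyz each.
    Each triangle contributes the signed volume of the tetrahedron it
    forms with the origin; with outward normals the total is positive.
    """
    v0, v1, v2 = triangles[:, 0], triangles[:, 1], triangles[:, 2]
    return np.einsum('ij,ij->i', v0, np.cross(v1, v2)).sum() / 6.0
```

One consequence worth noting: if the shell's normals point out of the material everywhere, this sum over the whole mesh yields the material volume (the inner surface contributes negatively), and summing over the inner surface alone with its orientation flipped yields the cavity volume directly. That only works if the inner surface can be isolated as a closed component, which may not hold for every model.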
I have been trying to teach myself some simple computer vision algorithms, and am trying to solve a problem where I have a noise-corrupted image and all I am trying to do is separate the black background from the foreground, which contains some signal. The background RGB channels are not all completely zero, as they can contain some noise. However, the human eye can easily discern the foreground from the background.
So, what I did was use the SLIC algorithm to break the image down into superpixels. The idea is that, since the image is noise-corrupted, doing statistics on the patches might result in better classification of background and foreground because of the higher SNR.
After this, I get around 100 patches, each of which should have a similar profile, and the result of SLIC seems reasonable. I have been reading about graph cuts (the Kolmogorov paper) and it seemed like something nice to try for the binary problem I have. So, I constructed a graph which is a first-order MRF, with edges between the immediate neighbours (a 4-connected graph).
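For what it's worth, that setup can be sketched as follows, assuming scikit-image for SLIC (the neighbour extraction below is hand-rolled, not a particular library call):

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_graph(image, n_segments=100):
    """SLIC superpixels plus the adjacency between them."""
    labels = slic(image, n_segments=n_segments, compactness=10)
    # Two superpixels are neighbours if some horizontally or vertically
    # adjacent pixel pair straddles their common boundary.
    edges = set()
    for a, b in ((labels[:, :-1], labels[:, 1:]),   # horizontal pairs
                 (labels[:-1, :], labels[1:, :])):  # vertical pairs
        mask = a != b
        pairs = np.sort(np.stack([a[mask], b[mask]], axis=1), axis=1)
        edges.update(map(tuple, pairs))
    return labels, edges
```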
Now, I was wondering what unary and binary terms I could use here to do my segmentation. For the unary term, I was thinking I could model it as a simple Gaussian, where the background should have zero mean intensity and the foreground some non-zero mean. However, I am struggling to figure out how to encode this. Should I just assume some noise variance and compute probabilities directly from the patch statistics?
Similarly, for neighbouring patches I do want to encourage them to take the same label, but I am not sure what binary term I can design that reflects that. Just using the difference between the labels (1 or 0) seems weird...
Sorry for the long-winded question. Hoping someone can give some helpful hints on how to start.
You could build your CRF model over superpixels, such that two superpixels are connected if they are neighbours of each other.
For your statistical model, pixel-wise posteriors are simple and cheap to compute.
So, I suggest the following for the unary terms of the CRF:
Build foreground and background histograms over per-pixel texture (assuming you have a mask, or a reasonable number of marked foreground pixels; note: pixels, not superpixels).
For each superpixel, make an independence assumption over the pixels within it, so that a superpixel's likelihood of being either foreground or background is the product of the likelihoods of each observation in the superpixel (in practice, we sum logs). The individual likelihood terms come from the histograms you generated.
Compute the posterior for foreground as the cumulative foreground likelihood described above, divided by the sum of the cumulative likelihoods of both classes; similarly for background.
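A minimal sketch of those unary terms (hedged: the histograms and per-pixel feature bins are assumed to exist already, and every name is illustrative):

```python
import numpy as np

def unary_terms(feat_bins, hist_fg, hist_bg, eps=1e-10):
    """Negative log posteriors (foreground, background) for one superpixel.

    feat_bins: histogram-bin index of each pixel in the superpixel;
    hist_fg, hist_bg: normalized histograms (each sums to 1).
    """
    # independence assumption: sum the per-pixel log likelihoods
    log_fg = np.log(hist_fg[feat_bins] + eps).sum()
    log_bg = np.log(hist_bg[feat_bins] + eps).sum()
    # posterior = normalized cumulative likelihood; subtract the max
    # before exponentiating to stay numerically stable
    m = max(log_fg, log_bg)
    zf, zb = np.exp(log_fg - m), np.exp(log_bg - m)
    p_fg = zf / (zf + zb)
    # graph-cut solvers usually want costs, i.e. negative log posteriors
    return -np.log(p_fg + eps), -np.log(1.0 - p_fg + eps)
```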
The pairwise terms between superpixels can be as simple as the difference between the mean observed textures (pixel-wise) of each, passed through a kernel such as a radial basis function.
Alternatively, you could compute a histogram over each superpixel's observed texture (again, pixel-wise) and compute the Bhattacharyya distance between each neighbouring pair of superpixels.
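A sketch of that histogram-based pairwise term (again over normalized histograms; beta is an illustrative contrast parameter you would tune):

```python
import numpy as np

def bhattacharyya_distance(h1, h2, eps=1e-10):
    """Bhattacharyya distance between two normalized histograms."""
    bc = np.sum(np.sqrt(h1 * h2))  # Bhattacharyya coefficient in [0, 1]
    return -np.log(bc + eps)

def pairwise_weight(h1, h2, beta=1.0):
    """Contrast-sensitive smoothness cost for a label disagreement:
    similar neighbours are expensive to cut apart, dissimilar ones cheap."""
    return np.exp(-beta * bhattacharyya_distance(h1, h2))
```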
I am writing a drawing program that is vector-based, but should have drawing tools that behave more like raster-based tools. For example, when you draw with a graphics tablet pen, the resulting vector stroke, pressure differences and all, is actually a fill: the vector curve just fits the outside of the stroke. Also, if you pick up the eraser and erase part of this stroke, it erases exactly where you erase; it does not just delete vector points, but creates new ones where it has to. These pen and eraser strokes can also have any geometry, with not only outlines on the outside but also holes in the middle.
I puzzled for a long time over how to do this, and last night I got the idea of drawing the pen input to an offscreen bitmap (where each pixel is one bit: either touched or untouched by the pen stroke). When the stroke ends (the pen is lifted), the program vectorizes this bitmap and then either places the vector fill with the appropriate color, or performs a boolean operation with the vectors underneath in order to erase.
I can make this work with raster drawing tools (so that you're not constantly drawing lines right on the visible surface, which doesn't look good if you're drawing with transparency), but I don't know how I would fit a vector curve to this one-bit bitmap. The operation has to be fast, but not real-time, as it is done once, after the pen is lifted. It also has to create optimized geometry, so that only the minimum number of Bezier curve points needed to describe the geometry is used.
Does anyone have any suggestions, solutions, pointers, or references for how to do this? Or, is there maybe another way?
Let me explain my problem:
I have a black vector shape (let's say it's a series of joined, straight lines for now, but it'd be nice if I could also support quadratic curves).
I also have a rectangle of a predefined width and height. I'm going to place it on top of the black shape, and then take the union of the two.
My first issue is that I don't know how to quickly compute vector unions, but I think there is a well-defined formula I can figure out for myself.
My second, and more tricky, issue is how to efficiently find the position the rectangle needs to be in (i.e., the translation and rotation for the matrices) in order to maximize the black remaining after the union (see the figure below).
The red outlined shape below is ~33% black; the green is something like 85%; and there are positions for this shape & rectangle wherein either could have 100% coverage.
Obviously, I can brute-force this by trying every translation and rotation value for every position where at least part of the rectangle touches the black shape, keeping track of the one with the most black coverage. The problem is that I can only try a finite number of positions (and may therefore miss the maximum). Apart from that, it feels very inefficient!
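For concreteness, the brute-force scoring can be sketched like this, assuming the black shape has been rasterized to a boolean numpy bitmap (all names are illustrative, and the grid over positions and angles inherits exactly the finite-sampling problem described above):

```python
import numpy as np

def coverage(black, rect_w, rect_h, x, y, theta, samples=32):
    """Fraction of a rect_w x rect_h rectangle, centred at (x, y) and
    rotated by theta, that lands on black pixels of the bitmap."""
    u = np.linspace(-rect_w / 2, rect_w / 2, samples)
    v = np.linspace(-rect_h / 2, rect_h / 2, samples)
    uu, vv = np.meshgrid(u, v)                 # sample grid in rect coords
    c, s = np.cos(theta), np.sin(theta)
    px = np.round(x + c * uu - s * vv).astype(int)
    py = np.round(y + s * uu + c * vv).astype(int)
    ok = (px >= 0) & (px < black.shape[1]) & (py >= 0) & (py < black.shape[0])
    # out-of-bounds samples simply count as "not black"
    return black[py[ok], px[ok]].sum() / uu.size

# brute force: maximize coverage(...) over a grid of (x, y, theta)
```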
Can you think of a more efficient way of tackling this problem?
Something from my Uni days tells me that a Fourier transform might improve the efficiency here, but I can't figure out how I'd do that with a vector shape!
Three ideas that show promise of being faster and/or more precise than a brute-force search:
Suppose you have a 3D physics engine. Define a "cone-shaped" surface where the apex is at, say, (0,0,-1), the black polygon boundary lies on the z=0 plane with its centroid at the origin, and the cone surface is formed by connecting the apex with semi-infinite rays through the polygon boundary. Think of a party hat turned upside down and crumpled to the shape of the black polygon. Now constrain the rectangle to be parallel to the z=0 plane, and initially so high above the cone (large z value) that it's easy to find a place where it's definitely "inside". Then let the rectangle fall downward under gravity, twisting about z and translating in x-y only as it touches the cone, staying inside all the way down until it settles and can't move any farther. The collision detection and force resolution of the physics engine takes care of the complexities.
When it settles, it will be in a position of maximal coverage of the black polygon in a local sense. (If it settles with z<0, then coverage is 100%.) For the convex case it's probably a global maximum. To probabilistically improve the result for non-convex cases (like your example), you'd randomize the starting position, dropping the polygon many times and taking the best result.
Note you don't really need a full-blown physics engine (though they certainly exist in open source). It's enough to use collision resolution to tell you how to rotate and translate the rectangle in a pseudo-physical way as it twists and slides uniformly down the z axis as far as possible.
Different physics model. Suppose the black area is an attractive field generator in 2D, following the usual inverse-square rule like gravity and magnetism. Now let the rectangle drift in a damping medium, responding to this field. It ought to settle with a maximal area overlapping the black area. There are problems with "nulls", like at the center of a donut, but I don't think these can ever be stable equilibria. Can they?
The simulation could easily be done by modeling both shapes as particle swarms. Or, since the rectangle is a simple shape and you are a physicist, you could come up with a closed form for the integral of attractive force between a point and the rectangle; that way only the black shape needs representation as particles. Come to think of it, if you can come up with a closed form for the torque and linear attraction due to two triangles, then you can decompose both shapes with a (e.g. Delaunay) triangulation and get a precise answer. Unfortunately, this discussion implies it can't be done analytically, so particle clouds may be the final solution. The good news is that modern processors, particularly GPUs, do very large particle computations with amazing speed.
Edit: I implemented this quick and dirty. It works great for convex shapes, but concavities create stable points that aren't what you want.
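A quick-and-dirty sketch of that particle-swarm simulation (not the poster's actual implementation; both shapes are assumed pre-sampled as (n, 2) numpy point clouds, and the step size and damping are illustrative, untuned values):

```python
import numpy as np

def force_and_torque(rect_pts, field_pts, eps=1e-6):
    """Inverse-square attraction on the rectangle's particles from the
    black shape's particles: net force, plus torque about the centroid."""
    centroid = rect_pts.mean(axis=0)
    d = field_pts[None, :, :] - rect_pts[:, None, :]   # (nr, nf, 2) offsets
    r2 = (d ** 2).sum(axis=2) + eps                    # squared distances
    f = (d / r2[:, :, None] ** 1.5).sum(axis=1)        # 1/r^2 falloff, (nr, 2)
    force = f.sum(axis=0)
    arm = rect_pts - centroid
    torque = np.sum(arm[:, 0] * f[:, 1] - arm[:, 1] * f[:, 0])  # 2D cross
    return force, torque

def settle(rect_pts, field_pts, steps=2000, dt=1e-3, damping=0.95):
    """Damped drift: step until the motion dies out, return final points."""
    vel, omega = np.zeros(2), 0.0
    for _ in range(steps):
        force, torque = force_and_torque(rect_pts, field_pts)
        vel = damping * (vel + dt * force)
        omega = damping * (omega + dt * torque)
        c = rect_pts.mean(axis=0)
        a = dt * omega
        rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
        rect_pts = (rect_pts - c) @ rot.T + c + dt * vel
    return rect_pts
```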
This problem is related to robot path planning; looking at that literature may turn up some ideas. In RPP you have obstacles and a robot, and you want to find a path the robot can travel while avoiding and/or sliding along them. If the robot is asymmetric and can rotate, then 2D planning is done in a 3D (toroidal) configuration space (C-space), where one dimension is rotation (and so closes on itself). The idea is to "grow" the obstacles in C-space while shrinking the robot to a point. Growing the obstacles is achieved by computing Minkowski differences. If you decompose all polygons into convex shapes, then there is a simple "edge merge" algorithm for computing them (sketched below).
When the C-space representation is complete, any 1D path that does not pierce the "grown" obstacles corresponds to a continuous translation/rotation of the robot in world space that avoids the original obstacles. For your problem the white area is the obstacle and the rectangle is the robot, and you're looking for any open point at all; that would correspond to 100% coverage. For the less-than-100% case, the C-space would have to hold a function on 3D that reflects how "bad" the intersection of the robot with the obstacle is, rather than just a binary value, and you'd look for the least bad point. C-space representation is an open research topic. An octree might work here.
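The edge merge itself is a standard construction (this is a generic sketch, not code from any cited paper): for two convex, counter-clockwise polygons, the sum's boundary consists of both edge sets merged in angular order; the grown obstacle at a fixed rotation is then minkowski_sum(obstacle, robot reflected through the origin).

```python
def minkowski_sum(P, Q):
    """Minkowski sum of two convex CCW polygons given as (x, y) lists."""
    def reorder(V):
        # start from the bottom-most (then left-most) vertex
        k = min(range(len(V)), key=lambda i: (V[i][1], V[i][0]))
        return V[k:] + V[:k]

    P, Q = reorder(P), reorder(Q)
    P, Q = P + P[:2], Q + Q[:2]   # duplicate two vertices for wrap-around
    out, i, j = [], 0, 0
    while i < len(P) - 2 or j < len(Q) - 2:
        out.append((P[i][0] + Q[j][0], P[i][1] + Q[j][1]))
        # the cross product compares the two next edge directions;
        # advance whichever polygon's edge comes first in angular order
        cross = ((P[i + 1][0] - P[i][0]) * (Q[j + 1][1] - Q[j][1])
                 - (P[i + 1][1] - P[i][1]) * (Q[j + 1][0] - Q[j][0]))
        if cross >= 0 and i < len(P) - 2:
            i += 1
        if cross <= 0 and j < len(Q) - 2:
            j += 1
    return out
```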
Lots of details to think through in both cases, and they may not pan out at all, but at least these are frameworks for thinking further about the problem. The physics idea is a bit like using simulated spring systems for graph layout, which has been very successful.
I don't believe it is possible to find the precise maximum for this problem, so you will need to make do with an approximation.
You could potentially render the vector image into a bitmap and use Haar-like features for this: computed from an integral image, they provide a very quick O(1) way of calculating the average colour of a rectangular region.
You'd still need to evaluate this multiple times for different rotations and positions, but it would bring the algorithmic complexity down from a naive O(n^5) to O(n^3), which may be acceptably fast (with n here being the size of each of the degrees of freedom you are scanning).
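A sketch of the summed-area table behind that O(1) query (axis-aligned rectangles only; each rotation would need its own rendering or a rotated-integral-image variant):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column prepended."""
    S = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    S[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return S

def rect_sum(S, top, left, bottom, right):
    """Sum of img[top:bottom, left:right], in O(1) per query."""
    return S[bottom, right] - S[top, right] - S[bottom, left] + S[top, left]
```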
Have you thought of keeping track of the remaining white space inside the blocks, with something like if whitespace !== 0?
I have an implicit scalar field defined in 2D. For every point in 2D I can compute an exact scalar value, but it is a somewhat complex computation.
I would like to draw an iso-line of that field, say the line of the '0' value. The function itself is continuous, but the '0' iso-line can have multiple continuous components, and it is not guaranteed that all of them are connected.
Calculating the value for each pixel is not an option, because that would take too much time - on the order of a few seconds - and this needs to be as close to real time as possible.
What I'm currently using is a recursive division of space, which can be thought of as a kind of quad-tree. I take an initial, very coarse sampling of the space, and if I find a square which contains a transition from positive to negative values, I recursively divide it into 4 smaller squares and check again, stopping at the pixel level. The positive-negative transition is detected by sampling a square at its 4 corners.
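In sketch form, that subdivision looks roughly like this (a minimal illustration of the scheme just described; f is the scalar field, min_size the pixel size, and all names are illustrative):

```python
def trace(f, x, y, size, min_size, out):
    """Recursively subdivide the square [x, x+size] x [y, y+size] and
    collect the pixel-level cells whose corner signs disagree."""
    corners = (f(x, y), f(x + size, y),
               f(x, y + size), f(x + size, y + size))
    if all(c > 0 for c in corners) or all(c < 0 for c in corners):
        return  # no transition detected: this is where thin features slip by
    if size <= min_size:
        out.append((x, y))  # pixel-level cell crossed by the iso-line
        return
    h = size / 2
    for dx in (0, h):
        for dy in (0, h):
            trace(f, x + dx, y + dy, h, min_size, out)
```

(A real implementation would cache the corner samples, since neighbouring squares share them.)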
This works fairly well, except when it doesn't. The iso-lines which are drawn sometimes get cut, because the transition detection fails for transitions that happen over a small part of an edge and don't cross a corner of a square.
Is there a better way to do iso-line drawing in this setting?
I've had a lot of success with the algorithms described at http://web.archive.org/web/20140718130446/http://members.bellatlantic.net/~vze2vrva/thesis.html, which discuss adaptive contouring (similar to what you describe) and also some other issues with contour plotting in general.
There is no general way to guarantee finding all the contours of a function without looking at every pixel. There could be a very small closed contour: a region only about the size of a pixel where the function is positive, inside a region where the function is generally negative. Unless you sample finely enough to place a sample inside the positive region, there is no general way of knowing that it is there.
If your function is smooth enough, you may be able to guess where such small closed contours lie, because the modulus of the function gets small in a region surrounding them. The sampling could then be refined in these regions only.
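One concrete way to encode that refinement, assuming you can estimate a Lipschitz bound L on your function (an assumption; it would have to come from your formula): discard a cell only when no corner value is small enough for the function to reach zero anywhere inside it.

```python
import math

def may_contain_zero(corner_values, size, L):
    """Conservative cell test. The cell might contain a zero crossing if
    the corner signs differ, or if some corner value is so small that a
    function with Lipschitz constant L could reach zero within the cell
    (no interior point is farther than half a diagonal from a corner)."""
    if max(corner_values) > 0 and min(corner_values) < 0:
        return True
    reach = L * size * math.sqrt(2) / 2   # half the cell diagonal, scaled
    return min(abs(v) for v in corner_values) <= reach
```

Subdividing whenever may_contain_zero is true, instead of only on sign changes, makes the search conservative at the cost of extra evaluations wherever the function is merely small.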