Recently I had much fun with the Laplacian Pyramid algorithm (http://persci.mit.edu/pub_pdfs/pyramid83.pdf). But one big problem is that the original paper is limited to images of size (2^m+1) x (2^n+1). My question is: what is the best way to deal with arbitrary w x h images instead? I can think of a couple of options:
Upsample the input to the next (2^m+1) x (2^n+1) size up front
Pad even lines. How exactly? Wouldn't it shift the signal?
Shift even lines by half a sample? Wouldn't it lose half a sample?
Does anybody have experience with this? What is the most practical and efficient approach? Also any pointers to papers dealing with this would be very welcome.
One approach is to create an image whose width and height are the next (2^m+1) x (2^n+1) size, but instead of up-sampling the image to fill the expanded dimensions, just place it in the top-left corner and fill the empty space to the right and below with a constant value (the average value for the image is a good choice for this). Then encode in the normal way, storing the original image dimensions along with the pyramid. When decoding, decode and then crop to the original size.
This won't introduce any visual artifacts or degradation because you aren't stretching or offsetting the image in any way.
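As a rough sketch of the pad-then-crop idea, assuming a single-channel image held in a NumPy array (the function names here are made up for illustration and are not from the paper):

```python
import numpy as np

def next_pyramid_size(n):
    """Smallest value of the form 2**k + 1 that is >= n."""
    k = 0
    while 2 ** k + 1 < n:
        k += 1
    return 2 ** k + 1

def pad_for_pyramid(image):
    """Put the image in the top-left corner of a (2^m+1) x (2^n+1) canvas
    filled with the image mean. Returns the canvas and the original shape."""
    h, w = image.shape
    canvas = np.full((next_pyramid_size(h), next_pyramid_size(w)),
                     image.mean(), dtype=np.float64)
    canvas[:h, :w] = image
    return canvas, (h, w)

def crop_after_decode(decoded, original_shape):
    """Cut the decoded canvas back to the stored original dimensions."""
    h, w = original_shape
    return decoded[:h, :w]
```

The original shape returned by `pad_for_pyramid` is what would be stored alongside the pyramid so that the decoder knows where to crop.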
Because the empty space to the right and below the original image is a constant value, the high-pass bands at each level in the image pyramid will be all zero in this area. So if you are using a compression scheme like run length encoding to store each level, this will be automatically taken care of and these areas will be compressed to almost nothing. If not, then you can simply store the top-left (potentially non-zero) area of each level and then fill out the rest with zeros when decoding.
You could find the min and max x and y bounding rectangle of the non-zero values for each level and store this along with the level, cropped to include only non-zero values. The decoder could also be optimized so that areas of the image that are going to be cropped away are not actually decoded in the first place, by only processing the top-left of each level.
Here's an illustration of the technique:
Instead of just filling the lower-right area with a flat color, you could fill it with horizontally and vertically mirrored copies of the image to the right and below, and a copy mirrored in both directions to the bottom-right, like this:
This avoids the discontinuity in value that the first technique has, although there will still be a discontinuity in dx (e.g. if the value was gradually increasing from left to right, it will suddenly be decreasing). Choosing an extension that keeps dx constant and ddx zero, i.e. linearly extrapolating the values, avoids this derivative discontinuity as well.
Another technique, which is similar to what some JPEG encoders do to pad out an image to a whole number of MCU blocks, is to take the last pixel value of each row and repeat it, and likewise for columns, with the bottom-right-most pixel of the image used to fill the bottom-right area:
This last technique could easily be modified to extrapolate the gradient of values or even the gradient of gradients instead of just repeating the same value for the remainder of the row or column.
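If the image lives in a NumPy array, the fill strategies above map fairly directly onto modes of `numpy.pad`. This is only a sketch on a single row of made-up values; the odd reflection shown last is one way to keep the slope continuous across the edge, not necessarily the exact extrapolation meant above:

```python
import numpy as np

row = np.array([[1.0, 2.0, 4.0, 7.0]])
pad = ((0, 0), (0, 3))  # add no rows, add three columns on the right

# Mirrored copy of the image (the edge pixel is repeated at the seam).
mirrored = np.pad(row, pad, mode="symmetric")

# Repeat the last pixel of each row/column, as in the JPEG-style padding.
repeated = np.pad(row, pad, mode="edge")

# Odd reflection: 2 * edge value - mirrored value, so the gradient
# continues across the edge instead of flipping sign.
odd = np.pad(row, pad, mode="reflect", reflect_type="odd")

print(mirrored)  # [[1. 2. 4. 7. 7. 4. 2.]]
print(repeated)  # [[1. 2. 4. 7. 7. 7. 7.]]
print(odd)       # [[ 1.  2.  4.  7. 10. 12. 13.]]
```

For a 2-D image the same call with a non-zero row padding also fills the bottom-right corner consistently with the chosen mode.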
There was a gif on the internet where someone used some sort of CAD software and drew multiple vector pictures in it. In the first frame they zoom in on a tiny dot, revealing a whole new vector picture at a different scale, and then they zoom in further on another tiny dot, revealing another detailed picture, repeating this several times. Here is the link to the gif.
Or another similar example: imagine you have a time-series with a granularity of a millisecond per sample and you zoom out to reveal years-worth of data.
My question is: how does such finely detailed data end up being rendered when a huge amount of it gets aliased into a single pixel?
Do you have to go through the whole dataset to render that pixel (i.e. in the case of a time-series, go through a million records just to average them out into one line, or in the case of CAD, render the whole vector picture and blur it into a tiny dot), or are there certain level-of-detail optimizations that can be applied so that you don't have to do this?
If so, how do they work and where can one learn about them?
This is a very well known problem in games development. In the following I am assuming you are using a scene graph, a node-based tree of objects.
Typical solutions involve a mix of these techniques:
Level Of Detail (LOD): multiple resolutions of the same model, which are shown or hidden so that only one is "visible" at any time. When to hide and show is usually determined by the distance between camera and object, but you could also include the scale of the object as a factor. Modern 3d/CAD software will sometimes offer you automatic "simplification" of models, which can be used as the low res LOD models.
At the lowest level, you could even just use the object's bounding box. Checking whether a bounding box is in view is only around 1-7 point checks depending on how you check. And you can utilise object parenting for transitive bounding boxes.
Clipping: if a polygon is not rendered in the view port at all, no need to render it. In the GIF you posted, when the camera zooms in on a new scene, what is left from the larger model is a single polygon in the background.
Re-scaling of world coordinates: as you zoom in, the coordinates for vertices become very small fractional floating point numbers. Given you want all coordinates as precise as possible, and given modern CPUs natively handle floats with at most 64 bits of precision (and often use only 32 for better performance), it's a good idea to reset the scaling of the visible objects. What I mean by that is that as your camera zooms in to, say, 1/1000 of the previous view, you can scale up the bigger objects by a factor of 1000, and at the same time adjust the camera position and focal length. Any newly attached small model would use its original scale, thus preserving its precision.
This transition would be invisible to the viewer, but allows you to stay within well-defined 3d coordinates while being able to zoom in infinitely.
On a higher level: As you zoom into something and the camera gets closer to an object, it appears as if the world grows bigger relative to the view. While normally the camera space is moving and the world gets multiplied by the camera's matrix, the same effect can be achieved by changing the world coordinates instead of the camera.
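To make the re-scaling idea concrete, here is a deliberately simplified 2-D sketch. It assumes a camera described only by a centre point and a zoom factor, which is far less than a real scene graph has, and the names `to_screen` and `rebase` are hypothetical:

```python
import numpy as np

def to_screen(points, cam_center, zoom):
    """Project world points to screen space for a simple 2-D camera."""
    return (points - cam_center) * zoom

def rebase(points, cam_center, zoom, factor):
    """Re-express the scene so the camera sits at the origin and the world
    is `factor` times larger; the zoom shrinks by the same factor."""
    new_points = (points - cam_center) * factor
    new_center = np.zeros_like(cam_center)
    return new_points, new_center, zoom / factor

points = np.array([[1.0005, 2.0003], [1.0007, 2.0001]])
cam = np.array([1.0006, 2.0002])
before = to_screen(points, cam, 1e6)
new_points, new_cam, new_zoom = rebase(points, cam, 1e6, 1000.0)
after = to_screen(new_points, new_cam, new_zoom)
```

Here `before` and `after` agree to within floating point rounding, so the viewer sees no change, while the rebased coordinates are larger numbers centred on the origin.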
First, you can use caching, with tiles, as is done in cartography. You'll still need to go over all the points once, but after that you'll be able to zoom in and out quite rapidly.
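For the time-series case, one possible form of such a cache is a pyramid of precomputed minimum and maximum values, where each level covers twice as many samples per entry as the one below. This is only a sketch under that assumption (min/max aggregation and the function name are my choice, not something prescribed above):

```python
import numpy as np

def build_minmax_pyramid(samples):
    """Level 0 is the raw data; every further level halves the length by
    storing the min and max of each pair of entries from the level below."""
    lo = hi = np.asarray(samples, dtype=float)
    levels = [(lo, hi)]
    while len(lo) > 1:
        if len(lo) % 2:  # odd length: repeat the last entry so pairs line up
            lo = np.append(lo, lo[-1])
            hi = np.append(hi, hi[-1])
        lo = lo.reshape(-1, 2).min(axis=1)
        hi = hi.reshape(-1, 2).max(axis=1)
        levels.append((lo, hi))
    return levels

levels = build_minmax_pyramid([3, 1, 4, 1, 5, 9, 2, 6])
```

When drawing, you would pick the level whose entries are roughly one pixel wide and draw a vertical line from min to max for each entry, so a zoomed-out view touches only a few values per pixel.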
But if you don't have extra memory for a cache (it doesn't need much, far less than the data itself), or don't have time to go over all the points, you can use a probabilistic approach.
It can be as simple as peeking at only every other point (or every 10th point, or whatever suits you). It yields decent results for some data. Again, in cartography it works quite well for shorelines, but not so well for houses or administrative borders, or anything with a lot of straight lines.
Or you can take a more hardcore probabilistic approach: randomly peek at some points, and if, for example, 100 data points hit pixel one and only 50 hit pixel two, then you can more or less safely assume that if you continue to peek at points, pixel one will still be twice as likely to be hit as pixel two. So you can just stop there and draw pixel one with a color twice as heavy.
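A minimal sketch of that random-sampling idea, assuming one-dimensional data already normalised to the range [0, 1) and a single row of pixels (the function name and parameters are illustrative only):

```python
import numpy as np

def estimate_pixel_hits(xs, n_pixels, n_samples, rng):
    """Estimate relative per-pixel hit counts from a random subset of xs.
    Returns values in [0, 1], where 1 is the most heavily hit pixel."""
    picked = rng.choice(xs, size=n_samples, replace=False)
    pixels = np.minimum((picked * n_pixels).astype(int), n_pixels - 1)
    counts = np.bincount(pixels, minlength=n_pixels)
    return counts / counts.max()

rng = np.random.default_rng(0)
xs = rng.random(100_000) ** 2          # made-up data, denser near 0
shade = estimate_pixel_hits(xs, n_pixels=4, n_samples=2_000, rng=rng)
```

Only 2,000 of the 100,000 points are examined, yet the relative shading of the pixels comes out close to what a full pass would give.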
Also consider how much data you can and want to put in a pixel. If you draw a pixel in 8-bit greyscale, there are only 256 possible shades, so you don't need to be more precise than that. And if you're going to draw a pixel in full color, you still need to ask yourself: will anyone notice the difference between something like rgb(123,12,54) and rgb(123,11,54)?
I'm trying to figure out how to automatically adjust the maximum iteration value when moving around in the Mandelbrot fractal.
All examples I've found use a constant of 1000 or less, but that's not enough when zooming into the fractal set.
Is there a way to determine the number of max_iterations based on for example where you are in the Mandelbrot space (x_start,x_end,y_start,y_end)?
One method I tried was to repetitively pre-process a small area in the region of the Mset boundary with increasing iterations until the percentage change in status from one repetition to the next was small. The problem was, that would vary in different places on the current map, since the "depth" varies across it. How to find the right place to do it? By logging the "deepest" boundary area during the previous generation (that will still be within the next zoom area).
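A rough sketch of that pre-processing idea follows. It assumes a small grid of complex sample points `c` taken from the region of interest, doubles the iteration limit each round, and stops when few points change status; the doubling schedule, the tolerance and the function names are my assumptions rather than the method actually used:

```python
import numpy as np

def escaped_mask(c, max_iter):
    """True where the orbit of z -> z*z + c leaves |z| <= 2 within max_iter."""
    z = np.zeros_like(c)
    escaped = np.zeros(c.shape, dtype=bool)
    for _ in range(max_iter):
        active = ~escaped            # escaped points are frozen
        z[active] = z[active] ** 2 + c[active]
        escaped |= np.abs(z) > 2.0
    return escaped

def choose_max_iter(c, start=64, limit=65536, tolerance=0.001):
    """Double the limit until doubling changes fewer than `tolerance`
    of the sample points, then return the smaller limit."""
    max_iter = start
    previous = escaped_mask(c, max_iter)
    while max_iter * 2 <= limit:
        current = escaped_mask(c, max_iter * 2)
        if np.mean(current != previous) < tolerance:
            return max_iter
        max_iter *= 2
        previous = current
    return max_iter
```

As noted above, the result depends heavily on where the sample grid is placed, so it would need to sit on the deepest boundary area of the current view.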
But my best strategy was to avoid iterating wherever possible:
Away from the boundary of the Mset, areas of equal depth can be "contoured" and then filled with that depth. It was not an easy algorithm. Basically I followed a raster scan, but when I detected a boundary of iteration change (examining all the neighbours to ensure I wasn't close to the edge of the Mset), I would switch to a curve-stitching method to iterate around a contour back to where it started (obviously not recalculating spots I already did), and then make a second pass filling in the raster lines within the contour with the iteration level. It was fraught with leaks but eventually I cracked it.
Within the Mset, I followed the same approach, because the very last thing you want to do is to plough across vast areas and hit the iteration limit.
The difficult area is close to the boundary, where the iteration results can't be related to smooth contours with the neighbours. The contour stitching method won't work here, since there is only ever 1 pixel of a particular depth.
Using the contour method also will have faults to the lower or Mset sides of this region, but since this area looks chaotic until you zoom deeper, I lived with that.
So having said all that, I simply set the iteration depth as high as I can tolerate, but perhaps you can combine my first paragraph with the area-filling techniques.
BTW colouring the region adjacent to the Mset looks terrible when an animated smooth playback of the zoom is attempted. For that reason I coloured this area in a grey scale, by comparing with neighbours. If there was too much difference, I coloured to 0x808080 at first, then adapted that depending on the predominance of the neighbours' depth. All requiring fine tuning!
I have a graph like:
I would like to generate a set of (x,y) pairs that correspond to points of this graph.
Maybe one for each horizontal pixel.
How would I go about doing this?
If I had the image in uncompressed bitmap format, maybe cropped to the actual graph, I could examine each vertical strip for the blackest point...
I would prefer to work in Python, but I'm interested in any technique.
I answered a question like this a while back. It should be fairly easy to detect the grid, and from there you can get the pixel's coordinates relative to the grid. However, it wasn't clear how to extract the numbers, which you need to do in order to get the scale of the grid. It might be possible fairly easily if you can match the font and font size (which might be possible via scaling). Otherwise, you'd have to enter the numbers manually.
To extract the grid, you'd start from the top right and move diagonally until you find the start of the grid. From there you can follow the vertical and horizontal lines (of the grid) until they end. This should allow you to say with fairly high probability where the outer rectangle of the grid is and what the x and y intervals of the grid are in terms of pixels. The blackest parts within the grid should do for finding the curve, but it may require some interpolation depending on how many data points you need/want.
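The "blackest point per vertical strip" step could look roughly like this in Python. The sketch assumes the image has already been loaded as a 2-D greyscale NumPy array (0 = black), cropped to the plot area, and that the axis ranges have been entered by hand; `trace_curve` is a made-up name:

```python
import numpy as np

def trace_curve(gray, x_range, y_range):
    """Return one (x, y) pair per pixel column, in data units, taking the
    darkest pixel of each column as the position of the curve."""
    height, width = gray.shape
    rows = gray.argmin(axis=0)        # darkest row in every column
    cols = np.arange(width)
    x0, x1 = x_range
    y0, y1 = y_range
    xs = x0 + cols / (width - 1) * (x1 - x0)
    ys = y1 - rows / (height - 1) * (y1 - y0)   # row 0 is the top of the image
    return xs, ys

# Tiny synthetic "graph": a black diagonal line on a white background.
gray = np.full((5, 5), 255)
for i in range(5):
    gray[4 - i, i] = 0
xs, ys = trace_curve(gray, x_range=(0, 4), y_range=(0, 4))
```

Grid lines that are as dark as the curve would confuse this, so they would have to be masked out first using the detected grid positions.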
It also may be useful to look into techniques for reversing anti-aliasing effects. Although, the uncompressed bitmap image may not need it.
I've got an array of different sized images. I want to place these images on a canvas in a sort of automated collage.
Does anyone have an idea of how to work the logic behind this concept?
All my images have heights divisible by 36 pixels and widths divisible by 9 pixels. They have mouseDown functions that allow you to drag and drop. When dropped, the image goes to the closest x point divisible by 9 and y point divisible by 36. There is a grid drawn on top of the canvas.
I've sorted the array of images based on height, then based on their widths.
imagesArray.sortOn(["height", "width"], [Array.NUMERIC | Array.DESCENDING, Array.NUMERIC | Array.DESCENDING]);
I'd like to take the largest image ( imagesArray[0] ) to put in corner x,y = 0,0. Then randomize the rest of the images and fit them into the collage canvas.
What you are trying to do sounds like treemapping.
I think this is what's known as a "Packing problem" or maybe a "2D bin packing problem". Googling those should find you some information. Doing it efficiently is not a simple task. If you only have a small number of images, the easy methods would be:
Random...just randomly place images until no more can fit. Run this random placement 10..100..1000 or more times, and pick the best result (where "best" is determined by some criteria like least amount of wasted space, or most pictures fit, etc)
Brute force...try every single possible combination, one by one, and pick the "best" one. The downside to this method is that as the number of items scales up, the amount of computation scales up very quickly.
I researched treemapping and packing problems.
.... and eventually decided to create an array of all the points on the canvas, then assign them a value of empty. I then looped through my array of images and placed them on the points that were "empty" and reassigned all the points it occupied with the source name of the image. It worked beautifully. But definitely takes time to create the array.
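The original was done in ActionScript, but the same idea can be sketched in Python. This version assumes sizes are given in grid cells (e.g. width / 9 and height / 36) and places each image at the first free spot scanning left to right, top to bottom; the scan order and the name `pack_first_fit` are assumptions, not the poster's actual code:

```python
import numpy as np

def pack_first_fit(sizes, canvas_cols, canvas_rows):
    """sizes: list of (cols, rows) per image, in grid cells.
    Returns one (col, row) placement per image, or None if it did not fit."""
    occupied = np.zeros((canvas_rows, canvas_cols), dtype=bool)
    placements = []
    for cols, rows in sizes:
        spot = None
        for r in range(canvas_rows - rows + 1):
            for c in range(canvas_cols - cols + 1):
                if not occupied[r:r + rows, c:c + cols].any():
                    spot = (c, r)
                    break
            if spot is not None:
                break
        if spot is not None:
            c, r = spot
            occupied[r:r + rows, c:c + cols] = True   # mark cells as taken
        placements.append(spot)
    return placements

placements = pack_first_fit([(2, 2), (2, 1), (1, 1), (4, 1)],
                            canvas_cols=4, canvas_rows=2)
print(placements)  # [(0, 0), (2, 0), (2, 1), None]
```

Working in grid cells rather than pixels keeps the occupancy array small, which should reduce the setup time mentioned above.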
I did a different take on that: I just fit all images to a tile size and tile them into a document.
Images are virtually center-cropped to the tile size via a layer mask.
Paste Image Roll Script http://www.mouseprints.net/old/dpr/PasteImageRoll.html
http://www.mouseprints.net/old/dpr/PasteImageRoll.jsx
I have some kind of a shape consisting of vertical, horizontal and diagonal lines. I have starting X,Y and ending X,Y (this is my input - just 2 points defining a line) of each line and I would like to make the whole shape scalable (just by changing the value of a scale ratio variable), so that I can still preserve the proper connection of the lines and the proportions as well. Just for getting a better idea of what I mean: it'd be as if I had the same lines in a vector editor.
Would that be possible with an algorithm, and could you please give me another possible solution if there is no such algorithm?
Thank you very much in advance!
What point do you want it to scale about? You could scale relative to the first point, the center, or some other arbitrary location. Typically, you subtract out an offset (for instance the first point in your input), multiply by a scale factor, and then add back the offset.
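In code, the subtract, multiply, add-back steps might look like the following sketch, which assumes each line is stored as a pair of (x, y) tuples (the function name is hypothetical):

```python
def scale_lines(lines, scale, origin):
    """Scale every endpoint about `origin`: subtract the offset,
    multiply by the scale factor, then add the offset back."""
    ox, oy = origin

    def scale_point(point):
        x, y = point
        return (ox + (x - ox) * scale, oy + (y - oy) * scale)

    return [(scale_point(start), scale_point(end)) for start, end in lines]

lines = [((1, 1), (3, 1)), ((3, 1), (3, 4))]
scaled = scale_lines(lines, scale=2, origin=(1, 1))
print(scaled)  # [((1, 1), (5, 1)), ((5, 1), (5, 7))]
```

Because every endpoint goes through the same mapping, lines that shared an endpoint before scaling still share it afterwards, so the shape stays connected and keeps its proportions.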
A more systematic approach in computer graphics would be to use a transformation matrix... although that's probably overkill in your case.
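For completeness, a sketch of the matrix version using NumPy and 3x3 homogeneous matrices; the helper names `scale_about` and `apply_transform` are made up for this example:

```python
import numpy as np

def scale_about(origin, s):
    """3x3 homogeneous matrix: translate origin to (0, 0), scale, translate back."""
    ox, oy = origin
    to_origin = np.array([[1, 0, -ox], [0, 1, -oy], [0, 0, 1]], dtype=float)
    scale = np.diag([s, s, 1.0])
    back = np.array([[1, 0, ox], [0, 1, oy], [0, 0, 1]], dtype=float)
    return back @ scale @ to_origin

def apply_transform(matrix, points):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of points."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.column_stack([pts, np.ones(len(pts))])
    return (homogeneous @ matrix.T)[:, :2]

result = apply_transform(scale_about((1, 1), 2), [(3, 1), (3, 4)])
print(result)  # [[5. 1.]
               #  [5. 7.]]
```

The advantage over the direct formula is that rotations and translations can be folded into the same matrix by multiplying further matrices onto it.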