I want to compose/combine different images to one image.
Specific: I have images of individual coins.
Now I want an image with, e.g., 20 coins in one image (out of the images with the individual coins). The coins are not allowed to overlap or to be cut off. The coins should be placed randomly in the image. Next, the coins should also be rotated (this means not only quadratic images).
Example single coin
Goal: Image with many coins
I'm using the R library "magick" to read/write/transform images.
I've tried "append", "mosaic", "montage", and "image_composite".
The only function that nearly got the right output was "image_composite", but I don't know how to handle the offset as the coins should be placed randomly but are not allowed to overlap.
(Example, images are already loaded:)
img = background
img = image_composite(img, coin1, offset = "+100+100")
img = image_composite(img, coin2, offset = "+644+100")
print(img)
Which function could I use instead or how could I handle the offset (e.g., like a map of "used" space where I can determine free space for the next coin)?
Thank you in advance.
Related
Recently I had much fun with the Laplacian Pyramid algorithm (http://persci.mit.edu/pub_pdfs/pyramid83.pdf). But one big problem is that the original paper is limited to 2^m+1*2^n+1 images. My question is: What is the best way to deal with arbitrary w*h instead? I can think of a couple of options:
Up sample the input to the next 2^m+1,2^n+1 up front
Pad even lines. How exactly? Wouldn't it shift the signal?
Shift even lines by half a sample? Wouldn't it loose half a sample?
Does anybody have experience with this? What is the most practical and efficient approach? Also any pointers to papers dealing with this would be very welcome.
One approach is to create an image with a width and height equal to the next 2^m+1,2^n+1, but instead of up-sampling the image to fill the expanded dimensions, just place it in the top-left corner and fill the empty space to the right and below with a constant value (the average value for the image is a good choice for this). Then encode in the normal way, storing the original image dimensions along with the pyramid. When decoding, decode and then crop to the original size.
This won't introduce any visual artifacts or degradation because you aren't stretching or offsetting the image in any way.
Because the empty space to the right and below the original image is a constant value, the high-pass bands at each level in the image pyramid will be all zero in this area. So if you are using a compression scheme like run length encoding to store each level this will be automatically taken care off and these areas will be compressed to almost nothing. If not then you can simply store the top-left (potentially non-zero) area of each level and then fill out the rest with zeros when decoding.
You could find the min and max x and y bounding rectangle of the non-zero values for each level and store this along with the level, cropped to include only non-zero values. The decoder could also be optimized so that areas of the image that are going to be cropped away are not actually decoded in the first place, by only processing the top-left of each level.
Here's an illustration of the technique:
Instead of just filling the lower-right area with a flat color, you could fill it with horizontally and vertically mirrored copies of the image to the right and below, and a copy mirrored in both directions to the bottom-right, like this:
This will avoid the discontinuities of the first technique, although there will be a discontinuity in dx (e.g. if the value was gradually increasing from left to right it will suddenly be decreasing). Choosing a mirror that keeps dx constant and ddx zero will avoid this second-order discontinuity by linearly extrapolating the values.
Another technique, which is similar to what some JPEG encoders do to pad out an image to a whole number of MCU blocks, is to take the last pixel value of each row and repeat it, and likewise for columns, with the bottom-right-most pixel of the image used to fill the bottom-right area:
This last technique could easily be modified to extrapolate the gradient of values or even the gradient of gradients instead of just repeating the same value for the remainder of the row or column.
My project is a large map that can be panned around containing "info spots" that can be clicked. For now I use four large images, each spans 5000x5000 pixels (so total map size is 20'000x20'000 pixel). On my AMD Phenom 9950 Quad-Core with 8GB RAM and an NVIDIA GeForce 610 this takes a certain while to load while it's quite fast afterwards when panning the image. I tried tiling it up but there's no visible enhancement in loading speed as the image still has to be loaded completely before it's separated into tiles.
The only way to have some real improvement on speed and memory usage would be, to only load those parts of the map image that are actually shown.
Does PyGame offer any way of doing so? I'm thinking of a "theoretical" tile map which contains the needed x and y values of each tile (I group them a little, less to compute each frame) and a theoretical image information (like: which image and which position therein). Only when a tile comes near the visible part of the screen, its image information is loaded, otherwise it remains a number and string value.
Would this make any sense? Is there any way to achieve this?
The only way to accomplish this with Pygame would be to break the images themselves into smaller squares (say 250x250), and then, as the user pans, just get the current topleft x,y coordinates, as well as the screen size, and load any tiles that fit into that screen or around the buffer edge into memory, and clear out any others that are outside that range. The math will be fairly straightforward unless you add support for rotation and/or zooming. I would name the tiles after their location as a multiple of the square size (for example the tile at 500, 500 would be named 2-2.png). This will make it very trivial to generate the tile name that you need to load at each location - take the current x/y coordinates, integer divide by 250, subtract the buffer tile amount, and then loop by your screen width integer divided by 250 plus 1 plus the buffer tile amount for each row. Do that loop for each column.
After reading #lukevp 's reply, I was interested and tried this:
http://imgur.com/Q1N2UtU
Go get this image and create a folder named 'test_data'. Now place this image, code outside of test_data folder and run. The output would be cropped figures named as per their order (It's a bit off on the edges as the image is 1920 * 1080). You can try it with your custom size tho. Also note that I am on ubuntu so take care to appropriate paths.
OUTPUT: http://imgur.com/2v4ucGI Final link: http://imgur.com/a/GHc9l
import pygame, os
pygame.init()
original_image = pygame.image.load('test_pic.jpg')
x_max = 1920
y_max = 1080
current_x = 0
current_y = 0
count = 1
begin_surf = pygame.Surface((x_max,y_max), flags = pygame.SRCALPHA)
begin_surf.blit(original_image,(0,0))
cropped_surf = pygame.Surface((100,100),flags=pygame.SRCALPHA)
while current_y + 100 < y_max:
while current_x + 100 < x_max:
cropped_surf.blit(begin_surf, (0,0), (current_x, current_y,100,100))
pygame.image.save(cropped_surf, os.path.join("test_data", str(count) + '.jpg'))
current_x += 100
count += 1
current_x = 0
current_y += 100
Would now actually be working to load those images and span them as he said.
I have an image which I would like to extract the number but in a dynamic way (I don't want to specify a roi because image may vary) so I have to filter it. I tried to detect the horizontal line(to crop the image) but it failed. I would like to detect high density zones in the binary image (the face and the top of the image)
ps:my problem isn't how to extract numbers but to specify the roi
and all the images have the same format
any help would be appreciated(even without code just the big lines)
thanks
the image
I would start form detecting frame of the whole document.
If you google: rectangle detection opencv, you will find lots of examples.
In second stage i would apply inRange to filter purple line and detect it with HoughLines.
This should be enough to calculate ROI.
I've got an array of different sized images. I want to place these images on a canvas in a sort of automated collage.
Does anyone have an idea of how to work the logic behind this concept?
All my images have heights divisible by 36 pixels and widths divisible by 9 pixels. They have mouseDown functions that allow you to drag and drop. When dropped the image goes to the closest x point divisible by 9 and y point divisble by 36. There is a grid drawn on top of the canvas.
I've sorted the array of images based on height, then based on their widths.
imagesArray.sortOn("height", Array.NUMERIC | Array.DESCENDING);
imagesArray.sortOn("width", Array.NUMERIC | Array.DESCENDING);
I'd like to take the largest image ( imageArray[0] ) to put in corner x,y = 0,0. Then randomize the rest of the images and fit them into the collage canvas.
What you are trying to do sounds like treemapping.
I think this is what's known as a "Packing problem" or maybe a "2D bin packing problem". Googling those should find you some information, doing it efficiently is not a simple task. If you only have a small number of images, the easy methods would be:
Random...just randomly place images until no more can fit. Run this random placement 10..100..1000 or more times, and pick the best result (where "best" is determined by some criteria like least amount of wasted space, or most pictures fit, etc)
Brute force...try every single possible combination, one by one, and pick the "best" one. Downside to this method is that as number of items scale up, the amount of computation scales up very quickly.
I researched treemapping and packing problems.
.... and eventually decided to create an array of all the points on the canvas, then assign them a value of empty. I then looped through my array of images and placed them on the points that were "empty" and reassigned all the points it occupied with the source name of the image. It worked beautifully. But definitely takes time to create the array.
I did a different take on that I just fits all images to a tile size and tile the into a document.
Image are virturly center croped to the file size via a layer mask.
Paste Image Roll Script http://www.mouseprints.net/old/dpr/PasteImageRoll.html
http://www.mouseprints.net/old/dpr/PasteImageRoll.jsx
I'm trying to find similar or equivalent function of Matlabs "Bwareaopen" function in OpenCV?
In MatLab Bwareaopen(image,P) removes from a binary image all connected components (objects) that have fewer than P pixels.
In my 1 channel image I want to simply remove small regions that are not part of bigger ones? Is there any trivial way to solve this?
Take a look at the cvBlobsLib, it has functions to do what you want. In fact, the code example on the front page of that link does exactly what you want, I think.
Essentially, you can use CBlobResult to perform connected-component labeling on your binary image, and then call Filter to exclude blobs according to your criteria.
There is not such a function, but you can
1) find contours
2) Find contours area
3) filter all external contours with area less then threshhold
4) Create new black image
5) Draw left contours on it
6) Mask it with a original image
I had the same problem and came up with a function that uses connectedComponentsWithStats():
def bwareaopen(img, min_size, connectivity=8):
"""Remove small objects from binary image (approximation of
bwareaopen in Matlab for 2D images).
Args:
img: a binary image (dtype=uint8) to remove small objects from
min_size: minimum size (in pixels) for an object to remain in the image
connectivity: Pixel connectivity; either 4 (connected via edges) or 8 (connected via edges and corners).
Returns:
the binary image with small objects removed
"""
# Find all connected components (called here "labels")
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(
img, connectivity=connectivity)
# check size of all connected components (area in pixels)
for i in range(num_labels):
label_size = stats[i, cv2.CC_STAT_AREA]
# remove connected components smaller than min_size
if label_size < min_size:
img[labels == i] = 0
return img
For clarification regarding connectedComponentsWithStats(), see:
How to remove small connected objects using OpenCV
https://www.programcreek.com/python/example/89340/cv2.connectedComponentsWithStats
https://python.hotexamples.com/de/examples/cv2/-/connectedComponentsWithStats/python-connectedcomponentswithstats-function-examples.html
The closest OpenCV solution to your question is the morphological closing or opening.
Say you have white regions in your image that you need to remove. You can use morphological opening. Opening is erosion + dilation, in that order. Erosion is when the white regions in your image are shrunk. Dilation is (the opposite) where white regions in your image are enlarged. When you perform an opening operation, your small white region is eroded until it vanishes. Larger white features will not vanish but will be eroded from the boundary. The subsequent dilation step restores their original size. However, since the small element(s) vanished during the erosion step, they will not appear in the final image after dilation.
For example consider this image where we want to remove the small white regions but retain the 3 large white ellipses. Running the following code removes the white regions and displays the clean image
import cv2
im = cv2.imread('sample.png')
clean = cv2.morphologyEx(im, cv2.MORPH_OPEN, np.ones((10, 10)))
cv2.imshwo("Clean image", clean)
The clean image output would be like this.
The command above uses a square block of size 10 as the kernel. You can modify this to suit your requirement. You can even generate a more advanced kernel using the function getStructuringElement().
Note that if your image is inverted, i.e., with black noise on a white background, you simply need to use the morphological closing operation (cv2.MORPH_CLOSE method) instead of opening. This reverses the order of operation - first the image is eroded and then dilated.