Is there a way to make Amazon Textract return the skew angle of the PDF document that it is processing?
When I run detection on a document that has been scanned in, Textract attempts to de-skew it, but its calculations are slightly off. I can't find any way to access the raw bounding box or geometry of each LINE or WORD; everything is fixed to the de-skewed position.
Can I find the skew angle another way or can I disable de-skewing on the request somehow?
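There doesn't seem to be a documented option for turning the de-skew off, but one thing worth checking is the Geometry.Polygon of each LINE block rather than the axis-aligned BoundingBox; if the polygons do retain the original orientation, the skew angle can be estimated from them. A rough boto3 sketch of that idea (using the synchronous image call for brevity; the file name and the assumption that the first two polygon points run along the top edge of each line are mine, not anything Textract guarantees):

import math
import boto3

textract = boto3.client("textract")

# Hypothetical input; the async PDF APIs return the same Block/Geometry structure.
with open("scanned_page.png", "rb") as f:
    response = textract.detect_document_text(Document={"Bytes": f.read()})

angles = []
for block in response["Blocks"]:
    if block["BlockType"] != "LINE":
        continue
    poly = block["Geometry"]["Polygon"]
    # Assumption: the first two polygon points trace the top edge of the line.
    # Coordinates are normalized to page width/height, so rescale by the page's
    # pixel dimensions if you need a precise angle rather than a rough estimate.
    dx = poly[1]["X"] - poly[0]["X"]
    dy = poly[1]["Y"] - poly[0]["Y"]
    angles.append(math.degrees(math.atan2(dy, dx)))

if angles:
    skew = sorted(angles)[len(angles) // 2]  # median is robust to outlier lines
    print(f"Estimated skew: {skew:.2f} degrees")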
I am not sure how to put this problem in a single sentence, sorry if the title is misleading.
I am currently developing a simple terrain editor with a circular brush. The image below shows a few cases that represent my problem.
Additional info: the square size is fixed and uniform. In the current version, my concern is only to find which squares are hit and which are not (the amount of area covered will matter later for weighting the hit, but probably not right now).
My current solution (which isn't even correct under certain conditions) is: given a hit at position (x, y) with radius r, loop through all squares from (x - r, y - r) to (x + r, y + r) and apply 2D box-to-circle collision detection. But I don't think this is optimal (or even correct).
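For reference, here is a minimal sketch of what I mean (Python, with an assumed uniform cell_size; the clamp-to-box circle/AABB test is the standard one):

def cells_hit_by_circle(cx, cy, r, cell_size):
    """Return grid cells (i, j) whose square overlaps the circle at (cx, cy)."""
    hit = []
    # Only cells inside the circle's bounding box can possibly overlap it.
    i0, i1 = int((cx - r) // cell_size), int((cx + r) // cell_size)
    j0, j1 = int((cy - r) // cell_size), int((cy + r) // cell_size)
    for i in range(i0, i1 + 1):
        for j in range(j0, j1 + 1):
            # Closest point of the cell's square to the circle centre.
            nearest_x = min(max(cx, i * cell_size), (i + 1) * cell_size)
            nearest_y = min(max(cy, j * cell_size), (j + 1) * cell_size)
            if (nearest_x - cx) ** 2 + (nearest_y - cy) ** 2 <= r * r:
                hit.append((i, j))
    return hit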
Can anyone help me with this one? Thank you
Since I can't add a simple comment due to the reputation requirements on this website, I have to type it out here.
Anyway, you're in luck, since I was trying to do this recently as well! The way I did it was to iterate through the vertex array and check whether the current vertex falls inside the radius of the circle. But perhaps what you want is to check against each quad center: if that center falls inside the radius, add the whole quad as being collided.
Of course, performance will vary depending on the size of your grid, so it's good to iterate through as few quads as possible. How you access those quads from your array is something you'll have to work out yourself.
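If the centre-of-quad test described above is enough for your purposes, it boils down to something like this (a sketch only; the grid dimensions, cell size and row/column layout are assumptions):

def quads_with_center_in_circle(cx, cy, r, cell_size, grid_w, grid_h):
    """Return quads (i, j) whose centre lies inside the brush circle."""
    hit = []
    # Clamp the search to the quads under the circle's bounding box.
    i0 = max(0, int((cx - r) // cell_size))
    i1 = min(grid_w - 1, int((cx + r) // cell_size))
    j0 = max(0, int((cy - r) // cell_size))
    j1 = min(grid_h - 1, int((cy + r) // cell_size))
    for i in range(i0, i1 + 1):
        for j in range(j0, j1 + 1):
            centre_x = (i + 0.5) * cell_size
            centre_y = (j + 0.5) * cell_size
            if (centre_x - cx) ** 2 + (centre_y - cy) ** 2 <= r * r:
                hit.append((i, j))
    return hit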
Currently I'm using the Douglas-Peucker algorithm.
My problem is that while I'm drawing, the previously drawn lines also change, which of course isn't realistic. Is there an alternative algorithm that minimizes the saved points without altering the previously drawn points, or some way to adapt Douglas-Peucker to fit my needs?
Give your pencil drawing tool 2 optional methods for drawing:
Draw a new point on the path using mousemove (which is your current freeform method). This option will let the user add many points which will allow them to be very detailed in their drawing.
Draw a new point on the path only upon mousedown. This option simply connects the previous point on the path to the newly clicked point. This option will let the user add just a few very straight lines which will allow them to outline figures with long running straight edges.
If you are concerned about the freeform path changing while the user is drawing, you can apply the simplifying algorithm just once, after they have stopped moving the mouse for 1 second.
If you give the Douglas-Peucker algorithm a strong bias toward accuracy (i.e. a small tolerance), the simplified path will remain quite true to the unsimplified path.
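For instance, a plain Ramer-Douglas-Peucker pass like the sketch below (Python for brevity; the default tolerance is an arbitrary assumption), run once after the idle timeout, keeps the already-committed part of the drawing from shifting while the user is still moving the mouse:

import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * px - dx * py + bx * ay - by * ax) / length

def simplify(points, tolerance=1.0):
    """Ramer-Douglas-Peucker: drop points closer than `tolerance` to the chord."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the chord between the two endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Otherwise recurse on both halves and stitch the results together.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right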
BTW, if you want to draw splines through your points then check out this nice previous post: how to draw smooth curve through N points using javascript HTML5 canvas?
I get an organized point cloud (using pcl and an ASUS Xtion Pro Live), which of course contains NaNs and the like. I also get an RGB image of the same scene.
The first step in processing is removing those NaNs, which converts the point cloud to unorganized. I then perform a few other steps, but that's not relevant to the question (I think; see P.S.1). What COULD be relevant (I'm not sure) is that I run extract multiple times, and so have quite a few intermediate point clouds. I believe this means I can no longer assume that the points are in the same order they were at the start.
For clarification, I do understand what an unorganized point cloud is and how it differs from an organized one, both theoretically and in terms of how the data is actually stored.
After chopping off various points, I now have a much smaller point cloud which consists only of points from the original point cloud (but far fewer of them). How do I map these points back to the matching points in the original point cloud? I could probably iterate through the entire cloud to find matches, but this seems hacky. Is there a better way to do this?
My main aim is to be able to say that 'point A in my final point cloud is of interest to me' and furthermore to map that to pixel K in the RGB image I first obtained. It seems to me that matching the final point cloud with the initial one is the best way to do this, but alternatives are also welcome.
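One common trick is to carry the original point indices through every filtering step instead of matching afterwards; since the original cloud is organized, an original index maps straight back to a pixel. A numpy-flavoured sketch of the bookkeeping (array names and the cloud width are assumptions; PCL's own filters can do the same thing if you ask them to return the kept/removed indices):

import numpy as np

# Hypothetical data: an organized cloud stored row-major as (H*W, 3), many rows NaN.
H, W = 480, 640
points = np.random.rand(H * W, 3).astype(np.float32)
points[::7] = np.nan                      # pretend some depth readings were invalid
orig_idx = np.arange(points.shape[0])     # index of each point in the organized cloud

# Step 1: drop NaNs, but keep the index array in lockstep with the points.
valid = ~np.isnan(points).any(axis=1)
points, orig_idx = points[valid], orig_idx[valid]

# Later steps: whatever mask or index list a filter produces, apply it to both arrays.
keep = points[:, 2] < 0.5                 # stand-in for any extraction step
points, orig_idx = points[keep], orig_idx[keep]

# Any surviving point maps back to the original cloud and the RGB image directly.
k = 0                                     # "point A of interest" in the final cloud
row, col = divmod(orig_idx[k], W)
print(f"final point {k} came from original index {orig_idx[k]}, i.e. pixel ({row}, {col})")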
P.S.1 - One of the last few steps in my process is finding a convex hull and then extracting a polygonal prism from the original point cloud. If all else fails, I will just interrogate the (20-50) points on the convex hull to match them with my initial point cloud (minimizing computation) and hence to match them with the original RGB images.
P.S.2 - Random musing - since I know the original size of the RGB image, the origin of the camera relative to the point cloud (or, rather, the position of the points relative to the camera used to take them), and can trivially obtain the camera parameters, could I simply use ray-tracing through each point in my final point cloud to produce an RGB image? The image may need registration with the 'real' RGB image, or it probably won't since nothing will have actually moved except for rounding error.
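On the P.S.2 musing: full ray-tracing isn't even needed. With the camera intrinsics, a pinhole projection gives the pixel for each 3D point directly, assuming the cloud is expressed in the RGB camera's frame. A sketch (the intrinsic values are typical for this class of sensor but should come from your calibration, not from here):

import numpy as np

# Assumed intrinsics for the RGB camera; replace with your calibrated values.
fx, fy = 525.0, 525.0
cx, cy = 319.5, 239.5

def project(points_xyz):
    """Project Nx3 camera-frame points onto the image plane (pinhole model)."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)

# Example: one point 1.2 m in front of the camera, slightly off-axis.
print(project(np.array([[0.1, -0.05, 1.2]])))   # approx (column, row), up to rounding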
I have a 2d world made of tiles. Tiles are either passable, non-passable or have some sort of movement penalty.
All entities and tiles have their own hitboxes and sizes for collision detection.
Each tile has dimensions of 16x16 px.
Most examples I've read seem to suggest moving from the center of one tile to the center of another tile. As the picture below shows, that red path is hardly optimal, and it doesn't take entity size into account. Also, pathfinding nodes are placed in a 2D array, with only 8 possible directions from each node.
But wouldn't the actual shortest path be something like this?
How should I implement pathfinding?
Should tiles be split into smaller nodes for pathfinding, or is there some other way to get more accurate routes? Even if I split each tile into 10x10 pathfinding nodes, it still wouldn't find the shortest line between two points.
Should there be more than 8 directions, and if so, how should that be implemented?
For example, if my world were 50x50 tiles, what should the pathfinding map look like, and how should it be generated?
It depends on your definition of "shortest path" and what you plan to do with it.
In your example, it appears that you consider a valid move to be from the center of one tile to the center of any other tile in unobstructed view. How you'd validate moves to partially obstructed tiles is not clear. This differs from the geometrically shortest path, which would obviously hug the wall, and from the realistic shortest path, which would use a unit width and turn radius to avoid walls and sudden changes in direction.
A common approach is to use A* as usual, and then post-process the path in a number of ways to optimize and smooth it. This works both for grid based worlds like yours, and for more general navmeshes.
Gamasutra had a nice overview of this called Toward More Realistic Pathfinding, with a variety of ideas and techniques from smoothing zigzags and adding curves, to optimizing paths for units with acceleration and direction.
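A minimal version of that post-processing is line-of-sight smoothing ("string pulling"): walk the A* path and drop every waypoint that can be skipped without crossing a blocked tile. A sketch under the assumption of a boolean walkability grid (the sampling-based visibility test is approximate and can clip corners, so a proper tile-stepping test is better in practice):

def has_line_of_sight(grid, a, b):
    """True if the straight segment from cell a to cell b crosses only walkable cells.
    grid[y][x] is True for walkable tiles; the segment is sampled at sub-tile steps."""
    (x0, y0), (x1, y1) = a, b
    steps = int(max(abs(x1 - x0), abs(y1 - y0)) * 4) + 1
    for i in range(steps + 1):
        t = i / steps
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        if not grid[int(round(y))][int(round(x))]:
            return False
    return True

def smooth_path(grid, path):
    """Greedy string pulling: keep a waypoint only when the path beyond it is no
    longer directly visible from the last kept waypoint."""
    if len(path) < 3:
        return list(path)
    smoothed = [path[0]]
    anchor = 0
    for i in range(2, len(path)):
        if not has_line_of_sight(grid, path[anchor], path[i]):
            smoothed.append(path[i - 1])
            anchor = i - 1
    smoothed.append(path[-1])
    return smoothed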
I had almost the same problem, and I coded pre-computation software that finds paths from all tiles to all tiles, with some optimizations.
You can find source code here : https://github.com/FurkanGozukara/pathfinding-2d-tile-map
The development video is here : https://www.youtube.com/watch?v=jRTA0iLjv6M
I came up with my own algorithm and implementation, so it is probably neither the best nor the most optimized one. However, it is already implemented in my free browser-based game MonsterMMORPG and it works great: https://www.monstermmorpg.com/
I'm doing some image processing, and I need to find some information on line growing algorithms - not sure if I'm using the right terminology here, so please call me out on this if need be.
Imagine my input image is simply a circle on a black background. I'd basically like to extract the coordinates, so that I can draw this circle elsewhere based on them.
Note: I am already using edge detection image filters, but I thought it best to explain with a simple example.
Basically what I'm looking to do is detect lines in an image, and store the result in a data type whereby I have, say, a class called Line, and various Point objects (containing X/Y coordinates).
class Line
{
    Point points[];   // the ordered points that make up this line
}

class Point
{
    int X, Y;         // pixel coordinates
}
And this is how I'd like to use it...
Line line;

for each pixel in image
{
    if pixel should be added to line
    {
        add pixel coordinates to line;
    }
}
I have no idea how to approach this, as you can probably tell, so pointers to any relevant subject matter would be greatly appreciated.
I'm not sure if I'm interpreting you right, but the standard way is to use a Hough transform. It's a two-step process:
From the given image, determine whether each pixel is an edge pixel (this process creates a new "binary" image). A standard way to do this is Canny edge-detection.
Using the binary image of edge pixels, apply the Hough transform. The basic idea is: for each edge pixel, compute all lines through it, and then take the lines that went through the most edge pixels.
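With OpenCV, those two steps are only a few calls (the file name and threshold values below are placeholders that depend on your image):

import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Step 1: binary edge image via Canny (thresholds are image-dependent).
edges = cv2.Canny(img, 50, 150)

# Step 2: probabilistic Hough transform returns line segments as (x1, y1, x2, y2).
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=30, maxLineGap=5)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        print((x1, y1), "->", (x2, y2))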
Edit: apparently you're looking for the boundary. Here's how you do that.
Recall that the Canny edge detector actually gives you a gradient also (not just the magnitude). So if you pick an edge pixel and follow along (or against) that vector, you'll find the next edge pixel. Keep going until you don't hit an edge pixel anymore, and there's your boundary.
What you are talking about is not an easy problem! I have found that this website is very helpful in image processing: http://homepages.inf.ed.ac.uk/rbf/HIPR2/wksheets.htm
One thing to try is the Hough Transform, which detects shapes in an image. Mind you, it's not easy to figure out.
For edge detection, the best is Canny edge detection, also a non-trivial task to implement.
Assuming the following is true:
Your image contains a single shape on a background
You can determine which pixels are background and which pixels are the shape
You only want to grab the boundary of the outside of the shape (this excludes donut-like shapes where you want to trace the inside circle)
You can use a contour tracing algorithm such as the Moore-neighbour algorithm.
Steps:
Find an initial boundary pixel. To do this, start from the bottom-left corner of the image, travel all the way up and if you reach the top, start over at the bottom moving right one pixel and repeat, until you find a shape pixel. Make sure you keep track of the location of the pixel that you were at before you found the shape pixel.
Find the next boundary pixel. Travel clockwise around the last visited boundary pixel, starting from the background pixel you last visited before finding the current boundary pixel.
Repeat step 2 until you revisit first boundary pixel. Once you visit the first boundary pixel a second time, you've traced the entire boundary of the shape and can stop.
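A minimal sketch of those steps, assuming a single shape stored as a boolean numpy mask (True = shape pixel); shapes touching the image border may need a little extra care:

import numpy as np

# Clockwise Moore neighbourhood as (row, col) offsets, starting from the west neighbour.
RING = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]

def trace_boundary(mask):
    """Trace the outer boundary of the single shape in `mask` (True = shape pixel)."""
    h, w = mask.shape

    # Step 1: scan each column bottom-to-top for the first shape pixel, remembering
    # the background pixel visited just before it (the "backtrack" pixel).
    start = backtrack = None
    for c in range(w):
        for r in range(h - 1, -1, -1):
            if mask[r, c]:
                start = (r, c)
                backtrack = (r + 1, c) if r + 1 < h else (r, c - 1)
                break
        if start is not None:
            break
    if start is None:
        return []                         # no shape pixels at all

    boundary = [start]
    current = start
    while True:
        # Step 2: walk clockwise around the current pixel, starting from the backtrack pixel.
        dr, dc = backtrack[0] - current[0], backtrack[1] - current[1]
        k0 = RING.index((dr, dc))
        for step in range(1, 9):
            i = (k0 + step) % 8
            cand = (current[0] + RING[i][0], current[1] + RING[i][1])
            if 0 <= cand[0] < h and 0 <= cand[1] < w and mask[cand]:
                # The neighbour examined just before the hit becomes the new backtrack.
                j = (k0 + step - 1) % 8
                backtrack = (current[0] + RING[j][0], current[1] + RING[j][1])
                current = cand
                break
        else:
            return boundary               # isolated single pixel

        # Step 3: stop once the first boundary pixel is reached again.
        if current == start:
            return boundary
        boundary.append(current)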
You could take a look at http://processing.org/. The project was created to teach the fundamentals of computer programming within a visual context. There is the language, based on Java, and an IDE for making 'sketches'. It is a very good package for quickly working with visual objects, and it has good examples of things like edge detection that would be useful to you.
Just to echo the answers above: you want to do edge detection and a Hough transform.
Note that a Hough transform for a circle is slightly trickier (you are solving for 3 parameters: x, y, radius), so you might want to just use a library like OpenCV.
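With OpenCV that looks roughly like this (the blur size and Hough parameters are guesses that you'd tune for your image):

import cv2
import numpy as np

img = cv2.imread("circle.png", cv2.IMREAD_GRAYSCALE)
img = cv2.medianBlur(img, 5)   # the circle Hough transform is noise-sensitive, so smooth first

circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                           param1=100, param2=30, minRadius=10, maxRadius=0)
if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print(f"circle at ({x}, {y}) with radius {r}")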