Organized point cloud from stereo - point-cloud-library

I am working with disparity maps (1024 x 768) obtained via stereo, and I am able to get point clouds with XYZRGB pcl::Points. However, not all pixels in the disparity map have valid depth, so there will never be 1024 x 768 = 786432 XYZRGB points. Fortunately, I am able to save the point clouds unorganized (i.e. height = 1). Unfortunately, some normal estimation methods etc. are tailored for organized point clouds. How can I create organized point clouds from this?

I believe that this is not possible.
First of all, an unorganized point cloud (PC) is just a list of points in arbitrary order written to a file.
An organized PC, on the other hand, carries information about the order in which the original points were obtained by the depth camera, along with some other information. This information is stored in what we can call a grid.
Once you destroy this grid by omitting some points, there is no algorithm that can put it back together as it originally was.
You can use the other methods PCL provides that don't take an organized point cloud as an argument. The result will be the same as if you had used an organized point cloud, only a little slower (depending on the size of your input cloud).

I assume that you do have the calibration parameters that are necessary to transform the image points and their depth into 3D points, right?
In this case, you simply create an organized (2D) point cloud of the same size as the disparity map and do the following for each pixel of the disparity map:
If the point is valid:
set the corresponding point in the point cloud to the 3D point
else:
set the corresponding point in the cloud to NaN (i.e. a 3D point with NaN as coordinates)
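
For illustration, a minimal sketch of that loop with PCL and OpenCV follows. It assumes the disparity map is a CV_32F cv::Mat, the color image is a CV_8UC3 cv::Mat, and that the intrinsics fx, fy, cx, cy and the stereo baseline are known; the function name and all parameters are made up for the example, not taken from the question:

#include <limits>
#include <opencv2/core.hpp>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

pcl::PointCloud<pcl::PointXYZRGB>::Ptr
disparityToOrganizedCloud(const cv::Mat& disparity,                    // CV_32F disparity map (1024 x 768)
                          const cv::Mat& rgb,                          // CV_8UC3 color image (BGR)
                          float fx, float fy, float cx, float cy,      // camera intrinsics (assumed known)
                          float baseline)                              // stereo baseline in meters
{
    pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
    cloud->width    = disparity.cols;
    cloud->height   = disparity.rows;        // height > 1 keeps the cloud organized
    cloud->is_dense = false;                 // because we will store NaN points
    cloud->points.resize(cloud->width * cloud->height);

    const float invalid = std::numeric_limits<float>::quiet_NaN();
    for (int v = 0; v < disparity.rows; ++v)
        for (int u = 0; u < disparity.cols; ++u)
        {
            pcl::PointXYZRGB& p = cloud->at(u, v);   // (column, row) in the organized grid
            const float d = disparity.at<float>(v, u);
            if (d > 0.0f)                            // valid disparity -> real 3D point
            {
                p.z = fx * baseline / d;
                p.x = (u - cx) * p.z / fx;
                p.y = (v - cy) * p.z / fy;
                const cv::Vec3b c = rgb.at<cv::Vec3b>(v, u);
                p.r = c[2]; p.g = c[1]; p.b = c[0];
            }
            else                                     // invalid pixel -> NaN placeholder
            {
                p.x = p.y = p.z = invalid;
            }
        }
    return cloud;
}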

Related

Using Geo-coordinates Instead of Cartesian to Draw in Argon and A-Frame

I would like to create a GPS drawing program in Argon and A-Frame which draws lines based upon people's movements.
Lines can be drawn in A-Frame with, for example, the meshline component which uses Cartesian points:
<a-entity meshline="lineWidth: 20; path: -2 -1 0, 0 -2 0"></a-entity>
If I were to do this with a GPS device, I would take the GPS coordinates and map them directly to something like Google maps. Does Argon have any similar functionality such that I can use the GPS coordinates directly as the path like so:
<a-entity meshline="lineWidth: 20; path: 37.32299 -122.04185 0, 37.32298 -122.03224"></a-entity>
Since one can specify an LLA point for a reference frame I suppose one way to do this would be to conceive of the center LLA point as "0, 0, 0" and then use a function to map the LLA domain to a Cartesian range.
It would be preferable, however, to use the geo-coordinates directly. Is this possible in Argon?
To understand the answer, you need to first understand the various frames of reference used by Argon.
First, Argon makes use of cesiumjs.org's geospatial math libraries and Entities, so that all "locations" in Argon must either be expressed geospatially OR be relative to a geospatial entity. These are rooted at the center of the earth, in what Cesium calls FIXED coordinates, but are also known as ECEF or ECF coordinates. In that system, coordinates are in meters, with up/down going through the poles and east/west going through the meridian (I believe). Any point on the surface of the earth is represented with pretty large numbers.
This coordinate system is nice because we can represent anything on or near the earth precisely using it. Cesium also supports INERTIAL coordinates, which are used to represent near-earth orbital objects, and can convert between the two frames.
But, it is inconvenient when doing AR for a few reasons:
The numbers used to represent the position of the viewer and objects near them are quite large, even if they are very close to each other, which can lead to mathematical accuracy issues, especially in the 3D graphics system.
The coordinates we "think about" when we think about the world around us have the ground as "flat" and "up" as pointing ... well, up. So, in 3D graphics, an object above another object typically has the same X and Z values, but a bigger Y. In ECEF coordinates, all the numbers change, because what we perceive as "up" is really a vector from the center of the earth through us, and is only "up" if we're on the north (or south, depending on your +/-) pole. Most 3D graphics libraries you might want to use (physics libraries, for example) assume a world in which the ground is one plane (typically the XZ plane) and Y is up (some aeronautics and other engineering applications use Z as up and have XY as the ground, but the issue is the same).
Argon deals with this, as do many geospatial AR systems, by creating a local coordinate system for the graphics and application to use. There are really three options for this:
Pick some arbitrary (but fixed) local place as the origin. Some systems, which are built to work in one place, have this hard-coded. Others let the application set it. We don't do this because it would encourage applications to take the easy path and only work in one place (we've seen this in the past).
Set the local place to the camera. This has the advantage that the math is the most "accurate" because all points are expressed relative to the camera. But, this causes two issues. First, the camera tends to move continuously (even if only due to sensor noise) in AR apps. Second, many libraries (again, like physics libraries) assume that the origin of the system is stable and on the earth, with the camera/user moving through it. These issues can be worked around, but they are tedious for application developers to deal with.
Set the origin of the local coordinates to an arbitrary location near the user, and if the user moves far from it, recenter automatically. The advantage of this is the program doesn't necessarily have to do much to deal with it, and it meshes nicely with 3D graphics libraries. The disadvantage is the local coordinates are arbitrary, and might be different each time a program is run. However, the application developer may have to pay attention to when the origin is recentered.
Argon uses option 3. When the app starts, we create a new local coordinate frame at the user's location, on the plane tangent to the earth. If the user moves far from that location, we update the origin and emit an event to the application (currently, we recenter if you are 5 km away from the origin). In many simple apps, with only a few frames of reference expressed in geospatial coordinates (and the rest of the application data expressed relative to known geospatial locations), the conversion from geospatial to local can just be done each frame, allowing the app developer to ignore the recentering problem. The programmer is free to use either ENU (east-north-up) or EUS (east-up-south) as their coordinate system; we tend to use EUS because it's similar to what most 3D graphics systems use (Y is up, Z points south, and X is east).
One of the reasons we chose this approach is that we've found in the past that if we had predictable local coordinates, application developers would store data using those coordinates, even though that's not a good idea (your data is now tied to some relatively arbitrary application-specific coordinate system, and will only work in that location).
So, now to your question. Your issue is that you want to use geospatial (cesium's coordinates, that argon uses) coordinates in AFrame. The short answer is you can't use them directly, since AFrame is built assuming a local 3D graphics coordinate system. The argon-aframe package binds aframe to argon by allowing you to specify referenceframe components that position an a-entity at an argon/cesium geospatial location, and take care of all the internal conversions for you.
The assumption when I wrote that code was that authors would then create their content using the local, 3D graphics coordinates, and attach those hunks of graphics to a-entity's that were located in the world with referenceframe's.
In order to have individual coordinates in AFrame correspond to geospatial places, you will need to manage that yourself, perhaps by creating a component to do it for you, or (if the data is known at the start) by converting it up front.
Here's what I'd do.
Assuming you have a list of geospatial coordinates (expressed as LLA), I'd convert each to local coordinates by first converting from LLA to Cesium's FIXED ECEF coordinates and creating a Cesium Entity, and then calling Argon's context.getEntityPose() on that entity (which will return its local coordinates). I would pick one geospatial location in the set (perhaps the first one?) and then subtract its local coordinates from each of the others, so that they are all expressed in local coordinates relative to that known geospatial location.
Then, I'd create an AFrame entity attached to the referenceframe of that unique geospatial entity, and create your graphics content inside of it, using the local coordinates that are expressed relative to it. For example, let's say the geospatial location is LongLat = "-84.398881 33.778463" and you stored those points (local coordinates, relative to LongLat) in userPath, you could do something like this:
<ar-scene>
<ar-geopose id="GT" lla=" -84.398881 33.778463" userotation="false">
<a-entity meshline="lineWidth: 20; path: userPath; color: #E20049"></a-entity>
</ar-geopose>
</ar-scene>

How do I best map an unorganized point cloud back to its organized ancestor?

I get an organized point cloud (using PCL and an ASUS Xtion Pro Live), which of course contains NaNs and the like. I also get an RGB image of the same scene.
The first step in processing is removing those NaNs, which converts the point cloud to unorganized. I then perform a few other steps, but those aren't relevant to the question (I think, see P.S.1). What COULD (I'm not sure) be relevant is that I run extract multiple times, and so have quite a few intermediate point clouds. I believe this means I can no longer assume that the points are in the same order they were at the start.
For clarification, I do understand what an unorganized point cloud is and how it differs from an organized one, both theoretically and in terms of how the data is actually stored.
After chopping off various points, I now have a much smaller point cloud which consists only of points that were in the original point cloud (but far fewer of them). How do I map these points back to the matching points in the original point cloud? I could probably iterate through the entire cloud to find matches, but this seems hacked together. Is there a better way to do this?
My main aim is to be able to say that 'point A in my final point cloud is of interest to me' and furthermore to map that to pixel K in the RGB image I first obtained. It seems to me that matching the final point cloud with the initial one is the best way to do this, but alternatives are also welcome.
P.S.1 - One of the last few steps in my process is finding a convex hull and then extracting a polygonal prism from the original point cloud. If all else fails, I will just interrogate the (20-50) points on the convex hull to match them with my initial point cloud (minimizing computation) and hence to match them with the original RGB images.
P.S.2 - Random musing - since I know the original size of the RGB image, the origin of the camera relative to the point cloud (or, rather, the position of the points relative to the camera used to take them), and can trivially obtain the camera parameters, could I simply use ray-tracing through each point in my final point cloud to produce an RGB image? The image may need registration with the 'real' RGB image, or it probably won't since nothing will have actually moved except for rounding error.
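
As an aside on the mapping question above: one way to keep such a correspondence in PCL is to carry index vectors through each filtering step. pcl::removeNaNFromPointCloud fills, for every point of the output, its index in the organized input, and later index-based extractions can be composed with that vector. A rough sketch, with all variable names and the selected indices made up for illustration:

#include <vector>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/filters/filter.h>          // pcl::removeNaNFromPointCloud

// Map points of a NaN-filtered cloud back to their row/column in the
// original organized cloud (and hence to a pixel in the RGB image).
void mapBackExample(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& organized)
{
    pcl::PointCloud<pcl::PointXYZRGB>::Ptr dense(new pcl::PointCloud<pcl::PointXYZRGB>);
    std::vector<int> toOriginal;                     // toOriginal[i] = index in 'organized'
    pcl::removeNaNFromPointCloud(*organized, *dense, toOriginal);

    // Suppose a later processing step selected some points of 'dense' by index:
    std::vector<int> selected = {0, 42, 100};        // hypothetical indices into 'dense'
    for (int i : selected)
    {
        int orig = toOriginal[i];                    // index into the organized grid
        int u = orig % organized->width;             // column = pixel x in the RGB image
        int v = orig / organized->width;             // row    = pixel y in the RGB image
        // (u, v) is the pixel that produced dense->points[i]
    }
}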

How to resize an existing point cloud file?

I am trying to enlarge a point cloud data set. Suppose I have a point cloud data set consisting of 100 points and I want to enlarge it to, say, 5 times its size. Actually, I am studying a specific structure which is very small, so I want to zoom in and do some computations. I want something like imresize() in Matlab.
Is there any function to do this? What does resize() function do in PCL? Any idea about how can I do it?
Why would you need this? Points are just numbers, regardless of whether they are 1 or 100, as long as all of them are on the same scale and in the same coordinate system. Their size on the screen is just a visual representation, and you can zoom in and out as you wish.
You want them to be a thousandth of their original value (e.g. a millimeters -> meters change)? Divide them by 1000.
You want them spread out in a 5 times larger space in that particular coordinate system? Multiply their coordinates by 5. But even so, their visual representation will look exactly the same on the screen. The data remains basically the same; the points are not resized per se, only their numeric representation changes a bit. It is the simplest affine transform, just a single multiplication.
You want to have finer or coarser resolution of your numeric representation? Or have different range? Change your data type accordingly.
That is, if you deal with a single set.
If you deal with different sets, say, recorded with different kinds of sensors and the numeric representations differ a bit (there are angles between the coordinate systems, mm vs cm scale, etc.) you just have to find the transformation from one coordinate system to the other one and apply it to the first one.
Since you want to increase the number of points while preserving shape/structure of the cloud, I think you want to do something like 'upsampling'.
Here is another SO question on this.
The PCL offers a class for bilateral upsampling.
And as always google gives you a lot of hints on this topic.
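
The bilateral upsampling class is the one the answer above points to; as a rough, hedged sketch of the general upsampling idea, one could also use MovingLeastSquares, which offers several upsampling modes. Note this is a different class from the one named above, and the radii below are made-up values that would need tuning for the data:

#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>
#include <pcl/surface/mls.h>

// Densify 'cloud' by sampling extra points on locally fitted planes (MLS upsampling).
pcl::PointCloud<pcl::PointXYZ>::Ptr
upsample(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);

    pcl::MovingLeastSquares<pcl::PointXYZ, pcl::PointXYZ> mls;
    mls.setInputCloud(cloud);
    mls.setSearchMethod(tree);
    mls.setSearchRadius(0.03);        // neighbourhood used to fit the local surface (tune)
    mls.setUpsamplingMethod(pcl::MovingLeastSquares<pcl::PointXYZ, pcl::PointXYZ>::SAMPLE_LOCAL_PLANE);
    mls.setUpsamplingRadius(0.03);    // area around each point to fill with new samples (tune)
    mls.setUpsamplingStepSize(0.01);  // spacing between the new samples (tune)

    pcl::PointCloud<pcl::PointXYZ>::Ptr denser(new pcl::PointCloud<pcl::PointXYZ>);
    mls.process(*denser);
    return denser;
}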
Besides increasing the allocated memory (what Ziker mentioned; that's not what you want, right?) or zooming in in the visualization, you could just rescale your point cloud.
This can be done by multiplying each point's coordinates by a constant factor or by applying an affine transformation. That way you can, e.g., switch from mm to m.
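
A minimal sketch of such a rescaling with an affine transform; the factor is passed in as a parameter (e.g. 5 from the question, or 0.001 for an mm-to-m change), and the function name is made up:

#include <Eigen/Geometry>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/transforms.h>   // pcl::transformPointCloud

// Returns a copy of 'cloud' with every coordinate multiplied by 'factor'.
pcl::PointCloud<pcl::PointXYZ>::Ptr
rescale(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud, float factor)
{
    Eigen::Affine3f scale = Eigen::Affine3f::Identity();
    scale.scale(factor);                             // uniform scaling, no rotation or translation

    pcl::PointCloud<pcl::PointXYZ>::Ptr scaled(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::transformPointCloud(*cloud, *scaled, scale);
    return scaled;                                   // e.g. rescale(cloud, 5.0f) or rescale(cloud, 0.001f)
}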
If I understand your question correctly:
If you have defined your cloud like this
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
you can in fact call resize:
cloud->points.resize (cloud->width * cloud->height);
Note that resize does nothing more than allocate more memory for the variable, so after resizing the original data remains in the cloud. If you want an empty resized cloud, don't forget to add a cloud->clear();
If you just want to zoom into some PCD for visual purposes (i.e. you can't see the shape of the cloud because it's too small), why don't you use PCL Visualization and zoom by scrolling up/down?

Detecting floor under an object in PCL

I'm very new to PCL.
I am trying to detect the floor under an object, to check whether the object has toppled over or is positioned horizontally.
I've checked API and found the method: pcl::PointCloud< T >::at.
It seems like I could read the Z-value of a point using at. Is that correct?
If yes, I'm confused about how that should work. Mathematically, a point is infinitely small. On my scans I see that the point density gets smaller the more distant the points are in the Z-direction.
Will at always return a point? Is the value the mean of nearest physical points?
As referenced in the documentation, pcl::PointCloud< T >::at returns the information of a single point (the coordinates plus other data, depending on the point format) given column and row information (roughly the X,Y in the depth image). For this reason, this method only works on organized clouds.
Unfortunately, not every point is a valid point. Unless you filter the point cloud, you could find invalid measurements (points which have NaN components). This is pretty normal, just discard those points using a filter. Your intuition is right, the point density is smaller the further away you go from the sensor.
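
For instance, a small sketch of reading a point by its image coordinates and skipping invalid measurements; the function name and the (u, v) naming are just illustrative:

#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/common/point_tests.h>   // pcl::isFinite

// Read the depth (z) at image coordinates (u, v) of an organized cloud.
// Returns false if the measurement at that pixel is invalid (NaN).
bool depthAt(const pcl::PointCloud<pcl::PointXYZ>& cloud, int u, int v, float& z)
{
    const pcl::PointXYZ& p = cloud.at(u, v);   // column, row -- organized clouds only
    if (!pcl::isFinite(p))
        return false;                          // invalid measurement, discard it
    z = p.z;
    return true;
}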
As for what you're trying to achieve, you should take a look at the planar segmentation tutorial on the PCL website and at the Table Object Detector software by Nicolas Burrus. The latter extracts a plane, and the clusters of objects on top of it.
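
As a rough illustration of that planar segmentation approach (a sketch along the lines of the PCL tutorial; the distance threshold is an assumed value to tune for your sensor):

#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/ModelCoefficients.h>
#include <pcl/sample_consensus/method_types.h>
#include <pcl/sample_consensus/model_types.h>
#include <pcl/segmentation/sac_segmentation.h>

// Fit the dominant plane (e.g. the floor) in 'cloud' with RANSAC.
void findFloorPlane(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    pcl::ModelCoefficients::Ptr coefficients(new pcl::ModelCoefficients);
    pcl::PointIndices::Ptr inliers(new pcl::PointIndices);

    pcl::SACSegmentation<pcl::PointXYZ> seg;
    seg.setOptimizeCoefficients(true);
    seg.setModelType(pcl::SACMODEL_PLANE);
    seg.setMethodType(pcl::SAC_RANSAC);
    seg.setDistanceThreshold(0.01);        // assumed threshold in meters, tune for your sensor
    seg.setInputCloud(cloud);
    seg.segment(*inliers, *coefficients);  // inliers = plane points, coefficients = ax + by + cz + d = 0
}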

Pinning latitude longitude on a ski map

I have a map of a mountainous landscape, http://skimap.org/data/989/60/1218033025.jpg. It contains a number of known points, the lat-longs of which can be easily found out using Google maps. I wish to be able to pin any latitude longitude coordinate on the map, of course within the bounds of the landscape.
For this, I tried an approach that seems to be largely failing. I assumed the map to be equivalent to an aerial photograph of the Swiss landscape, without any info about the altitude or other coordinates of the camera. So, I assumed the plane perpendicular to the camera lens normal to be Ax+By+Cz-d=0.
I attempt to find the plane constants using the known points. I fix my origin at a point, with z=0 at sea level. I take two known points in the landscape and, using the equation for a line in 3D, I find the length of the projection of the line segment joining the two points onto the plane. I multiply it by another constant K to account for the resizing of this length in a static 2D representation of this 3D image. The length between the two points in the 2D static representation of this image on the screen can easily be found in pixels, and the actual length of the line joining the two points can easily be found, since I can calculate the distance between the two points from their lat-longs and their heights above sea level.
So, I end up with an equation directly relating the distance between the two points in the 2D screen representation, let's call it Ls, and the actual length in the landscape, L. I have many other known points, so plugging them into the equation should give me the values of the 4 constants. For this, I needed 8 known points (the known parameters being their name, lat-long, and height above sea level), one being my origin and the second being a fixed reference point. The remaining 6 points generate a system of 6 linear equations in A^2, B^2, C^2, AB, BC and CA. Solving the system with an online tool, I get the result that the system has a unique solution with all 6 constants being 0.
So, it seems that the assumption that the map is equivalent to an aerial photograph taken from an aircraft is faulty. Can someone please give me some pointers or any other ideas to get this to work? Do open street maps have a Mercator projection?
I would say that this is impossible to do in an automatic way. The ski map should be considered an image rather than a map; a map is a projection of the real world onto one plane, and since this doesn't fit ski maps very well, they are drawn instead.
The best way is probably to manually define a lot of points in the ski map with known or estimated coordinates and use them to estimate the points in between (see the sketch after this answer). To get an acceptable result you probably have to assign coordinates to each pixel in the ski map.
You could do something like the following: http://magazin.unic.com/en/2012/02/16/making-of-interactive-mobile-piste-map-by-laax/
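
As a sketch of the interpolation idea only (the control-point structure and all values are made up for illustration; a real solution would need many such points and likely a better scheme than this): given manually mapped control points with known latitude/longitude and pixel coordinates, an unknown location can be estimated, for example, with inverse-distance weighting:

#include <vector>

// A manually mapped control point: geographic position and its pixel on the ski map.
struct ControlPoint { double lat, lon, px, py; };

// Estimate the pixel position of (lat, lon) by inverse-distance weighting
// over the control points. Crude, but it illustrates "estimating the points in between".
void estimatePixel(const std::vector<ControlPoint>& ctrl,
                   double lat, double lon, double& px, double& py)
{
    if (ctrl.empty()) return;                               // nothing to interpolate from
    double wSum = 0.0, x = 0.0, y = 0.0;
    for (const ControlPoint& c : ctrl)
    {
        double d2 = (c.lat - lat) * (c.lat - lat) + (c.lon - lon) * (c.lon - lon);
        if (d2 < 1e-12) { px = c.px; py = c.py; return; }   // query coincides with a control point
        double w = 1.0 / d2;                                // weight falls off with distance
        x += w * c.px;
        y += w * c.py;
        wSum += w;
    }
    px = x / wSum;
    py = y / wSum;
}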
I am solving the exact same issue. It is pretty hard and involves lots of maths; it is taking me a few weeks to solve. Interpolation is the key, along with lots of manual mapping. I would say that for a ski mountain it will take at least 1000-1500 points to get even the very basics. So, not a trivial task unless you can automate the collection of these points (which is what I am doing!) ;)
