Get distance measurements of an object from a 2D RGB image - volume

I am currently working on a project where I have to measure an object's dimensions. I only have two images taken from the top corners: one from the front and one from the rear.
I have tried to plot the contours of said object, but I can't seem to derive the measurements (height, width and length) from them.
Thank you.

Related

How to get Bokeh to scale scatter plot size according to zoom

Some of the folks on my team, including myself, find it pretty disorienting that in a Bokeh scatter plot, say one made with the circle method, we can dial in a reasonable glyph size for the initial autoscaled fit of the data, using for example something like plot.circle( x , y , size=3 )
However, when we interactively zoom into our data, the glyph sizes as displayed are invariant to the zoom. Is there a way to have them scale proportionally to the zoom we've dialed into? Something akin to a vector-graphics interaction (e.g. SVG). If memory serves me right, MATLAB and matplotlib figures maintain this zoom proportionality. To demonstrate the behavior we're seeing, consider the first image and the red box I approximately zoom into in the second image.
Just as a quick demo using Powerpoint to illustrate the sort of desired behavior...
For circles, set the radius kwarg instead of the size value. (There are similar, glyph-specific properties for the other glyph types.)
i.e.:
plot.circle(x=[1,2,3], y=[1,2,3], radius=0.5)
size is always rendered in screen coordinates (pixels), but radius and the related properties are computed in data coordinates and should change in magnitude with zooming.
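A minimal, self-contained sketch of the difference (the data here is illustrative; recent Bokeh versions may prefer scatter() for pixel-sized markers):

from bokeh.plotting import figure, show

p = figure(title="size (pixels) vs. radius (data units)")

# size: screen coordinates -- the glyph stays 10 px at any zoom level.
p.circle(x=[1, 2, 3], y=[1, 2, 3], size=10, color="navy")

# radius: data coordinates -- the glyph grows and shrinks as you zoom.
p.circle(x=[1, 2, 3], y=[3, 2, 1], radius=0.2, color="firebrick")

show(p)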
Here's a good demo by Bryan Van de Ven showing the difference between pixel coordinates (size) and data coordinates (radius) given in this conference talk:
Intro to Data Visualization with Bokeh - Part 2 - Strata Hadoop San Jose 2016
... the point is all of these attributes can be vectorized. We could
for instance say size equals you know 2, 4, 6, 8, 10, and now the size
is modulated right. So we have one that has size 2 and one that has
size 4. Size is usually in pixels, radius is usually in data dimension
units. But all the other ones here as well all the colors, all the
visual attributes can be vectorized in this way. You can either give
them a single value as we've done for instance with the line fill
color, or you can give them a vector of values in which case all of
the things are different.
So next exercise here you go to this
notebook this is that second notebook "02 - plotting" it is to try to
create the same example but now set the radius instead of the size and
sort of see what's the difference if you set radius instead
of size.
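As a concrete version of the vectorization described in the talk (the size values are the ones quoted; everything else is illustrative):

from bokeh.plotting import figure, show

p = figure()
# A single value applies to every glyph; a vector gives each glyph its own.
p.circle(x=[1, 2, 3, 4, 5], y=[1, 2, 3, 4, 5],
         size=[2, 4, 6, 8, 10],  # per-glyph sizes, in pixels
         fill_color="orange",    # one value shared by all glyphs
         line_color="black")
show(p)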

DICOM: why do we need overlays, and how do we read them?

Just wondering: why do we need the overlay, and when would we need it?
I have a scout image with an overlay. What do these dots mean, and what do these numbers or fractions mean?
How are these numbers drawn on the image?
The DICOM standard allows two specific types of overlays (graphics and ROI) to accompany the image; an overlay is stored as a 1-bit image in the Overlay Data (60xx,3000) attribute. A dataset can have up to 16 separate overlay planes (using the repeating-groups encoding, groups 6000 through 601E).
An overlay plane that represents a region of interest (ROI) has the value "R" in its Overlay Type (60xx,0040) attribute, and ROI Area (60xx,1301), ROI Mean (60xx,1302) and ROI Standard Deviation (60xx,1303) can hold the corresponding ROI statistics. All bits representing the ROI have a value of 1, marking the pixels of the underlying image data that lie inside the ROI boundary.
A graphic overlay has the value "G" in its Overlay Type (60xx,0040) attribute and is used to express reference marks (reference lines), graphic annotations, bitmapped text, etc. Again, all visible points of the overlay plane are set to 1.
Overlay Rows (60xx,0010) and Overlay Columns (60xx,0011) specify the height and width of the overlay plane. Overlay Bits Allocated is always 1 and Overlay Bit Position is 0 (other values existed in earlier versions of the standard, but their usage has been retired). Overlay Origin (60xx,0050) describes the position of the first overlay point relative to the image's pixels; 1\1 means the overlay starts at the upper-left pixel of the image.
Overlays can be used to display any data over an image. You could, for example, allow users to make annotations or graphic marks. Since you cannot mark the original pixel data, the overlay is stored in a separate layer.
In your case, the creator of the overlay should explain its meaning.
As for the meaning of the numbers in your overlay:
2/16 -> series number 2 and slice number 16
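To actually read an overlay plane programmatically, here is a hedged sketch using pydicom (the filename is a placeholder, and Dataset.overlay_array() requires a reasonably recent pydicom with NumPy installed):

import pydicom

ds = pydicom.dcmread("scout.dcm")  # placeholder path

group = 0x6000  # first overlay plane; repeating groups run 0x6000-0x601E
overlay_type = ds[group, 0x0040].value  # "G" (graphics) or "R" (ROI)
rows = ds[group, 0x0010].value          # Overlay Rows (height)
cols = ds[group, 0x0011].value          # Overlay Columns (width)
origin = ds[group, 0x0050].value        # Overlay Origin; 1\1 = upper-left

# Unpack the 1-bit Overlay Data (60xx,3000) into a rows x cols array of 0/1.
bitmap = ds.overlay_array(group)
print(overlay_type, rows, cols, origin, bitmap.shape)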

How to deal with arbitrary size for Laplacian Pyramid?

Recently I had much fun with the Laplacian pyramid algorithm (http://persci.mit.edu/pub_pdfs/pyramid83.pdf). But one big problem is that the original paper is limited to images of size (2^m + 1) × (2^n + 1). My question is: what is the best way to deal with an arbitrary w × h instead? I can think of a couple of options:
Upsample the input to the next (2^m + 1) × (2^n + 1) up front.
Pad even lines. How exactly? Wouldn't it shift the signal?
Shift even lines by half a sample? Wouldn't it lose half a sample?
Does anybody have experience with this? What is the most practical and efficient approach? Also any pointers to papers dealing with this would be very welcome.
One approach is to create an image with width and height equal to the next (2^m + 1) × (2^n + 1), but instead of up-sampling the image to fill the expanded dimensions, just place it in the top-left corner and fill the empty space to the right and below with a constant value (the average value of the image is a good choice). Then encode in the normal way, storing the original image dimensions along with the pyramid. When decoding, decode and then crop to the original size.
This won't introduce any visual artifacts or degradation because you aren't stretching or offsetting the image in any way.
Because the empty space to the right and below the original image is a constant value, the high-pass bands at each level of the pyramid will be all zero in this area. So if you are using a compression scheme like run-length encoding to store each level, this is automatically taken care of and these areas will compress to almost nothing. If not, you can simply store the top-left (potentially non-zero) area of each level and fill out the rest with zeros when decoding.
You could find the min and max x and y bounding rectangle of the non-zero values for each level and store this along with the level, cropped to include only non-zero values. The decoder could also be optimized so that areas of the image that are going to be cropped away are not actually decoded in the first place, by only processing the top-left of each level.
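A sketch of this pad-encode-crop round trip in NumPy (next_pyramid_size and the variable names are mine, not from the paper; the pyramid encode/decode itself is left out):

import numpy as np

def next_pyramid_size(n, levels):
    """Smallest value >= n of the form k * 2**levels + 1."""
    step = 2 ** levels
    return -((1 - n) // step) * step + 1  # ceil((n - 1) / step) * step + 1

def pad_for_pyramid(img, levels):
    h, w = img.shape[:2]
    H, W = next_pyramid_size(h, levels), next_pyramid_size(w, levels)
    out = np.full((H, W) + img.shape[2:], img.mean(), dtype=img.dtype)
    out[:h, :w] = img       # original goes in the top-left corner
    return out, (h, w)      # remember the original size for cropping

# usage sketch:
# padded, (h, w) = pad_for_pyramid(image, levels=5)
# ... build, store and later reconstruct the Laplacian pyramid on `padded` ...
# restored = reconstructed[:h, :w]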
Here's an illustration of the technique:
Instead of just filling the lower-right area with a flat color, you could fill it with horizontally and vertically mirrored copies of the image to the right and below, and a copy mirrored in both directions to the bottom-right, like this:
This will avoid the discontinuities of the first technique, although there will be a discontinuity in dx (e.g. if the value was gradually increasing from left to right it will suddenly be decreasing). Choosing a mirror that keeps dx constant and ddx zero will avoid this second-order discontinuity by linearly extrapolating the values.
Another technique, which is similar to what some JPEG encoders do to pad out an image to a whole number of MCU blocks, is to take the last pixel value of each row and repeat it, and likewise for columns, with the bottom-right-most pixel of the image used to fill the bottom-right area:
This last technique could easily be modified to extrapolate the gradient of values or even the gradient of gradients instead of just repeating the same value for the remainder of the row or column.
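All three fill strategies (flat color, mirroring, edge repetition) can be sketched with np.pad; the pad amounts below are illustrative and would be chosen to reach the next (2^m + 1) × (2^n + 1) size:

import numpy as np

img = np.arange(12, dtype=np.float32).reshape(3, 4)
pad = ((0, 2), (0, 3))  # pad only below and to the right

flat   = np.pad(img, pad, mode="constant",
                constant_values=img.mean())  # flat fill with the image mean
mirror = np.pad(img, pad, mode="symmetric")  # mirrored copies of the image
edge   = np.pad(img, pad, mode="edge")       # repeat last row/column value

# Mirroring *around* the edge value keeps the first derivative continuous,
# i.e. the linear-extrapolation variant described above.
odd = np.pad(img, pad, mode="reflect", reflect_type="odd")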

determine rectangle rotation point

I would like to know how to compute the rotation components of a rectangle in space from four given points in a projection plane.
It's hard to depict in a single sentence, so let me explain what I need.
I have a 3D world viewed from a static camera (located at <0,0,0>).
I have a known rectangular shape (a picture, actually) that I want to place in that space.
I can only define points (up to four) in a spherical/rectangular reference frame (the camera looking at <0°,0°> (spherical) or <0,0,1000> (rectangular)).
I consider the given polygon to be my rectangular shape rotated by (rX,rY,rZ). Three points are supposed to be enough; four points are probably over-constraining. I'm not sure yet.
I want to determine rX, rY and rZ, the rectangle rotation about its center.
--- My first attempt at solving this constraint problem was to fix the first point: given its spherical coordinates, I "project" this point onto a camera-facing plane at z=1000. Quite easy; this gives me a point.
Then the second point is considered to lie on a ray from <0,0,0>, which allows an infinity of solutions; but I pin it down by knowing the width (w) and height (h) of my rectangle: I then get two solutions for my second point, one "in front of" the first point and the other "far away"... I now have an edge of my rectangle. Two, in fact.
And from there, I don't know what to do. Even once I have my four points, I don't have a clue how to calculate the equivalent rotation...
It's hard to be lost in Mathematics...
To get an idea of the goal of all this: I make photospheres and I want to "insert" images into them. For instance, my photo contains a TV screen, and I want to place a picture on that screen. I know my screen size (or I can guess it), I know the size of the image I want to place (actually, it has the same aspect ratio), and I know the four screen-corner positions in my space (spherical or Euclidean). My software allows me to place an image in the scene and to rotate it as I want. I can zoom it (to give the feeling of depth)... So I can do all this manually, but it is a long trial-and-error process and never exact. I would like to be able to type in the screen-corner positions and get the final image position and rotation attributes in one click...
The question in pictures:
Images presenting steps of the problem
Note that on the page I present actual images from my app. I mean I had to manually rotate and scale the picture to make it fit the screen, but it is not a Photoshop job. The parameters found are:
Scale: 0.86362
rX = 18.9375
rY = -12.5875
rZ = -0.105881
center position: <-9.55, 18.76, 1000>
Note: rotation is not enough to set the picture up: we also need scale and translation. I assume the scale can be found once a first edge is fixed (the first two points give two candidate solutions as initial constraints, and since I then know the edge length and the picture's width and height, I can deduce the scale). But the software kindly allows me to modify the picture's width and height, so the real constraint is just to make sure the four points describe a rectangle in space, which is simple to check with vectors. The hard part, then, seems to be placing the fourth point so that it is a valid rectangle corner, and deducing the rotation from that rectangle. As for the translation, it is the center (the diagonals' crossing) of the points once they are fixed.
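For what it's worth, this corners-to-pose problem is what OpenCV's solvePnP solves. A hedged sketch, with placeholder corner pixels and an assumed pinhole model whose focal length mirrors the z=1000 projection plane (your viewer's actual projection parameters must go here):

import cv2
import numpy as np

w, h = 1.6, 0.9  # known rectangle (screen) size, in world units

# Rectangle corners in its own frame, centered on its middle.
object_pts = np.array([[-w/2, -h/2, 0],
                       [ w/2, -h/2, 0],
                       [ w/2,  h/2, 0],
                       [-w/2,  h/2, 0]], dtype=np.float64)

# The four clicked corner positions, in pixels (placeholder values).
image_pts = np.array([[410, 300], [780, 320], [770, 560], [420, 540]],
                     dtype=np.float64)

f = 1000.0  # assumed focal length matching the z=1000 projection plane
K = np.array([[f, 0, 640], [0, f, 360], [0, 0, 1]], dtype=np.float64)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)

# Euler angles in degrees, assuming an Rz*Ry*Rx convention; adjust to
# whatever convention your app's (rX, rY, rZ) actually uses.
rX = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
rY = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
rZ = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
print(rX, rY, rZ, tvec.ravel())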

Project a grid in screenspace on the world xz plane

I want to project a grid onto the world xz-plane, as shown here:
To do that, I created a vertex grid with x and z range [-1|1]. In the shader I multiply the xz screen coordinate of a vertex with the inverse of the View-Projection matrix. Then I want to adjust the height, depending on the new world xz coordinates and finally I transform these coordinates back to screenspace by multiplying them with the View-Projection matrix.
I don't know why, but I get a very strange plane on the screen. Are the mathematical operations I use correct?
The grid that you initially create, is that in projection space or actual screen co-ords? It sounds like it is in projection space, since you only transform it with the inverse of the view-projection matrix to get into world co-ords. I think you need to include the "window" (viewport) matrix too, i.e. transform by the inverse of the View-Projection-Window matrix (and similarly on the way back to screen co-ords).
Edit:
I'm probably not understanding exactly what it is you're trying to do so here's some questions back. :)
Are you trying to take the grid that's shown in the screenshot in your question and project it onto world z-x co-ordinates? If so, why do you start with a grid of z-x values? Also, if you apply an inverse view matrix to those, surely you would end up with a line, since the camera looks along z, yet your second screenshot shows that you are getting a plane. I'm a bit confused.
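To make the first answer concrete: the round trip only works if the grid is in NDC ([-1,1]) rather than pixel coordinates, and the perspective divide must be applied after multiplying by the inverse matrix. A NumPy sketch under those assumptions (view_proj is a placeholder for your combined View-Projection matrix, column-vector convention):

import numpy as np

def unproject_ndc(ndc_xy, view_proj):
    """Map NDC grid points (x, y in [-1, 1]) back to world space."""
    inv = np.linalg.inv(view_proj)  # clip space -> world space
    n = len(ndc_xy)
    # Homogeneous clip-space points (picking z = 0 inside the clip volume).
    clip = np.column_stack([ndc_xy, np.zeros(n), np.ones(n)])
    world = clip @ inv.T
    # The perspective divide -- the step that is easy to forget.
    return world[:, :3] / world[:, 3:4]

# usage sketch:
# grid = np.stack(np.meshgrid(np.linspace(-1, 1, 16),
#                             np.linspace(-1, 1, 16)), -1).reshape(-1, 2)
# pts = unproject_ndc(grid, view_proj)
# pts[:, 1] = height(pts[:, 0], pts[:, 2])   # adjust y over the xz-plane
# ... then multiply by view_proj and divide by w again for screen space ...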
