Automatic rotation and crop of images based on detetction reference marks - r

I have developed a framework in R to automatically measure vegetation structure variables from whiteboard photos taken on grasslands for ecological related studies. Until now we have preprocessed the images by hand, however now we need to automatise the rotation and cropping of the images.
The idea is the following: Use reference marks on the whiteboard, and detect these markings to rotate and crop the original photographs. I need help to detect the reference markings. After knowing the position of reference markings (centroids) we can calculate the coordinates/pixels where to crops the image. In the end, we want to get a picture like that.
We can use some special colour for the markings, but these can be obstructed by the vegetation. The bottom of the whiteboard is always obstructed, the cropped part (without the reference markings) should be 25×100 cm.
Possibly edge detection can be a solution. I'm familiar with only with R programming.

Related

Rendering highly granular and "zoomed out" data

There was a gif on the internet where someone used some sort of CAD and drew multiple vector pictures in it. On the first frame they zoom-in on a tiny dot, revealing there a whole new different vector picture just on a different scale, and then they proceed to zoom-in further on another tiny dot, revealing another detailed picture, repeating several times. here is the link to the gif
Or another similar example: imagine you have a time-series with a granularity of a millisecond per sample and you zoom out to reveal years-worth of data.
My questions are: how such a fine-detailed data, in the end, gets rendered, when a huge amount of data ends up getting aliased into a single pixel.
Do you have to go through the whole dataset to render that pixel (i.e. in case of time-series: go through million records to just average them out into 1 line or in case of CAD render whole vector picture and blur it into tiny dot), or there are certain level-of-detail optimizations that can be applied so that you don't have to do this?
If so, how do they work and where one can learn about it?
This is a very well known problem in games development. In the following I am assuming you are using a scene graph, a node-based tree of objects.
Typical solutions involve a mix of these techniques:
Level Of Detail (LOD): multiple resolutions of the same model, which are shown or hidden so that only one is "visible" at any time. When to hide and show is usually determined by the distance between camera and object, but you could also include the scale of the object as a factor. Modern 3d/CAD software will sometimes offer you automatic "simplification" of models, which can be used as the low res LOD models.
At the lowest level, you could even just use the object's bounding
box. Checking whether a bounding box is in view is only around 1-7 point checks depending on how you check. And you can utilise object parenting for transitive bounding boxes.
Clipping: if a polygon is not rendered in the view port at all, no need to render it. In the GIF you posted, when the camera zooms in on a new scene, what is left from the larger model is a single polygon in the background.
Re-scaling of world coordinates: as you zoom in, the coordinates for vertices become sub-zero floating point numbers. Given you want all coordinates as precise as possible and given modern CPUs can only handle floats with 64 bits precision (and often use only 32 for better performance), it's a good idea to reset the scaling of the visible objects. What I mean by that is that as your camera zooms in to say 1/1000 of the previous view, you can scale up the bigger objects by a factor of 1000, and at the same time adjust the camera position and focal length. Any newly attached small model would use its original scale, thus preserving its precision.
This transition would be invisible to the viewer, but allows you to stay within well-defined 3d coordinates while being able to zoom in infinitely.
On a higher level: As you zoom into something and the camera gets closer to an object, it appears as if the world grows bigger relative to the view. While normally the camera space is moving and the world gets multiplied by the camera's matrix, the same effect can be achieved by changing the world coordinates instead of the camera.
First, you can use caching. With tiles, like it's done in cartography. You'll still need to go over all the points, but after that you'll be able zoom-in/zoom-out quite rapidly.
But if you don't have extra memory for cache (not so much actually, much less than the data itself), or don't have time to go over all the points you can use probabilistic approach.
It can be as simple as peeking only every other point (or every 10th point or whatever suits you). It yields decent results for some data. Again in cartography it works quite well for shorelines, but not so well for houses or administrative boarders - anything with a lot of straight lines.
Or you can take a more hardcore probabilistic approach: randomly peek some points, and if, for example, there're 100 data points that hit pixel one and only 50 hit pixel two, then you can more or less safely assume that if you'll continue to peek points still pixel one will be twice as likely to be hit that pixel two. So you can just give up and draw pixel one with a twice more heavy color.
Also consider how much data you can and want to put in a pixel. If you'll draw a pixel in black and white, then there're only 256 variants of color. And you don't need to be more precise. Or if you're going to draw a pixel in full color then you still need to ask yourself: will anyone notice the difference between something like rgb(123,12,54) and rgb(123,11,54)?

Dicom - normalization and standardization

I am new to the field of medical imaging - and trying to solve this (potentially basic problem). For a machine learning purpose, I am trying to standardize and normalize a library of DICOM images, to ensure that all images have the same rotation and are at the same scale (e.g. in mm). I have been playing around with the Mango viewer, and understand that one can create transformation matrices that might be helpful in this regard. I have however the following basic questions:
I would have thought that a scaling of the image would have changed the pixel spacing in the image header. Does this tag not provide the distance between pixels, and should this not change as a result of scaling?
What is the easiest way to standardize a library of images (ideally in python)? Is it possible and should one extract a mean pixel spacing across all images, and then scaling all images to match that mean? or is there a smarter way to ensure consistency in scaling and rotation?
Many thanks in advance, W
Does this tag not provide the distance between pixels, and should this
not change as a result of scaling?
Think of the image voxels as fixed units of space, which are sampling your image. When you apply your transform, you are translating/rotating/scaling your image around within these fixed units of space. That is, the size and shape of the voxels doesn't change. They just sample different parts of your image.
You can resample your image by making your voxels bigger or smaller or changing their shape (pixel spacing), but this can be independent of the transform you are applying to the image.
What is the easiest way to standardize a library of images (ideally in
python)?
One option is FSL-FLIRT, although it only accepts data in NIFTI format, so you'd have to convert your DICOMs to NIFTI. There is also this Python interface to FSL.
Is it possible and should one extract a mean pixel spacing across all
images, and then scaling all images to match that mean? or is there a
smarter way to ensure consistency in scaling and rotation?
I think you'd just to have pick a reference image to register all your other images too. There's no right answer: picking the highest resolution image/voxel dimensions or an average or some resampling into some other set of dimensions all sound reasonable.

Automatically extract data from graph

I have a graph like:
I would like to generate a set of (x,y) pairs that correspond to points of this graph.
Maybe one for each horizontal pixel.
How would I go about doing this?
If I had the image in uncompressed bitmap format, maybe cropped to the actual graph, I could examine each vertical strip for the blackest point...
I would prefer to work in Python, but I'm interested in any technique.
I answered a question like this a while back. It should be fairly easy to detect the grid, from there you can get the pixel's coordinates relatively to the grid. However, it wasn't clear how to extract the numbers, which you need to do in order to get the the scale of the grid. Although, it might be possible fairly easily if you can match the font and font size (which might be possible via scaling). Otherwise, you'd have to enter the numbers manually.
To extract the grid, you'd start from the top right and move diagonally until you find the start of the grid. From there you can follow the vertical and horizontal lines (of the grid) until they end. This should allow you to say with fairly high probability where the outer rectangle of the grid is and what the x and y intervals of the grid are in terms of pixels. The blackest parts within the grid should do for finding the curve, but it may require some interpolation depending on how many data points you need/want.
It also may be useful to look into techniques for reversing anti-aliasing effects. Although, the uncompressed bitmap image may not need it.

QGis: How to import svg or raster images into Quantum GIS?

these vector or raster files being classic files without geocoordinates. They are lat/long projection, I want to import them into QGIS, scale them up/down, place them to their right place, and they become reusable shp or raster geocoordinated layers.
Edit: I'am from the wikipedia Graphic Lab>Map workshop, we want to work more using GIS. We litteraly have hundreds maps to migrate to GIS technologies....
File:Chinese_plain_5c._BC-en.svg
File:Vignobles_basse_loire.svg
Partial Solution: load SVG into Inkscape, Save as DXF file, then you can load this into QGIS. This should at least get you most of the linework into QGIS.
However, it won't yet be properly georeferenced or styled, and different layers may be in different places because the SVG has some scaling and translating operators on parts of the map data that QGIS or Inkscape is ignoring. You'll probably need to work with a layer at a time. This probably isn't a problem since maybe you are only interested in the added data on the map, and not the base map (country outlines etc) since you will probably want to overlay your data onto standard map base layer (natural earth, OpenStreetMap tiles).
The only way I see to do the transformation at present is to work out the affine transformation parameters and use the QgsAffine plugin, but that does require you to work out the parameters beforehand by fitting known source coordinates to known target coordinates.
But to do hundreds? You might be better off writing some custom SVG parsing code for each one...
If you only want to display it in the correct place, scale and rotation treat it as an SVG icon.
1. create a point layer and put a single point at the georeferenced centre of the SVG you will load.
2. edit the symbology and load the SVG as an icon
3. set the size units to map units
4. supply the appropriate dimensions
5. rotate as necessary
The redraw is very slow and painful, but if you use Project>import/Export>Export map as image you can make a georeferenced raster.

Implementing z-axis in a 2D side-scroller

I'm making a side scroller similar to Castle Crashers and right now I'm using SAT for collision detection. That works great, but I want to simulate level "depth" by allowing objects to move up and down on the screen, basically along a z-axis (like this screenshot http://favoniangamers.files.wordpress.com/2009/07/castle-crashers-ps3.jpg). This isn't an isometric game, but rather uses parallax scrolling.
I added a z component to my vector class, and I plan to cull collisions based on the 'thickness' of a shape and it's z position. I'm just not sure how calculate the positions of shapes for rendering or how to add jumping with gravity. How do I calculate the max y value (for the ground) as the z position changes? Basically it's the relationship of the z and y axis that confuses me.
I'd appreciate links to resources if anyone knows of this topic.
Thanks!
It's actually possible to make your collision detection algorithm dimensionally agnostic. Just have a collision detector that works along one dimension, use that to check each dimension, and your answer to "are these colliding or not" is the logical AND of the collision detection along each of the dimensions.
Your game should be organised to keep the interaction of game objects, and the rendering of the game to the screen completely seperate. You can think of these two sections of the program as the "model" and the "view". In the model, you have a full 3D world, with 3 axes. You can't go halvesies on this point without some level of pain. Your model must be proper 3D.
The view will read the location of all the game objects, and project them onto the screen using the camera definition. For this part you don't need a full 3D rendering engine. The correct technical term for the perspective you're talking about is "oblique", and it can be seen in many ancient chinese and japanese scroll paintings and prints- in particular look for images of "The Tale of Genji".
The on screen position of an object (including the ground surface!) goes something like this:
DEPTH_RATIO=0.5;
view_x=model_x-model_z*DEPTH_RATIO-camera_x;
view_y=model_y+model_z*DEPTH_RATIO-camera_y;
you can modify for a straight orthographic front projection:
DEPTH_RATIO=0.5;
view_x=model_x-camera_x;
view_y=model_y+model_z*DEPTH_RATIO-camera_y;
And of course don't forget to cull objects outside the volume defined by the camera.
You can also use this mechanism to handle the positioning of parallax layers for you. This is of course, a matter changing your camera to a 1-point perspective projection instead of an orthographic projection. You don't have to use this to change the rendered size of your sprites, but it will help you manage the x position of objects realistically. if you're up for a challenge, you could even mix projections- use 1 point perspective for deep backgrounds, and the orthographic stuff for the foreground.
You should separate your conceptual Y axis used by you physics calculation (collision detection etc.) and the Y axis you actually draw on the screen. That way it becomes less confusing.
Just do calculations per normal pretending there is no relationship between Y and Z axis then when you actually draw the object on the screen you simulate the Z axis using the Y axis:
screen_Y = Y + Z/some_fudge_factor;
Actually, this is how real 3d engines work. After all the world calculations are done the X, Y and Z coordinates are mapped onto screen_X and screen_Y via a function (usually a bit more complicated than the equation above, but just a bit).
For example, to implement pseudo-isormetric view in your game you can even apply Z to the screen_X axis so objects are displaced diagonally instead of vertically.

Resources