I quote :
DICOM supports up to 65,536 (16 bits) shades of gray for monochrome image display, thus capturing the slightest nuances in medical imaging. In comparison, converting DICOM images into JPEGs or bitmaps (limited to 256 shades of gray) often renders the images unacceptable for diagnostic reading. - Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide by Oleg S. Pianykh
As I am a beginner in image processing I'm used to process images colored and monochrome with 256 levels, so for Dicom images, in which representation I have to process pixels without rendering them to 256 levels?, because of the loss of information.
note: If you can put a better tittle for this question, please feel free to do so, I've a hard time doing that and didn't come to a good one.
First you have to put the image's pixels through the Modality VOI/LUT transform in order to transform modality-dependent values to known units (e.g. Hounsfield or Optical Density).
Then, all your processing must be done on the entire range of values (do not convert 16 bit values to 8 bit).
The presentation (visualization) can be performed using scaled values (using 8 bit values), usually passing the data through the Presentation VOI/LUT (window-width or LUT).
See this for the Modality transform: rescale slope and rescale intercept
See the this for Window/Width: Window width and center calculation of DICOM image
Related
I am new to the field of medical imaging - and trying to solve this (potentially basic problem). For a machine learning purpose, I am trying to standardize and normalize a library of DICOM images, to ensure that all images have the same rotation and are at the same scale (e.g. in mm). I have been playing around with the Mango viewer, and understand that one can create transformation matrices that might be helpful in this regard. I have however the following basic questions:
I would have thought that a scaling of the image would have changed the pixel spacing in the image header. Does this tag not provide the distance between pixels, and should this not change as a result of scaling?
What is the easiest way to standardize a library of images (ideally in python)? Is it possible and should one extract a mean pixel spacing across all images, and then scaling all images to match that mean? or is there a smarter way to ensure consistency in scaling and rotation?
Many thanks in advance, W
Does this tag not provide the distance between pixels, and should this
not change as a result of scaling?
Think of the image voxels as fixed units of space, which are sampling your image. When you apply your transform, you are translating/rotating/scaling your image around within these fixed units of space. That is, the size and shape of the voxels doesn't change. They just sample different parts of your image.
You can resample your image by making your voxels bigger or smaller or changing their shape (pixel spacing), but this can be independent of the transform you are applying to the image.
What is the easiest way to standardize a library of images (ideally in
python)?
One option is FSL-FLIRT, although it only accepts data in NIFTI format, so you'd have to convert your DICOMs to NIFTI. There is also this Python interface to FSL.
Is it possible and should one extract a mean pixel spacing across all
images, and then scaling all images to match that mean? or is there a
smarter way to ensure consistency in scaling and rotation?
I think you'd just to have pick a reference image to register all your other images too. There's no right answer: picking the highest resolution image/voxel dimensions or an average or some resampling into some other set of dimensions all sound reasonable.
I create a big image stitched out of many single microscope images.
Suddenly, (after several month of working properly) the stitched overview images became blurry and they are containing strange structural artefacts like askew lines (not the rectangulars, they are because of not perfect stitching)
If I open any particular tile in full size, they are not blurry and the artefacts are hardly observable. (Consider, the image below is already 4x scaled)
The overview image is created manually by scaling each tile using QImage::scaled and copying all of them to the corresponding region in the big image. I'm not using opencv's stitching.
I assume, this happens because of image contents, because most if the overview images are ok.
The question is, how can I avoid such hardly observable artefacts to become very clearly visible after scaling? Is there some means in OpenCV or QImage?
Is there any algorithms to find out, if image content could lead to such effect for defined scale-factor?
Many thanks in advance!
Are you sure the camera is calibrated properly? That the lightning is uniform? Is the lens clear? Do you have electrical components that interfere with the camera connection?
If you add image frames of photos on a uniform material (or non-uniform material, moved randomly for significant time), the resultant integrated image should be completely uniform.
If your produced image is not uniform, especially if you get systematic noise (like the apparent sinusoidal noise in the provided pictures), write a calibration function that transforms image -> calibrated image.
Filtering in Fourier space is another way to filter out the noise but considering that the image is rotated you will lose precision, and you'll be cutting off components of the real signal, too. The following empiric method will reduce the noise in your particular case significantly:
ground_output: composite image with per-pixel sum of >10 frames (more is better) over uniform material (e.g. excited slab of phosphorus)
ground_input: the average(or sqrt(sum of px^2)) in ground_output
calib_image: ground_input /(per px) ground_output. Saved for the session, or persistent in a file (important: ensure no lossy compression! (jpeg)).
work_input: the images to work on
work_output = work_input *(per px) calib_image: images calibrated for systematic noise.
If you can't create a perfect ground_input target such as having a uniform material on hand, do not worry too much. If you move any material uniformly (or randomly) for enough time, it will act as a uniform material in this case (think of a blurred photo).
This method has the added advantage of calibrating solitary faulty pixels that ccd cameras have (eg NormalPixel.value(signal)).
If you want to have more fun you can always fit the calibration function to something more complex than a zero-intercept line (steps 3. and 5.).
I suggest scaling the image with some other software to verify if the artifacts are in fact caused by Qt or are inherent in the image you've captured.
The askew lines look a lot like analog tv interference, or CCTV noise induced by 50 or 60 Hz power lines running alongside the signal cable or some other electrical interference on the signal.
If the image distortion is caused by signal interference then you can try to mitigate it by moving the signal lines away from whatever could be the source of the problem, or fit something to try to filter the noise (baluns for example).
I am currently working on some kind of OCR (Optical Character Recognition) system. I have already written a script to extract each character from the text and clean (most of the) irregularities out of it. I also know the font. The images I have now for example are:
M (http://i.imgur.com/oRfSOsJ.png (font) and http://i.imgur.com/UDEJZyV.png (scanned))
K (http://i.imgur.com/PluXtDz.png (font) and http://i.imgur.com/TRuDXSx.png (scanned))
C (http://i.imgur.com/wggsX6M.png (font) and http://i.imgur.com/GF9vClh.png (scanned))
For all of these images I already have a sort of binary matrix (1 for black, 0 for white). I was now wondering if there was some kind of mathematical projection-like formula to see the similarity between these matrices. I do not want to rely on a library, because that was not the task given to me.
I know this question may seem a bit vague and there are similar questions, but I'm looking for the method, not for a package and so far I couldn't find any comments regarding the method. The reason this question being vague is that I really have no point to start. What I want to do is actually described here on wikipedia:
Matrix matching involves comparing an image to a stored glyph on a pixel-by-pixel basis; it is also known as "pattern matching" or "pattern recognition".[9] This relies on the input glyph being correctly isolated from the rest of the image, and on the stored glyph being in a similar font and at the same scale. This technique works best with typewritten text and does not work well when new fonts are encountered. This is the technique the early physical photocell-based OCR implemented, rather directly. (http://en.wikipedia.org/wiki/Optical_character_recognition#Character_recognition)
If anyone could help me out on this one, I would appreciate it very much.
for recognition or classification most OCR's use neural networks
These must be properly configured to desired task like number of layers internal interconnection architecture , and so on. Also problem with neural networks is that they must be properly trained which is pretty hard to do properly because you will need to know for that things like proper training dataset size (so it contains enough information and do not over-train it). If you do not have experience with neural networks do not go this way if you need to implement it yourself !!!
There are also other ways to compare patterns
vector approach
polygonize image (edges or border)
compare polygons similarity (surface area, perimeter, shape ,....)
pixel approach
You can compare images based on:
histogram
DFT/DCT spectral analysis
size
number of occupied pixels per each line
start position of occupied pixel in each line (from left)
end position of occupied pixel in each line (from right)
these 3 parameters can be done also for rows
points of interest list (points where is some change like intensity bump,edge,...)
You create feature list for each tested character and compare it to your font and then the closest match is your character. Also these feature list can be scaled to some fixed size (like 64x64) so the recognition became invariant on scaling.
Here is sample of features I use for OCR
In this case (the feature size is scaled to fit in NxN) so each character has 6 arrays by N numbers like:
int row_pixels[N]; // 1nd image
int lin_pixels[N]; // 2st image
int row_y0[N]; // 3th image green
int row_y1[N]; // 3th image red
int lin_x0[N]; // 4th image green
int lin_x1[N]; // 4th image red
Now: pre-compute all features for each character in your font and for each readed character. Find the most close match from font
min distance between all feature vectors/arrays
not exceeding some threshold difference
This is partially invariant on rotation and skew up to a point. I do OCR for filled characters so for outlined font it may have use some tweaking
[Notes]
For comparison you can use distance or correlation coefficient
Due to showing MPR view based on Dicoms. I've made a 3D array from series of dicom files. And I show it from Coronal and Sagittal sides.
My 3D array includes:
- z = count of dicoms
- c = column value for every dicoms
- r = Row value for every dicoms
But I have a problem. When there is some space between slices, image is made by this way doesn't show a correct view. Because I can not think of simulation distance between them!
I don't know how to calculate space between slices? I want to add extra space between slices. for example, If space between slices is 4. I have to add 4 time z inner slices.
I hope to arrive my mean.
Image Position (Patient) and Image Orientation (Patient) are the two only attributes you should ever used when computing distance between slices. For more details see here or here. For an actual implementation see here, this implementation also does take into account Frame Of Reference UID, as well as Gantry/Detector Tilt.
This question is the question #1 asked on comp.protocols.dicom.
Please see ImageJ bug
I believe the answer from #Matt is erroneous, let me clarify a few things here.
No: 'DICOM does not have an attribute called Spacing Between Slices'. That is very wrong (technically it does not even mean anything).
DICOM defines IODs which define the set of required attributes available in an SOP Class Instance. Let's consider two very common cases: CT Image Storage (legacy) and MR Image Storage (legacy). So we need to compare the set of attributes in between:
CT Image IOD Modules
MR Image IOD Modules
Now let's say we want to check that MR Image Storage support Spacing Between Slices, it is easy to jump to:
MR Image Module Attributes
However it is much harder to find this attribute for CT Image Storage: simply because this attribute does not exist (per standard). So the only time you would find such attribute would be within an extended SOP Class (some vendors may decide that Spacing Between Slices attribute make sense within their extended SOP Class Instance).
Mixing in the same answer both Spacing Between Slices and Slice Thickness (0018,0050) is very confusing for new users.
I agree that Slice Thickness is perfectly defined in the standard for both CT Image Storage and MR Image Storage since they both include Image Plane Module Attributes, however let's not exchange one for the other.
I found a nice summary of Slice Thickness vs Spacing Between Slices here (if you scroll to the section, you can even play the small demo) :
CT Physics: CT Reconstruction and Helical CT
In step and shoot CT the Slice Thickness and Spacing Between Slices are identical so there is no big issue here. However for helical CT those values are not the same and can vary in any direction (they are independent).
[…] Slice Thickness is determined by the detector width and pitch,
while reconstruction interval (=Spacing Between Slices) can be chosen
arbitrarily. […]
In conclusion to compute (safely!) the Spacing Between Slices (= Reconstruction Interval), it is much safer to use Image Orientation (Patient) and Image Position (Patient) since they are available in either MR Image Storage or CT Image Storage instances.
DICOM has an attribute called Spacing Between Slices (0018, 0088) that gives the distance between two adjacent slices (perpendicular to the image plane) and it also has an attribute called Slice Thickness (0018, 0050) that gives the thickness of the imaged slice (the image plane exists at the center of the slice, with half of the volume above the plane and half below). Image Position (Patient) (0020, 0032) and Image Orientation (Patient) (0020, 0037) are also useful attributes for computing spatial relationships between slices.
For a more detailed explanation, see section C.7.6.2 of part 3 of the DICOM standard. (p. 409)
WARNING: Please be aware that different vendors use the same dicom tags for addressing different things. For instance, the attribute Spacing Between Slices (0018, 0088) means two different things depending on the vendor. See this table to have a guide, and this thread for an explanation.
As discussed in the previous answers, it is not straightforward how to calculate space between DICOM slices. Let's phrase the question differently: How to store DICOM slices in a 3D volume, i.e. a list of equally spaced slices for rendering (guess you want to upload into a 3D texture).
This is because the actual position that a CT slice is captured might not be identical to the position selected by the radiologist. A dataset might have been configured to capture 1 mm slices, but the CT returns slices at position 0.0 mm, 0.997 mm, 2.010 mm, ...
If you use an attribute such as Spacing Between Slices to calculate the size of the 3D volume, you will obtain subtle rounding errors easily. Don't go there.
Rather it is essential to use Image Position (Patient) (0020, 0032) and then perform an optimization to figure our how the slices could be fit into an grid.
Typical problems in practice to consider:
Missing slices (interpolate? Gap?)
Out of step slices (hardware defect? data defect?)
I realize both mipmaps and integral images have the problem that the resulting pixel value is not the integral of an arbitrary polygon in original texture space. Integrating over axisaligned rectangle in texture coordinates using integral images requires 4 texture lookups. Using mipmaps, opengl interpolates across the 4 adjacent pixel values in the mipmap so also 4 memory lookups. Using an integral image you need less memory (no extra preresized images, only an integral image instead of the original) and no level determination. Of course this can be implemented through shaders, but why was the (now being deprecated) fixed function pipeline ever designed with mipmap support and no integral image support?
Using an integral image you need less memory
I very much doubt that this statement is true
From what I understand the values of an integral image can get quite large, therefore requiring floating point representation which will use a lot more space than a typical 24bit mipmap (mipmaps only double the size of an image) and/or be less precise and create noise during interpolation. Also floating point images were not really used that often with the fixed function pipeline and GPUs may have been a lot slower with floating point images.
If you would use integers for the picture then the bit-depth required for the integral image would rise unreasonably high (bitdepth = extents+8 for a white image which means a 256x256 image would need a bit-depth of 264bit per color channel) with higher resolution images.
but why was the (now being deprecated) fixed function pipeline ever designed with mipmap support and no integral image support?
Because the access and interpolation of mipmaps could be built as rather simple hardwired circuits. Ever wondered, why texture dimensions had to be powers of two? To implement mipmaping calculations as a series of bit shifts and additions. Also accessing the neighbouring elements in a gaussian pyramid requires less memory accesses than evaluating the integral. And there's your main problem: Fillrate, i.e. video memory bandwidth, always has been a bottleneck of GPUs.