How can I tell if my dicom files are compressed? - dicom

I have been working with dicom files that are about 4 MB each but I recently received some which are 280 KB each. I am not sure whether this is because they are from different CT scanners or if the new dicoms were compressed before being given to me.
Is there a way to find out, and if they are compressed, is there a way to uncompress them to the original size?

This is a continuation of the other answer from @kritzel_sw.
If you see any of the following UIDs in (0002,0010) Transfer Syntax UID element:
1.2.840.10008.1.2 Implicit VR Little Endian: Default Transfer Syntax for DICOM
1.2.840.10008.1.2.1 Explicit VR Little Endian
1.2.840.10008.1.2.2 Explicit VR Big Endian
then the Pixel Data (7FE0,0010) element is uncompressed. You will generally observe bigger file sizes in this case.
Not part of your question, but objects other than images (for example, an encapsulated PDF in the case of a Structured Report) can be encoded with the following Transfer Syntax:
1.2.840.10008.1.2.1.99 Deflated Explicit VR Little Endian
Other well known values for Transfer Syntax mean that the Pixel Data is compressed.
Note that private Transfer Syntax values are also possible for a data set. Implementation of those values is generally private to the respective manufacturer.
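The check above is easy to apply programmatically: compare the Transfer Syntax UID against the three uncompressed values listed. A minimal sketch in Python (with a real file you would first read the value from the (0002,0010) element, e.g. via pydicom's ds.file_meta.TransferSyntaxUID):

```python
# Transfer Syntax UIDs whose Pixel Data is stored in native, uncompressed form
UNCOMPRESSED_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2",    # Implicit VR Little Endian
    "1.2.840.10008.1.2.1",  # Explicit VR Little Endian
    "1.2.840.10008.1.2.2",  # Explicit VR Big Endian
}

def pixel_data_is_compressed(transfer_syntax_uid: str) -> bool:
    """True if the Pixel Data (7FE0,0010) is stored in a compressed form."""
    return transfer_syntax_uid not in UNCOMPRESSED_TRANSFER_SYNTAXES

print(pixel_data_is_compressed("1.2.840.10008.1.2.1"))     # False
print(pixel_data_is_compressed("1.2.840.10008.1.2.4.70"))  # True (JPEG Lossless)
```

Note that this treats Deflated Explicit VR Little Endian as "compressed" too, since the whole data set, pixel data included, is deflate-compressed in that encoding.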

Yes and yes.
I recommend the binary tools from the OFFIS DICOM toolkit, but you will be able to achieve the same results with different toolkits. You can find the dcmtk here.
How to find out if your files are compressed:
dcmdump <filename>
Have a look at the meta header, in particular the attribute Transfer Syntax UID (0002,0010). Dcmdump "translates" the unique identifier into a human-readable transfer syntax, e.g.
(0002,0010) UI =LittleEndianExplicit # 20, 1 TransferSyntaxUID
The Transfer Syntax tells you whether or not the pixel data in this DICOM file is compressed.
How to decompress compressed images:
dcmdjpeg <compressed DICOM file in> <uncompressed DICOM file out>

Related

Output of AMR-WB decoder - what is the format, and how do I play it?

I downloaded the 3GPP AMR-WB codec (26.173) from http://www.3gpp.org/DynaReport/26173.htm and successfully compiled it. However, the file format generated by the decoder is a so-called binary synthesized speech file (*.out). I am wondering what the exact format is and how I can play the file. Thanks
For AMR-WB, the output will be raw PCM with the following properties:
16000 Hz (16 kHz) sampling frequency
1 channel (mono)
16 bits per sample
You can play it using Audacity or any other player which supports PCM input.
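If you would rather not configure a raw-import dialog every time, you can wrap the raw PCM in a WAV header yourself. A minimal sketch using only the Python standard library (the defaults match the properties above; note the reference decoder typically writes host-endian 16-bit samples, which matches WAV's little-endian layout on x86 and most ARM machines):

```python
import wave

def pcm_to_wav(pcm_bytes: bytes, wav_file,
               rate: int = 16000, channels: int = 1, sample_bytes: int = 2) -> None:
    """Wrap headerless 16-bit mono 16 kHz PCM in a playable WAV container.

    wav_file may be a filename or a seekable binary file object.
    """
    with wave.open(wav_file, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_bytes)
        w.setframerate(rate)
        w.writeframes(pcm_bytes)

# e.g. pcm_to_wav(open("decoded.out", "rb").read(), "decoded.wav")
```

The resulting .wav file opens directly in Audacity or any ordinary media player.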

Unix: What are stdin/out/err REALLY?

Assuming the following are correct...
stdin, stdout, and stderr are streams
streams are file descriptors
file descriptors are numbers/indexes in the kernel representing open files
Questions:
a. Does it follow by transitivity that stdin/out/err involve open files? So if I do ls /dir, does ls output the results to a file referred to by stdout (fd 1)?
b. Where does the above file live? In /proc//? Or is that where the FD lives?
c. What is /dev/stdout? If I do vim /dev/stdout, vim tells me it is not a file. I see there's a series of links that lead to /dev/pts/27. What is going on? I tried to cat /dev/stdout but nothing happens.
d. In general, how is it that some "files" in Linux are actually NOT files?
Some of your assumptions are incorrect. For example, stdin is of type FILE*; it's not a "file descriptor".
stdin, stdout, and stderr are macros defined in <stdio.h>. (Yes, they're required to be macros, not just variable names). They expand to expressions of type FILE*, and they point to the FILE objects associated with the standard input, output, and error streams.
A "file descriptor" is a small integer value representing a POSIX stream. On UNIX-like systems, FILE* values are generally associated with file descriptors (you can use the fileno and fdopen functions to go from one to the other), but they're not the same thing.
Basically, there are two distinct I/O systems, one built on top of the other. The lower level system uses numeric file descriptors, manipulated via the open, read, write, and close functions, and so forth. The higher level, as defined by the ISO C standard, uses pointers of type FILE*, manipulated with fopen, fread, fwrite, fprintf, putchar, fclose, and so forth.
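The same two-layer split is visible from Python, whose file objects wrap POSIX file descriptors much as FILE* does. A sketch (not C, but the layering is the same idea; "demo.txt" is just a throwaway path):

```python
import os

# Low-level POSIX layer: numeric file descriptors via os.open/os.write/os.close
fd = os.open("demo.txt", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
os.write(fd, b"via the descriptor layer\n")
os.close(fd)

# High-level buffered layer: a file object wrapping a descriptor (like FILE*)
f = open("demo.txt", "a")
print(f.fileno())  # the small integer underneath, cf. C's fileno()
f.write("via the buffered layer\n")
f.close()          # flushes the buffer, cf. fclose()
```

Both layers end up writing to the same open file; the buffered layer just batches the bytes before handing them to the descriptor.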
As I mentioned, on UNIX-like systems, the C standard layer is generally implemented on top of the POSIX layer. On non-POSIX systems (like MS Windows), the C standard layer may be implemented on top of some other system-specific interface.
Linux and other UNIX-like systems try (incompletely) to follow an "everything is a file" philosophy. There are a number of file-like entities under /proc. These are not physical files stored on disk; they're entities that can be accessed using either the POSIX or ISO C I/O layers. Neither layer requires the "files" it deals with to be actual disk files, so there's nothing inconsistent about this.
See man proc for more information on what's under the /proc directory (there's far more detail than I can put in this answer).

Why is a hex file used when programming a microcontroller?

Whenever we program a microcontroller, we convert the C file into a hex file and then burn that into the controller.
My question is: why a hex file specifically? Is the hex file a hexadecimal version of the binary executable?
If yes, then why don't we use a binary file instead?
If you are talking about an "Intel HEX" file, the reason is that it is ASCII, which makes it easy to examine and parse. True, it is inefficient in one way, but compared to a raw binary it might be smaller. With a raw binary you have at most one address associated with the file, the starting address, and it is not embedded in the file. An Intel HEX file, or a Motorola S-record (a similar and often-used format), is basically lines of ASCII/hex numbers that represent a record type, a starting address, a length, data, and a checksum. There are non-data lines in there, but much of it will be data. So if your program has a few bytes at address 0x1000 and a few bytes at 0x80000000, a .bin file would at its smallest be 0x80000000 - 0x1000 plus a few bytes, and would typically be 0x80000000 plus a few bytes (right, 2 gigabytes), whereas an Intel HEX or S-record would be dozens of bytes in total. The Intel HEX and S-record formats also have built-in checksums to help protect against corrupt files; not perfect, of course, but better than nothing at all.
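The record layout just described (length, address, type, data, checksum, all as ASCII hex) is simple enough to generate in a few lines. A sketch in Python; the checksum is the two's complement of the sum of all the preceding bytes in the record:

```python
def ihex_record(address: int, data: bytes, record_type: int = 0) -> str:
    """Build one Intel HEX record: :LLAAAATT<data...>CC."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()

# A data record and the standard end-of-file record:
print(ihex_record(0x0030, bytes([0x02, 0x33, 0x7A])))  # :0300300002337A1E
print(ihex_record(0x0000, b"", record_type=1))         # :00000001FF
```

This sketch only covers 16-bit addresses; real toolchains emit type-02 or type-04 extended address records to reach addresses like 0x80000000, but the per-record structure is the same.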
Since then, ELF and COFF and other formats have become popular. These are also based on blocks of data rather than a complete memory image. They are binary formats, not ASCII, but they are not just a memory image: chunks of data with an address, a type, etc. are provided.
Because the Intel HEX and S-record formats are so simple to create and parse, they will continue to be used for a long time; it does not take a lot of resources in a bootloader, for example, to handle receiving an Intel HEX or S-record file. (The same is true of a binary, of course, but the binary has a lot of fill data in it, costing a lot of unnecessary transmission time.)

Load raw memory blob in DPX format with Graphicsmagick

I have a library that generates a Big Endian 10-bit DPX image in a memory buffer. It's just the raw 10-bit RGB data, though, with no headers. I'm trying to load this data into an instance of Magick::Image like this:
Magick::Blob blob(dataBuffer, dataBufferSize);
image.read(blob, Magick::Geometry(width, height), 10 /*bits*/, "DPX");
This throws the following exception, though: Magick: Improper image header ()
Is it possible to load a raw DPX into a Magick::Image?
I don't think that your answer is a good one. It is working by accident. Your blob data is likely to be in some format other than DPX. Specifying 'SDPX' (an unsupported format specification) allowed the file format detection to work automatically and select the correct format.
Using
Magick::Blob blob(dataBuffer, dataBufferSize);
image.read(blob);
should then be sufficient. Most image file formats do not require specifying the format or the depth.
Figured out my own answer here. I took a look at the DPX loading source and found that, for this case, this line:
image.read(blob, Magick::Geometry(width, height), 10 /*bits*/, "DPX");
should be:
image.read(blob, Magick::Geometry(width, height), 10 /*bits*/, "SDPX");

InputB vs. Get; code pages; slow reading on unix server

We have been using the usual code to read a complete file into a string and then parse it in VB6. The files are ANSI text but encoded using whatever code page the user was in at the time (we have Chinese and English users, for example). This is the code:
Open FileName For Binary As nFileUnit
sContents = StrConv(InputB(LOF(nFileUnit), nFileUnit), vbUnicode)
However, we have discovered this is VERY slow reading a file from a server running unix/linux, particularly when the ownership of the file is not the same as the process doing the reading.
I have rewritten the above using Get and discovered it is much faster and does not suffer from any issues with file ownership. I appreciate that this might be solved by reconfiguring the server somehow, but since discovering that, even without that issue, the Get method is still much faster than InputB, I'd like to replace my existing code with Get.
I wonder if someone could tell me if this will really do the same thing. In particular, is it correctly doing the ANSI to Unicode conversion and will this always be true. My testing suggests the following replacement code does the same thing but faster:
Open FileName For Binary As nFileUnit
sContents = String(LOF(nFileUnit), " ")
Get #nFileUnit, , sContents
I also realise I could use a byte array, but again my tests suggest the above is simpler and works. So how does the buffer work correctly? (If you believe the online help for Get, it talks of characters being returned; clearly this would cause problems when reading an ANSI file written on the Chinese code page with 2-byte Chinese characters in it.)
The following timings might be of interest, because the InputB approach is commonly given as the method for reading a complete file, yet it is much slower. Examples:
Reading 380Kb file across the network from the unix server
InputB (file owned) = 0.875 sec
InputB (not owned) = 72.8 sec
Get (either) = 0.0156 sec
Reading a 9Mb file across the network from the unix server
InputB (file owned) = 19.65 sec
Get (either) = 0.42 sec
Thanks
Jonathan
InputB() is CVar(InputB$()), and is known to be horribly slow. My suspicion is that InputB$() reads the bytes and converts them to Unicode using the current codepage via some stock logic for reading text from disk, then does another conversion back to ANSI using the current codepage.
You might be far ahead to use ADODB.Stream.LoadFromFile() to load complete ANSI text files. You can set the .Type = adTypeText and .Charset = the appropriate ANSI encoding as required to read Unicode back out of it via .ReadText(x) where x can be a number of bytes, or adReadAll or adReadLine. For line reading you can set .LineSeparator to adCR, adCRLF, or adLF as required.
Many Charset values are supported: KOI8 for Cyrillic, Big5 for Chinese, etc.
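The conversion being discussed (bytes in some ANSI code page in, Unicode out) is the same operation in any language. For illustration, here is what it looks like in Python, where the code page is named explicitly; the Big5 bytes below are just an example:

```python
# Raw bytes as a Chinese user's ANSI file would store them (Big5 code page)
raw = b"\xa4\xa4\xa4\xe5"

# Decoding with the right code page yields Unicode, analogous to
# StrConv(..., vbUnicode) or ADODB.Stream with .Charset set to "big5"
text = raw.decode("big5")
print(text)  # 中文

# Decoding the same bytes with the wrong code page produces mojibake or an
# error, which is why the file's original code page must be known up front.
```

This is also why a byte-count/character-count mismatch matters: the four bytes above decode to only two characters.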
