DICOM -- DCMMKDIR -- File Name Max 8 Characters

I am writing an application which uses DCMMKDIR. We save our images with somewhat longer names, but the DCMMKDIR application asks me to input file names of at most 8 characters.
At present I am planning to rename my images from 1 to N, but remapping those images back to their known names (on disk) will be a bit difficult, I feel.
Are there any other methods/processes to achieve the same?

The restriction of the filename to eight characters comes from the DICOM Standard, to ensure compatibility with applications that support, for example, only ISO 9660 as the file system for CDs.
Regarding naming, you can have a look at the specification for the German CD Testat (http://www.dicom-cd.de/docs/DRG-RequirementsSpecification-2006.pdf). As a vendor you can be certified to conform to certain standards for interoperability of patient CDs, which is currently the most common use of DICOMDIRs.
The DICOMDIR file generated by DCMMKDIR is mainly an index that tells an application which DICOM files exist in a given directory, and this kind of file structure is most common on transfer media.

It is very common to use subdirectories to circumvent the restriction to 8-character filenames. For example, a file could be named 20110331/183421/000001, which identifies it by date, time and index without any component exceeding the somewhat arcane filename limit.
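A minimal sketch of such a scheme (plain Python; the helper name and the exact date/time/index split are my own illustration, not something DCMTK provides):

```python
import os
from datetime import datetime

def iso9660_path(acquired: datetime, index: int, root: str = ".") -> str:
    """Build a path like 20110331/183421/000001.

    Every component stays within 8 characters, so the layout remains valid
    on ISO 9660 level 1 media while still encoding date, time and an index.
    """
    parts = (
        acquired.strftime("%Y%m%d"),   # 8 chars: date
        acquired.strftime("%H%M%S"),   # 6 chars: time
        f"{index:06d}",                # 6 chars: running index
    )
    path = os.path.join(root, *parts)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path

print(iso9660_path(datetime(2011, 3, 31, 18, 34, 21), 1))
# -> ./20110331/183421/000001
```

Keeping a small lookup file that maps the original long image names to these short paths makes the later remapping straightforward.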

Related

Common properties of index-nodes among unix systems

Files on Unix are assigned to index-nodes (inodes), which may vary in layout and content among different systems. Usually there are common elements across the different implementations. Which are they?
Normally, the following elements should be available (a small example of reading them follows the list):
Filetype, 9 RWX permission bits
Number of references to the file
Owner's identity
Owner's group
File size in bytes
13 disk block addresses
last read date/time
last write date/time
last i-node modification date/time
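As a rough illustration, these are essentially the fields that stat() exposes on any Unix; a small Python sketch (the attribute names follow os.stat_result, not any particular on-disk inode layout, and "somefile" is just a placeholder):

```python
import os
import stat

st = os.stat("somefile")   # "somefile" is just a placeholder path

print("file type / perms :", stat.filemode(st.st_mode))  # e.g. -rw-r--r--
print("link count        :", st.st_nlink)                # references to the inode
print("owner uid / gid   :", st.st_uid, st.st_gid)
print("size in bytes     :", st.st_size)
print("last read (atime) :", st.st_atime)
print("last write (mtime):", st.st_mtime)
print("inode change time :", st.st_ctime)
# The block addresses themselves are not exposed at this level; only
# st_blocks (number of allocated blocks) hints at the on-disk layout.
```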

How to send variable-length PDFs through Connect:Direct with a fixed LRECL

I am using Connect:Direct with scp and trying to send some PDF files from Unix to a mainframe.
On the Unix end, I have an archive containing PDFs, which I simply rename to ABC.XYZ.LMN.PQR (the mainframe file name) and then send to the mainframe.
The archive contains variable-length PDF files.
However, the requirement is:
For any variable-length file, the mainframe needs to know the maximum possible length of any record in the file. For example, say the LRECL is 1950.
How do I include the LRECL as well, when the PDF files inside the archive being sent are of variable length?
An alternative would be to transfer the files to a Unix System Services file (z/OS Unix) instead of a "traditional" z/OS dataset. Then the folks on the mainframe side could use their utilities to copy the file to a "traditional" mainframe dataset if that's what they need.
For variable blocked (VB) datasets only: if your maximum record size is 1950, you will want to specify RECFM=VB,LRECL=1954, adding 4 bytes to your maximum record. This 4-byte allowance is for the Record Descriptor Word (RDW). If you need to specify BLKSIZE, then the minimum is the LRECL plus another 4 bytes (for the Block Descriptor Word, BDW).
So in your example, your JCL will have DCB parameters that look like: RECFM=VB,LRECL=1954,BLKSIZE=1958
Ideally, for optimal storage, BLKSIZE should be set to the largest size that does not exceed the device-specific recommendation. For example, tape devices typically use BLKSIZE=32760 (32 * 1024 - 8, allowing for the RDW and BDW). Disk drives vary, but in our shop BLKSIZE=23476 is considered optimal.
Incorrect blocking factors can waste tremendous amounts of space. When in doubt, let your system defaults apply or consult your local system gurus for their device-specific recommendations.
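To make the arithmetic explicit, here is a tiny sketch of the RDW/BDW allowances described above (plain Python; the one-record-per-block case simply reproduces the minimum BLKSIZE):

```python
RDW = 4  # Record Descriptor Word, prefixed to every record in a VB dataset
BDW = 4  # Block Descriptor Word, prefixed to every block

def vb_dcb(max_record_len: int, records_per_block: int = 1) -> dict:
    """Return minimal DCB values for a variable blocked (VB) dataset."""
    lrecl = max_record_len + RDW
    blksize = BDW + records_per_block * lrecl
    return {"RECFM": "VB", "LRECL": lrecl, "BLKSIZE": blksize}

print(vb_dcb(1950))
# -> {'RECFM': 'VB', 'LRECL': 1954, 'BLKSIZE': 1958}
```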

Disassemble to identify encryption algorithm

Goal (General)
My ultimate (long-term) goal is to write an importer for a binary file into another application.
Question Background
I am interested in two fields within a binary file format. One is
encrypted, and the other is compressed and possibly also encrypted
(See how I arrived at this conclusion here).
I have a viewer program (I'll call it viewer.exe) which can open these files for viewing. I'm hoping this can offer up some clues.
I will (soon) have a correlated deciphered output to compare and have values to search for.
This is the most relevant stackoverflow Q/A I have found
Question Specific
What is the best strategy given the resources I have to identify the algorithm being used?
Current Ideas
I realize that without the key, identifying the algorithm from the data alone is practically impossible.
Having a file and a viewer.exe, I must have the key somewhere. Whether it's public, private, symmetric, etc., would be nice to figure out.
I would like to disassemble the viewer.exe using OllyDbg with the findcrypt plugin as a first step. I'm just not proficient enough in this kind of thing to accomplish it yet.
Resources
full example file
extracted binary from the field I am interested in
decrypted data: in this zip archive there is a binary list of floats representing x,y,z (model2.vertices) and a binary list of integers (model2.faces). I have also included an "stl" file which you can view with many free programs, but because of the odd way the data is stored in STL files, this is not what we expect to come out of the original file.
Progress
1. I disassembled the program with Olly, then did the only thing I know how to do at this point and "searched for all referenced text" after pausing the program right before it imports one of the files. Then I searched for strings like "crypt, hash, AES, encrypt, SHA, etc." I came up with a bunch of things, most notably "Blowfish64", which goes nicely with the fact that my data is occasionally 4 bytes too long (and since it is guaranteed to be mod 12 = 0), this looks to me like padding for a 64-bit block size (odd numbers of vertices give byte counts that are not mod 8). I also found error messages like...
"Invalid data size, (Size-4) mod 8 must be 0"
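To sanity-check that reasoning, the size arithmetic can be written down directly (pure speculation on my part; the 4-byte prefix and the 8-byte block size are the hypotheses from above, not confirmed facts about the format):

```python
def looks_like_64bit_block_padding(field_size: int, n_vertices: int) -> bool:
    """Check whether a field size is consistent with: 4-byte header +
    3 floats (12 bytes) per vertex, padded up to an 8-byte block boundary."""
    payload = n_vertices * 12                  # plaintext vertex bytes
    padded = ((payload + 7) // 8) * 8          # rounded up to whole 64-bit blocks
    return field_size == 4 + padded

# An odd vertex count gives a payload that is 4 mod 8, so the ciphertext
# would carry 4 extra bytes of padding; even counts need none.
print(looks_like_64bit_block_padding(44, 3))   # 3*12=36 -> padded to 40, +4 header
print(looks_like_64bit_block_padding(28, 2))   # 2*12=24 already aligned, +4 header
```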
After reading Igor's response below, here is the output from signsrch. I've updated this image with green dots for hits which cause no problems when replaced by int3, red if the program can't start, and orange if it fails when loading a file of interest. No dot means I haven't tested it yet.
Accessory Info
I'm using Windows 7 64-bit
viewer.exe is a Win32 x86 application
The data is base64 encoded as well as encrypted
The deciphered data is groups of 12 bytes representing 3 floats (x, y, z coordinates); a parsing sketch follows this list
I have OllyDbg v1.1 with the findcrypt plugin, but my usage is limited to following along with this guy's YouTube videos
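For reference, here is the kind of parser I use on the deciphered output (Python; little-endian single-precision floats are an assumption, based only on this being a Win32 viewer, and model2.vertices is the file from the linked archive):

```python
import struct

def parse_vertices(data: bytes):
    """Split deciphered data into (x, y, z) float triples.

    Assumes 32-bit little-endian floats, 12 bytes per vertex.
    """
    assert len(data) % 12 == 0, "deciphered payload should be a multiple of 12"
    return [struct.unpack_from("<3f", data, off) for off in range(0, len(data), 12)]

with open("model2.vertices", "rb") as fh:   # file name from the linked archive
    vertices = parse_vertices(fh.read())
print(len(vertices), "vertices, first:", vertices[0] if vertices else None)
```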
Many encryption algorithms use very specific constants to initialize the encryption state. You can check if the binary has them with a program like signsrch. If you get any plausible hits, open the file in IDA and search for the constants (Alt-B (binary search) would help here), then follow cross-references to try and identify the key(s) used.
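If you want a quick first pass before reaching for signsrch or IDA, scanning the binary for a couple of well-known constants is easy to script. A sketch (the signature table lists only the AES S-box prefix and the first Blowfish P-array word, both public constants; the viewer.exe path is the one from the question):

```python
# Well-known magic constants that betray a statically linked crypto implementation.
SIGNATURES = {
    "AES S-box (first 8 bytes)": bytes.fromhex("637c777bf26b6fc5"),
    "Blowfish P-array[0], big-endian": bytes.fromhex("243f6a88"),
    "Blowfish P-array[0], little-endian": bytes.fromhex("886a3f24"),
}

def scan_for_crypto_constants(path: str) -> None:
    with open(path, "rb") as fh:
        blob = fh.read()
    for name, sig in SIGNATURES.items():
        off = blob.find(sig)
        if off != -1:
            print(f"{name} found at file offset 0x{off:x}")

scan_for_crypto_constants("viewer.exe")
```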
You can't differentiate good encryption (AES with XTS mode for example) from random data. It's not possible. Try using ent to compare /dev/urandom data and TrueCrypt volumes. There's no way to distinguish them from each other.
Edit: Re-reading your question. The best way to determine which symmetric algorithm, hash and mode is being used (when you have a decryption key) is to try them all. Brute-force the possible combinations and have some test to determine if you do successfully decrypt. This is how TrueCrypt mounts a volume. It does not know the algo beforehand so it tries all the possibilities and tests that the first few bytes decrypt to TRUE.
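A rough sketch of that brute-force idea, assuming PyCryptodome is available and that a successful decryption can be recognised by the 12-byte-vertex structure mentioned in the question (the candidate list, the zero IV and the plausibility test are all illustrative, not exhaustive):

```python
import math
import struct
from itertools import product
from Crypto.Cipher import AES, Blowfish, DES3   # pip install pycryptodome

# Candidate (name, module, key length, block size) combinations to try.
CANDIDATES = [
    ("AES-128-CBC",  AES,      16, 16),
    ("AES-256-CBC",  AES,      32, 16),
    ("Blowfish-CBC", Blowfish, 16,  8),
    ("3DES-CBC",     DES3,     24,  8),
]

def plausible(plain: bytes) -> bool:
    """Crude success test: the first vertex should decode to sane, finite floats."""
    if len(plain) < 16:
        return False
    x, y, z = struct.unpack_from("<3f", plain, 4)   # skip the assumed 4-byte header
    return all(math.isfinite(v) and abs(v) < 1e6 for v in (x, y, z))

def try_everything(ciphertext: bytes, key_material: list):
    """Brute-force cipher/key combinations and yield any that look right."""
    for (name, mod, klen, blk), key in product(CANDIDATES, key_material):
        if len(key) < klen or len(ciphertext) % blk:
            continue
        try:
            plain = mod.new(key[:klen], mod.MODE_CBC, b"\0" * blk).decrypt(ciphertext)
        except ValueError:          # e.g. a degenerate 3DES key
            continue
        if plausible(plain):
            yield name, key[:klen], plain
```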

Disk reads / seeks on a shared unix server for a directory list

I want to get a better understanding of how disk reads work for a simple ls command and for a cat * command on a particular folder.
As I understand it, disk reads are the "slowest" operation for a server/any machine, and a webapp I have in mind will be making ls and cat * calls on a certain folder very frequently.
What are "ball park" estimates of the disk reads involved for an "ls" and for a "cat *" for the following number of entries?
Number of entries | Disk reads for ls | Disk reads for cat *
200               | ?                 | ?
2,000             | ?                 | ?
20,000            | ?                 | ?
200,000           | ?                 | ?
Each file entry is just a single line of text
Tricky to answer, which is probably why it took so long to get any answer at all.
In part, the answer will depend on the file system - different file systems will give different answers. However, doing 'ls' requires reading the pages that hold the directory entries, plus reading the pages that hold the inodes identified in the directory. How many pages that is - and therefore how many disk reads - depends on the page size and on the directory size. If you think in terms of 6-8 bytes of overhead per file name, you won't be too far off. If the names are about 12 characters each, then you have about 20 bytes per file, and if your pages are 4096 bytes (4KB), then you have about 200 files per directory page.
If you just list names and not other attributes with 'ls', you are done. If you list attributes (size, etc.), then the inodes have to be read too. I'm not sure how big a modern inode is. Once upon a couple of decades ago, on a primitive file system, it was 64 bytes each; it might have grown since then. There will be a number of inodes per page, but you can't be sure that the inodes you need are contiguous (adjacent to each other on disk). In the worst case, you might need to read another page for each separate file, but that is pretty unlikely in practice. Fortunately, the kernel is pretty good about caching disk pages, so it is unlikely to have to reread a page. It is impossible for us to make a good guess on the density of the relevant inode entries; it might be, perhaps, 4 inodes per page, but any estimate from 1 to 64 might be plausible. Hence, you might have to read 50 pages for a directory containing 200 files.
When it comes to running 'cat' on the files, the system has to locate the inode for each file, just as with 'ls'; it then has to read the data for the file. Unless the data is stored in the inode itself (I think that is/was possible in some file systems with biggish inodes and small enough file bodies), then you have to read one page per file - unless partial pages for small files are bunched together on one page (again, I seem to remember hearing that could happen in some file systems).
So, for a 200 file directory:
Plain ls: 1 page
ls -l: 51 pages
cat *: 251 pages
I'm not sure I'd trust the numbers very far - but you can see the sort of data that is necessary to improve the estimates.
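Those estimates are easy to parameterise. A small sketch using the same assumptions as above (4 KB pages, roughly 20 bytes per directory entry, a pessimistic 4 usable inodes per page, and one data page per small file):

```python
PAGE_SIZE = 4096
BYTES_PER_DIR_ENTRY = 20     # ~12-char name plus per-entry overhead
INODES_PER_USEFUL_PAGE = 4   # pessimistic guess at inode locality

def estimate_pages(n_files: int) -> dict:
    dir_pages = -(-n_files * BYTES_PER_DIR_ENTRY // PAGE_SIZE)  # ceiling division
    inode_pages = -(-n_files // INODES_PER_USEFUL_PAGE)
    return {
        "ls":    dir_pages,
        "ls -l": dir_pages + inode_pages,
        "cat *": dir_pages + inode_pages + n_files,  # one data page per small file
    }

for n in (200, 2_000, 20_000, 200_000):
    print(n, estimate_pages(n))
# 200 files -> {'ls': 1, 'ls -l': 51, 'cat *': 251}, matching the figures above
```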

Max File Extension Length

Is there a maximum length for a file extension? The longest one I've seen is .compiled (8 chars).
Useless Background
I'm creating an IHttpHandler to return image icons for a specific filename. I'm simply calling FileImage.axd?ext=pptx. I'm generating the files on the fly using SHGetFileInfo, similar to my post for WPF, then caching them locally in a folder with the filename 'pptx.png'. I'd like to validate the length and trim it to prevent a DoS-type attack where someone tries to generate images for an infinite number of junk characters (e.g. FileImage.axd?ext=asdfasdfweqrsadfasdfwqe...).
As far as I know, there is no limit, except the maximum length of the file name. The extension is not treated specially except in FAT16.
I agree with Arkadiy: there is no formal limit now that the DOS 8.3 system (and other similarly limited systems) is a thing of the past. I would say that the majority of the extensions I've seen are in the range of 1-3 characters; Java uses 4 for .java and 5 for .class. Your example with 8 is longer than any I recall. If I were scoping, I'd aim for 'unlimited'; if that's not feasible, allow at least 16 characters, with the confident expectation that 16 will in fact be quite sufficient for current systems.
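The original handler is ASP.NET, but the check itself is language-agnostic; a rough sketch in Python of the whitelist-plus-length validation suggested above (the 16-character cap follows the answer, and the helper name is mine):

```python
import re

MAX_EXT_LEN = 16   # generous but bounded, per the suggestion above
_EXT_RE = re.compile(r"[A-Za-z0-9]{1,%d}" % MAX_EXT_LEN)

def safe_extension(ext: str):
    """Return a normalised extension, or None if the request should be rejected.

    Rejecting (rather than silently trimming) keeps the on-disk icon cache
    bounded and avoids generating icons for junk query strings.
    """
    ext = ext.lstrip(".").lower()
    return ext if _EXT_RE.fullmatch(ext) else None

print(safe_extension("pptx"))     # 'pptx'
print(safe_extension("a" * 40))   # None: reject instead of generating an icon
```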
