Max File Extension Length - asp.net

Is there a maximum length for a file extension? The longest one I've seen is .compiled (8 characters).
Useless Background
I'm creating an IHttpHandler to return image icons for a given filename. I simply call FileImage.axd?ext=pptx. I generate the icons on the fly using SHGetFileInfo, similar to my post for WPF, then cache them locally in a folder with the filename 'pptx.png'. I'd like to validate the length of the extension and trim it, to prevent a DoS-type attack where someone tries to generate images for an infinite number of junk values (e.g. FileImage.axd?ext=asdfasdfweqrsadfasdfwqe...).

As far as I know, there is no limit other than the maximum length of the file name; the extension is not treated specially except on FAT16.

I agree with Arkadiy - there is no formal limit now that the DOS 8.3 scheme is a thing of the past (along with other similarly limited systems). I would say the majority of extensions I've seen are 1-3 characters; Java uses 4 for .java and 5 for .class. Your example with 8 is longer than any I recall. If I were scoping this, I'd aim for 'unlimited'; if that's not feasible, allow at least 16 characters - with the confident expectation that 16 will be quite sufficient for current systems.
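For what it's worth, here is a minimal sketch of the validation the question describes (shown in Python for brevity, even though the actual handler is ASP.NET): whitelist the characters and cap the length before using the value to name a cache file. The 16-character cap follows the suggestion above; it is a judgment call, not a limit from any specification.

    import re

    MAX_EXT_LEN = 16
    _EXT_RE = re.compile(r"^[A-Za-z0-9]{1,%d}$" % MAX_EXT_LEN)

    def normalize_extension(raw):
        """Return a safe, lower-cased extension, or None if the input is rejected."""
        ext = raw.strip().lstrip(".").lower()
        if not _EXT_RE.match(ext):
            return None      # junk, too long, or contains path/separator characters
        return ext           # safe to use as "<ext>.png" in the icon cache folder

    # normalize_extension("pptx")                    -> "pptx"
    # normalize_extension("asdfasdfweqrsadfasdfwqe") -> None (rejected, nothing cached)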

Related

How to calculate duration for a BerkeleyDB dump/load operation for a given BDB file?

I'm using a 3rd-party application that uses BerkeleyDB for its local datastore (BMC Discovery). Over time, its BDB files fragment and become ridiculously large, so BMC Software provides a scripted compact utility that essentially pipes db_dump into db_load with a new file name and then replaces the original file with the rebuilt one.
For large files the operation can take insanely long, sometimes hours, while other files of the same size take half that time. It seems to depend heavily on the level of fragmentation in the file and/or the type of data in it (I assume?).
The utility provided uses a crude method to guesstimate the duration based on the total size of the datastore (which is composed of multiple BDB files): for example, if it is larger than 1 GB it says "this will take a few hours", and if larger than 100 GB it says "this will take many hours". This doesn't help at all.
I'm wondering whether there is a better, more accurate way, using the commands shipped with BerkeleyDB 6.0 (on Red Hat), to estimate the duration of a db_dump/db_load operation for a specific BDB file?
Note: even though this question mentions a specific 3rd-party application, that is just for context. The question is generic to BerkeleyDB.
db_dump/db_load are the usual (portable) way to defragment.
Newer BDB releases (the last 4-5 years, certainly db-6.x) have a db_hotbackup(8) command that might be faster because it avoids the hex conversions.
(The solutions below would require custom coding.)
There is also a DB->compact(3) call that "optionally returns unused Btree, Hash or Recno database pages to the underlying filesystem". This will likely produce a sparse file that appears ridiculously large in "ls -l" but actually uses only the blocks necessary to store the data.
Finally, there are db_upgrade(8) and db_verify(8), both of which can be customized with DB->set_feedback(3) to invoke a callback (i.e. a progress bar) for long operations.
Before anything, I would check configuration using db_tuner(8) and db_stat(8), and think a bit about tuning parameters in DB_CONFIG.
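If a rough figure is good enough, one approach (a sketch only, built on crude assumptions) is to time db_dump on a small prefix of the file and extrapolate: the dump format roughly doubles the data size because keys and values are hex-encoded, and db_load is assumed to take about as long again as the dump. The db_dump binary name may differ on your build (e.g. a versioned name); adjust as needed.

    import os
    import subprocess
    import sys
    import time

    SAMPLE_BYTES = 64 * 1024 * 1024   # sample roughly 64 MB of dump output

    def estimate_rebuild_seconds(bdb_file):
        total_bytes = os.path.getsize(bdb_file)   # upper bound; fragmented files carry dead pages

        proc = subprocess.Popen(["db_dump", bdb_file], stdout=subprocess.PIPE)
        start = time.monotonic()
        read = 0
        while read < SAMPLE_BYTES:
            chunk = proc.stdout.read(1024 * 1024)
            if not chunk:
                break
            read += len(chunk)
        elapsed = time.monotonic() - start
        proc.kill()

        if read == 0 or elapsed == 0:
            return None
        dump_rate = read / elapsed                # bytes of dump output per second
        est_dump_output = 2 * total_bytes         # assumption: hex encoding roughly doubles the data
        return 2 * est_dump_output / dump_rate    # assumption: db_load takes about as long as db_dump

    if __name__ == "__main__":
        secs = estimate_rebuild_seconds(sys.argv[1])
        print("no estimate" if secs is None else "roughly %.0f minutes" % (secs / 60))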

data structure used to store file on Unix system

I just made a simple text editor using the built-in file open, write and overwrite operations. I'm using Python with tkinter. I also want to extend the text editor with new features like search and replace, efficiently. To do that, I need to know the data structure Unix uses to store data in a file, so I can work out the time complexity of a search.
A text file is stored as a stream of bytes. Depending on the encoding used (ASCII, UTF-8, Unicode, etc.), it can be a fixed value of one or more bytes per character, or in the case of UTF-8 and some other encodings, a varying number of bytes per character.
The best search algorithms have a worst case complexity of O(n + m), where n is the length of the string you're searching for, and m is the length of the string you're searching in. A good example is the Boyer-Moore search algorithm. If the file you're working with is larger than will fit in memory, then you have to worry about buffering and such, which is an added complication, but doesn't impact the efficiency of the search. You'll have to be creative about buffering the input so that you don't miss a string that crosses input buffer boundaries.
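As a concrete illustration of the buffering point (a sketch, not tuned for performance): keep the last len(pattern) - 1 bytes of each chunk as an overlap so matches that straddle chunk boundaries are still found. Python's bytes.find already uses an efficient search internally, so there is no need to hand-roll Boyer-Moore here.

    def find_in_file(path, pattern, chunk_size=1 << 20):
        """Yield the absolute byte offset of every occurrence of `pattern` in the file."""
        pattern = pattern.encode() if isinstance(pattern, str) else pattern
        overlap = len(pattern) - 1
        offset = 0                      # absolute offset of the start of `window`
        window = b""
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                window += chunk
                pos = window.find(pattern)
                while pos != -1:
                    yield offset + pos
                    pos = window.find(pattern, pos + 1)
                # keep only the tail that could begin a match crossing into the next chunk
                if overlap:
                    tail = window[-overlap:]
                    offset += len(window) - len(tail)
                    window = tail
                else:
                    offset += len(window)
                    window = b""

    # Example:
    # for hit in find_in_file("notes.txt", "needle"):
    #     print(hit)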

how to send variable length pdfs through connect direct with fixed LRECL

I am using Connect:Direct with scp, trying to send some PDF files from Unix to a mainframe.
On the Unix end, I have an archive containing PDFs which I simply rename to ABC.XYZ.LMN.PQR (the mainframe file name) and then send to the mainframe.
The archive contains variable-length PDF files.
However, the requirement is:
For any variable-length file, the mainframe needs to know the maximum possible length of any record in the file; for example, say the LRECL is 1950.
How do I include the LRECL when the PDF files inside the archive being sent are of variable length?
An alternative would be to transfer the files to a Unix System Services file (z/OS Unix) instead of a "traditional" z/OS dataset. Then the folks on the mainframe side could use their utilities to copy the file to a "traditional" mainframe dataset if that's what they need.
For Variable Blocked datasets only! If your maximum record size is 1950, you will want to specify RECFM=VB,LRECL=1954, adding 4 bytes to your maximum record. This 4-byte allowance is for the Record Descriptor Word (RDW). If you need to specify BLKSIZE, the minimum is the LRECL plus another 4 bytes (for the Block Descriptor Word, BDW).
So in your example, your JCL would have DCB parameters that look like: RECFM=VB,LRECL=1954,BLKSIZE=1958
Ideally, for optimal storage, BLKSIZE should be set to the largest size that does not exceed the device-specific recommendation. For example, tape devices typically use BLKSIZE=32760 (32 * 1024 - 8 bytes for the RDW and BDW). Disk drives vary, but in our shop BLKSIZE=23476 is considered optimal.
Incorrect blocking factors can waste tremendous amounts of space. When in doubt, let your system defaults apply, or consult your local system gurus for their device-specific recommendations.
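For illustration, here is a minimal sketch of how RECFM=VB records are framed, which is where the extra 4 bytes come from: each record starts with a 4-byte Record Descriptor Word holding the record length including the RDW (big-endian), followed by two zero bytes. Whether you have to pre-block the data yourself or let Connect:Direct's DCB settings handle it depends on your configuration; the function below is purely illustrative.

    import struct

    MAX_DATA = 1950   # maximum data bytes per record (LRECL 1954 minus 4 for the RDW)

    def to_vb_records(payload):
        """Split an arbitrary byte stream into VB records, each prefixed with an RDW."""
        for i in range(0, len(payload), MAX_DATA):
            chunk = payload[i:i + MAX_DATA]
            rdw = struct.pack(">HH", len(chunk) + 4, 0)   # length incl. RDW, then two zero bytes
            yield rdw + chunk

    # Example: blocked = b"".join(to_vb_records(open("file.pdf", "rb").read()))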

Disassemble to identify encryption algorithm

Goal (General)
My ultimate (long-term) goal is to write an importer for a binary file into another application.
Question Background
I am interested in two fields within a binary file format. One is encrypted, and the other is compressed and possibly also encrypted (see how I arrived at this conclusion here).
I have a viewer program (I'll call it viewer.exe) which can open these files for viewing. I'm hoping this can offer up some clues.
I will (soon) have correlated deciphered output to compare against, and values to search for.
This is the most relevant stackoverflow Q/A I have found
Question Specific
What is the best strategy given the resources I have to identify the algorithm being used?
Current Ideas
I realize that without the key, identifying the algorithm from the data alone is practically impossible
Having a file and a viewer.exe, I must have the key somewhere. Whether it's public, private, symmetric, etc. would be nice to figure out.
I would like to disassemble viewer.exe using OllyDbg with the findcrypt plugin as a first step. I'm just not proficient enough at this kind of thing to accomplish it yet.
Resources
full example file
extracted binary from the field I am interested in
decrypted data: in this zip archive there is a binary list of floats representing x, y, z (model2.vertices) and a binary list of integers (model2.faces). I have also included an "stl" file, which you can view with many free programs, but because of the peculiar way data is stored in STL files, this is not what we expect to come out of the original file.
Progress
1. I disassembled the program with Olly, then did the only thing I know how to do at this point and "searched for all referenced text" after pausing the program right before it imports one of the files. Then I searched for strings like "crypt, hash, AES, encrypt, SHA, etc." I came up with a bunch of things, most notably "Blowfish64", which fits nicely with the fact that my data is occasionally 4 bytes too long (and since it is guaranteed to be mod 12 = 0), this looks to me like padding for a 64-bit block size (odd numbers of vertices result in byte counts that are not mod 8). I also found error messages like:
"Invalid data size, (Size-4) mod 8 must be 0"
After reading Igor's response below, here is the output from signsrch. I've updated this image with green dots that cause no problems when replaced by int3, red if the program can't start, and orange if it fails when loading a file of interest. No dot means I haven't tested it yet.
Accessory Info
I'm using Windows 7 64-bit
viewer.exe is a Win32 x86 application
The data is base64 encoded as well as encrypted
The deciphered data is groups of 12 bytes representing 3 floats (x,y,z coordinates)
I have OllyDbg v1.1 with the findcrypt plugin, but my usage is limited to following along with this guy's YouTube videos
Many encryption algorithms use very specific constants to initialize the encryption state. You can check whether the binary contains them with a program like signsrch. If you get any plausible hits, open the file in IDA and search for the constants (Alt-B, binary search, helps here), then follow cross-references to try to identify the key(s) used.
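If you want a feel for what signsrch does, here is a small sketch in the same spirit: scan the binary for byte patterns of well-known published constants (AES S-box, Blowfish P-array values derived from pi, MD5/SHA-1 initial state, SHA-256 round constants). A hit suggests, but does not prove, that the corresponding algorithm is compiled in; both byte orders are checked because compilers usually emit these as 32-bit words.

    import struct
    import sys

    # Standard published constants; finding one in the binary is a strong hint.
    CONSTANTS = {
        "AES S-box (first bytes)": bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5]),
        "Blowfish P-array (pi digits)": (0x243F6A88, 0x85A308D3, 0x13198A2E, 0x03707344),
        "MD5/SHA-1 initial state": (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476),
        "SHA-256 round constants": (0x428A2F98, 0x71374491, 0xB5C0FBCF, 0xE9B5DBA5),
    }

    def patterns(value):
        if isinstance(value, bytes):
            yield "raw", value
        else:                      # tuple of 32-bit words: try both byte orders
            yield "LE", b"".join(struct.pack("<I", w) for w in value)
            yield "BE", b"".join(struct.pack(">I", w) for w in value)

    def scan(path):
        data = open(path, "rb").read()
        for name, value in CONSTANTS.items():
            for label, pat in patterns(value):
                off = data.find(pat)
                if off != -1:
                    print("%s [%s] at offset 0x%x" % (name, label, off))

    if __name__ == "__main__":
        scan(sys.argv[1])          # e.g. python scan_constants.py viewer.exe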
You can't differentiate good encryption (AES with XTS mode for example) from random data. It's not possible. Try using ent to compare /dev/urandom data and TrueCrypt volumes. There's no way to distinguish them from each other.
Edit: re-reading your question, the best way to determine which symmetric algorithm, hash and mode are being used (when you have a decryption key) is to try them all. Brute-force the possible combinations and have some test to determine whether you decrypted successfully. This is how TrueCrypt mounts a volume: it does not know the algorithm beforehand, so it tries all the possibilities and tests that the first few bytes decrypt to "TRUE".
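A hedged sketch of that "try them all" idea, assuming you eventually recover a candidate key: loop over algorithm/mode combinations and apply a plausibility test to the decrypted prefix. It uses the third-party pycryptodome package, and the plausibility test (12-byte x,y,z float records, per the question) plus the zero IVs and key-length slicing are illustrative assumptions, not known properties of the file format.

    import struct
    from Crypto.Cipher import AES, Blowfish, DES3   # pip install pycryptodome

    def looks_like_vertices(buf):
        """Heuristic: plausible if the first six 4-byte floats are finite and modest in size."""
        try:
            floats = struct.unpack("<6f", buf[:24])
        except struct.error:
            return False
        return all(abs(f) < 1e6 for f in floats)

    def try_all(ciphertext, key):
        iv8, iv16 = b"\x00" * 8, b"\x00" * 16       # assumption: unknown IVs, try zeros
        candidates = [
            ("AES-ECB",      lambda: AES.new(key[:16], AES.MODE_ECB)),
            ("AES-CBC",      lambda: AES.new(key[:16], AES.MODE_CBC, iv16)),
            ("Blowfish-ECB", lambda: Blowfish.new(key, Blowfish.MODE_ECB)),
            ("Blowfish-CBC", lambda: Blowfish.new(key, Blowfish.MODE_CBC, iv8)),
            ("3DES-CBC",     lambda: DES3.new(key[:24], DES3.MODE_CBC, iv8)),
        ]
        for name, make in candidates:
            try:
                plain = make().decrypt(ciphertext[:64])   # decrypt a small prefix only
            except Exception:
                continue                                  # wrong key length, parity, etc.
            if looks_like_vertices(plain):
                yield name, plain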

DICOM -- DCMMKDIR -- File Name Max 8 Characters

I am writing an application which uses DCMMKDIR. We save our images under somewhat longer names, but when I try to use the DCMMKDIR application, it asks me to input file names of at most 8 characters.
Presently, I am planning to rename my images from 1 to N, but remapping these images back to their known names on disk will be a bit difficult, I feel.
Are there any other methods or processes to achieve the same?
The restriction of the filename to eight characters is derived from the DICOM Standard to ensure compatibility with applications that support e.g. only ISO 9660 as a file system for CDs.
Regarding the naming, you can have a look at the specification for the German CD Testat (http://www.dicom-cd.de/docs/DRG-RequirementsSpecification-2006.pdf). As a vendor you can be certified as conforming to certain standards for interoperability of patient CDs, which is currently the most common use of DICOMDIRs.
The DICOMDIR file generated by DCMMKDIR is essentially an index telling an application which DICOM files exist in a certain directory; this kind of file structure is most commonly used for transfer media.
It is very common to use subdirectories to circumvent the restriction to 8-character filenames. For example, a file could be stored as 20110331/183421/000001, which identifies it properly by date, time and index without exceeding the somewhat arcane filename limit.
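A trivial sketch of that naming scheme, just to make it concrete: every path component stays within 8 characters, yet the full path still encodes date, time and a running index (the exact layout is your choice; this merely mirrors the example above).

    from datetime import datetime

    def dicomdir_path(when, index):
        """Build a path like 20110331/183421/000001; every component is at most 8 characters."""
        return "%s/%s/%06d" % (when.strftime("%Y%m%d"), when.strftime("%H%M%S"), index)

    # dicomdir_path(datetime(2011, 3, 31, 18, 34, 21), 1) -> "20110331/183421/000001"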
