I like the RIFF standards but I don't like that file length is limited to 4GB. Is there a good modern file format standard for storing binary data that is like RIFF?
I see that for long WAV files, Adobe programs will essentially just ignore the file-length bytes, but if I am making a new file format it seems weird to include bytes only to NOT use them. Is RIFF generally frowned upon in modern formats, or not?
I have a file that doesn't seem to contain anything readable (for a human).
How can I be sure that there is nothing human-readable in it? It's far too large to read in its entirety (maybe with a program that searches for words, or measures entropy, I don't know).
How can I tell whether this file is compressed or encrypted, or both? And is it possible that it uses a proprietary compression scheme, so that I can't distinguish it from encryption?
If I can confirm that it's encrypted, I can stop my work right away, but if it's just encoded/compressed, maybe I can find a way to read it.
(I tried compressing it with the basic Windows archiver and it shrank by 18%. Does that mean it's not encrypted? Would encrypted data compress that much?)
Yes, it is certainly possible to create a compression format for which all possible sequences of bits are valid. In that case, you would not be able to distinguish the compressed data from random or encrypted data.
I am not aware of a commonly implemented compression format that has that property. You could try all of the decompressors you can find on the data, to see if any of them decompresses all the way through without erroring out. You can also try starting at different offsets in the data, since there may be some sort of header before the compressed payload.
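Since you mention entropy: as a first-pass heuristic you can measure the byte-level Shannon entropy of the file yourself. This is a minimal C# sketch (my own throwaway code, not any standard tool); plain text usually sits around 4-5 bits/byte, while well-compressed or encrypted data is close to the 8 bits/byte maximum:

using System;
using System.IO;

class EntropyCheck
{
    static void Main(string[] args)
    {
        var counts = new long[256];
        long total = 0;

        // Tally byte frequencies over the whole file.
        using (var fs = File.OpenRead(args[0]))
        {
            var buffer = new byte[81920];
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                    counts[buffer[i]]++;
                total += read;
            }
        }

        // Shannon entropy in bits per byte; 8.0 is the theoretical maximum.
        double entropy = 0.0;
        for (int i = 0; i < 256; i++)
        {
            if (counts[i] == 0) continue;
            double p = (double)counts[i] / total;
            entropy -= p * Math.Log(p, 2);
        }

        Console.WriteLine("{0:F3} bits/byte", entropy);
    }
}

Note this only tells you whether the data looks random; as discussed above, it cannot distinguish good compression from encryption. But the fact that the Windows archiver got 18% out of your file suggests there is still exploitable structure in it, which would be surprising for properly encrypted data.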
Online Decryption
If you would like to try decrypting the file, you could simply paste its contents into https://online-toolz.com/tools/text-encryption-decryption.php; that tool can decrypt messages quickly.
Encoder & Decoder
https://www.base64decode.org/
I found this website a while ago; it is trusted and fast, with great reviews. This method may also help with your request.
I'm developing an application that needs to be able to create and manipulate SQLite databases in user-defined paths. I'm running into a problem I don't really understand. I'm testing my code against some really gross sample data with huge, unwieldy Unicode paths; for most of them there isn't a problem, but for one there is.
An example of a working connection string is:
Data Source="c:\test6\意外な高価で売れるかも? 出品は手順を覚えれば後はかんたん!\11オークションストアの出品は対象外とさせていただきます。\test.db";Version=3;
While one that fails is:
Data Source="c:\test6\意外な高価で売れるかも? 出品は手順を覚えれば後はかんたん!\22今やPCライフに欠かせないのがセキュリティソフト。そのため、現在何種類も発売されているが、それぞれ似\test.db";Version=3;
I'm using System.Data.SQLite v1.0.66.0 due to reasons outside of my control, but I quickly tested with the latest, v1.0.77.0 and had the same problems.
This happens both when attempting to newly create the test.db file and when I manually put one there for it to open: SQLiteConnection.Open throws an exception saying only "Unable to open the database file", with the stack trace showing that it's actually System.Data.SQLite.SQLite3.Open that is throwing.
Is there any way I can get System.Data.SQLite to play nicely with these paths? A workaround could be to create and manipulate my databases in a temporary location and then just move them to the actual locations for storage, since I can create and manipulate files normally otherwise. That's kind of a last resort though.
Thank you.
I am guessing you are on a Japanese-locale machine where the default system encoding (ANSI code page) is cp932 Japanese (≈Shift-JIS).
The second path contains:
ソ
which encodes to the byte sequence:
0x83 0x5C
Shift-JIS is a multibyte encoding that has the unfortunate property of sometimes re-using ASCII code units in the trail byte. In this case it has used byte 0x5C which corresponds to the backslash \. (Though this typically displays as a yen sign in Japanese fonts, for historical reasons.)
So if this pathname is passed into a byte-based API, it will get encoded in the ANSI code page, and you won't be able to tell the difference between a backslash meant as a directory separator and one that is a side-effect of the multi-byte encoding. Consequently, any path containing one of the following characters will fail when accessed with a byte-based IO method:
―ソЫⅨ噂浬欺圭構蚕十申曾箪貼能表暴予禄兔喀媾彌拿杤歃畚秉綵臀藹觸軆鐔饅鷭偆砡纊犾
(Also any pathname that contains a Unicode character not present in cp932 will naturally fail.)
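You can see the collision for yourself from .NET. A quick sketch (assuming .NET Framework, where code page 932 is available without registering an extra encoding provider):

using System;
using System.Text;

class SjisDemo
{
    static void Main()
    {
        // Code page 932 is Windows' Shift-JIS variant.
        Encoding sjis = Encoding.GetEncoding(932);
        byte[] bytes = sjis.GetBytes("ソ");

        Console.WriteLine(BitConverter.ToString(bytes)); // prints "83-5C"
        Console.WriteLine((char)bytes[1] == '\\');       // True: the trail byte is the backslash code unit
    }
}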
It would appear that behind the scenes SQLite is using a byte-based IO method to open the filename it is given. This is unfortunate, but extremely common in cross-platform code, because the POSIX C standard library is defined to use byte-based filenames for operations like file open().
Consequently, using the C stdlib functions, it is impossible to reliably access files with non-ASCII names. This sad situation is inherited by all sorts of cross-platform libraries and languages built on the stdlib; only tools written with specific support for Win32 Unicode filenames (e.g. Python) can reliably access all files under Windows.
Your options, then, are:
avoid using non-ASCII characters in the path name for your db, as per the move/rename suggestion;
continue to rely on the system locale being Japanese (ANSI code page=932), and just rename files to avoid any of the characters listed above;
get the short (8.3) filename of the file in question and use that instead of the real one—something like c:\test6\85D0~1\22PC~1\test.db. You can use dir /x to see the short-filenames. They are always pure ASCII, avoiding the encoding problem;
add some code to get the short filename from the real one, using GetShortPathName. This is a Win32 API, so you need a little help to call it from .NET (see the P/Invoke sketch after this list). Note also that short filenames will still fail if run on a machine with short-filename generation disabled;
persuade SQLite to add support for Windows Unicode filenames;
persuade Microsoft to fix this problem once and for all by making the default encoding for byte interfaces UTF-8, like it is on all other modern operating systems.
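For the GetShortPathName option, here is a minimal P/Invoke sketch (the wrapper name and error handling are my own; note that GetShortPathName only works on paths that already exist on disk, so you may have to create an empty database file first):

using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

static class ShortPath
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetShortPathName(string longPath, StringBuilder shortPath, uint bufferSize);

    // Returns the 8.3 form of an existing path, e.g. c:\test6\85D0~1\22PC~1\test.db
    public static string Get(string longPath)
    {
        var buffer = new StringBuilder(260); // MAX_PATH
        uint length = GetShortPathName(longPath, buffer, (uint)buffer.Capacity);
        if (length == 0 || length > buffer.Capacity)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return buffer.ToString();
    }
}

You would then build the connection string from the all-ASCII short path, e.g. "Data Source=\"" + ShortPath.Get(realPath) + "\";Version=3;".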
I have used the Evil DICOM library to read a DICOM file. It displays the raw DICOM file correctly, but it does not display the other formats. Please suggest a solution, or another C# library that reads all the formats correctly.
I assume you are talking about DICOM files with compressed images. You can access the fragments in the pixel data element and uncompress them yourself in Evil Dicom:
DicomFile df = new DicomFile("compressed.dcm");   // read the compressed file
Fragment[] frags = df.PixelData.Fragments;        // raw fragments of the compressed pixel data stream
but obviously this is more complicated than you probably want. I will try to get the CompressionHelper class running within the next few versions. Many compression formats are proprietary, and code for decompression is hard to find.
I believe Grassroots DICOM may be what you are looking for. Not as easy as Evil Dicom, but it supports the formats you want.
The Mac version of my application has just started breaking its full-screen and normal layouts, which I save and restore using QSettings. Even old versions of my application are now playing up for my customers.
I was just googling for something similar when I found a bug report which contained an interesting line:
QSettings s;
restoreState(s.value(QString::fromLocal8Bit("state")).toByteArray());
When saving to the computer's plists or the Windows registry, do I have to format the data with this fromLocal8Bit()?
http://bugreports.qt-project.org/browse/QTBUG-8631
http://bugreports.qt-project.org/secure/attachment/13400/main.cpp
It's not the data that is encoded, it's just the literal "state". The values are properly encoded and decoded if you use QByteArray or QString.
The QString::fromLocal8Bit() part is for converting the string literal in the source file to a Unicode string. It's good practice to stick to ASCII in source files, as other encodings such as UTF-8 usually don't work with all compilers, especially MSVC.
To convert literals to QString I would suggest using QLatin1String:
QLatin1String("state")
Strictly speaking, fromLocal8Bit() is wrong here, as the local 8-bit encoding has nothing to do with the encoding of the source file the literal comes from.
But as "state" doesn't contain any non-ASCII characters, it shouldn't matter anyway.
I have files in ASN.1 format that I have to convert into CSV format, and into something human-readable as well.
I need a decoder with some advanced options, like scheduling and automatic FTP.
Pretty old thread, but it still comes up at the top of a Google search, and it will be good for people to get an answer. It's not exactly Unix programming, but you can find a generic JavaScript-based ASN.1 decoder at http://lapo.it/asn1js/.
You can also download it and run it natively on your box.
Erlang provides very good support for reading and writing BER-, DER- and PER-encoded ASN.1 packets. The Erlang compiler even accepts ASN.1 syntax natively and produces a custom codec as an Erlang module.
It's a whole new programming language for most people though, so whether it's worth learning it just for this exercise, I'll leave up to you. It is a very fun language to learn, however, and will teach you a very different way to think about programming.
You could have a look at the asn1 compiler.
It converts ASN.1 syntax files to C code.
As Marcelo noted on your question, you didn't say precisely what you need, so I can't tell whether it covers all your bases, but you will be able to compile the code into a binary (from the C code, obviously).
There is an open-source package called asn1c which will do ASN.1 encoding and decoding. It's a C library that you need to build and then write code around to implement your program. In order to build the library, it requires the ASN.1 syntax file that is used to construct the encoded messages. When decoding, one option is to output the data to an XML file (XER), which you would then need to convert to a CSV file somehow. At a minimum, it supports BER, XER, and PER.
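For the XML-to-CSV step, here is a rough C# sketch of one way to do it (the flattening scheme, argument handling, and column layout are my own choices, not anything asn1c provides): it loads the XER output and writes one path,value row per leaf element.

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

class XerToCsv
{
    static void Main(string[] args)
    {
        // args[0]: XER (XML) file produced by the asn1c-based decoder
        // args[1]: destination CSV file
        XDocument doc = XDocument.Load(args[0]);

        using (var csv = new StreamWriter(args[1]))
        {
            csv.WriteLine("path,value");
            // Every element with no children is a leaf value; emit its
            // slash-separated element path and its text content.
            foreach (var leaf in doc.Descendants().Where(e => !e.HasElements))
            {
                string path = string.Join("/",
                    leaf.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName));
                string value = leaf.Value.Trim().Replace("\"", "\"\"");
                csv.WriteLine("{0},\"{1}\"", path, value);
            }
        }
    }
}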
In Python, there is also the PyASN1 library and tools: http://pyasn1.sourceforge.net/