Interface design for binary file viewer - software-design

I'm currently working on a simple binary viewer for different file formats.
In essence, it parses the binary file into a human-readable representation (usually something similar to JSON).
Taking a JSON result as an example, each field would have an associated location in the binary data. This would be an array of intervals (a single interval if the source is a contiguous block).
My problem is how to design a generic interface for a File Format Reader that can support a reasonable range of different file formats.
Some examples to illustrate the point better:
Suppose the binary file contains a point consisting of two integer coordinates of 4 bytes each:
function read_point( stream ){
    const point = new Record();
    point.addField( "x", Integer( stream.read( 4 ) ) );
    point.addField( "y", Integer( stream.read( 4 ) ) );
    return point;
}
Here stream.read(n) would return n bytes as well as the location of those bytes, for example as a struct of the form { data: ?, location: ? }.
A function like Integer would then interpret the bytes as an int and keep the location.
Record could then work similarly to the results of stream.read and Integer: it has data, namely the values of x and y, and also a location, the union of the locations of x and y.
This sort of design, with each output being of the form { data: ?, location: ? }, works fine for data whose size is known in advance and whose location is more or less contiguous.
Can this idea be extended or reworked for cases like the following?
An Integer of 4 bytes whose bytes are not contiguous:
function exotic_example_1( stream ){
    const higher = stream.read( 2 );
    stream.read( 1 ); // Some padding data, for example
    const lower = stream.read( 2 );
    Integer( ?join?( higher, lower ) );
}
Or, for example, a string terminated by a \0 byte, where you might start decoding the data while you are still reading it.
I want the interface to take a form where keeping track of the origin of the bytes is largely automated.
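One possible way to extend the { data, location } idea is to make the location a list of intervals from the start, so that a join operation can simply merge the intervals of its parts. The following is a minimal Python sketch under that assumption; the names Located, Stream, join, integer and read_until are hypothetical and not taken from any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Located:
    """A value plus the half-open [start, end) byte intervals it came from."""
    data: object
    location: tuple


def merge(intervals):
    """Sort intervals and fuse the ones that touch or overlap."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return tuple(merged)


class Stream:
    def __init__(self, buf: bytes):
        self.buf, self.pos = buf, 0

    def read(self, n: int) -> Located:
        start = self.pos
        chunk = self.buf[start:start + n]
        if len(chunk) != n:
            raise EOFError("unexpected end of data")
        self.pos += n
        return Located(chunk, ((start, start + n),))

    def read_until(self, terminator: bytes) -> Located:
        """Read up to and including a terminator whose position is not known in advance."""
        end = self.buf.index(terminator, self.pos) + len(terminator)
        return self.read(end - self.pos)


def join(*parts: Located) -> Located:
    """Concatenate byte chunks in call order and keep every source interval."""
    data = b"".join(p.data for p in parts)
    return Located(data, merge(i for p in parts for i in p.location))


def integer(raw: Located) -> Located:
    """Interpret the bytes as a big-endian int (an assumption) and keep the location."""
    return Located(int.from_bytes(raw.data, "big"), raw.location)


def exotic_example_1(stream: Stream) -> Located:
    higher = stream.read(2)
    stream.read(1)  # padding, deliberately left out of the join
    lower = stream.read(2)
    return integer(join(higher, lower))
```

With this shape, a Record could compute its own location by calling merge over the locations of its fields, and a \0-terminated string is just the result of read_until passed through a decoding function that forwards the location unchanged.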

Related

DM Script to import a 2D image in text (CSV) format

Using the built-in "Import Data..." functionality we can import a properly formatted text file (such as CSV and/or tab-delimited) as an image. It is rather straightforward to write a script to do so. However, my scripting approach is not efficient: it requires me to loop through each row (using the "StreamReadTextLine" function), so it takes a while to import a 512x512 image.
Is there a better way or an "undocumented" script function that I can tap into?
DigitalMicrograph offers an import functionality via the File/Import Data... menu entry, which will give you this dialog:
The functionality invoked by this dialog can also be accessed from a script, with the command
BasicImage ImageImportTextData( String img_name, ScriptObject stream, Number data_type_enum, ScriptObject img_size, Boolean lines_are_rows, Boolean size_by_counting )
As with the dialog, one has to pre-specify a few things.
The data type of the image.
This is a number. You can find out which number belongs to which image data type by, for example, creating an image and outputting its data type:
image img := Realimage( "", 4, 100 )
Result("\n" + img.ImageGetDataType() )
The file stream object
This object describes where the data is stored. The F1 help documentation explains how to create a file stream from an existing file, but essentially you need to specify a path to the file, open the file for reading (which gives you a handle), and then use that file handle to create the stream object.
string path = "C:\\test.txt"
number fRef = OpenFileForReading( path )
object fStream = NewStreamFromFileReference( fRef, 1 )
The image size object
This is a specific script object you need to allocate. It wraps image size information. In case of auto-detecting the size from the text, you don't need to specify the actual size, but you still need the object.
object imgSizeObj = Alloc("ImageData_ImageDataSize")
imgSizeObj.SetNumDimensions(2) // Not needed for counting!
imgSizeObj.SetDimensionSize(0,10) // Not used for counting
imgSizeObj.SetDimensionSize(1,10) // Not used for counting
Boolean checks
As with the checkboxes in the UI, you specify two conditions:
Lines are Rows
Get Size By Counting
Note that the "counting" flag is only used if "Lines are Rows" is also true, same as with the dialog.
The following script imports a text file with counting:
image ImportTextByCounting( string path, number DataType )
{
    number fRef = OpenFileForReading( path )
    object fStream = NewStreamFromFileReference( fRef, 1 )
    number bLinesAreRows = 1
    number bSizeByCount = 1
    bSizeByCount *= bLinesAreRows // Only valid together!
    object imgSizeObj = Alloc("ImageData_ImageDataSize")
    image img := ImageImportTextData( "Imag Name ", fStream, DataType, imgSizeObj, bLinesAreRows, bSizeByCount )
    return img
}
string path = "C:\\test.txt"
number kREAL4_DATA = 2
image img := ImportTextByCounting( path, kREAL4_DATA )
img.ShowImage()

A datetime arithmetic in MQL4

I would like to define a datetime type variable that is a result of a simple arithmetic operation between datetime type variables.
I've defined:
datetime duration = ( TimeCurrent() - OrderOpenTime() );
datetime TmStop = StringToTime( "1970.01.01 16:00" );
but when I call it in some other arithmetic operation or generally in code like this
ExitBuy_H1 = ( duration > TmClose && ...
or this
text[3]= "Duration: " + TimeToStr( duration, TIME_MINUTES );
it doesn't work.
TmStop instead works fine.
Does anyone know why?
datetime is a simple integer: the number of seconds elapsed since 1970.01.01 00:00. duration in your example is also in seconds, even though it is typed as datetime; if you need it in minutes, divide by 60. TmClose from your example is 16*60*60 seconds, and you can of course compare that integer with any other int, but what would be the reason for that?
If you hold your position for more than 16 hours, then duration > TmClose is true. If you convert the difference in seconds (duration) into a time, you get the time 1970.01.01 00:00 + duration seconds.
Anyway, it is not clear what your goal is with these calculations. If you want to make sure that you have held a particular position for more than x hours, then simply use bool holdMoreThanXHours = TimeCurrent()-OrderOpenTime()>x*PeriodSeconds(PERIOD_H1), and do not forget to reselect each ticket if you have several open.
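The arithmetic described above can be illustrated outside MQL4. The following Python sketch uses two made-up timestamps as stand-ins for OrderOpenTime() and TimeCurrent(); it only shows that the difference of two epoch-second values is a plain number of seconds, and that formatting it as a date lands shortly after the epoch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical values in seconds since the epoch, standing in for
# OrderOpenTime() and TimeCurrent()
open_time = int(datetime(2016, 4, 20, 8, 0, tzinfo=timezone.utc).timestamp())
current_time = int(datetime(2016, 4, 21, 2, 30, tzinfo=timezone.utc).timestamp())

duration = current_time - open_time   # plain seconds (18.5 hours here)
threshold = 16 * 60 * 60              # "1970.01.01 16:00" expressed as seconds

held_longer_than_16h = duration > threshold
duration_minutes = duration // 60

# Treating the duration as a point in time gives epoch + duration
as_date = EPOCH + timedelta(seconds=duration)
```

Here as_date falls on 1970-01-01, which is why printing only the hours and minutes of a duration looks plausible for anything shorter than a day and wraps around after that.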
Fact A) the code, as-is, works beyond any question.
//+------------------------------------------------------------------+
//| Test_StackOverflow.mq4 |
//+------------------------------------------------------------------+
#property strict
void OnStart() {
    datetime duration = ( TimeCurrent() - OrderOpenTime() );
    string txt = "Duration: " + TimeToStr( duration, TIME_MINUTES );
}
//+------------------------------------------------------------------+
0 error(s), 0 warning(s), compile time: 2000 msec
Fact B) the full MCVE-context of the code, as-is, is missing.
StackOverflow requires users to post a complete MCVE-representation of the problem. This requirement was not met in the original post.
While the datetime and int data-types are mutually interchangeable, the problem does not seem to be hidden in this intrinsic "duality" of a value representation, but must be somewhere else.
The main suspects for Why? are:
variable definition was masked by another variable having the same name
variable scope-of-definition was exceeded ( accessing a variable outside of its scope )
db.Pool-operations were not preceded by OrderSelect()

systemverilog constraint dist using weights array

I need to be able to set a constraint dist with 64 different, changeable weights:
I need to randomly pick an index in the range 0~63, where every index has its own weight / probability of being chosen.
I can write something like:
constraint pick_chan_constraint {pick_channel dist{
0:=channel_weight[0], 1:=channel_weight[1], 2:=channel_weight[2],
3:=channel_weight[3], 4:=channel_weight[4], 5:=channel_weight[5],
6:=channel_weight[6], 7:=channel_weight[7], 8:=channel_weight[8],
9:=channel_weight[9], 10:=channel_weight[10], 11:=channel_weight[11],
12:=channel_weight[12], 13:=channel_weight[13],
14:=channel_weight[14], … ...
NUM_OF_CHANS-1 := channel_weight[NUM_OF_CHANS-1] }}
Obviously this is bad code and a bad idea, for two reasons:
No flexibility: if NUM_OF_CHANS changes, I'll need to change the code.
It's long and ugly and almost unreadable.
Any ideas?
Thanks
IEEE Std 1800-2012 § 18.5.4 Distribution shows the dist_list needs to be a list of dist_items and a dist_item is defined as a value_range [ dist_weight ]. In other words the distribution needs to be listed out.
Instead of using a constraint you could create a queue array (§ 7.10 Queues) and then use the shuffle method (§ 7.12.2 Array ordering methods). Example:
int channel_weight [64];
int pick_channel;
int weight_chain [$];
weight_chain.delete(); // make sure it is empty
foreach (channel_weight[i]) begin
  repeat (channel_weight[i]) begin
    weight_chain.push_back(i);
  end
end
weight_chain.shuffle(); // randomize order
assert( weight_chain.size() > 0) else $error("all channel_weights are 0");
pick_channel = weight_chain[0];
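The same expand-and-pick technique can be sketched in Python to make the probabilities easier to reason about. This is an illustration of the idea only, not SystemVerilog; the function names are hypothetical.

```python
import random


def build_weight_chain(channel_weight):
    """Repeat each index as many times as its weight, like the queue above."""
    chain = []
    for index, weight in enumerate(channel_weight):
        chain.extend([index] * weight)
    return chain


def pick_channel(channel_weight, rng=random):
    chain = build_weight_chain(channel_weight)
    if not chain:
        raise ValueError("all channel_weights are 0")
    # Choosing one random element has the same distribution as
    # shuffling the chain and taking element 0
    return rng.choice(chain)
```

An index with weight w occupies w of the sum(channel_weight) slots, so it is chosen with probability w / sum(channel_weight). The memory cost grows with the sum of the weights, so very large weights may be worth scaling down first.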

Decrypting data with Rsa Private Key/Function

I'm just trying to encode/decode data (decoding first) with RSA.
I don't care about the type of data (string, bytes, or other lolcat-encoded data); I'm just looking for a very simple function that does the job (the cryptographic operations), and I can do the rest.
Here is what I tried:
CryptoPP::InvertibleRSAFunction rsa;
rsa.SetModulus( n_factor );
rsa.SetPrivateExponent(d_private_exponent);
rsa.SetPrime1( rsa_params.at(p1) );
rsa.SetPrime2( rsa_params.at(p2) );
// All of the above inputs are correct. I don't have the public exponent, but it works in other languages (I compared all inputs/outputs)
bool key_ok = rsa.Validate(CryptoPP::NullRNG(), 15);
/* Returns false, but doesn't tell me why :/ */
CryptoPP::Integer to_decode( my_data, size_of_my_data );
res = rsa.CalculateInverse(rg,to_decode);
This returns:
EXCEPTION : what(): CryptoMaterial: this object contains invalid values
Returned error code corresponds to "INVALID_DATA_FORMAT", which probably doesn't mean a key problem, but an input problem.
If anyone has some experience with this library, or spots a "noob" mistake (I've had my head in this code for 4 hours, so it is possible), any help would be appreciated.
Easier than I thought:
CryptoPP::Integer dec_data = a_exp_b_mod_c( enc_data, d_private_exponent, n_modulus );
So 'd' and 'n' were enough. I really don't understand why it didn't work with the crypto features.
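The one-line fix is plain modular exponentiation, m = c^d mod n. As a rough cross-check in another language, here is a minimal Python sketch of the same raw operation; the function name and the toy key are assumptions for illustration, and no padding scheme is removed.

```python
def rsa_raw_decrypt(ciphertext: bytes, d: int, n: int) -> bytes:
    """Textbook RSA: m = c^d mod n, with no padding removal."""
    c = int.from_bytes(ciphertext, "big")
    if c >= n:
        raise ValueError("ciphertext must be smaller than the modulus")
    m = pow(c, d, n)  # three-argument pow performs modular exponentiation
    size = (n.bit_length() + 7) // 8
    return m.to_bytes(size, "big")


# Toy key for illustration only (p = 61, q = 53); real keys are far larger
n, e, d = 3233, 17, 2753
```

For example, encrypting the value 65 with pow(65, e, n) and passing the result through rsa_raw_decrypt recovers 65 as a big-endian byte string.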

URL-Compact representation of GUID/UUID?

I need to generate a GUID and save it via a string representation. The string representation should be as short as possible as it will be used as part of an already-long URL string.
Right now, instead of using the normal abcd-efgh-... representation, I use the raw bytes generated and base64-encode them instead, which results in a somewhat shorter string.
But is it possible to make it even shorter?
I'm OK with losing some degree of uniqueness and keeping a counter, but scanning all existing keys is not an option. Suggestions?
I used an Ascii85 encoding for writing a Guid to a database column in 20 ASCII characters. I've posted the C# code in case it is useful. The specific character set may be different for a URL encoding, but you can pick whichever characters suit your application. It's available here: What is the most efficient way to encode an arbitrary GUID into readable ASCII (33-127)?
Sure, just use a base larger than 64. You'll have to encode them using a custom alphabet, but you should be able to find a few more "url-safe" printable ASCII characters.
Base64 encodes 6 bits per 8-bit character, so a 16-byte GUID value becomes 22 characters when encoded (without padding). You may be able to reduce that by a character or two, but not much more.
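As a baseline for the 22-character figure, the following Python sketch encodes the 16 raw bytes with the URL-safe Base64 alphabet and drops the padding. The helper names are hypothetical; the uuid and base64 calls are from the standard library.

```python
import base64
import uuid


def guid_to_short(u: uuid.UUID) -> str:
    """16 raw bytes -> 22 URL-safe characters (the '==' padding is dropped)."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def short_to_guid(s: str) -> uuid.UUID:
    """Restore the padding and decode back to the original GUID."""
    padded = s + "=" * (-len(s) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))
```

The URL-safe alphabet uses '-' and '_' instead of '+' and '/', so the result can be placed in a URL path without escaping.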
I found this discussion interesting: https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/
Basically you take the 36 characters and turn them into 16 bytes of binary, but first reorder the three time-based pieces using a stored function:
set @uuid:= uuid();
select @uuid;
+--------------------------------------+
| @uuid |
+--------------------------------------+
| 59f3ac1e-06fe-11e6-ac3c-9b18a7fcf9ed |
+--------------------------------------+
CREATE DEFINER=`root`@`localhost`
FUNCTION `ordered_uuid`(uuid BINARY(36))
RETURNS binary(16) DETERMINISTIC
RETURN UNHEX(CONCAT(SUBSTR(uuid, 15, 4),SUBSTR(uuid, 10, 4),SUBSTR(uuid, 1, 8),SUBSTR(uuid, 20, 4),SUBSTR(uuid, 25)));
select hex(ordered_uuid(@uuid));
+----------------------------------+
| hex(ordered_uuid(@uuid)) |
+----------------------------------+
| 11e606fe59f3ac1eac3c9b18a7fcf9ed |
+----------------------------------+
I'm not sure if this is feasible, but you could put all the generated GUIDs in a table and use in the URL only the index of the GUID in the table.
You could also reduce the length of the GUID: for example, use 2 bytes for the number of days since 2010 and 4 bytes for the number of milliseconds since the start of the current day. You will then have collisions only for 2 GUIDs generated in the same millisecond. You could also add 2 more random bytes, which will make this even better.
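The 2 + 4 + 2 byte layout suggested above could look roughly like this in Python. The function names, the UTC day boundary and the big-endian byte order are assumptions made for the sketch.

```python
import base64
import secrets
import struct
from datetime import datetime, timedelta, timezone

BASE_DATE = datetime(2010, 1, 1, tzinfo=timezone.utc)


def compact_id(now=None, random_tail=None) -> bytes:
    """2 bytes: days since 2010, 4 bytes: ms since midnight UTC, 2 random bytes."""
    now = now or datetime.now(timezone.utc)
    days = (now - BASE_DATE).days
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    millis = (now - midnight) // timedelta(milliseconds=1)
    tail = random_tail if random_tail is not None else secrets.token_bytes(2)
    # ">HI" packs a big-endian unsigned 16-bit and unsigned 32-bit integer;
    # the 16-bit day counter stops fitting after 65535 days
    return struct.pack(">HI", days, millis) + tail


def compact_id_text(raw: bytes) -> str:
    """URL-safe Base64 without padding: 8 bytes become 11 characters."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```

Two ids created in the same millisecond still collide if their two random bytes happen to match, so this trades some uniqueness for length, as the question allows.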
(It has been a long time, but I ran into the same need today.)
UUIDs are 128 bits long, represented by 32 hex digits plus 4 hyphens.
If we use a dictionary of 64 (2^6) printable ASCII characters, it is just a matter of converting from 32 groups of 4 bits (the length of a hex digit) to 22 groups of 6 bits.
Here is a UUID shortener. Instead of 36 chars you get 22, without losing the original bits.
https://gist.github.com/tomlobato/e932818fa7eb989e645f2e64645cf7a5
class UUIDShortner
  IGNORE = '-'
  BASE6_SLAB = ' ' * 22
  # 64 (6 bits) items dictionary
  DICT = 'a'.upto('z').to_a +
         'A'.upto('Z').to_a +
         '0'.upto('9').to_a +
         ['_', '-']
  def self.uuid_to_base6 uuid
    uuid_bits = 0
    uuid.each_char do |c|
      next if c == IGNORE
      uuid_bits = (uuid_bits << 4) | c.hex
    end
    base6 = BASE6_SLAB.dup
    base6.size.times { |i|
      base6[i] = DICT[uuid_bits & 0b111111]
      uuid_bits >>= 6
    }
    base6
  end
end
# Examples:
require 'securerandom'
uuid = ARGV[0] || SecureRandom.uuid
short = UUIDShortner.uuid_to_base6 uuid
puts "#{uuid}\n#{short}"
# ruby uuid_to_base6.rb
# c7e6a9e5-1fc6-4d5a-b889-4734e42b9ecc
# m75kKtZrjIRwnz8hLNQ5hd
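The Ruby class above only encodes. To show that no bits are lost, here is a Python sketch of the same 6-bit scheme together with a decoder; the function names are hypothetical, and the dictionary order is copied from DICT above.

```python
import string

# Same 64-symbol dictionary as DICT above: a-z, A-Z, 0-9, '_', '-'
DICT = string.ascii_lowercase + string.ascii_uppercase + string.digits + "_-"


def uuid_to_short(uuid_text: str) -> str:
    bits = int(uuid_text.replace("-", ""), 16)
    out = []
    for _ in range(22):
        out.append(DICT[bits & 0b111111])  # least significant 6 bits first
        bits >>= 6
    return "".join(out)


def short_to_uuid(short: str) -> str:
    bits = 0
    for ch in reversed(short):  # undo the least-significant-first order
        bits = (bits << 6) | DICT.index(ch)
    h = f"{bits:032x}"
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"
```

Since 22 groups of 6 bits hold 132 bits, the last character only ever carries the top 2 bits of the UUID.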
You could approach this from the other direction. Produce the shortest possible string representation and map it into a Guid.
Generate the key using a defined alphabet as below:
In pseudocode:
string RandomString(char[] alphabet, int length)
{
    StringBuilder result = new StringBuilder();
    // RandomInt is assumed to return a value in [0, alphabet.Length)
    for (int i = 0; i < length; i++)
        result.Append(alphabet[RandomInt(0, alphabet.Length)]);
    return result.ToString();
}
If you keep the string length < 16, you can simply hex encode the result and pass it to the Guid constructor to parse.
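A rough Python sketch of this reverse direction follows: generate a short key, then hex-encode its bytes and zero-pad to the 16 bytes a GUID holds. The alphabet, the default key length and the zero-padding convention are assumptions made for illustration.

```python
import secrets
import string
import uuid

ALPHABET = string.ascii_letters + string.digits  # hypothetical URL-safe alphabet


def random_key(length: int = 12) -> str:
    if not 0 < length <= 16:
        raise ValueError("key must fit in the 16 bytes of a GUID")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def key_to_guid(key: str) -> uuid.UUID:
    """Hex-encode the ASCII bytes, zero-padded on the right to 16 bytes."""
    raw = key.encode("ascii")
    return uuid.UUID(hex=raw.ljust(16, b"\0").hex())


def guid_to_key(g: uuid.UUID) -> str:
    """Strip the zero padding to recover the short key."""
    return g.bytes.rstrip(b"\0").decode("ascii")
```

A 12-character key over a 62-symbol alphabet carries roughly 71 bits of randomness, which is the degree of uniqueness given up in exchange for the shorter URL.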
This is not for exactly the same problem, but very close: I have used CRC64 and then Base64-encoded the result, which gives you 11 bytes. CRC64 has been tested (not proven) to NOT produce duplicates on a wide range of strings.
And since it is 64 bits long by definition, you get a key that is half the size.
To directly answer the original question: you can CRC64-encode any representation of your GUIDs.
Or just run CRC64 on the business key and you will have a 64-bit unique value that you can then Base64-encode.
