I am trying to convert a binary file parser from Python to Julia. I am struggling to figure out how to unpack the binary stream with a specific format. I found this Discourse thread that is exactly what I am trying to do, but it's from 2017 and doesn't seem to have a working solution. Does anyone have a solution?
In Python it looks like this:
In [22]: struct.unpack('>idi', b'\x00\x00\x00\x17@\t\x1e\xb8Q\xeb\x85\x1f\x00\x00\x00*')
Out[22]: (23, 3.14, 42)
In Julia I am here:
data = open(filename, "r")
seek(data, 0)
# now I want to get the first 12 bytes of the file and convert to a string.. and am stumped..
I'm not very familiar with struct.unpack in Python, but maybe you could do something like this:
julia> data = IOBuffer("\x00\x00\x00\x17@\t\x1e\xb8Q\xeb\x85\x1f\x00\x00\x00*")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> seekstart(data)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> i = bswap(read(data, Int32))
23
julia> pi = bswap(read(data, Float64))
3.14
julia> i = bswap(read(data, Int32))
42
The bswap calls are there because the stream is big-endian (the '>' in the Python format string), while Julia reads values in the host's byte order, which is little-endian on most machines. Apart from that, this is just a plain use of read, specifying the type of data to be read.
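The same idea can be wrapped in a small helper. This is only a sketch: it assumes the stream is big-endian, and `read_be` is a hypothetical name, not a library function. It uses `ntoh`, which swaps bytes only when the host is little-endian, so the code does not depend on the machine it runs on.

```julia
# Hypothetical helper: read one big-endian value of type T from a stream.
# ntoh converts from network (big-endian) order to the host's order.
read_be(io::IO, ::Type{T}) where {T} = ntoh(read(io, T))

# The same 16 bytes as in the Python example (format '>idi').
bytes = UInt8[0x00, 0x00, 0x00, 0x17,
              0x40, 0x09, 0x1e, 0xb8, 0x51, 0xeb, 0x85, 0x1f,
              0x00, 0x00, 0x00, 0x2a]
io = IOBuffer(bytes)

a = read_be(io, Int32)    # 4-byte signed integer
b = read_be(io, Float64)  # 8-byte float
c = read_be(io, Int32)    # 4-byte signed integer
```

For a real file, the `IOBuffer` would be replaced by the handle returned from `open(filename, "r")`.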
By the way, here is how you would read the first 12 bytes of the file, and convert them to a string (which is not really necessary in this case, but could be useful in others):
julia> seekstart(data)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> bytes = read(data, 12)
12-element Array{UInt8,1}:
0x00
0x00
0x00
0x17
0x40
0x09
0x1e
0xb8
0x51
0xeb
0x85
0x1f
# note the capital "S" in "String"
julia> String(bytes)
"\0\0\0\x17@\t\x1e\xb8Q\xeb\x85\x1f"
julia> s = "abcdefg"
"abcdefg"
julia> s1 = s[3:4]
"cd"
julia> s2 = match(r"c.", s).match
"cd"
julia> typeof(s)
String
julia> typeof(s1)
String
julia> typeof(s2)
SubString{String}
What functionality does SubString enable? It looks like a container. If so, what other types can it hold? If this is useful, why isn't s1 a SubString?
I found this behavior strange when I had to convert s2 into a pure String to get it into a f(x::String) function. What is the difference between using String(s2) and string(s2) for that conversion?
SubString{String} is just a view into a String. s[3:4] is not a SubString because indexing calls getindex, not view (just as with arrays).
match returns a SubString{String} to avoid copying the string's data. For example:
julia> using BenchmarkTools
julia> x = "a"^1_000_000;
julia> @btime $x[1:end];
36.000 μs (1 allocation: 976.69 KiB)
julia> @btime @view $x[1:end];
23.046 ns (0 allocations: 0 bytes)
Note how much difference it makes in allocations and speed.
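On the String(s2) versus string(s2) part of the question: as far as I know, for a SubString{String} both give you an independent String with the same contents. A minimal sketch that checks this on the example string (the variable names are only for illustration):

```julia
s = "abcdefg"

# SubString(s, i, j) and @view s[i:j] both create a view without copying.
v1 = SubString(s, 3, 4)
v2 = @view s[3:4]

# Either conversion gives a String with the same contents.
c1 = String(v1)
c2 = string(v2)
```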
In general you should avoid writing s[3:4], as it is not safe indexing code: it is only safe if your string is ASCII, which you can check with isascii. String indexing in Julia uses byte indices, not character indices.
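To make the byte-index point concrete, here is a small sketch with a non-ASCII string. The string is made up for illustration; `first` and `chop` are shown as character-based alternatives to range indexing.

```julia
s = "αβγδ"   # each Greek letter takes 2 bytes in UTF-8

n_chars = length(s)                # number of characters
n_bytes = ncodeunits(s)            # number of bytes
valid   = collect(eachindex(s))    # byte indices where a character starts

# Character-based alternatives to s[i:j]:
head  = first(s, 2)                # first two characters
inner = chop(s, head=1, tail=1)    # drop one character at each end

# Indexing into the middle of a character throws a StringIndexError.
bad_index_fails = try
    s[2]
    false
catch err
    err isa StringIndexError
end
```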
SubString{String} carries String as a type parameter because there are other string types besides String:
julia> using InlineStrings
julia> x = InlineString("abcd")
"abcd"
julia> typeof(x)
String7
julia> y = @view x[1:end]
"abcd"
julia> typeof(y)
SubString{String7}
As Antonello noted in a comment, most likely the f function should accept AbstractString, and then you would not even notice a problem.
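A short sketch of that point. Both functions below are hypothetical stand-ins for the f in the question; the only difference between them is the argument type in the signature.

```julia
# Hypothetical functions, named only for this illustration.
f_strict(x::String) = uppercase(x)
f_generic(x::AbstractString) = uppercase(x)

m = match(r"c.", "abcdefg").match   # a SubString{String}

ok = f_generic(m)                   # works without any conversion

# The String-only method rejects the SubString with a MethodError.
strict_fails = try
    f_strict(m)
    false
catch err
    err isa MethodError
end
```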
All this is explained in https://docs.julialang.org/en/v1/manual/strings/.
If you want something more hands-on check out for example chapter 6 of https://www.manning.com/books/julia-for-data-analysis (I do not want to do too much self promotion, but your question is one of the standard questions users ask and I explained all these topics in this chapter to address them).
I'm trying out Julia and have a question about fixing the width and number of decimal places of floats when saving data.
The input file's name is "L100_A100_C100.dat" and it has data as below:
SIMULATION RESULTS
0.599566E+00 0.666925E-06 0.3348E+02 0.2527E+03 -0.6948E+04
0.599633E+00 0.666924E-06 0.3394E+02 0.2529E+03 -0.6949E+04
0.599699E+00 0.666922E-06 0.3424E+02 0.2528E+03 -0.6949E+04
0.599766E+00 0.666920E-06 0.3440E+02 0.2527E+03 -0.6949E+04
0.599833E+00 0.666919E-06 0.3460E+02 0.2525E+03 -0.6948E+04
0.599899E+00 0.666919E-06 0.3488E+02 0.2522E+03 -0.6948E+04
0.599966E+00 0.666919E-06 0.3530E+02 0.2520E+03 -0.6948E+04
So I wrote the following:
using DelimitedFiles  # provides readdlm and writedlm
data = readdlm("L100_A100_C100.dat", Float64, skipstart=1)  # skip the header line
writedlm("output.txt", data)
and the output is:
0.599566 6.66925e-7 33.48 252.7 -6948.0
0.599633 6.66924e-7 33.94 252.9 -6949.0
0.599699 6.66922e-7 34.24 252.8 -6949.0
0.599766 6.6692e-7 34.4 252.7 -6949.0
0.599833 6.66919e-7 34.6 252.5 -6948.0
0.599899 6.66919e-7 34.88 252.2 -6948.0
0.599966 6.66919e-7 35.3 252.0 -6948.0
My question is: how do I fix the width and number of decimal places of the floats (just like '%10.3f' in Python)?
In Julia, printf is available as a macro. A Julia macro is a piece of Julia code that generates other Julia code at compile time. In this case, at compile time, the macro @sprintf generates a formatting object (which also validates the format string) and the formatting code. Later, at run time, the actual formatting is performed.
Hence you can do:
julia> x = rand(2,3)
2×3 Matrix{Float64}:
0.475864 0.285398 0.969636
0.46037 0.708167 0.45792
julia> using Printf
julia> m = (a->(@sprintf "%10.3f" a)).(x)
2×3 Matrix{String}:
" 0.476" " 0.285" " 0.970"
" 0.460" " 0.708" " 0.458"
Now you have a matrix of pre-formatted strings that you can write directly to a file (I use stdout here):
julia> writedlm(stdout, m, "" , quotes=false)
0.476 0.285 0.970
0.460 0.708 0.458
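An alternative that skips the intermediate matrix of strings is to print each value straight to the output stream with @printf. This is only a sketch: `format_matrix` is a hypothetical helper name, and the small matrix stands in for the data read from the file.

```julia
using Printf

# Hypothetical helper: print every entry with "%10.3f" and end each row
# with a newline.
function format_matrix(io::IO, data::AbstractMatrix{<:Real})
    for row in eachrow(data)
        for x in row
            @printf(io, "%10.3f", x)
        end
        println(io)
    end
end

data = [0.599566 33.48 -6948.0;
        0.599633 33.94 -6949.0]

buf = IOBuffer()
format_matrix(buf, data)
text = String(take!(buf))
```

To write a file instead of a buffer, the same function can be called inside `open("output.txt", "w") do io ... end`.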
I'm new to Julia. I'm trying to parse a structured binary file. I read n bytes from the file and I want to cast the byte array to an object of type X.
struct X
messageType::UInt8
second::UInt32
end
f = open("myfile.bin")
bytes = read(f, 5)
And now I want to cast bytes to an object of X. How can I do this?
You can use StructIO. Here is how.
Setup:
using StructIO
@io struct XX
messageType::UInt8
second::UInt32
end align_packed
bytes = UInt8[0x72, 0xa3, 0x97, 0xcf, 0x64]
buf = IOBuffer(bytes)
And now running the code:
julia> seekstart(buf); unpack(buf, XX)
XX(0x72, 0x64cf97a3)
julia> seekstart(buf); unpack(buf, XX, :BigEndian)
XX(0x72, 0xa397cf64)
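If you would rather not add a dependency, a sketch using only Base is to read the fields one at a time. It assumes the 5 bytes are packed with no padding and that the UInt32 is stored little-endian; `read_x` is a hypothetical helper name. Reading field by field also sidesteps the fact that Julia typically pads this struct in memory, so its size is usually 8 bytes rather than 5.

```julia
struct X
    messageType::UInt8
    second::UInt32
end

# Hypothetical helper: read the fields one by one, assuming packed data
# and a little-endian UInt32. ltoh converts little-endian to host order.
read_x(io::IO) = X(read(io, UInt8), ltoh(read(io, UInt32)))

buf = IOBuffer(UInt8[0x72, 0xa3, 0x97, 0xcf, 0x64])
x = read_x(buf)
```

For big-endian data, `ntoh` would take the place of `ltoh`.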
How can I use show to pretty-print a matrix to a String?
It's possible to print it to stdout with show(stdout, "text/plain", rand(3, 3)).
I'm looking for something like str = show("text/plain", rand(3, 3))
For simple conversions usually DelimitedFiles is your best friend.
julia> a = rand(2,3);
julia> using DelimitedFiles
julia> writedlm(stdout, a)
0.7609054249392935 0.5417287267974711 0.9044189728674543
0.8042343804934786 0.8206460267786213 0.43575947315522123
If you want to capture the output use a buffer:
julia> b=IOBuffer();
julia> writedlm(b, a)
julia> s = String(take!(b))
"0.7609054249392935\t0.5417287267974711\t0.9044189728674543\n0.8042343804934786\t0.8206460267786213\t0.43575947315522123\n"
Last but not least, if you want more control, use CSV. The pattern is the same: either write to stdout or capture the output in a buffer, e.g.:
julia> using CSV, Tables
julia> b=IOBuffer();
julia> CSV.write(b, Tables.table(a));
julia> s = String(take!(b))
"Column1,Column2,Column3\n0.7609054249392935,0.5417287267974711,0.9044189728674543\n0.8042343804934786,0.8206460267786213,0.43575947315522123\n"
Even more - if you want to capture the output from display - you can too!
julia> b=IOBuffer();
julia> t = TextDisplay(b);
julia> display(t,a);
julia> s = String(take!(b))
"2×3 Array{Float64,2}:\n 0.760905 0.541729 0.904419\n 0.804234 0.820646 0.435759"
What you were looking for:
b = IOBuffer()
show(b, "text/plain", rand(3, 3))
s = String(take!(b))
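The buffer pattern can also be written in one line. A minimal sketch using `sprint` and `repr` from Base, with a small integer matrix chosen only so the output is predictable:

```julia
A = [1 2; 3 4]

# sprint calls show(io, MIME("text/plain"), A) on a temporary buffer
# and returns the captured text.
s1 = sprint(show, MIME("text/plain"), A)

# repr with a MIME argument returns the same text.
s2 = repr(MIME("text/plain"), A)
```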
I have the following code and I need to convert several UInt32 variables to UInt8 vectors and then combine them into a single UInt8 vector.
The goal is to take the record I have decoded from a Pcap file and put it into a format that I can append to the end of an existing Pcap file.
The code below takes output from a previous function and returns four UInt32 values (shown in hex) and a vector of UInt8s for the payload.
function pcap_get_record(s::PcapOffline)
rec = PcapRec()
if (!eof(s.file))
rec.ts_sec = s.is_big ? read(s.file, UInt32) : ntoh(read(s.file, UInt32))
rec.ts_usec = s.is_big ? read(s.file, UInt32) : ntoh(read(s.file, UInt32))
rec.incl_len = s.is_big ? read(s.file, UInt32) : ntoh(read(s.file, UInt32))
rec.orig_len = s.is_big ? read(s.file, UInt32) : ntoh(read(s.file, UInt32))
rec.payload = read(s.file, rec.incl_len)
return rec
end
nothing
end
Thanks
Here you are
julia> reinterpret(UInt8, rand(UInt32, 1)) |> Vector
4-element Array{UInt8,1}:
0x4d
0x54
0x34
0xd3
Remember to check the byte order: reinterpret returns the bytes in the host's memory order.
Update: So I have solved this and I was overthinking what needed to be done.
I just wrote the UInt variable in their raw form and that did the trick.
write(pcap, rec.orig_len) #this is a UInt32
write(pcap, rec.payload) #this is a UInt8 vector
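One thing to keep in mind with this approach: write emits the integer in the host's byte order. If the target file needs a specific order, the value can be converted first with htol or hton. A sketch with made-up values and an IOBuffer standing in for the pcap file handle:

```julia
incl_len = 0x00000516            # a UInt32
payload  = UInt8[0xde, 0xad]     # made-up payload bytes

io_le = IOBuffer()
write(io_le, htol(incl_len))     # little-endian bytes regardless of host
write(io_le, payload)
le_bytes = take!(io_le)

io_be = IOBuffer()
write(io_be, hton(incl_len))     # big-endian (network order) bytes
write(io_be, payload)
be_bytes = take!(io_be)
```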
Original:
I was having a hard time making my previous comment readable.
Thanks for the response. I am not however able to get the reinterpret to work with my UInt32 variable.
a = reinterpret(UInt8, rec.ts_usec) |> Vector
ERROR: bitcast: argument size does not match size of target type
Stacktrace:
[1] reinterpret(::Type{UInt8}, ::UInt32) at .\essentials.jl:370
[2] top-level scope at none:0
typeof(rec.ts_usec)
UInt32
After messing around some more I was able to get this to work, but it doesn't seem very efficient.
"Edit": I just found that this won't work, since it cuts off any leading zeros in the UInt32. For example, rec.incl_len = 0x00000516 would come out as "516" instead of "00000516", which is what is needed.
julia> hex(n) = string(n, base = 16, pad = 2)
julia> a = hex2bytes(hex(rec.ts_sec))
4-element Array{UInt8,1}:
0x5b
0x60
0xa3
0xa1
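For reference, both attempts above can be made to work. The reinterpret error appears because a scalar can only be bitcast to a type of the same size; wrapping the value in a one-element vector avoids that. The hex route keeps its leading zeros if the string is padded to 8 digits. A sketch with a made-up value:

```julia
x = 0x00000516   # a UInt32, like rec.incl_len

# reinterpret to a smaller type needs an array, so wrap the scalar in a
# 1-element vector. hton puts the value in big-endian order first, so the
# bytes come out most-significant first on any host.
be = collect(reinterpret(UInt8, [hton(x)]))

# Padding to 8 hex digits (2 per byte) keeps the leading zeros.
hex8 = string(x, base = 16, pad = 8)
from_hex = hex2bytes(hex8)
```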