How to convert a random string to a byte string - julia

I generated a random string, let's say using randstring(RandomDevice(), 'a':'z', 15). Now I want its output as a byte string. How do I do that?
More context: What I am trying to do is to write something similar to Python's os.urandom() function.

Julia doesn't seem to have Python-like byte strings, at least in Base.
julia> using Random
julia> using Random: RandomDevice, randstring
julia> rs = randstring(RandomDevice(), 'a':'z', 15)
"wbfgxgoheksvxvx"
You can get a code units wrapper using the codeunits function, which returns a Base.CodeUnits object (a read-only, vector-like view of the string's bytes):
julia> cu = codeunits(rs)
15-element CodeUnits{UInt8,String}:
0x77
0x62
0x66
0x67
0x78
0x67
0x6f
0x68
0x65
0x6b
0x73
0x76
0x78
0x76
0x78
Or with the b"" non-standard string literal macro:
julia> b"wbfgxgoheksvxvx"
15-element CodeUnits{UInt8,String}:
0x77
0x62
0x66
0x67
0x78
0x67
0x6f
0x68
0x65
0x6b
0x73
0x76
0x78
0x76
0x78
You can have a byte array like this:
julia> ba = Vector{UInt8}(rs)
15-element Array{UInt8,1}:
0x77
0x62
0x66
0x67
0x78
0x67
0x6f
0x68
0x65
0x6b
0x73
0x76
0x78
0x76
0x78
You could use the repr function, along with split and join functions to create your desired string:
julia> function bytestring(s::String)::String
           ba = Vector{UInt8}(s)
           return join([join(("\\x", split(repr(cu), "x")[2]), "") for cu in ba], "")
       end
bytestring (generic function with 1 method)
julia> bytestring(rs)
"\\x77\\x62\\x66\\x67\\x78\\x67\\x6f\\x68\\x65\\x6b\\x73\\x76\\x78\\x76\\x78"
You can put that in a macro in order to create a custom non-standard string literal:
julia> macro bs_str(s)
           return bytestring(s)
       end
@bs_str (macro with 1 method)
julia> bs"wbfgxgoheksvxvx"
"\\x77\\x62\\x66\\x67\\x78\\x67\\x6f\\x68\\x65\\x6b\\x73\\x76\\x78\\x76\\x78"
Finally you could compose it like this:
julia> urandom(r::Random.AbstractRNG, chars, n::Integer)::String = bytestring(randstring(r, chars, n))
urandom (generic function with 1 method)
julia> urandom(RandomDevice(), 'a':'z', 15)
"\\x67\\x61\\x78\\x64\\x71\\x68\\x73\\x77\\x76\\x6e\\x6d\\x6d\\x63\\x78\\x68"
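For comparison with the Python function the question mentions, here is a minimal Python sketch of the same pipeline: draw random lowercase letters from the OS entropy source, take their bytes, and render each byte as a \xNN escape. The helper names random_lowercase and escape_bytes are made up for this illustration and are not part of any library. Note that os.urandom itself returns arbitrary bytes, not only letters.

```python
import os
import secrets
import string


def random_lowercase(n: int) -> str:
    """Return n random letters from 'a'..'z', drawn from the OS entropy source."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(n))


def escape_bytes(data: bytes) -> str:
    """Render every byte as a literal backslash, 'x', and two hex digits."""
    return "".join(f"\\x{b:02x}" for b in data)


rs = random_lowercase(15)
raw = rs.encode("ascii")      # the 15 underlying bytes
escaped = escape_bytes(raw)   # 15 groups of four characters each

# os.urandom returns bytes directly; each byte can be any value from 0 to 255.
os_bytes = os.urandom(15)
```

If the goal is really an os.urandom equivalent, the escaped string form is probably not needed: the byte vector is usually what downstream code wants.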

Related

R: How to convert a byte in a raw vector into an ASCII space

I am reading some very old files created by C code that consist of a header (ASCII) and then data. I use readBin() to get the header data. When I try to convert the header to a string it fails because there are 3 'bad' bytes. Two of them are binary 0 and the other binary 17 (IIRC).
How do I convert the bad bytes to ASCII SPACE?
I've tried some versions of the below code but it fails.
hd[hd == as.raw(0) | hd == as.raw(0x17)] <- as.raw(32)
I'd like to replace each bad value with a space so I don't have to recompute all the fixed data locations in parsing the string derived from hd.
I normally just go through a conversion to integer.
Suppose we have this raw vector:
raw_with_null <- as.raw(c(0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x00,
0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21))
We get an error if we try to convert it to character because of the null byte:
rawToChar(raw_with_null)
#> Error in rawToChar(raw_with_null): embedded nul in string: 'Hello\0World!'
It's easy to convert to integer and replace any 0s or 23s with 32 (the ASCII space):
nums <- as.integer(raw_with_null)
nums[nums == 0 | nums == 23] <- 32
We can then convert nums back to raw and then to character:
rawToChar(as.raw(nums))
#> [1] "Hello World!"
Created on 2022-03-05 by the reprex package (v2.0.1)
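The same replace-in-place idea can be sketched in Python, where a translation table swaps the unwanted byte values for a space without changing the length of the data, so fixed field offsets stay valid. The name BAD_BYTES is an assumption for this example; the values 0 and 23 (0x17) follow the answer above.

```python
# Same bytes as the raw vector above: "Hello", a NUL byte, then "World!".
raw_with_null = bytes([0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x00,
                       0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21])

# Build a 256-entry table that maps bytes 0 and 23 (0x17) to a space (32)
# and leaves every other byte unchanged.
BAD_BYTES = bytes([0x00, 0x17])
table = bytes.maketrans(BAD_BYTES, b" " * len(BAD_BYTES))

cleaned = raw_with_null.translate(table)
text = cleaned.decode("ascii")
print(text)  # prints "Hello World!"
```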

Unpack Binary Data with specific format in Julia

I am trying to convert a binary file parser from Python to Julia. I am struggling to figure out how to unpack the binary stream with a specific format. I found a Discourse thread that is exactly what I am trying to do, but it's from 2017 and doesn't seem to have a working solution. Does anyone have a solution?
In Python it looks like this:
In [22]: struct.unpack('>idi', b'\x00\x00\x00\x17@\t\x1e\xb8Q\xeb\x85\x1f\x00\x00\x00*')
Out[22]: (23, 3.14, 42)
In Julia I am here:
data = open(filename, "r")
seek(data, 0)
# now I want to get the first 12 bytes of the file and convert to a string.. and am stumped..
I'm not very familiar with struct.unpack in Python, but maybe you could do something like this:
julia> data = IOBuffer("\x00\x00\x00\x17@\t\x1e\xb8Q\xeb\x85\x1f\x00\x00\x00*")
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> seekstart(data)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> i = bswap(read(data, Int32))
23
julia> pi = bswap(read(data, Float64))
3.14
julia> i = bswap(read(data, Int32))
42
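The bswap calls are there because the binary stream is big-endian (the '>' in the Python format string), whereas most machines Julia runs on are little-endian. On a big-endian host an unconditional bswap would give the wrong result; ntoh swaps only when the host needs it. Apart from that, this is just a plain use of read, specifying the type of data to be read.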
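To make the byte-order point concrete, here is a short Python sketch using the same 16 bytes as the question. It decodes the three fields as big-endian and then reads the first four bytes both ways; the variable names are chosen only for this illustration.

```python
import struct

data = b"\x00\x00\x00\x17@\t\x1e\xb8Q\xeb\x85\x1f\x00\x00\x00*"

# ">" means big-endian with no padding: int32, float64, int32 = 16 bytes.
first, middle, last = struct.unpack(">idi", data)

# Reading the first field with the wrong byte order shows why a swap is needed.
as_big = int.from_bytes(data[:4], "big")        # 23
as_little = int.from_bytes(data[:4], "little")  # 0x17000000
```

Interpreting the bytes as little-endian puts 0x17 in the most significant position, which is the value a little-endian machine sees before the swap.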
By the way, here is how you would read the first 12 bytes of the file, and convert them to a string (which is not really necessary in this case, but could be useful in others):
julia> seekstart(data)
IOBuffer(data=UInt8[...], readable=true, writable=false, seekable=true, append=false, size=16, maxsize=Inf, ptr=1, mark=-1)
julia> bytes = read(data, 12)
12-element Array{UInt8,1}:
0x00
0x00
0x00
0x17
0x40
0x09
0x1e
0xb8
0x51
0xeb
0x85
0x1f
# note the capital "S" in "String"
julia> String(bytes)
"\0\0\0\x17@\t\x1e\xb8Q\xeb\x85\x1f"

How can I cast byte arrays to struct in Julia?

I'm new to Julia. I'm trying to parse a structured binary file. I read n bytes from the file and I want to cast the byte array to an object of type X.
struct X
messageType::UInt8
second::UInt32
end
f = open("myfile.bin")
bytes = read(f, 5)
And now I want to cast bytes to an object of X. How can I do this?
You can use StructIO; here is how.
Setup:
using StructIO
@io struct XX
    messageType::UInt8
    second::UInt32
end align_packed
bytes = UInt8[0x72, 0xa3, 0x97, 0xcf, 0x64]
buf = IOBuffer(bytes)
And now running the code:
julia> seekstart(buf); unpack(buf, XX)
XX(0x72, 0x64cf97a3)
julia> seekstart(buf); unpack(buf, XX, :BigEndian)
XX(0x72, 0xa397cf64)
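For readers coming from Python, the same packed decoding can be sketched with the struct module. The NamedTuple X below is an illustrative stand-in for the struct in the question, not part of StructIO.

```python
import struct
from typing import NamedTuple


class X(NamedTuple):
    message_type: int  # UInt8
    second: int        # UInt32


raw = bytes([0x72, 0xA3, 0x97, 0xCF, 0x64])

# "<" and ">" select the byte order and disable alignment padding,
# so "BI" is 1 + 4 = 5 bytes, matching a packed layout.
little = X(*struct.unpack("<BI", raw))
big = X(*struct.unpack(">BI", raw))
```

The two results correspond to the default and :BigEndian calls above: the first byte is the same in both, and only the four bytes of the second field are read in opposite order.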

ARM assembly question with hexadecimal values

If I have 0x00000065 stored in a register, is that the same as having 0X65 in my register?
Thank you so much.
Yes, the two hexadecimal values are the same:
0x00000065 = 5*(16^0) + 6*(16^1) + 0*(16^2) + ... + 0*(16^7) = 5*(16^0) + 6*(16^1) = 0x65
(Note: the symbol '^' denotes the power operator)
On 32-bit ARM, registers are 32 bits wide, so a register always holds all 32 bits: 0x65 is stored as 0x00000065.
But of course, these are equal numbers.
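The equality is easy to check in any language with hex parsing. A minimal Python sketch, with variable names chosen only for illustration:

```python
# Leading zeros and the case of the "0x" prefix do not change the value.
a = int("0x00000065", 16)
b = int("0X65", 16)

# The positional expansion from the answer: 5 ones plus 6 sixteens.
expanded = 5 * 16**0 + 6 * 16**1

# Formatting the value at 32-bit width (8 hex digits plus the "0x" prefix).
padded = f"{b:#010x}"
```

All three of a, b and expanded are the decimal number 101, and padded is the string "0x00000065".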

split long hex numbers by 0D 0A in R

I want to write code in R to split hex numbers by delimiters. I have a file containing hex numbers separated by spaces, like below:
0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a
I want to split these hex numbers at 0x0d (CR: carriage return) followed by 0x0a (LF: line feed), so that the output looks like this:
0x01 0x02 0x03 0x04
0x05 0x06 0x07
I think I can use functions like strsplit(), but I don't know how. Could you please tell me how I can implement this in R?
Thanks.
Would a pair of gsub calls work?
text <- "0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a"
text <- gsub( "0x0d", "\r", text )
text <- gsub( "0x0a", "\n", text )
Which gives:
text
[1] "0x01 0x02 0x03 0x04 \r \n 0x05 0x06 0x07 \r \n"
library(purrr)
library(stringi)
library(magrittr)
# stringi & tidyverse
readLines(textConnection("0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a")) %>%
stri_split_fixed("0x0d 0x0a") %>%
.[[1]] %>%
stri_trim_both() %>%
discard(equals, "")
## [1] "0x01 0x02 0x03 0x04" "0x05 0x06 0x07"
# base R + a little tidyverse
readLines(textConnection("0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a")) %>%
strsplit("0x0d 0x0a") %>%
.[[1]] %>%
trimws() %>%
discard(equals, "")
## [1] "0x01 0x02 0x03 0x04" "0x05 0x06 0x07"
# old school R
con <- textConnection("0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a")
lines <- readLines(con)
lines <- strsplit(lines, "0x0d 0x0a")[[1]]
lines <- trimws(lines)
lines <- lines[lines != ""]
lines
## [1] "0x01 0x02 0x03 0x04" "0x05 0x06 0x07"
Use paste0() to join the result into a single string with line breaks, or cat() to display it on screen or write it to a file as separate lines.
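The split, trim, and drop-empties steps used in the R answers above translate almost line for line to other languages. A Python sketch of the same approach, assuming the delimiter always appears as the exact text "0x0d 0x0a":

```python
text = "0x01 0x02 0x03 0x04 0x0d 0x0a 0x05 0x06 0x07 0x0d 0x0a"

# Split on the CR LF token pair, trim the leftover spaces,
# and drop the empty piece produced by the trailing delimiter.
records = [chunk.strip() for chunk in text.split("0x0d 0x0a")]
records = [r for r in records if r]

# Join with newlines to get one record per line.
print("\n".join(records))
```

This treats the hex numbers as plain text tokens; a lone 0x0d or 0x0a that is not part of the pair is left inside its record.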
