Read all bytes of a file using Julia - julia

I'm trying to read all the bytes of a file using Julia into an array. So far I have:
s = open(file_path,"r")
I'm not sure how to tell how big the file is. I'm also not sure that I need to. Perhaps I can just pass an empty array into readbytes!

The simplest way to do it is to use the read function.
You can either pass an open stream to it, as in data = read(s), if s was opened with the code you provided above.
Alternatively, you can simply write data = read(file_path). This way you do not have to close the stream yourself.
You can find the details in the help for read by executing ?read in the Julia REPL.
To get the size of the file in bytes, you can use the filesize(file_path) function.
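As a minimal sketch of both variants, the snippet below first writes a small temporary file (an assumption made only so the example runs on its own; in practice file_path would be your existing file) and then reads it back with read, once by path and once through a stream.

```julia
# Create a small temporary file so the example is self-contained (hypothetical data).
file_path, io = mktemp()
write(io, "hello, bytes")
close(io)

# Variant 1: read the whole file by path; returns a Vector{UInt8}.
data = read(file_path)

# Variant 2: read from a stream; the do-block closes the stream automatically.
data_from_stream = open(file_path, "r") do s
    read(s)
end

# The number of bytes read should match the size reported by filesize.
println(length(data), " bytes read, filesize reports ", filesize(file_path))
```

Neither variant needs the file size up front, since read grows the result as needed.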

After a bit of testing this seems to work ...
s = open(file_path,"r")
data = UInt8[]
readbytes!(s, data, typemax(Int)) # read until EOF; nb must be an integer in recent Julia versions
close(s)

Related

How to append some data to a file in Julia?

I have a text file and I want to append some string to it without writing over the existing data. How can I accomplish this in Julia?
Julia provides several ways to accomplish this. One option is:
# Open the file in append mode, write to it, then close it
exampleFileIOStream = open("example.txt", "a")
write(exampleFileIOStream, "Hello world!")
close(exampleFileIOStream) # flushes the buffered data to disk
The Julia documentation describes the open function and its modes in full.
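Another option, sketched here under the assumption that a temporary file stands in for your text file, is the do-block form of open. It closes the stream for you, even if an error occurs while writing.

```julia
# Create a temporary file with some existing content (hypothetical stand-in for your file).
path, io = mktemp()
write(io, "first line\n")
close(io)

# Open in append mode; the stream is closed automatically when the block ends.
open(path, "a") do f
    write(f, "Hello world!\n")
end

# The existing content is preserved and the new text follows it.
contents = read(path, String)
print(contents)
```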

Multiple procedures in IDL program

I've written a procedure in IDL which performs some calculations on data and outputs an array of values. The calculations take about 2 minutes to run.
I need to then perform analysis on these results, and ideally I would like not to have to perform the initial calculations each time I want to perform some different analysis.
Is the best way to achieve this to save the output from the calculation to a data file and then read this in from a different program? Or is there a less cumbersome way to go about this?
Thanks in advance for any help
Yes, saving to a file is the easiest way to keep the results from your first program for later use in the second (assuming you quit IDL between the two). There are many ways to save the data, depending on its type and your preferences.
Easiest Way:
An IDL .sav file created by the SAVE command can store any kind of data, IDL variables, and even the whole state of your IDL session. Unfortunately, it only works with IDL (no other languages), and it may need to be regenerated if you upgrade your IDL version. You read these files with RESTORE, which even remembers the names of the variables.
my_variable = 'Some data here.'
SAVE, my_variable, FILENAME='myfile.sav' ; save variable(s)
; ... IDL closed and reopened here ...
RESTORE, 'myfile.sav' ; read variable(s) from file
print, my_variable
Some data here.
Most Portable Way:
For simple tabular data, CSV has the advantage of being highly portable and human readable. However, it's also slow, since numbers are stored in ASCII. Use WRITE_CSV to write, and READ_CSV to read.
Most Portable Binary Formats:
For complex data that needs to be read by multiple languages, consider the HDF5 or NetCDF libraries. Both of these are binary formats that can store most types of IDL-supported data. Note that NetCDF-4 is built on top of HDF5 and offers a simpler interface.
Simplest Binary Format:
Another option for tabular data is a simple binary dump. Use WRITEU to write to a normal file opened for writing. Use READU to read from a normal file opened for reading.
Assuming that your data calculations will only change very rarely, then, yes, your best solution is to just save the calculations to an output file, and then read them back into your analysis program. You don't say what kind of data this is, so it's hard to give a more specific answer. Assuming that you have a two-dimensional array of data, you could just write the results as a "flat" binary file:
pro perform_calculations
...
; assume mydata is a float array of dimensions [m,n]
openw, 1, 'results.dat'
writeu, 1, mydata
close, 1
end
Then, in either the same file or preferably a different .pro file:
pro perform_analysis
mydata = fltarr(m, n) ; m and n must match the dimensions of the array that was written
openr, 1, 'results.dat'
readu, 1, mydata
close, 1
...
end
Hope this helps.
Saving is a good way to do it, but if you run in the same session and your second program won't mess up the data from the first one, you can just call one and then pass the result to the second one.
pro do_calculations,result1,result2,result3
result1=1
result2=1.
result3=result1/result2
return
end
pro use_calculations,result1,result2,result3,result4
result4=result1-result2+result3
return
end
Then
IDL> do_calculations,result1,result2,result3
IDL> use_calculations,result1,result2,result3,result4
If you edit use_calculations, you can rerun it with:
IDL> use_calculations,result1,result2,result3,result4
This works because the earlier results stay in memory unless use_calculations modifies them.
You could also set up the second procedure to check to see if it has valid results from the first one and call it as needed.

Streaming data in Julia

Currently, is there a good way to read data in Julia in a streaming fashion?
For example, let's say I have a CSV file that is too big to fit in memory. Are there currently built in functions or a library that facilitates working with this?
I'm aware of the prototype DataStream functionality in DataFrames, but that's not currently exposed via a public API.
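The eachline function turns an IO source into an iterator of lines. That should allow you to read a file a line at a time. From there, the readcsv and readdlm functions can parse each line if you wrap it in an IOBuffer (in Julia 1.0 and later, readcsv has been removed and readdlm lives in the DelimitedFiles standard library).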
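using DelimitedFiles # provides readdlm in Julia 1.0 and later, where readcsv no longer exists
for ln in eachline("file.csv") # eachline(path) closes the file when iteration ends
    data = readdlm(IOBuffer(ln), ',') # parse one line as a 1-row matrix
    # do something with this data
end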
It's still pretty do-it-yourself, but there aren't that many steps, so it's not too bad.
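To illustrate that only one line is held in memory at a time, here is a small sketch that sums one column of a CSV file while streaming over it. The helper name column_sum and the sample data are hypothetical, and the plain split on commas assumes fields contain no quoted commas.

```julia
# Write a tiny sample CSV to a temporary file (hypothetical data).
path, io = mktemp()
write(io, "1,2.5\n2,3.5\n3,4.0\n")
close(io)

# Sum column `col` without loading the whole file into memory.
function column_sum(path, col)
    total = 0.0
    for ln in eachline(path)            # yields one line at a time
        fields = split(ln, ',')         # naive split; no quoted-field handling
        total += parse(Float64, fields[col])
    end
    return total
end

total = column_sum(path, 2)
println(total)
```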

R passing data frame to another program using system()

I have a data frame that I pass to another program using system(). In the current setup, I first write the contents of the dataframe to a text file, then have the system() command look for the created text file.
df1 <- runif(20)
write(df1, file="file1.txt")
system("myprogram file1.txt")
I have 2 questions:
1) Is there a way to pass a dataframe directly without writing the text file?
2) If not, is there a way to pass the data in memory as a text-formatted entity without writing the file to disk?
Thanks for any suggestions.
You can write to anything R calls connections, and that includes network sockets.
So process A can write to the network, and process B can read it without any file-on-disk involved, see help(connections) which even has a working example in the "Examples" section.
Your general topic here is serialization, and R does that for you. You can also pass data that way to other programs using tools that encode metadata about your data structure -- as for example Google's Protocol Buffers (supported in R by the RProtoBuf package).
I spent quite a while and couldn't understand the accepted answer. But I figured out a workaround.
df1 <- runif(20)
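# write.table() prints to the console and returns NULL, so capture its output as a character vector
system("myprogram /dev/stdin", input = capture.output(write.table(df1)))
However, according to the documentation, the input argument is actually written to a temporary file and redirected from there, which I suppose still involves some I/O.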

Cutting A File into Chunks in Qt

Can anybody give me a hint or an initial idea of how to cut a file into chunks in Qt? Is there a built-in function for splitting, as there is in Java? Later on I want to calculate the SHA-256 hash value of each chunk. Any ideas?
There is no built in function for that.
Open the original file.
Open a file for the first chunk.
Read bytes from the original file.
Write bytes to the chunk file.
Repeat.
See QFile documentation.
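The steps above are language-independent. As a rough sketch of the loop (written in Julia rather than Qt, so it shows the logic only and is not QFile code), the following reads fixed-size chunks, writes each to its own file, and hashes it with SHA-256. The function name chunk_hashes, the ".partN" naming scheme, and the sample data are all assumptions made for illustration.

```julia
using SHA  # Julia standard library; provides sha256

# Split the file at `path` into chunks of at most `chunk_size` bytes.
# Each chunk is written to "<path>.partN" and its SHA-256 hex digest is collected.
function chunk_hashes(path, chunk_size)
    hashes = String[]
    open(path, "r") do io
        i = 0
        while !eof(io)
            chunk = read(io, chunk_size)           # reads up to chunk_size bytes
            i += 1
            write(string(path, ".part", i), chunk) # write the chunk file
            push!(hashes, bytes2hex(sha256(chunk)))
        end
    end
    return hashes
end

# Try it on a temporary file holding 2500 random bytes (hypothetical data).
path, io = mktemp()
write(io, rand(UInt8, 2500))
close(io)

hashes = chunk_hashes(path, 1024)
println(length(hashes), " chunks")
```

In Qt, the same loop would use QFile for reading and writing and QCryptographicHash for the digest.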
