Julia: Extract Zip files within a Zip file

I'm using Julia's ZipFile package to extract and process CSV files. That works, but when I encounter a zip file nested inside the zip file I'd like to process it as well, and that attempt raises an error.
Julia ZipFile docs are here: https://zipfilejl.readthedocs.io/en/latest/
Here's the code:
using ZipFile
using DataFrames
function process_zip(zip::ZipFile.ReadableFile)
    if split(zip.name,".")[end] == "zip"
        r = ZipFile.Reader(zip) #error: MethodError: no method matching seekend(::ZipFile.ReadableFile)
        for f in r.files
            process_zip(f)
        end
    end
    if split(zip.name,".")[end] == "csv"
        df = readtable(zip) #for now just read it into a dataframe
    end
end
r = ZipFile.Reader("yourzipfilepathhere");
for f in r.files
    process_zip(f)
end
close(r)
The call to ZipFile.Reader gives the error:
MethodError: no method matching seekend(::ZipFile.ReadableFile)
Closest candidates are:
seekend(::Base.Filesystem.File) at filesystem.jl:191
seekend(::IOStream) at iostream.jl:57
seekend(::Base.AbstractIOBuffer) at iobuffer.jl:178
...
Stacktrace:
[1] _find_enddiroffset(::ZipFile.ReadableFile) at /home/chuck/.julia/v0.6/ZipFile/src/ZipFile.jl:259
[2] ZipFile.Reader(::ZipFile.ReadableFile, ::Bool) at /home/chuck/.julia/v0.6/ZipFile/src/ZipFile.jl:104
[3] process_zip(::ZipFile.ReadableFile) at ./In[27]:7
[4] macro expansion at ./In[27]:18 [inlined]
[5] anonymous at ./<missing>:?
So it seems the ZipFile package cannot open a zip file that is itself stored inside a zip file, because it cannot call seekend on it.
Any ideas on how to do this?

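As the stack trace shows, ZipFile.Reader calls seekend on its input (in _find_enddiroffset), and a ZipFile.ReadableFile taken from inside an archive has no such method. A workaround is to read the nested zip file into an IOBuffer, which ZipFile.Reader is able to process. Here is the working code:
using ZipFile
using DataFrames
function process_zip(zip::ZipFile.ReadableFile)
    if split(zip.name,".")[end] == "zip"
        # read the nested archive into memory; on Julia 1.x, where readstring
        # was removed, use IOBuffer(read(zip)) instead
        iobuffer = IOBuffer(readstring(zip))
        r = ZipFile.Reader(iobuffer)
        for f in r.files
            process_zip(f)
        end
    end
    if split(zip.name,".")[end] == "csv"
        df = readtable(zip) #for now just read it into a dataframe
    end
end
r = ZipFile.Reader("yourzipfilepathhere");
for f in r.files
    process_zip(f)
end
close(r)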
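For comparison, the same in-memory buffer idea can be sketched with Python's standard zipfile module. This is only an illustration and not part of the Julia answer above; the function name walk_zip and the dictionary it returns are assumptions made for the example.

```python
import io
import zipfile


def walk_zip(source, prefix=""):
    """Return {path: bytes} for every CSV in a zip, descending into nested zips.

    `source` may be a filesystem path or a seekable binary file object.
    `walk_zip` is a hypothetical helper written for this example.
    """
    found = {}
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            name = info.filename
            lower = name.lower()
            if lower.endswith(".zip"):
                # Copy the nested archive into memory so it can be opened
                # like a regular, seekable file
                buffer = io.BytesIO(archive.read(info))
                found.update(walk_zip(buffer, prefix + name + "/"))
            elif lower.endswith(".csv"):
                found[prefix + name] = archive.read(info)
    return found
```

Calling walk_zip("yourzipfilepathhere") would return the raw bytes of each CSV, keyed by its path through the nested archives, ready to be handed to a CSV parser.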

Related

Jupyter with Julia results in unexpected type error: no method matching

I get an unexpected type error when running the following Julia code in Jupyter, where a seemingly straightforward import goes wrong:
include("./imp.jl")
include("./imp2.jl")
n = Main.Imp.Network([1,2])
Imp2.p2(n)
This results in the following error:
MethodError: no method matching p(::Main.Imp.Network)
Closest candidates are:
p(::Main.Imp2.Imp.Network) at /Users/cg/Dropbox/code/Julia/learning/imp.jl:11
The code is below. How does this happen?
Imp.jl:
module Imp
export Network, p
mutable struct Network
    a::Array{Any,1}
end
function p(network::Network)
    network
end
end
Imp2.jl:
module Imp2
include("./imp.jl")
function p2(network)
    Imp.p(network)
end
end
More error below:
Stacktrace:
[1] p2(network::Main.Imp.Network)
  @ Main.Imp2 ~/Dropbox/code/Julia/learning/imp2.jl:5
[2] top-level scope
  @ In[3]:4
[3] eval
  @ ./boot.jl:360 [inlined]
[4] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
  @ Base ./loading.jl:1116
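imp.jl is included twice, once at top level and once inside Imp2, so two separate modules exist: Main.Imp and Main.Imp2.Imp. Their Network types are distinct, which is why p (defined for Main.Imp2.Imp.Network) rejects a Main.Imp.Network. You can either do:
module Imp2
using Main.Imp
function p2(network)
    Imp.p(network)
end
end
OR (without sourcing imp.jl outside of the module definition)
module Imp2
include("./imp.jl")
using .Imp
function p2(network)
    Imp.p(network)
end
end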
In the second case your Julia code could look like:
julia> using Main.Imp2
julia> n = Imp2.Imp.Network([1,2])
Main.Imp2.Imp.Network(Any[1, 2])
julia> Imp2.p2(n)
Main.Imp2.Imp.Network(Any[1, 2])
Additionally, if you add export Imp to the Imp2 module, you could write Imp.Network([1,2]) instead of Imp2.Imp.Network([1,2]).

TypeError: in typeassert in Julia 1.6.3

I am new to Julia. I am working on a Julia package whose author is not available.
This is the part of the code I have a problem with:
function find_partition(model::Model, ms)
    ps = Dict{Z3Expr,Vector{Int}}()
    for (i, m) in enumerate(ms)
        mval = Z3.eval(model, m, false)
        println(ps)
        println(mval)
        println(typeof(mval))
        if haskey(ps, mval)
            push!(ps[mval], i)
        else
            push!(ps, mval=>Int[i])
        end
    end
    values(ps)
end
Here is the output:
Dict{Z3.Expr, Vector{Int64}}()
1.0
Z3.ExprAllocated
ERROR: TypeError: in typeassert, expected UInt64, got a value of type UInt32
Stacktrace:
[1] hashindex(key::Z3.ExprAllocated, sz::Int64)
  @ Base ./dict.jl:169
[2] ht_keyindex(h::Dict{Z3.Expr, Vector{Int64}}, key::Z3.ExprAllocated)
  @ Base ./dict.jl:284
[3] haskey(h::Dict{Z3.Expr, Vector{Int64}}, key::Z3.ExprAllocated)
  @ Base ./dict.jl:550
[4] find_partition(model::Z3.ModelAllocated, ms::Vector{Z3.ExprAllocated})
  @ Absynth.NLSat ~/Desktop/faoc/Absynth/src/nlsat/cfinitesolver.jl:101
I tried following it and found out that somewhere inside dict.jl in Julia, there are such lines
sz = length(h.keys)
...
index = hashindex(key, sz)
And this is the hashindex written somewhere in dict.jl:
hashindex(key, sz) = (((hash(key)::UInt % Int) & (sz-1)) + 1)::Int
It seems that the version of Z3 used in this package is not compatible with the current version of Julia. I am not sure if this package was working in the beginning at all.
Is there any quick fix to this? Like rewriting this part of the code:
if haskey(ps, mval)
push!(ps[mval], i)
else
push!(ps, mval=>Int[i])
in a way that avoids this? I tried merge!(ps,Dict(mval=>Int[i])), but it eventually reaches the same hashindex function.
Would installing an older version of Julia solve this issue, or is the problem somewhere else?

NetCDF - What are 'phony_dim_0', 'phony_dim_1', 'phony_dim_2'?

I am very new to NetCDF files and am at the exploratory stage, trying to understand what this file contains. I am using the 'netCDF4' Python library. What do 'phony_dim_0', 'phony_dim_1' and 'phony_dim_2' mean, and what do they contain? I am thinking they could be 'lat', 'lon' and/or 'time'.
Loading nc file
import netCDF4 as nc
ds = nc.Dataset('my_file.nc')
type(ds)
>>> netCDF4._netCDF4.Dataset
Printing Keys
print(ds.dimensions.keys())
>>> dict_keys(['phony_dim_0', 'phony_dim_1', 'phony_dim_2'])
Extract what is in this key?
ds.dimensions['phony_dim_0']
>>> <class 'netCDF4._netCDF4.Dimension'>: name = 'phony_dim_0', size = 1179
Error:
for c in ds.dimensions['phony_dim_0']:
    print(c)  # Want to see what is in this
# TypeError: 'netCDF4._netCDF4.Dimension' object is not iterable
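A Dimension object only records an axis name and its length, so there is nothing to iterate over; the data values live in the dataset's variables. As far as I know, names of the form phony_dim_N are generated by the netCDF library when the underlying HDF5 file has no named dimensions attached, so the names alone do not tell you whether an axis is lat, lon or time. A minimal sketch for inspecting the file follows; the helper name describe is an assumption made for this example, and it only uses the dimensions and variables mappings of a netCDF4 Dataset.

```python
def describe(ds):
    """Return one line per dimension and per variable of an open dataset.

    Hypothetical helper (not part of the netCDF4 library); it relies only on
    the `dimensions` and `variables` mappings of a netCDF4.Dataset.
    """
    lines = []
    for name, dim in ds.dimensions.items():
        # A Dimension stores a name and a length, not data values
        lines.append(f"dimension {name}: size {dim.size}")
    for name, var in ds.variables.items():
        # Each variable lists the dimensions it is laid out along
        lines.append(f"variable {name}: dims {var.dimensions}, shape {var.shape}")
    return lines


# Usage with the file from the question (requires the netCDF4 package):
#   import netCDF4 as nc
#   with nc.Dataset("my_file.nc") as ds:
#       print("\n".join(describe(ds)))
```

Comparing each variable's shape and attributes against the dimension sizes is usually the quickest way to guess which phony dimension corresponds to which physical axis.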

Using lapply to source multiple R scripts in sub-directories

These are the folders in my directory
128 128-1-32 16384 16384-1-36 4096-1 512 512-1-65 65536-1
128-1 128tbw1 16384-1 4096 4096-1-36 512-1 65536
Each of them has an a7.R script that loads files from its folder and creates images. I want my script to enter each of the folders, then
source('a7.R')
then exit that folder and repeat the process for all the folders. I am doing this manually now and it is really boring. Is this possible with R?
I have tried solution like this
#!/usr/bin/Rscript
lapply(list.files(full.names=TRUE, recursive = TRUE, pattern = "^a7\\.R$"), source)
milenko@milenko-desktop:~/jbirp/mt07$ Rscript s.R
list()
coffeinejunky's solution is not working either:
#!/usr/bin/Rscript
foo <- function(directory) { setwd(directory); source(a7.R) }
do.call("foo", list(directory= 128 128-1-32 16384 16384-1-36 4096-1 512 512-1-65 65536-1 128-1 128tbw1 16384-1 4096 4096-1-36 512-1 65536))
source('n.R')
Error in source("n.R") : n.R:2:33: unexpected numeric constant
1: foo <- function(directory) { setwd(directory); source(a7.R) }
2: do.call("foo", c(directory= 128 128
If I change the list like this
do.call("foo", list(directory= "./128" "./128-1" "./128-1-32" "./128tbw1" "./16384" "./16384-1" "./16384-1-36" "./4096" "./4096-1" "./4096-1-36" "./512" "./512-1" "./512-1-65" "./65536" "./65536-1"))
I got
Error in source("n.R") : n.R:2:40: unexpected string constant
1: foo <- function(directory) { setwd(directory); source(a7.R) }
2: do.call("foo", list(directory= "./128" "./128-1"
^
This is what I got when I list path
> list.dirs(path = ".", full.names = TRUE)
[1] "." "./128" "./128-1" "./128-1-32" "./128tbw1"
[6] "./16384" "./16384-1" "./16384-1-36" "./4096" "./4096-1"
[11] "./4096-1-36" "./512" "./512-1" "./512-1-65" "./65536"
[16] "./65536-1"
I need to change directory multiple times and perform the same operation in each of them. Is lapply good for this or not?
The following should work:
directories <- list.dirs(path = ".", full.names = TRUE, recursive = FALSE)
# recursive = FALSE lists only the immediate sub-directories and leaves out "." itself;
# remove any remaining directory that does not contain a7.R
foo <- function(x) {
  old <- setwd(x)      # setwd() returns the previous directory
  on.exit(setwd(old))  # change back even if a7.R fails
  source("a7.R")
}
lapply(directories, foo)
Alternatively,
for(folder in directories) foo(folder)
Another option is to let source change the directory itself. The following sources every a7.R file with the working directory temporarily set to that file's folder:
a7files <- list.files(full.names=TRUE, recursive = TRUE, pattern = "^a7\\.R$")
sapply(a7files, source, chdir = TRUE)
From ?source
chdir logical; if TRUE and file is a pathname, the R working directory is temporarily changed to the directory containing file for evaluating.

use R to loop through subdirectories and copy files

I am trying to create a batch script in R to pre-process some data and one of the first steps I have to do is check to see if a file exists in a sub-directory and then (if it does) create a copy of it with a new name. I'm having trouble with the syntax.
This is my code:
##Define the subject directory path
sDIR = "/home/bsussman/Desktop/WORKSPACE"
#create data frame to loop through
##list of subject directories
subjects <-list.dirs(path = sDIR, full.names = TRUE, recursive = FALSE)
for (subj in 1:length(subjects)){
  oldT1[[subj]] <- dir(subjects[subj], pattern=glob2rx("s*.nii"), full.names=TRUE)
  T1[[subj]] <- paste(subjects[subj], pattern="/T1.nii",sep="")
  if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))=FALSE{
    file.copy(oldT1, T1)
  }
}
It renames files in one subdirectory, but it will not loop through the rest and gives me these errors:
Error: unexpected '=' in:
"
if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))="
> file.copy(oldT1, T1)
[1] FALSE
> }
Error: unexpected '}' in " }"
> }
Error: unexpected '}' in "}"
I am not as much worried about the [1]FALSE message. But any ideas?
Thanks!!
It's just a problem with the syntax of the if statement: = is assignment rather than comparison, and the condition's parenthesis is never closed before the {. Try replacing this:
if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))=FALSE{
  file.copy(oldT1, T1)
}
with this:
if (!file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))){
  file.copy(oldT1, T1)
}
