Preserving recursive directory structure in zips with Java 8 - recursion

I have the following directory on my laptop:
/tmp/
    myapp/
        assets/
            config.yml
            models/
                troll.ply
                tree.ply
            textures/
                troll-skin.png
                tree-skin.png
I would like to zip /tmp/myapp/assets (and all its recursive contents) up into a ZIP named assets.zip, such that, when I unzip it (via unzip assets.zip), it preserves the directory structure under the assets folder. Hence, when unzipped, it would show config.yml in the "root" of the ZIP, and 2 directories inside the ZIP (models and textures). The rest of the files would be inside these respective subdirectories, etc.
When I run this code:
File sourceDir = new File("/tmp/myapp/assets");
ZipOutputStream zip = new ZipOutputStream(new FileOutputStream("/Users/myuser/archives/assets.zip"));
File[] contents = sourceDir.listFiles();
for(File file : contents) {
    zip.putNextEntry(new ZipEntry(file.name));
    InputStream isteam = new FileInputStream(file);
    Files.copy(isteam, zip);
    zip.closeEntry();
    isteam.close();
}
zip.close();
The code correctly creates a ZIP at /Users/myuser/archives/assets.zip.
However, when I unzip it (unzip /Users/myuser/archives/assets.zip) and then run ls -al /Users/myuser/archives, my output is:
-rw-r--r-- 1 myuser 1754083733 492 Dec 30 14:14 assets.zip
-rw-r--r-- 1 myuser 1754083733 10 Dec 30 14:14 config.yml
-rw-r--r-- 1 myuser 1754083733 7 Dec 30 14:14 models
-rw-r--r-- 1 myuser 1754083733 9 Dec 30 14:14 textures
So both models and textures are being treated like files (not as directories). Furthermore, when I take a peek at the contents of the "models file", it appears that the contents of troll.ply and tree.ply have been concatenated inside of it, and ditto for the "textures file" with the 2 PNGs.
How can I tweak this so that directory structure (no matter how deep/nested) is always preserved in the resultant ZIP?

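You can use a recursive method call to preserve the subdirectory structure. Each file gets its own entry whose name is its path relative to a fixed prefix, and each entry is closed right after it is written:
private static void addDir(File sourceDir, ZipOutputStream zip) throws IOException {
    File[] contents = sourceDir.listFiles();
    if (contents == null) {
        return; // not a directory, or an I/O error occurred
    }
    for (File file : contents) {
        if (file.isDirectory()) {
            addDir(file, zip); // recurse into subdirectories
        } else {
            // Stripping "/tmp/myapp/" yields names such as "assets/models/troll.ply";
            // strip "/tmp/myapp/assets/" instead to put config.yml at the root of the ZIP.
            String entryName = file.getAbsolutePath().replace("/tmp/myapp/", "");
            System.out.println("file name " + entryName);
            zip.putNextEntry(new ZipEntry(entryName));
            Files.copy(file.toPath(), zip);
            zip.closeEntry(); // close every entry, not just the last one
        }
    }
}
Then call it from the main method as below:
public static void main(String[] args) throws IOException {
    File sourceDir = new File("/tmp/myapp/assets");
    // try-with-resources closes the ZIP even if addDir throws
    try (ZipOutputStream zip = new ZipOutputStream(new FileOutputStream("/Users/myuser/archives/assets.zip"))) {
        addDir(sourceDir, zip);
    }
}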
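As an alternative to hard-coding the path prefix, the entry names can be derived with Path.relativize. The following is only a sketch under a few assumptions: the class name ZipDir and the method zipDirectory are made up for illustration, the ZIP file is written outside the source directory, and directories get their own entries (names ending in "/") so that empty folders are preserved too.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the original answer.
public class ZipDir {

    /** Zips everything under sourceDir into zipFile, naming entries relative to sourceDir. */
    public static void zipDirectory(Path sourceDir, Path zipFile) throws IOException {
        // Collect the paths first so the directory stream is closed before writing.
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            paths = walk.filter(p -> !p.equals(sourceDir))
                        .sorted()
                        .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path p : paths) {
                // ZIP entry names use '/' as the separator, whatever the OS.
                String name = sourceDir.relativize(p).toString().replace('\\', '/');
                if (Files.isDirectory(p)) {
                    // A name ending in '/' marks a directory entry.
                    zip.putNextEntry(new ZipEntry(name + "/"));
                } else {
                    zip.putNextEntry(new ZipEntry(name));
                    Files.copy(p, zip);
                }
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ZipDir <sourceDir> <zipFile>");
            return;
        }
        zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

With the paths from the question this would be run as java ZipDir /tmp/myapp/assets /Users/myuser/archives/assets.zip, and config.yml would then sit at the root of the archive next to models/ and textures/.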

Zipping with java.util.zip seems to behave differently on different operating systems. It worked fine for me on Windows 7, but on Linux (RHEL 6) the files came before the folders in the archive, which caused tests to fail. File.listFiles() makes no guarantee about the order of the returned array, so the order can vary by platform.
One way to solve it is to sort the entries so that folders come first and then files. So the array returned by
File[] contents = sourceDir.listFiles();
should be sorted by path. Create a List<> from the files and sort it:
Collections.sort(newFiles, (a, b) ->
    b.getAbsolutePath().compareTo(a.getAbsolutePath())
);
Note: I created an InputFile object to store the absolute path of the file.
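If the goal is specifically "folders first, then files", that rule can be written directly into the comparator instead of relying on path order. This is a minimal sketch working on plain File objects rather than the InputFile wrapper mentioned above; the class name SortedListing and the method listSorted are hypothetical.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper class, not part of the original answer.
public class SortedListing {

    /** Returns the children of dir: directories first, then files, each group ordered by name. */
    public static File[] listSorted(File dir) {
        File[] contents = dir.listFiles();
        if (contents == null) {
            return new File[0]; // not a directory, or an I/O error occurred
        }
        Arrays.sort(contents,
                // The key is false for directories, and false sorts before true.
                Comparator.comparing((File f) -> !f.isDirectory())
                          .thenComparing(File::getName));
        return contents;
    }
}
```

Using the result of listSorted(sourceDir) in place of sourceDir.listFiles() gives the same entry order on every platform.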

Related

My R function is consuming too much memory. Can you help me optimizing it?

I'm new to R and having trouble with optimizing a function.
My function should:
1. create the directory specified in the function call
2. download the zip file from the link inside the function and extract it into that directory
3. move the extracted files to the main directory if they were extracted into a new subfolder
4. delete the subfolder
It works, but it consumes a lot of memory and takes 30 minutes to do such an easy job on a 2.7 MB zip file.
Thank you in advance!
create_dir <- function(directory) {
  path <- file.path(getwd(), directory)
  if (!file.exists(path)) {
    dir.create(path)
  }
  link <- "https://d396qusza40orc.cloudfront.net/rprog%2Fdata%2Fspecdata.zip"
  temp <- tempfile()
  download.file(link, temp, mode = "wb")
  unzip(temp, exdir = path)
  unlink(temp)
  # Move files that were extracted into a subfolder up to the main directory
  existing_loc <- list.files(path, recursive = TRUE)
  for (loc in existing_loc) {
    if (grepl("/", loc, fixed = TRUE)) {
      file.copy(file.path(path, loc), path)
      file.remove(file.path(path, loc))
    }
  }
  # Delete the subfolders; unlink() accepts a vector, so no loop is needed
  dirs <- list.dirs(path)
  rm_dirs <- dirs[dirs != path]
  if (length(rm_dirs)) {
    unlink(rm_dirs, recursive = TRUE)
  }
}
create_dir("testDirectory")
Thanks, I found the problem. My working directory was on OneDrive, which synced after every extraction, move, and deletion of the 332 files processed by the function. The antivirus also ran alongside OneDrive, and together they froze my PC for 30 minutes at 70% CPU usage.

Call Java from R

I want to execute Java code from R. I used the rJava package and was able to execute simple Java code, such as creating an object or printing to the screen.
require("rJava")
.jinit()
test <- new(J("java.lang.String"), "Hello World!")
However, what I want to do is send a data frame or a CSV file from R, run Java code on it, and then return the output file to R. It is difficult in my case to call the R code from Java, because I want to process the CSV file in R first, then apply the Java code to it, and finally return the result to R to complete the analysis.
I'd go the following way here:
1. Process the CSV file inside R.
2. Save this file somewhere and make sure you know its explicit location (e.g. /home/user/some_csv_file.csv).
3. Create an adapter class in Java that has a method String processFile(String file).
4. Inside processFile, read the file, pass it to your Java code, and do the Java-based processing.
5. Store the output file somewhere and return its location.
6. Inside R, get the result of the processFile method and do further processing in R.
At least, that's what I'd do as a first draft of a solution for your problem.
Update
We need a Java file:
// sample/Adapter.java
package sample;
public class Adapter {
    public String processFile(String file) {
        System.out.println("I am processing file: " + file);
        return "new_file_location.csv";
    }
    public static void main(String [] arg) {
        Adapter adp = new Adapter();
        System.out.println("Result: " + adp.processFile("initial_file.csv"));
    }
}
We have to compile it
> mkdir target
> javac -d target sample/Adapter.java
> java -cp target sample.Adapter
I am processing file: initial_file.csv
Result: new_file_location.csv
> export CLASSPATH=`pwd`/target
> R
We have to call it from R
> library(rJava)
> .jinit()
> obj <- .jnew("sample.Adapter")
> s <- .jcall(obj, returnSig="Ljava/lang/String;", method="processFile", 'initial_file')
> s
I am processing file: initial_file
> s
[1] "new_file_location.csv"
And your source directory looks like this
.
├── sample
│   └──Adapter.java
└── target
     └── sample
         └── Adapter.class
In processFile you can do whatever you like and call your existing Java code.
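To make the round trip concrete, here is a sketch of what a non-stub processFile might look like. Everything specific in it is an assumption made for illustration: the class name CsvAdapter, the ".processed.csv" suffix for the output file, and the placeholder processing step (trimming whitespace around each cell). The package declaration is left out to keep the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical variant of the Adapter class shown above.
public class CsvAdapter {

    /** Reads the CSV at 'file', writes a processed copy next to it, and returns the new file's path. */
    public String processFile(String file) {
        Path in = Paths.get(file).toAbsolutePath();
        // Assumed naming scheme: input.csv -> input.csv.processed.csv
        Path out = in.resolveSibling(in.getFileName() + ".processed.csv");
        try {
            // Placeholder processing: trim whitespace around every cell.
            // The naive split does not handle quoted fields that contain commas.
            List<String> processed = Files.readAllLines(in, StandardCharsets.UTF_8).stream()
                    .map(line -> Arrays.stream(line.split(",", -1))
                            .map(String::trim)
                            .collect(Collectors.joining(",")))
                    .collect(Collectors.toList());
            Files.write(out, processed, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so the method signature stays String processFile(String).
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

From R, the returned string would be obtained with the same .jcall pattern as above (with "CsvAdapter" as the class name) and could then be passed to read.csv.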

Zip Files using PeopleCode in Application Engine

I have a requirement to zip multiple folders inside a parent folder and display the file in the App Engine output. The folder structure on the Unix file server is:
Parent Folder
- Folder1 (contains files)
- Folder2 (contains files)
How can I zip the folders and store the archive in the parent folder using PeopleCode in AE? The final folder structure would be as follows:
Parent Folder
-Folder1
-Folder2
-ParentFolder.Zip.
Note: Process runs on Unix Server.
We actually called Java code from PeopleCode to zip the files, like this:
&buffer = CreateJavaArray("byte[]", 18024);
&zipStream = CreateJavaObject("java.util.zip.ZipOutputStream", CreateJavaObject("java.io.FileOutputStream", &outDir | &outZip));
For &i = 1 To &inFiles.Len
   &zipStream.putNextEntry(CreateJavaObject("java.util.zip.ZipEntry", &inFiles [&i]));
   &inStream = CreateJavaObject("java.io.FileInputStream", &outDir | &inFiles [&i]);
   &len = &inStream.read(&buffer);
   While &len > 0;
      &zipStream.write(&buffer, 0, &len);
      &len = &inStream.read(&buffer);
   End-While;
   &zipStream.closeEntry();
   &inStream.close();
End-For;
&zipStream.close();

Copy, verify, and then delete files/children from network location

I want to develop a script that copies, verifies, and then deletes files (those over x days old) from one network location to another.
Here is my algorithm:
Recursively traverse a network location ($movePath)
for all files $_.LastWriteTime >= x days | forEach {
    xcopy or robocopy $FileName = $_.FullName.Replace($movePath, $newPath)
    if (the files were written correctly) {
        (delete) Remove-Item $Filename from $movePath
    }
}
Can I combine the xcopy /v (verify) with robocopy?
Do you want to maintain the subfolder structure (i.e. files from a subfolder in the source go into the same subfolder in the destination)? If so, this should suffice:
$src = 'D:\source\folder'
$dst = '\\server\share'
$age = 10 # days
robocopy $src $dst /e /move /minage:$age
robocopy can handle verification (done automatically) and deletion by itself.
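For comparison, the copy, verify, delete algorithm from the question can also be spelled out step by step. The sketch below does so in Java rather than PowerShell, purely as an illustration of the logic; the class name MoveOldFiles and the method moveOlderThan are hypothetical, and verification is assumed to mean "the copied bytes are identical to the source bytes".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class illustrating the algorithm from the question.
public class MoveOldFiles {

    /**
     * Copies files last modified more than 'days' days ago from src to dst
     * (keeping the subfolder structure), verifies each copy, then deletes
     * the original. Returns the number of files moved.
     */
    public static int moveOlderThan(Path src, Path dst, long days) throws IOException {
        Instant cutoff = Instant.now().minus(Duration.ofDays(days));
        List<Path> files;
        try (Stream<Path> walk = Files.walk(src)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int moved = 0;
        for (Path file : files) {
            if (Files.getLastModifiedTime(file).toInstant().isAfter(cutoff)) {
                continue; // too recent, leave it in place
            }
            Path target = dst.resolve(src.relativize(file));
            Files.createDirectories(target.getParent());
            Files.copy(file, target,
                    StandardCopyOption.REPLACE_EXISTING,
                    StandardCopyOption.COPY_ATTRIBUTES);
            // Verify by comparing contents. Reading whole files is fine for a
            // sketch; a streaming checksum would suit large files better.
            if (Arrays.equals(Files.readAllBytes(file), Files.readAllBytes(target))) {
                Files.delete(file); // delete only after a verified copy
                moved++;
            }
        }
        return moved;
    }
}
```

Unlike robocopy /move, this sketch leaves emptied source subfolders in place.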

Using JSch ChannelSftp: How to read multiple files with dynamic names?

I have to read a bunch of .CSV files with dynamic file names from a SFTP server. These files get generated every 15 minutes.
I am using JSch's ChannelSftp, but there is no method which would give the exact filenames. I only see an .ls() method. This gives a Vector e.g.
[drwxr-xr-x 2 2019 2019 144 Aug 9 22:29 .,
drwx------ 6 2019 2019 176 Aug 27 2009 ..,
-rw-r--r-- 1 2019 2019 121 Aug 9 21:03 data_task1_2011_TEST.csv,
-rw-r--r-- 1 2019 2019 121 Aug 9 20:57 data_task1_20110809210007.csv]
Is there a simple way to read all the files in a directory and copy them to another location?
This code works for copying a single file:
JSch jsch = new JSch();
session = jsch.getSession(SFTPUSER,SFTPHOST,SFTPPORT);
session.setPassword(SFTPPASS);
java.util.Properties config = new java.util.Properties();
config.put("StrictHostKeyChecking", "no");
session.setConfig(config);
session.connect();
channel = session.openChannel("sftp");
channel.connect();
channelSftp = (ChannelSftp)channel;
channelSftp.cd(SFTPWORKINGDIR);
channelSftp.get("data_task1_20110809210007.csv","data_task1_20110809210007.csv");
The ls method is the one you need. It returns a vector of LsEntry objects, each of which you can ask about its name.
So, after your channelSftp.cd(SFTPWORKINGDIR);, you could do the following:
Vector<ChannelSftp.LsEntry> list = channelSftp.ls("*.csv");
for(ChannelSftp.LsEntry entry : list) {
    channelSftp.get(entry.getFilename(), destinationPath + entry.getFilename());
}
(This assumes destinationPath is a local directory name ending with / (or \ in Windows).)
Of course, if you don't want to download the same files again after 15 minutes, you might want to have a list of the local files, to compare them (use a HashSet or similar), or delete them from the server.
Note that ls is case sensitive. The following retrieves all CSV files, regardless of the case of the extension:
ArrayList<String> list = new ArrayList<String>();
Vector<LsEntry> entries = sftpChannel.ls("*.*");
for (LsEntry entry : entries) {
    if(entry.getFilename().toLowerCase().endsWith(".csv")) {
        list.add(entry.getFilename());
    }
}
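The two suggestions above (remembering what has already been downloaded in a HashSet, and matching the extension case-insensitively) can be combined in one small filter. The sketch below deliberately works on plain file name strings so that it does not depend on JSch; the class name NewCsvFilter and the method selectNew are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; operates on names only, not on JSch objects.
public class NewCsvFilter {

    /** Returns the remote .csv names (any extension case) that are not yet in 'downloaded'. */
    public static List<String> selectNew(List<String> remoteNames, Set<String> downloaded) {
        List<String> result = new ArrayList<>();
        for (String name : remoteNames) {
            boolean isCsv = name.toLowerCase(Locale.ROOT).endsWith(".csv");
            if (isCsv && !downloaded.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

In the polling loop, the names would come from entry.getFilename() for each LsEntry, and each name would be added to the set after its get call succeeds, so the next run 15 minutes later skips it.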

Resources