I want to execute Java code from R. Using the rJava package, I was able to run simple Java code, such as creating an object or printing to the screen.
require("rJava")
.jinit()
test<-new (J ("java.lang.String") , "Hello World!")
However, what I want to do is send a data frame or CSV file from R, run some Java code on it, and return the output file to R. Calling the R code from Java is difficult in my case, because I want to process the CSV file in R first, then apply the Java code to it, and then return the result to R to complete the analysis.
I'd approach it this way:
Process CSV file inside R
Save this file somewhere and make sure you know explicit location (e.g. /home/user/some_csv_file.csv)
Create adapter class in Java that will have method String processFile(String file)
Inside method processFile read the file, pass it to your code in Java and do Java based processing
Store the output file somewhere and return its location
Inside R, get the result of processFile method and do further processing in R
At least, that's what I'd do as a first draft of a solution for your problem.
Update
We need Java file
// sample/Adapter.java
package sample;
public class Adapter {
public String processFile(String file) {
System.out.println("I am processing file: " + file);
return "new_file_location.csv";
}
public static void main(String [] arg) {
Adapter adp = new Adapter();
System.out.println("Result: " + adp.processFile("initial_file.csv"));
}
}
We have to compile it
> mkdir target
> javac -d target sample/Adapter.java
> java -cp target sample.Adapter
I am processing file: initial_file.csv
Result: new_file_location.csv
> export CLASSPATH=`pwd`/target
> R
We have to call it from R
> library(rJava)
> .jinit()
> obj <- .jnew("sample.Adapter")
> s <- .jcall(obj, returnSig="Ljava/lang/String;", method="processFile", 'initial_file')
I am processing file: initial_file
> s
[1] "new_file_location.csv"
And your source directory looks like this
.
├── sample
│ └──Adapter.java
└── target
└── sample
└── Adapter.class
In processFile you can do whatever you like and call your existing Java code.
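As a rough illustration of what a fuller processFile might look like, here is a minimal sketch. It assumes the CSV has a header line and that the "processing" is just appending one derived column; the class name CsvAdapter, the helper transform, the row_length column and the .out.csv suffix are all hypothetical placeholders for your own code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical adapter: the class name and the "row_length" column are illustrative only.
class CsvAdapter {

    // Placeholder for the real Java processing: appends one derived column to every row.
    static List<String> transform(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // The first line is treated as the header (an assumption about the CSV layout).
            out.add(i == 0 ? line + ",row_length" : line + "," + line.length());
        }
        return out;
    }

    // Reads the CSV written by R, processes it, writes a sibling file and returns its path.
    public String processFile(String file) {
        Path input = Paths.get(file);
        Path output = Paths.get(file + ".out.csv");
        try {
            List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
            Files.write(output, transform(lines), StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Rethrow unchecked so the method signature stays String processFile(String).
            throw new UncheckedIOException(e);
        }
        return output.toString();
    }

    // Small demo: writes a temporary CSV, processes it and prints the output location.
    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("from_r", ".csv");
        Files.write(input, Arrays.asList("id,value", "a,1", "b,22"), StandardCharsets.UTF_8);
        String result = new CsvAdapter().processFile(input.toString());
        System.out.println("Result written to: " + result);
    }
}
```

From R, the returned string can then be passed straight to read.csv, after calling processFile through .jcall in the same way as shown above.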
I'm new to nf-core/Nextflow and, needless to say, the documentation does not always reflect what is actually implemented. I'm defining the basic pipeline below:
nextflow.enable.dsl=2
process RUNBLAST{
input:
val thr
path query
path db
path output
output:
path output
script:
"""
blastn -query ${query} -db ${db} -out ${output} -num_threads ${thr}
"""
}
workflow{
//println "I want to BLAST $params.query to $params.dbDir/$params.dbName using $params.threads CPUs and output it to $params.outdir"
RUNBLAST(params.threads,params.query,params.dbDir, params.output)
}
Then I execute the pipeline with
nextflow run main.nf --query test2.fa --dbDir blast/blastDB
Then I get the following error:
N E X T F L O W ~ version 22.10.6
Launching `main.nf` [dreamy_hugle] DSL2 - revision: c388cf8f31
Error executing process > 'RUNBLAST'
Error executing process > 'RUNBLAST'
Caused by:
Not a valid path value: 'test2.fa'
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
I know test2.fa exists in the current directory:
(nfcore) MN:nf-core-basicblast jraygozagaray$ ls
CHANGELOG.md conf other.nf
CITATIONS.md docs pyproject.toml
CODE_OF_CONDUCT.md lib subworkflows
LICENSE main.nf test.fa
README.md modules test2.fa
assets modules.json work
bin nextflow.config workflows
blast nextflow_schema.json
I also tried "file" instead of path, but that is deprecated and raises other kinds of errors.
It would be helpful to know how to fix this so I can get started with building the pipeline.
Shouldn't nextflow copy the file to the execution path?
Thanks
You get the above error because params.query is not actually a path value. It's probably just a simple String or GString. The solution is to instead supply a file object, for example:
workflow {
query = file(params.query)
BLAST( query, ... )
}
Note that a value channel is implicitly created by a process when it is invoked with a simple value, like the above file object. If you need to be able to BLAST multiple query files, you'll instead need a queue channel, which can be created using the fromPath factory method, for example:
params.query = "${baseDir}/data/*.fa"
params.db = "${baseDir}/blastdb/nt"
params.outdir = './results'
db_name = file(params.db).name
db_path = file(params.db).parent
process BLAST {
publishDir(
path: "${params.outdir}/blast",
mode: 'copy',
)
input:
tuple val(query_id), path(query)
path db
output:
tuple val(query_id), path("${query_id}.out")
"""
blastn \\
-num_threads ${task.cpus} \\
-query "${query}" \\
-db "${db}/${db_name}" \\
-out "${query_id}.out"
"""
}
workflow{
Channel
.fromPath( params.query )
.map { file -> tuple(file.baseName, file) }
.set { query_ch }
BLAST( query_ch, db_path )
}
Note that the usual way to specify the number of threads/CPUs is the cpus directive, which can be configured using a process selector in your nextflow.config. For example:
process {
withName: BLAST {
cpus = 4
}
}
I want to use https://github.com/bazelbuild/rules_webtesting. I am using Bazel 5.2.0.
The whole project can be found here.
My WORKSPACE.bazel file looks like this:
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
name = "io_bazel_rules_webtesting",
sha256 = "3ef3bb22852546693c94e9b0b02c2570e74abab6f800fd58e0cbe79492e49c1b",
urls = [
"https://github.com/bazelbuild/rules_webtesting/archive/581b1557e382f93419da6a03b91a45c2ac9a9ec8/rules_webtesting.tar.gz",
],
)
load("@io_bazel_rules_webtesting//web:repositories.bzl", "web_test_repositories")
web_test_repositories()
My BUILD.bazel file looks like this:
load("@io_bazel_rules_webtesting//web:py.bzl", "py_web_test_suite")
py_web_test_suite(
name = "browser_test",
srcs = ["browser_test.py"],
browsers = [
"@io_bazel_rules_webtesting//browsers:chromium-local",
],
local = True,
deps = ["@io_bazel_rules_webtesting//testing/web"],
)
browser_test.py looks like this:
import unittest
from testing.web import webtest
class BrowserTest(unittest.TestCase):
def setUp(self):
self.driver = webtest.new_webdriver_session()
def tearDown(self):
try:
self.driver.quit()
finally:
self.driver = None
# Your tests here
if __name__ == "__main__":
unittest.main()
When I try to do a bazel build //... I get (under Ubuntu 20.04 and macOS):
INFO: Invocation ID: 74c03efd-9caa-4174-9fda-42f7ff37e38b
ERROR: error loading package '': Every .bzl file must have a corresponding package, but '@io_bazel_rules_webtesting//web:repositories.bzl' does not have one. Please create a BUILD file in the same or any parent directory. Note that this BUILD file does not need to do anything except exist.
INFO: Elapsed time: 0.038s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
The error message does not make sense to me, since there is a BUILD file in
https://github.com/bazelbuild/rules_webtesting/blob/581b1557e382f93419da6a03b91a45c2ac9a9ec8/BUILD.bazel
and https://github.com/bazelbuild/rules_webtesting/blob/581b1557e382f93419da6a03b91a45c2ac9a9ec8/web/BUILD.bazel.
I also tried a different version of Bazel - but with the same result.
Any ideas on how to get this working?
You need to add strip_prefix = "rules_webtesting-581b1557e382f93419da6a03b91a45c2ac9a9ec8" to your http_archive call. The GitHub archive unpacks into a top-level directory with that name, so without strip_prefix the web package does not sit at the root of the external repository.
For debugging, you can look in the folder where Bazel extracts it: bazel-out/../../../external/io_bazel_rules_webtesting. @io_bazel_rules_webtesting//web translates to bazel-out/../../../external/io_bazel_rules_webtesting/web, so if that folder doesn't exist, things won't work.
I get the error below when I try to download and open a .realm file in the /tmp directory under the Serverless Framework.
{"errorType":"Runtime.UnhandledPromiseRejection","errorMessage":"Error: posix_fallocate() failed: Operation not permitted" }
Below is the code:
let realm = new Realm({path: '/tmp/custom.realm', schema: [schema1, schema2]});
realm.write(() => {
console.log('completed==');
});
EDIT: this might soon be finally fixed in Realm-Core: see issue 4957.
In case you run into this problem elsewhere, here's a workaround.
This is caused by AWS Lambda not supporting the fallocate and fallocate64 system calls. Instead of returning the correct error code in this case, which would be EINVAL (not supported on this file system), Amazon has blocked the system call so that it returns EPERM. Realm-Core has code that handles an EINVAL return value correctly, but it is thrown off by the unexpected EPERM returned from the system call.
The solution is to add a small shared library as a layer to the Lambda function. Compile the following C file on a Linux machine or inside a lambda-ci Docker image:
#include <errno.h>
#include <fcntl.h>
int posix_fallocate(int __fd, off_t __offset, off_t __len) {
return EINVAL;
}
int posix_fallocate64(int __fd, off_t __offset, off_t __len) {
return EINVAL;
}
Now, compile this to a shared object with something like
gcc -shared fix.c -o fix.so
Then add it to the root of a ZIP file:
zip layer.zip fix.so
Create a new lambda layer from this zip
Add the lambda layer to your lambda function
Finally, make the shared object load by setting the environment variable LD_PRELOAD to /opt/fix.so in your Lambda configuration.
Enjoy.
I am trying to use qmake to include all files in a directory (this project is an external subversion project with hundreds of files). I am using qmake version 3.1.
What I tried was something like:
server_files = $$files($$PWD/server)
SOURCES += server_files(*.cpp, true)
The first line does not give any error but the second line gives:
:-1: warning: Failure to find: server_files(*.cpp,
:-1: warning: Failure to find: true)
Putting a $ sign in front of the variable as SOURCES += $server_files(*.cpp, true) gives the same error.
The following example function takes a variable name as its only argument, extracts a list of values from the variable with the eval() built-in function, and compiles a list of files:
defineReplace(headersAndSources) {
variable = $$1
names = $$eval($$variable)
headers =
sources =
for(name, names) {
header = $${name}.h
exists($$header) {
headers += $$header
}
source = $${name}.cpp
exists($$source) {
sources += $$source
}
}
return($$headers $$sources)
}
Variable Processing Functions
I am trying to figure the proper way to organize the source tree for a Julia application seqscan. For now I have the following tree:
$ tree seqscan/
seqscan/
├── LICENSE
├── README.md
├── benchmark
├── doc
├── examples
├── src
│ └── seq.jl
└── test
└── test_seq.jl
5 directories, 4 files
The file seq.jl contains
module SeqScan
module Seq
export SeqEntry
type SeqEntry
id
seq
scores
seq_type
end
end
end
and test_seq.jl contains:
module TestSeq
using Base.Test
using SeqScan.Seq
@testset "Testing SeqEntry" begin
@testset "test SeqEntry creation" begin
seq_entry = SeqEntry("test", "atcg")
@test seq_entry.id == "test"
@test seq_entry.seq == "atcg"
end
end
end
However, running the test code yields an error:
ERROR: LoadError: ArgumentError: Module SeqScan not found in current path.
even after setting the JULIA_LOAD_PATH environment variable to include seqscan or seqscan/src, so I must be doing something wrong?
The name of your package (the root of your local tree) needs to match the name of a file that exists under the src directory. Try this:
SeqScan/
|-- src/
|-- SeqScan.jl (your seq.jl)
I don't know why you are enclosing the module Seq in SeqScan. If there is no important reason to do that, you could access the type more directly. You could remove "module Seq" and the paired "end". Then just "using SeqScan" would bring in the type SeqEntry.
The type SeqEntry, as written, knows what to do when given four field values, one for each of the defined fields. If you want to initialize that type with just the first two fields, you need to include a two-argument constructor. For example, assuming seq is a vector of some numeric type, scores is also a vector of that numeric type, and seq_type is a numeric type:
function SeqEntry(id, seq)
seq_type = typeof(seq[1])
scores = zeros(seq_type, length(seq))
return SeqEntry(id, seq, scores, seq_type)
end
An example of a package with internal modules, for Julia v0.5.
The package is named MyPackage.jl; it incorporates two internal modules: TypeModule and FuncModule; each module has its own file: TypeModule.jl and FuncModule.jl.
TypeModule contains a new type, MyType. FuncModule contains a new function, MyFunc, which operates on variable[s] of MyType. There are two forms of that function, a 1-arg and a 2-arg version.
MyPackage uses both internal modules. It incorporates each for immediate use and initializes two variables of MyType. Then MyPackage applies MyFunc to them and prints the results.
I assume Julia's package directory is "/you/.julia/v0.5" (Windows: "c:\you\.julia\v0.5"), and refer to it as PkgDir. You can find the real package directory by typing Pkg.dir() at Julia's interactive prompt. The first thing to do is make sure Julia's internal information is current: > Pkg.update() and then get a special package called PkgDev: > Pkg.add("PkgDev")
You might start your package on GitHub. If you are starting it locally, you should use PkgDev because it creates the essential package file (and others) using the right structure:
> using PkgDev then > PkgDev.generate("MyPackage","MIT")
This also creates a file, LICENSE.md, with Julia's go-to license. You can keep it, replace it or remove it.
In the directory PkgDir/MyPackage/src, create a subdirectory "internal". In the directory PkgDir/MyPackage/src/internal, create two files, "TypeModule.jl" and "FuncModule.jl", with the following contents:
TypeModule.jl:
module TypeModule
export MyType
type MyType
value::Int
end
end # TypeModule
FuncModule.jl:
module FuncModule
export MyFunc
#=
!important!
TypeModule is included in MyPackage.jl before this module
This module gets MyType from MyPackage.jl, not directly.
Getting it directly would create mismatch of module indirection.
=#
import ..MyPackage: MyType
function MyFunc(x::MyType)
return x.value + 1
end
function MyFunc(x::MyType, y::MyType)
return x.value + y.value + 1
end
end # FuncModule
And in the src directory, edit MyPackage.jl so it matches this:
MyPackage.jl:
module MyPackage
export MyType, MyFunc
#=
!important! Do this before including FuncModule
because FuncModule.jl imports MyType from here.
MyType must be in use before including FuncModule.
=#
include( joinpath("internal", "TypeModule.jl") )
using .TypeModule # prefix the '.' to use an included module
include( joinpath("internal", "FuncModule.jl") )
using .FuncModule # prefix the '.' to use an included module
three = MyType(3)
five = MyType(5)
four = MyFunc(three)
eight = MyFunc(three, five)
# show that everything works
println()
println( string("MyFunc(three) = ", four) )
println( string("MyFunc(three, five) = ", eight) )
end # MyPackage
Now, starting Julia and entering > using MyPackage should show this:
julia> using MyPackage
MyFunc(three) = 4
MyFunc(three, five) = 9
julia>