path not being detected by Nextflow - pipeline

i'm new to nf-core/nextflow and needless to say the documentation does not reflect what might be actually implemented. But i'm defining the basic pipeline below:
nextflow.enable.dsl=2
process RUNBLAST{
input:
val thr
path query
path db
path output
output:
path output
script:
"""
blastn -query ${query} -db ${db} -out ${output} -num_threads ${thr}
"""
}
workflow{
//println "I want to BLAST $params.query to $params.dbDir/$params.dbName using $params.threads CPUs and output it to $params.outdir"
RUNBLAST(params.threads,params.query,params.dbDir, params.output)
}
Then i'm executing the pipeline with
nextflow run main.nf --query test2.fa --dbDir blast/blastDB
Then i get the following error:
N E X T F L O W ~ version 22.10.6
Launching `main.nf` [dreamy_hugle] DSL2 - revision: c388cf8f31
Error executing process > 'RUNBLAST'
Error executing process > 'RUNBLAST'
Caused by:
Not a valid path value: 'test2.fa'
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run
I know test2.fa exists in the current directory:
(nfcore) MN:nf-core-basicblast jraygozagaray$ ls
CHANGELOG.md conf other.nf
CITATIONS.md docs pyproject.toml
CODE_OF_CONDUCT.md lib subworkflows
LICENSE main.nf test.fa
README.md modules test2.fa
assets modules.json work
bin nextflow.config workflows
blast nextflow_schema.json
I also tried with "file" instead of path but that is deprecated and raises other kind of errors.
It'll be helpful to know how to fix this to get myself started with the pipeline building process.
Shouldn't nextflow copy the file to the execution path?
Thanks

You get the above error because params.query is not actually a path value. It's probably just a simple String or GString. The solution is to instead supply a file object, for example:
workflow {
query = file(params.query)
BLAST( query, ... )
}
Note that a value channel is implicitly created by a process when it is invoked with a simple value, like the above file object. If you need to be able to BLAST multiple query files, you'll instead need a queue channel, which can be created using the fromPath factory method, for example:
params.query = "${baseDir}/data/*.fa"
params.db = "${baseDir}/blastdb/nt"
params.outdir = './results'
db_name = file(params.db).name
db_path = file(params.db).parent
process BLAST {
publishDir(
path: "{params.outdir}/blast",
mode: 'copy',
)
input:
tuple val(query_id), path(query)
path db
output:
tuple val(query_id), path("${query_id}.out")
"""
blastn \\
-num_threads ${task.cpus} \\
-query "${query}" \\
-db "${db}/${db_name}" \\
-out "${query_id}.out"
"""
}
workflow{
Channel
.fromPath( params.query )
.map { file -> tuple(file.baseName, file) }
.set { query_ch }
BLAST( query_ch, db_path )
}
Note that the usual way to specify the number of threads/cpus is using cpus directive, which can be configured using a process selector in your nextflow.config. For example:
process {
withName: BLAST {
cpus = 4
}
}

Related

zsh: Do I need to close file descriptors?

I use the following code to both output something to stdout, and pipe it to a program:
function example() {
local fd1
{
exec {fd1}>&1
{ echo hi >&$fd1 } | true
} always { exec {fd1}>&- }
}
I am wondering if I can safely drop always { exec {fd1}>&- }. fd1 goes out of scope after the function finishes anyways.
You need to keep always { exec {fd1}>&- }. If you get rid of that, the variable containing the file descriptor will go out of scope, but the file descriptor won't be closed, resulting in leaking it. You can see this by doing ls -l /proc/$$/fd before and after running your function without that line. Each run of the function will permanently add another FD to that list. Eventually, you'll run out of file descriptors and won't be able to open any new ones, which will break things.

posix_fallocate() failed: Operation not permitted while opening .realm file

I get the below error when i try to open and download .realm file in /tmp directory of serverless framework.
{"errorType":"Runtime.UnhandledPromiseRejection","errorMessage":"Error: posix_fallocate() failed: Operation not permitted" }
Below is the code:
let realm = new Realm({path: '/tmp/custom.realm', schema: [schema1, schema2]});
realm.write(() => {
console.log('completed==');
});
EDIT: this might soon be finally fixed in Realm-Core: see issue 4957.
In case you'll run into this problem elsewhere, here's a workaround.
This caused by AWS Lambda not supporting the fallocate and fallocate64 system calls. Instead of returning the correct error code in this case, which would be EINVAL for not supported on this file system, Amazon has blocked the system call so that it returns EPERM. Realm-Core has code that handles EINVAL return value correctly but will be bewildered by the unexpected EPERM returned from the system call.
The solution is to add a small shared library as a layer to the lambda: compile the following C file on Linux machine or inside lambda-ci Docker image:
#include <errno.h>
#include <fcntl.h>
int posix_fallocate(int __fd, off_t __offset, off_t __len) {
return EINVAL;
}
int posix_fallocate64(int __fd, off_t __offset, off_t __len) {
return EINVAL;
}
Now, compile this to a shared object with something like
gcc -shared fix.c -o fix.so
Then add it to a root of a ZIP file:
zip layer.zip fix.so
Create a new lambda layer from this zip
Add the lambda layer to your lambda function
Finally make the shared object be loaded by configuring the environment value LD_PRELOAD with value /opt/fix.so to your Lambda.
Enjoy.

Dccp protocol simulation in ns2 2.34

How to add dccp patches to ns2 2.34? Please give me detailed steps.
The file is the file is ns234-dccp-1.patch.
The error comes when I try to simulate dccp is
Kar#ubuntu:~$ ns audiodccp.tcl
invalid command name "Agent/DCCP/TCPlike"
while executing
"Agent/DCCP/TCPlike create _o726 "
invoked from within
"catch "$className create $o $args" msg"
invoked from within
"if [catch "$className create $o $args" msg] {
if [string match "__FAILED_SHADOW_OBJECT_" $msg] {
delete $o
return ""
}
global errorInfo
error "class $..."
(procedure "new" line 3)
invoked from within
"new Agent/DCCP/TCPlike"
invoked from within
"set dccp1 [new Agent/DCCP/TCPlike]"
(file "audiodccp.tcl" line 50)
UBUNTU-10.04
NS2 allinone 2.34
audiodccp.tcl : Unknown file.
invalid command name "Agent/DCCP/TCPlike"
→ → You have a failed build. Or you are using the wrong executable 'ns'. The suggestion is to do :
cd ns-allinone-2.34/-ns-2.34/
cp ns ns-dccp
sudo cp ns-dccp /usr/local/bin/
... and then do simulations with $ ns-dccp [file.tcl]
You can also use ns-2.35, which has DCCP included by default.
Note : You can have as many times ns-allinone-2.xx as you want, installed at the same time. But : Do never add any PATH text to .bashrc. Not required.

Using powershell loop with net.exe command error

I'm trying to use powershell to write a script that calls net.exe's delete on a collection of computers meeting the specific case of having 3 or fewer files open. I'm fairly new at this, obviously, as I'm getting odd errors.
Using the example at Microsoft's blog I made the function below out of net session.
Function Get-ActiveNetSessions
{
# converts the output of the net session cmd into a PSobject
$output = net session | Select-String -Pattern \\
$output | foreach {
$parts = $_ -split "\s+", 4
New-Object -Type PSObject -Property #{
Computer = $parts[0].ToString();
Username = $parts[1];
Opens = $parts[2];
IdleTime = $parts[3];
}
}
}
which does produce a workable object that I can apply logic to.
I can use
$computerList = Get-ActiveNetSessions | Where-Object {$_.Opens -clt 3} | Select-Object {$_.Computer} to pull all computers with less than three opens into a variable, too.
What fails is the loop below
ForEach($computer in $computerList)
{
net session $computer /delete
}
with the error
net : The syntax of this command is:
At line:5 char:5
net session $computer /delete
CategoryInfo :NotSpecified (The syntax of this command is::String) [], RemoteException
FullyQualifiedErrorId : NativeCommandError
NET SESSION
[\\computername] [/DELETE] [/LIST]
Trying to run it with a call of $computer = $computer.ToString() ahead of the execution so it sees a string causes the script to hang without dropping the sessions, forcing me to close and reopen the ISE.
What should I do to get this loop working? Any help is appreciated.
Net session expects a \\ before the server name, it looks like. Have you given that a try?

Time command equivalent in PowerShell

What is the flow of execution of the time command in detail?
I have a user created function in PowerShell, which will compute the time for execution of the command in the following way.
It will open the new PowerShell window.
It will execute the command.
It will close the PowerShell window.
It will get the the different execution times using the GetProcessTimes function function.
Is the "time command" in Unix also calculated in the same way?
The Measure-Command cmdlet is your friend.
PS> Measure-Command -Expression {dir}
You could also get execution time from the command history (last executed command in this example):
$h = Get-History -Count 1
$h.EndExecutionTime - $h.StartExecutionTime
I've been doing this:
Time {npm --version ; node --version}
With this function, which you can put in your $profile file:
function Time([scriptblock]$scriptblock, $name)
{
<#
.SYNOPSIS
Run the given scriptblock, and say how long it took at the end.
.DESCRIPTION
.PARAMETER scriptBlock
A single computer name or an array of computer names. You mayalso provide IP addresses.
.PARAMETER name
Use this for long scriptBlocks to avoid quoting the entire script block in the final output line
.EXAMPLE
time { ls -recurse}
.EXAMPLE
time { ls -recurse} "All the things"
#>
if (!$stopWatch)
{
$script:stopWatch = new-object System.Diagnostics.StopWatch
}
$stopWatch.Reset()
$stopWatch.Start()
. $scriptblock
$stopWatch.Stop()
if ($name -eq $null) {
$name = "$scriptblock"
}
"Execution time: $($stopWatch.ElapsedMilliseconds) ms for $name"
}
Measure-Command works, but it swallows the stdout of the command being run. (Also see Timing a command's execution in PowerShell)
If you need to measure the time taken by something, you can follow this blog entry.
Basically, it suggest to use the .NET StopWatch class:
$sw = [System.Diagnostics.StopWatch]::startNew()
# The code you measure
$sw.Stop()
Write-Host $sw.Elapsed

Resources