I am very new to RNA-seq analysis. I am trying to run HTSeq through Galaxy, as I am unfamiliar with coding and don't know how to do this with Python.
I get the following error:
Use of uninitialized value $input_file in concatenation (.) or string at /mnt/galaxyTools/shed_tools/toolshed.g2.bx.psu.edu/repos/fcaramia/edger/86292c2b0ba9/edger/htseq.pl line 57.
Usage: file [-bchikLlNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
[-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
file -C [-m magicfiles]
file [--help]
Use of uninitialized value $input_file in concatenation (.) or string at /mnt/galaxyTools/shed_tools/toolshed.g2.bx.psu.edu/repos/fcaramia/edger/86292c2b0ba9/edger/htseq.pl line 62.
This is followed by a long list of locations where the error occurs. For example:
Error occured when reading first line of sam file.
Error:
[Exception type: StopIteration, raised in count.py:81]
Use of uninitialized value $sample in hash element at /mnt/galaxyTools/shed_tools/toolshed.g2.bx.psu.edu/repos/fcaramia/edger/86292c2b0ba9/edger/htseq.pl line 67.
Use of uninitialized value $files[0] in string eq at /mnt/galaxyTools/shed_tools/toolshed.g2.bx.psu.edu/repos/fcaramia/edger/86292c2b0ba9/edger/htseq.pl line 78.
In the AWS CLI, the command aws quicksight describe-data-set (with the appropriate arguments) returns a JSON document with the following troublesome structure:
{
"Status": 200,
"DataSet": {
"Arn": "arn:aws:quicksight:<region>:<acct>:dataset/b7c87122-e180-47a9-a8a4-19f171e13fc8",
"DataSetId": "b7c87122-e180-47a9-a8a4-19f171e13fc8",
"Name": "MyName",
"CreatedTime": "2022-08-16T12:01:54.948000-05:00",
"LastUpdatedTime": "2022-08-19T08:47:55.553000-05:00",
"PhysicalTableMap": {
"6fac5dee-3691-4ddd-ba7a-0667168bb80c": {
"CustomSql": {
"DataSourceArn": "arn:aws:quicksight:<region>:<acct>:datasource/46f83f8b-181e-4575-8d61-84c50125f3aa",
I need to address that DataSourceArn, but the key "6fac5dee-3691-4ddd-ba7a-0667168bb80c" is unknown to me at runtime. How do I address it?
I tried:
jq -r '.DataSet.PhysicalTableMap.*.CustomSql.DataSourceArn'
jq -r '.DataSet.PhysicalTableMap.\*.CustomSql.DataSourceArn'
jq -r '.DataSet.PhysicalTableMap.?.CustomSql.DataSourceArn'
jq -r '.DataSet.PhysicalTableMap.\?.CustomSql.DataSourceArn'
jq -r '.DataSet.PhysicalTableMap.%.CustomSql.DataSourceArn'
jq -r '.DataSet.PhysicalTableMap.\%.CustomSql.DataSourceArn'
All return an error similar to:
jq: error: syntax error, unexpected INVALID_CHARACTER, expecting FORMAT or QQSTRING_START (Unix shell quoting issues?) at <top-level>, line 1:
.DataSet.PhysicalTableMap.\?.CustomSql.DataSourceArn
jq: 1 compile error
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe
I'm a noob, I know I'm guessing here. Does anyone have any insight on this?
Something like this:
jq -r '.DataSet.PhysicalTableMap[].CustomSql.DataSourceArn'
The part .DataSet.PhysicalTableMap returns the object as one result; the following filter [] takes that object and outputs each of its values as a separate result. The filters that come after it are then applied to each of those results.
Note: If the object is the top-level item then the syntax is .[] .
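For example, on a small made-up object (not taken from the QuickSight output), [] iterates over the values even though the keys are unknown:
echo '{"a":{"x":1},"b":{"x":2}}' | jq '.[].x'
1
2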
❯ jq --version
jq-1.6
I'm using a .jq file as a filter as follows, and it works:
❯ cat jq/script.jq
def fi(v):
v | tostring |
if test("\\.") then
"float"
else
"integer"
end;
def estype(v):
if type=="number" then
fi(v)
else
type
end;
def esprop(v):
if type=="object" then
{"properties": v | with_entries(.value |= esprop(.))}
else
{"type": estype(v)}
end;
with_entries(.value |= esprop(.))
❯ cat test.json | jq -f jq/script.jq
...(omit results)
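For reference, since test.json isn't shown, here is what the filter produces on a small made-up input:
❯ echo '{"name":"a","count":3,"ratio":1.5,"nested":{"flag":true}}' | jq -c -f jq/script.jq
{"name":{"type":"string"},"count":{"type":"integer"},"ratio":{"type":"float"},"nested":{"properties":{"flag":{"type":"boolean"}}}}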
But when I use it as a library, it throws an error:
# same file, but with the last filter commented out, keeping only the function definitions
❯ cat jq/script.jq
def fi(v):
v | tostring |
if test("\\.") then
"float"
else
"integer"
end;
def estype(v):
if type=="number" then
fi(v)
else
type
end;
def esprop(v):
if type=="object" then
{"properties": v | with_entries(.value |= esprop(.))}
else
{"type": estype(v)}
end;
# with_entries(.value |= esprop(.))
❯ cat test.json | jq -L jq/script.jq 'import script;'
jq: error: syntax error, unexpected IDENT, expecting FORMAT or QQSTRING_START (Unix shell quoting issues?) at <top-level>, line 1:
import script;
jq: 1 compile error
What does this mean, and how can I debug and fix it?
Do .jq files used as a filter and as a library have different syntax? (It doesn't seem like they should.)
1a. What does it mean?
syntax error, unexpected IDENT, expecting FORMAT or QQSTRING_START
This means the parser found an identifier where it was expecting a string. (FORMAT is the token for a formatter like #csv or #text, while QQSTRING_START is the token for a string, like "script". In practice it's useless to use a formatter here since it won't let you use a non-constant string, but the parser doesn't know that.)
1b. How to debug and fix this?
It's probably easiest to refer back to the manual. It says that the form expected for "import" is
import RelativePathString as NAME;
and the form expected for "include" is
include RelativePathString;
It lacks examples to make this 100% clear, but "RelativePathString" is a placeholder - it needs to be a literal string. Try one of these:
cat test.json | jq -L jq 'include "script"; with_entries(.value |= esprop(.))'
cat test.json | jq -L jq 'import "script" as script; with_entries(.value |= script::esprop(.))'
Note that the library path should be the directory containing your script, and note the difference between include (the definitions go into the top-level namespace) and import (the definitions must be referenced through the namespace you give, e.g. script::esprop).
2. Do .jq files used as a filter or a library have a different syntax?
They use the same syntax. The problem was with the import statement, not with the script file.
I have an error with the scan function, why?
https://jqplay.org/s/E-0qbbzRPS
I need to do this without -r.
There are two issues with your filter. Firstly, you need to separate parameters to a function with semicolon ;, not comma ,:
scan("([0-9A-Za-z_]+) == '([0-9A-Za-z_]+)"; "g")
Secondly, scan with two parameters is not implemented (in contradiction to the manual).
jq: error: scan/2 is not defined at <top-level>, line 1:
But as you are using scan, your regex will match multiple occurrences anyway, so you may as well just drop the second parameter:
.spec.selector | [scan("([0-9A-Za-z_]+) == '([0-9A-Za-z_]+)") | {(.[0]): .[1]}]
[
{
"app": "nginx"
}
]
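A quick way to verify (the input object below is a guess modeled on the question; the original jqplay snippet isn't shown):
echo '{"spec":{"selector":"app == '\''nginx'\''"}}' |
  jq '.spec.selector | [scan("([0-9A-Za-z_]+) == '\''([0-9A-Za-z_]+)") | {(.[0]): .[1]}]'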
I am using kislyuk/yq - the more often talked about version, which is a wrapper over jq, written in Python using the PyYAML library for YAML parsing.
The version is yq 2.12.2
My jq is jq-1.6
I'm using ubuntu and bash scripts to do my parsing.
I wrote this in bash:
alias=alias1
token=abc
yq -y -i ".tokens += { $alias: { value: $token }}" /root/.github.yml
I get the following error
jq: error: abc/0 is not defined at <top-level>, line 1:
.tokens += { alias1: { value: abc }}
I don't get it. Why would there be a /0 at the end?
The problem is that abc is not interpreted as a literal string once the double quotes are expanded by the shell. The underlying jq wrapper tries to resolve abc as a standard built-in or user-defined function, which it can't, hence the error (the /0 means "a function named abc taking zero arguments").
A JSON string (which is what jq needs here) must be quoted with ".." to be consistent with the JSON grammar. One way is to pass the values in via the command line with the --arg support:
yq -y -i --arg t "$token" --arg a "$alias" '.tokens += { ($a): { value: $t } }' /root/.github.yml
Or have a quoting mess like below, which I don't recommend at all
yq -y -i '.tokens += { "'"$alias"'": { value: "'"$token"'" }}' /root/.github.yml
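To see the difference, here is a minimal sketch with plain jq (the yq wrapper passes --arg through to jq, so the same principle applies); the shell variables arrive as proper JSON strings:
alias=alias1
token=abc
jq -nc --arg a "$alias" --arg t "$token" '{tokens: {($a): {value: $t}}}'
{"tokens":{"alias1":{"value":"abc"}}}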
As suggested by Dirk Eddelbuettel in this talk and this answer, I tried to profile compiled R code using gperftools. Here is what I did.
I used Dirk's profilingSmall.R as the script that I want to profile. I repeat it here:
## R Extensions manual, section 3.2 'Profiling R for speed'
## 'N' reduced to 99 here
suppressMessages(library(MASS))
suppressMessages(library(boot))
storm.fm <- nls(Time ~ b*Viscosity/(Wt - c), stormer, start = c(b=29.401, c=2.2183))
st <- cbind(stormer, fit=fitted(storm.fm))
storm.bf <- function(rs, i) {
st$Time <- st$fit + rs[i]
tmp <- nls(Time ~ (b * Viscosity)/(Wt - c), st, start = coef(storm.fm))
tmp$m$getAllPars()
}
rs <- scale(resid(storm.fm), scale = FALSE) # remove the mean
Rprof("boot.out")
storm.boot <- boot(rs, storm.bf, R = 99) # pretty slow
Rprof(NULL)
To profile it, I ran the following script:
LD_PRELOAD="/usr/lib/libprofiler.so.0" \
CPUPROFILE=sample.log \
Rscript profilingSmall.R
Then I tried to parse the log file using
pprof /usr/bin/R sample.log
This returned the following error
Using local file /usr/bin/R.
Using local file sample.log.
substr outside of string at /usr/local/bin/pprof line 3618.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3618.
substr outside of string at /usr/local/bin/pprof line 3620.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3620.
sample.log: header size >= 2**16
sample.log is empty. However, a bunch of sample.log_<digit> files were created that contain information that looks reasonable.
I had the same problem, but then realized what was wrong. I'd done:
export CPUPROFILE=test.prof
export LD_PRELOAD="/usr/local/lib/libprofiler.so"
testprog ...
pprof --web `which testprog` test.prof
If I stopped after running testprog, the prof file wasn't empty, but after running pprof it was, and pprof crashed with the substr error.
What I realized later was that by setting and exporting LD_PRELOAD, libprofiler.so was also loaded for pprof itself, overwriting test.prof.
You just need to ensure LD_PRELOAD is not set when you run pprof.
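One way to do that is to set the variables only on the command line of the profiled run (testprog is the placeholder from above), so they never reach the pprof invocation:
CPUPROFILE=test.prof LD_PRELOAD="/usr/local/lib/libprofiler.so" testprog ...
pprof --web `which testprog` test.prof   # LD_PRELOAD is not set here, so pprof itself is not profiled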
I'm using gperftools-2.5, and I also encountered the same problem:
[root@localhost ivrserver]# pprof --text ./IvrServer ivr.prof
Using local file ./IvrServer.
Using local file ivr.prof.
substr outside of string at /usr/local/bin/pprof line 3695.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3695.
substr outside of string at /usr/local/bin/pprof line 3697.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3697.
ivr.prof: header size >= 2**16
I found this is because the prof file (ivr.prof in my example) is empty.
Every time the profiler starts and ends, it creates a new prof file, so you should use xxx.prof.0, xxx.prof.1, ... to get the right result.
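For example (assuming the first dump is the one you want):
pprof --text ./IvrServer ivr.prof.0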