I am facing the error below while fast loading a CSV file.
I tried to find a solution, but with no luck.
Can you please help me?
EDIT: Added the script file ...
The following steps worked in my case:
Drop the error tables used in the FastLoad script.
Recreate the table being loaded; releasing the table won't work every time.
Last but not least, do not forget the .LOGOFF at the end of the FastLoad script.
If you forgot it, the cure is to issue the .LOGOFF through the BTEQ utility in a cmd window.
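For reference, a minimal FastLoad skeleton reflecting these steps might look like this (a sketch only; the database, table, column, and file names are placeholders):

.LOGON tdpid/user,password;

/* drop the error tables and recreate the table being loaded */
DROP TABLE mydb.csv_stage;
DROP TABLE mydb.csv_stage_ET;
DROP TABLE mydb.csv_stage_UV;
CREATE TABLE mydb.csv_stage (
    col1 VARCHAR(100),
    col2 VARCHAR(100)
) PRIMARY INDEX (col1);

/* describe the CSV input */
SET RECORD VARTEXT ",";
DEFINE col1 (VARCHAR(100)),
       col2 (VARCHAR(100))
FILE = input.csv;

BEGIN LOADING mydb.csv_stage
    ERRORFILES mydb.csv_stage_ET, mydb.csv_stage_UV;
INSERT INTO mydb.csv_stage (col1, col2) VALUES (:col1, :col2);
END LOADING;

.LOGOFF;  /* without this, the load session stays open and the table stays locked */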
I'm trying to use the package bambu to quantify gene counts from bam files. I am using my university's HPC, so I have written an R script and a batch submission file to launch it.
When the script gets to the point of running the bambu function, it gives the following error:
Start generating read class files
| | 0%[W::hts_idx_load2] The index file is older than the data file: ./results/minimap2/KD_R1.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: ./results/minimap2/KD_R3.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: ./results/minimap2/WT_R1.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: ./results/minimap2/WT_R2.sorted.bam.bai
|================== | 25%
Error: BiocParallel errors
element index: 1, 2, 3
first error: cannot open the connection
In addition: Warning message:
stop worker failed:
attempt to select less than one element in OneIndex
Execution halted
So it looks like BiocParallel isn't happy and cannot open a certain connection, but I'm not sure how to fix this.
This is my R script:
#Bambu R script
#load libraries
library(Rsamtools)
library(bambu)
#Creating files
bamFiles <- Rsamtools::BamFileList(c("./results/minimap2/KD_R1.sorted.bam",
                                     "./results/minimap2/KD_R2.sorted.bam",
                                     "./results/minimap2/KD_R3.sorted.bam",
                                     "./results/minimap2/WT_R1.sorted.bam",
                                     "./results/minimap2/WT_R2.sorted.bam",
                                     "./results/minimap2/WT_R3.sorted.bam"))
annotation<-prepareAnnotations("./ref_data/Homo_sapiens.GRCh38.104.chr.gtf")
fa.file<-"./ref_data/Homo_sapiens.GRCh38.dna.primary_assembly.fa"
#Running bambu
se<- bambu(reads=bamFiles, annotations=annotation, genome=fa.file,ncore=4)
se
seGene<- transcriptToGeneExpression(se)
#Saving files
save.file<-tempfile(fileext=".gtf")
writeToGTF(rowRanges(se),file=save.file)
save.dir <- tempdir()
writeBambuOutput(se,path=save.dir,prefix="Nanopore_")
writeBambuOutput(seGene,path=save.dir,prefix="Nanopore_")
If you have any ideas on why this happens, it would be so helpful! Thank you.
I think that @Chris has a good point. Under the hood it seems likely that bambu is running htslib, based on those warnings. While they may indeed be only warnings, I would like to know what the results look like if you run this interactively.
This question is hard to answer right now as it's missing some information (what the files look like, a minimal reproducible example, etc.). But in the meantime, here are some questions that may help figure it out (a short interactive sketch covering them follows the list):
What does bamFiles look like? Does it have the right number of read records? Do all of those files have nonzero read records? Are any suspiciously small?
What are the timestamps on the .bai vs. .bam files (e.g. ls -lh ./results/minimap2/)? Are they about what you'd expect, or is something wonky? Are any of them (say, ./results/minimap2/WT_R2.sorted.bam.bai) weirdly small?
What happens when you run it interactively? Where does it fail? You say it's at the bambu() call, but how do you know that?
What happens when you run bambu() with ncore = 1?
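Something like this, run interactively from the job's working directory, would answer most of those (a sketch; the paths are taken from your script):

library(Rsamtools)

bams <- c("./results/minimap2/KD_R1.sorted.bam", "./results/minimap2/KD_R2.sorted.bam",
          "./results/minimap2/KD_R3.sorted.bam", "./results/minimap2/WT_R1.sorted.bam",
          "./results/minimap2/WT_R2.sorted.bam", "./results/minimap2/WT_R3.sorted.bam")

# sizes and timestamps: empty or suspiciously small files show up here
file.info(c(bams, paste0(bams, ".bai")))[, c("size", "mtime")]

# read records per BAM: all six should be nonzero and roughly comparable
sapply(bams, function(b) countBam(b)$records)

# regenerating the indexes makes each .bai newer than its BAM,
# which also silences the hts_idx_load2 warnings
indexBam(bams)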
It seems very likely that this is due to a problem with the files, and that it is only at the BiocParallel step that the error bubbles up to the top. Many utilities have an annoying habit of happily accepting an empty file, only to fail confusingly, without an informative error message, when asked to do something with it.
You might also consider raising an issue with the developers.
(Why the warning is only possibly a problem: the index file sometimes has a timestamp like that for very small alignment files that are generated and indexed programmatically, where the indexing step is near-instantaneous.)
Important: I want the R code to be in an external file.
I have already run the R code successfully. It runs in an SSIS package, within a SQL statement, using the @script=N'..R code here..' argument. So far so good, except that the window where the SQL statement is entered is a pain: wrapping is weird, there is no code highlighting, etc., which renders the whole thing basically unmaintainable.
So I would like to supply the script as a VARCHAR variable, in the fashion @script=@loaded_script.
I tried to bulk load the script, but I get errors that the file cannot be found. Does somebody have an idea?
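For the record, what I tried looks roughly like this (a sketch; the path is a placeholder, and note that OPENROWSET resolves it on the SQL Server machine, not on the client, which can itself produce "file cannot be found" errors):

DECLARE @loaded_script NVARCHAR(MAX);

-- read the whole script file into a variable
SELECT @loaded_script = BulkColumn
FROM OPENROWSET(BULK 'C:\scripts\my_script.R', SINGLE_CLOB) AS src;

-- pass the variable instead of a literal
EXEC sp_execute_external_script
     @language = N'R',
     @script   = @loaded_script;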
I have been using RDCOMClient for a while now to interact with vendor software, and for the most part it has worked fine. Recently, however, I have needed to loop through many operations (several hundred), and I am running into problems with the RDCOM.err file growing to a very large size (easily GBs). This file is put in C:\ with no apparent option to change that. Is there some way to suppress this output, or to specify another location for the file? I don't need any of the output, so suppressing it would be best.
EDIT: I tried adding a file.remove() call to my script, but R has the file locked. The only way I can get the lock released is to restart R.
Thanks.
Setting the permissions to read only was going to be my suggested hack.
A slightly more elegant approach is to edit one line of the C code in the package in src/RUtils.h from
#define errorLog(a,...) fprintf(getErrorFILE(), a, ##__VA_ARGS__); fflush(getErrorFILE());
to
#define errorLog(a, ...) {}
However, I've pushed some simple updates to the package on GitHub that add a writeErrors() function one can use to toggle whether errors are written, so the logging can be turned on and off dynamically.
So
library(RDCOMClient)
writeErrors(FALSE)
will turn off the error logging to the file.
I found a workaround for this: I created the files C:\RDCOM.err and C:\RDCOM_server.err and marked them both as read-only. I am not sure whether there is a better way to accomplish this, but for now I am running without logging.
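In case it helps to script it, a sketch of the same workaround (note that on Windows, Sys.chmod() can only toggle the write permission, i.e. the read-only flag, which is exactly what is needed here):

logs <- c("C:/RDCOM.err", "C:/RDCOM_server.err")
file.create(logs)               # create the log files if they do not exist yet
Sys.chmod(logs, mode = "0444")  # mark them read-only so RDCOMClient cannot grow them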
Currently I'm working with Jedox, trying to use the RScript transform component.
The installation of R itself on the server was a little tricky, but after several attempts it finally worked.
The information on this blog was helpful for the installation: jedoxtools.wordpress.com
The key challenge, though, was entering the correct directory paths in the 'Path' (C:\Program Files\R\R-3.4.1\bin\x64) and 'R_Home' (C:\Program Files\R\R-3.4.1) variables.
But now that the 'hard part' should be done, I simply can't get the transform component running.
Based on the example R script in this presentation, every time I try simple scripts I get the following error message:
Failed to retrieve data from source [my RScript component's name] : null
The script I run is as simple as this:
data <- my_datasource
Result <- data
There is data in the source, and if I do the test locally in RStudio it works perfectly fine.
Anyone here with R experience in Jedox?
A few attempts later I found the solution myself, and of course it's super easy, you just have to know about it.
The example given in the Jedox documentation shows a script which suggests that the returned result set must be called 'result'.
Instead, you can return any object; all you have to do is enter the name of the result set in an extra field above the script box.
The working script (input = output) is shown below.
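Since the screenshot is not reproduced here, this is a sketch of it, assuming 'Result' is the name entered in that extra field:

data <- my_datasource   # input handed to the component by Jedox
Result <- data          # returned because 'Result' is entered in the result-set field above the script box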
When I have a lot of data loaded in R, I have difficulties opening and choosing a new file via file.choose() and then loading it via read.csv(). I never get that far, though, because the file.choose() call hangs and R 'crashes', reporting something like 'an unidentified error occurred and R must restart'.
I'm using RStudio and running this on Windows 7. The hardware is up to date.
Could someone point out why this is happening and what a remedy would be? Are there other options for selecting a file? I know I can insert the path right into the read.csv() command, but the file is different every time.
EDIT:
The error just happened again. I cannot reproduce it on demand; it only seems to happen, with high likelihood, when the conditions for it are met.
The error reads: R Session Aborted. R encountered a fatal error. The session was terminated. And in a window: "Start New Session".
EDIT 2:
I would just rephrase my question: is there another option, such as a command or a package, that deals with choosing a file, i.e. an alternative to file.choose()?
Since the error cannot be reproduced, I can't expect anyone to give a definitive diagnosis. But if this has happened to someone in the past and they solved it, I would like to hear about it. Thanks.
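For reference, the alternatives I have found so far (a sketch; each opens a native file dialog, and availability depends on the platform and IDE):

# Windows-only, base R
f <- utils::choose.files(multi = FALSE)

# cross-platform, ships with R via the tcltk package
f <- tcltk::tk_choose.files(multi = FALSE)

# RStudio's own dialog, from the rstudioapi package
f <- rstudioapi::selectFile()

df <- read.csv(f)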
EDIT 3: Further to the error, I have just spotted a sentence in red in the Console: Error: Unable to provide connection with R