i am trying to pass external arguments to R when calling a Rscript with a command line in that i am getting an Error in setwd(args[8]) : cannot change working directory
Execution halted
Not sure why this error is occurring Please suggest some ideas to sort out
Any idea would be appreciated
Rscript /home/mike/learn-analysis/SL_input.R chr21 10542449 10542649 + Y Y 1 /mnt/prep/learn/CA-21-10542394-10542448.2.D.txt D /home/mike/learn-analysis/SL_table.R
#SL_table.R file
args <- commandArgs(trailingOnly = TRUE)
setwd(args[8]) # directory to write file to
source(args[11]) # can be updated to reflect location of source code
results<-SL_mutate(args[1],args[2],args[3],args[4],args[5],args[6],args[7]) # chrom, start, stop, strand, gene, transcript, exon number
results$type<-args[10]
results$transcript<-NULL
results$exon_num<-NULL
write.table(results,args[9],quote = F, row.names = F, sep="\t") # write to filename given in argument (in working directory)
Related
I am new to latex and I am trying to output the R results from describe( ) function through latex( ).
A .tex file is being created but it is not giving me a pdf output
df = data.frame(A = rnorm(n = 100, 0,1)) %>% dplyr::mutate(B = 2*A+ runif(n = 100,-0.05,0.05))
# call latex( )
f = 'sample.tex'
df_latext = Hmisc::latex(describe(df,descript = 'Sample Description'), file = f)
When I run df_latext in the R console it is giving me the following error:
MiKTeX Problem Report
Message: Windows API error 2: The system cannot find the file specified.
Data: path="C:\Users\domjo\AppData\Local\Temp\RtmpiYZcTM\file56f02ed47c3a.dvi"
Source: Libraries\MiKTeX\Core\File\win\winFile.cpp
Line: 301
MiKTeX: 22.8
OS: Windows 10.0.22000
Invokers: .../explorer/rstudio/rsession
SystemAdmin: no
Root0: C:\Users\domjo\AppData\Roaming\MiKTeX
Root1: C:\Users\domjo\AppData\Local\MiKTeX
Root2: C:\Users\domjo\AppData\Local\Programs\MiKTeX
UserInstall: C:\Users\domjo\AppData\Local\Programs\MiKTeX
UserConfig: C:\Users\domjo\AppData\Roaming\MiKTeX
UserData: C:\Users\domjo\AppData\Local\MiKTeX
Also, I tried running the .tex file created but its giving the following error saying Environment spacing undefined.:
! LaTeX Error: Environment spacing undefined.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.1 \begin{spacing}{0.7}
?
I am trying to reproduce this protocol for DNA sequencing data analysis. It requires running this bash script that links to an R script. However, I am getting this error (see bottom) that I cant seem to solve.
#!/bin/bash
Project_dir=~/base
cd /${Project_dir}
SCRTP=~/scRepliseq-Pipeline
OUTNAME="bam/G1_F121_A1.adapter_filtered2"
genome_name="mm10"
bamfile=${OUTNAME}.${genome_name}.clean_srt_markdup.bam
rscript=${SCRTP}/util/Step3_R-Aneu-Fragment-bins.R
out_dir="Aneu_analysis"
Name=‘$bamfile’
Name=${name%.adapter_filtered2.${genome_name}.clean_srt_markdup.bam}
blacklist=~/blacklist/mm10-blacklist-v1_id.bed
genome_file=~/reference/UCSC_hg19_female.fa.fai
mkdir -p ${out_dir}
Rscript --vanilla $rscript ${bamfile} ${out_dir} ${name} ${blacklist} ${genome_file}
it links to this R script
args = commandArgs(TRUE)
bamfile=args[1]
out_dir=args[2]
name=args[3]
blacklist=args[4]
genome_file=args[5]
options(scipen=100)
##Extension of file name##
ext="_mapq10_blacklist_fragment.Rdata"
ext2="_mapq10_blacklist_bin.Rdata"
library(AneuFinder)
##loading black list and genome Info##
genome_tmp <- read.table(genome_file,sep="\t") #UCSC_mm9.woYwR.fa.fai
genome=data.frame(UCSC_seqlevel=genome_tmp$V1,UCSC_seqlength=genome_tmp$V2)
chromosomes=as.character(genome$UCSC_seqlevel)
##setup output directories##
out_dir_f=paste0(out_dir,"/fragment")
out_dir_b=paste0(out_dir,"/bins")
dir.create(out_dir,showWarnings = FALSE)
dir.create(out_dir_f,showWarnings = FALSE)
dir.create(out_dir_b,showWarnings = FALSE)
##save the fragment file (>10 MAPQ), filtering out the blacklist regions##
raw_reads=bam2GRanges(bamfile,remove.duplicate.reads = TRUE,min.mapq = 10,blacklist = blacklist)
save(raw_reads,file = paste0(out_dir_f,"/",name,ext))
##save the bin data file ##
bins_reads=binReads(raw_reads,
assembly=genome,
chromosomes=chromosomes,
binsizes=c(40000,80000,100000,200000,500000))
rpm=1000000/length(raw_reads)
bins_reads[["rpm"]]=rpm
save(bins_reads,file=paste(out_dir_b,"/",name,ext2,sep=""))
It shows this error:
Error in file(file, "rt") : invalid 'description' argument
Calls: read.table -> file
Execution halted
I am trying to download all .gz files from this link:
ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/
So far I tried this and I am not getting any results:
require(RCurl)
url= "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/"
filenames = getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE)
filenames <- strsplit(filenames, "\r\n")
filenames = unlist(filenames)
I am getting this error:
Error in function (type, msg, asError = TRUE) :
Operation timed out after 300552 milliseconds with 0 out of 0 bytes received
Can someone please help with this?
Thanks
EDIT:
I tried to run with with filenames provided to me bellow so in my r script I have:
require(RCurl)
my_url <-"ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/"
my_filenames= c("bed_chr_11.bed.gz", ..."bed_chr_9.bed.gz.md5")
my_filenames <- strsplit(my_filenames, "\r\n")
my_filenames = unlist(my_filenames)
for(my_file in my_filenames){
download.file(paste0(my_url, my_file), destfile = file.path('/mydir', my_file))
}
And when I run the script I get these warnings:
trying URL 'ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/bed_chr_11.bed.gz'
Error in download.file(paste0(my_url, my_file), destfile = file.path("/mydir", :
cannot open URL 'ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/bed_chr_11.bed.gz'
In addition: Warning message:
In download.file(paste0(my_url, my_file), destfile = file.path("/mydir", :
URL 'ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/bed_chr_11.bed.gz': status was 'Timeout was reached'
Execution halted
The file names you're trying to access are
filenames <- c("bed_chr_11.bed.gz", "bed_chr_11.bed.gz.md5", "bed_chr_12.bed.gz",
"bed_chr_12.bed.gz.md5", "bed_chr_13.bed.gz", "bed_chr_13.bed.gz.md5",
"bed_chr_14.bed.gz", "bed_chr_14.bed.gz.md5", "bed_chr_15.bed.gz",
"bed_chr_15.bed.gz.md5", "bed_chr_16.bed.gz", "bed_chr_16.bed.gz.md5",
"bed_chr_17.bed.gz", "bed_chr_17.bed.gz.md5", "bed_chr_18.bed.gz",
"bed_chr_18.bed.gz.md5", "bed_chr_19.bed.gz", "bed_chr_19.bed.gz.md5",
"bed_chr_20.bed.gz", "bed_chr_20.bed.gz.md5", "bed_chr_21.bed.gz",
"bed_chr_21.bed.gz.md5", "bed_chr_22.bed.gz", "bed_chr_22.bed.gz.md5",
"bed_chr_AltOnly.bed.gz", "bed_chr_AltOnly.bed.gz.md5", "bed_chr_MT.bed.gz",
"bed_chr_MT.bed.gz.md5", "bed_chr_Multi.bed.gz", "bed_chr_Multi.bed.gz.md5",
"bed_chr_NotOn.bed.gz", "bed_chr_NotOn.bed.gz.md5", "bed_chr_PAR.bed.gz",
"bed_chr_PAR.bed.gz.md5", "bed_chr_Un.bed.gz", "bed_chr_Un.bed.gz.md5",
"bed_chr_X.bed.gz", "bed_chr_X.bed.gz.md5", "bed_chr_Y.bed.gz",
"bed_chr_Y.bed.gz.md5", "bed_chr_1.bed.gz", "bed_chr_1.bed.gz.md5",
"bed_chr_10.bed.gz", "bed_chr_10.bed.gz.md5", "bed_chr_2.bed.gz",
"bed_chr_2.bed.gz.md5", "bed_chr_3.bed.gz", "bed_chr_3.bed.gz.md5",
"bed_chr_4.bed.gz", "bed_chr_4.bed.gz.md5", "bed_chr_5.bed.gz",
"bed_chr_5.bed.gz.md5", "bed_chr_6.bed.gz", "bed_chr_6.bed.gz.md5",
"bed_chr_7.bed.gz", "bed_chr_7.bed.gz.md5", "bed_chr_8.bed.gz",
"bed_chr_8.bed.gz.md5", "bed_chr_9.bed.gz", "bed_chr_9.bed.gz.md5"
)
The files are big, so I didn't check that this whole loop, but this worked at least for the first file. Add this to the end of your code.
my_url <- 'ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/BED/'
for(my_file in filenames){ # loop over the files
# download each file, saving in a directory that you need to create on your own computer
download.file(paste0(my_url, my_file), destfile = file.path('c:/users/josep/Documents/', my_file))
}
I would like to schedule a R script which works fine via Rstudio. But when try to run the script with Rscript I got an error. The script take some parameters in a csv file and try to gather data on opentsdb.
library(devtools)
install_local("/home/me/Downloads/opentsdbr-master")
install_local("/home/me/Downloads/nomalyDetection-master")
install_local("/home/me/Downloads/anomalydots", force="TRUE")
install_local("/home/me/Downloads/RAD")
library(reticulate)
library(anomalydots)
lbrary('methods')
l <- read.csv(file = "/home/bbtex/Downloads/metrics_list.csv", sep = ",")
n <- length(l$metrics)
for (i in 1:n){
tag=as.character(l$tags[i])
metric=as.character(l$metrics[i])
tag=eval(parse(text=paste("c(",tag,")",sep="")))
newdata=getTsOpenTSDBOptimized(metric=metric,hostname="opentsdb-read.intcs.meshcore.net",port=4242,downsample=10,endDate=as.character(Sys.Date()),agg=l$agg[i],tags=tag)
newdata = anomalydots::fillHoles2(newdata)
newdata = anomalydots::timeSeriePreparation(newdata)
timeseries[[i]] <- newdata
}
print ("Upload completed successfully")
When I run the following code with Rscript I have the following error:
Error in tsd_get_ascii(metric, interval, tags, agg, rate, downsample,
:
Calls: getTsOpenTSDBOptimized -> getTsOpenTSDB -> tsd_get -> tsd_get_ascii
In addition: Warning messages:
1: In tsd_get_ascii(metric, interval, tags, agg, rate, downsample, :
Response code 503
2: In tsd_get_ascii(metric, interval, tags, agg, rate, downsample, :
URL: http://opentsdb-read.intcs.meshcore.net:4242/q?start=2018%2F10%2F19-02%3A00%3A00&m=avg%3A600s-sum%3Adb.mysql.threads_connected%7Bhost%3Dbpct4005s%7D&end=2018%2F10%2F22-01%3A59%3A59&ascii=
Execution halted
When I follow the link in the error, I can see the data printed in the webpage.
Thank you in advance for your help
(Windows 7 / R version 3.0.1)
Below the commands and the resulting error:
> library(tm)
> pdf <- readPDF(PdftotextOptions = "-layout")
> dat <- pdf(elem = list(uri = "17214.pdf"), language="de", id="id1")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp
\RtmpS8Uql1\pdfinfo167c2bc159f8': No such file or directory
How do I solve this issue?
EDIT I
(As suggested by Ben and described here)
I downloaded Xpdf copied the 32bit version to
C:\Program Files (x86)\xpdf32
and the 64bit version to
C:\Program Files\xpdf64
The environment variables pdfinfo and pdftotext are referring to the respective executables either 32bit (tested with R 32bit) or to 64bit (tested with R 64bit)
EDIT II
One very confusing observation is that starting from a fresh session (tm not loaded) the last command alone will produce the error:
> dat <- pdf(elem = list(uri = "17214.pdf"), language="de", id="id1")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp\RtmpKi5GnL
\pdfinfode8283c422f': No such file or directory
I don't understand this at all because the function variable is not defined by tm.readPDF yet. Below you'll find the function pdf refers to "naturally" and to what is returned by tm.readPDF:
> pdf
function (elem, language, id)
{
meta <- tm:::pdfinfo(elem$uri)
content <- system2("pdftotext", c(PdftotextOptions, shQuote(elem$uri),
"-"), stdout = TRUE)
PlainTextDocument(content, meta$Author, meta$CreationDate,
meta$Subject, meta$Title, id, meta$Creator, language)
}
<environment: 0x0674bd8c>
> library(tm)
> pdf <- readPDF(PdftotextOptions = "-layout")
> pdf
function (elem, language, id)
{
meta <- tm:::pdfinfo(elem$uri)
content <- system2("pdftotext", c(PdftotextOptions, shQuote(elem$uri),
"-"), stdout = TRUE)
PlainTextDocument(content, meta$Author, meta$CreationDate,
meta$Subject, meta$Title, id, meta$Creator, language)
}
<environment: 0x0c3d7364>
Apparently there is no difference - then why use readPDF at all?
EDIT III
The pdf file is located here: C:\Users\Raffael\Documents
> getwd()
[1] "C:/Users/Raffael/Documents"
EDIT IV
First instruction in pdf() is a call to tm:::pdfinfo() - and there the error is caused within the first few lines:
> outfile <- tempfile("pdfinfo")
> on.exit(unlink(outfile))
> status <- system2("pdfinfo", shQuote(normalizePath("C:/Users/Raffael/Documents/17214.pdf")),
+ stdout = outfile)
> tags <- c("Title", "Subject", "Keywords", "Author", "Creator",
+ "Producer", "CreationDate", "ModDate", "Tagged", "Form",
+ "Pages", "Encrypted", "Page size", "File size", "Optimized",
+ "PDF version")
> re <- sprintf("^(%s)", paste(sprintf("%-16s", sprintf("%s:",
+ tags)), collapse = "|"))
> lines <- readLines(outfile, warn = FALSE)
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp\RtmpquRYX6\pdfinfo8d419174450': No such file or direc
Apparently tempfile() simply doesn't create a file.
> outfile <- tempfile("pdfinfo")
> outfile
[1] "C:\\Users\\Raffael\\AppData\\Local\\Temp\\RtmpquRYX6\\pdfinfo8d437bd65d9"
The folder C:\Users\Raffael\AppData\Local\Temp\RtmpquRYX6 exists and holds some files but none is named pdfinfo8d437bd65d9.
Intersting, on my machine after a fresh start pdf is a function to convert an image to a PDF:
getAnywhere(pdf)
A single object matching ‘pdf’ was found
It was found in the following places
package:grDevices
namespace:grDevices [etc.]
But back to the problem of reading in PDF files as text, fiddling with the PATH is a bit hit-and-miss (and annoying if you work across several different computers), so I think the simplest and safest method is to call pdf2text using system as Tony Breyal describes here.
In your case it would be (note the two sets of quotes):
system(paste('"C:/Program Files/xpdf64/pdftotext.exe"',
'"C:/Users/Raffael/Documents/17214.pdf"'), wait=FALSE)
This could easily be extended with an *apply function or loop if you have many PDF files.