jpeg() function incompatible with paste() function in R?

I want to write jpeg files with dynamic filenames.
In plot_filename I concatenate strings with values from other variables to create
a dynamic filename:
plot_filename = paste("Series T_all","/",Participant[i],"/",Part[i,2],"/",Part[i,3],".jpg")
The output of plot_filename is just another string: "Series T_all / 802 / 1 / 64 .jpg"
However, when I use this string as the filename in the jpeg() function
jpeg(filename= plot_filename, width = 2000, height = 1500, quality = 100,
pointsize = 50)
plot(T1)
dev.off()
I get the following error:
Error in jpeg(filename = paste("Series T_all", "/", Participant[i], "/", :
unable to start jpeg() device
In addition: Warning messages:
1: In jpeg(filename = paste("Series T_all", "/", Participant[i], "/", :
unable to open file 'Series T_all / 802 / 1 / 64 .jpg' for writing
2: In jpeg(filename = paste("Series T_all", "/", Participant[i], "/", :
opening device failed
But when I just use a plain string (without the paste() function) as a filename
name="plot_filename.jpg"
the jpeg() function works just fine.
Does anybody know why this happens? In both cases I'm just passing a string to the jpeg() function, so I don't see why one works but the other doesn't.
Thanks

The statement
plot_filename = paste("Series T_all","/",Participant[i],"/",Part[i,2],"/",Part[i,3],".jpg")
separates the individual strings with spaces (the default separator), as you can see in your output example
"Series T_all / 802 / 1 / 64 .jpg"
This path, however, does not exist, so jpeg() cannot open the file for writing.
If you use
plot_filename = paste("Series T_all","/",Participant[i],"/",Part[i,2],"/",Part[i,3],".jpg", sep="")
this should give a string like
"Series T_all/802/1/64.jpg"
In general, sep= can take any character or string, so you could also use sep="/" to separate your strings and avoid writing "/" yourself. However, that would also affect the concatenation of Part[i,3] and ".jpg"; if you want to use it that way, append ".jpg" in a second step with sep="". For your case, I think it is fine to just use sep="".
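As an aside (not part of the original answer): base R's paste0() is shorthand for paste(..., sep = ""), and file.path() inserts the "/" separators for you, which sidesteps this class of bug entirely:
# equivalent construction with file.path()/paste0(), using the same variables as above
plot_filename <- file.path("Series T_all", Participant[i],
                           Part[i, 2], paste0(Part[i, 3], ".jpg"))
Also note that jpeg() only writes into directories that already exist; it will not create them for you.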

Related

Read txt file as a numeric array in R

I am using RStudio on a Mac (macOS 10.14.6), and I am trying to read a text file that looks like this:
5:[0.12126984126984124, 0.11682539682539679, 0.14666666666666664, 0.07269841269841269, 0.06984126984126983, 0.0911111111111111, 0.1092063492063492, 0.12253968253968253, 0.08698412698412696, 0.09523809523809523, 0.12222222222222222, 0.10761904761904759]
I've tried several variations of read, read.delim, and read.csv, and they all pretty much do the same thing:
> data.matrix(read.delim("data.txt",sep=','))
X5..0.12126984126984124 X0.11682539682539679 X0.14666666666666664 X0.07269841269841269 X0.06984126984126983
X0.0911111111111111 X0.1092063492063492 X0.12253968253968253 X0.08698412698412696 X0.09523809523809523
X0.12222222222222222 X0.10761904761904759
Using "unlist", "as.numeric", "as.character" does not yield anything most likely due to the presence of the X in front of each number. Does anyone have ideas to read this file properly?
If you are only interested in reading the numbers, first delete the 5:[ at the beginning and the ] at the end, then read the remainder using scan() with sep = ','.
scan(text=gsub("^.*\\[|\\]", "", string), sep = ",")
Read 12 items
[1] 0.12126984 0.11682540 0.14666667 0.07269841 0.06984127 0.09111111
[7] 0.10920635 0.12253968 0.08698413 0.09523810 0.12222222 0.10761905
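Here string stands for the raw contents of the file; a complete sketch (assuming the file is named data.txt as in the question):
string <- readLines("data.txt")   # e.g. "5:[0.12126984126984124, ...]"
nums <- scan(text = gsub("^.*\\[|\\]", "", string), sep = ",")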

Can I import variables into R from a global file?

I am integrating an R script to produce some graphics into a larger project that is pulled together with a Makefile. In this larger project, I have a file called globals.mk that contains global variables used by many other scripts in the project. For example, the number of simulations I want to run is a global that I want to use in this R script. Can I "import" this as a variable, or is it necessary to manually define every variable within the R script?
Edit: here is a sample of the globals that I would need to read in.
num = 100
path = ./here/is/a/path
file = $(path)/file.csv
And I would like the R script to set the variables num as 100 (or "100"), path as "./here/is/a/path" and file as "./here/is/a/path/file.csv".
If it is OK to replace the parentheses with brace brackets, then readRenviron() will read in such files, perform the substitutions, and return the contents as environment variables.
# write out test file globals2.mk which uses brace brackets
Lines <- "num = 100
path = ./here/is/a/path
file = ${path}/file.csv"
cat(Lines, file = "globals2.mk")
readRenviron("globals2.mk")
Sys.getenv("num")
## [1] "100"
Sys.getenv("path")
## [1] "./here/is/a/path"
Sys.getenv("file")
## [1] "./here/is/a/path/file.csv"
If it is important to use parentheses rather than brace brackets, read in globals.mk, replace the parentheses with brace brackets, write the file out again, and then read it back in:
# write out test file - this one uses parentheses as in question
Lines <- "num = 100
path = ./here/is/a/path
file = $(path)/file.csv"
cat(Lines, file = "globals.mk")
# read globals.mk, perform () to {} substitutions, write out and then re-read
tmp <- tempfile()
L <- readLines("globals.mk")
cat(paste(chartr("()", "{}", L), collapse = "\n"), file = tmp)
readRenviron(tmp)
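Checking, as before (this assumes the same globals.mk contents shown above):
Sys.getenv("file")
## [1] "./here/is/a/path/file.csv"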
If the .mk file has anything other than direct variable expansion (such as more complex make rules/tricks/functions), it might be better to trust make to do the expansion for you and then read the result in. There's a blog post I found that dumps all variable contents (after processing).
TL;DR
expand_mkvars <- function(path, aslist = FALSE) {
  stopifnot(file.exists(mk <- Sys.which("make")))
  tf <- tempfile(fileext = ".mk")
  # needed on my windows system
  tf <- normalizePath(tf, winslash = "/", mustWork = FALSE) # tempfile should suffice
  on.exit(suppressWarnings(file.remove(tf)), add = TRUE)
  # throwaway makefile whose only job is to dump every variable via $(warning)
  writeLines(c(".PHONY: printvars",
               "printvars:",
               "\t#$(foreach V,$(sort $(.VARIABLES)), \\",
               "\t   $(if $(filter-out environment% default automatic, \\",
               "\t   $(origin $V)),$(warning $V=$($V))))"), con = tf)
  # dry-run make with both makefiles; the $(warning) lines carry the values
  out <- system2(mk, c("-f", shQuote(path), "-f", shQuote(tf), "-n", "printvars"),
                 stdout = TRUE, stderr = TRUE)
  out <- out[grepl(paste0("^", tf), out)]
  out <- gsub(paste0("^", tf, ":[0-9]+:\\s*"), "", out)
  # drop make's own bookkeeping variables
  known_noneed <- c(".DEFAULT_GOAL", "CURDIR", "GNUMAKEFLAGS", "MAKEFILE_LIST", "MAKEFLAGS")
  out <- out[!grepl(paste0("^(", paste(known_noneed, collapse = "|"), ")="), out)]
  if (aslist) {
    spl <- strsplit(out, "=")
    nms <- sapply(spl, `[[`, 1)
    rest <- lapply(spl, function(a) paste(a[-1], collapse = "="))
    setNames(rest, nms)
  } else out
}
In action:
expand_mkvars("~/StackOverflow/karthikt.mk")
# [1] "file=./here/is/a/path/file.csv" "num=100"
# [3] "path=./here/is/a/path"
expand_mkvars("~/StackOverflow/karthikt.mk", aslist = TRUE)
# $file
# [1] "./here/is/a/path/file.csv"
# $num
# [1] "100"
# $path
# [1] "./here/is/a/path"
I have not tested this on other systems, so you might need to adjust known_noneed to cover extra variables that pop up. Depending on your needs, you might be able to filter more intelligently (e.g., if none of your variables start with a capital letter), but for this example I kept it to the known-not-wanted variables that make gives us.
The blog post suggests using a phony target of
.PHONY: printvars
printvars:
    #$(foreach V,$(sort $(.VARIABLES)), \
       $(if $(filter-out environment% default automatic, \
       $(origin $V)),$(warning $V=$($V))))
(some are tabs, not all spaces, very important for make)
Unfortunately, it produces more output than you technically need:
$ /c/Rtools/bin/make.exe -f ~/StackOverflow/karthikt.mk printvars
C:/Users/r2/StackOverflow/karthikt.mk:10: .DEFAULT_GOAL=all
C:/Users/r2/StackOverflow/karthikt.mk:10: CURDIR=/Users/r2/Projects/Ford/shiny/shinyobjects/inst
C:/Users/r2/StackOverflow/karthikt.mk:10: GNUMAKEFLAGS=
C:/Users/r2/StackOverflow/karthikt.mk:10: MAKEFILE_LIST= C:/Users/r2/StackOverflow/karthikt.mk
C:/Users/r2/StackOverflow/karthikt.mk:10: MAKEFLAGS=
C:/Users/r2/StackOverflow/karthikt.mk:10: SHELL=sh
C:/Users/r2/StackOverflow/karthikt.mk:10: file=./here/is/a/path/file.csv
C:/Users/r2/StackOverflow/karthikt.mk:10: num=100
C:/Users/r2/StackOverflow/karthikt.mk:10: path=./here/is/a/path
make: Nothing to be done for 'printvars'.
so we need a little filtering, ergo the majority of code in the function.
Edit: if the readRenviron-to-envvar route works best for you, it would not be difficult to redirect the output of this make call to another file, parse out the relevant lines, and then call readRenviron() on that new file. It seems more indirect due to the use of two temp files, but they are cleaned up, so that should be nothing to worry about.
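A sketch of that indirection, reusing expand_mkvars() from above (the default aslist = FALSE output is already in the name=value form that readRenviron() expects; tmp2 is a hypothetical second temp file):
tmp2 <- tempfile()
writeLines(expand_mkvars("globals.mk"), tmp2)
readRenviron(tmp2)
Sys.getenv("num")
## [1] "100"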

Ignoring the symbol ° in R devtools function document()

I would like to create a package for internal usage (not to distribute somewhere). One of my functions contains the line
if (data$unit[i] != "°C") {
It works perfectly in the script, but when I try to create the documentation for my package using document() from devtools, I get the error
Error in parse(text = lines, keep.source = TRUE, srcfile = srcfilecopy(file, path_to_my_code: unexpected INCOMPLETE_STRING
279: if (! is.na(data$unit[i]){
280: if (data$unit[i] != "
In addition: Warning message:
In readLines(con, warn = FALSE, n = n, ok = ok, skipNul = skipNul) :
invalid input found on input connection 'path_to_my_code'
If I delete the °-character, document() works. But I need this character there, so this is not an option.
When using a double backslash in the if-clause, my function no longer detects °C, as shown here:
test <- c("mg/l", "°C")
"\\°C" %in% test
[1] FALSE
If I use tryCatch, the documentation is also not created.
Replacing "°C" with gsub(pattern = '\\\\', replacement = "", x = '\\°C') makes the function crash at the double backslash.
How can I tell document() that everything is fine and it should just create the files?
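A workaround consistent with the \uxxxx advice in the £ question further down (not part of the original thread): the degree sign is U+00B0, so the comparison can be written in pure ASCII, which parses cleanly and still compares equal to "°C" at run time:
if (data$unit[i] != "\u00b0C") {   # "\u00b0" is the Unicode escape for the degree sign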

efficiently read in fasta file and calculate nucleotide frequencies in R

How can I read in a fasta file (~4 Gb) and calculate nucleotide frequencies in a window of 4 bps in length?
It takes too long to read in the fasta file using
library(ShortRead)
readFasta('myfile.fa')
I have also tried to index it (the file contains many sequences) using
library(Rsamtools)
indexFa('myfile.fa')
fa = FaFile('myfile.fa')
However, I do not know how to access the file in this format.
I would guess that 'slow' to read in a file that size would be a minute; longer than that and something other than software is the problem. Maybe it's appropriate to ask where your file comes from, your operating system, and whether you have manipulated the files (e.g., trying to open them in a text editor) before processing.
If 'too slow' is because you are running out of memory, then reading in chunks might help. With Rsamtools
fa = "my.fasta"
## indexFa(fa) if the index does not already exist
idx = scanFaIndex(fa)
Create chunks of the index, e.g., into n = 10 chunks:
chunks = snow::splitIndices(length(idx), 10)
and then process the file
res = lapply(chunks, function(chunk, fa, idx) {
    dna = scanFa(fa, idx[chunk])
    ## ...
}, fa, idx)
Use do.call(c, res) or similar to concatenate the final result, or perhaps use a for loop if you're accumulating a single value. Indexing the fasta file is via a call to the samtools library; using samtools on the command line is also an option, on non-Windows.
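For the "## ..." placeholder, one possibility (a sketch, not part of the original answer) is Biostrings::oligonucleotideFrequency(), which tabulates k-mer counts; width = 4 gives the 4-bp windows the question asks about:
res = lapply(chunks, function(chunk, fa, idx) {
    dna = scanFa(fa, idx[chunk])
    ## 4-mer counts per sequence, summed over the chunk
    ## (oligonucleotideFrequency is in Biostrings, loaded along with Rsamtools)
    colSums(oligonucleotideFrequency(dna, width = 4))
}, fa, idx)
freq = Reduce(`+`, res)   # combine the per-chunk totals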
An alternative is to use Biostrings::fasta.index() to index the file, then chunk through with that
idx = fasta.index(fa, seqtype="DNA")
chunks = snow::splitIndices(nrow(idx), 10)
res = lapply(chunks, function(chunk, idx) {
    dna = readDNAStringSet(idx[chunk, ])
    ## ...
}, idx)
If each record consists of a single line of DNA sequence, then reading the records into R in (even-numbered) chunks via readLines() and processing from there is relatively easy:
con = file(fa)
open(con)
chunkSize = 10000000
while (TRUE) {
    lines = readLines(con, chunkSize)
    if (length(lines) == 0)
        break
    ## odd lines are headers, even lines are the sequences
    dna = DNAStringSet(lines[c(FALSE, TRUE)])
    ## ...
}
close(con)
Load the Biostrings package and then use the readDNAStringSet() function.
From example("readDNAStringSet"), slightly modified:
library(Biostrings)
# example("readDNAStringSet") #optional
filepath1 <- system.file("extdata", "someORF.fa", package="Biostrings")
head(fasta.seqlengths(filepath1, seqtype="DNA"))
x1 <- readDNAStringSet(filepath1)
head(x1)

How to use a non-ASCII symbol (e.g. £) in an R package function?

I have a simple function in one of my R packages, with one of the arguments symbol = "£":
formatPound <- function(x, digits = 2, nsmall = 2, symbol = "£"){
    paste(symbol, format(x, digits = digits, nsmall = nsmall))
}
But when running R CMD check, I get this warning:
* checking R files for non-ASCII characters ... WARNING
Found the following files with non-ASCII characters:
formatters.R
It's definitely that £ symbol that causes the problem. If I replace it with a legitimate ASCII character, like $, the warning disappears.
Question: How can I use £ in my function argument, without incurring a R CMD check warning?
Looks like "Writing R Extensions" covers this in Section 1.7.1, "Encoding Issues".
One of the recommendations there is to use the Unicode escape \uxxxx. Since £ is U+00A3, you can use:
formatPound <- function(x, digits = 2, nsmall = 2, symbol = "\u00A3"){
    paste(symbol, format(x, digits = digits, nsmall = nsmall))
}
formatPound(123.45)
[1] "£ 123.45"
As a workaround, you can use the intToUtf8() function:
# this causes errors (non-ASCII chars)
f <- function(symbol = "➛")
# this also causes errors in Rd files (non-ASCII chars)
f <- function(symbol = "\u279B")
# this is ok
f <- function(symbol = intToUtf8(0x279B))
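For what it's worth, the two spellings produce the same string at run time, so behavior is unchanged:
identical(intToUtf8(0x279B), "\u279B")
## [1] TRUE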
