How to quietly change the System Locale in R [duplicate] - r

I'm looking to suppress the output of one command (in this case, the apply function).
Is it possible to do this without using sink()? I've found the solution described in "How to suppress output", but would like to do this in one line if possible.

It isn't clear why you want to do this without sink, but you can wrap any commands in the invisible() function and it will suppress the output. For instance:
1:10 # prints output
invisible(1:10) # hides it
Otherwise, you can always combine things into one line with semicolons and braces:
{ sink("/dev/null"); ....; sink(); }

Use the capture.output() function. It works very much like a one-off sink() and, unlike invisible(), it can suppress more than just print messages. Set the file argument to /dev/null on UNIX or NUL on Windows. For example, considering Dirk's note:
> invisible(cat("Hi\n"))
Hi
> capture.output( cat("Hi\n"), file='NUL')
>
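A portable variant, as a minimal sketch: it assumes R 3.6.0 or later, where nullfile() returns the null device name for the current platform, so the same line should work on both UNIX and Windows.

```r
# Assumption: R >= 3.6.0, which provides nullfile().
# Output sent to the null device is discarded; capture.output() then returns NULL.
discarded <- capture.output(cat("Hi\n"), file = nullfile())

# Without a file argument, the text is returned as a character vector instead.
kept <- capture.output(cat("Hi\n"))
```

Here discarded is NULL and kept holds the captured line, so nothing reaches the console in either case.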

The following function should do what you want exactly:
hush=function(code){
sink("NUL") # use /dev/null in UNIX
tmp = code
sink()
return(tmp)
}
For example with the function here:
foo=function(){
print("BAR!")
return(42)
}
running
x = hush(foo())
will assign 42 to x but will not print "BAR!" to STDOUT.
Note that on a UNIX OS you will need to replace "NUL" with "/dev/null".

R only automatically prints the output of unassigned expressions, so just assign the result of the apply to a variable, and it won't get printed.
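A small sketch of this, using capture.output() only as a way to check what would have reached the console. Note that assignment stops auto-printing of the result; it does not silence print() or cat() calls made inside the function being applied.

```r
# Assigning the result means there is no visible value to auto-print.
res <- lapply(1:2, function(i) i + 1)

# Check: capture whatever the assignment would have printed (nothing).
printed <- capture.output(res <- lapply(1:2, function(i) i + 1))
length(printed)
```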

You can use capture.output() as below. This allows you to use the data later:
log <- capture.output({
test <- CensReg.SMN(cc=cc,x=x,y=y, nu=NULL, type="Normal")
})
test$betas

In case anyone's arriving here looking for a solution applicable to RMarkdown, this will suppress all output:
```{r error=FALSE, warning=FALSE, message=FALSE}
invisible({capture.output({
# Your code goes here
2 * 2
# etc
# etc
})})
```
The code will run, but the output will not be printed to the HTML document

invisible(cat("Dataset: ", dataset, fill = TRUE))
invisible(cat(" Width: " ,width, fill = TRUE))
invisible(cat(" Bin1: " ,bin1interval, fill = TRUE))
invisible(cat(" Bin2: " ,bin2interval, fill = TRUE))
invisible(cat(" Bin3: " ,bin3interval, fill = TRUE))
produces output without NULL at the end of the line or on the next line
Dataset: 17 19 26 29 31 32 34 45 47 51 52 59 60 62 63
Width: 15.33333
Bin1: 17 32.33333
Bin2: 32.33333 47.66667
Bin3: 47.66667 63

Turning Hadley's comment into an answer: using the apply family without printing is possible with the plyr package.
x <- 1:2
lapply(x, function(x) x + 1)
#> [[1]]
#> [1] 2
#>
#> [[2]]
#> [1] 3
plyr::l_ply(x, function(x) x + 1)

Here is a version that is robust to errors in the code to be shushed:
quietly <- function(x) {
sink("/dev/null") # on Windows (?) instead use `sink("NUL")`
tryCatch(suppressMessages(x), finally = sink())
}
This is based directly on the accepted answer, for which thanks.
But it avoids leaving output silenced if an error occurs in the quieted code.
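The same idea can be sketched with on.exit() instead of tryCatch(). The name quiet_eval is hypothetical, and nullfile() assumes R 3.6.0 or later; on older versions substitute "/dev/null" or "NUL" as above.

```r
# Hypothetical helper: silence printed output and messages from `expr`,
# and always restore the previous sink, even if `expr` throws an error.
quiet_eval <- function(expr) {
  sink(nullfile())             # assumption: R >= 3.6.0
  on.exit(sink(), add = TRUE)  # runs on normal exit and on error
  suppressMessages(expr)       # `expr` is evaluated lazily, after sink() is set
}

noisy <- function() {
  print("BAR!")
  message("a message")
  42
}

x <- quiet_eval(noisy())       # x is 42, nothing is printed

# After an error, the sink stack is back where it started.
depth_before <- sink.number()
try(quiet_eval(stop("boom")), silent = TRUE)
depth_after <- sink.number()
```

Because the argument is only evaluated inside suppressMessages(), the sink is already active when the noisy code runs.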

Related

How does R Markdown automatically format print effects into dataframes? Or how can I access special print methods?

I'm working with the WRS2 package and there are cases where it'll output its analysis (bwtrim) into a list with a special class of the analysis type class = "bwtrim". I can't as.data.frame() it, but I found that there is a custom print method called print.bwtrim associated with it.
As an example let's say this is the output: bwtrim.out <- bwtrim(...). When I run the analysis output in an Rmarkdown chunk, it seems to "steal" part of the text output and make it into a dataframe.
So here's my question, how can I either access print.bwtrim or how does R markdown automatically format certain outputs into dataframes? Because I'd like to take this outputted dataframe and use it for other purposes.
Update: Here is a minimal working example. Put the following in a chunk in an Rmd file.
```{r}
library(WRS2)
df <-
data.frame(
subject = rep(c(1:100), each = 2),
group = rep(c("treatment", "control"), each = 2),
timepoint = rep(c("pre", "post"), times = 2),
dv = rnorm(200, mean = 2)
)
analysis <- WRS2::bwtrim(dv ~ group * timepoint,
id = subject,
data = df,
tr = .2)
analysis
```
With this, a data.frame automatically shows up in the chunk afterwards and it shows all the values very nicely. My main question is how I can get this data.frame for my own uses. If you do str(analysis), you see that it's a list. If you do class(analysis), you get "bwtrim". If you do methods(class = "bwtrim"), you get the print method. And methods(print) will have a line that says print.bwtrim*. But I can't seem to figure out how to call print.bwtrim myself.
Regarding what Rmarkdown is doing, compare the following
If you run this in a chunk, it actually steals the data.frame part and puts it into a separate figure.
```{r}
capture.output(analysis)
```
However, if you run the same line in the console, the entire output comes out properly. What's also interesting is that if you try to assign it to another object, the output will be stolen before it can be assigned.
Compare x when you run the following in either a chunk or the console.
```{r}
x<-capture.output(analysis)
```
This is what I get from the chunk approach when I call x
[1] "Call:"
[2] "WRS2::bwtrim(formula = dv ~ group * timepoint, id = subject, "
[3] " data = df, tr = 0.2)"
[4] ""
[5] ""
This is what I get when I do it all in the console
[1] "Call:"
[2] "WRS2::bwtrim(formula = dv ~ group * timepoint, id = subject, "
[3] " data = df, tr = 0.2)"
[4] ""
[5] " value df1 df2 p.value"
[6] "group 1.0397 1 56.2774 0.3123"
[7] "timepoint 0.0001 1 57.8269 0.9904"
[8] "group:timepoint 0.5316 1 57.8269 0.4689"
[9] ""
My question is: how can I call whatever RStudio/R Markdown is doing to make data.frames, so that I can easily get a data.frame myself?
Update 2: This is probably not a bug, as discussed here https://github.com/rstudio/rmarkdown/issues/1150.
Update 3: You can access the method by using WRS2:::print.bwtrim(analysis), though I'm still interested in what Rmarkdown is doing.
Update 4: It might not be the case that Rmarkdown is stealing the output and automatically making dataframes from it, as you can see when you call x after you've already captured the output. Looking at WRS2:::print.bwtrim, it prints a dataframe that it creates, which I'm guessing Rmarkdown recognizes then formats it out.
See below for the print.bwtrim.
function (x, ...)
{
cat("Call:\n")
print(x$call)
cat("\n")
dfx <- data.frame(value = c(x$Qa, x$Qb, x$Qab), df1 = c(x$A.df[1],
x$B.df[1], x$AB.df[1]), df2 = c(x$A.df[2], x$B.df[2],
x$AB.df[2]), p.value = c(x$A.p.value, x$B.p.value, x$AB.p.value))
rownames(dfx) <- c(x$varnames[2], x$varnames[3], paste0(x$varnames[2],
":", x$varnames[3]))
dfx <- round(dfx, 4)
print(dfx)
cat("\n")
}
<bytecode: 0x000001f587dc6078>
<environment: namespace:WRS2>
In R Markdown documents, automatic printing is done by knitr::knit_print rather than print. I don't think there's a knit_print.bwtrim method defined, so it will use the default method, which is defined as
function (x, ..., inline = FALSE)
{
if (inline)
x
else normal_print(x)
}
and normal_print will call print().
You are asking why the output is different. I don't see that when I knit the document to html_document, but I do see it with html_notebook. I don't know the details of what is being done, but if you look at https://rmarkdown.rstudio.com/r_notebook_format.html you can see a discussion of "output source functions", which manipulate chunks to produce different output.
The fancy output you're seeing looks a lot like what knitr::knit_print does for a dataframe, so maybe html_notebook is substituting that in place of print.
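To get the table as a data.frame for your own use, one option is to rebuild it the same way the print.bwtrim source shown in the question does. This is only a sketch: bwtrim_table is a hypothetical helper, the field names are copied from that printed source, and the mock object below stands in for a real bwtrim result so the example runs without WRS2.

```r
# Hypothetical helper mirroring the data.frame built inside print.bwtrim.
bwtrim_table <- function(x) {
  dfx <- data.frame(
    value   = c(x$Qa, x$Qb, x$Qab),
    df1     = c(x$A.df[1], x$B.df[1], x$AB.df[1]),
    df2     = c(x$A.df[2], x$B.df[2], x$AB.df[2]),
    p.value = c(x$A.p.value, x$B.p.value, x$AB.p.value)
  )
  rownames(dfx) <- c(x$varnames[2], x$varnames[3],
                     paste0(x$varnames[2], ":", x$varnames[3]))
  dfx  # not rounded here; print.bwtrim rounds to 4 digits before printing
}

# Mock object with the same fields (values taken from the console output above).
mock <- list(
  Qa = 1.0397, Qb = 0.0001, Qab = 0.5316,
  A.df = c(1, 56.2774), B.df = c(1, 57.8269), AB.df = c(1, 57.8269),
  A.p.value = 0.3123, B.p.value = 0.9904, AB.p.value = 0.4689,
  varnames = c("dv", "group", "timepoint")
)
tab <- bwtrim_table(mock)
```

With a real result, bwtrim_table(analysis) should give the same table that the chunk output displays, assuming the object has the fields the print method reads.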

How to print plus-minus and beta signs in bquote, and correctly export to pdf

How do you print the ± sign in a bquote() expression in R?
I have tried the following:
pm
%pm%
±
These have not worked.
UPDATE #1 Here is some sample code
plot(NULL,xlim=c(0,10),ylim=c(0,10),xlab=NA,ylab=NA,xaxs="i",yaxs="i")
c <- "name"
p <- .004
n <- 969
b <- 1.23
s <- 0.45
tmp.txt <- paste(c(c," (n=",n,")\nslope = ",b,"±",s,"\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,9.5,labels=tmp.txt,adj=c(1,1),cex=.75)
What I am trying to do is to make the 2nd line have beta (the symbol) instead of slope, and the ± symbol to appear. If I use expression, I can get the beta, but not the ±; if I just paste in ß (or something similar), it won't run.
UPDATE #2: It appears I HAVE to use bquote()...else the beta character won't print when piped out via pdf().
An answer to this question suggests using paste with bquote. You could then use the Unicode character of ±:
x <- 232323
plot(1:10, main = bquote(paste(ARL[1], " curve for ", S^2, "; x=\U00B1",.(x))))
Note that this example (minus the inclusion of \U00B1) came from fabian's answer to the previously linked question.
I appreciate the advice given, but it didn't fully accomplish my goal. Here is the workaround I came up with (and I personally think it is just short of asinine...but I'm at a loss).
c <- "name"
p <- .004
n <- 969
b <- 1.23
s <- 0.45
## draw empty plot
plot(NULL,xlim=c(0,10),ylim=c(0,10),xlab=NA,ylab=NA,xaxs="i",yaxs="i")
## place the "poor man's substitute"
tmp.txt <- paste(c(c," (n=",n,")\nslope = ",b,"±",s,"\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,9.5,labels=tmp.txt,adj=c(1,1),cex=.75)
## place the next best option
tmp.txt <- paste(c(c," (n=",n,")\n\U03B2 = ",b,"±",s,"\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,7.5,labels=tmp.txt,adj=c(1,1),cex=.75)
## place the two boxes to superimpose the bquote() version
tmp.txt2 <- paste(c(c," (n=",n,")\n\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,5.5,labels=tmp.txt2,adj=c(1,0.5),cex=.75)
text(9.5,5.5,labels=bquote(beta == .(b)%+-%.(s)),adj=c(1,0.5),cex=.75)
## same as above, but piped to a *.pdf
pdf("tmp_output.pdf")
plot(NULL,xlim=c(0,10),ylim=c(0,10),xlab=NA,ylab=NA,xaxs="i",yaxs="i")
tmp.txt <- paste(c(c," (n=",n,")\nslope = ",b,"±",s,"\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,9.5,labels=tmp.txt,adj=c(1,1),cex=.75)
tmp.txt <- paste(c(c," (n=",n,")\n\U03B2 = ",b,"±",s,"\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,7.5,labels=tmp.txt,adj=c(1,1),cex=.75)
tmp.txt2 <- paste(c(c," (n=",n,")\n\n",ifelse(p==0,"p<.001",paste0("p=",p))),collapse="")
text(9.5,5.5,labels=tmp.txt2,adj=c(1,0.5),cex=.75)
text(9.5,5.5,labels=bquote(beta == .(b)%+-%.(s)),adj=c(1,0.5),cex=.75)
dev.off()
If you run this, it appears to work both inside of R and in the resulting *.pdf file.
As always, a more elegant (and sensible) solution would be much appreciated.
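One possibly tidier variant of the workaround above, as a sketch rather than a definitive fix: draw each line of the label with its own text() call, so only the middle line needs plotmath (beta and %+-%) and no text has to be superimposed. The 0.5 vertical spacing and the temporary output file are assumptions chosen for illustration.

```r
nm <- "name"
p <- 0.004
n <- 969
b <- 1.23
s <- 0.45

out_file <- tempfile(fileext = ".pdf")   # assumption: any writable path works
pdf(out_file)
plot.new()
plot.window(xlim = c(0, 10), ylim = c(0, 10))

p_txt <- if (p == 0) "p<.001" else paste0("p=", p)
mid_line <- bquote(beta == .(b) %+-% .(s))   # plotmath: beta symbol and plus-minus

# One text() call per line; y positions are spaced by hand.
text(9.5, 9.5, labels = paste0(nm, " (n=", n, ")"), adj = c(1, 1), cex = 0.75)
text(9.5, 9.0, labels = mid_line, adj = c(1, 1), cex = 0.75)
text(9.5, 8.5, labels = p_txt, adj = c(1, 1), cex = 0.75)
invisible(dev.off())
```

Since the symbols come from plotmath rather than from Unicode characters in a string, this avoids depending on how the pdf() device encodes ± and β.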

How to read unquoted extra \r with data.table::fread

The data I have to process has unquoted text with some additional \r characters. The files are big (500MB) and numerous (>600), and changing the export is not an option. The data might look like
A,B,C
blah,a,1
bloo,a\r,b
blee,c,d
How can this be handled with data.table's fread?
Is there a better R read CSV function for this, that's similarly performant?
Repro
library(data.table)
csv<-"A,B,C\r\n
blah,a,1\r\n
bloo,a\r,b\r\n
blee,c,d\r\n"
fread(csv)
Error in fread(csv) :
Expected sep (',') but new line, EOF (or other non printing character) ends field 1 when detecting types from point 0:
bloo,a
Advanced repro
The simple repro might be too trivial to give a sense of scale...
samplerecs<-c("blah,a,1","bloo,a\r,b","blee,c,d")
randomcsv<-paste0(c("A,B,C",rep(samplerecs,2000000)))
write(randomcsv,file = "sample.csv")
# Naive approach
fread("sample.csv")
# Akrun's approach with needing text read first
fread(gsub("\r\n|\r", "", paste0(randomcsv,collapse="\r\n")))
#>Error in file.info(input) : file name conversion problem -- name too long?
# Julia's approach with needing text read first
readr::read_csv(gsub("\r\n|\r", "", paste0(randomcsv,collapse="\r\n")))
#> Error: C stack usage 48029706 is too close to the limit
Further to @dirk-eddelbuettel and @nrussell's suggestions, one way of solving this is to pre-process the file. The processor could also be called within fread(), but here it is performed in separate steps:
samplerecs<-c("blah,a,1","bloo,a\r,b","blee,c,d")
randomcsv<-paste0(c("A,B,C",rep(samplerecs,2000000)))
write(randomcsv,file = "sample.csv")
# Remove errant `\r`'s with tr - shown here is the Windows R solution
shell("C:/Rtools/bin/tr.exe -d '\\r' < sample.csv > sampleNEW.csv")
fread("sampleNEW.csv")
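If tr is not available, the same byte-level clean-up can be sketched in base R. The helper name strip_cr is hypothetical, and the whole file is read into memory, so for 500MB inputs this assumes there is enough RAM. The check at the end uses read.csv() only to keep the example dependency-free; fread(out_path) would be the call to use in practice.

```r
# Hypothetical helper: copy a file while dropping every carriage return (0x0d),
# which is what `tr -d '\r'` does.
strip_cr <- function(in_path, out_path) {
  bytes <- readBin(in_path, what = "raw", n = file.size(in_path))
  writeBin(bytes[bytes != as.raw(0x0d)], out_path)
  invisible(out_path)
}

# Small reproduction of the problem file, written byte-for-byte.
in_path <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")
writeBin(charToRaw("A,B,C\r\nblah,a,1\r\nbloo,a\r,b\r\nblee,c,d\r\n"), in_path)

strip_cr(in_path, out_path)
cleaned <- read.csv(out_path, stringsAsFactors = FALSE)
```

Like the tr approach, this also turns the \r\n line endings into plain \n, which fread handles without trouble.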
We can try with gsub
fread(gsub("\r\n|\r", "", csv))
# A B C
#1: blah a 1
#2: bloo a b
#3: blee c d
You can also do this with tidyverse packages, if you'd like.
> library(readr)
> library(stringr)
> read_csv(str_replace_all(csv, "\r", ""))
# A tibble: 3 × 3
A B C
<chr> <chr> <chr>
1 blah a 1
2 bloo a b
3 blee c d
If you do want to do it purely in R, you could try working with connections. As long as a connection is kept open, it will start reading/writing from its previous position. Of course, this means the burden of opening and closing connections falls on you.
In the following code, the file is processed by chunks:
library(data.table)
input_csv <- "sample.csv"
in_conn <- file(input_csv)
output_csv <- "out.csv"
out_conn <- file(output_csv, "w+")
open(in_conn)
chunk_size <- 1E6
return_pattern <- "(?<=^|,|\n)([^,]*(?<!\n)\r(?!\n)[^,]*)(?=,|\n|$)"
buffer <- ""
repeat {
new_chars <- readChar(in_conn, chunk_size)
buffer <- paste0(buffer, new_chars)
while (grepl("[\r\n]$", buffer, perl = TRUE)) {
next_char <- readChar(in_conn, 1)
buffer <- paste0(buffer, next_char)
if (!length(next_char))
break
}
chunk <- gsub("(.*)[,\n][^,\n]*$", "\\1", buffer, perl = TRUE)
buffer <- substr(buffer, nchar(chunk) + 1, nchar(buffer))
cleaned <- gsub(return_pattern, '"\\1"', chunk, perl = TRUE)
writeChar(cleaned, out_conn, eos = NULL)
if (!length(new_chars))
break
}
writeChar('\n', out_conn, eos = NULL)
close(in_conn)
close(out_conn)
result <- fread(output_csv)
Process:
If a chunk ends with a \r or \n, another character is added until it doesn't.
Quotes are put around values containing a \r which isn't adjacent to a \n.
The cleaned chunk is added to the end of another file.
Rinse and repeat.
This code simplifies the problem by assuming no quoting is done for any field in sample.csv. It's not especially fast, but not terribly slow. Larger values for chunk_size should reduce the amount of time spent in I/O operations. If used for anything beyond this toy example, I'd strongly suggest wrapping it in a tryCatch(...) call to make sure the files are closed afterwards.

How to convert code to more readable form in R

I copy code from the terminal to post here. It is in following form:
> ddf2 = ddf[ddf$stone_ny>'stone',] # this is first command
> ddf2[!duplicated(ddf2$deltnr),] # second command
deltnr us stone_ny stone_mobility
4 1536 63 stone mobile
10 1336 62 stone mobile
First 2 lines are commands while next 3 lines are output. However, this cannot be copied from here back to R terminal since the commands start with '> '. How can I convert this to:
ddf2 = ddf[ddf$stone_ny>'stone',] # this is first command
ddf2[!duplicated(ddf2$deltnr),] # second command
# deltnr us stone_ny stone_mobility
#4 1536 63 stone mobile
#10 1336 62 stone mobile
so that it becomes suitable for copying from here?
I tried:
text
[1] "> ddf2 = ddf[ddf$stone_ny>'stone',] # this is first command\n> ddf2[!duplicated(ddf2$deltnr),] # second command\n deltnr us stone_ny stone_mobility \n4 1536 63 stone mobile \n10 1336 62 stone mobile "
text2 = gsub('\n','#',text)
text2 = gsub('#>','\n',text2)
text2 = gsub('#','\n#',text2)
text2
[1] "> ddf2 = ddf[ddf$stone_ny>'stone',] \n# this is first command\n
ddf2[!duplicated(ddf2$deltnr),] \n# second command\n# deltnr us stone_ny stone_mobility \n#4 1536 63 stone mobile \n#10 1336 62 stone mobile "
But it cannot get pasted to the terminal.
I've been waiting for an opportunity to share this function I keep in my .Rprofile file. While it may not answer exactly your question, I feel it is accomplishing something very close to what you are after. So you might get some ideas by looking at its code. And others might find it useful just as it is. The function:
SO <- function(script.file = '~/.active-rstudio-document') {
# run the code and store the output in a character vector
tmp <- tempfile()
capture.output(
source(script.file, echo = TRUE,
prompt.echo = "> ",
continue.echo = "+ "), file = tmp)
out <- readLines(tmp)
# identify lines that are comments, code, results
idx.comments <- grep("^> [#]{2}", out)
idx.code <- grep("^[>+] ", out)
idx.blank <- grep("^[[:space:]]*$", out)
idx.results <- setdiff(seq_along(out),
c(idx.comments, idx.code, idx.blank))
# reformat
out[idx.comments] <- sub("^> [#]{2} ", "", out[idx.comments])
out[idx.code] <- sub("^[>+] ", " ", out[idx.code])
out[idx.results] <- sub("^", " # ", out[idx.results])
# output
cat(out, sep = "\n", file = stdout())
}
This SO function is what allows me to quickly format my answers to questions on this very website, StackOverflow. My workflow is as follows:
1) In RStudio, write my answer in an untitled script (that's the top-left quadrant). For example:
## This is super easy, you can do
set.seed(123)
# initialize x
x <- 0
while(x < 0.5) {
print(x)
# update x
x <- runif(1)
}
## And voila.
2) Near the top, click the "Source" button. It will execute the code in the console which is not really what we are after: rather, it will have the side effect of saving the code to the default file '~/.active-rstudio-document'.
3) Run SO() from the console (bottom-left quadrant) which will source the code (again...) from the saved file, capture the output and print it in a SO-friendly format:
This is super easy, you can do
set.seed(123)
# initialize x
x <- 0
while(x < 0.5) {
print(x)
# update x
x <- runif(1)
}
# [1] 0
# [1] 0.2875775
And voila.
4) Copy-paste into stackoverflow and done.
Note: For code that takes a while to run, you can avoid running it twice by saving your script to a file (e.g. 'xyz.R') instead of clicking the "Source" button. Then run SO("xyz.R").
You could try cat with an ifelse condition.
cat(ifelse(substr(s <- strsplit(text, "\n")[[1]], 1, 1) %in% c("_", 0:9, " "),
paste0("# ", s),
gsub("[>] ", "", s)),
sep = "\n")
which results in
ddf2 = ddf[ddf$stone_ny>'stone',] # this is first command
ddf2[!duplicated(ddf2$deltnr),] # second command
# deltnr us stone_ny stone_mobility
# 4 1536 63 stone mobile
# 10 1336 62 stone mobile
The "_" and 0:9 are in there because one of the rules in R is that a name cannot begin with a _ or a digit, so a line starting with one of them is treated as output rather than code. You can adjust it to fit your needs.
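An alternative sketch keys on the prompt instead of the first character of the output: lines that start with "> " or "+ " are treated as code, everything else as output. The function name console_to_script is hypothetical, and the approach assumes the default prompts; an output line that itself begins with "+ " or "> " would be misclassified.

```r
# Hypothetical helper: strip prompts from code lines, comment out output lines.
console_to_script <- function(txt) {
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  is_code <- grepl("^[>+] ", lines)
  out <- ifelse(is_code,
                sub("^[>+] ", "", lines),   # code: drop the prompt
                paste0("# ", lines))        # output: prefix with a comment marker
  paste(out, collapse = "\n")
}

txt <- "> x <- 1:3\n> x\n[1] 1 2 3"
script <- console_to_script(txt)
cat(script, "\n")
```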

change data frame so thousands are separated by dots

At the moment, I'm working with RMarkdown and Pandoc. My data.frames in R look like this:
3.538e+01 3.542e+01 3.540e+01
9.583e+00 9.406e+00 9.494e+00
2.601e+05 2.712e+05 5.313e+05
After I ran pandoc, the result looks like this:
35.380 35.420 35.400
9.583 9.406 9.494
260116.000 271217.000 531333.000
What it should look like is:
35,380 35,420 35,400
9,583 9,406 9,494
260.116 271.217 531.333
So I want commas instead of dots as the decimal mark, and no decimal part after whole numbers such as 260116. Dots to separate the thousands would be nice. Is there a way to change the appearance directly in R, or do I have to set options in knitr/markdown?
Thanks
Here's an example of some of the conversions that can be done with format():
x <- c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300)
format(x, decimal.mark=",", big.mark=".", scientific=FALSE)
# [1] " 35,380" " 35,420" " 35,400" " 9,583" " 9,406"
# [6] " 9,494" "260.100,000" "271.200,000" "531.300,000"
There are several other options, such as trim, justify, and so on that might be of interest in getting your output ready for pandoc.
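Because format() pads a whole vector to a common number of decimals, whole numbers end up with ",000" appended. A minimal sketch of one way around that is to format each value on its own; note that this also drops trailing zeros, so 35.38 is shown with two decimals rather than three.

```r
x <- c(35.38, 9.583, 260116)

# Format element by element so each value keeps only its own significant digits.
fmt <- vapply(x, format, character(1),
              decimal.mark = ",", big.mark = ".", scientific = FALSE,
              USE.NAMES = FALSE)
fmt
# [1] "35,38"   "9,583"   "260.116"
```

The result is a character vector, so it is meant for display only, not for further arithmetic.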
As this question was really inspiring, I recently introduced that big.mark feature in my pander package, that can return markdown formatted tables from R objects with predefined options -- building on format by the way. Small demo:
Load the package (installed from GH until this feature gets to CRAN):
> library(pander)
Create a demo data.frame:
> x <- matrix(c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300), 3, byrow = TRUE)
Set your default options: (values for US context may need to be switched)
> panderOptions('decimal.mark', ',')
> panderOptions('big.mark', '.')
Let pander do the rest:
> pander(x)
------- ------- -------
35,38 35,42 35,4
9,583 9,406 9,494
260.100 271.200 531.300
------- ------- -------
You can find and use even more options there (like the markdown syntax for the table).
