Preserving long comments in console output. Not falling victim to ".... [TRUNCATED]" - r

I am trying to run a script that has lots of comments to explain each table, statistical test and graph. I am using RStudio IDE as follows
source(filename, echo=T)
That ensures that the script outputs everything to the console. If I run the following sequence it will send all the output to a txt file and then switch off the output diversion
sink("filenameIwantforoutput.txt")
source(filename, echo=T)
sink()
Alas, I am finding that a lot of my comments are not being outputted. Instead I get
"...but only if we had had an exclusively b .... [TRUNCATED]".
Once before I learned where to preserve the output but that was a few months ago and now I cannot remember. Can you?

Set the max.deparse.length= argument to source. You probably need something greater than the default of 150. For example:
source(filename, echo=TRUE, max.deparse.length=1e3)
And note the last paragraph in the Details section of ?source reads:
If ‘echo’ is true and a deparsed expression exceeds ‘max.deparse.length’, that many characters are output followed by ‘ .... [TRUNCATED] ’.
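A quick way to see the effect, as a minimal sketch: the temporary script and its contents below are made up for illustration, and keep.source = TRUE is passed explicitly because comments are only echoed when source references are kept (the default in interactive sessions, but not under Rscript).

```r
# Hypothetical script: one long comment followed by a single expression.
script <- tempfile(fileext = ".R")
writeLines(c(
  paste("#", strrep("a long explanatory comment ", 10)),
  "x <- 1"
), script)

# Default max.deparse.length (150): the echoed text is cut off.
out_default <- capture.output(
  source(script, echo = TRUE, keep.source = TRUE)
)

# No limit: the whole comment is echoed.
out_long <- capture.output(
  source(script, echo = TRUE, keep.source = TRUE, max.deparse.length = Inf)
)

# Check each captured output for the truncation marker.
any(grepl("TRUNCATED", out_default, fixed = TRUE))
any(grepl("TRUNCATED", out_long, fixed = TRUE))
```

The same max.deparse.length argument applies when the call is wrapped in sink(), as in the question.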

You can make this behavior the default by overriding the source() function in your .Rprofile.
This seems like a reasonable case for overriding a function because, in theory, the change should only affect what is printed to the screen. One could contrive an example where this is not the case, e.g., capturing the screen output and using it as a variable, as in capture.output(source("somefile.R")), but that seems unlikely. By contrast, changing a function in a way that alters its return value (e.g., changing the default of a function's na.rm argument) will likely come back to bite you or whoever you share your code with.
.source_modified <- source
formals(.source_modified)$max.deparse.length <- Inf
# Use 'nms' because we don't want to do it for all because some were already
# unlocked. Thus if we lock them again, then we are changing the previous
# state.
# (https://stackoverflow.com/a/62563023/1376404)
rlang::env_binding_unlock(env = baseenv(), nms = "source")
assign(x = "source", value = .source_modified, envir = baseenv())
rlang::env_binding_lock(env = baseenv(), nms = "source")
rm(.source_modified)
An alternative is to create your own 'alias'-like function. I use the following in my .Rprofile:
s <- source
formals(s)$max.deparse.length <- Inf
formals(s)$echo <- TRUE

Related

Printing check mark is triggering warning when running R cmd checks

I'm writing a simple function that reads multiple files from the user and then prints to the terminal (with print) each file name with a yes or no next to it, depending on whether the file was valid or invalid. I would like to use the x/✓ symbols.
The problem, however, is that the check symbol is not ASCII, so R CMD check throws a warning. I tried several approaches (charToRaw, intToUtf8, symbol("\326")), yet none works with a simple print in the terminal.
As an example:
Df <- data.frame(file = myfiles, status = "x")
Df$status[1] <- "✓"
print(Df)
Any idea? Thanks
The cli package offers a way to print these with cli_alert_success and cli_alert_danger. Presumably, you have some more complicated check of whether each file was valid. Store that result as a boolean instead of the explicit character.
Df <- data.frame(file = myfiles, isValid = myValidityCheckFx(myfiles))
purrr::walk2(
Df$file, Df$isValid,
~if(.y) cli::cli_alert_success(.x) else cli::cli_alert_danger(.x)
)
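If the warning comes from the non-ASCII character sitting in the package source itself, another route is to write the symbol as a \u escape so that the file stays pure ASCII. A minimal sketch, with made-up file names and a placeholder validity vector standing in for the real check:

```r
# Hypothetical inputs standing in for the real files and validity check.
myfiles <- c("a.csv", "b.csv", "c.txt")
is_valid <- c(TRUE, TRUE, FALSE)

# "\u2713" is the check mark; the source file contains only ASCII characters.
Df <- data.frame(
  file = myfiles,
  status = ifelse(is_valid, "\u2713", "x"),
  stringsAsFactors = FALSE
)
print(Df)
```

How the check mark renders still depends on the locale and font of the terminal.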

suppress line/index numbers in R output

Can I systematically suppress the index of the first element in the line of the output in R's output in the console?
I am looking for an option to prettify the output, without having to type anything extra. I imagine that if such a feat is possible, it would be set up as an option in the .Renviron file (or similar). An RStudio-specific answer would be acceptable. Apologies if I have overlooked something obvious in the settings (I would have expected that option to be in Preferences --> Code --> Display).
Currently the R console and RStudio consoles display:
1+1
[1] 2
I would like to see:
1+1
2
I know I can get the above with cat(1+1), but what I'm looking for is a systematic change in the display style. Something like the typical Python output (open a terminal, type Python followed by 1+1. I want that)
Edit: Another example. In RStudio, if I define x=1:5, it appears as int [1:5] 1 2 3 4 5 in the environment: that's informative and I don't mind it. But in the R console, it looks like [1] 1 2 3 4 5, which I do not find informative, especially when there are multiple lines.
I have personally got used to these numbers, as I imagine everyone has, but that doesn't make them right: (1) they serve no purpose: if you widen the console, the lines get wider and the line numbers change (if they marked the 80-character width, ok, maybe they would serve a purpose), (2) when I copy-paste output into lecture notes, these line numbers interfere with clarity and confuse the novice.
I have not found an answer to this question, which is surprising, so please let me know if I have missed it. The following question is related but not a duplicate
https://stackoverflow.com/questions/3271939. Is there a duplicate I have missed?
Edit As pointed out by Adiel Loinger in the comments section, these are not "line numbers", as I had called them, but "the index of the first element of the line being printed in the console". Thanks for the correction. I have tried to edit my question accordingly.
I believe the only way to do that is to modify the sources. R is open source, so that's not impossible, but it's not easy.
It's easier to change the print format for particular classes of objects. For example, if you don't like the way lm objects print, you can create your own print.lm method to do it yourself:
print.lm <- function (x, ...)
{
cat("My new version!")
}
Then
> lm(rnorm(10) ~ I(1:10))
My new version!
This doesn't work for things like 1+1, because for efficiency reasons, R always uses the internal version of the print method for auto-printing.
By the way, the printed indices do serve a purpose: if you print a long vector and wonder what the index is for some particular element, you only need to count from the start of the line, not from the start of the vector, to find it.
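For the occasional value that has to be pasted into notes without the index, a small wrapper around cat() is one workaround. This is only a sketch: show_plain is a made-up name, and it has to be called explicitly, so it does not change auto-printing.

```r
# Hypothetical helper: print an atomic vector without the leading [1] index.
show_plain <- function(x) {
  # format() converts the values to text; fill = TRUE wraps at the console
  # width and ends the output with a newline.
  cat(format(x), fill = TRUE)
  invisible(x)
}

show_plain(1 + 1)
show_plain(1:5)
```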
You can work around indexes and row names by converting the answers to data frames. It's not perfect, but not too hard and depending on your application, maybe an improvement. Functions below.
Base function with the slightly annoying index:
my_fun <- function(foo){
paste0("The answer is ", foo, "bar")
}
my_fun("foo")
[1] "The answer is foobar"
Improvement with data frame:
Note: For data frames with multiple rows, instead of just df, use print.data.frame(df, row.names = FALSE)
my_funner <- function(foo){
df <- data.frame("The_answer_is" = paste0(foo, "bar"), row.names = "")
df
}
my_funner("foo")
The_answer_is
foobar
Another option:
my_funnest <- function(foo){
df <- data.frame("Sorry_about" = "The_answer_is", "the_col_names" = paste0(foo, "bar"), row.names = "")
df
}
my_funnest("foo")
Sorry_about the_col_names
The_answer_is foobar
But those gaps are annoying, so one more option:
my_most_funnest <- function(foo){
df <- data.frame("Sorry_about_the_col_names" = paste0("The answer is ", foo, "bar"), row.names = "")
df
}
my_most_funnest("foo")
Sorry_about_the_col_names
The answer is foobar

R program does not output

I'm new to R and programming and taking a Coursera course. I've asked in their forums, but nobody can seem to provide an answer in the forums. To be clear, I'm trying to determine why this does not output.
When I first wrote the program, I was getting accurate outputs, but after I tried to upload, something went wonky. Rather than producing any output with [1], [2], etc. when I run the program from RStudio, I only get the blue +++, but no errors, and anything I change still does not produce an output.
I tried with a previous version of R, and reinstalled the most recent version 3.2.1 for Windows.
What I've done:
Set the correct working directory through RStudio
pol <- function(directory, pol, id = 1:332) {
files <- list.files("specdata", full.names = TRUE);
data <- data.frame();
for (i in ID) {
data <- rbind(data, read.csv(files_list[i]))
}
subset <- subset(data, ID %in% id);
polmean <- mean(subset[pol], na.rm = TRUE);
polmean("specdata", "sulfate", 1:10)
polmean("specdata", "nitrate", 70:72)
polmean("specdata", "nitrate", 23)
}
Can someone please provide some direction - debug help?
when I adjust the code the following errors tend to appear:
ID not found
Missing or unexpected } (although I've matched them all).
The updated code is as follows, if I'm understanding:
data <- data.frame();
files <- files[grepl(".csv",files)]
pollutantmean <- function(directory, pollutant, id = 1:332) {
pollutantmean <- mean(subset1[[pollutant]], na.rm = TRUE);
}
Looks like you haven't declared what ID is (I assume: a vector of numbers)?
Also, using 'subset' as a variable name while it's also a function, and pol as both a function name and the name of one of the arguments of that same function is just asking for trouble...
And I think there is a missing ")" in your for-loop.
EDIT
So the way I understand it now, you want to do a couple of things.
1. Read in a bunch of files, which you'll use multiple times without changing them.
2. Get some mean value out of those files, under different conditions.
Here's how I would do it.
Since you only want to read in the data once, you don't really need a function to do this (you can have one, but I think it's overkill for now). You correctly have code that makes a vector with the file names and then loops over them, rbinding them to each other. The problem is that this can become very slow. Make sure your directory only contains files that you want to read in, so no R scripts or other stuff. A way (not 100% foolproof) to do this is using files <- files[grepl(".csv",files)], which makes sure you only have the csv's (grepl checks whether a certain pattern matches inside each string and returns a boolean; the [] then only keeps the elements for which TRUE was returned).
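The reading step described above can be sketched as follows. The directory and its contents are created on the fly purely for illustration, and the column names (ID, sulfate) are assumptions. Anchoring the pattern as "\\.csv$" avoids matching names that merely contain "csv", and collecting the pieces in a list before a single rbind avoids growing the data frame inside a loop.

```r
# Throwaway directory with two small CSV files and one unrelated script.
dir <- file.path(tempdir(), "specdata_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(ID = 1, sulfate = c(1, NA, 3)),
          file.path(dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(ID = 2, sulfate = c(4, 5, NA)),
          file.path(dir, "002.csv"), row.names = FALSE)
writeLines("# not data", file.path(dir, "notes.R"))

# Keep only names that end in ".csv".
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list, then bind once.
df <- do.call(rbind, lapply(files, read.csv))
```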
Next, there is 'a thing you want to do multiple times', namely getting out mean values. This is where you'd use a function. Apparently you want to get the mean for different types of pollution, and you want this in restricted IDs.
Let's assume that 1. has given you a dataframe df with a column named Type for the type of pollution and a column called Id that somehow represents a sort of ID (substitute with the actual names in your script - if you don't have a column for ID, I'll edit the answer later on). Now you want a function
polmean <- function(type, id) {
# some code that returns the mean of a restricted version of df
}
This is all you need. You write the code that generates df, you then write a function that will get you what you want from that dataframe, and then you call it for the circumstances you want to use it in (the three polmean calls at the end of your original code, but now without the first argument as you no longer need this).
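As a minimal sketch of such a function, assuming df has the Id and Type columns described above plus a numeric column holding the measurements (called Value here, which is an assumption; the data are made up):

```r
# Made-up data in the assumed layout: one row per measurement.
df <- data.frame(
  Id    = c(1, 1, 2, 2, 3, 3),
  Type  = c("sulfate", "nitrate", "sulfate", "nitrate", "sulfate", "nitrate"),
  Value = c(2, 10, 4, NA, 6, 30)
)

# Mean of Value for one pollution type, restricted to the given IDs.
polmean <- function(type, id) {
  keep <- df$Type == type & df$Id %in% id
  mean(df$Value[keep], na.rm = TRUE)
}

polmean("sulfate", 1:2)
polmean("nitrate", 1:3)
```

Because polmean reads df from the enclosing environment, the data are loaded once and the function can be called as often as needed.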
Ok - I finally solved this. Thanks for the help.
I didn't need to call "specdata" in line 2. the directory in line 1 referred to the correct directory.
My for/in statement needed to refer to the id in the first line, not the ID in the dataset. The for/in statement doesn't appear to need to be indented (but it looks cleaner)
I did not need a subset
The last 3 lines for pollutantmean did not need to be a part of the program. These are used in the R console to call the results one by one.

R - Handling cut intervals of form [x,y] in tables when converting to LaTeX

I'm working on a document in R, with knitr to pdflatex and am using the extended version of toLatex from memisc.
When I'm producing a table with cut intervals however, the square brackets are not sanitised and the pdflatex job errors because of the existence of [.
I tried putting sanitize=TRUE in the knitr chunk code, but this only works for tikz.
Previously, I have used gsub and replaced the string in the R object itself which is rather inelegant. I'm hoping someone could point me in the direction of a nuance of memisc or knitr that I'm missing or another function/method that would easily handle latex special characters.
Example
library("memisc")
library("Hmisc")
example<-data.frame(cbind(x=1:100,y=1:100))
example$x<-cut2(example$x,m=20)
toLatex(example)
UPDATE
Searching SO I found a post about applying latexTranslate with apply function, but this requires characters so I would have to unclass from factor to character.
I found another SO post that identifies the knitr:::escape_latex function however, the chunk then outputs the stuff as markup instead of translating it (using results='asis') or produces an R style table inside a code block (using results='markup'). I tried configuring it as a hook function in my parent document and it had the effect of outputting all the document contents as markup. This is a brand new area for me so I probably implemented it incorrectly.
<<setup,include=FALSE>>=
hook_inline = knit_hooks$get('inline')
knit_hooks$set(inline = function(x) {
if (is.character(x)) x = knitr:::escape_latex(x)
hook_inline(x)
})
@
...
<<tab-example,echo=FALSE,cache=TRUE,results='asis',sanitize=TRUE,inline=TRUE>>=
library("Hmisc")
library("memisc")
example<-data.frame(cbind(x=1:100,y=1:100))
example$x<-cut2(example$x,m=20)
toLatex(example)
@
According to @yihui this is the wrong way to go
UPDATE 2
I have created a gsub wrapper which will escape percentages etc, however the [ symbol still pushes latex into maths mode and errors.
Courtesy of folks on the tex SE, a [ directly after a line break(\\) is considered an entry into math-mode. It is very simple to prevent this behaviour by adding {} into the output just before a [. My function looks like:
escapedLatex<-function (df = NULL)
{
require("memisc")
gsub(gsub(x = toLatex(df, show.xvar = TRUE), pattern = "%",
replacement = "\\%", fixed = TRUE), pattern = "[", replacement = "{}[",
fixed = TRUE)
}
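To see what the bracket replacement does without running toLatex, the same gsub call can be tried on a plain character vector. The input lines below are hand-written stand-ins for LaTeX table rows, not actual memisc output:

```r
# Hand-written rows resembling a LaTeX table body with cut-style intervals.
rows <- c("[ 1, 21) & 1 \\\\", "[21, 41) & 21 \\\\")

# Prefix every "[" with an empty group so that a "[" starting a row is not
# attached to the "\\" that ends the previous row.
escaped <- gsub("[", "{}[", rows, fixed = TRUE)
writeLines(escaped)
```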
I'd be very happy to see any alternative, more elegant solutions around and will leave it open for a few days.

Is it possible to export from reporttools?

I am using tableNominal{reporttools} to produce frequency tables. The way I understand it, tableNominal() produces LaTeX code which has to be copied and pasted into a text file and then saved as .tex. But is it possible to simply export the table produced, as can be done with print(xtable(table), file="path/outfile.tex")?
You may be able to use either latex or latexTranslate from the "Hmisc" package for this purpose. If you have the necessary program infrastructure the output gets sent to your TeX engine. (You may be able to improve the level of our answers by adding specific examples.)
Looks like that function does not return a character vector, so you need to use a strategy to capture the output from cat(). Using the example in the help page:
capture.output( TN <- tableNominal(vars = vars, weights = weights, group = group,
cap = "Table of nominal variables.", lab = "tab: nominal") ,
file="outfile.tex")
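The same pattern works for any function that writes with cat() instead of returning a value. A minimal sketch with a stand-in function (print_table is made up and is not part of reporttools):

```r
# Stand-in for a function that only cat()s LaTeX code.
print_table <- function() {
  cat("\\begin{tabular}{ll}\n")
  cat("a & b \\\\\n")
  cat("\\end{tabular}\n")
  invisible(NULL)
}

# Divert everything the call prints into a .tex file.
outfile <- tempfile(fileext = ".tex")
capture.output(print_table(), file = outfile)

# The file now holds the printed lines.
tex_lines <- readLines(outfile)
```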

Resources