Is there a way to page break the output in console [duplicate] - r

Is there an equivalent to the unix less command that can be used within the R console?

There is also page() which displays a representation of an object in a pager, like less.
dat <- data.frame(matrix(rnorm(1000), ncol = 10))
page(dat, method = "print")

Not really. There are the commands
head() and tail() for showing the beginning and end of objects
print() for explicitly showing an object, and just its name followed by return does the same
summary() for concise summary that depends on the object
str() for its structure
and more. An equivalent for less would be a little orthogonal to the language and system. Where the Unix shell offers you less to view the content of a file (which is presumed to be ascii-encoded), it cannot know about all types.
R is different in that it knows about the object types which is why summary() -- as well as the whole modeling framework -- are more appropriate.
Follow-up edit: Another possibility is provided by edit() as well as edit.data.frame().

I save the print output to a file and then read it using an editor or less.
Type the following in R
sink("Routput.txt")
print(varname)
sink()
Then in a shell:
less Routput.txt

If the file is already on disk, then you can use file.show
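The sink() and file.show() steps can also be combined inside R, so the shell step is not needed. A minimal sketch, assuming a hypothetical helper called page_output (the name and the temporary-file default are my assumptions); capture.output() and file.show() are standard R functions, and the pager is only opened in an interactive session:

```r
# Hypothetical helper: write the printed form of an object to a file
# and show that file in R's pager. The name page_output is an assumption.
page_output <- function(x, file = tempfile(fileext = ".txt")) {
  writeLines(capture.output(print(x)), file)
  # Only open the pager when someone is there to read it.
  if (interactive()) file.show(file)
  invisible(file)
}

dat <- data.frame(matrix(rnorm(1000), ncol = 10))
out_file <- page_output(dat)
```

The file path is returned invisibly, so the same output can still be opened later with less or an editor.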

You might like my little toy here:
short <- function(x=seq(1,20), numel=4, skipel=0, ynam=deparse(substitute(x))) {
  ynam <- as.character(ynam)
  # clean up spaces
  ynam <- gsub(" ", "", ynam)
  # unlist goes by columns, so transpose to get what's expected
  if (is.list(x)) x <- unlist(t(x))
  if (2*numel >= length(x)) {
    # short enough to print in full
    print(x)
  } else {
    first <- 1 + skipel
    last <- numel + skipel
    cat(paste(ynam, '[', first, '] thru ', ynam, '[', last, ']\n', sep=""))
    print(x[first:last])
    cat(' ... \n')
    cat(paste(ynam, '[', length(x)-numel-skipel+1, '] thru ', ynam, '[', length(x)-skipel, ']\n', sep=""))
    print(x[(length(x)-numel-skipel+1):(length(x)-skipel)])
  }
}
blahblah copyright by me, not Disney blahblah free for use, reuse, editing, sprinkling on your Wheaties, etc.

Related

suppress line/index numbers in R output

Can I systematically suppress the index of the first element in the line of the output in R's output in the console?
I am looking for an option to prettify the output, without having to type anything extra. I imagine that if such a feat is possible, it would be set up as an option in the .Renviron file (or similar). An RStudio-specific answer would be acceptable. Apologies if I have overlooked something obvious in the settings (I would have expected that option to be in Preferences --> Code --> Display).
Currently the R console and RStudio consoles display:
1+1
[1] 2
I would like to see:
1+1
2
I know I can get the above with cat(1+1), but what I'm looking for is a systematic change in the display style. Something like the typical Python output (open a terminal, type Python followed by 1+1. I want that)
Edit: Another example. In RStudio, if I define x=1:5, it appears as int [1:5] 1 2 3 4 5 in the environment: that's informative and I don't mind it. But in the R console, it looks like [1] 1 2 3 4 5, which I do not find informative, especially when there are multiple lines.
I have personally got used to these numbers, as I imagine everyone has, but that doesn't make them right: (1) they serve no purpose: if you widen the console, the lines get wider and the line numbers change (if they marked the 80-character width, ok, maybe they would serve a purpose), (2) when I copy-paste output into lecture notes, these line numbers interfere with clarity and confuse the novice.
I have not found an answer to this question, which is surprising, so please let me know if I have missed it. The following question is related but not a duplicate
https://stackoverflow.com/questions/3271939. Is there a duplicate I have missed?
Edit As pointed out by Adiel Loinger in the comments section, these are not "line numbers", as I had called them, but "the index of the first element of the line being printed in the console". Thanks for the correction. I have tried to edit my question accordingly.
I believe the only way to do that is to modify the sources. R is open source, so that's not impossible, but it's not easy.
It's easier to change the print format for particular classes of objects. For example, if you don't like the way lm objects print, you can create your own print.lm method to do it yourself:
print.lm <- function (x, ...)
{
cat("My new version!")
}
Then
> lm(rnorm(10) ~ I(1:10))
My new version!
This doesn't work for things like 1+1, because for efficiency reasons, R always uses the internal version of the print method for auto-printing.
By the way, the printed indices do serve a purpose: if you print a long vector and wonder what the index is for some particular element, you only need to count from the start of the line, not from the start of the vector, to find it.
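For the copy-paste use case in the question, an explicit helper is one workaround. A minimal sketch, assuming a hypothetical helper called show_plain; it does not change auto-printing, so it still has to be typed each time, which is exactly what the asker hoped to avoid:

```r
# Hypothetical helper: print an atomic vector without the leading [1] index.
# fill = TRUE wraps long vectors at the console width and ends with a newline.
show_plain <- function(x) {
  cat(format(x), fill = TRUE)
  invisible(x)
}

show_plain(1 + 1)  # prints: 2
show_plain(1:5)    # prints: 1 2 3 4 5
```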
You can work around indexes and row names by converting the answers to data frames. It's not perfect, but not too hard and depending on your application, maybe an improvement. Functions below.
Base function with the slightly annoying index:
my_fun <- function(foo){
paste0("The answer is ", foo, "bar")
}
my_fun("foo")
[1] "The answer is foobar"
Improvement with data frame:
Note: For data frames with multiple rows, instead of just df, use print.data.frame(df, row.names = FALSE)
my_funner <- function(foo){
df <- data.frame("The_answer_is" = paste0(foo, "bar"), row.names = "")
df
}
my_funner("foo")
The_answer_is
foobar
Another option:
my_funnest <- function(foo){
df <- data.frame("Sorry_about" = "The_answer_is", "the_col_names" = paste0(foo, "bar"), row.names = "")
df
}
my_funnest("foo")
Sorry_about the_col_names
The_answer_is foobar
But those gaps are annoying, so one more option:
my_most_funnest <- function(foo){
df <- data.frame("Sorry_about_the_col_names" = paste0("The answer is ", foo, "bar"), row.names = "")
df
}
my_most_funnest("foo")
Sorry_about_the_col_names
The answer is foobar
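The note about data frames with multiple rows can be illustrated as follows. This is a minimal sketch with a made-up two-row data frame; print() dispatches to print.data.frame here, so row.names = FALSE is passed through:

```r
# Made-up example data: two answers instead of one.
df <- data.frame(The_answer_is = c("foobar", "bazbar"))

# Suppress the row names (1, 2, ...) that would otherwise be printed
# at the start of each row.
print(df, row.names = FALSE)
```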

Is it a bad idea to get name of script using sys.frame(1)$ofile?

I've been searching for a while for a way to get the name of the currently executed script. Most answers I've seen were one of:
Use commandArgs() - but this won't work for me because in RStudio commandArgs() does not return the filepath
Define the name of the script as the top line and then use that in the rest of the script
I saw one mention of sys.frames() and found out that I can use sys.frame(1)$ofile to get the name of the currently executing script. I don't know much about these kinds of functions, so can anyone advise me whether that's a bad idea, or when it can fail me?
Thanks
The problem is that R doesn't really run code as "scripts." When you "source" a file, it's basically like re-typing the contents of the file at the console. The exception is that functions can keep track of where they were sourced from.
So if you had a file like mycode.R that had
fn <- function(x) {
x + 1 # A comment, kept as part of the source
}
and then you can do
source("mycode.R")
getSrcFilename(fn)
# [1] "mycode.R"
so in order to do that you just need to know a name of the function in the file. You could also make a function like this
gethisfilename <- function(z) {
x<-eval(match.call()[[1]])
getSrcFilename(x)
}
Assuming it's also in mycode.R, you can do
source("mycode.R")
gethisfilename()
# [1] "mycode.R"
Actually I think it is a bad idea, as I explained in my comment here: if you place this code in file1.R and then you source("file1.R") from file2.R, this will actually return "file2.R" instead of "file1.R", where it is called from!
So, to overcome this, you need to use sys.frames() and go for this solution: https://stackoverflow.com/a/1816487/684229
this.file.name <- function () # https://stackoverflow.com/a/1816487
{
frame_files <- lapply(sys.frames(), function(x) x$ofile)
frame_files <- Filter(Negate(is.null), frame_files)
frame_files[[length(frame_files)]]
}
Then you can use this.file.name() in any script and it will return the correct answer! It doesn't depend on how deep the "source stack" is, nor on where the this.file.name() function is defined. It returns the source file it is called from.
(And unlike MrFlick's interesting solution, this doesn't need any function to be defined in the file.)
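On the "when can it fail" part of the question: the ofile variable only exists inside a call to source(), so code that is pasted into the console or run directly with Rscript has no such frame. A minimal sketch of a more defensive variant, assuming a hypothetical name current_script and an NA return value as the "not sourced" signal (both are my assumptions, not part of the linked answer):

```r
# Hypothetical variant of this.file.name() that returns NA instead of
# failing when no enclosing source() call is found.
current_script <- function() {
  files <- lapply(sys.frames(), function(f) f$ofile)
  files <- Filter(Negate(is.null), files)
  if (length(files) == 0L) return(NA_character_)
  # The innermost source() call is the last one on the stack.
  files[[length(files)]]
}

# Demonstration with a throwaway script written to a temporary file.
tmp <- tempfile(fileext = ".R")
writeLines("found <- current_script()", tmp)
source(tmp, local = TRUE)
print(basename(found))
```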

How to create a new read.csv in R so it can read .csv file without typing the full name of .csv file

Guys, thanks for reading this. This is my first time writing a program, so pardon me if I ask stupid questions.
I have a bunch of .csv files named like: 001-XXX.csv;002-XXX.csv...150-XXX.csv. Here XXX is a very long name tag. So it's a little annoying that every time I need to type read.csv("001-xxx.csv"). I want to make a function called "newread" that only asks me for the first three digits, the real id number, to read the .csv files. I thought "newread" should be like this:
newread <- function(id){
as.character(id)
a <- paste(id,"-XXX.csv",sep="")
read.csv(a)
}
But R shows Error: unexpected '}' in "}". What's going wrong? It looks logical.
I am running Rstudio on Windows 8.
as.character(id) will not change id into a character string. Change it to:
id = as.character(id)
Edit: According to comments, you should call newread() with a character parameter, because for numeric input there is no difference between newread(001) and newread(1).
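One way around the newread(001) versus newread(1) issue is to zero-pad the id inside the function. A minimal sketch, assuming a hypothetical helper csv_name and an extra dir argument that is not in the question; "XXX" stands in for the long name tag:

```r
# Build the file name from a numeric or character id:
# 1, "1" and "001" all map to "001-XXX.csv".
csv_name <- function(id) sprintf("%03d-XXX.csv", as.integer(id))

# The dir argument is an assumption added for illustration.
newread <- function(id, dir = ".") {
  read.csv(file.path(dir, csv_name(id)))
}

# Demonstration with a throwaway file in a temporary directory.
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
write.csv(data.frame(a = 1:3), file.path(demo_dir, "001-XXX.csv"),
          row.names = FALSE)
d <- newread(1, dir = demo_dir)
```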
This is not specifically an answer to your question (others have covered that), but rather some advice that may be helpful for accomplishing your task in a different way.
First, some of the GUIs for R have file name completion. You can type the first part: read.csv("001- and then hit a key or combination of keys (in the Windows GUI you press TAB) and the rest of the filename will be filled in for you (as long as it is unique).
You can use the file.choose or choose.files functions to open a dialog box to choose your file using the mouse: read.csv(file.choose()).
If you want to read in all the above files then you can do this in one step using lapply and either sprintf or list.files (or others):
mycsvlist <- lapply( 1:150, function(x) read.csv( sprintf("%03d-XXX.csv", x) ) )
or
mycsvlist <- lapply( list.files(pattern="\\.csv$"), read.csv )
You could also use list.files to get a list of all the files matching a pattern and then pass one of the returned values to read.csv:
tmp <- list.files(pattern="001.*csv$")
read.csv(tmp[1])

R - Handling cut intervals of form [x,y] in tables when converting to LaTeX

I'm working on a document in R, with knitr to pdflatex and am using the extended version of toLatex from memisc.
When I'm producing a table with cut intervals however, the square brackets are not sanitised and the pdflatex job errors because of the existence of [.
I tried putting sanitize=TRUE in the knitr chunk code, but this only works for tikz.
Previously, I have used gsub and replaced the string in the R object itself which is rather inelegant. I'm hoping someone could point me in the direction of a nuance of memisc or knitr that I'm missing or another function/method that would easily handle latex special characters.
Example
library("memisc")
library("Hmisc")
example<-data.frame(cbind(x=1:100,y=1:100))
example$x<-cut2(example$x,m=20)
toLatex(example)
UPDATE
Searching SO I found a post about applying latexTranslate with apply function, but this requires characters so I would have to unclass from factor to character.
I found another SO post that identifies the knitr:::escape_latex function however, the chunk then outputs the stuff as markup instead of translating it (using results='asis') or produces an R style table inside a code block (using results='markup'). I tried configuring it as a hook function in my parent document and it had the effect of outputting all the document contents as markup. This is a brand new area for me so I probably implemented it incorrectly.
<<setup,include=FALSE>>=
hook_inline = knit_hooks$get('inline')
knit_hooks$set(inline = function(x) {
if (is.character(x)) x = knitr:::escape_latex(x)
hook_inline(x)
})
@
...
<<tab-example,echo=FALSE,cache=TRUE,results='asis',sanitize=TRUE,inline=TRUE>>=
library("Hmisc")
library("memisc")
example<-data.frame(cbind(x=1:100,y=1:100))
example$x<-cut2(example$x,m=20)
toLatex(example)
@
According to @yihui this is the wrong way to go.
UPDATE 2
I have created a gsub wrapper which will escape percentages etc, however the [ symbol still pushes latex into maths mode and errors.
Courtesy of folks on the tex SE, a [ directly after a line break(\\) is considered an entry into math-mode. It is very simple to prevent this behaviour by adding {} into the output just before a [. My function looks like:
escapedLatex<-function (df = NULL)
{
require("memisc")
gsub(gsub(x = toLatex(df, show.xvar = TRUE), pattern = "%",
replacement = "\\%", fixed = TRUE), pattern = "[", replacement = "{}[",
fixed = TRUE)
}
I'd be very happy to see any alternative, more elegant solutions around and will leave it open for a few days.
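The two substitutions inside escapedLatex can be tried on plain strings, without memisc. A minimal sketch, assuming a hypothetical helper called escape_cells and a made-up table row; with fixed = TRUE both the pattern and the replacement are taken literally:

```r
# Hypothetical helper applying the same two replacements to any
# character vector of LaTeX lines.
escape_cells <- function(lines) {
  # Escape percent signs so LaTeX does not treat them as comments.
  lines <- gsub("%", "\\%", lines, fixed = TRUE)
  # Put {} before each [ so that a [ right after \\ is not parsed
  # as an optional argument.
  gsub("[", "{}[", lines, fixed = TRUE)
}

row <- "[ 1, 21) & 50% \\\\"
cat(escape_cells(row), "\n")
```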

R: Improving workflow and keeping track of output

I have what I think is a common enough issue, on optimising workflow in R. Specifically, how can I avoid the common issue of having a folder full of output (plots, RData files, csv, etc.), without, after some time, having a clue where they came from or how they were produced? In part, it surely involves trying to be intelligent about folder structure. I have been looking around, but I'm unsure of what the best strategy is. So far, I have tackled it in a rather unsophisticated (overkill) way: I created a function metainfo (see below) that writes a text file with metadata, with a given file name. The idea is that if a plot is produced, this command is issued to produce a text file with exactly the same file name as the plot (except, of course, the extension), with information on the system, session, packages loaded, R version, function and file the metadata function was called from, etc. The questions are:
(i) How do people approach this general problem? Are there obvious ways to avoid the issue I mentioned?
(ii) If not, does anyone have any tips on improving this function? At the moment it's perhaps clunky and not ideal. Particularly, getting the file name from which the plot is produced doesn't necessarily work (the solution I use is one provided by @hadley in [1]). Any ideas would be welcome!
The function assumes git, so please ignore the probable warning produced. This is the main function, stored in a file metainfo.R:
MetaInfo <- function(message=NULL, filename)
{
# message - character string - Any message to be written into the information
# file (e.g., data used).
# filename - character string - the name of the txt file (including relative
# path). Should be the same as the output file it describes (RData,
# csv, pdf).
#
if (is.null(filename))
{
stop('Provide an output filename - parameter filename.')
}
filename <- paste(filename, '.txt', sep='')
# Try to get as close as possible to getting the file name from which the
# function is called.
source.file <- lapply(sys.frames(), function(x) x$ofile)
source.file <- Filter(Negate(is.null), source.file)
t.sf <- try(source.file <- basename(source.file[[length(source.file)]]),
silent=TRUE)
if (inherits(t.sf, 'try-error'))
{
source.file <- NULL
}
func <- deparse(sys.call(-1))
# MetaInfo isn't always called from within another function, so func could
# return as NULL or as general environment.
if (any(grepl('eval', func, ignore.case=TRUE)))
{
func <- NULL
}
time <- strftime(Sys.time(), "%Y/%m/%d %H:%M:%S")
git.h <- system('git log --pretty=format:"%h" -n 1', intern=TRUE)
meta <- list(Message=message,
Source=paste(source.file, ' on ', time, sep=''),
Functions=func,
System=Sys.info(),
Session=sessionInfo(),
Git.hash=git.h)
sink(file=filename)
print(meta)
sink(file=NULL)
}
which can then be called in another function, stored in another file, e.g.:
source('metainfo.R')
RandomPlot <- function(x, y)
{
fn <- 'random_plot'
pdf(file=paste(fn, '.pdf', sep=''))
plot(x, y)
MetaInfo(message=NULL, filename=fn)
dev.off()
}
x <- 1:10
y <- runif(10)
RandomPlot(x, y)
This way, a text file with the same file name as the plot is produced, with information that could hopefully help figure out how and where the plot was produced.
In terms of general R organization: I like to have a single script that recreates all work done for a project. Any project should be reproducible with a single click, including all plots or papers associated with that project.
So, to stay organized: keep a different directory for each project, each project has its own functions.R script to store non-package functions associated with that project, and each project has a master script that starts like
## myproject
source("functions.R")
source("read-data.R")
source("clean-data.R")
etc... all the way through. This should help keep everything organized, and if you get new data you just go to early scripts to fix up headers or whatever and rerun the entire project with a single click.
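The master-script idea can be taken one step further so that the output folder is wiped and rebuilt on every run. A minimal sketch, assuming a hypothetical helper run_project and an out_dir variable that the project scripts write into; neither is from the answer above:

```r
# Hypothetical helper: delete the output folder, then re-run every
# project script in order in one shared environment.
# Caution: unlink() removes everything under out_dir.
run_project <- function(scripts, out_dir = "output") {
  unlink(out_dir, recursive = TRUE)
  dir.create(out_dir, recursive = TRUE)
  env <- new.env(parent = globalenv())
  # Scripts can refer to out_dir when they save plots or data.
  assign("out_dir", out_dir, envir = env)
  for (s in scripts) {
    message("Running ", s)
    source(s, local = env)
  }
  invisible(list.files(out_dir))
}

# Example call (file names taken from the master script above):
# run_project(c("functions.R", "read-data.R", "clean-data.R"))
```

Because every file in the output folder is recreated by the run, there is no doubt about which code produced it.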
There is a package called ProjectTemplate that helps organize and automate the typical workflow with R scripts, data files, charts, etc. There are also a number of helpful documents, such as Workflow of statistical data analysis by Oliver Kirchkamp.
If you use Emacs and ESS for your analyses, learning Org-Mode is a must. I use it to organize all my work. Here is how it integrates with R: R Source Code Blocks in Org Mode.
There is also this new free tool called Drake which is advertised as "make for data".
I think my question betrays a certain level of confusion. Having looked around, as well as explored the suggestions provided so far, I have reached the conclusion that it is probably not important to know where and how a file is produced. You should in fact be able to wipe out any output, and reproduce it by rerunning code. So while I might still use the above function for extra information, it really is a question of being ruthless and indeed cleaning up folders every now and then. These ideas are more eloquently explained here. This of course does not preclude the use of Make/Drake or ProjectTemplate, which I will try to pick up on. Thanks again for the suggestions @noah and @alex!
There is also now an R package called drake (Data Frames in R for Make), independent from Factual's Drake. The R package is also a Make-like build system that links code/dependencies with output.
install.packages("drake") # It is on CRAN.
library(drake)
load_basic_example()
plot_graph(my_plan)
make(my_plan)
Like its predecessor remake, it has the added bonus that you do not have to keep track of a cumbersome pile of files. Objects generated in R are cached during make() and can be reloaded easily.
readd(summ_regression1_small) # Read objects from the cache.
loadd(small, large) # Load objects into your R session.
print(small)
But you can still work with files as single-quoted targets. (See 'report.Rmd' and 'report.md' in my_plan from the basic example.)
There is a package developed by RStudio called pins that might address this problem.

Resources