I realize that a lot of work in R is done interactively where the output can immediately be seen. However, I usually want my scripts to be user-friendly and descriptive so I want output from my scripts to be minimally verbose.
I'm still new to R and would like to be able to transfer my work into a script for reuse. After searching around a bit on StackOverflow I came up with this for my script:
# Display the results
paste("The number of houses is: ", numHouses, sep=" ")
This kind of output in Python could be:
print("The number of houses is %d" % numHouses)
I would also REALLY prefer to end my sentences with a period, so I'd like to append "." at the end of the output in R.
At the moment I do not need the data written to a file. I just want it written to the console. I'm using R Studio.
I like to use message because it can easily be suppressed when the user doesn't want output. cat and print also work, but in my experience they are less convenient inside functions or loops. cat and message accept several arguments and concatenate them, while print takes a single object.
message(paste0("The number of houses is ", numHouses, "."))
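For anyone coming from Python's %-style formatting, sprintf is the closest base R equivalent. A minimal sketch, assuming numHouses holds an integer count (the value 42 is made up for illustration):

```r
# Hypothetical value for illustration; in a real script it comes from the data.
numHouses <- 42L

# sprintf() builds the string with printf-style placeholders;
# %d expects an integer-valued number.
msg <- sprintf("The number of houses is %d.", numHouses)

# message() writes to stderr, so it stays separate from values printed to stdout.
message(msg)

# A caller who wants silence can wrap the call in suppressMessages().
suppressMessages(message(msg))
```

Building the string first and passing it to message keeps the formatting and the output channel as two separate decisions.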
I have multiple regressions in an R script and want to append the regression summaries to a single text file output. I know I can use the following code to do this for one regression summary, but how would I do this for multiple?
rpt1 <- summary(fit)
capture.output(rpt1, file = "results.txt")
I would prefer not to have to use this multiple times in the same script (for rpt1, rpt2, etc.), and thus have separate text files for each result. I'm sure this is easy, but I'm still learning the R ropes. Any ideas?
You can store the results in a list and then call capture.output on it:
fit1<-lm(mpg~cyl,data=mtcars)
fit2<-lm(mpg~cyl+disp,data=mtcars)
myresult<-list(fit1,fit2)
capture.output(myresult, file = "results.txt")
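Since the question asks for the regression summaries rather than the fitted models, one possible variation is to apply summary to every model before capturing. This is only a sketch: the list names model1 and model2 and the temporary output path are assumptions made for illustration.

```r
# Fit the same two models on the built-in mtcars data.
fit1 <- lm(mpg ~ cyl, data = mtcars)
fit2 <- lm(mpg ~ cyl + disp, data = mtcars)

# Summarise every model in one step; naming the list elements
# labels each block in the printed output.
summaries <- lapply(list(model1 = fit1, model2 = fit2), summary)

# Hypothetical output path; tempfile() avoids cluttering the working directory.
out_file <- tempfile(fileext = ".txt")
capture.output(summaries, file = out_file)

# Peek at the first few lines that were written.
head(readLines(out_file))
```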
If you want multiple outputs sent to a file, look at the sink function. It redirects all output to a file until you call sink again. The capture.output function actually uses sink internally.
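A minimal sketch of the sink approach for several regressions, assuming a temporary output path and a list called fits (both introduced here only for illustration):

```r
# Hypothetical output path for this sketch.
out_file <- tempfile(fileext = ".txt")

fits <- list(
  lm(mpg ~ cyl, data = mtcars),
  lm(mpg ~ cyl + disp, data = mtcars)
)

sink(out_file)              # start redirecting stdout to the file
for (fit in fits) {
  print(summary(fit))       # inside a loop, printing must be explicit
}
sink()                      # stop redirecting; output returns to the console
```

The explicit print matters: auto-printing only happens at the top level, so a bare summary(fit) inside the loop would write nothing.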
You might also be interested in the txtStart function (and friends) in the TeachingDemos package which will also include the commands interspersed with the output and gives a few more options for output formatting.
Eventually you will probably want to investigate the knitr package, which runs a set of commands in a batch and captures all the output together, nicely formatted and documented.
Let's say I want to use sink for writing to a file in R.
sink("outfile.txt")
cat("Line one.\n")
cat("Line two.")
sink()
Question 1. I have seen people writing sink() at the end. Why do we need this? Can something go wrong if we leave it out?
Question 2. What is the best way to write many lines one by one to a file with a for-loop, where each line also needs formatting? That is, I might need a different number in each line; in Python I would use outfile.write("Line with number %.3f" % 1.231) etc.
Question 1:
The sink function redirects all text going to the stdout stream to the file handler you give to sink. This means that anything that would normally print out into your interactive R session, will now instead be written to the file in sink, in this case "outfile.txt".
When you call sink again without any arguments you are telling it to resume using stdout instead of "outfile.txt". So no, nothing will go wrong if you don't call sink() at the end, but you need to call it if you want to start seeing output again in your R session.
As @Roman has pointed out though, it is better to explicitly tell cat to output to the file. That way you get only what you want and expect to see in the file, while still getting the rest of the output in the R session.
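If sink is used inside a function, one way to make sure the redirection is always undone is to pair it with on.exit. The sketch below assumes a helper named write_report, which is hypothetical and not part of base R:

```r
# write_report() is a hypothetical helper, not part of base R.
write_report <- function(path) {
  sink(path)
  # on.exit() runs even if the code below throws an error,
  # so the console is never left redirected by accident.
  on.exit(sink(), add = TRUE)
  cat("Line one.\n")
  cat("Line two.\n")
  invisible(path)
}

out_file <- write_report(tempfile(fileext = ".txt"))

# sink.number() reports how many diversions are still active.
sink.number()
```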
Question 2:
This also answers question two. R does have file connections (see ?file), but the simplest route is to call cat(..., file="outfile.txt", append=TRUE) inside a for loop. Without append=TRUE each call overwrites the file, so only the last line would survive.
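A rough sketch of formatted line-by-line writing, using sprintf for the Python-style "%.3f" formatting. The file paths and the numbers are made up for illustration. The second half shows a file connection as an alternative that is closer to Python's file handles.

```r
out_file <- tempfile(fileext = ".txt")  # hypothetical path
values <- c(1.231, 2.5, 3.14159)        # made-up numbers

# Option 1: cat() with file= and append = TRUE. Without append, each
# call would overwrite the file and only the last line would remain.
file.create(out_file)                   # start from an empty file
for (v in values) {
  cat(sprintf("Line with number %.3f\n", v), file = out_file, append = TRUE)
}

# Option 2: open a connection once, write to it repeatedly, then close it.
con_file <- tempfile(fileext = ".txt")
con <- file(con_file, open = "w")
for (v in values) {
  writeLines(sprintf("Line with number %.3f", v), con)
}
close(con)
```

Both files end up with the same three lines; the connection version avoids reopening the file on every iteration.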
I am looking for an easy way to get objects into MS Excel.
(I am using the preinstalled "Puromycin"-dataset for the examples)
I would like to place the contents of these objects to a single excel file:
Puromycin
summary(Puromycin$rate)
summary(Puromycin$conc)
table(Puromycin$state)
lm( conc ~ rate , data=Puromycin)
By "contents" I mean what is shown in the console when I press enter. I don't know what to call it.
I tried to do this:
sink("datafilewhichexcelhopefullyunderstands.csv")
Puromycin
summary(Puromycin$rate)
summary(Puromycin$conc)
table(Puromycin$state)
lm( conc ~ rate , data=Puromycin)
sink()
This gives me a file with the CSV extension, but when I open the file in Notepad
there is no comma separation. That means I can't get Excel to open it properly. By properly
I mean that each number is in its own cell.
Others have suggested this for a similar problem
https://stackoverflow.com/a/13007555/1831980
But as a novice I feel that the solution is too complex, and I am hoping for a simpler method.
What I am doing now is this:
write.table(Puromycin, file="clipboard" , sep=";" , row.names=FALSE )
write.table(summary(Puromycin$conc), file="clipboard" , sep=";" , row.names=FALSE )
... etc...
But this requires a lot of copying and pasting, which I hope to eliminate.
Any help would be appreciated.
write.table and its friends are intended to write out columns of data separated by whatever separator is specified. Your clipboard ends up containing several different kinds of data because summary returns a differently shaped result depending on the object it is given.
For writing the data values out, you can use write.csv on a data frame and then open with Excel. For example, Puromycin is already a data frame (which you can see with str(Puromycin)) so you can just write it out directly:
write.csv(file = "some file.csv", x = Puromycin)
Which will go into the current working directory (which can be determined with getwd()).
Writing out or saving the results of the regression model is a bit more of a challenge. You could definitely use sink as you did, but give the file a .txt extension so a text editor can open it. There are fancier methods (Sweave, knitr) which you might want to look into in the long run, as they can write really nice reports automatically.
In the meantime, get to know str(any R object) as it will be your friend. You can see all the objects in your workspace with ls().
This will only be helpful if you are prepared to use Excel's Data/Text to Columns functions:
capture.output( invisible( lapply( list(Puromycin,
summary(Puromycin$rate),
summary(Puromycin$conc),
table(Puromycin$state),
lm( conc ~ rate , data=Puromycin) ), FUN=print) ), file="datafilewhichexcelhopefullyunderstands.csv", append=TRUE)
The problem is that Excel will not read whitespace as a cell separator unless you specifically tell it to. You can (and I have often done so) use the fixed-width input features offered by the Text-to-Columns dialog.
Your simplest option may be to use the RExcel tool, which transfers information between R and Excel. However, it is not free software.
The XLConnect package is another option, it can be used to write information directly to an Excel file.
The tricky part is the lm call. lm does not return a simple vector, matrix, or data frame (all of which are easy to convert to csv or send directly) and there is not a clear way to convert the various parts of a list to cells in a spreadsheet. What would be better is to use extractor functions to pull the important parts from the return of lm or the summary of the lm object and send those to Excel using the other tools.
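As a sketch of the extractor-function idea: coef applied to the summary of an lm fit returns an ordinary matrix, which converts cleanly to a data frame and then to CSV. The output path and the variable names below are assumptions made for illustration.

```r
fit <- lm(conc ~ rate, data = Puromycin)

# coef(summary(fit)) is a plain numeric matrix: one row per term, with
# estimate, standard error, t value and p value as columns.
coef_table <- as.data.frame(coef(summary(fit)))

# Hypothetical output path; write.csv() produces a file Excel opens directly.
out_file <- tempfile(fileext = ".csv")
write.csv(coef_table, file = out_file)

# A few other extractors that return simple objects:
r_squared <- summary(fit)$r.squared
residual_values <- residuals(fit)
```

Each piece is now a vector, a single number, or a data frame, so any of the tools mentioned above can send it to a spreadsheet cell by cell.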
If you can tell us more about why you want the numbers in Excel and what you plan to do with them after, then we may be able to offer better help (you may be able to completely skip excel).
If the main goal is to share output with others then you should really look at the knitr package (or other related packages). This will not create Excel files, but can be used (along with the pandoc program and possibly other tools) to create a report file in a format that is easy to share with others not familiar with R. You could put everything into a .pdf file or a .docx file (the latter can be read by MS Word and would have tables which can be edited using Word). There is not a simple way to get edits back into R, but with track changes you can easily see what changes have been made and hand-edit your R script/template accordingly.
I am starting to use R. I wrote a script using for loops and if statements, and I am happy with the results.
Now the issue I have is that the console shows all the lines of code (around 150 lines) when really I am only interested in 4 lines, my results.
Is there any way to clean the console so that it shows only some requested lines and not all of the code? If not, I am thinking about saving them in a csv file and opening the csv file to see the results of the script, but that is not really efficient.
Thanks in advance
Xavier
I expect this to depend on how your 'results' are in the console, and whether all the rest is truly 'code'. Are these 4 lines the result of cat/print statements? Then you could look at ?sink to send the results to a file.
Another option is to store these results in a variable (e.g. a list) and print this list at the end of all your calculations. After that it should be easy to do the separation.
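A minimal sketch of the collect-then-print idea. The vector x and the three statistics are stand-ins for whatever the real script computes.

```r
results <- list()                # collect only what should be shown

# Stand-in calculations; the real script's work would go here.
x <- c(2, 3, 4, 5, 4, 3, 2, 3, 4, 5)
results$total <- sum(x)
results$mean  <- mean(x)
results$max   <- max(x)

# Print everything in one place at the end of the script.
for (name in names(results)) {
  cat(sprintf("%s: %s\n", name, format(results[[name]])))
}
```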
You are writing code in a script editor and not in the console, right? Another option would be to use source() on the script, which runs the entire script without showing the code in the console (only the output). RStudio (which I strongly recommend you use for R; http://rstudio.org/) has a "source this file" button or something like that.
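To illustrate the source() behaviour, here is a small self-contained sketch that writes a stand-in script to a temporary file and then runs it. The script contents are invented for the example.

```r
# Write a tiny stand-in script to a temporary file (hypothetical content).
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(2, 3, 4)",
  "total <- sum(x)",
  "cat('Total:', total, '\\n')"
), script)

# echo = FALSE (the default) runs the file without echoing its code;
# only explicit output such as cat() or print() appears.
source(script, echo = FALSE)
```

Because auto-printing is off inside a sourced file, results need an explicit cat, print, or message call to show up in the console.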
But more importantly, getting R to clearly return the results is a big part of learning how to program in R. You want your scripts to be clear for others as well! Some solutions would be to make some code chunks a function or as Nick suggested storing results in a list.
Personally, I would put your code into a function, which effectively hides the code from the console as it runs, store the result in a variable, and then save that to a file:
foo <- function(x) {
  # Add up the elements of x one at a time.
  result <- 0
  for (i in seq_along(x)) {  # seq_along() is safe even when x is empty
    result <- result + x[i]
  }
  return(result)
}
bar <- foo(x=c(2,3,4,5,4,3,2,3,4,5))
write.csv(bar, "resultfile.csv")