R write fixed width columns to csv (in the plain text file) - r

I write data frames to csv files using write.csv(). When this is done, the output when viewed in a plain text editor, in particular vi or notepad++, shows no spacing between the column content and the commas, resulting in it being relatively hard to read. For example, the columns are not lined up down the page.
I have negative interest in using excel to view the csv files. I am definitely not looking for a suggestion for a csv viewer. Nor do I want instructions on how to modify the plain text file afterward. Padding needs to be spaces not tabs.
I am interested in how to get R to line up the columns in the plain text csv file so that they are easier to read using a non specialized plain text editor.
I could (and might) write my own routine that converts everything to some fixed width string format and print that. But, I would prefer to find that this is an option within write.csv() or similar common output library call.
[I just this moment found out about printf in R, and that might be the best answer to this conundrum].

Related

Read a CSV Directly Into R as a String

I have a CSV file that I want to read into R without making it a Data Frame. It seems like it would be quite simple but I can't figure out how to do it. I quite literally just want the CSV file to read in as it would appear in a text editor. The reason for this is I need to feed the string into an API.
Using read.csv() obviously won't work for this because it automatically reads in as a df.
Try readLines()
This will read in the file with each line being a value in a vector. You'll need to then wrap that in a paste(readLines(),collapse="\n") to have it be a single text string that could be passed to an API.

Copy an R data.frame to an Excel spreadsheet

I have a question which is exactly similar to this question.
As part of my work, I have to copy output from the R Studio Console to an excel worksheet in order to make excel graphs. However, the R Studio Console uses formatted text, which excel doesn't read so well. To compensate, I'm always copying from the R Studio Console, pasting into notepad, then copying into Excel. That way, when I paste a table, I can tell excel that it's actually fixed width delimited data, and not just a clump of text.
How can I copy output from the R Studio console so that it goes into the clipboard as unformatted text so that I can paste it directly into Excel and thus organize the numbers into different cells? This would be very helpful as I dislike having to copy/paste tables into notepad then excel to make graphs.
It works with an easy trick.
First, you have to visualize your data in the Viewer pane of Rstudio (you can use the function View()), then you should start selecting from the last value to the first, it is from bottom to top (see image). Note that the first cell should be selected completely. Finally, right click on the selection, copy, and then paste it in Excel as you want, with or without format.
Good luck!
UPDATE:
based on this Post, other alternative is making a new function to copy your data.frame to Excel through the clipboard:
write.excel <- function(x,row.names=FALSE,col.names=TRUE,...) {
write.table(x,"clipboard",sep="\t",row.names=row.names,col.names=col.names,...)
}
write.excel(my.df)
and finally Ctr+V in Excel :)
This is by far the easiest way I have found so far:
clipr::write_clip(my_df)
source here
I usually source the following function:
cb <- function(df, sep="\t", dec=",", max.size=(200*1000)){
# Copy a data.frame to clipboard
write.table(df, paste0("clipboard-", formatC(max.size, format="f", digits=0)), sep=sep, row.names=FALSE, dec=dec)
}
A few notes:
Max.size allows you to specify how big the clipboard can become (in kilobytes) before it cancels, it's set to ~200MB right now.
It works perfectly for copying an R dataframe from an R studio session to Excel (with my EU locale). You might have to adjust the separator / decimal symbols to make it work with US versions.
How to use:
df <- mtcars
cb(df)
# Paste in excel as 'values'
From my experience there is no convenient way, I use two methods:
For small data frames, use RStudio's View(data.frame) function, if you copy only data without headers it works fine, but if you want to copy with headers then you have to paste it into notepad first to add at least one character to the top left empty cell.
For large data frames, use write.csv or write.xls (from package WriteXLS)

The woes of endless columns in .csv data in R

So I have a bunch of .csv files that were output by a simulation. I'm writing an R script to run through them and make a histogram of a column in each .csv file. However, the .csv is written in such a way that R does not like it. When I was testing it, I had been originally opening the files in Excel and apparently this changed the format to one R liked. Then when I went back to run the script on the entire folder I discovered that R doesn't like the format.
I was reading the data in as:
x <- read.csv("synch-imit-characteristics-2-tags-2-size-200-cost-0.1run-2-.csv", strip.white=TRUE)
Error in read.table(test, strip.white = TRUE, header = TRUE) :
more columns than column names
Investigating I found that the original .csv file, which R does not like, looks different than after the test one I opened with excel. I copied and pasted the first bit below after opening it in notepad:
cost,0.1
mean-loyalty, mean-hospitality
0.9885449527316088, 0.33240076252915735
weight,1 of p1, 2 of p1,
However, in notepad, there is no apparent formatting. In fact, between rows there is no space at all, ie it is cost,0.1mean-loyalty,mean-hospitality0.988544, etc. So it is weird to me as well that when I cope and paste it from notepad it gets the desired formatting as above. Anyway, moving on, after I had opened it in excel it got transferred to this"
cost,0.1,,,,,,,,
mean-loyalty, mean-hospitality,,,,,,,,
0.989771257,0.335847092,,,,,,,,
weight,1 of p1, etc...
So it seems like the data originally has no separation between rows (but I don't know how excel figures it out, or copying and pasting it) but R doesn't pick up on this. Instead, it views it all as one row (and since I have 40,000+ rows, it doesn't have that many columns). I don't want to have to open and save every file in excel. Is there a way to get R to read the data as desired?
Since when I copy and paste it from notepad it had new lines for the rows, it seems like I just need R to read it knowing that commas separate columns on the same row and a return separates rows. I tried messing around with all the sep="" commands I could find. But I can't figure it out.
To first solve the Notepad issue:
You must have CR (carriage return, \r) characters between the lines (and no LF, \n characters, causing Notepad to see it as one line).
Some programs accept this as well as a new line character, some don't.
You can for example use Notepad++ to replace all '\r' with '\n' or '\r\n', using Replace wih the "Extended" option. First select View > Show Symbol > Show all characters, so see what you are doing.
Finally, to get back to R:
(As it was pointed out, R can actually handle CR as a newline)
read.csv assumes that you have non-empty header names in the first row, but instead you have:
cost,0.1
while later in the data you have a row with more than just two columns:
weight,1 of p1, 2 of p1,
This means that not all columns have a header name (and I wonder if 0.1 was supposed to be a header name anyway).
The two solutions can be:
add a header including all columns, or
as it was pointed out in a comment use header=F.

Excel data organized in multiple nested rows, can R read it?

Please see the picture. I've started using R, and know how/that it can read files from Excel, but can it read something formatted like this?
http://www.flickr.com/photos/68814612#N05/8632809494/
(my apologies, upload was not working for me)
Elaborating on some of what's in the comments:
If you load the file into Excel, you can save it as a fixed-width or comma-delimited text file. Either should be easy to read into R.
The following may be obvious to you already.
(First, a question: Are you sure that you can't get the data in a format that has one set of data per line? Is it possible that the file you're getting was generated from a different file format that is more conducive to loading the data into R?)
Whether you should start rearranging the data in R or instead manipulate the raw text depends on what comes naturally to you (or to people you have around who can help). For me, personally, I would rearrange the text file outside of R before loading it into R. That's what's easiest for me. Perl is a great language for this purpose, but you could also do it with Unix shell scripts if that's accessible to you, or using a powerful editor such as Vim or Emacs. If you have no preference, I'd suggest Perl. If you have any significant programming experience, you'll be able to learn what you need. On the other hand, you're already loading it into R, so maybe it would be better to process the data there.
For example, you could execute a loop that goes the text file line by line and does something like this:
while (still have lines to read) {
read first header line into an vector if this is the first time through the loop
otherwise, read it and throw it away
read data line 1 into an vector
read second header line into vector if this is the first time
otherwise, read it and throw it away
read data line 2 into an vector
read third header line into vector if this is the first time
otherwise, read it and throw it away
read data line 3 into an vector
if this is first time through, concatenate the header vectors; store as next row
in something (a file, a matrix, a dataframe, etc.)
concatenate the data vectors you've been saving, and store as next row in same thing
}
write out the whole 2D data structure
Or if the headers will never change, then you could just embed them literally into the script before the loop, and throw them out no matter what. That will make the code cleaner. Or read the first few lines of the file separately to get the headers, and then have a separate script to read the data and add it to the file with the headers in it. (The headers will probably be useful in R, so I would suggest preserving them at the top of the text file.)

Wrap text in dataframe in R- or in output cell to Word

I am working in R, trying to export a dataframe to MS Word. I am using R2wd and would like a dataframe to export to MSWORD, and wrap a long string of text within a cell. Is that even possible?
Bare minimum at least pass a command from R to set the height of each row to fit the contents of the cell...
I don't see any demos or documentation but surely somebody must need to do that sometimes!
I'm not sure if this is exactly what you want, but you could export the data.frame to a .Rnw file with xtable, process it with Sweave, and then run the .tex file through latex2rtf. Unfortunately, latex2rtf does not format tables nicely . . .
You could do that with the odfWeave package, which is similar to Sweave, except for you can make dynamic odt (Open Document Format) documents. Well, the package does not generates a doc or docx file, but the odt can be opened without problem in the newer versions of MS Office (2007 and above), and in OpenOffice - so might work for you.
The main advantage of the package is that you can define the styles of the table (header or every cell of the table) to your taste. See the examples in the package's archive for further information.

Resources