writing to a tab-delimited file or a csv file - r

I have a RMA normalized data ( from the CEL files ) and would like to write it into a file that I could open in excel but have some problems.
library(affy)
cel <- ReadAffy()
pre<-rma(cel)
write.table(pre, file="norm.txt", sep="\t")
write.table(pre, file="norma.txt")
The outut is arranged row-wise in the text file that is written using the above command and hence when exported to excel it is in a wrong form and many of the information is cut off as the maximum rows are used up .The output looks the following way :
GSM 133971.CEL 5.85302 3.54678 6.57648 9.45634
GSM 133972.CEL 4.65784 3.64578 3.54213 7.89566
GSM 133973.CEL 6.78543 3.54623 2.54345 7.89767
How to write it in a proper format from CEL files in R to a notepad or excel ?

You need to extract the values from the normalised probes using the exprs function. Something like:
write.csv(exprs(pre), file="output.csv", row.names=FALSE)
should do the trick.

I'm not totally clear about what the problem is, what do you mean by "proper format"? Excel will struggle with huge tables and doing your analysis in R with Bioconductor is likely a better way to go, you could then export a smaller results or summary table to excel.
Nevertheless, if you want the file written columnwise, you could try:
write.csv(t(pre),file="norm.txt")
But excel (at least used to) allow many more rows than columns.

Related

Writing .xlsx in R, importing into PowerBi error

I'm experiencing an odd error. I have a large dataframe in R (75000 rows, 97 columns) and I need to save it out and then import it into Power Bi.
At first I just did the simple:
library(tidyverse)
write_csv(Visits,"Visits.csv")
and while it seems to export and looks fine in excel, the csv itself is all messed up when I look at the contents in Power Bi. Here's an example of what I mean:
The 'phase.x' column should only have "follow-up" or "treatment" in that column. In excel, looks great:
but that exact same file gets screwed up in Power Bi:
I figured that being a 'comma separated variable' file, there must be some extra comma somewhere, and I saved it as an .xlsx instead.
So, while in excel, I saved that .csv as an .xlsx and it opened great in Power Bi!
Jump forward a moment and instead of write_csv() in R, I use write.xlsx(). But now I get this error:
If I simply go to that file, open it in excel, save it and hit close, that error goes away and it can load into Power Bi just fine. I figure it has something to do with this question on here.
Any ideas on what I might be screwing up as I save it out of R? Somehow I can fix it in R and not have to open and save it every time?
In power BI check that your source has ignore quoted line breaks enabled. I've found this is often an issue with .csv files in PowerBI.

R misreading csv files after modifications on Excel

This is more of a curiosity.
Sometimes I modify csv files from Excel rather than R (suppose I manage to find a missing piece of info and I type it in the csv file), of course maintaining commas and quotes as they were.
Every time I do this, R becomes unable to read the csv file, i.e. it imports a single column as it appears on Excel, rather than separating the values (no options like sep= or quote= change this).
Does anyone know why this happens?
Thanks a lot
An example
This was readable:
state,"city","county"
AK,"Anchorage",""
AK,"Haines",""
AK,"Juneau","Juneau"
After adding the missing info under "county", R fails to import it as a data frame, reading it instead as a single vector.
state,"city","county"
AK,"Anchorage","Anchorage"
AK,"Haines","Haines"
AK,"Juneau","Juneau"
Edit:
I'm just running the basic read.csv
df <- read.csv("C:/directory/df.csv")

Fetch data from an open excel sheet into R?

I am wondering is it possible to read an excel file that is currently open, and capture things you manually test into R?
I have an excel file opened (in Windows). In my excel, I have connected to a SSAS cube. And I do some manipulations using PivotTable Fields (like changing columns, rows, and filters) to understand the data. I would like to import some of the results I see in excel into R to create a report. (I mean without manually copy/paste the results into R or saving excel sheets to read them later). Is this a possible thing to do in R?
UPDATE
I was able to find an answer. Thanks to awesome package created by Andri Signorell.
library(DescTools)
fxls<-GetCurrXL()
tttt<-XLGetRange(header=TRUE)
I was able to find an answer. Thanks to awesome package created by Andri Signorell.
library(DescTools)
fxls<-GetCurrXL()
tttt<-XLGetRange(header=TRUE)
Copy the values you are interested in (in a single spread sheet at a time) to clipboard.
Then
dat = read.table('clipboard', header = TRUE, sep = "\t")
You can save the final excel spreadsheet as a csv file (comma separated).
Then use read.csv("filename") in R and go from there. Alternatively, you can use read.table("filename",sep=",") which is the more general version of read.csv(). For tab separated files, use sep="\t" and so forth.
I will assume this blog post will be useful: http://www.r-bloggers.com/a-million-ways-to-connect-r-and-excel/
In the R console, you can type
?read.table
for more information on the arguments and uses of this function. You can just repeat the same call in R after Excel sheet changes have been saved.

How to export a dataset to SPSS?

I want to export a dataset in the MASS package to SPSS for further investigation. I'm looking for the EuStockMarkets data set in the package.
As described in http://www.statmethods.net/input/exportingdata.html, I did:
library(foreign)
write.foreign(EuStockMarkets, "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
I got a text file but the sps file is not a valid SPSS file. I'm really looking for a way to export the dataset to something that a SPSS can open.
As Thomas has mentioned in the comments, write.foreign doesn't generate native SPSS datafiles (.sav). What it does generate is the data in a comma delimited format (the .txt file) and a basic syntax file for reading that data into SPSS (the .sps file). The EuStockMarkets data object class is multivariate time series (mts) so when it's exported the metadata is lost and the resulting .sps file, lacking variable names, throws an error when you try to run it in SPSS. To get around this you can export it as a data frame instead:
write.foreign(as.data.frame(EuStockMarkets), "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
Now you just need to open mydata.sps as a syntax file (NOT as a datafile) in SPSS and run it to read in the datafile.
Rather than exporting it, use the STATS GET R extension command. It will take a specified data frame from an R workspace/dataset and convert it into a Statistics dataset. You need the R Essentials for Statistics and the extension command, which are available via the SPSS Community site (www.ibm.com/developerworks/spssdevcentral)
I'm not trying to answer a question that has been answered. I just think there is something else to complement for other users looking for this.
On your SPSS window, you just need to find the first line of code and edit it. It should be something like this:
"file-name.txt"
You need to find the folder path where you're keeping your file:
"C:\Users\DELL\Google Drive\Folder-With-Your-File"
Then you just need to add this path to your file's name:
"C:\Users\DELL\Google Drive\Folder-With-Your-File\file-name.txt"
Otherwise SPSS will not recognize the .txt file.
Sorry if I'm repeating some information here, I just wanted to make it easier to understand.
I suppose that EuStockMarkets is a (labelled) data frame.
This should work and even keep the variable and value labels:
require(sjlabelled)
write_spss(EuStockMarkets, "mydata.sav")
Or you try rio:
rio::export(EuStockMarkets, "mydata.sav")

How to write multiple tables, dataframes, regression results etc - to one excel file?

I am looking for an easy way to get objects into MS Excel.
(I am using the preinstalled "Puromycin"-dataset for the examples)
I would like to place the contents of these objects to a single excel file:
Puromycin
summary(Puromycin$rate)
summary(Purymycin$conc)
table(Puromycin$state)
lm( conc ~ rate , data=Puromycin)
By "contents" i mean what is shown in the console when i press enter. I dont know what to call it.
I tried to do this:
sink("datafilewhichexcelhopefullyunderstands.csv")
Puromycin
summary(Puromycin$rate)
summary(Purymycin$conc)
table(Puromycin$state)
lm( conc ~ rate , data=Puromycin)
sink()
This gives med a file with the CSV-extension, however when i open the file in notepad,
there is comma-separation. That means that i cant get Excel to open it properly. By properly
i mean that each number is in its own cell.
Others have suggested this for a similar problem
https://stackoverflow.com/a/13007555/1831980
But as a novice i feel that the solution is too complex, and I am hoping for a simpler method.
What I am doing now is this:
write.table(Puromycin, file="clipboard" , sep=";" , row.names=FALSE )
write.table(summary(Purymycin$conc), file="clipboard" , sep=";" , row.names=FALSE )
... etc...
But this requires i lot of copy-ing and pasting, which I hope to eliminate.
Any help would appreciated.
write.table and its friends are intended to write out columns of data separated by whatever separator is specified. Your clipboard contains several data types because you are using summary which always gives a unique output.
For writing the data values out, you can use write.csv on a data frame and then open with Excel. For example, Puromycin is already a data frame (which you can see with str(Puromycin)) so you can just write it out directly:
write.csv(file = "some file.csv", x = Puromycin)
Which will go into the current working directory (which can be determined with getwd()).
To write out/save the results of the regression model is a bit more of a challenge. You could definitely use sink as you did, but specify an extension of .txt on your file so a text editor can open it. There are fancier methods (sweave, knitr) which you might want to look into in the long run, as they can write really nice reports automatically.
In the meantime, get to know str(any R object) as it will be your friend. You can see all the objects in your workspace with ls().
This will only be helpful if you are prepared to use Excel's Data/Text to Columns functions:
capture.output( sapply( c(Puromycin,
summary(Puromycin$rate),
summary(Puromycin$conc),
table(Puromycin$state),
lm( conc ~ rate , data=Puromycin) ), FUN=print), file="datafilewhichexcelhopefullyunderstands.csv", append=TRUE)
The problem being that Excel will not read the whitespace as a cell separator unless you specifically tell it to. You can (and I have often done so) use the fixed filed input features offered by the Text-to-Columns dialog interface.
Your simplest option may be to use the RExcel tool, it transfers information between R and Excel. However it is not free software.
The XLConnect package is another option, it can be used to write information directly to an Excel file.
The tricky part is the lm call. lm does not return a simple vector, matrix, or data frame (all of which are easy to convert to csv or send directly) and there is not a clear way to convert the various parts of a list to cells in a spreadsheet. What would be better is to use extractor functions to pull the important parts from the return of lm or the summary of the lm object and send those to Excel using the other tools.
If you can tell us more about why you want the numbers in Excel and what you plan to do with them after, then we may be able to offer better help (you may be able to completely skip excel).
If the main goal is to share output with others then you should really look at the knitr package (or other related packages). This will not create Excel files, but can be used (along with the pandoc program and possibly other tools) to create a report file in a format easy to share with others not familiar with R. You could put everything into a .pdf file or a .docx file (the latter read by MS Word and would have tables wich can be edited using Word). There is not a simple way to get edits back into R, but with the track changes you can easily see what changes have been made and hand edit your R script/template accordingly.

Resources