I'm working with a huge file to do some data analysis in R. This file is a .csv. I can import it just fine. However, after transposing all the rows and columns using data.frame(t(data)), I export it and cannot re-import this data.
This is the code I am using:
write.csv(transposed_data, file = "transposed_data.csv", row.names = FALSE, quote = FALSE)
When I transpose the rows and columns, does something happen to the data that is causing these issues? When using read.csv, the transposed data simply will not open.
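A minimal sketch of a round trip that does re-import, using a small hypothetical data frame in place of the real file. Two things are known about this pipeline: t() converts the data frame to a matrix first (so mixed column types all become character), and quote = FALSE leaves any value containing a comma unprotected. Whether either one is the actual cause here is an assumption, since the data is not shown.

```r
# Hypothetical stand-in for the real data; note the commas inside the id values
data <- data.frame(id = c("a,1", "b,2"), value = c(1.5, 2.5),
                   stringsAsFactors = FALSE)

# t() goes through as.matrix(), so every cell becomes character here
transposed_data <- data.frame(t(data), stringsAsFactors = FALSE)

out_file <- tempfile(fileext = ".csv")

# Keep the default quote = TRUE so embedded commas stay inside one field,
# and keep the row names, which now hold the original column names
write.csv(transposed_data, file = out_file, row.names = TRUE)

# row.names = 1 turns the first column back into row names
reimported <- read.csv(out_file, row.names = 1, stringsAsFactors = FALSE)
```

If the re-import still fails on the real file, comparing count.fields(out_file, sep = ",") across lines is one way to see whether some rows have a different number of fields.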
I have merged two tables and want to write the result to a .txt file. The write itself works, but when I open the .txt file in Excel, stray symbols have been added to some values. How do I stop this from happening? I used the following code:
ICU_PPS <- merge(ICU, PPS, by=c("Study","Lane","Isolate_ID","Sample_Number","MALDI_ID", "WGS","Source"),all=TRUE)
write.table(ICU_PPS,"ICUPPS2.txt", sep="\t", row.names = FALSE)
An example of some values in a column that I get:
100_1#175
100_1#176
100_1#177
100_1#179
100_1#18 
100_1#19 
100_1#20 
What I want to achieve:
100_1#175
100_1#176
100_1#177
100_1#179
100_1#18
100_1#19
100_1#20
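One possible cause, and it is only an assumption because the raw data is not shown, is a non-breaking space (U+00A0) padding the shorter values, which Excel can display as an extra symbol. A sketch that strips it from every character column before writing; the one-column data frame and the use of the Lane column are hypothetical:

```r
# Hypothetical data: the short IDs carry a trailing non-breaking space (U+00A0)
ICU_PPS <- data.frame(Lane = c("100_1#175", "100_1#18\u00a0", "100_1#19\u00a0"),
                      stringsAsFactors = FALSE)

# Remove non-breaking spaces, then trim ordinary leading/trailing whitespace
strip_nbsp <- function(x) trimws(gsub("\u00a0", "", x, fixed = TRUE))

# Apply the cleanup to character columns only
is_text <- vapply(ICU_PPS, is.character, logical(1))
ICU_PPS[is_text] <- lapply(ICU_PPS[is_text], strip_nbsp)

write.table(ICU_PPS, tempfile(fileext = ".txt"), sep = "\t", row.names = FALSE)
```

If the columns are factors rather than character vectors, the same function would need to be applied to their levels instead.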
I have multiple Excel files with data. I want to split the data in each Excel file into multiple sheets within that same file. I have already managed to do that for one file with the following code:
library(openxlsx)
# Read the workbook chosen interactively
data <- read.xlsx(file.choose())
# One data frame per value of the Assigned column
splitdata <- split(data, data$Assigned)
splitdata
workbook <- createWorkbook()
# Add one sheet per group, named after the group
Map(function(data, name) {
  addWorksheet(workbook, name)
  writeDataTable(workbook, name, data)
}, splitdata, names(splitdata))
saveWorkbook(workbook, file = "WorkbookWithMultipleSheets.xlsx", overwrite = TRUE)
However, I have more than 50 Excel files that each need to be split into multiple sheets this way. Is there any way to create a loop so that I won't have to repeat this code for each Excel file?
Any help is appreciated! Thank you!
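One way to do this is to wrap the steps in a function and call it once per file. The sketch below assumes every workbook has an Assigned column and that its values are valid sheet names (Excel limits them to 31 characters). The function name split_workbook and the "input" and "output" folder names are hypothetical.

```r
library(openxlsx)

# Split one workbook by its Assigned column and save each group as a sheet
split_workbook <- function(path, out_path) {
  data <- read.xlsx(path)
  splitdata <- split(data, data$Assigned)
  workbook <- createWorkbook()
  for (name in names(splitdata)) {
    addWorksheet(workbook, name)
    writeDataTable(workbook, name, splitdata[[name]])
  }
  saveWorkbook(workbook, file = out_path, overwrite = TRUE)
  invisible(out_path)
}

# Hypothetical layout: originals in "input", split workbooks written to "output"
files <- list.files("input", pattern = "\\.xlsx$", full.names = TRUE)
dir.create("output", showWarnings = FALSE)
for (f in files) {
  split_workbook(f, file.path("output", basename(f)))
}
```

Writing to a separate folder keeps the originals untouched and avoids picking up already-split files on a second run.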
My question is about how to specify the class for various columns when reading in data that come from many files. More specifically, I am uploading 1000s of .xlsx files at a time and converting them to .csv files using the read.xls() function in the gdata package.
My approach is as follows:
Myfiles <- list.files() # lists all files in working directory (which contains data files)
library(gdata)
Mylist <- lapply(Myfiles, read.xls, header = TRUE,
                 perl = "C:/Users/A/PERL/perl/bin/perl.exe",
                 sheet = 1,
                 method = "csv",
                 skip = 1,
                 as.is = 1)
I apologize for not providing a workable example. I'm not sure how to do so for this problem.
All the .xlsx files have identical headers and set-up, but the classes of corresponding columns in the data frames within Mylist are not all the same. Is there a way to specify the classes within the lapply() approach I am using? I know you can extend functions of read.table() to read.xls() but I haven't figured out how to specify the column classes properly within the lapply call.
It's all in Gabor's comment, but to put this one to bed:
lapply(Myfiles, read.xls, colClasses = c("character", "numeric", "factor"), header=T)
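The call above works because lapply() forwards any extra named arguments to the function it applies. A self-contained sketch of the same idea, using base read.csv() and two hypothetical temporary CSV files instead of read.xls() so that it runs without Perl; the column names and classes are made up for illustration:

```r
# Two hypothetical CSV files with identical headers, written to a temp folder
tmp_dir <- tempfile("csvs")
dir.create(tmp_dir)
write.csv(data.frame(id = c("001", "002"), score = c(1.5, 2), group = c("a", "b")),
          file.path(tmp_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(id = c("003", "004"), score = c(3, 4), group = c("b", "b")),
          file.path(tmp_dir, "two.csv"), row.names = FALSE)

Myfiles <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Extra named arguments to lapply() are passed on to read.csv() for every file;
# naming the colClasses entries ties each class to a column by name
Mylist <- lapply(Myfiles, read.csv,
                 colClasses = c(id = "character", score = "numeric", group = "factor"))

# Check that every data frame ended up with the same column classes
lapply(Mylist, function(df) sapply(df, class))
```

Reading id as character also keeps the leading zeros that would be lost if the column were guessed to be numeric.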
I am working on a large questionnaire - and I produce summary frequency tables for different questions (e.g. df1 and df2).
a<-c(1:5)
b<-c(4,3,2,1,1)
Percent<-c(40,30,20,10,10)
df1<-data.frame(a,b,Percent)
c<-c(1,1,5,2,1)
Percent<-c(10,10,50,20,10)
df2<-data.frame(a,c,Percent)
rm(a,b,c,Percent)
I normally export the dataframes as csv files using the following command:
write.csv(df1, file = "df1.csv")
However, my questionnaire has many questions and therefore many dataframes. Is there a way in R to combine different dataframes (say, with a line separating them) and export them to a single csv that I can then open in Excel? That way I would have just one file with all my question dataframes in it, one below the other, which would be much easier than opening individual files in turn to view the results.
Many thanks in advance.
If your end goal is an Excel spreadsheet, I'd look into some of the tools available in R for writing an xls file directly. Personally, I use the XLConnect package, but there is also the xlsx package, as well as several write.xls functions in various other packages.
I happen to like XLConnect because it allows for some handy vectorization in situations just like this:
require(XLConnect)
#Put your data frames in a single list
# I added two more copies for illustration
dfs <- list(df1,df2,df1,df2)
#Create the xls file and a sheet
# Note that XLConnect doesn't seem to do tilde expansion!
wb <- loadWorkbook("/Users/jorane/Desktop/so.xls",create = TRUE)
createSheet(wb,"Survey")
#Starting row for each data frame
# Note the +1 to get a gap between each
n <- length(dfs)
rows <- cumsum(c(1,sapply(dfs[1:(n-1)],nrow) + 1))
#Write the file
writeWorksheet(wb,dfs,"Survey",startRow = rows,startCol = 1,header = FALSE)
#If you don't call saveWorkbook, nothing will happen
saveWorkbook(wb)
I specified header = FALSE since otherwise it will write the column header for each data frame. But adding a single row at the top in the xls file at the end isn't much additional work.
As James commented, you could use
merge(df1, df2, by="a")
but that would combine the data horizontally. If you want to combine them vertically you could use rbind:
rbind(df1, df2, df3,...)
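(Note: the column names need to match for rbind to work. df1 and df2 above differ in their second column, b versus c, so one of them would need renaming first.)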
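For the single-csv route the question asks about, a base R sketch is to write each data frame in turn to one open file connection, with an empty line after each. It reuses the question's df1 and df2; the temporary output path is an assumption for illustration.

```r
# Rebuild the example data frames from the question
a <- 1:5
df1 <- data.frame(a, b = c(4, 3, 2, 1, 1), Percent = c(40, 30, 20, 10, 10))
df2 <- data.frame(a, c = c(1, 1, 5, 2, 1), Percent = c(10, 10, 50, 20, 10))
dfs <- list(df1, df2)

out_file <- tempfile(fileext = ".csv")

# Keep one connection open so each write continues where the last one ended
con <- file(out_file, open = "wt")
for (df in dfs) {
  # Each block keeps its own header row, since the column names differ
  write.table(df, con, sep = ",", row.names = FALSE, col.names = TRUE)
  # Empty line to separate this data frame from the next
  writeLines("", con)
}
close(con)
```

Because every block carries its own header, the column names do not need to match, unlike with rbind.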