I have 3 data frames, and I want them written to one single .csv file, one above the other, not in the same table. So, 3 different tables in one csv file. They all have the same size.
The problem with write.csv: it ignores the "append" argument
The problem with write.table: csv files produced by write.table are not opened as cleanly by Excel 2010 as those from write.csv
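For example, a minimal sketch of the append problem (df1 and df2 stand in for two of the data frames):
write.csv(df1, "filename.csv")
write.csv(df2, "filename.csv", append=TRUE)
# warns: attempt to set 'append' ignored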
Posts I have already read, in which I could not find a solution to my problem:
write.csv() a list of unequally sized data.frames
Creating a file with more than one data frame
Solution ?
write.csv just calls write.table under the hood, with appropriate arguments. So you can achieve what you want with 3 calls to write.table.
write.table(df1, "filename.csv", col.names=TRUE, sep=",")
write.table(df2, "filename.csv", col.names=FALSE, sep=",", append=TRUE)
write.table(df3, "filename.csv", col.names=FALSE, sep=",", append=TRUE)
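If the Excel alignment is the concern, one hedged variant of the same three calls adds row.names=FALSE so that the header row and the data rows have the same number of fields:
write.table(df1, "filename.csv", col.names=TRUE, row.names=FALSE, sep=",")
write.table(df2, "filename.csv", col.names=FALSE, row.names=FALSE, sep=",", append=TRUE)
write.table(df3, "filename.csv", col.names=FALSE, row.names=FALSE, sep=",", append=TRUE)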
Actually, you could avoid the whole issue by combining your data frames into a single df with rbind, then calling write.csv once.
write.csv(rbind(df1, df2, df3), "filename.csv")
We can use sink() to redirect the output to a file:
# Sample dataframes:
df1 = iris[1:5, ]
df2 = iris[20:30, ]
# Start a sink file with a CSV extension
sink('multiple_df_export.csv')
# Write the first dataframe, with a title and final line separator
cat('This is the first dataframe')
write.csv(df1)
cat('____________________________')
cat('\n')
cat('\n')
# Write the 2nd dataframe to the same sink
cat('This is the second dataframe')
write.csv(df2)
cat('____________________________')
# Close the sink
sink()
Related
I have two files. One file (CSV) contains the data, and the second contains the header for the data (one name per line, in a single column). I need to combine both files and get a data.frame with the data from the first file and the header from the second file. How can this be done?
Reduced sample. Data file:
10;21;36
7;56;543
7;7;7
7890;1;1
Header file:
height
weight
light
I need a data.frame as if it had been read from this csv file:
height;weight;light
10;21;36
7;56;543
7;7;7
7890;1;1
You could use the col.names argument in read.table() to read the header file as the column names in the same call used to read the data file.
read.table(datafile, sep = ";", col.names = scan(headerfile, what = ""))
As #chinsoon12 shows in the comments, readLines() could also be used in place of scan().
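For example, a minimal sketch of that variant, assuming headerfile holds one column name per line:
read.table(datafile, sep = ";", col.names = readLines(headerfile))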
We can read both datasets with header=FALSE and then set the column names of the first dataset from the first column of the second dataset.
df1 <- read.csv("firstfile.csv", sep=";", header=FALSE)
df2 <- read.csv("secondfile.csv", header=FALSE)
colnames(df1) <- as.character(df2[,1])
My issue is likely with how I'm exporting the data from the for loop, but I'm not sure how to fix it.
I've got over 200 files in a folder, all structured in the same way, from which I'd like to pull the maximum number from a single column. I've made a for loop to do this, based on code from here: http://www.r-bloggers.com/looping-through-files/
What I have running so far looks like this:
fileNames<-Sys.glob("*.csv")
for(i in 1:length(fileNames)){
data<-read.csv(fileNames[i])
VelM = max(data[,8],na.rm=TRUE)
write.table(VelM, "Summary", append=TRUE, sep=",",
row.names=FALSE,col.names=FALSE)
}
This works, but I need to figure out a way to have a second column in my summary file that contains the original file name the data in that row came from for reference.
I tried making both a matrix and a data frame instead of going straight to the table writing, but in both cases I wasn't able to append the data and ended up with values from only the last file.
Any ideas would be greatly appreciated!
Here's what I would recommend to improve your current method, also going with fread() because it's very fast and has the select argument. Notice I have moved the write.table() call outside the for() loop. This allows a cleaner way of adding the new column of file names alongside the max column, and eliminates the need to append to the file on every iteration.
library(data.table)
fileNames <- Sys.glob("*.csv")
VelM <- numeric(length(fileNames))
for(i in seq_along(fileNames)) {
VelM[i] <- max(fread(fileNames[i], select = 8)[[1L]], na.rm = TRUE)
}
write.table(data.frame(VelM, fileNames), "Summary", sep = ",",
row.names = FALSE, col.names = FALSE)
If you want to quickly read files, you should consider using data.table::fread or readr::read_csv instead of base read.csv.
For example:
fileNames <- list.files(path = your_path, pattern='\\.csv') # instead of Sys.glob
library('data.table')
dt <- rbindlist(lapply(setNames(fileNames, fileNames), fread, select=8), idcol="id")
summary_dt <- dt[, .(max_val = max(your_var, na.rm=TRUE)), by = id]
write.table(summary_dt, 'yourfile.csv', sep=',', row.names=FALSE, col.names=FALSE)
Explanation: data.table::fread reads in only the select=8th column from each file (via lapply over fileNames, which returns a list of one-column data.tables). data.table::rbindlist then combines this list of data.tables into a single data.table, and its idcol argument (note that idcol belongs to rbindlist, not fread) adds a column identifying which list element each row came from. From the idcol documentation in ?rbindlist, note that
If input is a named list, ids are generated using them
lapply only keeps names its input already has, so wrapping fileNames in setNames(fileNames, fileNames) produces a named list; the id column then holds the file names themselves, which is an easy way of passing them along for grouping.
The rest is data.table syntax. It wasn't clear from your question whether there is a header row and whether you know the heading in advance. If so, you can either keep header=TRUE and use that header name in place of your_var, or use skip=1, header=FALSE and name the column yourself, as in the sketch below.
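A rough sketch of the second option, renaming the column with setnames() afterwards rather than through fread's col.names, just to keep the sketch simple ('your_var' is only an illustrative name):
dt <- rbindlist(lapply(setNames(fileNames, fileNames), fread, select=8, skip=1, header=FALSE), idcol="id")
setnames(dt, 2, "your_var")  # the single selected data column (column 1 is the id)
dt[, .(max_val = max(your_var, na.rm=TRUE)), by = id]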
Hi, so I have data in the following format
101,20130826T155649
------------------------------------------------------------------------
3,1,round-0,10552,180,yellow
12002,1,round-1,19502,150,yellow
22452,1,round-2,28957,130,yellow,30457,160,brake,31457,170,red
38657,1,round-3,46662,160,yellow,47912,185,red
and I have been reading, cleaning, and formatting them with this code
b <- read.table("sid-101-20130826T155649.csv", sep = ',', fill=TRUE, col.names=paste("V", 1:18,sep="") )
b$id<- b[1,1]
b<-b[-1,]
b<-b[-1,]
b$yellow <- b$V6
and so on
There are about 300 files like this, and ideally they would all be compiled without the first two lines, since the first line is just an id and I made a separate column to identify these data. Does anyone know how to read these tables quickly, clean and format them the way I want, then compile them into one large file and export it?
You can use lapply to read all the files, clean and format them, and store the resulting data frames in a list. Then use do.call to combine all of the data frames into single large data frame.
# Get vector of files names to read
files.to.load = list.files(pattern="csv$")
# Read the files
df.list = lapply(files.to.load, function(file) {
df = read.table(file, sep = ',', fill=TRUE, col.names=paste("V", 1:18,sep=""))
... # Cleaning and formatting code goes here
df$file.name = file # In case you need to know which file each row came from
return(df)
})
# Combine into a single data frame
df.combined = do.call(rbind, df.list)
I have a technical question in R:
how can I rowbind the following results (result1 and result2) into a data frame while keeping the column labels for both:
result1:
meanAUC.SIM meanCmax.SIM meanTmax.SIM AUC.OBS Cmax.OBS Tmax.OBS PE.AUC PE.Cmax PE.Tmax
777.4444 74.64377 4.551254 820.7667 73.46508 3.089009 5.278274 1.604416 47.33703
result2:
medianAUC.SIM medianCmax.SIM medianTmax.SIM AUC.OBS Cmax.OBS Tmax.OBS PE.AUC PE.Cmax PE.Tmax
764.6611 72.4534 4.5 795.765 68.2 3 3.908683 6.236657 50
The reason behind this is that I want to write them in a *.csv file in an organized way with the correct labeling.
If the only reason you want to combine the data frames is to write them to a csv file, then you can instead just write each data frame separately to the same csv file. For example:
write.table(result1, "myfile.csv", row.names=FALSE, sep=",")
# If you want a blank row between them
cat("\n", file = "myfile.csv", append = TRUE)
write.table(result2, "myfile.csv", row.names=FALSE, sep=",", append=TRUE)
Here's what the file looks like: the two tables sit one above the other in the same file, each with its own header row, separated by a blank line.
This is a very simple issue and I'm surprised that there are no examples online.
I have a vector:
vector <- c(1,1,1,1,1)
I would like to write this as a csv as a simple row:
write.csv(vector, file ="myfile.csv", row.names=FALSE)
When I open up the file I've just written, the csv is written as a column of values.
It's as if R decided to put in newlines after each number 1.
Forgive me for being ignorant, but I always assumed that the point of having comma-separated values was to express a sequence of values from left to right, separated by commas, much like ordinary written text. Why does R cling so desperately to the column format when a csv so clearly should be a row?
All linguistic philosophy aside, I have tried to use the transpose function. I've dug through the documentation. Please help! Thanks.
write.csv is designed for tabular data (matrices and data frames), and it treats a single vector as a table with a single column. Try making it into a matrix with one row and multiple columns and it should work as you expect.
write.csv(matrix(vector, nrow=1), file ="myfile.csv", row.names=FALSE)
Not sure what you tried with the transpose function, but that should work too.
write.csv(t(vector), file ="myfile.csv", row.names=FALSE)
Here's what I did:
cat("myVar <- c(",file="myVars.r.txt", append=TRUE);
cat( myVar, file="myVars.r.txt", append=TRUE, sep=", ");
cat(")\n", file="myVars.r.txt", append=TRUE);
This generates a text file that can immediately be re-loaded into R another day using:
source("myVars.r.txt")
Following up on what #Matt said, if you want a csv, try eol=",".
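A minimal sketch of that suggestion; each value is its own row, so ending every row with "," strings them onto one line (with a trailing comma at the end):
write.table(vector, "myfile.csv", eol=",", row.names=FALSE, col.names=FALSE)
# myfile.csv now contains: 1,1,1,1,1,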
I tried with this:
write.csv(rbind(vector), file ="myfile.csv", row.names=FALSE)
The output is now spread across columns (a single row of values), but with column names included.
This one seems to be better:
write.table(rbind(vector), file = "myfile.csv", row.names = FALSE, col.names = FALSE, sep = ",")
Now the .csv file contains a single row:
1,1,1,1,1
without column names.
write.table(vector, "myfile.csv", eol=" ", row.names=FALSE, col.names=FALSE)
You can simply change the eol to whatever you want. Here I've made it a space.
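Since each value is written as its own row, the space end-of-line simply strings them together on one line; roughly:
readLines("myfile.csv")
# [1] "1 1 1 1 1 "
# (readLines may warn about an incomplete final line, because the last value is not newline-terminated)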
You can use cat to append rows to a file. The following code would write a vector as a line to the file:
myVector <- c("a","b","c")
cat(myVector, file="myfile.csv", append = TRUE, sep = ",", eol = "\n")
This produces a comma-separated line, but with a trailing comma before the newline (cat() has no eol argument, so the newline has to be passed as one more value and sep is inserted in front of it), hence it is not a proper CSV file.
If you want a real CSV-file, use the solution given by #vamosrafa. The code is as follows:
write.table(rbind(myVector), file = "myfile.csv", row.names = FALSE, col.names = FALSE, sep = ",", append = TRUE)
The output will be like this:
"a","b","c"
If the function is called multiple times, it will add lines to the file.
One more:
write.table(as.list(vector), file ="myfile.csv", row.names=FALSE, col.names=FALSE, sep=",")
fwrite from data.table package is also another option:
library(data.table)
vector <- c(1,1,1,1,1)
fwrite(data.frame(t(vector)),file="myfile.csv",sep=",",row.names = FALSE)