Load header for data.frame from file - r

I have two files. One file (csv) contains the data, and the second contains the header for the data (in one column). I need to combine both files and get a data.frame with the data from the first file and the header from the second file. How can it be done?
Reduced sample. Data file:
10;21;36
7;56;543
7;7;7
7890;1;1
Header file:
height
weight
light
I need the same data.frame I would get from this csv file:
height;weight;light
10;21;36
7;56;543
7;7;7
7890;1;1

You could use the col.names argument in read.table() to read the header file as the column names in the same call used to read the data file.
read.table(datafile, sep = ";", col.names = scan(headerfile, what = ""))
As @chinsoon12 shows in the comments, readLines() could also be used in place of scan().
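For a self-contained check of the readLines() variant, here is a minimal sketch. The temporary file paths are assumptions made only for illustration; the file contents are the reduced sample from the question.

```r
# Stand-ins for the two files in the question (temporary paths are hypothetical).
datafile <- tempfile(fileext = ".csv")
headerfile <- tempfile(fileext = ".txt")
writeLines(c("10;21;36", "7;56;543", "7;7;7", "7890;1;1"), datafile)
writeLines(c("height", "weight", "light"), headerfile)

# readLines() returns one string per line of the header file,
# which is the character vector that col.names expects.
df <- read.table(datafile, sep = ";", col.names = readLines(headerfile))
print(df)
```

If the header file may contain blank lines or surrounding whitespace, scan(headerfile, what = "") is likely the more forgiving choice, since it splits on whitespace.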

We can read both the datasets with header=FALSE and change the column names with the first column of second dataset.
df1 <- read.csv("firstfile.csv", sep=";", header=FALSE)
df2 <- read.csv("secondfile.csv", header=FALSE)
colnames(df1) <- as.character(df2[,1])

Related

Add filename as column header when combining multiple files

I am combining 1 column of data from multiple/many text files into a single CSV file. This part is fine with the code I have. However, after importing, I would like to have the filename (e.g., "roth_Aluminusa_E1.0.DPT") as the column header for the data column taken from that file. I know similar questions have been asked, but I can't work it out. Thanks for any help :-)
Code I am using which works for combining the files:
files3 <- list.files()
WAVELENGTH <- read.table(files3[1], header=FALSE, sep=",")[ ,1]
TEST9 <- do.call(cbind,lapply(files3,function(fn)read.table(fn, header=FALSE, sep = ",")[ , 2]))
TEST10 <- cbind(WAVELENGTH, TEST9)
You can do the following to add column names to TEST10. This assumes the column name you want for the first column is files3[1].
colnames(TEST10) <- c(files3[1], files3)
In case you want to keep the name of the first column as is, then we add the desired column names before binding WAVELENGTH with TEST9.
colnames(TEST9) <- files3
TEST10 <- cbind(WAVELENGTH, TEST9)
Then you can write to a csv as usual, keeping the column names as headers in the resulting file.
write.csv(TEST10, file = "TEST10.csv", row.names = FALSE)
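The second variant can be tried end to end with the sketch below. The two small files, their names, and the temporary directory are made up for illustration. Because list.files() is called with full.names = TRUE here, basename() is used so that only the filename ends up in the header; that step is an addition, not part of the original answer.

```r
# Two small stand-in files in a temporary directory (names and values are made up).
spectra_dir <- tempfile("spectra")
dir.create(spectra_dir)
writeLines(c("400,0.1", "500,0.2"), file.path(spectra_dir, "a.DPT"))
writeLines(c("400,0.3", "500,0.4"), file.path(spectra_dir, "b.DPT"))

files3 <- list.files(spectra_dir, full.names = TRUE)

# First column of the first file, second column of every file.
WAVELENGTH <- read.table(files3[1], header = FALSE, sep = ",")[, 1]
TEST9 <- do.call(cbind, lapply(files3, function(fn) {
  read.table(fn, header = FALSE, sep = ",")[, 2]
}))

# Name the data columns after the files, dropping the directory part.
colnames(TEST9) <- basename(files3)
TEST10 <- cbind(WAVELENGTH, TEST9)
print(TEST10)
```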

How to write data to csv file using fwrite in R

One of the columns in my dataframe contains a semicolon (;), and when I try to write the dataframe to a csv using fwrite, that value is split into different columns.
Ex: Input: abcd;#6. After downloading it becomes: 1st column: abcd,
2nd column: #6
I want both to be in the same column.
Could you please suggest how to write the value into a single column?
I am using below code to read the input file:
InpData <- read.table(File01, header=TRUE, sep="~", stringsAsFactors = FALSE,
                      fill=TRUE, quote="", dec=",", skipNul=TRUE, comment.char="")
while for writing:
fwrite(InpData, File01, col.names=T, row.names=F, quote = F, sep="~")
You didn't give us an example, but it is possible you need to use a separator other than ";":
fwrite(x, file = "", sep = ",")
sep: The separator between columns. Default is ",".
If this simple solution does not work, we need the data to reproduce your problem.

How can write.csv and read.csv preserve the data format?

I have a dataframe in which the majority of the columns are numeric. I use this command to write it to a csv file:
write.csv(df, "mydf.csv", row.names=FALSE, na="")
Then later I read in the file using:
df = read.csv("mydf.csv", header = F, sep = ",", stringsAsFactors=F,dec=".")
Then, when I checked the data format using
sapply(df, class)
all the columns had changed to character. If I don't set stringsAsFactors=F,
all the columns change to factor.
I could manually convert the columns to numeric later. I just wonder whether there is a way to preserve the data format, at least for the majority of the columns, when writing or reading the csv file. Are there better solutions?
By default write.csv will include headers, but when you import the file you are telling R that there are no headers. The header row is then read as data, and because those headers are non-numeric, every column switches to character. So either turn off the headers when writing (write.csv() itself will not drop them, so this means using write.table() with col.names=FALSE) or set header=TRUE in the read.csv().
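The effect can be reproduced with a small round trip. This is a minimal sketch; the example data frame and the temporary path are assumptions, not the asker's data.

```r
# A small data frame with numeric, integer and character columns (made-up values).
df <- data.frame(x = c(1.5, 2, NA),
                 y = c(10L, 20L, 30L),
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE, na = "")

# header = FALSE: the header row is read as data, so every column becomes character.
wrong <- read.csv(path, header = FALSE, stringsAsFactors = FALSE)
# header = TRUE: the first row supplies the names and the column types are inferred.
right <- read.csv(path, header = TRUE, stringsAsFactors = FALSE)

print(sapply(wrong, class))
print(sapply(right, class))
```

Note that a CSV file stores no type information, so the classes on reading are always inferred from the text. If exact types matter, the colClasses argument of read.csv() or a binary format such as saveRDS()/readRDS() may be a better fit.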

R - Importing Multiple Tables from a Single CSV file

I was hoping there may be a way to do this, but after trying for a while I have had no luck.
I am working with a datafile (.csv format) that is being supplied with multiple tables in a single file. Each table has its own header row, and data associated with it. Is there a way to import this file and create separate data frames for each header/dataset?
Any help or ideas that can be provided would be greatly appreciated.
A sample of the datafile and its structure can be found here.
When trying to use read.csv I get the following error:
"Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names"
Read the help for read.table:
nrows: number of rows to parse
skip: number of rows to skip
You can parse your file as follows:
first <- read.table(myFile, nrows=2)
second <- read.table(myFile, skip=3, nrows=2)
third <- read.table(myFile, skip=6, nrows=8)
You can always automate this by using grep() to search for the table separators.
You can also read the table using fill=TRUE, and then split out the tables afterwards.
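One way the automated split could look is sketched below. The sample file linked in the question is not available, so this assumes the tables are separated by blank lines and that each table starts with its own header row. The file contents are made up, and grepl() is used as the logical-vector counterpart of grep().

```r
# Stand-in for the multi-table file; the blank-line separator is an assumption.
myFile <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,a", "2,b",
             "",
             "x,y,z", "1,2,3", "4,5,6", "7,8,9"), myFile)

txt <- readLines(myFile)

# Mark the separator lines, then give every non-blank line the index of its table:
# the running count of blank lines seen so far.
is_blank <- grepl("^[[:space:]]*$", txt)
table_id <- cumsum(is_blank)[!is_blank]
blocks <- split(txt[!is_blank], table_id)

# Parse each block of lines as its own data frame.
tables <- lapply(blocks, function(b) read.csv(text = b, header = TRUE))
print(tables)
```

If the real file uses a different separator (a title line, a row of dashes), only the pattern passed to grepl() should need to change.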

Write different data frame in one .csv file with R

I have 3 data frames, and I want them to be written to one single .csv file, one above the other, not in the same table. So, 3 different tables in one csv file. They all have the same size.
The problem with write.csv: it does not contain "append" feature
The problem with write.table: csv files from write.table are not read prettily by Excel 2010 like those from write.csv
Posts I have already read, in which I could not find a solution to my problem:
write.csv() a list of unequally sized data.frames
Creating a file with more than one data frame
Solution ?
write.csv just calls write.table under the hood, with appropriate arguments. So you can achieve what you want with 3 calls to write.table.
write.table(df1, "filename.csv", col.names=TRUE, sep=",")
write.table(df2, "filename.csv", col.names=FALSE, sep=",", append=TRUE)
write.table(df3, "filename.csv", col.names=FALSE, sep=",", append=TRUE)
Actually, you could avoid the whole issue by combining your data frames into a single df with rbind, then calling write.csv once.
write.csv(rbind(df1, df2, df3), "filename.csv")
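If each table should keep its own header row and be separated from the next by a blank line, the write.table() approach can be put in a loop. This is a minimal sketch: the three data frames are slices of the built-in iris data and the output path is temporary, both chosen only for illustration.

```r
# Three stand-in data frames of the same size (slices of the built-in iris data).
df1 <- iris[1:3, ]
df2 <- iris[20:22, ]
df3 <- iris[148:150, ]
tables <- list(df1, df2, df3)

out <- tempfile(fileext = ".csv")
for (i in seq_along(tables)) {
  # append = TRUE together with col.names = TRUE triggers an
  # "appending column names to file" warning, which is silenced here on purpose.
  suppressWarnings(
    write.table(tables[[i]], out, sep = ",", row.names = FALSE,
                col.names = TRUE, append = i > 1)
  )
  # Blank line after each table so the tables stay visually separate.
  cat("\n", file = out, append = TRUE)
}

cat(readLines(out), sep = "\n")
```

Unlike write.csv(), write.table() escapes embedded quotes with a backslash by default; passing qmethod = "double" should match the write.csv() behaviour if the data contain quote characters.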
We use sink files:
# Sample dataframes:
df1 = iris[1:5, ]
df2 = iris[20:30, ]
# Start a sink file with a CSV extension
sink('multiple_df_export.csv')
# Write the first dataframe, with a title and final line separator
# (the title ends in '\n' so it does not run into the CSV header row)
cat('This is the first dataframe\n')
write.csv(df1)
cat('____________________________')
cat('\n')
cat('\n')
# Write the 2nd dataframe to the same sink
cat('This is the second dataframe\n')
write.csv(df2)
cat('____________________________')
# Close the sink
sink()
