CSV not retaining format after splitting in R

I have a CSV file, say abc.csv, in which one column is named Component.Number, and there are 20 different components I'm working with. The file contains 2000 entries sorted by component number. I'm using the following code to split it into 20 CSV files, each containing only the rows for a particular Component.Number.
abc <- read.csv("abc.csv")
for (name in levels(abc$Component.Number)) {
  tmp <- subset(abc, Component.Number == name)
  # Create a new filename for each Component - the folder 'skews' should already exist in the same directory
  fn <- paste('skews/', gsub(' ', '', name), sep = '')
  # Save the CSV file containing separate expenses data for each Component
  write.table(tmp, fn, row.names = FALSE, sep = ",")
}
The code runs fine and I'm getting split files in the "skews" folder, but they are not saved as CSV files; in fact they have no file extension at all. I have also tried write.csv instead of write.table on the last line, but no luck. So, how do I get the split files in .csv format, and then run the same R code on all of them using some kind of loop? The file names are the different component numbers. Thanks.
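One sketch of a fix: the output names simply never get a ".csv" extension, so append one when building the filename. (Note also that since R 4.0 read.csv returns character columns rather than factors, so unique() is a safer way to enumerate components than levels().) The demo data frame below stands in for abc.csv:

```r
# Demo input standing in for abc.csv (hypothetical component numbers)
abc <- data.frame(Component.Number = c("C1", "C1", "C2"),
                  expense          = c(10, 20, 30))

out <- file.path(tempdir(), "skews")
dir.create(out, showWarnings = FALSE)

for (name in unique(abc$Component.Number)) {   # unique() also works for character columns
  tmp <- subset(abc, Component.Number == name)
  fn  <- file.path(out, paste0(gsub(" ", "", name), ".csv"))  # append the .csv extension
  write.csv(tmp, fn, row.names = FALSE)
}

# Then loop back over the split files:
for (f in list.files(out, pattern = "\\.csv$", full.names = TRUE)) {
  dat <- read.csv(f)
  # ... process dat ...
}
```

write.csv is just write.table with sep = "," preset; the extension comes entirely from the filename you build, which is why the original code produced extensionless files.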

Related

Trying to create new columns using header information, add a column containing the file name and merge multiple csv files in R

I have only recently started using R and am now trying to automate some tasks with it. I have a task where I want to merge information from ~300 .csv files. Each file has the same format: information in a header section followed by data in standard columns.
I want to
Create a new column that contains the file name
Create columns that use header information (e.g. lot number) on each row in the file
Merge all csv files in a folder together.
I've seen bits of code that can merge csv files together using list.files(), lapply() and bind_rows(), but I'm struggling to get the header information into new columns before merging the csv files together.
[screenshot: sample of csv file]
Has anyone a solution to this?
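A base-R sketch of one approach, with made-up data rather than the asker's files: assume the header section occupies the first 4 lines and "Lot Number" sits on line 2 as "Lot Number,12345" (adjust the line positions to your real layout). A small reader function pulls the header values, reads the data block, and tacks on the new columns before everything is stacked:

```r
# Build a demo folder with one file shaped like the described layout
folder <- tempfile(); dir.create(folder)
writeLines(c("Report,A", "Lot Number,12345", "Operator,JD", "Date,2024-01-01",
             "x,y", "1,10", "2,20"),
           file.path(folder, "run1.csv"))

read_one <- function(path) {
  header <- readLines(path, n = 4)             # the 4-line header section
  df <- read.csv(path, skip = 4)               # data block starts after it
  df$lot_number  <- sub(".*,", "", header[2])  # value after the comma on line 2
  df$source_file <- basename(path)             # new column with the file name
  df
}

files  <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
merged <- do.call(rbind, lapply(files, read_one))
</code>
```

With dplyr loaded, bind_rows(lapply(files, read_one)) works the same way and copes better if files ever differ in columns.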

Importing CSV files in R leads to unnecessary variables & observations

Here are the files I'm trying to import:
Data
Included are two files, .xlsx and .csv, that represent the same dataset. Although they contain the same information, I get different results when I import them into R. Via the read_excel(file.choose()) command I can import the xlsx file correctly, but if I use the read.csv(file.choose(), sep=";") command on the CSV file, I get unnecessary additional observations and variables. I only saved the Excel file as a comma-separated values file (.csv), so R should construct the same data frame. What did I do wrong?
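One common cause, sketched on a made-up file rather than the asker's data: in many locales Excel's "CSV" export actually uses semicolons with a trailing separator on every line, which creates a phantom empty column; read.csv2() is the base-R reader built for that dialect (sep = ";", dec = ","), and wholly empty columns can be dropped afterwards:

```r
# Hypothetical demo file: semicolon-separated with a trailing ";" per line
f <- tempfile(fileext = ".csv")
writeLines(c("name;value;", "a;1;", "b;2;"), f)

df <- read.csv2(f)   # read.csv2 presets sep = ";" and dec = ","

# The trailing ";" produced an extra column of NAs; drop fully empty columns
empty <- sapply(df, function(x) all(is.na(x) | x == ""))
df <- df[, !empty, drop = FALSE]   # now just the real columns
```

If the extra rows rather than columns are the problem, inspect the raw file with readLines() first; stray blank lines at the end of an Excel export behave the same way.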

Merge all files using R in a directory while removing the headers and more

I need to merge multiple .csv files while removing the header row from every file except the first, in RStudio. All the files have the same number of columns, and I just need to stack the rows from each file.
However, here is the complicated part, or at least what I think is complicated. The way this data is produced, each file is in its own folder. So if I have 100 files, I have 100 individual folders, each containing one file. The folders are named by day, and so are the files; the only part of the file name that changes is the date. For example, I'll have a folder named "20160420" with a file inside named "20160420_file"; the next would be a folder named "20160419" with "20160419_file" inside, and so on. Each file has a header row, below which is a day's worth of data at one-minute intervals.
The machines archive data every day. We have over 100 machines, and each machine has been producing these files for the past 8 years, so you can imagine how many files there are and just how long this would take manually.
How would I write R code in RStudio to combine all these files into one and remove the duplicate header rows? Any ideas or help would be greatly appreciated.
You can use list.files() or dir() with argument full.names = TRUE and recursive = TRUE to get a vector of file names with paths from across multiple directories.
files <- dir(path = "c:/", pattern = "csv", full.names = TRUE, recursive = TRUE)
Then you can use a loop of some sort to process the files, for example
library(plyr)
allData <- ldply(files, read.csv)
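The same idea works in base R without plyr. Note that read.csv consumes each file's header row itself, so duplicate headers never reach the merged result. A self-contained sketch, rebuilding the one-file-per-dated-folder layout from the question with demo data:

```r
# Recreate the described layout: one dated folder per file
root <- tempfile()
d1 <- file.path(root, "20160419"); d2 <- file.path(root, "20160420")
dir.create(d1, recursive = TRUE);  dir.create(d2, recursive = TRUE)
write.csv(data.frame(t = 1:2, v = c(10, 20)),
          file.path(d1, "20160419_file.csv"), row.names = FALSE)
write.csv(data.frame(t = 3:4, v = c(30, 40)),
          file.path(d2, "20160420_file.csv"), row.names = FALSE)

# recursive = TRUE walks into every day's folder; each read.csv call
# strips that file's header row, so no duplicate headers survive
files   <- dir(root, pattern = "\\.csv$", full.names = TRUE, recursive = TRUE)
allData <- do.call(rbind, lapply(files, read.csv))
```

For hundreds of thousands of files, data.table::fread with data.table::rbindlist does the same job considerably faster.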

read a selected column from multiple csv files and combine into one large file in R

Hi,
My task is to read selected columns from over 100 identically formatted .csv files in a folder and cbind them into one large file using R. I have attached a screenshot of a sample data file in this question.
This is the code I'm using:
filenames <- list.files(path="G:\\2014-02-04")
mydata <- do.call("cbind", lapply(filenames, read.csv, skip = 12))
My problem is that in each .csv file the first column is the same, so my code creates a big file with duplicate first columns... How can I create a big file with just a single column A (no duplicates)? And I would like to name the second column read from each .csv file using the value of cell B7, which is the specific timestamp of that file.
Can someone help me on this?
Thanks.
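A sketch of one way to do this, on made-up files rather than the asker's data: assume each file's first column is a shared key, cell B7 (row 7, column B) holds the timestamp, and the data table starts after 12 header lines as in the question. A reader function grabs B7 for the column name, and merge() on the first column replaces cbind so the key appears only once:

```r
# Build two demo files shaped like the described layout
folder <- tempfile(); dir.create(folder)
for (i in 1:2) {
  lines <- c(rep("meta,info", 6),
             paste0("timestamp,2014-02-04 0", i, ":00"),  # line 7: B7 holds the stamp
             rep("meta,info", 5),
             "key,value", "a,1", "b,2")                   # data starts at line 13
  writeLines(lines, file.path(folder, paste0("file", i, ".csv")))
}

read_one <- function(path) {
  stamp <- read.csv(path, header = FALSE, skip = 6, nrows = 1)[[2]]  # cell B7
  df    <- read.csv(path, skip = 12)      # the data table after the 12 header lines
  names(df)[2] <- as.character(stamp)     # name the value column after the timestamp
  df
}

files  <- list.files(folder, full.names = TRUE)
merged <- Reduce(function(x, y) merge(x, y, by = 1), lapply(files, read_one))
```

merge(by = 1) joins on the shared first column, so the result has one key column plus one timestamp-named column per file; cbind would instead repeat the key for every file.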

How do I merge the headers from one csv file with another csv file in R?

What I'm trying to ask is, how would I use the headers from one csv file as the headers for another csv file? It would be kind of like a merge, except the first csv file has JUST headers, and the second csv file has JUST data.
Something as simple as this will work
dn <- read.csv("d-names.txt")
dd <- read.csv("d-data.txt", header = FALSE)
names(dd) <- names(dn)
Just assign the names from one data.frame to the other, making sure the two files have exactly the same number of columns.
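The same can be done in a single read via read.csv's col.names argument, which applies the names as the data file is parsed. A self-contained sketch with inline demo files standing in for d-names.txt and d-data.txt:

```r
# Demo stand-ins: a header-only file and a data-only file
fn <- tempfile(); fd <- tempfile()
writeLines("id,score", fn)          # names file: just the header row
writeLines(c("1,90", "2,85"), fd)   # data file: no header row

hdr <- names(read.csv(fn))                          # read the header-only file
dd  <- read.csv(fd, header = FALSE, col.names = hdr)  # apply names while reading
```

Applying the names at read time also lets read.csv pick sensible column classes for the named columns in one step.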
