Importing CSV files in R leads to unnecessary variables & observations - r

Here are the files I'm trying to import:
Data
Included are two files, an xlsx and a CSV, that represent the same dataset. Although they contain the same information, I get different results when I import them into R. With read_excel(file.choose()) I can import the xlsx file correctly, but if I use read.csv(file.choose(), sep=";") on the CSV file, I get unnecessary additional observations and variables. I only saved the Excel file as a comma-separated values file (.csv), so R should construct the same data frame from both. What did I do wrong?
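One way to check what is going on is sketched below. It assumes, without having seen the actual files, that Excel exported trailing empty cells and an empty row, which is a common source of extra columns named X, X.1 and of all-NA rows. The sample text and column names are made up for illustration.

```r
# Hypothetical CSV content: two real columns, two empty trailing columns,
# and one row that contains only separators.
csv_text <- c("id;score;;", "1;10;;", "2;20;;", ";;;")

# Read it the same way as in the question (semicolon-separated).
raw <- read.csv(text = csv_text, sep = ";")
print(dim(raw))    # includes the empty row and the empty columns
print(names(raw))  # the unnamed columns get automatic names

# Keep only columns and rows that contain at least one non-missing value.
keep_cols <- colSums(!is.na(raw)) > 0
keep_rows <- rowSums(!is.na(raw)) > 0
clean <- raw[keep_rows, keep_cols, drop = FALSE]
print(clean)
```

If the real file looks different, readLines() on the first few lines of the CSV shows which separator and which stray cells it actually contains.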

Related

How to use for loop to read and append multiple csv files in R?

I am a student and have just started learning R. I have 19 CSV files exported from Excel. I want to read the files one by one and append the new rows to a data frame. It is recommended to use functions from the tidyverse packages to read and import these files. The first 7 rows of each file are metadata, which need to be skipped. How can I do these steps inside a for loop?
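A minimal sketch of such a loop follows. It uses base R's read.csv so that it runs without extra packages; readr::read_csv from the tidyverse also takes a skip argument and could be swapped in. The folder, file names and columns are made up for the demo and stand in for the 19 real files.

```r
# Hypothetical setup: two small files, each with 7 metadata lines on top.
dir_path <- file.path(tempdir(), "csv_demo")
dir.create(dir_path, showWarnings = FALSE)
for (i in 1:2) {
  writeLines(
    c(paste("metadata line", 1:7), "id,value", paste0(i, ",", i * 10)),
    file.path(dir_path, paste0("file", i, ".csv"))
  )
}

# Collect every .csv file in the folder.
files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)

# Read each file, skipping the 7 metadata lines, and append its rows.
all_rows <- data.frame()
for (f in files) {
  one_file <- read.csv(f, skip = 7)
  all_rows <- rbind(all_rows, one_file)
}
print(all_rows)
```

Growing a data frame with rbind inside a loop is fine for 19 files; for many files it is usually faster to read them into a list and bind once at the end.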

Export large csv format data from R

I exported a very large dataset in CSV format from R, but since the maximum number of rows in Excel is 1,048,576, I had to split it into several CSV files in order to export it. That makes the files very inconvenient to handle afterwards. Is there an appropriate way to export nearly 10 times that many rows to a single CSV file from R?
Thank you very much.
If you must use Excel with the data, you can keep the huge csv files and query them from within Excel. You could alternatively start building a database with the csv files and query the database from within Excel. This link might help you.
https://support.office.com/en-us/article/Use-Microsoft-Query-to-retrieve-external-data-42a2ea18-44d9-40b3-9c38-4c62f252da2e
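As a small illustration that the row limit belongs to Excel and not to R or to the CSV format, the sketch below writes more than 1,048,576 rows to one file with write.csv. The data frame and file name are made up for the demo.

```r
# Hypothetical data: a single column with more rows than one Excel sheet holds.
n <- 1100000
big <- data.frame(id = seq_len(n))

# Write everything to a single CSV file in the temporary directory.
path <- file.path(tempdir(), "big_export.csv")
write.csv(big, path, row.names = FALSE)

# Count the lines in the file: one header line plus one line per row.
n_lines <- length(readLines(path))
print(n_lines)
```

The resulting file can be read back into R in one piece; only opening it directly in an Excel worksheet runs into the row limit.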

csv not retaining format after splitting in R

I have a csv file say named abc.csv where name of a column is Component.Number and there are 20 different components I'm working with. The csv file contains 2000 entries which are sorted by component number. I'm using the following code to split the csv into 20 csv files where one file contains only data corresponding to a particular Component.Number.
abc <- read.csv("abc.csv")
for (name in levels(abc$Component.Number)) {
  tmp <- subset(abc, Component.Number == name)
  # Create a new filename for each Component - the folder 'skews' should already exist in the same directory
  fn <- paste('skews/', gsub(' ', '', name), sep = '')
  # Save the CSV file containing separate expenses data for each Component
  write.table(tmp, fn, row.names = FALSE, sep = ",")
}
The code works and I get the split files in the "skews" folder, but the files are not in CSV format; in fact, they have no file type at all. I have also tried write.csv instead of write.table in the last line, but no luck. So, how do I get the split files in .csv format and then run the same R code on all of them using some kind of loop? The file names are the different component numbers. Thanks.
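The files have no type because the file name is built without an extension; neither write.table nor write.csv adds one. A minimal sketch of a fix is below, using a made-up data frame and a temporary folder in place of abc.csv and skews/. It also uses unique() instead of levels(), since from R 4.0.0 read.csv no longer turns strings into factors by default and levels() then returns NULL.

```r
# Hypothetical stand-in for the data read from abc.csv.
abc <- data.frame(
  Component.Number = c("C 1", "C 1", "C 2"),
  Expense = c(5, 7, 9)
)

# Hypothetical output folder standing in for 'skews'.
out_dir <- file.path(tempdir(), "skews")
dir.create(out_dir, showWarnings = FALSE)

# Write one file per component, adding the .csv extension explicitly.
for (name in unique(abc$Component.Number)) {
  tmp <- subset(abc, Component.Number == name)
  fn <- file.path(out_dir, paste0(gsub(" ", "", name), ".csv"))
  write.csv(tmp, fn, row.names = FALSE)
}

# Run the same code on every split file.
for (f in list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)) {
  part <- read.csv(f)
  print(nrow(part))
}
```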

Using a CSV file to create a stem plot

I'm new to R (and anything programming related), so I am still getting my head around what is actually happening.
I created a CSV file in Excel with one column consisting of names and the other column consisting of pretend exam scores for each name. I saved it from Excel as a CSV file.
So to import it into R I used the following command:
data1<- read.csv(file.choose(),header=F)
When I created the CSV file I didn't create any headers so the column for names is given the header V1 and the column for the exam scores is given the header V2.
So to create my stem plot I then use the command:
class_stem <- stem(data1$V2)
Is this the most efficient way to do this?
My real confusion starts when I import the data as a table. Should I even be importing it as a table, or just leave it as I have done? My purpose at this stage is just to create a stem-and-leaf plot.
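A small sketch of the same steps with named columns is below. The names and scores are made up, and read.csv(text = ...) stands in for file.choose() so the example runs on its own. Note that stem() prints the plot and returns NULL, so assigning its result to class_stem stores nothing.

```r
# Hypothetical file content: a name and a pretend exam score per line, no header.
csv_text <- c("Ann,67", "Bob,72", "Cy,85", "Di,91")

# header = FALSE because the file has no header row;
# col.names replaces the default V1 and V2 with descriptive names.
data1 <- read.csv(text = csv_text, header = FALSE,
                  col.names = c("name", "score"))

# Draw the stem-and-leaf plot from the numeric score column.
stem(data1$score)
```

Reading the file into a data frame with read.csv, as in the question, is a reasonable way to do this; the plot only needs the numeric column.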

Importing .csv files with Sys.Dates()

I have a .csv dataset that gets dumped every day, which I use to generate a daily list for tracking participants using an R script. I would like to automate this R script; however, in order to do so, I need to read in the .csv using Sys.Date().
The .csv dataset is named: DumpedList_2013-11-27 (The date will always be today's date).
I would like to import this into the script, like I would for a .Rdata file:
load(paste('/srv/Data/Baseline2/baseline2_', Sys.Date(), '.Rdata',sep=''))
What is the equivalent of the command above for reading in .csv files?
I have tried the load and read.csv commands, but I get error messages:
data=read.csv('P:/DirectoryPath/DumpedList_',Sys.Date(),'.csv')
I also tried creating todaydate=Sys.Date() and then using it to load the data, but got error messages again:
a=load(paste("P:/DirectoryPath/DumpedList_",todaydate,".csv"))
Any insight?
By default, paste separates its arguments with spaces; use paste0 to join strings together seamlessly:
read.csv(paste0('P:/DirectoryPath/DumpedList_',Sys.Date(),'.csv'))
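To see the difference, the sketch below builds the path both ways with a fixed date in place of Sys.Date(). It also shows why the attempts in the question fail: read.csv takes a single file path as its first argument (the extra arguments were being read as header and sep), and load only reads .Rdata files, not CSV.

```r
# Fixed date standing in for Sys.Date(), so the result is reproducible.
today <- as.Date("2013-11-27")

# paste inserts a space between each piece, which breaks the file name.
with_paste <- paste("P:/DirectoryPath/DumpedList_", today, ".csv")
print(with_paste)

# paste0 joins the pieces with no separator.
with_paste0 <- paste0("P:/DirectoryPath/DumpedList_", today, ".csv")
print(with_paste0)

# Only try to read the file if it actually exists on this machine.
if (file.exists(with_paste0)) {
  data <- read.csv(with_paste0)
}
```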
