I'm using RStudio and I want to import CSV data.
The data has 3 columns, separated by ",".
I typed test <- read.csv("data1.csv", sep=",")
The data is imported, but as just ONE column.
The headers are read, but they too (all 3 of them) are combined into a single column.
If I set header=FALSE, the only heading is V1, so there really is just one column.
Why is my separator not working?
Try read_csv() from the readr package (readr is now on CRAN, so install.packages("readr") works as well):
install.packages("devtools")
devtools::install_github("hadley/readr")
With your sample input:
library(readr)
file <- 'Alter.des.Hauses,"Quadratfuß","Marktwert"\n33,1.812,"$90.000,00"\n'
read_csv(file) # for the actual file, use read_csv("file.csv") ...
read_csv2(file)
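A common cause of the one-column symptom is that the file on disk is actually semicolon-separated (typical in locales that use a decimal comma, which the German headers above suggest). A small self-contained sketch, using a temp file as a stand-in for data1.csv:

```r
# Write a semicolon-separated sample to a temp file (stand-in for data1.csv)
f <- tempfile(fileext = ".csv")
writeLines(c("a;b;c", "1;2;3", "4;5;6"), f)

# With sep="," there are no commas to split on, so everything is one column
ncol(read.csv(f, sep = ","))   # 1

# read.csv2() defaults to sep=";" (and dec=","), splitting correctly
df <- read.csv2(f)
ncol(df)                       # 3
```

read_csv2() from readr makes the same semicolon/decimal-comma assumption, which is why it is shown above.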
I tried to import the CSV file from here: https://covid19.who.int/WHO-COVID-19-global-table-data.csv using the read.csv function:
WHO_data <- read.csv("https://covid19.who.int/WHO-COVID-19-global-table-data.csv")
But the WHO_data I got has only 12 columns, and the first column is treated as row names.
I tried another method, reading the data as a tibble instead of a data.frame:
library(readr)
WHO_data <- read_csv("https://covid19.who.int/WHO-COVID-19-global-table-data.csv")
It then gives the warning below:
Warning: 1 parsing failure.
row col expected actual file
1 -- 12 columns 13 columns 'https://covid19.who.int/WHO-COVID-19-global-table-data.csv'
Can anyone help me explain why this happens and how to fix this?
The file seems to be improperly formatted: there is an extra comma at the end of the second line. You can read the raw lines, remove the trailing comma, and then pass the result to read.csv. For example:
file <- "https://covid19.who.int/WHO-COVID-19-global-table-data.csv"
rows <- readLines(file)
rows[2] <- gsub(",$", "", rows[2])
WHO_data <- read.csv(text=rows)
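The same trailing-comma repair can be checked on an inline sample, no network needed (the three-column data here is made up for illustration):

```r
# The second line has a spurious trailing comma, as in the WHO file
rows <- c("a,b,c", "1,2,3,", "4,5,6")
rows[2] <- gsub(",$", "", rows[2])  # strip the comma at the end of line 2
df <- read.csv(text = rows)
dim(df)   # 2 rows, 3 columns
```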
Here is another solution based on the data.table package. If you want fread to return a data.frame (as opposed to a data.table), you can additionally pass the argument data.table=FALSE:
library(data.table)
file <- "https://covid19.who.int/WHO-COVID-19-global-table-data.csv"
WHO_data <- fread(file, select=1:12, fill=TRUE)
I have a CSV file with data like this:
firstcolumn secondcolumn
text1 freetext 1
text2 freetext 2
When I read the CSV file, I use this:
df <- read.csv("C:/Users/Desktop/testfile.csv", header=TRUE, sep=",")
Is there any parameter I should include so that every value in the second column is read as chr?
I am assuming that when you do read.csv, the second column is read in as a factor.
You can do this to cross check:
class(df$secondcolumn)
Now, if you want to convert it to characters, I can think of two ways. The first one does not always work for me, but the second one does.
First one:
stringsAsFactors needs to be set to FALSE (note that since R 4.0.0 this is already the default):
df <- read.csv("C:/Users/Desktop/testfile.csv", header=TRUE, sep=",", stringsAsFactors=FALSE)
Second one:
If the first method does not work, you can do this manually by converting the particular column to character:
df$secondcolumn <- as.character(df$secondcolumn)
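A third option is the colClasses argument of read.csv(), which fixes the column types at read time. A sketch with the question's sample written to a temp file (standing in for the real C:/Users/Desktop/testfile.csv):

```r
# Recreate the question's sample so the example is self-contained
f <- tempfile(fileext = ".csv")
writeLines(c("firstcolumn,secondcolumn",
             "text1,freetext 1",
             "text2,freetext 2"), f)

# colClasses forces both columns to be read as character vectors
df <- read.csv(f, colClasses = c("character", "character"))
class(df$secondcolumn)   # "character"
```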
Can you use the import wizard in RStudio? You can specify all the formats there, and have the wizard generate the code.
I have two files. One file (CSV) contains the data, and the second contains the header for the data (in one column). I need to combine both files and get a data.frame with the data from the first file and the header from the second. How can this be done?
Reduced sample. Data file:
10;21;36
7;56;543
7;7;7
7890;1;1
Header file:
height
weight
light
I need a data.frame as if read from this CSV file:
height;weight;light
10;21;36
7;56;543
7;7;7
7890;1;1
You could use the col.names argument in read.table() to read the header file as the column names in the same call used to read the data file.
read.table(datafile, sep = ";", col.names = scan(headerfile, what = ""))
As #chinsoon12 shows in the comments, readLines() could also be used in place of scan().
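A self-contained run of the same call, with the question's sample written to temp files standing in for the real datafile and headerfile:

```r
datafile <- tempfile(); headerfile <- tempfile()
writeLines(c("10;21;36", "7;56;543", "7;7;7", "7890;1;1"), datafile)
writeLines(c("height", "weight", "light"), headerfile)

# scan() returns the header file's lines as a character vector of names
df <- read.table(datafile, sep = ";", col.names = scan(headerfile, what = ""))
names(df)   # "height" "weight" "light"
dim(df)     # 4 rows, 3 columns
```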
We can read both files with header=FALSE and set the column names of the first dataset from the first column of the second dataset.
df1 <- read.csv("firstfile.csv", sep=";", header=FALSE)
df2 <- read.csv("secondfile.csv", header=FALSE)
colnames(df1) <- as.character(df2[,1])
I want to import an Excel file into R. The file has column headers such as Jan-13, Jan-14, and so on. When I import the data using the readxl package, it converts these dates into numbers by default, so the columns that should be dates are now numbers.
I am using the code :
library(readxl)
data = read_excel("FileName", col_names = TRUE, skip = 0)
Can someone please help?
The date information is still there; it is just in the wrong format. Excel (on Windows) stores dates as the number of days since 1899-12-30, so converting the numeric names back should work:
names(data) <- format(as.Date(as.numeric(names(data)), origin="1899-12-30"), "%b-%y")
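To see why the origin matters: a header like Jan-13 arrives via readxl as an Excel serial number such as 41275 (the serial for 2013-01-01). A quick check of the conversion:

```r
# 41275 is the Excel (Windows) serial number for 2013-01-01
d <- as.Date(41275, origin = "1899-12-30")
d                   # "2013-01-01"
format(d, "%b-%y")  # "Jan-13" in an English locale
```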
I was hoping there might be a way to do this, but after trying for a while I have had no luck.
I am working with a datafile (.csv format) that is being supplied with multiple tables in a single file. Each table has its own header row, and data associated with it. Is there a way to import this file and create separate data frames for each header/dataset?
Any help or ideas that can be provided would be greatly appreciated.
A sample of the datafile and its structure can be found Here
When trying to use read.csv I get the following error:
"Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names"
Read the help for read.table:
nrows: the maximum number of rows to read in
skip: the number of lines to skip before beginning to read data
You can parse your file as follows:
first <- read.table(myFile, nrows=2)
second <- read.table(myFile, skip=3, nrows=2)
third <- read.table(myFile, skip=6, nrows=8)
You can always automate this by using grep() to search for the table separators.
You can also read the table using fill=TRUE, and then split out the tables afterwards.
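A sketch of that automation, assuming the sub-tables are separated by blank lines (the linked sample may use a different separator, in which case the grep() pattern would change):

```r
# Build a small multi-table file: two tables separated by a blank line
f <- tempfile()
writeLines(c("a b", "1 2", "3 4",
             "",
             "x y z", "5 6 7", "8 9 10"), f)

rows <- readLines(f)
# Blank-line indices mark the boundaries between tables
breaks <- c(0, grep("^\\s*$", rows), length(rows) + 1)
tables <- lapply(seq_len(length(breaks) - 1), function(i) {
  chunk <- rows[(breaks[i] + 1):(breaks[i + 1] - 1)]
  read.table(text = chunk, header = TRUE)
})
length(tables)        # 2 data frames
names(tables[[2]])    # "x" "y" "z"
```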