R: date/time format issue when reading an Excel (xlsx) file

When I read a date/time column from an Excel file, the following problem occurs.
Any help would be greatly appreciated.
library(readxl)
test <- read_excel('test.xlsx')
Data to read
2017-03-03
2017-03-04
2017-03-05
2017-03-06
2017-03-07
2017-03-08
2017-03-09
2017-03-10
1010-01-01
After loading into R
test
A tibble: 9 x 1
test1
1 42797
2 42798
3 42799
4 42800
5 42801
6 42802
7 42803
8 42804
9 1010-01-01

Try defining the column type in the function call:
read_excel("test.xlsx", col_types = "date")
It looks like some cells are formatted as dates in Excel and others probably as text. Setting the correct format for that column in Excel should fix it too.
EDIT: There was a screenshot in the question that hinted that the data wasn't in the same format in all cells (the content was aligned differently). It has since been deleted.
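If fixing the spreadsheet is not an option, the mixed column can also be repaired after import. Below is a minimal sketch in base R. It assumes the column was read as text and that the serial numbers follow Excel's default 1900 date system, which corresponds to an origin of 1899-12-30 in R. The vector raw is a hand-typed copy of a few values from the question, not actual read_excel output.

```r
# Values as they appear in the tibble when the column is read as text
raw <- c("42797", "42798", "42804", "1010-01-01")

# Entries made up only of digits are treated as Excel serial day numbers
is_serial <- grepl("^[0-9]+$", raw)

# Start with an all-NA Date vector and fill it in two passes
dates <- rep(as.Date(NA), length(raw))
dates[is_serial]  <- as.Date(as.numeric(raw[is_serial]), origin = "1899-12-30")
dates[!is_serial] <- as.Date(raw[!is_serial], format = "%Y-%m-%d")

format(dates)
```

The last value probably stayed text because Excel's date system starts in 1900, so a date in the year 1010 cannot be stored as a serial number.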

Related

Opening csv file correctly

I am trying to use this dataset: wine_quality_dataset
I am running the following function:
data2 <- read.table("C:/Users/Magda/Downloads/winewhite.csv")
And here is what I got:
head(data2)
V1
1 fixed acidity;volatile acidity;citric acid;residual sugar;chlorides;free sulfur dioxide;total sulfur dioxide;density;pH;sulphates;alcohol;quality
2 7;0.27;0.36;20.7;0.045;45;170;1.001;3;0.45;8.8;6
3 6.3;0.3;0.34;1.6;0.049;14;132;0.994;3.3;0.49;9.5;6
4 8.1;0.28;0.4;6.9;0.05;30;97;0.9951;3.26;0.44;10.1;6
5 7.2;0.23;0.32;8.5;0.058;47;186;0.9956;3.19;0.4;9.9;6
6 7.2;0.23;0.32;8.5;0.058;47;186;0.9956;3.19;0.4;9.9;6
What command should I use to read the CSV file correctly?
Try
readr::read_csv("C:/Users/Magda/Downloads/winewhite.csv")
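readr is part of the tidyverse, a collection of packages that help you tidy up data.
If the file uses a semicolon ; as the separator (common in European-format CSVs), use
readr::read_csv2("C:/Users/Magda/Downloads/winewhite.csv")
Note that read_csv2 also expects a comma as the decimal mark. This file uses a dot, so readr::read_delim with delim = ";" is the safer choice here.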
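The same file can also be read with base R alone by passing the separator explicitly. The sketch below assumes nothing beyond base R. The temporary file stands in for winewhite.csv and holds only the header and the first two data rows shown in the question.

```r
# Recreate a small semicolon-separated sample of the file from the question
sample_lines <- c(
  "fixed acidity;volatile acidity;citric acid;residual sugar;chlorides;free sulfur dioxide;total sulfur dioxide;density;pH;sulphates;alcohol;quality",
  "7;0.27;0.36;20.7;0.045;45;170;1.001;3;0.45;8.8;6",
  "6.3;0.3;0.34;1.6;0.049;14;132;0.994;3.3;0.49;9.5;6"
)
path <- tempfile(fileext = ".csv")
writeLines(sample_lines, path)

# header = TRUE uses the first line as column names; sep = ";" splits the fields
wine <- read.table(path, header = TRUE, sep = ";")

dim(wine)    # rows and columns actually parsed
names(wine)  # spaces in the header become dots, e.g. fixed.acidity
```

read.csv(path, sep = ";") should behave the same way, since read.csv is a wrapper around read.table with header = TRUE by default.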

How to create a timeBased file in R

I thought I would post here since I have spent hours trying to figure this out. So I'm working with a csv file with Date and Closing return price. However, I can't get the file to be "timeBased." (timeBased function is from package xts). For example:
timeBased(dfx)
[1] FALSE
Here is what I have:
dfx = xts(aus$AUS, order.by=as.Date(aus$DATE))
and here is what the first 10 rows of the file look like:
DATE AUS
1 12/1/1988 -0.0031599720
2 12/2/1988 -0.0015724670
3 12/5/1988 -0.0000897619
4 12/6/1988 -0.0022670620
5 12/7/1988 0.0052895550
6 12/8/1988 -0.0048259860
7 12/9/1988 0.0106990910
8 12/12/1988 0.0033538810
9 12/13/1988 0.0118568700
10 12/14/1988 -0.0050105200
If anyone can help, I would appreciate it! I tried multiple approaches using zoo and other edits, but nothing worked. Thank you!
As Joshua Ulrich points out, using the timeBased function with an xts object should be expected to return FALSE. In addition to that, there may be another problem with your code. Assuming that your example displays the contents of aus, then aus$DATE is actually a factor or character data, not a Date object. To properly convert to an xts object, you'll have to specify the date format of the aus$DATE data. To convert and then test whether dfx is an xts object, you could use the following code:
dfx = xts(aus$AUS, order.by=as.Date(aus$DATE, "%m/%d/%Y"))
dfx
[,1]
1988-12-01 -0.0031599720
1988-12-02 -0.0015724670
1988-12-05 -0.0000897619
1988-12-06 -0.0022670620
timeBased(dfx)
[1] FALSE
is.xts(dfx)
[1] TRUE
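The date conversion can be checked on its own before xts is involved. This is a small sketch, assuming aus is a data frame shaped like the rows shown in the question. The data frame is retyped by hand, and the xts part only runs if the package is installed.

```r
# Hand-typed copy of the first rows from the question
aus <- data.frame(
  DATE = c("12/1/1988", "12/2/1988", "12/5/1988", "12/6/1988"),
  AUS  = c(-0.0031599720, -0.0015724670, -0.0000897619, -0.0022670620),
  stringsAsFactors = FALSE
)

# An explicit format tells as.Date that the strings are month/day/year
d <- as.Date(aus$DATE, format = "%m/%d/%Y")
class(d)   # "Date"
format(d)  # ISO-formatted dates, starting with 1988-12-01

# Optional: build the xts object and test the object and its index separately
if (requireNamespace("xts", quietly = TRUE)) {
  dfx <- xts::xts(aus$AUS, order.by = d)
  print(xts::is.xts(dfx))                 # tests the object itself
  print(xts::timeBased(zoo::index(dfx)))  # tests the Date index, not the xts object
}
```

As far as I understand timeBased, it checks whether its argument is itself a date or time class, which is why it is more informative on the index than on the xts object.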

What kind of files are suitable to be read in R? [duplicate]

Possible Duplicate:
Read an Excel file directly from a R script
I made an Excel file, I named it test.xlsx. I wanted to read the file in R.
date price
1 34
2 34.5
3 34
4 34
5 35
6 34.5
7 36
When I ran
x = read.csv("test.xlsx")
it did not work. I also tried
x = read.table("test.xlsx")
and got the warning
Warning message:
In read.table("test.xlsx") :
incomplete final line found by readTableHeader on 'test.xlsx'
and the result:
V1
1 PK\003\004\024
2 PˆTز\005›DQ4ï½ùfىé|[™d\003\001µ³9\033g
So, do I need to make a special file in order to read it in R?
Try using a simple CSV file. You can save one from Excel using the Save As option.
You may want to have a look at the XLConnect package for dealing with Excel files in R: http://cran.r-project.org/web/packages/XLConnect/index.html
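The garbled output above starts with PK because an .xlsx file is a zip archive, not plain text, so read.csv and read.table cannot parse it. The CSV route looks roughly like the sketch below. It assumes the sheet has been saved as CSV. Here the file is written from R to a temporary path instead, and the data frame x_in is a hand-typed copy of the table in the question.

```r
# Stand-in for the spreadsheet contents shown in the question
x_in <- data.frame(
  date  = 1:7,
  price = c(34, 34.5, 34, 34, 35, 34.5, 36)
)

# Write a plain CSV (what Excel's "Save As > CSV" would produce, roughly)
csv_path <- tempfile(fileext = ".csv")
write.csv(x_in, csv_path, row.names = FALSE)

# read.csv expects comma-separated text with a header line by default
x <- read.csv(csv_path)
str(x)
```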

Help interpreting/converting odd date format

I have data pulled from a database and stored in Stata .dta files. But when I read it into R using the foreign package, I get a date format unlike any I've seen. All of the other dates are "%m/%d/%Y" and import correctly.
I have searched the database's documentation, but there's no explanation for the odd date format for "DealActiveDate". The "facilitystartdate" date should be close to the "DealActiveDate", but not necessarily the same. Here are a few rows of these two columns.
facilitystartdate DealActiveDate
1 09/12/1987 874022400000
2 09/12/1987 874022400000
3 09/12/1987 874022400000
4 09/01/1987 873072000000
5 09/08/1987 873676800000
6 10/01/1987 875664000000
7 08/01/1987 870393600000
8 08/01/1987 870393600000
9 10/01/1987 875664000000
10 09/01/1987 873072000000
Please let me know if you have any idea how to convert "DealActiveDate" to a more conventional date. Thanks! (I'm not sure SO is the best venue, but I couldn't think of any other options!)
Looks like milliseconds since 1960-01-01:
as.POSIXct(874022400000/1000, origin="1960-01-01")
# [1] "1987-09-12 01:00:00 CDT"
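To avoid the local time zone offset visible in the output above, the conversion can be pinned to UTC or done in whole days. This is a sketch that assumes the values really are milliseconds since 1960-01-01, which matches Stata's datetime convention. The vector ms holds a few values copied from the question.

```r
# DealActiveDate values copied from rows 1, 4 and 6 of the question
ms <- c(874022400000, 873072000000, 875664000000)

# Option 1: date-times, with the time zone fixed so no local offset is applied
dt <- as.POSIXct(ms / 1000, origin = "1960-01-01", tz = "UTC")
format(dt, "%Y-%m-%d %H:%M:%S")

# Option 2: plain dates, dividing by the number of milliseconds in a day
deal_date <- as.Date(ms / 86400000, origin = "1960-01-01")
format(deal_date, "%m/%d/%Y")  # same layout as facilitystartdate
```

For these three rows the converted dates line up with the facilitystartdate column, which supports the 1960 origin.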

Importing timeSeries from csv file into R with correct dates

I have looked all over the internet to find an answer to my problem and failed.
I am using R with the Rmetrics package.
I tried reading my own dataset.csv via the readSeries function, but the dates I entered are not imported correctly: every row now has the current date.
I also tried their sample data sets, exporting them to CSV and re-importing them, and the same problem occurs.
You can test it using this code:
data <- head(SWX.RET[,1:3])
write.csv(data, file="myData.csv")
data2 <- readSeries(file="myData.csv",header=T,sep=",")
If you now check the data2 time series you will notice that the row date is the current date.
I am confused why this is and what to do to fix it.
Your help is much appreciated!
This can be fixed by passing the extra option row.names=FALSE to the write.csv() function; see its help page for details. Here is a worked example:
R> fakeData <- data.frame(date=Sys.Date()+seq(-7,-1), value=runif(7))
R> fakeData
date value
1 2011-02-14 0.261088
2 2011-02-15 0.514413
3 2011-02-16 0.675607
4 2011-02-17 0.982817
5 2011-02-18 0.759544
6 2011-02-19 0.566488
7 2011-02-20 0.849690
R> write.csv(fakeData, "/tmp/fakeDate.csv", row.names=FALSE, quote=FALSE)
R> readSeries("/tmp/fakeDate.csv", header=TRUE, sep=",")
GMT
value
2011-02-14 0.261088
2011-02-15 0.514413
2011-02-16 0.675607
2011-02-17 0.982817
2011-02-18 0.759544
2011-02-19 0.566488
2011-02-20 0.849690
R>
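What row.names changes in the file itself can be seen without Rmetrics. The sketch below uses only base R and a fixed-date version of fakeData, so the values are illustrative rather than the random numbers above. It writes the same data frame both ways and reads back the first lines of each file.

```r
# Fixed dates and values so the result does not depend on Sys.Date() or runif()
fakeData <- data.frame(date = as.Date("2011-02-14") + 0:6, value = (1:7) / 10)

with_rn    <- tempfile(fileext = ".csv")
without_rn <- tempfile(fileext = ".csv")

# Default: a leading column of row names (1, 2, 3, ...) is written
write.csv(fakeData, with_rn, quote = FALSE)
# row.names = FALSE: the date is the first column in the file
write.csv(fakeData, without_rn, row.names = FALSE, quote = FALSE)

head_with    <- readLines(with_rn, n = 2)
head_without <- readLines(without_rn, n = 2)
head_with
head_without
```

With the default, the first field of each row is the row number, not the date, which is presumably why a reader that expects dates in the first column cannot recover them.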
