In R, how do I convert this CSV data to xts?

I am trying to read in a CSV file and convert it to xts format. However, I am running into an issue because the CSV has the date and time fields in separate columns.
2012.10.30,20:00,1.29610,1.29639,1.29607,1.29619,295
2012.10.30,20:15,1.29622,1.29639,1.29587,1.29589,569
2012.10.30,20:30,1.29590,1.29605,1.29545,1.29574,451
2012.10.30,20:45,1.29576,1.29657,1.29576,1.29643,522
2012.10.30,21:00,1.29643,1.29645,1.29581,1.29621,526
2012.10.30,21:15,1.29621,1.29644,1.29599,1.29642,330
I am trying to pull it in with
euXTS <- as.xts(read.zoo(file="EURUSD15.csv", sep=",", format="%Y.%m.%d", header=FALSE))
But it gives me this warning message, so I think I somehow have to attach the time stamp, but I am not sure of the best way to do that.
Warning message:
In zoo(rval3, ix) :
Some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

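It is better to read the series directly into a zoo object with read.zoo, which is then easily coerced to xts. Passing both index columns (date and time) lets read.zoo combine them into a single timestamp, so the index entries become unique: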
library(xts)
ts.z <- read.zoo(text='2012.10.30,20:00,1.29610,1.29639,1.29607,1.29619,295
2012.10.30,20:15,1.29622,1.29639,1.29587,1.29589,569
2012.10.30,20:30,1.29590,1.29605,1.29545,1.29574,451
2012.10.30,20:45,1.29576,1.29657,1.29576,1.29643,522
2012.10.30,21:00,1.29643,1.29645,1.29581,1.29621,526
2012.10.30,21:15,1.29621,1.29644,1.29599,1.29642,330',
sep=',',index.column=1:2,tz='',format="%Y.%m.%d %H:%M")
as.xts(ts.z)
V3 V4 V5 V6 V7
2012-10-30 20:00:00 1.29610 1.29639 1.29607 1.29619 295
2012-10-30 20:15:00 1.29622 1.29639 1.29587 1.29589 569
2012-10-30 20:30:00 1.29590 1.29605 1.29545 1.29574 451
2012-10-30 20:45:00 1.29576 1.29657 1.29576 1.29643 522
2012-10-30 21:00:00 1.29643 1.29645 1.29581 1.29621 526
2012-10-30 21:15:00 1.29621 1.29644 1.29599 1.29642 330
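The same call works on a file rather than inline text. The sketch below assumes the file layout from the question; the temporary file (a stand-in for EURUSD15.csv), the tz = "UTC" choice, and the column names are illustrative assumptions, since the source does not say what the five value columns mean.

```r
library(zoo)
library(xts)

# Write the sample rows from the question to a temporary file
# (a stand-in for EURUSD15.csv).
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "2012.10.30,20:00,1.29610,1.29639,1.29607,1.29619,295",
  "2012.10.30,20:15,1.29622,1.29639,1.29587,1.29589,569",
  "2012.10.30,20:30,1.29590,1.29605,1.29545,1.29574,451",
  "2012.10.30,20:45,1.29576,1.29657,1.29576,1.29643,522",
  "2012.10.30,21:00,1.29643,1.29645,1.29581,1.29621,526",
  "2012.10.30,21:15,1.29621,1.29644,1.29599,1.29642,330"
), csv_file)

# Columns 1 and 2 are combined and parsed as one date-time index.
eu_zoo <- read.zoo(csv_file, sep = ",", header = FALSE,
                   index.column = 1:2,
                   format = "%Y.%m.%d %H:%M", tz = "UTC")

# Assumed column meanings (not stated in the source).
colnames(eu_zoo) <- c("Open", "High", "Low", "Close", "Volume")

eu_xts <- as.xts(eu_zoo)
print(eu_xts)
```

Because the index now includes the time of day, the "index entries are not unique" warning from the question should no longer appear.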

Related

getSymbols downloading data for multiple symbols and export adjusting prices to CSV file

quantmod newbie here,
My end goal is to have a CSV file including monthly stock prices, I've downloaded the data using getSymbols using this code:
Symbols <- c("DIS", "TSLA","ATVI", "MSFT", "FB", "ABT","AAPL","AMZN",
"BAC","NFLX","ADBE","WMT","SRE","T","MS")
Data <- new.env()
getSymbols(c("^GSPC",Symbols),from="2015-01-01",to="2020-12-01"
,periodicity="monthly",
env=Data)
The line above works fine. Now I need to create a data frame that includes only the adjusted prices for all the symbols, with a date column of course.
any help, please? :)
The desired output is a single table: a date column followed by one adjusted-price column per symbol.
Another straightforward way to get your monthly data:
tickers <- c('AMZN','FB','GOOG','AAPL')
getSymbols(tickers,periodicity="monthly")
monthlyPrices <- do.call("merge.xts",c(lapply(mget(tickers),"[",,6),all=FALSE))
head(monthlyPrices,3)
AMZN.Adjusted FB.Adjusted GOOG.Adjusted AAPL.Adjusted
2012-06-01 228.35 31.10 288.9519 17.96558
2012-07-01 233.30 21.71 315.3032 18.78880
2012-08-01 248.27 18.06 341.2658 20.46477
Note that the logical argument all = FALSE is the equivalent of an inner join: you get rows only for dates on which all of your stocks have prices. all = TRUE fills data that is not available with NAs (outer join).
To write the file you can use:
write.zoo(monthlyPrices,file = 'filename.csv',sep=',',quote=FALSE)
First get your data from the environment:
require(quantmod)
# your code
dat <- mget(ls(Data), envir=Data)
Then draw the data from the Objects:
newdat <- as.data.frame(sapply( names(dat), function(x) coredata(dat[[x]])[,1] ))
Note that this takes the opening prices (the [,1] in coredata(dat[[x]])[,1]). Each object has more columns; the adjusted prices the question asks for are in column 6:
names(dat[["AAPL"]])
[1] "AAPL.Open" "AAPL.High" "AAPL.Low" "AAPL.Close"
[5] "AAPL.Volume" "AAPL.Adjusted"
Last, get the dates (this assumes identical dates for all symbols):
rownames(newdat) <- index(dat[["AAPL"]])
# OR, more universal, by extracting from the complete list:
rownames(newdat) <-
as.data.frame( sapply( names(dat), function(x) as.character(index(dat[[x]])) ) )[,1]
head(newdat, 3)
AAPL ABT ADBE AMZN ATVI BAC DIS FB GSPC MS
2015-01-01 27.8475 45.25 72.70 312.58 20.24 17.99 94.91 78.58 2058.90 39.05
2015-02-01 29.5125 44.93 70.44 350.05 20.90 15.27 91.30 76.11 1996.67 33.96
2015-03-01 32.3125 47.34 79.14 380.85 23.32 15.79 104.35 79.00 2105.23 35.64
MSFT NFLX SRE T TSLA WMT
2015-01-01 46.66 49.15143 111.78 33.59 44.574 86.27
2015-02-01 40.59 62.84286 112.38 33.31 40.794 84.79
2015-03-01 43.67 67.71429 108.20 34.56 40.540 83.93
Writing the csv:
write.csv(newdat, "file.csv")
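A minimal sketch of the same idea that picks the adjusted column by name and keeps the dates as a regular column. To stay runnable offline it uses two made-up symbols (AAA, BBB) with invented prices in place of the getSymbols download; make_sym, adj and out are hypothetical names, not part of quantmod.

```r
library(xts)

dates <- as.Date(c("2015-01-01", "2015-02-01", "2015-03-01"))

# Build a fake OHLCVA object shaped like the ones getSymbols returns.
make_sym <- function(sym, adj) {
  m <- cbind(adj, adj, adj, adj, 1000, adj)
  colnames(m) <- paste(sym, c("Open", "High", "Low", "Close",
                              "Volume", "Adjusted"), sep = ".")
  xts(m, order.by = dates)
}

Data <- new.env()
assign("AAA", make_sym("AAA", c(10, 11, 12)), envir = Data)
assign("BBB", make_sym("BBB", c(20, 21, 22)), envir = Data)

dat <- mget(ls(Data), envir = Data)

# Keep only the column whose name ends in "Adjusted", then merge by date.
adj_list <- lapply(dat, function(x) x[, grep("Adjusted$", colnames(x))])
adj <- do.call(merge, unname(adj_list))

# Dates become an ordinary column instead of row names.
out <- data.frame(Date = index(adj), coredata(adj), row.names = NULL)
print(out)

write.csv(out, tempfile(fileext = ".csv"), row.names = FALSE)
```

Merging on the index, rather than binding columns side by side, avoids the assumption that every symbol has exactly the same dates.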

Reading and writing files in xts format with R

Basically, I want to capture data using getSymbols (quantmod), write the file(s) to disk, and read them back in with another script. I would like to use xts objects if possible. I cannot seem to make this work. Here is what I have done (and many variations thereof):
getSymbols("VNQ", from = as.Date("2015-12-01"), to = as.Date("2015-12-15"))
this.tkr <- get("VNQ")
head(this.tkr)
VNQ.Open VNQ.High VNQ.Low VNQ.Close VNQ.Volume VNQ.Adjusted
2015-12-01 79.50 80.52 79.42 80.49 3847300 79.38125
2015-12-02 80.26 80.37 78.73 78.85 5713500 77.76385
2015-12-03 78.73 78.85 77.40 77.61 4737300 76.54093
2015-12-04 77.68 79.29 77.65 79.09 3434100 78.00054
2015-12-07 78.96 79.19 78.52 78.87 4195100 77.78357
2015-12-08 78.44 79.09 78.36 78.80 3638600 77.71454
class(this.tkr)
[1] "xts" "zoo"
write.zoo(this.tkr, "Data/TestZoo")
## then in some other script ....
new.tkr <- read.table("Data/TestZoo", stringsAsFactors = FALSE)
class(new.tkr)
[1] "data.frame"
head(new.tkr)
V1 V2 V3 V4 V5 V6 V7
1 Index VNQ.Open VNQ.High VNQ.Low VNQ.Close VNQ.Volume VNQ.Adjusted
2 2015-12-01 79.5 80.519997 79.419998 80.489998 3847300 79.381254
3 2015-12-02 80.260002 80.370003 78.730003 78.849998 5713500 77.763845
4 2015-12-03 78.730003 78.849998 77.400002 77.610001 4737300 76.540928
5 2015-12-04 77.68 79.290001 77.650002 79.089996 3434100 78.000537
6 2015-12-07 78.959999 79.190002 78.519997 78.870003 4195100 77.783574
## attempt to convert this to an xts object ...
new.tkr <- new.tkr[2:nrow(new.tkr), ] #delete first row of text captions
new.xts <- xts(new.tkr[, 2:ncol(new.tkr)], as.Date(new.tkr$V1))
head(new.xts)
V2 V3 V4 V5 V6 V7
2015-12-01 "79.5" "80.519997" "79.419998" "80.489998" "3847300" "79.381254"
2015-12-02 "80.260002" "80.370003" "78.730003" "78.849998" "5713500" "77.763845"
2015-12-03 "78.730003" "78.849998" "77.400002" "77.610001" "4737300" "76.540928"
2015-12-04 "77.68" "79.290001" "77.650002" "79.089996" "3434100" "78.000537"
2015-12-07 "78.959999" "79.190002" "78.519997" "78.870003" "4195100" "77.783574"
2015-12-08 "78.440002" "79.089996" "78.360001" "78.800003" "3638600" "77.714538"
Why does the xts conversion insist on making the columns of mode "character"? When I look at str(new.xts) the columns are all factors. Where am I jumping the track?
To preserve as much metadata as possible, save it as an R data file:
saveRDS(this.tkr, file = '~/Desktop/data.Rds')
df2 <- readRDS('~/Desktop/data.Rds')
That way,
> class(df2)
[1] "xts" "zoo"
The downside of this approach is that your data is less portable if you need to share it with people using tools other than R, but that doesn't sound like an issue in this case.
This will write a zoo object in text form (portably) and read it back:
library(quantmod)
this.tkr <- getSymbols("VNQ", from = as.Date("2015-12-01"), to = as.Date("2015-12-15"),
auto.assign = FALSE, return.class = "zoo")
write.zoo(this.tkr, "TestZoo")
zz <- read.zoo("TestZoo", header = TRUE)
identical(this.tkr, zz)
## [1] TRUE
If you have an xts object convert it to zoo first like this:
library(quantmod)
this.tkr <- getSymbols("VNQ", from = as.Date("2015-12-01"), to = as.Date("2015-12-15"),
auto.assign = FALSE)
z <- as.zoo(this.tkr)
write.zoo(z, "TestZoo")
zz <- read.zoo("TestZoo", header = TRUE)
identical(z, zz)
## [1] TRUE
x <- as.xts(zz)
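As for why the question's attempt produced character columns: read.table was called without header = TRUE, so the caption row was read as data and every column was treated as text. A small offline sketch of the text round trip, using a made-up two-column series in place of the VNQ download (the values and the temporary file are assumptions for illustration):

```r
library(xts)

# Small made-up series standing in for the downloaded data.
x <- xts(cbind(Open  = c(79.50, 80.26, 78.73),
               Close = c(80.49, 78.85, 77.61)),
         order.by = as.Date(c("2015-12-01", "2015-12-02", "2015-12-03")))

path <- tempfile()
write.zoo(x, path)   # writes an "Index" column plus one column per series

# header = TRUE keeps the caption row out of the data, so values stay numeric.
z  <- read.zoo(path, header = TRUE, format = "%Y-%m-%d")
x2 <- as.xts(z)

str(coredata(x2))
```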

Read intraday data with getSymbols.csv

I installed the quantmod package and I'm trying to import a csv file with 1 minute intraday data. Here is a sample GAZP.csv file:
"D";"T";"Open";"High";"Low";"Close";"Vol"
20130902;100100;132.2000000;133.0500000;131.9200000;132.5000000;131760
20130902;100200;132.3700000;132.5700000;132.2500000;132.2900000;66090
20130902;100300;132.3600000;132.5000000;132.2600000;132.4700000;37500
I've tried:
> getSymbols('GAZP',src='csv')
Error in `colnames<-`(`*tmp*`, value = c("GAZP.Open", "GAZP.High", "GAZP.Low", :
length of 'dimnames' [2] not equal to array extent
> getSymbols.csv('GAZP',src='csv')
> # or
> getSymbols.csv('GAZP',env,dir="c:\\!!",extension="csv")
Error in missing(verbose) : 'missing' can only be used for arguments
How should I properly use the getSymbols.csv command to read such data?
@Vladimir, if you do not insist on using the getSymbols function from the quantmod package, you can import your CSV file (assuming it is in your working directory) as a zoo object with this line:
GAZP=read.zoo("GAZP.csv",sep=";",header=TRUE,index.column=list(1,2),FUN = function(D,T) as.POSIXct(paste(D, T), format="%Y%m%d %H%M%S"))
and convert it to an xts object if you want:
GAZP.xts <- as.xts(GAZP)
> GAZP
Open High Low Close Vol
2013-09-02 10:01:00 132.20 133.05 131.92 132.50 131760
2013-09-02 10:02:00 132.37 132.57 132.25 132.29 66090
2013-09-02 10:03:00 132.36 132.50 132.26 132.47 37500
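One caveat with this approach: the T column is read as a number, so a time before 10:00 (for example 093000) would lose its leading zero and no longer match "%H%M%S". The sketch below reads the sample rows from the question and zero-pads the time first; parse_stamp is a hypothetical helper, and tz = "UTC" is an assumption made only to keep the result reproducible.

```r
library(zoo)
library(xts)

gazp_text <- '"D";"T";"Open";"High";"Low";"Close";"Vol"
20130902;100100;132.2000000;133.0500000;131.9200000;132.5000000;131760
20130902;100200;132.3700000;132.5700000;132.2500000;132.2900000;66090
20130902;100300;132.3600000;132.5000000;132.2600000;132.4700000;37500'

# D and T arrive as numbers, so T is padded back to six digits (HHMMSS)
# before the two parts are pasted together and parsed.
parse_stamp <- function(D, T) {
  as.POSIXct(paste(D, sprintf("%06d", as.integer(T))),
             format = "%Y%m%d %H%M%S", tz = "UTC")
}

GAZP <- read.zoo(text = gazp_text, sep = ";", header = TRUE,
                 index.column = list(1, 2), FUN = parse_stamp)
GAZP.xts <- as.xts(GAZP)
print(GAZP.xts)
```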

Read a CSV file in R, and select each element

Sorry if the title is confusing. I can import a CSV file into R, but when I try to select one element by providing the row and column index, I get more than one element. All I want is to use the imported CSV as a data.frame from which I can select any columns, rows, and single cells. Can anyone give me some suggestions?
Here is the data:
SKU On Off Duration(hr) Sales
C010100100 2/13/2012 4/19/2012 17:00 1601 238
C010930200 5/3/2012 7/29/2012 0:00 2088 3
C011361100 2/13/2012 5/25/2012 22:29 2460 110
C012000204 8/13/2012 11/12/2012 11:00 2195 245
C012000205 8/13/2012 11/12/2012 0:00 2184 331
CODE:
Dat = read.table("Dat.csv",header=1,sep=',')
Dat[1,][1] #This is close to what I need but is not exactly the same
SKU
1 C010100100
Dat[1,1] # Ideally, I want to have results only with C010100100
[1] C010100100
3861 Levels: B013591100 B024481100 B028710300 B038110800 B038140800 B038170900 B038260200 B038300700 B040580700 B040590200 B040600400 B040970200 ... YB11624Q1100
Thanks!
You can convert to character to get the value as a string, and no longer as a factor:
as.character(Dat[1,1])
You have just one element, but the factor contains all levels.
Alternatively, pass the option stringsAsFactors=FALSE to read.table when you read the file, to prevent creation of factors for character values:
Dat = read.table("Dat.csv",header=1,sep=',', stringsAsFactors=FALSE )
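A short sketch contrasting the two options. The inline text is a hypothetical two-column extract of the question's file (SKU and Sales only), written as comma-separated values. Note that from R 4.0.0 onward stringsAsFactors already defaults to FALSE, so the factor behaviour appears only when it is requested explicitly.

```r
# Hypothetical two-column extract of the question's data.
csv_text <- "SKU,Sales
C010100100,238
C010930200,3
C011361100,110"

# Factors: a single cell still carries (and prints) the full set of levels.
as_factor <- read.table(text = csv_text, header = TRUE, sep = ",",
                        stringsAsFactors = TRUE)
cell_f <- as_factor[1, 1]
print(cell_f)
print(as.character(cell_f))   # just the value

# Character columns: the cell is already a plain string.
as_char <- read.table(text = csv_text, header = TRUE, sep = ",",
                      stringsAsFactors = FALSE)
cell_c <- as_char[1, 1]
print(cell_c)
```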

Remove duplicate rows from xts object

I am having trouble deleting duplicated rows in an xts object. I have an R script that downloads tick financial data for a currency and converts it to an xts object in OHLC format. The script also pulls new data every 15 minutes, covering the first trade of today through the last recorded trade of today. The previously downloaded data is stored in .Rdata format and loaded; the new data is then appended to it, and the result overwrites the old .Rdata file.
Here is an example of what my data looks like:
.Open .High .Low .Close .Volume .Adjusted
2012-01-07 00:00:11 6.69683 7.01556 6.38000 6.81000 48387.58 6.81000
2012-01-08 00:00:09 6.78660 7.20000 6.73357 7.11358 57193.53 7.11358
2012-01-09 00:00:57 7.08362 7.19100 5.81000 6.32570 148406.85 6.32570
2012-01-10 00:01:01 6.32687 6.89000 6.00100 6.36000 110210.25 6.36000
2012-01-11 00:00:07 6.44904 7.13800 6.41266 6.90000 99442.07 6.90000
2012-01-12 00:01:02 6.90000 6.99700 6.33700 6.79999 140116.52 6.79999
2012-01-13 00:02:01 6.78211 6.80400 6.40000 6.41000 60228.77 6.41000
2012-01-14 00:00:23 6.42000 6.50000 6.23150 6.31894 25392.98 6.31894
Now if I run the script again I will add the new data to the xts.
.Open .High .Low .Close .Volume .Adjusted
2012-01-07 00:00:11 6.69683 7.01556 6.38000 6.81000 48387.58 6.81000
2012-01-08 00:00:09 6.78660 7.20000 6.73357 7.11358 57193.53 7.11358
2012-01-09 00:00:57 7.08362 7.19100 5.81000 6.32570 148406.85 6.32570
2012-01-10 00:01:01 6.32687 6.89000 6.00100 6.36000 110210.25 6.36000
2012-01-11 00:00:07 6.44904 7.13800 6.41266 6.90000 99442.07 6.90000
2012-01-12 00:01:02 6.90000 6.99700 6.33700 6.79999 140116.52 6.79999
2012-01-13 00:02:01 6.78211 6.80400 6.40000 6.41000 60228.77 6.41000
2012-01-14 00:00:23 6.42000 6.50000 6.23150 6.31894 25392.98 6.31894
2012-01-14 00:00:23 6.42000 6.75000 6.22010 6.57157 75952.01 6.57157
As you can see the last line is the same day as the second to last line. I want to keep the last row for the last date and delete the second to last row. When I try the following code to delete duplicated rows it does not work, the duplicated rows stay there.
xx <- mt.xts[!duplicated(mt.xts$Index),]
xx
.Open .High .Low .Close .Volume .Adjusted
I do not get any result. How can I delete duplicate data entries in an xts object using the Index as the indicator of duplication?
Shouldn't it be index(mt.xts) rather than mt.xts$Index?
The following seems to work.
# Sample data
library(xts)
x <- xts(
1:10,
rep( seq.Date( Sys.Date(), by="day", length=5 ), each=2 )
)
# Remove rows with a duplicated timestamp
y <- x[ ! duplicated( index(x) ), ]
# Remove rows with a duplicated timestamp, but keep the latest one
z <- x[ ! duplicated( index(x), fromLast = TRUE ), ]
In my case,
x <- x[! duplicated( index(x) ),]
did not work as intended, because every row ended up with a distinct date-time, so no index entry counted as a duplicate. If the duplicated rows carry identical values, comparing the data instead may help:
x <- x[! duplicated( coredata(x) ),]
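A small sketch of both situations, using closing prices from the question's sample. The second series, with timestamps that differ within the same day, is a hypothetical variant added for illustration; tz = "UTC" is likewise an assumption.

```r
library(xts)

# Case 1: the same timestamp appears twice; keep the later row.
stamps <- as.POSIXct(c("2012-01-13 00:02:01",
                       "2012-01-14 00:00:23",
                       "2012-01-14 00:00:23"), tz = "UTC")
x <- xts(c(6.41, 6.31894, 6.57157), order.by = stamps)
z <- x[!duplicated(index(x), fromLast = TRUE), ]

# Case 2 (hypothetical): timestamps differ within a day, so the index has
# no exact duplicates. Deduplicate on the calendar date instead.
stamps2 <- as.POSIXct(c("2012-01-13 00:02:01",
                        "2012-01-14 00:00:23",
                        "2012-01-14 00:15:40"), tz = "UTC")
y <- xts(c(6.41, 6.31894, 6.57157), order.by = stamps2)
days <- as.Date(index(y), tz = "UTC")
y_daily <- y[!duplicated(days, fromLast = TRUE), ]

print(z)
print(y_daily)
```

In both cases fromLast = TRUE is what keeps the most recent row for each key, which matches the question's requirement of retaining the last entry for the last date.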
