I have worked with daily stock data using quantmod. quantmod automatically downloads data from the Google/Yahoo Finance sites and converts it to an xts object with the date as the index.
AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
2014-10-01 100.59 100.69 98.70 99.18 51491300 97.09741
2014-10-02 99.27 100.22 98.04 99.90 47757800 97.80230
2014-10-03 99.44 100.21 99.04 99.62 43469600 97.52818
2014-10-06 99.95 100.65 99.42 99.62 37051200 97.52818
2014-10-07 99.43 100.12 98.73 98.75 42094200 96.67644
2014-10-08 98.76 101.11 98.31 100.80 57404700 98.68340
2014-10-09 101.54 102.38 100.61 101.02 77376500 98.89877
Now I am working with one-minute intraday data (CSV format), which I read into a data frame (df) with six columns.
Date Time Open High Low Close
1 20150408 09:17:00 7.15 7.15 7.10 7.10
2 20150408 09:18:00 7.15 7.15 7.15 7.15
3 20150408 09:19:00 7.10 7.10 7.10 7.10
4 20150408 09:20:00 7.10 7.10 7.05 7.10
5 20150408 09:21:00 7.10 7.15 7.10 7.10
6 20150408 09:22:00 7.10 7.10 7.05 7.10
How can I convert this data frame to a time series so that I can use it with the standard quantmod functions such as Cl(), Op(), OHLC(), etc.?
Elementary, dear Watson: combine date and time into a POSIXct, use that.
Untested as you supplied no reproducible data:
library(xts)
pt <- as.POSIXct(paste(X$Date, X$Time), format="%Y%m%d %H:%M:%S")
N <- xts(X[, -(1:2)], order.by=pt)
Here X is your current data.frame, and N is a new xts object formed from the data of X (minus date and time) using pt as the index.
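The index-building step can be checked on its own with base R. The following is a minimal sketch, assuming two rows copied from the question; the tz = "UTC" argument is an assumption that is not in the original answer, added only so the result does not depend on the local time zone.

```r
# Minimal sketch using only base R; the two rows are copied from the question.
X <- data.frame(Date = c(20150408, 20150408),
                Time = c("09:17:00", "09:18:00"),
                Close = c(7.10, 7.15),
                stringsAsFactors = FALSE)

# paste() coerces the numeric Date to text, giving "20150408 09:17:00".
# tz = "UTC" is an assumption; use the exchange's time zone for real data.
pt <- as.POSIXct(paste(X$Date, X$Time), format = "%Y%m%d %H:%M:%S", tz = "UTC")

print(pt)
print(diff(pt))  # spacing between the two bars
```

If pt contains NA values, the format string does not match the text, and xts() will refuse the index.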
I am a beginner in R and have the following problem: I want to load a CSV file into R and then convert it into an xts object, but I get an error when I do. First, a small snippet of the data:
a=read.csv('/Users/..../Desktop/SYNEKTIK.csv',h=T)
head(a)
Name Date Open High Low Close Volume
1 SYNEKTIK 20110809 5.76 8.23 5.76 8.23 28062
2 SYNEKTIK 20110810 9.78 9.78 8.10 8.13 9882
3 SYNEKTIK 20110811 9.00 9.00 9.00 9.00 2978
4 SYNEKTIK 20110812 9.70 9.70 8.90 9.60 5748
5 SYNEKTIK 20110816 9.70 11.00 9.70 11.00 23100
6 SYNEKTIK 20110818 10.90 11.00 10.90 10.90 319
The following does not work:
w=xts(a[,-1],order.by=as.POSIXct(a[,1]))
It produces the following error:
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
Another try that did not work:
a=a[,-1]
head(a)
Date Open High Low Close Volume
1 20110809 5.76 8.23 5.76 8.23 28062
2 20110810 9.78 9.78 8.10 8.13 9882
3 20110811 9.00 9.00 9.00 9.00 2978
4 20110812 9.70 9.70 8.90 9.60 5748
5 20110816 9.70 11.00 9.70 11.00 23100
6 20110818 10.90 11.00 10.90 10.90 319
w=xts(a[,-1],order.by=as.POSIXct(a[,1]))
Error in as.POSIXct.numeric(a[, 1]) : 'origin' must be supplied
Finally, when I saved the dates in the format yyyy-mm-dd, everything worked and I could convert the data into an xts object. Why?
Maybe something like this will help:
w <- xts(a[,c(-1,-2)],order.by=as.Date(as.character(a[,2]),"%Y%m%d"))
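As for the "why": the first attempt passed the Name column to as.POSIXct, and the second passed a plain number, which R can only convert if it is told the origin. Text in yyyy-mm-dd form works because it is one of the default formats R tries. A minimal base R sketch of the date parsing used above, with values taken from the Date column in the question:

```r
# Base R only; the values are taken from the Date column in the question.
d <- c(20110809, 20110810, 20110811)

# Convert the numbers to text, then parse with an explicit format string.
dates <- as.Date(as.character(d), format = "%Y%m%d")
print(dates)

# A string already in ISO form needs no format argument.
print(as.Date("2011-08-09"))
```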
Below are the monthly prices of a particular stock:
Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2008 46.09 50.01 48 48 50.15 43.45 41.05 41.67 36.66 25.02 22.98 22
2009 20.98 15 13.04 14.4 26.46 14.32 14.6 11.83 14 14.4 13.07 13.6
2010 15.31 15.71 18.97 15.43 13.5 13.8 14.21 12.73 12.35 13.17 14.59 15.01
2011 15.3 15.22 15.23 15 15.1 14.66 14.8 12.02 12.41 12.9 11.6 12.18
2012 12.45 13.33 12.4 14.16 13.99 13.75 14.4 15.38 16.3 18.02 17.29 19.49
2013 20.5 20.75 21.3 20.15 22.2 19.8 19.75 19.71 19.99 21.54 21.3 27.4
2014 23.3 20.5 20 22.7 25.4 25.05 25.08 24.6 24.5 21.2 20.52 18.41
2015 16.01 17.6 20.98 21.15 21.44 0 0 0 0 0 0 0
I want to decompose the data into seasonal and trend components, but I am not getting a result.
How can I load the data as an object of class "ts" so that I can decompose it?
Here is a solution using tidyr, which is fairly accessible.
library(dplyr); library(tidyr)
data %>% gather(month, price, -Year) %>% # 1 row per year-month pair, name the value "price"
mutate(synth_date_txt= paste(month,"1,",Year), # combine month and year into a date string
date=as.Date(synth_date_txt,format="%b %d, %Y")) %>% # convert date string to date
select(date, price) # keep just the date and price
# date price
# 1 2008-01-01 46.09
# 2 2009-01-01 20.98
# 3 2010-01-01 15.31
# 4 2011-01-01 15.30
# 5 2012-01-01 12.45
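This gives you a column of class Date (the day is set to the 1st, since you only supplied a month and a year). Note that the rows come out grouped by month, as the output shows, so sort by date before building a time series. It should work for your time series analysis, but if you really need a timestamp you can just use as.POSIXct(date).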
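If the goal is specifically a "ts" object, base R can also build one straight from the wide table. This is a sketch rather than the approach of the answer above; it assumes the data frame is called data and has the layout shown in the question, and only the first two years are typed in to keep it short.

```r
# Base R sketch; only the first two years of the table are typed in here.
data <- data.frame(
  Year = c(2008, 2009),
  Jan = c(46.09, 20.98), Feb = c(50.01, 15.00), Mar = c(48.00, 13.04),
  Apr = c(48.00, 14.40), May = c(50.15, 26.46), Jun = c(43.45, 14.32),
  Jul = c(41.05, 14.60), Aug = c(41.67, 11.83), Sep = c(36.66, 14.00),
  Oct = c(25.02, 14.40), Nov = c(22.98, 13.07), Dec = c(22.00, 13.60)
)

# t() puts months in rows, so as.vector() reads the prices in time order:
# Jan 2008, Feb 2008, ..., Dec 2009.
prices <- as.vector(t(as.matrix(data[, -1])))
se <- ts(prices, start = c(data$Year[1], 1), frequency = 12)
print(se)

# decompose() needs at least two full periods (24 monthly values).
dec <- decompose(se, type = "multiplicative")
print(round(dec$seasonal[1:12], 3))
```

With the full table, the zeros after May 2015 appear to be placeholders for months that have not happened yet; something like window(se, end = c(2015, 5)) would drop them before decomposing.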
Mike,
The program is R and below is the code I have tried.
sev=read.csv("X7UPM.csv")
se=ts(sev,start=c(2008, 1), end=c(2015,1), frequency=12)
se
se=se[,1]
S=decompose(se)
plot(se,col=c("blue"))
plot(decompose(se))
S.decom=decompose(se,type="mult")
plot(S.decom)
trend=S.decom$trend
trend
seasonal=S.decom$seasonal
seasonal
ts.plot(cbind(trend,trend*seasonal),lty=1:2)
plot(stl(se,"periodic"))
I want to get OHLC data from Google Finance for a stock listed on the London Stock Exchange. I've tried using:
> require(quantmod)
> getSymbols("LON:DRTY", src="google")
[1] "LON:DRTY"
> head(LON:DRTY)
Error in head(LON:DRTY) : object 'LON' not found
getSymbols seems to have returned the data, but I cannot access it. How do I actually get the data from the returned object?
Actually, I want to download data from Japan, for example "TYO:2501"
(https://www.google.com/finance?q=TYO%3A2501&ei=2aC6VOHcKsX1wAPa3YCoCg). However, getSymbols can't find it under tickers like "TYO:2501", "2501", "TYO%3A2501". I get '404 Not Found' or '400 Bad Request'.
> getSymbols("TYO:2502", from="2014-01-01", to="2014-05-03", src="google")
In this case getSymbols creates an object whose name is not syntactically valid. : is a binary operator in R, used to create sequences of numbers. So when you type head(LON:DRTY), R looks for an object named LON and an object named DRTY in order to create a sequence. For example:
> LON <- 1
> DRTY <- 10
> head(LON:DRTY)
[1] 1 2 3 4 5 6
I will fix this in a future release, but you can use one of these work-arounds in the meantime:
> require(quantmod)
> getSymbols("LON:DRTY",src="google")
[1] "LON:DRTY"
> # use backticks to reference the object
> head(`LON:DRTY`)
LON:DRTY.Open LON:DRTY.High LON:DRTY.Low LON:DRTY.Close LON:DRTY.Volume
2012-08-01 0.43 0.45 0.41 0.44 410093
2012-08-02 41.25 42.75 40.00 41.50 751816
2012-08-03 41.00 44.00 41.00 42.75 582187
2012-08-06 42.00 44.41 42.00 42.50 370042
2012-08-07 42.00 44.00 40.75 42.00 1366845
2012-08-08 42.00 42.50 42.00 42.25 437467
> # manually assign the object to a "valid" name
> LON.DRTY <- getSymbols("LON:DRTY",src="google",auto.assign=FALSE)
> head(LON.DRTY)
LON:DRTY.Open LON:DRTY.High LON:DRTY.Low LON:DRTY.Close LON:DRTY.Volume
2012-08-01 0.43 0.45 0.41 0.44 410093
2012-08-02 41.25 42.75 40.00 41.50 751816
2012-08-03 41.00 44.00 41.00 42.75 582187
2012-08-06 42.00 44.41 42.00 42.50 370042
2012-08-07 42.00 44.00 40.75 42.00 1366845
2012-08-08 42.00 42.50 42.00 42.25 437467
> # use setSymbolLookup to specify the name
> setSymbolLookup(LON.DRTY=list(name="LON:DRTY",src="google"))
> getSymbols("LON.DRTY")
[1] "LON.DRTY"
> head(LON.DRTY)
LON:DRTY.Open LON:DRTY.High LON:DRTY.Low LON:DRTY.Close LON:DRTY.Volume
2012-08-01 0.43 0.45 0.41 0.44 410093
2012-08-02 41.25 42.75 40.00 41.50 751816
2012-08-03 41.00 44.00 41.00 42.75 582187
2012-08-06 42.00 44.41 42.00 42.50 370042
2012-08-07 42.00 44.00 40.75 42.00 1366845
2012-08-08 42.00 42.50 42.00 42.25 437467
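The naming issue can be reproduced without downloading anything. The sketch below uses only base R; the three numbers are placeholders standing in for the downloaded data, and get() and make.names() are shown as two further base R tools, not as part of quantmod.

```r
# Base R sketch: assign() can create an object whose name contains ":",
# which is roughly the situation after the getSymbols() call above.
assign("LON:DRTY", c(0.44, 41.50, 42.75))

# Backticks quote the name so the parser does not treat ":" as an operator.
print(`LON:DRTY`)

# get() looks the object up by its name given as a string.
print(get("LON:DRTY"))

# make.names() shows what a syntactically valid version would look like.
print(make.names("LON:DRTY"))
```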
Link to data:
http://dl.dropbox.com/u/56075871/data.txt
I want to divide each observation by the mean for that hour. Example:
2012-01-02 10:00:00 5.23
2012-01-03 10:00:00 5.28
2012-01-04 10:00:00 5.29
2012-01-05 10:00:00 5.29
2012-01-09 10:00:00 5.28
2012-01-10 10:00:00 5.33
2012-01-11 10:00:00 5.42
2012-01-12 10:00:00 5.55
2012-01-13 10:00:00 5.68
2012-01-16 10:00:00 5.53
The mean of these values is 5.388. Next I want to divide each observation by that mean: 5.23/5.388, 5.28/5.388, ... through to 5.53/5.388.
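As a quick check of that arithmetic in R, using only the ten values listed above:

```r
# The ten 10:00:00 observations from the example.
x <- c(5.23, 5.28, 5.29, 5.29, 5.28, 5.33, 5.42, 5.55, 5.68, 5.53)

m <- mean(x)
print(m)

# Each observation divided by the mean for that hour.
print(round(x / m, 4))
```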
I have hourly timeseries for 10 stocks:
S1.1h S2.1h S3.1h S4.1h S5.1h S6.1h S7.1h S8.1h S9.1h S10.1h
2012-01-02 10:00:00 64.00 110.7 5.23 142.0 20.75 34.12 32.53 311.9 7.82 5.31
2012-01-02 11:00:00 64.00 110.8 5.30 143.2 20.90 34.27 32.81 312.0 7.97 5.34
2012-01-02 12:00:00 64.00 111.1 5.30 142.8 20.90 34.28 32.70 312.4 7.98 5.33
2012-01-02 13:00:00 61.45 114.7 5.30 143.1 21.01 34.35 32.85 313.0 7.96 5.35
2012-01-02 14:00:00 61.45 116.2 5.26 143.7 21.10 34.60 32.99 312.9 7.95 5.36
2012-01-02 15:00:00 63.95 116.2 5.26 143.2 21.26 34.72 33.00 312.6 7.99 5.37
2012-01-02 16:00:00 63.95 117.3 5.25 143.3 21.27 35.08 33.04 312.7 7.99 5.36
2012-01-02 17:00:00 63.95 117.8 5.24 144.7 21.25 35.40 33.10 313.6 7.99 5.40
2012-01-02 18:00:00 63.95 117.9 5.23 145.0 21.20 35.50 33.17 312.5 7.98 5.35
2012-01-03 10:00:00 63.95 115.5 5.28 143.5 21.15 35.31 33.05 311.7 7.94 5.37
...
And I want to divide each observation by the mean for its hour of the day (across all days).
I have some code. Code to compute the means:
#10:00:00, 11:00:00, ... 18:00:00
times <- paste(seq(10, 18),":00:00", sep="")
#means - matrix of means for timeseries and hour
means <- matrix(ncol= ncol(time_series), nrow = length(times))
for (t in 1:length(times)) {
#t is time 10 to 18
for(i in 1:ncol(time_series)) {
#i is stock 1 to 10
# hour mean for each observation in data
means[t,i] <- mean(time_series[grep(times[t], index(time_series)), i])
}
}
And my function to get "things done":
out <- NULL # start empty so that rbind() can accumulate the results
for (t in 1:length(times)) {
# get all dates with times[t] hour
hours <- time_series[grep(times[t], index(time_series))]
ep <- endpoints(hours, "hours")
out <- rbind(out, period.apply(hours, INDEX=ep, FUN=function(x) {
x/means[t,]
}))
}
I know this is awful, but it works. How can I simplify the code?
Here's one way to do it:
# Split the xts object into chunks by hour
# .indexhour() returns the hourly portion for each timestamp
s <- split(time_series, .indexhour(time_series))
# Use sweep to divide each value of x by colMeans(x) for each group of hours
l <- lapply(s, function(x) sweep(x, 2, colMeans(x), FUN="/"))
# rbind everything back together
r <- do.call(rbind, l)
The scale function can do that. Used with ave, you could restrict the calculations to within each hour. Post the results of dput on that xts/zoo object and you will get rapid replies.
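A minimal sketch of the ave idea, assuming a plain data frame rather than the xts object; the three values come from the S3.1h column in the question, and the names df, hour and ratio are made up for the illustration.

```r
# Base R sketch on a plain data frame; the values come from column S3.1h.
df <- data.frame(
  time = as.POSIXct(c("2012-01-02 10:00:00", "2012-01-02 11:00:00",
                      "2012-01-03 10:00:00"), tz = "UTC"),
  S3 = c(5.23, 5.30, 5.28)
)

# Group key: the hour of day as text ("10", "11", ...).
hour <- format(df$time, "%H")

# ave() returns, for every row, the mean of its group, so dividing by it
# scales each observation by its own hourly mean.
df$ratio <- df$S3 / ave(df$S3, hour)
print(df)
```

Within each hour the ratios average to 1, which is an easy sanity check on the full data.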
I have read previous posts but cannot get what I want. I need a series with 16 intervals per day (except for the first and last day, where the intervals start/end at the first/last observation). Each observed value should be placed in its corresponding interval, with NA in the intervals that have no observation.
My data look as follows: [Ya and Yb are the observed variables]
mdyhms Ya Yb
Mar-27-2009 19:56:47 25 58.25
Mar-27-2009 20:38:59 9 81.25
Mar-28-2009 08:00:30 9 88.75
Mar-28-2009 09:26:29 0 89.25
Mar-28-2009 11:57:01 8.5 74.25
Mar-28-2009 12:19:10 7.5 71.00
Mar-28-2009 14:17:05 1.5 70.00
Mar-28-2009 15:13:14 NA NA
Mar-28-2009 17:09:53 4 85.50
Mar-28-2009 18:37:24 0 86.00
Mar-28-2009 19:19:23 0 50.50
Mar-28-2009 20:45:50 0 36.25
Mar-29-2009 08:44:16 4.5 34.50
Mar-29-2009 10:35:12 8.5 39.50
Mar-29-2009 11:09:13 3.67 69.00
Mar-29-2009 12:40:07 0 54.25
Mar-29-2009 14:31:48 5.33 35.75
Mar-29-2009 16:19:27 6.33 71.75
Mar-29-2009 16:43:20 7.5 64.75
Mar-29-2009 18:37:42 8 83.75
Mar-29-2009 20:01:26 6.17 93.75
Mar-29-2009 20:43:53 NA NA
Mar-30-2009 08:42:05 12.67 88.50
Mar-30-2009 09:52:57 4.33 75.50
Mar-30-2009 12:01:32 1.83 70.75
Mar-30-2009 12:19:40 NA NA
Mar-30-2009 14:23:37 3.83 86.75
Mar-30-2009 16:00:59 37.33 80.25
Mar-30-2009 17:19:28 10.17 77.75
Mar-30-2009 17:49:12 9.83 73.00
Mar-30-2009 20:06:00 11.17 76.75
Mar-30-2009 21:40:35 20.33 68.25
Mar-31-2009 08:11:12 18.33 69.75
Mar-31-2009 09:51:29 14.5 65.50
Mar-31-2009 11:10:41 NA NA
Mar-31-2009 13:27:09 NA NA
Mar-31-2009 13:44:35 NA NA
Mar-31-2009 16:01:23 NA NA
Mar-31-2009 16:56:14 NA NA
Mar-31-2009 18:27:28 NA NA
Mar-31-2009 19:17:46 NA NA
Mar-31-2009 21:12:22 NA NA
Apr-01-2009 08:35:24 2.33 60.25
Apr-01-2009 09:24:49 1.33 71.50
Apr-01-2009 11:28:34 5.67 62.00
Apr-01-2009 13:31:48 NA NA
Apr-01-2009 14:52:18 NA NA
Apr-01-2009 15:11:44 1.5 71.50
Apr-01-2009 17:00:53 3.17 84.00
Thanks!
Presuming your data frame is called "Data", I'd use the xts package. xts objects are a whole lot easier to work with:
#Conversion of dates
Data$time <- as.POSIXct(Data$mdyhms,format="%b-%d-%Y %H:%M:%S")
#conversion to time series
library(xts)
TimeSeries <- xts(Data[,c("Ya","Yb")],Data[,"time"])
Then TimeSeries can be used subsequently. You can't use a normal ts, because you don't have a regular time series. No way on earth you can defend that the time intervals between your observations are equal.
EDIT:
Regarding your remarks in the comments, you can try the following:
#Calculate the period they're into
#This is based on GMT and the fact that POSIXct gives the number of seconds
#passed since the origin. 5400 is 1/16 of 86400 seconds in a day
Data$mdyhms <- as.POSIXct(Data$mdyhms,format="%b-%d-%Y %H:%M:%S",tz="GMT")
Data$Period <- as.numeric(Data$mdyhms) %/% 5400 * 5400
#Make a new data frame with all periods in the range of the dataframe
Date <- as.numeric(trunc(Data$mdyhms,"day"))
nData <- data.frame(
Period = seq(min(Date),max(Date)+86399,by=5400)
)
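# Merge both data frames and take the mean of the values within each period
# ddply comes from plyr; colMeans is used because mean() does not accept a data frame
library(plyr)
nData <- merge(Data[c('Ya','Yb','Period')],nData,by="Period",all=TRUE)
nData <- ddply(nData,"Period",function(d) colMeans(d[c('Ya','Yb')],na.rm=TRUE))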
#Make the time series and get rid of the NaN values
#These come from averaging vectors with only NA
TS <- ts(nData[c('Ya','Yb')],frequency=16)
TS[is.nan(TS)] <- NA
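The binning step can be tried in isolation with base R only. This is a simplified sketch, not the full procedure above: it assumes three observations copied from the question, each falling in a different interval, so the averaging step is left out, and the grid runs from the first to the last observed interval. The names obs, width, grid and binned are made up for the illustration.

```r
# Base R sketch; three observations copied from the question.
obs <- data.frame(
  time = as.POSIXct(c("2009-03-27 19:56:47", "2009-03-28 08:00:30",
                      "2009-03-28 09:26:29"), tz = "GMT"),
  Ya = c(25, 9, 0)
)

width <- 86400 / 16  # 5400 seconds, i.e. 16 intervals per day

# Integer division drops the remainder, so each timestamp maps to the
# start (in seconds since the epoch) of the interval that contains it.
obs$Period <- as.numeric(obs$time) %/% width * width

# Every interval between the first and the last observed one.
grid <- data.frame(Period = seq(min(obs$Period), max(obs$Period), by = width))

# A left join onto the grid leaves NA where an interval has no observation.
binned <- merge(grid, obs[c("Period", "Ya")], by = "Period", all.x = TRUE)
binned$start <- as.POSIXct(binned$Period, origin = "1970-01-01", tz = "GMT")
print(binned)
```

When two observations land in the same interval, as happens in the full data, the merge returns one row per observation, which is why the code above averages within each period.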