I am using R to pull financial data from Yahoo with Quantmod's getSymbols() function. I use a character vector, Tickers, as the first argument in getSymbols() and the function then creates xts objects of each symbol passed from the Tickers vector. I then would like to merge the various xts objects into one object to perform further analysis on. Rather than typing out each new object's name, I would like to reference back to the original Tickers vector and use its contents (a vector of strings) to reference the newly created objects/variables.
So far I've fiddled around with various combinations of the below functions, but have had no luck:
assign(Ticker, merge(Ticker))
eval(parse(text = Ticker)): this seems promising, but it only returns the last object named in the Tickers vector. So close yet so far.
get()
as.name / as.symbol
rlang::syms()
library(tidyverse)
library(quantmod)
# Symbol List
Tickers <- c("RY", "TD", "BNS", "BMO", "CM")
# From To
StartDate <- as.Date("2001-01-01", format = "%Y-%m-%d")
EndDate <- Sys.Date()
# Symbol Lookup Function
SymLookup <- function(ticker, from){
  # Use the `from` argument rather than reaching for the global StartDate
  assign(ticker, getSymbols(ticker, from = from, to = EndDate, auto.assign = FALSE)[,6], pos = 1)
}
# Retrieving price data from Yahoo Finance
for (i in seq_along(Tickers)) {
SymLookup(Tickers[i],StartDate)
}
### At this point we have 5 newly created xts objects which all have variable
### names corresponding to the 5 character strings in the Tickers vector
### i.e. RY is now an xts object from 2001-01-01 to today etc.
### Attempting to systematically merge XTS dataframes together
BtBB <- assign(BtBB_syms, merge(BtBB_syms))
# Doesn't work - no default value for "y" which is missing
BtBB <- merge(eval(parse(text = Tickers)))
# This works, but only creates a merged data.frame with
# the last instance of Tickers, CM
BtBB <- merge(as.name(Tickers))
# Cannot coerce class `"name"` to a data.frame
BtBB <- merge(rlang::syms(Tickers))
# Same error as first attempt with assign function
I am hoping to just create a merged xts object with n number of columns created from however many symbols I input into the initial Tickers vector.
Basically I'm trying to reference variables in the global environment by using a vector of strings (plural) that I created previously.
Thanks so much!
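One way to approach this, sketched with small synthetic xts objects standing in for the downloaded series (the dates and values below are made up, not Yahoo data): mget() looks up every name in Tickers and returns the objects as a list, and do.call() passes that list to merge() as separate arguments. For reference, eval(parse(text = Tickers)) returns only CM because parse() produces one expression per string and eval() evaluates them in turn, returning the value of the last one.

```r
library(xts)

Tickers <- c("RY", "TD", "BNS")
dates <- as.Date("2020-01-01") + 0:2

# Stand-ins for the objects created by the download loop (made-up values).
RY  <- xts(c(100, 101, 102), order.by = dates)
TD  <- xts(c(70, 71, 72), order.by = dates)
BNS <- xts(c(50, 51, 52), order.by = dates)

# mget() fetches the objects by name; do.call() spreads the list over merge().
BtBB <- do.call(merge, unname(mget(Tickers)))

# Label the columns with the ticker names they came from.
colnames(BtBB) <- Tickers
```

The same two lines should work for any number of symbols, as long as every name in Tickers refers to an existing xts object.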
I wish to store some XTS objects as data frames within a list in R.
The XTS objects are stock price data collected using the tidyquant package. I need to convert these objects to data frames and store them in a list. I have one additional requirement: I only want to retain the index column and the closing price column for each stock.
I have tried using dplyr syntax to select the columns of interest, but my code fails to select column indexes greater than 2:
Error: Can't subset columns that don't exist.
x Locations 3 and 4 don't exist.
i There are only 2 columns.
This is the code I am using, but I am struggling to understand why I can't select the closing price from my 'fortified' data frames:
pacman::p_load(tidyquant,tidyverse,prophet)
tickers = c("AAPL","AMZN")
getSymbols(tickers,
from = '2015-01-01',
to = today(),
warnings = FALSE,
auto.assign = TRUE)
dfList <- list()
for (i in tickers) {
dfList[[i]] <- fortify.zoo(i) %>%
select(c(1,5))
}
When I convert an individual XTS object to a data frame using fortify.zoo I can select the columns of interest but not when I loop through them.
fortify.zoo(AAPL) %>% select(c(1,5)) %>% head(n = 10)
Can anyone help me understand where I am falling down in my understanding on this issue please?
getSymbols can put the stock data into an environment (here called stocks), and Cl extracts the closing price together with the index. Replace Cl with Ad if you want the adjusted close. Then iterate over the names in the environment. Finally, either leave the result in the environment stocks or optionally convert it to a list L. No packages are used other than quantmod and the packages it pulls in. There is also the question of whether you need to convert the data to data frames at all; you could just leave it as xts.
library(quantmod)
tickers = c("AAPL","AMZN")
stocks <- new.env()
getSymbols(tickers, env = stocks, from = '2015-01-01')
for(nm in ls(stocks)) stocks[[nm]] <- fortify.zoo(Cl(stocks[[nm]]))
L <- as.list(stocks) # optional
Another possibility if you do want a list is to replace the last two lines with an eapply:
L <- eapply(stocks, function(x) fortify.zoo(Cl(x)))
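The environment pattern itself can be tried without downloading anything. The sketch below uses only base R; the data frames and the close_only() helper are hypothetical stand-ins for the downloaded xts objects and for Cl(), so it illustrates the iteration, not quantmod itself.

```r
# Hypothetical stand-ins for downloaded price tables (made-up numbers).
stocks <- new.env()
stocks$AAPL <- data.frame(Open = c(1, 2), Close = c(1.5, 2.5))
stocks$AMZN <- data.frame(Open = c(3, 4), Close = c(3.5, 4.5))

# Stand-in for Cl(): keep only the Close column.
close_only <- function(x) x[, "Close", drop = FALSE]

# eapply() applies the function to every object in the environment
# and returns the results as a named list.
L <- eapply(stocks, close_only)

# eapply() makes no promise about ordering, so sort by name if order matters.
L <- L[order(names(L))]
```

Keeping the downloads in their own environment means the global workspace stays clean and there is no need to guess which objects are stock data.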
It is better to initialize a list with a fixed length and name it with the tickers. In the OP's code, the loop runs over tickers directly, so each 'i' is a ticker name, which is a string rather than the xts object itself.
dfList <- vector('list', length(tickers))
names(dfList) <- tickers
Because i here is the name of the object as a string ("AAPL" or "AMZN"), we can use get to return the value of that object from the global environment:
for (i in tickers) {
dfList[[i]] <- fortify.zoo(get(i)) %>%
select(c(1,5))
}
Check the dimensions:
sapply(dfList, dim)
# AAPL AMZN
#[1,] 1507 1507
#[2,] 2 2
Another approach is to use mget, which returns all of those objects in a list:
library(purrr)
library(dplyr)
dfList2 <- mget(tickers) %>%
map(~ fortify.zoo(.x) %>%
select(1, 5))
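The difference between a name and the object it names can be seen in a small base-R sketch. The AAPL data frame below is a made-up stand-in for a fortified price table, and dfList3 is a hypothetical name used only here.

```r
# Made-up stand-in for a fortified price table (not real data).
AAPL <- data.frame(Index = 1:3, Open = c(10, 11, 12), High = c(11, 12, 13),
                   Low = c(9, 10, 11), Close = c(10.5, 11.5, 12.5))
tickers <- c("AAPL")

name_only  <- tickers[1]       # the character string "AAPL"
the_object <- get(tickers[1])  # the data frame that the name refers to

# mget() does the lookup for every name at once and returns a named list;
# base-R indexing then keeps columns 1 and 5 of each element.
dfList3 <- lapply(mget(tickers), function(d) d[, c(1, 5)])
```

Passing name_only to a conversion function hands it a one-element string, which is why the original loop ended up with too few columns to select from.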
After downloading stock data with the quantmod package, I want to subset the data and compare the last row of each xts with the previous row (using last and lag).
First, I created a function that classifies the volume by quartile.
Second, I created a new dataset, "stocks_with3", containing the stocks in the list whose volume yesterday was in the 3rd quartile.
Now I'd like to subset the newly created "stocks_with3" dataset again.
Specifically, I want a TRUE/FALSE result from comparing yesterday's "Open" (using last) with the "Close" of the day before yesterday (using lag).
That is, for the stocks whose volume yesterday was in the 3rd quartile, I want to know whether the "Open" was less than or equal to the "Close" of the day before.
But when I run the subset I get the error message "incorrect number of dimensions".
My approach for the subset uses last (to get the last available row in the xts) and lag (to compare it with the immediately preceding row).
#Get stock list data
library(quantmod)
library(xts)
Symbols <- c("XOM","MSFT","JNJ","IBM","MRK","BAC","DIS","ORCL","LW","NYT","YELP")
start_date=as.Date("2018-06-01")
getSymbols(Symbols,from=start_date)
stock_data = sapply(.GlobalEnv, is.xts)
all_stocks <- do.call(list, mget(names(stock_data)[stock_data]))
#function to split volume data quartiles into 0-4 results
Volume_q_rank <- function(x) {
stock_name <- stringi::stri_extract(names(x)[1], regex = "^[A-Z]+")
stock_name <- paste0(stock_name, ".Volqrank")
column_names <- c(names(x), stock_name)
x$volqrank <- as.integer(cut(quantmod::Vo(x),
quantile(quantmod::Vo(x),probs=0:4/4),include.lowest=TRUE))
x <- setNames(x, column_names)
return(x)
}
all_stocks <- lapply(all_stocks, Volume_q_rank)
#Create a new dataset using names and which with stocks of Volume in the 3rd quartile.
stock3 <- sapply(all_stocks, function(x) {last(x[, grep("\\.Volqrank",names(x))]) == 3})
stocks_with3 <- names(which(stock3 == TRUE))
#Here is when I get the error.
stock3_check <- sapply(stocks_with3, function(x) {last(x[, grep("\\.Open",names(x))]) <= lag(x[, grep("\\.Close", 1), names(x)])})
#Expected result: the same as running this for a single stock, but applied to all the stocks in the list:
last(all_stocks$MSFT$MSFT.Open) <= lag(all_stocks$MSFT$MSFT.Close, 1)
#But I get the error when I try to apply it to the whole list using "sapply", "last" and "lag"
Any suggestion will be appreciated.
Thank you very much.
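You have 2 mistakes in your sapply call. First, you are iterating over a character vector (stocks_with3) instead of a list (all_stocks), so x inside the function is a string, and indexing a string with x[, ...] is what raises "incorrect number of dimensions". Second, the function used inside sapply is incorrect: the parentheses are misplaced, so grep is closed before names(x) and the lag argument 1 has ended up inside grep.
This should work: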
stock3_check <- sapply(all_stocks[stocks_with3], function(x) {
last(x[, grep("\\.Open", names(x))]) <= lag(x[, grep("\\.Close", names(x))])
})
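The first mistake can be reproduced without any stock data. In this base-R sketch the two matrices are made-up stand-ins for the xts objects (so colnames() is used where the xts version uses names()), and AAA and BBB are hypothetical symbols.

```r
# Made-up stand-ins: two small matrices in a named list.
all_stocks <- list(
  AAA = matrix(1:4, 2, dimnames = list(NULL, c("AAA.Open", "AAA.Close"))),
  BBB = matrix(5:8, 2, dimnames = list(NULL, c("BBB.Open", "BBB.Close")))
)
stocks_with3 <- c("AAA")

# Iterating over the names: x is the string "AAA", which has no columns,
# so two-dimensional indexing fails. tryCatch() captures the error message.
msg <- tryCatch(
  sapply(stocks_with3, function(x) x[, grep("\\.Open", names(x))]),
  error = function(e) conditionMessage(e)
)

# Iterating over the list subset: x is the matrix itself, so indexing works.
opens <- sapply(all_stocks[stocks_with3],
                function(x) x[, grep("\\.Open", colnames(x))])
```

Subsetting the list by name, all_stocks[stocks_with3], keeps the filter from the earlier step while still handing the actual objects to the function.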
Additional comments
I'm not sure what you are trying to achieve with this code. As for retrieving your data, the following is easier to read, and it avoids first creating all the objects in your R session and then collecting them into a list:
my_stock_data <- lapply(Symbols , getSymbols, auto.assign = FALSE)
names(my_stock_data) <- Symbols
My code:
library(quantmod)
library(tseries)
library(ggplot2)
companies = c("IOC.BO", "BPCL.BO", "ONGC.BO", "HINDPETRO.BO", "GAIL.BO")
stocks = list()
for(i in 1:5){
stocks[[i]] = getSymbols(companies[i], auto.assign = FALSE)
}
stocks is a list of data frames. Now I'm trying to bind all the adjusted columns of the data frames stored in stocks, but to do that I need to remove the row names (someone please tell me if there's a better method to do this):
for(i in 1:5)
rownames(stocks[[i]])<- NULL
but the resulting dataframes still have their row names, could someone please tell me where I'm going wrong?
P.S. My end goal is to have a data frame with only the adjusted columns of the data frames in the list stocks, for which I did this:
adjusted=data.frame()
for(i in 1:5)
coln=stocks[[1]][,6]
adjusted=cbind(ajusted,coln)
adjusted
but this returns adjusted as a list.
Row names
Regarding row names, after running the code in the question we get:
rownames(stocks[[1]])
## NULL
so the elements of stocks do not in fact have row names afterwards.
Adjusted series
To create a time series of adjusted values use Ad as shown below.
Adjusted <- do.call("merge", lapply(stocks, Ad))
Putting it all together
Note that we don't need any of the row name processing; the following is sufficient. The second-to-last line is optional, as its only purpose is to make the column names nicer. The last line converts the xts object Adjusted to a data frame and may not be needed either, since you may find an xts object more convenient to work with than a data frame.
library(quantmod)
library(ggplot2)
companies <- c("IOC.BO", "BPCL.BO", "ONGC.BO", "HINDPETRO.BO", "GAIL.BO") # as defined in the question
stocks <- lapply(companies, getSymbols, auto.assign = FALSE)
Adjusted <- do.call("merge", lapply(stocks, Ad))
names(Adjusted) <- sub(".BO.Adjusted", "", names(Adjusted))
adjustedDF <- fortify(Adjusted)
I am trying to properly convert a data frame (with nested data frames and lists) into an xts object. I know that the best starting point for creating an xts is a matrix, but I am trying to simulate the structure of the data as I originally download it.
My question: if I have to start from this setup with nested data frames and lists, what steps, beyond those in the script below, do I need to take to get the data into an xts?
My code is below. Note that the xts is created successfully if the whole of section 3 is commented out.
# 1.creates the "top-level" df
date <- c("2016-10-10 21:32:00", "2016-10-10 21:33:00") # vector for creating the df
volume <- c(1,2) # vector for creating the df
df2 <- data.frame(date = date, volume = volume) # creating the df
# 2.change class of 2 columns
df2$date <- as.POSIXct(df2$date) #change from character to POSIX
df2$volume <- as.numeric(df2$volume) #change from character to numeric
# 3.create openPrices (set: base/bid/ask) < - Potential problem for creating xts
df2$openPrice <- data.frame(no2 = c(1,2)) # creating a nested df with 2 temp column/values.
df2$openPrice$bid <- list(1.1, 1.2) # creating a list within the nested df
df2$openPrice$ask <- list(2.2, 2.3) # creating a list within the nested df
df2$openPrice$no2 <- NULL # remove the 2 temp column/values
# 4.create an xts(myxts1), based on a dataframe (df2) with nested dataframes and lists.
myxts1 <- xts(df2[,-1],order.by = df2$date)
Note: the nesting needs to be stripped from the result, since it is the nested structure that causes the problem when creating the xts.
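A minimal sketch of one way to strip the nesting, assuming every cell of the list columns holds exactly one number (as in the example data): unlist each list column of the nested data frame, bind the results into a single numeric matrix, and only then build the xts. The names prices and flat, and the "openPrice." column prefix, are illustrative choices rather than anything required by xts.

```r
library(xts)

# Rebuild the nested structure from the question.
df2 <- data.frame(
  date = as.POSIXct(c("2016-10-10 21:32:00", "2016-10-10 21:33:00")),
  volume = c(1, 2)
)
df2$openPrice <- data.frame(no2 = c(1, 2))
df2$openPrice$bid <- list(1.1, 1.2)
df2$openPrice$ask <- list(2.2, 2.3)
df2$openPrice$no2 <- NULL

# Flatten: turn each list column of the nested data frame into a numeric vector.
prices <- sapply(df2$openPrice, unlist)
colnames(prices) <- paste0("openPrice.", colnames(prices))

# Bind everything into one numeric matrix and index it by the timestamps.
flat <- cbind(volume = df2$volume, prices)
myxts1 <- xts(flat, order.by = df2$date)
```

If a list cell could hold more than one value, unlist() would change the row count, so that case would need its own handling before the cbind() step.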
I want to extract the numerical values of an xts object. Let's look at an example:
library(quantmod)
data <- new.env()
starting.date <- as.Date("2006-01-01")
nlookback <- 20
getSymbols("UBS", env = data, src = "yahoo", from = starting.date)
Reg.curve <- rollapply(Cl(data$UBS), nlookback, mean, align="right")
Reg.curve is still an xts object, but I'm actually only interested in the running means. How can I modify Reg.curve to get a numerical vector?
Use coredata:
reg.curve.num <- coredata(Reg.curve)
# or, if you want a vector:
reg.curve.num <- drop(coredata(Reg.curve))
To extract the numerical values of any xts, ts, or zoo object use:
as.numeric(Reg.curve)
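The three options differ mainly in the shape of what they return. A small sketch with a synthetic one-column xts (made-up values, standing in for Reg.curve):

```r
library(xts)

# Synthetic one-column series (made-up values).
x <- xts(c(1.5, 2.5, 3.5), order.by = as.Date("2020-01-01") + 0:2)

m  <- coredata(x)        # the data as a matrix, with the time index removed
v1 <- drop(coredata(x))  # the same values with the matrix dimensions dropped
v2 <- as.numeric(x)      # a plain numeric vector
```

For a one-column series v1 and v2 hold the same numbers; coredata() on its own is the option that keeps the column layout, which matters once there is more than one column.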