Combining time-series objects and lists: Package "termstrc" - r

The R package "termstrc", designed for term-structure estimation, is an incredibly useful tool, but it requires data to be set in a particularly awkward format: lists within lists.
Question: What is the best way to prepare and shape data, either outside R or inside R, in order to create the repeated sublist format required to run the function "dyncouponbonds"?
The "dyncouponbonds" command requires data to be set in a repeated sublist, whereby a list of bonds and time-invariant features of those bonds (let's call this "bondlist"), is appended with some time t features of those bonds (price and accrued interest), and replicated for time t+1 to T.
Below is an example of the list format for one period. The "dyncouponbonds" command requires this format to be replicated, within an umbrella list, for all T periods. ISIN, MATURITYDATE, ISSUEDATE, COUPONRATE will be identical for each period. PRICE, ACCRUED, CASHFLOWS and TODAY will be different for each period.
R> str(govbonds$GERMANY)
List of 8
$ ISIN : chr [1:52] "DE0001141414" "DE0001137131" "DE0001141422" ...
$ MATURITYDATE:Class 'Date' num [1:52] 13924 13952 13980 14043 ...
$ ISSUEDATE :Class 'Date' num [1:52] 11913 13215 12153 13298 ...
$ COUPONRATE : num [1:52] 0.0425 0.03 0.03 0.0325 ...
$ PRICE : num [1:52] 100 99.9 99.8 99.8 ...
$ ACCRUED : num [1:52] 4.09 2.66 2.43 2.07 ...
$ CASHFLOWS :List of 3
..$ ISIN: chr [1:384] "DE0001141414" "DE0001137131" "DE0001141422" ...
..$ CF : num [1:384] 104 103 103 103 ...
..$ DATE:Class 'Date' num [1:384] 13924 13952 13980 14043 ...
$ TODAY :Class 'Date' num 13908
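To make the target shape concrete, here is a minimal base R sketch that builds one such list per date from three flat tables and wraps them in an umbrella list. Everything in it is an assumption for illustration: the ISINs, prices and cash flows are made up, the group name "TOY" and the helper make_period are hypothetical, and the result only mirrors the str() output above; it has not been validated against termstrc itself.

```r
# Static, time-invariant features of two hypothetical bonds (all values made up).
static <- data.frame(
  ISIN         = c("XX0000000001", "XX0000000002"),
  MATURITYDATE = as.Date(c("2030-01-15", "2032-07-15")),
  ISSUEDATE    = as.Date(c("2020-01-15", "2022-07-15")),
  COUPONRATE   = c(0.02, 0.03),
  stringsAsFactors = FALSE
)

# Time-varying features in long form: one row per (date, bond).
quotes <- data.frame(
  DATE    = as.Date(rep(c("2024-03-01", "2024-03-04"), each = 2)),
  ISIN    = rep(static$ISIN, times = 2),
  PRICE   = c(95.1, 98.4, 95.3, 98.2),
  ACCRUED = c(0.25, 1.90, 0.27, 1.92),
  stringsAsFactors = FALSE
)

# Full cash-flow schedule per bond (annual coupons plus redemption at maturity).
cashflows <- data.frame(
  ISIN = rep(static$ISIN, times = c(6, 9)),
  CF   = c(rep(2, 5), 102, rep(3, 8), 103),
  DATE = c(seq(as.Date("2025-01-15"), by = "year", length.out = 6),
           seq(as.Date("2024-07-15"), by = "year", length.out = 9)),
  stringsAsFactors = FALSE
)

# Build the list for a single period: static fields are copied unchanged,
# PRICE, ACCRUED, CASHFLOWS and TODAY depend on the date.
make_period <- function(today, group = "TOY") {
  q  <- quotes[quotes$DATE == today, ]
  q  <- q[match(static$ISIN, q$ISIN), ]       # align quotes with the static table
  cf <- cashflows[cashflows$DATE > today, ]   # keep only flows still to be paid
  period <- list(list(
    ISIN         = static$ISIN,
    MATURITYDATE = static$MATURITYDATE,
    ISSUEDATE    = static$ISSUEDATE,
    COUPONRATE   = static$COUPONRATE,
    PRICE        = q$PRICE,
    ACCRUED      = q$ACCRUED,
    CASHFLOWS    = list(ISIN = cf$ISIN, CF = cf$CF, DATE = cf$DATE),
    TODAY        = today
  ))
  names(period) <- group
  class(period) <- "couponbonds"
  period
}

# Replicate for every date and wrap in the umbrella list.
days <- sort(unique(quotes$DATE))
dyn  <- lapply(seq_along(days), function(i) make_period(days[i]))
names(dyn) <- as.character(days)
class(dyn) <- "dyncouponbonds"

str(dyn[[1]]$TOY, max.level = 1)
```

Keeping the static, time-varying and cash-flow data in separate flat tables means the per-period list is assembled in one place, so the nesting only has to be got right once.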

This is a fairly advanced data manipulation question. R has many powerful data manipulation tools, and you're not going to need to move away from R to prepare the (admittedly fairly obtuse) dyncouponbonds object. Indeed you actually shouldn't, because taking a structure from another language and then turning it into dyncouponbonds will simply be more work.
The first thing I would make sure is that you are very familiar with the lapply function. You're going to be making plenty of use of it. You're going to be using it to create a list of couponbonds objects, which is what dyncouponbonds actually is. Creating couponbonds objects however is a little tougher, mainly because of the CASHFLOWS sublist which wants each cashflow associated with the bond's ISIN and with the date of the cashflow. For this you'll use lapply and some fairly advanced subscripting. The subset function will also come in handy.
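As a small illustration of the lapply-plus-subscripting pattern described above, the sketch below builds a CASHFLOWS-style sublist from per-bond cash-flow tables. The ISINs, dates and amounts are made up, and cf_by_bond is a hypothetical intermediate structure, not something termstrc or Bloomberg provides.

```r
# Hypothetical per-bond cash-flow tables, keyed by ISIN (values are made up).
cf_by_bond <- list(
  XX0000000001 = data.frame(DATE = as.Date(c("2025-01-15", "2026-01-15")),
                            CF   = c(2, 102)),
  XX0000000002 = data.frame(DATE = as.Date(c("2024-07-15", "2025-07-15", "2026-07-15")),
                            CF   = c(3, 3, 103))
)

today <- as.Date("2024-12-31")

# One list per bond: each remaining cash flow is paired with the bond's ISIN and its date.
per_bond <- lapply(names(cf_by_bond), function(isin) {
  flows <- subset(cf_by_bond[[isin]], DATE >= today)
  list(isin = rep(isin, nrow(flows)), cf = flows$CF, dt = flows$DATE)
})

# Flatten the per-bond lists into three parallel vectors.
# do.call(c, ...) keeps the Date class, whereas unlist() would return plain numbers.
CASHFLOWS <- list(
  ISIN = unlist(lapply(per_bond, function(x) x$isin)),
  CF   = unlist(lapply(per_bond, function(x) x$cf)),
  DATE = do.call(c, lapply(per_bond, function(x) x$dt))
)

str(CASHFLOWS)
```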
This question also very much depends on where you will be getting the data from. Getting it out of Bloomberg is non-trivial, mainly because you will need to go back in history using the BDS function and the "DES_CASH_FLOW" field for each bond to get its cashflows. I say history because, if you're using dyncouponbonds, I'm assuming you will want to do historic yield curve analysis. You'll need to override the BDS function's "SETTLE_DT" field with the value returned for the bond by the BDP function and field "FIRST_SETTLE_DT", so that you get all the cashflows from the beginning of the bond's life (otherwise it'll only return cashflows from today, and that's no good for historic analysis). But I digress. If you're not using Bloomberg, I don't know where you'll get this data from.
You'll then need to get the static data for each bond, namely the maturity, the ISIN, the coupon rate and the issue date. And you'll need historic price and accrued interest data. Again, if using Bloomberg, you'll use the BDP function for the static data, with the fields you'll see in the code below, and the historic data function BDH, which I have wrapped as bbdh. Assuming again that you're a Bloomberg user, here is the code:
bbGetCountry <- function(cCode, up = FALSE) {
# this function is going to get all the data out of bloomberg that we need for a
# country, and update it if necessary
if (up == TRUE) startDate <- as.Date("2012-01-01") else startDate <- histStartDate
# first get all the curve members for history
wdays <- wdaylist(startDate, Sys.Date()) # create the list of working days from startdate
actives <- lapply(wdays, function(x) {
bds(conn, BBcurveIDs[cCode], "CURVE_MEMBERS", override_fields = "CURVE_DATE",
override_values = format(x, "%Y%m%d"))
})
names(actives) <- wdays
uniqueActives <- unique(unlist(actives)) # there will be puhlenty duplicates. Get rid of them
# now get the unchanging bond data
staticData <- bdp(conn, uniqueActives, bbStaticDataFields)
# now get the cash flowdata
cfData <- lapply(uniqueActives, function(x) {
bds(conn, x, "DES_CASH_FLOW_ADJ", override_fields = "SETTLE_DT",
override_values = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d"))
})
names(cfData) <- uniqueActives
# now for historic data
historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
names(historicData) <- bbHistoricDataFields # put the names in otherwise we get a numbered list
allDates <- as.Date(index(historicData$LAST_PRICE)) # all the dates we will find settlement dates for for all bonds. No posix
save(actives, file = paste("data/", cCode, "actives.dat", sep = "")) #save all the files now
save(staticData, file = paste("data/", cCode, "staticData.dat", sep = ""))
save(cfData, file = paste("data/", cCode, "cfData.dat", sep = ""))
save(historicData, file = paste("data/", cCode, "historicData.dat", sep = ""))
#save(settleDates, file = paste("data/", cCode, "settleDates.dat", sep = ""))
assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData, #
historicData = historicData), pos = 1)
}
The bbdh function I use above is a wrapper around the Rbbg library's bdh function and looks like this:
bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
#this function gets secs over years from bloomberg daily data
if(is.null(startDate)) startDate <- Sys.Date() - years * 365.25
if(inherits(startDate, "Date")) startDate <- format(startDate, "%Y%m%d") #convert date classes to bb string
if(nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d") # if we've been passed wrong format character string
rawd <- bdh(conn, secs, flds, startDate, always.display.tickers = TRUE, include.non.trading.days = TRUE,
option_names = c("nonTradingDayFillOption", "nonTradingDayFillMethod"),
option_values = c("NON_TRADING_WEEKDAYS", "PREVIOUS_VALUE"))
rawd <- dcast(rawd, date ~ ticker) #put into columns
colnames(rawd) <- sub(" .*", "", colnames(rawd)) #remove the govt, currncy bits from bb tickers
return(xts(rawd[, -1], order.by = as.POSIXct(rawd[, 1])))
}
The country code comes from a structure which associates two letter names with bloomberg yield curve descriptions:
BBcurveIDs <- list(PO = "YCGT0084 Index", #Portugal
DE = "YCGT0016 Index",
FR = "YCGT0014 Index",
SP = "YCGT0061 Index",
IT = "YCGT0040 Index",
AU = "YCGT0001 Index", #Australia
AS = "YCGT0063 Index", #Austria
JP = "YCGT0018 Index",
GB = "YCGT0022 Index",
HK = "YCGT0095 Index",
CA = "YCGT0007 Index",
CH = "YCGT0082 Index",
NO = "YCGT0078 Index",
SE = "YCGT0021 Index",
IR = "YCGT0062 Index",
BE = "YCGT0006 Index",
NE = "YCGT0020 Index",
ZA = "YCGT0090 Index",
PL = "YCGT0177 Index", #Poland
MX = "YCGT0251 Index")
So bbGetCountry will create four data structures, called actives, staticData, cfData, and historicData, using the following Bloomberg fields (bbDynamicDataFields is listed as well, although bbGetCountry as shown does not use it):
bbStaticDataFields <- c("ID_ISIN",
"ISSUER",
"COUPON",
"CPN_FREQ",
"MATURITY",
"CALC_TYP_DES", # pricing calculation type
"INFLATION_LINKED_INDICATOR", # N or Y, in R returned as TRUE or FALSE
"ISSUE_DT",
"FIRST_SETTLE_DT",
"PX_METHOD", # PRC or YLD
"PX_DIRTY_CLEAN", # market convention dirty or clean
"DAYS_TO_SETTLE",
"CALLABLE",
"MARKET_SECTOR_DES",
"INDUSTRY_SECTOR",
"INDUSTRY_GROUP",
"INDUSTRY_SUBGROUP")
bbDynamicDataFields <- c("IS_STILL_CALLABLE",
"RTG_MOODY",
"RTG_MOODY_WATCH",
"RTG_SP",
"RTG_SP_WATCH",
"RTG_FITCH",
"RTG_FITCH_WATCH")
bbHistoricDataFields <- c("PX_BID",
"PX_ASK",
#"PX_CLEAN_BID",
#"PX_CLEAN_ASK",
"PX_DIRTY_BID",
"PX_DIRTY_ASK",
#"ASSET_SWAP_SPD_BID",
#"ASSET_SWAP_SPD_ASK",
"LAST_PRICE",
#"SETTLE_DT",
"YLD_YTM_MID")
Now you're ready to create couponbond objects, using all these data structures:
createCouponBonds <- function(cCode, dateString) {
cdata <- get(paste(cCode, "data", sep = "")) # get the data set
today <- as.Date(dateString)
settleDate <- today
daycount <- 0
while(daycount < 3) {
settleDate <- settleDate + 1
if (!(weekdays(settleDate) %in% c("Saturday", "Sunday"))) daycount <- daycount + 1
}
goodbonds <- subset(cdata$staticData, COUPON != 0 & INFLATION_LINKED_INDICATOR == FALSE) # clean out zeros and tbills
goodbonds <- goodbonds[rownames(goodbonds) %in% cdata$actives[[dateString]][, 1], ]
stripnames <- sapply(strsplit(rownames(goodbonds), " "), function(x) x[1])
pxbid <- cdata$historicData$PX_BID[today, stripnames]
pxask <- cdata$historicData$PX_ASK[today, stripnames]
pxdbid <- cdata$historicData$PX_DIRTY_BID[today, stripnames]
pxdask <- cdata$historicData$PX_DIRTY_ASK[today, stripnames]
price <- as.numeric((pxbid + pxask) / 2)
accrued <- as.numeric(pxdbid - pxbid)
cashflows <- lapply(rownames(goodbonds), function(x) {
goodflows <- cdata$cfData[[x]][as.Date(cdata$cfData[[x]][, "Date"]) >= today, ]
#gfstipnames <- sapply(strsplit(rownames(goodflows), " "), function(x) x[1]) dunno if I need this
isin <- rep(cdata$staticData[x, "ID_ISIN"], nrow(goodflows))
cf <- apply(goodflows[, 2:3], 1, sum) / 10000
dt <- as.Date(goodflows[, 1])
return(list(isin = isin, cf = cf, dt = dt))
})
isinvec <- unlist(lapply(cashflows, function(x) x$isin))
cfvec <- as.numeric(unlist(lapply(cashflows, function(x) x$cf)))
datevec <- unlist(lapply(cashflows, function(x) x$dt))
govbonds <- list(ISIN = goodbonds$ID_ISIN,
MATURITYDATE = as.Date(goodbonds$MATURITY),
ISSUEDATE = as.Date(goodbonds$FIRST_SETTLE_DT),
COUPONRATE = as.numeric(goodbonds$COUPON) / 100,
PRICE = price,
ACCRUED = accrued,
CASHFLOWS = list(ISIN = isinvec, CF = cfvec, DATE = as.Date(datevec, origin = "1970-01-01")), # unlist() dropped the Date class, so restore it explicitly
TODAY = settleDate)
govbonds <- list(govbonds)
names(govbonds) <- cCode
class(govbonds) <- "couponbonds"
return(govbonds)
}
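The settlement-date loop at the top of createCouponBonds relies on weekdays(), whose output depends on the session locale. Below is a minimal sketch of the same T+3 business-day step as a standalone helper. It assumes weekends are the only non-business days (holidays are ignored, as in the loop above), and add_business_days is a hypothetical name, not part of termstrc or Rbbg.

```r
# Step forward n business days, skipping Saturdays and Sundays.
# as.POSIXlt(d)$wday is 0 for Sunday and 6 for Saturday in every locale.
add_business_days <- function(from, n = 3) {
  d <- as.Date(from)
  added <- 0
  while (added < n) {
    d <- d + 1
    if (!(as.POSIXlt(d)$wday %in% c(0, 6))) added <- added + 1
  }
  d
}

add_business_days("2024-03-01")   # a Friday, so the result falls in the following week
```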
Take a close look at the cashflows <- lapply(...) call, because this is where you create the sublist, and it is the core of the answer to your question. How this is done depends very much on how you have decided to build the intermediate data structures, and I have given you just one possibility. I realise that my answer is complex, but the problem is very complex. Not all the code you need is in this answer either: a few helper functions are missing, but I am happy to provide them if you contact me. Certainly the skeleton of the core functions is all here, and actually much of the problem is getting the data in the first place and structuring it appropriately.
You correctly surmise that some of the data is static for each bond, some of it is dynamic, and some of it is historical. So the dimensions of the intermediate data structures are different for different pieces of the couponbonds objects. How you represent that is up to you, though I have used separate lists / data frames for each, linked via the bond IDs where necessary.
The createCouponBonds function takes a date string, so you can call it for each of your historic data points using the above-mentioned lapply, and hey "presto", dyncouponbonds:
# dodates is a vector of date strings ("YYYY-MM-DD") to process; it is not defined in this answer
spl <- lapply(dodates, function(x) createCouponBonds("SP", x))
names(spl) <- sapply(spl, function(x) as.character(x$SP$TODAY))
class(spl) <- "dyncouponbonds"
There you go. You asked for it....
If you're not using Bloomberg, your input data structures will be very different but, as I said starting out, get super familiar with lapply and sapply. Obviously there are many other ways this problem could be solved, but the above works for Bloomberg. If you understand this code, you'll surely know what you're doing for other data sources.
Finally, please note that the Rbbg package from findata.org is used to interface to Bloomberg.

My 2 cents: I have been trying to get this to work with the newer Rblpapi package. I still have some problems with the createCouponBonds part, but I think the other functions return correctly. This won't solve the whole problem, but it is at least a partial fix. BBcurveIDs, bbStaticDataFields, bbDynamicDataFields and bbHistoricDataFields are the same as above.
bbGetCountry <- function(cCode, up = FALSE) {
if (up == TRUE) startDate <- as.Date("2016-01-01") else startDate <- histStartDate
cal <- Calendar(weekdays=c("saturday", "sunday"))
wdays <- as.list(bizseq(startDate, Sys.Date(), cal))
actives <- lapply(wdays, function(x) {
bds(BBcurveIDs[cCode][[1]], "CURVE_MEMBERS", override = c(CURVE_DATE=format(x, "%Y%m%d")))
})
names(actives) <- wdays
uniqueActives <- unique(unlist(actives))
staticData <- bdp(uniqueActives, bbStaticDataFields)
cfData <- lapply(uniqueActives, function(x) {
bds(x, "DES_CASH_FLOW_ADJ", override = c(SETTLE_DT = format(as.Date(staticData[x, "FIRST_SETTLE_DT"]), "%Y%m%d")))
})
names(cfData) <- uniqueActives
historicData <- lapply(bbHistoricDataFields, function(x) bbdh(uniqueActives, flds = x, startDate = startDate))
names(historicData) <- bbHistoricDataFields
allDates <- as.Date(index(historicData$LAST_PRICE))
save(actives, file = paste("data_", cCode, "actives.dat", sep = ""))
save(staticData, file = paste("data_", cCode, "staticData.dat", sep = ""))
save(cfData, file = paste("data_", cCode, "cfData.dat", sep = ""))
save(historicData, file = paste("data_", cCode, "historicData.dat", sep = ""))
#save(settleDates, file = paste("data_", cCode, "settleDates.dat", sep = ""))
assign(paste(cCode, "data", sep = ""), list(actives = actives, staticData = staticData, cfData = cfData, #
historicData = historicData), pos = 1)
}
And the bbdh function:
bbdh <- function(secs, years = 1, flds = "last_price", startDate = NULL) {
if(is.null(startDate)) startDate <- Sys.Date() - years * 365.25
if(class(startDate) == "Date") stardDate <- format(startDate, "%Y%m%d")
if(nchar(startDate) > 8) startDate <- format(as.Date(startDate), "%Y%m%d")
rawd <- bdh(secs, flds,
startDate,
include.non.trading.days = FALSE,
options = structure(c("PREVIOUS_VALUE", "NON_TRADING_WEEKDAYS"),
names = c("nonTradingDayFillMethod","nonTradingDayFillOption")))
rawd <- ldply(rawd, data.frame)
colnames(rawd) <- c("sec", "date", "fld")
rawd <- dcast(rawd, date ~ sec, value.var="fld")
colnames(rawd) <- gsub(" Corp", "", colnames(rawd))
return(xts(rawd[,-1], order.by=rawd[,1]))
}
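The ldply/dcast step above turns one data frame per security into a single wide table with one column per security. For readers without plyr and reshape2, here is a rough base R sketch of the same long-to-wide reshape. The security names and values are invented, and the long table stands in for what ldply is assumed to return (columns sec, date, fld).

```r
# Made-up long-format data: one row per (security, date).
long <- data.frame(
  sec  = rep(c("AAA Corp", "BBB Corp"), each = 3),
  date = rep(as.Date("2024-03-01") + 0:2, times = 2),
  fld  = c(100, 101, 102, 50, 51, 52),
  stringsAsFactors = FALSE
)

# Long to wide: one row per date, one column per security.
wide <- reshape(long, idvar = "date", timevar = "sec", direction = "wide")

# reshape() prefixes the value column name ("fld."); drop it and the " Corp" suffix.
names(wide) <- sub("^fld\\.", "", names(wide))
names(wide) <- sub(" Corp$", "", names(wide))

print(wide)
```

The resulting data frame has the date in the first column, so it can be passed to xts() in the same way as the dcast output.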

Related

Why is my loop in R skipping the first element from the results?

My code takes the two destination airports (JFK and then Las Vegas), passes them through a URL to return flight information in the For Loop, which I'm trying to add to a data frame. However, it only is including the results from the last element, Las Vegas. Should I use something other than a list for this?
library (httr)
library (jsonlite)
des <- c("JFK", "LAS")
flights = NULL
flights = list()
for (x in 1 : length(des))
{
url <- paste0("https://travelpayouts-travelpayouts-flight-data-v1.p.rapidapi.com/v1/prices/direct/?destination=", des[x], "&origin=BOS")
r<-GET(url, add_headers("X-RapidAPI-Host" = "travelpayouts-travelpayouts-flight-data-v1.p.rapidapi.com",
"X-RapidAPI-Key" = " MY KEY HERE ",
"X-Access-Token" = " MY TOKEN HERE"))
jsonResponseParsed<-content(r,as="text")
f <- fromJSON(jsonResponseParsed, flatten = TRUE)
flights[[x]] <- data.frame(f$data)
}
data = do.call(rbind, flights)
#price will be in rubles will need to convert to USD

Problems extracting metadata from NCBI in R

I am trying to extract some information (metadata) from GenBank using the R package "rentrez" and the example I found here https://ajrominger.github.io/2018/05/21/gettingDNA.html. Specifically, for a particular group of organisms, I search for all records that have geographical coordinates and then want to extract data about the accession number, taxon, sequenced locus, country, lat_long, and collection date. As an output, I want a csv file with the data for each record in a separate row. It seems that the code below can do the job but at some point, rows get muddled with data from different records overlapping the neighbouring rows. For example, from 157 records that rentrez retrieves from NCBI 109 records in the file look like what I want to achieve but the rest is a total mess. I would greatly appreciate any advice on how to fix the issue because I am a total newbie with R and figuring out each step takes a lot of time.
setwd ("C:/R-Works")
library('XML')
library('rentrez')
argasid <- entrez_search(db="nuccore", term = "Argasidae[Organism] AND [lat]", use_history=TRUE, retmax=15000)
x <- entrez_fetch (db="nuccore", id=argasid$ids, rettype= "native", retmode="xml", parse=TRUE)
x <-xmlToList(x)
cleanEntrez <- function(x) {
basePath <- 'Seq-entry_seq.Bioseq'
c(
genbank = as.character(x[paste(basePath,
'Bioseq_id', 'Seq-id', 'Seq-id_genbank',
'Textseq-id', 'Textseq-id_accession',
sep = '.')]),
taxon = as.character(x[paste(basePath,
'Bioseq_descr', 'Seq-descr', 'Seqdesc',
'Seqdesc_source', 'BioSource', 'BioSource_org',
'Org-ref', 'Org-ref_taxname',
sep = '.')]),
bseqdesc_title = as.character(x[paste(basePath,
'Bioseq_descr', 'Seq-descr', 'Seqdesc',
'Seqdesc_title',
sep = '.')]),
lat_lon = as.character(x[grep('lat-lon', x) + 1]),
geo_description = as.character(x[grep('country', x) + 1]),
coll_date = as.character(x[grep('collection-date', x) + 1])
)
}
getGenbankMeta <- function(ids) {
allRec <- entrez_fetch(db = 'nuccore', id = ids,
rettype = 'native', retmode = 'xml',
parsed = TRUE)
allRec <- xmlToList(allRec)[[1]]
o <- lapply(allRec, function(x) {
cleanEntrez(unlist(x))
})
temp <- array(unlist(o), dim = c(length(o[[1]]), length(ids)))
seqVec <- temp[nrow(temp), ]
seqDF <- as.data.frame(t(temp[-nrow(temp), ]))
names(seqDF) <- names(o[[1]])[-nrow(temp)]
return(list(seq = seqVec, data = seqDF))
}
write.csv(getGenbankMeta(argasid$ids), 'argasid_georef.csv')
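One plausible cause, not confirmed anywhere in this thread, is that array(unlist(o), ...) assumes every record yields exactly the same number of fields: a record with a missing or repeated field would shift all the values that follow it. The sketch below shows one way to guard against that by forcing each record onto a fixed set of named fields before binding rows. The records, accession strings and the helper normalise are made up for illustration.

```r
# The fields every output row should have, in this order.
fields <- c("genbank", "taxon", "lat_lon", "country", "coll_date")

# Made-up records: the second lacks two fields, the third repeats one.
records <- list(
  c(genbank = "ACC0001", taxon = "Argas sp.", lat_lon = "10.5 N 20.1 E",
    country = "Kenya", coll_date = "2010"),
  c(genbank = "ACC0002", taxon = "Ornithodoros sp.", country = "Chile"),
  c(genbank = "ACC0003", taxon = "Argas sp.", lat_lon = "1 S 2 W",
    lat_lon = "3 S 4 W", country = "Peru", coll_date = "2015")
)

# Keep the first value found for each field, and NA where a field is absent,
# so every record ends up with the same length and the same field order.
normalise <- function(rec) {
  out <- rec[match(fields, names(rec))]
  names(out) <- fields
  out
}

meta <- as.data.frame(do.call(rbind, lapply(records, normalise)),
                      stringsAsFactors = FALSE)
print(meta)
```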

Using a loop to apply gmapsdistance to a list in R

I am trying to use the gmapsdistance package in R to calculate the journey time by public transport between a list of postcodes (origin) and a single destination postcode.
The output for a single query is:
$Time
[1] 5352
$Distance
[1] 34289
$Status
[1] "OK"
I actually have 2.5k postcodes to use but whilst I troubleshoot it I have set the iterations to 10. london1 is a dataframe containing a single column with 2500 postcodes in 2500 rows.
This is my attempt so far;
results <- for(i in 1:10) {
gmapsdistance::set.api.key("xxxxxx")
gmapsdistance::gmapsdistance(origin = "london1[i]"
destination = "WC1E 6BT"
mode = "transit"
dep_date = "2017-04-18"
dep_time = "09:00:00")}
When I run this loop I get
results <- for(i in 1:10) {
+ gmapsdistance::set.api.key("xxxxxx")
+ gmapsdistance::gmapsdistance(origin = "london1[i]"
+ destination = "WC1E 6BT"
Error: unexpected symbol in:
" gmapsdistance::gmapsdistance(origin = "london1[i]"
destination"
mode = "transit"
dep_date = "2017-04-18"
dep_time = "09:00:00")}
Error: unexpected ')' in " dep_time = "09:00:00")"
My questions are:
1)How can I fix this?
2) How do I need to format this, so the output is a dataframe or matrix containing the origin postcode and journey time
Thanks
There are a few things going on here:
"london1[i]" needs to be london1[i, 1] (unquoted, so that it indexes the data frame instead of being passed as a literal string)
you need to separate your arguments with commas
I get an error when using, e.g., "WC1E 6BT"; I found it necessary to replace the space with a dash, like "WC1E-6BT"
the loop needs to explicitly assign values to elements of results
So your code would look something like:
library(gmapsdistance)
## some example data
london1 <- data.frame(postCode = c('WC1E-7HJ', 'WC1E-6HX', 'WC1E-7HY'))
## make an empty list to be filled in
results <- vector('list', 3)
for(i in 1:3) {
set.api.key("xxxxxx")
## fill in your results list
results[[i]] <- gmapsdistance(origin = london1[i, 1],
destination = "WC1E-6BT",
mode = "transit",
dep_date = "2017-04-18",
dep_time = "09:00:00")
}
It turns out you don't need a loop---and probably shouldn't---when using gmapsdistance (see the help doc) and the output from multiple inputs also helps in quickly formatting your output into a data.frame:
set.api.key("xxxxxx")
temp1 <- gmapsdistance(origin = london1[, 1],
destination = "WC1E-6BT",
mode = "transit",
dep_date = "2017-04-18",
dep_time = "09:00:00",
combinations = "all")
The above returns a list of data.frame objects, one each for Time, Distance and Status. You can then easily make those into a data.frame containing everything you might want:
res <- data.frame(origin = london1[, 1],
destination = 'WC1E-6BT',
do.call(data.frame, lapply(temp1, function(x) x[, 2])))
lapply(temp1, function(x) x[, 2]) extracts the needed column from each data.frame in the list, and do.call puts them back together as columns in a new data.frame object.
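To see that lapply/do.call step in isolation without an API key, here is a small sketch on a hand-made stand-in for temp1. The shape of temp1 (three data frames whose second column holds the values) follows the description above, but the column names and numbers are assumptions, not actual gmapsdistance output.

```r
london1 <- data.frame(postCode = c("WC1E-7HJ", "WC1E-6HX", "WC1E-7HY"),
                      stringsAsFactors = FALSE)

# Hand-made stand-in for the gmapsdistance result (values are invented).
temp1 <- list(
  Time     = data.frame(or = london1$postCode, Time.WC1E.6BT = c(310, 450, 275)),
  Distance = data.frame(or = london1$postCode, Distance.WC1E.6BT = c(900, 1400, 700)),
  Status   = data.frame(or = london1$postCode, status.WC1E.6BT = c("OK", "OK", "OK"),
                        stringsAsFactors = FALSE)
)

# Pull the second column out of each data frame; the list keeps the names
# Time, Distance and Status, which become the column names below.
second_cols <- lapply(temp1, function(x) x[, 2])

res <- data.frame(origin = london1[, 1],
                  destination = "WC1E-6BT",
                  do.call(data.frame, second_cols),
                  stringsAsFactors = FALSE)
print(res)
```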

Create loop to download 10 year data from Oanda via quantmod package

I am trying to download bulk Oanda forex data using quantmod::getSymbols. The help file states that you can download only 500 days worth of data per request whereas I get a warning about a cap of 5 years worth of data from warnings(). Nevertheless, I tried to create a loop to download data from 1997 until this date. This is my code:
library(xts)
library(quantmod)
date_from = c("1996-01-01", "2001-01-02", "2005-01-03", "2009-01-03", "2013-01-04")
date_to = c("2001-01-01", "2005-01-02", "2009-01-03", "2013-01-03", "2016-01-04")
for (i in 1:5) {
getSymbols("EUR/AUD", src="oanda", from = dates_from[i], to = date_to[i])
forex = for (i=1) EURAUD else NULL
final_Dataset<- rbind(c(forex, EURAUD))
}
What changes should I implement?
Edit 1
I made it work but it is sloppily written. Any proposed changes would be much appreciated.
date_from = c("1996-01-01", "2001-01-02", "2005-01-03", "2009-01-03", "2013-01-04")
date_to = c("2001-01-01", "2005-01-02", "2009-01-03", "2013-01-03", "2016-01-04")
forex = vector(mode = 'list', length = 5)
for (i in 1:5) {
getSymbols("EUR/AUD", src="oanda", from = date_from[i], to = date_to[i])
forex[[i]] = EURAUD
}
EUR_AUD = Reduce(rbind,forex)
You can do this by looping over a vector of dates that are 500 days apart. Notice that I wrapped the getSymbols call in try because the first 2 starting dates did not work. I'm not sure why.
require(quantmod)
Data <- do.call(rbind, lapply(dates, function(d) {
sym <- "EUR/AUD"
x <- try(getSymbols(sym, src="oanda", from=d, to=d+499, auto.assign=FALSE))
if (inherits(x, "try-error"))
return(NULL)
else
return(x)
}))
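The snippet above uses a dates vector that is not shown. A minimal sketch of how such a vector could be built, together with the same "skip failed windows, then rbind" pattern, is given below. The start and end dates are assumptions taken from the question, and fetch_window is a hypothetical stand-in for the getSymbols call so that the example runs offline.

```r
# Start dates 500 days apart; each request then covers d to d + 499.
dates <- seq(as.Date("1997-01-01"), as.Date("2016-01-04"), by = 500)

# Hypothetical stand-in for the download: pretend windows starting before
# 2000 fail (returning NULL) and later ones return one row per day.
fetch_window <- function(d) {
  if (d < as.Date("2000-01-01")) return(NULL)
  data.frame(date = seq(d, d + 499, by = "day"), rate = 1.5)
}

# rbind() ignores the NULL entries, so failed windows simply drop out.
Data <- do.call(rbind, lapply(seq_along(dates), function(i) fetch_window(dates[i])))

range(Data$date)
```

Because consecutive start dates are exactly 500 days apart and each window ends at d + 499, the windows neither overlap nor leave gaps.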

Need help manipulating URL using concatenation to span 2 years of archived data in R

I would like to concatenate the following values in R.
day <- sprintf("%02d", 1:31)
month <- sprintf("%02d", 1:12)
year <- 2015:as.numeric(format(Sys.time(), "%Y"))
I need them to be in the following format 2015/01/01012015 (YYYY/MM/MMDDYYYY) where MMs would have to be equal at all times.
Ultimately I want to attach it to the end on this URL http://brocktonpolice.com/wp-content/uploads/ so I can pass it as an argument to a download function to download the files.
Here is what I have so far
links <- NULL
i <- 1
while (i <= length(year)) {
links[i] <- paste0("http://brocktonpolice.com/wp-content/uploads/",year[i], sep = "/")
i = i + 1
}
I would like it to span the entire year of 2015 and 2016.
For example:
http://brocktonpolice.com/wp-content/uploads/2015/01/01012015.pdf
http://brocktonpolice.com/wp-content/uploads/2015/01/01022015.pdf
http://brocktonpolice.com/wp-content/uploads/2015/01/01032015.pdf
http://brocktonpolice.com/wp-content/uploads/2015/01/01042015.pdf
...
http://brocktonpolice.com/wp-content/uploads/2015/02/02012015.pdf
http://brocktonpolice.com/wp-content/uploads/2015/02/02022015.pdf
http://brocktonpolice.com/wp-content/uploads/2015/02/02032015.pdf
...
etc
Use seq.Date. It's much easier.
prefix <- "http://brocktonpolice.com/wp-content/uploads/"
AllDays <- seq.Date(from = as.Date('2015-01-01'), to = Sys.Date(), by = "day")
links <- paste0(prefix, format(AllDays, '%Y/%m/%m%d%Y'), '.pdf')
print(links)
