IBrokers request Historical Futures Contract Data? - r

I tried to request historical futures data, but the ibrokers.pdf document does not explain enough for a beginner.
example Gold Miny Contract Dec11 NYSELIFFE:
goldminy<-twsFuture("YG","NYSELIFFE","201112",multiplier="33.2")
reqHistoricalData(conn,
Contract= "goldminy",
endDateTime"",
barSize = "1 S",
duration = "1 D",
useRTH = "0",
whatToShow = "TRADES","BID", "ASK", "BID_ASK",
timeFormat = "1",
tzone = "",
verbose = TRUE,
tickerId = "1",
eventHistoricalData,
file)
I also don't know how to specify some of the data parameters correctly:
whatToShow? I need Date, Time, BidSize, Bid, Ask, AskSize, Last, LastSize, Volume
tickerID ?
eventHistoricalData ?
file ?

I wrote the twsInstrument package (on RForge) to alleviate these sorts of headaches.
getContract will find the contract for you if you give it anything reasonable. Any of these formats should work:
"YG_Z1", "YG_Z11", "YGZ1", "YGZ11", "YGZ2011", "YGDEC2011", "YG_DEC2011", etc. (also you could use the conId, or give it an instrument object, or the name of an instrument object)
> library(twsInstrument)
> goldminy <- getContract("YG_Z1")
Connected with clientId 100.
Contract details request complete. Disconnected.
> goldminy
List of 16
$ conId : chr "42334455"
$ symbol : chr "YG"
$ sectype : chr "FUT"
$ exch : chr "NYSELIFFE"
$ primary : chr ""
$ expiry : chr "20111228"
$ strike : chr "0"
$ currency : chr "USD"
$ right : chr ""
$ local : chr "YG DEC 11"
$ multiplier : chr "33.2"
$ combo_legs_desc: chr ""
$ comboleg : chr ""
$ include_expired: chr "0"
$ secIdType : chr ""
$ secId : chr ""
I don't have a subscription to market data for NYSELIFFE, so I will use the Dec 2011 e-mini S&P future for the rest of this answer.
You could get historical data like this
tws <- twsConnect()
hist.data <- reqHistoricalData(tws, getContract("ES_Z1"))
This will give you back these columns, and it will all be 'TRADES' data
> colnames(hist.data)
[1] "ESZ1.Open" "ESZ1.High" "ESZ1.Low" "ESZ1.Close" "ESZ1.Volume"
[6] "ESZ1.WAP" "ESZ1.hasGaps" "ESZ1.Count"
whatToShow must be one of 'TRADES', 'BID', 'ASK', or 'BID_ASK'. If your request uses whatToShow='BID' then you will get the OHLC etc. of the BID prices. "BID_ASK" means that the Ask price will be used for the High and the Bid price will be used for the Low.
Since you said the vignette was too advanced, it bears repeating that Interactive Brokers limits historical data requests to 6 every 60 seconds. So you should pause for 10 seconds between requests. (When getting lots of data, I usually pause for 30 seconds after every 3 requests, so that if I have BID data for something I am also likely to have ASK data for it.)
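The pacing rule above can be sketched as a small helper. This is only an illustration: paced_requests, its fetch argument, and the pause default are hypothetical names, not part of IBrokers or twsInstrument. The stand-in fetch function replaces a real reqHistoricalData call so the sketch runs without a TWS connection.

```r
# Hypothetical helper (not part of IBrokers): call fetch() once per item,
# sleeping `pause` seconds between calls to stay under 6 requests per 60 seconds.
paced_requests <- function(items, fetch, pause = 10) {
  out <- vector("list", length(items))
  names(out) <- items
  for (i in seq_along(items)) {
    out[[i]] <- fetch(items[[i]])
    # No need to wait after the final request
    if (i < length(items)) Sys.sleep(pause)
  }
  out
}

# Stand-in for a real request, so the example runs offline (pause = 0 for the demo)
demo <- paced_requests(c("TRADES", "BID", "ASK"),
                       fetch = function(w) paste("requested", w),
                       pause = 0)
```

With a live connection, fetch could be something like function(w) reqHistoricalData(tws, contract, whatToShow = w), where tws is an open connection and contract is a contract object.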
The function getBAT will download the BID, ASK and TRADES data, and merge together only the closing values of those into a single xts object that looks like this:
> getBAT("ES_Z1")
Connected with clientId 120.
waiting for TWS reply on ES ............. done.
Pausing 10 seconds between requests ...
waiting for TWS reply on ES .... done.
Pausing 10 seconds between requests ...
waiting for TWS reply on ES .... done.
Pausing 10 seconds between requests ...
Disconnecting ...
[1] "ES_Z1"
> tail(ES_Z1)
ES.Bid.Price ES.Ask.Price ES.Trade.Price ES.Mid.Price
2011-09-27 15:09:00 1170.25 1170.50 1170.50 1170.375
2011-09-27 15:10:00 1170.50 1170.75 1170.50 1170.625
2011-09-27 15:11:00 1171.25 1171.50 1171.25 1171.375
2011-09-27 15:12:00 1171.50 1171.75 1171.50 1171.625
2011-09-27 15:13:00 1171.25 1171.50 1171.25 1171.375
2011-09-27 15:14:00 1169.75 1170.00 1170.00 1169.875
ES.Volume
2011-09-27 15:09:00 6830
2011-09-27 15:10:00 4509
2011-09-27 15:11:00 4902
2011-09-27 15:12:00 6089
2011-09-27 15:13:00 6075
2011-09-27 15:14:00 14380
You asked for both LastSize and Volume. The "Volume" that getBAT returns is the total amount traded over the time of the bar. So, with 1 minute bars, it's the total volume that took place in that 1 minute.
Here's an answer that doesn't use twsInstrument:
I'm almost certain this will work, but as I said, I don't have the required market data subscription, so I can't test.
reqHistoricalData(tws, twsFuture("YG","NYSELIFFE","201112"))
Using the e-mini S&P again:
> mydata <- reqHistoricalData(tws, twsFuture("ES","GLOBEX","201112"), barSize='1 min', duration='5 D', useRTH='0', whatToShow='TRADES')
waiting for TWS reply on ES .... done.
> head(mydata)
ESZ1.Open ESZ1.High ESZ1.Low ESZ1.Close ESZ1.Volume ESZ1.WAP ESZ1.hasGaps ESZ1.Count
2011-09-21 15:30:00 1155.25 1156.25 1155.00 1155.75 3335 1155.50 0 607
2011-09-21 15:31:00 1155.75 1156.25 1155.50 1155.75 917 1155.95 0 164
2011-09-21 15:32:00 1155.75 1156.25 1155.50 1156.00 859 1155.90 0 168
2011-09-21 15:33:00 1156.00 1156.25 1155.50 1155.75 642 1155.83 0 134
2011-09-21 15:34:00 1155.50 1156.00 1155.25 1155.25 1768 1155.65 0 232
2011-09-21 15:35:00 1155.25 1155.75 1155.25 1155.25 479 1155.45 0 94
One of the problems with your attempt is that if you're using a barSize of '1 S', your duration cannot be greater than '60 S'. See IB Historical Data Limitations.

Related

Request to API using httr2 not changing like httr

I'm trying to switch from the R package httr to httr2
httr2 is the modern rewrite and should be superior, but I'm a novice when it comes to APIs and coding and I've been stuck all day trying to figure out what I'm doing wrong. I can only get httr to work with this API.
I believe I am messing up with adding the headers, as I don't think the path sent to the API is changing. So my problem is the request gets rejected simply because the API can't read my API key.
Here is what I have done in httr2:
gov_url <- "https://api.dummy.gov/aaa/bb/cc"
resp <- request(gov_url) %>%
req_headers(
param1 = "10",
api_key = "abcdefg",
param2 = "xyz",
param3 = "09/10/2022"
) %>%
req_dry_run()
Output:
GET /destiny/v1/placeholder HTTP/1.1
Host: api.dummy.gov
User-Agent: httr2/0.2.1 r-curl/4.3.2 libcurl/7.64.1
Accept: */*
Accept-Encoding: deflate, gzip
param1: 10
api_key: abcdefg
param2: xyz
param3: 09/10/2022
The first line of that output hasn't changed.
GET /destiny/v1/placeholder HTTP/1.1
Showing the structure of the 'resp' object with str(resp)
List of 7
$ url : chr "https://api.dummy.gov/destiny/v1/placeholder"
$ method : NULL
$ headers :List of 4
..$ param1 : chr "10"
..$ api_key : chr "abcdefg"
..$ param2 : chr "xyz"
..$ param3 : chr "09/10/2022"
$ body : NULL
$ fields : list()
$ options : list()
$ policies: list()
- attr(*, "class")= chr "httr2_request"
Sending the request with resp %>% req_perform(verbosity = 2) gives me this error:
HTTP/1.1 403 Forbidden
---
"error":
"code": "API_KEY_MISSING",
"message": "No api_key was supplied. Please submit with a valid API key."
But when I use httr though, I can pull data from the API.
gov_url <- "https://api.dummy.gov/destiny/v1/placeholder"
query_params <- list('param1' = '10',
'api_key' = "abcdefg",
'param2' = 'xyz',
'param3' = '09/10/2022')
gov_api <- GET(gov_url, query = query_params)
Showing structure str(gov_api) shows the path has changed which is quite good because it matches the example input provided from the API
List of 10
$ url : chr "https://api.dummy.gov/destiny/v1/placeholder?param1=10&api_key=abcdefg&param2"| __trunc
$ status_code: int 200
$ headers :List of 19
..$ xyz : chr "1"
..$ abc : chr "application/json"
..$ edg : chr "something"
And http_status(gov_api) shows it's making the connection
$message
"Success: (200) OK"
Then I'm able to successfully use httr to pull data from the API.
Thank you to anyone who has read this far. I'd appreciate any feedback if possible.
Other things I've tried to no avail:
!!! to evaluate the list of expressions
Sending the API a list named query like I do in httr
Sending the param "accept" = "application/json"
Different syntax, quotations
I'm not sure what you're after, but if you want to reproduce the same behavior as httr with httr2, you can do:
library(httr2)
gov_url <- "https://api.dummy.gov/aaa/bb/cc"
param <- list(param1 = "10",
api_key = "abcdefg",
param2 = "xyz",
param3 = "09/10/2022")
resp <- request(gov_url) %>%
req_url_query(!!!param)
resp$url
#> [1] "https://api.dummy.gov/aaa/bb/cc?param1=10&api_key=abcdefg&param2=xyz&param3=09%2F10%2F2022"
Perhaps the confusion comes from the fact that in httr2 you're setting headers, whereas in httr you're setting the query.
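To make the header/query distinction concrete, here is a minimal sketch that builds two requests without sending them. It reuses the placeholder URL and key from the question and assumes R 4.1 or later for the native pipe.

```r
library(httr2)

gov_url <- "https://api.dummy.gov/aaa/bb/cc"  # placeholder URL from the question

# Values passed to req_headers() travel as HTTP headers; the URL is untouched
req_h <- request(gov_url) |> req_headers(api_key = "abcdefg")

# Values passed to req_url_query() are appended to the URL as ?name=value pairs
req_q <- request(gov_url) |> req_url_query(api_key = "abcdefg", param1 = "10")

req_h$url  # same as gov_url
req_q$url  # gov_url followed by the query string
```

If the API expects the key in the query string, as the working httr call suggests, only the second form will deliver it where the server looks for it.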

Extracting All Emails Using GmailR

I'm trying to extract all the emails from my gmail account to do some analysis. The end goal is a dataframe of emails. I'm using the gmailR package.
So far I've extracted all the email threads and "expanded" them by mapping all the thread IDs to gm_thread(). Here's the code for that:
threads <- gm_threads(num_results = 5)
thread_ids <- gm_id(threads)
#extract all the thread ids
threads_expanded <- map(thread_ids, gm_thread)
This returns a list of all the threads. The structure of this is a list of gmail_thread objects. When you drill down one level into the list of thread objects, str(threads_expanded[[1]], max.level = 1), you get a single thread object which looks like:
List of 3
$ id : chr "xxxx"
$ historyId: chr "yyyy"
$ messages :List of 3
- attr(*, "class")= chr "gmail_thread"
Then, if you drill down further into the messages composing the threads, you start to get the useful info. str(threads_expanded[[1]]$messages, max.level = 1) gets you a list of the gmail_message objects for that thread:
List of 3
$ :List of 8
..- attr(*, "class")= chr "gmail_message"
$ :List of 8
..- attr(*, "class")= chr "gmail_message"
$ :List of 8
..- attr(*, "class")= chr "gmail_message"
Where I'm stuck is actually extracting all the useful information from each email within all the threads. The end goal is a dataframe with a column for the message_id, thread_id, to, from, etc. I'm imagining something like this:
message_id | thread_id | to | from | ... |
-------------------------------------------------------------------------
1234 | abcd | me@gmail.com | pam@gmail.com | ... |
1235 | abcd | pam@gmail.com | me@gmail.com | ... |
1236 | abcf | me@gmail.com | tim@gmail.com | ... |
It's not the prettiest answer, but it works. I'm going to work on vectorizing it later:
library(gmailr)     # gm_threads(), gm_thread(), gm_id(), gm_body(), ...
library(purrr)      # map() and the %>% pipe
library(lubridate)  # dmy_hms()
library(tibble)     # tibble()

# List the most recent threads and extract their ids
threads <- gm_threads(num_results = 5)
thread_ids <- gm_id(threads)
# Fetch each full thread
threads_expanded <- map(thread_ids, gm_thread)
# Collect the individual messages from every thread into one flat list
msgs <- vector()
for (i in seq_along(threads_expanded)) {
  msgs <- append(msgs, values = threads_expanded[[i]]$messages)
}
# Get the message id for each message
msg_ids <- unlist(map(msgs, gm_id))
# Get each message body and store it in a vector
msg_body <- vector()
for (msg in msgs) {
  body <- gm_body(msg)
  attchmnt <- nrow(gm_attachments(msg))
  # An empty body is not returned as NULL but as a list of length 0.
  # Keep the body only if it is non-empty and the message has no attachments.
  if (length(body) != 0 && attchmnt == 0) {
    msg_body <- append(msg_body, body)
  } else {
    # Otherwise store "" so the vector stays aligned with msg_ids
    msg_body <- append(msg_body, "")
  }
}
msg_body <- unlist(msg_body)
# Get the datetime of each message
msg_datetime <- msgs %>%
  map(gm_date) %>%
  unlist() %>%
  dmy_hms()
message_df <- tibble(msg_ids, msg_datetime, msg_body)
# All the other possible fields, e.g., to, from, cc, subject, etc.,
# can be extracted with a similar for loop or a map() call
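The reshaping step, going from nested threads to one row per message, can be sketched without a Gmail connection. The plain lists below are made-up stand-ins for gmail_thread objects, and their field names (id, to, from) are assumptions for illustration. With real gmailr objects the values would come from accessors such as gm_id(), gm_to() and gm_from().

```r
# Mock threads: plain lists standing in for gmail_thread objects (made-up data)
threads <- list(
  list(id = "abcd", messages = list(
    list(id = "1234", to = "me@gmail.com",  from = "pam@gmail.com"),
    list(id = "1235", to = "pam@gmail.com", from = "me@gmail.com")
  )),
  list(id = "abcf", messages = list(
    list(id = "1236", to = "me@gmail.com", from = "tim@gmail.com")
  ))
)

# Build one small data frame per thread, with one row per message
rows <- lapply(threads, function(th) {
  do.call(rbind, lapply(th$messages, function(m) {
    data.frame(message_id = m$id, thread_id = th$id,
               to = m$to, from = m$from, stringsAsFactors = FALSE)
  }))
})

# Stack the per-thread data frames into the final table
message_df <- do.call(rbind, rows)
```

Carrying the thread id down into each message row is what keeps the two levels of nesting linked once the list is flattened.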

Need to better understand lists in R

So I am using a package in R called twitteR
consumer_key <- "MY API KEY"
consumer_secret <- "MY API KEY"
access_token <- "MY API KEY"
access_secret <- "MY API KEY"
setup_twitter_oauth(consumer_key,
consumer_secret,
access_token,
access_secret)
tweets<-searchTwitter('NASA',n=3200,lang = 'en')
Now I have a list called tweets
> length(tweets)
[1] 3200
Looking at the first few elements
> tweets[1:3]
[[1]]
[1] "PepperAlbo: RT @mashable: NASA researchers reinvented the wheel"
[[2]]
[1] "UnitedStatesTD: NASA to release Voyager Golden Record as a vinyl box set - via #UnitedStatesTD "
[[3]]
[1] "ISSAboveYou: Hello #Space_Station from The Vails of Long Beach CA 302.0 mi away #NASA_Johnson #issabove "
Clearly the tweets are lists within a list, so let's say I call the first element
> tweets[[1]]
[1] "PepperAlbo: RT @mashable: NASA researchers reinvented the wheel "
But there is actually more to the list
> str(tweets[[1]])
Reference class 'status' [package "twitteR"] with 17 fields
$ text : chr "RT @mashable: NASA researchers reinvented the wheel "
$ favorited : logi FALSE
$ favoriteCount: num 0
$ replyToSN : chr(0)
$ created : POSIXct[1:1], format: "2017-11-29 01:07:18"
$ truncated : logi FALSE
$ replyToSID : chr(0)
$ id : chr "935676507661524992"
$ replyToUID : chr(0)
$ statusSource : chr "Twitter for iPhone"
$ screenName : chr "PepperAlbo"
$ retweetCount : num 685
$ isRetweet : logi TRUE
$ retweeted : logi FALSE
$ longitude : chr(0)
$ latitude : chr(0)
$ urls :'data.frame': 0 obs. of 4 variables:
..$ url : chr(0)
..$ expanded_url: chr(0)
..$ dispaly_url : chr(0)
..$ indices : num(0)
and 53 methods, of which 39 are possibly relevant:
getCreated, getFavoriteCount, getFavorited, getId, getIsRetweet, getLatitude, getLongitude, getReplyToSID, getReplyToSN, getReplyToUID, getRetweetCount,
getRetweeted, getRetweeters, getRetweets, getScreenName, getStatusSource, getText, getTruncated, getUrls, initialize, setCreated, setFavoriteCount,
setFavorited, setId, setIsRetweet, setLatitude, setLongitude, setReplyToSID, setReplyToSN, setReplyToUID, setRetweetCount, setRetweeted, setScreenName,
setStatusSource, setText, setTruncated, setUrls, toDataFrame, toDataFrame#twitterObj
using one of them
> tweets[[1]]$id
[1] "935676507661524992"
So my question is: where is all of this stored? Is it stored along with the tweet text in tweets[[1]]? And when I call only tweets[[1]], why is it that only the tweet gets printed and nothing else?
Is this a special kind of list that is being defined by this Reference class 'status' ?
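One way to explore this behavior is with a toy Reference class. The sketch below assumes that twitteR's status class behaves like an ordinary Reference class with a custom show method. The Status class here is a made-up stand-in, not twitteR's actual definition.

```r
library(methods)

# Toy Reference class with a few fields and a custom show() method
Status <- setRefClass("Status",
  fields = list(text = "character", screenName = "character", id = "character"),
  methods = list(
    show = function() {
      # Only these two fields are printed; the others remain stored on the object
      cat(screenName, ": ", text, "\n", sep = "")
    }
  )
)

tw <- Status$new(text = "NASA researchers reinvented the wheel",
                 screenName = "PepperAlbo",
                 id = "935676507661524992")

tw      # auto-printing uses show(), so only screenName and text appear
tw$id   # the id field is still there and can be read with $
```

In this toy case the object is not a list at all. Each element of the outer list is a Reference class object whose fields live inside the object, and what gets printed is decided by show().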

R - subsetting by date

I'm trying to subset a large data frame by a date field and am facing strange behaviour:
1) find interesting time interval:
> ld[ld$bps>30000000,]
Date.first.seen Duration Proto Src.IP.Addr Src.Pt Dst.IP.Addr Dst.Pt Tos Packets Bytes bps
1400199 2015-03-31 13:52:24 0.008 TCP 3.3.3.3 3128 4.4.4.4 65115 0 39 32507 32500000
1711899 2015-03-31 14:58:10 0.004 TCP 3.3.3.3 3128 4.4.4.7 49357 0 29 23830 47700000
2) and try to look whats happening on that second:
> ld[ld$Date.first.seen=="2015-03-31 13:52:24",]
Date.first.seen Duration Proto Src.IP.Addr Src.Pt Dst.IP.Addr Dst.Pt Tos Packets Bytes bps
1401732 2015-03-31 13:52:24 17.436 TCP 3.3.3.3 3128 6.6.6.6 51527 0 3 1608 737
don't really understand the behavior - i should get way more results.
for example
> ld[1399074,]
Date.first.seen Duration Proto Src.IP.Addr Src.Pt Dst.IP.Addr Dst.Pt Tos Packets Bytes bps
1399074 2015-03-31 13:52:24 0.152 TCP 10.10.10.10 3128 11.11.11.11 62375 0 8 3910 205789
for date i use POSIXlt
> str(ld)
'data.frame': 2657583 obs. of 11 variables:
$ Date.first.seen: POSIXlt, format: "2015-03-31 06:00:00" "2015-03-31 06:00:00" "2015-03-31 06:00:00" "2015-03-31 06:00:01" ...
...
would appreciate any assistance. thanks!
POSIXlt may carry additional info, such as timezone and daylight saving time, which is suppressed when printing the entire data.frame. Have a look at https://stat.ethz.ch/R-manual/R-devel/library/base/html/DateTimeClasses.html.
Printing only the POSIXlt variable (ld$Date.first.seen) does generally supply at least some of this additional information.
If you are not required to keep the variable as POSIXlt, and you don't need the extra functionality that class provides, a simple:
ld$Date.first.seen = as.character(ld$Date.first.seen)
added before your subset statement will probably solve your problem.
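As an illustration of how printed timestamps can look identical yet fail an equality test, here is a small sketch with made-up data. It assumes the hidden difference is sub-second precision, which the question does not confirm. It compares on a formatted string, in the same spirit as the as.character() suggestion.

```r
# Made-up data: two rows fall within the same printed second
stamps <- as.POSIXct(c("2015-03-31 13:52:24.1",
                       "2015-03-31 13:52:24.9",
                       "2015-03-31 13:52:25.0"), tz = "UTC")
ld <- data.frame(Date.first.seen = stamps, bps = c(737, 205789, 32500000))

# Direct comparison misses both rows, because 24.1 and 24.9 are not exactly 24
direct <- ld[ld$Date.first.seen == as.POSIXct("2015-03-31 13:52:24", tz = "UTC"), ]

# Comparing on a string formatted to whole seconds matches what is printed
key <- format(ld$Date.first.seen, "%Y-%m-%d %H:%M:%S")
by_second <- ld[key == "2015-03-31 13:52:24", ]

nrow(direct)     # rows found by exact comparison
nrow(by_second)  # rows found by comparing the printed second
```

Formatting with an explicit pattern keeps the comparison independent of how as.character() renders fractional seconds in a given R version.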

Selecting dates and time interval from observations in R

Having an object of class zoo we can select observations for a range of dates of interest using the function:
window(z, start = as.Date("2006-01-05"), end = as.Date("2006-01-08"))
In running this function the following warning message occurs:
Warning messages:
1: In which(in.index & all.indexes >= start & all.indexes <= end) :
Incompatible methods ("Ops.POSIXt", "Ops.Date") for ">="
2: In which(in.index & all.indexes >= start & all.indexes <= end) :
Incompatible methods ("Ops.POSIXt", "Ops.Date") for "<="
I have checked that the object is of class zoo and that Dates are included in the time series.
How is that possible?
Below the str(z) as requested:
‘zoo’ series from 2006-01-03 to 2013-01-24
Data: num [1:1795, 1:40] 3.65 3.68 3.69 3.72 3.7 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:1795] "1" "2" "3" "4" ...
..$ : chr [1:40] "EURARS" "EURAUD" "EURBRO" "EURCAD" ...
Index: POSIXct[1:1795], format: "2006-01-03" "2006-01-04" "2006-01-05" "2006-01-06" ...
Below the dput(head(z)) as requested:
structure(c(3.6511, 3.6833, 3.6931, 3.7152, 3.7027, 3.6897, 1.62349,
1.62257, 1.62011, 1.6115, 1.60243, 1.61108, 2.802, 2.7692, 2.7727,
2.7741, 2.7238, 2.729, 1.38937, 1.39109, 1.40716, 1.41627, 1.41196,
1.40666, 1.55055, 1.5472, 1.5448, 1.54335, 1.54215, 1.545, 623.73,
624.16, 628.43, 638.11, 632.27, 630.7, 9.6988, 9.7803, 9.7689,
9.802, 9.7492, 9.7354, 2742.03, 2765.68, 2758.65, 2769.27, 2753.3,
2747.31, 29.047, 28.972, 28.9, 28.88, 28.764, 28.792, 7.4616,
7.4601, 7.4612, 7.458, 7.46, 7.4589, 6.8983, 6.9551, 6.9594,
6.9838, 6.9374, 6.9253, 0.6882, 0.68905, 0.68961, 0.6863, 0.68473,
0.68358, 9.3178, 9.3963, 9.3889, 9.4207, 9.3702, 9.3516, 251.72,
250.28, 250.66, 250.39, 249.89, 250.86, 11657.46, 11677.26, 11612.05,
11603.59, 11433.35, 11403.84, 5.5244, 5.5808, 5.5799, 5.6134,
5.5858, 5.5957, 53.5288, 54.0151, 54.0323, 54.0105, 53.5591,
53.6189, 74.96, 74.88, 74.41, 73.94, 73.79, 73.84, 139.6, 140.71,
140.39, 139.09, 138.39, 137.93, 1208.3493, 1210.0214, 1195.7746,
1200.3966, 1181.3457, 1184.9635, 160.65, 162.15, 162.02, 162.53,
161.7, 161.43, 10.9802, 10.997, 10.9947, 10.9635, 10.9909, 10.9874,
12.7724, 12.8255, 12.8746, 12.8338, 12.7859, 12.8273, 4.5416,
4.5702, 4.5623, 4.5597, 4.5308, 4.5229, 7.9654, 7.9248, 7.9254,
7.914, 7.9574, 8.0011, 1.7571, 1.7626, 1.7622, 1.7574, 1.7411,
1.7391, 4.1276, 4.1608, 4.1665, 4.1818, 4.1606, 4.1534, 63.2627,
63.4733, 63.6725, 63.7198, 63.3608, 63.3412, 3.8295, 3.8116,
3.807, 3.805, 3.7609, 3.7799, 3.6732, 3.6815, 3.6834, 3.6842,
3.6644, 3.6492, 34.5475, 34.8363, 34.7254, 34.8369, 34.68, 34.39,
9.3648, 9.3279, 9.3321, 9.3152, 9.3389, 9.3603, 1.9844, 1.9932,
1.9938, 1.9902, 1.9766, 1.9716, 48.9853, 48.9426, 48.6762, 48.3184,
47.9995, 48.0187, 1.6149, 1.6195, 1.6193, 1.6204, 1.6155, 1.6148,
1.6129, 1.6175, 1.6184, 1.6201, 1.6182, 1.6221, 39.2261, 39.1868,
38.7569, 39.1189, 38.6148, 38.6309, 6.0673, 6.114, 6.1208, 6.1484,
6.1095, 6.1082, 1.2019, 1.2119, 1.211, 1.2151, 1.2088, 1.2065,
7.4834, 7.4559, 7.4658, 7.3872, 7.3206, 7.3497), .Dim = c(6L,
40L), .Dimnames = list(c("1", "2", "3", "4", "5", "6"), c("EURARS",
"EURAUD", "EURBRO", "EURCAD", "EURCHF", "EURCLP", "EURCNO", "EURCOP",
"EURCZK", "EURDKK", "EUREGP", "EURGBP", "EURHKD", "EURHUF", "EURIDO",
"EURILS", "EURINO", "EURISK", "EURJPY", "EURKRO", "EURKZT", "EURMAD",
"EURMXN", "EURMYO", "EURNOK", "EURNZD", "EURPEN", "EURPHO", "EURPLN",
"EURRON", "EURRUB", "EURSEK", "EURSGO", "EURTHO", "EURTND", "EURTRY",
"EURTWO", "EURUAH", "EURUSD", "EURZAR")), index = structure(c(1136242800,
1136329200, 1136415600, 1136502000, 1136761200, 1136847600), class = c("POSIXct",
"POSIXt"), tzone = ""), class = "zoo")
You shouldn't be comparing Date with POSIXct.
Try this:
window(z, start = as.POSIXct("2006-01-05"), end = as.POSIXct("2006-01-08"))
Alternatively, as @JoshuaUlrich points out in a comment, your data is daily frequency, so you'd be better off using a Date index class for your data to avoid timezone weirdness.
index(z) <- as.Date(index(z))
window(z, start = as.Date("2006-01-05"), end = as.Date("2006-01-08"))
It looks like window doesn't convert between Date and POSIXct automatically. You have to specify the times in the same class as in the data:
window(z, start = as.POSIXct("2006-01-05"), end = as.POSIXct("2006-01-08"))
EURARS EURAUD EURBRO EURCAD EURCHF EURCLP EURCNO EURCOP
2006-01-05 23:00:00 3.7152 1.6115 2.7741 1.41627 1.54335 638.11 9.802 2769.27
EURCZK EURDKK EUREGP EURGBP EURHKD EURHUF EURIDO EURILS
2006-01-05 23:00:00 28.88 7.458 6.9838 0.6863 9.4207 250.39 11603.59 5.6134
EURINO EURISK EURJPY EURKRO EURKZT EURMAD EURMXN
2006-01-05 23:00:00 54.0105 73.94 139.09 1200.397 162.53 10.9635 12.8338
EURMYO EURNOK EURNZD EURPEN EURPHO EURPLN EURRON EURRUB
2006-01-05 23:00:00 4.5597 7.914 1.7574 4.1818 63.7198 3.805 3.6842 34.8369
EURSEK EURSGO EURTHO EURTND EURTRY EURTWO EURUAH EURUSD
2006-01-05 23:00:00 9.3152 1.9902 48.3184 1.6204 1.6201 39.1189 6.1484 1.2151
EURZAR
2006-01-05 23:00:00 7.3872
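Both approaches can be tried on a small self-contained series. This is a sketch with made-up values and an explicit UTC timezone, which is an assumption made here so the result does not depend on the local timezone.

```r
library(zoo)

# Small daily series with a POSIXct index (made-up values, explicit UTC)
idx <- as.POSIXct(c("2006-01-03", "2006-01-04", "2006-01-05",
                    "2006-01-06", "2006-01-09"), tz = "UTC")
z <- zoo(c(1.2019, 1.2119, 1.2110, 1.2151, 1.2088), order.by = idx)

# Option 1: give window() bounds of the same class as the index
w_ct <- window(z,
               start = as.POSIXct("2006-01-05", tz = "UTC"),
               end   = as.POSIXct("2006-01-08", tz = "UTC"))

# Option 2: convert the index to Date once, then use Date bounds
zd <- z
index(zd) <- as.Date(index(zd), tz = "UTC")
w_d <- window(zd, start = as.Date("2006-01-05"), end = as.Date("2006-01-08"))

length(w_ct)  # observations selected with POSIXct bounds
length(w_d)   # observations selected with Date bounds
```

Passing tz explicitly to as.Date() avoids a timestamp near midnight being shifted onto the previous or next day.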
