MturkR, R, automatically posting microbatches - r

I am new to MTurk, but have some fluency with R. I am using the MTurkR package the first time, and I am trying to create "micro-batches" that post on MTurk over time. The code I am using can be found below (the XXXX parts are obviously filled with the correct values). I don't get any error messages, the code runs, and posts the HIT both in the Sandbox and in the real correctly. However, the HITs posted do not show up in the Sandbox Requester account, or the real requester account which means - as far as I understand - that I can't evaluate the workers who submit a completion code before they are paid automatically.
Could anyone point out where the error is, and how could I review the HITs?
Thanks.
K
##### Notes:
# 1) Change sandbox from TRUE to FALSE to run live (make sure to test in sandbox first!!)
##### Step 1: Load library, set parameters
#### Load MTurkR library
library(MTurkR)
#### HIT Layout ID
# Layout ID for the choice task
# my_hitlayoutid = "XXXX"
# Layout ID for the choice task in the Sandbox
my_hitlayoutid = "XXXX"
#### Set MTurk credentials
Sys.setenv(
AWS_ACCESS_KEY_ID = "XXXX",
AWS_SECRET_ACCESS_KEY = "XXXX"
)
#### HIT parameters
## Run in sandbox?
sandbox_val <- "FALSE"
## Set the name of your project here (used to retrieve HITs later)
myannotation <- "myannotation"
## Enter other HIT aspects
newhittype <- RegisterHITType(
title = "hope trial",
description = "Description",
reward = "2.5",
duration = seconds(hours = 1),
keywords = "survey, demographics, neighborhoods, employment",
sandbox = sandbox_val
)
##### Step 2: Define functions
## Define a function that will create a HIT using information above
createhit <- function() {
CreateHIT(
hit.type = newhittype$HITTypeId,
assignments = 2,
expiration = seconds(days = 30),
annotation = myannotation,
verbose = TRUE,
sandbox = sandbox_val,
hitlayoutid = my_hitlayoutid
)
}
## Define a function that will expire all running HITs
## This keeps HITs from "piling up" on a slow day
## It ensures that A) HIT appears at the top of the list, B) workers won't accidentally accept HIT twice
# expirehits <- function() {
# ExpireHIT(
# annotation = myannotation,
# sandbox = sandbox_val
# )
#}
##### Step 3: Execute a loop that runs createhit/expirehit functions every hour, and it will log the output to a file
## Define number of times to post the HIT (totalruns)
totalruns <- 2
counter <- 0
## Define log file (change the location as appropriate)
logfile <- file("/Users/kinga.makovi/Dropbox/Bias_Experiment/MTurk/logfile.txt", open="a")
sink(logfile, append=TRUE, type="message")
## Run loop (note: interval is hourly, but can be changed in Sys.sleep)
repeat {
message(Sys.time())
createhit()
Sys.sleep(10)
#expirehits()
counter = counter + 1
if (counter == totalruns){
break
}
}
## To stop the loop before it finishes, click the "STOP" button
## To stop logging, run sink()

You can't see HITs that are created via the API (through MTurkR or otherwise) in the requester website. It's a "feature". You'll have to access the HITs through MTurkR (e.g., SearchHITs() and GetHIT()).

Related

Uploading a file to Figshare using rapiclient swagger api

I am trying to programmatically update my Figshare repository using rapiclient.
Following the answer to this question, I managed to authenticate and see my repository by:
library(rapiclient)
library(httr)
# figshare repo id
id = 3761562
fs_api <- get_api("https://docs.figshare.com/swagger.json")
header <- c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))
fs_api <- list(operations = get_operations(fs_api, header),
schemas = get_schemas(fs_api))
reply <- fs_api$operations$article_files(id)
I also managed to delete a file using:
fs_api$operations$private_article_file_delete(article_id = id, file_id = F)
Now, I would like to upload a new file to the repository. There seem to be two methods I need:
fs_api$operations$private_article_upload_initiate
fs_api$operations$private_article_upload_complete
But I do not understand the documentation. According to fs_api$operations$private_article_upload_initiate help:
> fs_api$operations$private_article_upload_initiate
private_article_upload_initiate
Initiate Upload
Description:
Initiate new file upload within the article. Either use link to
provide only an existing file that will not be uploaded on figshare
or use the other 3 parameters(md5, name, size)
Parameters:
link (string)
Url for an existing file that will not be uploaded on figshare
md5 (string)
MD5 sum pre computed on the client side
name (string)
File name including the extension; can be omitted only for linked
files.
size (integer)
File size in bytes; can be omitted only for linked files.
What does "file that will not be uploaded on Figshare" mean? How would I use the API to upload a local file ~/foo.txt?
fs_api$operations$private_article_upload_initiate(link='~/foo.txt')
returns HTTP 400.
I feel like I sent you down a bad path with my previous answer because I am not sure how to edit some of the api endpoints when using rapiclient. For example, the corresponding endpoint for fs_api$operations$private_article_upload_initiate() will be https://api.figshare.com/v2/account/articles/{article_id}/files, and I am not sure how to substitute for {article_id} prior to sending the request.
You may have to define your own client for operations you cannot get working any other way.
Here is an example of uploading a file to an existing private article as per the goal of your question.
library(httr)
# id of previously created figshare article
my_article_id <- 99999999
# make example file to upload
my_file <- tempfile("my_file", fileext = ".txt")
writeLines("Hello World!", my_file)
# Step 1 initiate upload
# https://docs.figshare.com/#private_article_upload_initiate
r <- POST(
url = sprintf("https://api.figshare.com/v2/account/articles/%s/files", my_article_id),
add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))),
body = list(
md5 = tools::md5sum(my_file)[[1]],
name = basename(my_file),
size = file.size(my_file)
),
encode = "json"
)
initiate_upload_response <- content(r)
# Step 2 single file info (get upload url)
# https://docs.figshare.com/#private_article_file
r <- GET(url = initiate_upload_response$location,
add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
single_file_response <- content(r)
# Step 3 uploader service (get number of upload parts required)
# https://docs.figshare.com/#endpoints
r <- GET(url = single_file_response$upload_url,
add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
upload_service_response <- content(r)
# Step 4 upload parts (this example only has one part)
# https://docs.figshare.com/#endpoints_1
r <- PUT(url = single_file_response$upload_url, path = 1,
add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT")))),
body = upload_file(my_file)
)
upload_parts_response <- content(r)
# Step 5 complete upload (after all part uploads are successful)
# https://docs.figshare.com/#private_article_upload_complete
r <- POST(
url = initiate_upload_response$location,
add_headers(c(Authorization = sprintf("token %s", Sys.getenv("RFIGSHARE_PAT"))))
)
complete_upload_response <- content(r)

Chainer: custom extension for early stopping based on time limit

I have a trainer that already has a stop trigger based on the total number of epochs:
trainer = training.Trainer(updater, (epochs, 'epoch'))
Now I would like to add a stopping condition based on the total elapsed time starting from some point in the code (which may be different than the elapsed_time stored inside trainer):
global_start = time.time()
# Some long preprocessing
expensive_processing()
# Trainer starts here and its internal elapsed time
# does not take into account the preprocessing
trainer.run()
What I tried is to define an extension as follows.
trainer.global_start = global_start
trainer.global_elapsed_time = 0.0
def stop_training():
return True
def check_time_limit(my_trainer):
my_trainer.global_elapsed_time = time.time() - my_trainer.global_start
# If reach the time limit, then set the stop_trigger as a callable
# that is always True
if my_trainer.global_elapsed_time > args.time_limit * 3600:
my_trainer.stop_trigger = stop_training
# Add the extension to trainer
trainer.extend(check_time_limit, trigger=(1000, 'iteration'))
Running the code, I obtained some The previous value of epoch_detail is not saved error. What did I do wrong?
Thank you so much in advance for your help!

How can I solve the error when the user inputs only an "enter" in R?

I make a code based on a table game. At the beginning of the code, it must ask the player the name of the player, and when the player inputs an "enter", my code shows an error. I want that when the player inputs an "enter", the program says something like "This name is invalid", ask repeat asking the name of the player. Here is a part of my code:
repeat{
if(r==1){
print("Name Player 1: ")
name1=scan(,what="character",1)
if(any(name1==gamers)){
r=readline(prompt = "This player is already in the file. Would you like to change the name? \n 1. Yes \n 2. No \n Select an option: ")
if(r==0){
r<-99
}
Instead of print(...); name1=scan(...), I'd use readline, as such:
while (!nzchar(name1 <- readline("Name Player 1: "))) TRUE
# Name Player 1: <-- me just hitting <enter>
# Name Player 1: <-- again
# Name Player 1: r2evans
name1
# [1] "r2evans"
You might prefer to allow a max number of failed attempts, though, instead of requiring the user interrupt the process with ctrl-c, so perhaps:
tries <- 3L
while (tries > 0 && !nzchar(name1 <- readline("Name Player 1: "))) tries <- tries - 1L
# Name Player 1:
# Name Player 1:
# Name Player 1:
And the loop just stopped/exited. You "know" that the user chose to quit because after the loop, tries == 0L and !nzchar(name1) both indicate the user's intent.

How to check if subset is empty in R

I have a set of data with weight with time (t), I need to identify outliers of weight for every time (t), after which I need to send a notification email.
I'm using bloxplot($out) to identify the outliers, it seems to work, but I'm not sure if:
It's the correct way to use the boxplot?
I can't detect if the boxplot has no outlier or if its empty (or maybe, I'm using a wrong technique)
Or possibly the subset itself is empty (could be the root cause)
For now, I just need to trap the empty subset and check if out variable is empty or not.
Below is my R script code:
#i am a comment, and the compiler doesn't care about me
#load our libraries
library(ggplot2)
library(mailR)
#some variables to be used later
from<-""
to<-""
getwd()
setwd("C:\\Temp\\rwork")
#read the data file into a data(d) variable
d<-read.csv("testdata.csv", header=TRUE) #file
#get the current time(t)
t <-format(Sys.time(),"%H")
#create a subset of d based on t
sbset<-subset(d,Time==t)
#identify if outlier exists then send an email report
out<-boxplot(sbset$weight)$out
if(length(out)!=0){
#create a boxplot of the subset
boxplot(sbset$weight)
subject = paste("Attention: An Outlier is detected for Scheduled Job Run on Hour ",t)
message = toString(out) #sort(out)
}else{
subject = paste("No Outlier Identified")
message = ""
}
email<-send.mail(from=from,
to=to,
subject=subject,
body=message,
html=T,
smtp=list(host.name = "smtp.gmail.com",
port = 465,
user.name = from,
passwd = "", #password of sender email
ssl = TRUE),
authenticate=TRUE,
send=TRUE)
DATA
weight,Time,Chick,x
42,0,1,1
51,2,1,1
59,4,1,1
64,6,1,1
76,8,1,1
93,10,1,1
106,12,1,1
125,14,1,1
149,16,1,1
171,18,1,1
199,20,1,1
205,21,1,1
40,0,2,1
49,2,2,1
58,4,2,1
72,6,2,1
84,8,2,1
103,10,2,1
122,12,2,1
138,14,2,1
162,16,2,1
187,18,2,1
209,20,2,1
215,21,2,1
43,0,3,1
39,2,3,1
55,4,3,1
67,6,3,1
84,8,3,1
99,10,3,1
115,12,3,1
138,14,3,1
163,16,3,1
187,18,3,1
198,20,3,1
202,21,3,1
42,0,4,1
49,2,4,1
56,4,4,1
67,6,4,1
74,8,4,1
87,10,4,1
102,12,4,1
108,14,4,1
136,16,4,1
154,18,4,1
160,20,4,1
157,21,4,1
41,0,5,1
42,2,5,1
48,4,5,1
60,6,5,1
79,8,5,1
106,10,5,1
141,12,5,1
164,14,5,1
197,16,5,1
199,18,5,1
220,20,5,1
223,21,5,1
41,0,6,1
49,2,6,1
59,4,6,1
74,6,6,1
97,8,6,1
124,10,6,1
141,12,6,1
148,14,6,1
155,16,6,1
160,18,6,1
160,20,6,1
157,21,6,1
41,0,7,1
49,2,7,1
57,4,7,1
71,6,7,1
89,8,7,1
112,10,7,1
146,12,7,1
174,14,7,1
218,16,7,1
250,18,7,1
288,20,7,1
305,21,7,1
42,0,8,1
50,2,8,1
61,4,8,1
71,6,8,1
84,8,8,1
93,10,8,1
110,12,8,1
116,14,8,1
126,16,8,1
134,18,8,1
125,20,8,1
42,0,9,1
51,2,9,1
59,4,9,1
68,6,9,1
85,8,9,1
96,10,9,1
90,12,9,1
92,14,9,1
93,16,9,1
100,18,9,1
100,20,9,1
98,21,9,1
41,0,10,1
44,2,10,1
52,4,10,1
63,6,10,1
74,8,10,1
81,10,10,1
89,12,10,1
96,14,10,1
101,16,10,1
112,18,10,1
120,20,10,1
124,21,10,1
43,0,11,1
51,2,11,1
63,4,11,1
84,6,11,1
112,8,11,1
139,10,11,1
168,12,11,1
177,14,11,1
182,16,11,1
184,18,11,1
181,20,11,1
175,21,11,1
41,0,12,1
49,2,12,1
56,4,12,1
62,6,12,1
72,8,12,1
88,10,12,1
119,12,12,1
135,14,12,1
162,16,12,1
185,18,12,1
195,20,12,1
205,21,12,1
41,0,13,1
48,2,13,1
53,4,13,1
60,6,13,1
65,8,13,1
67,10,13,1
71,12,13,1
70,14,13,1
71,16,13,1
81,18,13,1
91,20,13,1
96,21,13,1
41,0,14,1
49,2,14,1
62,4,14,1
79,6,14,1
101,8,14,1
128,10,14,1
164,12,14,1
192,14,14,1
227,16,14,1
248,18,14,1
259,20,14,1
266,21,14,1
41,0,15,1
49,2,15,1
56,4,15,1
64,6,15,1
68,8,15,1
68,10,15,1
67,12,15,1
68,14,15,1
41,0,16,1
45,2,16,1
49,4,16,1
51,6,16,1
57,8,16,1
51,10,16,1
54,12,16,1
42,0,17,1
51,2,17,1
61,4,17,1
72,6,17,1
83,8,17,1
89,10,17,1
98,12,17,1
103,14,17,1
113,16,17,1
123,18,17,1
133,20,17,1
142,21,17,1
39,0,18,1
35,2,18,1
43,0,19,1
48,2,19,1
55,4,19,1
62,6,19,1
65,8,19,1
71,10,19,1
82,12,19,1
88,14,19,1
106,16,19,1
120,18,19,1
144,20,19,1
157,21,19,1
41,0,20,1
47,2,20,1
54,4,20,1
58,6,20,1
65,8,20,1
73,10,20,1
77,12,20,1
89,14,20,1
98,16,20,1
107,18,20,1
115,20,20,1
117,21,20,1
40,0,21,2
50,2,21,2
62,4,21,2
86,6,21,2
125,8,21,2
163,10,21,2
217,12,21,2
240,14,21,2
275,16,21,2
307,18,21,2
318,20,21,2
331,21,21,2
41,0,22,2
55,2,22,2
64,4,22,2
77,6,22,2
90,8,22,2
95,10,22,2
108,12,22,2
111,14,22,2
131,16,22,2
148,18,22,2
164,20,22,2
167,21,22,2
43,0,23,2
52,2,23,2
61,4,23,2
73,6,23,2
90,8,23,2
Your first use of boxplot is unnecessarily creating a plot, you can use
out <- boxplot.stats(sbset$weight)$out
for a little efficiency.
You are interested in the presence of rows, but length(sbset) will return the number of columns. I suggest instead nrow or NROW.
if (NROW(out) > 0) {
boxplot(sbset$weight)
# ...
} else {
# ...
}

How to restrict order open to a certain time and close it in another certain time with Interactive Brokers API

Using Interactive Brokers API, I would like to restrict order open to a certain time for example not before 09:35, I would also like to close the position at about 5 minutes before the end of the day.I tried to use an if statment with Sys.time() but I didn't work and in addition it is not elegant..How can I fix the error or use another method to full fill my need ?
Hour<-as.integer(format(Sys.time(), "%H"))
Minute<-as.integer(format(Sys.time(), "%M"))
print(lastValue)
library(IBrokers)
options("scipen"=4)
myconid = 3
twsobj = twsConnect(myconid)
Sys.sleep(2)
myorderid = as.integer(reqIds(twsobj))
print(myorderid)
Sys.sleep(2)
if(lastValue>0.5 && Hour > 16 && Minute > 35 ){
placeOrder(twsobj,Contract=twsSTK("SPY"),Order=twsOrder(myorderid ,"BUY",1,"MKT"))
print("IT WAS A BUY ORDER")
Sys.sleep(10)
placeOrder(twsobj,Contract=twsSTK("SPY"),Order=twsOrder(myorderid + 1 ,"SELL",1,"MKT"))
} else{
placeOrder(twsobj,Contract=twsSTK("SPY"),Order=twsOrder(myorderid , "SELL" , 1 , "MKT"))
print("IT WAS A SELL ORDER")
Sys.sleep(10)
placeOrder(twsobj,Contract=twsSTK("SPY"),Order=twsOrder(myorderid + 1 , "BUY" ,1, "MKT"))
}
Every broker has a facility to specify the good after time (GAT). It should be as simple as setting this field in the R interface. In the CRAN IBrokers docs it says the field
goodAfterTime Trades Good After Time: YYYYMMDD hh:mm:ss or ""
Here's the order properties API info at IB
https://www.interactivebrokers.com/en/software/api/apiguide/java/order.htm
The trade's "Good After Time," format
"YYYYMMDD hh:mm:ss (optional time zone)"
Note, MKT orders will be filled practically instantly. A LMT order may take some time.

Resources