I am asking to write a text or graphical progress tracker while rforcecom's batch update function loads batches of up to 10,000.
To set up and complete a batch update, a few objects must be created--there is no avoiding it. I really do not like having to re-run code in order to check the status of rforcecom.checkBatchStatus(). This needs to be automated while a progress bar gives a visual of actual progress, since checking in the global environment isn't preferred and it will be a static "status" update until it's run again.
Here's how the code is set up:
require(Rforcecom)
## Login to Salesforce using your username and password token
## Once ready to update records, use the following:
job<- rforcecom.createBulkJob(session, operation = 'update',
object = 'custom_object__c')
info<- rforcecom.createBulkBatch(session, jobId = job$id, data = entry,
batchSize = 10000)
### Re-run this line if status(in global environment) is "In Progress" for
### updated status
status<- lapply(info, FUN = function(x) {
rforcecom.checkBatchStatus(session, jobId = x$jobId, batchId = x$id)})
###Once complete, check details
details<- lapply(status, FUN = function(x){
rforcecom.getBatchDetails(session, jobId = x$jobId, batchId = x$id)})
close<- rforcecom.closeBulkJob(session, jobId = job$id)
To automate re-running the status code, use the repeat loop:
repeat {
statements...
if (condition) {
break
}
}
Then, to get a visual for a progress update, use the txtProgressBar() in base R. For this particular function, I made my own progress bar function with two simple companion functions. As a note about progressValue(), the rforcecom.checkBatchStatus() outputs as a list of 1 and a sublist. The sublist name for checking the number of records processed is "numberRecordsProcessed".
progressBar<- function(x, start = 0, finish){
# x is your object that is performing a function over a varying time length
# finish is your number of rows of data your function is processing
pb <- txtProgressBar(min = start, max = finish, style = 3)
for (i in 1:finish){
i<- progressValue(x)
setTxtProgressBar(pb, i)
if (progressValue(x)/finish == 1) {
close(pb)
}
}
}
finish<- function(x){
return(as.numeric(nrow(x)))
}
progressValue<- function(x){
x=x[[1]][["numberRecordsProcessed"]]
return(as.numeric(x))
}
Now, bring it all together! Repeat loops can be trained to end as long as you know your conditions: "Completed" or "Failed". Repeat "status", which will update the number of records processed, and by doing so this will update your progress bar. When the number of records processed equals the number of rows in your data, the progress bar will quit and so will your repeat loop.
repeat {
status<- lapply(info, FUN = function(x){
rforcecom.checkBatchStatus(session, jobId = x$jobId, batchId = x$id)})
progressBar(status, finish = finish(entry))
if (status[[1]][["state"]]=="Completed") {
break
}
if (status[[1]][["state"]]=="Failed") {
break
}
}
Related
We have a basic tryCatch that writes a dataframe to Google Sheets, and trys again if the first write fails for any reason:
result = tryCatch({
print('TRYING')
googlesheets4::sheet_write(data = our_df, ss = our_spreadsheet, sheet = 'our_sheetname')
}, error = function(e) {
print('ERROR, TRYING AGAIN')
googlesheets4::sheet_write(data = our_df, ss = our_spreadsheet, sheet = 'our_sheetname')
})
It is possible to generalize this code to retry the googlesheets4::sheet_write() function call for N number of tries? Is something built into base R for this or is there a good R library that handles unlimited retries of a function?
You can put it in a for loop like this.
First, I am going to define a function that often fails (as I don't have access to your Google sheet).
russian_roulette <- function(n = 6) {
revolver <- sample(1:n, 1)
if (revolver == 1) {
return("You lived")
} else {
stop("Better luck next time...")
}
}
Then you can try it as many times as you consider reasonable. You can replace my call to russian_roulette() with your call to googlesheets4::sheet_write().
NUM_TRIES <- 10
for (i in 1:NUM_TRIES) {
message(i)
result <- try({
russian_roulette()
})
if (class(result) != "try-error") {
print("Success!")
break
}
}
Output:
1
Error in russian_roulette() : Better luck next time...
2
Error in russian_roulette() : Better luck next time...
3
Error in russian_roulette() : Better luck next time...
4
Error in russian_roulette() : Better luck next time...
5
Error in russian_roulette() : Better luck next time...
6
[1] "Success!"
result
# [1] "You lived"
I don't know why you expect writing to a file to fail - depending on the reason you may want to add a Sys.sleep() call in there for a certain number of seconds after every failure.
Trying to cut a bunch of audio (.WAV) files into smaller samples in R. For this example, I'm using a loop to cut out 1 minute samples at 140 minutes.
For some files, the recording ends before 140 minutes due to an error in the recording device. When this occurs, an error appears -- and the loop stops. I'm trying to make it so the loop continues by using the try or tryCatch function however keep getting errors.
The code is as follows:
for(i in 1:length(AR_CD288)){
CUT_AR288_5 <- try({readWave(AR_CD288[i], from = 140, to = 141, units = "minutes")})
FILE.OUT_AR288_5<- sub("\\.wav$", "_140.wav", AR_CD288)
OUT.PATH_AR288_5 <- file.path("New files", basename(FILE.OUT_AR288_5))
writeWave(CUT_AR288_5, extensible=FALSE, filename = OUT.PATH_AR288_5[i])
}
I get the following two errors from the code:
Error in readBin(con, int, n = N, size = bytes, signed = (bytes != 1), :
invalid 'n' argument
Error in writeWave(CUT_AR288_5, extensible = FALSE, filename = OUT.PATH_AR288_5[i]) :
'object' needs to be of class 'Wave' or 'WaveMC
The loop still saves some samples into the "New files" directory, however, once the loop reaches a file <140 minutes, the loop stops.
I am very stuck! Any help would be greatly appreciated.
Cheers.
When I use try, I always do one (or both) of:
check the return value to see if it inherits "try-error", indicating that the command failed; or
add try(., silent = TRUE), indicating that I don't care if it succeeded (but this implies that I will not use its return value, either).
Try this:
for (i in seq_along(AR_CD288)) {
CUT_AR288_5 <- try({
readWave(AR_CD288[i], from = 140, to = 141, units = "minutes")
}, silent = TRUE)
if (!inherits(CUT_AR288_5, "try_error")) {
FILE.OUT_AR288_5 <- sub("\\.wav$", "_140.wav", AR_CD288)
OUT.PATH_AR288_5 <- file.path("New files", basename(FILE.OUT_AR288_5))
writeWave(CUT_AR288_5, extensible = FALSE, filename = OUT.PATH_AR288_5[i])
}
}
Three notes:
I changed 1:length(.) to seq_along(.); the latter is more resilient in an automated use when it is feasible that the vector might be length 0. For example, if AR_CD288 can ever be length 2, intuitively we expect 1:length(AR_CD288) to return nothing so that the for loop will not run; unfortunately, it resolves to 1:0 which returns a vector of length 2, which will often fail (based on whatever code is operating in the loop). The use of seq_along(.) will always return a vector of length 0 with an empty input, which is what we need. (Alternatively and equivalent, seq_len(length(AR_CD288)), though that's really what seq_along is intended to do.)
If you do not add silent=TRUE (or explicitly add silent=FALSE), then you will get an error message indicating that the command failed. Unfortunately, the error message may not indicate which i failed, so you may be left in the dark as far as fixing or removing the errant file. You may prefer to add an else to the if (inherits(.,"try-error")) clause so that you can provide a clearer error, such as
if (inherits(CUT_AR288_5, "try_error")) {
warning("'readWave' failed on ", sQuote(AR_CD288[i]), call. = FALSE)
} else {
FILE.OUT_AR288_5 <- sub("\\.wav$", "_140.wav", AR_CD288)
# ...
}
(noting that I put the "things worked" code in the else clause here ... I find it odd to do if (!...) {} else {}, seems like a double-negation :-).
The choice to wrap one function or the whole block depends on your needs: I tend to prefer to know exactly where things fail, so the will-possibly-fail functions are often individually wrapped with try so that I can react (or log/message) accordingly. If you don't need that resolution of error-detection, then you can certainly wrap the whole code-block in a sense:
for (i in seq_along(AR_CD288)) {
ret <- try({
CUT_AR288_5 <- readWave(AR_CD288[i], from = 140, to = 141, units = "minutes")
FILE.OUT_AR288_5 <- sub("\\.wav$", "_140.wav", AR_CD288)
OUT.PATH_AR288_5 <- file.path("New files", basename(FILE.OUT_AR288_5))
writeWave(CUT_AR288_5, extensible = FALSE, filename = OUT.PATH_AR288_5[i])
}, silent = TRUE)
if (inherits(ret, "try-error")) {
# do or log something
}
}
I am receiving some values in my R process and I want to compute them asynchronously. I am using promises and future package.
This is how my current code looks like:
arr = list()
i=0
while(i < 10)
{
a = read messages from KAFKA topic
arr[[i]] = future(DoSomething(a))
i = i + 1
}
Now, arr contains a list of promises
How do I get value() of the promise that has resolved first (and so on)?
Something like promise.race in javascript.
Edit: I just re-read your question and saw that you were asking about getting the first result, not just all results. Below is the code for getting that. It's a while loop that waits for any result to be ready and then moves forward when there's a result ready.
There is also a function called promise_race in the promises package, but the issue with the promises package is that it can only output results. You can't get the value produced back into a variable for further computations in the main thread.
require(future)
plan(multiprocess)
longRunningFunction <- function(value) {
random1<- runif(min= 5 ,max = 30,n = 1)
Sys.sleep(random1)
return(value)
}
arr = list()
#changed starting number to 1 since R lists start at 1, not 0
i=1
#If the number of futures generated is more than the number of cores available, then the main thread will block until the first future completes and allows more futures to be started
while(i < 6)
{
arr[[i]] = future(longRunningFunction(i))
i = i + 1
}
while(all(!resolved(arr))){ }
raceresults_from_future<-lapply(arr[resolved(arr)], value)
print(paste("raceresults_from_future: ",raceresults_from_future) )
Is it possible to write a function in R which will hold its execution, giving the users control over the console (while in interactive mode of course), meanwhile recording their inputs, and continuing execution either:
after a certain input has been made
or after a certain output has been made
or a certain duration of time has passed
Example: ask the user a question (without using readline() for the answer)
question <- function() {
message("How much is 2 + 2?")
#let users take control of the console
#continue to next statement only if they input "2+2", or "4" or a minute has passed
#meanwhile record their last input similar to ".Last.Value", e.g.:
startTime <- Sys.time()
timeout <- FALSE
lastInput <- lastInput()
while (eval(parse(text = lastInput)) != 4 & !timeout) {
if (difftime(Sys.time(), startTime, units = "mins") > 1) {
timeout <- TRUE
}
lastInput <- lastInput()
}
if (timeout) {
stop("Sorry, timeout.")
} else {
message("Correct! Let's continue with this function:")
}
}
Where lastInput() is a function which "listens" to user input when it changes.
Obviously the above structure is tentative and won't give me what I want, some way to "listen" or "observe" and only react when the user inputs something to the console.
The final user experience should be:
> question()
How much is 2+2?
> #I'm the user, I can do whatever
> head(mtcars)
> plot(1:10)
> 3
> 2 + 2
[1] 4
Correct! Let's continue with this function:
Am I too optimistic or is there some R magic for this?
Thanks to #parth I have looked at swirl's source code and got acquainted with the addTaskCallback function. From the help file:
addTaskCallback registers an R function that is to be called each time a top-level task is completed.
And so we can make R check the users input ("top-level task") with a specific function, responding accordingly.
But since the swirl code is very "heavy", I think I need to supply a minimal example:
swirllike <- function(...){
removeTaskCallback("swirllike")
e <- new.env(globalenv())
e$prompt <- TRUE
e$startTime <- Sys.time()
cb <- function(expr, val, ok, vis, data=e){
e$expr <- expr
e$val <- val
e$ok <- ok
e$vis <- vis
# The result of f() will determine whether the callback
# remains active
return(f(e, ...))
}
addTaskCallback(cb, name = "swirllike")
message("How much is 2+2?")
}
OK, so the swirllike function evokes the 2+2 question, but it also declares a new environment e with some objects the user needs not know. It then adds the swirllike task callback to the task callback list (or rather vector). This "task callback" holds the cb function which calls the f function - the f function will run with every input.
If you run this, make sure you see the swirllike task callback with:
> getTaskCallbackNames()
[1] "swirllike"
Now the f function is similar to my sketch in the question:
f <- function(e, ...){
if (e$prompt) {
if (difftime(Sys.time(), e$startTime, units = "mins") > 1) {
timeout <- TRUE
stop("Sorry, timeout.")
}
if(!is.null(.Last.value) && .Last.value == 4) {
message("Correct! Let's continue with this function:")
e$prompt <- FALSE
while (!e$prompt) {
#continue asking questions or something, but for this example:
break
}
}
}
return(TRUE)
}
And don't forget to remove the swirllike task callback with:
removeTaskCallback("swirllike")
I'm trying to use ReactivePoll to update a data frame. The check function is supposed to return different values when a) new rows are added to database tables and b) input$type changes.
# checks to see if input$type has changed or the table corresponding to input$type has grown
check.data <- function() {
row.count <- 0
if (data.connection$source == "pg") {
pg.connection$connection <- dbConnect(pg.connection$driver, dbname = pg.settings.database(), host = pg.settings.host(), port = as.integer(pg.settings.port()), user = pg.settings.user(), password = pg.settings.password())
row.count <- as.integer(dbGetQuery(pg.connection$connection, paste0("SELECT count(*) FROM ", input$type ,"datum")))
dbDisconnect(pg.connection$connection)
}
print(row.count) # make sure return value is changing when it should
row.count
}
# refreshes data
refresh.data <- function() {
print("check") # check if function is executing
frame <- data.frame()
if (data.connection$source == "pg") {
pg.connection$connection <- dbConnect(pg.connection$driver, dbname = pg.settings.database(), host = pg.settings.host(), port = as.integer(pg.settings.port()), user = pg.settings.user(), password = pg.settings.password())
frame <- dbReadTable(pg.connection$connection, paste0(input$type, "datum"))
dbDisconnect(pg.connection$connection)
}
frame
}
data <- reactivePoll(15000, NULL, check.data, refresh.data)
According to R Shiny's documentation, "the check function indicates change by returning a different value from the previous time it was called." I've verified that row.count is changing, but "check" is never printed, implying the value function never executes. I know row counting isn't a reliable way to check for change (tables may have the same number of rows), but row.count is changing.