How to flush the print buffer in R?

I want to run a long-running simulation and have updates printed periodically. However, I am finding that my print statements are being buffered, even when I explicitly try to flush. Here, for example:
for (i in 1:10)
{
print(i)
flush(stdout())
Sys.sleep(1)
}
I would expect this to print one number per second, but it outputs everything at the end, after 10 seconds.
How would you force a flush of the print buffer?

I usually do it like this:
for (i in 1:10) {
message(i,"\r",appendLF=FALSE)
flush.console()
Sys.sleep(1)
}

You can also use cat():
for (i in 1:10) {
# Sleep for 1 second
Sys.sleep(1)
# Print the current iteration
cat(paste0("\r", i))
}
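A minimal sketch combining the two answers above: print with cat() and then call flush.console(), which matters on GUIs that buffer console output (Rgui on Windows, for example). The helper name report_progress is hypothetical, and the short Sys.sleep() stands in for real work.

```r
# Hypothetical helper: print one progress line and push it to the console right away.
report_progress <- function(i, n) {
  line <- sprintf("step %d of %d", i, n)
  cat(line, "\n", sep = "")
  flush.console()   # has an effect only on consoles that buffer output
  invisible(line)
}

for (i in 1:3) {
  report_progress(i, 3)
  Sys.sleep(0.1)    # stand-in for one chunk of the long-running simulation
}
```

Writing a full line per update (rather than "\r") keeps the output readable when it is redirected to a log file.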

Related

Asynchronous programming in R

Overview
I am writing a program (in R) that makes API calls at certain designated times. The API calls take a while, but I need the timer (main loop) to continue counting while the API call is made. To do so, I need to "outsource" the API call to another CPU thread. I believe this is possible and have looked into the future and promises packages, but haven't found a solution yet.
Reproducible Example
Let's run a for loop that counts from 1 to 100. When the counter (i) gets to 50, it has to complete a resource-intensive process (calling the function sampler, which draws a random permutation of 1 million integers 10,000 times, purely to take up computation time). The desire is for the counter to continue counting while sampler() does its work on another thread.
#Something to take up computation space
sampler <- function(){
for(s in 1:10000) sample(1000000)
}
#Get this counter to continue while sampler() runs on another thread
for(i in 1:100){
message(i)
if(i == 50){
sampler()
}
}
What I have tried (unsuccessfully)
library(future)
sampler <- function(){
for(s in 1:10000) sample(1000000)
}
for(i in 1:100){
message(i)
if(i == 50){
mySamples <- future({ sampler() }) %plan% multiprocess
}
}
It seems to me your call is only blocking while the workers are created, not for the duration of the actual work. For example, if you call plan() first, the counter will not block:
library(future)
sampler <- function(){
for(s in 1:10000) sample(1000000)
}
plan(multisession) # multiprocess is deprecated in recent versions of the future package
for(i in 1:100){
message(i)
if(i == 50){
mySamples <- future({ sampler() })
}
}
Also note that the runtime of sampler() is much longer than the duration of the blocking call in your code and that, after executing your code, mySamples still has the status resolved: FALSE and CPU usage is still high.
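The same non-blocking pattern can be sketched with base R's parallel package instead of future. This is a swapped-in technique, not what the answer above uses. The sketch assumes a Unix-alike system (mcparallel() forks, so it is not available on Windows) and uses a hypothetical slow_task() as a much lighter stand-in for sampler(). mccollect(job, wait = FALSE) returns NULL while the child is still running, so the counter keeps going.

```r
library(parallel)

# Hypothetical stand-in for sampler(); small enough that the sketch finishes quickly.
slow_task <- function() {
  Sys.sleep(0.5)
  sum(sample(1000))
}

job <- NULL
result <- NULL
for (i in 1:100) {
  message(i)
  if (i == 50) {
    job <- mcparallel(slow_task())          # fork a child process; returns immediately
  }
  if (!is.null(job) && is.null(result)) {
    done <- mccollect(job, wait = FALSE)    # NULL while the child is still running
    if (!is.null(done)) result <- done[[1]]
  }
  Sys.sleep(0.01)                           # stand-in for the timer's own work
}

# If the loop ended before the child finished, block here for the result.
if (is.null(result)) result <- mccollect(job)[[1]]
```

Polling inside the loop lets the main process pick up the result as soon as it is ready without ever waiting on it.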

Pausing a loop for specific time at a specific time in R

I have to run a long loop that updates some data and stores it on my company's server. The problem is that the company runs a backup routine at midnight and, for that, shuts down the server for around 15 minutes.
So, given that I have to write a file for every iteration, the loop breaks when the server goes down.
I managed to circumvent the problem by writing the loop as follows:
for(i in bills.list){
url = paste0("ulalah",i,"/")
# Download the data
bill.result <- try(getURL(url)) # if there is an error try again
while(class(bill.result)=="try-error"){
Sys.sleep(1)
bill.result <- try(getURL(url))
}
# if iteration is between 23:59:00 and 23:59:40 wait 17 min to restart the loop
if(as.numeric(format(Sys.time(), "%H%M%S")) > 235900 &
as.numeric(format(Sys.time(), "%H%M%S")) < 235940){
Sys.sleep(1020)
}
# Write the page to local hard drive
write(bill.result, paste0("bill", i, ".txt"))
# Print progress of download
cat(i, "\n")
}
Problem is, by evaluating the time at all iterations, I lose some precious time. Any more efficient thoughts?
I think you could simply try to store the data. If that fails, you might be inside the backup window:
store <- function(data, retry = 10) {
  while (retry > 0) {
    result <- try(write(data, "/some_broken_place"))
    if (!inherits(result, "try-error")) {
      # the write succeeded, so stop retrying
      return(invisible(TRUE))
    }
    # it might be we are in the backup window
    cat("I will sleep if it turns out that it's backup time\n")
    now <- as.numeric(format(Sys.time(), "%H%M%S"))
    if (now > 235900 && now < 235940) {
      Sys.sleep(1020)
    }
    retry <- retry - 1
  }
  cat("Very, very bad situation - no chance to store data\n")
  invisible(FALSE)
}
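The cost of the time check itself can also be trimmed by formatting the clock once per call and comparing zero-padded strings. A small sketch, assuming the same 23:59:00 to 23:59:40 window as the question; in_backup_window() and its arguments are hypothetical names.

```r
# Hypothetical helper: TRUE when `now` falls strictly inside the pre-backup window.
in_backup_window <- function(now = Sys.time(),
                             start = "23:59:00", end = "23:59:40") {
  clock <- format(now, "%H:%M:%S")   # zero-padded, so string comparison orders correctly
  clock > start && clock < end
}

# Example calls with fixed times instead of the real clock.
in_backup_window(as.POSIXct("2024-01-01 23:59:20", tz = "UTC"))
in_backup_window(as.POSIXct("2024-01-01 12:00:00", tz = "UTC"))
```

Taking `now` as an argument also makes the check easy to test without waiting for midnight.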

Command that allows re-checking TRUE/FALSE

How can I write a command (function) that loops "endlessly", checking whether the internet connection is up and, if it is not, waiting and then checking again, and so on?
Here is an attempt at what I mean:
havingIP <- function() {
  if (.Platform$OS.type == "windows") {
    ipmessage <- system("ipconfig", intern = TRUE)
  } else {
    ipmessage <- system("ifconfig", intern = TRUE)
  }
  validIP <- "((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.]){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
  any(grep(validIP, ipmessage))
}
The source for the above solution and credit goes here:
How to determine if you have an internet connection in R
if(havingIP()){ source(....) } else { for(i in 1:5) { Sys.sleep(1); cat(i) }}
Something like this but this is not appropriate since I want to execute the command source only once.
while(TRUE){
if(havingIP()){ print("working") } else { for(i in 1:5) { Sys.sleep(1);
cat(i) }}
}
So how can I run this without an endless loop, so that it checks every 5 seconds, waits another 5 seconds if the internet connection is not up, and so on until the connection is up, then executes source only once and stops?
Sorry, I tried to search for a solution. I'm sure someone has asked something similar, but I could not find anything since I'm not sure how to search for it. Thanks!
It seems like you might be looking for break:
while (TRUE) {
if (havingIP()) {
print("working") # execute what you want here
break # and if we ever reach here, then exit the while loop
} else {
for (i in 1:5) {
Sys.sleep(1)
cat(i)
}
}
}
A simpler take on lee's answer:
while(!havingIP()) for(i in 1:5) {Sys.sleep(1); cat(i)}
source(...)
This will pause execution until havingIP returns TRUE.
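The polling pattern in both answers can be wrapped in a reusable function with an optional retry limit, so the script cannot hang forever. This is only a sketch: wait_until(), interval, and max_tries are hypothetical names, and the demo uses a counter instead of a real network check.

```r
# Hypothetical helper: poll `check()` until it returns TRUE or `max_tries` checks have failed.
wait_until <- function(check, interval = 5, max_tries = Inf) {
  tries <- 0
  while (!isTRUE(check())) {
    tries <- tries + 1
    if (tries >= max_tries) return(FALSE)   # give up after max_tries failed checks
    Sys.sleep(interval)
  }
  TRUE
}

# Demo: a check that succeeds on its third call.
attempts <- 0
ready <- wait_until(function() {
  attempts <<- attempts + 1
  attempts >= 3
}, interval = 0.1)
```

With the question's function, the call would be wait_until(havingIP), followed by the single source call once it returns TRUE.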

foreach - dopar do not start workers

I have the following piece of code that I would like to run with the doMC engine:
who_wins <- function(probs_a, probs_b, delta_order = 0, delta_down = 0) {
  #browser()
  team_a <- runif(5, 0, 1)
  team_b <- runif(5, 0, 1)
  sya <- syb <- 0
  for (i in 1:5) {
    for (j in 1:2) {
      if (j == 1) {
        if (sya < syb) {
          team_a[i] <- (1 - delta_down) * team_a[i]
        }
        team_a[i] <- (1 - (i - 1) * delta_order) * team_a[i]
        sya <- sya + (team_a[i] < probs_a[i])
      } else {
        if (syb < sya) {
          team_b[i] <- (1 - delta_down) * team_b[i]
        }
        team_b[i] <- (1 - (i - 1) * delta_order) * team_b[i]
        syb <- syb + (team_b[i] < probs_b[i])
      }
    }
  }
  if (sya > syb) {
    return(1)
  } else if (sya < syb) {
    return(2)
  } else {
    return(0)
  }
}
library(doMC)
registerDoMC(8)
probs_a <- seq(.6, .8, length.out = 5)
probs_b <- probs_a[5:1]
nsim <- 20000
results <- foreach(icount(nsim), .combine = c) %dopar% {
  return(who_wins(probs_a, probs_b))
}
The problem is that a couple of seconds after the first worker starts, the engine tries to launch the remaining ones. I see a spike in all processors, but they all die quickly, even the first one. Then a new process is launched and the remainder of the code runs through this lone worker.
I have tried different pieces of code and the engine works perfectly. But with this specific routine, it doesn't.
Can anybody tell me what is happening? Thanks in advance.
Adding a Sys.sleep(0.01) inside your loop, I see all 8 processes “busy” with that one. After they are done, the main process remains busy for some time. I assume that the overhead of collecting the data from the individual processes and combining it into a single result is on a similar scale to the actual benefit from the parallelized computation. If you simply change the “computation” to return(1), you will see that this takes about as long as your computation, so the time is not spent on the workload but on assembling the result.
Neither .inorder=FALSE nor the use of doParallel instead of doMC changes this. However, I would consider this a problem in the foreach package, as mclapply has significantly less overhead:
library(parallel) # provides mclapply
result <- unlist(mclapply(1:nsim, function(i) {
  return(who_wins(probs_a, probs_b))
}, mc.cores=8))
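If per-task overhead is the bottleneck, as the answer suggests, one common workaround is to hand each worker a chunk of simulations rather than one simulation per task. A rough sketch under assumptions: play_once() is a hypothetical cheap stand-in for who_wins(probs_a, probs_b), two workers are used, and the system is a Unix-alike (mclapply() with mc.cores > 1 does not work on Windows).

```r
library(parallel)

# Hypothetical stand-in for who_wins(): returns 0, 1 or 2 at random.
play_once <- function() sample(0:2, 1)

nsim <- 20000
n_workers <- 2

# Split nsim into one chunk per worker; the first chunk absorbs any remainder.
chunk_sizes <- rep(nsim %/% n_workers, n_workers)
chunk_sizes[1] <- chunk_sizes[1] + nsim %% n_workers

# Each worker runs its whole chunk and sends back a single integer vector.
results <- unlist(mclapply(chunk_sizes, function(n) {
  vapply(seq_len(n), function(k) play_once(), integer(1))
}, mc.cores = n_workers))
```

With one task per worker, only n_workers results have to be transferred and combined instead of nsim.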

Why does R say no loop for break/next, jumping to top level

Why does R throw the error "Error in value[3L] : no loop for break/next, jumping to top level" instead of going to the next iteration of a loop? I'm on R version 2.13.1 (2011-07-08)
for (i in seq(10)) {
tryCatch(stop(), finally=print('whoops'), error=function(e) next)
}
This problem came up because I wanted to create a different image or no image at all when plot failed. The code, using joran's approach, would look like this:
for (i in c(1,2,Inf)) {
fname = paste(sep='', 'f', i, '.png')
png(fname, width=1024, height=768)
rs <- tryCatch(plot(i), error=function(e) NULL)
if (is.null(rs)){
print("I'll create a different picture because of the error.")
}
else{
print(paste('image', fname, 'created'))
dev.off()
next
}
}
Maybe you could try:
for (i in seq(10)) {
flag <- TRUE
tryCatch(stop(), finally=print('whoops'), error=function(e) flag<<-FALSE)
if (!flag) next
}
Unfortunately, once you get inside your error function you're no longer in a loop. There's a way you could hack around this:
for (i in seq(10)) {
delayedAssign("do.next", {next})
tryCatch(stop(), finally=print('whoops'),
error=function(e) force(do.next))
}
Though that is... well, hacky. Perhaps there is a less hacky way, but I don't see one right off.
(This works because delayedAssign happens every loop, canceling out the efforts of force.)
EDIT
Or you could use continuations:
for (i in seq(10)) {
callCC(function(do.next) {
tryCatch(stop(), finally=print('whoops'),
error=function(e) do.next(NULL))
# Rest of loop goes here
print("Rest of loop")
})
}
EDIT
As Joris points out, you probably shouldn't actually use either of these, because they're confusing to read. But if you really want to call next in a loop, this is how :).
Wouldn't it make more sense to put the next outside the tryCatch based on an if check? Something like this:
for (i in c(1,2,Inf)) {
rs <- tryCatch(seq(i), finally=print('whoops'), error=function(e) NULL)
if (is.null(rs)){
print("I found an error!")
}
else{
next
}
}
although I'm not sure this is what you want, since I'm a little unclear on what you're trying to do.
EDIT
Based on the OP's revisions, this formulation works for me:
plotFn <- function(fname,i){
png(fname, width=400, height=200)
plot(i)
dev.off()
}
for (i in c(1,Inf,3)) {
fname = paste('f', i, '.png',sep="")
rs <- tryCatch(plotFn(fname,i), error=function(e){dev.off(); return(NULL)})
if (is.null(rs)){
print("I'll create a different picture because of the error.")
}
else{
print(paste('image', fname, 'created'))
next
}
}
I'm certain that not having a dev.off() call in the case of an error needed to be fixed. I'd have to dig a little deeper to figure out exactly why separating png and plot was causing problems. But I think it's probably cleaner to keep the png(); plot(); dev.off() sequence self contained anyway. Also note that I put a dev.off() in the error function.
I haven't tested what will happen if plotFn throws an error on png(), never creates the device and then reaches the error function and calls dev.off(). Behavior may depend on what else you have going on in your R session.
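One way to sidestep that untested case is to register dev.off() with on.exit() right after the device opens, so it runs only if a device was actually created. A sketch under assumptions: plot_to_file() is a hypothetical variant of plotFn, and pdf() replaces png() only so that the example runs on systems without PNG support.

```r
# Hypothetical variant of plotFn that cleans up its own device.
plot_to_file <- function(fname, x) {
  pdf(fname)
  on.exit(dev.off())   # registered only after the device opened; runs on success or error
  plot(x)
  invisible(fname)
}

for (i in c(1, Inf, 3)) {
  fname <- file.path(tempdir(), paste0("f", i, ".pdf"))
  rs <- tryCatch(plot_to_file(fname, i), error = function(e) NULL)
  if (is.null(rs)) {
    print("I'll create a different picture because of the error.")
  } else {
    print(paste("image", fname, "created"))
  }
}
```

Because the cleanup lives inside the function, the error handler no longer needs its own dev.off() call.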

Resources