R - How to print progress in a loop over list?

I need to process a long list of images using a loop. It takes a considerable time to run everything, and therefore I would like to keep track of the progress.
This is my loop:
files.list <- c("LC82210802013322LGN00_B1.TIF", "LC82210802013322LGN00_B10.TIF",
"LC82210802013322LGN00_B11.TIF", "LC82210802013322LGN00_B2.TIF",
"LC82210802013322LGN00_B3.TIF", "LC82210802013322LGN00_B4.TIF",
"LC82210802013322LGN00_B5.TIF", "LC82210802013322LGN00_B6.TIF",
"LC82210802013322LGN00_B7.TIF", "LC82210802013322LGN00_B8.TIF",
"LC82210802013322LGN00_B9.TIF", "LC82210802013322LGN00_BQA.TIF",
"LC82210802013354LGN00_B1.TIF", "LC82210802013354LGN00_B10.TIF",
"LC82210802013354LGN00_B11.TIF", "LC82210802013354LGN00_B2.TIF",
"LC82210802013354LGN00_B3.TIF", "LC82210802013354LGN00_B4.TIF",
"LC82210802013354LGN00_B5.TIF", "LC82210802013354LGN00_B6.TIF",
"LC82210802013354LGN00_B7.TIF", "LC82210802013354LGN00_B8.TIF",
"LC82210802013354LGN00_B9.TIF", "LC82210802013354LGN00_BQA.TIF",
"LC82210802014021LGN00_B1.TIF", "LC82210802014021LGN00_B10.TIF",
"LC82210802014021LGN00_B11.TIF", "LC82210802014021LGN00_B2.TIF",
"LC82210802014021LGN00_B3.TIF", "LC82210802014021LGN00_B4.TIF",
"LC82210802014021LGN00_B5.TIF", "LC82210802014021LGN00_B6.TIF",
"LC82210802014021LGN00_B7.TIF", "LC82210802014021LGN00_B8.TIF",
"LC82210802014021LGN00_B9.TIF", "LC82210802014021LGN00_BQA.TIF",
"LC82210802014037LGN00_B1.TIF", "LC82210802014037LGN00_B10.TIF",
"LC82210802014037LGN00_B11.TIF", "LC82210802014037LGN00_B2.TIF",
"LC82210802014037LGN00_B3.TIF", "LC82210802014037LGN00_B4.TIF",
"LC82210802014037LGN00_B5.TIF", "LC82210802014037LGN00_B6.TIF",
"LC82210802014037LGN00_B7.TIF", "LC82210802014037LGN00_B8.TIF",
"LC82210802014037LGN00_B9.TIF", "LC82210802014037LGN00_BQA.TIF",
"LC82210802014085LGN00_B1.TIF", "LC82210802014085LGN00_B10.TIF",
"LC82210802014085LGN00_B11.TIF", "LC82210802014085LGN00_B2.TIF",
"LC82210802014085LGN00_B3.TIF", "LC82210802014085LGN00_B4.TIF",
"LC82210802014085LGN00_B5.TIF", "LC82210802014085LGN00_B6.TIF",
"LC82210802014085LGN00_B7.TIF", "LC82210802014085LGN00_B8.TIF",
"LC82210802014085LGN00_B9.TIF", "LC82210802014085LGN00_BQA.TIF"
)
for (x in files.list) { #loop over files
# Tell about progress
cat('Processing image', x, 'of', length(files.list),'\n')
}
Of course, instead of showing the name of the file, I would like to show the index of the current file in the context of the length of the entire list.
I really need the names of the files within the loop, because I need to load and save a new version of each one of them.
Any ideas? Thanks in advance.

for (x in seq_along(files.list)) { #loop over files
# doing something on x-th file => files.list[x]
# Tell about progress
cat('Processing image', x, 'of', length(files.list),'\n')
}

for (i in 1:length(files.list)) {
x <- files.list[i]
# do stuff with x
message('Processing image ', i, ' of ', length(files.list))
}

You can use the system window progress bar (winProgressBar() is only available on Windows) as follows:
# put this before the start of the loop
total <- length(files.list)
pb <- winProgressBar(title = "progress bar", min = 0, max = total, width = 300)
# put this inside the loop, just before its closing brace
# (here i is the loop iterator)
Sys.sleep(0.1)
setWinProgressBar(pb, i, title = paste(round(i/total*100, 0), "% done"))
# put this after the closing brace of the loop
close(pb)
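On other platforms, a similar effect can be sketched with base R's txtProgressBar(), which draws the bar in the console. This is only a minimal sketch: the shortened files.list and the processed vector are placeholders for the real file list and the real per-file work.

```r
# Hypothetical, shortened file list standing in for the real one
files.list <- c("scene_B1.TIF", "scene_B2.TIF", "scene_B3.TIF")
processed <- character(0)

pb <- txtProgressBar(min = 0, max = length(files.list), style = 3)
for (i in seq_along(files.list)) {
  x <- files.list[i]            # the file name is still available as x
  processed <- c(processed, x)  # placeholder for loading/saving the image
  setTxtProgressBar(pb, i)      # advance the bar to the current index
}
close(pb)
```

Looping over indices with seq_along() keeps both the position (i) and the file name (x) available, and it also behaves correctly for an empty list.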

Related

Function not changing the data frame

I've recently made a simple for loop that outputs the Max and Min of the past 5 prices and it works perfectly, creating 2 new columns showing MaxH and MinL:
for(i in 5:nrow(XBTUSD_df_s)){
XBTUSD_df_s$MaxH[i] = max(XBTUSD_df_s$Price[(i-(5-1)):i])
XBTUSD_df_s$MinL[i] = min(XBTUSD_df_s$Price[(i-(5-1)):i])
}
I then put this for loop into a function so that I can adjust how many prices I want the Max and Min to be based off like so (the print lines were added as a sanity check):
FindMaxMin = function(x){
for(i in x:nrow(XBTUSD_df_s)){
XBTUSD_df_s$MaxH[i] = max(XBTUSD_df_s$Price[(i-(x-1)):i])
XBTUSD_df_s$MinL[i] = min(XBTUSD_df_s$Price[(i-(x-1)):i])
print(XBTUSD_df_s$MaxH[i])
print(XBTUSD_df_s$MinL[i])
}
}
But after for example:
FindMaxMin(x = 10)
The console will spit out all the expected results but unlike the for loop by itself, my dataframe will not automatically add on the MaxH and MinL columns.
I've tried return() and I think most likely it is a global environment problem but can't seem to wrap my head around it.
Thanks in advance!
You need to return the object from the function and then assign it later:
FindMaxMin = function(x, XBTUSD_df_s){
for(i in x:nrow(XBTUSD_df_s)){
XBTUSD_df_s$MaxH[i] = max(XBTUSD_df_s$Price[(i-(x-1)):i])
XBTUSD_df_s$MinL[i] = min(XBTUSD_df_s$Price[(i-(x-1)):i])
print(XBTUSD_df_s$MaxH[i])
print(XBTUSD_df_s$MinL[i])
}
return (XBTUSD_df_s)
}
new = FindMaxMin(10, XBTUSD_df_s)
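The underlying reason is that a function argument in R behaves like a copy (copy-on-modify), so changes made inside the function never reach the caller's object unless the result is returned and assigned. Below is a small self-contained sketch of the same pattern. The prices data frame and the helper name add_rolling_range are made up for illustration, and the sketch assumes window is not larger than the number of rows.

```r
# Hypothetical helper: adds rolling max/min columns and returns the modified copy
add_rolling_range <- function(df, window) {
  df$MaxH <- NA_real_
  df$MinL <- NA_real_
  for (i in window:nrow(df)) {
    idx <- (i - window + 1):i          # the last `window` rows up to row i
    df$MaxH[i] <- max(df$Price[idx])
    df$MinL[i] <- min(df$Price[idx])
  }
  df                                   # return the modified copy
}

prices <- data.frame(Price = c(5, 3, 8, 6, 7, 2))  # made-up data
result <- add_rolling_range(prices, window = 3)

names(prices)  # the original is unchanged: only "Price"
names(result)  # the returned copy has "Price", "MaxH", "MinL"
```

Assigning the result back to the original name (prices <- add_rolling_range(prices, 3)) gives the same effect as the stand-alone loop.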

Progress bar for rforcecom.checkbatchstatus()

I would like to write a text or graphical progress tracker that runs while Rforcecom's batch update function loads batches of up to 10,000 records.
To set up and complete a batch update, a few objects must be created--there is no avoiding it. I really do not like having to re-run code in order to check the status of rforcecom.checkBatchStatus(). This should be automated, with a progress bar giving a visual of the actual progress, because inspecting the object in the global environment is inconvenient and only shows a static "status" until the code is run again.
Here's how the code is set up:
require(Rforcecom)
## Login to Salesforce using your username and password token
## Once ready to update records, use the following:
job<- rforcecom.createBulkJob(session, operation = 'update',
object = 'custom_object__c')
info<- rforcecom.createBulkBatch(session, jobId = job$id, data = entry,
batchSize = 10000)
### Re-run this line if status(in global environment) is "In Progress" for
### updated status
status<- lapply(info, FUN = function(x) {
rforcecom.checkBatchStatus(session, jobId = x$jobId, batchId = x$id)})
###Once complete, check details
details<- lapply(status, FUN = function(x){
rforcecom.getBatchDetails(session, jobId = x$jobId, batchId = x$id)})
close<- rforcecom.closeBulkJob(session, jobId = job$id)
To automate re-running the status code, use the repeat loop:
repeat {
statements...
if (condition) {
break
}
}
Then, to get a visual for a progress update, use the txtProgressBar() in base R. For this particular function, I made my own progress bar function with two simple companion functions. As a note about progressValue(), the rforcecom.checkBatchStatus() outputs as a list of 1 and a sublist. The sublist name for checking the number of records processed is "numberRecordsProcessed".
progressBar<- function(x, start = 0, finish){
# x is your object that is performing a function over a varying time length
# finish is your number of rows of data your function is processing
pb <- txtProgressBar(min = start, max = finish, style = 3)
for (i in 1:finish){
i<- progressValue(x)
setTxtProgressBar(pb, i)
if (progressValue(x)/finish == 1) {
close(pb)
}
}
}
finish<- function(x){
return(as.numeric(nrow(x)))
}
progressValue<- function(x){
x=x[[1]][["numberRecordsProcessed"]]
return(as.numeric(x))
}
Now, bring it all together! Repeat loops can be trained to end as long as you know your conditions: "Completed" or "Failed". Repeat "status", which will update the number of records processed, and by doing so this will update your progress bar. When the number of records processed equals the number of rows in your data, the progress bar will quit and so will your repeat loop.
repeat {
status<- lapply(info, FUN = function(x){
rforcecom.checkBatchStatus(session, jobId = x$jobId, batchId = x$id)})
progressBar(status, finish = finish(entry))
if (status[[1]][["state"]]=="Completed") {
break
}
if (status[[1]][["state"]]=="Failed") {
break
}
}
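A variant of this polling pattern creates the progress bar once, outside the loop, and only updates it on each poll. The following is a sketch under stated assumptions, not the Rforcecom API: check_status() is a hypothetical stand-in that simulates rforcecom.checkBatchStatus() so the example runs without a Salesforce session, and only the field names numberRecordsProcessed and state are taken from the text above.

```r
total <- 50        # number of records to process (made-up value)
processed <- 0

# Hypothetical stand-in for rforcecom.checkBatchStatus(): each call
# pretends that 10 more records have been processed.
check_status <- function() {
  processed <<- min(processed + 10, total)
  list(numberRecordsProcessed = processed,
       state = if (processed >= total) "Completed" else "InProgress")
}

pb <- txtProgressBar(min = 0, max = total, style = 3)
polls <- 0
repeat {
  status <- check_status()
  polls <- polls + 1
  setTxtProgressBar(pb, as.numeric(status$numberRecordsProcessed))
  if (status$state %in% c("Completed", "Failed")) break
  Sys.sleep(0.01)  # wait a little between polls
}
close(pb)
```

With a real batch job, the pause between polls would typically be longer so that the server is not queried in a tight loop.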

R with tcltk/tcltk2: Improve slow performance when displaying big data.frame with TkTable?

Please see two edits below (added later)...
I have loaded a big data.frame into memory (2.7 million rows and 7 columns, 74 MB of RAM).
If I want to view the data using Tcl/Tk's Tktable widget via the tcltk2 package function tk2edit, it takes over 15 minutes until the window is displayed with the data, and R (including Tcl/Tk) consumes about 7 GB of additional RAM (!).
Example:
library(tcltk2)
my.data.frame <- data.frame(ID=1:2600000,
col1=rep(LETTERS,100000),
col2=rep(letters,1E5),
col3=26E5:1) # about 40 MB of data
tk2edit(my.data.frame)
The basic problem seems to be that each cell of the data.frame must be loaded into a Tcl array via two nested loops (see the code in this tktable question).
The tcltk2 package's function tk2edit works the same way, over-simplified:
# my.data.frame contains a lot of rows...
for (i in 0:(dim(my.data.frame)[1])) {
for (j in 0:(dim(my.data.frame)[2]-1)) {
tclarray1[[i,j]] <- my.data.frame[i, j]
}
}
Question: Is there any way to optimize displaying big data.frames with tktable, e.g. by avoiding the nested loops? I just want to view data (no editing required)...
tktable has the -variable option where you can set the Tcl array variable that contains ALL the data of the table. So we "only" have to find a way to create a Tcl array from an R data.frame with "one call to Tcl from R"...
PS: This is not a problem of the tcltk2 package but seems to be a general problem how to "bulk load" data of a data.frame into Tcl variables...
PS2: The good thing is that Tktable seems to be able to display such a lot of data efficiently (I can scroll and even edit cells without noticing any severe delays).
Edit 1 (09/01/2015): Adding pure Tcl/Tk benchmark results with Tktable and data in an array
I have prepared a simple benchmark in Tcl/Tk to measure the execution time and memory consumption of filling a similar Tktable:
#!/usr/bin/env wish
package require Tktable
set rows 2700000
set columns 4
for {set row 0} {$row <= $rows} {incr row} {
for {set column 0} {$column < $columns} {incr column} {
if {$row == 0} {
set data($row,$column) Titel$column
} else {
set data($row,$column) R${row}C${column}
}
}
}
ttk::frame .fr
table .fr.table -rows $rows -cols $columns -titlerows 1 -titlecols 0 -height 5 -width 25 -rowheight 1 -colwidth 9 -maxheight 100 -maxwidth 400 -selectmode extended -variable data -xscrollcommand {.fr.xscroll set} -yscrollcommand {.fr.yscroll set}
scrollbar .fr.xscroll -command {.fr.table xview} -orient horizontal
scrollbar .fr.yscroll -command {.fr.table yview}
pack .fr -fill both -expand 1
pack .fr.xscroll -side bottom -fill x
pack .fr.yscroll -side right -fill y
pack .fr.table -side right -fill both -expand 1
Results:
Memory consumption: 3.2 GB
Time until the table is displayed: 15 sec.
Conclusion: Tcl/Tk arrays waste memory, but the performance is very good (the runtime of 15 minutes when using R with tcltk seems to be caused by R to Tcl/Tk communication overhead).
Test setup: Ubuntu 14.04 64 Bit with 16 GB RAM...
Edit 2 (10/01/2015): Adding pure Tcl/Tk benchmark results of ttk::treeview with data in a list
To compare the memory consumption of Tktable to ttk::treeview I wrote this code:
#!/usr/bin/env wish
set rows 2700000
set columns 4
set data {}
set colnames {}
for {set i 0} {$i < $columns} {incr i} {
lappend colnames Title$i
}
for {set row 0} {$row <= $rows} {incr row} {
set newrow {}
for {set column 0} {$column < $columns} {incr column} {
lappend newrow R${row}C${column}
}
lappend data $newrow
}
ttk::treeview .tv -columns $colnames -show headings -yscrollcommand {.sbY set} -xscrollcommand {.sbX set}
foreach Element $data {
.tv insert {} end -values $Element
}
foreach column $colnames {
.tv heading $column -text $column
}
ttk::scrollbar .sbY -command {.tv yview}
ttk::scrollbar .sbX -command {.tv xview} -orient horizontal
pack .sbY -side right -fill y
pack .sbX -side bottom -fill x
pack .tv -side left -fill both
Results:
Memory consumption: 2 GB (thereof data stored as list: 1.2 GB)
Time until the table is displayed: 15 sec.
Compare: 10 million rows consume 7.2 GB of RAM, but selecting a row then takes several seconds (2 - 5) (possible reason: internal list traversal?)
Conclusion:
The treeview is more memory efficient than Tktable since it can use a list instead of an array.
For bigger data sizes (> a few million rows) the row selection is slow (the closer to the end, the slower!)
I have found one possible solution/workaround using Tktable in an "unbound" (command) mode.
With the command option of Tktable you can specify a function that is called each time a cell is about to be displayed on the screen. This avoids "loading" all the data from R to Tcl at once, which improves the "start-up" time and significantly reduces the memory consumption caused by Tcl's way of storing arrays and lists.
With this approach, every time you scroll, a series of function calls is made to fetch the content of the visible cells.
It works for me even with over 10 million rows!
Drawback: Calling an R function that returns a Tcl variable for each cell is still far from being efficient. If you scroll for the first time you can watch the cells being updated. Therefore I am still looking for a bulk data transfer solution between R and Tcl/Tk.
Any suggestions to improve the performance are welcome!
I have implemented a small demo (with 1 million rows and 21 columns consuming 1.2 GB of RAM) and added some buttons to test different features (like caching).
Note: The long start-up time is caused by creating the underlying test data, NOT by Tktable!
library(tcltk)
library(data.table)
# Tktable example with -command ("unbound" mode) ---------------------------
# Doc: http://tktable.sourceforge.net/tktable/doc/tkTable.html
NUM.ROWS <- 1E6
NUM.COLS <- 20
# generate a big data.frame - this will take a while but is required for the demo
dt.data <- data.table(ID = 1:NUM.ROWS)
for (i in 1:NUM.COLS) {
dt.data[, (paste("Col",i)) := paste0("R", 1:NUM.ROWS, " C", i)]
}
# Fill one cell with a long text containing special control characters to test the Tktable behaviour
dt.data[3,3 := "This is a long text with backslash \\ and \"quotes\"!"]
tclRequire("Tktable")
t <- tktoplevel()
tkwm.protocol(t, "WM_DELETE_WINDOW", function() tkdestroy(t))
# Function to return the current row and column as "calculated" value (without an underlying data "model")
calculated.data <- function(C) {
# Function arguments for Tcl "substitutions":
# See: http://tktable.sourceforge.net/tktable/doc/tkTable.html
# %c the column of the triggered cell.
# %C A convenience substitution for %r,%c.
# %i 0 for a read (get) and 1 for a write (set). Otherwise it is the current cursor position in the cell.
# %r the row of the triggered cell.
return(tclVar(C)) # this does work!
}
# Function to return the content of a data.table for the current row and column
data.frame.data <- function(r, c) {
if( r == "0")
return(tclVar(names(dt.data)[as.integer(c)+1])) # First row contains the column names
else
return(tclVar(as.character(dt.data[as.integer(r)+1, as.integer(c)+1, with = FALSE]))) # Other rows are data rows
}
frame <- ttklabelframe(t, text = "Data:")
# Add the table to the window environment to ensure killing it when the window is closed (= no more phantom calls to the data command handler)
# Cache = TRUE: This greatly enhances speed performance when used with -command but uses extra memory.
t$env$table <- tkwidget(frame, "table", rows = NUM.ROWS, cols = NUM.COLS, titlerows = 1, selecttype = "cell", selectmode = "extended", command = calculated.data, cache = TRUE, yscrollcommand = function(...) tkset(scroll.y, ...), xscrollcommand = function(...) tkset(scroll.x, ...))
scroll.x <- ttkscrollbar(frame, orient = "horizontal", command=function(...) tkxview(t$env$table,...)) # command that performs the scrolling
scroll.y <- ttkscrollbar(frame, orient = "vertical", command=function(...) tkyview(t$env$table,...)) # command that performs the scrolling
buttons <- ttkframe(t)
btn.read.only <- ttkbutton(buttons, text = "make read only", command = function() tkconfigure(t$env$table, state = "disabled"))
btn.read.write <- ttkbutton(buttons, text = "make writable", command = function() tkconfigure(t$env$table, state = "normal"))
btn.clear.cache <- ttkbutton(buttons, text = "clear cache", command = function() tcl(t$env$table, "clear", "cache"))
btn.bind.data.frame <- ttkbutton(buttons, text = "Fill cells from R data.table",
command = function() {
tkconfigure(t$env$table, command = data.frame.data, rows = nrow(dt.data), cols = ncol(dt.data), titlerows = 1)
tcl(t$env$table, "clear", "cache")
tkwm.title(t,"Cells are filled from an R data.table")
})
btn.bind.calc.value <- ttkbutton(buttons, text = "Fill cells with calculated values",
command = function() {
tkconfigure(t$env$table, command = calculated.data, rows = 1E5, cols = 40)
tcl(t$env$table, "clear", "cache")
tkwm.title(t,"Cells are calculated values (to test the highest performance possible)")
})
tkgrid(btn.read.only, row = 0, column = 1)
tkgrid(btn.read.write, row = 0, column = 2)
tkgrid(btn.clear.cache, row = 0, column = 3)
tkgrid(btn.bind.data.frame, row = 0, column = 5)
tkgrid(btn.bind.calc.value, row = 0, column = 6)
tkpack(frame, fill = "both", expand = TRUE)
tkpack(scroll.x, fill = "x", expand = FALSE, side = "bottom")
tkpack(scroll.y, fill = "y", expand = FALSE, side = "right")
tkpack(t$env$table, fill = "both", expand = TRUE, side = "left")
tkpack(buttons, side = "bottom")

Pause print of large object when it fills the screen [duplicate]

Say I want to print a large object in R, such as
x <- 1:2e3
When I print x, the R console fills the screen with its elements and, since it doesn't fit all in the screen, it will scroll down. Then I have to scroll back up to see everything that went off screen.
What I would like is to have a command that would print x and stop when the screen fills, requiring the user to do something (like press enter) in order to have another screen full of data displayed. What I have in mind is something similar to MS DOS's dir /p command. Is there such a thing?
As suggested by @baptiste, this solution, page(x, method = 'print'), doesn't really solve my problem. To be more clear, what I would like is a solution that wouldn't involve printing the object in another window, as this would disrupt my workflow. If I didn't care for this, I'd just use View() or something similar.
Here is a quick and dirty more function:
more <- function(expr, lines=20) {
out <- capture.output(expr)
n <- length(out)
i <- 1
# use <= so that the final line is not dropped
while( i <= n ) {
j <- 0
while( j < lines && i <= n ) {
cat(out[i],"\n")
j <- j + 1
i <- i + 1
}
if(i<=n) readline()
}
invisible(out)
}
It will evaluate the expression and print chunks of lines (20 by default, but you can change that). You need to press 'Enter' to move on to the next chunk. Only 'Enter' works: unlike in other pagers, you can't use the space bar or other keys, and there are no options for searching, going back, etc.
You can try it with a command like:
more( rnorm(250) )
Here is an expanded version. It quits if you type 'q' or 'Q' (or anything starting with either) and then press 'Enter'. If you type 'T' and then 'Enter', it prints the final chunk of the output (the last lines rows, where lines is the chunk size argument). If you type a number, it jumps to that decile of the output (e.g. typing 5 jumps to halfway through, 8 to 80% of the way through). Anything else and it will continue.
more <- function(expr, lines=20) {
out <- capture.output(expr)
n <- length(out)
i <- 1
# use <= so that the final line is not dropped
while( i <= n ) {
j <- 0
while( j < lines && i <= n ) {
cat(out[i],"\n")
j <- j + 1
i <- i + 1
}
if(i<=n){
rl <- readline()
if( grepl('^ *q', rl, ignore.case=TRUE) ) i <- n + 1 # quit
if( grepl('^ *t', rl, ignore.case=TRUE) ) i <- max(1, n - lines + 1) # last chunk
if( grepl('^ *[0-9]', rl) ) i <- floor(as.numeric(rl)/10*n) + 1 # jump to decile
}
}
invisible(out)
}
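The chunking that both versions perform can also be looked at separately from the interactive prompt. The following is only an illustrative sketch: paginate is a hypothetical helper name, and it returns the pages as a list instead of printing them, which makes the grouping logic easy to inspect.

```r
# Hypothetical helper: capture the printed output of an expression and
# group the resulting lines into pages of at most `lines` lines each.
paginate <- function(expr, lines = 20) {
  out <- capture.output(expr)
  if (length(out) == 0) return(list())
  unname(split(out, ceiling(seq_along(out) / lines)))
}

pages <- paginate(1:2000, lines = 20)
length(pages)          # number of screens needed (depends on the console width)
cat(pages[[1]], sep = "\n")  # show the first screen
```

Every page except possibly the last holds exactly lines lines, and no line is lost when the output length is not a multiple of the page size.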

RCurl: Display progress meter in Rgui

Using R.exe or Rterm.exe, this gives an excellent progress meter.
page=getURL(url="ftp.wcc.nrcs.usda.gov", noprogress=FALSE)
In Rgui I am limited to:
page=getURL(url="ftp.wcc.nrcs.usda.gov",
noprogress=FALSE, progressfunction=function(down,up) print(down))
which gives a very limited set of download information.
Is there a way to improve this?
I start doubting that with standard R commands it is possible to reprint overwriting the current line, which is what RCurl does in non-GUI mode.
I am glad to tell that I was wrong. At least for a single line, \r can do the trick. In fact:
conc=function(){
cat(" abcd")
cat(" ABCD", '\n')
}
conc()
# abcd ABCD
But:
over=function(){
cat(" abcd")
cat("\r ABCD", "\n")
}
over()
# ABCD
That given, I wrote this progressDown function, which can monitor the download status by always rewriting the same line:
library(RCurl) # Don't forget
### Callback function for curlPerform
progressDown=function(down, up, pcur, width){
total=as.numeric(down[1]) # Total size as passed from curlPerform
cur=as.numeric(down[2]) # Current size as passed from curlPerform
x=cur/total
px= round(100 * x)
## if(!is.nan(x) && px>60) return(pcur) # Just to debug at 60%
if(!is.nan(x) && px!=pcur){
x= round(width * x)
sc=rev(which(total> c(1024^0, 1024^1, 1024^2, 1024^3)))[1]-1
lb=c('B', 'KB', 'MB', 'GB')[sc+1]
cat(paste(c(
"\r |", rep.int(".", x), rep.int(" ", width - x),
sprintf("| %g%s of %g%s %3d%%",round(cur/1024^sc, 2), lb, round(total/1024^sc, 2), lb, px)),
collapse = ""))
flush.console() # if the output is buffered, this sends it to the console immediately
return(px)
}
return(pcur)
}
Now we can use the callback with curlPerform
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
width= getOption("width") - 25 # you can make here your line shorter/longer
pcur=0
ret=curlPerform(url=url, writedata=f@ref, noprogress=FALSE,
progressfunction=function(down,up) pcur<<-progressDown(down, up, pcur, width),
followlocation=TRUE)
close(f)
cat('\n Download', names(ret), '- Ret', ret, '\n') # is success?
}
Running it with a small sample binary:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
the intermediate output at 60% is (no # protection):
|................................. | 133.74KB of 222.75KB 60%
where KB, will be adjusted to B, KB, MB, GB, based on total size.
Final output with success status, is:
|.......................................................| 222.61KB of 222.75KB 100%
Download OK - Ret 0
Note, the output line width is relative to R width option (which controls the maximum number of columns on a line) and can be customised changing the curlProgress line:
width= getOption("width") - 25
This is enough for my needs and solves my own question.
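The carriage-return trick is not specific to RCurl. It can also be used for a plain progress counter in any loop. This is a minimal sketch with a hypothetical helper named show_progress. The Sys.sleep() call only simulates work.

```r
# Hypothetical helper: rewrite the current console line with a counter
show_progress <- function(i, n) {
  msg <- sprintf("\rProcessing %d of %d", i, n)
  cat(msg)
  flush.console()   # push buffered output to the console (relevant in Rgui)
  invisible(msg)
}

n <- 5
for (i in seq_len(n)) {
  Sys.sleep(0.01)              # placeholder for real work
  last <- show_progress(i, n)
}
cat("\n")                      # finish the line once the loop is done
```

Because each message starts with \r and has the same or greater length than the previous one, the old text is fully overwritten. If messages can get shorter, padding them to a fixed width avoids leftover characters.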
Here's a simple example using txtProgressBar. Basically, just do a HEAD request first to get the file size of the file you want to retrieve, then setup a txtProgressBar with that as its max size. Then you use the progressfunction argument to curlPerform to call setTxtProgressBar. It all works very nicely (unless there is no "content-length" header, in which case this code works by just not printing a progress bar).
url <- 'http://stackoverflow.com/questions/21731548/rcurl-display-progress-meter-in-rgui'
h <- basicTextGatherer()
curlPerform(url=url, customrequest='HEAD',
header=1L, nobody=1L, headerfunction=h$update)
if(grepl('Transfer-Encoding: chunked', h$value())) {
size <- 1
} else {
size <- as.numeric(strsplit(strsplit(h$value(),'\r\nContent-Type')[[1]][1],
'Content-Length: ')[[1]][2])
}
bar <- txtProgressBar(0, size)
h2 <- basicTextGatherer()
get <- curlPerform(url=url, noprogress=0L,
writefunction=h2$update,
progressfunction=function(down,up)
setTxtProgressBar(bar, down[2]))
h2$value() # return contents of page
The output is just a bunch of ====== across the console.
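The header parsing above relies on Content-Length appearing directly before Content-Type. A slightly more tolerant way to extract the size could look like the sketch below. parse_content_length is a hypothetical helper, the header strings are made-up examples, and the match is case-insensitive because HTTP header names are not case-sensitive.

```r
# Hypothetical helper: pull the Content-Length value out of a raw header string.
# Returns NA if the header is missing (e.g. for chunked transfers).
parse_content_length <- function(headers) {
  pattern <- "(?i)content-length:\\s*[0-9]+"
  m <- regmatches(headers, regexpr(pattern, headers, perl = TRUE))
  if (length(m) == 0) return(NA_real_)
  as.numeric(sub("(?i)content-length:\\s*", "", m, perl = TRUE))
}

# Made-up example headers
with_length <- "HTTP/1.1 200 OK\r\nContent-Length: 1234\r\nContent-Type: text/html\r\n\r\n"
chunked     <- "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"

parse_content_length(with_length)
parse_content_length(chunked)
```

An NA result can then be used to decide whether to create a progress bar at all.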
What about:
curlProgress=function(url, fname){
f = CFILE(fname, mode="wb")
prev=0
ret=curlPerform(url=url, writedata=f@ref, noprogress=FALSE,
progressfunction=function(a,b){
x=round(100*as.numeric(a[2])/as.numeric(a[1]))
if(!is.nan(x) && x!=prev &&round(x/10)==x/10) prev<<-x else x='.'
cat(x)
}, followlocation=T)
close(f)
cat(' Download', names(ret), '- Ret', ret, '\n')
}
?
It prints dots or percent download divisible by 10 and breaks line on 50%.
And with a small 223 KB file:
curlProgress("http://www.nirsoft.net/utils/websitesniffer-x64.zip", "test.zip")
the output looks like this:
................10...............20................30...............40...............50
..............................70...............80...............90...............100... Download OK - Ret 0
