subtracting posixCT objects and not getting seconds - r

I'm getting weird units after I subtract POSIXct objects, which are returned from two calls to Sys.time(). I'm using Sys.time() to time some call to system()--something like this:
start <- Sys.time()
system("./something_complicated_that_takes_a_while")
end <- Sys.time()
cat(end - start, "seconds\n")
I get 1.81494815872775 seconds, which is very strange. The runtime was closer to 1.8 hours, though. Just to check, I can do this:
start <- Sys.time()
system("/bin/sleep 2")
end <- Sys.time()
cat(end - start, "seconds\n")
and I get 2.002262 seconds, so it's working fine here. Any idea what's going on here?

Your first code is ok , it is 1.8 hours not seconds , here is explanation
a <- Sys.time()
b <- Sys.time() + 2 * 60 *60 # i add 2 hours here
b - a
#> Time difference of 2.000352 hours
above the deference b - a gives the answer in hours not seconds , so if you want to use cat try
cat(b-a , attr(b - a , "units"))
#> 2.000352 hours
and if you want your output in seconds try this
difftime(b , a , units = "secs")
#> Time difference of 7201.266 secs

Related

R paste time difference with unit (sec, min etc)

In R I want to get the timing in a character string keeping the unit (e.g., if it is sec or min). Please see example code below.
T1 <- Sys.time()
T2 <- Sys.time()
duration <- T2-T1
# Looking at duration show unit:
duration
time_description <- paste("it took: ", round(duration, 2), sep="", col="")
# However int time description the unit is removed
time_description
Preferably without using additional packages.
Thanks in advance.
You can use units to extract the unit from difftime object.
time_description <- sprintf('it took %.2f %s', duration, units(duration))
time_description
#[1] "it took 0.39 secs"

How to convert milliseconds into a hh:mm:ss.sss timestamp format in R

Although there seem to be numerous posts concerned with this issue or related issues, I could not find a post providing a solution in R.
The problem should be easy to solve once you know how to do it: I have vectors with milliseconds. I'd like to mutate them into hours, minutes, seconds (with three decimals). For example:
x <- c(29, 300, 1000, 213451)
The expected result is this:
# "00:00:00.029" "00:00:00.300" "00:00:01.000" "00:03:33.451"
What I've tried so far is this convoluted series of operations:
hrs = x / (60 * 60 * 1000)
mins = (hrs %% 1) * 60
secs <- sprintf(((mins %% 1) * 60), fmt = '%#.3f')
paste(trunc(hrs), trunc(mins), secs, sep = ":")
[1] "0:0:0.029" "0:0:0.300" "0:0:1.000" "0:3:33.451"
The result is better than nothing but still at a remove from the expected result and, what's more, the code to get there is anything but straightforward or elegant.
What's a quicker and more elegant way to convert milliseconds into the timestamp format?
EDIT:
Alternatively, what I've tried is this:
library(chron)
times(x / (24 * 60 * 60 * 1000))
But this fails to print the decimals.
You can set "%OSn" to give the seconds truncated to n decimal places, where n is between 0 and 6.
format(as.POSIXct(x / 1000, "UTC", origin = "1970-01-01"), "%H:%M:%OS3")
# [1] "00:00:00.029" "00:00:00.300" "00:00:01.000" "00:03:33.450"

In R, image processing loop takes an order of magnitude longer to process after ~50 iterations

EDIT: changed the image names from Image1-Image11 to Image50-Image60 for clarity.
EDIT2: Solved by adding a garbage collection command after removing the image file in each loop iteration. Code is updated.
I have 400+ jpeg images in a folder. I'm trying write a script to: read each image, identify some text in the image, and then write the file name and that text into a data frame.
When I run the script below, the first ~50 iterations print a time of .1-.3 seconds. Then, for a few iterations, the iteration will take 1-3 seconds. Then, this bumps up to 1-5 minutes, after which I kill the script.
library(dplyr)
library(magick)
fileList3 = list.files(path = filePath)
printJobXRes = data.frame(
jobName = as.character(),
xRes = as.numeric(),
stringsAsFactors = FALSE
)
i = 0
for (fileName in fileList3){
img = paste0(filePath, '/', fileName, '_TestImage.jpg')
start_time = Sys.time()
temp.xRes = image_read(img, strip = T) %>%
image_rotate(270) %>%
image_crop('90x150+1750') %>%
image_negate %>%
image_convert(type = 'Bilevel') %>%
image_ocr %>%
as.numeric
stop_time = Sys.time()
i = i+1
print(paste(fileName,'first attempt, item #', i))
print(stop_time-start_time)
temp.df3 = data.frame(
jobName = fileName,
xRes = temp.xRes,
stringsAsFactors = FALSE
)
printJobXRes = rbind(printJobXRes, temp.df3)
rm(temp.xRes)
rm(temp.df3)
rm(img)
gc() #This solved the issue
}
Here's a couple lines of the output:
#Images 1-49 process in .1-.3 seconds each
[1] "Image50.jpg first attempt, item # 50"
Time difference of 0.2320111 secs
[1] "Image51.jpg first attempt, item # 51"
Time difference of 0.213742 secs
[1] "Image52.jpg first attempt, item # 52"
Time difference of 0.2536581 secs
[1] "Image53.jpg first attempt, item # 53"
Time difference of 1.253844 secs
[1] "Image54.jpg first attempt, item # 54"
Time difference of 1.149764 secs
[1] "Image55.jpg first attempt, item # 55"
Time difference of 1.171134 secs
[1] "Image56.jpg first attempt, item # 56"
Time difference of 1.397093 secs
[1] "Image57.jpg first attempt, item # 57"
Time difference of 1.201915 secs
[1] "Image58.jpg first attempt, item # 58"
Time difference of 1.455768 secs
[1] "Image59.jpg first attempt, item # 59"
Time difference of 1.618744 secs
[1] "Image60.jpg first attempt, item # 60"
Time difference of 4.527751 mins
Can anyone offer suggestions as to why the loop doesn't continue to take ~.1-.3 seconds? All jpgs are roughly the same size, resolution, and all generated from the same source.
I was able to solve my issue based on Mark's suggestion. I was removing the image file from memory in each loop iteration, but the freed up memory was never realized by R. I added a garbage collection command (gc()) into the loop to fix this issue, and the loop then ran as expected.

Unexpected results in benchmark of read.csv / fread [duplicate]

I can run a piece of code for 5 or 10 seconds using the following code:
period <- 10 ## minimum time (in seconds) that the loop should run for
tm <- Sys.time() ## starting data & time
while(Sys.time() - tm < period) print(Sys.time())
The code runs just fine for 5 or 10 seconds. But when I replace the period value by 60 for it to run for a minute, the code never stops. What is wrong?
As soon as elapsed time exceeds 1 minute, the default unit changes from seconds to minutes. So you want to control the unit:
while (difftime(Sys.time(), tm, units = "secs")[[1]] < period)
From ?difftime
If ‘units = "auto"’, a suitable set of units is chosen, the
largest possible (excluding ‘"weeks"’) in which all the absolute
differences are greater than one.
Subtraction of date-time objects gives an object of this class, by
calling ‘difftime’ with ‘units = "auto"’.
Alternatively use proc.time, which measures various times ("user", "system", "elapsed") since you started your R session in seconds. We want "elapsed" time, i.e., the wall clock time, so we retrieve the 3rd value of proc.time().
period <- 10
tm <- proc.time()[[3]]
while (proc.time()[[3]] - tm < period) print(proc.time())
If you are confused by the use of [[1]] and [[3]], please consult:
How do I extract just the number from a named number (without the name)?
How to get a matrix element without the column name in R?
Let me add some user-friendly reproducible examples. Your original code with print inside a loop is quite annoying as it prints thousands of lines onto the screen. I would use Sys.sleep.
test.Sys.time <- function(sleep_time_in_secs) {
t1 <- Sys.time()
Sys.sleep(sleep_time_in_secs)
t2 <- Sys.time()
## units = "auto"
print(t2 - t1)
## units = "secs"
print(difftime(t2, t1, units = "secs"))
## use '[[1]]' for clean output
print(difftime(t2, t1, units = "secs")[[1]])
}
test.Sys.time(5)
#Time difference of 5.005247 secs
#Time difference of 5.005247 secs
#[1] 5.005247
test.Sys.time(65)
#Time difference of 1.084357 mins
#Time difference of 65.06141 secs
#[1] 65.06141
The "auto" units is very clever. If sleep_time_in_secs = 3605 (more than an hour), the default unit will change to "hours".
Be careful with time units when using Sys.time, or you may be fooled in a benchmarking. Here is a perfect example: Unexpected results in benchmark of read.csv / fread. I had answered it with a now removed comment:
You got a problem with time units. I see that fread is more than 20 times faster. If fread takes 4 seconds to read a file, read.csv takes 80 seconds = 1.33 minutes. Ignoring the units, read.csv is "faster".
Now let's test proc.time.
test.proc.time <- function(sleep_time_in_secs) {
t1 <- proc.time()
Sys.sleep(sleep_time_in_secs)
t2 <- proc.time()
## print user, system, elapsed time
print(t2 - t1)
## use '[[3]]' for clean output of elapsed time
print((t2 - t1)[[3]])
}
test.proc.time(5)
# user system elapsed
# 0.000 0.000 5.005
#[1] 5.005
test.proc.time(65)
# user system elapsed
# 0.000 0.000 65.057
#[1] 65.057
"user" time and "system" time are 0, because both CPU and the system kernel are idle.

Timing R code with Sys.time()

I can run a piece of code for 5 or 10 seconds using the following code:
period <- 10 ## minimum time (in seconds) that the loop should run for
tm <- Sys.time() ## starting data & time
while(Sys.time() - tm < period) print(Sys.time())
The code runs just fine for 5 or 10 seconds. But when I replace the period value by 60 for it to run for a minute, the code never stops. What is wrong?
As soon as elapsed time exceeds 1 minute, the default unit changes from seconds to minutes. So you want to control the unit:
while (difftime(Sys.time(), tm, units = "secs")[[1]] < period)
From ?difftime
If ‘units = "auto"’, a suitable set of units is chosen, the
largest possible (excluding ‘"weeks"’) in which all the absolute
differences are greater than one.
Subtraction of date-time objects gives an object of this class, by
calling ‘difftime’ with ‘units = "auto"’.
Alternatively use proc.time, which measures various times ("user", "system", "elapsed") since you started your R session in seconds. We want "elapsed" time, i.e., the wall clock time, so we retrieve the 3rd value of proc.time().
period <- 10
tm <- proc.time()[[3]]
while (proc.time()[[3]] - tm < period) print(proc.time())
If you are confused by the use of [[1]] and [[3]], please consult:
How do I extract just the number from a named number (without the name)?
How to get a matrix element without the column name in R?
Let me add some user-friendly reproducible examples. Your original code with print inside a loop is quite annoying as it prints thousands of lines onto the screen. I would use Sys.sleep.
test.Sys.time <- function(sleep_time_in_secs) {
t1 <- Sys.time()
Sys.sleep(sleep_time_in_secs)
t2 <- Sys.time()
## units = "auto"
print(t2 - t1)
## units = "secs"
print(difftime(t2, t1, units = "secs"))
## use '[[1]]' for clean output
print(difftime(t2, t1, units = "secs")[[1]])
}
test.Sys.time(5)
#Time difference of 5.005247 secs
#Time difference of 5.005247 secs
#[1] 5.005247
test.Sys.time(65)
#Time difference of 1.084357 mins
#Time difference of 65.06141 secs
#[1] 65.06141
The "auto" units is very clever. If sleep_time_in_secs = 3605 (more than an hour), the default unit will change to "hours".
Be careful with time units when using Sys.time, or you may be fooled in a benchmarking. Here is a perfect example: Unexpected results in benchmark of read.csv / fread. I had answered it with a now removed comment:
You got a problem with time units. I see that fread is more than 20 times faster. If fread takes 4 seconds to read a file, read.csv takes 80 seconds = 1.33 minutes. Ignoring the units, read.csv is "faster".
Now let's test proc.time.
test.proc.time <- function(sleep_time_in_secs) {
t1 <- proc.time()
Sys.sleep(sleep_time_in_secs)
t2 <- proc.time()
## print user, system, elapsed time
print(t2 - t1)
## use '[[3]]' for clean output of elapsed time
print((t2 - t1)[[3]])
}
test.proc.time(5)
# user system elapsed
# 0.000 0.000 5.005
#[1] 5.005
test.proc.time(65)
# user system elapsed
# 0.000 0.000 65.057
#[1] 65.057
"user" time and "system" time are 0, because both CPU and the system kernel are idle.

Resources