Plotting during a loop in RStudio - r

I am implementing a solution to the Traveling Salesman Problem (TSP) in R (simulated annealing) and I want to output the current best path periodically. I have searched quite a bit for how to output plots during a for loop and have so far failed.
I use RStudio, and want to see the graphs as they are generated. If you have ever watched TSP solvers do their thing, you will understand how cool it is to watch. Here is a sample of the graphics output I want to see http://www.staff.science.uu.nl/~beuke106/anneal/anneal.html
I don't think that the memory usage will be a problem (during about 500,000 iterations, I am only expecting 50-100 plots). Here is a sample function, where we would expect to see 10 different plots during the time the function runs:
Plotz <- function(iter = 1000000, interval = 100000) {
x <- 1:10
for(i in 1:iter){
y <- runif(10)
if(i %% interval == 0) {
plot(x, y)
}
}
return(c(x, y))
}
Plotz()
When I run this, all I see is the final plot produced (in RStudio). How can I see the plots as they're generated?
Also: I am on Ubuntu (whatever the newest stable release is). Don't know if that is relevant.
Thank you everyone in advance.
EDIT: Per Captain Murphy's suggestion, I tried running this in the Linux terminal, and the graphics appeared. I think the question of "How to do this in RStudio?" is still relevant, however. It's such a good program, so maybe someone has an idea of what could be done to get this to work?
EDIT2: As Thilo stated, this is a known bug in RStudio. If anyone has any other ideas to solve this without the software itself being fixed, then there is still something to discuss. Otherwise, consider this question solved.

Calling Sys.sleep(0) should cause the plot to draw. Unlike the X11 solution, this will work on server versions of RStudio as well.
(I was surprised that dev.flush() did not give the result you were hoping for; that might be a bug.)
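As a minimal sketch of where the call would go, here is the question's function with Sys.sleep(0) added right after each plot. The name plotz_flush is hypothetical (it is just the original Plotz with one extra line), and whether the plot actually appears mid-loop still depends on the RStudio version in use.

```r
# Hypothetical variant of the question's Plotz(): same loop, plus Sys.sleep(0)
plotz_flush <- function(iter = 1000000, interval = 100000) {
  x <- 1:10
  y <- NULL
  for (i in seq_len(iter)) {
    y <- runif(10)
    if (i %% interval == 0) {
      plot(x, y, main = paste("Iteration", i))
      Sys.sleep(0)  # gives the graphics device a chance to draw before the loop continues
    }
  }
  invisible(c(x, y))
}
```

Calling plotz_flush() with the defaults should then produce ten plots over the course of the run rather than only the last one.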

One thing you can do is open an X11 window and plot in there:
x11()
Plotz()
That should work the same as running it in the terminal.

Following up on @JoeCheng's answer and @RGuy's comment on that answer: as I worked out with the RStudio folks, the problem seems to arise primarily when there is too much plotting going on in too short a timespan. The solution is twofold:
Sys.sleep(0) helps force an update to the plotting window.
Plotting updates every Wth loop rather than every loop.
For instance, on my computer (i7, RStudio Server), the following code does not update until the loop completes:
N <- 1000
x <- rep(NA,N)
plot(c(0,1)~c(0,N), col=NA)
for(i in seq(N)) {
Sys.sleep(.01)
x[i] <- runif(1)
iseq <- seq(i-99,i)
points( x[i]~i )
Sys.sleep(0)
}
The following code updates in real-time, despite having the same number of points to be plotted:
N <- 1000
x <- rep(NA,N)
plot(c(0,1)~c(0,N), col=NA)
for(i in seq(N)) {
Sys.sleep(.01)
x[i] <- runif(1)
iseq <- seq(i-99,i)
if(i%%100==0) {
points( x[iseq]~iseq )
Sys.sleep(0)
}
}
In other words, it's the number of calls to the plotting functions that seems to matter, not the amount of data to be plotted.

If you want to save the plots as well you could just open a new device in the loop and close it afterwards.
Plotz <- function(iter = 1000, interval = 100) {
x <- 1:10
p <- 0 #plot number
for(i in 1:iter){
y <- runif(10)
if(i %% interval == 0) {
png(file=paste(i,"png",sep="."))
p <- p + 1; plot(x, y)
dev.off()
}
}
return(c(x, y))
}

Plotz <- function(iter = 1000, interval = 100) {
x <- 1:10
p <- 0 #plot number
for(i in 1:iter){
y <- runif(10)
if(i %% interval == 0) {
p <- p + 1; plot(x, y)
readline("Please press the Enter key to see the next plot if there is one.")
}
}
return(c(x, y))
}
Plotz()

You can also use the back arrows on the plots tab of the lower left pane of the RStudio interface in order to view the plots.

You can use the animate package to layer your plots into a GIF.

Related

Simple function animation

I want to generate a short and easy animation to illustrate how the function x^a changes for x between 0 and 1 and a increasing from 0 to 1. I run this code:
p1 <- seq(0, 1, 0.001)
alpha <- seq(0, 1, 0.05)
n <- length(alpha)
for (i in 1:n) {
p2 <- p1^(alpha[i])
p <- plot(p1, p2, ylim = c(0,1.1))
}
The problem is that R waits until the whole loop is done and then just displays all plots, so I have to step through the plots myself. What I want instead is for the newest plot to replace the old one, which would give an animation. I tried using print() somehow, but it did not work. Is there any way to do an animation that way?
I know that there are animation packages, but they all seem more complicated than my way. However, if you think it would be better to use one of these, please tell.
Thanks for the help.
Adding Sys.sleep() did exactly what I wanted to achieve.
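For reference, a minimal sketch of the loop with the pause added. The wrapper name animate_power and the 0.1 second default pause are assumptions for illustration; the comment above does not say which value was used.

```r
# Hypothetical wrapper around the question's loop, pausing after each frame
animate_power <- function(alpha = seq(0, 1, 0.05), pause = 0.1) {
  p1 <- seq(0, 1, 0.001)
  for (a in alpha) {
    # Each call to plot() replaces the previous frame on the active device
    plot(p1, p1^a, type = "l", ylim = c(0, 1.1), main = paste("alpha =", a))
    Sys.sleep(pause)  # pause so the frame is drawn and stays visible
  }
  invisible(length(alpha))  # number of frames drawn
}
```

Running animate_power() in an interactive session should show the curve changing frame by frame; a larger pause slows the animation down.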

How to represent the efficiency of a function in a plot

In school, we are learning to use R and we had to find an algorithm to calculate the order of a permutation in different ways. So I came up with 4 different algorithms that can be compared. But now, I'd like to be able to display the time that each function takes depending on the size of the data that we give it.
So first, I wanted to display the time taken by at least one of the functions (I called it calculOrdrePermutation) without changing the size of the data.
So this is what I did:
createProcessTest <- function(func, variables, numberOfTests) {
outputProgress <- T
ptm <- proc.time()
times <- c()
for(i in 1:numberOfTests) {
func(variables)
times <- append(times, (proc.time() - ptm)[3])
if(outputProgress & i %% 5 == 0) {
print(paste((i/numberOfTests) * 100, "%"))
}
}
return(times)
}
sampleSize <- 100
nbOperations <- 100
extrait <- sample(1:sampleSize, sampleSize)
matriceDePermutation <- trouverMatriceDePermutation(extrait)
tempsRapideMatrice = createProcessTest(calculOrdrePermutation, matriceDePermutation, nbOperations)
plot(y=tempsRapideMatrice, x=1:nbOperations, cex=0.1, type="l", main="Using matrix", sub="sans boucle", ylab="Time (s)", xlab="Number of iterations")
It looks approximately like this
So that's not that bad, I'm able to display a plot that represents the time for this function. But it is linear, of course, so there's not much there that interests us...
So I started to create a function that runs the process while gradually changing the sampleSize:
doFullTest <- function(func, useMatrix, numberOfTestsPerN, maxN) {
temps <- c()
for(sampleSize in seq(from=1, to=maxN, by=1)) {
permut <- sample(1:sampleSize, sampleSize)
if(useMatrix) {
permut <- trouverMatriceDePermutation(extrait)
}
temps <- append(temps, mean(createProcessTest(func, permut, numberOfTestsPerN)))
}
return(temps)
}
And so I can use it this way:
plot(x=1:100, y=doFullTest(calculOrdrePermutation, T, 5, 100), type="h")
Time used depending on the size of the data, from N=1 to N=100
So what I asked for is to run the function 5 times per data size and take the mean, and then repeat with an increased size. But as you can see, the result cannot be studied; I had hoped for a linear histogram (because my algorithm has a complexity of O(n)).
Is there a problem in my code? Am I doing it totally wrong?
I'm pretty sure I'm not that far from my goal, but the result is quite upsetting...
Thank you for your help!

Using R and Sensor Accelerometer Data to Detect a Jump

I'm fascinated by sensor data. I used my iPhone and an app called SensorLog to capture
accelerometer data while I stand and push my legs to jump.
My goal is to use R to create a model which can identify jumps and how long I'm in the air.
I'm unsure how to proceed in such a challenge. I have a timeseries with accelerometer data.
https://drive.google.com/file/d/0ByWxsCBUWbqRcGlLVTVnTnZIVVk/view?usp=sharing
Some questions:
How can a jump be detected in timeseries data?
How to identify the air time part?
How to train such a model?
Below is the R code used to create the graphs above, which is me standing and doing a simple jump.
Thanks!
# Training set
sample <- read.csv("sample-data.csv")
# Sum gravity
sample$total_gravity <- sqrt(sample$accelerometerAccelerationX^2+sample$accelerometerAccelerationY^2+sample$accelerometerAccelerationZ^2)
# Smooth our total gravity to remove noise
f <- rep(1/4,4)
sample$total_gravity_smooth <- filter(sample$total_gravity, f, sides=2)
# Removes rows with NA from smoothing
sample<-sample[!is.na(sample$total_gravity_smooth),]
#sample$test<-rollmaxr(sample$total_gravity_smooth, 10, fill = NA, align = "right")
# Plot gravity
plot(sample$total_gravity, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(sample$total_gravity_smooth, col="red")
stdevs <- mean(sample$total_gravity_smooth)+c(-2,-1,+1,+2)*sd(sample$total_gravity_smooth)
abline(h=stdevs)
This is probably a less-than-perfect solution, but it might be enough to get you started. The first part relies on a small modification of the find_peaks function from the gazetools package.
find_maxima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == max(r))])
}
}
peaks
}
find_minima <- function(x, threshold)
{
ranges <- find_peak_ranges(x, threshold)
peaks <- NULL
if (!is.null(ranges)) {
for (i in 1:nrow(ranges)) {
rnge <- ranges[i, 1]:ranges[i, 2]
r <- x[rnge]
peaks <- c(peaks, rnge[which(r == min(r))])
}
}
peaks
}
In order to get the find_maxima and find_minima functions to give us what we're looking for we are going to need to smooth the total_gravity data even further:
spline <- smooth.spline(sample$loggingSample, y = sample$total_gravity, df = 30)
Note: I 'zeroed out' total gravity (sample$total_gravity <- sample$total_gravity - 1)
Next, pull out the smoothed x and y values:
out <- as.data.frame(cbind(spline$x,spline$y))
Then find our local maxima and minima
max <- find_maxima(out$y, threshold = 0.4)
min <- find_minima(out$y, threshold = -0.4)
And then plot the data to make sure everything looks legit:
plot(out$y, type="l", col=grey(.2), xlab="Series", ylab="Gravity", main="Accelerometer Gravitational Force")
lines(out$y, col="red")
stdevs <- mean(out$y)+c(-2,-1,+1,+2)*sd(out$y)
abline(h=stdevs)
abline(v=max[1], col = 'green')
abline(v=max[2], col = 'green')
abline(v=min[1], col = 'blue')
And finally, we can see how long you were off the ground.
print(hangtime <- min[1] - max[1])
[1] 20
You can reduce your thresholds to get additional datapoints (changes in acceleration).
Hope this helps!
I would consider a few things:
Smooth the data by collecting median values every 100ms - accelerometer data on iPhones is not perfectly accurate, so this approach will help.
Identify turning points as @scribbles suggests.
There is code available in my github repository that could be modified to help with both of these issues. A PDF with some explanation is here: https://github.com/MonteShaffer/mPowerEI/blob/master/mPowerEI/example/challenge-1a.pdf
Specifically, take a look at:
library(devtools);
install_github("MonteShaffer/mPowerEI", subdir="mPowerEI");
library(mPowerEI);
# data smoothing
?scaleToTimeIncrement
# turning points
?pastecs::turnpoints
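The two suggestions above (median smoothing in 100 ms windows, then locating turning points) could be sketched in base R roughly as follows. This is a minimal sketch and not the code from the mPowerEI repository: median_by_window and turning_points are hypothetical helper names, and it assumes a time vector in seconds, which the question's data may or may not provide in that form.

```r
# Hypothetical helper: median of `value` within consecutive windows of `width` seconds
median_by_window <- function(time, value, width = 0.1) {
  bin <- floor((time - min(time)) / width)  # window index for each sample
  as.numeric(tapply(value, bin, median))
}

# Hypothetical helper: indices where the series switches between rising and falling
# (flat stretches produce extra indices, so smooth the data first)
turning_points <- function(y) {
  s <- sign(diff(y))
  which(diff(s) != 0) + 1
}

# Toy data standing in for the accelerometer series (assumed 100 Hz sampling)
time <- seq(0, 0.99, by = 0.01)
value <- sin(2 * pi * 2 * time)
smoothed <- median_by_window(time, value, width = 0.1)
tp <- turning_points(c(0, 1, 2, 1, 0, 1))
```

In the toy call at the end, tp marks the local maximum and the local minimum of the short vector; on real data the same function would be applied to the smoothed series.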

Levy Walk simulation in R

I am trying to generate a series of numbers to simulate a Levy Walk in R. Currently I am using the following code:
alpha=2
n=1000
x=rep(0,n)
y=rep(0,n)
for (i in 2:n){
theta=runif(1)*2*pi
f=runif(1)^(-1/alpha)
x[i]=x[i-1]+f*cos(theta)
y[i]=y[i-1]+f*sin(theta)
}
The code is working as expected and I am able to generate the numbers according to my requirements. The figure below shows one such Levy Walk:
The following histogram confirms that the numbers generated (i.e. f) actually belong to a power law:
My question is as follows:
The step lengths generated (i.e. f) are quite large. How can I modify the code so that the step lengths only fall within some bound [fmin, fmax]?
P.S. I have intentionally not vectorized the code.
Try using this:
f=runif(1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
Note that you need 0 < fmin < fmax.
BTW, you can vectorize your code like this:
theta <- runif(n-1)*2*pi
f <- runif(n-1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
x <- c(0, cumsum(f*cos(theta)))
y <- c(0, cumsum(f*sin(theta)))
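Why this works: the uniform draw u lies between fmax^(-alpha) and fmin^(-alpha), and u^(-1/alpha) is decreasing in u, so the result always lies between fmin and fmax. A minimal sketch to check this empirically; the helper name rbounded_step and the parameter values are assumptions for illustration only.

```r
# Hypothetical helper wrapping the formula above
rbounded_step <- function(n, alpha, fmin, fmax) {
  stopifnot(fmin > 0, fmin < fmax)
  # u is uniform on [fmax^(-alpha), fmin^(-alpha)], so u^(-1/alpha) is in [fmin, fmax]
  runif(n, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
}

set.seed(1)
f <- rbounded_step(10000, alpha = 2, fmin = 1, fmax = 50)
range(f)  # both ends should fall inside [1, 50]
```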
Just for precision, what you're simulating here is a Lévy flight. For it to be a Lévy walk, you should allow the particle to "walk" from the beginning to the end of each flight (with a for loop, for example). If you plot your resulting simulation with plot(x, y, type = "o") you will see that there are no positions within flights (no walking) using your code.
library(ggplot2)
alpha= 5
n= 1000
fmin= 1
fmax= n
# All n-1 steps are drawn at once, so no for loop is needed here
theta= runif(n-1)*2*pi
f= runif(n-1, fmax^(-alpha), fmin^(-alpha))^(-1/alpha)
x= c(0, cumsum(f*cos(theta)))
y= c(0, cumsum(f*sin(theta)))
ggplot(data.frame(x=x, y=y), aes(x, y))+geom_point()+geom_path()

Automating great-circle map production in R

I've taken some of the things I learned in a Flowing Data great circle mapping tutorial and combined them with code linked in the comments to prevent weird things from happening when R plots trans-equatorial great circles. That gives me this:
airports <- read.csv("/home/geoff/Desktop/DissertationData/airports.csv", header=TRUE)
flights <- read.csv("/home/geoff/Desktop/DissertationData/ATL.csv", header=TRUE, as.is=TRUE)
library(maps)
library(geosphere)
checkDateLine <- function(l){
n<-0
k<-length(l)
k<-k-1
for (j in 1:k){
n[j] <- l[j+1] - l[j]
}
n <- abs(n)
m<-max(n, na.rm=TRUE)
ifelse(m > 30, TRUE, FALSE)
}
clean.Inter <- function(p1, p2, n, addStartEnd){
inter <- gcIntermediate(p1, p2, n=n, addStartEnd=addStartEnd)
if (checkDateLine(inter[,1])){
m1 <- midPoint(p1, p2)
m1[,1] <- (m1[,1]+180)%%360 - 180
a1 <- antipode(m1)
l1 <- gcIntermediate(p1, a1, n=n, addStartEnd=addStartEnd)
l2 <- gcIntermediate(a1, p2, n=n, addStartEnd=addStartEnd)
l3 <- rbind(l1, l2)
l3
}
else{
inter
}
}
# Unique months
monthyear <- unique(flights$month)
# Color
pal <- colorRampPalette(c("#FFEA00", "#FF0043"))
colors <- pal(100)
for (i in 1:length(monthyear)) {
png(paste("monthyear", monthyear[i], ".png", sep=""), width=750, height=500)
map("world", col="#191919", fill=TRUE, bg="black", lwd=0.05)
fsub <- flights[flights$month == monthyear[i],]
fsub <- fsub[order(fsub$cnt),]
maxcnt <- max(fsub$cnt)
for (j in 1:length(fsub$month)) {
air1 <- airports[airports$iata == fsub[j,]$airport1,]
air2 <- airports[airports$iata == fsub[j,]$airport2,]
p1 <- c(air1[1,]$long, air1[1,]$lat)
p2 <- c(air2[1,]$long, air2[1,]$lat)
inter <- clean.Inter(p1,p2,n=100, addStartEnd=TRUE)
colindex <- round( (fsub[j,]$cnt / maxcnt) * length(colors) )
lines(inter, col=colors[colindex], lwd=1.0)
}
dev.off()
}
I'd like to automate the production of maps for a large dataset containing all scheduled commercial routes — dummy sample — shared between ATL and other airports in the global network (airports.csv is linked to in the Flowing Data post). Preferably, I'd produce one map per month that I would use as frame in a short video depicting changes in the Atlanta airport network space.
The problem: I can't get the loop to produce any more than one PNG—from only the first unique month in each CSV—each time I run it. I'm fairly certain Aaron Hardin's code 'breaks' the automation as it is used in the Flowing Data tutorial. After three days of messing with it and chasing down any relevant R how-to's, I realize I simply lack the chops to reconcile one with the other. Can anybody help me automate the process?
There's a dissertation acknowledgement in it for you!
Too much information for a comment, so I post an answer instead. Here is what I think (and read to the end to see what could potentially be the problem):
I have tried to run your code on the original data in the Flowing Data tutorial. (Obviously you have to add a column for monthly data, so I simply added a line to randomise the month.)
airports <- read.csv("http://datasets.flowingdata.com/tuts/maparcs/airports.csv",
header=TRUE)
flights <- read.csv("http://datasets.flowingdata.com/tuts/maparcs/flights.csv",
header=TRUE, as.is=TRUE)
# Add column with random data for month
flights$month <- sample(month.abb[1:4], nrow(flights), replace=TRUE)
Whenever I have a loop that takes a long time to run, I generally stick a bit of code in there that gives me a progress check. Use what takes your fancy: print, cat, tcltk::tkProgressBar. I use message:
for (i in 1:length(monthyear)) {
message(i)
#
# your code here
#
}
Anyway, I then ran your code. Everything works exactly as it should. Since I sampled four months worth of data, I get:
The message with the current iteration of i prints four times
Four png plots, each with a dark world map and bright yellow lines. Here is one of the four plots:
So, why does it work on my machine and not yours?
I can only guess, but my guess is that you haven't set the working directory. There is no setwd in your code, and the call to png just gives the filename. I suspect your plots are being written to whatever the working directory is on your system.
By default, on my installation, the working directory is:
getwd()
[1] "C:/Program Files/eclipse 3.7"
To solve this, do one of the following:
Use setwd() to set your working directory at the top of your script.
Or use the full path and file name in your call to png()
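A minimal sketch of the second option, building the full path with file.path(). The function name save_monthly_plots, the output folder, and the placeholder plot are assumptions for illustration; in your script the map() and lines() calls would go where the placeholder plot is.

```r
# Hypothetical helper: write one PNG per month into an explicit output folder
save_monthly_plots <- function(months, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, paste0("monthyear", months, ".png"))
  for (k in seq_along(months)) {
    png(paths[k], width = 750, height = 500)
    plot(1:10, main = months[k])  # placeholder for the map-drawing code
    dev.off()
  }
  paths  # full paths, so there is no doubt about where the files went
}

paths <- save_monthly_plots(c("Jan", "Feb"), file.path(tempdir(), "maps"))
```

Printing paths (or calling getwd() when only a bare filename is used) shows exactly where to look for the output.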
