Progressive point to point walk plotting in R

How can I plot a progressive walk from point to point?
Let's have p1 = [1,0], p2 = [0,1], p3 = [1,1]. The plot should first draw a line from p1 to p2 showing the direction, wait for a second, then draw another line from p2 to p3, and so on if there is more data.
The plot area should first be fixed to (0,1)^2. The correct output should look similar to this image:
Example plot
My code is this for now:
plot(x,y,xlim=range(x), ylim=range(y), xlab="x", ylab="y", main="Filled Plane",pch=16)
#lines(x,y,xlim=range(x),ylim=(y),pch=16)
for(i in 1:20){
arrows(x[i],y[i],x[i+1],y[i+1],length = 0.25, angle = 30, col = 1:3)
}

One option is to use arrows. First you need to create a plot from the data you want. Then you can draw arrows to connect your points.
Let's say you have random uniform vectors x and y. Set limit to decide how many points you want to plot.
Note that I placed all the points immediately (I could not place the grid properly otherwise). Hope it helps.
limit<- 50
x <- runif(limit)
y <- runif(limit)
plot(x,y, xlim=c(0,1), ylim=c(0,1),
xlab="x", ylab="y", main = "Random Walk")
grid(nx = 10, ny = 10, col = "lightgray", lty = "dotted",
lwd = par("lwd"), equilogs = TRUE)
# stop one step early: the last point has no successor to draw an arrow to
for(i in 1:(limit-1)){
arrows(x[i],y[i],x[i+1],y[i+1], length = 0.1, angle = 20)
Sys.sleep(0.5) # pause so each arrow appears on its own
}
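For the three points from the question, the same idea can be sketched as below. This is only a minimal sketch: the helper name draw_walk and its pause argument are made up for illustration, and whether each arrow appears immediately depends on the graphics device (some buffered devices may only refresh at the end).

```r
# Hypothetical helper (not from the original answer): draws a walk one
# arrow at a time inside a fixed unit square.
draw_walk <- function(x, y, pause = 1) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  # Fix the plot area first, without drawing any data
  plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1),
       xlab = "x", ylab = "y", main = "Progressive walk")
  points(x[1], y[1], pch = 16)
  n_drawn <- 0
  for (i in seq_len(length(x) - 1)) {
    arrows(x[i], y[i], x[i + 1], y[i + 1], length = 0.1, angle = 20)
    points(x[i + 1], y[i + 1], pch = 16)
    n_drawn <- n_drawn + 1
    Sys.sleep(pause)  # wait before drawing the next segment
  }
  invisible(n_drawn)
}

# p1 = [1,0], p2 = [0,1], p3 = [1,1], one point per row
p <- rbind(c(1, 0), c(0, 1), c(1, 1))
n_segments <- draw_walk(p[, 1], p[, 2], pause = 0.5)
```

Starting from an empty plot (type = "n") means the points show up together with their arrows instead of all at once.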

Related

How does persp define ticks? In particular, how does persp decide how many ticks to draw and which tick spacing to use?

Hi there,
I use persp for a 3D plot and I am trying to find out how persp defines the ticks when the parameter ticktype="detailed" is set.
I want to draw lines onto the box around a surface at the tick positions. So far, I first draw the surface without any labels or axes and add all lines and axes afterwards. To make clear what I have done, here is some example code:
z <- matrix(rep(1:10, each=10), nrow=10, ncol=10)
x.axis <- 1:nrow(z)
y.axis <- 1:ncol(z)
max.y <- max(y.axis)
# Drawing the surface without the axes and no lines on the surface
pmat <- persp(z = z, x = x.axis, y = y.axis ,
lphi = 100, phi = 25, theta = -30,
axes=F,
border = NA, # no lines on the surface
col="deepskyblue",
expand = 0.5,
shade = 0.65)
Now I add the lines on the surface in a different colour, and the axes with ticks and labels:
par(new=T)
pmat <- persp(z = z, x = x.axis, y = y.axis ,
lphi = 100, phi = 25, theta = -30,
ticktype = "detailed",
expand = 0.5,
cex.lab=0.75,
col=NA,
border="grey80")
par(new=F)
To get lines on the box around the surface I use the following:
for (z_high in c(2,4,6,8)) {
lines(trans3d(x.axis, max.y, z_high, pmat) , col="black", lty=3)
}
As you can see, I use a hand-defined vector c(2,4,6,8), which holds the z-values for the box lines at the back. If the input surface changes, I have to adapt this vector by hand. Is there a way to get the ticks for all axes of the persp plot? Does anyone know how persp defines the ticks?
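One way to avoid the hand-written vector is sketched below. It rests on an assumption that is not confirmed anywhere in this thread: that the "detailed" tick positions are close to what pretty() returns for the axis range with n equal to persp's nticks argument (default 5), restricted to the axis limits. The helper name guess_ticks is made up; compare its output against the drawn axis labels before relying on it.

```r
z <- matrix(rep(1:10, each = 10), nrow = 10, ncol = 10)
x.axis <- 1:nrow(z)
y.axis <- 1:ncol(z)

pmat <- persp(x = x.axis, y = y.axis, z = z,
              phi = 25, theta = -30, expand = 0.5,
              ticktype = "detailed", nticks = 5,
              col = "deepskyblue", border = "grey80")

# Hypothetical helper: ASSUMES the ticks resemble pretty() on the data range.
guess_ticks <- function(v, nticks = 5) {
  ticks <- pretty(range(v), n = nticks)
  ticks[ticks >= min(v) & ticks <= max(v)]  # keep only ticks inside the axis
}

z_ticks <- guess_ticks(z)
# Dotted lines on the back wall of the box at each guessed z tick
for (z_high in z_ticks) {
  lines(trans3d(x.axis, max(y.axis), z_high, pmat), col = "black", lty = 3)
}
```

If the surface changes, z_ticks is recomputed from the new data, so the lines no longer need to be adapted by hand.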

R: Colour points on a map based on their value and add legend

I have a dataset containing longitude, latitude and a value column showing humidity, about 300 rows in length. Each point shows the humidity for a different location. I would like to plot all of them on a map and colour them according to their value (such as in gradient colours) and add a legend. It is a bit similar to the question here, but I can't get it to work. The code is basically there but only the colouring and displaying it properly in a legend does not really work. The points represent a line in Africa and the humidity values have been originally extracted from a raster dataset and they contain several digits. I created some sample data to illustrate where I am stuck.
library("maps")
library("raster")
# create sample data
lon <- seq(from=35.6, to=43.2, by=0.2)
lat <- seq(from=10.5, to=22.2, length.out=length(lon)) # same length as lon
humidity <- runif(length(lon), min=9.6, max=13.5)
data <- data.frame(lon,lat, humidity)
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(100))
legend("topleft",title="q (g/kg)",legend=c(11,11.5,12,12.5,13),col =colfunc(100), pch=20)
The resulting plot looks like this:
Something is clearly wrong with the legend, I would like to have a few points shown in the legend with the corresponding colour and value or even use a nice colourbar. I am not sure why the colour in the legend is just blue. I also suspect that the line of points is not coloured according to their actual value and just displaying the whole colour gradient. Thanks for any suggestions!
UPDATE with code from Alex:
n <- 10
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
mycol <- function(x, myrange, n=10) round( 1+(x-myrange[1])/diff(myrange) * (n-1))
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(n)[mycol(humidity, range(humidity), n)])
mylist <- c(10,11,11.5,12,12.5,13)
legend("topleft",title="q (g/kg)",legend=mylist,col = colfunc(n)[mycol(mylist,range(humidity), n)], pch=20)
That generates this plot:
Points are overlapping and it is hard to see the overall values of the points, is there any way to colour the points according to a defined range using the colourramp? Such as "red" for values 10 to 11, "green" for 11 to 12 and so on?
You may have mixed up a few things. In your code, you are plotting points with a colour that depends only on the order of the points (the first point gets the first colour in the list, and so on). The colour doesn't depend on the value.
Now in a colour gradient for humidity values in the full range 0:100 you will, frankly, not see any difference between values 11 and 13. You need a lot more contrast.
So you should first do
mycol <- function(x, myrange, n=100) round( 1+(x-myrange[1])/diff(myrange) * (n-1))
Now mycol(x, range(humidity), n) returns an integer that is 1 for the minimum value and n for the maximum.
n=100
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(n)[mycol(humidity, range(humidity), n)])
mylist <- c(11,11.5,12,12.5,13)
legend("topleft",title="q (g/kg)",legend=mylist,col = colfunc(n)[mycol(mylist,range(humidity), n)], pch=20)
You can also generate the legend values with seq():
library("maps")
library("raster")
n <- 4 # number in legend
# create sample data
lon <- seq(from=35.6, to=43.2, by=0.2)
lat <- seq(from=10.5, to=22.2, by=0.2)
humidity <- runif(39, min=9.6, max=20)
data <- data.frame(lon, lat = lat[1:39], humidity) # name the column so data$lat is unambiguous
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
points(data$lon, data$lat, cex=.5,pch=18, col=colfunc(100)[cut(data$humidity, breaks=100, labels=FALSE)]) # colour by value, not by row order
legend("topleft",title="q (g/kg)",legend=round(seq(min(humidity),max(humidity),length.out = n),0),col =colfunc(n), pch=20)
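The follow-up about one colour per defined range (10 to 11, 11 to 12, and so on) can be sketched with cut(). This is a minimal sketch under a few assumptions: the break points 9:14 are chosen only to cover the sample data, and a plain plot() stands in for the map so the example runs without the maps package. On the map, the same point_cols vector would go into points().

```r
set.seed(1)
lon <- seq(from = 35.6, to = 43.2, by = 0.2)
lat <- seq(from = 10.5, to = 22.2, length.out = length(lon))
humidity <- runif(length(lon), min = 9.6, max = 13.5)

# Assumed class limits; adjust them to the ranges you want to distinguish
breaks <- 9:14
bin <- cut(humidity, breaks = breaks, include.lowest = TRUE)

# One colour per class, taken from the same ramp as in the question
colfunc <- colorRampPalette(c("dodgerblue2", "khaki", "orangered"))
bin_cols <- colfunc(nlevels(bin))
point_cols <- bin_cols[as.integer(bin)]  # colour of each point's class

plot(lon, lat, pch = 19, col = point_cols,
     xlab = "lon", ylab = "lat", main = "specific humidity along line")
legend("topleft", title = "q (g/kg)", legend = levels(bin),
       col = bin_cols, pch = 19)
```

Because the legend labels and colours both come from the levels of bin, the legend always matches the colours used for the points.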

Histogram to decide whether two distributions have the same shape in R [duplicate]

I am using R and I have two data frames: carrots and cucumbers. Each data frame has a single numeric column that lists the length of all measured carrots (total: 100k carrots) and cucumbers (total: 50k cucumbers).
I wish to plot two histograms - carrot length and cucumbers lengths - on the same plot. They overlap, so I guess I also need some transparency. I also need to use relative frequencies not absolute numbers since the number of instances in each group is different.
Something like this would be nice but I don't understand how to create it from my two tables:
Here is an even simpler solution using base graphics and alpha-blending (which does not work on all graphics devices):
set.seed(42)
p1 <- hist(rnorm(500,4)) # centered at 4
p2 <- hist(rnorm(500,6)) # centered at 6
plot( p1, col=rgb(0,0,1,1/4), xlim=c(0,10)) # first histogram
plot( p2, col=rgb(1,0,0,1/4), xlim=c(0,10), add=T) # second
The key is that the colours are semi-transparent.
Edit, more than two years later: As this just got an upvote, I figure I may as well add a visual of what the code produces as alpha-blending is so darn useful:
That image you linked to was for density curves, not histograms.
If you've been reading on ggplot then maybe the only thing you're missing is combining your two data frames into one long one.
So, let's start with something like what you have, two separate sets of data and combine them.
carrots <- data.frame(length = rnorm(100000, 6, 2))
cukes <- data.frame(length = rnorm(50000, 7, 2.5))
# Now, combine your two dataframes into one.
# First make a new column in each that will be
# a variable to identify where they came from later.
carrots$veg <- 'carrot'
cukes$veg <- 'cuke'
# and combine into your new data frame vegLengths
vegLengths <- rbind(carrots, cukes)
After that, which is unnecessary if your data is in long format already, you only need one line to make your plot.
library(ggplot2)
ggplot(vegLengths, aes(length, fill = veg)) + geom_density(alpha = 0.2)
Now, if you really did want histograms, the following will work. Note that you must change position from the default "stack". You might miss that if you don't have a clear idea of what your data should look like. A higher alpha looks better there. Also note that I made them density histograms; remove y = after_stat(density) to get back to counts (older ggplot2 versions spell this y = ..density..).
ggplot(vegLengths, aes(length, fill = veg)) +
geom_histogram(alpha = 0.5, aes(y = after_stat(density)), position = 'identity')
One additional thing: I commented on Dirk's answer that all of the arguments could simply be in the hist command. I was asked how that could be done. What follows produces exactly Dirk's figure.
set.seed(42)
hist(rnorm(500,4), col=rgb(0,0,1,1/4), xlim=c(0,10))
hist(rnorm(500,6), col=rgb(1,0,0,1/4), xlim=c(0,10), add = TRUE)
Here's a function I wrote that uses pseudo-transparency to represent overlapping histograms
plotOverlappingHist <- function(a, b, colors=c("white","gray20","gray50"),
breaks=NULL, xlim=NULL, ylim=NULL){
ahist=NULL
bhist=NULL
if(!(is.null(breaks))){
ahist=hist(a,breaks=breaks,plot=F)
bhist=hist(b,breaks=breaks,plot=F)
} else {
ahist=hist(a,plot=F)
bhist=hist(b,plot=F)
dist = ahist$breaks[2]-ahist$breaks[1]
breaks = seq(min(ahist$breaks,bhist$breaks),max(ahist$breaks,bhist$breaks),dist)
ahist=hist(a,breaks=breaks,plot=F)
bhist=hist(b,breaks=breaks,plot=F)
}
if(is.null(xlim)){
xlim = c(min(ahist$breaks,bhist$breaks),max(ahist$breaks,bhist$breaks))
}
if(is.null(ylim)){
ylim = c(0,max(ahist$counts,bhist$counts))
}
overlap = ahist
for(i in 1:length(overlap$counts)){
if(ahist$counts[i] > 0 & bhist$counts[i] > 0){
overlap$counts[i] = min(ahist$counts[i],bhist$counts[i])
} else {
overlap$counts[i] = 0
}
}
plot(ahist, xlim=xlim, ylim=ylim, col=colors[1])
plot(bhist, xlim=xlim, ylim=ylim, col=colors[2], add=T)
plot(overlap, xlim=xlim, ylim=ylim, col=colors[3], add=T)
}
Here's another way to do it using R's support for transparent colors
a=rnorm(1000, 3, 1)
b=rnorm(1000, 6, 1)
hist(a, xlim=c(0,10), col="red")
hist(b, add=T, col=rgb(0, 1, 0, 0.5) )
The results end up looking something like this:
There are already beautiful answers here, but I thought I would add this one. It looks good to me.
(Random numbers copied from @Dirk.) library(scales) is needed.
set.seed(42)
hist(rnorm(500,4),xlim=c(0,10),col='skyblue',border=F)
hist(rnorm(500,6),add=T,col=scales::alpha('red',.5),border=F)
The result is...
Update: This overlapping function may also be useful to some.
hist0 <- function(...,col='skyblue',border=T) hist(...,col=col,border=border)
I feel the result from hist0 looks prettier than that of hist.
hist2 <- function(var1, var2,name1='',name2='',
breaks = min(max(length(var1), length(var2)),20),
main0 = "", alpha0 = 0.5,grey=0,border=F,...) {
library(scales)
library(magrittr) # provides the %>% pipe used below
colh <- c(rgb(0, 1, 0, alpha0), rgb(1, 0, 0, alpha0))
if(grey) colh <- c(alpha(grey(0.1,alpha0)), alpha(grey(0.9,alpha0)))
max0 = max(var1, var2)
min0 = min(var1, var2)
den1_max <- hist(var1, breaks = breaks, plot = F)$density %>% max
den2_max <- hist(var2, breaks = breaks, plot = F)$density %>% max
den_max <- max(den2_max, den1_max)*1.2
var1 %>% hist0(xlim = c(min0 , max0) , breaks = breaks,
freq = F, col = colh[1], ylim = c(0, den_max), main = main0,border=border,...)
var2 %>% hist0(xlim = c(min0 , max0), breaks = breaks,
freq = F, col = colh[2], ylim = c(0, den_max), add = T,border=border,...)
legend(min0,den_max, legend = c(
ifelse(nchar(name1)==0,substitute(var1) %>% deparse,name1),
ifelse(nchar(name2)==0,substitute(var2) %>% deparse,name2),
"Overlap"), fill = c('white','white', colh[1]), bty = "n", cex=1,ncol=3)
legend(min0,den_max, legend = c(
ifelse(nchar(name1)==0,substitute(var1) %>% deparse,name1),
ifelse(nchar(name2)==0,substitute(var2) %>% deparse,name2),
"Overlap"), fill = c(colh, colh[2]), bty = "n", cex=1,ncol=3) }
The result of
par(mar=c(3, 4, 3, 2) + 0.1)
set.seed(100)
hist2(rnorm(10000,2),rnorm(10000,3),breaks = 50)
is
Here is an example of how you can do it in "classic" R graphics:
## generate some random data
carrotLengths <- rnorm(1000,15,5)
cucumberLengths <- rnorm(200,20,7)
## calculate the histograms - don't plot yet
histCarrot <- hist(carrotLengths,plot = FALSE)
histCucumber <- hist(cucumberLengths,plot = FALSE)
## calculate the range of the graph
xlim <- range(histCucumber$breaks,histCarrot$breaks)
ylim <- range(0,histCucumber$density,
histCarrot$density)
## plot the first graph
plot(histCarrot,xlim = xlim, ylim = ylim,
col = rgb(1,0,0,0.4),xlab = 'Lengths',
freq = FALSE, ## relative, not absolute frequency
main = 'Distribution of carrots and cucumbers')
## plot the second graph on top of this
opar <- par(new = FALSE)
plot(histCucumber,xlim = xlim, ylim = ylim,
xaxt = 'n', yaxt = 'n', ## don't add axes
col = rgb(0,0,1,0.4), add = TRUE,
freq = FALSE) ## relative, not absolute frequency
## add a legend in the corner
legend('topleft',c('Carrots','Cucumbers'),
fill = rgb(1:0,0,0:1,0.4), bty = 'n',
border = NA)
par(opar)
The only issue with this is that it looks much better if the histogram breaks are aligned, which may have to be done manually (in the arguments passed to hist).
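Aligning the breaks does not have to be done by eye. A minimal sketch, assuming it is acceptable to compute one common set of breaks from the pooled range (the choice of roughly 30 bins is arbitrary):

```r
set.seed(42)
carrotLengths <- rnorm(1000, 15, 5)
cucumberLengths <- rnorm(200, 20, 7)

# One set of equally spaced breaks that covers both samples
common_breaks <- pretty(range(carrotLengths, cucumberLengths), n = 30)

histCarrot <- hist(carrotLengths, breaks = common_breaks, plot = FALSE)
histCucumber <- hist(cucumberLengths, breaks = common_breaks, plot = FALSE)

# Use densities so the different sample sizes remain comparable
ylim <- range(0, histCarrot$density, histCucumber$density)

plot(histCarrot, freq = FALSE, ylim = ylim,
     col = rgb(1, 0, 0, 0.4), xlab = 'Lengths',
     main = 'Distribution of carrots and cucumbers')
plot(histCucumber, freq = FALSE, col = rgb(0, 0, 1, 0.4), add = TRUE)
legend('topleft', c('Carrots', 'Cucumbers'),
       fill = c(rgb(1, 0, 0, 0.4), rgb(0, 0, 1, 0.4)), bty = 'n')
```

Since both histograms share the same bins, the overlapping bars line up exactly and the blended colour marks the overlap.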
Here's a version like the ggplot2 one I gave, only in base R. I copied some from @nullglob.
## generate the data
carrots <- rnorm(100000,5,2)
cukes <- rnorm(50000,7,2.5)
You don't need to put it into a data frame like with ggplot2. The drawback of this method is that you have to write out a lot more of the details of the plot. The advantage is that you have control over more details of the plot.
## calculate the density - don't plot yet
densCarrot <- density(carrots)
densCuke <- density(cukes)
## calculate the range of the graph
xlim <- range(densCuke$x,densCarrot$x)
ylim <- range(0,densCuke$y, densCarrot$y)
#pick the colours
carrotCol <- rgb(1,0,0,0.2)
cukeCol <- rgb(0,0,1,0.2)
## plot the carrots and set up most of the plot parameters
plot(densCarrot, xlim = xlim, ylim = ylim, xlab = 'Lengths',
main = 'Distribution of carrots and cucumbers',
panel.first = grid())
#put our density plots in
polygon(densCarrot, density = -1, col = carrotCol)
polygon(densCuke, density = -1, col = cukeCol)
## add a legend in the corner
legend('topleft',c('Carrots','Cucumbers'),
fill = c(carrotCol, cukeCol), bty = 'n',
border = NA)
@Dirk Eddelbuettel: The basic idea is excellent but the code as shown can be improved. [Takes long to explain, hence a separate answer and not a comment.]
The hist() function by default draws plots, so you need to add the plot=FALSE option. Moreover, it is clearer to establish the plot area by a plot(0,0,type="n",...) call in which you can add the axis labels, plot title etc. Finally, I would like to mention that one could also use shading to distinguish between the two histograms. Here is the code:
set.seed(42)
p1 <- hist(rnorm(500,4),plot=FALSE)
p2 <- hist(rnorm(500,6),plot=FALSE)
plot(0,0,type="n",xlim=c(0,10),ylim=c(0,100),xlab="x",ylab="freq",main="Two histograms")
plot(p1,col="green",density=10,angle=135,add=TRUE)
plot(p2,col="blue",density=10,angle=45,add=TRUE)
And here is the result (a bit too wide because of RStudio :-) ):
Plotly's R API might be useful for you. The graph below is here.
library(plotly)
#add username and key
p <- plotly(username="Username", key="API_KEY")
#generate data
x0 = rnorm(500)
x1 = rnorm(500)+1
#arrange your graph
data0 = list(x=x0,
name = "Carrots",
type='histogramx',
opacity = 0.8)
data1 = list(x=x1,
name = "Cukes",
type='histogramx',
opacity = 0.8)
#specify type as 'overlay'
layout <- list(barmode='overlay',
plot_bgcolor = 'rgba(249,249,251,.85)')
#format response, and use 'browseURL' to open graph tab in your browser.
response = p$plotly(data0, data1, kwargs=list(layout=layout))
url = response$url
filename = response$filename
browseURL(response$url)
Full disclosure: I'm on the team.
So many great answers, but since I've just written a function to do this (plotMultipleHistograms() in the 'basicPlotteR' package), I thought I would add another answer.
The advantage of this function is that it automatically sets appropriate X and Y axis limits and defines a common set of bins that it uses across all the distributions.
Here's how to use it:
# Install the plotteR package
install.packages("devtools")
devtools::install_github("JosephCrispell/basicPlotteR")
library(basicPlotteR)
# Set the seed
set.seed(254534)
# Create random samples from a normal distribution
distributions <- list(rnorm(500, mean=5, sd=0.5),
rnorm(500, mean=8, sd=5),
rnorm(500, mean=20, sd=2))
# Plot overlapping histograms
plotMultipleHistograms(distributions, nBins=20,
colours=c(rgb(1,0,0, 0.5), rgb(0,0,1, 0.5), rgb(0,1,0, 0.5)),
las=1, main="Samples from normal distribution", xlab="Value")
The plotMultipleHistograms() function can take any number of distributions, and all the general plotting parameters should work with it (for example: las, main, etc.).

Time lapse plotting

I am new to the R environment. I have generated a simulated time-lapse plot using the following code.
seq_x<-seq(1,10)
seq_y<-function(y)
{
z<-y^2+y+1
return (c(z))
}
yrange<-seq_y(1)
yrange[2]<-seq_y(length(seq_x))
for(i in 1:length(seq_x) )
{
xdata<-seq_x[1:i]
ydata<-seq_y(xdata)
plot(xdata,ydata,xlim=range(seq_x),ylim=range(yrange),type="o",col="royalblue",plot.first=grid())
Sys.sleep(1)
}
I get the following plot (after the final iteration).
Now I need to plot a straight line and a circle right across the plot as shown below.
The straight line should grow with the data sequence. The circle should be at the center of the data sequence with user specified radius. Any advice in this regard will be highly appreciated.
Try this:
#draw circle
require(plotrix)
userRadius <- 1
draw.circle(median(xdata), median(ydata), userRadius)
#draw line
segments(x0=xdata[1],y0=ydata[1],
x1=xdata[length(xdata)],y1=ydata[length(ydata)])
You can use lines to add additional lines to your plot. Then you just have to calculate the points on the circle (or ellipse) and you can draw both the extra line and the circle.
# data
x <- seq(1,10)
y <- x^2 + x + 1
# function to calculate points on the ellipse
ellipsis_fct <- function(mx, my, rx, ry){
phi <- seq(0, 2*pi, length.out = 100) # increase length.out for better resolution
data.frame(x = mx + rx*sin(phi),
y = my + ry*cos(phi))
}
# actually calculate the points.
circ <- ellipsis_fct(mean(range(x)), mean(range(y)), diff(range(x))/5, diff(range(y))/5)
# plotting commands
plot(x, y, xlim=range(x), ylim=range(y), type="o", col="royalblue", panel.first=grid())
lines(range(x), range(y), col = "darkred", lty = "dashed")
lines(circ, col = "orange")
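To make the line and the circle grow with the data, the same drawing calls can go inside the time-lapse loop. The following is only a sketch under stated assumptions: "centre of the data sequence" is taken to mean the midpoint of the x and y ranges drawn so far, the radius is given in x-axis units and rescaled for the y axis so the shape looks roughly round, and the names ellipse_points, user_radius and pause are made up for illustration.

```r
seq_x <- 1:10
seq_y <- function(x) x^2 + x + 1
y_all <- seq_y(seq_x)

user_radius <- 1  # user-specified radius, in x-axis units (assumption)
pause <- 0.2      # seconds between frames

# Points on an ellipse with centre (cx, cy) and radii rx, ry
ellipse_points <- function(cx, cy, rx, ry, n = 100) {
  phi <- seq(0, 2 * pi, length.out = n)
  data.frame(x = cx + rx * cos(phi), y = cy + ry * sin(phi))
}

# The axes have very different scales, so stretch the y radius accordingly
y_per_x <- diff(range(y_all)) / diff(range(seq_x))

for (i in seq_along(seq_x)) {
  xdata <- seq_x[1:i]
  ydata <- y_all[1:i]
  plot(xdata, ydata, xlim = range(seq_x), ylim = range(y_all),
       type = "o", col = "royalblue", panel.first = grid())
  # straight line from the first to the latest point
  segments(xdata[1], ydata[1], xdata[i], ydata[i],
           col = "darkred", lty = "dashed")
  # circle around the centre of the data drawn so far
  circ <- ellipse_points(mean(range(xdata)), mean(range(ydata)),
                         user_radius, user_radius * y_per_x)
  lines(circ, col = "orange")
  Sys.sleep(pause)
}
```

Each iteration redraws the whole frame, so the line and circle always match the data shown so far.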
