Plot With Blocks - r

I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like the one in the attached image?
Let's say I have an array x1 = c(2,13,4) and y2 = c(5,23,43). I want to create 3 blocks with heights spanning 2-5, 13-23...
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at?

I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the data you showed us is overlapping. This means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is commonly used to show the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I chart an empty plot first and I add three layers of rectangles, one for beige, one for gray and one for red.
#Create data.frame
n=100
df <-data.frame(1:n,runif(n)*10,60+runif(n)*10,25+runif(n)*10,40+runif(n)*10,35-runif(n)*10,35+runif(n)*10)
colnames(df) <-c("id","beige.min","beige.max","gray.min","gray.max","red.min","red.max")
#Create chart
plot(x=df$id,y=NULL,ylim=range(df[,-1]), type="n") #blank chart, ylim is the range of the data
rect(df$id-0.5,df[,2],df$id+0.5,df[,3],col="beige", border=FALSE) #first layer
rect(df$id-0.5,df[,4],df$id+0.5,df[,5],col="gray", border=FALSE) #second layer
rect(df$id-0.5,df[,6],df$id+0.5,df[,7],col="darkred", border=FALSE) #third layer
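The same empty-plot-plus-rect idea can be applied to the two vectors from the question. This is only a minimal sketch: it assumes each pair (x1[i], y2[i]) is the bottom and top of one block, and the names id and the 0.4 half-width are arbitrary choices, not something from the question.

```r
# Vectors from the question: lower and upper edge of each block
x1 <- c(2, 13, 4)
y2 <- c(5, 23, 43)
id <- seq_along(x1)  # assumed x position of each block: 1, 2, 3

# Empty canvas sized to the data
plot(NA, type = "n",
     xlim = c(0.5, length(id) + 0.5), ylim = range(x1, y2),
     xlab = "block", ylab = "value", xaxt = "n")
axis(1, at = id)

# One rectangle per block, drawn side by side so overlapping ranges stay visible
rect(id - 0.4, x1, id + 0.4, y2, col = "gray", border = NA)
```

Because the blocks sit at different x positions, the overlap between the 13-23 and 4-43 ranges no longer hides anything.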

It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))

Related

3D interactive surface plot with spatial data

I would like to create an interactive 3D surface plot of depths in a lake, ideally using the plotly or rgl libraries. I have extracted my data from a SpatialLinesDataFrame of contour lines in Gauss-Krueger/EPSG:31468 CRS, i.e. metric units. Now each contour line produces a set of coordinates with the same depth value. The resulting data frame is rather large, but looks something like this:
set.seed(41)
xx <- rnorm(100,4448929,100)
yy <- rnorm(100,5308097,100)
zz <- c(rep(-10,10),rep(-20,10),rep(-30,10),rep(-40,10),rep(-50,10),rep(-60,10),rep(-70,10),rep(-80,10),rep(-90,10),rep(-100,10))
df <- data.frame(xx,yy,zz)
I have tried plotting the data with plotly as in this example and with rgl as in this post. In both cases I get error messages relating to my data not being in a matrix format, i.e. where x- and y-values are represented as row- and column-numbers.
What does work, is using the add_trace command in plotly:
plot_ly() %>% add_trace(df,x = ~df$xx, y = ~df$yy, z = ~df$zz,type="mesh3d")
However, the resulting graph not only lacks the fancy colour legend of the add_surface command, but more importantly, warps the x- and y-values in relation to the z-values. The z-values are shown much too large, although all have the same metric unit.
I have also tried reshaping the data frame to a matrix as in this post, but it either doesn't work at all, or gives me a matrix consisting almost entirely of NAs. I can only speculate that the number of coordinates that have depth values attached is very small in comparison to all x-y-combinations of coordinates in that range?
Any suggestions will be much appreciated - thanks!
Those are randomly located points, so rgl::persp3d can't handle them directly. However, you can follow the example in ?rgl::persp3d.deldir to triangulate them and then plot. For example,
library(rgl) # provides persp3d() and its method for deldir objects
dxyz <- deldir::deldir(df$xx, df$yy, z = df$zz, suppressMsgs=TRUE)
persp3d(dxyz, col = "lightblue")
This results in a pretty ugly picture, but with some work (e.g. fixing the axis labels, using real data) you should get something reasonable.
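The question's guess about the NA-filled matrix can be checked directly with base R. The sketch below rebuilds the example data and places each depth into a matrix with one row per distinct x and one column per distinct y; the names ux, uy, grid and na_share are made up for illustration.

```r
# Same synthetic data as in the question
set.seed(41)
xx <- rnorm(100, 4448929, 100)
yy <- rnorm(100, 5308097, 100)
zz <- rep(seq(-10, -100, by = -10), each = 10)

# One row per distinct x value, one column per distinct y value
ux <- sort(unique(xx))
uy <- sort(unique(yy))
grid <- matrix(NA_real_, nrow = length(ux), ncol = length(uy))

# Fill only the cells where a depth was actually recorded
grid[cbind(match(xx, ux), match(yy, uy))] <- zz

# Share of cells that never receive a value
na_share <- mean(is.na(grid))
print(na_share)
```

With scattered coordinates almost every x and y value is distinct, so the matrix has roughly as many rows and columns as there are points, while only one cell per point is filled. That is why a surface function expecting a regular grid sees mostly NAs, and why triangulating (or interpolating onto a coarser grid) is needed first.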

How to send parameter to Geom.histogram when using Geom.subplot_grid in Gadfly?

I am trying to plot several histograms for the same data set, but with different numbers of bins. I am using Gadfly.
Suppose x is just an array of real values, plotting each histogram works:
plot(x=x, Geom.histogram(bincount=10))
plot(x=x, Geom.histogram(bincount=20))
But I'm trying to put all the histograms together. I've added the number of bins as another dimension to my data set:
x2 = vcat(hcat(10*ones(length(x)), x), hcat(20*ones(length(x)), x))
df = DataFrame(Bins=x2[:,1], X=x2[:,2])
Is there any way to send the number of bins (the value from the first column) to Geom.histogram when using Geom.subplot_grid? Something like this:
plot(df, x="X", ygroup="Bins", Geom.subplot_grid(Geom.histogram(?)))
I think you would be better off not using subplot_grid at that point, and instead just combining the individual plots with vstack or hstack. From the docs:
Plots can also be stacked horizontally with ``hstack`` or vertically with
``vstack``. This allows more customization in regards to tick marks, axis
labeling, and other plot details than is available with ``subplot_grid``.

construct a flat plot using third parameter or with three axis

I don't know how to plot this in a better way.
I have
df1 <- data.frame(x=c(1,3,5), y=c(2,4,6))
df2 <- data.frame(x=c(2,6,10,12), y=c(1,4,7,15))
In these data frames x is the time and y is the measured value.
The data frames have different numbers of rows.
I want to combine the data by x (time), and I need one of two ways to show them on a single plot: a) put df1.y on the x axis to see the distribution of df2 against df1, so the two data frames are connected by time (x) but each is shown on its own axis, or b) use three axes, with the y axis for df1.y on the right side of the plot.
For clearer terminology, I will rename your example variables according to your sample plots.
df1 <- data.frame(time=c(1,3,5), memory=c(2,4,6))
df2 <- data.frame(time=c(2,6,10,12), threads=c(1,4,7,15))
Your first plot:
From your description, I assume that you want to do the following: For each available time value get the value of df1$memory and df2$threads. However, that value may not always be available. One suitable approach is to fill up missing values by linear interpolation. This may be done using the approx-function:
merged.time <- sort(unique(c(df1$time, df2$time)))
merged.data <- data.frame(time = merged.time,
memory = approx(df1$time, df1$memory, xout=merged.time)$y,
threads = approx(df2$time, df2$threads, xout=merged.time)$y
)
Note that approx(...)$y just extracts the interpolated data.
Plotting may now be done using standard plotting commands (or, as your tags suggest, using ggplot2):
ggplot(data=merged.data, aes(x=memory, y=threads)) + geom_line()
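One detail worth checking before plotting: with its default settings, approx returns NA for any xout outside the range of the input times, so rows where one series has not started or has already ended will be incomplete. A small sketch with the renamed example data (the object names memory, threads, merged and usable are only for illustration):

```r
df1 <- data.frame(time = c(1, 3, 5), memory = c(2, 4, 6))
df2 <- data.frame(time = c(2, 6, 10, 12), threads = c(1, 4, 7, 15))

# Union of all time stamps: 1 2 3 5 6 10 12
merged.time <- sort(unique(c(df1$time, df2$time)))

# Linear interpolation; values outside each series' time range become NA
memory  <- approx(df1$time, df1$memory,  xout = merged.time)$y
threads <- approx(df2$time, df2$threads, xout = merged.time)$y

merged <- data.frame(time = merged.time, memory = memory, threads = threads)
print(merged)

# Only rows where both series are defined can be drawn as memory vs. threads
usable <- merged[complete.cases(merged), ]
print(usable)  # times 2, 3 and 5
```

If extending the first and last observed values beyond the range is acceptable for your data, approx(..., rule = 2) does that instead of returning NA.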
Your second plot
... is not possible with ggplot2. That is for numerous reasons, for example see here.

R Lattice Plot Multiple Lines with Specific Color

I have two problems that I am having trouble solving. Firstly, when I plot a multi-column matrix using lattice xyplot, all the points are connected into a single line. How can I get separate, disconnected lines?
x<-cbind(rnorm(10),rnorm(10))
xyplot(x~1:nrow(x),type="l")
Secondly, I am having trouble figuring out how to make one line thicker than the other. For example, if I am interested in column 1, then column 1's line should be thicker than that of column 2.
The lattice plotting paradigm, like that of ggplot2 that followed it, expects data to be in long format in data frames:
dfrm <- data.frame( y=c(rnorm(10),rnorm(10)),
x=1:10,
grp=rep(c("a","b"),each=10))
xyplot(y~x, groups=grp, type="l", data=dfrm, col=c("red","blue"))
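To connect this back to the matrix in the question, here is a sketch of reshaping a two-column matrix into long format and giving the first column a thicker line. It assumes lattice is installed; the names long and p are illustrative, and passing a vector to lwd relies on lattice applying graphical parameters per group.

```r
library(lattice)

set.seed(1)
x <- cbind(rnorm(10), rnorm(10))

# Stack the columns: as.vector() goes column by column
long <- data.frame(
  y   = as.vector(x),
  x   = rep(seq_len(nrow(x)), times = ncol(x)),
  grp = factor(rep(seq_len(ncol(x)), each = nrow(x)))
)

# One line per group; the first group (column 1) is drawn thicker
p <- xyplot(y ~ x, groups = grp, data = long, type = "l",
            col = c("red", "blue"), lwd = c(3, 1))
print(p)
```

Since each group is drawn as its own line, the two columns are no longer joined end to end.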
This might not be the most elegant solution but it gets the job done:
x<-cbind(rnorm(10),rnorm(10))
plot1<-xyplot(x[,1]~1:nrow(x),type="l",col="red",lwd=3)
plot2<-xyplot(x[,2]~1:nrow(x),type="l")
library(latticeExtra)
plot1+plot2
I assumed that you wanted V1 and V2 plotted against the number of observations.
Otherwise you indeed only have one line.
You can adjust the axis and labels according to taste.

R - logistic curve plot with aggregate points

Let's say I have the following dataset
bodysize=rnorm(20,30,2)
bodysize=sort(bodysize)
survive=c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat=as.data.frame(cbind(bodysize,survive))
I'm aware that the glm plot function has several nice plots to show you the fit,
but I'd nevertheless like to create an initial plot with:
1) raw data points
2) the logistic curve
3) predicted points
4) aggregate points for a number of predictor levels
library(Hmisc)
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
All fine up to here.
Now I want to plot the observed survival rates for given levels of the predictor
dat$bd<-cut2(dat$bodysize,g=5,levels.mean=T)
AggBd<-aggregate(dat$survive,by=list(dat$bd),data=dat,FUN=mean)
plot(AggBd,add=TRUE)
#Doesn't work
I've tried to match AggBd to the dataset used for the model and all sorts of other things, but I simply can't plot the two together. Is there a way around this?
I basically want to superimpose the last plot along the same axes.
Besides this specific task, I often wonder how to superimpose plots of different variables that have a similar scale/range on two-dimensional plots. I would really appreciate your help.
The first column of AggBd is a factor, so you need to convert the levels to numeric before you can add the points to the plot.
AggBd$size <- as.numeric (levels (AggBd$Group.1))[AggBd$Group.1]
To add the points to the existing plot, use points:
points (AggBd$size, AggBd$x, pch = 3)
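For comparison, a base-R sketch of the whole workflow that avoids the factor conversion by computing the mean body size per bin directly. It uses quantile and cut as a stand-in for Hmisc::cut2, so the bins are an assumption and may not match cut2's exactly; set.seed and the names breaks, bin and agg are added for illustration.

```r
set.seed(1)
bodysize <- sort(rnorm(20, 30, 2))
survive  <- c(0,0,0,0,0,1,0,1,0,0,1,1,0,1,1,1,0,1,1,1)
dat <- data.frame(bodysize, survive)

# Raw data, fitted logistic curve and fitted values
plot(bodysize, survive, xlab = "Body size", ylab = "Probability of survival")
g <- glm(survive ~ bodysize, family = binomial, data = dat)
curve(predict(g, data.frame(bodysize = x), type = "response"), add = TRUE)
points(bodysize, fitted(g), pch = 20)

# Five quantile-based bins; per bin: mean body size and observed survival rate
breaks <- quantile(bodysize, probs = seq(0, 1, by = 0.2))
bin <- cut(bodysize, breaks = breaks, include.lowest = TRUE)
agg <- data.frame(size = as.vector(tapply(bodysize, bin, mean)),
                  rate = as.vector(tapply(survive, bin, mean)))

# Numeric x values, so the points land on the existing axes
points(agg$size, agg$rate, pch = 3, col = "red")
```

Because points() draws into the current coordinate system, no par(new=TRUE) or second set of axes is needed.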
You are best off specifying your y-axis explicitly. Another option is to use par(new=TRUE):
plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")
g=glm(survive~bodysize,family=binomial,dat)
curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)
points(bodysize,fitted(g),pch=20)
#then
par(new=TRUE)
#
plot(AggBd$Group.1,AggBd$x,pch=30)
Obviously, remove or change the axis ticks to prevent overlap, e.g.
plot(AggBd$Group.1,AggBd$x,pch=30,xaxt="n",yaxt="n",xlab="",ylab="")

Resources