How to do a 3D plot using R? - r

I want to plot a 3D plot using R. My data set is independent, which means the values of x, y, and z are not dependent on each other. The plot I want is given in this picture:
This plot was drawn by someone using MATLAB. How can I do the same kind of plot using R?

Since you posted your image file, it appears you are not trying to make a 3d scatterplot, rather a 2d scatterplot with a continuous color scale to indicate the value of a third variable.
Option 1: For this approach I would use ggplot2.
library(ggplot2)
# make data
mydata <- data.frame(x = rnorm(100, 10, 3),
                     y = rnorm(100, 5, 10),
                     z = rpois(100, 20))
# colour each point by the value of z
ggplot(mydata, aes(x, y)) + geom_point(aes(color = z)) + theme_bw()
Which produces:
Option 2: To make a 3d scatterplot, use the cloud function from the lattice package.
library(lattice)
# make some data
x <- runif(20)
y <- rnorm(20)
z <- rpois(20, 5) / 5
cloud(z ~ x * y)

I usually do these kinds of plots with the base plotting functions and some helper functions for the color levels and color legend from the sinkr package (you need the devtools package to install it from GitHub).
Example:
#library(devtools)
#install_github("marchtaylor/sinkr")
library(sinkr)
# example data
grd <- expand.grid(
x=seq(nrow(volcano)),
y=seq(ncol(volcano))
)
grd$z <- c(volcano)
# plot
COL <- val2col(grd$z, col=jetPal(100))
op <- par(no.readonly = TRUE)
layout(matrix(1:2,1,2), widths=c(4,1), heights=4)
par(mar=c(4,4,1,1))
plot(grd$x, grd$y, col=COL, pch=20)
par(mar=c(4,1,1,4))
imageScale(grd$z, col=jetPal(100), axis.pos=4)
mtext("z", side=4, line=3)
par(op)
Result:
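For readers who would rather not install sinkr, the same idea can be sketched with base R alone. This is only an illustration: val_to_col() is a hypothetical helper written for this example (it is not the sinkr function), and hcl.colors() assumes R 3.6.0 or later.

```r
# Hypothetical helper: map numeric values onto a palette by equal-width binning
val_to_col <- function(z, pal) {
  idx <- cut(z, breaks = length(pal), labels = FALSE, include.lowest = TRUE)
  pal[idx]
}

grd <- expand.grid(x = seq_len(nrow(volcano)), y = seq_len(ncol(volcano)))
grd$z <- c(volcano)

pal  <- hcl.colors(100, "viridis")
cols <- val_to_col(grd$z, pal)

op <- par(no.readonly = TRUE)
layout(matrix(1:2, 1, 2), widths = c(4, 1))
par(mar = c(4, 4, 1, 1))
plot(grd$x, grd$y, col = cols, pch = 20, xlab = "x", ylab = "y")
par(mar = c(4, 1, 1, 4))
# Colour bar: a one-column image whose y axis spans the range of z
zs <- seq(min(grd$z), max(grd$z), length.out = length(pal))
image(x = c(0, 1), y = zs, z = matrix(zs, nrow = 1), col = pal,
      xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(4)
mtext("z", side = 4, line = 3)
par(op)
```

The binning is equal-width, so the smallest z gets the first palette colour and the largest gets the last.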

Related

How to plot multiple columns at the same time? [duplicate]

I would like to plot y1 and y2 in the same plot.
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
plot(x, y1, type = "l", col = "red")
plot(x, y2, type = "l", col = "green")
But when I do it like this, they are not plotted in the same plot together.
In Matlab one can do hold on, but does anyone know how to do this in R?
lines() or points() will add to the existing graph, but will not create a new window. So you'd need to do
plot(x,y1,type="l",col="red")
lines(x,y2,col="green")
You can also use par to plot on the same graph but with different axes. Something as follows:
plot( x, y1, type="l", col="red" )
par(new=TRUE)
plot( x, y2, type="l", col="green" )
If you read in detail about par in R, you will be able to generate really interesting graphs. Another book to look at is Paul Murrell's R Graphics.
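One caveat with par(new=TRUE) is that the second plot() call redraws axes and labels on its own scale. A minimal sketch of one way around that, assuming both curves should share a single y scale, is to pass the same ylim to both calls and suppress the axes and labels on the overlay:

```r
x  <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)

# A shared y range keeps both curves on the same scale
yl <- range(y1, y2)

plot(x, y1, type = "l", col = "red", ylim = yl, ylab = "y")
par(new = TRUE)
# Suppress axes and labels on the overlay so nothing is drawn twice
plot(x, y2, type = "l", col = "green", ylim = yl,
     axes = FALSE, xlab = "", ylab = "")
```

If the two series genuinely need different scales, the overlay would instead get its own axis on the right via axis(4).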
When constructing multilayer plots, one should consider the ggplot2 package. The idea is to create a graphical object with basic aesthetics and enhance it incrementally.
ggplot2 requires the data to be packed in a data.frame.
# Data generation
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
df <- data.frame(x,y1,y2)
Basic solution:
require(ggplot2)
ggplot(df, aes(x)) + # basic graphical object
geom_line(aes(y=y1), colour="red") + # first layer
geom_line(aes(y=y2), colour="green") # second layer
Here the + operator is used to add extra layers to the basic object.
With ggplot2 you have access to the graphical object at every stage of plotting. A typical step-by-step setup can look like this:
g <- ggplot(df, aes(x))
g <- g + geom_line(aes(y=y1), colour="red")
g <- g + geom_line(aes(y=y2), colour="green")
g
g produces the plot, and you can see it at every stage (well, after creation of at least one layer). Further enhancements of the plot are also made with the created object. For example, we can add labels for the axes:
g <- g + ylab("Y") + xlab("X")
g
Final g looks like:
UPDATE (2013-11-08):
As pointed out in comments, ggplot's philosophy suggests using data in long format.
You can refer to this answer in order to see the corresponding code.
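I think that the answer you are looking for is the following (note that add=TRUE is only honoured by some plot methods, such as those for functions and ecdf objects; plot.default does not support it):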
plot(first thing to plot)
plot(second thing to plot,add=TRUE)
Use the matplot function:
matplot(x, cbind(y1,y2),type="l",col=c("red","green"),lty=c(1,1))
Use this if y1 and y2 are evaluated at the same x points. It scales the y-axis to fit whichever is bigger (y1 or y2), unlike some of the other answers here that will clip y2 if it gets bigger than y1 (the ggplot solutions are mostly fine in this respect).
Alternatively, and if the two lines don't have the same x-coordinates, set the axis limits on the first plot and add:
x1 <- seq(-2, 2, 0.05)
x2 <- seq(-3, 3, 0.05)
y1 <- pnorm(x1)
y2 <- pnorm(x2,1,1)
plot(x1,y1,ylim=range(c(y1,y2)),xlim=range(c(x1,x2)), type="l",col="red")
lines(x2,y2,col="green")
Am astonished this Q is 4 years old and nobody has mentioned matplot or x/ylim...
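The limit-setting idea generalises to any number of curves. As a sketch only, overlay_lines() below is a hypothetical helper (not part of base R or any package) that computes the common limits first, opens an empty plot, and then adds each series with lines():

```r
# Hypothetical helper: draw several (x, y) series on shared axis limits
overlay_lines <- function(series, cols = seq_along(series)) {
  xl <- range(unlist(lapply(series, `[[`, "x")))
  yl <- range(unlist(lapply(series, `[[`, "y")))
  # Empty plot that only sets up the coordinate system
  plot(xl, yl, type = "n", xlab = "x", ylab = "y")
  for (i in seq_along(series)) {
    lines(series[[i]]$x, series[[i]]$y, col = cols[i])
  }
  invisible(list(xlim = xl, ylim = yl))
}

x1 <- seq(-2, 2, 0.05)
x2 <- seq(-3, 3, 0.05)
lims <- overlay_lines(
  list(list(x = x1, y = pnorm(x1)),
       list(x = x2, y = pnorm(x2, 1, 1))),
  cols = c("red", "green")
)
```

Because the limits are computed before anything is drawn, no curve is clipped regardless of the order in which the series are listed.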
tl;dr: You want to use curve (with add=TRUE) or lines.
I disagree with par(new=TRUE) because that will double-print tick-marks and axis labels. Eg
The output of plot(sin); par(new=T); plot( function(x) x**2 ).
Look how messed up the vertical axis labels are! Since the ranges are different you would need to set ylim=c(lowest point between the two functions, highest point between the two functions), which is less easy than what I'm about to show you---and way less easy if you want to add not just two curves, but many.
What always confused me about plotting is the difference between curve and lines. (If you can't remember that these are the names of the two important plotting commands, just sing it.)
Here's the big difference between curve and lines.
curve will plot a function, like curve(sin). lines plots points with x and y values, like: lines( x=0:10, y=sin(0:10) ).
And here's a minor difference: curve needs to be called with add=TRUE for what you're trying to do, while lines already assumes you're adding to an existing plot.
Here's the result of calling plot(0:2); curve(sin).
Behind the scenes, check out methods(plot). And check body( plot.function )[[5]]. When you call plot(sin) R figures out that sin is a function (not y values) and uses the plot.function method, which ends up calling curve. So curve is the tool meant to handle functions.
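To make the distinction concrete, here is a small sketch that uses both on one plot: curve() is given functions, while lines() is given explicit coordinates. The particular functions and colours are arbitrary choices for illustration.

```r
# curve() takes a function; the first call sets up the plot
res <- curve(sin, from = 0, to = 2 * pi, col = "red", ylab = "f(x)")
# add = TRUE draws on the existing plot instead of starting a new one
curve(cos, add = TRUE, col = "blue")

# lines() takes x and y values and always adds to the existing plot
xs <- seq(0, 2 * pi, length.out = 25)
lines(xs, sin(xs) * cos(xs), col = "darkgreen")
```

curve() invisibly returns the points it evaluated, which is handy when the same values are needed afterwards.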
if you want to split the plot into two columns (2 plots next to each other), you can do it like this:
par(mfrow=c(1,2))
plot(x)
plot(y)
As described by @redmode, you may plot the two lines in the same graphical device using ggplot. In that answer the data were in a 'wide' format. However, when using ggplot it is generally most convenient to keep the data in a data frame in a 'long' format. Then, by using different 'grouping variables' in the aesthetics arguments, properties of the line, such as linetype or colour, will vary according to the grouping variable, and corresponding legends will appear.
In this case, we can use the colour aesthetic, which matches the colour of the lines to different levels of a variable in the data set (here: y1 vs y2). But first we need to melt the data from wide to long format, using e.g. the function 'melt' from the reshape2 package. Other methods to reshape the data are described here: Reshaping data.frame from wide to long format.
library(ggplot2)
library(reshape2)
# original data in a 'wide' format
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
df <- data.frame(x, y1, y2)
# melt the data to a long format
df2 <- melt(data = df, id.vars = "x")
# plot, using the aesthetics argument 'colour'
ggplot(data = df2, aes(x = x, y = value, colour = variable)) + geom_line()
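If reshape2 is not available, the wide-to-long step can also be sketched with base R's stack(). The column names below are chosen to match what melt() produces; the base-graphics loop at the end is just one way to draw the result, and the long data frame could equally be handed to ggplot.

```r
x  <- seq(-2, 2, 0.05)
df <- data.frame(x, y1 = pnorm(x), y2 = pnorm(x, 1, 1))

# stack() returns columns 'values' and 'ind'; repeat x once per stacked column
long <- cbind(x = rep(df$x, times = 2), stack(df[c("y1", "y2")]))
names(long) <- c("x", "value", "variable")

# One line per level of the grouping variable
cols <- c(y1 = "red", y2 = "green")
plot(range(long$x), range(long$value), type = "n", xlab = "x", ylab = "value")
for (g in levels(long$variable)) {
  sub <- long[long$variable == g, ]
  lines(sub$x, sub$value, col = cols[[g]])
}
legend("topleft", legend = names(cols), col = cols, lty = 1)
```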
If you are using base graphics (i.e. not lattice/ grid graphics), then you can mimic MATLAB's hold on feature by using the points/lines/polygons functions to add additional details to your plots without starting a new plot. In the case of a multiplot layout, you can use par(mfg=...) to pick which plot you add things to.
You can use points for the overplot, that is.
plot(x1, y1,col='red')
points(x2,y2,col='blue')
Idiomatic Matlab plot(x1,y1,x2,y2) can be translated in R with ggplot2 for example in this way:
x1 <- seq(1,10,.2)
df1 <- data.frame(x=x1,y=log(x1),type="Log")
x2 <- seq(1,10)
df2 <- data.frame(x=x2,y=cumsum(1/x2),type="Harmonic")
df <- rbind(df1,df2)
library(ggplot2)
ggplot(df)+geom_line(aes(x,y,colour=type))
Inspired by Tingting Zhao's Dual line plots with different range of x-axis Using ggplot2.
Rather than keeping the values to be plotted in a vector, store them in a matrix. By default the entire matrix will be treated as one data set. However, if you supply as many values to a modifier such as col as you have rows in the matrix, the values are recycled so that each row gets its own colour. For example:
x <- matrix(c(21, 50, 80, 41), nrow = 2)
y <- matrix(c(1, 2, 1, 2), nrow = 2)
plot(x, y, col = c("red", "blue"))
This should work unless your data sets are of differing sizes.
You could use the ggplotly() function from the plotly package to turn any of the ggplot2 examples here into an interactive plot, but I think this sort of plot is better without ggplot2:
# call Plotly and enter username and key
library(plotly)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
plot_ly(x = x) %>%
add_lines(y = y1, color = I("red"), name = "Red") %>%
add_lines(y = y2, color = I("green"), name = "Green")
You can also create your plot using ggvis:
library(ggvis)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
df <- data.frame(x, y1, y2)
df %>%
ggvis(~x, ~y1, stroke := 'red') %>%
layer_paths() %>%
layer_paths(data = df, x = ~x, y = ~y2, stroke := 'blue')
This will create the following plot:
Using plotly (adding a solution with primary and secondary y axes, which seems to be missing):
library(plotly)
x <- seq(-2, 2, 0.05)
y1 <- pnorm(x)
y2 <- pnorm(x, 1, 1)
df=cbind.data.frame(x,y1,y2)
plot_ly(df) %>%
add_trace(x=~x,y=~y1,name = 'Line 1',type = 'scatter',mode = 'lines+markers',connectgaps = TRUE) %>%
add_trace(x=~x,y=~y2,name = 'Line 2',type = 'scatter',mode = 'lines+markers',connectgaps = TRUE,yaxis = "y2") %>%
layout(title = 'Title',
xaxis = list(title = "X-axis title"),
yaxis2 = list(side = 'right', overlaying = "y", title = 'secondary y axis', showgrid = FALSE, zeroline = FALSE))
Screenshot from working demo:
We can also use the lattice library:
library(lattice)
x <- seq(-2,2,0.05)
y1 <- pnorm(x)
y2 <- pnorm(x,1,1)
xyplot(y1 + y2 ~ x, ylab = "y1 and y2", type = "l", auto.key = list(points = FALSE,lines = TRUE))
For specific colors
xyplot(y1 + y2 ~ x,ylab = "y1 and y2", type = "l", auto.key = list(points = F,lines = T), par.settings = list(superpose.line = list(col = c("red","green"))))
Use curve for mathematical functions.
And use add=TRUE to use the same plot and axis.
curve( log2 , to=5 , col="black", ylab="log's(.)")
curve( log , add=TRUE , col="red" )
curve( log10, add=TRUE , col="blue" )
abline( h=0 )

Plotting log-scale in R's rCharts using NVD3

I am using rCharts to create an interactive scatter plot in R. The following code works just fine:
library(rCharts)
test.df <- data.frame(x=sample(1:100,size=30,replace=T),
y=sample(10:1000,size=30,replace=T),
g=rep(c('a','b'),each=15))
n1 <- nPlot(y ~ x, group="g", data=test.df, type='scatterChart')
n1
What I need is to use a log-scale for both X- and Y-axis. How can I specify this in rCharts without hacking the produced html/javascript?
Update 1:
A more realistic and static version of the plot I am trying to get plotted with rCharts:
set.seed(2935)
y_nbinom <- c(rnbinom(n=20, size=10, mu=90), rnbinom(n=20, size=20, mu=1282), rnbinom(n=20, size=30, mu=12575))
x_nbinom <- c(rnbinom(n=20, size=30, mu=100), rnbinom(n=20, size=40, mu=1000), rnbinom(n=20, size=50, mu=10000))
x_fixed <- c(rep(100,20), rep(1000,20), rep(10000,20))
realp <- rep(0:2, each=2) * 20 + sample(1:20, size=6, replace=F)
tdf <- data.frame(y = c(y_nbinom,y_nbinom,y_nbinom[realp]), x=c(x_fixed,x_nbinom,x_nbinom[realp]), type=c(rep(c('fixed','nbinom'),each=60), rep('real',6)))
with(tdf, plot(x, y, col=type, pch=19, log='xy'))
I think this question is a bit old, but I had a similar problem and solved it using the info in this post:
rCharts nvd3 library force ticks
Here is my solution for a base-10 log-scaled stacked area chart; it shouldn't be too different for a scatter plot.
df<-data.frame(x=rep(10^seq(0,5,length.out=24),each=4),
y=round(runif(4*24,1,50)),
var=rep(LETTERS[1:4], 4))
df$x<-log(df$x,10)
p <- nPlot(y ~ x, group = 'var', data = df,
type = 'stackedAreaChart', id = 'chart')
p$xAxis(tickFormat = "#!function (x) {
tickformat = [1,10,100,1000,10000,'100k'];
return tickformat[x];}!#")
p
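The technique in that answer is to plot the log10-transformed values and then relabel the ticks with the original magnitudes. As a sketch of the same transform-then-relabel idea, shown here in base graphics rather than rCharts (so none of the rCharts API is assumed), with simulated data in the same ranges as the question:

```r
set.seed(1)
d <- data.frame(x = sample(1:100, 30, replace = TRUE),
                y = sample(10:1000, 30, replace = TRUE))

# Step 1: transform both variables to log10
lx <- log10(d$x)
ly <- log10(d$y)

# Step 2: plot the transformed values without default axes
plot(lx, ly, axes = FALSE, xlab = "x", ylab = "y", pch = 19)

# Step 3: put ticks at whole powers of ten, labelled with the original values
xt <- 0:2
yt <- 1:3
axis(1, at = xt, labels = 10^xt)
axis(2, at = yt, labels = 10^yt)
box()
```

In base graphics the shortcut plot(x, y, log = "xy") does this in one step; the manual version is shown because it mirrors what the tickFormat function does.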

How to plot multiple ECDF's on one plot in different colors in R

I am trying to plot 4 ecdf functions on one plot but can't seem to figure out the proper syntax.
If I have 4 functions "A, B, C, D" what would be the proper syntax in R to get them to be plotted on the same chart with different colors. Thanks!
Here is one way (for three of them, works for four the same way):
set.seed(42)
ecdf1 <- ecdf(rnorm(100)*0.5)
ecdf2 <- ecdf(rnorm(100)*1.0)
ecdf3 <- ecdf(rnorm(100)*2.0)
plot(ecdf3, verticals=TRUE, do.points=FALSE)
plot(ecdf2, verticals=TRUE, do.points=FALSE, add=TRUE, col='brown')
plot(ecdf1, verticals=TRUE, do.points=FALSE, add=TRUE, col='orange')
Note that I am using the fact that the third has the widest range, and use that to initialize the canvas. Otherwise you need xlim=c(...), since the y-axis of an ECDF always runs from 0 to 1.
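When it is not known in advance which sample has the widest range, the x limits can be computed from the data. A minimal sketch for four samples follows; the sample names A to D come from the question, while the simulated values and colours are arbitrary assumptions.

```r
set.seed(42)
samples <- list(A = rnorm(100) * 0.5,
                B = rnorm(100),
                C = rnorm(100) * 2,
                D = rnorm(100) + 1)
cols <- c("orange", "brown", "black", "steelblue")

# Common x range across all samples, so no curve is cut off
xl <- range(unlist(samples))

for (i in seq_along(samples)) {
  # The first call creates the plot; later calls add to it
  plot(ecdf(samples[[i]]), verticals = TRUE, do.points = FALSE,
       col = cols[i], xlim = xl, add = i > 1)
}
legend("bottomright", legend = names(samples), col = cols, lty = 1)
```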
The package latticeExtra provides the function ecdfplot.
library(lattice)
library(latticeExtra)
set.seed(42)
vals <- data.frame(r1=rnorm(100)*0.5,
r2=rnorm(100),
r3=rnorm(100)*2)
ecdfplot(~ r1 + r2 + r3, data=vals, auto.key=list(space='right'))
Here is an approach using ggplot2 (using the ecdf objects from Dirk's answer, https://stackoverflow.com/a/20601807/1385941):
library(ggplot2)
# create a data set containing the range you wish to use
d <- data.frame(x = c(-6,6))
# create a list of calls to `stat_function` with the colours you wish to use
ll <- Map(f = stat_function, colour = c('red', 'green', 'blue'),
fun = list(ecdf1, ecdf2, ecdf3), geom = 'step')
ggplot(data = d, aes(x = x)) + ll
A simpler way is to use ggplot and have the variable that you want to plot as a factor. In the example below, I have Portfolio as a factor and plotting the distribution of Interest Rates by Portfolio.
# select a palette
myPal <- c( 'royalblue4', 'lightsteelblue1', 'sienna1')
# plot the Interest Rate distribution of each portfolio
# make an ecdf of each category in Portfolio which is a factor
g2 <- ggplot(mortgage, aes(x = Interest_Rate, color = Portfolio)) +
scale_color_manual(values = myPal) +
stat_ecdf(lwd = 1.25, geom = "line")
g2
You can also set geom = "step", geom = "point" and adjust the line width lwd in the stat_ecdf() function. This gives you a nice plot with the legend.

specific colours are required within Hexbin package?

I am plotting a scatter plot for a high density of dots. I used the hexbin package and successfully plotted the data. The colour is not pretty, and I have been asked to follow a standard colour scheme. I wonder if this is supported by R. The image shows my output (right) and the wanted colours (left).
Example:
library(hexbin)
x <- rnorm(1000)
y <- rnorm(1000)
bin <- hexbin(x, y, xbins = 50)
plot(bin, main = "Hexagonal Binning")
Using the example on the package help page for hexbin, you can get close using rainbow and playing with the colorcut argument like so:
x <- rnorm(10000)
y <- rnorm(10000)
(bin <- hexbin(x, y))
plot(hexbin(x, y + x*(x+1)/4),main = "Example" ,
colorcut = seq(0,1,length.out=64),
colramp = function(n) rev(rainbow(64)),
legend = 0 )
You will need to play with the legend specification etc to get exactly what you want.
Alternative colour palette suggested by @Roland:
## nicer colour palette
cols <- colorRampPalette(c("darkorchid4","darkblue","green","yellow", "red") )
plot(hexbin(x, y + x*(x+1)/4), main = "Example" ,
colorcut = seq(0,1,length.out=24),
colramp = function(n) cols(24) ,
legend = 0 )
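The colramp argument expects a function that takes a number n and returns n colours, which is exactly what colorRampPalette() returns. As a sketch, the MATLAB-like "jet" ramp below uses the colour stops from the colorRampPalette help page; the name jet_like is made up for this example, and it should be usable as colramp = jet_like in the plot calls above.

```r
# colorRampPalette() returns a function of n that interpolates between the stops
jet_like <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan",
                               "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))

cols <- jet_like(24)

# Preview the ramp as a strip of coloured bars
barplot(rep(1, length(cols)), col = cols, border = NA, axes = FALSE, space = 0)
```

Passing the function itself, rather than wrapping a fixed-length call such as cols(24), lets the plotting code request as many colours as it needs.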

draw one or more plots in the same window

I want to compare two curves. Is it possible with R to draw a plot and then draw another plot over it? How?
thanks.
With base R, you can plot your first curve and then add the second curve with the lines() function. Here's a quick example:
x <- 1:10
y <- x^2
y2 <- x^3
plot(x,y, type = "l")
lines(x, y2, col = "red")
Alternatively, if you wanted to use ggplot2, here are two methods - one plots different colors on the same plot, and the other generates separate plots for each variable. The trick here is to "melt" the data into long format first.
library(ggplot2)
library(reshape2) # provides melt()
df <- data.frame(x, y, y2)
df.m <- melt(df, id.vars = "x")
qplot(x, value, data = df.m, colour = variable, geom = "line")
qplot(x, value, data = df.m, geom = "line")+ facet_wrap(~ variable)
Using lattice package:
require(lattice)
x <- seq(-3,3,length.out=101)
xyplot(dnorm(x) + sin(x) + cos(x) ~ x, type = "l")
There have been some solutions already for you. If you stay with the base package, you should get acquainted with the functions plot(), lines(), abline(), points(), polygon(), segments(), rect(), box(), arrows(), ... Take a look at their help files.
You should see a plot from the base package as a pane with the coordinates you gave it. On that pane, you can draw a whole set of objects with the abovementioned functions. They allow you to construct a graph as you want. You should remember though that, unless you play with the par settings like Dr. G showed, every call to plot() gives you a new pane. Also take into account that things can be plotted over other things, so think about the order you use to plot things.
See eg:
set.seed(100)
x <- 1:10
y <- x^2
y2 <- x^3
yse <- abs(runif(10,2,4))
plot(x,y, type = "n") # type="n" only plots the pane, no curves or points.
# plots the area between both curves
polygon(c(x,sort(x,decreasing=T)),c(y,sort(y2,decreasing=T)),col="grey")
# plot both curves
lines(x,y,col="purple")
lines(x, y2, col = "red")
# add the points to the first curve
points(x, y, col = "black")
# adds some lines indicating the standard error
segments(x,y,x,y+yse,col="blue")
# adds some flags indicating the standard error
arrows(x,y,x,y-yse,angle=90,length=0.1,col="darkgreen")
This gives you :
Have a look at par
> ?par
> plot(rnorm(100))
> par(new=T)
> plot(rnorm(100), col="red")
ggplot2 is a great package for this sort of thing:
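install.packages(c('ggplot2', 'reshape2'))
library(ggplot2)
library(reshape2) # provides melt()
x <- 1:10
y1 <- x^2
y2 <- x^3
df <- data.frame(x = x, curve1 = y1, curve2 = y2)
# reshape2 uses variable.name (variable_name belonged to the older reshape package)
df.m <- melt(df, id.vars = 'x', variable.name = 'curve')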
# now df.m is a data frame with columns 'x', 'curve', 'value'
ggplot(df.m, aes(x,value)) + geom_line(aes(colour = curve)) +
geom_point(aes(shape=curve))
You get the plot coloured by curve, with different point marks for each curve and a nice legend, all without any additional work:
Draw multiple curves at the same time with the matplot function. Do help(matplot) for more.
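A short sketch of that approach, using the two curves from the earlier answers (the column names and colours are arbitrary):

```r
x  <- 1:10
# One column per curve; matplot() draws each column against x
ys <- cbind(square = x^2, cube = x^3)

matplot(x, ys, type = "l", lty = 1, col = c("black", "red"), ylab = "y")
legend("topleft", legend = colnames(ys), lty = 1, col = c("black", "red"))
```

matplot() chooses the y limits from the whole matrix, so neither curve is clipped.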

Resources