I am drawing a scatter plot for a high density of points. I used the hexbin package and plotted the data successfully, but the colours are not pretty and I have been asked to follow a standard colour scheme. Is this supported in R? The image shows my output (right) and the wanted colours (left).
Example:
library(hexbin)
x <- rnorm(1000)
y <- rnorm(1000)
bin<-hexbin(x,y, xbins=50)
plot(bin, main="Hexagonal Binning")
Using the example on the package help page for hexbin, you can get close using rainbow and playing with the colorcut argument, like so...
library(hexbin)
x <- rnorm(10000)
y <- rnorm(10000)
(bin <- hexbin(x, y))
plot(hexbin(x, y + x*(x+1)/4),main = "Example" ,
colorcut = seq(0,1,length.out=64),
colramp = function(n) rev(rainbow(64)),
legend = 0 )
You will need to play with the legend specification etc to get exactly what you want.
Alternative colour palette suggested by @Roland:
## nicer colour palette
cols <- colorRampPalette(c("darkorchid4","darkblue","green","yellow", "red") )
plot(hexbin(x, y + x*(x+1)/4), main = "Example" ,
colorcut = seq(0,1,length.out=24),
colramp = function(n) cols(24) ,
legend = 0 )
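In case it is unclear what colorRampPalette is doing there: it returns a function, and calling that function with n gives n interpolated colours. Here is a small base-R sketch of just that step (the names pal_fun and pal are only for illustration, and the bar plot is simply a quick way to eyeball the ramp):

```r
# colorRampPalette() returns a *function*; call it with the number of colours wanted
pal_fun <- colorRampPalette(c("darkorchid4", "darkblue", "green", "yellow", "red"))
pal <- pal_fun(24)   # character vector of 24 hex colours

# quick visual check of the ramp, no hexbin needed
barplot(rep(1, length(pal)), col = pal, border = NA, space = 0,
        axes = FALSE, main = "24-step palette")
```

This is why colramp can be given as function(n) cols(24): hexbin only needs something that hands back a vector of colours.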
I have a code similar to this:
x <- rnorm(100)
y <- density(x, n = 1000)
plot(y)
polygon(y,col="red")
However, I would also like to add a colour gradient to the density plot in base R, particularly using a palette such as Spectral, running from blue to red. In this way, the output would look like this:
I appreciate any help! Thanks!
You could use segments() to add countless line segments with gradient colours.
x <- rnorm(100)
dens <- density(x, n = 1000)
plot(dens)
segments(dens$x, 0, dens$x, dens$y, col = hcl.colors(1000, "Spectral", rev = TRUE))
polygon(dens)
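The same segments() trick also works if you want the colour to follow the height of the density rather than the position along x. A rough sketch, assuming 100 colour steps are enough (pal and idx are names I made up for this example):

```r
set.seed(1)
x <- rnorm(100)
dens <- density(x, n = 1000)

pal <- hcl.colors(100, "Spectral", rev = TRUE)     # 100 colours from the Spectral palette
idx <- cut(dens$y, breaks = 100, labels = FALSE)   # bin each height into 1..100

plot(dens, main = "Colour by density height")
segments(dens$x, 0, dens$x, dens$y, col = pal[idx])  # one coloured segment per x value
lines(dens)
```

Here every vertical segment picks its colour from the bin its height falls into, so the peak and the tails end up at opposite ends of the palette.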
I want to plot a 3D plot using R. My data set is independent, which means the values of x, y, and z are not dependent on each other. The plot I want is given in this picture:
This plot was drawn by someone using MATLAB. How can I make the same kind of plot using R?
Since you posted your image file, it appears you are not trying to make a 3d scatterplot, rather a 2d scatterplot with a continuous color scale to indicate the value of a third variable.
Option 1: For this approach I would use ggplot2
library(ggplot2)
# make data
mydata <- data.frame(x = rnorm(100, 10, 3),
y = rnorm(100, 5, 10),
z = rpois(100, 20))
ggplot(mydata, aes(x,y)) + geom_point(aes(color = z)) + theme_bw()
Which produces:
Option 2: To make a 3d scatterplot, use the cloud function from the lattice package.
library(lattice)
# make some data
x <- runif(20)
y <- rnorm(20)
z <- rpois(20, 5) / 5
cloud(z ~ x * y)
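If you would rather avoid extra packages, the colour-for-a-third-variable idea from Option 1 can also be approximated in base graphics. This is only a sketch: the three-colour ramp and the 50 bins are arbitrary choices of mine, and it draws no colour legend.

```r
set.seed(1)
mydata <- data.frame(x = rnorm(100, 10, 3),
                     y = rnorm(100, 5, 10),
                     z = rpois(100, 20))

pal <- colorRampPalette(c("blue", "yellow", "red"))(50)

# split the range of z into 50 equal-width bins and look up each point's bin
brks <- seq(min(mydata$z), max(mydata$z), length.out = 51)
bin  <- findInterval(mydata$z, brks, all.inside = TRUE)

plot(mydata$x, mydata$y, pch = 19, col = pal[bin],
     xlab = "x", ylab = "y", main = "Colour shows z")
```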
I usually do these kinds of plots with the base plotting functions and some helper functions for the color levels and color legend from the sinkr package (you need the devtools package to install it from GitHub).
Example:
#library(devtools)
#install_github("marchtaylor/sinkr")
library(sinkr)
# example data
grd <- expand.grid(
x=seq(nrow(volcano)),
y=seq(ncol(volcano))
)
grd$z <- c(volcano)
# plot
COL <- val2col(grd$z, col=jetPal(100))
op <- par(no.readonly = TRUE)
layout(matrix(1:2,1,2), widths=c(4,1), heights=4)
par(mar=c(4,4,1,1))
plot(grd$x, grd$y, col=COL, pch=20)
par(mar=c(4,1,1,4))
imageScale(grd$z, col=jetPal(100), axis.pos=4)
mtext("z", side=4, line=3)
par(op)
Result:
I am trying to plot 4 ecdf functions on one plot but can't seem to figure out the proper syntax.
If I have 4 functions "A, B, C, D", what would be the proper syntax in R to plot them on the same chart with different colors? Thanks!
Here is one way (for three of them, works for four the same way):
set.seed(42)
ecdf1 <- ecdf(rnorm(100)*0.5)
ecdf2 <- ecdf(rnorm(100)*1.0)
ecdf3 <- ecdf(rnorm(100)*2.0)
plot(ecdf3, verticals=TRUE, do.points=FALSE)
plot(ecdf2, verticals=TRUE, do.points=FALSE, add=TRUE, col='brown')
plot(ecdf1, verticals=TRUE, do.points=FALSE, add=TRUE, col='orange')
Note that I rely on the fact that the third has the widest range, and use it to initialize the canvas. Otherwise you need xlim=c(...).
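For four functions the same idea can be written as a loop, computing a common x range up front instead of relying on plotting the widest one first. A sketch with made-up samples standing in for A, B, C and D (the colours and the legend position are arbitrary):

```r
set.seed(42)
samples <- list(A = rnorm(100) * 0.5, B = rnorm(100),
                C = rnorm(100) * 2,   D = rnorm(100) * 1.5)
ecdfs <- lapply(samples, ecdf)
cols  <- c("black", "brown", "orange", "steelblue")

xr <- range(unlist(samples))   # common x range so no curve is clipped

# first call sets up the canvas, the rest are added on top
plot(ecdfs[[1]], verticals = TRUE, do.points = FALSE,
     xlim = xr, col = cols[1], main = "Four ECDFs")
for (i in 2:4) {
  plot(ecdfs[[i]], verticals = TRUE, do.points = FALSE,
       add = TRUE, col = cols[i])
}
legend("topleft", legend = names(ecdfs), col = cols, lty = 1, bty = "n")
```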
The package latticeExtra provides the function ecdfplot.
library(lattice)
library(latticeExtra)
set.seed(42)
vals <- data.frame(r1=rnorm(100)*0.5,
r2=rnorm(100),
r3=rnorm(100)*2)
ecdfplot(~ r1 + r2 + r3, data=vals, auto.key=list(space='right'))
Here is an approach using ggplot2 (using the ecdf objects from [Dirk's answer](https://stackoverflow.com/a/20601807/1385941)):
library(ggplot2)
# create a data set containing the range you wish to use
d <- data.frame(x = c(-6,6))
# create a list of calls to `stat_function` with the colours you wish to use
ll <- Map(f = stat_function, colour = c('red', 'green', 'blue'),
fun = list(ecdf1, ecdf2, ecdf3), geom = 'step')
ggplot(data = d, aes(x = x)) + ll
A simpler way is to use ggplot and have the variable that you want to plot as a factor. In the example below, I have Portfolio as a factor and plot the distribution of Interest Rates by Portfolio.
# select a palette
myPal <- c( 'royalblue4', 'lightsteelblue1', 'sienna1')
# plot the Interest Rate distribution of each portfolio
# make an ecdf of each category in Portfolio which is a factor
g2 <- ggplot(mortgage, aes(x = Interest_Rate, color = Portfolio)) +
scale_color_manual(values = myPal) +
stat_ecdf(lwd = 1.25, geom = "line")
g2
You can also set geom = "step", geom = "point" and adjust the line width lwd in the stat_ecdf() function. This gives you a nice plot with the legend.
I have following data and plot:
pos <- rep(1:2000, 20)
xv =c(rep(1:20, each = 2000))
# colrs <- unique(xv)
colrs <- xv # edits
yv =rnorm(2000*20, 0.5, 0.1)
xv = lapply(unique(xv), function(x) pos[xv==x])
to.add = cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
plot (bp,yv, pch = "*", col = colrs)
I have a few issues with this plot that I could not figure out.
(1) I want to use a different color for each group (i.e. xv), or two alternating colors for the groups, but when I tried the col argument the result turned out to be a mixture of colors. I also need to highlight some points (for example, bp 4000 to 4500 in blue).
(2) Instead of the bp positions, I want to put a tick mark at each group and label it with the group.
Thank you, I appreciate your help.
Edits: with the help of the following answer (using a slightly different approach, so that it also works when the groups have unequal numbers of points) I could get a similar plot. One question about colors remains: what if I want to use two alternating colors for alternate groups?
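You can solve your colour issue by repeating the colour index as many times as each group has points plotted (this assumes colrs holds one index per group, i.e. colrs <- unique(xv) as in the commented-out line of the question), like so: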
plot (bp,yv, pch = "*", col = rep(colrs,each=2000))
The default colour palette (see ?palette or palette() ) will wrap around itself and you might want to specify your own to get 20 distinct colours.
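To make that concrete, here is a sketch of both options: 20 distinct colours, or just two colours alternating between neighbouring groups, plus a highlighted range. I am assuming 20 balanced groups of 2000 points as in the question; group, bp_demo and yv_demo are stand-ins I made up, not the vectors from the question.

```r
n_groups <- 20
group <- rep(seq_len(n_groups), each = 2000)   # one group id per point

# 20 distinct colours instead of the recycled default palette
cols20 <- rainbow(n_groups)[group]

# or two colours, alternating between neighbouring groups
cols2 <- c("grey30", "grey70")[(group %% 2) + 1]

# highlight a range of positions afterwards, e.g. 4000 to 4500 in blue
bp_demo <- seq_along(group)                    # stand-in for the real bp vector
cols2[bp_demo >= 4000 & bp_demo <= 4500] <- "blue"

yv_demo <- rnorm(length(group), 0.5, 0.1)      # stand-in for yv
plot(bp_demo, yv_demo, pch = "*", col = cols2)
```

Because the colour vector has one entry per point, this keeps working when the groups are unbalanced, as long as group lines up with the plotted points.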
To relabel the x axis, try plotting without the axis and then specifying the points and labels manually.
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20)
If you are trying to squeeze a lot of labels in there, you might have to shrink the text (cex.axis) or rotate the labels 90 degrees (las=2).
plot (bp,yv, pch = "*", col = rep(colrs,each=2000),xaxt="n")
axis(1,at=seq(1000,58000,3000),labels=1:20,cex.axis=0.7,las=2)
Result:
One way is you could use a nested ifelse.
I'm still learning R, but one way it could be done would look something like:
plot(whatev$x, whatev$y, col=ifelse(whatev$x<2000,"red",ifelse(whatev$x<4000,"yellow","blue")))
You could nest as many of these as you want to have specificity on the colors and the intervals. The ifelse command is of form ifelse(TEST, True, False).
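If the nesting gets deep, cut() gives roughly the same kind of split in one call. A small sketch with made-up positions (the break points 2000 and 4000 mirror the ifelse example above; everything else is illustrative):

```r
x <- c(500, 1999, 2000, 3500, 4000, 6000)   # made-up positions

# each interval is closed on the left and open on the right: [lo, hi)
cols <- as.character(cut(x,
                         breaks = c(-Inf, 2000, 4000, Inf),
                         labels = c("red", "yellow", "blue"),
                         right = FALSE))
cols
```

The resulting character vector can be passed straight to the col argument of plot().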
A simpler way would be to use the unique groups in xv to assign rainbow colors.
colrs=rainbow(length(unique(xv))) #Or colrs=rainbow(length(xv)) if xv is unique.
plot(whatev$x, whatev$y, col=colrs[xv]) # index by group (xv as integer group ids) so each point gets its group's colour
I hope I got all that right. I'm still learning R myself.
I'm going to go out on a limb and guess that your real data are something like 2000 values of things from 20 different groups. For instance, heights of 2000 plants of 20 different species. In such a case, you might want to look at the dotplot() function (or as illustrated below, dotplot.table()) in the lattice package.
Generate matrix of hypothetical values:
set.seed(1)
myY <- sapply( seq_len(20), function(x) rnorm(2000, x^(1/3)))
Transpose matrix to get groups as rows
myY <- t(myY)
Provide names of groups to matrix:
dimnames(myY)[[1]]<-paste("group", seq_len(nrow(myY)))
Load lattice package
library(lattice)
Generate dotplot
dotplot(myY, horizontal = FALSE, panel = function(x, y, horizontal, ...) {
panel.dotplot(x = x, y = y, horizontal = horizontal, jitter.x = TRUE,
col = seq_len(20)[x], pch = "*", cex = 1.5)
}, scales = list(x = list(rot = 90))
)
Which looks like (with unfortunate y-axis labeling):
Seeing that @JohnCLK is requesting a way of colouring by values on the x axis, I tried these demos in ggplot2; each uses a dummy variable that is coded based on the values or ranges to be highlighted in the other variables.
So, first set up the data, as in the question:
pos <- rep(1:2000, 20)
xv <- c(rep(1:20, each = 2000))
yv <- rnorm(2000*20, 0.5, 0.1)
xv <- lapply(unique(xv), function(x) pos[xv==x])
to.add <- cumsum(sapply(xv, max) + 1000)
bp <- c(xv[[1]], unlist(lapply(2:length(xv), function(x) xv[[x]] + to.add[x-1])))
Then load ggplot2, prepare a couple of utility functions, and set the default theme:
library("ggplot2")
make.png <- function(p, fName) {
png(fName, width=640, height=480, units="px")
print(p)
dev.off()
}
make.plot <- function(df) {
p <- ggplot(df,
aes(x = bp,
y = yv,
colour = highlight))
p <- p + geom_point()
p <- p + theme(legend.position = "none")  # theme() replaces the old opts()
return(p)
}
theme_set( theme_bw() )
Draw a plot which highlights values in a defined range on the vertical axis:
# highlight a horizontal band
df <- data.frame(cbind(bp, yv))
df$highlight <- 0
df$highlight[ df$yv >= 0.4 & df$yv < 0.45 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_horizontal.png")
Next draw a plot which highlights values in a defined range on the x axis, a vertical band:
# highlight a vertical band
df$highlight <- 0
df$highlight[ df$bp >= 38000 & df$bp < 42000 ] <- 1
p <- make.plot(df)
print(p)
make.png(p, "demo_vertical.png")
And finally draw a plot which highlights alternating vertical bands, by x value:
# highlight alternating bands
library("gtools")
alt.band.width <- 2000
df$highlight <- as.integer(df$bp / alt.band.width)
df$highlight <- ifelse(odd(df$highlight), 1, 0)
p <- make.plot(df)
print(p)
make.png(p, "demo_alternating.png")
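If you would rather not load gtools just for odd(), the alternating-band code can be sketched with integer division and modulo from base R. The positions below are made up purely to show which band each lands in:

```r
alt.band.width <- 2000
bp_demo <- c(0, 1999, 2000, 3999, 4000, 6500)   # made-up positions

# integer-divide into bands, then take the band number modulo 2:
# even-numbered bands give 0, odd-numbered bands give 1
highlight <- (bp_demo %/% alt.band.width) %% 2
highlight
```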
Hope this helps; it was good practice anyway.
I want to compare two curves. Is it possible in R to draw a plot and then draw another plot over it? How?
Thanks.
With base R, you can plot one curve and then add the second curve with the lines() function. Here's a quick example:
x <- 1:10
y <- x^2
y2 <- x^3
plot(x,y, type = "l")
lines(x, y2, col = "red")
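One thing to watch: plot() sets the axis limits from the first curve only, so a second curve with a larger range runs off the top. A variation that sets ylim from both curves first (the legend labels are just my choice):

```r
x  <- 1:10
y  <- x^2
y2 <- x^3

yl <- range(y, y2)   # limits that span both curves
plot(x, y, type = "l", ylim = yl, ylab = "value")
lines(x, y2, col = "red")
legend("topleft", legend = c("x^2", "x^3"),
       col = c("black", "red"), lty = 1, bty = "n")
```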
Alternatively, if you wanted to use ggplot2, here are two methods - one plots different colors on the same plot, and the other generates separate plots for each variable. The trick here is to "melt" the data into long format first.
library(ggplot2)
library(reshape2)  # provides melt()
df <- data.frame(x, y, y2)
df.m <- melt(df, id.vars = "x")
qplot(x, value, data = df.m, colour = variable, geom = "line")
qplot(x, value, data = df.m, geom = "line")+ facet_wrap(~ variable)
Using lattice package:
require(lattice)
x <- seq(-3,3,length.out=101)
xyplot(dnorm(x) + sin(x) + cos(x) ~ x, type = "l")
There have been some solutions already. If you stay with the base package, you should get acquainted with the functions plot(), lines(), abline(), points(), polygon(), segments(), rect(), box(), arrows(), ... Take a look at their help files.
You should see a plot from the base package as a pane with the coordinates you gave it. On that pane, you can draw a whole set of objects with the functions mentioned above. They allow you to construct a graph as you want. Remember though that, unless you play with the par settings like Dr. G showed, every call to plot() gives you a new pane. Also take into account that things can be plotted over other things, so think about the order in which you draw them.
See eg:
set.seed(100)
x <- 1:10
y <- x^2
y2 <- x^3
yse <- abs(runif(10,2,4))
plot(x,y, type = "n") # type="n" only plots the pane, no curves or points.
# plots the area between both curves
polygon(c(x,sort(x,decreasing=T)),c(y,sort(y2,decreasing=T)),col="grey")
# plot both curves
lines(x,y,col="purple")
lines(x, y2, col = "red")
# add the points to the first curve
points(x, y, col = "black")
# adds some lines indicating the standard error
segments(x,y,x,y+yse,col="blue")
# adds some flags indicating the standard error
arrows(x,y,x,y-yse,angle=90,length=0.1,col="darkgreen")
This gives you :
Have a look at par
> ?par
> plot(rnorm(100))
> par(new=T)
> plot(rnorm(100), col="red")
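Be aware that with par(new=TRUE) each plot() call still works out its own axis limits, so the two layers are not on the same scale unless you fix them yourself. A sketch that shares the y limits and suppresses the second set of axes (a, b and yl are names made up for this example):

```r
set.seed(1)
a <- rnorm(100)
b <- rnorm(100)
yl <- range(a, b)   # shared y limits for both layers

plot(a, ylim = yl, ylab = "value")
par(new = TRUE)     # the next plot() draws on the same page
plot(b, ylim = yl, col = "red", axes = FALSE, xlab = "", ylab = "")
```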
ggplot2 is a great package for this sort of thing:
install.packages(c('ggplot2', 'reshape2'))
library(ggplot2)
library(reshape2)  # provides melt()
x <- 1:10
y1 <- x^2
y2 <- x^3
df <- data.frame(x = x, curve1 = y1, curve2 = y2)
df.m <- melt(df, id.vars = 'x', variable.name = 'curve')
# now df.m is a data frame with columns 'x', 'curve', 'value'
ggplot(df.m, aes(x,value)) + geom_line(aes(colour = curve)) +
geom_point(aes(shape=curve))
You get the plot coloured by curve, with different point marks for each curve and a nice legend, all painlessly without any additional work:
Draw multiple curves at the same time with the matplot function. Do help(matplot) for more.
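A minimal sketch of what that could look like for two curves, with one column of the matrix per curve (the column names and colours are my own choices):

```r
x  <- 1:10
ys <- cbind(square = x^2, cube = x^3)   # one column per curve

# matplot() picks axis limits that cover all columns
matplot(x, ys, type = "l", lty = 1, col = c("black", "red"),
        xlab = "x", ylab = "value")
legend("topleft", legend = colnames(ys),
       col = c("black", "red"), lty = 1, bty = "n")
```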