par(mfrow=c(1,2))
Trigen <- data.frame(OTriathlon$Gender,OTriathlon$Swim,OTriathlon$Bike,OTriathlon$Run)
colnames(Trigen) <- c("Gender","Swim","Bike","Run")
res <- split(Trigen[,2:4],Trigen$Gender)
pairs(res$Male, pch="M", col = 4)
points(res$Female, pch ="F", col= 2)
Basically, I want to customize the pairs plot so that the plot symbol and color of each data point represent gender.
I tried a few things in the code above, but I can't add the female points to the existing plot. After I run the points call, the plot stays the same and doesn't get updated.
There is no need to call points several times, because you can use the factor directly as a color. Example:
plot(iris[,c(2,3)], col=iris$Species)
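The same idea should carry over to pairs, which accepts per-point col and pch vectors. As far as I can tell, a later points call has no visible effect because pairs draws its own grid of panels and restores the layout when it finishes. A minimal sketch, using the built-in iris data as a stand-in for the triathlon data (Species plays the role of Gender; the colors and symbols are arbitrary choices):

```r
# Stand-in data: iris, with Species in the role of Gender
grp  <- iris$Species
cols <- c("blue", "red", "darkgreen")[as.integer(grp)]  # one color per level
pchs <- c(16, 17, 15)[as.integer(grp)]                  # one symbol per level

# Every panel uses the same per-point color and symbol vectors
pairs(iris[, 1:4], col = cols, pch = pchs)
```

With a two-level factor such as Gender, single characters work as symbols too, for example pch = c("F", "M")[as.integer(factor(Trigen$Gender))], provided the order matches the factor levels.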
I'm learning R, and want to draw a scatterplot of a large dataframe (~55000 rows). I'm using the scatterplot in car:
library(car)
d=read.csv("patches.csv", header=T)
scatterplot(energy ~ homogenity | label, data=d,
ylab="energy", xlab="homogenity ",
main="Scatter Plot",
labels=row.names(d))
where patches.csv contains the dataframe (below)
I want to show the two label sets differently. With a large volume of data, the plot is very dense, so I get the result below right (mostly red data visible). The image takes a while to render, so I can see the black labelled data fleetingly (below left) before it gets hidden in the final diagram.
Can I control R to plot the data with red first, or is there a better way to achieve my goal?
Here's a sample of my data:
label,channel,x,y,contrast,energy,entropy,homogenity
1,21,460,76,0.991667,0.640399,0.421422,0.939831
1,22,460,76,0.0833333,0.62375,0.364379,0.969445
1,23,460,76,0.129167,0.422908,0.589938,0.935417
1,24,460,76,0,1,0,1
1,25,460,76,0,1,0,1
1,26,460,76,0.0875,0.789627,0.253649,0.967361
1,27,460,76,2.4,0.528516,0.700859,0.845558
1,28,460,76,0.120833,0.562066,0.392998,0.945139
1,29,460,76,0.0125,0.975234,0.0329461,0.99375
1,30,460,76,0,1,0,1
1,31,460,76,0.1625,0.384662,0.5859,0.929861
0,0,483,82,0.404167,0.309505,0.61573,0.947222
0,1,483,82,0.0166667,0.728559,0.221967,0.991667
0,2,483,82,0,1,0,1
0,3,483,82,0.416667,0.327083,0.644057,0.940972
0,4,483,82,0.0208333,0.919054,0.0940364,0.989583
0,5,483,82,0.416667,0.327083,0.644057,0.940972
0,6,483,82,0,1,0,1
0,7,483,82,0.0333333,0.794479,0.192471,0.983333
0,8,483,82,0,1,0,1
0,9,483,82,0,1,0,1
0,10,483,82,0.0208333,0.958984,0.0502502,0.989583
If you want to change the order of the coloring, pass col=2:1 to scatterplot; you would then be plotting red before black. You can also use the alpha function from the scales package to make your points translucent (it takes a vector of colors and a vector of alpha values, so each color can have its own opacity).
## More data
n <- 10000
x <- rnorm(n, 0.85, sd=0.15)
lab <- rep(1:2, length.out=n) # two alternating groups
d <- data.frame(homogeneity=x,
label=factor(lab),
energy=rnorm(n, lab^1.8*x^2-lab, sd=abs(x))) # abs() guards against a negative sd
library(car)
library(scales) # for alpha
opacity <- c(0.3, 0.1) # opacity for each color
col <- 1:2 # black then red
scatterplot(energy ~ homogeneity | label, data=d,
ylab="energy", xlab="homogeneity",
main=paste0(palette()[col], "(", opacity, ")", collapse=","),
col=alpha(col, opacity),
labels=row.names(d))
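If the goal is only to control which group ends up on top, base graphics give direct control over drawing order: whatever is drawn last lies on top. A rough sketch with simulated data (the column names follow the question; the data, colors and opacity values are made up for illustration):

```r
set.seed(1)
n <- 10000
d <- data.frame(label = rep(0:1, length.out = n),
                homogenity = rnorm(n, 0.85, 0.15))
d$energy <- rnorm(n, d$homogenity^2, 0.3)

first  <- d[d$label == 1, ]  # drawn first, so it ends up underneath
second <- d[d$label == 0, ]  # drawn last, so it ends up on top

# Set the axis limits from the full data so both groups fit
plot(energy ~ homogenity, data = first, pch = 16, cex = 0.5,
     col = adjustcolor("red", alpha.f = 0.2),
     xlim = range(d$homogenity), ylim = range(d$energy))
points(energy ~ homogenity, data = second, pch = 16, cex = 0.5,
       col = adjustcolor("black", alpha.f = 0.2))
```

adjustcolor comes with base R (grDevices), so this variant does not need the scales package.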
Similar to what bunk said with alpha,
If you have lots of points, the actual identification of individual points is no longer meaningful. Instead, you probably want a representation of the density. For that use smoothScatter(x,y) and overlay highlighted points with the usual points(morex,morey). You obviously know how to use points (same parameters as plot) so it's very easy for you to implement, and requires very little extra knowledge on your part.
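A minimal sketch of that approach, with simulated data standing in for patches.csv (the column names follow the question; the choice of 200 highlighted points is arbitrary):

```r
set.seed(2)
n <- 5000
d <- data.frame(label = rep(0:1, length.out = n),
                homogenity = rnorm(n, 0.85, 0.15))
d$energy <- rnorm(n, d$homogenity^2, 0.2)

# Density representation of all points
smoothScatter(d$homogenity, d$energy, xlab = "homogenity", ylab = "energy")

# Overlay a subset to highlight, here the first 200 rows with label 1
hl <- head(d[d$label == 1, ], 200)
points(hl$homogenity, hl$energy, pch = 16, cex = 0.4, col = "red")
```

smoothScatter relies on the KernSmooth package, which ships with standard R installations.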
I've run a 2D simulation in some modelling software, from which I've got an export of x,y point locations with a set of 6 attributes. I wish to recreate a figure that combines the data, like this:
The ellipses and the background are shaded according to attribute 1 (and the borders of these are of course representing the model geometry, but I don't think I can replicate that), the isolines are contours of attribute 2, and the arrow glyphs are from attributes 3 (x magnitude) and 4 (y magnitude).
The x,y points are centres of the triangulated mesh I think, and look like this:
I want to know how I can recreate a plot like this with R. To start with, I have irregularly spaced data because it was exported from an irregular mesh. That's immediately where I get stuck with R, having only ever used it for producing box-and-whisker plots and the like.
Here's the data:
https://dl.dropbox.com/u/22417033/Ellipses_noheader.txt
Edit: fields: x, y, heat flux (x), heat flux (y), thermal conductivity, Temperature, gradT (x), gradT (y).
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
It's quite easy to make the lower plot, assuming there is a dataframe named 'edat' read in with:
edat <- read.table(file=file.choose())
with(edat, plot(V1,V2, cex=0.2))
Things get a bit more beautiful with:
with(edat, plot(V1,V2, cex=0.2, col=V5))
So I do not think your original is being faithfully represented by the data. The contour lines are NOT straight across the "conductors". I call them "conductors" because this looks somewhat like iso-potential lines in electrostatics. I'm adding some text here to serve as a search handle for others who might be searching for plotting problems in real world physics: vector-field (the arrows) , heat equations, gradient, potential lines.
You can then overlay the vector field with:
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
You could "zoom in" with xlim and ylim:
with(edat, plot(V1,V2, cex=0.3, col=V5, xlim=c(0, 10000), ylim=c(-8000, -2000) ))
with(edat, arrows(V1,V2, V1-20*V6*V7, V2-20*V6*V8, length=0.04, col="orange") )
I am guessing that the contour requested is for the Temps variable. Take your pick of contour plots.
require(akima)
# edat still has the default column names: V1 = x, V2 = y, V6 = Temps
intflow<- with(edat, interp(x=V1, y=V2, z=V6, xo=seq(min(V1), max(V1), length = 410),
yo=seq(min(V2), max(V2), length = 410), duplicate="mean", linear=FALSE) )
require(lattice)
contourplot(intflow$z)
filled.contour(intflow)
with( intflow, contour(x=x, y=y, z=z) )
The last one will mix with the other plotting examples since those were using base plotting functions. You may need to switch to points instead of plot.
There are several parts to your plot so you will probably need several tools to make the different parts.
The background and ellipses can be created with polygon (once you figure where they should be).
The contourLines function can calculate the contour lines for you, which you can add with the lines function (or contour has an add argument and could probably be used to add the lines directly).
The akima package has a function interp which can estimate values on a grid given the values ungridded.
The my.symbols function along with ms.arrows, both from the TeachingDemos package, can be used to draw the vector field.
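A small base-R sketch of the polygon and contourLines parts, using the built-in volcano matrix as a stand-in for a surface that has already been interpolated onto a grid (the ellipse position and size are arbitrary):

```r
# Stand-in for an interpolated surface on a regular grid
z <- volcano
x <- seq_len(nrow(z))
y <- seq_len(ncol(z))

# Empty plot region covering the grid
plot(range(x), range(y), type = "n", xlab = "x", ylab = "y")

# A shaded ellipse drawn as a polygon
theta <- seq(0, 2 * pi, length.out = 100)
polygon(40 + 15 * cos(theta), 30 + 10 * sin(theta),
        col = "grey80", border = "grey40")

# Compute the contour lines, then add each one with lines()
cl <- contourLines(x, y, z, nlevels = 8)
for (l in cl) lines(l$x, l$y, col = "blue")
```

Each element of cl is a list with components level, x and y, so the lines can also be colored by level if needed.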
@DWin is right to say that your graph doesn't faithfully represent your data, so I would advise following his answer. However, here is how to reproduce your graph (as closely as I could):
Ellipses <- read.table(file.choose())
names(Ellipses) <- c('x','y','dfluxx','dfluxy','kxx','Temps','gradTx','gradTy')
require(splancs)
require(akima)
First preparing the data:
#First the background layer (the 'kxx' layer):
# Here the regular grid on which we're gonna do the interpolation
E.grid <- with(Ellipses,
expand.grid(seq(min(x),max(x),length=200),
seq(min(y),max(y),length=200)))
names(E.grid) <- c("x","y") # Without this step, function inout throws an error
E.grid$Value <- rep(0,nrow(E.grid))
#Split the dataset according to unique values of kxx
E.k <- split(Ellipses,Ellipses$kxx)
# Find the convex hull delimiting each of those values domain
E.k.ch <- lapply(E.k,function(X){X[chull(X$x,X$y),]})
for(i in unique(Ellipses$kxx)){ # Pick the value for each coordinate in our regular grid
E.grid$Value[inout(E.grid[,1:2],E.k.ch[names(E.k.ch)==i][[1]],bound=TRUE)]<-i
}
# Then the regular grid for the second layer (Temp)
T.grid <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=200),
yo=seq(min(y),max(y),length=200),
duplicate="mean", linear=FALSE))
# The regular grids for the arrow layer (gradT)
dx <- with(Ellipses,
interp(x,y,gradTx,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
dy <- with(Ellipses,
interp(x,y,gradTy,xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
T.grid2 <- with(Ellipses,
interp(x,y,Temps, xo=seq(min(x),max(x),length=15),
yo=seq(min(y),max(y),length=10),
duplicate="mean", linear=FALSE))
gradTgrid<-expand.grid(dx$x,dx$y)
And then the plotting:
palette(grey(seq(0.5,0.9,length=5)))
par(mar=rep(0,4))
plot(E.grid$x, E.grid$y, col=E.grid$Value,
axes=F, xaxs="i", yaxs="i", pch=19)
contour(T.grid, add=TRUE, col=colorRampPalette(c("blue","red"))(15), drawlabels=FALSE)
arrows(gradTgrid[,1], gradTgrid[,2], # Here I multiply the values so you can see them
gradTgrid[,1]-dx$z*40*T.grid2$z, gradTgrid[,2]-dy$z*40*T.grid2$z,
col="yellow", length=0.05)
To understand in detail how this code works, I advise you to read the following help pages: ?inout, ?chull, ?interp, ?expand.grid and ?contour.
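As a quick illustration of the convex hull step on its own, here is a minimal sketch with random points (not the Ellipses data). chull returns the row indices of the hull vertices in order, which is why it can be used directly to subset the data frame:

```r
set.seed(3)
pts <- data.frame(x = rnorm(200), y = rnorm(200, sd = 0.5))

idx  <- chull(pts$x, pts$y)  # indices of the hull vertices, in order
hull <- pts[idx, ]           # same subsetting pattern as X[chull(X$x, X$y), ]

plot(pts, pch = 16, cex = 0.4, col = "grey40")
polygon(hull$x, hull$y, border = "red")  # outline of the hull
```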
I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like
Let's say I have an array x1 = c(2,13,4) and y2 = c(5,23,43). I want to create 3 blocks with heights running from 2-5, 13-23...
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at?
I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the data you showed us is overlapping. This means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is usually used to plot the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I chart an empty plot first and I add three layers of rectangles, one for beige, one for gray and one for red.
#Create data.frame
n=100
df <-data.frame(1:n,runif(n)*10,60+runif(n)*10,25+runif(n)*10,40+runif(n)*10,35-runif(n)*10,35+runif(n)*10)
colnames(df) <-c("id","beige.min","beige.max","gray.min","gray.max","red.min","red.max")
#Create chart
plot(x=df$id,y=NULL,ylim=range(df[,-1]), type="n") #blank chart, ylim is the range of the data
rect(df$id-0.5,df[,2],df$id+0.5,df[,3],col="beige", border=FALSE) #first layer
rect(df$id-0.5,df[,4],df$id+0.5,df[,5],col="gray", border=FALSE) #second layer
rect(df$id-0.5,df[,6],df$id+0.5,df[,7],col="darkred", border=FALSE) #third layer
It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))
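For comparison, a base-graphics sketch of the same three blocks with rect, assuming each block simply spans x1 to y2 (the bar width, color and axis labels are arbitrary choices):

```r
x1 <- c(2, 13, 4)
y2 <- c(5, 23, 43)
id <- seq_along(x1)

# Empty plot sized to hold all blocks
plot(NULL, xlim = c(0.5, length(id) + 0.5), ylim = range(x1, y2),
     xlab = "id", ylab = "value", xaxt = "n")
axis(1, at = id)

# One rectangle per id, running from x1 (bottom) to y2 (top)
rect(id - 0.4, x1, id + 0.4, y2, col = "grey60", border = NA)
```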
I'm trying to combine, that is to say overlay, a barchart with an xyplot (with regression line) for two variables whose values are quite different.
Here's my data: https://www.dropbox.com/s/aacbkmo577uagjs/example.csv
There are the two numeric variables "rb" and "rae" and several factor variables (sample.size, effect.size, allocation.design, true.dose) that are to be displayed in panels according to the code below. The variable "rae" should be displayed in a barchart (ideally in faint colors in the background), whereas the variable "rb" is to be displayed in a xyplot with a regression line. There are two main questions:
(1) How to combine / overlay both types of graphs?
(2) How to customize axes labels (different scales for y-scale)
For (1), I know how to combine different types of graphs with ggplot2, but it should be also possible with lattice, am I right? I tried "doubleYScale", which doesn't seem to work.
For (2), I only managed to use "relation='free'" for the y-scale in the "scales" option (see code). This is nice since the focus is on the important range of the values. However, it would be more appropriate if axis labels were additionally drawn outside on the left and right (for "rae" and "rb", respectively).
Here's the code so far (modified by Dieter Menne to be self-contained)
library(lattice)
library(latticeExtra)
df.dose <- read.table("example.csv", header=TRUE, sep=",")
df.dose <- transform(df.dose,
sample.size=as.factor(sample.size),
true.dose = as.factor(true.dose))
rae.plot <- xyplot(
rae ~ sample.size | allocation.design*true.dose,
df.dose, as.table=TRUE,
groups = type,
lty = 1, jitter.x=TRUE,
main="RAE",
scales=list(y=list(draw=F, relation="free", tck=.5)),
panel = function(x,y) {
panel.xyplot(x,y,jitter.x=TRUE)
panel.lmline(x,y, col="darkgrey", lwd=1.5)
})
useOuterStrips(rae.plot)
rb.plot <- barchart(
rb ~ sample.size | allocation.design*as.factor(true.dose),
df.dose, as.table=TRUE,
groups = type,
key=list(
text=list(levels(as.factor(df.dose$type)))),
scales=list(y=list(draw=F, relation="free", tck=.5)),
main="RB")
useOuterStrips(rb.plot)
print(useOuterStrips(rae.plot), split=c(1,1,1,2),more=TRUE)
print(useOuterStrips(rb.plot), split=c(1,2,1,2), more=FALSE)
The two print calls with split put both plots on one page; it's easier than in ggplot2.
scales=list(
y=list(alternating=1,tck=c(1,0)),
x=list(alternating=1,tck=c(1,0)))
xyplot(..., scales=scales) # pass the list on to the lattice call
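To make the two pieces concrete, here is a self-contained sketch that uses only lattice and a small synthetic data frame in place of example.csv (the column names follow the question; the values are random and purely illustrative):

```r
library(lattice)
set.seed(42)

# Synthetic stand-in for example.csv
toy <- expand.grid(sample.size = factor(c(20, 40, 80)),
                   allocation.design = c("A", "B"),
                   type = c("t1", "t2"))
toy$rae <- runif(nrow(toy), 0, 1)
toy$rb  <- rnorm(nrow(toy), 100, 10)

# Ticks and labels on the left and bottom only
scales <- list(y = list(alternating = 1, tck = c(1, 0)),
               x = list(alternating = 1, tck = c(1, 0)))

p.rae <- xyplot(rae ~ sample.size | allocation.design, data = toy,
                groups = type, scales = scales, main = "RAE")
p.rb  <- barchart(rb ~ sample.size | allocation.design, data = toy,
                  groups = type, scales = scales, main = "RB")

# Stack the two trellis objects on one page: 1 column, 2 rows
print(p.rae, split = c(1, 1, 1, 2), more = TRUE)
print(p.rb,  split = c(1, 2, 1, 2), more = FALSE)
```

This places the plots one above the other rather than overlaying them, so each keeps its own y-axis.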