I am quite new to programming/R and I'm having a very unusual problem. I've made a scatterplot and I would simply like to draw the x and y axes through 0 on the plot. However, when I use abline the lines are slightly off. I managed to get them to 0 using trial and error, but that makes plotting other lines impossible.
library('car')
scatterplot(cost~qaly, reg.line=FALSE, smooth=FALSE, spread=FALSE,
boxplots='xy', span=0.5, xlab="QALY", ylab="COST", main="Bootstrap",
cex=0.5, data=scat2, xlim=c(-.05,.05), grid=FALSE)
abline(v = 0, h = 0)
This gives lines which are slightly to the left and below 0.
here is an image of what this returns:
(I can't post an image since I'm new apparently)
I found that these values put the lines on 0:
abline(v=0.003)
abline(h=3000)
Thanks in advance for the help!
Using @Laterow's example to reproduce the issue:
require(car)
set.seed(10)
x <- rnorm(1000); y <- rnorm(1000)
scatterplot(y ~ x)
abline(v=0, h=0)
scatterplot seems to be resetting the par settings on exit. You can sort of check this with locator(1) around some point, eg, for {-3,-3} I get
# $x
# [1] -2.469414
#
# $y
# [1] -2.223922
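One way to see this effect without car is a hypothetical stand-in function that changes par(fig) while drawing and restores it on exit. It is only a sketch of the mechanism, not scatterplot's actual code. The same user coordinate maps to a different device position once the settings are restored, which would explain why a later abline lands off target:

```r
# Hypothetical stand-in for a plotting function that restores par() on exit.
plot_and_reset <- function(x, y) {
  op <- par(fig = c(0, 0.8, 0, 0.8))  # draw in a sub-region of the device
  on.exit(par(op))                     # restore the old figure region on exit
  plot(x, y)
  grconvertX(0, "user", "ndc")         # device position of x = 0 while drawing
}

set.seed(10)
x <- rnorm(100); y <- rnorm(100)
inside <- plot_and_reset(x, y)
after  <- grconvertX(0, "user", "ndc")  # same user value, after the reset
c(inside = inside, after = after)       # the two positions differ
```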
Option 1
As @joran points out, reset.par = FALSE is the easiest way
scatterplot(y ~ x, reset.par = FALSE)
abline(v=0, h=0)
Option 2
In ?scatterplot, it says that ... is passed to plot meaning you can use plot's very useful panel.first and panel.last arguments (among others).
scatterplot(y ~ x, panel.first = {grid(); abline(v = 0)}, grid = FALSE)
Note that if you were to do the basic
scatterplot(y ~ x, panel.first = abline(v = 0))
you would be unable to see the line because the default scatterplot grid covers it up. Instead, turn the default grid off with grid = FALSE, draw a grid yourself in panel.first, and then call abline.
You could also do the abline in panel.last, but this would be on top of your points, so maybe not as desirable.
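The same idea can be tried with base plot() alone, without car. This is only a sketch, and the colours are arbitrary choices. Because panel.first is evaluated after the axes are set up but before the points are drawn, the reference lines end up at 0 and underneath the data:

```r
set.seed(10)
x <- rnorm(1000); y <- rnorm(1000)

# Grid and zero lines are drawn first, the points go on top of them
plot(x, y, pch = 20, col = "grey40",
     panel.first = {grid(); abline(v = 0, h = 0, col = "red")})

usr <- par("usr")  # axis limits actually used: c(x1, x2, y1, y2)
```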
Related
Suppose we have a multi-panel plot in R, created by using layout(). I would like to draw an arrow from a specified point in one panel to a specified point in another panel. Thus, the arrow goes across panels of the layout. The starting point of the arrow is specified in the coordinates of its panel, and the end point of the arrow is specified in the coordinates of the destination panel.
As a minimal example, consider this:
layout( matrix( 1:2 , nrow=2 ) )
plot( x=c(1,2) , y=c(1,2) , main="Plot 1" )
plot( x=c(10,20) , y=c(10,20) , main="Plot 2" )
# I want to make an arrow
# from point c(x=1.2,y=1.2) in Plot 1
# to point c(x=18,y=18) in Plot 2
I've searched for methods to accomplish this, but haven't found anything. Thank you for solutions or pointers.
Update
(I'm keeping the previous answer below this, but this more-programmatic way is better given your comments.)
The trick is knowing how to convert from "user" coordinates to the coordinates of the overarching device. This can be done with grconvertX and *Y. I've made some sloppy helper functions here, though they are barely necessary.
user2ndc <- function(x, y) {
list(x = grconvertX(x, 'user', 'ndc'),
y = grconvertY(y, 'user', 'ndc'))
}
ndc2user <- function(x, y) {
list(x = grconvertX(x, 'ndc', 'user'),
y = grconvertY(y, 'ndc', 'user'))
}
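As a quick sanity check of these conversions (a sketch using an arbitrary point, 1.5, chosen only for illustration), a value converted from user to NDC and back should come out unchanged while the same plot is current, and the NDC value should lie between 0 and 1:

```r
plot(x = c(1, 2), y = c(1, 2))
x_ndc  <- grconvertX(1.5, "user", "ndc")    # fraction of the device width
x_back <- grconvertX(x_ndc, "ndc", "user")  # back to this plot's user units
c(ndc = x_ndc, back = x_back)
```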
For the sake of keeping magic-constants out of the code, I'll predefine your points-of-interest:
pointfrom <- list(x = 1.2, y = 1.2)
pointto <- list(x = 18, y = 18)
It's important that the conversion from 'user' to 'ndc' happen while the plot is still current; once you switch from plot 1 to 2, the coordinates change.
layout( matrix( 1:2 , nrow=2 ) )
Plot 1.
plot( x=c(1,2) , y=c(1,2) , main="Plot 1" )
points(y~x, data=pointfrom, pch=16, col='red')
ndcfrom <- with(pointfrom, user2ndc(x, y))
Plot 2.
plot( x=c(10,20) , y=c(10,20) , main="Plot 2" )
points(y~x, data=pointto, pch=16, col='red')
ndcto <- with(pointto, user2ndc(x, y))
As I did before (far below here), I remap the region on which the next plotting commands will take place. Under the hood, layout is doing things like this. (Some neat tricks can be done with par(fig=..., new=T), including overlaying one plot in, around, or barely-overlapping another.)
par(fig=c(0:1,0:1), new=TRUE)
plot.new()
newpoints <- ndc2user(c(ndcfrom$x, ndcto$x), c(ndcfrom$y, ndcto$y))
with(newpoints, arrows(x[1], y[1], x[2], y[2], col='green', lwd=2))
I might have been able to avoid the ndc2user conversion from ndc back to current user points, but that's playing with margins and axis-expansion and things like that, so I opted not to.
It is possible that the translated points may be outside of the user-points region of this last overlaid plot, in which case they may be masked. To fix this, add xpd=NA to arrows (or in a par(xpd=NA) before it).
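If the overlay is set up with zero margins and no axis expansion, user coordinates on the overlay should coincide with NDC, so the conversion back can in principle be skipped. This is a sketch under those assumptions (outer margins left at their default of zero; the arrow endpoints are arbitrary), not a replacement for the approach above:

```r
plot(x = c(1, 2), y = c(1, 2), main = "Plot 1")

# Overlay covering the whole device: no margins, no 4% axis expansion
op <- par(fig = c(0, 1, 0, 1), mar = c(0, 0, 0, 0),
          xaxs = "i", yaxs = "i", new = TRUE)
plot.new()
usr   <- par("usr")                       # expected to be 0 1 0 1
x_ndc <- grconvertX(0.25, "user", "ndc")  # user and NDC should now agree
arrows(0.25, 0.25, 0.75, 0.75, col = "green", lwd = 2)
par(op)                                   # restore the previous settings
```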
Generalized
Okay, so imagine you want to be able to determine the coordinates of any drawing after layout completion. There's a more complex implementation that currently supports what you're asking for. The only requirement is that you call NDC$add() after every (meaningful) plot. For example:
NDC$reset()
layout(matrix(1:4, nrow=2))
plot(1)
NDC$add()
plot(11)
NDC$add()
plot(21)
NDC$add()
plot(31)
NDC$add()
with(NDC$convert(1:4, c(1,1,1,1), c(1,11,21,31)), {
arrows(x[1], y[1], x[2], y[2], xpd=NA, col='red')
arrows(x[2], y[2], x[3], y[3], xpd=NA, col='blue')
arrows(x[3], y[3], x[4], y[4], xpd=NA, col='green')
})
Source can be found here: https://gist.github.com/r2evans/8a8ba8fff060bade13bf21e89f0616c5
Previous Answer
One way is to use par(fig=...,new=TRUE), but it does not preserve the coordinates you e
layout(matrix(1:4,nr=2))
plot(1)
plot(1)
plot(1)
plot(1)
par(fig=c(0,1,0,1),new=TRUE)
plot.new()
lines(c(0.25,0.75),c(0.25,0.75),col='blue',lwd=2)
Since you may be more likely to use this if you have better (non-arbitrary) control over the ends of the line, here's a trick that gives you more control over the points. If I use this, connecting the top-left point with the bottom-right point:
p <- locator(2)
str(p)
# List of 2
# $ x: num [1:2] 0.181 0.819
# $ y: num [1:2] 0.9738 0.0265
and then in place of lines above I use this:
with(p, arrows(x[1], y[1], x[2], y[2], col='green', lwd=2))
I get
This picture and the values in p demonstrate how the coordinates are different. When using par(fig=...,new=T);plot.new();, the coordinates return to
par('usr')
# [1] -0.04 1.04 -0.04 1.04
There might be trickery to try to workaround this (such as if you need to automate this step), but it likely will be non-trivial (and not robust).
How to create multiple boxplot with value shown in R ?
Now I'm using this code
boxplot(Data_frame[ ,2] ~ Data_frame[ ,3], )
I tried to use this
boxplot(Data_frame[ ,2] ~ Data_frame[ ,3], )
text(y=fivenum(Data_frame$x), labels =fivenum(Data_frame$x), x=1.25)
But only the first boxplot gets values. How can I show the values for all boxplots in one graph?
Thank you so much!
As far as I understand your question (it is not clear how the fivenum summary should be displayed) here is one solution. It presents the summary using the top axis.
x <- data.frame(
Time = c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3),
Value = c(5,10,15,20,30,50,70,80,100,5,7,9,11,15,17,19,17,19,100,200,300,400,500,700,1000,200))
boxplot(x$Value ~ x$Time)
fivenums <- aggregate(x$Value, by=list(Time=x$Time), FUN=fivenum)
labels <- apply(fivenums[,-1], 1, function(x) paste(x, collapse = ", "))
axis(3, at=fivenums[,1],labels=labels, las=1, col.axis="red")
Of course you can additionally play with the font size or rotation for this summary. Moreover you can break the line in one place, so the label will have smaller width.
Edit
In order to get what you have posted in the comment below you can add
text(x = 3 + 0.5, y = fivenums[3,-1], labels=fivenums[3,-1])
and you will get
however it won't be readable for other boxplots.
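If the groups are on a comparable scale, one possible way to label every box is to compute fivenum() per group and place the values beside each box with a single text() call. This is a sketch with made-up data (df and its values are hypothetical), not the asker's data frame:

```r
# Hypothetical data: three groups on a comparable scale
df <- data.frame(
  Time  = rep(1:3, each = 6),
  Value = c(5, 10, 15, 20, 30, 50,
            7,  9, 11, 15, 17, 19,
            12, 18, 25, 33, 41, 48)
)

# Leave extra room on the right so the last box's labels fit
boxplot(Value ~ Time, data = df, xlim = c(0.5, 3.9))

# fivenum() per group: a 5 x 3 matrix, one column per box
stats <- sapply(split(df$Value, df$Time), fivenum)

# col(stats) is the box index (1, 2, 3); shift right so labels sit beside each box
text(x = as.vector(col(stats)) + 0.45, y = as.vector(stats),
     labels = as.vector(stats), cex = 0.8)
```

When the groups differ by orders of magnitude, as in the example data above, the labels of the small-valued groups would still overlap.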
I am trying to visualize some data and in order to do it I am using R's hist.
Below are my data
jancoefabs <- as.numeric(as.vector(abs(Janmodelnorm$coef)))
jancoefabs
[1] 1.165610e+00 1.277929e-01 4.349831e-01 3.602961e-01 7.189458e+00
[6] 1.856908e-04 1.352052e-05 4.811291e-05 1.055744e-02 2.756525e-04
[11] 2.202706e-01 4.199914e-02 4.684091e-02 8.634340e-01 2.479175e-02
[16] 2.409628e-01 5.459076e-03 9.892580e-03 5.378456e-02
Now as the more cunning of you might have guessed these are the absolute values of some model's coefficients.
What I need is a histogram that will have for axes:
x will be the number (count or length) of coefficients which is 19 in total, along with their names.
y will show values of each column (as breaks?) having a ylim="" set, according to min and max of those values (or something similar).
Note that Janmodelnorm$coef simply produces the following
(Intercept) LON LAT ME RAT
1.165610e+00 -1.277929e-01 -4.349831e-01 -3.602961e-01 -7.189458e+00
DS DSA DSI DRNS DREW
-1.856908e-04 1.352052e-05 4.811291e-05 -1.055744e-02 -2.756525e-04
ASPNS ASPEW SI CUR W_180_270
-2.202706e-01 -4.199914e-02 4.684091e-02 -8.634340e-01 -2.479175e-02
W_0_360 W_90_180 W_0_180 NDVI
2.409628e-01 5.459076e-03 -9.892580e-03 -5.378456e-02
So far, after consulting ?hist, I have been trying to play with the code below without success. Therefore I am starting from scratch.
# hist(jancoefabs, col="lightblue", border="pink",
# breaks=8,
# xlim=c(0,10), ylim=c(20,-20), plot=TRUE)
When plot=FALSE is set, I get a bunch of somewhat useful info about the set. I also find it hard to use the breaks argument efficiently.
Any suggestion will be appreciated. Thanks.
Rather than using hist, why not use a barplot or a standard plot. For example,
## Generate some data
set.seed(1)
y = rnorm(19, sd=5)
names(y) = c("Inter", LETTERS[1:18])
Then plot the coefficients
barplot(y)
Alternatively, you could use a scatter plot
plot(1:19, y, axes=FALSE, ylim=c(-10, 10))
axis(2)
axis(1, 1:19, names(y))
and add error bars to indicate the standard errors (see for example Add error bars to show standard deviation on a plot in R)
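A minimal sketch of such error bars, assuming the standard errors are available as a vector (se below is made up purely for illustration): arrows() with flat heads at both ends draws one bar per coefficient.

```r
set.seed(1)
y  <- rnorm(19, sd = 5)
names(y) <- c("Inter", LETTERS[1:18])
se <- rep(1.5, 19)  # hypothetical standard errors, for illustration only

plot(1:19, y, axes = FALSE, pch = 19, xlab = "", ylab = "coefficient",
     ylim = range(y - se, y + se))
axis(2)
axis(1, at = 1:19, labels = names(y), las = 2)

# angle = 90 and code = 3 give flat caps at both ends of each bar
arrows(1:19, y - se, 1:19, y + se, angle = 90, code = 3, length = 0.03)
```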
Are you sure you want a histogram for this? A lattice barchart might be pretty nice. An example with the mtcars built-in data set.
> coef <- lm(mpg ~ ., data = mtcars)$coef
> library(lattice)
> barchart(coef, col = 'lightblue', horizontal = FALSE,
ylim = range(coef), xlab = '',
scales = list(y = list(labels = coef),
x = list(labels = names(coef))))
A base R dotchart might be good too,
> dotchart(coef, pch = 19, xlab = 'value')
> text(coef, seq(coef), labels = round(coef, 3), pos = 2)
Refer to the above plot. I have drawn the equations in excel and then shaded by hand. You can see it is not very neat. You can see there are six zones, each bounded by two or more equations. What is the easiest way to draw inequalities and shade the regions using hatched patterns ?
To build on @agstudy's answer, here's a quick-and-dirty way to represent inequalities in R:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
with(as.list(c(b,a)),{
id <- yB<=yA
# b<a area
polygon(x = c(xB[id], rev(xA[id])),
y = c(yB[id], rev(yA[id])),
density=10, angle=0, border=NULL)
# b>a area
polygon(x = c(xB[!id], rev(xA[!id])),
y = c(yB[!id], rev(yA[!id])),
density=10, angle=90, border=NULL)
})
If the area in question is surrounded by more than 2 equations, just add more conditions:
plot(NA,xlim=c(0,1),ylim=c(0,1), xaxs="i",yaxs="i") # Empty plot
a <- curve(x^2, add = TRUE) # First curve
b <- curve(2*x^2-0.2, add = TRUE) # Second curve
d <- curve(0.5*x^2+0.2, add = TRUE) # Third curve
names(a) <- c('xA','yA')
names(b) <- c('xB','yB')
names(d) <- c('xD','yD')
with(as.list(c(a,b,d)),{
# Basically you have three conditions:
# curve a is below curve b, curve b is below curve d and curve d is above curve a
# assign to each curve coordinates the two conditions that concerns it.
idA <- yA<=yD & yA<=yB
idB <- yB>=yA & yB<=yD
idD <- yD<=yB & yD>=yA
polygon(x = c(xB[idB], xD[idD], rev(xA[idA])),
y = c(yB[idB], yD[idD], rev(yA[idA])),
density=10, angle=0, border=NULL)
})
In R, there is only limited support for fill patterns and they can only be
applied to rectangles and polygons. This is available only within the traditional (base) graphics, not ggplot2 or lattice.
It is possible to fill a rectangle or polygon with a set of lines drawn
at a certain angle, with a specific separation between the lines. A density
argument controls the separation between the lines (in terms of lines per inch)
and an angle argument controls the angle of the lines.
Here is an example from the help:
plot(c(1, 9), 1:2, type = "n")
polygon(1:9, c(2,1,2,1,NA,2,1,2,1),
density = c(10, 20), angle = c(-45, 45))
EDIT
Another option is to use alpha blending to differentiate between regions. Here, using @plannapus's example and the gridBase package to superpose polygons, you can do something like this:
library(gridBase)
vps <- baseViewports()
pushViewport(vps$figure,vps$plot)
with(as.list(c(a,b,d)),{
grid.polygon(x = xA, y = yA,gp =gpar(fill='red',lty=1,alpha=0.2))
grid.polygon(x = xB, y = yB,gp =gpar(fill='green',lty=2,alpha=0.2))
grid.polygon(x = xD, y = yD,gp =gpar(fill='blue',lty=3,alpha=0.2))
}
)
upViewport(2)
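A similar effect may be achievable in base graphics alone, assuming the device supports semi-transparency (most modern devices such as pdf() do). The sketch below fills the area under each of the three curves with a translucent colour via adjustcolor(); fill_under is a hypothetical helper, not part of any package:

```r
xs   <- seq(0, 1, length.out = 101)
cols <- adjustcolor(c("red", "green", "blue"), alpha.f = 0.2)  # 20% opaque

plot(NA, xlim = c(0, 1), ylim = c(0, 1), xaxs = "i", yaxs = "i",
     xlab = "x", ylab = "y")

# Hypothetical helper: shade the area between the curve f and y = 0
fill_under <- function(f, col) {
  polygon(c(xs, rev(range(xs))), c(f(xs), 0, 0), col = col, border = NA)
}
fill_under(function(x) x^2,             cols[1])
fill_under(function(x) 2 * x^2 - 0.2,   cols[2])
fill_under(function(x) 0.5 * x^2 + 0.2, cols[3])
```

Overlapping regions appear as blends of the three colours, which plays the role that the different hatch angles play in the polygon() examples.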
There are several submissions on the MATLAB Central File Exchange that will produce hatched plots in various ways for you.
I think a tool that will come handy for you here is gnuplot.
Take a look at the following demos:
fillbetween
statistics
some tricks
I want to add the following x-axis label to my bar plot but unfortunately R does not recognize the character '!' and prints dots instead of whitespaces:
I want: I get:
!src x.x.x.x X.src.x.x.x.x
!TCP X.TCP
!udp && !src x.x.x.x X.udp.....src.x.x.x.x
Additionally, I would like to increase the margin because the text is too long, and when I set the size above 'cex.names=0.6' it just vanishes!?
There are two reasons I can think of why R will have substituted X. for instances of !.
I suspect that the labelling you are seeing is due to R's reading of your data. Those column names aren't really syntactically valid and the erroneous character has been replaced by X.. This happens at the data import stage, so I presume you didn't check how R had read your data in?, or
You have a vector and the names of that vector are similarly invalid and R has done the conversion.
However, as you haven't made this reproducible it could be anything.
To deal with case 1 above, either edit your data file to contain valid names or pass check.names = FALSE in your read.table() call used to read in the data. Although doing the latter will make it difficult for you to select variable by name without quoting the name fully.
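A small sketch of the difference, using an inline CSV string made up for illustration rather than the asker's file:

```r
csv <- "!src x.x.x.x,!TCP\n1,2\n3,4"

checked <- read.csv(text = csv)                       # names are sanitised
raw     <- read.csv(text = csv, check.names = FALSE)  # names kept verbatim

names(checked)
names(raw)

# Non-syntactic names must be quoted when selecting a column
raw[["!TCP"]]  # or raw$`!TCP`
```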
If you have a vector, then you can reset the names again:
> vec <- 1:5
> names(vec) <- paste0("!",LETTERS[1:5])
> vec
!A !B !C !D !E
1 2 3 4 5
> barplot(vec)
Also note that barplot() has a names.arg argument that you can use to pass it the labels to draw beneath each bar. For example:
> barplot(vec, names.arg = paste0("!", letters[1:5]))
which means you don't need to rely on what R has read in/converted for you as you tell it exactly what to label the plot with.
To increase the size of the margin, there are several ways to specify the size but I find setting it in terms of number of lines most useful. You change this via the graphical parameter mar, which has the defaults c(5,4,4,2) + 0.1, corresponding to the bottom, left, top, and right margins respectively. Use par() to change the defaults; for example, in the code below the defaults are stored in op and a much larger bottom margin is specified
op <- par(mar = c(10,4,4,2) + 0.1)
barplot(vec, names.arg = paste0("!", letters[1:5]), las = 2)
par(op) ## reset
The las = 2 will rotate the bar labels 90 degrees to be perpendicular to the axis.
One option is to use ann=FALSE and add annotation to the plot using mtext.
x <- 1:2
y <- runif(2, 0, 100)
par(mar=c(4, 4, 2, 4))
plot(x, y, type="l", xlim=c(0.5, 2.5), ylim=c(-10, 110),
axes=TRUE, ann=FALSE)
Then add annotation:
mtext("!udp && !src x.x.x.x ", side=1, line=2)
Edit: The question is about a barplot and not a simple plot.
As said in Gavin's solution, the names argument can be set. Here I show an example.
barplot(VADeaths[1:2,], angle = c(45, 135),
density = 20, col = "grey",
names=c("!src x.x.x.x", "!TCP", "!udp && !src x.x.x.x", "UF"),
horiz=FALSE)