Some of the data for the type of graph I'm plotting in R,
http://graphpad.com/faq/images/1352-1(1).gif
has outliers that are far out of range, and I can't just exclude them. I tried the axis.break() function from plotrix, but it doesn't rescale the y axis; it just places a break mark on the axis. The goal is to show the medians for both groups, the data points, and the outliers all in one plot frame. As it stands, the few points that lie far from the majority take up most of the space, so the majority of points are squished together and the differences between them are hard to see. Here is the code:
https://gist.github.com/9bfb05dcecac3ecb7491
Any suggestions would be helpful.
Thanks
Unfortunately the code you link to isn't self-contained, but possibly the code you have for gap.plot() there doesn't work as you expect because you are setting ylim to cover the full data range rather than the plotted sections only. Consider the following plot:
As you can see, the y axis has tickmarks for every 50 pg/ml, but there is a gap between 175 and 425. So the data range (to the nearest 50) is c(0, 500) but the range of the y axis is c(0, 250) - it's just that the tickmarks for 200 and 250 are being treated as those for 450 and 500.
This plot was produced using the following modified version of your code:
## made up data
GRO.Controls <- c(25, 40:50, 60, 150)
GRO.Breast <- c(70, 80:90, 110, 500)
##Scatter plot for both groups
library(plotrix)
gap.plot(jitter(rep(0, length(GRO.Controls)), amount = 0.2), GRO.Controls,
         gap = c(175, 425), xtics = -2, # no xtics visible
         ytics = seq(0, 500, by = 50),
         xlim = c(-0.5, 1.5), ylim = c(0, 250),
         xlab = "", ylab = "Concentrations (pg/ml)", main = "GRO(P=0.0010)")
gap.plot(jitter(rep(1, length(GRO.Breast)), amount = 0.2), GRO.Breast,
         gap = c(175, 425), col = "blue", add = TRUE)
##Adds x- variable (groups) labels
mtext("Controls", side = 1, at= 0.0)
mtext("Breast Cancer", side = 1, at= 1.0)
##Adds median lines for each group
segments(-0.25, median(GRO.Controls), 0.25, median(GRO.Controls), lwd = 2.0)
segments(0.75, median(GRO.Breast), 1.25, median(GRO.Breast), lwd = 2.0,
col = "blue")
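The relationship between data values and plotted positions described above can be sketched in a few lines. This is only an illustration of the arithmetic, not plotrix's own code: to_gap_coords() is a hypothetical helper, and it assumes that values above the gap are simply shifted down by the width of the gap.

```r
# Hypothetical helper (not part of plotrix): map data values to the
# positions they occupy on an axis with the range gap[1]..gap[2] cut out.
to_gap_coords <- function(y, gap) {
  stopifnot(length(gap) == 2, gap[1] < gap[2])
  gap_width <- gap[2] - gap[1]
  # values at or above the upper gap limit move down by the gap width
  ifelse(y >= gap[2], y - gap_width, y)
}

gap <- c(175, 425)
ticks <- seq(0, 500, by = 50)
visible <- ticks[ticks <= gap[1] | ticks >= gap[2]]  # ticks outside the gap
positions <- to_gap_coords(visible, gap)
data.frame(label = visible, plotted_at = positions)
```

With gap = c(175, 425), the labels 450 and 500 land at 200 and 250, which is why ylim = c(0, 250) is enough. Under the same assumption, anything added later with segments() or points() above the gap (a median line, say) would need the same shift.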
You could be using gap.plot() which is easily found by following the link on the axis.break help page. There is a worked example there.
I am trying to plot some data points from a matrix together with their standard deviations, but I am having trouble plotting the latter.
My tools are:
a matrix with the data points to plot at a x coordinate within a properly xlim-defined x-axis;
a vector of as many arbitrary y coordinates for the plotting height, chosen just so that the points don't overlap;
a vector of lengths of the standard deviation lines, to be displayed horizontally around the data points.
Yeah, eventually it'll look like a flying saucer invasion.
I can easily plot the points at the given height, one by one, which is the way I want to do it.
The trouble comes in adding the horizontal standard deviation lines for each point.
Does anyone have an idea how to do it?
x<-matrix(c(1:4,NA,NA,10:16), nrow=4, ncol=4)
y<-seq(0.001,0.006, 0.001)
std.dev<-c(runif(7, 0.1, 0.5), NA, NA, runif(7, 0.1, 0.5))
plot(0,0, xlim=c(min = 0, max(x), na.rm=T)+0.001), ylim = c(0,0.016), type = "n", xlab = "My x", yaxt = "n", ylab ="")
points(x = x[1,2], y = y[1], pch = 21, bg = "red", col = "red")
When working with base R, it is surprising to find that it does not provide built-in support for error bars. You may want to look into other packages for this.
With base R, the workaround is to use the arrows() function and set the arrow head angle to 90 degrees.
Note: I had to change your given data definition as it threw errors. Also have a look at this part of your code.
I plot the error bars in vertical mode. You can easily adapt this for horizontal bars. I did this for presentation reasons to avoid overlapping error bars.
Using your full data will make it easier to deconflict the bars.
x<-matrix(c(1:7,NA,NA,10:16), nrow=4, ncol=4) # adapted to ensure same length
y<-seq(0.001,0.016, 0.001) # adapted to ensure same length
std.dev<-c(runif(7, 0.1, 0.5), NA, NA, runif(7, 0.1, 0.5))
plot(0,0
, xlim= c(min = 0, max(x, na.rm=T)) # had to fix xlim definition
, ylim = c(-1,1) # changed to show the given std.dev
, type = "n", xlab = "My x", yaxt = "n", ylab ="")
points(x = x, y = y, pch = 21, bg = "red", col = "red") # set x and y to show all
# --------------- add arrows with "flat head" --------------------------
arrows( x0 = x, x1 = x
,y0 = y-std.dev, y1 = y+std.dev # center deviation on data point
, code=3, angle=90 # set the angle for the head to emulate error bar
, length=0.1)
This yields:
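For the horizontal bars the question asks about, the same arrows() trick applies with the roles of x and y swapped. The following is a minimal sketch, assuming the adapted data from above; the seed, the ok filter for missing values, and the ylim of c(0, 0.017) are my own choices for illustration.

```r
set.seed(1)  # reproducible made-up spread values
x <- as.vector(matrix(c(1:7, NA, NA, 10:16), nrow = 4, ncol = 4))
y <- seq(0.001, 0.016, by = 0.001)   # one arbitrary height per point
std.dev <- c(runif(7, 0.1, 0.5), NA, NA, runif(7, 0.1, 0.5))

ok <- !is.na(x) & !is.na(std.dev)    # skip points without data
left  <- x[ok] - std.dev[ok]         # left end of each bar
right <- x[ok] + std.dev[ok]         # right end of each bar

plot(0, 0, type = "n", xlim = range(left, right), ylim = c(0, 0.017),
     xlab = "My x", ylab = "", yaxt = "n")
# horizontal bars: y is constant along each arrow, x spans the deviation
arrows(x0 = left, y0 = y[ok], x1 = right, y1 = y[ok],
       code = 3, angle = 90, length = 0.05)
points(x[ok], y[ok], pch = 21, bg = "red", col = "red")
```

Because every point sits at its own height, the horizontal bars cannot overlap each other here.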
I was wondering if it's possible to get a two-sided barplot (e.g. Two sided bar plot ordered by date) that shows Data A above and Data B below the axis for each X value.
Data A would be for example the age of a person and Data B the size of the same person. The problem with this and the main difference to the examples above: A and B have obviously totally different units/ylims.
Example:
X = c("Anna","Manuel","Laura","Jeanne") # Name of the Person
A = c(12,18,22,10) # Age in years
B = c(112,186,165,120) # Size in cm
Any ideas how to solve this? I don't mind a horizontal or a vertical solution.
Thank you very much!
Here's code that gets you a solid draft of what I think you want using barplot from base R. I'm just making one series negative for the plotting, then manually setting the labels in axis to reference the original (positive) values. You have to make a choice about how to scale the two series so the comparison is still informative. I did that here by dividing height in cm by 10, which produces a range similar to the range for years.
# plot the first series, but manually set the range of the y-axis to set up the
# plotting of the other series. Set axes = FALSE so you can get the y-axis
# with labels you want in a later step.
barplot(A, ylim = c(-25, 25), axes = FALSE)
# plot the second series, making whatever transformations you need as you go. Use
# add = TRUE to add it to the first plot; use names.arg to get X as labels; and
# repeat axes = FALSE so you don't get an axis here, either.
barplot(-B/10, add = TRUE, names.arg = X, axes = FALSE)
# add a line for the x-axis if you want one
abline(h = 0)
# now add a y-axis with labels that makes sense. I set lwd = 0 so you just
# get the labels, no line.
axis(2, lwd = 0, tick = FALSE, at = seq(-20,20,5),
     labels = c(rev(seq(0,200,50)), seq(5,20,5)), las = 2)
# now add y-axis labels
mtext("age (years)", 2, line = 3, at = 12.5)
mtext("height (cm)", 2, line = 3, at = -12.5)
Result with par(mai = c(0.5, 1, 0.25, 0.25)):
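The divide-by-10 scaling and the hand-written axis labels can also be derived from the data. This is a sketch of one possible choice, not the only sensible one: it assumes you want the tallest bar of B to be as long as the tallest bar of A, and the names k, ticks_A, ticks_B, lim, at and labs are made up for the example.

```r
X <- c("Anna", "Manuel", "Laura", "Jeanne")  # name of the person
A <- c(12, 18, 22, 10)                       # age in years
B <- c(112, 186, 165, 120)                   # size in cm

k <- max(A) / max(B)          # assumed scaling: equal maximum bar length

ticks_A <- pretty(c(0, A))    # tick values in years
ticks_B <- pretty(c(0, B))    # tick values in cm
lim <- max(ticks_A, ticks_B * k)

# tick positions on the plot and the original-unit labels shown there
at   <- c(-rev(ticks_B) * k, ticks_A[-1])
labs <- c(rev(ticks_B), ticks_A[-1])

barplot(A, ylim = c(-lim, lim), axes = FALSE)
barplot(-B * k, add = TRUE, names.arg = X, axes = FALSE)
abline(h = 0)
axis(2, lwd = 0, at = at, labels = labs, las = 2)
```

Whatever factor is used, the two halves of the axis are in different units, so the bar lengths above and below the line are only comparable within each series.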
I am trying to do a density plot of a dataset that has a wide range.
data=c(-10,-20,-20,-18,-17,1000,10000, 500, 500, 500, 500000)
plot(density(data))
As you can see in the figure, we cannot see much.
Is there a way to make an axis break (or several) on the x axis to better visualise the distribution of the data? Or is there a way to plot certain ranges of the data in several graphs and then paste them together?
Thanks a lot!
There is a function gap.plot() in package plotrix but I think it has some problems (see How to plot “multiple” curves with a break through y-data-range in R?). I recommend you draw two plots.
## use small margins and relatively big outer margins (to write labels).
old.par <- par(mfrow = c(1, 2), mar = rep(0.5, 4), oma = c(4, 4, 1, 1))
plot(density(data), xlim = c(-1000, 29000), main = "", bty="c") # diff 30000
abline(v = par("usr")[2], lty=2) # keep the same diff of xlim to avoid misleading
plot(density(data), xlim = c(471000, 501000), main = "", yaxt ="n", bty="]") # diff 30000
abline(v = par("usr")[1], lty=2)
par(old.par)
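To keep the two windows the same width without typing the limits by hand, they can be computed from the data. A minimal sketch, assuming a window width of 30000 and a padding of 1000 as in the code above; width, pad and windows are names introduced here for illustration.

```r
data <- c(-10, -20, -20, -18, -17, 1000, 10000, 500, 500, 500, 500000)
d <- density(data)            # estimate once, reuse in both panels

width <- 30000                # assumed common window width
pad <- 1000                   # assumed padding beyond the extremes
windows <- list(c(-pad, -pad + width),
                c(max(data) + pad - width, max(data) + pad))

old.par <- par(mfrow = c(1, 2), mar = rep(0.5, 4), oma = c(4, 4, 1, 1))
for (i in seq_along(windows)) {
  plot(d, xlim = windows[[i]], main = "",
       yaxt = if (i == 1) "s" else "n",   # y axis only on the left panel
       bty = if (i == 1) "c" else "]")    # open each box towards the break
  # dashed line on the side of the panel that faces the break
  abline(v = par("usr")[if (i == 1) 2 else 1], lty = 2)
}
par(old.par)
```

Both panels then span exactly 30000 units, so the horizontal scale is the same on either side of the break.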
I am working on a forest plot in R using the metafor package and am trying to shift the whole x-axis (alim) to the right to accommodate ilab columns.
I am still not allowed to post images, so here is a sketch of my current plot, where the text and the x-axis overlap:
|ilab text| |mean [ci.lb, ci.ub]|
|---measure values + ci---|
And I want something like this
|ilab text| |mean [ci.lb, ci.ub]|
|---measure values + ci---|
Although the forestplot package seems to allow this with its graph.pos argument, I couldn't find a similar option in metafor.
I have two questions:
1) Is the x-axis position set by default in metafor?
2) Can this default be overwritten, and if so, how?
Thanks!
Wen
Found the answer: the key is to adjust the xlim, alim and ilab.xpos parameters in relation to 0 (the start of the x-axis) as a reference point.
For example, if this code gives you an overlap,
forest(x, ci.lb = lower, ci.ub = upper,
xlim = c(-350, 170), xlab = "Proportion (%)", at = c(0, 20, 40, 60, 80, 100),
alim = c(0, 100),
ilab = cbind(period, population), ilab.xpos = c(-275, -175), ilab.pos = c(4, 4), cex = 0.75)
You can move the ilab text further to the left of the x axis by setting the ilab.xpos values further away from 0 (e.g. from -175 in the above code to -200). The values have to stay within the limits of your xlim.
Given such data:
#Cutpoint SN (1-PPV)
5 0.56 0.01
7 0.78 0.19
9 0.91 0.58
How can I plot a ROC curve in R that produces a result similar to the attached one?
I know ROCR package but it doesn't take such input.
If you just want to create the plot (without that silly interpolation spline between points) then just plot the data you give in the standard way, prepending a point at (0,0) and appending one at (1,1) to give the end points of the curve.
## your data with different labels
dat <- data.frame(cutpoint = c(5, 7, 9),
TPR = c(0.56, 0.78, 0.91),
FPR = c(0.01, 0.19, 0.58))
## plot version 1
op <- par(xaxs = "i", yaxs = "i")
plot(TPR ~ FPR, data = dat, xlim = c(0,1), ylim = c(0,1), type = "n")
with(dat, lines(c(0, FPR, 1), c(0, TPR, 1), type = "o", pch = 25, bg = "black"))
text(TPR ~ FPR, data = dat, pos = 3, labels = dat$cutpoint)
abline(0, 1)
par(op)
To explain the code: the first plot() call sets up the plotting region without doing any plotting at all. Note that I force the plot to cover the range (0,1) on both axes. The par() call tells R to draw axes that cover exactly the requested range; the default extends them by 4 percent of the range at each end.
The next line, with(dat, lines(....)) draws the ROC curve and here we prepend and append the points at (0,0) and (1,1) to give the full curve. Here I use type = "o" to give both points and lines overplotted, the points are represented by character 25 which allows it to be filled with a colour, here black.
Then I add labels to the points using text(....); the pos argument is used to position the label away from the actual plotting coordinates. I take the labels from the cutpoint object in the data frame.
The abline() call draws the 1:1 line (here the 0 and 1 mean an intercept of 0 and a slope of 1, respectively).
The final line resets the plotting parameters to the defaults we saved in op prior to plotting (in the first line).
The resulting plot looks like this:
It isn't an exact facsimile, and I prefer the plot using the default for the axis ranges (adding 4 percent):
plot(TPR ~ FPR, data = dat, xlim = c(0,1), ylim = c(0,1), type = "n")
with(dat, lines(c(0, FPR, 1), c(0, TPR, 1), type = "o", pch = 25, bg = "black"))
text(TPR ~ FPR, data = dat, pos = 3, labels = dat$cutpoint)
abline(0, 1)
Again, not a true facsimile but close.
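The 4 percent extension mentioned above can be checked by looking at par("usr"), which holds the extremes of the plotting region after a plot has been set up. This is just a small sketch to make the difference between the two axis styles visible; usr_regular and usr_internal are names chosen for the example.

```r
# default style ("r"): the requested range is extended by 4 percent per end
plot(0:1, 0:1, type = "n", xlim = c(0, 1), ylim = c(0, 1))
usr_regular <- par("usr")

# internal style ("i"): the axes cover exactly the requested range
op <- par(xaxs = "i", yaxs = "i")
plot(0:1, 0:1, type = "n", xlim = c(0, 1), ylim = c(0, 1))
usr_internal <- par("usr")
par(op)

rbind(regular = usr_regular, internal = usr_internal)
```

With limits of c(0, 1), the regular style gives a plotting region from -0.04 to 1.04 on each axis, while the internal style gives exactly 0 to 1.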