One of the things that most bugs me about R is the separation of the plot, points, and lines commands. It's somewhat irritating to have to change plot to whatever variant for the first plot done, and to have to replot from scratch if you failed to set the correct ylim and xlim initially. Wouldn't it be nice to have one command that:
Picks lines, points or both via an argument, as in plot(..., type = "l") ?
By default, chooses whether to create a new plot, or add to an existing one according to whether the current device is empty or not.
Rescales axes automatically if the added elements to the plot exceed the current bounds.
Has anyone done anything like this? If not, and there's no strong reason why this isn't possible, I'll answer this myself in a bit...
Some possible functionality that may help with what you want:
The matplot function uses base graphics and will plot several sets of points or lines in one step, working out the correct ranges automatically.
There is an update method for lattice graphics that can be used to add/change things in the plot and will therefore result in automatic recalculation of things like limits and axes.
If you add additional information (using +) to a ggplot2 plot, then the things that are automatically calculated will be recalculated.
You already found zoomplot and there is always the approach of writing your own function like you did.
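As a rough illustration of the matplot point above (the data here are made up for the example), a single call sizes the axes to cover every column:

```r
# Three series with different vertical ranges (illustrative data only)
x <- seq(0, 2 * pi, length.out = 50)
y <- cbind(sin(x), 2 * cos(x), x / 2)

# One call draws all columns; the y axis is sized to fit range(y)
matplot(x, y, type = "l", lty = 1, col = 1:3)
```

The trade-off is that all series must be available up front as matrix columns, so this does not help when elements are added incrementally.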
Anyway, this is what I came up with: (It uses zoomplot from TeachingDemos)
fplot <- function(x, y = NULL, type = "l", new = NULL, xlim, ylim, zoom = TRUE, ...){
  require(TeachingDemos)
  if (is.null(y)){
    if (length(dim(x)) == 2){
      y = x[,2]
      x = x[,1]
    } else {
      y = x
      x = 1:length(y)
    }
  }
  if ( is.null(new) ){
    #determine whether to make a new plot or not
    new = FALSE
    if (is.null(recordPlot()[[1]])) new = TRUE
  }
  if (missing(xlim)) xlim = range(x)
  if (missing(ylim)) ylim = range(y)
  if (new){
    plot(x, y, type = type, xlim = xlim, ylim = ylim, ...)
  } else {
    if (type == "p"){
      points(x, y, ...)
    } else {
      lines(x, y, type = type, ...)
    }
    if (zoom){
      #rescale plot
      xcur = par("usr")[1:2]
      ycur = par("usr")[3:4]
      #shrink coordinates and pick biggest
      xcur = (xcur - mean(xcur)) /1.08 + mean(xcur)
      ycur = (ycur - mean(ycur)) /1.08 + mean(ycur)
      xlim = c(min(xlim[1], xcur[1]), max(xlim[2], xcur[2]))
      ylim = c(min(ylim[1], ycur[1]), max(ylim[2], ycur[2]))
      #zoom plot
      zoomplot(xlim, ylim)
    }
  }
}
So you can do, e.g.
dev.new()
fplot(1:4)
fplot(1:4 +1, col = 2)
fplot(0:400/100 + 1, sin(0:400/10), type = "p")
dev.new()
for (k in 1:20) fplot(sort(rnorm(20)), type = "b", new = (k==1) )
par(mfrow) and log axes don't currently work well with zooming, but it's a start...
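In case the 1.08 in fplot looks arbitrary: with the default axis style (xaxs = "r"), base graphics pads the requested range by 4% at each end, so par("usr") is 1.08 times as wide as the limits that were asked for. A quick check, assuming default par settings:

```r
# With default xaxs = "r", the data range 1..10 (width 9) is padded by 4% per side
plot(1:10)
usr <- par("usr")

diff(usr[1:2])   # width of the plotting region in user coordinates
1.08 * 9         # data range width times 1.08
```

Dividing the centred coordinates by 1.08 therefore recovers the limits the existing plot was drawn with. This would not hold if xaxs or yaxs were set to "i".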
How can such a non-linear transformation be done?
Here is the code to draw it:
my.sin <- function(ve,a,f,p) a*sin(f*ve+p)
s1 <- my.sin(1:100, 15, 0.1, 0.5)
s2 <- my.sin(1:100, 21, 0.2, 1)
s <- s1+s2+10+1:100
par(mfrow=c(1,2),mar=rep(2,4))
plot(s,t="l",main = "input") ; abline(h=seq(10,120,by = 5),col=8)
plot(s*7,t="l",main = "output")
abline(h=cumsum(s)/10*2,col=8)
Don't look at the vector or its values; look only at the horizontal grid. Only the grid matters.
####UPDATE####
I see that my question is not clear to many people, I apologize for that...
Here are examples of transformations only along the vertical axis, maybe now it will be more clear to you what I want
#### UPDATE 2 ####
Thanks for your answer, this looks like what I need, but I have a few more questions if I may.
To clarify why I need this: I want to compare vectors that are non-linearly distorted along the horizontal axis. Maybe there are already ready-made tools for this?
You mentioned that there are many ways to do such non-linear transformations. Can you name a few of the best ones for my case?
How can I make the function f() more non-linear, so that it consists, for example, not of one sinusoid but of 10 or more? The figure shows that the distortion is quite simple; it corresponds to one sinusoid.
And how can the function f be changed to use different combinations of sinusoids?
set.seed(126)
par(mar = rep(2, 4),mfrow=c(1,3))
s <- cumsum(rnorm(100))
r <- range(s)
gridlines <- seq(r[1]*2, r[2]*2, by = 0.2)
plot(s, t = "l", main = "input")
abline(h = gridlines, col = 8)
f <- function(x) 2 * sin(x)/2 + x
plot(s, t = "l", main = "input + new grid")
abline(h = f(gridlines), col = 8)
plot(f(s), t = "l", main = "output")
abline(h = f(gridlines), col = 8)
If I understand you correctly, you wish to map the vector s from the regular spacing defined in the first image to the irregular spacing implied by the second plot.
Unfortunately, your mapping is not well-defined, since there is no clear correspondence between the horizontal lines in the first image and the second image. There are in fact an infinite number of ways to map the first space to the second.
We can alter your example a bit to make it a bit more rigorous.
If we start with your function and your data:
my.sin <- function(ve, a, f, p) a * sin(f * ve + p)
s1 <- my.sin(1:100, 15, 0.1, 0.5)
s2 <- my.sin(1:100, 21, 0.2, 1)
s <- s1 + s2 + 10 + 1:100
Let us also create a vector of gridlines that we will draw on the first plot:
gridlines <- seq(10, 120, by = 2.5)
Now we can recreate your first plot:
par(mar = rep(2, 4))
plot(s, t = "l", main = "input")
abline(h = gridlines, col = 8)
Now, suppose we have a function that maps our y axis values to a different value:
f <- function(x) 2 * sin(x/5) + x
If we apply this to our gridlines, we have something similar to your second image:
plot(s, t = "l", main = "input")
abline(h = f(gridlines), col = 8)
Now, what we want to do here is effectively transform our curve so that it is stretched or compressed in such a way that it crosses the gridlines at the same points as the gridlines in the original image. To do this, we simply apply our mapping function to s. We can check the correspondence to the original gridlines by plotting our new curves with a transformed axis :
plot(f(s), t = "l", main = "output", yaxt = "n")
axis(2, at = f(20 * 1:6), labels = 20 * 1:6)
abline(h = f(gridlines), col = 8)
It may be possible to create a mapping function using the cumsum(s)/10 * 2 that you have in your original example, but it is not clear how you want this to correspond to the original y axis values.
Response to edits
It's not clear what you mean by comparing two vectors. If one is a non-linear deformation of the other, then presumably you want to find the underlying function that produces the deformation. It is possible to create a function that applies the deformation empirically simply by doing f <- approxfun(untransformed_vector, transformed_vector).
I didn't say there were many ways of doing non-linear transformations. What I meant is that in your original example, there is no correspondence between the grid lines in the original picture and the second picture, so there is an infinite choice for which gridlines in the first picture correspond to which gridlines in the second picture. There is therefore an infinite choice of mapping functions that could be specified.
The function f can be as complicated as you like, but in this scenario it should at least be strictly increasing everywhere, so that any value of the function's output can be mapped back to a single value of its input. For example, function(x) x + sin(x)/4 + cos(3*(x + 2))/5 would be a complex but ever-increasing sinusoidal function.
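A small sketch of how one might check that requirement and build an approximate inverse with approxfun. The grid range 0 to 120 and the test values are assumptions chosen to match the scale of the example data:

```r
# Candidate mapping: sum of sinusoids plus the identity
f <- function(x) x + sin(x) / 4 + cos(3 * (x + 2)) / 5

# Check numerically on a fine grid that f is strictly increasing
grid <- seq(0, 120, by = 0.01)
stopifnot(all(diff(f(grid)) > 0))

# Because f is strictly increasing, swapping inputs and outputs
# gives an empirical inverse by linear interpolation
f_inv <- approxfun(f(grid), grid)

# Round trip: transform some values and map them back
s_test <- c(10, 55.5, 100)
round_trip_error <- max(abs(f_inv(f(s_test)) - s_test))
round_trip_error
```

If the check fails for some combination of sinusoids, shrinking their amplitudes or frequencies until the slope stays positive is the simplest remedy.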
If I plot some data and then use lines to superimpose the same data points on the graph, I get the same data points. Let's say:
x<-rnorm(100)
plot(x, type="p")
lines(x, type="p",pch=2)
However, I have realized that there is a distortion in R plots when the same is done in a multipanel graph. It seems R is unable to recall the exact values on the y-axis when you plot the same data again. The simple code below shows that the outputs from "plot" and "lines" are not the same.
set.seed(1000)
Range<-rbind(rep(0,4),c(100,100,1,100));thres<-70
Ylab<-c("MAD","Bias","CP","CIL")
X<-list(EVI=cbind(runif(10,0,100),runif(10,0,100),
runif(10,0,1),runif(10,0,100)),
Qp=cbind(runif(10,0,100),runif(10,0,100),runif(10,0,1),runif(10,0,100)))
Plot<-function(x,Pch=1,thres)
{
  par(mfrow=c(1,4),las=2)
  for(j in 1:4)
  {
    plot(x[,j],xaxt = "n",xlab="Estimator",
         ylab=Ylab[j],type = "p", pch = Pch, ylim=Range[,j])
    par(mfg=c(1,j))
    axis(1, at=1:nrow(x), labels=LETTERS[1:nrow(x)])
    if(j!=3){
      par(mfg=c(1,j))
      abline(h=thres,col=2)
    }else{
      par(mfg=c(1,j))
      abline(h=c(0.90,0.95,0.99),lty=c(2,1,2),col=rep(2,3))
    }
  }
}
Line<-function(x,Pch)
{
  for(j in 1:ncol(x)) {
    par(mfg=c(1,j))
    lines(x[,j], type = "p", pch = Pch,col=2)
  }
}
lapply(X,function(dat)Plot(dat,thres=thres))
## First panel
Line(X$EVI,Pch=2)
## Move to second panel
Line(X$Qp,Pch=2)
What explains the distortion in the positioning of the points in the 3rd column? Note that I have included the range of each variable in the "Plot" function, courtesy of #WhiteViking. However, the distortion still shows. Thank you.
The problem is in the ordering of 'plot' and 'lines'.
Code like this, with all 3 'plot' commands upfront:
set.seed(1)
X <- cbind(rnorm(20), 2 * rnorm(20), 3 * rnorm(20))
par(mfrow = c(1,3))
for (i in 1:3) {
  plot(X[,i])
}
for (i in 1:3) {
  par(mfg = c(1,i))
  lines(X[,i], type = "p", col = 2, pch = 3)
}
yields misalignment:
In the example above, the first 'lines' command that gets executed bases its scaling on the last 'plot' that happened. Since that had a larger vertical range than the first, the scaling of the 'lines' is incorrect.
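To see this directly, one can compare par("usr") before and after jumping back to the first panel. This is only a diagnostic sketch using the same made-up data as above:

```r
set.seed(1)
X <- cbind(rnorm(20), 2 * rnorm(20), 3 * rnorm(20))
par(mfrow = c(1, 3))

plot(X[, 1])
usr_first <- par("usr")   # coordinate system of panel 1

plot(X[, 2])
plot(X[, 3])
usr_last <- par("usr")    # coordinate system of panel 3

# Moving back to panel 1 changes where drawing happens,
# but not the user coordinates used to place the points
par(mfg = c(1, 1))
usr_now <- par("usr")

rbind(usr_first, usr_last, usr_now)
```

The usr_now row should match usr_last rather than usr_first, which is why the overlaid points land in the wrong place.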
Whereas structured like so:
set.seed(1)
X <- cbind(rnorm(20), 2 * rnorm(20), 3 * rnorm(20))
par(mfrow = c(1,3))
for (i in 1:3) {
  par(mfg = c(1,i))
  plot(X[,i])
  lines(X[,i], type = "p", col = 2, pch = 3)
}
it gives correct alignment of 'plot' and 'lines':
You'll probably have to rework your code to group 'plot' and 'lines' together for each sub-plot.
When the third column is converted to percentages, the ylim becomes uniform and hence there isn't such distortion. However, it would be good to find a way around it instead of such an ad hoc transformation.
plot() sets up a coordinate system via plot.window based on the range of the data. This information is apparently stored in par("usr") for the latest plot, which means that if you want to revisit older plots, you should store those usr values and reset them accordingly:
set.seed(123)
d1 <- data.frame(x=1:10, y=rnorm(10))
d2 <- data.frame(x=1:10, y=10*rnorm(10))
par(mfrow=c(1,2),mar=c(2.5,2.5,0,0))
plot(d1, type="p")
usr1 <- par("usr")
plot(d2, type="p")
usr2 <- par("usr")
par(mfg=c(1,1), usr=usr1)
points(d1, col="red", pch=3)
par(mfg=c(1,2), usr=usr2)
points(d2, col="red", pch=3)
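The same idea extends to any number of panels by keeping the usr values in a list. A minimal sketch, assuming a single row of panels and made-up data:

```r
set.seed(1)
X <- cbind(rnorm(20), 2 * rnorm(20), 3 * rnorm(20))
par(mfrow = c(1, ncol(X)))

# First pass: draw every panel and remember its coordinate system
usr <- vector("list", ncol(X))
for (i in seq_len(ncol(X))) {
  plot(X[, i])
  usr[[i]] <- par("usr")
}

# Second pass: revisit each panel, restoring its own coordinates first
for (i in seq_len(ncol(X))) {
  par(mfg = c(1, i), usr = usr[[i]])
  points(X[, i], col = 2, pch = 3)
}
```

This only restores the user coordinates. If panels differ in margins or other settings, those would need to be saved and restored in the same way.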
I am trying to write a function that will produce what I regard as a real dot plot (unlike the Cleveland variety, I require a univariate scatterplot with the dots stacked for (nearly) equal values). I have come close:
In this illustration, the dots you see are actually rotated text strings of lower-case "o"s. It is done this way because I need the dot spacing to stay constant if the plot is re-scaled. However, I'd like something better than lower-case "o"s, for example, filled dots instead of circles. This could be done if I could access the font that is used for the standard plotting symbols (pch = 1:25 in the plot function and relatives). Then I could make a text string with that font and get what's needed. Does anybody know how to do that?
PS - No, a histogram with lots of bins is not an acceptable substitute.
I did find a way to get the desired dot plot using low-level graphics parameters (namely "usr", the actual user coordinates of the plotting area, and "cxy", the character size). The recordGraphics() function wraps the part that needs to be changed when the graph is resized. Here's the function:
dot.plot = function(x, pch = 16, bins = 50, spacing = 1, xlab, ...) {
  if(missing(xlab))
    xlab = as.character(substitute(x))

  # determine dot positions
  inc = diff(pretty(x, n = bins)[1:2])
  freq = table(inc * round(x / inc, 0))
  xx = rep(as.numeric(names(freq)), freq)
  yy = unlist(lapply(freq, seq_len))

  # make the order of the dots the same as the order of the data
  idx = seq_along(x)
  idx[order(x)] = idx
  xx = xx[idx]
  yy = yy[idx]

  # make a blank plot
  plot(xx, yy, type = "n", axes = FALSE, xlab = xlab, ylab = "")

  # draw scale
  axis(1)
  ylow = par("usr")[3]
  abline(h = ylow) # extend to full width

  # draw points and support resizing
  recordGraphics({
      yinc = 0.5 * spacing * par("cxy")[2]
      points(xx, ylow + yinc * (yy - .5), pch = pch, ...)
    },
    list(),
    environment(NULL))

  invisible()
}
The spacing argument may be used if you want a tighter or looser gap between dots. An example...
with(iris, dot.plot(Sepal.Length, col = as.numeric(Species)))
This is a better solution than trying to do it with text, but also a little bit scary because of the warnings you see in the documentation for recordGraphics.
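For anyone puzzling over the "determine dot positions" step, here is that part on its own with a tiny made-up vector and a fixed bin width of 0.1 (the function itself derives the width from pretty()):

```r
x <- c(1.02, 0.98, 2.0, 1.0, 3.1)   # illustrative data
inc <- 0.1                          # bin width, fixed here for clarity

# Snap each value to the nearest multiple of inc and count per bin
freq <- table(inc * round(x / inc))

# One x position per dot, repeated by the bin count
xx <- rep(as.numeric(names(freq)), times = as.vector(freq))

# Stack height within each bin: 1, 2, 3, ...
yy <- unlist(lapply(as.vector(freq), seq_len))

xx   # 1 1 1 2 3.1
yy   # 1 2 3 1 1
```

The three values near 1 share a bin and are stacked at heights 1 to 3, while the other two sit alone at height 1.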
I am creating violin plots with lots (will be ~100) of columns (violins). The problem is that the name of each column is very long. What I am currently doing is as follows:
jpeg("stats/AllDistanceViolinPlot.jpg", width = 1000, height = 1000);
do.call(vioplot, c(lapply(data, na.omit),list(names=c("veryveryveryverylongname1", "veryveryveryverylongname2", "veryveryveryverylongname3", "veryveryveryverylongname4", "veryveryveryverylongname5", "veryveryveryverylongname6", "veryveryveryverylongname7", "veryveryveryverylongname8"))));
dev.off()
Which gives me this plot:
As you can see, the names of the columns are very long and some actually are not shown. I have also tried something without the list:
jpeg("stats/plot.jpg", width = 1000, height = 1000);
do.call(vioplot, c(lapply(data, na.omit)));
dev.off()
Which gives me this plot:
What I'd like is one of two things:
The names of the columns would be vertical so that they are shown and aren't cut off
or
Make the main plot like the second image I posted and have a separate legend that would correlate each column with the full name. For example, something like the following:
1 - veryveryveryverylongname1
2 - veryveryveryverylongname2
...
8 - veryveryveryverylongname8
Could someone please suggest the better way (or both) and comment on how to implement them?
Greatly appreciated.
Unfortunately the vioplot function in the vioplot package does not accept the usual base graphics parameters for modifying the orientation of axis annotation. You will need to make a new vioplot function and change this code:
if (!horizontal) {
    if (!add) {
        plot.window(xlim = xlim, ylim = ylim)
        axis(2)
        axis(1, at = at, label = label)
To this:
if (!horizontal) {
    if (!add) {
        plot.window(xlim = xlim, ylim = ylim)
        axis(2)
        axis(1, at = at, label = label , las=2)
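Note that vertical labels this long also need a taller bottom margin, otherwise they will still be clipped. The sketch below uses an empty base plot as a stand-in for the violin plot (it does not call vioplot), shows the margin adjustment, and builds the numbered key for the second option. The margin value of 14 lines is a guess chosen by eye:

```r
long_names <- paste0("veryveryveryverylongname", 1:8)

# Option 1: vertical labels, with the bottom margin enlarged to fit them
op <- par(mar = c(14, 4, 2, 1))
plot(seq_along(long_names), rep(0, length(long_names)), type = "n",
     xaxt = "n", xlab = "", ylab = "")
axis(1, at = seq_along(long_names), labels = long_names, las = 2)
par(op)

# Option 2: label the columns 1..8 on the plot and keep the
# full names in a separate key
key <- paste(seq_along(long_names), long_names, sep = " - ")
writeLines(key)
```

With around 100 columns the second option is likely to be more readable, since the key can be printed or placed in a separate figure.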
I am drawing dotplot() using lattice or Dotplot() using Hmisc. When I use default parameters, I can plot error bars without small vertical endings
--o--
but I would like to get
|--o--|
I know I can get
|--o--|
when I use centipede.plot() from plotrix or segplot() from latticeExtra, but those solutions don't give me such nice conditioning options as Dotplot(). I was trying to play with par.settings of plot.line, which works well for changing error bar line color, width, etc., but so far I've been unsuccessful in adding the vertical endings:
require(Hmisc)
mean = c(1:5)
lo = mean-0.2
up = mean+0.2
d = data.frame (name = c("a","b","c","d","e"), mean, lo, up)
Dotplot(name ~ Cbind(mean,lo,up),data=d,ylab="",xlab="",col=1,cex=1,
par.settings = list(plot.line=list(col=1),
layout.heights=list(bottom.padding=20,top.padding=20)))
Please, don't give me solutions that use ggplot2...
I've had this same need in the past, with barchart() instead of with Dotplot().
My solution then was to create a customized panel function that: (1) first executes the original panel function ; and (2) then uses panel.arrows() to add the error bar (using a two-headed arrow, in which the edges of the head form a 90 degree angle with the shaft).
Here's what that might look like with Dotplot():
# Create the customized panel function
mypanel.Dotplot <- function(x, y, ...) {
  panel.Dotplot(x,y,...)
  tips <- attr(x, "other")
  panel.arrows(x0 = tips[,1], y0 = y,
               x1 = tips[,2], y1 = y,
               length = 0.15, unit = "native",
               angle = 90, code = 3)
}
# Use almost the same call as before, replacing the default panel function
# with your customized function.
Dotplot(name ~ Cbind(mean,lo,up),data=d,ylab="",xlab="",col=1,cex=1,
panel = mypanel.Dotplot,
par.settings = list(plot.line=list(col=1),
layout.heights=list(bottom.padding=20,top.padding=20)))
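If Hmisc is not available, the same two-headed-arrow trick can be sketched with lattice alone. This is not the Dotplot() interface: the limits are passed as extra arguments named lo and up (names chosen for this example) and picked out inside the panel function via subscripts:

```r
library(lattice)

d <- data.frame(name = factor(c("a", "b", "c", "d", "e")), mean = 1:5)
d$lo <- d$mean - 0.2
d$up <- d$mean + 0.2

p <- dotplot(name ~ mean, data = d,
             lo = d$lo, up = d$up,                      # forwarded to the panel function
             xlim = range(d$lo, d$up) + c(-0.3, 0.3),   # leave room for the bars
             panel = function(x, y, lo, up, subscripts, ...) {
               panel.dotplot(x, y, ...)
               # angle = 90 with code = 3 draws a flat cap at both ends
               panel.arrows(x0 = lo[subscripts], y0 = as.numeric(y),
                            x1 = up[subscripts], y1 = as.numeric(y),
                            angle = 90, code = 3, length = 0.05)
             })
print(p)
```

Conditioning formulas such as name ~ mean | group should still work in principle, since subscripts selects the rows belonging to each panel, but that is untested here.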