I apologize first for bringing what I imagine to be a ridiculously simple problem here, but I have been unable to glean from the help file for package 'polynom' how to solve this problem. For one out of several years, I have two vectors of x (d for day of year) and y (e for an index of egg production) data:
d=c(169,176,183,190,197,204,211,218,225,232,239,246)
e=c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,0.016599262,0.002810977,0.00560387 8,0,0.002810977,0.002810977)
I want to, for each year, use the poly.calc function to create a polynomial function that I can use to interpolate the timing of maximum egg production. I want then to superimpose the function on a plot of the data. To begin, I have no problem with the poly.calc function:
egg1996<-poly.calc(d,e)
egg1996
3216904000 - 173356400*x + 4239900*x^2 - 62124.17*x^3 + 605.9178*x^4 - 4.13053*x^5 +
0.02008226*x^6 - 6.963636e-05*x^7 + 1.687736e-07*x^8
I can then simply
plot(d,e)
But when I try to use the lines function to superimpose the function on the plot, I get confused. The help file states that the output of poly.calc is an object of class polynomial, and so I assume that "egg1996" will be the "x" in:
lines(x, len = 100, xlim = NULL, ylim = NULL, ...)
But I cannot seem to, based on the example listed:
lines (poly.calc( 2:4), lty = 2)
Or based on the arguments:
x an object of class "polynomial".
len size of vector at which evaluations are to be made.
xlim, ylim the range of x and y values with sensible defaults
Come up with a command that successfully graphs the polynomial "egg1996" onto the raw data.
I understand that this question is beneath you folks, but I would be very grateful for a little help. Many thanks.
I don't work with the polynom package, but the resultant data set is on a completely different scale (both X & Y axes) than the first plot() call. If you don't mind having it in two separate panels, this provides both plots for comparison:
library(polynom)
d <- c(169,176,183,190,197,204,211,218,225,232,239,246)
e <- c(0,0,0.006839425,0.027323127,0.024666883,0.005603878,
0.016599262,0.002810977,0.005603878,0,0.002810977,0.002810977)
egg1996 <- poly.calc(d,e)
par(mfrow=c(1,2))
plot(d, e)
plot(egg1996)
Related
I am trying to simulate a signal in order to apply some methods of non-linear fittings, but I have some problems when plotting it.
x<-sample(seq(0,1,length.out = 1000),200)
y<-2*sin(4*pi*x)-6*abs(x-0.4)^(0.3)+2*exp(-30*(4*x-2)^2)+8*x+rnorm(200,0,0.5)
s<-2*sin(4*pi*x)-6*abs(x-0.4)^(0.3)+2*exp(-30*(4*x-2)^2)+8*x
plot(x,y)
lines(x,s,col="red")
The idea I want to have 200 observations uniformly sampled with an additive white noise term, and the I would like to plot this "perturbed" signal together with the original signal. (y and s respectively).
The fact is that if I use the code that I wrote I obtain as result something like:
Probably is such a simple thing, but I'm kinda stuck with this.
Any hint or suggestion will be greatly appreciated.
Lines are plotted sequentially, and you decided to randomly draw your X values, so x values sitting next to each other in x are not next to each other on the axis - hence the mess. Just sort it:
x<-sort(sample(seq(0,1,length.out = 1000),200))
y<-2*sin(4*pi*x)-6*abs(x-0.4)^(0.3)+2*exp(-30*(4*x-2)^2)+8*x+rnorm(200,0,0.5)
s<-2*sin(4*pi*x)-6*abs(x-0.4)^(0.3)+2*exp(-30*(4*x-2)^2)+8*x
plot(x,y)
lines(x,s,col="red")
Another way to do this on the fly mentioned by mickey is:
ord = order(x)
lines(x[ord], s[ord], col = 'red')
You need to reorder the x observations order in ascending order, you can do that by storing everything in a dataframe object and then ordering it:
x<-sample(seq(0,1,length.out = 1000),200)
df_p= data.frame(x)
df_p$y<-2*sin(4*pi*df_p$x)-6*abs(df_p$x-0.4)^(0.3)+2*exp(-30*(4*df_p$x-2)^2)+8*df_p$x+rnorm(200,0,0.5)
df_p$s<-2*sin(4*pi*df_p$x)-6*abs(df_p$x-0.4)^(0.3)+2*exp(-30*(4*df_p$x-2)^2)+8*df_p$x
df_p = df_p[order(df_p$x),]
plot(df_p$x,df_p$y)
lines(df_p$x, df_p$s,col="red")
Also if you want to avoid this step you can use the ggplot2 library:
p <- ggplot(df_p) + geom_point(aes(x = x,y= y)) + geom_line(aes(x=x,y=s,color='red'))
plot(p)
I have a measurment of which should fit an hysteresis. For visualisation purpose I would like to plot a line approximating the hysteresis to help explain this pattern.
I created an example in the following image using the code below.
I would like to have an output similar to the green curve - however I don't have this data directly available, and I don't care whether it is pointy.
However most smoothing functions such as smooth.spline which I plotted in blue - allow no loops. The closest I can find is from the bezier library - plotted in red. Not nicely visible here but it produces a loop, however it fits poorly (and gives some warnings and takes quite some time).
Can you suggest a method?
set.seed(12345)
up <- seq(0,1,length.out=100)^3
down <- sqrt(seq(1,0,length.out=100))
x <- c(seq(0,1,length.out=length(up)),
seq(1,0, length.out=length(down)))
data <- data.frame(x=x, y=c(up,down),
measuredx=x + rnorm(length(x))*0.01,
measuredy=c(up,down) + rnorm(length(up)+length(down))*0.03)
with(data,plot(measuredx,measuredy, type = "p"))
with(data,lines(x,y, col='green'))
sp <- with(data,smooth.spline(measuredx, measuredy))
with(sp, lines(x,y, col="blue"))
library(bezier)
bf <- bezierCurveFit(as.matrix(data[,c(1,3)]))
lines(bezier(t=seq(0, 1, length=500), p=bf$p), col="red", cex=0.25)
UPDATE
As it turns out my actual problem is slightly different I ask another question to reflect my actual issue in the question: How to fit a smooth hysteresis in a poorly distributed data set?
set.seed(12345)
up <- seq(0,1,length.out=100)^3
down <- sqrt(seq(1,0,length.out=100))
x <- c(seq(0,1,length.out=length(up)),
seq(1,0, length.out=length(down)))
data <- data.frame(x=x, y=c(up,down),
measuredx=x + rnorm(length(x))*0.01,
measuredy=c(up,down) + rnorm(length(up)+length(down))*0.03)
Instead of smoothing data$measuredy directly over data$measuredx, do two separate smoothing, by smoothing each against a time stamp variable. Then combine the fitted values from two smoothing. This is a general way for smoothing a closed curve or a loop. (See also Q & A: Smoothing Continuous 2D Points)
t <- seq_len(nrow(data) + 1)
xs <- smooth.spline(t, c(data$measuredx, data$measuredx[1]))$y
ys <- smooth.spline(t, c(data$measuredy, data$measuredy[1]))$y
with(data, plot(measuredx, measuredy))
lines(xs, ys)
c(data$measuredx, data$measuredx[1]) for example is just to ensure that the last value in the vector agrees with the first, so that it completes a cycle.
The curve is not really closed at the bottom left corner, because smooth.spline is doing smoothing not interpolation, so even if we have ensure that data vector completes a cycle, the fitted one may not be a closed one. A practical workaround is to use weighted regression, imposing heavy weight on this spot to make it closed.
t <- seq_len(nrow(data) + 1)
w <- rep(1, length(t)) ## initially identical weight everywhere
w[c(1, length(w))] <- 100000 ## give heavy weight
xs <- smooth.spline(t, c(data$measuredx, data$measuredx[1]), w)$y
ys <- smooth.spline(t, c(data$measuredy, data$measuredy[1]), w)$y
with(data, plot(measuredx, measuredy), col = 8)
lines(xs, ys, lwd = 2)
Is there a way to get the plot function to generate equal xlimand ylimautomatically?
I do not want to define a fix range beforehand, but I want the plot function to decide about the range itself. However, I expect it to pick the same range for x and y.
A possible solution is to define a wrapper to the plot function:
plot.Custom <- function(x, y, ...) {
.limits <- range(x, y)
plot(x, y, xlim = .limits, ylim = .limits, ...)
}
One way is to manipulate interactively and then choose the right one. A slider will appear once you run the following code.
library(manipulate)
manipulate(
plot(cars, xlim=c(x.min,x.max)),
x.min=slider(0,15),
x.max=slider(15,30))
I'm not aware of anyway to do this using plot(doesn't mean there isn't one). ggplot might be the way to go; it lends itself more to be being retroactively changed since it is designed around a layer system.
library(ggplot2)
#Creating our ggplot object
loop_plot <- ggplot(cars, aes(x = speed, y = dist)) +
geom_point()
#pulling out the 'auto' x & y axis limits
rangepull <- t(cbind(
ggplot_build(loop_plot)$panel$ranges[[1]]$x.range,
ggplot_build(loop_plot)$panel$ranges[[1]]$y.range))
#taking the max and min(so we don't cut out data points)
newrange <- list(cor.min = min(rangepull[,1]), cor.max = max(rangepull[,2]))
#changing our plot size to be nice and symmetric
loop_plot <- loop_plot +
xlim(newrange$cor.min, newrange$cor.max) +
ylim(newrange$cor.min, newrange$cor.max)
Note that the loop_plot object is of ggplot class, and wont actually print until its called.
I used the cars dataset in the code above to show whats going on, but just sub in your data set[s] and then do whatever postmortem your end goal is.
You'll also be able to add in titles and the like based off of the dataset name et cetera which will likely end up producing a clearer visualization out of your loop.
Hopefully this works for your needs.
I need to draw lines from the data stored in a text file.
So far I am able only to draw points on a graph and i would like to have them as lines (line graph).
Here's the code:
pupil_data <- read.table("C:/a1t_left_test.dat", header=T, sep="\t")
max_y <- max(pupil_data$PupilLeft)
plot(NA,NA,xlim=c(0,length(pupil_data$PupilLeft)), ylim=c(2,max_y));
for (i in 1:(length(pupil_data$PupilLeft) - 1))
{
points(i, y = pupil_data$PupilLeft[i], type = "o", col = "red", cex = 0.5, lwd = 2.0)
}
Please help me change this line of code:
points(i, y = pupil_data$PupilLeft[i], type = "o", col = "red")
to draw lines from the data.
Here is the data in the file:
PupilLeft
3.553479
3.539469
3.527239
3.613131
3.649437
3.632779
3.614373
3.605981
3.595985
3.630766
3.590724
3.626535
3.62386
3.619688
3.595711
3.627841
3.623596
3.650569
3.64876
By default, R will plot a single vector as the y coordinates, and use a sequence for the x coordinates. So to make the plot you are after, all you need is:
plot(pupil_data$PupilLeft, type = "o")
You haven't provided any example data, but you can see this with the built-in iris data set:
plot(iris[,1], type = "o")
This does in fact plot the points as lines. If you are actually getting points without lines, you'll need to provide a working example with your data to figure out why.
EDIT:
Your original code doesn't work because of the loop. You are in effect asking R to plot a line connecting a single point to itself each time through the loop. The next time through the loop R doesn't know that there are other points that you want connected; if it did, this would break the intended use of points, which is to add points/lines to an existing plot.
Of course, the line connecting a point to itself doesn't really make sense, and so it isn't plotted (or is plotted too small to see, same result).
Your example is most easily done without a loop:
PupilLeft <- c(3.553479 ,3.539469 ,3.527239 ,3.613131 ,3.649437 ,3.632779 ,3.614373
,3.605981 ,3.595985 ,3.630766 ,3.590724 ,3.626535 ,3.62386 ,3.619688
,3.595711 ,3.627841 ,3.623596 ,3.650569 ,3.64876)
plot(PupilLeft, type = 'o')
If you really do need to use a loop, then the coding becomes more involved. One approach would be to use a closure:
makeaddpoint <- function(firstpoint){
## firstpoint is the y value of the first point in the series
lastpt <- firstpoint
lastptind <- 1
addpoint <- function(nextpt, ...){
pts <- rbind(c(lastptind, lastpt), c(lastptind + 1, nextpt))
points(pts, ... )
lastpt <<- nextpt
lastptind <<- lastptind + 1
}
return(addpoint)
}
myaddpoint <- makeaddpoint(PupilLeft[1])
plot(NA,NA,xlim=c(0,length(PupilLeft)), ylim=c(2,max(PupilLeft)))
for (i in 2:(length(PupilLeft)))
{
myaddpoint(PupilLeft[i], type = "o")
}
You can then wrap the myaddpoint call in the for loop with whatever testing you need to decide whether or not you will actually plot that point. The function returned by makeaddpoint will keep track of the plot indexing for you.
This is normal programming for Lisp-like languages. If you find it confusing you can do this without a closure, but you'll need to handle incrementing the index and storing the previous point value 'manually' in your loop.
There is a strong aversion among experienced R coders to using for-loops when not really needed. This is an example of a loop-less use of a vectorized function named segments that takes 4 vectors as arguments: x0,y0, x1,y1
npups <-length(pupil_data$PupilLeft)
segments(1:(npups-1), pupil_data$PupilLeft[-npups], # the starting points
2:npups, pupil_data$PupilLeft[-1] ) # the ending points
I just came by the following plot:
And wondered how can it be done in R? (or other softwares)
Update 10.03.11: Thank you everyone who participated in answering this question - you gave wonderful solutions! I've compiled all the solution presented here (as well as some others I've came by online) in a post on my blog.
Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.
Make.Funny.Plot <- function(x){
unique.vals <- length(unique(x))
N <- length(x)
N.val <- min(N/20,unique.vals)
if(unique.vals>N.val){
x <- ave(x,cut(x,N.val),FUN=min)
x <- signif(x,4)
}
# construct the outline of the plot
outline <- as.vector(table(x))
outline <- outline/max(outline)
# determine some correction to make the V shape,
# based on the range
y.corr <- diff(range(x))*0.05
# Get the unique values
yval <- sort(unique(x))
plot(c(-1,1),c(min(yval),max(yval)),
type="n",xaxt="n",xlab="")
for(i in 1:length(yval)){
n <- sum(x==yval[i])
x.plot <- seq(-outline[i],outline[i],length=n)
y.plot <- yval[i]+abs(x.plot)*y.corr
points(x.plot,y.plot,pch=19,cex=0.5)
}
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)
EDIT : corrected so it always works.
I recently came upon the beeswarm package, that bears some similarity.
The bee swarm plot is a
one-dimensional scatter plot like
"stripchart", but with closely-packed,
non-overlapping points.
Here's an example:
library(beeswarm)
beeswarm(time_survival ~ event_survival, data = breast,
method = 'smile',
pch = 16, pwcol = as.numeric(ER),
xlab = '', ylab = 'Follow-up time (months)',
labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER),
title = 'ER', pch = 16, col = 1:2)
(source: eklund at www.cbs.dtu.dk)
I have come up with the code similar to Joris, still I think this is more than a stem plot; here I mean that they y value in each series is a absolute value of a distance to the in-bin mean, and x value is more about whether the value is lower or higher than mean.
Example code (sometimes throws warnings but works):
px<-function(x,N=40,...){
x<-sort(x);
#Cutting in bins
cut(x,N)->p;
#Calculate the means over bins
sapply(levels(p),function(i) mean(x[p==i]))->meansl;
means<-meansl[p];
#Calculate the mins over bins
sapply(levels(p),function(i) min(x[p==i]))->minl;
mins<-minl[p];
#Each dot is one value.
#X is an order of a value inside bin, moved so that the values lower than bin mean go below 0
X<-rep(0,length(x));
for(e in levels(p)) X[p==e]<-(1:sum(p==e))-1-sum((x-means)[p==e]<0);
#Y is a bin minum + absolute value of a difference between value and its bin mean
plot(X,mins+abs(x-means),pch=19,cex=0.5,...);
}
Try the vioplot package:
library(vioplot)
vioplot(rnorm(100))
(with awful default color ;-)
There is also wvioplot() in the wvioplot package, for weighted violin plot, and beanplot, which combines violin and rug plots. They are also available through the lattice package, see ?panel.violin.
Since this hasn't been mentioned yet, there is also ggbeeswarm as a relatively new R package based on ggplot2.
Which adds another geom to ggplot to be used instead of geom_jitter or the like.
In particular geom_quasirandom (see second example below) produces really good results and I have in fact adapted it as default plot.
Noteworthy is also the package vipor (VIolin POints in R) which produces plots using the standard R graphics and is in fact also used by ggbeeswarm behind the scenes.
set.seed(12345)
install.packages('ggbeeswarm')
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()