I have managed to create a scatterplot with two datasets on a single plot. One dataset has x values ranging from 0 to 40 (green), while the other only ranges from 0 to 15 (red).
I used this code to add trend lines to the red and green data separately (using par(new)).
plot( x1,y1, col="red", axes=FALSE, xlab="",ylab="",ylim= range(0:1), xlim= range(0:40))
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
fit <- nls(y1 ~ f(x1,a,b,d), start = c(a=1, b=1, d=1))
co <- coef(fit)
curve(f(x, a=co[1], b=co[2], d=co[3]), add = TRUE, col="red", lwd=1)
My issue is that I can't find a way to stop the red trend line at 15 on the x axis. I googled around and nothing came up for my issue (lots on Excel trend lines, though). I tried adding an end= argument to the fit <- nls(...) call and that did not work either.
Please help,
I hope I have posted enough information. Thanks in advance.
Try using ggplot2. The following example uses the mtcars data:
library(ggplot2)
ggplot(mtcars, aes(qsec, wt, color=factor(vs))) + geom_point() + stat_smooth(se=FALSE)
You can do this in base graphics with the from and to arguments of the curve function (see the help for curve for more details). For example:
# Your function
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
# Plot the function from x=-100 to x=100
curve(f(x, a=-2, b=3, d=0), from=-100, to=100, col="red", lwd=1, lty=1)
# Same curve going from x=-100 to x=0 (and shifted down by 1000 units so it's
# easy to see)
curve(f(x, a=-2, b=3, d=-1000), from=-100, to=0,
add=TRUE, col="blue", lwd=2, lty=1)
If you want to set the curve x-limits programmatically, you can do something like this (assuming your data frame is called df and your x-variable is called x):
curve(f(x, a=-2, b=3, d=0), from=range(df$x)[1], to=range(df$x)[2],
add=TRUE, col="red", lwd=1, lty=1)
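To tie this back to the fit in the question, here is a minimal sketch that draws the fitted curve only over the x-range of the red data. The data are simulated stand-ins (the original x1 and y1 are not shown), so the numbers themselves are hypothetical.

```r
# Simulated stand-in for the red data set (x only spans 0 to 15); values are hypothetical
set.seed(42)
x1 <- runif(30, min = 0, max = 15)
y1 <- 0.002 * x1^2 + 0.01 * x1 + 0.2 + rnorm(30, sd = 0.02)

# Same quadratic model and nls fit as in the question
f <- function(x1, a, b, d) (a * x1^2) + (b * x1) + d
fit <- nls(y1 ~ f(x1, a, b, d), start = c(a = 1, b = 1, d = 1))
co <- coef(fit)

# The axes still span 0 to 40, as in the question
plot(x1, y1, col = "red", xlim = c(0, 40), ylim = c(0, 1))

# Draw the fitted curve only over the observed x-range of the red data
red_curve <- curve(f(x, a = co[["a"]], b = co[["b"]], d = co[["d"]]),
                   from = min(x1), to = max(x1),
                   add = TRUE, col = "red", lwd = 1)
```

curve() invisibly returns the x and y values it plotted, so red_curve can be inspected to confirm where the line starts and stops.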
I have loaded the data set Animals2 from the robustbase package, and I am interested in working with the logarithm of these data.
library(robustbase)
x<-log(Animals2)
plot(x, main="Plot Animals2", col="darkgreen")
Now I need to add the least-squares line, the LMS line, and the Siegel line, using different colours. Finally, I also want to show two plots of the estimated 2-dimensional density: the first should be a 3D visualization of the density (I was trying to use persp), and the second should use image (applied to the density estimate).
I am a bit stuck and would appreciate some help.
Assuming that mblm() does the Siegel model, you could use this:
library(robustbase)
library(mblm)
data(Animals2)
x <- log(Animals2)
plot(x, main="Plot Animals2", col="darkgreen")
abline(lm(brain ~ body, data=x), col="red")
abline(MASS::lqs(brain ~ body, data=x, method="lms"), col="blue")
abline(mblm(brain ~ body, data=x, repeated=TRUE), col="black")
legend("topleft", c("LM", "LMS", "Siegel"),
col = c("red", "blue", "black"),
lty = c(1,1,1),
inset=.01)
d2 <- MASS::kde2d(x$body, x$brain)
persp(d2)
image(d2, xlab="body", ylab="brain")
Created on 2022-05-20 by the reprex package (v2.0.1)
x <- sin(1:20)
y <- (dplyr::lag(x)-x)-x
plot(x, y, type="l")
Yields this plot:
How do I get a smooth, cyclical trajectory? The smoothing functions I've tried all want to make a single smooth function.
Example 1 (throws an error):
lines(smooth.spline(x, y))
Example 2 (draws a function):
lo <- loess(y~x)
lines(predict(lo), col='red', lwd=2)
Example 3 (draws a function):
xspline(c(x,y), shape=1, border='blue' )
I think you're asking for both (a) smoother curves and (b) cyclical/symmetric circles, correct?
Increase the count of points to make it smoother. Fix the aspect ratio to make it a circle (i.e., asp=1).
x <- sin((1:20000) / 1000)
y <- cos((1:20000) / 1000)
plot(x, y, type="l", asp=1)
EDIT, responding to OP's comments below:
This uses dplyr::lag() while increasing the number of points. Keep the lag conceptually the same, though: with a step of 1/1000 between points, lagging by 1000 points keeps y one full unit behind x.
(x=1, y=0), (x=1.001, y=0.001), ..., (x=20, y=19)
x <- sin((1:20000) / 1000)
y <- (dplyr::lag(x, 1000)-x)-x
plot(x, y, type="l", asp=1)
If that's not what you want, maybe sketch a picture and attach it (drawn in something like MS Paint or Inkscape).
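If the 20 original points have to stay as they are, another option is to treat x and y as two separate functions of the index and interpolate each one with a spline, then plot the interpolated pairs. This is only a sketch of that idea (parametric interpolation), not something proposed in the thread; t, t_fine, xs and ys are names introduced here for illustration, and the lag is written in base R so that dplyr is not needed.

```r
t <- 1:20
x <- sin(t)
# Base-R equivalent of dplyr::lag(x): shift by one position, padding with NA
y <- (c(NA, head(x, -1)) - x) - x

# Drop the first point, where the lag is undefined
ok <- !is.na(y)

# Interpolate x(t) and y(t) separately on a fine grid of t
t_fine <- seq(min(t[ok]), max(t[ok]), length.out = 2000)
xs <- spline(t[ok], x[ok], xout = t_fine)$y
ys <- spline(t[ok], y[ok], xout = t_fine)$y

# Original points, with the smooth trajectory through them
plot(x, y)
lines(xs, ys, col = "red", lwd = 2)
```

Because each coordinate is a single-valued function of t, the smoother no longer has to treat y as a function of x, which is why the trajectory can loop back on itself.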
Here is my R program:
#The code of the first picture
a<-c(278,410,37.9,100,300,71,195,51.05,59.4,145,900,718,220,130,220,138,
135,206,260,150,510,310,70,17,120,110,254.6,289.3,190)
b<-log10(a)
b.n<-length(b)
b.location<-mean(b)
b.var<-(b.n-1)/b.n*var(b)
b.scale<-sqrt(3*b.var)/pi
library(stats4)
ll.logis<-function(location=b.location,scale=b.scale){-sum(dlogis(b,location,scale,log=TRUE))}
fit.mle<-mle(ll.logis,method="Nelder-Mead")
fit.location<-coef(fit.mle)[1]
fit.scale<-coef(fit.mle)[2]
plot(b, rank(b)/length(b),pch=16,xlab="Lg toxicity data(μg/L)",ylab="Cumulative probability",lwd=3,font.lab=2,font.axis=2)
f <- function(x) plogis(x, fit.location, fit.scale)
plot(f, add=TRUE, xlim=extendrange(b,f=0.5))
#The code of the second picture is
c<-c(1300,541,441,35,278,167,276,159,126,60.8,160,9740,3480,264.6,379,170,251.3,
155.84,187.01,2800,66.5,420,840,40,1380,469,260,50,370)
d<-log10(c)
d.n<-length(d)
d.location<-mean(d)
d.var<-(d.n-1)/d.n*var(d)
d.scale<-sqrt(3*d.var)/pi
library(stats4)
ll.logis<-function(location=d.location,scale=d.scale){-sum(dlogis(d,location,scale,log=TRUE))}
fit.mle<-mle(ll.logis,method="Nelder-Mead")
fit.location<-coef(fit.mle)[1]
fit.scale<-coef(fit.mle)[2]
plot(d, rank(d)/length(d),pch=25,col="blue",xlab="Lg toxicity data(μg/L)",ylab="Cumulative probability",lwd=3,font.lab=2,font.axis=2)
k <- function(c) plogis(c, fit.location, fit.scale)
plot(k, add=TRUE, xlim=extendrange(d,f=0.5))
After these two code parts, I get two pictures (A and B).
But now I want to plot both graphs in the same coordinate system for easier comparison, like picture C.
The pictures are like this:
What code should I write?
Here is one solution. I cleaned up your code a little; you only need to define the plotting function once.
#The code of the first picture
library(stats4)
f <- function(c) plogis(c, fit.location, fit.scale)
a<-c(278,410,37.9,100,300,71,195,51.05,59.4,145,900,718,220,130,220,138,
135,206,260,150,510,310,70,17,120,110,254.6,289.3,190)
b<-log10(a)
b.n<-length(b)
b.location<-mean(b)
b.var<-(b.n-1)/b.n*var(b)
b.scale<-sqrt(3*b.var)/pi
#The code of the second picture is
c<-c(1300,541,441,35,278,167,276,159,126,60.8,160,9740,3480,264.6,379,170,251.3,
155.84,187.01,2800,66.5,420,840,40,1380,469,260,50,370)
d<-log10(c)
d.n<-length(d)
d.location<-mean(d)
d.var<-(d.n-1)/d.n*var(d)
d.scale<-sqrt(3*d.var)/pi
# Plotting first picture
plot(b, rank(b)/length(b), ylim=c(0,1), xlim=round((range(c(b, d)))))
# calculate parameters for first function
ll.logis<-function(location=b.location,scale=b.scale){-sum(dlogis(b,location,scale,log=TRUE))}
fit.mle<-mle(ll.logis,method="Nelder-Mead")
fit.location<-coef(fit.mle)[1]
fit.scale<-coef(fit.mle)[2]
plot(f, add=TRUE, xlim=extendrange(b,f= 0.5))
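# Plotting second dataset, reusing the axis limits of the first plot so both share one coordinate system
par(new=TRUE)
plot(d, rank(d)/length(d), col=2, pch=2, xaxt="n", yaxt="n", xlab="", ylab="", ylim=c(0,1), xlim=round((range(c(b, d)))))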
# calculate parameters for dataset 2
ll.logis<-function(location=d.location,scale=d.scale){-sum(dlogis(d,location,scale,log=TRUE))}
fit.mle<-mle(ll.logis,method="Nelder-Mead")
fit.location<-coef(fit.mle)[1]
fit.scale<-coef(fit.mle)[2]
plot(f, add=TRUE, xlim=extendrange(d,f=0.5), col=2)
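As a variation on the above, the fitting step can be wrapped in a small helper so that fit.location and fit.scale are not overwritten between the two data sets, and the second data set can be added with points() so that no second coordinate system is created. This is a sketch only: fit_logis is a hypothetical helper name, and the second data vector is called c2 here to avoid masking c().

```r
library(stats4)

# Hypothetical helper: maximum-likelihood fit of a logistic distribution to v
fit_logis <- function(v) {
  n <- length(v)
  # Moment-based starting values, as in the question
  start <- list(location = mean(v),
                scale = sqrt(3 * (n - 1) / n * var(v)) / pi)
  nll <- function(location, scale) -sum(dlogis(v, location, scale, log = TRUE))
  coef(mle(nll, start = start, method = "Nelder-Mead"))
}

a <- c(278,410,37.9,100,300,71,195,51.05,59.4,145,900,718,220,130,220,138,
       135,206,260,150,510,310,70,17,120,110,254.6,289.3,190)
c2 <- c(1300,541,441,35,278,167,276,159,126,60.8,160,9740,3480,264.6,379,170,251.3,
        155.84,187.01,2800,66.5,420,840,40,1380,469,260,50,370)
b <- log10(a)
d <- log10(c2)

fit_b <- fit_logis(b)
fit_d <- fit_logis(d)

# One set of axes wide enough for both data sets
xlim <- extendrange(c(b, d), f = 0.1)
plot(b, rank(b) / length(b), pch = 16, xlim = xlim, ylim = c(0, 1),
     xlab = "Lg toxicity data", ylab = "Cumulative probability")
points(d, rank(d) / length(d), pch = 2, col = 2)

# Fitted logistic CDFs, drawn in the same coordinate system
curve(plogis(x, fit_b[["location"]], fit_b[["scale"]]), add = TRUE)
curve(plogis(x, fit_d[["location"]], fit_d[["scale"]]), add = TRUE, col = 2)
```

Using points() and curve(add = TRUE) means the axis limits are set exactly once, so the two data sets cannot drift onto different scales.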
In R I'm able to overlay a normal curve on a density histogram:
I can also convert the density histogram to a probability one:
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
curve(dnorm(x, mean=mean(a), sd=sd(a)), add=TRUE)
But then I cannot overlay the normal curve anymore, since it goes off scale.
Any solution? Maybe a second y-axis?
Now the question is clear to me. Indeed a second y-axis seems to be the best choice for this as the two data sets have completely different scales.
In order to do this you could do:
set.seed(2)
a <- rnorm(1:100)
test <-hist(a, plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100 # Probability
plot(test, ylab="Probability")
#start new graph
par(new=TRUE)
#instead of using curve just use plot and create the data yourself
#(this is roughly what curve does internally anyway)
x_grid <- seq(min(test$breaks), max(test$breaks), 0.01)
curve_data <- dnorm(x_grid, mean=mean(a), sd=sd(a))
#plot the line with no axes or labels, on the same x-range as the histogram
plot(x_grid, curve_data, axes=FALSE, xlab='', ylab='', type='l', col='red', xlim=range(test$breaks))
#add these now with axis
axis(4, at=pretty(range(curve_data)))
Output:
First, save your rnorm data; otherwise you get different data each time.
seed = rnorm(100)
Next, go ahead with
hist(seed,probability = TRUE)
curve(dnorm(x, mean=mean(na.omit(seed)), sd=sd(na.omit(seed))), add=TRUE)
Now you have the expected result. Histogram with density curve.
The y-axis isn't a "probability" as you have labeled it. It is count data. If you draw your histogram on the density scale instead (freq=FALSE), you shouldn't have a problem:
dat <- rnorm(1000)
hist(dat, freq= FALSE, ylab= "Density")
# inside curve(), x is the plotting variable, so the data needs a different name
curve(dnorm(x, mean=mean(dat), sd=sd(dat)), add=TRUE)
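If the percentage scale from the question should be kept, a further option is to rescale the normal density to that scale rather than adding a second axis. The sketch below assumes equal-width bins (the default for hist()); bin_width and h are names introduced here. For equal-width bins, the percentage in a bin equals density times bin width times 100.

```r
set.seed(2)
a <- rnorm(100)

h <- hist(a, plot = FALSE)
bin_width <- diff(h$breaks)[1]            # assumes all bins have the same width

# Convert counts to percent of observations, as in the question
h$counts <- h$counts / sum(h$counts) * 100
plot(h, ylab = "Percent of observations")

# Scale the density to percent per bin so it shares the histogram's y-axis
curve(dnorm(x, mean = mean(a), sd = sd(a)) * bin_width * 100,
      add = TRUE, col = "red")
```

The curve and the bars then use one y-axis, so no par(new=TRUE) overlay is needed.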
I'm trying to calculate a Bezier-like spline curve that passes through a sequence of x-y coordinates. An example would be like the following output from the cscvn function in Matlab (example link):
I believe the (no longer maintained) grid package used to do this (grid.xspline function?), but I haven't been able to install an archived version of the package, and don't find any examples exactly along the lines of what I would like.
The bezier package also looks promising, but it is very slow and I also can't get it quite right:
library(bezier)
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
p <- cbind(x,y)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
plot(p, xlim=xlim, ylim=ylim)
text(p, labels=seq(n), pos=3)
bp <- pointsOnBezier(cbind(x,y), n=100)
lines(bp$points)
arrows(bp$points[nrow(bp$points)-1,1], bp$points[nrow(bp$points)-1,2],
bp$points[nrow(bp$points),1], bp$points[nrow(bp$points),2]
)
As you can see, it doesn't pass through any points except the end values.
I would greatly appreciate some guidance here!
There is no need to use grid, really. You can access xspline from the graphics package.
Following from your code and the shape from @mrflick:
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
p <- cbind(x,y)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
plot(p, xlim=xlim, ylim=ylim)
text(p, labels=seq(n), pos=3)
You just need one extra line:
xspline(x, y, shape = c(0,rep(-1, 10-2),0), border="red")
It may not be the best approach, but grid certainly isn't inactive. It's included as a default package with the R installation, and it's the underlying graphics engine for plotting libraries like lattice and ggplot2. You shouldn't need to install it; you should just be able to load it. Here's how I might translate your code to use grid.xspline:
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)
xlim <- c(min(x) - 0.1*diff(range(x)), c(max(x) + 0.1*diff(range(x))))
ylim <- c(min(y) - 0.1*diff(range(y)), c(max(y) + 0.1*diff(range(y))))
library(grid)
grid.newpage()
pushViewport(viewport(xscale=xlim, yscale=ylim))
grid.points(x, y, pch=16, size=unit(2, "mm"),
default.units="native")
grid.text(seq(n), x,y, just=c("center","bottom"),
default.units="native")
grid.xspline(x, y, shape=c(0,rep(-1, 10-2),0), open=TRUE,
default.units="native")
popViewport()
which results in
Note that grid is pretty low-level, so it's not super easy to work with, but it does give you far more control over what and where you plot.
And if you want to extract the points along the curve rather than draw it, look at the ?xsplinePoints help page.
Thanks to all that helped with this. I'm summarizing the lessons learned plus a few other aspects.
Catmull-Rom spline vs. cubic B-spline
Negative shape values in the xspline function return a Catmull-Rom-type spline, which passes through the x-y points. Positive values return a cubic B-spline-type curve, which only approximates them. Zero values return a sharp corner. If a single shape value is given, it is used for all points. The end points are always treated as sharp corners (shape=0), so other values have no effect on the spline there:
# Catmull-Rom spline vs. cubic B-spline
plot(p, xlim=extendrange(x, f=0.2), ylim=extendrange(y, f=0.2))
text(p, labels=seq(n), pos=3)
# Catmull-Rom spline (-1)
xspline(p, shape = -1, border="red", lwd=2)
# Catmull-Rom spline (-0.5)
xspline(p, shape = -0.5, border="orange", lwd=2)
# cubic B-spline (0.5)
xspline(p, shape = 0.5, border="green", lwd=2)
# cubic B-spline (1)
xspline(p, shape = 1, border="blue", lwd=2)
legend("bottomright", ncol=2, legend=c(-1,-0.5), title="Catmull-Rom spline", col=c("red", "orange"), lty=1)
legend("topleft", ncol=2, legend=c(1, 0.5), title="cubic B-spline", col=c("blue", "green"), lty=1)
Extracting results from xspline for external plotting
This took some searching, but the trick is to apply the argument draw=FALSE to xspline.
# Extract xy values
plot(p, xlim=extendrange(x, f=0.1), ylim=extendrange(y, f=0.1))
text(p, labels=seq(n), pos=3)
spl <- xspline(x, y, shape = -0.5, draw=FALSE)
lines(spl)
arrows(x0=(spl$x[length(spl$x)-0.01*length(spl$x)]), y0=(spl$y[length(spl$y)-0.01*length(spl$y)]),
x1=(spl$x[length(spl$x)]), y1=(spl$y[length(spl$y)])
)
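Once extracted, the coordinates can be handed to anything that accepts x-y data. The sketch below is self-contained (it regenerates the same random points) and assumes a plot has already been set up on the current device, which xspline appears to need even with draw=FALSE; spl_df is a name introduced here for illustration.

```r
set.seed(1)
n <- 10
x <- runif(n)
y <- runif(n)

# xspline works in the coordinates of the current plot, so set one up first
plot(x, y, xlim = extendrange(x, f = 0.2), ylim = extendrange(y, f = 0.2))

# Compute the interpolating X-spline without drawing it
spl <- xspline(x, y, shape = -1, draw = FALSE)

# Store the curve as a data frame for use outside base graphics
spl_df <- as.data.frame(spl)
head(spl_df)

# It can still be drawn in base graphics from the stored coordinates
lines(spl_df$x, spl_df$y, col = "red", lwd = 2)
```

The number of points returned is chosen by xspline itself, so it is safer to work with nrow(spl_df) than to assume a fixed length.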