Adding least squares and LMS lines to a plot - r

I have loaded the data set Animals2 from the robustbase package, and I am interested in working with the logarithm of these data.
library(robustbase)
x<-log(Animals2)
plot(x, main="Plot Animals2", col="darkgreen")
Now I need to add the least-squares line, the LMS line, and the Siegel line, each in a different colour. Finally, I also want to show two plots of the estimated 2-dimensional density. The first should be a 3D visualization of the density (I was trying to use persp) and the second should use image (applied to the density estimate).
I am a bit stuck and would appreciate some help.

Assuming that mblm() does the Siegel model, you could use this:
library(robustbase)
library(mblm)
data(Animals2)
x <- log(Animals2)
plot(x, main="Plot Animals2", col="darkgreen")
abline(lm(brain ~ body, data=x), col="red")
abline(MASS::lqs(brain ~ body, data=x, method="lms"), col="blue")
abline(mblm(brain ~ body, data=x, repeated=TRUE), col="black")
legend("topleft", c("LM", "LMS", "Siegel"),
col = c("red", "blue", "black"),
lty = c(1,1,1),
inset=.01)
d2 <- MASS::kde2d(x$body, x$brain)
persp(d2)
image(d2, xlab="body", ylab="brain")
Created on 2022-05-20 by the reprex package (v2.0.1)
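For readers who want to see what a Siegel (repeated-median) fit computes, here is a minimal base-R sketch on simulated data. The function name repeated_median_slope and the simulated x and y are hypothetical, and the intercept is taken as the median of y - slope * x, which is one common convention and may differ from what mblm() returns.

```r
# Simulated data with a known slope of 0.5 and three gross outliers
# (hypothetical data, not the Animals2 set from the question).
set.seed(1)
x <- runif(30, 0, 10)
y <- 2 + 0.5 * x + rnorm(30, sd = 0.3)
y[1:3] <- y[1:3] + 15

# Repeated-median slope: for each point i take the median of the slopes
# to all other points, then take the median of those medians.
repeated_median_slope <- function(x, y) {
  inner <- vapply(seq_along(x), function(i) {
    median((y[-i] - y[i]) / (x[-i] - x[i]))
  }, numeric(1))
  median(inner)
}

b_rm <- repeated_median_slope(x, y)
a_rm <- median(y - b_rm * x)   # assumed intercept convention
ls_fit <- lm(y ~ x)            # least squares, for comparison

plot(x, y, main = "Least squares vs. repeated median")
abline(ls_fit, col = "red")
abline(a = a_rm, b = b_rm, col = "black")
legend("topleft", c("LM", "Repeated median"),
       col = c("red", "black"), lty = 1, inset = .01)
```

The sketch assumes there are no ties in x; with tied x values some pairwise slopes would be undefined and would need to be dropped first.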

Related

Parameters in R lines Function

I want to plot the 97% confidence interval and prediction interval into my other plot using different colours and legends. I want them to be represented by just straight lines across the scatter plot graph.
Please forgive me as I'm new to R so any tips also on how to phrase this question would be helpful!
I'm confused about what to put in as the x and y coordinates in the lines function.
I tried putting in a and m to match, but that creates a very strange, complicated graph that I'm sure is not correct. Could you explain what I should put where I wrote HELP in the code below?
attach(my_data)
plot(a, m,
xlab="a", ylab = "m",
main = "Confidence intervals and prediction intervals",
ylim = c(10,50))
abline(lm.fit,lwd=5,col='pink')
p_pred <- predict(lm.fit,data.frame(a=c(14.50)),interval="prediction",level=0.97)
p_conf <- predict(lm.fit,data.frame(a=c(14.50)),interval="confidence",level=0.97)
lines(HELP,p_conf[,"lwr"], col="red", type="b", pch="+")
lines(HELP,p_conf[,"upr"], col="red", type="b", pch="+")
lines(HELP,p_pred[,"upr"], col="blue", type="b", pch="*")
lines(HELP,p_pred[,"lwr"], col="blue", type="b", pch="*")
legend("bottomright",
pch=c("+","*"),
col=c("red","blue"),
legend = c("confidence","prediction"))
I'm sure for this problem the solution is very simple, so I apologize as I am not that familiar with R if I am asking an easy question!
If you only want the plot, you don't need to calculate the predictions yourself; you can use geom_smooth from the ggplot2 library, as in the example below. Note that geom_smooth draws only the confidence band, not the prediction interval.
library(ggplot2)
ggplot(mtcars, aes(mpg,cyl))+
geom_point()+
geom_smooth(method = "lm", level =0.97)
Thank you for the suggestions on the page! I actually ended up fixing the problem by creating a new variable that took the min and max value of the x variable and then the lines fitted on the graph.
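The fix described above (predicting over the range of the x variable) can be sketched in base R roughly as follows. This uses mtcars in place of the asker's my_data, so the variable names wt and mpg and the object new_x are assumptions for illustration only. The x argument passed to lines() is simply the x column of the data frame given to predict().

```r
# Fit a simple linear model (mtcars stands in for the asker's data).
fit <- lm(mpg ~ wt, data = mtcars)

# A grid of x values spanning the observed range; these are the
# x coordinates that go where HELP was in the question.
new_x <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50))

# 97% confidence and prediction limits at each grid point.
p_conf <- predict(fit, new_x, interval = "confidence", level = 0.97)
p_pred <- predict(fit, new_x, interval = "prediction", level = 0.97)

plot(mpg ~ wt, data = mtcars,
     main = "Confidence intervals and prediction intervals")
abline(fit, lwd = 2, col = "pink")
lines(new_x$wt, p_conf[, "lwr"], col = "red")
lines(new_x$wt, p_conf[, "upr"], col = "red")
lines(new_x$wt, p_pred[, "lwr"], col = "blue")
lines(new_x$wt, p_pred[, "upr"], col = "blue")
legend("topright", lty = 1, col = c("red", "blue"),
       legend = c("confidence", "prediction"))
```

Predicting at a single x value, as in the question, yields only one point per limit, which is why lines() had nothing sensible to connect.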

Setting Trend Line Length in R

I have managed to create a scatterplot with two datasets on a single plot. One set of data has an X axis that ranges from 0-40 (Green), while the other only ranges from 0-15 (Red).
I used this code to add trend lines to the red and green data separately (using par(new)).
plot( x1,y1, col="red", axes=FALSE, xlab="",ylab="",ylim= range(0:1), xlim= range(0:40))
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
fit <- nls(y1 ~ f(x1,a,b,d), start = c(a=1, b=1, d=1))
co <- coef(fit)
curve(f(x, a=co[1], b=co[2], d=co[3]), add = TRUE, col="red", lwd=1)
My issue is that I can't seem to find a way to stop the red trend line at 15 on the x axis. I "googled" around and nothing seemed to come up for my issue. Lots on Excel trend lines! I tried adding an end= statement to fit<- and that did not work either.
Please help,
I hope I have posted enough information. Thanks in advance.
Try using ggplot. The following example uses the mtcars data:
library(ggplot2)
ggplot(mtcars, aes(qsec, wt, color=factor(vs)))+geom_point()+ stat_smooth(se=F)
You can do this in base graphics with the from and to arguments of the curve function (see the help for curve for more details). For example:
# Your function
f <- function(x1,a,b,d) {(a*x1^2) + (b*x1) + d}
# Plot the function from x=-100 to x=100
curve(f(x, a=-2, b=3, d=0), from=-100, to=100, col="red", lwd=1, lty=1)
# Same curve going from x=-100 to x=0 (and shifted down by 1000 units so it's
# easy to see)
curve(f(x, a=-2, b=3, d=-1000), from=-100, to=0,
add=TRUE, col="blue", lwd=2, lty=1)
If you want to set the curve x-limits programmatically, you can do something like this (assuming your data frame is called df and your x-variable is called x):
curve(f(x, a=-2, b=3, d=0), from=range(df$x)[1], to=range(df$x)[2],
add=TRUE, col="red", lwd=1, lty=1)
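Putting this together with the nls fit from the question, a minimal sketch might look like the following. The x1 and y1 values are simulated here because the original data were not posted, so the numbers are purely illustrative.

```r
# Simulated "red" data that only covers x = 0..15 (hypothetical values).
set.seed(42)
x1 <- seq(0, 15, length.out = 30)
y1 <- 0.002 * x1^2 + 0.02 * x1 + 0.1 + rnorm(30, sd = 0.02)

# Quadratic model and nls fit, as in the question.
f <- function(x1, a, b, d) (a * x1^2) + (b * x1) + d
fit <- nls(y1 ~ f(x1, a, b, d), start = c(a = 1, b = 1, d = 1))
co <- coef(fit)

# The plot spans 0..40, but the curve is drawn only over the data range.
plot(x1, y1, col = "red", xlim = c(0, 40), ylim = c(0, 1))
curve(f(x, a = co[["a"]], b = co[["b"]], d = co[["d"]]),
      from = min(x1), to = max(x1), add = TRUE, col = "red", lwd = 1)
```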

Autocorrelation scatterplot for a special lag

I am using the acf function to calculate the autocorrelation. I want a scatter plot for one specific lag, for example lag 2. Is it possible to do this with the acf function?
Is there any package that would help me produce a scatter plot for each lag?
lag.plot1=function(data1,max.lag=1,corr=TRUE,smooth=FALSE){
name1=paste(deparse(substitute(data1)),"(t-",sep="")
name2=paste(deparse(substitute(data1)),"(t)",sep="")
data1=as.ts(data1)
max.lag=as.integer(max.lag)
prow=ceiling(sqrt(max.lag))
pcol=ceiling(max.lag/prow)
a=acf(data1,max.lag,plot=FALSE)$acf[-1]
par(mfrow=c(prow,pcol), mar=c(2.5, 4, 2.5, 1), cex.main=1.1, font.main=1)
for(h in 1:max.lag){
plot(lag(data1,-h), data1, xy.labels=FALSE, main=paste(name1,h,")",sep=""), ylab=name2, xlab="")
if (smooth==TRUE)
lines(lowess(ts.intersect(lag(data1,-h),data1)[,1],
ts.intersect(lag(data1,-h),data1)[,2]), col="red")
if (corr==TRUE)
legend("topright", legend=round(a[h], digits=2), text.col ="blue", bg="white", x.intersp=0)
}
}
lag.plot1(dat,12,smooth=TRUE)
The full call is lag.plot1(x,m,corr=TRUE,smooth=TRUE) and it will generate a grid of scatterplots of x(t-h) versus x(t) for h = 1,...,m, along with the autocorrelation values in blue and a lowess fit in red. If you don't want any correlations or lines, simply use R's built-in lag.plot (from the stats package):
library(forecast)
lag.plot(LakeHuron,lags=3,do.lines=FALSE)
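For a single specific lag, such as lag 2 from the question, one option is to build the lagged pairs by hand. This is only a sketch using the built-in LakeHuron series; the names h, x_lagged and x_current are hypothetical. The annotated value comes from acf(), which uses a slightly different estimator than the plain correlation of the plotted pairs, so the two need not match exactly.

```r
x <- as.numeric(LakeHuron)
h <- 2                       # the lag of interest
n <- length(x)

# Pair each observation with the one h steps earlier.
x_lagged  <- x[1:(n - h)]
x_current <- x[(h + 1):n]

plot(x_lagged, x_current,
     xlab = paste0("x(t-", h, ")"), ylab = "x(t)",
     main = paste("Lag", h, "scatter plot"))

# Sample autocorrelation at lag h (element 1 of $acf is lag 0).
r_h <- acf(x, lag.max = h, plot = FALSE)$acf[h + 1]
legend("topleft", legend = paste("acf =", round(r_h, 2)), bty = "n")
```

The same single panel can also be requested with lag.plot(LakeHuron, set.lags = 2, do.lines = FALSE).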

Plot a log-curve to a scatter plot

I am facing a probably pretty easy-to-solve issue: adding a log curve to a scatter plot.
I have already created the corresponding model and now only need to add the respective curve/line.
The current model is as follows:
### DATA
SpStats_urbanform <- c (0.3702534,0.457769,0.3069843,0.3468263,0.420108,0.2548158,0.347664,0.4318018,0.3745645,0.3724192,0.4685135,0.2505839,0.1830535,0.3409849,0.1883303,0.4789871,0.3979671)
co2 <- c (6.263937,7.729964,8.39634,8.12979,6.397212,64.755192,7.330138,7.729964,11.058834,7.463414,7.196863,93.377393,27.854284,9.081405,73.483949,12.850917,12.74407)
### Plot initial plot
plot (log10 (1) ~ log10 (1), col = "white", xlab = "PUSHc values",
ylab = "Corrected GHG emissions [t/cap]", xlim =c(0,xaxes),
ylim =c(0,yaxes), axes =F)
axis(1, at=seq(0.05, xaxes, by=0.05), cex.axis=1.1)
axis(2, at=seq(0, yaxes, by=1), cex.axis=1.1 )
### FIT
fit_co2_urbanform <- lm (log10(co2) ~ log10(SpStats_urbanform))
### Add data points (used points() instead of simple plot() bc. of other code parts)
points (co2_cap~SpStats_urbanform, axes = F, cex =1.3)
I already have all the fit parameters, but I am still unable to construct the corresponding fit curve for co2_cap (y-axis) ~ SpStats_urbanform (x-axis).
Can anyone help me finalize this little piece of code?
First, if you want to plot in a log-log space, you have to specify it with argument log="xy":
plot (co2~SpStats_urbanform, log="xy")
Then if you want to add your regression line, then use abline:
abline(fit_co2_urbanform)
Edit: If you don't want to plot in a log-log scale then you'll have to translate your equation log10(y)=a*log10(x)+b into y=10^(a*log10(x)+b) and plot it with curve:
f <- coefficients(fit_co2_urbanform)
curve(10^(f[1]+f[2]*log10(x)),ylim=c(0,100))
points(SpStats_urbanform,co2)
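As a self-contained sketch of the back-transformation, the following uses the data from the question and draws the fitted curve on the original scale only over the observed x range. Here a and b are simply names for the slope and intercept of the log-log fit, and the power-law form 10^b * x^a is algebraically the same as 10^(a*log10(x)+b).

```r
SpStats_urbanform <- c(0.3702534, 0.457769, 0.3069843, 0.3468263, 0.420108,
                       0.2548158, 0.347664, 0.4318018, 0.3745645, 0.3724192,
                       0.4685135, 0.2505839, 0.1830535, 0.3409849, 0.1883303,
                       0.4789871, 0.3979671)
co2 <- c(6.263937, 7.729964, 8.39634, 8.12979, 6.397212, 64.755192, 7.330138,
         7.729964, 11.058834, 7.463414, 7.196863, 93.377393, 27.854284,
         9.081405, 73.483949, 12.850917, 12.74407)

# Linear fit in log-log space: log10(y) = a * log10(x) + b
fit <- lm(log10(co2) ~ log10(SpStats_urbanform))
b <- coef(fit)[[1]]   # intercept
a <- coef(fit)[[2]]   # slope

# Scatter plot on the original (untransformed) axes.
plot(SpStats_urbanform, co2, cex = 1.3,
     xlab = "PUSHc values", ylab = "Corrected GHG emissions [t/cap]")

# Back-transformed fit: y = 10^b * x^a, drawn over the data range only.
curve(10^b * x^a,
      from = min(SpStats_urbanform), to = max(SpStats_urbanform),
      add = TRUE, col = "red")
```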

How do I plot the 'inverse' of a survival function?

I am trying to plot the inverse of a survival function, as the data I have is actually an increase in the proportion of an event over time. I can produce Kaplan-Meier survival plots, but I want to produce the 'opposite' of these. I can kind of get what I want using the following fun="cloglog":
plot(survfit(Surv(Days_until_workers,Workers)~Queen_Number+Treatment,data=xdata),
fun="cloglog", lty=c(1:4), lwd=2, ylab="Colonies with Workers",
xlab="Days", las=1, font.lab=2, bty="n")
But I don't understand quite what this has done to the time (i.e. doesn't start at 0 and distance decreases?), and why the survival lines extend above the y axis.
Would really appreciate some help with this!
Cheers
Use fun="event" to get the desired output:
library(survival)
fit <- survfit(Surv(time, status) ~ x, data = aml)
par(mfrow=1:2, las=1)
plot(fit, col=2:3)
plot(fit, col=2:3, fun="event")
The reason for fun="cloglog" screwing up the axes is that it does not plot a fraction at all. It is instead plotting this according to ?plot.survfit:
"cloglog" creates a complimentary log-log survival plot (f(y) = log(-log(y)) along with log scale for the x-axis)
Moreover, the fun argument is not limited to predefined functions like "event" or "cloglog", so you can easily give it your own custom function.
plot(fit, col=2:3, fun=function(y) 3*sqrt(1-y))
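To make the relationship explicit, fun="event" plots 1 - S(t), so passing the custom function function(y) 1 - y should give the same curves. The sketch below uses the aml data from the survival package, as in the answer; event_prob is a hypothetical name for the cumulative event proportion computed by hand.

```r
library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)

# Cumulative event proportion at each event/censoring time: 1 - S(t)
event_prob <- 1 - fit$surv

par(mfrow = c(1, 2), las = 1)
plot(fit, col = 2:3, fun = "event", main = "fun = \"event\"")
plot(fit, col = 2:3, fun = function(y) 1 - y, main = "fun = function(y) 1 - y")
```

Unlike fun="cloglog", neither call transforms the time axis, so the curves start at 0 and stay within the 0 to 1 range.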
