Extendable piecewise function for any number of knots/breaks - r

I have the following collection of slopes, breaks, and intercepts:
slopes <- c(4, 2, 8, 4)
breaks <- c(0.0150, 0.030, 0.035)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
They define the following lines:
# y = slopes[1] * x + intercepts[1]
# y = slopes[2] * x + intercepts[2]
# y = slopes[3] * x + intercepts[3]
# y = slopes[4] * x + intercepts[4]
Graphing the lines yields:
tibble(x = seq(0.0025, 0.06, 0.0025), y = x) %>%
ggplot(aes(x, y)) +
geom_point(alpha = 0) +
geom_abline(intercept = intercepts[1], slope = slopes[1], color = "red") +
geom_abline(intercept = intercepts[2], slope = slopes[2], color = "orange") +
geom_abline(intercept = intercepts[3], slope = slopes[3], color = "yellow") +
geom_abline(intercept = intercepts[4], slope = slopes[4], color = "green2") +
scale_y_continuous(limits = c(0, 1))
I'd like to create a piecewise function based on the lines and breaks/knots, like so (follow: red -> orange -> yellow -> green):
I could wrap a function over a couple of if/else statements to get what I want. But I'd like for the solution to be extendable for any number of breaks/knots (instead of just 3, in this example).
How might I accomplish this?

This should be fairly extensible:
piecewise <- function(x, slopes, intercepts, breaks) {
i = 1 + findInterval(x, breaks)
y = slopes[i] * x + intercepts[i]
return(y)
}
Note that I put the breaks argument last, since that seemed most natural to me.
It automatically implements the piecewise defined function for any number of pieces.
Example:
slopes <- c(4, 2, 8, 4)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
breaks <- c(0.0150, 0.030, 0.035)
library(tidyverse)
df <- tibble(x = seq(0.0025, 0.06, 0.0025)) %>%
mutate(y = piecewise(x, slopes, intercepts, breaks))
df %>%
ggplot(aes(x, y)) +
geom_line()
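To see why this extends to any number of pieces, it helps to look at what findInterval returns. The sketch below uses base R only and repeats the function from above; the length check and the test points in x_test are my additions, not part of the original answer. Note that findInterval expects breaks to be sorted in increasing order, and a point that lies exactly on a break is assigned to the piece on its right.

```r
# Same function as in the answer, plus a sanity check on argument lengths
# (the check is an addition for illustration).
piecewise <- function(x, slopes, intercepts, breaks) {
  stopifnot(length(slopes) == length(breaks) + 1,
            length(intercepts) == length(slopes))
  # findInterval counts how many breaks are <= x, so adding 1 gives
  # the index of the line segment that applies to each x
  i <- 1 + findInterval(x, breaks)
  slopes[i] * x + intercepts[i]
}

slopes     <- c(4, 2, 8, 4)
intercepts <- c(0.0299, 0.0599, -0.1201, 0.0199)
breaks     <- c(0.0150, 0.030, 0.035)

# Hypothetical test points: one below the first break, two on breaks,
# one between breaks, one above the last break
x_test <- c(0.010, 0.015, 0.020, 0.030, 0.040)
idx    <- 1 + findInterval(x_test, breaks)
y_test <- piecewise(x_test, slopes, intercepts, breaks)

print(idx)     # segment index used for each point
print(y_test)  # value of the piecewise function at each point
```

Adding a fifth segment only means adding one more entry to each of slopes, intercepts, and breaks; the function body stays the same.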

Related

Axis with label = comma without showing decimals for large numbers

I have an x-axis on a log scale, and I would like to display the labels without scientific notation (i.e. 1,000 rather than 1e3). I have always done this with label = scales::comma, but my dataset now also has very small values (0.001, for instance). When I add + scale_x_log10(label = comma), 1e-3 is shown as 0.001 (as it should be), but 1e3 is shown as 1,000.000. I would like to drop the three decimal places so that 1,000.000 becomes 1,000. Using label = comma_format(accuracy = 1), as suggested here, makes values like 0.001 show as 0, so it's not a valid option.
Does anyone have any idea?
Here is a reproducible example of the problem:
library(ggplot2)
X <- 10^seq(-3, 3, length.out = 50)
Y <- 100 * X/(X + 1)
Demo_data <- data.frame(X, Y)
ggplot(Demo_data, aes(x = X, y = Y)) + geom_line(size = 1.5) +
scale_x_log10(breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = scales::comma)
This solution does not work:
ggplot(Demo_data, aes(x = X, y = Y)) + geom_line(size = 1.5) +
scale_x_log10(breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = scales::comma_format(accuracy = 1))
One option would be to use an ifelse to conditionally set the accuracy for values > 1 and < 1:
X <- 10^seq(-3, 3, length.out = 50)
Y <- 100 * X / (X + 1)
Demo_data <- data.frame(X, Y)
library(ggplot2)
library(scales)
ggplot(Demo_data, aes(x = X, y = Y)) +
geom_line(size = 1.5) +
scale_x_log10(
breaks = c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3),
label = ~ ifelse(.x < 1, scales::comma(.x), scales::comma(.x, accuracy = 1))
)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
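Another possibility, sketched here with base R's format() rather than scales, is to format each break on its own so that no shared number of decimals is imposed. The helper name plain_comma is made up for this example, and the ggplot call only runs if ggplot2 is installed.

```r
# Hypothetical helper: format every value individually, with a thousands
# separator and without scientific notation. Because each value is
# formatted on its own, 1000 is not padded to match 0.001.
plain_comma <- function(x) {
  vapply(
    x,
    function(v) format(v, big.mark = ",", scientific = FALSE, trim = TRUE),
    character(1)
  )
}

brks <- c(1e-3, 1e-2, 1e-1, 1, 10, 1e2, 1e3)
labs_out <- plain_comma(brks)
print(labs_out)

# Usage with the example data from the question (skipped if ggplot2 is missing)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  X <- 10^seq(-3, 3, length.out = 50)
  Y <- 100 * X / (X + 1)
  Demo_data <- data.frame(X, Y)
  p <- ggplot(Demo_data, aes(x = X, y = Y)) +
    geom_line() +
    scale_x_log10(breaks = brks, labels = plain_comma)
}
```

This relies on format() choosing the minimum number of digits for a single value, so it suits round breaks like powers of ten; for arbitrary breaks the ifelse approach gives more control.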

How can I add confidence intervals to a scatterplot for a regression on two variables?

I need to create an insightful graphic with a regression line, data points, and confidence intervals. I am not looking for smoothed lines. I have tried several approaches, but I just can't get it right.
I am looking for something like this:
Some code I have tried:
p <- scatterplot(df.regsoft$w ~ df.regsoft$b,
data = df.regsoft,
boxplots = FALSE,
regLine = list(method=lm, col="red"),
pch = 16,
cex = 0.7,
xlab = "Fitted Values",
ylab = "Residuals",
legend = TRUE,
smooth = FALSE)
abline(coef = confint.lm(result.rs))
This is the closest I have got to what I intended, but it still doesn't produce the plot I want. Note that I turned off "smooth", since that is not what I am looking for.
How can I make this plot interactive?
If you don't mind switching to ggplot and the tidyverse, then this is simply a geom_smooth(method = "lm"):
library(tidyverse)
d <- tibble( #random stuff
x = rnorm(100, 0, 1),
y = 0.25 * x + rnorm(100, 0, 0.25)
)
m <- lm(y ~ x, data = d) #linear model
d %>%
ggplot() +
aes(x, y) + #what to plot
geom_point() +
geom_smooth(method = "lm") +
theme_bw()
Without method = "lm", it draws a smoothed line.
As for the Conf. interval (Obs 95%) lines, it seems to me that this is simply a quantile regression. In that case, you can use the quantreg package.
If you want to make it interactive, you can use the plotly package:
library(plotly)
p <- d %>%
ggplot() +
aes(x, y) +
geom_point() +
geom_smooth(method = "lm") +
theme_bw()
ggplotly(p)
================================================
P.S.
I am not completely sure this is what the figure you posted is showing (I guess so), but to add the quantile lines, I would just perform two quantile regressions (upper and lower) and then calculate the values of the quantile lines for your data:
library(tidyverse)
library(quantreg)
d <- tibble( #random stuff
x = rnorm(100, 0, 1),
y = 0.25 * x + rnorm(100, 0, 0.25)
)
m <- lm(y ~ x, data = d) #linear model
# 95% quantile, two tailed
rq_low <- rq(y ~ x, data = d, tau = 0.025) #lower quantile
rq_high <- rq(y ~ x, data = d, tau = 0.975) #upper quantile
d %>%
mutate(low = rq_low$coefficients[1] + x * rq_low$coefficients[2],
high = rq_high$coefficients[1] + x * rq_high$coefficients[2]) %>%
ggplot() +
geom_point(aes(x, y)) +
geom_smooth(aes(x, y), method = "lm") +
geom_line(aes(x, low), linetype = "dashed") +
geom_line(aes(x, high), linetype = "dashed") +
theme_bw()
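If the outer lines in the target figure are instead a 95% prediction interval for individual observations (an assumption on my part, since I can only guess from the label), they can be computed directly from the linear model with predict(). The sketch below uses simulated data like the example above; the column names in bands are made up, and the plotting part only runs if ggplot2 is installed.

```r
set.seed(42)
d <- data.frame(x = rnorm(100, 0, 1))
d$y <- 0.25 * d$x + rnorm(100, 0, 0.25)

m <- lm(y ~ x, data = d)

# Evaluate both intervals on a regular grid of x values
new_x <- data.frame(x = seq(min(d$x), max(d$x), length.out = 50))
conf_band <- predict(m, newdata = new_x, interval = "confidence", level = 0.95)
pred_band <- predict(m, newdata = new_x, interval = "prediction", level = 0.95)

bands <- data.frame(
  x        = new_x$x,
  fit      = conf_band[, "fit"],
  conf_lwr = conf_band[, "lwr"],  # interval for the mean response
  conf_upr = conf_band[, "upr"],
  pred_lwr = pred_band[, "lwr"],  # interval for a new observation
  pred_upr = pred_band[, "upr"]
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_point(data = d, aes(x, y)) +
    geom_ribbon(data = bands, aes(x, ymin = conf_lwr, ymax = conf_upr), alpha = 0.3) +
    geom_line(data = bands, aes(x, fit)) +
    geom_line(data = bands, aes(x, pred_lwr), linetype = "dashed") +
    geom_line(data = bands, aes(x, pred_upr), linetype = "dashed") +
    theme_bw()
}
```

The prediction band is always wider than the confidence band, because it adds the residual variance to the uncertainty of the fitted line.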

How do I plot a bivariate function equated to 0 in R and ggplot2?

I am trying to plot a bivariate function equated to 0 in R, using no packages other than the basic packages and ggplot2. Namely, the function is:
f(x,y) = x-log(x)+y-log(y)+C, where C < -2
Can I plot this function equated to 0 using R? I did this using the Desmos online graphing calculator and it worked, but now I can't figure out how to do it in R. I don't need the exact solution of x and y, but just the plot.
Here's a quick brute force approach:
library(tidyverse)
my_func <- function(x, y, C) { x - log(x) + y - log(y) + C }
expand_grid(x = seq(0, 15, by = 0.02),
y = seq(0, 15, by = 0.02),
C = seq(-10, -2, by = 1)) %>%
mutate(error = my_func(x, y, C)) %>%
filter(abs(error) < 0.1) %>%
ggplot(aes(x, y, alpha = 1 - abs(error))) +
geom_tile() +
guides(alpha = "none") +
facet_wrap(~C)
EDIT: version using ggplot2 with base R.
library(ggplot2)
output <- expand.grid(x = seq(0, 15, by = 0.02),
y = seq(0, 15, by = 0.02),
C = seq(-10, -2, by = 1))
output$error = my_func(output$x, output$y, output$C)
output <- output[abs(output$error) < 0.1,]
ggplot(output, aes(x, y, alpha = 1 - abs(error))) +
geom_tile() +
guides(alpha = "none") +
facet_wrap(~C)
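A thinner curve can be had by letting ggplot2 trace the zero level of the function as a contour line instead of keeping all tiles close to zero. This is a sketch using geom_contour with a single break at 0; the value C_value = -3 is an arbitrary choice satisfying C < -2, and the grid starts just above 0 to avoid log(0).

```r
my_func <- function(x, y, C) x - log(x) + y - log(y) + C

C_value <- -3  # arbitrary example value with C < -2

# Regular grid of strictly positive x and y values
grid <- expand.grid(x = seq(0.05, 15, by = 0.05),
                    y = seq(0.05, 15, by = 0.05))
grid$z <- my_func(grid$x, grid$y, C_value)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # breaks = 0 draws only the contour where f(x, y) = 0
  p <- ggplot(grid, aes(x, y, z = z)) +
    geom_contour(breaks = 0) +
    coord_equal()
}
```

The contour only appears if the function takes both negative and positive values on the grid, which is the case here; to compare several values of C, the same grid can be built per C and faceted as in the code above.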

saving ggplot in a list gives me the same graph

I am trying to plot 12 different plots on a 3 by 4 grid, but it only plots the last one 12 times. Can anyone help me? I am so fed up with it. Thanks
library(ggplot2)
library(gridExtra)
pmax=0.85
K_min = 0.0017
T = seq(100,1200,by=100) ## ISIs
lambda =1/T
p=list()
for(i in (1:length(lambda))){
p[[i]]<-ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
stat_function(fun = function (x) (lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1),colour = "dodgerblue3")+
scale_x_continuous(name = "Probability") +
scale_y_continuous(name = "Frequency") + theme_bw()
main <- grid.arrange(grobs=p,ncol=4)
}
This code produces the correct picture but I need to use ggplot since my other figures are in ggplot.
par( mfrow = c( 3, 4 ) )
for (i in (1:length(lambda))){
f <- function (x) ((lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1) )
curve(f,from=0, to=1, col = "violet",lwd=2,sub = paste0("ISI = ",round(1/lambda[i],3), ""),ylab="PDF",xlab="R")
}
Correct plot using curve:
The function passed to stat_function() is not evaluated when each ggplot object is created; it is evaluated when the plot is drawn, after the loop has finished. Since every function looks up lambda[i] at that point, all the plots use the last value of i (12). Here are two possible workarounds:
Workaround 1. Convert each ggplot object into a grob within the loop, & save that to the list:
for(i in (1:length(lambda))){
# code for generating each plot is unchanged
g <- ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
stat_function(fun = function (x) (lambda[i]*(1-(1-pmax))/K_min)*(1-x)^((lambda[i]/K_min)-1)*
(1-(1-pmax)*x)^-((lambda[i]/K_min)+1),colour = "dodgerblue3")+
scale_x_continuous(name = "Probability") +
scale_y_continuous(name = "Frequency") + theme_bw()
p[[i]] <- ggplotGrob(g)
}
main <- grid.arrange(grobs=p, ncol=4)
Workaround 2. Put all the data in a data frame, & create a single ggplot with a facet for each ISI:
library(dplyr)
pmax = 0.85
K_min = 0.0017
ISI = seq(100, 1200, by = 100) # I changed this; using `T` as a name clashes with T from TRUE/FALSE
lambda = 1/ISI
df <- data.frame(
x = rep(seq(0, 1, length.out = 101), length(ISI)),
ISI = rep(ISI, each = 101),
l = rep(lambda, each = 101)
) %>%
mutate(y = (l * pmax / K_min) * (1-x) ^ ((l / K_min) - 1) *
(1 - (1 - pmax) * x)^-((l / K_min) + 1))
ggplot(df,
aes(x = x, y = y, group = 1)) +
geom_line(colour = "dodgerblue3") +
facet_wrap(~ISI, nrow = 3, scales = "free_y") +
labs(x = "Probability", y = "Frequency") +
theme_bw()
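A third option, sketched below, keeps one plot per ISI but builds each density function with a small factory, so that every function carries its own copy of lambda instead of looking up i later. The name make_pdf is made up for this example, and the plotting part only runs if ggplot2 is installed.

```r
pmax  <- 0.85
K_min <- 0.0017
ISI   <- seq(100, 1200, by = 100)
lambda <- 1 / ISI

# Hypothetical factory: returns a density function with l, pmax and K_min
# fixed at the values passed in. force() makes sure they are evaluated now,
# not when the returned function is first called.
make_pdf <- function(l, pmax, K_min) {
  force(l); force(pmax); force(K_min)
  function(x) {
    (l * pmax / K_min) * (1 - x)^((l / K_min) - 1) *
      (1 - (1 - pmax) * x)^-((l / K_min) + 1)
  }
}

pdfs <- lapply(lambda, make_pdf, pmax = pmax, K_min = K_min)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plots <- lapply(pdfs, function(f) {
    ggplot(data.frame(x = c(0, 1)), aes(x = x)) +
      stat_function(fun = f, colour = "dodgerblue3") +
      scale_x_continuous(name = "Probability") +
      scale_y_continuous(name = "Frequency") +
      theme_bw()
  })
}
```

The resulting list can then be passed to grid.arrange(grobs = plots, ncol = 4) as in the question.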

How do I get the equation for a regression line in log-log plot in ggplot2?

I have a log-log plot, and I got the regression line using:
geom_smooth(formula = y ~ x, method='lm')
But now I'd like to obtain the equation of this line (e.g. y=a*x^(-b)) and print it. I managed to get it in a lin-lin plot but not in this case.
Here's the code:
mydataS<-data.frame(DurPeak_h[],IntPeak[],IntPeakxDurPeak[],ID[]) #df peak
names(mydataS)<-c("x","y","ID","IDEVENT")
plotID<-ggplot(mydataS, aes(x=x, y=y, label=IDEVENT)) +
geom_text(check_overlap = TRUE, hjust = 0, nudge_x = 0.02)+
geom_point(colour="black", size = 2) + geom_point(aes(colour = ID)) +
geom_quantile(quantiles = qs, colour="green")+
scale_colour_gradient(low = "white", high="red") +
scale_x_log10(limits = c(min(DurEnd_h),max(DurEnd_h))) +
scale_y_log10(limits = c(min(IntEnd),max(IntEnd))) +
geom_smooth(formula = y ~ x, method='lm')
ggsave(height=7,"plot.pdf")
mydataS<-data.frame(DurPeak_h[],IntPeak[],IntPeakxDurPeak[],ID[])
names(mydataS)<-c("x","y","ID","IDEVENT")
model <- lm(y ~ x, data = mydataS)
summary(model)
Use the intercept value reported by summary(model) as your "b" and the coefficient of x as your "a".
Did it with a workaround: using nls to calculate the two parameters a and b, precisely:
nlsPeak <- coef(nls(y ~ a*(x)^b, data = mydataS, start = list(a=30, b=-0.1)))
then plotting the line with annotate (see some examples here) and finally printing the equation using the function:
power_eqn = function(ds){
m = nls(y ~ a*x^b, start = list(a=30, b=-0.1), data = ds);
eq <- substitute(italic(y) == a ~italic(x)^b,
list(a = format(coef(m)[1], digits = 4),
b = format(coef(m)[2], digits = 2)))
as.character(as.expression(eq));
}
called as follows:
annotate("text",x = 3, y = 180,label = power_eqn(mydataS), parse=TRUE, col="black") +
Hope it helps!
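For what it's worth, when both axes use scale_x_log10 and scale_y_log10, geom_smooth(method = "lm") fits the line on the log10-transformed values, so the same coefficients can be recovered with lm on log10(y) and log10(x). The sketch below uses a small made-up data set (toy) that follows y = 30 * x^(-0.5) exactly, so the result is easy to check; with real data the estimates will generally differ somewhat from the nls fit, since nls minimises errors on the original scale.

```r
# Made-up, noise-free data following y = a * x^b with a = 30, b = -0.5
toy <- data.frame(x = c(1, 2, 5, 10, 20, 50))
toy$y <- 30 * toy$x^(-0.5)

# Straight line in log-log space: log10(y) = log10(a) + b * log10(x)
fit <- lm(log10(y) ~ log10(x), data = toy)

a <- 10^coef(fit)[[1]]  # back-transform the intercept
b <- coef(fit)[[2]]     # the slope is the exponent

eqn <- sprintf("y = %.3g * x^%.3g", a, b)
print(eqn)
```

The string in eqn can be passed to annotate("text", ...) in the same way as the label above, without parse = TRUE.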
