Conditionally colour data points outside of confidence bands in R - r

I need to colour datapoints that are outside of the the confidence bands on the plot below differently from those within the bands. Should I add a separate column to my dataset to record whether the data points are within the confidence bands? Can you provide an example please?
Example dataset:
## Dataset from http://www.apsnet.org/education/advancedplantpath/topics/RModules/doc1/04_Linear_regression.html
## Disease severity as a function of temperature
# Response variable, disease severity
diseasesev<-c(1.9,3.1,3.3,4.8,5.3,6.1,6.4,7.6,9.8,12.4)
# Predictor variable, (Centigrade)
temperature<-c(2,1,5,5,20,20,23,10,30,25)
## For convenience, the data may be formatted into a dataframe
severity <- as.data.frame(cbind(diseasesev,temperature))
## Fit a linear model for the data and summarize the output from function lm()
severity.lm <- lm(diseasesev~temperature,data=severity)
# Take a look at the data
plot(
diseasesev~temperature,
data=severity,
xlab="Temperature",
ylab="% Disease Severity",
pch=16,
pty="s",
xlim=c(0,30),
ylim=c(0,30)
)
title(main="Graph of % Disease Severity vs Temperature")
par(new=TRUE) # don't start a new plot
## Get datapoints predicted by best fit line and confidence bands
## at every 0.01 interval
xRange=data.frame(temperature=seq(min(temperature),max(temperature),0.01))
pred4plot <- predict(
lm(diseasesev~temperature),
xRange,
level=0.95,
interval="confidence"
)
## Plot lines derrived from best fit line and confidence band datapoints
matplot(
xRange,
pred4plot,
lty=c(1,2,2), #vector of line types and widths
type="l", #type of plot for each column of y
xlim=c(0,30),
ylim=c(0,30),
xlab="",
ylab=""
)

Well, I thought that this would be pretty easy with ggplot2, but now I realize that I have no idea how the confidence limits for stat_smooth/geom_smooth are calculated.
Consider the following:
library(ggplot2)
pred <- as.data.frame(predict(severity.lm,level=0.95,interval="confidence"))
dat <- data.frame(diseasesev,temperature,
in_interval = diseasesev <=pred$upr & diseasesev >=pred$lwr ,pred)
ggplot(dat,aes(y=diseasesev,x=temperature)) +
stat_smooth(method='lm') + geom_point(aes(colour=in_interval)) +
geom_line(aes(y=lwr),colour=I('red')) + geom_line(aes(y=upr),colour=I('red'))
This produces:
alt text http://ifellows.ucsd.edu/pmwiki/uploads/Main/strangeplot.jpg
I don't understand why the confidence band calculated by stat_smooth is inconsistent with the band calculated directly from predict (i.e. the red lines). Can anyone shed some light on this?
Edit:
figured it out. ggplot2 uses 1.96 * standard error to draw the intervals for all smoothing methods.
pred <- as.data.frame(predict(severity.lm,se.fit=TRUE,
level=0.95,interval="confidence"))
dat <- data.frame(diseasesev,temperature,
in_interval = diseasesev <=pred$fit.upr & diseasesev >=pred$fit.lwr ,pred)
ggplot(dat,aes(y=diseasesev,x=temperature)) +
stat_smooth(method='lm') +
geom_point(aes(colour=in_interval)) +
geom_line(aes(y=fit.lwr),colour=I('red')) +
geom_line(aes(y=fit.upr),colour=I('red')) +
geom_line(aes(y=fit.fit-1.96*se.fit),colour=I('green')) +
geom_line(aes(y=fit.fit+1.96*se.fit),colour=I('green'))

The easiest way is probably to calculate a vector of TRUE/FALSE values that indicate if a data point is inside of the confidence interval or not. I'm going to reshuffle your example a little bit so that all of the calculations are completed before the plotting commands are executed- this provides a clean separation in the program logic that could be exploited if you were to package some of this into a function.
The first part is pretty much the same, except I replaced the additional call to lm() inside predict() with the severity.lm variable- there is no need to use additional computing resources to recalculate the linear model when we already have it stored:
## Dataset from
# apsnet.org/education/advancedplantpath/topics/
# RModules/doc1/04_Linear_regression.html
## Disease severity as a function of temperature
# Response variable, disease severity
diseasesev<-c(1.9,3.1,3.3,4.8,5.3,6.1,6.4,7.6,9.8,12.4)
# Predictor variable, (Centigrade)
temperature<-c(2,1,5,5,20,20,23,10,30,25)
## For convenience, the data may be formatted into a dataframe
severity <- as.data.frame(cbind(diseasesev,temperature))
## Fit a linear model for the data and summarize the output from function lm()
severity.lm <- lm(diseasesev~temperature,data=severity)
## Get datapoints predicted by best fit line and confidence bands
## at every 0.01 interval
xRange=data.frame(temperature=seq(min(temperature),max(temperature),0.01))
pred4plot <- predict(
severity.lm,
xRange,
level=0.95,
interval="confidence"
)
Now, we'll calculate the confidence intervals for the origional data points and run a test to see if the points are inside the interval:
modelConfInt <- predict(
severity.lm,
level = 0.95,
interval = "confidence"
)
insideInterval <- modelConfInt[,'lwr'] < severity[['diseasesev']] &
severity[['diseasesev']] < modelConfInt[,'upr']
Then we'll do the plot- first a the high-level plotting function plot(), as you used it in your example, but we will only plot the points inside the interval. We will then follow up with the low-level function points() which will plot all the points outside the interval in a different color. Finally, matplot() will be used to fill in the confidence intervals as you used it. However instead of calling par(new=TRUE) I prefer to pass the argument add=TRUE to high-level functions to make them act like low level functions.
Using par(new=TRUE) is like playing a dirty trick a plotting function- which can have unforeseen consequences. The add argument is provided by many functions to cause them to add information to a plot rather than redraw it- I would recommend exploiting this argument whenever possible and fall back on par() manipulations as a last resort.
# Take a look at the data- those points inside the interval
plot(
diseasesev~temperature,
data=severity[ insideInterval,],
xlab="Temperature",
ylab="% Disease Severity",
pch=16,
pty="s",
xlim=c(0,30),
ylim=c(0,30)
)
title(main="Graph of % Disease Severity vs Temperature")
# Add points outside the interval, color differently
points(
diseasesev~temperature,
pch = 16,
col = 'red',
data = severity[ !insideInterval,]
)
# Add regression line and confidence intervals
matplot(
xRange,
pred4plot,
lty=c(1,2,2), #vector of line types and widths
type="l", #type of plot for each column of y
add = TRUE
)

I liked the idea and tried to make a function for that. Of course it's far from being perfect. Your comments are welcome
diseasesev<-c(1.9,3.1,3.3,4.8,5.3,6.1,6.4,7.6,9.8,12.4)
# Predictor variable, (Centigrade)
temperature<-c(2,1,5,5,20,20,23,10,30,25)
## For convenience, the data may be formatted into a dataframe
severity <- as.data.frame(cbind(diseasesev,temperature))
## Fit a linear model for the data and summarize the output from function lm()
severity.lm <- lm(diseasesev~temperature,data=severity)
# Function to plot the linear regression and overlay the confidence intervals
ci.lines<-function(model,conf= .95 ,interval = "confidence"){
x <- model[[12]][[2]]
y <- model[[12]][[1]]
xm<-mean(x)
n<-length(x)
ssx<- sum((x - mean(x))^2)
s.t<- qt(1-(1-conf)/2,(n-2))
xv<-seq(min(x),max(x),(max(x) - min(x))/100)
yv<- coef(model)[1]+coef(model)[2]*xv
se <- switch(interval,
confidence = summary(model)[[6]] * sqrt(1/n+(xv-xm)^2/ssx),
prediction = summary(model)[[6]] * sqrt(1+1/n+(xv-xm)^2/ssx)
)
# summary(model)[[6]] = 'sigma'
ci<-s.t*se
uyv<-yv+ci
lyv<-yv-ci
limits1 <- min(c(x,y))
limits2 <- max(c(x,y))
predictions <- predict(model, level = conf, interval = interval)
insideCI <- predictions[,'lwr'] < y & y < predictions[,'upr']
x_name <- rownames(attr(model[[11]],"factors"))[2]
y_name <- rownames(attr(model[[11]],"factors"))[1]
plot(x[insideCI],y[insideCI],
pch=16,pty="s",xlim=c(limits1,limits2),ylim=c(limits1,limits2),
xlab=x_name,
ylab=y_name,
main=paste("Graph of ", y_name, " vs ", x_name,sep=""))
abline(model)
points(x[!insideCI],y[!insideCI], pch = 16, col = 'red')
lines(xv,uyv,lty=2,col=3)
lines(xv,lyv,lty=2,col=3)
}
Use it like this:
ci.lines(severity.lm, conf= .95 , interval = "confidence")
ci.lines(severity.lm, conf= .85 , interval = "prediction")

Related

R plot confidence interval lines with a robust linear regression model (rlm)

I need to plot a Scatterplot with the confidence interval for a robust linear regression (rlm) model, all the examples I had found only work with LM.
This is my code:
model1 <- rlm(weightsE$brain ~ weightsE$body)
newx <- seq(min(weightsE$body), max(weightsE$body), length.out=70)
newx<-as.data.frame(newx)
colnames(newx)<-"brain"
conf_interval <- predict(model1, newdata = data.frame(x=newx), interval = 'confidence',
level=0.95)
#create scatterplot of values with regression line
plot(weightsE$body, weightsE$body)
abline(model1)
#add dashed lines (lty=2) for the 95% confidence interval
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)
but the results of predict don't produce a straight line for the upper and lower level, they are more like random predictions.
You have a few problems to fix here.
When you generate a model, don't use rlm(weightsE$brain ~ weightsE$body), instead use rlm(brain ~ body, data = weightsE). Otherwise, the model cannot take new data for predictions. Any predictions you get will be produced from the original weightsE$body values, not from the new data you pass into predict
You are trying to create a prediction data frame with a column called "brain', but you are trying to predict the value of "brain", so you need a column called "body"
newx is already a data frame, but for some reason you are wrapping it inside another data frame when you do newdata = data.frame(x=newx). Just pass newx.
You are plotting with plot(weightsE$body, weightsE$body), when it should be plot(weightsE$body, weightsE$brain)
Putting all this together, and using a dummy data set with the same names as your own (see below), we get:
library(MASS)
model1 <- rlm(brain ~ body, data = weightsE)
newx <- data.frame(body = seq(min(weightsE$body),
max(weightsE$body), length.out=70))
conf_interval <- predict(model1, newdata = data.frame(x=newx),
interval = 'confidence',
level=0.95)
#create scatterplot of values with regression line
plot(weightsE$body, weightsE$brain)
abline(model1)
#add dashed lines (lty=2) for the 95% confidence interval
lines(newx$body, conf_interval[, 2], col = "blue", lty = 2)
lines(newx$body, conf_interval[, 3], col = "blue", lty = 2)
Incidentally, you could do the whole thing in ggplot in much less code:
library(ggplot2)
ggplot(weightsE, aes(body, brain)) +
geom_point() +
geom_smooth(method = MASS::rlm)
Reproducible dummy data
data(mtcars)
weightsE <- setNames(mtcars[c(1, 6)], c("brain", "body"))
weightsE$body <- 10 - weightsE$body

(R) Adding Confidence Intervals To Plots

I am using R. I am following this tutorial over here (https://rviews.rstudio.com/2017/09/25/survival-analysis-with-r/ ) and I am trying to adapt the code for a similar problem.
In this tutorial, a statistical model is developed on a dataset and then this statistical model is used to predict 3 news observations. We then plot the results for these 3 observations:
#load libraries
library(survival)
library(dplyr)
library(ranger)
library(data.table)
library(ggplot2)
#use the built in "lung" data set
#remove missing values (dataset is called "a")
a = na.omit(lung)
#create id variable
a$ID <- seq_along(a[,1])
#create test set with only the first 3 rows
new = a[1:3,]
#create a training set by removing first three rows
a = a[-c(1:3),]
#fit survival model (random survival forest)
r_fit <- ranger(Surv(time,status) ~ age + sex + ph.ecog + ph.karno + pat.karno + meal.cal + wt.loss, data = a, mtry = 4, importance = "permutation", splitrule = "extratrees", verbose = TRUE)
#create new intermediate variables required for the survival curves
death_times <- r_fit$unique.death.times
surv_prob <-data.frame(r_fit$survival)
avg_prob <- sapply(surv_prob, mean)
#use survival model to produce estimated survival curves for the first three observations
pred <- predict(r_fit, new, type = 'response')$survival
pred <- data.table(pred)
colnames(pred) <- as.character(r_fit$unique.death.times)
#plot the results for these 3 patients
plot(r_fit$unique.death.times, pred[1,], type = "l", col = "red")
lines(r_fit$unique.death.times, r_fit$survival[2,], type = "l", col = "green")
lines(r_fit$unique.death.times, r_fit$survival[3,], type = "l", col = "blue")
From here, I would like to try an add confidence interval (confidence regions) to each of these 3 curves, so that they look something like this:
I found a previous stackoverflow post (survfit() Shade 95% confidence interval survival plot ) that shows how to do something similar, but I am not sure how to extend the results from this post to each individual observation.
Does anyone know if there is a direct way to add these confidence intervals?
Thanks
If you create your plot using ggplot, you can use the geom_ribbon function to draw confidence intervals as follows:
ggplot(data=...)+
geom_line(aes(x=..., y=...),color=...)+
geom_ribbon(aes(x=.. ,ymin =.., ymax =..), fill=.. , alpha =.. )+
geom_line(aes(x=..., y=...),color=...)+
geom_ribbon(aes(x=.. ,ymin =.., ymax =..), fill=.. , alpha =.. )
You can put + after geom_line and repeat the same steps for each observation.
You can also check:
Having trouble plotting multiple data sets and their confidence intervals on the same GGplot. Data Frame included and
https://bookdown.org/ripberjt/labbook/appendix-guide-to-data-visualization.html

How to plot confidence bands for my weighted log-log linear regression?

I need to plot an exponential species-area relationship using the exponential form of a weighted log-log linear model, where mean species number per location/Bank (sb$NoSpec.mean) is weighted by the variance in species number per year (sb$NoSpec.var).
I am able to plot the fit, but have issues figuring out how to plot the confidence intervals around this fit. The following is the best I have come up with so far. Any advice for me?
# Data
df <- read.csv("YearlySpeciesCount_SizeGroups.csv")
require(doBy)
sb <- summaryBy(NoSpec ~ Short + Area + Regime + SizeGrp, df,
FUN=c(mean,var, length))
# Plot to fill
plot(S ~ A, xlab = "Bank Area (km2)", type = "n", ylab = "Species count",
ylim = c(min(S), max(S)))
text(A, S, label = Pisc$Short, col = 'black')
# The Arrhenius model
require(vegan)
gg <- data.frame(S=S, A=A, W=W)
mloglog <- lm(log(S) ~ log(A), weights = 1 / (log10(W + 1)), data = gg)
# Add exponential fit to plot (this works well)
lines(xtmp, exp(predict(mloglog, newdata = data.frame(A = xtmp))),
lty=1, lwd=2)
Now I want to add confidence bands... This is where I'm finding issues...
## predict using original model.. get standard errors
pp<-data.frame(A = xtmp)
p <- predict(mloglog, newdata = pp, se.fit = TRUE)
pp$fit <- p$fit
pp$se <- p$se.fit
## Calculate lower and upper bounds for each estimate using standard error * 1.96
pp$upr95 <- pp$fit + (1.96 * pp$se)
pp$lwr95 <- pp$fit - (1.96 * pp$se)
But I am not sure whether the following is correct. I couldn't find any answers that didn't involve ggplot when searching google / stack overflow / cross validated.
## Create new linear models to create a fitted line given upper and lower bounds?
upr <- lm(log(upr95) ~ log(A), data=pp)
lwr <- lm(log(lwr95) ~ log(A), data=pp)
lines(xtmp, exp(predict(upr, newdata=pp)), lty=2, lwd=1)
lines(xtmp, exp(predict(lwr, newdata=pp)), lty=2, lwd=1)
Thanks in advance for any help!
It is OK for this question to be without data provided, because:
OP's code is said to be working fine so there is nothing "not working";
this question is more related to statistical procedure: what is the right thing to do.
I would make a brief answer, as I saw you added "solved" to question title in your last update. Note it is not recommended to add such key word to question title. If something is solved, use an answer.
Strictly speaking, using 1.96 is incorrect. You can read How does predict.lm() compute confidence interval and prediction interval? for details. We need residual degree of freedom and 0.025 quantile of t-distribution.
What I want to say, is that predict.lm can return confidence interval for you:
pp <- data.frame(A = xtmp)
p <- predict(mloglog, newdata = pp, interval = "confidence")
p will be a three-column matrix, with "fit", "lwr" and "upr".
Since you fitted a log-log model, both fitted values and confidence interval need be back transformed. Simply take exp on this matrix p:
p <- exp(p)
Now you can easily use matplot to produce nice regression plot:
matplot(xtmp, p, type = "l", col = c(1, 2, 2), lty = c(1, 2, 2))

How to compute prediction intervals for a circle fit in R

I wish to compute the prediction interval of the radius from a circle fit with the formula > r² = (x-h)²+(y-k)². r- radius of the circle, x,y, are gaussian coordinates, h,k, mark the center of the fitted circle.
# data
x <- c(1,2.2,1,2.5,1.5,0.5,1.7)
y <- c(1,1,3,2.5,4,1.7,0.8)
# using nls.lm from minpack.lm (minimising the sum of squared residuals)
library(minpack.lm)
residFun <- function(par,x,y) {
res <- sqrt((x-par$h)^2+(y-par$k)^2)-par$r
return(res)
}
parStart <- list("h" = 1.5, "k" = 2.5, "r" = 1.7)
out <- nls.lm(par = parStart, x = x, y = y, lower =NULL, upper = NULL, residFun)
The problem is, predict() doesn't work with nls.lm, hence I am trying to compute the circle fit using nlsLM. (I could compute it by hand, but have troubles creating my Designmatrix).`
So this is what I tried next:
dat = list("x" = x,"y" = y)
out1 <- nlsLM(y ~ sqrt(-(x-h)^2+r^2)+k, start = parStart )
which results in:
Error in stats:::nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
Question 1a: How does nlsLM() work with circle fits? (advantage being that the generic predict() is available.
Question 1b: How do I get the prediction interval for my circle fit?
EXAMPLE from linear regression (this is what I want for the circle regression)
attach(faithful)
eruption.lm = lm(eruptions ~ waiting)
newdata = data.frame(waiting=seq(45,90, length = 272))
# confidence interval
conf <- predict(eruption.lm, newdata, interval="confidence")
# prediction interval
pred <- predict(eruption.lm, newdata, interval="predict")
# plot of the data [1], the regression line [1], confidence interval [2], and prediction interval [3]
plot(eruptions ~ waiting)
lines(conf[,1] ~ newdata$waiting, col = "black") # [1]
lines(conf[,2] ~ newdata$waiting, col = "red") # [2]
lines(conf[,3] ~ newdata$waiting, col = "red") # [2]
lines(pred[,2] ~ newdata$waiting, col = "blue") # [3]
lines(pred[,3] ~ newdata$waiting, col = "blue") # [3]
Kind regards
Summary of Edits:
Edit1: Rearranged formula in nlsLM, but parameter (h,k,r) results are now different in out and out1 ...
Edit2: Added 2 wikipedia links for clarification puprose on terminology used: (c.f. below)
confidence interval
prediction interval
Edit3: Some rephrasing of the question(s)
Edit4: Added a working example for linear regression
I am having a hard time figuring out what you want to do. Let me illustrate what the data looks like and something about the "prediction".
plot(x,y, xlim=range(x)*c(0, 1.5), ylim=range(y)*c(0, 1.5))
lines(out$par$h+c(-1,-1,1,1,-1)*out$par$r, # extremes of x-coord
out$par$k+c(-1,1,1,-1 ,-1)*out$par$r, # extremes of y-coord
col="red")
So what "prediction interval" are we speaking about? ( I do realize that you were thinking of a circle and if you just want to plot a circle on this background that's going to be pretty easy as well.)
lines(out$par$h+cos(seq(-pi,pi, by=0.1))*out$par$r, #center + r*cos(theta)
out$par$k+sin(seq(-pi,pi, by=0.1))*out$par$r, #center + r*sin(theta)
col="red")
I think that this question is not answerable in its current form. Any predict() function that is based on a linear model will require the predicted variable to be a linear function of the input design matrix. r^2 = (x-x0)^2 + (y-y0)^2 is not a linear function of the design matrix (which would be something like [x0 x y0 y], so I don't think you're going to be able to find a linear model fit that will give you confidence intervals. If someone more clever than I am has a way to do it, though, I'd be very interested in hearing about it.
The general way to approach these sorts of problems is to create a hierarchical nonlinear model, where your hyperparameters would be x0 and y0 (your h and k) with uniform distribution over your search space, and then the r^2 would be distributed ~N((x-x0)^2+(y-y0)^2, \sigma). You would then use MCMC sampling or similar to get your posterior confidence intervals.
Here's a solution to find h,k,r using base R's optim function. You essentially create a cost function that is a closure containing the data you wish to optimize over. I had to RSS value, else we would go to -Inf. There is a local optima problem, so you need to run this a few times...
# data
x <- c(1,2.2,1,2.5,1.5,0.5,1.7)
y <- c(1,1,3,2.5,4,1.7,0.8)
residFunArg <- function(xVector,yVector){
function(theta,xVec=xVector,yVec=yVector){
#print(xVec);print(h);print(r);print(k)
sum(sqrt((xVec-theta[1])^2+(yVec-theta[2])^2)-theta[3])^2
}
}
rFun = residFunArg(x,y);
o = optim(f=rFun,par=c(0,0,0))
h = o$par[1]
k = o$par[2]
r = o$par[3]
Run this command in the REPL to observe the local mins:
o=optim(f=tFun,par=runif(3),method="CG");o$par

In R draw two lines, with slopes double and half the value of the best fit line

I have data with a best fit line draw. I need to draw two other lines. One needs to have double the slope and the other need to have half the slope. Later I will use the region to differentially color points outside it as per:
Conditionally colour data points outside of confidence bands in R
Example dataset:
## Dataset from http://www.apsnet.org/education/advancedplantpath/topics/RModules/doc1/04_Linear_regression.html
## Disease severity as a function of temperature
# Response variable, disease severity
diseasesev<-c(1.9,3.1,3.3,4.8,5.3,6.1,6.4,7.6,9.8,12.4)
# Predictor variable, (Centigrade)
temperature<-c(2,1,5,5,20,20,23,10,30,25)
## For convenience, the data may be formatted into a dataframe
severity <- as.data.frame(cbind(diseasesev,temperature))
## Fit a linear model for the data and summarize the output from function lm()
severity.lm <- lm(diseasesev~temperature,data=severity)
# Take a look at the data
plot(
diseasesev~temperature,
data=severity,
xlab="Temperature",
ylab="% Disease Severity",
pch=16,
pty="s",
xlim=c(0,30),
ylim=c(0,30)
)
title(main="Graph of % Disease Severity vs Temperature")
par(new=TRUE) # don't start a new plot
abline(severity.lm, col="blue")
You can just use
# This gets the coefficients of the linear regression (intercept and slope)
c <- coef(severity.lm)
abline(c[1], c[2]*2, col="red")
abline(c[1], c[2]/2, col="red")
diseasesev<-c(1.9,3.1,3.3,4.8,5.3,6.1,6.4,7.6,9.8,12.4)
# Predictor variable, (Centigrade)
temperature<-c(2,1,5,5,20,20,23,10,30,25)
## For convenience, the data may be formatted into a dataframe
severity <- as.data.frame(cbind(diseasesev,temperature))
## Fit a linear model for the data and summarize the output from function lm()
severity.lm <- lm(diseasesev~temperature,data=severity)
line1 <- severity.lm$coefficients * c(1,2)
line2 <- severity.lm$coefficients * c(1,.5)
df <- as.data.frame(severity.lm[[12]])
df2 <- adply(df,1,function(x) cbind(line1[2]*x[[2]]+line1[1], line2[2]*x[[2]]+line2[1]))
plot(
df2[df2[,1] >= min(df2[,c(3,4)]) & df2[,1] <= max(df2[,c(3,4)]),c(2,1)],
xlab="Temperature",
ylab="% Disease Severity",
pch=16,
pty="s",
xlim=c(0,30),
ylim=c(0,30)
)
title(main="Graph of % Disease Severity vs Temperature")
par(new=TRUE) # don't start a new plot
abline(severity.lm, col="blue")
abline(line1, col="cyan")
abline(line2, col="cyan")
points(df2[df2[,1] < min(df2[,c(3,4)]) | df2[,1] > max(df2[,c(3,4)]),c(2,1)], pch = 16, col = 'red')

Resources