How to plot a polynomial over a scatterplot in R? - r

My intention is to plot a polynomial regression over a scatter plot of my data.
Let's say the x-axis is the third column of df, that is df[,3], and the y-axis is the fifth column, df[,5].
I performed polynomial regression on this data and obtained a vector yreg, which I want to plot over this scatter plot.
My question is: how can I do this? Everything I have found so far uses built-in regression models without explicitly defining the polynomial, but in my case I already have the regression polynomial in hand and want to add it to the scatter plot of the x,y data.
Here is how I plot the scatter plot using ggplot2:
plt <- ggplot(g,aes(df[,3] , df[,5])) +
geom_point(color = "#69b3a2",size=0.5)
I tried the following:
plt <- ggplot(g,aes(df[,3] , df[,5])) +
geom_point(color = "#69b3a2",size=0.5) +
geom_smooth(formula = y~yreg,color="red")
But that did not work.
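One possible approach, sketched under assumptions: if yreg holds the fitted value for each x (same length and order as the x column), it can be put in the same data frame as x and y and drawn with geom_line(), since geom_smooth() fits its own model rather than accepting precomputed values. The data below are made up for illustration, and plot_df and the degree-2 fit are hypothetical stand-ins for df[,3], df[,5] and the asker's own polynomial.

```r
library(ggplot2)

# Hypothetical stand-in data; in the question these would be df[, 3] and df[, 5].
set.seed(42)
x <- runif(100, -3, 3)
y <- 1 + 2 * x - 0.5 * x^2 + rnorm(100, sd = 0.5)

# Assumption: yreg holds the fitted value at each x (here from a degree-2 fit).
fit  <- lm(y ~ poly(x, 2))
yreg <- as.numeric(fitted(fit))

# Keep x, y and the fitted values together so each layer can map its own y.
plot_df <- data.frame(x = x, y = y, yreg = yreg)

plt <- ggplot(plot_df, aes(x = x, y = y)) +
  geom_point(color = "#69b3a2", size = 0.5) +   # raw data
  geom_line(aes(y = yreg), color = "red")       # precomputed polynomial values
```

geom_line() connects points in order of x, so the rows do not need to be sorted first. If only the polynomial degree is known rather than the fitted vector, geom_smooth(method = "lm", formula = y ~ poly(x, 2)) is an alternative that refits the model inside ggplot2.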

Related

How to use ggplot to make a qqplot to compare the distribution of two variables?

I'd like to know how to make a Q-Q plot with ggplot2 that compares two empirical distributions rather than one distribution against a theoretical distribution.
I want something like this:
qqplot(iris$Petal.Length, iris$Petal.Width)
which compares the quantiles of Petal.Length and Petal.Width in the iris dataset, but using ggplot2.
One easy way to reproduce the plot is to convert the qqplot call to a dataframe and then plot it with ggplot2:
qq <- as.data.frame(qqplot(iris$Petal.Length, iris$Petal.Width, plot.it = FALSE))
ggplot(qq) +
geom_point(aes(x = x, y = y))
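As a rough illustration of what the qqplot() call computes: when both samples have the same length, as the two iris columns do, the plotted points are simply the two sorted vectors paired up. The sketch below builds that data frame by hand; qq_manual is a name introduced here for illustration only, and samples of unequal length would need interpolation, which this sketch does not handle.

```r
library(ggplot2)

x <- iris$Petal.Length
y <- iris$Petal.Width

# Equal-length samples: the i-th smallest x is paired with the i-th smallest y.
qq_manual <- data.frame(x = sort(x), y = sort(y))

p <- ggplot(qq_manual, aes(x = x, y = y)) +
  geom_point() +
  labs(x = "Petal.Length quantiles", y = "Petal.Width quantiles")
```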

Put two linear regression lines into one plot

I initially have two data frames (icelandma and swissfe) with observations from different countries and exactly the same variables. To compare the linear regression between intp.trust and confidence for these two countries, I combined the two data frames into one with this command:
merge1 <- rbind(icelandma, swissfe)
And then, I draw the linear regression plot with this command:
ggplot(data = merge1,aes(x=intp.trust,y=confidence))+
geom_point(size=0.5)+
geom_smooth(method = "lm",formula = y~x)+
facet_grid(countryname~.)
The plot looks like this
The regression lines are still in two separate panels. Is there any way to put both lines in the same plot? Thanks in advance for your help!
Try
ggplot(data = merge1,aes(x=intp.trust,y=confidence, group = countryname))+
geom_point(size=0.5)+
geom_smooth(method = "lm",formula = y~x)
facet_grid(countryname~.) puts each country in its own panel; dropping it and grouping by countryname draws both regression lines in a single panel.
If you want to tell the countries apart, add color to your aes: aes(..., color = countryname).
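A minimal sketch of the colored version, using simulated data because the original data frames are not available. The values in merge1 below are invented for illustration; only the column names come from the question.

```r
library(ggplot2)

# Hypothetical stand-in for rbind(icelandma, swissfe).
set.seed(1)
merge1 <- data.frame(
  countryname = rep(c("Iceland", "Switzerland"), each = 50),
  intp.trust  = runif(100, 0, 10)
)
slope     <- ifelse(merge1$countryname == "Iceland", 0.5, 0.2)
intercept <- ifelse(merge1$countryname == "Iceland", 2, 4)
merge1$confidence <- intercept + slope * merge1$intp.trust + rnorm(100)

# Mapping color to countryname also groups the data, so geom_smooth()
# fits one line per country in the same panel.
p <- ggplot(merge1, aes(x = intp.trust, y = confidence, color = countryname)) +
  geom_point(size = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x)
```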

Visualising interaction between two categorical predictors and continuous outcome with ribbon confidence intervals

I am very new to R, so I apologise in advance for a possibly simple or obvious question.
I am trying to graph an interaction between two categorical variables (A and I), each with 2 levels (0 and 1), against one continuous variable (V). I would like V on the y-axis, A on the x-axis, and I shown as separate lines on the graph. I would also like to include 95% confidence intervals drawn as a ribbon (like geom_ribbon produces). However, I cannot do this once A and I are defined as binary categorical variables in R. The only way I have found is to leave A as a continuous variable (see picture). The syntax I am using is below:
data$I <- as.factor(data$I)
data$A <- as.factor(data$A)
gp <- ggplot(data=data, aes(x=A, y=V, colour=I))
gp + geom_point() + stat_smooth(method="lm")
Note that I did not set A as a categorical variable when producing the attached image.
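One way this might be approached, offered as a sketch rather than a definitive answer: fit the interaction model with lm(), get the 95% confidence limits for the four A-by-I cells from predict(), and draw them with geom_ribbon(), using group = I so that the ribbon and line connect across the two levels of A. The data frame dat and its values are simulated here for illustration; only the variable names A, I and V come from the question.

```r
library(ggplot2)

# Simulated stand-in data with two binary factors and a continuous outcome.
set.seed(123)
dat <- expand.grid(A = c(0, 1), I = c(0, 1), rep = 1:25)
dat$V <- 1 + 0.5 * dat$A + 0.3 * dat$I + 0.8 * dat$A * dat$I + rnorm(nrow(dat))
dat$A <- factor(dat$A)
dat$I <- factor(dat$I)

# Interaction model and 95% confidence limits for each A x I cell.
fit   <- lm(V ~ A * I, data = dat)
cells <- expand.grid(A = factor(c(0, 1)), I = factor(c(0, 1)))
cells <- cbind(cells, predict(fit, newdata = cells,
                              interval = "confidence", level = 0.95))
# cells now has columns A, I, fit, lwr, upr

# group = I tells ggplot2 to join the two A levels within each level of I.
p <- ggplot(cells, aes(x = A, y = fit, colour = I, fill = I, group = I)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, colour = NA) +
  geom_line() +
  labs(y = "V")
```

Without the group aesthetic, ggplot2 treats every combination of the discrete variables as its own group, so there is nothing to connect between A = 0 and A = 1.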

r: Blank graph when plotting multiple lines on scatterplot

My goal is to produce a graph showing the differences between regression lines using continuous vs. categorical variables. I'm using the "SleepStudy" dataset from Lock5Data, and I want to show the regression lines predicting GPA from ClassYear as either continuous or categorical. The code is below:
library(Lock5Data)
data("SleepStudy")
fit2 <- lm(GPA ~ factor(ClassYear), data = SleepStudy)
fit2_line <- aggregate(fit2$fitted.values ~ SleepStudy$ClassYear, FUN = mean)
colnames(fit2_line) <- c('ClassYear','GPA')
options(repr.plot.width=5, repr.plot.height=5)
library(ggplot2)
ggplot() +
geom_line(data=fit2_line, aes(x=ClassYear, y=GPA)) + # Fit line, ClassYear factor
geom_smooth(data=SleepStudy, method='lm', formula=GPA~ClassYear) + # Fit line, ClassYear continuous
geom_point(data=SleepStudy, aes(x=ClassYear, y=GPA)) # Data points as dots
What is producing the blank graph? What am I missing here?
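geom_smooth has no x and y aesthetics in your code, so define the data and aesthetics in ggplot() where every layer can inherit them. The formula must also be written in terms of y and x rather than the column names. This code works: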
ggplot(data=SleepStudy, aes(y = GPA,x = ClassYear)) +
geom_smooth(data=SleepStudy, method='lm', formula=y~x)+
geom_line(data=fit2_line, aes(x=ClassYear, y=GPA)) +
geom_point(data=SleepStudy, aes(x=ClassYear, y=GPA))

plotting multiple plots in ggplot2 on same graph that are unrelated

How would one use the smooth.spline() method in a ggplot2 scatterplot?
Suppose my data is in a data frame called data, with two columns, x and y.
The smooth.spline fit would be sm <- smooth.spline(data$x, data$y). I believe I should use geom_line(), with sm$x and sm$y as the xy coordinates. However, how would one draw a scatterplot and a line plot on the same graph when they come from completely unrelated data? I suspect it has something to do with aes(), but I am getting a little confused.
You can use different data frames in different geoms and refer to the relevant variables using aes, or you could combine the relevant variables from the output of smooth.spline:
# example data
set.seed(1)
dat <- data.frame(x = rnorm(20, 10,2))
dat$y <- dat$x^2 - 20*dat$x + rnorm(20,10,2)
# spline
s <- smooth.spline(dat)
# plot - combine the original x & y and the fitted values returned by
# smooth.spline into a data.frame
library(ggplot2)
ggplot(data.frame(x=s$data$x, y=s$data$y, xfit=s$x, yfit=s$y)) +
geom_point(aes(x,y)) + geom_line(aes(xfit, yfit))
# or you could use geom_smooth
ggplot(dat, aes(x , y)) + geom_point() + geom_smooth()
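The first option mentioned in the answer, giving each geom its own data frame, could look roughly like the sketch below. fit_df is a name introduced here for illustration. Keeping the spline output in a separate data frame avoids requiring s$x to have the same length as the original data, which can fail when x contains tied values.

```r
library(ggplot2)

# Same example data as in the answer.
set.seed(1)
dat <- data.frame(x = rnorm(20, 10, 2))
dat$y <- dat$x^2 - 20 * dat$x + rnorm(20, 10, 2)

# Fit the spline and keep its output in its own data frame.
s      <- smooth.spline(dat$x, dat$y)
fit_df <- data.frame(x = s$x, y = s$y)

# Points come from dat; the line layer overrides the data with fit_df
# and reuses the inherited aes(x, y) mapping.
p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_line(data = fit_df, color = "red")
```

Note that geom_smooth() with default settings fits a loess curve for small data sets, not a smoothing spline, so the two plots will generally not show the same curve.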
