How to make a loop to plot several graphs using ggplot - r

This is my dataframe:
x1 <- c(1,2,3,4)
x2 <- c(3,4,5,6)
x3 <- c(5,6,7,8)
x4 <- c(7,8,9,10)
x5 <- c(8,7,6,5)
df <- c(x1,x2,x3,x4,x5)
I pick 3 variables from my dataframe, store their names in a character vector, and want to plot each of them against x1 in a separate scatterplot:
varlist <- c("x2","x4","x5")
So I want to create a function to make 3 independent scatterplots of x1 with x2, x1 with x4 and x1 with x5, using ggplot, where xx and yy will be the different pairs of variables to plot:
ggplot(data = df) +
geom_point(mapping = aes(x = xx, y = yy)) +
geom_smooth(mapping = aes(x = xx, y = yy))

You could do:
mapply(function(y) print(ggplot(data = df) +
geom_point(aes_string(x = "x1", y = y)) +
geom_smooth(aes_string(x = "x1", y = y))), y=c("x2","x4","x5"))
Note: I used df <- data.frame(x1,x2,x3,x4,x5) instead of df <- c(x1,x2,x3,x4,x5), because c() only concatenates the values into one numeric vector and ggplot needs a data frame.
x is fixed to x1, and mapply loops over y, which holds the names of the variables we want plotted against x1.
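Since aes_string() is deprecated in current ggplot2 releases, the same loop can also be written with the .data pronoun. The following is only a sketch under a few assumptions: plot_against_x1 is a hypothetical helper name, lapply replaces mapply, and a linear smoother is used because the default loess smoother struggles with only 4 points.

```r
library(ggplot2)

df <- data.frame(
  x1 = c(1, 2, 3, 4),
  x2 = c(3, 4, 5, 6),
  x3 = c(5, 6, 7, 8),
  x4 = c(7, 8, 9, 10),
  x5 = c(8, 7, 6, 5)
)
varlist <- c("x2", "x4", "x5")

# Hypothetical helper: scatterplot of one column (given by name) against x1.
plot_against_x1 <- function(yvar, data = df) {
  ggplot(data, aes(x = x1, y = .data[[yvar]])) +
    geom_point() +
    # Assumption: linear smoother instead of the default loess.
    geom_smooth(method = "lm", formula = y ~ x) +
    labs(y = yvar)
}

# Build one plot per variable name and keep them in a named list.
plots <- lapply(varlist, plot_against_x1)
names(plots) <- varlist
```

Nothing is drawn until a plot is printed, for example with print(plots$x2) or for (p in plots) print(p).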

Related

How to write a function to loop through variables and plot using ggplot

I'm having problems figuring out how to loop through variables in a data frame and plot them using ggplot.
An example of my data is below:
head(myData,2)
x1 x2 yhat x11 x3 yhat1 x12
1 -0.8523122 -2.737223 -6.562228 -0.8523122 -1.450288 0.464739 -0.8523122
2 -0.5649950 -2.737223 -6.562228 -0.5649950 -1.450288 0.464739 -0.5649950
x4 yhat2 x21 x31 yhat3
1 -1.267759 -4.624147 -2.737223 -1.450288 -0.6858007
2 -1.267759 -4.624147 -2.267001 -1.450288 -0.6858007
What I'm trying to do is to use geom_raster to plot each pair of variables (i.e., [x1,x2],[x11,x3],etc) and use the corresponding yhat as the fill value.
For example, if I were to plot everything manually I'd do something like:
p<-ggplot(myData, aes(x = x1, y = x2)) + geom_raster(aes(fill = yhat))
pp<-ggplot(myData, aes(x = x11, y = x3)) + geom_raster(aes(fill = yhat1))
ppp<-ggplot(myData, aes(x = x12, y = x4)) + geom_raster(aes(fill = yhat2))
pppp<-ggplot(myData, aes(x = x21, y = x31)) + geom_raster(aes(fill = yhat3))
grid.arrange(p, pp, ppp, pppp, ncol = 2)
But I'm trying to write a function that will loop through the data frame and plot the graphs. I tried to adapt the code from a different question here but I can't make it work for me.
Any suggestions as to how I would achieve this for my data?
One way would be to split the data into groups of 3 consecutive columns and apply the plotting code to each group.
library(gridExtra)
library(tidyverse)
library(rlang)
temp <- split.default(myData, gl(ncol(myData)/3, 3)) %>%
map(~{
x <- syms(names(.))
ggplot(., aes(x = !!x[[1]], y = !!x[[2]])) + geom_raster(aes(fill = !!x[[3]]))
})
grid.arrange(grobs = temp)
data
Applied this on limited data of 2 rows.
myData <- structure(list(x1 = c(-0.8523122, -0.564995), x2 = c(-2.737223,
-2.737223), yhat = c(-6.562228, -6.562228), x11 = c(-0.8523122,
-0.564995), x3 = c(-1.450288, -1.450288), yhat1 = c(0.464739,
0.464739), x12 = c(-0.8523122, -0.564995), x4 = c(-1.267759,
-1.267759), yhat2 = c(-4.624147, -4.624147), x21 = c(-2.737223,
-2.267001), x31 = c(-1.450288, -1.450288), yhat3 = c(-0.6858007,
-0.6858007)), class = "data.frame", row.names = c("1", "2"))
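Splitting by position only works while the columns stay in strict x, y, fill order. If that cannot be guaranteed, the triples can be named explicitly. A minimal sketch, assuming a hypothetical lookup table called spec and using only the first two triples of the example data:

```r
library(ggplot2)

# First two column triples from the example data above.
myData <- data.frame(
  x1    = c(-0.8523122, -0.564995),
  x2    = c(-2.737223, -2.737223),
  yhat  = c(-6.562228, -6.562228),
  x11   = c(-0.8523122, -0.564995),
  x3    = c(-1.450288, -1.450288),
  yhat1 = c(0.464739, 0.464739)
)

# Hypothetical lookup table: one row per plot, naming its x, y and fill columns.
spec <- data.frame(
  x    = c("x1", "x11"),
  y    = c("x2", "x3"),
  fill = c("yhat", "yhat1"),
  stringsAsFactors = FALSE
)

# Map() walks the three name vectors in parallel and builds one plot per row.
plots <- Map(function(x, y, fill) {
  ggplot(myData, aes(x = .data[[x]], y = .data[[y]])) +
    geom_raster(aes(fill = .data[[fill]])) +
    labs(x = x, y = y, fill = fill)
}, spec$x, spec$y, spec$fill)
```

The resulting list can be passed to gridExtra::grid.arrange(grobs = plots, ncol = 2) in the same way as temp above.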

How to plot many x variables against one y variable using ggplot function in

I have an excel file with multiple columns with titles as x, x1, x2, x3, x4 etc. I am using ggplot function in R to plot x against x1. The code is
data %>%
ggplot(aes(x = x1, y = x)) +
geom_point(colour = "red") +
geom_smooth(method = "lm", fill = NA)
How can I modify this code to plot x against x1, x against x2, x against x3 and x against x4 in the same ggplot call?
You should change the way your data.frame is formatted to do this easily with ggplot2 syntax.
Instead of having 5 columns (x, x1, x2, x3, x4), you may want a data.frame with 3 columns: x, y and type, where type is a categorical variable indicating which column each y value comes from (x1, x2, x3 or x4).
That would be something like this:
df <- data.frame(x = rep(data$x, 4),
y = c(data$x1, data$x2, data$x3, data$x4),
type = rep(c("x1", "x2", "x3", "x4"), each = nrow(data)))
Then, with this data.frame, you can map the color aesthetic to type, so that the points and the regression line are drawn separately for each category:
ggplot(df, aes(x = x, y = y, color = type)) + geom_point() + geom_smooth(method = "lm", fill = NA)
You should check http://www.sthda.com/english/wiki/ggplot2-scatter-plots-quick-start-guide-r-software-and-data-visualization for detailed explanations and customizations.
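The same reshaping can be done without listing the columns by hand. The sketch below uses tidyr::pivot_longer; the data frame is a made-up stand-in for the Excel file (only the column names come from the question), and it keeps the question's orientation with x on the vertical axis.

```r
library(ggplot2)
library(tidyr)

# Made-up stand-in for the Excel data; only the column names follow the question.
set.seed(1)
data <- data.frame(x = 1:10)
data$x1 <- data$x + rnorm(10)
data$x2 <- 2 * data$x + rnorm(10)
data$x3 <- 0.5 * data$x + rnorm(10)
data$x4 <- -data$x + rnorm(10)

# Wide to long: one row per (x, source column) pair.
long <- pivot_longer(
  data,
  cols = c("x1", "x2", "x3", "x4"),
  names_to = "type",
  values_to = "value"
)

# One colour and one regression line per original column.
p <- ggplot(long, aes(x = value, y = x, colour = type)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Adding facet_wrap(~ type) to p would give one panel per column instead of overlaying them.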

How do I create a regression line with various variables in R

I have already created the actual regression code but I am trying to get the regression line and a predicted line onto a plot but I can't seem to figure it out.
m1 <- lm(variable1 ~ 2 + 3 + 4 + 5 + 6 + 7 + 8, data = prog)
summary(m1)
and then I want to create the plot on the basis of hyp.data but I am still a bit lost.
Consider two (not 7!) predictor variables; one is numeric, the other categorical (i.e. a factor).
# Simulate data
set.seed(2017);
x1 <- 1:10;
x2 <- as.factor(sample(c("treated", "not_treated"), 10, replace = TRUE));
df <- cbind.data.frame(
y = 2 * x1 + as.numeric(x2) - 1 + rnorm(10),
x1 = x1,
x2 = x2);
In that case you can do the following:
# Fit the linear model
m1 <- lm(y ~ x1 + x2, data = df);
# Get predictions
df$pred <- predict(m1);
# Plot data
library(ggplot2);
ggplot(df, aes(x = x1, y = y)) +
geom_point() +
facet_wrap(~ x2, scales = "free") +
geom_line(aes(x = x1, y = pred), col = "red");
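If the predicted line should be drawn from hypothetical data rather than from the observed rows, predict() accepts a newdata argument. A rough sketch follows, with two assumptions: the treatment labels are assigned deterministically with rep() instead of sample(), and hyp.data is built with expand.grid() since the question does not show its contents.

```r
library(ggplot2)

# Simulated data as above, but with a deterministic treatment assignment (assumption).
set.seed(2017)
x1 <- 1:10
x2 <- factor(rep(c("treated", "not_treated"), 5))
df <- data.frame(y = 2 * x1 + as.numeric(x2) - 1 + rnorm(10), x1 = x1, x2 = x2)

m1 <- lm(y ~ x1 + x2, data = df)

# Hypothetical prediction grid: 50 x1 values crossed with every level of x2.
hyp.data <- expand.grid(
  x1 = seq(min(df$x1), max(df$x1), length.out = 50),
  x2 = levels(df$x2)
)
hyp.data$pred <- predict(m1, newdata = hyp.data)

# Observed points from df, predicted lines from hyp.data.
p <- ggplot(df, aes(x = x1, y = y)) +
  geom_point() +
  geom_line(data = hyp.data, aes(y = pred), colour = "red") +
  facet_wrap(~ x2)
```

With more predictors, the grid needs a column for each of them, typically varying one or two and holding the rest at a fixed value such as the mean.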

Plotting two longitudinal variables against time in r

Say I have a data that included two longitudinal variables (x1, x2), t is time (years), and type is class:
set.seed(20)
x1 = rnorm(20,5,1)
x2 = (x1 + rnorm(20))
t = rep(c(0,1,2,3), 5)
id = rep(1:5,each = 4)
type = as.factor(c(rep(0,8), rep(1,12)))
df = data.frame(id, t, x1, x2, type)
Is it possible to plot x1 and x2 against t in one plot? Actually, I am trying to see the relationship between x1 and x2 (here I use rnorm to keep it simple) by modifying the correlation matrix.
Not sure how you want to treat the ID variable, but maybe try this?
library(reshape)
library(ggplot2)
# Stack x1 and x2 into a single value column, labelled by variable
df <- reshape::melt(df, id.vars = c('id', 't', 'type'))
ggplot(df, aes(x = t, y = value, color = variable)) +
geom_line() +
facet_wrap(~id)
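If one panel per subject is too many, the id can be used as a grouping variable instead. This is only one possible treatment of id, sketched with tidyr::pivot_longer in place of reshape::melt; faceting by type is an assumption about what is of interest.

```r
library(ggplot2)
library(tidyr)

# Data from the question.
set.seed(20)
x1 <- rnorm(20, 5, 1)
x2 <- x1 + rnorm(20)
t <- rep(c(0, 1, 2, 3), 5)
id <- rep(1:5, each = 4)
type <- factor(c(rep(0, 8), rep(1, 12)))
df <- data.frame(id, t, x1, x2, type)

# Long format: one row per subject, time point and variable.
long <- pivot_longer(df, cols = c("x1", "x2"),
                     names_to = "variable", values_to = "value")

# One line per subject and variable; colour separates x1 from x2.
p <- ggplot(long, aes(x = t, y = value, colour = variable,
                      group = interaction(id, variable))) +
  geom_line() +
  facet_wrap(~ type, labeller = label_both)
```

The group aesthetic keeps each subject's trajectory as its own line, so lines from different subjects are not joined together.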

Constraining slope in stat_smooth with ggplot (plotting ANCOVA)

Using ggplot(), I am trying to plot the results of an ANCOVA in which the slopes of the two linear components are equal, i.e., lm(y ~ x + A). The default behavior for geom_smooth(method = "lm") is to plot separate slopes and intercepts for each level of each factor. For example, with two levels of A:
library(ggplot2)
set.seed(1234)
n <- 20
x1 <- rnorm(n); x2 <- rnorm(n)
y1 <- 2 * x1 + rnorm(n)
y2 <- 3 * x2 + (2 + rnorm(n))
A <- as.factor(rep(c(1, 2), each = n))
df <- data.frame(x = c(x1, x2), y = c(y1, y2), A = A)
p <- ggplot(df, aes(x = x, y = y, color = A))
p + geom_point() + geom_smooth(method = "lm")
I can fit the ANCOVA separately with lm() and then use geom_abline() to add the lines manually. This approach has a couple of drawbacks: the lines extend beyond the range of the data, and the colors have to be specified by hand.
fm <- lm(y ~ x + A, data = df)
summary(fm)
a1 <- coef(fm)[1]
b <- coef(fm)[2]
a2 <- a1 + coef(fm)[3]
p + geom_point() +
geom_abline(intercept = a1, slope = b) +
geom_abline(intercept = a2, slope = b)
I know ancova() in the HH package automates the plotting, but I don't really care for lattice graphics. So I am looking for a ggplot()-centric solution.
library(HH)
ancova(y ~ x + A, data = df)
Is there a method to accomplish this using ggplot()? For this example, A has two levels, but I have situations with 3, 4, or more levels. The formula argument to geom_smooth() doesn't seem to have the answer (as far as I can tell).
For completeness, this works:
library(ggplot2)
set.seed(1234)
n <- 20
x1 <- rnorm(n); x2 <- rnorm(n)
y1 <- 2 * x1 + rnorm(n)
y2 <- 3 * x2 + (2 + rnorm(n))
A <- as.factor(rep(c(1, 2), each = n))
df <- data.frame(x = c(x1, x2), y = c(y1, y2), A = A)
fm <- lm(y ~ x + A, data = df)
p <- ggplot(data = cbind(df, pred = predict(fm)),
aes(x = x, y = y, color = A))
p + geom_point() + geom_line(aes(y = pred))
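The same idea carries over to factors with more than two levels, because predict() returns one fitted value per row whatever the number of groups. A small sketch with three made-up levels (the simulated data and the level names a, b, c are assumptions), including a check that the drawn lines really share one slope:

```r
library(ggplot2)

# Made-up data: three groups with a common slope of 2 and different intercepts.
set.seed(1234)
n <- 20
x <- rnorm(3 * n)
A <- factor(rep(c("a", "b", "c"), each = n))
y <- 2 * x + c(0, 2, 4)[as.integer(A)] + rnorm(3 * n)
df <- data.frame(x = x, y = y, A = A)

# ANCOVA with a common slope and one intercept per level of A.
fm <- lm(y ~ x + A, data = df)
df$pred <- predict(fm)

# Points from the data, lines from the fitted values, coloured by group.
p <- ggplot(df, aes(x = x, y = y, colour = A)) +
  geom_point() +
  geom_line(aes(y = pred))

# Slope of the fitted values within each level; all equal the common slope.
slopes <- sapply(split(df, df$A),
                 function(d) coef(lm(pred ~ x, data = d))[["x"]])
```

Each line spans only the x range of its own group, and the colors follow the color mapping, which covers both drawbacks of the geom_abline() approach.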
