How to get the ggplot2 stat_smooth blue line as a function? - r

The following command:
ggplot(s, aes(x = I5, y = Success))+geom_point(size=3, alpha=0.4)+
stat_smooth(method="loess", colour="blue", size=1.5)+
xlab("I5")+
ylab("Probability of Success")+
theme_bw()
gives me the following plot:
I would like to get what corresponds to the blue line as a function so that I can apply it to any value.
Is there a way to do that?

If you need the actual loess fit, it's probably better to run it yourself. Let's create some sample data (it would have been nice if you had included some in your original question)
dd <- data.frame(
x=1:50,
y = cumsum(rnorm(50))
)
And now we can run the loess function ourselves
sm <- loess(y~x, dd)
Now we can compare the line that ggplot draws to our loess curve
ggplot(dd, aes(x,y)) +
stat_smooth(method="loess") +
geom_point(data=data.frame(x=sm$x, y=predict(sm)), col="red")
We can see that these line up perfectly. Thus we can just use the predict() function with our loess object to get a value for any point. For example
predict(sm, 5)
# [1] -2.922876
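Since the question asks for a function, one option is to wrap the fitted object in a small closure. This is a minimal sketch, assuming the same kind of sample data as above; the name loess_fun and the fixed seed are illustrative and not part of the original answer. A default loess fit does not extrapolate, so inputs outside the observed range of x return NA.

```r
set.seed(42)  # assumption: a fixed seed, only so the sketch is reproducible
dd <- data.frame(x = 1:50, y = cumsum(rnorm(50)))
sm <- loess(y ~ x, data = dd)

# Hypothetical helper: evaluates the fitted loess curve at arbitrary x values
loess_fun <- function(x) {
  unname(predict(sm, newdata = data.frame(x = x)))
}

loess_fun(c(5, 10.5, 25))  # values inside the observed range of x
loess_fun(100)             # outside the observed range: NA with the default fit
```

If extrapolation is needed, the model can instead be fitted with control = loess.control(surface = "direct"), at the cost of slower prediction.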

Related

How to add geom_point() to autolayer() line?

Trying to add geom_point() to an autolayer() line ("fitted" in pic), which is a wrapper part of autoplot() for ggplot2 in Rob Hyndman's forecast package (there's a base autoplot/autolayer in ggplot2 too, so the same likely applies there).
Problem is (I'm no ggplot2 expert, and autoplot wrapper makes it trickier) the geom_point() applies fine to the main call, but how do I apply similar to the autolayer (fitted values)?
Tried type="b" like normal geom_line() but it's not an object param in autolayer().
require(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
forecast::autoplot(mdeaths) +
forecast::autolayer(model.ses.fc$fitted, series="Fitted") + # cannot set to show points, and type="b" not allowed
geom_point() # this works fine against the main autoplot call
This seems to work:
library(forecast)
library(fpp2)
model.ses <- ets(mdeaths, model="ANN", alpha=0.4)
model.ses.fc <- forecast(model.ses, h=5)
# Pre-compute the fitted layer so we can extract the data out of it with
# layer_data()
fitted_layer <- forecast::autolayer(model.ses.fc$fitted, series="Fitted")
fitted_values <- fitted_layer$layer_data()
plt <- forecast::autoplot(mdeaths) +
fitted_layer +
geom_point() +
geom_point(data = fitted_values, aes(x = timeVal, y = seriesVal))
There might be a way to make forecast::autolayer do what you want directly but this solution works. If you want the legend to look right, you'll want to merge the input data and fitted values into a single data.frame.
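The merge into a single data.frame could look roughly like this. It is only a sketch: the column names time, value and series are made up for illustration, and it uses plain ggplot2 layers instead of autoplot()/autolayer(), so the legend comes from the colour mapping.

```r
library(ggplot2)
library(forecast)

model.ses <- ets(mdeaths, model = "ANN", alpha = 0.4)
model.ses.fc <- forecast(model.ses, h = 5)

# Assumed layout: one row per time point and series (column names are illustrative)
obs_df <- data.frame(
  time = as.numeric(time(mdeaths)),
  value = as.numeric(mdeaths),
  series = "Observed"
)
fit_df <- data.frame(
  time = as.numeric(time(model.ses.fc$fitted)),
  value = as.numeric(model.ses.fc$fitted),
  series = "Fitted"
)
plot_df <- rbind(obs_df, fit_df)

# Lines and points for both series, with a single legend driven by `series`
plt <- ggplot(plot_df, aes(x = time, y = value, colour = series)) +
  geom_line() +
  geom_point()
plt
```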

Plot with one line for each column and time-series on the x-axis R

You can find my dataset here.
From this data, I wish to plot (one line for each):
x$y[,1]
x$y[,5]
x$y[,1]+x$y[,5]
Therefore, more clearly, in the end, each of the following will be represented by one line:
y0,
z0,
y0+z0
My x-axis (time-series) will be from x$t.
I have tried the following, but the time-series variable is problematic and I cannot figure out how I can exactly plot it. My code is:
Time <- x$t
X0 <- x$y[,1]
Z0 <- x$y[,5]
X0.plus.Z0 <- X0 + Z0
xdf0 <- cbind(Time,X0,Z0,X0.plus.Z0)
xdf0.melt <- melt(xdf0, id.vars="Time")
ggplot(data = xdf0.melt, aes(x=Time, y=value)) + geom_line(aes(colour=Var2))
The error in your code comes from the use of melt applied to an object that is not a data.frame. You should modify like this:
xdf0 <- cbind.data.frame(Time,X0,Z0,X0.plus.Z0)
xdf0.melt <- reshape2::melt(xdf0, id.vars="Time")
ggplot(data = xdf0.melt, aes(x=Time, y=value)) + geom_line(aes(colour=variable))
You don't have to go through the melt process: since you have just 3 lines to plot, it's fine to plot them separately
ggplot(data=xdf0) + aes(x=Time) +
geom_line(aes(y=X0), col="red") +
geom_line(aes(y=Z0), col="blue") +
geom_line(aes(y=X0.plus.Z0))
However, you don't get the legend.
A remark about your example: you are trying to plot values of very different orders of magnitude, so you can't really see anything.
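One possible way around the scale problem is to give each series its own panel with a free y-axis. The sketch below uses synthetic stand-in data, because the asker's dataset is not reproduced here; the shapes of X0 and Z0 are assumptions chosen only to differ by orders of magnitude.

```r
library(ggplot2)
library(reshape2)

# Synthetic stand-in for the asker's data (assumption, not the real dataset)
Time <- seq(0, 10, length.out = 100)
X0 <- 1e4 * exp(-Time)   # large values
Z0 <- sin(Time)          # small values
xdf0 <- data.frame(Time, X0, Z0, X0.plus.Z0 = X0 + Z0)
xdf0.melt <- melt(xdf0, id.vars = "Time")

# One panel per series, each with its own y scale
p <- ggplot(xdf0.melt, aes(x = Time, y = value, colour = variable)) +
  geom_line() +
  facet_wrap(~ variable, scales = "free_y", ncol = 1)
p
```

The trade-off is that the panels are no longer directly comparable on one axis; a log scale is another option when all values are positive.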
How about
matplot(xdf0, type = 'l')
?

Change colors of select lines in ggplot2 coefficient plot in R

I would like to change the color of coefficient lines based on whether the point estimate is negative or positive in a ggplot2 coefficient plot in R. For example:
require(coefplot)
set.seed(123)
dat <- data.frame(x = rnorm(100), z = rnorm(100), y1 = rnorm(100)) # y1 added so the model below can be fitted
mod1 <- lm(y1 ~ x + z, data = dat)
coefplot.lm(mod1)
Which produces the following plot:
In this plot, I would like to change the "x" variable to red when plotted. Any ideas? Thanks.
I don't think you can do this with a plot produced by coefplot.lm. The coefplot package uses ggplot2 as its plotting system, which is good in itself, but it does not let you control colors as easily as you would like. To get the desired colors, you need a variable in your dataset that color-codes the values, and you need to map that variable to color inside aes() in the layer that draws the dots with CE. Apparently this is not possible with the output of the coefplot.lm function. You might be able to change the colors using ggplot2's ggplot_build() function, but I would say it's easier to write your own function for this task.
I've done this once to plot odds. If you want, you may use my code. Feel free to change it. The idea is the same as in coefplot. First, we extract coefficients from a model object and prepare the data set for plotting; second, actually plot.
The code for extracting coefficients and data set preparation
df_plot_odds <- function(x) {
  # Exponentiate the coefficients and their confidence limits
  tmp <- data.frame(cbind(exp(coef(x)), exp(confint.default(x))))
  odds <- tmp[-1, ]  # drop the intercept
  names(odds) <- c('OR', 'lower', 'upper')
  odds$vars <- row.names(odds)
  # Color-code the direction of the effect: OR > 1 is positive
  odds$col <- ifelse(odds$OR > 1, 'blue', 'red')
  odds$pvalue <- summary(x)$coefficients[-1, "Pr(>|t|)"]
  return(odds)
}
Plot the output of the extract function
plot_odds <- function(df_plot_odds, xlab="Odds Ratio", ylab="", asp=1){
require(ggplot2)
p <- ggplot(df_plot_odds, aes(x=vars, y=OR, ymin=lower, ymax=upper),asp=asp) +
geom_errorbar(aes(color=col),width=0.1) +
geom_point(aes(color=col),size=3)+
geom_hline(yintercept = 1, linetype=2) +
scale_color_manual('Effect', labels=c('Positive','Negative'),
values=c('blue','red'))+
coord_flip() +
theme_bw() +
theme(legend.position="none",aspect.ratio = asp)+
ylab(xlab) +
xlab(ylab) #switch because of the coord_flip() above
return(p)
}
Plotting your example
set.seed(123)
dat <- data.frame(x = rnorm(100),y = rnorm(100), z = rnorm(100))
mod1 <- lm(y ~ x + z, data = dat)
df <- df_plot_odds(mod1)
plot <- plot_odds(df)
plot
Which yields
Note that I chose theme_bw() as the default. The output is a ggplot2 object, so you can change it quite a lot.
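If you want the raw coefficients rather than odds, the same idea can be sketched without exponentiating, coloring by the sign of the estimate. This is an illustrative variant, not coefplot's own method; coef_df and its column names are made up for the example, and the intercept is kept in the plot.

```r
library(ggplot2)

set.seed(123)
dat <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
mod1 <- lm(y ~ x + z, data = dat)

# Collect estimates and 95% confidence limits in one data frame
ci <- confint(mod1)
coef_df <- data.frame(
  term = names(coef(mod1)),
  estimate = unname(coef(mod1)),
  lower = ci[, 1],
  upper = ci[, 2]
)
# Color-code by the sign of the point estimate
coef_df$sign <- ifelse(coef_df$estimate < 0, "Negative", "Positive")

p <- ggplot(coef_df, aes(x = term, y = estimate, colour = sign)) +
  geom_hline(yintercept = 0, linetype = 2) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.1) +
  geom_point(size = 3) +
  scale_colour_manual(values = c(Negative = "red", Positive = "blue")) +
  coord_flip() +
  theme_bw()
p
```

Naming the entries in values ties each color to its label, so the mapping stays correct even if only one sign occurs in the model.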

Displaying smoothed (convolved) densities with ggplot2

I'm trying to display some frequencies convolved with a Gaussian kernel in ggplot2. I tried smoothing the lines with:
+ stat_smooth(se = F,method = "lm", formula = y ~ poly(x, 24))
Without success.
I read an article suggesting the frequencies should be convolved with a Gaussian kernel, which ggplot2's stat_density function (http://docs.ggplot2.org/current/stat_density.html) seems to be able to do.
However, I can't seem to replace my geometry with stat_density. Is there anything wrong with my code?
require(reshape2)
library(ggplot2)
library(RColorBrewer)
fileName = "/1.csv" # downloadable there: https://www.dropbox.com/s/l5j7ckmm5s9lo8j/1.csv?dl=0
mydata = read.csv(fileName,sep=",", header=TRUE)
dataM = melt(mydata,c("bins"))
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")))
ggplot(data=dataM,
aes(x=bins, y=value, colour=variable)) +
geom_line() + scale_x_continuous(limits = c(0, 2))
This code produces the following plot:
I'm looking at smoothing the lines a little bit, so they look more like this:
(from http://journal.frontiersin.org/Journal/10.3389/fncom.2013.00189/full)
Since my comments solved your problem, I'll convert them to an answer:
The density function takes individual measurements and calculates a kernel density distribution by convolution (gaussian is the default kernel). For example, plot(density(rnorm(1000))). You can control the smoothness with the bw (bandwidth) parameter. For example, plot(density(rnorm(1000), bw=0.01)).
But your data frame is already a density distribution (analogous to the output of the density function). To generate a smoother density estimate, you need to start with the underlying data and run density on it, adjusting bw to get the smoothness where you want it.
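To make the effect of bw concrete, here is a small sketch on synthetic measurements (the data and the roughness helper are assumptions for illustration only). It fits three densities to the same sample and compares how wiggly they are.

```r
set.seed(1)
obs <- rnorm(1000)  # synthetic individual measurements

d_default <- density(obs)             # Gaussian kernel, automatic bandwidth
d_narrow  <- density(obs, bw = 0.01)  # small bandwidth: wiggly curve
d_wide    <- density(obs, bw = 0.5)   # large bandwidth: smooth curve

plot(d_narrow, main = "Effect of bw on density()", col = "grey60")
lines(d_default, col = "blue", lwd = 2)
lines(d_wide, col = "red", lwd = 2)

# Hypothetical roughness measure: total variation of the estimated curve
roughness <- function(d) sum(abs(diff(d$y)))
sapply(list(narrow = d_narrow, default = d_default, wide = d_wide), roughness)
```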
If you don't have access to the underlying data, you can smooth out your existing density distributions as follows:
ggplot(data=dataM, aes(x=bins, y=value, colour=variable)) +
geom_smooth(se=FALSE, span=0.3) +
scale_x_continuous(limits = c(0, 2))
Play around with the span parameter to get the smoothness you want.
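If you specifically want Gaussian-kernel smoothing of values that are already binned, base R's ksmooth() with kernel = "normal" is one option. The sketch below runs on synthetic binned data, since the CSV from the question is not reproduced here; the bandwidth of 0.1 is an arbitrary starting point, not a recommendation.

```r
set.seed(1)
# Synthetic stand-in for one binned frequency curve (assumption)
bins  <- seq(0, 2, by = 0.01)
value <- dnorm(bins, mean = 1, sd = 0.3) + rnorm(length(bins), sd = 0.1)

# Kernel regression smoother with a Gaussian kernel, evaluated at the bins
sm <- ksmooth(bins, value, kernel = "normal", bandwidth = 0.1, x.points = bins)

plot(bins, value, type = "l", col = "grey60")
lines(sm$x, sm$y, col = "blue", lwd = 2)
```

The result is a list with x and y components, so it can be put into a data frame and drawn with geom_line() alongside the original curves.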

Fit a line with LOESS in R

I have a data set with some points in it and want to fit a line to it. I tried it with the loess function. Unfortunately I get very strange results. See the plot below. I expect a line that goes more through the points and over the whole plot. How can I achieve that?
How to reproduce it:
Download the dataset from https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1 (only two kb) and use this code:
load(url('https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1'))
lw1 = loess(y ~ x,data=data)
plot(y ~ x, data=data,pch=19,cex=0.1)
lines(data$y,lw1$fitted,col="blue",lwd=3)
Any help is greatly appreciated. Thanks!
You've plotted fitted values against y instead of against x. Also, you will need to order the x values before plotting a line. Try this:
lw1 <- loess(y ~ x,data=data)
plot(y ~ x, data=data,pch=19,cex=0.1)
j <- order(data$x)
lines(data$x[j],lw1$fitted[j],col="red",lwd=3)
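An alternative that sidesteps the ordering issue is to predict on a sorted grid of x values. This is a sketch on synthetic data (an assumption, in case the Dropbox file is unavailable); the grid data frame is an illustrative name.

```r
set.seed(1)
# Synthetic stand-in for the asker's data, with x in random order
data <- data.frame(x = runif(200, 0, 10))
data$y <- sin(data$x) + rnorm(200, sd = 0.3)

lw1 <- loess(y ~ x, data = data)

# Evaluate the fit on an evenly spaced, already sorted grid within the data range
grid <- data.frame(x = seq(min(data$x), max(data$x), length.out = 200))
grid$y <- predict(lw1, newdata = grid)

plot(y ~ x, data = data, pch = 19, cex = 0.3)
lines(grid$x, grid$y, col = "red", lwd = 3)
```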
Unfortunately the data are not available anymore, but an easier way to fit a non-parametric line (locally weighted scatterplot smoothing, or just LOESS if you want) is to use the following code:
scatter.smooth(y ~ x, span = 2/3, degree = 2)
Note that you can play with parameters span and degree to get arbitrary smoothness.
Maybe it's too late, but you also have options with ggplot2 (and dplyr). First, if you only want to plot a loess line over the points, you can try:
library(ggplot2)
load(url("https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1"))
ggplot(data, aes(x, y)) +
geom_point() +
geom_smooth(method = "loess", se = FALSE)
Another way is to use the predict() function on a loess fit. For instance, I used dplyr functions to add the predictions as a new column called "loess":
library(dplyr)
data %>%
mutate(loess = predict(loess(y ~ x, data = data))) %>%
ggplot(aes(x, y)) +
geom_point(color = "grey50") +
geom_line(aes(y = loess))
Update: Added line of code to load the example data provided
Update2: Corrected the geom_smooth() function name, according to #phi's comment
