I am trying to plot the density curve of a t-distribution with mean = 3 and df = 1.5 using ggplot2. It should be symmetric around 3, so I cannot use the noncentrality parameter, which would skew the curve.
ggplot(data.frame(x = c(-4, 10)), aes(x = x)) +
stat_function(fun = dt, args = list(df = 1.5))
Is there a way to simply shift the distribution along the x-axis?
You could make a custom function for your shifted t-distribution:
custom <- function(x) {dt(x - 3, 1.5)}
ggplot(data.frame(x = c(-4, 10)), aes(x = x)) +
stat_function(fun = custom)
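To check that subtracting 3 inside the function really gives a curve symmetric around 3, here is a minimal base-R sketch. The helper name shifted_dt and its shift argument are made up for illustration; only dt and optimize come from base R.

```r
# Hypothetical helper: evaluate the standard t density at (x - shift)
shifted_dt <- function(x, df, shift = 0) dt(x - shift, df = df)

# Compare the density at equal distances to the left and right of 3
d <- c(0.5, 1, 2.5)
left  <- shifted_dt(3 - d, df = 1.5, shift = 3)
right <- shifted_dt(3 + d, df = 1.5, shift = 3)
is_symmetric <- isTRUE(all.equal(left, right))

# Locate the peak numerically on the plotted range
peak <- optimize(shifted_dt, interval = c(-4, 10), maximum = TRUE,
                 df = 1.5, shift = 3)$maximum
```

The same helper can be passed to stat_function with args = list(df = 1.5, shift = 3) instead of hard-coding the numbers in the function body.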
A simple solution is to just change the labels instead:
ggplot(data.frame(x = c(-4, 10)), aes(x = x)) +
stat_function(fun = dt, args = list(df = 1.5)) +
scale_x_continuous(breaks = c(0, 5, 10), labels = c(3, 8, 13))
There is also a function dt.scaled in the metRology package, which in addition to the df, lets you specify the mean and scale.
Relevant code:
dt.scaled <- function(x, df, mean = 0, sd = 1, ncp, log = FALSE) {
if (!log) stats::dt((x - mean)/sd, df, ncp = ncp, log = FALSE)/sd
else stats::dt((x - mean)/sd, df, ncp = ncp, log = TRUE) - log(sd)
}
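The division by sd (or subtraction of log(sd) on the log scale) is what keeps the area under the rescaled curve equal to 1. A small base-R sketch of the same idea, without the ncp argument, is below; dt_loc_scale is a hypothetical local helper written for this illustration, not the package's code.

```r
# Hypothetical local helper illustrating the location-scale idea (no ncp support)
dt_loc_scale <- function(x, df, mean = 0, sd = 1, log = FALSE) {
  z <- (x - mean) / sd
  if (log) dt(z, df, log = TRUE) - log(sd) else dt(z, df) / sd
}

x <- seq(-4, 10, by = 0.5)

# With sd = 1 this reduces to the plain shift dt(x - 3, df)
same_as_shift <- isTRUE(all.equal(dt_loc_scale(x, df = 1.5, mean = 3),
                                  dt(x - 3, df = 1.5)))

# The log and non-log branches agree after exponentiating
log_consistent <- isTRUE(all.equal(
  exp(dt_loc_scale(x, df = 1.5, mean = 3, sd = 2, log = TRUE)),
  dt_loc_scale(x, df = 1.5, mean = 3, sd = 2)
))
```

For the plot in the question, such a function could be used as stat_function(fun = dt_loc_scale, args = list(df = 1.5, mean = 3)).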
I want to use ggplot to plot three curves, each made with stat_function and with its own parameters.
This is done with the code below:
library(ggplot2)
ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(fun = function(x) plogis(x, location = 9, scale = 3), colour = "blue") +
stat_function(fun = function(x) plogis(x, location = 9, scale = 4), colour = "green")
which gives the figure below:
What I want to achieve is to shift the blue and green curves, exactly as they are, to the right along the horizontal axis (each by an arbitrary amount).
I don't know of an explicit way to do it in ggplot, so I tried to specify a different frame for the second and third geometric objects, as below:
ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(data = data.frame(x = c(3, 28)), fun = function(x) plogis(x, location = 9, scale = 3), colour = "blue") +
stat_function(data = data.frame(x = c(5, 30)), fun = function(x) plogis(x, location = 9, scale = 4), colour = "green")
But the resulting image is the same as the one above.
Your solution is almost correct, but you need to subtract the same constant within the function itself, so that the y-values still correspond.
c1 <- 4
c2 <- 4
p2 <- ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(data = data.frame(x = c(0+c1, 25+c1)),
fun = function(x) plogis(x - c1, location = 9, scale = 3), colour = "blue") +
stat_function(data = data.frame(x = c(0+c2, 25+c2)),
fun = function(x) plogis(x - c2, location = 9, scale = 4), colour = "green")
p2
PS: In the answer, I have also added the constants to the data.frame itself, so that the shift is shown (you can remove them from the data frame if you want only the original x-range shown).
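Because the logistic distribution is a location family, subtracting a constant from x is the same as adding it to the location parameter. The base-R sketch below checks this numerically for the blue curve; the grid of x values is an arbitrary choice for illustration.

```r
x  <- seq(0, 25, by = 0.25)   # arbitrary evaluation grid
c1 <- 4

# Shift applied to the argument, as in the answer
shift_in_function <- plogis(x - c1, location = 9, scale = 3)

# Shift applied to the location parameter instead
shift_in_location <- plogis(x, location = 9 + c1, scale = 3)

identical_curves <- isTRUE(all.equal(shift_in_function, shift_in_location))
```

So an equivalent way to move a curve is to change location directly in the call to plogis.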
I have a large number of variables and would like to create scatterplots comparing all variables to a single variable. I have been able to do this in base R using lapply, but I cannot complete the same task in ggplot2 using lapply.
Below is an example dataset.
df <- data.frame("ID" = 1:16)
df$A <- c(1,2,3,4,5,6,7,8,9,10,11,12,12,14,15,16)
df$B <- c(5,6,7,8,9,10,13,15,14,15,16,17,18,18,19,20)
df$C <- c(11,12,14,16,10,12,14,16,10,12,14,16,10,12,14,16)
I define the variables I would like to generate scatterplots with, using the code below:
library(dplyr)
df_col_names <- df %>% select(A:C) %>% colnames()
Below is how I have been able to successfully complete the task of plotting all variables against variable A, using lapply in base R:
lapply(df_col_names, function(x) {
tiff(filename=sprintf("C:\\Documents\\%s.tiff", x),
width = 1000, height = 1000, res=200)
plot(df$A, df[[x]],
pch=19,
cex = 1.5,
ylab = x,
ylim = c(0, 20),
xlim = c(0, 20))
dev.off()
})
Below is my attempt at completing the task in ggplot2 without any success. It generates the tiff images, although they are empty.
lapply(df_col_names, function(x) {
tiff(filename=sprintf("C:\\Documents\\%s.tiff", x),
width = 1000, height = 1000, res=200)
ggplot(df) +
geom_point(data = df,
aes(x = A, y = df_col_names[[x]], size = 3)) +
geom_smooth(aes(x = A, y = df_col_names[[x]], size = 0), method = "lm", size=0.5) +
coord_fixed(ratio = 1, xlim = c(0, 20), ylim = c(0, 20)) +
guides(size = FALSE, color = FALSE) +
theme_bw(base_size = 14)
dev.off()
})
It works for me with ggsave. Inside a function, such as the one passed to lapply, a ggplot object is not printed automatically, so the tiff device is closed before anything is drawn; ggsave writes the plot object to the file directly. Also note that you are passing column names as strings, so use .data[[x]] to refer to the actual column values.
library(ggplot2)
lapply(df_col_names, function(x) {
# .data[[x]] looks up the column whose name is stored in the string x
plt <- ggplot(df, aes(x = A, y = .data[[x]])) +
geom_point(size = 3) +
geom_smooth(method = "lm", size = 0.5) +
coord_fixed(ratio = 1, xlim = c(0, 20), ylim = c(0, 20)) +
ylab(x) +
theme_bw(base_size = 14)
ggsave(sprintf("%s.tiff", x), plt)
})
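If you would rather keep the open-device / dev.off() pattern from the base R version, an explicit print() should also work. The sketch below is only an illustration: the data frame dat, the column vector y_cols, the use of pdf() instead of tiff(), and writing into tempdir() are all assumptions made so the example is self-contained.

```r
library(ggplot2)

# Small stand-in data set (hypothetical values, not the data from the question)
dat <- data.frame(A = 1:10, B = (1:10) * 2, C = rev(1:10))
y_cols <- c("B", "C")

out_files <- vapply(y_cols, function(col) {
  path <- file.path(tempdir(), paste0(col, ".pdf"))
  plt <- ggplot(dat, aes(x = A, y = .data[[col]])) +
    geom_point() +
    ylab(col)
  pdf(path, width = 5, height = 5)
  print(plt)   # draw explicitly; plots are not auto-printed inside a function
  dev.off()
  path         # return the file path so the results can be inspected
}, character(1))
```

Here out_files is a named character vector with one path per plotted column.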
I am trying to use a log-modulus transformation in my plot. It was working fine...
library(tidyverse)
library(scales)
log_modulus_trans <- function()
trans_new(name = "log_modulus",
transform = function(x) sign(x) * log(abs(x) + 1),
inverse = function(x) sign(x) * ( exp(abs(x)) - 1 ))
# fake data
set.seed(1)
d <- data_frame(
tt = rep(1:10, 3),
cc = rep(LETTERS[1:3], each = 10),
xx = c(rnorm(10, mean = 100, sd = 10),
rnorm(10, mean = 0, sd = 10),
rnorm(10, mean = -100, sd = 10)))
ggplot(data = d,
mapping = aes(x = tt, y = xx, group = cc)) +
geom_line() +
coord_trans(y = "log_modulus")
When I tried to add a geom_vline() things got weird...
ggplot(data = d,
mapping = aes(x = tt, y = xx, group = cc)) +
geom_line() +
coord_trans(y = "log_modulus") +
geom_vline(xintercept = 5)
Any idea how to get geom_vline() to go from the top to the bottom of the plot window... or a work around hack?
Here is a solution using geom_segment:
ggplot(data = d,
mapping = aes(x = tt, y = xx, group = cc)) +
geom_line() +
geom_segment(x = 5, xend = 5, y = -150, yend = 150) +
coord_trans(y = "log_modulus")
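The endpoints -150 and 150 are hard-coded for this data. One possible way to derive them from the data is to pad the range on the transformed scale and map it back with the inverse, as in the base-R sketch below. The helper names log_modulus and inv_log_modulus, the 5% padding, and the variable seg_ends are assumptions for illustration; log1p and expm1 are numerically equivalent to the log(abs(x) + 1) and exp(abs(x)) - 1 forms in the question.

```r
# Same transform and inverse as in the question, written with log1p/expm1
log_modulus     <- function(x) sign(x) * log1p(abs(x))
inv_log_modulus <- function(y) sign(y) * expm1(abs(y))

# Same fake data as in the question
set.seed(1)
xx <- c(rnorm(10, mean = 100, sd = 10),
        rnorm(10, mean = 0, sd = 10),
        rnorm(10, mean = -100, sd = 10))

# Pad the data range by 5% on the transformed scale, then map back
r <- range(log_modulus(xx))
pad <- 0.05 * diff(r)
seg_ends <- inv_log_modulus(c(r[1] - pad, r[2] + pad))

# Sanity check: the inverse undoes the transform
round_trip_ok <- isTRUE(all.equal(inv_log_modulus(log_modulus(xx)), xx))
```

The segment could then be drawn with y = seg_ends[1] and yend = seg_ends[2] instead of the fixed values.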
Just curious: how can you reproduce the plot of Cauchy density curves from Wikipedia?
Normally, you have
dcauchy(x, location = 0, scale = 1, log = FALSE)
which gives a single density curve, P(x) vs. x.
I assume that a data.frame is needed to generate the diagram from Wikipedia?
cauchy_dist <- data.frame(cauchy1 = rcauchy(10, location = 0, scale = 1, log = FALSE), cauchy2 = ....... , cauchy3 = ..... )
Or do you just need to call
plot(x, P(x))
and then add lines to it?
You can use ggplot2's stat_function:
ggplot(data.frame(x = c(-5, 5)), aes(x)) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 0.5), aes(color = "a"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 1), aes(color = "b"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = 0, scale = 2), aes(color = "c"), size = 2) +
stat_function(fun = dcauchy, n = 1e3, args = list(location = -2, scale = 1), aes(color = "d"), size = 2) +
scale_x_continuous(expand = c(0, 0)) +
scale_color_discrete(name = "",
labels = c("a" = expression(x[0] == 0*","~ gamma == 0.5),
"b" = expression(x[0] == 0*","~ gamma == 1),
"c" = expression(x[0] == 0*","~ gamma == 2),
"d" = expression(x[0] == -2*","~ gamma == 1))) +
ylab("P(x)") +
theme_bw(base_size = 24) +
theme(legend.position = c(0.8, 0.8),
legend.text.align = 0)
You could create the data as follows:
location <- c(0, 0, 0, -2)
scale <- c(0.5, 1, 2, 1)
x <- seq(-5, 5, by = 0.1)
cauchy_data <- Map(function(l, s) dcauchy(x, l, s), location, scale)
names(cauchy_data) <- paste0("cauchy", seq_along(location))
cauchy_tab <- data.frame(x = x, cauchy_data)
head(cauchy_tab)
## x cauchy1 cauchy2 cauchy3 cauchy4
## 1 -5.0 0.006303166 0.01224269 0.02195241 0.03183099
## 2 -4.9 0.006560385 0.01272730 0.02272830 0.03382677
## 3 -4.8 0.006833617 0.01324084 0.02354363 0.03600791
## 4 -4.7 0.007124214 0.01378562 0.02440091 0.03839685
## 5 -4.6 0.007433673 0.01436416 0.02530285 0.04101932
## 6 -4.5 0.007763656 0.01497929 0.02625236 0.04390481
Map applies a function of several arguments to the same number of vectors, element by element. Thus, the first list element of cauchy_data will contain the following
dcauchy(x, location[1], scale[1])
and so on. I then put the Cauchy data in a data frame together with the vector of x coordinates, x. So you have the desired data table.
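To make the Map step concrete, the base-R sketch below compares it with an explicit loop and with the closed-form Cauchy density 1 / (pi * gamma * (1 + ((x - x0) / gamma)^2)). The names by_map, by_loop and closed_form are introduced here only for illustration.

```r
location <- c(0, 0, 0, -2)
scale <- c(0.5, 1, 2, 1)
x <- seq(-5, 5, by = 0.1)

# Element-by-element application, as in the answer
by_map <- unname(Map(function(l, s) dcauchy(x, l, s), location, scale))

# The same thing written as an explicit loop
by_loop <- vector("list", length(location))
for (i in seq_along(location)) {
  by_loop[[i]] <- dcauchy(x, location[i], scale[i])
}

# Closed-form density with location x0 and scale gamma
closed_form <- function(x, x0, gamma) {
  1 / (pi * gamma * (1 + ((x - x0) / gamma)^2))
}

map_matches_loop <- isTRUE(all.equal(by_map, by_loop))
map_matches_formula <- isTRUE(all.equal(by_map[[4]], closed_form(x, -2, 1)))
```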
There are, of course, many ways to plot this. I prefer ggplot, so here is an example using it:
library(tidyr)
library(ggplot2)
# paste0 avoids the doubled spaces that paste() would insert after "="
curve_labs <- paste0("x0 = ", location, ", gamma = ", scale)
plot_data <- gather(cauchy_tab, key = curve, value = "P", -x )
ggplot(plot_data, aes(x = x, y = P, colour = curve)) + geom_line() +
scale_colour_discrete(labels = curve_labs)
You could tweak the plot in many ways to get something that more closely resembles the plot from Wikipedia.
I want to create a legend for my prior and posterior in ggplot2. I'm using knitr, so it needs to work there as well, but that shouldn't be a problem.
Below is the code I have:
<<echo=FALSE,message=FALSE,cache=FALSE,include=TRUE,fig.height=5,fig.pos="h!",
warning=FALSE>>=
require(ggplot2)
x <- seq(0, 1, len = 100)
y <- seq(0,6,len=100)
p <- qplot(x, geom = "blank")
Prior <- stat_function(aes(x = x, y = y), fun = dbeta, colour="red", n = 1000,
args = list(shape1 = 3, shape2 = 7))
Posterior <- stat_function(aes(x = x, y = ..y..), fun = dbeta, colour="blue",
n = 1000,args = list(shape1 = 7, shape2 = 23))
p + Prior + Posterior
#
I've tried a few things but I can't figure out the best way. Thanks!
If you put colour= inside the calls to aes(...), ggplot makes a color scale and creates a legend automatically.
p <- qplot(x, geom = "blank")
# Map colour to a label inside aes() so that ggplot builds the legend
Prior <- stat_function(aes(colour = "Prior"), fun = dbeta, n = 1000,
args = list(shape1 = 3, shape2 = 7))
Posterior <- stat_function(aes(colour = "Posterior"), fun = dbeta, n = 1000,
args = list(shape1 = 7, shape2 = 23))
# Named values tie each label to its colour; breaks fixes the legend order
p + Prior + Posterior +
scale_colour_manual("Distribution", breaks = c("Prior", "Posterior"),
values = c(Prior = "red", Posterior = "blue"))