geom_histogram() to hist() in R

I have the following code:
mapping <- aes(
x = values
, color = factor(par_a)
)
plot <- (ggplot(data=data, mapping=mapping)
+ geom_histogram(binwidth = 5, na.rm = TRUE)
+ facet_grid(par_b ~ par_c ~ par_d, scales = "free")
)
I have been asked to use hist() instead, because it supports plot = FALSE, so now I want to adapt the code:
mapping <- aes(
x = values
, color = factor(par_a)
)
plot2 <- hist(values, breaks = seq(min(values), max(values)+5, by = 5))
+ facet_grid(par_b ~ par_c ~ par_d, scales = "free")
However, I have no idea how to implement color = factor(par_a) or the whole line facet_grid(par_b ~ par_c ~ par_d, scales = "free"). I guess these are not supported by hist(), but I would really appreciate it if someone could tell me what the alternatives would be.

Base plotting can use par(mfrow = c(num_rows, num_cols)) to build a grid of subplots. The subsequent plot(), hist(), etc. calls then fill the subplots in order. To color the bars within each plot, you can build a variable as described here and pass it to the col parameter of hist().
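A minimal sketch of the par(mfrow) approach, assuming a made-up data frame with the column names from the question (values, par_a, par_b). It computes one histogram per level of par_b with plot = FALSE and then draws each into its own subplot. The sample data, the shared break points and the fill colour are illustrative assumptions, not part of the original post.

```r
# Hypothetical data standing in for the asker's `data`
set.seed(1)
data <- data.frame(
  values = rnorm(200, mean = 50, sd = 15),
  par_a  = sample(c("a1", "a2"), 200, replace = TRUE),
  par_b  = sample(c("b1", "b2"), 200, replace = TRUE)
)

# Shared breaks of width 5 that cover the full range of the data
breaks <- seq(floor(min(data$values) / 5) * 5,
              ceiling(max(data$values) / 5) * 5,
              by = 5)

# One histogram object per level of par_b, computed without plotting
groups <- split(data, data$par_b)
hists <- lapply(groups, function(d) hist(d$values, breaks = breaks, plot = FALSE))

# Draw the stored histograms side by side, one subplot per level
old_par <- par(mfrow = c(1, length(hists)))
for (nm in names(hists)) {
  plot(hists[[nm]], main = nm, xlab = "values", col = "steelblue")
}
par(old_par)  # restore the previous layout
```

To mimic a grid over several variables, the same idea should work with split(data, list(data$par_b, data$par_c)) and a matching mfrow layout. Colouring bars by par_a within one panel is not covered by this sketch.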

Related

Plotting two stat_function()'s in a grid using ggplot

I want to output two plots in a grid using the same function but with different input for x. I am using ggplot2 with stat_function as per this post and I have combined the two plots as per this post and this post.
f01 <- function(x) {1 - abs(x)}
ggplot() +
stat_function(data = data.frame(x=c(-1, 1)), aes(x = x, color = "red"), fun = f01) +
stat_function(data = data.frame(x=c(-2, 2)), aes(x = x, color = "black"), fun = f01)
With the following outputs:
Plot:
Message:
`mapping` is not used by stat_function()
`data` is not used by stat_function()
`mapping` is not used by stat_function()
`data` is not used by stat_function()
I don't understand why stat_function() won't use either of the arguments. I would expect to get two graphs, one with x between -1 and 1 and the second with x between -2 and 2. Furthermore, it uses the colors as legend labels, and I don't understand why. I must be missing something obvious.
The issue is that according to the docs the data argument is
Ignored by stat_function(), do not use.
Hence, at least in the second call to stat_function the data is ignored.
Second, the docs also state:
The function is called with a grid of evenly spaced values along the x axis, and the results are drawn (by default) with a line.
Therefore both functions are plotted over the same range of x values.
If you simply want to draw functions, this can be achieved without data and mappings like so:
library(ggplot2)
f01 <- function(x) {1 - abs(x)}
ggplot() +
stat_function(color = "black", fun = f01, xlim = c(-2, 2)) +
stat_function(color = "red", fun = f01, xlim = c(-1, 1))
To be honest, I'm not really sure what happens here with ggplot and its inner workings. It seems that the functions are always applied to the complete range, here -2 to 2. Also, there is an issue on github regarding a wrong error message for stat_function.
However, you can use the xlim argument for your stat_function to limit the range on which a function is drawn. Also, if you don't specify the colour argument by a variable, but by a manual label, you need to tell which colours should be used for which label with scale_colour_manual (easiest with a named vector). I also adjusted the line width to show the function better:
library(ggplot2)
f01 <- function(x) {1 - abs(x)}
cols <- c("red" = "red", "black" = "black")
ggplot() +
stat_function(data = data.frame(x=c(-1, 1)), aes(x = x, colour = "red"), fun = f01, size = 1.5, xlim = c(-1, 1)) +
stat_function(data = data.frame(x=c(-2, 2)), aes(x = x, colour = "black"), fun = f01) +
scale_colour_manual(values = cols)
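The two ideas from the answers (xlim for the range, a label inside aes() plus scale_colour_manual for the legend) can be combined without passing data at all. This is a sketch under the assumption of a reasonably recent ggplot2 (3.3.0 or later); the labels "narrow" and "wide" are made up for illustration.

```r
library(ggplot2)

f01 <- function(x) 1 - abs(x)

# Labels inside aes() create legend entries; xlim restricts each curve's range
p <- ggplot() +
  stat_function(aes(colour = "narrow"), fun = f01, xlim = c(-1, 1)) +
  stat_function(aes(colour = "wide"), fun = f01, xlim = c(-2, 2)) +
  # Named vector: legend label on the left, actual colour on the right
  scale_colour_manual(values = c(narrow = "red", wide = "black"))

print(p)
```

Because the colour is mapped to a label rather than set directly, the legend shows the labels and the named vector decides which colour each label gets.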

Two plots in one plot with ggplot

I need to create "two plots" in "one plot" with ggplot. I managed to do it with base R as follows:
x=rnorm(10)
y=rnorm(10)*20+100
plot(1:10,rev(sort(x)),cex=2,col='red',ylim=c(0,2.2))
segments(x0=1:10, x1=1:10, y0=1.8,y1=1.8+y/max(y)*.2,lwd=3,col='dodgerblue')
However, I am struggling to do the same with ggplot. How can it be done?
Here's one possible translation of that code.
ggplot(data.frame(idx=seq_along(x), x,y)) +
geom_point(aes(idx, rev(sort(x))), col="red") +
geom_segment(aes(x=idx, xend=idx, y=1.8, yend=1.8+y/max(y)*.2), color="dodgerblue")
In general with ggplot2, you can add multiple views of data to a plot by adding additional layers (geoms).
My solution is similar to @MrFlick's.
I would always recommend having a plot data frame and referring to the variables from there as you can more easily relate variables to plot aesthetics.
library(tidyverse)
plot_df <- data.frame(x, y) %>%
arrange(-x) %>%
mutate(id = 1:10)
ggplot(plot_df) +
geom_point(aes(id, x), color = "red", pch = 1, size = 5) +
geom_segment(aes(x = id, xend = id, y = 1.8, yend = 1.8+y/max(y)*.2),
lwd = 2, color = 'dodgerblue') +
scale_y_continuous(limits = c(0,2.2)) +
theme_light()
Ultimately, a ggplot is built by adding layers (in this case, the points and the segments) to form the final plot.
If you'd like to learn more, check out the ggplot cheat sheet and read more on the ideas behind ggplot: https://ggplot2.tidyverse.org/

R ggplot2 could not add legend to graph

I'm using Visual Studio with R version 3.5.1, and I am trying to add a legend to a graph.
f1 = function(x) {
return(x+1)}
x1 = seq(0, 1, by = 0.01)
data1 = data.frame(x1 = x1, f1 = f1(x1), F1 = cumtrapz(x1, f1(x1)) )
However, when I try to plot it, I never get a legend!
For example, I used the same code as in this question (Missing legend with ggplot2 and geom_line):
ggplot(data = data1, aes(x1)) +
geom_line(aes(y = f1), color = "1") +
geom_line(aes(y = F1), color = "2") +
scale_color_manual(values = c("red", "blue"))
I also looked into (How to add legend to ggplot manually? - R) and many other pages on Stack Overflow, and I have tried every single function in https://www.rstudio.com/wp-content/uploads/2016/11/ggplot2-cheatsheet-2.1.pdf
i.e.
theme(legend.position = "bottom")
scale_fill_discrete(...)
group
guides()
show.legend=TRUE
I even tried to use the original plot() and legend() functions. Neither worked.
I thought there might be something wrong with the data frame, but when I split the columns (x1, f1, F1) apart, it still didn't work.
I thought there might be something wrong with the IDE, but the code given by kohske actually plotted a legend!
d<-data.frame(x=1:5, y1=1:5, y2=2:6)
ggplot(d, aes(x)) +
geom_line(aes(y=y1, colour="1")) +
geom_line(aes(y=y2, colour="2")) +
scale_colour_manual(values=c("red", "blue"))
What's wrong with the code?
As far as I know, you only have the x and y variables in your aesthetics: color = "1" sits outside aes(), so it is a fixed setting rather than a mapping, and no legend is drawn for it. You have xlab and ylab to describe your two lines. If you want a legend, you should put the grouping in the aesthetics, which might require recoding your dataset:
d<- data.frame(x=c(1:5, 1:5), y=c(1:5, 2:6), colorGroup = c(rep("redGroup", 5),
rep("blueGroup", 5)))
ggplot(d, aes(x, y, color = colorGroup )) + geom_line()
This should give you two lines and a legend
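For comparison, here is a sketch of the asker's own plot with the colour moved inside aes(), which is the only difference from kohske's working code. To keep it self-contained, F1 is computed analytically (the integral of x + 1 is x^2/2 + x) as a stand-in for the cumtrapz() call in the question; that substitution and the legend labels are assumptions.

```r
library(ggplot2)

f1 <- function(x) x + 1
x1 <- seq(0, 1, by = 0.01)

# F1 is the analytic integral of f1, used here instead of cumtrapz()
data1 <- data.frame(x1 = x1, f1 = f1(x1), F1 = x1^2 / 2 + x1)

p <- ggplot(data1, aes(x1)) +
  # colour inside aes() is a mapping, so it gets a legend entry
  geom_line(aes(y = f1, colour = "f1")) +
  geom_line(aes(y = F1, colour = "F1")) +
  scale_colour_manual(values = c(f1 = "red", F1 = "blue"))

print(p)
```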

Adding a table to ggplot with gridExtra and annotation_custom() changes y-axis limits

I tried adding a little summary table to a plot which I created with ggplot2::ggplot(). The table is added via gridExtra::tableGrob() to the saved ggplot object.
My problem is that this seems to change the y-limits of my original plot.
Is there a way to avoid that without having to specify the limits again via ylim()?
Here is a minimal example for the problem using the ChickWeight dataset:
# load packages
require(ggplot2)
require(gridExtra)
# create plot
plot1 = ggplot(data = ChickWeight, aes(x = Time, y = weight, color = Diet)) +
stat_summary(fun.data = "mean_cl_boot", size = 1, alpha = .5)
plot1
# create table to add to the plot
sum_table = aggregate(ChickWeight$weight,
by=list(ChickWeight$Diet),
FUN = mean)
names(sum_table) = c('Diet', 'Mean')
sum_table = tableGrob(sum_table)
# insert table into plot
plot1 + annotation_custom(sum_table)
EDIT:
I just figured out that it seems to be an issue with stat_summary(). When I use another geom/layer, then the limits stay as they were in the original plot. Another example for that:
plot2 = ggplot(data = ChickWeight, aes(x = Time, y = weight, color = Diet)) +
geom_jitter()
plot2
plot2 + annotation_custom(sum_table)
The y-range for plot1 is different from plot2, the reason being that annotation_custom takes its aesthetics from the original aes statement, not the modified data frame used by stat_summary(). To get the y-ranges for the two plots to be the same (or roughly the same - see below), stop annotation_custom getting its aesthetics from the original data. That is, move aes() inside the stat_summary().
# load packages
require(ggplot2)
require(gridExtra)
# create plot
plot1 = ggplot(data = ChickWeight) +
stat_summary(aes(x = Time, y = weight, color = Diet), fun.data = "mean_cl_boot", size = 1, alpha = .5)
plot1
# create table to add to the plot
sum_table = aggregate(ChickWeight$weight,
by=list(ChickWeight$Diet),
FUN = mean)
names(sum_table) = c('Diet', 'Mean')
sum_table = tableGrob(sum_table)
# insert table into plot
plot2 = plot1 + annotation_custom(sum_table, xmin = 10, xmax = 10, ymin = 200, ymax = 200)
plot2
By the way, the two plots will not give exactly the same y-range because of the bootstrap function in stat_summary(). Indeed, draw plot1 repeatedly and you might notice slight changes in the y-range. Or check the y-ranges in the build data.
Edit: updated for ggplot2 version 3.0.0
ggplot_build(plot1)$layout$panel_params[[1]]$y.range
ggplot_build(plot2)$layout$panel_params[[1]]$y.range
Recall that ggplot does not evaluate functions until drawing time: each time plot1 or plot2 is drawn, a new bootstrap sample is selected.

How to add gaussian curve to histogram created with qplot?

I have a question that is probably similar to Fitting a density curve to a histogram in R. Using qplot I have created 7 histograms with this command:
qplot(V1, data=data, binwidth=10, facets=V2~.)
For each slice, I would like to add a fitted Gaussian curve. When I try to use the lines() method, I get this error:
Error in plot.xy(xy.coords(x, y), type = type, ...) :
plot.new has not been called yet
What is the command to do it correctly?
Have you tried stat_function?
+ stat_function(fun = dnorm)
You'll probably want to plot the histograms using aes(y = ..density..) in order to plot the density values rather than the counts.
A lot of useful information can be found in this question, including some advice on plotting different normal curves on different facets.
Here are some examples:
dat <- data.frame(x = c(rnorm(100),rnorm(100,2,0.5)),
a = rep(letters[1:2],each = 100))
Overlay a single normal density on each facet:
ggplot(data = dat,aes(x = x)) +
facet_wrap(~a) +
geom_histogram(aes(y = ..density..)) +
stat_function(fun = dnorm, colour = "red")
From the question I linked to, create a separate data frame with the different normal curves:
library(plyr)  # provides ddply()
grid <- with(dat, seq(min(x), max(x), length = 100))
normaldens <- ddply(dat, "a", function(df) {
  data.frame(
    predicted = grid,
    density = dnorm(grid, mean(df$x), sd(df$x))
  )
})
And plot them separately using geom_line:
ggplot(data = dat,aes(x = x)) +
facet_wrap(~a) +
geom_histogram(aes(y = ..density..)) +
geom_line(data = normaldens, aes(x = predicted, y = density), colour = "red")
ggplot2 uses a different graphics paradigm than base graphics. (Although you can use grid graphics with it, the best way is to add a new stat_function layer to the plot.) The ggplot2 code is the following.
Note that I couldn't get this to work using qplot, but the transition to ggplot is reasonably straightforward; the most important difference is that your data must be in data.frame format.
Also note the explicit mapping of the y aesthetic, aes(y = ..density..). This is slightly unusual, but it puts the histogram on the density scale so that it matches the stat_function results:
library(ggplot2)
data <- data.frame(V1 = rnorm(700), V2 = sample(LETTERS[1:7], 700, replace = TRUE))
ggplot(data, aes(x=V1)) +
stat_bin(aes(y=..density..)) +
stat_function(fun=dnorm) +
facet_grid(V2~.)
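The stat_function(fun = dnorm) layer above draws the same standard normal curve in every facet. A sketch of one way to get a separately fitted curve per facet, using only base R to build the curve data, is shown below. It assumes ggplot2 3.3.0 or later for after_stat(); the seed, the grid of 100 points and the bin count are arbitrary choices.

```r
library(ggplot2)

set.seed(42)
data <- data.frame(V1 = rnorm(700),
                   V2 = sample(LETTERS[1:7], 700, replace = TRUE))

# Evaluate a normal density per facet, using that facet's own mean and sd
grid <- seq(min(data$V1), max(data$V1), length.out = 100)
curves <- do.call(rbind, lapply(split(data, data$V2), function(d) {
  data.frame(V2 = d$V2[1],
             V1 = grid,
             density = dnorm(grid, mean(d$V1), sd(d$V1)))
}))

p <- ggplot(data, aes(x = V1)) +
  # histogram on the density scale so it is comparable with dnorm()
  geom_histogram(aes(y = after_stat(density)), bins = 30) +
  # the V2 column in `curves` routes each fitted curve to its facet
  geom_line(data = curves, aes(y = density), colour = "red") +
  facet_grid(V2 ~ .)

print(p)
```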
