Connecting the points of a dodged geom_point by group

Consider this nested data set, in which I want to plot the two nested factor variables on the x-axis:
df <- data.frame(X=c(rep("A",9), rep("B",9), rep("C",9)),
nested=c(rep(c(rep("X",3), rep("Y",3), rep("Z",3)),3)),
response=runif(27))
ggplot(df) +
geom_point(aes(x=X, y=response, col=nested, group=nested, shape=nested), position=position_dodge(width=1))
I want to connect the dots within each level of nested inside each level of X, so that the plot shows vertical, parallel lines spanning from the minimum to the maximum of response for each nested level (much as I would use fill=nested if I went for boxplots). My first approach was not satisfactory:
ggplot(df) +
geom_point(aes(x=X, y=response, col=nested, group=nested, shape=nested), position=position_dodge(width=0.3))+
geom_line(aes(x=X, y=response, col=nested, group=nested))
I can imagine using geom_errorbar, but that would mean creating a separate data frame with the min and max values, correct?

You can dodge the lines too. Just make sure that the group aesthetic is mapped to the interaction between nested and X:
ggplot(df) +
geom_point(aes(x = X, y = response, col = nested, shape = nested),
position=position_dodge(width = 0.3)) +
geom_line(aes(x = X, y = response, col = nested,
group = interaction(nested, X)),
position = position_dodge(width = 0.3))
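On the geom_errorbar question: a separate data frame may not be needed, because stat_summary() can compute the minimum and maximum per group on the fly. The following is only a sketch under some assumptions: it needs ggplot2 3.3.0 or later (for the fun, fun.min and fun.max arguments), it recreates the question's data with an arbitrary seed, and the names dodge and p_range are made up for this example.

```r
library(ggplot2)

# Same structure as the data in the question; the seed is arbitrary
set.seed(1)
df <- data.frame(X = rep(c("A", "B", "C"), each = 9),
                 nested = rep(rep(c("X", "Y", "Z"), each = 3), times = 3),
                 response = runif(27))

# Use one dodge object so points and segments are shifted by the same amount
dodge <- position_dodge(width = 0.3)

p_range <- ggplot(df, aes(x = X, y = response, colour = nested)) +
  geom_point(aes(shape = nested), position = dodge) +
  # One vertical segment per X/nested combination, from min to max of response.
  # fun only fills in a y value for the summary; geom "linerange" does not draw it.
  stat_summary(fun = median, fun.min = min, fun.max = max,
               geom = "linerange", position = dodge)
p_range
```

Because X and nested are both discrete, ggplot2 groups by their combination by default, so no explicit group aesthetic is set here.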

Maybe try this approach with facet_wrap():
library(ggplot2)
#Code
ggplot(df) +
geom_point(aes(x=X, y=response,
col=nested, group=nested, shape=nested),
position=position_dodge(width=1))+
geom_line(aes(x=X, y=response,
col=nested,group=nested),position=position_dodge(width=1))+
facet_wrap(.~X,scales='free_x')+
theme(strip.background = element_blank(),
strip.text = element_blank())

Related

ggplot two histograms in one plot

I created the plot below using:
ggplot(data_all, aes(x = Speed, fill = Season)) +
theme_bw() +
geom_histogram(position = "identity", alpha = 0.2, binwidth=0.1)
As you can see, the difference in the amount of data available is very large. Is there a way to look only at the distribution and not at the total data amount?

You can reference other values calculated by the stat functions using a notation that you may have seen before: ..value... The ggplot2 documentation calls these "computed variables", and the help page of each stat lists the ones it provides; you may also see them called "special variables" or "calculated aesthetics".
In this case, the default computed variable on the y axis for geom_histogram() is ..count... When comparing distributions with very different total N, it's more useful to use ..density.., which rescales each group's histogram so that its bars integrate to 1. You can access ..density.. by mapping it to the y aesthetic directly in the geom_histogram() call.
First, here's an example of two histograms with vastly different sizes (similar to OP's question):
library(ggplot2)
set.seed(8675309)
df <- data.frame(
x = c(rnorm(1000, -1, 0.5), rnorm(100000, 3, 1)),
group = c(rep("A", 1000), rep("B", 100000))
)
ggplot(df, aes(x, fill=group)) + theme_classic() +
geom_histogram(
alpha=0.2, color='gray80',
position="identity", bins=80)
And here's the same plot using ..density..:
ggplot(df, aes(x, fill=group)) + theme_classic() +
geom_histogram(
aes(y=..density..), alpha=0.2, color='gray80',
position="identity", bins=80)
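For newer ggplot2 versions, the same plot can be written with after_stat(), which was introduced in ggplot2 3.3.0; the ..density.. notation has been deprecated since 3.4.0. This sketch reuses the simulated data from above, and p_density is just a name chosen here.

```r
library(ggplot2)

set.seed(8675309)
df <- data.frame(
  x = c(rnorm(1000, -1, 0.5), rnorm(100000, 3, 1)),
  group = rep(c("A", "B"), times = c(1000, 100000))
)

p_density <- ggplot(df, aes(x, fill = group)) +
  theme_classic() +
  geom_histogram(
    # after_stat(density) refers to the density column computed by stat_bin()
    aes(y = after_stat(density)),
    alpha = 0.2, color = "gray80",
    position = "identity", bins = 80)
p_density
```

Each group is normalised separately, so the small group A and the large group B both cover an area of 1 and their shapes can be compared directly.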

Calculating means with stat_summary for two different groupings and plotting in one plot

I am having issues with plotting two calculated means using stat_summary in the same figure.
I am using ggplot and stat_summary to plot means of a dataset that I grouped based on variable A. Variable A can have value 1,2,3,4. The same data also have variable B that can have value 1,2.
So, I can make a plot with means of the data grouped by variable A, and I get 4 lines.
I can also make a plot with means of the data grouped by variable B, where I get 2 lines.
But how can I plot them in the same figure, so that I get 6 lines? I have made a somewhat similar example using the mtcars dataset:
library(ggplot2)
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$vs <- as.factor(mtcars$vs)
mtcars
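plot1 <- ggplot(mtcars, aes(x=gear, y=hp, color=cyl, fill=cyl)) +
# mean_cl_normal needs the Hmisc package to be installed
stat_summary(geom='ribbon', fun.data = mean_cl_normal, fun.args=list(conf.int=0.95), alpha=0.5) +
# fun replaces fun.y, which is deprecated since ggplot2 3.3.0
stat_summary(geom='line', fun = mean, size=1)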
plot1
plot2 <- ggplot(mtcars, aes(x=gear, y=hp, color=vs, fill=vs)) +
stat_summary(geom='ribbon', fun.data = mean_cl_normal, fun.args=list(conf.int=0.95), alpha=0.5) +
stat_summary(geom='line', fun = mean, size=1)
plot2
So far my impression is that, because I start with ggplot(xxx), where xxx defines the data and grouping, I can't combine it with another ggplot call that uses a different grouping. If I could call ggplot() with no arguments and define the data and grouping only inside stat_summary, I feel that would be the solution. But I can't figure out how to use stat_summary like that, or whether it is even possible.

You can just add more layers, defining the aes for each separately:
# fun replaces fun.y (deprecated since ggplot2 3.3.0); mean_cl_normal needs the Hmisc package
ggplot(mtcars) +
stat_summary(aes(x=gear, y=hp, color=paste('cyl:', cyl), fill = paste('cyl:', cyl)), geom='ribbon', fun.data = mean_cl_normal, fun.args=list(conf.int=0.95), alpha=0.5) +
stat_summary(aes(x=gear, y=hp, color=paste('cyl:', cyl)), geom='line', fun = mean, size=1) +
stat_summary(aes(x=gear, y=hp, color=paste('vs:', vs), fill=paste('vs:', vs)), geom='ribbon', fun.data = mean_cl_normal, fun.args=list(conf.int=0.95), alpha=0.5) +
stat_summary(aes(x=gear, y=hp, color=paste('vs:', vs)), geom='line', fun = mean, size=1)
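The idea from the question, an empty ggplot() with data and grouping given only inside stat_summary, also works, since every layer accepts its own data argument. Here is a minimal sketch that leaves out the ribbons (so Hmisc is not needed) and assumes ggplot2 3.3.0 or later for the fun argument; p_layers and the legend title are names picked for illustration.

```r
library(ggplot2)

# Empty ggplot(): each layer names its own data and aesthetics
p_layers <- ggplot() +
  stat_summary(data = mtcars,
               aes(x = gear, y = hp, colour = paste("cyl:", cyl)),
               geom = "line", fun = mean) +
  stat_summary(data = mtcars,
               aes(x = gear, y = hp, colour = paste("vs:", vs)),
               geom = "line", fun = mean) +
  labs(colour = "grouping")
p_layers
```

With mtcars this gives three lines for cyl and two for vs, all sharing one colour legend. The two layers could just as well point at two different data frames.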

Problem when trying to plot two histograms using fill aesthetic

I've been trying to plot two histograms by using the fill aesthetic and a specific column with two levels. However, instead of displaying both desired histograms, my code displays one histogram of the whole data and another for the second class only. I don't know whether there is a problem in my syntax or whether this is some kind of tricky issue.
library(tidyverse)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
P1 <- ggplot(db1, aes(x=val)) + geom_histogram()
P2 <- ggplot(db2, aes(x=val)) + geom_histogram()
PF <- ggplot(dbf, aes(x=val)) + geom_histogram()
I want to get this, an overlay of P1 and P2:
ggplot(db1, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
But the code I think should work, P1 and P2 combined through the fill aesthetic on column type:
ggplot(dbf, aes(x=val)) + geom_histogram(aes(fill=type), alpha=0.5)
produces the combination of PF and P2, as if I had written:
ggplot(dbf, aes(x=val)) + geom_histogram(fill="red", alpha=0.5) + geom_histogram(data=db2, aes(x=val),fill="green", alpha=0.5)
Any help or idea will be highly appreciated!

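All you need is to pass position = "identity" to your geom_histogram function. The default position is "stack", so the bars for type A are drawn on top of the bars for type B, and the stacked outline looks like a histogram of the whole data set.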
library(tidyverse)
library(ggplot2)
db1 <- data.frame(type=rep("A",100),val=rnorm(n=100,mean=50,sd=10))
db2 <- data.frame(type=rep("B",150),val=rnorm(n=150,mean=50,sd=10))
dbf <- bind_rows(db1,db2)
ggplot(dbf, aes(x=val, fill = type)) + geom_histogram(alpha=0.5, position = "identity")
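To see what the position argument changes, one can compare where the bars start in the two versions. This is only an illustrative sketch: the seed, the number of bins and the object names (p_base, p_stack, p_identity) are arbitrary choices, and layer_data() is used to look at the values ggplot2 computes for the layer.

```r
library(ggplot2)

set.seed(42)
dbf <- data.frame(type = rep(c("A", "B"), times = c(100, 150)),
                  val = rnorm(250, mean = 50, sd = 10))

p_base <- ggplot(dbf, aes(x = val, fill = type))
# Default for geom_histogram(): position = "stack"
p_stack <- p_base + geom_histogram(alpha = 0.5, bins = 30)
# Overlaid histograms: every bar starts at zero
p_identity <- p_base + geom_histogram(alpha = 0.5, bins = 30, position = "identity")

# ymin is the lower edge of each bar
range(layer_data(p_stack)$ymin)
range(layer_data(p_identity)$ymin)
```

In the stacked version some bars start above zero because they sit on top of the other group's bars; with position = "identity" every bar starts at zero.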

Is your goal to show the overlap via the color combination? I'm not sure how to force geom_histogram to show the overlap, but geom_density does do what you want. You can play with the bandwidth (bw) to show more or less detail.
dbf %>% ggplot() +
aes(x = val, fill = type) +
geom_density(alpha = .5, bw = .5) +
scale_fill_manual(values = c("red","green"))

How to trim extra space from ggplot

I am trying to make an extremely simple heatmap of percentages using ggplot2, which ideally will be just two thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space, as shown. The plot I would like is those two bars side by side. How can I do this in ggplot?
thanks

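What about this solution?
set.seed(1234)
# variable must be a factor: since R 4.0.0, data.frame() keeps strings as character,
# and as.numeric() on a character column would return NA
tg_melt <- data.frame(variable=factor(rep(c("Prevalence_T","Prevalence_NT"), each=10)),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20))
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
# as.numeric() turns the factor into its level codes (1, 2), giving a continuous x axis
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
# theme_bw() comes first: a complete theme added later would reset axis.text
theme_bw() +
theme(axis.text=element_text(size=7)) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable))
p_prev_tg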
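Another option, offered only as a sketch and not as part of the answer above, is to keep the discrete x axis, leave the tile width at its default so the two columns touch, remove the axis padding, and let coord_fixed() make the panel narrow. The data are simulated as above; p_narrow is a name chosen here.

```r
library(ggplot2)

set.seed(1234)
tg_melt <- data.frame(
  variable = factor(rep(c("Prevalence_T", "Prevalence_NT"), each = 10)),
  OTU = rep(paste0("OTU_", 1:10), 2),
  value = rnorm(20))

p_narrow <- ggplot(tg_melt, aes(x = variable, y = OTU, fill = value)) +
  geom_tile() +                          # default width: neighbouring tiles touch
  scale_x_discrete(expand = c(0, 0)) +   # no padding left and right of the columns
  scale_y_discrete(expand = c(0, 0)) +   # no padding above and below the rows
  coord_fixed(ratio = 1) +               # one y unit is drawn as long as one x unit
  xlab(NULL) + ylab(NULL) +
  theme(axis.text = element_text(size = 7))
p_narrow
```

With 10 rows and 2 columns, a ratio of 1 yields square tiles and a tall, narrow panel; a larger ratio makes the columns thinner relative to their height.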

How to plot two histograms on the same axis scale?

I have two data frames, dataf1 and dataf2, with the same structure and columns.
The three columns are named A, B and C, and both data frames have 50 rows.
I would like to plot the histogram of column B for dataf1 and dataf2. I can plot the two histograms separately, but they are not on the same scale. How can I either put them on the same histogram using different colors, or plot two histograms on the same scale?
ggplot() + aes(dataf1$B)+ geom_histogram(binwidth=1, colour="black",fill="white")
ggplot() + aes(dataf2$B)+ geom_histogram(binwidth=1, colour="black", fill="white")

Combine your data into a single data frame with a new column marking which data frame the data originally came from. Then use that new column for the fill aesthetic for your plot.
dataf1$source="Data 1"
dataf2$source="Data 2"
dat_combined = rbind(dataf1, dataf2)
You haven't provided sample data, so here are a few examples of possible plots, using the built-in iris data frame. In the plots below, dat is analogous to dat_combined, Petal.Width is analogous to B, and Species is analogous to source.
dat = subset(iris, Species != "setosa") # We want just two species
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="identity", colour="grey40", alpha=0.5, binwidth=0.1)
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="dodge", binwidth=0.1)
ggplot(dat, aes(Petal.Width, fill=Species)) +
geom_histogram(position="identity", colour="grey40", binwidth=0.1) +
facet_grid(Species ~ .)

As Zheyuan says, you just need to set the y limits for each plot to get them on the same scale. With ggplot2, one way to do this is with the lims command (though scale_y_continuous and coord_cartesian also work, albeit slightly differently). You also should never use data$column inside aes(). Instead, use the data argument for the data frame and unquoted column names inside aes(). Here's an example with some built-in data.
p1 = ggplot(mtcars, aes(x = mpg)) + geom_histogram() + lims(y = c(0, 13))
p2 = ggplot(iris, aes(x = Sepal.Length)) + geom_histogram() + lims(y = c(0, 13))
gridExtra::grid.arrange(p1, p2, nrow = 1)
To get two histograms on the same plot, the best way is to combine your data frames. A guess, without seeing what your data looks like:
dataf = rbind(dataf1["B"], dataf2["B"])
dataf$source = c(rep("f1", nrow(dataf1)), rep("f2", nrow(dataf2)))
ggplot(dataf, aes(x = B, fill = source)) +
geom_histogram(position = "identity", alpha = 0.7)
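Since no sample data were posted, here is a self-contained sketch of the combine-then-fill approach. The two data frames are hypothetical stand-ins simulated to match the description in the question (50 rows, columns A, B, C); the means, standard deviations, binwidth and the name p_both are arbitrary choices.

```r
library(ggplot2)

# Hypothetical stand-ins for the asker's data: 50 rows, columns A, B, C
set.seed(1)
dataf1 <- data.frame(A = rnorm(50), B = rnorm(50, mean = 10, sd = 2), C = rnorm(50))
dataf2 <- data.frame(A = rnorm(50), B = rnorm(50, mean = 13, sd = 3), C = rnorm(50))

# Tag each row with its origin, then stack the two data frames
dataf <- rbind(
  cbind(dataf1, source = "f1"),
  cbind(dataf2, source = "f2"))

# Both histograms share the same bins and the same axes
p_both <- ggplot(dataf, aes(x = B, fill = source)) +
  geom_histogram(position = "identity", alpha = 0.7, binwidth = 1)
p_both
```

Because both groups live in one plot, they automatically share the x and y scales; adding facet_grid(source ~ .) would split them into two panels that still share the same axes.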
