I am very new to R and struggling with something that I know should be simple. I am able to make one scatterplot but would like to make multiple subplots using each feature in my dataset plotted against the variable: per_gop.
My code for one scatter plot is as follows:
data_US@data %>%
ggplot(aes(x=as.numeric(Hispanic_o), y=as.numeric(per_gop)))+
geom_point(aes(fill=as.numeric(gop_dem), size=as.numeric(total_vote)),colour="#525252",pch=21) +
stat_smooth(method=lm, se=FALSE, size=1, colour="#525252")+
scale_fill_distiller(palette="RdBu", type="div", direction=1, guide="colourbar", limits=c(-1,1))+
theme_bw()+
theme(legend.position="none")+
ggtitle(paste("correlation:",round(cor.test(as.numeric(data_US@data$per_gop),as.numeric(data_US@data$Hispanic_o))$estimate,2)))
I have tried using a gather function for this but I am not sure how to correctly pass it to the code for plotting:
My code so far for multiple subplots is as follows:
data_US@data %>%
gather(c(White_alon,Black_or_A, Asian_alon,Hispanic_o,Foreign_bo,
Veterans_2,Language_o,Homeowners,Median_val,Per_capita,Bachelors_,
Private_no), key = "expl_var", value="la_prop") %>%
ggplot(aes(x=??????, y=per_gop))+
geom_point(aes(fill=gop_dem, size=total_vote),colour="#525252",pch=21) +
stat_smooth(method=lm, se=FALSE, size=1, colour="#525252")+
scale_fill_distiller("BrBG", type="div", direction=1, guide="colourbar", limits=c(-1,1))+
facet_wrap(~expl_var, scales="free")+
theme_bw()+
theme(legend.position="none")+
ggtitle(paste("correlation:",round(cor.test(data_US@data$per_gop,data_US@data$Persons_65)$estimate,2)))
This is the style of output I am trying to create, just without the repeating variable:
I would be very grateful if someone could point me in the right direction... Thank you!
Here's a reproducible example using the mtcars dataset. It should be possible to apply the same approach to your data, though it is hard to be sure without seeing a sample of it.
library(tidyverse)
mtcars %>%
gather(expl_var, la_prop, -mpg) %>%
ggplot(aes(la_prop, mpg)) +
geom_point(colour="#525252",pch=21) +
stat_smooth(method=lm, se=FALSE, size=1, colour="#525252")+
facet_wrap(~expl_var, scales = "free") +
theme_bw()+
theme(legend.position="none")
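As a hedged extension of the example above: gather() still works, but tidyr has superseded it with pivot_longer(). The sketch below uses pivot_longer() and, since the question also wanted a correlation in the title, folds a per-variable correlation into each facet label. The object name long and the column facet_label are made up for this illustration, and mtcars again stands in for the real data.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Reshape to long form: one row per (car, explanatory variable) pair
long <- mtcars %>%
  pivot_longer(-mpg, names_to = "expl_var", values_to = "la_prop") %>%
  group_by(expl_var) %>%
  # Correlation of each explanatory variable with mpg, pasted into the label
  mutate(facet_label = paste0(expl_var, " (r = ",
                              round(cor(la_prop, mpg), 2), ")")) %>%
  ungroup()

p <- ggplot(long, aes(la_prop, mpg)) +
  geom_point(colour = "#525252", pch = 21) +
  stat_smooth(method = lm, formula = y ~ x, se = FALSE, colour = "#525252") +
  facet_wrap(~facet_label, scales = "free") +
  theme_bw()
p
```

Putting the correlation in the facet label avoids a single ggtitle() that can only describe one variable.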
I am able to plot the graph with the two lines of code below.
ggplot(Melvyl,aes(x=Type.of.Customer)) +
geom_histogram(stat="count")
But I want data labels (the count) on each of the categories. I tried the code below, but it is not working. Can you please help me out?
Thank you
ggplot(Melvyl,aes(x=Type.of.Customer)) +
geom_histogram(stat="count")+ stat_bin(binwidth=1, geom="text", aes(label=..count..), vjust=-1.5)
Instead of using geom_histogram you could go on with geom_bar. To add your labels use geom_text with stat="count".
Using mtcars as example data:
library(ggplot2)
ggplot(mtcars, aes(x=cyl)) +
geom_bar() +
geom_text(aes(label=..count..), stat = "count", vjust=-1.5)
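For newer ggplot2 versions, here is a minimal sketch of the same idea using after_stat(count), since the ..count.. notation was deprecated in ggplot2 3.4.0. Wrapping cyl in factor() is a choice made for this illustration so the axis only shows the three categories; the counts table is only there as a cross-check.

```r
library(ggplot2)

# Counts per category, computed directly as a sanity check
counts <- table(mtcars$cyl)

p <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar() +
  # stat = "count" makes the computed count available to the label aesthetic
  geom_text(aes(label = after_stat(count)), stat = "count", vjust = -0.5)
p
```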
I am trying to make a manual colour scale for my bar graph using plyr to summarize the data and ggplot2 to present the graph.
The data has two variables:
Region (displayed on the X-axis)
Genotype (displayed by the fill)
I have managed to do this already, however, I have not been able to find a way to personalize the colours - it simply gives me two randomly assigned colours.
Could someone please help me figure out what I am missing here?
I have included my code and an image of the graph below. The graph basically has the appearance I want it to, except that I can't personalize the colours.
ggplotdata <- summarySE(data, measurevar="Density", groupvars=c("Genotype", "Region"))
ggplotdata
#Plot the data
ggplotdata$Genotype <- factor(ggplotdata$Genotype, c("WT","KO"))
Mygraph <-ggplot(ggplotdata, aes(x=Region, y=Density, fill=Genotype)) +
geom_bar(position=position_dodge(), stat="identity",
colour="black",
size=.2) +
geom_errorbar(aes(ymin=Density-se, ymax=Density+se),
width=.2,
position=position_dodge(.9)) +
xlab(NULL) +
ylab("Density (cells/mm2)") +
scale_colour_manual(name=NULL,
breaks=c("KO", "WT"),
labels=c("KO", "WT"),
values=c("#FFFFFF", "#3366FF")) +
ggtitle("X") +
scale_y_continuous(breaks=0:17*500) +
theme_minimal()
Mygraph
The answer here was to use scale_fill_manual instead. Thank you, @dc37.
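The reason is that the bars are coloured through the fill aesthetic, so only a fill scale affects them; scale_colour_manual() controls the colour aesthetic (here, the outlines). A minimal sketch follows, assuming a small made-up data frame in place of the summarySE() output (the Region names and numbers are invented purely for illustration) and using a named vector so each genotype is tied to its colour explicitly.

```r
library(ggplot2)

# Hypothetical summary data standing in for the summarySE() result
ggplotdata <- data.frame(
  Genotype = factor(rep(c("WT", "KO"), each = 2), levels = c("WT", "KO")),
  Region   = rep(c("A", "B"), times = 2),
  Density  = c(1200, 1500, 900, 1100),
  se       = c(80, 95, 60, 70)
)

# Named values: the mapping does not depend on level order
genotype_colours <- c(KO = "#FFFFFF", WT = "#3366FF")

p <- ggplot(ggplotdata, aes(x = Region, y = Density, fill = Genotype)) +
  geom_col(position = position_dodge(), colour = "black") +
  geom_errorbar(aes(ymin = Density - se, ymax = Density + se),
                width = 0.2, position = position_dodge(0.9)) +
  scale_fill_manual(name = NULL, values = genotype_colours) +
  theme_minimal()
p
```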
I'm using fct_reorder() to order the levels of my factors in ggplot. That works fine for individual plots. But when I use plot_grid() from cowplot, there is some kind of problem. For contrast, to the left, I've used a plot that has fixed factor levels, not using fct_reorder.
Edited:
Here is the actual code I'm using:
#make the base
myplot <-filter(summary_by_intensity_reshaped, str_detect(feature, "binary"), Frequency == "2Hz") %>%
ggplot(., aes(fct_reorder(feature, mean),mean,fill=Intensity, ymax=mean+sem, ymin=mean-sem))
#add the layers
myplot + geom_bar(stat="identity", position=position_dodge()) +
geom_errorbar(aes(width=0.2),position=position_dodge(0.9)) +
labs(x="Behavior",y="Percent of Trials (%)") +
scale_x_discrete(breaks=c("binary_flutter", "binary_hold", "binary_lift", "binary_jump","binary_rear", "binary_lick", "binary_guard", "binary_vocalize"), labels=c("Flutter", "Holding", "Lifting", "Jumping", "Rearing", "Licking", "Guarding", "Vocalizing"))+
facet_grid(~Frequency)+
theme(axis.text.x=element_text(angle=-90))
And the output looks like this:
The problem arises when I try to use 'myplot' in plot_grid(). That's when it renders oddly as in the example below.
I suspect you're using fct_reorder() incorrectly. plot_grid() just takes whatever plot you make and puts it into a grid.
library(ggplot2)
library(cowplot)
library(forcats)
p1 <- ggplot(mpg, aes(class, displ, color = factor(cyl))) + geom_point()
p2 <- ggplot(mpg, aes(fct_reorder(class, displ, mean), displ, color = factor(cyl))) +
geom_point()
plot_grid(p1, p2)
From your x axis title in the plot on the right, it looks to me like you forgot to provide fct_reorder() with the vector to which it should apply the function.
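To make the role of each argument concrete, here is a small sketch on toy vectors (f and x are invented for this illustration): the first argument is the factor, the second is the numeric vector that drives the ordering, and .fun summarises that vector within each level.

```r
library(forcats)

f <- factor(c("a", "b", "c", "a", "b", "c"))
x <- c(3, 1, 2, 5, 1, 2)

# Level means are a = 4, b = 1, c = 2, so levels are sorted b, c, a
reordered <- fct_reorder(f, x, .fun = mean)
levels(reordered)
```

Because the reordering call ends up as the default axis title, adding labs(x = ...) to the plot is one way to keep the title readable.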
If I do the following command
data(mtcars)
ggplot(data=mtcars, aes(cyl))+
geom_bar(aes(fill=as.factor(gear), y = (..count..)/sum(..count..)), position="dodge") +
scale_y_continuous(labels=percent)
I will get
However, what I really want to do is have each of the gear levels add up to 100%. So, gear is the subgroup I am looking at, and I want to know the distribution within each group.
I don't want to use facets and I don't want to melt the data either. Is there a way to do this?
I was searching for an answer to this exact question. This is what I came up with using the information I pooled together from Stack Overflow and getting familiar (i.e., trial-and-error) with ..x.., ..group.., and ..count.. from the Sebastian Sauer link provided in Simon's answer. It shouldn't require any other packages than ggplot.
library(ggplot2)
ggplot(mtcars, aes(x=as.factor(cyl), fill=as.factor(gear)))+
geom_bar(aes( y=..count../tapply(..count.., ..x.. ,sum)[..x..]), position="dodge" ) +
geom_text(aes( y=..count../tapply(..count.., ..x.. ,sum)[..x..], label=scales::percent(..count../tapply(..count.., ..x.. ,sum)[..x..]) ),
stat="count", position=position_dodge(0.9), vjust=-0.5)+
ylab('Percent of Cylinder Group, %') +
scale_y_continuous(labels = scales::percent)
Produces
First of all: your code is not reproducible for me (not even after including library(ggplot2)). I am not sure if ..count.. is some syntax I am not aware of, but in any case it would have been nicer if I had been able to reproduce it right away :-).
Having said that, I think what you are looking for is described in http://docs.ggplot2.org/current/geom_bar.html and, applied to your example, the code
library(ggplot2)
data(mtcars)
mtcars$gear <- as.factor(mtcars$gear)
ggplot(data=mtcars, aes(cyl))+
geom_bar(aes(fill=as.factor(gear)), position="fill")
produces
Is this what you are looking for?
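Note that with cyl on the x axis, position="fill" makes each cyl bar sum to 100%, not each gear. One possible variation, assuming a stacked bar per gear is acceptable, is to swap the two variables so that gear is on the x axis and cyl is the fill. The row_shares table below is only a cross-check added for this sketch.

```r
library(ggplot2)

# Share of each cyl value within every gear level (rows sum to 1)
row_shares <- prop.table(table(gear = mtcars$gear, cyl = mtcars$cyl),
                         margin = 1)

p <- ggplot(mtcars, aes(x = factor(gear), fill = factor(cyl))) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "gear", y = "Share within gear", fill = "cyl")
p
```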
Afterthought: Learning melt() or one of its alternatives is a must. However, for most use cases melt() from reshape2 has been superseded by gather() from the tidyr package.
Here's a good resource on how to do this from Sebastian Sauer. The quickest way to solve your problem is Way 4, in which you substitute ..prop.. for (..count..)/sum(..count..):
# Dropping scale_y_continuous, since you do not define percent
ggplot(data=mtcars, aes(cyl))+
geom_bar(aes(fill=as.factor(gear), y = ..prop..),
position="dodge")
Another approach, which I use and is similar to Way 1 in the linked page, is to use dplyr to calculate the percentages and stat = 'identity' to use the y aesthetic in a bar graph:
mtcars %>%
mutate(gear = factor(gear)) %>%
group_by(gear, cyl) %>%
count() %>%
group_by(gear) %>%
mutate(percentage = n/sum(n)) %>%
ggplot(aes(x = cyl, y = percentage, fill = gear)) +
geom_bar(position = 'dodge', stat = 'identity')
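A quick way to confirm that the percentages really add up to 100% within each gear is to sum them back up. This is a small sketch of that check; the names pct and totals are made up here, and count(gear, cyl) is used as a shorthand for the group_by() plus count() steps above.

```r
library(dplyr)

pct <- mtcars %>%
  mutate(gear = factor(gear)) %>%
  count(gear, cyl) %>%
  group_by(gear) %>%
  mutate(percentage = n / sum(n)) %>%
  ungroup()

# One row per gear; total should be 1 for every gear level
totals <- pct %>%
  group_by(gear) %>%
  summarise(total = sum(percentage))
totals
```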
If I understand the question as wanting to make each gear sum to 100% (rather than each cyl summing to 100%), I made a small tweak to Robin's response to make this work.
Basically in the aes() statements, change ..x.. to ..fill..
ggplot(mtcars, aes(x=as.factor(cyl), fill=as.factor(gear)))+
geom_bar(aes(y=..count../tapply(..count.., ..fill.. ,sum)[..fill..]), position="dodge") +
geom_text(aes(y=..count../tapply(..count.., ..fill.. ,sum)[..fill..],
label=scales::percent(..count../tapply(..count.., ..fill.. ,sum)[..fill..])),
stat="count", position=position_dodge(0.9), vjust=-0.5)+
ylab('Percent of Gear Group, %') +
scale_y_continuous(labels = scales::percent)
image of produced plot with percentages by fill variable rather than grouping variable
Hope this helps!
status = sample(c(0, 1), 500, replace = TRUE)
value = rnorm(500)
plot(value)
smoothScatter(value)
I'm trying to make a scatterplot of value, but if I were to just plot it, the data is all clumped together and it's not very presentable. I've tried smoothScatter(), which makes the plot look a bit nicer, but I am wondering if there's a way to color code the values based on the corresponding status?
I am trying to see if there's a relationship between status and value. What's another way to present the data nicely? I've tried boxplot, but I'm wondering how I can make the smoothScatter() plot better or if there are other ways to visualize it.
I'm assuming you meant to write plot(status, value) in your example? Regardless, there's not going to be much difference using this data, but you should get the idea of things to maybe look at with the following examples...
Have you looked into jitter?
Some basics:
plot(jitter(status), value)
or perhaps plot(jitter(status, 0.5), value)
For something fancier, with the ggplot2 package you could do:
library(ggplot2)
df <- data.frame(value, status)
ggplot(data=df, aes(jitter(status, 0.10), value)) +
geom_point(alpha = 0.5)
or this...
ggplot(data=df, aes(factor(status), value)) +
geom_violin()
or...
ggplot(data=df, aes(x=status, y=value)) +
geom_density2d() +
scale_x_continuous(limits=c(-1,2))
or...
ggplot(data=df, aes(x=status, y=value)) +
geom_density2d() +
stat_density2d(geom="tile", aes(fill = ..density..), contour=FALSE) +
scale_x_continuous(limits=c(-1,2))
or even this..
ggplot(data=df, aes(fill=factor(status), value)) +
geom_density(alpha=0.2)
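Since the question also asked about colour coding by status, here is one more hedged sketch: geom_jitter() with status mapped to colour. The set.seed() call and the conversion of status to a factor are additions for this illustration so that the example is repeatable and the legend is discrete.

```r
library(ggplot2)

set.seed(1)  # only so the simulated data are repeatable
status <- sample(c(0, 1), 500, replace = TRUE)
value  <- rnorm(500)
df <- data.frame(value, status = factor(status))

p <- ggplot(df, aes(x = status, y = value, colour = status)) +
  # Jitter horizontally only, so the values themselves are not distorted
  geom_jitter(width = 0.2, height = 0, alpha = 0.5)
p
```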