ggplot title and xlab() not working with variables in R - r

So here is my code:
plotbmi <- function(variable) {
plot <- ggplot(data = data, aes(variable, bmi))+
geom_boxplot()+
labs(title = paste("BMI boxplot split up by", variable))+
ylab('BMI')+
theme_bw()+
theme(plot.title = element_text(size=14, hjust = 0.5, face = 'bold'))
return(plot)
}
plotbmi(data$region)
My issue is that using paste() in the title to concatenate the text with the variable does not work the way I would like. The plot title is not "BMI boxplot split up by regions" but "BMI boxplot split up by northeast" (northeast is one of the possible values of the region variable).
How could I fix this?

In your implementation you're passing the column itself (data$region, a vector of values) as the argument, so paste() receives those values rather than the column name. You can instead use tidy evaluation: {{ }} selects the column by its unquoted name inside aes(), and rlang::enquo() captures that name so it can be pasted into the title:
library(ggplot2)
plotbmi <- function(variable) {
plot <- ggplot(data = mtcars, aes({{variable}}, group = factor(gear)))+
geom_boxplot()+
labs(title = paste("cars boxplot split up by", rlang::get_expr(rlang::enquo(variable))))+
ylab('Gears')+
theme_bw()+
theme(plot.title = element_text(size=14, hjust = 0.5, face = 'bold'))
return(plot)
}
plotbmi(cyl)
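If passing the column name as a string is acceptable, the same idea can be sketched with the .data pronoun, which looks the column up by name while the string itself goes into the title. This is only an illustration: plot_by and its arguments are hypothetical names, and mtcars stands in for the original data.

```r
library(ggplot2)

# Hypothetical helper: `variable` is a column name given as a string
plot_by <- function(df, variable) {
  ggplot(df, aes(x = factor(.data[[variable]]), y = mpg)) +
    geom_boxplot() +
    # The string is used directly, so the title shows the column name
    labs(title = paste("mpg boxplot split up by", variable), x = variable) +
    theme_bw() +
    theme(plot.title = element_text(size = 14, hjust = 0.5, face = "bold"))
}

p <- plot_by(mtcars, "cyl")
```

Printing p draws the plot; the trade-off is that callers write plot_by(mtcars, "cyl") rather than an unquoted column name.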

Related

How to display the variable actual value in ggplot legend

I was trying to do a ggplot scatter plot with var_x, var_y, and var_group as user-defined parameter. The code is as below
var_x='Expression'
var_y='Purity'
var_group = c('Type', 'Stage')
ggplot(data=d, mapping = aes(x=get(var_x), y=get(var_y), color=get(var_group[2]))) +
geom_point(size=2)+
xlab(var_x) +
ylim(0,1) + ylab(var_y)+
theme(legend.position = "top",
legend.text=element_text(size=40),
axis.text = element_text(size=30))+
ggtitle(paste0("Gene expression vs tumor purity in TCGA"))+
stat_poly_line() + stat_poly_eq()+
facet_wrap(.~get(var_group[1]), ncol=wNum, scale='fixed')
It's working, but the legend title is shown as "get(var_group[2])", while I'd like it to display the real value ("Type" in my test case). How can I do that?
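One possible approach, sketched here under assumptions, is to replace get() with the .data pronoun and set the legend title explicitly with labs(). The data frame below is simulated as a stand-in for d, and the stat_poly_line()/stat_poly_eq() layers are left out to keep the sketch self-contained.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for the question's data frame `d`
d <- data.frame(
  Expression = rnorm(60),
  Purity     = runif(60),
  Type       = rep(c("A", "B"), each = 30),
  Stage      = rep(c("I", "II", "III"), times = 20)
)

var_x <- "Expression"
var_y <- "Purity"
var_group <- c("Type", "Stage")

p <- ggplot(d, aes(x = .data[[var_x]], y = .data[[var_y]],
                   color = .data[[var_group[2]]])) +
  geom_point(size = 2) +
  ylim(0, 1) +
  # Setting the labels from the strings avoids "get(...)" in the legend title
  labs(x = var_x, y = var_y, color = var_group[2],
       title = "Gene expression vs tumor purity in TCGA") +
  # facet_wrap() accepts a character vector of column names
  facet_wrap(var_group[1]) +
  theme(legend.position = "top")
```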

How to create a legend across two different types of geoms ggplot2

I have two different datasets which I'd like to plot in the same ggplot2 plot, using different geoms for each. Ideally I would also like a legend which shows that the point geom corresponds to one type of data and the line geom corresponds to the other, but I cannot figure out how to do this. An example of what my data basically looks like is below, minus the legend.
require(ggplot2)
set.seed(1)
d1 = data.frame(y_values = rnorm(21), x_values = 1:21, factor_values = as.factor(sample(1:3, 21, replace=T)))
d2 = data.frame(y_values = seq(-1,1,by = .05), x_values = seq(1,21,by = .5))
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values), color="blue")
Maybe this is what you want: a separate legend for each dataset. You can map linetype inside aes() to create a second legend, so that the points and the line each get their own legend:
#Code
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values,linetype='myline'), color="blue")+
scale_linetype_manual('My line',values='solid')
Or you can also try this:
#Code 2
ggplot() +
geom_point(data=d1, aes(x=x_values, y=y_values, color=factor_values)) +
geom_line(data=d2, aes(x = x_values, y=y_values,linetype='myline'), color="blue")+
scale_linetype_manual('',values='solid')+
theme(
legend.spacing = unit(-17,'pt'),
legend.margin = margin(t=0,b=0,unit='pt'),
legend.background = element_blank()
)+guides(linetype=guide_legend(title="New Legend Title"),
color=guide_legend(title=""))

Tidying up the ggplot pie chart

After looking at various posts and asking questions here, I have been able to make a multi-faceted pie chart, but I am having trouble tidying it up. Here are the things I am having trouble with:
How do I remove the facet labels from each row and have only one facet label on the top or bottom and left or right? How do I control how the facet label looks?
I have tried using facet_grid instead of facet_wrap, which removes the label from each row, but the labels are still inside a box. I would like to remove the box, which I do not seem to be able to do.
Centering the labels so that the value for each fraction of the pie is inside its pie slice.
Some of my pie charts have 8 to 10 values, and the labels are not always inside their fractions. First I used geom_text_repel, but that only repelled the text; it didn't place the text inside each fraction. I also looked at this thread. I tried that approach by creating a new data frame with position values and using pos inside geom_text, like so: d<-c %>% group_by(Parameter)%>% mutate(pos= ave(Values, Zones, FUN = function(x) cumsum(x) - 0.5 * x)), then using the same code to make the pie chart for the d data frame, but it didn't quite work.
Grouping the values under a certain level into a single "other" group so that there are fewer slices.
It would be ideal to group the values below 1% into a single group called "others" so that there are fewer slices. So far I have had to drop those values entirely with c<-c[c$Values>1,] and use the newly created data frame.
Any suggestions or help regarding these issues would be appreciated.
Following is the reproducible example of my current pie chart:
library(RColorBrewer)
library(ggrepel)
library(ggplot2)
library(tidyverse)
my_pal <- colorRampPalette(brewer.pal(9, "Set1"))
#### create new matrix ############
new_mat<-matrix(, nrow=40, ncol = 4)
colnames(new_mat)<-c("Zones", "ssoilcmb", "Erosion_t", "area..sq.m.")
for ( i in 1:nrow(new_mat)){
new_mat[i,4]<-as.numeric(sample(0:20, 1))
new_mat[i,3]<-as.numeric(sample(0:20, 1))
a<-sample(c("S2","S3","S4","S5","S1"),1)
b<-sample(c("Deep","Moderate","Shallow"),1)
new_mat[i,1]<-sample(c("High Precip","Moderate Precip","Low Precip"),1)
new_mat[i,2]<-paste0(a,"_",b)
}
m_dt<-as.data.frame(new_mat)
m_dt$Erosion_t<-as.numeric(m_dt$Erosion_t)
m_dt$area..sq.m.<-as.numeric(m_dt$area..sq.m.)
#### calculate parea
m_dt<- m_dt %>%
group_by(Zones)%>%
mutate(per_er=signif((`Erosion_t`/sum(`Erosion_t`))*100,3), per_area=signif((`area..sq.m.`/sum(`area..sq.m.`))*100,3))
## Rearranging data:
a<-data.frame(m_dt$Zones,m_dt$ssoilcmb, m_dt$per_er)
b<-data.frame(m_dt$Zones,m_dt$ssoilcmb, m_dt$per_area)
c<-data.frame(Zones=m_dt$Zones,ssoilcmb=m_dt$ssoilcmb,
Parameter=c(rep("Erosion",40),rep("Area",40)),
Values=c(m_dt$per_er,m_dt$per_area))
### New Plot ###
ggplot(c, aes(x="", y=Values, fill=ssoilcmb)) +
geom_bar(stat="identity", width=1, position = position_fill())+
coord_polar("y", start=0) +
facet_wrap(Zones~Parameter, nrow = 3) +
geom_text_repel(aes(label = paste0(Values, "%")), position = position_fill(vjust = 0.5))+
scale_fill_manual(values=my_pal(15)) +
labs(x = NULL, y = NULL, fill = NULL, title = "Erosions")+
theme_classic() + theme(axis.line = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank(),
plot.title = element_text(hjust = 0.5, color = "#666666"))
If you're open to alternatives, maybe a facet_wrapped barplot will suit your needs, e.g.
library(RColorBrewer)
library(ggrepel)
library(tidyverse)
my_pal <- colorRampPalette(brewer.pal(9, "Set1"))
#### create new matrix ############
new_mat<-matrix(nrow=40, ncol = 4)
colnames(new_mat)<-c("Zones", "ssoilcmb", "Erosion_t", "area..sq.m.")
for ( i in 1:nrow(new_mat)){
new_mat[i,4]<-as.numeric(sample(0:20, 1))
new_mat[i,3]<-as.numeric(sample(0:20, 1))
a<-sample(c("S2","S3","S4","S5","S1"),1)
b<-sample(c("Deep","Moderate","Shallow"),1)
new_mat[i,1]<-sample(c("High Precip","Moderate Precip","Low Precip"),1)
new_mat[i,2]<-paste0(a,"_",b)
}
m_dt<-as.data.frame(new_mat)
m_dt$Erosion_t<-as.numeric(m_dt$Erosion_t)
m_dt$area..sq.m.<-as.numeric(m_dt$area..sq.m.)
#### calculate parea
m_dt<- m_dt %>%
group_by(Zones)%>%
mutate(per_er=signif((`Erosion_t`/sum(`Erosion_t`))*100,3),
per_area=signif((`area..sq.m.`/sum(`area..sq.m.`))*100,3))
## Rearranging data:
a<-data.frame(m_dt$Zones,m_dt$ssoilcmb, m_dt$per_er)
b<-data.frame(m_dt$Zones,m_dt$ssoilcmb, m_dt$per_area)
c<-data.frame(Zones=m_dt$Zones,ssoilcmb=m_dt$ssoilcmb,
Parameter=c(rep("Erosion",40),rep("Area",40)),
Values=c(m_dt$per_er,m_dt$per_area))
### New Plot ###
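# Set the zone order explicitly; since R 4.0 as.data.frame() keeps strings as
# character, so levels(c$Zones) would be NULL and the factor would be all NA
c$Zones <- factor(c$Zones, levels = c("Low Precip", "Moderate Precip", "High Precip"))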
ggplot(c, aes(x=ssoilcmb, y=Values, fill=ssoilcmb)) +
geom_col()+
facet_wrap(Zones~Parameter, nrow = 3) +
scale_fill_manual(values=my_pal(15)) +
labs(x = NULL, fill = NULL, title = "Erosions")+
theme_minimal() + theme(axis.line = element_blank(),
axis.ticks = element_blank(),
axis.text.x = element_text(angle = 90,
hjust = 1,
vjust = 0.5),
plot.title = element_text(hjust = 0.5,
color = "#666666"))
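The question also asks about lumping small slices into a single "Other" group. A minimal sketch of one way to do that with dplyr, assuming data shaped like the c data frame above (the toy values here are made up for illustration), is to relabel rows below the threshold and then re-sum within each facet:

```r
library(dplyr)

# Hypothetical toy data in the same shape as the question's `c` data frame
toy <- data.frame(
  Zones     = "High Precip",
  Parameter = "Erosion",
  ssoilcmb  = c("S1_Deep", "S2_Shallow", "S3_Moderate", "S4_Deep"),
  Values    = c(60, 39.2, 0.5, 0.3)
)

lumped <- toy %>%
  # Relabel every slice below 1% as "Other"
  mutate(ssoilcmb = if_else(Values < 1, "Other", ssoilcmb)) %>%
  # Add the relabelled slices together within each facet
  group_by(Zones, Parameter, ssoilcmb) %>%
  summarise(Values = sum(Values), .groups = "drop")
```

The lumped data frame can then be passed to the same ggplot() call in place of the original data.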

ggplot2 and R - Applying custom colors to a multi group histogram in long format

I've made a histogram graph that shows the distribution of lidar returns per elevation for three lidar scans I have done.
I've converted my data to long format, with:
one column called 'value', describing the z position of each point
one column called 'variable', containing the name of each scan group
In the attached image you can see the histograms of my three scan groups. I am currently using viridis to color the histogram by scan group (ie. the name of the scan in the variable column). However, I want to match the colours in the graph with colours I already have.
How might I do this?
The hex colours I'd like to use for each of my three histograms are:
lightgreen = "#62FE96"
lightred = "#FE206B"
darkpurple = "#62278E"
A link to my data - 'density2'
My current code:
library(tidyverse)
library(viridisLite)
library(viridis)
# histogram
p <- density2 %>%
ggplot( aes(x=value,color = variable, show.legend = FALSE)) +
geom_histogram(binwidth = 1, alpha = 0.5, position="identity") +
scale_color_viridis(discrete =TRUE) +
scale_fill_viridis(discrete=TRUE) +
theme_bw() +
labs(fill="") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
p + scale_y_sqrt() + theme(legend.position="none") + labs(y = "data pts", x = "elevation (m)")
Any help would be most appreciated!
Delete the scale_color_viridis and scale_fill_viridis lines - these are applying the Viridis color scale. Replace with scale_fill_manual(values = c(lightgreen, lightred, darkpurple)). And in your aesthetic mapping replace color = variable with fill = variable. For a histogram, color refers to the color of the lines outlining each bar, and fill refers to the color each bar is filled in.
This should leave you with:
p <- density2 %>%
ggplot(aes(x = value, fill = variable)) +
geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
scale_fill_manual(values = c(lightgreen, lightred, darkpurple)) +
theme_bw() +
labs(fill = "") +
theme(panel.grid = element_blank())
p + scale_y_sqrt() +
theme(legend.position = "none") +
labs(y = "data pts", x = "elevation (m)")
I've also done some other clean-up. show.legend = FALSE does not belong inside aes() - and your theme(legend.position = "none") should take care of it.
I did not download your data, save it in my working directory, import it into R, and test this code on it. If you need more help, please post a small subset of your data in a copy/pasteable format (e.g., dput(density2[1:20, ]) for the first 20 rows---choose a suitable subset) and I'll be happy to test and adjust.
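Since the real data was not available, here is a small self-contained sketch on simulated data. It also uses a named vector in scale_fill_manual(), which ties each colour to a specific group rather than relying on the order of the factor levels. The scan names (scan_a, scan_b, scan_c) and the simulated values are assumptions made up for illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for `density2`; the scan names are hypothetical
density2_sim <- data.frame(
  value    = c(rnorm(200, 10, 2), rnorm(200, 14, 3), rnorm(200, 18, 2)),
  variable = rep(c("scan_a", "scan_b", "scan_c"), each = 200)
)

# Names must match the values in the `variable` column
scan_cols <- c(scan_a = "#62FE96", scan_b = "#FE206B", scan_c = "#62278E")

p <- ggplot(density2_sim, aes(x = value, fill = variable)) +
  geom_histogram(binwidth = 1, alpha = 0.5, position = "identity") +
  scale_fill_manual(values = scan_cols) +
  theme_bw() +
  theme(panel.grid = element_blank(), legend.position = "none") +
  labs(y = "data pts", x = "elevation (m)")
```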

R ggplot2 legend with linetypes

I am relatively new to R and I have some difficulties with ggplot2. I have a data frame consisting of three variables (alpha, beta, gamma) and I want to plot them together. I get the plot but I have two problems:
legend is outside the plot and I want it to be inside
linetypes are changed to "solid", "dashed" and "dotted"!
Any ideas/suggestions would be more than welcome!
p <- ggplot() +
geom_line(data=my.data,aes(x = time, y = alpha,linetype = 'dashed')) +
geom_line(data=my.data,aes(x = time, y = beta, linetype = 'dotdash')) +
geom_line(data=my.data,aes(x = time, y = gamma,linetype = 'twodash')) +
scale_linetype_discrete(name = "", labels = c("alpha", "beta", "gamma"))+
theme_bw()+
xlab('time (years)')+
ylab('Mean optimal paths')
print(p)
What you are after is easier to achieve if you first rearrange your data to long format, with one observation per row.
You can do this with tidyr's gather function. Then you can simply map the linetype to the variable in your data.
In your original approach, you tried to assign a literal 'linetype' by using aes(), but ggplot interprets this as you saying, 'assign a line type here as if the variable that is mapped to linetype had the value dashed/dotdash/twodash'. When drawing the plot, it looks up the linetypes in the default scale_linetype_discrete, the first three values of which happen to be solid, dotted and dashed, which is why you're seeing the confusing replacement. You can specify linetypes by using scale_linetype_manual.
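The position of the legend is adjustable in theme().
legend.position takes coordinates relative to the plot area: c(0, 1) would place the legend at the top-left corner, and the code below uses c(0.1, 0.9) to inset it slightly. (From ggplot2 3.5.0 on, the preferred spelling is legend.position = "inside" together with legend.position.inside.)
legend.justification = c(0, 1) sets the anchor point used by legend.position to the top-left corner of the legend box.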
library(tidyr)
library(ggplot2)
# Create some example data
my.data <- data.frame(
time=1:100,
alpha = rnorm(100),
beta = rnorm(100),
gamma = rnorm(100)
)
my.data <- my.data %>%
gather(key="variable", value="value", alpha, beta, gamma)
p <- ggplot(data=my.data, aes(x=time, y=value, linetype=variable)) +
geom_line() +
scale_linetype_manual(
values=c("solid", "dotdash", "twodash"),
name = "",
labels = c("alpha", "beta", "gamma")) +
xlab('time (years)')+
ylab('Mean optimal paths') +
theme_bw() +
theme(legend.position=c(0.1, 0.9), legend.justification=c(0,1))
print(p)
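In current tidyr, gather() is superseded by pivot_longer(). A minimal sketch of the same reshaping step, assuming a small made-up data frame in the same wide layout as my.data:

```r
library(tidyr)

set.seed(42)
# Small made-up example in the same wide layout as my.data
wide <- data.frame(
  time  = 1:5,
  alpha = rnorm(5),
  beta  = rnorm(5),
  gamma = rnorm(5)
)

# One row per (time, variable) pair, as in the gather() call above
long <- pivot_longer(
  wide,
  cols      = c(alpha, beta, gamma),
  names_to  = "variable",
  values_to = "value"
)
```

The resulting long data frame has the columns time, variable and value, so it can be used with the same ggplot() call.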
