In a project I present bar plots and use interaction() to order the groups, as one is a strict subgroup of the other. I would like to avoid printing the whole name of the first group, as this takes up a lot of space. Is there a way to restrict the label to the first character or something like that?
library(ggplot2)
mtcars$name <- rownames(mtcars)
ggplot(data = mtcars, aes(x = interaction(cyl, name))) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
Here, for example, only the number of cylinders is interesting to me; I just use the car name to order the bars. But the names take up a lot of space. Having just the first letter of the car name would be ideal, so I would like to get 8.A, for example. In my original data the first variable has varying length (not just one character, as the number of cylinders has here).
Thanks for any answer,
Regards
You can edit the labels using regular expressions in scale_x_discrete:
library(ggplot2)
ggplot(data = mtcars, aes(x = interaction(cyl, name))) +
  geom_bar() +
  xlab('Interaction cyl vs Name') +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5)) +
  scale_x_discrete(labels = function(x) sub('(\\..).*', '\\1', x))
Everything inside () is a capture group, which marks the part of the match we want to keep. Here the group captures a literal dot (\\.; . is a special character in regex, so it has to be escaped with \\) followed by one more character (.). The trailing .* matches the rest of the label, and the whole match is replaced by the captured group (\\1). The text before the first dot is not part of the match, so it is left untouched.
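To see what the labelling function does outside of a plot, here is a small sketch that applies the same sub() call to a few hand-written labels. The label strings and the names labels, short and dotted are made up for illustration; they only mimic what interaction() produces.

```r
# Hypothetical labels in the "<first variable>.<car name>" form
labels <- c("8.AMC Javelin", "4.Datsun 710", "12.Valiant")

# Same pattern as in scale_x_discrete() above: keep the first dot
# plus one character, drop everything after it
short <- sub("(\\..).*", "\\1", labels)
print(short)

# Caveat: sub() acts on the FIRST dot, so a first variable that
# itself contains a dot gets truncated there instead
dotted <- sub("(\\..).*", "\\1", "2.5.Valiant")
print(dotted)
```

The first variable can have any length, as long as it contains no dot of its own; if it does, the pattern would need to be adapted to your data.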
I would like to exclude one column from my chart. The CSV I use as my data has a lot of empty cells, so there is a nameless column in my chart that is also the tallest of them all. In my opinion it looks a bit silly, so I would like to get rid of it.
Here is my chart code:
ggplot(df, aes(Coverage, fill = Coverage)) +
  geom_bar(color = "black", fill = "brown3") +
  theme(text = element_text(size = 15), axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(title = "Chart showing in which month of successive years the largest number of journalists died", x = "Country", y = "Year")
And here is what the chart looks like. The first column is the one counting the empty cells.
Thank you very much for all the help!
As an alternative to @Roman Luštrik's answer, you can use dplyr to filter your dataset and do the plot in the same pipeline:
library(dplyr)
library(ggplot2)
# The filtered data is piped in as the first argument of ggplot(),
# so df must not be passed again inside the call
df %>% filter(Coverage != "") %>%
  ggplot(aes(Coverage, fill = Coverage)) +
  geom_bar(color = "black", fill = "brown3") +
  theme(text = element_text(size = 15), axis.text.x = element_text(angle = 90, hjust = 0.5)) +
  labs(title = "Chart showing in which month of successive years the largest number of journalists died", x = "Country", y = "Year")
If this is not working for you, please consider providing a reproducible example of your dataset (see: How to make a great R reproducible example).
You will need to remove those entries from your df. You could do something along the lines of df[!(df$Coverage %in% c("levels", "to", "exclude", "here")), ]. If that doesn't work, you may, in addition, need to use droplevels().
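As a rough sketch of that approach, the snippet below filters out empty strings and then drops the unused factor level. The data frame and its values are invented for illustration, and it assumes the empty cells were read in as "" rather than NA (which depends on how the CSV was imported).

```r
# Hypothetical stand-in for the real data: two empty cells
df <- data.frame(Coverage = factor(c("", "Iraq", "Syria", "", "Iraq")))

# Remove the rows whose Coverage is one of the unwanted values
kept <- df[!(df$Coverage %in% c("")), , drop = FALSE]

# The rows are gone, but the empty level is still attached to the factor
levels_before <- levels(kept$Coverage)

# droplevels() removes levels that no longer occur in the data
kept <- droplevels(kept)
levels_after <- levels(kept$Coverage)

print(levels_before)
print(levels_after)
```

If the empty cells turn out to be NA instead, a condition such as !is.na(df$Coverage) would be the analogous filter.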
When you rotate the text, you will also need to adjust its alignment. You can do this in theme() using hjust or vjust (I always forget which one), something along the lines of element_text(angle = 90, hjust = 0.5).
I have a plot which is generated thus:
ggplot(dt.2, aes(x=AgeGroup, y=Prevalence)) +
geom_errorbar(aes(ymin=lower, ymax=upper), colour="black", width=.2) +
geom_point(size=2, colour="Red")
I control the x axis labels like this:
scale_x_discrete(labels=c("0-29","30-49","50-64","65-79",">80","All")) +
This works but I need to change the ">80" label to "≥80".
However "≥80" is displayed as "=80".
How can I display the greater-than-or-equal sign?
An alternative to using expressions is Unicode characters, in this case Unicode Character 'GREATER-THAN OR EQUAL TO' (U+2265). Copying @mnel's example:
.d <- data.frame(a = letters[1:6], y = 1:6)
ggplot(.d, aes(x=a,y=y)) + geom_point() +
scale_x_discrete(labels = c(letters[1:5], "\u2265 80"))
Unicode is a good alternative if you have trouble remembering the complicated expression syntax or if you need linebreaks, which expressions don't allow. As a downside, whether specific Unicode characters work at all depends on your graphics device and font of choice.
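For the labels from the question, the Unicode route might look like the sketch below. It only builds and inspects the label vector; the variable names are illustrative, and whether the glyph actually renders still depends on the device and font, as noted above.

```r
# The question's labels, with the escape \u2265 standing in for the sign
age_labels <- c("0-29", "30-49", "50-64", "65-79", "\u2265 80", "All")

# Confirm that the first character of the fifth label is U+2265
first_code_point <- utf8ToInt(age_labels[5])[1]
print(first_code_point == 0x2265L)

# The vector can then be passed on unchanged, e.g.
#   scale_x_discrete(labels = age_labels)
# Assumption, not from the thread: if the sign is lost on export, a
# Cairo-based device such as cairo_pdf() is a commonly suggested thing to try.
```

Writing the character as an escape rather than pasting the glyph keeps the script itself plain ASCII, which avoids problems with the source file's encoding.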
You can pass an expression (including phantom(...) to fake a leading >=) within
the labels argument to scale_x_discrete(...),
for example:
library(ggplot2)
.d <- data.frame(a = letters[1:6], y = 1:6)
ggplot(.d, aes(x = a, y = y)) + geom_point() +
  # phantom(x) reserves invisible space so that >= has a left-hand operand
  scale_x_discrete(labels = c(letters[1:5], expression(phantom(x) >= 80)))
See ?plotmath for more details on creating mathematical expressions and
this related SO question and answer
plot(5, ylab=expression("T ">="5"))
You can use
expression("">=80)
So your full set of axis labels would look like:
scale_x_discrete(labels=c("0-29","30-49","50-64","65-79",expression("">=80),"All")) +
I have had trouble exporting plots when using Unicode; the expression function is more consistent.
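It may help to see why plain strings and an expression can share one c() call. The short sketch below (variable name illustrative) checks that the combined label vector is promoted to an expression vector, so the string entries are carried along as constants and the last-but-one entry keeps its plotmath meaning.

```r
# Mixing character strings with an expression inside c()
mixed_labels <- c("0-29", "30-49", "50-64", "65-79", expression("" >= 80), "All")

# c() promotes the whole result to an expression vector of length 6
print(is.expression(mixed_labels))
print(length(mixed_labels))

# The fifth element is the unevaluated call "" >= 80, drawn by plotmath
print(mixed_labels[[5]])
```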
Let's say that I have a long data set and I would like to colour a specific label on the x-axis. In the case of the example below I would like to colour the label for Valiant.
# Packs
require(ggplot2)
require(reshape2)
# Data and trans
data(mtcars)
mtcars$model <- rownames(mtcars)
mtcars <- melt(mtcars, id.vars = "model")
# Some chart
ggplot(data = subset(x = mtcars, subset = mtcars$variable == "cyl"),
aes(x = model, y = value)) +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90,
colour =
ifelse(mtcars$model == "Valiant",
"red","black")))
The code produces the chart below that is erroneous as the wrong label is coloured.
The reason is fairly simple: the vector created by ifelse does not match the order of the labels on the axis. I can fix the code by forcing ggplot to colour a specific row. The code below colours the right label because, in the particular data.frame used for the chart, the row with the Valiant value is number 31.
# Fixed chart
ggplot(data = subset(x = mtcars, subset = mtcars$variable == "cyl"),
aes(x = model, y = value)) +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90,
colour =
ifelse(as.numeric(rownames(mtcars)) == 31,
"red","black")))
Clearly this solution is extremely impractical. In the actual data I have a vast number of observations with multiple columns (geo, gender, indicator, value, etc.). That data is subsequently filtered via subset, and different options are passed to the aes settings. Trying to figure out the row that should be coloured is a nightmare. I'm looking for a solution that would enable me to:
Relatively effortlessly indicate a specific observation to be coloured, without using row numbers
Ideally, use the id with some string as a way of indicating the text I want to highlight
Keep the solution encapsulated in the ggplot2 code; I don't want to create separate data subsets only to derive a colouring vector, as I will be doing this a number of times and it would unnecessarily multiply objects
In practice, I want a solution that works like this: irrespective of what is on the chart, when you find this string on the x-axis, make it red
The reason the first one mismatches is that mtcars$model is much longer than the subset you are plotting, so the colour vector ifelse(mtcars$model == "Valiant","red","black") is of length 352 but the subset you are plotting is only of length 32. The same problem exists with your second example, though in this case the extra elements of colour (which are all "black" anyway) are dropped so you don't notice.
Unfortunately it looks like theme(...) doesn't get evaluated with the data's column names available to it (i.e. you can't just do colour=ifelse(model == "Valiant", "red", "black") directly in the theme(...) call).
One alternative is to make model a factor and filter on levels(..) == "Valiant". If you have a long dataframe your id variable is most likely a factor anyway (or it would make sense for it to be one).
mtcars$model = factor(mtcars$model)
ggplot(data=subset(mtcars, variable == 'cyl'), aes(x=model, y=value)) +
geom_bar(stat="identity") +
theme(axis.text.x=element_text(angle=90,
colour=ifelse(levels(mtcars$model) == 'Valiant', 'red', 'black')))
(Your problem stems from feeding subset() into ggplot as your data, and then not being able to refer back to that particular subset in the theme call. I don't know if there is a tricky way to do this.)
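The length mismatch and the levels-based fix can be checked without drawing anything. The sketch below rebuilds a long-format mtcars in base R (instead of reshape2::melt, purely to keep it self-contained) and compares the two colour vectors; wide, long, row_colours and level_colours are illustrative names.

```r
# Long-format copy of mtcars: one row per (model, variable) pair
wide <- datasets::mtcars
long <- data.frame(
  model    = rep(rownames(wide), times = ncol(wide)),
  variable = rep(colnames(wide), each = nrow(wide)),
  value    = unlist(wide, use.names = FALSE),
  stringsAsFactors = FALSE
)
long$model <- factor(long$model)

# One colour per ROW of the long data: far too long for the axis
row_colours <- ifelse(long$model == "Valiant", "red", "black")

# One colour per factor LEVEL: same length and order as the axis labels
level_colours <- ifelse(levels(long$model) == "Valiant", "red", "black")

print(length(row_colours))
print(length(level_colours))
print(levels(long$model)[level_colours == "red"])
```

Because a discrete axis is laid out in level order, level_colours lines up with the labels by construction. As far as I know, recent ggplot2 versions warn that vectorised input to element_text() is not officially supported, so treat this as a workaround rather than a guaranteed interface.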
I would like to add an annotation, E \perp c, using ggplot2's annotate("text", label = ...).
I searched quite thoroughly on the web but only managed to get a lone symbol using annotate("text", label = "symbol('\136')", parse = T).
Does anyone have a solution?
Plotting code from help page:
p <- ggplot(df, aes(x = gp, y = y)) +
geom_point() +
geom_point(data = ds, aes(y = mean),
colour = 'red', size = 3)
p+geom_text( aes(x="b", y=-0.4, label = "E(y)*symbol('\\136')*b" ),
parse = TRUE)
After getting this to work I was also able to get annotate("text", ...) working:
p+annotate("text", 1, -0.4, label="E(y)*symbol('\\136')*b", parse=TRUE)
The tricks: mix your quoting characters, which you did, but also use plotmath syntax, which I'm guessing you might not have used.
Edit: * is not a quoting character. If anything, it should be called a linking character. In plotmath syntax every "atom" or function call needs to be separated from (or "linked-to" depending on how you view it) the adjoining atoms/functions. You can do this with * (the no-space separator/linker), ~ (the spacing separator/linker), or any of the dyadic operators in the plotmath vocabulary, examples including + , -, ==, !=.
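Since parse = TRUE hands the label to R's parser, the role of the separators can be checked directly with parse(). The sketch below is only an illustration of that point; the variable names are made up, and the last label deliberately omits the separators to show the failure.

```r
# Atoms joined with * (no space) parse into a single expression
joined <- parse(text = "E(y)*symbol('\\136')*b")

# Atoms joined with ~ (adds a space when drawn) parse as well
spaced <- parse(text = "E(y)~symbol('\\136')~b")

# Without any separator the text is not valid R syntax, so parsing
# fails; this is the kind of error a label with parse = TRUE runs into
unlinked <- tryCatch(
  parse(text = "E(y) symbol('\\136') b"),
  error = function(e) e
)

print(is.expression(joined))
print(is.expression(spaced))
print(inherits(unlinked, "error"))
```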