How to display variables' special characters in ggplot? - r

How does one properly display special characters ("(", "ë", periods as commas, etc.) used in column names within a ggplot graphic?
My csv's column line looks like this:
r, á/b, ő/é, w/s (0.3), w/s (0.2), bins
And I'd like, for instance, the 4th variable to be displayed (in the ggplot legend), as "w/s (0.3)".
Here's my code:
require(reshape2)
library(ggplot2)
library(RColorBrewer)
fileName = paste("/2.csv", sep = "") # test file available here: https://www.dropbox.com/s/f2egxbuwwbba2q9/2.csv?dl=0
mydata = read.csv(fileName,sep=",", header=TRUE)
dataM = melt(mydata,c("bins"))
ggplot(data=dataM, aes(x= bins, y=value, colour=variable, size = variable)) +
geom_line(alpha = .9) +
scale_colour_manual(breaks=c("r","á/b","ő/é","w/s (0.3)","w/s (0.2)"), values=c("green","orange","blue","pink","yellow")) +
#scale_colour_brewer(type = "qual", palette = 7) +
scale_size_manual(breaks=c("r","á/b","ő/é","w/s (0.3)","w/s (0.2)"), values=c(1,0.5,0.5,0.5,0.5)) +
theme_bw() +
theme(plot.background = element_blank(), panel.grid.minor = element_blank(), axis.line = element_blank(),
legend.key = element_blank(), legend.title = element_blank()) +
scale_y_continuous("D", expand=c(0,0)) +
scale_x_continuous("E", expand=c(0,0)) +
theme(legend.position="bottom")
Which produces this:
We can see that the legend displays the special characters incorrectly. Is there a quick way (or a not-so-quick way) to fix this?
(I have other questions about this graphic, but I believe it is preferable to ask a new, complete question, which I'll do right now.)

I think all you need to do is include check.names=FALSE in your read.csv() call; the special characters in your header are getting converted when the data is read in (see ?make.names for more information).
I was initially a little confused by your question because I assumed the problem was with accented characters such as ë, whereas in fact letters are not getting messed up -- it's only non-alphanumeric characters that are replaced by dots (also, strings starting with a numeric value would have "X" prepended).
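As a small illustration of the above, here is a sketch that uses inline CSV text instead of your file. The header is a shortened, ASCII-only version of yours and the data row is made up; the accented columns are left out only so that the result does not depend on the locale.

```r
# Shortened header based on the question; the single data row is invented.
csv_text <- "r,a/b,w/s (0.3),w/s (0.2),bins\n1,2,3,4,5"

# Default behaviour: names are passed through make.names()
mangled <- read.csv(text = csv_text, header = TRUE)

# check.names = FALSE keeps the header exactly as written
intact <- read.csv(text = csv_text, header = TRUE, check.names = FALSE)

names(mangled)   # non-alphanumeric characters replaced by dots
names(intact)    # original names preserved

# The same conversion, called directly
make.names(c("w/s (0.3)", "2nd"))
```

With check.names = FALSE the column names survive melt() unchanged, so the breaks given to scale_colour_manual() and scale_size_manual() should then match the levels of variable.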

Related

How can I make the labels more readable in this lollipop plot?

I am trying to make a lollipop plot that shows a text 'condition' and its associated value. The issue I am having is that, because there is so much data, the labels overlap. Is there an easy fix for this?
This is my code (and my issue):
library(ggplot2)
df <- read.table(file = '24 hpi MP BP.tsv', sep = '\t', header = TRUE)
group <- df$Name
value <- df$Bgd.count
data <- data.frame(
x=group,
y=value
)
ggplot(data, aes(x=x, y=y)) +
geom_segment( aes(x=x, xend=x, y=0, yend=y), color="skyblue") +
geom_point( color="blue", size=4, alpha=0.6) +
theme_light() +
coord_flip() +
theme(
panel.grid.major.y = element_blank(),
panel.border = element_blank(),
axis.ticks.y = element_blank()
)
I am hoping to get a clear separation of the labels.
Your question does not provide a reproducible example, so here is a more general answer.
The problem is that you want to plot hundreds of discrete values. That is bound to yield a crowded graphic.
Your options:
Reduce the number of labels (don't label every tick on the axis) and show only a few.
Focus on only a few important data points. I think this would be my preferred approach, as it also does your "story" more justice.
Group your values and show aggregate values such as means with error bars.
Make your graph appropriately large (change the height of the graphics device).
Use facets (but this will not help with the crowding in all cases).
Shorten your labels.
Make the font smaller.
Last, but definitely not least, change your visualisation strategy.
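A minimal sketch of the "focus on a few data points" option combined with a taller output, assuming made-up data in place of your TSV file. The column names x and y follow your code; the 200 conditions, the cut-off of 20 and the file name in the ggsave() comment are arbitrary choices for illustration.

```r
library(ggplot2)

# Invented stand-in for the question's data: 200 hypothetical conditions
set.seed(1)
data <- data.frame(
  x = sprintf("condition_%03d", 1:200),
  y = round(runif(200, 0, 100))
)

# Keep only the 20 largest values and order the labels by value
top <- head(data[order(-data$y), ], 20)
top$x <- reorder(top$x, top$y)

p <- ggplot(top, aes(x = x, y = y)) +
  geom_segment(aes(xend = x, y = 0, yend = y), color = "skyblue") +
  geom_point(color = "blue", size = 4, alpha = 0.6) +
  coord_flip() +
  theme_light()
p

# A taller device gives each label more room, e.g.:
# ggsave("lollipop.png", p, width = 6, height = 8)
```

Which points count as "important" depends on your data; sorting by value is only one possible criterion.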

Adjusting the format of numbers in the legend in ggplot2 or ggplotly

Please help,
I am trying to adjust the number that is shown when you hover over the geom_sf map that is converted to a plotly object via ggplotly. For now, the number is shown without a comma or dot as the thousands separator: e.g. the numbers are now shown like this: 151922784, and I would like them to appear on the graph like this: 151,922,784 or 151.922.784. The picture of the current situation is here:
The number for variable "Ukupna_vrije..." is a number without dots or commas.
I tried to use formattable::comma with this code:
K1<-ggplot(data = spojeno) +
geom_sf(aes(fill=formattable::comma(Ukupna_vrijednost_projekata), label=Županija))+
theme(panel.background = element_rect(fill = "white"), axis.line=element_blank(),
axis.text.x=element_blank(), axis.text.y=element_blank(), axis.ticks=element_blank(),
axis.title.x=element_blank(), axis.title.y=element_blank())+
scale_fill_viridis_c(option = "plasma")
ggplotly(K1)
Then I do get the numbers with commas, but end up with "formattable::comma" literally written on the map and legend; the picture and the code are below:
If I use scales::comma (instead of formattable) then I get the error message: Error: Discrete value supplied to continuous scale.
I suppose that I am putting these arguments in the wrong place or in the wrong format.
Also, I would like to add commas or dots to the numbers in the legend, so that 3000000000 and 2000000000 are shown as 3.000.000.000 and 2.000.000.000. Thanks.
As I already mentioned in my comment you could format the numbers in the legend via the labels argument of scale_fill_xxx. For the tooltip you could make use of the text aesthetic to style the tooltip. To display the content of text in the tooltip you have to call ggplotly with argument tooltip="text".
Making use of the default example from ggplot2::geom_sf:
library(plotly)
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$AREA <- nc$AREA * 1e6
ggplot(nc) +
geom_sf(aes(fill = AREA, text = paste0("Name: ", NAME, "<br>", "Area: ", scales::comma(AREA)))) +
scale_fill_viridis_c(option = "plasma", labels = scales::label_comma())
ggplotly(tooltip = "text")
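A possible explanation for the error in the question, sketched with scales only: comma() returns character strings, so mapping its result to fill makes the fill discrete, which would clash with the continuous scale_fill_viridis_c(). Keeping the formatting in labels and in the tooltip text avoids that. The big.mark and decimal.mark values below are simply the dot-separated style asked for.

```r
library(scales)

x <- 151922784

comma(x)          # "151,922,784"
class(comma(x))   # a character vector, no longer numeric

# Dots as thousands separators, as requested in the question
dotted <- comma(x, big.mark = ".", decimal.mark = ",")

# A labelling function that could be passed to the legend, e.g.
# scale_fill_viridis_c(option = "plasma", labels = dot_labels)
dot_labels <- label_comma(big.mark = ".", decimal.mark = ",")
dot_labels(c(2e9, 3e9))
```

The same comma() call with big.mark = "." can be used inside the text aesthetic to format the tooltip.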

Concat math symbols and strings in ggplot2 labels - R, LaTex Solution?

I am trying to label the y-axis of my graph with the Greek symbol theta and P(z), with a comma separating them. Additionally, I am trying to label my x-axis Q(z_i), where i is a subscript. I have tried to do this a few different ways.
string <- ", P(z)"
thet <- bquote(theta)
ylab.fig2 <- paste(thet, string, sep = "")
I have done something similar with expression(theta). I use ylab.fig2 as an input in my ggplot call, ylab(ylab.fig2).
new <- ggplot(data = data.frame(x=0), aes(x=x)) +
stat_function(fun=Pz.eq, aes(colour="P(z)")) +
stat_function(fun=bid1, aes(colour="Bid Curve: House 1")) +
stat_function(fun=bid2, aes(colour="Bid Curve: House 2")) +
stat_function(fun=bid3, aes(colour="Bid Curve: House 3")) +
xlim(0,20) + ylim(0,6) +
xlab("Q(z_i)") + ylab(ylab.fig2) +
ggtitle("Figure 2: Property Choice Per Household") +
theme(panel.grid = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks.x = element_blank(),
axis.ticks.y = element_blank(),
legend.title = element_blank(),
plot.title = element_text(hjust=0.5)) +
scale_colour_manual("Groups",
values = c("darkseagreen", "darkkhaki", "darkslategray3", "firebrick"))
bquote() and expression() both work fine if they are the sole inputs, but when I use paste to add the rest of the axis label, the Greek symbol is not output. I believe this is due to the differing class() of each object. Alternatively, if there is a way to compile LaTeX in the labels, that would solve both my x- and y-axis issues.
This is what my graph looks like thus far...
Overall, there are three things I'm trying to accomplish with x and y-axis labels:
1) Concat greek letters with text.
2) Put bold text inside of the label (only the z vector in P(z) will be bold).
3) Place 'i' subscripts on my text.
While the question regarding Greek letters has been posted before, I am looking for a solution using LaTeX where I can use more than just math symbols. Using LaTeX code will allow me to solve issues 2 and 3, not just 1.
The latex2exp package is probably the easiest:
library(latex2exp)
string <- ", P(z)"
thet <- "$\\theta$"
ylab.fig2 <- TeX(paste(thet, string, sep = ""))
And then use as ... + ylab(ylab.fig2) to build the plot.
Or using bquote and expression:
library(ggplot2)
i=2
f <- bquote(expression(theta * ", " * P(bold(z))))
g <- bquote(expression(Q(z[.(i)])))
ggplot(mtcars, aes(x=hp, y=wt)) + geom_point()+
ylab(eval(f))+
xlab(eval(g))
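As a side note, here is a sketch of why the paste() attempt in the question loses the symbol, together with a slightly shorter variant of the plotmath labels passed directly to labs(). The names flat, y_lab and x_lab are made up for this example, and mtcars only stands in for the real data.

```r
library(ggplot2)

# paste() coerces the symbol to plain text, so the label is just a string
flat <- paste(bquote(theta), ", P(z)", sep = "")
flat

# Keeping the whole label as an unevaluated expression preserves the plotmath
i <- 2
y_lab <- expression(theta * ", " * P(bold(z)))
x_lab <- bquote(Q(z[.(i)]))   # .(i) is replaced by the value of i

p <- ggplot(mtcars, aes(x = hp, y = wt)) +
  geom_point() +
  labs(x = x_lab, y = y_lab)
p
```

To get a literal subscript i instead of its value, Q(z[i]) inside expression() should do.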

Multi-line legend text including exponent with ggplot

With ggplot, I want to add a left aligned legend title with multiple lines and exponents in the text for the units of the values in the legend. I'm plotting data of a form similar to:
leakage_rates_levels <- c(5.4, 0.25)
leakage_rates <- as.factor(rep(leakage_rates_levels, 3)) # L/s-m^2 at 75 Pa
data_groups_levels <- c('Set 1', 'Set 2', 'Set 3')
data_groups <- as.factor(rep(data_groups_levels, each=2))
moisture_level <- c(7, 3, 11, 10, 16, 6)
plotdt <- data.frame(data_groups, leakage_rates, moisture_level)
I use expression() to add exponents to the units in the legend. The following code generates the desired figure, but with the legend title text mis-formatted.
ggplot(plotdt, aes(data_groups)) +
geom_bar(aes(weight=moisture_level, fill=leakage_rates), position='dodge') +
labs(y='Moisture Level') +
labs(fill=expression(paste('Leakage Rate\nat 75 Pa\n(L/s-', m^2, ')', sep=''))) +
theme(panel.grid.major.x = element_blank(),
axis.title.x = element_blank())
The legend title appears left aligned except for the final line, which has a bunch of extraneous spaces in the middle of it.
Using legend_title_align=0 (suggested here) and/or legend_title=element_text(hjust=1) in theme() have no effect. Trying to add phantom() spacing also did not work (suggested here). The end of the top answer to this question notes the same problem I'm encountering but does not propose a solution.
Is there a way to get the meter squared term in the legend to be left-aligned like the rest of the text?
I am using ggplot 3.1.0 and R 3.5.1.
You can use the Unicode representation of superscript two (U+00B2) and avoid the problem-causing combination of expression() and a multi-line legend title:
ggplot(plotdt, aes(data_groups)) +
geom_bar(aes(weight=moisture_level, fill=leakage_rates), position='dodge') +
labs(y='Moisture Level') +
labs(fill='Leakage Rate\nat 75 Pa\n(L/s-m\u00b2)') +
theme(panel.grid.major.x = element_blank(),
axis.title.x = element_blank())
You can use atop to have lines "atop" each other.
However, because you have three lines and atop accepts only two arguments, you need to nest one atop call inside another. This makes the font on some of the lines smaller. The way to prevent this is to wrap the affected expressions in either textstyle or displaystyle:
ggplot(plotdt, aes(data_groups)) +
geom_bar(aes(weight = moisture_level, fill = leakage_rates), position = "dodge") +
labs(y = "Moisture Level") +
labs(fill = expression(atop(atop(textstyle("Leakage Rate"),
textstyle("at 75 Pa")),
"(L/s-" ~m^2~ ")"))) +
theme(panel.grid.major.x = element_blank(), axis.title.x = element_blank())

Annotate exponential function ggplot2

I would just like to simply add an annotation to my ggplot with the exponential function on it like this graph:
(image: Excel graph)
Here is the data: (link: Data)
Here is the code I used thus far:
dfplot<-ggplot(data, aes(dilution.factor,Concentation)) +
geom_point(size=3)+ geom_smooth(method="auto",se=FALSE, colour="black")+
scale_y_continuous(breaks=seq(0,14,by=2))
dfplot2<-dfplot+labs(x=('Dilution Factor'), y=expression('Concentration' ~
(ng/mu*L)))+
theme_bw() + theme(panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.text = element_text(colour="black"),
axis.line = element_line(colour = "black"))
dfplot3<- dfplot2+annotate("text", x=3, y=10, label = "R^2 == 1",parse=TRUE)
dfplot3
dfplot4<-dfplot3+annotate("text", x=3, y=11, label =
as.character(expression("y=13.048e^-{0.697x}" ,parse=TRUE)))
dfplot4
I can get as far as adding the R^2 value (dfplot3).
For some reason I cannot get it to add the exponential equation in. I keep getting this error:
Error: Aesthetics must be either length 1 or the same as the data (1): label
What am I doing wrong?
I'm not quite sure about the as.character(expression()) syntax you are using, but when the annotation text is parsed, the 'human' shorthand of placing a number next to a letter (13.048e) is not understood; you need to state the multiplication explicitly. You also need == instead of =.
annotate("text", x=3, y=11, label = "y == 13.048*e^-{0.697*x}", parse =TRUE)
Edit: I see that you have included parse = TRUE inside the expression() call; I think this is a mistake. To do it with expression you would write the following, although this is not in fact necessary:
annotate("text", x=3, y=11, label = as.character(expression("y == 13.048*e^-{0.697*x}")), parse = TRUE)
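A base R sketch of what probably triggers the reported error, assuming parse = TRUE relies on R's own parser for the label: with parse = TRUE placed inside expression(), the expression has two elements, so as.character() returns two labels for a single annotation. The variable names below are made up for this example.

```r
# parse = TRUE inside expression() becomes a second element of the expression
bad <- as.character(expression("y=13.048e^-{0.697x}", parse = TRUE))
length(bad)   # 2, hence "Aesthetics must be either length 1 or ..."

# The original string is not valid R syntax, so parsing it fails
original_fails <- inherits(
  try(parse(text = "y=13.048e^-{0.697x}"), silent = TRUE),
  "try-error"
)
original_fails

# With explicit multiplication and == the label parses without error
ok <- parse(text = "y == 13.048*e^-{0.697*x}")
ok
```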
