Two different legends from one dataset - R

I am trying to have two legends: one based on variable c and the other on variable d, each defined by its own shape and size. I do not know if this is possible in ggplot2; maybe it does not fit the philosophy behind ggplot2. If I transform the data to long format, I can deal with the different shapes, but the sizes are confounded. The same happens if I use facet_wrap.
e <- structure(list(a = c(5, 6, 7), b = c(5, 6, 7), c = c(0.1, 0.5,
1), d = c(10, 5, 1)), .Names = c("a", "b", "c", "d"), row.names = c(NA,
-3L), class = "data.frame")
library(ggplot2)
plot <- ggplot() + geom_point(data=e,aes(x=a,y=b,size=c), shape=1,
color="black")
plot <- plot + geom_point(data=e,aes(x=a,y=b,size=d), shape=3, color="red")
plot
Any advice is more than welcome.

You can map shape and size inside aes(), like geom_point(aes(x=a,y=b,shape=factor(c))) + geom_point(aes(x=a,y=b,size=d), shape=3). For example,
library(ggplot2)
ggplot(mpg) + geom_point(aes(x=hwy,y=cty,shape=class)) +
geom_point(aes(x=hwy,y=cty,size=cyl), shape=3)

Related

logarithmic y-axis issue in R/ ggplot2

I plotted a histogram from a frequency distribution table using ggplot2. Here is some sample data
# output of dput(test_data), assigned so the code below can run
test_data <- structure(list(inst = c(5, 5, 5, 10, 10, 10, 15, 15, 15), equip = c("a",
"b", "c", "a", "b", "c", "a", "b", "c"), value = c(0.520670542493463,
0.7556017707102, 0.931902746669948, 0.206132101127878, 0.0114199279341847,
0.603053622646257, 0.315444506937638, 0.375196750741452, 0.983124621212482
)), class = "data.frame", row.names = c(NA, -9L))
When I use ggplot2 to plot the data, I get the following output:
test_hist1 <- ggplot(test_data, aes(x = inst, y = value, fill = equip)) +
  # "dodge" is a position, not a stat; precomputed heights need stat = "identity"
  geom_bar(width = 3, alpha = 1, stat = "identity", position = "stack") +
  theme_bw() +
  xlab(expression(Value)) +
  ylab("value") +
  ggtitle(expression(test~data)) +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_fill_manual(values = c("#00FF00", "#FFD700", "#DC143C"))
But when I transform the y_axis to be a log_axis, the plot direction changes and so does the intensity of the bars.
test_hist2 <- ggplot(test_data, aes(x = inst, y = value, fill = equip)) +
  # "dodge" is a position, not a stat; precomputed heights need stat = "identity"
  geom_bar(width = 3, alpha = 1, stat = "identity", position = "stack") +
  theme_bw() +
  xlab(expression(Value)) +
  ylab("log_yaxis") +
  ggtitle(expression(test~data)) +
  theme(plot.title = element_text(hjust = 0.5)) +
  scale_fill_manual(values = c("#00FF00", "#FFD700", "#DC143C")) +
  scale_y_log10()
My second plot is wrong because the code just converts my y values to log10(y) instead of drawing a log axis like the one in the following answer (the plot in that answer has the axis I am looking for). Can someone point me in the right direction? Thanks for the help.
R: Difference between log axis scale vs. manual log transformation?
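A possible explanation for the flipped bars, offered as a sketch and not a confirmed diagnosis: every value in the sample data is below 1, so its log10 is negative. On a log10 scale the bars are then measured from 10^0 = 1 downward. The vector below is copied from the question's test_data; the variable names are only illustrative.

```r
# Values copied from the question's test_data
value <- c(0.520670542493463, 0.7556017707102, 0.931902746669948,
           0.206132101127878, 0.0114199279341847, 0.603053622646257,
           0.315444506937638, 0.375196750741452, 0.983124621212482)

# log10 of a number between 0 and 1 is negative
log_value <- log10(value)

all_below_one <- all(value < 1)
all_negative <- all(log_value < 0)

print(all_below_one)
print(all_negative)
print(round(log_value, 2))
```

If this is the cause, rescaling the data so that the values are at least 1 (for example plotting percentages) would keep the bars pointing upward on a log axis.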

Reshape group labels in ggradar

When using ggradar long variable names don't fit the pane. Is there a way to reshape the variable names in ggradar?
Reproducible example:
library(ggradar)
suppressPackageStartupMessages(library(dplyr))
library(scales)
data <- data.frame(
group = c("A", "B", "C"),
variable_with_long_name_1 = c(0,1,0.5),
variable_with_long_name_2 = c(0,1,.5),
variable_with_long_name_3 = c(1,0,0.5)
)
ggradar(data)
This works and looks something like:
Any hints?
If I may, I suggest the use of ggRadar from ggiraphExtra:
library(ggiraphExtra)
g <- ggRadar(data, aes(color = group), scales = "free") +
theme_minimal() +
theme(text = element_text(size=7), # custom font size
axis.text.y = element_blank())
Plus you'll get to use the ggplot2 grammar.
Also, I think it's better to simply use ggsave and adjust the dimensions; that way you won't have to sacrifice text size:
g <- ggRadar(data, aes(color = group), scales = "free") +
theme_minimal() +
theme(axis.text.y = element_blank())
print(g)
ggsave("/plt.png", width = 16, height = 9, dpi = 120)
Data used:
data <- data.frame(
group = c("A", "B", "C"),
variable_with_long_name_1 = c(0,1,0.5),
variable_with_long_name_2 = c(0,1,.5),
variable_with_long_name_3 = c(1,0,0.5)
)
It's actually straightforward. ggradar lets you scale all labels:
Variable names are scaled by setting the axis.label.size option,
the scale labels by setting the grid.label.size option and
the legend by setting the legend.text.size option.
So
library(ggradar)
suppressPackageStartupMessages(library(dplyr))
library(scales)
data <- data.frame(
group = c("A", "B", "C"),
variable_with_long_name_1 = c(0,1,0.5),
variable_with_long_name_2 = c(0,1,.5),
variable_with_long_name_3 = c(1,0,0.5)
)
ggradar(data, axis.label.size = 3, grid.label.size = 3, legend.text.size = 10)
plots to something like
library(ggradar)
suppressPackageStartupMessages(library(dplyr))
library(scales)
data <- data.frame(
group = c("A", "B", "C"),
"variable with long name"= c(0,1,0.5),
"variable with \n long name" = c(0,1,.5),
variable_with_long_name_3 = c(1,0,0.5)
)
ggradar(data)+
ggtitle("Title on \n two lines")
With other ggplot features I have used "\n" inside long labels (like titles or names), but with ggradar it does not work. Maybe you can still use this as a hint to change something.
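One thing that may contribute here, shown as a minimal base R sketch: by default data.frame() passes column names through make.names(), so spaces and "\n" are replaced by dots before ggradar ever sees them. Setting check.names = FALSE keeps the names as typed. Whether ggradar then renders the line break is an assumption that has not been verified here; the object names mangled and kept are hypothetical.

```r
# Default behaviour: column names are made syntactically valid,
# so the space and newline characters are replaced by dots
mangled <- data.frame(
  group = c("A", "B", "C"),
  "variable with \n long name" = c(0, 1, 0.5)
)

# check.names = FALSE keeps the column names exactly as typed
kept <- data.frame(
  group = c("A", "B", "C"),
  "variable with \n long name" = c(0, 1, 0.5),
  check.names = FALSE
)

print(names(mangled))
print(names(kept))
```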
Another method which might work, at least if you only have a few categories in your spider chart:
Add the text to the front of the graph, as indicated by Joachim Schork in the link below
https://statisticsglobe.com/add-bold-and-italic-text-to-ggplot2-plot-in-r
ggp + # Add bold text element to plot, could be anything else like italic
annotate("text", x = 4.5, y = 2.2, size = 5,
label = "My Bold Text",
fontface = "bold")
I just had the same problem, but for me rescaling the labels was not an option (readability). Since the goal was simply to keep the labels from being clipped, here is the solution I found:
There is the option plot.extent.x.sf, which can be increased to extend the plot horizontally. I set it to 2, and then even my longest feature label was plotted correctly.

Values in gganimate col chart differs from original data values

I'm starting with animated charts and using gganimate package. I've found that when generating a col chart animation over time, values of variables change from original. Let me show you an example:
Data <- as.data.frame(cbind(c(1,1,1,2,2,2,3,3,3),
c("A","B","C","A","B","C","A","B","C"),
c(20,10,15,20,20,20,30,25,35)))
colnames(Data) <- c("Time","Object","Value")
Data$Time <- as.integer(Data$Time)
Data$Value <- as.numeric(Data$Value)
Data$Object <- as.character(Data$Object)
p <- ggplot(Data,aes(Object,Value)) +
stat_identity() +
geom_col() +
coord_cartesian(ylim = c(0,40)) +
transition_time(Time)
p
The chart obtained looks like this:
Values obtained in the Y-axis are between 1 and 6. It seems that the original value of 10 corresponds to a value of 1 in the Y-axis. 15 is 2, 20 is 3 and so on...
Is there a way for keeping the original values in the chart?
Thanks in advance
Your data changed when you coerced a factor variable into numeric: as.numeric() on a factor returns the integer level codes, not the original numbers. (See the Data section below for a more efficient way to define a data.frame.)
You were also missing position = "identity" for your bars to stay in the same place. I added fill = Time for illustration.
Code
p <- ggplot(Data, aes(Object, Value, fill = Time)) +
geom_col(position = "identity") +
coord_cartesian(ylim = c(0, 40)) +
transition_time(Time)
p
Data
Data <- data.frame(Time = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
Object = c("A", "B", "C", "A", "B", "C", "A", "B", "C"),
Value = c(20, 10, 15, 20, 20, 20, 30, 25, 35))
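The factor coercion mentioned above can be reproduced in isolation. This is a small sketch with a hypothetical value vector standing in for the Value column: cbind() turns the mixed columns into text, and as.data.frame() stored text as factors by default before R 4.0.0.

```r
# Hypothetical stand-in for the Value column once it has become a factor
value <- factor(c("20", "10", "15", "20", "20", "20", "30", "25", "35"))

# as.numeric() on a factor returns the integer level codes
codes <- as.numeric(value)

# Converting to character first recovers the original numbers
recovered <- as.numeric(as.character(value))

print(levels(value))
print(codes)
print(recovered)
```

The level codes run from 1 to 6, which matches the y-axis values described in the question (10 shown as 1, 15 as 2, 20 as 3, and so on).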

How can I force ggplot to show more levels on the legend?

I'm trying to create a complex ggplot plot but some things don't work as expected.
I have extracted the problematic part, the creation of points and its associated legend.
library(data.table)
library(ggplot2)
lev <- c("A", "B", "C", "D") # define levels.
bb <- c(40, 30,20,10,5)/100 # define breaks.
ll <- c("40%","30%","20%","10%","5%") # labels.
# Create data
nodos <- data.table(event = c("A", "B", "D", "C", "D"), ord = c(1, 2, 3, 3, 4),
NP = c(0.375, 0.25, 0.125, 0.125, 0.125))
ggplot() + geom_point(data=nodos,aes(x=ord,
y=event, size=NP), color="black", shape=16) +
ylim(lev) + scale_size_continuous(name="Prop.",
breaks=bb, labels=ll, range=c(0,6))+
scale_x_continuous(limits=c(0.5, 4.5),
breaks=seq(1,4,1))
As you can see, no matter what breaks and labels I use I'm not able to force ggplot to paint a legend containing 0% or 10%.
scale_size_continuous keeps creating just two elements.
And the smaller points are very badly scaled.
I have also tried scale_size_area, but it doesn't work either.
I'm using R 3.4.2 and ggplot2 2.2.1 (also tried the latest github version).
How can I get it?
If you set the limits to encompass the breaks, you'll be able to alter the legend. Currently most of the breaks are outside the default limits of the scale.
ggplot() +
geom_point(data = nodos,
aes(x = ord, y = event, size = NP), color="black", shape = 16) +
scale_size_continuous(name = "Prop.",
breaks = bb,
limits = c(.05, .4),
labels = ll,
range = c(0, 6) )
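To see why only two legend entries appeared, here is a minimal base R check. It relies on the default limits of a continuous scale being the range of the mapped data; the names default_limits, wide_limits and the two result vectors are only illustrative.

```r
# Breaks and data copied from the question
bb <- c(40, 30, 20, 10, 5) / 100
NP <- c(0.375, 0.25, 0.125, 0.125, 0.125)

# Without explicit limits, the scale spans the range of the data
default_limits <- range(NP)
breaks_shown_by_default <- bb[bb >= default_limits[1] & bb <= default_limits[2]]

# With the limits from the answer, every break falls inside the scale
wide_limits <- c(0.05, 0.4)
breaks_shown_with_limits <- bb[bb >= wide_limits[1] & bb <= wide_limits[2]]

print(default_limits)
print(breaks_shown_by_default)
print(breaks_shown_with_limits)
```

Only the 30% and 20% breaks lie inside the data range, which is consistent with the two-entry legend in the question.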

Stacked bar plot in violin plot shape

Maybe this is a stupid idea, or maybe it's a brain wave. I have a dataset of lipid classes in 4 different species. The data is proportional, and the sums are 1000. I want to visualise the differences in proportions for each class in each species. Generally a stacked bar would be the way to go here, but there are several classes, and it becomes uninterpretable since only the bottom class shares a baseline (see below).
And this appears to be the best option of a bad bunch, with pie and donut charts being nothing short of sneered at.
I was then inspired by this creation Symmetrical, violin plot-like histogram?, which creates a sort of stacked distribution violin plot (see below).
I am wondering if this could somehow be converted into a stacked violin, such that each segment represents a whole variable. In the case of my data, species A and D would be 'fat' around the TAG segment and 'skinnier' at the STEROL segment. This way the proportions are depicted horizontally and always have a common baseline. Thoughts?
Data:
DF <- structure(list(Sample = c("A", "A", "A", "B", "B", "B", "C",
"C", "C", "D", "D"), WAX = c(83.7179798600773, 317.364310355766,
20.0147496567679, 93.0194886619568, 78.7886829173726, 79.3445694220837,
91.0020522660375, 88.1542855137005, 78.3313314713951, 78.4449591023115,
236.150030864875), TAG = c(67.4640254081232, 313.243238213156,
451.287867136276, 76.308508343969, 40.127554151831, 91.1910102221636,
61.658394708941, 104.617259648364, 60.7502685224869, 80.8373642262043,
485.88633863193), FFA = c(41.0963382465756, 149.264019576272,
129.672579626868, 51.049208042632, 13.7282635713804, 30.0088572108344,
47.8878116348504, 47.9564218319094, 30.3836532949481, 34.8474205480686,
10.9218910757234), `DAG1,2` = c(140.35876401479, 42.4556176551009,
0, 0, 144.993393432366, 136.722412691012, 0, 140.027443968931,
137.579074961889, 129.935353616471, 46.6128854387559), STEROL = c(73.0144390122309,
24.1680929257195, 41.8258704279641, 78.906816661241, 67.5678558060943,
66.7150537517493, 82.4794113296791, 76.7443442992891, 68.9357008866253,
64.5444668132533, 29.8342694785768), AMPL = c(251.446564854412,
57.8713327050339, 306.155806819949, 238.853696442419, 201.783872969561,
175.935515655693, 234.169038776536, 211.986239116884, 196.931330316831,
222.658181144794, 73.8944654414811), PE = c(167.99718650752,
43.3839497916674, 22.1937177530762, 150.315149187176, 153.632530721031,
141.580725482114, 164.215442147509, 155.113323256627, 143.349000132624,
128.504657216928, 50.6281347160092), PC = c(174.904702096271,
52.2494387772846, 28.8494085790995, 191.038328534942, 190.183655117756,
175.33290326259, 199.2632149392, 175.400682364295, 176.64926273487,
163.075864395099, 66.071984352649), LPC = c(0, 0, 0, 120.508804125665,
109.194191312608, 103.16895230176, 119.324634197247, 0, 107.09037767833,
97.151732936871, 0)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -11L), .Names = c("Sample", "WAX", "TAG",
"FFA", "DAG1,2", "STEROL", "AMPL", "PE", "PC", "LPC"))
This is essentially a horizontal bar plot:
library(ggplot2)
library(reshape2)
DFm <- melt(DF, id.vars = "Sample")
DFm1 <- DFm
DFm1$value <- -DFm1$value
DFm <- rbind(DFm, DFm1)
ggplot(DFm, aes(x = "A", y = value / 10, fill = variable, color = variable)) +
geom_bar(stat = "identity", position = "dodge") +
coord_flip() +
theme_minimal() +
facet_wrap(~ Sample, nrow = 1, switch = "x") +
theme(axis.text = element_blank(),
axis.title = element_blank(),
panel.grid = element_blank())
