Make x-axis appear in a particular order in ggplot [duplicate] - r

I have a dataset:
data <- c('real','real','real','real','real','pred','pred','pred','pred','pred','real','real','real','real','pred','pred','pred','pred')
threshold <- c('>=1','>=2','>=3','>=4','>=101','>=1','>=2','>=3','>=4','>=101','>=1','>=2','>=3','>=4','>=1','>=2','>=3','>=4')
accuracy <- c(63.4,64.4,65.1,64.3,65.4,62.1,63.6,64.1,65.4,64.8,62.2,63.3,64.4,65.6,63.1,63.8,64.6,65.1)
types<-c('morning','morning','morning','morning','morning','morning','morning','morning','morning','morning','evening','evening','evening','evening','evening','evening','evening','evening')
df <- data.frame(data,threshold,accuracy,types)
I want to plot the 'data' column as a stacked barplot for morning and evening separately, so I use facet_wrap. My code for plotting is:
library(ggplot2)
library(ggpubr)  # provides fill_palette()
ggplot(df, aes(x = threshold, y = accuracy)) +
  geom_bar(aes(fill = data), stat = "identity", color = "white", position = position_dodge(0.9)) +
  facet_wrap(~types) +
  fill_palette("jco")
And the plot I get looks like:
However, as you can see the order of threshold got messed up. I want the order for morning to look like:
'>=1','>=2','>=3','>=4','>=101'
And the order for evening should be:
'>=1','>=2','>=3','>=4'
So I have three questions:
1. How can I enforce the order using my code?
2. For evening I shouldn't be getting '>=101', so how can I remove that from the plot?
3. Is there a way to make the background white but keep the grid?
And on a slightly unrelated note, can you point at a graph type that might be slightly better looking than this? I am new at visualisation so I am still learning.
Insights will be appreciated.

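Convert threshold to a factor whose levels are listed in the order you want; ggplot2 draws a discrete x axis in level order.
Then add scales = 'free' in facet_wrap so that each panel only shows the levels it has data for, which removes >=101 from evening (use 'free_x' if the panels should keep a shared y axis),
and add theme_bw() to make the background white while keeping the grid.
library(dplyr)
library(ggplot2)
library(ggpubr)  # provides fill_palette()
df %>%
  mutate(threshold = factor(threshold, levels = c('>=1','>=2','>=3','>=4','>=101'))) %>%
  ggplot(aes(x = threshold, y = accuracy)) +
  geom_bar(aes(fill = data), stat = "identity", color = "white", position = position_dodge(0.9)) +
  facet_wrap(~types, scales = 'free') +
  theme_bw() +
  fill_palette("jco")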
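To see why the explicit levels matter, here is a minimal base R sketch using only the threshold labels from the question. Without levels, factor() sorts the labels as text, so '>=101' lands before '>=2'; with levels, the order is exactly what you list. The variable names default_order and wanted_order are made up for illustration.

```r
# Threshold labels from the question, in the order the asker wants.
threshold <- c('>=1', '>=2', '>=3', '>=4', '>=101')

# Without explicit levels, factor() sorts the labels as text.
default_order <- levels(factor(threshold))

# With explicit levels, the level order is the order we pass in.
wanted_order <- levels(factor(threshold, levels = threshold))

print(default_order)
print(wanted_order)
```

ggplot2 places discrete axis values in level order, so the second factor gives the requested axis.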

Related

R how to increase spacing between X and Y axis value labels [duplicate]

I'm currently performing applied research on voting methodologies in the US presidential elections. I'm using R to visualize the data I've received from the government.
I'm currently trying to display the number of voting machines used per state. It works, but the labels are barely readable. This is the R code I've used to create my plot:
ggplot(data = e_2020_Voting_Machines_Per_State) +
  geom_point(mapping = aes(x = State_Abbr, y = totalMachines)) +
  coord_flip()
With that code I get the following plot:
I'd like to increase the spacing between the State_Abbr names so that they're more readable.
I've searched for a solution for quite a bit now and I unfortunately haven't been able to find one yet.
Thank you very much for your help in advance!
I've got to agree with user2974951. If you use the code below, you can adjust the width and height values until there's enough spacing between the tick labels.
library(ggplot2)
my_plot <- ggplot(data = e_2020_Voting_Machines_Per_State) +
  geom_point(mapping = aes(x = State_Abbr, y = totalMachines)) +
  coord_flip()
# Write the plot to a file; a larger height leaves more room per label
tiff("my_plot.tiff", width = 8, height = 6, units = "cm", res = 300)
print(my_plot)
dev.off()
Or, you can add + theme(axis.text = element_text(size = 7)) to your original ggplot() code and reduce the text size until the labels are all readable.
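Both ideas can be combined with ggsave(), which sets the output size without opening a graphics device by hand. This is only a sketch: the machines data frame is a made-up stand-in for e_2020_Voting_Machines_Per_State (the real data isn't in the post), and the 12 x 25 cm size is an arbitrary starting point to tune.

```r
library(ggplot2)

# Hypothetical stand-in for e_2020_Voting_Machines_Per_State.
machines <- data.frame(
  State_Abbr    = state.abb,                           # 50 built-in state abbreviations
  totalMachines = seq(100, by = 50, length.out = 50)   # made-up counts
)

my_plot <- ggplot(machines) +
  geom_point(aes(x = State_Abbr, y = totalMachines)) +
  coord_flip() +
  theme(axis.text = element_text(size = 7))  # smaller tick labels

# A tall canvas gives each of the 50 labels its own vertical space.
out_file <- file.path(tempdir(), "my_plot.png")
ggsave(out_file, plot = my_plot, width = 12, height = 25, units = "cm", dpi = 300)
```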

Custom spacing between ticks on discrete axis [duplicate]

I am creating a plot where I have a discrete y-axis and a continuous x-axis. I want to create the impression of grouping by moving some y-axis ticks closer together and increasing the space between the groups. I tried to demonstrate it by whipping something up in Paint.
ggplot(data = mpg, aes(y = trans, x = displ, group = 1)) + geom_step()
So what I'm trying to do is move the manual(mx), the auto(sx), the auto(lx) closer together (blue arrows) and increase the space between these groups (red arrows)
My idea was to create empty ticks between the groups, but ggplot is ignoring those:
library(dplyr)  # provides %>%
brks <- mpg$trans %>% unique() %>% sort()
brks <- append(brks, "test", 2)
brks <- append(brks, "", 5)
ggplot(data = mpg, aes(y = trans, x = displ, group = 1)) +
  geom_step() +
  scale_y_discrete(breaks = brks)
Does anybody have an idea how to achieve this? Thanks!
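One possible approach (a sketch, not an answer from the original thread): breaks only control which ticks are drawn, while limits control which positions exist on a discrete axis. Putting placeholder values into limits reserves empty slots between the groups, and restricting breaks to the real levels keeps those slots unlabelled. The placeholder names "gap0", "gap1" and "gap2" are made up and assumed never to occur in the data.

```r
library(ggplot2)

# Real levels of the y variable, split into the groups from the question.
trans_levels <- sort(unique(mpg$trans))
auto_l <- grep("^auto\\(l", trans_levels, value = TRUE)
auto_s <- grep("^auto\\(s", trans_levels, value = TRUE)
manual <- grep("^manual",   trans_levels, value = TRUE)
other  <- setdiff(trans_levels, c(auto_l, auto_s, manual))

# Hypothetical spacer names; each one reserves an empty slot on the axis.
lims <- c(other, "gap0", auto_l, "gap1", auto_s, "gap2", manual)

p <- ggplot(mpg, aes(y = trans, x = displ, group = 1)) +
  geom_step() +
  scale_y_discrete(limits = lims, breaks = trans_levels)
```

Adding more than one spacer between two groups widens that gap further.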

Overlay percentage for barplot, while keeping count on the y axis [duplicate]

I have a bar plot with my independent variable on the x axis (Education level), and the count of my dependent variable on the y axis (Default on credit card debt).
ggplot(cleancc, aes(x=factor(Education), fill = factor(DefaultOct05))) + geom_bar()
I'd like to keep everything as is but simply show the percentages for each break in the bar. For example, the blue part of bar 2 is 23.7%.
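As I don't have your dataset I cannot try it, but check out this option with stat_count() (older ggplot2 releases used stat_bin() here, which now requires a continuous x):
ggplot(cleancc, aes(x = factor(Education), fill = factor(DefaultOct05))) +
  geom_bar() +
  # label each segment with its count as a percentage of the total count
  stat_count(geom = "text",
             aes(label = after_stat(paste(round(count / sum(count) * 100), "%"))),
             vjust = 5)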
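If the percentages should be relative to each bar rather than to the whole data set, one option is to compute them before plotting. The sketch below uses base R only; the cleancc data frame here is a small made-up stand-in, since the real column values are not shown in the question.

```r
# Hypothetical stand-in for cleancc; the real values are unknown.
cleancc <- data.frame(
  Education    = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3),
  DefaultOct05 = c(0, 0, 0, 1, 0, 0, 1, 1, 1, 0)
)

# Count every Education x DefaultOct05 combination (zeros included).
counts <- as.data.frame(table(Education    = cleancc$Education,
                              DefaultOct05 = cleancc$DefaultOct05))

# Share of each segment within its own bar, i.e. per Education level.
counts$pct   <- 100 * counts$Freq / ave(counts$Freq, counts$Education, FUN = sum)
counts$label <- paste0(round(counts$pct, 1), "%")

print(counts)
```

The counts table can then be plotted with y = Freq and stat = "identity", with the label column drawn by geom_text() using position_stack(vjust = 0.5) so each label sits in the middle of its segment.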

Grouped bar plot in ggplot with y values based on combination of 2 categorical variables?

I am trying to create a grouped bar plot in ggplot, in which there should be 4 bars per each x value. Here is a subset of my data (actual data is about 4x longer):
Verb_Type,Frame,proportion_type,speaker
mental,V CP,0.209513024,Child
mental,V NP,0.138731597,Child
perception,V CP,0.017167382,Child
perception,V NP,0.387528402,Child
mental,V CP,0.437998087,Parent
mental,V NP,0.144086707,Parent
perception,V CP,0.042695836,Parent
perception,V NP,0.398376853,Parent
What I want is to plot Frame as the x values and proportion_type as the y values, but with the bars based on both Verb_Type and speaker. So for each x value (Frame), there would be 4 bars grouped together - a bar each for the proportion_type value corresponding to mental~child, mental~parent, perception~child, perception~parent. I need for the fill color to be based on Verb_Type, and the fill "texture" (saturation or something) based on speaker. I do not want stacked bars, as it would not accurately represent the data.
I don't want to use facet grids because I find it visually difficult to compare all 4 bars when they're separated into 2 groups. I want to group all the bars together so that the visualization is easier. But I can't figure out how to make the appropriate groupings. Is this something I can do in ggplot, or do I need to manipulate the data before plotting? I tried using melt to reshape the data, but either I was doing it wrong, or that's not what I actually should be doing.
I think you are looking for the interaction() (i.e. get all unique pairings) between df$Verb_Type and df$speaker to get the column groupings you are after. You can pass this directly to ggplot or make a new variable ahead of time:
ggplot(df, aes(x = Frame, y = proportion_type,
               group = interaction(Verb_Type, speaker),
               fill = Verb_Type, alpha = speaker)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_alpha_manual(values = c(.5, 1))
Or:
df$grouper <- interaction(df$Verb_Type, df$speaker)
ggplot(df, aes(x = Frame, y = proportion_type,
               group = grouper, fill = Verb_Type, alpha = speaker)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_alpha_manual(values = c(.5, 1))
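To check what interaction() actually builds, here is a small base R sketch on the eight rows posted in the question. It only inspects the grouping variable; no plotting is involved.

```r
# The subset of data from the question.
df <- data.frame(
  Verb_Type = rep(c("mental", "mental", "perception", "perception"), 2),
  Frame     = rep(c("V CP", "V NP"), 4),
  proportion_type = c(0.209513024, 0.138731597, 0.017167382, 0.387528402,
                      0.437998087, 0.144086707, 0.042695836, 0.398376853),
  speaker   = rep(c("Child", "Parent"), each = 4)
)

# One level per Verb_Type/speaker pairing.
df$grouper <- interaction(df$Verb_Type, df$speaker)

print(levels(df$grouper))
# Each Frame should contain every pairing exactly once, i.e. 4 bars per x value.
print(table(df$Frame, df$grouper))
```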

A way to always dodge a histogram? [duplicate]

Using ggplot2 I'm creating a histogram with a factor on the horizontal axis and another factor for the fill color, using a dodged position. My problem is that the fill factor sometimes takes only one value for a value of the horizontal factor, and with nothing to dodge the bar takes up the full width. Is there a way to make it dodge nothing so that all bar widths are the same? Or equivalently to plot the 0's?
For example
ggplot(data = mtcars, aes(x = factor(carb), fill = factor(gear))) +
  geom_histogram(position = "dodge")
This answer has a couple ideas. It was also asked before the new version was released, so maybe something changed? Using facets (also shown here) I don't like for my situation, though I suppose editing the data and using geom_bar could work, but it feels inelegant. Moreover, when I tried facetting anyway
ggplot(mtcars, aes(x = factor(carb), fill = factor(gear))) +
  geom_bar() +
  facet_grid(~factor(carb))
I get the error "Error in layout_base(data, cols, drop = drop):
At least one layer must contain all variables used for facetting"
I suppose I could generate a data frame of counts and then use geom_bar,
mtcounts <- ddply(subset(mtcars, select = c("carb", "gear")),
                  .fun = count, .variables = c("carb", "gear"))
filling out the levels that aren't present with 0's. Does anyone know if that would work or if there's a better way?
Update: geom_bar needs stat = "identity".
I'm not sure if this is too late for you, but see the answer to a recent post here
That is, I'd take Joran's advice to pre-calculate the counts outside the ggplot call and to use geom_bar. As with the answer to the other post, the counts are obtained in two steps: first, a cross-tabulation of counts is built with dcast; second, the cross-tabulation is melted back into long format.
library(ggplot2)
library(reshape2)
dat = dcast(mtcars, factor(carb) ~ factor(gear), fun.aggregate = length)
dat.melt = melt(dat, id.vars = "factor(carb)", measure.vars = c("3", "4", "5"))
dat.melt
(p <- ggplot(dat.melt, aes(x = `factor(carb)`, y = value, fill = variable)) +
   geom_bar(stat = "identity", position = "dodge"))
The chart:
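As a hedged alternative to the dcast/melt round trip, the same zero-filled counts can be sketched in base R: table() keeps every carb/gear combination, including those with no cars, and as.data.frame() turns the result into long format. The names counts and zero_rows are made up for illustration.

```r
# Cross-tabulate carb by gear; combinations with no cars get a count of 0.
counts <- as.data.frame(table(carb = mtcars$carb, gear = mtcars$gear))

# These are the rows that keep the dodged bars at equal width.
zero_rows <- counts[counts$Freq == 0, ]

print(counts)
print(zero_rows)
```

The counts data frame can then be passed to geom_bar(stat = "identity", position = "dodge") with x = carb, y = Freq and fill = gear.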
As shown in this answer, in newer versions of ggplot2 (version >= 2.2.1.900) there is a simpler way: position_dodge gains a preserve argument that if set to "single" will always dodge.
ggplot(data = mtcars, aes(x = factor(carb), fill = factor(gear))) +
  geom_bar(position = position_dodge(preserve = "single"))
