Minimize the height of a geom_bar plot - r

I have a bar plot that cuts off a data label because one bar is much
taller than the others. I want to adjust the height so that all the
labels are visible.
   season batsman       total_runs
    <int> <chr>              <int>
 1   2016 V Kohli              973
 2   2018 KS Williamson        747
 3   2012 CH Gayle             733
 4   2013 MEK Hussey           733
 5   2019 DA Warner            727
 6   2014 RV Uthappa           660
 7   2017 DA Warner            641
 8   2010 SR Tendulkar         618
 9   2008 SE Marsh             616
10   2011 CH Gayle             608
11   2009 ML Hayden            572
12   2015 DA Warner            562
I have tried ylim, but it does not work in my case.
season_top_scorer <- match_full %>%
  group_by(season, batsman) %>%
  summarize(total_runs = sum(batsman_runs)) %>%
  arrange(season, desc(total_runs)) %>%
  filter(total_runs == max(total_runs)) %>%
  arrange(desc(total_runs)) %>%
  ggplot(aes(x = season, y = total_runs, fill = batsman)) +
  geom_bar(stat = "identity") +
  ggtitle("Highest run scorer each season") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  scale_x_discrete(name = "Season",
                   limits = c(2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019)) +
  geom_text(aes(label = total_runs, vjust = 0)) +
  scale_y_discrete(name = "Total Runs", limits = c(0, 250, 500, 750, 1000, 1250))
The only problem is with season 2016. The bar is so tall that its label
is cut off. Any idea what might solve this problem in the code above?

You should use scale_y_continuous instead of scale_y_discrete, because total_runs is a numeric variable.
scale_y_continuous(name = "Total Runs", breaks = c(0, 250, 500, 750, 1000, 1250))
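If the label on the tallest bar is still clipped, one option is to also set limits on the continuous scale so the axis extends above that bar. The following is a minimal sketch on three rows copied from the question's table; the data frame name top, the vjust value and the upper limit of 1250 are illustrative assumptions, not part of the original code.

```r
library(ggplot2)

# Three rows copied from the question's table (the name `top` is hypothetical)
top <- data.frame(
  season = c(2012, 2016, 2018),
  batsman = c("CH Gayle", "V Kohli", "KS Williamson"),
  total_runs = c(733, 973, 747)
)

p <- ggplot(top, aes(x = factor(season), y = total_runs, fill = batsman)) +
  geom_col() +
  # A negative vjust places each label just above the top of its bar
  geom_text(aes(label = total_runs), vjust = -0.3) +
  # limits extends the axis to 1250, leaving headroom above the 973 bar
  scale_y_continuous(name = "Total Runs",
                     breaks = seq(0, 1250, by = 250),
                     limits = c(0, 1250)) +
  labs(x = "Season", title = "Highest run scorer each season")

p
```

Because every bar lies well inside the 0 to 1250 range, setting limits here does not drop any data.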

Related

How to change behaviour of `check_overlap = TRUE`?

My data is structured as follows:
> Comparison
# A tibble: 12 x 3
round TotalShots Year
<int> <dbl> <dbl>
1 1 70 2021
2 2 68 2021
3 3 76 2021
4 4 73 2021
5 5 66 2021
6 6 70 2021
7 1 115 2020
8 2 106 2020
9 3 75 2020
10 4 73 2020
11 5 82 2020
12 6 84 2020
I can plot this in ggplot2 via:
ggplot(Comparison, aes(x = round, y = TotalShots,
                       colour = factor(Year), label = TotalShots)) +
  geom_line() +
  geom_point(size = 14) +
  geom_text(colour = "black", size = 5, check_overlap = TRUE)
However, in the plot the label at round 3 is printed as 76 and not 75. I assume this is because of check_overlap = TRUE, but the plot is wrong: the 2020 point for round 3 should be labelled 75, not 76.
Is there any way to fix this?
You can try the ggrepel package, which moves labels apart so that they stay readable and do not overlap.
library(ggrepel)
library(ggplot2)
ggplot(Comparison, aes(x = round, y = TotalShots,
                       colour = factor(Year), label = TotalShots)) +
  geom_line() +
  geom_point(size = 14) +
  geom_label_repel(colour = "black", size = 5, nudge_y = 0.8)
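As background on check_overlap itself: the geom_text documentation says that overlapping text is dropped at draw time, in the order of the data, so the earlier row wins. The sketch below only changes which label survives by reordering the rows. It does not show both labels, and the name by_priority is a hypothetical helper.

```r
library(ggplot2)

# Data copied from the question
Comparison <- data.frame(
  round = rep(1:6, times = 2),
  TotalShots = c(70, 68, 76, 73, 66, 70, 115, 106, 75, 73, 82, 84),
  Year = rep(c(2021, 2020), each = 6)
)

# Put the 2020 rows first so their labels are considered first
# when check_overlap decides which overlapping text to drop
by_priority <- Comparison[order(Comparison$Year), ]

p <- ggplot(by_priority, aes(x = round, y = TotalShots,
                             colour = factor(Year), label = TotalShots)) +
  geom_line() +
  geom_point(size = 14) +
  geom_text(colour = "black", size = 5, check_overlap = TRUE)

p
```

If both labels must remain visible, removing check_overlap or using ggrepel as in the answer above is the more suitable route.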

scaling x and y axis (geom_bar)

I am having trouble plotting what is seemingly a simple plot.
x <-
  read_excel("Desktop/Book1.xlsx",
             col_types = c("numeric", "numeric", "numeric"))
x1 <- gather(hospitals, key = "sector", value = "count", 2:3)
p <- ggplot(data = x1, aes(x = Years, y = count, fill = sector)) +
  geom_col(position = "stack", stat = "identity", width = 5, colour = "black") +
  geom_text(aes(label = count), vjust = 1, color = "white", size = 2) +
  guides(fill = FALSE) +
  scale_fill_grey() +
  theme_bw(base_size = 12)
p
The data is:
1 1946 Public hospitals 35
2 1984 Public hospitals 41
3 2000 Public hospitals 65
4 2001 Public hospitals 67
5 2002 Public hospitals 66
6 2003 Public hospitals 76
7 2004 Public hospitals 77
8 2005 Public hospitals 85
9 2006 Public hospitals 90
10 2007 Public hospitals 94
11 2008 Public hospitals 97
12 2009 Public hospitals 102
13 2010 Public hospitals 102
14 1946 Private hospitals NA
15 1984 Private hospitals 139
16 2000 Private hospitals 325
17 2001 Private hospitals 336
18 2002 Private hospitals 343
19 2003 Private hospitals 364
20 2004 Private hospitals 376
21 2005 Private hospitals 376
22 2006 Private hospitals 353
23 2007 Private hospitals 355
24 2008 Private hospitals 365
25 2009 Private hospitals 370
26 2010 Private hospitals 376
and I end up with this result!
First, how can I modify the x axis to show the bars separated and only for the years I have data for? Can the stretch of the x axis around 1960 be omitted and the bars squeezed together to save space?
Second, how can the y axis be fixed? Some bars are higher than their value!
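# Treating Years as character makes the x axis discrete,
# so only years present in the data get a bar
x1 %>%
  ggplot(aes(x = as.character(Years), y = count, fill = sector)) +
  geom_col(position = "stack", colour = "black") +
  geom_text(aes(label = count), vjust = 1, size = 2,
            # use x1 (the plotted data) so the colours line up with its rows
            color = ifelse(x1$sector != "Public hospitals", "white", "black")) +
  guides(fill = "none") +
  scale_x_discrete(name = "Year") +
  scale_fill_grey() +
  theme_bw(base_size = 12)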
Edit: Upon reconsideration I realized I did not properly stack the positioning of the text. It happens to look ok with this data, but that's just coincidence. To get the right positioning for the text, one approach is manual: we could sum up the cumulative height for each year:
x1 %>%
  group_by(Years) %>%
  mutate(cuml_count = cumsum(count)) %>%
  ungroup() %>% ....
  geom_text(aes(label = count, y = cuml_count), vjust = 1, size = 2,
            color = ifelse(x1$sector != "Public hospitals", "white", "black")) +
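An alternative to summing the heights by hand is to let ggplot2 stack the labels with position_stack(), the same position that geom_col uses. This is a minimal sketch on four rows taken from the question's data; the name x1_demo and the vjust and size values are assumptions made for illustration.

```r
library(ggplot2)

# Four rows copied from the question's data (the name `x1_demo` is hypothetical)
x1_demo <- data.frame(
  Years = c(2000, 2000, 2001, 2001),
  sector = rep(c("Public hospitals", "Private hospitals"), times = 2),
  count = c(65, 325, 67, 336)
)

p <- ggplot(x1_demo, aes(x = as.character(Years), y = count, fill = sector)) +
  geom_col(colour = "black") +
  # position_stack(vjust = 1) puts each label at the top of its own segment;
  # the separate vjust = 1.5 then shifts the text down into the bar
  geom_text(aes(label = count, colour = sector),
            position = position_stack(vjust = 1), vjust = 1.5, size = 3) +
  scale_colour_manual(values = c("Public hospitals" = "black",
                                 "Private hospitals" = "white"),
                      guide = "none") +
  scale_x_discrete(name = "Year") +
  scale_fill_grey() +
  theme_bw(base_size = 12)

p
```

Mapping colour to sector inside aes() avoids building a colour vector by hand, so the text colours cannot drift out of step with the rows.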

How to create cumulative precipitation vs. temperature graph in a single plot

I have historical data for annual precipitation and annual temperature. I want to plot the years in a single graph, divided into cool and wet, warm and wet, cool and dry, and warm and dry years. Can someone help me with this?
Year Precip annual temperature
1987 821 8.5
1988 441 8
1989 574 7.9
1990 721 12.4
1991 669 10.8
1992 830 10
1993 1105 7.8
1994 772 8
1995 678 6.7
1996 834 8
1997 700 11
1998 786 11.2
1999 612 12
2000 758 10.6
2001 833 11
2002 622 10.6
2003 656 10.7
2004 799 9.9
2005 647 10.8
2006 764 12
2007 952 12.5
2008 943 10.86
2009 610 12.8
2010 766 11
2011 717 11.3
2012 602 9.5
2013 834 10.6
2014 758 11
2015 841 11
2016 630 11.5
2017 737 11.2
Average 742.32 10.36
As Majid suggested, you need to give more detail so you can get better answers. At the very least, use dput() on your data frame so that we can get a reproducible copy of it. Copying and pasting from Excel is not appropriate for this kind of question.
In any case, that graph can easily be done using the ggplot2 package. You plot each year at its X and Y coordinates and then manually add the lines and the titles for each category. You do need to establish the boundaries between cool/warm and dry/wet, of course.
library(ggplot2)
# Assumes rain.csv has the columns year, precip and temp
rain <- read.csv('~/data/rain.csv')
# Boundaries between dry/wet (mm) and cool/warm (°C)
limit_humid <- 800
limit_warm <- 9.5
ggplot(rain, aes(x = temp, y = precip)) +
  geom_text(aes(label = year)) +
  geom_vline(xintercept = limit_warm) +
  geom_hline(yintercept = limit_humid) +
  # One title per quadrant, placed in the corners of the data range
  annotate('text', label = 'bold("Cool and wet")', size = 4, parse = TRUE,
           x = min(rain$temp), y = max(rain$precip)) +
  annotate('text', label = 'bold("Warm and wet")', size = 4, parse = TRUE,
           x = max(rain$temp), y = max(rain$precip)) +
  annotate('text', label = 'bold("Cool and dry")', size = 4, parse = TRUE,
           x = min(rain$temp), y = min(rain$precip)) +
  annotate('text', label = 'bold("Warm and dry")', size = 4, parse = TRUE,
           x = max(rain$temp), y = min(rain$precip)) +
  theme_classic() +
  labs(x = 'Average Temperature (°C)',
       y = 'Cumulative precipitation (mm)')
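The same two boundaries can also be used to assign each year to a category, which can then drive the colour of the labels. This is a minimal sketch on four rows copied from the question's table; the column name category is a hypothetical addition, and the thresholds are the example values used above.

```r
library(ggplot2)

# Four rows copied from the question's table
rain <- data.frame(
  year = c(1987, 1988, 1990, 2007),
  precip = c(821, 441, 721, 952),
  temp = c(8.5, 8, 12.4, 12.5)
)

limit_humid <- 800
limit_warm <- 9.5

# Combine the two comparisons into one label per year
rain$category <- paste(
  ifelse(rain$temp > limit_warm, "Warm", "Cool"),
  "and",
  ifelse(rain$precip > limit_humid, "wet", "dry")
)

p <- ggplot(rain, aes(x = temp, y = precip, colour = category)) +
  geom_text(aes(label = year)) +
  geom_vline(xintercept = limit_warm) +
  geom_hline(yintercept = limit_humid) +
  theme_classic() +
  labs(x = 'Average Temperature (°C)',
       y = 'Cumulative precipitation (mm)',
       colour = 'Category')

p
```

With the category stored in the data, the legend replaces the four manual annotate() calls.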

Annotate group on stacked bar graph

How could I add "Division" label on top of the bars themselves in this example of a stacked bar chart?
ggplot2 and a Stacked Bar Chart with Negative Values
I only want to show it for values with space (don't want to overcrowd the figure), so maybe this could be implemented by a minimum bar height. How could I do it for only bars with that minimum height?
Thanks!
You can use geom_text() which comes with a check_overlap parameter -- see ?geom_text():
dat <- read.table(text = " Division Year OperatingIncome
1 A 2012 11460
2 B 2012 7431
3 C 2012 -8121
4 D 2012 15719
5 E 2012 364
6 A 2011 12211
7 B 2011 6290
8 C 2011 -2657
9 D 2011 14657
10 E 2011 1257
11 A 2010 12895
12 B 2010 5381
13 C 2010 -2408
14 D 2010 11849
15 E 2010 517",header = TRUE,sep = "",row.names = 1)
ggplot(dat, aes(x = Year, y = OperatingIncome, fill = Division)) +
  geom_col() +
  geom_text(aes(label = Division),
            position = position_stack(vjust = 0.5),
            check_overlap = TRUE)
In the example, however, you will see that the labels do not overlap.
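To label only bars above a minimum height, as the question asks, one possible approach is to blank out the label for small bars while keeping every row in the data, so the stacking positions stay the same. This is a sketch only; the threshold min_height = 1000 and the column name label are hypothetical choices.

```r
library(ggplot2)

# Same values as the read.table() example above
dat <- data.frame(
  Division = rep(c("A", "B", "C", "D", "E"), times = 3),
  Year = rep(c(2012, 2011, 2010), each = 5),
  OperatingIncome = c(11460, 7431, -8121, 15719, 364,
                      12211, 6290, -2657, 14657, 1257,
                      12895, 5381, -2408, 11849, 517),
  stringsAsFactors = FALSE
)

# Hypothetical threshold: bars shorter than this get an empty label
min_height <- 1000
dat$label <- ifelse(abs(dat$OperatingIncome) >= min_height, dat$Division, "")

p <- ggplot(dat, aes(x = Year, y = OperatingIncome, fill = Division)) +
  geom_col() +
  # group = Division keeps the text stacked in the same order as the bars,
  # even though the label column now differs from Division
  geom_text(aes(label = label, group = Division),
            position = position_stack(vjust = 0.5))

p
```

Filtering the rows passed to geom_text would instead change the stacking and shift the remaining labels, which is why the small bars are kept with an empty string.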

Plot point and line graph in primary and secondary y-axis using ggplot in R

I have the following table. I need to plot "Area" on the primary y-axis as points, with "Weeks" on the x-axis. On the same x-axis, I need to plot "SM9_5" as a line on the secondary y-axis. My code is below, but it does not plot this correctly.
Any ideas are appreciated.
Thanks.
YEAR Week Area SM9_5 sum percent COUNTY
2002 9-2 250 212.2 250 10.2 125
2002 10-1 300 450.2 550 22.5 125
2002 10-2 100 150.2 650 100.0 125
2002 9-3 50 212.2 250 10.2 15
2002 10-1 30 450.2 550 22.5 15
2002 10-2 10 150.2 650 100.0 15
2003 9-2 12 112.2 12 20.2 150
2003 10-1 15 350.2 27 82.5 150
2003 10-2 16 650.2 43 100.0 150
gg <- gg + geom_point(aes(y = Area, colour = "Area"))
gg <- gg + geom_line(aes(y = SM9_1, colour = "Sep_SM_9-1"))
gg <- gg + scale_y_continuous(sec.axis = sec_axis(~., name = "Soil Moisture"))
gg <- gg + scale_colour_manual(values = c("blue","red"))
gg <- gg + facet_wrap(~COUNTY, 2, scales = "fixed")
gg <- gg + labs(y = "Area",
x = "Weeks",
colour = "Parameter")
plot(gg)
My plot is shown below.
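A common pattern for a secondary axis in ggplot2 is to rescale the second variable onto the range of the first and pass the inverse transformation to sec_axis(). The sketch below uses the three COUNTY 125 rows from the table and the SM9_5 column as named there; the data frame name df and the scale factor k are hypothetical, and this is one possible approach rather than a confirmed fix for the code above.

```r
library(ggplot2)

# COUNTY 125 rows copied from the table; Week is a factor to keep the row order
df <- data.frame(
  Week = factor(c("9-2", "10-1", "10-2"), levels = c("9-2", "10-1", "10-2")),
  Area = c(250, 300, 100),
  SM9_5 = c(212.2, 450.2, 150.2)
)

# Hypothetical scale factor mapping soil moisture onto the Area range
k <- max(df$SM9_5) / max(df$Area)

gg <- ggplot(df, aes(x = Week)) +
  geom_point(aes(y = Area, colour = "Area")) +
  # Divide by k to draw on the primary axis; group = 1 joins the discrete weeks
  geom_line(aes(y = SM9_5 / k, colour = "SM9_5", group = 1)) +
  # sec_axis multiplies by k so the right-hand axis reads in original units
  scale_y_continuous(name = "Area",
                     sec.axis = sec_axis(~ . * k, name = "Soil Moisture")) +
  scale_colour_manual(values = c("Area" = "blue", "SM9_5" = "red")) +
  labs(x = "Weeks", colour = "Parameter")

gg
```

With sec_axis(~.) as in the original code, the secondary axis simply repeats the primary one, so the two variables share a single scale.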
