Related
I've a dataset which has countries in a column, and I intend creating anoda column for the region.
I used the code below using mutate and case_when,
dput(head(internet))
structure(list(Entity = c("Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan"), Code = c("AFG",
"AFG", "AFG", "AFG", "AFG", "AFG"), Year = c(1990, 1991, 1992,
1993, 1994, 1995), Internet_Usage = c(0, 0, 0, 0, 0, 0)), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
dput(head(Middle_East_and_North_Africa))
structure(list(`c(...)` = c("Algeria", "Bahrain", "Egypt", "Iran",
"Iraq", "Israel")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
internet_n <- internet %>%
mutate(Region = case_when(Entity %in% toupper(Middle_East_and_North_Africa) ~ "Middle East and North Africa",
Entity %in% toupper(Latin_America_and_Caribbean) ~ "Latin America and Caribbean",
Entity %in% toupper(East_Asia_and_Pacific) ~ "East Asia and Pacific",
Entity %in% toupper(South_Asia) ~ "South Asia",
Entity %in% toupper(North_America) ~ "North America",
Entity %in% toupper(Europe_Central_Asia) ~ "Europe Central Asia"))
but the new column (Region) has NA all through as shown below
dput(head(internet_n))
structure(list(Entity = c("Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan"), Code = c("AFG",
"AFG", "AFG", "AFG", "AFG", "AFG"), Year = c(1990, 1991, 1992,
1993, 1994, 1995), Internet_Usage = c(0, 0, 0, 0, 0, 0), Region = c(NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_, NA_character_
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
I have the following dataframe:
structure(list(share.beer = c(0.277, 0.1376, 0.1194, 0.0769,
0.0539, 0.0361, 0.0361, 0.0351, 0.0313, 0.03, 0.0119, 0.0084,
0.007, 0.0069), country = c("Brazil", "China, mainland", "United States",
"Thailand", "Vietnam", "China, mainland", "China, mainland",
"China, mainland", "China, mainland", "Argentina", "Indonesia",
"China, mainland", "China, mainland", "India"), Beer = c("soyb",
"maiz", "soyb", "cass", "cass", "whea", "rape", "soyb", "rice",
"soyb", "cass", "cott", "swpo", "rape")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -14L))
I want to create a barplot so that the beer type appears in the legend, the countries as y values while the share.beer are my values to be filled.
I have tried in various ways, including the following code, but I can't get the result I would like to. Here, for instance, I kept the variable "Beer""
df %>%
pivot_longer(cols = -Country, values_to = "Count", names_to = "Type") %>%
ggplot() +
geom_col(aes(x = reorder(Country, -Count), y = Count, fill = Beer))
However, I get an error
Can't combine share beer and Beer .
Any help?
You actually don't need the pivot_longer to create a suitable dataframe. You can use the following code:
library(tidyverse)
df %>%
ggplot() +
geom_col(aes(x = reorder(country, -share.beer), y = share.beer, fill = Beer)) +
xlab("Country") +
ylab("Share beer") +
coord_flip()
Output:
I have a dataset that I am presenting facetted by region and then using sub region as a fill. I have defined the colours using a separate named variable relating to the names of the subregion. I am wondering if it is possible to make the legend itself grouped in a similar way to the facet to make it easier to interpret.
The named sub_region variable
sub_region_colours <- c("South America" = "#0570b0", "Western Africa" = "#8c96c6", "Central America" = "#74a9cf", "Eastern Africa" = "#8856a7", "Northern Africa" = "#edf8fb", "Middle Africa" = "#b3cde3", "Southern Africa" = "#810f7c", "Northern America" = "#f1eef6", "Caribbean" = "#bdc9e1", "Eastern Asia" = "#bd0026", "Southern Asia" = "#fd8d3c", "South-Eastern Asia" = "#f03b20", "Southern Europe" = "#238b45", "Australia and New Zealand" = "#ce1256", "Melanesia" = "#df65b0", "Micronesia" = "#d7b5d8", "Polynesia" = "#f1eef6", "Central Asia" = "#fecc5c", "Western Asia" = "#ffffb2", "Eastern Europe" = "#66c2a4", "Northern Europe" = "#edf8fb", "Western Europe" = "#b2e2e2", "Small Islands" = "#252525")
This is the head(exporting_countries) grouping by sender_iso3, year and sender_region removed.
structure(list(sender_iso3 = c("ABW", "ABW", "ABW", "ABW", "ABW",
"ABW"), year = c(2005, 2011, 2014, 2015, 2016, 2017), sender_region = c("Americas",
"Americas", "Americas", "Americas", "Americas", "Americas"),
sender_subregion = c("Caribbean", "Caribbean", "Caribbean",
"Caribbean", "Caribbean", "Caribbean"), export = c(1, 1,
4, 5, 2, 1)), class = "data.frame", row.names = c(NA, -6L
))
Finally this is the code for the current plot
geom_bar()+
labs(title = "Number of countries reporting export of chickens",
fill = "Subregion")+
facet_wrap(~ sender_region)+
theme_minimal()+
scale_x_continuous(name = "Year", limits = c(1986, 2017), breaks = c(1986, 1990, 2000, 2010, 2017), guide = guide_axis(angle = 90))+
scale_fill_manual(values = sub_region_colours)+
guides(fill = guide_legend(ncol = 2))
Which at the moment produces this:
Graph with less than ideal legend
It would be great if I can group the legend fill colours similarly to the facets which would make it easier to read off.
One approach to achieve this would be to make seperate plots for each region and make use of patchwork to glue the plots together. A second approach would be to make use of the ggnewscale package which allows to have multiple fill (or ...) scales and legends in one plot.
However, similiar to using patchwork the approach using ggnewscale package could become a bit tedious as it requires to split the data according to the number of facets and plot each dataset via seperate layers. Therefore my solution adds a helper function which 1) splits the data and sets up the layers for each region or facet and 2) can be used to loop over the regions via e.g. lapply.
BTW: As your sample data included only one region I added a second region.
library(dplyr)
library(ggplot2)
library(ggnewscale)
sub_region_colours <- c("South America" = "#0570b0", "Western Africa" = "#8c96c6", "Central America" = "#74a9cf", "Eastern Africa" = "#8856a7", "Northern Africa" = "#edf8fb", "Middle Africa" = "#b3cde3", "Southern Africa" = "#810f7c", "Northern America" = "#f1eef6", "Caribbean" = "#bdc9e1", "Eastern Asia" = "#bd0026", "Southern Asia" = "#fd8d3c", "South-Eastern Asia" = "#f03b20", "Southern Europe" = "#238b45", "Australia and New Zealand" = "#ce1256", "Melanesia" = "#df65b0", "Micronesia" = "#d7b5d8", "Polynesia" = "#f1eef6", "Central Asia" = "#fecc5c", "Western Asia" = "#ffffb2", "Eastern Europe" = "#66c2a4", "Northern Europe" = "#edf8fb", "Western Europe" = "#b2e2e2", "Small Islands" = "#252525")
d <- structure(list(sender_iso3 = c(
"ABW", "ABW", "ABW", "ABW", "ABW",
"ABW", "ABW", "ABW", "ABW", "ABW", "ABW", "ABW"
), year = c(
2005,
2011, 2014, 2015, 2016, 2017, 2005, 2011, 2014, 2015, 2016, 2017
), sender_region = c(
"Americas", "Americas", "Americas", "Americas",
"Americas", "Americas", "Africa", "Africa", "Africa", "Africa",
"Africa", "Africa"
), sender_subregion = c(
"Caribbean", "Caribbean",
"Caribbean", "Caribbean", "Caribbean", "Caribbean", "Southern Africa",
"Southern Africa", "Southern Africa", "Southern Africa", "Southern Africa",
"Southern Africa"
), export = c(
1, 1, 4, 5, 2, 1, 1, 1, 4, 5,
2, 1
)), class = "data.frame", row.names = c(NA, -12L))
regions <- unique(d$sender_region)
# Layers for each region
make_layers <- function(x) {
d <- filter(d, sender_region == regions[[x]])
list(
if (x != 1) new_scale_fill(),
geom_bar(data = d, aes(x = year, fill = sender_subregion)),
scale_fill_manual(
values = sub_region_colours,
guide = guide_legend(
order = x,
title = regions[x],
title.position = "top"
)
)
)
}
p <- ggplot() +
lapply(seq_along(regions), make_layers)
# Add theme and wrap
p +
theme_minimal() +
scale_x_continuous(
name = "Year", limits = c(1986, 2017),
breaks = c(1986, 1990, 2000, 2010, 2017),
guide = guide_axis(angle = 90)
) +
facet_wrap(~sender_region)
I'd like to have an animation where some lines collapse into points, which are the mean value, to demonstrate that the lines can be summarised by the mean value.
Something like this.
First, set up the data, and the line plot:
library(tidyverse)
# remotes::install_github("njtierney/brolgar)
# library(brolgar)
# h_cut <- sample_n_keys(heights, 5) %>%
# mutate(type = "raw")
#
# datapasta::dpasta(h_cut)
h_cut <- tibble::tribble(
~country, ~year, ~height_cm, ~continent, ~type,
"Bolivia", 1890, 163.594, "Americas", "raw",
"Bolivia", 1900, 162.45, "Americas", "raw",
"Bolivia", 1930, 162.5, "Americas", "raw",
"Bolivia", 1940, 163.4, "Americas", "raw",
"Bolivia", 1950, 162.482, "Americas", "raw",
"Bolivia", 1960, 163.182, "Americas", "raw",
"Bolivia", 1970, 163.886, "Americas", "raw",
"Bolivia", 1980, 164.191, "Americas", "raw",
"Bolivia", 1990, 168.1, "Americas", "raw",
"Bolivia", 2000, 168.7, "Americas", "raw",
"Ethiopia", 1860, 169.3, "Africa", "raw",
"Ethiopia", 1880, 167.461, "Africa", "raw",
"Ethiopia", 1910, 161.451, "Africa", "raw",
"Ethiopia", 1920, 166.636, "Africa", "raw",
"Ethiopia", 1930, 167.27, "Africa", "raw",
"Ethiopia", 1940, 168.5, "Africa", "raw",
"Ethiopia", 1950, 166.823, "Africa", "raw",
"Ethiopia", 1960, 167.512, "Africa", "raw",
"Ethiopia", 1970, 167.49, "Africa", "raw",
"Ethiopia", 1980, 167.253, "Africa", "raw",
"Georgia", 1840, 165.5, "Asia", "raw",
"Georgia", 1860, 163, "Asia", "raw",
"Georgia", 1890, 164.26, "Asia", "raw",
"Georgia", 2000, 173.2, "Asia", "raw",
"Paraguay", 1900, 165.615, "Americas", "raw",
"Paraguay", 1930, 165.363, "Americas", "raw",
"Paraguay", 1990, 172.6, "Americas", "raw",
"Spain", 1740, 163.3, "Europe", "raw",
"Spain", 1750, 163.6, "Europe", "raw",
"Spain", 1760, 163.2, "Europe", "raw",
"Spain", 1770, 164.3, "Europe", "raw",
"Spain", 1780, 163.3, "Europe", "raw",
"Spain", 1830, 161, "Europe", "raw",
"Spain", 1840, 163.7, "Europe", "raw",
"Spain", 1850, 162.5, "Europe", "raw",
"Spain", 1860, 162.7, "Europe", "raw",
"Spain", 1870, 162.6, "Europe", "raw",
"Spain", 1880, 163.9, "Europe", "raw",
"Spain", 1890, 164, "Europe", "raw",
"Spain", 1900, 164.6, "Europe", "raw",
"Spain", 1910, 165.1, "Europe", "raw",
"Spain", 1920, 165.6, "Europe", "raw",
"Spain", 1930, 165.2, "Europe", "raw",
"Spain", 1940, 166.3, "Europe", "raw",
"Spain", 1950, 170.8, "Europe", "raw",
"Spain", 1960, 174.2, "Europe", "raw",
"Spain", 1970, 175.2, "Europe", "raw",
"Spain", 1980, 175.6, "Europe", "raw"
)
ggplot(h_cut,
aes(x = year,
y = height_cm,
colour = country)) +
geom_line() +
theme(legend.position = "bottom")
Then, show the points
# demonstrate these lines collapsing down onto a point
h_sum <- h_cut %>%
group_by(country) %>%
summarise(height_cm = mean(height_cm)) %>%
mutate(year = max(h_cut$year),
type = "summary")
ggplot(h_sum,
aes(x = year,
y = height_cm)) +
geom_point()
These can be combined into one plot like so:
# combined:
p <- ggplot(h_cut,
aes(x = year,
y = height_cm,
colour = country)) +
geom_line() +
geom_point(data = h_sum,
aes(x = year,
y = height_cm,
colour = country))
p
Manually transition from line to points
library(gganimate)
anim <- p +
transition_layers(keep_layers = FALSE) +
enter_grow() +
exit_shrink() +
ease_aes(default = "cubic-in-out")
anim
But is there some way to make the lines shrink into the points?
h_full <- h_sum %>% full_join(h_cut)
#> Joining, by = c("country", "height_cm", "year", "type")
h_full
#> # A tibble: 53 x 5
#> country height_cm year type continent
#> <chr> <dbl> <dbl> <chr> <chr>
#> 1 Bolivia 164. 2000 summary <NA>
#> 2 Ethiopia 167. 2000 summary <NA>
#> 3 Georgia 166. 2000 summary <NA>
#> 4 Paraguay 168. 2000 summary <NA>
#> 5 Spain 166. 2000 summary <NA>
#> 6 Bolivia 164. 1890 raw Americas
#> 7 Bolivia 162. 1900 raw Americas
#> 8 Bolivia 162. 1930 raw Americas
#> 9 Bolivia 163. 1940 raw Americas
#> 10 Bolivia 162. 1950 raw Americas
#> # … with 43 more rows
p <- ggplot(h_full,
aes(x = year,
y = height_cm,
group = country,
colour = type)) +
geom_point() +
geom_line()
anim <- p + transition_states(type)
anim
#> Error in `$<-.data.frame`(`*tmp*`, ".id", value = c(1L, 1L, 1L, 1L, 1L, : replacement has 250 rows, data has 5
Created on 2019-07-25 by the reprex package (v0.3.0)
It seems as though you can stack multiple enter and exit animations together, such as exit_shrink and exit_fly.
Given the code you provided, I was able to have the lines shrink into the points by adding exit_fly(x_loc = 2000), which specifies that the lines fly to 2000 on the x axis.
Here is the edited code chunk which specifies the animation
anim <- p +
transition_layers(keep_layers = FALSE) +
enter_grow() +
exit_fly(x_loc = 2000) +
exit_shrink() +
ease_aes(default = "cubic-in-out")
anim
giving the following animation
For some reason the enter_grow() for the points isn't as smooth as your example which I could not figure out.
I am looking for some help with the given sample data of countries on one column and count on another column. I am trying a build a geo maps using ggplot showing the count and name of the country in the respective places of the map when I hover above the country. Below is the sample data given. I tried with the ggmap with the lat and long position to identify the country but not able to show the count and name of the country on hovering.
structure(list(Countries = c("USA", "India", "Europe", "LATAM",
"Singapore", "Phillipines", "Australia", "EMEA", "Malaysia",
"Hongkong", "Philippines", "Thailand", "New Zealand"
), count = c(143002, 80316, 33513, 3736, 2180, 1905, 1816, 921,
707, 631, 207, 72, 49)), .Names = c("Countries", "count"), row.names = c(NA,
13L), class = "data.frame")
I tried the below code.
countries = geocode(Countryprofile$Countries)
Countryprofile = cbind(Countryprofile,countries)
mapWorld <- borders("world", colour="grey", fill="lightblue")
q<-ggplot(data = Countryprofile) + mapWorld + geom_point(aes(x=lon, y=lat) ,color="red", size=3)+
geom_text(data = Countryprofile,aes(x=lon,y=lat,label=Countries))
ggplotly(q)
You can change any attribute in the result from ggplotly. In this case you can set the text attribute of the 2nd trace (where you markers are defined).
plotly_map <- ggplotly(q)
plotly_map$x$data[[2]]$text <- paste(Countryprofile$Countries,
Countryprofile$count,
sep='<br />')
plotly_map
library(plotly)
library(ggmap)
Countryprofile <- structure(list(Countries = c("USA", "India", "Europe", "LATAM",
"Singapore", "Phillipines", "Australia", "EMEA", "Malaysia",
"Hongkong", "Philippines", "Thailand", "New Zealand"
), count = c(143002, 80316, 33513, 3736, 2180, 1905, 1816, 921,
707, 631, 207, 72, 49)), .Names = c("Countries", "count"), row.names = c(NA,
13L), class = "data.frame")
countries = geocode(Countryprofile$Countries)
Countryprofile = cbind(Countryprofile,countries)
mapWorld <- borders("world", colour="grey", fill="lightblue")
q<-ggplot(data = Countryprofile) + mapWorld + geom_point(aes(x=lon, y=lat) ,color="red", size=3)+
geom_text(data = Countryprofile,aes(x=lon,y=lat,label=Countries))
plotly_map <- ggplotly(q)
plotly_map$x$data[[2]]$text <- paste(Countryprofile$Countries, Countryprofile$count, sep='<br />')
plotly_map