What I want seems simple, but I can't figure it out: I want to remove the NA entry from my legend labels. This is my first time using the "na.value" argument, so I'm not quite sure how to proceed.
(By the way, I can't drop the NAs before plotting, because the shapes outside the tourist regions would also disappear, and I need the full map.)
I have this code:
mun_tur_shape %>%
filter(abbrev_state == "BA") %>%
ggplot() +
geom_sf(aes(fill=TOURIST_REGION, colour=TOURIST_REGION)) +
scale_fill_manual(
na.value = "grey90",
values = c(viridis::inferno(13)),
aesthetics = c("fill", "colour")
) +
labs(fill = "Região Turística",
colour = "Região Turística"
)
And this is how it looks:
[Image: plot with NA value]
Does anyone know what I can do to omit them?
# Here's an sf object for a reproducible example:
# install.packages("geobr")
library(dplyr)
library(ggplot2)
df <- geobr::read_state()
df %>%
# creating NA values like my real dataset has
mutate(name_region=case_when(name_region=="Nordeste"~NA_character_,
TRUE~name_region)) %>%
ggplot() +
geom_sf(aes(fill=name_region, colour=name_region)) +
scale_fill_manual(
na.value = "grey90",
values = (viridis::inferno(4)),
aesthetics = c("fill", "colour")
) +
labs(colour = "Regions",
fill = "Regions")
My solution for this (see also "NA values in choropleth plot legend with ggplot2 in R") is to first add a base layer with all the polygons (no aes()), drawn in the colour you want for NA. After that, you can overlay the layer that uses aes(). In practice, you add one geom_sf() line to your code and set na.translate = FALSE in the scale, which drops NA from the legend:
library(dplyr)
library(ggplot2)
library(geobr)
df <- geobr::read_state()
df %>%
# creating NA values like my real dataset has
mutate(name_region = case_when(
name_region == "Nordeste" ~ NA_character_,
TRUE ~ name_region
)) %>%
# Create map
ggplot() +
# Add this line: a base layer with no aes(), drawn in the NA colour
geom_sf(fill = "grey50", color = "grey50") +
# end
geom_sf(aes(fill = name_region, colour = name_region)) +
scale_fill_manual(
na.value = "grey90",
values = (viridis::inferno(4)),
aesthetics = c("fill", "colour"),
na.translate = FALSE
) +
labs(
colour = "Regions",
fill = "Regions"
)
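To see the two ingredients of this approach in isolation, here is a minimal sketch that does not need geobr. The toy data frame and the colour names are made up for illustration, and geom_tile() stands in for geom_sf(); the base layer plus na.translate = FALSE pattern is the same.

```r
library(ggplot2)

# Hypothetical toy data: four tiles, one with a missing category
toy <- data.frame(
  x = 1:4,
  y = 1,
  region = c("A", "B", NA, "C")
)

p <- ggplot(toy, aes(x, y)) +
  # base layer: every tile drawn in the "missing" colour, outside aes()
  geom_tile(fill = "grey90", colour = "white") +
  # data layer: fill is mapped to the category
  geom_tile(aes(fill = region), colour = "white") +
  scale_fill_manual(
    values = c(A = "firebrick", B = "steelblue", C = "gold"),
    na.translate = FALSE # remove NA from the discrete scale and its legend
  )

p
```

Because the grey tiles come from a layer without a fill mapping, they never enter the legend, while the tile with a missing category still appears on the plot.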
I'm working on a plotting function that has an option to flip coordinates (using coord_flip). The plot is grouped (using the fill argument), and for some reason coord_flip also reverses the colors, the legend, my value column and my fill column. In practice, this means I have the following piece of code in my function:
if(flip_coord){
colors = c("#CC0000", "#002D73" ) %>% rev
rev_legend = T
table[[col_plot]] = fct_rev(table[[col_plot]]) # value column
table[['origin_table']] = fct_rev(table[['origin_table']]) # fill column
} else{
colors = c("#CC0000", "#002D73" )
rev_legend = F
}
There's also this line in my plot:
{if(flip_coord) coord_flip()} +
This brings back everything else that gets scrambled with coord_flip, but isn't too elegant. Is there a better way to only flip coordinates without reversing everything else?
PS: I know there's no reproducible example here, I'll try to add one, but if someone has already stumbled upon the answer to this problem that might be common, I'll post as is for the moment.
Edit: made some reprex. Let's say my data is this:
df = tibble(origin = c('2000s', '1990s') %>% rep(2),
region = c('South', 'North') %>% rep(2) %>% sort,
value = 1:4) %>%
mutate(origin = factor(origin, levels = c('1990s', '2000s')),
region = factor(region, levels = c('North', 'South')))
colors = c('red', 'blue')
# origin region value
# <fct> <fct> <int>
# 1 2000s North 1
# 2 1990s North 2
# 3 2000s South 3
# 4 1990s South 4
If I plot regularly, everything comes ordered (90s first, 00s second, North first, South second):
df %>%
ggplot(aes(x = region, fill = origin, y = value)) +
geom_bar(stat = "identity", position = 'dodge', color = "white", alpha= 0.8)+
scale_fill_manual(values=colors)
But, if I flip coordinates (just adding + coord_flip() to the code above) I get the following:
South is above North, 00s is above 90s, and the legend isn't in the same order as the bars. This is exactly the same if I input x = value and y = origin. So, to fix this I have to do the following:
df2 = df
df2[['region']] = fct_rev(df2[['region']]) # Change 1
df2[['origin']] = fct_rev(df2[['origin']]) # Change 2
df2 %>%
ggplot(aes(x = value, fill = origin, y = region)) +
geom_bar(stat = "identity", position = 'dodge', color = "white", alpha= 0.8) +
guides(fill = guide_legend(reverse = T)) + # Change 3
scale_fill_manual(values=rev(colors)) # Change 4
Bringing the correct orders:
Is there any less cumbersome way to achieve this?
The issue is that coord_flip() changes the ordering of the bars within each group in a grouped bar plot.
According to a linked post, a hacky way to solve this is to use a negative width (set on geom_col() in the code below), which reverses the order of the dodged bars.
Adding scale_x_discrete(limits = rev) then puts North in the correct position:
library(tidyverse)
df %>%
ggplot(aes(x=region, y=value, fill=origin))+
geom_col(position = position_dodge(), width = -0.4)+
scale_fill_manual(values = c("red", "blue")) +
coord_flip()+
scale_x_discrete(limits=rev)+
theme_minimal(base_size=16)+
theme(axis.title.x=element_blank(),
axis.title.y=element_blank())
coord_flip() does not flip everything around. Factor levels are plotted starting from the bottom, so 1990s will be below 2000s, and North will be below South.
The simplest way I can see is to reverse your factor levels when creating the factors.
library(tidyverse)
df <- tibble(
origin = c("2000s", "1990s") %>% rep(2),
region = c("South", "North") %>% rep(2) %>% sort(),
value = 1:4
) %>%
mutate(
## just reverse the factor levels
origin = factor(origin, levels = rev(c("1990s", "2000s"))),
region = factor(region, levels = rev(c("North", "South")))
)
colors <- c("red", "blue")
df %>%
# switched x and y
ggplot(aes(y = region, x = value, fill = origin)) +
geom_bar(stat = "identity", position = "dodge", color = "white", alpha = 0.8) +
## this is to set the correct legend order and mapping to your colors
scale_fill_manual(values = colors, breaks = rev(unique(df$origin)))
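Reversing the levels changes only the order in which the categories are laid out, not the data itself. A small base R sketch of what rev() on the levels does (forcats::fct_rev() from the question has the same effect); the vector below simply repeats the origin values from the example:

```r
# Same categories as in the question, with levels in their natural order
origin <- factor(c("2000s", "1990s", "2000s", "1990s"),
                 levels = c("1990s", "2000s"))

# Reverse only the level order; the stored values are untouched
origin_rev <- factor(origin, levels = rev(levels(origin)))

levels(origin)            # "1990s" "2000s"
levels(origin_rev)        # "2000s" "1990s"
as.character(origin_rev)  # "2000s" "1990s" "2000s" "1990s"
```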
I have a shapefile with 7 regions.
I have an Excel file with data about reptiles in these 7 regions.
I joined the shapefile with the Excel data.
Using ggplot, I tried to facet by nome_popular with facet_wrap(); however, in each facet the polygons without data were omitted.
My attempt:
shapefile: https://drive.google.com/file/d/1I1m9lBX69zjsdGBg2zfpii5H4VFYE1_0/view?usp=sharing
excel:https://docs.google.com/spreadsheets/d/1eKQWWCAalehTTrUuqUlMPQnSTEZxF--g/edit?usp=sharing&ouid=118442515534677263769&rtpof=true&sd=true
# load packages
library(readxl)
library(dplyr)
library(ggplot2)
# load data.frame
serpentes <- read_excel("E:/22-serpentes_cg/R/serpentes_cg_finall.xlsx")
# summarise data.frame
total_especies <- serpentes %>%
rename(regiao_cg = REGIAO_CG) %>%
group_by(
especie, nome_popular,
regiao_cg
) %>%
summarise(Total_esp = sum(quant))
# load shapefile
regiao <- sf::st_read("E:/22-serpentes_cg/geo/regioes_urbanas.shp") %>%
rename(regiao_cg = REGIAO_CG)
# join shapefile and excel
total_especies_shp <- dplyr::left_join(regiao, total_especies, by = "regiao_cg")
# map with facet_wrap
p_total_especies_shp <- ggplot(
na.omit(total_especies_shp),
aes(fill = factor(Total_esp))
) +
geom_sf() +
scale_fill_brewer(
palette = "Spectral", na.value = "grey", direction = -1,
"Total de\nSepertens Regatadas"
) +
facet_wrap(~nome_popular)
p_total_especies_shp
[Image: incomplete output]
EDIT
I tried @stefan's answer, which partly worked, but it generated an unwanted facet called "NA".
New code:
p_total_especies_shp <- ggplot(total_especies_shp)+
geom_sf(data=regiao)+
geom_sf(aes(fill=factor(Total_esp)))+
scale_fill_brewer(
palette = "Spectral", na.value = "grey", direction = -1,
"Total de\nSepertens Regatadas")+
facet_wrap(~nome_popular)
p_total_especies_shp
The issue is that with faceting the data is split into groups, and only the polygons contained in each group will show up.
If you want all regions to be shown in each facet, one option is to add a base map via a second geom_sf() layer. In your case, + geom_sf(data = regiao) + geom_sf() should do the job.
As an example, I use the default example from ?geom_sf:
library(ggplot2)
set.seed(42)
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
base <- nc
nc$facet <- sample(c("a", "b", "c", "d"), size = nrow(nc), replace = TRUE)
ggplot(nc) +
geom_sf(data = base) +
geom_sf(aes(fill = AREA)) +
facet_wrap(~facet)
Building on @stefan's answer: if you want to get rid of the NA panel, pass the data to the second geom wrapped in na.omit(), leaving the ggplot() call empty:
p_total_especies_shp <- ggplot() +
geom_sf(data = regiao) +
geom_sf(aes(fill = factor(Total_esp)), data = na.omit(total_especies_shp)) +
scale_fill_brewer(
palette = "Spectral", na.value = "grey", direction = -1,
"Total de\nSepertens Regatadas"
) +
facet_wrap(~nome_popular, drop = TRUE)
p_total_especies_shp
Which gives the result you want:
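For context on where the NA panel comes from: a left join keeps every region, and regions without a matching row get NA in the joined columns, including the faceting variable. The sketch below uses the column names from the question with made-up values and plain data frames instead of sf objects.

```r
library(dplyr)

# Hypothetical stand-ins for the shapefile attributes and the summarised counts
regions <- data.frame(regiao_cg = c("A", "B", "C"))
counts <- data.frame(
  regiao_cg    = c("A", "B"),
  nome_popular = c("x", "y"),
  Total_esp    = c(3, 1)
)

joined <- left_join(regions, counts, by = "regiao_cg")
joined           # region "C" has NA in nome_popular and Total_esp
na.omit(joined)  # only the two regions with data remain
```

Note that na.omit() drops a row if any of its columns is NA, so on a wider table it may remove more rows than intended; filtering on the faceting column alone, for example with dplyr::filter(!is.na(nome_popular)), is a narrower alternative.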
I'm struggling to modify the colour/shape/... of the points based on whether a value is missing or not.
library(ggplot2)
library(naniar)
ggplot(data = airquality,
aes(x = Ozone,
y = Solar.R)) +
geom_miss_point()
What I have
airquality_no_na <-airquality[!(is.na(airquality$Ozone) | is.na(airquality$Solar.R)) ,]
airquality_na <-airquality[(is.na(airquality$Ozone) | is.na(airquality$Solar.R)),]
ggplot() +
geom_point(data = airquality_no_na,
aes(x = Ozone,
y = Solar.R, colour = "NoMissing")) +
geom_miss_point(data = airquality_na,
aes(x = Ozone,
y = Solar.R, colour = "Missing")) +
scale_colour_manual(name = 'Legende',
values =c('NoMissing'='green',
'Missing'='blue'))
What I would like to have
I don't know how to draw the missing values in green and the non-missing values in blue without splitting the data into two data frames.
EDIT:
My issue is a bit more complex. I want to be able to choose the colours for the first data set (missing in blue, not missing in green) and for the second data set (missing in red, not missing in yellow).
#Create dataframes
df1=as.data.frame(matrix(data=runif(n=200, 0,1),ncol=2))
df2=as.data.frame(matrix(data=runif(n=100, 0,1),ncol=2))
#Add missing values
df1[rbinom(n=100,size=1,prob = 0.1) ==1,1] <- NA
df1[rbinom(n=100,size=1,prob = 0.1) ==1,2] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,1] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,2] <- NA
# This doesn't work: it only prints in blue (missing) and green (not missing)
ggplot() +
geom_miss_point(data = df1,
aes(x = V1,
y = V2)) +
geom_miss_point(data = df2,
aes(x = V1,
y = V2)) +
scale_colour_manual(values = c("blue", "green", "yellow","red"))
I am not sure if this is a good idea, but for the sake of showing how to do this in theory: from a quick look into the naniar package, I understand that the colour aesthetic is mapped to ..missing.. by default. You would need to dig quite deep into the actual geom to change that behaviour, but there is a simple workaround.
Create a second colour scale with ggnewscale.
You will not get around subsetting your data first, but this is not a bad thing. Don't be afraid to subset your data; that's a very normal thing to do.
library(tidyverse)
library(naniar)
library(ggnewscale)
ggplot() +
geom_miss_point(data = df1, aes(V1, V2)) +
scale_colour_manual(name = "df1", values = c("blue", "green")) +
new_scale_color() +
geom_miss_point(data = df2, aes(V1, V2)) +
scale_colour_manual(name = "df2", values = c("yellow","red"))
With some trial and error I came up with a solution using the group aesthetic:
Row-bind your datasets and add an identifier.
Map the dataset identifier on group.
Map the interaction of ..group.. and naniar's ..missing.. on colour. (I first tried using dataset directly, but that did not work.)
library(ggplot2)
library(naniar)
set.seed(42)
#Create dataframes
df1=as.data.frame(matrix(data=runif(n=200, 0,1),ncol=2))
df2=as.data.frame(matrix(data=runif(n=100, 0,1),ncol=2))
#Add missing values
df1[rbinom(n=100,size=1,prob = 0.1) ==1,1] <- NA
df1[rbinom(n=100,size=1,prob = 0.1) ==1,2] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,1] <- NA
df2[rbinom(n=50,size=1,prob = 0.1) ==1,2] <- NA
dplyr::bind_rows(df1, df2, .id = "dataset") %>%
ggplot() +
geom_miss_point(aes(x = V1,
y = V2,
group = dataset,
colour = interaction(..group.., ..missing..))) +
scale_colour_manual(values = c("blue", "red", "green", "yellow"))
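The order of the colours in scale_colour_manual() follows the level order of the interaction. A base R sketch of that ordering, using stand-in vectors; the labels "Missing" and "Not Missing" are assumed here for illustration and are not read from naniar:

```r
# Stand-ins for the two computed variables (labels are assumptions)
group   <- c(1, 1, 2, 2)
missing <- c("Missing", "Not Missing", "Missing", "Not Missing")

combo <- interaction(group, missing)

# By default the first factor varies fastest in the level order
levels(combo)
# "1.Missing" "2.Missing" "1.Not Missing" "2.Not Missing"
```

Under these assumptions, the values c("blue", "red", "green", "yellow") pair up as dataset 1 missing, dataset 2 missing, dataset 1 not missing, dataset 2 not missing.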
I have a dataframe that looks as follows:
X = c(6,6.2,6.4,6.6,6.8,5.6,5.8,6,6.2,6.4,6.6,6.8,7,7.2,7.4,7.6,7.8,8,2.8,3,3.2,3.4,3.6,3.8,4,4.2,4.4,4.6,4.8,5)
Y = c(2.2,2.2,2.2,2.2,2.2,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.6,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8,2.8)
Value = c(0,0.00683254,0,0.007595654,0.015517884,0,0,0,0,0,0,0,0,0,0.005219395,0,0,0,0,0,0,0,0,0,0,0,0.002892342,0,0.002758141,0)
table = data.frame(X, Y, Value)
I have put together a heatmap in R, based on the following command:
ggplot(data = table, mapping = aes(x = X, y = Y)) +
geom_tile(aes(fill = Value), colour = 'black') +
theme_void() +
scale_fill_gradient2(low = "white", high = "black") + xlab(label = "X") + ylab(label = "Y")
Since there is not a value for every X and Y, it leads to plots that appear as follows.
I am attempting to smoothen the plot and have the following question:
As there are small white spaces between the plotted values, how could one color these white spaces to be the median intensity? Said differently, how would I first create an initial layer with non-zero median 'Value' before plotting the non-zero 'Value' on top (overlayed)?
A sample is shown below, which has been 'smoothed', which looks closer to the desired output.
I'm not sure this will totally fit your need, but from my understanding some combinations of X and Y have no value.
So, you can use the complete() function from tidyr to get all combinations of X and Y (those without values will be filled with NA), and then use the na.value argument of scale_fill_gradient2() to give these NA cells the same colour as the midpoint value:
library(tidyr)
library(dplyr)
library(ggplot2)
table %>% complete(X,Y) %>%
ggplot(aes(x = X, y = Y))+
geom_raster(aes(fill = Value), interpolate = TRUE)+
scale_fill_gradient2(low = "white", mid = "grey",high = "black",
na.value = "grey")
Does this answer your question?
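As a quick illustration of what complete() does to the grid, here is a sketch on a tiny made-up table (the values are hypothetical, not taken from the question's data):

```r
library(tidyr)

# Three observed cells out of a 2 x 2 grid; the cell X = 1, Y = 2 is absent
small <- data.frame(
  X     = c(1, 2, 2),
  Y     = c(1, 1, 2),
  Value = c(0.1, 0.2, 0.3)
)

filled <- complete(small, X, Y)
filled  # four rows; the previously absent cell now has Value = NA
```

The added rows are the ones that na.value colours in the heatmap.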
I'd like to be able to use a gradient to fill the colors on a map, but I need specific values (like zero) to be a specific color (say, red or grey).
Is there some way to first apply the gradient, and then set these specific color values? I'd like to be able to do it for multiple specific values if possible.
In the example below, how could we make the 0 values red?
suppressPackageStartupMessages(require(tidyverse))
suppressPackageStartupMessages(require(ggmap))
suppressPackageStartupMessages(require(viridis))
suppressPackageStartupMessages(require(albersusa)) #devtools::install_github("hrbrmstr/albersusa")
us <- usa_composite()
us_map <- fortify(us, region="name") %>%
rename(state = id)
dat <- tibble(state = state.name, value = sample(-2:5, 50, replace = T))
dat %>%
right_join(us_map) %>%
ggplot() +
geom_polygon(aes(x = long, y = lat, fill = value, group = group), color = "white", size = .2) +
coord_fixed(1.3) +
scale_fill_viridis()
#> Joining, by = "state"
Created on 2019-02-20 by the reprex package (v0.2.1)
You can change 0 to NA in your plot data object and use the na.value argument of scale_fill_viridis():
# Create plot data object
pd <- right_join(dat, us_map)
# Replace wanted value with NA
pd$value[pd$value == 0] <- NA
ggplot(pd, aes(long, lat, fill = value, group = group)) +
geom_polygon(color = "white", size = 0.2) +
coord_fixed(1.3) +
scale_fill_viridis(na.value = "red")
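The same replacement extends to several special values with %in%. A minimal sketch on a plain vector; the set of special values is hypothetical:

```r
# Example values, similar in range to the question's sample(-2:5, ...)
value <- c(-2, 0, 3, 5, 0, 1)

# Hypothetical set of values that should get the special colour
special <- c(0, 5)

value[value %in% special] <- NA
value  # -2 NA 3 NA NA 1
```

One limitation of this approach: every value turned into NA shares the single na.value colour, so it cannot give different special values different colours.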