Base R choropleth: colors aren't applied to the map in the order of the intervals/breaks, which makes the map hard to read

I created a choropleth with base R but I'm struggling with the colors. First, the colors don't follow the same order as the intervals, and second, two of the intervals use the same color, all of which makes the graph hard to read. This happens regardless of how many colors I use. It also doesn't matter whether I'm using brewer.pal or base colors. Here is a map with its respective legend illustrating the issue.
Below are the statements that I use to create the graph once data has been downloaded:
#Relevant packages:
library(dplyr)
library(RColorBrewer)
library(rgdal)
#create colors vector
pop_colors <- brewer.pal(8,"Purples")
#create breaks/intervals
pop_breaks <- c(0,20000,40000,60000,80000,100000,120000)
#apply breaks to population
cuts <- cut(cal_pop$Pop2016, pop_breaks, dig.lab = 6)
#create a vector with colors by population according to the interval they belong to:
color_breaks <- pop_colors[findInterval(cal_pop$Pop2016,vec = pop_breaks)]
#Create choropleth
plot(cal_pop,col = color_breaks, main = "Calgary Population (2016)")
#create legend
legend("topleft", fill = color_breaks, legend = levels(cuts), title = "Population")
I used readOGR() command to read the shape file, which I'm linking here in case anybody is interested in taking a look at the data.
I'd appreciate any advice you could give me.
Thanks!

Your error is in this line:
color_breaks <- pop_colors[findInterval(cal_pop$Pop2016,vec = pop_breaks)]
I can't read your data file, so I'll use a built-in shapefile that ships with the maptools package.
library(rgdal)        # provides readOGR()
library(RColorBrewer) # provides brewer.pal()
nc <- readOGR(system.file("shapes/", package="maptools"), "sids")
str(nc@data)
colors <- brewer.pal(8,"Purples")
#create breaks/intervals
sid_breaks <- c(0,2,4,6,8,10,12,20,60)
#apply breaks to the SID counts
sid_cuts <- cut(nc$SID79, sid_breaks, dig.lab = 6, include.lowest = TRUE)
#create a vector with one color per polygon, according to the interval it belongs to:
sid_colors <- colors[sid_cuts]
#Create choropleth
par(mar=c(0,0,0,0))
plot(nc, col = sid_colors)
legend("bottomleft", fill = colors, legend = levels(sid_cuts), ncol = 2, title = "SID (1979)", bty = "n")
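A minimal sketch of the idea behind this fix, using made-up population values rather than the Calgary data. The palette holds one colour per interval, the factor returned by cut() indexes it to give one colour per polygon, and the legend is filled from the palette itself rather than from the per-polygon vector. The values in pop and the colorRampPalette() ramp are assumptions chosen for illustration only.

```r
# Hypothetical population values (not taken from the question's data)
pop <- c(15000, 95000, 42000, 61000, 3000, 118000)
pop_breaks <- c(0, 20000, 40000, 60000, 80000, 100000, 120000)

# 7 breaks define 6 intervals, so the palette needs exactly 6 colours
pal <- colorRampPalette(c("white", "purple4"))(length(pop_breaks) - 1)

# Factor with one level per interval, in increasing order
cuts <- cut(pop, pop_breaks, dig.lab = 6, include.lowest = TRUE)

# One colour per observation: index the palette by the interval number
poly_cols <- pal[as.integer(cuts)]

# Stand-in for the map: bars coloured per observation,
# legend filled from the palette so it follows the interval order
barplot(pop, col = poly_cols)
legend("topleft", fill = pal, legend = levels(cuts), title = "Population")
```

Passing the per-observation vector (poly_cols here) to fill in legend() would pair the first few observations' colours with the interval labels, which is one way a legend can end up out of order or with repeated colours.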

Related

R: How can I assign points on a map a color based on a set of values?

I have run a factor analysis on a spatial dataset, and I would like to plot the results on a map so that the color of each individual point (location) is a combination in a RGB/HSV space of the scores at that location of the three factors extracted.
I am using base R to plot the locations, which are in a SpatialPointsDataFrame created with the sp package:
Libraries
library(sp)
library(classInt)
Sample Dataset
fas <- structure(list(MR1 = c(-0.604222013102789, -0.589631093835467,
-0.612647301042234, 2.23360319770647, -0.866779007222414), MR2 = c(-0.492209397489792,
-0.216810726717787, -0.294487678489753, -0.60466348557844, 0.34752411748663
), MR3 = c(-0.510065798219453, -0.61303212834454, 0.194263734935779,
0.347461766159926, -0.756375966467285), x = c(1457543.717, 1491550.224,
1423185.998, 1508232.145, 1521316.942), y = c(4947666.766, 5001394.895,
4948766.5, 4950547.862, 5003955.997)), row.names = c("Acqui Terme",
"Alagna", "Alba", "Albera Ligure", "Albuzzano"), class = "data.frame")
Create spatial object
fas <- SpatialPointsDataFrame(fas[,4:5], fas,
proj4string = CRS("+init=EPSG:3003"))
Plotting function
map <- function(f) {
pal <- colorRampPalette(c("steelblue","white","tomato2"), bias = 1)
collist <- pal(10)
class <- classIntervals(f, 8, style = "jenks")
color <- findColours(class, collist)
plot(fas, pch=21,cex=.8, col="black",bg=color)
}
#example usage
#map(fas$MR1)
The above code works well for producing a separate plot for each factor. What I would like is a way to produce a composite map of the three factors together.
Many thanks in advance for any suggestion.
I found a solution through this post! With the data shown above, it goes like this:
#choose columns to map to color
colors <- fas@data[,c(1:3)]
#set range from 0 to 1
range_col <- function(x){(x-min(x))/(max(x)-min(x))}
colors_norm <- range_col(colors)
print(colors_norm)
#convert to RGB
colors_rgb <- rgb(colors_norm)
print(colors_rgb)
#plot
plot(fas, main="Color Scatterplot", bg=colors_rgb,
col="black",pch=21)
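A small self-contained sketch of the same mapping, with three made-up rows of scores. Note one deliberate difference, which is an assumption of this sketch and not part of the answer above: each factor is rescaled to [0, 1] separately, so every colour channel uses its full range, whereas range_col() applied to the whole data frame rescales with the global minimum and maximum. The helper name rescale01 is hypothetical.

```r
# Hypothetical factor scores for three locations
scores <- data.frame(
  MR1 = c(-0.60, -0.59, 2.23),
  MR2 = c(-0.49, -0.22, 0.35),
  MR3 = c(-0.51, 0.19, -0.76)
)

# Rescale a numeric vector linearly to the range [0, 1]
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Apply the rescaling column by column
scores_norm <- as.data.frame(lapply(scores, rescale01))

# MR1 -> red, MR2 -> green, MR3 -> blue; one hex colour per location
point_cols <- rgb(scores_norm$MR1, scores_norm$MR2, scores_norm$MR3)
print(point_cols)
```

The resulting vector can be passed as bg= to plot() in the same way as colors_rgb above.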

Trying to plot in tmap shapefile with attribute

I am trying to work with municipality data in Norway, and I'm totally new to QGIS, shapefiles and plotting this in R. I downloaded the municipalities from here:
Administrative enheter kommuner / Administrative units municipalities
Reproducible files are here:
Joanna's github
I have downloaded QGIS, so I can open the GEOJson file there and convert it to a shapefile. I am able to do this, and read the data into R:
library(sf)
test=st_read("C:/municipality_shape.shp")
head(test)
I have on my own given the different municipalities different values/ranks that I call faktor, and I have stored this classification in a dataframe that I call df_new. I wish to merge this "classification" on to my "test" object above, and wish to plot the map with the classification attribute onto the map:
test33=merge(test, df_new[,c("Kommunekode_str","faktor")],
by=c("Kommunekode_str"), all.x=TRUE)
This works, but when I am to plot this with tmap,
library(tmap)
tmap_mode("view")
tm_shape(test33) +
tm_fill(col="faktor", alpha=0.6, n=20, palette=c("wheat3","red3")) +
tm_borders(col="#000000", lwd=0.2)
it throws this error:
Error in object[-omit, , drop = FALSE] : incorrect number of
dimensions
If I just use base plot,
plot(test33)
I get the picture:
You see I get three plots. Does this have something to do with my error above?
I think the main issue here is that the shapes you are trying to plot are too complex so tmap is struggling to load all of this data. ggplot also fails to load the polygons.
You probably don't need so much accuracy in your polygons if you are making a choropleth map so I would suggest first simplifying your polygons. In my experience the best way to do this is using the package rmapshaper:
# keep = 0.02 will keep just 2% of the points in your polygons.
test_33_simple <- rmapshaper::ms_simplify(test33, keep = 0.02)
I can now use your code to produce the following:
tmap_mode("view")
tm_shape(test_33_simple) +
tm_fill(col="faktor", alpha=0.6, n=20, palette=c("wheat3","red3")) +
tm_borders(col="#000000", lwd=0.2)
This produces an interactive map and the colour scheme is not ideal to tell differences between municipalities.
static version
Since you say in the comments that you are not sure if you want an interactive map or a static one, I will give an example with a static map and some example colour schemes.
The code below uses the classInt package to set up breaks for your map. A popular break scheme is 'fisher', which uses the Fisher-Jenks algorithm. Make sure you research the various options to pick one that suits your scenario:
library(ggplot2)
library(dplyr)
library(sf)
library(classInt)
breaks <- classIntervals(test_33_simple$faktor, n = 6, style = 'fisher')
#label breaks
lab_vec <- vector(length = length(breaks$brks)-1)
rounded_breaks <- round(breaks$brks,2)
lab_vec[1] <- paste0('[', rounded_breaks[1],' - ', rounded_breaks[2],']')
for(i in 2:(length(breaks$brks) - 1)){
lab_vec[i] <- paste0('(',rounded_breaks[i], ' - ', rounded_breaks[i+1], ']')
}
test_33_simple <- test_33_simple %>%
mutate(faktor_class = factor(cut(faktor, breaks$brks, include.lowest = T), labels = lab_vec))
# map
ggplot(test_33_simple) +
geom_sf(aes(fill = faktor_class), size= 0.2) +
scale_fill_viridis_d() +
theme_minimal()
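The label-building loop above can also be written without a loop. The following is a base R sketch with made-up break values (brks and x are assumptions, not output of classIntervals()); it builds the same style of labels and attaches them through the labels argument of cut().

```r
# Hypothetical break points standing in for breaks$brks
brks <- c(0, 1.5, 3, 7.25)
n <- length(brks) - 1  # number of intervals

# First interval is closed on the left, the others are open on the left
left_bracket <- c("[", rep("(", n - 1))
labs <- paste0(left_bracket, brks[-length(brks)], " - ", brks[-1], "]")

# Hypothetical values to classify, including both end points
x <- c(0, 1.5, 2, 7.25)
cls <- cut(x, brks, include.lowest = TRUE, labels = labs)
print(cls)
```

Because cut() keeps the levels in break order, a discrete fill scale built from such a factor lists the classes from lowest to highest.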

How to combine state distribution plot and separate legend in traminer?

Plotting several clusters using seqdplot in TraMineR can make the legend messy, especially in combination with numerous states. This calls for additional options for modifying the legend which is available with the function seqlegend. However, I have a hard time combining a state distribution plot (seqdplot) with a separate modified legend (seqlegend). Ideally one wants to plot the clusters (e.g. 9) without a legend and then add the separate legend in the available bottom right row, but instead the separate legend is generating a new plot window. Can anyone help?
Here's an example using the biofam data. With the data I use in my own research the legend becomes much more messy since I have 11 states.
#Data
library(TraMineR)
library(WeightedCluster)
data(biofam)
biofam.seq <- seqdef(biofam[501:600, 10:25])
#OM distances
biofam.om <- seqdist(biofam.seq, method = "OM", indel = 3, sm = "TRATE")
#9 clusters
wardCluster <- hclust(as.dist(biofam.om), method = "ward.D2")
cluster9 <- cutree(wardCluster, k = 9)
#State distribution plot
seqdplot(biofam.seq, group = cluster9, with.legend = F)
#Separate legend
seqlegend(biofam.seq, title = "States", ncol = 2)
#Combine state distribution plot and separate legend
#??
Thank you.
The seqplot function does not let you control the number of columns of the legend, nor does it let you add a legend title. So you have to compose the plot yourself by generating a separate plot for each group with the legend disabled and adding the legend afterwards. Here is how you can do that:
cluster9 <- factor(cluster9)
levc <- levels(cluster9)
lev <- length(levc)
par(mfrow=c(5,2))
for (i in 1:lev)
seqdplot(biofam.seq[cluster9 == levc[i],], border=NA, main=levc[i], with.legend=FALSE)
seqlegend(biofam.seq, ncol=4, cex = 1.2, title='States')
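The layout mechanism used here can be tried without TraMineR. The sketch below is a base graphics stand-in, not the seqdplot()/seqlegend() calls themselves: the groups and states are simulated, barplot() plays the role of the per-cluster plot, and the legend is drawn in the panel that is still free in the 5 x 2 grid.

```r
# Simulated data: 9 groups and 3 states (assumptions for illustration)
set.seed(1)
grp <- factor(rep(1:9, each = 4))
states <- c("a", "b", "c")
cols <- c("tomato", "steelblue", "gold")
state <- factor(sample(states, length(grp), replace = TRUE), levels = states)

# 5 x 2 grid: 9 panels for the groups, the 10th stays free for the legend
op <- par(mfrow = c(5, 2), mar = c(2, 2, 2, 1))
for (g in levels(grp)) {
  barplot(table(state[grp == g]), col = cols, main = g)
}

# Open the free panel and draw only the legend in it
plot.new()
legend("center", legend = states, fill = cols, ncol = 3,
       title = "States", bty = "n")
par(op)
```

The point of the design is that the grid has one more cell than there are groups, so the legend gets its own panel on the same page.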
Update, Oct 1, 2018
Since TraMineR V 2.0-9, the seqplot family of functions now support (when applicable) the argument ncol to control the number of columns in the legend. To add a title to the legend, you still have to proceed as shown above.
AFAIK seqlegend() doesn't work when the other plots you are drawing use the group argument. In your case the only thing seqlegend() adds is the title "States". If you want a legend whose contents you can customize, you can get one by providing the alphabet and states used in your analysis.
The package's website has several walkthroughs and guides enumerating the various options and so forth: Link to their website
#Data
library(TraMineR)
library(WeightedCluster)
data(biofam)
## Generate alphabet and states
alphabet <- 0:7
states <- letters[seq_along(alphabet)]
biofam.seq <- seqdef(biofam[501:600, 10:25], states = states, alphabet = alphabet)
#OM distances
biofam.om <- seqdist(biofam.seq, method = "OM", indel = 3, sm = "TRATE")
#9 clusters
wardCluster <- hclust(as.dist(biofam.om), method = "ward.D2")
cluster9 <- cutree(wardCluster, k = 9)
#State distribution plot
seqdplot(biofam.seq, group = cluster9, with.legend = TRUE)

Plot a table with box size changing

Does anyone have an idea how is this kind of chart plotted? It seems like heat map. However, instead of using color, size of each cell is used to indicate the magnitude. I want to plot a figure like this but I don't know how to realize it. Can this be done in R or Matlab?
Try scatter:
scatter(x,y,sz,c,'s','filled');
where x and y are the positions of each square, sz is the size (must be a vector of the same length as x and y), and c is a length(x)-by-3 matrix with one RGB color per row. The labels for the plot can be input with set(gcf,properties) or xticklabels:
X=30;
Y=10;
[x,y]=meshgrid(1:X,1:Y);
x=reshape(x,[size(x,1)*size(x,2) 1]);
y=reshape(y,[size(y,1)*size(y,2) 1]);
sz=50;
sz=sz*(1+rand(size(x)));
c=[1*ones(length(x),1) repmat(rand(size(x)),[1 2])];
scatter(x,y,sz,c,'s','filled');
xlab={'ACC';'BLCA';etc}
xticks(1:X)
xticklabels(xlab)
set(get(gca,'XLabel'),'Rotation',90);
ylab={'RAPGEB6';etc}
yticks(1:Y)
yticklabels(ylab)
EDIT: yticks & co. are only available from R2016b onward; if you have an older version you should use set instead:
set(gca,'XTick',1:X,'XTickLabel',xlab,'XTickLabelRotation',90) %rotation only available from R2014b onward
set(gca,'YTick',1:Y,'YTickLabel',ylab)
In R, you can use ggplot2, which lets you map your values (gene expression in your case?) onto the size variable. Here, I simulated data that resembles your data structure:
my_data <- matrix(rnorm(8*26,mean=0,sd=1), nrow=8, ncol=26,
dimnames = list(paste0("gene",1:8), LETTERS))
Then, you can process the data frame to be ready for ggplot2 data visualization:
library(reshape)
dat_m <- melt(my_data, varnames = c("gene", "cancer"))
Now, use ggplot2::geom_tile() to map the values onto the size variable. You may update additional features of the plot.
library(ggplot2)
ggplot(data=dat_m, aes(cancer, gene)) +
geom_tile(aes(size=value, fill="red"), color="white") +
scale_fill_discrete(guide=FALSE) + ##hide scale
scale_size_continuous(guide=FALSE) ##hide another scale
In R, the corrplot package can be used. Specifically, you have to use method = 'square' when creating the plot.
Try this as an example:
library(corrplot)
corrplot(cor(mtcars), method = 'square', col = 'red')
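If neither package is available, a similar chart can be sketched with base graphics alone using symbols(). This is an alternative offered under assumptions, not what the answers above do: the data are simulated, non-negative values are assumed, and the side length is set to the square root of the value so that the area of each square is proportional to it.

```r
# Simulated non-negative values: 8 genes x 26 cancer types
set.seed(42)
vals <- matrix(runif(8 * 26), nrow = 8,
               dimnames = list(paste0("gene", 1:8), LETTERS))

# Long format: one row per cell, with its row and column index
grid <- expand.grid(gene = seq_len(nrow(vals)), cancer = seq_len(ncol(vals)))
grid$value <- vals[cbind(grid$gene, grid$cancer)]

# One square per cell; inches sets the side of the largest square
symbols(grid$cancer, grid$gene, squares = sqrt(grid$value),
        inches = 0.12, bg = "red", fg = "white",
        xaxt = "n", yaxt = "n", xlab = "", ylab = "")
axis(1, at = seq_len(ncol(vals)), labels = colnames(vals))
axis(2, at = seq_len(nrow(vals)), labels = rownames(vals), las = 1)
```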

Overlap image plot on a Google Map background in R

I'm trying to add this plot of a function defined on Veneto (an Italian region),
obtained with image and contour:
image(X,Y,evalmati,col=heat.colors(100), xlab="", ylab="", asp=1,zlim=zlimits,main=title)
contour(X,Y,evalmati,add=T)
(here you can find objects: https://dl.dropboxusercontent.com/u/47720440/bounty.RData)
on a Google Map background.
I tried two ways:
PACKAGE RGoogleMaps
I downloaded the map background
MapVeneto<-GetMap.bbox(lonR=c(10.53,13.18),latR=c(44.7,46.76),size = c(640,640),MINIMUMSIZE=TRUE)
PlotOnStaticMap(MapVeneto)
but I don't know which commands to use to add the plot produced by image and contour to the map
PACKAGE loa
I tried this way:
lat.loa<-NULL
lon.loa<-NULL
z.loa<-NULL
nx=dim(evalmati)[1]
ny=dim(evalmati)[2]
for (i in 1:nx)
{
for (j in 1:ny)
{
if(!is.na(evalmati[i,j]))
{
lon.loa<-c(lon.loa,X[i])
lat.loa<-c(lat.loa,Y[j])
z.loa<-c(z.loa,evalmati[i,j])
}
}
}
GoogleMap(z.loa ~ lat.loa*lon.loa,col.regions=c("red","yellow"),labels=TRUE,contour=TRUE,alpha.regions=list(alpha=.5, alpha=.5),panel=panel.contourplot)
but the plot wasn't like the first one:
The legend of this plot has 7 colors and the plot uses only those values; the image() plot is more accurate.
How can I add image plot to GoogleMaps background?
If the use of a GoogleMap map is not mandatory (e.g. if you only need to visualize the coastline + some depth/altitude information on the map), you could use the package marmap to do what you want. Please note that you will need to install the latest development version of marmap available on github to use readGEBCO.bathy() since the format of the files generated when downloading GEBCO files has been altered recently. The data from the NOAA servers is fine but not very accurate in your region of interest (only one minute resolution vs half a minute for GEBCO). Here is the data from GEBCO I used to produce the map : GEBCO file
library(marmap)
# Get hypsometric and bathymetric data from either NOAA or GEBCO servers
# bath <- getNOAA.bathy(lon1=10, lon2=14, lat1=44, lat2=47, res=1, keep=TRUE)
bath <- readGEBCO.bathy("GEBCO_2014_2D_10.0_44.0_14.0_47.0.nc")
# Create color palettes for sea and land
blues <- c("lightsteelblue4", "lightsteelblue3", "lightsteelblue2", "lightsteelblue1")
greys <- c(grey(0.6), grey(0.93), grey(0.99))
# Plot the hypsometric/bathymetric map
plot(bath, land=T, im=T, lwd=.03, bpal = list(c(0, max(bath), greys), c(min(bath), 0, blues)))
plot(bath, n=1, add=T, lwd=.5) # Add coastline
# Transform your data into a bathy object
rownames(evalmati) <- X
colnames(evalmati) <- Y
class(evalmati) <- "bathy"
# Overlay evalmati on the map
plot(evalmati, land=T, im=T, lwd=.1, bpal=col2alpha(heat.colors(100),.7), add=T, drawlabels=TRUE) # use deep= shallow= step= to adjust contour lines
plot(outline.buffer(evalmati),add=TRUE, n=1) # Outline of the data
# Add cities locations and names
library(maps)
map.cities(country="Italy", label=T, minpop=50000)
Since your evalmati data is now a bathy object, you can adjust its appearance on the map like you would for the map background (adjust the number and width of contour lines, adjust the color gradient, etc). plot.bathy() uses both image() and contour() so you should be able to get the same results as when you plot with image(). Please take a look at the help for plot.bathy() and the package vignettes for more examples.
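The overlay step itself (a semi-transparent image plus contours drawn on top of an existing plot) can be tried with base graphics before bringing in any map data. The sketch below uses the built-in volcano matrix as a stand-in for both the background and evalmati, which is purely an assumption for illustration; adjustcolor() is used here to add transparency.

```r
# Stand-in background grid: the built-in volcano elevation matrix
bg <- volcano
x <- seq_len(nrow(bg))
y <- seq_len(ncol(bg))

# Stand-in for evalmati: keep only the higher cells, NA elsewhere
overlay <- bg
overlay[bg < 150] <- NA

# heat.colors with 70% opacity so the background stays visible
cols <- adjustcolor(heat.colors(100), alpha.f = 0.7)

# Draw the background first, then add the overlay and its contours
image(x, y, bg, col = grey.colors(50), asp = 1, xlab = "", ylab = "")
image(x, y, overlay, col = cols, add = TRUE)
contour(x, y, overlay, add = TRUE)
```

Cells that are NA in the overlay are not drawn, so the background shows through outside the region of interest.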
I am not really familiar with the subject, but Lovelace, R. "Introduction to visualising spatial data in R" might help you
https://github.com/Robinlovelace/Creating-maps-in-R/raw/master/intro-spatial-rl.pdf From section "Adding base maps to ggplot2 with ggmap" with small changes and data from https://github.com/Robinlovelace/Creating-maps-in-R/archive/master.zip
library(dplyr)
library(ggmap)
library(rgdal)
lnd_sport_wgs84 <- readOGR(dsn = "./Creating-maps-in-R-master/data",
layer = "london_sport") %>%
spTransform(CRS("+init=epsg:4326"))
lnd_wgs84_f <- lnd_sport_wgs84 %>%
fortify(region = "ons_label") %>%
left_join(lnd_sport_wgs84@data,
by = c("id" = "ons_label"))
ggmap(get_map(location = bbox(lnd_sport_wgs84) )) +
geom_polygon(data = lnd_wgs84_f,
aes(x = long, y = lat, group = group, fill = Partic_Per),
alpha = 0.5)

Resources