Guidance on producing multivariate Radar plots in ggplot - r

I am currently attempting to make two different radar plots (see attached drawn representations).
Both plots use the same nested experimental design utilizing two independent variables: location and lithology, covering 36 experimental sites. Location has three levels (W,C,E) with each of the categories containing three lithologies (C,M,P).
Plot one is a representation of the distribution of altitude and orientation across sites with orientation being in degrees (0-360) and altitude being in meters.
Plot two will be used to represent the differing mineralogy of groups of sites, using three minerals (A,B,C).
Drawn depiction of radar plot 1 with four cardinal directions, colours representing climate location (independent variable 1) and shapes representing lithology (independent variable 2). Circles represent altitude, with larger distance from the center indicating higher altitude.
The second radar plot uses the same independent variables: location (W,C,E) and lithology (C,M,P), with the red highlighted area representing the quantity of mineral A and yellow indicating mineral B.
If anyone has any pointers, packages or guides which could help I would greatly appreciate it.
Edit: The second plot doesn't seem too difficult to make but the first is still causing issues. I have partially solved using polar plots but am now having difficulty adjusting the aesthetics.
Data head:
head(radar_plot_data)
   climate  lithology     plot pos_lit  code altitude_m orientation_ao
1: Central Calcareous C.C.Pp.1      CC C.C.1       1150          150.8
2: Central Calcareous C.C.Pp.2      CC C.C.2        860           24.0
3: Central Calcareous C.C.Pp.3      CC C.C.3       1026           90.0
4: Central Calcareous C.C.Pp.4      CC C.C.4       1326           86.3
5: Central Calcareous C.C.Pp.5      CC C.C.5        966           87.5
6: Central Metapelite C.M.Pp.1      CM C.M.1        951           28.3
Current code:
library(ggplot2)
ggplot(radar_plot_data, aes(x = orientation_ao, y = altitude_m)) +
  # stroke is a fixed value, so it is set outside aes()
  geom_point(aes(shape = climate, color = lithology), stroke = 2.5) +
  coord_polar() +
  scale_x_continuous(limits = c(0, 360),
                     breaks = seq(0, 360, by = 45),
                     minor_breaks = seq(0, 360, by = 15))

Related

Geopandas: colorbar consists of bullets and is not a bar

I'm considering a geopandas DataFrame with population data mapped to zip codes.
Its head looks like
geometry plz NUTS3 einwohner
0 POLYGON ((9.36585 54.69994, 9.36683 54.70014, ... 24988 DEF0C 3350
1 POLYGON ((12.47666 49.13598, 12.47702 49.13637... 93185 DE235 1786
2 POLYGON ((12.54904 49.19318, 12.54953 49.19371... 93489 DE235 2622
3 POLYGON ((12.62945 49.28007, 12.62949 49.28013... 93494 DE235 2018
4 POLYGON ((12.76492 49.27279, 12.76496 49.27288... 93473 DE235 1931
Plotting einwohner (population)
fig, ax = plt.subplots()
plz_shape_gdf.plot(
    ax=ax,
    column='einwohner',
    categorical=False,
    legend=True,
    cmap='hot_r',
    alpha=0.8
)
yields a nice heatmap
Next I uniformly distribute the population of a NUTS3 region onto the zip code areas via
plz_shape_gdf['einwohner_uniform'] = ''
for nuts in plz_shape_gdf['NUTS3']:
    maske = (plz_shape_gdf['NUTS3'] == nuts)
    plz_shape_gdf.loc[maske, 'einwohner_uniform'] = plz_shape_gdf.loc[maske, 'einwohner'].mean()
Plotting over the column einwohner_uniform yields the correct heatmap, I guess, but the colorbar is somewhat messed up (see picture below, sorry for its length).
What's causing the issue?
Is it possible to fix this behavior?
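One possible explanation, offered as an assumption rather than a confirmed diagnosis: initialising the column with '' gives it a non-numeric (object) dtype, and a non-numeric column may be drawn with a categorical legend (one bullet per value) instead of a continuous colorbar. The pandas-only sketch below computes the per-region mean in a way that keeps the column numeric. The small frame reuses the NUTS3 and einwohner values from the head shown above; a GeoDataFrame should behave the same way for this step.

```python
import pandas as pd

# Stand-in for plz_shape_gdf, using the NUTS3 / einwohner values from the head above
df = pd.DataFrame({
    "NUTS3": ["DEF0C", "DE235", "DE235", "DE235", "DE235"],
    "einwohner": [3350, 1786, 2622, 2018, 1931],
})

# Mean population per NUTS3 region, broadcast back to every row of that region.
# The result is a float column, so no placeholder string is ever stored in it.
df["einwohner_uniform"] = df.groupby("NUTS3")["einwohner"].transform("mean")

print(df["einwohner_uniform"].dtype)
print(df)
```

If the dtype is indeed the cause, plotting with column='einwohner_uniform' and legend=True on the numeric column would be expected to produce a continuous colorbar again.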

Find centre location coordinates in R - geospatial analysis

I am trying to find the centre coordinates of high-density areas in R.
The dataset I have has about 1.5million rows and looks like this (dummy data)
LATITUDE LONGITUDE val
1 35.83111 -90.64639 359.1
2 42.40630 -90.31810 74.5
3 40.07806 -83.07806 115.4
4 40.53210 -90.14730 112.0
5 42.76310 -84.76220 118.4
6 39.29750 -87.97460 134.4 ...
...
After plotting it using ggmap and ggplot using the command
ggmap(UK_Map) +
geom_density2d(data=processedSubsetData,aes(x=processedSubsetData$Longitude,y=processedSubsetData$Latitude), bins=5) +
stat_density2d(data=processedSubsetData,aes(x=processedSubsetData$Longitude,y=processedSubsetData$Latitude,fill=..level.., alpha=..level..), geom='polygon')
I have the visualization which looks like below image.
As you can see from the image, there are some high-density areas. I need to find the local centre coordinates of these high-density areas on the map.
I have tried calculating the distance between the points and also rounding the coordinates to group them, but I have not been able to make it work and am stuck.
Thanks

Identifying data points amongst background noise for binned data R

Not sure whether this should go on Cross Validated or not, but we'll see. Basically, I recently obtained data from an instrument (masses of compounds from 0 to 630), which I binned into bins of width 0.025 before plotting a histogram, as seen below:
I want to identify the bins that have a high frequency and stand out against the background noise (the background noise increases as you move from right to left on the x-axis). Imagine drawing a curved line on top of the points that have almost blurred together into a black lump, and then selecting the bins that lie above that curve for further investigation; that's what I'm trying to do. I plotted a kernel density estimate to see whether I could overlay it on my histogram and use it to identify points that lie above it. However, the density plot doesn't help, as the densities are far too low (see the second plot). Does anyone have any recommendations as to how I can go about solving this problem? The blue line represents the overlaid density function and the red line represents the ideal solution (I need a way of automating this in R).
The data below is only part of my dataset, so it's not really a good representation of my plot (which contains about 300,000 points), and as my bin sizes are quite small (0.025) there's a huge spread of data (in total there are 25,000 or so bins).
df <- read.table(header = TRUE, text = "
values
1 323.881306
2 1.003373
3 14.982121
4 27.995091
5 28.998639
6 95.983138
7 2.0117459
8 1.9095478
9 1.0072853
10 0.9038475
11 0.0055748
12 7.0964916
13 8.0725191
14 9.0765316
15 14.0102531
16 15.0137390
17 19.7887675
18 25.1072689
19 25.8338140
20 30.0151683
21 34.0635308
22 42.0393751
23 42.0504938
")
bin <- seq(0, 324, by = 0.025)
hist(df$values, breaks = bin, prob=TRUE, col = "grey")
lines(density(df$values), col = "blue")
Assuming you're dealing with a vector bin.densities that holds the density for each bin, a simple way to find outliers would be:
Look at a window around each bin, say +/- 50 bins:
current.bin <- 1
window.size <- 50
# parentheses are needed because ":" binds tighter than "+" and "-";
# max()/min() keep the window inside the vector
window.start <- max(1, current.bin - window.size)
window.end <- min(length(bin.densities), current.bin + window.size)
window <- bin.densities[window.start:window.end]
Find the 5% and 95% quantiles of the window (or really any values you think work):
lower.quant <- quantile(window, 0.05)
upper.quant <- quantile(window, 0.95)
Then say that the current bin is an outlier if it falls outside your quantile range:
this.is.too.high <- bin.densities[current.bin] > upper.quant
this.is.too.low <- bin.densities[current.bin] < lower.quant
# final result
this.is.outlier <- this.is.too.high | this.is.too.low
I haven't actually tested this code, but this is the general approach I would take. You can play around with window size and the quantile percentages until the results look reasonable. Again, not exactly super complex math but hopefully it helps.
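The windowed-quantile idea above, applied to every bin in turn, might look like the following. This is a sketch in Python rather than R, using only the standard library. The function name flag_outlier_bins and its parameters are hypothetical, and the toy input is made up for illustration, not taken from the instrument data.

```python
from statistics import quantiles


def flag_outlier_bins(bin_densities, window_size=50, lower=5, upper=95):
    """Return indices of bins lying outside the local [lower, upper] percentile range."""
    flagged = []
    n = len(bin_densities)
    for i in range(n):
        # Clamp the window so it never runs past either end of the data
        start = max(0, i - window_size)
        end = min(n, i + window_size + 1)
        window = bin_densities[start:end]
        # 99 cut points: cuts[k - 1] is the k-th percentile of the window
        cuts = quantiles(window, n=100, method="inclusive")
        value = bin_densities[i]
        if value > cuts[upper - 1] or value < cuts[lower - 1]:
            flagged.append(i)
    return flagged


# Toy example: flat background with a single spike at index 100
toy = [1.0] * 200
toy[100] = 50.0
print(flag_outlier_bins(toy))
```

As in the answer, the window size and percentile cut-offs are tuning knobs. If only peaks above the background matter, the lower-bound check can be dropped.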

For a given location, identify minimum kernel density isopleth

I am undertaking research looking at the interactions of individual rats with a grid of traps distributed across the landscape (I have x, y coordinates for all trap locations). For each rat, I have generated a kernel utilisation density "home range" estimate using the R package adehabitatHR. What I'd like to do next is the following:
1- For each rat, calculate fine-scale home range contours from 1 - 99%
2- For each trap, calculate the minimum isopleth on which it is located: for example, trap 1 might "first" be on the 20% isopleth, trap 2 might "first" be on the 71% isopleth
My ultimate goal is to use the minimum isopleths calculated in a logistic regression to estimate the probability that a particular rat will "encounter" a particular trap within a specified time period.
Step 1 is easy enough but I'm having trouble imagining a way to accomplish step 2 short of plotting it all out manually (possible but I think there must be a better way). I suspect that part of my problem is that I'm new to both R and analysis of spatial data and I'm probably not searching with the right key words. Of what I've managed to find, the discussion that most closely resembles what I want to do is this.
How can I get the value of a kernel density estimate at specific points?
The above succeeds in calculating the probability value at specific points within a kernel utilisation distribution. However, what I'm trying to do is more to assign specific locations to a "category" - i.e. 5% category, 22% category etc.
Here is a small sample of my rat location data (coordinate system NZTM)
RatID Easting Northing
18 1732782.018 5926656.26
18 1732746.074 5926624.161
18 1732775.206 5926617.687
18 1732750.443 5926653.985
18 1732759.188 5926645.705
18 1732765.358 5926624.287
18 1732762.588 5926667.765
18 1732707.336 5926638.793
18 1732759.54 5926693.451
18 1732743.532 5926645.08
18 1732724.905 5926637.952
18 1732729.757 5926594.709
18 1732743.725 5926603.689
18 1732754.217 5926591.804
18 1732733.287 5926619.997
18 1732813.398 5926632.372
18 1732764.513 5926609.795
18 1732756.472 5926607.948
18 1732771.352 5926609.855
18 1732789.088 5926598.158
18 1732768.952 5926620.593
18 1732742.667 5926630.391
18 1732751.399 5926595.63
18 1732749.846 5926624.015
18 1732756.466 5926661.141
18 1732748.507 5926597.018
18 1732782.934 5926620.3
18 1732779.814 5926633.227
18 1732773.356 5926613.596
18 1732755.782 5926627.243
18 1732786.594 5926619.327
18 1732758.493 5926610.918
18 1732760.756 5926617.973
18 1732748.722 5926621.693
18 1732767.133 5926655.643
18 1732774.129 5926646.358
18 1732766.18 5926659.081
18 1732747.999 5926630.82
18 1732755.94 5926606.326
18 1732757.592 5926586.467
And here are the location data for my grid of traps:
TrapNum Easting Northing
HA1 1732789.055 5926589.589
HA2 1732814.738 5926605.615
HA3 1732826.837 5926614.635
HA4 1732853.275 5926621.766
HA5 1732877.903 5926638.804
HA6 1732893.335 5926649.771
HA7 1732917.186 5926651.287
HA8 1732944.25 5926669.952
HA9 1732963.233 5926679.758
HB1 1732778.721 5926613.718
HB2 1732798.169 5926624.735
HB3 1732818.44 5926631.303
HB4 1732844.132 5926647.878
HB5 1732862.387 5926662.465
HB6 1732884.118 5926671.112
HB7 1732903.641 5926681.234
HB8 1732931.883 5926695.332
HB9 1732947.286 5926698.757
HC1 1732766.385 5926629.555
HC2 1732785.31 5926647.128
HC3 1732801.985 5926657.742
HC4 1732835.289 5926664.553
HC5 1732843.434 5926694.72
HC6 1732862.648 5926702.187
HC7 1732878.385 5926709.82
HC8 1732916.886 5926712.215
HC9 1732935.947 5926715.582
HD1 1732755.253 5926654.033
HD2 1732774.911 5926672.812
HD3 1732794.617 5926671.724
HD4 1732820.064 5926689.754
HD5 1732816.794 5926714.769
HD6 1732841.166 5926732.481
HD7 1732865.646 5926734.21
HD8 1732906.592 5926738.893
HD9 1732930.1 5926752.73
Below is the code I used to calculate 1-99% home range contours using package adehabitatHR (Step 1). It also includes the code to plot selected home range isopleths over the grid of traps.
### First, load adehabitatHR and dependents
library(adehabitatHR)
## specifying which variables are coordinates converts the dataframe into class SpatialPointsDataFrame
coordinates (RatLocs) = c("Easting", "Northing")
# create and store in object kudH KUDs using default bivariate normal kernel function and least-squares-cross-validation as smoothing bandwidth
kudH = kernelUD(RatLocs[,1], h = "LSCV")
kudH
## estimating home range from the KUD - mode VECTOR
homerange = getverticeshr(kudH)
## calculate home-range area for ALL probability levels (every 1%)
hr1to100 = kernel.area(kudH, percent = seq(1,100, by =1))
# generates error - for 100% kernel. rerun kernel UD with larger extent parameter.
## tried a range of values for other extents. Couldn't get one that worked for a 100% isopleth, 99% works
hr1to99 = kernel.area(kudH, percent = seq(1,99, by =1))
## An example of calculating and plotting selected home range isopleths over the grid of traps
## plot the trap grid
plot(Grid[,2], Grid[,3], xlab="Easting", ylab="Northing", pch=3, cex = 0.6, col="black", bty = "n", xlim=c(1742650,1743100), ylim=c(5912900,5913200), main = "KUD Home Range rat 33")
text(Grid[,2], Grid[,3], Grid[,1], cex=0.6, pos=2)
# Calculate and plot 95%, 75% and 50% contours for rat ID 33 (rat 2 in dataset)
HR95pc = getverticeshr(kudH)
plot(HR95pc[2,], col= rgb (1,0,0, alpha =0.1), border = "red1", add=TRUE)
HR75pc = getverticeshr(kudH, percent=75)
plot (HR75pc[2,], col = rgb(0,0,1, alpha =0.3), border = "purple", add=TRUE)
HR50pc = getverticeshr(kudH, percent=50)
plot(HR50pc[2,], col = rgb (0,1,1, alpha=0.3), border = "blue2", add=TRUE)
# Add individual location points for rat ID 33
rat33L = subset(RatLocs, RatID =="33")
plot(rat33L[,1], pch = 16, col = "blue", add=TRUE)
Can anyone help me get started on Step 2? I'd be grateful for any ideas.
Thanks.

Use a for loop to create several bubble plots with different legend scales in R

I have been trying to make several bubble plots showing the frequency of observations (as a percentage) of several individuals in different sites. Some individuals were found in the same site, but not all. Also, the number of locations within each site may vary among individuals. My main problem is that I have more than 3 individuals and more than 3 sites, so I have been trying to come up with a good, fast way of creating these bubble plots and legends. I am also having problems with the legend, as I need a function that will place the legend in the same location each time a new plot is created. In the legend I want to show different bubble sizes for each frequency (if possible, indicating the value next to the bubble).
Here is an example of my script. Any suggestions or ideas on how to do this will be extremely helpful.
# require libraries
library(maptools)
library(sp)
data<-read.table(text="ind lat long site freq perc
A -18.62303 147.29207 A 449 9.148329258
A -18.6195 147.29492 A 725 14.77180114
A -18.62512 147.3018 A 3589 73.12550937
A -18.62953 147.29422 A 145 2.954360228
B -18.75383 147.25405 B 2 0.364963504
B -18.73393 147.28162 B 1 0.182481752
B -18.62303 147.29207 A 3 0.547445255
B -18.6195 147.29492 A 78 14.23357664
B -18.62512 147.3018 A 451 82.29927007
B -18.62953 147.29422 A 13 2.372262774
C -18.51862 147.39717 C 179 0.863857922
C -18.53281 147.39052 C 20505 98.95757927
C -18.52847 147.40167 C 37 0.178562811",header=TRUE)
# Split data frame for each tag
ind<-data$ind
M<-split(data,ind)
l<-length(M)
### Detection Plots ###
pdf("Plots.pdf",width=11,height=8,paper="a4r")
par(mfrow=c(1,1))
for(j in 1:l){
# locations
new.data<-M[[j]]
site<-as.character(unique(new.data$site))
fname<-paste(new.data$ind[1],sep="")
loc<-new.data[,c("long","lat")]
names(loc)<-c("X", "Y")
coord<-SpatialPoints(loc)
coord1<-SpatialPointsDataFrame(coord,new.data)
# draw some circles with specify radius size
x<-new.data$long
y<-new.data$lat
freq<-new.data$perc
rad<-freq
rad1<-round(rad,1)
title<-paste("Ind","-",fname," / ","Site","-",new.data$site[1],sep="")
# create bubble plot
symbols(x,y,circles=rad1,inches=0.4,fg="black",bg="red",xlab="",ylab="")
points(x,y,pch=1,col="black",cex=0.4)
par(new=T)
# map scale
maps::map.scale(grconvertX(0.4,"npc"),grconvertY(0.1, "npc"),
ratio=FALSE,relwidth=0.2,cex=0.6)
# specifying coordinates for legend
legX<-grconvertX(0.8,"npc")
legY1<-grconvertY(0.9,"npc")
legY2<-legY1-0.001
legY3<-legY2-0.0006
legY4<-legY3-0.0003
# creating the legend
leg<-data.frame(X=c(legX,legX,legX,legX),Y=c(legY1,legY2,legY3,legY4),
rad=c(1000,500,100,25))
symbols(leg$X,leg$Y,circles=leg$rad,inches=0.3,add=TRUE,fg="black",bg="white")
mtext(title,3,line=1,cex=1.2)
mtext("Latitude",2,line=3,padj=1,cex=1)
mtext("Longitude",1,line=2.5,padj=0,cex=1)
box()
}
dev.off()
The first plot is actually OK, and only needs the values of the frequency/perc next to the legend bubbles. However, it does not really work with the others...
You are hardcoding the legend position - make it relative...
legX<-grconvertX(0.8,"npc")
legY1<-grconvertY(0.9,"npc")
# Get the size of the plotting area (measured on the y axis)
ysize <- par()$usr[4]-par()$usr[3]
# Use that to calculate the new positions
legY2<-legY1 - (0.1* ysize)
legY3<-legY1 - (0.2* ysize)
legY4<-legY1 - (0.3* ysize)
This will put the bubbles in the same place on all the plots (in steps of 10% of the height of the plotting area).

Resources