ggplot create map with arrows - r

I have a data frame like this
id lon lat
1 A -69.5 -58.5
2 A -69.5 -58.5
3 A -69.5 -57.5
4 A -68.5 -57.5
5 A -68.5 -57.5
6 A -68.5 -57.5
7 A -66.5 -57.5
8 A -68.5 -56.5
9 A -68.5 -56.5
10 A -67.5 -56.5
11 A -65.5 -56.5
12 A -65.5 -56.5
13 A -65.5 -55.5
14 A -62.5 -54.5
15 B -177 -52.5
16 B -178 -50.5
17 B -179 -48.5
18 B 179 -47.5
19 B 178 -46.5
20 B 177 -46.5
and I want to produce a map of the positions of A and B, linked by oriented lines. However, when a track crosses the antimeridian in the Pacific (lon=-180 -> lon=+180), I get an arrow crossing the whole figure, as shown below.
This is the code I am using
worldmap = map_data("world")
ggplot(test, aes(x = lon, y=lat, colour = factor(id))) +
geom_polygon(data=worldmap,center=180,aes(x=long, y=lat, group=group), fill="black",colour="black") +
xlab("") +ylab("")+theme(axis.text=element_blank(),axis.ticks=element_blank())+ theme(panel.background = element_rect(fill = 'white', colour = 'black') ,panel.grid.major = element_blank(),panel.grid.minor = element_blank())+
geom_path(size =2,arrow = arrow(angle=30,length = unit(0.6, "inches")))
How can I fix it?
Thanks

I guess that depends on what you think the "right" thing to do is. I decided to break up the paths that cross the globe into two segments by adding points at the edge of the map, and then creating a "sequence" indicator so ggplot knows which points to connect. Here's the transformation for your sample data
test2 <- do.call(rbind, lapply(split(test, test$id), function(x) {
  # start a new segment wherever longitude jumps across the antimeridian
  cp <- cumsum(c(FALSE, abs(diff(x$lon)) > 250))
  xx <- split(x, cp)
  xx <- Map(cbind, xx, seq = seq_along(xx))
  Reduce(function(a, b) {
    lasta <- a[nrow(a), ]
    firstb <- b[1, ]
    # move the two boundary points to the map edge on their own side
    lasta$lon <- 180 * sign(lasta$lon)
    firstb$lon <- 180 * sign(firstb$lon)
    # meet at the midpoint latitude; mean() needs a single vector
    lasta$lat <- mean(c(lasta$lat, firstb$lat))
    firstb$lat <- lasta$lat
    rbind(a, lasta, firstb, b)
  }, xx)
}))
tail(test2)
# id lon lat seq
# B.17 B -179 -48.5 1
# B.171 B -180 -48.0 1
# B.18 B 180 -48.0 2
# B.181 B 179 -47.5 2
# B.19 B 178 -46.5 2
# B.20 B 177 -46.5 2
Here you can see that we've broken the B line up into two sequences. Then if we use a group aesthetic
geom_path(aes(group=interaction(id, seq)), ...)
then ggplot will only connect those points that are in the same id/seq group. This will prevent the line from going across the whole map. However, because we are drawing two lines for that track rather than one, there's no direct way to turn off the arrow head for just one of the segments. You might want to find another way to indicate start/end.
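As a minimal sketch of the grouping idea on its own (the `track` data frame and the `seg` column are hypothetical names, and no edge points are added here), the segment index can be computed from the longitude jumps and passed to the group aesthetic:

```r
library(ggplot2)

# Hypothetical miniature of track B from the question
track <- data.frame(
  id  = "B",
  lon = c(-177, -178, -179, 179, 178, 177),
  lat = c(-52.5, -50.5, -48.5, -47.5, -46.5, -46.5)
)

# A jump of more than 250 degrees between consecutive points is treated
# as a crossing of the antimeridian and starts a new segment
track$seg <- cumsum(c(FALSE, abs(diff(track$lon)) > 250)) + 1

# Each id/seg combination is drawn as its own path
p <- ggplot(track, aes(x = lon, y = lat, colour = id)) +
  geom_path(aes(group = interaction(id, seg)),
            arrow = arrow(angle = 30, length = unit(0.2, "inches"))) +
  coord_cartesian(xlim = c(-180, 180))

print(p)
```

Without the extra points at lon = -180 and lon = 180, the two pieces simply stop at the last observation on each side, so the line no longer sweeps across the map but also does not reach its edge.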

Related

Points in a scatterplot with individual ellipses using ggplot2 in R

My dataset is formed by 4 columns, as shown below:
The two columns on the left represent the XY coordinates of a geographical structure, and the two on the right represent the size of "each" geographical unit (diameters North-South and East-West)
I would like to graphically represent a scatterplot where to plot all the coordinates and draw over each point an ellipse including the diameters of each geographical unit.
Manually, and using only two points, the image should be like this one:
How can I do it using ggplot2?
You can download the data here
Use geom_ellipse() from ggforce:
library(ggplot2)
library(ggforce)
d <- data.frame(
x = c(10, 20),
y = c(10, 20),
ns = c(5, 8),
ew = c(4, 4)
)
ggplot(d, aes(x0 = x, y0 = y, a = ew/2, b = ns/2, angle = 0)) +
geom_ellipse() +
coord_fixed()
Created on 2019-06-01 by the reprex package (v0.2.1)
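If ggforce is not available, the same outline can be sketched by hand with the parametric form of an axis-aligned ellipse. This is an illustrative alternative, not part of the answer above; `ellipse_outline` is a hypothetical helper, and it makes explicit that the half-diameters are what enter the formula:

```r
library(ggplot2)

# Hypothetical helper: outline of an axis-aligned ellipse centred at (x0, y0)
# with full east-west diameter `ew` and full north-south diameter `ns`
ellipse_outline <- function(x0, y0, ew, ns, id, n = 101) {
  t <- seq(0, 2 * pi, length.out = n)
  data.frame(id = id,
             x = x0 + (ew / 2) * cos(t),
             y = y0 + (ns / 2) * sin(t))
}

d <- data.frame(x = c(10, 20), y = c(10, 20), ns = c(5, 8), ew = c(4, 4))

# One outline per row of d, stacked into a single data frame
outlines <- do.call(rbind, lapply(seq_len(nrow(d)), function(i) {
  ellipse_outline(d$x[i], d$y[i], d$ew[i], d$ns[i], id = i)
}))

p <- ggplot(outlines, aes(x, y, group = id)) +
  geom_path() +
  geom_point(data = d, aes(x, y), inherit.aes = FALSE) +
  coord_fixed()

print(p)
```

coord_fixed() matters in both versions: without a 1:1 aspect ratio the drawn shapes would not reflect the real proportions of the diameters.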
I'm not adding any new code to what Claus Wilke already posted above. All credit should go to Claus. I'm simply testing it with the actual data, and showing OP how to post data,
Loading packages needed
# install.packages(c("tidyverse"), dependencies = TRUE)
library(tidyverse)
Reading data,
tbl <- read.table(
text = "
X Y Diameter_N_S Diameter_E_W
-4275 1145 77 96
-4855 1330 30 25
-4850 1612 45 90
-4990 1410 15 15
-5055 1230 60 50
-5065 1503 43 45
-5135 1305 40 50
-5505 1190 55 70
-5705 1430 90 40
-5645 1535 52 60
", header = TRUE, stringsAsFactors = FALSE) %>% as_tibble()
showing data,
tbl
#> # A tibble: 10 x 4
#> X Y Diameter_N_S Diameter_E_W
#> <int> <int> <int> <int>
#> 1 -4275 1145 77 96
#> 2 -4855 1330 30 25
#> 3 -4850 1612 45 90
#> 4 -4990 1410 15 15
#> 5 -5055 1230 60 50
#> 6 -5065 1503 43 45
#> 7 -5135 1305 40 50
#> 8 -5505 1190 55 70
#> 9 -5705 1430 90 40
#> 10 -5645 1535 52 60
loading more packages needed
library(ggforce) # devtools::install_github("thomasp85/ggforce")
executing
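# a and b are semi-axes, so the diameters are halved (as in the answer above);
# geom_ellipse() is the name used in current ggforce releases
ggplot(tbl, aes(x0 = X, y0 = Y, a = Diameter_E_W / 2, b = Diameter_N_S / 2, angle = 0)) +
geom_ellipse() + geom_point(aes(X, Y), size = .5) + coord_fixed() + theme_bw()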

ggplot2 geom_area overlay area plots in front of each other

I am trying to make an area plot in which the different areas are overlaid on one another rather than stacked.
I have a dataframe that looks like this:
r variable value
1 45.0 Cat 1 4.057250e+03
2 52.5 Cat 1 3.537323e+03
3 56.1 Cat 1 3.429861e+03
4 57.3 Cat 1 3.395330e+03
5 57.6 Cat 1 3.389983e+03
6 45.0 Cat 2 4.545455e-03
7 52.5 Cat 2 4.509400e+01
8 56.1 Cat 2 3.525753e+02
9 57.3 Cat 2 4.185094e+02
10 57.6 Cat 2 4.336622e+02
11 45.0 Cat 3 4.074432e+03
12 52.5 Cat 3 3.630504e+03
13 56.1 Cat 3 3.919076e+03
14 57.3 Cat 3 3.957039e+03
15 57.6 Cat 3 3.970083e+03
16 45.0 Cat 4 1.718182e+01
17 52.5 Cat 4 9.318133e+01
18 56.1 Cat 4 4.892154e+02
19 57.3 Cat 4 5.617087e+02
20 57.6 Cat 4 5.801001e+02
I am trying to get area plots for each category. My code for that is:
p <- ggplot(reshaped_data, aes(r, value))
p <- p + labs(x = "X Axis", y = "Y Axis") + ggtitle(title)
p <- p + geom_area(aes(colour = variable, fill= variable), position = 'stack')
p
And the result I am getting looks like this:
How can I make it so that the area graphs aren't stacked on each other, but the smallest are overlaid in front of the bigger ones?
Thanks
Using tidyverse:
library(forcats)
p + geom_area(aes(colour = variable,
fill= fct_reorder(variable, value, .desc = TRUE)), position = 'identity')
Remove .desc = TRUE if it does the opposite of what you want.
As Nathan wrote you have to use geom_area(position = "identity", ...)
But before this you should reorder the levels of variable:
df$variable <- factor(df$variable, unique(df[order(df$value, decreasing = TRUE),"variable"]) )
or
df$variable <- reorder(df$variable, df$value, function(x) -max(x) )
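A minimal, self-contained sketch of the reordering approach, using a small subset of the question's values (rounded) and assuming, as the answers above do, that areas are drawn in factor-level order so the first level ends up at the back:

```r
library(ggplot2)

# Small subset of the question's data: Cat 2 is small, Cat 3 is large
df <- data.frame(
  r        = rep(c(45.0, 52.5, 57.6), times = 2),
  variable = rep(c("Cat 2", "Cat 3"), each = 3),
  value    = c(0.0045, 45.094, 433.66, 4074.4, 3630.5, 3970.1)
)

# Put the category with the largest maximum first so it is drawn first
df$variable <- reorder(df$variable, df$value, function(x) -max(x))
levels(df$variable)

# position = "identity" overlays the areas instead of stacking them
p <- ggplot(df, aes(r, value, fill = variable)) +
  geom_area(position = "identity")

print(p)
```

Adding a partial transparency such as alpha = 0.6 to geom_area() is a further option if the areas cross each other and no single ordering keeps all of them visible.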

ggplot doesn't show the second geom_line() in my plot

My df:
p1 p2 p3 x y
0 3000 14 0.0 0.026500
20 3000 14 11.0 0.054000
30 3000 14 17.9 0.057000
60 3000 14 49.3 0.064000
80 3000 14 77.4 0.063000
60 3500 14 45.3 0.061000
60 4000 14 41.4 0.058300
60 4400 14 43.7 0.073600
60 3500 9 41.7 0.060556
60 3500 18 46.7 0.060700
60 3500 21 49.2 0.059900
This is the result of a "one parameter at a time" experimental design, i.e., one where the parameters p1, p2 and p3 were changed one at a time (definitely not the best kind of DOE, but that's what I got). For each observation, two variables are measured, x and y. I would like to plot a line connecting all points of the p1 study (the first 5 rows), a line connecting all points of the p2 study (rows 4 and 6:8) and a third line connecting the points of the p3 study (rows 6 and 9:11). I tried with
ggplot(df, aes(x = x, y = y, color = p2)) +
geom_point( aes(shape = p3)) +
geom_line() +
geom_line(data = filter(df, p1 == "60" & p3 == "14"), aes(x = x, y = y))
The red and the green line correspond to the p1 and p3 study, but ggplot doesn't plot the line corresponding to the p2. How can I manage to plot it? In practice, I need either a geom_path or a geom_line connecting the triangle symbols in the center of the screen (x coordinate between 40 and 50).
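One possible approach, offered as a sketch rather than a confirmed fix: build a separate data frame for the lines in which every row is tagged with the study it belongs to, duplicating the rows shared between studies. The `studies` list and `lines_df` are hypothetical names; the row indices are taken from the description above.

```r
library(ggplot2)

df <- data.frame(
  p1 = c(0, 20, 30, 60, 80, 60, 60, 60, 60, 60, 60),
  p2 = c(3000, 3000, 3000, 3000, 3000, 3500, 4000, 4400, 3500, 3500, 3500),
  p3 = c(14, 14, 14, 14, 14, 14, 14, 14, 9, 18, 21),
  x  = c(0.0, 11.0, 17.9, 49.3, 77.4, 45.3, 41.4, 43.7, 41.7, 46.7, 49.2),
  y  = c(0.0265, 0.054, 0.057, 0.064, 0.063, 0.061, 0.0583, 0.0736,
         0.060556, 0.0607, 0.0599)
)

# Rows belonging to each study, as listed in the question
studies <- list(p1 = 1:5, p2 = c(4, 6:8), p3 = c(6, 9:11))

# Stack the rows of each study and tag them; shared rows appear twice
lines_df <- do.call(rbind, lapply(names(studies), function(s) {
  cbind(df[studies[[s]], ], study = s)
}))

# One line per study, points drawn once from the original data
p <- ggplot(df, aes(x, y)) +
  geom_line(data = lines_df, aes(colour = study, group = study)) +
  geom_point(aes(shape = factor(p3)))

print(p)
```

geom_line() connects points in order of x within each group; swapping in geom_path() would connect them in row order instead.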

Add points ggplot

Hi, I have many data frames like this
id oldid yr mo dy lon lat
1 01206295 Aberfeldy 1885 3 22 -127.1 -31.78
2 05670001 05670005 1885 3 22 -4.38 49.15
3 06279 06279 1885 3 22 -123.5 37.5
4 106251 06323 1885 3 22 178.5 19.5
5 58FFF3618 58FFF3618 1885 3 22 -0.73 69.73
6 Achille.F Achille.F 1885 3 22 -35.62 -2.98
stored in different files myfiles and I am trying to plot the (lon,lat) points for each of them with the colour chosen according to the id value. So far I am doing like this
for (i in 1:length(myfiles)){
colnames(myfilesContent[[i]]) <-c("id","oldid","yr","mo","dy","lon","lat")
p <- ggplot() + geom_polygon(data=world_map,aes(x=long, y=lat,group=group))
myfilesContent[[i]]$lon <- as.numeric(myfilesContent[[i]]$lon)
myfilesContent[[i]]$lat <- as.numeric(myfilesContent[[i]]$lat)
p + geom_point(data=myfilesContent[[i]], aes(x=lon, y=lat, fill=as.factor(id)), size = 4, shape = 21, show_guide=FALSE)
print(p)
}
However, I am not sure that an id appearing in different files will be assigned the same colour
Many thanks
You can make sure the levels for all your id columns are the same. First, get a master list of all the IDs from all the data.frames
allids <- unique(unlist(lapply(myfilesContent, function(x) as.character(x[,1]))))
Then make sure all the ID columns share these levels
myfilesContent <- lapply(myfilesContent, function(x) {
  x[,1] <- factor(x[,1], levels=allids)
  x
})
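If they have the same levels, they should get the same colors, provided the scale keeps unused levels (for example scale_fill_discrete(drop = FALSE)); by default ggplot2 drops levels that do not occur in the data being plotted.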
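A small sketch of the idea with two made-up data frames (the `files` list and its contents are hypothetical stand-ins for myfilesContent): the id columns are given a common set of levels, and each plot uses a fill scale that keeps unused levels so a given id maps to the same colour in every plot.

```r
library(ggplot2)

# Hypothetical stand-ins for two of the files
files <- list(
  data.frame(id = c("a", "b"), lon = c(0, 10), lat = c(0, 5)),
  data.frame(id = c("b", "c"), lon = c(20, 30), lat = c(1, 2))
)

# Master list of ids across all data frames
allids <- sort(unique(unlist(lapply(files, function(x) as.character(x$id)))))

# Give every id column the same levels
files <- lapply(files, function(x) {
  x$id <- factor(x$id, levels = allids)
  x
})

# drop = FALSE keeps levels that are absent from a particular file
plots <- lapply(files, function(x) {
  ggplot(x, aes(lon, lat, fill = id)) +
    geom_point(shape = 21, size = 4) +
    scale_fill_discrete(drop = FALSE)
})

print(plots[[1]])
```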

plot plate layout heatmap in r

I am trying to plot a plate layout heatmap in R. The plate layout is simply 8 (row) x 12 (column) circles (wells). Rows are labeled by letters and columns by numbers. Each well needs to be filled with a color intensity that depends on a qualitative or quantitative variable. The plate layout looks like this:
Here is small dataset:
set.seed (123)
platelay <- data.frame (rown = rep (letters[1:8], 12), coln = rep (1:12, each = 8),
colorvar = rnorm (96, 0.3, 0.2))
rown coln colorvar
1 a 1 0.187904871
2 b 1 0.253964502
3 c 1 0.611741663
4 d 1 0.314101678
5 e 1 0.325857547
6 f 1 0.643012997
7 g 1 0.392183241
8 h 1 0.046987753
9 a 2 0.162629430
10 b 2 0.210867606
11 c 2 0.544816359
12 d 2 0.371962765
13 e 2 0.380154290
14 f 2 0.322136543
15 g 2 0.188831773
16 h 2 0.657382627
17 a 3 0.399570096
18 b 3 -0.093323431
19 c 3 0.440271180
20 d 3 0.205441718
21 e 3 0.086435259
22 f 3 0.256405017
23 g 3 0.094799110
24 h 3 0.154221754
25 a 4 0.174992146
26 b 4 -0.037338662
27 c 4 0.467557409
28 d 4 0.330674624
29 e 4 0.072372613
30 f 4 0.550762984
31 g 4 0.385292844
32 h 4 0.240985703
33 a 5 0.479025132
34 b 5 0.475626698
35 c 5 0.464316216
36 d 5 0.437728051
37 e 5 0.410783531
38 f 5 0.287617658
39 g 5 0.238807467
40 h 5 0.223905800
41 a 6 0.161058604
42 b 6 0.258416544
43 c 6 0.046920730
44 d 6 0.733791193
45 e 6 0.541592400
46 f 6 0.075378283
47 g 6 0.219423033
48 h 6 0.206668929
49 a 7 0.455993024
50 b 7 0.283326187
51 c 7 0.350663703
52 d 7 0.294290649
53 e 7 0.291425909
54 f 7 0.573720457
55 g 7 0.254845803
56 h 7 0.603294121
57 a 8 -0.009750561
58 b 8 0.416922750
59 c 8 0.324770849
60 d 8 0.343188314
61 e 8 0.375927897
62 f 8 0.199535309
63 g 8 0.233358523
64 h 8 0.096284923
65 a 9 0.085641755
66 b 9 0.360705728
67 c 9 0.389641956
68 d 9 0.310600845
69 e 9 0.484453494
70 f 9 0.710016937
71 g 9 0.201793767
72 h 9 -0.161833775
73 a 10 0.501147705
74 b 10 0.158159847
75 c 10 0.162398277
76 d 10 0.505114274
77 e 10 0.243045399
78 f 10 0.055856458
79 g 10 0.336260696
80 h 10 0.272221728
81 a 11 0.301152837
82 b 11 0.377056080
83 c 11 0.225867994
84 d 11 0.428875310
85 e 11 0.255902688
86 f 11 0.366356393
87 g 11 0.519367803
88 h 11 0.387036298
89 a 12 0.234813683
90 b 12 0.529761524
91 c 12 0.498700771
92 d 12 0.409679392
93 e 12 0.347746347
94 f 12 0.174418785
95 g 12 0.572130490
96 h 12 0.179948083
Is there a package that can readily do this? Is it possible to write a function in base R, ggplot2, or another package that achieves this?
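Changing the colour of points of sufficient size, with ggplot2. Note I've implemented @TylerRinkler's suggestion, but within the call to ggplot. I've also removed the axis labels. (The code relies on rown being a factor, which data.frame() produced by default before R 4.0.0; in newer versions, pass stringsAsFactors = TRUE when creating platelay.)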
ggplot(platelay, aes(y = factor(rown, rev(levels(rown))),x = factor(coln))) +
geom_point(aes(colour = colorvar), size =18) +theme_bw() +
labs(x=NULL, y = NULL)
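A variant sketch that sets the row order explicitly, so it does not depend on how data.frame() stored rown, and that also moves the column labels to the top. The filled circle shape and the point size are illustrative choices, not part of the answer above.

```r
library(ggplot2)

set.seed(123)
platelay <- data.frame(rown = rep(letters[1:8], 12),
                       coln = rep(1:12, each = 8),
                       colorvar = rnorm(96, 0.3, 0.2))

# Reverse the levels so that row "a" ends up at the top of the plot
platelay$rown <- factor(platelay$rown, levels = rev(letters[1:8]))

p <- ggplot(platelay, aes(x = factor(coln), y = rown)) +
  geom_point(aes(fill = colorvar), shape = 21, size = 10) +
  scale_x_discrete(position = "top") +
  labs(x = NULL, y = NULL) +
  theme_bw()

print(p)
```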
And a base graphics approach, which will let you have the x axis above the plot
# plot with grey colour dictated by rank, no axes or labels
with(platelay, plot( x=as.numeric(coln), y= rev(as.numeric(rown)), pch= 19, cex = 2,
col = grey(rank(platelay[['colorvar']] ) / nrow(platelay)), axes = F, xlab= '', ylab = ''))
# add circular outline
with(platelay, points( x=as.numeric(coln), y= rev(as.numeric(rown)), pch= 21, cex = 2))
# add the axes
axis(3, at =1:12, labels = 1:12)
axis(2, at = 1:8, labels = LETTERS[8:1])
# the background grid
grid()
# and a box around the outside
box()
And for giggles and Christmas cheer, here is a version using base R plotting functions.
Though there is very possibly a better solution.
dev.new(width=6,height=4)
rown <- unique(platelay$rown)
coln <- unique(platelay$coln)
plot(NA,ylim=c(0.5,length(rown)+0.5),xlim=c(0.5,length(coln)+0.5),ann=FALSE,axes=FALSE)
box()
axis(2,at=seq_along(rown),labels=rev(rown),las=2)
axis(3,at=seq_along(coln),labels=coln)
colgrp <- findInterval(platelay$colorvar,seq(min(platelay$colorvar),max(platelay$colorvar),length.out=10))
colfunc <- colorRampPalette(c("green", "blue"))
collist <- colfunc(length(unique(colgrp)))
symbols(platelay$coln,
factor(platelay$rown, rev(levels(platelay$rown))),
circles=rep(0.2,nrow(platelay)),
add=TRUE,
inches=FALSE,
bg=collist[colgrp])
And the resulting image:
Here is a solution that combines @mnel's ggplot2 solution with grid.
First, the code of that solution:
d <- ggplot(platelay, aes(y=rown,x=factor(coln))) +
geom_point(aes(colour = colorvar), size =18) + theme_bw()
I use the data generated by ggplot
data <- ggplot_build(d)$data[[1]]
x <- data$x
y <- data$y
grid.newpage()
pushViewport(plotViewport(c(4, 4, 2, 2)),
dataViewport(x, y))
There is an ellipse grob (grid.ellipse):
grid.ellipse(x, y,size=20, ar = 2,angle=0,gp =gpar(fill=data$colour))
grid.xaxis(at=c(labels=1:12,ticks=NA),gp=gpar(cex=2))
grid.yaxis(at = 1:8,label=rev(LETTERS[1:8]),gp=gpar(cex=2))
grid.roundrect(gp=gpar(fill=NA))
I add grid :
gpgrid <- gpar(col='grey',lty=2,col='white')
grid.segments(unit(1:12, "native") ,unit(0, "npc"), unit(1:12, "native"),unit(1, "npc"),gp=gpgrid)
grid.segments(unit(0, "npc"), unit(1:8, "native"), unit(1, "npc"),unit(1:8, "native"),gp=gpgrid)
upViewport()
This answer is an add-on to @thelatemail's answer, which covers the platemap for the (8,12) = 96-well format.
To construct the (32,48) = 1536 format, the single letters A-Z are insufficient. Hence one needs to extend the labels with AA, AB, AC, AD ... ZZ, and the scheme can be extended to three or more letters by concatenating LETTERS onto the levels variable as below.
levels = c(LETTERS, c(t(outer(LETTERS, LETTERS, paste, sep = ""))))
@thelatemail's answer can be adapted to double-letter row labels for the 1536 plate format as below
rown = rep (c(LETTERS, c(t(outer(LETTERS[1], LETTERS[1:6], paste, sep = "")))),
symbols(platelay$coln,
factor(platelay$rown,
levels = rev(c(LETTERS, c(t(outer(LETTERS[1], LETTERS[1:6], paste, sep = "")))))),
circles=rep(0.45,nrow(platelay)),
add=TRUE,
inches=FALSE,
bg=collist[colgrp])
The levels inside the symbols function should be ordered with the single letters sorted alphabetically first, then the double letters, then the triple letters, and so on.
For example, if the levels are in the incorrect order below, the wells will be plotted with the wrong colors.
Incorrect order:
A, AA, AB, AC, AD, AE, AF, B, C,D, ...Z
Correct order:
A, B, C, D, E, .....Z, AA, AB, AC, AD, AE, AF
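The difference between the two orders can be checked with a short sketch for a 32-row plate (`row_labels` is a hypothetical name; the radix sort method is used only to make the alphabetical comparison independent of locale):

```r
# Row labels for a 32-row (1536-well) plate: A..Z followed by AA..AF
row_labels <- c(LETTERS, paste0("A", LETTERS[1:6]))

# Plain alphabetical sorting places the double letters right after "A"
alphabetical <- sort(row_labels, method = "radix")
head(alphabetical, 4)

# Supplying the levels explicitly keeps plate order
rows <- factor(c("A", "Z", "AA", "AF"), levels = row_labels)
as.integer(rows)
```

With the explicit levels, "AA" is row 27 and "AF" is row 32, which is what the symbols() call needs for the colours to land on the right wells.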
