Multiple markers on same coordinate - r

When plotting out markers on a interactive worlmap from the r package leaflet data with exactly the same coordinates will overlap each other.
See the example below:
library(leaflet)
Data <- structure(list(Name = structure(1:3, .Label = c("M1", "M2", "M3"), class = "factor"), Latitude = c(52L, 52L, 51L), Longitude = c(50L, 50L, 50L), Altitude = c(97L, 97L, 108L)), .Names = c("Name", "Latitude", "Longitude", "Altitude"), class = "data.frame", row.names = c(NA, -3L))
leaflet(data = Data) %>%
addProviderTiles("Esri.WorldImagery", options = providerTileOptions(noWrap = TRUE)) %>%
addMarkers(~Longitude, ~Latitude, popup = ~as.character(paste(sep = "",
"<b>",Name,"</b>","<br/>", "Altitude: ",Altitude)))
There is a possibilty to show all coordinates with the cluster option, but this is far from my goal. I dont want clusters and only the overlapping Markers are shown when fully zoomed in. When fully zoomed in the background map turns into grey("Map data not yet available"). The spider view of the overlapping markers is what i want, but not when fully zoomed in.
See example below:
leaflet(data = Data) %>%
addProviderTiles("Esri.WorldImagery", options = providerTileOptions(noWrap = TRUE)) %>%
addMarkers(~Longitude, ~Latitude, popup = ~as.character(paste(sep = "",
"<b>",Name,"</b>","<br/>", "Altitude: ",Altitude)), clusterOptions = markerClusterOptions())
I found some literatur about the solution i want but i dont know how to implement it in the r leaflet code/package.
https://github.com/jawj/OverlappingMarkerSpiderfier-Leaflet
Also if there are other approaches to handle overlapping Markers, feel free to answer. (for example multiple Markers info in one popup)

You could jitter() your coordinates slightly:
library(mapview)
library(sp)
Data <- structure(list(Name = structure(1:3, .Label = c("M1", "M2", "M3"),
class = "factor"),
Latitude = c(52L, 52L, 51L),
Longitude = c(50L, 50L, 50L),
Altitude = c(97L, 97L, 108L)),
.Names = c("Name", "Latitude", "Longitude", "Altitude"),
class = "data.frame", row.names = c(NA, -3L))
Data$lat <- jitter(Data$Latitude, factor = 0.0001)
Data$lon <- jitter(Data$Longitude, factor = 0.0001)
coordinates(Data) <- ~ lon + lat
proj4string(Data) <- "+init=epsg:4326"
mapview(Data)
This way you still need to zoom in for the markers to separate, how far you need to zoom in depends on the factor attribute in jitter().
Note that I am using library(mapview) in the example for simplicity.

Following up on my comment, here's a somewhat more modern solution (circa 2020) that takes advantage of some newer packages designed to make our lives easier (tidyverse & sf). I use sf:st_jitter as well as mapview as #TimSalabim does. Finally, I chose a slightly larger jitter factor so you wouldn't have to zoom in quite so far to see the effect:
library(mapview)
library(sf)
Data <- tibble(Name = c("M1", "M2", "M3"),
Latitude = c(52L, 52L, 51L),
Longitude = c(50L, 50L, 50L),
Altitude = c(97L, 97L, 108L))
Data %>%
st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326) %>%
st_jitter(factor = 0.001) %>%
mapview

Related

Can I join two datasets by coordinates categories in different EPSGs?

Thanks for taking your time to read this
I am trying to join two different datasets where the only common possible columns are coordinates. However, one of those datasets uses a normal system of coordinates (e.g. lat=39.35678, long=-8.99740) while the other uses EPSG 25830 (e.g. x=236044.949, y=4141285.671). I am quite new to R and spacial data so I don't quite get the documentation on spTransform, but I really need to join those two datasets and the coordinates are the only common variable i can use. I've been trying to transform the columns on data_1 with EPSG:25830 to EPSG:4326 (the one data_2 uses).
Here's an example of what I mean (and how I've been trying to solve it)
Here's the first dataset, with the EPSG:25830
structure(list(TOTAL_PLAZAS = c(4, 8, 6, 4, 6, 6), X = c("234755.4852",
"235629.3447", "235170.6602", "235074.569", "235480.4626", "239104.22"
), Y = c("4143050.4408", "4142032.4727", "4142819.3263", "4142736.735",
"4142705.8228", "4140674.42"), SRID = c("25830", "25830", "25830",
"25830", "25830", "25830")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
And here's the other dataframe, with the usual coordinates
structure(list(accommodates = c(4L, 3L, 2L, 6L, 6L, 4L), longitude = c(-5.99975,
-5.99533, -5.98537, -5.99795, -5.99379, -5.99497), latitude = c(37.39358,
37.39898, 37.38816, 37.39794, 37.39941, 37.38551)), class = c("rowwise_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame")))
This is what I've been trying, but so far it only transforms the original dataframe into a spatial points data frame, which is no use for me since i just need the original dataset with converted coordinates in order to join it with the other dataset.
coordinates(data_1) <- c("X","Y")
proj4string(data_1) <- CRS("+init=epsg:25830")
CRS.new <- CRS("+init=epsg:4326")
dnew <- spTransform(data1, CRS.new)
Many thanks again!!
You can use the sf package to convert the data into spatial objects and do all the desired spatial operations. The only problem with your data is that the spatial point locations of both datasets do not intersect. Thus, if you do a spatial join with st_join, no join will actually occur. I think that you might be interested in using a buffer st_buffer to force an intersection between the two datasets.
# Create simple feature object from a simple table indicating X and Y coordinates, as well as its crs.
data1 <- data1 |>
st_as_sf(coords = c("X", "Y"),
crs = 25830) |>
# Then transform it to the desired crs.
st_transform(crs = 4326)
# Do the same for data2
data2 <- data2 |>
st_as_sf(coords = c("longitude", "latitude"),
crs = 4326)
# Check if data intersects
st_intersects(data1, data2)
# This returns a list with empty entries meaning that there is no intersection among any of the spatial objects contained in data1 and data2
# Do join by spatial coordinates
resul <- data1 |>
# Maybe you are interested in using a buffer to force an intersection between the two objects
st_buffer(dist = 200) |>
# Do spatial join
st_join(data2)
# Finally, you can plot the data to visualize the spatial intersections
library(tmap)
tm_shape(data1|>
st_buffer(dist = 200)) +
# plot data1 with buffer as green filled circles
tm_fill(col = "forestgreen") +
tm_shape(data2) +
# plot data2 as red dots
tm_dots(col = "firebrick2",
size = 0.3)

Grouped barplots in R using csv

I have a 3 column csv file like this
x,y1,y2
100,50,10
200,10,20
300,15,5
I want to have a barplot using R, with first column values on x axis and second and third columns values as grouped bars for the corresponding x. I hope I made it clear. Can someone please help me with this? My data is huge so I have to import the csv file and can't enter all the data.I found relevant posts but none was exactly addressing this.
Thank you
Use the following code
library(tidyverse)
df %>% pivot_longer(names_to = "y", values_to = "value", -x) %>%
ggplot(aes(x,value, fill=y))+geom_col(position = "dodge")
Data
df = structure(list(x = c(100L, 200L, 300L), y1 = c(50L, 10L, 15L),
y2 = c(10L, 20L, 5L)), class = "data.frame", row.names = c(NA,
-3L))

How can I plot individual data points in a map using R/ leaflet?

I am trying to show the individual points in a given place, like a map equivalent of dot plot. I tried with leaflet library in R, but I am only able to map the size of the marker to the continuous variable. Is it possible to map the individual data points as clusters instead of mapping the size of the marker to the continuous variable?
My data looks like this
Lat,Lon,Place,People
19.877263,75.3390241,Aurangabad,1
20.2602939,85.8394548,Bhubaneshwar,2
30.7194022,76.7646552,Chandigarh,23
13.0801721,80.2838331,Chennai,25
11.0018115,76.9628425,Coimbatore,2
27.4844597,94.9019447,Dibrugarh,1
16.2915189,80.4541588,Guntur,1
17.3887859,78.4610647,Hyderabad,4
22.5677459,88.3476023,Kolkata,7
15.8309251,78.0425373,Kurnool,1
9.9256493,78.1228866,Madurai,1
You can use the following code to have dot plot
leaflet(df) %>% addTiles() %>%
addCircleMarkers(lng = ~Lon, lat = ~Lat,
popup = ~Place)
Data
df = structure(list(Lat = c(19.877263, 20.2602939, 30.7194022, 13.0801721,
11.0018115, 27.4844597, 16.2915189, 17.3887859, 22.5677459, 15.8309251,
9.9256493), Lon = c(75.3390241, 85.8394548, 76.7646552, 80.2838331,
76.9628425, 94.9019447, 80.4541588, 78.4610647, 88.3476023, 78.0425373,
78.1228866), Place = structure(1:11, .Label = c("Aurangabad",
"Bhubaneshwar", "Chandigarh", "Chennai", "Coimbatore", "Dibrugarh",
"Guntur", "Hyderabad", "Kolkata", "Kurnool", "Madurai"), class = "factor"),
People = c(1L, 2L, 23L, 25L, 2L, 1L, 1L, 4L, 7L, 1L, 1L)), class = "data.frame", row.names = c(NA,
-11L))
leaflet works great with sf package. Taking a sample of your data points
lat <- c(19.877263, 20.2602939)
lon <- c(75.3390241, 85.8394548)
place <- c("Aurangabad", "Bhubaneshwar")
You can convert them in spatial object using sf package. For leaflet to give you tiles, you need to have WSG84 coordinates. I assumed your data were in this coordinate system.
library(sf)
df <- data.frame(lon, lat, place, stringsAsFactors = FALSE)
points <- st_as_sf(df, coords = c("lon", "lat"), crs = 4326)
Then it's easy to plot with leaflet. Assuming you want markers that popup the name of the place when you click
library(leaflet)
leaflet(df) %>% addTiles() %>% addMarkers(popup = ~ place)

Plotting on a geographical map the provenience of our patients

I am trying to put on a Italian geographical map a dot reporting the provenience ('provincia') of our patients. Ideally, the dot size should be proportional to the number of patients coming from that 'provincia'. An example of the list I would like to plot is the following.
MI 8319
CO 537
MB 436
VA 338
BG 310
PV 254
CR 244
NO 210
RM 189
CS 179
In the first column there is the 'provincia' code: MI (Milano), CO (Como), MB (Monza-Brianza), etc. In the second column there is the number of patients from that 'provincia'. So the output should be an Italian political map where the biggest dot is around the city of Milano (MI), the second biggest dot is near the city of Como (CO), the third one is around the city of Monza-Brianza (MB),etc.
Is there any package that could do the plot I am looking for? I found a tool that could do the job here, but apparently they expect that I load the geographical coordinates in order to do the plot.
https://www.littlemissdata.com/blog/maps
Thanks in advance.
Here is one way to handle your task. You have the abbreviations for Italian province. You want to use them to merge your data with polygon data. If you download Italy's polygons from GADM, you can obtain data that contain the abbreviations. Specifically, the column, HASC_2 is the one. You need to merge your data with the polygon data. Then, you want to create another data set which contains centroid. You can draw a map with the two data sets.
library(tidyverse)
library(sf)
library(ggthemes)
# Get the sf file from https://gadm.org/download_country_v3.html
# and import it in R.
mysf <- readRDS("gadm36_ITA_2_sf.rds")
# This is your data, which is called mydata.
mydata <- structure(list(abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"), value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L)), class = "data.frame", row.names = c(NA,
-10L))
abbs value
1 MI 8319
2 CO 537
3 MB 436
4 VA 338
5 BG 310
6 PV 254
7 CR 244
8 NO 210
9 RM 189
10 CS 179
# Abbreviations are in HASC_2 in mysf. Manipulate strings so that
# I can join mydata with mysf with the abbreviations. I also get
# longitude and latitude with st_centroid(). This data set is for
# geom_point().
mysf2 <- mutate(mysf, HASC_2 = sub(x = HASC_2, pattern = "^IT.", replacement = "")) %>%
left_join(mydata, by = c("HASC_2" = "abbs")) %>%
mutate(lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat = map_dbl(geometry, ~st_centroid(.x)[[2]]))
# Draw a map
ggplot() +
geom_sf(data = mysf) +
geom_point(data = mysf2, aes(x = lon, y = lat, size = value)) +
theme_map()
UPDATE ON INSET MAP
This is an update following different suggestion on using inset maps, which I think it would be the best solution for yout question and comments:
library(sf)
library(cartography)
EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
IT = subset(EU, CNTR_CODE == "IT")
mydata <-
structure(list(
abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"),
value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L),
nuts = c("ITC4C","ITC42","ITC4D","ITC41",
"ITC46", "ITC48","ITC4A","ITC15",
"ITI43","ITF61")
),
class = "data.frame",
row.names = c(NA, -10L))
patients = merge(IT, mydata, by.x = "id", by.y = "nuts")
#Get breaks for map
br=getBreaks(patients$value)
#Delimit zone
#Based on NUTS1, Nortwest Italy
par(mar=c(0,0,0,0))
ghostLayer(IT[grep("ITC",IT$NUTS_ID),], bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
patients,
var = "value",
breaks = br,
col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
legend.pos = "topleft",
legend.title.txt = "Total patients",
add = TRUE,
legend.frame = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)
#Inset
par(
fig = c(0, 0.4, 0.01, 0.4),
new = TRUE
)
inset=patients[patients$abbs %in% c("RM","CS"),]
ghostLayer(inset, bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
patients,
var = "value",
breaks = br,
col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
legend.pos = "n",
add = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)
box(which = "figure", lwd = 1)
#RESTORE PLOT
par(fig=c(0,1,0,1))
OLD ANSWER
Following my comment on plotting labels, maybe with circles is not the best option for your map, given the concentration. I suggest you to use another kind of map for that, as chorolayer, I leveraged on https://stackoverflow.com/users/3304471/jazzurro for the dataframe.
library(sf)
library(cartography)
EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
IT = subset(EU, CNTR_CODE == "IT")
mydata <-
structure(list(
abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"),
value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L),
nuts = c("ITC4C","ITC42","ITC4D","ITC41",
"ITC46", "ITC48","ITC4A","ITC15",
"ITI43","ITF61")
),
class = "data.frame",
row.names = c(NA, -10L))
patients = merge(IT, mydata, by.x = "id", by.y = "nuts")
#Options1 - With circles
par(mar = c(0, 0, 0, 0))
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
propSymbolsLayer(
x = patients,
var = "value",
col = carto.pal(pal1 = "red.pal", n1 = 6),
legend.title.txt = "Total patients",
add = TRUE
)
#Option 2 - Chorolayer with labels
par(mar = c(0, 0, 0, 0))
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
choroLayer(
patients,
var = "value",
col = carto.pal(pal1 = "red.pal", n1 = 6),
legend.title.txt = "Total patients",
add = TRUE
)
#Create labels
patients$label = paste(patients$abbs, patients$value, sep = " - ")
labelLayer(
patients,
txt = "label",
overlap = FALSE,
halo = TRUE,
show.lines = TRUE,
)
Data from
https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/nuts-2016-files.html

Flow map(Travel Path) Using Lat and Long in R

I am trying to plot flow map (for singapore) . I have Entry(Lat,Long) and Exit (Lat,long). I am trying to map the flow from entry to exit in singapore map.
structure(list(token_id = c(1.12374e+19, 1.12374e+19, 1.81313e+19,
1.85075e+19, 1.30752e+19, 1.30752e+19, 1.32828e+19, 1.70088e+19,
1.70088e+19, 1.70088e+19, 1.05536e+19, 1.44818e+19, 1.44736e+19,
1.44736e+19, 1.44736e+19, 1.44736e+19, 1.89909e+19, 1.15795e+19,
1.15795e+19, 1.15795e+19, 1.70234e+19, 1.70234e+19, 1.44062e+19,
1.21512e+19, 1.21512e+19, 1.95909e+19, 1.95909e+19, 1.50179e+19,
1.50179e+19, 1.24174e+19, 1.36445e+19, 1.98549e+19, 1.92068e+19,
1.18468e+19, 1.18468e+19, 1.92409e+19, 1.92409e+19, 1.21387e+19,
1.9162e+19, 1.9162e+19, 1.40385e+19, 1.40385e+19, 1.32996e+19,
1.32996e+19, 1.69103e+19, 1.69103e+19, 1.57387e+19, 1.40552e+19,
1.40552e+19, 1.00302e+19), Entry_Station_Lat = c(1.31509, 1.33261,
1.28425, 1.31812, 1.33858, 1.29287, 1.39692, 1.37773, 1.33858,
1.33322, 1.28179, 1.30036, 1.43697, 1.39752, 1.27637, 1.39752,
1.41747, 1.35733, 1.28405, 1.37773, 1.35898, 1.42948, 1.32774,
1.42948, 1.349, 1.36017, 1.34971, 1.38451, 1.31509, 1.31509,
1.37002, 1.34971, 1.31231, 1.39169, 1.31812, 1.44909, 1.29341,
1.41747, 1.33759, 1.44062, 1.31509, 1.38451, 1.29461, 1.32388,
1.41747, 1.27614, 1.39752, 1.39449, 1.33261, 1.31231), Entry_Station_Long = c(103.76525,
103.84718, 103.84329, 103.89308, 103.70611, 103.8526, 103.90902,
103.76339, 103.70611, 103.74217, 103.859, 103.85563, 103.7865,
103.74745, 103.84596, 103.74745, 103.83298, 103.9884, 103.85152,
103.76339, 103.75191, 103.83505, 103.67828, 103.83505, 103.74956,
103.88504, 103.87326, 103.74437, 103.76525, 103.76525, 103.84955,
103.87326, 103.83793, 103.89548, 103.89308, 103.82004, 103.78479,
103.83298, 103.69742, 103.80098, 103.76525, 103.74437, 103.80605,
103.93002, 103.83298, 103.79156, 103.74745, 103.90051, 103.84718,
103.83793), Exit_Station_Lat = structure(c(48L, 34L, 118L, 60L,
14L, 54L, 10L, 49L, 49L, 74L, 71L, 65L, 102L, 5L, 102L, 119L,
116L, 10L, 13L, 88L, 117L, 66L, 40L, 62L, 117L, 37L, 67L, 34L,
85L, 44L, 102L, 44L, 115L, 29L, 92L, 17L, 121L, 70L, 120L, 52L,
85L, 34L, 42L, 11L, 4L, 115L, 62L, 48L, 92L, 14L), .Label = c("1.27082",
"1.27091", "1.27236", "1.27614", "1.27637", "1.27646", "1.27935",
"1.28221", "1.28247", "1.28405", "1.28621", "1.28819", "1.28932",
"1.29287", "1.29309", "1.29338", "1.29341", "1.29461", "1.29694",
"1.29959", "1.29974", "1.30034", "1.30252", "1.30287", "1.30392",
"1.30394", "1.30619", "1.30736", "1.30842", "1.31139", "1.3115",
"1.31167", "1.31188", "1.31509", "1.31654", "1.31756", "1.31913",
"1.31977", "1.32008", "1.3205", "1.32104", "1.32388", "1.32573",
"1.32725", "1.32774", "1.33119", "1.33155", "1.33261", "1.33322",
"1.33474", "1.33554", "1.33759", "1.33764", "1.33858", "1.33921",
"1.34037", "1.34225", "1.34293", "1.3432", "1.34426", "1.34857",
"1.349", "1.34905", "1.35158", "1.35733", "1.35898", "1.36017",
"1.3625", "1.36849", "1.37002", "1.37121", "1.37304", "1.37666",
"1.37775", "1.3786", "1.37862", "1.38001", "1.38029", "1.3803",
"1.38178", "1.38269", "1.38295", "1.38399", "1.38423", "1.38451",
"1.38671", "1.38672", "1.38777", "1.38814", "1.3894", "1.39147",
"1.39169", "1.39189", "1.39208", "1.39389", "1.39449", "1.39452",
"1.39628", "1.39692", "1.39717", "1.39732", "1.39752", "1.39821",
"1.39928", "1.39962", "1.4023", "1.40455", "1.40511", "1.40524",
"1.40843", "1.40961", "1.41184", "1.41588", "1.41685", "1.41747",
"1.42526", "1.42948", "1.43256", "1.43697", "1.44062", "1.44909"
), class = "factor"), Exit_Station_Long = structure(c(59L, 19L,
27L, 4L, 65L, 3L, 63L, 6L, 6L, 21L, 93L, 121L, 9L, 56L, 9L, 32L,
16L, 63L, 44L, 23L, 50L, 12L, 54L, 11L, 50L, 71L, 87L, 19L, 7L,
118L, 9L, 118L, 49L, 90L, 96L, 31L, 45L, 61L, 38L, 2L, 7L, 19L,
117L, 47L, 34L, 49L, 11L, 59L, 96L, 65L), .Label = c("103.67828",
"103.69742", "103.70611", "103.72092", "103.73274", "103.74217",
"103.74437", "103.74529", "103.74745", "103.74905", "103.74956",
"103.75191", "103.7537", "103.75803", "103.76011", "103.76215",
"103.76237", "103.76449", "103.76525", "103.76648", "103.76667",
"103.76893", "103.7696", "103.77082", "103.77145", "103.77266",
"103.774", "103.77866", "103.78185", "103.78425", "103.78479",
"103.7865", "103.78744", "103.79156", "103.79631", "103.79654",
"103.79836", "103.80098", "103.803", "103.80605", "103.80745",
"103.80781", "103.80978", "103.81703", "103.82004", "103.82592",
"103.82695", "103.83216", "103.83298", "103.83505", "103.83918",
"103.83953", "103.83974", "103.84387", "103.84496", "103.84596",
"103.84673", "103.84674", "103.84718", "103.84823", "103.84955",
"103.85092", "103.85152", "103.85226", "103.8526", "103.85267",
"103.85436", "103.85446", "103.85452", "103.86088", "103.86149",
"103.86275", "103.86291", "103.86395", "103.86405", "103.86896",
"103.87087", "103.87135", "103.87534", "103.87563", "103.8763",
"103.87971", "103.88003", "103.88126", "103.88243", "103.88296",
"103.88504", "103.8858", "103.88816", "103.8886", "103.88934",
"103.89054", "103.89237", "103.89313", "103.8938", "103.89548",
"103.89719", "103.89723", "103.89854", "103.9003", "103.90051",
"103.90208", "103.90214", "103.9031", "103.90484", "103.90537",
"103.90597", "103.90599", "103.90663", "103.9086", "103.90902",
"103.9126", "103.9127", "103.91296", "103.91616", "103.9165",
"103.93002", "103.94638", "103.94929", "103.95337", "103.9884"
), class = "factor")), .Names = c("token_id", "Entry_Station_Lat",
"Entry_Station_Long", "Exit_Station_Lat", "Exit_Station_Long"
), row.names = c(10807L, 10808L, 10810L, 10815L, 10817L, 10818L,
10819L, 10820L, 10823L, 10824L, 10826L, 10827L, 10829L, 10831L,
10832L, 10833L, 10834L, 10835L, 10836L, 10838L, 10840L, 10841L,
10843L, 10847L, 10850L, 10852L, 10854L, 10855L, 10859L, 10861L,
10869L, 10872L, 10883L, 10886L, 10891L, 10895L, 10896L, 10897L,
10900L, 10902L, 10903L, 10906L, 10910L, 10911L, 10912L, 10913L,
10915L, 10920L, 10921L, 10924L), class = "data.frame")
I am trying to get something this : Map Flow
Just realized that the original solution usin geom_path was more complicated than necessary. geom_segmentworks without changing the data:
require(ggplot2)
require(ggmap)
basemap <- get_map("Singapore",
source = "stamen",
maptype = "toner",
zoom = 11)
g = ggplot(a)
map = ggmap(basemap, base_layer = g)
map = map + coord_cartesian() +
geom_curve(size = 1.3,
aes(x=as.numeric(Entry_Station_Long),
y=as.numeric(Entry_Station_Lat),
xend=as.numeric(as.character(Exit_Station_Long)),
yend=as.numeric(as.character(Exit_Station_Lat)),
color=as.factor(token_id)))
map
This solution leverages Draw curved lines in ggmap, geom_curve not working to implement curved lines on a map.
ggmaps used for simplicity - for more ambitious projects I would recommend leaflet.
Below the solution using a long data format with some prior data wrangling. It also uses straight lines instead of the curves above.
a %>%
mutate(path = row_number()) -> a
origin = select(a,token_id,Entry_Station_Lat,Entry_Station_Long,path)
origin$type = "origin"
dest = select(a,token_id,Exit_Station_Lat,Exit_Station_Long,path)
dest$type = "dest"
colnames(origin) = c("id","lat","long","path","type")
colnames(dest) = c("id","lat","long","path","type")
complete = rbind(origin,dest)
complete %>% arrange(path,type) -> complete
require(ggmap)
basemap <- get_map("Singapore",
source = "stamen",
maptype = "toner",
zoom = 11)
g = ggplot(complete, aes(x=as.numeric(long),
y=as.numeric(lat)))
map = ggmap(basemap, base_layer = g)
map + geom_path(aes(color = as.factor(id)),
size = 1.1)
If you want to plot it on an actual Google Map, and recreate the style of your linked map, you can use my googleway package that uses Google's Maps API. You need an API key to use their maps
library(googleway)
df$Exit_Station_Lat <- as.numeric(as.character(df$Exit_Station_Lat))
df$Exit_Station_Long <- as.numeric(as.character(df$Exit_Station_Long))
df$polyline <- apply(df, 1, function(x) {
lat <- c(x['Entry_Station_Lat'], x['Exit_Station_Lat'])
lon <- c(x['Entry_Station_Long'], x['Exit_Station_Long'])
encode_pl(lat = lat, lon = lon)
})
mapKey <- 'your_api_key'
style <- '[ { "stylers": [{ "visibility": "simplified"}]},{"stylers": [{"color": "#131314"}]},{"featureType": "water","stylers": [{"color": "#131313"},{"lightness": 7}]},{"elementType": "labels.text.fill","stylers": [{"visibility": "on"},{"lightness": 25}]}]'
google_map(key = mapKey, style = style) %>%
add_polylines(data = df,
polyline = "polyline",
mouse_over_group = "Entry_Station_Lat",
stroke_weight = 0.7,
stroke_opacity = 0.5,
stroke_colour = "#ccffff")
Note, to recreate the map using flight data, see the example given in ?add_polylines
You can also show other types of routes, for example, driving between the locations by using Google's Directions API to encode the driving routes.
df$drivingRoute <- lst_directions <- apply(df, 1, function(x){
orig <- as.numeric(c(x['Entry_Station_Lat'], x['Entry_Station_Long']))
dest <- as.numeric(c(x['Exit_Station_Lat'], x['Exit_Station_Long']))
dir <- google_directions(origin = orig, destination = dest, key = apiKey)
dir$routes$overview_polyline$points
})
google_map(key = mapKey, style = style) %>%
add_polylines(data = df,
polyline = "drivingRoute",
mouse_over_group = "Entry_Station_Lat",
stroke_weight = 0.7,
stroke_opacity = 0.5,
stroke_colour = "#ccffff")
Alternative answer using leaflet and geosphere
#get Packages
require(leaflet)
require(geosphere)
#format data
a$Entry_Station_Long = as.numeric(as.character(a$Entry_Station_Long))
a$Entry_Station_Lat = as.numeric(as.character(a$Entry_Station_Lat))
a$Exit_Station_Long = as.numeric(as.character(a$Exit_Station_Long))
a$Exit_Station_Lat = as.numeric(as.character(a$Exit_Station_Lat))
a$id = as.factor(as.numeric(as.factor(a$token_id)))
#create some colors
factpal <- colorFactor(heat.colors(30), pathList$id)
#create a list of interpolated paths
pathList = NULL
for(i in 1:nrow(a))
{
tmp = gcIntermediate(c(a$Entry_Station_Long[i],
a$Entry_Station_Lat[i]),
c(a$Exit_Station_Long[i],
a$Exit_Station_Lat[i]),n = 25,
addStartEnd=TRUE)
tmp = data.frame(tmp)
tmp$id = a[i,]$id
tmp$color = factpal(a[i,]$id)
pathList = c(pathList,list(tmp))
}
#create empty base leaflet object
leaflet() %>% addTiles() -> lf
#add each entry of pathlist to the leaflet object
for (path in pathList)
{
lf %>% addPolylines(data = path,
lng = ~lon,
lat = ~lat,
color = ~color) -> lf
}
#show output
lf
Note that as I mentioned before there is no way of geosphering the paths in such a small locality - the great circles are effectively straight lines. If you want the rounded edges for sake of aesthetics you may have to use the geom_curve way described in my other answer.
I've also written the mapdeck library to make visualisations like this more appealing*
library(mapdeck)
set_token("MAPBOX_TOKEN") ## set your mapbox token here
df$Exit_Station_Lat <- as.numeric(as.character(df$Exit_Station_Lat))
df$Exit_Station_Long <- as.numeric(as.character(df$Exit_Station_Long))
mapdeck(
style = mapdeck_style('dark')
, location = c(104, 1)
, zoom = 8
, pitch = 45
) %>%
add_arc(
data = df
, origin = c("Entry_Station_Long", "Entry_Station_Lat")
, destination = c("Exit_Station_Long", "Exit_Station_Lat")
, layer_id = 'arcs'
, stroke_from_opacity = 100
, stroke_to_opacity = 100
, stroke_width = 3
, stroke_from = "#ccffff"
, stroke_to = "#ccffff"
)
*subjectively speaking
I would like to leave an alternative approach for you. What you can do is to restructure your data. Right now you have two columns for entry stations and the other two for exit stations. You can create one column for long, and another for lat by combing these columns. The trick is to use rbind() and c().
Let's have a look of this simple example.
x <- c(1, 3, 5)
y <- c(2, 4, 6)
c(rbind(x, y))
#[1] 1 2 3 4 5 6
Imagine x is long for entry stations and y for exit stations. 1 is longitude for a starting point. 2 is longitude where the first journey ended. As far as I can see from your sample data, it seems that 3 is identical 2. You could remove duplicated data points for each token_id. If you have a large set of data, perhaps this is something you want to consider. Back to the main point, you can create a column with longitude in the sequence you want with the combination of the two functions. Since you said you have date information, make sure you order the data by date. Then, the sequence of each journey appears in the right way in tmp. You want to do this with latitude as well.
Now we look into your sample data. It seems that Exit_Station_Lat and Exit_Station_Long are in factor. The first operation is to convert them to numeric. Then, you apply the method above and create a data frame. I called your data mydf.
library(dplyr)
library(ggplot2)
library(ggalt)
library(ggthemes)
library(raster)
mydf %>%
mutate_at(vars(Exit_Station_Lat: Exit_Station_Long),
funs(as.numeric(as.character(.)))) -> mydf
group_by(mydf, token_id) %>%
do(data.frame(long = c(rbind(.$Entry_Station_Long,.$Exit_Station_Long)),
lat = c(rbind(.$Entry_Station_Lat, .$Exit_Station_Lat))
)
) -> tmp
Now let's get a map data from GADM. You can download data using the raster package.
getData(name = "GADM", country = "singapore", level = 0) %>%
fortify -> singapore
Finally, you draw a map. The key thing is to use group in aes in geom_path(). I hope this will let you move forward.
ggplot() +
geom_cartogram(data = singapore,
aes(x = long, y = lat, map_id = id),
map = singapore) +
geom_path(data = tmp,
aes(x = long, y = lat, group = token_id,
color = as.character(token_id)),
show.legend = FALSE) +
theme_map()

Resources