Select raster in ggplot near coastline - r

So I have a map that is plotting air pressure in Catalonia:
Here's a close up:
I would now like to select all observations with air pressure above 97 kPa (dark blue) and create a new data frame with them for further analysis. Here's the tricky bit, I want to select observations that meet the altitude filter AAAAND are along the Mediterranean coast. Most observations above 97 kPa are along the Mediterranean coast but there are some outliers that are inland.
In the end, I guess some sort of diagonal latitude, longitude filtering must occur but I don't know how to specify it. Is there some sort of way to lace, or draw a polygon on data and then have it select everything inside?
This is my data:
structure(list(final.Latitude = c(42.161626, 41.516819, 41.786856,
41.748215, 42.393932, 41.767667, 41.938401, 41.494079, 41.752819,
41.422327, 41.253914, 42.104854, 40.544337, 41.647625, 41.26267,
40.745573, 40.881084, 41.201499, 41.494183, 40.873663, 41.211076,
41.814818, 41.737032, 42.434746, 41.796036, 41.197585, 42.119308,
41.665698, 41.141899, 40.883885, 40.814408, 40.710754, 41.65649,
41.541525, 41.581905, 41.61424, 42.215454, 41.137955, 41.553355,
42.17195, 40.909931, 42.757417, 41.89469, 41.49472, 41.447145,
41.791172, 41.862813, 41.677615, 41.398371, 41.094337, 42.83454,
41.331905, 41.954854, 41.560246, 41.758456, 41.516953, 41.625954,
40.974225, 42.098215, 42.238615, 41.841862, 41.280658, 41.491805,
41.675766, 42.416667, 41.220308, 42.043361, 41.009161, 41.136268,
41.747716, 41.816881, 41.851497, 42.306365, 41.525909, 41.732976,
41.187937, 41.788901, 41.862027, 41.874859, 41.074787, 41.724519,
42.313455, 42.838364, 41.590543, 42.422663, 41.530049, 42.374163,
41.639757, 41.914026, 41.531976, 42.282191, 41.474805, 41.760742,
41.498525, 41.277658, 41.648019, 41.175305, 41.672663, 42.269094,
41.561134, 41.467288, 42.403712, 41.100187, 41.82609, 41.516667,
41.636616, 41.417024, 41.688442, 41.165596, 41.208101, 41.850617,
41.197377, 42.367024, 42.361421, 41.594469, 42.391185, 42.578381,
41.868737, 41.145762, 42.496539, 41.161892, 41.231334, 41.510179,
41.634688, 41.763027, 41.929978, 40.671331, 41.713584, 41.462664,
41.692433, 41.053748, 41.136997, 41.207852, 41.683047, 42.344326,
41.4404, 42.411528, 41.041842, 41.984444, 42.034332, 40.655163,
41.355913, 41.727408, 41.607816, 41.193202, 41.731762, 42.028479,
41.102165, 41.955568, 41.360861, 41.743718, 41.580977, 41.852114,
42.073092, 41.601927, 41.315489, 41.548474, 42.810764, 41.709801,
42.371338, 41.47145, 42.621379, 41.826875, 41.374287, 41.616667,
42.296039, 42.074764, 41.699929, 41.646217, 41.885617, 42.002284,
41.726337, 41.284957, 41.47402, 40.734082, 40.716001, 41.480477,
41.18225, 41.503525, 41.126212, 42.233618, 41.598858, 41.477979,
41.605161, 41.039296, 42.360065, 41.759637, 41.846688, 42.109172,
41.265454, 41.468488, 42.017482, 41.625876, 41.759101, 41.191922,
41.699429, 41.751713, 42.22352, 41.248929, 42.271593, 41.354354,
41.543718, 42.068952, 41.87566, 42.010081, 41.316667, 41.985403,
42.181691, 42.065168, 41.501108, 42.218238, 41.873141, 41.946288,
41.666667, 41.917381, 41.175447, 41.971022, 41.437188, 42.316667,
42.042256, 41.191613, 42.057709, 41.522319, 41.42334, 41.678745,
41.554229, 41.186914, 42.249185, 41.234832, 42.315525, 42.408303,
41.477337, 41.413134, 41.915639, 41.188829, 41.740291, 41.309827,
41.327836, 41.051187, 41.607475, 41.794363, 42.380192, 41.278987,
41.717096, 41.940471, 41.422931, 41.728017, 41.002665, 41.224484,
41.155913, 42.433333, 42.20033, 41.496859, 42.188615, 41.168999,
41.894646, 41.138704, 42.227089, 41.466667, 41.224079, 41.54329,
42.228907, 41.826019, 41.076036, 41.973982, 41.762849, 42.233975,
41.744147, 42.212313, 41.383333, 41.26126, 41.45356, 41.533333,
41.694361, 41.253569, 42.145754, 42.13269, 42.323734, 41.838472,
41.808423, 41.237045, 41.257745, 42.035974, 41.456014, 41.99404,
42.112564, 42.413192, 41.53343, 41.384189, 41.556626, 41.977765,
41.750928, 41.737402, 42.184708, 41.874234, 41.116667, 41.995717,
41.561111, 41.482015, 41.967881, 40.90787, 41.849001, 41.699143,
41.394159, 41.701173, 42.422255, 41.125037, 41.146003, 41.673185,
41.952427, 41.533986, 40.812492, 42.166362, 41.526426, 40.597668,
41.322164, 42.036402, 41.880549, 42.20347, 41.379576, 41.752114,
41.648215, 41.386759, 41.243112, 41.286117, 41.220468, 42.061813,
41.929335, 41.247787, 42.283618, 41.957459, 42.137402, 41.316881,
42.138901, 41.847644, 42.246931, 42.475945, 42.329978, 42.196145,
42.217381, 42.375733, 42.254854, 41.613785, 41.348501, 41.947002,
41.228003, 41.504069, 41.225245, 42.105996, 41.383666, 41.428805,
41.184644, 42.393307, 42.105944, 42.369698, 42.077865, 42.330101,
42.592736, 42.495852, 42.593676, 42.324487, 42.350054, 42.016913,
42.173822, 42.373156, 42.198189, 42.254595, 42.372036, 42.286585,
42.246175, 42.737724, 42.785504, 42.742824, 42.747696, 42.104657,
42.212158, 42.21305, 42.324369, 42.36003), final.Longitude = c(1.092285,
1.901004, 1.096829, 1.629193, 2.846645, 2.250857, 2.76217, 0.458438,
0.659356, 0.937106, 1.093504, 3.085332, 0.48089, 0.693825, 1.170103,
0.618666, 0.498917, 1.046086, 2.294718, 0.400167, 0.975421, 0.636328,
0.584498, 0.627843, 0.568341, 1.11167, 2.101356, 0.555637, 1.375338,
0.801686, 0.714752, 0.578086, 1.082864, 0.924567, 2.549358, 2.539721,
2.641934, 0.908485, 2.401142, 3.074909, 0.260671, 0.71233, 1.045118,
0.672773, 2.244789, 0.810942, 1.874909, 1.016459, 2.1741, 0.311004,
0.716868, 1.922554, 3.207173, 1.016022, 0.904336, 1.433333, 1.011437,
0.517667, 1.844004, 2.708886, 1.358886, 0.722351, 1.348755, 2.789078,
1.883333, 1.439364, 2.910885, 0.383916, 0.989194, 2.558458, 2.305603,
1.215786, 2.967557, 2.368952, 1.513526, 1.569665, 1.986723, 3.074338,
0.878136, 1.052436, 2.464097, 2.36524, 0.738999, 2.581156, 2.926337,
1.686508, 2.920236, 2.357387, 1.680977, 1.620736, 2.016739, 1.979945,
2.161279, 0.765658, 1.968755, 0.970925, 1.324377, 1.271554, 2.848572,
1.13935, 0.689467, 3.150642, 0.886446, 2.175124, 1.3, 1.518095,
1.919698, 0.695244, 1.444914, 1.672893, 0.959004, 1.635581, 2.833618,
1.870308, 2.405817, 2.999929, 1.086656, 2.111956, 0.81979, 1.272164,
0.473006, 0.550333, 0.920149, 0.873428, 1.668666, 2.811885, 0.519665,
1.01746, 1.024914, 2.113455, 0.438496, 0.649917, 1.247073, 2.28626,
3.064954, 1.860957, 1.844289, 0.632897, 2.821111, 1.880192, 0.468583,
0.664668, 2.059243, 2.289721, 0.777024, 2.503212, 3.10289, 0.752582,
2.233086, 2.110014, 2.628836, 1.6172, 0.586083, 3.010814, 1.547502,
1.453319, 0.824509, 0.710496, 0.902696, 1.458522, 1.532405, 1.248929,
2.893647, 0.549688, 0.633333, 2.912027, 0.916209, 2.845646, 2.741505,
2.232976, 2.28476, 1.825837, 0.75331, 1.93062, 0.37308, 0.530085,
2.316382, 1.045336, 1.811358, 0.732084, 1.776623, 2.02648, 1.612224,
0.879873, 0.596649, 3.000214, 2.01358, 1.041892, 2.208315, 1.365168,
2.280459, 1.735019, 1.415025, 2.395075, 1.208506, 1.976123, 1.903569,
2.860457, 1.294718, 2.231477, 1.849072, 1.894075, 1.313526, 1.17631,
1.564897, 1.816667, 2.094575, 2.48476, 1.900428, 0.959546, 2.907316,
0.720168, 2.556673, 2.75, 3.163098, 1.269737, 3.148144, 2.009243,
3.116667, 2.10703, 1.255497, 1.267667, 1.750761, 1.203614, 0.855824,
1.677122, 1.209089, 0.967787, 0.845336, 2.921769, 0.741027, 1.387723,
1.514579, 1.185612, 0.85608, 0.638549, 0.988663, 2.094718, 0.871539,
1.038313, 1.050881, 1.73626, 1.297127, 1.420879, 2.840791, 1.396841,
1.70621, 0.599667, 1.310832, 1.107423, 1.133333, 2.190467, 2.157387,
2.407067, 0.976254, 2.803533, 1.051562, 3.042647, 1.283333, 1.21775,
2.109423, 1.73576, 1.895503, 1.139829, 2.791613, 1.323696, 2.286188,
1.805639, 2.511456, 2.083333, 1.772075, 1.703618, 1.383333, 2.444646,
1.550857, 2.569984, 2.978193, 3.186045, 2.285046, 2.745075, 1.807316,
0.932084, 2.165025, 1.17831, 1.517809, 2.159886, 1.130453, 0.486334,
1.796288, 0.567081, 2.516524, 2.292468, 1.977908, 0.900076, 2.287188,
1.25, 2.418595, 2.008056, 2.267024, 1.241291, 0.512357, 2.228122,
2.71888, 0.720332, 1.053843, 0.982562, 0.864294, 1.400214, 0.606714,
3.126837, 0.514198, 0.521602, 0.894424, 1.955372, 0.446585, 0.876504,
3.034546, 3.126409, 1.818155, 0.983416, 0.813849, 2.509957, 1.932048,
1.249001, 1.249929, 1.533654, 3.046003, 2.255497, 1.330192, 2.981263,
2.773767, 1.932191, 2.016775, 2.888187, 2.390149, 2.936294, 0.716667,
2.316524, 3.056888, 2.970593, 3.068024, 2.889007, 0.964319, 0.956673,
2.384404, 1.033249, 2.392219, 0.780085, 2.993361, 0.947085, 0.974277,
0.589454, 0.722588, 1.469659, 1.190359, 1.646143, 1.039464, 1.261727,
1.21003, 1.202521, 1.540233, 1.582939, 1.295875, 1.316197, 1.522829,
1.395169, 1.608097, 1.605217, 1.424143, 1.501501, 0.720617, 0.692133,
0.943847, 0.726361, 0.857982, 1.00119, 0.931612, 0.936593, 0.880921
), final.airpressure = c(90.3429670210118, 100.056314503294,
97.3091644688341, 95.6535847066497, 99.3265763810809, 96.5299354845676,
99.2908252491736, 99.9962987211271, 98.4839088116185, 95.1270315428243,
91.7964083849734, 101.021480506879, 100.453323066872, 98.7679512938402,
98.4130259181779, 101.215628071472, 101.154916909799, 98.1888998199279,
100.236578057928, 97.3442020744334, 96.8780695175868, 97.2157924966616,
97.6952727499611, 89.3510660330727, 97.4026261211394, 97.8595387529882,
91.4445861078662, 98.1653373115852, 100.694700579387, 101.094242163897,
101.227774675623, 101.227774675623, 97.3325214710347, 97.3675674866686,
101.203482924827, 99.8643906852719, 99.1241567035337, 97.2274590917429,
100.260637723706, 101.239922737453, 95.3556106036821, 87.4837596830127,
97.5312824821367, 98.2596212844703, 101.252072257138, 98.5311924350098,
97.4260055568285, 97.941775300055, 101.215628071472, 96.8548215707678,
90.6144028722749, 96.5878708247223, 98.9221495412202, 96.8896955834789,
98.1300041505036, 93.6882866645953, 97.648390271736, 101.094242163897,
93.1166652957402, 97.2974880696958, 95.940976332665, 96.9013230445826,
100.344892041253, 101.167056228173, 88.3170492768192, 99.1955517955091,
100.815606749275, 97.9065226068125, 98.9696435706551, 99.2908252491736,
91.5763611671335, 96.333215139154, 101.009358656544, 99.5532993411879,
93.3852278362198, 100.513613145927, 94.8306977255515, 101.057854786839,
97.4961775397119, 101.033603811924, 97.4961775397119, 90.4405901323857,
90.8866542537754, 101.142779048056, 98.9221495412202, 97.5429869382878,
100.032303869205, 99.0052790563431, 95.3441686169426, 97.1458223183509,
85.7068860068727, 99.7326566531564, 92.3488424802678, 97.1225101185704,
101.288529565072, 98.1064557754255, 100.610152546045, 94.8762273867043,
99.7565953632834, 95.206971819536, 97.988798636897, 101.203482924827,
96.2176846131662, 90.9412026091979, 93.1054919663179, 96.2061391837523,
97.2507964822998, 98.7916584468953, 100.74304563755, 101.179197003352,
95.4357429674702, 101.203482924827, 99.0052790563431, 87.5362656890491,
99.5532993411879, 99.8284459750613, 86.5024876526411, 91.2801339042574,
96.9943929811272, 86.751973904839, 95.5732694326337, 100.755135528406,
97.554692799058, 98.4130259181779, 95.1384474715514, 100.092341260942,
99.8044900228565, 98.0358445503587, 94.5012604507927, 95.4013922834961,
97.0060330066725, 100.441269391339, 99.7326566531564, 98.3067972240066,
100.646378721282, 98.9696435706551, 88.4230933496698, 101.009358656544,
100.477434757695, 95.7799308085443, 99.302740863126, 95.1042037946933,
92.2491595688505, 99.5771950003942, 97.4961775397119, 99.1955517955091,
101.142779048056, 98.6139934081953, 95.2412524995432, 101.227774675623,
99.0528129975234, 97.5898188107684, 97.5781087351297, 100.791413906912,
96.8083424123042, 94.5693258589238, 98.1653373115852, 93.9021396579619,
98.3539958134293, 93.2620406569011, 94.1164807924228, 89.3081878129729,
99.3981172677039, 96.6110546956734, 99.4577740333045, 99.613049243971,
92.1606428704777, 101.264223234853, 101.276375670773, 94.5126012824831,
95.8719236916393, 98.4720914516169, 96.8199601104416, 100.646378721282,
97.1808011101903, 100.670536751023, 100.997238260744, 98.7560998507858,
98.2478308373571, 99.0171604026964, 99.3504176202466, 96.3100979416952,
96.2292314281149, 97.8947745289984, 100.803509602308, 100.610152546045,
96.0331238936967, 92.848875102663, 94.5466319440964, 98.5784987599884,
101.082111582687, 92.5374264091341, 93.0608120542108, 95.1042037946933,
100.068321981469, 95.9524899406276, 98.1064557754255, 99.5771950003942,
98.5548427591196, 90.3971890662113, 98.15355817787, 99.8284459750613,
95.7799308085443, 95.9755213018661, 92.693019947478, 98.7916584468953,
94.6033769450608, 96.0792308579518, 94.8193187245782, 96.7502748288739,
100.140397117231, 95.8489171908438, 97.2741394745025, 101.13064264277,
100.549804560724, 99.8763751312065, 100.658457011412, 99.6967593586619,
100.924546421398, 92.5596380567634, 99.8164672802803, 94.6715159035427,
97.4610852327918, 96.7502748288739, 98.7324012307697, 98.15355817787,
100.140397117231, 95.1498647702722, 97.2391270868996, 100.803509602308,
91.631323470876, 94.842078092087, 94.4672461199774, 97.0060330066725,
97.554692799058, 98.2242541872674, 90.408037379785, 101.227774675623,
98.389409626026, 97.4143151376009, 97.5664000646158, 88.837883277904,
98.3067972240066, 92.3820940484839, 100.224550390235, 94.6828771671133,
97.0293172485913, 99.2312686227987, 99.2193615850005, 99.9123370990196,
92.8823067150292, 93.2620406569011, 100.368977705507, 94.6033769450608,
97.7539075022777, 100.140397117231, 99.8164672802803, 101.239922737453,
94.6715159035427, 99.9483120154717, 99.0409273731154, 87.5782931823713,
98.000557998281, 101.300684917927, 100.320812156865, 94.785189913392,
92.3488424802678, 98.0123187708731, 97.2507964822998, 99.8524076773845,
100.791413906912, 98.3776036052463, 93.4524894115233, 98.8272298464039,
100.104353062587, 95.4701060199168, 100.284703164497, 100.74304563755,
93.5759281491901, 100.405117042172, 101.203482924827, 100.924546421398,
91.1597236157778, 95.5503346002599, 93.5646997115319, 92.9826737947488,
93.2508498834821, 99.9003483379078, 98.4130259181779, 99.4935852775937,
91.8735497628596, 97.1108561166103, 94.8306977255515, 94.6033769450608,
94.0261723258617, 100.501552236019, 91.2910881775692, 98.0123187708731,
99.6847964653268, 96.7270575491, 101.167056228173, 94.3313111822093,
100.912436202455, 95.6995094483651, 97.8712826022554, 89.0620370589062,
96.936213800838, 101.142779048056, 98.7560998507858, 100.791413906912,
99.8883610153609, 101.179197003352, 95.7914250898844, 97.2507964822998,
99.7206894524044, 93.7220205195241, 100.960885798805, 100.730957197395,
88.5505143251773, 94.0036087522406, 98.5075477862968, 98.6613394871811,
99.1955517955091, 99.3861902092696, 98.7442498298192, 100.730957197395,
101.045728571852, 95.6076820035206, 98.2832064236821, 101.094242163897,
100.128380990558, 92.5263225842078, 101.106374200865, 99.8763751312065,
91.8184421669338, 100.670536751023, 90.0723442570617, 89.1475776674053,
101.264223234853, 100.779319662913, 99.2908252491736, 100.140397117231,
97.8595387529882, 91.0285480828219, 94.7624441974078, 96.9711171200268,
101.203482924827, 97.4260055568285, 100.332851376663, 93.5534726212054,
95.6650638255479, 100.912436202455, 91.631323470876, 87.211235711269,
85.7377460403079, 91.6753170637597, 94.3879269519511, 87.6098270436714,
91.9286904332642, 90.9739473358343, 92.4042684118727, 90.408037379785,
96.3100979416952, 86.3780137131668, 88.9659019811689, 94.2634170829669,
87.6729288330611, 91.3787697026177, 93.1390159773857, 89.9859162986326,
91.4775120851854, 90.690550948402, 86.9395606892669, 87.1589246647255,
95.0585647311234, 95.3441686169426, 94.5920252209444, 92.8377339061359,
89.8456477071546)), .Names = c("final.Latitude", "final.Longitude",
"final.airpressure"), row.names = c(NA, -379L), class = "data.frame")
This is the code to plot based on this post: Plotting contours on an irregular grid
library(akima)
library(ggplot2)
library(reshape2)
x <- newfinal$final.Longitude
y <- newfinal$final.Latitude
z <- newfinal$final.airpressure
require(akima)
fld <- interp(x,y,z)
df <- melt(fld$z, na.rm = TRUE)
names(df) <- c("x", "y", "Rain")
df$Lon <- fld$x[df$x]
df$Lat <- fld$y[df$y]
ggplot(data = df, aes(x = Lon, y = Lat, z = Rain)) +
geom_tile(aes(fill = Rain)) +
stat_contour() +
ggtitle("Air pressure Catalonia") +
xlab("Longitude") +
ylab("Latitude") +
scale_fill_continuous(name = "kPa",
low = "white", high = "blue") +
theme(plot.title = element_text(size = 20,),
legend.title = element_text(size = 10),
axis.text = element_text(size = 10),
axis.title.x = element_text(size = 10, vjust = -0.5),
axis.title.y = element_text(size = 10, vjust = 0.2),
legend.text = element_text(size = 10))

Overview
Using the shape file for the western part of the Mediterranean Sea, I calculated the coordinates for the boundaries of the Balearic (Iberian Sea) and Western Basin portions. Then, I calculated the distance - in kilometers - of each 379 coordinate pairs in df to each coordinate pair for the boundaries of Balearic and Western Basin portions.
After filtering distance to only include the minimum distance values, I identified which points in df contained a distance that was less than or equal to 20 kilometers (based on the distance from Traiguera to VinarĂ²s).
Finally, df2 was created by returning the rows where final.airpressure was greater than 97 and whose coordinate pair was less than or equal to 20 kilometers from the Mediterranean Sea.
Reproducible Example
Note: df is not written down below to reduce the lines of code. To create df, simply copy and paste the dput() and store it.
Additionally, the shape file for the western part of the Mediterranean Sea needed to be downloaded in in my browser and unzipped inside my working directory prior to importing into r.
After reading Plotting contours on an irregular grid, I use the data from df2 to reproduce that plot using base R. The GeoJSON file of the administrative boundaries of the counties of Catalonia was taken from their open data portal.
# load necessary packages
library( akima )
library( sf )
library( leaflet )
library( geosphere )
# create sf data frame
# of Catalonia
catalonia.polygon <-
read_sf( dsn = "https://analisi.transparenciacatalunya.cat/api/geospatial/txvb-mhz6?method=export&format=GeoJSON"
, layer = "OGRGeoJSON"
, stringsAsFactors = FALSE )
# dissolve into one polygon
catalonia.polygon <-
st_union( x = catalonia.polygon )
# create sf data frame
# of the western basin
western.basin <-
read_sf( dsn = getwd()
, layer = "iho"
, stringsAsFactors = FALSE )
# view first version of the map
my.map <-
leaflet() %>%
setView( lng = 1.514619
, lat = 41.875227
, zoom = 8 ) %>%
addTiles() %>%
addPolygons( data = catalonia.polygon
, fill = "#D24618"
, color = "#D24618" ) %>%
addCircleMarkers( data = df
, lng = ~final.Longitude
, lat = ~final.Latitude
, fillColor = "#10539A"
, fillOpacity = 0.75
, stroke = FALSE
, radius = 10
, group = "No Filter" ) %>%
addPolygons( data = western.basin
, label = ~name )
# display map
my.map
# get the boundaries of each
# polygon within the western basin
list.of.polygon.boundaries <-
sapply(
X = methods::as( object = western.basin, Class = "Spatial" )#polygons
, FUN = function( i )
sp::coordinates( obj = i#Polygons[[1]] )
)
# label each set of boundaries
names( list.of.polygon.boundaries ) <- western.basin$name
# from Google maps
# it looks like we only care about the
# the Balearic and Western Basin parts of the sea
# since they are nearest Catalonia
list.of.polygon.boundaries <-
list.of.polygon.boundaries[ c("Balearic (Iberian Sea)", "Mediterranean Sea - Western Basin" ) ]
# calculate each points distance (in kilometers)
# from each boundary point within
# each polygon in list.of.polygon.boundaries
# ~1 minute to complete
distance <-
apply(
X = df[ c("final.Longitude", "final.Latitude") ]
, MARGIN = 1
, FUN = function( i )
lapply(
X = list.of.polygon.boundaries
, FUN = function( j )
distGeo(
p1 = i
, p2 = j
) / 1000 # to transform results into kilometers
)
)
# find the minimum distance value
# for each list in distance
distance.min <-
lapply(
X = distance
, FUN = function( i )
lapply(
X = i
, FUN = function( j )
min( j )
)
)
# set the maximum distance
# allowed between a point in df
# and the sea to 20 kilometers
max.km <- 20
# identify which points in df
# are less than or equal to max.km
less.than.or.equal.to.max.km <-
sapply(
X = distance.min
, FUN = function( i )
sapply(
X = i
, FUN = function( j )
j <= max.km
)
)
# convert matrix results into
# vector of TRUE/FALSE indices
less.than.or.equal.to.max.km <-
apply(
X = less.than.or.equal.to.max.km
, MARGIN = 2
, FUN = any
)
# create subset from df
# where the air pressure is greater than 97 kpa
# AND the coordinate pairs are located
# along the Mediterranean coast
df2 <-
df[ which( df$final.airpressure > 97 &
less.than.or.equal.to.max.km ), ]
# View our results
my.map %>%
hideGroup( group = "No Filter" ) %>%
addCircleMarkers( data = df2
, lng = ~final.Longitude
, lat = ~final.Latitude
, fillColor = "#10539A"
, fillOpacity = 0.75
, stroke = FALSE
, radius = 10
, group = "Filter" )
Plot Contours on Irregular Grid
# plot intended results
bivariate.interpolation <-
akima::interp(
x = df2$final.Longitude
, y = df2$final.Latitude
, z = df2$final.airpressure
)
# using base R
filled.contour(
x = bivariate.interpolation$x
, y = bivariate.interpolation$y
, z = bivariate.interpolation$z
, color.palette = colorRampPalette( c("white", "blue" ) )
, xlab = "Longitude"
, ylab = "Latitude"
, main = "Catalonia Air Pressure"
, key.title = title(
main = "Air Pressure (kPa)"
, cex.main = 0.5 )
)
# end of script #
Final Data from df2
structure(list(final.Latitude = c(42.104854, 40.544337, 41.26267,
40.745573, 40.881084, 41.201499, 41.494183, 40.873663, 41.197585,
41.141899, 40.883885, 40.814408, 40.710754, 41.581905, 41.61424,
41.137955, 41.553355, 42.17195, 41.447145, 41.398371, 41.954854,
40.974225, 41.675766, 41.220308, 41.009161, 41.136268, 41.747716,
42.306365, 41.525909, 41.187937, 41.862027, 41.074787, 41.724519,
41.590543, 42.422663, 42.374163, 41.639757, 41.277658, 41.175305,
42.403712, 41.417024, 41.165596, 41.208101, 41.197377, 41.594469,
42.391185, 40.671331, 41.053748, 41.207852, 42.344326, 41.041842,
40.655163, 41.607816, 41.731762, 42.028479, 41.102165, 41.360861,
41.743718, 42.073092, 41.826875, 42.296039, 41.699929, 41.646217,
40.734082, 40.716001, 41.480477, 41.18225, 41.039296, 42.360065,
41.265454, 41.468488, 41.191922, 41.248929, 41.354354, 41.316667,
42.218238, 41.666667, 41.917381, 41.175447, 41.971022, 41.437188,
42.316667, 41.191613, 41.186914, 42.315525, 41.188829, 41.327836,
41.051187, 41.278987, 41.002665, 41.224484, 41.155913, 41.496859,
41.168999, 41.138704, 42.227089, 41.224079, 41.54329, 41.076036,
41.383333, 41.26126, 41.694361, 41.253569, 42.13269, 42.323734,
41.808423, 41.237045, 41.384189, 41.116667, 41.482015, 40.90787,
41.699143, 41.146003, 41.952427, 40.812492, 40.597668, 42.036402,
41.880549, 41.648215, 41.386759, 41.243112, 41.286117, 41.220468,
42.061813, 41.247787, 42.283618, 41.316881, 42.138901, 42.246931,
42.196145, 42.217381, 42.375733, 42.254854, 41.504069, 42.105996
), final.Longitude = c(3.085332, 0.48089, 1.170103, 0.618666,
0.498917, 1.046086, 2.294718, 0.400167, 1.11167, 1.375338, 0.801686,
0.714752, 0.578086, 2.549358, 2.539721, 0.908485, 2.401142, 3.074909,
2.244789, 2.1741, 3.207173, 0.517667, 2.789078, 1.439364, 0.383916,
0.989194, 2.558458, 2.967557, 2.368952, 1.569665, 3.074338, 1.052436,
2.464097, 2.581156, 2.926337, 2.920236, 2.357387, 1.968755, 1.324377,
3.150642, 1.919698, 1.444914, 1.672893, 1.635581, 2.405817, 2.999929,
0.519665, 0.438496, 1.247073, 3.064954, 0.632897, 0.468583, 2.289721,
2.503212, 3.10289, 0.752582, 2.110014, 2.628836, 3.010814, 2.893647,
2.912027, 2.845646, 2.741505, 0.37308, 0.530085, 2.316382, 1.045336,
0.596649, 3.000214, 1.365168, 2.280459, 1.208506, 1.294718, 1.849072,
1.816667, 2.907316, 2.75, 3.163098, 1.269737, 3.148144, 2.009243,
3.116667, 1.255497, 1.209089, 2.921769, 0.85608, 2.094718, 0.871539,
1.297127, 0.599667, 1.310832, 1.107423, 2.157387, 0.976254, 1.051562,
3.042647, 1.21775, 2.109423, 1.139829, 2.083333, 1.772075, 2.444646,
1.550857, 2.978193, 3.186045, 2.745075, 1.807316, 1.796288, 1.25,
2.267024, 0.512357, 2.71888, 1.400214, 3.126837, 0.521602, 0.446585,
3.034546, 3.126409, 2.509957, 1.932048, 1.249001, 1.249929, 1.533654,
3.046003, 1.330192, 2.981263, 2.016775, 2.888187, 2.936294, 3.056888,
2.970593, 3.068024, 2.889007, 2.392219, 2.993361), final.airpressure = c(101.021480506879,
100.453323066872, 98.4130259181779, 101.215628071472, 101.154916909799,
98.1888998199279, 100.236578057928, 97.3442020744334, 97.8595387529882,
100.694700579387, 101.094242163897, 101.227774675623, 101.227774675623,
101.203482924827, 99.8643906852719, 97.2274590917429, 100.260637723706,
101.239922737453, 101.252072257138, 101.215628071472, 98.9221495412202,
101.094242163897, 101.167056228173, 99.1955517955091, 97.9065226068125,
98.9696435706551, 99.2908252491736, 101.009358656544, 99.5532993411879,
100.513613145927, 101.057854786839, 101.033603811924, 97.4961775397119,
101.142779048056, 98.9221495412202, 100.032303869205, 99.0052790563431,
101.288529565072, 100.610152546045, 101.203482924827, 97.2507964822998,
100.74304563755, 101.179197003352, 101.203482924827, 99.5532993411879,
99.8284459750613, 99.8044900228565, 97.0060330066725, 99.7326566531564,
100.646378721282, 101.009358656544, 99.302740863126, 99.5771950003942,
99.1955517955091, 101.142779048056, 98.6139934081953, 101.227774675623,
99.0528129975234, 100.791413906912, 99.3981172677039, 99.613049243971,
101.264223234853, 101.276375670773, 97.1808011101903, 100.670536751023,
100.997238260744, 98.7560998507858, 100.803509602308, 100.610152546045,
98.5784987599884, 101.082111582687, 100.068321981469, 98.5548427591196,
98.15355817787, 98.7916584468953, 100.140397117231, 101.13064264277,
100.549804560724, 99.8763751312065, 100.658457011412, 99.6967593586619,
100.924546421398, 99.8164672802803, 100.140397117231, 100.803509602308,
97.554692799058, 101.227774675623, 98.389409626026, 98.3067972240066,
99.2312686227987, 99.2193615850005, 99.9123370990196, 100.368977705507,
97.7539075022777, 99.8164672802803, 101.239922737453, 99.9483120154717,
99.0409273731154, 101.300684917927, 99.8524076773845, 100.791413906912,
98.8272298464039, 100.104353062587, 100.284703164497, 100.74304563755,
100.405117042172, 101.203482924827, 98.4130259181779, 100.501552236019,
99.6847964653268, 101.167056228173, 100.912436202455, 101.142779048056,
100.791413906912, 101.179197003352, 99.7206894524044, 100.960885798805,
100.730957197395, 98.6613394871811, 99.1955517955091, 99.3861902092696,
98.7442498298192, 100.730957197395, 101.045728571852, 98.2832064236821,
101.094242163897, 101.106374200865, 99.8763751312065, 100.670536751023,
101.264223234853, 100.779319662913, 99.2908252491736, 100.140397117231,
101.203482924827, 100.332851376663)), .Names = c("final.Latitude",
"final.Longitude", "final.airpressure"), row.names = c(12L, 13L,
15L, 16L, 17L, 18L, 19L, 20L, 26L, 29L, 30L, 31L, 32L, 35L, 36L,
38L, 39L, 40L, 45L, 49L, 53L, 58L, 64L, 66L, 68L, 69L, 70L, 73L,
74L, 76L, 78L, 80L, 81L, 84L, 85L, 87L, 88L, 95L, 97L, 102L,
107L, 109L, 110L, 112L, 115L, 116L, 127L, 131L, 133L, 135L, 138L,
141L, 144L, 146L, 147L, 148L, 150L, 151L, 154L, 163L, 166L, 168L,
169L, 175L, 176L, 177L, 178L, 185L, 186L, 190L, 191L, 195L, 199L,
201L, 206L, 211L, 214L, 215L, 216L, 217L, 218L, 219L, 221L, 227L,
230L, 235L, 238L, 239L, 243L, 248L, 249L, 250L, 253L, 255L, 257L,
258L, 260L, 261L, 264L, 270L, 271L, 274L, 275L, 277L, 278L, 280L,
281L, 289L, 296L, 299L, 301L, 303L, 308L, 310L, 312L, 315L, 317L,
318L, 322L, 323L, 324L, 325L, 326L, 327L, 329L, 330L, 333L, 334L,
336L, 339L, 340L, 341L, 342L, 347L, 349L), class = "data.frame")
Session Info
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] geosphere_1.5-7 leaflet_1.1.0.9000
[3] sf_0.6-0 akima_0.6-2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.15 magrittr_1.5 units_0.5-1
[4] xtable_1.8-2 lattice_0.20-35 R6_2.2.2
[7] rlang_0.2.0 udunits2_0.13 tools_3.4.3
[10] grid_3.4.3 e1071_1.6-8 DBI_0.8
[13] htmltools_0.3.6 crosstalk_1.0.0 class_7.3-14
[16] yaml_2.1.17 digest_0.6.15 shiny_1.0.5
[19] htmlwidgets_1.0 mime_0.5 sp_1.2-7
[22] compiler_3.4.3 pillar_1.2.1 classInt_0.1-24
[25] httpuv_1.3.6.2

Related

How to compute distance.matrix for the spatialRF::rf_spatial function

I am using the package spatialRF in R to perform a regression task. From the example provided by the package, the have precomputed the distance.matrix and they use it in the function spatialRF::rf. Here is an example:
library(spatialRF)
#loading training data
data(block.data)
#names of the response variable and the predictors
dependent.variable.name <- "ntl"
predictor.variable.names <- colnames(block.data)[2:4]
#coordinates of the cases
xy <- block.data[, c("x", "y")]
#distance matrix
distance.matrix <- dist(subset(block.data, select = -c(x, y)))
#random seed for reproducibility
random.seed <- 1
model.non.spatial <- spatialRF::rf(
data = block.data,
dependent.variable.name = dependent.variable.name,
predictor.variable.names = predictor.variable.names,
distance.matrix = distance.matrix,
distance.thresholds = 0,
xy = xy,
seed = random.seed,
verbose = FALSE)
When running the spatialRF::rf function I am getting this error: Error in diag<-(tmp, value = NA): only matrix diagonals can be replaced
My dataset:
block.data = structure(list(ntl = c(11.4058170318604, 13.7000455856323, 16.0420398712158,
17.4475727081299, 26.263370513916, 30.658130645752, 19.8927211761475,
20.917688369751, 23.7149887084961, 25.2641334533691), pop = c(111.031448364258,
145.096557617188, 166.351989746094, 193.804962158203, 331.787200927734,
382.979248046875, 237.971466064453, 276.575958251953, 334.015289306641,
345.376617431641), tirs = c(35.392936706543, 34.4172630310059,
33.7765464782715, 35.3224639892578, 40.4262886047363, 39.6619148254395,
38.6306076049805, 36.752326965332, 37.2010040283203, 36.1100578308105
), agbh = c(1.15364360809326, 0.177780777215958, 0.580717206001282,
0.647109687328339, 3.84336423873901, 5.6310133934021, 2.10894227027893,
3.9533429145813, 2.7016019821167, 4.36041164398193), lc = c(40L,
40L, 40L, 126L, 50L, 50L, 50L, 50L, 40L, 50L)), class = "data.frame", row.names = c(NA,
-10L))
For reference, in the example in the link I provided, the distance matrix and the dataset the authors are using it's the same.

Changing the order of lines plotted so that one is on top for time series in ggplot

so I am plotting some time series data on ggplot for a project I'm working on. Here is what the data looks like:
Sample data:
structure(list(Date = c("2015-01-01", "2015-02-01", "2015-03-01",
"2015-04-01"), Actual = c(500L, 600L, 700L, 750L), Fcst1 = c(600L,
610L, 634L, 650L), Fcst2 = c(500L, 630L, 875L, 900L), Fcst3 = c(500L,
600L, 754L, 800L), Fcst4 = c(500L, 600L, 700L, 760L)), class = "data.frame", row.names = c(NA,
-4L))
Date Actual Fcst1 Fcst2 Fcst3 Fcst4
2015-01-01 500 600 500 500 500
2015-02-01 600 610 630 600 600
2015-03-01 700 634 875 754 700
2015-04-01 750 650 900 800 760
.......... ... ... ... ... ...
The data itself goes on for another 60 months and there are 40 forecasts total, adjusted monthly. When I try to plot it, I want the Actual line to be on top, but it ends up being plotted first. Here's the code I am using:
df <- df %>%
mutate(Date = ymd(Date))
colnames(df)[3:length(df)] <-
paste("df", colnames(df[, c(3:length(df))]), sep = "")
ggplot(
tidyr::pivot_longer(df, c(Fcst1:Fcst6, Actual), names_to = "Forecast", names_prefix = "df"),
aes(Date, value, color = Forecast)
) +
geom_line(size = 1) +
scale_color_manual(values = c(
"Fcst1" = "red", "Fcst2" = "blue",
"Fcst3" = "green", "Fcst4" = "yellow", "Fcst5" = "purple",
"Fcst6" = "orange", "Actual" = "black"
)) +
ggtitle(label = "Actuals vs Forecasts", subtitle = "Dataset") +
ylab("Rate") +
scale_y_continuous(labels = scales::comma)
I want to preserve the ordering in the legend though, so I want Actuals at the top followed by the order of forecasts listed in the scale_color_manual. Right now, Actuals is being plotted first (which means it's under all the other forecasts), and I want it to be on top (and preferably with a thicker line, maybe size=1.2. Thanks!
Here a possible solution using the data you included. You have to format the levels of Forecast and modify scale_color_manual() for the legend. I have added a trick for that:
library(tidyverse)
#Data
df <- structure(list(Date = c("2015-01-01", "2015-02-01", "2015-03-01",
"2015-04-01"), Actual = c(500L, 600L, 700L, 750L), Fcst1 = c(600L,
610L, 634L, 650L), Fcst2 = c(500L, 630L, 875L, 900L), Fcst3 = c(500L,
600L, 754L, 800L), Fcst4 = c(500L, 600L, 700L, 760L)), class = "data.frame", row.names = c(NA,
-4L))
#Format date
df <- df %>% mutate(Date = ymd(Date))
#Create data for plot
df2 <- tidyr::pivot_longer(df, c(Fcst1:Fcst4, Actual), names_to = "Forecast", names_prefix = "df")
#Format levels
labs <- unique(df2$Forecast)
i1 <- labs[which(labs=='Actual')]
i2 <- rev(labs[which(labs!='Actual')])
i3 <- c(i2,i1)
df2$Forecast <- factor(df2$Forecast,levels=i3,ordered = T)
#Plot
ggplot(df2,aes(Date, value, color = Forecast)) +
geom_line(size = 1) +
scale_color_manual(values = c(
"Fcst1" = "red", "Fcst2" = "blue",
"Fcst3" = "green", "Fcst4" = "yellow", "Fcst5" = "purple",
"Fcst6" = "orange", "Actual" = "black"
),guide = guide_legend(reverse=TRUE)) +
ggtitle(label = "Actuals vs Forecasts", subtitle = "Dataset") +
ylab("Rate") +
scale_y_continuous(labels = scales::comma)
Output:

Plotting on a geographical map the provenience of our patients

I am trying to put on a Italian geographical map a dot reporting the provenience ('provincia') of our patients. Ideally, the dot size should be proportional to the number of patients coming from that 'provincia'. An example of the list I would like to plot is the following.
MI 8319
CO 537
MB 436
VA 338
BG 310
PV 254
CR 244
NO 210
RM 189
CS 179
In the first column there is the 'provincia' code: MI (Milano), CO (Como), MB (Monza-Brianza), etc. In the second column there is the number of patients from that 'provincia'. So the output should be an Italian political map where the biggest dot is around the city of Milano (MI), the second biggest dot is near the city of Como (CO), the third one is around the city of Monza-Brianza (MB),etc.
Is there any package that could do the plot I am looking for? I found a tool that could do the job here, but apparently they expect that I load the geographical coordinates in order to do the plot.
https://www.littlemissdata.com/blog/maps
Thanks in advance.
Here is one way to handle your task. You have the abbreviations for Italian province. You want to use them to merge your data with polygon data. If you download Italy's polygons from GADM, you can obtain data that contain the abbreviations. Specifically, the column, HASC_2 is the one. You need to merge your data with the polygon data. Then, you want to create another data set which contains centroid. You can draw a map with the two data sets.
library(tidyverse)
library(sf)
library(ggthemes)
# Get the sf file from https://gadm.org/download_country_v3.html
# and import it in R.
mysf <- readRDS("gadm36_ITA_2_sf.rds")
# This is your data, which is called mydata.
mydata <- structure(list(abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"), value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L)), class = "data.frame", row.names = c(NA,
-10L))
abbs value
1 MI 8319
2 CO 537
3 MB 436
4 VA 338
5 BG 310
6 PV 254
7 CR 244
8 NO 210
9 RM 189
10 CS 179
# Abbreviations are in HASC_2 in mysf. Manipulate strings so that
# I can join mydata with mysf with the abbreviations. I also get
# longitude and latitude with st_centroid(). This data set is for
# geom_point().
mysf2 <- mutate(mysf, HASC_2 = sub(x = HASC_2, pattern = "^IT.", replacement = "")) %>%
left_join(mydata, by = c("HASC_2" = "abbs")) %>%
mutate(lon = map_dbl(geometry, ~st_centroid(.x)[[1]]),
lat = map_dbl(geometry, ~st_centroid(.x)[[2]]))
# Draw a map
ggplot() +
geom_sf(data = mysf) +
geom_point(data = mysf2, aes(x = lon, y = lat, size = value)) +
theme_map()
UPDATE ON INSET MAP
This is an update following different suggestion on using inset maps, which I think it would be the best solution for yout question and comments:
library(sf)
library(cartography)
EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
IT = subset(EU, CNTR_CODE == "IT")
mydata <-
structure(list(
abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"),
value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L),
nuts = c("ITC4C","ITC42","ITC4D","ITC41",
"ITC46", "ITC48","ITC4A","ITC15",
"ITI43","ITF61")
),
class = "data.frame",
row.names = c(NA, -10L))
patients = merge(IT, mydata, by.x = "id", by.y = "nuts")
#Get breaks for map
br=getBreaks(patients$value)
#Delimit zone
#Based on NUTS1, Nortwest Italy
par(mar=c(0,0,0,0))
ghostLayer(IT[grep("ITC",IT$NUTS_ID),], bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
patients,
var = "value",
breaks = br,
col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
legend.pos = "topleft",
legend.title.txt = "Total patients",
add = TRUE,
legend.frame = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)
#Inset
par(
fig = c(0, 0.4, 0.01, 0.4),
new = TRUE
)
inset=patients[patients$abbs %in% c("RM","CS"),]
ghostLayer(inset, bg="lightblue")
plot(st_geometry(EU), col="grey90", add=TRUE)
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464", add=TRUE)
choroLayer(
patients,
var = "value",
breaks = br,
col = carto.pal(pal1 = "red.pal", n1 = length(br)-1),
legend.pos = "n",
add = TRUE
)
labelLayer(patients,txt="abbs", halo=TRUE, overlap = FALSE)
box(which = "figure", lwd = 1)
#RESTORE PLOT
par(fig=c(0,1,0,1))
OLD ANSWER
Following my comment on plotting labels, maybe with circles is not the best option for your map, given the concentration. I suggest you to use another kind of map for that, as chorolayer, I leveraged on https://stackoverflow.com/users/3304471/jazzurro for the dataframe.
library(sf)
library(cartography)
EU = st_read("~/R/mapslib/EUROSTAT/NUTS_RG_03M_2016_3035_LEVL_3.geojson")
IT = subset(EU, CNTR_CODE == "IT")
mydata <-
structure(list(
abbs = c("MI", "CO", "MB", "VA", "BG", "PV", "CR",
"NO", "RM", "CS"),
value = c(8319L, 537L, 436L, 338L, 310L, 254L,
244L, 210L, 189L, 179L),
nuts = c("ITC4C","ITC42","ITC4D","ITC41",
"ITC46", "ITC48","ITC4A","ITC15",
"ITI43","ITF61")
),
class = "data.frame",
row.names = c(NA, -10L))
patients = merge(IT, mydata, by.x = "id", by.y = "nuts")
#Options1 - With circles
par(mar = c(0, 0, 0, 0))
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
propSymbolsLayer(
x = patients,
var = "value",
col = carto.pal(pal1 = "red.pal", n1 = 6),
legend.title.txt = "Total patients",
add = TRUE
)
#Option 2 - Chorolayer with labels
par(mar = c(0, 0, 0, 0))
plot(st_geometry(IT), col = "#FEFEE9", border = "#646464")
choroLayer(
patients,
var = "value",
col = carto.pal(pal1 = "red.pal", n1 = 6),
legend.title.txt = "Total patients",
add = TRUE
)
#Create labels
patients$label = paste(patients$abbs, patients$value, sep = " - ")
labelLayer(
patients,
txt = "label",
overlap = FALSE,
halo = TRUE,
show.lines = TRUE,
)
Data from
https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/nuts-2016-files.html

Flow map(Travel Path) Using Lat and Long in R

I am trying to plot flow map (for singapore) . I have Entry(Lat,Long) and Exit (Lat,long). I am trying to map the flow from entry to exit in singapore map.
structure(list(token_id = c(1.12374e+19, 1.12374e+19, 1.81313e+19,
1.85075e+19, 1.30752e+19, 1.30752e+19, 1.32828e+19, 1.70088e+19,
1.70088e+19, 1.70088e+19, 1.05536e+19, 1.44818e+19, 1.44736e+19,
1.44736e+19, 1.44736e+19, 1.44736e+19, 1.89909e+19, 1.15795e+19,
1.15795e+19, 1.15795e+19, 1.70234e+19, 1.70234e+19, 1.44062e+19,
1.21512e+19, 1.21512e+19, 1.95909e+19, 1.95909e+19, 1.50179e+19,
1.50179e+19, 1.24174e+19, 1.36445e+19, 1.98549e+19, 1.92068e+19,
1.18468e+19, 1.18468e+19, 1.92409e+19, 1.92409e+19, 1.21387e+19,
1.9162e+19, 1.9162e+19, 1.40385e+19, 1.40385e+19, 1.32996e+19,
1.32996e+19, 1.69103e+19, 1.69103e+19, 1.57387e+19, 1.40552e+19,
1.40552e+19, 1.00302e+19), Entry_Station_Lat = c(1.31509, 1.33261,
1.28425, 1.31812, 1.33858, 1.29287, 1.39692, 1.37773, 1.33858,
1.33322, 1.28179, 1.30036, 1.43697, 1.39752, 1.27637, 1.39752,
1.41747, 1.35733, 1.28405, 1.37773, 1.35898, 1.42948, 1.32774,
1.42948, 1.349, 1.36017, 1.34971, 1.38451, 1.31509, 1.31509,
1.37002, 1.34971, 1.31231, 1.39169, 1.31812, 1.44909, 1.29341,
1.41747, 1.33759, 1.44062, 1.31509, 1.38451, 1.29461, 1.32388,
1.41747, 1.27614, 1.39752, 1.39449, 1.33261, 1.31231), Entry_Station_Long = c(103.76525,
103.84718, 103.84329, 103.89308, 103.70611, 103.8526, 103.90902,
103.76339, 103.70611, 103.74217, 103.859, 103.85563, 103.7865,
103.74745, 103.84596, 103.74745, 103.83298, 103.9884, 103.85152,
103.76339, 103.75191, 103.83505, 103.67828, 103.83505, 103.74956,
103.88504, 103.87326, 103.74437, 103.76525, 103.76525, 103.84955,
103.87326, 103.83793, 103.89548, 103.89308, 103.82004, 103.78479,
103.83298, 103.69742, 103.80098, 103.76525, 103.74437, 103.80605,
103.93002, 103.83298, 103.79156, 103.74745, 103.90051, 103.84718,
103.83793), Exit_Station_Lat = structure(c(48L, 34L, 118L, 60L,
14L, 54L, 10L, 49L, 49L, 74L, 71L, 65L, 102L, 5L, 102L, 119L,
116L, 10L, 13L, 88L, 117L, 66L, 40L, 62L, 117L, 37L, 67L, 34L,
85L, 44L, 102L, 44L, 115L, 29L, 92L, 17L, 121L, 70L, 120L, 52L,
85L, 34L, 42L, 11L, 4L, 115L, 62L, 48L, 92L, 14L), .Label = c("1.27082",
"1.27091", "1.27236", "1.27614", "1.27637", "1.27646", "1.27935",
"1.28221", "1.28247", "1.28405", "1.28621", "1.28819", "1.28932",
"1.29287", "1.29309", "1.29338", "1.29341", "1.29461", "1.29694",
"1.29959", "1.29974", "1.30034", "1.30252", "1.30287", "1.30392",
"1.30394", "1.30619", "1.30736", "1.30842", "1.31139", "1.3115",
"1.31167", "1.31188", "1.31509", "1.31654", "1.31756", "1.31913",
"1.31977", "1.32008", "1.3205", "1.32104", "1.32388", "1.32573",
"1.32725", "1.32774", "1.33119", "1.33155", "1.33261", "1.33322",
"1.33474", "1.33554", "1.33759", "1.33764", "1.33858", "1.33921",
"1.34037", "1.34225", "1.34293", "1.3432", "1.34426", "1.34857",
"1.349", "1.34905", "1.35158", "1.35733", "1.35898", "1.36017",
"1.3625", "1.36849", "1.37002", "1.37121", "1.37304", "1.37666",
"1.37775", "1.3786", "1.37862", "1.38001", "1.38029", "1.3803",
"1.38178", "1.38269", "1.38295", "1.38399", "1.38423", "1.38451",
"1.38671", "1.38672", "1.38777", "1.38814", "1.3894", "1.39147",
"1.39169", "1.39189", "1.39208", "1.39389", "1.39449", "1.39452",
"1.39628", "1.39692", "1.39717", "1.39732", "1.39752", "1.39821",
"1.39928", "1.39962", "1.4023", "1.40455", "1.40511", "1.40524",
"1.40843", "1.40961", "1.41184", "1.41588", "1.41685", "1.41747",
"1.42526", "1.42948", "1.43256", "1.43697", "1.44062", "1.44909"
), class = "factor"), Exit_Station_Long = structure(c(59L, 19L,
27L, 4L, 65L, 3L, 63L, 6L, 6L, 21L, 93L, 121L, 9L, 56L, 9L, 32L,
16L, 63L, 44L, 23L, 50L, 12L, 54L, 11L, 50L, 71L, 87L, 19L, 7L,
118L, 9L, 118L, 49L, 90L, 96L, 31L, 45L, 61L, 38L, 2L, 7L, 19L,
117L, 47L, 34L, 49L, 11L, 59L, 96L, 65L), .Label = c("103.67828",
"103.69742", "103.70611", "103.72092", "103.73274", "103.74217",
"103.74437", "103.74529", "103.74745", "103.74905", "103.74956",
"103.75191", "103.7537", "103.75803", "103.76011", "103.76215",
"103.76237", "103.76449", "103.76525", "103.76648", "103.76667",
"103.76893", "103.7696", "103.77082", "103.77145", "103.77266",
"103.774", "103.77866", "103.78185", "103.78425", "103.78479",
"103.7865", "103.78744", "103.79156", "103.79631", "103.79654",
"103.79836", "103.80098", "103.803", "103.80605", "103.80745",
"103.80781", "103.80978", "103.81703", "103.82004", "103.82592",
"103.82695", "103.83216", "103.83298", "103.83505", "103.83918",
"103.83953", "103.83974", "103.84387", "103.84496", "103.84596",
"103.84673", "103.84674", "103.84718", "103.84823", "103.84955",
"103.85092", "103.85152", "103.85226", "103.8526", "103.85267",
"103.85436", "103.85446", "103.85452", "103.86088", "103.86149",
"103.86275", "103.86291", "103.86395", "103.86405", "103.86896",
"103.87087", "103.87135", "103.87534", "103.87563", "103.8763",
"103.87971", "103.88003", "103.88126", "103.88243", "103.88296",
"103.88504", "103.8858", "103.88816", "103.8886", "103.88934",
"103.89054", "103.89237", "103.89313", "103.8938", "103.89548",
"103.89719", "103.89723", "103.89854", "103.9003", "103.90051",
"103.90208", "103.90214", "103.9031", "103.90484", "103.90537",
"103.90597", "103.90599", "103.90663", "103.9086", "103.90902",
"103.9126", "103.9127", "103.91296", "103.91616", "103.9165",
"103.93002", "103.94638", "103.94929", "103.95337", "103.9884"
), class = "factor")), .Names = c("token_id", "Entry_Station_Lat",
"Entry_Station_Long", "Exit_Station_Lat", "Exit_Station_Long"
), row.names = c(10807L, 10808L, 10810L, 10815L, 10817L, 10818L,
10819L, 10820L, 10823L, 10824L, 10826L, 10827L, 10829L, 10831L,
10832L, 10833L, 10834L, 10835L, 10836L, 10838L, 10840L, 10841L,
10843L, 10847L, 10850L, 10852L, 10854L, 10855L, 10859L, 10861L,
10869L, 10872L, 10883L, 10886L, 10891L, 10895L, 10896L, 10897L,
10900L, 10902L, 10903L, 10906L, 10910L, 10911L, 10912L, 10913L,
10915L, 10920L, 10921L, 10924L), class = "data.frame")
I am trying to get something this : Map Flow
Just realized that the original solution usin geom_path was more complicated than necessary. geom_segmentworks without changing the data:
require(ggplot2)
require(ggmap)
basemap <- get_map("Singapore",
source = "stamen",
maptype = "toner",
zoom = 11)
g = ggplot(a)
map = ggmap(basemap, base_layer = g)
map = map + coord_cartesian() +
geom_curve(size = 1.3,
aes(x=as.numeric(Entry_Station_Long),
y=as.numeric(Entry_Station_Lat),
xend=as.numeric(as.character(Exit_Station_Long)),
yend=as.numeric(as.character(Exit_Station_Lat)),
color=as.factor(token_id)))
map
This solution leverages Draw curved lines in ggmap, geom_curve not working to implement curved lines on a map.
ggmaps used for simplicity - for more ambitious projects I would recommend leaflet.
Below the solution using a long data format with some prior data wrangling. It also uses straight lines instead of the curves above.
a %>%
mutate(path = row_number()) -> a
origin = select(a,token_id,Entry_Station_Lat,Entry_Station_Long,path)
origin$type = "origin"
dest = select(a,token_id,Exit_Station_Lat,Exit_Station_Long,path)
dest$type = "dest"
colnames(origin) = c("id","lat","long","path","type")
colnames(dest) = c("id","lat","long","path","type")
complete = rbind(origin,dest)
complete %>% arrange(path,type) -> complete
require(ggmap)
basemap <- get_map("Singapore",
source = "stamen",
maptype = "toner",
zoom = 11)
g = ggplot(complete, aes(x=as.numeric(long),
y=as.numeric(lat)))
map = ggmap(basemap, base_layer = g)
map + geom_path(aes(color = as.factor(id)),
size = 1.1)
If you want to plot it on an actual Google Map, and recreate the style of your linked map, you can use my googleway package that uses Google's Maps API. You need an API key to use their maps
library(googleway)
df$Exit_Station_Lat <- as.numeric(as.character(df$Exit_Station_Lat))
df$Exit_Station_Long <- as.numeric(as.character(df$Exit_Station_Long))
df$polyline <- apply(df, 1, function(x) {
lat <- c(x['Entry_Station_Lat'], x['Exit_Station_Lat'])
lon <- c(x['Entry_Station_Long'], x['Exit_Station_Long'])
encode_pl(lat = lat, lon = lon)
})
mapKey <- 'your_api_key'
style <- '[ { "stylers": [{ "visibility": "simplified"}]},{"stylers": [{"color": "#131314"}]},{"featureType": "water","stylers": [{"color": "#131313"},{"lightness": 7}]},{"elementType": "labels.text.fill","stylers": [{"visibility": "on"},{"lightness": 25}]}]'
google_map(key = mapKey, style = style) %>%
add_polylines(data = df,
polyline = "polyline",
mouse_over_group = "Entry_Station_Lat",
stroke_weight = 0.7,
stroke_opacity = 0.5,
stroke_colour = "#ccffff")
Note, to recreate the map using flight data, see the example given in ?add_polylines
You can also show other types of routes, for example, driving between the locations by using Google's Directions API to encode the driving routes.
df$drivingRoute <- lst_directions <- apply(df, 1, function(x){
orig <- as.numeric(c(x['Entry_Station_Lat'], x['Entry_Station_Long']))
dest <- as.numeric(c(x['Exit_Station_Lat'], x['Exit_Station_Long']))
dir <- google_directions(origin = orig, destination = dest, key = apiKey)
dir$routes$overview_polyline$points
})
google_map(key = mapKey, style = style) %>%
add_polylines(data = df,
polyline = "drivingRoute",
mouse_over_group = "Entry_Station_Lat",
stroke_weight = 0.7,
stroke_opacity = 0.5,
stroke_colour = "#ccffff")
Alternative answer using leaflet and geosphere
#get Packages
require(leaflet)
require(geosphere)
#format data
a$Entry_Station_Long = as.numeric(as.character(a$Entry_Station_Long))
a$Entry_Station_Lat = as.numeric(as.character(a$Entry_Station_Lat))
a$Exit_Station_Long = as.numeric(as.character(a$Exit_Station_Long))
a$Exit_Station_Lat = as.numeric(as.character(a$Exit_Station_Lat))
a$id = as.factor(as.numeric(as.factor(a$token_id)))
#create some colors
factpal <- colorFactor(heat.colors(30), pathList$id)
#create a list of interpolated paths
pathList = NULL
for(i in 1:nrow(a))
{
tmp = gcIntermediate(c(a$Entry_Station_Long[i],
a$Entry_Station_Lat[i]),
c(a$Exit_Station_Long[i],
a$Exit_Station_Lat[i]),n = 25,
addStartEnd=TRUE)
tmp = data.frame(tmp)
tmp$id = a[i,]$id
tmp$color = factpal(a[i,]$id)
pathList = c(pathList,list(tmp))
}
#create empty base leaflet object
leaflet() %>% addTiles() -> lf
#add each entry of pathlist to the leaflet object
for (path in pathList)
{
lf %>% addPolylines(data = path,
lng = ~lon,
lat = ~lat,
color = ~color) -> lf
}
#show output
lf
Note that as I mentioned before there is no way of geosphering the paths in such a small locality - the great circles are effectively straight lines. If you want the rounded edges for sake of aesthetics you may have to use the geom_curve way described in my other answer.
I've also written the mapdeck library to make visualisations like this more appealing*
library(mapdeck)
set_token("MAPBOX_TOKEN") ## set your mapbox token here
df$Exit_Station_Lat <- as.numeric(as.character(df$Exit_Station_Lat))
df$Exit_Station_Long <- as.numeric(as.character(df$Exit_Station_Long))
mapdeck(
style = mapdeck_style('dark')
, location = c(104, 1)
, zoom = 8
, pitch = 45
) %>%
add_arc(
data = df
, origin = c("Entry_Station_Long", "Entry_Station_Lat")
, destination = c("Exit_Station_Long", "Exit_Station_Lat")
, layer_id = 'arcs'
, stroke_from_opacity = 100
, stroke_to_opacity = 100
, stroke_width = 3
, stroke_from = "#ccffff"
, stroke_to = "#ccffff"
)
*subjectively speaking
I would like to leave an alternative approach for you. What you can do is to restructure your data. Right now you have two columns for entry stations and the other two for exit stations. You can create one column for long, and another for lat by combing these columns. The trick is to use rbind() and c().
Let's have a look of this simple example.
x <- c(1, 3, 5)
y <- c(2, 4, 6)
c(rbind(x, y))
#[1] 1 2 3 4 5 6
Imagine x is long for entry stations and y for exit stations. 1 is longitude for a starting point. 2 is longitude where the first journey ended. As far as I can see from your sample data, it seems that 3 is identical 2. You could remove duplicated data points for each token_id. If you have a large set of data, perhaps this is something you want to consider. Back to the main point, you can create a column with longitude in the sequence you want with the combination of the two functions. Since you said you have date information, make sure you order the data by date. Then, the sequence of each journey appears in the right way in tmp. You want to do this with latitude as well.
Now we look into your sample data. It seems that Exit_Station_Lat and Exit_Station_Long are in factor. The first operation is to convert them to numeric. Then, you apply the method above and create a data frame. I called your data mydf.
library(dplyr)
library(ggplot2)
library(ggalt)
library(ggthemes)
library(raster)
mydf %>%
mutate_at(vars(Exit_Station_Lat: Exit_Station_Long),
funs(as.numeric(as.character(.)))) -> mydf
group_by(mydf, token_id) %>%
do(data.frame(long = c(rbind(.$Entry_Station_Long,.$Exit_Station_Long)),
lat = c(rbind(.$Entry_Station_Lat, .$Exit_Station_Lat))
)
) -> tmp
Now let's get a map data from GADM. You can download data using the raster package.
getData(name = "GADM", country = "singapore", level = 0) %>%
fortify -> singapore
Finally, you draw a map. The key thing is to use group in aes in geom_path(). I hope this will let you move forward.
ggplot() +
geom_cartogram(data = singapore,
aes(x = long, y = lat, map_id = id),
map = singapore) +
geom_path(data = tmp,
aes(x = long, y = lat, group = token_id,
color = as.character(token_id)),
show.legend = FALSE) +
theme_map()

Need to Draw a Bar Graph ( in percentile manner ) in ggplot2

hi i have a data set like this
ALL Critical Error Warning Review
2016 1412 475 4 125
154 45 49 2 58
116 86 12 1 17
I want to plot a stacked bar graph using ggplot2 where a single bar would show 100% of "ALL" and rest "Critical","Error","Warning","Review" should be on top of another according to their contribution in "ALL".
I am try it with no luck!!! Need a hand..Thanks
I'm not quite sure if your description of the desired plot is non-ambiguous.
My interpretation would be the following:
## Copied from user1317221_G - Thanks for that.
babydf <- structure(list(ALL = c(2016L, 154L, 116L), Critical = c(1412L,
45L, 86L), Error = c(475L, 49L, 12L), Warning = c(4L, 2L, 1L),
Review = c(125L, 58L, 17L)), .Names = c("ALL", "Critical",
"Error", "Warning", "Review"), class = "data.frame", row.names = c(NA,
-3L))
# Add IDs
babydf <- cbind(id=1:nrow(babydf), babydf))
library(reshape2)
library(ggplot2)
# reshape the dataframe:
df.reshaped <- melt(babydf, id.vars='id')
ggplot(subset(df.reshaped, variable != 'ALL'), aes(x=id, y=value, fill=variable)) + geom_bar(stat='identity')
If you want to have all bars of equal height, just do
babydf[, 3:6] <- babydf[, 3:6] / babydf$ALL * 100
before melt. The result:

Resources