Mapping chemical concentrations with ggplot - R

I am trying to make a map showing the concentrations of chromium recorded in topsoil in Scotland (n = 1000). The following is a subset of the data:
Easting Northing Concentration
1 -4.327230 55.94000 1.913814
2 -4.336588 55.77886 1.408240
3 -4.334057 55.93637 1.798651
4 -4.340633 55.94451 1.629410
5 -4.341627 55.77247 1.382017
6 -4.354362 55.78004 1.432969
7 -4.327912 55.94871 1.488551
8 -4.301948 55.77286 1.278754
9 -4.317669 55.77715 1.465383
10 -4.266635 55.77981 1.793092
11 -4.349507 55.77358 1.336460
12 -4.331458 55.92509 1.546543
13 -4.360420 55.77211 1.720986
14 -4.316048 55.93779 1.876795
15 -4.348813 55.92620 1.637490
16 -4.358550 55.92574 1.460898
17 -4.271819 55.88522 2.011570
18 -4.350699 55.93884 1.385606
19 -4.323065 55.78208 1.620136
20 -4.305748 55.94769 1.463893
21 -4.324094 55.76453 1.416641
22 -4.311998 55.77294 1.390935
23 -4.295788 55.77657 1.378398
24 -4.351286 55.94323 1.485721
25 -4.344118 55.78473 1.623249
26 -4.358147 55.93492 1.454845
27 -4.310889 55.78653 1.372912
28 -4.270665 55.77506 1.706718
29 -4.341747 55.78244 1.561101
30 -4.312615 55.93929 1.521138
31 -4.330014 55.78626 1.564666
32 -4.328320 55.95283 2.313656
33 -4.334340 55.93043 2.007748
34 -4.317788 55.76303 1.309630
35 -4.342244 55.93936 1.680336
36 -4.351105 55.94818 1.673942
37 -4.351354 55.93379 1.396199
38 -4.318706 55.93135 1.854913
39 -4.315999 55.93428 1.361728
40 -4.326163 55.78588 1.646404
41 -4.302010 55.78203 2.023664
42 -4.318585 55.78720 1.305351
43 -4.304388 55.94097 1.465383
44 -4.309106 55.93414 1.539076
45 -4.297275 55.77474 1.503791
46 -4.298785 55.93290 1.447158
47 -4.326837 55.77311 1.555094
48 -4.342423 55.92641 1.338456
49 -4.332528 55.77228 1.491362
50 -4.347461 55.78197 1.426511
str(dat.tmp)
'data.frame': 50 obs. of 3 variables:
$ Easting : num -4.33 -4.34 -4.33 -4.34 -4.34 ...
$ Northing : num 55.9 55.8 55.9 55.9 55.8 ...
$ Concentration: num 1.91 1.41 1.8 1.63 1.38 ...
This is the code I am currently using to plot the concentrations over a map of Glasgow:
qmap(location = "glasgow", maptype = "terrain", zoom = 10, color = "bw",
     extent = "panel", maprange = FALSE) +
  stat_contour(data = dat.tmp, geom = "polygon",
               aes(x = Easting, y = Northing, z = Concentration,
                   fill = ..level..)) +
  scale_fill_continuous(name = "Cr (mg/kg)", low = "yellow", high = "red")
When executed, this returns:
Error in unit(tic_pos.c, "mm") : 'x' and 'units' must have length > 0
In addition: Warning message:
Not possible to generate contour data
This is a similar issue to a previous post, and the map/plot I am trying to create is very similar too:
Filled contour plot with R/ggplot/ggmap
Any help would be greatly appreciated - thank you.

I've reproduced the error, and also spotted a warning:
Warning message: Not possible to generate contour data
A quick Google search points to a similar question, which was resolved by using stat_density2d instead of stat_contour:
qmap(location = "glasgow", maptype = "terrain", zoom = 10, color = "bw",
     extent = "panel", maprange = FALSE) +
  stat_density2d(data = dat.tmp,
                 aes(x = Easting, y = Northing, z = Concentration, fill = ..level..))
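Note that stat_density2d estimates the 2D density of the sampling locations and ignores the z aesthetic, so the fill shows where samples cluster rather than the chromium levels themselves. To get a fill that actually tracks concentration, one option (a sketch of mine, assuming the akima package for the interpolation step; untested against the full data set) is to interpolate the values onto a regular grid first and draw that over the map:
library(akima)   # assumed available; used only for the interpolation

# interpolate the concentrations onto a regular grid
fld <- with(dat.tmp, interp(x = Easting, y = Northing, z = Concentration,
                            duplicate = "mean"))
grd <- expand.grid(Easting = fld$x, Northing = fld$y)
grd$Concentration <- as.vector(fld$z)
grd <- grd[!is.na(grd$Concentration), ]   # drop cells outside the data's convex hull

qmap(location = "glasgow", maptype = "terrain", zoom = 10, color = "bw",
     extent = "panel", maprange = FALSE) +
  geom_tile(data = grd, aes(x = Easting, y = Northing, fill = Concentration),
            alpha = 0.5) +
  scale_fill_continuous(name = "Cr (mg/kg)", low = "yellow", high = "red")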

Related

How to add regression lines for each factor on a plot

I've created a model and I'm trying to add curves that fit the two parts of the data: insulation and no insulation. I was thinking of using the insulation coefficient as a true/false term, but I'm not sure how to translate that into code. Entries 1:56 are "w/o" and 57:101 are "w/". I'm not sure how best to include the data I'm using, but here are the head and tail:
month year kwh days est cost avgT dT.yr kWhd.1 id insulation
1 8 2003 476 21 a 33.32 69 -8 22.66667 1 w/o
2 9 2003 1052 30 e 112.33 73 -1 35.05172 2 w/o
3 10 2003 981 28 a 24.98 60 -6 35.05172 3 w/o
4 11 2003 1094 32 a 73.51 53 2 34.18750 4 w/o
5 12 2003 1409 32 a 93.23 44 6 44.03125 5 w/o
6 1 2004 1083 32 a 72.84 34 3 33.84375 6 w/o
month year kwh days est cost avgT dT.yr kWhd.1 id insulation
96 7 2011 551 29 e 55.56 72 0 19.00000 96 w/
97 8 2011 552 27 a 61.17 78 1 20.44444 97 w/
98 9 2011 666 34 e 73.87 71 -2 19.58824 98 w/
99 10 2011 416 27 a 48.03 64 0 15.40741 99 w/
100 11 2011 653 31 e 72.80 53 1 21.06452 100 w/
101 12 2011 751 33 a 83.94 45 2 22.75758 101 w/
bill$id <- seq(1:101)
bill$insulation <- as.factor(ifelse(bill$id > 56, c("w/"), c("w/o")))
m1 <- lm(kWhd.1 ~ avgT + insulation + I(avgT^2), data=bill)
with(bill, plot(kWhd.1 ~ avgT, xlab="Average Temperature (F)",
ylab="Daily Energy Use (kWh/d)", col=insulation))
no_ins <- data.frame(bill$avgT[1:56], bill$insulation[1:56])
curve(predict(m1, no_ins=x), add=TRUE, col="red")
ins <- data.frame(bill$avgT[57:101], bill$insulation[57:101])
curve(predict(m1, ins=x), add=TRUE, lty=2)
legend("topright", inset=0.01, pch=21, col=c("red", "black"),
legend=c("No Insulation", "Insulation"))
ggplot2 makes this a lot easier than base plotting. Something like this should work:
ggplot(bill, aes(x = avgT, y = kWhd.1, color = insulation)) +
geom_smooth(method = "lm", formula = y ~ x + I(x^2), se = FALSE) +
geom_point()
In base, I'd create a data frame with the points you want to predict on, something like
pred_data <- expand.grid(
  avgT = seq(min(bill$avgT), max(bill$avgT), length.out = 100),  # grid over the model's predictor
  insulation = c("w/", "w/o")
)
pred_data$prediction <- predict(m1, newdata = pred_data)
And then use lines() to add the predictions to your plot; a sketch follows. My base graphics are pretty rusty, so I'll leave refinements to you (or another answerer) if you want them.
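A minimal sketch of that last step (assuming the bill data frame, the model m1, and the pred_data built above):
with(bill, plot(kWhd.1 ~ avgT, xlab = "Average Temperature (F)",
                ylab = "Daily Energy Use (kWh/d)", col = insulation))

# one fitted curve per insulation level, taken from pred_data
for (lev in c("w/o", "w/")) {
  sub <- pred_data[pred_data$insulation == lev, ]
  lines(sub$avgT, sub$prediction,
        col = ifelse(lev == "w/o", "red", "black"),
        lty = ifelse(lev == "w/o", 1, 2))
}
legend("topright", inset = 0.01, lty = c(1, 2), col = c("red", "black"),
       legend = c("No Insulation", "Insulation"))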
In base R it's important to order the x-values first. Since this has to be done for each factor level, we can do it with by, resulting in a list L.
Since your example data are not complete, here's an example with iris, where we treat Species as the "factor".
L <- by(iris, iris$Species, function(x) x[order(x$Petal.Length), ])
Now we can do the plot and add loess predictions as lines with a sapply.
with(iris, plot(Sepal.Width ~ Petal.Length, col=Species))
sapply(seq(L), function(x)
lines(L[[x]]$Petal.Length,
predict(loess(Sepal.Width ~ Petal.Length, L[[x]], span=1.1)), # span=1.1 for smoothing
col=x))
This yields a scatterplot of Sepal.Width against Petal.Length with one loess curve per species (figure omitted).
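If a legend is wanted, a small addition of my own (not in the original answer) along these lines should work:
legend("topright", legend = levels(iris$Species),
       col = seq_along(levels(iris$Species)), pch = 1)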

Aesthetics must be either length 1 or the same as the data: ymin, ymax, x, y, colour, when using a second geom_errorbar function

I'm trying to add error bars to a second curve (using the dataset pmfprofbs01), but I can't get it to work.
There are a few threads on this error, but unfortunately every other answer seems to be case specific, and I'm not able to overcome the error in my code. I am able to plot a first smoothed curve (stat_smooth) and overlapping error bars (using geom_errorbar). The problem arises when I try to add a second curve to the same graph for comparison purposes.
With the following code, I get this error: "Error: Aesthetics must be either length 1 or the same as the data (35): ymin, ymax, x, y, colour"
I am looking to add error bars to the second smoothed curve (corresponding to datasets pmfprof01 and pmfprofbs01).
Could someone explain why I keep getting this error? The code works until the second call of geom_errorbar().
These are my 4 datasets (all used as data frames):
- pmfprof1 and pmfprof01 are the two datasets used for applying the smoothing method.
- pmfprofbs1 and pmfprofbs01 contain additional information based on an error analysis for plotting error bars.
> pmfprof1
Z correctedpmfprof1
1 -1.1023900 -8.025386e-22
2 -1.0570000 6.257110e-02
3 -1.0116000 1.251420e-01
4 -0.9662020 2.143170e-01
5 -0.9208040 3.300960e-01
6 -0.8754060 4.658550e-01
7 -0.8300090 6.113410e-01
8 -0.7846110 4.902430e-01
9 -0.7392140 3.344200e-01
10 -0.6938160 4.002040e-01
11 -0.6484190 1.215460e-01
12 -0.6030210 -1.724360e-01
13 -0.5576240 -6.077170e-01
14 -0.5122260 -1.513420e+00
15 -0.4668290 -2.075330e+00
16 -0.4214310 -2.617160e+00
17 -0.3760340 -3.350500e+00
18 -0.3306360 -4.076220e+00
19 -0.2852380 -4.926540e+00
20 -0.2398410 -5.826390e+00
21 -0.1944430 -6.761300e+00
22 -0.1490460 -7.301530e+00
23 -0.1036480 -7.303880e+00
24 -0.0582507 -7.026800e+00
25 -0.0128532 -6.627960e+00
26 0.0325444 -6.651490e+00
27 0.0779419 -6.919830e+00
28 0.1233390 -6.686490e+00
29 0.1687370 -6.129060e+00
30 0.2141350 -6.120890e+00
31 0.2595320 -6.455160e+00
32 0.3049300 -6.554560e+00
33 0.3503270 -6.983390e+00
34 0.3957250 -7.413500e+00
35 0.4411220 -6.697370e+00
36 0.4865200 -5.477230e+00
37 0.5319170 -4.552890e+00
38 0.5773150 -3.393060e+00
39 0.6227120 -2.449930e+00
40 0.6681100 -2.183190e+00
41 0.7135080 -1.673980e+00
42 0.7589050 -8.003740e-01
43 0.8043030 -2.918780e-01
44 0.8497000 -1.159710e-01
45 0.8950980 9.123767e-22
> pmfprof01
Z correctedpmfprof01
1 -1.25634000 -1.878749e-21
2 -1.20387000 -1.750190e-01
3 -1.15141000 -3.500380e-01
4 -1.09894000 -6.005650e-01
5 -1.04647000 -7.935110e-01
6 -0.99400600 -8.626150e-01
7 -0.94153900 -1.313880e+00
8 -0.88907200 -2.067770e+00
9 -0.83660500 -2.662440e+00
10 -0.78413800 -4.514190e+00
11 -0.73167100 -7.989510e+00
12 -0.67920400 -1.186870e+01
13 -0.62673800 -1.535970e+01
14 -0.57427100 -1.829150e+01
15 -0.52180400 -2.067170e+01
16 -0.46933700 -2.167890e+01
17 -0.41687000 -2.069820e+01
18 -0.36440300 -1.662640e+01
19 -0.31193600 -1.265950e+01
20 -0.25946900 -1.182580e+01
21 -0.20700200 -1.213370e+01
22 -0.15453500 -1.233680e+01
23 -0.10206800 -1.235160e+01
24 -0.04960160 -1.123630e+01
25 0.00286531 -9.086940e+00
26 0.05533220 -6.562710e+00
27 0.10779900 -4.185860e+00
28 0.16026600 -3.087430e+00
29 0.21273300 -2.005150e+00
30 0.26520000 -9.295540e-02
31 0.31766700 1.450360e+00
32 0.37013400 1.123910e+00
33 0.42260100 2.426750e-01
34 0.47506700 1.213370e-01
35 0.52753400 5.265226e-21
> pmfprofbs1
Z correctedpmfprof01 bsmean bssd bsse bsci
1 -1.1023900 -8.025386e-22 0.00000000 0.0000000 0.00000000 0.0000000
2 -1.0570000 6.257110e-02 1.46519200 0.6691245 0.09974719 0.2010273
3 -1.0116000 1.251420e-01 1.62453300 0.6368053 0.09492933 0.1913175
4 -0.9662020 2.143170e-01 1.62111600 0.7200497 0.10733867 0.2163269
5 -0.9208040 3.300960e-01 1.44754700 0.7236743 0.10787900 0.2174158
6 -0.8754060 4.658550e-01 1.67509800 0.7148755 0.10656735 0.2147724
7 -0.8300090 6.113410e-01 1.78144200 0.7374481 0.10993227 0.2215539
8 -0.7846110 4.902430e-01 1.73058700 0.7701354 0.11480501 0.2313743
9 -0.7392140 3.344200e-01 0.97430090 0.7809477 0.11641681 0.2346227
10 -0.6938160 4.002040e-01 1.26812000 0.8033838 0.11976139 0.2413632
11 -0.6484190 1.215460e-01 0.93601510 0.7927926 0.11818254 0.2381813
12 -0.6030210 -1.724360e-01 0.63201080 0.8210839 0.12239996 0.2466809
13 -0.5576240 -6.077170e-01 0.05952252 0.8653050 0.12899205 0.2599664
14 -0.5122260 -1.513420e+00 0.57893690 0.8858471 0.13205429 0.2661379
15 -0.4668290 -2.075330e+00 -0.08164613 0.8921298 0.13299086 0.2680255
16 -0.4214310 -2.617160e+00 -1.08074600 0.8906925 0.13277660 0.2675937
17 -0.3760340 -3.350500e+00 -1.67279700 0.9081813 0.13538367 0.2728479
18 -0.3306360 -4.076220e+00 -2.50074900 1.0641550 0.15863486 0.3197076
19 -0.2852380 -4.926540e+00 -3.12062200 1.0639080 0.15859804 0.3196333
20 -0.2398410 -5.826390e+00 -4.47060100 1.1320770 0.16876008 0.3401136
21 -0.1944430 -6.761300e+00 -5.40812700 1.1471780 0.17101120 0.3446504
22 -0.1490460 -7.301530e+00 -6.42419100 1.1685490 0.17419700 0.3510710
23 -0.1036480 -7.303880e+00 -5.79613500 1.1935850 0.17792915 0.3585926
24 -0.0582507 -7.026800e+00 -5.85496900 1.2117630 0.18063896 0.3640539
25 -0.0128532 -6.627960e+00 -6.70480400 1.1961400 0.17831002 0.3593602
26 0.0325444 -6.651490e+00 -8.27106200 1.3376870 0.19941060 0.4018857
27 0.0779419 -6.919830e+00 -8.79402900 1.3582760 0.20247983 0.4080713
28 0.1233390 -6.686490e+00 -8.35947700 1.3673080 0.20382624 0.4107848
29 0.1687370 -6.129060e+00 -8.04437600 1.3921620 0.20753126 0.4182518
30 0.2141350 -6.120890e+00 -8.18588300 1.5220550 0.22689456 0.4572759
31 0.2595320 -6.455160e+00 -8.37217600 1.5436800 0.23011823 0.4637728
32 0.3049300 -6.554560e+00 -8.59346400 1.6276880 0.24264140 0.4890116
33 0.3503270 -6.983390e+00 -8.88378700 1.6557140 0.24681927 0.4974316
34 0.3957250 -7.413500e+00 -9.72709800 1.6569390 0.24700188 0.4977996
35 0.4411220 -6.697370e+00 -9.46033400 1.6378470 0.24415582 0.4920637
36 0.4865200 -5.477230e+00 -8.37590600 1.6262700 0.24243002 0.4885856
37 0.5319170 -4.552890e+00 -7.52867000 1.6617010 0.24771176 0.4992302
38 0.5773150 -3.393060e+00 -6.89192300 1.6667330 0.24846189 0.5007420
39 0.6227120 -2.449930e+00 -6.25115300 1.6670390 0.24850750 0.5008340
40 0.6681100 -2.183190e+00 -6.05373800 1.6720180 0.24924973 0.5023298
41 0.7135080 -1.673980e+00 -5.10526700 1.6668400 0.24847784 0.5007742
42 0.7589050 -8.003740e-01 -4.42001600 1.6561830 0.24688918 0.4975725
43 0.8043030 -2.918780e-01 -4.26640200 1.6588970 0.24729376 0.4983878
44 0.8497000 -1.159710e-01 -4.46318500 1.6533830 0.24647179 0.4967312
45 0.8950980 9.123767e-22 -5.17173200 1.6557990 0.24683194 0.4974571
> pmfprofbs01
Z correctedpmfprof01 bsmean bssd bsse bsci
1 -1.25634000 -1.878749e-21 0.000000 0.0000000 0.00000000 0.0000000
2 -1.20387000 -1.750190e-01 2.316589 0.4646486 0.07853995 0.1596124
3 -1.15141000 -3.500380e-01 2.320647 0.4619668 0.07808664 0.1586911
4 -1.09894000 -6.005650e-01 2.635883 0.6519826 0.11020517 0.2239639
5 -1.04647000 -7.935110e-01 2.814679 0.6789875 0.11476983 0.2332404
6 -0.99400600 -8.626150e-01 2.588038 0.7324196 0.12380151 0.2515949
7 -0.94153900 -1.313880e+00 2.033736 0.7635401 0.12906183 0.2622852
8 -0.88907200 -2.067770e+00 2.394285 0.8120181 0.13725611 0.2789380
9 -0.83660500 -2.662440e+00 2.465425 0.9485307 0.16033095 0.3258317
10 -0.78413800 -4.514190e+00 0.998115 1.0177400 0.17202946 0.3496059
11 -0.73167100 -7.989510e+00 -1.585430 1.0502190 0.17751941 0.3607628
12 -0.67920400 -1.186870e+01 -5.740894 1.2281430 0.20759406 0.4218819
13 -0.62673800 -1.535970e+01 -9.325951 1.3289330 0.22463068 0.4565045
14 -0.57427100 -1.829150e+01 -12.010540 1.3279860 0.22447060 0.4561792
15 -0.52180400 -2.067170e+01 -14.672770 1.3296720 0.22475559 0.4567583
16 -0.46933700 -2.167890e+01 -14.912250 1.3192610 0.22299581 0.4531820
17 -0.41687000 -2.069820e+01 -12.850570 1.3288470 0.22461614 0.4564749
18 -0.36440300 -1.662640e+01 -6.093746 1.3497100 0.22814263 0.4636416
19 -0.31193600 -1.265950e+01 -5.210692 1.3602240 0.22991982 0.4672533
20 -0.25946900 -1.182580e+01 -6.041660 1.3818700 0.23357866 0.4746890
21 -0.20700200 -1.213370e+01 -5.765808 1.3854680 0.23418683 0.4759249
22 -0.15453500 -1.233680e+01 -6.985883 1.4025360 0.23707185 0.4817880
23 -0.10206800 -1.235160e+01 -7.152865 1.4224030 0.24042999 0.4886125
24 -0.04960160 -1.123630e+01 -3.600538 1.4122650 0.23871635 0.4851300
25 0.00286531 -9.086940e+00 -0.751673 1.5764920 0.26647578 0.5415439
26 0.05533220 -6.562710e+00 2.852910 1.5535620 0.26259991 0.5336672
27 0.10779900 -4.185860e+00 5.398850 1.5915640 0.26902342 0.5467214
28 0.16026600 -3.087430e+00 6.262459 1.6137360 0.27277117 0.5543377
29 0.21273300 -2.005150e+00 8.047920 1.6283340 0.27523868 0.5593523
30 0.26520000 -9.295540e-02 11.168640 1.6267620 0.27497297 0.5588123
31 0.31766700 1.450360e+00 12.345900 1.6363310 0.27659042 0.5620994
32 0.37013400 1.123910e+00 12.124650 1.6289230 0.27533824 0.5595546
33 0.42260100 2.426750e-01 11.279890 1.6137100 0.27276677 0.5543288
34 0.47506700 1.213370e-01 11.531670 1.6311490 0.27571450 0.5603193
35 0.52753400 5.265226e-21 11.284980 1.6662890 0.28165425 0.5723903
The code for plotting both curves is:
deltamean01 <- pmfprofbs01[, "bsmean"] - pmfprofbs01[, "correctedpmfprof01"]
correctmean01 <- pmfprofbs01[, "bsmean"] - deltamean01
deltamean1 <- pmfprofbs1[, "bsmean"] - pmfprofbs1[, "correctedpmfprof1"]
correctmean1 <- pmfprofbs1[, "bsmean"] - deltamean1
pl<- ggplot(pmfprof1, aes(x=pmfprof1[,1], y=pmfprof1[,2],
colour="red")) +
list(
stat_smooth(method = "gam", formula = y ~ s(x), size = 1,
colour="chartreuse3",fill="chartreuse3", alpha = 0.3),
geom_line(data=pmfprof1,linetype=4, size=0.5,colour="chartreuse3"),
geom_errorbar(aes(ymin=correctmean1-pmfprofbs1[,"bsci"],
ymax=correctmean1+pmfprofbs1[,"bsci"]),
data=pmfprofbs1,colour="chartreuse3",
width=0.02,size=0.9),
geom_point(data=pmfprof1,size=1,colour="chartreuse3"),
xlab(expression(xi*(nm))),
ylab("PMF (KJ/mol)"),
## GCD
geom_errorbar(aes(ymin=correctmean01-pmfprofbs01[,"bsci"],
ymax=correctmean01+pmfprofbs01[,"bsci"]),
data=pmfprofbs01,
width=0.02,size=0.9),
geom_line(data=pmfprof01,aes(x=pmfprof01[,1],y=pmfprof01[,2]),
linetype=4, size=0.5,colour="darkgreen"),
stat_smooth(data=pmfprof01,method = "gam",aes(x=pmfprof01[,1],pmfprof01[,2]),
formula = y ~ s(x), size = 1,
colour="darkgreen",fill="darkgreen", alpha = 0.3),
theme(text = element_text(size=20),
axis.text.x = element_text(size=20,colour="black"),
axis.text.y = element_text(size=20,colour="black")),
scale_x_continuous(breaks=number_ticks(8)),
scale_y_continuous(breaks=number_ticks(8)),
theme(panel.background = element_rect(fill ='white',
colour='gray')),
theme(plot.background = element_rect(fill='white',
colour='white')),
theme(legend.position="none"),
theme(legend.key = element_blank()),
theme(legend.title = element_text(colour='gray', size=20)),
NULL
)
pl
This is the result of using pl (image: https://i.stack.imgur.com/x8FjY.png).
Thanks in advance for any suggestions.
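A likely cause, offered as a sketch rather than a verified fix: the aesthetics defined in the top-level ggplot() call are built from pmfprof1, which has 45 rows, and every layer inherits them; the second geom_errorbar supplies pmfprofbs01, which has only 35 rows, hence the "(35)" in the error. Giving that layer a complete mapping of its own and disabling inheritance avoids the length mismatch:
# self-contained corrected mean (equivalent to correctmean01 in the question)
pmfprofbs01$correctmean01 <- pmfprofbs01$bsmean -
  (pmfprofbs01$bsmean - pmfprofbs01$correctedpmfprof01)

## possible replacement for the second geom_errorbar() inside the list():
geom_errorbar(data = pmfprofbs01,
              aes(x = Z, ymin = correctmean01 - bsci, ymax = correctmean01 + bsci),
              inherit.aes = FALSE,
              colour = "darkgreen", width = 0.02, size = 0.9)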

Measuring distance between centroids in R

I want to create a matrix of the distance (in metres) between the centroids of every country in the world. Country names or country IDs should be included in the matrix.
The matrix is based on a shapefile of the world downloaded here: http://gadm.org/version2
Here is some rough info on the shapefile I'm using (I'm using shapefile@data$UN as my ID):
> str(shapefile@data)
'data.frame': 174 obs. of 11 variables:
$ FIPS : Factor w/ 243 levels "AA","AC","AE",..: 5 6 7 8 10 12 13
$ ISO2 : Factor w/ 246 levels "AD","AE","AF",..: 61 17 6 7 9 11 14
$ ISO3 : Factor w/ 246 levels "ABW","AFG","AGO",..: 64 18 6 11 3 10
$ UN : int 12 31 8 51 24 32 36 48 50 84 ...
$ NAME : Factor w/ 246 levels "Afghanistan",..: 3 15 2 11 6 10 13
$ AREA : int 238174 8260 2740 2820 124670 273669 768230 71 13017
$ POP2005 : int 32854159 8352021 3153731 3017661 16095214 38747148
$ REGION : int 2 142 150 142 2 19 9 142 142 19 ...
$ SUBREGION: int 15 145 39 145 17 5 53 145 34 13 ...
$ LON : num 2.63 47.4 20.07 44.56 17.54 ...
$ LAT : num 28.2 40.4 41.1 40.5 -12.3 ...
I tried this:
library(rgdal)   # for readOGR
library(rgeos)
shapefile <- readOGR("./Map/Shapefiles/World/World Map",
                     layer = "TM_WORLD_BORDERS-0.3")  # read in world shapefile
row.names(shapefile) <- as.character(shapefile@data$UN)
centroids <- gCentroid(shapefile, byid = TRUE,
                       id = as.character(shapefile@data$UN))  # create centroids
dist_matrix <- as.data.frame(geosphere::distm(centroids))
The result looks something like this:
V1 V2 V3 V4
1 0.0 4296620.6 2145659.7 4077948.2
2 4296620.6 0.0 2309537.4 219442.4
3 2145659.7 2309537.4 0.0 2094277.3
4 4077948.2 219442.4 2094277.3 0.0
1) Instead of the first column (1,2,3,4) and row (V1, V2, V3, V4) I would like to have country IDs (shapefile@data$UN) or names (shapefile@data$NAME). How does that work?
2) I'm not sure of the value that is returned. Is it metres, kilometres, etc?
3) Is geosphere::distm preferable to geosphere::distGeo in this instance?
1.
This should work to add the column and row names to your matrix, just as you did when adding the row names to shapefile:
crnames <- as.character(shapefile@data$UN)
colnames(dist_matrix) <- crnames
rownames(dist_matrix) <- crnames
2.
The default distance function in distm is distHaversine, which takes a radius (of the Earth) argument in metres, so the output is in metres. A quick check of the units is sketched below.
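As a sanity check (my own addition, not part of the original answer), one degree of longitude at the equator should come out to roughly 111 km:
library(geosphere)
# one degree of longitude at the equator: (0, 0) to (1, 0), points given as c(lon, lat)
distHaversine(c(0, 0), c(1, 0))
# [1] 111319.5   -- metres, i.e. roughly 111 km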
3.
Look at the documentation for distGeo and distHaversine and decide the level of accuracy you want in your results. To read the docs in R itself, just enter ?distGeo. A comparison of the two is sketched below.
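For completeness, a sketch (assuming the centroids object built above) comparing the two functions on the same centroids:
dist_haversine <- geosphere::distm(centroids, fun = geosphere::distHaversine)
dist_geo <- geosphere::distm(centroids, fun = geosphere::distGeo)

# distGeo works on the WGS84 ellipsoid, so the two matrices differ slightly
summary(as.vector(dist_geo - dist_haversine))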
Edit: the answer to Q1 may be wrong, since the matrix data may be aggregated; I'm looking at alternatives.

get_map does not download the full extent of the map wanted

I have a data.frame b that consists of 51 geographical points.
Using the ggmap package, I tried to show the points on a Google map.
First I downloaded the map from Google using a bounding box:
p <- ggmap(get_map(location = c(left = min(b$coords.x1), bottom = min(b$coords.x2),
                                right = max(b$coords.x1), top = max(b$coords.x2))))
But when I plotted the map:
p + geom_point(aes(x = coords.x1, y = coords.x2), data = b, alpha = 0.5, size = 3)
I got the following warning:
Removed 21 rows containing missing values (geom_point)
Here is my data frame:
>b
coords.x1 coords.x2
1 12.51787 41.87951
2 12.47803 41.89199
3 12.48278 41.90599
4 12.47687 41.89861
5 12.49223 41.89021
6 12.47090 41.90332
7 12.46656 41.89767
8 12.48494 41.90068
9 12.45351 41.90665
10 12.47221 41.89556
11 12.48449 41.89064
12 12.50552 41.89576
13 12.47714 41.85862
14 12.49313 41.87940
15 12.45394 41.90217
16 12.45446 41.90305
17 12.45446 41.90305
18 12.49214 41.91421
19 12.48331 41.90093
20 12.49060 41.88977
21 12.48533 41.89242
22 12.48111 41.89503
23 12.48313 41.89385
24 12.47454 41.89958
25 12.47631 41.91145
26 12.48694 41.88833
27 12.47477 41.88356
28 12.50742 41.88401
29 12.46809 41.88218
30 12.52118 41.91346
31 12.49211 41.88647
32 12.51339 41.88052
33 12.51339 41.88052
34 12.49666 41.88929
35 12.47537 41.89831
36 12.47733 41.89943
37 12.48074 41.90128
38 12.47543 41.90618
39 12.47597 41.90692
40 12.47724 41.90568
41 12.46651 41.90308
42 12.46693 41.92834
43 12.47402 41.93220
44 12.53848 41.83618
45 12.47951 41.89143
46 12.47307 41.89916
47 12.47705 41.90555
48 12.47938 41.90352
49 12.47922 41.90470
50 12.47639 41.91081
51 12.49242 41.89481
I usually follow the strategy of giving my data a 'window' using the f argument of make_bbox:
bbox <- ggmap::make_bbox(Long, Lat, data, f = 0.5)
map_loc <- get_map(location = bbox, source = 'google', maptype = 'terrain')
map <- ggmap(map_loc, extent = 'device', maprange=FALSE, darken = c(0.5, "white"))
Another option is to set a different zoom factor:
map_loc <- get_map(location = c(lon = mean(data$lon), lat = mean(data$lat)),
                   source = 'google', zoom = 14)
map <- ggmap(map_loc, extent = 'device')
Play with the zoom level to get the desired window area; a version adapted to the data frame b from the question is sketched below.
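For this particular data set, a minimal sketch of the first approach (untested; it assumes the b data frame with coords.x1/coords.x2 from the question):
library(ggmap)

bbox <- make_bbox(coords.x1, coords.x2, data = b, f = 0.5)   # pad the bounding box by 50%
map_loc <- get_map(location = bbox, source = "google", maptype = "terrain")

ggmap(map_loc, extent = "device", maprange = FALSE) +
  geom_point(aes(x = coords.x1, y = coords.x2), data = b, alpha = 0.5, size = 3)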

Given data points and y value, give x value

Given a set of (x, y) coordinates, how can I solve for x given y? If you were to plot the coordinates, the relationship would be non-linear but pretty close to exponential. I tried approx(), but it was way off. Here is some example data. In this scenario, how could I find the x value at which y == 50?
V1 V3
1 5.35 11.7906
2 10.70 15.0451
3 16.05 19.4243
4 21.40 20.7885
5 26.75 22.0584
6 32.10 25.4367
7 37.45 28.6701
8 42.80 30.7500
9 48.15 34.5084
10 53.50 37.0096
11 58.85 39.3423
12 64.20 41.5023
13 69.55 43.4599
14 74.90 44.7299
15 80.25 46.5738
16 85.60 47.7548
17 90.95 49.9749
18 96.30 51.0331
19 101.65 52.0207
20 107.00 52.9781
21 112.35 53.8730
22 117.70 54.2907
23 123.05 56.3025
24 128.40 56.6949
25 133.75 57.0830
26 139.10 58.5051
27 144.45 59.1440
28 149.80 60.0687
29 155.15 60.6627
30 160.50 61.2313
31 165.85 61.7748
32 171.20 62.5587
33 176.55 63.2684
34 181.90 63.7085
35 187.25 64.0788
36 192.60 64.5807
37 197.95 65.2233
38 203.30 65.5331
39 208.65 66.1200
40 214.00 66.6208
41 219.35 67.1952
42 224.70 67.5270
43 230.05 68.0175
44 235.40 68.3869
45 240.75 68.7485
46 246.10 69.1878
47 251.45 69.3980
48 256.80 69.5899
49 262.15 69.7382
50 267.50 69.7693
51 272.85 69.7693
52 278.20 69.7693
53 283.55 69.7693
54 288.90 69.7693
I suppose the problem you have is that approx solves for y given x, while you are talking about solving for x given y. So you need to switch your variables x and y when using approx:
df <- read.table(textConnection("
V1 V3
85.60 47.7548
90.95 49.9749
96.30 51.0331
101.65 52.0207
"), header = TRUE)
approx(x = df$V3, y = df$V1, xout = 50)
# $x
# [1] 50
#
# $y
# [1] 91.0769
Also, since your x grows roughly exponentially with y (equivalently, y is roughly logarithmic in x), there is an approximately linear relationship between y and log(x). It therefore makes more sense to interpolate log(x) linearly against y and then take the exponential to get back to x:
exp(approx(x = df$V3, y = log(df$V1), xout = 50)$y)
# [1] 91.07339
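A small convenience on top of that (my own sketch, not part of the original answer): approxfun builds the inverse interpolator once, so it can be reused for several target y values:
# piecewise-linear inverse on the log(x) scale, reusable for many y targets
inv <- approxfun(x = df$V3, y = log(df$V1))
exp(inv(50))            # same value as the approx() call above
exp(inv(c(48, 52)))     # other targets, as long as they lie inside the data range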
