The following code is an example for the transition function from the pdf manual for gdistance:
library(raster)
library(gdistance)
r <- raster(nrows=6, ncols=7, xmn=0, xmx=7, ymn=0, ymx=6, crs="+proj=utm +units=m")
r[] <- c(2, 2, 1, 1, 5, 5, 5,
2, 2, 8, 8, 5, 2, 1,
7, 1, 1, 8, 2, 2, 2,
8, 7, 8, 8, 8, 8, 5,
8, 8, 1, 1, 5, 3, 9,
8, 1, 1, 2, 5, 3, 9)
T <- transition(r, function(x) 1/mean(x), 8)
# 1/mean: reciprocal to get permeability
T <- geoCorrection(T)
c1 <- c(5.5,1.5)
c2 <- c(1.5,5.5)
#make a SpatialLines object for visualization
sPath1 <- shortestPath(T, c1, c2, output="SpatialLines")
plot(r)
lines(sPath1)
#make a TransitionLayer for further calculations
sPath2 <- shortestPath(T, c1, c2)
plot(raster(sPath2))
My specific interest is in this line:
T <- transition(r, function(x) 1/mean(x), 8)
Because I've come across numerous examples of people doing the following:
T <- transition(1/r, mean, 8)
As far as I can tell, this is the difference between 1/mean(x) and mean(1/x), which are not equivalent.
To verify this, I ran both versions of the transition function using the above code from the gdistance manual, and got these two very different plots:
And using costDistance(T, c1, c2) I got a distance of 21.1 for the first, and 13.6 for the second.
Clearly, these are very different results. So, my question is, what is the correct method for creating a TransitionLayer object from a cost matrix/layer/raster?
This is indeed an important difference. Take a look at the Wikipedia article on the harmonic mean for more info.
In the example, the values in the input raster are costs. So the correct way is to take the arithmetic mean of the cost first and then take the reciprocal of that to get the conductance. The traveller experiences half of the cost of the origin cell and half of the cost of the destination cell, (cost1 + cost2)/2.
So 1/mean(x) is correct for this case.
If the input raster has conductance values, the other function is correct: mean(1/x).
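To make the difference concrete, here is a minimal base-R sketch for a single step between two cells. The cost values 2 and 8 are made up for illustration and are not taken from the manual's raster.

```r
# Hypothetical costs of an origin cell and a destination cell
cost <- c(2, 8)

# Average the costs first, then invert to get a conductance
cond_from_mean_cost <- 1 / mean(cost)   # 1 / 5

# Invert each cost first, then average the conductances
mean_of_cond <- mean(1 / cost)          # (0.5 + 0.125) / 2

print(c(cond_from_mean_cost, mean_of_cond))
```

The two numbers differ (0.2 versus 0.3125), which is why the two transition() calls produce different paths and different cost distances.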
I've just started learning R, and I'm attempting to do some calculations involving a joint PMF in R.
The following matrix holds the joint PMF $p_{NG}(n,g)$:
(pNG <- matrix(c(16, 0, 0, 0, 0, 8, 8, 0, 0, 0, 4, 8, 4,
0, 0, 2, 6, 6, 2, 0, 1, 4, 6, 4, 1)/80,
ncol = 5, nrow = 5, byrow = TRUE))
colnames(pNG) <- rownames(pNG) <- 0:4
The marginal PMFs of $N$ and $G$ are found as follows:
(pN <- rowSums(pNG))
(pG <- colSums(pNG))
The expected value and variance of $N$ are found as follows:
(EN <- sum(0:4 * pN))
(VarN <- sum((0:4 - EN)^2 * pN))
The conditional PMFs of $N$ given $G = 0, 1, 2, 3, 4$ are found as follows:
(pNgG <- sweep(pNG, 2, pG, "/"))
The expected values of $N$ given $G$ are found as follows:
(ENgG <- colSums(0:4 * pNgG))
The variance of $N$ given $G$ is found as follows:
(VarNgG <- colSums(outer(0:4, ENgG, "-")^2 * pNgG))
With all this said and done, I want to find $P(N > G)$. However, I'm unsure of how to do this. I was thinking that there is a pattern here that has to do with the triangles (upper or lower) of the matrix, since this is where $i > j$ or $j > i$; on the diagonal, we have $i = j$.
So you need to add up all the cells of the matrix where the row number is greater than the column number. This is the "lower triangular" sub-matrix, which you can access using R's lower.tri() function:
sum(pNG[lower.tri(pNG)])
You can use upper.tri() for the opposite. (And diag() if you need the diagonal, where the row number equals the column number.)
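As a quick check, the sketch below rebuilds the pNG matrix from the question and sums the strict lower triangle in two ways. The names p_lower and p_index are introduced here for illustration only.

```r
# Joint PMF from the question: rows index N = 0..4, columns index G = 0..4
pNG <- matrix(c(16, 0, 0, 0, 0, 8, 8, 0, 0, 0, 4, 8, 4,
                0, 0, 2, 6, 6, 2, 0, 1, 4, 6, 4, 1) / 80,
              ncol = 5, nrow = 5, byrow = TRUE)
colnames(pNG) <- rownames(pNG) <- 0:4

# P(N > G): cells strictly below the diagonal (row index > column index)
p_lower <- sum(pNG[lower.tri(pNG)])

# Same sum written with explicit row/column index matrices
p_index <- sum(pNG[row(pNG) > col(pNG)])

print(c(p_lower, p_index))  # both 0.6125, i.e. 49/80
```

The row()/col() form is handy if the condition is something other than a plain triangle, for example N > G + 1.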
edit: added current solution
I am dabbling with the Travelling Salesman Problem and am using a solver to calculate the optimal tour. The output of my linear solver gives me a table with the arcs in a route; however, to plot the tour I require a vector with all the locations chained in the right order. Is there an elegant way to chain these arcs into a single tour?
One solution would be a series of (nested) joins/matches, however that is not an elegant solution in my opinion.
# output of solver (where i = 'from' and j = 'to')
solution = data.frame(i = c(6, 4, 10, 7, 1, 9, 3, 2, 8, 5),
j = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
# transformation
??
# required output
tour = c(6, 1, 5, 10, 3, 7, 4, 2, 8, 9)
So the output I am looking for is a single chain of connected arcs (from i to j) in the tour.
My current solution uses for loops and match and looks as follows:
# number of cities to visit (one row per arc)
nCities = nrow(solution)
# empty matrix
tour = matrix(0, nCities, 2)
# first location to visit picked manually
tour[1, ] = unlist(solution[1, ])
# for loop to find index of next arc in tour
for(k in 2:nCities){
ind = match(tour[k - 1, 2], solution[, 1])
tour[k, ] = unlist(solution[ind, ])
}
# 'tour' now holds the arcs of the solution, sorted in visiting order.
# I then take only the first column, which is the tour
tour = tour[, 1]
However, it looks clunky, and as I try to avoid for loops as much as possible I am not too happy with it. Also, my suspicion is that there are more elegant solutions out there, preferably using base R functions.
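One possible base-R sketch, not a definitive answer: build a "next city" lookup vector and walk it with Reduce(). It assumes the cities are labelled 1..n, that every city appears exactly once in column i, and that the solver returned a single cycle with no subtours. The names nxt and start are introduced here for illustration.

```r
# Solver output from the question (i = 'from', j = 'to')
solution <- data.frame(i = c(6, 4, 10, 7, 1, 9, 3, 2, 8, 5),
                       j = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

# nxt[k] is the city visited directly after city k
# (assumes city labels are exactly 1..n)
nxt <- solution$j[order(solution$i)]

# Start at the 'from' city of the first arc and follow the chain n - 1 times
start <- solution$i[1]
tour <- Reduce(function(city, step) nxt[city],
               seq_len(nrow(solution) - 1),
               accumulate = TRUE, start)

print(tour)  # 6 1 5 10 3 7 4 2 8 9
```

Reduce() still iterates internally, so this is mainly a more compact way to write the loop rather than a faster one.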
What I want to achieve is exactly the same that was already asked here (and specifically using R's base graphics, not packages like ggplot or lattice): Ordering bars in barplot()
However, the solutions proposed there do not seem to work for me. What I need is the following. Suppose I have this:
num <- c(1, 8, 4, 3, 6, 7, 5, 2, 11, 3)
cat <- c(letters[1:length(num)])
data <- data.frame(num, cat)
If I generate a barplot using barplot(data$num), here is what I get:
Now, I want to reorder the bars according to data$cat. Following the link I mentioned above, I tried the accepted answer but got an error:
num2 <- factor(num, labels = as.character(cat))
Error in factor(num, labels = as.character(cat)) : invalid 'labels'; length 10 should be 1 or 9
Then I also tried the other answer there:
num <- as.factor(num)
barplot(table(num))
But here is what I got:
So, in this particular case of mine, which is slightly different from that question, how should I order the barplot so the bars are defined by data$num but ordered according to data$cat?
You can use ggplot2 to do this:
library("ggplot2")
num <- c(1, 8, 4, 3, 6, 7, 5, 2, 11, 3)
cat <- c(letters[1:10])
data <- data.frame(num, cat)
ggplot(data,aes(x= reorder(cat,-num),num))+geom_bar(stat ="identity")
The result is as shown below
Using base functions
df <- data[order(data$num,decreasing = TRUE),]
barplot(df$num,names.arg = df$cat)
I get the following,
num <- c(1, 8, 4, 3, 6, 7, 5, 2, 11, 3)
cat <- c(letters[1:10])
data <- data.frame(num, cat)
barplot(data[order(data[,1],decreasing=TRUE),][,1],names.arg=data[order(data[,1],decreasing=TRUE),][,2])
The above code uses the order() function twice (see comments, below). To avoid doing this the results of the ordered data.frame can be stored in a new data.frame and this can be used to generate the barplot.
num <- c(1, 8, 4, 3, 6, 7, 5, 2, 11, 3)
cat <- c(letters[1:10])
data <- data.frame(num, cat)
data2 <- data[order(data[,1],decreasing=TRUE),]
barplot(data2[,1],names.arg=data2[,2])
Alternatively, you can also use the following if you don't want to put your data in a new dataframe. Just a little simpler.
barplot(sort(data$num, decreasing = TRUE))
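Note that sorting the bare numeric vector drops the category labels. A small sketch that keeps them, assuming it is acceptable to carry the labels as names on the vector (the names vals and sorted are introduced here for illustration):

```r
num <- c(1, 8, 4, 3, 6, 7, 5, 2, 11, 3)
cat <- letters[1:10]

# Attach the category labels as names so they travel with the values
vals <- setNames(num, cat)

# sort() keeps the names; barplot() uses them as bar labels by default
sorted <- sort(vals, decreasing = TRUE)
barplot(sorted)
```

This gives the same picture as the names.arg version above without creating a second data frame.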
In R I have a SpatialPointsDataFrame with duplicated points (coordinates and attributes), and I would like to remove all points that have the same data ...
I have found the remove.duplicates() function in the sp package, but it seems to remove duplicates based on location only ... Is there another way?
thank you
E.
Would something like this work?
library(sp)
pts <- SpatialPoints(cbind(c(1, 1, 1, 2, 3, 4), c(1, 1, 1, 4, 2, 4)))
pts <- SpatialPointsDataFrame(pts, data=data.frame(id = c(1, 2, 2, 3, 4, 5)))
## All points
pts
## No spatial duplicates
remove.duplicates(pts)
## No duplicates in attributes
pts[which(!duplicated(pts$id)), ]
## Combination
pts[which(!duplicated(as.data.frame(pts))), ]
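The last line works because duplicated() on a data frame compares whole rows. The sketch below shows the same idea on a plain data frame with the coordinates and attribute from the example, so it runs without sp. The names df and keep are introduced here for illustration.

```r
# Same coordinates and ids as in the example, as an ordinary data frame
df <- data.frame(x  = c(1, 1, 1, 2, 3, 4),
                 y  = c(1, 1, 1, 4, 2, 4),
                 id = c(1, 2, 2, 3, 4, 5))

# A row is flagged only if x, y and id all match an earlier row
keep <- !duplicated(df)

print(keep)
print(df[keep, ])
```

Only the third row is dropped: the first three points share a location, but the first one has a different id.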
I am writing a function to which I want to supply a variable containing a condition to be evaluated inside the function. For example, I have an hourval variable containing values like 0, 3, 6, 9, 18, 3, 6, 9, 18, 0, 3, 18 ... I want to select the indices where hourval equals 0 or 6. The values 0 and 6 can change depending on other parameters; they are not fixed. So I pass a variable g1 = call("which", (hourval==0 | hourval == 6)) and want this statement to be evaluated inside the function, using x1 = eval(g1). However, at the time I create g1 the hourval variable does not exist yet; it is only generated just before the eval(g1) statement. I get the error: object hourval not found. Is there any other way to solve this problem?
Thanks in advance, any help is appreciated.
Narayani Barve
Is this what you want?
> hourval <- c(0, 3, 6, 9, 18, 3, 6, 9, 18, 0, 3, 18)
> test <- c(0,6)
> which(hourval %in% test)
[1] 1 3 7 10
It took me a while to find it with this search strategy
library(fortunes)
fortune("parse")
but eventually got the one I remembered:
> fortune("parse")
If the answer is parse() you should usually rethink the question.
-- Thomas Lumley
R-help (February 2005)
Part of my difficulty was in the fact that I remembered the quote as having "eval(parse(".
This is what you seem to describe
f1 <- function(y) {
hourval <- c(0, 3, 6, 9, 18, 3, 6, 9, 18, 0, 3, 18)
eval(substitute(y))
}
f1( which(hourval %in% c(0,6)) )
But this is what I'd do instead.
f2 <- function(y) {
hourval <- c(0, 3, 6, 9, 18, 3, 6, 9, 18, 0, 3, 18)
which(hourval %in% y)
}
f2( c(0,6) )
But again, there's not enough information yet to know if either of these answers the question.
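For completeness, the original error arises because call() evaluates its arguments immediately, so hourval must already exist when g1 is built. A minimal sketch of deferring evaluation with quote() instead, assuming the condition really has to be passed around as an expression (the function name f3 is hypothetical):

```r
# quote() stores the expression without evaluating it,
# so hourval does not need to exist yet
g1 <- quote(which(hourval == 0 | hourval == 6))

f3 <- function(cond) {
  hourval <- c(0, 3, 6, 9, 18, 3, 6, 9, 18, 0, 3, 18)
  # eval() looks up hourval in the calling frame, i.e. inside f3
  eval(cond)
}

x1 <- f3(g1)
print(x1)  # 1 3 7 10
```

Passing the values themselves, as in f2 above, is usually simpler and easier to debug than passing unevaluated expressions.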