Getting pairs of coordinates in the same column? (R)

I'm playing around with the concaveman package.
I'm using this sample code to create a polygon of a concave hull around some test points:
library(concaveman)
data(points)
polygons <- concaveman(points)
plot(points)
plot(polygons, add = TRUE)
However, the polygon df has all the coordinates crammed into one row like so:
polygons
1
list(c(-122.0809, -122.0813, -122.0812, -122.082, -122.0819, -1...
I tried using unlist, but this just separates the x/y coordinate pairs to opposite ends of the df from each other:
fixpolygon <- data.frame(unlist(polygons))
outputs:
polygons1 -122.0809
polygons2 -122.0813
polygons3 -122.0812
...
polygons210 37.3736
polygons211 37.3764
polygons212 37.3767
How can I make it so that the output is like so:
c(-122.0809, 37.3736)
c(-122.0813, 37.3764)
...
and so on?

By inspecting
str(polygons)
we can see that what you want is already prepared in
polygons$polygons[[1]][[1]]
# V1 V2
# [1,] -122.0809 37.3736
# [2,] -122.0813 37.3764
# [3,] -122.0812 37.3767
# [4,] -122.0820 37.3772
# [5,] -122.0819 37.3792
# [6,] -122.0822 37.3792
# ...
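If a plain data frame with x/y columns is more convenient, a small sketch based on the matrix above (the V1/V2 names come from the output shown; "x" and "y" are just illustrative labels):
coords_df <- as.data.frame(polygons$polygons[[1]][[1]])
names(coords_df) <- c("x", "y")
head(coords_df)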

Try using the sf package:
library(sf)
st_coordinates(st_as_sf(polygons))
X Y L1 L2
[1,] -122.0809 37.3736 1 1
[2,] -122.0813 37.3764 1 1
[3,] -122.0812 37.3767 1 1
[4,] -122.0820 37.3772 1 1
[5,] -122.0819 37.3792 1 1
[6,] -122.0822 37.3792 1 1
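If you specifically want a list of c(x, y) pairs rather than a two-column matrix, one way (a sketch building on the st_coordinates() output above; asplit() needs base R >= 3.6) is:
coords <- st_coordinates(st_as_sf(polygons))[, c("X", "Y")]
pairs <- asplit(coords, 1) # one named c(X, Y) vector per row
pairs[[1]]
# X Y
# -122.0809 37.3736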

How does the equation for the SpatRaster roughness index (terrain, v = "roughness") work?

The terra package offers and describes the following terrain indices:
x <- terrain(x, v="roughness")
x <- terrain(x, v="TPI")
x <- terrain(x, v="TRI")
I am confused about how this is calculated based on the package's description of roughness as "the difference between the maximum and the minimum value of a cell and its 8 surrounding cells" (Hijmans et al. 2023). How does this work for edge and corner cells? Am I right to assume that the calculation reduces to a cell and its 5 or 3 surrounding cells in these cases?
The ruggedness (TRI) index is described as "the mean of the absolute differences between the value of a cell and the value of its 8 surrounding cells". The following is a graphic illustration of how I envision the calculation of these indices from the description provided.
Does this provide a correct interpretation of these indices?
If this interpretation is incorrect, then I am hoping someone can point me in the right direction (a reference) or explain it here. I am interested in coding a 16° slope criterion from a DSM together with an elevational difference of 1.3 m, but I think that a terrain index would give a better indicator of the 1.3 m criterion for this habitat model.
## > 16° slope
habitat_slope_mat <- matrix(nrow = 2, ncol = 3)
habitat_slope_mat[1, ] <- c(0, 16, 0)            # from 0 to 16 -> 0 (absent)
habitat_slope_mat[2, ] <- c(16, minmax(x)[2], 1) # from 16 to max -> 1 (present)
habitat_slope <- classify(x, habitat_slope_mat, include.lowest = TRUE)
I looked at the cited references and was expecting to find the formula for this to help me think of the best way to treat the 1.3 m criterion. I have been unable to locate a written / published description that further explains the method. This paper is listed in the citations for the terrain function description:
Jones, K.H., 1998. A comparison of algorithms used to compute hill (sic) *terrain* as a property of the DEM. Computers & Geosciences 24: 315-323
The correct title for the article (DOI: 10.1016/S0098-3004(98)00032-6) is: "A comparison of algorithms used to compute hill *slope* as a property of the DEM". I cannot locate the formula for roughness in that paper and was interested in reading more on this topic.
I am not sure if this question is appropriate here, as you do not seem to be asking a coding question.
The manual points to Wilson et al. (2007) for the terrain indices. It also shows how you can use focal instead of terrain to compute them.
You can see for yourself what happens with small examples like this:
library(terra)
x <- rast(nrow=3, ncol=3, vals=c(1,2,3,1,2,1,1,2,8), ext=ext(0,1,0,1), crs="local")
as.matrix(x, wide=T)
# [,1] [,2] [,3]
#[1,] 1 2 3
#[2,] 1 2 1
#[3,] 1 2 8
terrain(x, "roughness") |> as.matrix(wide=TRUE)
# [,1] [,2] [,3]
#[1,] NaN NaN NaN
#[2,] NaN 7 NaN
#[3,] NaN NaN NaN
focal(x, w=3, fun=\(x) {max(x) - min(x)}) |> as.matrix(wide=T)
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] NA 7 NA
#[3,] NA NA NA
terrain(x, "TRI") |> as.matrix(wide=TRUE)
# [,1] [,2] [,3]
#[1,] NaN NaN NaN
#[2,] NaN 1.375 NaN
#[3,] NaN NaN NaN
focal(x, w=3, fun=\(x) sum(abs(x[-5]-x[5]))/8) |> as.matrix(wide=T)
# [,1] [,2] [,3]
#[1,] NA NA NA
#[2,] NA 1.375 NA
#[3,] NA NA NA
So the edge cells become missing (you could do other things via focal).
Or look at the source code.
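If you do want values at the edge and corner cells (so the calculation falls back to the 5 or 3 available neighbours, as the question speculates), one option is to drop the NA neighbours inside focal. A minimal sketch, reusing the small raster above and assuming that ignoring the missing neighbours is acceptable for your purpose:
# roughness with edges: max - min over whichever neighbours exist
focal(x, w=3, fun=\(v) max(v, na.rm=TRUE) - min(v, na.rm=TRUE)) |> as.matrix(wide=TRUE)
# TRI-like index with edges: mean absolute difference from the centre cell
# (v[5] is the centre of the 3x3 window), averaged over the available neighbours only
focal(x, w=3, fun=\(v) mean(abs(v[-5] - v[5]), na.rm=TRUE)) |> as.matrix(wide=TRUE)
Note that averaging over the available neighbours is not the same as always dividing by 8, so this is a variant of TRI rather than the exact definition terrain uses.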

rbind is changing a constant number of one of my columns

I am trying to make a matrix from a data frame. Everything works fine, but when I use rbind, a constant number that I have in one of my columns changes.
My dataframe looks like this:
CHR POS Fd FdDenom
1 10 3809 0.0000 0.0000
2 10 5673 -0.2500 0.0000
3 10 5847 0.0000 0.5000
...
It is named FS_10, and on it I am running the following for loop:
table10 <- c()
a <- 0
for (i in 1:round(nrow(Fs_10)/50)) {
  window <- Fs_4[a:c(a+50), ]
  a <- a + 50
  a1 <- sum(window$FdNum)
  a2 <- sum(window$FdDenom)
  Result <- a1/a2
  start <- window[1, ]
  end <- window[50, ]
  middle <- (start[, 2] + end[, 2])/2
  table10 <- rbind(table10, c(window[1, 1], start[, 2], end[, 2], end[, 2] - start[, 2], middle, Result))
}
My output looks like this:
V1 V2 V3 V4 V5 V6
1 2 3869 624096 620287 313952.5 0.029411765
50 2 624096 624694 598 624395.0 0.500000000
100 2 624714 625470 756 625092.0 0.205128205
I expected the number 10 in column V1, but I am getting 2. I have changed several things and the 2 is still there instead of the 10. Do you know what is happening?
A simplified version of the problem is:
rbind(c(start[,1], start[,2], end[,2]), c(start[,1], start[,2], end[,2]))
Where start is:
CHR POS FdNum FdDenom
240938 10 148990666 0.25 0.25
And end is:
CHR POS FdNum FdDenom
240987 10 149534407 -0.5 0
I have this:
[,1] [,2] [,3]
[1,] 2 148990666 149534407
[2,] 2 148990666 149534407
Again, 2 instead of 10.
Using this:
rbind(list(inicio[,1], inicio[,2], fin[,2]), list(inicio[,1], inicio[,2], fin[,2]))
I have this:
[,1] [,2] [,3]
[1,] factor,1 148990666 149534407
[2,] factor,1 148990666 149534407
Do you know what the problem is?
Thanks
I am a newbie at programming in R. I thought that I was creating a matrix, but akrun is right: I was creating a list. My solution was to create a data frame instead, substituting the last line of my for loop with this:
table10 <- rbind(table10, data.frame(window[1, c(1, 2)], start[, 2], end[, 2], end[, 2] - start[, 2], middle, Result))
However, I still do not understand why the 10 was turning into 2; I'll study more about this.
Thanks
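For what it's worth, a likely explanation (the factor,1 entries in the list output above suggest CHR is a factor): when a factor ends up in c()/rbind() together with numbers, it can be coerced to its internal integer level codes rather than its labels (the exact behaviour depends on the R version), and chromosome names sort as character strings, so "1" is level 1 and "10" is level 2. A small, hypothetical demonstration:
chr <- factor(c("1", "10", "11", "2")) # hypothetical CHR-like factor
levels(chr)                            # "1" "10" "11" "2" -- character sort order
x <- chr[2]                            # the value "10"
as.numeric(x)                          # 2  -- the internal level code, not the label
as.numeric(as.character(x))            # 10 -- convert via character to get the real value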

Filling columns of a matrix with the outputs of different functions

I want to fill each column of an empty matrix with values returned by different functions. I want to use many functions, so speed is important. I have prepared a small example of what I want to do, but I can't get it to work.
I have an empty matrix whose columns I want to fill with the functions' outputs. The matrix has a fixed number of columns, and each column has a specific name:
mat <- matrix(ncol = 4)
colnames(mat) <- c("binomial", "normal", "gamma", "exponential")
Then, consider a vector that contains some of the column names of this matrix:
remove <- c("gamma", "exponential")
I want to fill the columns of this matrix with random values drawn from each distribution, with the condition that if the remove object contains the name of a column, that column must be removed and not computed.
I wrote this:
mat <- mat[, -which(colnames(mat) %in% remove)]
mat[, 1] <- rnbinom(10, mu = 4, size = 1)
mat[, 2] <- rnorm(10)
mat[, 3] <- rgamma(10, 0.001)
mat[, 4] <- rexp(10)
The final matrix I am looking for is something like this:
binomial normal
1 -0.54948696
6 -0.53396115
1 0.69918478
13 0.92824442
0 0.03331125
I would be very grateful for your kind help.
Here is a method that constructs a function. The random generators are stored in a list, and the subset of them that is not in remove is fed to sapply.
randMatGet <- function(sampleSize = 10, remove = NULL) {
  randFuncs <- list("binomial"    = function(x) rnbinom(x, mu = 4, size = 1),
                    "normal"      = function(x) rnorm(x),
                    "gamma"       = function(x) rgamma(x, 0.001),
                    "exponential" = function(x) rexp(x))
  sapply(randFuncs[setdiff(names(randFuncs), remove)], function(f) f(sampleSize))
}
Now, call the function
set.seed(1234)
randMatGet()
binomial normal gamma exponential
[1,] 0 0.375635612 0.000000e+00 1.45891992
[2,] 1 0.310262167 0.000000e+00 1.43920743
[3,] 1 0.005006950 3.099691e-294 2.76404158
[4,] 5 -0.037630263 7.540715e-249 0.02316716
[5,] 0 0.723976061 0.000000e+00 0.89394340
[6,] 0 -0.496738863 0.000000e+00 3.68036715
[7,] 0 0.011395161 0.000000e+00 2.90720399
[8,] 4 0.009859946 9.088837e-34 0.13015222
[9,] 10 0.678271423 0.000000e+00 0.81417829
[10,] 0 1.029563029 0.000000e+00 2.01986489
and then with remove
# reset seed for comparison
set.seed(1234)
randMatGet(remove=remove)
binomial normal
[1,] 0 0.375635612
[2,] 1 0.310262167
[3,] 1 0.005006950
[4,] 5 -0.037630263
[5,] 0 0.723976061
[6,] 0 -0.496738863
[7,] 0 0.011395161
[8,] 4 0.009859946
[9,] 10 0.678271423
[10,] 0 1.029563029
To allow for adjustments of different parameters, change the function as follows. This is an example for the mu argument to rnbinom.
randMatGet <- function(sampleSize = 10, remove = NULL, mu = 4) {
  randFuncs <- list("binomial"    = function(x) rnbinom(x, mu = mu, size = 1),
                    "normal"      = function(x) rnorm(x),
                    "gamma"       = function(x) rgamma(x, 0.001),
                    "exponential" = function(x) rexp(x))
  sapply(randFuncs[setdiff(names(randFuncs), remove)], function(f) f(sampleSize))
}
Now you can call randMatGet(mu = 1).
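A hedged alternative sketch, in case you need to adjust several parameters at once: keep the generator functions and their extra arguments in separate lists, so any parameter can be overridden through a single params argument (the names and defaults below are just an illustration):
randMatGet2 <- function(sampleSize = 10, remove = NULL,
                        params = list(binomial = list(mu = 4, size = 1),
                                      gamma    = list(shape = 0.001))) {
  gens <- list(binomial = rnbinom, normal = rnorm,
               gamma = rgamma, exponential = rexp)
  keep <- setdiff(names(gens), remove)
  # call each kept generator with the sample size plus its own extra arguments
  sapply(keep, function(nm) do.call(gens[[nm]], c(list(sampleSize), params[[nm]])))
}
For example, randMatGet2(params = list(binomial = list(mu = 1, size = 1), gamma = list(shape = 0.001))) changes mu without touching the function body.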

Visualization of multi-dimensional data clusters in R

For a set of documents, I have a feature matrix of size 30 x 32, where rows represent documents and columns represent features. So basically 30 documents and 32 features for each of them. After running a PSO algorithm, I have been able to find some cluster centroids (which I am not at the moment sure are optimal), each of which is a row vector of length 32. I also have a column vector of size 30 x 1 which shows the centroid each document has been assigned to, so index one of this vector contains the index of the centroid to which document 1 has been assigned, and so on. This is obtained after computing the Euclidean distances of each of the documents from the centroids.
I wanted to get some hints on whether there is a way in R to plot this multidimensional data in the form of clusters. Is there a way, for example, by which I could either collapse these dimensions to 1-D, or somehow show them in a graph that might be a bit pretty to look at? I have been reading about multidimensional scaling. So far, what I understand about it is that it is a way to reduce multi-dimensional data to lower dimensions, which does seem to be what I want. So I tried it with this code (centroids[[3]] is a 4 x 32 matrix and represents the 4 centroids):
points <- features.dataf[2:ncol(features.dataf)]
row.names(points) <- features.dataf[,1]
fit <- cmdscale(points, eig = TRUE, k = 2)
x <- fit$points[, 1]
y <- fit$points[, 2]
plot(x, y, pch = 19, xlab="Coordinate 1", ylab="Coordinate 2", main="Clustering Text Based on PSO", type="n")
text(x, y, labels = row.names(points), cex=.7)
It gives me this error:
Error in cmdscale(pointsPlusCentroids, eig = TRUE, k = 2) :
distances must be result of 'dist' or a square matrix
However, it does seem to produce a plot, but the pch = 19 point symbols do not appear, just the text labels.
In addition to the above, I want to color the points so that documents in cluster 1 get one color, documents in cluster 2 another color, and so on. Is there any way to do this if I have a column vector of cluster assignments like this:
[,1]
[1,] 1
[2,] 3
[3,] 1
[4,] 4
[5,] 1
[6,] 4
[7,] 3
[8,] 4
[9,] 4
[10,] 4
[11,] 2
[12,] 2
[13,] 2
[14,] 2
[15,] 1
[16,] 2
[17,] 1
[18,] 4
[19,] 2
[20,] 4
[21,] 1
[22,] 1
[23,] 1
[24,] 1
[25,] 1
[26,] 3
[27,] 4
[28,] 1
[29,] 4
[30,] 1
Could anyone please help me with this, or suggest another way to plot multi-dimensional clusters like these? Thank you!
As cmdscale needs distances, try cmdscale(dist(points), eig = TRUE, k = 2). Symbols do not appear because of type = "n". For coloring text, use: text(x, y, rownames(points), cex = 0.6, col = centroids)
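Putting those pieces together, a minimal sketch reusing the objects from the question (points is the 30 x 32 feature matrix, centroids the 30 x 1 vector of cluster assignments):
fit <- cmdscale(dist(points), eig = TRUE, k = 2)   # classical MDS on a distance matrix
x <- fit$points[, 1]
y <- fit$points[, 2]
plot(x, y, pch = 19, col = centroids,              # one color per cluster
     xlab = "Coordinate 1", ylab = "Coordinate 2",
     main = "Clustering Text Based on PSO")
text(x, y, labels = rownames(points), cex = 0.6, col = centroids, pos = 3)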

Using rollapply() to find modal value

I've got panel data and have been playing around with k-means clustering. So now I've got a panel of factor values that are mostly stable but I'd like to smooth that out a bit more so that (for example) the data says "Wyoming was in group 1 in earlier years, moved into group 2, then moved into group 5" rather than "Wyoming was in group 1,1,1,2,3,2,2,5,5,5".
So the approach I'm taking is to use rollapply() to calculate the modal value. Below is code that works to calculate the mode ("Mode()"), and a wrapper for that ("ModeR()") that (perhaps clumsily) resolves the problem of multi-modal windows by randomly picking a mode. All that is fine, but when I put it into rollapply() I'm getting problems.
library(dplyr)  # for arrange() and filter()
library(zoo)    # for rollapply()

Mode <- function(vect){ # take a vector as input
  temp <- as.data.frame(table(vect))
  temp <- arrange(temp, desc(Freq)) # from dplyr
  max.f <- temp[1, 2]
  temp <- filter(temp, Freq == max.f) # cut out anything that isn't modal
  return(temp[, 1])
}
ModeR <- function(vect){
  out <- Mode(vect)
  return(out[round(runif(1, min = 0.5000001, max = length(out) + 0.499999999))])
}
temp <- round(runif(20, min = 1, max = 10)) # a vector to test this out on
cbind(temp, rollapply(data = temp, width = 5, FUN = ModeR, fill = NA, align = "right"))
which returned:
temp
[1,] 5 NA
[2,] 6 NA
[3,] 5 NA
[4,] 5 NA
[5,] 7 1
[6,] 6 1
[7,] 5 1
[8,] 5 1
[9,] 3 2
[10,] 1 3
[11,] 5 3
[12,] 7 3
[13,] 5 3
[14,] 4 3
[15,] 3 3
[16,] 4 2
[17,] 8 2
[18,] 5 2
[19,] 6 3
[20,] 6 3
Compare that with:
> ModeR(temp[1:5])
[1] 5
Levels: 5 6 7
> ModeR(temp[2:6])
[1] 6
Levels: 5 6 7
So it seems like the problem is in how ModeR is being applied in rollapply(). Any ideas?
Thanks!
Rick
Thanks to /u/murgs! His comment pointed me in the right direction (in addition to helping me streamline ModeR() using sample()).
ModeR() as written above returns a factor (as does Mode()). I need it to be a number. I can fix this by updating my code as follows:
Mode <- function(vect){ # take a vector as input
  temp <- as.data.frame(table(vect))
  temp <- arrange(temp, desc(Freq))
  max.f <- temp[1, 2]
  temp <- filter(temp, Freq == max.f) # cut out anything that isn't modal
  return(as.numeric(as.character(temp[, 1]))) # HERE'S THE BIG CHANGE
}
ModeR <- function(vect){
  out <- Mode(vect)
  return(out[sample(1:length(out), 1)]) # HERE'S SOME IMPROVED CODE!
}
Now rollapply() does what I expected it to do! There's still that weird as.character() bit (otherwise it rounds down the number). I'm not sure what's going on there, but the code works so I won't worry about it...
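For the record, the as.character() step is needed because Mode() returns the first column of as.data.frame(table(vect)), which is a factor; calling as.numeric() directly on a factor returns its internal level codes rather than the labels (it is not rounding). A quick demonstration:
f <- factor(c(5, 7, 10))     # what the table()/data.frame step produces
as.numeric(f)                # 1 2 3   -- level codes
as.numeric(as.character(f))  # 5 7 10  -- the actual values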
