How to change values in raster inside loop - r

I have aerial images of forests for which I need to compute a fragmentation index. I know how to do it for any individual image, but I want to use a loop because there is a bunch of them.
# required libraries
library(raster)
library(SDMTools)
The desired index value is element number 11 of the PatchStat output. But before I extract this value I need to replace all values with 1 (the original range is 1-100).
# Individual raster can be done like this:
x <- raster(forest_cov[1])
x[x > 0] = 1
PatchStat(x)[11]
# I have tried this loop but it is not working
rast<-numeric(41)
for (i in 1:41) {
  rast[i] <- PatchStat(raster(forest_cov[i][forest_cov[i] > 0 == 1]))[11]
}
The problem is that I do not know how to replace all values in the raster with 1 inside the loop. What am I doing wrong?

To work out why your code isn't returning the expected result, you should probably run chunks of your code from the inside out. For example, does forest_cov[i] > 0 == 1 return what you expect it to return for raster 1? (I suspect not, since according to your comments, forest_cov is a character vector and so the logical comparison of element i to 0 is not sensible.) But, if so, does forest_cov[i][forest_cov[i] > 0 == 1] return what you expect, and so on.
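As a small illustration of checking the pieces from the inside out, the sketch below uses hypothetical file names and a plain matrix as a stand-in for raster values (both are assumptions, not the asker's data). It shows that an element of forest_cov is only a file name, and that the replacement step has to be applied to the values themselves:

```r
# Hypothetical file names standing in for the asker's forest_cov vector
forest_cov <- c("forest_01.tif", "forest_02.tif")

class(forest_cov[1])       # a character string, i.e. a file name, not pixel values
is.numeric(forest_cov[1])  # FALSE, so comparing it with 0 is not meaningful

# Stand-in for the raster values: a small numeric matrix (assumption)
x <- matrix(c(0, 5, 100, 0, 37, 1), nrow = 2)

# Logical indexing with assignment replaces every positive value with 1
x[x > 0] <- 1
x
```

The same indexing idiom is what the single-raster example in the question uses; the loop fails because it is applied to the file name rather than to the raster object.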
Here's how I would approach the problem.
Prepare some fake data:
# Write out three fake rasters to temp files
writeRaster(stack(replicate(3, raster(matrix(runif(100), nc=10)))),
            {f <- tempfile()}, bylayer=TRUE, format='ascii')
# Filenames of these fake rasters
rasters <- paste0(f, '_', 1:3, '.asc')
Calculate the frac.dim.index (i.e. 11th element of PatchStat result) of each raster:
sapply(rasters, function(x) {
  require(SDMTools)
  PatchStat(raster(x) >= 0.1)[11]
})
Alternatively, if all the rasters referred to in your character vector have consistent extent and dimensions, then you can perform the operation on a stack as follows:
s <- stack(rasters) >= 0.1
sapply(seq_len(nlayers(s)), function(i) PatchStat(s[[i]])[11])

Related

How do you create a function that row reduces a matrix in R?

So far I've tried the following code but it didn't work in R-studio; it just hangs there.
Am I doing something wrong? This is my first real R code project so I'd love suggestions!
new.rref <- function(M,fractions=FALSE)
{
#M is a matrix.
#Require numeric matricies.
if ((!is.matrix(M)) || (!is.numeric(M)))
stop("Sorry pal! Data not a numeric matrix.")
#Specify and differentiate between rows and columns.
r=nrow(M)
c=ncol(M)
#Now establish a continuous loop (*needed help on this one)
#According to the help documents I've read, this has to do with a
#computerized version of the Gaussian Reducing Algorithm
#While 1<r and 1<c, must set first column entries in which
#1:r < 1 equal to zero. This while loop is used to loop the
#algorithm until a specific condition is met -- in this case,
#until elements in the first column to which 1:r < 1
#are set to zero.
while((1<=r) & (1<=c))
new <- M[,1]
new[1:r < y.position] <- 0
# Now here's the fun part :)
#We need to find the maximum leading coefficient that lies
#at or below the current row.
new1 <- which.max(abs(new))
#We will assign these values to the vector "LC"
LC <- col[which]
#Now we need to allow for row exchange!
#Basically tells R that M[c(A,B),] = M[c(B,A),].
if (which > 1) { M[c(1,which),]<-A[c(which,1),] }
#Now we have to allow for the pivot, "sweep", and restoration
#of current row. I totally didn't know how to do this so I
#used and changed some code from different documentations.
#PIVOT (friends reference)
M[1,]<-M[1,]/LC
new2 <-M[1,]
#CLEAN
M <- M - outer(M[,x.position],new2)
#RESTORE
A[1,]<-new2
#Last, but certantly not least, we're going to round the matrix
#off to a certain value. I might have did this wrong.
round(M)
return(M)
print(M)
}
Edit: I added the first line, for some reason it got deleted.
Edit 2: Say you have a matrix M=matrix(c(2,3,4,7), nrow=2, ncol=2, byrow=TRUE); new.rref(M) needs to produce the reduced row echelon form of matrix M. I already did the math; new.rref(M) should be equal to matrix(c(1,0,0,1), nrow=2, ncol=2, byrow=T
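For comparison, here is a minimal sketch of Gauss-Jordan elimination with partial pivoting in base R. It is not the asker's function: the name rref_sketch and the tol argument are hypothetical, and the loop runs over columns with a for loop instead of the while loop attempted above.

```r
# Hypothetical helper: reduced row echelon form via Gauss-Jordan elimination
rref_sketch <- function(M, tol = 1e-10) {
  stopifnot(is.matrix(M), is.numeric(M))
  pivot_row <- 1
  for (j in seq_len(ncol(M))) {
    if (pivot_row > nrow(M)) break
    # Partial pivoting: pick the largest absolute entry at or below pivot_row
    candidates <- pivot_row:nrow(M)
    k <- candidates[which.max(abs(M[candidates, j]))]
    if (abs(M[k, j]) < tol) next  # no usable pivot in this column
    # Row exchange
    if (k != pivot_row) M[c(pivot_row, k), ] <- M[c(k, pivot_row), ]
    # Scale the pivot row so that the pivot becomes 1
    M[pivot_row, ] <- M[pivot_row, ] / M[pivot_row, j]
    # Clear column j in every other row
    for (i in setdiff(seq_len(nrow(M)), pivot_row)) {
      M[i, ] <- M[i, ] - M[i, j] * M[pivot_row, ]
    }
    pivot_row <- pivot_row + 1
  }
  M[abs(M) < tol] <- 0  # remove tiny floating-point residue
  M
}

M <- matrix(c(2, 3, 4, 7), nrow = 2, ncol = 2, byrow = TRUE)
rref_sketch(M)
```

For the 2 x 2 example from the question this yields the identity matrix, which matches the result the asker worked out by hand.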

R studio doesn't find objects in my function

I’m new to programming and I’m currently writing a function to go through hundreds of csv files in the working directory.
The files have tons of NA values in them.
The function (which I call corr) has two parameters: the directory and a threshold value (a numeric vector of length 1 indicating the number of complete cases).
The purpose of the function is to take the complete cases for two columns, sulfate and nitrate (the second and third columns in the spreadsheet), and calculate the correlation between them if the number of complete cases is greater than the threshold parameter.
The function should return a vector with the correlation if it met the threshold requirement (the default threshold value is 0).
When I run the code I get one of the following two results:
1. A + sign in the console
OR
2. The objects I created in the function can't be found.
Any help would be much appreciated. Thank you in advance!
corr <- function(directory, threshold=0){
  filelist2<- data.frame(list.files(path=directory,
                                    pattern=".csv", full.names=TRUE))
  corvector <- numeric()
  for(i in 1:length(filelist2)){
    data <-data.frame(read.csv(filelist2[i]))
    removedNA<-complete.cases(data)
    newdata<-data[removedNA,2:3]
    if(nrow(removedNA) > threshold){
      corvector<-c(corvector, cor(data$sulfate, data$nitrate ))
    }
  }
  corvector
}
I don't think your nrow(removedNA) does what you think it does. To replicate the example I use the mtcars dataset.
data <- mtcars # create dataset
data[2:4, 2] <- NA # create some missings in column 2
data[15:17, 3] <- NA # create some missing in column 3
removedNA <- complete.cases(data)
table(removedNA) # 6 missings indeed
nrow(removedNA) # NULL: removedNA is a logical vector, not a data.frame, so nrow() returns NULL
newdata <- data[removedNA, 2:3] # this works though
nrow(newdata) # and this shows the rows in 'newdata'
#---- therefore instead of nrow(removedNA) try
if(nrow(data)-nrow(newdata) < threshold) {
...
}
NB: I changed the > to < in the line with the threshold. Which one you want depends on what the threshold means: if it is an absolute minimum number of complete rows, you can simply use nrow(newdata) > threshold; if it should reflect the difference in the number of rows between the original and the 'new' data, use the comparison shown above.
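Putting this together, a possible corrected version of corr() could look like the sketch below. It assumes the threshold is a minimum number of complete rows and that the columns are named sulfate and nitrate as in the question; the temporary directory and the made-up CSV files exist only to make the example runnable. It also keeps the file names as a plain character vector and computes the correlation on the complete cases instead of the raw data.

```r
# Sketch of a corrected corr(); column names follow the question
corr <- function(directory, threshold = 0) {
  # Keep the file names as a character vector (no data.frame wrapper)
  files <- list.files(path = directory, pattern = "\\.csv$", full.names = TRUE)
  corvector <- numeric()
  for (f in files) {
    data <- read.csv(f)
    # Rows where both columns of interest are present (assumption: only
    # these two columns matter for completeness)
    complete <- data[complete.cases(data[, c("sulfate", "nitrate")]), ]
    if (nrow(complete) > threshold) {
      corvector <- c(corvector, cor(complete$sulfate, complete$nitrate))
    }
  }
  corvector
}

# Made-up data: three files with 3, 6 and 9 missing sulfate values
tmp <- tempfile()
dir.create(tmp)
set.seed(1)
for (k in 1:3) {
  d <- data.frame(Date = 1:20, sulfate = rnorm(20), nitrate = rnorm(20), ID = k)
  d$sulfate[sample(20, k * 3)] <- NA
  write.csv(d, file.path(tmp, paste0(k, ".csv")), row.names = FALSE)
}

corr(tmp)                  # one correlation per file
corr(tmp, threshold = 12)  # only files with more than 12 complete rows
```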

Sampling with list of rasters and locations using function in R

I am sweating over this piece of code, which I previously received help to build here. In short, I have a list of three rasters that I am randomly sampling numberv times. The output is therefore a list of four lists, each with three elements (one per raster). After I obtain the random point locations, I take the raster value at each location.
The problem I want to solve is that I would like to take the second set of sample locations, i.e. sample.set[[1]][2], and obtain the raster values from rasters[1]. Then I would like to take sample.set[[1]][3] and obtain the raster values from rasters[2]. Then sample.set[[2]][2] with values from rasters[1], sample.set[[2]][3] with values from rasters[2], etc. The result would be a list of 4 lists, each with 2 elements containing the sample xy values (locations) and the value of the previous raster.
Help will be much appreciated.
y <- matrix(1:150,50,3)
mv <- c(1,2,3)
rep = 20
valuematrix <- vector("list",ncol(y))
for (i in 1:ncol(y)) {
newmatrix <- replicate(rep,y[,i])
valuematrix[[i]] <- newmatrix
}
library(sp)
library(raster)
rasters <- setNames(lapply(valuematrix, function(x) raster(x)),
paste0('raster',1:length(mv)))
# Create a loop that will sample the rasters
library(dismo)
numberv = c(10,12,14,16) # sample number vector
# Function to sample using a given number (returns list of three)
sample.number <- function(x) {
  rps <- lapply(rasters, function(y) randomPoints(raster(y),n=x))
  setNames(rps,paste0('sample',1:length(mv)))
}
# Apply sample.number() to your numberv list
sample.set <- lapply(numberv,sample.number)
# Function to extract values from a given sample
sample.extract <- function(x) {
  lapply(1:length(x),function(y) data.frame(x[[y]],
                                            extract(rasters[[y]],x[[y]])))
}
# Apply sample.extract() to the set of samples (returns list of four lists)
sample.values <- lapply(sample.set,sample.extract)
Now I would like to use the sample locations from the second element of a sample set to sample the 1st raster in the list rasters. I tried this, but with no success:
sample.extract.prev <- function(x) {
  lapply(1:length(x),function(y) data.frame(x[[y]],
                                            extract(rasters[[y]],x+1[[y]])))
}
sample.values.prev <- lapply(sample.set,sample.extract.prev)
Managed to solve this (big high five to myself ;)
Unfortunately I only managed to do it with a loop; it would be great to see an example that uses a function.
samplevaluesnext <- vector("list",length(sample.set))
## Look up values
for (j in 1:length(sample.set)) {
  for (i in 1:(length(rasters)-1)) {
    samplevaluesnext[[j]][[i]] <- data.frame(sample.set[[j]][[i+1]],
                                             extract(rasters[[i]],
                                                     as.data.frame(sample.set[[j]][i+1])))
  }
}
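One way the same shifted pairing could be written with lapply instead of nested for loops is sketched below. To keep it runnable without the raster and dismo packages, it uses plain matrices as stand-ins for the rasters and a hypothetical lookup() helper in place of extract(); sources, sample_set, make_xy and extract_prev are likewise illustrative names, not part of the original code.

```r
# Stand-ins for the three rasters: plain matrices (assumption)
sources <- list(matrix(1:9, 3), matrix(11:19, 3), matrix(21:29, 3))

# Hypothetical stand-in for extract(): look up cells by (row, col) pairs
lookup <- function(m, xy) m[cbind(xy[, "row"], xy[, "col"])]

# Stand-in for sample.set: two sample sets, each with one location
# matrix per source
make_xy <- function(n) cbind(row = sample(3, n, TRUE), col = sample(3, n, TRUE))
set.seed(42)
sample_set <- lapply(c(4, 5), function(n) lapply(1:3, function(k) make_xy(n)))

# For one sample set: pair locations i + 1 with source i
extract_prev <- function(samples) {
  lapply(seq_len(length(sources) - 1), function(i) {
    xy <- samples[[i + 1]]
    data.frame(xy, value = lookup(sources[[i]], xy))
  })
}

# Apply to every sample set (a list of lists, each with two data frames)
values_prev <- lapply(sample_set, extract_prev)
str(values_prev, max.level = 2)
```

With the real objects, lookup() would be replaced by extract() and sources by rasters; the index shift inside extract_prev() is the part that mirrors the loop above.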

Why can't I assign a value to a vector as the result of a conditional statement in R

I have 24 .csv files, each containing hundreds of thousands of data points.
My intention is for this code to:
1. loop through each of the files in the directory
2. take a sample of 1000 random points from a single column
3. check whether each sample data point is below a particular level; if TRUE, change result[i] to 1, if FALSE, to 0
Step 3 is where I'm stuck: the result vector doesn't change at all. Any thoughts?
rm(list=ls())
years<-c(1990:2013)
#####################################
S=1000
level<-.075
result<-(1:S)
inBounds<-function(data){
  for(i in 1:S){
    result[i]<-(data[i] < level)
  }
  return(mean(result))
}
#####################################
#Get sample arithmetic mean readings from 1990-2013
n=1000
temp<-data.frame()
arithMean<-data.frame()
Samp<-data.frame()
CI<-data.frame()
#Get data file names
files <- list.files(path="~/Proj",pattern="*.csv", full.names=T, recursive=FALSE)
for(i in 1:23){
  temp<-read.csv(files[i],header=TRUE,sep=",")
  arithMean<-temp$Arithmetic.Mean
  Samp<-sample(arithMean,n,replace=TRUE,prob=NULL)
  CI[1,i]<-inBounds(Samp)
}
The problem is with scope. The result vector declared after level is in a higher scope than the result vector in the function. They are not the same object.
If you want the result vector from the function, return it. If you want both the vector and the average, return a list:
return(list(result = result, mean = mean(result)))
The result of your entire operation is a single vector of length 23, so you can do this with sapply:
CI <- sapply(1:23, function(i) {
  temp <- read.csv(files[i], header=T, sep=",")
  return(mean(sample(temp$Arithmetic.Mean, n, replace=T, prob=NULL) < level))
})
The reason result was not changing in your function is that it was declared outside the function but you were editing it inside the function. You could move the result<-(1:S) inside the function to get the expected behavior.
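To make the scoping point concrete, here is a minimal sketch with made-up numbers (samp and the five-element result are assumptions for illustration). Assigning to result[i] inside the function works on a local copy, so the global vector is left untouched, while the returned mean is the same as the vectorized comparison:

```r
level <- 0.075
result <- 1:5   # global vector, as in the question (shortened to 5 elements)

in_bounds <- function(data) {
  for (i in seq_along(data)) {
    # This assignment creates and modifies a copy of result that is
    # local to the function; the global result is not changed
    result[i] <- data[i] < level
  }
  mean(result)
}

samp <- c(0.01, 0.05, 0.10, 0.20, 0.07)  # made-up sample values
in_bounds(samp)     # proportion of values below level
result              # still 1:5 in the global environment
mean(samp < level)  # vectorized equivalent, no loop or result vector needed
```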

Avoiding nested loops in R

I have a set of sequences with 2 variables for each value of a 3rd variable (device). Now I want to break the sequence for each device into sets of 300. dsl is a data frame that contains d, the device id, and s, the number of sequences of length 300.
First, I label (column Sid) all the sequences with rep(1,300) followed by rep(2,300) and so on up to rep(s,300). Whatever remains unlabelled, i.e. still has the initial label (=0), needs to be ignored. The actual labelling happens with the seqid vector, though.
I had to do this because I want to stack each set of 300 data points and then transpose it. This forms one row of my predata data frame. For each predata data frame I run k-means to generate 5 clusters, which I store in finaldata.
Essentially, for every device I will have 5 clusters that I can then pull by referencing the row number in finaldata (mapped to the device id).
#subset processed data by device
for (ds in 1:387){
  d <- dsl[ds,1]
  s <- dsl[ds,3]
  temp.data <- subset(data,data$Device==d)
  temp.data$Sid <- 0
  temp.data[1:(s*300),4] <- rep(1:300,s)
  temp.data <- subset(temp.data,temp.data$Sid!="0")
  seqid <- NA
  for (j in 1:s){ seqid[(300*(j-1)+1):(300*j)] <- j }
  temp.data$Sid <- seqid
  predata <- as.data.frame(matrix(numeric(0),s,600))
  for(k in 1:s){
    temp.data2 <- subset(temp.data[,c(1,2)], temp.data$Sid==k)
    predata[k,] <- t(stack(temp.data2)[,1])
  }
  ob <- kmeans(predata,5,iter.max=10,algorithm="Hartigan-Wong")
  finaldata <- rbind(finaldata,(unique(fitted(ob,method="centers"))))
}
Being a noob to R, I ended up with 3 nested loops (the code did work when the outermost loop ran for a single value). This has been running for 5 h so far. I need a faster way to go about this.
Any help will be appreciated.
Thanks
Ok, I am going to suggest a radical simplification of your code within the loop. However, without sample data it is hard to verify that my assumptions are right. So please make sure that my predata in fact equals yours.
First the code:
for (ds in 1:387){
  d <- dsl[ds,1]
  s <- dsl[ds,3]
  temp.data <- subset(data,data$Device==d)
  temp.data <- temp.data[1:(s*300),]
  predata <- cbind(matrix(temp.data[,1], byrow=T, ncol=300), matrix(temp.data[,2], byrow=T, ncol=300))
  ob <- kmeans(predata,5,iter.max=10,algorithm="Hartigan-Wong")
  finaldata <- rbind(finaldata,(unique(fitted(ob,method="centers"))))
}
What I understand you are doing: take the first 300*s elements from your subset(data, data$Device == d). This can easily be done using the command
temp.data <- temp.data[1:(s*300),]
Afterwards, you collect a matrix that has the first row c(temp.data[1:300, 1], temp.data[1:300, 2]), and so on for all further rows. I do this using the matrix command as above.
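The reshaping step can be checked on a toy example. The sketch below uses a chunk length of 3 instead of 300 and a made-up two-column data frame (the names chunk, a and b are assumptions for illustration), so the resulting rows are easy to read:

```r
chunk <- 3   # stands in for the sequence length of 300
s <- 2       # number of complete sequences for this device

# Made-up data: 7 rows, so the last row is a leftover that must be dropped
temp.data <- data.frame(a = 1:7, b = 101:107)

# Keep only the rows that form complete sequences
temp.data <- temp.data[1:(s * chunk), ]

# One row per sequence: first the chunk of column 1, then the chunk of column 2
predata <- cbind(matrix(temp.data[, 1], byrow = TRUE, ncol = chunk),
                 matrix(temp.data[, 2], byrow = TRUE, ncol = chunk))
predata
```

Each row of predata is c(temp.data[rows, 1], temp.data[rows, 2]) for one block of consecutive rows, which is the layout described above.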
I assume that your outer loop could be transformed into a call to tapply or something similar, but for that we would need more context.
