How to extract the values from a raster in R - r

I want to use R to extract values from a raster. My raster has values from 0 to 6, and I want to extract the corresponding value for every single pixel, so that I end up with a data table containing those two variables (pixel and value).
Thank you for your help; I hope my explanation is precise enough.

Example data
library(raster)
r <- raster(ncol=5, nrow=5, vals=1:25)
To get all values, you can do
values(r)
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
as.matrix(r)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 2 3 4 5
#[2,] 6 7 8 9 10
#[3,] 11 12 13 14 15
#[4,] 16 17 18 19 20
#[5,] 21 22 23 24 25
Also see ?getValues
You can also use indexing
r[2,2]
#7
r[7:8]
#[1] 7 8
For more complex extractions using points, lines or polygons, see ?extract

x is the raster object you are trying to extract values from; y may be a SpatialPoints, SpatialPolygons, SpatialLines, Extent, or a vector of cell numbers (take a look at ?extract). Your code values_raster <- extract(x = values, df=TRUE) will not work because you are not giving the function any y object/vector.
You could build a vector with all the cell numbers of your raster. Imagine your raster has 200 cells. If you do values_raster <- extract(x = values, y = seq(1,200,1), df=TRUE), you'll get a data frame with the value of each cell.

How about simply doing
as.data.frame(s, xy=TRUE) # s is your raster file
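A minimal sketch that puts the answers above together for the example raster r: one row per pixel, with the cell number, the cell-centre coordinates and the value. The column names cell and value are my own choice, not something the raster package prescribes.

```r
library(raster)

# example raster from the first answer
r <- raster(ncol = 5, nrow = 5, vals = 1:25)

# cell numbers run row-wise from the top-left corner
cells <- seq_len(ncell(r))

# one row per pixel: cell number, x/y of the cell centre, and the value
px <- data.frame(cell = cells, xyFromCell(r, cells), value = values(r))
head(px)
```

as.data.frame(r, xy=TRUE) gives much the same table without the cell column.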

Related

Extract longitudinal pixel values from raster.list, save to data frame

I have a list of rasters of the same location for multiple years. The change in a pixel's value over time represents the time series of that pixel.
For further analysis, I need to extract the values over time for every pixel and store them in a data frame, where row = pixel and column = year.
Dummy data:
library(raster)
# create raster data from scratch
# create empty raster
y1<-raster(ncol = 3, nrow = 3)
values(y1)<-1:9
projection(y1)<-CRS("+init=epsg:4326")
# create and diversify the rasters
y2<-y1+10
y3<-y1+20
y4<-y1+30
# make list of rasters
y.list<-list(y1, y2,y3,y4)
# plot all rasters at once
par(mfrow = c(2,2))
for(i in 1:length(y.list)) {
plot(y.list[[i]])
}
What the data frame should look like:
y1 y2 y3 y4
pixel1 1 11 21 31
pixel2
...
pixel9 9 19 29 39
How can I extract each pixel's values over time and store them in a data frame?
I found a great answer here:
How to extract values from rasterstack with xy coordinates?
There is no need to put the rasters into a list; just create a raster stack.
Then simply use raster::extract to get the time series of each pixel's values.
The whole script:
library(raster)
# create raster data from scratch
# create empty raster
y1<-raster(ncol = 3, nrow = 3)
values(y1)<-1:9
projection(y1)<-CRS("+init=epsg:4326")
# recreate and diversify the rasters
y2<-y1+10
y3<-y1+20
y4<-y1+30
# create raster stack
s<-stack(y1, y2, y3, y4)
# plot rasters
plot(s)
# extract raster values - return a matrix of values in each pixel
# row = pixel, column = layer (year)
mat <- raster::extract( s , 1:ncell(s) )
tadaaaa !!!!
> mat
layer.1 layer.2 layer.3 layer.4
[1,] 1 11 21 31
[2,] 2 12 22 32
[3,] 3 13 23 33
[4,] 4 14 24 34
[5,] 5 15 25 35
[6,] 6 16 26 36
[7,] 7 17 27 37
[8,] 8 18 28 38
[9,] 9 19 29 39
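To get from the matrix above to the layout asked for in the question (rows named pixel1 to pixel9, columns y1 to y4), something like the following should work. It is only a sketch: the layer names and the pixelN row names are assumptions taken from the question's desired output, and the projection line is left out because it does not affect the values.

```r
library(raster)

# same dummy data as in the script above
y1 <- raster(ncol = 3, nrow = 3)
values(y1) <- 1:9
s <- stack(y1, y1 + 10, y1 + 20, y1 + 30)

# name the layers after the years they stand for (assumed names)
names(s) <- c("y1", "y2", "y3", "y4")

# one row per pixel, one column per layer
df <- as.data.frame(s)
rownames(df) <- paste0("pixel", seq_len(nrow(df)))
df
```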

Splitting a variable into equally sized groups

I have a continuous variable called Longitude (it corresponds to geographical longitude) that has 12465 unique values. I need to create a new variable called Longitude1024 that consists of the variable Longitude split into 1024 equally sized groups. I did that using the following function:
data$Longitude1024 <- as.factor( as.numeric( cut(data$Longitude,1024)))
However, the problem is that, when I use this function to create the new variable Longitude1024, this new variable consists of only 651 unique elements rather than 1024. Does anyone know what the problem here is and how could I actually get the new variable with 1024 unique values?
Thanks a lot
Use rank, then scale it down. Here's an example with 10 groups:
x <- rnorm(124655)
g <- floor(rank(x) * 10 / (length(x) + 1))
table(g)
# g
# 0 1 2 3 4 5 6 7 8 9
# 12465 12466 12465 12466 12465 12466 12466 12465 12466 12465
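Here is a sketch of the same rank idea with the question's numbers (12465 unique values, 1024 groups). The vector x is a hypothetical stand-in for data$Longitude. The comparison with cut() shows the likely cause of the original problem: cut() makes intervals of equal width, not of equal size, so some intervals can stay empty and disappear once the result is converted with as.numeric and as.factor.

```r
set.seed(1)
x <- rnorm(12465)   # hypothetical stand-in for data$Longitude
k <- 1024

# equal-width intervals: some of them may contain no values at all
g_width <- as.numeric(cut(x, k))
length(unique(g_width))   # number of occupied intervals, at most k

# rank-based groups, numbered 1..k: every group gets 12 or 13 values
g_rank <- floor(rank(x, ties.method = "first") * k / (length(x) + 1)) + 1
length(unique(g_rank))
range(table(g_rank))
```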
Short answer: try cut2 from the Hmisc package
Long answer
Example: split dat, which is 1000 unique values, into 100 equal groups of 10.
Doesn't work:
# dummy data
set.seed(321)
dat <- rexp(1000)
# all unique values
length(unique(dat))
[1] 1000
cut generates 100 levels
init_res <- cut(dat, 100)
length(unique(levels(init_res)))
[1] 100
But does not split the data into equally sized groups
init_grps <- split(dat, cut(dat, 100))
table(unlist(lapply(init_grps, length)))
0 1 2 3 4 5 6 7 9 10 11 13 15 17 18 19 22 23 24 25 27 37 38 44 47 50 63 71 72 77
42 9 8 4 1 3 1 3 2 1 2 1 1 1 2 1 1 1 2 2 2 1 1 1 1 1 1 2 1 1
Works with Hmisc::cut2
cut2 divides the vector into groups of equal length, as desired
require(Hmisc)
final_grps <- split(dat, cut2(dat, g=100))
table(unlist(lapply(final_grps, length)))
10
100
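If installing Hmisc is not an option, a similar split can probably be obtained in base R by cutting at quantiles. Note that this is a swapped-in technique, not a description of what cut2 does internally, and it assumes all values are unique (ties could produce duplicate breaks).

```r
# same dummy data as above
set.seed(321)
dat <- rexp(1000)

# 101 break points at the 0%, 1%, ..., 100% quantiles
brks <- quantile(dat, probs = seq(0, 1, length.out = 101))

# labels = FALSE returns the group number (1..100) instead of a factor
grp <- cut(dat, breaks = brks, include.lowest = TRUE, labels = FALSE)
table(tabulate(grp, nbins = 100))
```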
If you want, you can store the results in a matrix (one row per group), for example
foobar <- do.call(rbind, final_grps)
head(foobar)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[0.000611,0.00514) 0.004345915 0.002192086 0.004849693 0.002911516 0.003421753 0.003159641 0.004855366 0.0006111574
[0.005137,0.01392) 0.009178133 0.005137309 0.008347482 0.007072484 0.008732725 0.009379002 0.008818794 0.0110489833
[0.013924,0.02004) 0.014283326 0.014356782 0.013923721 0.014290554 0.014895342 0.017992638 0.015608931 0.0173707930
[0.020041,0.03945) 0.023047527 0.020437743 0.026353839 0.036159321 0.024371834 0.026629812 0.020793695 0.0214221779
[0.039450,0.05912) 0.043379064 0.039450453 0.050806316 0.054778805 0.040093806 0.047228050 0.055058519 0.0446634954
[0.059124,0.07362) 0.069671018 0.059124220 0.063242564 0.064505875 0.072344089 0.067196661 0.065575249 0.0634142853
[,9] [,10]
[0.000611,0.00514) 0.002524557 0.003155055
[0.005137,0.01392) 0.008287758 0.011683228
[0.013924,0.02004) 0.018537469 0.014847937
[0.020041,0.03945) 0.026233400 0.020040981
[0.039450,0.05912) 0.041310471 0.058449603
[0.059124,0.07362) 0.063608022 0.066316782
Hope this helps

Extract minima returns

I am trying to apply the block maxima (in my case minima) approach of Extreme Value Theory to financial returns. I have daily returns for 30 financial indices stored in a csv file called 'Returns'. I start by loading the data
Returns<-read.csv("Returns.csv", header=TRUE)
I then extract the minimum returns over consecutive non-overlapping blocks of equal length (i.e., 5 days) for each index I have in my 'Returns.csv' file. For that, I do the following
for (xx in Returns) #Obtain the minima.
{
rows<-length(xx) #This is the number of returns
m<-5 #When m<-5 we obtain weekly minima. Change accordingly (e.g., 20)
k<-rows/m #This is the number of blocks (i.e., number of returns/size of block),
bm<-rep(0,k) #which is also the number of extremes
for(i in 1:k){bm[i]<-min(xx[((i-1)*m+1):(i*m)])}
#Store the minima in a file 'minima.csv'
write.table(bm,file="minima.csv", append=TRUE, row.names=FALSE, col.names=FALSE)
}
The code extracts the minima returns for all indices correctly but when the minima are stored in the file 'minima.csv' they all appear in the same column (appended).
What I want the code to do is to read the financial returns contained in the first column of the file 'Returns.csv', extract the minima returns over consecutive non-overlapping blocks of equal length (i.e., 5 days) and store them in the first column of the file 'minima.csv'. Then do exactly the same for the financial returns contained in the second column of the file 'Returns.csv' and store the minima returns in the second column of the file 'minima.csv', and so on, until I reach column 30.
I think your data looks similar to this:
> m <- matrix(1:40, ncol=4)
> m
[,1] [,2] [,3] [,4]
[1,] 1 11 21 31
[2,] 2 12 22 32
[3,] 3 13 23 33
[4,] 4 14 24 34
[5,] 5 15 25 35
[6,] 6 16 26 36
[7,] 7 17 27 37
[8,] 8 18 28 38
[9,] 9 19 29 39
[10,] 10 20 30 40
Obviously you have more rows and columns and your data is not just the sequence of 1 to 40. To chunk each column with a size of 5 and find the minimum for each column run:
> apply(m, 2, function(x) sapply(split(x, ceiling(seq_along(x)/5)), min))
[,1] [,2] [,3] [,4]
1 1 11 21 31
2 6 16 26 36
Basically, apply splits m by columns and applies the function to each column. The inner function takes a column, splits it into chunks, and returns the minimum of each chunk. Your data is in a data frame, not a matrix, so you need to convert it before you run the command above:
m <- as.matrix(Returns)
To write this to a csv
> mins <- apply(m, 2, function(x) sapply(split(x, ceiling(seq_along(x)/5)), min))
> write.table(mins, file="test.min.csv", sep=',', row.names=F, col.names=F, quote=F)
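The original loop stacks everything in one column because write.table with append=TRUE adds new rows to the file on every pass. A sketch of an alternative that computes all minima first and writes once is shown below. The small Returns data frame and the helper name block_min are hypothetical, and incomplete trailing blocks are dropped, which is my assumption rather than something the question specifies.

```r
# hypothetical stand-in for read.csv("Returns.csv", header = TRUE)
Returns <- data.frame(a = 1:10, b = 20:11)

# minimum of each consecutive, non-overlapping block of m values
block_min <- function(x, m = 5) {
  k <- length(x) %/% m                            # number of complete blocks
  blocks <- matrix(x[seq_len(k * m)], nrow = m)   # one block per column
  apply(blocks, 2, min)
}

# one column of minima per index, same column names as the input
mins <- as.data.frame(lapply(Returns, block_min, m = 5))

# write everything in one go, so each index keeps its own column
out <- tempfile(fileext = ".csv")
write.csv(mins, out, row.names = FALSE)
back <- read.csv(out)
back
```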

Issue with order function [duplicate]

This question already has answers here:
Understanding the order() function
(7 answers)
Closed 9 years ago.
I have this function and it takes a few parameters.
I have this part of the function here:
sort.order <- order(inputs[,input.of.interest])
If I print inputs I get something like:
Status Quo Vaccination
[1,] 10.409146 16.252537
[2,] 5.834875 9.373437
[3,] 5.784903 15.935623
[4,] 12.208484 18.654250
[5,] 9.786787 16.467321
[6,] 6.560276 9.689887
But what is input.of.interest supposed to be?
What does it mean, how is this function used?
Should it be a number, i.e if it's 2, what would it do?
It chooses the column to sort by. If it's 1 it sorts by Status Quo and if it's 2 it sorts by Vaccination.
x <- seq(20, 11, -1)
x
# [1] 20 19 18 17 16 15 14 13 12 11
order(x)
# [1] 10 9 8 7 6 5 4 3 2 1
x[order(x)]
# [1] 11 12 13 14 15 16 17 18 19 20
Hope you see better how it works.
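Applied to the matrix from the question, this might look as follows. The values are copied from the printed inputs; the choice of input.of.interest is just for illustration.

```r
# matrix rebuilt from the values printed in the question
inputs <- cbind(
  "Status Quo" = c(10.409146, 5.834875, 5.784903, 12.208484, 9.786787, 6.560276),
  "Vaccination" = c(16.252537, 9.373437, 15.935623, 18.654250, 16.467321, 9.689887)
)

input.of.interest <- 2   # sort by the second column (Vaccination)

# order() returns the row numbers that would put that column in ascending order
sort.order <- order(inputs[, input.of.interest])
sort.order

# indexing with it reorders whole rows, keeping both columns together
inputs[sort.order, ]
```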

Finding the index of the minimum value which is larger than a threshold in R

This is probably very simple, but I'm missing the correct syntax in order to simplify it.
Given a matrix, find the entry in one column which is the lowest value, greater than some input parameter. Then, return an entry in a different column on that corresponding row. Not very complicated... and I've found something that works but, a more efficient solution would be greatly appreciated.
I found this link:Better way to find a minimum value that fits a condition?
which is great.. but that method of finding the least entry loses the index information required to find a corresponding value in a corresponding row.
Let's say column 2 is the condition column, and column 1 is the one I want to return. Currently I've made this (note that this only works because column two is full of numbers which are less than 1):
matrix[which.max((matrix[,2]>threshhold)/matrix[,2]),1]
Any thoughts? I'm expecting that there is probably some quick and easy function which has this effect... it's just never been introduced to me haha.
rmk's answer shows the basic way to get a lot of info out of your matrix. But if you know which column you're testing for the minimum value (above your threshold), and then want to return a different value in that row, maybe something like
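incol <- df[,4] # select the column to search (df is your matrix)
outcol <- 2 # select the column of the found row you want to return
threshold <- 5
ord <- order(incol) # row numbers sorted by ascending incol
df[ ord[incol[ord] > threshold][1] ,outcol] # first sorted row above the threshold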
You could try the following. Say,
df <- matrix(sample(1:35,35),7,5)
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 18 16 27 19 31
[2,] 24 1 7 12 5
[3,] 28 35 23 4 6
[4,] 33 3 25 26 15
[5,] 14 10 11 21 20
[6,] 9 2 32 17 13
[7,] 30 8 29 22 34
Say your threshold is 5:
apply(df,2,function(x){ x[x<5] <- max(x);which.min(x)})
[1] 6 7 2 2 2
Corresponding to the values:
[1] 9 8 7 12 5
This should give you the index of the smallest entry in each column that is at least the threshold, according to the original column indexing. Use x <= 5 instead of x < 5 if the entry has to be strictly greater than the threshold.
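For the single-column case in the question (search one column, return the entry of another column in the same row), a small helper along these lines may be enough. The function name min_above and its arguments are hypothetical, and it returns NA when nothing exceeds the threshold, which is my own choice.

```r
# matrix from the answer above, entered column by column
df <- matrix(c(18, 24, 28, 33, 14,  9, 30,
               16,  1, 35,  3, 10,  2,  8,
               27,  7, 23, 25, 11, 32, 29,
               19, 12,  4, 26, 21, 17, 22,
               31,  5,  6, 15, 20, 13, 34), nrow = 7, ncol = 5)

# value in out_col on the row where cond_col has its smallest value > threshold
min_above <- function(mat, cond_col, out_col, threshold) {
  idx <- which(mat[, cond_col] > threshold)   # rows that pass the condition
  if (length(idx) == 0) return(NA)            # nothing above the threshold
  mat[idx[which.min(mat[idx, cond_col])], out_col]
}

min_above(df, cond_col = 2, out_col = 1, threshold = 5)
```

Because which() keeps the original row numbers, the index information is not lost when the rows are filtered.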

Resources