merge data vector to shapefile data slot

I am trying to add economic data to a shapefile using merge, with the two-letter ISO code as the ID. The code looks roughly like this:
library(maptools)
library(foreign)
library(sp)
library(lattice)
library(shapefiles)
world.shp<-readShapePoly("world_shapefile.shp")
world.shp@data<-merge(world.shp@data, data.frame(country=iso.code.vector, net=country.data.vector), by.x="ISO2", by.y="country", all.x=TRUE, sort=FALSE)
Unfortunately this ruins the row order of the .shp data even though I set sort=FALSE. A plot afterwards shows that the data no longer matches the polygons as it should. What am I doing wrong?
I got the world map data from thematicmapping.org.
Thanks for your help.

Merge will break the sp object, because merge() does not guarantee the original row order (with sort=FALSE the rows come back in an unspecified order), so the attribute rows no longer line up with the polygons. Here are two ways to merge a data frame into the sp @data data frame.
shape@data = data.frame(shape@data, OtherData[match(shape@data$IDS, OtherData$IDS),])
Here shape is your shapefile object, IDS is the identifier you want to merge on, and OtherData is the data frame that you want to combine with shape. Note that the IDS column can have a different name in each dataset, but the values must match exactly (no fuzzy matching).
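To see why the match() approach keeps the rows aligned, here is a minimal sketch that uses plain data frames in place of the @data slot. The names shape_data and other, and all the values, are made up for illustration.

```r
# Hypothetical stand-ins: `shape_data` plays the role of the @data slot,
# `other` is the table to attach. All names and values are illustrative.
shape_data <- data.frame(ISO2 = c("FR", "DE", "XX", "IT"),
                         NAME = c("France", "Germany", "Unknown", "Italy"),
                         stringsAsFactors = FALSE)
other <- data.frame(country = c("IT", "FR", "DE"),
                    net = c(3.5, 1.2, 2.8),
                    stringsAsFactors = FALSE)

# match() returns, for each row of shape_data, the row of `other` with the
# same code (NA when there is none), so the result keeps shape_data's order.
idx <- match(shape_data$ISO2, other$country)
joined <- data.frame(shape_data, net = other$net[idx])

print(joined)
```

Because the row order and row count of the first data frame never change, the result can be assigned back to the data slot without disturbing the polygon order.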
Alternatively you can use this function.
join.sp.df <- function(x, y, xcol, ycol) {
  # Remember the original row order of the sp object
  x$sort_id <- 1:nrow(as(x, "data.frame"))
  x.dat <- as(x, "data.frame")
  # Merge, then restore the original order
  x.dat2 <- merge(x.dat, y, by.x = xcol, by.y = ycol)
  x.dat2.ord <- x.dat2[order(x.dat2$sort_id), ]
  # Keep only the features that found a match in y
  x2 <- x[x$sort_id %in% x.dat2$sort_id, ]
  x2.dat <- as(x2, "data.frame")
  row.names(x.dat2.ord) <- row.names(x2.dat)
  x2@data <- x.dat2.ord
  return(x2)
}
Here x is the sp Spatial*DataFrame object, y is the data frame to merge with x, xcol is the name of the merge column in the sp object (quoted), and ycol is the name of the merge column in the data frame (quoted).

I found the same problem when using R versions 2.12.x and 2.13.x, but the problem appears to have been resolved in version 2.15.1.

I found a workaround. It is not very elegant and it takes some time to execute, but it works:
world.shp<-readShapePoly("world_shapefile.shp")
net<-rep(NA,length(world.shp@data$NAME))
for(i in 1:length(net))
{
  for(j in 1:length(iso.code.vector))
  {
    if(!is.na(world.shp@data$ISO2[i])){if(world.shp@data$ISO2[i]==iso.code.vector[j]){net[i]=country.data.vector[j]}}
  }
}
world.shp@data<-data.frame(world.shp@data, net)
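The nested loop can probably be replaced by a single match() lookup. The sketch below uses made-up toy vectors in place of the ISO2 column of the shapefile's data slot and the two vectors from the question. It assumes the codes in iso.code.vector are unique: with duplicates, match() takes the first hit whereas the loop keeps the last one.

```r
# Toy stand-ins for the vectors in the question (values are made up).
iso2 <- c("FR", NA, "DE", "IT")          # plays the role of the ISO2 column
iso.code.vector <- c("IT", "FR", "ES")
country.data.vector <- c(10, 20, 30)

# One lookup per polygon: the position of each ISO2 code in iso.code.vector,
# NA where the code is missing or has no counterpart.
net <- country.data.vector[match(iso2, iso.code.vector)]
print(net)
```

The resulting vector has one entry per polygon, in polygon order, so it can be attached as a new column in the same way as in the workaround above.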

Related

How to create a subset of a shp file, with all its properties

I am new to programming in R and to working with .shp files.
I am trying to take a subsample / subset of a very large .shp file. You can download this file from here: https://www.ine.es/ss/Satellite?L=es_ES&c=Page&cid=1259952026632&p=1259952026632&pagename=ProductosYServicios%2FPYSLayout (select the year 2021 and then go ahead).
I have tried several things but none of them work. Converting it to sf does not seem worthwhile either, because that would simply add one more column called geometry with the coordinates listed, and that is not enough for me to use it later with the leaflet package.
I have tried this, but it doesn't work for me:
myspdf = readOGR(getwd(), layer = "SECC_CE_20210101") # This works
PV2 = myspdf[myspdf@data$NCA == 'País Vasco', ] # Doesn't work
PV2 = myspdf[,myspdf@data$NCA == 'País Vasco'] # Doesn't work
What I want is to create a subset of myspdf (with data, polygons, plotOrder, bbox and proj4string), not for all the NCA values (myspdf@data$NCA) but only for the rows where NCA is 'País Vasco'.
In short, I would like to have one subset for each distinct value of the NCA column.
Is that possible? Can someone help me with this? Thank you very much.
I have tried this too, but I get the same result as before: all 18 variables appear and all are empty:
Pais_V = subset(myspdf, NCA == 'País Vasco')
dim(Pais_V)

Here's one approach:
library(rgdal)
library(leaflet)
# Download a zipped shapefile, unzip it into the working directory and read it
dlshape <- function(shploc, shpfile) {
  temp <- tempfile()
  download.file(shploc, temp)
  unzip(temp)
  readOGR(dsn = ".", layer = shpfile)
}
setwd("C:/temp")
x <- dlshape(shploc="https://www2.census.gov/geo/tiger/GENZ2020/shp/cb_2020_us_aitsn_500k.zip", shpfile="cb_2020_us_aitsn_500k")
mycats <- c("00","T2","T3","28")
x2 <- subset(x, x$LSAD %in% mycats) # subset using the vector `mycats`
mypal <- colorFactor("Dark2", domain=x2$LSAD)
leaflet(x2) %>% addPolygons(weight=.2, color=mypal(x2$LSAD))
dlshape function courtesy of @yokota

Here's another option. This uses the package sf.
library(sf)
myspdf <- st_read("./_data/España_Seccionado2021/SECC_CE_20210101.shp",
                  as_tibble = TRUE)
Now you can filter this data in any way that you would filter a data frame. It will still work as spatial data, as well.
Using tidyverse (well, technically dplyr):
library(dplyr)
myspdf %>% filter(NCA == "País Vasco")
This takes it from 36,334 observations to 1714 observations.
The base R method you tried to use with readOGR will work, as well.
myspdf[myspdf$NCA == "País Vasco",]

for loop with mutate only outputs last result

I'm sorry if this is a duplicate, and for the lack of reproducibility; I would have to link you the files.
What I'm trying to do is this:
I have a data frame with coordinates and names, let's say
df <- tribble(
~Species, ~lat, ~lon,
"a",42.92991, 11.875801,
"b",42.92991, 11.875801,
"c",43.91278, 3.513611,
"d",43.60851, 3.871755,
"e",39.24373, 9.120478
)
I also have a folder with TIFF rasters, such as
files <- list.files(path="~/world/", pattern="*.tif$", full.name=TRUE, all.files=TRUE)
Now for each iteration I'd like to:
create a new column on the data frame with the file name
insert in that column the extracted value for the corresponding lat and lon
I've tried using this for loop, and while on paper it looks just fine, I don't understand why funvar ends up holding only the last result. It's as if it overwrites the result instead of appending to it.
If I use a similar loop with mutate and simpler objects, it appends them, so I'm not sure what the problem could be.
for(i in files){
fraster<- raster(i)
fname<-gsub(".*//|[.].*", "", i)
funvar<-dplyr::mutate(fundata, !!fname:= raster::extract(fraster, coordinates(data.frame(lat,lon))))
}
Thanks!

The way I solved it is a bit of a hack, but it works. I explicitly assign the new column to the data frame, like this.
I'm still not sure why mutate doesn't do that by itself.
for(i in files){
fraster<- raster(i)
fname<-gsub(".*//|[.].*", "", i)
funvar<-dplyr::mutate(fundata, !!fname:= raster::extract(fraster, coordinates(data.frame(lat,lon))))
fundata[fname] <- funvar[[fname]]
}
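A likely explanation: mutate returns a new data frame and leaves its input untouched, so a loop that always starts from the unchanged fundata overwrites funvar on every pass. The sketch below shows the same behaviour with base R's transform() as a stand-in for dplyr::mutate(); the toy data and column names are made up.

```r
# Toy data; `fundata` and the column names are illustrative only.
fundata <- data.frame(id = 1:3)
cols <- c("a", "b")

# Pattern from the question: every pass starts again from the untouched
# `fundata`, so `funvar` only ever holds the column added last.
for (nm in cols) {
  funvar <- transform(fundata, tmp = id * 2)
  names(funvar)[names(funvar) == "tmp"] <- nm
}
print(names(funvar))

# Accumulating version: feed the result of each pass into the next one.
result <- fundata
for (nm in cols) {
  result <- transform(result, tmp = id * 2)
  names(result)[names(result) == "tmp"] <- nm
}
print(names(result))
```

With dplyr, the equivalent of the second loop would be to assign the result of mutate back to fundata on each iteration, so that the next pass builds on it.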

From the info you provide I cannot tell whether this will work, but normally you would make a RasterStack and avoid the loop.
library(raster)
# NOTE the order: lon, lat
xy <- cbind(lon, lat)
s <- stack(files)
e <- raster::extract(s, xy)
If that is not possible, you can do something like this:
fundata <- data.frame(xy)
for (f in files){
  fraster <- raster(f)
  # strip the directory and the .tif extension to get the column name
  fname <- gsub("\\.tif$", "", basename(f))
  fundata[[fname]] <- raster::extract(fraster, xy)
}

How to Loop over Rasters to convert them to data.frames

I have some rasters that I would like to convert to data frames. I can do it manually one by one, but that is inefficient. When I try to write a loop (using a list or a vector of names), the code doesn't work and R reports: "Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘structure("RasterLayer", package = "raster")’ to a data.frame"
I have tried to do it using the function assign(), but that doesn't work either. When using a vector of names, I can only get R to make a data frame with one single observation containing the name from the vector.
When I do it one by one, R does what I want. My code for one raster is just:
#"a" is the name of the raster
r_1 <- as.data.frame(a, xy=TRUE, na.rm=TRUE, centroids=TRUE)
I have tried several things to make a loop, but all have failed. First, I tried creating a vector and looping with the function assign():
# "a" and "b" are the names of my rasters
o2 <- c("a","b")
for(i in 1:length(o2)){
nam <- substr(o2[i],1,nchar(o2))
assign(nam,as.data.frame(o2[i], xy=TRUE, na.rm=TRUE, centroids=TRUE))
}
But this only creates a dataframe named a1 with one observation "a1" and one variable.
I have tried to make a list too
o4 <- list(a,b)
for(i in 1:length(o4)){
nam <- substr(o4[i],1,nchar(ola4))
r_i <- as.data.frame(o4[i], xy=TRUE, na.rm=TRUE, centroids=TRUE)
}
The error this time says: " Error in as.data.frame.default(x[[i]], optional = TRUE) : cannot coerce class ‘structure("RasterLayer", package = "raster")’ to a data.frame"
I expect to get a data frame with three columns and as many rows as there are cells in my raster. The columns should be the latitude and longitude of the centroid of each cell, plus a column with the value of each cell. I don't see any mistake in my code; maybe someone can help me.
I created the rasters myself from different shapefiles. I have more than 40 rasters, all with the same characteristics: width: 8806, height: 10389, origin: -77.6699, 4.94778, pixel size: 0.001041666, CRS: EPSG:4326 - WGS 84 - Geographic.

When asking a question like this, always include some example data (normally not your own data). Here I use three (identical) raster files:
f <- system.file("external/test.grd", package="raster")
ff <- c(f,f,f)
Now use lists to accomplish what you want.
r <- lapply(ff, raster)
x <- lapply(r, function(i) as.data.frame(i, xy=TRUE, na.rm=TRUE))
Never use assign.

Instead of a loop you can use lapply:
s <- list(raster1, raster2, raster3)
lapply(s, as.data.frame)

R - SpatialPointsDataFrame from a list of SpatialPoints

How can I create a SpatialPointsDataFrame from a list of SpatialPoints?
The following code creates a list containing SpatialPoints:
SP1 <- SpatialPoints(cbind(1,5))
SP2 <- SpatialPoints(cbind(2,4))
SP3 <- SpatialPoints(cbind(3,3))
SP.l<-list(SP1,SP2, SP3)
What I'm looking for is a way to extract the SpatialPoints from the list and create a SpatialPointsDataFrame out of them.
With the following code I can get a single SpatialPoints object out of the list:
coords_3 = SP.l[[3]]@coords
data_3 = as.data.frame(SP.l[[3]])
SPDF_3 <- SpatialPointsDataFrame(coords=coords_3, data=as.data.frame(data_3))
However, I'd like to get them all at once.
Maybe something like:
SP <- SpatialPoints(lapply(1:length(lidR.clip.SP.l), function(i) {
...
EDIT:
What was missing was:
SP.l <- do.call("rbind", SP.l)
That's what I was actually looking for.
Thanks!

As hrbrmstr pointed out, there is no minimal working example; you need to provide one. For now, I use sample data from the GISTools package to demonstrate one way. There is a data set called newhaven in the package, and breach is the data. I made a copy of it and created foo, whose class is SpatialPoints. I created two list elements from it.
Using your code, I looped through each list element and converted the SpatialPoints to a SpatialPointsDataFrame. I hope you can figure out how to apply the following code to your case.
library(GISTools)
data(newhaven)
foo <- breach
mylist <- list(foo1 = breach[1:10, ],
foo2 = breach[11:20, ])
lapply(1:length(mylist), function(x){
  SpatialPointsDataFrame(coords = mylist[[x]]@coords,
                         data = as.data.frame(mylist[[x]]))
})
If you want to bind all SPDFs, then you can try the following.
do.call(rbind, lapply(1:length(mylist), function(x){
  SpatialPointsDataFrame(coords = mylist[[x]]@coords,
                         data = as.data.frame(mylist[[x]]))
})
)

Create several data.frames via a for loop and name them accordingly

I want to apply a for-loop to every element of a list (station code of air quality stations) and create a single data.frame for each station with specific data.
My current code looks like this:
for (i in Stations))
{i_PM <- data.frame(PM2.5$DateTime,PM2.5$i)
colnames(i_PM)[1] <- "DateTime"
i_AOT <- subset(MOD2011, MOD2011$Station_ID==i)
i <- merge(i_PM, i_AOT, by="DateTime")}
Stations consists of 28 elements. The result should be a data.frame for every station with the columns DateTime, PM2.5 and several columns from MOD2011.
I just can't get it running as it is supposed to. I'm sure it's my fault, but I couldn't find a specific answer on the internet.
Can you show me my mistake?

Try assign:
for (i in Stations) {
  # PM2.5[[i]] selects the column named by i; PM2.5$i would look for a column literally called "i"
  dat <- data.frame(PM2.5$DateTime, PM2.5[[i]])
  colnames(dat) <- c("DateTime", "PM2.5")
  dat2 <- subset(MOD2011, MOD2011$Station_ID==i)
  assign(paste(i, "_PM", sep=""), dat)
  assign(paste(i, "_AOT", sep=""), dat2)
  assign(i, merge(dat, dat2, by="DateTime"))
}
Note, however, that this is bad coding practice. You should reconsider your algorithm. For instance, use a list instead.
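A list-based version might look like the sketch below. The toy PM2.5 and MOD2011 data frames, the station codes and the AOT column are invented for illustration; only the overall structure (one column per station in PM2.5, a Station_ID column in MOD2011) is taken from the question.

```r
# Made-up stand-ins for the objects in the question.
Stations <- c("S1", "S2")
PM2.5 <- data.frame(DateTime = c(1, 2, 3), S1 = c(5, 6, 7), S2 = c(8, 9, 10))
MOD2011 <- data.frame(DateTime = c(1, 2, 1, 3),
                      Station_ID = c("S1", "S1", "S2", "S2"),
                      AOT = c(0.1, 0.2, 0.3, 0.4),
                      stringsAsFactors = FALSE)

# Build one merged data frame per station and keep them in a named list.
merged <- lapply(Stations, function(s) {
  pm <- data.frame(DateTime = PM2.5$DateTime, PM2.5 = PM2.5[[s]])
  aot <- MOD2011[MOD2011$Station_ID == s, ]
  merge(pm, aot, by = "DateTime")
})
names(merged) <- Stations

print(merged[["S1"]])
```

Each station's result is then available as merged[["S1"]], merged[["S2"]] and so on, and the whole collection can be processed further with lapply instead of looking up variables by name.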
