I have a folder with about 100 point shapefiles that are locations obtained while scat sampling of an ungulate species. I would like to merge all these point shapefiles into one shapefile in R. All the point data were in .gpx format initially which I then changed to shapefiles.
I am fairly new to R, so I am unsure how to do this, and I could not find code that merges or combines more than a few shapefiles. Any suggestions would be much appreciated. Thanks!
Building on @M_Merciless's answer: for long lists you can use
all_schools <- do.call(rbind, shapefile_list)
Or, alternatively, the very fast:
all_schools <- sf::st_as_sf(data.table::rbindlist(shapefile_list))
library(sf)
# list all the shapefiles within the folder location
file_list <- list.files("shapefile/folder/location", pattern = "\\.shp$", full.names = TRUE)
# read all shapefiles, store as a list
shapefile_list <- lapply(file_list, read_sf)
# append the separate shapefiles; in my example there are 4 shapefiles. This can probably be improved by using a for loop or apply function for a longer list.
all_schools <- rbind(shapefile_list[[1]], shapefile_list[[2]], shapefile_list[[3]], shapefile_list[[4]])
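For a folder of roughly 100 files, the same idea can be sketched end to end. This is only a minimal sketch: the temporary folder, the two toy point layers and the src_file column are assumptions made up so the example runs on its own. In practice shp_dir would point at the real folder and the loop that creates the toy files would be dropped.

```r
library(sf)

# Toy data (assumption): two tiny point layers written to a temporary folder
shp_dir <- file.path(tempdir(), "scat_points")
dir.create(shp_dir, showWarnings = FALSE)
for (i in 1:2) {
  pts <- st_as_sf(data.frame(id = i, x = 10 + i, y = 50 + i),
                  coords = c("x", "y"), crs = 4326)
  write_sf(pts, file.path(shp_dir, paste0("track_", i, ".shp")))
}

# List every .shp file in the folder (the pattern is a regular expression)
file_list <- list.files(shp_dir, pattern = "\\.shp$", full.names = TRUE)

# Read each file and record which file every point came from
shapefile_list <- lapply(file_list, function(f) {
  layer <- read_sf(f)
  layer$src_file <- basename(f)
  layer
})

# rbind() on sf objects needs identical columns and the same CRS in every layer
merged <- do.call(rbind, shapefile_list)
```

Writing the result with write_sf() to a different folder avoids picking up the merged file on the next run. The column name is kept short because shapefile field names are limited to 10 characters.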
Adding a solution that I think is "tidier"
library(fs)
library(sf)
library(tidyverse)
# Getting all file paths
shapefiles <- 'my/data/folder' |>
  dir_ls(recurse = TRUE) |>
  str_subset('\\.shp$')
# Loading all files
sfdf <- shapefiles |>
  map(st_read) |>
  bind_rows()
Admittedly, more lines of code but personally I think the code is much easier to read and comprehend this way.
I am new to programming in R and with .shp files.
I am trying to take a subsample / subset of a very large .shp file. You can download the file from here: https://www.ine.es/ss/Satellite?L=es_ES&c=Page&cid=1259952026632&p=1259952026632&pagename=ProductosYServicios%2FPYSLayout (select the year 2021 and then go ahead).
I have tried several things, but none of them work. Converting it to sf does not seem worthwhile either, because that would simply add one more column called geometry with the coordinates listed, and that is not enough for me to use it later with the leaflet package.
I have tried this here but it doesn't work for me:
myspdf = readOGR(getwd(), layer = "SECC_CE_20210101") # It works
PV2 = myspdf[myspdf@data$NCA == 'País Vasco', ] # Doesn't work
PV2 = myspdf[,myspdf@data$NCA == 'País Vasco'] # Doesn't work
What I intend is to create a sample of myspdf (with data, polygons, plotorder, bbox and proj4string), but not for all the NCA values (myspdf@data$NCA): I only want the rows in which data$NCA is 'País Vasco'.
In short, I would like to have a sample for each value of the NCA column.
Is that possible? Can someone help me with this? Thank you very much.
I have also tried this, but I get the same result as before: all 18 variables appear and all are empty:
Pais_V = subset(myspdf, NCA == 'País Vasco')
dim(Pais_V)
Here's one approach:
library(rgdal)
dlshape=function(shploc, shpfile) {
temp=tempfile()
download.file(shploc, temp)
unzip(temp)
shp.data <- sapply(".", function(f) {
fp <- file.path(temp, f)
return(readOGR(dsn=".",shpfile))
})
}
setwd("C:/temp")
x = dlshape(shploc="https://www2.census.gov/geo/tiger/GENZ2020/shp/cb_2020_us_aitsn_500k.zip", "cb_2020_us_aitsn_500k")
x<-x$. # extract the shapefile
mycats<-c("00","T2","T3","28")
x2<-subset(x, x$LSAD %in% mycats) # subset using the list `mycats`
library(leaflet) # must be loaded before colorFactor() is called
mypal=colorFactor("Dark2",domain=x2$LSAD)
leaflet(x2) %>% addPolygons(weight=.2, color=mypal(x2$LSAD))
dlshape function courtesy of @yokota
Here's another option. This uses the package sf.
library(sf)
myspdf <- st_read("./_data/España_Seccionado2021/SECC_CE_20210101.shp",
                  as_tibble = TRUE)
Now you can filter this data any way that you filter a data frame. It will still work as spatial data, as well.
Using tidyverse (well, technically dplyr):
myspdf %>% filter(NCA == "País Vasco")
This takes it from 36,334 observations to 1714 observations.
The base R method you tried to use with readOGR will work, as well.
myspdf[myspdf$NCA == "País Vasco",]
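The question also asks for one subset per value of the NCA column. A minimal sketch of that, assuming the data is already an sf object: base R's split() subsets the rows for each value of NCA and should keep every piece as an sf object. The toy points and their coordinates below are made up so the example runs without the INE download.

```r
library(sf)

# Toy stand-in (assumption) for the census sections: three points, two regions
toy <- st_as_sf(
  data.frame(NCA = c("País Vasco", "País Vasco", "Galicia"),
             x = c(-2.9, -2.0, -8.5),
             y = c(43.3, 43.3, 42.9)),
  coords = c("x", "y"), crs = 4326
)

# One sf object per distinct NCA value, returned as a named list
by_nca <- split(toy, toy$NCA)

# Pull out a single region by name
pais_vasco <- by_nca[["País Vasco"]]
```

Each element of by_nca can then be passed to leaflet() on its own.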
I have a large polyline shapefile that needs to be dissolved. However, the examples online only relate to polygons, not polylines (for example, gUnaryUnion). I am reading in my shapefile using st_read from the sf package. Any help would be appreciated.
If I understand your question, one option is to use dplyr.
library(sf)
library(dplyr)
# get polyline road data
temp_shapefile <- tempfile()
download.file("https://www2.census.gov/geo/tiger/TIGER2017//ROADS/tl_2017_06075_roads.zip", temp_shapefile)
temp_dir <- tempdir()
unzip(temp_shapefile, exdir = temp_dir)
sf_roads <- read_sf(file.path(temp_dir,'tl_2017_06075_roads.shp'))
Use the RTTYP field to reduce the polyline from ~4000 unique segments to 6 segments.
sf_roads_summarized <- sf_roads %>%
group_by(RTTYP) %>%
summarize()
I managed to achieve this by using st_combine.
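To make the difference between the two approaches concrete, here is a small sketch on made-up line segments (the coordinates and RTTYP values are assumptions, and no CRS is set to keep it simple). summarize() on a grouped sf object unions the geometries within each group, while st_combine() gathers everything into a single multi-part geometry without grouping.

```r
library(sf)
library(dplyr)

# Toy polylines (assumption): two segments of type "M" and one of type "S"
lines <- st_sf(
  RTTYP = c("M", "M", "S"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(1, 0))),
    st_linestring(rbind(c(1, 0), c(2, 0))),
    st_linestring(rbind(c(0, 1), c(1, 1)))
  )
)

# Dissolve by attribute: one row (and one geometry) per RTTYP value
dissolved <- lines %>%
  group_by(RTTYP) %>%
  summarize()

# Dissolve everything: a geometry set of length one holding all segments
combined <- st_combine(lines)
```

If the goal is one geometry per road type, the grouped summarize() is probably the closer fit; st_combine() suits the case where a single feature for the whole layer is wanted.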
I am new to R and need to create one dataframe from 80 .xlsx files that mostly share the same columns and are all in the same folder. I want to bind all these files efficiently in a manner that would work if I added or removed files from the folder later. I want to do this without converting the files to .csv, unless someone can show me how to do that efficiently for large numbers of files within R itself.
I've previously been reading files individually using the read_excel function from the readxl package. After, I would use rbind to bind them. This was fine for 10 files, but not 80! I've experimented with many solutions offered online however none of these seem to work, largely because they are using functions other than read_excel or formats other than .xlsx. I haven't kept track of many of my failed attempts, so cannot offer code other than one alternate method I tried to adapt to read_excel from the read_csv function.
#Method 1
library(readxl)
library(purrr)
library(dplyr)
library(tidyverse)
file.list <- list.files(pattern='*.xlsx')
alldata <- file.list %>%
map(read_excel) %>%
reduce(rbind)
#Output
New names:
* `` -> ...2
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
Any code on how to do this would be greatly appreciated. Sorry if anything is wrong about this post, it is my first one.
UPDATE:
Using the changes suggested by the answers, I'm now using the code:
file.list <- list.files(pattern='*.xlsx')
alldata <- file.list %>%
map_dfr(read_excel) %>%
reduce(bind_rows)
The output is now as follows:
New names:
* `` -> ...2
Error: Column `10.Alert.alone` can't be converted from numeric to character
This happens regardless of which type of bind() function I use in the reduce() slot. If anyone can help with this, please let me know!
You're on the right track here. But you need to use map_dfr instead of plain-vanilla map. map_dfr outputs a data frame (or actually tibble) for each iteration, and combines them via bind_rows.
This should work:
library(readxl)
library(tidyverse)
file.list <- list.files(pattern='*.xlsx')
alldata <- file.list %>%
map_dfr(~read_excel(.x))
Note that this assumes your files all have consistent column names and data types. If they don't, you may have to do some cleaning. (One trick I've used in complex cases is to add a %>% mutate_all(as.character) to the read_excel command inside the map function. That will turn everything into characters, and then you can convert the data types from there.)
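The "convert everything to character first" trick can be sketched without any Excel files. The two tibbles below are made-up stand-ins for what read_excel() might return when the same column is guessed as numeric in one workbook and as text in another, which is the situation behind the "can't be converted from numeric to character" error. The file names and values are assumptions.

```r
library(dplyr)
library(purrr)

# Stand-ins (assumption) for the data frames read from two workbooks
sheets <- list(
  a.xlsx = tibble(id = 1:2, `10.Alert.alone` = c(0.5, 1.2)),
  b.xlsx = tibble(id = 3L, `10.Alert.alone` = "n/a")
)

# Make every column character so the types always agree, then bind once;
# .id records which list element (file) each row came from
alldata <- sheets %>%
  map(~ mutate(.x, across(everything(), as.character))) %>%
  bind_rows(.id = "source_file")
```

With real files, passing col_types = "text" to read_excel() should give the same all-character result at read time, since a single col_types value is recycled across all columns. The columns can then be converted back to numeric or date one by one.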
this should get you there/close...
library(data.table)
library(readxl)
#create files list
file.list <- list.files( pattern = ".*\\.xlsx$", full.names = TRUE )
#read files to list of data.frames
l <- lapply( file.list, readxl::read_excel )
#bind l together to one larger data.table, by columnname, fill missing with NA
dt <- data.table::rbindlist( l, use.names = TRUE, fill = TRUE )
Try using map_dfr.
alldata <- file.list %>%
map_dfr(read_excel)
I have more than 1000 shape files in a directory, and I want to select only 10 of them whose names are already known to me as follows:
15TVN44102267_Polygons.shp, 15TVN44102275_Polygons.shp
15TVN44102282_Polygons.shp, 15TVN44102290_Polygons.shp
15TVN44102297_Polygons.shp, 15TVN44102305_Polygons.shp
15TVN44102312_Polygons.shp, 15TVN44102320_Polygons.shp
15TVN44102327_Polygons.shp, 15TVN44102335_Polygons.shp
First I want to read only these shape files using the list.files command, and then merge them into one big file. I tried the following command, but it failed. I will appreciate any assistance from the community.
setwd('D/LiDAR/CHM_tree_objects')
files <- list.files(pattern="15TVN44102267_Polygons|
15TVN44102275_Polygons| 15TVN44102282_Polygons|
15TVN44102290_Polygons| 15TVN44102297_Polygons|
15TVN44102305_Polygons| 15TVN44102312_Polygons|
15TVN44102320_Polygons| 15TVN44102327_Polygons|
15TVN44102335_Polygons| 15TVN44102342_Polygons|
15TVN44102350_Polygons| 15TVN44102357_Polygons",
recursive = TRUE, full.names = TRUE)
Here's a slightly different approach. If you already know the location of the files and their file names, you don't need to use list.files:
library(sf)
baseDir <- '/temp/r/'
filenames <- c('Denisonia-maculata.shp', 'Denisonia-devisi.shp')
filepaths <- paste(baseDir, filenames, sep='')
# Read each shapefile and return a list of sf objects
listOfShp <- lapply(filepaths, st_read)
# Look to make sure they're all in the same CRS
unique(sapply(listOfShp, st_crs))
# Combine the list of sf objects into a single object
combinedShp <- do.call(what = rbind, args = listOfShp)
combinedShp will then be an sf object that has all the features in your individual shapefiles. You can then write that out to a single file in your chosen format with st_write.
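If you would still rather go through list.files, one likely reason the pattern in the question fails is that the string contains literal line breaks and spaces, which then have to match in the file names. A hedged sketch of building the pattern from a vector of IDs instead (only three of the IDs are shown, and the candidate paths are made up for the demonstration):

```r
# Tile IDs of interest (a shortened list taken from the question)
tile_ids <- c("15TVN44102267", "15TVN44102275", "15TVN44102282")

# Build one regular expression: (id1|id2|id3)_Polygons.shp at the end of the name
pattern <- paste0("(", paste(tile_ids, collapse = "|"), ")_Polygons\\.shp$")

# Made-up paths to show which names the pattern accepts
candidates <- c("a/15TVN44102267_Polygons.shp",
                "a/15TVN44102267_Polygons.dbf",
                "b/15TVN99999999_Polygons.shp",
                "15TVN44102282_Polygons.shp")
matches <- grepl(pattern, candidates)
# matches is TRUE FALSE FALSE TRUE
```

The same pattern can then be passed to list.files(pattern = pattern, recursive = TRUE, full.names = TRUE), and the resulting paths read with lapply and st_read as above.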
I want to convert two .shp files into one database that would allow me to draw the maps together.
Also, is there a way to convert .shp files into .csv files? I want to be able to personalize and add some data, which is easier for me in a .csv format. What I have in mind is to overlay yield data and precipitation data on the maps.
Here are the shapefiles for Morocco, and Western Sahara.
Code to plot the two files:
# This is code for mapping of CGE_Morocco results
# Loading administrative coordinates for Morocco maps
library(sp)
library(maptools)
library(mapdata)
# Loading shape files
Mor <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
Sah <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
# Ploting the maps (raw)
png("Morocco.png")
Morocco <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
plot(Morocco)
dev.off()
png("WesternSahara.png")
WesternSahara <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
plot(WesternSahara)
dev.off()
After looking into suggestions from @AriBFriedman and @PaulHiemstra and subsequently figuring out how to merge .shp files, I have managed to produce the following map using the following code and data (for .shp data, cf. links above)
code:
# Merging Mor and Sah .shp files into one .shp file
MoroccoData <- rbind(Mor@data,Sah@data) # First, 'stack' the attribute list rows using rbind()
MoroccoPolys <- c(Mor@polygons,Sah@polygons) # Next, combine the two polygon lists into a single list using c()
summary(MoroccoData)
summary(MoroccoPolys)
offset <- length(MoroccoPolys) # Next, generate a new polygon ID for the new SpatialPolygonDataFrame object
browser()
for (i in 1: offset)
{
sNew = as.character(i)
MoroccoPolys[[i]]@ID = sNew
}
ID <- c(as.character(1:length(MoroccoPolys))) # Create an identical ID field and append it to the merged Data component
MoroccoDataWithID <- cbind(ID,MoroccoData)
MoroccoPolysSP <- SpatialPolygons(MoroccoPolys,proj4string=CRS(proj4string(Sah))) # Promote the merged list to a SpatialPolygons data object
Morocco <- SpatialPolygonsDataFrame(MoroccoPolysSP,data = MoroccoDataWithID,match.ID = FALSE) # Combine the merged Data and Polygon components into a new SpatialPolygonsDataFrame.
Morocco@data$id <- rownames(Morocco@data)
Morocco.fort <- fortify(Morocco, region='id')
Morocco.fort <- Morocco.fort[order(Morocco.fort$order), ]
MoroccoMap <- ggplot(data=Morocco.fort, aes(long, lat, group=group)) +
geom_polygon(colour='black',fill='white') +
theme_bw()
Results:
New Question:
1- How to eliminate the boundary data that cuts through the map in half?
2- How to combine different regions within a .shp file?
Thank you all.
P.S.: the community at stackoverflow.com is wonderful and very helpful, especially toward beginners like me :) Just thought of emphasizing it.
Once you have loaded your shapefiles into Spatial{Lines/Polygons}DataFrames (classes from the sp-package), you can use the fortify generic function to transform them to flat data.frame format. The specific functions for the fortify generic are included in the ggplot2 package, so you'll need to load that first. A code example:
library(ggplot2)
polygon_dataframe = fortify(polygon_spdf)
where polygon_spdf is a SpatialPolygonsDataFrame. A similar approach works for SpatialLinesDataFrame objects.
The difference between my solution and that of @AriBFriedman is that mine includes the x and y coordinates of the polygons/lines, in addition to the data associated with those polygons/lines. I really like visualising my spatial data with the ggplot2 package.
Once you have your data in a normal data.frame you can simply use write.csv to generate a csv file on disk.
I think you mean you want the associated data.frame from each?
If so, it can be accessed with the @ slot access operator. The slot is called data:
write.csv( WesternSahara@data, file="/home/wherever/myWesternSahara.csv")
Then when you read it back in with read.csv, you can try assigning:
myEdits <- read.csv("/home/wherever/myWesternSahara_modified.csv")
WesternSahara@data <- myEdits
You may need to do some massaging of row names and so forth to get it to accept the new data.frame as valid. I'd probably try to merge the existing data.frame with a csv you read into R, rather than making edits destructively.
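A non-destructive version of that idea might look like the sketch below. It uses plain data frames as stand-ins for the @data slot and the edited csv; the column names ID_1, NAME_1 and yield and all values are assumptions. The point is that match() looks up the new values by key and leaves the row order untouched, which matters because the rows must stay aligned with the polygons (merge() may reorder them).

```r
# Stand-in (assumption) for WesternSahara@data: one row per polygon, in polygon order
attrs <- data.frame(ID_1 = c(10, 11, 12),
                    NAME_1 = c("A", "B", "C"))

# Stand-in (assumption) for the csv edited outside R: different order, one region missing
edits <- data.frame(ID_1 = c(12, 10),
                    yield = c(3.1, 2.4))

# Look up each polygon's yield by key; regions absent from the csv get NA
attrs$yield <- edits$yield[match(attrs$ID_1, edits$ID_1)]
```

The extended data frame keeps the original number and order of rows, so assigning it back to the data slot should not run into the row-name problems mentioned above.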