I have a number of shapefiles. Some of these contain separated collections of counties, such as the one below. For the data visualized, my goal is to split the shapefile into three separate files: one containing the two counties on the left, one containing the single unit above, and the last one containing the remaining units.
Does anyone know a command which would help to do this?
Thank you in advance!
Here is an example of how you can go about it if you know which counties go into which group (i.e. not based on polygon characteristics).
library("raster")
library("rgdal")
swe <- getData('GADM', country='SWE', level=1)
group1 <- c("Gotland", "Halland","Kalmar")
swe1 <- swe[swe$NAME_1 %in% group1,]
plot(swe)
plot(swe1, add=TRUE, col="red")
writeOGR(swe1, dsn = "swe1", layer = "swe1",
driver = "ESRI Shapefile", overwrite_layer = TRUE)
And so forth for the other groups.
For the large group it would of course be tedious to type or even copy-paste all the names. Instead you can do this:
group2 <- c("Orebro", "Blekinge","Dalarna")
swe2 <- swe[swe$NAME_1 %in% group2,]
'%ni%' <- Negate('%in%')
swe3 <- swe[swe$NAME_1 %ni% c(group1,group2),]
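To avoid repeating the subset-and-write step by hand, the group definitions could be kept in a named list and looped over. This is only a sketch in base R: `name_1` is a hypothetical stand-in for `swe$NAME_1`, and the two extra county names are there just to give the third group something to hold.

```r
# Hypothetical stand-in for swe$NAME_1 (only a handful of names)
name_1 <- c("Gotland", "Halland", "Kalmar", "Orebro",
            "Blekinge", "Dalarna", "Skane", "Uppsala")

group1 <- c("Gotland", "Halland", "Kalmar")
group2 <- c("Orebro", "Blekinge", "Dalarna")

# Everything not named in group1 or group2 falls into the third group
groups <- list(swe1 = group1,
               swe2 = group2,
               swe3 = setdiff(name_1, c(group1, group2)))

# Row indices of each group; with the real object these would subset swe
idx <- lapply(groups, function(g) which(name_1 %in% g))

# With the real data, each group could then be written out, e.g.:
# for (nm in names(idx)) {
#   writeOGR(swe[idx[[nm]], ], dsn = nm, layer = nm,
#            driver = "ESRI Shapefile", overwrite_layer = TRUE)
# }
```

The `setdiff()` call does the same job as the negated `%in%` above, just on the names rather than on the rows.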
I would try the dbscan package; it will cluster the counties the way you want. Use a low minPts, 1 or 2, and experiment with a few eps values. Then collect each cluster and dump it to a separate shapefile.
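A minimal sketch of that idea, assuming the clustering is done on county centroids. The coordinates here are made up; with real data something like `coordinates(swe)` could supply them. Clustering centroids is only a proxy for "these polygons are near each other", so the eps value will need tuning.

```r
library(dbscan)

# Made-up centroid coordinates: three loose clumps of counties
xy <- cbind(x = c(0, 1, 0.5, 10, 11, 20),
            y = c(0, 0, 1.0, 10, 10,  0))

# minPts = 1 means every point is a core point, so clusters are simply
# groups of points chained together by distances of at most eps
cl <- dbscan(xy, eps = 2, minPts = 1)

# Row indices per cluster; each element would become one shapefile
members <- split(seq_len(nrow(xy)), cl$cluster)
```

Each element of `members` could then be used to subset the spatial object and passed to `writeOGR()` as in the other answer.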
Related
I'm still new to R and don't know how to create a loop to make my workflow more efficient.
I have a Digital Elevation Model (raster Barrow_5m.tif), a shapefile for lakes and a shapefile for buffers, each with 10 IDs in its attribute table.
In the script below I created a new raster file for all values of the lake and the buffer shape file with the data from the DEM raster. This works fine.
library(raster)
library(sf)
setwd("...")
Barrow_5m <- raster("Barrow_5m.tif")
Barrow_DTLB <- st_read("Barrow_DTLB.shp")
Barrow_DTLB_Buffer <- st_read("Barrow_DTLB_BufferOUT.shp")
# crop the DEM to each layer's extent, then keep only the cells inside the polygons
Barrow_lake <- crop(Barrow_5m, extent(Barrow_DTLB))
raster_lake <- rasterize(Barrow_DTLB, Barrow_lake, mask = TRUE)
Barrow_buffer <- crop(Barrow_5m, extent(Barrow_DTLB_Buffer))
raster_buffer <- rasterize(Barrow_DTLB_Buffer, Barrow_buffer, mask = TRUE)
writeRaster(raster_lake, "raster_lake.tif")
writeRaster(raster_buffer, "raster_buffer.tif")
But now I want a raster file for every ID of the lake and the buffer shapefile separately, so 2x10 files.
I thought it would be best to write a loop for this, but my skills are not yet good enough.
Other questions haven't brought a solution so far, although I tried to work from them.
Alternatively, I could take the final tif from the script above and split it into one file per ID.
I want to write a loop rather than handle each ID by hand, because afterwards I am going to do the same with an even bigger shapefile with more values.
I have found a solution now, by extracting the data by ID.
It creates a large list with 11 elements holding all values for each ID, which is sufficient for my further work. You can also directly compute the mean, max, min, etc. of each element (so of each ID).
k <- Barrow_DTLB$ID #k= number of rows
LakesA <- extract(raster_lakeA, Barrow_DTLB[k, ])
LakesA_mean <- extract(raster_lakeA, Barrow_DTLB[k, ], fun=mean)
Maybe this solution is also helpful to others who have already viewed the question.
I think this should work:
for (i in unique(raster_lake)){
r <- raster_lake
r[!(values(r) == i)] <- NA
r <- trim(r)
writeRaster(r, paste0("raster_lake_", i, ".tif"))
}
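Note that the loop above iterates over the cell values of the raster, so it only splits by ID if the raster actually stores IDs. If it holds elevations (as a masked DEM would), another option might be to loop over the polygons instead and crop and mask the DEM once per feature. A rough sketch with made-up data follows; `dem` and `lakes` are toy stand-ins, and an sf object read with `st_read()` may first need `as(x, "Spatial")`.

```r
library(raster)  # also attaches sp

# Toy DEM: 10 x 10 cells of size 1
dem <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(dem) <- 1:ncell(dem)

# Helper building a 3 x 3 square polygon (clockwise ring, not a hole)
sq <- function(x0, y0, id) {
  ring <- cbind(c(x0, x0, x0 + 3, x0 + 3, x0),
                c(y0, y0 + 3, y0 + 3, y0, y0))
  Polygons(list(Polygon(ring, hole = FALSE)), ID = id)
}

# Two toy "lakes" with an ID column
lakes <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(sq(1, 1, "1"), sq(6, 6, "2"))),
  data = data.frame(ID = c(1, 2), row.names = c("1", "2")))

out <- list()
for (i in seq_len(nrow(lakes))) {
  poly <- lakes[i, ]
  # crop to the feature's extent, then blank out cells outside the polygon
  r <- mask(crop(dem, extent(poly)), poly)
  out[[as.character(poly$ID)]] <- r
  # writeRaster(r, paste0("raster_lake_", poly$ID, ".tif"), overwrite = TRUE)
}
```

The same loop could be run a second time over the buffer polygons to get the other ten files.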
I'm programming a script for the calculation of cover around points in R.
I have two inputs: an IMG raster file, and a .csv with all the points.
I've used this script:
library(raster)
library(rgdal)
#load in raster and locality data
map <- raster('map.IMG')
sites <- read.csv('points.csv', header=TRUE)
#convert lat/lon to appropriate projection
coordinates(sites) <- c("X", "Y")
proj4string(sites) <- CRS("+init=epsg:27700")
#extract values to points
Landcover<-extract (map, sites, buffer=2000)
extraction <- lapply(Landcover, function(serial) prop.table(table(serial)))
# Write .csv file
lapply(extraction, function(x) write.table( data.frame(x), 'test2.csv' , append= T, sep=',' ))
I get a .csv file in my folder, but the data isn't organised in the way I would like it to be.
There are three columns in the CSV file: one with 'x', one with 'Freq' (which I think is the code for every class in my image), and one with the proportion of cover, between 0 and 1. See the image included.
I want a header row with the serial and the classes, and below it one row per serial with its coverage.
Also, the points aren't named, so I can't see which is which. In points.csv I have, for example, a 'serial' code for every point, which I would like to use for that.
Can somebody steer me in the right direction?
I hope I have been clear with my questions. Thanks in advance!
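One possible way to get that layout, sketched in base R with made-up data: tabulate every point against the same set of classes, so each point yields a row of equal length, then bind the rows to the serial codes. `Landcover` here is a toy stand-in for the list returned by `extract(map, sites, buffer=2000)`, and the serial codes are hypothetical.

```r
# Toy stand-in for the extract() result: one vector of class codes per point
Landcover <- list(c(1, 1, 2, 3), c(2, 2, 2, 5), c(1, 5, 5, 5))
serial <- c("A01", "A02", "A03")  # hypothetical, in the same order as the points

# Use one common set of classes so every point gets the same columns
classes <- sort(unique(unlist(Landcover)))
props <- t(sapply(Landcover, function(v) {
  as.numeric(prop.table(table(factor(v, levels = classes))))
}))
colnames(props) <- paste0("class_", classes)

# One row per point: serial first, then the proportion of each class
result <- data.frame(serial = serial, props)

# write.csv(result, "cover_by_point.csv", row.names = FALSE)
```

Classes that do not occur around a point show up as 0 rather than being dropped, which keeps the columns aligned.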
I am attempting to count the number of points within each LSOA area within London. I have attempted to use the over function, although the output does not give a count of the number of listings per LSOA.
The code I have written so far is as follows:
ldnLSOA <- readOGR(".", "LSOA_2011_London_gen_MHW")
LondonListings <- read.csv('Londonlistings.csv')
proj4string(ldnLSOA) <- proj4string(LondonListings)
plot(ldnLSOA)
plot(LondonListings, add =T)
LSOAcounts <- over(LondonListings, ldnLSOA)
This produces a table with no additional data than the original ldnLSOA shapefile.
I was wondering if someone knew how I would be able to get a table in the format:
LSOAname | LSOAcode | Count
or that sort of framework.
Example data:
LondonListings:
longitude | latitude
-0.204406 51.52060
-0.034617 51.45037
-0.221920 51.46449
-0.126562 51.47158
-0.188879 51.57068
-0.096917 51.49281
Shapefile:
https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london
I deleted my nonspecific answer and wrote another one using your data (except for the points, but it should not be hard to swap in your own, right?)
Let me know if it worked!
#I'm not sure which of these libraries are used, since I always have all of them loaded here
library(rgeos)
library(rgdal)
library(sp)
#Load the shapefile
ldnLSOA <- readOGR(".", "LSOA_2011_London_gen_MHW")
plot(ldnLSOA)
#It's always good to take a look at the data associated with your map
ldn_data<-as.data.frame(ldnLSOA@data)
#Create some random point in this shapefile
ldn_points<-spsample(ldnLSOA,n=1000, type="random")
plot(ldnLSOA)
plot(ldn_points, pch=21, cex=0.5, col="red", add=TRUE)
#create an empty df with as many rows as polygons in the shapefile
df<-as.data.frame(matrix(ncol=3, nrow=length(ldnLSOA@data$LSOA11NM)))
colnames(df)<- c("LSOA_name","LSOA_code", "pt_Count")
df$LSOA_name<-ldn_data$LSOA11NM
df$LSOA_code<-ldn_data$LSOA11CD
# Over = at the spatial locations of object x,
# retrieves the indexes or attributes from spatial object y
pt.poly <- over(ldn_points,ldnLSOA)
# Now let's count
pt.count<-as.data.frame(table(pt.poly$LSOA11CD))
#The counts come in alphabetical order of LSOA code, so match them to the codes in the data frame
pt.count_ord<-as.data.frame(pt.count[match(df$LSOA_code,pt.count$Var1),])
#Fill 3rd col with counts
df[,3]<-pt.count_ord$Freq
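The counting step on its own can be sketched in base R. Tabulating against a factor whose levels are all LSOA codes means areas with no points get a 0 instead of going missing. The data frames below are toy stand-ins for the attribute table and for the result of `over()`; the codes and names are made up.

```r
# Toy stand-in for the polygon attribute table
ldn_data <- data.frame(LSOA11CD = c("E01", "E02", "E03"),
                       LSOA11NM = c("Area A", "Area B", "Area C"),
                       stringsAsFactors = FALSE)

# Toy stand-in for over(points, polygons): one row per point,
# NA where a point falls outside every polygon
pt.poly <- data.frame(LSOA11CD = c("E01", "E03", "E03", NA, "E01", "E03"),
                      stringsAsFactors = FALSE)

# Fixing the levels keeps polygon order and gives 0 for empty areas
counts <- table(factor(pt.poly$LSOA11CD, levels = ldn_data$LSOA11CD))

df <- data.frame(LSOA_name = ldn_data$LSOA11NM,
                 LSOA_code = ldn_data$LSOA11CD,
                 pt_Count  = as.integer(counts))
```

Because the factor levels follow the attribute table, no separate reordering with `match()` is needed in this variant.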
I have read so many threads and articles and I keep getting errors. I am trying to make a choropleth? map of the world using data I have from the global terrorism database. I want to color countries on a factor of nkills or just the number of attacks in that country.. I don't care at this point. Because there are so many countries with data, it is unreasonable to make any plots to show this data.
Help is strongly appreciated and if I did not ask this correctly I sincerely apologize, I am learning the rules of this website as I go.
my code (so far..)
library(maps)
library(ggplot2)
map("world")
world<- map_data("world")
gtd<- data.frame(gtd)
names(gtd)<- tolower(names(gtd))
gtd$country_txt<- tolower(rownames(gtd))
demo<- merge(world, gtd, sort=FALSE, by="country_txt")
In the gtd data frame, the name of the countries column is "country_txt", so I thought I would use that, but I get: Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column
If that were to work, I would plot as I have seen on a few websites..
I have honestly been working on this for so long and I have read so many codes/other similar questions/websites/r handbooks etc.. I will accept that I am incompetent when it comes to R gladly for some help.
Something like this? This is a solution using rgdal and ggplot. I long ago gave up on using base R for this type of thing.
library(rgdal) # for readOGR(...)
library(RColorBrewer) # for brewer.pal(...)
library(ggplot2)
setwd(" < directory with all files >")
gtd <- read.csv("globalterrorismdb_1213dist.csv")
gtd.recent <- gtd[gtd$iyear>2009,]
gtd.recent <- aggregate(nkill~country_txt,gtd.recent,sum)
world <- readOGR(dsn=".",
layer="world_country_admin_boundary_shapefile_with_fips_codes")
countries <- world@data
countries <- cbind(id=rownames(countries),countries)
countries <- merge(countries,gtd.recent,
by.x="CNTRY_NAME", by.y="country_txt", all.x=T)
map.df <- fortify(world)
map.df <- merge(map.df,countries, by="id")
ggplot(map.df, aes(x=long,y=lat,group=group)) +
geom_polygon(aes(fill=nkill))+
geom_path(colour="grey50")+
scale_fill_gradientn(name="Deaths",
colours=rev(brewer.pal(9,"Spectral")),
na.value="white")+
coord_fixed()+labs(x="",y="")
There are several versions of the Global Terrorism Database. I used the full dataset available here, and then subsetted for year > 2009. So this map shows total deaths due to terrorism, by country, from 2010-01-01 to 2013-01-01 (the last data available from this source). The files are available as MS Excel download, which I converted to csv for import into R.
The world map is available as a shapefile from the GeoCommons website.
The tricky part of making choropleth maps is associating your data with the correct polygons (countries). This is generally a four-step process:
1. Find a field in the shapefile attributes table that maps (no pun intended) to a corresponding field in your data. In this case, it appears that the field "CNTRY_NAME" in the shapefile maps to the field "country_txt" in the gtd database.
2. Create an association between polygon IDs (stored in the row names of the attribute table) and the CNTRY_NAME field.
3. Merge the result with your data using CNTRY_NAME and country_txt.
4. Merge the result of that with the data frame created using fortify(map); this associates polygons with deaths (nkill).
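The same four steps can be tried out on toy data in base R, without any shapefile. In this sketch the country names are invented, and `map.df` is a hand-made stand-in for what `fortify()` would return (one row per polygon vertex, with an `id` and an `order` column).

```r
# Steps 1 and 2: attribute table, with polygon IDs taken from the row names
countries <- data.frame(CNTRY_NAME = c("Aland", "Borduria", "Syldavia"),
                        row.names = c("0", "1", "2"),
                        stringsAsFactors = FALSE)
countries <- cbind(id = rownames(countries), countries)

# The data to be mapped (made-up totals; "Aland" has no record)
gtd.recent <- data.frame(country_txt = c("Borduria", "Syldavia"),
                         nkill = c(12, 3),
                         stringsAsFactors = FALSE)

# Step 3: join on the name fields, keeping countries without data
countries <- merge(countries, gtd.recent,
                   by.x = "CNTRY_NAME", by.y = "country_txt", all.x = TRUE)

# Step 4: stand-in for fortify(world), three vertices per polygon
map.df <- data.frame(id    = rep(c("0", "1", "2"), each = 3),
                     long  = c(0, 1, 0, 2, 3, 2, 4, 5, 4),
                     lat   = c(0, 0, 1, 0, 0, 1, 0, 0, 1),
                     order = 1:9,
                     stringsAsFactors = FALSE)
map.df <- merge(map.df, countries, by = "id")

# merge() may reorder rows; restore the vertex order before plotting
map.df <- map.df[order(map.df$order), ]
```

Countries without a match end up with `NA` in `nkill`, which is what the `na.value` argument of the fill scale then colours.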
Building on the nice work by @jlhoward. You could instead use rworldmap, which already has a world map in R and has functions to aid joining data to the map. The default map is deliberately low resolution to create a 'cleaner' look. The map can be customised (see rworldmap documentation) but here is a start:
library(rworldmap)
# 3 lines from @jlhoward
gtd <- read.csv("globalterrorismdb_1213dist.csv")
gtd.recent <- gtd[gtd$iyear>2009,]
gtd.recent <- aggregate(nkill~country_txt,gtd.recent,sum)
#join data to a map
gtdMap <- joinCountryData2Map( gtd.recent,
nameJoinColumn="country_txt",
joinCode="NAME" )
mapDevice('x11') #create a world shaped window
#plot the map
mapCountryData( gtdMap,
nameColumnToPlot='nkill',
catMethod='fixedWidth',
numCats=100 )
Following a comment from @hk47, you can also add the points to the map sized by the number of casualties.
deaths <- subset(x=gtd, nkill >0)
mapBubbles(deaths,
nameX='longitude',
nameY='latitude',
nameZSize='nkill',
nameZColour='black',
fill=FALSE,
addLegend=FALSE,
add=TRUE)
I want to convert two .shp files into one database that would allow me to draw the maps together.
Also, is there a way to convert .shp files into .csv files? I want to be able to personalize and add some data, which is easier for me in .csv format. What I have in mind is to overlay yield data and precipitation data on the maps.
Here are the shapefiles for Morocco, and Western Sahara.
Code to plot the two files:
# This is code for mapping of CGE_Morocco results
# Loading administrative coordinates for Morocco maps
library(sp)
library(maptools)
library(mapdata)
# Loading shape files
Mor <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
Sah <- readShapeSpatial("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
# Plotting the maps (raw)
png("Morocco.png")
Morocco <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/MAR_adm1.shp")
plot(Morocco)
dev.off()
png("WesternSahara.png")
WesternSahara <- readShapePoly("F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/Country-CGE/ESH_adm1.shp")
plot(WesternSahara)
dev.off()
After looking into suggestions from @AriBFriedman and @PaulHiemstra and subsequently figuring out how to merge .shp files, I have managed to produce the following map using the following code and data (for .shp data, cf. links above).
code:
# Merging Mor and Sah .shp files into one .shp file
MoroccoData <- rbind(Mor@data,Sah@data) # First, 'stack' the attribute list rows using rbind()
MoroccoPolys <- c(Mor@polygons,Sah@polygons) # Next, combine the two polygon lists into a single list using c()
summary(MoroccoData)
summary(MoroccoPolys)
offset <- length(MoroccoPolys) # Next, generate a new polygon ID for the new SpatialPolygonDataFrame object
for (i in 1: offset)
{
sNew = as.character(i)
MoroccoPolys[[i]]@ID = sNew
}
ID <- c(as.character(1:length(MoroccoPolys))) # Create an identical ID field and append it to the merged Data component
MoroccoDataWithID <- cbind(ID,MoroccoData)
MoroccoPolysSP <- SpatialPolygons(MoroccoPolys,proj4string=CRS(proj4string(Sah))) # Promote the merged list to a SpatialPolygons data object
Morocco <- SpatialPolygonsDataFrame(MoroccoPolysSP,data = MoroccoDataWithID,match.ID = FALSE) # Combine the merged Data and Polygon components into a new SpatialPolygonsDataFrame.
Morocco@data$id <- rownames(Morocco@data)
Morocco.fort <- fortify(Morocco, region='id')
Morocco.fort <- Morocco.fort[order(Morocco.fort$order), ]
MoroccoMap <- ggplot(data=Morocco.fort, aes(long, lat, group=group)) +
geom_polygon(colour='black',fill='white') +
theme_bw()
Results:
New Question:
1- How to eliminate the boundary data that cuts through the map in half?
2- How to combine different regions within a .shp file?
Thank you all.
P.S.: the community on stackoverflow.com is wonderful and very helpful, especially toward beginners like me :) Just thought of emphasizing it.
Once you have loaded your shapefiles into Spatial{Lines/Polygons}DataFrames (classes from the sp-package), you can use the fortify generic function to transform them to flat data.frame format. The specific functions for the fortify generic are included in the ggplot2 package, so you'll need to load that first. A code example:
library(ggplot2)
polygon_dataframe = fortify(polygon_spdf)
where polygon_spdf is a SpatialPolygonsDataFrame. A similar approach works for SpatialLinesDataFrame objects.
The difference between my solution and that of @AriBFriedman is that mine includes the x and y coordinates of the polygons/lines, in addition to the data associated with those polygons/lines. I really like visualising my spatial data with the ggplot2 package.
Once you have your data in a normal data.frame you can simply use write.csv to generate a csv file on disk.
I think you mean you want the associated data.frame from each?
If so, it can be accessed with the @ slot access operator. The slot is called data:
write.csv( WesternSahara@data, file="/home/wherever/myWesternSahara.csv")
Then when you read it back in with read.csv, you can try assigning:
myEdits <- read.csv("/home/wherever/myWesternSahara_modified.csv")
WesternSahara@data <- myEdits
You may need to do some massaging of row names and so forth to get it to accept the new data.frame as valid. I'd probably try to merge the existing data.frame with a csv you read in in R, rather than making edits destructively....
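A sketch of the non-destructive route, in base R with toy data: merge the edits onto the existing attribute table by a key column, then restore the original row order and row names so the rows still line up with the polygons. The column names `ID_1` and `NAME_1` are assumptions about what the attribute table contains, and `yield` is a made-up added variable.

```r
# Toy stand-in for WesternSahara@data (note the rows are not sorted by ID_1)
attrs <- data.frame(ID_1 = c(3, 1, 2),
                    NAME_1 = c("C", "A", "B"),
                    stringsAsFactors = FALSE)

# Toy stand-in for the edited CSV read back with read.csv()
edits <- data.frame(ID_1 = c(1, 2, 3),
                    yield = c(1.5, 2.0, 0.7))

# Remember the original row position, since merge() may reorder rows
attrs$.row <- seq_len(nrow(attrs))
merged <- merge(attrs, edits, by = "ID_1", all.x = TRUE, sort = FALSE)

# Put rows back in polygon order and restore the original row names
merged <- merged[order(merged$.row), ]
rownames(merged) <- rownames(attrs)
merged$.row <- NULL

# With the real object this would then be assigned back:
# WesternSahara@data <- merged
```

Keeping the row order intact matters because the polygons are matched to the attribute rows by position.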