Binding multiple shapefiles results in rownames error

I've got a list of around 20 shapefiles that I want to bind into one. These shapefiles have different numbers of fields: some have 1 and some have 2. Examples are shown below:
# 1 field
> dput(head(shp[[1]]))
structure(list(area = c(1.60254096388, 1.40740270051, 0.093933438653,
0.609245720277, 22.892748868, 0.0468096597394)), row.names = 0:5, class = "data.frame")
# 2 fields
> dput(head(shp[[3]]))
structure(list(per = c(61, 70, 79, 90, 57, 66), area = c(2218.8,
876.414, 2046.94, 1180.21, 1779.12, 122.668)), row.names = c(0:5), class = "data.frame")
I used the following code to bind them and it worked just as I wanted:
merged<- raster::bind(shp, keepnames= FALSE, variables = area)
writeOGR(merged, './shp', layer= 'area', driver="ESRI Shapefile")
However, I now need to subset one of the shapefiles in the list. I do it in this way:
shp[[3]]@data <- shp[[3]]@data %>% subset(Area >= 50)
names(shp[[3]]@data)[names(shp[[3]]@data) == "Area"] <- "area"
When I run the bind command, however, this now gives me an error:
merged<- raster::bind(shp, keepnames= FALSE, variables = area)
Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
Calls: <Anonymous> ... row.names<- -> row.names<-.data.frame -> .rowNamesDF<-
Execution halted
I'm not sure why that is. The shapefile hasn't otherwise changed; it has just been subsetted. I tried deleting the rownames in the way shown below and it still throws the same error.
rownames(shp[[3]]@data) <- NULL
What could it be?

I think the problem is that you subset @data (the attributes) only, whereas you should subset the entire object so that geometries and attributes stay in step. Something like this:
x <- shp[[3]] # for simplicity
x <- x[x$Area >= 50, ]
names(x)[names(x) == "Area"] <- "area"
shp[[3]] <- x
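To illustrate why subsetting only the attribute slot can leave an object in an inconsistent state, here is a minimal sketch in base R. ToyLayer is a hypothetical stand-in class (it is not part of sp or raster) that, like a Spatial*DataFrame, keeps geometries and attributes in separate slots; the real classes are more involved, so treat this only as an analogy.

```r
library(methods)

# Hypothetical stand-in for a Spatial*DataFrame: one geometry per attribute row.
setClass("ToyLayer",
         representation(geoms = "list", data = "data.frame"),
         validity = function(object) {
           if (length(object@geoms) != nrow(object@data))
             "number of geometries and attribute rows differ"
           else TRUE
         })

layer <- new("ToyLayer",
             geoms = list("g1", "g2", "g3"),
             data  = data.frame(Area = c(10, 60, 80)))

# Subsetting only the attribute slot: slot assignment does not re-run the
# validity check, so the object keeps 3 geometries but only 2 attribute rows.
broken <- layer
broken@data <- subset(broken@data, Area >= 50)
broken_check <- validObject(broken, test = TRUE)  # character message if invalid

# Subsetting both parts with the same logical index keeps them aligned.
keep  <- layer@data$Area >= 50
fixed <- new("ToyLayer",
             geoms = layer@geoms[keep],
             data  = layer@data[keep, , drop = FALSE])
```

With the real objects, indexing the whole object (x[x$Area >= 50, ]) plays the role of the last step: rows and geometries are dropped together.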

Concatenate layers in R Keras

I have this BERT classifier, where I want to concatenate the BERT output with additional features (one-hot encoded, 13 categories).
I get this error message which I do not understand - the arguments specified are all named.
input_word_ids <- layer_input(shape = c(set.max_length), dtype = 'int32', name = "input_word_ids")
input_mask <- layer_input(shape = c(set.max_length), dtype = 'int32', name = "input_attention_mask")
input_topic <- layer_input(shape = c(13), dtype = 'int32', name = "input_topic")
last_hidden_state <- model_tf(input_word_ids, attention_mask = input_mask)[[1]] # shape=(None, 512, 768)
cls_token <- last_hidden_state[, 1,] # shape=(None, 768)
output <- cls_token %>%
layer_concatenate(inputs = list(cls_token, input_topic), axis = -1)
Error in assert_all_dots_named(envir, cl) :
All arguments provided to `...` must be named.
Call with unnamed arguments in dots:
layer_concatenate(inputs = list(cls_token, input_topic), axis = -1, .)
If I run layer_concatenate(inputs = list(cls_token, input_topic)) [without the axis argument],
I get
Error in modifiers[[nm]](args[[nm]]) :
cannot coerce type 'environment' to vector of type 'integer'
The first error message stems from the Keras package (assert_all_dots_named(), line 435, https://github.com/rstudio/keras/blob/main/R/utils.R) if I am not mistaken
I read the Keras vignette, I don't see what I am doing wrong...
Any help is highly appreciated, many thanks in advance!
I was able to solve it on my own: cls_token %>% was the problem. I suppose it is a conflict between the Keras functional API and magrittr piping. The pipe passed cls_token to layer_concatenate() as an additional unnamed argument, hence the error message.
Solution:
output <- layer_concatenate(inputs = list(cls_token, input_topic), axis = 1) %>%
layer_dropout(rate = set.dropout)
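The effect can be reproduced without Keras. The sketch below uses a hypothetical function, toy_concatenate(), whose argument layout (inputs, then ..., then axis) mirrors the call in the error message; it is not the Keras implementation. It also uses the native pipe |> (R 4.1 or later) as a stand-in for %>%, on the assumption that both insert the left-hand side as the first unnamed argument when no placeholder is given.

```r
# Hypothetical stand-in: named `inputs`, then `...`, then `axis`.
toy_concatenate <- function(inputs, ..., axis = -1) {
  dots <- list(...)
  # Count arguments that ended up in `...` without a name.
  unnamed <- if (is.null(names(dots))) length(dots) else sum(names(dots) == "")
  list(n_inputs = length(inputs), n_unnamed_dots = unnamed)
}

a <- "cls_token"
b <- "input_topic"

# Piped call: `a` is inserted as an extra positional argument. Because
# `inputs` is already matched by name, `a` lands in `...` unnamed.
piped <- a |> toy_concatenate(inputs = list(a, b), axis = -1)

# Direct call: nothing is passed through `...`.
direct <- toy_concatenate(inputs = list(a, b), axis = -1)
```

Here piped$n_unnamed_dots is 1 and direct$n_unnamed_dots is 0, which matches the shape of the "must be named" complaint above.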

Error in if (ncol(spc1$amp) > ncol(spc2$amp)) { : argument is of length zero

I am using warbleR in R to do some acoustic analyses. Because freq_range did not detect all the bottom frequencies well, I created a data frame manually with the correct bottom frequencies, loaded it into R, and turned it into a selection table. track_freq_contour, compare.methods and freq_DTW all work fine, although freq_DTW does give a warning message:
Warning message: In (0:(n - 1)) * f : NAs produced by integer overflow
However, if I run the function cross_correlation, I get the following error:
Error in if (ncol(spc1$amp) > ncol(spc2$amp)) { :
argument is of length zero
I do not get this error with a selection table with the bottom and top frequency added with the freq_range function in R instead of manually. What could be the issue here? The selection tables both look similar:
This is the selection table partly made by R through freq_range:
And this is the one with the bottom frequencies added manually (which has more sound files than the one before):
This is part of the code I use:
#Comparing methods for quantitative analysis of signal structure
compare.methods(X = stnew, flim = c(0.6,2.5), bp = c(0.6,2.5), methods = c("XCORR", "dfDTW"))
#Measure acoustic parameters with spectro_analysis
paramsnew <- spectro_analysis(stnew, bp = c(0.6,2), threshold = 20)
write.csv(paramsnew, "new_acoustic_parameters.csv", row.names = FALSE)
#Remove parameters derived from fundamental frequency
paramsnew <- paramsnew[, grep("fun|peakf", colnames(paramsnew), invert = TRUE)]
#Dynamic time warping
dm <- freq_DTW(stnew, length.out = 30, flim = c(0.6,2), bp = c(0.6,2), wl = 300, img = TRUE)
str(dm)
#Spectrographic cross-correlation
xcnew <- cross_correlation(stnew, wl = 300, na.rm = FALSE)
str(xcnew)
Any idea what I'm doing wrong?

Problem with for loop when downloading species occurrence data

I want to download occurrence data from the GBIF website using the following R script. When I run the script, I get this error: "Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0)". I would highly appreciate any help with this.
My data: data
My R script:
flist<-read_excel("Mekong fish.xlsx",sheet="Sheet1")
##Loop
fname<-list()
Occ<-list()
datfish<-list()
name_list<-unique(flist$Updated_name)
# loop over the species names and download occurrence records
for (i in seq_along(name_list)) {
# small query first, to get the total record count from the metadata
Occ[[i]] <-occ_search(scientificName = name_list[i], limit=2)
fname[[i]]<-occ_search(scientificName = name_list[i],
fields = c("species", "country","decimalLatitude", "decimalLongitude"),
hasCoordinate=T, limit= Occ[[i]]$meta[4],return ="data")
datfish[[i]]<-as.data.frame(fname[[i]]$data)
}
When I ran your code I got a different error:
Expecting logical in D1424 / R1424C4: got 'in Lao'
Expecting logical in D1426 / R1426C4: got 'in China'
Expecting logical in D1467 / R1467C4: got 'only Cambodia'
Expecting logical in D1469 / R1469C4: got 'only in VN'
Expecting logical in D1473 / R1473C4: got 'only in China'
Expecting logical in D1486 / R1486C4: got 'only in Malaysia'
Expecting logical in D1488 / R1488C4: got 'only 1 point in VN'
I think the problem is caused by some cells in the 4th column, which contain text where logical values are expected. I don't have the right packages installed to run the rest of your code, but once I dropped the fourth column the only remaining error was about a missing package.
flist<-read_excel("~/Downloads/Mekong fish.xlsx",sheet="Sheet1")
flist <- subset(flist, select = -4)
...
EDIT:
This worked for me. read_excel guessed the type of column 4 as logical. When I explicitly set it to text, it worked.
library(readxl)
library(rgbif)
library(raster)
flist<-read_excel("~/Downloads/Mekong fish.xlsx",
sheet="Sheet1",
col_types = c("numeric", "text", "numeric", "text"))
flist
##Loop
fname<-list()
Occ<-list()
datfish<-list()
name_list<-unique(flist$Updated_name)
# loop over the first two species names and download occurrence records
for (i in seq_along(name_list[1:2])) {
message(i)
# small query first, to get the total record count from the metadata
Occ[[i]] <-occ_search(scientificName = name_list[i], limit=2)
message(Occ[[i]])
fname[[i]]<-occ_search(scientificName = name_list[i],
fields = c("species", "country","decimalLatitude", "decimalLongitude"),
hasCoordinate=T, limit= Occ[[i]]$meta[4],return ="data")
message(fname[[i]])
datfish[[i]]<-as.data.frame(fname[[i]]$data)
message(datfish[[i]])
}
> 1
> list(offset = 0, limit = 2, endOfRecords = FALSE, count = 15)list(list(name = c("Animalia", "Chordata", "Actinopterygii",
> "Cypriniformes", "Cyprinidae", "Aaptosyax", "Aaptosyax grypus"), key = c("1", "44", "204", "1153", "7336", "2363805", "2363806"),
> etc...

How to fix RichnessGrid Error in split.default

I tried to use RichnessGrid to count species occurrence on the map. But I am constantly getting the error message
"Error in split.default(x = seq_len(nrow(x)), f = f, drop = drop, ...) :
group length is 0 but data length > 0".
From other posts, it seems that this error message usually points to a typo, which is not the case here. Does anyone know how to troubleshoot this problem?
My data look like this
I tried a few things: 1. changing the resolution option or the type definition; 2. changing the header of my data; 3. comparing the summary of my data with that of the sample data. Nothing worked, and I still cannot figure out what went wrong.
dput(head(clean))
#subset my df (clean) for RichnessGrid
dat<-clean %>% select(the.plant.list,longitude,latitude)
# tried to change header but still failed
dat <- dat %>% rename(species = the.plant.list)
head(dat)
RichnessGrid(dat, reso=60, type = "spnum")
#try sample data and code
data(lemurs)
e <- c(-125, -105, 30, 50)
RichnessGrid(lemurs, e, reso = 60, type = "spnum")
#compare sample data and my own
data(lemurs)
data(dat)
summary(lemurs)
summary(dat)

Error in as(x, class(k)) : no method or default for coercing “NULL” to “data.frame”

I am currently facing the error shown below, which is related to NULL values being coerced to a data frame. The data set does contain nulls; I have tried both the is.na() and is.null() functions to replace the null values with something else. The data is stored on HDFS in pig.hive format. I have also attached the code below. The code works fine if I remove v[,25] from the key.
Code:
AM = c("AN");
UK = c("PP");
sample.map <- function(k,v){
key <- data.frame(acc = v[!which(is.na(v[,1])),1],
year = substr(v[!which(is.na(v[,1])),2],1,4),
month = substr(v[!which(is.na(v[,1])),2],5,6))
value <- data.frame(v[,3],count=1)
keyval(key,value)
}
sample.reduce <- function(key,v){
AT <- sum(v[which(v[,1] %in% AM=="TRUE"),2])
UnknownT <- sum(v[which(v[,1] %in% UK=="TRUE"),2])
Total <- AT + UnknownT
d <- data.frame(AT,UnknownT,Total)
keyval(key,d)
}
out <- mapreduce(input ="/user/hduser/input",
output = "/user/hduser/output",
input.format = make.input.format("pig.hive", sep = "\u0001"),
output.format = make.output.format("csv", sep = ","),
map = sample.map,
reduce = sample.reduce)
Error:
Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) : number of items to replace is not a multiple of replacement length
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) : data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) : number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
no method or default for coercing “NULL” to “data.frame”
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as
No traceback available
UPDATE
I have added the sample data and edited the code above. Hope this helps!
Sample Data:
NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"
There are some issues with as() that have been identified recently. I don't see why as() can't handle this by default, but you can define an S4 method for coerce, the generic that performs the conversion, so that it calls as.data.frame.
setMethod("coerce",c("NULL","data.frame"), function(from, to, strict=TRUE) as.data.frame(from))
[1] "coerce"
as(NULL,"data.frame")
data frame with 0 columns and 0 rows
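As a rough sketch of what this buys you downstream: once the coerce method is registered, a NULL result becomes an empty data frame, which base R's rbind() can combine with real rows. The partial data frame below is made-up illustration data using the column names from the question; whether rmr2 combines reducer output in this way is an assumption, not something confirmed here.

```r
library(methods)

# Same coerce method as in the answer above: as(NULL, "data.frame") now
# delegates to as.data.frame(), which returns an empty data frame.
setMethod("coerce", c("NULL", "data.frame"),
          function(from, to, strict = TRUE) as.data.frame(from))

empty <- as(NULL, "data.frame")

# Hypothetical reducer output, using the column names from the question.
partial <- data.frame(AT = 3, UnknownT = 2, Total = 5)

# rbind() drops the zero-column data frame, so only the real row remains.
combined <- rbind(empty, partial)
```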
