Attempting to save intermediate states when running rmh yields error

I am trying to simulate a multitype point process, saving the intermediate states every 1000 steps via rmhcontrol. However, the simulation fails whenever I specify nsave. For example, whenever I run the code block below, I get the error:
Error in factor(Cmprop, levels = Ctypes) : object 'Cmprop' not found
The code is:
library(spatstat)
library(optimbase)
num_marks <- length(unique(marks(amacrine)))
iradii <- .1*ones(nrow=num_marks,ncol=num_marks)
MSH1 <- MultiStraussHard(iradii=iradii)
x <- ppm(amacrine, trend =~polynom(x,y,3), interaction=MSH1)
control <- rmhcontrol(nsave=1e3)
rmh(x,control=control)
Thanks for the help!

This is a bug in spatstat versions 1.62-1 and 1.62-2.
It has already been fixed in the current development version 1.62-2.006 which you can download from the GitHub repository for spatstat. The next public release on CRAN will be at the end of January 2020.
Please note: the code in the original question generates an error because ones has formal arguments nx, ny rather than nrow, ncol. The following code tests the bug:
library(spatstat)
nm <- length(levels(marks(amacrine)))
ir <- matrix(0.1, nm, nm)
MSH1 <- MultiStraussHard(iradii=ir)
fit <- ppm(amacrine ~ polynom(x,y,3), MSH1)
rmh(fit, nsave=1e3, verbose=FALSE)
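The argument-name mismatch noted above can be reproduced without optimbase. The sketch below uses ones_standin, a hypothetical stand-in that only copies the formal arguments nx and ny reported in this answer; it is not the package function.

```r
# Hypothetical stand-in with the same formal arguments (nx, ny) as reported above.
ones_standin <- function(nx = 1, ny = 1) matrix(1, nrow = nx, ncol = ny)

# Calling it with nrow/ncol fails: R cannot match those names to nx/ny.
err <- tryCatch(ones_standin(nrow = 2, ncol = 2), error = function(e) e)

# Matching the formal argument names works, as does plain base-R matrix().
ok <- 0.1 * ones_standin(nx = 2, ny = 2)
base_ok <- matrix(0.1, 2, 2)
```

Using matrix() directly, as in the test code above, removes the dependency on optimbase altogether.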

Related

Using R Package NMMAPSlite to get City Environmental vs Mortality Dataset

I have several questions for those who have worked with RStudio. I need to use the NMMAPSlite package, but it fails when I try to initialise the connection to the remote database that stores the NMMAPS city dataset.
In short, I need help to either
resolve the problem with the old NMMAPSlite R package, or
find the NMMAPS dataset in csv format
BACKGROUND
As background, I'm using the NMMAPSlite package with the intent of reproducing a paper by Antonio Gasparrini. The code I would like to run is attached at the bottom. It requires:
require(dlnm);
require(NMMAPSlite)
The NMMAPSlite package now seems to be deprecated, so I installed the package and its dependencies from the archive. Below I list the links needed to get the dependencies for NMMAPSlite and dlnm as well.
PROBLEM
The problem occurs when calling initDB(), which reports that it failed to create a remoteDB instance because the object is invalid. I suspect, however, that the real cause is that the URL is not supported. The NMMAPSlite documentation describes the initDB() function. The database initialisation is necessary to read the city dataset.
The following is the error from the R console when running initDB():
creating directory 'NMMAPS' for local storage
Error in validObject(.Object) :
invalid class “remoteDB” object: object needs a 'url' of type 'http://'
In addition: Warning message:
In grep("^http://", URL, fixed = TRUE, perl = TRUE) :
argument 'perl = TRUE' will be ignored
QUESTIONS
I know the NMMAPSlite package is deprecated and perhaps too old, but I really want to reproduce Antonio Gasparrini's paper, Distributed lag non-linear models, for my undergraduate thesis project.
Hence,
I wonder if there is any way to get the NMMAPS dataset of city environmental data versus mortality rate. I visited the official NMMAPS database, but the download link is either broken or the server is already down.
Or you could help me find out whether there is an equivalent to the NMMAPSlite package in R. I just need to download the city dataset that contains humidity trend, temperature trend, dew point, CO2 trend, ozone (O3) trend, and deaths/mortality rate over time for any particular city, covering more than 2 years. The most important variables for me are the mortality rate and the ozone (O3) trend.
Or, as a last resort, could you suggest a dataset similar to the one used in his paper? Something from which I can analyze the time relationship to estimate mortality rate given environmental and air pollution information?
APPENDIX
Definition of initDB
baseurl = "http://www.ihapss.jhsph.edu/NMMAPS/v0.1"
function (basedir = "NMMAPS")
{
if (!file.exists(basedir))
message(gettextf("creating directory '%s' for local storage",
basedir))
outcome <- new("remoteDB", url = paste(baseurl, "outcome",
sep = "/"), dir = file.path(basedir, "outcome"), name = "outcome")
exposure <- new("remoteDB", url = paste(baseurl, "exposure",
sep = "/"), dir = file.path(basedir, "exposure"), name = "exposure")
Meta <- new("remoteDB", url = paste(baseurl, "Meta", sep = "/"),
dir = file.path(basedir, "Meta"), name = "Meta")
assign("exposure", exposure, .dbEnv)
assign("outcome", outcome, .dbEnv)
assign("Meta", Meta, .dbEnv)
}
Code to run:
The error comes from the initDB() call.
require(dlnm);require(NMMAPSlite)
##############################
# LOAD AND PREPARE THE DATASET
##############################
initDB()
data <- readCity("ny", collapseAge = TRUE)
data <- data[,c("city", "date", "dow", "death", "tmpd", "dptp", "rhum", "o3tmean", "o3mtrend", "cotmean", "comtrend")]
# TEMPERATURE: CONVERSION TO CELSIUS
data$temp <- (data$tmpd-32)*5/9
# POLLUTION: O3 AND CO AT LAG-01
data$o3 <- data$o3tmean + data$o3mtrend
data$co <- data$cotmean + data$comtrend
data$o301 <- filter(data$o3,c(1,1)/2,side=1)
data$co01 <- filter(data$co,c(1,1)/2, side=1)
# DEW POINT TEMPERATURE AT LAG 0-1
data$dp01 <- filter(data$dptp,c(1,1)/2,side=1)
##############################
# CROSSBASIS SPECIFICATION
##############################
# FIXING THE KNOTS AT EQUALLY SPACED VALUES
range <- range(data$temp,na.rm=T)
ktemp <- range [1] + (range [2]-range [1])/5*1:4
# CROSSBASIS MATRIX
ns.basis <- crossbasis(data$temp,varknots=ktemp,cenvalue=21, lagdf=5,maxlag=30)
##############################
# MODEL FIT AND PREDICTION
##############################
ns <- glm(death ~ ns.basis + ns (dp01, df=3 ) + dow + o301 + co01 +
ns(date,df=14*7),family=quasipoisson(), data)
ns.pred <- crosspred(ns.basis,ns,at=-16:33)
##############################
# RESULTS AND PLOTS
##############################
# 3-D PLOT (FIGURE 1)
crossplot(ns.pred,label="Temperature")
# SLICES (FIGURE 2, TOP)
percentiles <- round(quantile(data$temp,c(0.001,0.05,0.95,0.999)), 1)
ns.pred <- crosspred(ns.basis,ns,at=c(percentiles,-16:33))
crossplot(ns.pred,"slices",var=percentiles,lag=c(0,5,15,28), label="Temperature")
# OVERALL EFFECT (FIGURE 2, BELOW)
crossplot(ns.pred,"overall",label="Temperature", title="Overall effect of temperature on mortality
New York 1987–2000" )
# RR AT CHOSEN PERCENTILES VERSUS 21C (AND 95%CI)
ns.pred$allRRfit[as.character(percentiles)]
cbind(ns.pred$allRRlow,ns.pred$allRRhigh)[as.character(percentiles),]
##############################
# THE MOVING AVERAGE MODELS UP TO LAG x (DESCRIBED IN SECTION 5.2)
# CAN BE CREATED BY THE CROSSBASIS FUNCTION INCLUDING THE
# ARGUMENTS lagtype="strata", lagdf=1, maxlag=x
Resources for your context
Distributed lag non-linear models link
Rstudio's NMMAPSlite Package docs pdf download
Rstudio's DNLM Package docs pdf
Duplicate questions from another forum: forum
How to install package from tar/archive: link
Meanwhile, I will contact the author of this package and see if I can get the dataset, preferably in csv format.
It seems that your code is based on R version < 3.0.0. You might find it difficult to reproduce the paper, as the current R is typically > 4.0.0. You could try to install the Windows version of the NMMAPS database from the link given by 'Lil'. But you will need to install an older version of R (2.9.2).
Or you could stay with the latest version of R and do a simple search on GitHub. In case you haven't found the NMMAPS database, you will find how to deal with the database here.
You could try this link, http://www.biostat.jhsph.edu/IHAPSS/data/NMMAPS/R/, to download the package. There you have the compressed city data, where you can choose New York manually if initDB does not work.
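The warning in the question's console output suggests one possible reason the validity check fails on a current R, whether or not the server is reachable: the check apparently calls grep with both fixed = TRUE and perl = TRUE. The sketch below only reproduces that grep behaviour in base R; it is a guess at the mechanism, not a confirmed reading of the NMMAPSlite source.

```r
# URL assembled the same way as in the initDB definition quoted in the question.
baseurl <- "http://www.ihapss.jhsph.edu/NMMAPS/v0.1"
url <- paste(baseurl, "outcome", sep = "/")

# With fixed = TRUE the pattern is a literal string, so "^" is not an anchor.
# Current R ignores perl = TRUE here and warns, as in the console output.
literal_match <- suppressWarnings(grepl("^http://", url, fixed = TRUE, perl = TRUE))

# As a regular expression the same pattern matches the start of the URL.
regex_match <- grepl("^http://", url)

# startsWith() is a base-R alternative that avoids regular expressions.
prefix_match <- startsWith(url, "http://")
```

If this is indeed the cause, the URL itself is well-formed and the package's validity check would need patching before initDB() can succeed.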

Error related to randomisation test within lapply() function in R

I have 30 datasets that are combined in a data list. I want to analyze the spatial point patterns using the L function along with a randomisation test. The code is as follows.
The first block works well for a single dataset (data1), but once it is applied to a list of datasets with lapply(), as in the second block, it gives me a very long error like so:
"Error in Kcross(X, i, j, ...) : No points have mark i = Acoraceae
Error in envelopeEngine(X = X, fun = fun, simul = simrecipe, nsim =
nsim, : Exceeded maximum number of errors"
Can anybody tell me what is wrong with the second block?
grp <- factor(data1$species)
window <- ripras(data1$utmX, data1$utmY)
pp.grp <- ppp(data1$utmX, data1$utmY, window=window, marks=grp)
L.grp <- alltypes(pp.grp, Lest, correction = "Ripley")
LE.grp <- alltypes(pp.grp, Lcross, nsim = 100, envelope = TRUE)
plot(L.grp)
plot(LE.grp)
L.LE.sp <- lapply(data.list, function(x) {
grp <- factor(x$species)
window <- ripras(x$utmX, x$utmY)
pp.grp <- ppp(x$utmX, x$utmY, window = window, marks = grp)
L.grp <- alltypes(pp.grp, Lest, correction = "Ripley")
LE.grp <- alltypes(pp.grp, Lcross, envelope = TRUE)
result <- list(L.grp=L.grp, LE.grp=LE.grp)
return(result)
})
# each list element holds L.grp and LE.grp, so select the dataset first
plot(L.LE.sp[[1]]$LE.grp)
This question is about the R package spatstat.
It would help if you could add a minimal working example including data which demonstrate this problem.
If that is not available, please generate the error on your computer, then type traceback() and capture the output and post it here. This will trace the location of the error.
Without this information, my best guess is the following:
The error message says No points have mark i=Acoraceae. That means that the code is expecting a point pattern to include points of type Acoraceae but found that there were none. This can happen because in alltypes(... envelope=TRUE) the code generates random point patterns according to complete spatial randomness. In the simulated patterns, the number of points of type Acoraceae (say) will be random according to a Poisson distribution with a mean equal to the number of points of type Acoraceae in the observed data. If the number of Acoraceae in the actual data is small then there is a reasonable chance that the simulated pattern will contain no Acoraceae at all. This is probably what is causing the error message No points have mark i=Acoraceae.
If this interpretation is correct then you should be able to suppress the error by including the argument fix.marks=TRUE, that is,
alltypes(pp.grp, Lcross, envelope=TRUE, fix.marks=TRUE, nsim=99)
I'm not suggesting this is necessarily appropriate for your application, but this should remove the error message if my guess is correct.
In the latest development version of spatstat, available on GitHub, the code for envelope has been tweaked to detect this error.
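The Poisson argument above can be checked numerically. This is a small base-R sketch with hypothetical counts (3 points of a rare type, 50 of a common one), assuming independent simulations; it does not use the asker's data.

```r
# Hypothetical observed counts for a rare and a common type.
n_obs <- c(rare = 3, common = 50)

# Chance that one simulated pattern has no points of the type: P(N = 0) = exp(-n).
p_zero <- dpois(0, lambda = n_obs)

# Chance that at least one of nsim independent simulations has none.
nsim <- 99
p_any_zero <- 1 - (1 - p_zero)^nsim
```

For a type with only a handful of points, an empty simulation somewhere in the run is close to certain, which fits the guess that the error only shows up for some of the 30 datasets.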

Getting error in mclust package while working on univariate fit

While working on a univariate fit using Mclust I am getting the following error:
Error in mstepE(data = as.matrix(data)[initialization$subset, ], z = z, :
row dimension of z should equal data length
I am using the code mentioned in:
https://cran.r-project.org/web/packages/mclust/vignettes/mclust.html#initialisation
This is the code section where I am getting error:
df1 <- dataSample
BIC <- NULL
for(j in 1:20){
rBIC <- mclustBIC(df1, verbose = T,
initialization = list(hcPairs = randomPairs(df1)))
BIC <- mclustBICupdate(BIC, rBIC)
}
summary(BIC)
The following link contains the data to be passed to the variable 'df1' (file name: dataSample.csv):
https://drive.google.com/open?id=0Bzau9RsRnQreYk9XOWVBSm91b2o4NTQ4RlA2UFdWbDBVOVpR
This is the solution I got from one of the authors of the 'mclust' library (Prof. Luca Scrucca):
"there was a bug due to the use of automatic subset that clash when hcPairs are provided. I have fixed it in the current dev version of mclust.
Since submission to CRAN won't happen shortly, you may use the following code to avoid the error with the current release of mclust:
rBIC <- mclustBIC(df1, verbose = T,
initialization = list(hcPairs = randomPairs(df1),
subset = 1:NROW(df1)))
When the bug fix will be released, the subset argument could be omitted as it is redundant."
Now, the code is working fine.

R: autoKrige.cv function in automap package generates NaNs

I'm fairly new to R and I am trying to interpolate temperature measurements that were gathered from different stations across the Netherlands. I have data for about 35 stations that take measurements every 10 minutes over a timespan of about two weeks. Accordingly, I figured it would be best to write a loop that takes care of this. To see how well the interpolation technique works, I want to do a cross-validation for every timestamp.
To do this I used the autoKrige.cv function from the automap package, and then the compare.cv function from the same package to get an overview of the most important statistics for all timestamps. Besides that, I made sure the cross-validation is only done if at least 25 stations registered measurements.
The problem however is, that my code as described below works most of the time but gives the following warnings in 4 cases:
1. In sqrt(ret[[var.name]]) : NaNs produced
2. In sqrt(ret[[var.name]]) : NaNs produced
3. In sqrt(ret[[var.name]]) : NaNs produced
4. In sqrt(ret[[var.name]]) : NaNs produced
When I try to use the compare.cv command for the total list including all the cross validations it gives me the following error:
"Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm, :
missing values and NaN's not allowed if 'na.rm' is FALSE"
I'm wondering what causes the autoKrige.cv function to generate NaNs in the cross-validation and, more importantly, how I can remove them from results.cv so that I can use the compare.cv function.
rm(list=ls())
# load packages
require(sp)
require(gstat)
require(ggmap)
require(automap)
require(ggplot2)
# load data (download from https://www.dropbox.com/s/qmi3loub29e55io/meassurements_aug.RDS?dl=0)
load("download path")
# make data spatial and assign spatial coordinate system
coordinates(meassurements) = ~x+y
proj4string(meassurements) <- CRS("+init=epsg:4326")
meassurements_df <- as.data.frame(meassurements)
# loop for cross validation
timestamp <- meassurements$import_log_id
results.cv=list()
for (i in unique(timestamp)) {
x = meassurements_df[which(meassurements$import_log_id == i), ]
if(sum(!is.na(x$temperature)) > 25){
results.cv[[paste0(i)]] = autoKrige.cv (temperature ~ 1, meassurements[which(meassurements$import_log_id == i & !is.na(meassurements$temperature)), ])
}
}
# calculate key statistics (RMSE MAE etc)
compare.cv(results.cv)
Thanks!
I came across the same problem and solved it by applying remove.duplicates() from the sp package to the SpatialPointsDataFrame used for kriging. Before that, I calculated the mean of the relevant variables in the data frame.
library(sp)
library(dplyr)
# average the relevant variables within each group, then drop duplicated points
SPDF@data <- SPDF@data %>%
group_by(varx,vary,varz) %>%
mutate_at(vars(one_of(relevant_var)),mean,na.rm=TRUE) %>%
ungroup()
SPDF <- SPDF %>% remove.duplicates()
At the time I was encountering the same problem the Dropbox link above was not working anymore, so I could not check this specific example.
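Duplicated locations are a common cause of a singular kriging system, which may be what produces the NaNs here. The same average-then-deduplicate idea can be sketched in base R on a plain data frame. The stations data and its column names x, y and temperature are hypothetical, chosen only to illustrate the step.

```r
# Hypothetical station data: two rows share the same coordinates.
stations <- data.frame(
  x = c(4.9, 4.9, 5.1, 5.3),
  y = c(52.4, 52.4, 52.0, 51.9),
  temperature = c(18.0, 20.0, 17.5, 19.0)
)

# Flag rows whose coordinates already occurred earlier in the data frame.
dup <- duplicated(stations[, c("x", "y")])
n_dup <- sum(dup)

# Average the measurements per unique location (rows with NA are dropped).
dedup <- aggregate(temperature ~ x + y, data = stations, FUN = mean)
```

Checking n_dup per timestamp before calling autoKrige.cv would show whether duplicated coordinates coincide with the four warnings.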

Can't find the function after require(thePackage)

I am trying to get the plot.cuminc() function in the cmprsk package.
I installed the package and used require(cmprsk) as well as library(cmprsk) but R still can't find the function.
P.S. I used ?plot.cuminc() and found the help document.
Does anyone know what's wrong with R or my code?
If you run plot on a cuminc object, it will automatically run plot.cuminc. Try running this example code:
library(cmprsk)
set.seed(2)
ss <- rexp(100)
gg <- factor(sample(1:3,100,replace=TRUE),1:3,c('a','b','c'))
cc <- sample(0:2,100,replace=TRUE)
strt <- sample(1:2,100,replace=TRUE)
print(xx <- cuminc(ss,cc,gg,strt))
plot(xx,lty=1,color=1:6)
If you really want to look at the function, or run it directly for some reason, you can use cmprsk:::plot.cuminc.
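The dispatch described above can be seen with a toy S3 class in base R. The class toy and its method summary.toy are hypothetical and unrelated to cmprsk; the point is only that the generic finds the method from the object's class, so the method never has to be called by name.

```r
# A toy S3 method for the base generic summary(), for a hypothetical class "toy".
summary.toy <- function(object, ...) "summary.toy was called"

# An object carrying that class.
obj <- structure(list(), class = "toy")

# Calling the generic dispatches to summary.toy based on class(obj).
res <- summary(obj)

# getS3method() retrieves a method by generic and class name.
m <- getS3method("summary", "toy")
```

For a method registered inside a package namespace, getAnywhere("plot.cuminc") is another way to inspect it besides the ::: operator.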
