FW function in R is not working for my dataset - r

Hi I am trying to use the Finlay Wilkinson regression function FW in R for yield stability analysis. I tried running this, but am getting the below error
Error in FW(y = yield, VAR = genotype, ENV = location, method = "OLS")
: object 'genotype' not found
My data file has the headers 'yield', 'genotype' and 'location' on it. Not sure what I am doing wrong here, since I just followed the syntax from the original publication. Please help!
My code below:
crop <- read.csv("barleygt6.csv", header = TRUE)
summary(crop)
install.packages("devtools")
library(devtools)
install.packages("remotes")
remotes::install_github("lian0090/FW")
library(FW)
lm1=FW(y= yield, VAR= genotype, ENV= location, method="OLS")

Related

Example code for tbl_svysummary function add_ci() produce error message

I'm using tbl_svysummary to report weighted means. Additionally I would like to add another column with confidence intervals using add_ci().
If I try, I get the following error:
Error in UseMethod("add_ci") :
no applicable method for 'add_ci' applied to an object of class "c('tbl_svysummary', 'gtsummary')
The same happens if I run the example-code provided by the reference page in the web, that you can see below (https://www.danieldsjoberg.com/gtsummary/reference/add_ci.html).
Package versions:
gtsummary 1.5.0,
survey 4.1-1
Code:
# Example 3 ----------------------------------
data(api, package = "survey")
add_ci_ex3 <-
survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) %>%
tbl_svysummary(
include = c(api00, hsg, stype),
statistic = hsg ~ "{mean} ({sd})"
) %>%
add_ci(
method = api00 ~ "svymedian"
)
Basically I just tried around wit the example code an my own data, but I only get the same error message again and again. Searching google and stack overflow doesn't give me any further information on this error. Any ideas what could be the reason?

Error: No tidy method for objects of class dgCMatrix

I'm trying out a package regarding double machine learning (https://rdrr.io/github/yixinsun1216/crossfit/) and in trying to run the main function dml(), I get the following "Error: No tidy method for objects of class dgCMatrix" using example dataframe data. When looking through the documentation (https://rdrr.io/github/yixinsun1216/crossfit/src/R/dml.R), I can't find anything wrong with how tidy() is used. Does anyone have any idea what could be going wrong here?
R version 4.2.1
I have already tried installing broom.mixed, although broomextra doesn't seem to be available for my R version, and the same problem occurs. Code used below;
install.packages("remotes")
remotes::install_github("yixinsun1216/crossfit", force = TRUE)
library("remotes")
library("crossfit")
library("broom.mixed")
library("broom")
# Effect of temperature and precipitation on corn yield in the presence of
# time and locational effects
data(corn_yield)
library(magrittr)
dml_yield <- "logcornyield ~ lower + higher + prec_lo + prec_hi | year + fips" %>%
as.formula() %>%
dml(corn_yield, "linear", n = 5, ml = "lasso", poly_degree = 3, score = "finite")

Error in eval(parse()) - r unable to find argument input

I am very new to R, and this is my first time of encountering the eval() function. So I am trying to use the med and boot.med function from the following package: mma. I am using it to conduct mediation analysis. med and boot.med take in models such as linear models, and dataframes that specify mediators and predictors and then estimate the mediation effect of each mediator.
The author of the package gives the flexible option of specifying one's own custom.function. From the source code of med, it can be seen that the custom.function is passed to the eval(). So I tried insert the gbmt function as the custom function. However, R kept giving me error message: Error during wrapup: Number of trees to be used in prediction must be provided. I have been searching online for days and tried many ways of specifying the number of trees parameter n.trees, but nothing works (I believe others have raised similar issues: post 1, post 2).
The following codes are part of the source code of the med function:
cf1 = gsub("responseY", "y[,j]", custom.function[j])
cf1 = gsub("dataset123", "x2", cf1)
cf1 = gsub("weights123", "w", cf1)
full.model[[j]] <- eval(parse(text = cf1))
One custom function example the author gives in the package documentation is as follows:
temp1<-med(data=data.bin,n=2,custom.function = 'glm(responseY~.,data=dataset123,family="quasibinomial",
weights=weights123)')
Here the glm is the custom function. This example code works and you can replicate it easily (if you have mma installed and loaded). However when I am trying to use the gbmt function on a survival object, I got errors and here is what my code looks like:
temp1 <- med(data = data.surv,n=2,type = "link",
custom.function = 'gbmt(responseY ~.,
data = dataset123,
distribution = dist,
train_params = start_stop,
cv_folds=10,
keep_gbm_data = TRUE,
)')
Anyone has any idea how the argument about number of trees n.trees can be added somewhere in the above code?
Many thanks in advance!
Update: in order to replicate the example code, please install mma and try the following:
library("mma")
data("weight_behavior") ##binary x #binary y
x=weight_behavior[,c(2,4:14)]
pred=weight_behavior[,3]
y=weight_behavior[,15]
data.bin<-data.org(x,y,pred=pred,contmed=c(7:9,11:12),binmed=c(6,10), binref=c(1,1),catmed=5,catref=1,predref="M",alpha=0.4,alpha2=0.4)
temp1<-med(data=data.bin,n=2) #or use self-defined final function
temp1<-med(data=data.bin,n=2, custom.function = 'glm(responseY~.,data=dataset123,family="quasibinomial",
weights=weights123)')
I changed the custom.function to gbmt and used a survival object as responseY and the error occurs. When I use the gbmt function on my data outside the med function, there is no error.

Using R Package NNMAPSlite to get City Environmental vs Mortality Dataset

I have several question for those who have worked with R studio. Currently I need to work with NMMAPSlite package. However, I found that there is an issue from the package itself when I wanted to initialise the database connection to remote DB that store the NMMAPS City dataset.
In short, I need help to either
resolve the problem with NMMAPSlite old R package or
where to find the NMMAPS dataset in csv format
BACKGROUND
As a background, I'm using NMMAPSLite packages with intend to reproduce paper of Antonio Gasparrini. Attached at the bottom is the code base I would like to run. It requires:
require(dlnm);
require(NMMAPSlite)
Now the package NMMAPSlite has been deprecated it seems, so I managed to install the dependencies and the package from archive. I will elaborate below on the links required to get the dependencies for NMMAPS and DLNM as well.
PROBLEM
The problems occur when calling initDB() where it says it failed to create remoteDB instance due to invalid object creation. But I suspect, rather, the error comes from the fact the url is not supported. Here is the NMMAPS docs that describes the initDB() function. The db initialisation is necessary to read the city dataset.
The following is the error from R Console when running initDB()
creating directory 'NMMAPS' for local storage
Error in validObject(.Object) :
invalid class “remoteDB” object: object needs a 'url' of type 'http://'
In addition: Warning message:
In grep("^http://", URL, fixed = TRUE, perl = TRUE) :
argument 'perl = TRUE' will be ignored
QUESTIONS
I know this packages NMMAPS are deprecated and too old perhaps, but I really want to reproduce/replicate Antonio Gasparrini's paper: Distributed lag non-linear models for the purpose of my undergraduate thesis project.
Hence,
I wonder if there is anyway to get NMMAPS Dataset for cities environment data vs mortality rate. I visited the official NMMAPS Database but the link for downloading the data is either broken or the server is already down
Or you can also help me to find out if there is equivalent to NMMAPSlite package in R. I just need to download the cities dataset that contains humidity trend, temperatures trend, dewpoint, CO2 trends, Ozone O3 trend, and deaths/mortality rate with respect to time at any particular city for over 2 years. The most important that I need is the mortality rate and Ozone O3 trend.
Or last effort, perhaps do you mind suggesting me similar dataset that is used by his paper? Something where I can derive/analyze time relationship to estimate mortality rate given environmental and air polution information?
APPENDIX
Definition of initDB
baseurl = "http://www.ihapss.jhsph.edu/NMMAPS/v0.1"
function (basedir = "NMMAPS")
{
if (!file.exists(basedir))
message(gettextf("creating directory '%s' for local storage",
basedir))
outcome <- new("remoteDB", url = paste(baseurl, "outcome",
sep = "/"), dir = file.path(basedir, "outcome"), name = "outcome")
exposure <- new("remoteDB", url = paste(baseurl, "exposure",
sep = "/"), dir = file.path(basedir, "exposure"), name = "exposure")
Meta <- new("remoteDB", url = paste(baseurl, "Meta", sep = "/"),
dir = file.path(basedir, "Meta"), name = "Meta")
assign("exposure", exposure, .dbEnv)
assign("outcome", outcome, .dbEnv)
assign("Meta", Meta, .dbEnv)
}
Code to run:
The error comes from line 3
require(dlnm);require(NMMAPSlite)
##############################
# LOAD AND PREPARE THE DATASET
##############################
initDB()
data <- readCity("ny", collapseAge = TRUE)
data <- data[,c("city", "date", "dow", "death", "tmpd", "dptp", "rhum", "o3tmean", "o3mtrend", "cotmean", "comtrend")]
# TEMPERATURE: CONVERSION TO CELSIUS
data$temp <- (data$tmpd-32)*5/9
# POLLUTION: O3 AND CO AT LAG-01
data$o3 <- data$o3tmean + data$o3mtrend
data$co <- data$cotmean + data$comtrend
data$o301 <- filter(data$o3,c(1,1)/2,side=1)
data$co01 <- filter(data$co,c(1,1)/2, side=1)
# DEW POINT TEMPERATURE AT LAG 0-1
data$dp01 <- filter(data$dptp,c(1,1)/2,side=1)
##############################
# CROSSBASIS SPECIFICATION
##############################
# FIXING THE KNOTS AT EQUALLY SPACED VALUES
range <- range(data$temp,na.rm=T)
ktemp <- range [1] + (range [2]-range [1])/5*1:4
# CROSSBASIS MATRIX
ns.basis <- crossbasis(data$temp,varknots=ktemp,cenvalue=21, lagdf=5,maxlag=30)
##############################
# MODEL FIT AND PREDICTION
##############################
ns <- glm(death ~ ns.basis + ns (dp01, df=3 ) + dow + o301 + co01 +
ns(date,df=14*7),family=quasipoisson(), data)
ns.pred <- crosspred(ns.basis,ns,at=-16:33)
##############################
# RESULTS AND PLOTS
##############################
# 3-D PLOT (FIGURE 1)
crossplot(ns.pred,label="Temperature")
# SLICES (FIGURE 2, TOP)
percentiles <- round(quantile(data$temp,c(0.001,0.05,0.95,0.999)), 1)
ns.pred <- crosspred(ns.basis,ns,at=c(percentiles,-16:33))
crossplot(ns.pred,"slices",var=percentiles,lag=c(0,5,15,28), label="Temperature")
# OVERALL EFFECT (FIGURE 2, BELOW)
crossplot(ns.pred,"overall",label="Temperature", title="Overall effect of temperature on mortality
New York 1987–2000" )
# RR AT CHOSEN PERCENTILES VERSUS 21C (AND 95%CI)
ns.pred$allRRfit[as.character(percentiles)]
cbind(ns.pred$allRRlow,ns.pred$allRRhigh)[as.character(percentiles),]
##############################
# THE MOVING AVERAGE MODELS UP TO LAG x (DESCRIBED IN SECTION 5.2)
# CAN BE CREATED BY THE CROSSBASIS FUNCTION INCLUDING THE
# ARGUMENTS lagtype="strata", lagdf=1, maxlag=x
Resources for your context
Distributed lag non-linear models link
Rstudio's NMMAPSlite Package docs pdf download
Rstudio's DNLM Package docs pdf
Duplicate questions from another forum: forum
How to install package from tar/archive: link
Meanwhile, I will contact the author of this package and see if I can get the dataset. Preferable in csv format.
It seems that your code is based on R ver. < 3.0.0. You might find it difficult to reproduce the paper as the current R is typical > 4.0.0. You could try to install the windows version of NMMAPS database from the link given by 'Lil'. But, you will need to install an older version of R (2.9.2).
Or, you could hang on with the latest version of R and make a simple search on GitHub. In case you haven't found the NMMAPS database, you will find how to deal with the database here.
you could try this link http://www.biostat.jhsph.edu/IHAPSS/data/NMMAPS/R/ to download the package. There you have the city-data compressed where you can choose New York manually if initDB does not work.

Create a path diagram of a SEM model with a categorical response variable using semPaths()

I would like to create a path diagram of a SEM model with a categorical response variable using semPaths(). However I am running into an error:
library(lavaan)
library(semPlot)
table.7.5 <-read.table("http://www.da.ugent.be/datasets/Agresti2002.Table.7.5.dat",header=TRUE)
table.7.5$mental <- ordered(table.7.5$mental,levels = c("well","mild","moderate","impaired"))
model <- "mental ~ ses + life"
fit <- sem(model, data=table.7.5)
semPaths(fit,"std",edge.label.cex = 0.5, curvePivot=TRUE,layout = "tree")
Error is:
Error in colnames<-(*tmp*, value = "mental") :
attempt to set 'colnames' on an object with less than two dimensions
Thanks
The above solution solves the problem for the current CRAN version of semPlots(). In the meantime this issue has been solved in the development version. To install it run:
library(devtools)
install_github("SachaEpskamp/semPlot")
For more details read this:
https://github.com/SachaEpskamp/semPlot/issues/9
Someone helped me to solve the problem by replacing the content of cov by res.cov. I don't know why but when it's ordered data, lavaan puts the implied covariance matrix in res.cov instead of cov.
All you have to do is this:
fit#SampleStats#cov<-fit#SampleStats#res.cov
Before calling:
semPaths(fit,"std",edge.label.cex = 0.5, curvePivot=TRUE,layout = "tree")

Resources