Combining ROCR performance objects

I have multiple performance objects created with ROCR. Each of them contains AUC or FPR/TPR values for one class, and each holds the results of multiple test runs, so
length(first.perf.obj@y.values)
returns something greater than 1.
I can plot the average for a single class using
plot(first.perf.obj, avg="vertical")
as described in the ROCR manual. I want to combine these objects to calculate and plot their global average, something like
global.perf.obj <- combine.perf.objects(first.perf.obj, second.perf.obj, third.perf.obj)
Is there an easy way to do this, or should I decompose each object and calculate values by hand?

I went back and recreated the prediction objects for the global case.
I call the prediction function like this:
global.prediction <- prediction(c(cls1.likelihood,
                                  cls2.likelihood,
                                  cls3.likelihood,
                                  cls4.likelihood,
                                  cls5.likelihood),
                                c(duplicate.cols(cls1.labels, ncol(cls1.likelihood)),
                                  duplicate.cols(cls2.labels, ncol(cls2.likelihood)),
                                  duplicate.cols(cls3.labels, ncol(cls3.likelihood)),
                                  duplicate.cols(cls4.labels, ncol(cls4.likelihood)),
                                  duplicate.cols(cls5.labels, ncol(cls5.likelihood))),
                                label.ordering=c(FALSE, TRUE))
where duplicate.cols simply builds a data.frame of repeated label columns.
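The original duplicate.cols wasn't shown; a minimal sketch of what it might look like (an assumption, not the original code):
duplicate.cols <- function(labels, n) {
  # repeat the label vector once per column of the likelihood matrix
  as.data.frame(replicate(n, labels))
}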
Then I can get any statistic for the global case with, e.g., performance(global.prediction, "auc").
It's a bit slow, but I think it's simpler than trying to combine the values from multiple performance objects.
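For instance, the global average can then be plotted the same way as for a single class (a sketch based on the calls above):
global.perf <- performance(global.prediction, "tpr", "fpr")
plot(global.perf, avg="vertical")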

Related

package "fdapace" (R) - How to access the principal components of the functional principal component analysis

After applying the FPCA() function of the "fdapace" package to a dataset, the function returns an FPCA object with various values and fields. Unfortunately I don't know which of those fields contain the principal components, or how to access and plot them. I know that there is documentation for the package, but as a beginner it doesn't really help me (no criticism intended). You can find the documentation here: fdapace.pdf
The estimates of the functional principal components (FPCs) are saved in xiEst in the result list, a matrix in which each row holds the FPC scores for one subject in the data. You can make whatever plots you want with this information; see the following for an example.
res = FPCA(Ly, Lt)
res$xiEst # This is the matrix containing the FPC estimates.
Plotting the first eigenfunction:
workGrid = res$workGrid
phi1 = res$phi[,1]
plot(workGrid, phi1)
Plotting the mean function:
mu = res$mu
plot(workGrid, mu)
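A self-contained toy example tying these pieces together (the simulated data are purely illustrative, not from the original question):
library(fdapace)
set.seed(1)
tGrid <- seq(0, 1, length.out = 51)
# 20 curves observed on a common grid
Ly <- lapply(1:20, function(i) rnorm(1) * sin(2 * pi * tGrid) + rnorm(51, sd = 0.1))
Lt <- replicate(20, tGrid, simplify = FALSE)
res <- FPCA(Ly, Lt)
head(res$xiEst)                              # FPC score estimates, one row per subject
plot(res$workGrid, res$phi[, 1], type = "l") # first eigenfunction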

iterating a coxph() model using various sets of covariates

I'm still a little new to R, so this may be a basic question.
I am looking for risk estimates from a joint Cox model using coxph(). I have to run the model about 60 times with various combinations of variables. Since each iteration of the model will have different covariates (and main exposures), I want to write one function to do it. The age-adjusted model with just the main exposure runs fine, and it still runs when I add covariates; I just need a way to write a single function where "covars" can be whatever I put into the function call.
Note: this is a simplified version and it runs just fine; I just want to make it work without writing out 60 unique iterations of it.
subtype <- function(expo, covars){
  temp <- coxph(Surv(FAIL, OUTCOME) ~ joint[[expo]]*strata(EVENT2) +
                  covars +
                  cluster(ID) + strata(AGE_INT),
                na.action=na.exclude,
                data=joint)
  return(summary(temp))
}
results <- subtype("RACE", covars=...)
results2 <- subtype("GENDER", covars=...)
When I did this kind of macro programming in SAS, it was easy.
Thank you for your help.
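A minimal sketch of one way to do this (not from the original thread): build the right-hand side of the formula as a string and convert it with as.formula(), so that covars can be an arbitrary character vector. The covariate names below are hypothetical placeholders.
library(survival)
subtype <- function(expo, covars) {
  rhs <- paste(c(paste0(expo, "*strata(EVENT2)"),
                 covars,
                 "cluster(ID)", "strata(AGE_INT)"),
               collapse = " + ")
  fit <- coxph(as.formula(paste("Surv(FAIL, OUTCOME) ~", rhs)),
               data = joint, na.action = na.exclude)
  summary(fit)
}
results <- subtype("RACE", covars = c("AGE", "SMOKING"))   # placeholder covariates
results2 <- subtype("GENDER", covars = c("AGE", "BMI"))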

Taylor diagram from existing Correlation and Standard Dev values

Is it possible to create a Taylor diagram from already calculated correlation and standard deviation values?
I am doing model evaluation, and I already have the correlation and standard deviation values. I understand that there is a package, plotrix, where the diagram is created from the observed and modeled values. However, for the type of work I am doing, it is easier to start from the already-computed correlation and standard deviation values.
Is there any way I can do this in R?
There's no reason it shouldn't be possible, but the authors didn't seem to allow for it when they wrote the function. The function is a bit long and complex, but the part that does the calculation is at the top, and it is possible to swap that code out and replace it so that summary statistics can be passed in. Keep in mind that what I'm about to do is a hack, and I've only tested it with version 3.5-5 of plotrix; other versions may not work.
Here we will create a new function, taylor.diagram2, that takes all the code from taylor.diagram but adds an extra if statement to check for a list of summarized data as the first argument:
library(plotrix)
taylor.diagram2 <- taylor.diagram
bl <- as.list(body(taylor.diagram))
cond <- list(
  as.name("if"),
  quote(is.list(ref) & missing(model)),                    # condition
  quote({R <- ref$R; sd.r <- ref$sd.r; sd.f <- ref$sd.f}), # if true
  as.call(c(as.symbol("{"), bl[3:8])))                     # else
bl <- c(bl[1:2], as.call(cond), bl[9:length(bl)])          # splice in the new code
body(taylor.diagram2) <- as.call(bl)                       # update the function
Now we can test the function. First, we'll do things the standard way
#test data
aref <- rnorm(30, sd=2)
amodel1 <- aref + rnorm(30)/2
#standard behavior
taylor.diagram2(aref, amodel1, main="Standard Behavior")
#summarized data
xx <- list(
  R=cor(aref, amodel1, use="pairwise"),
  sd.r=sd(aref),
  sd.f=sd(amodel1)
)
#modified behavior
taylor.diagram2(xx, main="Modified Behavior")
So the new taylor.diagram2 function can do both. If you pass it two vectors, you get the standard behavior; if you pass it a list with the names R, sd.r, and sd.f, it draws the same plot from the values you passed in. Note that the model parameter must be missing for the modified branch to trigger, so any additional settings must be passed as named parameters rather than positional arguments.
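For example, any extra settings would be passed by name (a hypothetical call):
taylor.diagram2(xx, main="Modified Behavior", col="blue", pch=19)  # model stays missing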

Accessing class values in R's poLCA

I am trying my hand at learning latent class analysis, while also learning R. I'm using the poLCA package, and am having a bit of trouble accessing the attributes. I can run the sample code just fine:
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
ds = within(ds, (cesdcut = ifelse(cesd>20, 1, 0)))
library(poLCA)
res2 = poLCA(cbind(homeless=homeless+1,
                   cesdcut=cesdcut+1, satreat=satreat+1,
                   linkstatus=linkstatus+1) ~ 1,
             maxiter=50000, nclass=3,
             nrep=10, data=ds)
but in order to make this more useful, I'd like to access the attributes within the objects created by the poLCA class as such:
attr(res2, 'Nobs')
attr(res2, 'maxiter')
but they both come up as NULL. I expect Nobs to be 453 (determined by the function) and maxiter to be 50000 (dictated by my input value).
I'm sure I'm just being naive, but I could use any help available. Thanks a lot!
Welcome to R. You've got the model-fitting syntax right, in that you can get a model out (I don't know how latent class analysis works, so I can't speak to the statistical validity of your result). However, you've mixed up the different ways in which R can store information pertaining to a model.
poLCA returns an object of class poLCA, which is a list containing the following elements:
(. . .)
Nobs: number of fully observed cases (less than or equal to N).
maxiter: maximum number of iterations through which the estimation algorithm was set to run.
Since it's a list, you can extract individual elements from your model object using the $ operator:
res2$Nobs # number of observations
res2$maxiter # maximum iterations
In some cases, there might be extractor functions to get this information without having to do low-level indexing. For example, many model-fitting functions will have a fitted method, which pulls out the vector of fitted values on the training data; and similarly residuals pulls out the vector of residuals. You should check whether there are such extractor functions provided by the poLCA package and use them if possible; that way, you're not making assumptions about the structure of the model object that might be broken in the future.
This is distinct from getting the attributes of an object, which is what attr is for. Attributes in R are what you might call metadata: they contain R-specific information about the object itself, rather than information about whatever the object represents. Common attributes include class (the class of an object), dim (the dimensions of an array or matrix), names (the names of individual elements of a vector/list/array), and so on.
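A quick way to see the difference on the fitted object above:
names(res2)       # the list elements, including "Nobs" and "maxiter"
attributes(res2)  # R metadata: here essentially the class and element names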

Creating a correlation matrix using novel distance function

I'm trying to write a function that will create a correlation matrix using a fancier dependence estimate (dcor, the Brownian distance correlation). More generally, I want to write code for a generic "correlation" matrix into which you can plug any dependence estimator.
My data is formatted such that columns are variables and rows are observations.
I'm having problems with my basic code. My algorithm is as follows:
1. Use apply to take a variable.
2. Pass it to a function that again runs apply over the entire matrix.
3. At this point you should have a pair of variables.
4. Use na.omit to remove missing observations (necessary for dcor).
5. Calculate dcor.
I was hoping this would produce the correlation matrix, but I'm having a lot of problems with basic variable management. I'm having difficulty passing variables to the apply function. In particular, I want to take the column pulled out by the first apply and pass it to the second apply (which is applied to the entire original matrix).
My code:
dcormatrix <- function(Matrix){
  dcorhelper <- function(Col1){
    as.matrix(apply(Matrix, 2, function(Col2){
      B <- na.omit(cbind(Col1, Col2))
      dcor(B[,1], B[,2], index=1)
    }, Col1=Col1))
  }
  apply(Matrix, 2, dcorhelper(), Matrix=Matrix)
}
Any ideas? I'm sure there's gotta be an easy way to do this.
You may want to check out designdist from the vegan package. It allows one to define alternative distance / dissimilarity measures.
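For completeness, a minimal sketch (not from the original thread) of the pairwise matrix with per-pair NA handling, using dcor() from the energy package:
library(energy)
dcormatrix2 <- function(mat) {
  p <- ncol(mat)
  out <- matrix(NA_real_, p, p, dimnames = list(colnames(mat), colnames(mat)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      B <- na.omit(cbind(mat[, i], mat[, j]))  # pairwise-complete rows
      out[i, j] <- dcor(B[, 1], B[, 2])
    }
  }
  out
}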
