Taylor diagram from existing Correlation and Standard Dev values - r

Is it possible to create a Taylor diagram from already calculated correlation and standard deviation values?
I am doing model evaluation and already have the correlation and standard deviation values. I understand that the plotrix package can create the diagram when given the observed and modeled values. However, for the type of work I am doing, it is easier to start from the already computed correlation and standard deviation values.
Is there any way I can do this in R?

There's no reason it shouldn't be possible, but the authors didn't allow for it when they wrote the function. The function is fairly long and complex, but the part that does the calculation is at the top, so it is possible to swap that code out and replace it with something that accepts pre-computed summary statistics. Keep in mind that what follows is a hack, and I've only tested it with version 3.5-5 of plotrix; other versions may not work.
Here we will create a new function, taylor.diagram2, that takes all the code from taylor.diagram but adds an extra if statement to check whether the first argument is a list of summarized values.
library(plotrix)

taylor.diagram2 <- taylor.diagram
bl <- as.list(body(taylor.diagram))
cond <- list(
    as.name("if"),
    quote(is.list(ref) & missing(model)),                     # condition
    quote({R <- ref$R; sd.r <- ref$sd.r; sd.f <- ref$sd.f}),  # if true
    as.call(c(as.symbol("{"), bl[3:8])))                      # else
bl <- c(bl[1:2], as.call(cond), bl[9:length(bl)])             # splice in new code
body(taylor.diagram2) <- as.call(bl)                          # update function
Now we can test the function. First, we'll do things the standard way
#test data
aref<-rnorm(30,sd=2)
amodel1<-aref+rnorm(30)/2
#standard behavior function
taylor.diagram2(aref, amodel1, main="Standard Behavior")
#summarized data
xx <- list(
    R = cor(aref, amodel1, use = "pairwise"),
    sd.r = sd(aref),
    sd.f = sd(amodel1)
)
#modified behavior
taylor.diagram2(xx, main="Modified Behavior")
So the new taylor.diagram2 function can do both. If you pass it two vectors, it behaves as before; if you pass it a list with the names R, sd.r, and sd.f, it draws the same plot using the values you supplied. Note that the model parameter must be missing for the modified behavior to trigger, so any additional parameters must be passed by name rather than by position.
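For example, extra options have to be supplied by name; col and pch below are ordinary taylor.diagram arguments:
# model is left missing, so everything after the list must be named
taylor.diagram2(xx, main = "Modified Behavior", col = "blue", pch = 17)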

Related

Changing the output of a function in R

I have created a function to generate random Poisson samples, but I also want the function to calculate the mean of lambda.
xpoisson<-function(x,mu){rpois(x, lambda=mu)}
This is what I have written so far, and I'm not sure where/how to add mean(mu) to the function. Any help appreciated.
Functions can only return one object. To return two values you need, for instance, to wrap them in a list. Try the following:
xpoisson <- function(x, mu){
    list(rpois(x, lambda = mu), mean(mu))
}
I believe what you want is not mean(mu) but mean(rpois(x, lambda=mu))?
If so, just add a new line inside the function, assign the sample and its mean to variables, and return both values by putting them in a list and returning that list, as sketched below.
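A minimal sketch of that version, returning the simulated sample and its mean as a named list:
xpoisson <- function(x, mu){
    draws <- rpois(x, lambda = mu)        # the simulated sample
    list(sample = draws, mean = mean(draws))
}

res <- xpoisson(100, mu = 3)
res$mean   # empirical mean of the 100 draws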

R: parameter in update function

Here is a snippet of R script doing beta regression on data "GasolineYield":
library("betareg")
data("GasolineYield", package = "betareg")
gy_logit <- betareg(yield ~ batch + temp, data = GasolineYield)
gy_logit4 <- update(gy_logit, subset = -4)
The 4th line magically deletes the 4th observation and updates the fit automatically, but I don't quite understand why this parameter works in the update function here: I tried to look up the documentation with ?update but couldn't find such a parameter.
I'm curious how to find the right documentation in this case, because maybe I want to add new observations instead of removing one. Any help?
subset in betareg works the same as subset in lm, therefore you can read lm documentation.
From the help file you can find:
subset: an optional vector specifying a subset of observations to be used in the fitting process.
Hence by setting subset = -4 you are leaving out the fourth row in the estimation.
update() has a ... parameter, which means any arguments that are not matched in your call to update() are passed on to the function that does the estimation. In this case that is betareg(), which does have a subset argument.
This kind of thing is very common in R. Many higher-level functions that call other user-visible functions have the three-dot parameter and pass any unmatched arguments on, so you have to look at all the user-visible functions that get called in order to know all the possible options.
You can check the help file for the top-level function (update() in this case) to get an idea of which functions receive the leftover arguments.
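A minimal sketch of the mechanism; fit_wrapper is a made-up helper (not part of betareg or update()) that forwards unmatched arguments to lm():
# Any argument not named in the wrapper's signature falls into `...`
# and is passed straight through to lm(), including subset.
fit_wrapper <- function(formula, data, ...) {
    lm(formula, data = data, ...)
}

fit_all <- fit_wrapper(mpg ~ wt, data = mtcars)
fit_sub <- fit_wrapper(mpg ~ wt, data = mtcars, subset = -4)  # drops row 4

nobs(fit_all)  # 32
nobs(fit_sub)  # 31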

Combining ROCR performance objects

I have multiple performance objects created using ROCR. Each of them contains auc or fpr/tpr values for a class, and each in turn holds results for multiple test runs. So,
length(first.perf.obj@y.values)
gives something > 1.
I can plot average for a single class using
plot(first.perf.obj, avg="vertical")
as described in the ROCR manual. I want to combine these objects to calculate and plot their global average. Something like
global.perf.obj <- combine.perf.objects(first.perf.obj, second.perf.obj, third.perf.obj)
Is there an easy way to do this, or should I decompose each object and calculate values by hand?
I went back to recreating prediction objects for the global case.
I'm calling the prediction function like
global.prediction <- prediction(
    c(cls1.likelihood,
      cls2.likelihood,
      cls3.likelihood,
      cls4.likelihood,
      cls5.likelihood),
    c(duplicate.cols(cls1.labels, ncol(cls1.likelihood)),
      duplicate.cols(cls2.labels, ncol(cls2.likelihood)),
      duplicate.cols(cls3.labels, ncol(cls3.likelihood)),
      duplicate.cols(cls4.labels, ncol(cls4.likelihood)),
      duplicate.cols(cls5.labels, ncol(cls5.likelihood))),
    label.ordering = c(FALSE, TRUE))
where duplicate.cols simply builds a data.frame of repeating labels.
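The post doesn't show duplicate.cols, but based on that description a possible definition might be:
# Hypothetical helper: repeat the label vector once per column
# of the corresponding likelihood matrix.
duplicate.cols <- function(labels, n) {
    as.data.frame(replicate(n, labels))
}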
Then I'm able to get any statistic for the global case by e.g. performance(global.prediction, "auc")
It's a bit slow, but I think it's simpler than trying to combine values from multiple performance objects.

Returning Multiple Output Parameters from Optim

I'm running an optimisation routine using optim in R and telling the program what I want returned. For example, if I use return(op1$par), it returns all four of my parameter values. That's fine, and if I use return(op1), I obviously get all the information from the optimisation routine (par, value, convergence, etc.). However, in that format the par values aren't accessible in the output; it simply reports that there are four values.
What I need is to get the parameter values and the convergence information at the same time. R won't let me call return(op1$par, op1$convergence), so I'm looking for the best way to get these two pieces of information in one run.
I should specify that I'm writing this to a file for thousands of iterations, not just looking at it once on screen.
Cheers
Try something like this:
return(c(Parameters=op1$par, Convergence=op1$convergence))
The names Parameters and Convergence are only there to identify which elements are the parameters and which is the convergence code, since the result will be a single vector.
By design, a function can return only one object (otherwise assignments like a <- fn(b) would be ambiguous: which result gets assigned?). But that object can be a vector or a list (which is what optim returns). So wrap your results in something like
return(c(par=op1$par, convergence=op1$convergence))
or more generally (for objects of different types),
return(list(par=op1$par, convergence=op1$convergence))
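As a minimal sketch with a toy objective function (fn below is made up purely for illustration), the named-vector form is convenient when writing many iterations to a file:
fn <- function(p) sum((p - c(1, 2, 3, 4))^2)   # toy objective, minimum at (1, 2, 3, 4)
op1 <- optim(rep(0, 4), fn)

res <- c(par = op1$par, convergence = op1$convergence)
res
# a named vector (par1 ... par4, convergence), easy to rbind()
# across iterations and write out with write.csv() at the end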

R: partimat function doesn't recognize my classes

I am a relatively novice R user attempting to use the partimat() function from the klaR package to plot decision boundaries for a linear discriminant analysis, but I keep encountering the same error. I have tried passing the arguments several different ways according to the manual, but keep getting the following error:
Error in partimat.default(x, grouping, ...) :
at least two classes required
Here is an example of the input I've given:
partimat(sources1[, c(3:19)], grouping = sources1[, 2], method = "lda", prec = 100)
where my data table is loaded under the name "sources1", with columns 3 through 19 containing the explanatory variables and column 2 containing the classes. I have also tried entering it as a formula, like so:
partimat(sources1$group~sources1$tio2+sources1$v+sources1$cr+sources1$co+sources1$ni+sources1$rb+sources1$sr+sources1$y+sources1$zr+sources1$nb+sources1$la+sources1$gd+sources1$yb+sources1$hf+sources1$ta+sources1$th+sources1$u,data=sources1)
with these being the column headings.
I have successfully run an LDA on this same data set without issue so I'm not quite sure what is wrong.
From the source code of partimat.default (see getAnywhere(partimat.default)), it contains
if (nlevels(grouping) < 2)
    stop("at least two classes required")
Therefore maybe you haven't defined your grouping column as a factor variable. What do you get if you try summary(sources1[,2])? If it's not a factor, try
sources1[,2] <- as.factor(sources1[,2])
Or, for the second method, try removing the "sources1$" from each of your variable names in the formula (a corrected call is sketched below), since you already tell it which data frame to look those names up in via the data argument. I think you are effectively specifying the data frame twice, and it might be looking, for instance, for
"sources1$sources1$groups"
Rather than
"sources1$groups"
Without further error messages or a reproducible example (i.e. include some data in your post) it's hard to say really.
HTH
