R 'fdrtool' package: how to use t statistic

Can one use t statistics from a Student's t-test directly in the fdrtool() function of the fdrtool package (ver. 1.2.12)? The paper (Strimmer K, BMC Bioinformatics 2008, 9:303) mentions this, but as far as I can see the statistic parameter only recognizes "normal", "correlation" and "pvalue". Is there a workaround for a non-statistician?

I think it's a typo.
I took a look at the source for the fdrtool function and found that the statistic argument first gets passed through match.arg and then to fdrtool:::get.nullmodel.
Then, lo and behold:
args(fdrtool:::get.nullmodel)
# function (statistic = c("normal", "correlation", "pvalue", "studentt"))
# NULL
and indeed there is a fully implemented case in that function for the Student t:
if (statistic == "studentt") {
    f0 = function(x, param, log = FALSE) {
        return(dt(x, df = param, log = log))
    }
    F0 = function(x, param) {
        return(pt(x, df = param))
    }
    iqr = function(param) {
        return(qt(0.75, df = param) - qt(0.25, df = param))
    }
    get.support = function() return(c(1, 1000))
}
Now, before I tell you how to access this option, I want to warn you that it's very possible it was disabled on purpose. I can't imagine why, because at first glance it seems like it should work fine. But if you're planning to use this in a research result you ought to document the fact that this was essentially a "hidden" option and that you had to do some hacking to access it. Moreover, I haven't actually tested this on my computer, so beware of typos.
Now, as for that hacking, the easiest way to get this to work would be to first simply type fdrtool into the R console. Then, copy and paste the output to a new R script (or use sink if you're fancy like that). The first few lines should look like:
function (x, statistic = c("normal", "correlation", "pvalue"),
plot = TRUE, color.figure = TRUE, verbose = TRUE, cutoff.method = c("fndr",
"pct0", "locfdr"), pct0 = 0.75)
{
statistic = match.arg(statistic)
...
Then all you have to do is change c("normal", "correlation", "pvalue") to c("normal", "correlation", "pvalue", "studentt"). That is, the first few lines should now look like
function (x, statistic = c("normal", "correlation", "pvalue", "studentt"),
plot = TRUE, color.figure = TRUE, verbose = TRUE, cutoff.method = c("fndr",
"pct0", "locfdr"), pct0 = 0.75)
{
statistic = match.arg(statistic)
...
Finally, reassign this function to fdrtool (don't worry, this won't break the underlying package, it will just act like a "mask" until you remove it with rm):
fdrtool <- function (x, statistic = c("normal", "correlation", "pvalue", "studentt"),
plot = TRUE, color.figure = TRUE, verbose = TRUE, cutoff.method = c("fndr",
"pct0", "locfdr"), pct0 = 0.75)
{
statistic = match.arg(statistic)
...
And run the whole thing or source the script. Then you should be good to go.
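If you would rather not retype the whole function, an equivalent (untested) shortcut is to take a local copy of fdrtool and widen the default choices of the statistic argument; fdrtool2 below is just a name I made up:
# Untested sketch: copy fdrtool and widen the default choices for `statistic`,
# so that match.arg() will also accept "studentt".
fdrtool2 <- fdrtool::fdrtool
formals(fdrtool2)$statistic <- c("normal", "correlation", "pvalue", "studentt")
# then, for a hypothetical vector of t statistics my.tstats:
# out <- fdrtool2(my.tstats, statistic = "studentt")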

It turns out that the maintainer of the package, Korbinian Strimmer, disabled the t-score option on purpose. The reason is that it has been used incorrectly too often.
Prof. Strimmer is a nice guy and responded to my help request quickly and very comprehensively. This is what he suggests: in practice, t-scores often do not follow a t-distribution but rather show over- or underdispersion, which is why it is better to use the "normal" option.
Before doing so, however, you will have to center your data:
z.centered = z-median(z)
fdrtool(z.centered, statistic="normal")
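Put together as a minimal sketch (the rt() call is only placeholder data standing in for your observed t-scores):
library(fdrtool)
z <- rt(200, df = 10)               # placeholder t-scores; use your own vector here
z.centered <- z - median(z)         # center on the median, as suggested above
res <- fdrtool(z.centered, statistic = "normal")
head(res$qval)                      # tail-area based Fdr (q-values)
head(res$lfdr)                      # local fdr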

Related

How to plot splom function?

I'm trying to recreate this graph:
require(car)
scatterplotMatrix(~Week+Cases+Egg.Pr+Beef.Pr+Pork.Pr+Chicken.Pr+Cereal.Pr, reg.line = lm,
smooth = TRUE, spread = FALSE, span = 0.5, id.n = 0, diagonal = 'boxplot', data = Eggs)
using the lattice package. I've been trying to make it work with this structure, but it just comes out wrong:
splom(Eggs, panel = function(Week, Cases, Egg.Pr, Beef.Pr, Pork.Pr, Chicken.Pr, Cereal.Pr) {
panel.xyplot(Week, Cases, Egg.Pr, Beef.Pr, Pork.Pr, Chicken.Pr, Cereal.Pr)
panel.lmline(Week, Cases, Egg.Pr, Beef.Pr, Pork.Pr, Chicken.Pr, Cereal.Pr)
panel.smooth(Week, Cases, Egg.Pr, Beef.Pr, Pork.Pr, Chicken.Pr, Cereal.Pr)
}, spread = FALSE, span = 0.5, id.n = 0, diagonal = 'boxplot',
data = Eggs)
Please help me out, and overexplain as much as possible.
With some guessing from context, and since you asked for overexplanation:
First, you need to refine your understanding of panel functions. They are supposed to be general purpose functions describing the plotting procedure, not specific to the data at hand. With that in mind, we can rewrite your first attempt as:
myPanel <- function(x, y, ...) {
    panel.xyplot(x, y, ...)
    panel.lmline(x, y, ...)
    panel.loess(x, y, ...)
}
Note that panel.smooth will not work, for reasons that are too complicated to get into (but basically, it's not grid-based, as lattice requires).
With this, you should be able to do
Eggs <- Duncan[-1] # placeholder to make example reproducible
splom(Eggs, panel = myPanel)
This leaves the diagonal, for which you need another panel function. Here's one suggestion which can be fine-tuned as per your requirements:
myDiagonal <- function(x, ...) {
    diag.panel.splom(x, ...)  # retain the default diagonal and build on it
    ycenter <- quantile(x, 0.25, na.rm = TRUE)
    panel.bwplot(x = x, y = rep(ycenter, length(x)),
                 box.width = 0.1 * diff(range(x, finite = TRUE)),
                 ...)
}
Combining, you can now do
splom(Eggs, panel = myPanel, diag.panel = myDiagonal)
The sprinkling of ...s is important, but I'm not going to go into why. For the rest, the documentation of the corresponding functions should help you figure out how to fine-tune.
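One small illustration of what those ...s buy you: graphical arguments supplied in the top-level call are forwarded through to the panel functions, so (as a sketch) you can restyle the points and lines without touching the panels:
splom(Eggs, panel = myPanel, diag.panel = myDiagonal,
      pch = 16, col = "grey40")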

Error calling rCBA::fpgrowth: method fpgrowth with signature (DDI)[[Ljava/lang/String; not found

I wrote the R code below to mine with the FP-Growth algorithm:
fpgabdata <- read.csv('../Agen Biasa.csv', header = FALSE)
train <- sapply(fpgabdata, as.factor)
train <- data.frame(train, check.names = TRUE)
txns <- as(train,"transactions")
abrulesfpg = rCBA::fpgrowth(txns, support = 0.25, confidence = 0.5, maxLength = 10, consequent = NULL, verbose = TRUE, parallel = TRUE)
But I get the following error:
Error in .jcall(jPruning, "[[Ljava/lang/String;", "fpgrowth", support, :
method fpgrowth with signature (DDI)[[Ljava/lang/String; not found
These are my data:
The reason you are seeing this error is that the current implementation of the FP-growth algorithm in rCBA requires that you specify a value for the consequent (right hand side).
For example, the following should work, assuming you have sensible thresholds for support and confidence:
abrulesfpg = rCBA::fpgrowth(
txns,
support = 0.25,
confidence = 0.5,
maxLength = 10,
consequent = "SPIRULINA",
verbose = TRUE,
parallel = TRUE
)
I know the OP is likely to have discovered this by now, but I've answered this just in case anyone else encounters the same error.
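If you are unsure what to pass as the consequent, looking at the item labels of the transactions object is a reasonable first step (a sketch; itemLabels() comes from the arules package that txns was built with):
library(arules)
head(itemLabels(txns))   # lists the items available in the data; pick the one
                         # you want the rules to predict on the right-hand side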

Log chart in blotter function chart.Posn() possible?

Is it possible to draw a log price chart in the chart.Posn() or chart.Reconcile() functions of blotter? I tried adding log.scale = TRUE to the function call without success. Is the underlying chart_Series function still too "experimental" to support this functionality or is the function call not correct?
chart.Posn(Portfolio = portfolio.st, Symbol = "GSPC", log.scale = TRUE)
Update: I have been trying to use the chart_Series() function directly, setting the ylog graphical parameter:
par(ylog=TRUE)
chart_Series(Cl(GSPC))
But I receive an error "log scale needs positive bounds" despite the data being all positive.
By the way, GSPC is an OHLCV time-series xts object of the S&P 500 that plots fine in chartSeries() and chart_Series(), just not with a log scale in either charting function.
I found this old post not as a solution but as an alternative:
Does chart_Series() work with logarithmic axis?
I don't think there is any parameter like log.scale that chart_Series recognises. You could simply do chart_Series(log(Cl(GSPC))). You could also make some basic modifications to chart.Posn to put things on the log scale. Use the source code for chart.Posn as a starting point.
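As a quick, reproducible sketch of that simple approach (fetching ^GSPC via quantmod is just one way to get an OHLCV xts object; substitute your own data):
library(quantmod)
getSymbols("^GSPC")           # creates an xts object named GSPC in the workspace
chart_Series(log(Cl(GSPC)))   # chart the log of the closing prices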
Here is an example of a modified function you could make. You can obviously modify it further in any way you please.
# We need an example. So,
# Source this code from the directory containing quantstrat, or at least source the macd.R demo in quantstrat.
source("demo/macd.R")
log.chart.Posn <- function(Portfolio, Symbol, Dates = NULL, env = .GlobalEnv) {
    pname <- Portfolio
    Portfolio <- getPortfolio(pname)
    x <- get(Symbol, env)
    Prices <- log(x)
    chart_Series(Prices)
    # browser()
    if (is.null(Dates)) Dates <- paste(first(index(Prices)), last(index(Prices)), sep = '::')
    # scope the data by Dates
    Portfolio$symbols[[Symbol]]$txn <- Portfolio$symbols[[Symbol]]$txn[Dates]
    Portfolio$symbols[[Symbol]]$posPL <- Portfolio$symbols[[Symbol]]$posPL[Dates]
    Trades = Portfolio$symbols[[Symbol]]$txn$Txn.Qty
    Buys = log(Portfolio$symbols[[Symbol]]$txn$Txn.Price[which(Trades > 0)])
    Sells = log(Portfolio$symbols[[Symbol]]$txn$Txn.Price[which(Trades < 0)])
    Position = Portfolio$symbols[[Symbol]]$txn$Pos.Qty
    if (nrow(Position) < 1) stop('no transactions/positions to chart')
    if (as.POSIXct(first(index(Prices))) < as.POSIXct(first(index(Position))))
        Position <- rbind(xts(0, order.by = first(index(Prices) - 1)), Position)
    Positionfill = na.locf(merge(Position, index(Prices)))
    CumPL = cumsum(Portfolio$symbols[[Symbol]]$posPL$Net.Trading.PL)
    if (length(CumPL) > 1)
        CumPL = na.omit(na.locf(merge(CumPL, index(Prices))))
    else
        CumPL = NULL
    if (!is.null(CumPL)) {
        CumMax <- cummax(CumPL)
        Drawdown <- -(CumMax - CumPL)
        Drawdown <- rbind(xts(-max(CumPL), order.by = first(index(Drawdown) - 1)), Drawdown)
    } else {
        Drawdown <- NULL
    }
    if (!is.null(nrow(Buys)) && nrow(Buys) >= 1) (add_TA(Buys, pch = 2, type = 'p', col = 'green', on = 1))
    if (!is.null(nrow(Sells)) && nrow(Sells) >= 1) (add_TA(Sells, pch = 6, type = 'p', col = 'red', on = 1))
    if (nrow(Position) >= 1) {
        (add_TA(Positionfill, type = 'h', col = 'blue', lwd = 2))
        (add_TA(Position, type = 'p', col = 'orange', lwd = 2, on = 2))
    }
    if (!is.null(CumPL)) (add_TA(CumPL, col = 'darkgreen', lwd = 2))
    if (!is.null(Drawdown)) (add_TA(Drawdown, col = 'darkred', lwd = 2, yaxis = c(0, -max(CumMax))))
    plot(current.chob())
}
log.chart.Posn(Portfolio = portfolio.st, Symbol = "AAPL", Dates = NULL, env = .GlobalEnv)
add_MACD() # Simply added to make the plot almost identical to what is in demo/macd.R
[Images in the original post show the original chart and the new plot with log scales.]

theta.sparse error with lordif

I was wondering whether anyone can help me out.
I am trying to run a DIF analysis on my data but keep getting a theta.sparse error, which I am unsure how to fix. I would really appreciate any help that you can give me.
library(lordif)
dat<- read.csv2("OPSO.csv",header=TRUE)
datgender <- read.csv2("DATA.csv",header=TRUE)
group<-datgender$Gender
sink("outputDIFopso.txt")
gender.difopso <- lordif(dat, group, selection = NULL,
criterion = c("Chisqr", "R2", "Beta"),
pseudo.R2 = c("McFadden", "Nagelkerke", "CoxSnell"), alpha = 0.01,
beta.change = 0.1, R2.change = 0.02, maxIter = 10, minCell = 5,
minTheta = -4, maxTheta = 4, inc = 0.1, control = list(), model = "GRM",
anchor = NULL, MonteCarlo = FALSE, nr = 100)
print(gender.difopso)
summary(gender.difopso)
sink()
pdf("graphtestop.pdf")
plot(gender.difopso)
dev.off()
dev.off()
Error in lordif(dat, group, selection = NULL, criterion = c("Chisqr", :
object 'theta.sparse' not found
Thank you :)
You should check the output just before that error. It will probably say that you have no items flagged for DIF. When that is the case, you should just run the mirt function yourself and extract the theta and ipar objects as necessary.
The author could add some case handling for when compare(flags, flags.matrix) is true. At the very least, a warning seems to be missing for the case where no items show DIF, in the same way that it warns
if (ndif == ni) {
warning("all items got flagged for DIF - stopping\n")
}
but there is no corresponding handling when ndif == 0, even though compare(flags, flag.matrix) evaluates to TRUE.
The implication when all or none of the items show DIF is that you would get the same results (the same ICC plots, the same inferences, etc.) by fitting mirt on the combined sample (no DIF) or separate mirt models for each group (all DIF). So it is a correct, time-saving shortcut to just bypass lordif when it breaks down in this way.
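As a rough sketch of that fallback (this assumes the mirt package; itemtype = "graded" mirrors the model = "GRM" call above, and the object names are only illustrative):
library(mirt)
fit <- mirt(dat, model = 1, itemtype = "graded")   # one-factor graded response model on the pooled sample
theta <- fscores(fit)                              # person (theta) estimates
ipar <- coef(fit, simplify = TRUE)$items           # item parameter matrix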

Hiding output from console

My code contains a command which is essential for it to run; however, it ends up showing the result of this command in the console. I have tried suppressWarnings(), suppressMessages(), invisible() and sink(), but all of these still show the result.
Here is an example data set and where the problem originates from:
M<-c(1111,1222,1333,1444,1555,1666,1777,2223,6654,9867,1123,1456,2436,6875)
fstAdi <- ets(ts(rep(M,length = length(M)), deltat= 1/4, start = c(8,1)), model = "AAA", damped = FALSE, opt.crit = "mae", ic="aic", lower = c(0, 0, 0, 0), upper = c(0.999, 0.999, 0.999, 0.999), bounds = "admissible", restrict = FALSE)
mae11Ad<-summary(fstAdi)[,"MAE"]
The last line of the code above always prints the summary in the console, which causes problems when automating this for a report. Does anyone know of a command which can stop this happening?
Thank you.
I have found a way to hide it while automating the report, but if anyone knows how to hide it while just running the code, to make the process quicker, that would still be really helpful :)
The summary method for ets objects is a bit verbose:
> forecast:::summary.ets
function (object, ...)
{
    print(object)
    cat("\nTraining set error measures:\n")
    print(accuracy(object))
}
<bytecode: 0x161d31c8>
<environment: namespace:forecast>
This is pretty bad style: summary methods should return a classed object, and the print method for that class should produce the output.
So you could just call the accuracy method on your object:
> accuracy(fstAdi)[,"MAE"]
[1] 1971.468
which has the advantage of not needing any diversion of output and is more readable.
sink does work; how exactly are you using it? Try, for example:
M<-c(1111,1222,1333,1444,1555,1666,1777,2223,6654,9867,1123,1456,2436,6875)
fstAdi <- ets(ts(rep(M,length = length(M)), deltat= 1/4, start = c(8,1)), model = "AAA", damped = FALSE, opt.crit = "mae", ic="aic", lower = c(0, 0, 0, 0), upper = c(0.999, 0.999, 0.999, 0.999), bounds = "admissible", restrict = FALSE)
sink(tempfile())
mae11Ad<-summary(fstAdi)[,"MAE"]
sink()
I know that probably looks like a bad-style approach, but this seems to work:
sink(tempfile())
mae11Ad <- summary(fstAdi)[,"MAE"]
sink()   # restore normal console output afterwards
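Another option along the same lines is capture.output(), which swallows whatever the expression prints while still letting the assignment happen (a sketch):
invisible(capture.output(mae11Ad <- summary(fstAdi)[, "MAE"]))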
