Plotting coda mcmc objects giving error in plot.new - r

I have an R package I am working on that returns output from a Metropolis-Hastings sampler. The output consists of, among other things, matrices where the columns are the variables and the rows are the samples from the posterior. I convert these into coda mcmc objects with this code:
colnames(results$beta) = x$data$Pops
results$beta = mcmc(results$beta, thin = thin)
where thin is 183 and beta is a 21 x 15 matrix (this is a toy example). The summary.mcmc method works fine, but plot.mcmc gives me:
Error in plot.new() : figure margins too large
I have done a bit of debugging. All the values are finite, there are no NAs, the axis limits seem to be set correctly, and there seem to be enough panels (two plots, each with 4 rows and 2 columns). Is there something I am missing in the coercion to an mcmc object?
Package source and all associated files can be found on http://github.com/jmcurran/rbayesfst. A script which will produce the error quickly is in the unexported function mytest, so you'll need
rbayesfst:::mytest()
to get it to run.
There has been a suggestion that this has already been answered in this question, but I would like to point out that it is not me setting any of the par values; plot.mcmc does that. So my question is not about par or plot, but about what (if anything) I am doing wrong in turning a matrix into an mcmc object that plot.mcmc cannot plot. It can't be the size of the matrix, because I have had examples with many more dimensions, straight from rjags, that plotted fine.
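One hedged thing worth ruling out first: "figure margins too large" usually comes from the graphics device being too small for the multi-panel layout plot.mcmc sets up with par(mfrow), not from the mcmc object itself. A minimal base-R sketch (no coda needed, synthetic traces as stand-ins) that renders the same 4 x 2 layout to a generously sized off-screen device:

```r
## If this succeeds, the layout itself is fine and the interactive
## device size is the likely culprit, not the mcmc coercion.
f <- tempfile(fileext = ".pdf")
pdf(f, width = 10, height = 12)              # plenty of room for 4 x 2 panels
par(mfrow = c(4, 2), mar = c(4, 4, 2, 1))    # the layout plot.mcmc would use
for (i in 1:8)                               # stand-ins for 8 trace plots
  plot(cumsum(rnorm(100)), type = "l", main = paste("trace", i))
dev.off()
file.exists(f)
```

If the off-screen version works, enlarging the plotting window (or plotting fewer variables per page) should make the interactive plot work too.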

Related

How to calculate the topographic error for self-organising maps using the kohonen package in R?

I'm working with the kohonen package (version 3.0.11) in R for applying the self-organising maps algorithm to a large data set.
In order to determine the optimal grid size, I tried to calculate both the quantisation error and the topographic error at various grid sizes, to see at which size their normalised sum is minimal.
Unfortunately, whenever I run the topo.error() function, I get an error and I'm wondering if the function is still usable after version 2.0.19 of the package (that's the latest version for which I found documentation about the topo.error function).
I know other packages such as aweSOM have similar functions, but the kohonen::topo.error() function only uses the data set and grid parameters as arguments, and not the trained SOM model, saving a substantial amount of computation time.
Here is a minimal reproducible example with the output error:
Code
library('kohonen')
data(yeast)
set.seed(261122)
## take only complete cases
X <- yeast[[3]][apply(yeast[[3]], 1, function(x) sum(is.na(x))) == 0,]
yeast.som <- som(X, somgrid(5, 8, "hexagonal"))
## quantization error
mean(yeast.som$distances)
## topographical error
topo.error(yeast.som, "bmu")
Output
Error in topo.error(yeast.som, "bmu") :
could not find function "topo.error"
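Since topo.error() no longer appears to be exported from kohonen 3.x, one hedged fallback is to compute the topographic error directly: the fraction of observations whose best- and second-best-matching units are not direct neighbours on the map grid. The sketch below uses random stand-in data, a stand-in codebook, and a rectangular 5 x 8 grid; with a real model you would substitute yeast.som$codes[[1]] for the codebook and yeast.som$grid$pts for the unit coordinates, and the neighbour test would need adjusting for hexagonal layouts.

```r
set.seed(261122)
X     <- matrix(rnorm(200 * 4), ncol = 4)          # stand-in for the yeast data
grid  <- as.matrix(expand.grid(x = 1:5, y = 1:8))  # unit coordinates, 40 units
codes <- matrix(rnorm(nrow(grid) * 4), ncol = 4)   # stand-in codebook vectors

topo_error <- function(X, codes, grid) {
  mean(apply(X, 1, function(v) {
    d   <- colSums((t(codes) - v)^2)   # squared distance to every unit
    bmu <- order(d)[1:2]               # best and second-best matching units
    ## count an error when the two units are more than one grid step apart
    sqrt(sum((grid[bmu[1], ] - grid[bmu[2], ])^2)) > 1.01
  }))
}
te <- topo_error(X, codes, grid)       # a proportion between 0 and 1
```

This only needs the trained map and the data, so it sidesteps the missing function entirely.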

Extrapolate capped values in heat matrix

I have a number of heatmaps (example below), from each of which I extract a value matrix. My problem is that, in the images, values above a certain threshold (in this case 200) are capped at that threshold and shown as a fuchsia color. I'm trying to extrapolate these capped values. I tried replacing 200 with NA and then using na.approx and na.spline from the zoo package and approxExtrap from the Hmisc package, as well as column-wise loess regression. Loess was the only technique that yielded values above 200 at all, but still nowhere near the actual values (which I have for a few images). Any ideas?
Okay, I was able to do this with moderate success using the interp() function from the akima package with the flags linear = FALSE, extrap = TRUE. It took a full 30 seconds per image, performing perfectly on some images but tending to overestimate when the fuchsia region was too large.
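For anyone hitting the same wall, here is a base-R sketch of the column-wise idea on a synthetic stand-in column: mark the capped cells as missing, refit a smooth through the uncapped cells, and predict across the gap. akima::interp(..., linear = FALSE, extrap = TRUE) applies the same idea in two dimensions; the peak shape and cap value here are made up for illustration.

```r
set.seed(1)
x      <- 1:100
truth  <- 150 + 80 * exp(-((x - 50)^2) / 400)  # synthetic column, true peak ~230
capped <- pmin(truth, 200)                     # what the heatmap actually stores

y <- capped
y[y >= 200] <- NA                              # treat capped cells as missing
fit <- loess(y ~ x, span = 0.5)                # fits on the uncapped cells only
est <- predict(fit, newdata = data.frame(x = x))
max(est, na.rm = TRUE)                         # rises back above the 200 cap
```

As with any extrapolation, the refit recovers the shape but not necessarily the true magnitude, which matches the overestimation seen with large fuchsia regions.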

R clValid function Error for huge dataset

I'm trying to evaluate my clustering results using this package. I run the following, but it gives me an error:
intern <- clValid(test_clvalid, 3:25, maxitems = 260000, clMethods="kmeans", validation="internal")
Error in hclust(Dist, method) : size cannot be NA nor exceed 65536
test_clvalid is my data set, it has 256342 observations with 5 numeric variables.
When I ran the same with fewer observations, it seemed to run fine. I'm not sure why hclust() is called, and gives an error, when I specify k-means evaluation.
Unfortunately that package uses hclust to initialize the input to kmeans, as you can see here. That also means that, before that, the cross-distance matrix is calculated, which has 256,342 x 256,342 entries for your whole dataset. The hclust function is hard-coded to deal with matrices that are at most 65536 x 65536, so you won't be able to use that package to evaluate k-means on your full data.
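One hedged workaround, assuming internal validation on a random subset is acceptable for choosing k: validate on a subsample small enough for hclust's limit, then fit the chosen k on the full data with base-R kmeans. The matrix below is a random stand-in for test_clvalid, and the clValid call is left commented since it needs the clValid package.

```r
set.seed(42)
big <- matrix(rnorm(260000 * 5), ncol = 5)  # stand-in for test_clvalid
idx <- sample(nrow(big), 10000)             # comfortably under the 65536 cap
sub <- big[idx, ]

## intern <- clValid::clValid(sub, 3:25, clMethods = "kmeans",
##                            validation = "internal")
## summary(intern)  # pick the best k from the subset ...

## ... then fit that k on the full data with base-R kmeans
km <- kmeans(big, centers = 3, nstart = 2)
```

With 256k observations a 10k subsample usually gives stable internal indices, though that is a judgment call for your data.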

Predicting in a Bayesian Network in R -- message: consider using the 'smooth' argument

I've gotten code figured out to predict probabilities in each category of the target node. However, when I try to adapt it to a different dataset (just more nodes of similar type data), when I try run the predict function, it gives this message (in black text, not red like an error or warning):
The evidence for row 25 has probability smaller than 0.00000 in the
model. Consider using the 'smooth' argument when building the network.
Exiting...
Here is the code:
library("bnlearn")
pigment.test <- read.table("pigment_test.csv", sep=",", header=T)
bn.mle <- bn.fit(dg_pigment, data=pigment.test[,2:19], method="mle", smooth=0.01)
bn.grain <- as.grain(bn.mle)
predict.mle <- predict(bn.grain, "pigment", newdata=pigment_val[,2:18],
type="distribution")
predict.mle
I tried putting smooth in the bn.fit or the as.grain call, but it says it's an unused argument. This happens for other rows too (not just 25 when I remove that one). Does anyone know where I could include a smooth argument here? Or is there a different function? I was trying to have the program calculate the conditional probabilities automatically rather than building the tables by hand.
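One hedged alternative on the bnlearn side: bn.fit() has no smooth argument, but method = "bayes" with a small imaginary sample size, e.g. bn.fit(dg_pigment, data = ..., method = "bayes", iss = 1), plays the same role of keeping every conditional probability strictly positive. What the message's 'smooth' refers to can be illustrated in base R with hypothetical counts: add a pseudo-count to the raw counts before normalising, so no CPT cell is exactly zero.

```r
## Hypothetical counts for one CPT column; the zero count is what
## triggers "probability smaller than 0.00000" at prediction time.
counts <- c(yes = 12, no = 0, maybe = 3)
mle    <- counts / sum(counts)        # unsmoothed: P(no) is exactly 0
alpha  <- 0.01                        # the pseudo-count 'smooth' would add
smoothed <- (counts + alpha) / sum(counts + alpha)
all(smoothed > 0)                     # no zero cells survive smoothing
```

With no zero cells, evidence rows can no longer have probability exactly zero under the model, which is what made predict bail out.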

plotting loess with standard errors in R causes integer overflow

I am attempting to use predict with a loess object in R. There are 112406 observations. One particular line inside stats:::predLoess attempts to multiply N*M1 where N = M1 = 112406. This causes an integer overflow and the function bombs out. The line of code that does this is the following (copied from the predLoess source):
L <- .C(R_loess_ise, as.double(y), as.double(x),
        as.double(x.evaluate[inside, ]), as.double(weights),
        as.double(span), as.integer(degree), as.integer(nonparametric),
        as.integer(order.drop.sqr), as.integer(sum.drop.sqr),
        as.double(span * cell), as.integer(D), as.integer(N),
        as.integer(M1), double(M1), L = double(N * M1))$L
Has anyone solved this or found a solution to this problem? I am using R 2.13. The name of this forum is fitting for this problem.
It sounds like you're trying to get predictions for all N=112406 observations. First, do you really need to do this? For example, if you want graphical output, it's faster just to get predictions on a small grid over the range of your data.
If you do need 112406 predictions, you can split your data into subsets (say of size 1000 each) and get predictions on each subset independently. This avoids forming a single gigantic matrix inside predLoess.
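A base-R sketch of the chunked route, with a much smaller n standing in for the 112406 rows: each call only ever needs an N x chunk-size workspace inside predLoess instead of N x N, so the N * M1 product stays small.

```r
set.seed(1)
n <- 2000                            # stand-in for the 112406 observations
d <- data.frame(x = runif(n))
d$y <- sin(2 * pi * d$x) + rnorm(n, sd = 0.1)
fit <- loess(y ~ x, data = d)

## predict with standard errors in chunks of 500 rows at a time
chunks <- split(seq_len(n), ceiling(seq_len(n) / 500))
pred <- unlist(lapply(chunks, function(i)
  predict(fit, newdata = d[i, , drop = FALSE], se = TRUE)$fit))
length(pred) == n                    # one prediction per original row
```

The chunk size is a tuning knob: smaller chunks use less memory per call at the cost of more calls.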
