When I use the list function:
el_nino_1974_2000_all <- list()
for (k in seq_along(el_nino_start_month)){
el_nino_1974_2000_all[[k]] = window(Nino3.4_Flow_1974_2000_zoo,
start = (as.Date(el_nino_1974_2000[k,]$el_nino_start_month)),
end = (as.Date(el_nino_1974_2000[k,]$el_nino_finish_month)))
}
This gives a list of separate data subsets, indexed from k = 1. However, I want to merge all the subsets into one data set, in either zoo or data frame format.
This is the structure of el_nino_1974_2000_all.
> str(el_nino_1974_2000_all)
List of 7
$ :‘zoo’ series from 1976-08-15 to 1977-01-15
Data: num [1:6, 1:2] 0.519 0.874 0.886 0.823 0.734 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:6], format: "1976-08-15" "1976-09-15" ...
$ :‘zoo’ series from 1982-05-15 to 1983-06-15
Data: num [1:14, 1:2] 0.961 1.388 0.959 1.171 1.564 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:14], format: "1982-05-15" "1982-06-15" ...
$ :‘zoo’ series from 1986-09-15 to 1988-01-15
Data: num [1:17, 1:2] 0.974 1.089 1.322 1.273 1.313 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:17], format: "1986-09-15" "1986-10-15" ...
$ :‘zoo’ series from 1991-05-15 to 1992-07-15
Data: num [1:15, 1:2] 0.68 1 0.923 0.773 0.68 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:15], format: "1991-05-15" "1991-06-15" ...
$ :‘zoo’ series from 1993-02-15 to 1993-07-15
Data: num [1:6, 1:2] 0.54 0.641 1.01 1.144 0.917 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:6], format: "1993-02-15" "1993-03-15" ...
$ :‘zoo’ series from 1994-08-15 to 1995-02-15
Data: num [1:7, 1:2] 0.662 0.746 1.039 1.329 1.301 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:7], format: "1994-08-15" "1994-09-15" ...
$ :‘zoo’ series from 1997-04-15 to 1998-05-15
Data: num [1:14, 1:2] 0.601 1.136 1.461 1.668 2.079 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:2] "Nino3.4_degree_1974_2000" "Houlgrave_flow_1974_2000"
Index: Date[1:14], format: "1997-04-15" "1997-05-15" ...
>
Sorry, I don't know how to do the formatting.
If the dates don't overlap, you can stick these together using rbind (since the number of columns is the same for each component). Try:
el_nino_1974_2000_all <- c()
for (k in seq_along(el_nino_start_month)){
el_nino_1974_2000_all <- rbind(el_nino_1974_2000_all,window(...))
}
Use this instead of the list construction you originally had.
This will return a zoo object.
If you want to return a data.frame, try using rbind, but with data.frame to convert your objects (this will work even if date indices overlap between each of your datasets):
el_nino_1974_2000_all <- data.frame()
for (k in seq_along(el_nino_start_month)){
el_nino_1974_2000_all <- rbind(el_nino_1974_2000_all,data.frame(window(...)))
}
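If the list of subsets has already been built, the same rbind idea can be applied to it in one step. The following is only a sketch on made-up data: the series z, the column names a and b, and the two windows are assumptions for illustration, and it relies on the windows not overlapping.

```r
library(zoo)

# Hypothetical monthly series with two columns (stand-in for the real data)
dates <- seq(as.Date("1976-01-15"), by = "month", length.out = 12)
z <- zoo(matrix(1:24, ncol = 2, dimnames = list(NULL, c("a", "b"))), dates)

# Two non-overlapping windows, stored in a list as in the question
pieces <- list(
  window(z, start = as.Date("1976-02-15"), end = as.Date("1976-04-15")),
  window(z, start = as.Date("1976-08-15"), end = as.Date("1976-09-15"))
)

# Bind all list elements into a single zoo object
merged <- do.call(rbind, pieces)

# Optional: convert to a data frame that keeps the dates as a column
merged_df <- data.frame(date = index(merged), coredata(merged))
```

Keeping the dates as an explicit column avoids losing them when the zoo index is dropped.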
Have you tried this function?
http://rss.acs.unt.edu/Rdoc/library/gtools/html/smartbind.html
out <- smartbind(list_of_dataframes)
Note: list_of_dataframes should contain data.frames, but you can just convert your data to data frames on the fly and then use this function.
I'm trying to plot Hellinger-transformed data with ggplot. autoplot works fine for a regular prcomp result, but not for the Hellinger-transformed ordination. How can I plot the Hellinger-transformed data using ggplot?
library(ggfortify)
library(vegan)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
autoplot(pca_res, data = iris, colour = 'Species',
loadings = TRUE, loadings.colour = 'blue',
loadings.label = TRUE, loadings.label.size = 3)
##Hellinger Transformation
df.hell <- decostand(df, method = "hellinger")
df.hell <- rda(df.hell)
ggplot2::autoplot(df.hell)
autoplot(df.hell, data = iris, colour = 'Species',
loadings = TRUE, loadings.colour = 'blue',
loadings.label = TRUE, loadings.label.size = 3)
Error: Objects of type rda/cca not supported by autoplot.
Error: Objects of type rda/cca not supported by autoplot.
Edit 1: Even if the first plot can be built manually in ggplot2, what about the other elements, such as loadings or ellipses? Base plot allows overlays when using the Hellinger transformation, but ggplot2 does not seem to allow this directly.
prcomp returns an object of class prcomp, which can be plotted with autoplot. As the error message says, the rda function returns an object of class "rda" "cca", which cannot be plotted using autoplot. Therefore, you must extract the bits you need manually:
library(magrittr) # provides the %>% pipe
library(ggplot2)
data.frame(PC = df.hell$CA$u, species = iris$Species) %>%
ggplot(aes(x=PC.PC1, y=PC.PC2)) +
geom_point(aes(colour=species))
You can find the relevant parts of the object by doing str(df.hell):
List of 10
$ colsum : Named num [1:4] 0.037 0.0746 0.086 0.0854
..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
$ tot.chi : num 0.0216
$ Ybar : num [1:150, 1:4] 0.0042 0.00511 0.0042 0.00359 0.00363 ...
..- attr(*, "scaled:center")= Named num [1:4] 0.656 0.479 0.498 0.267
.. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
..- attr(*, "METHOD")= chr "PCA"
$ method : chr "rda"
$ call : language rda(X = df.hell)
$ pCCA : NULL
$ CCA : NULL
$ CA :List of 7
..$ eig : Named num [1:4] 0.0208691 0.0005348 0.0001951 0.0000205
.. ..- attr(*, "names")= chr [1:4] "PC1" "PC2" "PC3" "PC4"
..$ poseig : NULL
..$ u : num [1:150, 1:4] -0.122 -0.11 -0.119 -0.106 -0.123 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:4] "PC1" "PC2" "PC3" "PC4"
..$ v : num [1:4, 1:4] -0.241 -0.508 0.589 0.58 0.375 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
.. .. ..$ : chr [1:4] "PC1" "PC2" "PC3" "PC4"
..$ rank : int 4
..$ tot.chi: num 0.0216
..$ Xbar : num [1:150, 1:4] 0.0042 0.00511 0.0042 0.00359 0.00363 ...
.. ..- attr(*, "scaled:center")= Named num [1:4] 0.656 0.479 0.498 0.267
.. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
.. ..- attr(*, "METHOD")= chr "PCA"
$ inertia : chr "variance"
$ regularization: chr "this is a vegan::rda result object"
- attr(*, "class")= chr [1:2] "rda" "cca"
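For the loadings and ellipses asked about in the edit, one possible approach is to pull site and variable scores out with vegan's scores() and layer them in ggplot2. This is a rough sketch, not a replacement for autoplot: the object names ord, sites and vars are made up here, and site and variable scores are drawn under vegan's default scaling, so the arrow lengths are only indicative.

```r
library(vegan)
library(ggplot2)

# Hellinger transformation followed by an unconstrained rda (a PCA)
ord <- rda(decostand(iris[1:4], method = "hellinger"))

# Site (row) scores for the points, variable scores for the loading arrows
sites <- data.frame(scores(ord, display = "sites", choices = 1:2),
                    Species = iris$Species)
vars <- data.frame(scores(ord, display = "species", choices = 1:2))
vars$label <- rownames(vars)

p <- ggplot(sites, aes(x = PC1, y = PC2)) +
  geom_point(aes(colour = Species)) +
  stat_ellipse(aes(colour = Species)) +            # one ellipse per species
  geom_segment(data = vars,
               aes(x = 0, y = 0, xend = PC1, yend = PC2),
               arrow = arrow(length = unit(0.2, "cm")),
               colour = "blue") +                  # loading arrows
  geom_text(data = vars, aes(x = PC1, y = PC2, label = label), size = 3)
p
```

Each layer takes its own data frame, which is how the overlay that base plot offers can be reproduced.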
I have tried to figure out actual memory requirements for storing particular object. I tried two methods:
object.size(obj)
save(obj, file = "obj.Rdata") and checking the file size.
The .Rdata file is compressed, so it was always smaller than what object.size() returned, until I saw this object:
> object.size(out)
144792 bytes
> save(out, file = "out.Rdata")
# the file has 211 759 bytes
When I open the file in new R and run object.size(out), it reports 144792 bytes again.
Any idea how this can happen?
I don't want to post the complete object here since it contains closed data, but I can post the str output at least (it is the output of the R2jags::jags call - object of class rjags):
> str(out)
List of 6
$ model :List of 8
..$ ptr :function ()
..$ data :function ()
..$ model :function ()
..$ state :function (internal = FALSE)
..$ nchain :function ()
..$ iter :function ()
..$ sync :function ()
..$ recompile:function ()
..- attr(*, "class")= chr "jags"
$ BUGSoutput :List of 24
..$ n.chains : int 2
..$ n.iter : num 1000
..$ n.burnin : num 500
..$ n.thin : num 1
..$ n.keep : int 500
..$ n.sims : int 1000
..$ sims.array : num [1:500, 1:2, 1:5] -5.86e-06 -3.78e-02 6.92e-02 4.33e-02 4.34e-02 ...
.. ..- attr(*, "dimnames")=List of 3
.. .. ..$ : NULL
.. .. ..$ : NULL
.. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
..$ sims.list :List of 5
.. ..$ alpha : num [1:1000, 1] 0.04702 -0.00818 0.03757 0.00799 0.00369 ...
.. ..$ beta : num [1:1000, 1] -0.135 -0.2082 -0.0112 -0.129 -0.1613 ...
.. ..$ deviance : num [1:1000, 1] 16028 22052 16127 16057 16141 ...
.. ..$ overdisp_sigma: num [1:1000, 1] 0.26506 0.00821 0.24998 0.25793 0.26013 ...
.. ..$ yr_reff_sigma : num [1:1000, 1] 0.1581 0.176 0.0695 0.1052 0.1043 ...
..$ sims.matrix : num [1:1000, 1:5] 0.04702 -0.00818 0.03757 0.00799 0.00369 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
..$ summary : num [1:5, 1:9] 3.16e-03 -1.20e-01 1.68e+04 2.29e-01 1.19e-01 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
.. .. ..$ : chr [1:9] "mean" "sd" "2.5%" "25%" ...
..$ mean :List of 5
.. ..$ alpha : num [1(1d)] 0.00316
.. ..$ beta : num [1(1d)] -0.12
.. ..$ deviance : num [1(1d)] 16835
.. ..$ overdisp_sigma: num [1(1d)] 0.229
.. ..$ yr_reff_sigma : num [1(1d)] 0.119
..$ sd :List of 5
.. ..$ alpha : num [1(1d)] 0.0403
.. ..$ beta : num [1(1d)] 0.0799
.. ..$ deviance : num [1(1d)] 2378
.. ..$ overdisp_sigma: num [1(1d)] 0.0702
.. ..$ yr_reff_sigma : num [1(1d)] 0.036
..$ median :List of 5
.. ..$ alpha : num [1(1d)] 0.00399
.. ..$ beta : num [1(1d)] -0.123
.. ..$ deviance : num [1(1d)] 16209
.. ..$ overdisp_sigma: num [1(1d)] 0.252
.. ..$ yr_reff_sigma : num [1(1d)] 0.111
..$ root.short : chr [1:5] "alpha" "beta" "deviance" "overdisp_sigma" ...
..$ long.short :List of 5
.. ..$ : int 1
.. ..$ : int 2
.. ..$ : int 3
.. ..$ : int 4
.. ..$ : int 5
..$ dimension.short: num [1:5] 0 0 0 0 0
..$ indexes.short :List of 5
.. ..$ : NULL
.. ..$ : NULL
.. ..$ : NULL
.. ..$ : NULL
.. ..$ : NULL
..$ last.values :List of 2
.. ..$ :List of 4
.. .. ..$ alpha : num [1(1d)] 0.0296
.. .. ..$ beta : num [1(1d)] -0.0964
.. .. ..$ deviance : num [1(1d)] 16113
.. .. ..$ overdisp_sigma: num [1(1d)] 0.265
.. ..$ :List of 4
.. .. ..$ alpha : num [1(1d)] 0.0334
.. .. ..$ beta : num [1(1d)] -0.228
.. .. ..$ deviance : num [1(1d)] 16139
.. .. ..$ overdisp_sigma: num [1(1d)] 0.257
..$ program : chr "jags"
..$ model.file : chr "model.txt"
..$ isDIC : logi TRUE
..$ DICbyR : logi TRUE
..$ pD : num 2830902
..$ DIC : num 2847738
..- attr(*, "class")= chr "bugs"
$ parameters.to.save: chr [1:5] "alpha" "beta" "overdisp_sigma" "yr_reff_sigma" ...
$ model.file : chr "model.txt"
$ n.iter : num 1000
$ DIC : logi TRUE
- attr(*, "class")= chr "rjags"
One way this can happen is if the object has an associated environment that needs saving with it if it is to make sense. This comes up most commonly in the context of "closures" (see here for one explanation).
Without a reproducible example (and without having used R2jags myself) I can't tell you whether that's what is going on in your case, but it at least seems plausible, given that: (a) closures seem to be the most common cause of this situation; (b) based on the output of str(out), your object seems to include a bunch of functions; and (c) it seems like this might be a useful way to organize a computation-heavy and possibly parallelizable procedure like MCMC.
## Define a function "f" that returns a closure, here assigned to the object "y"
f <- function() {
x <- 1:1e6
function() 2*x
}
y <- f()
environment(y)
# <environment: 0x0000000008409ab8>
object.size(y)
# 1216 bytes
save(y, file="out.Rdata")
file.info("out.Rdata")$size
# [1] 2128554
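To see the gap without writing a file, the size reported for the function can be compared with the size of the captured vector and with the length of the serialized closure. This is a sketch in base R only. The names make_closure and the size_* variables are made up for illustration, and runif() is used instead of 1:1e6 because recent R versions can store integer ranges compactly.

```r
make_closure <- function() {
  x <- runif(1e5)       # about 800 kB of doubles, kept alive by the closure
  function() 2 * x
}
y <- make_closure()

# object.size() does not count the function's enclosing environment
size_reported <- as.numeric(object.size(y))

# Size of the vector that lives in that environment
size_captured <- as.numeric(object.size(environment(y)$x))

# serialize() writes the closure together with its environment, as save() does
size_serialized <- length(serialize(y, connection = NULL))

c(reported = size_reported, captured = size_captured, serialized = size_serialized)
```

The serialized length is uncompressed, so it will not match the size of a file written by save() exactly, but it shows where the extra bytes come from.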
I have been following an online example for R Kohonen self-organising maps (SOM) which suggested that the data should be centred and scaled before computing the SOM.
However, I've noticed the object created seems to have attributes for centre and scale, in which case am I really applying a redundant step by centring and scaling first? Example script below
# Load package
require(kohonen)
# Set data
data(iris)
# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)
# Prepare SOM
set.seed(590507)
som1 <- som(dt,
somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
str(som1)
The output from the last line of the script is:
List of 13
$ data :List of 1
..$ : num [1:150, 1:4] -0.898 -1.139 -1.381 -1.501 -1.018 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length"
"Petal.Width"
.. ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
.. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width"
"Petal.Length" "Petal.Width"
.. ..- attr(*, "scaled:scale")= Named num [1:4] 0.828 0.436 1.765 0.762
.. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width"
"Petal.Length" "Petal.Width"
$ unit.classif : num [1:150] 3 5 5 5 4 2 4 4 6 5 ...
$ distances : num [1:150] 0.0426 0.0663 0.0768 0.0744 0.1346 ...
$ grid :List of 6
..$ pts : num [1:36, 1:2] 1.5 2.5 3.5 4.5 5.5 6.5 1 2 3 4 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "x" "y"
..$ xdim : num 6
..$ ydim : num 6
..$ topo : chr "hexagonal"
..$ neighbourhood.fct: Factor w/ 2 levels "bubble","gaussian": 1
..$ toroidal : logi FALSE
..- attr(*, "class")= chr "somgrid"
$ codes :List of 1
..$ : num [1:36, 1:4] -0.376 -0.683 -0.734 -1.158 -1.231 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:36] "V1" "V2" "V3" "V4" ...
.. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length"
"Petal.Width"
$ changes : num [1:500, 1] 0.0445 0.0413 0.0347 0.0373 0.0337 ...
$ alpha : num [1:2] 0.05 0.01
$ radius : Named num [1:2] 3.61 0
..- attr(*, "names")= chr [1:2] "66.66667%" ""
$ user.weights : num 1
$ distance.weights: num 1
$ whatmap : int 1
$ maxNA.fraction : int 0
$ dist.fcts : chr "sumofsquares"
- attr(*, "class")= chr "kohonen"
Notice that in lines 7 and 10 of the output there are references to centre and scale. I would appreciate an explanation of the process here.
Your scaling step is not redundant: the source code of som does no scaling, and the attributes you see in lines 7 and 10 are simply the attributes of the training dataset you passed in.
To check this, just run and compare results of this chunk of code:
# Load package
require(kohonen)
# Set data
data(iris)
# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)
#compare train datasets
str(dt)
str(as.matrix(iris[, 1:4]))
# Prepare SOM
set.seed(590507)
som1 <- kohonen::som(dt,
kohonen::somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
#without scaling
som2 <- kohonen::som(as.matrix(iris[, 1:4]),
kohonen::somgrid(6,6, "hexagonal"),
rlen=500,
keep.data=TRUE)
#compare results of som function
str(som1)
str(som2)
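The origin of those attributes can also be checked without fitting a SOM at all, since scale() itself attaches them to its result. A small base R sketch (the names raw, attr_raw and attr_dt are only illustrative):

```r
raw <- as.matrix(iris[, 1:4])
dt <- scale(raw, center = TRUE)

# Attributes present on the scaled matrix but not on the raw one
attr_raw <- names(attributes(raw))
attr_dt <- names(attributes(dt))
added <- setdiff(attr_dt, attr_raw)
added

# The stored centre is just the column means of the unscaled data
all.equal(attr(dt, "scaled:center"), colMeans(raw))
```

Here added holds "scaled:center" and "scaled:scale", the same two attributes that appear in the str() output of the fitted object.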
I'm working from caracal's great example conducting a factor analysis on dichotomous data and I'm now struggling to extract the factors from the object produced by the psych package's fa.poly function.
Can anyone help me extract the factors from the fa.poly object (and look at the correlation)?
Please see caracal's example for the working example.
In this example you create an object with:
faPCdirect <- fa.poly(XdiNum, nfactors=2, rotate="varimax") # polychoric FA
So somewhere in faPCdirect there is what you want. I recommend using str() to inspect the structure of faPCdirect:
> str(faPCdirect)
List of 5
$ fa :List of 34
..$ residual : num [1:6, 1:6] 4.79e-01 7.78e-02 -2.97e-0...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:6] "X1" "X2" "X3" "X4" ...
.. .. ..$ : chr [1:6] "X1" "X2" "X3" "X4" ...
..$ dof : num 4
..$ fit
...skip stuff....
..$ BIC : num 4.11
..$ r.scores : num [1:2, 1:2] 1 0.0508 0.0508 1
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:2] "MR2" "MR1"
.. .. ..$ : chr [1:2] "MR2" "MR1"
..$ R2 : Named num [1:2] 0.709 0.989
.. ..- attr(*, "names")= chr [1:2] "MR2" "MR1"
..$ valid : num [1:2] 0.819 0.987
..$ score.cor : num [1:2, 1:2] 1 0.212 0.212 1
So this says that this object is a list of five, with the first element called fa and that contains an element called score.cor that is a 2x2 matrix. I think what you want is the off diagonal.
> faPCdirect$fa$score.cor
[,1] [,2]
[1,] 1.0000000 0.2117457
[2,] 0.2117457 1.0000000
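To keep only the off-diagonal value, the matrix can be indexed with upper.tri(). The sketch below rebuilds the 2x2 matrix from the values printed above rather than calling fa.poly, so score_cor is a stand-in for faPCdirect$fa$score.cor.

```r
# Stand-in for faPCdirect$fa$score.cor, using the values printed above
score_cor <- matrix(c(1, 0.2117457,
                      0.2117457, 1), nrow = 2, byrow = TRUE)

# Correlation between the two factors (the single off-diagonal entry)
off_diag <- score_cor[upper.tri(score_cor)]
off_diag
```

With more than two factors the same call returns every pairwise correlation above the diagonal.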
I have the following object M, from which I need to extract the fstatistic. It is a model generated by the function summaryC of a model generated by aovp, both functions from package lmPerm. I have tried hints for extracting values from normal linear models and from the functions in attr, extract and getElement, but without success.
Could anybody give me a hint?
> str(M)
List of 2
$ Error: vegetation: NULL
$ Error: Within :List of 11
..$ NA : NULL
..$ terms :Classes 'terms', 'formula' length 3 Temp ~ depth
.. .. ..- attr(*, "variables")= language list(Temp, depth)
.. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
.. .. .. ..- attr(*, "dimnames")=List of 2
.. .. .. .. ..$ : chr [1:2] "Temp" "depth"
.. .. .. .. ..$ : chr "depth"
.. .. ..- attr(*, "term.labels")= chr "depth"
.. .. ..- attr(*, "order")= int 1
.. .. ..- attr(*, "intercept")= int 1
.. .. ..- attr(*, "response")= int 1
.. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
..$ residuals : Named num [1:498] -46.9 -43.9 -46.9 -38.9 -41.9 ...
.. ..- attr(*, "names")= chr [1:498] "3" "4" "5" "6" ...
..$ coefficients : num [1:4, 1:4] -2.00 -1.00 -1.35e-14 1.00 2.59 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
.. .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
..$ aliased : Named logi [1:4] FALSE FALSE FALSE FALSE
.. ..- attr(*, "names")= chr [1:4] "depth1" "depth2" "depth3" "depth4"
..$ sigma : num 29
..$ df : int [1:3] 4 494 4
..$ r.squared : num 0.00239
..$ adj.r.squared: num -0.00367
..$ **fstatistic** : Named num [1:3] 0.395 3 494
.. ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
..$ cov.unscaled : num [1:4, 1:4] 0.008 -0.002 -0.002 -0.002 -0.002 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
.. .. ..$ : chr [1:4] "depth1" "depth2" "depth3" "depth4"
..- attr(*, "class")= chr "summary.lmp"
- attr(*, "class")= chr "listof"
Here is a reproducible example to play with:
library(lmPerm)
Temp=1:100
depth<- rep( c("1","2","3","4","5"), 100)
vegetation=rep( c("1","2"), 50)
df=data.frame(Temp,depth,vegetation)
M=summaryC(aovp(Temp~depth+Error(vegetation),df, perm=""))
As the str output from your example shows, M is a list of two elements, and the second one contains what you want. Hence list extraction via [[ does the trick:
> M[[2]][["fstatistic"]]
value numdf dendf
0.3946 3.0000 494.0000
If this is not what you want, please comment.
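Extraction by element name may be easier to read than the positional M[[2]]. The sketch below uses a hand-built stand-in, M_demo, that mimics the names and values shown in the str output, so it runs without lmPerm. It assumes the real object carries the same element names.

```r
# Stand-in for M: first element NULL, second element holds the summary
M_demo <- list(
  "Error: vegetation" = NULL,
  "Error: Within" = list(
    fstatistic = c(value = 0.3946, numdf = 3, dendf = 494)
  )
)

# Same result as M_demo[[2]][["fstatistic"]], but addressed by name
f_stat <- M_demo[["Error: Within"]][["fstatistic"]]

# Just the F value, without its name
f_value <- f_stat[["value"]]
f_value
```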