How to extract components from an object of class "spec"? - r

I am trying to construct a table of power spectra and have run into this problem.
Define the table:
V <- tibble(month=double(),day=double(),hour=double(),minutes=double(),
frequency=double(),power=double(),period=double())
Compute the spectrum:
S <- spec.pgram(Spec2d$Inst,spans=windowSize,log="yes")
This creates an object of class "spec".
I need to extract the data from S and put it into V. When I try:
V$frequency <- S$freq
I get this error message:
Error: Assigned data `S$freq` must be compatible with existing data.
x Existing data has 0 rows.
x Assigned data has 48 rows.
ℹ Only vectors of size 1 are recycled.
which doesn't make sense to me. I have tried to coerce S$freq into different types of objects, but nothing works.
S$freq is a vector of length 48, as in the error message.
What is going on? Is there a workaround?

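Don't initialise the data frame/tibble first. V has 0 rows, and a tibble only recycles vectors of length 1, so a 48-element vector cannot be assigned as one of its columns. Try: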
S <- spec.pgram(Spec2d$Inst,spans=windowSize,log="yes")
V <- data.frame(frequency = S$freq)
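To fill in the other columns from the question in the same step, here is a minimal sketch. It assumes a simulated series in place of Spec2d$Inst and spans = 3 in place of windowSize (both are placeholders, not the asker's data), and it takes period to mean 1/frequency. S$freq and S$spec are the frequency and spectral density components of a "spec" object.

```r
set.seed(1)
# Simulated stand-in for Spec2d$Inst (assumption): 96 points with a 12-step cycle
x <- ts(sin(2 * pi * (1:96) / 12) + rnorm(96))

# plot = FALSE returns the "spec" object without drawing it
S <- spec.pgram(x, spans = 3, plot = FALSE)

# Build the table in one step so every column has the same length
V <- data.frame(
  frequency = S$freq,
  power     = as.numeric(S$spec),  # as.numeric() ensures a plain vector
  period    = 1 / S$freq           # assumed definition of "period"
)

str(V)
```

Columns such as month, day, hour and minutes can be added afterwards as length-1 values, which are recycled to the number of rows.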

Performing HCPC on the columns (i.e. variables) instead of the rows (i.e. individuals) after (M)CA

I would like to perform an HCPC on the columns of my dataset after performing a CA. For some reason I also have to specify at the start that all of my columns are of type 'factor', only to loop over them again afterwards and convert them to numeric. I don't know exactly why, because if I check the type of each column (without specifying them as factor) they appear to be numeric. When I don't load and convert the data like this, however, I get an error like the following:
Error in eigen(crossprod(t(X), t(X)), symmetric = TRUE) : infinite or
missing values in 'x'
Could this be due to the fact that there are columns in my dataset that only contain 0's? If so, how come that it works perfectly fine by reading everything in first as factor and then converting it to numeric before applying the CA, instead of just performing the CA directly?
The original issue with the HCPC, then, is the following:
# read in data; 40 x 267 data frame
data_for_ca <- read.csv("./data/data_clean_CA_complete.csv",row.names=1,colClasses = c(rep('factor',267)))
# loop over first 267 columns, converting them to numeric
for (i in 1:267) {
  data_for_ca[[i]] <- as.numeric(data_for_ca[[i]])
}
# perform CA
data.ca <- CA(data_for_ca,graph = F)
# perform HCPC for rows (i.e. individuals); up until here everything works just fine
data.hcpc <- HCPC(data.ca,graph = T)
# now I start having trouble
# perform HCPC for columns (i.e. variables); use their coordinates that are stocked in the CA-object that was created earlier
data.cols.hcpc <- HCPC(data.ca$col$coord,graph = T)
The code above shows me a dendrogram in the last case and even lets me cut it into clusters, but then I get the following error:
Error in catdes(data.clust, ncol(data.clust), proba = proba, row.w =
res.sauv$call$row.w.init) : object 'data.clust' not found
It's worth noting that when I perform MCA on my data and try to perform HCPC on my columns in that case, I get the exact same error. Would anyone have any clue as to how to fix this or what exactly I am doing wrong? For completeness I insert a screenshot of the upper-left corner of my dataset to show what it looks like:
Thanks in advance for any possible help!
I know this is old, but because I've been troubleshooting this problem for a while today:
HCPC says that it accepts a data frame, but any time I try to simply pass it $col$coord or $colcoord from a standard ca object, it returns this error. My best guess is that there's some metadata it actually needs/is looking for that isn't in a data frame of coordinates, but I can't figure out what that is or how to pass it in.
The current version of FactoMineR will actually just allow you to give HCPC the whole CA object and tell it whether to cluster the rows or columns. So your last line of code should be:
data.cols.hcpc <- HCPC(data.ca, cluster.CA = "columns", graph = T)
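If the installed FactoMineR does not yet have the cluster.CA argument, one possible fallback is to cluster the column coordinates with base R. Note that this is a different and simpler technique than HCPC: plain Ward clustering with hclust, without HCPC's consolidation step or cluster description. The coords matrix below is made-up data standing in for data.ca$col$coord, and k = 2 is an arbitrary choice.

```r
# Made-up coordinates standing in for data.ca$col$coord (assumption):
# six variables on two dimensions, forming two obvious groups
coords <- rbind(
  var1 = c(-1.0,  0.1),
  var2 = c(-0.9, -0.1),
  var3 = c(-1.1,  0.0),
  var4 = c( 1.0,  0.2),
  var5 = c( 0.9, -0.2),
  var6 = c( 1.1,  0.0)
)
colnames(coords) <- c("Dim 1", "Dim 2")

# Ward hierarchical clustering on Euclidean distances between the variables
hc <- hclust(dist(coords), method = "ward.D2")

# Cut the tree into k clusters (k = 2 is an assumption for this toy data)
clusters <- cutree(hc, k = 2)
print(clusters)
```

plot(hc) draws the dendrogram if you want to choose the cut by eye, as HCPC does interactively.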

Can't get 'plotweb' in the bipartite package to work (R)

I am trying to visualise a bipartite network using the bipartite package in R. My data consists of 4 columns in a spreadsheet. The columns contain 1) plant species names 2) bee species names 3) site 4) interaction frequency. I first read the data into R from a CSV file, then convert it to a web using the helper function frame2webs. When I then try to visualise the network with plotweb() I get the error message:
Error in web[rind, cind, drop = FALSE] : incorrect number of dimensions
My code looks like this:
library(bipartite)
bee <- read.csv('TestFile.csv')
bees <- as.data.frame(bee)
BeeWeb <- frame2webs(bees, type.out = "array")
plotweb(BeeWeb)
I've also tried:
BeeWeb <- frame2webs(bees,
varnames = c("higher","lower","webID","freq"),
type.out = "array")
Please help! I am new to R and am struggling to make this work. Cheers!
Not sure what your data look like, but this happens to me when I have a single factor level in either the "higher" or "lower" column, type.out is "list", and emptylist is TRUE.
This is due to a problem in empty, a function that frame2webs only calls when type.out is "list" and emptylist is TRUE. empty finds the dimensions of your data using NROW and NCOL, which interpret a single row of input as a vertical vector. When there's only one factor level in "lower" or "higher", the input to empty is a one-row array. empty interprets this row as a column, hence the 'incorrect number of dimensions' error.
Two simple workarounds:
Set type.out to "array"
Set emptylist to FALSE
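The NROW/NCOL behaviour described above can be reproduced in base R without the bipartite package. This is only an illustration of the mechanism, with a made-up one-row web (the names plantA and bee1 to bee3 are hypothetical); it is not the code inside empty.

```r
# A web with a single "lower" level: a genuine 1 x 3 matrix
web <- matrix(c(4, 0, 2), nrow = 1,
              dimnames = list("plantA", c("bee1", "bee2", "bee3")))
dim(web)

# Subsetting with the default drop = TRUE turns the single row into a plain vector
flat <- web[1, ]
dim(flat)    # NULL: the dimensions are gone
NROW(flat)   # 3: the vector is treated as a column
NCOL(flat)   # 1

# drop = FALSE keeps the 1 x 3 shape, so two-index subsetting still works
kept <- web[1, , drop = FALSE]
dim(kept)
```

Indexing flat with two subscripts, as in flat[1, 1], is what produces the 'incorrect number of dimensions' message.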

xgb.DMatrix Error: The length of labels must equal to the number of rows in the input data

I am using xgboost in R.
I created the xgb matrix fine using a matrix as input, but when I reduce the number of columns in the matrix data, I receive an error.
This works:
> dim(ctt1)
[1] 6401 5901
> xgbmat1 <- xgb.DMatrix(
Matrix(data.matrix(ctt1)),
label = as.matrix(as.numeric(data$V2)) - 1
)
This does not:
> dim(ctt1[,nr])
[1] 6401 1048
xgbmat1 <- xgb.DMatrix(
Matrix(data.matrix(ctt1[,nr])),
label = as.matrix(as.numeric(data$V2)) - 1)
Error in xgb.setinfo(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
In my case I fixed this error by changing how the labels are assigned:
labels <- df_train$target_feature
It turns out that after removing some columns, some rows contain only 0s and so cannot contribute to the model.
For sparse matrices, the xgboost R interface uses the CSC format creation method. The problem currently is that this method automatically determines the number of rows from the existing non-zero values, so any completely sparse rows at the end are not counted. A similar loss of completely sparse columns at the end can happen with the CSR sparse format. For more details see xgboost issue #1223 and also Wikipedia on the sparse matrix formats.
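The same kind of mismatch can be shown with the Matrix package alone. This is an analogy, not xgboost's internal code: sparseMatrix also infers the number of rows from the largest index present unless dims is given. The indices and labels below are made up.

```r
library(Matrix)

# Four observations, but only rows 1 and 2 contain non-zero entries
i <- c(1, 2)
j <- c(1, 2)
x <- c(1, 1)
labels <- c(0, 1, 0, 1)

# Without dims, the shape is inferred from the largest indices: rows 3 and 4 vanish
inferred <- sparseMatrix(i = i, j = j, x = x)

# With dims stated explicitly, the trailing all-zero rows are kept
explicit <- sparseMatrix(i = i, j = j, x = x, dims = c(4, 2))

nrow(inferred) == length(labels)  # FALSE
nrow(explicit) == length(labels)  # TRUE
```

Comparing nrow() of the input with length() of the label before calling xgb.DMatrix is a cheap way to confirm that the two agree.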
The proper way to create the DMatrix is like this:
xgtrain <- xgb.DMatrix(data = as.matrix(X_train[,-5]), label = X_train$item_cnt_month)
Drop the label column from the data parameter and take the label from the same data set. Here item_cnt_month is column five of X_train, so I drop it from data with X_train[,-5] and refer to it as X_train$item_cnt_month for the label.
Before splitting your data, you need to turn it into a data frame.
For example:
data <- read.csv(...)
data <- as.data.frame(data)
Now you can set your train data and test data to use in your "sparse.model.matrix" and "xgb.DMatrix".

Having trouble handling large data in R

I'm currently building a recommender system with 8k users and 200k items using the recommenderlab package.
Before I can use the functions of recommenderlab, I'm having trouble converting my data frame to a real rating matrix.
                    item_idx       mem_idx rating
1 00600015987465341234f7dae4 534122168382b      4
2 0060001660924533ad0cd443e1 53d79f413e3aa      5
3 006000195520453d7ac28e4b4b 53d79f413e3aa      5
4 0060001986642536d6fc77d269 535146eb5af95      4
5 00708969975005409278f828f3 540927366f478      5
This is part of my data frame; all the (item_idx, mem_idx) pairs are distinct.
mat <- tapply(df$rating, list(df$mem_idx, df$ID), FUN=function(x) x)
I tried to convert the data frame to a matrix using this code. It sometimes succeeds, but usually an error like this occurs:
Error: cannot allocate vector of size 1.1 Gb
When it does succeed, I apply this code to convert the result to a realRatingMatrix:
r <- as(mat, "realRatingMatrix")
but it always fails with this error:
Error in which(x == 0, arr.ind = TRUE) :
error in evaluating the argument 'x' in selecting a method for function 'which': Error: (list) object cannot be coerced to type 'double'
If anyone knows how to avoid either of these errors, please help me.
Convert the data frame to a sparse matrix and then to the realRatingMatrix class:
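library(Matrix)          # sparseMatrix()
library(recommenderlab)  # realRatingMatrix class
itm <- factor(data[,1])
mem <- factor(data[,2])
# sparse matrix with members (users) as rows and items as columns,
# which is the orientation recommenderlab expects
s <- sparseMatrix(
  as.numeric(mem),
  as.numeric(itm),
  dimnames = list(
    as.character(levels(mem)),
    as.character(levels(itm))),
  x = data[,3])
# convert to realRatingMatrix class
rm <- new("realRatingMatrix", data = s)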
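As a rough check on why the dense matrix built with tapply in the question runs out of memory, the arithmetic can be sketched as follows. The user and item counts come from the question; the number of ratings (n_ratings) is a made-up figure for illustration, and the sparse estimate is only an approximation of compressed column storage.

```r
n_users <- 8000
n_items <- 200000

# A dense numeric matrix stores every cell as an 8-byte double
dense_gib <- n_users * n_items * 8 / 2^30
dense_gib   # roughly 11.9 GiB

# Hypothetical number of observed ratings (assumption, not from the question)
n_ratings <- 1e6

# Compressed column storage: one 8-byte value and one 4-byte row index per rating,
# plus one 4-byte pointer per column (items are the columns here)
sparse_gib <- (n_ratings * (8 + 4) + (n_items + 1) * 4) / 2^30
sparse_gib
```

The dense size does not depend on how many ratings exist, which is why the sparse route scales so much better for this kind of data.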

Unable to Convert Chi-Squared Values into a Numeric Column in R

I've been working on a project for a little bit for a homework assignment and I've been stuck on a logistical problem for a while now.
What I have at the moment is a list that returns 10000 values in the format:
[[10000]]
X-squared
0.1867083
(This is the 10000th value of the list)
What I really would like is to just have the chi-squared value alone so I can do things like create a histogram of the values.
Is there any way I can do this? I'm fine with repeating the test from the start if necessary.
My current code is:
nsims = 10000
for (i in 1:nsims) {
  cancer.cells <- c(rep("M",24),rep("B",13))
  malig[i] <- sum(sample(cancer.cells,21)=="M")
}
benign = 21 - malig
rbenign = 13 - benign
rmalig = 24 - malig
for (i in 1:nsims) {
  test = cbind(c(rbenign[i],benign[i]),c(rmalig[i],malig[i]))
  cancerchi[i] = chisq.test(test,correct=FALSE)
}
It gives me all I need, I just cannot perform follow-up analysis on it such as creating a histogram.
Thanks for taking the time to read this!
I'll provide an answer at the suggestion of @Dr. Mike.
hist requires a numeric vector as input. hist(cancerchi) will not work because cancerchi is a list, not a vector.
There are several ways to convert cancerchi from a list into a format that hist can work with. Here are 3 ways. The simplest is to unlist it:
hist(unlist(cancerchi))
Note that if you do not reassign cancerchi it will still be a list and cannot be passed directly to hist.
# i.e
class(cancerchi)
hist(cancerchi) # will still give you an error
If you reassign, it can be another type of object:
(class(cancerchi2 <- unlist(cancerchi)))
(class(cancerchi3 <- as.data.frame(unlist(cancerchi))))
# using the ldply function in the plyr package
library(plyr)
(class(cancerchi4 <- ldply(cancerchi)))
These new objects can be passed to hist (the data frames need a column selected):
hist(cancerchi2)
hist(cancerchi3[,1]) # specify column because cancerchi3 is a data frame, not a vector
hist(cancerchi4[,1]) # specify column because cancerchi4 is a data frame, not a vector
A little extra information: other useful commands for looking at your objects include str and attributes.
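Since the question mentions being fine with rerunning the test, another option is to store only the statistic in a numeric vector from the start. A minimal sketch, assuming the same simulation as in the question but with a smaller nsims and a fixed seed (both are my choices, as is the name chi_stats):

```r
set.seed(42)
nsims <- 1000   # smaller than the question's 10000, only to keep the sketch quick
cancer.cells <- c(rep("M", 24), rep("B", 13))

# Pre-allocate a numeric vector instead of growing a list
chi_stats <- numeric(nsims)

for (i in seq_len(nsims)) {
  malig  <- sum(sample(cancer.cells, 21) == "M")
  benign <- 21 - malig
  tab <- cbind(c(13 - benign, benign), c(24 - malig, malig))
  # $statistic extracts just the X-squared value from the test result
  chi_stats[i] <- chisq.test(tab, correct = FALSE)$statistic
}

class(chi_stats)
summary(chi_stats)
```

Because chi_stats is already a numeric vector, hist(chi_stats) works without any conversion.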
