How can I write a list of matrices to a CSV file in R?
I tried
fff = tr[[3]]
MATweight = ldply(fff, function(t) t$toDataFrame())
but got this error:
Error in t$toDataFrame : $ operator is invalid for atomic vectors
I am not sure whether this was the right approach. Any ideas?
I have this list of matrices
> str(fff)
List of 10
$ : num [1:1000, 1:50] 1 1 1 1 2 2 1 2 2 1 ...
....
$ : num [1:1000, 1:50] 1 1 1 1 2 2 1 2 2 1 ..
.
When I tried the suggested answer by G. I got:
> write.csv(map_dfr(fff, as.data.frame, .id = "matrix"),"testt.csv", row.names = FALSE)
> tgtg = read.csv("/Users/amani/testt.csv")
> str(tgtg)
'data.frame': 10000 obs. of 51 variables:
$ matrix: int 1 1 1 1 1 1 1 1 1 1 ...
$ V1 : int 1 1 1 1 2 2 1 2 2 1 ...
$ V2 : int 2 2 1 2 1 1 2 1 1 2 ...
$ V3 : int 1 1 1 1 1 2 1 1 2 1 ...
....
$ V50 : int 2 2 1 1 1 1 1 1 2 2 ...
Using the L test data shown, try map_dfr from purrr:
library(purrr)
L <- list(as.matrix(BOD), as.matrix(10*BOD)) # test data
write.csv(map_dfr(L, as.data.frame, .id = "matrix"), stdout(), row.names = FALSE)
giving:
"matrix","Time","demand"
"1",1,8.3
"1",2,10.3
"1",3,19
"1",4,16
"1",5,15.6
"1",7,19.8
"2",10,83
"2",20,103
"2",30,190
"2",40,160
"2",50,156
"2",70,198
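If the combined CSV later needs to be turned back into a list of matrices, splitting on the id column is one way to do it. The following is only a sketch: it uses a base R stand-in for the map_dfr call so that it runs without purrr, and csv_path is a hypothetical temporary file.

```r
# Test data from the answer above: two small numeric matrices
L <- list(as.matrix(BOD), as.matrix(10 * BOD))

# Base R stand-in for map_dfr(L, as.data.frame, .id = "matrix")
combined <- do.call(rbind, lapply(seq_along(L), function(i) {
  cbind(matrix = i, as.data.frame(L[[i]]))
}))

csv_path <- tempfile(fileext = ".csv")  # hypothetical output location
write.csv(combined, csv_path, row.names = FALSE)

# Read the file back and split on the id column: one matrix per list element
back <- read.csv(csv_path)
L2 <- unname(lapply(split(back[, -1], back$matrix), as.matrix))
```

The recovered matrices keep the column names from the CSV header. Row names may differ from the originals, so compare values rather than attributes.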
Why not do it like this:
# sample data like your `fff` list of matrices
fff <- lapply(1:10, function(i)
matrix(data = sample(1:2, 510000, replace = TRUE), nrow = 10000, ncol = 51))
# we give each matrix in the list a unique name
mat_names <- as.character(1:length(fff))
# converting to dataframe with the list index preserved
fffdf <- lapply(1:length(fff), function(i)
cbind(mat_names = mat_names[i], as.data.frame(fff[[i]])))
fffdf <- do.call(rbind, fffdf)
# writing it to a file
t <- tempfile(fileext = ".csv")
write.csv(fffdf, t, row.names = FALSE)
# testing the reassembly into the same list of matrices that you started
tcsv <- read.csv(t)
tgtg <- lapply(mat_names, function(i) {
df <- subset(tcsv, mat_names == i, select = -mat_names)
mat <- as.matrix(df)
dimnames(mat) <- NULL
return(mat)
})
Testing the result: is tgtg the same as the fff you began with?
> identical(fff, tgtg)
[1] TRUE
> str(tgtg)
List of 10
$ : int [1:10000, 1:51] 1 2 1 2 1 1 2 2 1 1 ...
$ : int [1:10000, 1:51] 2 2 2 2 2 2 1 1 1 1 ...
$ : int [1:10000, 1:51] 1 2 1 1 2 2 1 1 2 1 ...
$ : int [1:10000, 1:51] 1 1 2 2 1 1 1 2 1 1 ...
$ : int [1:10000, 1:51] 1 1 2 1 2 2 1 1 1 2 ...
$ : int [1:10000, 1:51] 2 1 2 2 1 1 1 2 1 2 ...
$ : int [1:10000, 1:51] 2 1 2 1 2 1 1 1 2 1 ...
$ : int [1:10000, 1:51] 2 2 1 1 2 2 1 2 1 1 ...
$ : int [1:10000, 1:51] 1 2 1 1 1 1 1 2 2 1 ...
$ : int [1:10000, 1:51] 1 1 1 2 2 2 2 2 2 2 ...
I am trying to recode variables, which lie in columns of multiple dataframes. These dataframes also lie within lists. I have an attempt using a loop which works. I wonder if there is an even easier possibility using lapply.
My lapply attempt did not work, because the recode command tried accessing the dataframe, but it needs the columns. Is there an easy lapply way to do it?
This is an example of what my data (data frames within a list) looks like, using str():
> str(Q1_dfcl)
List of 19
$ Q1_Ques_01Dep:'data.frame': 384 obs. of 20 variables:
..$ Q001_01: num [1:384] 3 2 2 1 4 2 1 1 3 1 ...
..$ Q001_02: num [1:384] 2 1 3 1 5 1 1 1 1 1 ...
..$ Q001_05: num [1:384] 3 1 3 1 2 2 2 1 1 1 ...
..$ Q001_06: num [1:384] 2 2 5 2 5 1 2 1 1 1 ...
..$ Q001_08: num [1:384] 3 1 5 1 4 1 1 1 1 1 ...
..$ Q001_09: num [1:384] 3 1 2 2 3 3 1 2 3 1 ...
..$ Q001_11: num [1:384] 4 2 4 1 1 3 1 1 2 2 ...
..$ Q001_13: num [1:384] 1 1 3 1 2 1 1 1 1 1 ...
..$ Q001_21: num [1:384] 3 1 5 3 5 1 2 1 2 1 ...
..$ Q001_26: num [1:384] 3 2 5 1 4 2 1 1 1 1 ...
..$ Q001_27: num [1:384] 4 1 4 1 5 2 2 2 1 2 ...
..$ Q001_30: num [1:384] 2 3 5 2 4 1 1 2 1 1 ...
..$ Q001_31: num [1:384] 3 1 5 2 5 1 1 1 1 2 ...
..$ Q001_40: num [1:384] 2 1 5 2 5 1 2 1 1 1 ...
..$ Q001_48: num [1:384] 4 1 5 1 4 2 2 1 2 1 ...
..$ Q001_51: num [1:384] 3 1 5 3 4 1 1 1 2 1 ...
..$ Q001_52: num [1:384] 1 1 2 1 2 1 1 1 1 1 ...
..$ Q001_57: num [1:384] 1 1 1 1 1 1 1 1 1 1 ...
..$ Q001_61: num [1:384] 2 2 2 1 2 2 1 1 1 1 ...
..$ Q001_64: num [1:384] 4 4 5 3 5 3 2 2 5 3 ...
$ Q1_Ques_02Dys:'data.frame': 384 obs. of 10 variables:
..$ Q001_02: num [1:384] 2 1 3 1 5 1 1 1 1 1 ...
..$ Q001_05: num [1:384] 3 1 3 1 2 2 2 1 1 1 ...
..$ Q001_08: num [1:384] 3 1 5 1 4 1 1 1 1 1 ...
..$ Q001_09: num [1:384] 3 1 2 2 3 3 1 2 3 1 ...
..$ Q001_21: num [1:384] 3 1 5 3 5 1 2 1 2 1 ...
..$ Q001_31: num [1:384] 3 1 5 2 5 1 1 1 1 2 ...
..$ Q001_40: num [1:384] 2 1 5 2 5 1 2 1 1 1 ...
..$ Q001_48: num [1:384] 4 1 5 1 4 2 2 1 2 1 ...
..$ Q001_57: num [1:384] 1 1 1 1 1 1 1 1 1 1 ...
..$ Q001_61: num [1:384] 2 2 2 1 2 2 1 1 1 1 ...
$ Q1_Ques_03Las:'data.frame': 384 obs. of 6 variables:
..$ Q001_06: num [1:384] 2 2 5 2 5 1 2 1 1 1 ...
..$ Q001_29: num [1:384] 3 1 2 2 5 1 1 1 1 1 ...
..$ Q001_30: num [1:384] 2 3 5 2 4 1 1 2 1 1 ...
..$ Q001_43: num [1:384] 3 2 2 2 1 2 1 1 1 1 ...
..$ Q001_54: num [1:384] 1 2 3 1 4 1 1 1 1 2 ...
..$ Q001_55: num [1:384] 2 2 5 1 5 1 1 1 3 1 ...
This is what my code looks like:
### Preparing the values for recoding
level_key_difficulty <- c("1" = "0", "2" = "1", "3" = "2", "4" = "3", "5" = "4")
### This works.
for (i in 1:length(Q1_list)){
for (j in 1:length(Q1_list[[i]])){
Q1_dfcl[[i]][j] <- lapply(Q1_dfcl[[i]][j], function(x) recode(x, !!!level_key_difficulty))[[1]]
names(Q1_dfcl[[i]][j]) <- names(Q1_list[[i]][j])
}
}
### This does not work, because it only tries to do it on the WHOLE dataframe within
# Q1_dfcl[[1]]
lapply(Q1_dfcl, \(x) recode(x, !!!level_key_difficulty))
Actually you could just subtract 1.
dlst ## before
# [[1]]
# V1 V2 V3 V4
# 1 1 1 2 4
# 2 5 2 2 1
# 3 1 4 1 5
#
# [[2]]
# V1 V2 V3 V4
# 1 4 3 3 5
# 2 2 1 4 5
# 3 2 1 5 4
dlst_new <- lapply(dlst, `-`, 1)
dlst_new ## after
# [[1]]
# V1 V2 V3 V4
# 1 0 0 1 3
# 2 4 1 1 0
# 3 0 3 0 4
#
# [[2]]
# V1 V2 V3 V4
# 1 3 2 2 4
# 2 1 0 3 4
# 3 1 0 4 3
To change just the values of specific columns, we can do
sset <- c('V1', 'V2') ## define subset
dlst_new1 <- lapply(dlst, \(x) {x[sset] <- x[sset] - 1; x})
dlst_new1
# [[1]]
# V1 V2 V3 V4
# 1 0 0 2 4
# 2 4 1 2 1
# 3 0 3 1 5
#
# [[2]]
# V1 V2 V3 V4
# 1 3 2 3 5
# 2 1 0 4 5
# 3 1 0 5 4
Data:
dlst <- list(structure(list(V1 = c(1L, 5L, 1L), V2 = c(1L, 2L, 4L), V3 = c(2L,
2L, 1L), V4 = c(4L, 1L, 5L)), class = "data.frame", row.names = c(NA,
-3L)), structure(list(V1 = c(4L, 2L, 2L), V2 = c(3L, 1L, 1L),
V3 = 3:5, V4 = c(5L, 5L, 4L)), class = "data.frame", row.names = c(NA,
-3L)))
It can be done when you nest your lapply such as:
lapply(list(mtcars), \(a) lapply(a, \(b) recode(b, !!!level_key_difficulty, .default = "def")))
purrr has a convenient function called map_depth that lets you map over a nested list at a given depth. Here you want depth two: the first depth is the data.frame itself, and the second depth is its columns.
purrr::map_depth(list(mtcars), 2, ~ recode(.x, !!!level_key_difficulty, .default = "def"))
There is also rapply, a function I rarely use, which is similar to map_depth:
rapply(list(mtcars), \(a) recode(a, !!!level_key_difficulty, .default = "def"), how = "list")
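Note that the inner lapply returns a plain list of columns for each element. If the elements should stay data frames, assigning into x[] is a common idiom. The sketch below makes some assumptions: dfs_demo is a hypothetical stand-in for Q1_dfcl, and recode_col is an illustrative base R lookup used in place of dplyr::recode so that the example runs without extra packages.

```r
# Recoding key from the question (names are the old values, as strings)
level_key_difficulty <- c("1" = "0", "2" = "1", "3" = "2", "4" = "3", "5" = "4")

# Hypothetical stand-in for Q1_dfcl: a named list of small data frames
dfs_demo <- list(
  A = data.frame(Q001_01 = c(3, 2, 5), Q001_02 = c(2, 1, 4)),
  B = data.frame(Q001_06 = c(1, 5, 2))
)

# Illustrative helper: look each value up by name in the key.
# Values that are missing from the key become NA.
recode_col <- function(col, key) unname(key[as.character(col)])

# Assigning into x[] replaces the columns but keeps x a data frame
recoded <- lapply(dfs_demo, function(x) {
  x[] <- lapply(x, recode_col, key = level_key_difficulty)
  x
})
```

Because the key maps to strings, the recoded columns are character vectors. Wrap the lookup in as.numeric() if numbers are wanted.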
So I'm working with this dataset
tibble [1,000 x 17] (S3: tbl_df/tbl/data.frame)
$ Solde : num [1:1000] 1 2 4 1 1 4 4 2 4 2 ...
$ Duree : num [1:1000] 6 48 12 42 24 36 24 36 12 30 ...
$ Historique : num [1:1000] 4 2 4 2 3 2 2 2 2 4 ...
$ Motif : num [1:1000] 3 3 6 2 0 6 2 1 3 0 ...
$ Montant : num [1:1000] 1169 5951 2096 7882 4870 ...
$ Epargne : num [1:1000] 5 1 1 1 1 5 3 1 4 1 ...
$ Employe_depuis: num [1:1000] 5 3 4 4 3 3 5 3 4 1 ...
$ Statut_sexe : num [1:1000] 3 2 3 3 3 3 3 3 1 4 ...
$ Debit : num [1:1000] 1 1 1 3 1 1 1 1 1 1 ...
$ Residence : num [1:1000] 4 2 3 4 4 4 4 2 4 2 ...
$ Age : num [1:1000] 67 22 49 45 53 35 53 35 61 28 ...
$ Logement : num [1:1000] 2 2 2 3 3 3 2 1 2 2 ...
$ n_credit : num [1:1000] 2 1 1 1 2 1 1 1 1 2 ...
$ Emploi : num [1:1000] 3 3 2 3 3 2 3 4 2 4 ...
$ n_pers : num [1:1000] 1 1 2 2 2 2 1 1 1 1 ...
$ Statut : num [1:1000] 1 2 1 1 2 1 1 1 1 2 ...
$ y : num [1:1000] 1 0 1 1 0 1 1 1 1 0 ...
I'm trying to build a model that predicts the variable y.
To do that I made this function
modele_nnet=function(base,p){
library(nnet)
#Echantillonage
train_id=createDataPartition(base$y,p)
data_train=base[train_id$Resample1,]#Base Train
data_test=base[-train_id$Resample1,]
attach(base)
nnet <-nnet(y~Solde+Duree+Historique+Motif+Epargne+Employe_depuis+Statut_sexe+Debit+Residence+Age+Montant+Logement+n_credit+n_pers+Emploi,data=data_train,family=binomial,size=2)
return(nnet)
}
Then when trying to use a new observation to predict Y I coded this :
newdata=c(Solde=3,Duree=60,Historique=4,Motif=1,Montant=2040,Epargne=5,Employe_depuis=5,Statut_sexe=3,Debit=1,Residence=4,Age=22,Logement=2,n_credit=2,Emploi=3,n_pers=1)
nnet_mod=modele_nnet(base,0.8)
predict(nnet_mod,newdata)
but I get this error
predict(nnet_mod,newdata)
Error in z[keep, ] <- matrix(.C(VR_nntest, as.integer(ntr), as.double(x), :
NAs are not allowed in subscripted assignments
In addition: Warning message:
'newdata' had 15 rows but variables found have 1000 rows
I don't understand how to fix it. Is the problem with the model itself or with the prediction function?
The problem appears to be how you are specifying your new data. The partitioning and model look good. As @RuiBarradas suggests, specify your new data as a data frame, as I do below.
Data:
base <- data.frame(
Solde = sample(1:5, size = 100, replace = TRUE),
Duree = sample(1:100, size = 100, replace = TRUE),
Historique = sample(1:5, size = 100, replace = TRUE),
y = sample(0:1, size = 100, replace = TRUE))
newdat <- data.frame(Solde = 3, Duree = 60, Historique = 4)
Fit model:
library(nnet)
library(caret)
modele_nnet <- function(base, p){
train_id=caret::createDataPartition(base$y, p = p) # name p explicitly; positionally it would be matched to `times`
data_train=base[train_id$Resample1,]
data_test=base[-train_id$Resample1,]
nnet_mod <-nnet(y~Solde+Duree+Historique,
data=data_train, family=binomial, size=2)
return(nnet_mod)
}
my_model <- modele_nnet(base, 0.8)
predict(my_model, newdata = newdat)
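To see why a named vector does not behave like a one-row data frame, compare the shapes after conversion. This is a base R sketch only (no model is fitted), and the variable names newdata_vec, as_df and newdata_fixed are made up for illustration. The shape it shows is consistent with the warning in the question that 'newdata' had 15 rows.

```r
# A named numeric vector, as in the question (shortened to three predictors)
newdata_vec <- c(Solde = 3, Duree = 60, Historique = 4)

# Converting it directly gives one column with one ROW per element
as_df <- as.data.frame(newdata_vec)
dim(as_df)

# Going through a list gives one row with one COLUMN per element,
# which is the layout a formula-based predict method looks for
newdata_fixed <- as.data.frame(as.list(newdata_vec))
dim(newdata_fixed)
names(newdata_fixed)
```

Building the data frame directly with data.frame(Solde = 3, Duree = 60, Historique = 4), as in the answer, gives the same one-row layout.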
I am facing a problem in R while fitting a glm. The error is "variable lengths differ (found for 'var1')". When I delete var1 from the data, the same error appears for the next variable in the data. I checked all the data, and the lengths do not actually differ. How can I resolve this problem? Can anyone please help me? Thanks in advance.
The data looks like this; d_status is my response variable and is a factor. It does not appear here because there are more variables.
'data.frame': 300 obs. of 20 variables:
$ age : num 28 43 32 64 37 42 36 48 55 31 ...
$ gender : num 1 2 2 2 1 2 2 1 2 2 ...
$ u_clarity: num 1 2 1 2 1 1 1 2 1 1 ...
$ ph : num 5 5.5 5 5 5 5 5 5.2 5 5 ...
$ sp_g : num 1.01 1.02 1.01 1.01 1.01 ...
$ albumin : num 1 1 2 1 2 2 2 1 1 2 ...
$ glucose : num 2 2 2 2 2 2 2 2 2 2 ...
$ sugar : num 2 1 2 2 2 2 1 1 1 2 ...
$ kb : num 2 2 2 2 2 2 2 2 2 2 ...
$ bpigment : num 2 2 2 2 2 2 2 1 2 2 ...
$ ur_bi : num 2 2 2 2 2 2 2 2 2 2 ...
$ blood : num 2 2 2 2 2 2 2 2 2 2 ...
$ pus_cells: num 1 2 1 2 1 1 1 1 1 1 ...
$ red_cells: num 1 2 1 2 1 1 1 2 1 1 ...
$ epi_cells: num 1 2 1 2 1 2 1 1 2 2 ...
$ mt : num 1 2 1 2 1 1 2 2 2 1 ...
$ co : num 2 1 1 2 1 1 1 2 2 1 ...
$ gc : num 2 1 1 1 1 1 1 2 2 1 ...
$ bacteria : num 1 2 1 2 1 1 1 2 2 1 ...
$ cc : num 1 1 1 2 1 1 1 2 1 1 ...
f1=glm(y~.,family=quasibinomial(link='logit'),data=dataset1[training,])
Error in model.frame.default(formula = y ~ ., data = dataset1[training, :
variable lengths differ (found for 'age')
I'm trying to convert a triple nested list into a dataframe. This question has helped, but I can't get the dataframe I'd like.
The list is an options chain obtained from IBrokers, a summary is shown below. I've uploaded the actual chain here which is more detailed.
Chain <-
list(
list(
list(
list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="25")),
list(version="8",contract=list(symbol="BHP",right="C",expiry="20180621",strike="26"))
),
list(
list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="25")),
list(version="8",contract=list(symbol="BHP",right="C",expiry="20180730",strike="26"))
)
),
list(
list(
list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="65")),
list(version="8",contract=list(symbol="CBA",right="C",expiry="20180621",strike="64"))
),
list(
list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="65")),
list(version="8",contract=list(symbol="CBA",right="C",expiry="20180730",strike="64"))
)
)
)
I'd like to convert the list into a dataframe like this:
Contracts <- data.frame(symbol=c("BHP","BHP","BHP","BHP","CBA","CBA","CBA","CBA"),
right=c("C","C","C","C","C","C","C","C"),
expiry=c("20180621","20180621","20180730","20180730","20180621","20180621","20180730","20180730"),
strike=c("25","26","25","26","65","64","65","64"))
I tried this code, but it didn't give me the dataframe I wanted.
X <- lapply(Chain,function(x) as.data.frame.list(lapply(x,as.data.frame.list)))
dfx <- do.call(rbind,X)
Any suggestions please?
How about the following?
df <- as.data.frame(matrix(unlist(Chain, recursive = TRUE), ncol = 5, byrow = TRUE)[, -1])
colnames(df) <- c("symbol", "right", "expiry", "strike")
# symbol right expiry strike
#1 BHP C 20180621 25
#2 BHP C 20180621 26
#3 BHP C 20180730 25
#4 BHP C 20180730 26
#5 CBA C 20180621 65
#6 CBA C 20180621 64
#7 CBA C 20180730 65
#8 CBA C 20180730 64
Explanation: Recursively unlist the nested Chain, then recast as matrix, remove column version and convert to data.frame. The only minor down-side is that we have to manually add column names.
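If typing the column names by hand is a concern, they can probably be taken from the names that unlist generates. The following is a sketch under the assumption that every leaf has the same five fields in the same order; Chain_demo is a hypothetical, shortened version of Chain.

```r
# Hypothetical, shortened version of the nested Chain from the question
Chain_demo <- list(list(list(
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "25")),
  list(version = "8",
       contract = list(symbol = "BHP", right = "C", expiry = "20180621", strike = "26"))
)))

flat <- unlist(Chain_demo)  # named character vector, five entries per contract
n_fields <- 5               # assumed number of fields per contract

# One row per contract; column names come from the first block of names
m <- matrix(flat, ncol = n_fields, byrow = TRUE)
colnames(m) <- sub("^contract[.]", "", names(flat)[seq_len(n_fields)])

# Drop the version column and convert to a data frame
df_demo <- as.data.frame(m[, -1, drop = FALSE])
```

This breaks silently if some contracts have missing or extra fields, so it is worth checking that length(flat) is a multiple of n_fields first.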
Update
Since your actual data is quite different, here is a possibility.
Note: I assume the structure from the Gist is stored in tbl.
tbl;
#Source: local data frame [2 x 6]
#Groups: <by row>
#
## A tibble: 2 x 6
# symbol sectype exch currency multiplier Chain
# <fct> <fct> <fct> <fct> <fct> <list>
#1 BHP OPT ASX AUD 100 <list [1,241]>
#2 CBA OPT ASX AUD 100 <list [1,204]>
The following list contains two data.frames, one for each row from tbl.
lst <- lapply(tbl$Chain, function(x)
do.call(rbind.data.frame, lapply(x, function(y) as.data.frame(unclass(y$contract)))))
#List of 2
# $ :'data.frame': 1241 obs. of 16 variables:
# ..$ conId : Factor w/ 1241 levels "198440202","198440207",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ symbol : Factor w/ 1 level "BHP": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ sectype : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ exch : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ primary : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ expiry : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
# ..$ strike : Factor w/ 118 levels "25","26","27",..: 1 1 2 2 3 3 4 4 5 5 ...
# ..$ currency : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ right : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
# ..$ local : Factor w/ 1241 levels "BHPV78","BHPV88",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ multiplier : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ comboleg : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secIdType : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secId : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# $ :'data.frame': 1204 obs. of 16 variables:
# ..$ conId : Factor w/ 1204 levels "198447027","198447030",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ symbol : Factor w/ 1 level "CBA": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ sectype : Factor w/ 1 level "OPT": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ exch : Factor w/ 1 level "ASX": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ primary : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ expiry : Factor w/ 18 levels "20180628","20181220",..: 1 1 1 1 1 1 1 1 1 1 ...
# ..$ strike : Factor w/ 179 levels "79.68","81.68",..: 1 1 2 2 3 3 4 4 5 5 ...
# ..$ currency : Factor w/ 1 level "AUD": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ right : Factor w/ 2 levels "C","P": 1 2 1 2 1 2 1 2 1 2 ...
# ..$ local : Factor w/ 1204 levels "CBAKT9","CBAKU9",..: 1 2 3 4 5 6 7 8 9 10 ...
# ..$ multiplier : Factor w/ 1 level "100": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ combo_legs_desc: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ comboleg : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ include_expired: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secIdType : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
# ..$ secId : Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1 ...
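Since the question asks for a single data frame, the per-symbol data frames can then be stacked. A minimal sketch, assuming all elements share the same column names; lst_demo is a hypothetical, much smaller stand-in for lst.

```r
# Hypothetical stand-in for `lst`: one data frame of contracts per symbol
lst_demo <- list(
  data.frame(symbol = "BHP", right = c("C", "P"),
             expiry = "20180628", strike = c("25", "25")),
  data.frame(symbol = "CBA", right = c("C", "P"),
             expiry = "20180628", strike = c("79.68", "79.68"))
)

# Stack the per-symbol data frames; this relies on identical column names
Contracts <- do.call(rbind, lst_demo)

# Keep only the four columns asked for in the question
Contracts <- Contracts[, c("symbol", "right", "expiry", "strike")]
rownames(Contracts) <- NULL
```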
You can use unstack
unstack(data.frame(d<-unlist(Chain),names(d)))
contract.expiry contract.right contract.strike contract.symbol version
1 20180621 C 25 BHP 8
2 20180621 C 26 BHP 8
3 20180730 C 25 BHP 8
4 20180730 C 26 BHP 8
5 20180621 C 65 CBA 8
6 20180621 C 64 CBA 8
7 20180730 C 65 CBA 8
8 20180730 C 64 CBA 8
If you want you can delete the word contract.
unstack(data.frame(d<-unlist(Chain),sub(".*[.]","",names(d))))
expiry right strike symbol version
1 20180621 C 25 BHP 8
2 20180621 C 26 BHP 8
3 20180730 C 25 BHP 8
4 20180730 C 26 BHP 8
5 20180621 C 65 CBA 8
6 20180621 C 64 CBA 8
7 20180730 C 65 CBA 8
8 20180730 C 64 CBA 8
This can also be written as unstack(data.frame(d<-unlist(Chain),sub("contract[.]","",names(d)))), although I would prefer to keep the contract prefix so that it stays clear which columns belong to the contract data frame.
Or you can rename the columns after unstacking.
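For readers unfamiliar with unstack, here is a small sketch of what it does with a two-column data frame of values and group labels. The vector d_demo and the column names values and ind are made up for illustration.

```r
# A flattened vector with repeated names, similar to unlist(Chain)
d_demo <- c(version = "8", contract.symbol = "BHP",
            version = "8", contract.symbol = "CBA")

# Long format: first column holds the values, second the group label
long <- data.frame(values = unname(d_demo), ind = names(d_demo))

# unstack() splits the values by label and puts each group in its own column
wide <- unstack(long)
```

The columns of the result are ordered by label, which is why the outputs above list the columns alphabetically rather than in their original order. This approach needs every label to occur the same number of times; otherwise unstack returns a list instead of a data frame.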
With the new data:
a=readLines("https://raw.githubusercontent.com/hughandersen/OptionsTrading/master/Stocks_option_chain")
b=eval(parse(text=paste(a,collapse="")))
s=unstack(data.frame(d<-unlist(b[6]),names(d)))
I have some data that looks like this:
head(data)
net1re net2re net3re net4re net5re net6re
24 3 2 1 2 3 3
33 1 1 1 1 1 2
30 3 1 1 1 1 3
22 2 1 1 1 1 1
31 3 2 1 1 1 2
1 2 1 1 1 1 2
I'm running principal component analysis as follows:
library(psych)
fit <- principal(data[,1:6], rotate="varimax")
data$friendship=fit$scores
This creates the variable "friendship" which I can call on the console:
> colnames(data)
[1] "net1re" "net2re" "net3re" "net4re" "net5re"
[6] "net6re" "friendship"
But when I want to view my data, instead of the variable name I get "PC1":
> head(data)
net1re net2re net3re net4re net5re net6re PC1
24 3 2 1 2 3 3 1.29231531
33 1 1 1 1 1 2 -0.68448111
30 3 1 1 1 1 3 0.02783916
22 2 1 1 1 1 1 -0.67371031
31 3 2 1 1 1 2 0.10251282
1 2 1 1 1 1 2 -0.44345075
This becomes a major problem because I need to repeat this with different variables, and all the results get "PC1".
Why is this happening, and how can I assign the variable name instead of "PC1"?
Thanks
This unusual effect appears because fit$scores is a matrix:
str(data)
#'data.frame': 6 obs. of 7 variables:
# $ net1re : int 3 1 3 2 3 2
# $ net2re : int 2 1 1 1 2 1
# $ net3re : int 1 1 1 1 1 1
# $ net4re : int 2 1 1 1 1 1
# $ net5re : int 3 1 1 1 1 1
# $ net6re : int 3 2 3 1 2 2
# $ friendship: num [1:6, 1] 1.1664 -1.261 0.0946 -0.5832 1.1664 ...
# ..- attr(*, "dimnames")=List of 2
# .. ..$ : chr "24" "33" "30" "22" ...
# .. ..$ : chr "PC1"
To get the desired result, you can use
data$friendship=as.vector(fit$scores)
or
data$friendship=fit$scores[,1]
In either case, the output will be:
data
# net1re net2re net3re net4re net5re net6re friendship
#24 3 2 1 2 3 3 1.16635312
#33 1 1 1 1 1 2 -1.26098965
#30 3 1 1 1 1 3 0.09463653
str(data)
#'data.frame': 6 obs. of 7 variables:
# $ net1re : int 3 1 3 2 3 2
# $ net2re : int 2 1 1 1 2 1
# $ net3re : int 1 1 1 1 1 1
# $ net4re : int 2 1 1 1 1 1
# $ net5re : int 3 1 1 1 1 1
# $ net6re : int 3 2 3 1 2 2
# $ friendship: num 1.1664 -1.261 0.0946 -0.5832 1.1664 ...
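The effect can be reproduced without psych, which may help when repeating the step for several sets of variables. In this sketch df_demo and scores are made-up stand-ins for data and fit$scores, and add_score is a hypothetical helper, not part of any package.

```r
# Made-up stand-ins for `data` and `fit$scores`
df_demo <- data.frame(a = 1:3)
scores <- matrix(c(0.5, -1.2, 0.7), ncol = 1, dimnames = list(NULL, "PC1"))

# Assigning the one-column matrix stores it as a matrix inside the data frame
df_demo$friendship <- scores
was_matrix <- is.matrix(df_demo$friendship)

# Hypothetical helper: drop the matrix structure before storing the scores
add_score <- function(data, scores, name) {
  data[[name]] <- as.vector(scores)
  data
}
df_demo <- add_score(df_demo, scores, "friendship")
```

With the helper, each repetition only needs a different name argument, and every new column is a plain numeric vector.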