How to sum two tables of different dimensions in R?

I want to sum two tables in R, but they have different valid categories, which produces two different dimensions. How can I add them up?
Example:
table(VA)
1 2 3 4 6 7 8 9 10
652 1 300 777 9 615 167 26 67
table(VB)
1 2 3 4 5 6 7 8 9 10
285 5 282 367 1 12 289 129 33 1118
table(V2A)+table(V2B)
Error in table(cx$V2A) + table(cx$V2B) : non-conformable arrays
What can I do to solve this?

I assume VA and VB are vectors. To get the equivalent of summing the two tables, concatenate the vectors and tabulate the result:
table(c(VA,VB))
> VA <- sample(1:10,20,replace=TRUE)
> VB <- sample(1:10,20,replace=TRUE)
> table(VA)
VA
1 2 3 4 5 6 7 9 10
1 3 3 2 3 2 2 2 2
> table(VB)
VB
1 2 4 5 6 7 8 9 10
1 2 2 2 4 3 1 2 3
> table(c(VA,VB))
1 2 3 4 5 6 7 8 9 10
2 5 3 4 5 6 5 1 4 5
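If the two tables need to stay separate objects (for example, to inspect each one before adding), one option is to tabulate both vectors over a shared set of levels so that the tables have the same dimensions. This is a minimal sketch, assuming VA and VB are plain numeric vectors; the toy values and the names lv, tab_a and tab_b are illustrative and not from the question:

```r
# Toy inputs (assumed for illustration): 5 is missing from VA, 2-4 from VB
VA <- c(1, 2, 3, 4, 6)
VB <- c(1, 5, 5, 6)

# Shared category set: every value seen in either vector
lv <- sort(union(VA, VB))

# factor(..., levels = lv) makes table() report a zero count for absent levels,
# so both tables end up with the same length and can be added
tab_a <- table(factor(VA, levels = lv))
tab_b <- table(factor(VB, levels = lv))

tab_sum <- tab_a + tab_b
print(tab_sum)
```

On this toy input the counts agree with table(c(VA, VB)), while the per-vector tables remain available.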

Related

How to order numeric values in a designed order in R?

My question is: given the target table (the second one below), how can I order the rows of the original table (the first one below) to get exactly the target table with R? Thank you in advance.
Original table:
A B
1 1
1 2
5 12
2 6
5 14
3 6
3 7
5 13
6 2
3 10
5 11
2 5
6 14
2 7
5 15
6 1
3 8
6 3
2 4
1 3
2 10
4 11
2 8
1 4
1 5
2 9
4 12
4 13
3 9
6 15
Target table:
A B
1 1
1 2
1 3
1 4
1 5
3 6
3 7
3 8
3 9
3 10
5 11
5 12
5 13
5 14
5 15
6 1
6 2
6 3
2 4
2 5
2 6
2 7
2 8
2 9
2 10
4 11
4 12
4 13
6 14
6 15
This can be accomplished by ordering first by an odd/even flag on dat$A (odd values first), then by dat$B:
dat[order(-(dat$A %% 2), dat$B),]
## A B
##1 1 1
##2 1 2
##20 1 3
##24 1 4
##25 1 5
##6 3 6
##7 3 7
##17 3 8
##29 3 9
##10 3 10
##11 5 11
##3 5 12
##8 5 13
##5 5 14
##15 5 15
##16 6 1
##9 6 2
##18 6 3
##19 2 4
##12 2 5
##4 2 6
##14 2 7
##23 2 8
##26 2 9
##21 2 10
##22 4 11
##27 4 12
##28 4 13
##13 6 14
##30 6 15
If it's not an odd/even split, then you can manually set the 1/3/5 and 2/4/6 groups:
dat[order(`levels<-`(factor(dat$A), list('1'=c(1,3,5), '2'=c(6,2,4))), dat$B),]
This collapsed version of the code with levels<- called directly as a function is a bit hard to read, but it is equivalent to:
grpord <- factor(dat$A)
levels(grpord) <- list('1'=c(1,3,5), '2'=c(6,2,4))
dat[order(grpord, dat$B),]
...where "1" is assigned to the groups 1, 3 and 5, and "2" to the groups 6, 2 and 4.
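The same grouping idea can also be written with an explicit group index instead of a relabelled factor. This is a sketch on a small made-up data frame, not the question's full data; first_group and grp are hypothetical names introduced here:

```r
# Small made-up data frame with the same column names as in the question
dat <- data.frame(A = c(1, 5, 2, 6, 3, 4),
                  B = c(2, 11, 4, 1, 6, 12))

# Assumed grouping: A values 1, 3, 5 sort first; everything else sorts second
first_group <- c(1, 3, 5)
grp <- ifelse(dat$A %in% first_group, 1L, 2L)

# Order by group, then by B within each group
res <- dat[order(grp, dat$B), ]
print(res)
```

For this particular grouping the result is the same as the odd/even ordering, but first_group can be changed to any set of values.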

Imputation with categorical variables with mix package in R

I'm trying to impute missing values in a data set that contains categorical variables (7-point Likert scales) using the mix package in R. Here is what I'm doing:
1. Loading the data:
data <- read.csv("test.csv", header=TRUE, row.names="ID")
2. Here's what the data looks like:
The first column is my ID column, the next three columns are categorical variables (7-point Likert scales; these are the ones where I am interested in imputing the missing values). Then I have three auxiliary variables: aux_cat is another categorical variable (unordered, ranging from 1 to 9, no missing data), aux_one is an integer (no missing data), and aux_two is numeric (contains missing data).
var_one var_two var_three aux_cat aux_one aux_two
1 2 1 2 6 26 0.0
2 3 2 3 7 45 32906.5
3 6 2 3 3 31 1237.5
4 7 NA NA 8 11 277.0
5 4 3 1 5 145 78201.0
6 NA NA NA 6 30 48550.0
7 7 6 3 3 48 11568.0
8 6 6 4 2 15 4482.0
9 7 6 5 5 61 NA
10 5 6 7 3 2 NA
11 5 6 5 3 11 78663.0
12 6 2 2 3 16 1235.0
13 7 2 5 3 13 5781.0
14 6 5 4 6 16 5062.0
15 5 5 3 3 43 400.0
16 7 7 5 2 114 7968.0
17 6 5 4 3 99 247.5
18 7 7 7 6 114 1877.0
19 5 5 4 5 3 5881.5
20 4 4 2 3 65 1786.0
21 4 3 6 5 9 14117.5
22 3 3 2 3 35 2093.0
23 3 4 4 5 62 23071.5
24 5 3 5 3 22 2707.5
25 3 1 2 6 128 942.0
26 5 3 6 4 57 101379.0
27 5 5 4 6 76 1398.0
28 1 3 4 3 17 1024.5
29 4 3 2 1 143 10657.0
30 7 1 4 8 14 167.5
31 7 3 7 3 22 4344.0
32 3 3 3 6 27 1582.0
33 7 1 3 2 29 66.5
34 5 5 4 2 108 513.5
35 7 6 6 7 24 936.5
36 4 5 4 7 40 5950.5
37 NA NA NA 8 15 99.5
38 2 2 2 6 21 123.5
39 6 4 5 2 61 477.5
40 6 5 5 2 16 28921.0
41 6 2 2 2 11 1063.5
42 6 2 5 3 116 97798.5
43 4 4 2 8 11 9159.5
44 6 6 6 6 4 1098.5
45 6 4 5 7 21 236.5
46 4 6 4 5 43 219.5
47 3 2 3 3 28 85.5
48 5 5 5 2 71 13483.5
49 5 5 6 8 98 18400.0
50 5 6 6 3 27 357.0
51 5 7 6 7 14 145.5
52 4 5 5 3 93 427.5
53 3 4 5 2 40 412.0
54 6 6 3 2 8 2418.0
55 5 6 5 5 8 4923.5
56 4 5 2 7 32 4135.0
57 7 7 2 6 83 1408.5
58 7 2 3 2 12 5595.0
59 7 2 1 2 32 2280.5
60 7 4 5 3 11 638.5
61 7 5 3 3 24 225.5
62 4 3 3 9 44 570.0
3. Performing preliminary manipulations
I try to run prelim.mix(x, p) where x is the data matrix containing missing values and p is the number of categorical variables in x. The categorical variables must be in the first p columns of x, and they must be coded with consecutive positive integers starting with 1. For example, a binary variable must be coded as 1,2 rather than 0,1.
In my case p should be 4 since I have three Likert-scale variables where I want imputed values and one other categorical variable among my auxiliary variables.
s <- prelim.mix(data,4)
This step seems to work fine.
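The coding requirement described in step 3 (consecutive positive integers starting with 1) can be checked in base R before calling prelim.mix. This is only a sketch: is_consecutive_from_one is a hypothetical helper, the toy data frame is made up, and the check is on the observed values, which is stricter than the stated requirement if some category simply never occurs in the sample:

```r
# Hypothetical helper: TRUE when the non-missing values of x are exactly 1, 2, ..., k
is_consecutive_from_one <- function(x) {
  vals <- sort(unique(x[!is.na(x)]))
  length(vals) > 0 &&
    identical(as.numeric(vals), as.numeric(seq_along(vals)))
}

# Made-up columns: var_one uses 1..3, aux_cat skips the value 2
toy <- data.frame(var_one = c(2, 3, NA, 1),
                  aux_cat = c(1, 3, 3, 1))

# One logical result per column
checks <- vapply(toy, is_consecutive_from_one, logical(1))
print(checks)
```

Applied to the first p columns of the real data, a FALSE would point at a column worth inspecting before running the EM step.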
4. Finding the maximum likelihood (ML) estimate:
thetahat <- em.mix(s)
This is where I encounter the following error:
Steps of EM:
1...2...3...Error in em.mix(s) : NA/NaN/Inf in foreign function call (arg 6)
I think this must have something to do with my auxiliary variables, but I'm not sure. Any help would be much appreciated.

How can I recode every i to i-1

I have data like
table(data$num)
# 1 2 3 4 5 6 7 .... 100
# 10 2 13 2 7 8 19 2
I want to recode every i to i-1, like
table(data$num)
# 0 1 2 3 4 5 6 .... 99
# 10 2 13 2 7 8 19 2
How can I do this?
Use:
table(data$num - 1)
Taking mtcars as an example:
table(mtcars$cyl)
# 4 6 8
#11 7 14
table(mtcars$cyl - 1)
# 3 5 7
#11 7 14
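One caveat, assuming data$num might be stored as a factor rather than as a number (the question does not say): subtracting 1 from a factor gives NA with a warning, so the labels have to be converted to numbers first. A small sketch with made-up values:

```r
# Made-up factor whose labels are numbers
num <- factor(c(1, 2, 2, 10))

# Convert labels to numbers first; as.numeric(num) alone would return
# the internal level codes (1, 2, 2, 3), not the labels
shifted <- as.numeric(as.character(num)) - 1

tab <- table(shifted)
print(tab)
```

To keep the recoded values rather than only tabulate them, the result can be assigned back to the column.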

Clogit function with a support.CEs design does not converge

I designed a choice experiment (CE) using the package support.CEs. I generated a CE design with 3 attributes and 4 levels per attribute. The questionnaire had 4 alternatives and 4 blocks:
des1 <- rotation.design(attribute.names = list(
Qualitat = c("Aigua potable", "Cosetes.blanques.flotant", "Aigua.pou", "Aigua.marro"),
Disponibilitat.acces = c("Aixeta.24h", "Aixeta.10h", "Diposit.comunitari", "Pou.a.20"),
Preu = c("No.problemes.€", "Esforç.economic", "No.pagues.acces", "No.pagues.no.acces")),
nalternatives = 4, nblocks = 4, row.renames = FALSE,
randomize = TRUE, seed = 987)
The questionnaire was answered by 15 people (ID 1-15), giving 60 outputs (15 people each responding to 4 blocks):
ID BLOCK q1 q2 q3 q4
1 1 1 1 2 3 3
2 1 2 1 3 3 4
3 1 3 5 1 3 5
4 1 4 5 2 2 5
5 2 1 1 2 4 3
6 2 2 1 4 3 4
7 2 3 3 1 3 2
8 2 4 1 2 2 2
9 3 1 1 2 2 2
10 3 2 1 4 3 4
11 3 3 3 1 3 4
12 3 4 3 2 1 4
13 4 1 1 5 4 3
14 4 2 1 4 5 4
15 4 3 5 5 3 2
16 4 4 5 2 5 5
17 5 1 1 2 4 2
18 5 2 3 2 3 2
19 5 3 3 1 3 4
20 5 4 3 2 1 4
21 6 1 1 5 5 5
22 6 2 1 3 3 4
23 6 3 3 1 3 4
24 6 4 1 2 2 2
25 7 1 1 2 4 3
26 7 2 4 2 3 4
27 7 3 3 1 3 3
28 7 4 3 4 5 5
29 8 1 1 3 2 3
30 8 2 1 4 3 4
31 8 3 3 1 3 4
32 8 4 1 2 2 1
33 9 1 1 2 3 3
34 9 2 1 3 3 4
35 9 3 5 1 3 5
36 9 4 5 2 2 5
37 15 1 1 5 5 5
38 15 2 4 4 5 4
39 15 3 5 5 3 5
40 15 4 4 3 5 5
41 11 1 1 5 5 5
42 11 2 4 4 5 4
43 11 3 5 5 3 5
44 11 4 5 3 5 5
45 12 1 1 2 4 3
46 12 2 4 2 3 4
47 12 3 3 1 3 3
48 12 4 3 4 5 5
49 13 1 1 2 2 2
50 13 2 1 4 3 4
51 13 3 3 1 3 2
52 13 4 1 2 2 2
53 14 1 1 1 3 3
54 14 2 1 4 1 4
55 14 3 4 1 3 2
56 14 4 3 2 1 2
57 15 1 1 1 3 2
58 15 2 5 2 1 4
59 15 3 4 4 3 1
60 15 4 3 4 1 4
The problem is that, when I merge the questions and answers matrix with the call
dataset1 <- make.dataset(respondent.dataset = res1,
choice.indicators = c("q1","q2","q3","q4"),
design.matrix = desmat1)
R shows a warning message: In fitter(X, Y, strats, offset, init, control, weights = weights, :
Ran out of iterations and did not converge
I expected the generated matrix desmat1 to have 4800 observations (80 possible combinations and 60 outputs). Instead, I have only 1200 observations. The matrix dataset1 only shows the combination for 1 set of alternatives instead of the 4.
For example, for ID 1, Block 1, Question 1, only alternative 1 appears. It matches the answer selected by the person, but in other cases it does not match, and that information is lost in R, so the results when clogit is applied are wrong.
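The row counts quoted above can be checked with simple arithmetic. This sketch only uses the numbers stated in the post (80, 60 and 1200); the variable names are illustrative:

```r
combinations <- 80     # possible combinations, as stated in the post
outputs <- 60          # 15 respondents x 4 blocks
observed_rows <- 1200  # rows actually obtained

expected_rows <- combinations * outputs
shortfall_factor <- expected_rows / observed_rows

print(expected_rows)
print(shortfall_factor)
```

The shortfall is exactly a factor of 4, which would be consistent with only one of the four response columns being used, although the arithmetic alone does not prove that.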
I hope the problem is clear.
Regards,
Edit:
I found my problem. When I make the dataset from the respondent.dataset that I generated in .csv format, R detects only the q1 response instead of q1-q4. The call
dataset1 <- make.dataset(respondent.dataset = res1,
choice.indicators = c("q1","q2","q3","q4"),
design.matrix = desmat1)
detects q1-q4 as new columns. But the key is that q1-q4 have to fill the QES column in dataset1. I did another CE before with 1 block, and the dataset was built correctly on reading the respondent.dataset. So the key point is that I am now using 4 blocks, but I do not know how to make R interpret q1-q4 as the QES column for each block.
The res1 matrix (respondent.dataset): the complete matrix has 60 rows = 15 respondents (ID 1-15) * 4 questions (the QES column in make.dataset).
Kind regards,

Multiply the rows of a matrix to get a vector: J, j701

I am programming with J.
I have this matrix:
F =: 5>\i.10
F
0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
How can I get this vector as the result:
(*/ 0 1 2 3 4), (*/ 1 2 3 4 5), (*/ 2 3 4 5 6), (*/ 3 4 5 6 7), (*/ 4 5 6 7 8), (*/ 5 6 7 8 9)
0 120 720 2520 6720 15120
NB. I want to multiply all the rows
I tried:
*/ F
0 720 5040 20160 60480
but, as you can see, it multiplies down the columns, and I want the rows.
How can I use */ to multiply along the rows? Thank you all!
In short, what you want is 5 */\ i.10
5 */\ i.10
0 120 720 2520 6720 15120
However, if you ever run across this issue in another context, and you really want to address the rows, you could say:
]M=:5>\i. 10
0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
*/ rows M
0 120 720 2520 6720 15120
rows is defined by the standard library as "1. That is, it applies the verb at rank 1, so */ is applied to each row (each rank-1 cell) separately. Rank is a fundamental concept in J, and you'll need to understand it to progress with the language.
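For readers more familiar with R, which the other questions on this page use, a rough analogue of applying */ at rank 1 is to take the product along each row of a matrix. This is only an illustration of the idea, not J code; M is rebuilt by hand here:

```r
# Rebuild the 6 x 5 matrix of length-5 windows over 0..9
# (sapply returns one window per column, so transpose to get one per row)
M <- t(sapply(0:5, function(i) i + 0:4))

# Product along each row: analogous to applying */ to each rank-1 cell
row_products <- apply(M, 1, prod)

# Product down each column: analogous to plain */ on the whole matrix
col_products <- apply(M, 2, prod)

print(row_products)
print(col_products)
```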

Resources