users,
I have received data from a conjoint survey experiment and want to reshape it from wide to long format. This turns out to be slightly complicated. I am fairly sure it can be done with cj_tidy (package cregg), but I can't work it out myself.
In the survey, respondents were asked to compare two organizations that vary across 7 profiles (Efficiency, Openness, Inclusion, Leader, Gain & System). In total, each respondent was presented with four comparisons, so 2 organizations and 4 comparisons (4x2). In each comparison they had to choose one of the presented organizations and, after choosing, rate each of them separately.
At the moment, the profile variables are named like this: org1_Effeciency_conj_1, org1_Oppenes_conj_1, etc. The first part, "org", indicates whether it is the first or second organization. The last part, "conj", indicates the order of the conjoint/comparison, where "conj_4" is the last comparison. The CHOICE variables also follow the order of the comparisons, for example "CHOICE_conj1", "CHOICE_conj2", where 1 means the respondent chose "org1" and 2 means "org2" was chosen. The RATING variables hold a value from 0 to 10 for each organization: RATING_conj1_org1, RATING_conj1_org2, etc.
The current wide format is not suitable for conjoint analysis. What I need is 8 observations per respondent (4x2=8), where the variable CHOICE indicates whether that organization was chosen (1 if yes, 0 if no). In the same way, the variable RATING should hold the rating the respondent gave to each organization (0 to 10).
This is how I would like the data to look:
Please note that there are also covariates such as Q1 and Q2 in the picture. They are not part of the experiment and should remain constant across all observations of the same respondent.
Below I share 50 observations from my real data.
> dput(cjdata_wide) structure(list(ID = 1:50, org1_Effeciency_conj_1 =
> c(3L, 2L, 1L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 3L,
> 2L, 3L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 3L, 3L, 1L, 1L, 3L, 1L, 1L, 3L,
> 3L, 2L, 3L, 2L, 3L, 2L, 1L, 1L, 3L, 2L, 1L, 1L, 1L, 2L, 2L, 1L ),
> org1_Oppenes_conj_1 = c(3L, 3L, 1L, 3L, 1L, 3L, 2L, 3L, 2L, 3L, 1L,
> 1L, 1L, 2L, 3L, 2L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 1L, 3L,
> 1L, 2L, 2L, 3L, 2L, 2L, 1L, 3L, 1L, 3L, 2L, 2L, 1L, 2L, 3L, 3L, 3L,
> 3L, 3L, 2L, 3L, 1L), org1_Inclusion_conj_1 = c(2L, 1L, 1L, 2L, 2L,
> 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L), org1_Leader_conj_1 =
> c(5L, 6L, 3L, 6L, 1L, 4L, 2L, 6L, 1L, 6L, 1L, 2L, 2L, 6L, 3L, 2L, 6L,
> 3L, 5L, 6L, 3L, 1L, 4L, 3L, 5L, 5L, 2L, 1L, 4L, 1L, 3L, 4L, 2L, 3L,
> 5L, 2L, 1L, 3L, 3L, 2L, 1L, 4L, 1L, 5L, 2L, 6L, 1L, 4L, 2L, 3L),
> org1_Gain_conj_1 = c(4L, 4L, 1L, 3L, 3L, 8L, 3L, 2L, 6L, 5L, 1L, 6L,
> 3L, 8L, 1L, 3L, 6L, 2L, 2L, 5L, 5L, 3L, 4L, 8L, 6L, 4L, 5L, 6L, 6L,
> 8L, 4L, 4L, 5L, 7L, 6L, 7L, 3L, 7L, 8L, 2L, 6L, 4L, 6L, 4L, 8L, 4L,
> 6L, 4L, 3L, 6L), org1_System_conj_1 = c(5L, 4L, 5L, 1L, 4L, 4L, 5L,
> 1L, 2L, 2L, 4L, 3L, 1L, 4L, 4L, 2L, 3L, 3L, 2L, 4L, 3L, 1L, 4L, 3L,
> 1L, 1L, 5L, 3L, 1L, 3L, 5L, 4L, 5L, 3L, 2L, 4L, 1L, 2L, 3L, 4L, 1L,
> 1L, 3L, 5L, 5L, 5L, 1L, 1L, 5L, 3L), org2_Effeciency_conj_1 = c(2L,
> 1L, 3L, 2L, 1L, 3L, 1L, 2L, 2L, 2L, 3L, 2L, 3L, 3L, 3L, 2L, 2L, 1L,
> 2L, 2L, 2L, 3L, 1L, 3L, 1L, 3L, 2L, 1L, 2L, 2L, 1L, 2L, 3L, 1L, 2L,
> 1L, 1L, 3L, 2L, 1L, 3L, 3L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 3L),
> org2_Oppenes_conj_1 = c(1L, 1L, 3L, 1L, 3L, 1L, 1L, 2L, 3L, 2L, 3L,
> 3L, 2L, 1L, 1L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 1L,
> 2L, 3L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 2L, 3L, 1L, 1L, 1L,
> 2L, 1L, 1L, 1L, 3L), org2_Inclusion_conj_1 = c(1L, 2L, 2L, 1L, 1L,
> 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L), org2_Leader_conj_1 =
> c(4L, 5L, 6L, 3L, 2L, 5L, 1L, 3L, 6L, 2L, 4L, 6L, 6L, 5L, 6L, 4L, 1L,
> 2L, 4L, 2L, 4L, 6L, 5L, 6L, 4L, 1L, 3L, 5L, 3L, 5L, 6L, 1L, 6L, 4L,
> 1L, 3L, 4L, 2L, 1L, 3L, 4L, 3L, 5L, 2L, 4L, 4L, 3L, 3L, 4L, 2L),
> org2_Gain_conj_1 = c(5L, 1L, 6L, 5L, 8L, 6L, 4L, 3L, 8L, 8L, 7L, 7L,
> 7L, 5L, 7L, 7L, 2L, 6L, 7L, 7L, 6L, 8L, 3L, 1L, 8L, 2L, 6L, 2L, 5L,
> 6L, 7L, 1L, 7L, 2L, 2L, 5L, 8L, 6L, 2L, 7L, 8L, 7L, 1L, 8L, 4L, 3L,
> 4L, 7L, 7L, 7L), org2_System_conj_1 = c(3L, 3L, 3L, 4L, 3L, 3L, 3L,
> 5L, 4L, 4L, 1L, 4L, 3L, 1L, 5L, 5L, 5L, 4L, 3L, 3L, 4L, 4L, 1L, 5L,
> 5L, 3L, 4L, 2L, 5L, 2L, 2L, 5L, 3L, 4L, 3L, 5L, 5L, 5L, 5L, 2L, 3L,
> 4L, 2L, 1L, 3L, 3L, 2L, 4L, 4L, 2L), org1_Effeciency_conj_2 = c(2L,
> 1L, 2L, 3L, 3L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 2L, 3L,
> 3L, 1L, 2L, 1L, 2L, 3L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 1L, 2L,
> 1L, 1L, 3L, 1L, 3L, 1L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 3L, 3L),
> org1_Oppenes_conj_2 = c(1L, 3L, 2L, 1L, 2L, 3L, 3L, 2L, 1L, 3L, 3L,
> 2L, 1L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 3L, 2L, 1L, 3L, 2L,
> 3L, 3L, 3L, 3L, 2L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 1L, 1L, 1L, 1L, 1L,
> 1L, 3L, 3L, 2L, 3L), org1_Inclusion_conj_2 = c(2L, 1L, 1L, 2L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L,
> 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
> 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L), org1_Leader_conj_2 =
> c(3L, 3L, 2L, 2L, 5L, 5L, 6L, 2L, 2L, 1L, 6L, 5L, 2L, 1L, 2L, 4L, 5L,
> 4L, 3L, 6L, 4L, 1L, 5L, 3L, 1L, 5L, 5L, 4L, 6L, 6L, 5L, 6L, 5L, 4L,
> 4L, 6L, 3L, 4L, 6L, 2L, 4L, 4L, 1L, 4L, 4L, 3L, 3L, 1L, 4L, 4L),
> org1_Gain_conj_2 = c(3L, 1L, 7L, 7L, 2L, 1L, 8L, 1L, 2L, 7L, 5L, 4L,
> 4L, 3L, 6L, 3L, 1L, 1L, 8L, 3L, 4L, 3L, 3L, 5L, 4L, 3L, 4L, 8L, 6L,
> 8L, 3L, 1L, 8L, 5L, 6L, 3L, 3L, 6L, 7L, 1L, 3L, 6L, 5L, 7L, 6L, 6L,
> 3L, 4L, 2L, 6L), org1_System_conj_2 = c(5L, 1L, 5L, 1L, 4L, 3L, 3L,
> 4L, 2L, 1L, 5L, 3L, 5L, 3L, 4L, 2L, 2L, 3L, 4L, 1L, 1L, 4L, 3L, 4L,
> 3L, 2L, 1L, 1L, 4L, 5L, 2L, 3L, 5L, 3L, 5L, 2L, 4L, 2L, 1L, 5L, 5L,
> 1L, 2L, 2L, 5L, 2L, 4L, 3L, 2L, 3L), org2_Effeciency_conj_2 = c(3L,
> 3L, 1L, 2L, 2L, 1L, 3L, 1L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 1L,
> 2L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 2L, 3L, 3L, 3L,
> 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L),
> org2_Oppenes_conj_2 = c(2L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 2L, 2L,
> 3L, 3L, 1L, 2L, 1L, 2L, 3L, 3L, 3L, 3L, 2L, 3L, 1L, 1L, 3L, 1L, 3L,
> 2L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 3L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 1L, 1L, 2L), org2_Inclusion_conj_2 = c(1L, 2L, 2L, 1L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L,
> 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
> 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L), org2_Leader_conj_2 =
> c(6L, 6L, 1L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 3L, 5L, 2L, 1L, 5L, 4L,
> 6L, 4L, 2L, 3L, 3L, 1L, 4L, 2L, 2L, 6L, 6L, 1L, 5L, 4L, 4L, 1L, 3L,
> 3L, 4L, 5L, 5L, 3L, 3L, 6L, 3L, 2L, 5L, 2L, 6L, 4L, 2L, 5L, 1L),
> org2_Gain_conj_2 = c(8L, 5L, 3L, 6L, 8L, 2L, 2L, 2L, 7L, 6L, 4L, 1L,
> 6L, 7L, 2L, 1L, 2L, 2L, 3L, 2L, 5L, 5L, 4L, 2L, 7L, 2L, 7L, 4L, 7L,
> 1L, 2L, 5L, 1L, 2L, 7L, 1L, 6L, 2L, 8L, 7L, 7L, 1L, 6L, 3L, 3L, 2L,
> 5L, 3L, 4L, 2L), org2_System_conj_2 = c(1L, 5L, 3L, 4L, 5L, 1L, 4L,
> 3L, 4L, 4L, 4L, 5L, 2L, 2L, 1L, 3L, 4L, 4L, 5L, 2L, 5L, 1L, 2L, 1L,
> 2L, 3L, 3L, 4L, 1L, 3L, 3L, 5L, 4L, 5L, 1L, 5L, 5L, 5L, 4L, 3L, 2L,
> 4L, 4L, 3L, 3L, 4L, 3L, 1L, 1L, 2L), org1_Effeciency_conj_3 = c(1L,
> 3L, 3L, 1L, 2L, 3L, 3L, 1L, 2L, 3L, 1L, 3L, 3L, 3L, 2L, 3L, 2L, 1L,
> 1L, 2L, 2L, 3L, 2L, 1L, 3L, 3L, 2L, 3L, 2L, 1L, 2L, 3L, 3L, 1L, 3L,
> 3L, 2L, 1L, 1L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 2L, 3L, 3L, 1L),
> org1_Oppenes_conj_3 = c(2L, 3L, 3L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L,
> 2L, 3L, 3L, 1L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 3L, 3L, 1L,
> 3L, 1L, 2L, 3L, 2L, 1L, 3L, 1L, 3L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 3L,
> 3L, 2L, 3L, 3L, 3L), org1_Inclusion_conj_3 = c(1L, 1L, 1L, 2L, 1L,
> 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
> 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L,
> 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L), org1_Leader_conj_3 =
> c(3L, 1L, 5L, 6L, 3L, 2L, 2L, 6L, 4L, 3L, 3L, 2L, 2L, 1L, 2L, 3L, 5L,
> 6L, 4L, 1L, 2L, 4L, 5L, 1L, 2L, 2L, 2L, 6L, 4L, 6L, 4L, 6L, 1L, 1L,
> 3L, 5L, 4L, 1L, 3L, 6L, 2L, 6L, 6L, 1L, 2L, 2L, 6L, 2L, 6L, 5L),
> org1_Gain_conj_3 = c(2L, 7L, 2L, 4L, 6L, 7L, 2L, 4L, 1L, 5L, 5L, 7L,
> 5L, 7L, 7L, 3L, 2L, 6L, 2L, 5L, 6L, 6L, 7L, 3L, 5L, 6L, 3L, 8L, 1L,
> 2L, 8L, 5L, 2L, 8L, 5L, 6L, 5L, 2L, 5L, 3L, 3L, 2L, 4L, 2L, 4L, 5L,
> 7L, 6L, 2L, 7L), org1_System_conj_3 = c(5L, 5L, 1L, 1L, 4L, 3L, 1L,
> 1L, 2L, 5L, 1L, 5L, 2L, 1L, 5L, 4L, 1L, 1L, 3L, 4L, 5L, 1L, 5L, 3L,
> 3L, 5L, 1L, 3L, 2L, 5L, 2L, 1L, 5L, 1L, 3L, 2L, 5L, 5L, 2L, 1L, 3L,
> 2L, 2L, 4L, 4L, 4L, 2L, 3L, 5L, 4L), org2_Effeciency_conj_3 = c(2L,
> 1L, 2L, 2L, 1L, 2L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 3L,
> 3L, 1L, 3L, 1L, 1L, 2L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 1L, 1L, 3L, 1L,
> 2L, 1L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L),
> org2_Oppenes_conj_3 = c(1L, 1L, 1L, 2L, 3L, 3L, 2L, 1L, 3L, 3L, 1L,
> 3L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 2L, 3L, 1L, 2L, 3L,
> 1L, 2L, 1L, 1L, 3L, 3L, 1L, 3L, 1L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 2L,
> 2L, 1L, 1L, 2L, 1L), org2_Inclusion_conj_3 = c(2L, 2L, 2L, 1L, 2L,
> 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L,
> 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L,
> 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L), org2_Leader_conj_3 =
> c(1L, 5L, 2L, 1L, 2L, 4L, 4L, 1L, 2L, 4L, 5L, 5L, 5L, 4L, 3L, 4L, 6L,
> 3L, 2L, 2L, 5L, 2L, 2L, 5L, 5L, 3L, 5L, 3L, 3L, 1L, 5L, 5L, 2L, 2L,
> 2L, 2L, 1L, 6L, 1L, 5L, 1L, 5L, 1L, 2L, 6L, 6L, 4L, 3L, 2L, 6L),
> org2_Gain_conj_3 = c(1L, 8L, 3L, 5L, 2L, 6L, 3L, 2L, 7L, 1L, 2L, 2L,
> 8L, 1L, 2L, 6L, 1L, 8L, 6L, 3L, 7L, 4L, 5L, 2L, 6L, 8L, 2L, 7L, 6L,
> 8L, 5L, 7L, 3L, 6L, 1L, 8L, 4L, 3L, 7L, 5L, 8L, 8L, 3L, 6L, 3L, 4L,
> 5L, 4L, 4L, 5L), org2_System_conj_3 = c(4L, 1L, 4L, 3L, 3L, 5L, 3L,
> 3L, 4L, 2L, 3L, 1L, 1L, 5L, 2L, 3L, 3L, 2L, 5L, 3L, 1L, 2L, 3L, 5L,
> 1L, 4L, 5L, 2L, 3L, 2L, 3L, 2L, 4L, 3L, 5L, 3L, 1L, 1L, 3L, 2L, 4L,
> 5L, 5L, 3L, 1L, 1L, 4L, 1L, 4L, 5L), org1_Effeciency_conj_4 = c(1L,
> 1L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 2L, 3L,
> 3L, 1L, 1L, 3L, 1L, 3L, 2L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 1L, 3L, 2L,
> 3L, 3L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 1L, 1L, 3L, 3L, 1L, 3L),
> org1_Oppenes_conj_4 = c(2L, 1L, 2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 1L,
> 1L, 3L, 1L, 3L, 2L, 2L, 3L, 2L, 3L, 1L, 3L, 3L, 1L, 1L, 1L, 3L, 1L,
> 1L, 1L, 2L, 3L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 3L, 3L, 1L, 3L,
> 3L, 3L, 2L, 3L, 2L), org1_Inclusion_conj_4 = c(2L, 2L, 1L, 2L, 2L,
> 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
> 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
> 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L), org1_Leader_conj_4 =
> c(4L, 6L, 5L, 1L, 2L, 1L, 1L, 3L, 3L, 6L, 2L, 5L, 6L, 6L, 6L, 2L, 3L,
> 3L, 4L, 4L, 4L, 1L, 5L, 5L, 2L, 6L, 2L, 5L, 4L, 4L, 2L, 5L, 6L, 5L,
> 1L, 4L, 4L, 3L, 4L, 2L, 3L, 2L, 5L, 1L, 3L, 6L, 2L, 6L, 4L, 1L),
> org1_Gain_conj_4 = c(3L, 1L, 2L, 3L, 4L, 7L, 2L, 7L, 4L, 1L, 6L, 3L,
> 5L, 8L, 3L, 7L, 8L, 1L, 3L, 6L, 7L, 1L, 1L, 1L, 1L, 3L, 4L, 3L, 1L,
> 8L, 3L, 2L, 1L, 7L, 2L, 4L, 4L, 1L, 6L, 8L, 6L, 3L, 7L, 3L, 8L, 7L,
> 3L, 1L, 3L, 3L), org1_System_conj_4 = c(5L, 1L, 2L, 3L, 2L, 5L, 5L,
> 2L, 3L, 5L, 3L, 4L, 5L, 2L, 4L, 2L, 3L, 2L, 4L, 4L, 1L, 1L, 4L, 3L,
> 2L, 4L, 3L, 1L, 5L, 5L, 2L, 4L, 5L, 4L, 3L, 3L, 1L, 5L, 4L, 1L, 2L,
> 3L, 5L, 5L, 3L, 2L, 5L, 2L, 3L, 3L), org2_Effeciency_conj_4 = c(3L,
> 3L, 3L, 1L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 3L, 2L, 1L, 1L,
> 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 2L, 2L, 1L, 3L, 1L, 3L,
> 2L, 2L, 3L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 3L, 2L, 2L, 3L, 1L),
> org2_Oppenes_conj_4 = c(1L, 3L, 1L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 2L,
> 3L, 2L, 2L, 2L, 1L, 3L, 1L, 3L, 2L, 2L, 1L, 1L, 3L, 3L, 2L, 1L, 3L,
> 3L, 2L, 3L, 1L, 3L, 3L, 2L, 1L, 3L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 1L,
> 1L, 2L, 3L, 1L, 1L), org2_Inclusion_conj_4 = c(1L, 1L, 2L, 1L, 1L,
> 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
> 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L,
> 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L), org2_Leader_conj_4 =
> c(1L, 5L, 2L, 6L, 6L, 6L, 2L, 1L, 2L, 4L, 5L, 3L, 4L, 4L, 2L, 1L, 6L,
> 1L, 1L, 2L, 6L, 3L, 1L, 4L, 4L, 3L, 3L, 4L, 6L, 5L, 3L, 2L, 3L, 6L,
> 6L, 5L, 2L, 6L, 3L, 5L, 5L, 1L, 6L, 5L, 4L, 5L, 1L, 2L, 2L, 6L),
> org2_Gain_conj_4 = c(5L, 8L, 1L, 2L, 7L, 2L, 7L, 8L, 2L, 6L, 7L, 7L,
> 7L, 5L, 8L, 4L, 6L, 6L, 6L, 4L, 6L, 6L, 7L, 2L, 5L, 6L, 6L, 1L, 8L,
> 5L, 2L, 5L, 6L, 3L, 3L, 7L, 7L, 8L, 4L, 7L, 5L, 2L, 2L, 7L, 6L, 4L,
> 7L, 4L, 4L, 1L), org2_System_conj_4 = c(2L, 3L, 3L, 2L, 4L, 4L, 4L,
> 4L, 1L, 4L, 1L, 2L, 4L, 5L, 2L, 3L, 5L, 1L, 1L, 1L, 5L, 4L, 2L, 2L,
> 3L, 2L, 1L, 4L, 3L, 4L, 5L, 3L, 1L, 3L, 2L, 4L, 4L, 1L, 3L, 3L, 4L,
> 5L, 4L, 4L, 1L, 1L, 3L, 5L, 5L, 1L), CHOICE_conj1 = c(2L, 2L, 1L, 2L,
> 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L,
> 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
> 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L ), RATING_conj1_org1 =
> c(1L, 3L, 6L, 5L, 3L, 1L, 5L, 2L, 0L, 7L, 6L, 8L, 5L, 10L, 8L, 10L,
> 1L, 6L, 5L, 8L, 2L, 7L, 0L, 6L, 8L, 0L, 4L, 2L, 8L, 6L, 7L, 7L, 7L,
> 2L, 3L, 8L, 6L, 7L, 2L, 7L, 3L, 8L, 5L, 7L, 8L, 6L, 6L, 10L, 3L, 9L),
> RATING_conj1_org2 = c(7L, 6L, 4L, 7L, 7L, 1L, 6L, 6L, 0L, 3L, 2L, 0L,
> 0L, 9L, 5L, 3L, 1L, 6L, 8L, 5L, 2L, 2L, 0L, 4L, 5L, 0L, 6L, 8L, 3L,
> 5L, 6L, 6L, 5L, 8L, 3L, 8L, 3L, 1L, 5L, 9L, 7L, 3L, 7L, 6L, 6L, 4L,
> 4L, 0L, 6L, 7L), CHOICE_conj2 = c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L,
> 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L,
> 2L, 1L, 1L, 2L, 1L, 1L, 2L), RATING_conj2_org1 = c(5L, 4L, 4L, 4L,
> 5L, 1L, 5L, 7L, 0L, 3L, 5L, 6L, 5L, 9L, 5L, 3L, 1L, 4L, 4L, 8L, 3L,
> 7L, 0L, 9L, 9L, 1L, 3L, 2L, 3L, 5L, 6L, 4L, 5L, 8L, 3L, 7L, 6L, 1L,
> 7L, 0L, 7L, 6L, 6L, 8L, 9L, 7L, 5L, 10L, 7L, 7L), RATING_conj2_org2 =
> c(0L, 2L, 7L, 4L, 8L, 1L, 7L, 8L, 0L, 3L, 6L, 0L, 0L, 7L, 8L, 10L,
> 0L, 3L, 6L, 8L, 2L, 5L, 0L, 4L, 5L, 2L, 5L, 5L, 7L, 5L, 5L, 7L, 1L,
> 2L, 3L, 8L, 3L, 7L, 3L, 6L, 2L, 8L, 8L, 8L, 7L, 6L, 6L, 5L, 5L, 9L),
> CHOICE_conj3 = c(2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L,
> 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L,
> 2L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L,
> 2L, 1L, 1L), RATING_conj3_org1 = c(4L, 6L, 4L, 6L, 7L, 1L, 6L, 3L,
> 0L, 6L, 2L, 7L, 0L, 9L, 5L, 3L, 1L, 3L, 4L, 7L, 1L, 8L, 0L, 5L, 5L,
> 1L, 5L, 2L, 8L, 5L, 5L, 5L, 3L, 8L, 2L, 4L, 5L, 7L, 8L, 6L, 7L, 6L,
> 4L, 9L, 7L, 5L, 4L, 2L, 8L, 9L), RATING_conj3_org2 = c(7L, 4L, 6L,
> 5L, 6L, 1L, 3L, 7L, 0L, 3L, 2L, 3L, 3L, 6L, 5L, 10L, 0L, 3L, 4L, 10L,
> 0L, 4L, 0L, 7L, 5L, 2L, 3L, 2L, 3L, 5L, 8L, 2L, 7L, 2L, 7L, 5L, 3L,
> 3L, 0L, 0L, 2L, 6L, 7L, 8L, 5L, 2L, 8L, 10L, 6L, 8L), CHOICE_conj4 =
> c(2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L,
> 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
> 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L),
> RATING_conj4_org1 = c(4L, 5L, 8L, 6L, 4L, 1L, 8L, 3L, 0L, 7L, 5L, 5L,
> 2L, 8L, 7L, 10L, 1L, 5L, 5L, 10L, 1L, 3L, 0L, 6L, 7L, 1L, 2L, 5L, 7L,
> 8L, 7L, 3L, 6L, 2L, 2L, 8L, 5L, 5L, 4L, 5L, 3L, 7L, 3L, 8L, 8L, 6L,
> 2L, 10L, 7L, 7L), RATING_conj4_org2 = c(6L, 4L, 4L, 4L, 5L, 1L, 6L,
> 7L, 0L, 3L, 6L, 2L, 0L, 5L, 5L, 3L, 0L, 3L, 4L, 9L, 4L, 8L, 0L, 5L,
> 6L, 2L, 8L, 3L, 2L, 5L, 5L, 7L, 2L, 6L, 7L, 8L, 3L, 3L, 1L, 5L, 7L,
> 10L, 7L, 10L, 5L, 5L, 7L, 5L, 5L, 8L), Q7 = c(0L, 0L, 8L, 9L, 6L,
> 10L, 2L, 2L, 6L, 8L, 0L, 0L, 5L, 2L, 7L, 7L, 3L, 0L, 0L, 5L, 6L, 4L,
> 7L, 2L, 977L, 0L, 6L, 3L, 2L, 4L, 7L, 8L, 2L, 1L, 9L, 8L, 10L, 6L,
> 0L, 9L, 5L, 0L, 3L, 0L, 0L, 0L, 2L, 5L, 977L, 2L), Q8 = c(1L, 1L, 2L,
> 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 977L, 1L, 2L, 2L, 1L, 3L, 1L, 1L,
> 3L, 1L, 3L, 1L, 2L, 1L, 977L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 2L,
> 3L, 3L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 977L, 1L), Q9 = c(4L, 8L,
> 1L, 0L, 4L, 0L, 8L, 7L, 0L, 0L, 10L, 10L, 0L, 4L, 0L, 10L, 4L, 5L,
> 10L, 8L, 2L, 9L, 0L, 5L, 2L, 0L, 5L, 4L, 4L, 8L, 0L, 0L, 5L, 6L, 2L,
> 0L, 0L, 0L, 7L, 4L, 5L, 5L, 6L, 10L, 7L, 4L, 6L, 0L, 977L, 7L), Q10 =
> c(8L, 10L, 7L, 5L, 7L, 2L, 7L, 8L, 0L, 2L, 10L, 10L, 0L, 10L, 2L,
> 10L, 8L, 8L, 10L, 8L, 7L, 10L, 5L, 7L, 4L, 0L, 7L, 7L, 10L, 10L, 4L,
> 2L, 5L, 9L, 5L, 6L, 2L, 4L, 10L, 3L, 5L, 7L, 9L, 10L, 10L, 10L, 8L,
> 977L, 977L, 10L), Q11 = c(10L, 9L, 1L, 4L, 5L, 0L, 5L, 6L, 1L, 3L,
> 9L, 10L, 0L, 10L, 7L, 7L, 5L, 7L, 10L, 10L, 9L, 7L, 0L, 8L, 7L, 0L,
> 7L, 7L, 8L, 10L, 5L, 2L, 2L, 10L, 5L, 1L, 2L, 4L, 6L, 4L, 7L, 10L,
> 6L, 8L, 8L, 6L, 8L, 6L, 977L, 10L), Q12 = c(0L, 0L, 0L, 5L, 1L, 10L,
> 2L, 0L, 0L, 2L, 0L, 0L, 5L, 0L, 6L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 10L, 3L, 0L, 0L, 977L, 10L, 7L, 0L, 0L, 5L, 8L, 2L, 0L, 966L,
> 7L, 977L, 0L, 0L, 0L, 0L, 0L, 0L, 977L, 977L, 0L), Q13 = c(2L, 2L,
> 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 977L, 2L), Q14 =
> c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Q1 =
> c(2L, 2L, 8L, 6L, 5L, 1L, 7L, 3L, 7L, 4L, 1L, 6L, 4L, 1L, 5L, 10L,
> 5L, 4L, 3L, 7L, 2L, 5L, 3L, 5L, 977L, 0L, 5L, 4L, 4L, 7L, 5L, 3L, 8L,
> 3L, 3L, 0L, 5L, 6L, 3L, 4L, 0L, 3L, 3L, 2L, 7L, 4L, 2L, 7L, 4L, 7L),
> Q2 = c(1L, 1L, 1L, 977L, 1L, 3L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 1L, 3L,
> 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 977L, 2L, 3L, 3L, 1L, 1L, 3L, 2L,
> 1L, 1L, 3L, 3L, 2L, 3L, 3L, 2L, 1L, 3L, 3L, 3L, 977L, 1L, 3L, 977L,
> 977L, 1L), gender = c(1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
> 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L,
> 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L,
> 1L, 2L, 2L, 1L), profile_age = c(5L, 2L, 5L, 5L, 3L, 5L, 2L, 5L, 3L,
> 5L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 2L, 5L, 5L, 5L, 5L, 2L, 5L, 5L,
> 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 5L, 5L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L ), educ = c(6L, 5L, 2L, 5L, 6L, 6L, 4L, 6L,
> 3L, 5L, 4L, 5L, 6L, 4L, 4L, 6L, 6L, 6L, 3L, 6L, 5L, 6L, 5L, 5L, 3L,
> 4L, 6L, 6L, 5L, 3L, 3L, 4L, 3L, 6L, 3L, 5L, 5L, 6L, 3L, 5L, 3L, 3L,
> 3L, 3L, 4L, 5L, 5L, 4L, 2L, 3L)), class = "data.frame", row.names =
> c(NA,
> -50L))
What I have done so far is this:
library(cregg)
str(long <- cj_tidy(cjdata_wide,
profile_variables = c("All the profile variables"),
task_variables = c("CHOICE AND RATING VARIABLES HERE"),
id = ~ id))
stopifnot(nrow(long) == nrow(data)*4*2
But I keep getting errors. I have tried to follow the example given in the cregg package, but with no success. Any help is much appreciated! I am open to any approach, whether through the cregg package or, for instance, tidyr.
Your data not being in a standard form makes this a difficult problem. Here is a solution using the tidyr package.
The solution involves 3 parts: dealing with the profiles, then the ratings, and finally the binary choice.
The key to the profiles part is to pivot longer, break the column names up into their component parts, and then pivot wider so that each attribute becomes a column.
The rating and binary choice parts involve pivoting longer and then aligning the rows with the profiles.
library(tidyr)
library(dplyr)
# Part 1: profile attributes. Split names such as "org1_Effeciency_conj_1"
# into org / attribute / "conj" / comparison order, then spread attributes to columns.
answer <- cjdata_wide %>%
  pivot_longer(cols = starts_with("org"), names_to = c("org", "Cat", "conj", "order"),
               values_to = "values", names_sep = "_") %>%
  select(-conj) %>%
  select(!starts_with("RATING") & !starts_with("CHOICE")) %>%
  pivot_wider(names_from = "Cat", values_from = "values")

# Part 2: ratings. Names look like "RATING_conj1_org1"; the rows come out in the
# same respondent / comparison / organization order as `answer`, so they align by position.
rating <- cjdata_wide %>%
  select(starts_with("RATING")) %>%
  pivot_longer(cols = everything(), names_to = c("Rating", "conj", "order"),
               values_to = "Choice_Rating", names_sep = "_") %>%
  select(-conj)
answer$Choice_Rating <- rating$Choice_Rating

# Part 3: binary choice. There is one CHOICE value per comparison, so repeat it for
# both organizations and compare it with the digit in "org1" / "org2".
choiceRate <- cjdata_wide %>%
  select(starts_with("CHOICE")) %>%
  pivot_longer(cols = everything(), names_to = c("Choice", "conj"),
               values_to = "Choice_Rating", names_sep = "_") %>%
  select(-conj)
answer$Choice_binary <- ifelse(substr(answer$org, 4, 4) == rep(choiceRate$Choice_Rating, each = 2), 1, 0)
answer
It may be possible to simplify the above. Good luck.
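One possible simplification, sketched on a small made-up data frame rather than the real data: reshape profiles, ratings and choices separately and then join them on explicit keys instead of relying on row order. The `toy` data, the key names `task` and `chosen_org`, and the single attribute `Gain` are assumptions for illustration only; the column naming pattern follows the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature of the wide data: 2 respondents, 2 comparisons, 1 attribute
toy <- data.frame(
  ID = 1:2,
  org1_Gain_conj_1 = c(4L, 1L), org2_Gain_conj_1 = c(5L, 6L),
  org1_Gain_conj_2 = c(3L, 7L), org2_Gain_conj_2 = c(8L, 3L),
  CHOICE_conj1 = c(2L, 1L), CHOICE_conj2 = c(1L, 2L),
  RATING_conj1_org1 = c(1L, 6L), RATING_conj1_org2 = c(7L, 4L),
  RATING_conj2_org1 = c(5L, 4L), RATING_conj2_org2 = c(0L, 7L),
  Q1 = c(2L, 8L)
)

# Profiles: ".value" turns the attribute part of the name into a column
profiles <- toy %>%
  select(!starts_with("RATING") & !starts_with("CHOICE")) %>%
  pivot_longer(starts_with("org"),
               names_to = c("org", ".value", "task"),
               names_pattern = "org(\\d)_(.*)_conj_(\\d)",
               names_transform = list(org = as.integer, task = as.integer))

# Ratings: one row per respondent, comparison and organization
ratings <- toy %>%
  select(ID, starts_with("RATING")) %>%
  pivot_longer(-ID,
               names_to = c("task", "org"),
               names_pattern = "RATING_conj(\\d)_org(\\d)",
               names_transform = list(task = as.integer, org = as.integer),
               values_to = "RATING")

# Choices: one row per respondent and comparison
choices <- toy %>%
  select(ID, starts_with("CHOICE")) %>%
  pivot_longer(-ID,
               names_to = "task",
               names_prefix = "CHOICE_conj",
               names_transform = list(task = as.integer),
               values_to = "chosen_org")

# Join on keys, then recode the choice to 1 (chosen) / 0 (not chosen)
long <- profiles %>%
  left_join(ratings, by = c("ID", "task", "org")) %>%
  left_join(choices, by = c("ID", "task")) %>%
  mutate(CHOICE = as.integer(org == chosen_org)) %>%
  select(-chosen_org) %>%
  arrange(ID, task, org)

long
```

Joining on keys means the result does not depend on the columns appearing in any particular order in the wide data. On the real data the same pattern should apply with all attributes present, but it has not been run against it here.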
Update per Comment
The final data frame has pairs of rows, one for org 1 and one for org 2. I duplicated each choice so that it has the same length as the organization column ("org"). I then compared the duplicated choice with the organization number and set the final value to 1 if they match and 0 otherwise.
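The duplication and comparison step can be seen in isolation with a few made-up values (two comparisons for one respondent; the vectors below are illustrative, not taken from the data):

```r
# Two comparisons, each with a row for org1 and a row for org2
org    <- c("org1", "org2", "org1", "org2")
# One choice per comparison: org2 was chosen first, then org1
chosen <- c(2L, 1L)

# Repeat every choice twice so it lines up with the pairs of rows
chosen_per_row <- rep(chosen, each = 2)   # 2 2 1 1

# The 4th character of "org1"/"org2" is the organization number
choice_binary <- ifelse(substr(org, 4, 4) == chosen_per_row, 1, 0)
choice_binary                              # 0 1 1 0
```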
Regarding the question in the comment: a simple way is to convert the factor column to integers with the as.integer() function, so that the first level becomes 1, the second becomes 2, and so on (you may need to relevel first to get the proper order).
Another option is to create a new "org" column with your factor names properly listed.
Hopefully this provides enough guidance.
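As a small illustration of the as.integer() suggestion, using made-up labels ("low" and "high" are placeholders, not values from the data): the integer codes follow the order of the factor levels, which is alphabetical unless the levels are set explicitly.

```r
x <- c("low", "high", "low")

# Default levels are sorted alphabetically: "high" = 1, "low" = 2
f_default <- factor(x)
as.integer(f_default)   # 2 1 2

# Setting the levels explicitly gives the intended coding: "low" = 1, "high" = 2
f_ordered <- factor(x, levels = c("low", "high"))
as.integer(f_ordered)   # 1 2 1
```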
I have to draw a bar chart with ggplot2 in R with multiple variables (i.e. one bar each for BMI, weight, cholesterol, blood pressure, etc.) in each group (i.e. different populations, e.g. Indian, Korean, Filipino, etc.). But the bars overflow into the next group on the axis; for example, the bars of the Indian group overflow into the Korean group. The axis marks are not adjusted accordingly. I have attached the figure. Can someone please help? My code is below, followed by the dput() of the data.
p = ggplot(data = t,
aes(x = factor(Population, levels = names(sort(table(Population), increasing = TRUE))),
y = Snp_Count,
group = factor(Trait, levels = c("BMI", "DBP", "HDL", "Height", "LDL", "TC", "TG", "WC", "Weight"),
ordered = TRUE)))
p = p + geom_bar(aes(fill = Trait),
position = position_dodge(preserve = "single"),
stat = "identity") +
scale_fill_manual(values = c("#28559A", "#3EB650", "#E56B1F", "#A51890", "#FCC133", "#663300", "#6666ff", "#ff3300", "#ff66ff")) +
coord_flip()
structure(list(Trait = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("BMI",
"DBP", "HDL", "HT", "LDL", "TC", "TG", "WC", "Weight"), class = "factor"),
Association = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "Direct", class = "factor"), TraitClass = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Anthropometric",
"BP", "Lipid"), class = "factor"), Population = structure(c(2L,
3L, 4L, 5L, 7L, 8L, 10L, 11L, 12L, 13L, 22L, 24L, 3L, 5L,
11L, 22L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 15L, 18L, 20L, 28L, 5L, 7L, 13L, 14L, 1L, 3L, 5L, 7L,
9L, 11L, 12L, 16L, 18L, 20L, 22L, 5L, 6L, 7L, 10L, 12L, 18L,
20L, 3L, 5L, 6L, 7L, 8L, 11L, 12L, 13L, 14L, 15L, 18L, 19L,
20L, 21L, 22L, 23L, 26L, 28L, 3L, 4L, 5L, 8L, 12L, 22L, 24L,
3L, 5L, 7L, 8L, 17L, 25L, 27L), .Label = c("ACB", "AFR",
"ASW", "ASW/ACB", "CEU", "CHB", "EAS", "Filipino", "FIN",
"GBR", "Hispanic", "Hispanic/Latinos", "JPT", "Korean", "Kuwaiti",
"Micronesian", "Moroccan", "MXL", "Mylopotamos", "Orcadian",
"Pomak", "SAS", "Saudi_Arabian", "Seychellois", "Surinamese",
"Taiwanese", "Turkish", "YRI"), class = "factor"), Snp_Count = c(3L,
12L, 6L, 17L, 2L, 10L, 1L, 6L, 3L, 3L, 10L, 6L, 1L, 1L, 1L,
1L, 2L, 1L, 10L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L,
2L, 1L, 2L, 20L, 5L, 4L, 1L, 1L, 2L, 7L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 8L, 2L, 4L, 3L, 1L, 2L, 1L, 4L, 20L, 5L,
11L, 2L, 4L, 3L, 4L, 2L, 3L, 4L, 1L, 1L, 1L, 2L, 2L, 1L,
2L, 3L, 2L, 4L, 4L, 1L, 4L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L
), Gene_Count = c(3L, 9L, 7L, 9L, 2L, 8L, 1L, 7L, 3L, 2L,
8L, 7L, 1L, 1L, 1L, 1L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 2L, 2L,
1L, 2L, 1L, 1L, 1L, 1L, 1L, 9L, 6L, 5L, 1L, 1L, 2L, 5L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 6L, 2L, 3L, 3L, 1L, 2L, 1L, 3L,
10L, 4L, 7L, 1L, 3L, 3L, 4L, 1L, 3L, 5L, 1L, 1L, 1L, 3L,
3L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 3L, 2L, 3L, 3L, 2L, 3L, 2L,
2L, 2L)), class = "data.frame", row.names = c(NA, -86L))
The total width of each group in your bar chart is 0.9 by default, which means that 90% of the available space is covered. When you increase the width of the individual bars to 3, they overlap with the neighbouring groups. The maximum value for width should therefore be 1, at which point the bars just touch the adjacent groups.
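To see the effect of width on a dodged bar chart, here is a minimal sketch on made-up data (the data frame `df` and the helper `bar_span()` are hypothetical). It measures how much of the x axis the bars occupy in total: with two groups centred at 1 and 2, anything wider than 2 units means bars have spilled out of their own slot.

```r
library(ggplot2)

# Hypothetical data: 2 groups with 3 bars each
df <- data.frame(
  group = rep(c("A", "B"), each = 3),
  trait = rep(c("x", "y", "z"), times = 2),
  n     = c(3, 1, 2, 2, 4, 1)
)

p_default <- ggplot(df, aes(group, n, fill = trait)) +
  geom_col(position = position_dodge(), width = 0.9)
p_wide <- ggplot(df, aes(group, n, fill = trait)) +
  geom_col(position = position_dodge(), width = 3)

# Total horizontal extent of all bars after dodging
bar_span <- function(p) {
  d <- layer_data(p, 1)
  max(d$xmax) - min(d$xmin)
}

bar_span(p_default)  # stays within the two group slots
bar_span(p_wide)     # extends well beyond them, so groups overlap
```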
In your situation I'd suggest faceting (facet_grid in the code below) instead of a dodged bar chart.
Note: geom_col is the same as geom_bar(stat = "identity").
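A quick way to convince yourself of that equivalence, on made-up data (`df` is a placeholder): build the same plot both ways and compare the bar geometry that ggplot2 computes.

```r
library(ggplot2)

df <- data.frame(trait = c("BMI", "HDL", "TG"), n = c(3, 10, 4))

p_col <- ggplot(df, aes(trait, n)) + geom_col()
p_bar <- ggplot(df, aes(trait, n)) + geom_bar(stat = "identity")

# Both layers should yield the same bar heights and positions
d_col <- layer_data(p_col, 1)
d_bar <- layer_data(p_bar, 1)
all.equal(d_col$ymax, d_bar$ymax)
all.equal(d_col$xmin, d_bar$xmin)
```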
my.df$Trait <- factor(my.df$Trait, levels = c("BMI", "DBP", "HDL", "HT", "LDL", "TC", "TG", "WC", "Weight"))
my.df$Population <- factor(my.df$Population, levels = names(sort(table(my.df$Population), decreasing = FALSE)))
ggplot(my.df, aes(x = Trait, y = Snp_Count, fill = Trait)) +
geom_col(width = 1) +
scale_fill_manual(values = c("#28559A", "#3EB650", "#E56B1F", "#A51890", "#FCC133", "#663300", "#6666ff", "#ff3300", "#ff66ff")) +
# Split the data by Population, allow flexible scales and spacing for y axis (Trait)
facet_grid(Population ~ ., scales = "free_y", space = "free_y", switch = "y") +
coord_flip() +
theme(axis.text.y = element_blank(), # Remove Trait labels (indicated by color)
axis.ticks.y = element_blank(), # Remove tick marks
strip.background = element_blank(),
strip.text.y = element_text(angle = 180, hjust = 1), # Rotate Population labels
panel.spacing.y = unit(3, "pt")) # Spacing between groups
Data
my.df <-
structure(list(Trait = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L),
.Label = c("BMI", "DBP", "HDL", "HT", "LDL", "TC", "TG", "WC", "Weight"), class = "factor"),
Population = structure(c(2L, 3L, 4L, 5L, 7L, 8L, 10L, 11L,
12L, 13L, 22L, 24L, 3L, 5L, 11L, 22L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 18L, 20L, 28L, 5L,
7L, 13L, 14L, 1L, 3L, 5L, 7L, 9L, 11L, 12L, 16L, 18L, 20L,
22L, 5L, 6L, 7L, 10L, 12L, 18L, 20L, 3L, 5L, 6L, 7L, 8L,
11L, 12L, 13L, 14L, 15L, 18L, 19L, 20L, 21L, 22L, 23L, 26L,
28L, 3L, 4L, 5L, 8L, 12L, 22L, 24L, 3L, 5L, 7L, 8L, 17L,
25L, 27L),
.Label = c("ACB", "AFR", "ASW", "ASW/ACB", "CEU",
"CHB", "EAS", "Filipino", "FIN", "GBR", "Hispanic", "Hispanic/Latinos",
"JPT", "Korean", "Kuwaiti", "Micronesian", "Moroccan", "MXL",
"Mylopotamos", "Orcadian", "Pomak", "SAS", "Saudi_Arabian",
"Seychellois", "Surinamese", "Taiwanese", "Turkish", "YRI"), class = "factor"),
Snp_Count = c(3L, 12L, 6L, 17L, 2L,
10L, 1L, 6L, 3L, 3L, 10L, 6L, 1L, 1L, 1L, 1L, 2L, 1L, 10L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 2L, 20L,
5L, 4L, 1L, 1L, 2L, 7L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 8L,
2L, 4L, 3L, 1L, 2L, 1L, 4L, 20L, 5L, 11L, 2L, 4L, 3L, 4L,
2L, 3L, 4L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 3L, 2L, 4L, 4L, 1L,
4L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L)),
class = "data.frame", row.names = c(NA, -86L))
I am running a negative binomial regression.
I would like to know why I have the following errors:
In sqrt(1/i) : NaNs produced
It appears that there are some negative values in "i", but how do I avoid that?
Another one is:
In loglik(n, th, mu, Y, w) : value out of range in 'lgamma'
It is probably a consequence of the first error, so if I fix the first one, the second might go away. Or maybe not.
In some other cases I am able to calculate the regression but the following output seems strange for me:
(Dispersion parameter for Negative Binomial(10684331573) family taken to be 1)
Null deviance: 8779.49 on 359 degrees of freedom
Residual deviance: 270.32 on 200 degrees of freedom
AIC: 2074.7
Number of Fisher Scoring iterations: 1
Theta: 10684331573
Std. Err.: 615849693813
2 x log-likelihood: -1752.749
Do these numbers seem okay? I mean the dispersion parameter, theta and standard error. They look enormously big to me and therefore I am not sure if the results are okay.
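For reference, a minimal sketch of what a theta of this size implies, assuming the parameterisation used by glm.nb, where the variance is mu + mu^2/theta. The values of `mu` below are arbitrary illustrations, not fitted values from my model.

```r
theta <- 10684331573          # the estimate reported in the output above
mu    <- c(1, 10, 100)        # arbitrary example means

# Negative binomial variance under the glm.nb parameterisation
nb_var <- mu + mu^2 / theta
cbind(mu, nb_var, extra = nb_var - mu)   # the extra variance is negligible

# With theta this large the NB probabilities match the Poisson ones
p_nb   <- dnbinom(0:5, size = theta, mu = 2)
p_pois <- dpois(0:5, lambda = 2)
all.equal(p_nb, p_pois, tolerance = 1e-6)
```

So with such a theta the negative binomial fit is numerically almost the same as a Poisson fit.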
I never had any problems like this with Poisson regression, but then I realized that my data are overdispersed, which is why I am using the negative binomial model. However, I am having a lot of trouble with it.
Here is the code:
negbin <- glm.nb(Freq ~ cluster*gender*agecombined*educ, maxit=100)
mod.good <- step(negbin, direction='both', maxit=100)
And here is the dput of the whole dataset:
structure(list(gender = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("1",
"2"), class = "factor"), agecombined = structure(c(1L, 1L, 2L, 2L,
3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L,
5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L,
2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L,
1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L,
6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L,
2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L,
5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L,
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L,
4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L,
6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L,
3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L,
5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L,
2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L,
4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L,
1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L,
6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L,
2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L,
5L, 5L, 6L, 6L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L),
.Label = c("18-24", "25-34", "35-44", "45-54", "55-64", "65 and older"),
class = "factor"), educ = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), .Label =
c("2-year college", "BA", "Illiterate", "MA or higher", "Primary",
"Secondary"), class = "factor"),
cluster = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L), .Label = c("E", "A", "B", "C", "D"
), class = "factor"), Freq = c(27L, 18L, 48L, 29L, 18L, 19L,
14L, 10L, 2L, 1L, 2L, 0L, 48L, 36L, 69L, 54L, 33L, 15L, 12L,
4L, 5L, 1L, 0L, 0L, 2L, 4L, 12L, 14L, 17L, 17L, 23L, 32L,
16L, 17L, 18L, 6L, 4L, 2L, 17L, 7L, 8L, 4L, 5L, 0L, 1L, 0L,
0L, 0L, 53L, 42L, 82L, 58L, 81L, 60L, 42L, 35L, 16L, 14L,
22L, 6L, 83L, 40L, 62L, 54L, 43L, 46L, 26L, 12L, 15L, 3L,
3L, 3L, 11L, 13L, 11L, 23L, 16L, 18L, 11L, 5L, 1L, 3L, 1L,
1L, 26L, 44L, 34L, 54L, 25L, 41L, 19L, 17L, 10L, 3L, 3L,
0L, 4L, 4L, 7L, 14L, 22L, 31L, 14L, 34L, 14L, 33L, 14L, 20L,
7L, 11L, 22L, 11L, 14L, 8L, 8L, 1L, 2L, 0L, 1L, 2L, 29L,
65L, 34L, 84L, 36L, 65L, 28L, 39L, 16L, 15L, 16L, 9L, 25L,
51L, 12L, 38L, 23L, 29L, 22L, 19L, 7L, 5L, 5L, 1L, 7L, 16L,
14L, 35L, 6L, 27L, 8L, 5L, 1L, 1L, 1L, 0L, 24L, 57L, 29L,
53L, 24L, 28L, 11L, 9L, 7L, 2L, 0L, 0L, 3L, 7L, 1L, 8L, 2L,
18L, 5L, 13L, 10L, 11L, 5L, 10L, 3L, 1L, 5L, 13L, 4L, 2L,
2L, 1L, 1L, 0L, 0L, 0L, 14L, 51L, 21L, 77L, 23L, 50L, 25L,
31L, 17L, 16L, 13L, 13L, 19L, 52L, 24L, 59L, 18L, 44L, 9L,
20L, 6L, 3L, 7L, 2L, 14L, 28L, 34L, 47L, 29L, 47L, 15L, 13L,
9L, 3L, 2L, 0L, 46L, 75L, 124L, 81L, 67L, 45L, 33L, 15L,
9L, 4L, 5L, 3L, 0L, 10L, 6L, 19L, 12L, 28L, 22L, 37L, 31L,
41L, 26L, 31L, 7L, 6L, 21L, 13L, 6L, 7L, 8L, 2L, 2L, 1L,
0L, 0L, 67L, 89L, 116L, 159L, 99L, 102L, 64L, 80L, 42L, 25L,
25L, 8L, 108L, 123L, 60L, 97L, 68L, 66L, 44L, 35L, 12L, 5L,
9L, 2L, 7L, 3L, 53L, 15L, 33L, 3L, 8L, 3L, 4L, 0L, 0L, 0L,
48L, 19L, 76L, 40L, 55L, 11L, 16L, 1L, 4L, 1L, 2L, 0L, 6L,
7L, 21L, 22L, 18L, 23L, 32L, 37L, 40L, 13L, 23L, 10L, 4L,
2L, 19L, 2L, 8L, 3L, 6L, 0L, 1L, 0L, 1L, 0L, 68L, 37L, 90L,
42L, 76L, 38L, 47L, 16L, 29L, 5L, 18L, 2L, 82L, 32L, 62L,
27L, 44L, 22L, 20L, 8L, 8L, 2L, 1L, 0L)), .Names = c("gender", "agecombined", "educ", "cluster", "Freq"), row.names = c(NA,
-360L), class = "data.frame")