Create rainbow histogram with bin labels ggplot - r

I am trying to create a histogram with a rainbow color scale that also shows bin labels. I can create a histogram with labeled bins, and I can recreate the rainbow histogram described in a couple of other posts, but I have not been able to combine the two into a rainbow histogram with the correct bin labels. An example data set and the code I have tried are below. Ideally, I would also like to remove any bin labels that have zero as a value, but I don't want to be too greedy here.
ggplot(final_df,aes(x=V1, fill = cut(V1, 25)))+ geom_histogram(show.legend = FALSE) +
stat_bin(aes(y=..count.., label=..count..), geom="text", vjust=-.5)
This creates the rainbow histogram, but the bin labels are all messed up.
structure(list(V1 = c(18, 0, 20, 21, 0, 2, 0, 1, 0, 0, 4, 16,
0, 0, 20, 20, 2, 0, 19, 22, 0, 0, 19, 0, 22, 22, 19, 2, 0, 0,
1, 18, 23, 1, 3, 1, 1, 1, 0, 21, 21, 0, 0, 15, 24, 0, 20, 19,
0, 1, 20, 21, 0, 0, 20, 22, 20, 0, 21, 0, 0, 22, 0, 0, 0, 23,
2, 1, 1, 21, 0, 2, 3, 23, 23, 1, 22, 0, 19, 23, 1, 2, 23, 1,
0, 0, 20, 1, 0, 0, 1, 18, 0, 0, 0, 0, 0, 2, 0, 7, 22, 0, 0, 23,
1, 0, 23, 0, 0, 1, 2, 0, 0, 18, 16, 0, 0, 1, 0, 0, 0, 2, 22,
0, 2, 0, 0, 0, 24, 0, 0, 0, 1, 1, 20, 0, 0, 1, 18, 0, 1, 1, 0,
0, 3, 0, 20, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 20, 2,
0, 1, 22, 0, 1, 23, 2, 0, 1, 5, 0, 10, 1, 17, 0, 0, 1, 1, 2,
1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 0, 23, 2, 19, 2, 1, 21, 3,
0, 0, 20, 0, 1, 0, 1, 0, 0, 24, 2, 1, 1, 23, 1, 1, 0, 1, 0, 0,
22, 23, 0, 23, 0, 22, 2, 19, 0, 20, 22, 0, 23, 0, 21, 0, 0, 23,
0, 0, 0, 0, 3, 22, 1, 0, 1, 22, 22, 20, 0, 1, 2, 22, 2, 23, 0,
18, 1, 23, 0, 2, 0, 1, 22, 0, 21, 0, 2, 20, 0, 0, 23, 0, 1, 18,
0, 18, 20, 1, 0, 20, 0, 1, 0, 0, 17, 20, 0, 0, 1, 22, 20, 22,
2, 1, 1, 0, 1, 0, 0, 0, 18, 0, 0, 21, 0, 0, 2, 22, 20, 1, 0,
0, 0, 0, 1, 0, 0, 1, 0, 4, 1, 0, 21, 21, 0, 0, 1, 0, 1, 3, 0,
1, 1, 0, 24, 0, 0, 22, 17, 0, 1, 20, 1, 1, 21, 1, 21, 21, 0,
21, 0, 1, 23, 0, 0, 23, 21, 0, 0, 24, 0, 6, 17, 0, 21, 0, 23,
0, 0, 22, 1, 1, 22, 0, 2, 0, 0, 1, 19, 0, 21, 21, 2, 1, 18, 1,
21, 0, 1, 1, 0, 0, 1, 23, 0, 0, 1, 0, 0, 0, 1, 2, 1, 0, 0, 0,
25, 0, 0, 1, 0, 0, 0, 23, 23, 0, 0, 0, 21, 19, 2, 0, 0, 0, 0,
0, 1, 0, 22, 22, 0, 19, 0, 3, 0, 21, 0, 1, 20, 1, 1, 1, 22, 1,
22, 1, 22, 1, 0, 2, 0, 25, 23, 0, 20, 0, 2, 22, 0, 0, 1, 0, 1,
23, 22, 0, 1, 19, 23, 1, 0, 2, 0, 18, 0, 0, 2, 0, 0, 23, 0, 0,
0, 0, 0, 1, 2, 1, 0, 21, 0, 21, 20, 0, 1, 19, 23, 0, 1, 23, 0,
1, 22, 21, 3, 0, 22, 2, 0, 1, 23, 2, 0, 24, 23, 21, 23, 20, 0,
0, 0, 20, 22, 0, 2, 0, 17, 0, 0, 1, 22, 1, 1, 1, 0, 0, 3, 3,
5, 21, 21, 1, 19, 18, 0, 24, 1, 2, 0, 0, 1, 1, 0, 0, 0, 0, 0,
0, 23, 1, 20, 0, 0, 1, 19, 22, 21, 24, 3, 1, 2, 24, 0, 0, 23,
17, 22, 0, 24, 23, 16, 1, 0, 2, 20, 0, 19, 0, 2, 1, 22, 20, 0,
20, 0, 1, 22, 0, 1, 0, 2, 0, 1, 0, 0, 2, 25, 24, 2, 20, 3, 0,
0, 23, 0, 4, 0, 19, 1, 0, 1, 0, 3, 19, 22, 0, 0, 0, 1, 0, 1,
23, 20, 20, 23, 0, 0, 0, 24, 0, 21, 20, 23, 0, 1, 1, 0, 19, 0,
0, 0, 1, 22, 0, 22, 0, 1, 18, 0, 20, 1, 0, 0, 1, 20, 0, 0, 0,
0, 0, 0, 0, 0, 19, 0, 0, 1, 0, 2, 23, 19, 21, 4, 1, 0, 0, 1,
23, 21, 21, 4, 20, 24, 0, 3, 0, 20, 23, 1, 23, 21, 20, 18, 0,
21, 2, 1, 21, 0)), class = "data.frame", row.names = c(NA, -713L
))

The issue is that you bin your V1 variable manually with cut(V1, 25). That gives you 25 groups, and the counts are computed separately for each group, so every histogram bin gets one label per group (most of them with a zero count). Hence, you end up with 25 stacked (and overlapping) labels per bin. Instead, make use of the bins computed by stat_bin by mapping factor(..x..) on fill (newer ggplot2 versions spell the same thing factor(after_stat(x))):
library(ggplot2)
p <- ggplot(final_df, aes(x = V1, fill = factor(..x..))) +
geom_histogram(show.legend = FALSE)
p +
stat_bin(aes(y = ..count.., label = ..count..), geom = "text", vjust = -.5)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
To get rid of the zero entries you could make use of an ifelse:
p +
stat_bin(aes(y = ..count.., label = ifelse(..count.. > 0, ..count.., "")), geom = "text", vjust = -.5)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Related

Missing value where TRUE/FALSE needed error in smcure model

I'm fitting a cure model in R to predict loan default and need help debugging the error below. I think it may have to do with my columns.
library(smcure)
smcure(Surv(DURATION, DEFAULT) ~ CHK_ACCT+HISTORY+NEW_CAR+USED_CAR+FURNITURE+`RADIO/TV`+EDUCATION+
RETRAINING+AMOUNT+SAV_ACCT+EMPLOYMENT+INSTALL_RATE+MALE_DIV+MALE_SINGLE+MALE_MAR_or_WID+
`CO-APPLICANT`+GUARANTOR+PRESENT_RESIDENT+REAL_ESTATE+PROP_UNKN_NONE+AGE+OTHER_INSTALL+RENT+
OWN_RES+NUM_CREDITS+JOB+NUM_DEPENDENTS+TELEPHONE+FOREIGN,
cureform=~CHK_ACCT+HISTORY+NEW_CAR+USED_CAR+FURNITURE+`RADIO/TV`+EDUCATION+RETRAINING+AMOUNT+SAV_ACCT+
EMPLOYMENT+INSTALL_RATE+MALE_DIV+MALE_SINGLE+MALE_MAR_or_WID+`CO-APPLICANT`+GUARANTOR+PRESENT_RESIDENT+
REAL_ESTATE+PROP_UNKN_NONE+AGE+OTHER_INSTALL+RENT+OWN_RES+NUM_CREDITS+JOB+NUM_DEPENDENTS+
TELEPHONE+FOREIGN,
model="ph", data = CD)
Error in while (convergence > eps & i < emmax) { :
missing value where TRUE/FALSE needed
Does anyone know what this error may mean?
Attached I have a subset of the data I used.
Data
structure(list(CHK_ACCT = c(0, 1, 3, 0, 0, 3, 3, 1, 3, 1, 1,
0, 1, 0, 0, 0, 3, 0, 1, 3, 3, 0, 0, 1, 3, 0, 3, 2, 1, 0, 1, 0,
1, 3, 2, 1, 3, 2, 2, 1, 3, 1, 1, 0, 0, 3, 3, 0, 3, 3, 1, 1, 3,
3, 1, 3, 1, 3, 2, 0, 1, 1, 1, 1, 3, 3, 3, 1, 3, 3, 3, 3, 0, 1,
0, 0, 0, 1, 3, 1, 3, 3, 3, 0, 0, 3, 1, 1, 0, 0, 3, 0, 3, 2, 1,
1, 3, 1, 1, 1, 3, 1, 3, 1, 3, 1, 3, 1, 0, 1, 1, 2, 1, 3, 0, 3,
0, 0, 0, 1, 0, 3, 3, 2, 1, 0, 0, 1, 1, 0, 1, 0, 3, 3, 3, 3, 3,
1, 1, 2, 2, 1, 0, 0, 3, 1, 0, 3, 0, 3, 3, 3, 2, 1, 1, 0, 0, 0,
1, 3, 3, 3, 3, 1, 3, 3, 0, 1, 3, 1, 0, 3, 1, 1, 0, 3, 0, 0, 3,
0, 3, 1, 0, 3, 1, 3, 1, 1, 0, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 1
), DURATION = c(6, 48, 12, 42, 24, 36, 24, 36, 12, 30, 12, 48,
12, 24, 15, 24, 24, 30, 24, 24, 9, 6, 10, 12, 10, 6, 6, 12, 7,
60, 18, 24, 18, 12, 12, 45, 48, 18, 10, 9, 30, 12, 18, 30, 48,
11, 36, 6, 11, 12, 24, 27, 12, 18, 36, 6, 12, 36, 18, 36, 9,
15, 36, 48, 24, 27, 12, 12, 36, 36, 36, 7, 8, 42, 36, 12, 42,
11, 54, 30, 24, 15, 18, 24, 10, 12, 18, 36, 18, 12, 12, 12, 12,
24, 12, 54, 12, 18, 36, 20, 24, 36, 6, 9, 12, 24, 18, 12, 24,
14, 6, 15, 18, 36, 12, 48, 42, 10, 33, 12, 21, 24, 12, 10, 18,
12, 12, 12, 12, 12, 48, 36, 15, 18, 60, 12, 27, 12, 15, 12, 6,
36, 27, 18, 21, 48, 6, 12, 36, 18, 6, 10, 36, 24, 24, 12, 9,
12, 24, 6, 24, 18, 15, 10, 36, 6, 18, 11, 24, 24, 15, 12, 24,
8, 21, 30, 12, 6, 12, 21, 36, 36, 21, 24, 18, 15, 9, 16, 12,
18, 24, 48, 27, 6, 45, 9, 6, 12, 24, 18), HISTORY = c(4, 2, 4,
2, 3, 2, 2, 2, 2, 4, 2, 2, 2, 4, 2, 2, 4, 0, 2, 2, 4, 2, 4, 4,
4, 2, 0, 1, 2, 3, 2, 2, 2, 4, 2, 4, 4, 2, 2, 2, 2, 2, 3, 4, 4,
4, 2, 2, 4, 2, 3, 3, 2, 2, 3, 1, 2, 4, 2, 4, 2, 4, 0, 0, 2, 2,
2, 2, 2, 2, 2, 4, 4, 4, 2, 4, 2, 3, 0, 2, 2, 2, 2, 2, 2, 4, 4,
2, 2, 0, 4, 4, 4, 4, 2, 0, 4, 2, 4, 3, 2, 2, 3, 4, 2, 4, 1, 2,
2, 2, 3, 2, 2, 4, 2, 4, 2, 4, 4, 4, 2, 4, 2, 4, 2, 4, 2, 2, 4,
4, 2, 3, 2, 2, 2, 4, 3, 2, 4, 2, 2, 2, 2, 2, 4, 1, 4, 4, 4, 4,
2, 2, 2, 4, 3, 2, 4, 1, 2, 4, 4, 4, 2, 2, 2, 2, 2, 2, 2, 4, 0,
2, 3, 2, 3, 1, 2, 4, 2, 4, 3, 3, 1, 4, 4, 4, 1, 4, 2, 0, 2, 0,
2, 2, 2, 4, 4, 2, 2, 3), NEW_CAR = c(0, 0, 0, 0, 1, 0, 0, 0,
0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0), USED_CAR = c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0), FURNITURE = c(0,
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 1), `RADIO/TV` = c(1, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1,
1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1,
1, 0, 1, 0, 0, 0), EDUCATION = c(0, 0, 1, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0), RETRAINING = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0), AMOUNT = c(1169,
5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234, 1295, 4308,
1567, 1199, 1403, 1282, 2424, 8072, 12579, 3430, 2134, 2647,
2241, 1804, 2069, 1374, 426, 409, 2415, 6836, 1913, 4020, 5866,
1264, 1474, 4746, 6110, 2100, 1225, 458, 2333, 1158, 6204, 6187,
6143, 1393, 2299, 1352, 7228, 2073, 2333, 5965, 1262, 3378, 2225,
783, 6468, 9566, 1961, 6229, 1391, 1537, 1953, 14421, 3181, 5190,
2171, 1007, 1819, 2394, 8133, 730, 1164, 5954, 1977, 1526, 3965,
4771, 9436, 3832, 5943, 1213, 1568, 1755, 2315, 1412, 1295, 12612,
2249, 1108, 618, 1409, 797, 3617, 1318, 15945, 2012, 2622, 2337,
7057, 1469, 2323, 932, 1919, 2445, 11938, 6458, 6078, 7721, 1410,
1449, 392, 6260, 7855, 1680, 3578, 7174, 2132, 4281, 2366, 1835,
3868, 1768, 781, 1924, 2121, 701, 639, 1860, 3499, 8487, 6887,
2708, 1984, 10144, 1240, 8613, 766, 2728, 1881, 709, 4795, 3416,
2462, 2288, 3566, 860, 682, 5371, 1582, 1346, 1924, 5848, 7758,
6967, 1282, 1288, 339, 3512, 1898, 2872, 1055, 1262, 7308, 909,
2978, 1131, 1577, 3972, 1935, 950, 763, 2064, 1414, 3414, 7485,
2577, 338, 1963, 571, 9572, 4455, 1647, 3777, 884, 1360, 5129,
1175, 674, 3244, 4591, 3844, 3915, 2108, 3031, 1501, 1382, 951,
2760, 4297), SAV_ACCT = c(4, 0, 0, 0, 0, 4, 2, 0, 3, 0, 0, 0,
0, 0, 0, 1, 4, 4, 0, 2, 0, 2, 0, 1, 4, 0, 0, 3, 0, 0, 3, 0, 1,
4, 0, 0, 0, 0, 0, 0, 2, 2, 0, 1, 0, 0, 2, 2, 0, 1, 4, 0, 0, 4,
0, 4, 4, 0, 0, 0, 0, 4, 0, 0, 0, 4, 0, 3, 0, 4, 0, 4, 0, 0, 4,
0, 0, 0, 4, 0, 4, 2, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 4, 4, 3, 0,
4, 1, 0, 4, 1, 0, 0, 0, 4, 0, 0, 0, 4, 2, 1, 0, 0, 0, 2, 4, 4,
4, 2, 2, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 4, 0, 0, 0, 1, 4, 3, 2,
4, 0, 3, 0, 0, 0, 0, 1, 0, 1, 0, 3, 1, 0, 0, 3, 1, 0, 1, 0, 1,
4, 1, 0, 2, 0, 2, 2, 0, 3, 0, 0, 0, 0, 0, 0, 0, 4, 0, 2, 0, 0,
0, 0, 4, 3, 0, 0, 0, 0, 1, 0, 3, 1, 0, 0, 1, 0, 0, 1, 4, 0),
EMPLOYMENT = c(4, 2, 3, 3, 2, 2, 4, 2, 3, 0, 1, 1, 2, 4,
2, 2, 4, 1, 4, 4, 2, 2, 1, 1, 2, 2, 4, 2, 2, 4, 1, 2, 2,
4, 1, 1, 2, 2, 2, 2, 4, 2, 2, 3, 4, 1, 4, 0, 2, 2, 1, 4,
2, 2, 4, 2, 0, 2, 4, 1, 2, 4, 4, 2, 1, 4, 1, 2, 2, 2, 2,
4, 4, 3, 4, 4, 1, 3, 2, 1, 1, 4, 2, 4, 4, 2, 1, 2, 3, 3,
4, 4, 4, 4, 4, 1, 3, 2, 4, 3, 4, 3, 2, 3, 1, 2, 4, 3, 1,
4, 4, 1, 3, 2, 4, 4, 3, 1, 2, 3, 2, 4, 2, 4, 1, 2, 2, 2,
0, 2, 3, 2, 1, 2, 3, 4, 2, 2, 3, 2, 1, 1, 2, 2, 1, 3, 4,
3, 2, 4, 4, 2, 2, 4, 3, 2, 4, 4, 3, 2, 4, 1, 3, 0, 4, 2,
0, 1, 3, 4, 4, 2, 0, 2, 1, 0, 2, 4, 3, 4, 1, 2, 2, 2, 4,
2, 4, 0, 3, 2, 2, 3, 2, 3, 2, 4, 2, 1, 4, 4), INSTALL_RATE = c(4,
2, 2, 2, 3, 2, 3, 2, 2, 4, 3, 3, 1, 4, 2, 4, 4, 2, 4, 3,
4, 2, 1, 3, 2, 1, 4, 3, 3, 3, 3, 2, 2, 4, 4, 4, 1, 4, 2,
4, 4, 3, 2, 1, 4, 4, 4, 1, 1, 4, 4, 1, 3, 2, 4, 1, 2, 2,
3, 4, 2, 4, 4, 2, 4, 4, 2, 4, 4, 4, 1, 4, 3, 2, 4, 4, 4,
2, 2, 2, 1, 4, 3, 4, 3, 4, 4, 1, 4, 4, 4, 4, 4, 4, 4, 3,
4, 4, 4, 3, 4, 4, 3, 4, 2, 2, 2, 2, 1, 1, 1, 4, 3, 4, 3,
4, 4, 2, 1, 3, 3, 4, 3, 4, 4, 4, 4, 4, 4, 3, 1, 4, 2, 4,
2, 4, 2, 4, 4, 2, 2, 4, 3, 2, 4, 4, 1, 4, 3, 4, 2, 1, 4,
2, 4, 2, 3, 4, 2, 1, 3, 4, 4, 2, 4, 1, 4, 4, 2, 4, 4, 4,
3, 4, 2, 4, 2, 4, 4, 4, 1, 2, 4, 4, 4, 4, 2, 2, 4, 1, 2,
4, 4, 2, 4, 2, 1, 4, 4, 4), MALE_DIV = c(0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1), MALE_SINGLE = c(1, 0, 1, 1, 1, 1, 1, 1, 0, 0,
0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1,
1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0,
1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0,
0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0,
0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0,
1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1,
0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1,
1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0,
1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0,
0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0),
MALE_MAR_or_WID = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0,
0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), `CO-APPLICANT` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0), GUARANTOR = c(0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0), PRESENT_RESIDENT = c(4, 2, 3, 4, 4, 4, 4,
2, 4, 2, 1, 4, 1, 4, 4, 2, 4, 3, 2, 2, 4, 3, 3, 4, 1, 2,
4, 3, 2, 4, 3, 2, 2, 4, 1, 2, 3, 2, 2, 3, 2, 1, 4, 4, 4,
4, 4, 2, 4, 2, 2, 2, 2, 1, 4, 2, 1, 2, 2, 4, 1, 4, 4, 2,
4, 4, 2, 1, 4, 4, 2, 2, 4, 1, 4, 4, 3, 4, 2, 1, 1, 3, 4,
4, 4, 2, 1, 4, 3, 3, 4, 3, 3, 4, 4, 4, 2, 4, 4, 4, 4, 4,
2, 3, 4, 3, 4, 2, 2, 2, 2, 4, 3, 2, 1, 1, 3, 3, 4, 3, 2,
2, 2, 4, 3, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 2, 2, 3, 2, 2,
2, 1, 2, 2, 4, 2, 4, 3, 2, 4, 4, 4, 1, 4, 4, 4, 4, 1, 3,
2, 4, 1, 3, 4, 4, 2, 2, 1, 4, 4, 3, 1, 2, 2, 1, 1, 1, 4,
2, 4, 1, 2, 2, 4, 4, 2, 4, 3, 1, 4, 3, 4, 2, 2, 4, 3, 1,
4, 4, 3), REAL_ESTATE = c(1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1,
0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0,
0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), PROP_UNKN_NONE = c(0,
0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
1, 0, 0, 0, 0, 0, 0, 1, 1), AGE = c(67, 22, 49, 45, 53, 35,
53, 35, 61, 28, 25, 24, 22, 60, 28, 32, 53, 25, 44, 31, 48,
44, 48, 44, 26, 36, 39, 42, 34, 63, 36, 27, 30, 57, 33, 25,
31, 37, 37, 24, 30, 26, 44, 24, 58, 35, 39, 23, 39, 28, 29,
30, 25, 31, 57, 26, 52, 31, 23, 23, 27, 50, 61, 25, 26, 48,
29, 22, 37, 25, 30, 46, 51, 41, 40, 66, 34, 51, 39, 22, 44,
47, 24, 58, 52, 29, 27, 47, 30, 28, 56, 54, 33, 20, 54, 58,
61, 34, 36, 36, 41, 24, 24, 35, 26, 39, 39, 32, 30, 35, 31,
23, 28, 25, 35, 47, 30, 27, 23, 36, 25, 41, 24, 63, 27, 30,
40, 30, 34, 29, 24, 29, 27, 47, 21, 38, 27, 66, 35, 44, 27,
30, 27, 22, 23, 30, 39, 51, 28, 46, 42, 38, 24, 29, 36, 20,
48, 45, 38, 34, 36, 30, 36, 70, 36, 32, 33, 20, 25, 31, 33,
26, 34, 33, 26, 53, 42, 52, 31, 65, 28, 30, 40, 50, 36, 31,
74, 68, 20, 33, 54, 34, 36, 29, 21, 34, 28, 27, 36, 40),
OTHER_INSTALL = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1,
0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,
1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0,
1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0), RENT = c(0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 1, 0, 0, 1, 0, 0), OWN_RES = c(1, 1, 1, 0, 0, 0,
1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1,
1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0,
0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0,
1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0,
0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1,
1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1,
1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1,
1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1,
1, 0, 0, 1), NUM_CREDITS = c(2, 1, 1, 1, 2, 1, 1, 1, 1, 2,
1, 1, 1, 2, 1, 1, 2, 3, 1, 1, 3, 1, 2, 1, 2, 1, 1, 2, 1,
2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1,
2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 4, 1,
1, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2,
2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1,
2, 2, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 2, 1, 2,
1, 2, 1, 1, 2, 2, 1, 1, 2, 2, 1, 2, 2, 1, 3, 1, 1, 1, 1,
1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 1, 2,
2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 2, 2,
2, 2, 1, 1, 2, 1, 3, 1, 2, 3, 1, 1, 1, 1, 2, 2, 4, 1, 1),
JOB = c(2, 2, 1, 2, 2, 1, 2, 3, 1, 3, 2, 2, 2, 1, 2, 1, 2,
2, 3, 2, 2, 2, 1, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 3, 1,
2, 2, 2, 2, 3, 2, 1, 2, 1, 3, 2, 0, 1, 2, 1, 3, 2, 2, 2,
1, 3, 2, 3, 1, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 3, 1,
3, 3, 2, 2, 1, 2, 2, 2, 1, 1, 1, 3, 2, 2, 3, 2, 2, 2, 1,
2, 2, 2, 2, 2, 2, 3, 1, 2, 2, 2, 2, 3, 3, 2, 2, 2, 2, 2,
1, 2, 2, 2, 3, 2, 2, 3, 2, 3, 1, 2, 2, 2, 1, 2, 3, 2, 2,
2, 1, 2, 2, 2, 2, 1, 2, 1, 0, 3, 3, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 3, 2, 2, 1, 2, 1, 2, 2, 2, 3, 2, 2, 2, 2, 2,
2, 2, 2, 3, 2, 2, 3, 2, 2, 3, 2, 2, 3, 1, 2, 2, 2, 3, 0,
2, 2, 3, 1, 2, 2, 2, 3, 2, 2, 2, 3), NUM_DEPENDENTS = c(1,
1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1,
1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1,
1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1,
1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1,
1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1,
1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1,
2, 2, 1, 1, 1, 1, 1, 1, 1), TELEPHONE = c(1, 0, 0, 0, 0,
1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0,
0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1,
0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1,
1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1,
1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1,
0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0,
1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0,
1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,
0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0,
1, 1, 0, 1, 1), FOREIGN = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
DEFAULT = c(0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0,
1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0,
1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0,
1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1)), row.names = c(NA,
-200L), class = c("tbl_df", "tbl", "data.frame"))

Generating multiple csv from a list

Using the markovchain library, I managed to generate 144 transition probability matrices (e.g. matrixList), but how can I save all of them (143) as .csv files?
library(tidyverse)
library(markovchain)
mcListFist <- markovchainListFit(data = df[, 1:144], name = "df")
matrixList <- list()
for (i in 1:dim(mcListFist$estimate)) {
  myMatr <- mcListFist$estimate[[i]]#transitionMatrix
  matrixList[[i]] <- myMatr
}
matrixList
Output of transition probability matrix 84-85
............
[[84]]
0
0 1
[[85]]
0
0 1
Sample data:
df<-structure(list(`04:00` = c(11, 11, 11, 11, 11, 11, 11, 11, 11,
11), `04:10` = c(11, 11, 11, 11, 11, 11, 11, 11, 11, 11), `04:20` = c(11,
11, 11, 11, 11, 11, 11, 11, 11, 11), `04:30` = c(11, 11, 11,
11, 11, 11, 11, 11, 11, 11), `04:40` = c(11, 11, 11, 11, 11,
11, 11, 11, 11, 11), `04:50` = c(11, 11, 11, 11, 11, 11, 11,
11, 11, 11), `05:00` = c(11, 11, 11, 11, 11, 11, 11, 11, 11,
11), `05:10` = c(11, 11, 11, 11, 11, 11, 11, 11, 11, 11), `05:20` = c(11,
11, 11, 11, 11, 11, 11, 11, 11, 11), `05:30` = c(11, 11, 11,
11, 11, 11, 11, 11, 11, 11), `05:40` = c(11, 11, 11, 11, 11,
11, 11, 11, 11, 11), `05:50` = c(11, 11, 11, 11, 11, 11, 11,
11, 11, 11), `06:00` = c(11, 0, 11, 11, 11, 11, 11, 0, 0, 11),
`06:10` = c(11, 0, 11, 11, 11, 11, 11, 0, 0, 11), `06:20` = c(11,
0, 11, 11, 11, 11, 11, 0, 0, 11), `06:30` = c(11, 0, 11,
11, 11, 11, 11, 0, 0, 0), `06:40` = c(11, 0, 11, 11, 11,
11, 11, 0, 0, 0), `06:50` = c(11, 0, 11, 11, 11, 11, 11,
0, 0, 0), `07:00` = c(11, 0, 11, 0, 11, 11, 11, 0, 0, 0),
`07:10` = c(11, 0, 11, 0, 11, 11, 11, 0, 0, 0), `07:20` = c(11,
0, 11, 0, 11, 11, 11, 0, 0, 0), `07:30` = c(11, 0, 11, 0,
11, 11, 11, 0, 0, 0), `07:40` = c(0, 0, 11, 0, 11, 11, 11,
0, 0, 0), `07:50` = c(0, 0, 11, 0, 11, 11, 11, 0, 0, 0),
`08:00` = c(0, 0, 0, 0, 0, 11, 11, 0, 0, 0), `08:10` = c(0,
0, 0, 0, 0, 11, 11, 0, 0, 0), `08:20` = c(0, 0, 0, 0, 0,
11, 11, 0, 0, 0), `08:30` = c(0, 0, 0, 0, 0, 11, 11, 0, 0,
0), `08:40` = c(0, 0, 0, 0, 0, 11, 11, 0, 0, 0), `08:50` = c(0,
0, 0, 0, 0, 11, 11, 0, 0, 0), `09:00` = c(0, 0, 0, 0, 0,
0, 0, 0, 0, 0), `09:10` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
`09:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `09:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `09:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `09:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `10:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `10:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `10:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `10:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `10:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `11:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `11:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `11:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `11:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `11:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `11:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `12:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `12:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `12:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `12:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `12:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `12:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `13:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `13:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `13:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `13:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `13:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `13:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `14:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `14:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `14:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `14:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `14:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `14:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `15:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `15:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `15:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `15:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `15:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `15:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `16:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `16:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `16:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `16:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `16:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `16:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `17:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `17:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `17:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `17:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `17:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `17:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `18:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `18:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `18:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `18:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `18:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `18:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `19:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `19:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `19:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `19:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `19:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `19:50` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `20:00` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `20:10` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `20:20` = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), `20:30` = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0), `20:40` = c(0, 0, 0, 0, 0, 0,
0, 0, 0, 0), `20:50` = c(0, 11, 11, 0, 0, 0, 0, 0, 0, 0),
`21:00` = c(0, 11, 11, 0, 0, 0, 0, 0, 0, 11), `21:10` = c(0,
11, 11, 0, 0, 0, 0, 0, 0, 11), `21:20` = c(0, 11, 11, 0,
0, 0, 0, 0, 0, 11), `21:30` = c(0, 11, 11, 0, 0, 0, 0, 0,
0, 11), `21:40` = c(0, 11, 11, 0, 0, 0, 0, 0, 0, 11), `21:50` = c(11,
11, 11, 0, 0, 0, 0, 0, 0, 11), `22:00` = c(11, 11, 11, 0,
0, 0, 0, 0, 0, 11), `22:10` = c(11, 11, 11, 0, 0, 0, 11,
0, 0, 11), `22:20` = c(11, 11, 11, 0, 0, 0, 11, 0, 0, 11),
`22:30` = c(11, 11, 11, 0, 0, 11, 11, 11, 11, 11), `22:40` = c(11,
11, 11, 0, 0, 11, 11, 11, 11, 11), `22:50` = c(11, 11, 11,
0, 0, 11, 11, 11, 11, 11), `23:00` = c(11, 11, 11, 0, 0,
11, 11, 11, 11, 11), `23:10` = c(11, 11, 11, 0, 0, 11, 11,
11, 11, 11), `23:20` = c(11, 11, 11, 0, 0, 11, 11, 11, 11,
11), `23:30` = c(11, 11, 11, 0, 0, 11, 11, 11, 11, 11), `23:40` = c(11,
11, 11, 0, 0, 11, 11, 11, 11, 11), `23:50` = c(11, 11, 11,
0, 11, 11, 11, 11, 11, 11), `00:00` = c(11, 11, 11, 0, 11,
11, 11, 11, 11, 11), `00:10` = c(11, 11, 11, 0, 11, 11, 11,
11, 11, 11), `00:20` = c(11, 11, 11, 0, 11, 11, 11, 11, 11,
11), `00:30` = c(11, 11, 11, 11, 11, 11, 11, 11, 11, 11),
`00:40` = c(11, 11, 11, 11, 11, 11, 11, 11, 11, 11), `00:50` = c(11,
11, 11, 11, 11, 11, 11, 11, 11, 11), `01:00` = c(11, 11,
11, 11, 11, 11, 11, 11, 11, 11), `01:10` = c(11, 11, 11,
11, 11, 11, 11, 11, 11, 11), `01:20` = c(11, 11, 11, 11,
11, 11, 11, 11, 11, 11), `01:30` = c(11, 11, 11, 11, 11,
11, 11, 11, 11, 11), `01:40` = c(11, 11, 11, 11, 11, 11,
11, 11, 11, 11), `01:50` = c(11, 11, 11, 11, 11, 11, 11,
11, 11, 11), `02:00` = c(11, 11, 11, 11, 11, 11, 11, 11,
11, 11), `02:10` = c(11, 11, 11, 11, 11, 11, 11, 11, 11,
11), `02:20` = c(11, 11, 11, 11, 11, 11, 0, 11, 11, 11),
`02:30` = c(11, 11, 11, 11, 11, 11, 11, 11, 11, 11), `02:40` = c(11,
11, 11, 11, 11, 11, 11, 11, 11, 11), `02:50` = c(11, 11,
11, 11, 11, 11, 11, 11, 11, 11), `03:00` = c(11, 11, 11,
11, 11, 11, 11, 11, 11, 11), `03:10` = c(11, 11, 11, 11,
11, 11, 11, 11, 11, 11), `03:20` = c(11, 11, 11, 11, 11,
11, 11, 11, 11, 11), `03:30` = c(11, 11, 11, 11, 11, 11,
11, 11, 11, 11), `03:40` = c(11, 11, 11, 11, 11, 11, 11,
11, 11, 11), `03:50` = c(11, 11, 11, 11, 11, 11, 11, 11,
11, 11)), row.names = c(NA, -10L), class = c("tbl_df", "tbl",
"data.frame"))
Apply the writing function to every element of the list:
mapply(function(data, name) {
  data <- as.data.frame(data)
  write.csv(data, paste0(name, ".csv"))
}, matrixList, seq_along(matrixList))

OpenCL bincount

I am trying to implement a bincount operation in OpenCL: allocate an output buffer, then use the indices in x to accumulate the corresponding weights into that buffer (assume that num_bins == max(x) + 1). This is equivalent to the following Python code:
def bincount(x, weight, num_bins):
    out = np.zeros(num_bins, dtype=weight.dtype)
    for i in range(len(x)):
        out[x[i]] += weight[i]
    return out
What I have is the following:
import pyopencl as cl
import numpy as np
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, """
__kernel void bincount(__global int *res_g, __global const int *x_g, __global const int *weight_g)
{
    int gid = get_global_id(0);
    res_g[x_g[gid]] += weight_g[gid];
}
""").build()
# test
x = np.arange(5, dtype=np.int32).repeat(2) # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
x_g = cl.Buffer(ctx, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=x)
weight = np.arange(10, dtype=np.int32) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
weight_g = cl.Buffer(ctx, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=weight)
res_g = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, 4 * 5)
prg.bincount(queue, [10], None, res_g, x_g, weight_g)
# transfer back to cpu
res_np = np.empty(5).astype(np.int32)
cl.enqueue_copy(queue, res_np, res_g)
Output in res_np:
array([1, 3, 5, 7, 9], dtype=int32)
Expected output:
array([1, 5, 9, 13, 17], dtype=int32)
How do I accumulate the elements that are indexed more than once?
EDIT
The above is a contrived example. In my real-world application, x will be indices from a sliding window algorithm:
x = np.array([ 0, 1, 2, 4, 5, 6, 8, 9, 10, 1, 2, 3, 5, 6, 7, 9, 10,
11, 4, 5, 6, 8, 9, 10, 12, 13, 14, 5, 6, 7, 9, 10, 11, 13,
14, 15, 8, 9, 10, 12, 13, 14, 16, 17, 18, 9, 10, 11, 13, 14, 15,
17, 18, 19, 20, 21, 22, 24, 25, 26, 28, 29, 30, 21, 22, 23, 25, 26,
27, 29, 30, 31, 24, 25, 26, 28, 29, 30, 32, 33, 34, 25, 26, 27, 29,
30, 31, 33, 34, 35, 28, 29, 30, 32, 33, 34, 36, 37, 38, 29, 30, 31,
33, 34, 35, 37, 38, 39], dtype=np.int32)
weight = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,
0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0,
1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0], dtype=np.int32)
There is a pattern, which becomes more apparent when reshaping x to (2,3,2,3,3), but I am having a hard time figuring out how the approach given by @doqtor can be applied here, and especially whether it is easy to generalize.
The expected output is:
array([1, 1, 0, 0, 2, 2, 0, 0, 3, 3, 0, 0, 2, 2, 0, 0, 1, 1, 0, 0, 1, 1,
0, 0, 2, 2, 0, 0, 3, 3, 0, 0, 2, 2, 0, 0, 1, 1, 0, 0], dtype=int32)
The problem is that the OpenCL buffer into which the weights are accumulated is not initialized (zeroed). To fix that:
res_np = np.zeros(5).astype(np.int32)
# READ_WRITE, because the kernel's += both reads and writes res_g
res_g = cl.Buffer(ctx, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=res_np)
prg.bincount(queue, [10], None, res_g, x_g, weight_g)
# transfer back to cpu
cl.enqueue_copy(queue, res_np, res_g)
This returns the correct result: [ 1 5 9 13 17]
====== Update ==========
As @Kevin noticed, there is also a race condition here: several work items may update the same element of res_g at the same time. If the indices follow a known pattern, the race can be avoided without synchronization by letting a single work item handle all elements that map to the same bin. In this example that means processing 2 consecutive elements per work item:
__kernel void bincount(__global int *res_g, __global const int *x_g, __global const int *weight_g)
{
    int gid = get_global_id(0);
    for (int x = gid * 2; x < gid * 2 + 2; ++x)
        res_g[x_g[x]] += weight_g[x];
}
Then schedule 5 work items:
prg.bincount(queue, [5], None, res_g, x_g, weight_g)
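For index patterns with no obvious structure, such as the sliding-window indices in the question's edit, one option that the answer above does not show is OpenCL's built-in atomic_add for 32-bit integers in global memory. The sketch below is an assumption about how that could look, not the answerer's method: it only holds the kernel source as a string (building it needs an OpenCL device) and pairs it with a NumPy host reference against which any kernel's output can be compared. The names ATOMIC_KERNEL_SRC and reference_bincount are illustrative.

```python
import numpy as np

# Assumption: not part of the original answer. atomic_add is a built-in
# OpenCL C function for 32-bit integers in global memory (OpenCL 1.1+).
ATOMIC_KERNEL_SRC = """
__kernel void bincount(volatile __global int *res_g,
                       __global const int *x_g,
                       __global const int *weight_g)
{
    int gid = get_global_id(0);
    atomic_add(&res_g[x_g[gid]], weight_g[gid]);
}
"""

def reference_bincount(x, weight, num_bins):
    """Host-side reference: accumulate weight[i] into bin x[i]."""
    out = np.zeros(num_bins, dtype=np.int32)
    np.add.at(out, x, weight)  # unbuffered, so repeated indices accumulate
    return out

x = np.arange(5, dtype=np.int32).repeat(2)   # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
weight = np.arange(10, dtype=np.int32)       # [0, 1, ..., 9]
expected = reference_bincount(x, weight, 5)
print(expected)
```

The output buffer still has to be zeroed before the kernel runs. np.bincount(x, weights=weight, minlength=5) yields the same sums on the host, but as float64.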

How to Obtain Constant Term in Linear Discriminant Analysis

Consider dput:
structure(list(REAÇÃO = structure(c(0, 1, 0, 0, 1, 0, 1, 1,
0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1,
1, 0, 1, 1, 0, 1, 1), format.spss = "F11.0"), IDADE = structure(c(22,
38, 36, 58, 37, 31, 32, 54, 60, 34, 45, 27, 30, 20, 30, 30, 22,
26, 19, 18, 22, 23, 24, 50, 20, 47, 34, 31, 43, 35, 23, 34, 51,
63, 22, 29), format.spss = "F11.0"), ESCOLARIDADE = structure(c(6,
12, 12, 8, 12, 12, 10, 12, 8, 12, 12, 12, 8, 4, 8, 8, 12, 8,
9, 4, 12, 6, 12, 12, 12, 12, 12, 12, 12, 8, 8, 12, 16, 12, 12,
12), format.spss = "F11.0"), SEXO = structure(c(1, 1, 0, 0, 1,
0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0,
0, 1, 0, 1, 0, 0, 0, 1, 1, 1), format.spss = "F11.0")), .Names = c("REAÇÃO",
"IDADE", "ESCOLARIDADE", "SEXO"), row.names = c(NA, -36L), class = "data.frame")
where REAÇÃO is the dependent variable in the model.
The expected constant is -4.438.
How can I obtain this value using a simple function in R?
To obtain the constant term of a discriminant analysis in R (with the MASS library):
library(MASS)
groupmean <- model$prior %*% model$means
constant <- groupmean %*% model$scaling
constant
where model is the fitted lda object:
model <- lda(y ~ x1 + x2 + xn, data = mydata)
model
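The arithmetic behind this constant can be checked outside R. The NumPy sketch below uses made-up priors, group means, and scaling (hypothetical numbers, not the fitted values for the data above). It assumes that discriminant scores are centred on the prior-weighted mean before projection, and shows that this is the same as projecting first and then adding an intercept equal to minus groupmean %*% scaling.

```python
import numpy as np

# Hypothetical stand-ins for model$prior, model$means and model$scaling.
prior = np.array([0.4, 0.6])              # one prior per group
means = np.array([[30.0, 9.0, 0.5],       # group 0 mean of each predictor
                  [35.0, 11.0, 0.7]])     # group 1 mean of each predictor
scaling = np.array([0.05, 0.30, -0.20])   # LD1 coefficients

groupmean = prior @ means                 # prior-weighted overall mean
constant = groupmean @ scaling            # the quantity the R snippet computes

x = np.array([22.0, 6.0, 1.0])            # one observation
score_centred = (x - groupmean) @ scaling
score_intercept = x @ scaling - constant  # the intercept is -constant

print(np.isclose(score_centred, score_intercept))
```

Under that assumption, a reported constant with the opposite sign of the printed value is expected rather than an error.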

Meet-in-the-Middle Attack on an NTRU Private Key

I was wondering if anyone could tell me how to represent the enumeration of vectors of the private key f in a meet-in-the-middle attack on an NTRU private key. I cannot understand the example given here: http://securityinnovation.com/cryptolab/pdf/NTRUTech004v2.pdf
I'll be very thankful if anyone could show an example in detail.
(Full disclosure: I work for Security Innovation and worked for NTRU until SI acquired us)
Warning: Long answer!
Let's look at a toy example: N = 11, q = 29. Let's take df = 3, so f consists of 3 coefficients equal to 1 and 8 coefficients equal to 0. Take dg = 5. And assume that h = g*f^{-1} mod q, rather than using the optimization that has f = 1+pF. Then we might have
f = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
finv = [16, 12, 4, 18, 17, 14, 9, 28, 8, 26, 3]
g = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0]
h = [15, 20, 1, 21, 4, 26, 14, 17, 25, 11, 12]
You can check that f*h = g here.
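That check can be done with a few lines of Python. This is a minimal sketch that assumes the products are taken in the usual NTRU ring, i.e. polynomials modulo x^N - 1 with coefficients modulo q; cyclic_mul is a helper name introduced here for illustration.

```python
N, q = 11, 29

def cyclic_mul(a, b):
    """Multiply a and b modulo x^N - 1, reducing coefficients modulo q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % q
    return out

f    = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
finv = [16, 12, 4, 18, 17, 14, 9, 28, 8, 26, 3]
g    = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0]
h    = [15, 20, 1, 21, 4, 26, 14, 17, 25, 11, 12]

print(cyclic_mul(f, finv))       # the polynomial 1
print(cyclic_mul(g, finv) == h)  # h = g * f^{-1}
print(cyclic_mul(f, h) == g)     # f * h = g
```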
The attacker wants to find f, so they can do a brute-force search for df = 3. They can speed this up by taking advantage of the fact that some rotation of f has a 1 in the first position, so they only need to search the (10 choose 2) = 45 possible locations for the other two nonzero coefficients of f. The full search they perform is this:
f*h (= g); f
[9, 18, 7, 13, 26, 22, 15, 28, 27, 24, 19]; [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[23, 17, 4, 8, 16, 2, 3, 6, 10, 21, 11]; [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[15, 2, 3, 5, 11, 21, 12, 23, 17, 4, 8]; [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[12, 23, 17, 4, 8, 16, 2, 3, 5, 11, 20]; [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[24, 20, 9, 18, 7, 13, 26, 22, 14, 28, 27]; [1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[2, 3, 6, 10, 21, 12, 23, 17, 4, 8, 15]; [1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[19, 10, 18, 7, 13, 26, 22, 14, 28, 27, 24]; [1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[28, 27, 25, 19, 10, 18, 7, 13, 25, 22, 14]; [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
[18, 7, 13, 26, 22, 15, 28, 27, 24, 19, 9]; [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[22, 14, 28, 27, 25, 19, 10, 18, 7, 13, 25]; [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
[14, 28, 27, 24, 20, 9, 19, 6, 14, 25, 22]; [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]
[11, 20, 12, 23, 17, 4, 9, 15, 2, 3, 5]; [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
[23, 17, 4, 8, 16, 1, 4, 5, 11, 20, 12]; [1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
[1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0]; [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
[18, 7, 13, 26, 22, 14, 0, 26, 25, 19, 9]; [1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
[27, 24, 20, 9, 19, 6, 14, 25, 22, 14, 28]; [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
[17, 4, 8, 16, 2, 3, 6, 10, 21, 11, 23]; [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
[28, 27, 24, 19, 10, 18, 7, 13, 26, 22, 14]; [1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
[25, 19, 9, 18, 7, 13, 26, 22, 14, 0, 26]; [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
[8, 16, 1, 3, 6, 10, 21, 12, 23, 17, 4]; [1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
[15, 28, 27, 24, 20, 9, 18, 7, 13, 26, 21]; [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
[3, 6, 10, 21, 12, 23, 17, 4, 8, 16, 1]; [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
[12, 23, 17, 4, 9, 15, 2, 3, 5, 11, 20]; [1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
[2, 3, 5, 11, 21, 12, 23, 17, 4, 8, 15]; [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
[17, 4, 8, 15, 2, 3, 6, 10, 21, 12, 23]; [1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
[0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1]; [1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
[7, 13, 26, 21, 15, 28, 27, 24, 20, 9, 18]; [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
[24, 20, 9, 18, 7, 13, 26, 21, 15, 28, 27]; [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
[4, 8, 16, 1, 4, 5, 11, 20, 12, 23, 17]; [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
[23, 17, 4, 8, 16, 2, 3, 5, 11, 20, 12]; [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
[26, 22, 14, 28, 27, 24, 20, 9, 18, 7, 13]; [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
[4, 5, 11, 20, 12, 23, 17, 4, 8, 16, 1]; [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0]
[21, 12, 23, 17, 4, 8, 16, 1, 3, 6, 10]; [1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
[1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0]; [1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
[20, 9, 18, 7, 13, 26, 22, 14, 28, 27, 24]; [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
[16, 2, 3, 5, 11, 20, 12, 23, 17, 4, 8]; [1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0]
[4, 9, 15, 2, 3, 5, 11, 20, 12, 23, 17]; [1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0]
[13, 26, 22, 14, 0, 26, 25, 19, 9, 18, 7]; [1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0]
[3, 6, 10, 21, 12, 23, 17, 4, 8, 15, 2]; [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]
[11, 21, 12, 23, 17, 4, 8, 15, 2, 3, 5]; [1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
[20, 9, 19, 6, 14, 25, 22, 14, 28, 27, 24]; [1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
[10, 18, 7, 13, 26, 22, 14, 28, 27, 24, 19]; [1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
[8, 16, 2, 3, 6, 10, 21, 11, 23, 17, 4]; [1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
[27, 25, 19, 10, 18, 7, 13, 25, 22, 14, 28]; [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
[7, 13, 26, 22, 15, 28, 27, 24, 19, 9, 18]; [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
Scanning down the list, you can see that g, or a rotation of it, appears in rows 14, 26 and 34 of the 45 rows. (There are three matches because f contains three 1s, so three rotations of f have a 1 in the leading position.)
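The search above can be reproduced with a short script. This is a sketch under the assumption that the rows are enumerated in the same order as the table (positions of the two extra 1s in lexicographic order) and that a candidate is accepted when f*h has only 0/1 coefficients with exactly dg = 5 ones; cyclic_mul is an illustrative helper name.

```python
from itertools import combinations

N, q, dg = 11, 29, 5
h = [15, 20, 1, 21, 4, 26, 14, 17, 25, 11, 12]

def cyclic_mul(a, b):
    """Multiply a and b modulo x^N - 1, reducing coefficients modulo q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % q
    return out

hits = []
# A 1 is fixed in the leading position; the other two 1s go anywhere in 1..10.
for row, (i, j) in enumerate(combinations(range(1, N), 2), start=1):
    cand = [0] * N
    cand[0] = cand[i] = cand[j] = 1
    prod = cyclic_mul(cand, h)
    if all(c in (0, 1) for c in prod) and sum(prod) == dg:
        hits.append((row, cand))

for row, cand in hits:
    print(row, cand)
```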
Now let's look at the meet-in-the-middle attack. The attacker uses the formula
(f1+f2) * h = g
so
f1*h = g - f2*h
Using e[i] to mean the i'th coefficient of e, and since every coefficient of g is 0 or 1, this means that the attacker knows that
(f1*h)[i] = -(f2*h)[i] + (0 or 1)
So the attacker calculates all possible values of f1*h. Call the resulting list {g1}. They then calculate -f2*h and, for each result g2, check whether some existing g1 is equal to g2 or exceeds it by exactly 1 in some of its coefficients (mod q). In other words,
[3, 10, 12, 7]
would match
[4, 10, 12, 8]
Doing it this way, the attacker needs only work through the following:
All 10 f1s with a 1 in the leading position and a 1 somewhere else
All 10 f2s with a single 1 in any position other than the leading one
This gives the following. I've sorted the lists to make the matches easier to spot.
f1*h = g1 f1
[00, 08, 26, 03, 16, 12, 05, 18, 17, 15, 09] [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[03, 16, 12, 04, 19, 17, 15, 09, 00, 08, 26] [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[06, 21, 22, 25, 01, 11, 02, 13, 07, 23, 27] [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[07, 24, 27, 06, 21, 22, 25, 00, 11, 02, 13] [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[11, 02, 13, 07, 24, 27, 06, 21, 22, 25, 00] [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[12, 05, 18, 17, 15, 09, 00, 08, 26, 03, 16] [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[16, 12, 05, 18, 18, 14, 10, 28, 08, 26, 03] [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
[19, 17, 15, 09, 00, 08, 26, 03, 16, 12, 04] [1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[26, 03, 16, 12, 05, 18, 18, 14, 10, 28, 08] [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[27, 06, 21, 22, 25, 01, 11, 02, 13, 07, 23] [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
-f2*h = g2 f2
[03, 15, 12, 04, 18, 17, 14, 09, 28, 08, 25] [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
[04, 18, 17, 14, 09, 28, 08, 25, 03, 15, 12] [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
[08, 25, 03, 15, 12, 04, 18, 17, 14, 09, 28] [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
[09, 28, 08, 25, 03, 15, 12, 04, 18, 17, 14] [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[12, 04, 18, 17, 14, 09, 28, 08, 25, 03, 15] [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
[15, 12, 04, 18, 17, 14, 09, 28, 08, 25, 03] [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[17, 14, 09, 28, 08, 25, 03, 15, 12, 04, 18] [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[18, 17, 14, 09, 28, 08, 25, 03, 15, 12, 04] [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
[25, 03, 15, 12, 04, 18, 17, 14, 09, 28, 08] [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
[28, 08, 25, 03, 15, 12, 04, 18, 17, 14, 09] [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
You can see that:
line 1 of g1 matches with line 10 of g2, giving [1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
line 2 of g1 matches with line 1 of g2, giving [1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
line 6 of g1 matches with line 5 of g2, giving [1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
line 7 of g1 matches with line 6 of g2, giving [1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
line 8 of g1 matches with line 8 of g2, giving [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
line 9 of g1 matches with line 9 of g2, giving [1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
There are 6 collisions here because there are 3 rotations with a 1 in the leading position, and for each rotation there are two ways to split the other two nonzero coefficients between f1 and f2.
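The two lists and the matching step can be sketched as follows. This is a toy illustration with a plain pairwise comparison, assuming f1 = 1 + x^a and f2 = x^b as described above; unit and cyclic_mul are helper names introduced here.

```python
N, q = 11, 29
h = [15, 20, 1, 21, 4, 26, 14, 17, 25, 11, 12]

def cyclic_mul(a, b):
    """Multiply a and b modulo x^N - 1, reducing coefficients modulo q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % q
    return out

def unit(*positions):
    """Binary vector of length N with 1s at the given positions."""
    v = [0] * N
    for p in positions:
        v[p] = 1
    return v

# g1 = f1*h for every f1 = 1 + x^a; g2 = -f2*h for every f2 = x^b.
g1_list = [(a, cyclic_mul(unit(0, a), h)) for a in range(1, N)]
g2_list = [(b, [(-c) % q for c in cyclic_mul(unit(b), h)]) for b in range(1, N)]

collisions = []
for a, g1 in g1_list:
    for b, g2 in g2_list:
        if a == b:
            continue  # f1 and f2 must not share a position
        # g = g1 - g2 must have every coefficient equal to 0 or 1
        if all((u - v) % q in (0, 1) for u, v in zip(g1, g2)):
            collisions.append(unit(0, a, b))

for cand in collisions:
    print(cand)
```

Each of the three rotations of f shows up twice, once for each way of assigning its two trailing 1s to f1 and f2.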
So an attacker would have to do about 45/3 = 15 units of work to find the key with a brute-force search and about 10 units of work to find the key with a meet-in-the-middle attack (slightly less than 10 due to the rotations, but I don't have a clean formula to hand).
There are various optimizations, but this should be enough to give you the idea.
One thing I haven't dealt with so far is how to keep the search time down. A straightforward way to do it is simply to sort the results as you're going along. The time to insert or look for a collision with an entry is about log_2(size of the search space). Alternatively, at the cost of using more memory, it's possible to bring this search time down to a constant by reserving a block for each possible value of the first few coefficients of g1.
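The second idea, one block per value of the first few coefficients, can be sketched with a dictionary of buckets. This is one possible reading of that description, not a reference implementation: K (the number of leading coefficients used as the key) is an assumed parameter, and because g1[i] is either g2[i] or g2[i] + 1, each g2 needs to probe 2**K buckets.

```python
from collections import defaultdict
from itertools import product

N, q, K = 11, 29, 2   # K is an assumed key length, chosen for the toy example
h = [15, 20, 1, 21, 4, 26, 14, 17, 25, 11, 12]

def cyclic_mul(a, b):
    """Multiply a and b modulo x^N - 1, reducing coefficients modulo q."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % N] = (out[(i + j) % N] + ai * bj) % q
    return out

def unit(*positions):
    """Binary vector of length N with 1s at the given positions."""
    v = [0] * N
    for p in positions:
        v[p] = 1
    return v

# Store every g1 = (1 + x^a)*h in the bucket named by its first K coefficients.
buckets = defaultdict(list)
for a in range(1, N):
    g1 = cyclic_mul(unit(0, a), h)
    buckets[tuple(g1[:K])].append((a, g1))

collisions = []
for b in range(1, N):
    g2 = [(-c) % q for c in cyclic_mul(unit(b), h)]
    # Probe only the buckets whose key is g2[:K] plus 0 or 1 per coefficient.
    for delta in product((0, 1), repeat=K):
        key = tuple((g2[i] + delta[i]) % q for i in range(K))
        for a, g1 in buckets.get(key, []):
            if a != b and all((u - v) % q in (0, 1) for u, v in zip(g1, g2)):
                collisions.append((a, b))

print(sorted(collisions))
```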
Hope this helps. Let me know if you have any more questions.

Resources