Binning continuous data to stack histogram in R - r

I have a dataset that looks like this:
USER.ID avgfrequency orders group
1 3 3.7821782 101 3
2 7 14.7500000 8 3
3 9 13.4761905 21 3
4 13 5.1967213 61 3
5 16 6.7812500 64 3
6 26 41.7500000 4 2
7 49 13.6666667 3 2
8 50 7.0000000 1 1
9 51 1.0000000 1 1
10 52 17.7500000 4 2
11 69 4.5000000 2 1
12 75 9.9500000 20 3
13 91 84.2000000 5 2
14 98 8.0185185 54 3
15 138 14.2000000 5 2
16 139 34.7500000 4 2
17 149 7.6666667 21 3
18 155 35.3333333 9 3
19 167 24.0000000 1 1
20 170 7.3529412 34 3
21 171 4.4210526 76 3
22 174 4.5000000 2 1
23 175 6.5781250 64 3
24 176 19.2857143 21 3
25 177 10.4864865 37 3
26 178 28.0000000 15 3
27 180 4.8461538 39 3
28 183 25.5000000 2 1
29 184 13.0000000 1 1
30 210 32.0000000 1 1
31 215 13.4615385 13 3
32 220 11.3611111 36 3
33 223 26.2500000 8 3
34 224 40.5000000 8 3
35 230 15.4000000 10 3
36 232 14.6666667 3 2
37 234 34.5833333 12 3
38 238 138.5000000 2 1
39 240 7.0000000 3 2
40 243 35.0000000 3 2
41 246 6.7500000 4 2
42 247 8.5000000 50 3
43 258 17.6666667 3 2
44 283 23.5000000 2 1
45 295 19.5625000 16 3
46 300 81.6666667 3 2
47 311 34.4166667 12 3
48 338 64.0000000 1 1
49 342 113.3333333 3 2
50 343 197.0000000 1 1
51 347 3.6923077 13 3
52 350 4.6666667 3 2
53 360 177.5000000 2 1
54 361 39.0000000 10 3
55 362 1.4000000 5 2
56 365 15.0000000 24 3
57 366 59.2000000 5 2
58 367 5.0000000 4 2
59 369 27.9285714 14 3
60 372 63.6666667 3 2
61 375 9.3750000 8 3
62 377 13.3225806 31 3
63 380 169.5000000 2 1
64 383 23.2352941 17 3
65 391 0.0000000 1 1
I want to split avgfrequency into different bins of width 10 and plot it as x-axis and on y-axis I want to show the count of USER.ID as histograms and in each bar I want to show count of USER.ID of different group with different color. So, each histogram would have three different colors for each bin.
Is it possible to do it in R ?

It is possible. See below:
library(ggplot2) #load the ggplot2 graph package
data = data.frame(data) #make the dataset a R dataframe object
head(data,2) #just showing part of the data here.
USER.ID avgfrequency orders group
3 3.782178 101 3
7 14.750000 8 3
#build graph
ggplot(data, aes(x=avgfrequency,fill=factor(group))) +
geom_histogram(breaks=seq(0,200,by=10),colour='black') +
xlab("Average Frequency") + ylab("Count of USER.ID") +
scale_fill_manual("Group", breaks = c("1","2","3"), values = c("grey30","grey50", "grey70")) +
theme_bw()

Related

How to test for p-value with groups/filters in dplyr

My data looks like the example below. (sorry if it's too long, not sure what's acceptable/needed).
I have used the following code to calculate the median and IQR of each time difference (tdif) between tests (testno):
data %>% group_by(testno) %>% filter(type ==1) %>%
summarise(Median = median(tdif), IQR= IQR(tdif), n= n(), .groups = 'keep') -> result
I have done this for each category of 'type' (coded as 1 - 10), which brought me to the added table (bottom).
My question is, if it is possible to:
Do this an easier way (without the filters? So I can do this all in 1 run), and
Is it possible run a test for p-value with all the groups/filters?
data <- read.table(header=T, text= '
PID time tdif testno type
3 205 0 1 1
4 77 0 1 1
4 85 8 2 1
4 126 41 3 1
4 165 39 4 1
4 202 37 5 1
4 238 36 6 1
4 272 34 7 1
4 277 5 8 1
4 370 93 9 1
4 397 27 10 1
4 452 55 11 1
4 522 70 12 1
4 529 7 13 1
4 608 79 14 1
4 651 43 15 1
4 655 4 16 1
4 713 58 17 1
4 804 91 18 1
4 900 96 19 1
4 944 44 20 1
4 979 35 21 1
4 1015 36 22 1
4 1051 36 23 1
4 1077 26 24 1
4 1124 47 25 1
4 1162 38 26 1
4 1222 60 27 1
4 1334 112 28 1
4 1383 49 29 1
4 1457 74 30 1
4 1506 49 31 1
4 1590 84 32 1
4 1768 178 33 1
4 1838 70 34 1
4 1880 42 35 1
4 1915 35 36 1
4 1973 58 37 1
4 2017 44 38 1
4 2090 73 39 1
4 2314 224 40 1
4 2381 67 41 1
4 2433 52 42 1
4 2484 51 43 1
4 2694 210 44 1
4 2731 37 45 1
4 2792 61 46 1
4 2958 166 47 1
5 48 0 1 3
5 111 63 2 3
5 699 588 3 3
5 1077 378 4 3
6 -43 0 1 3
8 67 0 1 1
8 168 101 2 1
8 314 146 3 1
8 368 54 4 1
8 586 218 5 1
10 639 0 1 6
13 -454 0 1 3
13 -384 70 2 3
13 -185 199 3 3
13 193 378 4 3
13 375 182 5 3
13 564 189 6 3
13 652 88 7 3
13 669 17 8 3
13 718 49 9 3
14 704 0 1 8
15 -165 0 1 3
15 -138 27 2 3
15 1335 1473 3 3
16 168 0 1 6
18 -1329 0 1 3
18 -1177 152 2 3
18 -1071 106 3 3
18 -945 126 4 3
18 -834 111 5 3
18 -719 115 6 3
18 -631 88 7 3
18 -497 134 8 3
18 -376 121 9 3
18 -193 183 10 3
18 -78 115 11 3
18 -13 65 12 3
18 100 113 13 3
18 196 96 14 3
18 552 356 15 3
18 650 98 16 3
18 737 87 17 3
18 804 67 18 3
18 902 98 19 3
18 983 81 20 3
18 1119 136 21 3
19 802 0 1 1
19 1593 791 2 1
26 314 0 1 8
26 389 75 2 8
26 597 208 3 8
33 639 0 1 6
Added table (values differ from example data, because this isn't the complete set).

Error in x[j] : invalid subscript type 'list' while using subset in R

I have a problem while I'm trying to subset my dataframe. Here is the code that I'm using to import data file and sub-setting;
fiber_val<-read.csv(file.choose(), header=TRUE, dec=",", check.names=FALSE,stringsAsFactors=FALSE)
y<-14
z<-16
fiber_val[, y:z] <- sapply(fiber_val[, y:z], as.numeric)
fiber_val$sg<-(fiber_val$airdryweight/1.077)/fiber_val$waterweight
fiber_val<-subset(fiber_val, select = c(id,sample,standtreedisk,density,sg))
after running the last line, it yells at me
Error in x[j] : invalid subscript type 'list'
and here's part of data set that I'm using;
id stand tree disk species region standtreedisk nirblock sample barktopith pithtobark length sections ringssection airdryweight waterweight density
1 160 7 10 131 6 160x7x10 749 16907 4 2 52 5 2 0.6489 1.3245 0.48992
2 160 7 10 131 6 160x7x10 749 16905 2 4 52 5 3 0.6062 1.2206 0.49664
3 160 7 12 131 6 160x7x12 750 16915 2 3 43 4 2 0.6438 1.3279 0.48483
4 160 7 13 131 6 160x7x13 750 16919 2 2 30 3 3 0.5816 1.4101 0.41245
5 161 17 12 131 6 161x17x12 760 17166 4 2 50 5 1 0.5702 1.3952 0.40869
6 161 17 12 131 6 161x17x12 760 17167 5 1 50 5 1 0.5454 1.3307 0.40986
7 161 17 12 131 6 161x17x12 760 17163 1 5 50 5 1 0.6947 1.5702 0.44243
8 161 17 13 131 6 161x17x13 760 17170 3 1 32 3 2 0.4357 1.2244 0.35585
9 26 9 7 131 4 26x9x7 140 3883 8 1 82 8 2 0.4595 1.3503 0.34029
10 161 17 13 131 6 161x17x13 760 17169 2 2 32 3 1 0.484 1.2843 0.37686
11 136 50 1 131 6 136x50x1 579 12482 9 1 96 9 2 0.5392
12 137 54 5 131 4 137x54x5 586 12636 4 4 73 7 1 0.4692
13 137 54 5 131 4 137x54x5 586 12638 6 2 73 7 2 0.4555
14 137 54 6 131 4 137x54x6 586 12640 1 6 65 6 4 0.6449
15 137 54 1 131 4 137x54x1 585 12606 5 5 90 9 1 0.7035
16 137 54 1 131 4 137x54x1 585 12610 9 1 90 9 2 0.4963
17 137 54 1 131 4 137x54x1 585 12609 8 2 90 9 2 0.5193
18 137 54 1 131 4 137x54x1 585 12603 2 8 90 9 3 0.6427
19 137 54 6 131 4 137x54x6 586 12644 5 2 65 6 1 0.4654
20 137 54 4 131 4 137x54x4 585 12632 7 1 76 7 2 0.4974
21 137 54 5 131 4 137x54x5 586 12639 7 1 73 7 2
22 137 5 3 131 4 137x5x3 582 12557 2 7 82 8 3
23 137 74 3 131 4 137x74x3 588 12679 3 5 71 7 2
24 137 74 3 131 4 137x74x3 588 12683 7 1 71 7 1
25 137 5 3 131 4 137x5x3 582 12562 7 2 82 8 1
26 137 74 5 131 4 137x74x5 588 12695 6 1 61 6 2
27 138 108 1 131 4 138x108x1 594 12830 6 5 104 10 1
28 138 108 1 131 4 138x108x1 594 12831 7 4 104 10 2
29 138 108 1 131 4 138x108x1 594 12832 8 3 104 10 2
30 138 66 1 131 4 138x66x1 592 12781 5 4 87 8 2
any help would be appreciated :)
you appear to have a problem with the type of object you have. you can try converting it to a data frame using unlist, as.data.frame(), etc.

Convert data frame to time series for prediction in R

I retrieve data from MySQL in the following format:
date newCustomers
2016-07-27 31
2016-07-26 3
The data starts from date 2015-02-25 and there is an entry for each day.
I want to convert this data frame to time series for forecasting purposes.
I tried the following:
dataTimeSeries <- ts(data, start=c(2015,2,25), frequency=365.25) and it gave me a warning In data.matrix(data) : NAs introduced by coercion. On checking what's in dataTimeSeries, this is what I found
date day
2016.000 NA 31
2016.003 NA 3
2016.005 NA 2
2016.008 NA 0
What am I doing wrong, please point me in the right direction?
UPDATE: As suggested, I tried dataTimeSeries <- ts(data$newCustomers, start=c(2015,2,25), frequency=365.25) and it gave me the following result
Time Series:
Start = 2015.00273785079
End = 2015.9993155373
Frequency = 365.25
[1] 31 3 2 0 101 69 8 4 15 3 1 22 47 85 359 6 7 2 134 44 20 61 2 0 4 2373 4243 7 31 11 2 0 25 1689 24 74
[37] 22 0 1 336 373 14 11 145 7 0 1 19 49 522 19 1 39 1611 9 675 21 1 45 4 156 180 747 265 169 0 0 4 7 3 4 10
[73] 64 1 3 5 2 13 15 0 6 0 13 2 13 10 5 14 16 28 134 8 2 0 0 9 29 7 79 17 1 4 167 6 64 334 14 0
[109] 0 13 17 57 66 3 0 0 25 2 4 22 16 2 0 23 23 169 9912 24 8 3 154 3 2 29 29 243 0 6 2 72 66 7 1 0
[145] 24 208 13 6 7 10 4 54 79 72 9 29 31 208 224 18 50 65 152 50 10 55 107 249 178 3 0 0 627 19 220 20 285 0 1 11
[181] 26 25 88 9 2 7 64 54 212 295 37 49 19 144 30 78 29 97 210 143 4 294 2 34 642 24 0 0 1 4 0 0 0 0 0 0
[217] 2 3 9 0 0 62 6 16 0 12 0 21 3 6 5 8 1 1 0 3 40 16 1 0 0 66 0 0 1 8 6 1 14 26 4 4
[253] 285 4 0 0 0 3 1 0 28 0 0 24 360 0 0 2 3 0 11 294 578 1 4 0 0 19 2 7 10 0 0 1 20 1 59 19
[289] 2 0 0 9 19 12 4 10 5 4 5 5 7 38 10 5 6 9 18 22 30 28 13 14 22 22 35 12 6 3 3 15 3 3 28 1
[325] 0 0 7 45 21 14 21 0 0 22 14 17 799 7 0 3 8 20 21 107 75 3 3 39 36 137 42 39 6 16 113 11 6 10 8 6
[361] 6 8 21 12 81
which is not correct.
This should work, since you only need to feed the data (and not the times) to ts():
dataTimeSeries <- ts(data$newCustomers, ...)
It's also possible that your data doesn't have regularly spaced intervals between observations? Time series are best used for data sets with equally-spaced intervals between your observation dates. You can see Analyzing Daily/Weekly data using ts in R for other methods of analyzing data that doesn't necessarily have equally-spaced time.

how to fix “No appropriate likelihood could be inferred” for network meta-analysis in R?

I am currently learning Network meta-analysis in R with "gemtc",and "netmeta".
As I try to fit the GLM model for analysis, I encountered this error message " No appropriate likelihood could be inferred" .
My code are:
gemtc_network_numbers <-mtc.network(data.ab=diabetes_data,treatments=treatments)
mtcmodel<-mtc.model(network=gemtc_network_numbers,type="consistency",factor=2.5, n.chain=4, linearModel="random")
mtcresults <- mtc.run(mtcmodel, n.adapt = 20000, n.iter=100000, thin=10, sampler="rjags")
# View results summary
print(summary(mtcresults))
My data are:
> diabetes_data
study treatment responder samplesize
1 1 1 45 410
2 1 3 70 405
3 1 4 32 202
4 2 1 119 4096
5 2 4 154 3954
6 2 5 302 6766
7 3 2 1 196
8 3 5 8 196
9 4 1 138 2800
10 4 5 200 2826
11 5 3 799 7040
12 5 4 567 7072
13 6 1 337 5183
14 6 3 380 5230
15 7 2 163 2715
16 7 6 202 2721
17 8 1 449 2623
18 8 6 489 2646
19 9 5 29 416
20 9 6 20 424
21 10 4 177 4841
22 10 6 154 4870
23 11 3 86 3297
24 11 5 75 3272
25 12 1 102 2837
26 12 6 155 2883
27 13 4 136 2508
28 13 5 176 2511
29 14 3 665 8078
30 14 4 569 8098
31 15 2 242 4020
32 15 3 320 3979
33 16 3 37 1102
34 16 5 43 1081
35 16 6 34 2213
36 17 3 251 5059
37 17 4 216 5095
38 18 1 335 3432
39 18 6 399 3472
40 19 2 93 2167
41 19 6 115 2175
42 20 5 140 1631
43 20 6 118 1578
44 21 1 93 1970
45 21 3 97 1960
46 21 4 95 1965
47 22 2 690 5087
48 22 4 845 5074
Thanks for your help.
Angel
You have to solution :
1- Replace your responder variable by "responders" and your samplesize variable by "sampleSize".
or
2- Use for example : mtc.model(...,likelihood="poisson",link="log")).

How to annotate boxplots using svyboxplot library in R

I am trying to figure out how to label the boxplots that appear after I use the svyboxplot library for R.
I have tried the following:
svyboxplot(~ALCANYNO~factor(REGION), design=ihisDesign3, xlab='Region', ylab='Frequency', ylim=c(0,10), colnames=c("Northeast", "Midwest", "South", "West"));
SOLUTION: Add the following to factor:
labels = c('Northeast', 'Midwest', 'South', 'West')
This changes the example above to the following:
svyboxplot(~ALCANYNO~factor(REGION,
labels=c('Northeast', 'Midwest', 'South', 'West')),
design=ihisDesign3, xlab='Region', ylab='Frequency',
ylim =c (0, 10))
I am Creating a dataset to explain:
options(width = 120)
library (survey)
library (KernSmooth)
xd1<-
"xsmoke age_p psu stratum wt8
13601 3 22 2 20 356.5600
32966 3 38 2 45 434.3562
63493 1 32 1 87 699.9987
238175 3 46 1 338 982.8075
174162 3 40 1 240 273.6313
220206 3 33 2 308 1477.1688
118133 3 68 1 159 716.3012
142859 2 23 1 194 1100.9475
115253 2 35 2 155 444.3750
61675 3 31 1 85 769.5963
189813 3 37 1 263 328.5600
226274 1 47 2 318 605.8700
41969 3 71 2 58 597.0150
167667 3 40 2 230 1030.4637
225103 3 37 2 316 349.6825
49894 3 70 2 68 517.7862
98075 3 46 2 130 1428.7225
180771 3 50 1 250 652.4188
137057 3 42 1 186 590.2100
77705 2 23 1 105 1687.2450
89106 3 48 1 118 407.6513
208178 3 50 1 290 556.5000
100403 3 52 2 133 1481.8200
221571 1 27 2 310 833.5338
10823 2 72 1 16 1807.6425
108431 3 71 2 145 945.6263
68708 1 46 1 94 1989.3775
23874 3 23 2 33 1707.8775
150634 3 19 2 206 761.1500
231232 3 42 2 326 1487.4113
184654 2 42 2 255 1715.2375
215312 3 57 1 300 483.5663
40713 2 57 2 56 2042.2762
130309 3 23 1 177 948.5625
25515 2 55 1 35 2719.7525
235612 2 83 2 333 603.3537
13755 2 36 2 20 265.1938
2441 3 33 1 4 1062.1200
157327 3 77 1 215 2010.6600
66502 3 20 2 91 1122.9725
230778 1 55 2 325 1207.3025
74805 3 54 1 101 1028.5150
166556 1 50 1 229 1546.9450
91914 1 68 1 121 428.5350
89651 3 59 2 118 143.5437
149329 3 44 2 204 1064.7725
212700 2 59 2 295 1050.1163
454 1 79 1 1 275.5700
125639 1 27 1 170 785.1037
55442 3 47 1 76 950.3312
145132 3 77 1 197 1269.2287
123069 3 24 1 167 216.1937
188301 1 55 2 260 426.6313
852 2 66 2 1 1443.4887
3582 3 81 1 6 790.8412
235423 1 44 2 333 659.4238
42175 2 40 1 59 1089.6762
57033 3 43 1 78 226.8750
177273 2 85 1 244 392.7200
218558 3 40 2 305 1680.2700
27784 2 45 1 39 280.0550
81823 3 43 1 110 965.0438
76344 3 26 1 103 1095.6012
114916 3 56 2 154 436.8838
35563 3 78 1 49 333.2875
192279 3 30 2 267 722.0312
61315 1 48 2 84 1426.5725
219903 3 43 1 308 791.5738
42612 3 25 1 60 658.1387
178488 3 33 2 246 675.1912
9031 1 27 2 14 989.4863
145092 2 64 1 197 960.1912
71885 3 53 2 97 595.4050
38137 2 75 1 53 1004.0912
140149 1 21 1 190 1870.9350
162052 3 25 1 223 892.7775
89527 2 39 2 118 518.1050
59650 3 26 2 82 432.7837
24709 2 84 1 34 453.9013
18933 3 85 1 27 582.3288
24904 3 35 2 34 1027.5287
213668 3 39 1 298 3174.1925
110509 3 30 1 149 469.8188
72462 3 63 1 98 386.2163
152596 3 19 1 209 1328.2188
17014 4 62 1 24 294.9250
33467 2 50 1 46 1601.4575
5241 3 33 1 9 1651.0988
215094 3 23 1 300 427.6313
88885 1 21 1 118 1092.2613
204868 2 60 2 285 781.2325
157415 2 31 2 215 1323.5750
71081 2 44 2 96 1059.2088
25420 3 38 1 35 530.7413
144226 1 27 1 196 1126.3112
47888 3 46 2 66 965.4050
216179 3 29 2 301 1237.6463
29172 3 68 1 41 1025.9738
168786 1 47 1 232 680.6213
94035 2 23 2 124 330.4563
170542 1 25 2 234 757.2287
160331 2 33 2 220 636.3900
124163 3 80 2 167 287.6988
71442 2 37 1 97 442.2300
80191 2 74 2 107 871.0338
199309 3 29 2 277 485.2337
91293 3 35 2 120 138.3187
219524 2 68 1 307 609.5862
119336 3 85 2 160 149.7612
31814 3 68 1 44 396.6913
54920 1 28 2 75 532.7175
161034 3 29 2 221 791.0100
177037 1 50 1 244 626.2400
119963 1 54 1 162 374.1062
107972 2 58 1 145 944.8863
22932 3 60 1 32 310.6413
54197 3 23 2 74 931.2737
209598 3 23 1 292 1078.2950
213604 1 74 2 297 588.5000
146480 3 27 1 200 212.0588
162463 3 55 2 223 1202.0925
215534 3 33 2 300 430.3938
100703 1 53 1 134 463.6200
162588 3 27 1 224 612.0250
222676 1 35 1 312 292.7000
220052 3 84 1 308 1301.4738
131382 3 36 1 178 825.9512
102117 3 28 1 137 451.4075
70362 3 52 2 95 185.2562
188757 3 22 2 261 704.3913
215878 2 37 1 301 789.9837
45820 3 18 2 64 2019.4137
84860 3 47 1 113 149.0200
110581 3 37 1 149 526.0775
207650 3 51 2 289 688.0538
40723 3 59 2 56 497.6050
169663 3 19 2 233 845.0362
191955 1 36 1 267 735.7350
213816 3 18 2 298 2275.3513
120967 3 48 2 163 1055.3238
209430 2 42 2 291 1771.0225
21235 3 21 1 30 1204.5663
131326 3 29 1 178 331.9588
19667 1 57 1 28 638.9138
74743 2 48 1 101 1208.8763
178672 3 66 2 246 338.2013
100174 3 24 2 133 1733.6275
69046 3 24 2 94 542.4863
79960 1 41 2 107 567.6363
108591 2 42 1 146 978.3775
235635 3 24 1 334 1382.9437
187426 2 54 2 259 478.2362
28728 3 39 2 40 1165.6175
205348 3 32 2 286 1082.9913
218812 3 30 1 306 308.1037
168389 3 48 2 231 593.2475
145479 1 21 1 198 864.2663
105170 2 40 1 141 1016.7862
155753 2 78 2 212 1109.0025
169399 3 28 1 233 1467.1363
55664 1 63 1 76 904.3763
74024 2 51 1 100 547.5538
85558 1 25 1 114 893.8825
142684 3 54 2 193 1203.3212
198792 1 22 1 277 1800.3325
82603 3 70 2 110 827.3763
171036 2 50 2 235 2003.9725
1616 1 42 2 2 590.5662
57042 3 45 1 78 1021.7287
45100 2 38 2 63 1807.9288
134828 2 28 1 183 715.1187
91167 3 26 2 120 480.1950
170605 3 40 2 234 507.2763
175869 3 77 1 242 386.2987
81594 2 82 2 109 580.0838
37426 1 20 2 52 1159.1613
113799 3 85 1 153 459.5450
24721 3 18 2 34 2912.7575
26297 3 45 2 36 1304.4925
57074 1 51 1 78 602.2112
185000 3 34 1 256 583.5738
94196 3 44 2 124 2344.1087
80656 3 45 2 108 1340.9713
14849 1 46 1 22 967.2525
145730 2 73 1 198 418.8037
56633 3 34 2 77 1011.5488
273 2 54 1 1 786.2138
60567 1 40 2 83 315.2925
47788 1 38 2 66 1105.9188
76943 2 53 2 103 537.7062
165014 3 34 1 227 824.3125
188444 3 22 1 261 623.2225
29043 1 35 1 41 724.9025
165578 3 25 1 228 596.0275
50702 3 43 2 69 985.9662
197621 3 39 2 275 1310.1163
26267 3 41 2 36 1030.3900
29565 1 60 2 41 920.8550
20060 3 36 2 28 157.2188
119780 2 20 1 162 863.8100"
tor <- read.table(textConnection(xd1), header=TRUE, as.is=TRUE)
# Grouping variable "xsmoke" must be a factor
tor$xsmoke <- factor(tor$xsmoke,levels=c (1,2,3),
labels=c('Current SMK','Former SMK', 'Never Smk'), ordered=TRUE)
is.factor(tor$xsmoke)
# object with survey design variables and data
nhis <- svydesign (id=~psu,strat=~stratum, weights=~wt8, data=tor, nest=TRUE)
MyBreaks <- c(18, 25, 35, 45, 55, 65, 75, 85)
svyboxplot (age_p~xsmoke,
subset (nhis, age_p>=0),
col=c("red", "yellow", "green"), medcol="blue",
varwidth=TRUE, all.outliers=TRUE,
ylab="Age at Interview",
xlab=" "
)
The Factor variable xsmoke is coded as tor$xsmoke <- factor(tor$xsmoke,levels=c (1,2,3),
labels=c('Current SMK','Former SMK', 'Never Smk'), ordered=TRUE) which should be useful
__________________________________________enter code here

Resources