Barplot using three columns - r

The data in the table is given below:
Year NSW Vic. Qld SA WA Tas. NT ACT Aust.
1 1917 1904 1409 683 440 306 193 5 3 4941
2 1927 2402 1727 873 565 392 211 4 8 6182
3 1937 2693 1853 993 589 457 233 6 11 6836
4 1947 2985 2055 1106 646 502 257 11 17 7579
5 1957 3625 2656 1413 873 688 326 21 38 9640
6 1967 4295 3274 1700 1110 879 375 62 103 11799
7 1977 5002 3837 2130 1286 1204 415 104 214 14192
8 1987 5617 4210 2675 1393 1496 449 158 265 16264
9 1997 6274 4605 3401 1480 1798 474 187 310 18532
I want to plot a graph with Year on the x-axis and the total value on the y-axis. The barplot should show the ACT and NT values for each year.
I tried the following command:
barplot(as.matrix(r_data$ACT, r_data$NT), main="r_data", ylab="Total", beside=TRUE)
This command drew the bars for the ACT column per year but did not draw the bars for the NT column.

You have to create the matrix in a different way, because as.matrix() converts only its first argument, so r_data$NT was silently ignored:
barplot(as.matrix(r_data[c("ACT", "NT")]),
main="r_data", ylab="Total", beside=TRUE)
You can also use cbind instead of as.matrix and keep the rest of your original approach:
barplot(cbind(r_data$ACT, r_data$NT),
main="r_data", ylab="Total", beside=TRUE)
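Note that with beside=TRUE, barplot() draws one group of bars per matrix column, so a matrix with the columns ACT and NT gives one group for ACT and one for NT. If the goal is one group per year, as the question describes, one possible sketch is to transpose the matrix. The r_data below is rebuilt from the first three rows of the question's table; the names.arg, xlab and legend.text choices are assumptions, not part of the original answer.

```r
# First three rows of the question's table, only the columns needed here
r_data <- data.frame(Year = c(1917, 1927, 1937),
                     NT   = c(5, 4, 6),
                     ACT  = c(3, 8, 11))

# Rows = series (ACT, NT), columns = years, so each year becomes one group
heights <- t(as.matrix(r_data[c("ACT", "NT")]))

# legend.text = TRUE labels the bars with the row names of `heights`
mids <- barplot(heights, beside = TRUE, names.arg = r_data$Year,
                legend.text = TRUE, main = "r_data",
                xlab = "Year", ylab = "Total")
```

barplot() returns the bar midpoints, which can be handy for adding text labels afterwards.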

Related

Converting SAS Proc Shewhart into R programming

I have a data set to which SAS PROC SHEWHART has been applied, and I want to implement the same analysis in R. The data and the SAS code are below.
valueid date dis_id sales_amount yymm (year month)
868 5-Mar-18 2 956 1803
868 6-Apr-17 2 473 1704
868 22-Dec-16 2 524 1612
914 17-Dec-15 2 1768 1512
914 18-Aug-16 2 477 1608
914 12-Jan-17 2 804 1701
870 1-May-17 2 1373 1705
870 8-Sep-17 2 323 1709
870 29-Feb-16 2 1718 1602
870 26-Jan-16 2 1242 1601
870 1-Apr-16 2 995 1604
800 22-Apr-16 2 356 1604
925 10-May-16 2 1487 1605
928 30-May-16 2 1210 1605
928 29-Jun-16 2 1935 1606
928 28-Nov-16 2 1149 1611
928 13-Dec-16 2 835 1612
987 10-Jul-17 2 1023 1707
987 27-Jul-17 2 389 1707
987 22-Sep-17 2 1191 1709
Below is the program used to produce the XSCHART:
proc shewhart data=sales_revenue;
by valueid;
xschart sales_amount*yymm/ nochart outtable= newoutput;
id dis_id;
run;
I need to convert this PROC SHEWHART step with XSCHART into R. Any help would be appreciated.
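This is not a full port of PROC SHEWHART, but as a starting point, the per-subgroup statistics that an X-bar and S chart is built from (subgroup size, mean and standard deviation of sales_amount for each yymm within each valueid) can be sketched in base R. The data frame below holds five rows copied from the question; control limits are deliberately not computed here, and the column names n, mean and sd are illustrative choices.

```r
# Five rows copied from the question's data (valueid 987 and 928)
sales_revenue <- data.frame(
  valueid      = c(987, 987, 987, 928, 928),
  sales_amount = c(1023, 389, 1191, 1210, 1935),
  yymm         = c(1707, 1707, 1709, 1605, 1606)
)

# One row per valueid/yymm subgroup; sd is NA when a subgroup has one value
agg <- aggregate(sales_amount ~ valueid + yymm, data = sales_revenue,
                 FUN = function(v) c(n = length(v), mean = mean(v), sd = sd(v)))

# aggregate() stores the three statistics as a matrix column; flatten it
subgroups <- cbind(agg[c("valueid", "yymm")], agg$sales_amount)
subgroups <- subgroups[order(subgroups$valueid, subgroups$yymm), ]
subgroups
```

Most subgroups in the posted data contain a single observation, so their standard deviation is NA; that is worth checking before comparing against the SAS output table.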

Divide paired matching columns

I have a data.frame df with matching columns that are also paired. The matching columns are defined in the list patient. I would like to divide the matching columns by each other. Any suggestions on how to do this?
I tried this, but it does not take the pairing from patient into account.
m1 <- m1[sort(colnames(df))]
m1_g <- m1[,grep("^n",colnames(df))]
m1_r <- m1[,grep("^t",colnames(df))]
m1_new <- m1_g/m1_r
m1_new
> head(df)
na-008 ta-008 nc012 tb012 na020 na-018 ta-018 na020 tc020 tc093 nc093
hsa-let-7b-5p_TGAGGTAGTAGGTTGTGT 56 311 137 242 23 96 113 106 41 114
hsa-let-7b-5p_TGAGGTAGTAGGTTGTGTGG 208 656 350 713 49 476 183 246 157 306
hsa-let-7b-5p_TGAGGTAGTAGGTTGTGTGGT 631 1978 1531 2470 216 1906 732 850 665 909
hsa-let-7b-5p_TGAGGTAGTAGGTTGTGTGGTT 2760 8159 6067 9367 622 4228 2931 3031 2895 2974
hsa-let-7b-5p_TGAGGTAGTAGGTTGTGTGGTTT 1698 4105 3737 3729 219 1510 1697 1643 1527 1536
> head(patient)
$`008`
[1] "na-008" "ta-008"
$`012`
[1] "nc012" "tb012"
$`018`
[1] "na-018" "ta-018"
$`020`
[1] "na020" "tc020"
$`045`
[1] "nb045" "tc045"
$`080`
[1] "nb-080" "ta-080"
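One way to respect the pairing is to loop over patient and divide the two columns named in each element. The following is a minimal sketch, assuming that each element of patient holds exactly one column starting with "n" and one starting with "t", and that the division should go in the same direction as m1_g/m1_r above. The small df and patient here are toy stand-ins with illustrative values, not the full data.

```r
# Toy stand-ins for df and patient (illustrative values only)
df <- data.frame(`na-008` = c(56, 208), `ta-008` = c(311, 656),
                 nc012 = c(137, 350), tb012 = c(242, 713),
                 check.names = FALSE)
patient <- list(`008` = c("na-008", "ta-008"),
                `012` = c("nc012", "tb012"),
                `045` = c("nb045", "tc045"))

# Keep only the pairs whose two columns are both present in df
complete <- Filter(function(p) all(p %in% colnames(df)), patient)

# For each patient, divide the "n" column by the "t" column
ratios <- data.frame(lapply(complete, function(p) {
  n_col <- p[startsWith(p, "n")]
  t_col <- p[startsWith(p, "t")]
  df[[n_col]] / df[[t_col]]
}), check.names = FALSE, row.names = rownames(df))
ratios
```

The result has one column per patient ID, and patients with a missing column (such as 045 in the toy data) are skipped rather than causing an error.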

interpreting dates from Auto arima model

The following is my code,
auto<-auto.arima(x)
auto_for<-forecast(auto,h=30)
> auto_for$x
Time Series:
Start = 1
End = 74
Frequency = 1
[1] 151 151 151 151 151 219 465 465 465 465 465 743 743 743 743 743 743 743 743 743 743 743
[23] 743 743 743 743 743 743 743 829 829 829 829 829 829 1004 1004 1004 1424 1424 1424 1822 1941 1941
[45] 1941 1941 1941 1941 1941 2076 2076 2252 2252 2252 2252 2252 2252 2252 2252 2252 2252 2252 2252 2940 2940 2940
[67] 2940 2940 3134 3134 3134 3207 3207 3465
> auto_for
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
75 3510.397 3359.577 3661.217 3279.738 3741.056
76 3555.795 3342.503 3769.086 3229.594 3881.996
77 3601.192 3339.964 3862.419 3201.679 4000.705
78 3646.589 3344.949 3948.229 3185.271 4107.907
79 3691.986 3354.743 4029.230 3176.217 4207.755
80 3737.384 3367.952 4106.815 3172.387 4302.380
81 3782.781 3383.749 4181.812 3172.515 4393.047
82 3828.178 3401.595 4254.761 3175.776 4480.580
83 3873.575 3421.116 4326.035 3181.599 4565.552
84 3918.973 3442.039 4395.907 3189.565 4648.380
85 3964.370 3464.157 4464.582 3199.361 4729.379
86 4009.767 3487.312 4532.222 3210.741 4808.793
87 4055.164 3511.376 4598.953 3223.512 4886.817
88 4100.562 3536.246 4664.878 3237.515 4963.608
89 4145.959 3561.836 4730.081 3252.621 5039.297
90 4191.356 3588.077 4794.635 3268.720 5113.992
91 4236.753 3614.908 4858.599 3285.722 5187.785
I have the forecast values, but I am not able to get the dates from the model. The dates are not shown in the plot either: the x-axis runs from 0 to 91 instead of showing my actual dates. The series I started with is an xts object.
Update:
> a<-ts(ana)
> a
Time Series:
Start = 1
End = 68
Frequency = 1
final.day final.cumsum135
1 16535 318
2 16536 318
3 16537 318
4 16538 318
5 16539 318
6 16540 318
7 16541 318
8 16542 318
9 16543 318
10 16544 318
11 16545 318
12 16546 318
13 16547 318
14 16548 318
15 16549 318
16 16550 318
17 16551 318
18 16552 318
19 16553 318
20 16554 318
21 16555 318
22 16556 318
23 16557 318
24 16558 318
25 16559 318
26 16560 369
27 16561 369
28 16562 369
29 16563 369
30 16564 369
31 16565 369
32 16566 369
33 16567 369
34 16568 369
35 16569 369
> auto<-arima(a)
Error in arima(a) : only implemented for univariate time series
Is there any way I can get the dates back here?

With daily series, the fitted values and forecasts sometimes "lose" their dates. You can recover the dates by hand, using index():
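library(forecast)  # auto.arima(), forecast()
library(xts)       # xts(), index()
y <- x             # x is your xts series
n <- length(y)
model_a1 <- auto.arima(y)
# the plot: draw against position 1..n, then label the axis with the real dates
ticks <- round(seq(1, n, length.out = 20))
plot(x = 1:n, y = as.numeric(y), xaxt = "n", xlab = "")
axis(1, at = ticks, labels = format(index(y)[ticks]), las = 2, cex.axis = .5)
lines(as.numeric(fitted(model_a1)), col = 2)
# the forecast: attach the 30 days that follow the last observed date
auto_for <- forecast(model_a1, h = 30)
fcs <- xts(as.numeric(auto_for$mean),
           seq(as.Date(index(y)[n]) + 1, by = "day", length.out = 30))
fcs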
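The "only implemented for univariate time series" error in the update appears because ts(ana) keeps both columns, so the series passed to arima() is a two-column matrix. A minimal sketch of one way around it, using only base R: fit on the value column alone and rebuild the dates from the day column. It assumes that final.day counts days since 1970-01-01 (which is how R stores Date values), uses a shortened toy copy of ana with a simplified column name, and picks a random-walk order purely for illustration.

```r
# Toy stand-in for `ana`: day numbers and cumulative values
ana <- data.frame(final.day    = 16535:16544,
                  final.cumsum = c(318, 318, 318, 318, 318, 318,
                                   369, 369, 369, 369))

# ASSUMPTION: final.day is the number of days since 1970-01-01
dates <- as.Date(ana$final.day, origin = "1970-01-01")

# Fit on the value column only, so the series is univariate
# (order = c(0, 1, 0) is a random walk, chosen only for illustration)
fit <- arima(ts(ana$final.cumsum), order = c(0, 1, 0))

# Forecast h steps ahead and attach the calendar days after the last date
h <- 5
pred <- predict(fit, n.ahead = h)$pred
future <- seq(max(dates) + 1, by = "day", length.out = h)
forecast_by_date <- data.frame(date = future, forecast = as.numeric(pred))
forecast_by_date
```

The same idea carries over to auto.arima(): pass only the value column to the model, and keep the dates in a separate vector for labelling.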

geom_bar: extra gaps appear on the x-axis of my bar plot

My data follow this sequence:
deptime .count
1 4.5 6285
2 14.5 5901
3 24.5 6002
4 34.5 5401
5 44.5 5080
6 54.5 4567
7 104.5 3162
8 114.5 2784
9 124.5 1950
10 134.5 1800
11 144.5 1630
12 154.5 1076
13 204.5 738
14 214.5 556
15 224.5 544
16 234.5 650
17 244.5 392
18 254.5 309
19 304.5 356
20 314.5 364
My ggplot code:
ggplot(pplot, aes(x=deptime, y=.count)) + geom_bar(stat="identity",fill='#FF9966',width = 5) + labs(x="time", y="count")
The output figure shows a gap in the bars before each multiple of 100 (for example, between 54.5 and 104.5). Does anyone know how to fix it?
Thank you
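A likely cause, judging only from the values shown, is that deptime is clock time written as HHMM, so 54.5 is followed by 104.5 (1 h 4.5 min) and ggplot leaves the numeric range from 60 to 100 empty. Under that assumption, converting deptime to minutes gives evenly spaced bars. The sketch below uses the first eight rows of the question's data; the minutes column and the axis label are hypothetical additions.

```r
# First eight rows of the question's data
pplot <- data.frame(deptime = c(4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 104.5, 114.5),
                    .count  = c(6285, 5901, 6002, 5401, 5080, 4567, 3162, 2784))

# ASSUMPTION: deptime is HHMM, so 104.5 means 1 hour and 4.5 minutes
pplot$minutes <- (pplot$deptime %/% 100) * 60 + pplot$deptime %% 100

# Same plot as in the question, but with the converted x variable
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pplot, aes(x = minutes, y = .count)) +
    geom_bar(stat = "identity", fill = "#FF9966", width = 5) +
    labs(x = "time (minutes)", y = "count")
}
```

After the conversion the bins are 10 minutes apart everywhere, so the bars no longer skip from 54.5 to 104.5.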

How to call a variable in loops of R? (create arrays as dictionary)

I'd like to define a series of variables in a for loop (creating an array that works like a dictionary, converting tops to d1 as shown below).
First, I assign values to them (d1 to d11);
then I try to set the names of these variables.
How should I refer to a specific variable inside the names() function so that it works like "names(d1) <- ..."?
for (i in 1:11)
{
assign(paste("d",i,sep=""),tops[,2*i])
names(eval(parse(text=paste("d",i,sep=""))))<-tops[,2*i-1]
}
> tops[,c(1,2)]
V1 V2 V3 V4 V5 V6
1 shift 2136 shift 2211 shift 2324
2 bed 1463 k 1551 plant 1664
3 run 1338 bed 1527 run 1466
4 plant 1309 run 1504 k 1456
5 k 1294 hr 1484 bed 1390
6 hr 1285 clean 1464 hr 1366
7 check 1255 plant 1386 clean 1359
8 clean 1203 check 1261 s 1254
9 s 1052 s 1205 check 1048
10 unload 1024 start 1115 end 1028
11 chang 1023 fine 1113 fine 1020
12 fine 960 chang 1104 start 1006
13 end 924 end 1050 chang 977
14 start 905 stop 974 stop 950
15 pellet 878 pellet 915 pellet 897
16 work 866 work 907 remov 874
17 due 856 screen 900 sinter 862
18 stop 853 bwr 888 side 841
19 complet 772 side 888 due 809
20 remov 750 due 861 conveyor 792
21 requir 726 complet 841 work 777
22 sinter 711 sinter 834 north 771
23 south 710 conveyor 775 south 760
24 side 688 north 768 west 738
25 issu 682 remov 764 belt 737
26 t 675 ok 759 carri 735
27 belt 672 t 753 screen 727
28 carri 668 requir 750 stock 725
29 strand 649 unload 749 unload 719
30 conveyor 646 chute 747 chute 688
> d1
shift bed run plant k hr check clean s
2136 1463 1338 1309 1294 1285 1255 1203 1052
unload chang fine end start pellet work due stop
1024 1023 960 924 905 878 866 856 853
complet remov requir sinter south side issu t belt
772 750 726 711 710 688 682 675 672
carri strand conveyor
668 649 646
> length(d1)
[1] 30
I hope I have made this clear. If not, please feel free to ask me.

As David mentioned, don't assign 11 different variables; create a list with 11 elements. This will simplify your code considerably.
d <- lapply(1:11, function(i) setNames(tops[, 2 * i], tops[, 2 * i - 1]))
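To illustrate the list approach, here is a small sketch with a toy tops built from the first three rows and four columns of the table in the question. It uses setNames() to attach the words in each odd column as names of the counts in the following even column, so each list element can be indexed by word like a dictionary.

```r
# Toy version of tops: word/count column pairs (first rows of the question)
tops <- data.frame(V1 = c("shift", "bed", "run"), V2 = c(2136, 1463, 1338),
                   V3 = c("shift", "k", "bed"),   V4 = c(2211, 1551, 1527),
                   stringsAsFactors = FALSE)

# One named vector per column pair: values from column 2i, names from 2i - 1
d <- lapply(seq_len(ncol(tops) / 2),
            function(i) setNames(tops[, 2 * i], tops[, 2 * i - 1]))

d[[1]]["bed"]  # look up a count by word in the first pair
d[[2]]["k"]    # same lookup in the second pair
```

d[[1]] plays the role of d1 from the question, d[[2]] the role of d2, and so on, without any need for assign() or eval(parse()).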

Resources