I have the following data frame in R:
name year month volume
SSI 2016 01 123
SSI 2016 02 23
SSI 2016 03 1234
SSI 2016 04 1253
SSI 2016 04 144
SSI 2016 05 167
SSII 2016 01 1112
SSII 2016 02 234
SSII 2016 03 154
SSII 2016 04 143
SSII 2016 04 144
SSII 2016 05 167
I want to plot all the names on the x-axis, grouped by year, with volume on the y-axis.
How can I do it in plotly?
I have changed your data to include the year 2017 for a few records:
df <- read.table(text = " name year month volume
SSI 2016 01 123
SSI 2016 02 23
SSI 2016 03 1234
SSI 2017 04 1253
SSI 2017 04 144
SSI 2017 05 167
SSII 2016 01 1112
SSII 2016 02 234
SSII 2016 03 154
SSII 2017 04 143
SSII 2017 04 144
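SSII 2017 05 167", header = TRUE)
library(ggplot2)
library(plotly)
# Year on the x-axis, one dodged bar per name
g <- ggplot(data = df, aes(x = factor(year), y = volume, group = name, fill = name)) +
  geom_bar(stat = "identity", position = "dodge")
# Convert the ggplot object into an interactive plotly figure
ggplotly(g)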
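A possible refinement, sketched under the assumption that each bar should show the total volume for one name in one year: the data has several monthly rows per name/year pair, and with stat = "identity" and position = "dodge" those rows may be drawn on top of one another rather than added up. Summing first with base R's aggregate() avoids that. The name totals is only illustrative.

```r
# Same data as in the answer above
df <- read.table(text = "name year month volume
SSI 2016 01 123
SSI 2016 02 23
SSI 2016 03 1234
SSI 2017 04 1253
SSI 2017 04 144
SSI 2017 05 167
SSII 2016 01 1112
SSII 2016 02 234
SSII 2016 03 154
SSII 2017 04 143
SSII 2017 04 144
SSII 2017 05 167", header = TRUE)

# One row per name/year pair, with the monthly volumes summed
totals <- aggregate(volume ~ name + year, data = df, FUN = sum)
totals
```

Passing totals instead of df to the ggplot() call then gives exactly one bar per name within each year.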
I want to create two columns: one counting the distinct values accumulated across rows, and another marking a new group each time 25 distinct values have been reached.
Let's take an example:
raffle Ball1 Ball2 Ball3 Ball4 Ball5 Ball6 Ball7 Ball8 Ball9 Ball10 Ball11 Ball12 Ball13 Ball14 Ball15
2 23 15 05 04 12 16 20 06 11 19 24 01 09 13 07
3 20 23 12 08 06 01 07 11 14 04 16 10 09 17 24
4 16 05 25 24 23 08 12 02 17 18 01 10 04 19 13
5 15 13 20 02 11 24 09 16 04 23 25 12 08 19 01
6 23 19 01 05 07 21 16 10 15 25 06 02 12 04 17
7 22 04 15 08 16 14 21 23 12 01 25 19 07 10 18
8 19 16 18 09 13 08 05 25 17 10 06 15 01 22 20
9 21 04 17 05 03 13 16 09 20 24 25 19 11 15 10
10 24 19 08 23 06 02 20 11 09 03 04 10 05 12 14
11 24 09 08 19 20 22 06 10 11 16 07 25 23 02 12
12 11 05 25 01 09 08 16 04 07 24 17 02 12 14 10
13 13 06 10 05 08 14 03 11 16 15 09 17 19 07 23
14 14 21 13 19 20 06 09 05 07 23 18 01 15 02 25
15 23 06 21 04 10 24 16 01 15 02 08 19 12 18 25
16 24 17 05 08 07 12 13 02 15 10 19 25 23 21 06
17 13 20 17 01 06 07 02 14 05 09 16 19 03 21 18
18 02 23 10 07 11 14 17 22 15 06 24 08 19 20 18
19 15 17 10 23 11 24 13 14 06 02 08 05 20 16 07
20 04 09 08 24 16 20 03 17 18 19 07 06 23 14 10
21 05 02 01 22 19 08 24 04 25 23 18 20 14 11 16
22 13 15 05 09 07 10 01 03 22 02 25 14 06 04 12
23 10 11 05 19 18 14 06 04 20 01 08 03 12 16 17
24 01 19 21 14 02 23 25 05 20 11 07 10 24 17 03
25 04 23 20 02 05 13 07 09 24 03 01 06 14 22 16
26 19 11 07 16 08 21 05 10 20 13 23 09 17 14 22
27 25 06 22 21 11 24 03 14 12 13 20 08 10 15 18
28 18 21 11 07 09 03 20 16 14 12 13 17 01 19 10
29 13 14 06 01 24 04 08 05 17 22 21 19 20 09 16
30 22 02 01 17 08 04 19 20 11 14 06 21 07 23 03
The first row gives 15 distinct values,
the second row adds 6 more distinct values,
the third row adds 3 more,
and by the seventh row all the numbers have appeared, 25 distinct values.
I need to record this information, like this:
raffle Ball1 Ball15 unique_balls group
1 16 02 15 1
2 22 19 21 1
...
7 24 10 25 1
8 8 1 15 2
When I reach 25 distinct values, I start another group.
I have more than a hundred raffles, please help!
If you want to count the unique values in each row and carry them forward until the threshold is reached, you can use a for loop:
num <- numeric(length = 0L) # Unique values seen since the last reset
threshold <- 25             # Number of distinct values that closes a group
df$group <- 1               # Initialise all group values to 1
count <- 1                  # Current group number
# Use only the ball columns: the raffle id and the new group column are not draws
ball_cols <- setdiff(names(df), c("raffle", "group"))
# For every row in the data frame
for (i in seq_len(nrow(df))) {
  # Add this row's values to the unique values collected since the last reset
  num <- unique(c(num, as.integer(unlist(df[i, ball_cols]))))
  # The row belongs to the current group, whether or not it completes it
  df$group[i] <- count
  # Once the threshold is reached, empty the vector and start a new group
  if (length(num) >= threshold) {
    num <- numeric(length = 0L)
    count <- count + 1
  }
}
df$group
# For the 29 rows shown in the question (raffles 2 to 30):
# [1] 1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 4 4 4 4 4 4 5 5 5 6 6 6
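The question also asks for a column with the running count of distinct values. A minimal sketch of how the same loop could record it, using a small made-up data frame (draws, with 3 balls per raffle drawn from the numbers 1 to 5 and a threshold of 5) so that the result is easy to check by hand; all names and values here are illustrative assumptions, not the asker's data.

```r
# Toy data (hypothetical): 4 raffles of 3 balls each, numbers 1..5
draws <- data.frame(
  raffle = 1:4,
  Ball1 = c(1, 2, 5, 3),
  Ball2 = c(2, 3, 1, 4),
  Ball3 = c(3, 4, 2, 5)
)
threshold <- 5  # a group closes once all 5 numbers have been seen

# Fix the ball columns before adding the result columns
ball_cols <- setdiff(names(draws), "raffle")
seen <- integer(0)   # distinct values seen in the current group
group <- 1L          # current group number
draws$unique_balls <- NA_integer_
draws$group <- NA_integer_

for (i in seq_len(nrow(draws))) {
  # Merge this row's balls into the set of values seen so far
  seen <- union(seen, as.integer(unlist(draws[i, ball_cols])))
  draws$unique_balls[i] <- length(seen)
  draws$group[i] <- group
  # Threshold reached: reset the set and move on to the next group
  if (length(seen) >= threshold) {
    seen <- integer(0)
    group <- group + 1L
  }
}
draws
```

On this toy input unique_balls comes out as 3, 4, 5, 3 and group as 1, 1, 1, 2: the third raffle completes the first group and the fourth starts a new one.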
The data is organized by YEAR, MONTH and DAY, and each row carries a value in a fourth column named X.
How can I sum X over the rows with matching YEAR, MONTH and DAY and sort the results? For example:
A:
year month day x
2000 01 01 50
2000 01 02 30
2002 02 03 50
1994 01 01 3
2000 01 01 50
1996 01 02 5
2000 01 01 10
And obtain
A:
year month day x
1994 01 01 3
1996 01 02 5
2000 01 01 110
2000 01 02 30
2002 02 03 50
dplyr is a good option for this:
library(dplyr)
A %>% group_by(year, month, day) %>% summarise(x = sum(x))
which gives the desired result:
year month day x
1994 01 01 3
1996 01 02 5
2000 01 01 110
2000 01 02 30
2002 02 03 50
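If dplyr is not available, the same grouping can be sketched in base R. This assumes A is a data frame with the columns year, month, day and x from the question; month and day are read as character here only to keep the leading zeros, and the name res is illustrative.

```r
# Rebuild the example data; month and day stay character to keep "01"
A <- read.table(text = "year month day x
2000 01 01 50
2000 01 02 30
2002 02 03 50
1994 01 01 3
2000 01 01 50
1996 01 02 5
2000 01 01 10", header = TRUE,
  colClasses = c("integer", "character", "character", "numeric"))

# Sum x within each year/month/day combination
res <- aggregate(x ~ year + month + day, data = A, FUN = sum)

# aggregate() does not promise this ordering, so sort explicitly
res <- res[order(res$year, res$month, res$day), ]
rownames(res) <- NULL
res
```

The explicit order() call is what makes the sort part of the result rather than a side effect of the grouping.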
I want to plot a time series together with its moving average, like the example in Forecasting: Principles and Practice. I use my own time series, called salests:
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2015 110 115 92 120 125 103 132 136 114 139 143 119
2016 150 156 130 169 166 142 170 173 151 180 184 163
I then use similar code as in the book:
autoplot(salests, series="Sales") +
forecast::autolayer(ma(salests, 5), series="5 Moving Average")
But I receive the error:
Error: Invalid input: date_trans works with objects of class Date only
What am I doing wrong? It seems that I am just following the book.
Thanks in advance
Here are some ideas that could help you.
# Start by reading your dataset
df1 <- read.table(text='
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2015 110 115 92 120 125 103 132 136 114 139 143 119
2016 150 156 130 169 166 142 170 173 151 180 184 163
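', header=TRUE)
# Set the locale to English if yours differs, so that %b can parse month names such as "Jan"
# (the locale name 'English' is Windows-style; on Linux/macOS a name like 'en_US.UTF-8' is typical)
Sys.setlocale( locale='English' )
# Reshape your dataset to long format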
library(reshape)
df2 <- melt(df1)
df2$time <- paste0("01-",df2$variable,'-',rep(rownames(df1), ncol(df1)))
df2$time <- as.Date(df2$time, "%d-%b-%Y")
( df2 <- df2[order(df2$time),] )
# variable value time
# 1 Jan 110 2015-01-01
# 3 Feb 115 2015-02-01
# 5 Mar 92 2015-03-01
# 7 Apr 120 2015-04-01
# 9 May 125 2015-05-01
# 11 Jun 103 2015-06-01
# 13 Jul 132 2015-07-01
# 15 Aug 136 2015-08-01
# 17 Sep 114 2015-09-01
# 19 Oct 139 2015-10-01
# 21 Nov 143 2015-11-01
# 23 Dec 119 2015-12-01
# 2 Jan 150 2016-01-01
# 4 Feb 156 2016-02-01
# 6 Mar 130 2016-03-01
# 8 Apr 169 2016-04-01
# 10 May 166 2016-05-01
# 12 Jun 142 2016-06-01
# 14 Jul 170 2016-07-01
# 16 Aug 173 2016-08-01
# 18 Sep 151 2016-09-01
# 20 Oct 180 2016-10-01
# 22 Nov 184 2016-11-01
# 24 Dec 163 2016-12-01
Now create a time-series ts object
( salests <- ts(df2$value, frequency=12, start = c(2015,1)) )
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
# 2015 110 115 92 120 125 103 132 136 114 139 143 119
# 2016 150 156 130 169 166 142 170 173 151 180 184 163
and plot it:
library(ggfortify)
library(forecast)
autoplot(salests) +
forecast::autolayer(ma(salests, 5), series="5 Moving Average")
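To make explicit what the moving-average layer contains: for an odd order, a centred moving average is the mean of each window of neighbouring observations. A minimal base-R sketch, assuming that ma(salests, 5) corresponds to an equal-weight 5-term centred window; the name ma5 and the plotting choices are illustrative, and the plot uses base graphics rather than autoplot.

```r
# The sales series from the question as a monthly ts object
salests <- ts(c(110, 115, 92, 120, 125, 103, 132, 136, 114, 139, 143, 119,
                150, 156, 130, 169, 166, 142, 170, 173, 151, 180, 184, 163),
              frequency = 12, start = c(2015, 1))

# Centred 5-term moving average: equal weights of 1/5, window on both sides
ma5 <- stats::filter(salests, rep(1 / 5, 5), sides = 2)

# The first and last two values are NA because the window is incomplete there
head(ma5, 4)

# Base-graphics version of the plot: series in black, moving average in red
plot(salests, ylab = "Sales")
lines(ma5, col = "red")
```

The third value, for example, is (110 + 115 + 92 + 120 + 125) / 5 = 112.4, the mean of January to May 2015.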
I have a dataframe, with the following data:
data1$YEAR data1$WEEK data1$TOTAL.PATIENTS
1 2009 1 579428
9 2009 2 565631
17 2009 3 582932
25 2009 4 611176
33 2009 5 638613
41 2009 6 648304
49 2009 7 624583
57 2009 8 659573
65 2009 9 623389
73 2009 10 637672
81 2009 11 605503
89 2009 12 608342
97 2009 13 586651
105 2009 14 564460
113 2009 15 558837
121 2009 16 577836
129 2009 17 624734
137 2009 18 598189
145 2009 19 550300
153 2009 20 544432
161 2009 21 531526
169 2009 22 538177
177 2009 23 493761
185 2009 24 521701
193 2009 25 512268
201 2009 26 475877
209 2009 27 480680
217 2009 28 502466
225 2009 29 503971
233 2009 30 485804
241 2009 31 496666
249 2009 32 506019
257 2009 33 544827
265 2009 34 588916
273 2009 35 573972
281 2009 36 571201
289 2009 37 638302
296 2009 38 608464
303 2009 39 606458
311 2009 40 855346
319 2009 41 853912
327 2009 42 906536
335 2009 43 898860
343 2009 44 899425
351 2009 45 864348
359 2009 46 853552
367 2009 47 654101
375 2009 48 814550
383 2009 49 781811
391 2009 50 728401
399 2009 51 536961
407 2009 52 583299
2 2010 1 721138
...
The second column is the year (2009 to 2015) and the third column is the week of the year.
I would like to plot this data frame so that the x-axis shows the weeks of each year separately. How can I do that?
Does this work, or do you need to relabel the x-axis with the year only (in the following plot the x-axis shows year-week)?
head(df)
Year Week TOTAL.PATIENTS
1 2009 11 605503
2 2009 12 608342
3 2009 13 586651
4 2009 14 564460
5 2009 15 558837
6 2009 16 577836
df$Year_Week <- paste(df$Year, sprintf('%02d', df$Week), sep='-')
df$Year <- as.factor(df$Year)
library(ggplot2)
library(scales)
ggplot(df, aes(Year_Week,TOTAL.PATIENTS,col=Year, group=Year)) +
geom_line(lwd=2) + scale_y_continuous(labels = comma) +
xlab('Year-Week') +
theme(axis.text.x = element_text(angle=90, vjust = 0.5))
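The sprintf('%02d', ...) padding in the Year_Week label matters because a character x variable is treated as discrete and its values are ordered alphabetically. A small sketch of the difference, with illustrative week numbers and variable names:

```r
weeks <- c(1L, 2L, 10L, 11L)

# Labels without and with zero-padded week numbers
unpadded <- paste(2009, weeks, sep = "-")
padded <- paste(2009, sprintf("%02d", weeks), sep = "-")

# Alphabetical order puts week 10 and 11 before week 2 when unpadded
sort(unpadded, method = "radix")
# Zero-padded labels sort in chronological order
sort(padded, method = "radix")
```

With the padded labels, alphabetical and chronological order coincide, so the weeks appear in sequence along the axis.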
I have a user table like this
ID Date Value
---------------------------
1001 31 01 14 2035.1
1002 31 01 14 1384.65
1003 31 01 14 1011.1
1004 31 01 14 1187.04
1001 28 02 14 2035.1
1002 28 02 14 1384.65
1003 28 02 14 1011.1
1004 28 02 14 1188.86
1001 31 03 14 2035.1
1002 31 03 14 1384.65
1003 31 03 14 1011.1
1004 31 03 14 1188.86
1001 30 04 14 2066.41
1002 30 04 14 1405.95
1003 30 04 14 1026.66
1004 30 04 14 1207.15
I want to compute a running total from this table, like this:
ID Date Value Total
---------------------------------------
1001 31 01 14 2035.1 2035.1
1002 31 01 14 1384.65 1384.65
1003 31 01 14 1011.1 1011.1
1004 31 01 14 1187.04 1187.04
1001 28 02 14 2035.1 4070.2
1002 28 02 14 1384.65 2769.3
1003 28 02 14 1011.1 2022.2
1004 28 02 14 1188.86 2375.9
1001 31 03 14 2035.1 6105.3
1002 31 03 14 1384.65 4153.95
1003 31 03 14 1011.1 3033.3
1004 31 03 14 1188.86 3564.76
1001 30 04 14 2066.41 8171.71
1002 30 04 14 1405.95 5180.61
1003 30 04 14 1026.66 4059.96
1004 30 04 14 1207.15 4771.91
Each row has an ID. For the first month of an ID, Total should equal that month's Value; for the second month it should be the first month's Value plus the second month's, and so on. How can I compute this running sum in X++?
Can anyone help me?
It can be done as a display method on the table:
// Running total: sum of Value for this Id up to and including this Date
display Amount total()
{
    return (select sum(Value) from Table
                where Table.Id == this.Id &&
                      Table.Date <= this.Date).Value;
}
Change the table and field names to fit your schema.
This may not be the fastest way to do it, though, since it runs one aggregate query per row. In, say, a report context, it might be better to keep a running total for each id (in a map).
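The map-based running total can be sketched as follows. The sketch is in R rather than X++, purely to illustrate the logic; it uses the first two months from the question, and the names tbl and running are illustrative. It assumes the rows are processed in date order.

```r
# First two months of the example table
tbl <- data.frame(
  ID = rep(c(1001, 1002, 1003, 1004), times = 2),
  Date = as.Date(rep(c("2014-01-31", "2014-02-28"), each = 4)),
  Value = c(2035.1, 1384.65, 1011.1, 1187.04,
            2035.1, 1384.65, 1011.1, 1188.86)
)
tbl <- tbl[order(tbl$Date, tbl$ID), ]  # process rows in date order

running <- numeric(0)  # named vector used as a map: ID -> total so far
tbl$Total <- NA_real_

for (i in seq_len(nrow(tbl))) {
  key <- as.character(tbl$ID[i])
  # Look up the total so far for this ID, starting from 0 if unseen
  prev <- if (key %in% names(running)) running[[key]] else 0
  running[key] <- prev + tbl$Value[i]
  tbl$Total[i] <- running[[key]]
}
tbl
```

Each row is visited once and the lookup is per ID, which is the advantage over running an aggregate query for every row.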
It can also be done in a select like this:
Table table1, table2;

while select table1
    group by Date, Id, Value
    join sum(Value) from table2
        where table2.Id == table1.Id &&
              table2.Date <= table1.Date
{
    ...
}
You need to group by the fields you want returned, because this is an aggregate select.