There are a few similar queries to mine but I can't quite figure it out. In Access 2010 I have one table with three columns, day, week and number.
Day Week Number
Monday 1 12
Monday 2 24
Tuesday 2 10
Thursday 1 12
Monday 1 10
Tuesday 2 10
I want to be able to count (sum) the total "number" for Monday in Week 1, Monday in Week 2 etc.
Day Week Total
Monday 1 22
Monday 2 24
Tuesday 2 20
Thursday 1 12
You need a TOTALS Query.
SELECT
yourTable.dayFieldColumn,
yourTable.weekFieldColumn,
Sum(yourTable.numberColumnName) As TotalSum
FROM
yourTable
GROUP BY
yourTable.dayFieldColumn,
yourTable.weekFieldColumn;
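To see what the GROUP BY is doing, here is a minimal sketch in Python rather than Access SQL. It only mirrors the logic on the sample rows from the question; the names `rows` and `totals` are made up for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (day, week, number)
rows = [
    ("Monday", 1, 12),
    ("Monday", 2, 24),
    ("Tuesday", 2, 10),
    ("Thursday", 1, 12),
    ("Monday", 1, 10),
    ("Tuesday", 2, 10),
]

# Group by (day, week) and sum the number column, like
# GROUP BY day, week with Sum(number) in the totals query.
totals = defaultdict(int)
for day, week, number in rows:
    totals[(day, week)] += number

for (day, week), total in totals.items():
    print(day, week, total)
```

Each distinct (day, week) pair becomes one output row, which is the same shape as the desired result in the question.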
I have a very big dataframe with repeated measures but no column available to group by. The key to selecting the rows I need is to get the max(id) within each repeated sequence, where the sequence runs from 0 to 7, like this:
temperature weekday id
32 monday 0
34 thursday 0
34 saturday 1
55 wednesday 2
43 friday 0
45 sunday 1
42 friday 0
desired output (max id from sequence):
temperature weekday id
32 monday 0
55 wednesday 2
45 sunday 1
42 friday 0
Sounds like you want to select every row where the next id isn't higher than the current id. With dplyr:
your_data %>% filter(lead(id, default = 0) <= id)
(The default makes sure the last row of the data is included.)
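The same rule, keep a row when the next id is not higher than the current one, can be sketched outside dplyr. This is a Python illustration on the sample data from the question; `keep_sequence_ends` is a hypothetical helper name, not part of any library.

```python
# Sample rows from the question: (temperature, weekday, id)
rows = [
    (32, "monday", 0),
    (34, "thursday", 0),
    (34, "saturday", 1),
    (55, "wednesday", 2),
    (43, "friday", 0),
    (45, "sunday", 1),
    (42, "friday", 0),
]


def keep_sequence_ends(rows):
    """Keep each row whose following id is <= its own id."""
    kept = []
    for i, row in enumerate(rows):
        # Like lead(id, default = 0): the last row is compared against 0,
        # so it is always kept (ids are never negative here).
        next_id = rows[i + 1][2] if i + 1 < len(rows) else 0
        if next_id <= row[2]:
            kept.append(row)
    return kept


result = keep_sequence_ends(rows)
for row in result:
    print(*row)
```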
Hello, I am trying to find the week number for a series of dates spanning three years, but R is not giving the correct week number. I am generating a sequence of dates from 2016-04-01 to 2019-03-30 and then trying to calculate a running week number over the three years, so that after the first year I get week numbers 54, 55, 56 and so on.
However, when I check 2016-04-03, R shows week number 14, whereas Excel gives week number 15. R also seems to simply count blocks of 7 days instead of following the actual calendar weeks, and the week number restarts from 1 at the start of every year.
The code looks like this
days <- seq(as.Date("2016-04-03"),as.Date("2019-03-30"),'days')
weekdays <- data.frame('days'=days, Month = month(days), week = week(days),nweek = rep(1,length(days)))
This is what the results look like:
days week
2016-04-01 14
2016-04-02 14
2016-04-03 14
2016-04-04 14
2016-04-05 14
2016-04-06 14
2016-04-07 14
2016-04-08 15
2016-04-09 15
2016-04-10 15
2016-04-11 15
2016-04-12 15
However, when checked in Excel, this is what I get:
days week
2016-04-01 14
2016-04-02 14
2016-04-03 15
2016-04-04 15
2016-04-05 15
2016-04-06 15
2016-04-07 15
2016-04-08 15
2016-04-09 15
2016-04-10 16
2016-04-11 16
2016-04-12 16
Can someone please help me identify where I am going wrong?
Thanks a lot in advance!!
You're not doing anything wrong per se; there is just a difference in how R (I presume you're using the lubridate package) and Excel calculate week numbers.
R calculates week numbers as seven-day blocks counted from 1 January of that year; but
Excel calculates week numbers based on a week starting from Sunday.
Take the first few days of January 2016 as an example. On Friday, 1 January 2016, both R and Excel will say this is week 1.
On Sunday, 3 January 2016:
this is within the first seven days of the start of the year so R will return week number 1; but
it is a Sunday, so Excel ticks over to week number 2.
Try this:
ifelse(test = weekdays.Date(days[1]) == "Sunday", yes = epiweek(days[1]), no = epiweek(days[1]) + 1) + cumsum(weekdays.Date(days) == "Sunday")
This tests whether the first day is a Sunday, picks the starting week number accordingly, and then adds one more week each Sunday. A week that spans two years keeps a single week number.
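To make the two counting rules concrete, here is a small sketch in Python rather than R. It is not lubridate's or Excel's actual implementation, just the two rules as described above. The function names `block_week`, `sunday_week` and `running_weeks` are made up for illustration.

```python
from datetime import date, timedelta


def block_week(d):
    """Seven-day blocks counted from 1 January (the R behaviour described above)."""
    return (d.timetuple().tm_yday - 1) // 7 + 1


def sunday_week(d):
    """Week 1 contains 1 January and a new week starts every Sunday."""
    jan1 = date(d.year, 1, 1)
    # Days between the Sunday on or before 1 January and 1 January itself.
    offset = (jan1.weekday() + 1) % 7
    return (d.timetuple().tm_yday + offset - 1) // 7 + 1


def running_weeks(days):
    """Start at sunday_week of the first date, then add one on every later Sunday."""
    week = sunday_week(days[0])
    out = []
    for i, d in enumerate(days):
        if i > 0 and d.weekday() == 6:  # Python: Monday is 0, Sunday is 6
            week += 1
        out.append(week)
    return out


days = [date(2016, 4, 1) + timedelta(days=i) for i in range(12)]
for d, w in zip(days, running_weeks(days)):
    print(d, block_week(d), w)
```

Because `running_weeks` never resets at 1 January, the count carries on past 53 into 54, 55 and so on, which is what the question asks for.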
I have a log of times for 2 periods (1 & 2) in a data frame. I need to account for the time accumulated for each person based on a third column 'in' vs 'out'. I then need to create an additional column to track the sum of accumulated time for both periods.
Period Time Subs
1 10:00 'Peter in'
1 .
1 .
1 8:00 'Peter out' #In this period he has accumulated 2 minutes
2 10:00 'Peter in'
2 .
2 2:00 'Peter out' #In this period he has accumulated 8 minutes
I know I need to use an if and ifelse statement but I'm not sure how to start. I started and stopped learning R and now I'm trying to pick back up where I left off.
It depends a lot on how your data is formatted, of course. If you have something like
df <- data.frame(Period=c(1,1,1,1,2,2,2), Time=c("10:00",NA,NA,"8:00","10:00",NA,"2:00"))
> df
Period Time
1 1 10:00
2 1 <NA>
3 1 <NA>
4 1 8:00
5 2 10:00
6 2 <NA>
7 2 2:00
If the Time variable is stored as character, you can extract the minutes into their own column like so:
df$Min <- as.numeric(sapply(strsplit(as.character(df$Time), ":"), "[[", 1))
> df
Period Time Min
1 1 10:00 10
2 1 <NA> NA
3 1 <NA> NA
4 1 8:00 8
5 2 10:00 10
6 2 <NA> NA
7 2 2:00 2
This is much easier if you can have the Min column already as numeric!
Then, an easy way to return the total time accumulated for each period is the diff of the range for each period, within a tapply() call.
tapply(df$Min, df$Period, function(x) diff(range(x, na.rm = TRUE)))
1 2
2 8
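For comparison, the same "max minus min per period" idea can be sketched in Python on the example data. This assumes, as above, that the part before the colon is the minutes value and that missing times are simply skipped; the variable names are illustrative.

```python
# Example data from above; None stands in for the missing (NA) times.
periods = [1, 1, 1, 1, 2, 2, 2]
times = ["10:00", None, None, "8:00", "10:00", None, "2:00"]

# Collect the minutes part of each non-missing time, grouped by period.
minutes_by_period = {}
for period, time in zip(periods, times):
    if time is None:
        continue
    minutes = int(time.split(":")[0])
    minutes_by_period.setdefault(period, []).append(minutes)

# Accumulated time per period is the spread between the largest and smallest value.
accumulated = {p: max(m) - min(m) for p, m in minutes_by_period.items()}
print(accumulated)

# Running total across both periods, for the extra column the question asks about.
total = sum(accumulated.values())
print(total)
```

Like the `tapply()` version, this only works when each period holds a single in/out pair; several stints in one period would need the in/out rows paired up first.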
Edited for clarity:
I'm using R and I have a set of data that consists of order days:
> orders <- data.frame(order.num=1:4,
+ day = c("Mon", "Mon", "Mon", "Tue"))
> orders
order.num day
1 1 Mon
2 2 Mon
3 3 Mon
4 4 Tue
...
Orders typically come in on a consistent day (Monday in example above), but sometimes they come in on an alternate day (Tuesday in example above).
Here is the actual data, spread into columns using the tidyr::spread function:
Outlet.number Sun Mon Tue Wed Thu Fri Sat
1 1 0 530 162 0 629 49 0
2 2 0 784 123 0 854 65 0
3 3 24 15 483 0 365 0 0
For Outlet 1, the "typical" order days are Monday and Thursday
For Outlet 2, the "typical" order days are Monday and Thursday
For Outlet 3, the "typical" order days are Tuesday and Thursday
I want to be able to predict if an order on an atypical day (e.g. Tue for Outlet 1) is more likely to be associated with the first typical day (Monday) or the second typical day (Thursday)
None of these examples has any orders on Wednesdays, so I was able to hard-code this small set, but for future outlets Wednesday may be either a typical or an atypical day.
Is there a way to ingest the data as shown above and then classify them?
Any idea why the day is coming out wrong when the date is accurate?
I'm debugging and I can see the date variables which are correct but the day is wrong.
date Date (#9f14161)
date 26 [0x1a]
dateUTC 26 [0x1a]
day 5
dayUTC 5
fullYear 2010 [0x7da]
fullYearUTC 2010 [0x7da]
hours 17 [0x11]
hoursUTC 17 [0x11]
milliseconds 0
millisecondsUTC 0
minutes 0
minutesUTC 0
month 10 [0xa]
monthUTC 10 [0xa]
seconds 0
secondsUTC 0
time 1290790800000 [0x12c89208a80]
timezoneOffset 0
Those are my date variables. As you can see, the date is 26 (today), the month is 10 (this month) and the year is 2010 (this year), yet the day is coming out as 5, which is a Friday.
Months are zero-based, so a month value of 10 is not October but November.
So Friday (day = 5) is correct in your example.
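You can check this against the `time` value in the debugger dump. A rough sketch in Python (whose months are 1-based, so the zero-based value is derived by hand), assuming `time` is milliseconds since the Unix epoch in UTC:

```python
from datetime import datetime, timezone

# The "time" value from the debugger output, in milliseconds since the epoch.
time_ms = 1290790800000
moment = datetime.fromtimestamp(time_ms / 1000, tz=timezone.utc)

# Python counts months from 1; the Date object in the question counts from 0.
zero_based_month = moment.month - 1

# Python's weekday() has Monday as 0; the question's day has Sunday as 0.
sunday_based_day = (moment.weekday() + 1) % 7

print(moment.isoformat())
print(zero_based_month, sunday_based_day, moment.strftime("%A %d %B %Y"))
```

The timestamp decodes to 26 November 2010, which is a Friday, so both month 10 and day 5 in the dump are consistent.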