I'm trying to figure out how to make the cal command on Linux/Debian display Monday as the first day of the week instead of Sunday.
From what I see in cal's man page:
-M Weeks start on Monday.
But it doesn't seem to work on my machine:
cal -M
Usage: cal [general options] [-jy] [[month] year]
cal [general options] [-j] [-m month] [year]
ncal -C [general options] [-jy] [[month] year]
ncal -C [general options] [-j] [-m month] [year]
ncal [general options] [-bhJjpwySM] [-H yyyy-mm-dd] [-s country_code] [-W number of days] [[month] year]
ncal [general options] [-Jeo] [year]
General options: [-31] [-A months] [-B months] [-d yyyy-mm]
cal doesn't support the -M option on all UNIX versions; note that in the usage above, -M appears only among the ncal options.
Instead, you can use ncal -M -b to get the desired output (-M starts weeks on Monday, -b prints the classic cal-style layout):
May 2022
Mo Tu We Th Fr Sa Su
1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31
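If you want Monday-first output every time, one option is a shell alias; a minimal sketch, assuming Bash (add it to ~/.bashrc):

# make a bare `cal` produce Monday-first, cal-style output
alias cal='ncal -M -b'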
Credits:
How to display calendar in terminal with Monday as the start of the week
Unix - Monday as first day
I have a dataset where the data is reported by week and year as YYWW. I have split it into two columns: year and week.
I need to get dates from the week number: a Week_start_date and a Week_end_date. My weeks start on Mondays, so I would like to get the Monday and the Sunday date for each week.
ID  YYWW  year  week  Week_start_date  Week_end_date
1   1504  2015  04    ?                ?
2   1651  2016  51    ?                ?
3   1251  2012  51    ?                ?
4   1447  2014  47    ?                ?
How do I extract the week start date from just a week number and a year?
I've looked at several threads on SO but haven't found a solution yet. Most searches for "convert week number and year to date" on Google and SO return the opposite: getting a week number from a date. This thread, answered by Vince, seems to deal with a similar issue, but I can't get its code to do the job: https://communities.sas.com/t5/SAS-Programming/Converting-week-number-to-start-date/td-p/106456
Use INTNX() with the WEEK interval and increment from the first of the year.
SAS WEEK intervals begin on Sunday, so add 1 day to shift the 'begin' and 'end' boundaries to Monday and Sunday respectively.
You may need to tweak this to match exactly the dates you need.
data have;
infile cards dlm='09'x;
input ID $ YYWW year week ;
format year 8. week z2.;
cards;
1 1504 2015 04
2 1651 2016 51
3 1251 2012 51
4 1447 2014 47
;;;;
data want;
set have;
/* 'b' = beginning, 'e' = end of week number WEEK; +1 shifts Sun-Sat to Mon-Sun */
week_start = intnx('week', mdy(1, 1, year), week, 'b') + 1;
week_end   = intnx('week', mdy(1, 1, year), week, 'e') + 1;
format week_: date9.;
run;
Use one of the WEEK... informats; WEEKV follows ISO 8601, where weeks start on Monday. You will need to insert the letter W between the year and the week number first:
data have;
input ID $ YYWW year week ;
cards;
1 1504 2015 04
2 1651 2016 51
3 1251 2012 51
4 1447 2014 47
;;;;
data want;
set have;
week_start=input(cats(year,'W',put(week,Z2.)),weekv.);
week_end=week_start+6;
format week_: yymmdd10.;
run;
Results
Obs ID YYWW year week week_start week_end
1 1 1504 2015 4 2015-01-19 2015-01-25
2 2 1651 2016 51 2016-12-19 2016-12-25
3 3 1251 2012 51 2012-12-17 2012-12-23
4 4 1447 2014 47 2014-11-17 2014-11-23
Given 1525505457, return a date object (Sat May 05 2018 15:30:57 GMT+0800 (CST)) representing it.
The article at http://ergoemacs.org/emacs/elisp_parse_time.html only explains how to convert a date to Unix time, not the other way around.
Use the combination of decode-time and seconds-to-time; decode-time returns a list of the form (SECONDS MINUTES HOUR DAY MONTH YEAR DOW DST UTCOFF):
(decode-time (seconds-to-time 1525505457))
=> (57 30 9 5 5 2018 6 t 7200)
The same module, time-date.el, has other functions for converting other representations into time objects, e.g. a string:
(decode-time (date-to-time "2018-05-05T12:33:05Z"))
=> (5 33 14 5 5 2018 6 t 7200)
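To render the same epoch value as a human-readable string, format-time-string works on the same time objects. A minimal sketch; the output shown assumes a UTC+2 time zone, matching the decoded lists above:

;; format the epoch value back into readable text (depends on local time zone)
(format-time-string "%Y-%m-%d %H:%M:%S" (seconds-to-time 1525505457))
;; => "2018-05-05 09:30:57" in a UTC+2 zone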
I have a script which runs every day at 1:00 AM.
But on every alternate Wednesday it needs to run at 6:00 AM instead, which I currently adjust manually every Tuesday.
e.g.
Wednesday Nov 09 2016 6:00 AM.
Wednesday Nov 23 2016 6:00 AM.
Wednesday Dec 07 2016 6:00 AM.
The main thing is that the Wednesdays in between should keep the regular 1:00 AM timing.
Using this bash trick, it can be done with 3 cron entries (possibly 2):
#Every day except Wednesday at 1am
0 1 * * 0,1,2,4,5,6 yourCommand
#Every Wednesday at 1am; proceeds only on even weeks
0 1 * * 3 test $((10#$(date +\%W)\%2)) -eq 0 && yourCommand
#Every Wednesday at 6am; proceeds only on odd weeks
0 6 * * 3 test $((10#$(date +\%W)\%2)) -eq 1 && yourCommand
Change the -eq's to 1 or 0 depending on whether you want to start with an odd or an even week. The entries above match your example: date +%W for Wednesday Nov 09 2016 gives week 45, an odd week, so the 6:00 AM entry fires that day. Note that %W restarts every January, so the fortnightly cadence can hiccup across a year boundary.
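To check a week's parity from a shell, a quick sketch; the 10# prefix forces base-10 arithmetic so zero-padded weeks such as 08 and 09 are not misread as invalid octal numbers:

date +%W                      # zero-padded week of year, weeks starting Monday
echo $((10#$(date +%W) % 2))  # 0 = even week, 1 = odd week

(Inside a crontab the % characters must stay escaped as \% as shown in the entries above.)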
I have a dataset ("bids") that is composed of a series of observations (~5 million), each of which represents a bid to purchase a product (in the simplified example below, either a book or a game). For each observation, I have (see example data below):
Date that the bid was submitted
Time that the bid was submitted
Name of the product on which the bidder is bidding
Name of the bidder
observation date time product bidder
1 1/1/2016 9:00:00 AM book AB
2 1/1/2016 9:01:00 AM book CD
3 1/1/2016 9:02:00 AM book EF
4 1/1/2016 9:03:00 AM book CD
5 1/1/2016 9:00:00 AM game AB
6 1/1/2016 9:01:00 AM game CD
7 1/1/2016 9:02:00 AM game CD
8 1/1/2016 9:07:00 AM game CD
9 1/2/2016 9:00:00 AM book AB
10 1/2/2016 9:06:00 AM book CD
11 1/2/2016 9:02:00 AM book EF
12 1/2/2016 9:03:00 AM book EF
13 1/2/2016 9:00:00 AM game EF
14 1/2/2016 8:59:00 AM game CD
15 1/2/2016 9:00:00 AM game GH
16 1/2/2016 9:01:00 AM game AB
17 1/2/2016 10:00:00 AM game AB
18 1/2/2016 10:06:00 AM game CD
19 1/2/2016 10:06:00 AM game EF
20 1/2/2016 10:06:00 AM game GH
21 1/2/2016 3:00:00 PM game AB
In some cases, there is a single bid made by one bidder for a particular product and no other bids for that product that occur close in time (e.g., observation #21). In most cases, however, there are several bids from several bidders for the same product that are close together in time (e.g., observations 1-4 make up a group; observations 14-16 make up a group). To study these groups, I need to be able to group them and identify each group with a unique identifier. Ultimately, I also need to be able to count both the total number of bids in each group and the number of unique/distinct bidders in each group. That's something I can probably solve on my own if I figure out how to create the groups, but I mention it in case there is a simpler/more integrated approach to that.
What I'm struggling with is the "close in time" parameter. If "close in time" meant the same day, it's clear to me that I could use id() from plyr to create a new column ("bidgrp") (or any of several other approaches). Something like:
bids$bidgrp <- id(bids[c("date", "product")], drop = TRUE)
But "close in time" actually means within 5 minutes. For example, observations 9, 11, and 12 are part of a group, but 10 is not, since it is more than 5 minutes after the earliest member of the group (9). Part of the challenge is figuring out how to establish which observation is the first (earliest) member of each group. I don't have a reliable indicator, so this solution (Grouping observations based on first row value) won't work, but that can probably be done by sorting the data before attempting any grouping (although here again, if there are smarter, more efficient ways to do this, I'd welcome them).
From looking at other conditional grouping questions on SO and elsewhere, my instinct is to tackle this with a series of ifelse loop-like steps as follows:
Sort the dataset first by date, then by product, then by time
Assign a group id number of i to the first observation
Look at the product being bid on in the next observation; if it is a bid for a different product than the prior observation, assign it a group id of i+1; if it is a bid for the same product, look at the date.
If the date is different than the prior observation, assign it a group id of i+1; if it is the same date, look at the time
If the time is more than 5 minutes after the earliest observation in the group, assign it a group id of i+1; if it is within 5 minutes of the earliest observation in the group (this is what makes the problem particularly tricky: it's not just a matter of looking at the last observation, but of knowing which observation to key off when determining distance in time), assign it a group id of i and look at the next observation
The result (for the sample data above), would identify 9 groups, as follows:
observation date time product bidder grpid
1 1/1/2016 9:00:00 AM book AB 1
2 1/1/2016 9:01:00 AM book CD 1
3 1/1/2016 9:02:00 AM book EF 1
4 1/1/2016 9:03:00 AM book CD 1
5 1/1/2016 9:00:00 AM game AB 2
6 1/1/2016 9:01:00 AM game CD 2
7 1/1/2016 9:02:00 AM game CD 2
8 1/1/2016 9:07:00 AM game CD 3
9 1/2/2016 9:00:00 AM book AB 4
11 1/2/2016 9:02:00 AM book EF 4
12 1/2/2016 9:03:00 AM book EF 4
10 1/2/2016 9:06:00 AM book CD 5
14 1/2/2016 8:59:00 AM game CD 6
13 1/2/2016 9:00:00 AM game EF 6
15 1/2/2016 9:00:00 AM game GH 6
16 1/2/2016 9:01:00 AM game AB 6
17 1/2/2016 10:00:00 AM game AB 7
18 1/2/2016 10:06:00 AM game CD 8
19 1/2/2016 10:06:00 AM game EF 8
20 1/2/2016 10:06:00 AM game GH 8
21 1/2/2016 3:00:00 PM game AB 9
And, I'll ultimately need to get to something like this:
grpid bids uniquebidders
1 4 3
2 3 2
3 1 1
4 3 2
5 1 1
6 4 4
7 1 1
8 3 3
9 1 1
Apologies for the long question. I know several of the sub-issues here (working with time; loop-like operations) have been covered on SO (I've reviewed many of them), but it's the combination of these issues that makes this particularly challenging for me (and hopefully useful for others).
Thanks in advance for any help you can offer.
You could write a specific function for the date-time comparison. I don't know how well it performs on 5 million rows, but it works on the example dataset. I followed your steps. It uses rleid(), which creates a run-length-type id column; using it twice gives the group id you want. The code deliberately goes step by step, but it could be written more concisely.
library(data.table)
setDT(DT)  # DT is the bids data shown in the question, converted to a data.table
# This function compares each datetime in the vector with the first one.
# If the gap exceeds 5 minutes, a new time reference is set and the
# group counter is incremented.
func_perso <- function(vec){
time1 <- vec[1]
grp <- 1
res <- vector("integer", length(vec))
for(i in 1:length(vec)){
time <- vec[i]
if(difftime(time, time1, units = "secs") > 5*60){
grp <- grp + 1
time1 <- time
}
res[i] <- grp
}
res
}
# Create a datetime column (POSIXct) for easier comparison
# (the dates appear to be US-style month/day/year)
DT[, dtime := as.POSIXct(strptime(paste(date, time), "%m/%d/%Y %I:%M:%S %p", tz = "UTC"))]
# order the data as you mentioned
setorder(DT, date, product, dtime)
# Apply the function to column dtime, by date and product
DT[, grp1 := func_perso(dtime), by = .(date, product)]
# prefix the run id with the product so adjacent groups never share an id
DT[, grp2 := paste0(product, rleid(grp1)), by = .(date, product)]
# number the groups sequentially over the whole (sorted) table
DT[, grpid := rleid(grp2)]
# drop the helper columns
DT[, `:=`(
dtime = NULL,
grp1 = NULL,
grp2 = NULL
)]
# the result
DT
#> Observation date time product bidder grpid
#> 1: 1 1/1/2016 9:00:00 AM book AB 1
#> 2: 2 1/1/2016 9:01:00 AM book CD 1
#> 3: 3 1/1/2016 9:02:00 AM book EF 1
#> 4: 4 1/1/2016 9:03:00 AM book CD 1
#> 5: 5 1/1/2016 9:00:00 AM game AB 2
#> 6: 6 1/1/2016 9:01:00 AM game CD 2
#> 7: 7 1/1/2016 9:02:00 AM game CD 2
#> 8: 8 1/1/2016 9:07:00 AM game CD 3
#> 9: 9 1/2/2016 9:00:00 AM book AB 4
#> 10: 11 1/2/2016 9:02:00 AM book EF 4
#> 11: 12 1/2/2016 9:03:00 AM book EF 4
#> 12: 10 1/2/2016 9:06:00 AM book CD 5
#> 13: 14 1/2/2016 8:59:00 AM game CD 6
#> 14: 13 1/2/2016 9:00:00 AM game EF 6
#> 15: 15 1/2/2016 9:00:00 AM game GH 6
#> 16: 16 1/2/2016 9:01:00 AM game AB 6
#> 17: 17 1/2/2016 10:00:00 AM game AB 7
#> 18: 18 1/2/2016 10:06:00 AM game CD 8
#> 19: 19 1/2/2016 10:06:00 AM game EF 8
#> 20: 20 1/2/2016 10:06:00 AM game GH 8
#> 21: 21 1/2/2016 3:00:00 PM game AB 9
#> Observation date time product bidder grpid
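For the summary you ultimately need (total bids and distinct bidders per group), a short follow-up sketch using data.table's .N and uniqueN():

# count bids and distinct bidders in each group
DT[, .(bids = .N, uniquebidders = uniqueN(bidder)), by = grpid]

This reproduces the grpid / bids / uniquebidders table from the question.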
I have log files with time stamps. I want to search for text between two time stamps using sed even if the first time stamp or the last time stamp are not present.
For example, if I search between 9:30 and 9:40 then it should return text even if neither 9:30 nor 9:40 is there but the time stamp is between 9:30 and 9:40.
I am using a sed one liner:
sed -n '/7:30:/,/7:35:/p' xyz.log
But it only returns data if both time stamps are present, and it will print everything if one of the time stamps is missing. And if the time is in 12-hour format, it will pull data for both AM and PM.
Additionally, I have different time stamp formats for different log files, so I need a generic command.
Here are some time format examples:
<Jan 27, 2013 12:57:16 AM MST>
Jan 29, 2013 8:58:12 AM
2013-01-31 06:44:04,883
Some of them contain AM/PM, i.e. 12-hour format, and others use 24-hour format, so I have to account for that as well.
I have tried this as well but it doesn't work:
sed -n -e '/^2012-07-19 18:22:48/,/2012-07-23 22:39:52/p' history.log
With the serious medley of time formats you have to parse, sed is not the correct tool to use. I'd automatically reach for Perl, but Python would do too, and you probably could do it in awk if you put your mind to it. You need to normalize the time formats (you don't say anything about date, so I assume you're working only with the time portion).
#!/usr/bin/env perl
use strict;
use warnings;
use constant debug => 0;

my $lo = "09:30";
my $hi = "09:40";
my $lo_tm = to_minutes($lo);
my $hi_tm = to_minutes($hi);

while (<>)
{
    print "Read: $_" if debug;
    if (m/\D\d\d?:\d\d:\d\d/)
    {
        my $tm = normalize_hhmm($_);
        print "Normalized: $tm\n" if debug;
        print $_ if ($tm >= $lo_tm && $tm <= $hi_tm);
    }
}

# Convert "HH:MM" to minutes since midnight; complain and return undef on garbage.
sub to_minutes
{
    my($val) = @_;
    my($hh, $mm) = split /:/, $val;
    if ($hh < 0 || $hh > 24 || $mm < 0 || $mm >= 60 || ($hh == 24 && $mm != 0))
    {
        print STDERR "to_minutes(): garbage = $val\n";
        return undef;
    }
    return $hh * 60 + $mm;
}

# Extract the first HH:MM (with optional AM/PM suffix) from the line and
# normalize it to minutes since midnight on a 24-hour clock.
sub normalize_hhmm
{
    my($line) = @_;
    my($hhmm, $ampm) = $line =~ m/\D(\d\d?:\d\d):\d\d\s*(AM|PM|am|pm)?/;
    my $tm = to_minutes($hhmm);
    if (defined $ampm)
    {
        if ($ampm =~ /(am|AM)/)
        {
            $tm -= 12 * 60 if ($tm >= 12 * 60);    # 12:xx AM means 00:xx
        }
        else
        {
            $tm += 12 * 60 if ($tm < 12 * 60);     # 12:xx PM stays 12:xx
        }
    }
    return $tm;
}
I used the sample data:
<Jan 27, 2013 12:57:16 AM MST>
Jan 29, 2013 8:58:12 AM
2013-01-31 06:44:04,883
Feb 2 00:00:00 AM
Feb 2 00:59:00 AM
Feb 2 01:00:00 AM
Feb 2 01:00:00 PM
Feb 2 11:00:00 AM
Feb 2 11:00:00 PM
Feb 2 11:59:00 AM
Feb 2 11:59:00 PM
Feb 2 12:00:00 AM
Feb 2 12:00:00 PM
Feb 2 12:59:00 AM
Feb 2 12:59:00 PM
Feb 2 00:00:00
Feb 2 00:59:00
Feb 2 01:00:00
Feb 2 11:59:59
Feb 2 12:00:00
Feb 2 12:59:59
Feb 2 13:00:00
Feb 2 09:31:00
Feb 2 09:35:23
Feb 2 09:36:23
Feb 2 09:37:23
Feb 2 09:35:00
Feb 2 09:40:00
Feb 2 09:40:59
Feb 2 09:41:00
Feb 2 23:00:00
Feb 2 23:59:00
Feb 2 24:00:00
Feb 3 09:30:00
Feb 3 09:40:00
and it produced what I consider the correct output:
Feb 2 09:31:00
Feb 2 09:35:23
Feb 2 09:36:23
Feb 2 09:37:23
Feb 2 09:35:00
Feb 2 09:40:00
Feb 2 09:40:59
Feb 3 09:30:00
Feb 3 09:40:00
I'm sure this isn't the only way to do the processing; it seems to work, though.
If you need to do date analysis, then you need to use one of the date or time manipulation packages from CPAN to deal with the problems. The code above also hard-codes the times in the script. You'd probably want to handle them as command-line arguments, which is perfectly doable but isn't scripted above; see the sketch below.
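A minimal sketch of that change, replacing the two my $lo/$hi lines near the top (the script name timerange.pl is hypothetical; shift removes the bounds from @ARGV so the while (<>) loop still reads any remaining file arguments or standard input):

# Invoke as: perl timerange.pl 09:30 09:40 xyz.log
my $lo = shift @ARGV;
my $hi = shift @ARGV;
die "usage: $0 HH:MM HH:MM [file...]\n" unless defined $lo && defined $hi;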