Append list of logged in users to a log file using crontab?

I need to create a basic log file through the use of a crontab job that appends a timestamp, followed by a list of logged in users. It must be at 23:59 each night.
(I have used 18 18 * * * as an example to make sure the job works for now)
So far, I have:
#!/bin/bash
59 23 * * * (date ; who) >> /root/userlogfile.txt
in my crontab. The output is:
Fri Dec 9 18:18:01 UTC 2022
root console 00:00 Dec 9 18:15:15
My required output is something similar to:
Fri 09 Dec 23:59:00 GMT 2022
user1 tty2 2017-11-30 22:00 (:0)
user5 pts/1 2017-11-30 20:35 (192.168.1.1)
How would I go about this?
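A hedged sketch of one way to get closer to the required output (not tested on your system): the shebang line is unnecessary, since a crontab is not a shell script, and the timestamp layout can be controlled with a date +FORMAT string. The format below is my guess at the layout you want; the columns who prints are whatever your system produces, so they may still differ from the example.

```shell
# The date format alone, runnable directly in a shell (GNU date assumed):
date '+%a %d %b %T %Z %Y'
# In the crontab entry itself, each % must be escaped as \%:
#   59 23 * * * { date '+\%a \%d \%b \%T \%Z \%Y'; who; } >> /root/userlogfile.txt
```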

Related

Re-run airflow historical runs

I have a dag with the following parameters:
start_date=datetime(2020, 7, 6)
schedule_interval="0 12 * * *",
concurrency=2,
max_active_runs=6,
catchup=True
I had to re-process a year's historical data, so I reset the dag run status for the past year. In the middle of re-processing, I realised I needed to re-process a few of the latest days first due to a business priority change, but Airflow seems a bit random in picking which days to run, and it often favours older days, so the tree view of my dag runs got a bit messed up, and it was going to take quite a while to catch up all the runs.
I had two choices:
Set all old dag run days to failure
Delete old dag run days
To avoid generating an excessive number of failure notifications, I chose the 2nd option.
Here is a quick illustration. Before I delete, in my tree view, I have:
1 Sep 2021 - 1 Jan 2022: dag run successful
2 Jan 2022 - 3 Jan 2022: dag running
4 Jan 2022 - 1 Aug 2022: dag scheduled
2 Aug 2022 - 6 Aug 2022: dag run successful
7 Aug 2022 - 1 Sep 2022: dag scheduled
To speed up the processing of the August data, I deleted the dag runs scheduled between 4 Jan and 1 Aug. The tree view now becomes:
1 Sep 2021 - 1 Jan 2022: dag run successful
2 Jan 2022 - 3 Jan 2022: dag running
2 Aug 2022 - 6 Aug 2022: dag run successful
7 Aug 2022 - 1 Sep 2022: dag scheduled
Note that dag runs between 4 Jan 2022 and 1 Aug 2022 are now completely gone from the tree view.
Unfortunately, because the latest dag run is now 6 Aug 2022, Airflow thinks the only runs left to catch up start from 7 Aug, and all the deleted runs between 4 Jan and 1 Aug are hence ignored.
So my question now is: given that I don't want to re-process the days that have already been re-processed, is there a way to tell Airflow that I need to re-run the days I have deleted?
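One possible route, offered only as a sketch (I have not verified it against your Airflow version): the Airflow 2.x CLI has a backfill command that creates dag runs for a given date window, and it only creates runs that do not already exist, so pointing it at the deleted window should re-create just those runs. my_dag below is a placeholder for your dag id.

```shell
# Placeholder dag id; the dates match the deleted window from the question.
airflow dags backfill my_dag -s 2022-01-04 -e 2022-08-01
```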

Convert date with Time Zone formats in R

I have my dates in the following formats: Wed Apr 25 2018 00:00:00 GMT-0700 (Pacific Standard Time), 43167, or Fri May 18 2018 00:00:00 GMT-0700 (PDT), all mixed in one column. What would be the easiest way to convert all of these into a simple YYYY-mm-dd (e.g. 2018-04-13) format? Here is the column:
dates <- c('Fri May 18 2018 00:00:00 GMT-0700 (PDT)',
'43203',
'Wed Apr 25 2018 00:00:00 GMT-0700 (Pacific Standard Time)',
'43167','43201',
'Fri May 18 2018 00:00:00 GMT-0700 (PDT)',
'Tue May 29 2018 00:00:00 GMT-0700 (Pacific Standard Time)',
'Tue May 01 2018 00:00:00 GMT-0700 (PDT)',
'Fri May 25 2018 00:00:00 GMT-0700 (Pacific Standard Time)',
'Fri Apr 06 2018 00:00:00 GMT-0700 (PDT)','43173')
Expected format: 2018-05-18, 2018-04-13, 2018-04-25, ...
I believe similar questions have been asked several times before. However, there
is a crucial point which needs special attention:
What is the origin for the dates given as integers (or, to be exact, as character strings which can be converted to integers)?
If the data is imported from the Windows version of Excel, origin = "1899-12-30" has to be used. For details, see the Example section in help(as.Date) and the Other Applications section of the R Help Desk article by Gabor Grothendieck and Thomas Petzoldt.
For conversion of the date time strings, the mdy_hms() function from the lubridate package is used. In addition, I am using data.table syntax for its conciseness:
library(data.table)
data.table(dates)[!dates %like% "^\\d+$", new_date := as.Date(lubridate::mdy_hms(dates))][
is.na(new_date), new_date := as.Date(as.integer(dates), origin = "1899-12-30")][]
dates new_date
1: Fri May 18 2018 00:00:00 GMT-0700 (PDT) 2018-05-18
2: 43203 2018-04-13
3: Wed Apr 25 2018 00:00:00 GMT-0700 (Pacific Standard Time) 2018-04-25
4: 43167 2018-03-08
5: 43201 2018-04-11
6: Fri May 18 2018 00:00:00 GMT-0700 (PDT) 2018-05-18
7: Tue May 29 2018 00:00:00 GMT-0700 (Pacific Standard Time) 2018-05-29
8: Tue May 01 2018 00:00:00 GMT-0700 (PDT) 2018-05-01
9: Fri May 25 2018 00:00:00 GMT-0700 (Pacific Standard Time) 2018-05-25
10: Fri Apr 06 2018 00:00:00 GMT-0700 (PDT) 2018-04-06
11: 43173 2018-03-14
Apparently, the assumption that the integer dates use the origin belonging to the Windows version of Excel seems to hold.
If only a vector of Date values is required:
data.table(dates)[!dates %like% "^\\d+$", new_date := as.Date(lubridate::mdy_hms(dates))][
is.na(new_date), new_date := as.Date(as.integer(dates), origin = "1899-12-30")][, new_date]
[1] "2018-05-18" "2018-04-13" "2018-04-25" "2018-03-08" "2018-04-11" "2018-05-18"
[7] "2018-05-29" "2018-05-01" "2018-05-25" "2018-04-06" "2018-03-14"

How to run cronjob on alternate weekday?

I have a script which runs everyday at 1.00 AM regularly for every day.
But on every alternate Wednesday I need to change the time to 6.00 AM, which I currently do manually every Tuesday.
e.g
Wednesday Nov 09 2016 6.00 AM.
Wednesday Nov 23 2016 6.00 AM.
Wednesday Dec 07 2016 6.00 AM.
The main thing is that on every Wednesday in between, the job should run as per the regular timings.
Using this bash trick it could be done with 3 cron entries (possibly 2):
#Every day except Wednesdays at 1am
0 1 * * 0,1,2,4,5,6 yourCommand
#Every Wednesday at 1am; proceeds only on even weeks
0 1 * * 3 test $((10#$(date +\%W)\%2)) -eq 0 && yourCommand
#Every Wednesday at 6am; proceeds only on odd weeks
0 6 * * 3 test $((10#$(date +\%W)\%2)) -eq 1 && yourCommand
Change the -eq's to 1 or 0 depending on whether you want to start on an odd or an even week. As written it should work for your example: for Wednesday Nov 09 2016, date +%W gives week 45, which is odd, so the 6.00 AM entry fires on that date.
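The parity can be sanity-checked by hand before installing the entries (a sketch assuming GNU date's -d flag; the 10# prefix forces base 10 so that week numbers like 08 or 09 are not treated as octal):

```shell
# Print the %W week number and its parity for the asker's target dates:
for d in 2016-11-09 2016-11-23 2016-12-07; do
  week=$(date -d "$d" +%W)
  echo "$d week=$week parity=$(( 10#$week % 2 ))"
done
```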

Search for text between two time frame using sed

I have log files with time stamps. I want to search for text between two time stamps using sed even if the first time stamp or the last time stamp are not present.
For example, if I search between 9:30 and 9:40 then it should return text even if neither 9:30 nor 9:40 is there but the time stamp is between 9:30 and 9:40.
I am using a sed one liner:
sed -n '/7:30:/,/7:35:/p' xyz.log
But it only returns data if both time stamps are present in the file; it will print everything if one of the time stamps is missing. And if the time is in 12-hour format it will pull data for both AM and PM.
Additionally, I have different time stamp formats for different log files so I need a generic command.
Here are some time format examples:
<Jan 27, 2013 12:57:16 AM MST>
Jan 29, 2013 8:58:12 AM
2013-01-31 06:44:04,883
Some of them contain AM/PM i.e. 12 hr format and others contain 24 hr format so I have to account for that as well.
I have tried this as well but it doesn't work:
sed -n -e '/^2012-07-19 18:22:48/,/2012-07-23 22:39:52/p' history.log
With the serious medley of time formats you have to parse, sed is not the correct tool to use. I'd automatically reach for Perl, but Python would do too, and you probably could do it in awk if you put your mind to it. You need to normalize the time formats (you don't say anything about date, so I assume you're working only with the time portion).
#!/usr/bin/env perl
use strict;
use warnings;
use constant debug => 0;

my $lo = "09:30";
my $hi = "09:40";
my $lo_tm = to_minutes($lo);
my $hi_tm = to_minutes($hi);

while (<>)
{
    print "Read: $_" if debug;
    if (m/\D\d\d?:\d\d:\d\d/)
    {
        my $tm = normalize_hhmm($_);
        print "Normalized: $tm\n" if debug;
        print $_ if ($tm >= $lo_tm && $tm <= $hi_tm);
    }
}

sub to_minutes
{
    my($val) = @_;
    my($hh, $mm) = split /:/, $val;
    if ($hh < 0 || $hh > 24 || $mm < 0 || $mm >= 60 || ($hh == 24 && $mm != 0))
    {
        print STDERR "to_minutes(): garbage = $val\n";
        return undef;
    }
    return $hh * 60 + $mm;
}

sub normalize_hhmm
{
    my($line) = @_;
    my($hhmm, $ampm) = $line =~ m/\D(\d\d?:\d\d):\d\d\s*(AM|PM|am|pm)?/;
    my $tm = to_minutes($hhmm);
    if (defined $ampm)
    {
        if ($ampm =~ /(am|AM)/)
        {
            $tm -= 12 * 60 if ($tm >= 12 * 60);
        }
        else
        {
            $tm += 12 * 60 if ($tm < 12 * 60);
        }
    }
    return $tm;
}
I used the sample data:
<Jan 27, 2013 12:57:16 AM MST>
Jan 29, 2013 8:58:12 AM
2013-01-31 06:44:04,883
Feb 2 00:00:00 AM
Feb 2 00:59:00 AM
Feb 2 01:00:00 AM
Feb 2 01:00:00 PM
Feb 2 11:00:00 AM
Feb 2 11:00:00 PM
Feb 2 11:59:00 AM
Feb 2 11:59:00 PM
Feb 2 12:00:00 AM
Feb 2 12:00:00 PM
Feb 2 12:59:00 AM
Feb 2 12:59:00 PM
Feb 2 00:00:00
Feb 2 00:59:00
Feb 2 01:00:00
Feb 2 11:59:59
Feb 2 12:00:00
Feb 2 12:59:59
Feb 2 13:00:00
Feb 2 09:31:00
Feb 2 09:35:23
Feb 2 09:36:23
Feb 2 09:37:23
Feb 2 09:35:00
Feb 2 09:40:00
Feb 2 09:40:59
Feb 2 09:41:00
Feb 2 23:00:00
Feb 2 23:59:00
Feb 2 24:00:00
Feb 3 09:30:00
Feb 3 09:40:00
and it produced what I consider the correct output:
Feb 2 09:31:00
Feb 2 09:35:23
Feb 2 09:36:23
Feb 2 09:37:23
Feb 2 09:35:00
Feb 2 09:40:00
Feb 2 09:40:59
Feb 3 09:30:00
Feb 3 09:40:00
I'm sure this isn't the only way to do the processing; it seems to work, though.
If you need to do date analysis, then you need to use one of the date or time manipulation packages from CPAN to deal with the problems. The code above also hard codes the times in the script. You'd probably want to handle them as command line arguments, which is perfectly doable, but isn't scripted above.
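For completeness, the awk route mentioned at the top of this answer could look like the rough sketch below; it handles only the 24-hour case and skips the AM/PM normalization that the Perl script performs:

```shell
# Print lines whose first HH:MM:SS stamp falls between 09:30 and 09:40
# inclusive (compared as minutes since midnight; seconds are ignored,
# matching the Perl script's behaviour):
awk 'match($0, /[0-9][0-9]?:[0-9][0-9]:[0-9][0-9]/) {
       split(substr($0, RSTART, RLENGTH), t, ":")
       m = t[1] * 60 + t[2]
       if (m >= 9 * 60 + 30 && m <= 9 * 60 + 40) print
     }' xyz.log
```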

GridView Layout/Output

I have a website using ASP.Net 2.0 with SQL Server as the database and C# 2005 as the programming language. In one of the pages I have a GridView with the following layout.
Date -> Time -> QtyUsed
The sample values are as follows. (Since this GridView/report is generated for a specific month only, I have extracted and displayed only the day part of the date, ignoring the month and year parts.)
01 -> 09:00 AM -> 05
01 -> 09:30 AM -> 03
01 -> 10:00 AM -> 09
02 -> 09:00 AM -> 10
02 -> 09:30 AM -> 09
02 -> 10:00 AM -> 11
03 -> 09:00 AM -> 08
03 -> 09:30 AM -> 09
03 -> 10:00 AM -> 12
Now the user wants the layout to be like:
Time 01 02 03 04 05 06 07 08 09
-------------------------------------------------------------------------
09:00 AM -> 05 10 08
09:30 AM -> 03 09 09
10:00 AM -> 09 11 12
The main requirement is that the days, from 01 to the last day of the month, should be the column headers (the reason why I extracted only the day part from the date), and the time slots should run down as rows.
From my experience with Excel, the idea of Transpose comes to my mind to solve this, but I am not sure.
Please help me in solving this problem.
Thank you.
Lalit Kumar Barik
You will have to generate the dataset accordingly. I am guessing you are doing some kind of grouping based on the hour, so generate a column for each hour of the day and populate the dataset accordingly.
In SQL Server, there is a PIVOT function that may be of use.
The MSDN article specifies usage and gives an example.
The example is as follows
Table DailyIncome looks like
VendorId IncomeDay IncomeAmount
---------- ---------- ------------
SPIKE FRI 100
SPIKE MON 300
FREDS SUN 400
SPIKE WED 500
...
To show
VendorId MON TUE WED THU FRI SAT SUN
---------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
FREDS 500 350 500 800 900 500 400
JOHNS 300 600 900 800 300 800 600
SPIKE 600 150 500 300 200 100 400
Use this select
SELECT * FROM DailyIncome
PIVOT( AVG( IncomeAmount )
FOR IncomeDay IN
([MON],[TUE],[WED],[THU],[FRI],[SAT],[SUN])) AS AvgIncomePerDay
Alternatively, you could select all of the data from DailyIncome and build a DataTable with the data pivoted.
