Calculating Duration in org mode table - datetime

I'm trying to figure out how to use org-mode to calculate the duration between two time points. I worked out how to do it for two separate dates, but when I add in the time component I get a plain decimal answer; I'd rather have the answer in
XX days, xx hours, xx minutes
| Start                  | End                    | Duration |
|------------------------+------------------------+----------|
| <2013-07-16 Tue 15:15> | <2013-07-17 Wed 11:15> | 0.833333 |
|                        |                        |        0 |
#+TBLFM: $3=(date(<$2>)-date(<$1>))

You may use the T flag to get the form HH:MM[:SS]. Example:
| Start                  | End                    | Days     | HH:MM:SS |
|------------------------+------------------------+----------+----------|
| <2013-07-15 Tue 10:15> | <2013-07-17 Wed 11:15> | 2.041667 | 49:00:00 |
|                        |                        |        0 | 00:00:00 |
#+TBLFM: $3=date(<$2>)-date(<$1>)::$4=60*60*24*$3;T
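To get closer to the "XX days, xx hours, xx minutes" format from the question, a possible variation (a sketch, not from the original answer, assuming your Org version supports the U flag, which displays HH:MM) is to keep only the whole days in one column and feed the remainder to the time flag:

| Start                  | End                    | Days | HH:MM |
|------------------------+------------------------+------+-------|
| <2013-07-15 Tue 10:15> | <2013-07-17 Wed 11:15> |    2 | 01:00 |
#+TBLFM: $3=floor(date(<$2>)-date(<$1>))::$4=60*60*24*(date(<$2>)-date(<$1>)-$3);U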

Related

Remove duplicates based on multiple values in R or POWER BI

I have a data set, each line representing a "service visit" for customers. A customer might have between 0 and 5 service calls. If there isn't a service call for someone, the columns associated with a service call would all be empty.
+--------------+-------------------+-------------------+------------------------+---------------------+
| Project Name | Customer Name     | Service Call.Name | Service Call Date Time | Service Call Status |
+--------------+-------------------+-------------------+------------------------+---------------------+
| OO-99999     | A                 | SC-001762         | 3/21/2022 7:00:00 PM   | Completed           |
| OO-99999     | A                 | SC-002323         | null                   | Completed           |
| OO-99999     | A                 | SC-002357         | 10/3/2022 7:00:00 PM   | 2nd Visit Scheduled |
| OO-88888     | B                 | SC-001260         | 2/1/2022 8:00:00 PM    | Completed           |
| OO-88888     | B                 | SC-002938         | 8/25/2022 7:00:00 PM   | Scheduled           |
| OO-55555     | C                 | SC-000957         | 12/27/2021 8:00:00 PM  | Completed           |
| OO-55555     | C                 | SC-001418         | 2/7/2022 4:30:00 PM    | Completed           |
| OO-55555     | C                 | SC-003007         | null                   | null                |
| OO-66666     | D                 | SC-001626         | null                   | No Longer Required  |
| OO-66666     | D                 | SC-002329         | 6/9/2022 7:00:00 PM    | Completed           |
| OO-66666     | D                 | SC-002538         | null                   | Completed           |
| OO-66666     | D                 | SC-002932         | null                   | Call Reviewed       |
| OO-66666     | D                 | SC-003350         | 9/29/2022 7:00:00 PM   | Scheduled           |
| OO-11111     | F                 | null              | null                   | null                |
+--------------+-------------------+-------------------+------------------------+---------------------+
My goal is to filter out duplicates. I only want one row per customer, but I want to keep a specific row. A duplicate only appears if someone has multiple service calls.
If someone has a service call (Service Call.Name not equal to null), and one of those calls has a status OTHER than "Completed" or "Not required", I want to keep that row. So for customer A, I want the third row, since its status is neither "Completed" nor "Not required".
If someone has multiple service calls and they are all "Completed" or "Not required", I don't care which one I keep, as long as I only keep one.
If someone has one service call or no service call, there will be no duplicate of that person, so I want to keep that row.
EDIT
There were cases of duplicates I didn't realize I had; I've edited the data to show them.
For someone with more than one open service call, like customer D, I only want to keep one of them. If both have a date, I want the later of the two dates. If one has a date and the other doesn't, I want the one with a date. If neither has a date, I don't care which is kept, but I only want one.
I am working in Power BI, but I have access to R and think that might be easier.
Here is a solution: duplicated identifies which rows to keep within each customer name, and another logical index, created with %in%, marks the rows to keep by status.
dat <- read.table(text = '+--------------+---------------+-------------------+------------------------+---------------------+
| Project Name | Customer Name | Service Call.Name | Service Call Date Time | Service Call Status |
+--------------+---------------+-------------------+------------------------+---------------------+
| OO-99999     | A             | SC-001762         | 3/21/2022 7:00:00 PM   | Completed           |
| OO-99999     | A             | SC-002323         | null                   | Completed           |
| OO-99999     | A             | SC-002357         | 10/3/2022 7:00:00 PM   | 2nd Visit Scheduled |
| OO-88888     | B             | SC-001260         | 2/1/2022 8:00:00 PM    | Completed           |
| OO-88888     | B             | SC-002938         | 8/25/2022 7:00:00 PM   | Scheduled           |
| OO-55555     | C             | SC-000957         | 12/27/2021 8:00:00 PM  | Completed           |
| OO-55555     | C             | SC-001418         | 2/7/2022 4:30:00 PM    | Completed           |
| OO-55555     | C             | SC-003007         | null                   | null                |
| OO-11111     | D             | null              | null                   | null                |
+--------------+---------------+-------------------+------------------------+---------------------+
', header = TRUE, sep = "|", comment.char = "+", strip.white = TRUE, check.names = FALSE)
# drop the empty columns created by the leading and trailing "|"
dat <- dat[-c(1, ncol(dat))]

not_wanted <- c("Completed", "Not required")

# TRUE for rows whose status would normally be dropped
i <- dat[['Service Call Status']] %in% not_wanted
# per customer: keep the rows not in not_wanted; if every row is,
# flip the first one so the customer still keeps one row
i <- ave(i, dat[['Customer Name']], FUN = \(k) {
  if (all(k)) k[1] <- FALSE
  !k
})
result <- dat[i, ]

# among the kept rows, drop per-customer duplicates by status
j <- ave(result[['Service Call Status']], result[['Customer Name']], FUN = duplicated)
result <- result[!as.logical(j), ]
result
#>   Project Name Customer Name Service Call.Name Service Call Date Time Service Call Status
#> 3     OO-99999             A         SC-002357   10/3/2022 7:00:00 PM 2nd Visit Scheduled
#> 5     OO-88888             B         SC-002938   8/25/2022 7:00:00 PM           Scheduled
#> 8     OO-55555             C         SC-003007                   null                null
#> 9     OO-11111             D              null                   null                null
Created on 2022-10-26 with reprex v2.0.2
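The base-R answer above was written against the pre-EDIT data, so it does not implement the "latest date wins" rule for multiple open calls. A possible dplyr sketch that does (assuming the dat built above; the date format string is inferred from the sample data):

library(dplyr)

closed <- c("Completed", "Not required")

result <- dat %>%
  # parse "3/21/2022 7:00:00 PM"-style strings; "null" parses to NA
  mutate(dt = as.POSIXct(`Service Call Date Time`,
                         format = "%m/%d/%Y %I:%M:%S %p")) %>%
  group_by(`Customer Name`) %>%
  # sort order: open calls first, then dated rows, then latest date
  arrange(`Service Call Status` %in% closed,
          is.na(dt), desc(dt), .by_group = TRUE) %>%
  slice(1) %>%     # keep the top row per customer
  ungroup() %>%
  select(-dt)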

Select values from one table, count common values from other table, show 0 if no common values

I have two tables that look something like the following:
WORKDAYS
DATE       | WORKDAY_LENGHT |
-----------+----------------+
12-05-2018 | 8              |
13-05-2018 | 6.5            |
14-05-2018 | 7.5            |
15-05-2018 | 8              |
ACCIDENTS
TOD              | SEVERITY |
-----------------+----------+
12-05-2018 12:00 | minor    |
12-05-2018 15:00 | minor    |
13-05-2018 08:00 | severe   |
13-05-2018 12:00 | severe   |
14-05-2018 10:30 | severe   |
And I need a result that is as follows:
WORKDAYS
DATE       | WORKDAY_LENGHT | ACCIDENTS_COUNT |
-----------+----------------+-----------------+
12-05-2018 | 8              | 2               |
13-05-2018 | 6.5            | 2               |
14-05-2018 | 7.5            | 1               |
15-05-2018 | 8              | 0               |
What I have tried so far is this:
SELECT DISTINCT
    w.date,
    (
        SELECT COUNT(*)
        FROM accidents a
        WHERE date(w.date) = date(a.tod)
    ) AS accidents_count
FROM
    workdays w
Which gives me an answer that is somewhat in the right direction. Something like this:
WORKDAYS
DATE       | WORKDAY_LENGHT | ACCIDENTS_COUNT |
-----------+----------------+-----------------+
12-05-2018 | 8              | 1               |
12-05-2018 | 8              | 1               |
13-05-2018 | 6.5            | 1               |
13-05-2018 | 6.5            | 1               |
14-05-2018 | 7.5            | 1               |
15-05-2018 | 8              | 0               |
This is SQLite, so the date values are stored as strings. The date function should therefore turn them into plain dates, right? Or is that what is causing problems?
I was missing a GROUP BY, and I feel ashamed for opening a question before figuring this out.
Adding GROUP BY date(w.date) is the solution here.
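For completeness, a sketch of the corrected query (assuming the workday_lenght column name from the sample tables):

SELECT
    w.date,
    w.workday_lenght,
    (
        SELECT COUNT(*)
        FROM accidents a
        WHERE date(w.date) = date(a.tod)
    ) AS accidents_count
FROM
    workdays w
GROUP BY
    date(w.date)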

Why does the frequency of my Gnocchi measurements not match the set granularity?

I'm running OpenStack and am trying to get my Gnocchi meters to come through more frequently so that I can run a scaling demo without lots of 5-minute lags. In Gnocchi I have changed the archive policy to a custom policy with granularity set to 30 seconds (I've also tried the following using the existing 'medium' policy, with the same result):
+---------------------+---------------------------------------------------------+
| Field               | Value                                                   |
+---------------------+---------------------------------------------------------+
| aggregation_methods | std, count, min, max, sum, mean                         |
| back_window         | 0                                                       |
| definition          | - points: 120, granularity: 0:00:30, timespan: 1:00:00 |
| name                | test                                                    |
+---------------------+---------------------------------------------------------+
The cpu_util meter is picking it up correctly:
+------------------------------------+-------------------------------------------------------------------+
| Field                              | Value                                                             |
+------------------------------------+-------------------------------------------------------------------+
| archive_policy/aggregation_methods | std, count, min, max, sum, mean                                   |
| archive_policy/back_window         | 0                                                                 |
| archive_policy/definition          | - points: 120, granularity: 0:00:30, timespan: 1:00:00           |
| archive_policy/name                | test                                                              |
| created_by_project_id              | e499d0c2e0fb4a05ac39c3f8c260052b                                  |
| created_by_user_id                 | 21759a51f3834b9bbae49c3ed17a13e4                                  |
| creator                            | 21759a51f3834b9bbae49c3ed17a13e4:e499d0c2e0fb4a05ac39c3f8c260052b |
| id                                 | e5a02f3a-9fbe-4e44-bb91-e1cfe6b86143                              |
| name                               | cpu_util                                                          |
| resource/created_by_project_id     | e499d0c2e0fb4a05ac39c3f8c260052b                                  |
| resource/created_by_user_id        | 21759a51f3834b9bbae49c3ed17a13e4                                  |
| resource/creator                   | 21759a51f3834b9bbae49c3ed17a13e4:e499d0c2e0fb4a05ac39c3f8c260052b |
| resource/ended_at                  | None                                                              |
| resource/id                        | 243b9715-95ba-4532-9728-3e61776e1c29                              |
| resource/original_resource_id      | 243b9715-95ba-4532-9728-3e61776e1c29                              |
| resource/project_id                | 43a7db62d5d54c4590e363868fff49e2                                  |
| resource/revision_end              | None                                                              |
| resource/revision_start            | 2018-08-08T14:05:09.770765+00:00                                  |
| resource/started_at                | 2018-08-08T13:20:45.948842+00:00                                  |
| resource/type                      | instance                                                          |
| resource/user_id                   | 4e5015006b304e7ca57edc5419b42be3                                  |
| unit                               | %                                                                 |
+------------------------------------+-------------------------------------------------------------------+
But the measurements are still only coming out every 5 minutes:
gnocchi measures show e5a02f3a-9fbe-4e44-bb91-e1cfe6b86143
+---------------------------+-------------+--------------+
| timestamp                 | granularity | value        |
+---------------------------+-------------+--------------+
| 2018-08-08T13:30:00+00:00 | 30.0        | 0.0400002375 |
| 2018-08-08T13:35:00+00:00 | 30.0        | 0.0366666763 |
| 2018-08-08T13:40:00+00:00 | 30.0        | 0.0366667101 |
| 2018-08-08T13:45:00+00:00 | 30.0        | 0.0399999545 |
| 2018-08-08T13:50:00+00:00 | 30.0        | 0.0366664861 |
| 2018-08-08T13:55:00+00:00 | 30.0        | 0.0400000543 |
| 2018-08-08T14:00:00+00:00 | 30.0        | 0.0366665877 |
+---------------------------+-------------+--------------+
Any ideas what I am missing?
I had the same issue. In Gnocchi-backed Ceilometer there is a new configuration file, polling.yaml; the resource polling interval is set there.
https://review.opendev.org/#/c/405682/
https://docs.openstack.org/ceilometer/pike/admin/telemetry-best-practices.html
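A sketch of what the relevant polling.yaml entry might look like (the source name is illustrative; 30 matches the 30-second granularity above):

---
sources:
    - name: cpu_source    # illustrative name
      interval: 30        # polling interval in seconds
      meters:
        - cpu_util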

Dojo: how to have multiple rows in a DataGrid header

I need to have a DataGrid with the following layout:
-----------------------------------------------
|      |       |            player            |
| date | time  |------------------------------|
|      |       | first name | last name | age |
|---------------------------------------------|
| jan  | 14:02 | roy        | batty     | 3   |
|---------------------------------------------|
| mar  | 17:12 | pika       | chu       | 1   |
|---------------------------------------------|
| dec  | 05:31 | louie      | dickens   | 33  |
-----------------------------------------------
Preliminary inquiries seem to suggest that Dojo does not support this kind of layout. Am I right?
Thank you very much.
This is possible in Dojo: use the CompoundColumns feature and you will achieve this layout.
Please follow this link:
CompoundColumns for multilevel row in dojo grid
Do try it, and let me know if you have any issues or concerns.
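As a rough sketch of the idea (assuming the dgrid CompoundColumns extension; the field names and the "grid" node id are illustrative):

require([
    "dojo/_base/declare",
    "dgrid/Grid",
    "dgrid/extensions/CompoundColumns"
], function (declare, Grid, CompoundColumns) {
    var grid = new (declare([Grid, CompoundColumns]))({
        columns: [
            { field: "date", label: "date" },
            { field: "time", label: "time" },
            // a parent column with children produces the two-row header
            { label: "player", children: [
                { field: "first", label: "first name" },
                { field: "last",  label: "last name" },
                { field: "age",   label: "age" }
            ] }
        ]
    }, "grid"); // id of an existing <div>

    grid.renderArray([
        { date: "jan", time: "14:02", first: "roy",   last: "batty",   age: 3 },
        { date: "mar", time: "17:12", first: "pika",  last: "chu",     age: 1 },
        { date: "dec", time: "05:31", first: "louie", last: "dickens", age: 33 }
    ]);
});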

Is there a way to show partitions on Cloudera impala?

Normally, I can do show partitions <table> in Hive, but when it is a Parquet table, Hive does not understand it. I can go to HDFS and check the directory structure, but that is not ideal. Is there a better way to do that?
I am using Impala 1.4.0 and I can see partitions.
From the impala-shell, give the command:
show partitions <mytablename>
I have something looking like this:
+-------+-------+-----+-------+--------+---------+--------------+---------+
| year  | month | day | #Rows | #Files | Size    | Bytes Cached | Format  |
+-------+-------+-----+-------+--------+---------+--------------+---------+
| 2013  | 11    | 1   | -1    | 3      | 25.87MB | NOT CACHED   | PARQUET |
| 2013  | 11    | 2   | -1    | 3      | 24.84MB | NOT CACHED   | PARQUET |
| 2013  | 11    | 3   | -1    | 2      | 19.05MB | NOT CACHED   | PARQUET |
| 2013  | 11    | 4   | -1    | 3      | 23.63MB | NOT CACHED   | PARQUET |
| 2013  | 11    | 5   | -1    | 3      | 26.56MB | NOT CACHED   | PARQUET |
+-------+-------+-----+-------+--------+---------+--------------+---------+
Alternatively, you can go to your table in HDFS. Tables are normally found under one of these paths:
/user/hivestore/warehouse/<mytablename> or
/user/hive/warehouse/<mytablename>
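For the manual route, listing the table directory in HDFS shows one subdirectory per partition, e.g.:

hdfs dfs -ls /user/hive/warehouse/<mytablename>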
Unfortunately, no. The issue is still open, though, so checking manually seems to be the only option right now.
