I am trying to select rows based on the timestamp. In the sample data that follows, some columns contain duplicate computer names. I am interested in the row with the latest timestamp.
+------------------------+----------+---------+------------+
| TIMEST AMP | COMPUTER | VERSION | MORE COLS. |
+------------------------+----------+---------+------------+
| 2019-10-02 10:32:40 | COMPA | 1234 | ... |
+------------------------+----------+---------+------------+
| 2019-09-12 11:15 23 | COMPA | 1235 | ... |
+------------------------+----------+---------+------------+
| 2019-11-13 15:23:25 | COMPA | 1234 | ... |
+------------------------+----------+---------+------------+
| 2019-10-02 10:32:40 | COMPB | 1234 | ... |
+------------------------+----------+---------+------------+
| 2019-09-13 11:15 23 | COMPC | 1235 | ... |
+------------------------+----------+---------+------------+
| 2019-11-13 15:23:25 | COMPC | 1235 | ... |
+------------------------+----------+---------+------------+
The following result should be returned
+------------------------+----------+---------+------------+
| TIMEST AMP | COMPUTER | VERSION | MORE COLS. |
+------------------------+----------+---------+------------+
| 2019-11-13 15:23:25 | COMPA | 1234 | ... |
+------------------------+----------+---------+------------+
| 2019-10-02 10:32:40 | COMPB | 1234 | ... |
+------------------------+----------+---------+------------+
| 2019-11-13 15:23:25 | COMPC | 1235 | ... |
+------------------------+----------+---------+------------+
It looks like a nested query should work. I found an example, but I'm not sure how to get it to work with this data
SAMPLE
dependencies
| where resultCode == toscalar(
dependencies
| where resultId == 7
| top 1 by timestamp desc
| project resultCode)
you could try using summarize arg_max() (doc):
datatable(timestamp:datetime, computer:string, version:int)
[
datetime(2019-10-02 10:32:40), 'COMPA', 1234,
datetime(2019-09-12 11:15:23), 'COMPA', 1235,
datetime(2019-11-13 15:23:25), 'COMPA', 1234,
datetime(2019-10-02 10:32:40), 'COMPB', 1234,
datetime(2019-09-13 11:15:23), 'COMPC', 1235,
datetime(2019-11-13 15:23:25), 'COMPC', 1235,
]
| summarize arg_max(timestamp, *) by computer
-->
| computer | timestamp | version |
|----------|-----------------------------|---------|
| COMPA | 2019-11-13 15:23:25.0000000 | 1234 |
| COMPB | 2019-10-02 10:32:40.0000000 | 1234 |
| COMPC | 2019-11-13 15:23:25.0000000 | 1235 |
Related
I have a table name employees
MariaDB [company]> describe employees;
+----------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| employee_id | char(10) | NO | | NULL | |
| first_name | varchar(20) | NO | | NULL | |
| last_name | varchar(20) | NO | | NULL | |
| email | varchar(60) | NO | | NULL | |
| phone_number | char(14) | NO | | NULL | |
| hire_date | date | NO | | NULL | |
| job_id | int(11) | NO | | NULL | |
| salary | varchar(30) | NO | | NULL | |
| commission_pct | char(10) | NO | | NULL | |
| manager_id | char(10) | NO | | NULL | |
| department_id | char(10) | NO | | NULL | |
+----------------+-------------+------+-----+---------+-------+
MariaDB [company]> select * from employees;
+-------------+-------------+-------------+--------------------+---------------+------------+--------+----------+----------------+------------+---------------+
| employee_id | first_name | last_name | email | phone_number | hire_date | job_id | salary | commission_pct | manager_id | department_id |
+-------------+-------------+-------------+--------------------+---------------+------------+--------+----------+----------------+------------+---------------+
| 100 | Steven | King | sking#gmail.com | 515.123.4567 | 2003-06-17 | 1 | 24000.00 | 0.00 | 0 | 90 |
| 101 | Neena | Kochhar | nkochhar#gmail.com | 515.123.4568 | 2005-09-21 | 2 | 17000.00 | 0.00 | 100 | 90 |
| 102 | Lex | Wow | Lwow#gmail.com | 515.123.4569 | 2001-01-13 | 2 | 17000.00 | 0.00 | 100 | 9 |
| 103 | Alexander | Hunold | ahunold#gmail.com | 590.423.4567 | 2006-01-03 | 3 | 9000.00 | 0.00 | 102 | 60 |
| 104 | Bruce | Ernst | bernst#gmail.com | 590.423.4568 | 2007-05-21 | 3 | 6000.00 | 0.00 | 103 | 60 |
| 105 | David | Austin | daustin#gmail.com | 590.423.4569 | 2005-06-25 | 3 | 4800.00 | 0.00 | 103 | 60 |
| 106 | Valli | Pataballa | vpatabal#gmail.com | 590.423.4560 | 2006-02-05 | 3 | 4800.00 | 0.00 | 103 | 60 |
| 107 | Diana | Lorentz | dlorentz#gmail.com | 590.423.5567 | 2007-02-07 | 3 | 4200.00 | 0.00 | 103 | 60 |
| 108 | Nancy | Greenberg | ngreenbe#gmail.com | 515.124.4569 | 2002-08-17 | 4 | 12008.00 | 0.00 | 101 | 100 |
| 109 | Daniel | Faviet | dfaviet#gmail.com | 515.124.4169 | 2002-08-16 | 5 | 9000.00 | 0.00 | 108 | 100 |
| 110 | John | Chen | jchen#gmail.com | 515.124.4269 | 2005-09-28 | 5 | 8200.00 | 0.00 | 108 | 100 |
| 111 | Ismael | Sciarra | isciarra#gmail.com | 515.124.4369 | 2005-09-30 | 5 | 7700.00 | 0.00 | 108 | 100 |
| 112 | Jose | Urman | jurman#gmail.com | 515.124.4469 | 2006-03-07 | 5 | 7800.00 | 0.00 | 108 | 100 |
| 113 | Luis | Popp | lpopp#gmail.com | 515.124.4567 | 2007-12-07 | 5 | 6900.00 | 0.00 | 108 | 100 |
| 114 | Den | Raphaely | drapheal#gmail.com | 515.127.4561 | 2002-12-07 | 6 | 11000.00 | 0.00 | 100 | 30 |
| 115 | Alexander | Khoo | akhoo#gmail.com | 515.127.4562 | 2003-05-18 | 7 | 3100.00 | 0.00 | 114 | 30 |
+-------------+-------------+-------------+--------------------+---------------+------------+--------+----------+----------------+------------+---------------+
I wanted to change the salary column from string to integer. So, I ran this command
MariaDB [company]> alter table employees modify column salary int;
ERROR 1292 (22007): Truncated incorrect INTEGER value: '24000.00'
As you can see it gave me truncation error. I found some previous questions where they showed how to use convert() and trim() but those actually didn't answer my question.
sql code and data can be found here https://0x0.st/oYoB.com_5zfu
I tested this on MySQL and it worked fine. So it is apparently an issue only with MariaDB.
The problem is that a string like '24000.00' is not an integer. Integers don't have a decimal place. So in strict mode, the implicit type conversion fails.
I was able to work around this by running this update first:
update employees set salary = round(salary);
The column is still a string, but '24000.00' has been changed to '24000' (with no decimal point character or following digits).
Then you can alter the data type, and implicit type conversion to integer works:
alter table employees modify column salary int;
See demonstration using MariaDB 10.6:
https://dbfiddle.uk/V6LrEMKt
P.S.: You misspelled the column name "commission_pct" as "comission_pct" in your sample DDL, and I had to edit that to test. In the future, please use one of the db fiddle sites to share samples, because they will test your code.
I have the following dataset with 21 columns - 19 variables and Month and Date as date type columns.
The aim is to analyze how correlation change over time calculating a daily correlation between variables summarized in one month. For example, see this "monthly correlation" over time. (X-axis as month type)
+------------+---------+-----+-----+--------+---------+-------------+
| Date | Month | AOV | ASP | Clicks | Traffic | Impressions |
+------------+---------+-----+-----+--------+---------+-------------+
| 2017-01-01 | 2017-01 | 50 | 6 | 700 | 10000 | 4500 |
+------------+---------+-----+-----+--------+---------+-------------+
| 2017-01-02 | 2017-01 | 55 | 7 | 800 | 20000 | 4600 |
+------------+---------+-----+-----+--------+---------+-------------+
| 2017-02 | 2017-02 | 58 | 8 | 700 | 4599 | 2300 |
+------------+---------+-----+-----+--------+---------+-------------+
At the moment I have the following code but I only can compare two variables at the same time
ddply(corr,"Month",summarise,corr=cor(AOV,ASP))
I get the table below
+---------+------------+
| Month | corr |
+---------+------------+
| 2017-1 | 0.4958738 |
+---------+------------+
| 2017-10 | 0.8527522 |
+---------+------------+
| 2017-11 | -0.2751771 |
+---------+------------+
| 2017-12 | NA |
+---------+------------+
| 2017-2 | 0.6596346 |
+---------+------------+
| 2017-3 | 0.6399969 |
+---------+------------+
| 2017-4 | 0.7926245 |
+---------+------------+
| 2017-5 | 0.6429613 |
+---------+------------+
| 2017-6 | 0.3824414 |
+---------+------------+
| 2017-7 | 0.9154873 |
+---------+------------+
| 2017-8 | 0.7235767 |
+---------+------------+
| 2017-9 | 0.8264006 |
+---------+------------+
I have been using combn to create the combinations set but I'm not quite sure how to use it with ddply. I get 171 combinations in pairs.
combn(corr,2,simplify = F)
You can just do:
cor(your_data_frame)
I have Grid view data like this
Machine Code | Job | Worker | Start | End
DR01 | AAA01 | Mr.A | 2017/01/01 | 2017/01/03
DR01 | AAA02 | Mr.B | 2017/01/02 | 2017/01/04
DR01 | AAA03 | Mr.C | 2017/01/03 | 2017/01/05
And I want to loop in grid view and show result like this
[DR01 Machine]
| AA01 | AA02 | AA03 |
| Mr.A | Mr.B | Mr.C |
| 2017/01/01 | 2017/01/02 | 2017/01/03 |
| 2017/01/03 | 2017/01/04 | 2017/01/05 |
I have a table definied in org-mode:
`#+RESULTS[4fc5d440d2954e8355d32d8004cab567f9918a64]: table
| 7.4159 | 3.0522 | 5.9452 |
| -1.0548 | 12.574 | -6.5001 |
| 7.4159 | 3.0522 | 5.9452 |
| 5.1884 | 4.9813 | 4.9813 |
`
and I want to produce the following table:
#+caption: Caption of my table
| | group 1 | group 2 | group 3 |
|--------+---------+---------+---------|
| plan 1 | 7.416 | 3.052 | 5.945 |
| plan 2 | -1.055 | 12.574 | -6.5 |
| plan 3 | 7.416 | 3.052 | 5.945 |
| plan 4 | 5.1884 | 4.9813 | 4.9813 |
How can I accomplish that? Here is what I tried (in R):
`
#+begin_src R :colnames yes :var table=table :session
data.frame(table)
#+end_src
`
But of course it doesn't work, here is what I get:
`#RESULTS:
| X7.4159 | X3.0522 | X5.9452 |
|---------+---------+---------|
| -1.0548 | 12.574 | -6.5001 |
| 7.4159 | 3.0522 | 5.9452 |
| 5.1884 | 4.9813 | 4.9813 |`
Any suggestions?
thanks!
This gets pretty close. first define this function:
#+BEGIN_SRC emacs-lisp
(defun add-caption (caption)
(concat (format "org\n#+caption: %s" caption)))
#+END_SRC
Next, use this kind of src block. I use python, but it should work in R too, you just need the :wrap. I passed your data in through the var, you don't need it if you generate the data in the block.
#+BEGIN_SRC python :results value :var data=data :wrap (add-caption "Some really long, uninteresting, caption about data that is in this table.")
data.insert(0, ["", "group 1", "group 2", "group 3"])
data.insert(1, None)
return data
#+END_SRC
This outputs
#+BEGIN_org
#+caption: Some really long, uninteresting, caption about data that is in this table.
| | group 1 | group 2 | group 3 |
|--------+---------+---------+---------|
| plan 1 | 7.416 | 3.052 | 5.945 |
| plan 2 | -1.055 | 12.574 | -6.5 |
| plan 3 | 7.416 | 3.052 | 5.945 |
| plan 4 | 5.1884 | 4.9813 | 4.9813 |
#+END_org
and it exports ok too I think.
Here's my data:
+--------+------------+---------+------------+----------------+
| UserID | VisitDate | VisitID | PurchaseID | LastPurchaseID |
+--------+------------+---------+------------+----------------+
| 1234 | 2014-10-03 | 1 | 4a75 | 4a75 |
| 1234 | 2014-10-06 | 2 | | 4a75 |
| 1234 | 2014-10-07 | 3 | b305 | b305 |
| 1234 | 2014-10-08 | 4 | | b305 |
| 1234 | 2014-10-09 | 5 | | b305 |
| 1234 | 2014-10-10 | 6 | b305 | b305 |
| 1234 | 2014-10-10 | 7 | | b305 |
| 1234 | 2014-10-15 | 8 | | b305 |
+--------+------------+---------+------------+----------------+
I don't have LastPurchaseID - this is what I want
I figure I have to use window functions, but I'm not sure how to get it to keep the most recent non-null value, even if the most recent non-null value is many rows ago.
For example, I've tried something like:
SELECT UserID,
VisitDate,
VisitID,
PurchaseID,
LAG(TRIM(PurchaseID)) IGNORE NULLS
OVER (ORDER BY UserID, VisitDate) AS LastPurchaseID
FROM TheTable;
but this only returns:
+--------+------------+---------+------------+----------------+
| UserID | VisitDate | VisitID | PurchaseID | LastPurchaseID |
+--------+------------+---------+------------+----------------+
| 1234 | 2014-10-03 | 1 | 4a75 | 4a75 |
| 1234 | 2014-10-06 | 2 | | 4a75 |
| 1234 | 2014-10-07 | 3 | b305 | b305 |
| 1234 | 2014-10-08 | 4 | | b305 |
| 1234 | 2014-10-09 | 5 | | |
| 1234 | 2014-10-10 | 6 | b305 | b305 |
| 1234 | 2014-10-10 | 7 | | b305 |
| 1234 | 2014-10-15 | 8 | | |
+--------+------------+---------+------------+----------------+
Is there any way to use a window function say "keep the most recent, if it is null, assume it hasn't changed from the previous non-null value"?
I eventually got it, sorry about that. For anyone else in this somewhat unique situation, this is what was happening:
Since PurchaseID was a string in my case, I wasn't considering the case where the PurchaseID was an empty string (or just a space, which trim() turned into the empty string), which is not null.
I have since fixed the job that inserts into the table to prevent this from occurring, and also changed the LastPurchaseID logic to the following:
SELECT LAG(CASE WHEN LENGTH(TRIM(PurchaseID)) = 0 THEN NULL
ELSE TRIM(PurchaseID) END)
IGNORE NULLS OVER (ORDER BY UserID, VisitDate) AS LastPurchaseID
FROM TheTable;