I am trying to find values which are the same as at least two values above it. Please take a look.
id number
1 2
2 6
3 7
4 7
5 7
6 1
7 2
8 4
9 7
So in this case select would return:
ID NUMBER
3 7
4 7
5 7
You can look up values in othe rows with a correlated subquery:
SELECT *
FROM MyTable
WHERE number = (SELECT number
FROM MyTable AS T2
WHERE T2.id = MyTable.id - 1)
AND number = (SELECT number
FROM MyTable AS T2
WHERE T2.id = MyTable.id - 2);
Related
A teradata table
as
Group_categ id
A 1
A 2
A 3
A 5
A 8
A 9
B 11
C 1
C 2
C 3
C 4
need to filter it like
Group_categ min_id max _id
A 1 2
A 3 5
A 8 9
B 11 11
C 1 2
C 3 4
Seems you want to combine two consecutive rows into a single row:
SELECT Group_categ, Min(id), Max(id)
FROM
(
SELECT
Group_categ, id,
-- assign the same value to two consecutive rows: 0,0,1,1,2,2,..
-- -> used in the outer GROUP BY
(Row_Number() Over (PARTITION BY Group_categ ORDER BY id)-1) / 2 AS grp
FROM mytab
) AS dt
GROUP BY Group_categ, grp
I am trying to calculate a running sum in R using sqldf.
I have tried several ways, and I keep getting an this error,
error in statement: near "(": syntax error
I have a pretty simple example dataframe
DF <- data.frame(col1 = 1:4, id = 1:12)
And this is what I am trying to do
install.packages('sqldf')
require(sqldf)
sqldf("SELECT col1, SUM(col1) OVER (ORDER BY id) AS runningsum FROM DF")
I want to get something like this
1) sqlite with the default sqlite backend to sqldf that syntax is not supported but this works:
library(sqldf)
sqldf("select a.*, sum(b.col1) as runningSum
from DF as a
left join DF b on a.id >= b.id
group by a.id")
giving:
col1 id runningSum
1 1 1 1
2 2 2 3
3 3 3 6
4 4 4 10
5 1 5 11
6 2 6 13
7 3 7 16
8 4 8 20
9 1 9 21
10 2 10 23
11 3 11 26
12 4 12 30
2) H2 With the H2 backend we can do this:
library(RH2)
library(sqldf)
sqldf("select *, set(#i, ifnull(#i, 0) + col1) as runningSum from DF")
3) PostgreSQL With the PostgreSQL back end it could be done like this:
library(RPostgreSQL)
library(sqldf)
sqldf('select
*,
sum(col1) over (order by id asc rows between unbounded preceding and current row)
from "DF"')
I have to retrieve IDs for employees who have completed the minimum number of jobs. There are multiple employees who have completed 1 job. My current sqldf query retrieves only 1 row of data, while there are multiple employee IDs who have completed just 1 job. Why does it stop at the first minimum value? And how do I fetch all rows with the minimum value in a column? Here is a data sample:
ID TaskCOunt
1 74
2 53
3 10
4 5
5 1
6 1
7 1
The code I have used:
sqldf("select id, min(taskcount) as Jobscompleted
from (select id,count(id) as taskcount
from MyData
where id is not null
group by id order by id)")
Output is
ID leastcount
5 1
While what I want is all the rows with minimum jobs completed.
ID Jobscompleted
5 1
6 1
7 1
min(...) always returns one row in SQL as do all SQL aggregate functions. Try this instead:
sqldf("select ID, TaskCount TasksCompleted from MyData
where TaskCount = (select min(TaskCount) from MyData)")
giving:
ID TasksCompleted
1 5 1
2 6 1
3 7 1
Note: The input in reproducible form is:
Lines <- "
ID TaskCount
1 74
2 53
3 10
4 5
5 1
6 1
7 1"
MyData <- read.table(text = Lines, header = TRUE)
As an alternative to sqldf, you could use data.table:
library(data.table)
dt <- data.table(ID=1:7, TaskCount=c(74, 53, 10, 5, 1, 1, 1))
dt[TaskCount==min(TaskCount)]
## ID TaskCount
## 1: 5 1
## 2: 6 1
## 3: 7 1
SQLite UPDATE Statement
I have two tables:
Table1
ID Num
1
2
3
4
5
Table2
ID
1
1
2
2
2
3
3
4
4
4
5
I need to UPDATE the Num field in Table1 with occurences of the ID field in Table2 i.e. based on previous:
Table1
ID Num
1 2
2 3
3 2
4 3
5 1
If i run this SQLite statement:
SELECT COUNT(t2.ID) FROM Table1 t1,Table2 t2 WHERE t1.ID=t2.ID GROUP BY t2.ID;
i have the correct table but when i try to UPDATE with that statement:
UPDATE Table1
SET Num=(SELECT COUNT(t2.ID) FROM Table1 t1,Table2 t2 WHERE t1.ID=t2.ID GROUP BY t2.ID);
i have nonsense output.Any ideas?
You must use a correlated subquery to correlate the value returned by the subquery with the current row in the outer query.
This means that you must not use Table1 again in the subquery, but instead refer to the outer table (with the actual name; the UPDATEd table does not support an alias):
UPDATE Table1
SET Num = (SELECT COUNT(t2.ID)
FROM Table2 t2
WHERE Table1.ID=t2.ID
GROUP BY t2.ID);
I would like to delete old data from database table. I would just like to keep last 2 records per id. For example I have a table with following records.
ID TIME DATA
1 2 3
1 3 4
1 4 5
2 2 3
2 3 4
2 4 5
2 5 6
Result which I would like to make is (it must be sorted by TIME):
ID TIME DATA
1 3 4
1 4 5
2 4 5
2 5 6
Thank you for your help.
A solution could be:
select * from tab where (
select count(*) from tab as t
where t.ID = tab.ID and t.TIME >= tab.TIME
) <= 2;
for more details visit:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/