Incrementing an ID number each time a condition is met - r

I have a dataset that I'm trying to chunk up into "events" based on a condition. I want to create a consecutive group number (ID) which increases each time the condition is met.
Some kinds of records indicate that a new event has started, while other kinds of records represent no change / staying the course.
For example, in this dataset whenever 'Action' is "Left" or "Right", a new event has started and 'GroupId' should be incremented by 1:
| Id | Action |
|-----+---------|
| 1 | Left |
| 2 | Forward |
| 3 | Forward |
| 4 | Right |
| 5 | Forward |
| 6 | Left |
| ... | ... |
The resulting table I want would look like:
| Id | Action | GroupId |
|-----+---------+---------|
| 1 | Left | 1 |
| 2 | Forward | 1 |
| 3 | Forward | 1 |
| 4 | Right | 2 |
| 5 | Forward | 2 |
| 6 | Left | 3 |
| ... | ... | ... |
In something like Python I might do this with a counter and a for loop (pseudo-ish code):
    GroupID = 0  # start at 0 so the first "Left" row gets group 1
    for row in data:
        if row["Action"] == "Left" or row["Action"] == "Right":
            GroupID = GroupID + 1
        row["GroupId"] = GroupID
I feel like this should be a really simple one-liner, but my brain is broken right now and I'm having a hard time conceptualizing this.

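Take the cumulative sum of the condition. Each TRUE counts as 1, so the running total goes up by one on every "Left" or "Right" row:
    GroupId = cumsum(Action %in% c("Left", "Right"))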
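For comparison with the Python-style loop in the question, here is a minimal sketch of the same running-sum idea in Python. The `actions` list is just the sample column from the question typed in by hand, and `group_ids` is a name made up for illustration.

```python
from itertools import accumulate

# Sample 'Action' column from the question.
actions = ["Left", "Forward", "Forward", "Right", "Forward", "Left"]

# Turn the condition into 1/0 and keep a running total, so the total
# rises by one on every row that starts a new event.
group_ids = list(accumulate(int(action in ("Left", "Right")) for action in actions))

print(group_ids)  # [1, 1, 1, 2, 2, 3]
```

This matches the GroupId column in the desired table, with no explicit counter variable.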

Related

Master-Detail show data SQL

I'm working with SQL Server and I have these 3 tables:
STUDENTS
| id | student |
-------------
| 1 | Ronald |
| 2 | Jenny |
SCORES
| id | score | period | student |
| 1 | 8 | 1 | 1 |
| 2 | 9 | 2 | 1 |
PERIODS
| id | period |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
And I want a query that returns this result:
| student | score1 | score2 | score3 | score4 |
| Ronald | 8 | 9 | null | null |
| Jenny | null | null | null | null |
As you can see, the number of score columns depends on the periods, because sometimes there can be 4 or 3 periods.
I don't know if I have the wrong idea or whether I should do this in the application, but I would like some help.
You need to PIVOT your data, e.g.:
    select Y.Student, [1], [2], [3], [4]
    from (
        select T.Student, P.[Period], S.Score
        from Students T
        cross join [Periods] P
        left join Scores S on S.[Period] = P.id and S.Student = T.id
    ) X
    pivot
    (
        sum(Score)
        for [Period] in ([1],[2],[3],[4])
    ) Y
Reference: https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-20
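PIVOT is specific to SQL Server, so as a rough, portable illustration of the same reshaping, here is a sketch that uses conditional aggregation (MAX over CASE) instead. This is a swapped-in technique, not the PIVOT query above. It runs against an in-memory SQLite database through Python's sqlite3 module, with the sample rows from the question. The PERIODS table is left out because the four output columns are hard-coded, just as they are in the PIVOT column list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER, student TEXT);
    CREATE TABLE scores (id INTEGER, score INTEGER, period INTEGER, student INTEGER);
    INSERT INTO students VALUES (1, 'Ronald'), (2, 'Jenny');
    INSERT INTO scores VALUES (1, 8, 1, 1), (2, 9, 2, 1);
""")

# One CASE expression per period: it yields the score only for rows of
# that period, and MAX collapses each student's rows into a single row.
pivot_rows = conn.execute("""
    SELECT t.student,
           MAX(CASE WHEN s.period = 1 THEN s.score END) AS score1,
           MAX(CASE WHEN s.period = 2 THEN s.score END) AS score2,
           MAX(CASE WHEN s.period = 3 THEN s.score END) AS score3,
           MAX(CASE WHEN s.period = 4 THEN s.score END) AS score4
    FROM students t
    LEFT JOIN scores s ON s.student = t.id
    GROUP BY t.id, t.student
    ORDER BY t.id
""").fetchall()

for row in pivot_rows:
    print(row)
```

Like the PIVOT version, this needs the list of periods to be known when the query is written. A truly variable number of columns would need dynamic SQL or handling in the application.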

sqlite - how do I write a query to receive an additional column containing a selection of data from another table in every cell

Assume I have two tables, e.g.:
table_1:
+----+-------+------------+--
| id | name | table_2_id | ...
+----+-------+------------+--
| 1 | test1 | 2 | ...
| 2 | test2 | 1 | ...
| 3 | test3 | 1 | ...
...
and
table_2:
+----+------+--
| id | name | ...
+----+------+--
| 1 | xxx | ...
| 2 | yyy | ...
| 3 | zzz | ...
...
Now I want to select everything from table_2 and add another column containing in every cell a collection of all names from table_1 where table_2_id corresponds with the current id from table_2:
output:
+----+------+-----+--------------+
| id | name | ... | link |
+----+------+-----+--------------+
| 1 | xxx | ... | test2, test3 |
| 2 | yyy | ... | test1 |
| 3 | zzz | ... | % |
...
How can I achieve this?
This can be done with a correlated subquery.
To combine values from multiple rows, use group_concat:
    SELECT id,
           name,
           (SELECT group_concat(name)
            FROM table_1
            WHERE table_2_id = table_2.id
           ) AS link
    FROM table_2;
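To try the correlated subquery on the sample data, something like the following sketch should work. It uses Python's sqlite3 module with an in-memory database. The extra "..." columns from the question are omitted, and the outer ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, name TEXT, table_2_id INTEGER);
    CREATE TABLE table_2 (id INTEGER, name TEXT);
    INSERT INTO table_1 VALUES (1, 'test1', 2), (2, 'test2', 1), (3, 'test3', 1);
    INSERT INTO table_2 VALUES (1, 'xxx'), (2, 'yyy'), (3, 'zzz');
""")

# The subquery runs once per table_2 row and joins all matching
# table_1 names into one comma-separated string.
link_rows = conn.execute("""
    SELECT id,
           name,
           (SELECT group_concat(name)
            FROM table_1
            WHERE table_2_id = table_2.id
           ) AS link
    FROM table_2
    ORDER BY id
""").fetchall()

for row in link_rows:
    print(row)
```

Rows of table_2 with no match (id 3 here) get NULL in the link column, and SQLite does not guarantee the order of the names inside a concatenated value.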

SQLite - Update a column based on values from two other tables' columns

I am trying to update Data1's ID to Record2's ID when:
Record1's and Record2's Name are the same, and
Weight is greater in Record2.
Record1
| ID | Weight | Name |
|----|--------|------|
| 1 | 10 | a |
| 2 | 10 | b |
| 3 | 10 | c |
Record2
| ID | Weight | Name |
|----|--------|------|
| 4 | 20 | a |
| 5 | 20 | b |
| 6 | 20 | c |
Data1
| ID | Weight |
|----|--------|
| 4 | 40 |
| 5 | 40 |
I have tried the following SQLite query:
    update data1
    set id =
        (select record2.id
         from record2,record1
         where record1.name=record2.name
           and record1.weight<record2.weight)
    where id in
        (select record1.id
         from record1, record2
         where record1.name=record2.name
           and record1.weight<record2.weight)
Using the above query Data1's id is updated to 4 for all records.
NOTE: Record1's ID is the foreign key for Data1.
The subquery in your SET clause is not tied to the data1 row being updated, so it returns the same first match (4) for every row. For the given data set, the following seems to do the job:
    update data1
    set id =
        (select record2.id
         from record2,record1
         where
           data1.id = record1.id
           and record1.name=record2.name
           and record1.weight<record2.weight)
    where id in
        (select record1.id
         from record1, record2
         where
           record1.id in (select id from data1)
           and record1.name=record2.name
           and record1.weight<record2.weight)
    ;
See it in action: SQL Fiddle.
Please comment if and as this requires adjustment / further detail.
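Since the fiddle may not always be reachable, here is a small sketch that runs the same corrected statement with Python's sqlite3 module. One assumption to flag: data1 is seeded with ids 1 and 2, i.e. Record1 IDs, because the note says Record1's ID is the foreign key for Data1. The Data1 table in the question shows 4 and 5, so adjust the seed rows if your starting data differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record1 (id INTEGER, weight INTEGER, name TEXT);
    CREATE TABLE record2 (id INTEGER, weight INTEGER, name TEXT);
    CREATE TABLE data1 (id INTEGER, weight INTEGER);
    INSERT INTO record1 VALUES (1, 10, 'a'), (2, 10, 'b'), (3, 10, 'c');
    INSERT INTO record2 VALUES (4, 20, 'a'), (5, 20, 'b'), (6, 20, 'c');
    -- Assumption: data1 starts out pointing at record1 ids 1 and 2.
    INSERT INTO data1 VALUES (1, 40), (2, 40);
""")

# The SET subquery is correlated on data1.id, so each row picks up the
# record2 id that matches its own record1 row.
conn.execute("""
    UPDATE data1
    SET id = (SELECT record2.id
              FROM record2, record1
              WHERE data1.id = record1.id
                AND record1.name = record2.name
                AND record1.weight < record2.weight)
    WHERE id IN (SELECT record1.id
                 FROM record1, record2
                 WHERE record1.id IN (SELECT id FROM data1)
                   AND record1.name = record2.name
                   AND record1.weight < record2.weight)
""")

updated_rows = conn.execute("SELECT id, weight FROM data1 ORDER BY id").fetchall()
print(updated_rows)  # [(4, 40), (5, 40)]
```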

Select single row per unique field value with SQL Developer

I have thousands of rows of data, a segment of which looks like:
+-------------+-----------+-------+
| Customer ID | Company | Sales |
+-------------+-----------+-------+
| 45678293 | Sears | 45 |
| 01928573 | Walmart | 6 |
| 29385068 | Fortinoes | 2 |
| 49582015 | Walmart | 1 |
| 49582015 | Joe's | 1 |
| 19285740 | Target | 56 |
| 39506783 | Target | 4 |
| 39506783 | H&M | 4 |
+-------------+-----------+-------+
In every case where a customer ID occurs more than once, the value in 'Sales' is also the same but the value in 'Company' is different (this is true throughout the entire table). I need each value in 'Customer ID' to appear only once, so I need a single row for each customer ID.
In other words, I'd like for the above table to look like:
+-------------+-----------+-------+
| Customer ID | Company | Sales |
+-------------+-----------+-------+
| 45678293 | Sears | 45 |
| 01928573 | Walmart | 6 |
| 29385068 | Fortinoes | 2 |
| 49582015 | Walmart | 1 |
| 19285740 | Target | 56 |
| 39506783 | Target | 4 |
+-------------+-----------+-------+
If anyone knows how I can go about doing this, I'd much appreciate some help.
Thanks!
It would have been helpful if you had posted the SQL that generates that data, but it might go something like this:
    SELECT customer_id, Max(Company) as company, Count(sales.*)
    From Customers <your joins and where clause>
    GROUP BY customer_id
This assumes that a customer can have several companies, of which Max(Company) keeps one (the last in alphabetical order), and that the sales data is in a different table.
Hope this helps.
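As a rough check of the GROUP BY idea against the sample rows, here is a sketch using Python's sqlite3 module rather than SQL Developer. The table name `customer_sales` and its column names are made up for illustration. It assumes everything sits in one table and, as the question states, that Sales is identical across a customer's duplicate rows, so MAX(sales) simply returns that shared value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# customer_id is stored as text so the leading zero in 01928573 survives.
conn.execute("CREATE TABLE customer_sales (customer_id TEXT, company TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO customer_sales VALUES (?, ?, ?)",
    [
        ("45678293", "Sears", 45),
        ("01928573", "Walmart", 6),
        ("29385068", "Fortinoes", 2),
        ("49582015", "Walmart", 1),
        ("49582015", "Joe's", 1),
        ("19285740", "Target", 56),
        ("39506783", "Target", 4),
        ("39506783", "H&M", 4),
    ],
)

# One output row per customer; MAX(company) keeps the alphabetically
# last company name among that customer's rows.
rows = conn.execute("""
    SELECT customer_id, MAX(company) AS company, MAX(sales) AS sales
    FROM customer_sales
    GROUP BY customer_id
""").fetchall()

deduped = {customer_id: (company, sales) for customer_id, company, sales in rows}
print(deduped)
```

On this sample MAX(company) happens to keep Walmart and Target, the same rows as in the desired output, but that is a coincidence of alphabetical order, not a "first row" rule.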

Sort comments by rating for the top two rows of the result and the rest by date?

Let's say I have a table of comments, like so:
---------------------------
| comment | date | rating |
---------------------------
| a | 1 | 1 |
---------------------------
| b | 4 | 3 |
---------------------------
| c | 7 | 2 |
---------------------------
| d | 1 | 10 |
---------------------------
| e | 3 | 20 |
---------------------------
I want to sort the table so that the two highest-rated comments always appear at the top of the result, regardless of the date, and the rest of the comments are sorted by date in descending order. The results should look like this:
---------------------------
| comment | date | rating |
---------------------------
| e | 3 | 20 |
---------------------------
| d | 1 | 10 |
---------------------------
| c | 7 | 2 |
---------------------------
| b | 4 | 3 |
---------------------------
| a | 1 | 1 |
---------------------------
Is this possible?
You could do two SQL queries. The first one selects the two highest-rated comments:
    SELECT comment, date, rating FROM comments ORDER BY rating DESC LIMIT 2
Then you can show the others, ordered by date. Do you store an ID in the comments table? If so, you could also select the ID in the query above, and then in a second query select all comments, ordered by date, that don't have an ID returned by the first query.
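The two-query approach could be sketched like this with Python's sqlite3 module. The `id` column is an assumption (the question's table does not show one), and `top_two`, `rest` and `ordered` are names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the comments table has an id column to exclude rows by.
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, comment TEXT, date INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO comments (comment, date, rating) VALUES (?, ?, ?)",
    [("a", 1, 1), ("b", 4, 3), ("c", 7, 2), ("d", 1, 10), ("e", 3, 20)],
)

# Query 1: the two highest-rated comments, with their ids.
top_two = conn.execute(
    "SELECT id, comment FROM comments ORDER BY rating DESC LIMIT 2"
).fetchall()
top_ids = [comment_id for comment_id, _ in top_two]

# Query 2: everything else, newest first, excluding the ids from query 1.
placeholders = ", ".join("?" for _ in top_ids)
rest = conn.execute(
    f"SELECT comment FROM comments WHERE id NOT IN ({placeholders}) ORDER BY date DESC",
    top_ids,
).fetchall()

ordered = [comment for _, comment in top_two] + [comment for (comment,) in rest]
print(ordered)  # ['e', 'd', 'c', 'b', 'a']
```

The two result sets are concatenated in the application here. A single query with a UNION or a window function would also be possible, but that goes beyond what this answer proposes.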
