Combining aggregate functions in SQLite

Assuming the following table and using sqlite I have the following question:
Node | Loadcase | Fx   | Cluster
--------------------------------
1    | 1        | 50   | A
2    | 1        | -40  | A
3    | 1        | 60   | B
4    | 1        | 80   | C
1    | 2        | 50   | A
2    | 2        | -50  | A
3    | 2        | 80   | B
4    | 2        | -100 | C
I am trying to write a query which fetches, for each Node 1-4, the maximum absolute value of Fx and the Loadcase in which it occurs.
An additional requirement is that Fx values belonging to the same Cluster shall be summed up before taking that maximum.
In the example above I would expect the following results:
Node | Loadcase | MaxAbsClusteredFx
-----|----------|------------------
1    | 1        | 10
2*   |          |
3    | 2        | 80
4    | 2        | 100
* N/A because Node 2 is summed up with Node 1; both belong to cluster A.
Query:
For Node 1 I would execute a query similar to this
SELECT Loadcase,abs(Fx GROUP BY Cluster) FROM MyTable WHERE abs(Fx GROUP BY Cluster) = max(abs(Fx GROUP BY Cluster)) AND Node = 1
I keep getting "Error while executing query: near "Forces": syntax error" or similar.
Thankful for any help!
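One possible approach, sketched here under the assumption that the table is called MyTable as in the query above: first sum Fx per Cluster and Loadcase, then keep the row with the largest absolute sum within each Cluster. SQLite's documented bare-column behaviour with a single max() aggregate returns the Node and Loadcase from that same winning row:

WITH clustered AS (
    SELECT Cluster,
           Loadcase,
           MIN(Node)    AS Node,            -- representative node of the cluster
           ABS(SUM(Fx)) AS AbsClusteredFx   -- Fx summed per Cluster and Loadcase
    FROM MyTable
    GROUP BY Cluster, Loadcase
)
SELECT Node,
       Loadcase,
       MAX(AbsClusteredFx) AS MaxAbsClusteredFx
FROM clustered
GROUP BY Cluster;

With the sample data this should give (1, 1, 10), (3, 2, 80) and (4, 2, 100); nodes sharing a cluster collapse into a single row, as in the expected output.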

Related

Updating multiple rows in SQLite with relevant data from the same table

I have a database whose source I don't control directly; it produces errant '0' entries which mess up the generated graphs with drops to zero. I am able to manipulate the data after the fact and update the database.
It is acceptable to use the last known good value instead, so I am trying to write a general query that removes all the zeros and replaces them with the last known value.
Luckily, every entry includes the ID of the previous entry, so it is simply a matter of looking back and grabbing it.
I have got very close to a final answer, but instead of updating with the last good value, it just uses the first value over and over again.
Dummy data:
CREATE TABLE tbl(id INT,r INT,oid INT);
INSERT INTO tbl VALUES(1,10,0);
INSERT INTO tbl VALUES(2,20,1);
INSERT INTO tbl VALUES(3,0,2);
INSERT INTO tbl VALUES(4,40,3);
INSERT INTO tbl VALUES(5,50,4);
INSERT INTO tbl VALUES(6,0,5);
INSERT INTO tbl VALUES(7,70,6);
INSERT INTO tbl VALUES(8,80,7);
SELECT * FROM tbl;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 0 | 2 | ** NEEDS FIXING
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 0 | 5 | ** NEEDS UPDATE
| 7 | 70 | 6 |
| 8 | 80 | 7 |
I have worked up several queries that get close to what I am after:
All zero entries:
SELECT * FROM tbl WHERE r = 0;
OUTPUT:
| id | r | oid |
|----|----|-----|
| 3 | 0 | 2 |
| 6 | 0 | 5 |
Output only those rows, together with the preceding good row:
SELECT * FROM tbl WHERE id in (
SELECT id FROM tbl WHERE r = 0
UNION
SELECT oid FROM tbl WHERE r = 0
)
OUTPUT:
| id| r |oid|
|---|----|---|
| 2 | 20 | 1 |
| 3 | 0 | 2 |
| 5 | 50 | 4 |
| 6 | 0 | 5 |
Almost works
This is as close as I have got: it does change all the zeros, but it changes them all to the value of the first lookup.
UPDATE tbl
SET r = (SELECT r
FROM tbl
WHERE id in (SELECT oid
FROM tbl
WHERE r = 0)
) WHERE r = 0 ;
OUTPUT:
| id| r |oid|
|---|----|---|
| 1 | 10 | 0 |
| 2 | 20 | 1 |
| 3 | 20 | 2 | ** GOOD
| 4 | 40 | 3 |
| 5 | 50 | 4 |
| 6 | 20 | 5 | ** BAD, should be 50
| 7 | 70 | 6 |
| 8 | 80 | 7 |
If it helps, I created this fiddle here that I've been playing with:
http://sqlfiddle.com/#!5/8afff/1
For this sample data, all you have to do is use a correlated subquery that returns the value of r from the row whose id equals the current row's oid, matched in the subquery's WHERE clause:
UPDATE tbl AS t
SET r = (SELECT tt.r FROM tbl tt WHERE tt.id = t.oid)
WHERE t.r = 0;
See the demo.
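If the data could ever contain two consecutive zero rows (which this sample does not), a single pass of that UPDATE would copy a zero forward. A guarded variant, shown only as a sketch for that hypothetical case, copies a value only from a predecessor that is itself non-zero and is simply re-run until no r = 0 rows remain:

UPDATE tbl AS t
SET r = (SELECT tt.r FROM tbl tt WHERE tt.id = t.oid)
WHERE t.r = 0
  AND (SELECT tt.r FROM tbl tt WHERE tt.id = t.oid) <> 0;
-- repeat until SELECT COUNT(*) FROM tbl WHERE r = 0; returns 0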

Data preparation before running exact logistic (elrm in R)

I started out using Firth's logistic regression (logistf) to deal with my small sample size (n=80), but wanted to try exact logistic regression using the elrm package. However, I'm having trouble figuring out how to create the "collapsed" data required for elrm to run. I import a CSV into R as a data frame with the following variables/columns. Here is some example data (the real data has a few more columns and 80 rows):
+------------+-----------+-----+--------+----------------+
| patien_num | asymmetry | age | female | field_strength |
+------------+-----------+-----+--------+----------------+
| 1 | 1 | 25 | 1 | 1.5 |
| 2 | 0 | 50 | 0 | 3 |
| 3 | 0 | 75 | 1 | 1.5 |
| 4 | 0 | 33 | 1 | 3 |
| 5 | 0 | 66 | 1 | 3 |
| 6 | 0 | 99 | 0 | 3 |
| 7 | 1 | 20 | 0 | 1.5 |
| 8 | 1 | 40 | 1 | 3 |
| 9 | 0 | 60 | 1 | 3 |
| 10 | 0 | 80 | 0 | 1.5 |
+------------+-----------+-----+--------+----------------+
Basically my data is one line per patient (not a frequency table). I'm trying to run a regression with asymmetry as the dependent variable and age (continuous), female (binary), and field_strength (factor) as independent variables. I'm trying to understand how to collapse this into the appropriate format so I can get the "ntrials" part required by the elrm formula.
I've looked at https://stats.idre.ucla.edu/r/dae/exact-logistic-regression/, but they start with data in a different format than mine, and I'm having trouble adapting it. Any help appreciated!
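A minimal sketch of one way to build the collapsed data, assuming the patient-level data frame is called dat and has exactly the columns shown above: group the rows by covariate pattern, then count the patients in each group (the "ntrials" part) and how many of them have asymmetry == 1 (the successes). With a continuous covariate such as age, most groups will contain a single trial, which elrm accepts.

library(dplyr)

collapsed <- dat %>%
  mutate(field_strength = factor(field_strength)) %>%   # treat field strength as a factor
  group_by(age, female, field_strength) %>%
  summarise(n_asymmetry = sum(asymmetry),               # successes in this covariate pattern
            ntrials     = n(),                          # patients with this covariate pattern
            .groups = "drop") %>%
  as.data.frame()

# elrm then takes the binomial successes/trials formula form, for example:
# library(elrm)
# fit <- elrm(n_asymmetry/ntrials ~ age + female + field_strength,
#             interest = ~ female, iter = 100000, burnIn = 1000,
#             dataset = collapsed)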

Subsetting a table in R

In R, I've created a 3-dimensional table from a dataset. The three variables are all factors and are labelled H, O, and S. This is the code I used to create the table:
attach(df)
test <- table(H, O, S)
Outputting the flattened table produces the table below. The two levels of S are split into separate columns, labelled S1 and S2:
ftable(test)
+-----------+-----------+-----+-----+
| H | O | S1 | S2 |
+-----------+-----------+-----+-----+
| Isolation | Dead | 2 | 15 |
| | Sick | 64 | 20 |
| | Recovered | 153 | 379 |
| ICU | Dead | 0 | 15 |
| | Sick | 0 | 2 |
| | Recovered | 1 | 9 |
| Other | Dead | 7 | 133 |
| | Sick | 4 | 20 |
| | Recovered | 17 | 261 |
+-----------+-----------+-----+-----+
The goal is to use this table object, subset it, and produce a second table. Essentially, I want only "Isolation" and "ICU" from H, "Sick" and "Recovered" from O, and only S1, so it basically becomes the 2-dimensional table below:
+-----------+------+-----------+
| | Sick | Recovered |
+-----------+------+-----------+
| Isolation | 64 | 153 |
| ICU | 0 | 1 |
+-----------+------+-----------+
S = S1
I know I could first subset the dataframe and then create the new table, but the goal is to subset the table object itself. I'm not sure how to retrieve certain values from each dimension and produce the reduced table.
Edit: ANSWER
I found a much simpler method. All I needed to do was index the desired levels in each dimension. The solution is below:
> test[1:2,2:3,1]
O
H Sick Healed
Isolation 64 153
ICU 0 1
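Equivalently, the subset can be written with dimension names instead of positions, which is a little more robust if the order of the factor levels ever changes (a sketch, assuming the level names shown in the ftable above):

test[c("Isolation", "ICU"), c("Sick", "Recovered"), "S1"]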
Subset the data before running table(); for example:
ftable(table(mtcars[, c("cyl", "gear", "vs")]))
# vs 0 1
# cyl gear
# 4 3 0 1
# 4 0 8
# 5 1 1
# 6 3 0 2
# 4 2 2
# 5 1 0
# 8 3 12 0
# 4 0 0
# 5 2 0
# subset then run table
ftable(table(mtcars[ mtcars$gear == 4, c("cyl", "gear", "vs")]))
# vs 0 1
# cyl gear
# 4 4 0 8
# 6 4 2 2

How to use ORDER BY based on strings in SQLite

I have a dataset in an SQLite DB with IDs and image URLs. It is as follows:
ID | URL
1 | http://img.url1.com
2 |
3 | http://img.url2.com
4 | http://img.url3.com
5 | http://img.url4.com
6 |
7 | http://img.url5.com
8 |
9 | http://img.url6.com
10 |
11 | http://img.url7.com
12 | http://img.url8.com
13 |
I want to sort this so that the IDs with blank URLs appear at the bottom and the IDs with image URLs appear at the top, in ascending ID order, as follows:
ID | URL
1 | http://img.url1.com
3 | http://img.url2.com
4 | http://img.url3.com
5 | http://img.url4.com
7 | http://img.url5.com
9 | http://img.url6.com
11 | http://img.url7.com
12 | http://img.url8.com
2 |
6 |
8 |
10 |
13 |
Can you suggest what kind of query I need?
SELECT * FROM T
ORDER BY (URL IS NULL),id
SQLFiddle demo
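If the blank entries are stored as empty strings rather than NULLs (an assumption about how the data was loaded), the same idea works with a slightly wider sort key:

SELECT * FROM T
ORDER BY (URL IS NULL OR URL = ''), id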

SQLite join selection from the same table as two columns

I have one table 'positions' with columns:
id | session_id | keyword_id | position
and some rows in it:
10 rows with session_id = 1
and 10 with session_id = 2.
As a result of the query I need a table like this:
id | keyword_id | position1 | position2
where 'position1' is a column with the position values from rows that had session_id = 1 and 'position2' is a column with the values from rows that had session_id = 2.
The result set should contain 10 records.
Sorry for my bad English.
Data example:
id | session_id | keyword_id | position
1 | 1 | 1 | 2
2 | 1 | 2 | 3
3 | 1 | 3 | 0
4 | 1 | 4 | 18
5 | 2 | 5 | 9
6 | 2 | 1 | 0
7 | 2 | 2 | 14
8 | 2 | 3 | 2
9 | 2 | 4 | 8
10 | 2 | 5 | 19
Assuming that you wish to combine the positions with the same keyword_id from the two sessions, the following query should do the trick:
SELECT T1.keyword_id
, T1.position as Position1
, T2.position as Position2
FROM positions T1
INNER JOIN positions T2
ON T1.keyword_id = T2.keyword_id -- this will match positions by [keyword_id]
AND T1.session_id = 1
AND T2.session_id = 2
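If a keyword can appear in only one of the two sessions (as keyword_id 5 does in the sample data), the inner join drops it. A variant using conditional aggregation, sketched here rather than taken from the original answer, keeps such rows and leaves NULL in the missing position column:

SELECT keyword_id,
       MAX(CASE WHEN session_id = 1 THEN position END) AS position1,
       MAX(CASE WHEN session_id = 2 THEN position END) AS position2
FROM positions
GROUP BY keyword_id;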
