I'm new to SQLite and I'm trying to create a view of joined tables, where table_results references table_names. The user can add, edit, and remove items from table_names, but cannot change the references within table_results. What I'm trying to accomplish: if an entry in table_names is removed, a JOIN table_names ON (table_results.name_id = table_names._id) should still return all rows in table_results, displaying "NO NAME" wherever the referenced name entry has been removed.
Example:
table_names:
_id name
1 John
2 Bill
3 Sally
4 Nancy
table_results:
_id name_id score_1 score_2
1 1 50 75
2 4 80 60
3 2 83 88
4 3 75 75
5 2 93 95
where:
SELECT table_results._id, table_names.name, table_results.score_1, table_results.score_2
FROM table_results
JOIN table_names ON (table_results.name_id = table_names._id);
produces:
1 John 50 75
2 Nancy 80 60
3 Bill 83 88
4 Sally 75 75
5 Bill 93 95
Now if the user were to remove Bill from table_names, the same query would produce:
1 John 50 75
2 Nancy 80 60
4 Sally 75 75
My Question:
Is there a way to write the query so that it substitutes a value when the join doesn't find a match? I'd like the example above to produce the following output, but I'm not sure how to write the query.
1 John 50 75
2 Nancy 80 60
3 NO NAME 83 88
4 Sally 75 75
5 NO NAME 93 95
Thanks for your help.
Sure: you are doing an inner join, so just change it to a left outer join.
SELECT table_results._id,
       COALESCE(table_names.name, 'NO NAME') AS name,
       table_results.score_1,
       table_results.score_2
FROM table_results
LEFT OUTER JOIN table_names
  ON (table_results.name_id = table_names._id);
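To see the behavior end to end, here is a small self-contained sketch using Python's built-in sqlite3 module (table and column names are taken from the question; the data is the question's sample with Bill already deleted):

```python
import sqlite3

# In-memory database with the question's schema; Bill (_id 2) has been removed
# from table_names, but table_results still references name_id 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_names (_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table_results (_id INTEGER PRIMARY KEY, name_id INTEGER,
                            score_1 INTEGER, score_2 INTEGER);
INSERT INTO table_names VALUES (1,'John'),(3,'Sally'),(4,'Nancy');
INSERT INTO table_results VALUES (1,1,50,75),(2,4,80,60),(3,2,83,88),
                                 (4,3,75,75),(5,2,93,95);
""")
# LEFT OUTER JOIN keeps the unmatched rows; COALESCE fills in 'NO NAME'.
rows = conn.execute("""
    SELECT table_results._id,
           COALESCE(table_names.name, 'NO NAME') AS name,
           table_results.score_1,
           table_results.score_2
    FROM table_results
    LEFT OUTER JOIN table_names
      ON table_results.name_id = table_names._id
    ORDER BY table_results._id;
""").fetchall()
for row in rows:
    print(row)
```

Rows 3 and 5 come back with 'NO NAME' instead of disappearing, which is exactly the requested output.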
I have a data cleaning/transformation problem which I've solved in a way which I'm 1,000% sure could have been solved much more simply.
Below is an example of what my data looks like initially. The first four columns are numbers I'll use for a lookup, the next is the type of the item, and the last two columns are the ones I want to fill. Based on the value of the type column I would like to fill in the value_one and value_two columns with the values of the same-numbered column of the matching type: either one_apple and two_apple, or one_orange and two_orange. For example, if a row's type is "apple", I would like to fill value_one with the value of one_apple for that row, and value_two with the value of two_apple from that row.
one_apple one_orange two_apple two_orange type value_one value_two
1 23 56 90 orange NA NA
2 24 57 91 orange NA NA
3 25 58 92 apple NA NA
4 26 59 93 apple NA NA
5 27 60 94 orange NA NA
6 28 61 95 apple NA NA
...
This is what I would like that dataframe to look like after I run my code:
one_apple one_orange two_apple two_orange type value_one value_two
1 23 56 90 orange 23 90
2 24 57 91 orange 24 91
3 25 58 92 apple 3 58
4 26 59 93 apple 4 59
5 27 60 94 orange 27 94
6 28 61 95 apple 6 61
...
The way I have solved this right now is with a for loop. For each row it figures out the indexes of the columns whose names match that row's type value, which(str_sub(names(example_data), start = 5) == example_data$type[i]). It then uses the first of those indexes to pull the correct value for the value_one column from the appropriate place, example_data[i,...)[1]], and assigns it to value_one. The same is done for value_two.
Below I have code which first creates an example dataset like the one I want to transform, and then shows my for loop running on it to transform the data.
library(stringr)  # str_sub() comes from stringr
example_data = data.frame(one_apple = 1:(1+30), one_orange = 23:(23+30),
                          two_apple = 56:(56+30), two_orange = 90:(90+30),
                          type = sample(c("apple", "orange"), 31, replace = TRUE),
                          value_one = rep(NA, 31), value_two = rep(NA, 31))
for (i in 1:nrow(example_data)) {
  matching_cols = which(str_sub(names(example_data), start = 5) == example_data$type[i])
  example_data$value_one[i] = example_data[i, matching_cols[1]]
  example_data$value_two[i] = example_data[i, matching_cols[2]]
}
This transformation works, but it is clearly not great code and I feel like I'm missing an easier way to do it with apply and without the convoluted use of which to grab column indexes and stuff. It would be very helpful to see a better way to do this.
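For illustration, here is the same row-wise column selection vectorized in Python/pandas (a sketch, not the asker's R code; the frame mirrors the first three rows of the example data above):

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the example data's columns.
df = pd.DataFrame({
    "one_apple":  [1, 2, 3],
    "one_orange": [23, 24, 25],
    "two_apple":  [56, 57, 58],
    "two_orange": [90, 91, 92],
    "type":       ["orange", "orange", "apple"],
})
is_apple = df["type"] == "apple"
# np.where picks element-wise from the apple column where type is "apple",
# otherwise from the orange column -- no loop, no index arithmetic.
df["value_one"] = np.where(is_apple, df["one_apple"], df["one_orange"])
df["value_two"] = np.where(is_apple, df["two_apple"], df["two_orange"])
print(df[["type", "value_one", "value_two"]])
```

The same pattern (an element-wise conditional selection between two columns) is what `ifelse()` gives you in base R.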
I have created a table that shows how much time each person in a team has spent on tasks each month.
Empl_level team_member 2022/05 2022/06 2022/07 2022/08
0 department 117 69 73 30
1 Diana 108 108 113 184
1 Irina 90 63 56 40
2 Inga 77 56 74 30
3 Elina 23 35 58 79
However, one of the "team members" is the department itself. How can I create a new dataset where the department's time is divided equally among the real team members?
Empl_level team_member 2022/05 2022/06
1 Diana 108+(117/4) 108+(69/4)
1 Irina 90+(117/4) 63+(69/4)
2 Inga 77+(117/4) etc.
3 Elina 23+(117/4)
Using data.table, something like the following could work:
library(data.table)
setDT(df)
df[, names(df)[-(1:2)] := lapply(.SD, function(x) {x + x[1]/4}), .SDcols = !1:2][-1]
The [-1] at the end removes the first "department" row.
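For comparison, the same arithmetic can be sketched in Python/pandas (the frame below mirrors the first two month columns of the question's table; column names are taken from it):

```python
import pandas as pd

# Hypothetical frame mirroring the question's data: row 0 is the department.
df = pd.DataFrame({
    "Empl_level": [0, 1, 1, 2, 3],
    "team_member": ["department", "Diana", "Irina", "Inga", "Elina"],
    "2022/05": [117, 108, 90, 77, 23],
    "2022/06": [69, 108, 63, 56, 35],
})
month_cols = df.columns[2:]
dept = df.loc[0, month_cols]              # the department row's month values
members = df.iloc[1:].copy()              # drop the department row
# Spread the department's time evenly over the remaining members.
members[month_cols] = members[month_cols] + dept / (len(df) - 1)
print(members)
```

As in the data.table answer, the key steps are: take the first row's month values, divide by the number of real members, add the result to every member column, then drop the department row.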
Using the prev() function I can access previous rows individually.
mytable
| sort by Time asc
| extend mx = max_of(prev(Value, 1), prev(Value, 2), prev(Value, 3))
How can I define the aggregation window in a more generic way? Say I need the maximum of the 100 previous rows. How do I write a query that does not require repeating prev() 100 times?
This can be achieved by combining scan and series_stats_dynamic():
scan is used to build an array of the last x values, per record.
series_stats_dynamic() is used to get the max value of each array.
// Data sample generation. Not part of the solution
let mytable = materialize(range i from 1 to 15 step 1 | extend Time = ago(1d*rand()), Value = toint(rand(100)));
// Solution starts here
let window_size = 3; // >1
mytable
| order by Time asc
| scan declare (last_x_vals:dynamic)
with
(
step s1 : true => last_x_vals = array_concat(array_slice(s1.last_x_vals, -window_size + 1, -1), pack_array(Value));
)
| extend toint(series_stats_dynamic(last_x_vals).max)
i  Time                          Value  last_x_vals  max
5  2022-06-10T11:25:49.9321294Z  45     [45]         45
14 2022-06-10T11:54:13.3729674Z  82     [45,82]      82
2  2022-06-10T13:25:40.9832745Z  44     [45,82,44]   82
1  2022-06-10T17:38:28.3230397Z  24     [82,44,24]   82
7  2022-06-10T18:29:33.926463Z   17     [44,24,17]   44
15 2022-06-10T19:54:33.8253844Z  9      [24,17,9]    24
3  2022-06-10T20:17:46.1347592Z  43     [17,9,43]    43
12 2022-06-11T00:02:55.5315197Z  94     [9,43,94]    94
9  2022-06-11T00:11:18.5924511Z  61     [43,94,61]   94
11 2022-06-11T00:39:40.6858444Z  38     [94,61,38]   94
4  2022-06-11T03:54:59.418534Z   84     [61,38,84]   84
10 2022-06-11T05:55:38.2904242Z  6      [38,84,6]    84
6  2022-06-11T07:25:43.3977923Z  36     [84,6,36]    84
13 2022-06-11T09:36:08.7904844Z  28     [6,36,28]    36
8  2022-06-11T09:51:45.2225391Z  73     [36,28,73]   73
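As a cross-check of the logic, the same "max of the last window_size values, current row included" can be computed in Python/pandas with a rolling window (the values below are the first seven from the sorted table above):

```python
import pandas as pd

window_size = 3
# First seven Values from the example, already sorted by Time.
values = pd.Series([45, 82, 44, 24, 17, 9, 43])
# rolling(..., min_periods=1) mirrors the scan step: windows shorter than
# window_size at the start of the series still produce a result.
mx = values.rolling(window_size, min_periods=1).max()
print(mx.tolist())
```

The resulting maxima match the max column of the table above for those rows.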
I have a data file that has my subject responses listed by their emails, and I have another file with each subject's email next to his/her subject ID. How do I replace all the emails in the main data file with their subject IDs?
One of the things that's great about R is the ease with which one can create a minimal, complete and verifiable example. For this question, it's a simple matter of generating some example data, reading it into R and working out a potential solution. We'll create a list of student email addresses, IDs, and a separate data set containing exam scores.
nameData <- "email ID
Alicia@gmail.com 1
Jane@aol.com 2
Thomas@msn.com 3
Henry@yale.edu 4
LaShawn@uga.edu 5
"
examData <- "email exam1 exam2 exam3
Alicia@gmail.com 98 77 87
Jane@aol.com 99 88 93
Thomas@msn.com 73 62 73
Henry@yale.edu 100 98 99
LaShawn@uga.edu 84 98 92"
names <- read.table(text=nameData,header=TRUE,stringsAsFactors=FALSE)
exams <- read.table(text=examData,header=TRUE,stringsAsFactors=FALSE)
# merge data and drop email, which is first column
mergedData <- merge(names,exams)[,-1]
mergedData[order(mergedData$ID),]
...and the output, sorted by ID:
> mergedData[order(mergedData$ID),]
ID exam1 exam2 exam3
1 1 98 77 87
3 2 99 88 93
5 3 73 62 73
2 4 100 98 99
4 5 84 98 92
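The same merge-then-drop idea translates directly to Python/pandas; here is a minimal sketch using two of the rows above:

```python
import pandas as pd

# Lookup table: email -> ID (subset of the example data above).
names = pd.DataFrame({"email": ["Alicia@gmail.com", "Jane@aol.com"],
                      "ID": [1, 2]})
# Main data, keyed by email (row order intentionally differs).
exams = pd.DataFrame({"email": ["Jane@aol.com", "Alicia@gmail.com"],
                      "exam1": [99, 98]})
# Inner-join on email, then drop the email column, keeping only the ID.
merged = names.merge(exams, on="email").drop(columns="email")
print(merged.sort_values("ID"))
```

As in the R version, the join key is the email column and the final frame carries only the ID and the scores.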
I have the following bill table
building name amount payments receiptno
1234 name a 123 0 0
1234 name a 12 10 39
1234 name a 125 125 40
1235 name a 133 10 41
1235 name b 125 125 50
1234 name c 100 90 0
I want to select the rows where amount minus payments is greater than zero, and display the max value of receiptno for each name.
So from building 1234 I want to select only the following:
name a 39
name c 0
How can I do this?
Translating your description into SQL results in this:
SELECT building,
       name,
       MAX(receiptno)
FROM BillTable
WHERE amount - payments > 0
GROUP BY building,
         name;
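As a quick sanity check, the query can be run against the question's sample rows with Python's built-in sqlite3 module (a sketch; BillTable is the table name used in the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE BillTable
                (building INT, name TEXT, amount INT, payments INT, receiptno INT)""")
# The six sample rows from the question.
conn.executemany("INSERT INTO BillTable VALUES (?,?,?,?,?)", [
    (1234, "name a", 123, 0, 0),
    (1234, "name a", 12, 10, 39),
    (1234, "name a", 125, 125, 40),
    (1235, "name a", 133, 10, 41),
    (1235, "name b", 125, 125, 50),
    (1234, "name c", 100, 90, 0),
])
# Filter to unpaid rows first, then take the max receipt per building/name.
rows = conn.execute("""
    SELECT building, name, MAX(receiptno)
    FROM BillTable
    WHERE amount - payments > 0
    GROUP BY building, name
""").fetchall()
print(rows)
```

For building 1234 this yields (1234, 'name a', 39) and (1234, 'name c', 0), matching the requested output; fully paid rows (and their receipt numbers, such as 40 and 50) are excluded by the WHERE clause before grouping.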