How to create a new table from two different tables in sqlite3? - sqlite

I want to merge two tables to create a new one. My first table, Data, holds this kind of information:
cell_id i j
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 1 2
7 2 2
8 3 2
9 4 2
10 5 2
The second table, named Layer, looks like this; each record contains a geometry:
geom
blob
blob
blob
blob
blob
I want to create a new layer, or insert the values from Data into Layer, for the rows where j=1 (that is, 5 rows of Data, the same as the number of rows in Layer). Like this:
cell_id i j geom
1 1 1 blob
2 2 1 blob
3 3 1 blob
4 4 1 blob
5 5 1 blob
How can I handle it in sqlite3?

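You may want to use CREATE VIEW; that might be a solution. Note that j is a column of Data, not Layer, and that a JOIN without a condition pairs every Layer row with every Data row.
CREATE VIEW LayerData AS
SELECT D.cell_id, D.i, D.j, L.geom
FROM Data D
-- assumes the n-th row of Layer belongs to cell_id n
JOIN Layer L ON L.rowid = D.cell_id
WHERE D.j = 1;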
Or, as stated by Ignacio Vazquez-Abrams, you can use CREATE TABLE ... AS.
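For what it's worth, here is a minimal sketch of the CREATE TABLE ... AS route using Python's sqlite3 module. The sample rows, the placeholder blobs and the new table name LayerData are made up for illustration, and it assumes the n-th row of Layer (its rowid) corresponds to cell_id n in Data, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Rebuild the two tables from the question with sample data.
cur.execute("CREATE TABLE Data (cell_id INTEGER PRIMARY KEY, i INTEGER, j INTEGER)")
cur.execute("CREATE TABLE Layer (geom BLOB)")
cur.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [(n, (n - 1) % 5 + 1, (n - 1) // 5 + 1) for n in range(1, 11)],
)
# Placeholder one-byte blobs stand in for the real geometries.
cur.executemany(
    "INSERT INTO Layer (geom) VALUES (?)",
    [(bytes([n]),) for n in range(1, 6)],
)

# Assumption: Layer's implicit rowid lines up with Data.cell_id.
cur.execute(
    """
    CREATE TABLE LayerData AS
    SELECT D.cell_id, D.i, D.j, L.geom
    FROM Data AS D
    JOIN Layer AS L ON L.rowid = D.cell_id
    WHERE D.j = 1
    """
)

rows = cur.execute(
    "SELECT cell_id, i, j, geom FROM LayerData ORDER BY cell_id"
).fetchall()
for row in rows:
    print(row)
```

If the rows of the two tables are not aligned that way, you would need some real key column to join on instead of rowid.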

Related

Transposing column on data frame from multiple different tables

I'm working with RStudio version 1.0.143.
I need to transpose one column so that its values become the variable names, without losing the associated values (freq_var and freq_mut in this example).
I also need to know which table each row came from (Sample 1, Sample 2, etc.).
The main problem is how to combine everything even when a value of Gene is absent from Sample 1 but present in Sample 2 (the NAs in the example).
I could do it manually, but my table contains thousands of values for each variable!
Sample 1
Freq_var Freq_mut Gene
2 2 A
3 3 B
2 5 C
Sample 2
Freq_var Freq_mut Gene
1 2 A
1 1 B
1 1 D
To:
          A(Freq_var)  B(Freq_var)  C(Freq_var)  D(Freq_var)  A(Freq_mut) .....
Sample 1  2            3            2            NA           2
Sample 2  1            1            NA           1            2

Generate three level dependency in case a verb is attached with non verb in dependency parsing

I am using dependency parsing in R with the coreNLP package, but I need to reshape the resulting data frame for a specific use case.
I need a data frame with three columns. I used the code below to get as far as the dependency tree.
devtools::install_github("statsmaths/coreNLP")
coreNLP::downloadCoreNLP()
library(coreNLP)  # needed so the unqualified calls below resolve
initCoreNLP()
inp_cl = "generate odd numbers from column one and print."
output = annotateString(inp_cl)
dc = getDependency(output)
sentence governor dependent type governorIdx dependentIdx govIndex depIndex
1 1 ROOT generate root 0 1 NA 1
2 1 numbers odd amod 3 2 3 2
3 1 generate numbers dobj 1 3 1 3
4 1 column from case 5 4 5 4
5 1 generate column nmod:from 1 5 1 5
6 1 column one nummod 5 6 5 6
7 1 column and cc 5 7 5 7
8 1 generate print nmod:from 1 8 1 8
9 1 column print conj:and 5 8 5 8
10 1 generate . punct 1 7 1 10
Using POS tagging with the following code, I ended up with the following data frame.
ps = getToken(output)
ps = ps[,c(1,2,7,3)]
colnames(dc)[8] = "id"
dp = merge(dc, ps[,c("sentence","id","POS")],
by.x=c("sentence","governorIdx"),by.y = c("sentence","id"),all.x = T)
dp = merge(dp, ps[,c("sentence","id","POS")],
by.x=c("sentence","dependentIdx"),by.y = c("sentence","id"),all.x = T)
colnames(dp)[9:10] = c("POS_gov","POS_dep")
sentence dependentIdx governorIdx governor dependent type govIndex id POS_gov POS_dep
1 1 1 0 ROOT generate root NA 1 <NA> VB
2 1 2 3 numbers odd amod 3 2 NNS JJ
3 1 3 1 generate numbers dobj 1 3 VB NNS
4 1 4 5 column from case 5 4 NN IN
5 1 5 1 generate column nmod:from 1 5 VB NN
6 1 6 5 column one nummod 5 6 NN CD
7 1 7 5 column and cc 5 7 NN CC
8 1 8 1 generate print nmod:from 1 8 VB NN
9 1 8 5 column print conj:and 5 8 NN NN
10 1 9 1 generate . punct 1 9 VB .
If a verb (action word) is attached to a non-verb (non-action word), and that non-verb is in turn connected to other non-verbs, then one row should capture the entire chain. E.g., generate is a verb connected to numbers, and numbers is a non-verb connected to odd.
So the intended data frame needs to be:
Topic1 Topic2 Action
numbers odd generate
column from generate
column one generate
column and generate
column from print
column one print
column and print
. generate
First, you'll need your dependency parser to tag print as a verb rather than a noun.
Try a sentence with two independent clauses, and see whether the root of the second independent clause is tagged as such.
If so, it's a simple walk through the governorIdx column. If not, you'll need to address the mechanics of your dependency tree generator.

join a bigger data set to a smaller one keeping the number of rows of the smaller in R

I have an F1 Data Frame with pit stops called pitStops:
DriverId stop lap
1 1 3
2 1 4
3 1 3
4 1 2
and I have another data frame with the driver position lap by lap called posLap:
driverId lap Position
1 1 1
1 2 1
1 3 3
1 4 3
2 1 2
2 2 2
2 3 2
2 4 5
When I do a merge, a left_join or any other kind of join, the pit stop data frame gains rows, which seems to happen because R is coercing to a character vector.
The code I have written is the following:
AllAusPit2017 = inner_join(AllAusPit2017, AllAusPos2017, by = "driverId", "lap")
I am doing the join based on driverId and lap.
What I would like to see is the following:
driverId stop lap position
1 1 3 3
2 1 4 5
and so on for the rest of the drivers. Is this something that R can do? Please let me know if I have not explained myself correctly.
Try this:
library(dplyr)
AllAusPit2017 = left_join(AllAusPit2017, AllAusPos2017, by = c("DriverId" = "driverId", "lap"))
When joining on multiple columns, the by argument needs to be supplied as a single vector. Your original code only took "driverId" into account, because "lap" was passed as a separate argument.

How to keep User ID using Rtsne package

I want to use t-SNE to visualize users' variables, but I also want to be able to join the result to each user's social information.
Unfortunately, the output of Rtsne doesn't seem to include the user ID.
The data looks like this:
client_id recency frequen monetary
1 2 1 1 1
2 3 3 1 2
3 4 1 1 2
4 5 3 1 1
5 6 4 1 2
6 7 5 1 1
and the Rtsne output:
x y
1 -6.415009 -0.4726438
2 -16.027732 -9.3751709
3 17.947615 0.2561859
4 1.589996 13.8016613
5 -9.332319 -13.2144419
6 10.545698 8.2165265
and the code:
tsne = Rtsne(rfm[, -1], dims=2, check_duplicates=F)
Rtsne preserves the input order of the dataframe you pass to it.
Try:
Tsne_with_ID = cbind.data.frame(rfm[,1], tsne$Y)  # the embedding is stored in $Y (capital Y)
and then just fix the first column name:
colnames(Tsne_with_ID)[1] <- colnames(rfm)[1]

SQLite delete old data

I would like to delete old data from a database table, keeping only the last 2 records per ID. For example, I have a table with the following records:
ID TIME DATA
1 2 3
1 3 4
1 4 5
2 2 3
2 3 4
2 4 5
2 5 6
The result I would like is (it must be sorted by TIME):
ID TIME DATA
1 3 4
1 4 5
2 4 5
2 5 6
Thank you for your help.
A solution could be:
select * from tab where (
select count(*) from tab as t
where t.ID = tab.ID and t.TIME >= tab.TIME
) <= 2
order by ID, TIME;
for more details visit:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
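The query above only selects the rows to keep. If you actually want to delete the older rows, one possible sketch is to invert the condition in a DELETE, shown here through Python's sqlite3 module. The table name tab comes from the answer; the in-memory database and the sample rows copied from the question are just for illustration, and it assumes TIME is unique within each ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INTEGER, TIME INTEGER, DATA INTEGER)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, 2, 3), (1, 3, 4), (1, 4, 5),
     (2, 2, 3), (2, 3, 4), (2, 4, 5), (2, 5, 6)],
)

# Delete every row that has more than one newer row for the same ID,
# i.e. rows that are not among the two most recent per ID.
conn.execute(
    """
    DELETE FROM tab
    WHERE (
        SELECT COUNT(*) FROM tab AS t
        WHERE t.ID = tab.ID AND t.TIME >= tab.TIME
    ) > 2
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT ID, TIME, DATA FROM tab ORDER BY ID, TIME"
).fetchall()
for row in remaining:
    print(row)
```

If two rows of the same ID can share a TIME value, the count-based condition may keep or drop more rows than intended, so you would need a tie-breaker such as rowid.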
