Multiple conditions & query cypher / apoc - graph

I ran the following query, which uses multiple apoc.do.when calls, but it seems that only the first apoc.do.when is executed:
load csv from "file:///D:leads.csv"
as row
FIELDTERMINATOR ','
WITH row[0] as id,
row[1] as fname,
row[2] as lname,
row[4] as email1,
row[5] as email2,
row[6] as phone1,
row[7] as phone2,
row[8] as phone3,
split(row[11]," ") as birthDay
LIMIT 5
MERGE (l:Lead {id:id})
with l as leadRef,email1,email2,phone1,phone2,phone3,fname,lname,id
CALL apoc.do.when(email1 is not null,
'MERGE (e1:Email {value:email}) MERGE (l)-[r:Has_Email]->(e1)',
'',{email:email1,l:leadRef}) YIELD value
WITH value AS ignored,leadRef,email1,email2,phone1,phone2,phone3,fname,lname,id
CALL apoc.do.when(phone1 is not null,
'MERGE (p1:Phone {value:phone}) MERGE (l)-[r:Has_Phone]->(p1)',
'',{phone:phone1,l:leadRef}) YIELD value
WITH value AS ignored2,leadRef,email1,email2,phone1,phone2,phone3,fname,lname,id
CALL apoc.do.when(phone2 is not null,
'MERGE (p2:Phone {value:phone}) MERGE (l)-[r:Has_Phone]->(p2)',
'',{phone:phone2,l:leadRef}) YIELD value
WITH value AS ignored3,leadRef,email1,email2,phone1,phone2,phone3,fname,lname,id
return true
Is there a way to execute multiple queries based on multiple conditions, without the query stopping after the first statement?

Make use of UNION between each CALL apoc.do.when.

QUERY formula to return only the latest record per email based on timestamp

Having this Google Sheet QUERY formula:
=QUERY(FORMULARIO!A4:Q, " select G,E,K,M,N,L,O,Q where not upper(J) = 'PAGÓ' and A > date '"&TEXTO(HOY()-3, "yyyy-mm-dd")&"' and E is not null and O is not null and Q contains '#' and not Q matches '"&TEXTJOIN("|", 1, QUERY(FORMULARIO!J4:Q, "select Q where upper(J) = 'PAGÓ' ", 0))&"'", 0)
How can I return only one record per email (column Q), namely the most recent one according to the date in column A?
TEST SHEET:
https://docs.google.com/spreadsheets/d/1jBMo42cbylrNHcf9b9QLI4oDw0H0oomCGrEAuPsBlA8/edit#gid=1465616509
Under the tab "results" is the working query. I only need to filter out duplicates based on recency, leaving only the most recent record, with email as the unique identifier.
EDIT
(following OP's shared demo sheet)
The formula to use would be:
=SORT(SORTN(QUERY(FORMULARIO!A4:Q, " select G,E,K,M,N,L,O,Q where not upper(J) = 'PAGÓ' and A > datetime '"&TEXTO(AHORA()-3, "yyyy-mm-dd HH:mm:ss")&"' and E is not null and O is not null and Q contains '#' and not Q matches '"&TEXTJOIN("|", 1, QUERY(FORMULARIO!J4:Q, "select Q where upper(J) = 'PAGÓ' ", 0))&"'", 1),9^9,2,2,1),1,1)
What was changed:
We wrapped your original query in the SORTN function to find and exclude all duplicate emails from column Q except the most recent one, based on the timestamp in column A.
We then wrapped the results once more, in the SORT function, so that the final results are sorted by the timestamp.
Pro tip
An extra change was also made within the main query by changing
and A > date '"&TEXTO(HOY()-3, "yyyy-mm-dd")&"'
to
and A > datetime '"&TEXTO(AHORA()-3, "yyyy-mm-dd HH:mm:ss")&"'
So HOY (TODAY) became AHORA (NOW).
This way the comparison uses both the date and the time present in the timestamp instead of just the date.
(There is still room for minor improvements/alterations but out of the scope of this question.)
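For readers who want to check the dedupe logic outside Sheets, here is a minimal Python sketch of the same idea: keep one row per email, preferring the latest timestamp, then sort the survivors by timestamp. The field names (email, timestamp, name) and the sample rows are made up for illustration and are not the columns of the demo sheet.

```python
from datetime import datetime


def latest_per_email(rows):
    """Keep only the most recent row for each email, sorted by timestamp."""
    latest = {}
    for row in rows:
        key = row["email"]
        # Replace the stored row only when this one is newer.
        if key not in latest or row["timestamp"] > latest[key]["timestamp"]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: r["timestamp"])


# Hypothetical sample data: two records share the same email.
rows = [
    {"email": "a@example.com", "timestamp": datetime(2021, 5, 1, 9, 0), "name": "first"},
    {"email": "b@example.com", "timestamp": datetime(2021, 5, 2, 8, 30), "name": "only"},
    {"email": "a@example.com", "timestamp": datetime(2021, 5, 3, 17, 45), "name": "second"},
]

result = latest_per_email(rows)
for r in result:
    print(r["email"], r["timestamp"], r["name"])
```

Comparing full datetimes rather than dates is what lets two records from the same day be ordered correctly.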
Original answer
Assuming your formula works as expected (I cannot check it without a test sheet) but produces multiple rows, you can add a limit clause at the end of your query.
=QUERY(FORMULARIO!A4:Q, " select G,E,K,M,N,L,O,Q where not upper(J) = 'PAGÓ' and A > date '"&TEXTO(HOY()-3, "yyyy-mm-dd")&"' and E is not null and O is not null and Q contains '#' and not Q matches '"&TEXTJOIN("|", 1, QUERY(FORMULARIO!J4:Q, "select Q where upper(J) = 'PAGÓ' ", 0))&"' limit 1", 0)

Converting a string to be used in in-clause Teradata

I have a string like this: ('car, bus, train').
I want to convert it so that it can be used in an IN clause. Basically, I want to convert it to
('car','bus','train'). How do I do this in Teradata?
I don't know how you are getting data like that, but if you have no control over it, you can use STRTOK_SPLIT_TO_TABLE.
select t.* from table (strtok_split_to_table(1,'car, bus, train',',')
returns (outkey integer,tokennum integer,resultstring varchar(25))) as t
Run by itself, that gives you:
outkey tokennum resultstring
1 1 car
1 2 bus
1 3 train
You can use that as a derived table and join it to the table you want to filter by. Something like:
select
<your table>.*
from
<your table>
inner join (select t.* from table (strtok_split_to_table(1,'car, bus, train',',')
returns (outkey integer,tokennum integer,resultstring varchar(25))) as t) dt
on yourtable.yourcolumn = dt.resultstring
Here is another way of splitting the input, for any number of commas, and using the result in an IN clause.
SELECT regexp_substr('car,bus,train','[^,]+',1,day_of_calendar) fields
FROM sys_calendar.calendar
WHERE day_of_calendar <= (CHAR('car,bus,train') - CHAR(oreplace('car,bus,train',',','')))+1;
Output of the Query
fields
~~~~~~~~
bus
car
train
Here is the syntax to use in a WHERE clause:
SELECT * FROM <your table>
WHERE yourtable.requiredColumn in
(
SELECT regexp_substr('car,bus,train','[^,]+',1,day_of_calendar) fields
FROM sys_calendar.calendar
WHERE
day_of_calendar <= (CHAR('car,bus,train') - CHAR(oreplace('car,bus,train',',','')))+1
);
Basically, we split the string at each comma. The expression below counts the commas in the string and adds 1, which gives the number of items to extract:
(CHAR('car,bus,train') - CHAR(oreplace('car,bus,train',',','')))+1
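To make the arithmetic concrete, here is a small Python sketch (not Teradata code) of the two ideas above: the number of items equals the number of commas plus one, and each item becomes its own quoted value. The helper name to_in_list is hypothetical, and the sketch trims the spaces around each item, which is an assumption about the desired result.

```python
def to_in_list(csv_text):
    """Turn 'car, bus, train' into "('car','bus','train')"."""
    # Split at each comma and drop surrounding spaces and empty items.
    tokens = [t.strip() for t in csv_text.split(",") if t.strip()]
    # Double any embedded single quote so each SQL literal stays valid.
    quoted = ["'" + t.replace("'", "''") + "'" for t in tokens]
    return "(" + ",".join(quoted) + ")"


text = "car, bus, train"
token_count = text.count(",") + 1  # commas + 1 = number of items
in_list = to_in_list(text)

print(token_count)
print(in_list)
```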

u-sql script to search for a string then Groupby that string and get the count of distinct files

I am quite new to U-SQL and am trying to solve the following problem:
str1=\global\europe\Moscow\12345\File1.txt
str2=\global.bee.com\europe\Moscow\12345\File1.txt
str3=\global\europe\amsterdam\54321\File1.Rvt
str4=\global.bee.com\europe\amsterdam\12345\File1.Rvt
Case 1:
How do I get just "\europe\Moscow\12345\File1.txt" from the string variables str1 and str2? I want to take only that part ("\europe\Moscow\12345\File1.txt") from str1 and str2, then group by "\global\europe\Moscow\12345" and count the distinct files under the path "\europe\Moscow\12345\".
so the output would be something like this:
distinct_filesby_Location_Date
To solve the above case I tried the U-SQL code below, but I am not sure whether the script is right:
@inArray = SELECT new SQL.ARRAY<string>(
filepath.Contains("\\europe")) AS path
FROM @t;
@filesbyloc =
SELECT [ID],
path.Trim() AS path1
FROM @inArray
CROSS APPLY
EXPLODE(path1) AS r(location);
OUTPUT @filesbyloc
TO "/Outputs/distinctfilesbylocation.tsv"
USING Outputters.Tsv();
Any help would be greatly appreciated.
One approach is to put all the strings you want to work with in a file, e.g. strings.txt, and save it in your U-SQL input folder. Also create a file with the cities you want to match, e.g. cities.txt. Then try the following U-SQL script:
@input =
    EXTRACT filepath string
    FROM "/input/strings.txt"
    USING Extractors.Tsv();
// Give the strings a row-number
@input =
    SELECT ROW_NUMBER() OVER() AS rn,
           filepath
    FROM @input;
// Get the cities
@cities =
    EXTRACT city string
    FROM "/input/cities.txt"
    USING Extractors.Tsv();
// Ensure there is a lower-case version of city for matching / joining
@cities =
    SELECT city,
           city.ToLower() AS lowercase_city
    FROM @cities;
// Split the filepath into an array of path elements
@working =
    SELECT rn,
           new SQL.ARRAY<string>(filepath.Split('\\')) AS pathElement
    FROM @input AS i;
// Explode the array into separate rows, also changing to lower case
@working =
    SELECT rn,
           x.pathElement.ToLower() AS pathElement
    FROM @working AS i
         CROSS APPLY
             EXPLODE(pathElement) AS x(pathElement);
// Create the output query, joining on lower case city name, display, normal case name
@output =
    SELECT c.city,
           COUNT( * ) AS records
    FROM @working AS w
         INNER JOIN
             @cities AS c
         ON w.pathElement == c.lowercase_city
    GROUP BY c.city;
// Output the result
OUTPUT @output TO "/output/output.txt"
USING Outputters.Tsv();
//OUTPUT @working TO "/output/output2.txt"
//USING Outputters.Tsv();
HTH
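The split-and-match logic of the script above can be sketched in plain Python, which may help when checking expected counts before running a job. The four paths are the ones from the question; the contents of the city list are an assumption, since cities.txt is not shown.

```python
from collections import Counter

# The four sample strings from the question.
paths = [
    r"\global\europe\Moscow\12345\File1.txt",
    r"\global.bee.com\europe\Moscow\12345\File1.txt",
    r"\global\europe\amsterdam\54321\File1.Rvt",
    r"\global.bee.com\europe\amsterdam\12345\File1.Rvt",
]
# Hypothetical contents of cities.txt.
cities = ["Moscow", "Amsterdam"]

# Map the lower-case name to the display name, like lowercase_city in the script.
by_lower = {c.lower(): c for c in cities}

counts = Counter()
for path in paths:
    # Split the path into its elements and match each one case-insensitively.
    for element in path.split("\\"):
        city = by_lower.get(element.lower())
        if city is not None:
            counts[city] += 1

for city, records in sorted(counts.items()):
    print(city, records)
```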
I took the liberty of formatting your input file as a TSV file. Without knowing all the column semantics, here is one way to write your query. Please note the assumptions I made, as stated in the comments.
@d =
    EXTRACT path string,
            user string,
            num1 int,
            num2 int,
            start_date string,
            end_date string,
            flag string,
            year int,
            s string,
            another_date string
    FROM @"\users\temp\citypaths.txt"
    USING Extractors.Tsv(encoding: Encoding.Unicode);
// I assume that you have only one DateTime format culture in your file.
// If it becomes dependent on the region or city as expressed in the path, you need to add a lookup.
@d =
    SELECT new SqlArray<string>(path.Split('\\')) AS steps,
           DateTime.Parse(end_date, new CultureInfo("fr-FR", false)).Date.ToString("yyyy-MM-dd") AS end_date
    FROM @d;
// This assumes your paths have a fixed formatting/mapping into the city
@d =
    SELECT steps[4].ToLowerInvariant() AS city,
           end_date
    FROM @d;
@res =
    SELECT city,
           end_date,
           COUNT( * ) AS count
    FROM @d
    GROUP BY city,
             end_date;
OUTPUT @res
TO "/output/result.csv"
USING Outputters.Csv();
// Now let's pivot the date and count.
@res2 =
    SELECT city, MAP_AGG(end_date, count) AS date_count
    FROM @res
    GROUP BY city;
// This assumes you know exactly which dates you are looking for. Otherwise keep it in the first file representation.
@res2 =
    SELECT city,
           date_count["2016-11-21"] AS [2016-11-21],
           date_count["2016-11-22"] AS [2016-11-22]
    FROM @res2;
// Write the pivoted rowset only after @res2 has been defined.
OUTPUT @res2
TO "/output/res2.csv"
USING Outputters.Csv();
UPDATE AFTER RECEIVING SOME EXAMPLE DATA IN PRIVATE EMAIL:
Based on the data you sent me, you want to pivot the rowset (city, count, date) into the rowset (date, city1, city2, ...), where each row contains the date and the counts for each city. This comes after extracting and counting the cities, which you could do either with the join outlined in Bob's answer, where you need to know your cities in advance, or by taking the city from its position in the path as in my example, where you do not.
You could easily adjust my example above by changing the calculations of @res2 in the following way:
// Now let's pivot the city and count.
@res2 = SELECT end_date, MAP_AGG(city, count) AS city_count
        FROM @res
        GROUP BY end_date;
// This assumes you know exactly which cities you are looking for. Otherwise keep it in the first file representation or use a script generation (see below).
@res2 =
    SELECT end_date,
           city_count["istanbul"] AS istanbul,
           city_count["midlands"] AS midlands,
           city_count["belfast"] AS belfast,
           city_count["acoustics"] AS acoustics,
           city_count["amsterdam"] AS amsterdam
    FROM @res2;
Note that, as in my example, you will need to enumerate all cities in the pivot statement by looking them up in the SQL.MAP column. If the cities are not known a priori, you will first have to submit a script that creates the script for you. For example, assuming your city, count, date rowset is in a file (or you could duplicate the statements that generate the rowset in both the generation script and the generated script), you could write the following script. Then take the result and submit it as the actual processing script.
// Get the rowset (could also be the actual calculation from the original file)
@in = EXTRACT city string, count int?, date string
      FROM "/users/temp/Revit_Last2Months_Results.tsv"
      USING Extractors.Tsv();
// Generate the statements for the preparation of the data before the pivot
@stmts = SELECT * FROM (VALUES
    ( "@s1", "EXTRACT city string, count int?, date string FROM \"/users/temp/Revit_Last2Months_Results.tsv\" USING Extractors.Tsv();"),
    ( "@s2", "SELECT date, MAP_AGG(city, count) AS city_count FROM @s1 GROUP BY date;" )
    ) AS T( stmt_name, stmt);
// Now generate the statement doing the pivot (reads the distinct cities from @in)
@cities = SELECT DISTINCT city FROM @in;
@pivots =
    SELECT "@s3" AS stmt_name, "SELECT date, "+String.Join(", ", ARRAY_AGG("city_count[\""+city+"\"] AS ["+city+"]"))+ " FROM @s2;" AS stmt
    FROM @cities;
// Now generate the OUTPUT statement after the pivot. Note that the OUTPUT does not have a statement name.
@output =
    SELECT "OUTPUT @s3 TO \"/output/pivot_gen.tsv\" USING Outputters.Tsv();" AS stmt
    FROM (VALUES(1)) AS T(x);
// Now put the statements into one rowset. Note that nulls order high in U-SQL
@result =
    SELECT stmt_name, "=" AS assign, stmt FROM @stmts
    UNION ALL SELECT stmt_name, "=" AS assign, stmt FROM @pivots
    UNION ALL SELECT (string) null AS stmt_name, (string) null AS assign, stmt FROM @output;
// Now output the statements in order of the stmt_name
OUTPUT @result
TO "/pivot.usql"
ORDER BY stmt_name
USING Outputters.Text(delimiter:' ', quoting:false);
Now download the file and submit it.
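The reason for generating a script is that the pivot columns (the cities) are only known once the data has been read. As a rough illustration of that same pivot, here is a Python sketch that discovers the cities from the rows and builds one output row per date. The function name pivot_by_date and the sample values are invented for the example and do not come from the data discussed above.

```python
def pivot_by_date(rows):
    """Pivot (city, count, date) rows into one row per date with a column per city."""
    # Discover the pivot columns from the data instead of hard-coding them.
    cities = sorted({city for city, _, _ in rows})
    table = {}
    for city, count, date in rows:
        table.setdefault(date, {})[city] = count
    header = ["date"] + cities
    # A city with no record for a date gets None, similar to a missing map key.
    body = [[date] + [table[date].get(c) for c in cities] for date in sorted(table)]
    return header, body


# Hypothetical sample rowset: city, count, date.
rows = [
    ("amsterdam", 3, "2016-11-21"),
    ("belfast", 1, "2016-11-21"),
    ("amsterdam", 5, "2016-11-22"),
]

header, body = pivot_by_date(rows)
print(header)
for line in body:
    print(line)
```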

Concatenating Quotation marks Around SQL Results to use in another Query using R

I have two queries: the first one gets a list of IDs of character type, and I then want to use those IDs in a second query.
library(RODBC)
connection <- odbcConnect(dsn = "production", uid = "user1", pwd = "p#ssw0rd")
IDs_of_Events<-sqlQuery(connection,
"SELECT eventid
FROM ngh_events
WHERE event_period = 2"
)
Count_Attendes<-sqlQuery(connection,
paste("SELECT eventid, COUNT(attendee_ID)
FROM paid_events
WHERE eventid IN (", IDs_of_Events , ")
GROUP BY eventid", sep="")
)
The problem is that I am unable to concatenate the list of eventids from the first query result into the form "EventID", "EventID", "EventID".
You have to collapse the IDs into a single string using paste. sqlQuery returns a data frame, so paste the eventid column rather than the whole data frame:
paste0("SELECT eventid, COUNT(attendee_ID)
FROM paid_events
WHERE eventid IN ('", paste(IDs_of_Events$eventid, collapse = "','"), "')
GROUP BY eventid")
Note that I added the opening and closing single quotes in the other pieces. If you do this a lot, it's worth writing a wrapper for paste that does this stuff for you.

sqlite - how do I get a one row result back? (luaSQLite3)

How can I get a single-row result (e.g. in the form of a table/array) back from an SQL statement using Lua SQLite (LuaSQLite3)? For example, this one:
SELECT * FROM sqlite_master WHERE name ='myTable';
So far I note:
using "nrows"/"rows" it gives an iterator back
using "exec" it doesn't seem to give a result back(?)
Specific questions are then:
Q1 - How to get a single row (say first row) result back?
Q2 - How to get row count? (e.g. num_rows_returned = db:XXXX(sql))
In order to get a single row use the db:first_row method. Like so.
row = db:first_row("SELECT `id` FROM `table`")
print(row.id)
In order to get the row count use the SQL COUNT statement. Like so.
row = db:first_row("SELECT COUNT(`id`) AS count FROM `table`")
print(row.count)
EDIT: Ah, sorry for that. Here are some methods that should work.
You can also use db:nrows. It returns an iterator, so loop over it and stop after the first row. Like so.
for row in db:nrows("SELECT `id` FROM `table`") do
    print(row.id)
    break -- only the first row is needed
end
We can also modify this to get the number of rows.
for row in db:nrows("SELECT COUNT(`id`) AS count FROM `table`") do
    print(row.count)
end
Here is a demo of getting the returned count:
> require "lsqlite3"
> db = sqlite3.open":memory:"
> db:exec "create table foo (x,y,z);"
> for x in db:urows "select count(*) from foo" do print(x) end
0
> db:exec "insert into foo values (10,11,12);"
> for x in db:urows "select count(*) from foo" do print(x) end
1
>
Just loop over the iterator you get back from rows (or whichever function you use), but put a break at the end of the loop body so that you only iterate once.
Getting the count is all about using SQL. You compute it with the SELECT statement:
SELECT count(*) FROM ...
This will return one row containing a single value: the number of rows in the query.
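The same two ideas (take only the first row, and let SQL compute the count) are sketched below using Python's built-in sqlite3 module rather than LuaSQLite3, simply because it is easy to run as-is. The table foo mirrors the demo above; the second inserted row is made up so that the count differs from 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (x, y, z)")
conn.execute("INSERT INTO foo VALUES (10, 11, 12)")
conn.execute("INSERT INTO foo VALUES (20, 21, 22)")  # extra sample row

# Q1: fetch only the first row of the result.
first_row = conn.execute("SELECT x, y, z FROM foo ORDER BY x LIMIT 1").fetchone()

# Q2: count(*) returns exactly one row holding one value.
(row_count,) = conn.execute("SELECT count(*) FROM foo").fetchone()

print(first_row)
print(row_count)
conn.close()
```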
This is similar to what I'm using in my project and works well for me.
local query = "SELECT content FROM playerData WHERE name = 'myTable' LIMIT 1"
local queryResultTable = {}
local queryFunction = function(userData, numberOfColumns, columnValues, columnTitles)
    for i = 1, numberOfColumns do
        queryResultTable[columnTitles[i]] = columnValues[i]
    end
end
db:exec(query, queryFunction)
for k,v in pairs(queryResultTable) do
    print(k,v)
end
You can even concatenate values into the query to place inside a generic method/function.
local query = "SELECT * FROM ZQuestionTable WHERE ConceptNumber = "..conceptNumber.." AND QuestionNumber = "..questionNumber.." LIMIT 1"
