Ambiguous column name, joining three tables - sqlite

I am using "DB Browser for SQLite" to test my SQL query and I get the error:
ambiguous column name: Devices.Label
I have a database with three tables: Devices, Temperature and Humidity.
I want to get a result with the latest timestamp from each device, so something like:
Hot | 1545074701 | 25.8
Hot | 1545071101 | 23.2
Cool | 1545071101 | 24.0
My query looks like:
SELECT Devices.Label, max(Humidity.Timestamp), Humidity.H, max(Temperature.Timestamp), Temperature.C
FROM Temperature, Humidity
INNER JOIN Devices ON Temperature.DeviceID = Devices.DeviceID
INNER JOIN Devices ON Humidity.DeviceID = Devices.DeviceID;
What am I missing? Have I not specified from which table I am taking the info?
And the next step would of course be to get the proper result from the query; this will then be sent as a JSON object to an Android application.

Thanks to sticky bit I changed my query to:
SELECT
DevHum.Label, MAX(Humidity.Timestamp), Humidity.H, Temperature.C
FROM
Temperature, Humidity
INNER JOIN
Devices AS DevTemp ON Temperature.DeviceID = DevTemp.DeviceID
INNER JOIN
Devices AS DevHum ON Humidity.DeviceID = DevHum.DeviceID;
Which seems to do the trick.
I decided to split it into two queries, one for Hot and one for Cool, as combining them is above my level right now.
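For reference, a runnable sketch of one of the two split queries (temperature only), with Devices joined under an alias so column references are unambiguous. The schema and sample data are assumptions based on the question; Python's sqlite3 stands in for DB Browser.

```python
import sqlite3

# Minimal sketch: Devices joined under an alias, one row per device with
# the latest timestamp. Schema and sample data are assumed from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Devices (DeviceID INTEGER PRIMARY KEY, Label TEXT);
CREATE TABLE Temperature (DeviceID INTEGER, Timestamp INTEGER, C REAL);
INSERT INTO Devices VALUES (1, 'Hot'), (2, 'Cool');
INSERT INTO Temperature VALUES (1, 1545071101, 23.2), (1, 1545074701, 25.8),
                               (2, 1545071101, 24.0);
""")
rows = conn.execute("""
    SELECT Dev.Label, MAX(Temperature.Timestamp), Temperature.C
    FROM Temperature
    INNER JOIN Devices AS Dev ON Temperature.DeviceID = Dev.DeviceID
    GROUP BY Dev.DeviceID
    ORDER BY Dev.DeviceID;
""").fetchall()
# SQLite guarantees that a bare column (Temperature.C) selected alongside
# MAX() comes from the row holding the maximum, i.e. the latest reading.
print(rows)  # [('Hot', 1545074701, 25.8), ('Cool', 1545071101, 24.0)]
```

The bare-column-with-MAX() behavior is SQLite-specific, which is what lets this query skip a correlated subquery for the latest row per device.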

Related

Slow query on table | WHERE x | ORDER by timestamp | DISTINCT a,b,c,d | TAKE 20 when table large

We are experiencing a sudden performance drop with a query structured like this:
table(tablename)
| where MeasurementName in ('ActiveJobId')
and MachineId == machineId
and SourceTimestamp <= from
and isnotnull( Value)
| order by SourceTimestamp desc
| distinct SourceTimestamp, MeasurementName, tostring(Value), SourceTimestampUtc
| take rows
tablename, machineId, from and rows are all query parameters. rows is typically "20". The Value column is of type "dynamic".
The table contains 240 Million entries, with about 64,000 matching the WHERE criteria. The goal of the query is to get the last 20 UNIQUE, non-empty entries for a given machine and data point, starting after a specific date.
The query runs smoothly in the Staging database system, but performance started to degrade on the Dev system, possibly because of the increased data volume.
If we remove the distinct clause, or move it behind the TAKE clause, the query completes very fast. (<1s). The data contains about 5-10% duplicate entries.
To our understanding the query should be performed like this:
Prepare a filter for the source table, start at a specific datetime range
Order desc: walk backwards
Walk down the table and stop when you got 20 distinct rows
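The three-step plan above can be sketched as a streaming pass in Python; the row shape and the dedup key here are assumptions for illustration, not the actual table schema.

```python
# Streaming sketch of the intended plan: the filter is already applied,
# rows are sorted descending by timestamp, and we stop as soon as n
# distinct rows have been emitted -- instead of deduplicating all
# matching rows first. Row shape and dedup key are illustrative only.
rows = [
    {"ts": 5, "value": "a"}, {"ts": 4, "value": "a"},  # duplicated value
    {"ts": 3, "value": "b"}, {"ts": 2, "value": "c"},
]

def last_n_distinct(rows, n, key=lambda r: r["value"]):
    seen = set()
    for row in sorted(rows, key=lambda r: r["ts"], reverse=True):
        k = key(row)
        if k not in seen:
            seen.add(k)
            yield row
            if len(seen) == n:
                return  # early exit: no full dedup of the result set

print(list(last_n_distinct(rows, 2)))  # [{'ts': 5, 'value': 'a'}, {'ts': 3, 'value': 'b'}]
```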
From the time it sometimes takes, it looks almost as if ADX walks down the whole table, performs the distinct, and only then takes the topmost 20 rows.
The problem persists if we swap | order and | distinct around.
The problem disappears if we move | distinct to the end of the query, but then we often receive 1-2 fewer items than required.
Is there a logical error we make, can this query be rewritten, or are there better options at hand?
The goal of the query is to get the last 20 UNIQUE, non-empty entries for a given machine and data point, starting after a specific date.
This part of the description doesn't match the filter in your query: and SourceTimestamp <= from - did you mean to use >= instead of <= ?
Is there a logical error we make, can this query be rewritten, or are there better options at hand?
If you can't eliminate the duplicates upstream, you can consider setting a materialized view that performs the deduplication, then query the view directly instead of the raw data. Also see Handle duplicate data

Kusto: Full table scan on join even join conditions are time-based?

I have a Kusto Query like:
(Events
| take 1)
| join kind=leftouter Sensor_Data on $left.start_timestmp == $right.timestmp, someotherfield
and it will never return. The right side of the join has several billion entries.
If I do a
Events
| take 1
and use the result in the where clause of Sensor_Data it returns in no time.
The MS Support Team explains that this query requires a full table scan of the Sensor_Data table; the join parameters are not taken into consideration by the query optimizer.
Question:
Is the Kusto Query Optimizer really not able to optimize queries based on the join condition? To me it sounds a little bit like 1999 to have to first run the left side of the query manually and then the right side manually as well. Is there some hint, or similar, to make this run fast?
Consider rewriting your query as follows (for example) and see if that helps it perform better:
let x = toscalar(Events | take 1 | project pack("ts", start_timestmp, "sof", someotherfield));
Sensor_Data
| where timestmp == todatetime(x["ts"])
| where someotherfield == tostring(x["sof"])

SUM IFS equivalent for SQLite

The below crashes my DB Browser. Essentially I am trying to sum sales ("sales") by a sales person ("name") that occurred between two dates ("beg_period" and "end_period") pulled from a separate table.
SELECT ta.name, ta.beg_period, ta.end_period,
(SELECT SUM(tb.sales)
FROM sales_log tb
WHERE ta.name = tb.name
AND tb.date BETWEEN ta.beg_period AND ta.end_period
)
FROM performance ta
;
The nested query can be re-written as a single query with a standard join.
SELECT ta.name, ta.beg_period, ta.end_period, SUM(tb.sales)
FROM performance ta INNER JOIN sales_log tb
ON ta.name = tb.name
WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
GROUP BY ta.name, ta.beg_period, ta.end_period;
My guess is that the original query was okay (however inefficient), but DB Browser just didn't know how to handle the correlated subquery in whatever parsing it attempts. In other words, just because it crashed DB Browser doesn't mean that it would crash the sqlite library. Try another sqlite database manager.
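To check that, here is a side-by-side run of the correlated subquery and the JOIN rewrite through Python's sqlite3 instead of DB Browser. The schema and sample data are assumptions based on the question.

```python
import sqlite3

# Both forms of the per-period sales sum, run on assumed sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE performance (name TEXT, beg_period TEXT, end_period TEXT);
CREATE TABLE sales_log (name TEXT, date TEXT, sales REAL);
INSERT INTO performance VALUES ('Alice', '2019-01-01', '2019-03-31');
INSERT INTO sales_log VALUES
    ('Alice', '2019-02-01', 100.0),
    ('Alice', '2019-03-15', 50.0),
    ('Alice', '2019-05-01', 999.0);  -- outside the period, ignored
""")
subquery = conn.execute("""
    SELECT ta.name, ta.beg_period, ta.end_period,
           (SELECT SUM(tb.sales) FROM sales_log tb
            WHERE ta.name = tb.name
              AND tb.date BETWEEN ta.beg_period AND ta.end_period)
    FROM performance ta;
""").fetchall()
joined = conn.execute("""
    SELECT ta.name, ta.beg_period, ta.end_period, SUM(tb.sales)
    FROM performance ta INNER JOIN sales_log tb ON ta.name = tb.name
    WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
    GROUP BY ta.name, ta.beg_period, ta.end_period;
""").fetchall()
print(subquery, joined)  # both: [('Alice', '2019-01-01', '2019-03-31', 150.0)]
```

One caveat: the two are not identical for a salesperson with no sales in the period; the subquery returns a NULL sum for them, while the INNER JOIN drops the row entirely (a LEFT JOIN would preserve it).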

App Insights 'join' that returns only the first result

I have a job running hourly (at slightly different times) and logging metrics into Application Insights.
I want to trigger an alert based on the metrics from the latest job run.
let metrics = customMetrics | where ... | extend run = bin(timestamp, 1m);
let latestRun = metrics | top 1 by run desc;
metrics | join latestRun on run
Looking at metrics I can see this query should be returning 8 results. But it returns only the first of them. Why?
Surprisingly, this is by design: the query language does not use inner joins by default; instead it uses "innerunique" joins.
Switching to join kind=inner fixed my original query.

What's the best way to retrieve this data?

The architecture for this scenario is as follows:
I have a table of items and several tables of forms. Rather than having the forms own the items, the items own the forms. This is because one item can be on several forms (although only one of each type, but not necessarily on any). The forms and items are all tied together by a common OrderId. This can be represented like so:
OrderItems | Form A    | Form B etc....
-----------|-----------|
ItemId     | FormAId   |
OrderId    | OrderId   |
FormAId    | SomeField |
FormBId    | OtherVar  |
FormCId    | etc...    |
This works just fine for these forms. However, there is another form (say, FormX) which cannot have an OrderId because it consists of items from multiple orders. OrderItems does contain a column for FormXId as well, but I'm confused about the best way to get a list of the "FormX"s related to a single OrderId.
I'm using MySQL and was thinking maybe a stored proc was the best way to go on this, but I've never used a stored proc on MySQL and don't really know the best way to go about it. My other (kludgy) option was to hit the DB twice: first to get all the items for the given OrderId that also have a FormXId, and then get all their FormXIds and do a dynamic SELECT statement where I do something like (pseudocode)
SELECT whatever FROM sometable WHERE FormXId=x OR FormXId=y....
Obviously this is less than ideal, but I can't really think of any other way... anything better I could do either programmatically or architecturally? My back-end code is ASP.NET.
Thanks so much!
UPDATE
In response to the request for more info:
Sample input:
OrderId = 1000
Sample output
FormXs:
-----------------
FormXId | FieldA | FieldB | etc
-------------------------------
1003 | value | value | ...
1020 | ... .. ..
1234 | .. . .. . . ...
You see the problem is that FormX doesn't have one single OrderId but is rather a collection of OrderIds. Sometimes multiple items from the same order are on FormX, sometimes it's just one, most orders don't have any items on FormX. But when someone pulls up their order, I need for all the FormXs their items belong on to show up so they can be modified/viewed.
I was thinking of maybe creating a stored proc that does what I said above, run one query to pull down all the related OrderIds and then another to return the appropriate FormXs. But there has to be a better way...
I understand you need to get a list of the "FormX"s related to a single OrderId. You say that OrderItems contains a column for FormXId.
You can issue the following query:
SELECT FormX.*
FROM OrderItems
JOIN FormX ON OrderItems.FormXId = FormX.FormXId
WHERE OrderItems.OrderId = #orderId
You need to pass #orderId to your query and you will get a record set with the FormX records related to this order.
You can either package this query up as a stored procedure with an #orderId parameter, or use dynamic SQL and substitute #orderId with the real order number you are executing the query for.
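A runnable sketch of that single-query approach, with sqlite3 standing in for MySQL (the SQL itself is portable). The schema, sample data, and the `?` placeholder style are assumptions; MySQL connectors typically use `%s` placeholders instead.

```python
import sqlite3

# One parameterized query returns every FormX tied to an order through
# OrderItems -- no second round-trip or dynamic OR-list needed.
# Schema and sample data are assumptions based on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderItems (ItemId INTEGER, OrderId INTEGER, FormXId INTEGER);
CREATE TABLE FormX (FormXId INTEGER PRIMARY KEY, FieldA TEXT);
INSERT INTO FormX VALUES (1003, 'value'), (1020, 'value'), (9999, 'other');
INSERT INTO OrderItems VALUES (1, 1000, 1003), (2, 1000, 1020), (3, 2000, 9999);
""")
form_xs = conn.execute("""
    SELECT FormX.*
    FROM OrderItems
    JOIN FormX ON OrderItems.FormXId = FormX.FormXId
    WHERE OrderItems.OrderId = ?
    ORDER BY FormX.FormXId;
""", (1000,)).fetchall()
print(form_xs)  # [(1003, 'value'), (1020, 'value')]
```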
