Type checking an XQuery query

Type checking an XQuery query - xquery

I am a reasonably competent XSLT 1.0 developer.
I am experimenting in XQuery 3.0 in Oxygen.
I've got an evaluation of Oxygen, and Saxon 9.9.1.7 EE, I can force a type issue
e.g. I've defined
<xs:element name="vhs" type="xs:string"/>
and created a query
import schema default element namespace "" at "file:/C:/Foo/Oxygen/XQuery/src/videos.xsd";
for $x in result/videos/video
where $x/vhs>14.50
order by $x/title
return $x/title
If I run the query it fails with a sensible sort of error.
Cannot compare xs:string to xs:decimal
But can I typecheck the query without running a query? (either via API or from commandline)
My doubt is that the IDE doesnt see the type error, I have to run it.

The type system in XQuery is a hybrid between strong/strict/static typing and weak/dynamic typing. Implementations have a lot of discretion as to how much they do statically and how much dynamically. XQuery 1.0 defined strict static typing (I call it "pessimistic" static typing) as an option, but relatively few products implemented it (SQL Server is one). Under pessimistic static typing, $a/b/c is an error unless you know statically that $a is an element that always has a child named b, which always has a child named c. So a pre-condition for that is that your query is schema-aware.
Saxon implements "optimistic static typing": it only reports a type error at compile time if the construct is guaranteed to fail. For example, starts-with(12, 1) is guaranteed to fail because the arguments must be strings and you can establish statically that they aren't.
If your query were schema-aware, then $x/vhs>14.50 would be allowed if vhs is typed as xs:decimal, and disallowed if it is typed as xs:string -- provided that the type of $x is known, which depends on how you declared it. In a non-schema-aware query $x/vhs>14.50 will succeed or fail at run-time based on the actual value in the vhs element.
Taking advantage of schema-awareness can give you much better compile-time error detection. But sadly, not many people do it, because it involves extra effort at coding time, and people always believe the code they're writing will work first time.

Related

Why do we use [1] behind an order by clause in xquery expressions?

SELECT xText.query (' let $planes := /planes/plane
return <results>
{
for $x in $planes
where $x/year >= 1970
order by ($x/year)[1]
return ($x/make, $x/model,$x/year )
}
</results>
')
FROM planes
In this code, what is the purpose of [1] in line order by ($x/year)[1]
I have tried to execute this code without using [1] behind order by clause. Then this error has occurred.
XQuery [planes.xtext.query()]: 'order by' requires a singleton (or empty sequence), found operand of type 'xtd:untypedAtomic*'

XQuery 1.0 defined an option called "static type checking": if this option is in force, you need to write queries in such a way that the compiler can tell in advance ("statically") that the result will not be a type error. The operand of "order by" needs to be a singleton, and with static type checking in force, the compiler needs to be able to verify that it will be a singleton, which is why the [1] has been added. It would have been better to write exactly-one($x/year) because this doesn't only keep the compiler happy, it also demands a run-time check that $x/year is actually a singleton.
Very few XQuery vendors chose to implement static type checking, for very good reasons in my view: it makes queries harder to write, and it actually encourages you to write things like this example that do LESS checking than a system without this "feature".
In fact, as far as I know the only mainstream (non-academic) implementation that does static type checking is Microsoft's SQL Server.
Static type checking should not be confused with optimistic type checking where the compiler tells you about things that are bound to fail, but defers checking until run-time for things that might or might not be correct.
Actually the above is a bit of a guess. It's also possible that some <plane> elements have more than one child called <year>, and that you want to sort on the first of these. That would justify the [1] even on products that don't do static type checking.

The [1] is a predicate that is selecting the first item in the sequence. It is equivalent to the expression [position() = 1]. A predicate in XPath acts kind of like a WHERE clause in SQL. The filter is applied, and anything that returns true is selected, the things that return false are not.
When you don't apply the predicate, you get the error. That error is saying that the order by expects a single item, or nothing (an empty sequence).
At least one of the plane has multiple year, so the predicate ensures that only the first one is used for the order by expression.

Having trouble understanding FLWOR expressions combined with inserts and return statements [duplicate]

I recognized that (insert/delete)-XQueries executed with the BaseX client always returning an empty string. I find this very confusing or unintuitive.
Is there a way to find out if the query was "successful" without querying the database again (and using potentially buggy "transitive" logic like "if I deleted a node, there must be 'oldNodeCount-1' nodes in the XML")?

XQuery Update statements do not return anything -- that's how they are defined. But you're not the only one who does not like those restrictions, and BaseX added two ways around this limitation:
Returning Results
By default, it is not possible to mix different types of expressions
in a query result. The outermost expression of a query must either be
a collection of updating or non-updating expressions. But there are
two ways out:
The BaseX-specific update:output() function bridges this gap: it caches the results of its arguments at runtime and returns them after
all updates have been processed. The following example performs an
update and returns a success message:
update:output("Update successful."), insert node <c/> into doc('factbook')/mondial
With the MIXUPDATES option, all updating constraints will be turned off. Returned nodes will be copied before they are modified by
updating expressions. An error is raised if items are returned within
a transform expression.
If you want to modify nodes in main memory, you can use the transform
expression.
The transform expression will not help you, as you seem to modify the data on disk. Enabling MIXUPDATES allows you to both update the document and return something at the same time, for example running something like
let $node := <c/>
return ($node, insert node $node into doc('factbook')/mondial)
MIXUPDATES allows you to return something which can be further processed. Results are copied before being returned, if you run multiple updates operations and do not get the expected results, make sure you got the concept of the pending update list.
The db:output() function intentionally breaks its interface contract: it is defined to be an updating function (not having any output), but at the same time it prints some information to the query info. You cannot further process these results, but the output can help you debugging some issues.
Pending Update List
Both ways, you will not be able to have an immediate result from the update, you have to add something on your own -- and be aware updates are not visible until the pending update list is applied, ie. after the query finished.
Compatibility
Obviously, these options are BaseX-specific. If you strongly require compatible and standard XQuery, you cannot use these expressions.

Ada Function Parameter as Access Type or Not

I am refactoring some code originally written using access types, but not yet tested. I found access types to be troublesome in Ada because they can only refer to dynamically allocated items, and referring to items defined at compile time is apparently not allowed. (This is Ada83.)
But now I come to a function like this one:
function Get_Next(stateInfo : State_Info_Access_Type) return integer;
I know that I can easily pass parameter "contents" of an access type rather than the access pointer itself, so I am considering writing this as
function Get_Next(stateInfoPtr : State_Info_Type) return integer;
where State_Info_Type is the type that State_Info_Access_Type refers to.
With this refactor, for all intents and purposes I think I'm still really passing what amounts to an implicit pointer back to the contents (using the .all) syntax).
I want to refactor and test starting with the lowest level functions, working my way up the call chains. My goal is to push the access types out of the code as I go.
Am I understanding this correctly or am I missing something?

I think original author(s), and possibly OP are missing a point, that is, how Ada parameter modes work.
To quote #T.E.D
Every Ada compiler I know of under the hood passes objects larger than fit in a machine register by reference. It is the compiler, not the details of your parameter passing mechanisim, that enforces not writing data back out of the routine.
Ada does this automatically, and leaves the parameter modes as a way of describing the flow of information (Its NOT the C style reference / value conundrum). See the useful wikibook.
What worries me is that the code you have inherited looks like the author has used the explicit access parameter type as a way of getting functions to have side effects (usually considered a bad thing in Ada - World).
My recommendation is to change your functions to:
function Get_Next(State_Info : in State_Info_Type) return Integer;
and see if the compiler tells you if you are trying to modify State_Info. If so, you may need to change your functions to procedures like this:
procedure Get_Next(State_Info : in out State_Info_Type;
Result : out Integer);
This explicitly shows the flow of information without needing to know the register size or the size of State_Info_Type.
As an aside Ada 2012 Will allow you to have functions that have in out parameters.

To quote #T.E.D,
Every Ada compiler I know of under the hood passes objects larger than fit in a machine register by reference. It is the compiler, not the details of your parameter passing mechanisim, that enforces not writing data back out of the routine.
Since this code hasn’t yet been tested, I think you are completely right to rework what looks like code written with a C mindset. But, you oughtn’t to mention pointers at all; you suggested
function Get_Next(stateInfoPtr : State_Info_Type) return integer;
but this would be better as
function Get_Next(stateInfo : State_Info_Type) return integer;
or even (IMO) as
function Get_Next(State_Info : State_Info_Type) return Integer;
to use more standard Ada styling! My editor (Emacs, but GPS can do this too) will change state_info into State_Info on the fly.
As an afterthought, you might be able to get rid of State_Info_Type_Pointer altogether. You mention .all, so I guess you’ve got
SITP : State_Info_Type_Pointer := new State_Info_Type;
... set up components
Next := Get_Next (SITP.all);
but what about
SIT : State_Info_Type;
... set up components
Next := Get_Next (SIT);

I wouldn't recommend this, but you can get pointers to variables in Ada 83 by using 'Address.
You can then use overlays (again this is all Ada83 stuff) to achieve access...
function something(int_address : Address) return integer
is
x : integer;
for x'address use int_address;
begin
-- play with x as you will.

How to use MarkLogic xquery to tell if a document is 'in-memory'

I want to tell if an XML document has been constructed (e.g. using xdmp:unquote) or has been retrieved from a database. One method I have tried is to check the document-uri property
declare variable $doc as document-node() external;
if (fn:exists(fn:document-uri($doc))) then
'on database'
else
'in memory'
This seems to work well enough but I can't see anything in the MarkLogic documentation that guarantees this. Is this method reliable? Is there some other technique I should be using?

I think that behavior has been stable for a while. You could always check for the URI too, as long as you expect it to be from the current database:
xdmp:exists(fn:doc(fn:document-uri($doc)))
Or if you are in an update context and need ACID guarantees, use fn:exists.
The real test would be to try to call xdmp:node-replace or similar, and catch the expected error. Those node-level update functions do not work on constructed nodes. But that requires an update context, and might be tricky to implement in a robust way.

If your XML document is in-memeory, you can use in-mem-update API
import module namespace mem = "http://xqdev.com/in-mem-update" at "/MarkLogic/appservices/utils/in-mem-update.xqy";
If your XML document exists in your database you can use fn:exists() or fn:doc-available()

The real test of In-memory or In-Db is xdmp:node-replace .
If you are able to replace , update , delete a node then it is in database else if it throws exception then it's not in database.
Now there are two situation
1. your document is not created at all:
you can use fn:empty() to check if it is created or not.
2. Your document is created and it's in memory:
if fn:empty() returns false and xdmp:node-replace throws exception then it's in-memory

How to prevent Sqlite unbound parameters interpreted as NULL

As the Sqlite docs specify, unbound parameters in prepared statements are interpreted as NULL - my question is this then:
Is there a way to have Sqlite ensure that all parameters have been bound at least once, thereby ensuring none were missed by accident?
It is better to get an error and require calls to sqlite3_bind_null(statement_, col); then to get a subtle error because I forgot to call sqlite3_bind_* on the where clause of an update statement!

It is not possible to differentiate unbound parameters from parameters set to NULL using the current SQLite libraries.
If you have a look at the C source code for sqlite3_bind_null(), you will find that it simply calls the internal SQLite function that unbinds a parameter. Therefore there is no way to tell the two cases apart.
The only solution for this would be to wrap the SQLite C API functions with your own functions that will do a bit of book-keeping. You can bundle a bitmap with each sqlite3_stmt structure, with each bit set to true if and only if the corresponding parameter is bound. You would have to create wrappers for at least the sqlite3_bind_*() functions and sqlite3_clear_bindings().

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Type checking an XQuery query - xquery

Related

Why do we use [1] behind an order by clause in xquery expressions?

Having trouble understanding FLWOR expressions combined with inserts and return statements [duplicate]

Ada Function Parameter as Access Type or Not

How to use MarkLogic xquery to tell if a document is 'in-memory'

How to prevent Sqlite unbound parameters interpreted as NULL

Categories

Resources