AND/OR query search using MarkLogic (XQuery or equivalent) - xquery

I am new to MarkLogic, and we are evaluating it for our product use case.
We have evaluated a few NoSQL databases, such as MongoDB and Couchbase.
I am looking for the following type of query search:
(Condition1 OR Condition2) AND (Condition3 OR Condition4) AND (Condition5)
Can MarkLogic provide such type of search query?
I have just started learning MarkLogic and am trying to understand the architecture.
Thanks,
Sameer

Yes, MarkLogic provides high-level libraries for this type of functionality. Take a look at the Search API.
Start here: https://developer.marklogic.com/learn/2009-07-search-api-walkthrough
And more thorough documentation is here: https://docs.marklogic.com/guide/search-dev/search-api
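As a quick taste, here is a minimal sketch (the search terms are made up; the Search API's default string-query grammar supports AND, OR, and parentheses, so the boolean shape from your question maps directly onto query text):
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
(: hypothetical terms -- the grouping mirrors (C1 OR C2) AND (C3 OR C4) AND (C5) :)
search:search("(cat OR dog) AND (fish OR bird) AND hamster")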

As mentioned, MarkLogic can handle this kind of logic in many ways.
For example, this is how you could set up a search query using the cts library (I highly recommend it, since it makes much better use of indexes and is much more flexible to construct queries with):
cts:search(//elementName,
  cts:and-query((
    cts:element-attribute-value-query(xs:QName("entry"), xs:QName("private"), "true"),
    cts:or-query((
      cts:element-attribute-value-query(xs:QName("entry"), xs:QName("forced"), "false"),
      cts:element-attribute-value-query(xs:QName("entry"), xs:QName("forced"), "pending")
    ))
  ))
)
This snippet shows both AND and OR logic. The cts:and-query() and cts:or-query() functions each take a sequence of sub-queries. The above query says: "Find an element (called elementName) that has an attribute of private='true' AND has either one of the following: forced='false' or forced='pending'".
For much simpler data, you can use XQuery predicates, like the following:
for $node in $xml/parent/child[@param1 eq "test" and @param2 eq "OK"]/grandchild[@service eq "yahoo" or @service eq "google"]
return $node

The short answer to the original question is "yes". The details of "how" will depend on the approach used to express the queries.
The reference architecture recommends a three-tier approach using the Java or Node.js Client APIs if you use one of those, or HTTP calls to the REST API if you use a different language in your middle tier.
You can also use the Search API (as mentioned by wst) if you're working in MarkLogic's application server (typically as a two-tier architecture). You can do that with either XQuery or Server-side JavaScript, as of MarkLogic 8.

Related

Is there a way to import multiple enumerands in IBM Rhapsody?

I have an enumerand of around 150 entries, which I need to get into IBM Rhapsody.
Doing this by hand is clearly lengthy and error prone. I have googled extensively, but found only things that tell me how to edit the generated code -- not how to go the other way.
The question is: how is this done? And if there is no way, please post that as an answer.
David,
I would jump into the Java API (plugin subsystem) and do it that way. If you haven't learned how to use the API, there is a bit of a learning curve. There are two ways to go about it: implement an app in Java (or your favorite JVM language; I use Scala) that realizes the Rhapsody plugin framework, then package it up and deploy it so that it gets loaded when you load your model; or, if it is a one-off job, do everything up to the point of packaging and simply run it from within your IDE. If you are comfortable with Scala, I can post some code.
So what I did in the end was edit the relevant .sbs file: I used a small Python program to generate the items I required, then updated the length of the array accordingly.
import uuid

# "..." stands in for the remaining ~150 literals
all_the_literals = ["enum_name = 0x4e", "enum_name2 = 0xF2", ... ,]

# each entry splits into (name, "=", value); "waste" holds the "="
for field1, waste, field1_value in map(lambda x: x.split(" "),
                                       all_the_literals):
    literal_string = f""" {{ IEnumerationLiteral
    - _id = GUID {uuid.uuid4()};
    - _name = \"{field1}\";
    - codeUpdateCGTime = 5.16.2022::19:24:18;
    - _modifiedTimeWeak = 5.16.2022::19:24:18;
    - _value = \"{field1_value}\";
    }}"""
    print(literal_string)
Note the above "code" snippet purely prints the items, which you then copy-paste into the relevant field in the .sbs file. YMMV -- this was the correct format for an enum in Rhapsody (note how I fudged the update times; it worked successfully, so you will need to do the same if you use this answer).
Also note it's probably better to use bauhaus9's answer, but I definitely didn't have time for it.

Microsoft Graph API- list all users with OneDrive license

I want to list all users that have a OneDrive license.
I am using this URL, but it doesn't work:
https://graph.microsoft.com/v1.0/users?$filter=assignedLicenses/any(x:x/skuId eq 4b585984-651b-448a-9e53-3b10f069cf7f or x/skuId eq c7df2760-2c81-4ef7-b578-5b5392b571df)
Do you have any idea how to do it?
Unfortunately, a complex query (whatever you're trying to do above) on the property assignedLicenses is not supported. If you try, the API will throw the error:
Complex query on property assignedLicenses is not supported
That said, I can see it works for a simple filter, like:
https://graph.microsoft.com/v1.0/users?$filter=assignedLicenses/any(x:x/skuId eq 4b585984-651b-448a-9e53-3b10f069cf7f)
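If you only need those two SKUs, one possible workaround (a sketch, not an officially documented pattern) is to issue one simple filter per skuId and merge the result sets by user id on the client:
https://graph.microsoft.com/v1.0/users?$filter=assignedLicenses/any(x:x/skuId eq 4b585984-651b-448a-9e53-3b10f069cf7f)
https://graph.microsoft.com/v1.0/users?$filter=assignedLicenses/any(x:x/skuId eq c7df2760-2c81-4ef7-b578-5b5392b571df)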

MarkLogic - Delete Versioned Collections

I have around 43 million documents. The latest version of each document is in the LIVE collection, and the same versioned documents are also in version collections named like /collection/versionNumber. I want to delete the versioned collections, which hold around 34 million documents. What is the best approach for deleting them all in one go?
You could try using xdmp:collection-delete() to delete all documents in the collection in a single transaction.
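For example, a one-liner (using the collection URI from the question):
xdmp:collection-delete("/collection/versionNumber")
Note that this deletes the documents in the collection, not just the collection tag, so double-check that the version collections do not share document URIs with LIVE.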
If that doesn't work and it isn't able to delete in one shot, then I would look to utilize batch tools. For instance, a CoRB job.
An example job options file with the needed properties (except for the XCC-CONNECTION-URI):
# Inline module to select all URIs from the collection
URIS-MODULE=INLINE-XQUERY|let $uris := cts:uris("",(),cts:collection-query("/collection/versionNumber")) return (count($uris), $uris)
# Inline module to delete the docs
PROCESS-MODULE=INLINE-XQUERY|declare variable $URI as xs:string external; xdmp:document-delete($URI)
THREAD-COUNT=10
I think your application is using the DLS library for versioning. If so, and if you will never need to look into any old version in the future, then delete only the versioned documents. You can use the dls:document-unmanage API in that case.
Also, explore dls:purge and dls:document-purge before proceeding. I am not very sure about these two.
In any case, even if it's not DLS, processing them in one go (a single transaction) would not be recommended. Either process them in batches, or spread them across threads on the task server through xdmp:spawn.
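If you do batch it yourself, here is a minimal sketch of the spawn approach (assumptions: the collection URI from the question, a batch size of 1000, and a URI list that fits in memory -- at 34 million documents, a tool like CoRB is likely the safer route):
xquery version "1.0-ml";
let $uris := cts:uris((), (), cts:collection-query("/collection/versionNumber"))
let $batch-size := 1000
for $i in 1 to xs:integer(fn:ceiling(fn:count($uris) div $batch-size))
let $batch := fn:subsequence($uris, ($i - 1) * $batch-size + 1, $batch-size)
return
  xdmp:spawn-function(
    (: each batch is deleted in its own task-server transaction :)
    function() { $batch ! xdmp:document-delete(.) },
    <options xmlns="xdmp:eval"><update>true</update></options>
  )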

MarkLogic REST API for directory-query

I have the following XQuery which I use to fetch documents for a directory.
xquery version "1.0-ml";
cts:search(fn:collection(), cts:directory-query("/Path/To/Docs/", "infinity"))
Now I need to translate this into a REST call, but I can't seem to crack it following the documentation on this page:
https://docs.marklogic.com/REST/GET/v1/search
Update:
Using the Jersey REST API, I tried this but got a 406 error:
String query = "{\"queries\":[ {\"directory-query\":{\"uri\":[\"/Path/to/Docs/\"]},\"infinite\":true} ]}";
String encodedQuery = URLEncoder.encode(query, "UTF-8");
WebTarget target = searchWebTarget.queryParam("structuredQuery", encodedQuery);
final Response response = target.request().get();
Any ideas?
As David said, you don't need to use a structured query for this purpose, but in case you have a future need:
I believe your original issue was that this is not a well-formed structured query:
{\"queries\":[ {\"directory-query\":{\"uri\":[\"/Path/to/Docs/\"]},\"infinite\":true} ]}
You're missing the top level "query" property. You can find an example of a fully formed structured query that uses directory-query here:
http://docs.marklogic.com/guide/search-dev/structured-query#id_97452
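With that wrapper added (keeping the directory URI from your attempt), the payload would look something like this:
{
  "query": {
    "queries": [
      { "directory-query": { "uri": [ "/Path/to/Docs/" ], "infinite": true } }
    ]
  }
}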
Also, you're probably already aware, but there is a native Java API that sits atop the REST API. You can learn more about this API here:
https://docs.marklogic.com/javadoc/client/index.html
http://docs.marklogic.com/guide/java
Constraining by directory is a query parameter directly on the search endpoint; no other notation is needed.
See the docs here: https://docs.marklogic.com/REST/GET/v1/search
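For example, something like:
GET /v1/search?directory=/Path/To/Docs/
(URL-encode the directory value in a real client.)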

Schema qualified tables with SQLAlchemy, SQLite and Postgresql?

I have a Pylons project and a SQLAlchemy model that implements schema qualified tables:
class Hockey(Base):
    __tablename__ = "hockey"
    __table_args__ = {'schema': 'winter'}
    hockey_id = sa.Column(sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True)
    baseball_id = sa.Column(sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id'))
This code works great with PostgreSQL, but fails when using SQLite on table and foreign key names (due to SQLite's lack of schema support):
sqlalchemy.exc.OperationalError: (OperationalError) unknown database "winter" 'PRAGMA "winter".table_info("hockey")' ()
I'd like to continue using SQLite for dev and testing.
Is there a way to have this fail gracefully on SQLite?
"I'd like to continue using SQLite for dev and testing. Is there a way to have this fail gracefully on SQLite?"
It's hard to know where to start with that kind of question. So . . .
Stop it. Just stop it.
There are some developers who don't have the luxury of developing on their target platform. Their life is a hard one--moving code (and sometimes compilers) from one environment to the other, debugging twice (sometimes having to debug remotely on the target platform), gradually coming to an awareness that the gnawing in their gut is actually the start of an ulcer.
Install PostgreSQL.
When you can use the same database environment for development, testing, and deployment, you should.
Not to mention the QA team. Why on earth are they testing stuff they're not going to ship? If you're deploying on PostgreSQL, assure the quality of your work on PostgreSQL.
Seriously.
I'm not sure if this works with foreign keys, but someone could try SQLAlchemy's multi-tenancy schema translation for Table objects. It worked for me, though I used custom primaryjoin and secondaryjoin expressions in combination with composite primary keys.
The schema translation map can be passed directly to the engine creator:
...
if dialect == "sqlite":
    url = lambda: "sqlite:///:memory:"
    execution_options = {"schema_translate_map": {"winter": None, "summer": None}}
else:
    # user, password, host, port, name are assumed to be defined elsewhere
    url = lambda: f"postgresql://{user}:{password}@{host}:{port}/{name}"
    execution_options = None
engine = create_engine(url(), execution_options=execution_options)
...
Here is the doc for create_engine. There is another question on SO which might be related in that regard.
But one might get colliding table names if all schema names are mapped to None.
I'm just a beginner myself, and I haven't used Pylons, but...
I notice that you are combining the table and the associated class together. How about if you separate them?
import sqlalchemy as sa

meta = sa.MetaData('sqlite:///tutorial.sqlite')
schema = None
hockey_table = sa.Table('hockey', meta,
    sa.Column('score_id', sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True),
    sa.Column('baseball_id', sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id')),
    schema=schema,
)
meta.create_all()
Then you could create a separate
class Hockey(object):
    ...
and
mapper(Hockey, hockey_table)
Then just set schema = None everywhere if you are using SQLite, and the value(s) you want otherwise.
You didn't provide a working example, so the example above isn't a working one either. However, as other people have pointed out, trying to maintain portability across databases is in the end a losing game. I'd add a +1 to the people suggesting you just use PostgreSQL everywhere.
HTH, Regards.
I know this is a 10+ year old question, but I ran into the same problem recently: Postgres in production and SQLite in development.
The solution was to register an event listener for when the engine calls the "connect" method.
import sqlalchemy

@sqlalchemy.event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    # attach the file-based database under the schema name used by the models
    dbapi_connection.execute('ATTACH "your_data_base_name.db" AS "schema_name"')
Running the ATTACH statement only once will not work, because it affects only a single connection. This is why we need the event listener: to issue the ATTACH statement on every connection.
