What is Database.ReadAsync() for? - azure-cosmosdb

In the samples for the Cosmos DB SQL API, there are a couple of uses of Database.ReadAsync() which don't seem to be doing anything useful. The remarks section in the method's documentation doesn't really indicate what it might be used for either.
What is the reason for using it in these cases? When would you typically use it?
ChangeFeed/Program.cs#L475 shows getting a database then calling ReadAsync to get another reference to the database
database = await client.GetDatabase(databaseId).ReadAsync();
await database.DeleteAsync();
which seems to be functionally the same as
database = client.GetDatabase(databaseId);
await database.DeleteAsync();
throwing the same exception if the database is not found.
and DatabaseManagement/Program.cs#L80-L83
DatabaseResponse readResponse = await database.ReadAsync();
Console.WriteLine($"\n3. Read a database: {readResponse.Resource.Id}");
await readResponse.Database.CreateContainerAsync("testContainer", "/pk");
which seems to be equivalent to:
Console.WriteLine($"\n3. Read a database: {database.Id}");
await database.CreateContainerAsync("testContainer", "/pk");
producing the same output and creating the container as before.

You are correct that those samples might need polishing. The main difference is:
GetDatabase just gets a proxy object, it does not mean the database actually exists. If you attempt an operation on a database that does not exist, for example, CreateContainer, it can fail with a 404.
ReadAsync reads the DatabaseProperties, lets you obtain any information from them, and succeeds only if the database actually exists. Does that guarantee that if I call CreateContainer right away it will succeed? No, because the database could have been deleted in between.
So in summary, ReadAsync is good if you want to get any of the DatabaseProperties or if you want to for some reason verify the database exists.
Most common scenarios would just use GetDatabase because you are probably attempting operations down the chain (like creating a container or executing item level operations in some container in that database).
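The proxy-versus-read distinction described above can be pictured with a small toy model. This is only a sketch: FakeService, DatabaseProxy and NotFoundError are hypothetical names invented for illustration and are not part of the Cosmos DB SDK.

```python
class NotFoundError(Exception):
    """Stands in for a 404 response from the service (hypothetical name)."""


class FakeService:
    """In-memory stand-in for the remote service; not the Cosmos DB SDK."""

    def __init__(self):
        self.databases = {}  # database id -> set of container ids

    def get_database(self, database_id):
        # No service call: returns a proxy whether or not the database exists.
        return DatabaseProxy(self, database_id)


class DatabaseProxy:
    def __init__(self, service, database_id):
        self._service = service
        self.id = database_id

    def read(self):
        # Simulated round trip: fails if the database is missing,
        # otherwise returns its properties.
        if self.id not in self._service.databases:
            raise NotFoundError(self.id)
        return {"id": self.id}

    def create_container(self, container_id):
        # An operation on a proxy for a missing database fails only here.
        if self.id not in self._service.databases:
            raise NotFoundError(self.id)
        self._service.databases[self.id].add(container_id)


service = FakeService()
service.databases["existing"] = set()

ghost = service.get_database("missing")  # succeeds: it is only a proxy
try:
    ghost.create_container("testContainer")
    ghost_failed = False
except NotFoundError:
    ghost_failed = True

real = service.get_database("existing")
properties = real.read()  # verifies existence and returns the properties
real.create_container("testContainer")
```

In this model, obtaining the proxy never fails; the missing database is only noticed when an operation such as read or create_container is attempted.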

Short Answer
Database.ReadAsync(...) is useful for reading database properties.
The Database object is useful for performing operations on the database, such as creating a container via Database.CreateContainerIfNotExistsAsync(...).
A bit more detail
The Microsoft Docs page for Database.ReadAsync is kind of confusing and not well written in my opinion:
The definition says:
Reads a DatabaseProperties from the Azure Cosmos service as an
asynchronous operation.
However, the example shows ReadAsync returning a DatabaseResponse object, not a DatabaseProperties object:
// Reads a Database resource where database_id is the ID property of the Database resource you wish to read.
Database database = this.cosmosClient.GetDatabase(database_id);
DatabaseResponse response = await database.ReadAsync();
It's only after a bit more digging that things become clearer. When you look at the documentation page for the DatabaseResponse Class it says the inheritance chain for DatabaseResponse is:
Inheritance: Object -> Response<DatabaseProperties> -> DatabaseResponse
If you then have a look at the Docs page for the Response<T> Class you'll see there is an implicit operator that converts Response<T> to T:
public static implicit operator T (Microsoft.Azure.Cosmos.Response<T> response);
So even though the ReadAsync method returns a DatabaseResponse object, that object can be implicitly converted to a DatabaseProperties object (since DatabaseResponse inherits Response<DatabaseProperties>).
So Database.ReadAsync is useful for reading database properties.
The Docs page for Database.ReadAsync could have been clearer about the implicit link between the DatabaseResponse object returned by the method and the DatabaseProperties object that it wraps.


Async SQLalchemy: accessing eagerly-loaded empty relationship triggers new lazy-load, raising error

I am using sqlalchemy + asyncpg, and 'selectin' eager loading.
I have Person items that have one-to-many relationships with Friends.
I insert a Person into my database, with no related Friend entries. If in the same session I try and get that Person from the database, I can access their static (non-relationship) columns fine, but cannot access the friends relationship.
I think trying to access person.friends is triggering a lazy load, despite it being enforced previously as a selectin load. Why is this? How can I avoid it?
# Create the ORM model
class Person(Base):
    __tablename__ = 'items'
    id_ = Column(POSTGRES_UUID(as_uuid=True), primary_key=True)
    name = Column(String(32))
    friends = relationship('Friend', lazy='selectin')
# Create an instance
person_id = uuid4()
person = Person(id_=person_id, name='Alice')  # Note that this Person's friends are not set
# Add to database
async with AsyncSession(engine, expire_on_commit=False) as session:
    try:
        session.begin()
        session.add(person)
        await session.commit()
    except:
        await session.rollback()
        raise
    # Get the added person from the database (same session)
    created_person = await session.get(Person, person_id)
    print(created_person.id_)  # Works fine
    print(created_person.friends)  # Raises error
Error:
sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called; can't call await_() here.
Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)
The solution is to use the populate_existing parameter in get:
populate_existing – causes the method to unconditionally emit a SQL query and refresh the object with the newly loaded data, regardless of whether or not the object is already present.
Replace
created_person = await session.get(Person, person_id)
with
created_person = await session.get(Person, person_id, populate_existing=True)
session.get documentation
See also: https://github.com/sqlalchemy/sqlalchemy/issues/7176
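One way to picture why populate_existing helps is a toy identity map. This is a simplified model under assumptions, not SQLAlchemy's implementation: ToySession and its attributes are hypothetical, and plain dicts stand in for ORM objects. Without the flag, get hands back the object already held by the session, whose relationship was never loaded; with it, a (simulated) query runs and refreshes that same object.

```python
class ToySession:
    """Toy identity map; a simplified model, not SQLAlchemy's Session."""

    def __init__(self):
        self.person_rows = {}   # pretend table: primary key -> column values
        self.friend_rows = {}   # pretend table: person key -> list of friends
        self.identity_map = {}  # primary key -> object already in the session
        self.queries = 0        # number of simulated SELECT statements

    def add(self, key, obj):
        # Persist the columns and keep the very same object in the session.
        self.person_rows[key] = {"name": obj["name"]}
        self.identity_map[key] = obj

    def get(self, key, populate_existing=False):
        obj = self.identity_map.get(key)
        if obj is not None and not populate_existing:
            # Cache hit: no query runs, so unloaded attributes stay unloaded.
            return obj
        self.queries += 1
        if obj is None:
            obj = self.identity_map.setdefault(key, {})
        # Refresh in place, including the eagerly loaded relationship.
        obj.update(self.person_rows[key])
        obj["friends"] = list(self.friend_rows.get(key, []))
        return obj


session = ToySession()
alice = {"name": "Alice"}  # friends were never set or loaded
session.add(1, alice)

cached = session.get(1)
cached_has_friends = "friends" in cached

refreshed = session.get(1, populate_existing=True)
```

In the toy model the first get issues no query and returns the original object unchanged, while the second one reloads it so that friends is populated (here with an empty list).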
@theo-brown's answer goes straight to the point, but I wanted to add some interesting information here.
Adding extra context on lazy loading and async SQLAlchemy:
When you fetch data using async SqlAlchemy, every model being queried spawns a coroutine. If you don't eager load your relationships, you'll end up with partially populated models.
Imagine this use case that I'm working on: I have a batch_job object, that relates to a batch_file and batch_job entries, all of which are tables in my database. When I don't eager load them, see what happens in the debugger:
The Traceback that I get when returning the object from an endpoint is this one:
greenlet_spawn has not been called; can't call await_only() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)
The reason is that I didn't await these values, and that's what eager loading does for you in async sqlalchemy.
However, you might not have to eager load if you're working inside the application scope and you'll want to use these values later, and hence you could await them.
For those who are using the ORM, you could do it with the good old loading options:
result = await db_session.execute(select(YourModel).options(joinedload(YourModel.relationshipcolumn)))
results = result.unique().scalars().all()  # unique() is required when joined-loading a collection

Axon - How to get @QueryHandler handle method to return an Optional<MyType>

Note:
The point of this question is not just to get back a value that I ultimately want.
I can do that by simply not using Optional.
I would like an elegant solution so I could start returning Optional.
Explanation of what I tried to do:
I used the QueryGateway with a signature that will query my handler.
I broke out my code so you can see that on my CompletableFuture I will do a blocking get in order to retrieve the Optional that contains the object I really want.
Note that I am not looking for a class that holds my optional.
If this is not elegant then I may as well just do my null check.
The call to the query works, but I get the following error:
org.axonframework.axonserver.connector.query.AxonServerQueryDispatchException: CANCELLED: HTTP/2 error code: CANCEL
Received Rst Stream
AXONIQ-5002
58484#DESKTOP-CK6HLMM
Example of code that initiates the query:
UserProfileOptionByUserQuery q = new UserProfileOptionByUserQuery(userId);
CompletableFuture<Optional<UserProfile>> query =
queryGateway.query(q,ResponseTypes.optionalInstanceOf(UserProfile.class));
Optional<UserProfile> optional = query.get();
Error occurs on the query.get() invocation.
Example of my Query Handler:
@QueryHandler
Optional<UserProfile> handle(UserProfileOptionByUserQuery query, @MetaDataValue(USER_INFO) UserInfo userInfo) {
    assertUserCanQuery(query, userInfo);
    return userProfileRepository.findById(query.getUserId());
}
The query handler works fine.
Other efforts such as using OptionalResponseType would not initiate my query as desired.
I think the key lies with the exception you are receiving, Stephen.
Just to verify for my own good, I've tested the following permutations when it comes to Optional query handling:
Query Handler returns Optional, Query Dispatcher uses OptionalResponseType
Query Handler returns MyType, Query Dispatcher uses OptionalResponseType
Query Handler returns Optional, Query Dispatcher uses InstanceResponseType
In addition, I've tried out these permutations with both the SimpleQueryBus and Axon Server. All three options worked completely fine for me on both buses.
This suggests to me that you should dive into the AxonServerQueryDispatchException you are receiving.
Hence, I am going to ask you a couple of follow-up questions to further deduce what the problem is. I'd suggest updating your original question with the responses to them.
Do you have a more detailed stack trace, by any chance?
And, what versions of Axon Framework and Axon Server are you using?
Are you on the Standard Edition? Enterprise edition?
Does this behavior only happen for this exact Optional query handler you've shared with us?

How to use MarkLogic xquery to tell if a document is 'in-memory'

I want to tell if an XML document has been constructed (e.g. using xdmp:unquote) or has been retrieved from a database. One method I have tried is to check the document-uri property
declare variable $doc as document-node() external;
if (fn:exists(fn:document-uri($doc))) then
'on database'
else
'in memory'
This seems to work well enough but I can't see anything in the MarkLogic documentation that guarantees this. Is this method reliable? Is there some other technique I should be using?
I think that behavior has been stable for a while. You could always check for the URI too, as long as you expect it to be from the current database:
xdmp:exists(fn:doc(fn:document-uri($doc)))
Or if you are in an update context and need ACID guarantees, use fn:exists.
The real test would be to try to call xdmp:node-replace or similar, and catch the expected error. Those node-level update functions do not work on constructed nodes. But that requires an update context, and might be tricky to implement in a robust way.
If your XML document is in memory, you can use the in-mem-update API:
import module namespace mem = "http://xqdev.com/in-mem-update" at "/MarkLogic/appservices/utils/in-mem-update.xqy";
If your XML document exists in your database, you can use fn:exists() or fn:doc-available().
The real test of in-memory versus in-database is xdmp:node-replace.
If you are able to replace, update, or delete a node, then it is in the database; if the call throws an exception, it is not in the database.
Now there are two situations:
1. Your document has not been created at all:
you can use fn:empty() to check whether it has been created.
2. Your document has been created and is in memory:
if fn:empty() returns false and xdmp:node-replace throws an exception, then it is in memory.

Groovy DSL with embedded groovy scripts

I am writing a DSL for expressing flow (original I know) in groovy. I would like to provide the user the ability to write functions that are stored and evaluated at certain points in the flow. Something like:
states {
    "checkedState" {
        onEnter { state ->
            // do some groovy things with state object
        }
    }
}
Now, I am pretty sure I could surround the closure in quotes and store that. But I would like to keep syntax highlighting and content assist if possible when editing these DSLs. I realize that the closure COULD reference artifacts from the surrounding flow definition which would no longer be valid when executing the closure in a different context, and I am fine with this. In reality I would like to use the closure syntax for a non-closure function definition.
tl;dr; I need to get the closure's code while evaluating the DSL so that it can be stored in the database and executed by a script host later.
I don't think there is a way to get a closure's source code, as this information is discarded during compilation. Perhaps you could try writing an AST transformation that would make the closure's syntax tree available at runtime.
If all you care about is storing the closure in the database, and you don't need later access to the source code, you can try serializing it and storing the serialized form.
Closure implements Serializable, and after nulling its owner, thisObject and delegate attributes I was able to serialize it, but I'm getting ClassNotFoundException on deserialization.
def myClosure = {a, b -> a + b}
Closure.metaClass.setAttribute(myClosure, "owner", null)
Closure.metaClass.setAttribute(myClosure, "thisObject", null)
myClosure.delegate = null
def byteOS = new ByteArrayOutputStream()
new ObjectOutputStream(byteOS).writeObject(myClosure)
def serializedClosure = byteOS.toByteArray()
def input = new ObjectInputStream(new ByteArrayInputStream(serializedClosure))
def deserializedClosure = input.readObject() // throws CNFE
After some searching, I found Groovy Remote Control, a library created specifically to enable serializing closures and executing them later, possibly on a remote machine. Give it a try, maybe that's what you need.

When should I call javax.jdo.Query.close(Object)?

I'm trying to understand when I should call Query.close(Object) or Query.closeAll();
From the docs I see that it will "close a query result" and "release all resources associated with it," but what does "close" mean? Does it mean I can't use objects that I get out of a Query after I've called close on them?
I ask because I'm trying to compartmentalize my code with functions like getUser(id), which will create a Query, fetch the User, and destroy the query. If I have to keep the Query around just to access the User, then I can't really do that compartmentalization.
What happens if I don't call close on an object? Will it get collected as garbage? Will I be able to access that object later?
Please let me know if I can further specify my question somehow.
You can't use the query results once you have closed them (i.e. the List object you got back from query.execute). You can still access the objects in the results if you copied them to your own List, or kept references to them in your code. Failure to call close can leak memory.
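The copy-then-close approach can be sketched with a toy model. This is an illustration under assumptions, not the JDO API: ToyQuery, ToyResult and get_users are hypothetical names, and the model only mimics the behaviour described above, where the result list becomes unusable after close while the objects copied out of it remain valid.

```python
class ToyResult:
    """Toy result list that refuses iteration once its query is closed."""

    def __init__(self, rows):
        self._rows = rows
        self.closed = False

    def __iter__(self):
        if self.closed:
            raise RuntimeError("query result is closed")
        return iter(self._rows)


class ToyQuery:
    """Toy stand-in for a query object; not javax.jdo.Query."""

    def __init__(self, rows):
        self._rows = rows
        self._open_results = []

    def execute(self):
        result = ToyResult(self._rows)
        self._open_results.append(result)
        return result

    def close_all(self):
        # Invalidate every result this query has handed out.
        for result in self._open_results:
            result.closed = True
        self._open_results.clear()


def get_users(rows):
    query = ToyQuery(rows)
    try:
        # Copy the objects into our own list before the query is closed.
        return list(query.execute())
    finally:
        query.close_all()


users = get_users([{"id": 1, "name": "Alice"}])
```

The caller of get_users never sees the query: it receives an ordinary list, and the query's resources are released before the function returns.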
When your query method returns a single object it is easy to simply close the query before returning the single object.
On the other hand, when your query method returns a collection, the query method itself cannot close the query before returning the result, because the query needs to stay open while the caller is iterating through the results.
This puts the responsibility for closing a query that returns a collection on the caller and can introduce leaks if the caller neglects to close the query - I thought there must be a safer way and there is!
Guido, a long-time DataNucleus user, created an 'auto closing' collection facade that wraps the collection returned by JDO's Query.execute method. Usage is extremely simple: wrap the query result inside an instance of the auto closing collection object.
Instead of returning the Query result set like this:
return q.execute();
simply return an 'auto closing' wrapped version of it:
return new JDOQueryResultCollection(q, q.execute());
To the caller it appears like any other Collection but the wrapper keeps a reference to the query that created the collection result and auto closes it when the wrapper is disposed of by the GC.
Guido kindly gave us permission to include his clever auto closing code in our open source exPOJO library. The auto closing classes are completely independent of exPOJO and can be used in isolation. The classes of interest are in the expojo_jdo*.jar file that can be downloaded from:
http://www.expojo.com/
JDOQueryResultCollection is the only class used directly but it does have a few supporting classes.
Simply add the jar to your project and import com.sas.framework.expojo.jdo.JDOQueryResultCollection into any Java file that includes query methods that return results as a collection.
Alternatively, you can extract the source files from the jar and include them in your project. They are:
JDOQueryResultCollection.java
Disposable.java
AutoCloseQueryIterator.java
ReadonlyIterator.java
