How to get SimpleDB Query Count with Boto and SDBManager - count

I would like to query my SimpleDB domain to get the count of records that match a certain criteria. Something that could be done like this:
rs = appsDomain.select("SELECT count(*) FROM %s WHERE (%s='%s' or %s='%s') and %s!='%s'" % (APPS_SDBDOMAIN, XML_APPNODE_NAME_ATTR, appName, XML_APPNODE_RESERVED_NAME_ATTR, appName, XML_EMAIL_NODE, thisSession.email), None, True)
After doing some reading, I found that getting a query count from SimpleDB via the SDBManager count method might be more efficient than a straightforward "count(*)"-style query. Further, since I know there is only one row and one column that I need, I would love not to have to loop over a result set, but I would also like to avoid this:
count = int(rs.iter().next()['Count'])
Is it true that SDBManager is more efficient? Is there a better way?
If SDBManager is the best way can anyone show me how to use it as I have been thoroughly unsuccessful?
Thanks in advance!

Well, I stopped being lazy and simply went to the source to get my answer
(FROM: boto-2.6.0-py2.7.egg/boto/sdb/db/manager/sdbmanager.py)
def count(self, cls, filters, quick=True, sort_by=None, select=None):
    """
    Get the number of results that would
    be returned in this query
    """
    query = "select count(*) from `%s` %s" % (self.domain.name, self._build_filter_part(cls, filters, sort_by, select))
    count = 0
    for row in self.domain.select(query):
        count += int(row['Count'])
        if quick:
            return count
    return count
As you can see, the sdbmanager.count method does nothing special; in fact, it does exactly what I was hoping to avoid, which is looping over a result set just to get the 'Count' value(s).
So in the end I will probably just implement this method myself, as using SDBManager actually carries a lot more overhead which, in my case, is not worth it.
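For anyone landing here later, here is a minimal sketch of that direct implementation (assuming boto's SimpleDB API; the domain and attribute variables are the placeholders from the question):

import boto

sdb = boto.connect_sdb()
domain = sdb.get_domain(APPS_SDBDOMAIN)  # placeholder domain name from the question

query = "SELECT count(*) FROM `%s` WHERE `%s` = '%s'" % (
    APPS_SDBDOMAIN, XML_APPNODE_NAME_ATTR, appName)

# SimpleDB can split a count(*) across several result rows, so sum every
# 'Count' attribute rather than reading only the first row.
count = sum(int(row['Count']) for row in domain.select(query))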
Thanks!

Related

How to execute a query after getting the response of another query?

I need the result of the first query so I can pass it as an input parameter to my second query, and I also want to know how to write multiple queries.
In my use case, the second query can only be traversed using the result of the first query, iterating over it with a loop (similar to a for loop).
const query1 = g.V().hasLabel('Province').has('code',market).inE('partOf').outV().has('type',state).values('code').as('state')
After executing query1, the result is:
res=[{id1},{id2},........]
query2 = select('state').repeat(has('code',res[0]).inE('partOf').outV().has('type',city).value('name')).times(${res.length-1}).as('city')
I made the assumption that your first query tries to find "states by market", where the market is a variable you intend to pass to your query. If that is correct, then your first query simplifies to:
g.V().hasLabel('Province').has('code',market).
in('partOf').
has('type','state').values('code')
Note that in() is preferred to inE().outV() when you don't need to filter on edge properties.
Your second query doesn't look like valid Gremlin, but maybe you were just trying to provide an example of what you wanted to do. You wrote:
select('state').
repeat(has('code',res[0]).
inE('partOf').outV().
has('type',city).value('name')).
times(${res.length-1}).as('city')
and I assume that means you want to use the states found in the first query to find their cities. If that's what you're after, you can simplify this to a single query:
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
in('partOf').has('type','city').
values('name')
If you need data about the state and the city as part of the result then consider project():
g.V().hasLabel('Province').has('code',market).
in('partOf').has('type','state').
project('state','cities').
by('code').
by(__.in('partOf').has('type','city').
values('name').
fold())
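If you are submitting this from application code, here is a hedged sketch of running the combined traversal with market as a variable (assuming gremlinpython and a Gremlin Server endpoint; the URL and market value are placeholders):

from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g')  # placeholder endpoint
g = traversal().withRemote(conn)

market = 'MX'  # hypothetical market code
city_names = (g.V().hasLabel('Province').has('code', market).
              in_('partOf').has('type', 'state').  # in() is spelled in_() in Python
              in_('partOf').has('type', 'city').
              values('name').
              toList())  # one round trip; no client-side loop over query1's results
conn.close()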

DynamoDBScanExpression withLimit returns more records than Limit

I have to list all the records from a DynamoDB table, without any filter expression.
I want to limit the number of records, hence I am using DynamoDBScanExpression with setLimit.
DynamoDBScanExpression scanExpression = new DynamoDBScanExpression();
....
// Set ExclusiveStartKey
....
scanExpression.setLimit(10);
However, the scan operation always returns more than 10 results!
Is this the expected behaviour, and if so, why?
Python Answer
Setting a limit on scan() does not cap the total number of results: Limit only bounds how many items are evaluated per request, and the paginated result list keeps fetching further pages as you iterate, which is why you always get more than 10. With a query, however, a limit is meaningful.
A query searches through items, the rows in the database, in key order. It starts at the top or bottom of the list and finds items based on set criteria. You must have a partition key (and optionally a sort key) to do this.
A scan, on the other hand, searches through the ENTIRE table and, as a result, is NOT ordered.
Since queries walk items in order and scan walks the entire table, only queries can support a meaningful limit.
To answer OP's question: essentially it doesn't work because you're using scan, not query.
Here is an example using the CLIENT syntax (the more advanced syntax; sorry, I don't have a simpler example that uses the resource API, but you can Google that):
def retrieve_latest_item(self):
    result = self.dynamodb_client.query(
        TableName="cleaning_company_employees",
        KeyConditionExpression="works_night_shift = :value",
        ExpressionAttributeValues={':value': {'BOOL': True}},  # BOOL takes a boolean, not the string "True"
        ScanIndexForward=False,  # newest first
        Limit=3
    )
    return result
Here are the DynamoDB module docs.
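If you really must scan, one workaround (a sketch assuming boto3's resource API; the table name is borrowed from the example above) is to paginate manually and stop once you have enough items, since Limit only bounds each individual request:

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('cleaning_company_employees')

wanted = 10
items, kwargs = [], {'Limit': wanted}
while len(items) < wanted:
    page = table.scan(**kwargs)
    items.extend(page['Items'])
    if 'LastEvaluatedKey' not in page:
        break  # table exhausted; no more pages
    kwargs['ExclusiveStartKey'] = page['LastEvaluatedKey']
items = items[:wanted]  # trim any overshoot from the final page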

X++ Odd count result

I'm experiencing a really odd result when I do a count in X++, something I've not experienced before. I am performing what I thought was a really simple count query, but I can't seem to get the result I am after.
WMSOrderTrans orderTrans;
WMSOrderTrans orderTransChk;
;
select count(RecId) from orderTrans
    group by shipmentId
    where orderTrans.inventTransRefId == 'XXXXXX';
info(strFmt('Count is %1', orderTrans.RecId));

while select orderTransChk
    group by shipmentId
    where orderTransChk.inventTransRefId == 'XXXXXX'
{
    info(strFmt('Shipment is %1', orderTransChk.shipmentId));
}
The records I am selecting all have only 1 shipmentId, so from the first select I am expecting a count of 1; instead I get 4 (which is how many lines exist for that inventTransRefId). If I change the count from 'RecId' to 'ShipmentId', then instead of the count I get the actual shipmentId. I simply want it to return the count of the records, which is what I believe I've asked it to do.
I really can't see what I am missing.
In the while select, I get what I expect (the shipmentid), only 1 infolog message for the loop. This tells me that the group by with the where clause is working, but it doesn't explain why the first count select statement isn't behaving as I would expect.
For reference, this is an AX 2012 R1 system.
For anyone who might be interested in knowing, my answer is tied up with Jeff's response. At the end of the day, I didn't look at the data closely enough, and the query returned the correct results. I initially thought there were a number of unique shipments, but I was wrong; my expected result was erroneous. There were 4 lines in the file, but the lines were unique per item, not per shipment. They were all on the same shipment. So really, my own fault; it goes to show that one really needs to look at the data closely.
Thanks to all that responded, greatly appreciated.
I would try a database sync, then restart the AOS. I don't see anything obviously wrong, so it points to bouncing everything.
Try getting the generated SQL via the statement below (from memory, so check the syntax) and then review the query against SQL directly. It uses the generateOnly hint.
select generateOnly count(RecId) from orderTrans group by shipmentid where orderTrans.inventTransRefId == 'XXXXXX';
info(orderTrans.getSQLStatement());
If I understand what you're trying to achieve, you'd like to get something like this SQL query:
select count(distinct shipmentid) from orderTrans
where inventTransRefId = 'XXXXXX'
The 'distinct' keyword is not available in the AX select statement. The group by clause will let you iterate over all distinct values, but not apply an aggregate on top of them.
You can use a direct SQL connection to push the exact SQL command you want.
In AX, aggregate values are stored in the field used: with count(RecId), the count goes into the RecId field; otherwise the system would have to add a new field to the buffer on the fly. I don't think you can aggregate on the field used in the group by clause, as its value is needed for the grouping.
You can try using a query (I don't have an AX instance to test it):
Query query = new Query();
QueryRun queryRun;
QueryBuildDataSource qbd;

qbd = query.addDataSource(tableNum(WMSOrderTrans));
qbd.addRange(fieldNum(WMSOrderTrans, InventTransRefId)).value("xxxx");
qbd.addSortField(fieldNum(WMSOrderTrans, ShipmentId));
qbd.orderMode(OrderMode::GroupBy);
queryRun = new QueryRun(query);
info(strFmt("Total Records in Query %1", SysQuery::countTotal(queryRun)));

SQLite - Get a specific row index for a Sorted/Filtered Query

I'm creating a caching system to take data from an SQLite database table using a sorted/filtered query and display it. The tables I'm pulling from can be potentially very large and, of course, I need to minimize impact on memory by only retaining a maximum number of rows in memory at any given time. This is easily done by using LIMIT and OFFSET to load only the records I need and update the cache as needed. Implementing this is trivial. The problem I'm having is determining where the insertion index is for a new record inserted into a particular query so I can update my UI appropriately. Is there an easy way to do this? So far the ideas I've had are:
Dump the entire cache, re-count the Query results (there's no guarantee the new row will be included), refresh the cache and refresh the entire UI. I hope it's obvious why that's not really desirable.
Use my own algorithm to determine whether the new row is included in the current query, whether it is included in the current cached results, and at what index it should be inserted if it's within the current cached scope. The biggest downfall of this approach is its complexity and the risk that my own sorting/filtering algorithm won't match SQLite's.
Of course, what I want is to be able to ask SQLite: Given 'Query A' what is the index of 'Row B', without loading the entire query results. However, so far I haven't been able to find a way to do this.
I don't think it matters, but this is all occurring on an iOS device, using the Objective-C programming language.
More Info
The query and subsequent cache are based on user input. Essentially the user can re-sort and filter (or search) to alter the results they're seeing. My reluctance to simply recreate the cache on insertions (and edits, actually) comes from wanting to provide a 'smoother' UI experience.
I should point out that I'm leaning toward option "2" at the moment. I played around with creating my own caching/indexing system by loading all the records in a table and performing the sort/filter in memory using my own algorithms. So much of the code needed to determine whether and/or where a particular record is in the cache is already there, so I'm slightly predisposed to use it. The danger lies in having a cache that doesn't match the underlying query. If I include a record in the cache that the query wouldn't return, I'll be in trouble and probably crash.
You don't need record numbers.
Save the values of the ordered field in the first and last records of the LIMITed query result.
Then you can use these to check whether the new record falls into this range.
In other words, assuming that you order by the Name field, and that the original query was this:
SELECT Name, ...
FROM mytab
WHERE some_conditions
ORDER BY Name
LIMIT x OFFSET y
then try to get at the new record with a similar query:
SELECT 1
FROM mytab
WHERE some_conditions
AND PrimaryKey = LastInsertedValue
AND Name BETWEEN CachedMin AND CachedMax
Similarly, to find out before (or after) which record the new record was inserted, start directly after the inserted record and use a limit of one, like this:
SELECT Name
FROM mytab
WHERE some_conditions
AND Name > MyInsertedName
AND Name BETWEEN CachedMin AND CachedMax
ORDER BY Name
LIMIT 1
This doesn't give you a number; you still have to check where the returned Name is in your cache.
Typically you'd expect a cache to be invalidated if there were underlying data changes. I think dropping it and starting over will be your simplest, most maintainable solution. I would recommend it unless you have a very good reason not to.
You could write another query that just returned the row count (example below) to see if your cache should be invalidated. That would save recreating the cache when it did not change.
SELECT name,address FROM people WHERE area_code=970;
SELECT COUNT(rowid) FROM people WHERE area_code=970;
The information you'd need from sqlite to know when your cache was invalidated would require some rather intimate knowledge of how the query and/or index was working. I would say that is fairly high coupling.
Otherwise, you'd want to know where the row was inserted with regard to the sorting. You would probably key each page on the sorted field. Delete any page greater than the inserted/deleted value. Any time you change the sorting you'd drop everything.
Something like the below would be a start if you were using C++. I realize you aren't doing C++, but hopefully it is evident what I'm trying to do.
#include <set>
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string addr;
};

struct Page {
    std::string key;                 // value of the sorted field that starts this page
    std::vector<Person> persons;

    struct Less {
        bool operator()(const Page &lhs, const Page &rhs) const {
            return lhs.key.compare(rhs.key) < 0;
        }
    };
};

typedef std::set<Page, Page::Less> pages_t;
pages_t pages;

bool sql_insert(const Person &person);  // placeholder for the actual SQL INSERT

void insert(const Person &person) {
    if (sql_insert(person)) {
        // find the first cached page whose key is not less than the new row's sort value
        pages_t::iterator drop_cache_start = pages.lower_bound(Page{person.name});
        //... drop this page and everything after it
    }
}
You'd have to do some wrangling to get different datatypes of key to work nicely, but it's possible.
Theoretically you could just leave the pages out of it and only use the objects themselves. The database would no longer "own" the data, though. If you only fill pages from the database, then you'll have fewer data-consistency worries.
This may be a bit off topic, but you aren't re-implementing views, are you? They don't cache per se, but it isn't clear if that is a requirement of your project.
The solution I came up with is not exactly simple, but it's currently working well. I realized that the index of a record in a query's results is also the count of all its preceding records. What I needed to do was 'convert' all the ORDER clauses in the query into a series of WHERE clauses that would return only the preceding records, and take a count of those records. It's trickier than it sounds (or maybe not... it sounds tricky). The biggest issue I had was making sure the query was, in fact, sorted in a way I could predict. This meant I needed an order column in the order parameters that was based on a column with unique values. So, whenever a user sorts on a column, I append to the statement another order parameter on a unique column (I used a "Modified Date Stamp") to break ties.
Creating the WHERE portion of the statement requires more than just tacking on a bunch of ANDs. It's easier to demonstrate. Say you have 3 Order columns: "LastName" ASC, "FirstName" DESC, and "Modified Stamp" ASC (the tie breaker). The WHERE statement would have to look something like this ('?' = record value):
WHERE
"LastName" < ? OR
("LastName" = ? AND "FirstName" > ?) OR
("LastName" = ? AND "FirstName" = ? AND "Modified Stamp" < ?)
Each set of WHERE parameters grouped together by parentheses is a tie breaker. If, in fact, the record values of "LastName" are equal, we must then look at "FirstName", and finally "Modified Stamp". Obviously, this statement can get really long if you're sorting by a bunch of order parameters.
There's still one problem with the above solution. Comparisons against NULL values always evaluate to false, and yet when sorting, SQLite places NULL values first. Therefore, in order to deal with NULL values appropriately, you've got to add another layer of complication. First, all equality comparisons (=) must be replaced by IS. Second, each < comparison must be wrapped with an OR ... IS NULL to include NULL values appropriately on the < side. This turns the above clause into:
WHERE
("LastName" < ? OR "LastName" IS NULL) OR
("LastName" IS ? AND "FirstName" > ?) OR
("LastName" IS ? AND "FirstName" IS ? AND ("Modified Stamp" < ? OR "Modified Stamp" IS NULL))
I then take a COUNT of the rowid using the above WHERE clause.
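To make the idea concrete, here is a short, hedged sketch (Python's sqlite3 as a stand-in; the table, column names, and sample data are hypothetical) of the "count the preceding rows" trick, using the NULL-safe WHERE clause above for the sort order LastName ASC, FirstName DESC, ModifiedStamp ASC:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people ("LastName" TEXT, "FirstName" TEXT, "ModifiedStamp" TEXT)')
conn.executemany('INSERT INTO people VALUES (?, ?, ?)',
                 [('Adams', 'Zoe', '1'), ('Baker', 'Ann', '2'), ('Baker', 'Ann', '3')])

def sorted_index(last, first, stamp):
    # A row's index in the sorted query equals the COUNT of rows that sort before it.
    sql = '''SELECT COUNT(rowid) FROM people WHERE
                 ("LastName" < ? OR "LastName" IS NULL) OR
                 ("LastName" IS ? AND "FirstName" > ?) OR
                 ("LastName" IS ? AND "FirstName" IS ? AND
                  ("ModifiedStamp" < ? OR "ModifiedStamp" IS NULL))'''
    return conn.execute(sql, (last, last, first, last, first, stamp)).fetchone()[0]

print(sorted_index('Baker', 'Ann', '3'))  # -> 2: two rows sort before this one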
It turned out easy enough for me to do mostly because I had already constructed a set of objects to represent various aspects of my SQL Statement which could be assembled to generate the statement. I can't even imagine trying to manipulate a SQL statement like this any other way.
So far, I've tested using this on several iOS devices with up to 10,000 records in a table and I've had no noticeable performance issues. Of course, it's designed for single record edits/insertions so I don't really need it to be super fast/efficient.

SQL Count Available Rooms

I am a newbie using ASP.NET and I have a problem with what I should use. I need to count the number of available rooms in a hotel using SQL. I used COUNT but it's not working. Is there another way?
availableRMS.Text = rdr.Item(0)
The first column in the result set is at index 0, not index 1.
I know it's not the direct answer to your question, but it'd be much simpler if you just used ExecuteScalar to get your count value, since you only have one row/value being returned:
int count = (int) cmd1.ExecuteScalar();
availableRMS.Text = count.ToString();
Since COUNT will always return a number for your query in SQL Server (zero if no rows match), you don't need all the extra checks required when using the reader.
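For comparison, here is a rough equivalent of the ExecuteScalar pattern in Python (sqlite3 used as a stand-in database; the rooms table, status column, and file name are hypothetical): fetch the single COUNT value directly instead of iterating over a reader.

import sqlite3

conn = sqlite3.connect('hotel.db')  # hypothetical database file
count = conn.execute(
    "SELECT COUNT(*) FROM rooms WHERE status = 'available'").fetchone()[0]
print(count)  # COUNT always returns exactly one row, so no reader loop is needed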
