How to delete first n rows in realm - realm

Suppose there is a table called RecentViewItem which stores recently viewed items by user. I want to keep only first 10 recently viewed items by deleting all other items. My query is like this:
RealmResults<RecentViewItem> results =
realm.where(RecentViewItem.class)
.findAllSorted("updatedAt", Sort.DESCENDING);
// What to do next ?

RealmResults<RecentViewItem> results = realm.where(RecentViewItem.class).findAllSorted("updatedAt", Sort.DESCENDING);
for(int i = 0, size = results.size(); i < 10 && i < size; i++) {
RecentViewItem recentViewItem = results.get(i);
recentViewItem.deleteFromRealm();
}

Since Realm for Java lazy loads its data, there isn't a LIMIT query which adds a bit of work. I query all the data sorted by a timestamp (in your case updatedAt).
Realm realm = Realm.getDefaultInstance();
final RealmResults<RecentViewItem> results = realm.where(RecentViewItem.class).findAllSorted("timestamp", Sort.DESCENDING);
I check whether it's over my limit and, if it is, get the result on the threshold so I can get its timestamp. Then I query against that threshold timestamp and delete all the results from before it.
if (results.size() > UPPER_LIMIT) {
RecentViewItem firstOverLimit = results.get(TRUNCATE_LIMIT);
// Assuming a system time. Maybe your updatedAt is a date. Could use an id too.
long threshold = firstOverLimit.timestamp;
final RealmResults<RecentViewItem> resultsToDelete = realm.where(RecentViewItem.class).lessThanOrEqualTo("timestamp", threshold).findAll();
Log.d(TAG, "cleanUp: total "+ results.size() +
"; over threshold of " + UPPER_LIMIT +
"; truncating to " + TRUNCATE_LIMIT +
"; deleting " + resultsToDelete.size() +
"; all on or before "+ threshold);
realm.executeTransaction(realm1 -> resultsToDelete.deleteAllFromRealm());
}
My TRUNCATE_LIMIT is half my UPPER_LIMIT so that this clean up does not run constantly as new records are added, but you can tune that however you like.
I'm also deleting the first record over the limit, but you can play with the boundary and retain it.
Also, my timestamp is quite exact but if your date is just "today" then you will get fuzzier results.

Related

DocumentDB Change Feed and saving Checkpoint

After reading the documentation, I'm having a hard time conceptualizing the change feed. Let's take the code from the documentation below. The second change feed is picking up the changes from the last time it was run via the checkpoints. Let's say it is being used to create summary data and there was an issue and it needed to be re-run from a prior time. I don't understand the following:
How to specify a particular time the checkpoint should start. I understand I can save the checkpoint dictionary and use that for each run, but how do you get the changes from X time to maybe rerun some summary data
Secondly, let's say we are rerunning some summary data and we save the last checkpoint used for each summarized data so we know where that one left off. How does one know that a record is in or before that checkpoint?
Code that runs from collection beginning and then from last checkpoint:
Dictionary < string, string > checkpoints = await GetChanges(client, collection, new Dictionary < string, string > ());
await client.CreateDocumentAsync(collection, new DeviceReading {
DeviceId = "xsensr-201", MetricType = "Temperature", Unit = "Celsius", MetricValue = 1000
});
await client.CreateDocumentAsync(collection, new DeviceReading {
DeviceId = "xsensr-212", MetricType = "Pressure", Unit = "psi", MetricValue = 1000
});
// Returns only the two documents created above.
checkpoints = await GetChanges(client, collection, checkpoints);
//
private async Task < Dictionary < string, string >> GetChanges(
DocumentClient client,
string collection,
Dictionary < string, string > checkpoints) {
List < PartitionKeyRange > partitionKeyRanges = new List < PartitionKeyRange > ();
FeedResponse < PartitionKeyRange > pkRangesResponse;
do {
pkRangesResponse = await client.ReadPartitionKeyRangeFeedAsync(collection);
partitionKeyRanges.AddRange(pkRangesResponse);
}
while (pkRangesResponse.ResponseContinuation != null);
foreach(PartitionKeyRange pkRange in partitionKeyRanges) {
string continuation = null;
checkpoints.TryGetValue(pkRange.Id, out continuation);
IDocumentQuery < Document > query = client.CreateDocumentChangeFeedQuery(
collection,
new ChangeFeedOptions {
PartitionKeyRangeId = pkRange.Id,
StartFromBeginning = true,
RequestContinuation = continuation,
MaxItemCount = 1
});
while (query.HasMoreResults) {
FeedResponse < DeviceReading > readChangesResponse = query.ExecuteNextAsync < DeviceReading > ().Result;
foreach(DeviceReading changedDocument in readChangesResponse) {
Console.WriteLine(changedDocument.Id);
}
checkpoints[pkRange.Id] = readChangesResponse.ResponseContinuation;
}
}
return checkpoints;
}
DocumentDB supports check-pointing only by the logical timestamp returned by the server. If you would like to retrieve all changes from X minutes ago, you would have to "remember" the logical timestamp corresponding to the clock time (ETag returned for the collection in the REST API, ResponseContinuation in the SDK), then use that to retrieve changes.
Change feed uses logical time in place of clock time because it can be different across various servers/partitions. If you would like to see change feed support based on clock time (with some caveats on skew), please propose/upvote at https://feedback.azure.com/forums/263030-documentdb/.
To save the last checkpoint per partition key/document, you can just save the corresponding version of the batch in which it was last seen (ETag returned for the collection in the REST API, ResponseContinuation in the SDK), like Fred suggested in his answer.
How to specify a particular time the checkpoint should start.
You could try to provide a logical version/ETag (such as 95488) instead of providing a null value as RequestContinuation property of ChangeFeedOptions.

Retrieving row count from QSqlQuery, but got -1

I'm trying to get the row count of a QSqlQuery, the database driver is qsqlite
bool Database::runSQL(QSqlQueryModel *model, const QString & q)
{
Q_ASSERT (model);
model->setQuery(QSqlQuery(q, my_db));
rowCount = model->query().size();
return my_db.lastError().isValid();
}
The query here is a select query, but I still get -1;
If I use model->rowCount() I get only ones that got displayed, e.g 256, but select count(*) returns 120k results.
What's wrong about it?
This row count code extract works for SQLite3 based tables as well as handles the "fetchMore" issue associated with certain SQLite versions.
QSqlQuery query( m_database );
query.prepare( QString( "SELECT * FROM MyDatabaseTable WHERE SampleNumber = ?;"));
query.addBindValue( _sample_number );
bool table_ok = query.exec();
if ( !table_ok )
{
DATABASETHREAD_REPORT_ERROR( "Error from MyDataBaseTable", query.lastError() );
}
else
{
// only way to get a row count, size function does not work for SQLite3
query.last();
int row_count = query.at() + 1;
qDebug() << "getNoteCounts = " << row_count;
}
The documentation says:
Returns ... -1 if the size cannot be determined or if the database does not support reporting information about query sizes.
SQLite indeed does not support this.
Please note that caching 120k records is not very efficient (nobody will look at all those); you should somehow filter them to get the result down to a manageable size.

Faster database access by index

I have this code
using (var contents = connection.CreateCommand())
{
contents.CommandText = "SELECT [subject],[note] FROM tasks";
var r = contents.ExecuteReader();
int zaehler = 0;
int zielzahl = 5;
while (r.Read())
{
if (zaehler == zielzahl)
{
//access r["subject"].ToString()
}
zaehler++;
}
}
I want to make it faster by accessing zielzahl directly like r[zielzahl] instead of iterating through all entries. But
r[zielzahl]["subject"]
does not work aswell as
r["subject"][zielzahl]
How do I access the column subject of result number zielzahl?
To get only the sixth record, use the OFFSET clause:
SELECT subject, note
FROM tasks
LIMIT 1 OFFSET 5
Please note that the order of returned records is not guaranteed unless you use the ORDER BY clause.

Linq to dataset, Query optimization

I have following linq queries:
var itembind = (from q in dsSerach.Tables[0].AsEnumerable()
select new
{
PatternID = q.Field<int>("PatternID"),
PatternName = q.Field<string>("PatternName") + " " + q.Field<string>("ColorID") + q.Field<string>("BookID"),
ColorID = q.Field<string>("ColorID"),
BookID = q.Field<string>("BookID"),
CoverImage = (from img1 in objJFEntities.ProductImages.ToList()
where img1.PatternName.ToLower() == q.Field<string>("PatternName").ToLower()
select new CoverImage
{
URL = "Images/MediumPatternImages/" +
q.Field<string>("PatternName") + "_" + q.Field<string>("ColorID") + q.Field<string>("BookID") + q.Field<string>("ImageExtension"),
ID = q.Field<int>("ProductImageID")
}).FirstOrDefault(),
TotalCount = q.Field<int>("TotalCount")
}).Distinct();
var patterns = (from r in itembind
group r by new { r.PatternID, r.ColorID } into g
select new SearchPattern
{
PatternID = g.Key.PatternID,
PatternName = string.Join(",", g.OrderBy(s => s.ColorID).OrderBy(s => s.BookID)
.Select(s => String.Format("<a href='{0:s}' title='{1:s}'>{2:s}</a><br />",
new object[] { String.Format("Product.aspx?ID={0}&img={1}", g.Key.PatternID, s.CoverImage.ID), s.PatternName, s.PatternName })).FirstOrDefault()),
CoverImage = g.Count() > 1 ? (from img1 in objJFEntities.ProductImages.ToList()
where img1.ProductImageID == g.Select(i => i.CoverImage.ID).FirstOrDefault() && img1.ColorID.ToString() == g.Key.ColorID
select new CoverImage
{
URL = "Images/MediumPatternImages/" +
img1.PatternName + "_" + img1.ColorID + img1.BookID + img1.ImageExtension,
ID = img1.ProductImageID
}).FirstOrDefault() : g.Select(i => i.CoverImage).FirstOrDefault()
}).ToList();
these queries are taking more then 1 minute to execute for the 1000 records only.
The dsSearch is a dataset filled with records returned from my procedure in SQL.
Am using entity framework. The site is deployed with IIS7.0. The SQL server 2008 is in use.
I got "Error Message:Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." ,
"Cannot open database "DB" requested by the login. The login failed." & "The underlying provider failed on Open." kind of error very frequently site.
Please tell me how to optimize such a query.
EDIT:
Here is the procedure
http://pastie.org/7160934
In the first query you are doing a objJFEntities.ProductImages.ToList() , with the ToList() call you are fetching every entry from the database, and later filter the results in memory.
Rolfvm is correct in pointing out that objJFEntities.ProductImages causes the problem, but the analysis is a bit different. You fetch the entire ProductImages table into memory for each iteration of the query when you enumerate over it. So one optimization would be to fetch the images first in a collection and use that collection in the query statement
var localImages = objJFEntities.ProductImages.ToList();
...
CoverImage = (from img1 in localImages....
But then, your query seems to do far too much. You build the first part itembind without executing it. Then you build the second part (var patterns = (from r in itembind) and execute it by ToList(). But in the second part you never use the CoverImage from the first part. So creating these is a waste of resources. (Or you skimmed the code, hiding another use of the first part).

Retrieve Cellset Value in SSAS\MDX

Im writing SSAS MDX queries involving more than 2 axis' to retrieve a value. Using ADOMD.NET, I can get the returned cellset and determine the value by using
lblTotalGrossSales.Text = CellSet.Cells(0).Value
Is there a way I can get the CellSet's Cell(0) Value in my MDX query, instead of relying on the data returning to ADOMD.NET?
thanks!
Edit 1: - Based on Daryl's comment, here's some elaboration on what Im doing. My current query is using several axis', which is:
SELECT {[Term Date].[Date Calcs].[MTD]} ON 0,
{[Sale Date].[YQMD].[DAY].&[20121115]} ON 1,
{[Customer].[ID].[All].[A612Q4-35]} ON 2,
{[Measures].[Loss]} ON 3
FROM OUR_CUBE
If I run that query in Management Studio, I am told Results cannot be displayed for cellsets with more than two axes - which makes sense since.. you know.. there's more than 2 axes. However, if I use ADOMD.NET to run this query in-line, and read the returning value into an ADOMD.NET cellset, I can check the value at cell "0", giving me my value... which as I understand it (im a total noob at cubes) is the value sitting where all these values intersect.
So to answer your question Daryl, what I'd love to have is the ability to have the value here returned to me, not have to read in a cell set into the calling application. Why you may ask? Well.. ultimately I'd love to have one query that performs several multi-axis queries to return the values. Again.. Im VERY new to cubes and MDX, so it's possible Im going at this all wrong (Im a .NET developer by trade).
Simplify your query to return two axis;
SELECT {[Measures].[Loss]} ON 0, {[Term Date].[Date Calcs].[MTD] * [Sale Date].[YQMD].[DAY].&[20121115] * [Customer].[ID].[All].[A612Q4-35]} ON 1 FROM OUR_CUBE
and then try the following to access the cellset;
string connectionString = "Data Source=localhost;Catalog=AdventureWorksDW2012";
//Create a new string builder to store the results
System.Text.StringBuilder result = new System.Text.StringBuilder();
AdomdConnection conn = new AdomdConnection(connectionString);
//Connect to the local serverusing (AdomdConnection conn = new AdomdConnection("Data Source=localhost;"))
{
conn.Open();
//Create a command, using this connection
AdomdCommand cmd = conn.CreateCommand();
cmd.CommandText = #"SELECT { [Measures].[Unit Price] } ON COLUMNS , {[Product].[Color].[Color].MEMBERS-[Product].[Color].[]} * [Product].[Model Name].[Model Name]ON ROWS FROM [Adventure Works] ;";
//Execute the query, returning a cellset
CellSet cs = cmd.ExecuteCellSet();
//Output the column captions from the first axis//Note that this procedure assumes a single member exists per column.
result.Append("\t\t\t");
TupleCollection tuplesOnColumns = cs.Axes[0].Set.Tuples;
foreach (Microsoft.AnalysisServices.AdomdClient.Tuple column in tuplesOnColumns)
{
result.Append(column.Members[0].Caption + "\t");
}
result.AppendLine();
//Output the row captions from the second axis and cell data//Note that this procedure assumes a two-dimensional cellset
TupleCollection tuplesOnRows = cs.Axes[1].Set.Tuples;
for (int row = 0; row < tuplesOnRows.Count; row++)
{
for (int members = 0; members < tuplesOnRows[row].Members.Count; members++ )
{
result.Append(tuplesOnRows[row].Members[members].Caption + "\t");
}
for (int col = 0; col < tuplesOnColumns.Count; col++)
{
result.Append(cs.Cells[col, row].FormattedValue + "\t");
}
result.AppendLine();
}
conn.Close();
TextBox1.Text = result.ToString();
} // using connection
Source : Retrieving Data Using the CellSet
This is fine upto select on columns and on Rows. It will be helpful analyze how to traverse sub select queries from main query.

Resources