I have nearly 70,000 entries in my Cassandra database and want to fetch and process them in less than one second. I followed the DataStax documentation and came up with the following solution, but I still feel there may be some mistakes. I would highly appreciate any advice and tips for optimizing my code further.
public Map<String, String> loadObject(ArrayList<Integer> tradingAccountList) {
    Map<String, String> orderListMap = new HashMap<>();
    List<ResultSetFuture> futures = new ArrayList<>();
    try {
        com.datastax.driver.core.Session session = jdbcUtils.getCassandraSession();
        PreparedStatement statement = session.prepare(
                "SELECT * FROM omsks_v1.ordersStringV1 WHERE tradacntid = ?");
        // Fire one asynchronous query per trading account
        for (Integer tradingAccount : tradingAccountList) {
            futures.add(session.executeAsync(statement.bind(tradingAccount).setFetchSize(50000)));
        }
        // Consume the result sets as the queries complete
        for (ListenableFuture<ResultSet> future : Futures.inCompletionOrder(futures)) {
            for (Row row : future.get()) {
                orderListMap.put(row.getString("cliordid"), row.getString("ordermsg"));
            }
        }
    } catch (Exception e) {
        // Swallowing exceptions hides failures; at least log them
        e.printStackTrace();
    }
    return orderListMap;
}
I currently have a problem with the DataReader when creating a Microsoft.SqlServer.Management.Smo.Table asynchronously. Note: my SmoTable class derives from TableView and implements IDisposable.
private async Task Generate()
{
    await Task.Run(() =>
    {
        MSSMSDatabase db = CreateDB(txtDBname.Text);
        List<string> tableNames = GetTableNameList();
        foreach (string tableName in tableNames)
        {
            using (SmoTable tbl = new Table(db, tableName)) // <=== after a few loops, the error occurs within here.
            {
                foreach (var col in columnList)
                {
                    tbl.AddColumns(col);
                }
                tbl.Create();
            }
        }
    });
}
Microsoft.SqlServer.Management.Smo.FailedOperationException: InvalidOperationException: There is already an open DataReader associated with this Connection which must be closed first.
I tried implementing IDisposable in my SmoTable class, which I derived from the TableView class, but I still get the same error.
Thanks in advance.
Through trial and error I found that you need to create a new connection for each table creation so that each one gets its own DataReader. So, if you include the instantiation of Server in the foreach loop, it will create a new connection and hence a new DataReader.
foreach (string tableName in tableNames)
{
    _server = GetSQLServer(); // <=== basically a "Server server = new Server(); return server;" kind of method.
    db = _server.Databases[_databaseName]; // new connection (and therefore a new DataReader) per table
    using (SmoTable tbl = new Table(db, tableName))
    {
        foreach (var col in columnList)
        {
            tbl.AddColumns(col);
        }
        tbl.Create();
    }
}
I wrote the query below in CMIS:
Query: select * from cmis:document
But it returns only the first 100 results, even though the repository actually contains more than 100.
How can I get all the results using the same query?
Here is my CMIS code:
public ArrayList<JSONObject> search() {
    CMISSession1 s = new CMISSession1();
    Session session = s.getSession();

    StringBuilder sb = new StringBuilder();
    sb.append("select * from hr:hrdoctype");

    // execute query
    ItemIterable<QueryResult> results = session.query(sb.toString(), false);

    ArrayList<JSONObject> list = new ArrayList<>();
    for (QueryResult qr : results) {
        GregorianCalendar gc = (GregorianCalendar) qr.getPropertyValueById("cmis:creationDate");
        try {
            int month = gc.getTime().getMonth();
            // ...
        } catch (org.apache.chemistry.opencmis.commons.exceptions.CmisRuntimeException e) {
        }
        // ...
        list.add(json);
    }
    return list;
}
Please help. Thanks in advance.
From an OpenCMIS point of view that looks alright.
However, for performance reasons you should change the batch size:
OperationContext oc = session.createOperationContext();
oc.setMaxItemsPerPage(10000); // batch size, default = 100
results = session.query(sb.toString(), false, oc);
Please see also this thread: https://community.alfresco.com/thread/206836-alfresco-cmis-query-returning-only-100-results
I have code like this:
public bool Set(IEnumerable<WhiteForest.Common.Entities.Projections.RequestProjection> requests)
{
    var documentSession = _documentStore.OpenSession();
    try
    {
        foreach (var request in requests)
        {
            documentSession.Store(request);
        }
        //requests.AsParallel().ForAll(x => documentSession.Store(x));
        documentSession.SaveChanges();
        documentSession.Dispose();
        return true;
    }
    catch (Exception e)
    {
        _log.LogDebug("Exception in RavenRequstRepository - Set. Exception is [{0}]", e.ToString());
        return false;
    }
}
This code gets called many times. After around 50,000 documents have passed through it, I get an OutOfMemoryException.
Any idea why? Perhaps after a while I need to declare a new DocumentStore?
Thank you.
UPDATE:
I ended up using the Batch/Patch API to perform the update I needed.
You can see the discussion here: https://groups.google.com/d/topic/ravendb/3wRT9c8Y-YE/discussion
Basically, since I only needed to update one property on my objects, and after considering Ayende's comments about re-serializing all the objects back to JSON, I did something like this:
internal void Patch()
{
    List<string> docIds = new List<string>() { "596548a7-61ef-4465-95bc-b651079f4888", "cbbca8d5-be45-4e0d-91cf-f4129e13e65e" };
    using (var session = _documentStore.OpenSession())
    {
        session.Advanced.DatabaseCommands.Batch(GenerateCommands(docIds));
    }
}

private List<ICommandData> GenerateCommands(List<string> docIds)
{
    List<ICommandData> retList = new List<ICommandData>();
    foreach (var item in docIds)
    {
        retList.Add(new PatchCommandData()
        {
            Key = item,
            Patches = new[] { new Raven.Abstractions.Data.PatchRequest() {
                Name = "Processed",
                Type = Raven.Abstractions.Data.PatchCommandType.Set,
                Value = new RavenJValue(true)
            }}
        });
    }
    return retList;
}
Hope this helps. Thanks a lot.
I just did this for my current project. I chunked the data into pieces and saved each chunk in a new session. This may work for you, too.
Note that this example chunks 1,024 documents at a time, but requires at least 2,000 before deciding it's worth chunking. So far, my inserts got the best performance with a chunk size of 4,096; I think that's because my documents are relatively small.
internal static void WriteObjectList<T>(List<T> objectList)
{
    int numberOfObjectsThatWarrantChunking = 2000; // Don't bother chunking unless we have at least this many objects.
    if (objectList.Count < numberOfObjectsThatWarrantChunking)
    {
        // Just write them all at once.
        using (IDocumentSession ravenSession = GetRavenSession())
        {
            objectList.ForEach(x => ravenSession.Store(x));
            ravenSession.SaveChanges();
        }
        return;
    }

    int numberOfDocumentsPerSession = 1024; // Chunk size

    List<List<T>> objectListInChunks = new List<List<T>>();
    for (int i = 0; i < objectList.Count; i += numberOfDocumentsPerSession)
    {
        objectListInChunks.Add(objectList.Skip(i).Take(numberOfDocumentsPerSession).ToList());
    }

    Parallel.ForEach(objectListInChunks, listOfObjects =>
    {
        using (IDocumentSession ravenSession = GetRavenSession())
        {
            listOfObjects.ForEach(x => ravenSession.Store(x));
            ravenSession.SaveChanges();
        }
    });
}

private static IDocumentSession GetRavenSession()
{
    return _ravenDatabase.OpenSession();
}
Are you trying to save it all in one call?
The DocumentSession needs to turn all of the objects you pass it into a single request to the server. That means it may allocate a lot of memory for the write to the server.
Usually we recommend batches of about 1,024 items if you are doing bulk saves.
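A minimal sketch of that recommendation, reusing the _documentStore field from the question (the method name and batch handling here are illustrative, not the poster's code):
private const int BatchSize = 1024;

public void SetInBatches(IList<WhiteForest.Common.Entities.Projections.RequestProjection> requests)
{
    // Save in batches of ~1,024 with one short-lived session per batch,
    // so no single request to the server has to hold every document.
    for (int i = 0; i < requests.Count; i += BatchSize)
    {
        using (var session = _documentStore.OpenSession())
        {
            for (int j = i; j < i + BatchSize && j < requests.Count; j++)
            {
                session.Store(requests[j]);
            }
            session.SaveChanges(); // one request to the server per batch
        }
    }
}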
DocumentStore is a disposable class, so I worked around this problem by disposing the instance after each chunk. I highly doubt this is the most efficient way to run operations, but it will prevent significant memory overhead from happening.
I was running a sort of "delete all" operation like so. You can see the using blocks disposing both the DocumentStore and the IDocumentSession objects after each chunk.
static DocumentStore GetDataStore()
{
    DocumentStore ds = new DocumentStore
    {
        DefaultDatabase = "test",
        Url = "http://localhost:8080"
    };
    ds.Initialize();
    return ds;
}

static IDocumentSession GetDbInstance(DocumentStore ds)
{
    return ds.OpenSession();
}

static void Main(string[] args)
{
    int deleteCount = 0;
    int deleteSum = 0;
    do
    {
        using (var ds = GetDataStore())
        using (var db = GetDbInstance(ds))
        {
            // The `Take` operation will cap out at 1,024 by default, per Raven documentation
            var list = db.Query<MyClass>().Skip(deleteSum).Take(5000).ToList();
            deleteCount = list.Count;
            deleteSum += deleteCount;
            foreach (var item in list)
            {
                db.Delete(item);
            }
            db.SaveChanges();
            list.Clear();
        }
    } while (deleteCount > 0);
}
I have a web page that needs to update multiple records. The page gathers all the information and then starts a transaction, sending multiple UPDATE queries to the database.
foreach row
{
    // Prepare the query
    Hashtable Item = new Hashtable();
    Item.Add("Id", Id);
    Item.Add("Field1", Field1);
    Item.Add("Field2", Field2);
    Item.Add("Field3", Field3);
    ...
}
Then we launch the transaction:
DO CHANGES()
public void execute_NonQuery_procedure_transaction(string StoredProcedure, List<Hashtable> Params)
{
    using (MySqlConnection oConnection = new MySqlConnection(ConfigurationManager.AppSettings[DB]))
    {
        MySqlTransaction oTransaction;
        bool HasErrors = false;
        oConnection.Open();
        oTransaction = oConnection.BeginTransaction();
        try
        {
            MySqlCommand oCommand = new MySqlCommand(StoredProcedure, oConnection);
            oCommand.CommandType = CommandType.StoredProcedure;
            oCommand.Transaction = oTransaction;
            foreach (Hashtable hParams in Params)
            {
                oCommand.Parameters.Clear();
                IDictionaryEnumerator en = hParams.GetEnumerator();
                while (en.MoveNext())
                {
                    oCommand.Parameters.AddWithValue("_" + en.Key.ToString(), en.Value);
                    oCommand.Parameters["_" + en.Key.ToString()].Direction = ParameterDirection.Input;
                }
                oCommand.ExecuteNonQuery();
            }
        }
        catch (Exception e)
        {
            HasErrors = true;
            throw e;
        }
        finally
        {
            if (HasErrors)
                oTransaction.Rollback();
            else
                oTransaction.Commit();
            oConnection.Close();
        }
    }
}
Is there another way to do this, or is this the most efficient way?
It depends on the situation. If you have multiple row updates, new rows, deleted rows, or a combination of these modifying the table, then the efficient way to do this is a batch update.
Please go through this link: Batch Update
Hope this helps...
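For reference, a batch update through a DataAdapter typically looks something like the sketch below; the table and column names are made up for illustration, and it assumes Connector/NET's support for UpdateBatchSize, reusing the connection string lookup from the question:
using (var connection = new MySqlConnection(ConfigurationManager.AppSettings[DB]))
{
    connection.Open();
    var adapter = new MySqlDataAdapter("SELECT Id, Field1, Field2, Field3 FROM MyTable", connection);
    var builder = new MySqlCommandBuilder(adapter);

    // Required for batching: the generated command must not try to refresh each row.
    adapter.UpdateCommand = builder.GetUpdateCommand();
    adapter.UpdateCommand.UpdatedRowSource = UpdateRowSource.None;
    adapter.UpdateBatchSize = 100; // rows sent to the server per round trip

    var table = new DataTable();
    adapter.Fill(table);

    foreach (DataRow row in table.Rows)
    {
        row["Field1"] = "new value"; // apply whatever changes you need
    }

    adapter.Update(table); // sends the updates in batches instead of one by one
}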
It looks fine to me. You could avoid clearing the Command.Parameters list and just assign new values on the following iterations, but that probably leads to no visible improvement.
Pay attention, though: your throw is wrong. In C#, don't use throw e; use simply throw; so the original stack trace is preserved.
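A rough sketch of the parameter-reuse idea, adapting the loop from the question (the parameter names and types here are illustrative):
// Add each parameter once before the loop, then only change its value per iteration.
oCommand.Parameters.Add("_Id", MySqlDbType.Int32);
oCommand.Parameters.Add("_Field1", MySqlDbType.VarChar);
// ... one Add per parameter used by the stored procedure ...

foreach (Hashtable hParams in Params)
{
    IDictionaryEnumerator en = hParams.GetEnumerator();
    while (en.MoveNext())
    {
        oCommand.Parameters["_" + en.Key.ToString()].Value = en.Value; // reuse instead of Clear()/AddWithValue
    }
    oCommand.ExecuteNonQuery();
}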
I am trying to insert hundreds of records into an empty database table using the TableDirect type of SqlCeCommand. The problem is that I get a SqlCeException "Unspecified error" when calling SqlCeResultSet.Insert. Below is my code. Any hints?
Thanks
public bool StoreEventsDB2(List<DAO.Event> events)
{
    try
    {
        SqlCeCommand command = new SqlCeCommand("Event");
        command.CommandType = System.Data.CommandType.TableDirect;
        SqlCeResultSet rs = _databaseManager.ExecuteResultSet(command, ResultSetOptions.Updatable | ResultSetOptions.Scrollable);
        foreach (DAO.Event theEvent in events)
        {
            SqlCeUpdatableRecord record = rs.CreateRecord();
            record.SetInt32(0, theEvent.ID);
            record.SetInt32(1, theEvent.ParentID);
            record.SetString(2, theEvent.Name);
            record.SetDateTime(3, theEvent.DateTime);
            record.SetDateTime(4, theEvent.LastSynced);
            record.SetInt32(5, theEvent.LastSyncedTS);
            record.SetString(6, theEvent.VenueName);
            record.SetBoolean(7, theEvent.IsParentEvent);
            record.SetDateTime(11, DateTime.Now);
            rs.Insert(record);
        }
    }
    catch (SqlCeException e)
    {
        Log.Logger.GetLogger().Log(Log.Logger.LogLevel.ERROR, "[EventManager::StoreEventsDB] error: {0}", e.Message);
        return false;
    }
    catch (Exception e)
    {
        Log.Logger.GetLogger().Log(Log.Logger.LogLevel.ERROR, "[EventManager::StoreEventsDB] error: {0}", e.Message);
        return false;
    }
    return true;
}
I am unsure how your connection is managed by the database manager, which could be the culprit - make sure you are using a single connection (SQL CE doesn't play nice otherwise). Also, the result set option ResultSetOptions.Scrollable is not needed (at least I have never used it for an insert).
Below is the syntax I use when doing direct table inserts. Every database/data access object is wrapped in a using statement to dispose of objects after use - this is very important, especially with the Compact Framework and SQL CE, as the garbage collection is less than ideal (you WILL get out-of-memory exceptions!). I have also added a transaction to your code so that the operation is all or nothing.
Hope this helps:
using (var transaction = connection.BeginTransaction())
{
    using (var command = connection.CreateCommand())
    {
        command.Transaction = transaction;
        command.CommandType = CommandType.TableDirect;
        command.CommandText = "Event";
        using (var rs = command.ExecuteResultSet(ResultSetOptions.Updatable))
        {
            var record = rs.CreateRecord();
            foreach (DAO.Event theEvent in events)
            {
                record.SetInt32(0, theEvent.ID);
                record.SetInt32(1, theEvent.ParentID);
                record.SetString(2, theEvent.Name);
                record.SetDateTime(3, theEvent.DateTime);
                record.SetDateTime(4, theEvent.LastSynced);
                record.SetInt32(5, theEvent.LastSyncedTS);
                record.SetString(6, theEvent.VenueName);
                record.SetBoolean(7, theEvent.IsParentEvent);
                record.SetDateTime(11, DateTime.Now);
                rs.Insert(record);
            }
        }
        transaction.Commit();
    }
}