We have an Azure SQL database on the S1 pricing tier. Our site is extremely heavily cached, so database hits are minimal. Average DTU usage is only ~1.5%, which is great, as our DB costs are a fraction of what they used to be on our old website (£20 p/m vs £400 p/m!).
On the site, however, we do have small scripts that need to insert ~100k records or so (user notifications for when someone performs an action such as creating a new tutorial).
When this is triggered, DTUs spike at 100% for around 3-5 minutes.
The script is simply a loop which calls an insert:
using (var db = new DBContext())
{
    foreach (var userID in userIDs)
    {
        db.ExecuteCommand(
            "INSERT INTO UserNotifications " +
            "(ForUserID, Date, ForObjectTypeID, ForObjectID, TypeID, Count, MetaData1) " +
            "VALUES ({0}, {1}, NULL, {2}, {3}, {4}, {5})",
            userID, DateTime.Now.ToUniversalTime(), forObjectID, (byte)type, 1, metaData1.Value
        );
    }
}
Is there a faster way to do inserts than this?
Additionally, what would be the best way to slow down execution of this script so DTU usage doesn't choke everything up?
You are doing one row per insert, which is not efficient.
A TVP (table-valued parameter) is like a reverse DataReader and is efficient; a sketch of that approach follows the batching example below.
Lower tech is to insert 900 rows at a time (1000 is the max per INSERT ... VALUES statement). This alone is probably 400x more efficient.
StringBuilder sb = new StringBuilder();
string insert = "INSERT INTO UserNotifications " +
                "(ForUserID, Date, ForObjectTypeID, ForObjectID, TypeID, Count, MetaData1) " +
                "VALUES ";
sb.AppendLine(insert);

int count = 0;
using (var db = new DBContext())
{
    foreach (var userID in userIDs)
    {
        // Values are concatenated here for brevity; strings and dates must be
        // quoted (and ideally parameterized) to avoid syntax errors and SQL injection.
        sb.Append(count == 0 ? "" : ",");
        sb.AppendLine(string.Format("({0}, '{1:yyyy-MM-dd HH:mm:ss}', NULL, {2}, {3}, {4}, '{5}')",
            userID, DateTime.Now.ToUniversalTime(), forObjectID, (byte)type, 1, metaData1.Value));
        count++;

        if (count == 990)
        {
            db.ExecuteCommand(sb.ToString());
            count = 0;
            sb.Clear();
            sb.AppendLine(insert);
            // can sleep here (e.g. Thread.Sleep) to throttle DTU usage
        }
    }
    if (count > 0)
    {
        db.ExecuteCommand(sb.ToString());
    }
}
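For the TVP route mentioned at the top of this answer, here is a minimal sketch. It assumes a user-defined table type has been created on the server and uses plain ADO.NET; the type name, connection string variable and SQL shape are illustrative, not taken from the original code.

using System;
using System.Data;
using System.Data.SqlClient;

// Assumes this type exists on the server:
//   CREATE TYPE dbo.UserIdList AS TABLE (UserID INT);
var table = new DataTable();
table.Columns.Add("UserID", typeof(int));
foreach (var userID in userIDs)
    table.Rows.Add(userID);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "INSERT INTO UserNotifications " +
    "(ForUserID, Date, ForObjectTypeID, ForObjectID, TypeID, Count, MetaData1) " +
    "SELECT u.UserID, @date, NULL, @forObjectID, @typeID, 1, @metaData1 FROM @userIDs u", conn))
{
    // The whole ID list travels as a single structured parameter.
    var tvp = cmd.Parameters.AddWithValue("@userIDs", table);
    tvp.SqlDbType = SqlDbType.Structured;
    tvp.TypeName = "dbo.UserIdList";

    cmd.Parameters.AddWithValue("@date", DateTime.UtcNow);
    cmd.Parameters.AddWithValue("@forObjectID", forObjectID);
    cmd.Parameters.AddWithValue("@typeID", (byte)type);
    cmd.Parameters.AddWithValue("@metaData1", metaData1.Value);

    conn.Open();
    cmd.ExecuteNonQuery();   // one round trip, one set-based insert
}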
Instead of inserting entity by entity, you can insert 100 entities at a time by packing them into JSON and writing a stored procedure that uses it, as in this example:
INSERT INTO [dbo].[AISecurityLogs]
([IpAddress], [TimeRange], [Requests], [LogId])
SELECT *, LogId = @logId
FROM OPENJSON ( @json )
WITH (
IpAddress varchar(15) '$.IpAddress',
TimeRange DATETIME '$.TimeRange',
Requests int '$.Requests'
)
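As a rough illustration of the calling side, a sketch follows; the stored procedure name (dbo.usp_InsertAISecurityLogs) and its parameter names are assumptions that simply wrap the INSERT ... OPENJSON statement above.

using System.Data;
using System.Data.SqlClient;
using System.Text.Json;

// logBatch is a collection of objects with IpAddress, TimeRange and Requests properties.
var json = JsonSerializer.Serialize(logBatch);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.usp_InsertAISecurityLogs", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@json", json);
    cmd.Parameters.AddWithValue("@logId", logId);

    conn.Open();
    cmd.ExecuteNonQuery();   // one call inserts the whole batch
}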
To slow down execution without losing anything, you can put the records in a queue and then read them with an Azure job, which lets you configure the read interval, and insert into the database as shown above.
This approach handles heavy loads (I have several of these in production environments), and if something goes wrong with the job or with the database, the messages stay in the queue until you can move them to the database.
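On the producer side that can be as simple as the sketch below, assuming an Azure Storage queue (the queue name and the notifications collection are illustrative); the job then drains the queue at whatever interval you configure.

using System.Text.Json;
using Azure.Storage.Queues;

// Enqueue the work instead of hitting the database directly.
var queue = new QueueClient(storageConnectionString, "user-notifications");
queue.CreateIfNotExists();

foreach (var notification in notifications)
{
    queue.SendMessage(JsonSerializer.Serialize(notification));
}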
I wrote a query like the one below. I am able to retrieve data between fromtime and totime. My problem is that there are 30 records for every minute. I would like help getting only the first record of every hour, i.e. 24 records for one day, and I need this for 30 days.
var config = new QueryRequest
{
TableName = "dfgfdgdfg",
KeyConditionExpression = "id= :id AND plctime BETWEEN :fromtime AND :totime",
ExpressionAttributeValues = new Dictionary<string, AttributeValue> {
{
":id", new AttributeValue {S = id}
},
{
":fromtime", new AttributeValue {S = fromtime }
},
{
":totime", new AttributeValue {S = totime }
}
},
};
return await _dynamoClient.QueryAsync(config);
In addition to storing your record as-is, you could consider inserting another record that looks like this:
{
pk : "DailyMarker_" + DateTime.Now.ToString("yyyyMMdd"),   // partition key
sk : "HourlyMarker_" + DateTime.Now.ToString("yyyyMMddHH"), // range key
record: <your entire record>
}
pk and sk would be of the form DailyMarker_20191121 and HourlyMarker_2019112101. Basically, the part after the underscore acts as a date/time stamp with only the granularity you are interested in.
While inserting a marker record, you can add precondition checks which, if they fail, will prevent the insertion from taking place (see PutItem -> ConditionExpression). This operation throws an exception in most SDKs if the condition evaluates to false, so you want to handle that exception.
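A rough sketch of that conditional put with the same .NET SDK used in the question (table name reused from the question; serializedRecord is a placeholder for your record):

using Amazon.DynamoDBv2.Model;

var request = new PutItemRequest
{
    TableName = "dfgfdgdfg",
    Item = new Dictionary<string, AttributeValue>
    {
        { "pk", new AttributeValue { S = "DailyMarker_" + DateTime.UtcNow.ToString("yyyyMMdd") } },
        { "sk", new AttributeValue { S = "HourlyMarker_" + DateTime.UtcNow.ToString("yyyyMMddHH") } },
        { "record", new AttributeValue { S = serializedRecord } }
    },
    // Reject the write if a marker item for this hour already exists.
    ConditionExpression = "attribute_not_exists(pk)"
};

try
{
    await _dynamoClient.PutItemAsync(request);
}
catch (ConditionalCheckFailedException)
{
    // A record was already written for this hour; nothing to do.
}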
At this point only the first record per hour is being inserted into this PK/SK combination, and all SKs for one day end up under the same PK.
To query for different ranges, you will have to perform some calculations in your application code to determine the start and end buckets (pk and sk) that you want to query. While you will need to make one call per pk you are interested in, the range key can be queried with range queries.
You could also switch the pk to be monthly instead of daily, which reduces the number of PKs to query while increasing the potential for imbalanced keys (aka hot keys).
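Querying one daily bucket then looks much like the query in the question, just against the marker keys (sketch, assuming the markers live in a table keyed by pk/sk):

// Fetch the (up to) 24 hourly markers stored under one daily partition key.
var markerQuery = new QueryRequest
{
    TableName = "dfgfdgdfg",
    KeyConditionExpression = "pk = :pk AND sk BETWEEN :fromSk AND :toSk",
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>
    {
        { ":pk",     new AttributeValue { S = "DailyMarker_20191121" } },
        { ":fromSk", new AttributeValue { S = "HourlyMarker_2019112100" } },
        { ":toSk",   new AttributeValue { S = "HourlyMarker_2019112123" } }
    }
};
var response = await _dynamoClient.QueryAsync(markerQuery);
// Repeat per day (per pk) to cover the full 30-day window.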
I recently read about SQLite and thought I would give it a try. When I insert one record it performs okay, but when I insert one hundred it takes five seconds, and as the record count increases so does the time. What could be wrong? I am using the SQLite wrapper (System.Data.SQLite):
dbcon = new SQLiteConnection(connectionString);
dbcon.Open();
//---INSIDE LOOP
SQLiteCommand sqlComm = new SQLiteCommand(sqlQuery, dbcon);
nRowUpdatedCount = sqlComm.ExecuteNonQuery();
//---END LOOP
dbcon.Close();
Wrap BEGIN/END statements around your bulk inserts; SQLite is optimized for transactions.
dbcon = new SQLiteConnection(connectionString);
dbcon.Open();
SQLiteCommand sqlComm;
sqlComm = new SQLiteCommand("begin", dbcon);
sqlComm.ExecuteNonQuery();
//---INSIDE LOOP
sqlComm = new SQLiteCommand(sqlQuery, dbcon);
nRowUpdatedCount = sqlComm.ExecuteNonQuery();
//---END LOOP
sqlComm = new SQLiteCommand("end", dbcon);
sqlComm.ExecuteNonQuery();
dbcon.Close();
I read everywhere that creating transactions is the solution to slow SQLite writes, but it can be long and painful to rewrite your code and wrap all your SQLite writes in transactions.
I found a much simpler, safe and very efficient method: I enable a (disabled by default) SQLite 3.7.0 optimisation: the Write-Ahead Log (WAL).
The documentation says it works in all unix (i.e. Linux and OSX) and Windows systems.
How? Just run the following commands after initializing your SQLite connection:
PRAGMA journal_mode = WAL
PRAGMA synchronous = NORMAL
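With System.Data.SQLite that is two statements executed once, right after opening the connection (sketch, reusing the dbcon variable from the question):

// Enable WAL and relax synchronous mode for this database.
using (var pragma = new SQLiteCommand(
    "PRAGMA journal_mode = WAL; PRAGMA synchronous = NORMAL;", dbcon))
{
    pragma.ExecuteNonQuery();
}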
My code now runs ~600% faster: my test suite now runs in 38 seconds instead of 4 minutes :)
Try wrapping all of your inserts (aka, a bulk insert) into a single transaction:
string insertString = "INSERT INTO [TableName] ([ColumnName]) VALUES (@value)";

SQLiteCommand command = new SQLiteCommand();
command.CommandText = insertString;
command.Connection = dbConnection;
command.Parameters.AddWithValue("@value", value);

SQLiteTransaction transaction = dbConnection.BeginTransaction();
try
{
    //---INSIDE LOOP
    command.Parameters["@value"].Value = value;  // value for the current row
    nRowUpdatedCount = command.ExecuteNonQuery();
    //---END LOOP
    transaction.Commit();
    return true;
}
catch (SQLiteException)
{
    transaction.Rollback();
}
By default, SQLite wraps every insert in its own transaction, which slows down the process:
INSERT is really slow - I can only do few dozen INSERTs per second
Actually, SQLite will easily do 50,000 or more INSERT statements per second on an average desktop computer. But it will only do a few dozen transactions per second.
Transaction speed is limited by disk drive speed because (by default) SQLite actually waits until the data really is safely stored on the disk surface before the transaction is complete. That way, if you suddenly lose power or if your OS crashes, your data is still safe. For details, read about atomic commit in SQLite.
By default, each INSERT statement is its own transaction. But if you surround multiple INSERT statements with BEGIN...COMMIT then all the inserts are grouped into a single transaction. The time needed to commit the transaction is amortized over all the enclosed insert statements and so the time per insert statement is greatly reduced.
See "Optimizing SQL Queries" in the ADO.NET help file SQLite.NET.chm. Code from that page:
using (SQLiteTransaction mytransaction = myconnection.BeginTransaction())
{
using (SQLiteCommand mycommand = new SQLiteCommand(myconnection))
{
SQLiteParameter myparam = new SQLiteParameter();
int n;
mycommand.CommandText = "INSERT INTO [MyTable] ([MyId]) VALUES(?)";
mycommand.Parameters.Add(myparam);
for (n = 0; n < 100000; n++)
{
myparam.Value = n + 1;
mycommand.ExecuteNonQuery();
}
}
mytransaction.Commit();
}
I have a Database table:
Item
ID (uniqueidentifier)
Index (int)
I have a list of 2000 key-value pairs where the key is ID and the value is Index, which I need to update. How can I update all 2000 items in the database using one single SQL query?
Right now I have something like this:
// this dictionary has 2000 values
Dictionary<Guid, int> values = new Dictionary<Guid,int>();
foreach(KeyValuePair<Guid, int> item in values)
{
_db.Database.ExecuteSqlCommand("UPDATE [Item] SET [Index] = #p0 WHERE [Id] = #p1", item.Value, item.Key);
}
However, I am making too many requests to SQL Server, and I want to improve this.
Use table value parameters to send those values to SQL Server and update Items table in one shot:
CREATE TYPE KeyValueType AS TABLE
(
    [Key]   UNIQUEIDENTIFIER,
    [Value] INT
);

CREATE PROCEDURE dbo.usp_UpdateItems
    @pairs KeyValueType READONLY
AS
BEGIN
    UPDATE I
    SET [Index] = P.[Value]
    FROM [Item] I
    INNER JOIN @pairs P ON P.[Key] = I.[Id]
END;
GO
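On the C# side you fill a DataTable that matches KeyValueType and send it as a single structured parameter. A sketch (reusing the _db context from the question) could look like this:

using System.Data;
using System.Data.SqlClient;

// Ship all 2000 pairs in one structured parameter instead of 2000 round trips.
var table = new DataTable();
table.Columns.Add("Key", typeof(Guid));
table.Columns.Add("Value", typeof(int));
foreach (KeyValuePair<Guid, int> item in values)
    table.Rows.Add(item.Key, item.Value);

var pairs = new SqlParameter("@pairs", SqlDbType.Structured)
{
    TypeName = "dbo.KeyValueType",
    Value = table
};

_db.Database.ExecuteSqlCommand("EXEC dbo.usp_UpdateItems @pairs", pairs);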
If you really need to update in that manner and have no other alternative, the main way around it could be this rather "ugly" technique (and therefore rarely used, but it still works pretty well):
Make all 2000 statements in one string, and execute that one string. That makes one call to the database with the 2000 updates in it.
So basically something like this (the code is not meant to run as-is; it's just an example to illustrate the technique):
Dictionary<Guid, int> values = new Dictionary<Guid, int>();
System.Text.StringBuilder sb = new System.Text.StringBuilder();
foreach (KeyValuePair<Guid, int> item in values)
{
sb.Append(String.Format("UPDATE [Item] SET [Index] = {0} WHERE [Id] = '{1}';", item.Value, item.Key));
}
_db.Database.ExecuteSqlCommand(sb.ToString());
I'm just trying to import data into a SQL database from a DataSet, but it's quite slow when I try to import 70,000 rows. Am I doing something wrong or missing something?
Could you please give me some advice on how I can do it better?
Here is my ASP.NET code:
ArtiDB entity = new ArtiDB();
int grid = 50;
foreach (string item_kisiler in kisiler)
{
if (item_kisiler == "")
continue;
if (Tools.isNumber(item_kisiler) == false)
continue;
else
{
string gsm1 = item_kisiler;
if (gsm1.Length > 10)
gsm1 = gsm1.Substring(1, 10);
entity.veriaktar(gsm1, gg, grid);
}
}
This is my stored procedure:
alter proc veriaktar
(
#gsm1 nvarchar(50)=null,
#userid uniqueidentifier,
#grupid int = 0
)
as
begin
Declare #AltMusID int
if not exists (select * from tbl_AltMusteriler with (updlock, rowlock, holdlock) where Gsm1=#gsm1 and UserId=#userid)
begin
insert into tbl_AltMusteriler (Gsm1,UserId)
values (#gsm1,#userid)
Set #AltMusID = scope_identity()
end
else
begin
Set #AltMusID = (select AltMusteriID from tbl_AltMusteriler with (updlock, rowlock, holdlock) where Gsm1=#gsm1 and UserId=#userid)
end
if (#grupid != 0)
begin
if not exists (select * from tbl_KisiGrup with (updlock, rowlock, holdlock) where GrupID=#grupid and AltMusteriID=#AltMusID)
begin
insert into tbl_KisiGrup values(#grupid,#AltMusID)
end
end
end
go
The server is designed to work with sets. You're requiring it to deal with one row at a time, and with each row three times. Stop doing that and things will get better.
First, go back to your .NET docs and look for a way to do one INSERT for all 70,000 rows. If you can use the Bulk Copy (bcp) feature of SQL Server (SqlBulkCopy in .NET), you should be able to insert the whole set in 10-20 seconds.
The read-test-update paradigm might work here, but it's error-prone and forces the server to work much harder than necessary. If some of the 70,000 are new and others are updates, bulk them into a temporary table and use MERGE to apply it to tbl_AltMusteriler.
Second, uniqueidentifier isn't a good sign. It looks like tbl_AltMusteriler is used to generate a surrogate key. Why wouldn't a simple integer do? It would be faster to generate (with IDENTITY), easier to read, faster to query, and have better PK properties generally. (Also, make sure both the natural key and the surrogate are declared to be unique. What would it mean if two rows have the same values for gsm1 and userid, differing only by AltMusteriID?)
In short, find a way to insert all rows at once, so that your interaction with the DBMS is limited to one or at most two calls.
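A sketch of that shape using SqlBulkCopy and a temp staging table (the staging table, connection string and variable names here are illustrative, not taken from the question):

using System;
using System.Data;
using System.Data.SqlClient;

// Load all rows into a staging table in one bulk operation,
// then let the server merge them as a set.
var staging = new DataTable();
staging.Columns.Add("Gsm1", typeof(string));
staging.Columns.Add("UserId", typeof(Guid));
foreach (string gsm in cleanedNumbers)        // the validated/trimmed phone numbers
    staging.Rows.Add(gsm, userId);

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();

    // Temp table lives for the lifetime of this connection.
    new SqlCommand("CREATE TABLE #Staging (Gsm1 NVARCHAR(50), UserId UNIQUEIDENTIFIER);", conn)
        .ExecuteNonQuery();

    using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "#Staging" })
    {
        bulk.WriteToServer(staging);          // one bulk load instead of 70,000 calls
    }

    // One set-based statement to apply the whole batch.
    new SqlCommand(@"
        MERGE tbl_AltMusteriler AS T
        USING #Staging AS S
           ON T.Gsm1 = S.Gsm1 AND T.UserId = S.UserId
        WHEN NOT MATCHED THEN
            INSERT (Gsm1, UserId) VALUES (S.Gsm1, S.UserId);", conn)
        .ExecuteNonQuery();
}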
I have this SQL query:
SELECT Sum(ABS([Minimum Installment])) AS SumOfMonthlyPayments FROM tblAccount
INNER JOIN tblAccountOwner ON tblAccount.[Creditor Registry ID] = tblAccountOwner.
[Creditor Registry ID] AND tblAccount.[Account No] = tblAccountOwner.[Account No]
WHERE (tblAccountOwner.[Account Owner Registry ID] = 731752693037116688)
AND (tblAccount.[Account Type] NOT IN
('CA00', 'CA01', 'CA03', 'CA04', 'CA02', 'PA00', 'PA01', 'PA02', 'PA03', 'PA04'))
AND (DATEDIFF(mm, tblAccount.[State Change Date], GETDATE()) <=
4 OR tblAccount.[State Change Date] IS NULL)
AND ((tblAccount.[Account Type] IN ('CL10','CL11','PL10','PL11')) OR
CONTAINS(tblAccount.[Account Type], 'Mortgage')) AND (tblAccount.[Account Status ID] <> 999)
I have created a Linq query:
var ownerRegistryId = 731752693037116688;
var excludeTypes = new[]
{
"CA00", "CA01", "CA03", "CA04", "CA02",
"PA00", "PA01", "PA02", "PA03", "PA04"
};
var maxStateChangeMonth = 4;
var excludeStatusId = 999;
var includeMortgage = new[] { "CL10", "CL11", "PL10", "PL11" };
var sum = (
from account in context.Accounts
from owner in account.AccountOwners
where owner.AccountOwnerRegistryId == ownerRegistryId
where !excludeTypes.Contains(account.AccountType)
where account.StateChangeDate == null ||
(account.StateChangeDate.Month - DateTime.Now.Month)
<= maxStateChangeMonth
where includeMortgage.Contains(account.AccountType) ||
account.AccountType.Contains("Mortgage")
where account.AccountStatusId != excludeStatusId
select account.MinimumInstallment).ToList()
.Sum(minimumInstallment =>
Math.Abs((decimal)(minimumInstallment)));
return sum;
Are they equal/the same? I don't have records in the DB, so I can't confirm whether they are equivalent. In the SQL there are brackets (parentheses), but in the LINQ I didn't use them, so is that OK?
Please suggest.
It is not possible for us to say anything about this, because you didn't show us the DBML. The actual definition of the mapping between the model and the database is important to be able to see how this executes.
But before you add the DBML to your question: we are not here to do your work, so here are two tips to find out whether they are equal or not:
Insert data in your database and run the queries.
Use a SQL profiler and see what query is executed by your LINQ provider under the covers.
If you have anything more specific to ask, we will be very willing to help.
The brackets will be generated by LINQ provider, if necessary.
The simplest way to check whether the LINQ query is equal to the initial SQL query is to log it, like @Atanas Korchev suggested.
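With LINQ to SQL (the DBML mentioned above) that logging is a one-liner, assuming context is your DataContext:

// LINQ to SQL writes the generated SQL to any TextWriter you assign.
context.Log = Console.Out;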
If you are using Entity Framework, however, there is no Log property, but you can try to cast your query to an ObjectQuery and then call the ToTraceString method:
string sqlQuery = (sum as ObjectQuery).ToTraceString();
UPD. The ToTraceString method needs an ObjectQuery instance for tracing, and the ToList() call already performs materialization, so there is nothing to trace. Here is the updated code:
var sum = (
from account in context.Accounts
from owner in account.AccountOwners
where owner.AccountOwnerRegistryId == ownerRegistryId
where !excludeTypes.Contains(account.AccountType)
where account.StateChangeDate == null ||
(account.StateChangeDate.Month - DateTime.Now.Month)
<= maxStateChangeMonth
where includeMortgage.Contains(account.AccountType) ||
account.AccountType.Contains("Mortgage")
where account.AccountStatusId != excludeStatusId
select account.MinimumInstallment);
string sqlQuery = (sum as ObjectQuery).ToTraceString();
Please note that this code will not perform the actual query, it is usable for testing purposes only.
Check out this article if you are interested in ready-for-production logging implementation.
There can be a performance difference:
The SQL query returns a single number (SELECT Sum...) directly from the database server to the client which executes the query.
In your LINQ query you have a greedy operator (.ToList()) in between:
var sum = (...
...
select account.MinimumInstallment).ToList()
.Sum(minimumInstallment =>
Math.Abs((decimal)(minimumInstallment)));
That means that the query on the SQL server does not contain the .Sum operation. The query returns a (potentially long?) list of MinimumInstallments. Then the .Sum operation is performed in memory on the client.
So effectively you switch from LINQ to Entities to LINQ to Objects after .ToList().
BTW: check the last proposal in your previous question here, which would avoid .ToList() on this query (if that proposal works) and would therefore be closer to the SQL statement.
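For reference, a sketch of that shape, keeping the Sum inside the query so only one number comes back (this assumes the provider can translate Math.Abs; if not, select the column and sum on the client as before):

// Compose Sum into the query itself so the SUM(ABS(...)) runs on the server.
var sum = (from account in context.Accounts
           from owner in account.AccountOwners
           where owner.AccountOwnerRegistryId == ownerRegistryId
           where !excludeTypes.Contains(account.AccountType)
           where account.StateChangeDate == null ||
                 (account.StateChangeDate.Month - DateTime.Now.Month) <= maxStateChangeMonth
           where includeMortgage.Contains(account.AccountType) ||
                 account.AccountType.Contains("Mortgage")
           where account.AccountStatusId != excludeStatusId
           select account.MinimumInstallment)
          .Sum(mi => Math.Abs((decimal)mi));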