SQL Azure - maxing DTU percentage querying an 'empty table' - EF code-first

I have been having trouble with a database for the last month or so... (it was fine in November).
(S0 Standard tier - not even the lowest tier.) - Fixed in update 5
Select statements are causing my database to throttle (timeout even).
To make sure it wasn't just a problem with my database, I've:
Copied the database... same problem on both (unless I increase the tier).
Deleted the database and created it again (blank) from Entity Framework code-first.
The second one proved more interesting: now my database has 'no' data, and it still peaks the DTU and makes things unresponsive.
Firstly ... is this normal?
I do have more complicated databases at work that use at most about 10% of the DTU at the same level (S0), so I'm perplexed. This is just one user and one database, currently empty, and I can make it unresponsive.
Update 2:
From the copy (the one with ~10,000 records): I upgraded it to Standard S2 (potentially 5x more powerful than S0). No problems.
Downgraded it to S0 again and ran:
SET STATISTICS IO ON
SET STATISTICS TIME ON
select * from Competitions -- 6 records here...
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
(6 row(s) affected)
Table 'Competitions'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 407 ms, elapsed time = 21291 ms.
Am I misunderstanding Azure databases - do they need to keep warming up? If I run the same query again it is immediate. If I close the connection and run it again, it's back to ~20 seconds.
Update 3:
At the S1 level, the same first-time query as above completes in ~1 second.
Update 4:
Back at the S0 level ... first query:
(6 row(s) affected)
Table 'Competitions'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 35 ms.
Nothing is changing on these databases apart from the tier. After roaming around on one of my live sites (different database, schema and data) on S0, it peaked at 14.58% (it's a stats site).
It's not my best investigation, but I'm tired :D
I can give more updates if anyone is curious.
** Update 5 - fixed, sort of **
The first few 100% spikes were from the same table. After updating the schema and removing a geography field (the data in that column was always null), the load has moved to the later, smaller peaks of ~1-4%, and the response time is back in the very low milliseconds.
Thanks for the help,
Matt

The cause of the crippling 100% DTU problem was a GEOGRAPHY field:
http://msdn.microsoft.com/en-gb/library/cc280766.aspx
Removing this from my queries fixed the problem. Removing it from my EF models will hopefully make sure it never comes back.
I do want to use the geography field in Azure eventually (probably not for a few months), so if anyone knows why it was causing an unexpected amount of DTU to be spent on a (currently always null) column, that would be very useful to know for the future.
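In case it helps anyone hitting the same thing: the workaround boils down to never selecting the GEOGRAPHY column at all. A rough sketch of the idea, shown with Python/pyodbc rather than EF (the connection string and the Id/Name column list are placeholders, not the real schema):

import pyodbc

conn = pyodbc.connect(
    'Driver={ODBC Driver 17 for SQL Server};'
    'Server=...;Database=...;UID=...;PWD=...'  # placeholder connection details
)
cursor = conn.cursor()

# Project explicit columns instead of SELECT *, so the geography column is never read.
cursor.execute('SELECT Id, Name FROM Competitions')
for row in cursor.fetchall():
    print(row.Id, row.Name)

In EF the equivalent is a .Select() projection (or dropping the property from the model, as above).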

Related

Redis error: Timeout performing EVAL (5000ms) with ASP.NET session state

We are getting frequent issues like the one below after migrating session state to Redis:
Timeout performing EVAL(5000ms), inst: 0,qs:5,in:65536,serverEndpoint: Unspecified/xxxxx, mgr 10 of 10 available, clientName:xxxx, IOCP:(Busy=0,Free=200,Min=100,Max=200),WORKER:(Busy=11,Free=189,Min=100,Max=200), v:2.0.519.65453
The inst, qs, in, busy and free values change with each error.
We are using an on-prem Redis instance, Tier 1 (1 GB), with no replication memory allocated.
With each person logging in we see 2 keys added (Internal, Data), and our overall data size is about 2 MB. We have approximately 70 keys. The keys themselves are very small, but 3 of them have very big values that together make up roughly 1.7 MB of the 2 MB - one of those may be around 1 MB on its own and the other two make up 0.7 MB - while the remaining 67 keys/values account for about 0.3 MB.
I have seen that this issue generally occurs when trying to fetch one of those 3 bigger values.
Is there any restriction on the size of a key's value?
Or could it be some other issue?

AWS DAX Performance issues with table scan

Hi, I am working on a project that requires bringing all DynamoDB documents into memory. I am using the boto3 table.scan() method, which takes nearly 33 seconds for all 10k records.
I have configured DAX and used it for the table scan, which takes nearly 42 seconds for the same 10k records with the same Lambda configuration. I tried multiple times; the results are the same.
I tried the code below:
import time
import amazondax

# DAX exposes the same resource interface as boto3
daxclient = amazondax.AmazonDaxClient.resource(endpoint_url="...")
table = daxclient.Table('table_name')

start_time = time.perf_counter()
scan_args = {}
retry = True
while retry:
    try:
        response = table.scan(**scan_args)  # the resource-level scan takes no TableName
        retry = 'LastEvaluatedKey' in response
        if retry:
            scan_args['ExclusiveStartKey'] = response['LastEvaluatedKey']
    except Exception as e:
        print(e)
        break  # don't loop forever on a failing page
print(time.perf_counter() - start_time)
I tried the boto3 getItem() method and that is faster: the first time it takes 0.4 seconds and after that 0.01 seconds.
Not sure why it is not working with the table scan method.
Please suggest.
DAX doesn’t cache scan results. You therefore shouldn’t expect a performance boost and, since you’re bouncing through an extra server on the way to the database, can expect a performance penalty.
You must have very large items to see these performance numbers. And you’re doing a scan a lot? You might want to double check DynamoDB is the right fit.
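If the full table genuinely has to be read every time, a parallel scan straight against DynamoDB (skipping DAX) usually helps more than caching ever could. A rough boto3 sketch, assuming the same 'table_name' table and a made-up segment count of 4:

import concurrent.futures
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('table_name')

def scan_segment(segment, total_segments):
    # Scan one segment of the table, following pagination until it is exhausted.
    items = []
    kwargs = {'Segment': segment, 'TotalSegments': total_segments}
    while True:
        response = table.scan(**kwargs)
        items.extend(response['Items'])
        if 'LastEvaluatedKey' not in response:
            return items
        kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']

total_segments = 4  # placeholder: tune to item count and Lambda memory/CPU
with concurrent.futures.ThreadPoolExecutor(max_workers=total_segments) as pool:
    results = pool.map(scan_segment, range(total_segments), [total_segments] * total_segments)
all_items = [item for segment_items in results for item in segment_items]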

Peak read capacity units for a DynamoDB table

I need to find the peak read capacity units consumed in the last 20 seconds on one of my DynamoDB tables. I need to do this programmatically in Java and set an auto-scaling action based on the usage.
Could you please share a sample Java program that finds the peak read capacity units consumed in the last 20 seconds for a particular DynamoDB table?
Note: there are unusual spikes in the DynamoDB requests on the database, hence the need for dynamic auto-scaling.
I've tried this:
result = DYNAMODB_CLIENT.describeTable(recomtableName);
readCapacityUnits = result.getTable()
        .getProvisionedThroughput().getReadCapacityUnits();
but this gives the provisioned capacity, whereas I need the capacity consumed in the last 20 seconds.
You could use the CloudWatch API getMetricStatistics method to get a reading for the capacity metric you require. A hint for the kinds of parameters you need to set can be found here.
For that you have to use CloudWatch.
import java.util.Arrays;

import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.Dimension;
import com.amazonaws.services.cloudwatch.model.GetMetricStatisticsRequest;
import com.amazonaws.services.cloudwatch.model.GetMetricStatisticsResult;

AmazonCloudWatch client = AmazonCloudWatchClientBuilder.defaultClient();

GetMetricStatisticsRequest metricStatisticsRequest = new GetMetricStatisticsRequest();
// startDate / endDate: the time window you want to inspect (java.util.Date)
metricStatisticsRequest.setStartTime(startDate);
metricStatisticsRequest.setEndTime(endDate);
metricStatisticsRequest.setNamespace("AWS/DynamoDB");
// use "ConsumedReadCapacityUnits" if you are after reads
metricStatisticsRequest.setMetricName("ConsumedWriteCapacityUnits");
metricStatisticsRequest.setPeriod(60);
metricStatisticsRequest.setStatistics(Arrays.asList(
        "SampleCount", "Average", "Sum", "Minimum", "Maximum"));

// Scope the metric to the table you care about
Dimension dimension = new Dimension();
dimension.setName("TableName");
dimension.setValue(dynamoTableHelperService.campaignPkToTableName(campaignPk));
metricStatisticsRequest.setDimensions(Arrays.asList(dimension));

GetMetricStatisticsResult result = client.getMetricStatistics(metricStatisticsRequest);
But I bet you'd get results older than 5 minutes.
Actually, the current off-the-shelf auto-scaling uses CloudWatch. This has a drawback that for some applications is unacceptable.
When a load spike hits your table, it does not have enough capacity to respond. Provisioning with some headroom is not enough, and the table starts throttling. If records are kept in memory while waiting for the table to respond, this can simply blow up the memory. CloudWatch, on the other hand, reacts after some delay, often when the spike is already gone - in our tests it was at least 5 minutes - and it raises capacity gradually, when what was needed was to go straight up to the maximum.
Long story short: we created a custom solution with our own speedometers. It counts whatever it has to count and changes the table's capacity accordingly. There is still a delay because:
The app itself takes a bit of time to decide what to do.
The DynamoDB table takes ~30 seconds to get updated with the new capacity details.
On top of that we also have a throttling detector, so if a write/read request gets throttled we immediately raise the capacity accordingly. Sometimes the capacity level looks fine but requests are throttled anyway because of a hot-key issue.
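For what it's worth, the throttle-detector part of that idea can be sketched quite compactly (shown here in Python/boto3 for brevity, not the poster's Java; the table name and capacity numbers are placeholders):

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client('dynamodb')
TABLE_NAME = 'my-table'        # placeholder
BUMPED_READ_CAPACITY = 500     # placeholder: capacity to jump to on throttling
WRITE_CAPACITY = 50            # placeholder

def get_item_with_bump(key):
    # Read an item; if the request is throttled, raise the table's capacity immediately
    # instead of waiting for CloudWatch-driven auto-scaling to catch up.
    try:
        return dynamodb.get_item(TableName=TABLE_NAME, Key=key)
    except ClientError as e:
        if e.response['Error']['Code'] == 'ProvisionedThroughputExceededException':
            dynamodb.update_table(
                TableName=TABLE_NAME,
                ProvisionedThroughput={
                    'ReadCapacityUnits': BUMPED_READ_CAPACITY,
                    'WriteCapacityUnits': WRITE_CAPACITY,
                },
            )
        raise

Note that boto3 already retries throttled requests a few times before the exception surfaces, so this only fires once the built-in retries are exhausted.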

Is there an efficient way to process data expiration / massive deletion (to free space) with Riak on leveldb?

On Riak:
Is there a way to process data expiration or to dump old data to free some space?
Is it efficient?
Edit: Thanks to Joe for providing the answer and its workaround (answer below).
Data expiration should be thought about from the very beginning, as it requires an additional index and a map/reduce job.
Short answer: No, there is no publisher-provided expiry.
Longer answer: Include the write time, in an integer representation such as Unix epoch, in a secondary index entry on each value that you want to be subject to expiry. Then run a periodic job during off-peak times that does a ranged 2i query for any entries from 0 to (now - TTL). This can be used as the input to a map/reduce job that does the actual deletes.
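As an illustration of this approach, here is a rough sketch with the (now archived) Basho Python client; the bucket name, key and index name are placeholders, not anything from the original setup:

import time
import riak

client = riak.RiakClient()
bucket = client.bucket('events')  # placeholder bucket

# On every write, record the write time in an integer secondary index.
obj = bucket.new('event-123', data={'payload': '...'})
obj.add_index('written_at_int', int(time.time()))
obj.store()

# Periodic off-peak job: ranged 2i query from 0 to (now - TTL), then delete.
TTL_SECONDS = 30 * 24 * 3600  # placeholder TTL
cutoff = int(time.time()) - TTL_SECONDS
for key in bucket.get_index('written_at_int', 0, cutoff):
    bucket.delete(key)

In a real deployment the delete loop would be the map/reduce (or batched) job described above rather than key-by-key deletes from a single client.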
As to recovering disk space, leveldb is very slow about that. When a value is written to leveldb it starts in level 0, then as each level fills, compaction moves values to the next level, so your least recently written data resides on disk in the lowest levels. When you delete a value, a tombstone is written to level 0, which masks the previous value in the lower level, and as normal compaction occurs the tombstone is moved down as any other value would be. The disk space consumed by the old value is not reclaimed until the tombstone reaches the same level.
I have written a little C++ tool that uses the LevelDB internal function CompactRange to perform this task; you can read the article about it here.
With this we are able to delete an entire bucket (key by key) and wipe all the tombstones. 50 GB of 75 GB were freed!
Unfortunately, this only works if LevelDB is used as the backend.
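Outside of Riak, the same reclaim-by-compaction trick can be reproduced against a standalone LevelDB with the plyvel Python bindings, which expose compact_range(); a rough sketch (the database path and key prefix are placeholders):

import plyvel

db = plyvel.DB('/path/to/leveldb')  # placeholder path

# Deleting keys only writes tombstones; no disk space is freed yet.
with db.write_batch() as batch:
    for key, _ in db.iterator(prefix=b'bucket-to-drop/'):  # placeholder prefix
        batch.delete(key)

# Force compaction over the whole key space so the tombstones and the values
# they mask are merged away and the disk space is actually reclaimed.
db.compact_range()
db.close()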

SQLite3 WAL mode commit performance

I have a SQLite3 DB where, each second, I insert one row of data (500 bytes per row) into each table (around 100 tables). After several minutes, in order to keep the DB size small, I also remove the oldest row in each table. So in total I insert 50 KB of data per second and, in the steady state, remove 50 KB of data. I wrap the inserts and deletions in a transaction and commit that transaction each second.
WAL mode is enabled, with synchronous mode = NORMAL. There is another process that occasionally performs read operations on the DB, but those are very fast.
I'm seeing strange behaviour: every several minutes, the commit itself takes several seconds, while at other times it takes a few milliseconds. I tried playing with wal_autocheckpoint, with no success.
It is worth mentioning that the filesystem is on Linux software RAID inside a Linux VM. Without the RAID, performance is better, but those "hiccups" still occur.
Thanks!
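Not an answer, but for anyone who wants to reproduce the setup and experiment with the checkpointing knobs mentioned above, here is a minimal sketch using Python's built-in sqlite3 module (the file name, table schema and pragma values are placeholders). One thing worth trying is disabling automatic checkpoints and running an explicit wal_checkpoint at a moment you control, so any multi-second pause happens outside the per-second commit:

import sqlite3

conn = sqlite3.connect('data.db')               # placeholder path
conn.execute('PRAGMA journal_mode=WAL')         # WAL mode, as in the question
conn.execute('PRAGMA synchronous=NORMAL')
conn.execute('PRAGMA wal_autocheckpoint=1000')  # default: checkpoint after ~1000 WAL pages

conn.execute('CREATE TABLE IF NOT EXISTS t0 (ts INTEGER, payload BLOB)')  # placeholder schema

# Per-second batch: all inserts/deletes in one transaction, committed together.
with conn:
    conn.execute('INSERT INTO t0 VALUES (?, ?)', (0, b'\x00' * 500))

# Alternative: turn off automatic checkpoints and checkpoint explicitly.
conn.execute('PRAGMA wal_autocheckpoint=0')
conn.execute('PRAGMA wal_checkpoint(TRUNCATE)')
conn.close()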
