We have started to use the updated System.Web.Providers from the Microsoft.AspNet.Providers.Core package on NuGet. When we started to migrate our existing users, we found performance degrading and then deadlocks occurring, with fewer than 30,000 users (far fewer than the 1,000,000+ we need to create). We were calling the provider from multiple threads on each server, with multiple servers running the same process, both to create all the users we required as quickly as possible and to simulate the load we expect to see when it goes live.
The logs SQL Server generated for a deadlock contained the EF-generated SQL below:
SELECT
[Limit1].[UserId] AS [UserId]
, [Limit1].[ApplicationId] AS [ApplicationId]
, [Limit1].[UserName] AS [UserName]
, [Limit1].[IsAnonymous] AS [IsAnonymous]
, [Limit1].[LastActivityDate] AS [LastActivityDate]
FROM
(SELECT TOP (1)
[Extent1].[UserId] AS [UserId]
, [Extent1].[ApplicationId] AS [ApplicationId]
, [Extent1].[UserName] AS [UserName]
, [Extent1].[IsAnonymous] AS [IsAnonymous]
, [Extent1].[LastActivityDate] AS [LastActivityDate]
FROM
[dbo].[Users] AS [Extent1]
INNER JOIN [dbo].[Applications] AS [Extent2] ON [Extent1].[ApplicationId] = [Extent2].[ApplicationId]
WHERE
((LOWER([Extent2].[ApplicationName])) = (LOWER(@p__linq__0)))
AND ((LOWER([Extent1].[UserName])) = (LOWER(@p__linq__1)))
) AS [Limit1]
We ran the query manually and the execution plan said that it was performing a table scan even though there was an underlying index. The reason for this is the use of LOWER([Extent1].[UserName]).
We looked at the provider code to see if we were doing something wrong, or if there was a way to either intercept or replace the database access code. We didn't see any options to do this, but we did find the source of the LOWER issue: .ToLower() is being called on both the column and the parameter.
return (from u in ctx.Users
join a in ctx.Applications on u.ApplicationId equals a.ApplicationId into a
where (a.ApplicationName.ToLower() == applicationName.ToLower()) && (u.UserName.ToLower() == userName.ToLower())
select u).FirstOrDefault<User>();
Does anyone know of a way to change the behaviour of the provider so that it does not use .ToLower(), allowing the index to be used?
You can create an index on LOWER(username), per Sql Server : Lower function on Indexed Column:
ALTER TABLE dbo.users ADD LowerFieldName AS LOWER(username) PERSISTED
CREATE NONCLUSTERED INDEX IX_users_LowerFieldName_ ON dbo.users(LowerFieldName)
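A lookup that filters on the computed column can then seek on the index above instead of scanning; a minimal sketch, where @username is an illustrative parameter:
DECLARE @username nvarchar(256) = N'SomeUser'; -- illustrative value
SELECT UserId, UserName
FROM dbo.users
WHERE LowerFieldName = LOWER(@username);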
I was using the System.Web.Providers.DefaultMembershipProvider membership provider too, but found that it was really slow. I changed to System.Web.Security.SqlMembershipProvider and found it to be much faster (more than 5 times faster).
This tutorial shows you how to set up the SQL database that you need to use the SqlMembershipProvider http://weblogs.asp.net/sukumarraju/archive/2009/10/02/installing-asp-net-membership-services-database-in-sql-server-expreess.aspx
The auto-generated database uses stored procedures, which may or may not be an issue for your DB guys.
Related
I've been attempting to increase my knowledge by trying out some challenges. I've been going at this for a solid two weeks now and have finished most of the challenge, but this one part remains. The error is shown below; what am I not understanding?
Error in sqlite query: update users set last_browser= 'mozilla' + select sql from sqlite_master'', last_time= '13-04-2019' where id = '14'
edited for clarity:
I'm trying a CTF challenge and I'm completely new to this kind of thing so I'm learning as I go. There is a login page with test credentials we can use for obtaining many of the flags. I have obtained most of the flags and this is the last one that remains.
After I log in on the webapp with the provided test credentials, the messages shown at this link appear.
The question for the flag is "What value is hidden in the database table secret?"
So from the previous image, I have attempted to use SQL injection to obtain the value, using Burp Suite and attempting to inject through the User-Agent header.
I have tried many variants of the injection attempt shown above. I'm struggling to find out where I am going wrong, especially since the second single quote is added automatically to the query. I've gone through the SQLite documentation and examples of SQL injection, but I cannot seem to understand what I am doing wrong or how to get it to work.
A subquery such as select sql from sqlite_master should be enclosed in brackets.
So you'd want
update users set last_browser= 'mozilla' + (select sql from sqlite_master), last_time= '13-04-2019' where id = '14';
Although I don't think that will achieve what you want, which isn't clear: + is numeric addition in SQLite, so a simple test results in last_browser being set to 0 rather than the concatenated text.
You may want a concatenation of the strings, so instead of + use ||, e.g.
update users set last_browser= 'mozilla' || (select sql from sqlite_master), last_time= '13-04-2019' where id = '14';
In which case last_browser would end up containing 'mozilla' followed by the schema SQL from sqlite_master.
Thanks for everyone's input, I've worked this out.
The sql query was set up like this:
update users set last_browser= '$user-agent', last_time= '$current_date' where id = '$id_of_user'
I edited the User-Agent with Burp Suite to be:
Mozilla', last_browser=(select sql from sqlite_master where type='table' limit 0,1), last_time='13-04-2019
Iterating with that, I found all the tables, columns and flags. Rather time-consuming, but I could not find a way to optimise it.
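For reference, substituting that User-Agent into the query template above produces something like the statement below. SQLite keeps only the rightmost assignment when a column is repeated in the SET list, so last_browser ends up holding the schema SQL (the date and id are the example values from the question):
update users set last_browser= 'Mozilla', last_browser=(select sql from sqlite_master where type='table' limit 0,1), last_time='13-04-2019', last_time= '13-04-2019' where id = '14'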
I am thinking of using Simple.Data Micro-ORM for my ASP.NET 4.5 website. However, there is something that I need to know before deciding whether to use it or not.
Let's take the following Join query for example:
var albums = db.Albums.FindAllByGenreId(1)
.Select(
db.Albums.Title,
db.Albums.Genre.Name);
This query will be translated to:
select
[dbo].[Albums].[Title],
[dbo].[Genres].[Name]
from [dbo].[Albums]
LEFT JOIN [dbo].[Genres] ON ([dbo].[Genres].[GenreId] = [dbo].[Albums].[GenreId])
WHERE [dbo].[Albums].[GenreId] = @p1
@p1 (Int32) = 1
Let's assume that the Genres table is a table with thousands or even millions of rows. I think it might be very inefficient to filter the data after the JOIN has taken place, which is what this query translates to in Simple.Data.
Would it be better to filter the data first in the Genres table, i.e. make a SELECT statement first and perform the JOIN against that filtered result?
Wouldn't it be better to filter the data ahead of time?
Furthermore, is there an option to write that type of complex query (a JOIN on a filtered table) using Simple.Data?
I need your answer to know whether to proceed with Simple.Data or dump it in favour of another micro-ORM.
You are confused about how SQL is interpreted and executed by the database engine. Modern databases are incredibly smart about the best way to execute queries, and the order in which instructions appear in SQL statements has nothing to do with the order in which they are executed.
Try running some queries through SQL Management Studio and looking at the Execution Plan to see how they are actually optimised and executed. Or just try the SQL you think would work better and see how it actually performs compared to what is generated by Simple.Data.
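For example, you can run both shapes side by side and compare the plans and IO; a quick sketch in T-SQL, assuming the Albums/Genres schema from the question:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Filter expressed in the WHERE clause (what Simple.Data generates)
SELECT [Albums].[Title], [Genres].[Name]
FROM [dbo].[Albums]
LEFT JOIN [dbo].[Genres] ON [Genres].[GenreId] = [Albums].[GenreId]
WHERE [Albums].[GenreId] = 1;
-- Filter applied "first" via a derived table
SELECT [A].[Title], [Genres].[Name]
FROM (SELECT [Title], [GenreId] FROM [dbo].[Albums] WHERE [GenreId] = 1) AS [A]
LEFT JOIN [dbo].[Genres] ON [Genres].[GenreId] = [A].[GenreId];
-- On any modern SQL Server the two forms typically compile to the same plan.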
The SQL that Simple.Data is generating is idiomatic T-SQL; to be honest, it's what I would write if I were drafting the SQL myself.
This SQL allows SQL Server to optimise the execution plan, which should mean the most efficient retrieval of data.
The beauty of Simple.Data is that if you have any doubts or issues with the sql it generates you can just call a stored proc:
db.ProcedureWithParameters(1, 2);
I've got this query
UPDATE linkeddb...table SET field1 = 'Y' WHERE column1 = '1234'
This takes 23 seconds to select and update one row
But if I use openquery (which I don't want to) then it only takes half a second.
The reason I don't want to use openquery is so I can add parameters to my query securely and be safe from SQL injections.
Does anyone know of any reason for it to be running so slowly?
Here's a thought as an alternative. Create a stored procedure on the remote server to perform the update and then call that procedure from your local instance.
/* On remote server */
create procedure UpdateTable
@field1 char(1),
@column1 varchar(50)
as
update [table] -- "table" is the placeholder name from the question
set field1 = @field1
where column1 = @column1
go
/* On local server */
exec linkeddb...UpdateTable @field1 = 'Y', @column1 = '1234'
If you're looking for the why, here's a possibility from Linchi Shea's Blog:
To create the best query plans when you are using a table on a linked server, the query processor must have data distribution statistics from the linked server. Users that have limited permissions on any columns of the table might not have sufficient permissions to obtain all the useful statistics, and might receive a less efficient query plan and experience poor performance. If the linked server is an instance of SQL Server, to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.
(Because of Linchi's post, this clarification has been added to the latest BooksOnline SQL documentation).
In other words, if the linked server is set up with a user that has limited permissions, then SQL can't retrieve accurate statistics for the table and might choose a poor method for executing a query, including retrieving all rows.
Here's a related SO question about linked server query performance. Their conclusion was: use OpenQuery for best performance.
Update: some additional excellent posts about linked server performance from Linchi's blog.
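If the reason for avoiding OPENQUERY is parameterisation, note that a pass-through EXEC ... AT also accepts ? placeholders, so values are bound as parameters rather than concatenated into the string. A sketch, reusing the placeholder table and column names from the question (RPC Out must be enabled on the linked server):
EXEC ('UPDATE dbo.[table] SET field1 = ? WHERE column1 = ?;', 'Y', '1234') AT linkeddb;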
Is column1 the primary key? Probably not. Try to select the records to update using the primary key (WHERE PK_field = xxx); otherwise (sometimes?) all the rows may be read just to find the PKs of the records to update.
Is column1 a varchar field? Is that why you are surrounding the value 1234 with single quotation marks? Or is that simply a typo in your question?
This question is a followup to This Question
The solution, clearing the execution plan cache, seemed to work at the time, but I've been running into the same problem over and over again, and clearing the cache no longer seems to help. There must be a deeper problem here.
I've discovered that if I remove the .Distinct() from the query, it returns rows (with duplicates) in about 2 seconds. However, with the .Distinct() it takes upwards of 4 minutes to complete. There are a lot of rows in the tables, and some of the where clause fields do not have indexes. However, the number of records returned is fairly small (a few dozen at most).
The confusing part about it is that if I get the SQL generated by the Linq query, via Linqpad, then execute that code as SQL or in SQL Management Studio (including the DISTINCT) it executes in about 3 seconds.
What is the difference between the Linq query and the executed SQL?
I have a short term workaround, and that's to return the set without .Distinct() as a List, then using .Distinct on the list, this takes about 2 seconds. However, I don't like doing SQL Server work on the web server.
I want to understand WHY the Distinct is 2 orders of magnitude slower in Linq, but not SQL.
UPDATE:
When executing the code via Linq, the SQL profiler shows this code, which is basically the identical query.
sp_executesql N'SELECT DISTINCT [t5].[AccountGroupID], [t5].[AccountGroup]
AS [AccountGroup1]
FROM [dbo].[TransmittalDetail] AS [t0]
INNER JOIN [dbo].[TransmittalHeader] AS [t1] ON [t1].[TransmittalHeaderID] =
[t0].[TransmittalHeaderID]
INNER JOIN [dbo].[LineItem] AS [t2] ON [t2].[LineItemID] = [t0].[LineItemID]
LEFT OUTER JOIN [dbo].[AccountType] AS [t3] ON [t3].[AccountTypeID] =
[t2].[AccountTypeID]
LEFT OUTER JOIN [dbo].[AccountCategory] AS [t4] ON [t4].[AccountCategoryID] =
[t3].[AccountCategoryID]
LEFT OUTER JOIN [dbo].[AccountGroup] AS [t5] ON [t5].[AccountGroupID] =
[t4].[AccountGroupID]
LEFT OUTER JOIN [dbo].[AccountSummary] AS [t6] ON [t6].[AccountSummaryID] =
[t5].[AccountSummaryID]
WHERE ([t1].[TransmittalEntityID] = @p0) AND ([t1].[DateRangeBeginTimeID] = @p1) AND
([t1].[ScenarioID] = @p2) AND ([t6].[AccountSummaryID] = @p3)',N'@p0 int,@p1 int,
@p2 int,@p3 int',@p0=196,@p1=20100101,@p2=2,@p3=0
UPDATE:
The only difference between the queries is that Linq executes it with sp_executesql and SSMS does not, otherwise the query is identical.
UPDATE:
I have tried various Transaction Isolation levels to no avail. I've also set ARITHABORT to try to force a recompile when it executes, and no difference.
The bad plan is most likely the result of parameter sniffing: http://blogs.msdn.com/b/queryoptteam/archive/2006/03/31/565991.aspx
Unfortunately there is not really any good universal way (that I know of) to avoid that with L2S. context.ExecuteCommand("sp_recompile ...") would be an ugly but possible workaround if the query is not executed very frequently.
Changing the query around slightly to force a recompile might be another one.
Moving parts (or all) of the query into a view*, function*, or stored procedure* DB-side would be yet another workaround.
* = where you can use local params (func/proc) or optimizer hints (all three) to force a 'good' plan
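As an illustration of the stored procedure plus hint route: OPTION (RECOMPILE) makes the optimizer build a plan for the actual parameter values on every call instead of reusing a sniffed one. A sketch wrapping the profiled query (the procedure name is illustrative):
CREATE PROCEDURE dbo.GetAccountGroups -- illustrative name
@p0 int, @p1 int, @p2 int, @p3 int
AS
SELECT DISTINCT [t5].[AccountGroupID], [t5].[AccountGroup] AS [AccountGroup1]
FROM [dbo].[TransmittalDetail] AS [t0]
INNER JOIN [dbo].[TransmittalHeader] AS [t1] ON [t1].[TransmittalHeaderID] = [t0].[TransmittalHeaderID]
INNER JOIN [dbo].[LineItem] AS [t2] ON [t2].[LineItemID] = [t0].[LineItemID]
LEFT OUTER JOIN [dbo].[AccountType] AS [t3] ON [t3].[AccountTypeID] = [t2].[AccountTypeID]
LEFT OUTER JOIN [dbo].[AccountCategory] AS [t4] ON [t4].[AccountCategoryID] = [t3].[AccountCategoryID]
LEFT OUTER JOIN [dbo].[AccountGroup] AS [t5] ON [t5].[AccountGroupID] = [t4].[AccountGroupID]
LEFT OUTER JOIN [dbo].[AccountSummary] AS [t6] ON [t6].[AccountSummaryID] = [t5].[AccountSummaryID]
WHERE ([t1].[TransmittalEntityID] = @p0) AND ([t1].[DateRangeBeginTimeID] = @p1)
AND ([t1].[ScenarioID] = @p2) AND ([t6].[AccountSummaryID] = @p3)
OPTION (RECOMPILE); -- compile for these exact values each time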
Btw, have you tried updating statistics for the tables involved? SQL Server's auto-update statistics doesn't always do the job, so unless you have a scheduled job to do that, it might be worth scripting and scheduling UPDATE STATISTICS. Tweaking the sample size up and down as needed can also help.
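A minimal sketch of that, using the table names from the profiler output:
-- FULLSCAN reads every row to build the histogram; use SAMPLE n PERCENT if that is too slow.
UPDATE STATISTICS [dbo].[TransmittalHeader] WITH FULLSCAN;
UPDATE STATISTICS [dbo].[TransmittalDetail] WITH FULLSCAN;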
There may be ways to solve the issue by adding* (or dropping*) the right indexes on the tables involved, but without knowing the underlying db schema, table size, data distribution etc that is a bit difficult to give any more specific advice on...
* = Missing and/or overlapping/redundant indexes can both lead to bad execution plans.
The SQL that Linqpad gives you may not be exactly what is being sent to the DB.
Here's what I would suggest:
Run SQL Profiler against the DB while you execute the query, and find the statement that corresponds to your query.
Paste the whole statement into SSMS, and enable the "Show Actual Execution Plan" option.
Post the resulting plan here for people to dissect.
Key things to look for:
Table Scans, which usually imply that an index is missing
Wide arrows in the graphical plan, indicating lots of intermediary rows being processed.
If you're using SQL 2008, viewing the plan will often tell you if there are any indexes missing which should be added to speed up the query.
Also, are you executing against a DB which is under load from other users?
At first glance there are a lot of joins, but I can only see one thing to reduce the number right away without having the schema in front of me: it doesn't look like you need AccountSummary.
[t6].[AccountSummaryID] = @p3
could be
[t5].[AccountSummaryID] = @p3
The return values are from the [t5] table. [t6] is only used to filter on that one parameter, which looks like the foreign key from t5 to t6, so it is present in [t5]. Therefore, you can remove the join to [t6] altogether. Or am I missing something?
Are you sure you want to use LEFT OUTER JOIN here? This query looks like it should probably be using INNER JOINs, especially because you are taking the columns that are potentially NULL and then doing a distinct on it.
Check that you have the same Transaction Isolation level between your SSMS session and your application. That's the biggest culprit I've seen for large performance discrepancies between identical queries.
Also, there are different connection properties in use when you work through SSMS than when executing the query from your application or from LinqPad. Check the connection properties of your SSMS connection against those of the connection from your application and you should see the differences. All other things being equal, that could be the difference: you are executing the query through two different applications that can have two different configurations, and could even be using two different database drivers. If the queries are the same, those are the only differences I can see.
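One concrete property worth checking: SSMS runs with SET ARITHABORT ON by default, while ADO.NET clients typically run with it OFF, and each setting gets its own plan cache entry, so the two can end up with different plans for identical text. To reproduce the application's plan in SSMS, a quick sketch:
SET ARITHABORT OFF;
-- then re-run the sp_executesql statement captured by Profiler in this session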
On a side note if you are hand-crafting the SQL, you may try moving the conditions from the WHERE clause into the appropriate JOIN clauses. This actually changes how SQL Server executes the query and can produce a more efficient execution plan. I've seen cases where moving the filters from the WHERE clause into the JOINs caused SQL Server to filter the table earlier in the execution plan and significantly changed the execution time.
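A sketch of that rewrite on a simplified pair of the tables above (TransmittalDetailID is an assumed column; for INNER JOINs the two forms return the same rows, but for OUTER JOINs moving a filter changes the meaning, so compare the results):
DECLARE @p2 int = 2;
-- Filter in the WHERE clause
SELECT [d].[TransmittalDetailID]
FROM [dbo].[TransmittalDetail] AS [d]
INNER JOIN [dbo].[TransmittalHeader] AS [h] ON [h].[TransmittalHeaderID] = [d].[TransmittalHeaderID]
WHERE [h].[ScenarioID] = @p2;
-- Filter moved into the JOIN condition, letting it apply earlier in the plan
SELECT [d].[TransmittalDetailID]
FROM [dbo].[TransmittalDetail] AS [d]
INNER JOIN [dbo].[TransmittalHeader] AS [h] ON [h].[TransmittalHeaderID] = [d].[TransmittalHeaderID]
AND [h].[ScenarioID] = @p2;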
Is it possible to find out which package, or which procedure within a package, is updating a table?
Due to a certain project being handed over without proper documentation (the person who handed it over has since left), data that we know we have updated keeps going back to some strange source point.
We are guessing that a database job or scheduler is running the update command without our knowledge. I am hoping there is a way to find out where the update is coming from, perhaps by putting a trigger on the table we are monitoring that records the source of the call.
Any ideas?
Thanks.
UPDATE: I poked around and found out how to trace a statement back to its owning PL/SQL object.
In combination with what Tony mentioned, you can create a logging table and a trigger that looks like this:
CREATE TABLE statement_tracker
( SID NUMBER
, serial# NUMBER
, date_run DATE
, program VARCHAR2(48) null
, module VARCHAR2(48) null
, machine VARCHAR2(64) null
, osuser VARCHAR2(30) null
, sql_text CLOB null
, program_id number
);
CREATE OR REPLACE TRIGGER smb_t_t
AFTER UPDATE
ON smb_test
BEGIN
INSERT
INTO statement_tracker
SELECT ss.SID
, ss.serial#
, sysdate
, ss.program
, ss.module
, ss.machine
, ss.osuser
, sq.sql_fulltext
, sq.program_id
FROM v$session ss
, v$sql sq
WHERE ss.sql_address = sq.address
AND ss.SID = USERENV('sid');
END;
/
In order for the trigger above to compile, you'll need to grant the owner of the trigger these permissions, when logged in as the SYS user:
grant select on V_$SESSION to <user>;
grant select on V_$SQL to <user>;
You will likely want to protect the insert statement in the trigger with some condition so that it only logs when the change you're interested in is occurring; on my test server this statement runs rather slowly (1 second), so I wouldn't want to log all updates. In that case, you'd need to change the trigger to a row-level one so that you could inspect the :new or :old values. If you are really concerned about the overhead of the select, you can change it to not join against v$sql and instead just save the SQL_ADDRESS column, then schedule a job with DBMS_JOB to go off and fill in the sql_text column with a second update statement, thereby offloading that work into another session and not blocking your original update.
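For instance, a row-level variant that only logs when a column of interest actually changes might look like this ("status" is a hypothetical column; adjust the WHEN clause to the change you care about):
CREATE OR REPLACE TRIGGER smb_t_t
AFTER UPDATE
ON smb_test
FOR EACH ROW
WHEN (new.status <> old.status) -- "status" is a hypothetical column
BEGIN
-- same logging insert as in the statement-level trigger above
INSERT INTO statement_tracker
SELECT ss.SID, ss.serial#, sysdate, ss.program, ss.module, ss.machine, ss.osuser, sq.sql_fulltext, sq.program_id
FROM v$session ss, v$sql sq
WHERE ss.sql_address = sq.address
AND ss.SID = USERENV('sid');
END;
/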
Unfortunately, this will only tell you half the story. The statement you're going to see logged is going to be the most proximal statement - in this case, an update - even if the original statement executed by the process that initiated it is a stored procedure. This is where the program_id column comes in. If the update statement is part of a procedure or trigger, program_id will point to the object_id of the code in question - you can resolve it thusly:
SELECT * FROM all_objects where object_id = <program_id>;
In the case where the update statement was executed directly from the client, I don't know what program_id represents, but you wouldn't need it; you'd have the name of the executable in the "program" column of statement_tracker. If the update was executed from an anonymous PL/SQL block, I'm not sure how to track it back; you'll need to experiment further.
It may be, though, that the osuser/machine/program/module information may be enough to get you pointed in the right direction.
If it is a scheduled database job then you can look at what scheduled database jobs exist (e.g. in DBA_SCHEDULER_JOBS) and look into what they do. Other things you can do are:
look at the dependency views, e.g. ALL_DEPENDENCIES, to see what packages/triggers etc. use that table. Depending on the size of your system, that may return a lot of objects to trawl through.
Search all the database source code for references to the table like this:
select distinct type, name
from all_source
where lower(text) like lower('%mytable%');
Again that may return a lot of objects, and of course there will be some "false positives" where the search string appears but isn't actually a reference to that table. You could even try something more specific like:
select distinct type, name
from all_source
where lower(text) like lower('%insert into mytable%');
but of course that would miss cases where the command was formatted differently.
Additionally, could there be SQL scripts being run through "cron" jobs on the server?
Just write an "after update" trigger and, in this trigger, log the results of "DBMS_UTILITY.FORMAT_CALL_STACK" in a dedicated table.
The purpose of this function is exactly to give you the complete call stack of all the stored procedures and triggers that were fired to reach your code.
I am writing from a mobile app, so I can't give you more detailed examples, but if you google for it you'll find many; a minimal sketch follows below.
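A minimal sketch of such a trigger, assuming a log table named call_stack_log that you create first:
CREATE TABLE call_stack_log
( logged_at DATE
, call_stack VARCHAR2(4000)
);
CREATE OR REPLACE TRIGGER log_call_stack -- trigger name is illustrative
AFTER UPDATE
ON table_of_interest
BEGIN
-- FORMAT_CALL_STACK returns the chain of PL/SQL units that led to this statement
INSERT INTO call_stack_log (logged_at, call_stack)
VALUES (sysdate, DBMS_UTILITY.FORMAT_CALL_STACK);
END;
/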
A quick and dirty option if you're working locally, and are only interested in the first thing that's altering the data, is to throw an error in the trigger instead of logging. That way, you get the usual stack trace and it's a lot less typing and you don't need to create a new table:
CREATE OR REPLACE TRIGGER fail_on_update -- trigger name is illustrative
AFTER UPDATE ON table_of_interest
BEGIN
RAISE_APPLICATION_ERROR(-20001, 'something changed it');
END;
/