I just started learning LINQ to SQL, and so far I'm impressed with the easy of use and good performance.
I used to think that when doing LINQ queries like
from Customer in DB.Customers where Customer.Age > 30 select Customer
LINQ gets all customers from the database ("SELECT * FROM Customers"), moves them to the Customers array and then makes a search in that Array using .NET methods. This is very inefficient, what if there are hundreds of thousands of customers in the database? Making such big SELECT queries would kill the web application.
Now after experiencing how actually fast LINQ to SQL is, I start to suspect that when doing that query I just wrote, LINQ somehow converts it to a SQL Query string
SELECT * FROM Customers WHERE Age > 30
And only when necessary it will run the query.
So my question is: am I right? And when is the query actually run?
The reason why I'm asking is not only because I want to understand how it works in order to build good optimized applications, but because I came across the following problem.
I have 2 tables, one of them is Books, the other has information on how many books were sold on certain days. My goal is to select books that had at least 50 sales/day in past 10 days. It's done with this simple query:
from Book in DB.Books where (from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID).Contains(Book.ID) select Book
The point is, I have to use the checking part in several queries and I decided to create an array with IDs of all popular books:
var popularBooksIDs = from Sale in DB.Sales where Sale.SalesAmount >= 50 && Sale.DateOfSale >= DateTime.Now.AddDays(-10) select Sale.BookID;
BUT when I try to do the query now:
from Book in DB.Books where popularBooksIDs.Contains(Book.ID) select Book
It doesn't work! That's why I think that we can't use thins kinds of shortcuts in LINQ to SQL queries, like we can't use them in real SQL. We have to create straightforward queries, am I right?
You are correct. LINQ to SQL does create the actual SQL to retrieve your results.
As for your shortcuts, there are ways to work around the limitations:
var popularBooksIds = DB.Sales
.Where(s => s.SalesAmount >= 50
&& s.DateOfSale >= DateTime.Now.AddDays(-10))
.Select(s => s.Id)
.ToList();
// Actually should work.
// Forces the table into memory and then uses LINQ to Objects for the query
var popularBooksSelect = DB.Books
.ToList()
.Where(b => popularBooksIds.Contains(b.Id));
Yes, query gets translated to a SQL string, and the underlying SQL can be different depending on what you are trying to do... so you have to be careful in that regard. Checkout a tool called linqpad, you can try your query in it and see the executing SQL.
Also, it runs when iterating through the collection or calling a method on it like ToList().
Entity framework or linq queries can be tricky sometimes. Sometimes you are surprised at the efficiency of the sql query generated and sometimes the query is so complicated and inefficient that you would smack your forehead.
Best idea is that if you have any suspicions about a query, run an sql profiler at the backend that would monitor all the queries coming in. That way you know exactly what is being passed on to the sql server and correct any inefficiencies if need be.
http://damieng.com/blog/2008/07/30/linq-to-sql-log-to-debug-window-file-memory-or-multiple-writers
This will help you to see what and when queries are being run. Also, Damiens blog is full of other linq to sql goodness.
You can generate an EXISTS clause by using the .Any method. I have had more success that way than trying to generate IN clauses, because it likes to retrieve all the data and pass it all back in as parameters to a query
In linq to sql, IQueryable expression fragments can be combined to create a single query, it will try to keep everything as an IQueryable for as long as it can, before you do something that cannot be expressed in SQL. When you call ToList you are directly asking it to resolve that query into an IEnumerable stored in memory.
In most cases you are better off not selecting the book ids in advance. Keep the fragment for popular books in a single place in the code and use it when necessary, to build on another query. An IQueryable is just an expression tree, which is resolved into SQL at some other point.
If you think your application will perform better by storing the popular books elsewhere (memcache or whatever), then you may consider pulling them out before hand, and checking against that later. This will mean each book id will be passed in as a sproc parameter and used in an IN clause.
Related
In AX programming best practice, which is the best way:
using a Query created from the AOT,
using a select statement with X++ code,
using a query created with X++ code the Query classe ...
And when to use each one of them?
First off, AX always uses queries internally, X++ selects are translated to query constructions calls, which are executed at run time. The query is translated to SQL at runtime on the the first queryRun.next() or datasource.executeQuery(). There is thus no performance difference using one or the other.
Forms also use queries, most often it is automatically constructed for you, because property AutoQuery has a Yes default value. You can use a X++ select in the executeQuery method, but I would consider that bad practice, as the user will have no filter or sorting options available. Always use queries in forms, prefer to use the auto queries. Add ranges or sorting in the init method using this.queryBuildDatasource() if needed. The exception being listpages which always use an AOT query.
In RunBase classes prefer to use queries, as the user will have the option to change the query. You may of cause use simple X++ select in the inner loop, but consider to include it in the prebuilt query, if possible.
Otherwise, your primary goal as a programmer (besides solving the problem) is to minimize the number of code lines.
Queries defined in the AOT start out with zero code lines, which count in their favor. Thus, if there are serveral statically defined ranges, links, or complex joins, use AOT queries. You cannot beat:
QueryRun qr = new QueryRun(queryStr(MyQuery))
qr.query().dataSourceTable(tableNum(MyTable)).findRange(fieldNum(MyTable,MyField)).value('myValue');
With:
Query q = new Query();
QueryRun qr = new QueryRun(q);
QueryBuildDataSource ds = q.addDataSource(tableNum(MyTable));
QueryBuildRange qbr = ds.addRange(fieldNum(MyTable,MyField));
qbr.value('myValue');
qbr.locked(true);
Thus in the static case prefer to use AOT queries, then change the query at runtime if needed. On the flip side, if your table is only known at runtime, you cannot use AOT queries, nor X++ selects, and you will need to build your query at runtime. The table browser is a good example of that.
What is left for X++?
Simple selects with small where clauses and with simple or no joins.
Cases where you cannot use queries (yet), delete_from, update_recordset and insert_recordset comes to mind.
Avoiding external dependencies (like AOT queries) may sometimes be more important.
Code readability of X++ queries is better than query construction.
The main techniques for selecting records in the database are as follows:
The select statement
A query
The techniques are essentially the same. They both deliver a set of records from the database in a table variable that can be accessed.
Use the select statement when:
The selection criteria are complex.
You are selecting a set of records from X++. The user is not going to change the selection criteria.
Use a query when:
The user can choose which records are selected.
The user can change the range of records to be selected.
The selection criteria are not more complex than the query can accommodate.
When you use queries, develop them in the Application Object Tree (AOT) or build them from scratch in your code. In both situations, the query can be modified in the code. A query built from scratch in code can be saved so that it occurs in the AOT. However, this should usually be avoided.
Build a query in the AOT when:
A specific query definition is being used in many places. (The query in the AOT can be reused.)
The query must contain code.
The query definition is more complex. The AOT provides a visual representation of the query.
So far, I've been using classic ADO.NET model for database access. I have to tell you that I'm quite happy with it. But I have also been hearing much about Entity Framework recently so I thought I could give it a try. Actually the main reason which pushed me was the need to find a way to build the WHERE clause of my Stored Procedures. With the classic way I have to do either of the following:
Build the WHERE clause in the client side based on the user inputs and send it as a VARCHAR2 argument to the Stored Procedure, concatenate the WHERE clausewith the main part of the SQL and pass the whole string to EXECUTE_IMMEDIATE function. I personally hate to have to do so.
Inside the Stored Procedure construct lots and lots of SQL statements, which means I have to take all the possible combinations that WHERE clause might be composed of into account. This seems worse than the first case.
I know that EF has made it possible to use Stored Procedures as well. But will it be possible to build the WHERE part dynamically? Can EF rescue me somehow?
yes, you can use Dynamic queries in Linq.
Dynamic Query LIbrary
from scott gu example
var query = Northwind.Products.Where("Lastname LIKE "someValue%");
or some complex query
var query =
db.Customers.
Where("City = #0 and Orders.Count >= #1", "London", 10).
OrderBy("CompanyName").
Select("new(CompanyName as Name, Phone)");
or from this answer Where clause dynamically.
var pr = PredicateBuilder.False<User>();
foreach (var name in names)
{
pr = pr.Or(x => x.Name == name && x.Username == name);
}
return query.AsExpandable().Where(pr);
I am thinking of using Simple.Data Micro-ORM for my ASP.NET 4.5 website. However, there is something that I need to know before deciding whether to use it or not.
Let's take the following Join query for example:
var albums = db.Albums.FindAllByGenreId(1)
.Select(
db.Albums.Title,
db.Albums.Genre.Name);
This query will be translated to:
select
[dbo].[Albums].[Title],
[dbo].[Genres].[Name]
from [dbo].[Albums]
LEFT JOIN [dbo].[Genres] ON ([dbo].[Genres].[GenreId] = [dbo].[Albums].[GenreId])
WHERE [dbo].[Albums].[GenreId] = #p1
#p1 (Int32) = 1
Let's assume that the 'Genres' table is a a table with thousands or even millions of rows. I think that it might be very inefficient to filter the data after the JOIN has taken place, which is what this query translated for in Simple.Date.
Would it be better to filter the data firs in the Generes table, which means create make a SELECT statement first and make the JOIN with that filtered table?
Wouldn't it be better to filter the data ahead of time?
Furthermore, is there an option to make that type of complex (JOIN on a filtered table) query using Simple.Data.
Need your answer to know if to proceed with Simple.Data, or damp it in favor of another micro-ORM.
You are confused about how SQL is interpreted and executed by the database engine. Modern databases are incredibly smart about the best way to execute queries, and the order in which instructions appear in SQL statements has nothing to do with the order in which they are executed.
Try running some queries through SQL Management Studio and looking at the Execution Plan to see how they are actually optimised and executed. Or just try the SQL you think would work better and see how it actually performs compared to what is generated by Simple.Data.
The sql that Simple.Data is generating is idomatic T-SQL, too be honest its what I would be writing if I was drafting the sql myself.
This sql allows Sql Server to optimise the execution plan which should mean the most efficient retrieval of data.
The beauty of Simple.Data is that if you have any doubts or issues with the sql it generates you can just call a stored proc:
db.ProcedureWithParameters(1, 2);
My situation
I have a c# object which contains some lists. One of these lists are for example a list of tags, which is a list of c# "SystemTag"-objects. I want to instantiate this object the most efficient way.
In my database structure, I have the following tables:
dbObject - the table which contains some basic information about my c# object
dbTags - a list of all available tabs
dbTagConnections - a list which has 2 fields: TagID and ObjectID (to make sure an object can have several tags)
(I have several other similar types of data)
This is how I do it now...
Retrieve my object from the DB using an ID
Send the DB object to a "Object factory" pattern, which then realise we have to get the tags (and other lists). Then it sends a call to the DAL layer using the ID of our C# object
The DAL layer retrieves the data from the DB
These data are send to a "TagFactory" pattern which converts to tags
We are back to the Object Factory
This is really inefficient and we have many calls to the database. This especially gives problems as I have 4+ types of lists.
What have I tried?
I am not really good at SQL, but I've tried the following query:
SELECT * FROM dbObject p
LEFT JOIN dbTagConnection c on p.Id= c.PointId
LEFT JOIN dbTags t on c.TagId = t.dbTagId
WHERE ....
However, this retreives as many objects as there are tagconnections - so I don't see joins as a good way to do this.
Other info...
Using .NET Framework 4.0
Using LINQ to SQL (BLL and DAL layer with Factory patterns in the BLL to convert from DAL objects)
...
So - how do I solve this as efficient as possible? :-) Thanks!
At first sight I don't see your current way of work as "inefficient" (with the information provided). I would replace the code:
SELECT * FROM dbObject p
LEFT JOIN dbTagConnection c on p.Id= c.PointId
LEFT JOIN dbTags t on c.TagId = t.dbTagId
WHERE ...
by two calls to the DALs methods, first to retrieve the object main data (1) and one after that to get, only, the data of the tags related (2) so that your factory can fill-up the object's tags list:
(1)
SELECT * FROM dbObject WHERE Id=#objectId
(2)
SELECT t.* FROM dbTags t
INNER JOIN dbTag Connection c ON c.TagId = t.dbTagId
INNER JOIN dbObject p ON p.Id = c.PointId
WHERE p.Id=#objectId
If you have many objects and the amount of data is just a few (meaning that your are not going to manage big volumes) then I would look for a ORM based solution as the Entity Framework.
I (still) feel comfortable writing SQL queries in the DAOs to have under control all queries being sent to the DB server, but finally it is because in our situation is a need. I don't see any inconvenience on having to query the database to recover, first, the object data (SELECT * FROM dbObject WHERE ID=#myId) and fill the object instance, and then query again the DB to recover all satellite data that you may need (the Tags in your case).
You have be more concise about your scenario so that we can provide valuable recommendations for your particular scenario. Hope this is useful you you anyway.
We used stored procedures that returned multiple resultsets, in a similar situation in a previous project using Java/MSSQL server/Plain JDBC.
The stored procedure takes the ID corresponding to the object to be retrieved, return the row to build the primary object, followed by multiple records of each one-to-many relationship with the primary object. This allowed us to build the object in its entirety in a single database interaction.
Have you thought about using the entity framework? You would then interact with your database in the same way as you would interact with any other type of class in your application.
It's really simple to set up and you would create the relationships between your database tables in the entity designer - this will give you all the foreign keys you need to call related objects. If you have all your keys set up in the database then the entity designer will use these instead - creating all the objects is as simple as selecting 'Create model from database' and when you make changes to your database you simply right-click in your designer and choose 'update model from database'
The framework takes care of all the SQL for you - so you don't need to worry about that; in most cases..
A great starting place to get up and running with this would be here, and here
Once you have it all set up you can use LINQ to easily query the database.
You will find this a lot more efficient than going down the table adapter route (assuming that's what you're doing at the moment?)
Sorry if i missed something and you're already using this.. :)
As far I guess, your database exists already and you are familiar enough with SQL.
You might want to use a Micro ORM, like petapoco.
To use it, you have to write classes that matches the tables you have in the database (there are T4 generator to do this automatically with Visual Studio 2010), then you can write wrappers to create richer business objects (you can use the ValueInjecter to do it, it is the simpler I ever used), or you can use them as they are.
Petapoco handles insert / update operations, and it retrieves generated IDs automatically.
Because Petapoco handles multiple relationships too, it seems to fit your requirements.
I have been using Linq to SQL for a while now on one of my sites and over time the code I am using to query the database has gotten a little messy so I decided to re-write, originally my queries were all handled exclusively by Linq but recently there has been a demand for more advanced search features which has led me more towards using ExecuteQuery and handwriting my SQL statements the problem is that I cannot for the life of me get the Join statement to work properly.
I have two tables in my databases, t_events and t_clients. The only thing similar between the two tables is that they both have a clientid field (the id of the client the event is for). What I need to be able to do is pull all of the events into the page (which works fine) but I dont want to show the clientid on the page I need to show the client name. Originally I had a join clause that handled this nicely:
var eve = from p in db.t_events
join c in db.Clients on p.clientid equals c.clientid
where p.datedue <= thisDay && p.status != "complete"
select new { p.eventname, p.datedue, p.details, p.eventid, p.status, c.clientname };
With the redesign of the page however I am having issues recreating what linq has done here with the join. My current code:
StringBuilder sqlQuery = new StringBuilder("SELECT * FROM dbo.t_events JOIN dbo.t_clients ON dbo.t_events.clientid=dbo.t_clients.clientid");
var query = db.ExecuteQuery<t_events>(sqlQuery.ToString());
foreach (var c in query)
{
counter = counter + 1;
MyStringBuilder.Append("<tr class='"+c.status+"'><td><a href='searchdetails.aspx?id="+c.eventid+"'>"+c.eventname+"</a></td><td>" +c.clientname+ "</td></tr>");
}
in the foreach loop I have you can see I am trying to pull in c.clientname (which doesnt work) as it is not on the t_events database, changing this to c.clientid makes the code work, I am not sure what the issue is as taking that same SQL and running the query directly off the sql management tool works like a charm. Any ideas on this issue would be greatly appreciated!
FIXED!
DaveMarkle suggested using a view, which was by far a much easier way of doing this. I created a view that joins the two tables together with the fields I need and run my queries against it, simple and effective, I thank you!
Erm - so maybe we should have an answer here then so the question drops off the 'unanswered' list.
As Dave Markle stated.
Use a view.
Another option!
Execute the query twice: once with db.ExecuteQuery<t_events> and once db.ExecuteQuery<t_clients>. Now that you have both events and clients you can re-join them client-side by matching client_id.