Is it possible to get a LINQ to SQL DataContext to run completely in-memory? Without it touching the database?
I am doing some very rapid prototyping, and want to minimize the surface area for major changes since the UI is changing so fast. However, the data model already exists.
Data access is handled through the use of I[Model]Repository classes that return the actual LINQ to SQL data classes, so I currently have some concrete InMemory[Model]Repository classes that shove stuff in cache. The implementation is a little cumbersome however.
So... is it possible to simply override enough of the DataContext behavior to have it run in-memory and never touch the database? My assumption is that it is not possible, but I thought I would go fishing anyway.
You can only do this if you are prepared to wrap access to the datacontext with your own interface. Then for rapid prototyping you can write your own datacontext alternative that implements this interface and instead uses lists and LINQ to Objects to perform in-memory queries.
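For illustration, a minimal sketch of what that might look like - the Customer entity, MyDataContext, and ICustomerRepository names below are hypothetical placeholders, not from the question:

```csharp
using System.Collections.Generic;
using System.Linq;

// The rest of the application programs against this interface only.
public interface ICustomerRepository
{
    IQueryable<Customer> GetAll();
    void Add(Customer customer);
}

// Real implementation: wraps the LINQ to SQL DataContext.
public class SqlCustomerRepository : ICustomerRepository
{
    private readonly MyDataContext _db;
    public SqlCustomerRepository(MyDataContext db) { _db = db; }

    public IQueryable<Customer> GetAll() { return _db.Customers; }

    public void Add(Customer customer)
    {
        _db.Customers.InsertOnSubmit(customer);
        _db.SubmitChanges();
    }
}

// Prototyping implementation: LINQ to Objects over an in-memory list,
// never touches the database.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> _customers = new List<Customer>();

    public IQueryable<Customer> GetAll() { return _customers.AsQueryable(); }

    public void Add(Customer customer) { _customers.Add(customer); }
}
```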
I'm working on an app that uses XMLSerialization and SQLite. Both require public accessors. However, there are many instances where I want accessors to return conditional data or only have read access. With SQLite the accessors must both be public, so I can't even use protected.
What's the best way to handle this? Do I really need a secondary class that is basically a copy of the serializable class? With XML serialization I could possibly construct my own serialization process, but this is painful and probably worse than a shadow class.
Ideas?
After a lot of exploration, it seems the unfortunate answer is YES. The objects filled by SQLite queries and XML serialization really belong in the Data Access layer; the Business layer should then convert those objects into whatever the app layer actually uses.
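As a rough sketch of that split (the PersonRecord and Person names are made up for illustration):

```csharp
// Data Access layer: plain class with public getters/setters,
// as required by the SQLite mapping and XmlSerializer.
public class PersonRecord
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string SsnEncrypted { get; set; }
}

// Business layer: wraps the record and exposes only what the app should see.
public class Person
{
    private readonly PersonRecord _record;
    public Person(PersonRecord record) { _record = record; }

    public int Id { get { return _record.Id; } }          // read-only to callers
    public string Name { get { return _record.Name; } }

    // Conditional/derived data lives here instead of on the serializable class.
    public bool HasSsnOnFile
    {
        get { return !string.IsNullOrEmpty(_record.SsnEncrypted); }
    }
}
```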
Hopefully this makes sense to others searching for the same.
Can using Modules or Shared/Static references to the BLL/DAL improve the performance of an ASP.NET website?
I am working on a site that consists of two projects: the website, and a VB.NET class library which acts as a combination of DAL and BLL.
The library is used to communicate with databases and sometimes transform/validate the data going into/coming from the DBs.
Currently each page on the site that needs db access (vast majority) will create an instance of the relevant class in the library to access specific tables.
As I understand it this leads to a class from the library being instantiated and garbage collected for each request, with the possibility of multiple concurrent instances if multiple users view the same page.
If I converted the classes to modules (shared/static classes), would performance increase and memory be saved, since only one instance of each module exists at a time and a new instance does not have to be created for each request?
(if so, does anyone know if having TableAdapters as global variables in the modules would cause problems due to threading?)
Alternatively, would making the references to the Library class in the ASP.NET page Shared/Static have the same effect (except that I would have to re-write a lot less)?
I'm no expert, but I think the absence of examples of this static class / session object model in books and online is indicative of it being a bad idea.
I inherited a Linq-To-Sql application where the db contexts were static, and after n requests the whole thing just fell apart. The standard model for L2Sql is the Unit-of-Work pattern (define a task or set of tasks - do them and close). Let the framework worry about connection pooling and efficient GC.
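The unit-of-work shape usually looks something like the following sketch (MyDataContext, Orders, and the filter are placeholder names):

```csharp
using System.Collections.Generic;
using System.Linq;

// One short-lived DataContext per task: create, use, dispose.
// ADO.NET connection pooling keeps the open/close cheap.
public IList<Order> GetOpenOrders(int customerId)
{
    using (var db = new MyDataContext())
    {
        return db.Orders
                 .Where(o => o.CustomerId == customerId && !o.IsClosed)
                 .ToList();   // materialize before the context is disposed
    }
}
```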
Are you just trying to be efficient or do you have performance issues? If the latter it's usually more effective to look at caching or improving query efficiency (use stored procedures, root out queries in loops) than looking at object instantiation.
Statics don't play well with unit tests either (another reason why they have dropped out of fashion).
Instances are only a problem if they are not collected by the GC (a memory leak). Instances are also more flexible than statics, because you can configure each instance for the specific context in which it is used.
When an application has poor performance or memory problems, it's usually a sign that:
• instances are not properly released (IDisposable)
• the amount of data retrieved is too big (not paging large sets of data; see the paging sketch after this list)
• a large number of queries are executed (select N+1, or just a lot of queries)
• SQL statements are poorly constructed (missing indexes, missing FKs, too many joins, etc.)
• too many remote calls are made (either to other servers, or to disk)
These are the first things I would check. Then start looking at the number of instantiated objects. Chances are that correcting the items above will solve most performance bottlenecks.
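For example, the paging point above can be addressed with Skip/Take rather than pulling the whole table (a sketch; MyDataContext and Products are placeholder names):

```csharp
using System.Collections.Generic;
using System.Linq;

// Retrieve one page of data instead of the entire result set.
public IList<Product> GetProductPage(int pageIndex, int pageSize)
{
    using (var db = new MyDataContext())
    {
        return db.Products
                 .OrderBy(p => p.Name)        // paging needs a stable ordering
                 .Skip(pageIndex * pageSize)
                 .Take(pageSize)
                 .ToList();
    }
}
```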
Can using Modules or Shared/Static references to the BLL/DAL improve the performance of an ASP.NET website?
It's possible, but it depends heavily on how you use your data. One tradeoff in using a single shared instance of an object instead of one per request is that you will need to apply locking unless the objects are strictly read-only, and locking can both slow things down and complicate your code (not to mention being a common source of bugs).
However, if each object is going to contain the exact same data, then the tradeoff may be worth it -- even more so if it can save a DB round-trip.
You might consider using either a Singleton or a small number of parameterized objects rather than a static, though -- and use caching to manage them. That would give you the flexibility to let go of objects that you no longer need, which is harder to do when you're dealing with statics.
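One way to do that in ASP.NET, sketched with hypothetical names, is to keep shared read-only data in the application cache rather than in a static field, so it can expire and be rebuilt on demand:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;

public static class CountryLookup
{
    // Cache a read-only list for five minutes instead of holding it in a static.
    public static IList<Country> GetCountries()
    {
        var countries = HttpRuntime.Cache["Countries"] as IList<Country>;
        if (countries == null)
        {
            using (var db = new MyDataContext())
            {
                countries = db.Countries.ToList();
            }
            HttpRuntime.Cache.Insert("Countries", countries, null,
                DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        }
        return countries;
    }
}
```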
I'm starting a new project based on ASP.NET and Windows server.
The application is planned to be pretty big and to serve a large number of clients pulling and updating frequently changing data.
I have previously created projects with Linq-To-Sql or with Ado.Net.
My plan for this project is to use VS2010 and the new EF4 framework.
It would be great to hear other programmers' opinions about development with Entity Framework.
Pros and cons from previous experience?
Do you think EF4 is ready for production?
Should I take the risk, or just stick with good old plain ADO.NET?
Whether EF4 is really ready for production is a bit hard to say, since it hasn't officially been released yet... but all the preliminary experiences and reports about it seem to indicate it's quite good.
However: you need to take into consideration what EF is trying to solve; it's a two-layer approach, one layer maps to your physical storage schema in your database (and supports multiple backends), and the second layer is your conceptual model you program against. And of course, there's the need for a mapping between those two layers.
So EF4 is great if you have a large number of tables, if you have multiple backends to support, if you need to be able to map a physical schema to a different conceptual schema, and so forth. It's great for complex enterprise level applications.
BUT that comes at a cost - those extra layers do have an impact on performance, complexity, maintainability. If you need those features, you'll be happy to pay that price, no question. But do you need that??
Sure, you could go back to straight ADO.NET - but do you really want to fiddle around with DataTables, DataRows, and untyped Row["RowName"] constructs again?? REALLY???
So my recommendation would be this:
• if you need only SQL Server as your backend
• if you have a fairly simple and straightforward mapping of one database table to one entity object in your model
then: use Linq-to-SQL! Why not?? It's still totally supported by Microsoft in .NET 4 - heck, they even did bugfixes and added a few bits and pieces - it's fast, it's efficient, it's lean and mean - so why not??
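To make the day-to-day difference concrete, here is a small sketch (MyDataContext, Customers, and the column names are placeholders) contrasting the untyped DataRow style with a typed Linq-to-SQL query:

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CustomerNameQueries
{
    // Untyped ADO.NET: strings and casts everywhere; a typo in the
    // column name only fails at runtime.
    public static List<string> NamesFromDataTable(DataTable customersTable)
    {
        var names = new List<string>();
        foreach (DataRow row in customersTable.Rows)
        {
            names.Add((string)row["CustomerName"]);
        }
        return names;
    }

    // LINQ to SQL: typed properties, IntelliSense, and compile-time checking.
    public static List<string> NamesFromLinq(MyDataContext db)
    {
        return db.Customers
                 .Where(c => c.IsActive)
                 .Select(c => c.CustomerName)
                 .ToList();
    }
}
```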
My advice is: use both. At first I thought I would only use LINQ to SQL and never have to touch ADO.NET ever again (which made me happy, lol).
Now I am using both, because there are some things LINQ to SQL (and any ORM like EF) can't do. I had to do some mass inserts; doing it first with LINQ to SQL took over 6 minutes for 500 records (2 minutes for the validation rules, the rest was inserting into the db).
I changed it to SqlBulkCopy and now it is down to 2 minutes and 4 seconds (4 seconds to do all the inserts).
But like marc_s said I really did not want to fiddle around with DataTables, DataRows, and untyped Row["RowName"].
Say my table was 10 columns wide and called TableA. What I did was use LINQ to SQL to create a TableA object (new TableA()) and populate it with data. I then passed this object to a method that created the DataRow.
So LINQ to SQL saved me some time, because I would otherwise have made a class anyway rather than pass 10 parameters into the method that builds the DataRow. It also gives back a bit of type safety, since you have to pass the right object into that method, so there is less chance of passing in the wrong data. A rough sketch of this approach follows.
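Roughly what that looks like, assuming a TableA entity with made-up Id and Name columns (the names and connection string handling are placeholders):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class TableABulkInserter
{
    // Build a DataTable from LINQ to SQL entity objects, then bulk insert it.
    public static void BulkInsert(IEnumerable<TableA> rows, string connectionString)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));

        foreach (var item in rows)
        {
            var row = table.NewRow();
            row["Id"] = item.Id;       // typed entity -> untyped row in one place only
            row["Name"] = item.Name;
            table.Rows.Add(row);
        }

        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.TableA";
            bulk.WriteToServer(table);
        }
    }
}
```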
Finally, you can still use LINQ to SQL to call stored procedures, and that is like one line of code.
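For example, assuming a stored procedure (here a made-up GetOrdersByCustomer) has been dragged onto the designer so that a typed method is generated on the DataContext, the call really is a one-liner:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OrderQueries
{
    public static List<Order> GetOrdersForCustomer(int customerId)
    {
        using (var db = new MyDataContext())
        {
            // The designer generates one typed method per mapped stored procedure.
            return db.GetOrdersByCustomer(customerId).ToList();
        }
    }
}
```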
So I would use both: whenever you notice that LINQ to SQL (or in your case EF) is slow, just write an SP and call it through EF. If you need to do straight ADO.NET, evaluate what you need to do; maybe you can use EF for most of the code (so you can at least work with objects) and drop to ADO.NET only for that small portion, sort of like what I did with SqlBulkCopy.
EF 4 is now much more similar to LINQ to SQL, in the good ways: it has the FK keys right on the object, Add methods right on the object sets, and a lot of other nice features. The designer is much improved, and the major plus is that it works with SQL Server and Oracle, and possibly others (as long as the provider supports it), whereas LINQ to SQL works only with SQL Server.
EF is the future: ADO.NET Data Services is a web service add-on, it supports POCO and T4 generation, and any new features will build on it (LINQ to SQL is maintenance-only, and DataSets won't be getting any changes any more).
HTH.
I am working on the architecture of a mid-sized web application, and for my DAL I have 3 options:
1) Traditional Stored proc Based Architecture (Using NTiers Template of Codesmith)
2) LINQ To SQL (or PLINQO Template of codesmith)
3) LINQ To Entity
Of the above, LINQ to Entities is out of reach, as we need to start the application very quickly and we don't have the skillset for it; since the team has never worked with any OR/M tool, it would be a steep learning curve for them (this is what I read somewhere).
I would prefer to go ahead with LINQ to SQL (my only fear being that Microsoft is not going to support or enhance LINQ to SQL further). From my point of view, even if Microsoft does not enhance it further, that is not a problem, as it already provides whatever features my project requires.
Now my question is: should I use LINQ to SQL, or should I stick to the traditional architecture?
Or is there some other option...
EDIT: I am going to use SQL Server as the database, and the application does not need to interact with any other database.
One of the most important objectives in designing the DAL is faster development and maintainability in the face of future database table changes, as fields may well be added or removed later.
Also, if you feel that some ORM tool is really good and does not have a steep learning curve, we could use that as well.
Please provide suggestions.
As you are working on a medium-sized project, I would suggest you use LINQ to SQL because of these advantages.
Advantages of using LINQ to SQL:
• No magic strings, like you have in SQL queries
• Intellisense
• Compile-time checking when the database changes
• Faster development
• Unit of work pattern (context)
• Auto-generated domain objects that are usable in small projects
• Lazy loading.
• Learning to write LINQ queries/lambdas is a must-learn for .NET developers.
Regarding performance:
• Most likely, performance is not going to be a problem in most solutions. Pre-optimizing is an anti-pattern. If you later see that some areas of the application are too slow, you can analyze those parts and, in some cases, even swap some LINQ queries for stored procedures or ADO.NET.
• In many cases the lazy loading feature can speed up performance, or at least simplify the code a lot.
Regarding debugging:
• In my opinion, debugging Linq2Sql is much easier than both stored procedures and ADO.NET. I recommend that you take a look at the Linq2Sql Debug Visualizer, which enables you to see the query, and even trigger an execute to see the result, while debugging.
• You can also configure the context to write all SQL queries to the console window; more information here. A minimal sketch follows.
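The hook for that is the DataContext.Log property (a TextWriter); a small sketch, with MyDataContext and the query as placeholders:

```csharp
using System;
using System.Linq;

public static class LoggingExample
{
    public static void DumpActiveCustomers()
    {
        using (var db = new MyDataContext())
        {
            db.Log = Console.Out;   // every generated SQL statement is written here

            var active = db.Customers.Where(c => c.IsActive).ToList();
            // The SELECT that was sent to SQL Server now appears in the console window.
        }
    }
}
```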
Regarding another layer:
• Linq2Sql can be seen as another layer, but it is a purely data access layer. Stored procedures are also another layer of code, and I have seen many cases where part of the business logic has been implemented in stored procedures. That is much worse in my opinion, because you are then splitting the business layer into two places, and it will be harder for developers to get a clear view of the business domain.
There is no absolutely preferred way of writing a DAL; these are all options. Which one to choose depends on your project, your skills and your inclinations.
Normally, with LINQ you can expect to be more productive. On the other hand, the DAL built with stored procedures can be expected to perform faster.
The issue only arises when you need certain specific queries to be blazingly fast and the default LINQ to SQL provider cannot generate them that way. In that case you will have to tap into your LINQ code and plug in custom stored procedures where needed.
Regarding LINQ to SQL support and further development: it was halted quite a while ago, so there is no official further development. Note that this applies to LINQ to SQL as a relational solution (it is being superseded by EF), not to the core LINQ functionality.
Entity Framework in its v1 mostly received massive criticism. You're advised to wait until v2 comes out.
The most important limitation of LINQ to SQL (compared to Entity Framework or any other popular ORM) is that it doesn't support 1-to-n mappings. That is, each of your LINQ classes can only map to a single table, not represent some sort of view over several others. Maybe that's not important to you, but maybe it is. It depends on your project.
The argument of stored procedures vs ORM's is long-standing and unlikely to be resolved any time soon. My recommendation would be to go with an ORM (Linq-to-Sql in your case).
Yes, stored procedures will always be faster, since the queries are precompiled. The real question you have to ask yourself is whether you have such a performance-intensive system that your users will actually notice the difference. Keep in mind that using stored procedures means you will need to write all your queries by hand, whereas an ORM does this for you. This usually means that an ORM will speed up your development.
Since you mention that speeding up development time is one of your goals I would recommend Linq-to-Sql - otherwise you will basically write the entire DAL yourself.
All of the options you've provided have significant drawbacks. None of them meet the requirements you've set out.
You need to prioritize what is most important for you.
If learning curve is your biggest issue, stay away from all ORMs if you are already comfortable with ADO.NET, DataTables, etc.
If development speed is your biggest issue, you should learn an ORM and go that route. The easiest ORM to recommend is NHibernate. Every other ORM has significant weaknesses. NHibernate works in the vast majority of projects, whereas other ORMs are much more situationally appropriate (depending on your DB design, model design, agility requirements, legacy schema support, etc.). All ORMs have learning curves, they just come into play at different times and in different ways.
Just to expand on #Developer Art, using the traditional stored proc approach enables you to put business logic in the database. Usually you will want to avoid this, but sometimes it is necessary to do. Not to mention you could also enforce constraints and permissions at the database level using this approach. It all depends on your requirements.
With the limitations mentioned, I would say just stick to ad-hoc/custom queries and ADO.NET and not go for any of the jazzy stuff. Also, the notion that a stored procedure based DAL is faster rests on weak arguments, such as the claim that stored procedures are precompiled - they are not; all they have is a cached query plan. So the less you invest in stored procedures, the better off you are. My advice: ADO.NET and custom dynamic queries constructed from entity objects (a rough sketch follows).
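For what it's worth, a custom dynamic query in plain ADO.NET can still be parameterized and mapped onto an entity object by hand; a sketch, with the Customer class, table, and connection string as placeholders:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

public static class CustomerQueries
{
    public static IList<Customer> GetCustomersByCity(string city, string connectionString)
    {
        var result = new List<Customer>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "SELECT Id, Name FROM dbo.Customer WHERE City = @city", conn))
        {
            cmd.Parameters.AddWithValue("@city", city);   // parameterized, not concatenated
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    result.Add(new Customer
                    {
                        Id = reader.GetInt32(0),
                        Name = reader.GetString(1)
                    });
                }
            }
        }
        return result;
    }
}
```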
In order to fully use LinqToSql in an ASP.NET 3.5 application, it is necessary to create DataContext classes (which is usually done using the designer in VS 2008). From the UI perspective, the DataContext is a model of the sections of your database that you would like to expose through LinqToSql, and it is integral to setting up the ORM features of LinqToSql.
My question is: I am setting up a project that uses a large database where all tables are interconnected in some way through Foreign Keys. My first inclination is to make one huge DataContext class that models the entire database. That way I could in theory (though I don't know if this would be needed in practice) use the Foreign Key connections that are generated through LinqToSql to easily go between related objects in my code, insert related objects, etc.
However, after giving it some thought, I am now thinking that it may make more sense to create multiple DataContext classes, each one relating to a specific namespace or logically interrelated section within my database. My main concern is that instantiating and disposing one huge DataContext class all the time for individual operations that relate to specific areas of the database would impose an unnecessary burden on application resources. Additionally, it is easier to create and manage smaller DataContext files than one big one. What I would lose is that there would be some distant sections of the database that would not be navigable through LinqToSql (even though a chain of relationships connects them in the actual database). Additionally, some table classes would exist in more than one DataContext.
Any thoughts or experience on whether multiple DataContexts (corresponding to DB namespaces) are appropriate in place of (or in addition to) one very large DataContext class (corresponding to the whole DB)?
I disagree with John's answer. The DataContext (or Linq to Entities ObjectContext) is more of a "unit of work" than a connection. It manages change tracking, etc. See this blog post for a description:
Lifetime of a LINQ to SQL DataContext
The four main points of this blog post are that a DataContext:
• is ideally suited for a "unit of work" approach
• is also designed for "stateless" server operation
• is not designed for long-lived usage
• should be used very carefully after any SubmitChanges() operation.
Considering that, I don't think using more than one DataContext would do any harm - in fact, creating different DataContexts for different types of work would help make your LinqToSql implementation more usable and organized. The only downside is you wouldn't be able to use sqlmetal to auto-generate your dbml.
I'd been wrangling over the same question whilst retrofitting LINQ to SQL over a legacy DB. Our database is a bit of a whopper (150 tables), and after some thought and experimentation I elected to use multiple DataContexts. Whether this is considered an anti-pattern remains to be seen, but for now it makes life manageable.
I think John is correct.
"My main concern is that instantiating and disposing one huge DataContext class all the time for individual operations that relate to specific areas of the Database would be impose an unnecessary imposition on application resources"
How do you support that statement? What is your experiment that shows that a large DataContext is a performance bottleneck? Having multiple datacontexts is a lot like having multiple databases and makes sense in similar scenarios, that is, hardly ever. If you are working with multiple datacontexts you need to keep track of which objects belong to which datacontext and you can't relate objects that are not in the same data context. That is a costly design smell for no real benefit.
#Evan "The DataContext (or Linq to Entities ObjectContext) is more of a "unit of work" than a connection"
That is precisely why you should not have more than one datacontext. Why would you want more than one "unit of work" at a time?
I have to disagree with the accepted answer. In the question posed, the system has a single large database with strong foreign key relationships between almost every table (also the case where I work). In this scenario, breaking it up into smaller DataContexts (DC's) has two immediate and major drawbacks (both mentioned by the question):
You lose relationships between some tables. You can try to choose your DC boundaries wisely, but you will eventually run into a situation where it would be very convenient to use a relationship from a table in one DC to a table in another, and you won't be able to.
Some tables may appear in multiple DC's. This means that if you want to add table-specific helper methods, business logic, or other code in partial classes, the types won't be compatible across DC's. You can work around this by inheriting each entity class from its own specific base class, which gets messy. Also, schema changes will have to be duplicated across multiple DC's.
Now those are significant drawbacks. Are there advantages big enough to overcome them? The question mentions performance:
My main concern is that instantiating and disposing one huge DataContext class all the time for individual operations that relate to specific areas of the Database would impose an unnecessary burden on application resources.
Actually, it is not true that a large DC takes significantly more time to instantiate or use in a typical unit of work. In fact, after the first instance is created in a running process, subsequent copies of the same DC can be created almost instantaneously.
The only real advantage from multiple DC's for a single, large database with thorough foreign key relationships is that you can compartmentalize your code a little better. But you can already do this with partial classes.
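For example, helper methods can be added to a designer-generated entity through a partial class; a sketch, assuming a hypothetical Customer entity with Balance and CreditLimit columns:

```csharp
// The designer-generated half of the class lives in the .designer.cs file;
// this partial class adds behaviour without touching the generated code.
public partial class Customer
{
    public bool IsOverCreditLimit(decimal pendingOrderTotal)
    {
        return Balance + pendingOrderTotal > CreditLimit;
    }
}
```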
Also, the unit of work concept is not really relevant to the original question. Unit of work typically refers to how much work a single DC instance is doing, not how much work a DC class is capable of doing.
In my experience with LINQ to SQL and LINQ to Entities, a DataContext is synonymous with a connection to the database. So if you were to use multiple data stores, you would need to use multiple DataContexts. My gut reaction is that you wouldn't notice too much of a slowdown with a DataContext that encompasses a large number of tables. If you did, however, you could always split the database logically at points where you can isolate tables that don't have any relationship to other sets of tables, and create multiple contexts.