Lambda expression is slower than foreach statement? - asp.net

I ran a small experiment to test whether a lambda expression can retrieve results faster than a foreach statement, but the lambda version lost.
System.Diagnostics.Stopwatch st = new System.Diagnostics.Stopwatch();
st.Start();
List<int> lst = new List<int>();
foreach (GridViewRow item in GridView1.Rows)
{
    if (((CheckBox)item.FindControl("Check")).Checked)
    {
        lst.Add(Convert.ToInt32(((Label)item.FindControl("Id")).Text));
    }
}
st.Stop();
Response.Write(st.Elapsed.ToString());
Response.Write("<br/><br/><br/>");
st.Reset();
st.Start();
List<int> lstRank = GridView1.Rows.OfType<GridViewRow>()
    .Where(s => ((CheckBox)s.FindControl("Check")).Checked)
    .Select(s => Convert.ToInt32(((Label)s.FindControl("Id")).Text)).ToList();
st.Stop();
Response.Write(st.Elapsed.ToString());
int i = 0;
output
00:00:00.0000249
00:00:00.0002464
Why is the lambda version slower than foreach? Is this a drawback of lambda expressions?

Technically, your two approaches are not identical. There are a few differences, such as the use of OfType, which filters the collection by type before continuing. You'd be better off using Cast<GridViewRow>(), as you know each element is of type GridViewRow.
Also, do you really need the expense of the ToList() at the end of the LINQ statement? The query is ready to be iterated and executed as it stands, without converting it back to a list.

There is a small one-time overhead with lambda expressions because the lambdas, and the LINQ methods they are passed to, are JIT-compiled at runtime the first time they are used. I think that's what you see in your "benchmark", which times each version only once. The foreach version is a plain loop with far less code to set up.
You can precompile lambda expressions, so you may want to rework your code and test again.
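A single timed run mostly measures first-use costs, so a fairer comparison warms both versions up and then repeats them many times. The sketch below shows that measuring pattern in Python rather than C#; the `rows` data and both function names are made up for illustration, and the absolute numbers will differ from anything measured in ASP.NET.

```python
import timeit

# Made-up stand-in for the grid rows: every third row is "checked".
rows = [{"checked": i % 3 == 0, "id": str(i)} for i in range(1000)]


def with_loop():
    """Explicit loop, analogous to the foreach version."""
    out = []
    for row in rows:
        if row["checked"]:
            out.append(int(row["id"]))
    return out


def with_pipeline():
    """Lambda-based filter/map pipeline, analogous to Where/Select/ToList."""
    checked = filter(lambda r: r["checked"], rows)
    return list(map(lambda r: int(r["id"]), checked))


# Warm both versions up outside the timed region, then repeat many times.
with_loop()
with_pipeline()
loop_time = timeit.timeit(with_loop, number=200)
pipeline_time = timeit.timeit(with_pipeline, number=200)
print(f"loop: {loop_time:.4f}s  pipeline: {pipeline_time:.4f}s")
```

Whatever the outcome, the two versions should first be checked to return the same result, otherwise the timings are not comparable.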

I won't talk about the correctness of your code, but I'd like to take the chance to explain a general rule.
In software development, performance is generally inversely related to the level of abstraction.
In this case it is quite normal that foreach is faster than LINQ (which is more abstract).
If you compare it with the classic for loop (for (int i = 0; i < n; i++) and so on), that will typically be faster than the foreach.
Accessing an object through an interface is slower than accessing the concrete object: the interface is already a very small level of abstraction.
The closer your code is to machine language, the faster it will be, but of course it will also be less readable and maintainable.
The question is how to find the right level of abstraction for what we are developing, keeping an eye on both performance and code readability.
You won't need MVC pattern for making a one-page web site that shows a table on a Repeater :-)

Related

How to change the dart-sqlite code from synchronous style to asynchronous?

I'm trying to use Dart with sqlite, with this project dart-sqlite.
But I found a problem: the API it provides is synchronous. The code looks like this:
// Iterating over a result set
var count = c.execute("SELECT * FROM posts LIMIT 10", callback: (row) {
print("${row.title}: ${row.body}");
});
print("Showing ${count} posts.");
With such code, I can't use Dart's future support, and the code will block on SQL operations.
I wonder how to change the code to asynchronous style? You can see it defines some native functions here: https://github.com/sam-mccall/dart-sqlite/blob/master/lib/sqlite.dart#L238
_prepare(db, query, statementObject) native 'PrepareStatement';
_reset(statement) native 'Reset';
_bind(statement, params) native 'Bind';
_column_info(statement) native 'ColumnInfo';
_step(statement) native 'Step';
_closeStatement(statement) native 'CloseStatement';
_new(path) native 'New';
_close(handle) native 'Close';
_version() native 'Version';
The native functions are mapped to some c++ functions here: https://github.com/sam-mccall/dart-sqlite/blob/master/src/dart_sqlite.cc
Is it possible to change it to be asynchronous? If so, what should I do?
If that's not possible and I have to rewrite it, do I have to rewrite all of the following?
The dart file
The c++ wrapper file
The actual sqlite driver
UPDATE:
Thanks to @GregLowe's comment: Dart's Completer can convert callback style to future style, which lets me use Dart's doSomething().then(...) instead of passing a callback function.
But after reading the source of dart-sqlite, I realized that, in the implementation of dart-sqlite, the callback is not event-based:
int execute([params = const [], bool callback(Row)]) {
  _checkOpen();
  _reset(_statement);
  if (params.length > 0) _bind(_statement, params);
  var result;
  int count = 0;
  var info = null;
  while ((result = _step(_statement)) is! int) {
    count++;
    if (info == null) info = new _ResultInfo(_column_info(_statement));
    if (callback != null && callback(new Row._internal(count - 1, info, result)) == true) {
      result = count;
      break;
    }
  }
  // If update affected no rows, count == result == 0
  return (count == 0) ? result : count;
}
Even if I use Completer, it won't increase the performance. I think I may have to rewrite the c++ code to make it event-based first.
You should be able to write a wrapper without touching the C++. Have a look at how to use the Completer class in dart:async. Basically you need to create a Completer, return Completer.future immediately, and then call Completer.complete(row) from the existing callback.
Re: update. Have you seen this article, specifically the bit about asynchronous extensions? i.e. If the C++ API is synchronous you can run it in a separate thread, and use messaging to communicate with it. This could be a way to do it.
The big problem you've got is that SQLite is an embedded database; in order to process your query and provide your results, it must do computation (and I/O) in your process. What's more, in order for its transaction handling system to work, it either needs its connection to be in the thread that created it, or for you to run in serialized mode (with a performance hit).
Because these are fairly hard constraints, your plan of switching things to an asynchronous operation mode is unlikely to go well except by using multiple threads. Since using multiple connections complicates things a lot (as you can't share some things between them, such as TEMP TABLEs) let's consider going for a single serialized connection; all activity will be serialized at the DB level, but for an application that doesn't use the DB a lot it will be OK. At the C++ level, you'd be talking about calling that execute from another thread and then sending messages back to the caller thread to indicate each row and the completion.
But you'll take a real hit when you do this; in particular, you're committing to only doing one query at a time, as the technique runs into significant problems with semantic effects when you start using two connections at once and the DB forces serialization on you with one connection.
It might be simpler to do the above by putting the synchronous-asynchronous coupling at the Dart level by managing the worker thread and inter-thread communication there. That would let you avoid having to change the C++ code significantly. I don't know Dart well enough to be able to give much advice there.
Myself, I'd just stick with synchronous connection processing so that I can make my application use multi-threaded mode more usefully. I'd be taking the hit with the semantics and giving each thread its own connection (possibly allocated lazily) so that overall speed was better, but I do come from a programming community that regards threads as relatively heavyweight resources, so make of that what you will. (Heavy threads can do things that reduce the number of locks they need that it makes no sense to try to do with light threads; it's about overhead management.)
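The "single serialized connection driven from a worker thread" idea described above can be sketched in a few lines. This is only an analogy in Python, not Dart and not the dart-sqlite API: it uses Python's standard sqlite3 and asyncio modules, and the table contents and the names `query` and `_query_blocking` are made up for illustration.

```python
import asyncio
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# One worker thread owns the connection, so all queries are serialized on it.
_executor = ThreadPoolExecutor(max_workers=1)
_conn = None  # created lazily, inside the worker thread


def _query_blocking(sql, params=()):
    """Synchronous call; only ever runs on the single worker thread."""
    global _conn
    if _conn is None:
        # Illustrative in-memory database with one made-up row.
        _conn = sqlite3.connect(":memory:")
        _conn.execute("CREATE TABLE posts (title TEXT, body TEXT)")
        _conn.execute("INSERT INTO posts VALUES ('hello', 'world')")
    return _conn.execute(sql, params).fetchall()


async def query(sql, params=()):
    """Asynchronous wrapper: the caller awaits while the worker thread blocks."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, _query_blocking, sql, params)


async def main():
    return await query("SELECT title, body FROM posts LIMIT 10")


rows = asyncio.run(main())
print(rows)
```

The synchronous code is left untouched; only the coupling between the caller and the worker is asynchronous, and the price is that queries run strictly one at a time.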

Languages supporting complete reflection

Only recently, I discovered that both Java and C# do not support reflection of local variables. For example, you cannot retrieve the names of local variables at runtime.
Although clearly this is an optimisation that makes sense, I'm curious as to whether any current languages support full and complete reflection of all declarations and constructs.
EDIT: I will qualify my "names of local variables" example a bit further.
In C#, you can output the names of parameters to methods using reflection:
foreach(ParameterInfo pi in typeof(AClass).GetMethods()[0].GetParameters())
Trace.WriteLine(pi.Name);
You don't need to know the names of the parameters (or even of the method) - it's all contained in the reflection information. In a fully-reflective language, you would be able to do:
foreach(LocalVariableInfo lvi in typeof(AClass).GetMethods()[0].GetLocals())
Trace.WriteLine(lvi.Name);
The applications may be limited (many applications of reflection are), but nevertheless, I would expect a reflection-complete language to support such a construct.
EDIT: Since two people have now effectively said "there's no point in reflecting local variable names", here's a basic example of why it's useful:
void someMethod()
{
    SomeObject x = SomeMethodCall();
    // do lots of stuff with x
    // sometime later...
    if (!x.StateIsValid)
        throw new SomeException(String.Format("{0} is not valid.", nameof(x)));
}
Sure, I could just hardcode "x" in the string, but correct refactoring support makes that a big no-no. nameof(x) or the ability to reflect all names is a nice feature that is currently missing.
Your introductory statement about the names of local variables drew my interest.
This code will actually retrieve the name of the local var inside the lambda expression:
static void Main(string[] args)
{
    int a = 5;
    Expression<Func<int>> expr = (() => a);
    Console.WriteLine(expr.Compile().Invoke());
    Expression ex = expr;
    LambdaExpression lex = ex as LambdaExpression;
    MemberExpression mex = lex.Body as MemberExpression;
    Console.WriteLine(mex.Member.Name);
}
Also have a look at this answer mentioning LocalVariableInfo.
Yes, there are languages where this is (at least kind of) possible. I would say that reflection in both Smalltalk and Python is pretty "complete" for any reasonable definition.
That said, getting the name of a local variable is pretty pointless - by definition to get the name of that variable, you must know its name. I wouldn't consider the lack of an operation to perform that exact task a lacuna in the reflection facility.
Your second example does not "determine the name of a local variable", it retrieves the name of all local variables, which is a different task. The equivalent code in Python would be:
for x in list(locals()): print(x)
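Python can also list the local variable names of a function from the outside, without running it, which is close to the hypothetical GetLocals() call in the question. A minimal sketch, where `some_method` and its variables are made up for illustration:

```python
def some_method(a, b):
    total = a + b
    label = "sum"
    return label, total


code = some_method.__code__
# co_varnames lists the parameters first, then the other local variables.
params = code.co_varnames[:code.co_argcount]
local_names = code.co_varnames[code.co_argcount:]
print(params)       # ('a', 'b')
print(local_names)  # ('total', 'label')
```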
Eh, in order to access a local var you have to be within the stack frame/context/whatever where the local var is valid. Since it is only valid at that point in time, does it matter if it is called 't1' or 'myLittlePony'?

Getting Dictionary<K,V> from ConcurrentDictionary<K,V> in .NET 4.0

I'm parallelizing some back-end code and trying not to break interfaces. We have several methods that return Dictionary and internally, I'm using ConcurrentDictionary to perform Parallel operations on.
What's the best way to return Dictionary from these?
This feels almost too simple:
return myConcurrentDictionary.ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
I feel like I'm missing something.
Constructing the Dictionary<K,V> directly will be slightly more efficient than calling ToDictionary. The constructor will pre-allocate the target dictionary to the correct size and won't need to resize on-the-fly as it goes along.
return new Dictionary<K,V>(myConcurrentDictionary);
If your ConcurrentDictionary<K,V> uses a custom IEqualityComparer<K> then you'll probably want to pass that into the constructor too.
Nope. This is completely fine. .NET sequences are just nice like that. :D

Comparing JavaScript's native "for loop" to Prototype's each()

So I was having a debate with a fellow engineer about looping in JavaScript. The issue was about the native for loop construct and prototype's each() method. Now I know there are lots of docs/blogs about for and for-each, but this debate is somewhat different and I would like to hear what some of you think.
Let's take the following loop for example
example 1
var someArray = [blah, blah, blah,...,N];
var length = someArray.length;
for(var index = 0; index < length; index++){
var value = someFunction(someArray[index]);
var someOtherValue = someComplicatedFunction(value, someArray[index]);
//do something interesting...
}
To me, this comes as second nature, mainly because I learnt how to code in C and it has carried me through. Now, I use for-each in both C# and Java (bear with me, I know we are talking about JavaScript here), but whenever I hear "for loop", I think of this guy first. Now let's look at the same example written using Prototype's each():
example 2
var someArray = [blah, blah, blah,…..,N];
someArray.each(function(object){
var value = someFunction(object);
var someOtherValue = someComplicatedFunction(value, object);
//do something interesting...
});
In this example, right off the bat, we can see that the construct has less code. However, I think each time we loop through an object, we have to create a new function to deal with the operation in question, so this would perform badly on collections with a large number of objects. My buddy's argument was that example 2 is much easier to understand and is actually cleaner than example 1 due to its functional approach. My argument is that any programmer should be able to understand example 1, since it is taught in programming 101, so the "easier" argument doesn't apply, and example 1 performs better than example 2. Then why bother with #2?
After reading around, I found out that when the array size is small, the overhead of example 2 is minuscule. However, people kept saying that you write fewer lines of code and that example 1 is error prone. I still don't buy those reasons, so I wanted to know what you guys think.
You are not creating a new function on each iteration in the second example. There is only one function that keeps getting called over and over. To clarify the point that only a single function is used, consider how you would implement the each method yourself.
Array.prototype.each = function(fn) {
    for(var i = 0; i < this.length; i++) {
        // if "this" should refer to the current object, then
        // use call or apply on the fn object
        fn(this[i]);
    }
};
// prints each value (no new function is being created anywhere)
[1, 2, 3].each(function(v) { console.log(v); });
Sure, there is the overhead of calling a function in the first place, but unless you are dealing with massive collections, this is a non-issue in modern browsers.
Personally I prefer the second approach for the simple reason that it frees me from worrying about unnecessary details that I shouldn't be concerned about in the first place. Say if we are looping through a collection of employee objects, then index is just an implementation detail and in most cases, if not all, can be abstracted away from the programmer by constructs such as the each method in Prototype, which by the way is now a standard in JavaScript as it has been incorporated into ECMAScript 5th ed under the name forEach.
var people = [ .. ];
for(var i /* 1 */ = 0; i /* 2 */ < people.length; i++ /* 3 */) {
people[i /* 4 */].updateRecords();
people[i /* 5 */].increaseSalary();
}
I've marked all 5 occurrences of i inline with comments. We could so easily have eliminated the index i altogether.
people.forEach(function(person) {
person.updateRecords();
person.increaseSalary();
});
For me the benefit is not in fewer lines of code, but in removing needless details from the code. This is the same reason why most programmers jumped on the iterator in Java when it became available, and later on the enhanced for loop when it came out. The argument that the same grounds don't hold for JavaScript just doesn't apply.
I'm fairly certain it does not create a new code function for every iteration, but rather calls the function on each one. It may involve a little more overhead to keep track of the iterator internally, (some languages allow the list to be changed, some don't) but it shouldn't be all that very much different internally than the procedural version.
But anytime you ask which is faster, you should ask, does it really matter? Have you put a code profiler on it and tested it with real world data? You can spend a lot of time figuring out which is faster, but if it only accounts for .0001% of your execution time, who cares? Use profiling tools to find the bottlenecks that really matter, and use whichever iteration method you and your team agree is easier to use and read.
Example one is not only error prone; for arrays of trivial length it's a poor choice, and each (or forEach, as defined in JavaScript 1.6, although IE8 still does not support it) is definitely the better choice style-wise.
The reason for this is simple: you are telling the code what to do, not how to do it. In my tests with Firefox, the forEach method runs at about 30% of the speed of a for loop. But when you're working with minuscule arrays it doesn't even matter that much. Much better to make your code cleaner and easier to understand (remember: what it's doing instead of how to do it), not only for your own sanity the next time you come back to it, but for the sanity of anyone else looking at your code.
If the only reason you're including Prototype is the .each method, you're doing it wrong. If all you want is a clean iteration method, use the .forEach method, but remember to define your own where it is missing. This is from the MDC page on forEach, a useful check to give yourself a .forEach method if none exists:
if (!Array.prototype.forEach)
{
    Array.prototype.forEach = function(fun /*, thisp*/)
    {
        var len = this.length >>> 0;
        if (typeof fun != "function")
            throw new TypeError();
        var thisp = arguments[1];
        for (var i = 0; i < len; i++)
        {
            if (i in this)
                fun.call(thisp, this[i], i, this);
        }
    };
}
You can tell from how this works, that a new function is not created for each item, though it is invoked for each item.
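The "created once, invoked per item" point can be checked with a small counter. The sketch below is in Python rather than JavaScript, and `each` and `make_callback` are hypothetical helpers written only for this illustration; the same reasoning applies to a function literal passed to each() or forEach().

```python
created = 0   # how many callback functions were built
calls = []    # every value the callback was invoked with


def each(items, fn):
    """Hypothetical stand-in for each/forEach: invoke fn once per item."""
    for item in items:
        fn(item)


def make_callback():
    """Builds the callback and counts how often that happens."""
    global created
    created += 1
    return lambda value: calls.append(value)


# The argument expression is evaluated once, before each() starts looping.
each([1, 2, 3], make_callback())
print(created, calls)
```

One function object is built and it is called three times, so the per-item cost is a function call, not a function creation.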
The answer is: it's subjective.
For me, example 2 is not really THAT much less code, and also involves downloading (bandwidth) AND parsing/executing (execution-time) a ~30kb library before you can even use it - so not only is the method itself less efficient in and of itself, it also involves setup overhead. For me - arguing that example 2 is better is insanity - however that's just an opinion, many would (and are perfectly entitled to) disagree completely.
IMO the second approach is succinct and easier to use, though it gets complicated (see prototype doc) if you want to use things like break or continue. See the each documentation.
So if you are doing simple iteration, the each() function is better IMO, as it can be more succinct and easier to understand, although it's less performant than the raw for loop.

Deleting items in foreach

Should you be allowed to delete an item from the collection you are currently iterating in a foreach loop?
If so, what should be the correct behavior?
It can take quite a sophisticated collection to support enumerators that track changes to the collection and keep position info correct. Even then, some compromises or assumptions need to be made. For that reason most libraries simply outlaw such a thing or mutter about unexpected behaviour in their docs.
Hence the safest approach is to loop, collect references to the things that need deleting, and subsequently use the collected references to delete the items from the original collection.
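The collect-then-delete approach might look like this in Python; the dictionary contents and the "negative values must go" rule are made up for illustration.

```python
items = {"a": 1, "b": -2, "c": 3, "d": -4}

# First pass: only read the collection and remember what has to go.
to_delete = [key for key, value in items.items() if value < 0]

# Second pass: the iteration is over, so deleting is safe now.
for key in to_delete:
    del items[key]

print(items)  # {'a': 1, 'c': 3}
```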
It really depends on the language. Some just hammer through an array and explode when you change that array. Some use arrays and don't explode. Some call iterators (which are wholly more robust) and carry on just fine.
Generally, modifying a collection in a foreach loop is a bad idea, because your intention is unknown to the program. Did you mean to loop through all items before the change, or do you want it to just go with the new configuration? What about the items that have already been looped through?
Instead, if you want to modify the collection, either make a predefined list of items to loop through, or use indexed looping.
Some collections such as hash tables and dictionaries have no notion of "position" and the order of iteration is generally not guaranteed. Therefore it would be quite difficult to allow deletion of items while iterating.
You have to understand the concept of foreach first, and it really depends on the programming language. But as a general answer, you should avoid changing your collections inside a foreach.
Just use a standard for loop, iterate through the item collection backwards and you should have no problem deleting items as you go.
Iterate in the reverse direction and delete the items one by one. That should be the proper solution.
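A minimal sketch of the backwards indexed loop, in Python with made-up data. Deleting at index i only shifts the elements after i, and those have already been visited, so no element is skipped.

```python
values = [1, 2, 2, 3, 2, 4]

# Walk the indices from last to first and delete every 2 in place.
for i in range(len(values) - 1, -1, -1):
    if values[i] == 2:
        del values[i]

print(values)  # [1, 3, 4]
```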
No, you should not. The correct behaviour should be to signal that a potential concurrency problem has been encountered, however that is done in your language of choice (throw exception, return error code, raise() a signal).
If you modify a data structure while iterating over its elements, the iterator might no longer be valid, which means that you risk working on objects that are no longer part of the collection. If you want to filter elements based on some more complex criterion, you could do something like this (in Java):
List<T> toFilter = ...;
List<T> shadow = new ArrayList<T>(); // must be initialized before add() is called
for ( T element : toFilter )
    if ( keep(element) )
        shadow.add(element);
/* If you'll work with toFilter in the same context as the filter */
toFilter = shadow;
/* Alternatively, if you want to modify toFilter in place, for instance if it's
 * been given as a method parameter
 */
toFilter.clear();
toFilter.addAll(shadow);
The best way to remove an item from a collection you are iterating over is to use the iterator explicitly. For example:
List<String> myList = new ArrayList<String>();
Iterator<String> myIt = myList.iterator();
while (myIt.hasNext()) {
    myIt.next();   // advance first; remove() deletes the element last returned by next()
    myIt.remove();
}

Resources