If globals are bad, why does every web framework use them? - global-variables

Every web development framework I've come across, including the best-designed ones (Web2Py, Kohana), makes use of some form of global variables to represent objects that are global within the application domain, for example objects representing the 'request' and 'response'.
On the other hand, the idea that global variables are bad is one of the fundamental axioms of programming. Indeed, the now-common disparagements of the singleton pattern usually point to the fact that it's nothing more than globals in disguise, as if that were explanation enough.
I'm trying to understand, once and for all, how globals can be so condemnable and at the same time be a seemingly indispensable part of all our web frameworks.

What is a global? From your text I assume you mean a variable declared at global scope. Such a variable can be overwritten by any assignment and break existing functionality.
However, in OO languages everything is inside a class, and assignment can be wrapped in getters and setters for properties, or completely hidden behind methods. This gives a safe way of dealing with globals. Note that in strictly OO languages (Java, C#, VB.NET, etc.) it is not possible to have true global variables (sometimes a language construct suggests otherwise, but static fields in C#, modules in VB, and mixins in Ruby are all wrapped in classes and are thus not truly global).
A singleton, which you mention, is a special kind of global. As a designer you control how many instances of it exist. A car needs only one engine, a country only one government (or war breaks out), and a program needs only one main thread. Globals are a necessity in programming; the real discussion should not be whether we need them, but how to create and use them soundly.
You say that request and response objects are globals in web development. They are not. They are (usually, depending on your toolset) convenience variables set in scope before your code runs. Since a web application can have multiple request objects in flight at any given time, they are a poor example of a global variable: they are not global, but usually local to, and a singleton within, your current thread.
One important thing you cannot get in traditional procedural languages (Basic, Pascal, C) is access control, and thus concurrency and thread safety, for global variables. In .NET, for instance, any static method or property in the BCL (one could say that any static member is global by definition) is thread-safe by design, and the guidelines for user-defined static methods and properties suggest you do the same.
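To make the 'controlled global' idea concrete, here is a minimal Python sketch (Python rather than C#/VB, and the class name is invented for illustration): the single instance is created lazily behind a lock, so access to the 'global' is both explicit and thread-safe.

    import threading

    class Configuration:
        """Application-wide settings exposed through one controlled instance."""
        _instance = None
        _lock = threading.Lock()    # guards lazy initialisation

        def __init__(self):
            self._settings = {}

        @classmethod
        def instance(cls):
            # Double-checked locking: cheap fast path, lock only on first use.
            if cls._instance is None:
                with cls._lock:
                    if cls._instance is None:
                        cls._instance = cls()
            return cls._instance

        def get(self, key, default=None):
            return self._settings.get(key, default)

        def set(self, key, value):
            self._settings[key] = value

    # Every caller goes through the same, explicitly accessed instance.
    Configuration.instance().set("debug", True)
    assert Configuration.instance().get("debug") is True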
EDIT: the danger lies with languages that allow global variables while presenting themselves as truly OO. Wonderful as these languages are, it is indeed dangerous to step outside the protection of OO and create globals in, for instance, Perl, Python, Ruby, or PHP.

I don't know exactly in what context those globals are used in web frameworks, but anything global starts to create problems as soon as you need solid access control. If you start using such a global in a concurrently executing program, it is quite hard to say who accessed and changed it, and when. It creates so-called shared state, which makes debugging even more difficult.
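As a hedged illustration (Python, with invented names), this is the classic way shared global state bites concurrent code: several threads update the same counter without synchronisation, and the final value is unpredictable, which is exactly the 'who changed it, and when?' problem above.

    import threading

    counter = 0  # shared global state

    def worker(iterations):
        global counter
        for _ in range(iterations):
            # read-modify-write is not atomic: another thread can interleave here
            counter += 1

    threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Expected 400000; without a lock the total can come up short, and how often
    # the race actually bites depends on the interpreter and platform.
    print(counter)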
Anyway, I am not really in favour of such blanket statements; they only lead to oversimplification. You have to weigh your requirements and then decide whether this or that pattern brings more positive or negative effects...

Related

Can a statically typed language support metaclasses like Smalltalk/Python/Ruby?

I find the concept of metaclasses fascinating: it treats classes as first-class objects. You can assign a class to a variable, pass it to a method, and even create new classes at runtime. It seems that every programming language supporting metaclasses is either dynamically typed (Smalltalk, Python, Ruby) or gradually typed (Objective-C and Groovy both support static and dynamic typing). I have not seen a statically typed language that supports metaclasses.
Are static typing and metaclasses incompatible with each other? It does seem to me that a metaclass's functionality requires a certain degree of dynamism. I still wonder whether it is technically possible for a statically typed language like Java, C#, or Kotlin to support metaclasses and have classes as first-class objects, or whether it is theoretically impossible.
Given that Python itself is implemented in C, and its objects can be "seen" and used in C through its API, the answer is "yes".
What may be harder in some languages is giving objects a class hierarchy with proper metaclasses, introspection, and runtime class creation: the code that implements such metaclasses would have to replicate part of the language runtime itself, or call different code-generation facilities at runtime. At the very least, a class could generate the source code declaring the dynamically created classes and work with the runtime to compile and load that code into the current process.
With C++, for example, where one has full control over the in-memory layout of data, it may be possible to emulate compiled classes with a lighter approach that just fills in data structures and redoes the name mangling to attach class members.
Whether that would be convenient, however, is another question altogether. It may be better to leave metaclasses either to languages that support them out of the box, or to cases where you are implementing a class system from scratch (as the Python runtime does in C).
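For concreteness, a small Python sketch (the names are invented) showing both sides of the question: a metaclass that intercepts class creation, and a class created at runtime with type(), i.e. classes treated as ordinary first-class objects.

    class Registered(type):
        """Metaclass: classes defined with it are themselves instances of this type."""
        registry = {}

        def __new__(mcls, name, bases, namespace):
            cls = super().__new__(mcls, name, bases, namespace)
            mcls.registry[name] = cls    # every new class registers itself
            return cls

    class Widget(metaclass=Registered):
        pass

    # A class built at runtime: type(name, bases, dict) returns a new class object.
    Button = type("Button", (Widget,), {"label": "OK"})

    assert Registered.registry["Widget"] is Widget
    assert Registered.registry["Button"] is Button   # the metaclass saw it too
    assert isinstance(Button, Registered)            # the class is itself an object
    print(Button().label)                            # "OK"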

Using Local Special Variables

For prototyping convenience I've relied on a number of global variables that are heavily used throughout the code. Now I'd like to make some of them local (but dynamic). Is there any significant downside (e.g. efficiency, or something else) to declaring them locally special instead of global?
Unpopular features of special variables include:
Lack of referential transparency
This makes it harder to reason about your code functionally: the same function can produce different results from syntactically identical calls (see the sketch after this list).
Introduce bugs
If a lexical variable of the same name is already defined somewhere in your code (e.g. in a system function), you will overwrite it and cause bugs.
Confusing
Special (dynamic) binding is unpopular, and will confuse your readers who are not familiar with it.
Unnecessary
Just use lexical binding, or even anaphoric macros instead.
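The referential-transparency point is easy to see even outside Lisp. A rough Python analogy (not Common Lisp; the variable name is invented), with module-level state standing in for a special variable:

    # `base_currency` plays the role of a special variable that other code may rebind.
    base_currency = "USD"

    def format_price(amount):
        # The result depends on state that is not among the arguments.
        return f"{amount:.2f} {base_currency}"

    print(format_price(9.99))   # "9.99 USD"
    base_currency = "EUR"       # some other code rebinds the "special"
    print(format_price(9.99))   # "9.99 EUR" -- same call, different result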
More information:
Anaphoric macros: see Let Over Lambda by Doug Hoyte, or Paul Graham's anaphoric macros in On Lisp.
LiSP (Lisp in Small Pieces) has a section on binding and dynamic binding.
Elisp has dynamic binding by default, and for a long time it enforced dynamic binding.
Many early Lisps had dynamic binding, but dropped it.

What is the difference between dynamic languages and functional languages?

I often find developers using the terms functional language and dynamic language together, and I wonder why they are always being put together.
What are the differences between them? Can a language be both dynamic and functional? Do they complement each other? Why do we need them anyway?
I'm a C# programmer and don't yet understand this whole dynamic/functional thing (C# is going to have some dynamic features in version 4; will it also be functional? What's going on here?).
Thanks,
Abraham
To put it simply (if not entirely accurately):
Dynamic languages are ones in which the type (the name of the class) is not as important as it is in their nemesis, statically typed languages. A variable may have objects of different types assigned to it at any point in time. Method invocations are resolved at runtime. This means you lose the benefits of static typing (compiler warnings), but simple methods become generic: sort(list) works for a list of strings as well as a list of ints. E.g. Ruby et al.
Functional languages value immutability. Programs are written in terms of bigger and bigger functions (usually bottom-up). The concept of object state and mutability is frowned upon. A function in this context is self-sufficient (the term is 'pure', as per Wikipedia): everything it needs to produce its output lies in the input it receives. It also produces no side effects (unless it explicitly declares them) and returns consistent output for a given input. This can lead to elegant code (see fluent interfaces), where input data is pipelined through different functions to produce the eventual output. E.g. Lisp et al.
However, the boundaries are being blurred as languages pick up the best of all worlds... You could have a language that is both, just one, or neither.
E.g. the predominantly static C# picking up lambda expressions in 3.0 and bringing in dynamic capabilities with 4.0.
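A small sketch in Python rather than C# (the helper names are invented) showing both halves of that description: the same call handling different element types resolved at runtime, and a pipeline of pure functions with no shared state.

    # Dynamic typing: one call, different element types resolved at runtime.
    print(sorted([3, 1, 2]))            # [1, 2, 3]
    print(sorted(["pear", "apple"]))    # ['apple', 'pear']

    # Functional style: pure functions, data flows through a pipeline, nothing mutated.
    def clean(words):
        return [w.strip().lower() for w in words]

    def keep_long(words, min_len=5):
        return [w for w in words if len(w) >= min_len]

    def initials(words):
        return [w[0] for w in words]

    raw = ["  Banana", "fig ", "Cherry", "kiwi"]
    print(initials(keep_long(clean(raw))))   # ['b', 'c'] -- same input, same output, every time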
Dynamic typing, a type system, is orthogonal to 'functional', a programming paradigm.
Dynamic 'languages' are actually dynamically typed. This means that you don't have compile-time checking of your variable types.
Functional languages offer lots of support for, e.g., the lambda calculus: anonymous functions.
An example of a language that is dynamically typed and supports anonymous functions: JavaScript. Ruby has some functional-style support too, and there are others.
Dynamic typing and functional programming are independent concepts. You can have either, neither or both in a language.
Static typing means that the types of objects are known at compilation time; in dynamic typing they are known only at runtime.
Functional programming means a programming style where computation is done by evaluating functions while avoiding state changes (for example, you use recursion instead of for-loops, because a loop would need to mutate a counter variable). This helps avoid bugs and makes concurrent programming easier. Pure languages require you to program in a functional style; others merely enable it.
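As a hedged Python sketch of the 'recursion instead of loops' remark (function names invented): the imperative version mutates a counter and an accumulator, while the functional version only evaluates expressions.

    def total_imperative(numbers):
        total = 0
        for n in numbers:    # state changes: the loop variable and the accumulator
            total += n
        return total

    def total_functional(numbers):
        if not numbers:      # no assignment, no mutation: just expression evaluation
            return 0
        return numbers[0] + total_functional(numbers[1:])

    assert total_imperative([1, 2, 3, 4]) == total_functional([1, 2, 3, 4]) == 10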
Example languages:
|----------------+---------+---------|
|                | Dynamic | Static  |
|----------------+---------+---------|
| Functional     | LISP    | Haskell |
| Not functional | PHP     | Java    |
|----------------+---------+---------|
Dynamic languages, on the other hand, are a broader concept. There is no exact definition, but usually the more of the compiler's work is moved to runtime, the more dynamic the language is. This means that in dynamic languages you can usually evaluate expressions, change object structure, and so on, at runtime.
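As a hedged Python illustration of that last sentence (class and attribute names invented): both evaluating an expression and reshaping an object happen while the program is running.

    class Plugin:
        pass

    p = Plugin()

    # Change object structure at runtime: attach a field and a callable to the instance.
    p.version = 2
    p.describe = lambda: f"plugin v{p.version}"
    print(p.describe())             # "plugin v2"

    # Evaluate an expression at runtime from a string.
    print(eval("p.version * 10"))   # 20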
If you're interested in paradigms, the paper Programming Paradigms for Dummies: What Every Programmer Should Know covers them.
In functional programming, state is implicit - the program executes by calling functions which call other functions. In imperative programming and object oriented programming, state is explicit - you change the value of a variable or object's field.
In a way, functional and imperative systems can be seen as duals - what's fixed in one is a dynamic value in the other.
Closures - which trap some explicit, mutable state in an object that can be called as a function - sit somewhere in between, being neither pure functional programming nor quite fully fledged objects; they are more like anonymous objects than functions.
'Dynamic languages' is a vague term, usually meaning one of the following (a short sketch follows the list):
Dynamically typed languages - languages which delay determination of type to runtime, but where the set of types is fixed. Examples are Smalltalk, Lisps, and current Fortress implementations. Some otherwise statically typed languages also allow some dynamic type checks - Java, C#, C++ and Ada (it was a failed dynamic conversion from float to int in Ada that crashed Ariane 5).
Languages with dynamic types - languages where new types can be created at runtime. The most popular is JavaScript. Because you have to run the program to determine the types, it's harder to build IDEs with type-aware autocompletion for these.
Languages which are dynamically compiled - languages where new scripts can be compiled at runtime. This is true of bash, JSP, PHP and ASP at the page scale, and true at a finer scale for Lisps and JavaScript, which support an 'eval' function that compiles and runs an expression.
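A hedged Python sketch lining up with those three senses (the names are invented): a type error that only surfaces at runtime, a new type created at runtime, and source text compiled and run at runtime.

    # 1. Dynamically typed: the error surfaces only when the line actually runs.
    def half(x):
        return x / 2
    # half("ten")   -> TypeError at runtime, not at compile time

    # 2. Dynamic types: a brand-new class built while the program is running.
    Point = type("Point", (), {"x": 0, "y": 0})
    print(isinstance(Point(), Point))   # True

    # 3. Dynamically compiled: a string of source turned into running code.
    print(eval("2 ** 10"))              # 1024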
Functional languages which are strongly typed often perform a large amount of type inference, so their programs commonly carry less explicit typing than poorly implemented statically typed languages. This can confuse people who have only seen the lack of explicit typing in dynamically typed languages into believing that type inference is the same as dynamic typing.
xtofl has already offered a good overall picture. I can speak to the C# point.
C# has been becoming easier to work with in a functional way for a while now:
C# 2 introduced anonymous methods, which made it easier to create delegates using state that was otherwise local to a method
C# 3 introduced lambda expressions, which are mostly like anonymous methods but even more compact
LINQ support in both C# 3 and .NET 3.5 made it easier to query data in a functional way, chaining together predicates, projections, etc.
None of the C# 4 features directly contributes to functional programming IMO, although named arguments and optional parameters may make it easier to create/use immutable types, which is one of the biggest features missing from the functional picture.
(There are other things functional languages often have, such as pattern matching and more impressive type inference, but you can write a lot of functional-style code reasonably easily in C#.)
C# 4 will gain some dynamic abilities through the dynamic type (which itself is effectively a static type you can do anything with). This will be somewhat "opt-in": if you never use the dynamic type, C# will still be a fully static language. There's no language support for responding dynamically, but the DLR has support for it: if you implement IDynamicMetaObjectProvider or derive from DynamicObject, for example, you can add dynamic behaviour.
I would say that C# isn't becoming a functional language or a dynamic language, but one in which you can code in a functional style and interoperate with dynamic platforms.

Execution speed of references vs pointers

I recently read a discussion regarding whether managed languages are slower (or faster) than native languages (specifically C# vs C++). One person who contributed to the discussion said that the JIT compilers of managed languages can make optimizations regarding references that simply aren't possible in languages that use pointers.
What I'd like to know is: what kinds of optimizations are possible on references and not on pointers?
Note that the discussion was about execution speed, not memory usage.
In C++ there are two optimization-related advantages of references:
A reference is constant (refers to the same variable for its whole lifetime)
Because of this it is easier for the compiler to infer which names refer to the same underlying variables - thus creating optimization opportunities. There is no guarantee that the compiler will do better with references, but it might...
A reference is assumed to refer to something (there is no null reference)
A reference that "refers to nothing" (the equivalent of a NULL pointer) can be created, but it is not as easy as creating a NULL pointer. Because of this, the check of the reference against NULL can be omitted.
However, none of these advantages carry over directly to managed languages, so I don't see the relevance of that in the context of your discussion topic.
There are some benefits of JIT compilation mentioned in Wikipedia:
JIT code generally offers far better performance than interpreters. In addition, it can in some or many cases offer better performance than static compilation, as many optimizations are only feasible at run-time:
The compilation can be optimized to the targeted CPU and the operating system model where the application runs. For example JIT can choose SSE2 CPU instructions when it detects that the CPU supports them. With a static compiler one must write two versions of the code, possibly using inline assembly.
The system is able to collect statistics about how the program is actually running in the environment it is in, and it can rearrange and recompile for optimum performance. However, some static compilers can also take profile information as input.
The system can do global code optimizations (e.g. inlining of library functions) without losing the advantages of dynamic linking and without the overheads inherent to static compilers and linkers. Specifically, when doing global inline substitutions, a static compiler must insert run-time checks and ensure that a virtual call would occur if the actual class of the object overrides the inlined method.
Although this is possible with statically compiled garbage collected languages, a bytecode system can more easily rearrange memory for better cache utilization.
I can't think of something related directly to the use of references instead of pointers.
Generally speaking, references make it possible to refer to the same object from different places.
A 'pointer' is the name of one mechanism for implementing references. C++, Pascal, C, and so on have pointers; C++ also offers another mechanism (with slightly different use cases) called a 'reference', but essentially these are all implementations of the general referencing concept.
So there is no reason why references are, by definition, faster or slower than pointers.
The real difference is in using a JIT versus a classic 'up front' compiler: the JIT can take data into account that isn't available to the up-front compiler. It has nothing to do with the implementation of the 'reference' concept.
Other answers are right.
I would only add that any optimization won't make a hoot of difference unless it is in code where the program counter actually spends much time, like in tight loops that don't contain function calls (such as comparing strings).
An object reference in a managed framework is very different from a passed reference in C++. To understand what makes them special, imagine how the following scenario would be handled, at the machine level, without garbage-collected object references: Method "Foo" returns a string, which is stored into various collections and passed to different pieces of code. Once nothing needs the string any more, it should be possible to reclaim all memory used in storing it, but it's unclear what piece of code will be the last one to use the string.
In a non-GC system, every collection either needs to have its own copy of the string, or else needs to hold something containing a pointer to a shared object which holds the characters in the string. In the latter situation, the shared object needs to somehow know when the last pointer to it gets eliminated. There are a variety of ways this can be handled, but an essential common aspect of all of them is that shared objects need to be notified when pointers to them are copied or destroyed. Such notification requires work.
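A toy Python sketch of that notification work (the names are invented; real C++ code would typically reach for something like std::shared_ptr): every time a holder takes or drops a pointer to the shared string, the shared object has to be told.

    class SharedString:
        """Shared character data that must hear about every new or dropped holder."""
        def __init__(self, text):
            self.text = text
            self.refcount = 0

        def acquire(self):
            self.refcount += 1    # notification on every copied pointer
            return self

        def release(self):
            self.refcount -= 1    # notification on every destroyed pointer
            if self.refcount == 0:
                print(f"freeing {self.text!r}")   # last holder gone: reclaim memory

    shared = SharedString("result of Foo()")
    in_list = shared.acquire()
    in_cache = shared.acquire()
    in_list.release()
    in_cache.release()    # prints: freeing 'result of Foo()'

In a multithreaded program those increments and decrements would themselves need synchronisation, which is the extra cost contrasted with the GC case below.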
In a GC system by contrast, programs are decorated with metadata to say which registers or parts of a stack frame will be used at any given time to hold rooted object references. When a garbage collection cycle occurs, the garbage collector will have to parse this data, identify and preserve all live objects, and nuke everything else. At all other times, however, the processor can copy, replace, shuffle, or destroy references in any pattern or sequence it likes, without having to notify any of the objects involved. Note that when using pointer-use notifications in a multi-processor system, if different threads might copy or destroy references to the same object, synchronization code will be required to make the necessary notification thread-safe. By contrast, in a GC system, each processor may change reference variables at any time without having to synchronize its actions with any other processor.

What are the downsides to static methods?

What are the downsides to using static methods in a web site business layer versus instantiating a class and then calling a method on the class? What are the performance hits either way?
The performance differences will be negligible.
The downside of using a static method is that it becomes less testable. When dependencies are expressed in static method calls, you can't replace those dependencies with mocks/stubs. If all dependencies are expressed as interfaces, where the implementation is passed into the component, then you can use a mock/stub version of the component for unit tests, and then the real implementation (possibly hooked up with an IoC container) for the real deployment.
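A minimal Python sketch of that point (the module and class names are invented): the first version is welded to a concrete dependency through a static-style call, while the second receives the dependency and can be handed a stub in a unit test.

    class SmtpMailer:
        @staticmethod
        def send(to, body):
            raise RuntimeError("would talk to a real SMTP server")

    def notify_hardwired(user):
        # Static call baked in: a test cannot swap the mailer out.
        SmtpMailer.send(user, "welcome")

    class Notifier:
        def __init__(self, mailer):
            self.mailer = mailer    # dependency passed in (or wired by an IoC container)

        def notify(self, user):
            self.mailer.send(user, "welcome")

    # In a test, a stub stands in for the real implementation.
    class StubMailer:
        def __init__(self):
            self.sent = []
        def send(self, to, body):
            self.sent.append((to, body))

    stub = StubMailer()
    Notifier(stub).notify("alice")
    assert stub.sent == [("alice", "welcome")]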
Jon Skeet is right--the performance difference would be insignificant...
Having said that, if you are building an enterprise application, I would suggest using the traditional tiered approach espoused by Microsoft and a number of other software companies. Let me briefly explain:
I'm going to use ASP.NET because I'm most familiar with it, but this should easily translate into any other technology you may be using.
The presentation layer of your application would be composed of ASP.NET aspx pages for display and ASP.NET code-behinds for "process control." This is a fancy way of talking about what happens when I click submit. Do I go to another page? Is there validation? Do I need to save information to the database? Where do I go after that?
The process control is the liaison between the presentation layer and the business layer. This layer is broken up into two pieces (and this is where your question comes in). The most flexible way of building this layer is to have a set of business logic classes (e.g., PaymentProcessing, CustomerManagement, etc.) that have methods like ProcessPayment, DeleteCustomer, CreateAccount, etc. These would be static methods.
When the above methods get called from the process control layer, they would handle all the instantiation of business objects (e.g., Customer, Invoice, Payment, etc.) and apply the appropriate business rules.
Your business objects are what would handle all the database interaction with your data layer. That is, they know how to save the data they contain...this is similar to the MVC pattern.
So--what's the benefit of this? Well, you still get testability at multiple levels. You can test your UI, you can test the business process (by calling the business logic classes with the appropriate data), and you can test the business objects (by manually instantiating them and testing their methods). You also know that if your data model or objects change, your UI won't be impacted, and only your business logic classes will have to change. Also, if the business logic changes, you can change those classes without impacting the objects.
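A hedged Python sketch of the shape being described (all class, method, and rule names invented): a business logic class exposing a static entry point, which instantiates a business object that knows how to persist itself.

    class Customer:
        """Business object: holds data and knows how to save itself via the data layer."""
        def __init__(self, name):
            self.name = name

        def save(self):
            # In a real application this would talk to the data layer.
            print(f"saving customer {self.name!r}")

    class CustomerManagement:
        """Business logic class: stateless entry points called from process control."""
        @staticmethod
        def create_account(name):
            if not name:
                raise ValueError("name is required")   # a stand-in business rule
            customer = Customer(name)
            customer.save()
            return customer

    # Called from the process-control layer (e.g. a submit handler).
    CustomerManagement.create_account("Acme Ltd")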
Hope this helps a bit.
Performance-wise, using static methods avoids the overhead of object creation/destruction. This is usually insignificant.
They should be used only where the action the method performs is not related to state, for instance for factory methods. It'd make no sense to create an object instance just to instantiate another object instance :-)
String.Format(), and the TryParse() and Parse() methods, are all good examples of when a static method makes sense. They always do the same thing, need no state, and are common enough that instantiating an object makes little sense.
On the other hand, using them when it does not make sense (for example, having to pass all the state into the method, say with 10 arguments) makes everything more complicated, less maintainable, less readable and less testable, as Jon says. It doesn't really matter whether this is in the business layer or anywhere else in the code: use them sparingly and only when the situation justifies them.
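A short Python sketch of that contrast (invented names): a parse-style static method that needs no state reads naturally, while a static method that has to be fed all of an object's state through its parameter list is a hint it wants to be an instance method.

    class Money:
        def __init__(self, amount, currency):
            self.amount = amount
            self.currency = currency

        @staticmethod
        def parse(text):
            # Stateless and deterministic for a given input: a good fit for static.
            amount, currency = text.split()
            return Money(float(amount), currency)

    class InvoiceFormatting:
        @staticmethod
        def format_line(amount, currency, quantity, discount, tax_rate,
                        customer_name, due_date, notes, locale, separator):
            # Ten parameters of passed-in state: most are unused here, but the
            # signature alone suggests this belongs on an object holding that state.
            return f"{customer_name}: {quantity} x {amount} {currency} (due {due_date})"

    price = Money.parse("9.99 EUR")
    print(price.amount, price.currency)   # 9.99 EUR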
If the method uses static data, this will actually be shared amongst all users of your web application.
Code-wise, there are no real problems beyond the usual issues with static methods in any system:
Testability: static dependencies are less testable
Threading: you can have concurrency problems
Design: static variables are like global variables
