I'm learning C++ on my own. I'm an EE and learned it about 20 years ago, but in the progress of my career I stopped programming and didn't take it up again until recently. I should add that I never took any classes in programming.
I have a theoretical question about pointers. In reading the books about pointers it seems they have an important role in C++. My problem is that I can't see what that is. I see that pointers have a role in arrays, but I can't see their role in anything else.
I can see what they do, but I don't see why pointers are used in the situations where I see them. Either references or plain variables would work just as well. I have a feeling the answer lies in the area of memory (its optimal use), but I just don't know.
Any answers would be appreciated. Thanks.
Consider the following from cplusplus.com:
"[T]here may be cases where the memory needs of a program can only be
determined during runtime. For example, when the memory needed depends
on user input. On these cases, programs need to dynamically allocate
memory, for which the C++ language integrates the operators new and
delete."
If you could determine all your memory needs prior to run time and did not need to make use of any abstract data type like a linked list, then yes, it would be difficult to see their use. However, what if you want to store values in an array, but you don't yet know how big that array will need to be?
Another value of pointers arises when you consider passing values from function to function. You may find this thread of value regarding the differences between pointers and references in C++ and how/why to use each.
We have been having several pedagogical conversations focused on pointers on the CSEducators.SE site. I'd encourage you to read those as well:
Simple Pointer Examples in C
Lesson Idea: Arrays, Pointers, and Syntactic Sugar
Pointers come from C, which had no concept of references; C++ inherited them from C.
Everything that can be done with a reference in C++ is done with a pointer in C.
I find this question really great because it is pure.
A programming language is considered "safe" when the programs written in it can only call functions and access data that the program can name.
Now, the concept of the pointer was invented to break this sandbox of safety and give the developer the freedom to think and act outside of the box.
Think of pointers as a poor man's tool for achieving something the programming language itself does not provide.
It is misleading to think you could achieve higher performance by programming some algorithm with pointers. Optimization is the privilege of the compiler and the hardware, not the human.
I'm not sure whether some weird things really make my code faster:
Is it normally better to use built-in operations or to write new specialized functions that do the same thing?
(for example a version of #'map only for vectors; my version is often faster without type declarations)
Should I define new (complicated) types to use them in declarations?
(for example a typed list)
Should I define slots directly on an object? (For example, px and py for a 2-dimensional object, or is it OK to use one slot pos of type vector, which I could reuse for other purposes?)
There are a few parts to this but here is a quick braindump
PROFILE!
Use a distribution of CL that has a profiler built in. I use sbcl, for example: http://www.sbcl.org/1.0/manual/Statistical-Profiler.html
The nice thing about the sbcl profiler is that once you have profiled a function, if you disassemble it, the machine code is annotated with statistics. This requires some knowledge of the target machine code.
Do not underestimate your implementation: implementations can have advanced type and flow analysis built in and are able to, for example, pick a vector-only version of map when it makes sense.
Learn compiler macros: compiler macros can shadow functions, which gives you a place to put extra optimizations based on the context of the form. It does this without replacing the function, so the function can still be used in a higher-order way.
Learn Type declarations
I found this series of blog posts helped me understand this technique http://nklein.com/tags/optimization/page/2/ Read em all!
ONE MASSIVE NOTE: Don't ever lie to your compiler about a type. Type declarations are a way of telling your compiler you know what the type is. The compiler doesn't even have to use them, and when it does, it doesn't have to check that you are giving it the correct thing.
Unboxed data
Some implementations are able to unbox certain datatypes under certain conditions. Sorry that this is vague, but you will need to read up on your implementation. For sbcl the 'sbcl internals' guide is very helpful.
For example:
(make-array 100 :element-type 'single-float :initial-element 0.0)
Can be stored as a contiguous block of memory in sbcl.
PROFILE AGAIN (With realistic data)
I spent 3 hours writing a crazy compiler-macro-based n-dimensional matrix multiplication routine and then tested it against a one-line built-in solution. For matrices below 5 dimensions there was not a big difference! For higher dimensions, yeah, it rocked, but that 'performance benefit' is purely academic because those code paths were never touched. Luckily I undertook the task for fun, as I was asking the same question you are now.
Algorithms
All the type specifiers in the world won't give you a 100-times performance increase. That comes from better techniques. Read up on the maths behind the problem, implement different helper functions that have different strengths, and choose between them at runtime... then go back and use compiler macros to allow lisp to choose at compile time. Or specify the technique as a higher-order function: for example, make-hash-table allows you to specify the hashing function and rehash sizes, and this can be crucial in getting good performance at certain sizes.
Know the limits of BigO
Algorithmic complexity means nothing if you lose all of the performance to memory locality issues. Conversely, we can sometimes achieve superlinear performance characteristics if, by splitting the problem among cores, the reduced dataset now fits in the L2 cache.
BigO is a great metric but it isn't the end of the story. This is the reason assoc lists are a totally valid alternative to hash-tables for low numbers of keys and certain access profiles.
Summary
There is a golden mantra I heard from somewhere in the lisp community that works so well:
Make it Fast and then make it Fast
If nothing else follow this. Chant it to yourself!
Get the program up and running quickly; in doing so you are more likely to spot the places where you can use a better technique or algorithm to get your several-orders-of-magnitude improvement. Do use CL's own functions first. Don't trade away lisp's higher-order nature too early by using macros; explore how far you can go with functions.
[Edit] More notes - the following is for sbcl
Type definitions on struct slots are used for optimizing, type declarations for class slots are not.
With regard to types, start with what makes the program easy to write and understand (Make it fast) and then look into access times if it is the bottleneck (make It Fast!)
(slot-value x 'name) is very fast when name is known. Look at how with-slots uses symbol-macrolet to its advantage.
So to kinda directly answer your original question:
built in first (also check libraries)
does it make the problem easier to write and understand?
use pos. By the time the performance of that indirection becomes an issue, you will have found a dozen other ways to speed up the program, and the solution will be part of a wider optimization technique.
So I'm having a hard time grasping the idea behind pointers and all that memory allocation.
I'm thinking that, with computers as powerful as they are nowadays, why do we have to use pointers at all?
Isn't there always a workaround to do things without the help of pointers?
Pointers are an indirection: instead of working with the data itself, you are working with (something) that points to the data. Depending on the semantics of the language, this allows many things: cheaply switch to another instance of data (by setting the pointer to point to another instance), passing pointers allows access to the original data without having to make (a possibly expensive) copy, etc.
Memory allocation is related to pointers, but separate: you can have pointers without allocating memory. The reason you need pointers for memory allocation is that the actual address at which the allocated block of memory resides is not known at compile time, so you can only access it via a level of indirection (i.e. pointers) -- the compiler statically allocates space for the pointer that will point to the dynamically allocated memory.
Pointers are incredibly powerful. Just because computers have faster processors nowadays doesn't mean there is any reason to abandon something as essential as pointers. Passing around giant chunks of memory on the stack is inefficient at best, catastrophic at worst. With pointers, you only need to maintain a reference to where the data resides, rather than duplicating huge chunks of memory each time you call a function.
Also, if you're copying all the data every time, how do you modify the original data? Aside from returning the copy of the structure in every call that touches it.
I remember reading somewhere that Dijkstra was assessing a student for a programming course; this student was quite intelligent but s/he wasn't able to solve the problem because there was sort of a mental block.
All the code was sort of ok, but what was needed was simply to use the expression
a[a[i+1]] = j;
and even if being so close to the solution still the goal seemed to be miles away.
Languages "without pointers" already exist... e.g. BASIC. Without explicit pointers, that is. But the indirection idea... the idea that you can have data to mean just where to find other data is central to programming.
The very idea of array is about being able to use computed values to find other values.
Trying to hide this idea is a horrible plan. According to Dijkstra, anyone who has been exposed to the BASIC language has already suffered such mental mutilation that it is impossible for them to recover as a good programmer (and probably the absence of explicit indirection was one of the problems).
I think he was exaggerating.
Just a bit.
Do you know if there are pointers in Haskell?
If yes: how do you use them? Are there any problems with them? And why aren't they popular?
If no: is there any reason for it?
Yes there are. Take a look at Foreign.Ptr or Data.IORef
I suspect this wasn't what you are asking for though. As Haskell is for the most part without state, it means pointers don't fit into the language design. Having a pointer to memory outside the function would mean that a function is no longer pure and only allowing pointers to values within the current function is useless.
Haskell does provide pointers, via the foreign function interface extension. Look at, for example, Foreign.Storable.
Pointers are used for interoperating with C code. Not for every day Haskell programming.
If you're looking for references -- pointers to objects you wish to mutate -- there are STRef and IORef, which serve many of the same uses as pointers. However, you should rarely -- if ever -- need Refs.
If you simply wish to avoid copying large values, as sepp2k supposes, then you need do nothing: in most implementations, all non-trivial values are allocated separately on a heap and refer to one another by machine-level addresses (i.e. pointers). But again, you need do nothing about any of this; it is taken care of for you.
To answer your question about how values are passed, they are passed in whatever way the implementation sees fit: since you can't mutate the values anyway, it doesn't impact the meaning of the code (as long as the strictness is respected); usually this works out to by-need unless you're passing in e.g. Int values that the compiler can see have already been evaluated...
Pass-by-need is like pass-by-reference, except that any given reference could refer either to an actual evaluated value (which cannot be changed), or to a "thunk" for a not-yet-evaluated value. Wikipedia has more.
I will be teaching a course on the fundamentals of programming next Fall, first year computer science course. What are the pros and cons of teaching pointers in such a course? (My position: they should be taught).
Edit: My problem with the "cater your audience" argument is that in the first couple of years in University, we (profs) do not know if students would like to be scientists or not... we wish we knew, but we have to strike a balance between those who will remain in school (4 years does not a scientist make), and those who will be engineers.
Final decision: At least references, but possibly pointers without pointer arithmetic.
At the very least you should teach references or some equivalent concept. I think you should probably take it easy on things like pointer arithmetic, c arrays and strings, but indirection is a very important concept in computer science, and students should be introduced to it.
Yes.
Pointers underpin a huge number of concepts in other, higher level languages, and I'm firmly of the opinion that you need to teach a certain amount of the lower-level stuff to facilitate a good understanding of why we bother with anything higher level at all.
Once you understand a bit about how memory is allocated, and how it's addressed and manipulated with pointers, explaining a lot of other constructs gets easier. For example, explaining a NullPointerException in Java, or even the concept of references in such languages is child's play if you've got someone who understands pointers in C (and better still, if they also grok references in C++).
Absolutely teach them. Understanding indirection is essential for programming, whether it's with pointers, references, dynamic binding, or any number of other things. Now obviously don't start off with them, but understanding indirection is at least as important as understanding control flow ideas.
The con of course is that some people just won't get it and will do poorly or drop out. If this is a course for people who want to be CS majors then don't sweat it because you're just giving them incentive to switch majors earlier rather than later. If it's more of a general ed course for people who are kind of interested in programming, then they should probably still be introduced, but not graded harshly or heavily.
During my first year as a CS student, I took a Java course in fall which was the general intro. The professor didn't teach pointers directly, but he did teach the concept of references, and why you can modify objects but not primitives when either is passed as an argument.
During my 2nd semester, I took the next course in the series, which was about C, and this class heavily relied on pointers.
For an intro to programming class, I'd say just mention references, but not pointers directly.
I think that a "fundamentals of programming" course should at least touch on basic processor architecture and assembly language, and if it does, you can't really make a case for not discussing pointers.
If you only teach higher-level (byte-code) languages, then I guess pointers would confuse the audience.
Pros: solid understanding of the way that memory is used by the machine, the difference between (and pitfalls of) pointers to data on the heap vs. pointers to data on the stack, passing methods by address, etc.
Cons: complex for an audience who is not yet knowledgeable (or has not had enough time to assimilate the concepts) of computer architecture, including what is stack, what are registers, calling conventions, etc.
So, to summarize, it depends a lot on your audience and on the language(s) you'll tackle (pointers will be meaningless in the context of LISP or Java), as well as on how deep you are willing to go in the direction of what is heap, what is stack, how scope is translated into stack (i.e. why never to return a pointer to a local variable), etc.
When I taught pointers to an engineering class I ultimately fired up a debugger on a simple "hello world" program, and showed the students the actual machine code, register values and corresponding memory dumps, with the stack manipulation and parameter passing, etc., but they were ready for it. Would your audience be receptive to such a drill-down expedition, to ensure solid understanding of what's going on behind the scenes, and would you be willing to go to such lengths? :)
I think you shouldn't teach it first. But later, once basic concepts of programming are acquired.
A good example is Stroustrup's latest book, Programming -- Principles and Practice Using C++, in which he teaches how to make a parser and how to use I/O (streams) and a GUI before even talking about pointers!
I think it would be a good reference for teaching, because it is more natural to understand how we build ideas than how many constraints (memory management, for example) we have to handle at the same time to make software work. I really recommend this book for a fresh perspective on teaching the fundamentals of programming.
It really depends on the goal of your course - teaching programming and teaching computer science are two separate goals, and though they are not mutually exclusive, introductory classes generally do not teach both equally well. Here's an example of the difference: say we want to learn how to sort a list. A programming course in C++ would teach you to use the syntax of a std::sort function template, and homework might be writing several comparison functors. A computer science course would explain to you what a merge sort is, what the algorithm looks like in pseudo-code, and its performance/space characteristics, and homework would be writing the sort function itself.
So if you are teaching introductory programming, then yes, you should teach your students about pointers.
If you are teaching computer science, then no, there is no need to understand pointers at an introductory level.
Anybody who calls themselves a good programmer must know how pointers work; being a good programmer implies that they do not know only a single programming language, but that they know how programming languages work in general, allowing them to adapt to programming languages they haven't seen before.
This doesn't mean that a fundamentals of programming course should be teaching pointers, however.
If your goal is to give these people a complete, well-rounded familiarity with programming languages in general, then yes, pointers should be part of that.
If your way of introducing them to programming is to use one programming language at first, with the intention of covering other languages in subsequent courses, and pointers are not relevant to that language, then there's no need to talk about pointers yet.
I think there's a lot to be said for starting people out in one language only, rather than trying to cover every style of language at once.
My first introductory programming course used Haskell. It wasn't until a subsequent course using C that pointers were introduced (I was already a good C and C++ programmer when I took the course; those subjects were mandatory).
Most people suggest that learning assembly is essential, that it's important to know the underlying workings of the computer, and so forth. But what I'm looking for are some practical suggestions that will make the effort of learning Assembly worth it.
What are your suggestions? What am I missing out on by not learning Assembly and pointers/memory management in general?
I think the main practical advantage to learning low-level things like assembly language, pointers, and memory management is that when you're writing or reviewing high-level code you're better able to instinctively or subconsciously spot performance issues or other pitfalls.
An average developer might write a simple loop and think, "This code iterates over a set of integers and writes each to the console."
An expert developer might write the same loop and think, "This code iterates over a set of integers, and has to box each element to call the ToString method and ToString has to format the string in base 10 which is somewhat non-trivial, and then both the boxed integer and the formatted string will soon be eligible for garbage collection as no references will remain, and the first time this method runs, it will need to be JIT'ed..." and so on.
9 times out of 10, it may not matter. But that 1 time out of 10, the expert developer is likely to notice a problem in code that the average developer would never think to consider.
Pointers/memory management are more general than assembly language. You need to understand them for C and C++ as well, which you might need if you have to maintain code written in C.
For assembly language, it is sometimes useful to read the assembler code that the C compiler generates, to find out whether it generates correct and efficient code.
You need to learn to read assembly so you can figure out what goes wrong when a complex statement bombs out. The CPU debug window shouldn't be a mysterious place.
This is sort of one of those questions that will always be asked: "Why should I know anything." etc. Well, perhaps you could get a job doing something besides building the next generic CRUD application or something like that. If you want to do any sort of system development, having a working knowledge of assembly is very helpful, if not vital. As far as what you're "missing out" perhaps you are missing out on actually knowing how computers work. Some people think this is desirable. Some people don't. Some people build processors. Some people dig ditches. It's all a matter of personal preference :)
I think it's great to learn new languages. It opens my mind. Some languages are more mind-opening than others. I'd say assembler is one of those. It forces you to think about stuff like the call stack and instruction pointer. And it'll make you appreciate higher level languages even more. Another fun language to learn is PostScript.
I don't think you need to learn assembly for anything practical. However, it will ensure that you understand the real roots of what you are doing as a developer. In essence, assembly programming is a discipline for learning chip logic and architecture. I haven't programmed assembly in over two decades but it still informs the kinds of choices I make when programming C#.
But what I'm looking for are some practical suggestions that will make the effort of learning Assembly to be worth it.
Learn what assembly is.
Really learn how to read (and understand) small fragments of it: how to walk/step through it in your mind.
Perhaps too, step through some of it with a debugger (including seeing memory and registers being changed).
Ideally, find some annotated assembly.
But, don't bother to learn how to write assembly: instead, learning to write C or C++ is probably 'low' enough for most practical purposes.
Well, on a practical level I did a class in 6502 assembler when I was first learning to code in the early 80s. I also did some 8088 assembler. It's been of occasional use over the years since, but I can't say it's ever really got me out of a hole on more than one or two occasions in 25 years. Grokking C at a pretty fundamental level is of far more use. YMMV and it's certainly helpful as background, but as a direct practical benefit? Marginal really.
Perversely though one thing that has proved useful is at an even lower level. I did a class on chip design (NAND gates and the like) and as part of that was taught formal Boolean logic at some depth. That's been massively useful ever since - it's surprising the number of coders who don't really know what they are doing with ands, ors and nots :-)
Pointers and memory management are really a different question than assembly. If you want to do C/C++, then you need to learn pointers and memory management, because those are part of the language. But, even if you plan to use nothing but (say) Java all your life, you should learn something about memory management to keep from writing a memory leak despite the GC, and pointers are just the difference between atomic types and object references. You need the concepts or you'll write programs that don't work!
Practical reasons for learning assembly: debugging and optimization. Even if you don't write any assembly, one of these days you may need to optimize C/C++ code for performance. In that case, you'll need to be able to read the assembly for your inner loop, even if you never need to write another line of it.
Ultimately, I think your distinction between "knowing the underlying workings of your computer" and "practical suggestions that will make the effort of learning assembly worth it" is a false one. Ignorance does not pay. Learning how your computer works is a practical suggestion worth the effort!
I have a prophecy: someday soon, your program will run far too slowly to be practical, and crash intermittently with an out-of-memory exception. On that day, the sheer screaming anxiety of not knowing what the hell is going on or where to start looking in order to fix it will refund your karma debt, with interest...
These days many assembly languages are actually fairly high level.
And it's always been true that if you learn 'C', that's close enough to assembly to get most of the learning benefits.
edit: thinking about this a bit more, in Knuth's books he describes an idealised assembly language. You won't go far wrong learning that, and reading those books.
Another practical reason I can think of is reverse engineering application code to modify it for educational purposes ONLY, since this is widely used by crackers to bypass shareware application protections like time-limit or serial numbers.
An application like win32Dasm can convert executables into assembly code that can later be modified with a Hex editor like hiew. You can learn quite a lot about the flow of the program.
I think learning about computer architecture, in conjunction with assembly, would open your mind quite a bit.
It would help explain lots of performance issues - e.g. a parser is slow because there are lots of branches, the pipeline gets flushed very easily, and the branch predictor cannot compensate for everything.
Also, different architectures have their quirks. Someone talked about an assembly trick to swap 2 registers in place, involving xor's. It works, and it would run great for in-order execution core (most recent example would be the Intel Atom, and the Via C7 in netbooks), but not so great in out-of-order cores.
Knowing that may help you to detect poorly compiled code by inspecting it in assembly, and possibly to write code in a higher-level language that sidesteps the imperfections of compiler optimizers. I'm not trying to diss them, but they just can't be perfectly tuned.
The biggest practical advantage of learning Assembly is performance. You can optimize to near perfection when it's required.