GNU file tree... is this advisable for beginners? - unix

I am in academia, where most of the coding is done in Fortran, C(++) (and a little bit of Python). My question is about projects of 2-3k lines of Fortran code.
In GNU projects, we generally see a very standard set of directories, e.g. src, doc, etc., and files like README, ChangeLog, etc. (I have not found anything that says these are standard directories, so I assume they are good practice).
Now, when teaching new students (undergrads taking their first coding course), is it advisable to introduce them to this "standard" tree with the exact names? I have not seen many projects in the academic world follow this.
So, for beginners, what is better?

These are not standards in the sense of "standardised by some organization", but best practices ("coding conventions") that entered common use many years ago. So I think it is good to introduce students to these practices if they will not only have to write software, but also install and use software made by others.
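As a minimal sketch of one such layout for a small Fortran project (the exact file and directory names here are only an illustration, not mandated by any standard):

    myproject/
        README          short description and build/usage instructions
        ChangeLog       chronological list of notable changes
        COPYING         the license
        Makefile        build rules
        src/            the Fortran sources
        doc/            user and developer documentation
        test/           test programs and reference data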

Related

What is a good language to develop in for simple, yet customizable math programs?

I'm writing to ask for some guidance on choosing a language and a course of action in learning programming. I apologize if this type of question is inappropriate for Cross Validated; please direct me to another forum if that is the case.
I've seen thread after thread with questions from newbies, asking, "What is the best language to start with?" and then it always starts a flame war or someone just answers, "There's no best language, it's best to pick one and start learning it." My question is a little bit more focused than that.
First off, I've been programming my whole life, in very limited capacities. My deepest training was in C++. Whilst in my EECS degree program, I resolved to never be a software developer because I couldn't stand not interacting with people for such long periods of time. Instead I realized I wanted to be a math teacher, and so that is the path I have taken.
But now that I'm well down that path, I've started to realize that perhaps I could develop my own software to help me in the classroom. If I want to demonstrate the Euclidean algorithm, what better way than to have a piece of software that breaks down the process? Students could run that software as part of their studies, and the advanced students might even develop programs for themselves. Or, with an iPad in hand, why not have an app that lets students take their own attendance? It would certainly streamline some of the needs of classroom management.
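(To make that concrete, here is the sort of thing I have in mind, sketched in Python purely as an illustration since I have not settled on a language: a little program that prints each division step of the Euclidean algorithm.)

    def euclid_steps(a, b):
        # Show every division step of the Euclidean algorithm for gcd(a, b).
        while b != 0:
            q, r = divmod(a, b)
            print(f"{a} = {q} * {b} + {r}")
            a, b = b, r
        print(f"gcd is {a}")

    euclid_steps(1071, 462)
    # 1071 = 2 * 462 + 147
    # 462 = 3 * 147 + 21
    # 147 = 7 * 21 + 0
    # gcd is 21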
There's obviously a lot of great stuff already out there for math, and for education, but I want a way to more directly create things specific to my lectures. If I'm teaching a specific way of calculating a percent, I want to create an app that aligns with my teaching style, not just another calculator app that requires the student to learn twice.
The most I use in class right now is iWork Numbers/Microsoft Excel for my stats class. Students can learn the basic statistical functions, and turn some of their data into graphs.
I have dabbled a bit with R, and used Maple in college. I've started the basic tutorials for OS X/iOS development and have actually made good progress making an OS X app that takes a text string, converts it to numbers, and performs encryption using modular addition and multiplication. I sometimes use Wolfram|Alpha to save myself some time in getting quick solutions to equations or base conversions. I know of MATLAB and Mathematica, and recently people have been telling me to check into Python or Ruby. I also know basic HTML, and, while it's forgotten now, I learned JavaScript and Perl in college.
If I keep on the path of Obj-C/Cocoa, I think it will have great benefits. Unfortunately, anything I produced for the Mac would only be usable on a Mac, so it wouldn't be universal for all of my students. Perhaps, then, learning a web language would be better. Second, I'm wondering: if the primary use is mathematical, then perhaps my time would be better spent learning the Mathematica programming language, or R, or something based less on GUIs and more on simple coding of algorithms, maybe Python or Ruby?
It seems that Mathematica already has a lot of demos for different math concepts, so why reinvent the wheel is also a question I have. I think overall, it would be good to have more control and design things the way I need. And then, if I do want to make an "Attendance" app or something else, I would already have the programming experience to more easily design something for my iPad or MacBook.
The related question to this is: what is a good language to teach to my students? In his TED talk, Conrad Wolfram says one of the best ways to check a student's understanding is to have them write a program. But if Mathematica does the math virtually automatically for them, then I'm not sure they will get the deeper experience of working out the logic for themselves, like you do when you're writing C or a traditional procedural language.
I know that programming takes time to learn, but I also know that at this point, my goal is not to be able to make an app like "Tiny Wings." With the app store ease, some of my work may be an extra revenue stream, but I see myself as more of a hobbyist, and now teacher looking to software development specifically for its ability to help me demonstrate mathematical concepts.
I think I will push ahead with Obj-C/Cocoa for OS X/iOS, but if anyone has some better guidance regarding all of the other available stuff, it would be much appreciated. I don't think I would want to go fully to the web (I like apps), but perhaps someone could suggest a nice way of bridging what I produce in Xcode to a universal web version. For example, if you come up with an algorithm in Obj-C, is it easiest to transition that to Ruby and run it online, or is there another approach that works better?
Mathematica is pretty awesome for the first part of your question. I've used the interactive mode (Manipulate[]) for explaining things to my colleagues (and myself). It makes really nice dynamic figures and is fairly expressive (although your code can end up looking like line noise). It is very powerful, but it does far less for you than you might think. It's pretty intuitive, which is a good thing for teaching.
You could use Scala if you want an "easy" way to make a domain specific language for teaching. Python seems to confuse people as a first programming language. Objective C seems like a completely random choice to me.
Mathematica then. It's worth the price. But anything that is interpreted and has an interactive shell is probably better than a compiled language. BBC BASIC?
Nothing beats Haskell for general-purpose mathematical programming. The wiki's quite extensive and the IRC channel (#haskell on Freenode) is great for asking questions. If you statically link your binaries on compilation, you should be able to run your programs on just about any system (with a few exceptions, e.g., libgmp).
Haskell code reads (roughly) like mathematical notation once you get the hang of it, so it can really help to tie things together for your students who are motivated to write their own programs. The purely functional style can be beneficial, as well, since it focuses less on I/O and the marshalling of data (perfectly useful in applications, perhaps less so in pure math), and more on the actual creation and refinement of functions and algorithms. You can even compose functions just as you would on paper.
If you want to get really serious, you could also look into Coq or Agda, but those might be a bit much for most classes.
For a Haskell program idea for an educator, check out this link.
A nice list of arguments can also be found at:
Eleven Reasons to use Haskell as a Mathematician and the book The Haskell Road to Logic, Maths and Programming

Chances of IDL in Image processing

I am a software engineer working in medical imaging. I have just started using the language IDL and I feel very comfortable with it. As a newcomer to this field using a language like IDL, I would like to know the prospects for IDL in this field. Can anyone help me?
Well, here is my biased opinion: I'm heading the opposite way to you. I have used IDL (and before that PV-Wave) on and off for about 10 years (mostly MRI), and I'm now trying to part from it. Here is why. If you are proficient, you can very quickly test something in an interactive / lightly scripted fashion. This is the typical use case for scientists; most have little CS education and are happy to grab any tool that seems helpful. In fact, IDL is fairly good at dealing with largish arrays/images, etc., as you are likely to encounter in imaging.
However, it is not very pretty, and coding gets increasingly awkward as your project size increases. If you are a software engineer, I suspect you'll hit the limits soon and will be cursing it no end. If you try to develop GUI code for people around you, you might be in for a rough ride. This is one of the main reasons I am moving over to Python + EPD with scipy and the like. Also, bindings to the existing sophisticated image processing tools you might need (registration, segmentation, etc.) are not ideal.
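As a minimal sketch of the kind of array/image handling I mean, this sort of thing takes only a few lines in Python with numpy and scipy (the synthetic image and the filter choices here are just illustrative):

    import numpy as np
    from scipy import ndimage

    # A synthetic 128x128 "image": random noise with a brighter square in the middle.
    image = np.random.rand(128, 128)
    image[48:80, 48:80] += 2.0

    smoothed = ndimage.gaussian_filter(image, sigma=3)   # Gaussian smoothing
    mask = smoothed > smoothed.mean()                    # crude threshold "segmentation"
    labels, n_regions = ndimage.label(mask)              # connected-component labelling
    print(f"found {n_regions} region(s)")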
A further complaint I have is the ongoing licensing costs. Even in an academic environment they are becoming prohibitive, and I'd rather spend the money on a co-op student who could code for me than on ITT. A nice feature, though, is the ability to compile almost all IDL code into a .sav file that others can use with the free IDL Virtual Machine.
Essentially, what it will come down to is how much your collaborators need you to use IDL. If it's fully your choice, I would look elsewhere. If there is a significant (and decent) code base, I would stay. The medical imaging plus astro community is dependent enough to keep this going for a while. If you do decide to hang on, I can highly recommend Dave Fanning's writings (his web page + his book + the google-group). He is somewhat of an icon in the IDL community and certainly taught me things that were very useful. (Check out the mighty histogram function, I'm not kidding!)
Hope this works out for you.

Statistical tools for programmers [closed]

I'm trying to evaluate the purchase of a statistical tool. This will be used in part by non-programming users (doing clinical studies) and in part by programmers, so I'm trying to find a good compromise between usability and automation. Of course, cost is an issue, but if I can build a solid case, we could probably buy a commercial package, so we're not totally limited to free options.
So far, our options are:
Statistica (which some non-programmers already know)
Matlab Statistics toolbox (programmers already use matlab)
R language (would need a UI for non-programmers)
Hack something into Excel (not fun, but that's what non-programmers do right now)
?...
What else is out there? What's the industry standard? What kind of distinctive features should I look for? What would you recommend, and why?
Ideally, we'd like a tool that can run both on Linux and Windows machines.
(I work in medical imaging, so we do both biostatistics, and software engineering statistics)
Hands down it's R. R is very programmer friendly. It has functional aspects and it's GNU.
S-PLUS and R are both based on the S language. They are similar, and in most cases you can run an S-PLUS program in R and vice versa.
SAS is another option, although geared more towards BI and enterprise use. SAS has a simpler syntax than R and, in my opinion, is easier to pick up for a non-programmer.
Other options include SPSS, Matlab, and even Excel.
I recommend R, personally. It's used by bioinformaticians and psychologists, I hear. Don't know what your field is though, so maybe it's a lousy choice. It is reasonably easy to use and learn.
Stata and SPSS tend to be the most commonly used packages in clinical studies. Both are pretty easy to pick up and use for non-technically minded folks but are generally flexible enough. I've used Stata more than any of the others and have been pretty happy with its options (supports both menu-based and command line operation, decent enough plugin system to get new user-created modules, good graphing support).
R is a little more daunting for newbie users, though it is popular with the biostatisticians. Since it's free, that's another nice point in its favor.
For a statistical package with a GUI which non-technical users can use, I would recommend that you go with "SAS Enterprise Guide". You will get the common and advanced SAS procedures, an excellent graphics facility and the ability to program for the technical users. I recommend that you start with the "SAS Learning Edition" (http://support.sas.com/learn/le/) which is a fully functional version of Enterprise Guide, but limited to processing 1000 rows at a time only. It is under $500, which makes it a pretty good deal.
I would look at S-Plus.
You get a strong programming environment (S-Plus Workbench, based upon the Eclipse platform), an intuitive GUI for non-programmers, and an extensive user community (including users of R, which was based upon the original S).
Visual Numerics is another option.
It sounds like you're trying to maximize multiple goals. You say "This will be used in part by non-programming users (doing clinical studies) and in part by programmers, so I'm trying to find a good compromise between usability and automation", with an implicit assumption that this will be the same tool in both cases, when that might not be realistic. What's the compromise for Word and LaTeX, for example?
Some different questions about the requirements:
Should it be extensible for programmers
Able to use C extensions
Easy to make new procedures and methods
What analysis are non-programmers going to want to use?
Graphics?
Ease of use for different groups
So my read on this:
Easy to extend: R/S-plus, Matlab/Octave (I happen to prefer R, but I do more stats and fewer matrix things)
Easy to use for normal people: Excel, custom wrapped R, SPSS
Also, R on windows has a limited GUI, which may or may not help your users.
If it was me, I'd go with a hybrid solution. Use R, and give non-programmers a cheat sheet that illustrates common tasks, or even better, write some wrapper functions with names like "image_summary" that automate their exploratory work.
For writing front-end scripts for R, the RPy Python wrappers might help as well.
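As a minimal sketch of that idea (using rpy2, the current incarnation of the RPy wrappers, and the hypothetical image_summary name from above):

    import rpy2.robjects as robjects  # successor to the original RPy wrappers

    def image_summary(values):
        # Hypothetical wrapper: push a Python list into R and run summary() on it.
        robjects.globalenv["x"] = robjects.FloatVector(values)
        print(robjects.r("summary(x)"))

    image_summary([1.2, 3.4, 2.2, 5.0, 4.1])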
SAS Enterprise Guide has good usability for non-programmers. Also, it has good options to connect to Excel. And for programmers, it's the most robust option out there. The SAS server runs on anything; Enterprise Guide, though, is Windows-only.
Consider Excel one more time. It is well known and widely available. Refer to this book or this book.
This Wikipedia page compares the features available in several statistical packages, as well as their OS compatibility and pricing info (which seems a little out of date, but it gives an overall idea).
We ended up getting the Matlab Statistics toolbox (mainly because we already have some experience with Matlab in the team, and needed the tool anyway).
So far, it's doing what we need, and it's easily extensible. Usage will show whether non-programmers really use it, but so far it's looking good.

What are the practical advantages of learning Assembly? [closed]

Most people suggest that learning assembly is essential: it's important to know the underlying workings of the computer, and so forth. But what I'm looking for are some practical suggestions that will make the effort of learning Assembly worth it.
What are your suggestions? What am I missing out on by not learning Assembly and pointers/memory management in general?
I think the main practical advantage to learning low-level things like assembly language, pointers, and memory management is that when you're writing or reviewing high-level code you're better able to instinctively or subconsciously spot performance issues or other pitfalls.
An average developer might write a simple loop and think, "This code iterates over a set of integers and writes each to the console."
An expert developer might write the same loop and think, "This code iterates over a set of integers, and has to box each element to call the ToString method and ToString has to format the string in base 10 which is somewhat non-trivial, and then both the boxed integer and the formatted string will soon be eligible for garbage collection as no references will remain, and the first time this method runs, it will need to be JIT'ed..." and so on.
9 times out of 10, it may not matter. But that 1 time out of 10, the expert developer is likely to notice a problem in code that the average developer would never think to consider.
Pointers/memory management are more general than assembly language. You need to understand them for C and C++ as well, which you might need if you have to maintain code written in C.
For assembly language, it is sometimes useful to read the assembler code that the C compiler generates, to find out whether it generates correct and efficient code.
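A loosely analogous exercise, if you work in Python rather than C, is to look at the bytecode the interpreter generates (not assembly, but the same habit of checking what your code compiles to); the standard dis module does this:

    import dis

    def add_and_scale(x, y):
        return (x + y) * 2

    # Print the bytecode CPython generates for the function.
    dis.dis(add_and_scale)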
You need to learn to read assembly so you can figure out what goes wrong when a complex statement bombs out. The CPU debug window shouldn't be a mysterious place.
This is sort of one of those questions that will always be asked: "Why should I know anything." etc. Well, perhaps you could get a job doing something besides building the next generic CRUD application or something like that. If you want to do any sort of system development, having a working knowledge of assembly is very helpful, if not vital. As far as what you're "missing out" perhaps you are missing out on actually knowing how computers work. Some people think this is desirable. Some people don't. Some people build processors. Some people dig ditches. It's all a matter of personal preference :)
I think it's great to learn new languages. It opens my mind. Some languages are more mind-opening than others. I'd say assembler is one of those. It forces you to think about stuff like the call stack and instruction pointer. And it'll make you appreciate higher level languages even more. Another fun language to learn is PostScript.
I don't think you need to learn assembly for anything practical. However, it will ensure that you understand the real roots of what you are doing as a developer. In essence, assembly programming is a discipline for learning chip logic and architecture. I haven't programmed assembly in over two decades but it still informs the kinds of choices I make when programming C#.
But what I'm looking for are some practical suggestions that will make the effort of learning Assembly to be worth it.
Learn what assembly is.
Really learn how to read (and understand) small fragments of it: how to walk/step through it in your mind.
Perhaps too, step through some of it with a debugger (including seeing memory and registers being changed).
Ideally, find some annotated assembly.
But, don't bother to learn how to write assembly: instead, learning to write C or C++ is probably 'low' enough for most practical purposes.
Well, on a practical level, I did a class in 6502 assembler when I was first learning to code in the early 80s. I also did some 8088 assembler. It's been of occasional use over the years since, but I can't say it's really got me out of a hole on more than one or two occasions in 25 years. Grokking C at a pretty fundamental level is of far more use. YMMV, and it's certainly helpful as background, but as a direct practical benefit? Marginal, really.
Perversely, though, one thing that has proved useful is at an even lower level. I did a class on chip design (NAND gates and the like), and as part of that I was taught formal Boolean logic at some depth. That's been massively useful ever since - it's surprising the number of coders who don't really know what they are doing with ands, ors and nots :-)
Pointers and memory management are really a different question than assembly. If you want to do C/C++, then you need to learn pointers and memory management, because those are part of the language. But, even if you plan to use nothing but (say) Java all your life, you should learn something about memory management to keep from writing a memory leak despite the GC, and pointers are just the difference between atomic types and object references. You need the concepts or you'll write programs that don't work!
Practical reasons for learning assembly: debugging and optimization. Even if you don't write any assembly, one of these days you may need to optimize C/C++ code for performance. In that case, you'll need to be able to read the assembly for your inner loop, even if you never need to write another line of it.
Ultimately, I think your distinction between "knowing the underlying workings of your computer" and "practical suggestions that will make the effort of learning assembly worth it" is a false one. Ignorance does not pay. Learning how your computer works is a practical suggestion worth the effort!
I have a prophecy: someday soon, your program will run far too slowly to be practical, and crash intermittently with an out-of-memory exception. On that day, the sheer screaming anxiety of not knowing what the hell is going on or where to start looking in order to fix it will refund your karma debt, with interest...
These days many assembly languages are actually fairly high level.
And it's always been true that if you learn 'C', that's close enough to assembly to get most of the learning benefits.
edit: thinking about this a bit more, in Knuth's books he describes an idealised assembly language. You won't go far wrong learning that, and reading those books.
Another practical reason I can think of is reverse engineering application code to modify it, for educational purposes ONLY, since this is widely used by crackers to bypass shareware protections like time limits or serial numbers.
An application like win32Dasm can convert executables into assembly code that can later be modified with a hex editor like hiew. You can learn quite a lot about the flow of the program.
I think learning about computer architecture, in conjunction of assembly, would open your mind quite a bit.
It would help explain lots of performance issues, e.g. a parser is slow because there are lots of branches, the pipeline gets flushed very easily, and the branch predictor cannot compensate for everything.
Also, different architectures have their quirks. Someone talked about an assembly trick to swap 2 registers in place, involving xors. It works, and it runs great on an in-order execution core (the most recent examples would be the Intel Atom and the VIA C7 in netbooks), but not so great on out-of-order cores.
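For reference, that xor swap looks like this (sketched here in Python rather than assembly, purely to show the idea):

    # Swap two integer variables without a temporary, using xor.
    a, b = 12, 7
    a ^= b
    b ^= a   # b now holds the original value of a
    a ^= b   # a now holds the original value of b
    print(a, b)  # prints: 7 12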
Knowing that may help you detect poorly compiled code by inspecting it in assembly, and possibly write code in a higher-level language to sidestep the imperfections of compiler optimizers. I'm not trying to diss them, but they just can't be perfectly tuned.
The biggest practical advantage to learning Assembly is performance. You can optimize to near perfection when it's required.

The Clean programming language in the real world?

Are there any real world applications written in the Clean programming language? Either open source or proprietary.
This is not a direct answer, but when I last checked (and I find the language very interesting) I didn't find anything ready for real-world use.
The idealist in me always wants to try out new languages; very hot on my list (apart from the aforementioned very cool Clean language) are currently (in random order) Io, Fan and Scala...
But in the meantime I get my pragmatism out and check the TIOBE index. I know you can argue about it, but still: it tells me what I will be able to use a year from now and what I possibly won't be able to use...
No pun intended!
I am using Clean together with the iTasks library to build websites around workflows quite easily.
But I guess another problem with Clean is the lack of documentation and examples: "the Clean book" is from quite a few years back, and a lot of new features don't get documented except in the papers they publish.
The http://clean.cs.ru.nl/Projects page doesn't look promising :) It looks like just another research project with no real-world use to date.
As one of my professors at college was involved in the creation of Clean, it was no shock that he'd created a real-world application. The rostering program of our university was created entirely in Clean.
The Clean IDE and the Clean compiler are written in Clean. (http://wiki.clean.cs.ru.nl/Download_Clean)
Cloogle, a search engine for Clean libraries, syntax, etc. (like Hoogle for Haskell) is written in Clean. Its source is on Radboud University's GitLab instance (web frontend; engine).

Resources