exploring mathematics of/in computer science [closed] - math
It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form.
Closed 13 years ago.
I have been working for two years in the software industry. Some things that have puzzled me are as follows:
There is a lack of application of mathematics in the current software industry.
e.g.: When a mechanical engineer designs an electricity pole, he computes the stress on the foundation using stress analysis techniques (read: mathematical equations) to determine exactly what kind and what grade of steel should be used. But when a software developer deploys a web server application, he just guesses at the estimated load on his server and leaves the rest to luck and God; there is nothing he can use to simulate the problem mathematically and answer it (my observation).
Great software (wind tunnel simulators, etc.) and computing programs (like MATLAB) exist to simulate real-world problems (because they have their mathematical equations), but we in the software industry are still clueless about how many actual resources, in terms of memory, computing power, clock speed, RAM, etc., will be needed when our server-side application is actually deployed. We just keep guessing at the solution and solve such problems by more or less 'hit and trial' (my observation).
Programming is done against APIs, whether in C, C#, or Java. We are never able to exactly check the complexity, and hence the efficiency, of our code, because somewhere we are using an abstraction written by someone else whose source code we either don't have or didn't have the time to check.
e.g. If I write a simple client-server app in C# or Java, I am never able to calculate beforehand what the efficiency and complexity of this code is going to be, or what the minimum is that this whole client-server app will require (my observation).
Load balancing and scalability analysis are just too vague and are merely solved by adding more nodes if requests on the server are increasing (my observation).
Please post answers to any of my above puzzling observations.
Please post relevant references also.
I would be happy if someone proves me wrong and shows the right way.
Thanks in advance
Ashish
I think there are a few reasons for this. One is that in many cases, simply getting the job done is more important than making it perform as well as possible. A lot of software that I write is stuff that will only be run on occasion on small data sets, or stuff where the performance implications are pretty trivial (it's a loop that does a fixed computation on each element, so it's trivially O(n)). For most of this software, it would be silly to spend time analyzing the running time in detail.
Another reason is that software is very easy to change later on. Once you've built a bridge, any fixes can be incredibly expensive, so it's good to be very sure of your design before you do it. In software, unless you've made a horrible architectural choice early on, you can generally find and optimize performance hot spots once you have some more real-world data about how it performs. In order to avoid those horrible architectural choices, you can generally do approximate, back-of-the-envelope calculations (make sure you're not using an O(2^n) algorithm on a large data set, and estimate within a factor of 10 or so how many resources you'll need for the heaviest load you expect). These do require some analysis, but usually it can be pretty quick and off the cuff.
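To make the back-of-the-envelope idea concrete, here is a minimal sketch of that kind of arithmetic in Python; every number in it (request rate, CPU cost, latency, memory per request) is an invented assumption, not a measurement.

```python
# Back-of-the-envelope capacity estimate (illustrative numbers only).
import math

peak_requests_per_sec = 500   # assumed peak load
cpu_ms_per_request = 20       # assumed CPU time per request
mem_kb_per_request = 256      # assumed working memory per in-flight request
avg_latency_ms = 50           # assumed end-to-end latency

# CPU: one core delivers roughly 1000 ms of CPU time per second.
cores_needed = peak_requests_per_sec * cpu_ms_per_request / 1000.0

# Memory: Little's law gives concurrent requests = arrival rate * latency.
concurrent_requests = peak_requests_per_sec * (avg_latency_ms / 1000.0)
mem_mb_needed = concurrent_requests * mem_kb_per_request / 1024.0

print(f"~{math.ceil(cores_needed)} cores, ~{mem_mb_needed:.0f} MB for in-flight requests")
```

The point is not the exact numbers but the habit: a few multiplications tell you whether you are off by a factor of 10 before you deploy anything.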
And then there are cases in which you really, really do need to squeeze the ultimate performance out of a system. In these cases, people frequently do sit down, work out the performance characteristics of the systems they are working with, and do very detailed analyses. See, for instance, Ulrich Drepper's very impressive paper What Every Programmer Should Know About Memory (pdf).
Think about the engineering sciences: they all have very well-defined laws that are applicable to the design and building of physical items, things like gravity, strength of materials, etc. In computer science, by contrast, there are not many well-defined laws to build an application against.
I can think of many different ways to write a simple hello world program that would satisfy the requirement. However, if I have to build an electricity pole, I am severely constrained by the physical world and the requirements of the pole.
Point by point
An electricity pole has to withstand the weather, a load, corrosion etc and these can be quantified and modelled. I can't quantify my website launch success, or how my database will grow.
Premature optimisation? Good enough is exactly that, fix it when needed. If you're a vendor, you've no idea what will be running your code in real life or how it's configured. Again you can't quantify it.
Premature optimisation
See point 1. I can add as needed.
Carrying on... even engineers bollix up. Collapsing bridges, blackouts, car safety recalls, the "wrong kind of snow", etc. etc. Shall we change the question to "why don't engineers use more empirical observations?"
The answer to most of these is that, in order to have the meaningful measurements (and accepted equations, limits, tolerances, etc.) that you have in real-world engineering, you first need a way of measuring what it is that you are looking at.
Most of these things simply can't be measured easily. Software complexity is a classic: what is "complex"? How do you look at source code and decide whether it is complex or not? McCabe's cyclomatic complexity is the closest standard we have for this, but it's still basically just counting branch instructions in methods.
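As a rough illustration of what "counting branch instructions" amounts to, here is a sketch that approximates a McCabe-style count for Python functions using the standard ast module; real cyclomatic-complexity tools count more node types and handle more cases than this.

```python
# Rough McCabe-style complexity: 1 + number of decision points in a function.
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp, ast.comprehension)

def cyclomatic_complexity(source: str) -> dict:
    """Return an approximate complexity score per top-level function."""
    tree = ast.parse(source)
    scores = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

example = """
def classify(x):
    if x < 0:
        return "negative"
    for i in range(x):
        if i % 2 == 0 and i > 2:
            print(i)
    return "done"
"""
print(cyclomatic_complexity(example))  # {'classify': 5} with this rough count
```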
There is little math in software programs because the programs themselves are the equation. It is not possible to figure out the equation before it is actually run. Engineers use simple (and very complex) programs to simulate what happens in the real world. It is very difficult to simulate a simulator. Additionally, many problems in computer science have no known efficient mathematical solution: see the traveling salesman problem.
Much of the mathematics is also built into languages and libraries. If you use a hash table to store data, you know any element can be found in expected constant time, O(1), no matter how many elements are in the hash table. If you store it in a balanced binary tree, lookups take O(log n) time, growing with the number of elements (and degrading to O(n) if the tree is unbalanced).
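A small sketch of that difference, assuming Python for illustration (Python has no built-in balanced tree, so a sorted list with bisect stands in for the O(log n) structure):

```python
# Contrast average O(1) hash lookup with O(log n) lookup in a sorted structure.
import bisect
import timeit

n = 1_000_000
keys = list(range(n))
hash_table = {k: k for k in keys}
sorted_keys = keys  # already sorted

def hash_lookup():
    return 999_999 in hash_table          # expected O(1)

def tree_like_lookup():
    i = bisect.bisect_left(sorted_keys, 999_999)   # O(log n) binary search
    return i < len(sorted_keys) and sorted_keys[i] == 999_999

print("hash lookup:  ", timeit.timeit(hash_lookup, number=100_000))
print("log-n lookup: ", timeit.timeit(tree_like_lookup, number=100_000))
```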
The problem is that software talks to other software, written by humans. The engineering examples you describe deal with physical phenomena, which are constant. If I develop an electrical simulator, everyone in the world can use it. If I develop a protocol X simulator for my server, it will help me, but probably won't be worth the work.
No one can design a system from scratch and people that write semi-common libraries generally have plenty of enhancements and extensions to work on rather than writing a simulator for their library.
If you want a network traffic simulator you can find one, but it will tell you little about your server load because the traffic won't be using the protocol your server understands. Every server is going to see completely different sets of traffic.
There is a lack of application of mathematics in the current software industry.
e.g.: When a mechanical engineer designs an electricity pole, he computes the stress on the foundation using stress analysis techniques (read: mathematical equations) to determine exactly what kind and what grade of steel should be used. But when a software developer deploys a web server application, he just guesses at the estimated load on his server and leaves the rest to luck and God; there is nothing he can use to simulate the problem mathematically and answer it (my observation).
I wouldn't say that luck or god are always the basis for load estimation. Often realistic data can be had.
It's also not true that there are no mathematical techniques to answer the question. Operations research and queuing theory can be applied to good advantage.
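For instance, the classic M/M/1 queueing formulas give a first-cut load estimate from just an arrival rate and a service rate; the rates below are invented for illustration.

```python
# M/M/1 queue: Poisson arrivals at rate lam, exponential service at rate mu.
# Standard results: utilization rho = lam/mu, mean time in system W = 1/(mu - lam).
lam = 80.0   # assumed arrivals per second
mu = 100.0   # assumed requests a single server can process per second

rho = lam / mu           # server utilization
W = 1.0 / (mu - lam)     # mean response time in seconds
L = lam * W              # mean number of requests in the system (Little's law)

print(f"utilization={rho:.0%}, mean response={W*1000:.1f} ms, avg in system={L:.1f}")
# Note how W blows up as lam approaches mu: the reason "just guessing"
# the load tends to go badly wrong near saturation.
```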
The real problem is that mechanical engineering is based on laws of physics and a foundation of thousands of years worth of empirical and scientific investigation. Computer science is only as old as me. Computer science will be much further along by the time your children and grandchildren apply the best practices of their day.
An MIT EE grad would not have this problem ;)
My thoughts:
Some people do actually apply math to estimate server load. The equations are very complex for many applications, and many people resort to rules of thumb, guess-and-adjust, or similar strategies. Some applications (real-time applications with a high penalty for failure... weapons systems, power plant control applications, avionics) carefully compute the required resources and ensure that they will be available at runtime (see the schedulability sketch after this list).
Same as 1.
Engineers also use components provided by others, with a published interface. Think of electrical engineering. You don't usually care about the internals of a transistor, just its interface and operating specifications. If you wanted to examine every component you use in all of its complexity, you would be limited to what one single person can accomplish.
I have written fairly complex algorithms that determine what to scale, and when, based on various factors such as memory consumption, CPU load, and IO. However, the most efficient solution is sometimes to measure and adjust. This is especially true if the application is complex and evolves over time. The effort invested in modeling the application mathematically (and updating that model over time) may be more than the cost of the efficiency lost to try-and-correct approaches. Eventually, I could envision that a better understanding of the correlation between code and the environment it executes in could lead to systems that predict resource usage ahead of time. Since we don't have that today, many organizations load test code under a wide range of conditions to empirically gather that information.
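As a concrete (if simplified) example of how hard-real-time systems "carefully compute the required resources", as mentioned in point 1, here is the Liu and Layland utilization-bound test for rate-monotonic scheduling; the task execution times and periods are made up.

```python
# Rate-monotonic schedulability check (Liu & Layland utilization bound).
# Each task runs for C time units every T time units.
tasks = [  # (worst-case execution time, period) in milliseconds -- illustrative
    (1.0, 10.0),
    (2.0, 25.0),
    (5.0, 50.0),
]

n = len(tasks)
utilization = sum(c / t for c, t in tasks)
bound = n * (2 ** (1.0 / n) - 1)     # about 0.7798 for n = 3

print(f"total utilization = {utilization:.3f}, bound = {bound:.3f}")
if utilization <= bound:
    print("Guaranteed schedulable under rate-monotonic scheduling.")
else:
    print("Bound exceeded: an exact response-time analysis is needed.")
```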
Software engineering is very different from the typical fields of engineering. Where "normal" engineering is bound to the context of our physical universe and the laws in it we've identified, there's no such boundary in the software world.
Producing software is usually an attempt to mirror a subset of the real-life world into a virtual reality. Here we define the laws ourselves, by picking only the ones we need and by making them just as complex as we need. Because of this fundamental difference, you need to look at problem-solving from a different perspective. We try to make abstractions to make complex parts less complex, just like we teach kids that yellow + blue = green, when it's really the wavelength of the light that bounces off the paper that changes.
Once in a while we are bound by different laws, though. Things like Big-O, test coverage, complexity measurements and UI measurements are all models of mathematical laws. If you look into digital signal processing, real-time programming and functional programming, you'll often find that the programmers use equations to figure out a way to do what they want. But these techniques aren't really (to some extent) useful for creating a virtual domain that can solve complex logic, branch, and interact with a user.
The reasons why wind tunnels, simulations, etc.. are needed in the engineering world is that it's much cheaper to build a scaled down prototype, than to build the full thing and then test it. Also, a failed test on a full scale bridge is destructive - you have to build a new one for each test.
In software, once you have a prototype that passes the requirements, you have the full-blown solution; there is no need to build the full-scale version. You should be running load simulations against your server apps before going live with them, but since loads are variable and often unpredictable, you're better off building the app to be able to scale to any size by adding more hardware than targeting a certain load. Bridge builders have a given target load they need to handle. If they had a predicted usage of 10 cars at any given time, and then a year later the bridge's popularity soared to 1,000,000 cars per day, nobody would be surprised if it failed. But with web applications, that's the kind of scaling that has to happen.
1) Most business logic is usually broken down into decision trees. This is the "equation" that should be proved with unit tests. If you put in x then you should get y; I don't see any issue there.
2,3) Profiling can provide some insight as to where performance issues lie. For the most part you can't say that software will take x cycles, because that will change over time (i.e. the database becomes larger, the OS starts going funky, etc.). Bridges, for instance, require constant maintenance; you can't slap one up and expect it to last 50 years without spending time and money on it. Using libraries is like not trying to figure out pi every time you want to find the circumference of a circle. It has already been proven (and is cost effective), so there is no need to reinvent the wheel.
4) For the most part, web applications scale well horizontally (multiple machines). Vertical scaling (multithreading/multiprocessing) tends to be much more complex. Adding machines is usually relatively easy and cost effective, and it avoids some bottlenecks that are hit rather easily (disk I/O). Also, load balancing can eliminate the possibility of one machine being a central point of failure.
It isn't exactly rocket science, as you never know how many consumers will come to the serving line. Generally it is better to have too much capacity than to have errors, pissed-off customers, and someone (generally your boss) chewing your hide out.
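A toy sketch of the horizontal-scaling arithmetic and round-robin dispatch described in point 4; the peak load, per-node capacity and headroom factor are assumed numbers, not recommendations.

```python
# Toy horizontal-scaling sketch: estimate node count, then dispatch round-robin.
import itertools
import math

peak_rps = 12_000          # assumed peak requests per second
per_node_rps = 1_500       # assumed capacity of one node
headroom = 1.5             # keep spare capacity for spikes and node failure

nodes_needed = math.ceil(peak_rps * headroom / per_node_rps)
nodes = [f"app-{i:02d}" for i in range(nodes_needed)]
print(f"{nodes_needed} nodes:", nodes)

# Round-robin dispatch: no single node is the one machine everything depends on.
rr = itertools.cycle(nodes)
for request_id in range(5):
    print(f"request {request_id} -> {next(rr)}")
```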
Related
Grid exploration robots with no vision and completely blind
I am doing research on how grid-exploration robots move. Most articles are about robots with no vision but with sensors that see the state of all surrounding cells; my robot, however, really has no vision and can only sense a cell after it has explored it or failed to explore it (imagine the sensors are four push buttons that get pressed when it hits an obstacle). I was hoping someone could give me some clues on how to find papers about this kind of robot. I am really having a hard time figuring out how to find related information.
If you have a priori knowledge of the environment, but imperfect ability to sense your location in the environment (and probably imperfect actions), then what you're describing may be a partially observable Markov decision process (POMDP). Your four-button sensor fits nicely into that idea. If you don't even have prior knowledge of the environment, then you need to augment the POMDP notion with elements of exploration or machine learning. Note that POMDPs are geared toward scenarios with "rewards", and that while POMDPs are pretty well understood, efficient solutions are still a topic of research. As are ML/POMDP hybrids.
Environmental exploration and traversal with just push-button sensors for obstacle awareness, and without a priori knowledge, sounds to me like a simultaneous localization and mapping (SLAM) problem. If I recall correctly, Roombas use this, so there should be a fair bit of study done on it in that regard.
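For intuition only, here is a toy sketch of the bump-sensor setting: a depth-first explorer that learns a cell is blocked only by trying to move into it. The grid, the sensing model and the assumption that the robot can retrace its steps are all invented for illustration; this is not a SLAM or POMDP solver.

```python
# Toy blind explorer: the robot only learns a cell is blocked by bumping into it.
GRID = [            # 1 = obstacle, 0 = free; unknown to the robot in advance
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]
ROWS, COLS = len(GRID), len(GRID[0])
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def try_move(cell):
    """Simulated bump sensor: True if the robot can enter the cell."""
    r, c = cell
    return 0 <= r < ROWS and 0 <= c < COLS and GRID[r][c] == 0

def explore(start=(0, 0)):
    """Depth-first exploration using only bump feedback."""
    visited, blocked = {start}, set()
    stack = [start]
    while stack:
        r, c = stack[-1]
        for dr, dc in MOVES:
            nxt = (r + dr, c + dc)
            if nxt in visited or nxt in blocked:
                continue
            if try_move(nxt):           # physically attempt the move
                visited.add(nxt)
                stack.append(nxt)
                break
            blocked.add(nxt)            # the bump marks an obstacle or wall
        else:
            stack.pop()                 # dead end: backtrack
    return visited, blocked

free, walls = explore()
print(f"explored {len(free)} free cells, bumped into {len(walls)} blocked ones")
```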
Tuning Mathematical Parallel Codes
Assuming that I am interested in performance rather than portability of my linear algebra iterative multi-threaded solver, and that I have the results of profiling my code in hand, how do I go about tuning my code to run optimally on the machine of my choice? The algorithm involves matrix-vector multiplications, norms and dot products (FWIW, I am working on CG and GMRES). I am working on codes whose matrix size is roughly equivalent to the full size of the RAM (~6 GB). I'll be working on an Intel i3 laptop and linking my codes against Intel MKL.
Specifically: Is there a good resource (PDF/book/paper) for learning manual tuning? There are numerous things that I learnt by doing, for instance that manual unrolling isn't always optimal, or things about compiler flags, but I would prefer a centralized resource. I need something to translate profiler information into improved performance. For instance, my profiler tells me that the stacks of one processor are being accessed by another, or that my mulpd ASM is taking too much time. I have no clue what these mean and how I could use this information to improve my code. My intention is to spend as much time as needed to squeeze out as much compute power as possible. It's more of a learning experience than for actual use or distribution as of now. (I am concerned with manual tuning, not auto-tuning.)
Misc details: This differs from usual performance tuning since the major portions of the code are linked to Intel's proprietary MKL library. Because of memory bandwidth issues in O(N^2) matrix-vector multiplications and dependencies, there is a limit to what I could manage on my own through simple observation. I write in C and Fortran; I have tried both and, as discussed a million times on SO, I found no difference in either if I tweak them appropriately.
Gosh, this still has no answers. After you've read this you'll still have no useful answers... You imply that you've already done all the obvious and generic things to make your codes fast. Specifically you have:
- chosen the fastest algorithm for your problem (either that, or your problem is to optimise the implementation of an algorithm rather than to optimise the finding of a solution to a problem);
- worked your compiler like a dog to squeeze out the last drop of execution speed;
- linked in the best libraries you can find which are any use at all (and tested to ensure that they do in fact improve the performance of your program);
- hand-crafted your memory access to optimise r/w performance;
- done all the obvious little tricks that we all do (eg when comparing the norms of 2 vectors you don't need to take a square root to determine that one is 'larger' than another, ...);
- hammered the parallel scalability of your program to within a gnat's whisker of the S==P line on your performance graphs;
- always executed your program on the right size of job, for a given number of processors, to maximise some measure of performance;
and still you are not satisfied!
Now, unfortunately, you are close to the bleeding edge and the information you seek is not to be found easily in books or on web-sites. Not even here on SO. Part of the reason for this is that you are now engaged in optimising your code on your platform and you are in the best position to diagnose problems and to fix them. But these problems are likely to be very local indeed; you might conclude that no-one else outside your immediate research group would be interested in what you do; I know you wouldn't be interested in any of the micro-optimisations I do on my code on my platform.
The second reason is that you have stepped into an area that is still an active research front and the useful lessons (if any) are published in the academic literature. For that you need access to a good research library; if you don't have one nearby then both the ACM and IEEE-CS Digital Libraries are good places to start. (Post or comment if you don't know what these are.)
In your position I'd be looking at journals on 2 topics: peta- and exa-scale computing for science and engineering, and compiler developments. I trust that the former is obvious, the latter may be less obvious: but if your compiler already did all the (useful) cutting-edge optimisations you wouldn't be asking this question, and compiler-writers are working hard so that your successors won't have to. You're probably looking for optimisations which, like, say, loop unrolling, were relatively difficult to find implemented in compilers 25 years ago and which were therefore bleeding-edge back then, and which themselves will be old and established in another 25 years.
EDIT: First, let me make explicit something that was originally only implicit in my 'answer': I am not prepared to spend long enough on SO to guide you through even a summary of the knowledge I have gained in 25+ years in scientific/engineering and high-performance computing. I am not given to writing books, but many are and Amazon will help you find them. This answer was way longer than most I care to post before I added this bit. Now, to pick up on the points in your comment: on 'hand-crafted memory access', start at the Wikipedia article on 'loop tiling' (see, you can't even rely on me to paste the URL here) and read out from there; you should be able to quickly pick up the terms you can use in further searches.
On 'working your compiler like a dog' I do indeed mean becoming familiar with its documentation and gaining a detailed understanding of the intentions and realities of the various options; ultimately you will have to do a lot of testing of compiler options to determine which are 'best' for your code on your platform(s).
On 'micro-optimisations', well here's a start: Performance Optimization of Numerically Intensive Codes. Don't run away with the idea that you will learn all (or even much) of what you want to learn from this book. It's now about 10 years old. The take away messages are:
- performance optimisation requires intimacy with machine architecture;
- performance optimisation is made up of 1001 individual steps and it's generally impossible to predict which ones will be most useful (and which ones actually harmful) without detailed understanding of a program and its run-time environment;
- performance optimisation is a participation sport, you can't learn it without doing it;
- performance optimisation requires obsessive attention to detail and good record-keeping.
Oh, and never write a clever piece of optimisation that you can't easily un-write when the next compiler release implements a better approach. I spend a fair amount of time removing clever tricks from 20-year old Fortran that was justified (if at all) on the grounds of boosting execution performance but which now just confuses the programmer (it annoys the hell out of me too) and gets in the way of the compiler doing its job.
Finally, one piece of wisdom I am prepared to share: these days I do very little optimisation that is not under one of the items in my first list above; I find that the cost/benefit ratio of micro-optimisations is unfavourable to my employers.
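One of the "obvious little tricks" mentioned above, comparing vector norms without taking square roots, looks like this; it is shown in Python/NumPy for brevity, though the same idea applies to the C or Fortran code in question.

```python
# Comparing norms without the square root: since sqrt is monotonic,
# ||a|| < ||b||  exactly when  a.a < b.b. Skipping sqrt saves work in tight loops.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(1_000_000)
b = rng.standard_normal(1_000_000)

with_sqrt = np.linalg.norm(a) < np.linalg.norm(b)   # two square roots
without_sqrt = a.dot(a) < b.dot(b)                  # no square roots

print(with_sqrt, without_sqrt)   # same answer either way
```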
R Quality Assurance Techniques
Could you provide some insight into the techniques that you use to ensure the quality of your solutions. For example, sometimes, I like to test my result using stopifnot() to ensure I'm not receiving ridiculous results. Are there any other techniques or functions that you use in data processing to ensure that you're receiving the solution you meant to? Note: I realize that this is a broad question and perhaps a candidate for community wiki or even closure, but rather than voting to close, perhaps assist me by adding comments to direct the conversation.
Just a few things that come to mind (in random order):
- This page has a very interesting link for debugging in R (OK, this is during production, but still related to your issue I think).
- You can use exceptions, as explained in this discussion (and the links therein).
- You can write tests with known results (both for success and failure) and see that they actually do what they are supposed to do. Be sure to pass some weird data to the functions and see how they behave in a "not-so-normal" situation.
- Don't just rely on automated tests: give your functions to a fairly computer-illiterate person at work (not so much that he/she can't use R, though!) and let him/her do some beta tests. You'll be amazed at the quantity of errors he/she will come up with!!! :)
Quality in software engineering is quite a massive area, and most of it applies to code written in R as much as code written in Cobol or C#, so my first answer would be 'it depends'. For me, I come from the pharmaceutical industry, where what we do is regulated by government agencies like the FDA and the MHRA. For us, quality is something we think about throughout the process, so I would list the following as visible artifacts of quality:
- We have a software development process that's written down and repeatable (traditionally in this kind of industry this is a waterfall style, but more and more agile / prototyping style methodologies are being used).
- We have a system that ensures every person involved knows what they should be doing (job descriptions) and is suitably qualified to do that job (training).
- We start by defining what is required in some way, hopefully in some way that can be tested.
- We have some way of documenting our development process, where we've been and how (a combination of good documentation and source control).
- We do testing wherever possible, and as early as possible (so, automated if possible).
- We have people who are responsible for overseeing quality, who are separate from the people doing the work, to prevent conflicts.
- We control the software environment that is used for development, testing and production (read: change control).
- We control and manage software once it is in use, tracking issues and managing them (issue tracking).
- We keep records, so that even if every person involved went under a bus / won the lottery, the new people could still defend and prove everything above to a government inspector.
However, that's a big list, and I imagine there are lots of industries that don't do all of them (finance, education) and probably some who do more (building nuclear reactors, saving lives, NASA). More specifically to what I assume you're getting at: before you code, you should be able to define some specific starting inputs and the answers you should get out, and I recommend you use something like RUnit or testthat to build these in.
Evolutionary vs throwaway prototyping [closed]
Closed. This question is opinion-based. It is not currently accepting answers. Closed 6 years ago. Who is winning in the "low vs high fidelity prototyping" debate? Should prototype zero (P0) be the first version of the final product? Or should P0 always be a throwaway? What approach is the industry favoring? Excellent article from Wikipedia: Software prototyping
A prototype should always be a throwaway - a prototype is used to quickly prove a concept and influence the design of the real product. As such, a lot of things which are important for a real product (a thought-out architecture and design, reliability, security, maintainability, etc.) fall by the wayside. If you do take these things into account when building your prototype, you're not really building a prototype anymore. My experience with prototypes where the code directly evolved into an actual product shows that the end result suffers because of it - the lack of a real architecture resulted in a lot of cobbled-together code that had to be constantly hacked to add new features. I've even seen a case where the original technology chosen for rapid development of the prototype was not the best choice for the actual product, and a complete rewrite was necessary for V2.
I think we, the pedants, have lost this particular battle -- alleged "prototypes" (which by definition should be rewritten from scratch!!!-) are in fact being "evolved" into (often half-baked "betas"), etc. Even today, I've applauded at the smart attempt by a colleague of mine to recapture the concept, even if the term is a lost battle: he's setting up a way for proofs of concept small projects to be developed (and, if the concept does get proven, transferred to software engineers for real prototyping, then development). The idea is that, in our department, we have many people who aren't (and aren't in fact supposed to be!-) software developers, but are very smart, computer savvy, and in daily contact with the reality "in the trenches" -- they are the ones who are most likely to smell an opportunity for some potential innovation which could have real impact once implemented as a "production-ready" software project. Salespeople, account managers, business analysts, technology managers -- at our company, they all often fit this description. But they're NOT going to program in C++, hardly at all in Java, maybe in Python but miles away from "productionized" -- indeed they're far more likely to whip up a smart proof of concept in php, javascript, perl, bash, Excel+VBA, and sundry other "quick and dirty" technologies we don't even want to dream about productionizing and supporting forevermore!-) So by calling their prototypes "proofs of concept", we hope to encourage them to embody their daring concepts in concrete form (vague natural-language blabberings and much waving of hands being least useful, and alien to the company's culture anyway;-) and yet sharply indicate that such projects, if promoted to exist among the software engineers' goals and priorities, DO have to be programmed from scratch -- the proof-of-concept serves, at best, as a good draft/sketch spec for what the engineers are aiming for, definitely NOT to be incrementally enriched, but redone from the root up!-). It's early to say how well this idea works -- ask me in three months, when we evaluate the quarter's endeavors (right now, we're just providing a blueprint for them, hot on the heels of evaluating last quarter's department- and company-wise undertakings!-).
Write the prototype, then keep refactoring it until it becomes the product. The key is to not hesitate to refactor when necessary. It helps to have few people working on it initially. With too many people working on something, refactoring becomes more difficult.
Response from BUNDALLAH, HAMISI
A prototype typically simulates only a few aspects of the features of the eventual program, and may be completely different from the eventual implementation. Contrary to what my other colleagues have suggested above, I would NOT advise my boss to opt for the throwaway prototype model. I am with Anita on this. Given the two prototype models and the circumstances provided, I would strongly advise the management (my boss) to opt for the evolutionary prototype model. The company being large, with all the other variables given such as the complexity of the code and the newness of the programming language to be used, I would not use the throwaway prototype model. The throwaway prototype becomes the starting point from which users can re-examine their expectations and clarify their requirements. When this has been achieved, the prototype is 'thrown away', and the system is formally developed based on the identified requirements (Crinnion, 1991). But in this situation, the users may not know all the requirements at once due to the complexity of the factors given. Evolutionary prototyping is the process of developing a computer system by a process of gradual refinement. Each refinement of the system contains a system specification and software development phase. In contrast to both the traditional waterfall approach and incremental prototyping, which require everyone to get everything right the first time, this approach allows participants to reflect on lessons learned from the previous cycle(s). It is usual to go through three such cycles of gradual refinement. However, there is nothing stopping a process of continual evolution, which is often the case in many systems. According to Davis (1992), evolutionary prototyping acknowledges that we do not understand all the requirements (as we have been told above that the system is complex, the company is large, the code will be complex, and the language is fairly new to the programming team). The main goal when using evolutionary prototyping is to build a very robust prototype in a structured manner and constantly refine it. The reason for this is that the evolutionary prototype, when built, forms the heart of the new system, and the improvements and further requirements will be built on it. This technique allows the development team to add features, or make changes, that couldn't be conceived during the requirements and design phase. For a system to be useful, it must evolve through use in its intended operational environment. A product is never "done"; it is always maturing as the usage environment changes. Developers often try to define a system using their most familiar frame of reference - where they are currently (or rather, the current system status). They make assumptions about the way business will be conducted and the technology base on which the business will be implemented. A plan is enacted to develop the capability, and, sooner or later, something resembling the envisioned system is delivered (SPC, 1997). Evolutionary prototypes have an advantage over throwaway prototypes in that they are functional systems. Although they may not have all the features the users have planned, they may be used on an interim basis until the final system is delivered. In evolutionary prototyping, developers can focus on developing the parts of the system that they understand instead of working on developing a whole system.
To minimize risk, the developer does not implement poorly understood features. The partial system is sent to customer sites. As users work with the system, they detect opportunities for new features and give requests for these features to developers. Developers then take these enhancement requests along with their own and use sound configuration-management practices to change the software-requirements specification, update the design, recode and retest (Bersoff and Davis, 1991). However, the main problems with evolutionary prototyping are due to poor management: lack of defined milestones, lack of achievement (always putting off what would be in the present prototype until the next one), lack of proper evaluation, lack of clarity between a prototype and an implemented system, and lack of continued commitment from users. This process requires a greater degree of sustained commitment from users for a longer time span than traditionally required. Users must be constantly informed as to what is going on and be completely aware of the expectations of the 'prototypes'.
References:
Bersoff, E., Davis, A. (1991). Impacts of Life Cycle Models of Software Configuration Management. Comm. ACM.
Crinnion, J. (1991). Evolutionary Systems Development, a practical guide to the use of prototyping within a structured systems methodology. Plenum Press, New York.
Davis, A. (1992). Operational Prototyping: A New Development Approach. IEEE Software.
Software Productivity Consortium (SPC). (1997). Evolutionary Rapid Development. SPC document SPC-97057-CMC, version 01.00.04.
Is mathematics necessary for programming? [closed]
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. Closed 11 years ago. Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions. I happened to debate with a friend during college days whether advanced mathematics is necessary for any veteran programmer. He used to argue fiercely against that. He said that programmers need only basic mathematical knowledge from high school or freshman-year college math, no more, no less, and that almost all programming tasks can be achieved without even needing advanced math. He argued, however, that algorithms are a fundamental, must-have asset for programmers. My stance was that all computer science advances depended almost solely on advances in mathematics, and therefore a thorough knowledge of mathematics would help programmers greatly when they're working with real-world challenging problems. I still cannot settle on which side of the argument is correct. Could you tell us your stance, from your own experience?
To answer your question as it was posed I would have to say, "No, mathematics is not necessary for programming". However, as other people have suggested in this thread, I believe there is a correlation between understanding mathematics and being able to "think algorithmically". That is, to be able to think abstractly about quantity, processes, relationships and proof. I started programming when I was about 9 years old and it would be a stretch to say I had learnt much mathematics by that stage. However, with a bit of effort I was able to understand variables, for loops, goto statements (forgive me, I was using VIC-20 BASIC and I hadn't read any Dijkstra yet) and basic co-ordinate geometry to put graphics on the screen. I eventually went on to complete an honours degree in Pure Mathematics with a minor in Computer Science. Although I focused mainly on analysis, I also studied quite a bit of discrete maths, number theory, logic and computability theory. Apart from being able to apply a few ideas from statistics, probability theory, vector analysis and linear algebra to programming, there was little maths I studied that was directly applicable to my programming during my undergraduate degree and the commercial and research programming I did afterwards. However, I strongly believe the formal methods of thinking that mathematics demands - careful reasoning, searching for counter-examples, building axiomatic foundations, spotting connections between concepts - have been a tremendous help when I have tackled large and complex programming projects. Consider the way athletes train for their sport. For example, footballers no doubt spend much of their training time on basic football skills. However, to improve their general fitness they might also spend time at the gym on bicycle or rowing machines, doing weights, etc. Studying mathematics can be likened to weight-training or cross-training to improve your mental strength and stamina for programming. It is absolutely essential that you practice your basic programming skills, but studying mathematics is an incredible mental work-out that improves your core analytic ability.
While advanced mathematics may not be required for programming (unless you are programming advanced mathematics capability) the thought process of programming and mathematics are very similar. You begin with a base of known things (axioms, previously proven theories) and try to get to someplace new. You cannot skip steps. If you do skip steps, then you are required to fill in the blanks. It's a critical thought process that makes the two incredibly similar. Also, mathematicians and programmers both think critically in the abstract. Real world things are represented by objects and variables. The ability to translate from concrete to abstract also links the two fields. There's a very good chance that if you're good at one, you will probably be good at the other.
Computer science != programming. OK, seriously, I know good and bad programmers who were English and Psychology majors, and some that were Computer Science majors. Some very famous guys that I admire as developers didn't have a CS background. Larry Wall (Perl), for example, was a linguist. On the other hand, it helps to know something about the domain you are working in, because then you can at least see if your data makes sense and help your customers/users drill down to what they really want. And yes, there's the issue of computational complexity, efficient data structures and program correctness. That's stuff you learn in Computer Science and it's useful to know in almost any domain, but it's neither necessary nor sufficient.
I guess I am going to be the first person to say you do need math. As others have said, math is not all that important for certain aspects of development, but the fundamentals of critical thinking and structured analysis are very important. More so, math is important in understanding a lot of the fundamentals that go into things like schedulers, optimizations, sorting, protocol management, and a number of other aspects of computers. Though the math involved at a calculation level is not complex (it's mostly high-school algebra), the theories and applications can be quite complex, and a solid understanding of math through calculus will be of great benefit. Can you get by without it? Absolutely, and you shouldn't let a less-than-thorough knowledge of math hold you back, but if you had the chance, or the inclination, I would study as much math as you could: calculus, number theory, linear algebra, combinatorics, practical applications, all of it has both practical and theoretical applications in a wide range of computer science. I have known people who were highly successful on both sides of the fence (those without a strong focus on math, and those who went to school for physics or math), but in both groups they enjoyed numerical problems and learning about algorithms and math theory.
I have a maths degree, but I can't remember requiring that maths a single time in my career. It was useful in terms of training my mind for logical thinking, but I've not written any code using fluid dynamics, quantum theory or Markov Chains. (The last is the most likely to come up, I suspect.) Most line-of-business developers won't need advanced maths most of the time. Sometimes knowing trigonometry can help, and certainly being able to understand enough maths to implement algorithms described mathematically can be important - but beyond that? Nah. Don't forget that most programmers aren't advancing computer science - they're building applications. I don't need to know advanced engineering to drive a modern car, even though that car has almost certainly been improved through advanced engineering.
I would argue that having advanced logic (discrete) math can really help. That along with set theory. When dealing with common computer programs, these disciplines can help a lot. However, a lot of the other math I took in university was calculus, which as far as I can see, had very limited usage. Since 90% (or something like that) of programming is doing business apps with very simple math, I would say that for the most part, you can get by with very little math knowledge. However, a good understanding of boolean algebra, logic, discrete math, and set theory can really put you up to that next level.
I'll go against the grain here and say "Yes". I switched from Civil Engineering to programming (concrete sucks!). My math background consists of the usual first-year stuff, second- and third-year calculus (diff EQs, volume integrations, series, Fourier and Laplace transforms) and a numerical analysis course. I find that my math is incredibly lacking for computer programming. There are entire areas of discrete math and logic that I am missing, and I only survive due to an extensive library of textbooks, Wikipedia and Wolfram. Most advanced algorithms are based on advanced math, and I am unable to develop advanced algorithms without doing extensive research (essentially the equivalent of half a course worth of work). I am certainly unable to come up with NEW algorithms, as I just don't have the mathematical foundations as the shoulders of giants upon which to stand.
It depends on what you are doing. If you do a lot of 3D programming, knowledge of 3D geometry is certainly necessary, don't you agree? ;-) If you want to create a new image format like JPG or a new audio format like MP3, you are also pretty lost if you can't understand a cosine or Fourier transform, as these are the basis most lossy compression is built on. Many other problems can be solved better if you know your math rather well. There are also many other programming tasks you will find do not need much math.
If you find the subject fascinating enough to post this, just go ahead and start learning. The rest will come naturally.
Yeah, there is no need for advanced mathematics - if you are programming commercial, off-the-shelf software. However, when dealing with hardcore stuff such as:
- calculating trajectories to control a robot
- creating AI-like applications to support uncertainty and automatic reasoning
- playing with 3D motion and graphics
some advanced mathematics knowledge might come in handy. And it's not like they are "out-of-this-world" problems. I had to create software to try to "predict" the necessary amount of paper for an office (and it was hell just to find out the best way to approximate values). You have to be careful, though, because it is easy to get lost when using advanced things - there is a friend of mine who resorted to using Turing to store the state of a dynamic menu just to display it correctly - hmm... perhaps he went too far in his imagination.
What type of programming? In my commercial experience, I have needed no advanced mathematics, but this is heavily dependent on the field you are in. Computer graphics requires a large amount of advanced mathematics. A lot of academic computer programming requires advanced mathematics. That said, there tends to be a correlation between people who are good at mathematics and people who are good at programming. I hope this wishy-washy answer helps.
Mathematics is needed for developers in some fields but is almost useless in others. If you are a game developer and have to work with physics a lot, understanding of math is crucial. If you are working with advanced visual controls, you could not do much without geometry. If you're planning to do some financial calculations, it would REALLY help to have a solid knowledge of statistics. On the other hand, over the last 5 years I had only 2 or 3 projects where ANY amount of math was required at all. Of these there was only 1 occurrence when a Google search did not help. At the end of the day, even financial calculations are very often something your clients do for you and give you formulas to implement. So if you're in the 'applied software' business you are likely to never use your math degree. If you're in academic software, maths is crucial.
I agree with Chris. I would say "Yes", also. But this depends on your market as stated above. If you are simply creating some basic "off-the-shelf" applications or writing tools to help your everyday work...then math isn't nearly as important. Engineering custom software solutions requires lots of problem solving and critical thinking. Skills that are most definitely enhanced when a mathematics background is present. I minored in Math with my Computer Engineering degree and I give credit to all of my math-oriented background as to why I'm where I am today. That's my 2 cents, I can tell from reading above that many would not agree. I encourage all to consider that I'm not saying you can't have those skills without a math background, I'm simply stating that the skills are side-effects of having such a background and can impact software positively.
In my experience math is required in programming; you can't get away from it. The whole of programming is based on math. The issue is not black and white, but more colorful. The question is not whether or not you need math, but how much. The higher levels of math will give you more tools and open up your mind to different paths of thought. For example, you can program if you only know addition and subtraction. When multiplication is required, you will have to perform a lot of additions. Multiplication simplifies repetitive addition. Algebra allows one to simplify math before implementing it in programs. Linear algebra provides tools for transforming images. Boolean algebra provides mechanics for reducing all those if statements. And don't forget the siblings of mathematics, logic and philosophy. Logic will help you make efficient use of case or switch statements. Philosophy will help you understand the thinking of the guy who wrote that code you are modifying. Yes, you don't need much math to write programs. Some programs may require more math than others. More math knowledge will give you an advantage over those with a lesser understanding. In these times, people need every advantage they can get to obtain those jobs.
I've been programming for 8 years professionally, and since I was 12 as a hobby. Math is not necessary, logic is. Math is horribly helpful though; to say it's not necessary is like saying that to kill a man, a gun isn't necessary, you can use a knife. Well, it is true, but that gun makes it a lot easier. There are a couple of bare minimums, which you should already meet. You need to know basic algebraic expressions and notation, and the common computer equivalents. For example, you need to know what an exponential is (3 to the 3rd is 27), and the common computer expression is 3^3. The common notations for algebra do change between languages, but many of them use a somewhat unified methodology. Others (looking at you, LISP) don't. You also need to know order of operations. You need to understand algorithmic thought. First this, then this, produces this, which is used in this calculation. Chances are you understand this or you don't, and it's a fairly hard hurdle to jump if you don't understand it; I've found that this is something you 'get', and not really something you can learn. Conversely, some people don't 'get' art. They should not become painters. Also, there have been students in CS curricula who cannot figure out why this does not work: x = z + w; z = 3; y = 5; It's not that they don't understand addition, it's that they aren't grasping the requirement of unambiguous expression. If they understand it, the computer should too, right? If you can't see what's wrong with the above three lines, then don't become a programmer. Lastly, you need to know whatever math is under your domain of programming. Accounting software could stop at basic algebra. If you are programming physics, you'll need to know physics (loosely) and math in 3-dimensional geometry (Euclidean). If you're programming architecture software, you'll need to know trigonometry. This goes farther than math, though; whatever domain you are programming for, you need to soundly understand the basics. If you are programming language analysis software, you'll need to know probability, statistics, grammar theory (multiple languages), etc. Oftentimes, certain domains need, or can benefit from, knowledge you'd think is unrelated. For example, if you were programming audio software, you actually need to know trigonometry to deal with waveforms. Magnitude changes things also. If you are sorting a financial data set of 1000 items, it's no big thing. If it was 10 million records, however, you would benefit greatly from knowing vector math and having a deep understanding of sorting at the binary level (how does a system sort alphabetically? How does it know 'a' is less than 'b'?). You are going to find that as a programmer, your general knowledge base is going to explode, because each project will necessitate more learning outside of the direct sphere of programming. If you are squeamish or lazy about self-learning, and do not like the idea of spending 10+ hours a week doing essentially 'homework', do not become a programmer. If you like thought exercises, if you like learning, if you can think about abstract things like math without a calculator or design without a sketchpad, if you have broad tastes in life and hobbies, if you are self-critical and can throw away 'favorited' ideas, if you like perfecting things, then become a programmer. Do not base this decision on math, but rather on the ability to think logically and learn. Those are what is important; math is just the by-product.
Of course it depends on what kind of programmer you want to be, or better, what kind of programmer your employers want you to be. I think calculus and algebra are essential; statistics and linear programming are good tools to have in your briefcase; maybe analysis (derivatives, integrals, functions...) could be done without. But if you want to know how things work skin-deep (electronics, for example, or some non-trivial algorithms), "advanced" math is something you'd better not go anywhere without.
Most of the programming I have done involved physics simulations for research including things like electromagnetism, quantum mechanics and structural mechanics. Since the problem domains have advanced mathematics associated with them I would be hard pressed to solve them without using advanced mathematics. So the answer to your question is - it depends on what you are trying to do.
Advanced maths knowledge is vital if you're going to be writing a new programming language, or if you need to write your own algorithms. However, for most day-to-day programming - from websites to insurance processing applications - only basic maths is necessary.
Someone with a solid mathematical (which is not merely arithmetic) or logic background will cope well with algorithms, variable use, conditional reasoning and data structures. Not everyone can design a UI. Not everyone can make efficient code. Not everyone can comment and document clearly. Not everyone can devise a good algorithm. Mathematics will help you to a point, but only to a point.
I don't think advanced mathematics knowledge is a requirement for a good programmer, but based on personal experience I think that programmers who have a better grasp of advanced maths also make better programmers. This may simply be due to a more logical mind, or a more logical outlook due to their experience of solving mathematical problems.
The fundamental concern of maths is the devising, understanding, implementation, and use of algorithms. If you cannot do maths, it is because you cannot do these things, and if you cannot do these things then you cannot be an effective programmer. Common programming tasks might not need any specific mathematical knowledge (e.g. you probably won't need vector algebra and calculus unless you're doing tasks like 3D graphics or physics simulations), but the underlying skill sets are identical, and a lack of ability in one domain will be matched by a corresponding lack of ability in the other domain.
Math is a toolbox for creating programs. I recommend Cormen's Introduction to Algorithms. It touches on the more "mathy" stuff:
- Greatest lower bound (managing resources)
- Random variables (game programming)
- Topological sort (adjusting spreadsheets)
- Matrix operations (3D graphics)
- Number theory (encryption)
- Fast Fourier transforms (networks)
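To make one item from that list concrete, here is a sketch of a topological sort (Kahn's algorithm) used to order spreadsheet-style cell recalculation; the cell names and dependencies are invented.

```python
# Topological sort: recalculate spreadsheet cells so every cell is computed
# after the cells it depends on. Cell names and dependencies are made up.
from collections import defaultdict, deque

depends_on = {          # cell -> cells it reads from
    "A1": [],
    "B1": ["A1"],
    "C1": ["A1", "B1"],
    "D1": ["C1"],
}

dependents = defaultdict(list)
indegree = {cell: len(deps) for cell, deps in depends_on.items()}
for cell, deps in depends_on.items():
    for d in deps:
        dependents[d].append(cell)

queue = deque(c for c, deg in indegree.items() if deg == 0)
order = []
while queue:
    cell = queue.popleft()
    order.append(cell)
    for nxt in dependents[cell]:
        indegree[nxt] -= 1
        if indegree[nxt] == 0:
            queue.append(nxt)

print(order)   # ['A1', 'B1', 'C1', 'D1'] -- a valid recalculation order
```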
I don't think that higher math is a requirement for being a good programmer - as always, it depends on what you are coding. Of course, if you are in 3D graphics programming, you'll need matrices and such. As an author of business software, you'll probably need statistics. But having been a professional programmer for almost 10 years (and another 10 years as an amateur), "higher math" is not something that I needed regularly. In about 99.8% of all cases it's just plus, minus, division and multiplication in some intelligent combinations - in most cases it's about algorithms, not math.
Learning higher math, for most programmers, is important simply because it bends your brain to think logically, in a step-by-step manner to get from one thing to another. Very few programming jobs, though, require anything above high school math. I've used linear algebra once. I've never used calculus. I use algebra every day.
Mathematical knowledge is often useful to a programmer, as are graphic design skill, puzzle-solving ability, work ethic and a host of other skills and traits. Very few programmers are good at everything that a programmer can possibly be good at. I wouldn't agree with any statement of the form "you're not a real programmer unless you can {insert favorite programming ability here}". But I would be wary of a programmer who couldn't do Math. More so than of one who couldn't draw.
I think it really depends on what you're trying to do, but IMHO, CS and OS theory are more important than math here, and you really need only the math that they involve. For example, there's a lot of CS background on scheduling theory and optimization that stands behind many schedulers in modern OSs. That is an example of something that would require some math, though nothing super complicated. But honestly, for most stuff, you don't need math. What you need is the ability to think in base 2 and 16, such as the ability to mentally OR/AND. For example, if you have a byte, and within that byte there are two 3-bit fields and 2 wasted bits, knowing which bits in which fields are active when the byte value is something like 11 will make things slightly faster than having to use pen and paper.
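A small sketch of the byte layout described above, assuming an invented arrangement of the two 3-bit fields; the masks and shifts are the whole trick.

```python
# Two 3-bit fields packed into one byte (layout invented for illustration):
#   bits 0-2 = field_a, bits 3-5 = field_b, bits 6-7 unused.
def pack(field_a: int, field_b: int) -> int:
    return (field_a & 0b111) | ((field_b & 0b111) << 3)

def unpack(byte: int) -> tuple[int, int]:
    field_a = byte & 0b111
    field_b = (byte >> 3) & 0b111
    return field_a, field_b

value = 11                          # 0b001011: field_a = 0b011 = 3, field_b = 0b001 = 1
print(bin(value), unpack(value))    # 0b1011 (3, 1)
packed = pack(5, 6)
print(bin(packed), unpack(packed))  # 0b110101 (5, 6)
```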
I began programming about the same time I entered my pre-algebra class, so I wouldn't say math is all that important, though it can help in certain types of programming, especially functional. I haven't taken Discrete Math yet, but I see a lot of theory stuff written for programming in the math notation that is taught in that class. Also, make sure you know how to calculate anything in any base, especially base 2, 8, and 16. Also, one class that really brought home some concepts for me was a pre-programming class. We got taught unions, intersections, and all that happy stuff, and it almost exactly parallels bitwise math. And we covered boolean logic very heavily. What I considered the most useful was when we learned how to reduce complex boolean statements. This was very handy: (x|y) & (x|z) & (x|foo) can be simplified to x | (y & z & foo), which I previously did not quite grasp.
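A quick truth-table check (sketched in Python) confirms that the quoted simplification holds for every combination of inputs:

```python
# Verify (x|y) & (x|z) & (x|foo)  ==  x | (y & z & foo) over all boolean inputs.
from itertools import product

for x, y, z, foo in product([False, True], repeat=4):
    lhs = (x or y) and (x or z) and (x or foo)
    rhs = x or (y and z and foo)
    assert lhs == rhs, (x, y, z, foo)

print("Equivalent for all 16 input combinations.")
```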
Well, you generated a number of responses, and no, I did not read them all. I am in the middle on this: no, you certainly do not need math in order to be a programmer. Assembler vs device drivers in Linux are no more or less complicated than each other, and neither requires math. In no way, shape or form do you need to take or pass a math class for any of this. I will agree that the problem-solving mindset for programming is quite similar to that of math solutions, and as a result math probably comes easily. Or, on the contrary, if math is hard then programming may be hard. A class or a degree or any pieces of paper or trophies are not required; going off and learning stuff, sure. Now, if you cannot convert from hex to binary to decimal quickly, either in your head, on paper, or using a calculator, you are going to struggle. If you want to get into networking and other things that involve timing, which kernel drivers often do but don't have to, you are going to struggle. I know of a very long list of people with math degrees and/or computer science and/or engineering degrees who struggle with the rate calculations: bits per second, bytes per second, how much memory you need to do something, etc. To some extent it may be considered some sort of knack that some have and others have to work toward. My bottom line is I believe in will power: if you want to learn this stuff you can and will, it is as simple as that. You don't need to take a class or spend a lot of money; Linux and qemu, for example, can keep you busy for quite some time - different asm languages, crashable environments for kernel development, embedded, etc. You are not limited to that, but I don't believe that you have to run off and take any classes if you don't want to. If you want to, then sure, take some EE classes, some CS classes and some math classes.
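For illustration, here is the kind of base conversion and rate arithmetic the answer has in mind; the link speed and payload size are made-up numbers.

```python
# Hex / binary / decimal conversions and a simple throughput calculation.
value = 0x1F4                          # hexadecimal literal
print(value, bin(value), hex(value))   # 500 0b111110100 0x1f4
print(int("deadbeef", 16))             # parse hex from a string: 3735928559

# Rate arithmetic (illustrative numbers): how long to move 1 GiB over 100 Mbit/s?
link_bits_per_sec = 100 * 10**6        # 100 megabits per second
payload_bytes = 1 * 2**30              # 1 GiB
seconds = payload_bytes * 8 / link_bits_per_sec
print(f"{seconds:.1f} s to transfer 1 GiB at 100 Mbit/s")   # ~85.9 s
```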
You need math. Programming is nothing more than math. Any finding of theoretical physics does not become a practical (applicable) implication unless it is explained in terms of mathematical solutions. None of those can be solved computationally if they cannot be interpreted on computers, and more specifically in programming languages. Different languages are thus designed to solve specific problems. But for general-purpose and widespread programming languages like Java, C, and C++, much of our programming work involves repetitive (continuous) solutions to the same problems - extracting values from databases or text files, putting them on windows (desktop, web), manipulating the same values, sometimes accessing data from similar devices (but given different brand names, a different port and a headache), etc. - which does not involve more than the unitary method, algebra (counters, some logic), geometry (graphics) and so on. So it depends on what you are trying to solve.
IMO, you probably need an aptitude for mathematics, without necessarily having much knowledge in the field. So the things you require to be good at maths are similar to the things you require to be good at programming. But in general, I can't remember the last time I used any sort of advanced maths in day-to-day programming, so no.