CNN AlexNet algorithm complexity - math

I'm first year student in machine learning and I really recently started to immersing.
So, professor gave me a task, calculate number of:
matrix additions
matrix multiplications
matrix divisions
Which are processed in the well known convolutional neural network - AlexNet.
I found some matherials about it, but I really confused where to start.
So, the overall structure might looks like:
But, how can I calculate operations for each type distinctly?

It's a convolutional network. These limit parameters to make the calculations manageable while still giving good results.
This particular network is described in many places and papers, so it's not difficult to get the figures for the number of parameters and conv nets involved.
But you need to start with an understanding of how a convolutional network works. I find this is a good place to start: http://cs231n.github.io/convolutional-networks/

Related

Unchanges training accuracy in convolutional neural network using MXnet

I am totally new to NN and want to classify the almost 6000 images that belong to different games (gathered by IR). I used the steps introduced in the following link, but I get the same training accuracy in each round.
some info about NN architecture: 2 convloutional, activation, and pooling layers. Activation type: relu, Number of filters in first and second layers are 30 and 70 respectively.
2 fully connected layer with 500 and 2 hidden layers respectively.
http://firsttimeprogrammer.blogspot.de/2016/07/image-recognition-in-r-using.html
I had a similar problem, but for regression. After trying several things (different optimizers, varying layers and nodes, learning rates, iterations etc), I found that the way initial values are given helps a lot. For instance I used a -random initializer with variance of 0.2 (initializer = mx.init.normal(0.2)).
I came upon this value from this blog. I recommend you read it. [EDIT]An excerpt from the same,
Weight initialization. Worry about the random initialization of the weights at the start of learning.
If you are lazy, it is usually enough to do something like 0.02 * randn(num_params). A value at this scale tends to work surprisingly well over many different problems. Of course, smaller (or larger) values are also worth trying.
If it doesn’t work well (say your neural network architecture is unusual and/or very deep), then you should initialize each weight matrix with the init_scale / sqrt(layer_width) * randn. In this case init_scale should be set to 0.1 or 1, or something like that.
Random initialization is super important for deep and recurrent nets. If you don’t get it right, then it’ll look like the network doesn’t learn anything at all. But we know that neural networks learn once the conditions are set.
Fun story: researchers believed, for many years, that SGD cannot train deep neural networks from random initializations. Every time they would try it, it wouldn’t work. Embarrassingly, they did not succeed because they used the “small random weights” for the initialization, which works great for shallow nets but simply doesn’t work for deep nets at all. When the nets are deep, the many weight matrices all multiply each other, so the effect of a suboptimal scale is amplified.
But if your net is shallow, you can afford to be less careful with the random initialization, since SGD will just find a way to fix it.
You’re now informed. Worry and care about your initialization. Try many different kinds of initialization. This effort will pay off. If the net doesn’t work at all (i.e., never “gets off the ground”), keep applying pressure to the random initialization. It’s the right thing to do.
http://yyue.blogspot.in/2015/01/a-brief-overview-of-deep-learning.html

Graph partitioning optimization

The problem
I have a set of locations on a plane (actually they are pins in a KML file) and I want to partition this graph into subgraphs. Connectivity is pretty good - as with all real world road networks - so I assume that if two locations are close they have some kind of connection. The resulting set of subgraphs should adhere to these constraints:
Every node has to be covered by a subgraph
Every node should be in exactly 1 subgraph
Every node within a subgraph should be close to each other (L2 norm distances)
Every subgraph should contain at least 5 locations
The amount of subgraphs should be minimal
Right now the amount of locations is no more than 100 so I thought about brute forcing through every possibility but this obviously won't scale well.
I thought about using some k-Nearest-Neighbors algorithm (e.g. using QuickGraph) but I can't get my head around where to start and how to extend/shrink the subgraphs on the way. Maybe it's possible to map this problem to another problem that can easily be solved with some numerical procedure (e.g. Simplex) ...
Maybe someone has experience in this kind of optimization problems and is willing to help me find a solution? I don't have access to Mathematica/Matlab or the like ... but sufficient .NET programming skills and hmm Excel :-)
Thanks a lot!
As soon as there are multiple criteria that need to be appeased in the best possible way simultanously, it is usually starting to get difficult.
A numerical solution could work as follows: You could define yourself a utility function, that maps partitionings of your locations to positive real values, describing how "good" a partition is by assigning it a "rating" (good could be high "bad" could be near zero).
Once you have such a function assigning partitions their according "values", you simply need to optimize it and then you hopefully obtain a good solution if you defined your utility function reasonably. Evolutionary algorithms are good at that task since your utility function is probably analytically too complex to solve due to its discrete nature.
The problem is then only how you assign "values" to partitions via this utility function. This is then your task. It can be done for example by weighing each criterion with a factor and summing the results up, or even more complex functions (least squares etc.). The factors you use in the definition of the utility function are tuning parameters and can be varied until the result seems to be good.
Some CA software wold help a lot for testing if you can get your hands on one, bit I guess to obtain a black box solver for your partitioning problem, you need to implement the complete procedure yourself using a language of your choice.

Can any existing Machine Learning structures perfectly emulate recursive functions like the Fibonacci sequence?

To be clear I don't mean, provided the last two numbers in the sequence provide the next one:
(2, 3, -> 5)
But rather given any index provide the Fibonacci number:
(0 -> 1) or (7 -> 21) or (11 -> 144)
Adding two numbers is a very simple task for any machine learning structure, and by extension counting by ones, twos or any fixed number is a simple addition rule. Recursive calculations however...
To my understanding, most learning networks rely on forwards only evaluation, whereas most programming languages have loops, jumps, or circular flow patterns (all of which are usually ASM jumps of some kind), thus allowing recursion.
Sure some networks aren't forwards only; But can processing weights using the hyperbolic tangent or sigmoid function enter any computationally complete state?
i.e. conditional statements, conditional jumps, forced jumps, simple loops, complex loops with multiple conditions, providing sort order, actual reordering of elements, assignments, allocating extra registers, etc?
It would seem that even a non-forwards only network would only find a polynomial of best fit, reducing errors across the expanse of the training set and no further.
Am I missing something obvious, or did most of Machine Learning just look at recursion and pretend like those problems don't exist?
Update
Technically any programming language can be considered the DNA of a genetic algorithm, where the compiler (and possibly console out measurement) would be the fitness function.
The issue is that programming (so far) cannot be expressed in a hill climbing way - literally, the fitness is 0, until the fitness is 1. Things don't half work in programming, and if they do, there is no way of measuring how 'working' a program is for unknown situations. Even an off by one error could appear to be a totally different and chaotic system with no output. This is exactly the reason learning to code in the first place is so difficult, the learning curve is almost vertical.
Some might argue that you just need to provide stronger foundation rules for the system to exploit - but that just leads to attempting to generalize all programming problems, which circles right back to designing a programming language and loses all notion of some learning machine at all. Following this road brings you to a close variant of LISP with mutate-able code and virtually meaningless fitness functions that brute force the 'nice' and 'simple' looking code-space in attempt to follow human coding best practices.
Others might argue that we simply aren't using enough population or momentum to gain footing on the error surface, or make a meaningful step towards a solution. But as your population approaches the number of DNA permutations, you are really just brute forcing (and very inefficiently at that). Brute forcing code permutations is nothing new, and definitely not machine learning - it's actually quite common in regex golf, I think there's even an xkcd about it...
The real problem isn't finding a solution that works for some specific recursive function, but finding a solution space that can encompass the recursive domain in some useful way.
So other than Neural Networks trained using Backpropagation hypothetically finding the closed form of a recursive function (if a closed form even exists, and they don't in most real cases where recursion is useful), or a non-forwards only network acting like a pseudo-programming language with awful fitness prospects in the best case scenario, plus the virtually impossible task of tuning exit constraints to prevent infinite recursion... That's really it so far for machine learning and recursion?
According to Kolmogorov et al's On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition, a three layer neural network can model arbitrary function with the linear and logistic functions, including f(n) = ((1+sqrt(5))^n - (1-sqrt(5))^n) / (2^n * sqrt(5)), which is the close form solution of Fibonacci sequence.
If you would like to treat the problem as a recursive sequence without a closed-form solution, I would view it as a special sliding window approach (I called it special because your window size seems fixed as 2). There are more general studies on the proper window size for your interest. See these two posts:
Time Series Prediction via Neural Networks
Proper way of using recurrent neural network for time series analysis
Ok, where to start...
Firstly, you talk about 'machine learning' and 'perfectly emulate'. This is not generally the purpose of machine learning algorithms. They make informed guesses given some evidence and some general notions about structures that exist in the world. That typically means an approximate answer is better than an 'exact' one that is wrong. So, no, most existing machine learning approaches aren't the right tools to answer your question.
Second, you talk of 'recursive structures' as some sort of magic bullet. Yet they are merely convenient ways to represent functions, somewhat analogous to higher order differential equations. Because of the feedbacks they tend to introduce, the functions tend to be non-linear. Some machine learning approaches will have trouble with this, but many (neural networks for example) should be able to approximate you function quite well, given sufficient evidence.
As an aside, having or not having closed form solutions is somewhat irrelevant here. What matters is how well the function at hand fits with the assumptions embodied in the machine learning algorithm. That relationship may be complex (eg: try approximating fibbonacci with a support vector machine), but that's the essence.
Now, if you want a machine learning algorithm tailored to the search for exact representations of recursive structures, you could set up some assumptions and have your algorithm produce the most likely 'exact' recursive structure that fits your data. There are probably real world problems in which such a thing would be useful. Indeed the field of optimisation approaches similar problems.
The genetic algorithms mentioned in other answers could be an example of this, especially if you provided a 'genome' that matches the sort of recursive function you think you may be dealing with. Closed form primitives could form part of that space too, if you believe they are more likely to be 'exact' than more complex genetically generated algorithms.
Regarding your assertion that programming cannot be expressed in a hill climbing way, that doesn't prevent a learning algorithm from scoring possible solutions by how many much of your evidence it's able to reproduce and how complex they are. In many cases (most? though counting cases here isn't really possible) such an approach will find a correct answer. Sure, you can come up with pathological cases, but with those, there's little hope anyway.
Summing up, machine learning algorithms are not usually designed to tackle finding 'exact' solutions, so aren't the right tools as they stand. But, by embedding some prior assumptions that exact solutions are best, and perhaps the sort of exact solution you're after, you'll probably do pretty well with genetic algorithms, and likely also with algorithms like support vector machines.
I think you also sum things up nicely with this:
The real problem isn't finding a solution that works for some specific recursive function, but finding a solution space that can encompass the recursive domain in some useful way.
The other answers go a long way to telling you where the state of the art is. If you want more, a bright new research path lies ahead!
See this article:
Turing Machines are Recurrent Neural Networks
http://lipas.uwasa.fi/stes/step96/step96/hyotyniemi1/
The paper describes how a recurrent neural network can simulate a register machine, which is known to be a universal computational model equivalent to a Turing machine. The result is "academic" in the sense that the neurons have to be capable of computing with unbounded numbers. This works mathematically, but would have problems pragmatically.
Because the Fibonacci function is just one of many computable functions (in fact, it is primitive recursive), it could be computed by such a network.
Genetic algorithms should do be able to do the trick. The important this is (as always with GAs) the representation.
If you define the search space to be syntax trees representing arithmetic formulas and provide enough training data (as you would with any machine learning algorithm), it probably will converge to the closed-form solution for the Fibonacci numbers, which is:
Fib(n) = ( (1+srqt(5))^n - (1-sqrt(5))^n ) / ( 2^n * sqrt(5) )
[Source]
If you were asking for a machine learning algorithm to come up with the recursive formula to the Fibonacci numbers, then this should also be possible using the same method, but with individuals being syntax trees of a small program representing a function.
Of course, you also have to define good cross-over and mutation operators as well as a good evaluation function. And I have no idea how well it would converge, but it should at some point.
Edit: I'd also like to point out that in certain cases there is always a closed-form solution to a recursive function:
Like every sequence defined by a linear recurrence with constant coefficients, the Fibonacci numbers have a closed-form solution.
The Fibonacci sequence, where a specific index of the sequence must be returned, is often used as a benchmark problem in Genetic Programming research. In most cases recursive structures are generated, although my own research focused on imperative programs so used an iterative approach.
There's a brief review of other GP research that uses the Fibonacci problem in Section 3.4.2 of my PhD thesis, available here: http://kar.kent.ac.uk/34799/. The rest of the thesis also describes my own approach, which is covered a bit more succinctly in this paper: http://www.cs.kent.ac.uk/pubs/2012/3202/
Other notable research which used the Fibonacci problem is Simon Harding's work with Self-Modifying Cartesian GP (http://www.cartesiangp.co.uk/papers/eurogp2009-harding.pdf).

Genetic Algorithms Introduction

Starting off let me clarify that i have seen This Genetic Algorithm Resource question and it does not answer my question.
I am doing a project in Bioinformatics. I have to take data about the NMR spectrum of a cell(E. Coli) and find out what are the different molecules(metabolites) present in the cell.
To do this i am going to be using Genetic Algorithms in R language. I DO NOT have the time to go through huge books on Genetic algorithms. Heck! I dont even have time to go through little books.(That is what the linked question does not answer)
So i need to know of resources which will help me understand quickly what it is Genetic Algorithms do and how they do it. I have read the Wikipedia entry ,this webpage and also a couple of IEEE papers on the subject.
Any working code in R(even in C) or pointers to which R modules(if any) to be used would be helpful.
A brief (and opinionated) introduction to genetic algorithms is at http://www.burns-stat.com/pages/Tutor/genetic.html
A simple GA written in R is available at http://www.burns-stat.com/pages/Freecode/genopt.R The "documentation" is in 'S Poetry' http://www.burns-stat.com/pages/Spoetry/Spoetry.pdf and the code.
I assume from your question you have some function F(metabolites) which yields a spectrum but you do not have the inverse function F'(spectrum) to get back metabolites. The search space of metabolites is large so rather than brute force it you wish to try an approximate method (such as a genetic algorithm) which will make a more efficient random search.
In order to apply any such approximate method you will have to define a score function which compares the similarity between the target spectrum and the trial spectrum. The smoother this function is the better the search will work. If it can only yield true/false it will be a purely random search and you'd be better off with brute force.
Given the F and your score (aka fitness) function all you need to do is construct a population of possible metabolite combinations, run them all through F, score all the resulting spectrums, and then use crossover and mutation to produce a new population that combines the best candidates. Choosing how to do the crossover and mutation is generally domain specific because you can speed the process greatly by avoiding the creation of nonsense genomes. The best mutation rate is going to be very small but will also require tuning for your domain.
Without knowing about your domain I can't say what a single member of your population should look like, but it could simply be a list of metabolites (which allows for ordering and duplicates, if that's interesting) or a string of boolean values over all possible metabolites (which has the advantage of being order invariant and yielding obvious possibilities for crossover and mutation). The string has the disadvantage that it may be more costly to filter out nonsense genes (for example it may not make sense to have only 1 metabolite or over 1000). It's faster to avoid creating nonsense rather than merely assigning it low fitness.
There are other approximate methods if you have F and your scoring function. The simplest is probably Simulated Annealing. Another I haven't tried is the Bees Algorithm, which appears to be multi-start simulated annealing with effort weighted by fitness (sort of a cross between SA and GA).
I've found the article "The science of computing: genetic algorithms", by Peter J. Denning (American Scientist, vol 80, 1, pp 12-14). That article is simple and useful if you want to understand what genetic algorithms do, and is only 3 pages to read!!

In what areas of programming is a knowledge of mathematics helpful? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
For example, math logic, graph theory.
Everyone around tells me that math is necessary for programmer. I saw a lot of threads where people say that they used linear algebra and some other math, but no one described concrete cases when they used it.
I know that there are similar threads, but I couldn't see any description of such a case.
Computer graphics.
It's all matrix multiplication, vector spaces, affine spaces, projection, etc. Lots and lots of algebra.
For more information, here's the Wikipedia article on projection, along with the more specific case of 3D projection, with all of its various matrices. OpenGL, a common computer graphics library, is an example of applying affine matrix operations to transform and project objects onto a computer screen.
I think that a lot of programmers use more math than they think they do. It's just that it comes so intuitively to them that they don't even think about it. For instance, every time you write an if statement are you not using your Discrete Math knowledge?
In graphic world you need a lot of transformations.
In cryptography you need geometry and number theory.
In AI, you need algebra.
And statistics in financial environments.
Computer theory needs math theory: actually almost all the founders are from Maths.
Given a list of locations with latitudes and longitudes, sort the list in order from closest to farthest from a specific position.
All applications that deal with money need math.
I can't think of a single app that I have written that didn't require math at some point.
I wrote a parser compiler a few months back, and that's full of graph-theory. This was only designed to be slightly more powerful than regular expressions (in that multiple matches were allowed, and some other features were added), but even such a simple compiler requires loop detection, finite state automata, and tons more math.
Implementing the Advanced Encryption Standard (AES) algorithm required some basic understanding of finite field math. See act 4 of my blog post on it for details (code sample included).
I've used a lot of algebra when writing business apps.
Simple Examples
BMI = weight / (height * height);
compensation = 10 * hours * ((pratio * 2.3) + tratio);
A few years ago, I had a DSP project that had to compute a real radix-2 FFT of size N, in a given time. The vendor-supplied real radix-2 FFT wouldn't run in the allocated time, but their complex FFT of size N/2 would. It is easy to feed the real data into the complex FFT. Getting the answers out afterwards is not so easy: it is called post-weaving, or post-unweaving, or unweaving. Deriving the unweave equations from the FFT and complex number theory was not fun. Going from there to tightly-optimized DSP code was equally not fun.
Naturally, the signal I was measuring did not match the FFT sample size, which causes artifacts. The standard fix is to apply a Hanning window. This causes other artifacts. As part of understanding (and testing) that code, I had to understand the artifacts caused by the Hanning window, so I could interpret the results and decide whether the code was working or not.
I've used tons of math in various projects, including:
Graph theory for dealing with dependencies in large systems (e.g. a Makefile is a kind of directed graph)
Statistics and linear regression in profiling performance bottlenecks
Coordinate transformations in geospatial applications
In scientific computing, project requirements are often stated in algebraic form, especially for computationally intensive code
And that's just off the top of my head.
And of course, anything involving "pure" computer science (algorithms, computational complexity, lambda calculus) tends to look more and more like math the deeper you go.
In answering this image-comparison-algorithm question, I drew on lots of knowledge of math, some of it from other answers and web searches (where I had to apply my own knowledge to filter the information), and some from my own engineering training and lengthy programming background.
General Mindforming
Solving Problems - One fundamental method of math, independent of the area, is transofrming an unknown problem into a known one. Even if you don't have the same problems, you need the same skill. In math, as in programming, virtually everything has different representations. Understanding the equivalence between algorithms, problems or solutions that are completely different on the surface helps you avoid the hard parts.
(A similar thing happens in physics: to solve a kinematic problem, choice of the coordinate system is often the difference between one and ten pages full of formulas, even though problem and solution are identical.)
Precision of Language / Logical reasoning - Math has a very terse yet precise language. Learning to deal with that will prepare you for computers doing what you say, not what you meant. Also, the same precision is required to analyse if a specification is sufficient, to check a piece of code if it covers all possible cases, etc.
Beauty and elegance - This may be the argument that's hardest to grasp. I found the notion of "beauty" in code is very close to the one found in math. A beautiful proof is one whose idea is immediately convincing, and the proof itself is merely executing a sequence of executing the next obvious step.
The same goes for an elegant implementation.
(Most mathematicians I've encountered have a faible for putting the "Aha!" - effect at the end rather than at the beginning. As have most elite geeks).
You can learn these skills without one lesson of math, of course. But math ahs perfected this for centuries.
Applied Skills
Examples:
- Not having to run calc.exe for a quick estimation of memory requirements
- Some basic statistics to tell a valid performance measurement from a shot in the dark
- deducing a formula for a sequence of values, rather than hardcoding them
- Getting a feeling for what c*O(N log N) means.
- Recursion is the same as proof by inductance
(that list would probably go on if I'd actively watch myself for items for a day. This part is admittedly harder than I thought. Further suggestions welcome ;))
Where I use it
The company I work for does a lot of data acquisition, and our claim to fame (comapred to our competition) is the brain muscle that goes into extracting something useful out of the data. While I'm mostly unconcerned with that, I get enough math thrown my way. Before that, I've implemented and validated random number generators for statistical applications, implemented a differential equation solver, wrote simulations for selected laws of physics. And probably more.
I wrote some hash functions for mapping airline codes and flight numbers with good efficiency into a fairly limited number of data slots.
I went through a fair number of primes before finding numbers that worked well with my data. Testing required some statistics and estimates of probabilities.
In machine learning: we use Bayesian (and other probabilistic) models all the time, and we use quadratic programming in the form of Support Vector Machines, not to mention all kinds of mathematical transformations for the various kernel functions. Calculus (derivatives) factors into perceptron learning. Not to mention a whole theory of determining the accuracy of a machine learning classifier.
In artifical intelligence: constraint satisfaction, and logic weigh very heavily.
I was using co-ordinate geometry to solve a problem of finding the visible part of a stack of windows, not exactly overlapping on one another.
There are many other situations, but this is the one that I got from the top of my head. Inherently all operations that we do is mathematics or at least depends on/related to mathematics.
Thats why its important to know mathematics to have a more clearer understanding of things :)
Infact in some cases a lot of math has gone into our common sense that we don't notice that we are using math to solve a particular problem, since we have been using it for so long!
Thanks
-Graphics (matrices, translations, shaders, integral approximations, curves, etc, etc,...infinite dots)
-Algorithm Complexity calculations (specially in line of business' applications)
-Pointer Arithmetics
-Cryptographic under field arithmetics etc.
-GIS (triangles, squares algorithms like delone, bounding boxes, and many many etc)
-Performance monitor counters and the functions they describe
-Functional Programming (simply that, not saying more :))
-......
I used Combinatorials to stuff 20 bits of data into 14 bits of space.
Machine Vision or Computer Vision requires a thorough knowledge of probability and statistics. Object detection/recognition and many supervised segmentation techniques are based on Bayesian inference. Heavy on linear algebra too.
As an engineer, I'm trying really hard to think of an instance when I did not need math. Same story when I was a grad student. Granted, I'm not a programmer, but I use computers a lot.
Games and simulations need lots of maths - fluid dynamics, in particular, for things like flames, fog and smoke.
As an e-commerce developer, I have to use math every day for programming. At the very least, basic algebra.
There are other apps I've had to write for vector based image generation that require a strong knowledge of Geometry, Calculus and Trigonometry.
Then there is bit-masking...
Converting hexadecimal to base ten in your head...
Estimating load potential of an application...
Yep, if someone is no good with math, they're probably not a very good programmer.
Modern communications would completely collapse without math. If you want to make your head explode sometime, look up Galois fields, error correcting codes, and data compression. Then symbol constellations, band-limited interpolation functions (I'm talking about sinc and raised-cosine functions, not the simple linear and bicubic stuff), Fourier transforms, clock recovery, minimally-ambiguous symbol training sequences, Rayleigh and/or Ricean fading, and Kalman filtering. All of those involve math that makes my head hurt bad, and I got a Masters in Electrical Engineering. And that's just off the top of my head, from my wireless communications class.
The amount of math required to make your cell phone work is huge. To make a 3G cell phone with Internet access is staggering. To prove with sufficient confidence that an algorithm will work in most all cases sometimes takes people's careers.
But... if you're only ever going to work with this stuff as black boxes imported from a library (at their mercy, really), well, you might get away with just knowing enough algebra to debug mismatched parentheses. And there are a lot more of those jobs than the hard ones... but at the same time, the hard jobs are harder to find a replacement for.
Examples that I've personally coded:
wrote a simple video game where one spaceship shoots a laser at another ship. To know if the ship was in the laser's path, I used basic algebra y=mx+b to calculate if the paths intersect. (I was a child when I did this and was quite amazed that something that was taught on a chalkboard (algebra) could be applied to computer programming.)
calculating mortgage balances and repayment schedules with logarithms
analyzing consumer buying choices by calculating combinatorics
trigonometry to simulate camera lens behavior
Fourier Transform to analyze digital music files (WAV files)
stock market analysis with statistics (linear regressions)
using logarithms to understand binary search traversals and also disk space savings when using packing information into bit fields. (I don't calculate logarithms in actual code, but I figure them out during "design" to see if it's feasible to even bother coding it.)
None of my projects (so far) have required topics such as calculus, differential equations, or matrices. I didn't study mathematics in school but if a project requires math, I just reference my math books and if I'm stuck, I search google.
Edited to add: I think it's more realistic for some people to have a programming challenge motivate the learning of particular math subjects. For others, they enjoy math for its own sake and can learn it ahead of time to apply to future programming problems. I'm of the first type. For example, I studied logarithms in high school but didn't understand their power until I started doing programming and all of sudden, they seem to pop up all over the place.
The recurring theme I see from these responses is that this is clearly context-dependent.
If you're writing a 3D graphics engine then you'd be well advised to brush up on your vectors and matrices. If you're writing a simple e-commerce website then you'll get away with basic algebra.
So depending on what you want to do, you may not need any more math than you did to post your question(!), or you might conceivably need a PhD (i.e. you would like to write a custom geometry kernel for turbine fan blade design).
One time I was writing something for my Commodore 64 (I forget what, I must have been 6 years old) and I wanted to center some text horizontally on the screen.
I worked out the formula using a combination of math and trial-and-error; years later I would tackle such problems using actual algebra.
Drawing, moving, and guidance of missiles and guns and lasers and gravity bombs and whatnot in this little 2d video game I made: wordwarvi
Lots of uses sine/cosine, and their inverses, (via lookup tables... I'm old, ok?)
Any geo based site/app will need math. A simple example is "Show me all Bob's Pizzas within 10 miles of me" functionality on a website. You will need math to return lat/lons that occur within a 10 mile radius.
This is primarily a question whose answer will depend on the problem domain. Some problems require oodles of math and some require only addition and subtraction. Right now, I have a pet project which might require graph theory, not for the math so much as to get the basic vocabulary and concepts in my head.
If you're doing flight simulations and anything 3D, say hello to quaternions! If you're doing electrical engineering, you will be using trig and complex numbers. If you're doing a mortgage calculator, you will be doing discrete math. If you're doing an optimization problem, where you attempt to get the most profits from your widget factory, you will be doing what is called linear programming. If you are doing some operations involving, say, network addresses, welcome to the kind of bit-focused math that comes along with it. And that's just for the high-level languages.
If you are delving into highly-optimized data structures and implementing them yourself, you will probably do more math than if you were just grabbing a library.
Part of being a good programmer is being familiar with the domain in which you are programming. If you are working on software for Fidelity Mutual, you probably would need to know engineering economics. If you are developing software for Gallup, you probably need to know statistics. LucasArts... probably Linear Algebra. NASA... Differential Equations.
The thing about software engineering is you are almost always expected to wear many hats.
More or less anything having to do with finding the best layout, optimization, or object relationships is graph theory. You may not immediately think of it as such, but regardless - you're using math!
An explicit example: I wrote a node-based shader editor and optimizer, which took a set of linked nodes and converted them into shader code. Finding the correct order to output the code in such that all inputs for a certain node were available before that node needed them involved graph theory.
And like others have said, anything having to do with graphics implicitly requires knowledge of linear algebra, coordinate spaces transformations, and plenty of other subtopics of mathematics. Take a look at any recent graphics whitepaper, especially those involving lighting. Integrals? Infinite series?! Graph theory? Node traversal optimization? Yep, all of these are commonly used in graphics.
Also note that just because you don't realize that you're using some sort of mathematics when you're writing or designing software, doesn't mean that you aren't, and actually understanding the mathematics behind how and why algorithms and data structures work the way they do can often help you find elegant solutions to non-trivial problems.
In years of webapp development I didn't have much need with the Math API. As far as I can recall, I have ever only used the Math#min() and Math#max() of the Math API.
For example
if (i < 0) {
i = 0;
}
if (i > 10) {
i = 10;
}
can be done as
i = Math.max(0, Math.min(i, 10));

Resources