Function point to kloc ratio as a software metric... the "Name That Tune" metric? - functional-programming

What do you think of using a metric of function point to lines of code as a metric?
It makes me think of the old game show "Name That Tune". "I can name that tune in three notes!" I can write that functionality in 0.1 klocs! Is this useful?
It would certainly seem to promote library usage, but is that what you want?

I think it's a terrible idea. Just as bad as paying programmers by lines of code that they write.
In general, I prefer concise code over verbose code, but only as long as it still expresses the programmers' intention clearly. Maximizing function points per kloc is going to encourage everyone to write their code as briefly as they possibly can, which goes beyond concise and into cryptic. It will also encourage people to join adjacent lines of code into one line, even if said joining would not otherwise be desirable, just to reduce the number of lines of code. The maximum allowed line length would also become an issue.

KLOC is tolerable if you strictly enforce code standards, kind of like using page requirements for a report: no putting five statements on a single line or removing most of the whitespace from your code.
I guess one way you could decide how effective it is for your environment is to look at several different applications and modules, get a rough estimate of the quality of the code, and compare that to the size of the code. If you can demonstrate that code quality is consistent within your organization, then KLOC isn't a bad metric.
In some ways, you'll face the same battle with any similar metric. If you count feature or function points, or simply features or modules, you'll still want to weight them in some fashion. Ultimately, you'll need some sort of subjective supplement to the objective data you'll collect.

"What do you think of using a metric of function point to lines of code as a metric?"
I don't get the question. The above ratio is -- for a given language and team -- a simple statistical fact. And it tends toward a mean value with a small standard deviation.
There are lots of degrees of freedom: how you count function points, what language you're using, how (collectively) clever the team is. If you don't change those things, the value stays steady.
After a few projects together, you have a solid expectation that 1200 function points will be 12,000 lines of code in your preferred language/framework/team organization.
KSloc / FP is a bare statistical observation. Clearly, there's something else about this that's bothering you. Could you be more specific in your question?
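To make the "bare statistical observation" concrete, here is a minimal sketch of how you might track the ratio across past projects (the figures are invented for illustration):

from statistics import mean, stdev

# (function points delivered, KLOC written) for past projects -- hypothetical numbers
projects = [(1200, 12.0), (800, 8.3), (450, 4.4), (2000, 20.5)]

ratios = [fp / kloc for fp, kloc in projects]   # FP per KLOC for each project
print("mean FP/KLOC:", round(mean(ratios), 1))
print("std dev:     ", round(stdev(ratios), 1))
# A small standard deviation relative to the mean is what lets you forecast,
# e.g., that roughly 1200 function points will come out to about 12,000 lines.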

The ratio of Function Points to Lines of Code is actually used to generate the language level charts (strictly speaking, it is Function Points to Statements) to give an approximate sense of how powerful a programming language is. Here is an example: http://web.cecs.pdx.edu/~timm/dm/functionpoints.html
I wouldn't recommend using that ratio for anything else, except high level approximations like the language level chart.
Promoting library usage is a good thing, but the other thing to keep in mind is that your ratio will suffer while you are building the libraries; the payoff only comes later, as dividends of savings over time. Bean-counters won't understand that.
I personally would like to see a function points to ABC metric ratio, as I am curious how the ABC metric (which indicates size and includes complexity as part of the information) would relate -- perhaps linearly, perhaps exponentially, etc.: www.softwarerenovation.com/ABCMetric.pdf

All metrics suck. My theory has always been that if you have to have them, then use the easiest thing you can to gather them, be done with it, and move on to more important things.
That generally means something along the lines of
# sum the per-file counts of lines containing a ';' (a rough statement count)
grep -c ";" *.h *.cpp | awk -F: '/:/ {x += $2} END {print x}'
If you are looking for a "metric" to track code efficiency, don't. If you insist, again try something stupid but easy like source file size (see the grep command above, w/o the awk pipe) or McCabe complexity (with a counter program).
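If shell tools aren't convenient, roughly the same count can be gathered with a few lines of Python (a sketch; the file patterns are assumptions):

from pathlib import Path

# Count lines containing a ';' across source files -- a crude statement count,
# equivalent in spirit to the grep/awk one-liner above.
total = sum(
    sum(";" in line for line in path.read_text(errors="ignore").splitlines())
    for ext in ("*.h", "*.cpp")
    for path in Path(".").rglob(ext)
)
print(total)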

Related

Will Dijkstra or A* work correctly with cost a function of full path?

What I'm considering is this: when a node becomes the current node, compute "on the fly" the cost to each neighbor, where the cost is a function of the complete path to arrive at the current node. I can't think how this would break the assumptions of the algorithm, but I have a feeling it might.
I'm doing the on the fly computation for storage reasons anyway, but the new thing would be having the costs be a function of more than the two nodes involved. Could it work?
As far as I can see, it doesn't break the assumptions of the Dijkstra algorithm, i.e. you can still use it. However, doing so requires you to completely restructure your graph.
More precisely, you can't simply use node indices {1,...,N} anymore; your state instead needs to be something like {(1, all-ways-to-get-there), ..., (N, all-ways-to-get-there)}. This brings in nasty exponential scaling.
The reason for this is that the Dijkstra algorithm -- like dynamic programming -- relies on the problem being splittable into parts that can be solved independently, which is not the case here.
Here is an example of why it can't be done by "normal" Dijkstra: say the function that assigns a cost to a given path {node_1, node_2, ..., node_N} is called f and is assumed arbitrary. Then it is completely irrelevant what your current best cost or best path {node_1, ..., node_{N-1}} is at the moment, since you can't infer anything from it -- all you can do is work out every possible path, which grows exponentially and is hopeless for large graphs.
If your function fulfills certain requirements, however, there may be better options. For example, in the simplest case, when your function is additive over subpaths, f({path1} + {path2}) = f({path1}) + f({path2}), the "original" Dijkstra algorithm is recovered.
If it's possible to pre-compute the cost of travelling between each pair of nodes, then there is absolutely no reason you can't use Dijkstra or A*, as long as none of your edge weights can be negative.
If it's not possible to pre-compute the cost, then it's likely that you're doing something wrong in your pathfinding, as it likely depends on the state of the search. :)
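To illustrate that last point: as long as the cost of a step can be computed from the edge alone (not from the whole path taken so far), computing it lazily changes nothing for Dijkstra. A minimal sketch, assuming a cost_fn(u, v) callback and non-negative costs (all names here are hypothetical):

import heapq

def dijkstra_on_the_fly(neighbors, cost_fn, start, goal):
    # neighbors(u) yields adjacent nodes; cost_fn(u, v) computes the edge
    # cost lazily. It must not depend on the path taken to reach u.
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v in neighbors(u):
            nd = d + cost_fn(u, v)        # computed on demand, never negative
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

If cost_fn genuinely needed the whole path to u, then dist[u] would no longer summarize everything relevant about how you reached u, which is exactly the exponential blow-up described above.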

How to quantitatively measure how simplified a mathematical expression is

I am looking for a simple method to assign a number to a mathematical expression, say between 0 and 1, that conveys how simplified that expression is (being 1 as fully simplified). For example:
eval('x+1') should return 1.
eval('1+x+1+x+x-5') should return some value less than 1, because it is far from being simple (i.e., it can be simplified further).
The parameter of eval() could be either a string or an abstract syntax tree (AST).
A simple idea that occurred to me was to count the number of operators (?)
EDIT: Let simplified be equivalent to how close an expression is to the solution of a problem. E.g., given an algebra problem (a limit, derivative, integral, etc.), it should assign a number telling how close the work is to the solution.
The closest metaphor I can come up with is how a maths professor would look at an incomplete problem and mentally assess it in order to tell how close the student is to the solution. Like in a math exam, where the student didn't finish a problem worth 20 points, but the professor assigns 8 out of 20. How would he come up with 8/20, and can we program such a thing?
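As a rough starting point for the operator-counting idea above, here is a minimal sketch using Python's ast module; the 1/(1+ops) scaling is an arbitrary choice to squash the count into (0, 1], and note that returning exactly 1 only for the fully simplified form would require already knowing that form -- which is the crux of the answer below.

import ast

def simplicity(expr: str) -> float:
    # Count operator nodes in the expression's AST; fewer operators -> closer to 1.
    tree = ast.parse(expr, mode="eval")
    ops = sum(isinstance(node, (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare))
              for node in ast.walk(tree))
    return 1.0 / (1.0 + ops)

print(simplicity("x+1"))          # 0.5
print(simplicity("1+x+1+x+x-5"))  # ~0.17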
I'm going to break a Stack Overflow rule and post this as an answer instead of a comment, not only because I'm pretty sure the answer is that you can't (at least, not the way you imagine), but also because I believe it can be educational to a certain degree.
Let's assume that a criterion of simplicity can be established (akin to a normal form). It seems to me that you are very close to trying to solve a problem analogous to the Entscheidungsproblem or the halting problem. I doubt that, in a rule system complex enough for typical algebra, you can find a method that gives a correct and definitive answer to the number of steps in a series of term reductions (ipso facto an arbitrary-length computation) without actually performing it. Such an answer would imply knowing in advance whether that computation terminates, and so would contradict the fact that automatic theorem proving is, for any sufficiently powerful logic capable of representing arithmetic, an undecidable problem.
In the given example, the teacher is actually either performing that computation mentally (going step by step, applying his own sequence of rules), or gives an estimation based on his experience. But, there's no generic algorithm that guarantees his sequence of steps are the simplest possible, nor that his resulting expression is the simplest one (except for trivial expressions), and hence any quantification of "distance" to a solution is meaningless.
Were all this not true, your problem would be simple: you would know the total number of steps, know how many steps you've taken so far, and divide the latter by the former ;-)
Now, returning to the criterion of simplicity, I also advise you to take a look at Hilbert's 24th problem, which specifically asked for a "criteria of simplicity, or proof of the greatest simplicity of certain proofs", and at the slightly related topic of proof compression. If you are philosophically inclined to understand these subjects further, I would suggest reading the classic Gödel, Escher, Bach.
Further notes: to understand why, consider a well-known mathematical object, the Mandelbrot set. Each pixel's color is calculated by determining whether the iteration z(n+1) = z(n)^2 + c for a specific c stays bounded; that is, "a complex number c is part of the Mandelbrot set if, when starting with z(0) = 0 and applying the iteration repeatedly, the absolute value of z(n) remains bounded however large n gets." Despite the equation being extremely simple (you know, square a number and add a constant), there's absolutely no way to know whether it will remain bounded without actually iterating, either forever or until a cycle is found (disregarding complex heuristics). In this sense, every fractal picture out there is a rough approximation that typically uses an escape-time algorithm as a heuristic to give an educated guess as to whether the solution will be bounded or not.

Recursive hypothesis-building with ambiguities - what's it called?

There's a problem I've encountered a lot (in the broad fields of data analysis and AI). However, I can't name it, probably because I don't have a formal CS background. Please bear with me, I'll give two examples:
Imagine natural language parsing:
The flower eats the cow.
You have a program that takes each word, and determines its type and the relations between them. There are two ways to interpret this sentence:
1) flower (substantive) -- eats (verb) --> cow (object)
using the usual SVO word order, or
2) cow (substantive) -- eats (verb) --> flower (object)
using a more poetic word order. The program would rule out other possibilities, e.g. "flower" as a verb, since it follows "the". It would then rank the remaining possibilities: 1) has a more natural word order than 2), so it gets more points. But including the world knowledge that flowers can't eat cows, 2) still wins. So it might return both hypotheses, and give 1) a score of 30 and 2) a score of 70.
Then, it remembers both hypotheses and continues parsing the text, branching off. One branch assumes 1), one 2). If a branch reaches a contradiction, or a ranking of ~0, it is discarded. In the end it presents ranked hypotheses again, but for the whole text.
For a different example, imagine optical character recognition:
** **
** ** *****
** *******
******* **
* ** **
** **
I could look at the strokes and say, sure this is an "H". After identifying the H, I notice there are smudges around it, and give it a slightly poorer score.
Alternatively, I could run my smudge recognition first, and notice that the horizontal line looks like an artifact. After removal, I recognize that this is ll or Il, and give it some ranking.
After processing the whole image, it can be Hlumination, lllumination or Illumination. Using a dictionary and the total ranking, I decide that it's the last one.
The general problem is always some kind of parsing / understanding. Examples:
Natural languages or ambiguous languages
OCR
Path finding
Dealing with ambiguous or incomplete user input - which interpretations make sense, which is the most plausible?
It's recursive.
It can bail out early (when a branch / interpretation doesn't make sense, or will certainly end up with a score of 0). So it's probably some kind of backtracking.
It keeps all options in mind in light of ambiguities.
It's based on simple rules at the bottom, e.g. can_eat(cow, flower) = true.
It keeps a plausibility ranking of interpretations.
It's recursive on a meta level: It can fork / branch off into different 'worlds' where it assumes different hypotheses when dealing with the next part of data.
It'll forward the individual rankings, probably using Bayesian probability, to dependent hypotheses.
In practice, there will be methods to train this thing, determine ranking coefficients, and there will be cutoffs if the tree becomes too big.
I have no clue what this is called. One might guess 'decision tree' or 'recursive descent', but I know those terms mean different things.
I know Prolog can solve simple cases of this, like genealogies and finding out who is whose uncle. But you have to supply all the data in code, and it doesn't seem convenient or powerful enough for my real-life cases.
I'd like to know: what is this problem called, and are there common strategies for dealing with it? Is there good literature on the topic? Are there libraries, ideally for C(++) or Python, where you can just define a bunch of rules and it works out all the rankings and hypotheses?
I don't think there is one answer that fits all the bullet points you have. But I hope my links will lead you closer to an answer or might give you a different question.
I think the closest answer is a Bayesian network, since you have probabilities affecting each other; as I understand it, this is also related to conditional probability and fuzzy logic (see the sketch after the links below).
You also describe a bit of genetic programming as well as artificial neural networks.
I can name drop some more topics which might be related:
http://en.wikipedia.org/wiki/Rule-based_programming
http://en.wikipedia.org/wiki/Expert_system
http://en.wikipedia.org/wiki/Knowledge_engineering
http://en.wikipedia.org/wiki/Fuzzy_system
http://en.wikipedia.org/wiki/Bayesian_inference
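To make the branching-and-ranking part of the question concrete, here is a minimal beam-search-style sketch (the names, scores and toy readings are made up): each hypothesis carries a plausibility score, every new piece of data forks it into scored alternatives, and implausible branches are pruned early.

from dataclasses import dataclass

@dataclass
class Hypothesis:
    interpretation: tuple   # what we believe so far
    score: float            # plausibility, roughly a probability

def expand(hyp, datum, interpret):
    # interpret(datum) yields (label, plausibility) alternatives, e.g. the
    # SVO vs. poetic reading of a sentence, or "H" vs. "Il" for a glyph.
    for label, p in interpret(datum):
        yield Hypothesis(hyp.interpretation + (label,), hyp.score * p)

def parse(data, interpret, beam_width=10, cutoff=1e-6):
    beam = [Hypothesis((), 1.0)]
    for datum in data:
        candidates = [h2 for h in beam for h2 in expand(h, datum, interpret)]
        candidates = [h for h in candidates if h.score > cutoff]        # bail out early
        beam = sorted(candidates, key=lambda h: h.score, reverse=True)[:beam_width]
    return beam   # ranked hypotheses for the whole input

# Toy usage: two ambiguous "words", each with two possible readings.
readings = {"flower-eats-cow": [("SVO", 0.3), ("OVS", 0.7)],
            "next-clause":     [("reading-a", 0.6), ("reading-b", 0.4)]}
for h in parse(readings, lambda d: readings[d]):
    print(h.interpretation, round(h.score, 2))

This is essentially beam search over a hypothesis tree; replacing the score products with proper conditional probabilities is what a Bayesian network formalizes.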

Coding mathematical algorithms - should I use the variable names from the book or more descriptive ones?

I'm maintaining code for a mathematical algorithm that came from a book, with references in the comments. Is it better to have variable names that are descriptive of what the variables represent, or should the variables match what is in the book?
For a simple example, I may see this code, which reflects the variable names in the book.
A_c = v*v/r
I could rewrite it as
centripetal_acceleration = velocity*velocity/radius
The advantage of the latter is that anyone looking at the code could understand it. However, the advantage of the former is that it is easier to compare the code with what is in the book. I may do this in order to double check the implementation of the algorithms, or I may want to add additional calculations.
Perhaps I am over-thinking this, and should simply use comments to describe what the variables are. I tend to favor self-documenting code however (use descriptive variable names instead of adding comments to describe what they are), but maybe this is a case where comments would be very helpful.
I know this question can be subjective, but I wondered if anyone had any guiding principles in order to make a decision, or had links to guidelines for coding math algorithms.
I would prefer to use the more descriptive variable names. You can't guarantee everyone that is going to look at the code has access to "the book". You may leave and take your copy, it may go out of print, etc. In my opinion it's better to be descriptive.
We use a lot of mathematical reference books in our work, and we reference them in comments, but we rarely use the same mathematically abbreviated variable names.
A common practice is to summarise all your variables, indexes and descriptions in a comment header before starting the code proper, e.g.
// A_c = Centripetal Acceleration
// v = Velocity
// r = Radius
A_c = (v^2)/r
I write a lot of mathematical software. IF I can insert in the comments a very specific reference to a book or a paper or (best) web site that explains the algorithm and defines the variable names, then I will use the SHORT names like a = v * v / r because it makes the formulas easier to read and write and verify visually.
IF not, then I will write very verbose code with lots of comments and long descriptive variable names. Essentially, my code becomes a paper that describes the algorithm (anyone remember Knuth's "Literate Programming" efforts, years ago? Though the technology for it never took off, I emulate the spirit of that effort). I use a LOT of ascii art in my comments, with box-and-arrow diagrams and other descriptive graphics. I use Jave.de -- the Java Ascii Vmumble Editor.
I will sometimes write my math with short, angry little variable names, easier to read and write for ME because I know the math, then use REFACTOR to replace the names with longer, more descriptive ones at the end, but only for code that is much more informal.
I think it depends almost entirely upon the audience for whom you're writing -- and don't ever mistake the compiler for the audience either. If your code is likely to be maintained by more or less "general purpose" programmers who may not/probably won't know much about physics so they won't recognize what v and r mean, then it's probably better to expand them to be recognizable for non-physicists. If they're going to be physicists (or, for another example, game programmers) for whom the textbook abbreviations are clear and obvious, then use the abbreviations. If you don't know/can't guess which, it's probably safer to err on the side of the names being longer and more descriptive.
I vote for the "book" version. 'v' and 'r' etc. are pretty well understood as abbreviations for velocity and radius, and the result is more compact.
How far would you take it?
Most (non-Greek :-)) keyboards don't provide easy access to Δ, but it's valid as part of an identifier in some languages (e.g. C#):
int Δv;
int Δx;
Anyone coming afterwards and maintaining the code may curse you every day. Similarly for a lot of other symbols used in maths. So if you're not going to use those actual symbols (and I'd encourage you not to), I'd argue you ought to translate the rest, where it doesn't make for code that's too verbose.
In addition, what if you need to combine algorithms, and those algorithms have conflicting usage of variables?
A compromise could be to code and debug as contained in the book, and then perform a global search and replace for all of your variables towards the end of your development, so that it is easier to read. If you do this I would change the names of the variables slightly so that it is easier to change them later.
e.g A_c# = v#*v#/r#

Normalizing FFT Data for Human Hearing

The typical FFT for audio looks pretty similar to this, with most of the action happening on the far left side
http://www.flight404.com/blog/images/fft.jpg
He multiplied it by a partial sine wave to get it to the bottom, but the article isn't too specific on that part. It also seems like a "good enough" modification of the dataset, rather than one based on some property. I understand that human hearing is better suited to the higher frequencies; thus, most music will have amplified bass and attenuated treble so that both sound roughly equally strong to us.
My question is what modification needs to be done to the FFT to compensate for this standard falloff?
for(i = 0; i < fft.length; i++){
fft[i] = fft[i] * Math.log(i + 1); // does, eh, ok but the high
// end is still not really "loud"
// enough
}
EDIT:
http://en.wikipedia.org/wiki/Equal-loudness_contour
I came across this article; I think it might be the direction to head in, but there still might be some property of the FFT that needs to be counteracted.
First, are you sure you want to do this? It makes sense to compensate for some things, like the microphone response not being flat, but not human perception. People are used to hearing sounds with the spectral content that the sounds have in the real world, not along perceptual equal loudness curves. If you play a sound that you've modified in the way you suggest it would sound strange. Maybe some people like the music to have enhanced low frequencies, but this is a matter of taste, not psychophysics.
Or maybe you are compensating for some other reason, for example, taking into account the poorer sensitivity to lower frequencies might enhance a compression algorithm. Is this the idea?
If you do want to normalize by the equal-loudness curves, note that most of the curves and equations are in terms of sound pressure level (SPL). SPL is the log of the square of the waveform amplitude, so when you work with the FFTs, it's probably easiest to work with their square (the power spectra). (Or, of course, you could compensate in other ways by, say, multiplying by sqrt(log(i+1)) in your equation above -- assuming that the log was an approximation of the inverse equal-loudness curve.)
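A sketch of that suggestion, assuming fft_mag holds the one-sided FFT bin magnitudes and you know the sample rate, and using the standard A-weighting curve as a rough stand-in for the inverse equal-loudness contour (real contours are level-dependent, as the next answer points out):

import math

def a_weight_db(f):
    # Standard A-weighting curve (roughly the inverse of the 40-phon
    # equal-loudness contour); frequency f in Hz.
    if f <= 0:
        return float("-inf")
    r = (12194.0**2 * f**4) / ((f**2 + 20.6**2)
        * math.sqrt((f**2 + 107.7**2) * (f**2 + 737.9**2))
        * (f**2 + 12194.0**2))
    return 20.0 * math.log10(r) + 2.0

def weighted_power_db(fft_mag, sample_rate):
    # Work on the power spectrum (magnitude squared, in dB) and add the
    # A-weighting, pulling the low-frequency bins down toward how loud
    # they are actually perceived.
    n = len(fft_mag)
    out = []
    for i, mag in enumerate(fft_mag):
        f = i * sample_rate / (2.0 * n)          # bin index -> frequency
        power_db = 10.0 * math.log10(mag * mag + 1e-12)
        out.append(power_db + a_weight_db(f))
    return out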
I think the equal-loudness contour is exactly the right direction. However, its shape depends on the absolute pressure level; in other words, the sensitivity curve of our hearing changes with sound pressure. There is no "correct" normalization if you have no information about absolute levels. Whether this is a problem depends on what you want to do with the data.
The loudness contour is standardized in ISO 226 but this document is not freely available for download. It should be in a decent university library though.
Here is another source for loudness contours.
So you are trying to raise the level of the high end frequencies? Sounds like a high pass filter with a minimum multiplier might work, so that you don't attenuate the low frequency signals too much. Pick up a good book on filter design, maybe monkey around with this applet
In the old days of the first samplers (this is before MOTU, boost people :) it wasn't FFT but simple normalisation (Fairlight or Roland did it first, I think) applied to the original or resulting time-domain signal (if you are doing beat slicing, ReCycle-style); can't you do that? Or only go for the FFT after you compensate to counteract it?
Otherwise it seems like a two-phase procedure; I'd personally leave the FFT as-is for this task.
