trec_eval: does iprec_at_recall staying constant mean something is wrong? - information-retrieval

I am completely new to trec_eval. When I run it for a set of queries, I get the following results for iprec_at_recall:
iprec_at_recall_0.00 all 0.5059
iprec_at_recall_0.10 all 0.5059
iprec_at_recall_0.20 all 0.5059
iprec_at_recall_0.30 all 0.5059
iprec_at_recall_0.40 all 0.5059
iprec_at_recall_0.50 all 0.5059
iprec_at_recall_0.60 all 0.5059
iprec_at_recall_0.70 all 0.5059
iprec_at_recall_0.80 all 0.5059
iprec_at_recall_0.90 all 0.5059
iprec_at_recall_1.00 all 0.5059
So my precision is not changing as a function of recall thresholds. Does this necessarily imply a problem with my data?

iprec_at_recall_X is the measure for interpolated precision at standard recall level X. The particular rule used to interpolate precision at standard recall level X in trec_eval is to use the maximum precision obtained for the query for any actual recall level greater than or equal to X (this is how there can be a precision value for recall level 0). You can read more about how trec_eval computes measures in the appendix to (some of) the TREC proceedings, for example, see https://trec.nist.gov/pubs/trec20/appendices/measures.pdf .
So, my guess is that you are using a very small collection (or, at least, one with very few relevant documents) such that you reach 100% recall very early in your ranked list.
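For intuition, here is a minimal Python sketch of that interpolation rule (an illustration only, not trec_eval's actual C implementation; the function name and boolean-list input format are made up for the example):

def iprec_at_recall(ranked_rel, num_rel):
    # ranked_rel: booleans, True where the ranked doc is relevant
    # num_rel: total number of relevant documents for the query
    points, rel_so_far = [], 0
    for k, is_rel in enumerate(ranked_rel, start=1):
        rel_so_far += is_rel
        points.append((rel_so_far / num_rel, rel_so_far / k))  # (recall, precision)
    # interpolation rule: max precision at any actual recall >= level
    levels = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    return {lvl: max((p for r, p in points if r >= lvl), default=0.0)
            for lvl in levels}

With only one or two relevant documents, full recall is reached very early in the ranking, so every level takes its maximum over (nearly) the same set of points and all eleven values coincide, just as in the output above.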
Ellen Voorhees
TREC project manager
NIST

What is NaN (Not a Number) in the words of a beginner? [closed]

I still do not understand what a NaN, a "number" which isn't a real number, exactly is.
Main question:
What is a NaN value or NaN exactly (in the words of a non-math professor)?
Furthermore, I have a few questions about the whole circumstance that are giving me trouble in understanding what a NaN should be. They are not necessary to answer my main question, but answers are desired:
What operations cause a NaN value as a result?
Why is the result of 0.0 / 0.0 declared as undefined? Shouldn't it be 0?
Why can't the result of any mathematical operation be expressed by a floating point or integer number? How can it be that a value is unrepresentable?
Why is the square root of a negative number not a real number?
Why is NaN not equivalent to indefinite?
I have not found any understandable explanation of what NaN is anywhere on the Internet, including here on Stack Overflow.
Anyway, I want to provide my research as links to the places I have already scanned for an understandable answer, even if some links go to the same question for other programming languages, because in total they did not give me the clear information I was after:
Wikipedia:
https://en.wikipedia.org/wiki/NaN
https://en.wikipedia.org/wiki/IEEE_754
Other:
http://foldoc.org/Not-a-Number
https://www.youtube.com/watch?v=HN_UmxIVS6M
https://www.youtube.com/watch?v=9EsHjXftO7s
Stack Overflow:
Similar or same questions for other languages (I provide them because I think the basis of understanding is very similar, if not the same):
In Java, what does NaN mean?
What is the rationale for all comparisons returning false for IEEE754 NaN values?
(Built-in) way in JavaScript to check if a string is a valid number
JavaScript: what is NaN, Object or primitive?
Not a Number (NaN)
Questions for C++:
What is difference between quiet NaN and signaling NaN?
Checking if a double (or float) is NaN in C++
Why does NaN - NaN == 0.0 with the Intel C++ Compiler?
What is the difference between IND and NAN numbers
Thank you for all helpful answers and comments.
You've asked a series of great questions here. Here's my attempt to address each of them.
What is a NaN value or NaN exactly (in the words of a non-math professor)?
Let's suppose you're working with real numbers - numbers like 1, π, e, -137, 6.626, etc. In the land of real numbers, there are some operations that usually can be performed, but sometimes don't have a defined result. For example, let's look at logarithms. You can take the logarithm of lots of real numbers: ln e = 1, for example, and ln 10 is about 2.3. However, mathematically, the log of a negative number isn't defined. That is, we can't take ln (-4) and get back a real number.
So now, let's jump to programming land. Imagine that you're writing a program that computes the logarithm of a number, and somehow the user asks you to take the logarithm of a negative number. What should happen?
There are lots of reasonable answers to this question. You could have the operation throw an exception, which is what some languages like Python do.
However, at the level of the hardware the decision that was made (by the folks who designed the IEEE-754 standard) was to give the programmer a second option. Rather than have the program crash, you can instead have the operation produce a value that means "you wanted me to do something impossible, so I'm reporting an error." The way this is done is by having the operation produce the special value NaN ("Not a Number"), indicating that, somewhere in your calculation, you tried to perform an operation that's mathematically not defined.
There are some advantages to this approach. In many scientific computing settings, the code performs a series of long calculations, periodically generating intermediate results that might be of interest. By having operations that aren't defined produce NaN as a result, the programmer can write code that just does the math as they want it to be done, then introduce specific spots in the code where they'll test whether the operation succeeded or not. From there, they can decide what to do. Contrast this with tripping an exception or crashing the program outright - that would mean the programmer either needs to guard every series of floating point operations that could fail or has to manually test things herself. It’s a judgment call about which option is better, which is why you can enable or disable the floating point NaN behavior.
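As a sketch of that workflow (a hypothetical Python example; the pipeline function and its inputs are invented for illustration), you let the arithmetic run uninterrupted and test for NaN only at a checkpoint of your choosing:

import math

def run_pipeline(xs):
    # do the whole chain of floating-point math without guarding each step
    total = sum(x * math.exp(-x) for x in xs)
    # test once, at a point we choose, whether anything went wrong
    if math.isnan(total):
        raise ValueError("a NaN appeared somewhere in the calculation")
    return total

run_pipeline([1.0, 2.0, 3.0])      # fine
run_pipeline([1.0, float("inf")])  # inf * exp(-inf) is inf * 0 -> NaN, so this raises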
What operations cause a NaN value as a result?
There are many ways to get a NaN result from an operation. Here's a sampler, though this isn't an exhaustive list:
Taking the log of a negative number.
Taking the square root of a negative number.
Subtracting infinity from infinity.
Performing any arithmetic operation on NaN.
There are, however, some operations that don't produce NaN even though they're mathematically undefined. For example, dividing a positive number by zero gives positive infinity as a result, even though this isn't mathematically defined. The reason for this is that if you take the limit of x / y for positive x as y approaches zero from the positive direction, the value grows without bound.
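Here is a quick Python demonstration of a few of these cases (Python floats follow IEEE-754 arithmetic, though its math module takes the exception route described earlier):

import math

inf = float("inf")

print(inf - inf)         # nan -- subtracting infinity from infinity
print(inf / inf)         # nan -- another indeterminate form
print(inf * 0.0)         # nan -- zero times infinity
print(float("nan") + 1)  # nan -- any arithmetic on NaN yields NaN

# Python's math module raises ValueError instead of returning NaN here:
# math.log(-4.0)   -> ValueError: math domain error
# math.sqrt(-1.0)  -> ValueError: math domain error
# And 1.0 / 0.0 raises ZeroDivisionError, where C/C++ would give +inf.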
Why is the result of 0.0 / 0.0 declared as undefined? Shouldn't it be 0?
This is more of a math question than anything else. This has to do with how limits work. Let's think about how to define 0 / 0. One option would be to say the following: if we look at the expression 0 / x and take the limit as x approaches zero, then we'd see 0 at each point, so the limit should be zero. On the other hand, if we look at the expression x / x and take the limit as x approaches 0, we'd see 1 at each point, so the limit should be one. This is problematic, since we'd like the value of 0 / 0 to be consistent with what you'd find as you evaluated either of these expressions, but we can't pick a fixed value that makes sense. As a result, the value of 0 / 0 gets evaluated as NaN, indicating that there's no clear value to assign here.
Why can't the result of any mathematical operation be expressed by a floating point or integer number? How can it be that a value is unrepresentable?
This has to do with the internals of IEEE-754 floating point numbers. Intuitively, this boils down to the simple fact that
there are infinitely many real numbers, infinitely many of which have infinitely long non-repeating decimals, but
your computer has finite memory.
As a result, storing an arbitrary real number might entail storing an infinitely long sequence of digits, which we can't do with our finite-memory computers. We therefore have floating point numbers store approximations of real numbers that aren't staggeringly huge, and the inability to represent values results from the fact that we're just storing approximations.
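A tiny Python illustration of those approximations at work: 0.1 and 0.2 have no exact binary floating-point representation, so even a simple sum carries rounding error.

print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False: both sides are approximations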
For more on how the numbers are actually stored, and what this means in practice, check out the legendary guide "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
Why is the square root of a negative number not a real number?
Let's take √(-1), for example. Imagine this is a real number x; that is, imagine that x = √(-1). The idea of a square root is that it's a number that, if multiplied by itself, gives you back the number you took the square root of.
So... what number is x? We know that x ≠ 0, because 0² = 0 isn't -1. We also know that x can't be positive, because any positive number times itself is a positive number. And we also know that x can't be negative, because any negative number times itself is positive.
We now have a problem. Whatever this x thing is, it would need to be not positive, not zero, and not negative. That means that it's not a real number.
You can generalize the real numbers to the complex numbers by introducing a number i where i² = -1. Note that no real numbers do this, for the reason given above.
Why is NaN not equivalent to indefinite?
There's a difference between "indefinite" and "whatever it is, it's not a real number." For example, 0 / 0 may be said to be indeterminate, because depending on how you approach 0 / 0 you might get back 0, or 1, or perhaps something else. On the other hand, √(-1) is perfectly well-defined as a complex number (assuming we have √(-1) give back i rather than -i), so the issue isn't "this is indeterminate" as much as "it's got a value, but that value isn't a real number."
Hope this helps!
For a summary you can have a look at the Wikipedia page:
In computing, NaN, standing for not a number, is a member of a numeric
data type that can be interpreted as a value that is undefined or
unrepresentable, especially in floating-point arithmetic. Systematic
use of NaNs was introduced by the IEEE 754 floating-point standard in
1985, along with the representation of other non-finite quantities
such as infinities.
On a practical side I would point out this:
If x or y is a NaN floating-point value, then expressions like
x<y
x<=y
x>y
x>=y
x==x
are always false. However,
x!=x
will be true, and this is a way to check whether x is NaN or not (see std::isnan).
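A short Python demonstration of those comparison rules, where math.isnan plays the role of C++'s std::isnan:

import math

x = float("nan")

print(x < 1.0, x > 1.0, x == x)  # False False False
print(x != x)                    # True: the classic NaN self-test
print(math.isnan(x))             # True: the readable way to check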
Another remark is that when NaNs arise in numerical computations you may observe a big slowdown (this can also be a hint when debugging):
NaN operations on Intel CPUs are likely to generate exceptions which
invoke microcode, so the relative slowdown probably varies greatly
with CPU model.
See NaN slowdown for instance
A floating point number is encoded as a pattern of bits, but not all available bit patterns (for a given number of bits) are used, so there are bit patterns that don't encode any floating point number. Such patterns are treated/displayed as NaNs.
Mathematical number systems contain a "set" of values. For example, the non-negative integers are 0, 1, 2, 3, 4, etc. The negative integers are -1, -2, -3, -4, etc. (perhaps -0 too, depending on your branch of mathematics).
In computerland, floating-point numbers additionally have concepts of "infinity" and "not a number", amongst other things. This is like "NULL" for numbers. It means "the floating-point value does not represent a number in the mathematical sense".
They're useful for programmers when they have a float that they don't want to give a number value [yet], and they're also used by the floating-point standards to represent "invalid" results of operations.
You can, for example, get a NaN by dividing zero by zero, an operation with no meaningful value in any branch of mathematics that I'm aware of: how do you share a number of cakes between no people?
(If you try to do this with integers, which have no concept of NaN or infinity, you instead get a [terribly-named] "floating point exception"; in other words, your program will crash.)
Read more on Wikipedia's article about NaN, which answers pretty much all of your questions.

Precision at k when fewer than k documents are retrieved

In information retrieval evaluation, what would precision#k be, if fewer than k documents are retrieved? Let's say only 5 documents were retrieved, of which 3 are relevant. Would the precision#10 be 3/10 or 3/5?
It can be hard to find text defining edge cases of measures like this, and the mathematical formulations often don't deal with the incompleteness of data. For issues like this, I tend to defer to the decisions made by trec_eval, a tool distributed by NIST that implements all common retrieval measures, especially those used in the Text REtrieval Conference (TREC) challenges.
Per the metric description in m_P.c of trec_eval 9.0 (called the latest on this page):
Precision measured at various doc level cutoffs in the ranking.
If the cutoff is larger than the number of docs retrieved, then
it is assumed nonrelevant docs fill in the rest. Eg, if a method
retrieves 15 docs of which 4 are relevant, then P20 is 0.2 (4/20).
Precision is a very nice user oriented measure, and a good comparison
number for a single topic, but it does not average well. For example,
P20 has very different expected characteristics if there are 300
total relevant docs for a topic as opposed to 10.
This means that you should always divide by k even if fewer than k documents were retrieved, so the precision would be 0.3 instead of 0.6 in your particular case (the system is punished for retrieving fewer than k).
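In code, trec_eval's convention amounts to the following (a minimal Python sketch; the function name and boolean-list input are assumptions made for illustration):

def precision_at_k(ranked_rel, k):
    # ranked_rel: booleans for the retrieved ranking, True = relevant;
    # positions beyond the retrieved list count as non-relevant
    return sum(ranked_rel[:k]) / k  # divide by k, not by len(ranked_rel)

print(precision_at_k([True, True, True, False, False], 10))  # 0.3, not 0.6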
The other tricky case is when there are fewer than k relevant documents. This is why they note that precision is a helpful measure but does not average well.
Some measures that are more robust to these issues are Normalized Discounted Cumulative Gain (NDCG), which compares the ranking to an ideal ranking (at a cutoff), and (simpler) R-precision, which calculates precision at R, the number of relevant documents, rather than at a fixed k. So one query may calculate P#15 for R=15, and another may calculate P#200 for R=200.
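R-precision is equally short to state in code (again a sketch with assumed names, matching the precision_at_k sketch above):

def r_precision(ranked_rel, num_rel):
    # precision at rank R, where R = num_rel, the number of relevant docs
    return sum(ranked_rel[:num_rel]) / num_rel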

Why is NA^0 = 1 in R? [duplicate]

Prompted by a spot of earlier code golfing, why would:
> NaN^0
[1] 1
It makes perfect sense for NA^0 to be 1 because NA is missing data, and any number raised to 0 will give 1, including -Inf and Inf. However NaN is supposed to represent not-a-number, so why would this be so? This is even more confusing/worrying when the help page for ?NaN states:
In R, basically all mathematical functions (including basic
Arithmetic), are supposed to work properly with +/- Inf and NaN as
input or output.
The basic rule should be that calls and relations with Infs really are
statements with a proper mathematical limit.
Computations involving NaN will return NaN or perhaps NA: which of
those two is not guaranteed and may depend on the R platform (since
compilers may re-order computations).
Is there a philosophical reason behind this, or is it just to do with how R represents these constants?
This comes from the references in the help page for ?'NaN':
"The IEC 60559 standard, also known as the ANSI/IEEE 754 Floating-Point Standard.
http://en.wikipedia.org/wiki/NaN."
And there you find this statement regarding what should create a NaN:
"There are three kinds of operations that can return NaN:[5]
Operations with a NaN as at least one operand.
It probably comes from the particular C compiler, as suggested by the Note you quoted. This is what the GNU C documentation says:
http://www.gnu.org/software/libc/manual/html_node/Infinity-and-NaN.html
" NaN, on the other hand, infects any calculation that involves it. Unless the calculation would produce the same result no matter what real value replaced NaN, the result is NaN."
So it seems that the GNU C people had a different standard in mind when writing their code. And the 2008 version of the ANSI/IEEE 754 Floating-Point Standard is reported to make that suggestion:
http://en.wikipedia.org/wiki/NaN#Function_definition
The published standard is not free, so if you have access rights or money you can look here:
http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=4610933
The answer can be summed up by "for historical reasons".
It seems that IEEE 754 introduced two different power functions, pow and powr, with the latter preserving NaNs in the OP's case and also returning NaN for Inf^0, 0^0, and 1^Inf, but eventually powr was dropped, as explained briefly here.
Conceptually, I'm in the NaN-preserving camp, because I come at the issue from the viewpoint of limits, but from a convenience point of view I expect the current conventions are slightly easier to deal with, even if they don't make a lot of sense in some cases (e.g. sqrt(-1)^0 being equal to 1 while all operations are on real numbers makes little sense, if any).
Yes, I'm late here, but as an R Core member who was involved in this design, let me recall what I commented above: NaN preserving and NA preserving work "equivalently" in R, so if you agree that NA^0 should give 1, NaN^0 |-> 1 is a consequence.
Indeed (as others said) you should really read R's help pages, and not the C or IEEE standards, to answer such questions, and SimonO101 correctly cited
1 ^ y and y ^ 0 are 1, always
and I'm pretty sure that I was heavily involved in (if not the author of) that.
Note that it is good, not bad, to be able to provide non-NaN answers, also in cases where other programming languages do differently.
The consequence of such a rule is that more things work automatically correctly;
in the other case, the R programmer would have been urged to do more special casing herself.
Or put differently, a simple rule as the above (returning non-NaN in all cases) is a good rule, because it propagates continuity in a mathematical sense: lim_x f(x) = f(lim x).
We have had a few cases where it was clearly advantageous (i.e. did not need special casing, I'm repeating..) to adhere to the above "= 1" rule, rather than to propagate NaN. As I said further up, the sqrt(-1)^0 is also such an example, as 1 is the correct result as soon as you extend to the complex plane.
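For what it's worth, CPython's float type (also built on IEEE-754) follows the same "1 ^ y and y ^ 0 are 1, always" convention, so the rule can be observed outside R too:

import math

nan = float("nan")

print(nan ** 0)            # 1.0 -- x ** 0 is 1 even when x is NaN
print(1.0 ** nan)          # 1.0 -- 1 ** y is 1 even when y is NaN
print(math.pow(nan, 0.0))  # 1.0 -- math.pow documents this behaviour
print(nan ** 1)            # nan -- otherwise NaN propagates as usual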
Here's one reasoning. From Goldberg:
In IEEE 754, NaNs are often represented as floating-point numbers with
the exponent e_max + 1 and nonzero significands.
So NaN is a floating-point number, though with a special meaning. Raising a number to the power zero sets its exponent to zero, therefore it will no longer be NaN.
Also note:
> 1^NaN
[1] 1
One is a number whose exponent is zero already.
Conceptually, the only problem with NaN^0 == 1 is that zero values can come about in at least four different ways, but the IEEE format uses the same representation for three of them. The equality makes sense for the most common case (which is one of the three), but not for the others.
BTW, the four cases I would recognize would be:
A literal zero
Unsigned zero: the difference between two numbers that are indistinguishable
Positive infinitesimal: The product or quotient of two numbers of matching sign, which is too small to be distinguished from zero.
Negative infinitesimal: The product or quotient of two numbers of opposite sign, which is too small to be distinguished from zero.
Some of these may be produced via other means (e.g. literal zero could be produced as the sum of two literal zeros; positive infinitesimal by the division of a very small number by a very large one, etc.).
If a floating-point format recognized the above, it could usefully regard raising NaN to a literal zero as yielding one, and raising it to any other kind of zero as yielding NaN; such a rule would allow a constant result to be assumed in many cases where something that might be NaN is raised to something the compiler can identify as a constant zero, without that assumption altering program semantics. Otherwise, I think the issue is that most code isn't going to care whether x^0 might be NaN if x is NaN, and there's not much point in having a compiler add code for conditions the code isn't going to care about. Note that the issue isn't just the code to compute x^0, but also any computations based on it that would be constant if x^0 were.
If you look at the type of NaN, it is still a number; it's just not a specific number that can be represented by the numeric type.
EDIT:
For example, take 0/0. What is the result? If you tried to solve this equation on paper, you'd get stuck at the very first digit: how many zeros fit into another 0? You could put 0, you could put 1, you could put 8; they all fit into 0*x=0, but it's impossible to know which one is the correct answer. However, that does not mean the answer is no longer a number; it's just not a number that can be represented.
Regardless, any number, even a number that you can't represent, to the power of zero is still 1. If you break down the math, x^8 * x^0 can be simplified to x^(8+0), which equates to x^8. Where did the x^0 go? It makes sense only if x^0 = 1, because then the equation x^8 * 1 explains why x^0 just sort of disappears from existence.

What is the difference between 'precision' and 'accuracy'?

What is the difference between 'accurate' and 'precise' ?
If there is a difference, can you give an example of
a number that is accurate but not precise
a number that is precise but not accurate
a number that is both accurate and precise
Thanks!
Precision refers to how much information is conveyed by a number (in terms of number of digits) whereas accuracy is a measure of "correctness".
Let's take the π approximation 22/7, for our purposes, 3.142857143.
For your specific questions:
a number that is accurate but not precise: 3.14. That's certainly accurate in terms of closeness, given the precision available. There is no other number with three significant digits that is closer to the target (both 3.13 and 3.15 are further away from the real value).
a number that is precise but not accurate: 99999.12345678901234567890. That's much more precise since it conveys more information. Unfortunately its accuracy is way off since it's nowhere near the target value.
a number that is both accurate and precise: 3.142857143. You can get more precise (by tacking zeros on the end) but no more accurate.
Of course, that's if the target number is actually 3.142857143. If it's 22/7, then you can get more accurate and precise, since 3.142857143 * 7 = 22.000000001. The actual decimal number for that fraction is an infinitely repeating one (in base 10):
3 . 142857 142857 142857 142857 142857 ...
and so on, so you can keep adding precision and accuracy in that representation by continuing to repeat that group of six digits. Or, you can maximise both by just using 22/7.
One way to think of it is this:
A number that is "precise" has a lot of digits. But it might not be very correct.
A number that is "accurate" is correct, but may not have a lot of digits.
Examples:
3.14 is an "accurate" approximation to Pi. But it is not very precise.
3.13198408654198 is a very "precise" approximation to Pi, but it is not accurate.
3.14159265358979 is both accurate and precise.
So precision gives a lot of information. But says nothing about how correct it is.
Accuracy says how correct the information is, but says nothing about how much information there is.
Assume the exact time right now is 13:01:03.1234
Accurate but not precise - it's 13:00 +/- 0:05
Precise but not accurate - it's 13:15:01.1425
Accurate and precise - it's 13:01:03.1234
The standard example I always heard involved a dart board:
accurate but not precise: lots of darts scattered evenly all over the dart board
precise but not accurate: lots of darts concentrated in one spot of the dart board, that is not the bull's eye
both: lots of darts concentrated in the bull's eye
Accuracy is about getting the right answer. Precision is about repeatedly getting the same answer.
Precision and accuracy are defined in terms of significant digits. Accuracy is defined by the number of significant digits, while precision is identified by the place value of the last significant digit. For instance, the number 1234 is more accurate than 0.123 because 1234 has more significant digits. The number 0.123 is more precise because the 3 (its last significant figure) is in the thousandths place. Both notions are typically only relevant when the digits are the result of a measurement. For instance, you can have an exact decimal number such as 0.123, defined as 123/1000; then the discussion of precision has no real meaning, because 0.123 was given or defined. However, if you were to measure something and come up with that value, then 0.123 indicates the precision of the tool used to measure it.
The real confusion occurs when combining these numbers, such as adding, subtracting, multiplying and dividing. For example, when adding two numbers that are the result of a measurement, the answer can only be as precise as the least precise number. Think of it as a chain being only as strong as its weakest link.
Accuracy is very often confused with precision, but they are quite different.
Accuracy is the degree to which the measured value agrees with the true value.
Example: our objective is to make a rod of 25 mm, and we are able to make it 25 mm; then it is accurate.
Precision is the repeatability of the measuring process.
Example: our objective is to make 10 rods of 25 mm, and we make all rods 24 mm; then we are precise, since we make all rods the same size, but not accurate, since the true value is 25 mm.

Oh no Another BigO one

I've been doing Big-O recently, and I understand the formula OK, but I've written a piece of code that takes an input and returns the time taken to complete a sort. So I have the input size and the time; how do I use these to classify what sort of Big-O it is? I've made graphs and can see which sort they are by comparing, but I can't do it using the formula. I'm not strong on maths, which I think is my problem here!
For instance I get:
Size Time Operations
200 2 163648
400 1 162240
800 15 2489456
1600 6 10247376
3200 19 40858160
6400 79 165383984
12800 318 656588080
25600 1274 2624318128
51200 5059 10476803408
102400 20333 41969291968
I know that this is O(n^2) by looking at the graph and comparing, but how do I prove it?
Yes, you can sample a thousand different input sizes, and then try to derive a Big-O value from that, but you shouldn't - not only because it doesn't actually prove anything, but because that isn't the point.
The way to prove O(n^2) is to prove it on the code itself, not through experiments. The actual running time isn't important, because Big-O notation doesn't say anything about that - in simple terms, it only specifies the dominant term of whatever formula you would use to calculate the exact running time, in the sense of the number of operations executed for that function. Constants are thrown away, and so are smaller terms - the actual running time of a function might be 1000n^2+1000000n, but that's still O(n^2).
You can't mathematically prove anything from this table; the complexity might be O(1) if Time remains at 20333 for all larger values.
The best you can do is try fitting several curves to this table and selecting the best fit according to Occam's razor.
You can't prove it by looking at the timings; you can only prove it by analysing the code to see how many steps are performed. The reason for this is that the time taken is a function not only of your program but of many other things outside of your control as well.
For example, who can say whether your machine didn't spend an inordinate amount of time in other processes during one particular test run of your program? This sort of thing can be minimised to a point using statistical methods, but the proof requires solid data.
What you can do is look at some of your data points to get support for the contention that it's O(n²). Have a look at the last four entries:
Input   Time
 12800    318
 25600   1274   1274 / 318 = 4.006
 51200   5059   5059 / 1274 = 3.971
102400  20333   20333 / 5059 = 4.019
You can see that each doubling of the input size has a multiplier effect on the time of about 4, which would tend to indicate an O(n²) property.
But this is support only. It applies only to that particular range of input values and, as stated, is subject to factors outside your control. Note also that the support would be harder to see if the time function were not a simple one. For example, if the time function were t = n²/10 + 123n + 123456789, it would be a little harder to figure out.
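To mechanise the ratio check above (as supporting evidence only, for the reasons already given), note that if t(n) ≈ c·n^p then t(2n)/t(n) ≈ 2^p, so the base-2 logarithm of each ratio estimates the exponent p. A small Python sketch using the question's last four measurements:

import math

samples = [(12800, 318), (25600, 1274), (51200, 5059), (102400, 20333)]

for (n1, t1), (n2, t2) in zip(samples, samples[1:]):
    ratio = t2 / t1
    print(f"{n1:>6} -> {n2:>6}: time ratio {ratio:.3f}, "
          f"exponent estimate {math.log2(ratio):.2f}")

# prints exponent estimates of about 2.0 each time, consistent with O(n^2)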
Just making a comparison between the values may not make any sense. However, if you plot a graph using these values (x-axis: input, y-axis: time), you will get a curve, a straight line, or some other shape. Using this information, you can predict the Big-O value of that function. Of course there may be (though not always) some interrupts that affect the running of the process, but they do not last for the whole period; such slight overhead cannot affect the result.
In order to predict the Big-O value, you will need some calculus knowledge in order to make the analogy between the shape and the Big-O result.
For example, let's say that you got a linear shape and you know that it means O(n). You reached that result because you know the shape of a linear function's graph and your graph looks like it. In order to reach a true proof, you would have to draw both your function's curve and the graph of the mathematical function whose shape is closest to it.
There are some other notations, like Big-Theta and little-omega, that bound your function from above or below. The mathematical function could be any of them, but as a result, your Big-O function is the one whose shape is closest.
