Find the width and density for each class. - r

So I have to find the density and the width for each of the following classes. I have a solution, but I am not sure whether it is correct, because some sources say upperLimit - lowerLimit = class width while others say it should be lowerLimit2 - lowerLimit1 = class width. So please have a look at my data and solution and tell me if I am doing it correctly, so I can proceed to find the density.
CLASS FREQUENCY
30.0-32.0 8
32.0-33.0 7
33.0-34.0 10
34.0-34.5 25
34.5-35.0 30
35.0-35.5 40
35.5-36.0 45
36.0-50.0 5
My Solution.
We first need to find the class boundaries. In this case, they are 30.0, 32.0, 33.0, 34.0, 34.5, 35.0, 35.5, 36.0 and 50.0. The class widths are therefore c2 - c1 (i.e., 32.0 - 30.0 = 2.0).
So the class widths should be --> 2.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5 and 14.0

Looks to me that you are doing it correctly -- the quantity you want in this case is the width of the bin, which is the distance from the lower bound to the upper bound.
More generally, what you need is the ordinary (Lebesgue) measure of the bin -- your density estimate is essentially comparing the observed mass (i.e. bin count) to the size of the bin. This generalizes your example to other cases in a natural way. The Lebesgue measure of an interval is just the length of the interval, so that's the right thing whether the intervals touch each other (as in your example) or they don't touch at the endpoints (more generally). Also, if you are working in two or more dimensions, the Lebesgue measure of the bin is its area or n-dimensional volume -- therefore in any number of dimensions, it's easy to know what you need to compute.
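As a rough sketch of the calculation, here are the widths and densities for the table in the question. "Density" is taken here to mean frequency divided by width; the variable names are mine, and the normalised version (dividing by the total count as well) is included only in case that is the definition your course uses.

```python
# Class boundaries and frequencies copied from the table in the question.
boundaries = [30.0, 32.0, 33.0, 34.0, 34.5, 35.0, 35.5, 36.0, 50.0]
frequencies = [8, 7, 10, 25, 30, 40, 45, 5]

# Width of each class = upper boundary - lower boundary.
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]

# Frequency density: count per unit of width.
freq_density = [f / w for f, w in zip(frequencies, widths)]

# Normalised density (assumption: the histogram should have total area 1).
total = sum(frequencies)
prob_density = [f / (total * w) for f, w in zip(frequencies, widths)]

for lo, hi, w, d in zip(boundaries, boundaries[1:], widths, freq_density):
    print(f"{lo}-{hi}: width={w}, frequency density={d}")
```

Because the classes touch each other, both formulas from the question give the same widths here.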

Related

How to construct a linear map from floats in [0,1] to integers in [0,255]?

I need to convert a 32 bit floating point value x in the range [0,1] to an 8 bit unsigned integer y in the range [0,255].
A formula I found in some C code is : y = (uint8)(255.99998f*x).
This provides the required conversion, but there is a problem with it.
Conversion of 0.75 yield 191, and conversion of 0.25 yields 63. While 0.75+0.25 = 1, 191+63 = 254 and not the desired 255.
Same problem with 0.5 that is converted into 127. 0.5 + 0.5 = 1 and 127+127= 254 instead of 255.
There is thus a rounding error.
Can this be avoided ? If yes, how ?
You will not be able to represent the closed segment [0.0, 1.0] in an accurate way into the segment [0,255]. The most evident problem is that 0.5 + 0.5 = 1.0 . So if 1.0 is represented by 255, 0.5 cannot be exactly represented.
The real problem is that 32-bit floating point numbers are represented in IEEE 754 binary32 format. So you will find a native injection from the [0.0, 1.0[ semi open segment into the [0,255] one by taking the most representative bits of the binary representation (conveniently shifted) and accepting that at the limit 1.0 would be represented as 256.
Then all fractions where the denominator is a power of 2 are exactly represented: 0.5 is 128, 0.25 is 64, and 0.75 is 192 but trying to nicely map [0.0, 1.0] to [0, 255] is close to finding a nice relation from [0,256] (257 values) into [0,255]...
"Same problem with 0.5 that is converted into 127. 0.5 + 0.5 = 1 and 127+127= 254 instead of 255."
No mapping can satisfy this requirement since 255/2 is not representable as an integer. You have to decide what this mapping means to you and what properties it requires, but no mapping to integers can satisfy this.
If you choose a floor mapping as you've shown in your question, then 0.5f->127, in which case your algorithm or program might interpret this to define the range of [0-127] with 128 elements - exactly half of the 256 elements in [0-255], since the remaining range [128-255] also has 128 elements.
If, however, you choose an analytical mapping like
y = round(255*x);
this provides the most accurate numerical value - the value of the output will always be the closest integer to the input value. For a value of 0.5f, this produces 128, which is exactly half of the number of bins in the output range. In this case your algorithm might interpret this as the number of elements in the range which is half of the input range. It's really up to you to design the algorithm and interpretation of the mapping around the limitations imposed by discarding the resolution of a 32-bit float.
Ultimately, [0.0-1.0] is about measuring something and [0-255] is about counting something... only you know what you're measuring and what you're counting so we can't really make this decision for you.
If your application is one which is measurement-like, then round(255*x) will produce the closest integer to the scaled input float - a value of 0.0039062, for example, scales to about 0.996, which is within about 0.4% of a perfect map to 1, and will map to 1.
If your application is one which is counting-like, and you are more interested in equally binning the float values, then a floor mapping (like your original suggestion) will map an equal range of the input to each bin. Using the round equation will leave the 0 bin and the 255 bin mapped to half the range of the rest of the bins. Using a floor mapping produces an equal distribution of the input range to the output bins, but sacrifices numerical precision. The above example of a value of 0.0039062, for example, would map to 0 in this case, even though it's 99.99% of the value you would consider to be 1.
It's entirely up to you to determine which mapping makes sense for your specific application.
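To make the two choices concrete, here is a small sketch of both mappings. The function names are mine, and Python floats are 64-bit rather than 32-bit, so treat it as an illustration of the behaviour rather than a bit-exact copy of the C code.

```python
import math

def to_byte_floor(x):
    """Floor-style mapping from the question: equal-width input bins."""
    return min(255, int(255.99998 * x))

def to_byte_round(x):
    """Rounding mapping: nearest integer to 255*x (halves round up)."""
    return min(255, int(math.floor(255.0 * x + 0.5)))

for x in (0.0, 0.0039062, 0.25, 0.5, 0.75, 1.0):
    print(x, to_byte_floor(x), to_byte_round(x))
```

With the floor mapping 0.25 and 0.75 give 63 and 191, which sum to 254. With the rounding mapping they give 64 and 191, which sum to 255, but 0.5 + 0.5 still gives 128 + 128 = 256, so neither mapping preserves sums in general.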

Precision--recall curves in image retrieval domain

I am working on a loop-closure detection problem in two different seasons, e.g., summer and fall. I need to make precision-recall curves. Suppose I have taken 500 images from summer and 500 images from fall. I have the distance matrix.
But I am totally confused about how to make precision-recall curves. For each image from one season, I will get the 500 nearest images in ascending (distance) order. I know the definition of precision and recall, but I can't get close to the solution of this problem. Looking forward to any kind of help, comments or advice. Thanks in advance.
In precision-recall plots each point is a pair of precision and recall values. In your case, I guess, you'd need to compute those values for each image and then average them.
Imagine you have 1000 images in total and only 100 images that belong to summer. If you take 500 closest images to some "summer" image, precision in the best case (when the first images always belong to the class) would be:
precision(summer) = 100 / (100 + 400) = (retrieved summer images) / (retrieved summer images + other retrieved images) = 0.2
And recall:
recall(summer) = 100 / (100 + 0) = (retrieved summer images) / (retrieved summer images + not retrieved summer images) = 1
As you can see, it has high recall because all the summer images were retrieved, but low precision, because there are only 100 images, and 400 other images don't belong to the class.
Now, if you take the first 100 images instead of 500, both recall and precision would equal 1.
If you take 50 first images, then precision would be still 1, but recall would drop to 0.5.
So, by varying the number of images you can get points for the precision-recall curve. For the above-described example these points would be (0.2, 1), (1, 1), (1, 0.5).
You could compute these values for each of the 1000 images using different thresholds.
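Here is a minimal sketch of that calculation for one query image. The ranking is the made-up best case from the example above (100 relevant images ranked first out of 1000), and the function name is mine; in practice the ranking would come from sorting one row of your distance matrix.

```python
def precision_recall_at_k(ranked_relevance, total_relevant, k):
    """Precision and recall when only the k closest images are retrieved.

    ranked_relevance: list of booleans, ordered by increasing distance,
    True where the retrieved image belongs to the class of the query.
    """
    hits = sum(ranked_relevance[:k])
    precision = hits / k
    recall = hits / total_relevant
    return precision, recall

# Best-case ranking from the example: all 100 relevant images come first.
ranking = [True] * 100 + [False] * 900

for k in (500, 100, 50):
    print(k, precision_recall_at_k(ranking, 100, k))
```

Sweeping k over the whole ranking gives one curve per query image; averaging precision at each recall level over all queries gives a single curve.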

to determine if an azimuth is between two given azimuths

Azimuth is the angle a line makes with the North pole/axis. It can vary from 0 to 360 degrees if the line is rotated in a circular path. Let's say we have two such azimuths, alpha and beta. We wish to determine if another azimuth, say gamma, falls between the two azimuths alpha and beta.
Can someone please help me out with a simple algorithm or formula to be used in Excel to determine if the line corresponding to gamma is between the two lines corresponding to alpha and beta? Gamma can assume different values.
Thanks
Gamma is between two lines corresponding to alpha and beta when both expressions:
ag = atan2(cos(a)*sin(g)-sin(a)*cos(g), cos(a)*cos(g)+sin(a)*sin(g))
gb = atan2(cos(g)*sin(b)-sin(g)*cos(b), cos(g)*cos(b)+sin(g)*sin(b))
- have the same sign,
- (probably important - both values lie in range [0..Pi] or [-Pi..0]),
- and their sum is equal to
ab = atan2(cos(a)*sin(b)-sin(a)*cos(b), cos(a)*cos(b)+sin(a)*sin(b))
These expressions are angles between azimuths, taking into account possible angle wrapping around 360
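A sketch of this test in Python (the same functions exist in Excel as ATAN2, SIN, COS and RADIANS, but note that Excel's ATAN2 takes x first and y second). The function names and the tolerance are my own choices, and "between" here means on the shorter arc from alpha to beta.

```python
import math

def signed_diff(start_deg, end_deg):
    """Signed angle from start to end in radians, wrapped to [-pi, pi]."""
    s = math.radians(start_deg)
    e = math.radians(end_deg)
    return math.atan2(math.cos(s) * math.sin(e) - math.sin(s) * math.cos(e),
                      math.cos(s) * math.cos(e) + math.sin(s) * math.sin(e))

def is_between(alpha, beta, gamma, tol=1e-9):
    """True if gamma lies on the shorter arc between alpha and beta."""
    ag = signed_diff(alpha, gamma)
    gb = signed_diff(gamma, beta)
    ab = signed_diff(alpha, beta)
    same_sign = ag * gb >= 0
    # Compare with a tolerance because of floating point rounding.
    return same_sign and math.isclose(ag + gb, ab, abs_tol=tol)

print(is_between(350, 10, 0))    # gamma inside the arc that crosses north
print(is_between(350, 10, 180))  # gamma on the opposite side
```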

3D Trilateration using given distances of unknown fixed points

I am new to this forum and not a native english speaker, so please be nice! :)
Here is the challenge I face at the moment:
I want to calculate the (approximate) relative coordinates of yet unknown points in a 3D euclidean space based on a set of given distances between 2 points.
In my first approach I want to ignore possible multiple solutions, just taking the first one at random.
e.g.:
given set of distances: (I think it's creating a pyramid with a right-angled triangle as a base)
P1-P2-Distance
1-2-30
2-3-40
1-3-50
1-4-60
2-4-60
3-4-60
Step1:
Now, how do I calculate the relative coordinates for those points?
I figured that the first point goes to 0,0,0 so the second one is 30,0,0.
After that the third point can be calculated by finding the crossing of the 2 circles around points 1 and 2 with their distances to point 3 (50 and 40 respectively). How do I do that mathematically? (I took these simple numbers for an easy representation of the situation in my mind.) Although I do not know how to get to the answer in a correct mathematical way, the third point is at 30,40,0 (or 30,0,40, but I will ignore that).
But getting the fourth point is not as easy as that. I think I have to use 3 spheres and calculate their crossing to get the point, but how do I do that?
Step2:
After I figured out how to calculate this "simple" example I want to use more unknown points... For each point there is minimum 1 given distance to another point to "link" it to the others. If the coords can not be calculated because of its degrees of freedom I want to ignore all possibilities except one I choose randomly, but with respect to the known distances.
Step3:
Now the final stage should be this: Each measured distance is a bit incorrect due to real-life conditions. So if there is more than 1 distance for a given pair of points, the distances are averaged. But due to the imprecise distances there can be a difficulty when determining the exact (relative) location of a point. So I want to average the different possible locations to the "optimal" one.
Can you help me going through my challenge step by step?
You need to use trigonometry - specifically, the 'cosine rule'. This will give you the angles of the triangle, which lets you solve the 3rd and 4th points.
The rules states that
c^2 = a^2 + b^2 - 2abCosC
where a, b and c are the lengths of the sides, and C is the angle opposite side c.
In your case, we want the angle opposite the side 1-3, which is the angle at point 2 between the lines 2-1 and 2-3. It's going to be 90 degrees because you have the 3-4-5 triangle, but let's prove:
50^2 = 30^2 + 40^2 - 2*30*40*CosC
CosC = 0
C = 90 degrees
This is the angle between the lines (30,0,0)-(0,0,0) and (30,0,0)-point 3; starting at point 2, go perpendicular to the x axis for the length of side 2-3 (which is 40) and you'll get your third point (30,40,0).
Finding your 4th point is slightly trickier. The most straightforward algorithm that I can think of is to firstly find the (x,y) component of the point, and from there the z component is straightforward using Pythagoras'.
Consider that there is a point on the (x,y,0) plane which sits directly 'below' your point 4 - call this point 5. You can now create 3 right-angled triangles 1-5-4, 2-5-4, and 3-5-4.
You know the lengths of 1-4, 2-4 and 3-4. Because these are right triangles sharing the side 4-5, Pythagoras gives (1-5)^2 = (1-4)^2 - (4-5)^2, and likewise for 2-5 and 3-5. In your example 1-4 = 2-4 = 3-4 = 60, so 1-5 = 2-5 = 3-5, which means point 5 is equidistant from points 1, 2 and 3. Find the point 5 using trigonometric methods - the 'sine rule' will give you the angles between 1-2 & 1-5, 2-1 and 2-5 etc.
The 'sine rule' states that (in any triangle)
a / SinA = b / SinB = c / SinC
So for triangle 1-2-5, although you don't know lengths 1-5 and 2-5, you do know the ratio 1-5 : 2-5 (here 1 : 1). Similarly you know the ratios 2-5 : 3-5 and 1-5 : 3-5 in the other triangles.
I'll leave you to solve point 4. Once you have this point, you can easily solve the z component of 4 using pythagoras' - you'll have the sides 1-4, 1-5 and the length 4-5 will be the z component.
I'll initially assume you know the distances between all pairs of points.
As you say, you can choose one point (A) as the origin, orient a second point (B) along the x-axis, and place a third point (C) along the xy-plane. You can solve for the coordinates of C as follows:
given: distances ab, ac, bc
assume
A = (0,0)
B = (ab,0)
C = (x,y) <- solve for x and y, where:
ac^2 = (A-C)^2 = (0-x)^2 + (0-y)^2 = x^2 + y^2
bc^2 = (B-C)^2 = (ab-x)^2 + (0-y)^2 = ab^2 - 2*ab*x + x^2 + y^2
-> bc^2 - ac^2 = ab^2 - 2*ab*x
-> x = (ab^2 + ac^2 - bc^2)/(2*ab)
-> y = +/- sqrt(ac^2 - x^2)
For this to work accurately, you will want to avoid cases where the points {A,B,C} are in a straight line, or close to it.
Solving for additional points in 3-space is similar -- you can expand the Pythagorean formula for the distance, cancel the quadratic elements, and solve the resulting linear system. However, this does not directly help you with your steps 2 and 3...
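As a sketch of that linear step for a fourth point D, assuming all six distances are known and exact: subtracting the sphere equations pairwise removes the quadratic terms and leaves x and y of D, and z then comes from the distance to A. The function name and the choice of the positive roots are mine.

```python
import math

def place_four_points(ab, ac, bc, ad, bd, cd):
    """Coordinates for A, B, C, D from all pairwise distances.

    A is the origin, B lies on the x axis, C lies in the xy plane.
    The positive square roots are chosen arbitrarily (mirror images
    are equally valid). Noisy input can make a sqrt argument negative.
    """
    a = (0.0, 0.0, 0.0)
    b = (ab, 0.0, 0.0)
    cx = (ab**2 + ac**2 - bc**2) / (2 * ab)
    cy = math.sqrt(ac**2 - cx**2)
    c = (cx, cy, 0.0)
    # Subtract |D-A|^2 from |D-B|^2 to get x, then from |D-C|^2 to get y.
    dx = (ab**2 + ad**2 - bd**2) / (2 * ab)
    dy = (ad**2 - cd**2 + cx**2 + cy**2 - 2 * cx * dx) / (2 * cy)
    dz = math.sqrt(ad**2 - dx**2 - dy**2)
    return a, b, c, (dx, dy, dz)

# Distances from the question: 1-2=30, 1-3=50, 2-3=40, 1-4=2-4=3-4=60.
a, b, c, d = place_four_points(30, 50, 40, 60, 60, 60)
print(c, d)
```

For the numbers in the question this places point 3 at (30, 40, 0) and point 4 above the midpoint of side 1-3.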
Unfortunately, I don't know a well-behaved exact solution for steps 2 and 3, either. Your overall problem will generally be both over-constrained (due to conflicting noisy distances) and under-constrained (due to missing distances).
You could try an iterative solver: start with a random placement of all your points, compare the current distances with the given ones, and use that to adjust your points in such a way as to improve the match. This is an optimization technique, so I would look up books on numerical optimization.
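A very rough sketch of such an iterative solver, using plain gradient descent on the sum of squared distance errors. The step size, number of steps, starting range and function names are all arbitrary choices of mine; a real implementation would want a proper optimizer and some protection against local minima.

```python
import math
import random

def stress(points, distances):
    """Sum of squared differences between current and given distances."""
    return sum((math.dist(points[i], points[j]) - d) ** 2
               for (i, j), d in distances.items())

def relax(distances, n_points, steps=5000, lr=0.05, seed=0):
    """Start from random positions and nudge points to reduce the stress."""
    rng = random.Random(seed)
    points = [[rng.uniform(0.0, 60.0) for _ in range(3)]
              for _ in range(n_points)]
    initial = stress(points, distances)
    for _ in range(steps):
        grads = [[0.0] * 3 for _ in range(n_points)]
        for (i, j), d in distances.items():
            r = math.dist(points[i], points[j])
            if r == 0.0:
                continue  # coincident points: direction undefined, skip
            scale = 2.0 * (r - d) / r
            for k in range(3):
                g = scale * (points[i][k] - points[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for i in range(n_points):
            for k in range(3):
                points[i][k] -= lr * grads[i][k]
    return points, initial, stress(points, distances)

# Distances from the question, with points numbered from 0.
distances = {(0, 1): 30.0, (0, 2): 50.0, (1, 2): 40.0,
             (0, 3): 60.0, (1, 3): 60.0, (2, 3): 60.0}
points, initial_stress, final_stress = relax(distances, 4)
print(initial_stress, final_stress)
```

Missing distances are simply left out of the dictionary, and repeated noisy measurements can be averaged before they go in. The result is only determined up to rotation, translation and reflection.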
If you know the distances between the nodes (fixed part of the system) and the distances to the tag (mobile), you can use trilateration to find the x,y position.
I have done this using the Nanotron radio modules which have a ranging capability.

How to find the average of a set of bearings [duplicate]

Closed 11 years ago.
Possible Duplicate:
How do you calculate the average of a set of angles?
If I have a set of bearings, ranging from 1-360, how can I find the average? Usually to find the average one would add them all up and divide by the number of items. The problem here is that doing that in the case of [1, 359], 2 bearings, would result in 180, when in fact it should be 360. Any ideas?
Represent the angles as vectors with Norm=1 and average the sum.
x1 = {cos(a),sin(a)}
x2 = {cos(b),sin(b)}
(x1+x2)/2 = {(cos(a)+cos(b))/2,(sin(a)+sin(b))/2}
which means the angle for the mean is
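atan2(sin(a)+sin(b), cos(a)+cos(b))
Passing the two sums to atan2 as separate arguments avoids dividing by a denominator that is close to zero. Just beware that when both sums are close to zero (e.g. two opposite bearings) the mean direction is not well defined.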
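A short sketch of the vector average for any number of bearings in degrees. The function name and the cutoff for the "undefined" case are my own choices.

```python
import math

def mean_bearing(bearings_deg):
    """Direction of the summed unit vectors, in degrees in [0, 360]."""
    s = sum(math.sin(math.radians(b)) for b in bearings_deg)
    c = sum(math.cos(math.radians(b)) for b in bearings_deg)
    if math.hypot(s, c) < 1e-12:
        # The unit vectors cancel out, e.g. [90, 270].
        raise ValueError("mean direction is undefined for these bearings")
    return math.degrees(math.atan2(s, c)) % 360.0

print(mean_bearing([1, 359]))   # very close to 0 (equivalently 360)
print(mean_bearing([350, 20]))  # close to 5
```

Because of floating point rounding, the result for [1, 359] may print as a tiny number or as a value right next to 360; both are the same direction.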
It isn't clear from your question what you're trying to define the "average" to be... for directions on a circle there is no clear-cut obvious notion of average.
One interpretation is the value x that is the closest fit to the set of provided values, in the least-squares sense, where the distance between two bearings is defined as the smallest angle between them. Here is code to compute this average:
In[2]:= CircDist[a_, b_] := 180 - Mod[180 + a - b, 360]
In[6]:= Average[bearings_] :=
x /. NMinimize[
Sum[CircDist[x, bearings[[i]]]^2, {i, 1, Length[bearings]}],
x][[2]]
In[10]:= Average[{1, 359}]
Out[10]= -3.61294*10^-15
So what you want is the middle of two bearings - what happens if you have {90, 270}? Is the desired answer 0 or 180? This is something to consider.. also what's the middle of three bearings?
One thing you could do is:
Take the first two bearings in your set
Work out the difference between the two in either direction (i.e. [1, 359] would give 2 degrees in one direction, and 358 in the other)
If you want the desired angle to be the middle of the smaller of the two, take half of that difference and add it to the anti-clockwise-most bearing of the pair (i.e. 359)
Use this as the new bearing and the next (i.e. 3rd in set) as the other bearing, and repeat, until all are 'middled'.
Off the top of my head, I don't think this is going to be fair, it'll probably bias it in one direction though (i.e. maybe in preference of the later values in your set).
