Can someone help me with this formula? (Standard deviation) - math

So I just started to learn Java, but my prof. just gave us this wild formula which we have to translate into code. I can't figure out how to make this work. Can someone help me?
σ means standard deviation
µ means average
x means the array x
N means the number of values in x

The Σ (upper-case sigma) character simply means "the sum of".
So, for every data value, subtract the mean (average, in layman's terms) and square the result. Add all of those values together, divide it by the number of data values minus one, then take the square root of that.
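Written out, that description corresponds to the formula below. This is a reconstruction from the wording of this answer, since the formula image from the question is not reproduced here; the index i running from 1 to N is an assumption.

```latex
\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \mu\right)^2},
\qquad
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
```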
Pseudo-code for that would be something like below. First, a function for calculating mean:
def calcMean(collection):
    # Initialise for working out mean (sum / count).
    sum = 0, count = 0
    # Add every item to the sum and keep count.
    for item in collection:
        sum = sum + item
        count = count + 1
    # Avoid divide by zero, you choose what to do.
    if count == 0:
        handle empty collection somehow
    # Return the mean.
    return sum / count
Then using that to calculate the standard deviation:
def calcStdDev(collection):
    # Get mean of the collection, initialise accumulator and count.
    mean = calcMean(collection)
    accum = 0, count = 0
    for item in collection:
        # Accumulate each '(item-mean) squared' value and keep count.
        diff = item - mean
        accum = accum + diff * diff
        count = count + 1
    # Avoid divide by zero, you choose what to do.
    if count < 2:
        handle too-small collection somehow
    # Divide and square root for result.
    return sqrt(accum / (count - 1))
Now your job is to turn that pseudo-code into Java, something that should be a bit easier than turning the formula into Java.
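For checking a Java translation against known values, here is a minimal runnable sketch of the same two steps in Python. The names calc_mean and calc_std_dev, and the choice to raise ValueError for too-small inputs, are assumptions made for illustration; the pseudo-code leaves that choice open.

```python
import math

def calc_mean(values):
    """Return the arithmetic mean; raising on empty input is an assumed policy."""
    if len(values) == 0:
        raise ValueError("mean of an empty collection is undefined")
    return sum(values) / len(values)

def calc_std_dev(values):
    """Sample standard deviation: divide by N - 1, then take the square root."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = calc_mean(values)
    accum = 0.0
    for item in values:
        diff = item - mean
        accum += diff * diff  # accumulate each (item - mean) squared
    return math.sqrt(accum / (len(values) - 1))

print(calc_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))
```

For this sample the mean is 5 and the squared differences add up to 32, so the result should equal sqrt(32 / 7).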

Related

Mathematics, combinatorics. What is the maximum number of variations?

There's this line.
X1_X2_X3_X4_X5_X6
It is known that each variable X* can take values from 0 to 100. The sum of all X* variables is always equal to 100. How many possible string variants can be created?
Suppose F(n,s) is the number of strings with n variables, and the variables sum to s, where each variable is between 0 and 100, and suppose s<=100. You want F(6,100).
Clearly
F(1,s) = 1
If the first variable is t, then it can be followed by strings of n-1 variables that sum to s-t. Thus
F(n,s) = Sum{ 0<=t<=s | F(n-1, s-t) }
So it's easy to write a wee function to compute the answer.
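A minimal sketch of such a function, following the recurrence directly. The name count_strings and the use of memoisation via functools.lru_cache are illustrative choices, not part of the answer.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_strings(n, s):
    """F(n, s): ways to pick n values, each 0..100, summing to s (s <= 100)."""
    if n == 1:
        return 1  # the single variable must equal s
    # The first variable is t; the remaining n - 1 variables must sum to s - t.
    return sum(count_strings(n - 1, s - t) for t in range(s + 1))

print(count_strings(6, 100))
```

The result can be cross-checked against the stars-and-bars closed form C(s + n - 1, n - 1), which for n = 6 and s = 100 is C(105, 5) = 96560646.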

Statistical probability of N contiguous true-bits in a sequence of bits?

Let's assume I have an N-bit stream of generated bits. (In my case 64 kilobits.)
What's the probability of finding a sequence of X "all true" bits contained within a stream of N bits, where X = (2 to 16), N = (16 to 1000000), and X < N?
For example:
If N=16 and X=5, what's the likelihood of finding 11111 within a 16-bit number?
Like this pseudo-code:
int N = 1<<16; // (64KB)
int X = 5;
int Count = 0;
for (int i = 0; i < N; i++) {
    int ThisCount = ContiguousBitsDiscovered(i, X);
    Count += ThisCount;
}
return Count;
That is, if we ran an integer in a loop from 0 to 64K-1... how many times would 11111 appear within those numbers.
Extra rule: 1111110000000000 doesn't count, because it has 6 true values in a row, not 5. So:
1111110000000000 = 0x // because it's 6 contiguous true bits, not 5.
1111100000000000 = 1x
0111110000000000 = 1x
0011111000000000 = 1x
1111101111100000 = 2x
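A minimal sketch of the per-number counting rule in these examples, written in Python. The name contiguous_bits_discovered mirrors the undefined helper in the pseudo-code, and the extra n_bits parameter is an assumption added so the width of the number is explicit.

```python
def contiguous_bits_discovered(value, n_bits, x):
    """Count maximal runs of exactly x set bits in the low n_bits of value."""
    count = 0
    run = 0
    for pos in range(n_bits):
        if (value >> pos) & 1:
            run += 1
        else:
            if run == x:
                count += 1  # a run of exactly x ones just ended
            run = 0
    if run == x:  # a run that touches the top bit
        count += 1
    return count

print(contiguous_bits_discovered(0b1111101111100000, 16, 5))  # prints 2
```

Runs longer than x are skipped because the counter is only checked when a run ends, which matches the "6 in a row doesn't count" rule.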
I'm trying to do some work involving physically-based random-number generation, and detecting "how random" the numbers are. That's what this is for.
...
This would be easy to solve if N were less than 32 or so, I could just "run a loop" from 0 to 4GB, then count how many contiguous bits were detected once the loop was completed. Then I could store the number and use it later.
Considering that X ranges from 2 to 16, I'd literally only need to store 15 numbers, each less than 32 bits! (if N=32)!
BUT in my case N = 65,536. So I'd need to run a loop, for 2^65,536 iterations. Basically impossible :)
No way to "experimentally calculate the values for a given X, if N = 65,536". So I need maths, basically.
Fix X and N, obviously with X < N. You have 2^N possible combinations of 0 and 1 in your bit number, and N-X+1 possible positions for a sequence of X ones (written 1*X; in this part I'm only looking for 1's together) contained in your bit number. Consider for example N = 5 and X = 2: 01011 is a possible valid bit number, so with the last two characters fixed (the last two 1's) you have 2^2 possible combinations for that 1*X sequence. Then you have two cases:
Border case: your 1*X sits at the border, so you have (2^(N-X-1))*2 possible combinations.
Inner case: you have (2^(N-X-2))*(N-X-1) possible combinations.
So the probability is (border + inner)/2^N.
Examples:
1) N = 3, X = 2: the probability is 2/2^3
2) N = 4, X = 2: the probability is 5/16
A bit brute force, but I'd do something like this to avoid getting mired in statistics theory:
Multiply the probabilities (1 bit = 0.5, 2 bits = 0.5*0.5, etc) while looping
Keep track of each X and when you have the product of X bits, flip it and continue
Start with small example (N = 5, X=1 - 5) to make sure you get edge cases right, compare to brute force approach.
This can probably be expressed as something like Sum(Sum(0.5^x, x = 1 -> 16), n = 1 -> 65536), but edge cases need to be taken into account (i.e. 7 bits doesn't fit, discard probability), which gives me a bit of a headache. :-)
@Andrex's answer is plain wrong, as it counts some combinations several times.
For example, consider the case N=3, X=1. The combination 101 occurs with probability only 1/2^3, but the border calculation counts it twice: once as the sequence starting with 10 and once as the sequence ending with 01.
His calculation gives a (1+4)/8 probability, whereas there are only 4 unique sequences that contain at least one run of exactly one 1 (as opposed to cases such as 011):
001
010
100
101
and so the probability is 4/8.
To count the number of unique sequences you need to account for sequences in which the run can appear multiple times. This will happen as long as X is smaller than N/2. I'm not sure how to count them, though.
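One possible way to count each sequence exactly once is sketched below in Python. It is not from any of the answers here: it assumes "finding" means the stream contains at least one maximal run of exactly X ones, and it counts the strings that never complete such a run. The function names are hypothetical. A brute-force enumerator is included so small cases can be compared.

```python
from fractions import Fraction
from itertools import product

def has_exact_run(bits, x):
    """True if bits contains a maximal run of exactly x ones."""
    run = 0
    for b in bits:
        if b:
            run += 1
        else:
            if run == x:
                return True
            run = 0
    return run == x

def brute_force_count(n, x):
    """Enumerate all 2^n strings; only practical for small n."""
    return sum(has_exact_run(bits, x) for bits in product((0, 1), repeat=n))

def dp_count(n, x):
    """Same count, using about n * (x + 2) big-integer additions."""
    # State r = length of the trailing run of ones, capped at x + 1
    # (meaning "longer than x"). Only strings that have not yet completed
    # a run of exactly x ones are tracked.
    dp = [0] * (x + 2)
    dp[0] = 1
    for _ in range(n):
        nxt = [0] * (x + 2)
        for r, ways in enumerate(dp):
            nxt[min(r + 1, x + 1)] += ways  # append a 1
            if r != x:
                nxt[0] += ways  # append a 0 (r == x would complete a run)
        dp = nxt
    not_found = sum(ways for r, ways in enumerate(dp) if r != x)
    return 2 ** n - not_found

def probability(n, x):
    return Fraction(dp_count(n, x), 2 ** n)

print(brute_force_count(3, 1), dp_count(3, 1), probability(3, 1))
```

For N=3, X=1 both counts come out as 4, matching the four sequences listed above. Because Python integers are arbitrary precision, the same dp_count call can in principle be made with N = 65536, where enumeration is out of the question.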

Recursion Confusion - Summation Symbol

I have an assignment with this symbol on it: [Image of unfamiliar symbol]
Basically the question asks "Write a recursive Java method which, given a positive integer n, computes and returns the sum of the integers from 1 to n as follows".
I do not need any help on the recursion itself, I really just need to understand what that symbol means (Link Included), so I can answer the question properly.
My Question: What meaning does the symbol possess? What is my instructor expecting as a valid response?
NOTE: I do NOT want anyone to attempt to answer the actual assignment question. I ONLY want to understand what the symbol being used means and what should be returned in my recursion method.
It is the sigma symbol, which means take the sum from i = 1 to n,
so your output comes as 1 + 2 + 3 + ..... + n
That explains the left-hand side of the equation; the other sums work the same way.
It's a summation symbol.
The sum of each i starting from i = 1 to i == n equals the sum of each i starting from i = 1 to i == n/2 plus the sum of each i starting from i = n/2 + 1 to i == n
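The identity can be checked numerically with a short sketch. It is written in Python and deliberately uses plain loops rather than recursion, so it illustrates what the symbol means without being an answer to the assignment. The helper name sum_range is hypothetical, and reading n/2 as integer division is an assumption.

```python
def sum_range(lo, hi):
    """Sum of the integers lo..hi inclusive (zero when lo > hi)."""
    total = 0
    for i in range(lo, hi + 1):
        total += i
    return total

n = 10
half = n // 2  # integer division is an assumption about how n/2 is meant
left_side = sum_range(1, n)                                # 1 + 2 + ... + n
right_side = sum_range(1, half) + sum_range(half + 1, n)   # the two half sums
print(left_side, right_side)  # prints 55 55
```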

counting number of arithmetic progressions in an array

My previous question was unclear, so I am restating it in clearer terms.
I need an efficient algorithm to count the number of arithmetic progressions (APs) in a series. Each AP must contain more than 2 elements.
E.g. if the series is {1,2,2,3,4,4}, then the different solutions are listed below (as index numbers):
0,1,3
0,2,3
0,1,3,4
0,1,3,5
0,2,3,4
0,2,3,5
hence the answer should be 6
I am not able to code it when these numbers become large and size of array increases. I need an efficient algorithm for this.
First of all, your answer is incorrect. Numbers 2,3,4 (indexes also 2,3,4) form an AP.
Second, here is a simple brute force algorithm:
def find (vec,value,start):
    for i from start to length(vec):
        if vec[i] == value:
            return i
    return None
for i from 0 to length(vec) - 2:
    for j from i + 1 to length(vec) - 1:
        next = 2 * vec[j] - vec[i] # the next element in the AP
        pos = find(vec,next,j+1)
        if pos is None:
            continue
        print "found AP:\n %d\n %d\n %d" % (i,j,pos)
        prev = vec[j]
        here = next
        until (pos = find(vec,next = 2*here-prev,pos+1)) is None:
            print ' '+str(pos)
            prev = here
            here = next
I don't think you can do better than this O(n^4) because the total number of APs to be printed is O(n^4) (consider a vector of zeros).
If, on the other hand, you want to only print maximal APs, i.e., APs which are not contained in any other AP, then the problem becomes much more interesting...
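If only the count is needed rather than a printed list, a dynamic-programming sketch like the one below avoids enumerating every AP. It is a different technique from the brute force in this answer. It assumes an AP is any index subsequence of length 3 or more with a constant difference, including a difference of zero. The function name is hypothetical.

```python
from collections import defaultdict

def count_arithmetic_progressions(values):
    """Count index subsequences of length >= 3 with a constant difference."""
    total = 0
    # ending[j][d] = number of progressions of length >= 2 that end at
    # index j with common difference d.
    ending = [defaultdict(int) for _ in values]
    for j in range(len(values)):
        for i in range(j):
            d = values[j] - values[i]
            longer = ending[i].get(d, 0)  # progressions ending at i that j extends
            total += longer               # each extension has length >= 3
            ending[j][d] += longer + 1    # +1 for the new pair (i, j)
    return total

print(count_arithmetic_progressions([1, 2, 2, 3, 4, 4]))  # prints 10
```

For the series in the question this gives 10: the six index lists from the question plus (1,3,4), (1,3,5), (2,3,4) and (2,3,5). The loop structure is quadratic in the length of the array.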

Number of Zero-crossings - Equation

I have written an algorithm that calculates the number of zero-crossings within a signal. By this, I mean the number of times a value changes from + to - and vice-versa.
The algorithm is explained like this:
If there are the following elements:
v1 = {90, -4, -3, 1, 3}
Then you multiply each value by the value next to it (Vi * Vi+1).
Then, taking the sign of the product with signum(val), determine whether it is positive or negative. Example:
e1 = {90 * -4} = -360 -> signum(e1) = -1
e2 = {-4 * -3} = 12 -> signum(e2) = 1
e3 = {-3 * 1} = -3 -> signum(e3) = -1
e4 = {1 * 3} = 3 -> signum(e4) = 1
Therefore the total number of sign changes is 2.
Now I want to put this formula/algorithm into an equation so that I can present it.
I have asked a similar question, but got really confused, so I went away, thought about it, and came up with what I think the equation should look like. It's probably wrong, well, laughably wrong. But here it is:
Now the logic behind it:
I pass in a V (val)
I get the absolute value of the summation of the signum from calculating (Vi * Vi+1). The signum(Vi * Vi+1) should produce values of -1, 1, ...
A term counts if and only if its value is -1 (because I'm only interested in the number of times zero is crossed).
Does this look correct, if not, can anyone suggest improvements?
Thank you :)!
EDIT:
Is this correct now?
You are doing the right thing here, but your equation is wrong simply because you only want to count the sign of the product of adjacent elements when it is negative. Don't sum the signs of the products, since positive products should be ignored; this is also what makes an explicit mathematical formula tricky. What you want is a function that takes 2 arguments and evaluates to 1 when their product is negative and 0 when it is non-negative:
f(x,y) = 1 if xy < 0
       = 0 otherwise
Then your number of crossing points is simply given by
sum(f(v1[i],v1[i+1])) for i = 0 to i = n-2
where n is the length of your vector/array v1 (using C-style, zero-based array indexing, so the last pair is v1[n-2], v1[n-1]). You also have to consider edge conditions such as 4 consecutive points {-1,0,0,1}: do you want to count this as one zero crossing or two? Only you can answer this based on the specifics of your problem, but whatever your answer, adjust your algorithm accordingly.
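A minimal sketch of that sum in Python, with the function name chosen for illustration. It implements only the strict "product is negative" rule, so any handling of zeros is left out on purpose.

```python
def count_zero_crossings(signal):
    """Count adjacent pairs whose product is negative (a strict sign change)."""
    crossings = 0
    for i in range(len(signal) - 1):  # i runs over every adjacent pair
        if signal[i] * signal[i + 1] < 0:
            crossings += 1
    return crossings

print(count_zero_crossings([90, -4, -3, 1, 3]))  # the v1 example, prints 2
```

Under this rule the edge case {-1,0,0,1} yields 0 crossings, because no adjacent product is negative. If that should count as one crossing, the zeros would need to be skipped or treated specially before comparing signs.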
