How to find the average number of trials before criteria are met with differing probabilities per outcome? - math

I've spent a few days trying to figure this out and looking up tutorials, but everything I've found so far seems close to what I need yet doesn't give the results I need.
I have a device that produces a single letter, A-F. For simplicity's sake, you can think of it like a die with letters. It will always produce one and only one letter each time it is used. However, it has one major difference: each letter can have a different known probability of being picked:
A: 25%
B: 5%
C: 20%
D: 15%
E: 20%
F: 15%
These probabilities remain constant throughout all attempts.
Additionally, I have a specific combination I must accrue before I am "successful":
As needed: 1
Bs needed: 3
Cs needed: 0
Ds needed: 1
Es needed: 2
Fs needed: 3
I need to find the average number of letter picks (i.e. rolls/trials/attempts) that have to happen for this combination of letters to be accrued. It's completely fine for any individual letter to come up more than its required number of times, but success is only counted once each letter has been picked at least its minimum number of times.
I've looked at plenty of tutorials on the multinomial probability distribution and similar topics, but I haven't found anything that explains how to find the average number of trials for a scenario like this. Please explain answers clearly, as I'm not a whiz with statistics.

In addition to Severin's answer, which looks logically sound to me but might be costly to evaluate (an infinite sum of factorial-laden multinomial terms), let me provide some intuition that should give a good approximation.
Consider one category at a time (see this Math StackExchange question/answer). The expected number of tosses in which you get the required k successes for category i can be calculated as k(i)/P(i):
Given,
p(A): 25% ; Expected number of tosses to get 1 A = 1/ 0.25 = 4
p(B): 5% ; Expected number of tosses to get 3 B's = 3/ 0.05 = 60
p(C): 20% ; Expected number of tosses to get 0 C = 0/ 0.20 = 0
p(D): 15% ; Expected number of tosses to get 1 D = 1/ 0.15 = 6.67 ~ 7
p(E): 20% ; Expected number of tosses to get 2 E's = 2/ 0.20 = 10
p(F): 15% ; Expected number of tosses to get 3 F's = 3/ 0.15 = 20
You get the idea: accruing 3 B's is your bottleneck, so you can expect on average about 60 tosses for your scenario to play out. (This is an approximation; the exact Markov-chain computation below gives ~61.3, slightly more, because occasionally one of the other requirements finishes last.)
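As a quick check of this estimate, here is a minimal sketch (my own, not part of the original answer):
probs  = [0.25, 0.05, 0.20, 0.15, 0.20, 0.15]
needed = [1, 3, 0, 1, 2, 3]

# expected tosses to satisfy each category on its own: k(i) / P(i)
per_category = [k / p for k, p in zip(needed, probs)]
print(per_category)       # [4.0, 60.0, 0.0, ~6.67, 10.0, 20.0]
print(max(per_category))  # 60.0 -- the three B's dominate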

Well, the minimum number of throws is 10. The average would be the infinite sum
avg = 10•P(done in 10) + 11•P(done in 11) + 12•P(done in 12) + ...
For P(done in 10) we can use the multinomial distribution:
P(10) = Pm(1,3,0,1,2,3 | probs), where probs = [.25, .05, .20, .15, .20, .15]
For P(11) you have one more throw, which you could distribute like this:
P(11) = Pm(2,3,0,1,2,3|probs) + Pm(1,4,0,1,2,3|probs) + Pm(1,3,0,2,2,3|probs) +
Pm(1,3,0,1,3,3|probs) + Pm(1,3,0,1,2,4|probs)
For P(12) you have to distribute 2 more throws. Note that there are combinations of throws which are impossible to get, like Pm(2,3,0,2,2,3|probs), because you would have had to stop earlier.
And so on and so forth
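If scipy is available, the leading term can be evaluated directly (my own sketch; Pm here is scipy.stats.multinomial, and only the exact-10-throw term is shown, since the later terms need the stopping correction described above):
from scipy.stats import multinomial

probs = [0.25, 0.05, 0.20, 0.15, 0.20, 0.15]

# P(done in exactly 10 throws): every requirement met with no spare throws
p10 = multinomial.pmf([1, 3, 0, 1, 2, 3], n=10, p=probs)
print(p10)  # ~3.2e-05 -- most runs take considerably longer than 10 throws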

Your process can be described as a Markov chain with a finite number of states, and an absorbing state.
The number of steps before reaching the absorbing state is called the hitting time. The expected hitting time can be calculated easily from the transition matrix of the Markov chain.
Enumerate all possible states (a, b, c, d, e, f). Only a finite number of states is needed, because "b >= 3" is effectively the same as "b = 3", etc. The total number of states is (1+1)*(3+1)*(0+1)*(1+1)*(2+1)*(3+1) = 192.
Make sure that in your enumeration, starting state (0, 0, 0, 0, 0, 0) comes first, with index 0, and absorbing state (1, 3, 0, 1, 2, 3) comes last.
Build the transition matrix P. It's a square matrix with one row and column per state. Entry P[i, j] in the matrix gives the probability of going from state i to state j when rolling a die. There should be at most 6 non-zero entries per row.
For example, if i is the index of state (1, 0, 0, 1, 2, 2) and j the index of state (1, 1, 0, 1, 2, 2), then P[i, j] = probability of rolling face B = 0.05. Another example: if i is the index of state (1,3,0,0,0,0), then P[i,i] = probability of rolling A, B or C = 0.25+0.05+0.2 = 0.5.
Call Q the square matrix obtained by removing the last row and last column of P.
Call I the identity matrix of the same dimensions as Q.
Compute matrix M = (I - Q)^-1, where ^-1 is matrix inversion.
In matrix M, the entry M[i, j] is the expected number of times that state j will be reached before the absorbing state, when starting from state i.
Since our experiment starts in state 0, we're particularly interested in row 0 of matrix M.
The sum of row 0 of matrix M is the expected total number of visits to transient states before absorption. Since every such visit corresponds to one die roll, that is exactly the answer we seek: the expected number of steps to reach the absorbing state.
To understand why this works, you should read a course on Markov chains! Perhaps this one: James Norris' course notes on Markov chains. The chapter about "hitting times" (which is the name for the number of steps before reaching target state) is chapter 1.3.
Below, an implementation in Python.
from itertools import product, accumulate
from operator import mul
import numpy as np

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def get_expected_n_trials(targets, dice_weights):
    states = list(product(*(range(n+1) for n in targets)))
    # stride of each face in the mixed-radix indexing of states
    base = list(accumulate([n+1 for n in targets[:0:-1]], mul, initial=1))[::-1]
    lookup = dict(map(reversed, enumerate(states)))
    P = np.zeros((len(states), len(states)))
    for i, s in enumerate(states):
        for f, p in enumerate(dice_weights):
            # j = index of state reached from state i when rolling face f
            j = i + base[f] * (s[f] < targets[f])
            j1 = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]
            assert j == j1  # two independent ways of computing j, as a sanity check
            P[i, j] += p
    Q = P[:-1, :-1]
    I = np.identity(len(states) - 1)
    M = np.linalg.inv(I - Q)
    return M[0, :].sum()

print(get_expected_n_trials(targets, dice_weights))
# 61.28361802372382
Explanations of code:
First we build the list of states using the Cartesian product itertools.product.
For a given state i and die face f, we need to calculate j, the index of the state reached from i when adding f. I have two ways of calculating that: either j = i + base[f] * (s[f] < targets[f]), or j = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]. Because I'm paranoid, I calculated it both ways and checked that they give the same result, but you only need one. You can remove the line j1 = ... and the assert if you want.
Matrix P begins filled with zeroes, and we fill up to six cells per row with P[i, j] += p where p is probability of rolling face f.
Then we compute matrices Q and M as I indicated above.
We return the sum of all the cells on the first row of M.
To help you better understand what is going on, I encourage you to examine the values of all variables. For instance you could replace return M[0, :].sum() with return states, base, lookup, P, Q, I, M and then write states, base, lookup, P, Q, I, M = get_expected_n_trials(targets, dice_weights) in the python interactive shell, so that you can look at the variables individually.

A Monte-Carlo simulation:
Actually roll the die until we hit the requirements;
Count how many rolls we did;
Repeat the experiment 1000 times to get the empirical average value.
Implementation in Python:
from collections import Counter
from random import choices
from itertools import accumulate
from statistics import mean, stdev

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def avg_n_trials(targets, dice_weights, n_experiments=1000):
    dice_faces = range(len(targets))
    target_state = Counter(dict(enumerate(targets)))
    cum_weights = list(accumulate(dice_weights))
    results = []
    for _ in range(n_experiments):
        state = Counter()
        while not state >= target_state:
            f = choices(dice_faces, cum_weights=cum_weights)[0]
            state[f] += 1
        results.append(state.total())  # python<3.10: sum(state.values())
    m = mean(results)
    s = stdev(results, xbar=m)
    return m, s

m, s = avg_n_trials(targets, dice_weights, n_experiments=10000)
print(m)
# 61.4044

Related

Squid game Episode 7 with simulation

Last night I watched episode 7 of the Squid Game TV series. The episode features a bridge game that can be modeled with a binomial distribution.
Specifically, there are 16 players and a bridge with 18 pairs of glass panels (one fragile and one safe panel in each pair). If a player happens to choose the fragile glass, it cannot hold the player's weight and breaks. The next player has the advantage of starting from the position the previous player reached and continuing the search. In the end, 3 players happened to cross the bridge.
So I was wondering: it is as if I have 16 euros in my pocket and I play heads or tails with p = 1/2, always betting on heads. If the coin flip is heads I lose nothing, and if it is tails I lose 1 euro. What is the probability of hitting heads 18 times (consecutive or not) and being left with 3 euros in my pocket?
I tried to simulate this problem in R:
squid_bridge = function(a, n, p) {
  players = a
  while (position > 0 & position < n) {
    jump = sample(c(0, 1), 1, prob = c(1 - p, p))
    position = position + jump
  }
  if (position == 0)
    return(1)
  else
    return(0)
}

n = 18
trials = 100000
a = 16
p = 1/2
set.seed(1)
simlist = replicate(trials, squid_bridge(a, n, p))
It does not seem to work. Any help?
Here is a Monte Carlo experiment in R returning the distribution of the number of failures.
apply(apply(matrix(rgeom(16*1e6,.5)+1,nc=16),1,cumsum)>18,1,mean)
# with details:
# rgeom(16*1e6,.5)+1 for 16 x 10⁶ geometric simulations where
#   the outcome is the number of attempts till "success",
#   "success" included
# apply(.,1,cumsum) for the number of steps till the 16th "success"
# >18 for counting the cases where a player manages to cross the bridge
# apply(.,1,mean) for finding the probability of each player crossing
This is not a binomial but a truncated negative binomial experiment, in that the number of new steps made by each player is a Geometric Geom(1/2) variate, unless the 18 steps have been completed. The average number of survivors is thus
sum(1-pnbinom(size=1:16,q=17:2,prob=.5))
#Explanation:
#pnbinom is the Negative Binomial cdf
#with size the number of "successes"
#q the integer at which the cdf is computed
#prob is the Negative Binomial probability parameter
#Because nbinom() is calibrated as the number of attempts
#before "success", rather than until "success", the value of
#q decreases by one for each player in the game
whose value is 7.000076, rather than 16-18/2=7!
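As a cross-check (my own sketch, not part of the original answer; it assumes scipy is available), scipy's nbinom uses the same calibration as R's pnbinom, counting failures before the size-th success:
from scipy.stats import nbinom

# player k crosses if the cumulative attempts up to the k-th "death"
# exceed 18, i.e. if the negative binomial failure count exceeds 18 - k
expected_survivors = sum(1 - nbinom.cdf(q, size, 0.5)
                         for size, q in zip(range(1, 17), range(17, 1, -1)))
print(expected_survivors)  # ~7.000076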
Here is how I think you can model the game in R. The first version is similar to what you have: there's a 50% chance of guessing correctly and if the guess is correct, the players advance a tile. Otherwise they do not, and the number of players decrements by 1. If the number of players reaches 0, or they advance to the end, the game ends. This is shown in squid_bridge1().
squid_bridge1 <- function(players, n, prob) {
  if (players == 0 | n == 18) {
    # All players have died or we have reached the end
    return(players)
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died
    return(squid_bridge1(players - 1, n, prob))
  }
  if (jump == 1 & n < 18) {
    # Player lives and advances 1 space
    return(squid_bridge1(players, n + 1, prob))
  }
}
However, this does not accurately depict the game since a wrong guess gives the remaining players additional information. If a player chooses wrong, the probability of the next guess being correct is not 50%, it's 100%. However, after that point the probability of a correct guess decreases to 50%. This can be accounted for with another argument to keep track of the correctness of the previous guess.
squid_bridge2 <- function(players, n, prob, previous) {
  if (players == 0 | n == 18) {
    # The game ends if there are no players or they have reached the end
    return(players)
  }
  if (previous == 0) {
    # The previous guess was wrong, but now the players know where to go next
    return(squid_bridge2(players, n + 1, prob, previous = 1))
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died
    return(squid_bridge2(players - 1, n, prob, previous = 0))
  }
  if (jump == 1 & n < 18) {
    # Move is correct. Advance 1 space
    return(squid_bridge2(players, n + 1, prob, previous = 1))
  }
}
However, there's a catch. It wasn't quite that simple in the show, and players fell for reasons other than an incorrect guess (being pushed, jumping on purpose, etc.). I don't know what a reasonable probability of doing something like this is, but it is likely low, let's say 10%.
not_stupid <- function() {
  x <- runif(1, 0, 1)
  if (x <= 0.1) {
    return(FALSE)
  } else {
    return(TRUE)
  }
}
Since emotions spike just before each move, we will test this prior to each move.
squid_bridge3 <- function(players, n, prob, previous) {
  if (players == 0 | n == 18) {
    # The game is over because there are no players left or they reached the end
    return(players)
  }
  if (previous == 0) {
    # The previous guess was wrong, but now the players know where to go next
    return(squid_bridge3(players, n + 1, prob, previous = 1))
  }
  if (!not_stupid()) {
    return(squid_bridge3(players - 1, n, prob, previous = 1))
  }
  jump <- rbinom(1, 1, prob)
  if (jump == 0) {
    # Player died because of either choosing wrong or a self-inflicted loss
    return(squid_bridge3(players - 1, n, prob, previous = 0))
  }
  if (jump == 1 & n < 18) {
    # Move is correct. Advance 1 space
    return(squid_bridge3(players, n + 1, prob, previous = 1))
  }
}
Then running some simulations:
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(123)
trials <- 10000
players <- 16
squid1 <- replicate(trials, squid_bridge1(players, 0, 0.5))
squid2 <- replicate(trials, squid_bridge2(players, 0, 0.5, 1))
squid3 <- replicate(trials, squid_bridge3(players, 0, 0.5, 1))

df <- tibble(squid1 = squid1,
             squid2 = squid2,
             squid3 = squid3) %>%
  pivot_longer(cols = c(squid1, squid2, squid3))

ggplot(data = df,
       aes(x = value)) +
  geom_histogram(bins = 10,
                 binwidth = 1,
                 fill = "cornflowerblue",
                 color = "black") +
  facet_wrap(~name,
             nrow = 3) +
  xlab("# of players to make it to the end") +
  scale_x_continuous(breaks = seq(0, 16, by = 1),
                     labels = seq(0, 16, by = 1))
As you can see below, the first situation is heavily skewed to the left. Since the players are essentially "blindly guessing" at each tile, it is unlikely that any will make it to the end. However, after accounting for the information gained from a wrong guess, it averages somewhere around 7 players making it. By adding in a random chance of falling for another reason, the distribution skews to the left some.
Average for first situation: 1.45
Average for second situation: 7.01
Average for third situation: 4.99
To answer the question of the probability of only 3 players making it, I get ~ 10.8% for the last case
Edit: As requested, here is the code to generate the plots. I also fixed the various functions that had some naming issues (went through a few different names when I made them). It looks like it resulted in a slight bug for the 3rd function, but I have fixed it throughout.
○ △ □
##########
# Game ○ △ □
##########
squidd7 <- function(N_Fields, N_Players, p_new_field) {
  Players <- data.frame(id = 1:N_Players, alive = rep(1, N_Players), Field = 0)
  for (i in 1:N_Players) {
    while (Players[i, "alive"] == TRUE && max(Players$Field) < N_Fields) {
      Players[i, "Field"] = Players[i, "Field"] + 1   # jump onto the next field
      Players[i, "alive"] = rbinom(1, 1, p_new_field) # fall or continue
    }
    Players[i + 1, "Field"] = Players[i, "Field"] # next player starts where the prior player died
  }
  Players <- Players[1:N_Players, ] # cosmetic, because of the i+1 in the prior line
  # Print some messages
  if (!any(Players$alive > 0)) {
    cat("Players lose!")
  } else {
    cat(" \n After", max(Players$Field), "goal was reached! ")
    cat("Players", Players[Players$alive == 1, "id"], "survive")
  }
  return(Players)
}

squidd7(18, 16, 0.5)
###########
# Simulation ○ △ □
###########
results <- data.frame(matrix(0, nrow = 100, ncol = 20))
for (x in 1:ncol(results)) {
  for (i in 1:nrow(results)) {
    Players <- squidd7(x + 7, 16, 0.5)
    results[i, x] <- sum(Players$alive)
  }
}
###########
## Results ○○□□○ △ □
sdt <- apply(results, 2, sd)   # standard deviation
mn  <- apply(results, 2, mean) # mean ○ △ □

boxplot(results, xlab = "n Steps", names = 8:27, ylab = "N Survivors of 16")
points(mn, type = "l")
points(sdt, type = "l")

colors <- colorRampPalette(c(rgb(0, 1, 0, 0.4),
                             rgb(1, 1, 0, 0.4),
                             rgb(1, 0, 0, 0.4)), alpha = TRUE)(20)

plot(density(results$X1), type = "n", xlim = c(-1, 17), ylim = c(0, 0.30),
     main = "○ △ □",
     sub = "○ △ □ ○ △ □ ○ △ □",
     xlab = "number of survivors")
for (i in 1:20) { # one density per simulated bridge length (20 columns)
  polygon(density(results[, i]), col = colors[i])
}
legend(15, 0.31, title = "Steps", legend = 8:27, fill = colors, border = NA,
       y.intersp = 0.5,
       cex = 0.8, text.font = 0.3)
Well, to simulate this game you need a 50/50 chance, 16 times. All you have to code is a 50% chance of not losing, run 16 times; each time you lose, subtract 1 from a variable that starts at 18. That will create a perfect digital recreation of the Squid Game bridge.
If I'm understanding correctly, I believe the other answers are complicating the simulation.
We can simulate draws from a binomial distribution of size 18. Every 1 kills someone (assuming there is anyone left to kill). Thus we can calculate the number of survivors by subtracting the number of 1s drawn from the number of players, truncated at 0 (any negative results are counted as 0, via pmax()).
set.seed(47)
n_sim = 1e4
survivors = pmax(0, 16 - rbinom(n_sim, size = 18, prob = 0.5))
mean(survivors)
# [1] 7.009
The mean seems to approach Xi'an's answer of 7.000076 as n_sim increases. At 500M simulations (which still runs rather quickly with this method!) we get a mean of 7.000073.
Plotting these results, they appear basically identical to cazman's squid2 scenario.
ggplot(data.frame(survivors), aes(x = survivors)) + geom_bar() +
scale_y_continuous(limits = c(0, 6000))

Is there a function f(n) that returns the n-th combination in an ordered list of combinations without repetition?

Combinations without repetitions look like this, when the number of elements to choose from (n) is 5 and elements chosen (r) is 3:
0 1 2
0 1 3
0 1 4
0 2 3
0 2 4
0 3 4
1 2 3
1 2 4
1 3 4
2 3 4
As n and r grow, the number of combinations gets large pretty quickly. For (n, r) = (200, 4) the number of combinations is 64684950.
It is easy to iterate the list with r nested for-loops, where the initial iterating value of each for-loop is greater than the current iterating value of the for-loop in which it is nested, as in this fiddle example:
https://dotnetfiddle.net/wHWK5o
What I would like is a function that calculates only one combination based on its index. Something like this:
tuple combination(i, n, r) {
    return [combination with index i, when the number of elements to choose from is n and elements chosen is r]
}
Does anyone know if this is doable?
You would first need to impose some sort of ordering on the set of all combinations available for a given n and r, such that a linear index makes sense. I suggest we agree to keep our combinations in increasing order (or, at least, the indices of the individual elements), as in your example. How then can we go from a linear index to a combination?
Let us first build some intuition for the problem. Suppose we have n = 5 (e.g. the set {0, 1, 2, 3, 4}) and r = 3. How many unique combinations are there in this case? The answer is of course 5-choose-3, which evaluates to 10. Since we will sort our combinations in increasing order, consider for a minute how many combinations remain once we have exhausted all those starting with 0. This must be 4-choose-3, or 4 in total. In such a case, if we are looking for the combination at index 7 initially, this implies we must subtract 10 - 4 = 6 and search for the combination at index 1 in the set {1, 2, 3, 4}. This process continues until we find a new index that is smaller than this offset.
Once this process concludes, we know the first digit. Then we only need to determine the remaining r - 1 digits! The algorithm thus takes shape as follows (in Python, but this should not be too difficult to translate),
from math import factorial

def choose(n, k):
    return factorial(n) // (factorial(k) * factorial(n - k))

def combination_at_idx(idx, elems, r):
    if len(elems) == r:
        # We are looking for r elements in a list of size r - thus, we need
        # each element.
        return elems
    if len(elems) == 0 or len(elems) < r:
        return []

    combinations = choose(len(elems), r)  # total number of combinations
    remains = choose(len(elems) - 1, r)   # combinations after selection
    offset = combinations - remains

    if idx >= offset:  # combination does not start with first element
        return combination_at_idx(idx - offset, elems[1:], r)

    # We now know the first element of the combination, but *not* yet the next
    # r - 1 elements. These need to be computed as well, again recursively.
    return [elems[0]] + combination_at_idx(idx, elems[1:], r - 1)
Test-driving this with your initial input,
N = 5
R = 3
for idx in range(choose(N, R)):
    print(idx, combination_at_idx(idx, list(range(N)), R))
I find,
0 [0, 1, 2]
1 [0, 1, 3]
2 [0, 1, 4]
3 [0, 2, 3]
4 [0, 2, 4]
5 [0, 3, 4]
6 [1, 2, 3]
7 [1, 2, 4]
8 [1, 3, 4]
9 [2, 3, 4]
Where the linear index is zero-based.
Start with the first element of the result. The value of that element depends on the number of combinations you can get with smaller elements. For each such smaller first element, the number of combinations with first element k is (n − k − 1) choose (r − 1), with potentially some off-by-one corrections. So you would sum over a bunch of binomial coefficients. Wolfram Alpha can help you compute such a sum, but the result still has a binomial coefficient in it. Solving for the largest k such that the sum doesn't exceed your given index i is a computation you can't do with something as simple as e.g. a square root. You need a loop to test possible values, e.g. like this:
def first_naive(i, n, r):
    """Find first element and index of first combination with that first element.

    Returns a tuple of value and index.

    Example: first_naive(8, 5, 3) returns (1, 6) because the combination with
    index 8 is [1, 3, 4] so it starts with 1, and because the first combination
    that starts with 1 is [1, 2, 3] which has index 6.
    """
    s1 = 0
    for k in range(n):
        s2 = s1 + choose(n - k - 1, r - 1)
        if i < s2:
            return k, s1
        s1 = s2
You can reduce the O(n) loop iterations to O(log n) steps using bisection, which is particularly relevant for large n. In that case I find it easier to think about numbering items from the end of your list. In the case of n = 5 and r = 3 you get choose(2, 2)=1 combinations starting with 2, choose(3,2)=3 combinations starting with 1 and choose(4,2)=6 combinations starting with 0. So in the general choose(n,r) binomial coefficient you increase the n with each step, and keep the r. Taking into account that sum(choose(k,r) for k in range(r,n+1)) can be simplified to choose(n+1,r+1), you can eventually come up with bisection conditions like the following:
def first_bisect(i, n, r):
    nCr = choose(n, r)
    k1 = r - 1
    s1 = nCr
    k2 = n
    s2 = 0
    while k2 - k1 > 1:
        k3 = (k1 + k2) // 2
        s3 = nCr - choose(k3, r)
        if s3 <= i:
            k2, s2 = k3, s3
        else:
            k1, s1 = k3, s3
    return n - k2, s2
Once you know the first element to be k, you also know the index of the first combination with that same first element (also returned from my function above). You can use the difference between that first index and your actual index as input to a recursive call. The recursive call would be for r − 1 elements chosen from n − k − 1. And you'd add k + 1 to each element from the recursive call, since the top level returns values starting at 0 while the next element has to be greater than k in order to avoid duplication.
def combination(i, n, r):
    """Compute combination with a given index.

    Equivalent to list(itertools.combinations(range(n), r))[i].

    Each combination is represented as a tuple of ascending elements, and
    combinations are ordered lexicographically.

    Args:
        i: zero-based index of the combination
        n: number of possible values, will be taken from range(n)
        r: number of elements in result list
    """
    if r == 0:
        return []
    k, ik = first_bisect(i, n, r)
    return tuple([k] + [j + k + 1 for j in combination(i - ik, n - k - 1, r - 1)])
I've got a complete working example, including an implementation of choose, more detailed doc strings and tests for some basic assumptions.

Number of action per year. Combinatorics question

I'm writing a diploma thesis about vaccines. There is a region, its population, and 12 months. There is an array of 12 values from 0 to 1 with step 0.01. It says which part of the population we should vaccinate in each month.
For example, if we have array = [0.1,0,0,0,0,0,0,0,0,0,0,0], that means we should vaccinate 0.1 of the region's population in the first month only.
Another array = [0, 0.23,0,0,0,0,0,0, 0.02,0,0,0] means we should vaccinate 0.23 of the region's population in the second month and 0.02 of the region's population in the 9th month.
So the question is: how do I generate (using 3 loops) 12 (months) * 12 (times of vaccinating) * 100 (number of steps from 0 to 1) = 14,400 arrays that will contain every version of these combinations?
For now I have this code:
for (int month = 0; month < 12; month++) {
    for (double step = 0; step <= 1; step += 0.01) {
        double[] arr = new double[12];
        arr[month] = step;
    }
}
I need to add a 3rd loop that will vary the number of vaccinations per year, but I have no idea how to write it. I'm not sure if this is understandable; I hope you get it, otherwise please ask me.
You have 101 variants for the first month: 0.00, 0.01, ..., 1.00.
And 101 variants for the second month - the same values.
So there are 101*101 possible combinations for two months.
Continuing, for all 12 months you have 101^12 variants ~ 10^24.
It is not possible to generate and store so many combinations (at least in the current decade).
If the step is larger than 0.01, the combination count might become manageable. The general formula is P = N^M, where N is the number of variants per month and M is the number of months.
You can traverse all combinations by representing all the integers in the range 0..P-1 in the base-N numeral system, or by making a digit counter:
fill array D[12] with zeros
repeat
    increment the element at the last index by the step value
    if it reaches the limit, make it zero
    and increment the element at the next index
until the first element reaches the limit
It is similar to decimal counting: 08, 09, then we cannot increment 9, so we make it 0 and carry into the next digit, giving 10, and so on.
s = 1
m = 3
mx = 3
l = [0] * m
i = 0
while i < m:
    print([x / 3 for x in l])
    i = 0
    l[i] += s
    while (i < m) and l[i] > mx:
        l[i] = 0
        i += 1
        if i < m:
            l[i] += s
This Python code prints 64 ((mx/s+1)^m = 4^3) variants like [0.3333, 0.6666, 0.0]

Calculate if trend is up, down or stable

I'm writing a VBScript that sends out a weekly email with client activity. Here is some sample data:
a b c d e f g
2,780 2,667 2,785 1,031 646 2,340 2,410
Since this is email, I don't want a chart with a trend line. I just need a simple function that returns "up", "down" or "stable" (though I doubt it will ever be perfectly stable).
I'm terrible with math so I don't even know where to begin. I've looked at a few other questions for Python or Excel but there's just not enough similarity, or I don't have the knowledge, to apply it to VBS.
My goal would be something as simple as this:
a b c d e f g trend
2,780 2,667 2,785 1,031 646 2,340 2,410 ↘
If there is some delta or percentage or other measurement I could display that would be helpful. I would also probably want to ignore outliers. For instance, the 646 above. Some of our clients are not open on the weekend.
First of all, your data is listed as
a b c d e f g
2,780 2,667 2,785 1,031 646 2,340 2,410
To get a trend line you need to assign numerical values to the variables a, b, c, ...
To assign numerical values, you need a bit more information about how the data were taken. Suppose you took data point a on 1st January; you can assign it any value, like 0 or 1. If you took data point b ten days later, you can assign it the value 10 or 11. If you took data point c thirty days later, you can assign it the value 30 or 31. The numerical values of a, b, c, ... must be proportional to the time intervals at which the data were taken to get a more accurate trend line.
If they are taken at regular intervals (which is most likely your case), let's say every 7 days, then you can assign them regularly spaced values a, b, c, ... ~ 1, 2, 3, ... The starting point is entirely your choice; choose something that makes it very easy. It does not matter for your final calculation.
Then you need to calculate the slope of the linear regression (which you can find at the linked URL), for which you need to compute the value of b from the following table.
In the first column, rows 2 to 8, I have my values of a, b, c, ..., which I set to 1, 2, 3, ...
In the second column, I have my data.
In the third column, I multiplied each cell of the first column by the corresponding cell of the second column.
In the fourth column, I squared the value of the cell of the first column.
In row 10, I added up the values of the columns above.
Finally, use the values of row 10:
b = (total_number_of_data*C[10] - A[10]*B[10]) / (total_number_of_data*D[10] - square_of(A[10]))
The sign of b determines what you are looking for: if it is positive the trend is up, if it is negative the trend is down, and if it is zero the trend is stable.
This was a huge help! Here it is as a function in Python:
def trend_value(nums: list):
    summed_nums = sum(nums)
    multiplied_data = 0
    summed_index = 0
    squared_index = 0
    for index, num in enumerate(nums):
        index += 1  # 1-based x values: 1, 2, 3, ...
        multiplied_data += index * num
        summed_index += index
        squared_index += index**2
    numerator = (len(nums) * multiplied_data) - (summed_nums * summed_index)
    denominator = (len(nums) * squared_index) - summed_index**2
    if denominator != 0:
        return numerator / denominator
    else:
        return 0

val = trend_value([2781, 2667, 2785, 1031, 646, 2340, 2410])
print(val)  # -139.5
In Python:
def get_trend(numbers):
    rows = []
    total_numbers = len(numbers)
    currentValueNumber = 1
    n = 0
    while n < len(numbers):
        rows.append({'row': currentValueNumber, 'number': numbers[n]})
        currentValueNumber += 1
        n += 1
    sumLines = 0
    sumNumbers = 0
    sumMix = 0
    squareOfs = 0
    for k in rows:
        sumLines += k['row']
        sumNumbers += k['number']
        sumMix += k['row'] * k['number']
        squareOfs += k['row'] ** 2
    a = (total_numbers * sumMix) - (sumLines * sumNumbers)
    b = (total_numbers * squareOfs) - (sumLines ** 2)
    c = a / b
    return c

trendValue = get_trend([2781, 2667, 2785, 1031, 646, 2340, 2410])
print(trendValue)  # output: -139.5

Minimum number of element required to make a sequence that sums to a particular number

Suppose there is a number s = 12. Now I want to make a sequence of elements with a1 + a2 + ... + an = 12.
The criteria are as follows:
n must be minimal;
a1 and an must be 1;
ai can differ from a(i-1) by only 1, 0 or -1.
For s = 12 the result is 6.
So how do I find the minimum value of n?
Algorithm for finding n from a given s:
1. Find q = FLOOR( SQRT(s-1) )
2. Find r = q^2 + q
3. If s <= r then n = 2q, else n = 2q + 1
Example: s = 12
q = FLOOR( SQRT(12-1) ) = FLOOR( SQRT(11) ) = 3
r = 3^2 + 3 = 12
12 <= 12, therefore n = 2*3 = 6
Example: s = 159
q = FLOOR( SQRT(159-1) ) = FLOOR( SQRT(158) ) = 12
r = 12^2 + 12 = 156
159 > 156, therefore n = 2*12 + 1 = 25
and the 25-number sequence for 159 is:
1,2,3,4,5,6,7,8,9,10,10,10,9,10,10,10,9,8,7,6,5,4,3,2,1
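A direct transcription of this algorithm in Python (a minimal sketch; the function name is mine):
from math import isqrt

def min_n(s):
    q = isqrt(s - 1)  # q = FLOOR(SQRT(s-1))
    r = q * q + q     # r = q^2 + q
    return 2 * q if s <= r else 2 * q + 1

print(min_n(12))   # 6
print(min_n(159))  # 25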
Here's a way to visualize the solution.
First, draw the smallest triangle (rows containing successive odd numbers of stars) that has a number of stars greater than or equal to s. In this case, we draw a 16-star triangle.
*
***
*****
*******
Then we have to remove 16 - 12 = 4 stars. We do this diagonally, starting from the top (the digits mark the removed stars):
1
**2
****3
******4
The result is:
**
****
******
Finally, add up the column heights to get the final answer:
1, 2, 3, 3, 2, 1.
There are two cases: n even and n odd. When n is even, the maximal sequence is
1, 2, 3, ..., n/2, n/2, n/2-1, n/2-2, ..., 1
and when n is odd it is
1, 2, 3, ..., (n-1)/2, (n+1)/2, (n-1)/2, ..., 1
The maximum possible for any given series of length n is:
n is even => (n^2+2n)/4
n is odd => (n+1)^2/4
These two results are arrived at easily enough by looking at the arithmetic sum of the series: in the case of n even it is twice the sum of the series 1...n/2, and in the case of n odd it is twice the sum of the series 1...(n-1)/2 plus (n+1)/2 (the middle element).
Clearly you can generate any positive number that is less than this max as long as n>3.
So the problem then becomes finding the smallest n whose max is greater than or equal to your target.
Algorithmically I'd go for:
Find (sqrt(4*s) - 1) and round up to the next odd number. Call this M. This is an easy value to work out and represents the lowest odd n that can reach s.
Check M-1 to see if its max sum is greater than or equal to s. If so, then your n is M-1. Otherwise your n is M.
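A sketch of this variant (my own transcription, using the max-sum formulas above):
from math import ceil, sqrt

def max_sum(n):
    # largest total reachable by a valid sequence of length n
    return (n * n + 2 * n) // 4 if n % 2 == 0 else (n + 1) ** 2 // 4

def min_n_via_odd(s):
    m = ceil(sqrt(4 * s) - 1)
    if m % 2 == 0:
        m += 1  # round up to the next odd number
    return m - 1 if max_sum(m - 1) >= s else m

print(min_n_via_odd(12))  # 6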
Thanks to everyone who answered me. I derived a simpler solution. The algorithm looks like this.
First find the maximum sum that can be made using n elements:
if n=1 -> 1 sum=1;
if n=2 -> 1,1 sum=2;
if n=3 -> 1,2,1 sum=4;
if n=4 -> 1,2,2,1 sum=6;
if n=5 -> 1,2,3,2,1 sum=9;
if n=6 -> 1,2,3,3,2,1 sum=12;
From this observation it is clear that any number s with 9 < s <= 12 can be made using 6 elements; similarly, any number with 6 < s <= 9 can be made using 5 elements.
So it only requires a binary search over these maximum sums to find the number of elements that make a particular number (see the sketch below).
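A sketch of that binary search (my own code; it reuses the even/odd maximum-sum formulas derived in the previous answer):
from bisect import bisect_left

def min_n_bsearch(s, limit=100000):
    # max_sums[n] = maximum sum that can be made using n elements,
    # matching the table above: 0, 1, 2, 4, 6, 9, 12, ...
    max_sums = [(n * n + 2 * n) // 4 if n % 2 == 0 else (n + 1) ** 2 // 4
                for n in range(limit)]
    # smallest n whose maximum reachable sum is >= s
    return bisect_left(max_sums, s)

print(min_n_bsearch(12))  # 6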
