Need help understanding probability problem - math

I'm studying for my statistics exam next week, and I'm struggling with probability, in particular with deciding which formulas to use to tackle different problems. For instance, in the following scenario I naively assumed that P(p) would also equal 0.1, since p and q appear in the same number of events, but I'm pretty sure there is more to it. Can someone please explain in very simple terms how such a problem would be tackled? I've tried re-reading the chapter multiple times, and I still feel clueless.
Consider the sample space: Ω={ p, q, r, s, t, u }
Consider the set of events: F ={ ∅, { p }, { q, r, s, t, u }, { q }, { p, r, s, t, u }, { p, q }, { r, s, t, u }, Ω}
The following probabilities are known:
P(q) = 0.1
P({ r, s, t, u }) = 0.3
Find P(p)

That set of events, F, says nothing about the probabilities of the outcomes. The fact that the number of events containing p equals the number of events containing q tells you nothing about the probabilities of p and q; F is just an arbitrary choice of which of the 64 possible events to pay attention to.
The probabilities of all of the outcomes in the sample space must add up to 1, and the probability of an event is the sum of the probabilities of the outcomes that comprise it. So
1 = P({ p, q, r, s, t, u })
= P(p) + P(q) + P({ r, s, t, u })
P(p) = 1 - P(q) - P({ r, s, t, u })
= 1 - 0.1 - 0.3
= 0.6
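As a quick sanity check of that arithmetic, here is a tiny Python sketch (not part of the original answer; the variable names are just for illustration):
# the outcome probabilities over the sample space must sum to 1
P_q = 0.1      # P({q})
P_rstu = 0.3   # P({r, s, t, u})
P_p = 1 - P_q - P_rstu
print(P_p)     # 0.6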

Related

Could not find the optimal solution after adding constraints

My code is as follows:
gekko = GEKKO(remote=True)
# create variables, each variable is a vector, each element
# of the vector is a binary
s = []
for i in range(N):
    s.append(gekko.Array(gekko.Var, s_len[i], value=0, lb=0, ub=1, integer=True))
# some constants used in the objective/constraint function
c, d, r, m, L = create_c_d_r_m_L()  # they are all numpy ndarrays
# define the objective function
def objective():
    obj = 0
    for i in range(N):
        obj += np.dot(s[i], c[i]) + np.dot(s[i], d[i])
    for idx, (i, j) in enumerate(E):
        obj += np.dot(np.dot(s[i], r[idx].reshape(s_len[i], s_len[j])), s[j])  # s[i] * r[i, j] * s[j]
    return obj
# add constraints
# (a) each vector can only have and must have one 1
for i in range(N):
    gekko.Equation(gekko.sum(s[i]) == 1)
# (b)
for t in range(N):
    peak_mem = gekko.sum([np.dot(s[i], m[i]) for i in L[t]])
    gekko.Equation(peak_mem < DEVICE_MEM)
    # DEVICE_MEM is a predefined big int
# solve
gekko.Obj(objective())
gekko.solve(disp=True)
I found that when removing constraint (b), the solver outputs the optimal solution for s. However, if we add (b) and set DEVICE_MEM to a very large number (which should not affect the solution), s is no longer optimal. I'm wondering if I am doing something wrong here, because I tried both APOPT (solvertype=1) and IPOPT (solvertype=3) and they give the same non-optimal results.
To give more context to the problem: this is an optimization over the graph. N represents the number of nodes in the graph. E is the set that contains all edges in the graph. c, d, m are three types of cost of a node. r is the cost of edges. Each node has multiple strategies (represented by the vector s[i]), and we need to select the best strategy for each node so that the overall cost is minimal.
Detailed constants:
# s_len: record the length of each vector
# (the # of strategies for each node,
# here we assume the lengths are all 10)
s_len = np.ones(N, dtype=int) * 10
# c, d, m are the costs of each node
# let's assume the c/d/m cost for node i is just i
c, d, m = [], [], []
for i in range(N):
    c.append(s_len[i] * [i])
    d.append(s_len[i] * [i])
    m.append(s_len[i] * [i])
# r is the edge cost, let's assume the cost for
# each edge is just i * j
r = []
for (i, j) in E:  # E records all edges
    cur_r = s_len[i] * s_len[j] * [i*j]
    r.append(cur_r)
# L contains the node ids, we just randomly generate 10 integers here
L = []
for i in range(N):
    cur_L = [randrange(N) for _ in range(10)]
    L.append(cur_L)
I've been stuck on this for a while and any comments/answers are highly appreciated! Thanks!
Try reframing the inequality constraint:
for t in range(N):
    peak_mem = gekko.sum([np.dot(s[i], m[i]) for i in L[t]])
    gekko.Equation(peak_mem < DEVICE_MEM)
as a variable with an upper bound:
peak_mem = gekko.Array(gekko.Var, N, ub=DEVICE_MEM)
for t in range(N):
    gekko.Equation(peak_mem[t] == gekko.sum([np.dot(s[i], m[i]) for i in L[t]]))
The N inequality constraints peak_mem < DEVICE_MEM are converted internally to equality constraints with slack variables, slack = DEVICE_MEM - peak_mem, plus the simple bound slack >= 0. If the inequality constraint is far from the bound, the slack variable can be very large. Formulating the quantity as a variable with an upper bound may help.
I tried using the information in the question to pose a minimal problem that could reproduce the error and the potential solution. If you need more specific suggestions, please modify the code to be a complete and minimal example that reproduces the error. This helps with verifying the solution.
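For illustration only, here is a minimal stand-alone sketch of that bounded-variable pattern with made-up data (the model name gk, the cost/mem arrays, and MEM_CAP are assumptions for the sketch, not the poster's actual model):
from gekko import GEKKO
import numpy as np

gk = GEKKO(remote=False)
cost = np.array([1.0, 2.0, 3.0])   # hypothetical per-strategy cost
mem = np.array([5.0, 1.0, 4.0])    # hypothetical per-strategy memory use
MEM_CAP = 100                      # deliberately loose bound

# one binary per strategy; pick exactly one
x = gk.Array(gk.Var, 3, value=0, lb=0, ub=1, integer=True)
gk.Equation(gk.sum(list(x)) == 1)

# the inequality "memory use <= MEM_CAP" written as a variable with an upper bound
peak = gk.Var(ub=MEM_CAP)
gk.Equation(peak == gk.sum([x[i] * mem[i] for i in range(3)]))

gk.Obj(gk.sum([x[i] * cost[i] for i in range(3)]))
gk.options.SOLVER = 1  # APOPT handles the integer variables
gk.solve(disp=False)
print([xi.value[0] for xi in x])   # expect the cheapest strategy to be selected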

Is it possible to reduce the time spent building a mathematical formulation in code?

I would like to use the optimization model below in my code, but the processing and preparation time of the objective function (f) is too long. Is there any way to reduce the build time for these kinds of large models?
using JuMP, CPLEX
Tsp = Model(solver=CplexSolver());
# Parameters -------------------------------------------------------------------
V, T, K = 1:100, 1:5, 1:5
totalV = 100
d = 1 .+ 99 .* rand(V, V);
# Variables --------------------------------------------------------------------
@variable(Tsp, x[V,V,K,T], Bin);
@variable(Tsp, u[V,V,K,T] >= 0);
# Constraints ------------------------------------------------------------------
@constraint(Tsp, c1[i in V, k in K, t in T], sum(x[i,j,k,t] for j in V) == 1);
@constraint(Tsp, c2[j in V, k in K, t in T], sum(x[i,j,k,t] for i in V) == 1);
@constraint(Tsp, c3[i in U, j in V, k in K, t in T; i != j], u[i,k,t] - u[j,k,t] + totalV*x[i,j,k,t] <= totalV - 1);
# Objective function -------------------------------------------------------------
f = sum(d[i,j]*x[i,j,k,t] for i in V, j in V, k in K, t in T);
@objective(Tsp, Min, f);
solve(Tsp);
Thanks very much.
I'll assume you're using JuMP due to the tag.
Always provide a reproducible example: https://stackoverflow.com/help/minimal-reproducible-example. It's hard to offer advice without it.
Do not build JuMP expressions outside of the macros: https://jump.dev/JuMP.jl/stable/tutorials/getting_started/performance_tips/#Use-macros-to-build-expressions
Your code is wrong. If V = 100, then for i in V will only have one element, and that is i = 100. Perhaps you meant for i in 1:V?
Think about what the if statement is doing. It only uses i and j, but it needs to be evaluated for every t and every k.
Putting it all together, I would do something like:
V, H, K = 1:100, 1:5, 1:5
using JuMP
model = Model()
@variable(model, x[V, V, K, H])
d = 1 .+ 99 .* rand(V, V)
@expression(
    model,
    f,
    sum(d[i, j] * sum(x[i, j, k, t] for t in H, k in K) for i in V, j in V if i != j)
)
Hope that helps.

Double integration with a differentiation inside in R

I need to integrate the following function, which has a differentiation term inside it. Unfortunately, that term is not easily differentiable.
Is it possible to do something like numerical integration to evaluate this in R?
You can assume 30, 50, 0.5, 1, 50, 30 for l, tau, a, b, F and P respectively.
UPDATE: What I tried
InnerFunc4 <- function(t,x){digamma(gamma(a*t*(LF-LP)*b)/gamma(a*t))*(x-t)}
InnerIntegral4 <- Vectorize(function(x) { integrate(InnerFunc4, 1, x, x = x)$value})
integrate(InnerIntegral4, 30, 80)$value
It shows the following error:
Error in integrate(InnerFunc4, 1, x, x = x) : non-finite function value
UPDATE2:
InnerFunc4 <- function(t, L) {digamma(gamma(a*t*(LF - LP)*b) / gamma(a*t)) * (L - t)}
t_lower_bound = 0
t_upper_bound = 30
L_lower_bound = 30
L_upper_bound = 80
step_size = 0.5
integral = 0
t <- t_lower_bound + 0.5 * step_size
while (t < t_upper_bound) {
  L = L_lower_bound + 0.5 * step_size
  while (L < L_upper_bound) {
    volume = InnerFunc4(t, L) * step_size**2
    integral = integral + volume
    L = L + step_size
  }
  t = t + step_size
}
Since it seems that your problem is only the derivative, you can get rid of it by means of integration by parts (partial integration).
Edit
This solution is not applicable when the lower integration bound is 0.

How to find n average number of trials before criteria are met with differing probabilities per outcome?

I've spent a few days trying to figure this out and looking up tutorials, but everything I've found so far seems close to what I need yet doesn't give the results I need.
I have a device that produces a single letter, A-F. For simplicity's sake, you can think of it like a die with letters. It will always produce one and only one letter each time it is used. However it has one major difference: each letter can have a differing known probability of being picked:
A: 25%
B: 5%
C: 20%
D: 15%
E: 20%
F: 15%
These probabilities remain constant throughout all attempts.
Additionally, I have a specific combination I must accrue before I am "successful":
As needed: 1
Bs needed: 3
Cs needed: 0
Ds needed: 1
Es needed: 2
Fs needed: 3
I need to find the average number of letter picks (i.e. rolls/trials/attempts) that have to happen for this combination of letters to be accrued. It's completely fine for any individual outcome to have more than the required number of letters, but success is only counted when each letter has been chosen at least its minimum number of times.
I've looked at plenty of tutorials for the multinomial probability distribution and similar things, but I haven't found anything that explains how to find the average number of trials for a scenario like this. Please explain answers clearly, as I'm not a whiz with statistics.
In addition to Severin's answer, which looks logically sound to me but might be costly to evaluate (an infinite sum of multinomial terms), let me provide some intuition that should give a good approximation.
Consider each category one at a time (see this math.stackexchange question/answer). The expected number of tosses needed to get k successes in category i can be calculated as k(i)/P(i):
Given,
p(A): 25% ; Expected number of tosses to get 1 A = 1/ 0.25 = 4
p(B): 5% ; Expected number of tosses to get 3 B's = 3/ 0.05 = 60
p(C): 20% ; Expected number of tosses to get 0 C = 0/ 0.20 = 0
p(D): 15% ; Expected number of tosses to get 1 D = 1/ 0.15 = 6.67 ~ 7
p(E): 20% ; Expected number of tosses to get 2 E's = 2/ 0.20 = 10
p(F): 15% ; Expected number of tosses to get 3 F's = 3/ 0.15 = 20
You can see that getting 3 B's is your bottleneck, so you can expect on average about 60 tosses for your scenario to play out.
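A quick sketch of that back-of-the-envelope estimate in plain Python (the dictionary names are just for illustration):
probs = {"A": 0.25, "B": 0.05, "C": 0.20, "D": 0.15, "E": 0.20, "F": 0.15}
needed = {"A": 1, "B": 3, "C": 0, "D": 1, "E": 2, "F": 3}
# expected tosses to collect the needed count of each letter, k(i) / P(i)
expected = {k: needed[k] / probs[k] for k in probs}
print(expected)                # B -> 60.0 is the largest
print(max(expected.values()))  # 60.0, the bottleneck estimate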
Well, the minimum number of throws is 10. The average would be the infinite sum
A=10•P(done in 10)+11•P(done in 11)+12•P(done in 12) + ...
For P(done in 10) we could use multinomial
P(10)=Pm(1,3,0,1,2,3|probs), where probs=[.25, .05, .20, .15, .20, .15]
For P(11) you have one more throw which you could distribute like this
P(11)=Pm(2,3,0,1,2,3|probs)+Pm(1,4,0,1,2,3|probs)+Pm(1,3,0,2,2,3|probs)+
Pm(1,3,0,1,3,3|probs)+Pm(1,3,0,1,2,4|probs)
For P(12) you have to distribute 2 more throws. Note that there are combinations of throws which are impossible to get, like Pm(2,3,0,2,2,3|probs), because you would have had to stop earlier.
And so on and so forth
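As a sketch of how the individual Pm terms above could be evaluated (assuming SciPy is available; truncating the infinite sum and the "stop earlier" bookkeeping still have to be handled separately):
from scipy.stats import multinomial

probs = [0.25, 0.05, 0.20, 0.15, 0.20, 0.15]
# P(done in 10): exactly the required counts in the first 10 throws
p10 = multinomial.pmf([1, 3, 0, 1, 2, 3], n=10, p=probs)
print(p10)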
Your process can be described as a Markov chain with a finite number of states, and an absorbing state.
The number of steps before reaching the absorbing state is called the hitting time. The expected hitting time can be calculated easily from the transition matrix of the Markov chain.
Enumerate all possible states (a, b, c, d, e, f). Consider only a finite number of states, because "b >= 3" is effectively the same as "b = 3", etc. The total number of states is (1+1)*(3+1)*(0+1)*(1+1)*(2+1)*(3+1) = 192.
Make sure that in your enumeration, starting state (0, 0, 0, 0, 0, 0) comes first, with index 0, and absorbing state (1, 3, 0, 1, 2, 3) comes last.
Build the transition matrix P. It's a square matrix with one row and column per state. Entry P[i, j] in the matrix gives the probability of going from state i to state j when rolling a die. There should be at most 6 non-zero entries per row.
For example, if i is the index of state (1, 0, 0, 1, 2, 2) and j the index of state (1, 1, 0, 1, 2, 2), then P[i, j] = probability of rolling face B = 0.05. Another example: if i is the index of state (1,3,0,0,0,0), then P[i,i] = probability of rolling A, B or C = 0.25+0.05+0.2 = 0.5.
Call Q the square matrix obtained by removing the last row and last column of P.
Call I the identity matrix of the same dimensions as Q.
Compute matrix M = (I - Q)^-1, where ^-1 is matrix inversion.
In matrix M, the entry M[i, j] is the expected number of times that state j will be reached before the absorbing state, when starting from state i.
Since our experiment starts in state 0, we're particularly interested in row 0 of matrix M.
The sum of row 0 of matrix M is the expected total number of visits to transient states before absorption. That is exactly the answer we seek: the expected number of steps to reach the absorbing state.
To understand why this works, you should read a course on Markov chains! Perhaps this one: James Norris' course notes on Markov chains. The chapter about "hitting times" (which is the name for the number of steps before reaching target state) is chapter 1.3.
Below, an implementation in python.
from itertools import product, accumulate
from operator import mul
from math import prod
import numpy as np

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def get_expected_n_trials(targets, dice_weights):
    states = list(product(*(range(n+1) for n in targets)))
    base = list(accumulate([n+1 for n in targets[:0:-1]], mul, initial=1))[::-1]
    lookup = dict(map(reversed, enumerate(states)))
    P = np.zeros((len(states), len(states)))
    for i, s in enumerate(states):
        for f, p in enumerate(dice_weights):
            # j = index of state reached from state i when rolling face f
            j = i + base[f] * (s[f] < targets[f])
            j1 = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]
            if (j != j1):
                print(i, s, f, ' --> ', j, j1)
            assert(j == j1)
            P[i,j] += p
    Q = P[:-1, :-1]
    I = np.identity(len(states)-1)
    M = np.linalg.inv(I - Q)
    return M[0,:].sum()

print(get_expected_n_trials(targets, dice_weights))
# 61.28361802372382
Explanations of code:
First we build the list of states using the Cartesian product itertools.product.
For a given state i and die face f, we need to calculate j = state reached from i when adding f. I have two ways of calculating that, either as j = i + base[f] * (s[f] < targets[f]) or as j = lookup[s[:f] + (min(s[f]+1, targets[f]),) + s[f+1:]]. Because I'm paranoid, I calculated it both ways and checked that the two ways gave the same result. But you only need one way. You can remove lines j1 = ... to assert(j == j1) if you want.
Matrix P begins filled with zeroes, and we fill up to six cells per row with P[i, j] += p where p is probability of rolling face f.
Then we compute matrices Q and M as I indicated above.
We return the sum of all the cells on the first row of M.
To help you better understand what is going on, I encourage you to examine the values of all variables. For instance you could replace return M[0, :].sum() with return states, base, lookup, P, Q, I, M and then write states, base, lookup, P, Q, I, M = get_expected_n_trials(targets, dice_weights) in the python interactive shell, so that you can look at the variables individually.
A Monte-Carlo simulation:
Actually roll the die until we hit the requirements;
Count how many rolls we did;
Repeat experiment 1000 times to get the empirical average value.
Implementation in python:
from collections import Counter
from random import choices
from itertools import accumulate
from statistics import mean, stdev

dice_weights = [0.25, 0.05, 0.2, 0.15, 0.2, 0.15]
targets = [1, 3, 0, 1, 2, 3]

def avg_n_trials(targets, dice_weights, n_experiments=1000):
    dice_faces = range(len(targets))
    target_state = Counter(dict(enumerate(targets)))
    cum_weights = list(accumulate(dice_weights))
    results = []
    for _ in range(n_experiments):
        state = Counter()
        while not state >= target_state:
            f = choices(dice_faces, cum_weights=cum_weights)[0]
            state[f] += 1
        results.append(state.total())  # python<3.10: sum(state.values())
    m = mean(results)
    s = stdev(results, xbar=m)
    return m, s

m, s = avg_n_trials(targets, dice_weights, n_experiments=10000)
print(m)
# 61.4044

For Loop in R replacing Object Values at each iteration

I am struggling to figure out how to create a for loop in which some initial objects (u, l, h, and y) are updated and reported at the end of each iteration, and in which each iteration uses the values from the prior iteration as its basis (for example, after updating the objects above, the runif function should draw q using the updated values of u and l). I keep getting the same result repeated with no variation, and I am unsure what the best way to resolve this might be.
Apologies in advance as I am fairly new to R and coding in general.
reset = {
  l = 0.1  # lower bound of belief in theta
  u = 0.9  # upper bound of belief in theta
  h = 0.2  # lower legal threshold, below which an action is not liable
  y = 0.8  # upper legal threshold, above which an action is liable
}
### need 1-u <= h <= y <= 1-l for each t along every path of play
period = c(1:100)  ## Number of periods in the iteration of the loop.
for (t in 1:length(period)) {
  q = runif(1, min = l, max = u)  ### 1 draw of q from a uniform distribution
  q
  probg = function(q, l, u) {(u - (1 - q)) / (u - l)}  ### probability of being found guilty given q in the ambiguous region
  probg(q, l, u)
  probi = function(q, l, u) {1 - probg(q, l, u)}  ### probability of being found innocent given q in the ambiguous region
  probi(q, l, u)
  ruling = if (q >= y | probg(q, l, u) > 1) {print("Guilty")  ### Strict liability
  } else if (q <= h | probi(q, l, u) > 1) {print("Innocent")  ### Permissible
  } else if (q > h & q < y) {  ### Ambiguous region
    discovery = sample(c('guilty', 'not guilty'), size = 1, replace = TRUE, prob = c(probg(q, l, u), probi(q, l, u)))  ### court discovering whether a particular ambiguous q is permissible or not
  }
  discovery
  ruling
  if (ruling == "not guilty") {u = 1 - q} else if (ruling == "guilty") {l = 1 - q} else (print("beliefs unchanged"))
  if (ruling == "not guilty") {h = 1 - u} else if (ruling == "guilty") {y = 1 - l} else (print("legal threshold unchanged"))  #### legal adjustment and updating of beliefs in ambiguous region after discovery of liability
  probg(q, l, u)
  probi(q, l, u)
  modelparam = c(l, u, h, y)
  show(modelparam)
}
