Random function and calculating percentage - math

Using a random library with these functions:
randomChance(p) Returns true with the probability indicated by p.
randomInteger(low, high) Returns a random integer in the range low to high, inclusive.
What is the easiest way to implement a "random selector" that takes percentages into account (1/4, 1/3, etc.)? I have an array of key/value pairs. For example, "a" might have the value 2 and "b" the value 2, which gives each a 1/2 chance.
The max value will be the size of the array, because it only contains unique items. The randomChance() function takes a probability between 0.0 and 1.0, where 1 = 100%. If my array size is, say, 4, what is the best way of "letting 4 be 1"?

Let's say you have:
a = 2, b = 2, c = 1, d = 3
Now replace each value with the running total of the values so far:
a = 2, b = 4, c = 5, d = 8
Create a random number from 1 to MaxVal (the value of the last key, 8 in this example), then select the first key whose value >= RandomNum.
EDIT
I made a small VB.Net to show the algorithm and how it works. The code is not meant to be: Good, elegant, performant or readable.
Module Module1
    Private Class Value
        Public vOrg, vRecalc, HitCount As Integer
        Public Key As String
        Public Sub New(s, v1, v2, c)
            Key = s : vOrg = v1 : vRecalc = v2 : HitCount = c
        End Sub
    End Class
    Sub Main()
        ' set initial values
        Dim KVP() As Value = {New Value("A", 2, 0, 0),
                              New Value("B", 2, 0, 0),
                              New Value("C", 1, 0, 0),
                              New Value("D", 3, 0, 0)}
        ' recalc values: vRecalc becomes the running total of vOrg
        For i = 0 To KVP.Length - 1
            If i = 0 Then KVP(0).vRecalc = KVP(0).vOrg Else KVP(i).vRecalc = KVP(i).vOrg + KVP(i - 1).vRecalc
        Next
        ' do test
        Dim r As New Random
        Dim runs As Integer = 1000 * 1000, maxval As Integer = KVP(KVP.Length - 1).vRecalc
        For i = 1 To runs
            ' upper bound of Random.Next is exclusive, so this yields 1..maxval
            Dim RandVal = r.Next(1, maxval + 1)
            ' index of the first entry whose running total is >= RandVal
            Dim chosen As Integer = (From j In Enumerable.Range(0, KVP.Length) Where KVP(j).vRecalc >= RandVal Take 1 Select j)(0)
            KVP(chosen).HitCount += 1
        Next
        ' output results
        For Each kv In KVP
            Console.WriteLine("{0} was chosen with {1:F3} probability, expected was {2:F3}", kv.Key, kv.HitCount / CDbl(runs), kv.vOrg / CDbl(maxval))
        Next
        Console.ReadLine()
    End Sub
End Module
An output sample:
A was chosen with 0.250 probability, expected was 0.250
B was chosen with 0.251 probability, expected was 0.250
C was chosen with 0.124 probability, expected was 0.125
D was chosen with 0.375 probability, expected was 0.375
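For comparison, here is a minimal Python sketch of the same running-total idea. The function name weighted_choice and the rng parameter are made up for this illustration; random.randint stands in for the question's randomInteger(low, high), since both include the upper bound.

```python
import bisect
import itertools
import random


def weighted_choice(weights, rng=random):
    """Pick a key with probability proportional to its integer weight.

    `weights` maps key -> positive integer weight (hypothetical input shape).
    """
    keys = list(weights)
    # Running totals, e.g. 2, 2, 1, 3 -> 2, 4, 5, 8
    cumulative = list(itertools.accumulate(weights[k] for k in keys))
    # Random integer from 1 to the last running total, inclusive
    pick = rng.randint(1, cumulative[-1])
    # bisect_left finds the first running total that is >= pick
    return keys[bisect.bisect_left(cumulative, pick)]


if __name__ == "__main__":
    print(weighted_choice({"a": 2, "b": 2, "c": 1, "d": 3}))
```

With the weights above, "d" should come back roughly 3/8 of the time and "c" roughly 1/8 of the time.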

Just multiply the randomChance() outcome by the array length minus one. That gives you an index in the range [0, array_length-1], which you can use to access the array:
array_index = (unsigned int)(randomChance(p) * (array_length - 1));
Maybe you mean "letting 3 be 1" (not 4) in your example. The last index of an array of length 4 is 3.

Related

Giving Value to Python3 Count String

I am running a loop that either divides or subtracts. Every time it divides, I want to record the division with the digit 2, and every time it subtracts, I want to record the subtraction with a 1. I then want that string of digits to be a value that I can manipulate with some basic math. Basically it'll look like this:
20/2 = 10 (2)
10/2 = 5 (2)
5/2 = 2.5 (2)
2.5-.5 = 2 (1)
2/2 = 1 (2)
22212 <=== this is what I want to turn into a new value, but the way I have it coded, it's not working. I think it may have something to do with the end='' in the code.
I've tried assigning the string to an int value and tried joining the string, but no luck so far.
num = 20
while num >= 1.5:
    num /= 2
    v = 1
    print(v, end='')
    if int(num) != num:
        num -= .5
        v = 2
        print(v, end='') #trying to make the output here a value
nv = ''.join(str(int(v)))
nv = int(v) #trying to give the joined strs of nv a value
print(nv) #trying to get this to print the combined valued of v to something that math can be applied to.
print('')
The code doesn't give any errors; I just can't figure out how to make the output an actual number that I can manipulate.
You are printing v = 1 after your division. In your post you said you want a 2 for division, so I am assuming what you wrote in the post is the result you want.
a = ""
num = 20
while num >= 1.5:
    num /= 2
    a += "2"
    if int(num) != num:
        num -= .5
        a += "1"
print(a)
Now a is a string holding your desired result. You can always convert that string to an int to do some math with it.
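As a small sketch of that last step, the loop can be wrapped in a function that returns the digits as an int. The function name op_code is made up for this example, and returning 0 when the loop never runs is an assumption of the sketch.

```python
def op_code(num):
    """Record each division as '2' and each subtraction as '1', then return an int."""
    digits = []
    while num >= 1.5:
        num /= 2
        digits.append("2")
        if int(num) != num:
            num -= 0.5
            digits.append("1")
    # Assumption: if the loop never ran there are no digits, so fall back to 0
    return int("".join(digits) or "0")


code = op_code(20)
print(code)      # 22212
print(code * 2)  # 44424
```

Because the result is a real int, any arithmetic works on it directly.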

Calculate if trend is up, down or stable

I'm writing a VBScript that sends out a weekly email with client activity. Here is some sample data:
a b c d e f g
2,780 2,667 2,785 1,031 646 2,340 2,410
Since this is email, I don't want a chart with a trend line. I just need a simple function that returns "up", "down" or "stable" (though I doubt it will ever be perfectly stable).
I'm terrible with math so I don't even know where to begin. I've looked at a few other questions for Python or Excel but there's just not enough similarity, or I don't have the knowledge, to apply it to VBS.
My goal would be something as simple as this:
a b c d e f g trend
2,780 2,667 2,785 1,031 646 2,340 2,410 ↘
If there is some delta or percentage or other measurement I could display that would be helpful. I would also probably want to ignore outliers. For instance, the 646 above. Some of our clients are not open on the weekend.
First of all, your data is listed as
a b c d e f g
2,780 2,667 2,785 1,031 646 2,340 2,410
To get a trend line you need to assign numerical values to the variables a, b, c, ...
To assign numerical values, you need a little more information about how the data were taken. Suppose you took data point a on 1 January; you can assign it any value, like 0 or 1. If you took b ten days later, assign it 10 or 11. If you took c thirty days after a, assign it 30 or 31. The numerical values of a, b, c, ... must be proportional to the time intervals between measurements to get an accurate trend line.
If they are taken at regular intervals (which is most likely your case), let's say every 7 days, then you can assign regular values: a, b, c, ... ~ 1, 2, 3, ... The starting point is entirely your choice, so choose something that makes it easy. It does not affect the final calculation.
Then you need to calculate the slope of the linear regression, the value b, using the following table.
On first column from row 2 to row 8, I have my values of a,b,c,... which I put 1,2,3, ...
On second column, I have my data.
On third column, I multiplied each cell in first column to corresponding cell in second column.
On fourth column, I squared the value of cell of first column.
On row 10, I added up the values of the above columns.
Finally use the values of row 10.
b = (total_number_of_data*C[10] - A[10]*B[10]) / (total_number_of_data*D[10] - square_of(A[10]))
The sign of b determines what you are looking for: if it's positive, the trend is up; if it's negative, the trend is down; and if it's zero, it's stable.
This was a huge help! Here it is as a function in Python:
def trend_value(nums: list):
    summed_nums = sum(nums)
    multiplied_data = 0
    summed_index = 0
    squared_index = 0
    for index, num in enumerate(nums):
        index += 1
        multiplied_data += index * num
        summed_index += index
        squared_index += index**2
    numerator = (len(nums) * multiplied_data) - (summed_nums * summed_index)
    denominator = (len(nums) * squared_index) - summed_index**2
    if denominator != 0:
        return numerator/denominator
    else:
        return 0
val = trend_value([2781, 2667, 2785, 1031, 646, 2340, 2410])
print(val) # -139.5
In Python:
def get_trend(numbers):
    rows = []
    total_numbers = len(numbers)
    currentValueNumber = 1
    n = 0
    while n < len(numbers):
        rows.append({'row': currentValueNumber, 'number': numbers[n]})
        currentValueNumber += 1
        n += 1
    sumLines = 0
    sumNumbers = 0
    sumMix = 0
    squareOfs = 0
    for k in rows:
        sumLines += k['row']
        sumNumbers += k['number']
        sumMix += k['row']*k['number']
        squareOfs += k['row'] ** 2
    a = (total_numbers * sumMix) - (sumLines * sumNumbers)
    b = (total_numbers * squareOfs) - (sumLines ** 2)
    c = a/b
    return c
trendValue = get_trend([2781,2667,2785,1031,646,2340,2410])
print(trendValue) # output: -139.5
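To get from the slope to the "up", "down" or "stable" label the question asks for, something like the sketch below could work. The name trend_label and the tolerance parameter are assumptions for illustration, not part of either answer above; tolerance is the slope magnitude below which the series is treated as stable.

```python
def trend_label(nums, tolerance=0.0):
    """Return 'up', 'down' or 'stable' from the least-squares slope of nums.

    The x values are assumed to be 1, 2, 3, ... (regular intervals).
    """
    n = len(nums)
    xs = range(1, n + 1)
    sum_x = sum(xs)
    sum_y = sum(nums)
    sum_xy = sum(x * y for x, y in zip(xs, nums))
    sum_xx = sum(x * x for x in xs)
    denominator = n * sum_xx - sum_x ** 2
    # With fewer than two points the slope is undefined; treat it as flat
    slope = 0.0 if denominator == 0 else (n * sum_xy - sum_x * sum_y) / denominator
    if slope > tolerance:
        return "up"
    if slope < -tolerance:
        return "down"
    return "stable"


print(trend_label([2781, 2667, 2785, 1031, 646, 2340, 2410]))  # down
```

A non-zero tolerance keeps small slopes from being reported as a trend; what counts as small depends on the scale of the data.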

Formatting data for clogit model

I have a survey with 53 respondents answering 8 questions each.
Currently it's in an Excel document in the format:
Person # | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8
Each question had three possible responses, "1", "2", or "3". For a given person, each question has a single number indicating the response.
I need to transform the answers from each person into one long column vector with responses coded in binary for each of the three choices. So for each person, there should be 24 rows (3 for each question), and for each question, there should be one row with a 1 (indicating the choice that was made) and two rows with 0's.
I've tried doing this in Excel and in R and cannot figure out how to do it without manually entering each value.
Please tell me there's a better way?
See the inline comments for an explanation of the code.
Sub Demo()
    Dim wsSource As Worksheet
    Dim wsDest As Worksheet
    Dim rSource As Range
    Dim rDest As Range
    Dim vSource As Variant
    Dim i As Long, j As Long
    '--> adjust to suit your needs
    ' set up source and destination references
    Set wsSource = Worksheets("SourceData")
    Set wsDest = Worksheets("DestData")
    '--> adjust to suit your needs
    ' Assumes source data has header in row 1, names in column A and responses in B..I
    With wsSource
        Set rSource = .Range(.Cells(2, 9), .Cells(.Rows.Count, 1).End(xlUp))
    End With
    '--> adjust to suit your needs
    ' Assumes generated data starts at cell A1
    Set rDest = wsDest.Cells(1, 1)
    ' Get Source data
    vSource = rSource.Value
    ' Size Destination data array: 24 rows per person (8 questions x 3 choices)
    ReDim vDest(1 To UBound(vSource, 1) * 24, 1 To 2)
    ' Generate reformatted data
    For i = 1 To UBound(vSource, 1) ' For each Person
        For j = 1 To 24 ' Add person name
            vDest((i - 1) * 24 + j, 1) = vSource(i, 1)
        Next
        For j = 1 To 8 ' Code 8 results
            vDest((i - 1) * 24 + (j - 1) * 3 + 1, 2) = IIf(vSource(i, j + 1) = 1, 1, 0)
            vDest((i - 1) * 24 + (j - 1) * 3 + 2, 2) = IIf(vSource(i, j + 1) = 2, 1, 0)
            vDest((i - 1) * 24 + (j - 1) * 3 + 3, 2) = IIf(vSource(i, j + 1) = 3, 1, 0)
        Next
    Next
    ' Place result on sheet
    rDest.Resize(UBound(vDest, 1), UBound(vDest, 2)) = vDest
End Sub
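The same reshaping can be sketched in Python, in case that is easier to adapt than VBA. This assumes each input row is (person, Q1, ..., Q8) with answers 1, 2 or 3; the function name to_long_binary and the output column order (person, question, choice, chosen) are invented for this example.

```python
def to_long_binary(rows, n_choices=3):
    """Expand one row per person into one row per (person, question, choice).

    The last field is 1 for the choice the person made and 0 otherwise.
    """
    long_rows = []
    for person, *answers in rows:
        for question, answer in enumerate(answers, start=1):
            for choice in range(1, n_choices + 1):
                chosen = 1 if answer == choice else 0
                long_rows.append((person, question, choice, chosen))
    return long_rows


# Hypothetical sample: person 1 answered Q1..Q8 as follows
sample = [(1, 1, 2, 3, 1, 2, 3, 1, 2)]
for row in to_long_binary(sample)[:3]:
    print(row)
```

Each person produces 24 rows, and exactly one row out of every three has a 1 in the last field.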

How to find the range for a given number, interval and start value?

Provided the below values
start value = 1
End Value = 20
Interval = 5
I have been given the number 6 and have to find the range of numbers in which it falls; here the answer is 6-10.
If the given number is greater than the end value, then return the same number.
Is there a formula I can use to generate the range for a given number?
UPDATE
I tried the solution below, but it does not work if the range interval is changed:
$end_value = $start_value + $range_interval;
// we blindly return the last term if value is greater than max value
if ($input_num > $end_value) {
    return '>' . $end_value;
}
// we also find if its a first value
if ($input_num <= $end_value && $value >= $start_value) {
    return $start_value . '-' . $end_value;
}
// logic to find the range for a given integer
$dived_value = $input_num/$end_value;
// round the value to get the exact match
$rounded_value = ceil($dived_value);
$upper_bound_range = $rounded_value*$end_value;
$lower_bound_range = $upper_bound_range - $end_value;
return $lower_bound_range . '-'. $upper_bound_range;
In (c-style) pseudocode:
// Integer division assumed
rangeNumber = (yourNumber - startValue) / rangeLength;
lower_bound_range = startValue + rangeNumber*rangeLength;
upper_bound_range = lower_bound_range + rangeLength-1;
For your input:
rangeNumber = (6-1)/5 = 1
lower_bound_range = 1 + 5*1 = 6
upper_bound_range = 10
and so range is [6, 10]
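Here is the pseudocode above as a runnable Python sketch. The function name find_range and the "lower-upper" string format are assumptions, as is clamping the upper bound to the end value when the last range would otherwise run past it. It also assumes the number is not below the start value.

```python
def find_range(number, start, end, interval):
    """Return the 'lower-upper' range containing number, or number itself if past end."""
    # Per the question: a number greater than the end value is returned as is
    if number > end:
        return str(number)
    # Floor division gives the zero-based index of the range
    index = (number - start) // interval
    lower = start + index * interval
    # Assumption: never report an upper bound beyond the end value
    upper = min(lower + interval - 1, end)
    return f"{lower}-{upper}"


print(find_range(6, 1, 20, 5))   # 6-10
print(find_range(25, 1, 20, 5))  # 25
```

Because the interval is a parameter, changing it (say from 5 to 4) needs no other changes.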
The answer depends on whether you are talking about integers or floats. Since all your example numbers are integers, I assume you mean integers. I further assume that all your intervals contain the same number of integers, in your example 5, namely 1...5, 6...10, 11...15, and 16...20. Note that 0 is not contained in the 1st interval (otherwise the 1st interval would have 6 numbers).
In this case the answer is easy.
Let:
s the start value that is not contained in the 1st interval,
i the interval size, i.e. the number of integers that it contains,
p the provided number to which an interval should be assigned,
b the 1st integer in this interval, and
e the last integer in this interval.
Then:
b = s + (p-s-1)\i * i + 1 (here, "\" means integer division, i.e. without remainder)
e = b + i - 1
In your example:
s = 0, i = 5, p = 6, thus
b = 0 + (6-0-1)\5 * 5 + 1 = 6
e = 6 + 5 - 1 = 10

Collapsing a 10 period curve to 4 periods

I have a 10 period cost curve table below. How do I programmatically collapse/condense/shrink this to 4 periods? I'm using VBA, but I should be able to follow other languages. The routine should work for whatever number of periods you pass to it. For example, if I pass it 7, it should condense the percentages to 7 periods. If I pass it 24, it should expand the percentages to 24 periods, spreading them based on the original curve. Any help or example will be appreciated. Thanks...
ORIGINAL
Period Pct
1 10.60%
2 19.00%
3 18.30%
4 14.50%
5 10.70%
6 8.90%
7 6.50%
8 3.10%
9 3.00%
10 5.40%
COLLAPSED
Period Pct
1 38.75%
2 34.35%
3 16.95%
4 9.95%
EDITED: I've added sample code below as to what I have so far. It only works for periods 1, 2, 3, 5, 9, 10. Maybe someone can help modify it to work for any period. Disclaimer, I'm not a programmer so my coding is bad. Plus, I have no clue as to what I'm doing.
Sub Collapse_Periods()
    Dim aPct As Variant
    Dim aPer As Variant
    aPct = Array(0.106, 0.19, 0.183, 0.145, 0.107, 0.089, 0.065, 0.031, 0.03, 0.054)
    aPer = Array(1, 2, 3, 5, 9, 10)
    For i = 0 To UBound(aPer)
        pm = 10 / aPer(i)
        pct1 = 1
        p = 0
        ttl = 0
        For j = 1 To aPer(i)
            pct = 0
            k = 1
            Do While k <= pm
                pct = pct + aPct(p) * pct1
                pct1 = 1
                p = p + 1
                If k <> pm And k = Int(pm) Then
                    pct1 = (pm - Int(pm)) * j
                    pct = pct + (pct1 * aPct(p))
                    pct1 = 1 - pct1
                End If
                k = k + 1
            Loop
            Debug.Print aPer(i) & " : " & j & " : " & pct
            ttl = ttl + pct
        Next j
        Debug.Print "Total: " & ttl
    Next i
End Sub
I would also like to know how this could be done using an integral. This is how I would have done it; perhaps it's a longhand/long-winded method, but I'd like to see some better suggestions.
It's probably easier to see the method in Excel first, using the LINEST function and named ranges. I've assumed the function is logarithmic. I've outlined steps [1.] - [5.]
This VBA code then essentially replicates the Excel method, using a function that takes two arrays, the number of periods, and a return array that can be written to a range:
Sub CallingProc()
    Dim Periods As Long, returnArray() As Variant
    Dim X_Values() As Variant, Y_Values() As Variant
    Periods = 4
    ReDim returnArray(1 To Periods, 1 To 2)
    With Sheet1
        X_Values = Application.Transpose(.Range("A2:A11"))
        Y_Values = Application.Transpose(.Range("B2:B11"))
    End With
    FGraph X_Values, Y_Values, Periods, returnArray 'pass 1D array of X, 1D array of Y, Periods, Empty ReturnArray
End Sub
Function FGraph(ByVal x As Variant, ByVal y As Variant, ByVal P As Long, ByRef returnArray As Variant)
    Dim i As Long, mConstant As Double, cConstant As Double
    'calc cumulative Y and take Ln (Assumes Form of Graph is logarithmic!!)
    For i = LBound(y) To UBound(y)
        If i = LBound(y) Then
            y(i) = y(i)
        Else
            y(i) = y(i) + y(i - 1)
        End If
        x(i) = Log(x(i))
    Next i
    'calc line of best fit
    With Application.WorksheetFunction
        mConstant = .LinEst(y, x)(1)
        cConstant = .LinEst(y, x)(2)
    End With
    'redim array to fill for new Periods
    ReDim returnArray(1 To P, 1 To 2)
    'Calc new periods based on line of best fit
    For i = LBound(returnArray, 1) To UBound(returnArray, 1)
        returnArray(i, 1) = UBound(y) / P * i
        If i = LBound(returnArray, 1) Then
            returnArray(i, 2) = (Log(returnArray(i, 1)) * mConstant) + cConstant
        Else
            returnArray(i, 2) = ((Log(returnArray(i, 1)) * mConstant) + cConstant) - _
                ((Log(returnArray(i - 1, 1)) * mConstant) + cConstant)
        End If
    Next i
    'returnArray can be written to range
End Function
EDIT:
This VBA code now calculates the linear trend between the original points on either side of each new period boundary. The data is returned in a two-dimensional array named returnArray.
Sub CallingProc()
    Dim Periods As Long, returnArray() As Variant
    Dim X_Values() As Variant, Y_Values() As Variant
    Periods = 4
    ReDim returnArray(1 To Periods, 1 To 2)
    With Sheet1
        X_Values = Application.Transpose(.Range("A2:A11"))
        Y_Values = Application.Transpose(.Range("B2:B11"))
    End With
    FGraph X_Values, Y_Values, returnArray 'pass 1D array of X, 1D array of Y, Dimensioned ReturnArray
End Sub
Function FGraph(ByVal x As Variant, ByVal y As Variant, ByRef returnArray As Variant)
    Dim i As Long, j As Long, mConstant As Double, cConstant As Double, Period As Long
    Period = UBound(returnArray, 1)
    'calc cumulative Y
    For i = LBound(y) + 1 To UBound(y)
        y(i) = y(i) + y(i - 1)
    Next i
    'Calc new periods based on line of best fit
    For i = LBound(returnArray, 1) To UBound(returnArray, 1)
        returnArray(i, 1) = UBound(y) / Period * i
        'find position of new period to return adjacent original data points
        For j = LBound(x) To UBound(x)
            If returnArray(i, 1) <= x(j) Then Exit For
        Next j
        'calc linear line of best fit between existing data points
        With Application.WorksheetFunction
            mConstant = .LinEst(Array(y(j), y(j - 1)), Array(x(j), x(j - 1)))(1)
            cConstant = .LinEst(Array(y(j), y(j - 1)), Array(x(j), x(j - 1)))(2)
        End With
        returnArray(i, 2) = (returnArray(i, 1) * mConstant) + cConstant
    Next i
    'returnarray holds cumulative % so calc period only %
    For i = UBound(returnArray, 1) To LBound(returnArray, 1) + 1 Step -1
        returnArray(i, 2) = returnArray(i, 2) - returnArray(i - 1, 2)
    Next i
    'returnArray now holds your data
End Function
Returns:
COLLAPSED
1 38.75%
2 34.35%
3 16.95%
4 9.95%
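As a rough Python sketch of the same interpolation idea: build the cumulative curve, read it off at the new period boundaries by linear interpolation, then take differences. The function name resample_curve is hypothetical, and the sketch assumes the cumulative curve starts at 0 before period 1, so that expanding to more periods than the original also has a point to interpolate from. That assumption is not in the VBA above.

```python
def resample_curve(pcts, periods):
    """Spread a per-period percentage curve over a different number of periods."""
    n = len(pcts)
    # Cumulative curve with an assumed origin point: cumulative[0] = 0
    cumulative = [0.0]
    for p in pcts:
        cumulative.append(cumulative[-1] + p)

    def cumulative_at(x):
        # Linear interpolation between the two neighbouring original periods
        lo = min(int(x), n - 1)
        frac = x - lo
        return cumulative[lo] + frac * (cumulative[lo + 1] - cumulative[lo])

    result = []
    previous = 0.0
    for k in range(1, periods + 1):
        # Position of the k-th new period boundary on the original period axis
        current = cumulative_at(n * k / periods)
        result.append(current - previous)
        previous = current
    return result


original = [0.106, 0.19, 0.183, 0.145, 0.107, 0.089, 0.065, 0.031, 0.03, 0.054]
print([round(v, 4) for v in resample_curve(original, 4)])
# [0.3875, 0.3435, 0.1695, 0.0995]
```

The same call with 7 or 24 periods returns that many values, and they still sum to the original total.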

Resources