Collapsing a 10 period curve to 4 periods - math

I have a 10-period cost curve in the table below. How do I programmatically collapse/condense/shrink it to 4 periods? I'm using VBA, but I should be able to follow other languages. The routine should work for whatever number of periods you pass to it. For example, if I pass it 7, it should condense the percentages to 7 periods. If I pass it 24, it should expand the percentages to 24 periods, spreading them based on the original curve. Any help or example will be appreciated. Thanks...
Period Pct
1 10.60%
2 19.00%
3 18.30%
4 14.50%
5 10.70%
6 8.90%
7 6.50%
8 3.10%
9 3.00%
10 5.40%
EDITED: I've added sample code below as to what I have so far. It only works for periods 1, 2, 3, 5, 9, 10. Maybe someone can help modify it to work for any period. Disclaimer, I'm not a programmer so my coding is bad. Plus, I have no clue as to what I'm doing.
Sub Collapse_Periods()
Dim aPct As Variant
Dim aPer As Variant
aPct = Array(0.106, 0.19, 0.183, 0.145, 0.107, 0.089, 0.065, 0.031, 0.03, 0.054)
aPer = Array(1, 2, 3, 5, 9, 10)
For i = 0 To UBound(aPer)
pm = 10 / aPer(i)
pct1 = 1
p = 0
ttl = 0
For j = 1 To aPer(i)
pct = 0
k = 1
Do While k <= pm
pct = pct + aPct(p) * pct1
pct1 = 1
p = p + 1
If k <> pm And k = Int(pm) Then
pct1 = (pm - Int(pm)) * j
pct = pct + (pct1 * aPct(p))
pct1 = 1 - pct1
End If
k = k + 1
Loop
Debug.Print aPer(i) & " : " & j & " : " & pct
ttl = ttl + pct
Next j
Debug.Print "Total: " & ttl
Next i
End Sub

I would also like to know how this could be done using an integral. This is how I would have done it; perhaps it's a longhand, long-winded method, but I'd like to see some better suggestions.
It's probably easier to see the method in Excel first using the LINEST function and Named ranges. I've assumed the function is logarithmic. I've outlined steps [1.] - [5.]
This VBA code then essentially replicates the Excel method using a function to pass 2 arrays, periods and a return array that can be written to a range
Sub CallingProc()
Dim Periods As Long, returnArray() As Variant
Dim X_Values() As Variant, Y_Values() As Variant
Periods = 4
ReDim returnArray(1 To Periods, 1 To 2)
With Sheet1
X_Values = Application.Transpose(.Range("A2:A11"))
Y_Values = Application.Transpose(.Range("B2:B11"))
End With
FGraph X_Values, Y_Values, Periods, returnArray 'pass 1D array of X, 1D array of Y, Periods, Empty ReturnArray
End Sub
Function FGraph(ByVal x As Variant, ByVal y As Variant, ByVal P As Long, ByRef returnArray As Variant)
Dim i As Long, mConstant As Double, cConstant As Double
'calc cumulative Y and take Ln (Assumes Form of Graph is logarithmic!!)
For i = LBound(y) To UBound(y)
If i = LBound(y) Then
y(i) = y(i)
Else
y(i) = y(i) + y(i - 1)
End If
x(i) = Log(x(i))
Next i
'calc line of best fit
With Application.WorksheetFunction
mConstant = .LinEst(y, x)(1)
cConstant = .LinEst(y, x)(2)
End With
'redim array to fill for new Periods
ReDim returnArray(1 To P, 1 To 2)
'Calc new periods based on line of best fit
For i = LBound(returnArray, 1) To UBound(returnArray, 1)
returnArray(i, 1) = UBound(y) / P * i
If i = LBound(returnArray, 1) Then
returnArray(i, 2) = (Log(returnArray(i, 1)) * mConstant) + cConstant
Else
returnArray(i, 2) = ((Log(returnArray(i, 1)) * mConstant) + cConstant) - _
((Log(returnArray(i - 1, 1)) * mConstant) + cConstant)
End If
Next i
'returnArray can be written to range
End Function
This VBA code instead calculates the linear trend between the original points on either side of each new period boundary. The data is returned in a two-dimensional array named returnArray:
Sub CallingProc()
Dim Periods As Long, returnArray() As Variant
Dim X_Values() As Variant, Y_Values() As Variant
Periods = 4
ReDim returnArray(1 To Periods, 1 To 2)
With Sheet1
X_Values = Application.Transpose(.Range("A2:A11"))
Y_Values = Application.Transpose(.Range("B2:B11"))
End With
FGraph X_Values, Y_Values, returnArray 'pass 1D array of X, 1D array of Y, Dimensioned ReturnArray
End Sub
Function FGraph(ByVal x As Variant, ByVal y As Variant, ByRef returnArray As Variant)
Dim i As Long, j As Long, mConstant As Double, cConstant As Double, Period As Long
Period = UBound(returnArray, 1)
'calc cumulative Y
For i = LBound(y) + 1 To UBound(y)
y(i) = y(i) + y(i - 1)
Next i
'Calc new periods based on line of best fit
For i = LBound(returnArray, 1) To UBound(returnArray, 1)
returnArray(i, 1) = UBound(y) / Period * i
'find position of new period to return adjacent original data points
For j = LBound(x) To UBound(x)
If returnArray(i, 1) <= x(j) Then Exit For
Next j
'calc linear line of best fit between existing data points
With Application.WorksheetFunction
mConstant = .LinEst(Array(y(j), y(j - 1)), Array(x(j), x(j - 1)))(1)
cConstant = .LinEst(Array(y(j), y(j - 1)), Array(x(j), x(j - 1)))(2)
End With
returnArray(i, 2) = (returnArray(i, 1) * mConstant) + cConstant
Next i
'returnarray holds cumulative % so calc period only %
For i = UBound(returnArray, 1) To LBound(returnArray, 1) + 1 Step -1
returnArray(i, 2) = returnArray(i, 2) - returnArray(i - 1, 2)
Next i
'returnArray now holds your data
End Function
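As a language-neutral illustration of the same idea (interpolating the cumulative curve and differencing it), here is a minimal Python sketch. It is a variant, not a port of the VBA above: it assumes the cumulative share is 0 at the start of period 1 and grows linearly inside each original period. The name `resample_curve` is made up for this example.

```python
def resample_curve(pcts, periods):
    """Redistribute per-period shares over `periods` equal-width periods.

    Assumption: the cost is spread evenly inside each original period,
    so the cumulative curve is piecewise linear and starts at 0.
    """
    n = len(pcts)
    # cum[i] = cumulative share at the end of original period i (cum[0] = 0)
    cum = [0.0]
    for p in pcts:
        cum.append(cum[-1] + p)

    def cum_at(x):
        # Cumulative share at position x, measured in original periods (0..n)
        x = min(max(x, 0.0), float(n))
        i = min(int(x), n - 1)
        return cum[i] + (x - i) * pcts[i]

    # Boundaries of the new periods, expressed in original-period units
    edges = [n * k / periods for k in range(periods + 1)]
    return [cum_at(edges[k + 1]) - cum_at(edges[k]) for k in range(periods)]


pcts = [0.106, 0.19, 0.183, 0.145, 0.107, 0.089, 0.065, 0.031, 0.03, 0.054]
for share in resample_curve(pcts, 4):
    print(f"{share:.2%}")
```

For the table in the question, 4 periods come out at roughly 38.75%, 34.35%, 16.95% and 9.95%. The same call with 7 or 24 periods condenses or expands the curve, and the shares still sum to 100%.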
How to Mimic Curve stretched over longer term

I am trying to find a way to mimic the curve or a product release schedule.
This is the release schedule for the number of units of a product. We have a new product coming out that will follow the same release curve, which should look like this:
I want it to have the same curve but stretched over a longer period of time:
I tried to just break it down by the phase but the results didn't match the same curve:
I have access to excel, VBA, and R. Is there a way that I can get a calculation that would allow me to match the curve structure if I have a different length of time and possibly a different total amount of units? I know I'm supposed to show what I have tried as well but nothing has gotten me even close.
You can get your NewData with Interpolation:
'Year .... 0 - 25
'x .... =year/25*17
'Units .... =Linterp($A$2:$B$19; x) --- A2:B19 is your input
Function Linterp(ByRef Tbl As Range, ByRef dX As Double) As Variant
' copied from
' shg 06 Jun 1997
' Linear interpolator / extrapolator
' Tbl is a two-column range containing known x, known y, sorted ascending
Dim i As Long ' index to Tbl
Dim nRow As Long ' rows in Tbl
Dim dXAbv As Double ' Tbl value above dX
Dim dXBlo As Double ' Tbl values below dX
Dim dRF As Double ' row fraction
nRow = Tbl.Rows.Count
If nRow < 2 Or Tbl.Columns.Count <> 2 Then
Linterp = "Table must have >= 2 rows, exactly two columns"
Exit Function '-------------------------------------------------------->
End If
If dX < Tbl(1, 1).Value Then ' dX < xmin, extrapolate first two entries
i = 1
Else
i = WorksheetFunction.Match(dX, WorksheetFunction.Index(Tbl, 0, 1), 1)
If dX = Tbl(i, 1).Value Then ' dX is exact from table
Linterp = Tbl(i, 2)
Exit Function '---------------------------------------------------->
ElseIf i = nRow Then ' dX > xmax, extrapolate last two entries
i = nRow - 1
' dX lies between two rows, so interpolate entries i, i+1
' which is what happens by default
End If
End If
dXAbv = Tbl(i, 1).Value
dXBlo = Tbl(i + 1, 1).Value
dRF = (dX - dXAbv) / (dXBlo - dXAbv) ' row fraction
Linterp = Tbl(i, 2).Value * (1 - dRF) + Tbl(i + 1, 2).Value * dRF
End Function
Your input data is in A2:B19!
The new data is columns D, E and F!
D1: Year
D2: 0
D3: 1
E1: x
E2: =D2/25*17
E3: =D3/25*17
F1: Units
F2: =Linterp($A$2:$B$19;E2)
F3: =Linterp($A$2:$B$19;E3)
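The worksheet layout above maps each new year onto the old time axis (x = year / 25 * 17) and interpolates. A rough Python sketch of the same mapping follows. The function names are hypothetical, the old schedule is assumed to sit at years 0, 1, 2, ..., and the optional rescaling to a different total number of units is an addition that the formulas above do not perform.

```python
import bisect


def linterp(xs, ys, x):
    """Linear interpolation on ascending xs; extrapolates from the end segments."""
    if x <= xs[0]:
        i = 0
    elif x >= xs[-1]:
        i = len(xs) - 2
    else:
        i = bisect.bisect_right(xs, x) - 1  # largest i with xs[i] <= x
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] * (1 - frac) + ys[i + 1] * frac


def stretch_curve(units, new_last_year, new_total=None):
    """Stretch a schedule given for years 0..len(units)-1 over years 0..new_last_year."""
    old_last = len(units) - 1
    xs = list(range(len(units)))
    out = [linterp(xs, units, year * old_last / new_last_year)
           for year in range(new_last_year + 1)]
    if new_total is not None:
        # Assumption: keep the shape and rescale so the units sum to new_total
        scale = new_total / sum(out)
        out = [v * scale for v in out]
    return out


print(stretch_curve([0, 10, 20, 10, 0], 8))
```

With an 18-row input (years 0 to 17), `stretch_curve(units, 25)` would correspond to the E and F columns above.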

I have some difficulty counting a(n) = the number of Niven (Harshad) numbers not exceeding n

Harshad/Niven numbers are positive numbers that are divisible by the sum of their digits. All single-digit numbers are Harshad numbers.
For example, 27 is a Harshad number, as 2 + 7 = 9 and 9 is a divisor of 27. Count a(n) = the number of Niven (Harshad) numbers not exceeding n (n <= 1e12).
We can define a function checkHarshad(n) (calculate the sum of digits S(n) and check whether n % S(n) == 0), then count the Harshad numbers in the range [1..n]. But that program only runs fast enough if n <= 1e7.
Hiroaki Yamanouchi wrote Python code that calculates a(10^n), but I don't understand it, and I don't think it helps with my problem for arbitrary n <= 1e12.
def number_of_niven_numbers(digits):
    """
    Count the number of Niven numbers
    less than or equal to 10^digits.
    """
    N = digits
    ret = 0
    cnts = [0] * (digits + 1)
    for digit_sum in range(1, N * 9 + 1):
        # curr[s][m]: number of digit strings so far with digit sum s and value mod digit_sum == m
        curr = [[0] * digit_sum]
        curr[0][0] = 1
        next_mods = [0] * digit_sum
        for i in range(digit_sum):
            next_mods[i] = 10 * i % digit_sum
        for left_digits in range(N):
            left_sum_max = min(9 * left_digits, digit_sum)
            next = [[0] * digit_sum for _ in range(min(left_sum_max + 9, digit_sum) + 1)]
            for left_sum in range(left_sum_max + 1):
                for left_mod in range(digit_sum):
                    cnt = curr[left_sum][left_mod]
                    if cnt == 0:
                        continue
                    next_mod_base = next_mods[left_mod]
                    for next_digit in range(min(min(digit_sum, 9), digit_sum - left_sum) + 1):
                        next_sum = left_sum + next_digit
                        next_mod = next_mod_base + next_digit
                        if next_mod >= digit_sum:
                            next_mod -= digit_sum
                        next[next_sum][next_mod] += cnt
            curr = next
            if digit_sum < len(curr):
                cnts[left_digits + 1] += curr[digit_sum][0]
    # + 1 accounts for 10^digits itself (digit sum 1)
    return cnts[N] + 1
So, how can we calculate a(n) up to 1e12?
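For reference, here is one possible way to extend the same idea to an arbitrary upper bound. This is a sketch and not taken from the code above: for each candidate digit sum s, a digit DP walks the decimal digits of n, tracking the remaining digit sum, the value modulo s, and whether the prefix is still equal to the prefix of n. The function names are made up for this example, and the brute-force version mirrors the checkHarshad approach from the question so that the two can be compared on small inputs.

```python
from functools import lru_cache


def count_harshad_brute(n):
    """Reference implementation: test every k in [1, n] directly."""
    return sum(1 for k in range(1, n + 1)
               if k % sum(int(c) for c in str(k)) == 0)


def count_harshad(n):
    """Count Harshad numbers in [1, n] with one digit DP per digit sum."""
    digits = [int(c) for c in str(n)]
    length = len(digits)
    total = 0
    for s in range(1, 9 * length + 1):

        @lru_cache(maxsize=None)
        def go(pos, rem_sum, mod, tight):
            # rem_sum: digit sum still to be placed in the remaining positions
            if rem_sum < 0 or rem_sum > 9 * (length - pos):
                return 0
            if pos == length:
                return 1 if rem_sum == 0 and mod == 0 else 0
            hi = digits[pos] if tight else 9
            return sum(
                go(pos + 1, rem_sum - d, (mod * 10 + d) % s, tight and d == hi)
                for d in range(hi + 1)
            )

        # Numbers <= n whose digit sum is exactly s and that are divisible by s
        total += go(0, s, 0, True)
    return total


print(count_harshad(10))  # 1..9 and 10
```

The number of DP states grows with the cube of the digit count rather than with n itself, so the loop over every k is avoided; pure Python may still need a while for n near 1e12.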

Formatting data for clogit model

I have a survey with 53 respondents answering 8 questions each.
Currently it's in an Excel document in the format:
Person # | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8
Each question had three possible responses, "1", "2", or "3". For a given person, each question has a single number indicating the response.
I need to transform the answers from each person into one long column vector with responses coded in binary for each of the three choices. So for each person, there should be 24 rows (3 for each question), and for each question, there should be one row with a 1 (indicating the choice that was made) and two rows with 0's.
I've tried doing this in Excel and in R and cannot figure out how to do it without manually entering each value.
Please tell me there's a better way?
See explanation of code as inline comments
Sub Demo()
Dim wsSource As Worksheet
Dim wsDest As Worksheet
Dim rSource As Range
Dim rDest As Range
Dim vSource As Variant
Dim i As Long, j As Long
'--> adjust to suit your needs
' set up source and destination references
Set wsSource = Worksheets("SourceData")
Set wsDest = Worksheets("DestData")
'--> adjust to suit your needs
' Assumes source data has header in row 1, names in column A and responses in B..I
With wsSource
Set rSource = .Range(.Cells(2, 9), .Cells(.Rows.Count, 1).End(xlUp))
End With
'--> adjust to suit your needs
' Assumes generated data starts as cell A1
Set rDest = wsDest.Cells(1, 1)
' Get Source data
vSource = rSource.Value
' Size Destination data array
ReDim vDest(1 To UBound(vSource, 1) * 24, 1 To 2)
' Generate reformatted data
For i = 1 To UBound(vSource, 1) ' For each Person
For j = 1 To 24 ' Add person name
vDest((i - 1) * 24 + j, 1) = vSource(i, 1)
Next j
For j = 1 To 8 ' Code 8 results
vDest((i - 1) * 24 + (j - 1) * 3 + 1, 2) = IIf(vSource(i, j + 1) = 1, 1, 0)
vDest((i - 1) * 24 + (j - 1) * 3 + 2, 2) = IIf(vSource(i, j + 1) = 2, 1, 0)
vDest((i - 1) * 24 + (j - 1) * 3 + 3, 2) = IIf(vSource(i, j + 1) = 3, 1, 0)
Next j
Next i
' Place result on sheet
rDest.Resize(UBound(vDest, 1), UBound(vDest, 2)) = vDest
End Sub
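The same reshaping can be sketched outside Excel. The Python version below is only an illustration with made-up names: it assumes each respondent is given as an identifier plus a list of answers coded 1, 2 or 3, and it emits one row per person, question and choice with a 0/1 indicator.

```python
def to_long_format(responses, n_choices=3):
    """Expand wide survey answers into (person, question, choice, chosen) rows."""
    rows = []
    for person, answers in responses:
        for question, answer in enumerate(answers, start=1):
            for choice in range(1, n_choices + 1):
                # 1 on the row of the selected choice, 0 on the other rows
                rows.append((person, question, choice, 1 if answer == choice else 0))
    return rows


sample = [(1, [1, 3, 2, 2, 1, 3, 3, 1]),
          (2, [2, 2, 2, 1, 1, 1, 3, 3])]
long_rows = to_long_format(sample)
print(len(long_rows))   # 24 rows per person
print(long_rows[:3])    # the three rows for person 1, question 1
```

Keeping the question and choice numbers next to the indicator is a design choice made here so that the grouping stays visible; the VBA routine above writes only the name and the indicator.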

Random function and calculating percentage

Using a random library with these functions:
randomChance(p) Returns true with the probability indicated by p.
randomInteger(low, high) Returns a random integer in the range low to high, inclusive.
What is the easiest way to implement a "random selector" that takes percentages into account (1/4, 1/3, etc.)? I have an array with key/value pairs. For example, "a" might have the value 2 and "b" the value 2, giving a 1/2 chance for each.
The max value will be the size of the array, because it only contains unique items. The randomChance() function takes a value in the range 0.0 - 1.0, where 1 = 100%. If my array size is, say, 4, what is the best way of "letting 4 be 1"?
Let's say you have:
a = 2, b = 2, c = 1, d = 3
now make it:
a = 2, b = 4, c = 5, d = 8
Create a random number from 1 to MaxVal (value of the last key, 8 in this example). Select the first Key where Value >= RandomNum
I wrote a small VB.Net program to show the algorithm and how it works. The code is not meant to be good, elegant, performant or readable.
Module Module1
Private Class Value
Public vOrg, vRecalc, HitCount As Integer
Public Key As String
Public Sub New(s, v1, v2, c)
Key = s : vOrg = v1 : vRecalc = v2 : HitCount = c
End Sub
End Class
Sub Main()
' set initial values
Dim KVP() As Value = {New Value("A", 2, 0, 0),
New Value("B", 2, 0, 0),
New Value("C", 1, 0, 0),
New Value("D", 3, 0, 0)}
' recalc values
For i = 0 To KVP.Length - 1
If i = 0 Then KVP(0).vRecalc = KVP(0).vOrg Else KVP(i).vRecalc = KVP(i).vOrg + KVP(i - 1).vRecalc
Next
' do test
Dim r As New Random
Dim runs As Integer = 1000 * 1000, maxval As Integer = KVP(KVP.Length - 1).vRecalc
For i = 1 To runs
Dim RandVal = r.Next(1, maxval + 1)
Dim chosen As Integer = (From j In Enumerable.Range(0, KVP.Length) Where KVP(j).vRecalc >= RandVal Take 1 Select j)(0)
KVP(chosen).HitCount += 1
Next
' output results
For Each kv In KVP
Console.WriteLine("{0} was chosen with {1:F3} probability, expected was {2:F3}", kv.Key, kv.HitCount / CDbl(runs), kv.vOrg / CDbl(maxval))
Next
End Sub
End Module
An output sample:
A was chosen with 0.250 probability, expected was 0.250
B was chosen with 0.251 probability, expected was 0.250
C was chosen with 0.124 probability, expected was 0.125
D was chosen with 0.375 probability, expected was 0.375
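The cumulative-sum selection described above can also be sketched in a few lines of Python. This is an illustration only: `weighted_pick` is a made-up name, and `random.randint` stands in for the library's randomInteger(low, high), on the assumption that both return an integer in an inclusive range.

```python
import bisect
import random
from itertools import accumulate


def weighted_pick(weights, random_integer=random.randint):
    """Pick a key with probability proportional to its weight."""
    keys = list(weights)
    # Running totals, e.g. a=2, b=2, c=1, d=3 becomes [2, 4, 5, 8]
    cumulative = list(accumulate(weights[k] for k in keys))
    r = random_integer(1, cumulative[-1])
    # First position whose running total is >= r
    return keys[bisect.bisect_left(cumulative, r)]


weights = {"a": 2, "b": 2, "c": 1, "d": 3}
print(weighted_pick(weights))
```

Passing the random source as a parameter is a choice made here so that the selection logic can be checked with fixed values; with the library from the question, `randomInteger` would be passed in instead.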
Just multiply the randomChance() outcome and the array length together. It'll give you an index in the range [0, array_length-1], which you can use to access the array:
array_index = (unsigned int)(randomChance(p) * (array_length - 1));
Maybe you mean "letting 3 be 1" (not 4) in your example. The last index of an array of length 4 is 3.

When will this Recurrence Relation repeat

I have this recurrence formula:
P(n) = ( P(n-1) + 2^(n/2) ) % (X)
s.t. P(1) = 2;
where n/2 is computer integer division, i.e. the floor of n/2.
Since I am taking mod X, this relation should repeat within at most X outputs,
but it can start repeating before that.
How do I find this value?
It needn't repeat within X terms. Consider X = 3:
P(1) = 2
P(2) = (P(1) + 2^(2/2)) % 3 = 4 % 3 = 1
P(3) = (P(2) + 2^(3/2)) % 3 = (1 + 2) % 3 = 0
P(4) = (P(3) + 2^(4/2)) % 3 = 4 % 3 = 1
P(5) = (P(4) + 2^(5/2)) % 3 = (1 + 4) % 3 = 2
P(6) = (P(5) + 2^(6/2)) % 3 = (2 + 8) % 3 = 1
P(7) = (P(6) + 2^(7/2)) % 3 = (1 + 8) % 3 = 0
P(8) = (P(7) + 2^(8/2)) % 3 = 16 % 3 = 1
P(9) = (P(8) + 2^(9/2)) % 3 = (1 + 16) % 3 = 2
P(10) = (P(9) + 2^(10/2)) % 3 = (2 + 32) % 3 = 1
P(11) = (P(10) + 2^(11/2)) % 3 = (1 + 32) % 3 = 0
P(12) = (P(11) + 2^(12/2)) % 3 = (0 + 64) % 3 = 1
and you see that the period is 4.
Generally (suppose X is odd, it's a bit more involved for even X), let k be the period of 2 modulo X, i.e. k > 0, 2^k % X = 1, and k is minimal with these properties (see below).
Consider all arithmetic modulo X. Then
P(n) = 2 + ∑_{j=2}^{n} 2^(j/2)
It is easier to see when we separately consider odd and even n:
P(2*m+1) = 2 + 2 * ∑_{i=1}^{m} 2^i = 2 * ∑_{i=0}^{m} 2^i = 2*(2^(m+1) - 1) = 2^((n+2)/2) + 2^((n+1)/2) - 2
since each 2^j appears twice, for j = 2*i and j = 2*i+1. For even n = 2*m, there's one summand 2^m missing, so
P(2*m) = 2^(m+1) + 2^m - 2 = 2^((n+2)/2) + 2^((n+1)/2) - 2
and we see that the length of the period is 2*k, since the changing parts 2^((n+1)/2) and 2^((n+2)/2) have that period. The period immediately begins, there is no pre-period part (there can be a pre-period for even X).
Now k <= φ(X) by Euler's generalisation of Fermat's theorem, so the period is at most 2 * φ(X).
(φ is Euler's totient function, i.e. φ(n) is the number of integers 1 <= k <= n with gcd(n,k) = 1.)
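The claim that the period is 2*k for odd X can be checked numerically. The Python sketch below uses hypothetical helper names: it generates the sequence directly from the recurrence, computes the multiplicative order k of 2 modulo X by repeated doubling, and finds the smallest shift under which a window of 4*X terms agrees with itself.

```python
def sequence(x, count):
    """First `count` terms P(1), P(2), ... of P(n) = (P(n-1) + 2^(n//2)) % x."""
    terms = [2 % x]
    for n in range(2, count + 1):
        terms.append((terms[-1] + pow(2, n // 2, x)) % x)
    return terms


def order_of_2(x):
    """Smallest k > 0 with 2^k % x == 1 (x must be odd and > 1)."""
    k, value = 1, 2 % x
    while value != 1:
        value = (value * 2) % x
        k += 1
    return k


def minimal_period(x):
    """Smallest shift m such that the first 4*x terms repeat with period m."""
    terms = sequence(x, 4 * x)
    for m in range(1, 2 * x + 1):
        if all(terms[i] == terms[i + m] for i in range(len(terms) - m)):
            return m
    return None


print(sequence(3, 12))
for x in (3, 5, 7, 9, 11, 15, 21):
    print(x, order_of_2(x), minimal_period(x))
```

For X = 3 this reproduces the twelve values listed above. The window of 4*X terms is an assumption that is long enough here because the period is at most 2 * φ(X) < 2*X.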
What makes it possible that the period is longer than X is that P(n+1) is not completely determined by P(n), the value of n also plays a role in determining P(n+1), in this case the dependence is simple, each power of 2 being used twice in succession doubles the period of the pure powers of 2.
Consider the sequence a[k] = (2^k) % X for odd X > 1. It has the simple recurrence
a[0] = 1
a[k+1] = (2 * a[k]) % X
so each value completely determines the next, thus the entire following part of the sequence. (Since X is assumed odd, it also determines the previous value [if k > 0] and thus the entire previous part of the sequence. With H = (X+1)/2, we have a[k-1] = (H * a[k]) % X.)
Hence if the sequence assumes one value twice (and since there are only X possible values, that must happen within the first X+1 values), at indices i and j = i+p > i, say, the sequence repeats and we have a[k+p] = a[k] for all k >= i. For odd X, we can go back in the sequence, therefore a[k+p] = a[k] also holds for 0 <= k < i. Thus the first value that occurs twice in the sequence is a[0] = 1.
Let p be the smallest positive integer with a[p] = 1. Then p is the length of the smallest period of the sequence a, and a[k] = 1 if and only if k is a multiple of p, thus the set of periods of a is the set of multiples of p. Euler's theorem says that a[φ(X)] = 1, from that we can conclude that p is a divisor of φ(X), in particular p <= φ(X) < X.
Now back to the original sequence.
P(n) = 2 + a[1] + a[1] + a[2] + a[2] + ... + a[n/2]
= a[0] + a[0] + a[1] + a[1] + a[2] + a[2] + ... + a[n/2]
Since each a[k] is used twice in succession, it is natural to examine the subsequences for even and odd indices separately,
E[m] = P(2*m)
O[m] = P(2*m+1)
then the transition from one value to the next is more regular. For the even indices we find
E[m+1] = E[m] + a[m] + a[m+1] = E[m] + 3*a[m]
and for the odd indices
O[m+1] = O[m] + a[m+1] + a[m+1] = O[m] + 2*a[m+1]
Now if we ignore the modulus for the moment, both E and O are geometric sums, so there's an easy closed formula for the terms. They have been given above (in slightly different form),
E[m] = 3 * 2^m - 2 = 3 * a[m] - 2
O[m] = 2 * 2^(m+1) - 2 = 2 * a[m+1] - 2 = a[m+2] - 2
So we see that O has the same minimal period as a, namely p, and p is also a period of E. Unless X is divisible by 3, p is also the minimal positive period of E (if X is divisible by 3, the minimal positive period of E could be a proper divisor of p; for X = 3, e.g., E is constant).
Thus we see that 2*p is a period of the sequence P obtained by interlacing E and O.
It remains to be seen that 2*p is the minimal positive period of P. Let m be the minimal positive period. Then m is a divisor of 2*p.
Suppose m were odd, m = 2*j+1. Then
P(1) = P(m+1) = P(2*m+1)
P(2) = P(m+2) = P(2*m+2)
and consequently
P(2) - P(1) = P(m+2) - P(m+1) = P(2*m+2) - P(2*m+1)
But P(2) - P(1) = a[1] and
P(m+2) - P(m+1) = a[(m+2)/2] = a[j+1]
P(2*m+2) - P(2*m+1) = a[(2*m+2)/2] = a[m+1] = a[2*j+2]
So we must have a[1] = a[j+1], hence j is a period of a, and a[j+1] = a[2*j+2], hence j+1 is a period of a too. But that means that 1 is a period of a, which implies X = 1, a contradiction.
Therefore m is even, m = 2*j. But then j is a period of O (and of E), thus a multiple of p. On the other hand, m <= 2*p implies j <= p, and the only (positive) multiple of p satisfying that inequality is p itself, hence j = p, m = 2*p.
