I'm new to programming and trying to learn Julia. I tried to compute the weighted average cost of short-term stock trading activity, as I did before in R. I rewrote the code in Julia, but unfortunately it returns an incorrect result in data frame format.
I investigated the result of each iteration step by changing return vwavg to println([volume[i], s, unitprice[i], value[i], t, vwavg[i], u]), and the output is correct. Is it a problem with rounding?
I'd really appreciate your help.
# create trial dataset
df = DataFrame(qty = [3, 2, 2, -7, 4, 4, -3,-2, 4, 4, -2, -3],
price = [100.0, 99.0, 101.0, 103.0, 95.0, 93.0, 90.0, 90.0, 93.0, 95.0, 93.0, 92.0])
# create function for weighted average cost of stock price
function vwacost(volume, unitprice)
    value = Vector{Float64}(undef, length(volume))
    vwavg = Vector{Float64}(undef, length(volume))
    for i in 1:length(volume)
        s = 0
        t = 0
        u = 0
        if volume[i]>0
            value[i] = (volume[i]*unitprice[i]) + t
            volume[i] = volume[i] + s
            vwavg[i] = value[i]/volume[i]
            u = vwavg[i]
            s = volume[i]
            t = value[i]
        else
            volume[i] = volume[i] + s
            value[i] = u * volume[i]
            s = volume[i]
            t = value[i]
            vwavg[i] = u
        end
        return vwavg
    end
end
out = transform(df, [:qty, :price] => vwacost)
Simple error:
for i in 1:length(volume)
...
return vwavg
end
should be:
for i in 1:length(volume)
...
end
return vwavg
You are currently returning the result after the first loop iteration, which is why the resulting vwavg vector has only its first entry calculated. All other entries are whatever happened to be in memory when the vwavg vector was created with undef.
OK, the second problem, where the function changes the original df and so produces an incorrect result, can be solved with copy(df):
out = select(copy(df), [:qty, :price] => vwacost => :avgcost)
This way the original df does not change and the result stays consistent across runs.
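For comparison, here is a minimal Python sketch of the same calculation with both fixes applied: the result is returned only after the loop has finished, and the input sequences are never modified. It assumes the running totals (s, t and u in the Julia code) are meant to carry over between iterations, so they are initialised once before the loop. The names held, cost and avg are illustrative and not part of the original code.

```python
def vwacost(volume, unitprice):
    """Running weighted average cost; the inputs are left untouched."""
    held = 0      # shares currently held (s in the Julia code)
    cost = 0.0    # total cost of the shares held (t)
    avg = 0.0     # last weighted average cost (u)
    result = []
    for qty, price in zip(volume, unitprice):
        held += qty
        if qty > 0:
            # Buy: add the purchase to the running cost and re-average.
            cost += qty * price
            avg = cost / held
        else:
            # Sell: the average cost stays the same, the total cost shrinks.
            cost = avg * held
        result.append(avg)
    return result  # returned once, after the loop has finished


qty = [3, 2, 2, -7, 4, 4, -3, -2, 4, 4, -2, -3]
price = [100.0, 99.0, 101.0, 103.0, 95.0, 93.0, 90.0, 90.0, 93.0, 95.0, 93.0, 92.0]
avgcost = vwacost(qty, price)
```

Because the function builds a new list instead of writing into volume, calling it repeatedly on the same data gives the same answer each time.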
Here is my code in Julia. I would like to improve its speed, since it is slow for large datasets. I have provided the code with a small example so it can be executed and produce results. I think the bottleneck is the use of the find function in the loop, which makes the code very slow, but I don't know how to replace it with something faster.
A = [[1,2,3,4,5], [2,3,4,5,6,7,8], [4,7,8,9], [9,10], [2,3,4,5]]
mx = maximum(maximum(ar))
idx_new = zeros(Int, mx)
flag = ones(Int, mx);
Hscore = rand(1, length(A))
thresh = 0.2 * sum(Hscore)
acc_q = 0
pos = sortperm(vec(Hscore))
iter = 1
while acc_q < thresh
    acc_q = acc_q + Hscore[pos[iter]]
    nd = A[pos[iter]]
    fd_flag = flag[nd]
    cc = in.(fd_flag, 2)
    node = nd[findall(x->x==0, cc)]
    dd = nd[findall(x->x!=0, cc)]
    TF = isempty(dd)
    if TF == true
        q_val = Hscore[pos[iter]]
        acc_q = acc_q + q_val
        idx_new[vec(node)] .= (val + 1)
        flag[node] .= 2
        val = val + 1;
        iter = iter + 1
    end # end of if TF
end ## end of while loop
While "please improve my code" is not the right question style for StackOverflow, when you search many times for an element among a large number of candidates, these are the first two approaches you might consider:
Sort the list of elements (with sort!) and use searchsorted to find the desired element.
Use Set(mylist) to create a hash set and then search within the set.
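The two approaches can be sketched in Python as follows. This is only an illustration of the idea, not a translation of the Julia code above: bisect_left from the standard library plays the role of searchsorted, and the built-in set plays the role of Set. The helper name contains_sorted and the sample data are made up for the example.

```python
from bisect import bisect_left


def contains_sorted(sorted_items, target):
    """Binary search for target in an already sorted list, O(log n) per lookup."""
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


items = [9, 2, 7, 4, 10]

# Option 1: sort once, then answer each query with a binary search.
sorted_items = sorted(items)
found_by_bisect = contains_sorted(sorted_items, 7)

# Option 2: build a hash set once, then answer each query in O(1) on average.
item_set = set(items)
found_by_set = 7 in item_set
```

Either way the one-off cost (sorting or hashing) is paid once, so the benefit grows with the number of lookups made inside the loop.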
I need to apply a function until an index limit (distance, in this case) is met. I'm trying to figure out a way to apply the function repeatedly while avoiding recursion issues.
Example:
I want to apply the following code until total_dist = flight_distance (2500 km). The distance traveled in a given flight depends on the energy available. The flight proceeds as a series of jumps and stops--expending and obtaining energy, respectively. If there is enough energy at the start, the flight can be finished in only two jumps (with one stop). Sometimes two or more stops are necessary. I can't know this ahead of time.
So how can I modify jump_metrics to get it to repeat until the total distance covered is 2500?
flight_distance = 2500
flight_markers = c(1:flight_distance)
TD_cost_km = rnorm(2500, 5, 1)
potential_stops = c(1:(flight_distance-1))
cumulative_flight_cost = vector("list", length=length(potential_stops))
for(i in 1:length(potential_stops)) {
cumulative_flight_cost[[i]] = cumsum(TD_cost_km[flight_markers>potential_stops[i]])
}
max_fuel_quantiles = seq(0, 1, length=flight_distance)
jump_metrics = function() {
start_fuel_prob = qbeta(runif(flight_distance, pbeta(max_fuel_quantiles,1,1),
pbeta(max_fuel_quantiles,1,1)), 1.45*2, 1)
start_energy_level_est =
rnorm(1, sample(1544 + (start_fuel_prob * 7569), 1, replace=T),
0.25)
start_max_energy = ifelse(start_energy_level_est < 1544, 1544,
start_energy_level_est)
fuel_level = start_max_energy - cumulative_flight_cost[[1]]
dist_traveled = length(fuel_level[fuel_level>(max(fuel_level)*0.2)])
dist_left = flight_distance - dist_traveled
partial_dist = 1 + dist_traveled
dist_dep_max_energy = c(rep(start_max_energy, length(1:dist_left)),
seq(start_max_energy, 1544,
length.out=length((dist_left+1):flight_distance)))
next_max_energy = dist_dep_max_energy[partial_dist]
next_fuel_level = next_max_energy - cumulative_flight_cost[[partial_dist]]
next_dist_traveled = length(next_fuel_level[next_fuel_level >
(max(next_fuel_level)*0.2)])
total_dist = next_dist_traveled + partial_dist
list(partial_dist, next_dist_traveled, total_dist)
}
jump_metrics()
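One way to avoid recursion is a plain while loop that keeps jumping until the whole distance is covered. The Python sketch below is a deliberately simplified model, not a translation of jump_metrics: it assumes a per-kilometre cost list, a caller-supplied function energy_for_stop(position) that returns the energy available when taking off from a given position, and the rule that a jump ends before fuel would drop to 20% of the take-off energy. All of these names and simplifications are assumptions made for illustration.

```python
def fly(cost_per_km, energy_for_stop, reserve=0.2):
    """Return the length of each jump needed to cover the whole route."""
    total = len(cost_per_km)
    position = 0          # kilometres covered so far
    jumps = []
    while position < total:               # repeat until the route is finished
        energy = energy_for_stop(position)
        threshold = energy * reserve      # stop before dropping to the reserve
        travelled = 0
        while (position + travelled < total
               and energy - cost_per_km[position + travelled] > threshold):
            energy -= cost_per_km[position + travelled]
            travelled += 1
        if travelled == 0:
            # Guard against an endless loop when no progress is possible.
            raise ValueError("not enough energy to advance")
        jumps.append(travelled)
        position += travelled
    return jumps


# Constant cost and constant take-off energy, chosen only to keep the example simple.
jumps = fly([5.0] * 2500, lambda position: 8000.0)
```

The number of stops does not need to be known ahead of time: the outer loop simply runs as many times as it takes, and len(jumps) - 1 gives the number of stops afterwards.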
I have a weird question.
Essentially, I have a function which takes a data frame of dimension Nx(2k) and transforms it into an array of dimension Nx2xk. I then use that array in various places in the function.
My issue is this: when k == 2, I'm left with a matrix of dimension Nx2, and even worse, if N = 1, I'm stuck with a matrix of dimension 1x2.
I would like to write myArray[thisRow,,] to select that slice of the array, but this falls short for the N = 1, k = 2 case. I tried myArray[thisRow,,,drop = FALSE] but that gives an 'incorrect number of dimensions' error. The same issue arises for the Nx2 case.
Is there a workaround for this issue, or do I need to break my code into cases?
Sample Code Shown Below:
thisFunction <- function(myDF)
{
nGroups = NCOL(myDF)/2
afMyArray = myDF
if(nGroups > 1)
{
afMyArray = abind(lapply(1:nGroups, function(g){myDF[,2*(g-1) + 1:2]}),
along = 3)
}
sapply(1:NROW(myDF),
function(r)
{
thisSlice = afMyArray[r,,]
*some operation on thisSlice*
})
}
Thanks,
James
Say I have this dictionary in Lua
places = {dest1 = 10, dest2 = 20, dest3 = 30}
In my program I check whether the dictionary has reached my size limit (3 in this case). How do I push the oldest key/value pair out of the dictionary and add a new one?
places["newdest"] = 50
--places should now look like this, dest3 pushed off and newdest added and dictionary has kept its size
places = {newdest = 50, dest1 = 10, dest2 = 20}
It's not too difficult to do this, if you really needed it, and it's easily reusable as well.
local function ld_next(t, i) -- This is an ordered iterator, oldest first.
if i <= #t then
return i + 1, t[i], t[t[i]]
end
end
local limited_dict = {__newindex = function(t,k,v)
if #t == t[0] then -- Remove the oldest entry.
t[table.remove(t, 1)] = nil
end
table.insert(t, k)
rawset(t, k, v)
end, __pairs = function(t)
return ld_next, t, 1
end}
local t = setmetatable({[0] = 3}, limited_dict)
t['dest1'] = 10
t['dest2'] = 20
t['dest3'] = 30
t['dest4'] = 50
for i, k, v in pairs(t) do print(k, v) end
dest2 20
dest3 30
dest4 50
The order is stored in the numeric indices, with the 0th index indicating the limit of unique keys that the table can have.
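The same idea, a dictionary that remembers insertion order and evicts its oldest key once a limit is reached, can be sketched in Python. This is an illustrative analogue rather than a port of the metatable code: it relies on the built-in dict preserving insertion order (guaranteed since Python 3.7), and the class name LimitedDict is made up for the example.

```python
class LimitedDict:
    """Mapping that holds at most `limit` keys, evicting the oldest first."""

    def __init__(self, limit):
        self.limit = limit
        self._data = {}   # a plain dict keeps keys in insertion order

    def __setitem__(self, key, value):
        # Only a new key can trigger an eviction; updating an existing key
        # keeps its place in the order.
        if key not in self._data and len(self._data) >= self.limit:
            oldest = next(iter(self._data))
            del self._data[oldest]
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def items(self):
        """Return (key, value) pairs, oldest first."""
        return list(self._data.items())


places = LimitedDict(3)
places["dest1"] = 10
places["dest2"] = 20
places["dest3"] = 30
places["dest4"] = 50   # evicts dest1, the oldest key
```

As in the Lua version, the limit is fixed when the container is created and the oldest key is the one dropped.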
Given that dictionary keys do not save their entered position, I wrote something that should be able to help you accomplish what you want, regardless.
function push_old(t, k, v)
local z = fifo[1]
t[z] = nil
t[k] = v
table.insert(fifo, k)
table.remove(fifo, 1)
end
You would need to create the fifo table first, based on the order you entered the keys (for instance, fifo = {"dest3", "dest2", "dest1"}, based on your post, from first entered to last entered), then use:
push_old(places, "newdest", 50)
and the function will do the work. Happy holidays!
What I am trying to accomplish, is to be able to put some values inside an array, then based on a t (0-1), get a value out of the array based on its stored values.
To make this more clear, here's an example:
Array values = [0, 10]
Now this array would return value 0 for t=0 and value 10 for t=1. So t=.3 will give a value of 3.
Another example:
Array values = [10, 5, 5, 35]
t=.25 will give a value of 5
t=.125 will give a value of 7.5
I'm looking for the most efficient formula to get the value at any given t using a given array.
Currently I'm using this (pseudo code)
var t:Number = .25;
var values:Array = [10, 5, 5, 35];
if(t == 1) value = values[values.length-1];
else
var offset:Number = 1/values.length;
var startIndex:int = int(t/offset);
var fraction:Number = t % offset;
var roundPart:Number = (values[startIndex+1] - values[startIndex]) * fraction;
var value:Number = values[startIndex] + roundPart;
But I'm sure there's a far better way of doing this. So I'm calling on the mathematicians here!
Here is a one-liner in Mathematica. It's doing the same thing you are, only slightly more compactly.
Array indexes start at 1.
values = {10, 5, 5, 35, 0}
f[a_, x_] := a[[k = IntegerPart[(k1 = (Dimensions[a][[1]] - 2) x)] + 1]] +
FractionalPart[k1] (a[[k + 1]] - a[[k]])
So your interpolation result on:
In[198]:= f[values,1]
Out[198]= 35
Etc.
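A comparable piecewise-linear lookup can be sketched in Python. It assumes the samples are spread evenly over n-1 segments, so that t=0 maps to the first entry and t=1 to the last; the final index is clamped instead of padding the array with an extra element as the Mathematica version does. The function name interpolate is illustrative.

```python
def interpolate(values, t):
    """Linearly interpolate `values` at t in [0, 1], with evenly spaced samples."""
    if not values:
        raise ValueError("values must not be empty")
    if len(values) == 1:
        return float(values[0])
    position = t * (len(values) - 1)             # fractional index into the array
    index = min(int(position), len(values) - 2)  # clamp so index + 1 stays valid
    fraction = position - index
    return values[index] + (values[index + 1] - values[index]) * fraction


last = interpolate([10, 5, 5, 35], 1)    # the last entry
first = interpolate([10, 5, 5, 35], 0)   # the first entry
```

Clamping the index means t=1 needs no special case: the last segment is evaluated with a fraction of 1.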
If you plot changing the x scale: