Mutate lead variables - julia

This is my code:
data = @pipe data |> sort(_, :year) |> groupby(_, :id) |>
transform(_, :a1 => (x -> lead(x,1) .- x ) => :b1; ungroup = false) |>
transform(_, :a2 => (x -> lead(x,2) .- x ) => :b2; ungroup = false) |>
transform(_, :a3 => (x -> lead(x,3) .- x ) => :b3, ungroup = false) |>
transform(_, :a4 => (x -> lead(x,4) .- x ) => :b4, ungroup = false) |>
transform(_, :a5 => (x -> lead(x,5) .- x ) => :b5)
Is this the most efficient way to mutate lead variables on a very large dataset?
data is a DataFrame with columns :a1, ..., :a5, and I want to add columns :b1, ..., :b5, where bi is built from lead(ai, i) (in the code above, lead(ai, i) .- ai).

"Most efficient" has many dimensions, but I assume you want to avoid as many allocations as possible. In that case, do the following:
@pipe data |>
sort!(_, :year) |>
groupby(_,:id) |>
transform!(_, [Symbol(:a, i) => (x -> lead(x, i) .- x ) => Symbol(:b, i) for i in 1:5])
This way you update the data frame data in place, minimizing copying (your solution copies a lot).
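To make concrete what each new column holds, here is a minimal plain-Python sketch (not DataFrames.jl code) of the per-group computation lead(x, i) .- x. The helper names lead and lead_diff are made up for this illustration, and None stands in for Julia's missing.

```python
def lead(xs, n):
    """Shift xs forward by n positions, padding the tail with None."""
    return [xs[k + n] if k + n < len(xs) else None for k in range(len(xs))]


def lead_diff(xs, n):
    """Element-wise lead(xs, n) - xs, propagating None like Julia's missing."""
    return [None if a is None else a - b for a, b in zip(lead(xs, n), xs)]


# One group's values, assumed to be already sorted by year
a1 = [10, 13, 19, 20]
b1 = lead_diff(a1, 1)  # [3, 6, 1, None]
b2 = lead_diff(a1, 2)  # [9, 7, None, None]
```

The trailing None entries show why the shift has to happen within each id group: the last i rows of every group have no value i steps ahead.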


SymPy for getting coefficients for ALL combinations of the variables of a multivariable polynomial

How can I get the coefficients for ALL combinations of the variables of a multivariable polynomial using SymPy.jl or another Julia package for symbolic computation?
Here is an example from MATLAB,
syms a b x y
[cxy, txy] = coeffs(a*x^2 + b*y, [y x], 'All')
cxy =
[ 0, 0, b]
[ a, 0, 0]
txy =
[ x^2*y, x*y, y]
[ x^2, x, 1]
My goal is to get
[ x^2*y, x*y, y]
[ x^2, x, 1]
instead of [x^2, y]
I asked the same question at
https://github.com/JuliaPy/SymPy.jl/issues/482
and
https://discourse.julialang.org/t/symply-jl-for-getting-coefficients-for-all-combination-of-the-variables-of-a-multivariable-polynomial/89091
but I think I should also ask whether this can be done using SymPy from Python.
Using Julia, I tried the following,
julia> @syms x, y, a, b
julia> ff = sympy.Poly(a*x^2 + b*y, (x,y))
Poly(a*x**2 + b*y, x, y, domain='ZZ[a,b]')
julia> [prod(ff.gens.^i) for i in ff.monoms()]
2-element Vector{Sym}:
x^2
y
This is a longer form rewrite of the one-liner in the comment.
It uses Pipe.jl to write expressions 'functionally', so familiarity with the pipe operator (|>) and Pipe.jl will help.
using SymPy
using Pipe
@syms x, y, a, b
ff = sympy.Poly(a*x^2 + b*y, (x,y))
max_degrees =
    @pipe ff.monoms() .|> collect |> hcat(_...) |>
    reduce(max, _, dims=2) |> vec
degree_iter =
    @pipe max_degrees .|> UnitRange(0, _) |>
    tuple(_...) |> CartesianIndices
result = [prod(ff.gens.^Tuple(I)) for I in degree_iter] |>
reverse |> eachcol |> collect
Or, using more of the Python methods:
[prod(ff.gens.^I) for
I in Iterators.product((0:d for d in ff.degree.(ff.gens))...)] |>
reverse |> eachcol |> collect
Both give the desired result:
2-element Vector{...}:
[x^2*y, x*y, y]
[x^2, x, 1]
UPDATE:
If there are more than 2 generators, the result needs to be an Array of higher dimension. The final reverse and column-splitting steps are then immaterial, and the expressions become:
Method 1:
max_degrees =
    @pipe ff.monoms() .|> collect |> hcat(_...) |>
    reduce(max, _, dims=2) |> vec
degree_iter =
    @pipe max_degrees .|> UnitRange(0, _) |>
    tuple(_...) |> CartesianIndices
result = [prod(ff.gens.^Tuple(I)) for I in degree_iter]
Method 2:
result = [prod(ff.gens.^I) for
    I in Iterators.product((0:d for d in ff.degree.(ff.gens))...)]
Thanks a lot @Dan Getz. Your solution works for the toy example from MATLAB. My real case is more complicated: it has more variables and more polynomials. I tried your method with 3 variables:
using SymPy
@syms x, y, z, a, b
ff = sympy.Poly(a*x^2 + b*y + z^2 + x*y + y*z, (x, y, z))
[prod(ff.gens.^Tuple(I)) for I in CartesianIndices(tuple(UnitRange.(0,vec(reduce(max, hcat(collect.(ff.monoms())...), dims=1)))...))]
I got the following error,
ERROR: LoadError: DimensionMismatch: arrays could not be broadcast to a common size; got a dimension with lengths 3 and 5
Stacktrace:
How can your method be generalized to any number of variables with different degrees, e.g., x^3 + y^3 + z^3 + x*y*z + x*y^2*z?
You can find the degree of each of the two variables of interest and then use them to create the matrix of generators; you can use them to get the coefficients of interest. I am not sure what you expect if the equation were like a*x**2 + b*y + c...
>>> from sympy import *
>>> from sympy.abc import a, b, x, y
>>> eq = a*x**2 + b*y
>>> deg = lambda x: Poly(eq, x).degree() # helper to give degree in "x"
>>> v = (Matrix([x**i for i in range(deg(x),-1,-1)]
... )*Matrix([y**i for i in range(deg(y),-1,-1)]).T).T; v
Matrix([[x**2*y, x*y, y], [x**2, x, 1]])
>>> Matrix(*v.shape, [eq.coeff(i) if i.free_symbols else eq.as_coeff_Add()[0]
... for i in v])
Matrix([[0, 0, b], [a, 0, 0]])
From @jverzani (thanks)
using SymPy;
@syms a b x y;
eq = a*x^2 + b*y;
deg = x -> sympy.Poly(eq, x).degree();
xs, ys = [x^i for i ∈ deg(x):-1:0], [y^i for i ∈ deg(y):-1:0];
v = permutedims(xs .* permutedims(ys));
M = [x^2*y x*y y; x^2 x 1];
[length(free_symbols(i)) > 0 ? eq.coeff(i) : eq.as_coeff_add()[1] for i ∈ v];
[0 0 b; a 0 0]
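For the follow-up about more than two generators: the shape of the result depends only on the largest exponent of each generator. The DimensionMismatch quoted above is what happens when the maximum is taken per monomial instead of per generator (dims=1 on the hcat-ed matrix rather than dims=2), so 5 monomials yield 5 maxima for 3 generators. A plain-Python sketch of that step follows. It does not call SymPy; monoms is typed in by hand in the form Poly.monoms() returns (one exponent tuple per term), and the function names are made up for illustration.

```python
from itertools import product


def max_degrees(monoms):
    """Largest exponent of each generator across all terms."""
    return tuple(max(col) for col in zip(*monoms))


def exponent_grid(monoms):
    """Every exponent combination from 0 up to each generator's max degree."""
    return list(product(*(range(d + 1) for d in max_degrees(monoms))))


# Exponent tuples of a*x^2 + b*y + z^2 + x*y + y*z in the generators (x, y, z)
monoms = [(2, 0, 0), (0, 1, 0), (0, 0, 2), (1, 1, 0), (0, 1, 1)]

degrees = max_degrees(monoms)  # (2, 1, 2): one entry per generator, not per term
grid = exponent_grid(monoms)   # 3 * 2 * 3 = 18 exponent tuples
```

Each tuple in grid corresponds to one monomial prod(gens .^ exponents), which is what the comprehensions above build symbolically.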

Take while running total smaller than value

I am trying to generate a list of even integers for as long as the sum of the items in the list is less than or equal to a given number.
For instance, if the threshold k is 20, then the expected output is [0;2;4;6;8].
I can generate a list whose largest value does not exceed the threshold like this:
let listOfEvenNumbersSmallerThanTwenty =
    Seq.unfold (fun x -> Some(x, x + 1)) 0 // natural numbers
    |> Seq.filter (fun x -> x % 2 = 0) // even numbers
    |> Seq.takeWhile (fun x -> x <= 20)
    |> List.ofSeq
(I know that I can combine the unfold and filter to Some(x, x + 2) but this task is for educational purposes)
I managed to create a different list with a running total smaller than the threshold:
let runningTotal =
    listOfEvenNumbersSmallerThanTwenty
    |> Seq.scan (+) 0
    |> Seq.filter (fun x -> x < 20)
    |> List.ofSeq
But in order to do that, I have to set the threshold in listOfEvenNumbersSmallerThanTwenty (which yields far more items than needed), and I lose the initial sequence. I also tried a mutable value but didn't really like that route.
You can create a small predicate function that will encapsulate a mutable sum.
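let sumLessThan threshold =
    // a ref cell, because a closure cannot capture a `let mutable` local
    let sum = ref 0
    fun x ->
        sum.Value <- sum.Value + x
        sum.Value < threshold
Usage is simple, and it can be applied to any sequence (the comparison is strict; use <= to also keep the element that brings the total to exactly the threshold, as in the question's expected output):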
Seq.initInfinite ((*) 2) |> Seq.takeWhile (sumLessThan 20)
There is nothing wrong with using mutable state when it's encapsulated (check the usages of mutable variables in the Seq module).
Here's a solution that I think is pretty elegant (although not the most efficient):
let evens = Seq.initInfinite (fun i -> 2 * i)
Seq.initInfinite (fun i -> Seq.take i evens)
|> Seq.takeWhile (fun seq ->
    Seq.sum seq <= 20)
|> Seq.last
|> List.ofSeq
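The same idea, state kept private to one function while the caller just consumes a lazy sequence, can be sketched in Python with a generator. This is an illustration only, not a translation of either answer; take_while_total_at_most is a made-up name.

```python
from itertools import count


def take_while_total_at_most(values, limit):
    """Yield items from values while their running total stays <= limit."""
    total = 0
    for v in values:
        total += v
        if total > limit:
            return  # stop before the item that would exceed the limit
        yield v


evens = count(0, 2)  # 0, 2, 4, ... (infinite)
result = list(take_while_total_at_most(evens, 20))  # [0, 2, 4, 6, 8]
```

Because the generator stops at the first item that pushes the total past the limit, it works on infinite inputs and needs no upper bound chosen in advance.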

How to use |> operator with a function which expects two parameters?

kll : Float
kll =
    let
        half x =
            x / 2
    in
    List.sum (List.map half (List.map toFloat (List.range 1 10)))
I would like to rewrite this using |>.
Can you also explain how to use |> correctly, with some examples? I can't find any online.
Thanks
This is my code:
kll : List Float
kll =
let
half x =
x / 2
in
((1 |> 1 |> List.range) |> toFloat |> List.map) (|>half |> List.map))|> List.sum
|> doesn't pass several arguments at once: x |> f is just f x, so the function on the right receives the piped value as its one remaining argument.
Use partial application (currying) to supply the leading parameters first. I think what you want is this:
List.range 1 10 |> List.map toFloat |> List.map half |> List.sum
Or more simply:
List.range 1 10 |> List.map (\x -> toFloat x / 2) |> List.sum
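As a rough analogy in Python (which has no |> operator), the pipeline can be modelled with a small helper that feeds a value through one-argument functions, using functools.partial to supply the leading arguments first. The pipe helper is made up for this sketch; it is not an Elm or Python built-in.

```python
from functools import partial, reduce


def pipe(value, *functions):
    """Feed value through each one-argument function, left to right."""
    return reduce(lambda acc, f: f(acc), functions, value)


def half(x):
    return x / 2


total = pipe(
    range(1, 11),         # like List.range 1 10
    partial(map, float),  # like List.map toFloat: map is given its first argument
    partial(map, half),   # like List.map half
    sum,                  # like List.sum
)
# total == 27.5
```

Each stage is a function still waiting for exactly one argument, which is the situation |> requires on its right-hand side.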

SML syntax error

Logic:
eploy(list, constant)
    if list is empty then
        return 0
    else
        return first_element + constant * eploy(rest_of_the_elements, constant)
I have written the following code:
fun eploy(xs, x1:int) =
if null xs
then (0)
else (x::xs') => x + x1*eploy(xs',x1)
eploy([1,2],4);
If you want to do pattern matching then you need to use case:
fun eploy(xs, x1) =
    case xs of
        nil => 0
      | x::xs' => x + x1*eploy(xs', x1)
You can also merge that into the function definition by using clauses:
fun eploy(nil, x1) = 0
  | eploy(x::xs', x1) = x + x1*eploy(xs', x1)
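For comparison, the same recursion written in Python, where the empty-list test and the head/tail split have to be spelled out separately instead of being expressed as patterns. This is only a sketch of the logic described in the question.

```python
def eploy(xs, x1):
    """Evaluate xs[0] + x1 * (xs[1] + x1 * (...)), with 0 for the empty list."""
    if not xs:
        return 0
    first, *rest = xs  # plays the role of the x::xs' pattern
    return first + x1 * eploy(rest, x1)


value = eploy([1, 2], 4)  # 1 + 4 * (2 + 4 * 0) == 9
```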

(Beginner's) issue with redundant case statement in SML

I'm trying to write a function in SML to compute the partial sum of an alternating harmonic series, and for the life of me I can't figure out why the compiler says one of the cases is redundant. I haven't used case statements before (or local, for that matter), but the order of these cases seems right to me.
local
fun altHarmAux (x:int, y:real) =
case x of
1 => 1.0
| evenP => altHarmAux(x-1, y - y/(real x))
| oddP => altHarmAux(x-1, y + y/(real x))
in
fun altHarmonic (a:int) = altHarmAux(a, real a)
end
Even if you have defined the two predicate functions somewhere, they can't be used in a case like that.
An ordinary identifier (one that is not a constructor) on the left-hand side of => is a pattern variable: it matches any value and binds it to that name. The last two matches in your case therefore match the same input, which makes the last one redundant, since the first of them will always be used.
You will have to apply your predicate function to the value directly, and then match on the result:
local
    fun altHarmAux (x, y) =
        case (x, evenP x) of
            (1, _) => 1.0
          | (_, true) => altHarmAux(x-1, y - y/(real x))
          | (_, false) => altHarmAux(x-1, y + y/(real x))
in
    fun altHarmonic a = altHarmAux(a, real a)
end
or perhaps simpler
local
    fun altHarmAux (1, _) = 1.0
      | altHarmAux (x, y) =
            altHarmAux (x-1, y + (if evenP x then ~y else y) / (real x))
in
    fun altHarmonic a = altHarmAux (a, real a)
end
or
local
    fun altHarmAux (1, _) = 1.0
      | altHarmAux (x, y) =
            if evenP x then
                altHarmAux (x-1, y - y/(real x))
            else
                altHarmAux (x-1, y + y/(real x))
in
    fun altHarmonic a = altHarmAux (a, real a)
end
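Separately from the syntax issue: as posted, the recursion ends in the base case 1.0 and discards y, so each version above returns 1.0 for any a >= 1. If the goal is the partial sum 1 - 1/2 + 1/3 - ... up to 1/n, one possible shape, sketched here in Python rather than SML, keeps a running total and branches on the predicate's result. The names alt_harmonic and is_even are made up for this sketch; is_even plays the role of evenP.

```python
def is_even(k):
    """Stand-in for the evenP predicate from the thread."""
    return k % 2 == 0


def alt_harmonic(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms (assumes n >= 1)."""
    total = 0.0
    for k in range(1, n + 1):
        term = 1.0 / k
        # even-indexed terms are subtracted, odd-indexed terms are added
        total += -term if is_even(k) else term
    return total
```

For example, alt_harmonic(2) is 1 - 1/2 = 0.5.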
