Is there a lazy and iterative `map` in Julia? - julia

The map function seems to be eager, e.g.
map(x->x+1, 1:3) gives [2,3,4].
Is there a lazy, iterative version of map that does not generate all the values at once, so that I can get values from the result one by one?

You can use Base.Generator for this, e.g. in your case:
julia> g = (x + 1 for x in 1:3)
Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##5#6"))}(getfield(Main, Symbol("##5#6"))(), 1:3)
julia> collect(g)
3-element Array{Int64,1}:
2
3
4
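A minimal sketch of consuming such a lazy map one value at a time. It assumes Julia 1.6 or later, where Iterators.map is available as a function-call spelling of the generator above; the names lazy and first_two are only illustrative.

```julia
# Lazy map: no values are computed until the iterator is consumed.
lazy = Iterators.map(x -> x + 1, 1:3)   # assumes Julia >= 1.6

# Values are produced one by one as the loop asks for them.
for v in lazy
    println(v)
end

# Take only the first two values without computing the rest.
first_two = collect(Iterators.take(lazy, 2))
```

On older Julia versions the generator expression (x + 1 for x in 1:3) can be used in place of Iterators.map in the same way.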

Related

How to normalize the columns of a matrix in Julia

Given a matrix A of dimensions m,n, how would one normalize the columns of that matrix by some function or other process in Julia (the goal would be to normalize the columns of A so that our new matrix has columns of length 1)?
If you want a new matrix then mapslices is probably what you want:
julia> using LinearAlgebra
julia> x = rand(5, 3)
5×3 Matrix{Float64}:
0.185911 0.368737 0.533008
0.957431 0.748933 0.479297
0.567692 0.477587 0.345943
0.743359 0.552979 0.252407
0.944899 0.185316 0.375296
julia> y = mapslices(x -> x / norm(x), x, dims=1)
5×3 Matrix{Float64}:
0.112747 0.327836 0.582234
0.580642 0.66586 0.523562
0.344282 0.424613 0.377893
0.450816 0.491642 0.275718
0.573042 0.164761 0.409956
julia> map(norm, eachcol(y))
3-element Vector{Float64}:
1.0
1.0
1.0
If by length 1 you mean norm 1, this could work:
using LinearAlgebra
# if this is your matrix
m = rand(10, 10)
# norm comes from LinearAlgebra
# or you can define it as
# norm(x) = sqrt(sum(i^2 for i in x))
g(x) = x ./ norm(x)
# this has columns that have approximately norm 1
normed_m = reduce(hcat, g.(eachcol(m)))
Though there are likely better solutions I don't know!
mapslices seems to have some issues with performance. On my computer (and v1.7.2) this is 20x faster:
x ./ norm.(eachcol(x))'
This is an in-place version (because eachcol creates views), which is faster still (but still allocates a bit):
normalize!.(eachcol(x))
And, finally, some loop versions that are 40-70x faster than mapslices for the 5x3 matrix:
# works in-place:
function normcol!(x)
    for col in eachcol(x)
        col ./= norm(col)
    end
    return x
end
# creates new array:
normcol(x) = normcol!(copy(x))
Edit: Added a one-liner with zero allocations:
foreach(normalize!, eachcol(x))
The reason this does not allocate anything, unlike the broadcast normalize!.(eachcol(x)), is that foreach doesn't return anything and so never builds an output array, which makes it useful in cases where the output is not needed.
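As a small check of the in-place one-liner, here is a sketch on a hand-picked 2x2 matrix (the values are an assumption chosen so the column norms are 5 and 2). It only illustrates that every column ends up with norm approximately 1; it is not a benchmark.

```julia
using LinearAlgebra

# Columns are (3, 4) and (0, 2), with norms 5 and 2.
x = [3.0 0.0;
     4.0 2.0]

# eachcol yields views, so normalize! modifies x itself.
foreach(normalize!, eachcol(x))

# Each column norm should now be approximately 1 (up to rounding).
col_norms = map(norm, eachcol(x))
```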

Apply function with multiple arguments to a vector in Julia

I would like to apply a function with multiple arguments to a vector.
It seems that both map() and map!() can be helpful.
It works perfectly if the function has one argument:
f = function(a)
a+a
end
x=[1,2,3,4,5]
map(f, x)
output: [2, 4, 6, 8, 10]
However, if the function has multiple arguments, it is not clear how to pass both the vector to map over and the remaining arguments, if that is possible at all.
f = function(a,b)
a*b
end
However, none of the following work:
b=3
map(f(a,b), x, 3)
map(f, x, 3)
map(f, a=x, b=3)
map(f(a,b), x, 3)
map(f(a,b), a=x,b=3)
Expected output:
[3,6,9,12,15]
Use broadcast - just as you suggested in the question:
julia> f = function(a,b)
a*b
end
#1 (generic function with 1 method)
julia> x=[1,2,3,4,5]
5-element Vector{Int64}:
1
2
3
4
5
julia> b=3
3
julia> f.(x, b)
5-element Vector{Int64}:
3
6
9
12
15
map does not broadcast, so if b is a scalar you would manually need to write:
julia> map(f, x, Iterators.repeated(b, length(x)))
5-element Vector{Int64}:
3
6
9
12
15
You can, however, pass two iterables to map without a problem:
julia> map(f, x, x)
5-element Vector{Int64}:
1
4
9
16
25
One possible solution is to wrap the call in an anonymous function inside map, as follows:
x = [1, 2, 3, 4, 5]
b = 3
f = function(a, b)
a * b
end
map(x -> f(x, b), x)
which produces the following output:
5-element Vector{Int64}:
3
6
9
12
15
Explanation: the anonymous function takes each value from the vector as its first argument, while the second argument is fixed at b = 3.
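The same idea of fixing the second argument can also be sketched with Base.Fix2, which wraps a two-argument function together with a fixed second argument. The name scale_by_b is only illustrative and does not appear in the answers.

```julia
f(a, b) = a * b
x = [1, 2, 3, 4, 5]
b = 3

# Base.Fix2(f, b) is a callable such that scale_by_b(a) == f(a, b).
scale_by_b = Base.Fix2(f, b)

result = map(scale_by_b, x)
```

This behaves like map(a -> f(a, b), x) but gives the partially applied function a name that can be reused.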
A few other options (here func stands for the two-argument function, called f in the question):
julia> map(Base.splat(func), Iterators.product(3, x))
5-element Vector{Int64}:
3
6
9
12
15
Iterators.product returns an iterator over the tuples (3, 1), (3, 2), etc. Since our function func takes multiple separate arguments and not a tuple, we use Base.splat on it, which takes the tuple and splats it into separate arguments to pass on to func.
julia> using SplitApplyCombine: product
julia> product(func, x, 3)
5-element Vector{Int64}:
3
6
9
12
15
SplitApplyCombine.jl's product function can directly map a given function over each combination (Cartesian product) of the given arguments.
julia> map(func, x, Iterators.cycle(3))
5-element Vector{Int64}:
3
6
9
12
15
A difference from the two previous ways is that if the shorter argument is a vector with more than one element, the previous methods apply the function to each combination of elements from the two arguments, whereas this one recycles the shorter vector until it matches the length of the longer one and then applies the function elementwise. This is like zipping with itertools.cycle in Python (and unlike zip_longest, which pads with a fill value rather than repeating).
julia> y = [10, 1000];
julia> product(func, x, y) # previous method
5×2 Matrix{Int64}:
10 1000
20 2000
30 3000
40 4000
50 5000
julia> map(func, x, Iterators.cycle(y))
5-element Vector{Int64}:
10
2000
30
4000
50
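The contrast between "every combination" and "recycle the shorter argument" can also be sketched with Base alone, assuming func is simple multiplication as in the question; the names all_pairs and recycled are illustrative.

```julia
func(a, b) = a * b
x = [1, 2, 3]
y = [10, 1000]

# Every combination of elements: the result keeps the 3x2 shape of the product.
all_pairs = [func(a, b) for (a, b) in Iterators.product(x, y)]

# Recycling: y is repeated as often as needed, and map stops when x is exhausted.
recycled = map(func, x, Iterators.cycle(y))
```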

How to insert an element at a specific position of an empty vector?

In R we can create an empty vector where it is possible to insert an element in any position of this vector.
Example:
> x <- c()
> x[1] = 10
> x[4] = 20
The final result is:
> x
[1] 10 NA NA 20
I would like to do something similar using Julia, but couldn't find a way to do this.
The “append” function does not do anything like that.
Could anyone help?
You need to do this in two steps:
First resize the vector or create a vector with an appropriate size.
Next set the elements accordingly.
Since you are coming from R I assume you want the vector to be initially filled with missing values. Here is the way to do this.
In my example I assume you want to store integers in the vector. For both options, first load the Missings.jl package:
using Missings
Option 1. Start with an empty vector
julia> x = missings(Int, 0)
Union{Missing, Int64}[]
julia> resize!(x, 4)
4-element Vector{Union{Missing, Int64}}:
missing
missing
missing
missing
julia> x[1] = 10
10
julia> x[4] = 40
40
julia> x
4-element Vector{Union{Missing, Int64}}:
10
missing
missing
40
Option 2. Preallocate a vector
julia> x = missings(Int, 4)
4-element Vector{Union{Missing, Int64}}:
missing
missing
missing
missing
julia> x[1] = 10
10
julia> x[4] = 40
40
The reason why Julia does not resize the vectors automatically is for safety. Sometimes it would be useful, but most of the time if x is an empty vector and you write x[4] = 40 it is a bug in the code and Julia catches such cases.
EDIT
What you can do is:
function setvalue(vec::Vector, idx, val)
    #assert idx > 0
    if idx > length(vec)
        resize!(vec, idx)
    end
    vec[idx] = val
    return vec
end
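A possible variant of this helper, sketched under the assumption that the vector's element type allows missing. Since resize! does not guarantee that newly added elements are initialized, this version fills the new slots with missing explicitly. The name setvalue! and the argument check are illustrative additions, not part of the answer above.

```julia
# Hypothetical helper: grow `vec` if needed, fill new slots with `missing`,
# then store `val` at position `idx`.
function setvalue!(vec::Vector{>:Missing}, idx::Integer, val)
    idx > 0 || throw(ArgumentError("idx must be positive"))
    oldlen = length(vec)
    if idx > oldlen
        resize!(vec, idx)
        # Explicitly initialize the newly added elements.
        for i in oldlen+1:idx
            vec[i] = missing
        end
    end
    vec[idx] = val
    return vec
end

# Usage mirroring the R example from the question.
x = Union{Missing, Int}[]
setvalue!(x, 1, 10)
setvalue!(x, 4, 20)
```

After these calls x holds 10, missing, missing, 20, which corresponds to the R result with NA in the gaps.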

generate nested number sequences R-style

I need to generate number sequences as follows:
1
1,2
1,2,3
...
1,2,3...,n
2
2,3
2,3,4
...
2,3,4,...,n
...
...
n-1
n-1,n
n
I come from other programming languages where loops are perfectly fine. But I understand the R community favors so-called vectorized operations over loops (they are said to be more efficient, although I haven't read all the details on why).
So, the first thing that comes to my mind for what I need to do was loops. And I wrote this code that certainly does the job (R gurus say euhh in 3,2,1...)
n <- 30
accum <- list()
for (x in 1:n) {
  for (y in x:n) {
    accum[[paste(x,y)]] <- x:y
  }
}
But this is ugly code (and, I guess, inefficient).
So, what is the clever R-style code for my problem?
I certainly haven't mastered vectorized operations and the apply family functions. But my best shot at this was:
n <- 30
accum <- lapply(1:n, FUN = function(x){lapply(x:n, FUN = seq, from = x)})
I have no idea if this is good R-style coding, but it almost gets the job done. The problem with this solution is that it produces a list with n elements, each of which is itself a list containing the sequences. What I wanted was a flat list with 465 elements (for n=30), one element per sequence, without the nesting that this solution produces.
I would really appreciate solutions that are clever and elegant in the R world.
To get a single vector:
n <- 4
u <- sequence(n:1)
(v <- sequence(u) + rep(1:n, rev(cumsum(1:n))) - 1)
# [1] 1 1 2 1 2 3 1 2 3 4 2 2 3 2 3 4 3 3 4 4
and a list of vectors:
split(v, rep(cumsum(u), u))
or something very similar to your solution:
Reduce('c', lapply(1:n, function(x) lapply(x:n, seq, from = x)))
Your second solution is good. All you have to do is unlist one layer.
unlist(lapply(1:n, FUN = function(x) lapply(x:n, FUN = seq, from = x)), recursive=FALSE)
What you have here is the list monad in disguise. To make that clearer, consider the following, which is equivalent:
mapcat <- function(x,f,...) unlist(lapply(x,f,...),recursive=FALSE)
mapcat(1:n,function(a) mapcat(a:n, function(b) list(seq(a,b))))
Here mapcat is the bind operation, and list is the unit/return.
In languages with the do-notation for list monads, this could be written, for example in Haskell, as
do
  a <- [1..n]
  b <- [a..n]
  return([a..b])
I don't know of any R package with such sugar implemented, but using the foreach library, we can get closer
library(foreach)
foreach(a=1:n, .combine='c') %:% foreach(b=a:n) %do% seq(a,b)

Calculate a geometric progression

I'm using brute force right now:
x <- 1.03
Value <- c((1/x)^20,(1/x)^19,(1/x)^18,(1/x)^17,(1/x)^16,(1/x)^15,(1/x)^14,(1/x)^13,(1/x)^12,(1/x)^11,(1/x)^10,(1/x)^9,(1/x)^8,(1/x)^7,(1/x)^6,(1/x)^5,(1/x)^4,(1/x)^3,(1/x)^2,(1/x),1,x,x^2,x^3,x^4,x^5,x^6,x^7,x^8,x^9,x^10,x^11,x^12,x^13,x^14,x^15,x^16,x^17,x^18,x^19,x^20)
Value
but I would like to use an incrementing loop, just like the for loop in Java:
for (int I = 1; I <= 20; I++)
^ is a vectorized function in R. That means you can simply use x^(-20:20).
Edit because this gets so many upvotes:
More precisely, both the base parameter and the exponent parameter are vectorized.
You can do this:
x <- 1:3
x^2
#[1] 1 4 9
and this:
2^x
#[1] 2 4 8
and even this:
x^x
#[1] 1 4 27
In the first two examples the length-one parameter gets recycled to match the length of the longer parameter. That's why the following results in a warning:
y <- 1:2
x^y
#[1] 1 4 3
#Warning message:
# In x^y : longer object length is not a multiple of shorter object length
If you try something like that, you probably want what outer can give you:
outer(x, y, "^")
#     [,1] [,2]
#[1,]    1    1
#[2,]    2    4
#[3,]    3    9
Roland already addressed the fact that you can do this vectorized, so I will focus on the loop part, for cases where you are doing something more complex that is not vectorized.
A Java (and C, C++, etc.) style loop like you show is really just a while loop. Something that you would like to do as:
for(I=1; I<=20; I++) { ... }
is really just a different way to write:
I=1 # or better I <- 1
while( I <= 20 ) {
  ...
  I <- I + 1
}
So you already have the tools to do that type of loop. However if you want to assign the results into a vector, matrix, array, list, etc. and each iteration is independent (does not rely on the previous computation) then it is usually easier, clearer, and overall better to use the lapply or sapply functions.
