Clojure: recur vs. recursion via fn name

I'm a beginner in Clojure, and I've been working through the 4clojure.com problems. I got stuck on an exercise where I am supposed to write a flatten implementation.
I basically understand the concept of tail call optimization, and how recur avoids consuming the stack, as opposed to "normal" recursion (I don't know if there's a proper term).
That's why I don't understand what is going on here:
(defn foo1 [x]
  (if (> x 0)
    (do (println x)
        (foo1 (dec x)))))
(defn foo2 [x]
  (if (> x 0)
    (do (println x)
        (recur (dec x)))))
As expected both foo1 and foo2 are the same functionally, but, given a parameter large enough (100000 in my case), I get a stack overflow™ on foo1 while foo2 completes normally.
Now, on to the flatten problem:
(defn flatten1 [ls]
  (mapcat
    #(if (coll? %)
       (flatten1 %)
       (list %))
    ls))
(defn flatten2 [ls]
  (mapcat
    #(if (coll? %)
       (recur %)
       (list %))
    ls))
Test case:
(flatten [1 [2] 3 [4 [5 6 [7] 8]]])
(flatten1 [1 [2] 3 [4 [5 6 [7] 8]]])
(flatten2 [1 [2] 3 [4 [5 6 [7] 8]]])
Expected result: '(1 2 3 4 5 6 7 8)
Well, flatten1 works ok (it's a small input anyway). But flatten2 just hangs indefinitely. Doesn't recur target the recursion point set at the defn? What's the difference (optimization aside) with recursing to the function by name?

By modifying the program a bit you can see the problem:
(ns clj.core
  (:require [tupelo.core :as t])
  (:gen-class))
(t/refer-tupelo)
(defn flatten1 [ls]
  (mapcat
    (fn [it]
      (println "f1: it=" it)
      (if (coll? it)
        (flatten1 it)
        (list it)))
    ls))
(defn flatten2 [ls]
  (mapcat
    (fn [it]
      (println "f2: it=" it)
      (if (coll? it)
        (recur it)   ; re-enters the enclosing (fn [it] ...), not flatten2
        (list it)))
    ls))
(defn -main
  [& args]
  (newline) (println "main - 1")
  (spyx (flatten [1 [2] 3 [4 [5 6 [7] 8]]]))
  (newline) (println "main - 2")
  (spyx (flatten1 [1 [2] 3 [4 [5 6 [7] 8]]]))
  (newline) (println "main - 3")
  (flatten2 [1 [2] 3 [4 [5 6 [7] 8]]]))
Running the code produces this output:
main - 1
(flatten [1 [2] 3 [4 [5 6 [7] 8]]]) => (1 2 3 4 5 6 7 8)
main - 2
f1: it= 1
f1: it= [2]
f1: it= 2
f1: it= 3
f1: it= [4 [5 6 [7] 8]]
f1: it= 4
f1: it= [5 6 [7] 8]
f1: it= 5
f1: it= 6
f1: it= [7]
f1: it= 7
f1: it= 8
(flatten1 [1 [2] 3 [4 [5 6 [7] 8]]]) => (1 2 3 4 5 6 7 8)
main - 3
f2: it= 1
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
f2: it= [2]
So you can see it gets stuck on the [2] item, the 2nd element of the input list.
The reason this fails is that recur only jumps back to the innermost function, which is the anonymous form #(if ...) in your original code, or the form (fn [it] ...) in the 2nd version.
Note that recur can only "jump" to the innermost fn/loop target. You cannot use recur to jump out of your inner anonymous function to reach flatten2. Since it only re-enters the inner function with the same argument, the 1-elem collection [2] never replaces the ls value at the end of the mapcat call, and you therefore get the infinite loop.
The best advice for any programming is "keep it simple". Recursion is simpler than loop/recur for most problems.
On the JVM, each stack frame requires some memory (consult the docs about the -Xss switch to increase the per-thread stack size). If you use too many stack frames, you will eventually get a StackOverflowError; this limit is separate from the heap size controlled by the -Xmx switch. You should usually be able to count on at least 1000 stack frames being available (you can test this for your machine and parameters if you like). So as a rule of thumb, if your recursion depth is 1000 or less, don't worry about using loop/recur.
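As a language-neutral illustration of the depth limit, here is a minimal Python sketch (not Clojure, and not part of the answer above). flatten_recursive mirrors flatten1 by calling itself by name, so each nesting level costs a call frame. flatten_iterative is an assumed alternative that keeps its own stack of iterators, so nesting depth is no longer tied to the call stack. Both function names are made up for this example.

```python
def flatten_recursive(items):
    """Mirror of flatten1: recurse by name for every nested list."""
    result = []
    for item in items:
        if isinstance(item, (list, tuple)):
            result.extend(flatten_recursive(item))
        else:
            result.append(item)
    return result


def flatten_iterative(items):
    """Same result, but nesting is tracked on an explicit stack of iterators."""
    result = []
    stack = [iter(items)]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()          # finished this nesting level
            continue
        if isinstance(item, (list, tuple)):
            stack.append(iter(item))   # descend without a function call
        else:
            result.append(item)
    return result


nested = [1, [2], 3, [4, [5, 6, [7], 8]]]
print(flatten_recursive(nested))  # [1, 2, 3, 4, 5, 6, 7, 8]
print(flatten_iterative(nested))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

For shallow input like the test case the two are interchangeable; the explicit stack only matters once the nesting depth approaches the runtime's frame limit.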

Related

How to filter data frame rows more cleanly in R when each cell is a list of lists?

I have been working on a project that analyzes data about an organization's members. One approach is to use geocoding to get each member's location data. I have already gathered the relevant information from Google, but some entries could not be processed properly.
I would like to first filter out the rows whose list contains no usable result. However, because the data is a list of lists, I cannot find a proper way to filter them all effectively.
The column that I want to process:
> family[4]
# A tibble: 5,324 x 1
district
<list>
1 <named list [2]>
2 <named list [2]>
3 <tibble [1 x 2]>
4 <named list [2]>
5 <named list [2]>
6 <tibble [1 x 2]>
7 <named list [2]>
8 <named list [2]>
9 <named list [2]>
10 <named list [2]>
# ... with 5,314 more rows
An example of the structure of a valid output (I hid most of the information because it is sensitive):
> family[4][[1]][[1]]
$results
$results[[1]]
$results[[1]]$address_components
$results[[1]]$address_components[[1]]
$results[[1]]$address_components[[1]]$long_name
[1] "xxxxxxxxxxxxxxxx"
$results[[1]]$address_components[[1]]$short_name
[1] "xxxxxxxxxxxxxxxx"
$results[[1]]$address_components[[1]]$types
$results[[1]]$address_components[[1]]$types[[1]]
[1] "premise"
$results[[1]]$address_components[[2]]
$results[[1]]$address_components[[2]]$long_name
[1] "xxxxxxxxxxxxxxxx"
$results[[1]]$geometry$viewport$northeast$lat
[1] xxxxxxxxxxxxxxxx
$results[[1]]$geometry$viewport$northeast$lng
[1] xxxxxxxxxxxxxxxx
$results[[1]]$geometry$viewport$southwest
$results[[1]]$geometry$viewport$southwest$lat
[1] xxxxxxxxxxxxxxxx
$results[[1]]$geometry$viewport$southwest$lng
[1] xxxxxxxxxxxxxxxx
$results[[2]]$geometry$viewport
$results[[2]]$geometry$viewport$northeast
$results[[2]]$geometry$viewport$northeast$lat
[1] xxx.xx
$results[[2]]$geometry$viewport$northeast$lng
[1] xxx.xx
$results[[2]]$geometry$viewport$southwest
$results[[2]]$geometry$viewport$southwest$lat
[1] xxx.xx
$results[[2]]$geometry$viewport$southwest$lng
[1] xxx.xx
$results[[2]]$place_id
[1] "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
$results[[2]]$types
$results[[2]]$types[[1]]
[1] "establishment"
$results[[2]]$types[[2]]
[1] "point_of_interest"
$status
[1] "OK"
The invalid output that I would like to filter out:
> family[4][[1]][[3]]
# A tibble: 1 x 2
lon lat
<dbl> <dbl>
1 NA NA
Questions:
What code can extract the rows with valid outcomes (keep the <named list [2]> and filter out the <tibble [1 x 2]>) from the data frame?
Is there a way to extract only the desired attributes from the list of lists into a new column of the data frame?
For example, the lat and lng values:
$results[[2]]$geometry$viewport$northeast$lat
[1] xxx.xx
$results[[2]]$geometry$viewport$northeast$lng
[1] xxx.xx
Here's a simple MCVE to possibly solve your problem which at the moment doesn't really have one.
It builds a function that returns a logical vector that is then used to index a list:
x <- list(list(3, 4), list(x = 5, y = 6))
dput(x) # This is what you should use to illustrate your problem.
#list(list(3, 4), list(x = 5, y = 6))
is_named <- function(x) sapply(x, function(z){ !is.null(names(z))})
is_named(x)
#[1] FALSE TRUE
x[is_named(x)]
#---vvvvv---------
[[1]]$x
[1] 5
[[1]]$y
[1] 6
You might need to make this recursive. And you might need to add a test for "list-ness".
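The same "build a logical test, then use it to filter" idea can be sketched for the geocoding case. This is a rough Python illustration rather than R, and it rests on assumptions: each cell is taken to be a dict shaped like the response printed in the question (a status field plus a results list), and the helper names has_valid_geocode and northeast_corner, as well as the sample data, are hypothetical.

```python
def has_valid_geocode(cell):
    """True for a dict-like response with status OK and at least one result."""
    return (
        isinstance(cell, dict)
        and cell.get("status") == "OK"
        and bool(cell.get("results"))
    )


def northeast_corner(cell, index=0):
    """Return (lat, lng) of the viewport's north-east corner for one result."""
    corner = cell["results"][index]["geometry"]["viewport"]["northeast"]
    return corner["lat"], corner["lng"]


def make_response(lat, lng):
    """Build a made-up response with only the fields used above."""
    viewport = {"northeast": {"lat": lat, "lng": lng}}
    return {"status": "OK", "results": [{"geometry": {"viewport": viewport}}]}


# Hypothetical column: two valid responses and one placeholder for a failed lookup.
district = [make_response(1.5, 2.5), {"lon": None, "lat": None}, make_response(3.0, 4.0)]

valid = [cell for cell in district if has_valid_geocode(cell)]
corners = [northeast_corner(cell) for cell in valid]
print(corners)  # [(1.5, 2.5), (3.0, 4.0)]
```

In R the analogous steps would be a predicate applied with sapply to index the rows, followed by a second sapply that walks down to the wanted fields.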

How to create a function that takes N numbers and produces a list of lists with those element numbers, as in the example?

Does anyone know how to write a function create that takes N numbers and produces a list of lists with those numbers of elements? The elements of each list must be integers in ascending order.
You are probably expected to write it in a functional way, in particular without side-effects. It is possible and without revealing too much, I suggest writing 3 functions:
range-list of two parameters from and to, which builds a list of numbers with both inclusive bounds; for example (range-list 2 4) is (2 3 4)
make-lists, which takes a start parameter (the integer from which to start generating) and a sizes parameter (a list of sizes). It calls range-list to build a list and recurses on itself to build the remaining lists.
the answer function (you can rename it), which takes a variable amount of parameters, and calls make-lists with a start of 1 and the given list of sizes.
I implemented this in Common Lisp and traced all those functions with your example, the output is as follows (nb. don't mind the SO:: prefix, it is the current package (a namespace), it stands for StackOverflow):
0: (SO::ANSWER 4 3 3)
1: (SO::MAKE-LISTS 1 (4 3 3))
2: (SO::RANGE-LIST 1 4)
3: (SO::RANGE-LIST 2 4)
4: (SO::RANGE-LIST 3 4)
5: (SO::RANGE-LIST 4 4)
6: (SO::RANGE-LIST 5 4)
6: RANGE-LIST returned NIL
5: RANGE-LIST returned (4)
4: RANGE-LIST returned (3 4)
3: RANGE-LIST returned (2 3 4)
2: RANGE-LIST returned (1 2 3 4)
2: (SO::MAKE-LISTS 5 (3 3))
3: (SO::RANGE-LIST 5 7)
4: (SO::RANGE-LIST 6 7)
5: (SO::RANGE-LIST 7 7)
6: (SO::RANGE-LIST 8 7)
6: RANGE-LIST returned NIL
5: RANGE-LIST returned (7)
4: RANGE-LIST returned (6 7)
3: RANGE-LIST returned (5 6 7)
3: (SO::MAKE-LISTS 8 (3))
4: (SO::RANGE-LIST 8 10)
5: (SO::RANGE-LIST 9 10)
6: (SO::RANGE-LIST 10 10)
7: (SO::RANGE-LIST 11 10)
7: RANGE-LIST returned NIL
6: RANGE-LIST returned (10)
5: RANGE-LIST returned (9 10)
4: RANGE-LIST returned (8 9 10)
4: (SO::MAKE-LISTS 11 NIL)
4: MAKE-LISTS returned NIL
3: MAKE-LISTS returned ((8 9 10))
2: MAKE-LISTS returned ((5 6 7) (8 9 10))
1: MAKE-LISTS returned ((1 2 3 4) (5 6 7) (8 9 10))
0: ANSWER returned ((1 2 3 4) (5 6 7) (8 9 10))
This should help you get started.
In this implementation those were non-tail-recursive functions. Note however how range-list could push items starting from the end. Also, if you preprocess the sizes list in a tail-recursive way (like a fold/reduce), you can determine the last integer and build the lists from the last to the first, again in a tail-recursive way.
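For readers who want to check their own version against the trace, here is one possible reading of the three functions as a minimal Python sketch. It is not the Common Lisp code that produced the trace; the names simply follow the description above, and the recursion is deliberately the plain non-tail-recursive form.

```python
def range_list(start, end):
    """Integers from start to end, both inclusive; empty when start > end."""
    if start > end:
        return []
    return [start] + range_list(start + 1, end)


def make_lists(start, sizes):
    """Build one list per size, numbering consecutively from start."""
    if not sizes:
        return []
    size = sizes[0]
    head = range_list(start, start + size - 1)
    return [head] + make_lists(start + size, sizes[1:])


def answer(*sizes):
    """Entry point: variable number of sizes, numbering starts at 1."""
    return make_lists(1, list(sizes))


print(answer(4, 3, 3))  # [[1, 2, 3, 4], [5, 6, 7], [8, 9, 10]]
```

The call pattern matches the trace: make_lists 1 (4 3 3) calls range_list 1 4, then recurses as make_lists 5 (3 3), and so on until the sizes list is empty.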
Here's a simple minded solution.
First, I build a sequence, then partition according to the arguments:
#lang racket
(define (create-lln f . r)
  ; collects all args into a list
  (define l (cons f r))
  ; builds sequence from [1, (sum l)]
  (define seq (build-list (apply + l) add1))
  ; partitioning
  (take-multiple l seq))
; partitions `taken-list` by `taking-list`
; example:
; (take-multiple '(3 4 4) '(1 2 3 4 5 6 7 8 9 10 11))
; => '((1 2 3) (4 5 6 7) (8 9 10 11))
(define (take-multiple taking-list taken-list)
  (cond [(empty? taking-list) empty]
        [else (begin
                (define f (first taking-list))
                (cons
                 (take taken-list f)
                 (take-multiple (rest taking-list) (list-tail taken-list f))))]))
I don't know how to do it in Racket, but my logic would be something like:
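// pseudocode; `entry` is the input list of sizes, e.g. (4 3 3)
listOfLists = new List()
indexInput = 0
j = 1
listTemp = new List()
while (indexInput < entry.size()) {
    // fill the current list with the next entry[indexInput] integers
    while (listTemp.size() < entry[indexInput]) {
        listTemp.add(j)
        j = j + 1
    }
    listOfLists.add(listTemp)
    listTemp = new List()
    indexInput = indexInput + 1
}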

How to use map in Julia to mimic a nested list comprehension?

I would like to use Julia's map function to mimic a nested list comprehension. This ability would be particularly useful for parallel mapping (pmap).
For example, this nested list comprehension
[x+y for x in [0,10] for y in [1,2,3]]
produces the nice result
6-element Array{Int64,1}:
1
2
3
11
12
13
and this
[x+y for x in [0,10], y in [1,2,3]]
produces the equally nice
2×3 Array{Int64,2}:
1 2 3
11 12 13
Either of these outcomes is satisfactory for my purposes.
Now here is my best effort at replicating these outcomes with map
map([0,10]) do x
    map([1,2,3]) do y
        x + y
    end
end
which yields the correct results, just not in a form I admire:
2-element Array{Array{Int64,1},1}:
[1, 2, 3]
[11, 12, 13]
Now I know there are brute-force ways to get the outcome I want, such as hcat/vcat'ing the output or manipulating the input into a matrix, but I'd like to know if there exists a solution as elegant as the nested list comprehension.
The simplest way I can think of is to use comprehensions and combine them with map (low benefit) or pmap (here you get the real value).
On Julia 0.7 (this release supports destructuring of function arguments):
julia> map(((x,y) for x in [0,10] for y in [1,2,3])) do (x,y)
           x+y
end
6-element Array{Int64,1}:
1
2
3
11
12
13
julia> map(((x,y) for x in [0,10], y in [1,2,3])) do (x,y)
           x+y
end
2×3 Array{Int64,2}:
1 2 3
11 12 13
On Julia 0.6.2 (less nice):
julia> map(((x,y) for x in [0,10] for y in [1,2,3])) do v
           v[1]+v[2]
end
6-element Array{Int64,1}:
1
2
3
11
12
13
julia> map(((x,y) for x in [0,10], y in [1,2,3])) do v
           v[1]+v[2]
end
2×3 Array{Int64,2}:
1 2 3
11 12 13
You could use Iterators.product:
julia> map(t -> t[1]+t[2], Iterators.product([0,10], [1,2,3]))
2×3 Array{Int64,2}:
1 2 3
11 12 13
Iterators.product returns an iterator whose elements are tuples.
(It's a shame the anonymous function above couldn't be written (x,y) -> x+y)
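The "map over a product of tuples" pattern is not specific to Julia. As a rough cross-language sketch (Python here, so not a drop-in for the Julia code above), itertools.product yields the same tuples, and itertools.starmap unpacks each tuple into separate arguments, which sidesteps the t[1]+t[2] indexing:

```python
from itertools import product, starmap
from operator import add

# product yields (x, y) pairs in the same order as the nested comprehension
# `for x in [0, 10] for y in [1, 2, 3]`; starmap calls add(x, y) on each pair.
flat = list(starmap(add, product([0, 10], [1, 2, 3])))
print(flat)  # [1, 2, 3, 11, 12, 13]

# Regroup into rows of length 3 to mimic the 2x3 layout.
rows = [flat[i:i + 3] for i in range(0, len(flat), 3)]
print(rows)  # [[1, 2, 3], [11, 12, 13]]
```

Python's multiprocessing.Pool also offers a starmap method, which plays a role loosely similar to pmap over such a product.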

Julia - equivalent of recursive sapply function in R

I had a function in R (onestep below) which took as an argument a vector v and returned a new vector v as output which was a function of the input vector. I then iterated this function niter times and kept the output vectors of each iteration (which are not all the same length and can occasionally also end up having length 0) in another function iterate as follows (minimal example) :
onestep = function (v) c(v,2*v)
iterate = function (v, niter) sapply(1:niter, function (iter) {
  v <<- onestep(v)
  return(v)
})
Example :
v=c(1,2,3)
iterate(v,3)
[[1]]
[1] 1 2 3 2 4 6
[[2]]
[1] 1 2 3 2 4 6 2 4 6 4 8 12
[[3]]
[1] 1 2 3 2 4 6 2 4 6 4 8 12 2 4 6 4 8 12 4 8 12 8 16 24
I was wondering what would be a compact and idiomatic way to do such a recursive function that returns all the intermediate results in Julia? Any thoughts? (Apologies if this is trivial but I am new to Julia)
Not sure on the compact and idiomatic front, but this is how I'd do it
onestep(v) = [v 2*v]
function iterate(v, niter)
    # Array(Array, niter) is pre-1.0 syntax; on Julia 1.x use Vector{Array}(undef, niter)
    Results = Array(Array, niter)
    Results[1] = onestep(v)
    for idx = 2:niter
        Results[idx] = onestep(Results[idx - 1])
    end
    Results
end
v = [1 2 3]
iterate(v, 3)
Here is another way that is a bit more concise and more truly recursive, as per your original question:
v = Array[[1, 2, 3]] ## create v as an array of one dimensional arrays
function iterate(v::Array{Array, 1}, niter::Int)
    niter == 0 && return v[2:end]
    push!(v, [v[end] ; 2v[end]])
    niter -= 1
    iterate(v, niter)
end
iterate(v, 3)
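The underlying pattern, applying a step function repeatedly and keeping every intermediate result, can also be written as a plain accumulating loop. The following is a minimal Python sketch of that idea rather than Julia; iterate_steps is a made-up name, and onestep is a list-based stand-in for the R function in the question.

```python
def onestep(v):
    """Stand-in for the R onestep: v followed by 2*v, element-wise."""
    return v + [2 * x for x in v]


def iterate_steps(v, niter):
    """Apply onestep niter times, collecting each intermediate vector."""
    results = []
    for _ in range(niter):
        v = onestep(v)        # rebinding v is local; the caller's list is untouched
        results.append(v)
    return results


steps = iterate_steps([1, 2, 3], 3)
print(steps[0])                  # [1, 2, 3, 2, 4, 6]
print([len(s) for s in steps])   # [6, 12, 24]
```

Because each result is its own list, intermediate vectors of different (or zero) length are stored without any special handling.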

creating subvectors from two vectors in clojure

Simple question here. How do I go from having the two vectors [1 2 3] [5 7 9] to having this: [1 5] [2 7] [3 9]?
I tried this: (map concat [1 2 3] [4 5 6]), but I get "Don't know how to create ISeq from: java.lang.Long".
Use map vector instead
(map vector [1 2 3] [5 7 9])
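The reason map vector works is that map with two collections hands the function one element from each at a time, and vector simply packs its arguments, whereas concat expects each argument to be a sequence. For comparison only, the same pairing in Python (an analogue, not Clojure) is what zip does:

```python
# zip walks both lists in step, yielding one element from each per iteration.
pairs = [list(pair) for pair in zip([1, 2, 3], [5, 7, 9])]
print(pairs)  # [[1, 5], [2, 7], [3, 9]]
```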
