Hash functions in R: keys and values as doubles? - r

I'm fairly new to R. I want to use a hash function to hold key-value pairs as doubles/numerics.
I have a function f(x) = y where the function f takes the input x as a double and returns y as a double. I use this function f(x) over 10^8 times in my code (and I would like to use it much more often), and 99% of the time f(x) has been computed once already. I would like to store my answers as key-value pairs, so I can find them instead of calculating them again.
I read the article below about using environments as hash tables, but I cannot figure out how to use doubles as key-value pairs.
https://www.r-bloggers.com/hash-table-performance-in-r-part-i/
How do I use doubles as key-value pairs in a hash function in R?
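The caching idea described in the question can be sketched as follows. This is a minimal illustration in Python rather than R, and the names `memoize`, `slow_square`, and `fast_square` are hypothetical stand-ins for the asker's f(x); it only shows the general key-value lookup pattern, not an R-specific solution.

```python
from typing import Callable, Dict


def memoize(f: Callable[[float], float]) -> Callable[[float], float]:
    """Wrap f so that each distinct input is computed only once."""
    cache: Dict[float, float] = {}  # double key -> double value

    def cached(x: float) -> float:
        if x not in cache:       # first time this input is seen
            cache[x] = f(x)      # compute and store the answer
        return cache[x]          # otherwise reuse the stored answer

    return cached


calls = 0  # counts how often the expensive function really runs


def slow_square(x: float) -> float:
    """Hypothetical stand-in for the expensive f(x)."""
    global calls
    calls += 1
    return x * x


fast_square = memoize(slow_square)
results = [fast_square(x) for x in (1.5, 2.25, 1.5, 1.5, 2.25)]
```

Keys match only on exact floating-point equality, so 0.1 + 0.2 and 0.3 would be cached as two different inputs.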

Related

Represent stream of vectors

I have a stream of vectors ... and then I have N vectors that should best represent this stream (smallest distance).
To do that I can use an algorithm similar to Self-Organizing Maps, i.e. pick the closest vector and move it closer to the data vector. Rinse and repeat.
Now my question is:
I want the number of vectors N to be variable and dynamic. For this I will probably need to maximize or minimize some external function.
What would that function be?
I will also probably need rules for when to merge, create, or delete vectors.
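The fixed-N step described above (pick the closest vector, move it toward the data vector) might look like the sketch below. It is written in Python for illustration; the names `codebook`, `nearest_index`, `update`, and the `rate` parameter are assumptions, and it does not address how to grow or shrink N.

```python
import math
from typing import List

Vector = List[float]


def nearest_index(codebook: List[Vector], x: Vector) -> int:
    """Index of the codebook vector closest to x (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], x))


def update(codebook: List[Vector], x: Vector, rate: float = 0.5) -> int:
    """Move the closest codebook vector a fraction `rate` of the way toward x."""
    i = nearest_index(codebook, x)
    codebook[i] = [c + rate * (xi - c) for c, xi in zip(codebook[i], x)]
    return i


# Two representative vectors and a short stream of data vectors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
for sample in ([1.0, 1.0], [9.0, 9.0], [2.0, 0.0]):
    update(codebook, sample)
```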

Functions as first class objects in SageMath

How do I represent a function from a set X to a set Y in SageMath?
I want to enumerate all functions from a set X to a set Y. How to represent the values of the iterator which enumerates them?
I know I can just make a Python hash, but maybe there is already a more suitable object in Sage? (which would be internally represented as a hash but work as a function when applied to an argument)
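For reference, the plain "Python hash" approach mentioned above can be sketched like this. It uses only the Python standard library, not any Sage-specific object, and the name `all_functions` is hypothetical. Each function X -> Y is represented as a dict, and there are |Y|^|X| of them.

```python
from itertools import product
from typing import Dict, Hashable, Iterator, Sequence


def all_functions(X: Sequence[Hashable], Y: Sequence[Hashable]) -> Iterator[Dict]:
    """Yield every function X -> Y as a dict mapping each x to its image."""
    # Each tuple in the product assigns one element of Y to each element of X.
    for images in product(Y, repeat=len(X)):
        yield dict(zip(X, images))


functions = list(all_functions(["a", "b"], [0, 1, 2]))

# A dict can be "applied" to an argument through its lookup method.
f = functions[5]
apply_f = f.__getitem__
image_of_a = apply_f("a")
```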

Standard ML : Calculating the average of a given set

I recently had the assignment to calculate the average of a set (given by input) in Standard ML.
The idea is to have a function like below in which you input a list of real numbers and receive the average of those numbers (also a real), such that the terminal gives you this as a return answer when you input the function:
average = fn : real list -> real
We discussed this in a tutorial as well but I wanted to know if there was some sort of trick when creating such functions in Standard ML.
Thanks in advance!
Sum the numbers and divide by the length. A simple recursive sum is typically one of the first examples that you would see in any SML tutorial. You would need to have the empty-list basis case of sum evaluate to 0.0 rather than 0 to make sure that the return type is real. Once you define a sum function, you can define average in one line using sum and the built-in length function. A subtlety is that SML doesn't allow a real to be divided by an int. You could use the conversion function Real.fromInt on the length before dividing the sum by it. There is some inefficiency in passing over the same list twice, once to sum it and once to calculate its length, but there is little reason to worry about such things when you are first learning the language.
On Edit: Since you have found a natural solution and shared it in the comments, here is a more idiomatic version which computes the average in one pass over the list:
fun average nums =
  let
    fun av (s,n,[]) = s/Real.fromInt(n)
      | av (s,n,x::xs) = av (s+x,n+1,xs)
  in
    av (0.0, 0, nums)
  end;
It works by defining a helper function which does the heavy lifting. Helper functions are used extensively in functional programming. In the absence of mutable state, a common trick is to explicitly pass as parameters quantities which would be successively modified by a corresponding loop in an imperative language. Such parameters are often called accumulators since they typically accumulate growing lists, running sums, running products, etc. Here s and n are the accumulators, with s the running sum of the elements and n the number of elements seen so far. In the basis case of (s,n,[]) there is nothing more to accumulate, so the final answer is returned. In the non-basis case, (s,n,x::xs), s and n are updated appropriately and passed to the helper function along with the tail of the list. The definition of av is tail-recursive, hence it will run with the speed of a loop without growing the stack. The only thing that the overall average function needs to do is to invoke the helper function with the appropriate initial values. The let ... helper def ... in ... helper called with start-up values ... end is a common idiom used to prevent the top level of a program from being cluttered with helper functions.
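To make the correspondence with an imperative loop concrete, here is a rough Python sketch of the same one-pass computation. It is an illustration only, not a translation endorsed by the answer: the variables s and n are the loop-carried values that the SML helper passes as accumulator parameters.

```python
from typing import Sequence


def average(nums: Sequence[float]) -> float:
    """One-pass average: s and n play the role of the accumulators."""
    s = 0.0  # running sum, same start value as av (0.0, 0, nums)
    n = 0    # running count of elements seen so far
    for x in nums:
        s, n = s + x, n + 1  # same update as av (s+x, n+1, xs)
    return s / n  # raises ZeroDivisionError on an empty sequence


result = average([1.0, 2.0, 3.0, 6.0])
```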
Since only non-empty lists can have averages, an alternative to John Coleman's answer is:
fun average [] = NONE
  | average nums =
    let
      fun av (s,n,[]) = s/Real.fromInt(n)
        | av (s,n,x::xs) = av (s+x,n+1,xs)
    in
      SOME (av (0.0, 0, nums))
    end;
Whether a function for calculating averages should take empty lists into account depends on whether you intend to export it or only use it within a scope in which you guarantee elsewhere that the input list is non-empty.

Boolean (BitArray) multidimensional array indexing or masking in Julia?

As part of a larger algorithm, I need to produce the residuals of an array relative to a specified limit. In other words, I need to produce an array which, given someArray, comprises elements which encode the amount by which the corresponding element of someArray exceeds a limit value. My initial inclination was to use a distributed comparison to determine when a value has exceeded the threshold. As follows:
# Generate some test data.
residualLimit = 1
someArray = 2.1.*(rand(10,10,3).-0.5)
# Determine the residuals.
someArrayResiduals = (residualLimit-someArray)[(residualLimit-someArray.<0)]
The problem is that someArrayResiduals is a one-dimensional vector containing the residual values, rather than a masked version of (residualLimit-someArray). If you check [(residualLimit-someArray.<0)] you'll find that it is behaving as expected; it's producing a BitArray. The question is, why doesn't Julia allow this BitArray to be used to mask someArray?
Casting the Bools in the BitArray to Ints using int() and distributing using .* produces the desired result, but is a bit inelegant. See the following:
# Generate some test data.
residualLimit = 1
someArray = 2.1.*(rand(10,10,3).-0.5)
# Determine the residuals.
someArrayResiduals = (residualLimit-someArray).*int(residualLimit-someArray.<0)
# This array should be (and is) limited at residualLimit. This is correct...
someArrayLimited = someArray + someArrayResiduals
Anyone know why a BitArray can't be used to mask an array? Or, any way that this entire process can be simplified?
Thanks, all!
Indexing with a logical array simply selects the elements at indices where the logical array is true. You can think of it as transforming the logical index array with find before doing the indexing expression. Note that this can be used in both array indexing and indexed assignment. These logical arrays are often themselves called masks, but indexing is more like a "selection" operation than a clamping operation.
The suggestions in the comments are good, but you can also solve your problem using logical indexing with indexed assignment:
overLimitMask = someArray .> residualLimit
someArray[overLimitMask] = residualLimit
In this case, though, I think the most readable way to solve this problem is with min or clamp: min(someArray, residualLimit) or clamp(someArray, -residualLimit, residualLimit)
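The difference between selecting with a mask and clamping can be illustrated with a small sketch. It is written in Python with plain lists rather than Julia arrays, and the sample `values` are made up for the example.

```python
limit = 1.0
values = [0.4, 1.3, -0.2, 1.05]

# Logical mask: True where the value exceeds the limit.
mask = [v > limit for v in values]

# Indexing with the mask selects elements, so the result is shorter
# than the input and loses the original shape.
selected = [v for v, m in zip(values, mask) if m]

# Indexed assignment through the mask keeps the shape and caps the values.
capped = [limit if m else v for v, m in zip(values, mask)]

# The same result written as an element-wise minimum.
capped_min = [min(v, limit) for v in values]
```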

New to OCaml: How would I go about implementing Gaussian Elimination?

I'm new to OCaml, and I'd like to implement Gaussian Elimination as an exercise. I can easily do it with a stateful algorithm, meaning I keep a matrix in memory and recursively operate on it by passing around a reference to it.
This statefulness, however, smacks of imperative programming. I know there are capabilities in OCaml to do this, but I'd like to ask if there is some clever functional way I haven't thought of first.
OCaml arrays are mutable, and it's hard to avoid treating them just like arrays in an imperative language.
Haskell has immutable arrays, but from my (limited) experience with Haskell, you end up switching to monadic, mutable arrays in most cases. Immutable arrays are probably amazing for certain specific purposes. I've always imagined you could write a beautiful implementation of dynamic programming in Haskell, where the dependencies among array entries are defined entirely by the expressions in them. The key is that you really only need to specify the contents of each array entry one time. I don't think Gaussian elimination follows this pattern, and so it seems it might not be a good fit for immutable arrays. It would be interesting to see how it works out, however.
You can use a Map to emulate a matrix. The key would be a pair of integers referencing the row and column. You'll want to use your own get x y function to ensure x < n and y < n though, instead of accessing the Map directly. (edit) You can use the compare function in Pervasives directly.
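module OrderedPairs = struct
  type t = int * int
  (* Pervasives is named Stdlib in newer OCaml releases *)
  let compare = Pervasives.compare
end
module Pairs = Map.Make (OrderedPairs)

(* Look up entry (x, y) after checking the bounds *)
let get_ n set x y =
  assert( x < n && y < n );
  Pairs.find (x,y) set

(* Return a new map with (x, y) bound to v; Map.add takes key, value, map *)
let set_ n set x y v =
  assert( x < n && y < n );
  Pairs.add (x,y) v set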
Actually, having a general set of functions (get x y and set x y at a minimum), without specifying the implementation, would be an even better option. The functions can then be passed to the function, or be implemented in a module through a functor (a better solution, but having a set of functions just doing what you need would be a first step since you're new to OCaml). In this way you can use a Map, Array, Hashtbl, or a set of functions to access a file on the hard drive to implement the matrix if you wanted. This is the really important aspect of functional programming: you trust the interface over exploiting the side effects, and do not worry about the underlying implementation, since it's presumed to be pure.
The answers so far are using/emulating mutable data-types, but what does a functional approach look like?
To see, let's decompose the problem into some functional components:
Gaussian elimination involves a sequence of row operations, so it is useful first to define a function that takes two rows and their scaling factors and returns the row that results from the operation.
The row operations we want should eliminate a variable (column) from a particular row, so let's define a function which takes a pair of rows and a column index and uses the previously defined row operation to return the modified row with that column entry zero.
Then we define two functions, one to convert a matrix into triangular form, and another to back-substitute a triangular matrix to the diagonal form (using the previously defined functions) by eliminating each column in turn. We could iterate or recurse over the columns, and the matrix could be defined as a list, vector or array of lists, vectors or arrays. The input is not changed, but a modified matrix is returned, so we can finally do:
let out_matrix = to_diagonal (to_triangular in_matrix);
What makes it functional is not whether the data types (array or list) are mutable, but how they are used. This approach may not be particularly 'clever' or be the most efficient way to do Gaussian elimination in OCaml, but using pure functions lets you express the algorithm cleanly.
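The decomposition above can be sketched with pure functions as follows. This is an illustration in Python rather than OCaml, using immutable tuples and exact fractions; the helper names `combine`, `eliminate`, and `as_matrix` are assumptions, and the sketch does no pivoting, so it assumes every diagonal entry it divides by is non-zero.

```python
from fractions import Fraction
from typing import Tuple

Row = Tuple[Fraction, ...]
Matrix = Tuple[Row, ...]


def combine(a: Row, ka: Fraction, b: Row, kb: Fraction) -> Row:
    """Row operation: return ka*a + kb*b as a new row."""
    return tuple(ka * x + kb * y for x, y in zip(a, b))


def eliminate(row: Row, pivot: Row, col: int) -> Row:
    """Return `row` with its entry in column `col` zeroed using `pivot`."""
    return combine(row, Fraction(1), pivot, -row[col] / pivot[col])


def to_triangular(m: Matrix) -> Matrix:
    """Zero the entries below the diagonal, one column at a time."""
    for c in range(len(m)):
        # Build a new matrix each step; the caller's matrix is never modified.
        m = m[:c + 1] + tuple(eliminate(r, m[c], c) for r in m[c + 1:])
    return m


def to_diagonal(m: Matrix) -> Matrix:
    """Zero the entries above the diagonal of a triangular matrix."""
    for c in reversed(range(len(m))):
        m = tuple(eliminate(r, m[c], c) for r in m[:c]) + m[c:]
    return m


def as_matrix(rows) -> Matrix:
    """Convert nested lists of numbers into an immutable matrix of Fractions."""
    return tuple(tuple(Fraction(x) for x in r) for r in rows)


# Augmented matrix for the system 2x + y = 5, x + 3y = 10.
in_matrix = as_matrix([[2, 1, 5], [1, 3, 10]])
out_matrix = to_diagonal(to_triangular(in_matrix))

# Divide each right-hand side by its diagonal entry to read off the solution.
solution = [row[-1] / row[i] for i, row in enumerate(out_matrix)]
```

The loops rebind a local name to a freshly built tuple on each step, so in_matrix is left unchanged, which mirrors the "input is not changed, but a modified matrix is returned" point above.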
