I'm wondering what the best practice is for initializing a double. Is it better to initialize it to 0.0 or NaN? And why?
I would say the best practice is just to initialize your variables. The value you choose depends on what you are doing: if you want to check whether your variable has been initialized, choose a value which cannot occur (or is unique) in your algorithm (it can be NaN, 0., -1., ...).
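To make the distinction concrete, here is a minimal Python sketch (the question does not name a language, and UNSET, running_sum and best_score are made-up names for illustration). It uses 0.0 where the variable is an accumulator and NaN where it stands for "no value yet".

```python
import math

# Hypothetical sentinel for "not set yet". NaN compares unequal to
# everything, itself included, so it must be detected with math.isnan.
UNSET = float("nan")

def running_sum(values):
    total = 0.0  # 0.0 is the natural starting value for an accumulator
    for v in values:
        total += v
    return total

def best_score(scores):
    best = UNSET  # no candidate seen yet
    for s in scores:
        if math.isnan(best) or s > best:
            best = s
    return best  # still NaN if scores was empty

print(running_sum([1.5, 2.5]))
print(math.isnan(best_score([])))
```

If 0.0 had been used as the starting value in best_score, an empty input would be indistinguishable from a genuine best score of 0.0.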
Suppose A is an abstract type, and I have a function f{T<:A}(x::Vector{T}). So x could be of type Vector{A} or Vector{B} where B <: A. In the middle of the function I would like to cast x to Vector{A} so it can be consumed by another function that requires that signature.
What's the best way to do that? At the moment I am doing x = collect(A, x). Is there a way to avoid copying if possible?
If at all possible, I'd just change your second function definition to be parametric like f. Enforcing this kind of container structure in method signatures is a big performance bug that doesn't gain you any functionality… and just makes them much harder to use.
That said, the best way to do this kind of conversion where you don't care if the output aliases the input is with convert(Vector{A}, x). This will be a no-op if x already isa Vector{A}, but otherwise it'll be just like collect. That's as good as it gets.
Here's why: two containers of types Vector{A} and Vector{B} cannot share the same memory if A !== B since it'd be possible to corrupt the data in the Vector{B} by assigning a non-B element to the array through the Vector{A}.
Recently, a colleague of mine asked me how he could test the equality of two arrays. He had two sources of Address and wanted to assert that both sources contained exactly the same elements, although order didn't matter.
Using an array, a List in Java, or an IList would all be fine, but since there could be two equal Address objects, structures like sets can't be used.
In most programming languages, a List already has an equals method doing the comparison (assuming that the collection was ordered before doing it), but there is no information about the actual differences; only that there are some, or none.
The output should inform about elements that are in one collection but not in the other, and vice-versa.
An obvious approach would be to iterate through one of the collections and call contains(element) on the other one, then do the same the other way around. Assuming a complexity of O(n) for contains, that would take about 2n² comparisons, i.e. O(n²), if I'm correct.
Is there a more efficient way of getting the information "A1 and A2 aren't in List1, A3 and A4 aren't in List2"? Are there data structures better suited for this job than lists? Is it worth sorting the collections first and using a custom, binary-search contains?
The first thing that comes to mind is using set difference.
In pseudo-Python:
addr1 = set(originalAddr1)
addr2 = set(originalAddr2)
in1notin2 = addr1 - addr2
in2notin1 = addr2 - addr1
allDifferences = in1notin2 | in2notin1
Set difference is O(len(set)) and union is O(len(set1) + len(set2)), which gives you a linear-time solution with this Python-specific set implementation, instead of the quadratic one you suggest.
I believe other popular languages tend to implement these types of data structures in much the same way, but I can't really be sure about this.
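Since the question says equal Address objects may occur more than once, a plain set would drop the duplicates. One possible variant, sketched here with collections.Counter as a multiset, keeps the counts. It assumes the elements are hashable (strings stand in for Address objects), and multiset_diff is a name invented for this example.

```python
from collections import Counter

def multiset_diff(list1, list2):
    """Return (only_in_1, only_in_2), respecting how often each element occurs."""
    count1, count2 = Counter(list1), Counter(list2)
    # Counter subtraction keeps only positive counts, so an element that
    # appears twice in list1 and once in list2 is reported once.
    only_in_1 = list((count1 - count2).elements())
    only_in_2 = list((count2 - count1).elements())
    return only_in_1, only_in_2

missing_from_2, missing_from_1 = multiset_diff(
    ["a", "a", "b", "c"],
    ["a", "b", "d"],
)
print(missing_from_2, missing_from_1)
```

Building each Counter is a single pass over its list, so the whole comparison stays linear in the total number of elements on average.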
Is it worth to sort the collection [...]?
Compare the naive O(n²) approach to sorting both lists in O(n log n) and then comparing them in O(n), or to sorting one list in O(n log n) and looking up each element of the other with a binary search, O(log n) per lookup. Either way, sorting wins for large n.
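The sort-then-compare variant can be sketched as a single merge-style pass in Python. This assumes the elements can be ordered with < (a real Address type would need a sort key), and sorted_diff is a hypothetical name.

```python
def sorted_diff(list1, list2):
    """Return (only_in_1, only_in_2) by sorting both lists and merging."""
    a, b = sorted(list1), sorted(list2)   # O(n log n) each
    i = j = 0
    only_in_1, only_in_2 = [], []
    while i < len(a) and j < len(b):      # one O(n) pass over both lists
        if a[i] == b[j]:
            i += 1
            j += 1
        elif a[i] < b[j]:
            only_in_1.append(a[i])        # a[i] has no partner left in b
            i += 1
        else:
            only_in_2.append(b[j])        # b[j] has no partner left in a
            j += 1
    only_in_1.extend(a[i:])               # leftovers exist in one list only
    only_in_2.extend(b[j:])
    return only_in_1, only_in_2

print(sorted_diff([3, 1, 2, 2], [2, 4, 1]))
```

Unlike the set-based version, this handles duplicates without hashing, at the cost of the O(n log n) sort.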
I am attempting to represent dice rolls in Julia. I am generating all the rolls of n dice with the given number of sides using
sort(collect(product(repeated(1:sides, n)...)), by=sum)
This produces something like:
[(1,1),(2,1),(1,2),(3,1),(2,2),(1,3),(4,1),(3,2),(2,3),(1,4) … (6,3),(5,4),(4,5),(3,6),(6,4),(5,5),(4,6),(6,5),(5,6),(6,6)]
I then want to be able to reasonably modify those tuples to represent things like dropping the lowest value in the roll or adding a constant number, etc., e.g., converting (2,5) into (10,2,5) or (5,).
Does Julia provide nice functions to easily modify (not necessarily in-place) n-tuples or will it be simpler to move to a different structure to represent the rolls?
Thanks.
Tuples are immutable, so you can't modify them in-place. There is very good support for other mutable data structures, so there aren't many methods that take a tuple and return a new, slightly modified copy. One way to do this is by splatting a section of the old tuple into a new tuple, so, for example, to create a new tuple like an existing tuple t but with the first element set to 5, you would write: tuple(5, t[2:end]...). But that's awkward, and there are much better solutions.
As spencerlyon2 suggests in his comment, a one dimensional Array{Int,1} is a great place to start. You can take a look at the Data Structures manual page to get an idea of the kinds of operations you can use; one-dimensional Arrays are iterable, indexable, and support the dequeue interface.
Depending upon how important performance is and how much work you're doing, it may be worthwhile to create your own data structure. You'll be able to add your own, specific methods (e.g., reroll!) for that type. And by taking advantage of some of the domain restrictions (e.g., if you only ever want to have a limited number of dice rolls), you may be able to beat the performance of the general Array.
You can construct a new tuple based on spreading or slicing another:
julia> b = (2,5)
(2, 5)
julia> (10, b...)
(10, 2, 5)
julia> b[2:end]
(5,)
I need to perform simple mathematical calculations in Python 2.7 with sums, subtractions, divisions, multiplications, sums over lists of numbers etc.
I want to write elegant, bullet-proof, and efficient code but I must admit I got confused by several things, for example:
if I have 1/(N-1)*x in my equation should I just code 1/(N-1)*x or maybe 1.0/(N-1)*x, 1.0/(N-1.0)*x or any other combination of these?
for division, should I use // or / with from __future__ import division?
what practices such as "using math.fsum() for summing a list of floats" are out there?
should I assume that input numbers are float or do the conversion just in case (maybe risking drop of efficiency on many float(x) operations)?
So what are the best practices for writing code for simple mathematical calculations in Python that is
elegant/Pythonic,
efficient,
bullet-proof to issues like uncertainty in exact number type of input data (float vs integer) ?
If you use Python 2.7, ALWAYS use from __future__ import division. It removes a hell of a lot of confusion and bugs.
With this you should never have to worry about whether a division is a float or not: / will always be true division and return a float, and // will always be floor division (returning an int when both operands are ints).
You should convert your input with float(). You will do it only once, and it won't be much of a performance hit.
I would get the sum of a list of floats like this: sum(li, 0.0), but if precision is required, use math.fsum which is specifically created for this.
And finally, your final statement was confusing. Did you mean 1/((N-1)*x) or (1/(N-1))*x? In the first case I would write it as 1 / (x * (N-1)) and in the second case x / (N-1). Both assume 3.x style division.
Also, look into numpy if you want some real performance.
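A small sketch of the advice above, written so that it behaves the same on Python 2.7 and 3.x. The function name scaled and the sample values are made up for illustration.

```python
import math

# On Python 2.7, also put `from __future__ import division` on the first
# line of the module so that every `/` is true division.

def scaled(x, n):
    """Compute x / (n - 1) as a float, whatever numeric type x arrives as."""
    x = float(x)         # convert once, at the boundary
    return x / (n - 1)   # float / int is true division on 2.7 and 3.x alike

values = [0.1] * 10
naive = sum(values, 0.0)     # plain left-to-right float addition
precise = math.fsum(values)  # tracks partial sums to avoid round-off loss

print(scaled(3, 4))
print(naive == precise)
```

With these inputs the plain sum drifts away from 1.0 by one rounding error while math.fsum returns exactly 1.0, which is the precision difference mentioned above.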
If you want great performance for numerical code in Python, you should consider PyPy. Numpy and scipy are convenient for dealing with arrays, and they give good performance if you use linear algebra algorithms that they provide. But if your numerical operations are in pure Python code, PyPy can give significant improvements in performance. I have seen speedups above 20x. And when you use PyPy, the best way to write your mathematical expressions is the simplest way. It will optimize your code better than you could, so make it as simple and readable as possible.
I'm new to OCaml, and I'd like to implement Gaussian elimination as an exercise. I can easily do it with a stateful algorithm, meaning keeping a matrix in memory and recursively operating on it by passing around a reference to it.
This statefulness, however, smacks of imperative programming. I know there are capabilities in OCaml to do this, but I'd like to ask if there is some clever functional way I haven't thought of first.
OCaml arrays are mutable, and it's hard to avoid treating them just like arrays in an imperative language.
Haskell has immutable arrays, but from my (limited) experience with Haskell, you end up switching to monadic, mutable arrays in most cases. Immutable arrays are probably amazing for certain specific purposes. I've always imagined you could write a beautiful implementation of dynamic programming in Haskell, where the dependencies among array entries are defined entirely by the expressions in them. The key is that you really only need to specify the contents of each array entry one time. I don't think Gaussian elimination follows this pattern, and so it seems it might not be a good fit for immutable arrays. It would be interesting to see how it works out, however.
You can use a Map to emulate a matrix. The key would be a pair of integers referencing the row and column. You'll want to use your own get x y function to ensure x < n and y < n though, instead of accessing the Map directly. (edit) You can use the compare function in Pervasives directly.
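module OrderedPairs = struct
  type t = int * int
  let compare = Pervasives.compare
end
module Pairs = Map.Make (OrderedPairs)
(* Look up entry (x, y) of an n-by-n matrix stored in the map. *)
let get_ n set x y =
  assert (x < n && y < n);
  Pairs.find (x, y) set
(* Return a new map with (x, y) bound to v; add takes key, value, map. *)
let set_ n set x y v =
  assert (x < n && y < n);
  Pairs.add (x, y) v set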
Actually, having a general set of functions (get x y and set x y at a minimum), without specifying the implementation, would be an even better option. The functions can then be passed to the elimination function, or be implemented in a module through a functor (a better solution, but having a set of functions that do just what you need would be a first step since you're new to OCaml). In this way you can use a Map, Array, Hashtbl, or a set of functions that access a file on the hard drive to implement the matrix if you wanted. This is the really important aspect of functional programming: you trust the interface rather than exploiting side effects, and you don't worry about the underlying implementation, since it's presumed to be pure.
The answers so far are using/emulating mutable data-types, but what does a functional approach look like?
To see, let's decompose the problem into some functional components:
Gaussian elimination involves a sequence of row operations, so it is useful first to define a function that takes 2 rows and their scaling factors and returns the row resulting from the operation.
The row operations we want should eliminate a variable (column) from a particular row, so let's define a function which takes a pair of rows and a column index and uses the previously defined row operation to return the modified row with that column entry zero.
Then we define two functions, one to convert a matrix into triangular form, and another to back-substitute a triangular matrix to the diagonal form (using the previously defined functions) by eliminating each column in turn. We could iterate or recurse over the columns, and the matrix could be defined as a list, vector or array of lists, vectors or arrays. The input is not changed, but a modified matrix is returned, so we can finally do:
let out_matrix = to_diagonal (to_triangular in_matrix)
What makes it functional is not whether the data types (array or list) are mutable, but how they are used. This approach may not be particularly 'clever' or be the most efficient way to do Gaussian elimination in OCaml, but using pure functions lets you express the algorithm cleanly.