I feel like I'm being really stupid here as I would have thought there's a simple command already in Pari, or it should be a simple thing to write up, but I simply cannot figure this out.
Given a vector, say V, which will have duplicate entries, how can one determine what the most common entry is?
For example, say we have:
V = [ 0, 1, 2, 2, 3, 4, 6, 8, 8, 8 ]
I want something which would return the value 8.
I'm aware of things like vecsearch, but I can't see how that can be tweaked to make this work?
Very closely related to this, I want this result to return the most common non-zero entry, and some vectors I look at will have 0 as the most common entry. Eg: V = [ 0, 0, 0, 0, 3, 3, 5 ]. So whatever I execute here I would like to return 3.
I tried writing up something which would remove all zero terms, but again struggled.
The thing I have tried in particular is:
rem( v ) = {
my( c );
while( c = vecsearch( v, 0 ); #c, v = vecextract( v, "^c" ) ); v
}
but vecextract doesn't seem to like this set up.
If you can ensure that all the elements lie within some fixed range, then a counting sort is enough. In PARI/GP it could look like this:
counts_for(v: t_VEC, lower: t_INT, upper: t_INT) = {
my(counts = vector(1+upper-lower));
for(i=1, #v, counts[1+v[i]-lower]++);
vector(#counts, i, [i-1+lower, counts[i]])
};
V1 = [0, 1, 2, 2, 3, 4, 6, 8, 8, 8];
vecsort(counts_for(V1, 0, 8), [2], 4)[1][1]
> 8
V2 = [0, 0, 0, 0, 3, 3, 5];
vecsort(counts_for(V2, 0, 5), [2], 4)[1][1]
> 0
You can also define the following shortcuts for convenience:
counts_for1(v: t_VEC) = {
counts_for(v, vecmin(v), vecmax(v))
};
most_frequent(v: t_VEC) = {
my(counts=counts_for1(v));
vecsort(counts, [2], 4)[1][1]
};
most_frequent(V1)
> 8
most_frequent(V2)
> 0
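For comparison, the same counting idea can be sketched in Python. This is only an illustration under stated assumptions, not a translation of the GP code: the name `most_frequent_in_range` is hypothetical, the input is assumed non-empty with every entry between `lower` and `upper`, and ties go to the smallest value.

```python
def most_frequent_in_range(values, lower, upper):
    """Return the most frequent value, assuming lower <= v <= upper for all v."""
    counts = [0] * (upper - lower + 1)
    for v in values:
        counts[v - lower] += 1  # slot 0 holds the count for `lower`
    # Index of the largest count; max() keeps the first maximum on ties.
    best = max(range(len(counts)), key=counts.__getitem__)
    return best + lower  # shift the slot index back to the actual value


print(most_frequent_in_range([0, 1, 2, 2, 3, 4, 6, 8, 8, 8], 0, 8))
print(most_frequent_in_range([5, 7, 7, 9], 5, 9))
```

The second call uses a range that does not start at 0, which is where the offset between slot index and value matters.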
The function matreduce provides this in a more general setting: applied to a vector of objects, it returns a 2-column matrix whose first column contains the distinct objects and the second their multiplicity in the vector. (The function has a more general form that takes the union of multisets.) The rows come out sorted by object, so they are re-sorted by multiplicity before the last row is read off.
most_frequent(v) = my(M = vecsort(matreduce(v)~, 2)~, [n] = matsize(M)); M[n, 1];
most_frequent_non0(v) =
{ my(M = vecsort(matreduce(v)~, 2)~, [n] = matsize(M), x = M[n, 1]);
if (x == 0, M[n - 1, 1], x);
}
? most_frequent([ 0, 1, 2, 2, 3, 4, 6, 8, 8, 8 ])
%1 = 8
? most_frequent([x, x, Mod(1,3), [], [], []])
%2 = []
? most_frequent_non0([ 0, 0, 0, 0, 3, 3, 5 ])
%3 = 3
? most_frequent_non0([x, x, Mod(1,3), [], [], []])
%4 = x
The first function will error out if fed an empty vector, and the second one if there are no non-zero entries. The second function tests for "0" using the x == 0 test (and we famously have [] == 0 in GP); for a more rigorous semantic, use x === 0 in the function definition.
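The tally-then-pick idea can also be sketched in Python with `collections.Counter`. This is a minimal sketch, not the GP method: the name `most_frequent_nonzero` is hypothetical, and returning `None` when nothing non-zero exists is an assumption made here instead of raising an error. Python's `v != 0` test is closer to GP's strict `===` for containers, but it still treats `0.0` and `False` as zero.

```python
from collections import Counter


def most_frequent_nonzero(values):
    """Most common entry that is not 0; None if no such entry exists."""
    counts = Counter(v for v in values if v != 0)
    if not counts:
        return None  # empty input, or every entry was zero
    # most_common(1) returns a list holding one (value, count) pair
    return counts.most_common(1)[0][0]


print(most_frequent_nonzero([0, 0, 0, 0, 3, 3, 5]))
print(most_frequent_nonzero([0, 1, 2, 2, 3, 4, 6, 8, 8, 8]))
```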
When initializing a multidimensional Vec in Rust, I can use the vec!-macro like this:
vec![vec![0; 100]; 200]
However, this gets messy for Vecs of higher dimensions. Currently, I am using this:
vec![vec![vec![vec![vec![vec![vec![vec![0; N-1]; N-1]; N-1]; N-1]; 2]; 2]; 2]; 2]
This is not very concise, and also the order in which the dimensions are written is reverse to the indexing order. Is there a more concise way to do this? I am looking for something like
vec![0; 2, 2, 2, 2, N-1, N-1, N-1, N-1]
The ndarray crate allows you to have an N-dimensional array. For anything above 6 dimensions, you can use the ArrayD type. You can create a dynamic dimension using IxDyn; see its documentation for examples.
Example of initializing a 7x7x...x7 array (eight dimensions) and accessing an element:
use ndarray::{ArrayD, IxDyn};
let mut array_8d = ArrayD::<f64>::zeros(IxDyn(&[7, 7, 7, 7, 7, 7, 7, 7]));
let index = IxDyn(&[0, 0, 0, 0, 0, 0, 0, 0]);
array_8d[&index] = 1.0;
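The "shape listed in indexing order" interface the question asks for can be illustrated with a small recursive helper. The sketch below is in Python rather than Rust, and `nested` is a hypothetical name, not part of any library; it assumes the fill value is immutable, since the same object is placed in every cell.

```python
def nested(fill, *dims):
    """Build nested lists of shape `dims`, given in indexing order."""
    if not dims:
        return fill
    # Each iteration recurses separately, so inner lists are never shared.
    return [nested(fill, *dims[1:]) for _ in range(dims[0])]


grid = nested(0, 2, 3, 4)  # indexed as grid[i][j][k], i < 2, j < 3, k < 4
grid[1][2][3] = 7
print(len(grid), len(grid[0]), len(grid[0][0]))
```

Because every level is built by a fresh recursive call, writing to `grid[1][2][3]` leaves `grid[0][2][3]` untouched.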
If I have a large struct that I want to create an array of (e.g. to later create a StructArray), how can I create an array of structs when I have keyword defaults.
E.g.
Base.@kwdef struct MyType
a = 0
b = 0
c = 0
d = 0
... # can be up to 10 or 20 fields
end
Base.@kwdef is nice because I can create objects with MyType(b=10,e=5), but sometimes I have arrays of argument values. I would like to be able to broadcast or succinctly construct an array of the structs.
That is, I would like the following to create an array of three MyTypes: MyType.(c=[5,6,7],d = [1,2,3])
Instead, it creates a single MyType where c and d are arrays rather than scalar values.
What are ways to keep the convenience of both Base.@kwdef and easy array-of-struct construction?
Seems like a good use case for a comprehension:
julia> [MyType(c=cval, d=dval) for (cval, dval) in zip([5, 6, 7], [1, 2, 3])]
3-element Vector{MyType}:
MyType(0, 0, 5, 1)
MyType(0, 0, 6, 2)
MyType(0, 0, 7, 3)
Another possibility (based on another answer) is to explicitly do the broadcast call yourself:
julia> broadcast((cval, dval) -> MyType(c = cval, d = dval), [5, 6, 7], [1, 2, 3])
3-element Vector{MyType}:
MyType(0, 0, 5, 1)
MyType(0, 0, 6, 2)
MyType(0, 0, 7, 3)
or the equivalent ((cval, dval) -> MyType(c = cval, d = dval)).([5, 6, 7], [1, 2, 3]) as mentioned in the comment there.
Out of these, the array comprehension seems to me the clearest and most obvious way to go about it.
According to this issue: https://github.com/JuliaLang/julia/issues/34737, there is no nice built-in syntax for your case.
One option is a comprehension (see the other answer). A second option, which I prefer here, is to build an anonymous function and broadcast over it:
julia> ((x,y)->MyType(;c=x,d=y)).([1,2],[3,5])
2-element Vector{MyType}:
MyType(0, 0, 1, 3)
MyType(0, 0, 2, 5)
It is also possible to call broadcast directly as:
julia> broadcast((x,y)->MyType(;c=x,d=y), [1,2],[3,5])
2-element Vector{MyType}:
MyType(0, 0, 1, 3)
MyType(0, 0, 2, 5)
Let's say I have a vector a = [1, 0, 1, 2, 3, 4, 5, 0, 5, 6, 7, 8, 0, 9, 0] and I want to split it to smaller vectors based on a condition depending on value in that array. E.g. value being zero.
Thus I want to obtain vector of following vectors
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
So far this was working for me as a naive solution, but it loses the type.
function split_by_λ(a::Vector, λ)
b = []
temp = []
for i in a
push!(temp, i)
if λ(i)
push!(b, temp)
temp = []
end
end
b
end
split_by_λ(a, isequal(0))
Then I tried to play with ranges, which feels a bit more idiomatic, and does not lose the type.
function split_by_λ(a::Vector, λ)
idx = findall(λ, a)
ranges = [(:)(i==1 ? 1 : idx[i-1]+1, idx[i]) for i in eachindex(idx)]
map(x->a[x], ranges)
end
split_by_λ(a, isequal(0))
but it still feels very cumbersome considering it's a rather simple task.
Is there something I'm missing, some easier way?
Maybe someone has a shorter idea but here is mine:
julia> inds = vcat(0,findall(==(0),a),length(a))
julia> getindex.(Ref(a), (:).(inds[1:end-1].+1,inds[2:end]))
5-element Array{Array{Int64,1},1}:
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
[]
Or if you want to avoid copying a
julia> view.(Ref(a), (:).(inds[1:end-1].+1,inds[2:end]))
5-element Array{SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true},1}:
[1, 0]
[1, 2, 3, 4, 5, 0]
[5, 6, 7, 8, 0]
[9, 0]
0-element view(::Array{Int64,1}, 16:15) with eltype Int64
Pretty much the same as Przemyslaw's answer, but maybe less cryptic and dense:
function split_by(λ, a::Vector)
first, last = firstindex(a), lastindex(a)
splits = [first-1; findall(λ, a); last]
s1, s2 = @view(splits[1:end-1]), @view(splits[2:end])
return [view(a, i1+1:i2) for (i1, i2) in zip(s1, s2)]
end
Also, I changed the signature to the conventional one of "functions first", which allows you to use do-blocks. Additionally, this should work with offset indexing.
One could surely get rid of the intermediate allocations, but I think that gets ugly without yield:
function split_by(λ, a::Vector)
result = Vector{typeof(view(a, 1:0))}()
l = firstindex(a)
r = firstindex(a)
while r <= lastindex(a)
if λ(a[r])
push!(result, @view(a[l:r]))
l = r + 1
end
r += 1
end
push!(result, @view(a[l:end]))
return result
end
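For contrast, in a language with generators the single-pass version stays short. The following Python sketch is an illustration only (Julia has no `yield`, and `split_by` here is simply a name chosen to mirror the function above). It copies elements into new lists rather than returning views, and it drops a trailing empty chunk, both of which differ from the Julia versions.

```python
def split_by(pred, items):
    """Yield chunks of `items`, each ending with an element satisfying `pred`."""
    chunk = []
    for item in items:
        chunk.append(item)
        if pred(item):
            yield chunk
            chunk = []  # start a fresh list; the yielded one is not reused
    if chunk:  # trailing elements after the last match, if any
        yield chunk


a = [1, 0, 1, 2, 3, 4, 5, 0, 5, 6, 7, 8, 0, 9, 0]
parts = list(split_by(lambda x: x == 0, a))
print(parts)
```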
Let's say you have two lists such as:
list1 = [-2, -1, 0, 1, 2, 3]
list2 = [4, 1, 0, 1, 4, 9]
...and the two lists were zipped into a dictionary to produce:
dict1 = {-2: 4,
-1: 1,
0: 0,
1: 1,
2: 4,
3: 9}
...where list1 supplies the keys and list2 supplies the values.
You will notice that some of the elements in list2 are duplicates, such as 4 and 1. They show up twice in list2, and consequently in the dictionary.
-2 corresponds to 4
2 corresponds to 4
-1 corresponds to 1
1 corresponds to 1
I am trying to figure out a way either using the lists or the dictionary to identify the duplicate items in list2, and return their keys from list 1.
So the returned values I would expect from the two lists above would be:
(-2, 2) #From list 1 since they both correspond to 4 in list2
(-1, 1) #from list 1 since they both correspond to 1 in list2
In this example, list2 happens to be the square of list1. But this will not always be the case.
So ultimately, what I am looking for is a way to return those keys based on their duplicate values.
Any thoughts on how to approach this? I am able to identify the duplicates in list2, but I am completely stuck on how to identify their corresponding values in list 1.
In python3:
from itertools import groupby
list1 = [-2, -1, 0, 1, 2, 3]
list2 = [4, 1, 0, 1, 4, 9]
pairs = zip(list2, list1)
ordered = sorted(pairs, key=lambda x: x[0])
groups = ((k, list(g)) for k,g in groupby(ordered, key=lambda x: x[0])) # generator
duplicates = (k for k in groups if len(k[1])>1) # generator
for k,v in duplicates :
print(str(k) + " : " + str(list(v)))
result:
1 : [(1, -1), (1, 1)]
4 : [(4, -2), (4, 2)]
Bonus: in functional C#:
var list1 = new[] { -2, -1, 0, 1, 2, 3 };
var list2 = new[] { 4, 1, 0, 1, 4, 9 };
var g = list1.Zip(list2, (a, b) => (a, b)) //create tuples
.GroupBy(o => o.b, o => o.a, (k, group) => new { key = k, group = group.ToList() }) //create groups
.Where(o => o.group.Count > 1) // select group with minimum 2 elements
.ToList(); // no lazy
foreach (var kvp in g)
Console.WriteLine($"{kvp.key}: {string.Join(",", kvp.group)}");
result:
4: -2,2
1: -1,1
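The grouping can also be done in one pass without sorting, by collecting keys into a dictionary indexed by value. This is a minimal sketch under the assumption that the values are hashable; `keys_by_duplicate_value` is a hypothetical helper name.

```python
from collections import defaultdict

list1 = [-2, -1, 0, 1, 2, 3]
list2 = [4, 1, 0, 1, 4, 9]


def keys_by_duplicate_value(keys, values):
    """Map each value that occurs more than once to the tuple of its keys."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[v].append(k)  # collect every key seen for this value
    # Keep only values shared by at least two keys.
    return {v: tuple(ks) for v, ks in groups.items() if len(ks) > 1}


result = keys_by_duplicate_value(list1, list2)
print(result)  # {4: (-2, 2), 1: (-1, 1)}
```

Unlike the groupby approach, this preserves the order in which values first appear and does not require the values to be sortable.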
This was a past exam question and I have no idea what it does! Please can someone run through it.
public static int befuddle(int n){
if(n <= 1){
return n;
}else{
return befuddle(n - 1) * befuddle(n - 2) + 1;
}
}
This computes the sequence 0, 1, 1, 2, 3, 7, 22, 155, ...
It can be expressed by the recurrence:
a(n) = a(n-1) * a(n-2) + 1, with a(0) = 0 and a(1) = 1
When dealing with numerical sequences, a great resource is The On-Line Encyclopedia of Integer Sequences (OEIS). A quick search there shows a similar sequence, defined by the same recurrence but with:
a(0) = a(1) = 0
giving the following sequence: 0, 0, 1, 1, 2, 3, 7, 22, 155, ...
You can find more about it in its OEIS entry.
public static is the type of member function it is. I'm assuming this is part of a class? The static keyword allows you to use it without creating an instance of the class.
Plug in a value of 'n' and step through it. For instance, if n = 1, then the function returns 1. If n = 0 -> 0; n = -100 -> -100.
If n = 2, the else branch is triggered and befuddle is called with 1 and 0. So n = 2 returns 1*0 + 1 = 1.
Do the same thing for n = 3, etc. (calls n = 2 -> 1, and n = 1 -> 1, so n=3 -> 1*1+1 = 2.)
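Stepping through by hand gets tedious quickly, so here is a small Python sketch of the same recursion that prints the first few values. The memoization via `functools.lru_cache` is an addition for speed and is not part of the exam code; the logic otherwise mirrors the Java method.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # cache results so each n is computed only once
def befuddle(n):
    if n <= 1:
        return n  # base cases: befuddle(0) = 0, befuddle(1) = 1
    return befuddle(n - 1) * befuddle(n - 2) + 1


print([befuddle(n) for n in range(8)])  # [0, 1, 1, 2, 3, 7, 22, 155]
```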