Comparing values of 2 dictionaries - dictionary

Can anyone advise how I could compare the values of 2 dictionaries? For example:
A = {'John': [(300, 5000), (700, 750), (10, 300)], 'Mary': [(12, 300), (5678, 9000), (200, 657), (800, 7894)]}
B = {'Jim': [(500, 1000), (600, 1500), (900, 2000)], 'Mary': [(13, 250), (1000, 6000), (222, 600)]}
I would like to compare the 2 such that if a key (in this case 'Mary') is present in both the A and B dictionaries, and the first and second numbers of a value in B lie within those of the corresponding value in A (i.e. (13,250) and (222,600) lie within (12,300) and (200,657) respectively), that value is returned. The result would therefore be 'Mary': [(13,250), (222,600)]
Thanks

Okay, I did what I think you wanted and retrieved the results (13,250) and (222,600), so it looks like it is working. I made three classes: a main class, a Point class, and a class that fills the dictionaries and does the comparing. I didn't know how you built your dictionaries, so I made mine like so:
private Dictionary<String,Map<String,List<Point>>> first
= new Hashtable<String,Map<String,List<Point>>>();
It is a dictionary that takes a String and a Map, which takes a String and a List, which holds Point objects. When I looked at your snippet, it just screamed points, so I made a small class with x and y properties.
Next, make a method that compares the values of two points using less than and greater than.
Then, in another method, after you fill the dictionaries, loop over the size of the list in the second dictionary:
int size = second.get("B").get("Mary").size();
for (int i = 0; i < size; i++) {
    // call the compare method that you just made
}
Then print out the results.
My output: Mary:(13,250)(222,600)
If you need any help with the code, please reply.
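For reference, the same comparison can be sketched directly in Python, the language the dictionaries in the question appear to be written in. This is only a sketch under one assumption: tuples are compared by position (the first tuple in B against the first tuple in A, and so on), which is what reproduces the expected result from the question. The function name contained_pairs is made up for illustration.

```python
def contained_pairs(a, b):
    """Keep tuples of b that lie inside the tuple at the same position in a."""
    result = {}
    # Only keys present in both dictionaries are compared.
    for key in a.keys() & b.keys():
        kept = [
            (lo, hi)
            for (a_lo, a_hi), (lo, hi) in zip(a[key], b[key])
            if a_lo <= lo and hi <= a_hi
        ]
        if kept:  # leave out keys where nothing matched
            result[key] = kept
    return result


A = {'John': [(300, 5000), (700, 750), (10, 300)],
     'Mary': [(12, 300), (5678, 9000), (200, 657), (800, 7894)]}
B = {'Jim': [(500, 1000), (600, 1500), (900, 2000)],
     'Mary': [(13, 250), (1000, 6000), (222, 600)]}

print(contained_pairs(A, B))  # {'Mary': [(13, 250), (222, 600)]}
```

Keys that appear in only one dictionary ('John', 'Jim') are skipped, and a key with no surviving tuples is left out of the result. If a tuple in B should instead be checked against every tuple in A, the zip would need to be replaced with an any(...) over a[key].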


Problem with extracting values from vector for for loop

I am trying to extract values from a vector to generate random numbers from a GEV distribution. I keep getting an error. This is my code
x=rand(Truncated(Poisson(2),0,10),10)
t=[]
for i in 1:10 append!(t, maximum(rand(GeneralizedExtremeValue(2,4,3, x[i])))
I am new to this program and I think I am not passing the variable x properly. Any help will be appreciated. Thanks
If I am correctly understanding what you are trying to do, you might want something more like
x = rand(Truncated(Poisson(2),0,10),10)
t = Float64[]
for i in 1:10
    append!(t, max(rand(GeneralizedExtremeValue(2,4,3)), x[i]))
end
Among other things, you were missing a paren, and probably want max instead of maximum here.
Also, while it would technically work, t = [] creates an empty array of type Any, which tends to be very inefficient, so you can avoid that by just telling Julia what type you want that array to hold with e.g. t = Float64[].
Finally, since you already know t only needs to hold ten results, you can make this more efficient still by pre-allocating t:
x = rand(Truncated(Poisson(2),0,10),10)
t = Array{Float64}(undef,10)
for i in 1:10
    t[i] = max(rand(GeneralizedExtremeValue(2,4,3)), x[i])
end

Efficiently apply R functions over multidimensional array

I have an array that is composed of a list of lists of lists of lists..
For instance say
data[[1]][[1]][[1]][[1]][[5]]
returned a number. Now I need to compute say the smallest of these numbers across
data[[1]][[a]][[b]][[c]][[5]]
Where
a = 1:10
b = 1:100
c = 1:100
I could certainly do this with some nested for loops, but I feel like an apply command with min, or something equivalent in dplyr, should handle this without issue.
Alright a very rough example would be, say,
test <- rep(list(rep(list(rep(list(rep(list(rep(1:5,5)),100)),100)),10)),14)
With that,
test[[14]][[10]][[100]][[100]][[5]]
returns the number 5.
I now want to take the minimum over those dimensions, in theory something like this:
test[[1]][[1:10]][[1:100]][[1:100]][[5]]
Obviously the minimum over these (10*100*100) values will be 5, because the only number that appears there is 5.

How to convert a Matrix into a Vector of Vectors?

I have the following matrix A = [1.00 2.00; 3.00 4.00] and I need to convert it into a vector of Vectors as follows:
A1 = [1.00; 3.00]
A2 = [2.00; 4.00]
Any ideas?
tl;dr
This can be very elegantly created with a list comprehension:
A = [A[:,i] for i in 1:size(A,2)]
Explanation:
This essentially converts A from something that would be indexed as A[1,2] to something that would be indexed as A[2][1], which is what you asked.
Here I'm assigning directly back to A, which seems to me what you had in mind. But, only do this if the code is unambiguous! It's generally not a good idea to have same-named variables that represent different things at different points in the code.
NOTE: if this reversal of row / column order in the indexing isn't the order you had in mind, and you'd prefer A[1,2] to be indexed as A[1][2], then perform your list comprehension 'per row' instead, i.e.
A = [A[i,:] for i in 1:size(A,1)]
It would be much better simply to use slices of your matrix i.e. instead of A1 use
A[:,1]
and instead of A2 use
A[:,2]
If you really need them to be "separate" objects you could try creating a cell array like so:
myfirstcell = cell(size(A,2))
for i in 1:size(A,2)
    myfirstcell[i] = A[:,i]
end
See http://docs.julialang.org/en/release-0.4/stdlib/arrays/#Base.cell
(Cell arrays allow several different types of object to be stored in the same array)
Another option is B = [eachcol(A)...]. This returns a variable of type Vector{SubArray}, which might be fine depending on what you want to do. To get a Vector{Vector{Float64}}, try
B = Vector{eltype(A)}[eachcol(A)...]

Python: Average a list and append to dictionary

I have a dictionary of names with a number (a score) assigned to them. The file is laid out as so:
Person A,7
Person B,6
If a name is repeated in the file, e.g. Person B occurs on 3 lines with 3 different scores, I want to calculate the mean of these scores and then store the result in a dictionary in the form of a list. However, I keep encountering an error when I try to sort the dictionary. Code below.
else:
    for key in results:
        keyValue = results[key]
        if len(keyValue) > 1:
            # Line below this needs modification
            keyValue = list(sum(keyValue)/len(keyValue))
            newResults[key] = keyValue
            # Error in above code...
        else:
            newResults[key] = keyValue
    print(newResults)
    print(sorted(zip(newResults.values(), newResults.keys()), reverse=True))
Results is a dictionary of the people (the keys) and their scores (the values) where the values are lists so that:
results = {'Bob':[7],'Jane':[8,9]}
If you're using Python 3.x you can use its statistics library which contains a function mean. Now assuming that your dict looks like: results = {'Bob': [7], 'Jane': [8, 9]} you can create a newResults dict like this:
from statistics import mean
newResults = {key: mean(results[key]) for key in results}
This is called a dict comprehension and, as you can see, it's kinda intuitive. The opening { says that a dict is going to be created. Then key: value defines its structure. Lastly, the for loop iterates over the collection that will be used to build the dict. You can achieve the same with:
newResults = {}
for key in results:
    newResults[key] = mean(results[key])
You want to sort the dict in the end. Unfortunately a dict can't be sorted in place. You can either create an OrderedDict, which remembers the insertion order of its items, or build a list of the dict's keys sorted by value. The latter will look like:
sortedKeys = sorted(newResults, key=lambda x: newResults[x])
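Putting the pieces together, here is a minimal sketch that uses the example results dict from the question. It assumes Python 3.4 or later (for statistics.mean); the names averaged and ranking are made up for illustration. Note that calling list() on a float raises a TypeError because a float is not iterable, so a one-element list has to be written with square brackets instead.

```python
from statistics import mean

results = {'Bob': [7], 'Jane': [8, 9]}

# Wrap the mean in [...] so each value stays a list, as the question asks.
averaged = {name: [mean(scores)] for name, scores in results.items()}

# Rank the names from highest to lowest average score.
ranking = sorted(averaged, key=lambda name: averaged[name][0], reverse=True)

print(averaged)  # {'Bob': [7], 'Jane': [8.5]}
print(ranking)   # ['Jane', 'Bob']
```

Sorting the keys with a key function avoids the zip of values and keys from the question, which would compare the lists themselves rather than the scores inside them.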

How to avoid if/else when assigning values to a vector iteratively in R

Say I have a vector defined as indexUsed = rep(NA, 10);
I want to give its ith element a value on each iteration.
for (i in 1:10) {
    indexUsed[i] = largestGradient(X, y, indexUsed[is.na(indexUsed)], score)
}
As you can see, I want to use indexUsed[1:(i-1)] to calculate the ith element, but for the first element I want NULL or some other special value there to let my function know that it is empty (it will then handle assigning the first element, which is different from the later steps).
I do not know whether this is a good way to do it. How would you usually do this?
I don't have a better way of doing this than with a for loop, but would love to see other people's responses. However, it does seem to me that your code should read
indexUsed[i] <- largestGradient(X, y, indexUsed[!is.na(indexUsed)], score)
For i=1, your indexUsed[!is.na(indexUsed)] will be empty, and that should be the base case in your function. For every other iteration, it will retrieve elements 1 through i-1.
