numpy array of unexpected dimension

I'm currently switching from Matlab to Python and I have a problem with understanding numpy arrays.
The following code (copied from Numpy documentation) creates a [2x3] array
np.array([[1, 2, 3], [4, 5, 6]], np.int32).
Which behaves as expected.
Now I tried to adapt this to my case and tried
myArray = np.array([\
[-0.000847283, 0.000000000, 0.141182070, 2.750000000],
[ 0.000876414, -0.025855453, 0.270459334, 2.534537894],
[-0.000098373, 0.003388169, -0.021976882, 3,509325279],
[ 0.000077079, -0.004507202, 0.096453685, 2,917172446],
[-0.000049944, 0.003114201, -0.055974372, 3,933359490],
[ 0.000042697, -0.003833862, 0.117727186, 2.485846507],
[-0.000000843, 0.000084733, 0.000169340, 3.661424974],
[ 0.000000676, -0.000074756, 0.005751451, 3.596300338],
[-0.000001860, 0.000229543, -0.006420507, 3.758593109],
[ 0.000006764, -0.000934745, 0.045972458, 2.972698644],
[ 0.000014803, -0.002140505, 0.106260454, 1.967898711],
[-0.000025975, 0.004587858, -0.263799480, 8.752330828],
[ 0.000009098, -0.001725357, 0.114993424, 1.176472749],
[-0.000010418, 0.002080207, -0.132368251, 6.535975709],
[ 0.000032572, -0.006947575, 0.499576502, -8.209401868],
[-0.000039870, 0.009351884, -0.722882956, 22.352084596],
[ 0.000046909, -0.011475011, 0.943268640, -22.078624629],
[-0.000067764, 0.017766572, -1.542265901, 48.344854010],
[ 0.000144148, -0.039449875, 3.607214322,-106.139552662],
[-0.000108830, 0.032648910, -3.242170215, 110.757624352]
])
But the resulting shape is (20,), not the (20, 4) I expected.
Question 1: Can anyone tell me why? And how do I create the array correctly?
Question 2: When I add the data type argument, dtype=np.float, I get the following error:
*TypeError: float() argument must be a string or a number, not 'list'*
but the array isn't intended to be a list.

I found the mistake on my own after trying to np.vstack all vectors.
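The resulting error said that the rows with index 2, 3 and 4 do not have size 4 as expected.
In those rows I had typed a comma instead of a decimal point (for example 3,509325279), so each of them contained five elements. Replacing the comma with a dot solved the problem.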
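To see why a decimal comma changes the shape, here is a minimal sketch using two of the rows from the question. The names rows, row_lengths, bad_rows and fixed are illustrative and not from the original post. Python reads 3,509325279 as two separate list elements, so that row has five entries instead of four, and NumPy cannot build a rectangular 2-D array from rows of unequal length.

```python
import numpy as np

# Two rows copied from the question; the second one contains the decimal-comma typo.
rows = [
    [-0.000847283, 0.000000000, 0.141182070, 2.750000000],
    [-0.000098373, 0.003388169, -0.021976882, 3,509325279],  # parsed as 3 and 509325279
]

# Checking row lengths with plain Python exposes the ragged row before NumPy sees it.
row_lengths = [len(row) for row in rows]
bad_rows = [index for index, length in enumerate(row_lengths) if length != 4]
print(row_lengths, bad_rows)

# With the comma replaced by a decimal point, the array is rectangular.
fixed = np.array(
    [
        [-0.000847283, 0.000000000, 0.141182070, 2.750000000],
        [-0.000098373, 0.003388169, -0.021976882, 3.509325279],
    ],
    dtype=float,
)
print(fixed.shape)
```

A length check like this is a quick way to locate the offending rows in a long literal before passing it to np.array.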

associative arrays in openscad?

Does openscad have any language primitive for string-keyed associative arrays (a.k.a hash maps, a.k.a dictionaries)? Or is there any convention for how to emulate associative arrays?
So far all I can think of is using vectors and using variables to map indexes into the vector to human readable names. That means there's no nice, readable way to define the vector, you just have to comment it.
Imagine I want to write something akin to the Python data structure:
bobbin_metrics = {
'majacraft': {
'shaft_inner_diameter': 9.0,
'shaft_outer_diameter': 19.5,
'close_wheel_diameter': 60.1,
# ...
},
'majacraft_jumbo': {
'shaft_inner_diameter': 9.0,
'shaft_outer_diameter': 25.0,
'close_wheel_diameter': 100.0,
},
# ...
}
such that I can reference it in model definitions in some recognisably hash-map-like way, like passing bobbin_metrics['majacraft'] to something as metrics and referencing metrics['close_wheel_diameter'].
So far my best effort looks like
// Vector indexes into bobbin-metrics arrays
BM_SHAFT_INNER_DIAMETER = 0;
BM_SHAFT_OUTER_DIAMETER = 1;
BM_CLOSE_WHEEL_DIAMETER = 2;
bobbin_metrics_majacraft = [
    9.0,   // shaft inner diameter
    19.5,  // shaft outer diameter
    60.1,  // close-side wheel diameter
    // ....
];
bobbin_metrics_majacraft_jumbo = [
    9.0,   // shaft inner diameter
    25.0,  // shaft outer diameter
    100.0, // close-side wheel diameter
    // ....
];
bobbin_metrics = [
    bobbin_metrics_majacraft,
    bobbin_metrics_majacraft_jumbo,
    // ...
];
// Usage when passed a bobbin metrics vector like
// bobbin_metrics_majacraft as 'metrics' to a function
metrics[BM_SHAFT_INNER_DIAMETER]
I think that'll work. But it's U.G.L.Y.. Not quite "I write applications in bash" ugly, but not far off.
Is there a better way?
I'm prepared to maintain the data set outside openscad and have a generator for an include file if I have to, but I'd rather not.
Also, in honour of April 1 I miss the blink tag and wonder if the scrolling marquee will work? Tried 'em :)
I played around with the OpenSCAD search() function, which is documented in the manual here:
https://en.wikibooks.org/wiki/OpenSCAD_User_Manual/Other_Language_Features#Search
The following pattern gives a form of associative list. It may not be optimal, but it provides a way to set up a dictionary structure and retrieve a value by string key:
// associative searching
// dp 2019
// - define the dictionary
dict = [
["shaft_inner_diameter", 9.0],
["shaft_outer_diameter", 19.5],
["close_wheel_diameter", 60.1]
];
// specify the search term
term = "close_wheel_diameter";
// execute the search
find = search(term, dict);
// process results
echo("1", find);
echo ("2",dict[find[0]]);
echo ("3",dict[find[0]][1]);
The above produces:
Compiling design (CSG Tree generation)...
WARNING: search term not found: "l"
...
WARNING: search term not found: "r"
ECHO: "1", [2, 0]
ECHO: "2", ["close_wheel_diameter", 60.1]
ECHO: "3", 60.1
Personally, I would do this sort of thing in Python then generate the OpenSCAD as an intermediate file or maybe use the SolidPython library.
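As a rough sketch of the "generate it from Python" idea, the snippet below turns a nested Python dict into an OpenSCAD vector of ["key", value] pairs, the same layout used with search() in this thread. The helper name to_scad and the variable scad_source are hypothetical, and the sketch only handles numbers, strings and nested dicts.

```python
def to_scad(value):
    """Render a Python value as an OpenSCAD literal.

    Dicts become vectors of ["key", value] pairs; strings are double-quoted;
    anything else (assumed here to be a number) is rendered with repr().
    """
    if isinstance(value, dict):
        pairs = ", ".join(f'["{key}", {to_scad(item)}]' for key, item in value.items())
        return f"[{pairs}]"
    if isinstance(value, str):
        return f'"{value}"'
    return repr(value)


bobbin_metrics = {
    "majacraft": {
        "shaft_inner_diameter": 9.0,
        "shaft_outer_diameter": 19.5,
        "close_wheel_diameter": 60.1,
    },
    "majacraft_jumbo": {
        "shaft_inner_diameter": 9.0,
        "shaft_outer_diameter": 25.0,
        "close_wheel_diameter": 100.0,
    },
}

# One OpenSCAD assignment that could be written to an include file.
scad_source = f"bobbin_metrics = {to_scad(bobbin_metrics)};\n"
print(scad_source)
```

The printed line could be saved to a file of your choosing and pulled into the model with include, which keeps the readable data definition in Python and leaves only the lookups in OpenSCAD.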
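An example of a function that uses search() and does not produce any warnings. The key is wrapped in a list, so it is searched as a whole string rather than character by character.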
available_specs = [
["mgn7c", 1,2,3,4],
["mgn7h", 2,3,4,5],
];
function selector(item) = available_specs[search([item], available_specs)[0]];
chosen_spec = selector("mgn7c");
echo("Specification was returned from function", chosen_spec);
The above will produce the following output:
ECHO: "Specification was returned from function", ["mgn7c", 1, 2, 3, 4]
Another very similar approach is to use a list comprehension with a condition, just as you would in Python. It looks a bit simpler; note that it returns a list of all matching entries, so take element [0] to get a single spec.
function selector(item) = [
for (spec = available_specs)
if (spec[0] == item)
spec
];

Matrix Inversion Methods

When one has a problem that multiplies a matrix inverse by a vector, such as
x = inv(A) * b
one can take a Cholesky decomposition of A and back-substitute b to find the resulting vector x. However, a matrix inverse is sometimes needed when the problem is not formulated as above. My question is: what is the best way to handle such a situation? Below, I have compared various ways (using numpy) to invert a positive definite matrix:
Firstly, generate the matrix:
>>> A = np.random.rand(5,5)
>>> A
array([[ 0.13516074, 0.2532381 , 0.61169708, 0.99678563, 0.32895589],
[ 0.35303998, 0.8549499 , 0.39071336, 0.32792806, 0.74723177],
[ 0.4016188 , 0.93897663, 0.92574706, 0.93468798, 0.90682809],
[ 0.03181169, 0.35059435, 0.10857948, 0.36422977, 0.54525 ],
[ 0.64871162, 0.37809219, 0.35742865, 0.7154568 , 0.56028468]])
>>> A = np.dot(A.transpose(), A)
>>> A
array([[ 0.72604206, 0.96959581, 0.82773451, 1.10159817, 1.05327233],
[ 0.96959581, 1.94261607, 1.53140854, 1.80864185, 1.9766411 ],
[ 0.82773451, 1.53140854, 1.52338262, 1.89841402, 1.59213299],
[ 1.10159817, 1.80864185, 1.89841402, 2.61930178, 2.01999385],
[ 1.05327233, 1.9766411 , 1.59213299, 2.01999385, 2.10012097]])
The results for the method of direct inversion are as follows:
>>> np.linalg.inv(A)
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
When using the Moore-Penrose pseudoinverse, the results are as follows (you might notice that, to the displayed precision, the results are the same as direct inversion):
>>> np.linalg.pinv(A)
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
Finally, when solving with the identity matrix:
>>> np.linalg.solve(A, np.eye(5))
array([[ 5.49746838, -1.92540877, 2.24730018, -2.20242449,
-0.53025806],
[ -1.92540877, 95.34219156, -67.93144606, 50.16450952,
-85.52146331],
[ 2.24730018, -67.93144606, 57.0739859 , -40.56297863,
58.55694127],
[ -2.20242449, 50.16450952, -40.56297863, 30.6441555 ,
-44.83400183],
[ -0.53025806, -85.52146331, 58.55694127, -44.83400183,
79.96573405]])
Again, you might notice that on a cursory inspection, the result is the same as the previous two methods.
It is well known that matrix inversion is an ill posed problem due to numerical instability that should be avoided where possible. However, in situations where it appears unavoidable, what is the preferable approach and why? To clarify, I am referring to the best approach when implementing such equations in software.
An example of such a problem is provided with another of my questions.
The reason for avoiding inverting matrices has only to do with efficiency. It is faster to solve the linear systems directly. If you think of the problem in your linked question a bit differently, you can apply the same principles.
In order to find the matrix inv(K) * Y * T(Y) * inv(K) - D * inv(K) you can solve the following systems of equations:
K * R * K = Y * T(Y)
You can solve it in two parts:
R2 * K = R1
K * R1 = Y * T(Y)
So you first solve for R1 with your usual method, then solve for R2 (recognise that you can solve T(K) * T(R2) = T(R1) if you have to).
However, at this point I don't know if this will be more efficient than computing the inverse explicitly unless K is symmetric. (There may be a way to efficiently get the decomposition of T(K) from K, but I don't know offhand)
If K is symmetric then you can compute your decomposition on K once and reuse it for the two back-substitution steps and it might be more efficient than computing the inverse explicitly.
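The two-step solve described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: K is a made-up symmetric positive definite matrix, Y is random data, and the names R1, R2 and reference are chosen to mirror the equations in this answer. The D * inv(K) term is left out because D is not defined here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed test data: a well-conditioned symmetric positive definite K and a random Y.
M = rng.random((5, 5))
K = M.T @ M + 5.0 * np.eye(5)
Y = rng.random((5, 3))
B = Y @ Y.T  # right-hand side Y * T(Y)

# Step 1: solve K * R1 = Y * T(Y) for R1.
R1 = np.linalg.solve(K, B)

# Step 2: solve R2 * K = R1 by transposing it to T(K) * T(R2) = T(R1).
R2 = np.linalg.solve(K.T, R1.T).T

# Compare against the explicit-inverse formula inv(K) * Y * T(Y) * inv(K).
K_inv = np.linalg.inv(K)
reference = K_inv @ B @ K_inv
matches = np.allclose(R2, reference)
print(matches)
```

Note that np.linalg.solve factorises K again on each call. To reuse one decomposition for both steps, as suggested for symmetric K, something like scipy.linalg.cho_factor followed by cho_solve would be needed instead.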

ERROR: `*` has no method matching *(::Variable)

I wrote the following code:
using JuMP
m = Model()
const A =
[ :a0 ,
:a1 ,
:a2 ]
const T = [1:5]
const U =
[
:a0 => [9 9 9 9 999],
:a1 => [11 11 11 11 11],
:a2 => [1 1 1 1 1]
]
@defVar(m, x[A,T], Bin)
@setObjective(m, Max, sum{sum{x[i,j] * U[i,j], i=A}, j=T} )
print(m)
status = solve(m)
println("Objective value: ", getObjectiveValue(m))
println("x = ", getValue(x))
When I run it I get the following error
ERROR: `*` has no method matching *(::Variable)
in anonymous at /home/username/.julia/v0.3/JuMP/src/macros.jl:71
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
while loading /programs/julia-0.2.1/models/a003.jl, in expression starting on line 21
What's the correct way of doing this?
As the manual says:
There is one key restriction on the form of the expression in the second case: if there is a product between coefficients and variables, the variables must appear last. That is, Coefficient times Variable is good, but Variable times Coefficient is bad
Let me know if there is another place I could put this that would have helped you out.
This situation isn't desirable but unfortunately we haven't got a good solution yet that retains the fast model construction capabilities of JuMP.
I believe the problem with U is that it is a dictionary of arrays, thus you first need to index into the dictionary to return the correct array, then index into the array. JuMP's variables have more powerful indexing, so allow you to do it in one set of [].
I resolved my problem: constants must precede variables, as I read somewhere. Moreover, it seems that an array of constants must be indexed as an array of arrays, while variables can be indexed like matrices.
Here's the correct line:
@setObjective(m, Max, sum{sum{U[i][j]*x[i,j], i=A}, j=T} )

Create output file in matlab containing numeric and string cells

I am currently working on a project where I have to program the same tool both in Matlab and R and compare both software options.
I started in R and am now translating the code to Matlab, but I am stuck at the most important part: the output file that the tool creates after doing the analysis.
Basically, my tool makes an analysis that loops n times, and after each loop I get many variables that go into an output table. So, to be more clear, after each loop I get variables:
A = 123
B = 456
C = 'string1'
D = 'string2'
E = 789
The values in each variable change after each loop, I just want to make clear that the variables are both numeric and string values since this is what causes my problem.
In R what I do after each loop is:
outp <- cbind(A,B,C,D,E)
and create a data frame containing each variable in one cell arranged horizontally to afterwards add the result of each loop vertically in a new data frame:
outp2 <- rbind(outp2,outp)
so in the end I get a data frame (outp2) with A,B,C,D,E columns and n rows containing the values of each variable after each loop. So at the end of the looping process I can use write.csv function and create an output file of outp2 that contains both numeric and string columns.
I tried to do this in Matlab but I cannot find a function that can join the data in the same way I am doing it in R because using brackets '[]' only allows me to join numeric kind of variables. So basically my question is: How can I replicate what I am doing in R in Matlab?
I hope I was clear enough, I found it a bit hard to explain.
You can append your output with a cell array, first using curly brackets to declare your cell format (empty {} or containing your data {...}), then using brackets [...] to concatenate the output (one line after the others using ;).
out_array = {}; %initialize empty
%vertical concatenation with ";"
for ii = 1:3
out_array = [out_array; {123, 456, 'string1', 'string2', 789}];
end
This gives
out_array =
[123] [456] 'string1' 'string2' [789]
[123] [456] 'string1' 'string2' [789]
[123] [456] 'string1' 'string2' [789]
I don't know if this solves your problem, but in Matlab you can do things like
outp = {123, 456, 'string1', 'string2', 789}
Just use curly braces instead of square brackets.
As they have said before, use curly braces to create a cell array. I imagine A, B, C, D, and E are your table headers and you already have the data that goes under them, so I'd do it like this:
outp = { A , B , C , D , E };
% This next step is only to have some data...
outp2 = magic(5);
outp2 = num2cell(outp2);
output = [ outp ; outp2 ]
output =
[123] [456] 'string1' 'string2' [789]
[ 17] [ 24] [ 1] [ 8] [ 15]
[ 23] [ 5] [ 7] [ 14] [ 16]
[ 4] [ 6] [ 13] [ 20] [ 22]
[ 10] [ 12] [ 19] [ 21] [ 3]
[ 11] [ 18] [ 25] [ 2] [ 9]

How can I get a flat result from a list comprehension instead of a nested list?

I have a list A, and a function f which takes an item of A and returns a list. I can use a list comprehension to convert everything in A like [f(a) for a in A], but this returns a list of lists. Suppose my input is [a1,a2,a3], resulting in [[b11,b12],[b21,b22],[b31,b32]].
How can I get the flattened list [b11,b12,b21,b22,b31,b32] instead? In other words, in Python, how can I get what is traditionally called flatmap in functional programming languages, or SelectMany in .NET?
(In the actual code, A is a list of directories, and f is os.listdir. I want to build a flat list of subdirectories.)
See also: How do I make a flat list out of a list of lists? for the more general problem of flattening a list of lists after it's been created.
You can have nested iterations in a single list comprehension:
[filename for path in dirs for filename in os.listdir(path)]
which is equivalent (at least functionally) to:
filenames = []
for path in dirs:
    for filename in os.listdir(path):
        filenames.append(filename)
>>> from functools import reduce # not needed on Python 2
>>> list_of_lists = [[1, 2],[3, 4, 5], [6]]
>>> reduce(list.__add__, list_of_lists)
[1, 2, 3, 4, 5, 6]
The itertools solution is more efficient, but this feels very pythonic.
You can find a good answer in the itertools recipes:
import itertools
def flatten(list_of_lists):
    return list(itertools.chain.from_iterable(list_of_lists))
The question asked for flatmap. Some implementations have been proposed, but they may create intermediate lists unnecessarily. Here is one implementation that is based on iterators.
def flatmap(func, *iterable):
    return itertools.chain.from_iterable(map(func, *iterable))
In [148]: list(flatmap(os.listdir, ['c:/mfg','c:/Intel']))
Out[148]: ['SPEC.pdf', 'W7ADD64EN006.cdr', 'W7ADD64EN006.pdf', 'ExtremeGraphics', 'Logs']
In Python 2.x, use itertools.imap in place of map.
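As a small illustration of the iterator-based flatmap on Python 3, the sketch below uses a stand-in function instead of os.listdir so that it runs anywhere. The helper children and the variable names are hypothetical. It also shows that the result is lazy: flatmap can consume an infinite input as long as only part of the output is requested.

```python
import itertools


def flatmap(func, *iterables):
    """Lazily apply func to each item and chain the resulting iterables."""
    return itertools.chain.from_iterable(map(func, *iterables))


def children(name):
    """Hypothetical stand-in for os.listdir: returns two fake entries per name."""
    return [f"{name}/a", f"{name}/b"]


roots = ["x", "y"]

# The nested list comprehension and the iterator-based flatmap give the same items.
nested = [entry for root in roots for entry in children(root)]
lazy = list(flatmap(children, roots))
print(nested == lazy)

# Laziness: only as much of the infinite input is consumed as islice asks for.
first_five = list(itertools.islice(flatmap(lambda n: [n, -n], itertools.count(1)), 5))
print(first_five)
```

Because nothing is materialised until the result is iterated, no intermediate list of lists is built, which is the point made about this implementation.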
You could just do the straightforward:
subs = []
for d in dirs:
    subs.extend(os.listdir(d))
You can concatenate lists using the normal addition operator:
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
The built-in function sum will add the numbers in a sequence and can optionally start from a specific value:
>>> sum(xrange(10), 100)
145
Combine the above to flatten a list of lists:
>>> sum([[1, 2], [3, 4]], [])
[1, 2, 3, 4]
You can now define your flatmap:
>>> def flatmap(f, seq):
...     return sum([f(s) for s in seq], [])
...
>>> flatmap(range, [1,2,3])
[0, 0, 1, 0, 1, 2]
Edit: I just saw the critique in the comments for another answer and I guess it is correct that Python will needlessly build and garbage collect lots of smaller lists with this solution. So the best thing that can be said about it is that it is very simple and concise if you're used to functional programming :-)
subs = []
map(subs.extend, (os.listdir(d) for d in dirs))
(but Ants's answer is better; +1 for him)
import itertools
x=[['b11','b12'],['b21','b22'],['b31']]
y=list(itertools.chain(*x))
print y
itertools will work from python2.3 and greater
You could try itertools.chain(), like this:
import itertools
import os
dirs = ["c:\\usr", "c:\\temp"]
subs = list(itertools.chain(*[os.listdir(d) for d in dirs]))
print subs
itertools.chain() returns an iterator, hence the passing to list().
This is the simplest way to do it:
from functools import reduce  # needed on Python 3
def flatMap(array):
    return reduce(lambda a,b: a+b, array)
The 'a+b' refers to concatenation of two lists
You can use pyxtension:
from pyxtension.streams import stream
stream([ [1,2,3], [4,5], [], [6] ]).flatMap() == range(7)
Google brought me to the following solution:
def flatten(l):
    if isinstance(l,list):
        return sum(map(flatten,l), [])
    else:
        return [l]
If listA=[list1,list2,list3]
flattened_list=reduce(lambda x,y:x+y,listA)
This will do.
