I have what I think is an interesting question about Google Sheets and some maths. Here is the scenario:
4 numbers as follows:
64.20 | 107 | 535 | 1070
A reference number into which the previous numbers need to fit, leaving the minimum possible residue, while recording how many times each of them fits into the reference number. For example, say the reference number is the following:
806.45
So here is the problem:
I'm calculating how many times each of those 4 numbers fits into the reference number, starting from the highest and working down to the lowest, like this:
| 1070 | => =IF(E12/((I15+J15)+IF(H17,K17,0)+IF(H19,K19,0)) > 0,ROUNDDOWN(E12/((I15+J15)+IF(H17,K17,0)+IF(H19,K19,0))),0)
| 535 | => =IF(H15>0,ROUNDDOWN((E12-K15-IF(H17,K17,0)-IF(H19,K19,0))/(I14+J14)),ROUNDDOWN(E12/((I14+J14)+IF(H17,K17,0)+IF(H19,K19,0))))
| 107 | => =IF(OR(H15>0,H14>0),ROUNDDOWN((E12-K15-K14-IF(H17,K17,0)-IF(H19,K19,0))/(I13+J13)),ROUNDDOWN((E12-IF(H17,K17,0)-IF(H19,K19,0))/(I13+J13)))
| 64.20 | => =IF(OR(H15>0,H14>0,H13>0),ROUNDDOWN((E12-K15-K14-K13-IF(H17,K17,0)-IF(H19,K19,0))/(I12+J12)),ROUNDDOWN((E12-IF(H17,K17,0)-IF(H19,K19,0))/(I12+J12)))
As you can see, I'm checking whether any of the higher values fits at least once, so I can subtract its amount from the original number and then calculate how many times the lower number fits into the result of that subtraction. You can also see that I'm including some checkboxes in the formulas in order to add a fixed number to the main number.
This actually works, and as you can see in the example, the result is:
| 1070 | -> Fits 0 times
| 535 | -> Fits 1 time
| 107 | -> Fits 2 times
| 64.20 | -> Fits 0 times
The residue of 806.45 in this example is: 57.45
But each number that needs to fit into the main number needs to take the others into consideration; if you solve this exercise manually, you can get something much better, like this:
| 1070 | -> Fits 0 times
| 535 | -> Fits 1 time
| 107 | -> Fits 0 times
| 64.20 | -> Fits 4 times
The residue of 806.45 in this example is: 14.65
By residue I mean the remainder after subtracting. I'm sorry if this is not clear; it's hard for me to explain maths in English, since it's not my native language. Please see the spreadsheet and make a copy to better understand what I'm trying to do, or suggest a way to explain it better if possible.
So what would you do to make this work more efficiently and "smartly", leaving the minimum possible residue after the calculation?
Here is the Google spreadsheet for reference and practice; please make a copy so others can try their own solutions:
LINK TO SPREADSHEET
Thanks in advance for any help or hints.
Delete all current formulas in H12:H15.
Then place this mega-formula in H12:
=ArrayFormula(QUERY(SPLIT(FLATTEN(SPLIT(VLOOKUP(E12,QUERY(SPLIT(FLATTEN(QUERY(SPLIT(FLATTEN(QUERY(SPLIT(FLATTEN(SEQUENCE(ROUNDUP(E12/I12),1,0)&" "&I12&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)&" "&I13)&"|"&(SEQUENCE(ROUNDUP(E12/I12),1,0)*I12)+(TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)*I13))),"|"),"Select Col1")&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I14),1,0)&" "&I14)&"|"&QUERY(SPLIT(FLATTEN(SEQUENCE(ROUNDUP(E12/I12),1,0)&" "&I12&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)&" "&I13)&"|"&((SEQUENCE(ROUNDUP(E12/I12),1,0)*I12)+(TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)*I13)))),"|"),"Select Col2")+TRANSPOSE(SEQUENCE(ROUNDUP(E12/I14),1,0)*I14)),"|"),"Select Col1")&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I15),1,0)&" "&I15)&"|"&QUERY(SPLIT(FLATTEN(QUERY(SPLIT(FLATTEN(SEQUENCE(ROUNDUP(E12/I12),1,0)&" "&I12&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)&" "&I13)&"|"&(SEQUENCE(ROUNDUP(E12/I12),1,0)*I12)+(TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)*I13))),"|"),"Select Col1")&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I14),1,0)&" "&I14)&"|"&QUERY(SPLIT(FLATTEN(SEQUENCE(ROUNDUP(E12/I12),1,0)&" "&I12&" / "&TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)&" "&I13)&"|"&((SEQUENCE(ROUNDUP(E12/I12),1,0)*I12)+(TRANSPOSE(SEQUENCE(ROUNDUP(E12/I13),1,0)*I13)))),"|"),"Select Col2")+TRANSPOSE(SEQUENCE(ROUNDUP(E12/I14),1,0)*I14)),"|"),"Select Col2")+TRANSPOSE(SEQUENCE(ROUNDUP(E12/I15),1,0)*I15)),"|"),"Select Col2, Col1 WHERE Col2 <= "&E12&" ORDER BY Col2 Asc, Col1 Desc"),2,TRUE)," / ",0,0))," "),"Select Col1"))
Typically, I explain my formulas. In this case, I trust that readers will understand why I cannot explain it. I can only offer it in working order.
To briefly give the general idea, this formula figures out how many times each of the four numbers fits into the target number alone and then adds every possible combination of all of those. Those are then limited to only the combinations less than the target number and sorted smallest to largest in total. Then a VLOOKUP looks up the target number in that list, returns the closest match, SPLITs the multiples from the amounts (which, in the end, have been concatenated into long strings), and returns only the multiples.
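If it helps to see that idea outside of Sheets, here is a minimal brute-force C# sketch of what the formula is doing: try every combination of counts and keep the combination whose total stays at or under the target with the smallest residue. The values and target are taken from the question; the checkbox add-ons are left out, and the class and variable names are just illustrative.

using System;

class MinimumResidue
{
    static void Main()
    {
        // Values and target from the question; checkbox add-ons omitted.
        double[] values = { 1070, 535, 107, 64.20 };
        double target = 806.45;

        double bestTotal = 0;
        int[] best = new int[4];

        // Try every combination of counts, keeping the largest total that
        // still fits into the target, i.e. the smallest residue.
        for (int a = 0; a <= (int)(target / values[0]); a++)
        for (int b = 0; b <= (int)(target / values[1]); b++)
        for (int c = 0; c <= (int)(target / values[2]); c++)
        for (int d = 0; d <= (int)(target / values[3]); d++)
        {
            double total = a * values[0] + b * values[1] + c * values[2] + d * values[3];
            if (total <= target && total > bestTotal)
            {
                bestTotal = total;
                best = new[] { a, b, c, d };
            }
        }

        Console.WriteLine($"Counts: {string.Join(", ", best)}  Residue: {target - bestTotal:F2}");
    }
}

For the question's example this should reproduce the 14.65 residue found by hand (one 535 plus four 64.20).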
I have a stream of numbers, and in every cycle I need to compute the average of the last N of them. This can, of course, be solved using an array where I store the last N numbers; in every cycle I shift it, add the new one, and compute the average.
N = 3
+---+-----+
| a | avg |
+---+-----+
| 1 | |
| 2 | |
| 3 | 2.0 |
| 4 | 3.0 |
| 3 | 3.3 |
| 3 | 3.3 |
| 5 | 3.7 |
| 4 | 4.0 |
| 5 | 4.7 |
+---+-----+
The first N numbers (where there "isn't enough data for computing the average") don't interest me much, so the results there may be anything/undefined.
My question is, can this be done without using an array, that is, with a constant amount of memory? If so, then how?
I'll do the coding myself - I just need to know the theory.
Thanks
Think of this as a black box containing some state. If you control the input stream, you can draw conclusions about the state. In your sliding-window, array-based approach, it is kind of obvious that if you feed a bunch of zeros into the algorithm after the original input, you get a bunch of averages with a decreasing number of non-zero values taken into account. The last one has just one original non-zero value, so if you multiply it by N you get the last input back. Using that and the second-to-last output, which accounts for two non-zero inputs, you can reconstruct the second-to-last input, and so on.
So essentially your algorithm needs to maintain sufficient state to reconstruct the last N elements of input, at least if you formulate it as an on-line algorithm. I don't think an off-line algorithm can do any better, except if you allow it to read the input multiple times, but I don't have as strong an argument for that.
Of course, in some theoretical models you can avoid the array and e.g. encode all the state into a single arbitrary length integer, but that's just cheating the theory, and doesn't make any difference in practice.
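For completeness, here is a C# sketch of the standard constant-time-per-update version: it still keeps the last N values (here in a circular buffer), which is exactly the state the argument above says you cannot avoid. The class and member names are just illustrative.

using System;

class MovingAverage
{
    private readonly double[] buffer;   // holds the last N values
    private double sum;                 // running sum of the buffer contents
    private int next;                   // index that will be overwritten next
    private long count;                 // how many values have been seen so far

    public MovingAverage(int n)
    {
        buffer = new double[n];
    }

    // O(1) per update: replace the oldest value and adjust the running sum.
    public double Add(double value)
    {
        sum += value - buffer[next];
        buffer[next] = value;
        next = (next + 1) % buffer.Length;
        count++;
        // Before N values have arrived the result is undefined per the question;
        // here we simply average what we have so far.
        return sum / Math.Min(count, buffer.Length);
    }
}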
I need an algorithm that can do a one-to-one mapping (i.e. no collisions) of a 32-bit signed integer onto another 32-bit signed integer.
My real concern is enough entropy so that the output of the function appears to be random. Basically I am looking for a cipher similar to XOR Cipher but that can generate more arbitrary-looking outputs. Security is not my real concern, although obscurity is.
Edit for clarification purpose:
The algorithm must be symmetric, so that I can reverse the operation without a keypair.
The algorithm must be bijective, every 32-bit input number must generate a 32-bit unique number.
The output of the function must be obscure enough; adding only one to the input should have a big effect on the output.
Example expected result:
F(100) = 98456
F(101) = -758
F(102) = 10875498
F(103) = 986541
F(104) = 945451245
F(105) = -488554
Just like MD5, changing one thing may change lots of things.
I am looking for a mathematical function, so manually mapping integers is not a solution for me. For those who are asking, algorithm speed is not very important.
Use any 32-bit block cipher! By definition, a block cipher maps every possible input value in its range to a unique output value, in a reversible fashion, and by design, it's difficult to determine what any given value will map to without the key. Simply pick a key, keep it secret if security or obscurity is important, and use the cipher as your transformation.
For an extension of this idea to non-power-of-2 ranges, see my post on Secure Permutations with Block Ciphers.
Addressing your specific concerns:
The algorithm is indeed symmetric. I'm not sure what you mean by "reverse the operation without a keypair". If you don't want to use a key, hardcode a randomly generated one and consider it part of the algorithm.
Yup - by definition, a block cipher is bijective.
Yup. It wouldn't be a good cipher if that were not the case.
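Since mainstream ciphers use 64- or 128-bit blocks, a 32-bit-block cipher is usually a small hand-rolled Feistel network (Skip32 is a well-known example). Below is a hedged, illustrative C# sketch: the round keys and the round function F are my own arbitrary choices with no cryptographic vetting, but any Feistel construction of this shape is bijective on 32-bit values by design, and decryption simply runs the rounds in reverse.

static class Feistel32
{
    // Four rounds over 16-bit halves; keys and round function are illustrative only.
    static readonly uint[] Keys = { 0xA511E9B3, 0x5B3C0F12, 0x9E3779B9, 0xC2B2AE35 };

    // Round function: any function of (half, key) works; it never needs to be inverted.
    static uint F(uint half, uint key)
    {
        uint x = unchecked(half * 0x9E3779B1u + key);
        x ^= x >> 15;
        return x & 0xFFFF;
    }

    public static uint Encrypt(uint value)
    {
        uint left = value >> 16, right = value & 0xFFFF;
        foreach (var k in Keys)
        {
            uint tmp = right;
            right = left ^ F(right, k);   // classic Feistel round
            left = tmp;
        }
        return (left << 16) | right;
    }

    public static uint Decrypt(uint value)
    {
        uint left = value >> 16, right = value & 0xFFFF;
        for (int i = Keys.Length - 1; i >= 0; i--)
        {
            uint tmp = left;
            left = right ^ F(left, Keys[i]);  // undo the rounds in reverse order
            right = tmp;
        }
        return (left << 16) | right;
    }
}

For signed values, cast to uint on the way in and back to int on the way out.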
I will try to explain my solution using a much simpler example, which can then easily be extended to your large one.
Say I have a 4-bit number. There are 16 distinct values. Look at it as if it were a four-dimensional cube:
[Image: a four-dimensional hypercube (source: ams.org)]
Every vertex represents one of those numbers, and every bit represents one dimension. So it's basically XYZW, where each of the dimensions can only have the value 0 or 1. Now imagine you use a different order of dimensions, for example XZYW. Each of the vertices has now changed its number!
You can do this for any number of dimensions; just permute those dimensions. If security is not your concern, this could be a nice, fast solution for you. On the other hand, I don't know whether the output will be "obscure" enough for your needs, and certainly after a large amount of the mapping has been observed, the mapping can be reversed (which may be an advantage or a disadvantage, depending on your needs).
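A hedged C# sketch of this idea for 32 bits: a fixed, arbitrarily chosen permutation of the bit positions (the table below is my own example, not from the answer), applied and then inverted.

static class BitPermutation
{
    // An arbitrary permutation of the 32 bit positions:
    // output bit i comes from input bit Perm[i].
    static readonly int[] Perm =
        { 13, 2, 27, 8, 21, 0, 30, 5, 19, 10, 24, 3, 16, 29, 6, 11,
          25, 14, 1, 31, 9, 22, 4, 17, 28, 7, 20, 12, 23, 18, 26, 15 };

    public static uint Apply(uint x)
    {
        uint result = 0;
        for (int i = 0; i < 32; i++)
            result |= ((x >> Perm[i]) & 1u) << i;
        return result;
    }

    public static uint Invert(uint y)
    {
        uint result = 0;
        for (int i = 0; i < 32; i++)
            result |= ((y >> i) & 1u) << Perm[i];
        return result;
    }
}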
The following paper gives you 4 or 5 mapping examples, giving you functions rather than building mapped sets: www.cs.auckland.ac.nz/~john-rugis/pdf/BijectiveMapping.pdf
If your goal is simply to get a seemingly random permutation of numbers of a roughly defined size, then there is another possible way: reduce the set of numbers to one whose size is a prime number.
Then you can use a mapping of the form
f(i) = (i * a + b) % p
and if p is indeed a prime, this will be a bijection for all a != 0 and all b. It will look fairly random for larger a and b.
For example, in my case for which I stumbled on this question, I used 1073741789 as a prime for the range of numbers smaller than 1 << 30. That makes me lose only 35 numbers, which is fine in my case.
My encoding is then
((n + 173741789) * 507371178) % 1073741789
and the decoding is
(n * 233233408 + 1073741789 - 173741789) % 1073741789
Note that 507371178 * 233233408 % 1073741789 == 1, so those two numbers are inverses of each other in the field of numbers modulo 1073741789 (you can find such inverses with the extended Euclidean algorithm).
I chose a and b fairly arbitrarily, I merely made sure they are roughly half the size of p.
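If it helps, here is a hedged C# sketch of this scheme using the constants from this answer; ModInverse is the extended Euclidean algorithm mentioned above, so the decoding constant does not need to be hard-coded. Decode(Encode(n)) == n for any n in [0, p).

static class AffinePrimeMapping
{
    // Constants from the answer: P is prime, A and B were chosen fairly arbitrarily.
    const long P = 1073741789;
    const long A = 507371178;
    const long B = 173741789;

    public static long Encode(long n) => (n + B) % P * A % P;

    public static long Decode(long n) => (n * ModInverse(A, P) + P - B) % P;

    // Extended Euclidean algorithm: returns x with (a * x) % m == 1.
    static long ModInverse(long a, long m)
    {
        long t = 0, newT = 1, r = m, newR = a;
        while (newR != 0)
        {
            long q = r / newR;
            (t, newT) = (newT, t - q * newT);
            (r, newR) = (newR, r - q * newR);
        }
        return ((t % m) + m) % m;   // normalise into [0, m)
    }
}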
Apart from generating random lookup-tables, you can use a combination of functions:
XOR
symmetric bit permutation (for example shift 16 bits, or flip 0-31 to 31-0, or flip 0-3 to 3-0, 4-7 to 7-4, ...)
more?
Can you use a randomly generated lookup table? As long as the random numbers in the table are unique, you get a bijective mapping. It's not symmetric, though.
One 16 GB lookup table for all 32-bit values is probably not practical, but you could use two separate 16-bit lookup tables for the high word and the low word.
PS: I think you can generate a symmetric bijective lookup table, if that's important. The algorithm would start with an empty LUT:
+----+ +----+
| 1 | -> | |
+----+ +----+
| 2 | -> | |
+----+ +----+
| 3 | -> | |
+----+ +----+
| 4 | -> | |
+----+ +----+
Pick the first element, assign it a random mapping. To make the mapping symmetric, assign the inverse, too:
+----+ +----+
| 1 | -> | 3 |
+----+ +----+
| 2 | -> | |
+----+ +----+
| 3 | -> | 1 |
+----+ +----+
| 4 | -> | |
+----+ +----+
Pick the next number, again assign a random mapping, but pick a number that's not been assigned yet. (i.e. in this case, don't pick 1 or 3). Repeat until the LUT is complete. This should generate a random bijective symmetric mapping.
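A hedged C# sketch of that procedure over 16-bit values (a full 32-bit table would be the impractical 16 GB mentioned above): it pairs up unassigned values at random, so the resulting table is its own inverse. The names are illustrative.

using System;
using System.Collections.Generic;

static class SymmetricLut
{
    // Builds a random symmetric bijective lookup table over [0, size):
    // unassigned values are paired up at random, so Table[Table[x]] == x.
    public static int[] Build(int size, Random rng)
    {
        var table = new int[size];
        var pool = new List<int>(size);          // values not yet assigned
        for (int i = 0; i < size; i++) pool.Add(i);

        while (pool.Count > 0)
        {
            int a = PopAt(pool, pool.Count - 1);                       // any unassigned value
            int b = pool.Count == 0 ? a : PopAt(pool, rng.Next(pool.Count));
            table[a] = b;                                              // the mapping...
            table[b] = a;                                              // ...and its inverse
        }
        return table;
    }

    // Removes and returns pool[index] in O(1) by swapping with the last element.
    static int PopAt(List<int> pool, int index)
    {
        int value = pool[index];
        pool[index] = pool[pool.Count - 1];
        pool.RemoveAt(pool.Count - 1);
        return value;
    }
}

Two such tables, one applied to the high word and one to the low word, give a symmetric bijective mapping of the full 32-bit value.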
Take a number, multiply it by 9, reverse the digits, divide by 9.
123 <> 1107 <> 7011 <> 779
256 <> 2304 <> 4032 <> 448
1028 <> 9252 <> 2529 <> 281
Should be obscure enough !!
Edit: it is not a bijection for integers ending in 0:
900 <> 8100 <> 18 <> 2
2 <> 18 <> 81 <> 9
You can always add a specific rule, like:
Take a number, divide it by 10 x times (i.e. strip the trailing zeros), multiply by 9, reverse the digits, divide by 9, then multiply by 10^x.
And so
900 <> 9 <> 81 <> 18 <> 2 <> 200
200 <> 2 <> 18 <> 81 <> 9 <> 900
W00t it works !
Edit 2: For more obscurity, you can add an arbitrary number first and subtract it at the end.
900 < +256 > 1156 < *9 > 10404 < invert > 40401 < /9 > 4489 < -256 > 4233
123 < +256 > 379 < *9 > 3411 < invert > 1143 < /9 > 127 < -256 > -129
Here is my simple idea:
You can move around the bits of the number, as PeterK proposed, but you can have a different permutation of bits for each number, and still be able to decipher it.
The cipher goes like this:
Treat the input number as an array of bits I[0..31], and the output as O[0..31].
Prepare an array K[0..63] of 64 randomly generated numbers. This will be your key.
Take the bit of the input number from the position determined by the first random number (I[K[0] mod 32]) and place it at the beginning of your result (O[0]). Now, to decide which bit to place at O[1], use the previously taken bit. If it is 0, use K[1] to generate the position in I from which to take it; if it is 1, use K[2] (which simply means skipping one random number).
Now this will not work well, as you may take the same bit twice. In order to avoid it, renumber the bits after each iteration, omitting the used bits. To generate the position from which to take O[1] use I[K[p] mod 31], where p is 1 or 2, depending on the bit O[0], as there are 31 bits left, numbered from 0 to 30.
To illustrate this, I'll give an example:
We have a 4-bit number, and 8 random numbers: 25, 5, 28, 19, 14, 20, 0, 18.
I: 0111 O: ____
25 mod 4 = 1, so we'll take the bit whose position is 1 (counting from 0):
I: 0_11 O: 1___
We've just taken a bit of value 1, so we skip one random number and use 28. There are 3 bits left, so to get the position we take 28 mod 3 = 1. We take bit number 1 (counting from 0) of the remaining bits:
I: 0__1 O: 11__
Again we skip one number, and take 14. 14 mod 2 = 0, so we take the 0th bit:
I: ___1 O: 110_
Now it doesn't matter, but the previous bit was 0, so we take 20. 20 mod 1 = 0:
I: ____ O: 1101
And this is it.
Deciphering such a number is easy, one just has to do the same things. The position at which to place the first bit of the code is known from the key, the next positions are determined by the previously inserted bits.
This obviously has all the disadvantages of anything which just moves the bits around (for example, 0 becomes 0 and MAXINT becomes MAXINT), but it seems harder to work out how someone has encrypted the number without knowing the key, which has to be kept secret.
If you don't want to use proper cryptographic algorithms (perhaps for performance and complexity reasons) you can instead use a simpler cipher like the Vigenère cipher. This cipher was actually described as le chiffre indéchiffrable (French for 'the unbreakable cipher').
Here is a simple C# implementation that shifts values based on a corresponding key value:
void Main()
{
    var clearText = Enumerable.Range(0, 10);
    var key = new[] { 10, 20, Int32.MaxValue };
    var cipherText = Encode(clearText, key);
    var clearText2 = Decode(cipherText, key);   // round-trips back to 0..9
}

// Shift each value by the key element at the same (wrapped) position;
// unchecked arithmetic lets the addition overflow, keeping the mapping bijective.
IEnumerable<Int32> Encode(IEnumerable<Int32> clearText, IList<Int32> key) {
    return clearText.Select((i, n) => unchecked(i + key[n % key.Count]));
}

// Reverse the shift with the same key.
IEnumerable<Int32> Decode(IEnumerable<Int32> cipherText, IList<Int32> key) {
    return cipherText.Select((i, n) => unchecked(i - key[n % key.Count]));
}
This algorithm does not create a big shift in the output when the input is changed slightly. However, you can use another bijective operation instead of addition to achieve that.
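One hedged option (my own suggestion, not part of the answer above) is to run the keyed result through a fixed bijective mixing function of the xorshift-and-multiply kind used in integer hash finalizers, so that a one-unit change in the input flips many output bits. The constants below are arbitrary odd numbers, the inverse is computed rather than hard-coded, and the layout follows the LINQPad-style code above:

uint Mix(uint x)
{
    unchecked
    {
        // Each step is bijective on 32 bits: x ^= x >> k is invertible,
        // and multiplying by an odd constant is invertible mod 2^32.
        x ^= x >> 16;
        x *= 0x7FEB352Du;
        x ^= x >> 15;
        x *= 0x846CA68Bu;
        x ^= x >> 16;
        return x;
    }
}

uint Unmix(uint x)
{
    unchecked
    {
        // Undo the steps in reverse order with the inverse of each step.
        x = InvertXorShift(x, 16);
        x *= ModInverseOdd(0x846CA68Bu);
        x = InvertXorShift(x, 15);
        x *= ModInverseOdd(0x7FEB352Du);
        x = InvertXorShift(x, 16);
        return x;
    }
}

// Inverts y = x ^ (x >> shift) for 32-bit values.
uint InvertXorShift(uint y, int shift)
{
    uint x = y;
    for (int s = shift; s < 32; s += shift)
        x = y ^ (x >> shift);
    return x;
}

// Inverse of an odd constant modulo 2^32 (Newton's iteration).
uint ModInverseOdd(uint a)
{
    uint x = a;                           // correct to 3 bits for any odd a
    for (int i = 0; i < 4; i++)
        x = unchecked(x * (2u - a * x));  // each round doubles the number of correct bits
    return x;
}

Encode could then return Mix applied to the unchecked sum (cast via uint), and Decode would apply Unmix before subtracting the key.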
Draw a large circle on a large sheet of paper. Write all the integers from 0 to MAXINT clockwise from the top of the circle, equally spaced. Write all the integers from 0 to MININT anti-clockwise, equally spaced again. Observe that MININT is next to MAXINT at the bottom of the circle. Now make a duplicate of this figure on both sides of a piece of stiff card. Pin the stiff card to the circle through the centres of both. Pick an angle of rotation, any angle you like. Now you have a 1-1 mapping which meets some of your requirements, but is probably not obscure enough. Unpin the card, flip it around a diameter, any diameter. Repeat these steps (in any order) until you have a bijection you are happy with.
If you have been following closely it shouldn't be difficult to program this in your preferred language.
For Clarification following the comment: If you only rotate the card against the paper then the method is as simple as you complain. However, when you flip the card over the mapping is not equivalent to (x+m) mod MAXINT for any m. For example, if you leave the card unrotated and flip it around the diameter through 0 (which is at the top of the clock face) then 1 is mapped to -1, 2 to -2, and so forth. (x+m) mod MAXINT corresponds to rotations of the card only.
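A hedged reading of this in code: with 32-bit wraparound arithmetic, rotating the card is x + m and flipping it about a diameter is m - x, and any composition of those stays bijective. The constants below are arbitrary picks of my own.

static class CircleMapping
{
    // Rotate the card by m steps, or flip it about a diameter; both wrap mod 2^32.
    static int Rotate(int x, int m) => unchecked(x + m);
    static int Flip(int x, int m) => unchecked(m - x);

    public static int Forward(int x) =>
        Rotate(Flip(Rotate(x, 123456789), 987654321), -1111111);

    // Undo the steps in reverse order (a flip is its own inverse).
    public static int Backward(int y) =>
        Rotate(Flip(Rotate(y, 1111111), 987654321), -123456789);
}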
Split the number in two (the 16 most significant bits and the 16 least significant bits) and consider the bits in the two 16-bit halves as cards in two decks. Mix the decks by riffling one into the other.
So if your initial number is b31,b30,...,b1,b0 you end up with b15,b31,b14,b30,...,b1,b17,b0,b16. It's fast and quick to implement, as is the inverse.
If you look at the decimal representation of the results, the series looks pretty obscure.
You can manually map 0 -> maxvalue and maxvalue -> 0 to avoid them mapping onto themselves.
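A hedged C# sketch of this riffle and its inverse, following the bit order given above (for signed values, cast to uint and back):

static class BitRiffle
{
    // b31..b0  ->  b15,b31,b14,b30,...,b1,b17,b0,b16
    public static uint Shuffle(uint x)
    {
        uint lo = x & 0xFFFF;          // b15..b0
        uint hi = x >> 16;             // b31..b16
        uint result = 0;
        for (int i = 0; i < 16; i++)
        {
            result |= ((lo >> i) & 1u) << (2 * i + 1); // low-half bit i -> odd position
            result |= ((hi >> i) & 1u) << (2 * i);     // high-half bit i -> even position
        }
        return result;
    }

    public static uint Unshuffle(uint y)
    {
        uint lo = 0, hi = 0;
        for (int i = 0; i < 16; i++)
        {
            lo |= ((y >> (2 * i + 1)) & 1u) << i;
            hi |= ((y >> (2 * i)) & 1u) << i;
        }
        return (hi << 16) | lo;
    }
}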
I am building a website where I need to make sure that the number of "coins" and the number of "users" won't kill the database if they increase too quickly. I first posted this on Mathematica (thinking it was a maths website, but found out it's not). If this is the wrong place, please let me know and I'll move it accordingly. However, it does boil down to solving a complex problem: will my database explode if the users increase too quickly?
Here's the problem:
I am trying to confirm whether the following equations would work for my problem. The problem is that I have USERS (U) and I have COINS (C).
There are millions of different coins.
One user may have the same coin another user has. (i.e. both users have coin A)
Users can trade coins with each other. (i.e. Trade coin A for coin B)
Each user can trade any coin with another coin, so long as:
they don't trade a coin for the same coin (i.e. can't trade coin A for another coin A)
they can't trade with themselves (i.e. I can't offer my own Coin A for my own Coin B)
So, effectively, there are database rows stored in the DB:
trade_id | user_id | offer_id | want_id
1 | 1 | A | B
2 | 2 | B | C
So in the above data structure, user 1 offers coin A and wants coin B, and user 2 offers coin B and wants coin C. This is how I propose to store the data, and I need to know, if I get 1000 users and each of them has 15 coins, how many relationships will get built in this table if each user offers each coin to another user. Will it explode exponentially? Will it be scalable? etc.
In the case of 2 users with 2 coins, you'd have user 1 being able to trade his two coins with the other user's two coins, and vice versa. That makes 4 total possible trade relationships that can be set up. However, keep in mind that if user 1 offers A for B, user 2 can't offer B for A (because that relationship already exists).
What would the equation be to figure out how many TRADES can happen with U users and C coins?
Currently, I have one of two solutions, but neither seem to be 100% right. The two possible equations I have so far:
U! x C!
C x C x (U-1) x U
(where C = coins, and U = users);
Any thoughts on getting a more exact equation? How can I know without a shadow of a doubt, that if we scale to 1000 users with 10 coins each, that this table won't explode into millions of records?
If we just think about how many users can trade with other users, you could make a table of the allowable combinations.
user 1
1 | 2 | 3 | 4 | 5 | 6 | ...
________________________________
1 | N | Y | Y | Y | Y | Y | ...
user 2 2 | Y | N | Y | Y | Y | Y | ...
3 | Y | Y | N | Y | Y | Y | ...
The total number of entries in the table is U * U, and there are U N's down the diagonal.
There are two possibilities, depending on whether order matters: is trade(user_A, user_B) the same as trade(user_B, user_A) or not? If order matters, the number of possible trades is the number of Y's in the table, which is U * U - U, or (U-1) * U. If order is irrelevant, then it's half that number, (U-1) * U / 2, which gives the triangular numbers. Let's assume order is irrelevant.
Now, if we have two users, the situation with coins is similar. Order does matter here, so there are C * (C-1) possible trades between the two users.
Finally multiply the two together (U-1) * U * C * (C-1) / 2.
The good thing is that this is a polynomial, roughly U^2 * C^2, so it will not grow too quickly. The thing to watch out for is exponential growth, like calculating moves in chess. You're well clear of that.
One of the possibilities in your question had U!, which is the number of ways to arrange U distinct objects into a sequence. That grows even faster than exponentially.
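As a quick sanity check of that bound, here is a small hedged calculation plugging in the numbers from the question (1000 users, 10 coins each); the class name is just illustrative.

using System;

class TradeBound
{
    static void Main()
    {
        long users = 1000, coins = 10;

        // (U-1) * U * C * (C-1) / 2  -- the bound derived above.
        long maxTrades = (users - 1) * users * coins * (coins - 1) / 2;

        Console.WriteLine(maxTrades);   // prints 44955000
    }
}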
There are U possible users and there are C possible coins.
Hence there are OWNS = CxU possible "coins owned by an individual".
Hence there are also OWNS "possible offerings for a trade".
But a trade is a pair of two such offerings, restricted by the rule that the two persons acting as offerer cannot be the same, and neither can the offered coin be the same. So the number of candidates for completing a "possible offering" to form a "complete trade" is (C-1)x(U-1).
The number of possible ordered pairs that form a "full-blown trade" is thus
CxUx(C-1)x(U-1)
And then this is still to be divided by two because of the permutation issue (trades are a set of two (person,coin) pairs, not an ordered pair).
But please note that this sort of question is actually an extremely silly one to worry about in the world of "real" database design!
I need to know that if I get 1,000 users, and each of them have 15 coins, how many relationships will get built in this table if each user offers each coin to another user.
The most that can happen is all 1,000 users each trade all of their 15 coins, for 7,500 trades. This is 15,000 coins up for trade (1,000 users x 15 coins). Since it takes at least 2 coins to trade, you divide 15,000 by 2 to get the maximum number of trades, 7,500.
Your trade table is basically a Cartesian product of the number of users times the number of coins, divided by 2.
(U x C) / 2
I'm assuming users aren't trading for the sake of trading. That they want particular coins and once they get the coins, won't trade again.
Also, most relational databases can handle millions and even billions of rows in a table.
Just make sure you have an index on trade_id, and an index on (user_id, trade_id), in your Trade table.
The way I understand this is that you are designing an offer table. I.e. user A may offer coin a in exchange for coin b, but not to a specific user. Any other user may take the offer. If this is the case, the maximum number of offers is proportional to the number of users U and the square of the number of coins C.
The maximum number of possible trades (disregarding direction) is
C(C-1)/2.
Every user can offer all the possible trades, as long as every user is offering the trades in the same direction, without any trade being matched. So the absolute maximum number of records in the offer table is
C(C-1)/2*U.
If trades are allowed between more than two users, though, the number drops to just over half of that. E.g. if A offers a for b, B offers b for c, and C offers c for a, then a trade could be accomplished in a triangle by A getting b from B, B getting c from C, and C getting a from A.
The maximum number of rows in the table can then be calculated by splitting the C coins into two groups and offering any coin in the first group in exchange for any coin in the second. We get the maximum number of combinations if the groups are of the same size, C/2. The number of combinations is
C/2*C/2 = C^2/4.
Every user may offer all these trades without there being any possible trade. So the maximum number of rows is
C^2/4*U
which is just over half of
C(C-1)/2*U = 2*(C^2/4*U) - C/2*U.
I am quite a beginner in data warehouse design. I have read some theory, but recently ran into a practical problem with the design of an OLAP cube. I use a star schema.
Let's say I have 2 dimension tables and 1 fact table:
Dimension Gazetteer:
dimension_id
country_name
province_name
district_name
Dimension Device:
dimension_id
device_category
device_subcategory
Fact table:
gazetteer_id
device_dimension_id
hazard_id (measure column)
area_m2 (measure column)
A "business object" (which is actually a minefield) can have multiple devices, is located in a single location (Gazetteer), and occupies X square meters.
So, in order to know which device categories there are, I created one fact row per device in the hazard, like this:
+--------------+---------------------+-----------------------+-----------+
| gazetteer_id | device_dimension_id | hazard_id | area_m2 |
+--------------+---------------------+-----------------------+-----------+
| 123 | 321 | 0a0a-502c-11aa1331e98 | 6000 |
+--------------+---------------------+-----------------------+-----------+
| 123 | 654 | 0a0a-502c-11aa1331e98 | 6000 |
+--------------+---------------------+-----------------------+-----------+
| 123 | 987 | 0a0a-502c-11aa1331e98 | 6000 |
+--------------+---------------------+-----------------------+-----------+
I defined a measure "number of hazards" as distinct-count of hazard_id.
I also defined a "total area occupied" measure as a sum of area_m2.
Now I can use the dimension gazetteer and device and know how many hazards there are with given dimension members.
But the problem is the area_m2: because it is defined as a sum, it gives a value n times the actual area, where n is the number of devices of the hazard object. For example, the data above would give 18,000 m2.
How would you solve this problem?
I am using the Pentaho stack.
Thanks in advance
[moved from comment]
If a hazard_id is a minefield, and you're looking at mines-by-region (gazetteer) and size-of-minefields-by-gazetteer, maybe you could make a Hazard dimension which holds the area of the hazard; or possibly make a Null-device entry in the DeviceDimension table, where only the Null-device entry gets area_m2 set and the real devices get area_m2 = 0.
If you need to answer queries like: total area of minefields containing device 321, the second approach isn't going to easily answer these questions, which suggests that making a Hazard dimension might be a better approach.
I would also consider adding a device-count fact, which could have the num devices of each type per hazard.