I am trying to use an "implication" function that takes two parameters, f(A,B) = A -> B. In Boolean algebra, functions of this kind (e.g. conjunction, disjunction, ...) are used only with Boolean parameters (true or false, 1 or 0). Imagine that these parameters take continuous values (between 0 and 1) and that the result f(A,B) also takes values between 0 and 1 instead of true or false.
for example :
when A increases(is near 1) and B increases(is near 1) f(A,B) increases(near 1)
when A increases and B decreases f(A,B) decreases
when A decreases and B increases f(A,B) increases
when A decreases and B decreases f(A,B) increases
How can I turn the implication function into a mathematical function that handles my continuous values?
Found it. The problem can be solved using fuzzy logic, a form of many-valued logic in which the truth values of variables may be any real number between 0 and 1. By contrast, in Boolean logic, the truth values of variables may only be 0 or 1. Fuzzy logic has been extended to handle the concept of partial truth, where the truth value may range between completely true and completely false. So my Boolean implication function would be transformed into a fuzzy implication function; see here for more details.
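As a concrete illustration, here is a minimal Python sketch of three fuzzy implication operators that commonly appear in the fuzzy-logic literature (Kleene-Dienes, Lukasiewicz and Reichenbach). The function names are my own, and which operator best fits the use case in the question is an assumption that would need checking. All three agree with Boolean implication when A and B are exactly 0 or 1.

```python
def kleene_dienes(a: float, b: float) -> float:
    """Kleene-Dienes implication: max(1 - a, b)."""
    return max(1.0 - a, b)


def lukasiewicz(a: float, b: float) -> float:
    """Lukasiewicz implication: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def reichenbach(a: float, b: float) -> float:
    """Reichenbach implication: 1 - a + a * b."""
    return 1.0 - a + a * b


if __name__ == "__main__":
    # Print the corner cases, which should reproduce the Boolean truth table.
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, kleene_dienes(a, b), lukasiewicz(a, b), reichenbach(a, b))
```

Each operator gives a low value only when A is high and B is low, which matches the behaviour listed in the question.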
Say I am tossing a fair coin where 'tails' is assigned the value x = -1/2 and 'heads' is assigned x = 1/2.
I do this N times and I want to obtain the sum. This is what I have tried:
p = 0.5
N = 1e4
X(N,p)=(rand(N).<p)
I know this is incomplete but when I check (rand(N).<p) I see an array consisting of true, false. I interpret this as 'Tails' or 'Heads'. However, I don't know how to assign the values 1/2 and -1/2 to each of these elements in order for me to find the sum. If I simply use sum((rand(N).<p)) I do get an integer value, but I don't think this is the right way to do it because I haven't specified the values 1/2 and -1/2 anywhere.
Any help is greatly appreciated.
As indicated by the comments already, you want to do
sum(rand([-0.5, 0.5], N))
where N must be an integer (you wrote N=1e4, therefore typeof(N) == Float64 and rand won't work).
The documentation of rand (obtained by ?rand) describes what rand(S, N) does:
Pick a random element or array of random elements from the set of
values specified by S
Here, S can be an optional indexable collection, an array of values in your case (or a type like Int). So, above S = [-0.5, 0.5] and rand draws N random elements from this collection, which we can afterwards sum up.
Assigning specific values to a boolean array
Since this is the title of your question, and the answer above doesn't actually address this, let me comment on this as well.
You could do sum((rand(N).<p)-0.5), i.e. you shift all the ones to 0.5 and all the zeros to -0.5 to get the wanted result. Note that this is a general strategy: Let's say you want true to be a and false to be b, where a and b are numbers. You achieve this by (rand(N).<p)*(a-b) + b.
However, beyond being more "complicated", sum((rand(N).<p)-0.5) will allocate temporary arrays, first one of booleans, then one of numbers, the latter of which will eventually go into sum. Because of these unnecessary allocations this approach will be slower than the solution above.
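For comparison, the same "true becomes a, false becomes b" mapping can be sketched in Python rather than Julia. This is only an illustration of the formula flag*(a-b) + b from above; the helper names map_bool and coin_sum are hypothetical and do not come from the answer.

```python
import random


def map_bool(flag: bool, a: float, b: float) -> float:
    """Return a when flag is True and b when flag is False."""
    # True counts as 1 and False as 0, so this is flag * (a - b) + b.
    return b + (a - b) * flag


def coin_sum(n: int, p: float = 0.5) -> float:
    """Sum n tosses where an outcome below p counts 0.5 and the rest -0.5."""
    return sum(map_bool(random.random() < p, 0.5, -0.5) for _ in range(n))


if __name__ == "__main__":
    print(coin_sum(10_000))
```

Because a generator expression is used, no intermediate array of booleans is stored, which is the Python analogue of avoiding the temporary allocations mentioned above.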
I can create a recursive formula from recurrences where it only passes down one argument (something like $T(n/2)$). However, for a case like this where the value of $u$ and $v$ are different, how do I put them together? This is the problem:
The call to recursive function RecursiveFunction(n, n) for some n > 2
RecursiveFunction(a, b)
    if a >= 2 and b >= 2
        u=a/2
        v=b-1
        RecursiveFunction(u, v)
The end goal is to find the tight asymptotic bounds for the worst-case running time, but I just need a formula to start first.
There are in fact two different answers to this, depending on the relative sizes of a and b.
The function can be written as follows:
Where C is some constant work done per call (if statement, pushing u, v onto the call stack etc.). Since the two variables evolve independently, we can analyse their evolution separately.
a - consider the following function:
Expanding the iterative case by m times:
The stopping condition a < 2 is such that:
b - as before:
The complexity of T(a, b) thus depends on which variable reaches its stopping condition first, i.e. the smallest between m and n:
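The formulas this answer refers to are not shown here. One reading that is consistent with the pseudocode in the question is the recurrence T(a, b) = T(a/2, b-1) + C when a >= 2 and b >= 2, and T(a, b) = C otherwise. Under that assumption, and additionally assuming integer halving, the following Python sketch counts the calls directly and compares the count with a closed form. The names count_calls and closed_form are illustrative only.

```python
def count_calls(a: int, b: int) -> int:
    """Count how many times RecursiveFunction would be entered."""
    calls = 1  # the initial call
    while a >= 2 and b >= 2:
        a, b = a // 2, b - 1  # u = a/2 (integer halving assumed), v = b - 1
        calls += 1
    return calls


def closed_form(a: int, b: int) -> int:
    """Closed form: a survives floor(log2(a)) halvings, b survives b - 1 decrements."""
    if a < 2 or b < 2:
        return 1
    return min(a.bit_length() - 1, b - 1) + 1


if __name__ == "__main__":
    for n in (2, 16, 1000):
        print(n, count_calls(n, n), closed_form(n, n))
```

For the call RecursiveFunction(n, n) the logarithmic term is the smaller one, so under these assumptions the number of calls grows like log n.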
I tried all.equal(1,2) and it returned a mean difference of 1.
Why does all.equal(1,2,3) show TRUE?
Is it because if it is supplied more than two variables it runs is.numeric() on its different elements to check their type instead of numerical differences?
Because 1 is equal to 2 within a tolerance of 3. From the docs:
target: R object.
current: other R object, to be compared with ‘target’.
...: Further arguments for different methods, notably the
following two, for numerical comparison:
tolerance: numeric >= 0. Differences smaller than ‘tolerance’ are not
reported. The default value is close to ‘1.5e-8’.
Your 3 is the third argument, and so becomes the tolerance.
Per DICOM specification, a UID is defined by: 9.1 UID Encoding Rules. In other words the following are valid DICOM UIDs:
"1.2.3.4.5"
"1.3.6.1.4.35045.103501438824148998807202626810206788999"
"1.2.826.0.1.3680043.2.1143.5028470438645158236649541857909059554"
while the following are illegal DICOM UIDs:
".1.2.3.4.5"
"1..2.3.4.5"
"1.2.3.4.5."
"1.2.3.4.05"
"12345"
"1.2.826.0.1.3680043.2.1143.50284704386451582366495418579090595540"
Therefore I know that the string is at most 64 bytes and should match the regex [0-9\.]+. However, this regex really describes a superset, since there are far fewer than (10+1)^64 (=4457915684525902395869512133369841539490161434991526715513934826241L) possibilities.
How would one compute precisely the number of possibilities that respect the DICOM UID rules?
Reading the org root / suffix rule clearly indicates that I need at least one dot ('.'). In which case the combination is at least 3 bytes (char) in the form: [0-9].[0-9]. In which case there are 10x10=100 possibilities for UID of length 3.
Looking at the first answer, there seems to be something unclear about:
The first digit of each component shall not be zero unless the
component is a single digit.
What this means is that:
"0.0" is valid
"00.0" or "1.01" are not valid
Thus I would say a proper expression would be:
(([1-9][0-9]*)|0)(\.([1-9][0-9]*|0))+
Using a simple C code, I could find:
f(0) = 0
f(1) = 0
f(2) = 0
f(3) = 100
f(4) = 1800
f(5) = 27100
f(6) = 369000
f(7) = 4753000
f(8) = 59049000
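The small values above can be cross-checked by brute force. The following Python sketch (not the C code mentioned in the question) enumerates every string of a given length over the alphabet of digits and dot and counts those matching the regular expression given above. It does not enforce the 64-byte limit, and it is only practical for short lengths because the number of candidates is 11^n.

```python
import itertools
import re

# The expression from the question, applied to the whole string.
UID_PATTERN = re.compile(r"(([1-9][0-9]*)|0)(\.([1-9][0-9]*|0))+")
ALPHABET = "0123456789."


def count_exact(length: int) -> int:
    """Count strings of exactly this length that match UID_PATTERN."""
    return sum(
        1
        for chars in itertools.product(ALPHABET, repeat=length)
        if UID_PATTERN.fullmatch("".join(chars))
    )


if __name__ == "__main__":
    for n in range(5):
        print(f"f({n}) = {count_exact(n)}")
```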
The validation of the Root UID part is outside the scope of this question. A second validation step could take care of rejecting some OID that cannot possibly be registered (some people mention restriction on first and second arc for example). For simplicity we'll accept all possible (valid) Root UID.
While my other answer takes good care of this specific application, here is a more generic approach. It takes care of situations where you have a different regular expression describing the language in question. It also allows for considerably longer string lengths, since it only requires O(log n) arithmetic operations to compute the number of combinations for strings of length up to n. In this case the number of strings grows so quickly that the cost of these arithmetic operations will grow dramatically, but that may not be the case for other, otherwise similar situations.
Build a finite state automaton
Start with a regular expression description of your language in question. Translate that regular expression into a finite state automaton. In your case the regular expression can be given as
(([1-9][0-9]*)|0)(\.([1-9][0-9]*|0))+
The automaton could look like this:
Eliminate ε-transitions
This automaton usually contains ε-transitions (i.e. state transitions which do not correspond to any input character). Remove those, so that one transition corresponds to one character of input. Then add an ε-transition to the accepting state(s). If the accepting states have other outgoing transitions, don't add ε-loops to them, but instead add an ε-transition to an accepting state with no outgoing edges and then add the loop to that. This can be seen as padding the input with ε at its end, without allowing ε in the middle. Taken together, this transformation ensures that performing exactly n state transitions corresponds to processing an input of n characters or less. The modified automaton might look like this:
Note that both the construction of the first automaton from the regular expression and the elimination of ε-transitions can be performed automatically (and perhaps even in a single step). The resulting automata might be more complicated than what I constructed here manually, but the principle is the same.
Ensuring unique paths
You don't have to make the automaton deterministic in the sense that for every combination of source state and input character there is only one target state. That's not the case in my manually constructed one either. But you have to make sure that every complete input has only one possible path to the accepting state, since you'll essentially be counting paths. Making the automaton deterministic would ensure this weaker property, too, so go for that unless you can ensure unique paths without this. In my example the length of each component clearly dictates which path to use, so I didn't make it deterministic. But I've included an example with a deterministic approach at the end of this post.
Build transition matrix
Next, write down the transition matrix. Associate the rows and columns with your states (in order a, b, c, d, e, f in my example). For each arrow in your automaton, write the number of characters included in the label of that arrow in the column associated with the source state and the row associated with the target state of that arrow.
⎛ 0 0 0 0 0 0⎞
⎜ 9 10 0 0 0 0⎟
⎜10 10 0 10 10 0⎟
⎜ 0 0 1 0 0 0⎟
⎜ 0 0 0 9 10 0⎟
⎝ 0 0 0 10 10 1⎠
Read result off that matrix
Now multiplying this matrix by a column vector once has the following meaning: if the number of possible ways to arrive in a given state is encoded in the input vector, the output vector gives you the number of ways one transition later. Take the 64th power of that matrix, concentrate on the first column (since the start situation is encoded as (1,0,0,0,0,0), meaning only one way to end up in the start state) and sum up all the entries that correspond to accepting states (only the last one in this case). The bottom left element of the 64th power of this matrix is
1474472506836676237371358967075549167865631190000000000000000000000
which confirms my other answer.
Compute matrix powers efficiently
In order to actually compute the 64th power of that matrix, the easiest approach would be repeated squaring: after squaring the matrix 6 times you have an exponent of 2^6 = 64. If in some other scenario your exponent (i.e. maximal string length) is not a power of two, you can still perform exponentiation by squaring by multiplying the relevant squares according to the bit pattern of the exponent. This is what makes this approach take O(log n) arithmetic operations to compute the result for string length n, assuming a fixed number of states and therefore fixed cost for each matrix squaring.
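A minimal Python sketch of this computation, using the 6x6 transition matrix given above (states ordered a to f) and Python's arbitrary-precision integers. The helper names mat_mul, mat_pow and count_up_to are my own. The bottom left entry of the n-th power is the number of accepted strings of length n or less.

```python
# Transition matrix from above: columns are source states, rows are targets.
M = [
    [0, 0, 0, 0, 0, 0],
    [9, 10, 0, 0, 0, 0],
    [10, 10, 0, 10, 10, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 9, 10, 0],
    [0, 0, 0, 10, 10, 1],
]


def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_pow(m, exponent):
    """Exponentiation by squaring, following the bit pattern of the exponent."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while exponent:
        if exponent & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        exponent >>= 1
    return result


def count_up_to(n):
    """Accepted strings of length <= n: bottom left entry of M to the n-th power."""
    return mat_pow(M, n)[-1][0]


if __name__ == "__main__":
    print(count_up_to(64))
```

For small n the result can be compared against the per-length counts from the question: count_up_to(4) is 100 + 1800.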
Example with deterministic automaton
If you were to make my automaton deterministic using the usual powerset construction, you'd end up with
and sorting the states as a, bc, c, d, cf, cef, f one would get the transition matrix
⎛ 0 0 0 0 0 0 0⎞
⎜ 9 10 0 0 0 0 0⎟
⎜ 1 0 0 0 0 0 0⎟
⎜ 0 1 1 0 1 1 0⎟
⎜ 0 0 0 1 0 0 0⎟
⎜ 0 0 0 9 0 10 0⎟
⎝ 0 0 0 0 1 1 1⎠
and could sum the last three elements of the first column of its 64th power to obtain the same result as above.
Single component
Start by looking for ways to form a single component. The corresponding regular expression for a single component is
0|[1-9][0-9]*
so it is either zero or a non-zero digit followed by arbitrarily many digits. (I had missed the possible sole zero case at first, but the comment by malat made me aware of this.) If the total length of such a component is to be n, and you write h(n) to denote the number of ways to form such a component of length exactly n, then you can compute h(n) as
h(n) = if n = 1 then 10 else 9 * 10^(n - 1)
where the n = 1 case allows for all possible digits, and the other cases ensure a non-zero first digit.
One or more components
Subsection 9.1 only writes that a UID is a bunch of dot-separated number components, as outlined above. So in regular expressions that would be
(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*
Suppose f(n) is the number of ways to write a UID of length n. Then you have
f(n) = h(n) + sum h(i) * f(n-i-1) for i from 1 to n-2
The first term describes the case of a single component, while the sum takes care of the case where it consists of more than one component. In that case you have a first component of length i, then a dot which accounts for the -1 in the formula, and then the remaining digits form one or more components which is expressed via the recursive use of f.
Two or more components
As the comment by cneller indicates, the part of section 9 before subsection 9.1 indicates that there has to be at least two components. So the proper regular expression would be more like
(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))+
with a + at the end indicating that we want at least one repetition of the parenthesized expression. Deriving an expression for this simply means leaving out the one-component-only case in the definition of f:
g(n) = sum h(i) * f(n-i-1) for i from 1 to n-2
If you sum all the g(n) for n from 3 (the minimal possible UID length) through 64 you get the number of possible UIDs as
1474472506836676237371358967075549167865631190000000000000000000000
or approximately 1.5e66. In terms of absolute difference this is considerably less than the 4.5e66 you get from your computation, although it's definitely on the same order of magnitude. By the way, your estimate doesn't explicitly mention UIDs shorter than 64, but you can always consider padding them with dots in your setup. I did the computation using a few lines of Python code:
f = [0]
g = [0]
h = [0, 10] + [9 * (10**(n-1)) for n in range(2, 65)]
s = 0
for n in range(1, 65):
    x = 0
    if n >= 3:
        for i in range(1, n - 1):
            x += h[i] * f[n-i-1]
    g.append(x)
    f.append(x + h[n])
    s += x
print(h)
print(f)
print(g)
print(s)
Is it possible to express ANY random set of numbers by a function?
Question clarification:
for example:
if desired result set = {1,2,3,4,5}
so I don't mean something like this:
function getSet(){
    return [1,2,3,4,5];
}
but more like this:
function genSet(){
    const result = [];
    for(let i=1;i<=5;i++){
        result.push(i);
    }
    return result;
}
So in other words, can there be a logic to calculate any desired set?
There is a lot of mathematics behind this question. There are some interesting results.
Any finite set of (real) numbers can be defined by a polynomial function f(x) = a + b x + c x^2 + ..., so that a number is in the set exactly when f(x) = 0. Technically this is an algebraic curve in 1D. While this might seem an optimistic result, there is no limit on how complex the polynomial could be, and polynomials of degree 5 and above have no general explicit solution formula.
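For a finite set, such a polynomial can be written down directly as the product of (x - r) over all members r. The following Python sketch builds its coefficients and tests membership by evaluating it. The function names are illustrative, and integer members are assumed so that the arithmetic stays exact (with floating-point members the test for zero would be affected by rounding).

```python
def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the product of (x - r) for r in roots."""
    coeffs = [1]
    for r in roots:
        shifted = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            shifted[i + 1] += c   # contribution of c * x
            shifted[i] -= r * c   # contribution of c * (-r)
        coeffs = shifted
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def is_member(coeffs, x):
    """A number belongs to the set exactly when the polynomial vanishes there."""
    return evaluate(coeffs, x) == 0


if __name__ == "__main__":
    coeffs = poly_from_roots([1, 2, 3, 4, 5])
    print(coeffs)
    print([x for x in range(10) if is_member(coeffs, x)])
```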
There is a whole field of study on computable numbers, real numbers which can be computed to within any desired precision by a finite, terminating algorithm, and their converse: non-computable numbers, which can't. The bad news is there are a lot more non-computable numbers than computable ones.
The above is based on real numbers, which are decidedly trickier than the integers, or even a finite set of integers, which is all we can represent with int or long datatypes. There is a big field of study on this; see Computability theory (computer science). I think Turing's halting problem comes into play; it asks whether you can determine if an algorithm will terminate. Unfortunately this can't be determined in general, and a consequence is "Not every set of natural numbers is computable." The proof of this does require the infinite size of the naturals, so I'm not sure about finite sets.
Representations
There are two common representations used for sets when programming. Suppose the set S is a subset of some universe of items U.
Membership Predicate
One way to represent the set S is a function member from U to { true, false }. For all x in U:
member(x) = true if x is in S
member(x) = false if x is not in S
Pseudocode
bool member(int n)
    return 1 <= n <= 5
Enumeration
Another way to represent S is to store all of its members in a data structure, such as a list, hash table, or binary tree.
Pseudocode
enumerable<int> S()
    for int i = 1 to 5
        yield return i
Operations
With either of these representations, most set operations can be defined. For example, the union of two sets would look as follows with each of the two representations.
Membership Predicate
func<int, bool> union(func<int, bool> s, func<int, bool> t)
    return x => s(x) || t(x)
Enumeration
enumerable<int> union(enumerable<int> s, enumerable<int> t)
    hashset<int> r
    foreach x in s
        r.add(x)
    foreach x in t
        if x not in r
            r.add(x)
    return r
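The two union operations above can be rendered in Python roughly as follows. This is a sketch of the pseudocode, not a definitive implementation; the names union_pred and union_enum are my own, and the enumeration version yields members lazily instead of returning a hash set.

```python
import itertools
from typing import Callable, Iterable, Iterator

Predicate = Callable[[int], bool]


def union_pred(s: Predicate, t: Predicate) -> Predicate:
    """Union of two sets given as membership predicates."""
    return lambda x: s(x) or t(x)


def union_enum(s: Iterable[int], t: Iterable[int]) -> Iterator[int]:
    """Union of two sets given as enumerations, yielding each member once."""
    seen = set()
    for x in itertools.chain(s, t):
        if x not in seen:
            seen.add(x)
            yield x


if __name__ == "__main__":
    in_either = union_pred(lambda n: 1 <= n <= 5, lambda n: n == 7)
    print(in_either(7), in_either(6))
    print(list(union_enum(range(1, 6), [4, 5, 6, 7])))
```

The predicate version works even when the underlying sets are infinite, whereas the enumeration version has to visit every member of both inputs.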
Comparison
The membership predicate representation can be extremely versatile because all kinds of set operations from mathematics can be very easily expressed (complement, Cartesian product, etc.). The drawback is that there is no general way to enumerate all the members of a set represented in this way. The set of all positive real numbers, for example, cannot even be enumerated.
The enumeration representation typically involves much more expensive set operations, and some operations (such as the complement of the integer set {1, 2, 3, 4, 5}) cannot even be represented. It should be chosen if you need to be able to enumerate the members of a set, not just test membership.