I have, for example, this vector arriving as a signal from another block at
each sample time, say every second. The vector is actually random, but this
is just an example:
U = [1 1 0 0 1 0 0 0 0 1 0]
I want to feed this signal into a block that counts the occurrences of a change
from 1 to 0. The initial value is assumed to be zero.
Therefore, in the above example, when the first two entries (which are ones)
enter this block, the block will give zero output.
But, when the third entry (which is zero and its previous value is 1) enters the
block, it will give me one and when the sixth entry (which is zero and its
previous value is 1) enters the block, it will give me two and when the last
entry (which is zero and its previous value is 1) enters the block, it will give
me three. For all other cases, the block will give zero.
So, the block will count the cases where the input is zero and its previous
input is one.
The output of the block keeps changing over time and, in turn, feeds
another block.
I don’t want the implementation or details. I already know all of that.
I just want to know what is the name of the block that does such counting.
I tried using counter and memory blocks but, unfortunately, I was not able to get
the intended results.
Regards
No idea if you still require an answer, but the following should do it (I don't think it can be done in one standard block).
This assumes that your input is a signal that changes over time (and not a constant vector).
Version 1 would, for your input of [1 1 0 0 1 0 0 0 0 1 0], provide an output of [0 0 1 1 1 2 2 2 2 2 3].
Since you wrote "For all other cases, the block will give zero",
I also included a Version 2 which will, for your input, output [0 0 1 0 0 2 0 0 0 0 3].
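For readers who want to check the two output sequences outside the block diagram, here is a minimal Python sketch of the same counting logic. The function name `count_falling_edges` and its `initial` parameter are made up for illustration; this is not a Simulink block.

```python
def count_falling_edges(samples, initial=0):
    """Return (version1, version2) outputs for a stream of 0/1 samples."""
    previous = initial          # the initial value is assumed to be zero
    count = 0
    running, pulsed = [], []
    for u in samples:
        fell = previous == 1 and u == 0   # a 1 -> 0 transition
        if fell:
            count += 1
        running.append(count)                  # Version 1: hold the count
        pulsed.append(count if fell else 0)    # Version 2: zero otherwise
        previous = u
    return running, pulsed

U = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
version1, version2 = count_falling_edges(U)
print(version1)
print(version2)
```

The only state carried between samples is the previous input and the running count, which corresponds to a unit delay plus a counter in block form.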
I'd like to split a sequence into k parts, and optimize the homogeneity of these sub-parts.
Example : 0 0 0 0 0 1 1 2 3 3 3 2 2 3 2 1 0 0 0
Result : 0 0 0 0 0 | 1 1 2 | 3 3 3 2 2 3 2 | 1 0 0 0 when you ask for 4 parts (k = 4)
Here, the algorithm did not try to split in fixed-length parts, but instead tried to make sure elements in the same parts are as homogeneous as possible.
What algorithm should I use? Is there an implementation of it in R?
Maybe you can use the expectation-maximization (E-M) algorithm. Your points would be (value, position). In your example, this would be something like:
With the E-M algorithm, the result would be something like (by hand):
This is the desired output, so you can consider using this and check whether it really works in all your scenarios. One note: you must choose the number of clusters in advance, but I think that's not a problem for you, given how you have set out your question.
Let me know if this worked ;)
Edit:
See this picture; it shows what you talked about. With k-means you have to control the delta value, that is, how much the position increments, so that it is on the same scale as the value. But with E-M this doesn't matter.
Edit 2:
OK, I was not correct: you do need to control the delta value. It is not the same if you increment the position by 1 or by 3: (two clusters)
Thus, as you said, this algorithm could decide to cluster points that are not neighbours if their positions are far apart but their values are close. You need to guarantee that this does not happen by using a large delta. I think that with an increment of 2 * (max - min) of the values of your sequence this wouldn't happen.
Now, your points would have the form (value, delta * position).
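As a rough illustration of that last step, the points could be built like this in Python. The helper name `scaled_points` is hypothetical, and numbering positions from 0 is an assumption; the clustering itself (E-M or k-means) is not shown.

```python
def scaled_points(sequence):
    """Build (value, delta * position) points with delta = 2 * (max - min)."""
    delta = 2 * (max(sequence) - min(sequence))
    points = [(value, delta * position)
              for position, value in enumerate(sequence)]  # positions start at 0
    return delta, points

sequence = [0, 0, 0, 0, 0, 1, 1, 2, 3, 3, 3, 2, 2, 3, 2, 1, 0, 0, 0]
delta, points = scaled_points(sequence)
print(delta)
print(points[:6])
```

These points would then be passed to whatever clustering routine you choose, with the number of clusters set to k.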
Per DICOM specification, a UID is defined by: 9.1 UID Encoding Rules. In other words the following are valid DICOM UIDs:
"1.2.3.4.5"
"1.3.6.1.4.35045.103501438824148998807202626810206788999"
"1.2.826.0.1.3680043.2.1143.5028470438645158236649541857909059554"
while the following are illegal DICOM UIDs:
".1.2.3.4.5"
"1..2.3.4.5"
"1.2.3.4.5."
"1.2.3.4.05"
"12345"
"1.2.826.0.1.3680043.2.1143.50284704386451582366495418579090595540"
Therefore I know that the string is at most 64 bytes and should match the regex [0-9\.]+. However, this regex describes a superset, since there are far fewer than (10+1)^64 (= 4457915684525902395869512133369841539490161434991526715513934826241) possibilities.
How would one compute precisely the number of possibilities that respect the DICOM UID rules?
The org root / suffix rule clearly indicates that I need at least one dot ('.'), so a UID is at least 3 bytes (chars) long, in the form [0-9]\.[0-9]. That gives 10x10=100 possibilities for UIDs of length 3.
Looking at the first answer, there seems to be something unclear about:
The first digit of each component shall not be zero unless the
component is a single digit.
What this means is that:
"0.0" is valid
"00.0" or "1.01" are not valid
Thus I would say a proper expression would be:
(([1-9][0-9]*)|0)(\.([1-9][0-9]*|0))+
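To sanity-check that expression against the examples listed earlier, here is a small Python sketch. The names `UID_PATTERN`, `MAX_UID_LENGTH` and `is_valid_uid` are mine, and the check covers only the syntax rules discussed here, not whether a root is actually registered.

```python
import re

# Same expression as above, written with the alternatives in a uniform order.
UID_PATTERN = re.compile(r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))+")
MAX_UID_LENGTH = 64

def is_valid_uid(uid):
    """True if uid has at most 64 characters and matches the pattern entirely."""
    return len(uid) <= MAX_UID_LENGTH and UID_PATTERN.fullmatch(uid) is not None

valid = [
    "1.2.3.4.5",
    "1.3.6.1.4.35045.103501438824148998807202626810206788999",
    "1.2.826.0.1.3680043.2.1143.5028470438645158236649541857909059554",
    "0.0",
]
invalid = [
    ".1.2.3.4.5", "1..2.3.4.5", "1.2.3.4.5.", "1.2.3.4.05", "12345",
    "1.2.826.0.1.3680043.2.1143.50284704386451582366495418579090595540",
    "00.0", "1.01",
]
print(all(is_valid_uid(u) for u in valid))
print(any(is_valid_uid(u) for u in invalid))
```

The length limit is checked separately because a plain regex over the whole string cannot easily express "at most 64 characters in total".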
Using some simple C code, I could find:
f(0) = 0
f(1) = 0
f(2) = 0
f(3) = 100
f(4) = 1800
f(5) = 27100
f(6) = 369000
f(7) = 4753000
f(8) = 59049000
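The C code is not shown, but the first few values are easy to reproduce by brute force. The following Python sketch (my own, not the original program) enumerates every string over the 11-character alphabet and counts the ones matching the expression; it is only practical for very short lengths.

```python
import re
from itertools import product

UID_PATTERN = re.compile(r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))+")
ALPHABET = "0123456789."

def brute_force_count(length):
    """Count strings of exactly `length` characters that are valid UIDs."""
    return sum(
        1
        for chars in product(ALPHABET, repeat=length)
        if UID_PATTERN.fullmatch("".join(chars))
    )

counts = {n: brute_force_count(n) for n in range(1, 6)}
print(counts)
```

Since the number of candidates grows as 11^n, this is only a cross-check for small n, not a way to reach length 64.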
The validation of the Root UID part is outside the scope of this question. A second validation step could take care of rejecting some OID that cannot possibly be registered (some people mention restriction on first and second arc for example). For simplicity we'll accept all possible (valid) Root UID.
While my other answer takes good care of this specific application, here is a more generic approach. It takes care of situations where you have a different regular expression describing the language in question. It also allows for considerably longer string lengths, since it only requires O(log n) arithmetic operations to compute the number of combinations for strings of length up to n. In this case the number of strings grows so quickly that the cost of these arithmetic operations will grow dramatically, but that may not be the case for other, otherwise similar situations.
Build a finite state automaton
Start with a regular expression description of your language in question. Translate that regular expression into a finite state automaton. In your case the regular expression can be given as
(([1-9][0-9]*)|0)(\.([1-9][0-9]*|0))+
The automaton could look like this:
Eliminate ε-transitions
This automaton usually contains ε-transitions (i.e. state transitions which do not correspond to any input character). Remove those, so that one transition corresponds to one character of input. Then add an ε-transition to the accepting state(s). If the accepting states have other outgoing transitions, don't add ε-loops to them, but instead add an ε-transition to an accepting state with no outgoing edges and then add the loop to that. This can be seen as padding the input with ε at its end, without allowing ε in the middle. Taken together, this transformation ensures that performing exactly n state transitions corresponds to processing an input of n characters or less. The modified automaton might look like this:
Note that both the construction of the first automaton from the regular expression and the elimination of ε-transitions can be performed automatically (and perhaps even in a single step). The resulting automata might be more complicated than what I constructed here manually, but the principle is the same.
Ensuring unique paths
You don't have to make the automaton deterministic in the sense that for every combination of source state and input character there is only one target state. That's not the case in my manually constructed one either. But you have to make sure that every complete input has only one possible path to the accepting state, since you'll essentially be counting paths. Making the automaton deterministic would ensure this weaker property, too, so go for that unless you can ensure unique paths without this. In my example the length of each component clearly dictates which path to use, so I didn't make it deterministic. But I've included an example with a deterministic approach at the end of this post.
Build transition matrix
Next, write down the transition matrix. Associate the rows and columns with your states (in order a, b, c, d, e, f in my example). For each arrow in your automaton, write the number of characters included in the label of that arrow in the column associated with the source state and the row associated with the target state of that arrow.
⎛ 0 0 0 0 0 0⎞
⎜ 9 10 0 0 0 0⎟
⎜10 10 0 10 10 0⎟
⎜ 0 0 1 0 0 0⎟
⎜ 0 0 0 9 10 0⎟
⎝ 0 0 0 10 10 1⎠
Read result off that matrix
Now multiplying this matrix with a column vector once has the following meaning: if the number of possible ways to arrive in a given state is encoded in the input vector, the output vector gives you the number of ways one transition later. Take the 64th power of that matrix, concentrate on the first column (since the start situation is encoded as (1,0,0,0,0,0), meaning only one way to end up in the start state) and sum up all the entries that correspond to accepting states (only the last one in this case). The bottom left element of the 64th power of this matrix is
1474472506836676237371358967075549167865631190000000000000000000000
which confirms my other answer.
Compute matrix powers efficiently
In order to actually compute the 64th power of that matrix, the easiest approach would be repeated squaring: after squaring the matrix 6 times you have an exponent of 2^6 = 64. If in some other scenario your exponent (i.e. maximal string length) is not a power of two, you can still perform exponentiation by squaring by multiplying the relevant squares according to the bit pattern of the exponent. This is what makes this approach take O(log n) arithmetic operations to compute the result for string length n, assuming a fixed number of states and therefore fixed cost for each matrix squaring.
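As a sketch of how this could look in code, here is a plain-Python version using exact integers. The helper names are mine; the matrix is the six-state one written down above (states a to f), and `count_by_recurrence` is the recurrence from my other answer, included only as a cross-check.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_pow(matrix, exponent):
    """Exponentiation by squaring, following the bit pattern of the exponent."""
    size = len(matrix)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = matrix
    while exponent:
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result

# Rows are target states, columns are source states, in order a, b, c, d, e, f.
M = [
    [0, 0, 0, 0, 0, 0],
    [9, 10, 0, 0, 0, 0],
    [10, 10, 0, 10, 10, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 9, 10, 0],
    [0, 0, 0, 10, 10, 1],
]

def count_by_matrix(max_length):
    """Bottom left entry: paths from start state a to accepting state f."""
    return mat_pow(M, max_length)[5][0]

def count_by_recurrence(max_length):
    """Sum of g(n) for n up to max_length, using h and f as defined elsewhere."""
    h = [0, 10] + [9 * 10 ** (n - 1) for n in range(2, max_length + 1)]
    f = [0]
    total = 0
    for n in range(1, max_length + 1):
        g = sum(h[i] * f[n - i - 1] for i in range(1, n - 1))
        f.append(g + h[n])
        total += g
    return total

print(count_by_matrix(64))
print(count_by_matrix(64) == count_by_recurrence(64))
```

Python integers have arbitrary precision, so no big-number library is needed; the printed value can be compared with the figure quoted above.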
Example with deterministic automaton
If you were to make my automaton deterministic using the usual powerset construction, you'd end up with
and sorting the states as a, bc, c, d, cf, cef, f one would get the transition matrix
⎛ 0 0 0 0 0 0 0⎞
⎜ 9 10 0 0 0 0 0⎟
⎜ 1 0 0 0 0 0 0⎟
⎜ 0 1 1 0 1 1 0⎟
⎜ 0 0 0 1 0 0 0⎟
⎜ 0 0 0 9 0 10 0⎟
⎝ 0 0 0 0 1 1 1⎠
and could sum the last three elements of the first column of its 64th power to obtain the same result as above.
Single component
Start by looking for ways to form a single component. The corresponding regular expression for a single component is
0|[1-9][0-9]*
so it is either zero or a non-zero digit followed by arbitrarily many digits. (I had missed the possible sole zero case at first, but the comment by malat made me aware of this.) If the total length of such a component is to be n, and you write h(n) to denote the number of ways to form such a component of length exactly n, then you can compute that h(n) as
h(n) = if n = 1 then 10 else 9 * 10^(n - 1)
where the n = 1 case allows for all possible digits, and the other cases ensure a non-zero first digit.
One or more components
Subsection 9.1 only writes that a UID is a bunch of dot-separated number components, as outlined above. So in regular expressions that would be
(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*
Suppose f(n) is the number of ways to write a UID of length n. Then you have
f(n) = h(n) + sum h(i) * f(n-i-1) for i from 1 to n-2
The first term describes the case of a single component, while the sum takes care of the case where it consists of more than one component. In that case you have a first component of length i, then a dot which accounts for the -1 in the formula, and then the remaining digits form one or more components which is expressed via the recursive use of f.
Two or more components
As the comment by cneller indicates, the part of section 9 before subsection 9.1 indicates that there has to be at least two components. So the proper regular expression would be more like
(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))+
with a + at the end indicating that we want at least one repetition of the parenthesized expression. Deriving an expression for this simply means leaving out the one-component-only case in the definition of f:
g(n) = sum h(i) * f(n-i-1) for i from 1 to n-2
If you sum all the g(n) for n from 3 (the minimal possible UID length) through 64 you get the number of possible UIDs as
1474472506836676237371358967075549167865631190000000000000000000000
or approximately 1.5e66. That is considerably less than the 4.5e66 from your computation in terms of absolute difference, although it is on the same order of magnitude. By the way, your estimate doesn't explicitly mention UIDs shorter than 64, but you can always consider padding them with dots in your setup. I did the computation using a few lines of Python code:
f = [0]
g = [0]
h = [0, 10] + [9 * (10**(n-1)) for n in range(2, 65)]
s = 0
for n in range(1, 65):
    x = 0
    if n >= 3:
        for i in range(1, n - 1):
            x += h[i] * f[n-i-1]
    g.append(x)
    f.append(x + h[n])
    s += x
print(h)
print(f)
print(g)
print(s)
I have vectors in such form
(1 1 1 0 1 0)
(0 0 1 0 0 0)
(1 0 0 0 0 0)
(0 0 0 1 0 0)
(1 1 0 0 1 0)
(0 0 1 1 0 0)
(1 0 1 1 0 0)
I need to find all linear dependent subsets over Z2.
For example 1,2,5 and 3,6,7.
OK, my 5 cents: brute force, IINM, means iterating over all subsets of the set of vectors. So, instead, you start from the bottom, where the singleton set of each (non-zero) vector is obviously linearly independent.
In the next step, for each singleton set, you make a list by trying to add each of the remaining vectors in the set and see which of the resulting pairs are independent, taking note of the pairs you've already tested to prevent repeated effort.
In the 3rd step, for each independent pair, you try to add each of the remaining vectors to make triples, test the new vector for independence, and mark the triples you've already tested.
This should provide a large saving over brute force, with the worst case being a set of all independent vectors.
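For a set this small, plain brute force is still a handy reference to test a smarter search against. The sketch below is my own and is the brute-force baseline, not the bottom-up method described above: it packs each vector into an integer bitmask and reports every non-empty subset whose XOR (sum over Z2) is zero. Any superset of such a subset is linearly dependent as well.

```python
from itertools import combinations

VECTORS = [
    (1, 1, 1, 0, 1, 0),
    (0, 0, 1, 0, 0, 0),
    (1, 0, 0, 0, 0, 0),
    (0, 0, 0, 1, 0, 0),
    (1, 1, 0, 0, 1, 0),
    (0, 0, 1, 1, 0, 0),
    (1, 0, 1, 1, 0, 0),
]

def to_mask(bits):
    """Pack a 0/1 tuple into an integer so addition over Z2 becomes XOR."""
    mask = 0
    for bit in bits:
        mask = (mask << 1) | bit
    return mask

def zero_sum_subsets(vectors):
    """Return 1-based index tuples of all non-empty subsets that XOR to zero."""
    masks = [to_mask(v) for v in vectors]
    found = []
    for size in range(1, len(masks) + 1):
        for subset in combinations(range(len(masks)), size):
            total = 0
            for index in subset:
                total ^= masks[index]
            if total == 0:
                found.append(tuple(i + 1 for i in subset))
    return found

dependencies = zero_sum_subsets(VECTORS)
print(dependencies)
```

Besides 1,2,5 and 3,6,7 from the question, this also reports 2,4,6 and the combinations of those three relations. The cost is 2^n subsets, so it only makes sense for small n.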
I have an n-partite (undirected) graph, given as an adjacency matrix, for instance this one here:
a b c d
a 0 1 1 0
b 0 0 0 1
c 0 0 0 1
d 0 0 0 0
I would like to know if there is a set of matrix operations that I can apply to this matrix, which will result in a matrix that "lists" all paths (of length n, i.e. through all the partitions) in this graph. For the above example, there are paths a->b->d and a->c->d. Hence, I would like to get the following matrix as a result:
a b c d
1 1 0 1
1 0 1 1
The first path contains nodes a,b,d and the second one nodes a,c,d. If necessary, the result matrix may have some all-0 lines, as here:
a b c d
1 1 0 1
0 0 0 0
1 0 1 1
0 0 0 0
Thanks!
P.S. I have looked at algorithms for computing the transitive closure, but these usually only tell if there is a path between two nodes, and not directly which nodes are on that path.
One thing you can do is to compute the nth power of your matrix A. The result will tell you how many paths there are of length n from any one vertex to any other.
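A small Python sketch of that counting step, using the adjacency matrix from the question; the helper names are mine. Strictly speaking a matrix power counts walks, but in this example the edges only go forward from one layer to the next, so no vertex can repeat and the walks are paths.

```python
A = [
    [0, 1, 1, 0],  # a -> b, a -> c
    [0, 0, 0, 1],  # b -> d
    [0, 0, 0, 1],  # c -> d
    [0, 0, 0, 0],  # d
]

def mat_mul(x, y):
    """Multiply two square integer matrices given as lists of rows."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def walk_counts(adj, length):
    """Entry [i][j] is the number of walks with `length` edges from i to j."""
    size = len(adj)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(length):
        result = mat_mul(result, adj)
    return result

counts = walk_counts(A, 2)
print(counts[0][3])  # walks with two edges from a to d
```

This gives the number of routes but not which vertices they pass through.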
Now if you're interested in knowing all of the vertices along the path, I don't think that using purely matrix operations is the way to go. Bearing in mind that you have an n-partite graph, I would set up a data structure as follows: (Bear in mind that space costs will be expensive for all but small values.)
Each column will have one entry for each of the nodes in our graph. The n-th column will contain 1 if this node is reachable on the n-th iteration from our designated start vertex or start set, and zero otherwise. Each column entry will also contain a list of back pointers to the vertices in the (n-1)-th column which led to this vertex in the n-th column. (This is like the Viterbi algorithm, except that we have to maintain a list of back pointers for each entry rather than just one.) The complexity of doing this is O(m^2 * n), where m is the number of vertices in the graph, and n is the length of the desired path.
I'm a little bit confused by your top matrix: with an undirected graph, I would expect the adjacency matrix to be symmetric.
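A simplified sketch of the same layer-by-layer idea in Python: instead of storing back pointers per entry, it carries the partial paths forward directly, which is easier to read but uses more memory. The function names and the choice of vertex a as the start are assumptions for illustration.

```python
ADJ = [
    [0, 1, 1, 0],  # a -> b, a -> c
    [0, 0, 0, 1],  # b -> d
    [0, 0, 0, 1],  # c -> d
    [0, 0, 0, 0],  # d
]

def enumerate_paths(adj, start, length):
    """Return all vertex sequences with `length` edges that begin at `start`."""
    paths = [[start]]
    for _ in range(length):
        # Extend every partial path by each successor of its last vertex.
        paths = [path + [v]
                 for path in paths
                 for v, edge in enumerate(adj[path[-1]]) if edge]
    return paths

def indicator_rows(paths, n_vertices):
    """One 0/1 row per path, marking which vertices lie on it."""
    return [[int(v in path) for v in range(n_vertices)] for path in paths]

paths = enumerate_paths(ADJ, start=0, length=2)
rows = indicator_rows(paths, len(ADJ))
print(rows)
```

For the example graph this reproduces the two rows asked for in the question, one for a, b, d and one for a, c, d.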
No, there is no pure matrix way to generate all paths. Please use purely combinatorial algorithms.
'One thing you can do is to compute the nth power of your matrix A. The result will tell you how many paths there are of length n from any one vertex to any other.'
The power of a matrix counts walks, not paths.
I've never been much for math and I'm hoping that someone can help me out with the following.
I have 5 boxes:
1 2 3 4 5
[ ] [ ] [ ] [ ] [ ]
The boxes can either be white, gray, or black (or think of it as 0, 1, 2)
How many possible states can the box set be in?
What is the pseudocode (or in any language) to generate all the possible outcomes??
ie...
00000
00001
00011
00111
etc, etc...
I really appreciate any help anyone can give me with this.
The answer for the number of combinations is 3x3x3x3x3 (3^5), since each box can have 3 possible colors.
As for generating the outcomes, see if you can figure it out using this matrix with 0, 1, or 2 to represent the color of the box. On a smaller scale (let's assume 3 boxes) it would look like this:
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2
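If you want to check your own attempt against a reference, the listing above can be reproduced with a few lines of Python. This is just one possible sketch; the function name `all_states` is made up.

```python
from itertools import product

def all_states(boxes, colours=3):
    """Every assignment of a colour digit to each box, in counting order."""
    return ["".join(str(d) for d in state)
            for state in product(range(colours), repeat=boxes)]

for state in all_states(3):
    print(" ".join(state))

print(len(all_states(5)))  # number of states for five boxes
```

`itertools.product` varies the last position fastest, which is why the rows come out in the same order as the table.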
This is a classic permutation generation problem. You have 3 possibilities for each position, and 5 positions. The total number of generated strings is 3^5 = 243.
You need recursion if you want a general solution (a simple iterative loop only works for a single instance of the problem).
Here's a quick example:
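using System;

public static class Program
{
    public static void Main(string[] args)
    {
        Generate("", 5);
    }

    // Appends each of the three digits in turn until s has `limit` characters.
    // Must be static so that the static Main can call it.
    private static void Generate(string s, int limit)
    {
        if (s.Length == limit)
            Console.WriteLine(s);
        else
        {
            Generate(s + "0", limit);
            Generate(s + "1", limit);
            Generate(s + "2", limit);
        }
    }
}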
To answer your first question, what would the answer be if the boxes could contain only one of two values? So, what's the answer if the boxes contain one of three values?
To answer your second question, what pseudocode generates all possible outcomes of one box? Now, what pseudocode generates all possible outcomes of two boxes?
I'd recommend solving the problem on paper first. Try to solve it with a smaller number of boxes (maybe three), and list all possibilities. Then, think of how your reasoning went, or how you'd explain what you did to a small child.
This seems like a homework problem. I'll just give you some help as to the solution then.
What you are saying is that each box has three states, which are all independent. One box would have 3 solutions, and two boxes would have 3 * 3 solutions - for each state of the first box the second box would have three states as well. Extend that to 5 boxes.
To generate each solution, you can just cycle through it. It is easy to make nested for loops for each box, and multiplying by powers of 10 can let you show the number at once.
You can generalize the code for multiple boxes in a similar way.
Thank you all for your answers, at least those of you who actually gave me one.
While I can appreciate that the question sounded like it was pulled straight out of Computer Science 101, it wasn't. The irony of the matter is that it was for real life on a real deadline and I didn't have time to hearken back to when I was being taught this stuff and said to myself, "when am I ever going to need this crap"
If I wanted to be patronized and treated like a school boy I would go back to my elementary school and ask my 5th grade teacher if I can go to the bathroom
Thanks again
the number of states is 3^5.
pseudocode is
for value from 0 to 3^5-1
    print base3(value)
where base3 is a function that repeatedly takes modulo 3 to get a digit, then removes that digit (by dividing by 3)
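One way the pseudocode could be spelled out in Python is shown below. The `width` parameter for zero padding is my addition, since the pseudocode leaves padding implicit.

```python
def base3(value, width=5):
    """Render value in base 3, padded with leading zeros to `width` digits."""
    digits = []
    for _ in range(width):
        value, digit = divmod(value, 3)  # take modulo 3, then divide by 3
        digits.append(str(digit))
    return "".join(reversed(digits))     # least significant digit came first

states = [base3(value) for value in range(3 ** 5)]
for state in states:
    print(state)
```

Each printed line is one state of the five boxes, from 00000 up to 22222.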
Hint: imagine that each box is a position in a number and each colour is a different digit. In the real world, how many combinations (including zero) do you get with 2 positions and 10 possible digits? What about 3 positions? What's the relationship between adding an extra position and the number of combinations, given the number of digits you have available?
Unique number of combinations: 3^5=243
Code:
for i = 0 to 3^5-1
{
    n = i
    s = ""
    for j = 1 to 5
    {
        d = n mod 3
        s = toascii(d) . s
        n = n / 3    // integer division
    }
    println s
}
Here's how I first learned to do this: first think about how many choices you are making. You are making five choices, one for each box. So write down five blank lines with multiplication signs:
__ x __ x __ x __ x __ = ?
In each blank, write the number of objects you have to choose from for that box. Since you have 3 numbers to choose from for each box, you write a 3 in every blank:
3 x 3 x 3 x 3 x 3 = 243
This gives you the total number of permutations for those choices.
The number of possibilities is 3 to the power 5
If you loop from 0 to that number minus 1 and express it in base 3 you will have all the possibilities (remember to prepend 0s where necessary)
In Ruby:
number_of_possibilities = 3**5
for i in (0...number_of_possibilities)
  base_3_number = i.to_s(3)
  puts base_3_number.rjust(5, "0") # pad with leading 0s where necessary
end
Can I ask what about this you don't understand or what's tripping you up? I see that everyone here has simply answered the question, but if you've copied their answers, you've learned nothing, and thus completely missed the point of the homework. Assuming your next lesson builds upon this one, you're just going to fall further behind.
If you either worked for me or were in my class I'd simply ask the following...
"How do you think the problem should be solved?" The answer to which might reveal where you're getting hung up. A wise professor of mine at CMU once said "I can't help you understand this until you know what you don't understand" I never did figure out what I didn't understand and I dropped his class, but the lesson stuck with me.
I know it's probably too late, but for these homework questions I really think we should be helping the person learn as opposed to simply providing an answer and doing their homework for them.
Your problem needs nothing more than the rule of product in combinatorics.
You can choose the state of the first box in 3 ways, and the state of the second box in 3 ways, and ... and the state of the 5th box in 3 ways. The number of ways in which you can set the state of all the boxes is the product of all the five (equal) numbers of ways, i.e. 3x3x3x3x3 = 3^5 = 243.
Similar question: how many numbers can you form with 5 digits in the decimal system, counting the leading zeros? That is, how many numbers are there from 00000 to 99999? You can choose the first digit in 10 ways (0...9), and so on and so on, and the answer is 10x10x10x10x10 = 100000, as you already know.
Don't even try to write code to answer this! The reason is that you need some very large numbers (factorials) to calculate it. These create numbers much larger than any base type in the CLR. You can use this opensource library to do the calculation.
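#include <iostream>
using namespace std;

// p = boxes filled so far, n = total boxes, d = chosen digits packed in base 10
void solve(int p=0,int n=5,int d=0)
{
    if (n==p)
    {
        int rev=d;
        int i=0;
        // prints the n digits least significant first, so each state appears reversed
        while (i<n) {
            cout << rev%10;
            rev /= 10;
            i++;
        }
        cout << endl;
        return;
    }
    for(int i=0; i<3 ; i++)
    {
        solve(p+1,n, d*10 + i);
    }
}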