How to split a sequence in k homogeneous parts? - r

I'd like to split a sequence into k parts, and optimize the homogeneity of these sub-parts.
Example : 0 0 0 0 0 1 1 2 3 3 3 2 2 3 2 1 0 0 0
Result : 0 0 0 0 0 | 1 1 2 | 3 3 3 2 2 3 2 | 1 0 0 0 when you ask for 4 parts (k = 4)
Here, the algorithm did not try to split in fixed-length parts, but instead tried to make sure elements in the same parts are as homogeneous as possible.
What algorithm should I use ? Is there an implementation of it in R ?

Maybe you can use Expectation-maximization algorithm. Your points would be (value, position). In your example, this would be something like:
With the E-M algorithm, the result would be something like (by hand):
This is the desired output, so you can consider using this, and if it really works in all your scenarios. An annotation, you must assign previously the number of clusters you want, but I think it's not a problem for you, as you have set out your question.
Let me know if this worked ;)
Edit:
See this picture, is what you talked about. With k-means you should control the delta value, this is, how the position increment, to have its value to the same scale that value. But with E-M this doesn't matter.
Edit 2:
Ok I was not correct, you need to control the delta value. It is not the same if you increment position by 1 or by 3: (two clusters)
Thus, as you said, this algorithm could decide to cluster points that are not neighbours if their position is far but their value is close. You need to guarantee this not to happen, with a high increment of delta. I think that with a increment of 2 * (max - min) values of your sequence this wouldn't happen.
Now, your points would have the form (value, delta * position).

Related

Calculating edge embeddedness for signed graph in R

I have a dataframe looks like
src dst sign
0 1 +1
1 2 -1
2 5 +1
1 0 -1
...
to describe a signed graph (with two types of edges: +/-)
I want to calculate edge embeddedness of this graph.
Currently, I am writing two nested loop (i.e., a brute-force attack: just count one by one).
You could imagine that this solution is very slow.
Is there a better way to perform the task?
Thank you very much,

Counting occurrences of 1 --> 0 changing for a signal in SIMULINK

I have, for example, this vector coming as a signal from other block each
sample of time, let’s say each second. Actually, the nature of this vector is
random but this is just an example:
U = [1 1 0 0 1 0 0 0 0 1 0]
I want to process this signal to a block that counts the occurrences of changing
from 1 to 0. The initial value is assumed to be zero.
Therefore, in the above example, when the first two entries (which are ones)
enter this block, the block will give zero output.
But, when the third entry (which is zero and its previous value is 1) enters the
block, it will give me one and when the sixth entry (which is zero and its
previous value is 1) enters the block, it will give me two and when the last
entry (which is zero and its previous value is 1) enters the block, it will give
me three. For all other cases, the block will give zero.
So, the block will count the cases where the input is zero and its previous
input is one.
The output of the block is keeping changing over the time which, in turn, will
enter to other block.
I don’t want the implementation or details. I already know all of that.
I just want to know what is the name of the block that does such counting.
I tried using counter and memory blocks but unfortunately, I was not able to get
the right aimed results.
The
regards
No idea if you still require an answer, but the following should do it (I don't think it can be done in one standard block).
This assumes that your input is a signal that changes over time (and not a constant vector).
Version 1 would, for your input of [1 1 0 0 1 0 0 0 0 1 0], provide an output of [0 0 1 1 1 2 2 2 2 2 3].
Since you wrote
For all other cases, the block will give zero.
I also included a Version 2 which will, for your input, output [0 0 1 0 0 2 0 0 0 0 3].

Find all lineary dependent subsets of vectors

I have vectors in such form
(1 1 1 0 1 0)
(0 0 1 0 0 0)
(1 0 0 0 0 0)
(0 0 0 1 0 0)
(1 1 0 0 1 0)
(0 0 1 1 0 0)
(1 0 1 1 0 0)
I need to find all linear dependent subsets over Z2.
For example 1,2,5 and 3,6,7.
OK, my 5 cents, brute force, IINM, is iterating over all subsets of the set of vectors. So, instead, you go from the bottom, where singleton sets of each vectors are obviously linearly independent.
In the next step, for each singleton set, you make a list by trying to add each of the remaining vectors in the set, and see which of such pairs are independent, taking note of those pairs, you've already tested, to prevent repetitive effort.
In the 3rd step, for each independent pair, you'll try to add each one of the remaining vectors to make triples, you test the new vector for independence, and mark the triples you've already tested.
This should provide much saving over brute force, with the worst case being a set of all independent vectors.

Matrix operations to enumerate all paths through n-partite graph

I have an n-partite (undirected) graph, given as an adjacency matrix, for instance this one here:
a b c d
a 0 1 1 0
b 0 0 0 1
c 0 0 0 1
d 0 0 0 0
I would like to know if there is a set of matrix operations that I can apply to this matrix, which will result in a matrix that "lists" all paths (of length n, i.e. through all the partitions) in this graph. For the above example, there are paths a->b->d and a->c->d. Hence, I would like to get the following matrix as a result:
a b c d
1 1 0 1
1 0 1 1
The first path contains nodes a,b,d and the second one nodes a,c,d. If necessary, the result matrix may have some all-0 lines, as here:
a b c d
1 1 0 1
0 0 0 0
1 0 1 1
0 0 0 0
Thanks!
P.S. I have looked at algorithms for computing the transitive closure, but these usually only tell if there is a path between two nodes, and not directly which nodes are on that path.
One thing you can do is to compute the nth power of you matrix A. The result will tell you how many paths there of length n from any one vertex to any other.
Now if you're interested in knowing all of the vertices along the path, I don't think that using purely matrix operations is the way to go. Bearing in mind that you have an n-partite graph, I would set up a data structure as follows: (Bear in mind that space costs will be expensive for all but small values.)
Each column will have one entry of each of the nodes in our graph. The n-th column will contain 1 in if this node is reachable on the n-th iteration from our designated start vertex or start set, and zero otherwise. Each column entry will also contain a list of back pointers to the vertices in the n-1 column which led to this vertex in the nth column. (This is like the viterbi algorithm, except that we have to maintain a list of backpointers for each entry rather than just one.) The complexity of doing this is (m^2)*n, where m is the number of vertices in the graph, and n is the length of the desired path.
I'm a little bit confused by your top matrix: with an undidrected graph, I would expect the adjacency matrix to be symmetric.
No, There is no pure matrix way to generate all paths. Please use pure combinatorial algorithms.
'One thing you can do is to compute the nth power of you matrix A. The result will tell you how many paths there of length n from any one vertex to any other.'
The power of matriax generates walks not paths.

How do I calculate the number of permutations in base 3 combinatorics?

I've never been much for math and I'm hoping that someone can help me out with the following.
I have 5 boxes:
1 2 3 4 5
[ ] [ ] [ ] [ ] [ ]
The boxes can either be white, gray, or black (or think of it as 0, 1, 2)
How many possible states can the box set be in?
What is the pseudocode (or in any language) to generate all the possible outcomes??
ie...
00000
00001
00011
00111
etc, etc...
I really appreciate any help anyone can give me with this.
the answer for the number of combinations is: 3x3x3x3x3 (3^5) since each box can have 3 possible colors.
As for generating the outcomes, see if you can figure it out using this matrix with 0, 1, or 2 to represent the color of the box. On a smaller scale (lets assume 3 boxes) it would look like this:
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 1 2
0 2 0
0 2 1
0 2 2
1 0 0
1 0 1
1 0 2
1 1 0
1 1 1
1 1 2
1 2 0
1 2 1
1 2 2
2 0 0
2 0 1
2 0 2
2 1 0
2 1 1
2 1 2
2 2 0
2 2 1
2 2 2
This is a classic permutation generation problem. You have 3 possibilities for each position, and 5 positions. The total number of generated string is 3^5 = 243.
You need recursion if you want a general solution (a simple iterative loop only works for a single instance of the problem).
Here's a quick example:
public static void Main(string[] args){
Generate("", 5);
}
private void Generate(string s, int limit)
{
if (s.Length == limit)
Console.WriteLine(s);
else
{
Generate(s+"0", limit);
Generate(s+"1", limit);
Generate(s+"2", limit);
}
}
To answer your first question, what would the answer be if the boxes could contain only one of two values? So, what's the answer if the boxes contain one of three values?
To answer your second question, what pseudocode generates all possible outcomes of one box? Now, pseudocode generates all possible outcomes of two boxes?
I'd recommend solving the problem on paper first. Try to solve it with a smaller number of boxes (maybe three), and list all possibilities. Then, think of how your reasoning went, or how you'd explain what you did to a small child.
This seems like a homework problem. I'll just give you some help as to the solution then.
What you are saying is that each box has three states, which are all independent. One box would have 3 solutions, and two boxes would have 3 * 3 solutions - for each state of the first box the second box would have three states as well. Extend that to 5 boxes.
To generate each solution, you can just cycle through it. It is easy to make nested for loops for each box, and multiplying by powers of 10 can let you show the number at once.
You can generalize the code for multiple boxes in a similar way.
Thank you all for your answers, at least those of you who actually gave me one.
While I can appreciate that the question sounded like it was pulled straight out of Computer Science 101, it wasn't. The irony of the matter is that it was for real life on a real deadline and I didn't have time to hearken back to when I was being taught this stuff and said to myself, "when am I ever going to need this crap"
If I wanted to be patronized and treated like a school boy I would go back to my elementary school and ask my 5th grade teacher if I can go to the bathroom
Thanks again
the number of states is 3^5.
pseudocode is
for value from 0 to 3^5-1
print base3(value)
where base3 is a function that repeatedly takes modulo 3 to get a digit, then removes that digit (by dividing by 3)
Hint: imagine that each box is a position in a number and each colour is a different digit. In the real world, how many combinations (including zero) do you get with 2 positions and 10 possible digits? What about 3 positions? What's the relationship between adding an extra position and the number of combinations, given the number of digits you have available?
Unique number of combinations: 3^5=243
Code:
n = 0
for i = 0 to 3^5-1
{
s = ""
for j = 1 to 5
{
d = n mod 3
s = toascii(d) . s
n = n / 3
}
println s
i = i + 1
}
Here's how I first learned to do this: first think about how many choices you are making. You are making five choices, one for each box. So write down five blank lines with multiplication signs:
__ x __ x __ x __ x __ = ?
In each blank, write the number of objects you have to choose from for that box. Since you have 3 numbers to choose from for each box, you write a 3 in every blank:
3 x 3 x 3 x 3 x 3 = 243
This gives you the total number of permutations for those choices.
The number of possibilities is 3 to the power 5
If you loop from 0 to that number minus 1 and express it in base 3 you will have all the possibilities (remember to prepend 0s where necessary)
In Ruby:
number_of_possibilities = 3**5-1
for i in (0..number_of_possibilities)
base_3_number = i.to_s(3)
puts "%05d" % base_3_number # number formatting used to prepend 0s where necessary
end
Can I ask what about this you don't understand or whats tripping you up? I see that everyone here has simply answered the question, but if you've copied their answers, you've learned nothing, and thus completely missed the point of the homework. Assuming your next lesson builds upon this one, you're just going to fall further behind.
If you either worked for me or were in my class I'd simply ask the following...
"How do you think the problem should be solved?" The answer to which might reveal where you're getting hung up. A wise professor of mine at CMU once said "I can't help you understand this until you know what you don't understand" I never did figure out what I didn't understand and I dropped his class, but the lesson stuck with me.
I know its probably too late, but for these homework questions I really think we should be helping the person learn as opposed to simply providing an answer and doing their homework for them.
Your problem needs nothing more than the rule of product in combinatorics.
You can choose the state of the first box in 3 ways, and the state of the second box in 3 ways, and ... and the state of the 5th box in 3 ways. The number of ways in which you can set the state of all the boxes is the product of all the five (equal) numbers of ways, i.e. 3x3x3x3x3 = 35.
Similar question: how many numbers can you form with 5 digits in the decimal system, counting the leading zeros? That is, how many numbers are there from 00000 to 99999? You can choose the first digit in 10 ways (0...9), and so on and so on, and the answer is 10x10x10x10x10 = 100000, as you already know.
Don't even try to write code to answer this! The reason is that you need some very large numbers (factorials) to calculate it. These create numbers much larger than any base type in the CLR. You can use this opensource library to do the calculation.
void solve(int p=0,int n=5,int d=0)
{
if (n==p)
{
int rev=d;
int i=0;
while (i<5) {
cout << rev%10;
rev /= 10;
i++;## Heading ##
}
cout << endl;
return;
}
for(int i=0; i<3 ; i++)
{
solve(p+1,n, d*10 + i);
}
}

Resources