I think I do not quite understand the concept of epsilon transitions when determining the language of a non-deterministic automaton.
For example, in this automaton:
The language is: 'A double sequence of a or a double sequence of b where there is a possibility of a baa sequence'.
But the word a is accepted by the automaton too, isn't it? (Also the word b, and aaa, and so on...)
An ε-transition is just a spontaneous transition that doesn't consume any input.
When you are in a state that has outgoing ε-transitions, it is as if you were simultaneously in every state those transitions lead to, until the automaton does something observable; this is where the nondeterminism comes from. The set of states reachable this way is the ε-closure of a state.
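To make the ε-closure concrete, here is a minimal Python sketch; the transition table is a made-up example, since the original diagram isn't reproduced here:

def epsilon_closure(states, eps):
    # All states reachable from `states` using only ε-transitions.
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in eps.get(state, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure

# Hypothetical ε-transition table: state -> set of one-step ε-successors
eps = {"q0": {"q1"}, "q1": {"q2"}}
print(epsilon_closure({"q0"}, eps))  # {'q0', 'q1', 'q2'}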
According to the layout, you can have an arbitrary number of baa prefixes followed by an arbitrary number of a's or b's. So:
(the empty string)
baa
baabaa
a
aa
ba
abab
baabab
...
I made this code to generate all unique substrings of a given string, and I'm struggling to find the complexity of the code when I use recursion. I think a good time complexity for this problem is O(N²), but what is my complexity, and how can I improve my code?
'''
a b c d
ab ac ad bc bd cd
abc abd acd bcd
abcd
'''
dict_poss = {}

# get all letters
def func(string):
    for i, value in enumerate(string):
        dict_poss[value] = True
        recursive(value, string[i+1:])

# get all combinations
def recursive(letter, string):
    for i, value in enumerate(string):
        if letter + value not in dict_poss:
            dict_poss[letter + value] = True
            recursive(letter + value, string[i+1:])
    return

func("abcd")
print(dict_poss)
From what you have written at the top there, you are trying to find all possible subsequences of the string, in which case this will be O(2^n). Think of the number of possible binary strings of length n: you can construct each subsequence from the mask of one binary string (take a letter where there is a 1, ignore it where there is a 0).
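Here is a quick Python sketch of that bitmask argument (my own illustration, not the asker's code): each of the 2^n masks selects a different set of positions, which is why enumerating all subsequences is inherently O(2^n):

def all_subsequences(s):
    # Bit i of the mask decides whether s[i] is kept, so there are
    # 2**len(s) masks, one per subsequence (mask 0 is the empty one).
    result = set()
    for mask in range(1, 2 ** len(s)):
        result.add("".join(c for i, c in enumerate(s) if mask & (1 << i)))
    return result

print(sorted(all_subsequences("abcd"), key=len))  # 15 non-empty subsequences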
If you want to find all possible substrings, the cost comes down to the implementation of strings in the language you're using. In C++ it's fairly trivial to do it in O(n^2), but in Java it would be O(n^3), since substring/concatenation is O(n) (although you could do it in O(n^2) in Java too, you just have to be clever about it). I'm not sure what this is written in; I'm guessing Python (you should tag your question with the language you're using when you include code), but you could look it up. You could also time it with differently sized inputs; it wouldn't be hard to get a measurable runtime for an n^2 algorithm.
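For contrast, a sketch of collecting all distinct contiguous substrings in Python. The loop structure is O(n^2) pairs, although, as said above, each slice itself costs O(length), so the total work is closer to O(n^3) unless you are clever about it:

def all_substrings(s):
    # One slice per pair (i, j) with i < j: O(n^2) slices in total.
    return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}

print(all_substrings("abcd"))  # {'a', 'ab', 'abc', 'abcd', 'b', ...}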
I have come across two completely different answers.
One says that:
Yes, there does exist a context-free grammar for {0^i 1^j | 1 ≤ i ≤ j ≤ 2i}. The following grammar ensures that the number of 0's is at least half of, and at most equal to, the number of 1's:
S -> 0S11 | 0S1 | 01
(For example, S => 0S11 => 00111 gives i = 2, j = 3, and indeed 2 ≤ 3 ≤ 4.)
The other:
No, proof by contradiction:
Case 1:
Suppose you push i 0s onto the stack.
Pop off j 1s.
You can’t determine if j<=2i.
Case 2:
Suppose you push 2i 0s onto the stack.
Pop off j 1s.
You can’t determine if j>=i.
Any other value pushed onto the stack, not equal to i or 2i, is a value relative to one of these two, so the same reasoning applies.
Is either correct? Thanks so much!
Since a grammar exists and you can pretty clearly check it matches the whole language, the language must be context-free. So the proof by contradiction is wrong. But why?
The proof assumes the machine must be deterministic. But you need a nondeterministic pushdown automaton to recognize some context-free languages. Thus, all the second proof shows (if it is correct) is that the language isn't a deterministic context-free language; it doesn't show that it isn't a context-free language.
Indeed, if you let the machine be nondeterministic, then basically you push i 0s and then, for each 0 on the stack, nondeterministically pop one or two 1s. Some computation will accept exactly when the string is in the language.
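A small Python sketch of that nondeterministic strategy, with backtracking standing in for the nondeterminism (this simulates the idea rather than implementing a real PDA):

def in_language(s):
    # Split s into its leading 0s and the rest; reject anything
    # that isn't of the form 0^i 1^j with i, j >= 1.
    i = len(s) - len(s.lstrip("0"))
    zeros, ones = s[:i], s[i:]
    if not zeros or not ones or "0" in ones:
        return False

    def match(z, o):
        # For each 0, nondeterministically consume one or two 1s.
        if z == 0:
            return o == 0
        return (o >= 1 and match(z - 1, o - 1)) or (o >= 2 and match(z - 1, o - 2))

    return match(len(zeros), len(ones))

for s in ["01", "011", "0111", "000111", "0001111111"]:
    print(s, in_language(s))  # True, True, False, True, False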
I'm working on a problem from Languages and Machines: An Introduction to the Theory of Computer Science (3rd Edition), Chapter 2, Example 6.
I need help finding the answer to:
A recursive definition of the set of strings over {a, b} that contain one b and an even number of a's before the first b?
When looking for a recursive definition, start by identifying the base cases and then look for the recursive steps, as if you were doing induction. What are the smallest strings in this language? Well, any string must have a b. Is b alone a string in the language? Why yes it is, since there are zero a's before it, and zero is an even number.
Rule 1: b is in L.
Now, given any string in the language, how can we get more strings? Well, we can apparently add any number of a's to the end of the string and get another string in the language. In fact, we can get all such strings from b if we simply allow ourselves to add one more a to the end of a string already in the language. From x in L, we therefore recover xa, xaa, ..., xa^n, ... = xa*.
Rule 2: if x is in L then xa is in L.
Finally, what can we do to the beginning of strings in our language? The number of a's must be even. So far, rules 1 and 2 only allow us to construct strings that have zero a's before the b. We should be able to get two, four, six, indeed any even number, of a's. A rule that lets us add two a's to the front of any string in our language will let us add ever more a's to the beginning while maintaining the evenness property we require. Starting with x in L, we recover aax, aaaax, ..., a^(2n)x, ... = (aa)*x.
Rule 3: if x is in L, then aax is in L.
Optionally, you may add the sometimes implicitly understood rule that only those things allowed by the aforementioned rules are in L. Otherwise, technically anything would be allowed, since we haven't explicitly disallowed anything.
Rule 4: Nothing is in L unless by virtue of some combination of the rules 1, 2 and/or 3 above.
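As a sanity check, here is a small Python sketch that decides membership by running rules 1-3 in reverse (peel a trailing a via rule 2, or a leading aa via rule 3, until only b remains):

def in_L(x):
    # Rule 1: b is in L.
    if x == "b":
        return True
    # Rule 2 reversed: xa is in L if x is in L.
    if x.endswith("a") and in_L(x[:-1]):
        return True
    # Rule 3 reversed: aax is in L if x is in L.
    if x.startswith("aa") and in_L(x[2:]):
        return True
    return False

for w in ["b", "ba", "aab", "aabaa", "ab", "bb"]:
    print(w, in_L(w))  # True, True, True, True, False, False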
A very basic question here:
Example rule (suppose it's generated by WEKA):
bread=t 10 ==> milk=t 10 conf:(1)
Which means that "out of 10 instances, every time people buy bread, they also buy milk" (ignore the support).
Can this rule be read both ways? Like, "every time people buy milk, they also buy bread"?
Another example
Physics101=A ==> Superphysics401=A
Can it be read both ways like this:
"If people got A on Physics101, they also got A on Superphysics401"
"If people got A on Superphysics401, they also got A on Physics101" ?
If so, what makes WEKA generate the rule in that order (Physics ==> Superphysics), and why not the other way? Or is the order not relevant?
Can this rule be read both ways? Like, "every time people buy milk, they also buy bread"?
No, it can only be read one way.
This follows from the rules of implication. A -> B and B -> A are different things. Read the former as "A is a subset of B": whenever you are in A, you are in B. B -> A, also called the converse of A -> B, can be interpreted in a similar way. When both of these hold, we say that A <-> B, which means that A and B are essentially the same.
If the above looks like too much jargon, keep the following in mind:
Rain -> Clouds is true: whenever there is rain, there will be clouds. But Clouds -> Rain is not always true; there may be clouds but no rain.
If so, what makes WEKA generate the rule in that order (Physics ==> Superphysics), and why not the other way? Or is the order not relevant?
The dataset leads to the rules. Here is an example :
Milk, Bread, Waffers
Milk, Toasts, Butter
Milk, Bread, Cookies
Milk, Cashewnuts
Convince yourself that Bread -> Milk holds, but Milk -> Bread does not.
Note that we may not be always interested in rules that either hold or do not hold. Thus, we try to add a notion of confidence to the rules. A natural way of defining confidence for A->B is P(B|A) i.e. how often do we see B when we see A.
This can be calculated by taking the count of transactions where A and B appear together and dividing by the count of transactions where A appears.
In our example,
P(Milk | Bread) = 2 / 2 = 1 and
P(Bread | Milk) = 2 / 4 = 0.5
You can now sort the list of rules by confidence and decide which ones you want to use.
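Here is a quick Python sketch of that confidence calculation over the toy transactions above:

transactions = [
    {"Milk", "Bread", "Waffers"},
    {"Milk", "Toasts", "Butter"},
    {"Milk", "Bread", "Cookies"},
    {"Milk", "Cashewnuts"},
]

def confidence(antecedent, consequent):
    # conf(A -> B) = count(A and B together) / count(A)
    with_a = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_a if consequent in t]
    return len(with_both) / len(with_a)

print(confidence("Bread", "Milk"))  # 1.0
print(confidence("Milk", "Bread"))  # 0.5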
History: I read in one of Knuth's algorithm books that early computers used base 10. Then they switched to two's complement.
Question: Why couldn't the base be -2, at least as a monoid?
Examples:
(-2)^1 = -2
(-2)^3 = -8
The problem is that a negabinary (base -2) system is more difficult to understand, and the numbers of representable positive and negative values are different. To see the latter point, consider a simple 3-bit case.
Here
the first (rightmost) bit represents the decimal 1;
the middle bit represents the decimal -2; and
the third (leftmost) bit represents the decimal 4
So
000 -> 0
001 -> 1
010 -> -2
011 -> -1
100 -> 4
101 -> 5
110 -> 2
111 -> 3
Thus the range of expressible values is -2 to 5, i.e. non-symmetric.
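A short Python sketch that decodes bit strings with weights (-2)^k reproduces that table:

def negabinary_to_int(bits):
    # The k-th bit from the right has weight (-2)**k.
    return sum(int(b) * (-2) ** k for k, b in enumerate(reversed(bits)))

for n in range(8):
    bits = format(n, "03b")
    print(bits, "->", negabinary_to_int(bits))  # 000 -> 0 ... 111 -> 3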
At its heart, digital logic is base two. A digital signal is either on or off. Supporting other bases (as in BCD) means wasted representation space, more engineering, more complex specification, etc.
Edited to add: in addition to the trivial representation of a single binary digit in digital logic, addition is easily realized in hardware, starting with the half adder, which is easily realized in Boolean logic (i.e. with transistors):
      (no carry)      (with carry)
    |  0    1          0    1
 ---+---------------------------
  0 | 00   01         01   10
  1 | 01   10         10   11
(The returned digit is (A xor B) xor C, and the carry is (A and B) or (C and (A or B)).) These adders are then chained together to build an adder for a full register.
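Written out as a Python sketch, that logic chains into a ripple-carry adder like this:

def full_adder(a, b, c):
    # Sum bit: (A xor B) xor C; carry out: (A and B) or (C and (A or B)).
    return (a ^ b) ^ c, (a & b) | (c & (a | b))

def ripple_add(x_bits, y_bits):
    # Bit lists are least-significant-bit first, like the hardware chain.
    carry, out = 0, []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    return out

# 3 (011) + 1 (001), least-significant bit first:
print(ripple_add([1, 1, 0], [1, 0, 0]))  # [0, 0, 1, 0], i.e. 100 = 4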
Which brings us to two's complement: negation is easy, and the addition of mixed positive and negative numbers follows naturally with no additional hardware. So subtraction comes almost for free.
Few other representations will allow arithmetic to be implemented so cheaply, and I know of none that are easier.
Optimization in storage and optimization in processing time are often at cross purposes with each other; all other things being equal, simplicity usually trumps complexity.
Anyone can propose any storage mechanism for information they wish, but unless there are processors or algorithms that support it, it won't get used.
There are two reasons to choose base 2 over base -2:
First, in a lot of applications you don't need to represent negative numbers. By confining the sign to a single bit, you can either expand the range of representable numbers or reduce the storage space required when negative numbers aren't needed. In base -2 you have to include the negative values even if you clip the range.
Second, two's complement hardware is simple to implement. Not only is it simple, it is super simple to implement two's complement hardware that supports both signed and unsigned arithmetic, since they are the same thing. In other words, the binary representations of uint4(8) and sint4(-8) are the same, and the binary representations of uint4(7) and sint4(7) are the same, which means you can do the addition without knowing whether or not the values are signed; everything works out either way. That means the hardware can avoid knowing anything about signs and leave signedness to be dealt with as a language convention.
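A Python sketch of that "same bits, same adder" point for a 4-bit register (the masking stands in for the fixed register width):

MASK = 0xF  # 4-bit register

def to_signed4(bits):
    # Two's complement reading of 4 bits: patterns 8..15 mean -8..-1.
    return bits - 16 if bits & 0x8 else bits

a, b = 0b1000, 0b0111       # 8 unsigned / -8 signed, and 7 either way
total = (a + b) & MASK      # one addition, no sign logic anywhere

print(total)                # 15, correct as unsigned 8 + 7
print(to_signed4(total))    # -1, correct as signed -8 + 7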
Also, the use of the binary system has a mathematical background. Consider information theory by Claude Shannon. My English skills don't qualify me to explain this topic, so better to follow the link to Wikipedia and enjoy the maths behind all this stuff.
In the end the decision was made because of voltage variance.
With base 2 it is on or off, with nothing in between.
With base 10, however, how do you know what each number is?
Is 0.1 volts a 1? What about 0.11? Voltage can vary and is not precise, which is why an analog signal is not as good as a digital one. It is also why paying more than $6 for an HDMI cable is a waste: it is digital, and the signal either gets there or it doesn't. For audio it does matter, because an analog signal can change.
Here is a concrete example of the complexity that dmckee pointed out. The numbers 0-9 in negabinary:
0 = 0
1 = 1
2 = 110
3 = 111
4 = 100
5 = 101
6 = 11010
7 = 11011
8 = 11000
9 = 11001
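That table can be reproduced with a small Python conversion sketch (repeatedly divide by -2, keeping each remainder in {0, 1}):

def to_negabinary(n):
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:               # fix up a negative remainder
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))

for n in range(10):
    print(n, "=", to_negabinary(n))  # matches the list above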
One's complement does have 0 and -0; is that what you're after?
CDC used to produce one's complement machines, which made negation very easy, as you suggest. As I understand it, this also allowed them to build subtraction hardware that didn't infringe on IBM's patent on the two's complement binary subtractor.