Arc-Consistency (AC3) and one Challenges? - constraints

In Applying Arc-Consistency (AC3) algorithms on one Constraint Satisfaction Problem, if domain of one variable be empty, what is the next step?
1) halt.
2) do backtrack.
3) start from another initial state.
4) it depends on that we are in which step.
Solution (4). I think (1) is true because it mean we cannot find any consistent assignment and halt. anyone can describe why (4) is true?

With the particular algorithm you're using, if the domain of a variable shrinks until it is empty, then it means that the constraint problem has no solutions. Therefore the algorithm should halt in the failure state.

Related

Reinforcement Learning Algorithm stuck in infinite loop when choosing an infeasible action

😊
I wrote a gym environment and i am training this gym environment using the DQN agent from stable-baselines.
What my environment dose:
action : int == 1, ..., n --> Add an order n (containing (1 <= x <= 10) items) to a list (lets call it "item_list").
action == 0 --> Close the item_list (no further items can be added).
So whats the problem?
There is a max number of items allowed in the item_list. This means if the item_list is almost full and the agent chooses action >= 1 --> The capacity of the "item_list is exceeded. Thus it is an infeasable action.
What happens in this case?
In that case the order will not be added to "item_list" and the Agent receaves a negative reward. The problem is that the Observation of my agent wont chainge at all. This will lead to my Agent choosing the same action over and over again.
When does it become a problem?
During training it does not matter that much. The Agent will learn to avoid choosing an action like this and exploration will always get the agent out of this loop after a while. Though, when i want to use the trained agent there will be no exploration. One "bad" action would be enough to send this agent into a infinite loop of choosing an order that does not fit into the item_list.
What kind of answer am I looking for?
Is there a way of dealing with those infinite loops exept form hardcoding a lot of exceptions? If the capacity of item_list ist exeeded it would be easy to hardcode that the list should be closed the way it ist. Though the same infinite loop can appear if the item_list is empty and the Agent chooses action == 0 (closing an empty list). There are many other cases where this problem can occure. Is there a smart solution for this that I am not aware of?
Two general approaches to consider:
First, do not allow illegal actions. When your agent outputs a distribution over actions, multiply it by indicator vector that zeroes out illegal actions. Then sample an action.
Second, after an illegal action terminate the episode with a negative reward. You already assign a negative reward for illegal action, additionally terminate the episode, so that there is no infinite loop.
I would go for the first one.
Edit: see this question as well: https://ai.stackexchange.com/questions/2980/how-should-i-handle-invalid-actions-when-using-reinforce

Derivatives warning doesn't make sense

I am getting a warning in this form:
DerivativesWarning:Constraints or objectives ['traj.linkages.stage_1_grav_turn:alpha_final|coast_1:alpha_initial', 'traj.phases.stage_1_maneuver.path_constraints.path:q_alpha'] cannot be impacted by the design variables of the problem.
not sure what to make of the first one, a linkage constraint. Alpha is a parameter in the grav_turn and coast phase, and it's set to 0. The second one makes no sense, as in the the stage_1_maneuver phase alpha is a control so you can definitely control dynamic pressure * alpha. Perhaps because alpha at the end of that phase is constrained to 0?
Anyways the optimizer converges fine, and produces results that look correct and makes sense when cross checked. Just was curious about this.
In OpenMDAO V3.9.0 a feature was added that detects rows and columns of all 0 in the total derivative Jacobian. A row of all 0's means that an objective or constraint is not impacted by any of the design variables. A column of all 0's means that a design variable does not impact any constraint or objective values.
Both of these situations are potentially problematic. A 0 column means that there are less degrees of freedom than you might think, since that DV doesn't affect anything. This isn't fatal, but it is still something that is worth warning a user about.
A 0 row is much more problematic. If it the row is associated with a constraint, it means the optimizer has no ability to satisfy that constraint. You may get "lucky" and find that the constraint happens that the constraint happens to be satisfied at the initial condition anyway, and so you can technically solve the optimization problem (your specific case is likely one of these). However, mathematically the problem is singular, and unless the optimizer you using has specific code to handle this corner case it can make things difficult.
One of the primary reasons this feature was added was that the OpenMDAO dev team noticed that Dymos users were particularly prone to accidentally creating 0 rows when adding linkage and path constraints. Often these 0 rows seem to not cause harm, but we have definitely also seen cases where they give the optimizer fits.
The warning helps you identify the problem so you can correct it.
In this case, it looks like you have two separate 0 rows.
traj.linkages.stage_1_grav_turn:alpha_final|coast_1:alpha_initial means that none of the design variables you've given to the optimizer affect that constraint. Likely this means that you have specified both alpha_final and alpha_initial as fixed_final and fixed_initial respectively. You're getting away with it because the initial conditions you provided must have both alphas equal by construction.
You would still be better off either removing the constraint, or adding at least one end of the linkage or the other as a design variable.
traj.phases.stage_1_maneuver.path_constraints.path:q_alpha means that at least one of the entries in your path constraint is not affected by any DV. It is likely not the entire path constraint, but just one end of it that is fixed because its computed from fixed boundary conditions. In this case, you can simply add indices to the add_path_constraint call to exclude the first or last point from the constraint.
Since your optimization is running, the 0 rows aren't killing you. However, its good practice to clean this up. It's possible that not having them will improve performance now, or it may save you from a future situation where the optimization "mysteriously" stops working because you somehow trigger a situation where the optimizer can no longer handle the 0 rows.

References or Standardization of "Value Updating" in Constraint Satisfaction

Constraint Satisfaction Problems (CSPs) are basically, you have a set of constraints with variables and the domains of values for the variables. Then given some configuration of the variables (assignment of variables to values in their domains), you check to see if the constraints are "satisfied". That is, you check to see that evaluating all of the constraints returns a Boolean "true".
What I would like to do is sort of the reverse. Instead of this Boolean "testing" if the constraints are true, I would like to instead take the constraints and enforce them on the variables. That is, set the variables to whatever values they need to be in order to satisfy the constraints. An example of this would be like in a game, you say "this box's right side is always to the left of its containing box's right side," or, box.right < container.right. Then the constraint solving engine (like Cassowary for the game example) would take the box and set its "right" property to whatever number value it resolved to. So instead of the constraint solver giving you a Boolean value "yes the variable configuration satisfies the constraints", it instead updates the variables' configuration with appropriate values, "you have updated the variables". I think Cassowary uses the Simplex Algorithm for solving its constraints.
I am a bit confused because Wikipedia says:
constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy. A solution is therefore a set of values for the variables that satisfies all constraints—that is, a point in the feasible region.
That seems different than the constraint satisfaction problem, of which it says:
An evaluation is consistent if it does not violate any of the constraints.
That's why it seems CSPs are to return Boolean values, while in CS you can set the values. Not quite clear the distinction.
Anyways, I am looking for general techniques on Constraint Solving, in the sense of setting variables like in the simplex algorithm. However, I would like to apply it to any situation, not just linear programming. Some standard and simple example constraints are:
All variables are different.
box.right < container.right
The sum of all variables < 10
Variable a goes before variable b in evaluation.
etc.
For the first case, seeing if the constraints are satisfied (Boolean true) is pretty easy: iterate through the pairs of variables, and if any pair is not equal to each other, return false, otherwise return true after processing all variables.
However, doing the equivalent of setting the variables doesn't seem possible at first glance: iterate through the pairs of variables, and if they are not equal, perhaps you set the first one to the second one. You might have to do some fixed point thing, processing some of them more than once. And then figuring out what value to set them to seems arbitrary how I just did it. Maybe instead you need some further (nested) constraints defining how set the values (e.g. "set a to b if a > b, otherwise set b to a"). The possibilities are customizable.
In addition, for simpler cases like box.right < container.right, it is even complicated. You could say at first that if box.right >= container.right then set box.right = container.right. But maybe actually you don't want that, but instead you want some iPhone-like physics "bounce" where it overextends and then bounces back with momentum. So again, the possibilities are large, and you should probably have additional constraints.
So my question is, similar to how for testing the constraints (for Boolean value) is standardized to CSP, I am wondering if there are any references or standardizations in terms of setting the values used by the constraints.
The only thing I have seen so far is that Cassowary simplex algorithm example which works well for an array of linear inequalities on real-numbered variables. I would like to see something that can handle the "All variables are different" case, and the other cases listed, as well as the standard CSP example problems like for scheduling, box packing, etc. I am not sure why I haven't encountered more on setting/updating constraint variables instead of the Boolean "yes constraints are satisfied" problem.
The only limits I have are that the constraints work on finite domains.
If it turns out there is no standardization at all and that every different constraint listed requires its own entire field of research, that would be good to know. Then I at least know what the situation is and why I haven't really seen much about it.
CSP is a research fields with many publications each year. I suggest you to read one of the books on the subject, like Rina Dechter's.
For standardized CSP languages, check MiniZinc on one hand, and XCSP3 on the other.
There are two main approaches to CSP solving: systematic and stochastic (also known as local search). I have worked on three different CSP solvers, one of them stochastic, but I understand systematic solvers better.
There are many different approaches to systematic solvers. It is possible to fill a whole book covering all the possible approaches, so I will explain only the two approaches I believe the most in:
(G)AC3 which propagates constraints, until all global constraints (hyper-arcs) are consistent.
Reducing the problem to SAT, and letting the SAT solver do the hard work. There is a great algorithm that creates the CNF lazily, on demand when the solver is already working. In a sence, this is a hybrid SAT/CSP algorithm.
To get the AC3 approach going you need to maintain a domain for each variable. A domain is basically a set of possible assignments.
For example, consider the domains of a and b: D(a)={1,2}, D(b)={0,1} and the constraint a <= b. The algorithm checks one constraint at a time, and when it reaches a <= b, it sees that a=2 is impossible, and also b=0 is impossible, so it removes them from the domains. The new domains are D'(a)={1}, D'(b)={1}.
This process is called domain propagation. Using a queue of "dirty" constraints, or "dirty" variables, the solver knows which constraint to propagate next. When the queue is empty, then all constraints (hyper arcs) are consistent (this is where the name AC3 comes from).
When all arcs are consistent, then the solver picks a free variable (with more than one value in the domain), and restricts it to a single value. In SAT, this is called a decision It adds it to the queue and propagates the constraints. If it gets to a conflict (a constraint can't be satisfied), it goes back and undos an earlier decision.
There are a lot of things going on here:
First, how the domains are represented. Some solvers only hold a pair of bounds for each domain. Others, have a set of integers. My solver holds an interval set, or a bit vector.
Then, how the solver knows to propagate a constraint? Some solvers such as SAT solvers, Minion, and HaifaCSP, use watches to avoid propagating irrelevant constraints. This has a significant performance impact on clauses.
Then there is the issue of making decisions. Usually, it is good to choose a variable that has a small domain and high connectivity. There are many papers comparing many different strategies. I prefer a dynamic strategy that resembles the VSIDS of SAT solvers. This strategy is auto-tuned according to conflicts.
Making decision on the value is also important. Many simply take the smallest value in the domain. Sometimes this can be suboptimal if there is a constraint that limits a sum from below. Another option is to randomly choose between max and min values. I tune it further, and use the last assigned value.
After everything, there is the matter of backtracking. This is a whole can of worms. The problem with simple backtracking is that sometimes the cause for conflicts happened at the first decision, but it is detected only at the 100'th. The best thing is to analyze the conflict, and realize where the cause of the conflict is. SAT solvers have been doing this for decades. But CSP representation is not as trivial as CNF. So not many solvers could do it efficiently enough.
This is a nontrivial subject that can fill at least two university courses. Just the subject of conflict analysis can take half of a course.

deadlock and mutual exclusion

Two processes X and Y need to access a critical section. Consider the following synchronization construct used by both the processes.
http://d18khu5s3lkxd9.cloudfront.net//wp-content/uploads/2015/02/Q20.png
In the link given above,
varP and varQ are shared variables and both are initialized to false. Which one of the following statements is true?
1.The proposed solution prevents deadlock but fails to guarantee mutual exclusion
2.The proposed solution guarantees mutual exclusion but fails to prevent deadlock
3.The proposed solution guarantees mutual exclusion and prevents deadlock
4.The proposed solution fails to prevent deadlock and fails to guarantee mutual exclusion
According to the question setter 4th answer is the correct answer.
I have figured that it fails to guarantee mutual exclusion but how does it fails to prevent deadlock?
I came up with this after studying the algo carefully.
Say process Y has used the Critical Section.Therefore,it must have set VarQ variable as false.
Now if Process X tries to enter Critical Section.It can never enter unless Process Y also tries to enter.Reason being the condition while(varQ == true) will remain false unless Process Y tries to enter Critical Section and in doing so sets VarQ to true which before leaving Critical Section(CS) it had set to false.
So as we can see if Process Y does not tries to enter CS,Process X is indefinitely blocked and also the Critical Section is lying unused.
But the question still remains that how is lack of starvation freedom leading to lack of deadlock freedom.In deadlock every process is blocked,but if Process Y indeed tried to enter CS again,Process X could have been successful in its attempt to enter CS.

Modeling an HTTP transition system in Alloy

I want to model an HTTP interaction, i.e. a sequence of HTTPRequest/HTTPResponse, and I am trying to model this as a transition system.
I defined an ordering on a class State by using:
open util/ordering[State]
where a State is simply a set of Messages:
sig State {
msgSet: set Message
}
Each pair of (HTTPRequest->HTTPResponse) and (HTTPResponse->HTTPRequest) is represented as a rule in my transition system.
The rules are expressed in Alloy as predicates that let one move from one state to another.
E.g., this is a rule generating an HTTPResponse after a particular HTTPRequest is received:
pred rsp1 [s, s': State] {
one msg: Request, msg':Response | (
// Preconditions (previous Request)
msg.method=get &&
msg.address.url=sample_com &&
// Postconditions (next Response)
msg'.status=OK_200 &&
// previous Request has to be in previous state
msg in s.msgSet &&
// Response generated is added to next state
s'.msgSet = s.msgSet + msg'
}
Unfortunately, the model created seems to be too complex: we have a dozen of rules (more complex than the one above but following the same pattern) and the execution is very slow.
EDIT: In particular, the CNF generation is extremely slow, while the solving takes a reasonable amount of time.
Do you have any suggestion on how to model a similar transition system?
Thank you very much!
This is a model with an impressive level of detail; thank you for sharing it!
None of the various forms of honestAction by itself takes more than two or three minutes to find an instance (or in some cases to fail to find any instance), except for rsp8, which takes quite a while by itself (it ran for fifteen minutes or so before I stopped it).
So the long CNF preparation times you are observing are apparently caused by either (a) just predicate rsp8 that's causing your time issues, or (b) the size of the disjunction in the honestAction predicate, or (c) both.
I suspect but have not proved that the time issue is caused by combinatorial explosion in the number of individuals required to populate a model and the number of constraints in the model.
My first instinct (it's not more than that) would be to cut back on the level of detail in the model, in particular the large number of singleton signatures which instantiate your abstract signatures. These seem (I could be wrong) to be present either for bookkeeping purposes (so you can identify which rule licenses the transition from one state to another), or because the modeler doesn't trust Alloy to generate concrete instances of signatures like UserName, Password, Code, etc.
As the model now is, it looks as if you're doing a lot of work to define all the individuals involved in a particular example, instead of defining constraints and letting Alloy do the work of finding examples. (Using Alloy to check the properties a particular concrete example can be useful, but there are other ways to do that.)
Since so many of the concrete signatures in the model are constrained to singleton cardinality, I don't actually know that defining them makes the task of finding models more complex; for all I know, it makes it simpler. But my instinct is to think that it would be more useful to know (as well as possibly easier for Alloy to establish) that state transitions have a particular property in general, no matter what hosts, users, and URIs are involved, than to know that property rsp1 applies in all the cases where the host is named examplecom and the address URI is example_url_https and whatnot.
I conjecture that reducing the number of individuals whose existence and properties are prescribed, and the constraints on which individuals can be involved in which state transitions, will reduce the CNF generation time.
If your long-term goal is to test long sequences of state transitions to test whether from a given starting point it's possible or impossible to arrive at a particular state (or kind of state), you may need to re-think the approach to enable shorter sequences of state transitions to do the job.
A second conjecture would involve less restructuring of the model. For reasons I don't think I understand fully, sometimes quantification with one seems to hurt rather than help performance, as in this example, where explicitly quantifying some variables with some instead of one turned out to make a problem tractable instead of intractable.
That question involves quantification in a predicate, not in the model overall, and the quantification with one wasn't intended in the first place, so it may not be relevant here. But we can test the effect of the one keyword on this model in a simple way: I commented out everything in honestAction except rsp8 and ran the predicate first != last in a scope of 8, once with most of the occurrences of one commented out and once with those keywords intact. With the one keywords commented out, the Analyser ran the problem in 24 seconds or so; with the one keywords in place, it ran for 500 seconds so far before I decided the point was made and terminated it.
So I'd try removing the keyword one from all of the signatures with instance-specific individuals, leaving it only on get, post, OK_200, etc., and appData. I would also try doing without the various subtypes of Key, SessionID, URL, Host, UserName, and Password, or at least constraining their cardinality in the run command.

Resources