Is 3-OCC-MAX SAT NP-complete?

Define 3-OCC-MAX SAT as the language of all satisfiable CNF formulas in which every variable appears in at most 3 clauses.
Is this problem NP-complete? I am trying to find a Karp reduction from SAT to this problem, but I have not been able to construct one.

This problem is NP-complete. It is easy to see that it is in NP (guess a model and check it in polynomial time).
First Attempt (Failure)
To show NP-hardness, I propose the following construction:
Consider a 3-SAT instance F over n variables.
Take a clause [L1, L2, L3].
Introduce fresh variables p1, p2, p3.
Add clauses making each Li equivalent to pi.
Then replace the original clause by one over the fresh variables.
This results in clauses of the form:
[p1, p2, p3]
[-p1, L1]
[-L1, p1]
[-p2, L2]
[-L2, p2]
[-p3, L3]
[-L3, p3]
Do this for all clauses and always use fresh variables.
Note that the variables p1 to p3 are each used exactly three times, whereas L1 through L3 are each used twice within this gadget.
This construction is polynomial.
EDIT: I now see that this is not a valid solution yet: the original literals may still exceed the maximum of 3 occurrences, since they can appear in many such gadgets.
Second Attempt
The idea is to use a fresh variable for every appearance of a literal.
Let M be the total number of variable appearances in the 3-SAT formula (this bound can be improved).
For every atom A in the 3-SAT CNF, add the following implications to the resulting 3-OCC-MAX SAT formula:
q0 <- A
q1 <- q0
q2 <- q1
q3 <- q2
q4 <- q3
...
q_M <- q_M-1
q_M+1 <- q_M
q0 <- q_M+1
Do the same for the occurrences of -A:
p0 <- -A
p1 <- p0
p2 <- p1
p3 <- p2
p4 <- p3
...
p_M <- p_M-1
p_M+1 <- p_M
p0 <- p_M+1
Furthermore, add one clause to ensure that the q-chain and the p-chain are never both true (together with the implications above, exactly one of them must then be true):
-p_M <- q_M
One clause suffices, since the cyclic implications make all q_i equivalent and all p_i equivalent; attaching the exclusion to q_M+1 and p0 (or p_M+1 and q0) would push q0 and p0 to four occurrences.
Now, add the clauses of the original 3-SAT CNF, in which the n-th occurrence of the atom A is replaced by q_n and the n-th occurrence of -A by p_n.
There is no "0-th occurrence", i.e. we start counting at 1; therefore q0 and p0, as well as q_M and p_M, are not used in this step.
Note that the variable A appears twice, the variables q0, p0, q_M and p_M exactly three times, and all remaining q_i and p_i at most three times.
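As a small worked example (illustrative; the exclusion between the two chains is written here as the single clause [-q_M, -p_M], which keeps q0 and p0 at three occurrences): suppose A occurs twice positively and once negatively, and take M = 3. In clause form, the construction for A is:
[-A, q0], [-q0, q1], [-q1, q2], [-q2, q3], [-q3, q4], [-q4, q0]   (q-chain: A -> q0, then the cycle)
[ A, p0], [-p0, p1], [-p1, p2], [-p2, p3], [-p3, p4], [-p4, p0]   (p-chain: -A -> p0, then the cycle)
[-q3, -p3]                                                         (the chains are never both true)
The two positive occurrences of A in the original clauses are then replaced by q1 and q2, and the negative occurrence by p1; every variable above occurs at most three times.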


A more streamlined approach than nested for loops to fill an array in R

I'm trying to write a function in R that models population sizes (of different age brackets) across a network of connected populations. As a simplified example, imagine I want to predict how many juveniles and adults there are in 2 separate populations, over 10 discrete time steps:
age <- c("juvenile","adult")
pop <- c(1:2)
t <- c(1:10)
I make an empty 3D array to fill, and set starting (t=1) population sizes of 2 and 4.
a1 <- array(0, dim=c(10,2,2), dimnames = list(t, age,pop))
a1[1,2,1:2] <- c(2, 4)
And then I've written a function to fill this array from t=2 to t=10:
f1 <- function(a1){
  growth <- c(2, 3)       # juveniles produced per adult, by population
  surv   <- c(0.5, 0.25)  # adult survival rate, by population
  for (t in 2:10) {
    for (p in 1:2) {
      a1[t, 1, p] <- a1[t-1, 2, p] * growth[p]
      a1[t, 2, p] <- a1[t-1, 1, p] + a1[t-1, 2, p] * surv[p]
    }
  }
  return(data.frame(a1))
}
f1(a1)
Between each time step:
juveniles become adults,
adults produce more juveniles (either 2 or 3 per adult, depending on population), and
a proportion of adults (either 0.5 or 0.25, depending on pop) survive to the next time step.
I quite like this methodical approach, but in reality my array has many more dimensions, all of different lengths, that are all inter-connected. In other words, a whole lot of nested for loops and excessive array co-ordinates (it gets tricky to remember what something like a1[t,3,1,2,1] is actually referring to).
If anyone has any suggestions on where to start for a more streamlined approach, that would be greatly appreciated. I keep hearing how apply functions are more concise than for loops, but I've only been able to find examples on how to use apply to summarise arrays, rather than use it to fill them.
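One possible direction (a sketch, not from the original question; the matrices and names here are illustrative): move the population-specific rates into data, e.g. one Leslie-style projection matrix per population. Each time step is then a single matrix product, so adding age classes or populations changes only the data, not the loop nesting:
# One projection matrix per population; state vector is (juvenile, adult).
# Row 1: juveniles next step = 2 (or 3) per adult.
# Row 2: adults next step = juveniles + surviving adults.
L <- list(matrix(c(0, 2, 1, 0.5),  2, 2, byrow = TRUE),   # population 1
          matrix(c(0, 3, 1, 0.25), 2, 2, byrow = TRUE))   # population 2
a2 <- array(0, dim = c(10, 2, 2),
            dimnames = list(1:10, c("juvenile", "adult"), 1:2))
a2[1, 2, ] <- c(2, 4)  # starting adults at t = 1
for (tt in 2:10)
  for (p in 1:2)
    a2[tt, , p] <- L[[p]] %*% a2[tt - 1, , p]
For the rates above this reproduces the same numbers as f1; with more dimensions you would extend the matrices (or a lookup table of rates) rather than the loop body.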

Find local minimum in a vector with r

Taking the ideas from the following links:
the local minimum between the two peaks
How to explain ...
I am looking for the local minimum or minimums, avoiding the use of functions already created for this purpose (local or global max/min finders).
Our progress:
#DATA
simulate <- function(lambda=0.3, mu=c(0, 4), sd=c(1, 1), n.obs=10^5) {
x1 <- rnorm(n.obs, mu[1], sd[1])
x2 <- rnorm(n.obs, mu[2], sd[2])
return(ifelse(runif(n.obs) < lambda, x1, x2))
}
data <- simulate()
hist(data)
d <- density(data)
#
#https://stackoverflow.com/a/25276661/8409550
##Since the x-values are equally spaced, we can estimate dy using diff(d$y)
d$x[which.min(abs(diff(d$y)))]
#With our data we did not obtain the expected value
#
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
#we check
#1
optimize(approxfun(d$x,d$y),interval=c(0,4))$minimum
optimize(approxfun(d$x,d$y),interval=c(0,4),maximum = TRUE)$maximum
#2
tp <- pastecs::turnpoints(d$y)
summary(tp)
ind <- (1:length(d$y))[extract(tp, no.tp = FALSE, peak = TRUE, pit = TRUE)]
d$x[ind[2]]
d$x[ind[1]]
d$x[ind[3]]
My questions and request for help:
Why did this command line fail:
d$x[which.min(abs(diff(d$y)))]
Is it possible to eliminate the need to add one to the index in these command lines:
d$x[which(diff(sign(diff(d$y)))>0)+1]#pit
d$x[which(diff(sign(diff(d$y)))<0)+1]#peak
How to get the optimize function to return the two expected maximum values?
Question 1
The answer to the first question is straightforward. The line d$x[which.min(abs(diff(d$y)))] asks for the x value at which the change in y between two consecutive points was smallest. The answer is that this happened at the extreme right of the plot, where the density curve is essentially flat:
which.min(abs(diff(d$y)))
#> [1] 511
length(abs(diff(d$y)))
#> [1] 511
This is not only smaller than the difference at your local maxima /minima points; it is orders of magnitude smaller. Let's zoom in to the peak value of d$y, including only the peak and the point on each side:
which.max(d$y)
#> [1] 324
plot(d$x[323:325], d$y[323:325])
We can see that the smallest difference between two consecutive points here is around 0.00005, i.e. 5e-5. Now look at the end of the plot, where it is flattest:
plot(d$x[510:512], d$y[510:512])
The difference there is about 1e-7, which is why this is the flattest point.
Question 2
The answer to your second question is "no, not really". You are taking a double diff, which is two elements shorter than x: if x is n elements long, the double diff corresponds to elements 2 to (n - 1) of x. You can remove the +1 from the index, but then you will have an off-by-one error. If you really wanted to, you could concatenate dummy zeros at each stage of the diff, like this:
d$x[which(c(0, diff(sign(diff(c(d$y, 0))))) > 0)]
which gives the same result, but this is longer, harder to read and harder to justify, so why would you?
Question 3
The answer to the third question is that you could use the "pit" as the dividing point between the minimum and maximum value of d$x to find the two "peaks". If you really want a single call to get both at once, you could do it inside an sapply:
pit <- optimize(approxfun(d$x,d$y),interval=c(0,4))$minimum
peaks <- sapply(1:2, function(i) {
optimize(approxfun(d$x, d$y),
interval = c(min(d$x), pit, max(d$x))[i:(i + 1)],
maximum = TRUE)$maximum
})
pit
#> [1] 1.691798
peaks
#> [1] -0.02249845 3.99552521

How to plot the next period value of a variable against its past period in R

I am trying to plot a difference equation using R. I have the following MWE:
N <- 10 #periods
time <- c(0:N)
x <- rep(0,N)
x[1] <- 0.5 #initial value
E <- rep(0,N)
E[1] <- 1
#Parameters
c <- 1
K <- 1
p <- 200
q <- 0.01
g <- 0.1
eta <- 0.3
for (t in 1:N) {
x[t+1] <- (1 + g - g*x[t]/K - q*E[t])*x[t]
E[t+1] <- (1 + eta*(p*q*x[t]-c))*E[t]
}
p1 <- plot(x[t], x[t+1]) #I try this but clearly this does not work
p2 <- plot(x[0:N], x[1:N+1]) #this produces a diagram where x=y, a 45 degree line
p1 and p2 produce the plots discussed in the answer below (images omitted).
I am expecting the graph to look like a stable spiral. My question is, how do you properly plot the next period value of a variable against the current period? I am looking to plot
$$x_{t+1}(x_{t})$$
I see two problems:
1. The N in your example is too small. To see the stable spiral, set N to 100 or 1000.
2. Your plots show exactly what I would expect, but I don't think they do what you want them to do ;)
p1 plots the second-to-last element of x, x[t] (t equals N after the for loop), on the x axis against the last element of x -> a single point.
p2 plots all elements of x except the last against all elements of x except the first, i.e. the series against itself at lag 1 -> the points form almost a diagonal.
But if you plot any of:
plot(time, E)
plot(time, x)
plot(E, x)
you get interesting plots, the last one drawing a spiral!
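Putting both points together, a sketch (same names as in the question, except that c is renamed cc here to avoid masking R's c() function): increase N and plot the series against its own lag to get the x[t+1] versus x[t] picture the question asks for.
N <- 1000
x <- numeric(N + 1); x[1] <- 0.5  # initial value
E <- numeric(N + 1); E[1] <- 1
cc <- 1; K <- 1; p <- 200; q <- 0.01; g <- 0.1; eta <- 0.3
for (t in 1:N) {
  x[t + 1] <- (1 + g - g*x[t]/K - q*E[t]) * x[t]
  E[t + 1] <- (1 + eta*(p*q*x[t] - cc)) * E[t]
}
# lagged phase plot: current period on x axis, next period on y axis
plot(x[1:N], x[2:(N + 1)], type = "l", xlab = "x[t]", ylab = "x[t+1]")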

joining two Bézier curves smoothly (C2 continuous)

(Follow-up of this question.)
Given a sequence of cubic Bézier curves, how can I modify them minimally to make them join in a C2-continuous way?
Input:
curve P with control points P0, P1, P2, P3
curve Q with control points Q0, Q1, Q2, Q3
If it helps, you can assume that they are already C1 continuous.
Constraints:
C0 continuity: P3 = Q0
C1 continuity: P2 - P3 = Q0 - Q1
C2 continuity: P1 - 2*P2 + P3 = Q0 - 2*Q1 + Q2
modified curves as close as possible to original curves P and Q
Getting the modified curves as close as possible to the originals can have multiple interpretations, but one reasonable choice is to keep the endpoints, and the tangent handles far from the join, fixed. So the points P0, P1, P3 = Q0, Q2, Q3 are held constant.
We can change the origin such that P3 = Q0 = 0; enforcing C2 continuity can then be expressed as:
P1 - 2*P2 = Q2 - 2*Q1
One can express P2 = a*e^(i*r) and Q1 = -b*e^(i*r) in complex representation (by C1 continuity the two handles point in opposite directions from the join, so keeping a common angle r preserves it). Compute
(P1 - Q2)/2 = c*e^(i*s)
Enforcing C2 continuity then amounts to choosing r = s and finding a combination of a and b such that a + b = c. There are infinitely many solutions, but one might use heuristics such as changing whichever of a and b is smaller (thus producing less noticeable changes).
If that's not producing sufficiently small variations, try a two-step optimisation: first move P1 and Q2 to bring s closer to r, then apply the steps above.
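A minimal sketch of the fully constrained special case (an assumption beyond the heuristics above: C1 is kept exactly by forcing Q1 = -P2, with the origin translated to the join P3 = Q0 = 0). The two continuity equations then determine the join handles uniquely:
# Control points as complex numbers, origin at the join (P3 = Q0 = 0).
# C1: Q1 = -P2.  C2: P1 - 2*P2 = Q2 - 2*Q1  =>  P2 = (P1 - Q2)/4.
fix_c2 <- function(P1, Q2) {
  P2 <- (P1 - Q2) / 4
  list(P2 = P2, Q1 = -P2)
}

# hypothetical far handles:
h <- fix_c2(P1 = complex(real = -2, imaginary = 1),
            Q2 = complex(real =  2, imaginary = 1))
# check: h$P2 - h$Q1 equals (P1 - Q2)/2, so the C2 equation holds
In the notation above this corresponds to r = s and a = b = c/2; the heuristic of moving only the smaller of a and b distributes the change differently while still keeping a + b = c.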

How to use the sum function in a for loop in R?

We want to calculate the value of an integral under a linearly decreasing curve.
For a better understanding, look at the photo. Let's say the overall area is 1. We want to find the value of a certain part. For instance, we want to know how much of the overall 100% lies between the 10th and the 11th month, if everything refers to months and the maximum A stands for 24.
We can calculate an integral and should then be able to get the searched area via F(x) - F(x-1).
I thought about the following code:
a <- 24
tab <-matrix(0,a,1)
tab <-cbind(seq(1,a),tab)
tab<-data.frame(tab)
#initialization for first point
tab[1,2] <- (2*tab[1,1] / a - tab[1,1]^2 / a^2)
#for loop: integral at each point minus the area already accumulated
for (i in 2:nrow(tab)) {
  tab[i,2] <- (2*tab[i,1] / a - tab[i,1]^2 / a^2) - sum(tab[1,2]:tab[i-1,2])
}
#plotting
plot(tab[,2], type="l")
If you look at the plot, it's confusing. Any ideas how to handle this correctly?
The base R function integrate() can do this for you. Note that the density implied by the question's F(x) = 2x/a - x^2/a^2 is f(x) = 2/a - 2x/a^2:
f <- function(x, A) 2/A - 2*x / A^2
integrate(function(x) f(x, 24), lower=10, upper=11)
This evaluates to 27/576 = 0.046875, matching z[11] below.
Using the formulas directly:
a <- 24 # number of divisions
x <- seq(1, a)
y <- x*2/a - x^2/a^2 # F(x)
z <- (x*2/a - x^2/a^2) - ((x-1)*2/a - (x-1)^2/a^2) # F(x) - F(x-1)
Then bind x, y and z together afterward if you want them in one table.
> sum(z)
[1] 1
