Wrong value when copying from global to private memory - opencl

I am currently learning OpenCL and I have a kernel that works fine when it accesses the global array directly, but gives wrong results when it goes through an intermediate value in private memory, for example aux in the code below.
__kernel void kernel_cte(__global float *U0,__global float *U1,__constant float *VP0, uint stride, uint nnoi, __constant float *g_W, uint k0, uint k1, float FATMDFX, float FATMDFY, float FATMDFZ) {
    uint index = get_global_id(1)*nnoi + get_global_id(0) + k0 * stride;
    uint k;
    float aux;
    aux = U0[index+1];
    for(k=k0;k<k1;++k) {
        if(VP0[index] > 0.0f){
            U1[index] = 2.0f * U0[index] - U1[index]
                + FATMDFX * VP0[index] * VP0[index] * (
                    + g_W[6] * (U0[index - 6] + U0[index + 6])
                    + g_W[5] * (U0[index - 5] + U0[index + 5])
                    + g_W[4] * (U0[index - 4] + U0[index + 4])
                    + g_W[3] * (U0[index - 3] + U0[index + 3])
                    + g_W[2] * (U0[index - 2] + U0[index + 2])
                    + g_W[1] * (U0[index - 1] + aux)
                    + g_W[0] * U0[index]
                )
                + FATMDFY * VP0[index] * VP0[index] * (
                    + g_W[6] * (U0[index - 6 * nnoi] + U0[index + 6 * nnoi])
                    + g_W[5] * (U0[index - 5 * nnoi] + U0[index + 5 * nnoi])
                    + g_W[4] * (U0[index - 4 * nnoi] + U0[index + 4 * nnoi])
                    + g_W[3] * (U0[index - 3 * nnoi] + U0[index + 3 * nnoi])
                    + g_W[2] * (U0[index - 2 * nnoi] + U0[index + 2 * nnoi])
                    + g_W[1] * (U0[index - nnoi] + U0[index + nnoi])
                    + g_W[0] * U0[index]
                )
                + FATMDFZ * VP0[index] * VP0[index] * (
                    + g_W[6] * (U0[index + 6 * stride] + U0[index - 6 * stride])
                    + g_W[5] * (U0[index + 5 * stride] + U0[index - 5 * stride])
                    + g_W[4] * (U0[index + 4 * stride] + U0[index - 4 * stride])
                    + g_W[3] * (U0[index + 3 * stride] + U0[index - 3 * stride])
                    + g_W[2] * (U0[index + 2 * stride] + U0[index - 2 * stride])
                    + g_W[1] * (U0[index + stride] + U0[index - stride])
                    + g_W[0] * U0[index]
                );
        } // end if
        index += stride;
    }
}
I would like to use vectors to perform these calculations but I can't understand why the correct value isn't copied to the private memory when I do aux = U0[index+1].

If each work-item works on its own data, the only thing it needs is a fence to commit its global memory operations when it reads and modifies the same locations several times within one kernel.
For example, U1[index] in the code below needs to be committed to global memory if it is not meant to be cached.
mem_fence(CLK_GLOBAL_MEM_FENCE);
if(VP0[index] > 0.0f){
    U1[index] = 2.0f * U0[index] - U1[index]
        + FATMDFX * VP0[index] * VP0[index] * (
            + g_W[6] * (U0[index - 6] + U0[index + 6])
            + g_W[5] * (U0[index - 5] + U0[index + 5])
            + g_W[4] * (U0[index - 4] + U0[index + 4])
            + g_W[3] * (U0[index - 3] + U0[index + 3])
            + g_W[2] * (U0[index - 2] + U0[index + 2])
            + g_W[1] * (U0[index - 1] + aux)
            + g_W[0] * U0[index]
        )
        + FATMDFY * VP0[index] * VP0[index] * (
            + g_W[6] * (U0[index - 6 * nnoi] + U0[index + 6 * nnoi])
            + g_W[5] * (U0[index - 5 * nnoi] + U0[index + 5 * nnoi])
            + g_W[4] * (U0[index - 4 * nnoi] + U0[index + 4 * nnoi])
            + g_W[3] * (U0[index - 3 * nnoi] + U0[index + 3 * nnoi])
            + g_W[2] * (U0[index - 2 * nnoi] + U0[index + 2 * nnoi])
            + g_W[1] * (U0[index - nnoi] + U0[index + nnoi])
            + g_W[0] * U0[index]
        )
        + FATMDFZ * VP0[index] * VP0[index] * (
            + g_W[6] * (U0[index + 6 * stride] + U0[index - 6 * stride])
            + g_W[5] * (U0[index + 5 * stride] + U0[index - 5 * stride])
            + g_W[4] * (U0[index + 4 * stride] + U0[index - 4 * stride])
            + g_W[3] * (U0[index + 3 * stride] + U0[index - 3 * stride])
            + g_W[2] * (U0[index + 2 * stride] + U0[index - 2 * stride])
            + g_W[1] * (U0[index + stride] + U0[index - stride])
            + g_W[0] * U0[index]
        );
mem_fence(CLK_GLOBAL_MEM_FENCE);
This is because either the GPU's out-of-order instruction execution or the compiler can reorder reads and writes on its own; a fence or barrier stops that and keeps the order the developer needs.
If work-items are meant to alter each other's data regions, then at least barrier() is needed, and it only works within a single work-group (block).

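I found the problem, and it was very obvious: the read aux = U0[index+1] must be performed inside the for loop. index advances by stride on every iteration, so a value read once before the loop only matches the first iteration.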
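The effect can be reproduced outside OpenCL. The following is only a minimal Python sketch of the same pattern, not the original kernel: the function name stencil_pair_sums, the hoist_read flag, and the toy data are all made up for illustration. It compares reading the neighbour once before the loop with re-reading it on every iteration while index advances by stride.

```python
def stencil_pair_sums(u0, start, stride, steps, hoist_read):
    """Sum the left and right neighbours of u0[index] for several indices.

    hoist_read=True mimics the buggy kernel: the right neighbour is read
    once before the loop and reused even though index keeps changing.
    """
    index = start
    aux = u0[index + 1]  # read once, valid only for the first index
    out = []
    for _ in range(steps):
        if not hoist_read:
            aux = u0[index + 1]  # re-read for the current index
        out.append(u0[index - 1] + aux)
        index += stride
    return out


u0 = list(range(10))  # toy data: u0[i] == i
correct = stencil_pair_sums(u0, start=1, stride=3, steps=3, hoist_read=False)
stale = stencil_pair_sums(u0, start=1, stride=3, steps=3, hoist_read=True)
print(correct)  # neighbours re-read each iteration
print(stale)    # first entry matches, later ones use the stale value
```

Only the first iteration agrees between the two variants, which is the symptom described in the question.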

Related

Finding RuntimeWarning: overflow encountered in double_scalars while running numerical schemes for fluid dynamic study

I'm writing code to solve hyperbolic differential equations with different numerical methods such as the Lax-Friedrichs, Lax-Wendroff and upwind schemes. During the calculation I often get this warning:
RuntimeWarning: overflow encountered in double_scalars
It seems to disappear when I reduce the dimensions of the matrices. Here is my code:
for i in range (0,nt):
    #inlet
    rho[0,i] = P_inlet/(R*T_inlet)
    u[0,i] = u_inlet
    P[0,i] = P_inlet
    T[0,i] = T_inlet
    Ac[0,0] = A_var_list[0]
    Q1[0,i] = rho[0,i]
    Q2[0,i] = rho[0,i] * u[0,i]
    Q3[0,i] = (1/2)*(rho[0,i])*(u[0,i]**2) + (P[0,i]/(k-1))
    F1[0,i] = rho[0,i] * u[0,i]
    F2[0,i] = (1/2)*(rho[0,i])*(u[0,i]**2) + P[0,i]
    F3[0,i] = u[0,i] * ((1/2)*(rho[0,i])*(u[0,i]**2) + (k*P[0,i]/(k-1)))
    #outlet
    rho[nx-1,i] = rho_outlet
    P[nx-1,i] = P_outlet
    u[nx-1,i] = u_outlet
    T[nx-1,i] = T_outlet
    Q1[nx-1,i] = rho[nx-1,i]
    Q2[nx-1,i] = rho[nx-1,i]*u[nx-1,i]
    Q3[nx-1,i] = (1/2)*rho[nx-1,i]*u[nx-1,i] + (P[nx-1,i]/(k-1))
    F1[nx-1,i] = rho[nx-1,i] * u[nx-1,i]
    F2[nx-1,i] = (1/2)*rho[nx-1,i]*(u[nx-1,i]**2) + P[nx-1,i]
    F3[nx-1,i] = u[nx-1,i] * ((1/2)*(rho[nx-1,i])*(u[nx-1,i]**2) + (k*P[nx-1,i]/(k-1)))
#manifold
for i in range (1,nx-1):
    rho[i,0] = P_inlet/(R*Tw[i])
    u[i,0] = u_inlet
    P[i,0] = P_inlet
    Ac[i,0] = A_var_list[i]
    Q1[i,0] = rho[i,0]
    Q2[i,0] = rho[i,0] * u[i,0]
    Q3[i,0] = (1 / 2) * (rho[i,0]) * (u[i,0] ** 2) + (P[i,0] / (k - 1))
    F1[i, 0] = rho[i, 0] * u[i, 0]
    F2[i, 0] = (1 / 2) * (rho[i, 0]) * (u[i, 0] ** 2) + P[i, 0]
    F3[i, 0] = u[i, 0] * ((1 / 2) * (rho[i, 0]) * (u[i, 0] ** 2) + (k * P[i, 0] / (k - 1)))
    S1[i, 0] = -rho[i, 0] * u[i, 0] * (Ac[i, 0] - Ac[i - 1, 0])
    S2[i, 0] = -(rho[i, 0] * ((u[i, 0] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - (
        (frict_fact * np.pi * rho[i, 0] * d[i] * u[i, 0] ** 2) / (2 * Ac[i, 0]))
    S3[i, 0] = - (u[i, 0] * (rho[i, 0] * ((u[i, 0] ** 2) / 2) + (k * P[i, 0] / (k - 1))) * (
        (Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
def Upwind():
    for n in range (0,nt-1):
        for i in range (1,nx):
            Q1[i,n+1] = Q1[i-1,n]-((F1[i,n] - F1[i-1,n])/Dx)*Dt + (S1[i,n]-S1[i-1,n])*Dt
            Q2[i, n + 1] = Q2[i-1, n] - ((F2[i, n] - F2[i - 1, n]) / Dx) * Dt + (S2[i, n] - S2[i - 1, n]) * Dt
            Q3[i, n + 1] = Q3[i-1, n] - ((F3[i, n] - F3[i - 1, n]) / Dx) * Dt + (S3[i, n] - S3[i - 1, n]) * Dt
            rho[i, n+1] = Q1[i, n+1]
            u[i, n+1] = Q2[i, n+1] / rho[i, n+1]
            P[i, n+1] = (Q3[i, n+1] - 0.5 * rho[i, n+1] * u[i, n+1] ** 2) * (k - 1)
            T[i, n+1] = P[i, n+1] / (R * rho[i, n+1])
            F1[i,n+1] = Q2[i,n+1]
            F2[i,n+1] = rho[i,n+1]*((u[i,n+1]**2)/2) +P[i,n+1]
            F3[i, n + 1] = u[i, n + 1] * (
                (rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2)) + (k * P[i , n + 1] / (k - 1)))
            S1[i, n + 1] = -rho[i, n + 1] * u[i, n + 1] * (Ac[i, 0] - Ac[i-1, 0])
            S2[i, n + 1] = - (rho[i, n + 1] * (
                (u[i, n + 1] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i-1, 0])) - ((
                (frict_fact * np.pi * rho[i, n + 1] * d[i] * (u[i, n + 1] ** 2)) / (2 * Ac[i, 0])))
            S3[i, n + 1] = -(u[i, n + 1] * (
                rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2) + (k * P[i, n + 1] / (k - 1))) * (
                (Ac[i , 0] - Ac[i-1, 0]) / Ac[i, 0])) + (
                Lambda * np.pi * d[i ] * (Tw[i] - T[i, 0]) / Ac[i, 0])
    plt.figure(1)
    plt.plot(P[:, nt - 1])
    plt.figure(2)
    plt.plot(u[:, nt - 1])
def Lax_Friedrichs():
    for n in range (1,nt):
        for i in range (1,nx-1):
            F1_m1 = 0.5 * (F1[i, n - 1] + F1[i - 1, n - 1])
            F2_m1 = 0.5 * (F2[i, n - 1] + F2[i - 1, n - 1])
            F3_m1 = 0.5 * (F3[i, n - 1] + F3[i - 1, n - 1])
            S1_m1 = 0.5 * (S1[i, 0] + S1[i - 1, 0])
            S2_m1 = 0.5 * (S2[i, 0] + S2[i - 1, 0])
            S3_m1 = 0.5 * (S3[i, 0] + S3[i - 1, 0])
            F1_p1 = 0.5 * (F1[i + 1, n - 1] + F1[i, n - 1])
            F2_p1 = 0.5 * (F2[i + 1, n - 1] + F2[i, n - 1])
            F3_p1 = 0.5 * (F3[i + 1, n - 1] + F3[i, n - 1])
            S1_p1 = 0.5 * (S1[i + 1, n - 1] + S1[i, n - 1])
            S2_p1 = 0.5 * (S2[i + 1, n - 1] + S2[i, n - 1])
            S3_p1 = 0.5 * (S3[i + 1, n - 1] + S3[i, n - 1])
            Q1[i, n] = 0.5 * (Q1[i - 1, n - 1] + Q1[i + 1, n - 1]) - Dt/Dx * (F1_p1 - F1_m1) + (S1_p1 - S1_m1) * Dt
            Q2[i, n] = 0.5 * (Q2[i - 1, n - 1] + Q2[i + 1, n - 1]) - Dt/Dx * (F2_p1 - F2_m1) + (S2_p1 - S2_m1) * Dt
            Q3[i, n] = 0.5 * (Q3[i - 1, n - 1] + Q3[i + 1, n - 1]) - Dt/Dx * (F3_p1 - F3_m1) + (S3_p1 - S3_m1) * Dt
            rho[i, n] = Q1[i, n]
            u[i, n] = Q2[i, n] / rho[i, n]
            P[i, n] = (Q3[i, n] - 0.5 * rho[i, n] * u[i, n] ** 2) * (k - 1)
            T[i, n] = P[i, n] / (R * rho[i, n])
            F1[i, n] = Q2[i, n]
            F2[i, n] = rho[i, n] * ((u[i, n] ** 2) / 2) + P[i, n]
            F3[i, n] = u[i, n] * (
                (rho[i, n] * ((u[i, n] ** 2) / 2)) + (k * P[i, n] / (k - 1)))
            S1[i, n] = -rho[i, n] * u[i, n] * (Ac[i, 0] - Ac[i - 1, 0])
            S2[i, n] = - (rho[i, n] * (
                (u[i, n] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - ((
                (frict_fact * np.pi * rho[i, n] * d[i] * (u[i, n] ** 2)) / (2 * Ac[i, 0])))
            S3[i, n] = -(u[i, n] * (
                rho[i, n] * ((u[i, n] ** 2) / 2) + (k * P[i, n] / (k - 1))) * (
                (Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (
                Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
    # Plot
    plt.figure(1)
    plt.plot(P[:, nt - 1])
    plt.figure(2)
    plt.plot(u[:, nt - 1])
def Lax_Wendroff():
    for n in range (0,nt-1):
        for i in range (1,nx-1):
            Q1_plus_half = (1 / 2) * (Q1[i, n] + Q1[i + 1, n]) - (Dt / (2 * Dx)) * (F1[i + 1, n] - F1[i, n]) + (
                S1[i + 1, n] - S1[i, n]) * Dt
            Q1_less_half = (1 / 2) * (Q1[i, n] + Q1[i - 1, n]) - (Dt / (2 * Dx)) * (F1[i, n] - F1[i - 1, n]) + (
                S1[i, n] - S1[i - 1, n]) * Dt
            Q2_plus_half = (1 / 2) * (Q2[i-1, n] + Q2[i + 1, n]) - (Dt / (2 * Dx)) * (F2[i + 1, n] - F2[i, n]) + (
                S2[i + 1, n] - S2[i, n]) * Dt
            Q2_less_half = (1 / 2) * (Q2[i, n] + Q2[i - 1, n]) - (Dt / (2 * Dx)) * (F2[i, n] - F2[i - 1, n]) + (
                S2[i, n] - S2[i - 1, n]) * Dt
            Q3_plus_half = (1 / 2) * (Q3[i, n] + Q3[i + 1, n]) - (Dt / (2 * Dx)) * (F3[i + 1, n] - F3[i, n]) + (
                S3[i + 1, n] - S3[i, n]) * Dt
            Q3_less_half = (1 / 2) * (Q3[i, n] + Q3[i - 1, n]) - (Dt / (2 * Dx)) * (F3[i, n] - F3[i - 1, n]) + (
                S3[i, n] - S3[i - 1, n]) * Dt
            rho_less_half = Q1_less_half
            u_less_half = Q2_less_half / rho_less_half
            P_less_half = (Q3_less_half - ((1 / 2) * rho_less_half * (u_less_half ** 2) / 2)) * (k - 1)
            F1_less_half = rho_less_half * u_less_half
            F2_less_half = rho_less_half * ((u_less_half ** 2) / 2) + P_less_half
            F3_less_half = u_less_half * ((rho_less_half * ((u_less_half ** 2) / 2)) + (k * P_less_half / (k - 1)))
            rho_plus_half = Q1_plus_half
            u_plus_half = Q2_plus_half / rho_plus_half
            P_plus_half = (Q3_plus_half - ((1 / 2) * rho_plus_half * (u_plus_half ** 2) / 2)) * (k - 1)
            F1_plus_half = rho_plus_half * u_plus_half
            F2_plus_half = rho_plus_half * ((u_plus_half ** 2) / 2) + P_plus_half
            F3_plus_half = u_plus_half * ((rho_plus_half * ((u_plus_half ** 2) / 2)) + (k * P_plus_half / (k - 1)))
            # The source terms that go into the final equation for Q are computed as averages of the variables in the duct
            S1_less_half = 0.5 * (S1[i - 1, n] + S1[i, n])
            S2_less_half = 0.5 * (S2[i - 1, n] + S2[i, n])
            S3_less_half = 0.5 * (S3[i - 1, n] + S3[i, n])
            S1_plus_half = 0.5 * (S1[i + 1, n] + S1[i, n])
            S2_plus_half = 0.5 * (S2[i + 1, n] + S2[i, n])
            S3_plus_half = 0.5 * (S3[i + 1, n] + S3[i, n])
            """S1_less_half = Q1_less_half + F1_less_half
            S2_less_half = Q2_less_half + F2_less_half
            S3_less_half = Q3_less_half + F3_less_half
            S1_plus_half = Q1_plus_half + F1_plus_half
            S2_plus_half = Q2_plus_half + F2_plus_half
            S3_plus_half = Q3_plus_half + F3_plus_half"""
            Q1[i , n + 1] = Q1[i, n] - (Dt / Dx) * (F1_plus_half - F1_less_half) - (S1_plus_half - S1_less_half) * Dt
            Q2[i, n + 1] = Q2[i, n] - (Dt / Dx) * (F2_plus_half - F2_less_half) - (S2_plus_half - S2_less_half) * Dt
            Q3[i, n + 1] = Q3[i, n] - (Dt / Dx) * (F3_plus_half - F3_less_half) - (S3_plus_half - S3_less_half) * Dt
            rho[i, n + 1] = Q1[i, n + 1]
            u[i, n + 1] = Q2[i, n + 1] / rho[i, n + 1]
            P[i, n + 1] = (Q3[i, n + 1] - 0.5 * rho[i, n + 1] * (u[i, n + 1] ** 2)) * (k - 1)
            F1[i, n + 1] = rho[i, n + 1] * u[i, n + 1]
            F2[i, n + 1] = rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2) + P[i, n + 1]
            F3[i, n+1] = u[i, n+1] * (
                (rho[i, n+1] * ((u[i, n+1] ** 2) / 2)) + (k * P[i, n+1] / (k - 1)))
            S1[i, n+1] = -rho[i, n+1] * u[i, n+1] * (Ac[i, 0] - Ac[i - 1, 0])
            S2[i, n+1] = - (rho[i, n+1] * (
                (u[i, n+1] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - ((
                (frict_fact * np.pi * rho[i, n+1] * d[i] * (u[i, n+1] ** 2)) / (2 * Ac[i, 0])))
            S3[i, n+1] = -(u[i, n+1] * (
                rho[i, n+1] * ((u[i, n+1] ** 2) / 2) + (k * P[i, n+1] / (k - 1))) * (
                (Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (
                Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
    # Plot
    plt.figure(1)
    plt.plot(P[:, nt - 1])
    plt.figure(2)
    plt.plot(u[:, nt - 1])
I'm pretty sure it's a matter of indices, but I haven't found the solution yet. I hope you can help me.

empty argument error in rootSolve package in R

I am using rootSolve package in R to solve a system of 6 non-linear equations with 6 unknown variables. Here is my model
model <- function(x, parms) c(F1 = x[1] - parms[1] - 1 / ((-parms[7]) * (1 - x[4])),
F2 = x[2] - parms[2] - 1 / ((-parms[7]) * (1 - x[5])),
F3 = x[3] - parms[3] - 1 / ((-parms[7]) * (1 - x[6])),
F4 = x[4] - exp(parms[4] + parms[7] * x[1]) / (1 + exp(parms[4] + parms[7] * x[1]) + exp(parms[5] + parms[7] * x[2]) + exp(parms[6] + parms[7] * x[3])),
F5 = x[5] - exp(parms[5] + parms[7] * x[2]) / (1 + exp(parms[4] + parms[7] * x[1]) + exp(parms[5] + parms[7] * x[2]) + exp(parms[6] + parms[7] * x[3])),
F6 = x[6] - exp(parms[6] + parms[7] * x[3]) / (1 + exp(parms[4] + parms[7] * x[1]) + exp(parms[5] + parms[7] * x[2]) + exp(parms[6] + parms[7] * x[3])),
)
But when I call
new.equi = multiroot(model, start = initial.value, parms = parm)
where I pass value to initial.value and parm, I keep getting the error of
Error in c(F1 = x[1] - parms[1] - 1/((-parms[7]) * (1 - x[4])), F2 = x[2] - : argument 7 is empty
Why is this happening? Why should there be argument 7?
I also tried to specify parameters in the model explicitly, like this, but still get the same error.
model <- function(x) c(F1 = x[1] - 1.265436 - 1 / (2.443700 * (1 - x[4])),
F2 = x[2] - 1.195844 - 1 / (2.443700 * (1 - x[5])),
F3 = x[3] - 1.288660 - 1 / (2.443700 * (1 - x[6])),
F4 = x[4] - exp(4.600528 - 2.443700 * x[1]) / (1 + exp(4.600528 - 2.443700 * x[1]) + exp(3.924360 - 2.443700 * x[2]) + exp(4.643808 - 2.443700 * x[3])),
F5 = x[5] - exp(3.924360 - 2.443700 * x[2]) / (1 + exp(4.600528 - 2.443700 * x[1]) + exp(3.924360 - 2.443700 * x[2]) + exp(4.643808 - 2.443700 * x[3])),
F6 = x[6] - exp(4.643808 - 2.443700 * x[3]) / (1 + exp(4.600528 - 2.443700 * x[1]) + exp(3.924360 - 2.443700 * x[2]) + exp(4.643808 - 2.443700 * x[3])),
)
You have a trailing comma after F6, so c() receives an empty seventh argument. E.g.:
> c(1,2,)
Error in c(1, 2, ) : argument 3 is empty

How does B(t) of a bezier curve move when P1/P2 move along tangent?

When P1 changes from (0,4) to (0,2), Q1(t=0.5) and Q2(t=0.6) move 0.75 and 0.576 respectively. How can I calculate for any B(t) the distance it moves when P1 or P2 move along (Start--P1) or (P2--End) respectively?
Just write Bezier curve expression:
B(t) = P0 * (1-t)^3 + P1 * 3 * t * (1-t)^2 + P2 * 3 * t^2 * (1-t) + P3 * t^3
Let P1' be the new position of the P1 control point. Only the second term changes, so
DeltaB(t) = B'(t) - B(t) = (P1' - P1) * 3 * t * (1-t)^2
If P1' lies on the line P0-P1, then
P1' = P0 + (P1 - P0) * u
DeltaB(t) = (P0 + (P1 - P0) * u - P1) * 3 * t * (1-t)^2 =
(P0 - P1) * (1 - u) * 3 * t * (1-t)^2
For your example data
u = 0.5
(P0 - P1) * (1 - u) = (0, -2) // (x,y) components of vector
DeltaB(0.5) = (0, -2 * 3 * 0.5 * 0.25) = (0, -0.75)
DeltaB(0.6) = (0, -2 * 3 * 0.6 * 0.4 * 0.4) = (0, -0.576)
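The displacement formula above can be checked numerically. This is a small Python sketch, not part of the original answer: the function name bezier_delta_p1 is made up, and P0 = (0, 0) is inferred from the example data, where (P0 - P1) * (1 - u) = (0, -2) with P1 = (0, 4) and u = 0.5.

```python
import math


def bezier_delta_p1(p0, p1, u, t):
    """Displacement of B(t) when P1 moves to P1' = P0 + (P1 - P0) * u.

    Implements DeltaB(t) = (P0 - P1) * (1 - u) * 3 * t * (1 - t)^2.
    """
    weight = (1.0 - u) * 3.0 * t * (1.0 - t) ** 2
    return ((p0[0] - p1[0]) * weight, (p0[1] - p1[1]) * weight)


p0 = (0.0, 0.0)  # assumed start point, inferred from the example
p1 = (0.0, 4.0)  # original control point; u = 0.5 moves it to (0, 2)

for t in (0.5, 0.6):
    dx, dy = bezier_delta_p1(p0, p1, u=0.5, t=t)
    print(t, (dx, dy), math.hypot(dx, dy))  # vector and distance moved
```

By symmetry, moving P2 along (P2--End) should use the weight 3 * t^2 * (1 - t) with P3 - P2 in place of P0 - P1.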

Relative position of a point within a quadrilateral

I am trying to find the easiest way to determine a relative position of a point within a quadrilateral. The known are (see figure) the positions of points 1, 2, 3, 4 and 5 in the xy-coordinate system: x1, y1, x2, y2, x3, y3, x4, y4, x5, y5.
Also known are the positions of points 1, 2, 3, and 4 in the ξ-η coordinate systems (see figure).
From this data, I want to determine what are the ξ and η for point 5.
Results
Thank you to all who answered! I find the solution by #dbc and #agentp similar. Also I find this solution better than the perspective transformation solution by #MBo, since I do not have to compute the inverse of a matrix (Ax=B --> x=inv(A)*B).
I get the following result for:
u = 0.5 * (ξ + 1)
v = 0.5 * (η + 1)
In my case all points are within the rectangle, therefore u>0 and v>0.
What you have here is a 2d bilinear blended surface. For simplicity, let's change its coordinates to range from zero to one:
u = 0.5 * (ξ + 1)
v = 0.5 * (η + 1)
In that case, the surface evaluator can be expressed as
F(u, v) = P1 + u * (P2 - P1) + v * ((P4 + u * (P3 - P4)) - (P1 + u * (P2 - P1)))
I.e., for a given u, construct a line passing through the following two points:
Pv0 = P1 + u * (P2 - P1);
Pv1 = P4 + u * (P3 - P4);
then interpolate between them for the given v
F(u, v) = Pv0 + v * (Pv1 - Pv0)
What you seek are values (u,v) such that F(u, v) = P5. This will occur for given u when the line from Pv0 to Pv1 passes through P5, which will occur when P5 - Pv0 is parallel to Pv1 - Pv0 -- i.e. when their 2d cross is zero:
cross2d(P5 - Pv0, Pv1 - Pv0) = 0
⇒
cross2d(P5 - (P1 + u * (P2 - P1)),
P4 + u * (P3 - P4) - (P1 + u * (P2 - P1))) = 0
Now, the 2d cross of two 2d vectors A ⨯ B is given by Ax*By - Ay*Bx, so that equation becomes
(x5 - (x1 + u * (x2 - x1))) * (y4 + u * (y3 - y4) - (y1 + u * (y2 - y1))) - (y5 - (y1 + u * (y2 - y1))) * (x4 + u * (x3 - x4) - (x1 + u * (x2 - x1))) = 0
Expanding this expression out and collecting together terms in u, we get
u^2 * (x1*y3 - x1*y4 - x2*y3 + x2*y4 + (-x3)*y1 + x3*y2 + x4*y1 - x4*y2)
+ u * (-x1*y3 + 2*x1*y4 - x1*y5 - x2*y4 + x2*y5 + x3*y1 - x3*y5 - 2*x4*y1 + x4*y2 + x4*y5 + x5*y1 - x5*y2 + x5*y3 - x5*y4)
+ (-x1*y4 + x1*y5 + x4*y1 - x4*y5 - x5*y1 + x5*y4)
= 0
This is now a quadratic equation in u, and can be solved as such. Note that when the top and bottom edges of your quadrilateral are parallel, the quadratic degenerates into a linear equation; your quadratic equation solver must handle this.
double a = (x1 * y3 - x1 * y4 - x2 * y3 + x2 * y4 + (-x3) * y1 + x3 * y2 + x4 * y1 - x4 * y2);
double b = (-x1 * y3 + 2 * x1 * y4 - x1 * y5 - x2 * y4 + x2 * y5 + x3 * y1 - x3 * y5 - 2 * x4 * y1 + x4 * y2 + x4 * y5 + x5 * y1 - x5 * y2 + x5 * y3 - x5 * y4);
double c = (-x1 * y4 + x1 * y5 + x4 * y1 - x4 * y5 - x5 * y1 + x5 * y4);
double[] solutions = Quadratic.Solve(a, b, c);
There may be more than one solution. There might also be no solutions for a degenerate quadrilateral.
Having solved for value(s) of u, finding the equivalent v is straightforward. Given points
Pv0 = P1 + u * (P2 - P1);
Pv1 = P4 + u * (P3 - P4);
you seek v such that
v * (Pv1 - Pv0) = P5 - Pv0;
Pick the coordinate index 0 or 1 such that |(Pv1 - Pv0)[index]| is maximized. (If both coordinates are almost zero, then give up: there's no solution for this specific u.) Then set
v = (P5 - Pv0)[index] / (Pv1 - Pv0)[index];
Finally, if you have more than one solution, prefer a solution inside the [u, v] boundaries of the blend. Then finally set
ξ = 2 * u - 1;
η = 2 * v - 1;
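The whole procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the answerer's code: the name inverse_bilinear, the eps tolerance, and the grouping of the quadratic coefficients into 2d cross products are my own. With A = P5 - P1, B = P2 - P1, C = P4 - P1 and D = P3 - P4 - P2 + P1, the condition cross2d(P5 - Pv0, Pv1 - Pv0) = 0 becomes -cross2d(B, D) * u^2 + (cross2d(A, D) - cross2d(B, C)) * u + cross2d(A, C) = 0, which expands to the same a, b, c as above.

```python
import math


def cross2d(a, b):
    return a[0] * b[1] - a[1] * b[0]


def inverse_bilinear(p1, p2, p3, p4, p5, eps=1e-12):
    """Return (xi, eta) of p5 in the quad p1..p4, or None if nothing is found."""
    A = (p5[0] - p1[0], p5[1] - p1[1])
    B = (p2[0] - p1[0], p2[1] - p1[1])
    C = (p4[0] - p1[0], p4[1] - p1[1])
    D = (p3[0] - p4[0] - B[0], p3[1] - p4[1] - B[1])

    a = -cross2d(B, D)
    b = cross2d(A, D) - cross2d(B, C)
    c = cross2d(A, C)

    if abs(a) < eps:                 # parallel edges: linear equation
        if abs(b) < eps:
            return None              # degenerate quadrilateral
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return None
        s = math.sqrt(disc)
        roots = [(-b + s) / (2.0 * a), (-b - s) / (2.0 * a)]

    # Prefer a root inside the blend, i.e. 0 <= u <= 1.
    roots.sort(key=lambda u: not (0.0 <= u <= 1.0))

    for u in roots:
        num = (A[0] - u * B[0], A[1] - u * B[1])   # P5 - Pv0
        den = (C[0] + u * D[0], C[1] + u * D[1])   # Pv1 - Pv0
        idx = 0 if abs(den[0]) >= abs(den[1]) else 1
        if abs(den[idx]) < eps:
            continue                 # no solution for this u
        v = num[idx] / den[idx]
        return 2.0 * u - 1.0, 2.0 * v - 1.0
    return None


# Unit-style check on an axis-aligned square and on a distorted quad.
print(inverse_bilinear((0, 0), (2, 0), (2, 2), (0, 2), (1.5, 0.5)))
print(inverse_bilinear((0, 0), (4, 0), (4, 4), (0, 2), (2.0, 1.5)))
```

The fixed absolute tolerance eps is a simplification; for coordinates of very large or very small magnitude a relative tolerance would probably be safer.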
This looks like a standard finite element parameterization
(The question doesn't specify a particular mapping, but I imagine someone might be interested in this specific case)
{x, y} == (
(1 - eta) (1 - ci) {p1x, p1y} +
(1 - eta) (1 + ci) {p2x, p2y} +
(1 + eta) (1 + ci) {p3x, p3y} +
(1 + eta) (1 - ci) {p4x, p4y} )/4
This can be solved in closed form for {eta,ci}, but the expression is pretty unwieldy to post.
In practice, compute these constants:
ax = p1x + p2x + p3x + p4x
bx = p1x - p2x - p3x + p4x
cx = p1x + p2x - p3x - p4x
dx = p1x - p2x + p3x - p4x
ay = p1y + p2y + p3y + p4y
by = p1y - p2y - p3y + p4y
cy = p1y + p2y - p3y - p4y;
dy = p1y - p2y + p3y - p4y;
Solve this quadratic for eta :
(ax by - bx ay) - 4 (by x - bx y) +
eta (dx ay - cx by + bx cy - ax dy + 4 (x dy - dx y)) +
eta^2 (cx dy - dx cy) == 0
then get ci as:
ci = ((-ax + eta cx + 4 x)/(-bx + eta dx))
If the polygon is not too distorted just one of the solutions will satisfy -1<eta<1 and -1<ci<1
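A minimal Python sketch of this recipe is given below. The function name invert_fe_map and the eps tolerance are assumptions for illustration; the constants and the quadratic follow the formulas above, with the target point written as (x, y).

```python
import math


def invert_fe_map(p1, p2, p3, p4, x, y, eps=1e-12):
    """Return (eta, ci) for the point (x, y), or None if no root is in range."""
    (p1x, p1y), (p2x, p2y), (p3x, p3y), (p4x, p4y) = p1, p2, p3, p4
    ax = p1x + p2x + p3x + p4x
    bx = p1x - p2x - p3x + p4x
    cx = p1x + p2x - p3x - p4x
    dx = p1x - p2x + p3x - p4x
    ay = p1y + p2y + p3y + p4y
    by = p1y - p2y - p3y + p4y
    cy = p1y + p2y - p3y - p4y
    dy = p1y - p2y + p3y - p4y

    # Quadratic q2 * eta^2 + q1 * eta + q0 == 0
    q0 = (ax * by - bx * ay) - 4.0 * (by * x - bx * y)
    q1 = dx * ay - cx * by + bx * cy - ax * dy + 4.0 * (x * dy - dx * y)
    q2 = cx * dy - dx * cy

    if abs(q2) < eps:                # quadratic degenerates to linear
        if abs(q1) < eps:
            return None
        etas = [-q0 / q1]
    else:
        disc = q1 * q1 - 4.0 * q2 * q0
        if disc < 0.0:
            return None
        r = math.sqrt(disc)
        etas = [(-q1 + r) / (2.0 * q2), (-q1 - r) / (2.0 * q2)]

    for eta in etas:
        den = -bx + eta * dx
        if abs(den) < eps:
            continue                 # this sketch does not handle that case
        ci = (-ax + eta * cx + 4.0 * x) / den
        if -1.0 <= eta <= 1.0 and -1.0 <= ci <= 1.0:
            return eta, ci
    return None


print(invert_fe_map((0, 0), (2, 0), (2, 2), (0, 2), 1.5, 0.5))
```

When the denominator -bx + eta dx vanishes, ci would have to be recovered from the corresponding y equation instead; that branch is left out here to keep the sketch short.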
Referring to the self-answer of #blaz (please vote up the answers of #blaze, #dbc and #agentp)
For everybody who is not willing to copy the formulas by hand, here is the formula as C# code:
double v_sqrt = Math.Sqrt(
4 * (
(x3 - x4) * (y1 - y2) - (x1 - x2) * (y3 - y4)) * (x4 * (-1 * y + y1) + x1 * (y - y4) + x * (-1 * y1 + y4)) +
Math.Pow(
(x3 * y - x4 * y - x3 * y1 + 2 * x4 * y1 - x4 * y2 + x1 * (y + y3 - 2 * y4) + x2 * (-1 * y + y4) + x * (-1 * y1 + y2 - y3 + y4))
, 2)
);
double u_sqrt = Math.Sqrt(
4 * ((x3 - x4) * (y1 - y2) - (x1 - x2) * (y3 - y4))
* (
x4 * (-1 * y + y1) + x1 * (y - y4) + x * (-1 * y1 + y4)
) +
Math.Pow(
(x3 * y - x4 * y - x3 * y1 + 2 * x4 * y1 - x4 * y2 + x1 * (y + y3 - 2 * y4) + x2 * (-1 * y + y4) + x * (-1 * y1 + y2 - y3 + y4))
, 2)
);
double k = 1 / (2 * ((x3 - x4) * (y1 - y2) - (x1 - x2) * (y3 - y4)));
double l = 1 / (2 * ((x1 - x4) * (y2 - y3) - (x2 - x3) * (y1 - y4)));
///////////////////////////////////////////////////////////////////////////////////////////////
double v1 = l *
(x2 * y - x3 * y + x4 * y + x * y1 - 2 * x2 * y1 + x3 * y1 - x * y2 - x4 * y2 + x * y3 - x1 * (y - 2 * y2 + y3) - x * y4 + x2 * y4 +
v_sqrt);
///////////////////////////////////////////////////////////////////////////////////////////////
double u1 = -1 * k *
(-x2 * y + x3 * y - x * y1 - x3 * y1 + 2 * x4 * y1 + x * y2 - x4 * y2 - x * y3 + x1 * (y + y3 - 2 * y4) + x * y4 + x2 * y4 +
u_sqrt);
double v2 = -1 * l *
(x1 * y + x3 * y - x4 * y - x * y1 - 2 * x3 * y1 + x * y2 - -2 * x1 * y2 + x4 * y2 - x * y3 + x1 * y3 + x * y4 - x2 * (y - 2 * y1 + y4) +
v_sqrt);
/////////////////////////////////////////////////////////////////////////////////////////////////
double u2 = k *
(x2 * y - x3 * y + x4 * y + x * y1 + x3 * y1 - 2 * x4 * y1 - x * y2 + x4 * y2 + x * y3 - x1 * (y + y3 - 2 * y4) - x * y4 - x2 * y4 +
u_sqrt);
In most cases it is u1 and v1 so there should not be the need for computing the other ones.
I used it to calibrate the coordinates of a Pegasus Air-Pen device (ultrasonic stylus) on a sheet of paper. It does work best if your coordinates for point 1 to 5 are also >= 0.
Sorry for posting this as an answer, but it is too long for a comment, and I think it is a valuable addition to this post, as it would have been for me.
You need to calculate a matrix of perspective transformation that maps the 4 points of the source quadrilateral to the 4 points of the destination quadrilateral (example) (more mathematics), then apply this transformation to the coordinates of the 5th point (multiply the matrix by the coordinate vector).

Math Error in Bezier curve

I'm trying to get plots to generate a cubic Bezier Curve, and I've managed to be able to generate linear and quad easily, but I keep getting an error with my cubic formula,
Linear formula:x = (1-t)*(p0x + (t * p1x))
quad formula:x = (1-t)^2 * p0x + 2*(1-t) * t * p1x + t^2 * p2x
cubic formula:x = (1–t)^3 * p0x + 3*(1–t)^2 * t * p1x + 3*(1–t)*t^2 * p2x + t^3 * p3x
Though the quad and cubic formula are very similar, the cubic errors "')' expected near '–'". How can this be fixed?
I'm programming this in Lua.
The subtraction signs in your cubic formula aren't plain -:
>>> s = """
... linear formula: `x = (1-t)*(p0x + (t * p1x))`
... quad formula: `x = (1-t)^2 * p0x + 2*(1-t) * t * p1x + t^2 * p2x`
... cubic formula: `x = (1–t)^3 * p0x + 3*(1–t)^2 * t * p1x + 3*(1–t)*t^2 * p2x + t^3 * p3x`
...
... """
>>> for line in s.splitlines():
... print repr(line)
...
''
'linear formula: `x = (1-t)*(p0x + (t * p1x))`'
'quad formula: `x = (1-t)^2 * p0x + 2*(1-t) * t * p1x + t^2 * p2x`'
'cubic formula: `x = (1\xe2\x80\x93t)^3 * p0x + 3*(1\xe2\x80\x93t)^2 * t * p1x + 3*(1\xe2\x80\x93t)*t^2 * p2x + t^3 * p3x`'
''
They're actually U+2013 –, which is EN DASH. Fix those and it should be fine.
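The same check can be done in Python 3 with the standard unicodedata module. This is a small sketch with a made-up helper name, find_non_ascii, that lists every non-ASCII character in a formula with its position and Unicode name, then replaces the en dashes with a plain hyphen-minus.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


cubic = "x = (1\u2013t)^3 * p0x + 3*(1\u2013t)^2 * t * p1x + 3*(1\u2013t)*t^2 * p2x + t^3 * p3x"

for index, ch, name in find_non_ascii(cubic):
    print(index, repr(ch), name)

fixed = cubic.replace("\u2013", "-")  # swap EN DASH for hyphen-minus
print(fixed)
```

After the replacement the formula contains only ASCII characters, which is what the Lua parser expects for the subtraction operator.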

Resources