Halide: Using constant_exterior() + vectorize() in OpenCL

Halide: Using constant_exterior() + vectorize() in OpenCL - opencl

I can't generate an OpenCL implementation with Halide when I choose a constant_exterior() type of boundary condition with vectorize scheduling.
When compiling, I get the following error:
Error:
Vector of bool not valid in OpenCL C (yet)
I don't understand why it would need to use a boolean vector..
My function looks something like this:
void dummy_step()
{
Var x("x"), y("y"), c("c");
Func src("src");
Func dst("dst");
// input parameters
ImageParam image(UInt(8), 3, "inputImage");
Param<int> W;
Param<int> H;
// boundary condition
src = constant_exterior(image, 0, 0, W, 0, H);
Expr x0 = cast<int>(x + y);
Expr y0 = cast<int>(x - y);
dst(x, y, c) = cast<uint8_t>(clamp(src(x0, y0, c), 0.0f, 255.0f));
// scheduling
dst.vectorize(x, 4).gpu_tile(x, y, 16, 8).compute_root();
dst.compile_to_file("test", {image, W, H});
}
If I remove .vectorize(x, 4), the code compiles. If I use another boundary condition, let's say, src = repeat_edge(image, 0, W, 0, H); it also works.

constant_exterior checks if each x coordinate in the vector is within the bounds in order to mux between the constant exterior value and the interior values. The result of this check is a vector of booleans. repeat_edge doesn't need to do that check - it can just clamp the coordinates directly using min and max operations.
I suggest not vectorizing this part of the code using a schedule like so:
src.compute_at(dst, x);
dst.vectorize(x, 4).gpu_tile(x, y, 16, 8).compute_root();

Related

How would you normalize and calculate speed for a 2D vector if it was clamped by different speeds in all four directions? (-x, +x, -y, +y)

My goal here is to improve the user experience so that the cursor goes where the user would intuitively expect it to when moving the joystick diagonally, whatever that means.
Consider a joystick that has a different configured speed for each direction.
e.g. Maybe the joystick has a defect where some directions are too sensitive and some aren't sensitive enough, so you're trying to correct for that. Or maybe you're playing an FPS where you rarely need to look up or down, so you lower the Y-sensitivity.
Here are our max speeds for each direction:
var map = {
x: 100,
y: 200,
}
The joystick input gives us a unit vector from 0 to 1.
Right now the joystick is tilted to the right 25% of the way and tilted up 50% of the way.
joystick = (dx: 0.25, dy: -0.50)
Sheepishly, I'm not sure where to go from here.
Edit: I will try #Caderyn's solution:
var speeds = {
x: 100, // max speed of -100 to 100 on x-axis
y: 300, // max speed of -300 to 300 on y-axis
}
var joystick = { dx: 2, dy: -3 }
console.log('joystick normalized:', normalize(joystick))
var scalar = Math.sqrt(joystick.dx*joystick.dx / speeds.x*speeds.x + joystick.dy*joystick.dy / speeds.y*speeds.y)
var scalar2 = Math.sqrt(joystick.dx*joystick.dx + joystick.dy*joystick.dy)
console.log('scalar1' , scalar) // length formula that uses max speeds
console.log('scalar2', scalar2) // regular length formula
// normalize using maxspeeds
var normalize1 = { dx: joystick.dx/scalar, dy: joystick.dy/scalar }
console.log('normalize1', normalize1, length(normalize1))
// regular normalize (no maxpseed lookup)
var normalize2 = { dx: joystick.dx/scalar2, dy: joystick.dy/scalar2 }
console.log('normalize2', normalize2, length(normalize2))
function length({dx, dy}) {
return Math.sqrt(dx*dx + dy*dy)
}
function normalize(vector) {
var {dx,dy} = vector
var len = length(vector)
return {dx: dx/len, dy: dy/len}
}
Am I missing something massive or does this give the same results as regular vector.len() and vector.normalize() that don't try to integrate the maxspeed data at all?

three solutions :
You can simply multiply each component of the input vector by it's respective speed
you can divide the vector itself by sqrt(dx^2/hSpeed^2+dy^2/vSpeed^2)
you can multiply the vector itself by sqrt((dx^2+dy^2)/(dx^2/hSpeed^2+dy^2/vSpeed^2)) or 0 if the input is (0, 0)
the second solution will preserve the vector's direction when the first will tend to pull it in the direction with the greatest max speed. But if the domain of those function is the unit disc, their image will be an ellipse whose radii are the two max speeds
EDIT : the third method does what the second intended to do: if the imput is A, it will return B such that a/b=c/d (the second method was returning C):

Unable to plot solution of ODE in Maxima

Good time of the day!
Here is the code:
eq:'diff(x,t)=(exp(cos(t))-1)*x;
ode2(eq,x,t);
sol:ic1(%,t=1,x=-1);
/*---------------------*/
plot2d(
rhs(sol),
[t,-4*%pi, 4*%pi],
[y,-5,5],
[xtics,-4*%pi, 1*%pi, 4*%pi],
[ytics, false],
/*[yx_ratio , 0.6], */
[legend,"Solution."],
[xlabel, "t"], [ylabel, "x(t)"],
[style, [lines,1]],
[color, blue]
);
and here is the errors:
integrate: variable must not be a number; found: -12.56637061435917
What went wrong?
Thanks.

Here's a way to plot the solution sol which was found by ode2 and ic2 as you showed. First replace the integrate nouns with calls to quad_qags, a numerical quadrature function. I'll introduce a made-up variable name (a so-called gensym) to avoid confusion with the variable t.
(%i59) subst (nounify (integrate) =
lambda ([e, xx],
block ([u: gensym(string(xx))],
quad_qags (subst (xx = u, e), u, -4*%pi, xx)[1])),
rhs(sol));
(%o59) -%e^((-t)-quad_qags(%e^cos(t88373),t88373,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]
+quad_qags(%e^cos(t88336),t88336,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]+1)
Now I'll define a function foo1 with that result. I'll make a list of numerical values to see if it works right.
(%i60) foo1(t) := ''%;
(%o60) foo1(t):=-%e
^((-t)-quad_qags(%e^cos(t88373),t88373,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]
+quad_qags(%e^cos(t88336),t88336,-4*%pi,t,
epsrel = 1.0E-8,epsabs = 0.0,
limit = 200)[
1]+1)
(%i61) foo1(0.5);
(%o61) -1.648721270700128
(%i62) makelist (foo1(t), t, makelist (k, k, -10, 10));
(%o62) [-59874.14171519782,-22026.46579480672,
-8103.083927575384,-2980.957987041728,
-1096.633158428459,-403.4287934927351,
-148.4131591025766,-54.59815003314424,
-20.08553692318767,-7.38905609893065,-2.71828182845904,
-1.0,-0.3678794411714423,-0.1353352832366127,
-0.04978706836786394,-0.01831563888873418,
-0.006737946999085467,-0.002478752176666358,
-9.118819655545163E-4,-3.354626279025119E-4,
-1.234098040866796E-4]
Does %o62 look right to you? I'll assume it is okay. Next I'll define a function foo which calls foo1 defined before when the argument is a number, otherwise it just returns 0. This is a workaround for a bug in plot2d, which incorrectly determines that foo1 is not a function of t alone. Usually that workaround isn't needed, but it is needed in this case.
(%i63) foo(t) := if numberp(t) then foo1(t) else 0;
(%o63) foo(t):=if numberp(t) then foo1(t) else 0
Okay, now the function foo can be plotted!
(%i64) plot2d (foo, [t, -4*%pi, 4*%pi], [y, -5, 5]);
plot2d: some values were clipped.
(%o64) false
That takes about 30 seconds to plot -- calling quad_qags is relatively expensive.

it looks like ode2 does not know how to completely solve the problem, so the result contains an integral:
(%i6) display2d: false $
(%i7) eq:'diff(x,t)=(exp(cos(t))-1)*x;
(%o7) 'diff(x,t,1) = (%e^cos(t)-1)*x
(%i8) ode2(eq,x,t);
(%o8) x = %c*%e^('integrate(%e^cos(t),t)-t)
(%i9) sol:ic1(%,t=1,x=-1);
(%o9) x = -%e^((-%at('integrate(%e^cos(t),t),t = 1))
+'integrate(%e^cos(t),t)-t+1)
I tried it with contrib_ode also:
(%i12) load (contrib_ode);
(%o12) "/Users/dodier/tmp/maxima-code/share/contrib/diffequations/contrib_ode.mac"
(%i13) contrib_ode (eq, x, t);
(%o13) [x = %c*%e^('integrate(%e^cos(t),t)-t)]
So contrib_ode did not solve it completely either.
However the solution returned by ode2 (same for contrib_ode) appears to be a valid solution. I'll post a separate answer describing how to evaluate it numerically for plotting.

What does the C++ map() function do

I am currently looking at the following code (which can be found here)
void MPU6050::CalibrateAccel(uint8_t Loops,uint8_t OffsetSaveAddress) {
double kP = 0.15;
double kI = 8;
float x;
x = (100 - map(Loops, 1, 5, 20, 0)) * .01;
kP *= x;
kI *= x;
PID( 0x3B, OffsetSaveAddress, kP, kI, Loops);
}
Specifically I am struggling to understand what the line:
x = (100 - map(Loops, 1, 5, 20, 0)) * .01;
is doing?
The best matching function I can find for map() is here but it doesn't appear to match the integer parameters that are being passed into the function.
Obviously ideally I would run this code but unfortunately I am yet unable to get this to compile.
Have I correctly found the function being invoked and what is the behaviour of this function with the given parameters? I assume this is a map() function similar to any other typical map function in other languages/frameworks such as python, jquery etc.
Could anyone guide me in the right direction?

map defined in math, re-maps a number from one range to another. Syntax is map(value, fromLow, fromHigh, toLow, toHigh)
So, map(Loops, 1, 5, 20, 0)) means the value of variable Loops will be initially searched between (1,5) but result will be between 20 to 0 since it is remapped.

Scilab round-off error

I cannot solve a problem in Scilab because it get stucked because of round-off errors. I get the message
!--error 9999
Error: Round-off error detected, the requested tolerance (or default) cannot be achieved. Try using bigger tolerances.
at line 2 of function scalpol called by :
at line 7 of function gram_schmidt_pol called by :
gram_schmidt_pol(a,-1/2,-1/2)
It's a Gram Schmidt process with the integral of the product of two functions and a weight as the scalar product, between -1 and 1.
gram_schmidt_pol is the process specially designed for polynome, and scalpol is the scalar product described for polynome.
The a and b are parameters for the weigth, which is (1+x)^a*(1-x)^b
The entry is a matrix representing a set of vectors, it works well with the matrix [[1;2;3],[4;5;6],[7;8;9]], but it fails with the above message error on matrix eye(2,2), in addition to this, I need to do it on eye(9,9) !
I have looked for a "tolerance setting" in the menus, there is some in General->Preferences->Xcos->Simulation but I believe this is not for what I wan't, I have tried low settings (high tolerance) in it and it hasn't change anything.
So how can I solve this rounf-off problem ?
Feel free to tell me my message lacks of clearness.
Thank you.
Edit: Code of the functions :
// function that evaluate a polynomial (vector of coefficients) in x
function [y] = pol(p, x)
y = 0
for i=1:length(p)
y = y + p(i)*x^(i-1)
end
endfunction
// weight function evaluated in x, parametrized by a and b
// (poids = weight in french)
function [y] = poids(x, a, b)
y = (1-x)^a*(1+x)^b
endfunction
// scalpol compute scalar product between polynomial p1 and p2
// using integrate, the weight and the pol functions.
function [s] = scalpol(p1, p2, a, b)
s = integrate('poids(x,a, b)*pol(p1,x)*pol(p2,x)', 'x', -1, 1)
endfunction
// norm associated to scalpol
function [y] = normscalpol(f, a, b)
y = sqrt(scalpol(f, f, a, b))
endfunction
// finally the gram schmidt process on a family of polynome
// represented by a matrix
function [o] = gram_schmidt_pol(m, a, b)
[n,p] = size(m)
o(1:n) = m(1:n,1)/(normscalpol(m(1:n,1), a, b))
for k = 2:p
s =0
for i = 1:(k-1)
s = s + (scalpol(o(1:n,i), m(1:n,k), a, b) / scalpol(o(1:n,i),o(1:n,i), a, b) .* o(1:n,i))
end
o(1:n,k) = m(1:n,k) - s
o(1:n,k) = o(1:n,k) ./ normscalpol(o(1:n,k), a, b)
end
endfunction

By default, Scilab's integrate routine tries to achieve absolute error at most 1e-8 and relative error at most 1e-14. This is reasonable, but its treatment of relative error does not take into account the issues that occur when the exact value is zero. (See How to calculate relative error when true value is zero?). For this reason, even the simple
integrate('x', 'x', -1, 1)
throws an error (in Scilab 5.5.1).
And this is what happens in the process of running your program: some integrals are zero. There are two solutions:
(A) Give up on the relative error bound, by specifying it as 1:
integrate('...', 'x', -1, 1, 1e-8, 1)
(B) Add some constant to the function being integrated, then subtract from the result:
integrate('100 + ... ', 'x', -1, 1) - 200
(The latter should work in most cases, though if the integral happens to be exactly -200, you'll have the same problem again)
The above works for gram_schmidt_pol(eye(2,2), -1/2, -1/2) but for larger, say, gram_schmidt_pol(eye(9,9), -1/2, -1/2), it throws the error "The integral is probably divergent, or slowly convergent".
It appears that the adaptive integration routine can't handle the functions of the kind you have. A fallback is to use the simple inttrap instead, which just applies the trapezoidal rule. Since at x=-1 and 1 the function poids is undefined, the endpoints have to be excluded.
function [s] = scalpol(p1, p2, a, b)
t = -0.9995:0.001:0.9995
y = poids(t,a, b).*pol(p1,t).*pol(p2,t)
s = inttrap(t,y)
endfunction
In order for this to work, other related functions must be vectorized (* and ^ changed to .* and .^ where necessary):
function [y] = pol(p, x)
y = 0
for i=1:length(p)
y = y + p(i)*x.^(i-1)
end
endfunction
function [y] = poids(x, a, b)
y = (1-x).^a.*(1+x).^b
endfunction
The result is guaranteed to work, though the precision may be a bit lower: you are going to get some numbers like 3D-16 which are actually zeros.

trapezodial integral matlab

I want to use instead of matlab integration command, a basic self created one. Do you have any Idea how to fix the error? If I use Matlab quad command, my algorithm works good but when I try to use my self created integral function,not suprisingly for sure, it does not work:(
M-File:
function y = trapapa(low, up, ints, fun)
y = 0;
step = (up - low) / ints;
for j = low : step : up
y = y + feval(fun,j);
end
y = (y - (feval(fun, low) + feval(fun, up))/2) * step;
Mean algorithm:
clear;
x0=linspace(0,4,3);
y=linspace(0,2,3);
for i=1:length(x0)
for j=1:length(y)
x(i,j)=y(j)+x0(i);
alpha=#(rho)((5-2*x(i,j)).*exp(y(j)-rho))./2;
%int(i,j)=quad(alpha,0,y(j))
int(i,j)=trapapa(alpha,0,y(j),10)
end
end

You are not following your function definition in the script. The fun parameter (variable alpha) is supposed to be the last one.
Try int(i,j)=trapapa(0,y(j),10,alpha)

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex

Halide: Using constant_exterior() + vectorize() in OpenCL - opencl

Related

How would you normalize and calculate speed for a 2D vector if it was clamped by different speeds in all four directions? (-x, +x, -y, +y)

Unable to plot solution of ODE in Maxima

What does the C++ map() function do

Scilab round-off error

trapezodial integral matlab

Categories

Resources