Translating Screen Coordinates [ x, y ] to Camera Pan and Tilt angles

I have an IP camera that supports PTZ. I am currently streaming its live feed into the browser and want to let the user click a point on the screen so that the camera pans and tilts until the clicked position becomes the center of the view.
My camera pans 360 degrees and tilts from -55 to 90 degrees.
Is there an algorithm that would help me achieve this?

Let's start by declaring a 3D coordinate system around the camera (the origin). I will use the following: The z-axis points upwards. The x-axis is the camera direction with pan=tilt=0, and positive pan angles move the camera towards the positive y-axis. In this right-handed system the positive y-axis lies to the camera's left, and I take positive tilt angles to raise the camera towards the positive z-axis.
Then, the transform for a given pan/tilt configuration is:
T = Rz(pan) * Ry(-tilt)
That is, tilt is applied first (about the y-axis) and pan second (about the global z-axis). This is the transform that positions our virtual image plane in 3D space. Let's keep that in mind and go to the image plane.
If we know the vertical and horizontal field of view and assume that lens distortions are already corrected, we can set up our image plane as follows: The image plane is 1 unit away from the camera (just by declaration) in the view direction. Let the center be the plane's local origin. Then, its horizontal extents are +- tan(fovx / 2) and its vertical extents are +- tan(fovy / 2).
Now, given a pixel position (x, y) in this image (origin in the top left corner), we first need to convert this location into a 3D direction. We start by calculating the local coordinates in the image plane. This is for the image's pixel width w and pixel height h:
lx = (2 * x / w - 1) * tan(fovx / 2) (local x-axis points to the right)
ly = (-2 * y / h + 1) * tan(fovy / 2) (local y-axis points upwards)
The image plane is 1 unit away along the view direction (the global x-axis), image-right is the negative global y-axis and image-up is the global z-axis. Expressed in the global axes, the ray through the pixel is therefore:
dx = 1
dy = -lx
dz = ly
This is the ray that contains the pixel under the assumption that there is no pan or tilt yet. But now it is time to get rid of this assumption. That's where our initial transform comes into play. We just need to transform this ray with T:
tx = cos(pan) * (cos(tilt) * dx - sin(tilt) * dz) - sin(pan) * dy
ty = sin(pan) * (cos(tilt) * dx - sin(tilt) * dz) + cos(pan) * dy
tz = sin(tilt) * dx + cos(tilt) * dz
The resulting direction now describes the ray that contains the specified pixel in the global coordinate system that we set up in the beginning. All that's left is to calculate the new pan/tilt parameters:
pan = atan2(ty, tx)
tilt = asin(tz / sqrt(tx^2 + ty^2 + tz^2))
As a sanity check, the center pixel (lx = ly = 0) returns the current pan and tilt unchanged.
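A minimal Python sketch of the click-to-pan/tilt computation, under stated assumptions: the image-plane offsets are remapped onto global axes with x forward, y to the camera's left and z up, tilt is applied before pan, all angles are in radians, and lens distortion is ignored. The function name `click_to_pan_tilt` and its parameter names are made up for illustration. If a particular camera counts positive pan clockwise, the sign of `dy` would need to be flipped.

```python
import math

def click_to_pan_tilt(x, y, w, h, fovx, fovy, pan, tilt):
    """Return the (pan, tilt) in radians that would center pixel (x, y)."""
    # Offsets on an image plane 1 unit in front of the camera (right / up).
    lx = (2.0 * x / w - 1.0) * math.tan(fovx / 2.0)
    ly = (1.0 - 2.0 * y / h) * math.tan(fovy / 2.0)
    # Ray at pan = tilt = 0 in global axes (assumed: x forward, y left, z up).
    dx, dy, dz = 1.0, -lx, ly
    # Apply tilt about the y-axis first ...
    ux = math.cos(tilt) * dx - math.sin(tilt) * dz
    uz = math.sin(tilt) * dx + math.cos(tilt) * dz
    # ... then pan about the global z-axis.
    tx = math.cos(pan) * ux - math.sin(pan) * dy
    ty = math.sin(pan) * ux + math.cos(pan) * dy
    tz = uz
    # Read the new angles back from the rotated ray.
    new_pan = math.atan2(ty, tx)
    new_tilt = math.atan2(tz, math.hypot(tx, ty))
    return new_pan, new_tilt
```

The returned angles would still need to be converted to the camera's own units and clamped to its tilt range (-55 to 90 degrees in the question) before being sent to the device.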

Related

Get Camera 2d position by 3 image points (1D)

I have an image and 3 points with the following data for each point:
x and y 2D world coordinates
x image coordinate
How can I calculate the camera orientation (only left/right) and the 2D world position?
Thanks.
Edit: the image is a normal photograph (so perspective projection). The world coordinates come from a top view of a map (so orthographic projection).

Given a point in world space, the projection can be expressed as
proj(x, y) = ((x - cx) * cos(phi) - (y - cy) * sin(phi)) / ((x - cx) * sin(phi) + (y - cy) * cos(phi))
cx and cy are the camera position and phi is the camera rotation. The projection will result in a value in camera coordinates (not image coordinates). To transform image coordinates to camera coordinates, use
cameraX(imageX) = (2 * imageX / W - 1) * tan(fovy / 2) * ratio
W is the pixel width of the image, fovy is the vertical field of view, ratio is the image's aspect ratio.
Then you want to solve the system of equations formed by the three given points. There is an analytic solution, but it is quite complex. So you're left with numerical (probably least-squares) solvers. Pick one, plug in the formula and get your result. Since you optimize for both a position and an angle, you may want to normalize the values so that they have a similar range. I got quite good results with levmar for similar problems if you're unsure what optimizer to use.
This all assumes that the camera does not distort the image.
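As a rough sketch of how the system of equations might be set up for a solver, the Python below builds one residual per point: the projected world position minus the observed camera coordinate. The function names and the `(world_x, world_y, image_x)` tuple layout are assumptions made for illustration; the solver itself (levmar or any other least-squares routine) is left out.

```python
import math

def proj(x, y, cx, cy, phi):
    """Project a 2D world point into 1D camera coordinates."""
    dx, dy = x - cx, y - cy
    num = dx * math.cos(phi) - dy * math.sin(phi)
    den = dx * math.sin(phi) + dy * math.cos(phi)
    return num / den

def camera_x(image_x, width, fovy, ratio):
    """Convert an image x coordinate (pixels) into camera coordinates."""
    return (2.0 * image_x / width - 1.0) * math.tan(fovy / 2.0) * ratio

def residuals(params, points, width, fovy, ratio):
    """One residual per (world_x, world_y, image_x) point; all zero at the true pose."""
    cx, cy, phi = params
    return [proj(wx, wy, cx, cy, phi) - camera_x(ix, width, fovy, ratio)
            for wx, wy, ix in points]
```

A least-squares solver would then search for the `(cx, cy, phi)` that drives these three residuals towards zero, ideally after scaling position and angle to similar ranges as suggested above.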

Find radius of fixed length arc of a circle in a bounding box when the circle intersects the edge of the bounding box

I have a bounding box that is represented as a Cartesian starting point (0,0) with a width and height.
I have a circle with a centre point that can be anywhere within the bounding box.
The circumference of the circle is fixed.
When the circle intersects the edge of the bounding box an arc is formed.
This new arc has to have a length equal to the circumference of the original circle.
The location of the centre of the circle is known, therefore the distance from the centre to the edge of the bounding box is known.
As you move closer to the edge of the bounding box, the radius of the circle must increase to keep the arc length the same.
The start and stop points of the arc are unknown as the radius is unknown.
This is where I'm stuck. Knowing only the distance from the bounding box and the fixed length of the arc, how can I find the radius of the circle?
I have drawn an image to represent the question but I'm unable to post due to lack of reputation.
Any help on this will be greatly appreciated as I have spent many days trying to figure this out.
What I am trying to achieve is a radial menu in which a fixed number of items (of a fixed size) can be displayed around a centre point. The fixed length is a calculated length that all menu items can fit around.
I am implementing this in .NET, but for the sake of this query it's purely a math question.
Edit: here is an image of the issue:

Here is a possible line of attack. Let's put some names:
alpha = angle at which the circle intercepts the horizontal line on the right side
r = radius
arc = length of the "visible" circle (known)
L = length to edge (known) (Let's assume L > 0)
Pi = the number pi
Using that arc = radius * angle (radians), we have:
arc = Pi * r + 2 * alpha * r
sin(alpha) = L / r
Solving for alpha in the first equation
alpha = arc / (2 * r) - Pi / 2
Using that sin(a - b) = sin(a)cos(b) - cos(a)sin(b)
L / r = sin(alpha) = -cos(arc / (2 * r))
Now put u = L/r. Since L is known, u becomes the unknown. Replacing:
u = -cos(arc / (2 * L) * u)
Finally put F = arc / (2 * L). Then F is known and
u = - cos(F * u)
So, the problem reduces to solving this equation, which will require some numerical algorithm.
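One possible numerical approach is plain bisection, sketched below in Python. Instead of iterating on u, this sketch solves the equivalent equation in r obtained by substituting alpha = asin(L / r) into arc = Pi * r + 2 * alpha * r. The function name `solve_radius` and the tolerance are assumptions for illustration, and only a single edge with L > 0 is handled.

```python
import math

def solve_radius(arc, L, tol=1e-12):
    """Radius whose visible arc length equals `arc` when one edge is L away."""
    if arc <= 2.0 * math.pi * L:
        # The full circle fits without reaching the edge.
        return arc / (2.0 * math.pi)

    def excess(r):
        # Visible arc length for radius r, minus the target length.
        alpha = math.asin(min(1.0, L / r))
        return math.pi * r + 2.0 * alpha * r - arc

    # excess(L) < 0 here, and excess(arc / pi) >= 0, so a root lies between.
    lo, hi = L, arc / math.pi
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The result can be checked against the reduced form: with u = L / r and F = arc / (2 * L), the returned radius should satisfy u = -cos(F * u).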

What I have done is create a multiplier by mapping the distance from the edge, Y = 0 to 150, down to the range 2 to 1: if Y = 150 the map is 1, if Y = 0 the map is 2, and if Y = 75 the map is 1.5, etc.
This mapping is then used as a multiplier:
radius = radius * map
This gets me close enough...
Then in the corner I do the same thing for X and add the two multipliers together, so in the far corner both maps are 2 and
radius = radius * (mapX + mapY)
This doubles it on the edges and quadruples it in the corners, which is close enough.
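The single-axis mapping described here could look roughly like this in Python. The function name `edge_multiplier` and the clamping of out-of-range distances are assumptions; the 150 and the 2-to-1 range come from the description above.

```python
def edge_multiplier(distance, max_distance=150.0):
    """Map a distance of 0..max_distance linearly onto a multiplier of 2..1."""
    # Clamp so that points farther than max_distance keep a multiplier of 1.
    d = min(max(distance, 0.0), max_distance)
    return 2.0 - d / max_distance
```

For example, `radius * edge_multiplier(75)` scales the radius by 1.5.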

clicking on a sphere

I have a unit sphere (radius 1) that is drawn centred in orthogonal projection.
The sphere may rotate freely.
How can I determine the point on the sphere that the user clicks on?

Given:
the height and width of the monitor
the radius of the projected circle, in pixels
the coordinates of the point the user clicked on
And assuming that the top-left corner is (0,0), the x value increases as you travel to the right, and the y value increases as you travel down.
Translate the user's click point into the coordinate space of the globe.
userPoint.x -= monitor.width/2
userPoint.y -= monitor.height/2
userPoint.x /= circleRadius
userPoint.y /= circleRadius
Find the z coordinate of the point of intersection.
//solve for z
//x^2 + y^2 + z^2 = 1
//we know x and y, from userPoint
//z^2 = 1 - x^2 - y^2
x = userPoint.x
y = userPoint.y
if (x^2 + y^2 > 1){
//user clicked outside of sphere. flip out
return -1;
}
//The negative sqrt is closer to the screen than the positive one, so we prefer that.
z = -sqrt(1 - x^2 - y^2);
Now that you know the (x,y,z) point of intersection, you can find the latitude and longitude.
Assuming that the center of the globe facing the user is 0E 0N,
longitude = 90 + toDegrees(atan2(z, x));
//y increases downward on screen, so negate it to make north point up
latitude = toDegrees(atan2(-y, sqrt(x^2 + z^2)));
If the sphere is rotated so that the 0E meridian is not directly facing the viewer, subtract the angle of rotation from the longitude.
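The steps above can be put together as a small Python sketch. It assumes screen-up is north (so the screen y offset is negated), that the sphere only rotates about its vertical axis, and that a click outside the circle should return `None`; the function and parameter names are invented for illustration.

```python
import math

def click_to_lat_lon(px, py, width, height, circle_radius, rotation_deg=0.0):
    """Return (latitude, longitude) in degrees, or None if the click misses."""
    # Move the origin to the screen center and scale to the unit sphere.
    x = (px - width / 2.0) / circle_radius
    y = (py - height / 2.0) / circle_radius
    s = x * x + y * y
    if s > 1.0:
        return None  # clicked outside the projected circle
    # Pick the intersection that faces the viewer (negative z).
    z = -math.sqrt(1.0 - s)
    lon = 90.0 + math.degrees(math.atan2(z, x)) - rotation_deg
    # Screen y grows downward, so negate it to make north point up.
    lat = math.degrees(math.atan2(-y, math.hypot(x, z)))
    return lat, lon
```

A sphere that rotates freely about arbitrary axes would need the inverse of its full rotation applied to (x, y, z) before the angles are computed, which this sketch does not attempt.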

One possible approach is to generate the sphere from triangles, consisting of rows and columns. They can be invisible too. And then hit-testing those triangles with a mouse pick ray.
See this picture's latitude/longitude grid, but apply it much denser. For each grid cell, you need 2 triangles.

How to calculate the z-distance of a camera to view an image at 100% of its original scale in a 3D space

How can one calculate the camera distance from an object in 3D space (an image in this case) such that the image is at its original pixel width.
Am I right in assuming that this is possible given the aspect ratio of the camera, fov, and the original width/height of the image in pixels?
(In case it is relevant, I am using THREE.js in this particular instance).
Thanks to anyone who can help or lead me in the right direction!

Thanks everyone for all the input!
After doing some digging and then working out how this all fits into the exact problem I was trying to solve with THREE.js, this was the answer I came up with in JavaScript as the target Z distance for displaying things at their original scale:
var vFOV = this.camera.fov * (Math.PI / 180); // convert VERTICAL fov to radians
var targetZ = window.innerHeight / (2 * Math.tan(vFOV / 2));
I was trying to figure out which one to mark as the answer but I kind of combined all of them into this solution.
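For reference, the same relation written as a standalone Python function. This is only a sketch: the function name is made up, and it assumes a perspective camera whose field of view is vertical and given in degrees, with the image facing the camera head-on.

```python
import math

def distance_for_original_scale(viewport_height_px, vertical_fov_deg):
    """Camera distance at which one world unit spans one screen pixel."""
    v_fov = math.radians(vertical_fov_deg)
    # At this distance the visible height equals the viewport height in pixels.
    return viewport_height_px / (2.0 * math.tan(v_fov / 2.0))
```

With a 90 degree vertical field of view the distance is simply half the viewport height, since tan(45 degrees) is 1.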

Trigonometrically:
A line segment of length l at a right angle to the view plane and at a distance of n perpendicular to it will subtend arctan(l/n) degrees on the camera. You can arrive at that result by simple trigonometry.
Hence if your field of view in direction of the line is q, amounting to p pixels, you'll end up occupying p*arctan(l/n)/q pixels.
So, using y as the output number of pixels:
y = p*arctan(l/n)/q
y*q/p = arctan(l/n)
l/tan(y*q/p) = n
Linear algebra:
In a camera with a field-of-view of 90 degrees and a viewport of 2w pixels wide, the projection into screen space is equivalent to:
x' = w - w*x/z
When perpendicular, the length of a line on screen is the difference between two such xs, so the constant w cancels and (ignoring sign):
l' = w*l/z
Hence:
z = w*l/l'
If your field of view is actually q degrees rather than 90 then you can use the cotangent to scale appropriately.

In your original question you said that you're using css3D. I suggest that you do the following:
Set up an orthographic camera with fov = 1..179 degrees, where left = screenWidth / 2, right = screenWidth / - 2, top = screenHeight / 2, bottom = screenHeight / - 2. Near and far planes do not affect CSS3D rendering as far as I can tell from experience.
camera = new THREE.OrthographicCamera(left, right, top, bottom, near, far);
camera.fov = 75;
Now you need to calculate the distance between the camera and the object in such a way that when the object is projected using the camera with the settings above, the object has 1:1 coordinate correspondence on screen. This can be done in the following way:
var camscale = Math.tan(( camera.fov / 2 ) / 180 * Math.PI);
var camfix = screenHeight / 2 / camscale;
place your div to position: x, y, z
set the camera's position to 0, 0, z + camfix
This should give you 1:1 coordinate correspondence between the rendered result and your pixel values in CSS / div styles. Remember that the origin is in the center and the object's position is the center of the object, so you need to make adjustments in order to specify coordinates from the top-left corner, for example:
object.x = ( screenWidth - objectWidth ) / 2 + positionLeft
object.y = ( screenHeight - objectHeight ) / 2 + positionTop
object.z = 0
I hope this helps, I was struggling with same thing (exact control of the css3d scene) but managed to figure out that the Orthographic camera + viewport size adjusted distance from object did the trick. Don't alter the camera rotation or its x and y coordinates, just fiddle with the z and you're safe.

finding a dot on a circle by degree?

Let's say we have a 100x100 coordinate system, like the one below. 0,0 is its left-top corner, 50,50 is its center point, 100,100 is its bottom right corner, etc.
Now we need to draw a line from the center outwards. We know the angle of the line, but need to calculate the coordinates of its end point. What do you think would be the best way to do it?
For example, if the angle of the line is 45 degrees, its end point coordinates would be roughly 75,15.

You need to use the trigonometric functions sin and cos.
Something like this:
theta = 45
theta = pi * theta / 180 // convert to radians
radius = 50
centerX = 50
centerY = 50
p.x = centerX + radius * cos(theta)
p.y = centerY - radius * sin(theta)
Keep in mind that most implementations assume that you're working with radians and have positive y pointing upwards.
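A small runnable Python version of the same idea, assuming the angle is given in degrees, measured counter-clockwise from the positive x direction, and that screen y grows downward as in the question. The function name `point_on_circle` is illustrative.

```python
import math

def point_on_circle(center_x, center_y, radius, angle_deg):
    """End point of a line of length `radius` drawn from the center at `angle_deg`."""
    theta = math.radians(angle_deg)  # math.cos / math.sin expect radians
    x = center_x + radius * math.cos(theta)
    y = center_y - radius * math.sin(theta)  # subtract: screen y points down
    return x, y
```

Calling `point_on_circle(50, 50, 50, 45)` gives a point near (85.36, 14.64), the upper-right of the 100x100 area.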

Use the unit circle to calculate X and Y, but because your radius is 50, multiply by 50
http://en.wikipedia.org/wiki/Unit_circle
Add the offset (50,50) and bob's your uncle
X = 50 + (cos(45) * 50) ~ 85.36
Y = 50 - (sin(45) * 50) ~ 14.64
The above happens to be 45 degrees.
EDIT: just saw the Y axis is inverted

First you would want to calculate the X and Y coordinates as if the circle were the unit circle (radius 1). The X coordinate of a given angle is given by cos(angle), and the Y coordinate is given by sin(angle). Most implementations of sin and cos take their inputs in radians, so a conversion is necessary (1 degree = 0.0174532925 radians). Now, since your coordinate system is not in fact the unit circle, you need to multiply the resultant values by the radius of your circle. In this given instance, you would multiply by 50, since your circle extends 50 units in each direction. Finally, using a unit circle coordinate system assumes your circle is centered at the origin (0,0). To account for this, add (or subtract) the offset of your center from your calculated X and Y coordinates. In your scenario, the offset from (0,0) is 50 in the positive X direction, and 50 in the negative Y direction.
For example:
cos(45) = x ~= .707
sin(45) = y ~= .707
.707*50 = 35.35
35.35+50 = 85.35
abs(35.35-50) = 14.65
Thus the coordinates of the ending segment would be (85.35, 14.65).
Note, there is probably a built-in degrees-to-radians function in your language of choice, I provided the unit conversion for reference.
edit: oops, used degrees at first
