I'm following one of the BigQuery courses from Google's Skill Boost program. Using a dataset of football (soccer) stats, they calculate the impact of shot distance on the likelihood of scoring a goal.
I don't quite get how the shot distance is calculated in this part:
SQRT(
POW(
(100 - positions[ORDINAL(1)].x) * 105/100,
2) +
POW(
(50 - positions[ORDINAL(1)].y) * 68/100,
2)
) AS shotDistance
I know the distance formula is used (d=√((x_2-x_1)²+(y_2-y_1)²)) but:
why use ORDINAL(1)? How does it work in this example?
why subtract from 100 first and then from 50?
For the record, positions is a repeated field with x, y INT64 nested underneath. x and y have values between 1 and 100, indicating the percentage of the pitch (along each axis) at which an event (e.g. a pass) was initiated or terminated.
The whole code is as follows:
WITH
Shots AS
(
SELECT
*,
/* 101 is known Tag for 'goals' from goals table */
(101 IN UNNEST(tags.id)) AS isGoal,
/* Translate 0-100 (x,y) coordinate-based distances to absolute positions
using "average" field dimensions of 105x68 before combining in 2D dist calc */
SQRT(
POW(
(100 - positions[ORDINAL(1)].x) * 105/100,
2) +
POW(
(50 - positions[ORDINAL(1)].y) * 68/100,
2)
) AS shotDistance
FROM
`soccer.events`
WHERE
/* Includes both "open play" & free kick shots (including penalties) */
eventName = 'Shot' OR
(eventName = 'Free Kick' AND subEventName IN ('Free kick shot', 'Penalty'))
)
SELECT
ROUND(shotDistance, 0) AS ShotDistRound0,
COUNT(*) AS numShots,
SUM(IF(isGoal, 1, 0)) AS numGoals,
AVG(IF(isGoal, 1, 0)) AS goalPct
FROM
Shots
WHERE
shotDistance <= 50
GROUP BY
ShotDistRound0
ORDER BY
ShotDistRound0
Thanks
why use ORDINAL(1)? How does it work in this example?
As per the BigQuery array documentation
To access elements from the arrays in this column, you must specify
which type of indexing you want to use: either OFFSET, for zero-based
indexes, or ORDINAL, for one-based indexes.
So taking a sample array to access the first element you would do the following:
array = [7, 5, 8]
array[OFFSET(0)] = 7
array[ORDINAL(1)] = 7
So in this example it is used to get the coordinates of where the shot took place (which in this data is the first set of x,y coordinates).
why subtract from 100 first and then from 50?
The 100 and the 50 represent the position of the goal on the field.
The end point of the shot is assumed to be the middle of the goal: along the x axis, which runs from 0 to 100, 100 is the end line of the field, while on the y axis the goal sits in the middle of the field, an equal distance from each sideline, so 50 is the midpoint of the goal.
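To make the scaling concrete, here is the same calculation outside of SQL - a quick JavaScript sketch (the language is only for illustration) using a made-up shot position of (x=85, y=40) as the first element of positions:

// Sketch of the shotDistance expression for a hypothetical shot whose
// first (x, y) position is given as 0-100 pitch percentages.
function shotDistance(x, y) {
  const dx = (100 - x) * 105 / 100; // metres to the goal line (x = 100)
  const dy = (50 - y) * 68 / 100;   // metres to the centre of the goal mouth (y = 50)
  return Math.sqrt(dx * dx + dy * dy);
}

console.log(shotDistance(85, 40)); // ≈ 17.16 m from the centre of the goal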
Related
I am stuck on a problem that seems easy to solve but I can't seem to pinpoint the right formula.
I have a list of hexagon groups in a cube coordinate system. I know the cube coordinates of the groups but I need to calculate the "global" coordinate of a small hexagon in a given group.
For example, in the image below, I know the coordinates for GroupA (x=0, y=0, z=0) and GroupB (x=-1, y=1, z=0). How can I calculate the coordinates of the center tile of GroupB given that each group has the same radius (in this case the radius is 1) and they don't overlap each other (let's see it as a tiling of groups starting from 0,0,0 that creates a hex grid)?
In this simple example, I know as a human being that the center tile of GroupB is (x=-1, y=3, z=-2) but I need to code that logic in a way that a computer can calculate it for any given group on the map. I don't particularly need help on the code itself but the overall logic.
In this article, the author does the opposite (going from small hexagon and trying to find its group):
https://observablehq.com/#sanderevers/hexagon-tiling-of-an-hexagonal-grid
Any help would be greatly appreciated!
Thanks!
It looks like I have found something that seems to work.
Please feel free to correct me if I'm mistaken.
Based on the article I linked in my original question, I came up with an algorithm that calculates the central coordinates of the small hexagon from the coordinates of its containing group (in this case I've used a group with a radius of 10). I took the original algorithm and removed the area division the author did. The code is in JavaScript. The i, j and k variables are the cube coordinates of the group; the function returns the cube coordinates of the central small hex:
function getGroupCentralTileCoordinates(i, j, k)
{
    // Radius of each hexagon group (in small hexes).
    let r = 10;
    // "shift" constant from the linked article, used here without the area division.
    let shift = 3 * r + 2;
    // Skewed intermediate coordinates of the group.
    let xh = shift * i + j;
    let yh = shift * j + k;
    let zh = shift * k + i;
    // Cube coordinates of the group's central small hex.
    return {
        'x': (1 + xh - yh) / 3,
        'y': (1 + yh - zh) / 3,
        'z': (1 + zh - xh) / 3
    };
}
I would just like to start by saying my calculus is terrible and I have next to no experience with using it.
I am trying to find an algorithm to help with scaling in my game. Specifically, it should scale the number of waves that spawn per level. Ideally it will take any number as a level, up to the max integer value. There would also be a minimum value and a maximum value for the minimum and maximum number of waves. So:
level = 0 to infinity
minValue = 3
maxValue = 40
result = an algorithm whose output curves toward the max value and doesn't exceed it, no matter how large the level is. I'm not sure how to calculate this, but I think it would also need some kind of threshold I could control to dictate the curvature based on the level.
Try the following approach:
mult = Min(1, (level / MaxLevel)**Somepower)
minValue + (maxValue - minValue) * mult
Choose a Somepower value suitable for your task. For example, the value 2 gives a parabola (note that the value may also be less than 1).
If you want a more complex curve, show a picture of the desired form.
Edit:
For the case where the curve tends toward, but never rises above, some level, you can choose a function with a horizontal asymptote. For example:
max * x /(x+1)
or
max * arctan(k*x) * 2 / Pi
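A minimal JavaScript sketch of both suggestions, with a made-up maxLevel for the clamped power curve and a made-up steepness k for the arctan version:

// Clamped power curve: reaches maxValue exactly at maxLevel and stays there.
function wavesPowerCurve(level, minValue, maxValue, maxLevel, somePower) {
  const mult = Math.min(1, Math.pow(level / maxLevel, somePower));
  return Math.round(minValue + (maxValue - minValue) * mult);
}

// Asymptotic curve: approaches maxValue but never quite reaches it.
function wavesArctanCurve(level, minValue, maxValue, k) {
  const mult = Math.atan(k * level) * 2 / Math.PI;
  return Math.round(minValue + (maxValue - minValue) * mult);
}

// With minValue = 3 and maxValue = 40 from the question:
console.log(wavesPowerCurve(25, 3, 40, 100, 2)); // 5
console.log(wavesArctanCurve(25, 3, 40, 0.05));  // 24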
Can anyone figure out a function that can perform a mapping from a finite set of N numbers X = {x0, x1, x2, ..., xN} where each x can be valued 0 to 999999999 and N < 999999999, to a set Y = {0, 1, 2, 3, ..., N}.
In my case, I have about 24000000 elements in the first set, whose values can range as X does. The elements come in contiguous blocks (for example 53000 to 1234500, then 8000000 to 9000000, and so on) and I have to remap them onto 0 to 24000000. I don't need to maintain order.
I need a (possibly simple and fast) math function, or a bitwise transformation, not something like putting them into a sorted array and binary searching for their positions.
Really, thanks to anyone who can figure out a way to solve this!
Luca
If you don't want to keep a straight map weighing some gigabytes, then an augmented segment tree is a reasonable approach. The tree should contain the intervals and the shift of every interval (the summed length of the intervals to its left). Of course, finding the appropriate interval (and its shift) with this method is close to a binary search.
For example, say you get X = 8000015. Find the interval for this value - it is 8000000 to 9000000. The rank (shift) of this interval is 1181501 (1234500 - 53000 + 1). So X maps to
X => 1181501 + 8000015 - 8000000 = 1181516
For sparse elements, add a counting stage - find the rank R of every number M and put a (key=M, value=R) pair in a hash table.
X = (3, 19, 20, 101)
table: [(3:0), (19:1), (20:2), (101:3)]
Note that one should keep a balance between speed and space - for long, densely filled intervals it is better to store only the interval ends.
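A minimal JavaScript sketch of the interval-plus-shift idea, using just the two example blocks from the question as the interval list:

// Each block is [start, end] inclusive; blocks are sorted and non-overlapping.
const blocks = [[53000, 1234500], [8000000, 9000000]];

// Precompute the shift (rank of the first element) of every block.
const shifts = [];
let total = 0;
for (const [start, end] of blocks) {
  shifts.push(total);
  total += end - start + 1;
}

// Map X to its compact index via a binary search over the block starts.
function remap(x) {
  let lo = 0, hi = blocks.length - 1;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (blocks[mid][0] <= x) lo = mid; else hi = mid - 1;
  }
  const [start, end] = blocks[lo];
  if (x < start || x > end) return -1; // x is not covered by any block
  return shifts[lo] + (x - start);
}

console.log(remap(8000015)); // 1181516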
How do I normalize any given number between 0 and 100?
The min is 0 and the max has no bounds (it's the search volume for a keyword).
normalized = (x-min(x))/(max(x)-min(x)) won't work since I have no definition of max.
Arcus tangens
Algebraically, you might start with some function that has poles, e.g. tan, and use its inverse, atan. That inverse will never exceed a given limit, namely π/2 in this case. Then you can use a formula of the kind
f(x) = 100 * 2/π * atan(x - min)
If that doesn't produce “nice” results for small inputs, you might want to preprocess the inputs:
f(x) = 100 * 2/π * atan(a*(x - min))
for some suitably chosen a. Making a larger than one increases values, while for 0 < a < 1 you get smaller values. According to a comment, the latter is what you'd most likely want.
You could even add a power in there:
f(x) = 100 * 2/π * atan(a*(x - min)^b) = 100 * 2/π * atan(a * pow(x - min, b))
for some positive parameter b. Having two parameters to tweak gives you more freedom in adjusting the function to your needs. But to decide on what would be good fits, you might have to decide up front as to what values you'd expect for various inputs. A bit like in this question, although there the input range is not unbounded.
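A small JavaScript sketch of the two-parameter version; the values of a and b below are only placeholders to be tuned:

// Map x in [min, infinity) to [0, 100) using atan; a and b tune the shape.
function normalize(x, min, a, b) {
  return 100 * (2 / Math.PI) * Math.atan(a * Math.pow(x - min, b));
}

console.log(normalize(0, 0, 0.001, 1));      // 0
console.log(normalize(1000, 0, 0.001, 1));   // 50
console.log(normalize(100000, 0, 0.001, 1)); // ≈ 99.4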
Stereographic projection
If you prefer geometric approaches: you can imagine your input as the positive half of the x axis, namely the ray from (0,0) to (∞,0). Then imagine a circle with center (0,1) and radius 1 sitting on that line. If you connect the point (0,2) with any point on the ray, the connecting line will intersect the circle in one other point. That way you can map the ray onto the right half of the circle. Now take either the angle as seen from the center of the circle, or the y coordinate of the point on the circle, or any other finite value like this, normalize input and output properly, and you have a function matching your requirements. You can also work out a formula for this, and atan will likely play a role in that.
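For what it's worth, working out the formula for that construction (unit circle centred at (0,1), projecting from (0,2)) gives an intersection y coordinate of 2d²/(d² + 4) for d = x − min, which already lies in [0, 2); scaled to [0, 100) it looks like this in JavaScript:

// Stereographic-style normalization: project (d, 0) onto a unit circle
// centred at (0, 1) from the point (0, 2); the intersection's y coordinate
// is 2*d*d / (d*d + 4), which we scale from [0, 2) to [0, 100).
function normalizeStereographic(x, min) {
  const d = x - min;
  return 100 * (d * d) / (d * d + 4);
}

console.log(normalizeStereographic(0, 0));   // 0
console.log(normalizeStereographic(2, 0));   // 50
console.log(normalizeStereographic(100, 0)); // ≈ 99.96

As with the atan variant, you would probably want to rescale d (e.g. d = a*(x − min)) so the curve doesn't saturate too quickly for your typical search volumes.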
I have an animation sequence that shows one specific frame after another. The original specifications were:
We will give you a total duration for the animation
Reduce the time each consecutive frame is shown by 20ms
So I wrote this linear timing function:
/**
* Get the delay until the next animation frame.
*
* @returns the number of ms until the next animation frame.
*/
getNextTiming: function () {
var n = this.getLength(), // total number of frames
t = this.animationDuration,
x = this.durationDelta, // 20 ms, according to spec
d = ((n * n * x) - (n * x) + (2 * t)) / (2 * n),
c = this.cursor; // current frame
// d is the duration of the first frame of animation.
return d - c * x;
},
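(For reference, d comes from requiring the n delays d, d − x, d − 2x, …, d − (n − 1)·x to add up to the total duration t: summing gives t = n·d − x·n·(n − 1)/2, and solving for d gives d = (n²·x − n·x + 2·t) / (2·n), which is the expression above.)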
Naturally, the artists aren't happy. They want the animation to follow a curve. Of course, they don't know exactly what curve, so I need to write something I can tweak numerically until they're happy. It seems to me I should be looking at cubic Bézier functions, but the ones that I can find either have abstract notation that I'm finding difficult to parse, or they're defined as p(t), instead of Δt/Δp.
I used to be good at math, but right now I'm stuck. I could use a shove in the right direction. How can I rewrite the above function so that I can feed it cubic Bézier control points P1 and P2 and get as output the time for each consecutive frame?
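Here is the rough shape I'm imagining, assuming CSS-style control points (P0 = (0,0) and P3 = (1,1) implied) and treating the curve as a mapping from frame fraction to time fraction - I'm not sure this is the right way to slice it:

// Sketch: per-frame delays following a cubic Bézier easing curve.
// p1x, p1y, p2x, p2y are the tweakable control points (x values in [0, 1]).
function makeBezierTiming(p1x, p1y, p2x, p2y, totalDuration, frameCount) {
  // One cubic Bézier component with endpoint values 0 and 1.
  function bezier(t, c1, c2) {
    const u = 1 - t;
    return 3 * u * u * t * c1 + 3 * u * t * t * c2 + t * t * t;
  }
  // Solve x(t) = x by bisection (x(t) is monotone for x values in [0, 1]),
  // then return y(t): the eased fraction of total time at progress x.
  function ease(x) {
    let lo = 0, hi = 1;
    for (let i = 0; i < 50; i++) {
      const mid = (lo + hi) / 2;
      if (bezier(mid, p1x, p2x) < x) lo = mid; else hi = mid;
    }
    return bezier((lo + hi) / 2, p1y, p2y);
  }
  // Delay for frame c = eased elapsed time at frame c + 1 minus that at frame c.
  return function getNextTiming(c) {
    const elapsedAt = k => totalDuration * ease(k / frameCount);
    return elapsedAt(c + 1) - elapsedAt(c);
  };
}

// Example: an ease-in-out-ish curve over a 2000 ms, 10-frame animation.
const timing = makeBezierTiming(0.42, 0, 0.58, 1, 2000, 10);
console.log(timing(0)); // short first frame, since this curve rises slowly at the start

If the artists think of the curve the other way around (time on the horizontal axis, animation progress on the vertical one), the roles of the x and y control components just swap.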