Suppose you have three points; (3500, 700), (52500, 5075), and (527500, 36800). As well as two x boundaries 25000 and 200000. The question is then to construct three lines (each of which goes through one of the points) in such a way that the lines achieve the same value at each boundary point. The catch is that the second line must have a lesser slope than the first line and the third line must have a lesser slope than the second line.
I do not believe a solution exists but I would like to know how to set up the problem in order to check for solutions in r. Ideally, I would like to then relax the constraint that the functions be equal at the boundaries to something where they are within 10% of one another or something like that.
Edit 1:
Essentially what I want is a line that goes through (3500, 700) and has a value (25000, y_1), a line that goes through (52500, 5075) and has values (25000, y_1) and (200000, y_2), and a third line that goes through (527500, 36800) and has a value(200000, y_2).
Edit 2:
Here is an updated graphic. The dashed lines represent an incorrect solution because the slope of the last segment is larger than the slope of the middle segment (which is 0). What I am essentially looking for is the y = mx + b format of each of the three segments if they were infinitely extending lines.
Related
Im writing a basic script to find the min distance between f(x):=log(x)-x and the origin. I would like to be able to plot the point closest to the origin over the top of a plot of f(x), but i can not figure out how to plot a single point.
Here is what i have written. Any ideas?
f(x):=log(x)-x;
d(x):=sqrt(x^2+f(x)^2)$
find_root(diff(d(x),x),x,0.01,5)$
a:%;
f(a);
print("min distance from f(x) to (0,0)")$
d(a);
print("passes second derivative test if next value greater than zero")$
g(x):=''(diff(d(x),x,2))$
g(a);
wxplot2d([f(x)], [x,.01,5], [y,-6,0])$
Use the discrete option as a second curve, and then use points in the style option.
Replacing your last line with
wxplot2d([f(x), [discrete, [a], [f(a)]]], [x,.01,5], [y,-6,0],
[style, lines, points],
[legend, "log(x)-x", "closest point to origin"],
[point_type, circle],
[gnuplot_preamble, "set key bottom"])$
gives you this:
I am trying to convey the concentration of lines in 2D space by showing the number of crossings through each pixel in a grid. I am picturing something similar to a density plot, but with more intuitive units. I was drawn to the spatstat package and its line segment class (psp) as it allows you to define line segments by their end points and incorporate the entire line in calculations. However, I'm struggling to find the right combination of functions to tally these counts and would appreciate any suggestions.
As shown in the example below with 50 lines, the density function produces values in (0,140), the pixellate function tallies the total length through each pixel and takes values in (0, 0.04), and as.mask produces a binary indictor of whether a line went through each pixel. I'm hoping to see something where the scale takes integer values, say 0..10.
require(spatstat)
set.seed(1234)
numLines = 50
# define line segments
L = psp(runif(numLines),runif(numLines),runif(numLines),runif(numLines), window=owin())
# image with 2-dimensional kernel density estimate
D = density.psp(L, sigma=0.03)
# image with total length of lines through each pixel
P = pixellate.psp(L)
# binary mask giving whether a line went through a pixel
B = as.mask.psp(L)
par(mfrow=c(2,2), mar=c(2,2,2,2))
plot(L, main="L")
plot(D, main="density.psp(L)")
plot(P, main="pixellate.psp(L)")
plot(B, main="as.mask.psp(L)")
The pixellate.psp function allows you to optionally specify weights to use in the calculation. I considered trying to manipulate this to normalize the pixels to take a count of one for each crossing, but the weight is applied uniquely to each line (and not specific to the line/pixel pair). I also considered calculating a binary mask for each line and adding the results, but it seems like there should be an easier way. I know that you can sample points along a line, and then do a count of the points by pixel. However, I am concerned about getting the sampling right so that there is one and only one point per line crossing of a pixel.
Is there is a straight-forward way to do this in R? Otherwise would this be an appropriate suggestion for a future package enhancement? Is this more easily accomplished in another language such as python or matlab?
The example above and my testing has been with spatstat 1.40-0, R 3.1.2, on x86_64-w64-mingw32.
You are absolutely right that this is something to put in as a future enhancement. It will be done in one of the next versions of spatstat. It will probably be an option in pixellate.psp to count the number of crossing lines rather than measure the total length.
For now you have to do something a bit convoluted as e.g:
require(spatstat)
set.seed(1234)
numLines = 50
# define line segments
L <- psp(runif(numLines),runif(numLines),runif(numLines),runif(numLines), window=owin())
# split into individual lines and use as.mask.psp on each
masklist <- lapply(1:nsegments(L), function(i) as.mask.psp(L[i]))
# convert to 0-1 image for easy addition
imlist <- lapply(masklist, as.im.owin, na.replace = 0)
rslt <- Reduce("+", imlist)
# plot
plot(rslt, main = "")
I have two sequences of length n and m. Each is a sequence of points of the form (x,y) and represent curves in an image. I need to find how different (or similar) these sequences are given that fact that
one sequence is likely longer than the other (i.e., one can be half or a quarter as long as the other, but if they trace approximately the same curve, they are the same)
these sequences could be in opposite directions (i.e., sequence 1 goes from left to right, while sequence 2 goes from right to left)
I looked into some difference estimates like Levenshtein as well as edit-distances in structural similarity matching for protein folding, but none of them seem to do the trick. I could write my own brute-force method but I want to know if there is a better way.
Thanks.
Do you mean that you are trying to match curves that have been translated in x,y coordinates? One technique from image processing is to use chain codes [I'm looking for a decent reference, but all I can find right now is this] to encode each sequence and then compare those chain codes. You could take the sum of the differences (modulo 8) and if the result is 0, the curves are identical. Since the sequences are of different lengths and don't necessarily start at the same relative location, you would have to shift one sequence and do this again and again, but you only have to create the chain codes once. The only way to detect if one of the sequences is reversed is to try both the forward and reverse of one of the sequences. If the curves aren't exactly alike, the sum will be greater than zero but it is not straightforward to tell how different the curves are simply from the sum.
This method will not be rotationally invariant. If you need a method that is rotationally invariant, you should look at Boundary-Centered Polar Encoding. I can't find a free reference for that, but if you need me to describe it, let me know.
A method along these lines might work:
For both sequences:
Fit a curve through the sequence. Make sure that you have a continuous one-to-one function from [0,1] to points on this curve. That is, for each (real) number between 0 and 1, this function returns a point on the curve belonging to it. By tracing the function for all numbers from 0 to 1, you get the entire curve.
One way to fit a curve would be to draw a straight line between each pair of consecutive points (it is not a nice curve, because it has sharp bends, but it might be fine for your purpose). In that case, the function can be obtained by calculating the total length of all the line segments (Pythagoras). The point on the curve corresponding to a number Y (between 0 and 1) corresponds to the point on the curve that has a distance Y * (total length of all line segments) from the first point on the sequence, measured by traveling over the line segments (!!).
Now, after we have obtained such a function F(double) for the first sequence, and G(double) for the second sequence, we can calculate the similarity as follows:
double epsilon = 0.01;
double curveDistanceSquared = 0.0;
for(double d=0.0;d<1.0;d=d+epsilon)
{
Point pointOnCurve1 = F(d);
Point pointOnCurve2 = G(d);
//alternatively, use G(1.0-d) to check whether the second sequence is reversed
double distanceOfPoints = pointOnCurve1.EuclideanDistance(pointOnCurve2);
curveDistanceSquared = curveDistanceSquared + distanceOfPoints * distanceOfPoints;
}
similarity = 1.0/ curveDistanceSquared;
Possible improvements:
-Find an improved way to fit the curves. Note that you still need the function that traces the curve for the above method to work.
-When calculating the distance, consider reparametrizing the function G in such a way that the distance is minimized. (This means you have an increasing function R, such that R(0) = 0 and R(1)=1,
but which is otherwise general. When calculating the distance you use
Point pointOnCurve1 = F(d);
Point pointOnCurve2 = G(R(d));
Subsequently, you try to choose R in such a way that the distance is minimized. (to see what happens, note that G(R(d)) also traces the curve)).
Why not do some sort of curve fitting procedure (least-squares whether it be ordinary or non-linear) and see if the coefficients on the shape parameters are the same. If you run it as a panel-data sort of model, there are explicit statistical tests whether sets of parameters are significantly different from one another. That would solve the problem of the the same curve but sampled at different resolutions.
Step 1: Canonicalize the orientation. For example, let's say that all curved start at the endpoint with lowest lexicographic order.
def inCanonicalOrientation(path):
return path if path[0]<path[-1] else reversed(path)
Step 2: You can either be roughly accurate, or very accurate. If you wish to be very accurate, calculate a spline, or fit both curves to a polynomial of appropriate degree, and compare coefficients. If you'd like just a rough estimate, do as follows:
def resample(path, numPoints)
pathLength = pathLength(path) #write this function
segments = generateSegments(path)
currentSegment = next(segments)
segmentsSoFar = [currentSegment]
for i in range(numPoints):
samplePosition = i/(numPoints-1)*pathLength
while samplePosition > pathLength(segmentsSoFar)+currentSegment.length:
currentSegment = next(segments)
segmentsSoFar.insert(currentSegment)
difference = samplePosition - pathLength(segmentsSoFar)
howFar = difference/currentSegment.length
yield Point((1-howFar)*currentSegment.start + (howFar)*currentSegment.end)
This can be modified from a linear resampling to something better.
def error(pathA, pathB):
pathA = inCanonicalOrientation(pathA)
pathB = inCanonicalOrientation(pathB)
higherResolution = max([len(pathA), len(pathB)])
resampledA = resample(pathA, higherResolution)
resampledB = resample(pathA, higherResolution)
error = sum(
abs(pointInA-pointInB)
for pointInA,pointInB in zip(pathA,pathB)
)
averageError = error / len(pathAorB)
normalizedError = error / Z(AorB)
return normalizedError
Where Z is something like the "diameter" of your path, perhaps the maximum Euclidean distance between any two points in a path.
I would use a curve-fitting procedure, but also throw in a constant term, i.e. 0 =B0 + B1*X + B2*Y + B3*X*Y + B4*X^2 etc. This would catch the translational variance and then you can do a statistical comparison of the estimated coefficients of the curves formed by the two sets of points as a way of classifying them. I'm assuming you'll have to do bi-variate interpolation if the data form arbitrary curves in the x-y plane.
I have a line that I must do calculations on for each grid square the line passes through.
I have used the Superline algorithm to get all these grid squares. This gives me an array of X,Y coordinates to check.
Now, here is where I am stuck, I need to be able to calculate the distance traveled through each of the grid squares... As in, on a line not on either 90 degree or 45 degree angles, each grid square accommodates a different 'length' of the total line.
Image example here, need 10 reputation to post images
As you can see, some squares have much more 'line length' in them than others - this is what I need to find.
How do I work this out for each grid square? I've been at this for a while and request the help of the Stack Overflowers!
There may be some clever way to do this that is faster and easier, but you could always hack through it like this:
You know the distance formula: s=sqrt((x2-x1)^2+(y2-y1)^2). To apply this, you must find the x and y co-ordinates of the points where the line intersects the edges of each grid cell. You can do this by plugging the x and y co-ordinates of the boundaries of the cell into the equation of the line and solve for x or y as appropriate.
That is, each cell extends from some point (x0,y0) to (x0+1,y0+1). So we need to find y(x0), y(x0+1), x(y0), and x(y0+1). For each of these, the x or y value found may or may not be within the ranges for that co-ordinate for that cell. Specifically, two of them will be and two won't. The two that are correspond to the edges that the line passes through, and the two that aren't are edges that it doesn't pass through.
Okay, maybe this sounds pretty confusing, so let's work through an example.
Let's say your line has the equation x=2/3 * y. You want to know where it intersects the edges of the cell extending from (1,0) to (2,1).
Plug in x=1 and you get y=2/3. 2/3 is in the legal range for y -- 0 to 1 -- so (1,2/3) is a point on the edge where the line intersects this cell. Namely, the left edge.
Plug in x=2 and you get y=4/3. 4/3 is outside the range for y. So the line does not pass through the right edge.
Plug in y=0 and you get x=0. 0 is not in the range for x, so the line does not pass through the bottom edge.
Plug in y=1 and you get x=3/2. 3/2 is in the legal range for x, so (3/2,1) is another intersection point, on the top edge.
Thus, the two points where the line intersects the edges of the cell are (1,2/3) and (3/2,1). Plug these into the distance formula and you'll get the length of the line segement through this cell, namely sqrt((1-3/2)^2+(2/3-1)^2)=sqrt(1/4+1/9)=sqrt(13/36). You can approximate that to any desired level of precision.
To do this in a program you'd need something like: (I'll use pseudo code because I don't know what language you're using)
// Assuming y=mx+b
function y(x)
return mx+b
function x(y)
return (y-b)/m
// cellx, celly are co-ordinates of lower left corner of cell
// Upper right must therefore be cellx+1, celly+1
function segLength(cellx, celly)
// We'll create two arrays pointx and pointy to hold co-ordinates of intersect points
// n is index into these arrays
// In an object-oriented language, we'd create an array of point objects, but whatever
n=0
y1=y(cellx)
if y1>=celly and y1<=celly+1
pointx[n]=cellx
pointy[n]=y1
n=n+1
y2=y(cellx+1)
if y2>=celly and y2<=celly+1
pointx[n]=cellx+1
pointy[n]=y2
n=n+1
x1=x(celly)
if x1>=cellx and x1<=cellx+1
pointx[n]=x1
pointy[n]=celly
n=n+1
x2=x(celly+1)
if x2>=cellx and x2<=cellx+1
pointx[n]=x2
pointy[n]=celly+1
n=n+1
if n==0
return "Error: line does not intersect this cell"
else if n==2
return sqrt((pointx[0]-pointx[1])^2+(pointy[0]-pointy[1])^2)
else
return "Error: Impossible condition"
Well, I'm sure you could make the code a little cleaner, but that's the idea.
have a look at Siddon's algorithm: "Fast calculation of the exact radiological path for a three-dimensional CT array"
unfortunately you need a subscription to read the original paper, but it is fairly well described in this paper
Siddon's algorithm is an O(n) algorithm for finding the length of intersection of a line with each pixel/voxel in a regular 2d/3d grid.
Use the Euclidean Distance.
sqrt((x2-x1)^2 + (y2-y1)^2)
This gives the actual distance in units between points (x1,y1) and (x2,y2)
You can fairly simply find this for each square.
You have the slope of the line m = (y2-y1)/(x2-x1).
You have the starting point:
(x1,y2)
What is the y position at x1 + 1? (i.e. starting at the next square)
Assuming you set your starting point to 0 the equation of this line is simply:
y_n = mx_n
so y_n = (y2-y1)/(x2-x1) * x_n
Then the coordinates at the first square are (x1,y1) and at the nth point:
(1, ((y2-y1)/(x2-x1))*1)
(2, ((y2-y1)/(x2-x1))*2)
(3, ((y2-y1)/(x2-x1))*3)
...
(n, ((y2-y1)/(x2-x1))*n)
Then the distance through the nth square is:
sqrt((x_n+1 - x_n)^2 + (y_n+1 - y_n)^2)
I'm trying to design a program that draws graphs given a set of points (x, y), and it also should recognize the curve (straight line, hyperbole, parabola), with only the help of the points.
Is there an algorithm to do that?
You'll need five points if the curve could be a straight line or a conic (hyperbola, parabola, ellipse, circle).
If the five points are collinear, you have a straight line. (Or a degenerate conic? But if you're expecting straight lines, this should indicate a straight line.)
If four are collinear, you have a degenerate conic, given by the line through the four collinear points and any line through the fifth point that is not parallel to the first line.
If three are collinear, you have a degenerate conic, given by the line through the three collinear points and the line through the two other points. (Unless these two lines are parallel, in which case this isn't a conic.)
If no three points are collinear, you have a unique, non-degenerate conic.
To find the equation for this conic (Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0), take a look at this page, specifically the formula in the DETAILS section. Put in your five x and y values, calculate the determinant of the matrix in terms of x and y, and this will give you the formula of your conic. Then see here to figure out what kind of conic you have based on the values of A, B and C.
If you have more than 5 points, pick five points (preferably so no three are collinear), find the conic, and then check that the remaining points lie on the conic.
You can do it by watching the function extreme but is maybe not optimal solution for this problem (i mean a problem in parabola function like that y = sqrt(x*x-1)).
For straight line you can calc y = ax + b by the 2 random point's and check what other points equal this condition, if yes. This is straight line if no you may check a 2 other exceptions or nothing from it. Maybe somone else get solution for next 2 cases?