Equation for non-linear data - math
I have a set of non-linear data: the X and Y coordinates of different objects/points in a video (that is, the x and y pixel coordinates of the same objects in all the frames of the video). Upon plotting the values from one frame, I get a non-linear graph, as shown in the picture.
I want to form an equation for this graph so that, if I have a known X coordinate in this frame, the corresponding Y coordinate can be obtained from the equation (a kind of prediction of the new position; I am not sure whether this idea is correct).
OR
If this idea is illogical, can you suggest something that will work, so that I can predict the location of a new object using these data?
Any help or new ideas are highly appreciated.
A sample of my data is given below:
X Y
----------
214 182
830 185
1451 173
219 554
1453 548
214 941
830 934
1455 942
213 190
829 193
1450 181
218 561
1452 555
214 945
830 938
1455 946
213 190
828 193
1451 182
219 560
1452 554
214 945
830 938
1455 946
213 190
829 193
1450 181
219 556
1453 550
215 936
830 929
1455 937
I have selected 9 objects in each frame, so the first 9 data points belong to one frame, and so on.
Your XY data looks like this (scatter plot omitted): there are clusters located at the corners and mid-edges of the frame. When the lines that connect successive points are added (second plot omitted), a repeating traversal of those clusters becomes visible.
The points should come in groups of 8, in the sequence shown above. You can predict the location of a point from its index; in R, for example:
# predict the location (x, y) of a point from its index i (1-based)
predict_point <- function(i) {
  p <- (i - 1) %% 8 + 1   # point number 1-8 within its frame (as shown above)
  # approximate cluster centres taken from the sample data
  x <- c(215, 829, 1453, 215, 1453, 215, 829, 1453)[p]   # points 1,4,6 left; 2,7 middle; 3,5,8 right
  y <- c(186, 186, 186, 555, 555, 940, 940, 940)[p]      # points 1,2,3 top; 4,5 middle; 6,7,8 bottom
  c(x = x, y = y)
}
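For example, predict_point(12) returns the expected location of point 4 of the second frame: x = 215, y = 555.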
You can cut this curve into many short linear segments (piecewise linear interpolation): given a value of X, you land on one of those segments, and it is easy to calculate the equation of a line from two known points on it.
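A minimal sketch of that idea in R, using base R's approx(), which does exactly this piecewise linear interpolation (the x and y values below are just the first frame's top-row points from the sample data):

# known points on the curve, sorted by X
x <- c(214, 830, 1451)
y <- c(182, 185, 173)
# linearly interpolate Y at a new X that falls between the known points
approx(x, y, xout = 520)$y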
Related
Remove row with specific value
I have the following data:

library(data.table)
sales <- data.table(
  Customer = c(192,964,929,345,898,477,705,804,188,231,780,611,420,816,171,212,504,526,471,979,524,410,557,152,417,359,435,820,305,268,763,194,757,475,351,933,805,687,813,880,798,327,602,710,785,840,446,891,165,662),
  Producttype = c(1,2,3,2,3,3,2,1,3,3,1,1,2,2,1,3,1,3,3,1,1,1,1,3,3,3,3,2,1,1,3,3,3,3,1,1,3,3,3,2,3,2,3,3,3,2,1,2,3,1),
  Price = c(469,721,856,956,554,188,429,502,507,669,427,582,574,992,418,835,652,983,149,917,370,617,876,337,663,252,599,949,915,556,313,842,892,724,415,307,900,114,439,456,541,261,881,757,199,308,958,374,409,738),
  Quarter = c(2,3,3,4,4,1,4,4,3,3,1,1,1,1,1,1,4,1,2,1,3,1,2,3,3,4,4,1,1,4,1,1,3,2,1,3,3,2,2,2,1,4,3,3,1,1,1,3,1,1)
)

How can I remove (let's say) the row in which Customer = 891?

And then I have another question: if I want to manipulate the data, I use data[row, column]. But when I want to use only the rows in which Quarter equals (for example) 4, I use data[Quarter == 4, ]. Why is it not data[, Quarter == 4], since Quarter is a column and not a row? I did not find an appropriate answer on the internet which really explains why. Thank you.
You have used the 'data.table' function to import your data, so you could write: sales[Customer != 891, ]. The expression data[Quarter == 4, ] ensures that all columns are returned for the rows where Quarter is equal to 4. The comma is necessary because the first position selects rows; leaving the second position empty keeps every column rather than selecting a column.
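For example, with the sales table defined above (output omitted):

# drop the row where Customer is 891
sales[Customer != 891, ]
# keep all columns for the rows where Quarter is 4
sales[Quarter == 4, ]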
When you use indexing, i.e. data[row, column], you are telling R to look for either a specific row or column index.

Rows: sales[sales$Customer %in% c(192,964), ] translates to "search the Customer column of the data frame (or table) for any rows whose values contain 192 or 964 and isolate them". Note that data.table allows sales[Customer %in% c(192, 964), ], but data frames can't (use sales[sales$Customer %in% c(192,964), ]).

   Customer Producttype Price Quarter
1:      192           1   469       2
2:      964           2   721       3

Columns: sales[, "Customer"] translates to "search the data frame (or table) for the column named Customer and isolate all of its rows".

   Customer
1:      192
2:      964
3:      929
4:      345
5:      898
...

Note this returns a data table with one column. If you use sales[, Customer] (data table) or sales$Customer (data frame), it will return a vector:

#  [1] 192 964 929 345 898 477 705 804 188 231 780 611 420 816 171 212 504 526 471 979 524
# [22] 410 557 152 417 359 435 820 305 268 763 194 757 475 351 933 805 687 813 880 798 327
# [43] 602 710 785 840 446 891 165 662

You can of course combine the two: sales[sales$Quarter %in% 1:2, c("Customer", "Producttype")] isolates all values of Customer and Producttype which were in quarters 1 and 2:

   Customer Producttype
1:      192           1
2:      477           3
3:      780           1
4:      611           1
5:      420           2
...
Report the mean number of characters in a corpus document
So I have a corpus set up reading a bunch of text files with paragraphs in them.

library('tm')
my.text.location <- "C:/Users//.../*/"
apapers <- VCorpus(DirSource(my.text.location))

Now I need to find the mean number of characters in each text. Running mean(nchar(apapers), na.rm = T) results in a very weird output, more than the number of characters. Is there any other way to get the mean?
You didn't supply a reproducible example, but rowMeans(sapply(apapers, nchar)) will return the mean number of characters over all documents. "content" is the entry you need.

A longer version is to run sapply over the corpus, counting the number of characters per document. Transpose this data and turn it into a data.frame. The data.frame will contain two columns, content and meta; content is the one you need. Taking the mean of the content column gives you the average number of characters in a document. The advantage of this is that you keep the table in case you need to report the numbers.

# your code
my_count <- data.frame(t(sapply(apapers, nchar)))
mean(my_count$content)

Reproducible example using the crude dataset:

library(tm)
data("crude")
crude <- as.VCorpus(crude)

# in one statement
rowMeans(sapply(crude, nchar))
 content     meta 
 1220.30   453.15 

# longer version keeping intermediate results
my_count <- data.frame(t(sapply(crude, nchar)))
mean(my_count$content)
[1] 1220.3

my_count
    content meta
127     527  440
144    2634  458
191     330  444
194     394  441
211     552  441
236    2774  455
237    2747  477
242     930  453
246    2115  440
248    2066  466
273    2241  458
349     593  492
352     621  468
353     591  445
368     629  440
489     876  445
502    1166  446
543     463  447
704    1797  456
708     360  451
R is not applying the hgap parameter in layout_with_sugiyama
I'm working in R on a graph and I'd like to have a hierarchical plot, based on the values in the vector S (one value for each node).

lay2 <- layout_with_sugiyama(grafo, attributes="all", layers = S, hgap=10, vgap=10)
plot(lay2$extd_graph, vertex.label.cex=0.5)

However, the parameters hgap and vgap are not taken into account, and the graph is really confused (not least because I've got 162 nodes). Am I doing something wrong, or is there another way in which I can draw a hierarchical graph?
I believe that layout_with_sugiyama is working just fine, but you may be misinterpreting the output. Since you do not provide any data, I will illustrate with some randomly generated data.

library(igraph)
set.seed(1234)
grafo = erdos.renyi.game(162, 0.03)
lay2 <- layout_with_sugiyama(grafo, attributes="all", hgap=10, vgap=10)
plot(lay2$extd_graph, vertex.label.cex=0.5, vertex.size=9)

I think the source of your question is the fact that the nodes are a bit crowded together in the horizontal direction. But that should be expected. Let's analyze the layout, starting with the easy part, the vertical direction.

table(lay2$layout[,2])
 1 11 21 31 41 
24 82 42 13  1 

You can see that vgap worked: the layers are spaced 10 units apart. The second line up (y=11) has 82 nodes. Unless the nodes are tiny, 82 nodes on a single horizontal line will overlap. But aren't they supposed to have spacing of at least 10? They do! Let's look at that second line.

sort(lay2$layout[lay2$layout[,2]==11,1])
 [1] -25 -15  -5   5  15  25  35  45  55  65  75  85  95 105 115 125 135 230
[19] 240 260 270 280 290 300 310 320 330 340 350 360 370 380 390 400 410 420
[37] 430 440 450 460 470 480 490 500 510 520 530 540 550 560 570 580 590 600
[55] 610 620 630 640 655 665 675 685 695 720 730 740 750 760 770 780 790 800
[73] 810 820 830 840 850 860 870 880 890 910

Looking at the whole graph, there is a slightly broader range.

range(lay2$layout[,1])
[1] -65 910

None of the numbers are less than 10 apart, as requested: hgap worked too! However, what happens when you try to plot that? If you read the part of the ?igraph.plotting help page that refers to the parameter rescale, you will see:

rescale: Logical constant, whether to rescale the coordinates to the [-1,1] x [-1,1] interval. Defaults to TRUE, the layout will be rescaled.

So the layout will be rescaled to a range of [-1,1] and then plotted. Rescaled or not, you need to fit 82 nodes into a single horizontal row, so it is very difficult to avoid overlapping nodes.
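If you want the plot to honour the computed coordinates instead of rescaling them, you can pass the layout explicitly and turn rescaling off. A sketch using standard igraph plotting parameters (expect to tune xlim, ylim, and vertex.size by hand; with 82 nodes on one row, some overlap is unavoidable):

plot(lay2$extd_graph,
     layout = lay2$layout,    # use the Sugiyama coordinates as-is
     rescale = FALSE,         # skip the default [-1,1] x [-1,1] rescaling
     xlim = range(lay2$layout[, 1]),
     ylim = range(lay2$layout[, 2]),
     vertex.label.cex = 0.5,
     vertex.size = 9)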
Natural Neighbor Interpolation in R
I need to conduct Natural Neighbor Interpolation (NNI) via R in order to smooth my numeric data. For example, say I have very spurious data; my goal is to use NNI to model the data neatly. I have several hundred rows of data (one observation for each postcode), alongside latitudes and longitudes. I've made up some data below:

Postcode   lat         lon         Value
200        -35.277272  149.117136  7
221        -35.201372  149.095065  38
800        -12.801028  130.955789  27
801        -12.801028  130.955789  3
804        -12.432181  130.84331   29
810        -12.378451  130.877014  20
811        -12.376597  130.850489  3
812        -12.400091  130.913672  42
814        -12.382572  130.853877  32
820        -12.410444  130.856124  39
821        -12.426641  130.882367  39
822        -12.799278  131.131697  49
828        -12.474896  130.907378  38
829        -14.460879  132.280002  34
830        -12.487233  130.972637  8
831        -12.480066  130.984006  49
832        -12.492269  130.990891  29
835        -12.48138   131.029173  33
836        -12.525546  131.103025  40
837        -12.460094  130.842663  39
838        -12.709507  130.995407  28
840        -12.717562  130.351316  22
841        -12.801028  130.955789  8
845        -13.038663  131.072091  19
846        -13.226806  131.098416  50
847        -13.824123  131.835799  11
850        -14.464497  132.262021  2
851        -14.464497  132.262021  23
852        -14.92267   133.064654  36
854        -16.81839   137.14707   17
860        -19.648306  134.186642  3
861        -18.94406   134.318373  8
862        -20.231104  137.762232  28
870        -12.436101  130.84059   24
871        -12.436101  130.84059   16

Is there any kind of package that will do this? I should mention that the only predictors I am using in this model are latitude and longitude. If there isn't a package that can do this, how can I implement it manually? I've searched extensively and I can't figure out how to implement this in R. I have seen one or two other SO posts, but they haven't assisted me in figuring this out. Please let me know if there's anything I must add to the question. Thanks.
I suggest the following: first, reproject the data to the corresponding UTM zone; then use the R whitebox package (a frontend for WhiteboxTools) to process the data using natural neighbour interpolation.
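A minimal sketch of that workflow, assuming the sf and whitebox packages are installed and WhiteboxTools has been initialised via whitebox::install_whitebox(). The wbt_natural_neighbour_interpolation() call and its arguments follow the whitebox package's naming conventions and should be checked against your installed version, and the UTM zone below is only an example (the sample data actually spans several zones):

library(sf)        # for reprojection
library(whitebox)  # R frontend for WhiteboxTools

# assume 'dat' is the data frame shown in the question
# (columns Postcode, lat, lon, Value, in WGS84 lon/lat)
pts <- st_as_sf(dat, coords = c("lon", "lat"), crs = 4326)

# reproject to a UTM zone covering the data, e.g. zone 52 south (EPSG:32752)
pts_utm <- st_transform(pts, crs = 32752)
st_write(pts_utm, "points_utm.shp")

# natural neighbour interpolation of 'Value' onto a 1 km raster grid
wbt_natural_neighbour_interpolation(
  input     = "points_utm.shp",
  field     = "Value",
  output    = "nni.tif",
  cell_size = 1000
)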
How do I make sure numbers are numeric from a .txt?
I'm setting up a script to extract the thickness and voltages from a single-column text file and fit a Weibull distribution to them. When I try to use fitdistr() I get an error stating "'x' must be a non-empty numeric vector". R is supposed to interpret numbers in text files as numeric, but that doesn't seem to be happening. Any thoughts?

filename <- "SampleBreakdownSet.txt"
d <- read.table(filename, header = FALSE, sep = "")

#Extract thickness from the dataset; set to variable t
t = d[1,1]
#Extract the breakdown voltages and toss into dataset, BDV
BDV = tail(d,(nrow(d)-1))
#Calculates the breakdown field from the thickness and BDV
BDF = (BDV*10000) / t
#Calculates the Weibull parameters from the input breakdown voltages.
fitdistr(BDF, densfun ="weibull", lower = 0)

fitdistr(BDF, densfun ="weibull", lower = 0)
Error in fitdistr(BDF, densfun = "weibull", lower = 0) : 
  'x' must be a non-empty numeric vector

Sample data I'm using (a single column in the file, shown here condensed; the first value, 2, is the thickness):

2
200 250 450 320 100 400 200 403 502 203 420 120 342 304 253 423 534 534 243 253
423 123 433 534 234 633 432 342 543 532 123 453 231 532 342 213 243
You are passing a data.frame to fitdistr, but you should be passing the vector itself. Try this (the text= argument reproduces the single-column file, one value per line):

d <- read.table(text='200
250
450
320
100
400
200
403
502
203
420
120
342
304
253
423
534
534
243
253
423
123
433
534
234
633
432
342
543
532
123
453
231
532
342
213
243', header=FALSE)

t <- d[1,1]

#Extract the breakdown voltages and toss into a vector, BDV
BDV <- d[-1, 1]

BDF <- (BDV*10000) / t

library(MASS)
fitdistr(BDF, densfun ="weibull", lower = 0)

Alternatively, if you keep BDF as a one-column data frame (as in your original code), refer to the relevant column when calling fitdistr, e.g.:

fitdistr(BDF$V1, densfun ="weibull", lower = 0)
#        shape          scale   
#   2.745485e+00   1.997509e+04 
#  (3.716797e-01) (1.283667e+03)
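A quick way to confirm up front that read.table produced numeric data before fitting (base R only; assumes the single-column layout above):

str(d)            # 'data.frame': 37 obs. of 1 variable: $ V1: int 200 250 450 ...
is.numeric(d$V1)  # TRUE when the column was parsed as numbers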