all_simple_paths() in igraph in R is taking too much time

I am trying to solve a trip assignment problem (transport planning). The available data are trips between nodes and a links shapefile with 'from' and 'to' codes that match those in the trips data. The approach I am adopting is this:
1. Take each origin-destination (OD) pair from the trip data.
2. Find all possible paths between that OD pair.
3. Sort those paths by length.
4. Start with the shortest path and assign trips to it until its capacity is reached.
5. Then take the second shortest path and assign the trips, and so on.
The problem I am facing is at the 2nd step. I am using all_simple_paths() to get all possible paths between two nodes, but it is taking too long. Here is that line:
paths_between <- all_simple_paths(g_2, from = "240", to = "14")
How can I work around this? Is there an algorithm I can use to get all possible paths? Any help would be appreciated. Thank you.
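One thing that may help, offered as a sketch rather than a definitive fix: the number of simple paths between two nodes can grow exponentially with the size of the graph, so enumerating all of them is inherently slow on a real network. all_simple_paths() accepts a cutoff argument that bounds the path length (counted in edges), which may be enough for assignment because very long detours are unlikely to receive trips. The 7-node ring below is a made-up stand-in for g_2, and the cutoff value is arbitrary.

```r
library(igraph)

# Stand-in network (an assumption, not the asker's g_2): a 7-node ring
g <- make_ring(7)
V(g)$name <- as.character(1:7)

# Unbounded enumeration: every simple path from "1" to "4"
all_paths <- all_simple_paths(g, from = "1", to = "4")

# Bounded enumeration: only paths with at most 3 edges
short_paths <- all_simple_paths(g, from = "1", to = "4", cutoff = 3)

# Sort by number of vertices, shortest first (step 3 of the approach)
short_paths <- short_paths[order(lengths(short_paths))]
```

On a real network the cutoff would have to be tuned: too small and valid alternatives are missed, too large and the enumeration is slow again.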

Related

Calculate all time differences between points in R (instead of from one point to the next one)

I would like to calculate time differences from data in POSIXct format. I am able to calculate the differences between consecutive points using
diff(data$time)
but not between all pairs of points. So I guess my data is at least imported correctly.
I actually want to calculate all differences between the points of one individual, so my data looks like: Posix, individual, otherinfo. If there is a simple way, I would love to calculate the differences between all points per individual automatically. If it's not so straightforward, I will make data subsets per individual, that's fine.
I would be happy if someone could help me! I tried
dist(data$time)
because I know it's a distance matrix calculation tool, but unfortunately it just gives me a list of rising numbers (1,2,3,...), so I guess it is not familiar with the time format.
Thanks a lot!

We can use sapply
sapply(data$time, `-`, data$time)
or with outer
outer(data$time, data$time, FUN = `-`)
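For the per-individual part of the question, a minimal sketch could combine split() with outer(). The column names time and individual and the sample values are assumptions made for illustration; the times are converted to seconds first so that the unit of the result is explicit.

```r
# Made-up example data; column names are assumed, not taken from the question
data <- data.frame(
  individual = c("a", "a", "a", "b", "b"),
  time = as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:10:00",
                      "2020-01-01 01:00:00", "2020-01-02 00:00:00",
                      "2020-01-02 00:30:00"), tz = "UTC")
)

# All-against-all differences in minutes: entry [i, j] is time[i] - time[j]
pairwise_minutes <- function(t) {
  secs <- as.numeric(t)  # seconds since the epoch
  outer(secs, secs, FUN = "-") / 60
}

# One matrix per individual, returned as a named list
diffs <- lapply(split(data$time, data$individual), pairwise_minutes)
```

Here diffs$a is a 3 x 3 matrix and diffs$b a 2 x 2 matrix, each with zeros on the diagonal.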

Reading undirected graph relationships (A-B) in R and renaming vertices with igraph

In R I'm trying to map all Madrid tube stations using igraph and then calculate the shortest route between two stations (just the number of stations, not the distance). I'm following this syntax: "An undirected graph with two vertices called ‘A’ and ‘B’ and one edge only:
graph.formula(A-B)"
Below I just copy two tube lines for clarity's sake.
library("igraph")
metro<- graph.formula(PinardeChamartin-Bambu-Chamartin-PlazadeCastilla-Valdeacederas-Tetuan-Estrecho-Alvarado-CuatroCaminos-RiosRosas-Iglesia-Bilbao-Tribunal-GranVia-Sol-TirsodeMolina-AntonMartin-Atocha-AtochaRenfe-MenendezPelayo-Pacifico-PuentedeVallecas-NuevaNumancia-Portazgo,LasRosas-AvenidadeGuadalajara-Alsacia-LaAlmudena-LaElipa-Ventas-ManuelBecerra-Goya-PrincipedeVergara-Retiro-BancodeEspana-Sevilla-Sol-Opera-SantoDomingo-Noviciado-SanBernardo-Quevedo-Canal-CuatroCaminos)
sp <- get.shortest.paths(metro,from="Canal",to="Chamartin")
V(metro)[sp[[1]]]
It seems to work, but I have two questions:
1. How can I input the tube stations (nodes) and their relationships A-B for long lists into the graph more efficiently, for instance by reading a CSV?
2. How can I rename those nodes to include tildes, spaces and "ñ"? I tried putting double quotes before and after each node's name, but I get an error: a + sign. I have checked the long string many times and I cannot see the error; no parenthesis is missing.
Sorry if these are very basic questions. I'm a very novice user.
Thank you very much

For the first question, see ?graph.data.frame and ?read.csv.
I am not quite sure what you are asking in the second question. What is the error you are getting? Your code works fine for me, with the modification required for igraph 0.7.x:
V(metro)[sp$vpath[[1]]]
# Vertex sequence:
# [1] "Canal" "CuatroCaminos" "Alvarado" "Estrecho"
# [5] "Tetuan" "Valdeacederas" "PlazadeCastilla" "Chamartin"
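As a rough illustration of the first suggestion, the sketch below builds the graph from a two-column edge list instead of a formula. The CSV text is inlined so the example is self-contained; with a real file you would pass its path to read.csv() instead (possibly with fileEncoding = "UTF-8"). It uses graph_from_data_frame(), the current name of graph.data.frame in recent igraph versions. Because vertex names are ordinary strings here, spaces and accented characters need no special handling. The four stations are the first ones from the question, written with accents.

```r
library(igraph)

# Inline stand-in for a CSV file with one row per connection (assumed layout)
csv_text <- "from,to
Pinar de Chamartín,Bambú
Bambú,Chamartín
Chamartín,Plaza de Castilla"

edges <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Undirected graph; vertex names are taken from the first two columns
metro <- graph_from_data_frame(edges, directed = FALSE)

sp <- shortest_paths(metro, from = "Pinar de Chamartín",
                     to = "Plaza de Castilla")
route <- as_ids(sp$vpath[[1]])  # station names along the route
```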

What data structures allow for efficient lookup in nested intervals?

I’m looking for a data structure that would help me find the smallest interval (the (low, high) pair) that encloses a given point. Intervals may nest properly. For example:
Looking for point 3 in (2,7), (2,3), (4,5), (8,12), (9,10) should yield (2,3).
During the construction of the data structure, intervals are added in no particular order and, specifically, not according to their nesting. Is there a good way to map this problem to a search tree data structure?

A segment tree should do the job. In each node of the segment tree you keep the length of the shortest interval that covers this node, as well as a reference to the interval itself. When processing a query for a given point, you simply return the interval referenced by the node of that point.
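Before building the tree, it can be handy to have a brute-force baseline to check results against. The sketch below is not a segment tree: it scans every interval on each query (linear time per query) and assumes closed endpoints, as the example in the question suggests. The function name smallest_enclosing is made up for illustration.

```r
# Linear-scan reference: smallest closed interval [low, high] containing p.
# Returns NULL when no interval covers p.
smallest_enclosing <- function(low, high, p) {
  covers <- which(low <= p & p <= high)
  if (length(covers) == 0) return(NULL)
  best <- covers[which.min(high[covers] - low[covers])]
  c(low = low[best], high = high[best])
}

# The intervals from the question: (2,7), (2,3), (4,5), (8,12), (9,10)
low  <- c(2, 2, 4, 8, 9)
high <- c(7, 3, 5, 12, 10)

hit  <- smallest_enclosing(low, high, 3)    # the interval (2, 3)
miss <- smallest_enclosing(low, high, 7.5)  # NULL, nothing covers 7.5
```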

neo4j cypher : how to query a linked list

I'm having a bit of trouble designing a Cypher query.
I have a graph data structure that records some data over time, using
(starting_node)-[:last]->(data1)-[:previous]->(data2)-[:previous]->(data3)->...
Each of the data nodes has a date, and some data as attributes that I want to sum.
Now, for the sake of the example, let's say I want to query what happened last week.
The closest I got is a query like
start n= ... // definition of the many starting nodes here
match n-[:last]->d1, path = d1-[:previous*0..7]->dn
where dn.date > some_date_a_week_ago
Unfortunately, while I get the right path, I also get all the intermediate paths (from 2 days ago, from 3 days ago, etc.).
Since there are many starting nodes, and thus many possible path lengths, I cannot ask for the longest path in my query. Furthermore, dn.date can be different from date_a_week_ago (if there is only one data node this week and one data node last month, then the expected path is of length 1).
Any tips on how to filter out the intermediate paths in my query?
Thanks in advance!
PS: By the way, I'm quite new to graph modeling, and I'd be interested in any answer that requires changing the graph structure if needed.

You can add a further node "dnnext" to your path, and add a condition to ensure that "dn" is the last one that satisfies the condition:
start n= ... // definition of the many starting nodes here
match n-[:last]->d1, path = d1-[:previous*0..7]->dn-[:previous*0..1]->dnnext
where dn.date > some_date_a_week_ago and dnnext.date < some_date_a_week_ago

Movement data analysis in R; Flights and temporal subsampling

I want to analyse angles in movement of animals. I have tracking data that has 10 recordings per second. The data per recording consists of the position (x,y) of the animal, the angle and distance relative to the previous recording and furthermore includes speed and acceleration.
I want to analyse the speed an animal has while making a particular angle, however since the temporal resolution of my data is so high, each turn consists of a number of minute angles.
I figured there are two possible ways to work around this problem. I do not know how to do either of them in R, so help would be greatly appreciated.
The first: Reducing my temporal resolution by a certain factor. However, this brings the disadvantage of losing possibly important parts of the data. Despite this, how would I be able to automatically subsample for example every 3rd or 10th recording of my data set?
The second: By converting straight movement into so called 'flights'; rule based aggregation of steps in approximately the same direction, separated by acute turns (see the figure). A flight between two points ends when the perpendicular distance from the main direction of that flight is larger than x, a value that can be arbitrarily set. Does anyone have any idea how to do that with the xy coordinate positional data that I have?

It sounds like there are three potential things you might want help with: the algorithm, the math, or R syntax.
The algorithm you need may depend on the specifics of your data. For example, how much data do you have? What format is it in? Is it in 2D or 3D? One possibility is to iterate through your data set. With each new point, you need to check all the previous points to see if they fall within your desired column. If the data set is large, however, this might be really slow. Worst case scenario, all the data points are in a single flight segment, meaning you would check the first point the same number of times as you have data points, the second point one less, etc. This means n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 operations. That's O(n^2); the running time could grow quadratically with the size of your data set. Hence, you may need something more sophisticated.
The math to check whether a point is within your desired column of width x is pretty straightforward, although maybe more sophisticated math could help inform a better algorithm. One approach would be to use vector arithmetic. To take an example, suppose you have points A, B, and C. Your goal is to see if B falls in a column of width x around the vector from A to C. To do this, find the vector v orthogonal to the vector from A to C, then look at whether the magnitude of the scalar projection of the vector from A to B onto v is less than x. There is lots of literature available for help with this sort of thing.
I think this is where I might start (with a boolean function for an individual point), since it seems like an R function to determine this would be convenient. Then another function that takes a set of points and calculates the vector v and calls the first function for each point in the set. Then run some data and see how long it takes.
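A minimal sketch of those two functions in R, assuming 2D points stored as length-2 numeric vectors (or as rows of a two-column matrix) and assuming A and C are distinct. The names perp_dist and all_in_column are made up for illustration, and the comparison uses x as the maximum allowed perpendicular distance, as in the question.

```r
# Perpendicular distance from point p_b to the line through p_a and p_c
perp_dist <- function(p_a, p_b, p_c) {
  ac <- p_c - p_a
  ab <- p_b - p_a
  v <- c(-ac[2], ac[1]) / sqrt(sum(ac^2))  # unit vector orthogonal to AC
  abs(sum(ab * v))                         # |scalar projection of AB onto v|
}

# TRUE if every row of `pts` lies within distance x of the line A -> C
all_in_column <- function(pts, p_a, p_c, x) {
  d <- apply(pts, 1, function(p) perp_dist(p_a, p, p_c))
  all(d <= x)
}

a <- c(0, 0)
c_pt <- c(2, 0)
pts <- rbind(c(1, 0.2), c(1.5, -0.3))
inside <- all_in_column(pts, a, c_pt, x = 0.5)  # both points are close to AC
```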
I'm afraid I won't be of much help with R syntax, although it is on my list of things I'd like to learn. I checked out the manual for R last night and it had plenty of useful examples. I believe this is very doable, even for an R novice like myself. It might be kind of slow if you have a big data set. However, with something that works, it might also be easier to acquire help from people with more knowledge and experience to optimize it.
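On the R syntax side, the subsampling asked about in the question's first option can be sketched with row indexing. The data frame track and its columns are made up for illustration, and every_kth is a hypothetical helper name.

```r
# Made-up track: 100 recordings at 10 per second
track <- data.frame(
  t = (0:99) / 10,
  x = cos((0:99) / 20),
  y = sin((0:99) / 20)
)

# Keep rows 1, 1 + k, 1 + 2k, ...
every_kth <- function(df, k) {
  df[seq(1, nrow(df), by = k), , drop = FALSE]
}

sub3  <- every_kth(track, 3)   # every 3rd recording
sub10 <- every_kth(track, 10)  # every 10th recording, i.e. one per second
```

Note that angle, distance, speed and acceleration relative to the previous recording would need to be recomputed from the subsampled positions, since the stored values refer to the original 0.1 s steps.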
Two quick clarifying points in case they are helpful:
The above suggestion is just to start with the data for a single animal, so when I talk about growth of data I'm talking about the average data sample size for a single animal. If that is slow, you'll probably need to fix that first. Then you'll need to potentially analyze/optimize an algorithm for processing multiple animals afterwards.
I'm implicitly assuming that the definition of flight segment is the largest subset of contiguous data points where no "sub" flight segment violates the column rule. That is to say, I think I could come up with an example where a set of points satisfies your rule of falling within a column of width x around the vector to the last point, but if you looked at the column of width x around the vector to the second to last point, one point wouldn't meet the criteria anymore. Depending on how you define the flight segment then (e.g. if you want it to be the largest possible set of points that meet your condition and don't care about what happens inside), you may need something different (e.g. work backwards instead of forwards).
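Working forwards under that definition could be sketched as below. This is one possible greedy reading of the rule, not an established algorithm from a package: a flight is extended point by point, and it ends as soon as some intermediate point lies farther than x from the line between the flight's start and the newest point. The function names and the test track are made up, consecutive points are assumed to be distinct, and the worst case is the quadratic behaviour described above.

```r
# Perpendicular distance from p_b to the line through p_a and p_c (2D)
perp_dist <- function(p_a, p_b, p_c) {
  ac <- p_c - p_a
  abs(sum((p_b - p_a) * c(-ac[2], ac[1]))) / sqrt(sum(ac^2))
}

# Returns the row indices where flights start/end (first and last included)
segment_flights <- function(xy, x) {
  xy <- as.matrix(xy)
  n <- nrow(xy)
  breaks <- 1L
  start <- 1L
  end <- 2L
  while (end <= n) {
    ok <- TRUE
    if (end - start >= 2L) {
      # check every intermediate point against the line start -> end
      for (k in (start + 1L):(end - 1L)) {
        if (perp_dist(xy[start, ], xy[k, ], xy[end, ]) > x) {
          ok <- FALSE
          break
        }
      }
    }
    if (ok) {
      end <- end + 1L            # extend the current flight
    } else {
      start <- end - 1L          # previous point closes the flight
      breaks <- c(breaks, start) # and opens the next one
    }
  }
  c(breaks, n)
}

# Made-up track: straight east for 3 steps, then straight north for 3 steps
xy <- cbind(x = c(0, 1, 2, 3, 3, 3, 3), y = c(0, 0, 0, 0, 1, 2, 3))
flight_ends <- segment_flights(xy, x = 0.5)  # turn detected at row 4
```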
