Using the example graph below, I'm trying to collect a mapping of terms and weights for each content vertex.
v1 = g.addV("content").property("title", "Title 1")
v2 = g.addV("content").property("title", "Title 2")
v3 = g.addV("content").property("title", "Title 3")
v4 = g.addV("content").property("title", "Title 4")
v5 = g.addV("term").property("name", "Term 1")
v6 = g.addV("term").property("name", "Term 2")
g.addE("hasTerm").from(v1).to(v5).property("weight", 5)
g.addE("hasTerm").from(v1).to(v6).property("weight", 8)
g.addE("hasTerm").from(v2).to(v5).property("weight", 10)
g.addE("hasTerm").from(v3).to(v5).property("weight", 15)
g.addE("hasTerm").from(v3).to(v6).property("weight", 6)
g.addE("hasTerm").from(v4).to(v6).property("weight", 8)
Example Required Output:
[[Title 1, Terms:[[t:Term 1, w:5],[t:Term 2, w:8]]],
[Title 2, Terms:[[t:Term 1, w:10]]],
[Title 3, Terms:[[t:Term 1, w:15],[t:Term 2, w:6]]],
[Title 4, Terms:[[t:Term 2, w:8]]]]
I got close to accomplishing this goal, using a query like this:
g.V().has('content', 'title', 'Title 1').as('a').
out('hasTerm').as('t', 'w').
select('t', 'w').
by('name').
by(inE('hasTerm').values('weight'))
And got this result when only looking at terms for 1 content vertex:
==>[t:Term 1,w:5]
==>[t:Term 2,w:8]
But whenever I tried to run the query over all vertices with label content and group the results by content, it didn't work.
An additional requirement I still need to solve (after this step) is to find the sum of the minimum weights shared between two pieces of content. Comparing Title 1 and Title 3: they both share Term 1 and Term 2, with varying weights. I would take the minimum of the two weights going to Term 1 and add it to the minimum of the two weights going to Term 2. Examples of all permutations below:
[[Title 1, Title 2, w:5],
[Title 1, Title 3, w:11],
[Title 1, Title 4, w:8],
[Title 2, Title 3, w:10],
[Title 3, Title 4, w:6]]
If this second part would be better handled in code rather than in the graph query, I'll do that, but I'm not sure how I would even begin to do it in the query.
One way to do this is to use nested group() steps. Notice that I also changed the query from out() and inE() to outE() and inV():
gremlin> g.V().hasLabel('content').
......1> group().
......2> by('title').
......3> by(outE('hasTerm').
......4> group().
......5> by(inV().values('name')).
......6> by(values('weight'))).unfold()
==>Title 4={Term 2=8}
==>Title 1={Term 2=8, Term 1=5}
==>Title 3={Term 2=6, Term 1=15}
==>Title 2={Term 1=10}
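For the second requirement (sums of minimum shared-term weights between pairs of contents), it may indeed be simpler to post-process the grouped result in application code. A minimal Python sketch, assuming the grouped map above has been fetched into a plain dict:

```python
from itertools import combinations

# Grouped result from the query above: title -> {term: weight}
data = {
    "Title 1": {"Term 1": 5, "Term 2": 8},
    "Title 2": {"Term 1": 10},
    "Title 3": {"Term 1": 15, "Term 2": 6},
    "Title 4": {"Term 2": 8},
}

pairs = []
for a, b in combinations(sorted(data), 2):
    shared = data[a].keys() & data[b].keys()
    if shared:  # only keep pairs that share at least one term
        w = sum(min(data[a][t], data[b][t]) for t in shared)
        pairs.append((a, b, w))

# pairs == [('Title 1', 'Title 2', 5), ('Title 1', 'Title 3', 11),
#           ('Title 1', 'Title 4', 8), ('Title 2', 'Title 3', 10),
#           ('Title 3', 'Title 4', 6)]
```

Pairs that share no terms (here Title 2 and Title 4) are simply omitted, matching the example output in the question.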
I'm trying to solve this problem:
There is a grid with r rows and c columns. A robot sitting in the top left cell can only move in two directions: right and down. But certain cells have to be avoided and the robot cannot step on them. Find a path for the robot from the top left to the bottom right.
The problem specifically asks for a single path, and that seems straightforward:
Having the grid as boolean[][], the pseudocode I have is
List<String> path = new ArrayList<String>();
boolean found = false;

void getPath(int r, int c) {
    if (!found) {
        if ((r or c is out of bounds) || (!grid[r][c]))
            return;
        if (r == 0 && c == 0)  // we reached the start
            found = true;
        getPath(r - 1, c);
        getPath(r, c - 1);
        String cell = "(" + r + ", " + c + ")";
        path.add(cell);
    }
}
Though I was wondering how I can get all the possible paths (NOT just the count, but the path values as well). Note that the grid has r rows and c columns, so it's not an n×n grid. I'm trying to think of a DP/recursive solution but am unable to come up with one and am stuck. It's hard to think when the recursion goes in two ways.
Any pointers? And also any general help on how to "think" about such problems would be appreciated :).
Approach to the problem:
Mentally construct the graph G of the problem. In this case the vertices are the cells of the grid, and a directed edge is created wherever a valid robot move exists.
Search for properties of G. In this case G is a DAG (Directed Acyclic Graph).
Use such properties to come up with a solution. In this case (G is a DAG) it is common to use topological sort and dynamic programming to find the number of valid paths.
Actually you don't need to construct the graph explicitly, since the set of edges is clear, nor to do a topological sort: the usual iteration over the matrix (increasing row index, increasing column index) already is a topological order of this implicit graph.
The dynamic programming part can be solved by storing in each cell [x][y] the amount of valid paths from [0][0] to [x][y] and checking where to move next.
Recurrence:
dp[0][0] = 1 if the start cell is free; dp[x][y] = 0 if cell [x][y] is forbidden; otherwise dp[x][y] = dp[x - 1][y] + dp[x][y - 1], treating out-of-bounds terms as 0.
After the computation the answer is stored in dp[n - 1][m - 1], where n is the number of rows and m the number of columns. The overall runtime is O(nm).
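Filling the table row by row realizes this idea directly. A minimal Python sketch, assuming bad[x][y] is True for forbidden cells (it reproduces the dp matrix hardcoded in the backtracking code below):

```python
n, m = 3, 4
bad = [[False, False, False, False],
       [True,  True,  False, False],
       [False, False, False, False]]

# dp[x][y] = number of valid paths from (0, 0) to (x, y)
dp = [[0] * m for _ in range(n)]
for x in range(n):
    for y in range(m):
        if bad[x][y]:
            continue  # forbidden cell keeps dp[x][y] == 0
        if x == 0 and y == 0:
            dp[x][y] = 1  # one way to stand at the start
        else:
            dp[x][y] = (dp[x - 1][y] if x > 0 else 0) + \
                       (dp[x][y - 1] if y > 0 else 0)

# dp == [[1, 1, 1, 1], [0, 0, 1, 2], [0, 0, 1, 3]]
```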
How about finding all possible valid paths?
Usual backtracking works, and we can speed it up with early pruning: if we calculate the dp matrix first and then backtrack from cell [n - 1][m - 1], we can abandon a branch as soon as the robot enters a cell whose dp value is zero.
Python code with dp matrix calculated beforehand:
n, m = 3, 4
bad = [[False, False, False, False],
[ True, True, False, False],
[False, False, False, False]]
dp = [[1, 1, 1, 1],
[0, 0, 1, 2],
[0, 0, 1, 3]]
paths = []
curpath = []
def getPath(r, c):
    if r < 0 or c < 0 or dp[r][c] == 0:  # bounds check must come before indexing
        return
    curpath.append((r, c))
    if r == 0 and c == 0:
        # append a copy: curpath keeps mutating as we backtrack
        paths.append(list(reversed(curpath)))
    getPath(r - 1, c)
    getPath(r, c - 1)
    curpath.pop()
getPath(n - 1, m - 1)
print(paths)
# valid paths are [[(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3)],
# [(0, 0), (0, 1), (0, 2), (1, 2), (1, 3), (2, 3)],
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3)]]
Notice that this is very similar to your code; the differences are that all valid paths are stored together, and that the appended lists are copies of curpath, to avoid ending up with a list of empty lists.
Runtime: O((n + m) * (number of valid paths)), since every simulated robot move either belongs to a valid path or is the first step into an invalid one, detected immediately using the foresight provided by dp. Warning: this method can take exponential time, as the number of valid paths can be as large as C(n + m - 2, n - 1) in an obstacle-free grid.
Is it possible for someone to explain "TF is dependent on term and document" and "IDF is just dependent on the term" with an example?
Suppose that we have these two documents:
d_1: "Tf is dependent on term and document"
d_2: "IDF is just dependent on the term"
The count of terms in each document is as follows:
d_1:
{Tf: 1, is: 1, dependent: 1, on: 1, term: 1, and: 1, document: 1}
d_2:
{IDF: 1, is: 1, just: 1, dependent: 1, on: 1, the: 1, term: 1}
The term frequencies (i.e., the ratio of the number of times term t appears in document d to the total count of terms in that document) for the term "on" are:
tf(on, d_1) = 1 / 7
tf(on, d_2) = 1 / 7
For calculating the term frequency of a term, you must specify which document you are talking about. tf(on, d_1) = 1/7 tells you that 1/7 of all words in d_1 are "on".
The inverse document frequency (the logarithm of the ratio of the total number of documents to the number of documents that contain the word "on") is:
idf(on) = log(2/2) = 0
As you can see, the idf is constant for all documents in this corpus of two documents. It's just a measure of how common a term is in a set of documents. idf(on) = 0 tells you that "on" is not special at all; it appears in every document.
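Both quantities can be computed directly from the token counts. A minimal Python sketch using the two example documents (plain whitespace tokenization and natural log are assumptions for illustration):

```python
import math

docs = {
    "d_1": "Tf is dependent on term and document".split(),
    "d_2": "IDF is just dependent on the term".split(),
}

def tf(term, doc):
    # term frequency: depends on BOTH the term and the document
    words = docs[doc]
    return words.count(term) / len(words)

def idf(term):
    # inverse document frequency: depends on the term only
    n_containing = sum(term in words for words in docs.values())
    return math.log(len(docs) / n_containing)

# tf("on", "d_1") == tf("on", "d_2") == 1/7, while idf("on") == log(2/2) == 0.0
```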
We have the Graph below.
gremlin> a = graph.addVertex("name", "alice")
==>v[4208]
gremlin> b = graph.addVertex("name", "bobby")
==>v[40968424]
gremlin> c = graph.addVertex("name", "cindy")
==>v[4192]
gremlin> d = graph.addVertex("name", "david")
==>v[40972520]
gremlin> e = graph.addVertex("name", "eliza")
==>v[40964272]
gremlin> a.addEdge("rates",b,"tag","ruby","value",9)
==>e[2ry-38w-azv9-oe3fs][4208-rates->40968424]
gremlin> b.addEdge("rates",c,"tag","ruby","value",8)
==>e[odzq5-oe3fs-azv9-38g][40968424-rates->4192]
gremlin> c.addEdge("rates",d,"tag","ruby","value",7)
==>e[170-38g-azv9-oe6lk][4192-rates->40972520]
gremlin> d.addEdge("rates",e,"tag","ruby","value",6)
==>e[oe04d-oe6lk-azv9-oe08g][40972520-rates->40964272]
gremlin> a.addEdge("rates",e,"tag","java","value",9)
==>e[366-38w-azv9-oe08g][4208-rates->40964272]
gremlin> g.E().values("tag")
==>ruby
==>ruby
==>ruby
==>ruby
==>java
gremlin> graph.tx().commit()
With the script below we can get the weights along all the paths.
gremlin> g.V().has('name','alice').repeat(outE().has('tag','ruby').inV()).times(3).emit().group().by('name').by(path().unfold().has('value').values('value').fold()).next()
==>bobby=[9]
==>cindy=[8, 9, 8]
==>david=[8, 7, 9, 8, 7]
And below we can sum the weights along all the paths.
gremlin> g.V().has('name','alice').repeat(outE().has('tag','ruby').inV()).times(3).emit().group().by('name').by(path().unfold().has('value').values('value').sum()).next()
==>bobby=9 sum[9]
==>cindy=25 sum[8, 9, 8]
==>david=39 sum[8, 7, 9, 8, 7]
My question is: how can I sum the weights with a coefficient, like below?
==>bobby=[9] = 9*1
==>cindy=[8, 9, 8] = 8*1 + 9*0.5 + 8*0.25
==>david=[8, 7, 9, 8, 7] = 8*1 + 7*0.5 + 9*0.25 + 8*0.125 + 7*0.0625
Another question: what if the coefficient comes from an edge property co?
Your help is really appreciated!
Try to avoid lambdas when possible.
gremlin> g.withSack(1.0).V().has("name","alice").
repeat(outE("rates").has("tag","ruby").
project("a","b","c").
by(inV()).
by(sack()).
by("value").as("x").
select("a").
sack(mult).by(constant(0.5))).
times(3).emit().
select(all, "x").
project("name","score").
by(tail(local, 1).select("a").values("name")).
by(unfold().
sack(assign).by(select("b")).
sack(mult).by(select("c")).
sack().sum())
==>[name:bobby,score:9.0]
==>[name:cindy,score:13.00]
==>[name:david,score:14.750]
This query returns a result that differs a bit from your provided example, but that's only because your queries show results that were not produced from the provided sample graph.
You can do something like g.V().has('name','alice').repeat(outE().has('tag','ruby').inV()).times(3).emit().group().by('name').by(path().unfold().has('value').values('value').fold()).next().collect{k, v -> v.withIndex().collect {Integer it, Integer idx -> return it * (1/(idx + 1))}.inject(0.0) {acc,i -> acc+i}} Don't forget, you're working with the Groovy programming language as well, so you have access to those features.
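Whichever route is used (sack() in the traversal or post-processing in code), the intended scoring is just a geometric decay over each path's edge values. A minimal Python sketch of that arithmetic, applied to the weight lists from the question (decayed_sum is a hypothetical helper name; as noted above, those lists differ from what the sample graph actually produces, so the numbers won't match the Gremlin output):

```python
def decayed_sum(weights, factor=0.5):
    # weight i is multiplied by factor**i: 1, 0.5, 0.25, ...
    return sum(w * factor ** i for i, w in enumerate(weights))

bobby = decayed_sum([9])           # 9 * 1
cindy = decayed_sum([8, 9, 8])     # 8*1 + 9*0.5 + 8*0.25
david = decayed_sum([8, 7, 9, 8, 7])
```

A per-edge coefficient (e.g. an edge property co) would replace the constant factor: multiply a running product by each edge's co instead of by 0.5, which is exactly what sack(mult).by(...) does in the traversal above.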
I have a set of datapoints such as (this is an example):
val=4; (*this value is calculated earlier in the program, so it is not known a priori*)
x={0,1,2,3};
data=Table[0, {val}];
data[[1]] = {1,5,6,8};
data[[2]] = {9,7,1,3};
data[[3]] = {3,4,5,6};
data[[4]] = {2,2,4,6};
Now I can plot each of these data with ListPlot as
ListPlot[Transpose[{x,data[[1]]}]]
and if I want to plot more than one I can do
ListPlot[{Transpose[{x, data[[1]]}], Transpose[{x, data[[2]]}]}]
but how can I plot all of them with a single line of code, considering that val is calculated earlier in the program?
Is there a way to do something like
For[i = 1, i < val + 1, i++, ListPlot[Transpose[{x, data[[i]]}]]......]
having a single graph with all x-y curves?
Indeed I would like a static picture of
Manipulate[ListPlot[Transpose[{x, data[[i]]}]], {i, 1, val,1}]
Thanks
Virgilio
You want to "do the same thing" to every element of a list. That should tell you to think of using Map. Your list is named data and each element is a four-element sublist. If you look at the help page for Map, it shows you need to think up a function that does what you need to do to each individual sublist. You have already understood that you need to use Transpose with x and your sublist, so that gives you your function and you are almost there. The result of Map will be a list of all those results. So
In[1]:= x = {0, 1, 2, 3};
data = {{1, 5, 6, 8}, {9, 7, 1, 3}, {3, 4, 5, 6}, {2, 2, 4, 6}};
ListPlot[Map[Transpose[{x, #}] &, data], Joined -> True]
Out[3]= ...FourOverlaidPlotsSnipped...
Go through that a word at a time until you really understand the thinking behind it. You will use this idea again and again if you keep using Mathematica.
For the example you give the cleanest method is to use DataRange:
data = {{1, 5, 6, 8}, {9, 7, 1, 3}, {3, 4, 5, 6}, {2, 2, 4, 6}};
ListLinePlot[data, DataRange -> {0, 3}]
Please ask your future questions on the dedicated Mathematica Stack Exchange site.
Suppose I have a tuple of (1, 2, 3) and want to index a multidimensional array with it such as:
index = (1, 2, 3)
table[index] = 42 # behaves like table[1][2][3]
index has an unknown number of dimensions, so I can't do:
table[index[0]][index[1]][index[2]]
I know I could do something like this:
functools.reduce(lambda x, y: x[y], index, table)
but it's utterly ugly (and maybe also inefficient), so I wonder if there's a better, more Pythonic choice.
EDIT: Maybe a simple loop is the best choice:
elem = table
for i in index:
elem = elem[i]
EDIT2: Actually, there's a problem with both solutions: I can't assign a value to the indexed array :-(, so it's back to ugly:
elem = table
for i in index[:-1]:
elem = elem[i]
elem[index[-1]] = 42
The question is very interesting, and your suggested solution also looks good (I haven't checked it, but this kind of problem calls for a recursive treatment and you did just that in one line).
However, the Pythonic way I use in my programs is a dictionary keyed by tuples. The syntax is array-like, the performance is that of a dictionary, and it has never been a problem for me.
For example:
a = {(1, 2, 3): 'A', (3, 4, 5): 'B', (5, 6, 7, 8): 'C'}
print(a[1, 2, 3])
print(a[5, 6, 7, 8])
Will output:
A
C
And assigning to an index is super easy:
a[1, 4, 5] = 42 (but you might want to first check whether (1, 4, 5) is already in the dict, or else the assignment will create it).
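If you do want to keep genuinely nested lists rather than a tuple-keyed dict, the reduce trick from the question extends to assignment by walking down to the next-to-last container first. A minimal sketch (get_nested and set_nested are hypothetical helper names):

```python
from functools import reduce

def get_nested(table, index):
    # table[i0][i1]...[ik] for index == (i0, i1, ..., ik)
    return reduce(lambda elem, i: elem[i], index, table)

def set_nested(table, index, value):
    # descend to the parent container, then assign at the last index
    get_nested(table, index[:-1])[index[-1]] = value

table = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
set_nested(table, (1, 0, 1), 42)
# get_nested(table, (1, 0, 1)) now returns 42
```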