We have a photo-sharing application built with the TinkerPop 3.4.3 Java library and an AWS Neptune graph. In our application, we already use the .flatMap() step to chain in traversals returned by other methods. Our current code looks like this:
public boolean isViewable(String photoId, String userId) {
return graph.V(photoId).hasLabel("photo")
.flatMap(getDirection(userId))
.otherV().hasId(userId).hasNext();
}
Based on the userId, we retrieve the correct direction/relation information from the other systems and use it here for the result.
Unfortunately, we're seeing a performance issue with the .flatMap() step when the photo vertex has a large number of edges (around 100K).
...flatMap(inE()).otherV().profile() takes ~5000 milliseconds, but the same query without .flatMap() takes less than 200 milliseconds.
To avoid this, we've changed our code as follows:
public boolean isViewable(String photoId, String userId) {
GraphTraversal<Vertex, Vertex> traversal = graph.V(photoId).hasLabel("photo");
applyDirection(traversal, userId);
return traversal.otherV().hasId(userId).hasNext();
}
private void applyDirection(GraphTraversal<Vertex, Vertex> traversal, String userId) {
if(condition) {
traversal.inE();
} else {
traversal.outE();
}
}
But the code looks complex without the chaining. Are there any other steps available for chaining the traversals?
I don't think your code without the chaining is all that complex or hard to read. It's quite common to take that approach when dynamically building a traversal. If you really dislike it you could build a DSL to make a custom step to encapsulate that logic:
graph.V(photoId).hasLabel("photo")
.eitherInOrOut(userId)
.otherV().hasId(userId).hasNext();
If your logic for determining the Direction is truly that simple, you could also use the little-known toE() step (the edge counterpart of to()), which takes a Direction:
graph.V(photoId).hasLabel("photo")
.toE(getDirection(userId))
.otherV().hasId(userId).hasNext();
I have a query which uses code like:
criteria.AddOrder(
Order.Asc(
Projections.SqlFunction(
new StandardSQLFunction("NEWID"),
new NHType.GuidType(),
new IProjection[0])));
The purpose is to get a random ordering. I run this against SQL Server, but I would also like to run it against SQLite, since we use that for testing. SQLite does not support NEWID() but has random() instead. Is it possible to write (or configure) the code so that the same query works against both databases?
I think the way to do this is to create two custom dialects. Have each one implement a random function differently:
public class MyMsSqlDialect : MsSql2012Dialect
{
protected override void RegisterFunctions()
{
base.RegisterFunctions();
RegisterFunction("random", new StandardSQLFunction("NEWID", NHibernateUtil.Guid));
}
}
public class MySqliteDialect : SQLiteDialect
{
protected override void RegisterFunctions()
{
base.RegisterFunctions();
RegisterFunction("random", new StandardSQLFunction("random", NHibernateUtil.Guid));
}
}
Now, the following query should work fine in either database:
criteria.AddOrder(
Order.Asc(
Projections.SqlFunction("random", NHibernateUtil.Guid)));
Note that this is cheating a bit. random doesn't return a Guid in the SQLite flavor, but since NHibernate doesn't need that information to do the ORDER BY, nothing should go wrong. I would consider calling it random_order or something similar to make it clear that it is only for ordering.
We are currently evaluating GridGain 6.5.5 as a potential solution for distributing compute jobs over a grid.
The problem we are facing at the moment is a lack of a suitable asynchronous notification mechanism that will notify the sender asynchronously upon job completion (or future completion).
The prototype architecture is relatively simple, and the core issue is presented in the pseudo code below (the full code cannot be published due to an NDA). Important: the code represents only the "problem"; the possible solution is described in the text at the bottom, together with the question.
//will be used as an entry point to the grid for each client that will submit jobs to the grid
public class GridClient{
//client node for submission that will be reused
private static Grid gNode = GridGain.start("config xml file goes here");
//provides the functionality of submitting multiple jobs to the grid for calculation
public int sendJobs2Grid(GridJob[] jobs){
Collection<GridCallable<GridJobOutput>> calls = new ArrayList<>();
for (final GridJob job : jobs) {
calls.add(new GridCallable<GridJobOutput>() {
@Override public GridJobOutput call() throws Exception {
GridJobOutput result = job.process();
return result;
}
});
}
GridFuture<Collection<GridJobOutput>> fut = this.gNode.compute().call(calls);
fut.listenAsync(new GridInClosure<GridFuture<Collection<GridJobOutput>>>(){
@Override public void apply(GridFuture<Collection<GridJobOutput>> jobsOutputCollection) {
Collection<GridJobOutput> jobsOutput;
try {
jobsOutput = jobsOutputCollection.get();
for(GridJobOutput currResult: jobsOutput){
//do something with the current job output BUT CANNOT call jobFinished(GridJobOutput out) method
//of sendJobs2Grid class here
}
} catch (GridException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
});
return calls.size();
}
//This function should be invoked asynchronously when the GridFuture is done;
//it will invoke some processing/aggregation of the result for each submitted job
public void jobFinished(GridJobOutput out) {}
}
//represents a job type that is to be submitted to the grid
public class GridJob{
public GridJobOutput process(){}
}
Description:
The idea is that a GridClient instance will be used to submit a list/array of jobs to the grid and notify the sender how many jobs were submitted; when the jobs finish (asynchronously), it will perform some processing of the results. For the results processing part, the "GridClient.jobFinished(GridJobOutput out)" method should be invoked.
Now, getting to the question at hand: we are aware of the GridInClosure interface, which can be used with "GridFuture.listenAsync(GridInClosure<...> lsnr)" to register a future listener.
The problem (if my understanding is correct) is that it is a good and pretty straightforward solution in case the result of the future is to be "processed" by code that is within the scope of the given GridInClosure. In our case we need to use the "GridClient.jobFinished(GridJobOutput out)" which is out of the scope.
Due to the fact that GridInClosure has a single argument R and it has to be of the same type as of GridFuture result it seems impossible to use this approach in a straightforward manner.
If I have understood this correctly so far, then in order to use the "GridFuture.listenAsync(..)" approach the following has to be done:
GridClient will have to implement an interface granting access to the "jobFinished(..)" method; let's name it GridJobFinishedListener.
GridJob will have to be "wrapped" in a new class in order to have an additional property of GridJobFinishedListener type.
GridJobOutput will have to be "wrapped" in a new class in order to have an additional property of GridJobFinishedListener type.
When the GridJob is done, in addition to the "standard" result, GridJobOutput will contain the corresponding GridJobFinishedListener reference.
Given the above modifications, GridInClosure can now be used, and in the apply(GridJobOutput) method it will be possible to call the GridClient.jobFinished(GridJobOutput out) method through the GridJobFinishedListener interface.
So, if I have understood it all correctly, this seems a rather clumsy workaround. I hope I have missed something and there is a much better way to handle this relatively simple case of an asynchronous callback.
Looking forward to any helpful feedback, thanks a lot in advance.
Your code looks correct and I don't see any problems in calling jobFinished method from the future listener closure. You declared it as an anonymous class which always has a reference to the external class (GridClient in your case), therefore you have access to all variables and methods of GridClient instance.
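To illustrate the point about anonymous classes, here is a minimal sketch in plain Java. It does not use the GridGain API: ResultListener, CallbackClient and CallbackDemo are hypothetical stand-ins for GridInClosure, GridClient and a caller, and the job outputs are plain strings. It only shows that a listener declared as an anonymous class inside an instance method can call a method of the enclosing instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a future listener such as GridInClosure.
interface ResultListener<T> {
    void apply(T result);
}

// Hypothetical stand-in for GridClient; not part of any GridGain API.
class CallbackClient {
    final List<String> finished = new ArrayList<>();

    // The outer-class method the listener needs to reach.
    void jobFinished(String output) {
        finished.add(output);
    }

    // Creates the listener as an anonymous class inside an instance method.
    ResultListener<List<String>> newListener() {
        return new ResultListener<List<String>>() {
            @Override
            public void apply(List<String> outputs) {
                for (String out : outputs) {
                    // The anonymous class keeps a reference to the enclosing
                    // CallbackClient instance, so this call resolves to it.
                    // The explicit form is CallbackClient.this.jobFinished(out).
                    jobFinished(out);
                }
            }
        };
    }
}

class CallbackDemo {
    public static void main(String[] args) {
        CallbackClient client = new CallbackClient();
        // Simulate the future completing with two job outputs.
        client.newListener().apply(Arrays.asList("a", "b"));
        System.out.println(client.finished); // prints [a, b]
    }
}
```

Note that this only works when the anonymous class is created in a non-static context, as it is in sendJobs2Grid; a listener created in a static method has no enclosing instance to call.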
I like the idea of intermediate operations in Java 8, where all operations are applied only once a terminal operation is reached.
Is there a library I can use with Java 7 that allows me to achieve such behaviour?
Note:
I am using commons-collections4 for collection operations such as forAllDo. Is it possible to use it for this case (intermediate vs. terminal operations)?
Guava
As your [Guava] tag suggests, most Guava collection operations are lazy - they are applied only once needed. For example:
List<String> strings = Lists.newArrayList("1", "2", "3");
List<Integer> integers = Lists.transform(strings, new Function<String, Integer>() {
@Override
public Integer apply(String input) {
System.out.println(input);
return Integer.valueOf(input);
}
});
This code seems to convert a List<String> to a List<Integer> while also writing the strings to the output. But if you actually run it, it doesn't do anything. Let's add some more code:
for (Integer i : integers) {
// nothing to do
}
Now it writes the inputs out!
That's because the Lists.transform() method doesn't actually do the transforming, but returns a specially crafted class which only computes the values when they are needed.
Bonus proof that it all works nicely: If we removed the empty loop and replaced it with e.g. just integers.get(1);, it would actually only output the number 2.
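The "specially crafted class" idea can be sketched with nothing but the JDK. The following is not Guava's implementation; Mapper, LazyViews and LazyViewsDemo are hypothetical names used only for illustration. It returns a list view whose elements are computed on each access, and records which inputs the function has actually seen.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical one-method interface, playing the role of Guava's Function.
interface Mapper<A, B> {
    B map(A input);
}

class LazyViews {
    // Returns a view over source; the mapper runs on each access, never up front.
    static <A, B> List<B> transform(final List<A> source, final Mapper<A, B> mapper) {
        return new AbstractList<B>() {
            @Override
            public B get(int index) {
                return mapper.map(source.get(index));
            }

            @Override
            public int size() {
                return source.size();
            }
        };
    }
}

class LazyViewsDemo {
    public static void main(String[] args) {
        final List<String> seen = new ArrayList<>();
        List<Integer> integers = LazyViews.transform(
                Arrays.asList("1", "2", "3"),
                new Mapper<String, Integer>() {
                    @Override
                    public Integer map(String input) {
                        seen.add(input); // record every input that is actually converted
                        return Integer.valueOf(input);
                    }
                });

        System.out.println(seen); // prints []
        integers.get(1);          // converts only the element at index 1
        System.out.println(seen); // prints [2]
        for (Integer i : integers) {
            // iterating converts each element as it is visited
        }
        System.out.println(seen); // prints [2, 1, 2, 3]
    }
}
```

One consequence of this design is that the function runs again on every access, so an expensive conversion is better copied into a new list once the values are needed repeatedly.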
If you'd like to chain multiple methods together, there is always FluentIterable. That basically allows you to code in the Java 8 Stream-like style.
Goldman Sachs Collections
While Guava usually does the right thing by default and works with JDK classes, sometimes you need something more complex. That's where Goldman Sachs Collections come in. GS Collections give you far more flexibility and power by providing a complete drop-in collections framework with everything you might dream of. Laziness is not there by default, but can be easily achieved:
FastList<String> strings = FastList.newListWith("1", "2", "3");
LazyIterable<Integer> integers = strings.asLazy().collect(new Function<String, Integer>() {
@Override
public Integer valueOf(String string) {
System.out.println(string);
return Integer.valueOf(string);
}
});
Again, doesn't do anything. But:
for (Integer i : integers) {
// nothing to do
}
suddenly outputs everything.
I was previously using a repository with this code:
public virtual void Delete(T entity)
{
DbEntityEntry dbEntityEntry = DbContext.Entry(entity);
if (dbEntityEntry.State != EntityState.Deleted)
{
dbEntityEntry.State = EntityState.Deleted;
}
else
{
DbSet.Attach(entity);
DbSet.Remove(entity);
}
}
My application is simple, and I noticed many people saying "EF is already a repository so you don't need to use another one". The problem is that, without that external repository wrapper, a simple delete now requires something like this:
DbEntityEntry dbEntityEntry1 = db.Entry(objectiveDetail);
if (dbEntityEntry1.State != EntityState.Deleted)
{
dbEntityEntry1.State = EntityState.Deleted;
}
else
{
db.ObjectiveDetails.Attach(objectiveDetail);
db.ObjectiveDetails.Remove(objectiveDetail);
}
My one-line repo.delete has now turned into ten lines. Is there a way to get back to a simple removal of an entry with EF, without having to hard-code all the lines that check whether the entry is already attached, and so on?
Actually, although EF is a repository in and of itself, it still makes sense in many situations from a separation of concerns perspective to abstract EF out and use a Repository pattern to achieve this. I use a repository pattern quite often with EF. If you are using automated unit tests, the repository pattern makes isolating your code for testing much easier.
What is the difference, for the controller that receives the return value, with respect to rendering the List?
In the LINQ DataContext:
public IList<Response> GetResponses(int ID)
{
var responses = from r in this.Responses where r.ID == ID orderby r.Date select r;
return responses.ToList();
}
OR
public List<Response> GetResponses(int ID)
{
var responses = from r in this.Responses where r.ID == ID orderby r.Date select r;
return responses.ToList();
}
I doubt there's much difference to the controller but you should probably try to reveal as little information as possible about the private data of your classes. This means exposing interfaces rather than concrete types and using the interface that exposes the minimum amount of information the client will need to operate on the data.
If the controller only needs an IEnumerable<Response> then you should consider making that the return type of GetResponses.
The difference is that the controller won't need to be updated if you change your List implementation if you use the IList interface. It's probably not that big of a deal unless you are planning to make your library available to others. In that case the abstraction is probably justified as you won't be the only one having to update your code if you make the change.
Consider returning an array, and consider accepting IEnumerable as a parameter.