i'm trying to make a region growing segmentation for my project in OpenCL, can anyone please help me writng the code
the algorithm is as follows:
seed the first pixel(seed) manually
get the value of the seed
compare it with its 4-neighbors or 8-neighbors(i dont know which is better)
if the neighbors have the same value then they are in region
compare these neighbors with there other neighbors and so on until it finds a boundary, then stop
thanks a lot
I have implemented an OpenCL version of this algorithm in the FAST framework for 2D and 3D. See here: https://github.com/smistad/FAST/tree/master/source/FAST/Algorithms/SeededRegionGrowing
In our GPU segmentation review article you can find an explanation of the implementation as well as a performance analysis: http://www.medicalimageanalysisjournal.com/article/S1361-8415(14)00181-9/fulltext#s0075
Related
I am dealing with an optimisation issue, which I classified as a combinatorial problem. Now, I know this is a 2D variant of the knapsack problem, but please bear with me:
If I have an area that is modeled as a grid comprised of equal size cells, how to place a certain number of square objects of different sizes, on this grid area, if every object has its cost and its benefit and the goal is to have an arrangement of the objects that has the maximum Benefit/Cost ratio:
Object 1: 1x1 square, cost = 800, value= 2478336
Object 2: 2x2 square cost= 2000 value = 7565257
Object 3: 3x3 square cost= 3150 value= 14363679
The object 3 has the best value/cost ratio, so the approach would be a greedy one I guess, to first place as much of the bigger squares as possible, but still there are many optimal solutions depending on the size of the area.
Also, the square objects cannot overlap.
I am using R for this, and the package adagio has algorithms for the single and multiple knapsack, but not for a 2D knapsack problem. Because I am very new in optimization and programming, I am not sure if there is way of solving this problem with R, can someone please help?
Thanks!
Firstly, I'm not an expert in R and adagio. Secondly, I think that your problem is not exactly 2d knapsack, it looks like a variant of packing problem, so it requires a different approach.
So, first, check this awesome list of R optimization packages, especially the following sections:
Specific Applications in Optimization (for example, tabu search could be useful for you)
Mathematical Programming Solvers/Interfaces to Open Source Optimizers (lpsolve definitely could solve your task)
Global and Stochastic Optimization (some of this packages could be used to solve your task)
In case if you're not tied to R, consider Minizinc as a solver. It's very easy to install/use and it's pretty efficient in terms of memory/time consumption. Moreover, there is a bunch of great examples how to use it.
I know that in GP, different from the SVM, kernel is to measure the similarity between one point and a given point. However, is there a general guideline for kernel selection? Or is there any material which introduce the characteristics of popular kernels (like the squared exponential kernel)?
By the way, Dr. Lin (author of LIBSVM toolbox) explained why the RBF kernel is probably the first choice for the SVM, so could anyone please tell me why the SE kernel is so popular in GP?
I think you have touched upon a few common questions people have about Gaussian Processes. I'll try my best to answer them:
Model Selection: Model selection is an important and non-trivial aspect of GPs. It is usually left up to the user as they tend to be application specific. This can be a problem though as the user may not be a GP kernel guru and if we choose a poor model for our data then we inevitably get poor predictive performance. Some attempts have been made in this regard though. If you read chapter 5.2 of GPML it talks about bayesian model selection which should help you.
Useful Kernels: There is a lot of useful literature on kernels and when to use them. However if you know a few primitive kernels and learn the basics of how to combine them you can actually produce pretty sophisticated ML systems. A great resource for getting your head around them is The Kernel Cookbook by David Duvenaud.
RBF vs Squared Exponential: This confuses a lot of people unnecessarily - they are the same thing! This common kernel is used all the time and goes by many names. However, call it a 'Squared Exponential' at your own risk as that is actually an inaccurate but commonly used term for it. Neil Lawrence, one of the many GP gods, says that we should stop calling it the 'Squared Exponential' kernel and rather the 'Exponentiated Quadratic' kernel.
I am using the InfoMap algorithm in the igraph package to perform community detection on a directed and non-weighted graph (34943 vertices, 206366 edges). In the graph, vertices represent websites and edges represent the existence of a hyperlink between websites.
A problem I have encountered after running the algorithm is that the majority of vertices have a membership in a single massive community (32920 or 94%). The rest of the vertices are dispersed into hundreds of other tiny communities.
I have tried different settings with the nb.trials parameter (i.e. 50, 100, and now running 500). However, this doesn't seem to change the result much.
I am feeling rather exasperated because the run-time on the algorithm is quite high, so I have to wait each time for the results (with no luck yet!!).
Many thanks.
Thanks for all the excellent comments. In the end, I got it working by downloading and running the source code for Infomap, which is available at: http://www.mapequation.org/code.html.
Due to licence issues, the latest code has not been integrated with igraph.
This solved the problem of too many nodes being 'lumped' into a single massive community.
Specifically, I used the following options from the command line: -N 10 --directed --two-level --map
Kudos to Martin Rosvall from the Infomap project for providing me with detailed help to resolve this problem.
For the interested reader, here is more information about this issue:
When a network collapses into one major cluster, it is most often because of a very dense and random link structure ... In the code for directed networks implemented in iGraph, teleportation is encoded. If many nodes have no outlinks, the effect of teleportation can be significant because it randomly connect nodes. We have made new code available here: http://www.mapequation.org/code.html that can cluster network without encoding the random teleportation necessary to make the dynamics ergodic. For details, see this paper: http://pre.aps.org/abstract/PRE/v85/i5/e056107
I was going to put this in a comment, but it ended up being too long and hard to read in that format, so this is a tangentially related answer.
One thing you should do is assess whether the algorithm is doing a good job at finding community structure. You can try to visualise your network to establish:
Is the algorithm returning community structures that make sense? Maybe there is one massive community?
If not does the visualisation provide insight as to why?
This will help inform your next steps. Maybe the structure of the network requires a different algorithm?
One thing I find useful for large networks is plotting your edges as a heatmap. This is simple to do if you have your edges stored in an adjacency matrix.
For this, you can use the image function, passing in your matrix of edges as the argument z. Hopefully this will allow you to see by eye the community structure.
However you also want to assess the correctness of your algorithm, so you want to sort the nodes (rows and columns of your adjacency matrix) by the community they've been assigned to.
Another thing to note is that if your edges are directed it may be more difficult to assess by eye as edges can appear on either side of the diagonal of the heatmap. One thing you can do is instead plot the underlying graph -- that is the adjacency matrix assuming your edges are undirected.
If your algorithm is doing a good job, you would expect to see square blocks along the diagonal, one for each detected community.
http://msl.cs.uiuc.edu/rrt/
Can anyone explain how rrt works with simple wording that is easy to understand?
I read the description in the site and in wikipedia.
What I would like to see, is a short implementation of a rrt or a thorough explanation of the following thing:
Why does the rrt grow outwards instead of just growing very dense around the center?
How is it different from a naive random tree?
How is the next new vertex that we attempt to reach picked?
I know there is an Motion Strategy Library I could download but I would much rather understand the idea before I delve into the code rather than the other way around.
The simplest possible RRT algorithm has been so successful because it is pretty easy to implement. Things tend to get complicated when you:
need to visualise planning concepts in more than two dimensions
are unfamiliar with the terminology associated with planning, and;
in the huge number of variants of RRT that are have been described in the literature.
Pseudo code
The basic algorithm looks something like this:
Start with an empty search tree
Add your initial location (configuration) to the search tree
while your search tree has not reached the goal (and you haven't run out of time)
3.1. Pick a location (configuration), q_r, (with some sampling strategy)
3.2. Find the vertex in the search tree closest to that random point, q_n
3.3. Try to add an edge (path) in the tree between q_n and q_r, if you can link them without a collision occurring.
Although that description is adequate, after a while working in this space, I really do prefer the pseudocode of figure 5.16 on RRT/RDT in Steven LaValle's book "Planning Algorithms".
Tree Structure
The reason that the tree ends up covering the entire search space (in most cases) is because of the combination of the sampling strategy, and always looking to connect from the nearest point in the tree. This effect is described as reducing the Voronoi bias.
Sampling Strategy
The choice of where to place the next vertex that you will attempt to connect to is the sampling problem. In simple cases, where search is low dimensional, uniform random placement (or uniform random placement biased toward the goal) works adequately. In high dimensional problems, or when motions are very complex (when joints have positions, velocities and accelerations), or configuration is difficult to control, sampling strategies for RRTs are still an open research area.
Libraries
The MSL library is a good starting point if you're really stuck on implementation, but it hasn't been actively maintained since 2003. A more up-to-date library is the Open Motion Planning Library (OMPL). You'll also need a good collision detection library.
Planning Terminology & Advice
From a terminology point of view, the hard bit is to realise that although lots of the diagrams you see in the (early years of) publications on RRT are in two dimensions (trees that link 2d points), that this is the absolute simplest case.
Typically, a mathematically rigorous way to describe complex physical situations is required. A good example of this is planning for a robot arm with n- linkages. Describing the end of such an arm requires a minimum of n joint angles. This set of minimum parameters to describe a position is a configuration (or some publications state). A single configuration is often denoted q
The combination of all possible configurations (or a subset thereof) that can be achieved make up a configuration space (or state space). This can be as simple as an unbounded 2d plane for a point in the plane, or incredibly complex combinations of ranges of other parameters.
As a side project, I want to implement a Hidden Markov Model for my NVidia graphics card so that I can have it execute quickly and using many cores.
I'm looking at the Forward-Backward algorithm and was wondering what is there that I can make parallel here? If you look at the forward part of the algorithm for instance, the matrix multiplications can be divided up to be done in parallel, but can the iterative parts of the algorithm that depend on the previous step be parallelized in any way? Is there some kind of a mathematical trick that can be applied here?
Thanks,
mj
http://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm#Example
You are correct in your assessment - you can parallelize the matrix multiplications (i.e. across states), but you can't parallelize the recursive steps. I just made a blog post on my work with HMMs and GPUs. Check it out here:
http://sgmustadio.wordpress.com/2012/02/27/hidden-markov-models-in-cuda-gpu/
If you are still working on this project, you may want to check out HMMlib and parredHMMlib.
sgmustadio is right to point out that you cannot parallelize recursive steps, but it seems that these authors have come up with a clever way to reduce the Forward and Viterbi algorithms to a series of matrix multiplications and reductions.