I am currently working on a problem where I don't have a plethora of labeled data. I therefore want to use active learning: label some of my data with the model, and have all images (in this case) whose predictions fall below a confidence threshold sent off for annotation. Are there any built-in or peripheral techniques/packages in the FluxML ecosystem that would enable this? I looked around but did not see active learning techniques mentioned at all for Flux. In PyTorch, for example, one of the resources I use is this PyTorch for Active learning repo.
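To make the selection step concrete, here is roughly what I have in mind, sketched in PyTorch since that is what the linked repo targets (`model` and `unlabeled_loader` are just placeholders; a Flux version would follow the same pattern):

```python
import torch
import torch.nn.functional as F

def select_for_annotation(model, unlabeled_loader, threshold=0.6):
    """Return dataset indices of images whose top-class probability is below
    `threshold`; these are the candidates to send off for annotation."""
    model.eval()
    uncertain, offset = [], 0
    with torch.no_grad():
        for images in unlabeled_loader:  # loader assumed to yield plain image batches
            probs = F.softmax(model(images), dim=1)
            confidence, _ = probs.max(dim=1)               # least-confidence score
            idx = torch.nonzero(confidence < threshold).flatten()
            uncertain.extend((idx + offset).tolist())
            offset += images.size(0)
    return uncertain
```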
I am using reinforcement learning to teach an AI an Austrian card game with imperfect information called Schnapsen. For different states of the game, I have different neural networks (which use different features) that calculate the value/policy. I would like to try using RNNs, as past actions may be important for navigating future decisions.
However, since I use multiple neural networks, I somehow need to constantly transfer the hidden state from one RNN to another. I have not quite managed to do that; in particular, during training I don't know how to make backpropagation through time work across the networks. I am grateful for any advice or links to related papers/blogs!
I am currently working with Flux in Julia, but I am also willing to switch to TensorFlow or PyTorch in Python.
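To illustrate what I mean by transferring the hidden state, here is a rough PyTorch sketch (all sizes and features are made up). As long as the hidden state is handed over without being detached, backpropagation through time flows through both networks:

```python
import torch
import torch.nn as nn

HIDDEN = 64

# Two phase-specific networks; they only need to agree on the hidden size
# so the recurrent state can be handed over between them.
rnn_phase1 = nn.GRU(input_size=10, hidden_size=HIDDEN, batch_first=True)
rnn_phase2 = nn.GRU(input_size=16, hidden_size=HIDDEN, batch_first=True)
head1 = nn.Linear(HIDDEN, 5)   # placeholder value/policy head for phase 1
head2 = nn.Linear(HIDDEN, 5)   # placeholder value/policy head for phase 2

obs_phase1 = torch.randn(1, 7, 10)   # 7 steps of phase-1 features
obs_phase2 = torch.randn(1, 4, 16)   # 4 steps of phase-2 features

h0 = torch.zeros(1, 1, HIDDEN)
out1, h = rnn_phase1(obs_phase1, h0)   # play through phase 1
out2, h = rnn_phase2(obs_phase2, h)    # hand the hidden state over to phase 2

loss = head1(out1[:, -1]).sum() + head2(out2[:, -1]).sum()  # dummy loss
loss.backward()  # gradients reach rnn_phase1 through the shared hidden state
```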
Thank you in advance!
I have recently tried to implement my own version of the Asynchronous Advantage Actor-Critic (A3C) method for deep reinforcement learning, since I couldn't get other A3C implementations found on the web to work properly. The problem is that my version isn't converging either, so I would really appreciate any help identifying the problem. The code is located here: https://github.com/MatheusMRFM/A3C-LSTM-with-Tensorflow. I am training the method on the Pong game from the OpenAI Gym environment. Here's what I did:
My implementation is heavily based on the following A3C implementations: Arthur Juliani's version, OpenAI's A3C, and andreimuntean's version. I chose these implementations due to their clarity and because everything seemed correct according to the original A3C paper;
I'm using a network as follows: a set of convolutional layers, a fully connected layer, an LSTM layer, and then two more fully connected layers (one for the policy and the other for the value function); a rough sketch of this layout is shown after this list. I already tested several other architectures (changing the convolutional layers, removing the first hidden layer, changing the output sizes of the hidden and LSTM layers, etc.). None of these configurations worked;
I tried 3 different optimizers: RMSPropOptimizer, AdadeltaOptimizer, and AdamOptimizer. I also tried different learning rates for each one. No luck;
I already tried several parameter settings based on the implementations that I looked at.
My code always ends up converging to a policy where the paddle always moves up or always moves down (not both). There must be some stupid detail that I missed, but I can't find it. Any help is welcome!
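For reference, here is a rough PyTorch rendering of the layer layout described above (the actual repository is in TensorFlow; channel counts and sizes are guesses, assuming 84x84 preprocessed frames):

```python
import torch
import torch.nn as nn

class A3CNet(nn.Module):
    """Conv stack -> fully connected -> LSTM -> policy head + value head."""
    def __init__(self, in_channels=1, n_actions=3, hidden=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.fc = nn.Sequential(nn.Linear(32 * 9 * 9, hidden), nn.ReLU())  # for 84x84 input
        self.lstm = nn.LSTMCell(hidden, hidden)
        self.policy = nn.Linear(hidden, n_actions)  # softmax over actions
        self.value = nn.Linear(hidden, 1)           # state-value estimate

    def forward(self, x, hc):
        z = self.fc(self.conv(x).flatten(1))
        h, c = self.lstm(z, hc)
        return torch.softmax(self.policy(h), dim=1), self.value(h), (h, c)

# one forward step
net = A3CNet()
hc = (torch.zeros(1, 256), torch.zeros(1, 256))
pi, v, hc = net(torch.randn(1, 1, 84, 84), hc)
```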
I have actually found what the problem was, and just as I thought, it was a simple detail that messed everything up: in my code, and in all the other implementations I mentioned in my post, the policy loss function uses the log of the softmax policy output by the network. To avoid NaN results (in case the policy assigns a probability of 0 to at least one action), I added a small value to the policy (in my code, this is done at line 180 of Network.py). It turns out this small value (1e-8) wasn't so small after all, and it was throwing off the policy loss function. I ended up using 1e-13 instead, and it then worked. I tested in the VizDoom environment (same map used in Arthur Juliani's version) and it converged within about 6k episodes.
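In pseudocode, the loss term I am talking about looks roughly like this (a NumPy sketch, not the TensorFlow code from the repository):

```python
import numpy as np

def policy_loss(policy, actions, advantages, eps=1e-13):
    """A3C policy-gradient term: -sum(log(pi(a_t|s_t) + eps) * advantage_t).
    `eps` only exists to keep log() away from zero; with 1e-8 the term was
    distorted enough to break training for me, while 1e-13 converged."""
    picked = policy[np.arange(len(actions)), actions]  # pi(a_t | s_t) per step
    return -np.sum(np.log(picked + eps) * advantages)
```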
Hope it helps anyone with a similar problem. I will soon update my code in my GitHub account.
I have made a few annotators in UIMA and now I want to check how well they perform. Is there a standardized way to gauge the performance of the annotators?
UIMA itself does not provide immediate support for comparing annotators and evaluating them against a gold standard.
However, there are various tools/implementations out there that provide such functionality on top of UIMA but typically within the confines of the particular tool, e.g.:
U-Compare supports running multiple annotators doing the same thing and comparing their results
WebAnno is an interactive annotation tool that uses UIMA as its backend and supports comparing annotations from multiple users to each other. There is a class called "CasDiff2" in the code that generates differences and feeds them into DKPro Statistics in the background for the actual agreement calculation. Unfortunately, CasDiff2 cannot really be used separately from WebAnno (yet).
Disclosure: I'm on the WebAnno team and have implemented CasDiff2 in there.
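If you just want a quick sanity check before wiring up one of the tools above, the core agreement computation is small. Here is a plain-Python sketch of Cohen's kappa over two annotators' labels for the same items (DKPro Statistics implements this and many other measures properly; the labels below are made up):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# two annotators labelling the same six tokens
print(cohen_kappa(["PER", "ORG", "PER", "O", "O", "LOC"],
                  ["PER", "PER", "PER", "O", "O", "LOC"]))
```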
I am working on a Spring-based web application that performs predictive analysis based on users' historical data and comes up with offers for them. I need to implement predictive analysis or some kind of regression functionality that gives a confidence score/prediction for presenting those offers. I am a Java developer and looked at Weka and Mahout to get the desired result, but both tools lack good documentation and it is very difficult to make progress with them. I need a suggestion for a Java-based analytics API that can process my data using regression, neural networks, or decision trees and provides a confidence score representing a customer's probability of buying the product in the future.
Any help in this regard is greatly appreciated.
I've just finished working on a long project that involves building a GUI with JavaFX and R using the JRI package; it uses forecasting from the forecast package in R.
If you choose this solution (JavaFX + R), all of R's statistical packages will be at your disposal. R has great documentation for this, but the JRI interface is a challenge.
The program I built runs standalone, not as a Web Start application.
Most of the fuss is about setting up the environment variables and passing parameters to the JVM. The big problem is deployment: you need to make sure your clients have R and set up all the links between R and Java on their PCs.
If you're interested in any prediction analysis (trees, regressions, ...) in R using Java/JRI, let me know and I'll post it.
I'd advise you to keep trying with Weka. It's a great tool, not only for implementation but also to get an idea of which algorithms will work for you, what your data looks like, etc.
The book is worth the price, but if you're not willing to buy it, this wiki page might be a good starting point.
It might be best to start with testing, not programming - I believe the quote goes "60% of the difficulty of machine learning is understanding the dataset". Play around with the Weka GUI, find out what works best for you and your data, and do try some of the meta-classifiers (boosting, bagging, stacking); they often give great results (at the cost of processing time).
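If it helps to see what the end result looks like before diving into the Weka API: the "confidence score" you are after is just a predicted class probability. Here is a scikit-learn sketch with made-up customer features, purely to illustrate the workflow (Weka's classifiers expose the same kind of class-probability output from Java):

```python
from sklearn.linear_model import LogisticRegression

# Made-up historical data: one row per customer, label = 1 if they bought
# after receiving an offer. Features: age, past purchases, total spend.
X_train = [[25, 3, 120.0], [41, 0, 15.5], [33, 7, 300.0], [52, 1, 45.0]]
y_train = [1, 0, 1, 0]

model = LogisticRegression().fit(X_train, y_train)

# "Confidence score": predicted probability that this customer will buy.
print(model.predict_proba([[30, 5, 210.0]])[0][1])
```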
I am currently facing a situation where I, as an advocate of test-driven development, have to compete with an advocate of model-driven software development (MDSD) / model-driven architecture (MDA).
In my opinion, code generation is a valuable tool in my toolbox, and I make heavy use of templates and automation when needed. I also create UML diagrams when I think they help to understand the inner workings or to discuss architecture at the whiteboard. However, I strongly doubt that creating software via UML (creating statecharts and sequence diagrams to produce working code, not just skeletons) is more efficient for multi-tier applications (database layer, business/domain layer, and a GUI, maybe even distributed). It seems to me that with MDSD the CASE tooling suddenly isn't just a tool anymore but becomes the thing to satisfy: as I see it, MDSD developers profit from the higher abstraction UML gives them, but at the same time they struggle with modifying the code generator/template/engine to fulfill needs that might be easily implemented (and tested) with another tool out of their toolbox (Visual Studio, Eclipse, ...).
All this makes me wonder whether there has been a success story (success meaning the product was rolled out on time, within budget, and with only a few bugs, and parts of the software were reused later on) for a real-world application which fulfills these criteria and was developed using a strict model-driven approach:
it has nothing to do with the Object Management Group (OMG) or with consultants related to MDSD/MDA/SOA;
the application is not related to Business Process Modelling and is not a CASE tool itself
the application is actively used by end users
it has at least three tiers, including a user interface which goes beyond displaying raw table values and is not one of the common MDA/MDSD examples ("how to model a coffee machine, traffic light, dishwasher").
A tiny, but nevertheless useful testimonial on the use of MDSD has been posted on the Model Driven Software Network:
http://www.modeldrivensoftware.net/profiles/blogs/viva-mdd-follow-up-building-a?xg_source=activity
It is a relatively small app being developed, but still a good example of MDSD in action.
More success stories are listed at Metacase's site (http://www.metacase.com/cases/index.html). Metacase sells MetaEdit+, which implements DSM (Domain-Specific Modeling). DSM is just a form of MDSD.
I am also developing ABSE (Atom-Based Software Engineering), another form of MDSD, very close to DSM. ABSE is outlined at http://www.abse.info.
I used MDA and code generation on an embedded system project using 4 processors connected via CAN. We had over 20 axes of motion and many, many sensors. The system was highly robust and maintainable as the mechanical components were evaluated and modified.
We worked in the models and generated code so the models were always up-to-date. We did a careful domain analysis to achieve subject matter isolation. The motor control required very high performance and so was not modeled or generated. Our network drivers were also hand-coded, and we wrote interfaces that allowed bridge services to send events to any service anywhere in the system as needed (although this was tightly controlled so as to minimize interprocessor dependencies).
Using the method took a bit of discipline, but having working models was great because they can be reviewed by non-software types.
Version control and differencing of the models was a bit of a challenge but we had a small, localized team so we were able to avoid merge issues.
The good people at Pathfinder Solutions (our tool vendor) can help mentor you through the project.
You could also take a look at the slides from previous Code Generation conferences. Several of these talks were from successful case studies e.g. http://www.codegeneration.net/cg2009/slides.php
I am working on a legacy modernization project using an MDA tool named Bluage. It's for a big healthcare organization and it's in production, so I would say it is successful. MDA is a good fit for legacy modernization because it can generate a KDM model from technologies like Pacbase that are going out of support.
I worked on an MDSD system that generated admin-style web apps in Google Closure. I believe your question is compelling. With too much complexity, your MDSD system is too hard to use; too simple, and you won't generate apps that are useful in the real world. Where MDSD really shines is in saving developers the time spent typing lots of plumbing-style code, but how can MDSD remain effective over multiple releases? Requirements can go in many directions. That is the real challenge. I recently blogged about my MDSD lessons learned on that project.