I am using reinforcement learning to teach an AI to play Schnapsen, an Austrian card game with imperfect information. For different states of the game, I have different neural networks (which use different features) that calculate the value/policy. I would like to try using RNNs, as past actions may be important for navigating future decisions.
However, because I use multiple neural networks, I somehow need to constantly transfer the hidden state from one RNN to another. I am not quite sure how to do that; in particular, during training I don't know how to make backpropagation through time work across the networks. I am grateful for any advice or links to related papers/blogs!
I am currently working with Flux in Julia, but I am also willing to switch to Tensorflow or Pytorch in Python.
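For concreteness, here is a minimal PyTorch sketch (not working Flux code) of one way to do this: keep a single hidden state and hand it from network to network without ever detaching it, so that one backward pass unrolls through every step of both networks. All names here (PhaseANet, PhaseBNet, the feature and action sizes, the dummy loss) are made-up placeholders, not part of any library.

```python
import torch
import torch.nn as nn

HIDDEN = 32  # shared hidden-state size across all phase networks

class PhaseANet(nn.Module):
    """Hypothetical network for one phase of the game."""
    def __init__(self, in_features, n_actions):
        super().__init__()
        self.cell = nn.GRUCell(in_features, HIDDEN)
        self.head = nn.Linear(HIDDEN, n_actions)

    def forward(self, x, h):
        h = self.cell(x, h)        # update the shared hidden state
        return self.head(h), h     # return policy logits and new state

class PhaseBNet(nn.Module):
    """Hypothetical network for another phase, with different features."""
    def __init__(self, in_features, n_actions):
        super().__init__()
        self.cell = nn.GRUCell(in_features, HIDDEN)
        self.head = nn.Linear(HIDDEN, n_actions)

    def forward(self, x, h):
        h = self.cell(x, h)
        return self.head(h), h

net_a, net_b = PhaseANet(10, 5), PhaseBNet(7, 3)
opt = torch.optim.Adam(list(net_a.parameters()) + list(net_b.parameters()), lr=1e-3)

# One episode: a few steps handled by net_a, then a few by net_b.
h = torch.zeros(1, HIDDEN)   # the hidden state carried across networks
loss = torch.tensor(0.0)
for _ in range(4):
    logits, h = net_a(torch.randn(1, 10), h)   # dummy phase-A features
    loss = loss + logits.square().mean()       # placeholder loss term
for _ in range(4):
    logits, h = net_b(torch.randn(1, 7), h)    # dummy phase-B features
    loss = loss + logits.square().mean()

# Because h was never detached, backward() unrolls through all eight
# steps of both networks: backpropagation through time across networks.
opt.zero_grad()
loss.backward()
opt.step()
```

If the networks need different hidden sizes, a small nn.Linear at each hand-off can translate the state; for long games you would detach the state every k steps (truncated BPTT) to bound memory.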
Thank you in advance!
Related
I am currently working on a problem where I don't have a plethora of labeled data. I therefore want to use active learning: have the model label some of my data, and send all images (in this case) whose prediction confidence falls below a threshold off for annotation. Are there any built-in or peripheral techniques/packages in the FluxML ecosystem that would enable this? I looked around but did not see active learning techniques mentioned at all for Flux. In PyTorch, for example, one of the resources I use is this PyTorch for Active Learning repo.
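For reference, here is a minimal uncertainty-sampling loop in PyTorch (I'm not aware of a Flux equivalent); the model, loader, and threshold are placeholders:

```python
import torch

@torch.no_grad()
def select_for_annotation(model, unlabeled_loader, threshold=0.6):
    """Return indices of images whose top softmax probability is below
    the threshold, i.e. the images to send off for human annotation."""
    model.eval()
    uncertain, offset = [], 0
    for images in unlabeled_loader:        # loader yields image batches only
        probs = torch.softmax(model(images), dim=1)
        confidence, _ = probs.max(dim=1)   # top class probability per image
        uncertain += [offset + i for i, c in enumerate(confidence) if c < threshold]
        offset += images.size(0)
    return uncertain
```

The loop itself is framework-agnostic (softmax over the model output, threshold the maximum), so the same handful of lines would translate to Flux almost one-to-one.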
I just want to get the main idea/principle of OpenFOAM and how you create a simulation; please let me know where I go wrong.
So basically: you have an object that interacts with a gas or liquid and you want to simulate this, so you create a model of the object, mesh it, specify where the gas will flow in and out and which surfaces are walls, set the other relevant parameters, and then run the program (with the appropriate time step, etc.)?
OpenFOAM is an open-source C++ library that implements the finite volume method (FVM), which is widely used in CFD.
What you have described is a rough understanding of some applications of CFD. The things you specified might not always be the case (e.g., the fluid might not necessarily be a gas, and so on).
The main stages of a CFD problem are geometry creation, mesh generation, preprocessing, solving, and postprocessing.
There might be more stages added depending on the resolution and other specifics of the case.
OpenFOAM is an open-source (free for all) tool written in C++ that helps solve CFD problems. If the problem is simple and routine, and you have access to a commercial solver such as ANSYS Fluent, you can use that, since it is easier and much less work when the problem is not specific. However, if the problem is specific and there are customized criteria, OpenFOAM is a nice tool.
Because it is written in C++ it is object-oriented, and there are many different solvers already written and available to use, so you will not have to write all the schemes and everything from scratch yourself.
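To give a flavour of what a finite volume scheme actually computes, here is a toy sketch in Python (OpenFOAM itself is C++): one-dimensional heat diffusion, where each cell exchanges a flux with its neighbours across shared faces. The grid size and coefficients are arbitrary.

```python
import numpy as np

n = 50                      # number of cells
dx = 1.0 / n                # cell width
alpha = 0.01                # diffusivity
dt = 0.4 * dx**2 / alpha    # explicit time step, within the stability limit
T = np.zeros(n)
T[0] = 1.0                  # fixed hot boundary cell

for _ in range(500):
    # diffusive flux through each interior face: -alpha * dT/dx
    flux = -alpha * (T[1:] - T[:-1]) / dx
    # each interior cell changes by the net flux across its two faces
    T[1:-1] += dt / dx * (flux[:-1] - flux[1:])
```

Real solvers add unstructured meshes, implicit time stepping, turbulence models and so on, but the face-flux bookkeeping above is the core idea.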
However, my main advice to you is to read more about CFD to gain a clear understanding; there are dozens of good books available.
My motivation for asking this question is that I have found an interesting problem using machine learning on a graph data set. There are papers out there on this subject. For example, "Learning from Labeled and Unlabeled Data on a Directed Graph" (Zhou, Huang, Scholkopf). However I do not have a background in artificial intelligence or machine learning so I would like to write a smaller program for a more general audience before working on anything scientific.
Several years ago I wrote a game called Solumns. It is an evil variant of the classic Sega game Columns. Inspired by bastet, it brute-forces colour combinations disadvantageous to the player. It is hard.
I would like to improve upon its AI. I figure the game space (grid of coloured blocks, column positions, column colours) fits a graph structure better than a list of attributes. If that is the case, then this problem is similar to my research problem.
I am considering the following high-level plan to solve this problem:
1. Have the AI opponent assign a fitness rating to a possible move, based on more data than just the number of existing squares on the board after the move. I'm thinking of using a categoriser: train on the move and all past moves, using the course of the rest of the game as the measure of success.
2. Develop a player bot that can beat the standard AI opponent. This could be useful when generating data for 1.
3. Use a sample of the player bot's games to build an AI that beats the strategic player. Maybe use this data for 1, too.
4. Write a fun AI that delegates to some combination of 1, 3, and the original AI, as appropriate, which I will determine experimentally by finding heuristic fudge factors.
To build the player bot, I figured I could use brute force to compute the sample space, then use machine-learning techniques such as those used in building random forests to create some kind of decision maker.
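As an illustration of the shape of that idea (scikit-learn is used only for brevity here; the features and labels are placeholders that would come from logged Solumns games):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical training data: one row of hand-crafted features per
# (board state, candidate move), labeled by how the rest of the game went.
X = np.random.rand(1000, 12)        # e.g. column heights, colour counts, move encoding
y = np.random.randint(0, 2, 1000)   # 1 = the rest of the game went well

forest = RandomForestClassifier(n_estimators=100).fit(X, y)

def rate_move(features):
    """Fitness rating for a candidate move: predicted probability of success."""
    return forest.predict_proba(features.reshape(1, -1))[0, 1]
```

Since you'd rather implement things yourself, a random forest is a friendly target: it is just bagged decision trees with random feature subsets at each split.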
Building the AI opponent is where I am most perplexed.
Specific questions, then:
1. Rating moves sounds like the kind of thing people do with chess, and although I'll admit my approach may be ignorant, there is a lot about this in the literature and I can learn from it. The question is: should the player bot and the AI opponent create the data sample? It sounds like I'm getting confused between different sample sets, which sounds like a recipe for bad training. Should I just play the game a bunch myself?
2. What kind of algorithm should I consider for training the player bot against the current AI?
3. What kind of algorithm should I consider for training an AI opponent against the player bot?
Extra info:
I am deliberately not asking if this strategy is a good fit for programming a game AI. Sure, you might be able to write a great AI more simply (after all it is already difficult to play). I want to learn about machine learning by doing something fun.
The original game was written in a mixture of Racket and C. I am porting it to JRuby for various reasons, likely with extensions or RPC calls to another, faster language. I am not so interested in existing language-specific solutions here; I want to develop skills in this area and am not afraid to implement an algorithm myself.
You can get the source for the original game here.
I would not go for machine learning here. Look at game playing AIs.
You have an adversarial game (like Go) with two asymmetric players:
the user, who places the pieces,
and the computer, which chooses the pieces (instead of the pieces being chosen by chance).
I would probably start with Monte Carlo Tree Search.
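Here is a bare-bones UCT sketch to show the structure; the Game interface (legal_moves, play, is_over, result) is hypothetical and would have to wrap Solumns:

```python
import math, random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}            # move -> Node
        self.wins, self.visits = 0, 0
        self.untried = state.legal_moves()

    def uct_child(self, c=1.4):
        # balance exploitation (win rate) and exploration (rarely tried moves)
        return max(self.children.values(),
                   key=lambda n: n.wins / n.visits
                                 + c * math.sqrt(math.log(self.visits) / n.visits))

def mcts(root_state, iterations=1000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. selection: descend via UCT while fully expanded
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. expansion: try one untried move
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.play(move), parent=node)
            node.children[move] = child
            node = child
        # 3. simulation: random playout to the end of the game
        state = node.state
        while not state.is_over():
            state = state.play(random.choice(state.legal_moves()))
        outcome = state.result()      # e.g. 1 for a win, 0 for a loss
        # 4. backpropagation (for two alternating players you would flip
        # the outcome's perspective at each ply on the way up)
        while node:
            node.visits += 1
            node.wins += outcome
            node = node.parent
    # choose the most-visited move from the root
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

For your asymmetric game, the two "players" simply expand different kinds of moves at alternating plies: column placements for the user, colour choices for the computer.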
This is an ambitious question from a Wolfram Science Conference: Is there such a thing as a network analog of a recursive function? Maybe a kind of iterative "map-reduce" pattern? If we add interaction to iteration, things become complicated: continuous iteration of a large number of interacting entities can produce very complex results. It would be nice to have a way of seeing the consequences of the myriad interactions that define a complex system. Can we find a counterpart of a recursive function in an iterative network of connected nodes which contain nested propagation loops?
One of the basic patterns of distributed computation is map-reduce: it can be found in cellular automata (CA) and neural networks (NN). Neurons in an NN collect information through their synapses (reduce) and send the result to other neurons (map). Cells in a CA act similarly: they gather information from their neighbors and apply a transition rule (reduce), then offer the result to their neighbors again (map). Thus, if there is a network analog of a recursive function, map-reduce is certainly an important part of it. What kinds of iterative map-reduce patterns exist? Do certain kinds of map-reduce patterns result in certain kinds of streams, or even vortices or whirls? Can we formulate a calculus for map-reduce patterns?
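As a concrete toy example of that gather/apply pattern, here is the elementary cellular automaton rule 110 in Python, where the reduce step collects each cell's neighborhood into one value and the map step applies the same local rule at every position:

```python
RULE = 110  # transition rule, encoded as an 8-bit lookup table

def step(cells):
    n, out = len(cells), []
    for i in range(n):
        # reduce: gather the 3-cell neighborhood (wrapping at the edges)
        neighborhood = (cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        # map: apply the same transition rule at every cell
        out.append((RULE >> neighborhood) & 1)
    return out

cells = [0] * 31
cells[15] = 1                 # single seed cell
for _ in range(15):
    cells = step(cells)
    print("".join(".#"[c] for c in cells))
```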
I'll take a stab at the question about recursion in neural networks, but I really don't see how map-reduce plays into this at all. I get that a neural network can perform distributed computation and then reduce it to a more local representation, but the term map-reduce denotes a very specific brand of this distributed/local piping, mainly associated with Google and Hadoop.
Anyway, the simple answer to your question is that there is no general method for recursion in neural networks; in fact, the closely related and simpler problem of implementing general-purpose role-value bindings in neural networks is still an open question.
The general reason why things like role binding and recursion are so hard in artificial neural networks (ANNs) is that ANNs are highly interdependent by nature; indeed, that is where most of their computational power comes from. Function calls and variable bindings, by contrast, are sharply delineated operations: what they include is an all-or-nothing affair, and that discreteness is a valuable property in many cases. So implementing one inside the other without sacrificing any computational power is very tricky indeed.
Here is a small sampling of papers that try their hand at partial solutions. Lucky for you, a great many people find this problem very interesting!
Visual Segmentation and the Dynamic Binding Problem: Improving the Robustness of an Artificial Neural Network Plankton Classifier (1993)
A Solution to the Binding Problem for Compositional Connectionism
A (Somewhat) New Solution to the Binding Problem
I have been working on a course project in which we implemented an FPS using FSMs, showing a top-down 2D view of the game and representing the bots and players as circles. The behaviour of the bots was deterministic. For example, if a bot's health drops below a threshold and the player is visible, the bot flees; otherwise it looks for health packs.
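For context, a rule set like that boils down to something of this shape (a sketch; the thresholds and state names are illustrative):

```python
class Bot:
    LOW_HEALTH = 30  # illustrative threshold

    def __init__(self):
        self.state = "patrol"

    def update(self, health, player_visible):
        # deterministic transition rules, exactly as hand-authored
        if health < self.LOW_HEALTH:
            self.state = "flee" if player_visible else "seek_health"
        elif player_visible:
            self.state = "attack"
        else:
            self.state = "patrol"
        return self.state
```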
However, I felt that in this case the bot isn't showing much intelligence, as most of the decisions it takes are based on rules we decided in advance.
What other techniques could I use that would help me implement some real intelligence in the bot? I've been looking at HMMs, and I feel they might help introduce more uncertainty into the bot, so that it becomes more autonomous in its decisions rather than depending on predefined rules.
What do you guys think? Any advice would be appreciated.
I don't think using a hidden Markov model would really be more autonomous. It would just be following the more opaque rules of the model rather than the explicit rules of the state machine. It's still deterministic. The only uncertainty they bring is to the observer, who doesn't have a simple ruleset to base predictions on.
That's not to say they can't be used effectively - if I recall correctly, several bots for FPS games used this sort of system to learn from players and develop their own AI.
But this does depend on exactly what you want to model with the process. AI is not really about algorithms, but about representation. If all you do is pick the same states your current FSM has and observe an existing player's transitions, you're not likely to get a better system than having an expert input carefully tweaked rules for an FSM.
Given that you're not going to manage to implement "some real intelligence" as that is currently considered beyond modern science, what is it you want to be able to create? Is it a system that learns from its own experiments? A system that learns by observing human subjects? One that deliberately introduces unusual choices in order to make it harder for an opponent to predict?