Improve a puzzle game AI using machine learning

Improve a puzzle game AI using machine learning - graph

My motivation for asking this question is that I have found an interesting problem using machine learning on a graph data set. There are papers out there on this subject. For example, "Learning from Labeled and Unlabeled Data on a Directed Graph" (Zhou, Huang, Scholkopf). However I do not have a background in artificial intelligence or machine learning so I would like to write a smaller program for a more general audience before working on anything scientific.
Several years ago I wrote a game called Solumns. It is an evil variant of the classic Sega game Columns. Inspired by bastet, it bruteforces for colour combinations disadvantageous to the player. It is hard.
I would like to improve upon its AI. I figure the game space (grid of coloured blocks, column positions, column colours) fits a graph structure better than a list of attributes. If that is the case, then this problem is similar to my research problem.
I am considering using the following high-level plan to solve this problem:
I'm thinking what would be useful is if the AI opponent could assign a fitness rating to a possible move based on more data than the number of existing squares on the board after the move. I'm thinking using a categoriser. Train on the move and all past moves, using the course of the rest of the game as a measure of success.
I am also thinking of developing a player bot that can beat the standard AI opponent. This could be useful when generating data for 1.
Use a sample of the player bot's games to build an AI that beats the strategic player. Maybe use this data for 1, too.
Write a fun AI that delegates to a possible combination of 1, 3, and the original AI, when appropriate, which I will determine using experimentation to find heuristic fudge factors.
To build the player bot, I figured I could use brute force to compute the sample space. Then use machine learning techniques such as those used in building Random Forests to create some kind of decision maker.
Building the AI opponent is where I am most perplexed.
Specific questions then:
Rating moves sounds like the kind of thing people do with chess, and although I'll admit my approach may be ignorant, there is a lot about this in literature and I can learn from that. Question is, should the player bot and AI opponent create the data sample? It sounds like I'm getting confused between different sample sets, which sounds like a recipe for bad training. Should I just play the game a bunch?
What kind of algorithm should I consider for training the player bot against the current AI?
What kind of algorithm should I consider for training an AI opponent against the player bot?
Extra info:
I am deliberately not asking if this strategy is a good fit for programming a game AI. Sure, you might be able to write a great AI more simply (after all it is already difficult to play). I want to learn about machine learning by doing something fun.
The original game was written in a mixture of racket and C. I am porting it to jruby for various reasons, likely with extensions or RPC calls to another faster language. I am not so interested in existing language-specific solutions here. I want to develop skills in this area and am not afraid to implement an algorithm for myself.
You can get the source for the original game here

I would not go for machine learning here. Look at game playing AIs.
You have an adversarial games (like Go) with two asymmetric players:
The user who places the pieces,
and the computer who chooses the pieces (instead of choosing pieces by chance).
I would probably start with Monte Carlo Tree Search.

Related

What's the best method to implement multiplayer on a Unity Billiard game?

I'm making an online billiard game. I've finished all the mechanics for single player, online account system, online inventory system etc. Everything's fine but I've gotten to the hardest part now, the multiplayer. I tried syncing the position of each ball every frame but the movement wasn't smooth at all, the balls would move back and forth and it looked "bad" in general. Does anyone have any solution for this ? How do other billiard games like the one in Miniclip do it, I'm honestly stuck here and frustrated as it took me a while to learn Photon networking then to find out it's not that good at handling the physics synchronization.
Would uNet be a better choice here ?
I appreciate any help you give me. Thank you!

This is done with PUN already: https://www.assetstore.unity3d.com/en/#!/content/15802
You can try to play with synchronization settings or implement custom OnPhotonSerializeView (see DemoSynchronization in PUN package). Make sure that physic simulation disabled on synchronized clients. See DemoBoxes for physics simulation sample.
Or, if balls can move along lines only, do not send all positions every frame. Send positions and velocities only when balls colliding and do simple velocity simulation between. This can work even with more comprehensive physics but general rule is the same: synchronize it at key points. Of course this is not as simple as automatic synchronization.
Also note that classic billiard is turnbased game and you do not have all the complexity of players interaction. In worst case you can 'record' simulation on current player client and 'playback' it on others.

Hidden Markov Models instead of FSM in a first person shooter game

I have been working on a course project in which we implemented an FPS using FSMs, by showing a top 2d view of the game, and using the bots and players and circles. The behaviour of bots was deterministic. For example, if the bot's health drops to below a threshhold, and the player is visible, the bot flees, else it looks for health packs.
However, I felt that in this case the bot isn't showing much of intelligence, as most of the decisions it takes are based on rules already decided by us.
What other techniques could I use, which would help me implement some real intelligence in the bot? I've been looking at HMMs, and I feel that they might help in bringing more uncertainty in the bot, and the bot might start being more autonomous in taking decisions than depending on pre defined rules.
What do you guys think? Any advice would be appreciated.

I don't think using a hidden Markov model would really be more autonomous. It would just be following the more opaque rules of the model rather than the explicit rules of the state machine. It's still deterministic. The only uncertainty they bring is to the observer, who doesn't have a simple ruleset to base predictions on.
That's not to say they can't be used effectively - if I recall correctly, several bots for FPS games used this sort of system to learn from players and develop their own AI.
But this does depend exactly on what you want to model with the process. AI is not really about algorithms, but about representation. If all you do is pick the same states that your current FSM has and observe an existing player's transitions, you're not likely to get a better system than having an expert input carefully tweaked rules for an FSM.
Given that you're not going to manage to implement "some real intelligence" as that is currently considered beyond modern science, what is it you want to be able to create? Is it a system that learns from its own experiments? A system that learns by observing human subjects? One that deliberately introduces unusual choices in order to make it harder for an opponent to predict?

Convert graph in to data points using Mechanical Turk?

I looked around but did not see anyone using Mechanical Turk for this. I've heard of the service, but never used it before. I need to take the following graph and digitize it so I get a list of data points for each line (noting that there are two Y-axes, and thus depends on which line we are talking about). This is pretty time consuming for me, and I saw other posts on StackOverflow about digitizing software doing a poor job at this. Would Mechanical Turk be well suited to my task?
Here is the graph for reference: http://www.yourpicturehost.com/dyno_hbspeed.jpg

Depends how many of these you have. Mechanical turk could work quite well, but you'd have to check the accuracy carefully (eg by re-plotting the graphs, and comparing them yourself).
If you have a lot, though - you should be able to design an image processing algorithm to pick up the data.

exploring mathematics of/in computer science [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 13 years ago.
I have been working for two years in software industry. Some things that have puzzled me are as follows:
There is lack of application of mathematics in current software industry.
e.g.: When a mechanical engineer designs an electricity pole , he computes the stress on the foundation by using stress analysis techniques(read mathematical equations) to determine exactly what kind and what grade of steel should be used, but when a software developer deploys a web server application he just guesses on the estimated load on his server and leaves the rest on luck and god, there is nothing that he can use to simulate mathematically to answer his problem (my observation).
Great softwares (wind tunnel simulators etc) and computing programs(like matlab etc) are there to simulate real world problems (because they have their mathematical equations) but we in software industry still are clueless about how much actual resources in terms of memory , computing resources, clock speed , RAM etc would be needed when our server side application would actually be deployed. we just keep on guessing about the solution and solve such problem's by more or less 'hit and trial' (my observation).
Programming is done on API's, whether in c, C#, java etc. We are never able to exactly check the complexity of our code and hence efficiency because somewhere we are using an abstraction written by someone else whose source code we either don't have or we didn't have the time to check it.
e.g. If I write a simple client server app in C# or java, I am never able to calculate beforehand how much the efficiency and complexity of this code is going to be or what would be the minimum this whole client server app will require (my observation).
Load balancing and scalability analysis are just too vague and are merely solved by adding more nodes if requests on the server are increasing (my observation).
Please post answers to any of my above puzzling observations.
Please post relevant references also.
I would be happy if someone proves me wrong and shows the right way.
Thanks in advance
Ashish

I think there are a few reasons for this. One is that in many cases, simply getting the job done is more important than making it perform as well as possible. A lot of software that I write is stuff that will only be run on occasion on small data sets, or stuff where the performance implications are pretty trivial (it's a loop that does a fixed computation on each element, so it's trivially O(n)). For most of this software, it would be silly to spend time analyzing the running time in detail.
Another reason is that software is very easy to change later on. Once you've built a bridge, any fixes can be incredibly expensive, so it's good to be very sure of your design before you do it. In software, unless you've made a horrible architectural choice early on, you can generally find and optimize performance hot spots once you have some more real-world data about how it performs. In order to avoid those horrible architectural choices, you can generally do approximate, back-of-the-envelope calculations (make sure you're not using an O(2^n) algorithm on a large data set, and estimate within a factor of 10 or so how many resources you'll need for the heaviest load you expect). These do require some analysis, but usually it can be pretty quick and off the cuff.
And then there are cases in which you really, really do need to squeeze the ultimate performance out of a system. In these case, people frequently do actually sit down, work out the performance characteristics of the systems they are working with, and do very detailed analyses. See, for instance, Ulrich Drepper's very impressive paper What Every Programmer Should Know About Memory (pdf).

Think about the engineering sciences, they all have very well defined laws that are applicable to the design, and building of physical items, things like gravity, strength of materials, etc. Whereas in Computer science, there are not many well defined laws when it comes to building an application against.
I can think of many different ways to write a simple hello world program that would satisfy the requirment. However, if I have to build an electricity pole, I am severely constrained by the physical world, and the requirements of the pole.

Point by point
An electricity pole has to withstand the weather, a load, corrosion etc and these can be quantified and modelled. I can't quantify my website launch success, or how my database will grow.
Premature optimisation? Good enough is exactly that, fix it when needed. If you're a vendor, you've no idea what will be running your code in real life or how it's configured. Again you can't quantify it.
Premature optimisation
See point 1. I can add as needed.
Carrying on... even engineers bollix up. Collapsing bridges, blackout, car safety recalls, "wrong kind of snow" etc etc. Shall we change the question to "why don't engineers use more empirical observations?"

The answer to most of these is in order to have meaningful measurements (and accepted equations, limits, tolerances etc) that you have in real-world engineering you first need a way of measuring what it is that you are looking at.
Most of these things simply can't be measured easily - Software complexity is a classic, what is "complex"? How do you look at source code and decide if it is complex or not? McCabe's Cyclomatic Complexity is the closest standard we have for this but it's still basically just counting branch instructions in methods.

There is little math in software programs because the programs themselves are the equation. It is not possible to figure out the equation before it is actually run. Engineers use simple (and very complex) programs to simulate what happens in the real world. It is very difficult to simulate a simulator. additionally, many problems in computer science don't even have an answer mathematically: see traveling salesman.
Much of the mathematics is also built into languages and libraries. If you use a hash table to store data, you know to find any element can be done in constant time O(1), no matter how many elements are in the hash table. If you store it in a binary tree, it will take longer depending on the number of elements [0(n^2) if i remember correctly].

The problem is that software talks with other software, written by humans. The engineering examples you describe deal with physical phenomenon, which are constant. If I develop an electrical simulator, everyone in the world can use it. If I develop a protocol X simulator for my server, it will help me, but probably won't be worth the work.
No one can design a system from scratch and people that write semi-common libraries generally have plenty of enhancements and extensions to work on rather than writing a simulator for their library.
If you want a network traffic simulator you can find one, but it will tell you little about your server load because the traffic won't be using the protocol your server understands. Every server is going to see completely different sets of traffic.

There is lack of application of mathematics in current software industry.
e.g.: When a mechanical engineer designs an electricity pole , he computes the stress on the foundation by using stress analysis techniques(read mathematical equations) to determine exactly what kind and what grade of steel should be used, but when a software developer deploys a web server application he just guesses on the estimated load on his server and leaves the rest on luck and god, there is nothing that he can use to simulate mathematically to answer his problem (my observation).
I wouldn't say that luck or god are always the basis for load estimation. Often realistic data can be had.
It's also not true that there are no mathematical techniques to answer the question. Operations research and queuing theory can be applied to good advantage.
The real problem is that mechanical engineering is based on laws of physics and a foundation of thousands of years worth of empirical and scientific investigation. Computer science is only as old as me. Computer science will be much further along by the time your children and grandchildren apply the best practices of their day.

An MIT EE grad would not have this problem ;)

My thoughts:
Some people do actually apply math to estimate server load. The equations are very complex for many applications and many people resort to rules of thumb, guess and adjust or similar strategies. Some applications (real time applications with a high penalty for failure... weapons systems, powerplant control applications, avionics) carefully compute the required resources and ensure that they will be available at runtime.
Same as 1.
Engineers also use components provided by others, with a published interface. Think of electrical engineering. You don't usually care about the internals of a transistor, just it's interface and operating specifications. If you wanted to examine every component you use in all of it's complexity, you would be limited to what one single person can accomplish.
I have written fairly complex algorithms that determine what to scale when based on various factors such as memory consumption, CPU load, and IO. However, the most efficient solution is sometimes to measure and adjust. This is especially true if the application is complex and evolves over time. The effort invested in modeling the application mathematically (and updating that model over time) may be more than the cost of lost efficiency by try and correct approaches. Eventually, I could envision a better understanding of the correlation between code and the environment it executes in could lead to systems that predict resource usage ahead of time. Since we don't have that today, many organizations load test code under a wide range of conditions to empirically gather that information.

Software engineering are very different from the typical fields of engineering. Where "normal" engineering are bound to the context of our physical universe and the laws in it we've identified, there's no such boundary in the software world.
Producing software are usually an attempt to mirror a subset of the real-life world into a virtual reality. Here we define the laws ourselves, by only picking the ones we need and by making them just as complex as we need. Because of this fundamental difference, you need to look at the problem-solving from a different perspective. We try to make abstractions to make complex parts less complex, just like we teach kids that yellow + blue = green, when it's really the wavelength of the light that bounces on the paper that changes.
Once in a while we are bound by different laws though. Stuff like Big-O, Test-coverage, complexity-measurements, UI-measurements and the likes are all models of mathematic laws. If you look into digital signal processing, realtime programming and functional programming, you'll often find that the programmers use equations to figure out a way to do what they want. - but these techniques aren't really (to some extend) useful to create a virtual domain, that can solve complex logic, branching and interact with a user.

The reasons why wind tunnels, simulations, etc.. are needed in the engineering world is that it's much cheaper to build a scaled down prototype, than to build the full thing and then test it. Also, a failed test on a full scale bridge is destructive - you have to build a new one for each test.
In software, once you have a prototype that passes the requirements, you have the full-blown solution. there is no need to build the full-scale version. You should be running load simulations against your server apps before going live with them, but since loads are variable and often unpredictable, you're better off building the app to be able to scale to any size by adding more hardware than to target a certain load. Bridge builders have a given target load they need to handle. If they had a predicted usage of 10 cars at any given time, and then a year later the bridge's popularity soared to 1,000,000 cars per day, nobody would be surprised if it failed. But with web applications, that's the kind of scaling that has to happen.

1) Most business logic is usually broken down into decision trees. This is the "equation" that should be proofed with unit tests. If you put in x then you should get y, I don't see any issue there.
2,3) Profiling can provide some insight as to where performance issues lie. For the most part you can't say that software will take x cycles because that will change over time (ie database becomes larger, OS starts going funky, etc). Bridges for instance require constant maintenance, you can't slap one up and expect it to last 50 years without spending time and money on it. Using libraries is like not trying to figure out pi every time you want to find the circumference of a circle. It has already been proven (and is cost effective) so there is no need to reinvent the wheel.
4) For the most part web applications scale well horizontally (multiple machines). Vertical (multithreading/multiprocess) scaling tends to be much more complex. Adding machines is usually relatively easy and cost effective and avoid some bottlenecks that become limited rather easily (disk I/O). Also load balancing can eliminate the possibility of one machine being a central point of failure.
It isn't exactly rocket science as you never know how many consumers will come to the serving line. Generally it is better to have too much capacity then to have errors, pissed of customers and someone (generally your boss) chewing your hide out.

How to get started on Information Extraction?

Could you recommend a training path to start and become very good in Information Extraction. I started reading about it to do one of my hobby project and soon realized that I would have to be good at math (Algebra, Stats, Prob). I have read some of the introductory books on different math topics (and its so much fun). Looking for some guidance. Please help.
Update: Just to answer one of the comment. I am more interested in Text Information Extraction.

Just to answer one of the comment. I am more interested in Text Information Extraction.
Depending on the nature of your project, Natural language processing, and Computational linguistics can both come in handy -they provide tools to measure, and extract features from the textual information, and apply training, scoring, or classification.
Good introductory books include OReilly's Programming Collective Intelligence (chapters on "searching, and ranking", Document filtering, and maybe decision trees).
Suggested projects utilizing this knowledge: POS (part-of-speech) tagging, and named entity recognition (ability to recognize names, places, and dates from the plain text). You can use Wikipedia as a training corpus since most of the target information is already extracted in infoboxes -this might provide you with some limited amount of measurement feedback.
The other big hammer in IE is search, a field not to be underestimated. Again, OReilly's book provides some introduction in basic ranking; once you have a large corpus of indexed text, you can do some really IE tasks with it. Check out Peter Norvig: Theorizing from data as a starting point, and a very good motivator -maybe you could reimplement some of their results as a learning exercise.
As a fore-warning, I think I'm obligated to tell you, that information extraction is hard. The first 80% of any given task is usually trivial; however, the difficulty of each additional percentage for IE tasks are usually growing exponentially -in development, and research time. It's also quite underdocumented -most of the high-quality info is currently in obscure white papers (Google Scholar is your friend) -do check them out once you've got your hand burned a couple of times. But most importantly, do not let these obstacles throw you off -there are certainly big opportunities to make progress in this area.

I would recommend the excellent book Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. It covers a broad area of issues which form a great and up-to-date (2008) basis for Information Extraction and is available online in full text (under the given link).

I would suggest you take a look at the Natural Language Toolkit (nltk) and the NLTK Book. Both are available for free and are great learning tools.

You don't need to be good at math to do IE just understand how the algorithm works, experiment on the cases for which you need an optimal result performance, and the scale with which you need to achieve target accuracy level and work with that. You are basically working with algorithms and programming and aspects of CS/AI/Machine learning theory not writing a PhD paper on building a new machine-learning algorithm where you have to convince someone by way of mathematical principles why the algorithm works so I totally disagree with that notion. There is a difference between practical and theory - as we all know mathematicians are stuck more on theory then the practicability of algorithms to produce workable business solutions. You would, however, need to do some background reading both books in NLP as well as journal papers to find out what people found from their results. IE is a very context-specific domain so you would need to define first in what context you are trying to extract information - How would you define this information? What is your structured model? Supposing you are extracting from semi and unstructured data sets. You would then also want to weigh out whether you want to approach your IE from a standard human approach which involves things like regular expressions and pattern matching or would you want to do it using statistical machine learning approaches like Markov Chains. You can even look at hybrid approaches.
A standard process model you can follow to do your extraction is to adapt a data/text mining approach:
pre-processing - define and standardize your data to extraction from various or specific sources cleansing your data
segmentation/classification/clustering/association - your black box where most of your extraction work will be done
post-processing - cleansing your data back to where you want to store it or represent it as information
Also, you need to understand the difference between what is data and what is information. As you can reuse your discovered information as sources of data to build more information maps/trees/graphs. It is all very contextualized.
standard steps for: input->process->output
If you are using Java/C++ there are loads of frameworks and libraries available you can work with.
Perl would be an excellent language to do your NLP extraction work with if you want to do a lot of standard text extraction.
You may want to represent your data as XML or even as RDF graphs (Semantic Web) and for your defined contextual model you can build up relationship and association graphs that most likely will change as you make more and more extractions requests. Deploy it as a restful service as you want to treat it as a resource for documents. You can even link it to taxonomized data sets and faceted searching say using Solr.
Good sources to read are:
Handbook of Computational Linguistics and Natural Language Processing
Foundations of Statistical Natural Language Processing
Information Extraction Applications in Prospect
An Introduction to Language Processing with Perl and Prolog
Speech and Language Processing (Jurafsky)
Text Mining Application Programming
The Text Mining Handbook
Taming Text
Algorithms of Intelligent Web
Building Search Applications
IEEE Journal
Make sure you do a thorough evaluation before deploying such applications/algorithms into production as they can recursively increase your data storage requirements. You could use AWS/Hadoop for clustering, Mahout for large scale classification amongst others. Store your datasets in MongoDB or unstructured dumps into jackrabbit, etc. Try experimenting with prototypes first. There are various archives you can use to base your training on say Reuters corpus, tipster, TREC, etc. You can even check out alchemy API, GATE, UIMA, OpenNLP, etc.
Building extractions from standard text is easier than say a web document so representation at pre-processing step becomes even more crucial to define what exactly it is you are trying to extract from a standardized document representation.
Standard measures include precision, recall, f1 measure amongst others.

I disagree with the people who recommend reading Programming Collective Intelligence. If you want to do anything of even moderate complexity, you need to be good at applied math and PCI gives you a false sense of confidence. For example, when it talks of SVM, it just says that libSVM is a good way of implementing them.
Now, libSVM is definitely a good package but who cares about packages. What you need to know is why SVM gives the terrific results that it gives and how it is fundamentally different from Bayesian way of thinking ( and how Vapnik is a legend).
IMHO, there is no one solution to it. You should have a good grip on Linear Algebra and probability and Bayesian theory. Bayes, I should add, is as important for this as oxygen for human beings ( its a little exaggerated but you get what I mean, right ?). Also, get a good grip on Machine Learning. Just using other people's work is perfectly fine but the moment you want to know why something was done the way it was, you will have to know something about ML.
Check these two for that :
http://pindancing.blogspot.com/2010/01/learning-about-machine-learniing.html
http://measuringmeasures.com/blog/2010/1/15/learning-about-statistical-learning.html
http://measuringmeasures.com/blog/2010/3/12/learning-about-machine-learning-2nd-ed.html
Okay, now that's three of them :) / Cool

The Wikipedia Information Extraction article is a quick introduction.
At a more academic level, you might want to skim a paper like Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text.

Take a look here if you need enterprise grade NER service. Developing a NER system (and training sets) is a very time consuming and high skilled task.

This is a little off topic, but you might want to read Programming Collective Intelligence from O'Reilly. It deals indirectly with text information extraction, and it doesn't assume much of a math background.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex