Project idea using Eigen - linear-algebra

I have been reading the documentation and playing with Eigen recently:
docs
and would like to build something that uses it extensively to learn it well. I looked on their website and they mention various projects that use it - like Google Ceres. Something like that might be too large for one or two people to undertake on the side as an Eigen learning experience so I'm looking for something simpler but not trivial that would use it extensively and is a real - useful - application..

Eigen is extensively used in computer vision, and if you are comfortable with linear algebra and matrix calculus (I assume you are, otherwise you wouldn't use Eigen), why not build a toy VSLAM (visual simultaneous localization and mapping) system? Those that are based on bundle adjustment (there's a whole chapter dedicated to that in zisserman's book on multipleview geometry, and it is also discussed in the open-source yet excellent book enter link description here) can be very tricky to implement efficiently, and will take a lot of time, but since your goal is to learn Eigen, performance shouldn't be that much of an issue. If that seems too hard/long for two people, and you think it demands too much energy for a side-project, I recommend that you select some computer-vision algorithms like those that compute the the essential matrix between two images, or are used in 3d pose estimation. Well, those are the only really fun things that come to mind right now, and they will force you to discover a lot of Eigen's functionalities (and gotchas!).

Related

What is a good language to develop in for simple, yet customizable math programs?

I'm writing to ask for some guidance on choosing a language and course of action in learning programming. I apologize if this type of question is inappropriate for Cross Validated, please advise me to another forum if that is the case.
I've seen thread after thread with questions from newbies, asking, "What is the best language to start with?" and then it always starts a flame war or someone just answers, "There's no best language, it's best to pick one and start learning it." My question is a little bit more focused than that.
First off, I've been programming my whole life, in very limited capacities. My deepest training was in C++. Whilst in my EECS degree program, I resolved to never be a software developer because I couldn't stand not interacting with people for such long periods of time. Instead I realized I wanted to be a math teacher, and so that is the path I have taken.
But now that I'm well down that path, I've started to realize that perhaps I could develop my own software to help me in the classroom. If I want to demonstrate the Euclidean algorithm, what better way than to have a piece of software that breaks down the process? Students could run that software as part of their studies, and the advanced students might even develop programs for themselves. Or, with an Ipad in hand, why not have an app that lets students take their own attendance? It would certainly streamline some of the needs of classroom management.
There's obviously a lot of great stuff already out there for math, and for education, but I want a way to more directly create things specific to my lectures. If I'm teaching a specific way of calculating a percent, I want to create an app that aligns with my teaching style, not just another calculator app that requires the student to learn twice.
The most I use in class right now is iWork Numbers/Microsoft Excel for my stats class. Students can learn the basic statistical functions, and turn some of their data into graphs.
I have dabbled a bit with R, and used Maple in college. I've started the basic tutorials for OS X/iOS development and have actually made good progress making an OS X app that takes a text string, converts it to numbers, and performs encryption using modular addition and multiplication. I sometimes use Wolfram|Alpha to save myself some time in getting quick solutions to equations or base conversions. I know of MatLab, Mathematica, and recently people have been telling me to check into Python or Ruby. I also know basic HTML, and while it's forgotten now, learned Javascript and PERL in college.
If I keep on the path of Obj-C/Cocoa, I think it will have great benefits. Unfortunately, anything I produced for Mac would only be usable on a Mac, so it wouldn't be universal for all of my students. Perhaps then learning a web language would be better. Second, I'm wondering if the primary use is mathematical, then perhaps my time would be better spent learning Mathematica Programming Language, or R, or something based less on GUI and more on simple coding of algorithms, maybe Python or Ruby?
It seems that Mathematica already has a lot of demos for different math concepts, so why reinvent the wheel is also a question I have. I think overall, it would be good to have more control and design things the way I need. And then, if I do want to make an "Attendance" app or something else, I would already have the programming experience to more easily design something for my iPad or MacBook.
The related question to this is what is a good language to teach to my students? In his TED talk, Conrad Wolfram says one of the best ways to check the understanding of a student is have them write a program. But if Mathematica does the math virtually automatically for them, then I'm not sure that will get the deeper experience of working out logic for themselves, like you do when you're writing C, or a traditional procedural language.
I know that programming takes time to learn, but I also know that at this point, my goal is not to be able to make an app like "Tiny Wings." With the app store ease, some of my work may be an extra revenue stream, but I see myself as more of a hobbyist, and now teacher looking to software development specifically for its ability to help me demonstrate mathematical concepts.
I think I will push ahead with Obj-C/Cocoa for OSX/iOS, but if anyone has some better guidance regarding all of the other available stuff, it would be much appreciated. I don't think I would want to go fully to the web (I like apps), but perhaps someone could suggest a nice way of bridging what I produce in XCode to a universal web version. For example, if you come up with an algorithm in obj-c is it easiest to transition that to ruby and run it online, or is there another approach that works better?
Mathematica is pretty awesome for the first part of your question. I've used the interactive mode (Manipulate[]) for explaining things to my colleges (and myself). It makes really nice dynamic figures and is fairly expressive (although your code can end up looking like line noise). It is very powerful, but it does far less for you than you might think. It's pretty intuitive, which is a good thing for teaching.
You could use Scala if you want an "easy" way to make a domain specific language for teaching. Python seems to confuse people as a first programming language. Objective C seems like a completely random choice to me.
Mathematica then. It's worth the price. But anything that is interpreted and has an interactive shell is probably better than a compiled language. BBC BASIC?
Nothing beats Haskell for general-purpose mathematical programming. The wiki's quite extensive and the IRC channel (#haskell on Freenode) is great for asking questions. If you statically link your binaries on compilation, you should be able to run your programs on just about any system (with a few exceptions, e.g., libgmp).
Haskell code reads (roughly) like mathematical notation once you get the hang of it, so it can really help to tie things together for your students who are motivated to write their own programs. The purely functional style can be beneficial, as well, since it focuses less on I/O and the marshalling of data (perfectly useful in applications, perhaps less so in pure math), and more on the actual creation and refinement of functions and algorithms. You can even compose functions just as you would on paper.
If you want to get really serious, you could also look into Coq or Agda, but those might be a bit much for most classes.
For a Haskell program idea for an educator, check out this link.
A nice list of arguments can also be found at:
Eleven Reasons to use Haskell as a Mathematician and the book The Haskell Road to Logic, Maths and Programming

Math me - 2d video games

I'm a hobbyist game programmer. I only do 2d games, no 3d stuff. I don't have a math background and lots of things are tripping me up like bullet projections and angles.
I took two college level Algebra courses at the local community college, but really disappointed. I got As in both, but really don't feel like I'm using any of it in my everyday 2d game programming and still stuck on angles/bullet paths, etc
I dropped out this semester to self study. The advisory at the community college said I want to be in Statistics for this and was really pushing me hard to enroll in that class. He said Statistics then Calculus I & II would get me what I needed.
I've been reading up a lot and not so sure on this. I think I should start with a a Geometry book and then move into Trigonometry? Is that the right approach?
Anyone suggest any good self-study starter books?
I got a lot out of "3D Math Primer for Graphics and Game Development". I know it says 3D but there is a lot of stuff in there for 2D. And the math is fairly simple linear algebra.
It sounds like statistics is useless for what you want to do. Calculus might be marginally useful, but not until you are really solid with it. You probably need to learn trigonometry more than anything else. I could offer more explicit advice if you give an example of a problem you're trying to solve.
There are a few points here:
1) The statistics suggestion is a complete misdirection, and this advice should be completely ignored, along with the person who gave it to you. Statistics is an interesting topic, but not at all useful in game programming (except maybe for a few esoteric approaches to esoteric topics, maybe, like, drawing clouds).
2) (Not that you seem to but...) it's not uncommon for programmers to make the mistake of assuming that they can just learn everything on the job, but most science topics (including math) can not be effectively learned this way. With these, one needs a much more structured approach, building an elaborate structure of ideas, with each new idea built on top of the previous. You could certainly program games with a few equations that you learned to use from a game programming book, but it's unlikely you'd ever have the ability to solve problems that you hadn't already seen solved somewhere else.
3) The best way to get comfortable with math is to solve lots of problems, and not on the computer, but with pencil and paper. For example, you can easily write a program to test that sin2+cos2=1, but to prove it, you need to understand it.
4) Of the topics you'll need, trig is the most time effective place to start. Geometry would be a bit useful, but probably not so much. Another useful topic is linear algebra. Calculus is also useful for calculating trajectories that have acceleration (and gravity), but it's a much bigger topic and involves so many new ideas that it's probably a bit difficult to pick up on your own. Maybe for this topic it's best to try to glean a few useful approaches and equations.
Final suggestion: I recommend starting with trig, and use a book that gives concise explanations followed by lots of problems that are solved in the back. For example, Schaum's Outline of Trig for $13, would probably be a good choice. You don't need to solve a every problem in the book, but work them until you're comfortable, and then move on.

How To: Pattern Recognition

I'm interested in learning more about pattern recognition. I know that's somewhat of a broad field, so I'll list some specific types of problems I would like to learn to deal with:
Finding patterns in a seemingly random set of bytes.
Recognizing known shapes (such as circles and squares) in images.
Noticing movement patterns given a stream of positions (Vector3)
This is a new area of experimentation for me personally, and to be honest, I simply don't know where to start :-) I'm obviously not looking for the answers to be provided to me on a silver platter, but some search terms and/or online resources where I can start to acquaint myself with the concepts of the above problem domains would be awesome.
Thanks!
ps: For extra credit, if said resources provide code examples/discussion in C# would be grand :-) but doesn't need to be
Hidden Markov Models are a great place to look, as well as Artificial Neural Networks.
Edit: You could take a look at NeuronDotNet, it's open source and you could poke around the code.
Edit 2: You can also take a look at ITK, it's also open source and implements a lot of these types of algorithms.
Edit 3: Here's a pretty good intro to neural nets. It covers a lot of the basics and includes source code (albeit in C++). He implemented an unsupervised learning algorithm, I think you may be looking for a supervised backpropagation algorithm to train your network.
Edit 4: Another good intro, avoids really heavy math, but provides references to a lot of that detail at the bottom, if you want to dig into it. Includes pseudo-code, good diagrams, and a lengthy description of backpropagation.
This is kind of like saying "I'd like to learn more about electronics.. anyone tell me where to start?" Pattern Recognition is a whole field - there are hundreds, if not thousands of books out there, and any university has at least several (probably 10 or more) courses at the grad level on this. There are numerous journals dedicated to this as well, that have been publishing for decades ... conferences ..
You might start with the wikipedia.
http://en.wikipedia.org/wiki/Pattern_recognition
This is kind of an old question, but it's relevant so I figured I'd post it here :-) Stanford began offering an online Machine Learning class here - http://www.ml-class.org
OpenCV has some functions for pattern recognition in images.
You might want to look at this :http://opencv.willowgarage.com/documentation/pattern_recognition.html. (broken link: closest thing in the new doc is http://opencv.willowgarage.com/documentation/cpp/ml__machine_learning.html, although it is no longer what I'd call helpful documentation for a beginner - see other answers)
However, I also recommend starting with Matlab because openCV is not intuitive to use.
Lot of useful links on this page on computer vision related pattern recognition. Some of the links seem to be broken now but you may find it useful.
I am not an expert on this, but reading about Hidden Markov Models is a good way to start.
Beware false patterns! For any decently large data set you will find subsets that appear to have pattern, even if it is a data set of coin flips. No good process for pattern recognition should be without statistical techniques to assess confidence that the detected patterns are real. When possible, run your algorithms on random data to see what patterns they detect. These experiments will give you a baseline for the strength of a pattern that can be found in random (a.k.a "null") data. This kind of technique can help you assess the "false discovery rate" for your findings.
learning pattern-recoginition is easier in matlab..
there are several examples and there are functions to use.
it is good for the understanding concepts and experiments...
I would recommend starting with some MATLAB toolbox. MATLAB is an especially convenient place to start playing around with stuff like this due to its interactive console. A nice toolbox I personally used and really liked is PRTools (http://prtools.org); they have an implementation of pretty much every pattern recognition tool and also some other machine learning tools (Neural Networks, etc.). But the nice thing about MATLAB is that there are many other toolboxes as well you can try out (there is even a proprietary toolbox from Mathworks)
Whenever you feel comfortable enough with the different tools (and found out which classifier is perfomring best for you problem), you can start thinking about implementing the machine learning in a different application.

How to get started on Information Extraction?

Could you recommend a training path to start and become very good in Information Extraction. I started reading about it to do one of my hobby project and soon realized that I would have to be good at math (Algebra, Stats, Prob). I have read some of the introductory books on different math topics (and its so much fun). Looking for some guidance. Please help.
Update: Just to answer one of the comment. I am more interested in Text Information Extraction.
Just to answer one of the comment. I am more interested in Text Information Extraction.
Depending on the nature of your project, Natural language processing, and Computational linguistics can both come in handy -they provide tools to measure, and extract features from the textual information, and apply training, scoring, or classification.
Good introductory books include OReilly's Programming Collective Intelligence (chapters on "searching, and ranking", Document filtering, and maybe decision trees).
Suggested projects utilizing this knowledge: POS (part-of-speech) tagging, and named entity recognition (ability to recognize names, places, and dates from the plain text). You can use Wikipedia as a training corpus since most of the target information is already extracted in infoboxes -this might provide you with some limited amount of measurement feedback.
The other big hammer in IE is search, a field not to be underestimated. Again, OReilly's book provides some introduction in basic ranking; once you have a large corpus of indexed text, you can do some really IE tasks with it. Check out Peter Norvig: Theorizing from data as a starting point, and a very good motivator -maybe you could reimplement some of their results as a learning exercise.
As a fore-warning, I think I'm obligated to tell you, that information extraction is hard. The first 80% of any given task is usually trivial; however, the difficulty of each additional percentage for IE tasks are usually growing exponentially -in development, and research time. It's also quite underdocumented -most of the high-quality info is currently in obscure white papers (Google Scholar is your friend) -do check them out once you've got your hand burned a couple of times. But most importantly, do not let these obstacles throw you off -there are certainly big opportunities to make progress in this area.
I would recommend the excellent book Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. It covers a broad area of issues which form a great and up-to-date (2008) basis for Information Extraction and is available online in full text (under the given link).
I would suggest you take a look at the Natural Language Toolkit (nltk) and the NLTK Book. Both are available for free and are great learning tools.
You don't need to be good at math to do IE just understand how the algorithm works, experiment on the cases for which you need an optimal result performance, and the scale with which you need to achieve target accuracy level and work with that. You are basically working with algorithms and programming and aspects of CS/AI/Machine learning theory not writing a PhD paper on building a new machine-learning algorithm where you have to convince someone by way of mathematical principles why the algorithm works so I totally disagree with that notion. There is a difference between practical and theory - as we all know mathematicians are stuck more on theory then the practicability of algorithms to produce workable business solutions. You would, however, need to do some background reading both books in NLP as well as journal papers to find out what people found from their results. IE is a very context-specific domain so you would need to define first in what context you are trying to extract information - How would you define this information? What is your structured model? Supposing you are extracting from semi and unstructured data sets. You would then also want to weigh out whether you want to approach your IE from a standard human approach which involves things like regular expressions and pattern matching or would you want to do it using statistical machine learning approaches like Markov Chains. You can even look at hybrid approaches.
A standard process model you can follow to do your extraction is to adapt a data/text mining approach:
pre-processing - define and standardize your data to extraction from various or specific sources cleansing your data
segmentation/classification/clustering/association - your black box where most of your extraction work will be done
post-processing - cleansing your data back to where you want to store it or represent it as information
Also, you need to understand the difference between what is data and what is information. As you can reuse your discovered information as sources of data to build more information maps/trees/graphs. It is all very contextualized.
standard steps for: input->process->output
If you are using Java/C++ there are loads of frameworks and libraries available you can work with.
Perl would be an excellent language to do your NLP extraction work with if you want to do a lot of standard text extraction.
You may want to represent your data as XML or even as RDF graphs (Semantic Web) and for your defined contextual model you can build up relationship and association graphs that most likely will change as you make more and more extractions requests. Deploy it as a restful service as you want to treat it as a resource for documents. You can even link it to taxonomized data sets and faceted searching say using Solr.
Good sources to read are:
Handbook of Computational Linguistics and Natural Language Processing
Foundations of Statistical Natural Language Processing
Information Extraction Applications in Prospect
An Introduction to Language Processing with Perl and Prolog
Speech and Language Processing (Jurafsky)
Text Mining Application Programming
The Text Mining Handbook
Taming Text
Algorithms of Intelligent Web
Building Search Applications
IEEE Journal
Make sure you do a thorough evaluation before deploying such applications/algorithms into production as they can recursively increase your data storage requirements. You could use AWS/Hadoop for clustering, Mahout for large scale classification amongst others. Store your datasets in MongoDB or unstructured dumps into jackrabbit, etc. Try experimenting with prototypes first. There are various archives you can use to base your training on say Reuters corpus, tipster, TREC, etc. You can even check out alchemy API, GATE, UIMA, OpenNLP, etc.
Building extractions from standard text is easier than say a web document so representation at pre-processing step becomes even more crucial to define what exactly it is you are trying to extract from a standardized document representation.
Standard measures include precision, recall, f1 measure amongst others.
I disagree with the people who recommend reading Programming Collective Intelligence. If you want to do anything of even moderate complexity, you need to be good at applied math and PCI gives you a false sense of confidence. For example, when it talks of SVM, it just says that libSVM is a good way of implementing them.
Now, libSVM is definitely a good package but who cares about packages. What you need to know is why SVM gives the terrific results that it gives and how it is fundamentally different from Bayesian way of thinking ( and how Vapnik is a legend).
IMHO, there is no one solution to it. You should have a good grip on Linear Algebra and probability and Bayesian theory. Bayes, I should add, is as important for this as oxygen for human beings ( its a little exaggerated but you get what I mean, right ?). Also, get a good grip on Machine Learning. Just using other people's work is perfectly fine but the moment you want to know why something was done the way it was, you will have to know something about ML.
Check these two for that :
http://pindancing.blogspot.com/2010/01/learning-about-machine-learniing.html
http://measuringmeasures.com/blog/2010/1/15/learning-about-statistical-learning.html
http://measuringmeasures.com/blog/2010/3/12/learning-about-machine-learning-2nd-ed.html
Okay, now that's three of them :) / Cool
The Wikipedia Information Extraction article is a quick introduction.
At a more academic level, you might want to skim a paper like Integrating Probabilistic Extraction Models and Data Mining to Discover Relations and Patterns in Text.
Take a look here if you need enterprise grade NER service. Developing a NER system (and training sets) is a very time consuming and high skilled task.
This is a little off topic, but you might want to read Programming Collective Intelligence from O'Reilly. It deals indirectly with text information extraction, and it doesn't assume much of a math background.

What are some good rigid body dynamics references?

I'm not a math guy in the least but I'm interested in learning about rigid body physics (for the purpose of implementing a basic 3d physics engine). In school I only took Maths through Algebra II, but I've done 3d dev for years so I have a fairly decent understanding of vectors, quaternions, matrices, etc. My real problem is reading complex formulas and such, so I'm looking for some decent rigid body dynamics references that will make some sense.
Anyone have any good references?
Physics for Game Programmers I think is better than Physics for Game Developers.
If you want something thick in your bookshelf (like I do), Eberly's 3D Game Engine Design and Erleben's Physics-Based Animation can accompany the above.
Chris Hecker has a nice set of articles on his website which were originally published in Game Developer Magazine. They start with 2D physics and progress to 3D.
Physically Based Modeling by David Baraff is also good, but is a bit heavier on the math.
I guess what you are looking for is Classical Mechanics, which describes motion in one, two, and three dimensions in a generalized manner.
I found a good introductory course on Classical Mechanics from the University of Texas.
I do not guarantee that you will be able to understand all the concepts there, but it will at least give you a basis for your plan. I advise you to consult a Physics professor to help you understand the math.
Good luck!
If you are already familiar (and comfortable) with
linear algebra
basic calculus
Newton's laws of motion
then 6DoF Rigid Body Dynamics is what you are looking for. It's a brief article written [disclaimer: by me] when I once had to develop a helicopter flight simulator.
Using a rotation matrix allows for extremely simple modelling equations, but there exists a simple mapping to and from a quaternion if you prefer that representation for other reasons.
Trying not to get you to rip off your hair with frustration (well, Baraff's/Witkin great math articles with the multi-dimensional matrices would do that sometimes), you can look at the easier online articles such as the ones published in Gamasutra.
Here are two of them:
http://www.gamasutra.com/resource_guide/20030121/kennedy_pfv.htm
http://www.gamasutra.com/features/19990702/data_structures_01.htm
http://www.gamasutra.com/resource_guide/20030121/jacobson_pfv.htm
You'd notice that they point at the mentioned resources as part of their references. I would add that unless you need to solve equations system for multiple particles, articulated characters, or non-rigid complex object, this might be enough to start with.
If however, you do look for more advanced physics and mathematics which involves matrices and equations systems look up Witkin and Baraff's home pages (I think they are both in Pixar if I'm not mistaken), or start with Hecker (that tried more than several practical methods and documented his results).

Resources