Multivariate Similarity Metrics

Say I am tracking two objects moving in space and time. I know their x,y coordinates and a score (the score being a probabilistic measure of the tracked object actually being the object of interest), and I get several such {x, y, score} samples over time for each object.
What metric would I use to measure the "similarity" of, say, a ball moving across a room vs. a man moving across the room vs. a child moving across the room?
Assume the score is pretty accurate.

Given your description, I'd recommend looking into a Hidden Markov Model or possibly an artificial neural network or some other machine learning approach. However, there are a number of other techniques that might be more appropriate for your situation.
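If you want a concrete baseline to compare those approaches against, one simple option (not the HMM/ANN route, just a plain distance) is to resample the two trajectories onto a common time grid and compute a score-weighted average pointwise distance. A minimal sketch in R, assuming each track is a data frame with columns t, x, y, score (all names are hypothetical):

# Hedged sketch of a baseline trajectory-similarity metric.
# Resample both tracks onto a shared time grid, then average the
# pointwise Euclidean distance, weighted by the product of the scores.
trajectory_distance <- function(a, b, n = 100) {
  t_lo <- max(min(a$t), min(b$t))           # overlapping time window
  t_hi <- min(max(a$t), max(b$t))
  tt <- seq(t_lo, t_hi, length.out = n)     # common time grid
  ax <- approx(a$t, a$x, xout = tt)$y
  ay <- approx(a$t, a$y, xout = tt)$y
  bx <- approx(b$t, b$x, xout = tt)$y
  by <- approx(b$t, b$y, xout = tt)$y
  w <- approx(a$t, a$score, xout = tt)$y * approx(b$t, b$score, xout = tt)$y
  sum(w * sqrt((ax - bx)^2 + (ay - by)^2)) / sum(w)
}

A small distance then means "these two tracks moved alike"; you could equally well feed the same resampled features into an HMM, or use a dynamic-time-warping distance if the trajectories are not time-aligned.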

Point pattern similarity and comparison

I recently started to work with a huge dataset provided by a medical emergency service. I have circa 25,000 spatial points of incidents.
I have been searching books and the internet for quite some time and am getting more and more confused about what to do and how to do it.
The points are, of course, very clustered. I calculated the K, L and G functions for them and they confirm serious clustering.
I also have a population point dataset - one point for every citizen - which is similarly clustered to the incidents dataset (incidents happen to people, so there is a strong link between these two datasets).
I want to compare these two datasets to figure out whether they are similarly distributed. I want to know if there are places where there are more incidents relative to the population. In other words, I want to use the population dataset to explain the intensity, and then figure out whether the incident dataset corresponds to that intensity. The assumption is that incidents should occur randomly with respect to the population.
I want to get a plot of the region showing where there are more or fewer incidents than expected if the incidents were happening randomly to people.
How would you do it with R?
Should I use Kest or Kinhom to calculate the K function? I read the descriptions, but still don't understand the basic difference between them.
I tried using Kcross, but as I figured out, one of the two datasets should be CSR - completely spatially random.
I also found Kcross.inhom; should I use that one for my data?
How can I get a plot (image) of incident deviations with respect to population?
I hope I asked clearly.
Thank you for your time to read my question, and even more thanks if you can answer any of my questions.
Best regards!
Jernej
I do not have time to answer all your questions in full, but here are some pointers.
DISCLAIMER: I am a coauthor of the spatstat package and the book Spatial Point Patterns: Methodology and Applications with R [1], so I have a preference for using these (and I genuinely believe they are the best tools for your problem).
Conceptual issue: How big is your study region and does it make sense to treat the points as distributed everywhere in the region or are they confined to be on the road network?
For now I will assume they can be located anywhere in the region.
A simple approach would be to estimate the population density using density.ppp and then fit a Poisson model to the incidents, with the population density as the intensity, using ppm. This would probably be a reasonable null model, and if it fits the data well you can basically say that incidents happen "completely at random in space when controlling for the uneven population density". More info on density.ppp and ppm is in chapters 6 and 9 of [1], respectively, and of course in the spatstat help files.
If you use summary statistics like the K/L/G/F/J-functions you should always use the inhom versions to take the population density into account. This is covered in chapter 7 of [1].
Also it could probably be interesting to look at the relative risk (relrisk) if you combine all your points into a marked point pattern with two types (background and incidents). See chapter 14 of [1].
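A minimal spatstat sketch of those steps, assuming incidents and people are already ppp objects on the same window (the object names are mine, not from the question):

library(spatstat)

# 1. Kernel estimate of the population intensity (a pixel image).
popden <- density(people, sigma = bw.diggle(people))

# 2. Null model: Poisson process with intensity proportional to
#    population density, fitted via a log offset.
fit <- ppm(incidents ~ offset(log(popden)))

# 3a. Inhomogeneous K-function, accounting for the uneven intensity.
plot(Kinhom(incidents, lambda = fit))

# 3b. Relative risk from a two-type marked pattern:
#     a spatially varying case-probability / relative-risk surface.
combined <- superimpose(background = people, incident = incidents)
plot(relrisk(combined))

The relrisk plot is one way to get the "more or fewer incidents than expected" map asked for above.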
Unfortunately, only chapters 3, 7 and 9 of [1] are available as free-to-download sample chapters, but I hope you have access to it at your library or have the option of buying it.

FEM software for increment calculations

Can anyone suggest software that will calculate incremental stress effects on a body?
A particular application would be calculating incremental stress on gear teeth through a simulation run.
Since we would have a cyclic run: if we had two gears, their teeth would be in contact once every revolution, and I am interested in knowing if there is software that will keep track of the "damage" done on first contact, which would slightly change the geometry of the gear and, most importantly, change the way the gear responds to the same stress at future contacts.
You'll need a non-linear transient FEA capability.
I'm assuming that the gear's rotational velocity is small enough that you aren't interested in inertial effects. You want a non-linear analysis that tracks loading over one or more rotational cycles.
You need to model contact and friction at the contact points. That's a challenging non-linear problem.
You'll need a mesh that's refined enough in the contact zone to resolve the surface stress you're interested in.
Small strain is sufficient as a first step. Large strains would imply that your geometry is in some trouble.
Damage implies a non-linear material model of some kind. What were you assuming? Small strain plasticity with isotropic or kinematic hardening? Or a more advanced model like Walker or Chaboche?
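For reference (a standard textbook form, not anything gear-specific), small-strain J2 plasticity with linear isotropic hardening uses a yield function like

f(\sigma, \bar{\varepsilon}^p) = \sigma_{\mathrm{eq}}(\sigma) - \left( \sigma_y + H \, \bar{\varepsilon}^p \right)

where \sigma_{\mathrm{eq}} is the von Mises equivalent stress, \sigma_y the initial yield stress, H the hardening modulus, and \bar{\varepsilon}^p the accumulated plastic strain; plastic flow occurs on f = 0. Kinematic hardening instead translates the yield surface by a backstress, which is the feature that matters under load reversal (the Bauschinger effect); models like Chaboche refine this for cyclic loading.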
Do temperature effects matter to you? Must you do a heat transfer analysis as well?
Do you have a model for metallurgical effects (e.g. austenite/martensite phase changes for carbon steel)? Do you have any heat treatment or grain size data that impact your material model?
I'd recommend starting simple and modeling contact between two teeth, one stationary and another in motion.
I haven't done finite element analysis for a living in many years, but when I was a practitioner this kind of problem would be solved with something like MARC or ABAQUS. I believe ANSYS is very popular now. There are also open source finite element solvers, but I'm less familiar with those.
I'm sure you've done a Google search for something like "finite element analysis gear tooth". You're far from the first to be interested in a problem like this.

How to make a good mesh in a biologically accurate model with very small domains

I have been trying to make a biologically accurate 2D spatial model of tissue layers, where different physiological processes happen. This includes mainly chemical reactions, diffusion and fluxes over boundaries.
I am making this model in COMSOL Multiphysics, a finite element software package that solves different physics like reaction-diffusion systems, although for my question this might not be really relevant.
In my geometry, I have really small regions between the cells of the tissue layers. These regions serve as openings where diffusion can take place between the cells (junctions). The quality of the mesh is not great there, and if I want to improve it (mainly by introducing more elements), my simulation time increases drastically. The lower-quality mesh also causes convergence to take longer. I added a picture of the geometry to give an idea. I tried different meshes, with different element qualities and with the number of elements ranging from 16,000 to 50,000.
My background in FEM is really limited and I wanted to know if I can tackle this problem in such a way that it
doesn't negatively affect the biology (keeps the tissue domain sizes, and the problem itself, as biologically accurate as possible),
doesn't increase the simulation time drastically, and
gives a better mesh quality.
So I really want to know what the best way to go is, since I have already thought of some things.
Can I go with the lower-quality mesh (which is not really bad, but not good either), so that I keep the small regions for optimum biological accuracy and a relatively small computation time (and hope I won't run into convergence errors)?
But maybe there are possibilities I am missing. For instance: is it possible to make a small domain bigger and then apply some kind of factor to the diffusion rates? In other words, if I make the domain twice as large, do I scale the diffusion rate by one half? Is that even consistent with the chemical/physical laws?
Hopefully I made the problem a bit clear and thank you greatly in advance for the help.
Cheers,
[Figure: mesh of the tissue model]
I know this thread was posted some months back, but I am unsure if you found a solution.
To find the relationship between accuracy and computational time, run a mesh convergence analysis on your model and see how the mesh size directly affects the results you expect to obtain (pore pressure, fluid velocity, strain, etc.). This will allow you to determine the most appropriate meshing strategy for your specific problem.
Also, keep in mind that the diffusion rate of a material will depend on the pore size and the permeability (by way of Darcy's law). So depending on the assumptions behind your constitutive law and your problem's boundary conditions, you might be able to simplify/enlarge some of the smaller domains in your model, as long as they stay within your previously made assumptions.
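On the scaling question asked above (making a junction twice as large and halving the diffusion rate): for quasi-steady 1D diffusion through a narrow junction of cross-section A and length L, Fick's law gives a flux of roughly

J \approx D \, \frac{A}{L} \, \Delta c

so the junction acts like a conductance D A / L. Under that idealisation, enlarging the cross-section from A to A' while rescaling the diffusion coefficient to D' = D \, A / A' leaves the flux unchanged; doubling a 2D junction's width and halving D is exactly this trade. This only holds while the junction is the rate-limiting bottleneck, i.e. while the concentration drop \Delta c really sits across the junction.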

When and why is crossover beneficial in differential evolution?

I implemented a differential evolution algorithm for a side project I was doing. Because the crossover step seemed to involve a lot of parameter choices (e.g. crossover probabilities), I decided to skip it and just use mutation. The method seemed to work ok, but I am unsure whether I would get better performance if I introduced crossover.
Main Question: What is the motivation behind introducing crossover to differential evolution? Can you provide a toy example where introducing crossover out-performs pure mutation?
My intuition is that crossover will produce something like the following in two dimensions. Say we have two parent vectors (red). Uniform crossover could produce a new trial vector at one of the blue points.
I am not sure why this kind of exploration would be expected to be beneficial. In fact, it seems like this could make performance worse if high-fitness solutions follow some linear trend. In the figure below, let's say the red points are the current population, and the optimal solution is towards the lower right corner. The population is traveling down a valley such that the upper right and lower left corners produce bad solutions. The upper left corner produces "okay" but suboptimal solutions. Notice how uniform crossover produces trials (in blue) that are orthogonal to the direction of improvement. I've used a crossover probability of 1 and neglected mutation to illustrate my point (see code). I imagine this situation could arise quite frequently in optimization problems, but I could be misunderstanding something.
Note: In the above example, I am implicitly assuming that the population was randomly initialized (uniformly) across this space, and has begun to converge to the correct solution down the central valley (top left to bottom right).
This toy example is convex, and thus differential evolution wouldn't even be the appropriate technique. However, if this motif was embedded in a multi-modal fitness landscape, it seems like crossover might be detrimental. While crossover does support exploration, which could be beneficial, I am not sure why one would choose to explore in this particular direction.
R code for the example above:
# Population along a negatively sloped valley (top left to bottom right).
N <- 50
x1 <- rnorm(N, mean = 2, sd = 0.5)
x2 <- -x1 + 4 + rnorm(N, mean = 0, sd = 0.1)
plot(x1, x2, pch = 21, col = 'red', bg = 'red', ylim = c(0, 4), xlim = c(0, 4))

# Uniform crossover with probability 1: each trial keeps its own x1
# and takes x2 from a randomly chosen member of the population.
x1_cx <- numeric(N)
x2_cx <- numeric(N)
for (i in 1:N) {
  x1_cx[i] <- x1[i]
  x2_cx[i] <- x2[sample(1:N, 1)]
}
points(x1_cx, x2_cx, pch = 4, col = 'blue', lwd = 4)
Follow-up Question: If crossover is beneficial in certain situations, is there a sensible approach to a) determining if your specific problem would benefit from crossover, and b) how to tune the crossover parameters to optimize the algorithm?
A related stackoverflow question (I am looking for something more specific, with a toy example for instance): what is the importance of crossing over in Differential Evolution Algorithm?
A similar question, but not specific to differential evolution: Efficiency of crossover in genetic algorithms
I am not particularly familiar with the specifics of the DE algorithm, but in general the point of crossover is that if you have two very different individuals with high fitness, it will produce an offspring that is intermediate between them without being particularly similar to either. Mutation only explores the local neighbourhood of each individual without taking the rest of the population into account.
If you think of genomes as points in some high-dimensional vector space, then a mutation is a shift in a random direction. Mutation therefore needs to take small steps: if you are starting from a significantly better-than-random position, a long step in a random direction is almost certain to make things worse, because it essentially just introduces entropy into an evolved genome. You can think of a crossover as a step from one parent towards the other. Since the other parent is also better than random, it is more promising to take a longer step in that direction. This allows for faster exploration of the promising parts of the fitness landscape.
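To make the mechanics concrete, here is a sketch in R of how the classic DE/rand/1/bin scheme builds a trial vector: mutation creates a donor from three other population members, then binomial (uniform) crossover mixes donor and target per coordinate. The parameter names follow the usual DE convention (F, CR); the code itself is only illustrative:

# One trial-vector generation step of DE/rand/1/bin.
# pop: matrix, one candidate per row; i: index of the target vector;
# Fw: DE's differential weight F; CR: crossover probability.
de_trial <- function(pop, i, Fw = 0.8, CR = 0.9) {
  D <- ncol(pop)
  r <- sample(setdiff(seq_len(nrow(pop)), i), 3)           # three distinct others
  donor <- pop[r[1], ] + Fw * (pop[r[2], ] - pop[r[3], ])  # mutation
  jrand <- sample.int(D, 1)          # guarantees at least one donor coordinate
  cross <- runif(D) < CR | seq_len(D) == jrand
  ifelse(cross, donor, pop[i, ])     # binomial (uniform) crossover
}

With CR near 1 almost every coordinate comes from the donor, so the trial inherits the population's difference vectors (which align with the valley in your figure); with CR near 0 the trial differs from the target in only one coordinate, which is closer to the axis-aligned mixing you plotted.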
In real biological organisms the genome is often organized in such a way that genes that depend on each other are close together on the same chromosome. This means that crossover is unlikely to break up synergistic gene combinations. Real evolution actually moves genes around to achieve this (though this is much slower than the evolution of individual genes), and sometimes the higher-order structure of the genome (the 3-dimensional shape of the DNA) evolves to prevent crossovers in particularly sensitive areas. These mechanisms are rarely modeled in evolutionary algorithms, but you will get more out of crossover if you order your genome in a way that puts genes that are likely to interact close to each other.
No. Crossover is not useful. There I said it. :P
I've never found a need for crossover. People seem to think it does some kind of magic. But it doesn't (and can't) do anything more useful than simple mutation. Large mutations can be used to explore the entire problem space and small mutations can be used to exploit niches.
And all the explanations I've read are (to put it mildly) unsatisfactory. Crossover only complicates your algorithms. Drop it asap. Your life will be simpler. .... IMHO.
As Daniel says, crossover is a way to take larger steps across the problem landscape, allowing you to escape local maxima that a single mutation would be unable to escape.
Whether it is appropriate or not will depend on the complexity of the problem space, how the genotype -> phenotype expression works (will related genes be close together?), etc.
More formally, this is the concept of 'connectivity' in local search algorithms: providing strong enough operators that the local search neighbourhood is sufficiently large to escape local minima.

Examples of mathematics algorithms that apply to game development

I am designing an RPG like Final Fantasy.
I have the programming part done, but what I lack is the maths. I am OK at maths, but I am having trouble incorporating the players' stats into my formulas.
How can I make an action timer that is based on the players speed?
How can I use attack and defence so that it is not always exactly the same damage?
How can I add randomness into the equations?
Can anyone point me to some resources that I can read to learn this sort of stuff?
EDIT: Clarification of what I am looking for
For the damage I have (player attack x move strength) / enemy defence.
This works and scales well, but I got a look at the algorithms from Final Fantasy IV a while ago, and the damage formula alone was over 15 steps; mine has only 2.
I am looking for real game examples if possible, but would settle for papers or books with sections explaining how they arrive at these complex formulas and why they don't use simple ones.
I eventually intend to implement something, but I am looking for more academic knowledge at the moment.
Not knowing Final Fantasy at all, here are some thoughts.
Attack/Defence could either be a 'chance to hit/block' or 'damage done/mitigated' (or, possibly, a blend of both). If you decide to go for 'damage done/mitigated', you'll probably want to do one of:
Generate a random number in a suitable range, added/subtracted from the base attack/defence value.
Generate a number in the range 0-1, multiplied by the attack/defence.
Generate a number (with a Gaussian or Poisson distribution and a suitable standard deviation) in the range 0-2 (or so, to account for the occasional crit), multiplied by the attack/defence.
For attack timers, decide what "double speed" and "triple speed" should do to the number of attacks in a given time. That should give you a decent lead on how to implement it. I can, off-hand, think of three methods (sketched in code after this list).
Use N/speed as a base for the timer (that means double/triple speed gives 2/3 times the number of attacks in a given interval).
Use Basetime - Speed as the timer (requires a cap on speed, may not be an issue, most probably has an unintuitive relation between speed stat and timer, not much difference at low levels, a lot of difference at high levels).
Use Basetime - Sqrt(Speed) as the timer.
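A quick sketch of one damage option and one timer option in R (all constants are made up; this is an illustration, not a recommendation):

# Damage option 3: Gaussian multiplier clamped to [0, 2], so most hits
# land near the base damage and the rare high roll acts like a crit.
damage <- function(attack, defence) {
  mult <- max(0, min(2, rnorm(1, mean = 1, sd = 0.3)))
  max(1, round(attack * mult - defence))   # never less than 1 damage
}

# Timer option 1: N / speed, so double speed attacks twice as often.
attack_delay <- function(speed, N = 1000) N / speed

damage(attack = 12, defence = 4)   # e.g. around 8, varies per call
attack_delay(speed = 20)           # 50 time units between attacks

Plotting a few thousand draws of damage() is a cheap way to check whether the spread feels right before wiring it into the game.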
I doubt you'll find academic work on this. Determining formulae for damage, say, is heuristic. People just make stuff up based on their experience with various functions and then tweak the result based on gameplay.
It's important to have a good feel for what the function looks like when plotted on a graph. The best advice I can give for this is to study a course on sketching graphs of functions. A Google search on "sketching functions" will get you started.
Take a look at printed role-playing games like Dungeons & Dragons and how they handle these issues. They are the inspiration for computer RPGs. I don't know of any academic work.
Some thoughts: you don't have to have an actual "formula". It can be rules like "roll a 20 sided die, weapon does 2 points of damage if the roll is <12 and 3 points of damage if the roll is >=12".
You might want to simplify continuous variables down to small ranges of integers for testing. That way you can calculate tables with all the possible permutations and see whether the results look reasonable (see the sketch below). Once you have something good, you can interpolate the formulas for continuous inputs.
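For example, in R (a hedged sketch with a made-up two-step formula, just to show the table idea):

# Enumerate every attack/defence combination over small integer ranges
# and inspect the full damage table for balance problems.
dmg <- function(a, d) pmax(1, a * 2 - d)          # hypothetical formula
grid <- expand.grid(attack = 1:6, defence = 0:4)
grid$damage <- dmg(grid$attack, grid$defence)
xtabs(damage ~ attack + defence, data = grid)     # one cell per combination

If a whole row or column of the table looks degenerate (all 1s, or runaway values), the formula needs rebalancing before you generalise it to continuous stats.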
Another key issue is play balance. There aren't necessarily formulas to tell you whether your game mechanics are balanced; you have to test.
