Predicting SPC (Statistical Process Control) - r

I will give a brief explanation of my scenario. The company mass-produces components like valves, nuts and bolts, which need to be measured for dimensions (length, radius, thickness, etc.) for quality purposes. As it is not feasible to inspect every piece, they are sampled by batch: for example, from every batch of 100 pieces, 5 are randomly selected and the mean of their dimensions is measured and noted for drawing SPC control charts (mean dimension on the y axis, batch number on the x axis).
Even though there are a number of factors (operator efficiency, machine/tool condition, etc.) which affect the quality of the product, they don't seem to be measurable.
My objective is to develop a machine learning model to predict the (mean) product dimensions of the coming batch samples. This would help the operator forecast whether there is going to be any significant dimensional variation, so that he can pause work, figure out potential reasons, and thus prevent wastage of product/material.
I have some idea about R programming and machine learning techniques like decision trees and regression, but I couldn't land on a proper model for this, mainly because I couldn't think of the independent variables for this situation. I don't have much of an idea about time series modelling, though.
Will someone throw some insights/ideas/suggestions my way about how to tackle this?
I am sorry that I had to write such a long story, but I just wanted to make things as clear as possible.
Thanks in advance.
Sreenath
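
One way to frame the prediction part, given only the batch history, is to treat the sequence of batch means as a univariate time series and forecast the next value; the lagged means themselves then play the role of the "independent variables". Below is a minimal sketch in R, assuming a numeric vector batch_means holding the recorded per-batch averages (the vector name, the toy data, and the choice of auto.arima are illustrative assumptions, not a recommendation of a specific model):

# Forecast the next batch mean from the history of batch means.
# `batch_means` stands in for your recorded per-batch averages (toy data here).
library(forecast)

batch_means <- c(10.02, 10.05, 9.98, 10.01, 10.07, 10.04, 10.09, 10.11)

fit <- auto.arima(batch_means)      # let auto.arima pick a simple time-series model
next_batch <- forecast(fit, h = 1)  # one-step-ahead forecast with a prediction interval
print(next_batch)

# If the prediction interval approaches or crosses a control limit,
# the operator can be warned before the next batch is produced.

If the process only drifts occasionally, the series may be close to white noise and the forecast will simply sit at the centre line; this approach earns its keep when tool wear or similar effects introduce autocorrelation between batches.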

Your requirement could be tackled at three levels, step by step:
1. Fundamental
Automatically apply SPC rules with machine learning, e.g. identify SPC chart patterns with the Nelson rules, and extend this to new patterns of variation in your specific process.
Nelson rules
ML system for SPC (reference)
2. Supplemental
Predict Cp and the SPC trend from a multivariate data collection with machine learning. For example, smoke particles impact wafer yield rate; this may be spotted earlier if the data-analysis model links SPC with the worker shift arrangement.
Improve SPC by PPC
3. Intelligent agent
Automate process events through integration between SPC and a reaction plan. The agent is modelled by linking SPC and FMEA and is built with a CEP engine in a BAM architecture.
Process integration
System integration
Intelligent agent
CEP
BAM
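
For the first level above (automated rule checking), a quick starting point before building anything custom is R's qcc package, which plots x-bar charts and already flags points beyond the control limits as well as violating runs (the full set of Nelson rules would need extra logic on top). A rough sketch, assuming a matrix samples with one row per batch and one column per measured piece (the object name and the simulated data are only for illustration):

# X-bar chart with automatic out-of-control signalling via the qcc package.
library(qcc)

set.seed(1)
samples <- matrix(rnorm(20 * 5, mean = 10, sd = 0.05), nrow = 20)  # 20 batches x 5 pieces

chart <- qcc(samples, type = "xbar")  # plots the chart and computes control limits
chart$violations                      # points beyond limits and violating runs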

Related

Is there an R package with which I can model the effects of competition on ideal free distribution?

I am a university student working on a research project. Because of our local lockdown I cannot go into the field to collect observation data, so I am looking for an R package that will allow me to model the effects of competition when testing for ideal free distribution (IFD).
To give you a better idea of what I am looking for, I have described the project in more detail below.
In my original dataset (which I received, i.e., I did not collect the data myself) I have two patches (A, B) which received random treatments of food input (1:1, 2:1, 5:1). Under the ideal free distribution hypothesis, individuals should distribute themselves between the patches in accordance with the treatment ratios. This is not the case.
Under normal circumstances I would go into the field and observe behaviour of individuals in the patches to see if dominance affects distribution. Since we are in a lockdown I am unable to do so. I am hoping that there is a package out there that would allow me to model this scenario and help me investigate how competition affects IFD.
I have already found two packages called coexist and EcoVirtual but they model coexistence and extinction dynamics, whereas I want to investigate how competition might alter distribution between profitable patches when there is variation in the level of competition.
I am fairly new to R and creating my own package is beyond my skillset at this point, so I would appreciate the help.
I hope this makes sense and thanks in advance.
Wow, that's an odd place to find another researcher of IFD. I do not believe there are R packages specifically about IFD. It's too specific, and most models are relatively simple to estimate using common tests. For example, the input-matching rule you mentioned can be tested using a simple run-of-the-mill t-test, already included in base R.
What you have is not a coding problem per se, or even a statistical one. It is a biological problem. What ratio would you expect when animals are ideal (full knowledge of the environment) and free (no movement costs), but in the presence of competition? Is this ratio equal to the ratio in your dataset? Sutherland (1983) suggests animals would undermatch.
I would love to discuss this in depth, given that my PhD was on IFD, but I fear you hit the wrong forum.
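
To make the t-test suggestion concrete: under strict input matching, the expected proportion of individuals in patch A equals the proportion of food delivered there, so you can test whether the observed per-trial proportions deviate from that expectation. A rough base-R sketch, assuming a vector prop_in_A of observed proportions of individuals in patch A for the 2:1 treatment (the object name and the numbers are made up for illustration):

# Test observed use of patch A against the input-matching prediction (2:1 -> 2/3).
prop_in_A <- c(0.58, 0.61, 0.55, 0.63, 0.59, 0.57)  # hypothetical per-trial proportions

t.test(prop_in_A, mu = 2/3)  # H0: mean proportion equals the input-matching ratio

# A mean significantly below 2/3 is consistent with undermatching (Sutherland 1983),
# which competition or dominance could produce.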

FEM software for increment calculations

Can anyone suggest any software that will calculate incremental stress effects on a body?
A particular application would be calculating incremental stress on gear teeth through a simulation run.
Since we would have a cyclic run, if we had two gears, their teeth would be in contact once every revolution, and I am interested in knowing whether there is software that will keep track of the "damage" done on first contact, which would slightly change the geometry of the gear and, most importantly, change the way the gear responds to the same stress at future contacts.
You'll need a non-linear transient FEA capability.
I'm assuming that the gear rotational velocity is small enough that you aren't interested in inertial effects. You want a non-linear analysis that tracks the loading over one or more rotational cycles.
You need to model contact and friction at the contact points. That's a challenging non-linear problem.
You'll need a mesh that's refined enough in the contact zone to resolve the surface stress you're interested in.
Small strain is sufficient as a first step. Large strains would imply that your geometry is in some trouble.
Damage implies a non-linear material model of some kind. What were you assuming? Small strain plasticity with isotropic or kinematic hardening? Or a more advanced model like Walker or Chaboche?
Do temperature effects matter to you? Must you do a heat transfer analysis as well?
Do you have a model for metallurgical effects (e.g. austenite/martensite phase changes for carbon steel)? Do you have any heat treatment or grain size data that impact your material model?
I'd recommend starting simple and modeling contact between two teeth, one stationary and another in motion.
I haven't done finite element analysis for a living in many years, but when I was a practitioner this kind of problem would be solved with something like MARC or ABAQUS. I believe ANSYS is very popular now. There are also open source finite element solvers, but I'm less familiar with those.
I'm sure you've done a Google search for something like "finite element analysis gear tooth". You're far from the first to be interested in a problem like this.
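
Whichever solver you pick, the overall workflow is an outer loop over contact cycles: solve the contact problem, update a damage/geometry state from the resulting stress, and feed that state back into the next solve. A purely conceptual sketch of that loop follows (written in R only for illustration; solve_contact_stress is a stand-in for a call into whatever FEA package you choose, and the damage law and numbers are made up):

# Conceptual cycle loop: accumulate damage from repeated tooth contact and
# let it soften the response in later cycles. Not a real FEA code.
solve_contact_stress <- function(stiffness_scale) {
  # placeholder for a call into an external solver (MARC / ABAQUS / ANSYS / ...)
  500 * stiffness_scale  # made-up peak contact stress in MPa
}

n_cycles <- 100
damage <- 0
stiffness_scale <- 1.0
stress_history <- numeric(n_cycles)

for (cycle in seq_len(n_cycles)) {
  stress <- solve_contact_stress(stiffness_scale)
  stress_history[cycle] <- stress
  damage <- damage + (stress / 900)^3 * 1e-4   # illustrative per-cycle damage increment
  stiffness_scale <- 1.0 - 0.2 * damage        # degraded response at the next contact
}

tail(stress_history)  # stress drifts as accumulated damage changes the response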

Neural network back propagation weight change effect on predictions

I am trying to understand how a neural network can predict different outputs by learning different input/output patterns. I know that weight changes are the mode of learning, but if an input brings about weight adjustments to achieve a particular output in the backpropagation algorithm, won't this knowledge (the weight updates) be knocked out when the network is presented with a different input pattern, thus making the network forget what it had previously learnt?
The key to avoiding "destroying" the network's current knowledge is to set the learning rate to a sufficiently low value.
Let's take a look at the mathematics for a perceptron: each weight is updated as w_i <- w_i + eta * (t - o) * x_i, where eta is the learning rate, t the target output and o the current output.
The learning rate eta is always specified to be < 1. This forces the backpropagation algorithm to take many small steps towards the correct setting, rather than jumping there in large leaps. The smaller the steps, the easier it is to "jitter" the weight values into settings that work for all of the patterns.
If, on the other hand, we used a learning rate of 1, we could start to experience the convergence trouble you mentioned: a high learning rate means that backpropagation always prefers to satisfy the currently observed input pattern.
Trying to adjust the learning rate to a "perfect" value is unfortunately more of an art than a science. There are of course implementations with adaptive learning rates; refer to this tutorial from Willamette University. Personally, I've just used a static learning rate in the range [0.03, 0.1].
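
As a toy illustration of the effect, here is a single linear unit trained with the delta rule on three overlapping patterns (chosen so that no exact solution exists), comparing a small and a large learning rate; the data, the function name and the numbers are made up purely to show the mechanics in R:

# Delta-rule (LMS) updates on one linear unit, comparing learning rates.
train <- function(eta, epochs = 200) {
  X <- rbind(c(1, 0), c(0, 1), c(1, 1))   # three input patterns
  targets <- c(0, 0, 1)                   # target outputs (inconsistent on purpose)
  w <- c(0, 0)
  for (e in seq_len(epochs)) {
    for (i in seq_len(nrow(X))) {
      y <- sum(w * X[i, ])                       # current output for pattern i
      w <- w + eta * (targets[i] - y) * X[i, ]   # w <- w + eta * (target - output) * x
    }
  }
  w
}

train(eta = 0.05)  # small steps: weights settle near the least-squares compromise (~0.33, ~0.33)
train(eta = 1.0)   # large steps: weights keep oscillating, each update undoing earlier adjustments

With the small learning rate the unit ends up compromising between all three patterns; with eta = 1 the weights never settle, because every update is large enough to wipe out what the previous patterns taught.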

Library to train GMMs from MFCC

I am trying to build a basic Emotion detector from speech using MFCCs, their deltas and delta-deltas. A number of papers talk about getting a good accuracy by training GMMs on these features.
I cannot seem to find a ready-made package to do this. I did play around with scikit-learn in Python, Voicebox and similar toolkits in Matlab, and Rmixmod, stochmod, mclust, mixtools and some other packages in R. What would be the best library for training GMMs on these features?
The challenging part is the training data, which has to contain the emotion information embedded in the feature set; the same features that encapsulate emotion must be extracted from the test signal. GMM-based testing will only be as good as your universal background model. In my experience, a GMM alone can typically only separate male from female voices and a few distinct speakers. Simply feeding the MFCCs into a GMM would not be sufficient, since a GMM does not capture time-varying information, and emotional speech carries time-varying cues such as pitch and pitch changes over time in addition to the spectral variation in the MFCC parameters. I am not saying it is impossible with the current state of technology, but it is challenging in a good way.
If you want to use Python, here is the code in the famous speech recognition toolkit Sphinx.
http://sourceforge.net/p/cmusphinx/code/HEAD/tree/trunk/sphinxtrain/python/cmusphinx/gmm.py
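
Since mclust is already on your R list: fitting one GMM per emotion class to the stacked MFCC (+ delta, + delta-delta) frames and classifying a test utterance by total log-likelihood is only a few lines. A rough sketch, where the feature matrices, the number of mixture components (8) and the two-class setup are all illustrative assumptions (in practice you would extract the frames with your own front end):

# Fit one diagonal-covariance GMM per emotion and classify by utterance log-likelihood.
library(mclust)

set.seed(42)  # toy stand-ins for frames-by-features matrices (13 MFCC + deltas + delta-deltas = 39)
feats_happy <- matrix(rnorm(500 * 39, mean =  0.3), ncol = 39)
feats_sad   <- matrix(rnorm(500 * 39, mean = -0.3), ncol = 39)
feats_test  <- matrix(rnorm(100 * 39, mean =  0.3), ncol = 39)

gmm_happy <- densityMclust(feats_happy, G = 8, modelNames = "VVI")  # 8 diagonal components
gmm_sad   <- densityMclust(feats_sad,   G = 8, modelNames = "VVI")

ll_happy <- sum(log(predict(gmm_happy, feats_test)))  # frame densities -> utterance log-likelihood
ll_sad   <- sum(log(predict(gmm_sad,   feats_test)))

if (ll_happy > ll_sad) "happy" else "sad"

As the answer above notes, frame-level GMMs ignore temporal structure, so treat this as a baseline rather than a complete emotion recogniser.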

Which particular software development tasks have you used math for? And which branch of math did you use?

I'm not looking for a general discussion on if math is important or not for programming.
Instead I'm looking for real world scenarios where you have actually used some branch of math to solve some particular problem during your career as a software developer.
In particular, I'm looking for concrete examples.
I frequently find myself using De Morgan's theorem, as well as general Boolean algebra, when trying to simplify conditionals.
I've also occasionally written out truth tables to verify changes, as in the example below (found during a recent code review)
(showAll and s.ShowToUser are both of type bool.)
// Before
(showAll ? (s.ShowToUser || s.ShowToUser == false) : s.ShowToUser)
// After!
showAll || s.ShowToUser
I also used some basic right-angle trigonometry a few years ago when working on some simple graphics - I had to rotate and centre a text string along a line that could be at any angle.
Not revolutionary...but certainly maths.
Linear algebra for 3D rendering and also for financial tools.
Regression analysis for the same financial tools, like correlations between financial instruments and indices, and such.
Statistics: I had to write several methods to get statistical values, like the F probability distribution and the Pearson product-moment coefficient, and some linear algebra, correlations, interpolations and extrapolations for implementing arbitrage pricing theory for pricing assets and stocks.
Discrete math for everything, linear algebra for 3D, analysis for physics especially for calculating mass properties.
[Linear algebra for everything]
Projective geometry for camera calibration
Identification of time series / statistical filtering for sound & image processing
(I guess) basic mechanics and hence calculus for game programming
Computing sizes of caches to optimize performance. Not as simple as it sounds when this is your critical path, and you have to go back and work out the times saved by using the cache relative to its size.
I'm in medical imaging, and I use mostly linear algebra and basic geometry for anything related to 3D display, anatomical measurements, etc...
I also use numerical analysis for handling real-world noisy data, and a good deal of statistics to prove algorithms, design support tools for clinical trials, etc...
Games with trigonometry and AI with graph theory in my case.
Graph theory to create a weighted graph to represent all possible paths between two points and then find the shortest or most efficient path.
Also statistics for plotting graphs and risk calculations. I used both normal distribution and cumulative normal distribution calculations. They're pretty commonly used functions in Excel, I would guess, but I actually had to write them myself since there is no built-in support in the .NET libraries. Sadly, the built-in math support in .NET seems pretty basic.
I've used trigonometry the most, and also a small amount of calculus, working on overlays for GIS (mapping) software, comparing objects in 3D space, and converting between coordinate systems.
A general mathematical understanding is very useful if you're using 3rd-party libraries to do calculations for you, as you often need to appreciate their limitations.
I often use math and programming together, but the goal of my work IS the math, so I use software to achieve that.
As for the math I use: mostly calculus (FFTs for analysing continuous and discrete signals), with a dash of linear algebra (CORDIC) to do trig on an MCU with no floating-point hardware.
I used analytic geometry for a simple 3D engine in OpenGL, in a hobby project back in high school.
I also used some geometry computations for dynamically printed reports, where the layout was rotated by 90°.
A year ago I used some derivatives and integrals for store analysis (product item movement in a store).
But all of these computations can be found on the internet or in a high-school book.
Statistics (mean, standard deviation) for our analysts.
Linear algebra - particularly Gauss-Jordan elimination - and
calculus - derivatives in the form of difference tables for generating polynomials from a table of (x, f(x)) values.
Linear algebra and complex analysis in electronic engineering.
Statistics in analysing data and translating it into other units (different project).
I used probability and log odds (log of the ratio of two probabilities) to classify incoming emails into multiple categories. Most of the heavy lifting was done by my colleague Fidelis Assis.
Real world scenarios: better rostering of staff, more efficient scheduling of flights, shortest paths in road networks, optimal facility/resource locations.
Branch of maths: Operations Research. Vague definition: construct a mathematical model of a (normally complex) real-world business problem, and then use mathematical tools (e.g. optimisation, statistics/probability, queuing theory, graph theory) to interrogate this model to aid in making effective decisions (e.g. minimise cost, maximise efficiency, predict outcomes, etc.).
Statistics for scientific data analyses such as:
calculation of distributions, z-standardisation
Fisher's Z
reliability (Cronbach's alpha, Cohen's kappa)
discriminant analyses
scale aggregation, item poling, etc.
In actual software development I've only really used quite trivial linear algebra, geometry and trigonometry. Certainly nothing more advanced than the first college course in each subject.
I have however written lots of programs to solve really quite hard math problems, using some very advanced math. But I wouldn't call any of that software development since I wasn't actually developing software. By that I mean that the end result wasn't the program itself, it was an answer. Basically someone would ask me what is essentially a math question and I'd write a program that answered that question. Sure I’d keep the code around for when I get asked the question again, and sometimes I’d send the code to someone so that they could answer the question themselves, but that still doesn’t count as software development in my mind. Occasionally someone would take that code and re-implement it in an application, but then they're the ones doing the software development and I'm the one doing the math.
(Hopefully this new job I've started will actually let me do both, so we'll see how that works out.)
