How do I work backwards to find a probability curve?
Asked by PhiNotPi (12686), March 24th, 2011
For simplicity, let’s say there is a particle that can be anywhere on a line. The probability of the particle being at a given point on the line is given by a probability curve, which is more or less bell shaped. The particle’s position is observed some number of times, and each position is recorded. Now here is the problem: given the points, how can we find the approximate probability curve that generated them?
5 Answers
Maybe this is the sort of thing you’re looking for. I’m going to confess I didn’t read it closely, just enough to gather it discusses regression analysis for bell curves.
Based on other posts I’ve seen, you are mathematically literate. I’m guessing you’re familiar with at least the concept of basic regression techniques (linear, quadratic, logistic, etc.). Without having personally fit data to a bell-shaped function, I’m assuming these sorts of techniques are appropriate. Maybe sine regression?
Sorry if you’re looking for a more rigorous answer.
The mean and standard deviation are supposed to give a complete description of the distribution. Is that not enough?
What @6rant6 said could work.
There are other ways to do it too. I forget what it’s called, but it works like this:
Call the points x1, ..., xn.
If the mean is m and the standard deviation is s, you have a pdf f_m_s.
Let P(f_m_s) = f_m_s(x1) * ... * f_m_s(xn).
Find the m and s that maximize P(f_m_s).
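The steps above can be sketched in Python. This procedure is commonly known as maximum likelihood estimation. What follows is only a rough illustration, assuming a normal pdf for f_m_s: the sample points, the grid ranges, and the function names are all made up for the example, and the brute-force grid search stands in for a proper optimizer. It sums logs of the pdf values instead of multiplying them, which picks the same m and s but avoids numerical underflow.

```python
import math

def log_pdf(x, m, s):
    """Log of the normal density with mean m and standard deviation s."""
    return -0.5 * math.log(2 * math.pi) - math.log(s) - (x - m) ** 2 / (2 * s * s)

def log_likelihood(points, m, s):
    """log(f_m_s(x1) * ... * f_m_s(xn)), computed as a sum of logs."""
    return sum(log_pdf(x, m, s) for x in points)

def grid_search_mle(points, m_values, s_values):
    """Try every (m, s) pair on the grid and keep the most likely one."""
    return max(
        ((m, s) for m in m_values for s in s_values),
        key=lambda pair: log_likelihood(points, pair[0], pair[1]),
    )

# Made-up observations, purely for illustration.
points = [2.1, 3.4, 3.9, 4.0, 4.6, 5.2, 6.3]
m_grid = [i / 20 for i in range(0, 201)]   # candidate means 0.00 to 10.00
s_grid = [i / 20 for i in range(2, 101)]   # candidate std devs 0.10 to 5.00

m_hat, s_hat = grid_search_mle(points, m_grid, s_grid)
print(m_hat, s_hat)
```

For a normal distribution the maximizing m and s turn out to be the sample mean and the square root of the average squared deviation, so the grid search should land on the grid point nearest those values.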
Make a histogram of the particle’s positions by counting how many points fall into each of a set of intervals. This estimates the probability density function once you scale the counts by the total number of points, so that the area under the curve is exactly 1 (the probability of the point being somewhere). For a normal distribution you can also estimate the standard deviation from the width of the bell.
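A minimal sketch of the histogram approach in Python, assuming equal-width bins spanning the observed range. The sample points, the bin count, and the function name are made up for illustration, and the choice of bin width is a judgment call that this sketch does not address.

```python
def histogram_density(points, num_bins):
    """Return (bin edges, density per bin) so that total area equals 1."""
    lo, hi = min(points), max(points)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in points:
        # Clamp so that the largest point falls in the last bin.
        index = min(int((x - lo) / width), num_bins - 1)
        counts[index] += 1
    n = len(points)
    # Dividing by n * width turns raw counts into a density.
    densities = [c / (n * width) for c in counts]
    edges = [lo + k * width for k in range(num_bins + 1)]
    return edges, densities

# Made-up observations, purely for illustration.
points = [2.1, 3.4, 3.9, 4.0, 4.6, 5.2, 6.3, 4.1, 3.7, 4.4]
edges, densities = histogram_density(points, num_bins=4)
bin_width = edges[1] - edges[0]
area = sum(d * bin_width for d in densities)
print(edges, densities, area)
```

The height of each bar is the fraction of points in that bin divided by the bin width, which is what makes the bar areas sum to 1.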
If the shape is more or less bell shaped, it sounds like you’re dealing with a gaussian distribution. To find the parameters of your gaussian distribution N(mu, sigma^2), take the mean of all your points; that’s mu. Now subtract mu from each point, square each difference, sum the squares, and finally divide by the number of points; that’s your sigma^2.
For example, if you have points 3, 4, 5, then mu = 4 and sigma^2 = ((-1)^2 + 0^2 + 1^2)/3 = 2/3. Your best estimate is then that the points were generated by the normal distribution N(4, 2/3), or in other words drawn randomly from a PDF described by the function f(x) = 1/sqrt(2*pi*sigma^2) * e^(-(x-mu)^2/(2*sigma^2)).
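The worked example above can be checked with a few lines of Python. This is just a sketch; the function names are made up for illustration. It divides by the number of points, as described above. Dividing by one less than the number of points instead gives the usual unbiased variance estimate, which matters mostly for small samples.

```python
import math

def fit_gaussian(points):
    """Estimate mu and sigma^2 as the mean and the average squared deviation."""
    n = len(points)
    mu = sum(points) / n
    sigma_sq = sum((x - mu) ** 2 for x in points) / n
    return mu, sigma_sq

def gaussian_pdf(x, mu, sigma_sq):
    """f(x) = 1/sqrt(2*pi*sigma^2) * e^(-(x-mu)^2/(2*sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma_sq)) / math.sqrt(2 * math.pi * sigma_sq)

mu, sigma_sq = fit_gaussian([3, 4, 5])
print(mu, sigma_sq)  # 4.0 0.6666666666666666
print(gaussian_pdf(4, mu, sigma_sq))  # height of the fitted curve at its peak
```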
If you’re not sure that you are dealing with a gaussian distribution, you may have to hypothesise what the distribution may be, or choose a general distribution/model such as a Gaussian mixture model (GMM) and use expectation-maximization (EM) to fit it to your data.
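As a rough illustration of the GMM/EM option, here is a bare-bones EM loop for a one-dimensional mixture of gaussians in Python. Everything specific in it is an assumption made for the example: the two-cluster sample data, the quantile-based starting means, the fixed iteration count, and the function names. It has none of the safeguards a real implementation would need (log-space arithmetic, convergence checks, multiple restarts).

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_em(points, k=2, iterations=100):
    """Fit a 1-D mixture of k gaussians; returns (weights, means, variances)."""
    n = len(points)
    ordered = sorted(points)
    # Crude deterministic start: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(points) / n
    overall_var = sum((x - overall_mean) ** 2 for x in points) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: how responsible is each component for each point?
        resp = []
        for x in points:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

# Made-up observations with two obvious clusters, purely for illustration.
points = [0.8, 1.0, 1.1, 1.3, 0.9, 5.0, 5.2, 4.9, 5.3, 5.1, 4.7]
weights, means, variances = fit_gmm_em(points, k=2)
print(weights, means, variances)
```

With k=1 this reduces to the single-gaussian fit, so the mixture is only worth the extra trouble when the histogram shows more than one bump.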
Here are some links that may be helpful:
http://en.wikipedia.org/wiki/Normal_distribution
http://en.wikipedia.org/wiki/Variance
http://en.wikipedia.org/wiki/Estimation
http://en.wikipedia.org/wiki/Mixture_model
http://en.wikipedia.org/wiki/Expectation_maximization
Feel free to give more details of your problem and maybe I could give a more precise answer.