In a previous post I made a brief (and, to some, probably controversial)
statement: the type of input distribution used is often not that important,
because the structure of the model has a much more significant effect on the
shape of the output distribution. I therefore suggested that, for input
variables, you probably should not worry too much about any moments beyond the
mean and the standard deviation.
This is a fairly bold statement, and it probably requires more support. So why
do I believe that you often don't need to worry too much about the distribution
type? This conviction comes from my PhD, during which I had to model a number
of manufacturing operations. The aim was to find heuristics that could be used
to allocate a distribution to a variable once the associated method of
manufacture was known. I frequently found that it was the phenomena underlying
the manufacturing method (and thus the model of that operation) that had the
greatest effect upon the calculated distribution. This was because the central
limit theorem would often come into play: additive operations are very common,
and a chain of multiplicative operations on positive quantities behaves like a
sum once you take logarithms, so the output tends toward a lognormal shape. In
fact, it is almost impossible to develop a model of any real system that does
not include a number of additive or multiplicative operations.
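To make the additive case concrete, here is a minimal sketch, not taken from
my PhD models. It assumes a purely additive toy model whose inputs are all
exponential (a strongly skewed distribution) and measures the skewness of the
output as more inputs are added. The function names, trial count and seed are
illustrative choices of mine.

```python
import random
import statistics


def sample_skewness(values):
    """Return the skewness (third standardised moment) of a list of numbers."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def skewness_of_sum(n_inputs, n_trials=20000, seed=1):
    """Skewness of an output formed by adding n_inputs exponential inputs."""
    rng = random.Random(seed)
    outputs = [
        sum(rng.expovariate(1.0) for _ in range(n_inputs))
        for _ in range(n_trials)
    ]
    return sample_skewness(outputs)


if __name__ == "__main__":
    # Toy additive model with 1, 5 and 50 skewed inputs.
    for n in (1, 5, 50):
        print(n, round(skewness_of_sum(n), 2))
```

For this particular case the theoretical skewness of the output is 2 divided by
the square root of the number of inputs, so the printed values should fall
steadily toward zero (the skewness of a normal distribution) as inputs are
added.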
This is certainly not always the case. I found that where there was only a
small number of random variables, there was less opportunity for the central
limit theorem to express itself, and the type of distribution became very
important. However, the majority of cases considered in the real world involve
a large number of random variables. The central limit theorem therefore applies
more often than not, and the distribution type is less important in the
majority of real-life cases. Still, watch out for the exceptions, especially
when there are only a few random variables.
If you wish to verify this contention of mine, consider the models that you
often deal with. You will most likely find a number of additive and
multiplicative operations within any of them. You can also try changing the
distribution types of the inputs to check that they do not matter that much,
keeping the mean and the standard deviation the same, of course.
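One way to run that check is sketched below. It is only an illustration under
stated assumptions: the model is a plain sum of identical inputs, the mean of
10 and standard deviation of 2 are made-up values, and the three input families
(normal, uniform, shifted exponential) are my own picks. Each family is
parameterised to have exactly that mean and standard deviation, and the
comparison is made on the 95th percentile of the output, expressed in standard
deviations from the output mean.

```python
import math
import random
import statistics

MEAN, SD = 10.0, 2.0  # assumed values, shared by every input family


def draw_normal(rng):
    return rng.gauss(MEAN, SD)


def draw_uniform(rng):
    # A uniform on [a, b] has sd (b - a) / sqrt(12), so half-width = sd * sqrt(3).
    half_width = SD * math.sqrt(3.0)
    return rng.uniform(MEAN - half_width, MEAN + half_width)


def draw_shifted_exponential(rng):
    # An exponential with rate 1/SD has mean SD and sd SD; shift its mean to MEAN.
    return (MEAN - SD) + rng.expovariate(1.0 / SD)


FAMILIES = {
    "normal": draw_normal,
    "uniform": draw_uniform,
    "shifted exponential": draw_shifted_exponential,
}


def standardised_p95(draw, n_inputs, n_trials=20000, seed=7):
    """95th percentile of the summed output, in sds from the output mean."""
    rng = random.Random(seed)
    outputs = sorted(
        sum(draw(rng) for _ in range(n_inputs)) for _ in range(n_trials)
    )
    p95 = outputs[int(0.95 * n_trials)]
    return (p95 - statistics.fmean(outputs)) / statistics.pstdev(outputs)


def spread_across_families(n_inputs):
    """Largest minus smallest standardised 95th percentile over the families."""
    values = [standardised_p95(draw, n_inputs) for draw in FAMILIES.values()]
    return max(values) - min(values)


if __name__ == "__main__":
    for n in (1, 30):
        print(n, round(spread_across_families(n), 2))
```

With a single input the families should disagree noticeably about the tail;
with thirty inputs the disagreement should shrink a good deal. Swapping the sum
for your own model is the real test.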