Dr. Jonathan Kenigson, FRSA
Athanasian Hall, Cambridge Limited
0. Division by zero is undefined, or so one is taught in school. Fractions behave badly while products do not. Scholars of Bayesian Statistics frequently assume that data follow an underlying discrete or continuous probability distribution. Examples of discrete distributions abound: Binomial, Negative Binomial, Hypergeometric, Bernoulli, and many more. These distributions take on either finitely many or countably infinitely many distinct values. Continuous distributions, on the other hand, take on uncountably many values. They are “continuous” in the sense that the random variable they describe is not restricted to a countable set of values. Examples of such distributions are also plentiful: the Uniform, Normal, Bates, and Beta Distributions are among the well-known ones. Methodological questions arise, however, when one divides distributions into “discrete” and “continuous” types. The current article advances two theses. The first is that ratios of random variables are badly behaved. The second is that applied statistics should be as indifferent to this consideration as practical utility will allow.
1. Modern Data Science is based on statistical inference and predictive analytics. The Central Limit Theorem and its generalizations broadly state that suitably standardized sample means of independent and identically distributed random variables with finite variance approach a Normal limit. But are ratios of Normal variables neatly tractable? The assumption that data follow a regular causal or inferential rule is a philosophical one. Even under the generalization provided by the Kolmogorov paradigm, in which data are not required to be independent and identically distributed, the decision to classify data as belonging to a class of widely intelligible distributions is a matter of practical considerations rather than mathematical necessity. Of course, certain chance processes are very well modeled by classical distributions. The Poisson Approximation states that as the number of trials increases without bound and the probability of success decreases to zero, with their product held fixed, the Binomial distribution converges to a Poisson distribution. In this sense, the Poisson can be said to “approximate” the Binomial under a wide class of physically relevant conditions. The Normal Approximation to the Binomial changes these initial conditions subtly but importantly. The approximation of the (discrete) Binomial distribution by the (continuous) Normal one applies when the probability of success in each trial is held fixed and the number of trials grows without bound.
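The Poisson limit described above can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the function names, the choice of a fixed product of 3 between the number of trials and the success probability, and the use of total variation distance as the yardstick are all assumptions made for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_pmf(k, lam):
    """Poisson(lam) probability of the value k, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n, lam):
    """Total variation distance between Binomial(n, lam / n) and Poisson(lam)."""
    p = lam / n  # success probability shrinks as n grows; n * p stays fixed
    on_support = sum(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
                     for k in range(n + 1))
    # The Poisson also puts a little mass above n, where the Binomial has none.
    poisson_tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (on_support + poisson_tail)


if __name__ == "__main__":
    for trials in (10, 100, 1000):
        print(trials, total_variation(trials, lam=3.0))
```

The printed distances shrink as the number of trials grows, which is the sense in which the Poisson stands in for the Binomial when successes are rare.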
2. Convergence rates for approximations to limiting distributions are codified in terms of canonical error estimates. The Berry-Esseen Theorem is the best known of these. This theorem says that the maximum error in approximating the Cumulative Distribution Function (CDF) of the standardized mean of n independent and identically distributed random variables by the standard Normal CDF is bounded above by a constant multiple of the third absolute central moment of each variable (a quantity related to, but distinct from, the skewness), divided by the product of the cube of the standard deviation and the square root of n. This is an upper bound in the sense of uniform convergence of the distribution functions. Estimates of the constant of proportionality constitute an active area of research in the theory of “Mathematical Statistics.” In practice, the rate of the bound cannot in general be improved beyond O(n^-0.5), so increasing n is the most practical means of ensuring rapid convergence. Counterintuitively, improvements in the standard deviation of the random variables are of little value in improving the convergence bounds, since the bound depends on the standard deviation only through the scale-invariant ratio of the third absolute moment to the cube of the standard deviation. The constant of proportionality is a universal one whose optimal value mathematicians are still seeking to determine.
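A small numerical check may make the shape of the Berry-Esseen bound concrete. The sketch below compares the exact maximum CDF error for sums of Bernoulli trials with the bound itself. It rests on stated assumptions: the Bernoulli example and the function names are illustrative, and the constant 0.56 is a deliberately conservative value assumed here, not the sharpest estimate available.

```python
import math

C_ASSUMED = 0.56  # assumed conservative value of the universal constant


def normal_cdf(x):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binomial_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def berry_esseen_bound(n, p, c=C_ASSUMED):
    """c * rho / (sigma^3 * sqrt(n)) for n Bernoulli(p) trials."""
    sigma = math.sqrt(p * (1.0 - p))
    rho = p * (1.0 - p) * ((1.0 - p) ** 2 + p ** 2)  # E|X - p|^3
    return c * rho / (sigma ** 3 * math.sqrt(n))


def kolmogorov_distance(n, p):
    """Largest gap between the standardized Binomial CDF and the Normal CDF."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    cdf_before = 0.0  # Binomial CDF just to the left of the current jump
    worst = 0.0
    for k in range(n + 1):
        target = normal_cdf((k - mean) / sd)
        cdf_after = cdf_before + binomial_pmf(k, n, p)
        # The supremum is attained at a jump, on one side or the other.
        worst = max(worst, abs(cdf_before - target), abs(cdf_after - target))
        cdf_before = cdf_after
    return worst


if __name__ == "__main__":
    for trials in (10, 100, 1000):
        print(trials, kolmogorov_distance(trials, 0.2), berry_esseen_bound(trials, 0.2))
```

Quadrupling n halves the bound, which is the O(n^-0.5) behavior in concrete form. Rescaling the variables changes nothing, because the third absolute moment and the cube of the standard deviation scale together.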
3. Ambiguity in error estimates is not the only matter relevant to current topics in “Bayesian Statistics” and Decision Analysis. If one is to adopt a strictly distributional approach to statistical reasoning, one must also reckon with “pathological” distributions whose means and standard deviations are undefined even in theory. The most famous “pathological” distribution of this sort is the Cauchy type, in which both mean and standard deviation are undefined. The reason for this aberrant behavior is that the integrals defining the moments of such distributions fail to converge. One would be profoundly mistaken in believing that Cauchy Distributions are atypical: they arise, in fact, as the ratio of two independent standard Normal random variables. The sample mean of independent Cauchy variables is itself Cauchy with the same parameters, so the Central Limit Theorem cannot be applied. Generalizations that still require finite variance are likewise inapplicable. Measure Theory plays a key role in applied paradigms of valuation and probability, and in this case, it deals a harsh blow to a strictly distributional theory of decision-making. Parametric estimation is still possible via Maximum Likelihood Estimation (MLE) even though the classical moments are undefined. Beta-Negative Binomial Distributions likewise have an undefined or infinite mean and variance for many parameter values. These arise from a simple sequence of Bernoulli Trials in which the probability of success is itself drawn from a classical Beta Distribution. Because their densities are normalized by simple ratios of Gamma (generalized factorial) functions, the Beta Distributions are among the best behaved. Even under these ideal circumstances, Beta-Negative Binomial Distributions exhibit complex and nonlinear behavior in mean, standard deviation, skewness, and Moment Generating Function. Zeta Distributions have an undefined Probability Mass Function at s=1, where the Riemann Zeta Function has a pole; this is the simplest and most natural parameter value one might wish to consider. This ill-posedness arises because the series defining the Riemann Zeta Function diverges as a classical sum whenever its argument is 1 or less; at s=1 its partial sums are the Harmonic Partial Sums. Values can be assigned for arguments other than 1 by analytic continuation using Complex Analysis, but these values no longer represent convergent sums of probabilities and are of little practical commercial use at present.
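The claim about ratios of standard Normal variables made earlier in this section can be illustrated by simulation. The sketch below is offered under stated assumptions: the seed, sample sizes, batch length of 50, and function names are arbitrary choices for illustration. It uses quartiles rather than means and standard deviations, because quartiles exist for the Cauchy Distribution (for the standard Cauchy they are -1, 0, and 1) while the classical moments do not.

```python
import random
import statistics


def normal_ratio_draws(count, rng):
    """Draw `count` ratios of two independent standard Normal variables."""
    draws = []
    while len(draws) < count:
        numerator = rng.gauss(0.0, 1.0)
        denominator = rng.gauss(0.0, 1.0)
        if denominator != 0.0:  # guard against an exact zero, however unlikely
            draws.append(numerator / denominator)
    return draws


def quartiles(values):
    """Return the three sample quartiles (Q1, median, Q3)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q1, q2, q3


rng = random.Random(12345)  # fixed seed so that reruns give the same draws
draws = normal_ratio_draws(200_000, rng)

# Average the draws in consecutive batches of 50. For finite-variance data the
# spread of such averages would shrink by a factor of about sqrt(50).
batch_size = 50
batch_means = [statistics.fmean(draws[i:i + batch_size])
               for i in range(0, len(draws), batch_size)]

if __name__ == "__main__":
    print("quartiles of single ratios:", quartiles(draws))
    print("quartiles of batch means:  ", quartiles(batch_means))
```

Both sets of quartiles land near -1, 0, and 1. Averaging fifty ratios leaves the spread essentially unchanged, which is why no amount of additional data rescues the sample mean in this setting.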
4. In practice, however, one may be brought to recognize that statistics, which is itself a mathematics of empirical things, is in its applied forms a practical calculus of heuristics rather than a reified theoretical framework. That which is useful for calculations may not be grounded in rigorous theory. The underdetermination of parameters for ratio distributions should prompt the recognition that applied decision sciences rest upon Hume’s Constant Conjunction. It is not propositionally reasonable to assume that the future will bear a causal semblance to the present or past. It is, however, experientially reasonable to assume as much. Rigor develops after utility. The statistics of commerce is ultimately viable in accordance with its utility. Proprietary data science pushes Neural Networks beyond what a strictly propositional theory of their properties would permit. The same science treats Artificial Intelligence and Machine Learning similarly. Modern statistics is, in its applied forms, far more art than rigorous mathematics. Doubtless proof is important, but to whom? To use that which is useful in the way it is useful is nothing of which one need be ashamed, provided that such willingness is tempered by the consideration that refinements are inevitable as theory and practice refine each other in tandem.
For more information regarding the Unreasonable Ratios, International Edition and to investigate resources for further study, please visit https://athhallcam.uk/ C/O Dr. Jonathan Kenigson, FRSA.
Disclaimer: The views, suggestions, and opinions expressed here are the sole responsibility of the experts. No Daily Scotland News journalist was involved in the writing and production of this article.