Topic: Statistics (Page 3)

You are looking at all articles with the topic "Statistics". We found 68 matches.

🔗 Kullback–Leibler Divergence

🔗 Mathematics 🔗 Physics 🔗 Statistics

In mathematical statistics, the Kullback–Leibler divergence (also called relative entropy and I-divergence), denoted $D_{\text{KL}}(P \parallel Q)$, is a type of statistical distance: a measure of how one probability distribution P is different from a second, reference probability distribution Q. A simple interpretation of the KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P. While it is a distance, it is not a metric, the most familiar type of distance: it is not symmetric in the two distributions (in contrast to variation of information), and it does not satisfy the triangle inequality. Instead, in terms of information geometry, it is a type of divergence, a generalization of squared distance, and for certain classes of distributions (notably an exponential family), it satisfies a generalized Pythagorean theorem (which applies to squared distances).

In the simple case, a relative entropy of 0 indicates that the two distributions in question have identical quantities of information. Relative entropy is a nonnegative function of two distributions or measures. It has diverse applications, both theoretical, such as characterizing the relative (Shannon) entropy in information systems, randomness in continuous time-series, and information gain when comparing statistical models of inference; and practical, such as applied statistics, fluid mechanics, neuroscience and bioinformatics.
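
For discrete distributions the divergence takes the form $D_{\text{KL}}(P \parallel Q) = \sum_i P(i) \log\left(P(i)/Q(i)\right)$, which is straightforward to compute directly. A minimal Python sketch (the two distributions below are made-up values, chosen only to show the asymmetry):

    import math

    def kl_divergence(p, q):
        """Discrete KL divergence D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.5, 0.4, 0.1]
    q = [0.3, 0.3, 0.4]
    print(kl_divergence(p, q))  # ~0.23
    print(kl_divergence(q, p))  # ~0.32 -- not symmetric, hence not a metric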

🔗 St. Petersburg paradox

🔗 Economics 🔗 Statistics

The St. Petersburg paradox or St. Petersburg lottery is a paradox related to probability and decision theory in economics. It is based on a particular (theoretical) lottery game that leads to a random variable with infinite expected value (i.e., infinite expected payoff) but nevertheless seems to be worth only a very small amount to the participants. The St. Petersburg paradox is a situation where a naive decision criterion which takes only the expected value into account predicts a course of action that presumably no actual person would be willing to take. Several resolutions are possible.
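
In the usual formulation, a fair coin is tossed until the first tail appears: the pot starts at 2 dollars and doubles on every head, and the player wins whatever is in the pot, so the expected payout is $\sum_{k=1}^{\infty} 2^k \cdot 2^{-k} = \infty$. A small Monte Carlo sketch makes the tension concrete (the number of plays is arbitrary):

    import random

    def st_petersburg_payout():
        """One play: the pot starts at 2 and doubles on each head; the first tail
        ends the game and pays out the pot."""
        pot = 2
        while random.random() < 0.5:  # heads with probability 1/2
            pot *= 2
        return pot

    # The theoretical mean is infinite, yet the empirical average over many plays
    # stays modest and is driven almost entirely by rare, enormous payouts.
    n = 100_000
    print(sum(st_petersburg_payout() for _ in range(n)) / n)  # typically ~15-30, highly variable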

The paradox takes its name from its resolution by Daniel Bernoulli, one-time resident of the eponymous Russian city, who published his arguments in the Commentaries of the Imperial Academy of Science of Saint Petersburg (Bernoulli 1738). However, the problem was invented by Daniel's cousin, Nicolas Bernoulli, who first stated it in a letter to Pierre Raymond de Montmort on September 9, 1713 (de Montmort 1713).

🔗 68–95–99.7 Rule

🔗 Mathematics 🔗 Statistics

In statistics, the 68–95–99.7 rule, also known as the empirical rule, is a shorthand used to remember the percentage of values that lie within an interval estimate in a normal distribution: 68%, 95%, and 99.7% of the values lie within one, two, and three standard deviations of the mean, respectively.

In mathematical notation, these facts can be expressed as follows, where Pr() is the probability function, X is an observation from a normally distributed random variable, μ (mu) is the mean of the distribution, and σ (sigma) is its standard deviation:

$$\begin{aligned}
\Pr(\mu - 1\sigma \leq X \leq \mu + 1\sigma) &\approx 68.27\% \\
\Pr(\mu - 2\sigma \leq X \leq \mu + 2\sigma) &\approx 95.45\% \\
\Pr(\mu - 3\sigma \leq X \leq \mu + 3\sigma) &\approx 99.73\%
\end{aligned}$$
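
These probabilities follow from the normal distribution via the identity $\Pr(|X - \mu| \leq k\sigma) = \operatorname{erf}(k/\sqrt{2})$, which a few lines of Python can confirm:

    import math

    # Probability that a normal variable falls within k standard deviations of its mean.
    for k in (1, 2, 3):
        print(k, math.erf(k / math.sqrt(2)))  # ~0.6827, ~0.9545, ~0.9973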

The usefulness of this heuristic depends especially on the question under consideration.

In the empirical sciences, the so-called three-sigma rule of thumb (or 3σ rule) expresses a conventional heuristic that nearly all values are taken to lie within three standard deviations of the mean, and thus it is empirically useful to treat 99.7% probability as near certainty.

In the social sciences, a result may be considered "significant" if its confidence level is of the order of a two-sigma effect (95%), while in particle physics, there is a convention of a five-sigma effect (99.99994% confidence) being required to qualify as a discovery.

A weaker three-sigma rule can be derived from Chebyshev's inequality, stating that even for non-normally distributed variables, at least 88.8% of cases should fall within properly calculated three-sigma intervals. For unimodal distributions, the probability of being within the interval is at least 95% by the Vysochanskij–Petunin inequality. There may be certain assumptions for a distribution that force this probability to be at least 98%.
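
The Chebyshev figure is the general bound evaluated at three standard deviations:

$$\Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2} \quad\Rightarrow\quad \Pr(|X - \mu| \leq 3\sigma) \geq 1 - \frac{1}{9} = \frac{8}{9} \approx 88.9\%.$$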

🔗 Winsorized Mean

🔗 Statistics

A winsorized mean is a statistical measure of central tendency, much like the mean and median, and even more similar to the truncated mean. It involves calculating the mean after winsorizing: replacing given parts of a probability distribution or sample at the high and low end with the most extreme remaining values, typically for an equal amount at both extremes; often 10 to 25 percent of each end is replaced. The winsorized mean can equivalently be expressed as a weighted average of the truncated mean and the quantiles at which it is limited, which corresponds to replacing the trimmed parts with the corresponding quantiles.
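
A minimal sketch of the calculation in plain Python, clipping 10 percent of the values at each end (the cutoff is an arbitrary choice for illustration):

    def winsorized_mean(values, proportion=0.10):
        """Mean taken after replacing the lowest and highest `proportion` of the
        sorted values with the least extreme surviving values."""
        xs = sorted(values)
        k = int(len(xs) * proportion)      # values to replace in each tail
        if k > 0:
            xs[:k] = [xs[k]] * k           # clip the low tail upward
            xs[-k:] = [xs[-k - 1]] * k     # clip the high tail downward
        return sum(xs) / len(xs)

    data = [1, 4, 5, 6, 7, 8, 9, 10, 11, 300]     # one large outlier
    print(winsorized_mean(data))  # 7.5, versus a plain mean of 36.1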

🔗 Delphi Method

🔗 Statistics 🔗 Futures studies

The Delphi method or Delphi technique (DEL-fy; also known as Estimate-Talk-Estimate or ETE) is a structured communication technique or method, originally developed as a systematic, interactive forecasting method which relies on a panel of experts. The technique can also be adapted for use in face-to-face meetings, and is then called mini-Delphi or Estimate-Talk-Estimate (ETE). Delphi has been widely used for business forecasting and has certain advantages over another structured forecasting approach, prediction markets.

Delphi is based on the principle that forecasts (or decisions) from a structured group of individuals are more accurate than those from unstructured groups. The experts answer questionnaires in two or more rounds. After each round, a facilitator or change agent provides an anonymised summary of the experts' forecasts from the previous round as well as the reasons they provided for their judgments. Thus, experts are encouraged to revise their earlier answers in light of the replies of other members of their panel. It is believed that during this process the range of the answers will decrease and the group will converge towards the "correct" answer. Finally, the process is stopped after a predefined stop criterion (e.g., number of rounds, achievement of consensus, stability of results), and the mean or median scores of the final rounds determine the results.

Special attention has to be paid to the formulation of the Delphi theses and the definition and selection of the experts in order to avoid methodological weaknesses that severely threaten the validity and reliability of the results.

🔗 Secretary Problem

🔗 Mathematics 🔗 Statistics

The secretary problem is a problem that demonstrates a scenario involving optimal stopping theory. The problem has been studied extensively in the fields of applied probability, statistics, and decision theory. It is also known as the marriage problem, the sultan's dowry problem, the fussy suitor problem, the googol game, and the best choice problem.

The basic form of the problem is the following: imagine an administrator who wants to hire the best secretary out of $n$ rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants. The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.

The shortest rigorous proof known so far is provided by the odds algorithm (Bruss 2000). It implies that the optimal win probability is always at least $1/e$ (where e is the base of the natural logarithm), and that the latter holds even in much greater generality (2003). The optimal stopping rule prescribes always rejecting the first $\sim n/e$ applicants that are interviewed and then stopping at the first applicant who is better than every applicant interviewed so far (or continuing to the last applicant if this never occurs). Sometimes this strategy is called the $1/e$ stopping rule, because the probability of stopping at the best applicant with this strategy is about $1/e$ already for moderate values of $n$. One reason why the secretary problem has received so much attention is that the optimal policy for the problem (the stopping rule) is simple and selects the single best candidate about 37% of the time, irrespective of whether there are 100 or 100 million applicants.
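
A short Monte Carlo sketch of this stopping rule (the panel size and trial count are arbitrary; applicant quality is modelled as a random permutation of ranks, higher being better):

    import math
    import random

    def select_with_cutoff(ranks, cutoff):
        """Reject the first `cutoff` applicants, then accept the first one who beats
        everyone seen so far; fall back to the last applicant if none does."""
        best_seen = max(ranks[:cutoff]) if cutoff > 0 else float("-inf")
        for r in ranks[cutoff:]:
            if r > best_seen:
                return r
        return ranks[-1]

    n, trials = 100, 20_000
    cutoff = round(n / math.e)  # reject roughly the first n/e applicants
    wins = sum(
        select_with_cutoff(random.sample(range(n), n), cutoff) == n - 1
        for _ in range(trials)
    )
    print(wins / trials)  # hovers around 0.37, i.e. about 1/e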

🔗 The Erlang Distribution

🔗 Statistics

The Erlang distribution is a two-parameter family of continuous probability distributions with support $x \in [0, \infty)$. The two parameters are:

  • a positive integer $k$, the "shape", and
  • a positive real number $\lambda$, the "rate". The "scale", $\mu$, the reciprocal of the rate, is sometimes used instead.

The Erlang distribution with shape parameter $k = 1$ simplifies to the exponential distribution. It is a special case of the gamma distribution. It is the distribution of a sum of $k$ independent exponential variables with mean $1/\lambda$ each.
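
That last characterization gives a direct way to sample from the distribution; a minimal Python sketch (the parameter values are arbitrary):

    import random

    def erlang_sample(k, lam):
        """One Erlang(k, lam) draw: the sum of k independent exponential
        variables, each with rate lam (mean 1/lam)."""
        return sum(random.expovariate(lam) for _ in range(k))

    # The mean of Erlang(k, lam) is k / lam; check empirically with k=3, lam=2.0.
    k, lam, n = 3, 2.0, 100_000
    print(sum(erlang_sample(k, lam) for _ in range(n)) / n)  # ~1.5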

The Erlang distribution was developed by A. K. Erlang to examine the number of telephone calls which might be made at the same time to the operators of the switching stations. This work on telephone traffic engineering has been expanded to consider waiting times in queueing systems in general. The distribution is also used in the field of stochastic processes.

🔗 Galton Board

🔗 Statistics

The bean machine, also known as the Galton Board or quincunx, is a device invented by Sir Francis Galton to demonstrate the central limit theorem, in particular that with sufficient sample size the binomial distribution approximates a normal distribution. Among its applications, it afforded insight into regression to the mean or "regression to mediocrity".
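
The device is easy to mimic in code: each bead takes a fixed number of independent left/right bounces with equal probability, and the bin it lands in is the number of rightward bounces, so the bin counts follow a binomial (approximately normal) shape. A rough sketch, with the row and bead counts chosen arbitrarily:

    import random
    from collections import Counter

    def drop_bead(rows):
        """Each of `rows` pegs deflects the bead right (1) or left (0) with
        probability 1/2; the final bin is the count of rightward bounces."""
        return sum(random.random() < 0.5 for _ in range(rows))

    rows, beads = 12, 5_000
    bins = Counter(drop_bead(rows) for _ in range(beads))
    for b in range(rows + 1):                  # crude text histogram of the bell shape
        print(f"{b:2d} {'#' * (bins[b] // 20)}")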

🔗 Fuzzy Logic

🔗 Mathematics 🔗 Philosophy 🔗 Statistics

Fuzzy logic is a form of many-valued logic in which the truth values of variables may be any real number between 0 and 1, inclusive. It is employed to handle the concept of partial truth, where the truth value may range between completely true and completely false. By contrast, in Boolean logic, the truth values of variables may only be the integer values 0 or 1.
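
In the most common (Zadeh) formulation, conjunction, disjunction, and negation of these graded truth values are taken to be min, max, and complement; a small sketch:

    # Standard Zadeh operators on truth values in [0.0, 1.0].
    def fuzzy_and(a, b):
        return min(a, b)

    def fuzzy_or(a, b):
        return max(a, b)

    def fuzzy_not(a):
        return 1.0 - a

    # "The water is hot" might be 0.7 true and "the tank is full" 0.2 true.
    hot, full = 0.7, 0.2
    print(fuzzy_and(hot, full))  # 0.2
    print(fuzzy_or(hot, full))   # 0.7
    print(fuzzy_not(hot))        # ~0.3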

The term fuzzy logic was introduced with the 1965 proposal of fuzzy set theory by Lotfi Zadeh. Fuzzy logic had, however, been studied since the 1920s, as infinite-valued logic, notably by Łukasiewicz and Tarski.

Fuzzy logic is based on the observation that people make decisions based on imprecise and non-numerical information. Fuzzy models or sets are mathematical means of representing vagueness and imprecise information (hence the term fuzzy). These models have the capability of recognising, representing, manipulating, interpreting, and utilising data and information that are vague and lack certainty.

Fuzzy logic has been applied to many fields, from control theory to artificial intelligence.

🔗 Curse of dimensionality

🔗 Computing 🔗 Mathematics 🔗 Statistics 🔗 Cognitive science

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming.

Cursed phenomena occur in domains such as numerical analysis, sampling, combinatorics, machine learning, data mining and databases. The common theme of these problems is that when the dimensionality increases, the volume of the space increases so fast that the available data become sparse. This sparsity is problematic for any method that requires statistical significance. In order to obtain a statistically sound and reliable result, the amount of data needed to support the result often grows exponentially with the dimensionality. Also, organizing and searching data often relies on detecting areas where objects form groups with similar properties; in high dimensional data, however, all objects appear to be sparse and dissimilar in many ways, which prevents common data organization strategies from being efficient.
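
One standard way to see the sparsity numerically is to watch pairwise distances between random points concentrate as the dimension grows, so that the nearest and farthest neighbours become almost indistinguishable. A rough sketch (point and dimension counts are arbitrary):

    import math
    import random

    def distance_spread(dim, n_points=100):
        """(max - min) / min over all pairwise distances among random points in
        the unit hypercube; the ratio collapses as `dim` grows."""
        pts = [[random.random() for _ in range(dim)] for _ in range(n_points)]
        dists = [math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]
        return (max(dists) - min(dists)) / min(dists)

    for dim in (2, 10, 100, 1000):
        print(dim, round(distance_spread(dim), 3))  # the ratio shrinks sharply with dim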
