Topic: Statistics (Page 5)

You are looking at all articles with the topic "Statistics". We found 68 matches.


🔗 Extreme Learning Machine

🔗 Statistics

Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of the hidden nodes (not just the weights connecting inputs to hidden nodes) need not be tuned. These hidden nodes can be randomly assigned and never updated (i.e. they are random projections followed by nonlinear transforms), or can be inherited from their ancestors without being changed. In most cases, the output weights of the hidden nodes are learned in a single step, which essentially amounts to fitting a linear model. The name "extreme learning machine" (ELM) was given to such models by their main inventor, Guang-Bin Huang.

According to their creators, these models are able to produce good generalization performance and learn thousands of times faster than networks trained using backpropagation. The literature also reports that these models can outperform support vector machines (SVMs) in both classification and regression applications, and that SVMs provide suboptimal solutions by comparison.
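The training recipe is simple enough to sketch in a few lines. The following is a minimal illustration of the idea rather than a reference implementation: the hidden weights and biases are drawn at random and never updated, and only the output weights are fitted, here by ordinary least squares on a hypothetical toy regression problem (the network size and data are made up for the example).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy regression task: recover y = sin(x) from noisy samples.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=200)

n_hidden = 50                                 # number of random hidden nodes (illustrative)
W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never trained
b = rng.normal(size=n_hidden)                 # random hidden biases, never trained

H = np.tanh(X @ W + b)                        # hidden-layer output matrix (random projection + nonlinearity)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weights learned in one step (linear least squares)

# Prediction reuses the fixed random projection with the learned output weights.
X_new = np.linspace(-3, 3, 5).reshape(-1, 1)
print(np.tanh(X_new @ W + b) @ beta)
```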


🔗 Halton Sequence

🔗 Statistics

In statistics, Halton sequences are sequences used to generate points in space for numerical methods such as Monte Carlo simulations. Although these sequences are deterministic, they are of low discrepancy, that is, appear to be random for many purposes. They were first introduced in 1960 and are an example of a quasi-random number sequence. They generalize the one-dimensional van der Corput sequences.
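A minimal sketch of the construction, assuming the usual choice of one distinct prime base per dimension; the function name and the toy usage are illustrative only.

```python
def halton(index, base):
    """index-th term (1-based) of the van der Corput sequence in the given base."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

# A 2-D Halton sequence pairs van der Corput sequences with distinct prime bases (here 2 and 3).
points = [(halton(i, 2), halton(i, 3)) for i in range(1, 11)]
print(points)
```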


🔗 Akaike information criterion

🔗 Mathematics 🔗 Statistics

The Akaike information criterion (AIC) is an estimator of out-of-sample prediction error and thereby relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model, relative to each of the other models. Thus, AIC provides a means for model selection.

AIC is founded on information theory. When a statistical model is used to represent the process that generated the data, the representation will almost never be exact; so some information will be lost by using the model to represent the process. AIC estimates the relative amount of information lost by a given model: the less information a model loses, the higher the quality of that model.

In estimating the amount of information lost by a model, AIC deals with the trade-off between the goodness of fit of the model and the simplicity of the model. In other words, AIC deals with both the risk of overfitting and the risk of underfitting.
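Concretely, the criterion is computed as AIC = 2k − 2 ln(L̂), where k is the number of estimated parameters and L̂ is the maximized likelihood: the 2k term penalizes model complexity while −2 ln(L̂) rewards goodness of fit. A small sketch with hypothetical log-likelihood values:

```python
def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L̂), with k the number of estimated parameters."""
    return 2 * k - 2 * log_likelihood

# Hypothetical comparison of two candidate models fitted to the same data set.
aic_simple = aic(log_likelihood=-120.3, k=3)   # 246.6
aic_complex = aic(log_likelihood=-118.9, k=6)  # 249.8
print("preferred:", "simple" if aic_simple < aic_complex else "complex")  # lower AIC is preferred
```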

The Akaike information criterion is named after the Japanese statistician Hirotugu Akaike, who formulated it. It now forms the basis of a paradigm for the foundations of statistics and is also widely used for statistical inference.


🔗 The Birthday Paradox

🔗 Mathematics 🔗 Statistics

In probability theory, the birthday problem or birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are only 366 possible birthdays, including February 29). However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. These conclusions are based on the assumption that each day of the year (excluding February 29) is equally probable for a birthday.

Actual birth records show that different numbers of people are born on different days. In this case, it can be shown that the number of people required to reach the 50% threshold is 23 or fewer. For example, if half the people were born on one day and the other half on another day, then any two people would have a 50% chance of sharing a birthday.

It may well seem surprising that a group of just 23 individuals is required to reach a probability of 50% that at least two individuals in the group have the same birthday: this result is perhaps made more plausible by considering that the comparisons of birthdays are made between every possible pair of individuals, giving 23 × 22/2 = 253 comparisons, which is well over half the number of days in a year (183 at most), as opposed to fixing on one individual and comparing his or her birthday to everyone else's. The birthday problem is not a "paradox" in the literal logical sense of being self-contradictory, but is merely unintuitive at first glance.
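These probabilities are straightforward to verify from the complement, i.e. the chance that all n birthdays are distinct, under the same uniform-birthday assumption; the helper below is a small illustrative sketch:

```python
import math

def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday (uniform birthdays, no Feb 29)."""
    if n > days:
        return 1.0
    p_all_distinct = math.prod((days - i) / days for i in range(n))
    return 1.0 - p_all_distinct

print(round(p_shared_birthday(23), 4))  # ~0.5073
print(round(p_shared_birthday(70), 4))  # ~0.9992
```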

Real-world applications for the birthday problem include a cryptographic attack called the birthday attack, which uses this probabilistic model to reduce the complexity of finding a collision for a hash function, as well as calculating the approximate risk of a hash collision existing within the hashes of a given size of population.

The history of the problem is obscure. W. W. Rouse Ball indicated (without citation) that it was first discussed by Harold Davenport. However, Richard von Mises proposed an earlier version of what is considered today to be the birthday problem.


🔗 Herfindahl Index

🔗 Economics 🔗 Statistics

The Herfindahl index (also known as Herfindahl–Hirschman Index, HHI, or sometimes HHI-score) is a measure of the size of firms in relation to the industry and an indicator of the amount of competition among them. Named after economists Orris C. Herfindahl and Albert O. Hirschman, it is an economic concept widely applied in competition law, antitrust and also technology management. It is defined as the sum of the squares of the market shares of the firms within the industry (sometimes limited to the 50 largest firms), where the market shares are expressed as fractions. The result is proportional to the average market share, weighted by market share. As such, it can range from 0 to 1.0, moving from a huge number of very small firms to a single monopolistic producer. Increases in the Herfindahl index generally indicate a decrease in competition and an increase of market power, whereas decreases indicate the opposite. Alternatively, if whole percentages are used, the index ranges from 0 to 10,000 "points". For example, an index of .25 is the same as 2,500 points.
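As a small illustration of the definition, using a hypothetical five-firm industry with market shares expressed as fractions:

```python
def hhi(shares):
    """Herfindahl index of market shares expressed as fractions that sum to 1."""
    return sum(s ** 2 for s in shares)

shares = [0.40, 0.30, 0.15, 0.10, 0.05]  # hypothetical five-firm industry
print(hhi(shares))                       # 0.285 on the 0-1 scale
print(hhi(shares) * 10_000)              # 2,850 "points" on the 0-10,000 scale
```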

The major benefit of the Herfindahl index relative to measures such as the concentration ratio is that it gives more weight to larger firms.

The measure is essentially equivalent to the Simpson diversity index, which is a diversity index used in ecology; the inverse participation ratio (IPR) in physics; and the effective number of parties index in politics.


🔗 Gini coefficient

🔗 Mathematics 🔗 Economics 🔗 Statistics 🔗 Sociology 🔗 Globalization

In economics, the Gini coefficient (JEE-nee), sometimes called the Gini index or Gini ratio, is a measure of statistical dispersion intended to represent the income or wealth distribution of a nation's residents, and is the most commonly used measurement of inequality. It was developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper Variability and Mutability (Italian: Variabilità e mutabilità).

The Gini coefficient measures the inequality among values of a frequency distribution (for example, levels of income). A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of one (or 100%) expresses maximal inequality among values (e.g., for a large number of people where only one person has all the income or consumption and all others have none, the Gini coefficient will be very nearly one). For larger groups, values close to one are very unlikely in practice. Given the normalization of both the cumulative population and the cumulative share of income used to calculate the Gini coefficient, the measure is not overly sensitive to the specifics of the income distribution, but depends only on how incomes vary relative to the other members of a population. The exception to this is redistribution of income that results in a minimum income for all people. When the population is sorted, if its income distribution were to approximate a well-known function, then some representative values could be calculated.
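For a finite sample, the coefficient can be computed directly from the sorted values; the sketch below uses one common discrete formula, with made-up toy inputs:

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of non-negative incomes: 0 = perfect equality, (n-1)/n = one person holds everything."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Discrete formula on sorted values: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    return 2 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n

print(gini([1, 1, 1, 1]))    # 0.0  (everyone has the same income)
print(gini([0, 0, 0, 10]))   # 0.75 (one of four people has all the income)
```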

The Gini coefficient was proposed by Gini as a measure of inequality of income or wealth. For OECD countries, in the late 20th century, considering the effect of taxes and transfer payments, the income Gini coefficient ranged between 0.24 and 0.49, with Slovenia being the lowest and Mexico the highest. African countries had the highest pre-tax Gini coefficients in 2008–2009, with South Africa the world's highest, variously estimated to be 0.63 to 0.7, although this figure drops to 0.52 after social assistance is taken into account, and drops again to 0.47 after taxation. The global income Gini coefficient in 2005 has been estimated to be between 0.61 and 0.68 by various sources.

There are some issues in interpreting a Gini coefficient. The same value may result from many different distribution curves. The demographic structure should be taken into account. Countries with an aging population, or with a baby boom, experience an increasing pre-tax Gini coefficient even if real income distribution for working adults remains constant. Scholars have devised over a dozen variants of the Gini coefficient.


🔗 Super-Spreader

🔗 Viruses 🔗 Statistics 🔗 Computational Biology

A super-spreader is an unusually contagious organism infected with a disease. In the context of a human-borne illness, a super-spreader is an individual who is more likely to infect others than a typical infected person. Such super-spreaders are of particular concern in epidemiology.

Some cases of super-spreading conform to the 80/20 rule, where approximately 20% of infected individuals are responsible for 80% of transmissions, although super-spreading can still be said to occur when super-spreaders account for a higher or lower percentage of transmissions. In epidemics with super-spreading, the majority of individuals infect relatively few secondary contacts.

Super-spreading events are shaped by multiple factors including a decline in herd immunity, nosocomial infections, virulence, viral load, misdiagnosis, airflow dynamics, immune suppression, and co-infection with another pathogen.


🔗 Stigler's Law of Eponymy

🔗 Mathematics 🔗 Statistics 🔗 History of Science

Stigler's law of eponymy, proposed by University of Chicago statistics professor Stephen Stigler in his 1980 publication "Stigler's law of eponymy", states that no scientific discovery is named after its original discoverer. Examples include Hubble's law, which was derived by Georges Lemaître two years before Edwin Hubble; the Pythagorean theorem, which was known to Babylonian mathematicians before Pythagoras; and Halley's Comet, which had been observed by astronomers since at least 240 BC (although its official designation is due to the first ever mathematical prediction of such an astronomical phenomenon in the sky, not to its discovery). Stigler himself named the sociologist Robert K. Merton as the discoverer of "Stigler's law" to show that it follows its own decree, though the phenomenon had previously been noted by others.


🔗 Hierarchical Clustering

🔗 Computer science 🔗 Statistics 🔗 Databases

In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories:

  • Agglomerative: This is a "bottom-up" approach: Each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
  • Divisive: This is a "top-down" approach: All observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.

In general, the merges and splits are determined in a greedy manner. The results of hierarchical clustering are usually presented in a dendrogram.

Hierarchical clustering has the distinct advantage that any valid measure of distance can be used. In fact, the observations themselves are not required: all that is used is a matrix of distances. On the other hand, except for the special case of single-linkage distance, none of the algorithms (except exhaustive search in O(2^n) time) can be guaranteed to find the optimum solution.
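As a brief illustration that only a matrix of pairwise distances is needed, here is a sketch of agglomerative (bottom-up) clustering using SciPy on hypothetical two-group data:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Hypothetical data: two well-separated groups of 2-D points.
X = np.vstack([rng.normal(loc=0.0, size=(10, 2)),
               rng.normal(loc=5.0, size=(10, 2))])

D = pdist(X)                      # condensed pairwise distance matrix is the only input needed
Z = linkage(D, method="average")  # agglomerative merges; "single" and "complete" linkage also work
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram into two flat clusters
print(labels)
```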


🔗 Boltzmann machine

🔗 Computer science 🔗 Statistics

A Boltzmann machine (also called a stochastic Hopfield network with hidden units) is a type of stochastic recurrent neural network. It is a Markov random field. It was translated from statistical physics for use in cognitive science. The Boltzmann machine is based on a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model (a stochastic Ising model), applied to machine learning.

Boltzmann machines can be seen as the stochastic, generative counterpart of Hopfield networks. They were one of the first neural networks capable of learning internal representations, and are able to represent and (given sufficient time) solve combinatoric problems.

They are theoretically intriguing because of the locality and Hebbian nature of their training algorithm (being trained by Hebb's rule), and because of their parallelism and the resemblance of their dynamics to simple physical processes. Boltzmann machines with unconstrained connectivity have not proven useful for practical problems in machine learning or inference, but if the connectivity is properly constrained, the learning can be made efficient enough to be useful for practical problems.
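To make the dynamics concrete, here is a minimal sketch (not the original learning procedure of Hinton and Sejnowski) of the energy function and a Gibbs sampling sweep for a small, fully connected Boltzmann machine with binary units; the size and random weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 6                                    # number of binary units (illustrative size)
W = rng.normal(scale=0.1, size=(n, n))   # random symmetric weights with zero diagonal
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)
b = np.zeros(n)                          # unit biases
s = rng.integers(0, 2, size=n)           # random initial binary state

def energy(state):
    """E(s) = -sum_{i<j} w_ij s_i s_j - sum_i b_i s_i, written with the symmetric W."""
    return -0.5 * state @ W @ state - b @ state

def gibbs_sweep(state):
    """Resample each unit from its conditional Bernoulli distribution, sigmoid(W_i . s + b_i)."""
    for i in range(len(state)):
        p_on = 1.0 / (1.0 + np.exp(-(W[i] @ state + b[i])))
        state[i] = 1 if rng.random() < p_on else 0
    return state

for _ in range(100):   # repeated sweeps draw approximate samples from the Boltzmann distribution
    s = gibbs_sweep(s)
print(s, energy(s))
```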

They are named after the Boltzmann distribution in statistical mechanics, which is used in their sampling function; for this reason they are also called "energy-based models" (EBMs). They were invented in 1985 by Geoffrey Hinton, then a Professor at Carnegie Mellon University, and Terry Sejnowski, then a Professor at Johns Hopkins University.
