Topic: Statistics (Page 6)

You are looking at all articles with the topic "Statistics". We found 68 matches.

🔗 Normally distributed and uncorrelated does not imply independent

🔗 Statistics

In probability theory, although simple examples show that uncorrelatedness of two random variables does not in general imply their independence, it is sometimes mistakenly thought that it does when the two random variables are normally distributed. This article demonstrates that assuming each variable is normally distributed does not have that consequence, although joint (multivariate) normality, including the bivariate normal distribution, does.

To say that the pair (X, Y) of random variables has a bivariate normal distribution means that every linear combination aX + bY of X and Y, for constant (i.e., non-random) coefficients a and b, has a univariate normal distribution. In that case, if X and Y are uncorrelated then they are independent. However, it is possible for two random variables X and Y to be jointly distributed so that each one alone is marginally normally distributed and they are uncorrelated, yet they are not independent; examples are given below.
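
The classic construction behind such examples can be checked numerically. Below is a minimal sketch (using NumPy; the construction is the standard sign-flip example and may differ in details from the article's, and the variable names are illustrative): take X standard normal and Y = WX with W an independent random sign, so that Y is also standard normal and uncorrelated with X, yet |Y| = |X| always.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)           # X ~ N(0, 1)
w = rng.choice([-1.0, 1.0], size=x.size)     # independent random sign
y = w * x                                    # Y is also N(0, 1) by symmetry

# Uncorrelated: E[XY] = E[W] E[X^2] = 0, confirmed numerically.
print("corr(X, Y)     ~", np.corrcoef(x, y)[0, 1])
# But dependent: Y^2 = X^2 exactly, so the squares are perfectly correlated.
print("corr(X^2, Y^2) =", np.corrcoef(x**2, y**2)[0, 1])
```

Note that (X, Y) is not bivariate normal here: the linear combination X + Y equals zero with probability 1/2, which no normal distribution allows.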

🔗 Loess Regression

🔗 Mathematics 🔗 Statistics

Local regression or local polynomial regression, also known as moving regression, is a generalization of the moving average and polynomial regression. Its most common methods, initially developed for scatterplot smoothing, are LOESS (locally estimated scatterplot smoothing) and LOWESS (locally weighted scatterplot smoothing), both pronounced /ˈloʊɛs/. They are two strongly related non-parametric regression methods that combine multiple regression models in a k-nearest-neighbor-based meta-model. In some fields, LOESS is known and commonly referred to as the Savitzky–Golay filter (proposed 15 years before LOESS).

LOESS and LOWESS thus build on "classical" methods, such as linear and nonlinear least squares regression. They address situations in which the classical procedures do not perform well or cannot be effectively applied without undue labor. LOESS combines much of the simplicity of linear least squares regression with the flexibility of nonlinear regression. It does this by fitting simple models to localized subsets of the data to build up a function that describes the deterministic part of the variation in the data, point by point. In fact, one of the chief attractions of this method is that the data analyst is not required to specify a global function of any form to fit a model to the data, only to fit segments of the data.
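
A minimal sketch of this idea, using the lowess implementation in the statsmodels Python library (the synthetic data and parameter choices are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)   # noisy signal

# frac is the smoothing span: the fraction of the data used in each
# local weighted fit (the k-nearest-neighbor window described above).
smoothed = sm.nonparametric.lowess(y, x, frac=0.2)   # columns: x, fitted y
print(smoothed[:5])
```

Smaller values of frac track the data more closely; larger values give a smoother curve. Tuning that bias-variance trade-off replaces the choice of a global functional form.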

The trade-off for these features is increased computation. Because it is so computationally intensive, LOESS would have been practically impossible to use in the era when least squares regression was being developed. Most other modern methods for process modeling are similar to LOESS in this respect. These methods have been consciously designed to use our current computational ability to the fullest possible advantage to achieve goals not easily achieved by traditional approaches.

A smooth curve through a set of data points obtained with this statistical technique is called a loess curve, particularly when each smoothed value is given by a weighted quadratic least squares regression over the span of values of the scattergram's criterion (y-axis) variable. When each smoothed value is given by a weighted linear least squares regression over the span, it is known as a lowess curve; however, some authorities treat lowess and loess as synonyms.

🔗 Gravity Model of Trade

🔗 Economics 🔗 Statistics

The gravity model of international trade in international economics is a model that, in its traditional form, predicts bilateral trade flows based on the economic sizes of, and distance between, two units.

The model was first introduced to the economics world by Walter Isard in 1954. The basic model for trade between two countries (i and j) takes the form of

$$F_{ij} = G \, \frac{M_i M_j}{D_{ij}}$$

In this formula, G is a constant, F stands for trade flow, D stands for the distance, and M stands for the economic dimensions of the two countries being measured. The equation can be turned into a linear form suitable for econometric analysis by taking logarithms. Economists have used the model to analyse the determinants of bilateral trade flows such as common borders, common languages, common legal systems, common currencies, and common colonial legacies, and to test the effectiveness of trade agreements and organizations such as the North American Free Trade Agreement (NAFTA) and the World Trade Organization (WTO) (Head and Mayer 2014). The model has also been used in international relations to evaluate the impact of treaties and alliances on trade (Head and Mayer).
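
To make the log-linear estimation concrete, here is a minimal sketch (using NumPy and statsmodels on synthetic data; all coefficients and variable names are illustrative) that recovers the elasticities of the gravity equation by ordinary least squares:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
m_i = rng.lognormal(10, 1, n)     # economic size of country i
m_j = rng.lognormal(10, 1, n)     # economic size of country j
d_ij = rng.lognormal(7, 0.5, n)   # bilateral distance

# ln F_ij = ln G + a ln M_i + b ln M_j - c ln D_ij + noise
log_f = (1.0 + 0.8 * np.log(m_i) + 0.8 * np.log(m_j)
         - 1.0 * np.log(d_ij) + rng.normal(scale=0.5, size=n))

X = sm.add_constant(np.column_stack([np.log(m_i), np.log(m_j), np.log(d_ij)]))
print(sm.OLS(log_f, X).fit().params)   # roughly [1.0, 0.8, 0.8, -1.0]
```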

The model has also been applied to other bilateral flow data (also 'dyadic' data) such as migration, traffic, remittances and foreign direct investment.

🔗 Type I and type II errors

🔗 Mathematics 🔗 Statistics

In statistical hypothesis testing, a type I error is the rejection of a true null hypothesis (also known as a "false positive" finding or conclusion), while a type II error is the non-rejection of a false null hypothesis (also known as a "false negative" finding or conclusion). Much of statistical theory revolves around the minimization of one or both of these errors, though the complete elimination of either is impossible when outcomes are not deterministic. Lowering the significance threshold (the alpha level at which a p-value leads to rejection) reduces the type I error rate, but generally at the cost of a higher type II error rate. Knowledge of type I and type II errors is widely used in medical science, biometrics and computer science.
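
The trade-off can be seen in a short simulation. Here is a minimal sketch (using NumPy and SciPy; the sample size, effect size and alpha are illustrative) that estimates both error rates for a one-sample t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, trials = 0.05, 30, 10_000

def rejection_rate(true_mean):
    """Fraction of trials in which H0: mean = 0 is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = rng.normal(true_mean, 1.0, size=n)
        _, p = stats.ttest_1samp(sample, popmean=0.0)
        rejections += p < alpha
    return rejections / trials

print("type I error rate (null true):  ", rejection_rate(0.0))    # ~ alpha
print("type II error rate (mean = 0.5):", 1 - rejection_rate(0.5))
```

Lowering alpha shrinks the first number and inflates the second; only a larger sample improves both at once.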

🔗 Itô Calculus

🔗 Mathematics 🔗 Statistics

Itô calculus, named after Kiyosi Itô, extends the methods of calculus to stochastic processes such as Brownian motion (see Wiener process). It has important applications in mathematical finance and stochastic differential equations.

The central concept is the Itô stochastic integral, a stochastic generalization of the Riemann–Stieltjes integral in analysis. The integrands and the integrators are now stochastic processes:

$$Y_t = \int_0^t H_s \, dX_s,$$

where H is a locally square-integrable process adapted to the filtration generated by X (Revuz & Yor 1999, Chapter IV), which is a Brownian motion or, more generally, a semimartingale. The result of the integration is then another stochastic process. Concretely, the integral from 0 to any particular t is a random variable, defined as a limit of a certain sequence of random variables. The paths of Brownian motion fail to satisfy the requirements of the standard techniques of calculus: with a stochastic integrand, the Itô stochastic integral amounts to an integral with respect to a function that is not differentiable at any point and has infinite variation over every time interval. The main insight is that the integral can nevertheless be defined as long as the integrand H is adapted, which loosely speaking means that its value at time t can depend only on information available up to that time.

Roughly speaking, one chooses a sequence of partitions of the interval from 0 to t and constructs Riemann sums; each Riemann sum uses one particular realization of the integrator. It is crucial which point in each of the small intervals is used to compute the value of the function: typically, the left end of the interval is used, so that the integrand is evaluated before the increment is observed. The limit is then taken in probability as the mesh of the partition goes to zero. Numerous technical details have to be taken care of to show that this limit exists and is independent of the particular sequence of partitions.
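
The left-endpoint construction can be illustrated numerically. A minimal sketch (using NumPy; the step count and variable names are illustrative) approximates the Itô integral of Brownian motion with respect to itself, whose Itô value is (W_T^2 - T)/2 rather than the classical W_T^2/2:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(scale=np.sqrt(dt), size=n)   # Brownian increments
W = np.concatenate([[0.0], np.cumsum(dW)])   # Brownian path on the grid

# Left-endpoint Riemann sum: the integrand W_t is evaluated before the
# increment is observed, i.e. it is adapted.
ito_sum = np.sum(W[:-1] * dW)
print("Riemann sum:", ito_sum)
print("Ito value  :", (W[-1] ** 2 - T) / 2)  # -T/2 is the quadratic variation
```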

Important results of Itô calculus include the integration by parts formula and Itô's lemma, which is a change of variables formula. These differ from the formulas of standard calculus, due to quadratic variation terms.
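
For example, for a standard Brownian motion W and a twice continuously differentiable function f, Itô's lemma reads

$$df(W_t) = f'(W_t)\,dW_t + \tfrac{1}{2} f''(W_t)\,dt,$$

where the extra (1/2) f''(W_t) dt term, absent from the classical chain rule, comes from the quadratic variation of Brownian motion. Taking f(x) = x^2 recovers the -T/2 correction seen in the sketch above.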

In mathematical finance, the evaluation strategy described above is interpreted as first deciding what to do and then observing the change in prices. The integrand is how much stock we hold, the integrator represents the movement of the prices, and the integral is how much money we have in total, including what our stock is worth, at any given moment. The prices of stocks and other traded financial assets can be modeled by stochastic processes such as Brownian motion or, more often, geometric Brownian motion (see Black–Scholes). The Itô stochastic integral then represents the payoff of a continuous-time trading strategy consisting of holding an amount Ht of the stock at time t. In this situation, the condition that H is adapted corresponds to the necessary restriction that the trading strategy can only make use of the information available at any given time. This prevents the possibility of unlimited gains through clairvoyance: buying the stock just before each uptick in the market and selling before each downtick. Similarly, the condition that H is adapted implies that the stochastic integral will not diverge when calculated as a limit of Riemann sums (Revuz & Yor 1999, Chapter IV).

🔗 Thompson sampling

🔗 Statistics 🔗 Robotics

Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.
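
For a Bernoulli bandit with Beta priors, the "randomly drawn belief" is a posterior sample and the update is conjugate. A minimal sketch (using NumPy; the arm reward probabilities are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = [0.3, 0.5, 0.7]            # unknown to the agent
successes = np.ones(len(true_probs))    # Beta posterior alpha (prior = 1)
failures = np.ones(len(true_probs))     # Beta posterior beta  (prior = 1)

for _ in range(10_000):
    # Sample one belief per arm from its posterior, act greedily on it.
    theta = rng.beta(successes, failures)
    arm = int(np.argmax(theta))
    reward = rng.random() < true_probs[arm]
    # Conjugate Bayesian update of the chosen arm's Beta posterior.
    successes[arm] += reward
    failures[arm] += 1 - reward

print("posterior means:", successes / (successes + failures))
```

Because each arm is chosen with the probability that it is optimal under the current posterior, exploration tapers off automatically as the posteriors concentrate.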

🔗 Extreme Value Theory

🔗 Statistics

Extreme value theory or extreme value analysis (EVA) is a branch of statistics dealing with the extreme deviations from the median of probability distributions. It seeks to assess, from a given ordered sample of a given random variable, the probability of events that are more extreme than any previously observed. Extreme value analysis is widely used in many disciplines, such as structural engineering, finance, earth sciences, traffic prediction, and geological engineering. For example, EVA might be used in the field of hydrology to estimate the probability of an unusually large flooding event, such as the 100-year flood. Similarly, for the design of a breakwater, a coastal engineer would seek to estimate the 50-year wave and design the structure accordingly.
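
For the flood example, the standard block-maxima workflow fits a generalized extreme value (GEV) distribution to annual maxima and reads the T-year return level off its quantile function. A minimal sketch (using SciPy on synthetic data; the distribution parameters are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Pretend these are 60 years of annual maximum river discharges.
annual_maxima = rng.gumbel(loc=100.0, scale=15.0, size=60)

# Fit the three-parameter GEV by maximum likelihood.
shape, loc, scale = stats.genextreme.fit(annual_maxima)

# The 100-year flood is the level exceeded with probability 1/100 per year,
# i.e. the 1 - 1/100 quantile of the annual-maximum distribution.
level = stats.genextreme.ppf(1 - 1 / 100, shape, loc=loc, scale=scale)
print("estimated 100-year return level:", level)
```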

🔗 R vs. Adams

🔗 Law 🔗 Statistics 🔗 United Kingdom

R v Adams [1996] EWCA Crim 10 and 222 are United Kingdom rulings that barred headline (soundbite), standalone Bayesian statistics from the reasoning admissible before a jury in DNA evidence cases, in favour of the calculated average (and maximal) number of matching individuals in the nation's population. The facts involved strong but inconclusive evidence that conflicted with the DNA evidence, leading to a retrial.

🔗 Today a greater percentage of Dutch people speak English than Canadians

🔗 Lists 🔗 Statistics 🔗 Linguistics 🔗 Linguistics/Applied Linguistics 🔗 Languages 🔗 Countries 🔗 English Language

The following is a list of English-speaking populations by country, including information on both native speakers and second-language speakers.

Some of the entries in this list are dependent territories (e.g. the U.S. Virgin Islands), autonomous regions (e.g. Hong Kong) or associated states (e.g. the Cook Islands) of other countries, rather than fully sovereign countries in their own right.

🔗 Baum–Welch Algorithm

🔗 Computing 🔗 Mathematics 🔗 Statistics 🔗 Molecular Biology 🔗 Molecular Biology/Computational Biology

In electrical engineering, statistical computing and bioinformatics, the Baum–Welch algorithm is a special case of the expectation–maximization algorithm used to find the unknown parameters of a hidden Markov model (HMM). It makes use of the forward-backward algorithm to compute the statistics for the expectation step.
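
A minimal sketch of the algorithm for a discrete HMM (plain NumPy, single observation sequence, no numerical scaling, so it is only suitable for short sequences; all names are illustrative):

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=50, seed=0):
    """Estimate HMM parameters (pi, A, B) from one discrete observation
    sequence by expectation-maximization (Baum-Welch)."""
    obs = np.asarray(obs)
    rng = np.random.default_rng(seed)
    # Random row-stochastic initial guesses.
    pi = rng.random(n_states); pi /= pi.sum()
    A = rng.random((n_states, n_states)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(axis=1, keepdims=True)
    T = len(obs)
    for _ in range(n_iter):
        # E-step: forward and backward passes (forward-backward algorithm).
        alpha = np.zeros((T, n_states))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta = np.ones((T, n_states))
        for t in range(T - 2, -1, -1):
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        # Posterior state occupancies and transition probabilities.
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((T - 1, n_states, n_states))
        for t in range(T - 1):
            xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])
            xi[t] /= xi[t].sum()
        # M-step: re-estimate parameters from expected counts.
        pi = gamma[0]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        for k in range(n_symbols):
            B[:, k] = gamma[obs == k].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B

pi, A, B = baum_welch([0, 1, 0, 0, 1, 1, 0, 1] * 25, n_states=2, n_symbols=2)
print(pi, A, B, sep="\n")
```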