2: The linear fit formulas



The easiest data-fitting task we can carry out is one in which the data show a linear trend, that is, the data appear to follow a straight line when placed on a graph. The first step, before anything else, is to translate the data at our disposal into a graph, to determine whether there is indeed any trend (either linear or nonlinear), a grouping of the data along a certain path behind which there is possibly a natural relationship that can eventually be expressed with a simple formula. If the graph of the data, several pairs of measurements of two variable quantities, one of which perhaps can be varied at will, turns out to be as follows:


we can see that there seems to be no correlation between the graphed data. However, if the graph turns out to be one like this:


then a trend is evident. These data, presumably obtained by experiment, are usually affected by a random error (occurring at random) denoted by the Greek letter ε (equivalent to the Latin letter "e"). If it were not for this error, all the data would possibly fall on a straight line or on a smooth, continuous curve of the phenomenon being described by the data. Looking at the last graph, it is tempting to draw "by hand" a straight line passing as close as possible to our data, a straight line like the following:



The problem with a plot "by hand" of the straight line is that different people obtain different lines according to their own subjective criteria, and possibly no two people will draw the same line, with no way of knowing which of them is the best. That is why, in order to unify criteria and get the same answer in all cases, we need to use a mathematical approach. This approach is given to us by the method of least squares, developed by the "prince of mathematics" Carl Friedrich Gauss.

The idea behind the method of least squares is as follows: if a straight line is drawn through a set of data points that seem to cluster along a linear trend, then among all the different lines that could be drawn we can try to find the one that produces the "best fit" according to some mathematical criterion. This line might be the one for which the "average distance" from all the points on the graph to the ideal line is the smallest possible. Although the distance from each point to the ideal line could be defined so as to be perpendicular to the line, as shown in the picture below right:


the mathematical manipulation of the problem is greatly simplified if, instead of using distances perpendicular to the ideal line, we use vertical distances measured along the vertical axis of the graph, as shown in the picture above left.

While we could try to use the absolute values |Dᵢ| of the distances from each of the points i to the ideal line (the absolute values eliminate the presence of negative values which, averaged together with the positive values, would end up "canceling" the useful average we intend to obtain), the main problem is that the absolute value of a variable cannot be differentiated in the conventional manner; it does not lend itself easily to the usual resources of differential calculus, which is a disadvantage when those tools are to be used to obtain maxima and minima. That is why we use the sum of the squares of the distances rather than their absolute values, since this allows us to treat these quantities, known as residuals, as continuously differentiable. However, this technique has the disadvantage that, by squaring the distances, isolated points lying very far from the ideal line have an outsized effect on the fit, something we should not lose sight of when the graph shows isolated data points that seem too far from the ideal line and which may be indicative of a mistake in measurement or badly recorded data.

For data that seem to show a linear trend, the method of least squares assumes from the outset the existence of an "ideal" line that provides the best fit, known as the "least-squares fit". The equation of this "ideal" line is:

Y = A + BX

where A and B are the parameters (constant numbers) to be determined under the criterion of least squares.

Given a number N of pairs of experimental points (X₁, Y₁), (X₂, Y₂), (X₃, Y₃), etc., then for each experimental point corresponding to each value of the independent variable X = X₁, X₂, X₃, ..., Xₙ there will be a calculated value yᵢ = y₁, y₂, y₃, ... obtained using the "ideal" line, that is:

y₁ = A + BX₁

y₂ = A + BX₂

y₃ = A + BX₃

...

yₙ = A + BXₙ

The difference between each real value Y = Y₁, Y₂, Y₃, ..., Yₙ and each value calculated for the corresponding Xᵢ using the ideal line gives us the vertical "distance" Dᵢ that separates the two values:

D₁ = A + BX₁ - Y₁

D₂ = A + BX₂ - Y₂

D₃ = A + BX₃ - Y₃

...

Dₙ = A + BXₙ - Yₙ

Each of these distances Dᵢ is known in mathematical statistics as a residual.

To find the "ideal" line, we will use the established procedures of differential calculus for determining maxima and minima. A first attempt might lead us to try to find the line that minimizes the sum of the distances

S = D₁ + D₂ + D₃ + ... + Dₙ

However, this scheme will not serve us much, because in calculating the value of each distance Dᵢ some "real" points will lie above the line and others below it, so that some of the distances are positive and some negative (possibly distributed in roughly equal parts), thus canceling much of their contribution to the function we want to minimize. This leads us immediately to try to use the absolute values of the distances:

S = |D₁| + |D₂| + |D₃| + ... + |Dₙ|

but, as noted above, absolute values do not lend themselves to differentiation, so we resort to another scheme in which we still add up the distances Dᵢ, without the problem of mutual cancellation between positive and negative terms. The strategy is to use the squares of the distances:

S = D₁² + D₂² + D₃² + ... + Dₙ²

With this definition, the quantity we want to minimize is given by:

S = (A + BX₁ - Y₁)² + (A + BX₂ - Y₂)² + (A + BX₃ - Y₃)² + ... + (A + BXₙ - Yₙ)²

The unknowns of the ideal line we are looking for are the parameters A and B. It is with respect to these two quantities that we must carry out the minimization of S. If there were a single parameter, an ordinary differentiation would suffice. But since there are two parameters, we must carry out two separate differentiations using partial derivatives, in which we differentiate with respect to one parameter while keeping the other constant. From calculus, S will be a minimum when the partial derivatives with respect to A and B are both zero. These partial derivatives (reconstructed here from the definition of S) are:

∂S/∂A = 2 Σ (A + BXᵢ - Yᵢ) = 0

∂S/∂B = 2 Σ (A + BXᵢ - Yᵢ) Xᵢ = 0

These derivatives, set equal to zero, give us the required equations:

AN + B ΣX - ΣY = 0

A ΣX + B ΣX² - ΣXY = 0

where we are using the simplified symbolic notation ΣX ≡ X₁ + X₂ + ... + Xₙ, and similarly for ΣY, ΣX² and ΣXY, every sum running over all N data points.

The two equations can be rearranged as follows:

AN + B ΣX = ΣY

A ΣX + B ΣX² = ΣXY

giving us two linear equations that can be solved as simultaneous equations, either directly or through Cramer's method (determinants), thus obtaining the following formulas:

A = [ (ΣY)(ΣX²) - (ΣX)(ΣXY) ] / [ N ΣX² - (ΣX)² ]

B = [ N ΣXY - (ΣX)(ΣY) ] / [ N ΣX² - (ΣX)² ]

Thus, substituting the data into these two formulas gives us the values of the parameters A and B we are looking for, and with them the "ideal" line, the line that provides the best fit of all those we could draw under the criterion we have defined. Since we are minimizing a function that is the sum of the squares of the distances (the residuals), this method, as mentioned above, is universally known as the method of least squares.
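For readers who want to mechanize these two formulas right away, here is a minimal sketch in Python (ours, not part of the original derivation) that computes A and B directly from the sums:

    def least_squares_line(xs, ys):
        # Accumulate the sums that appear in the formulas for A and B.
        n = len(xs)
        sum_x = sum(xs)
        sum_y = sum(ys)
        sum_x2 = sum(x * x for x in xs)
        sum_xy = sum(x * y for x, y in zip(xs, ys))
        denom = n * sum_x2 - sum_x ** 2
        a = (sum_y * sum_x2 - sum_x * sum_xy) / denom
        b = (n * sum_xy - sum_x * sum_y) / denom
        return a, b

With the five data pairs used as an example later in this section (x = 1, 2, 4, 5, 9 and y = 5, 7, 11, 13, 21), the function returns A = 3 and B = 2, matching the result quoted there.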

PROBLEM:
Given the following data, obtain the least-squares line:

To use the equations required to obtain the least-squares line, it is convenient to arrange the summations in a table like the one shown below:


From this table of intermediate results we obtain:

(ΣY)(ΣX²) - (ΣX)(ΣXY) = (40)(524) - (56)(364) = 576

N ΣXY - (ΣX)(ΣY) = (8)(364) - (56)(40) = 672

N ΣX² - (ΣX)² = (8)(524) - (56)² = 1056

And using the formulas obtained above:

A = [ (ΣY)(ΣX²) - (ΣX)(ΣXY) ] / [ N ΣX² - (ΣX)² ] = 576/1056 = 6/11

A = 0.545

B = [ N ΣXY - (ΣX)(ΣY) ] / [ N ΣX² - (ΣX)² ] = 672/1056 = 7/11

B = 0.636

The least-squares line is then:

Y = A + BX

Y = 0.545 + 0.636X

The graph of this straight line superimposed on the individual data pairs is:


We see that the fit is reasonably good. And, most important, other researchers will get exactly the same result under the least-squares criterion for such problems. Significantly, the mechanized evaluation of these data in column arrangements like the one used above for obtaining ΣX, ΣY, ΣX² and ΣXY can be done in a "worksheet" (spreadsheet) such as EXCEL.

For a large set of data pairs, these calculations used to be tedious and subject to mistakes. Fortunately, with the advent of programmable pocket calculators and computer programs, arithmetic for which only two decades ago expensive computers and sophisticated software in a scientific programming language such as FORTRAN were required can now be performed on a desktop computer. These calculations can be mechanized to such an extent that, instead of having to spend excessive amounts of time performing them, the emphasis today is on the analysis and interpretation of the results.

If, on the basis of experimental data, or of data obtained from a sample taken from a population, we want to estimate the value of a variable Y corresponding to a value of another variable X using the least-squares curve that best fits the data, it is customary to call the resulting curve the regression curve of Y on X, since Y is estimated from X. If the curve is a straight line, we call that line the regression line of Y on X. An analysis carried out by the method of least squares is also called regression analysis, and computer programs that perform these calculations are called least-squares regression programs.

If, however, instead of estimating the value of Y from the value of X we wish to estimate the value of X from Y, then we would use a regression curve of X on Y, which involves simply exchanging the variables in the diagram (and in the normal equations) so that X is the dependent variable and Y the independent variable, which in turn means replacing the vertical distances D used in the derivation of the least-squares line with horizontal distances:

An interesting detail is that, in general, for a given set of data the regression line of Y on X and the regression line of X on Y are two different lines that do not match exactly in a diagram, although in many cases they lie so close to each other that they could be confused.

PROBLEM:
Given the following data set:

a) Obtain the regression line of Y on X, taking Y as the dependent variable and X as the independent variable.

b) Obtain the regression line of X on Y, taking X as the dependent variable and Y as the independent variable.

a) Considering Y as the dependent variable and X as the independent variable, the equation of the least-squares line is Y = A + BX, and the normal equations are:

ΣY = AN + B ΣX

ΣXY = A ΣX + B ΣX²

Carrying out the summations, the normal equations become:

8A + 56B = 40

56A + 524B = 364

Solving the two equations simultaneously, we get A = 6/11 and B = 7/11. The least-squares line is then:

Y = 6/11 + (7/11)X

Y = 0.545 + 0.636X



b) Considering X as the dependent variable and Y as the independent variable, the least-squares equation is now X = P + QY, and the normal equations are:

ΣX = PN + Q ΣY

ΣXY = P ΣY + Q ΣY²

Carrying out the summations, the normal equations become:

8P + 40Q = 56

40P + 256Q = 364

Solving both equations simultaneously, we obtain P = -1/2 and Q = 3/2. The least-squares line is then:

X = -1/2 + (3/2)Y

X = -0.5 + 1.5Y

For comparative purposes, we can solve this formula for Y as a function of X, obtaining:

Y = 0.333 + 0.667X
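As a cross-check, both pairs of normal equations can be solved as 2×2 linear systems; the sketch below (ours) reproduces the results of parts (a) and (b) from the sums quoted in the problem:

    import numpy as np

    # Part (a), Y on X:  N*A + SX*B = SY ;  SX*A + SX2*B = SXY
    print(np.linalg.solve([[8.0, 56.0], [56.0, 524.0]], [40.0, 364.0]))
    # -> [0.5454..., 0.6363...], that is, A = 6/11 and B = 7/11

    # Part (b), X on Y:  N*P + SY*Q = SX ;  SY*P + SY2*Q = SXY
    print(np.linalg.solve([[8.0, 40.0], [40.0, 256.0]], [56.0, 364.0]))
    # -> [-0.5, 1.5], that is, P = -1/2 and Q = 3/2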

We note that the regression lines obtained in (a) and (b) are different. Below is a chart showing the two lines:





An important measure of how good the "fit" is between experimental data and a straight line obtained by the method of least squares is the correlation coefficient. When all the data lie exactly on a straight line, the correlation coefficient is unity, and as the data on a graph appear increasingly dispersed in relation to the line, the correlation coefficient gradually decreases, as shown by the following examples:




As a courtesy of Professor Victor Miguel Ponce, professor and researcher at San Diego State University, several programs for mechanizing the calculations required to "fit" a data set with a linear trend to a "least-squares" line are available to the public on his personal website. The page that provides all the programs is:

http://ponce.sdsu.edu/online_calc.php

under the heading "Regression." The page that allows us to fit a straight line to data is located at:

http://ponce.sdsu.edu/onlineregression11.php

To use the above program, we first enter the size of the array, that is, the number of data pairs, after which we enter the values of the paired data in an orderly manner, starting with the values of y separated by commas, followed by the values of x, also separated by commas. That done, we press "Calculate" at the bottom of the page, whereupon we obtain the values of the fitted intercept and slope, the correlation coefficient r, the standard error of the estimate, and the dispersions (standard deviations) σx and σy of the data xᵢ and yᵢ. As an example of the use of this program, we obtain the least-squares line for the following pairs of data:

x(1) = 1, y(1) = 5
x(2) = 2, y(2) = 7
x(3) = 4, y(3) = 11
x(4) = 5, y(4) = 13
x(5) = 9, y(5) = 21

According to this program, the least-squares line is:

Y = 3 + 2X

and the correlation coefficient is r = 1.0, while the standard error of the estimate is zero, which, as we shall see later, tells us that all the pairs of data fall exactly on the least-squares line. If we plot the least-squares line and place on it the data pairs (xᵢ, yᵢ), we find that indeed all the data are aligned along a single line:

This confirms that the mathematical criterion we used to obtain the least-squares line, the definition we have given of the correlation coefficient r, and the definition of the standard error of the estimate are all consistent.

So far we have considered a "least-squares fit" associated with a line that might be called "ideal" from a mathematical point of view, where we have an independent variable (cause) that produces an effect on some dependent variable (effect). But we may have a situation in which the values taken by some variable depend not on one but on two or more factors. In this case, if the individual dependence on each of the factors, keeping the others constant, is linear, we can extend the method of least squares to cover this situation just as we did when there was only one independent variable. This is known as multiple linear regression. For two variables X₁ and X₂, this dependence is represented as Y = f(X₁, X₂). If we have a set of experimental data for a situation like this, the plotting of the data must be carried out in three dimensions, and its appearance is as follows:


In this graph, the height of each point represents the value of Y for each pair of values X₁ and X₂. Representing the points without explicitly showing the vertical lines that drop from the points to the horizontal plane Y = 0 makes the three-dimensional graph look like this:

The least-squares method used to fit a set of data to a least-squares line can also be extended to obtain a least-squares formula in two variables, in which case the regression equation will be the following:

Y = A₀ + A₁X₁ + A₂X₂

and often, erroneously, this equation is taken as representing a line. However, it is not a line; it is a surface.
If we perform a least-squares fit with the linear formula in the two factors X₁ and X₂, we obtain what is known as a regression surface, which in this case is a flat surface, a plane:

For the data shown above, this regression surface looks like the one shown below:

If we want to obtain the equations for this least-squares plane, we proceed in exactly the same way as we did to obtain the formulas that evaluate the parameters of the least-squares line; that is, we define the vertical distances from each of the ordered triples of data points to this least-squares plane:


Postponing for a moment the extension to problems involving more than two independent variables, for the two variables X₁ and X₂ suppose we start with a relationship between the three variables that can be described by the following formula:

Y = α + β₁X₁ + β₂X₂

which is a linear formula in the variables Y, X₁ and X₂. Here there are three independent parameters: α, β₁ and β₂. The values of Y on this plane corresponding to X₁ = X₁¹, X₁², X₁³, ..., X₁ᴺ and X₂ = X₂¹, X₂², X₂³, ..., X₂ᴺ (we use here the subscript to distinguish each of the two variables X₁ and X₂, and the superscript to eventually carry out summations over the values taken by each of these variables) are α + β₁X₁¹ + β₂X₂¹, α + β₁X₁² + β₂X₂², α + β₁X₁³ + β₂X₂³, ..., α + β₁X₁ᴺ + β₂X₂ᴺ, while the actual values are Y₁, Y₂, Y₃, ..., Yₙ respectively. Then, just as we did for the regression equation in a single variable, we define the "distance" produced by each trio of experimental data with respect to the values Yᵢ, so that the sum of the squares of these distances is:

S = (α + β₁X₁¹ + β₂X₂¹ - Y₁)² + (α + β₁X₁² + β₂X₂² - Y₂)² + ... + (α + β₁X₁ᴺ + β₂X₂ᴺ - Yₙ)²

From calculus, S will be a minimum when the partial derivatives of S with respect to the parameters α, β₁ and β₂ are equal to zero:

Proceeding as we did when we had two parameters instead of three, this gives us the following set of equations:

Nα + β₁ ΣX₁ + β₂ ΣX₂ - ΣY = 0

α ΣX₁ + β₁ ΣX₁² + β₂ ΣX₁X₂ - ΣX₁Y = 0

α ΣX₂ + β₂ ΣX₂² + β₁ ΣX₁X₂ - ΣX₂Y = 0

These are the normal equations required to obtain the regression of Y on X₁ and X₂. In the calculations, we have three simultaneous equations from which the parameters α, β₁ and β₂ are obtained.

There is a reason why these equations are called normal equations. If we represent the data set for the variable X₁ as a vector X₁ and the data set for the variable X₂ as another vector X₂, considering that these vectors are independent of each other (using a term from linear algebra, linearly independent, meaning that neither is a simple multiple of the other pointing physically in the same direction), then these two vectors determine a plane. On the other hand, we can regard the sum of the squared differences Dᵢ used in the derivation of the normal equations as the squared magnitude of a difference vector D, recalling that the squared length of a vector is equal to the sum of the squares of its elements (a Pythagorean theorem extended to N dimensions). This makes the principle of "best fit" equivalent to finding the difference vector D corresponding to the shortest possible distance to the plane formed by the vectors X₁ and X₂. And the shortest possible distance is given by a vector perpendicular, or normal, to the plane defined by the vectors X₁ and X₂ (or rather, the plane formed by the linear combinations β₁X₁ + β₂X₂). Although we could write out here formulas analogous to those of the single-variable case for the two variables X₁ and X₂, having understood what a "least-squares plane" is, we can use one of the many computer programs available commercially or online. The personal page of Professor Victor Miguel Ponce cited above gives us the means to carry out a "least-squares fit" for the case of two variables X₁ and X₂, accessible at the following address:

http://ponce.sdsu.edu/onlineregression13.php
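For those who prefer to carry out the fit on their own computer, here is a minimal sketch (ours, with made-up illustrative data) that solves the same problem by stacking the data into a design matrix and calling a standard least-squares routine:

    import numpy as np

    # Made-up data trios (X1, X2, Y) for illustration only.
    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
    y  = np.array([4.1, 4.9, 7.2, 7.8, 10.9, 11.1])

    # One column per parameter: a column of ones for alpha, then X1 and X2.
    M = np.column_stack([np.ones_like(x1), x1, x2])
    (alpha, b1, b2), *_ = np.linalg.lstsq(M, y, rcond=None)
    print(alpha, b1, b2)   # parameters of Y = alpha + b1*X1 + b2*X2

Minimizing the sum of squared residuals this way is exactly the geometric picture described above: the routine finds the combination of the columns whose difference from Y is normal to the plane they span.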

PROBLEM:
Obtain the formula of the plane that best fits the following data set. These data, represented in three dimensions, have the following appearance:




For this data set, the formula corresponding to the regression surface is:

Y = α + β₁X₁ + β₂X₂

Y = 9.305829 + 0.787255 X₁ - 0.04411 X₂

Next we have an animated graphic of the multiple linear regression in X₁ and X₂ represented by the formula:

Y = -2 + 2 X₁X₂

where X₁ and X₂ are varied from -10 to +10 and the three-dimensional graph is rotated around the vertical Y axis, which is why this kind of graph is known under the name "spin plot" (the image must be enlarged to see the animated action):

The modeling we have carried out can be extended to three variables, four variables, etc., so that we can obtain a multiple linear regression equation:

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + β₅X₅ + ... + βₙXₙ

Unfortunately, for more than two independent variables it is not possible to make a multidimensional plot, and instead of relying on our geometric intuition we have to trust our mathematical intuition. After some practice, we can abandon our reliance on graphical representations, extending what we have learned into a multidimensional world that we are not able to visualize, taking the crucial step of generalization, or abstraction, that allows us to dispense with particulars and still continue working as if nothing had happened. One important thing we have not yet mentioned is that for the case of two variables (as well as three or more variables), we have not taken into account the possible interaction effects that may exist between the independent variables. These interaction effects, which occur with some frequency in the field of practical applications, can be modeled most easily with a formula like this:
Y = β₀ + β₁X₁ + β₂X₂ + β₁₂X₁X₂

When there is no interaction between the variables, the parameter β₁₂ in this formula is zero. But if there is some kind of interaction, then depending on the magnitude of the parameter β₁₂ with respect to the other parameters β₀, β₁ and β₂, this interaction could be of such magnitude that it could even nullify the importance of the variable terms β₁X₁ and β₂X₂. This issue alone is large enough to require separate treatment in another section of this work.
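Although the full treatment is deferred, it may help to note that fitting the interaction model changes very little computationally: it adds just one extra column, X₁X₂, to the design matrix of the previous sketch. A minimal sketch (ours, with made-up data):

    import numpy as np

    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
    y  = np.array([4.0, 5.2, 7.9, 8.6, 12.8, 13.1])

    # Columns: 1, X1, X2 and the interaction product X1*X2.
    M = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    b0, b1, b2, b12 = np.linalg.lstsq(M, y, rcond=None)[0]
    # A b12 that is negligible next to b0, b1, b2 suggests no interaction.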


3: The least-squares parabola

Fitting data to a formula using a linear model works excellently when the data follow a linear trend. However, in many cases the data do not follow a linear trend. Consider as an example the following collection of ten data points:

If we plot this collection of ten data points, we get the following:

From the graph it is unclear how we could describe this collection of data with a linear empirical formula. We could force a straight line onto this data collection and carry out a regression analysis, blindly performing the usual mathematical calculations to obtain the "best" linear fit. However, the formula thus obtained might not be very useful for estimating what will happen at other values that have not been plotted.

If we get the idea that this data collection can best be described by a non-linear model, a polynomial

Y = a + bx + cx² + dx³ + ex⁴ + ...

the first thing that might occur to us would be to use a polynomial whose degree corresponds directly to the number of points on the graph. In this way, just as on a graph with only two points we would use a straight line to connect them, on a graph with three points we would use a quadratic polynomial of degree 2, on a graph with four points we would use a cubic polynomial of degree 3, and so on. This mathematical procedure is known as interpolation. The resulting curve would certainly pass exactly through each point of the graph, as shown below:



Let's look at two examples.
PROBLEM:
For an experiment of which only the three data points shown in the chart below are available:

What empirical formula can best fit these data?

Trying to perform a least-squares fit to obtain the regression line that best approximates the three data points shown in the graph would be a waste of time, since the points show no tendency to cluster in the vicinity of a straight line. However, we can try to carry out a fit here using a quadratic polynomial as a model, making the polynomial pass exactly through the three points:
P(X) = a₀ + a₁X + a₂X²
Substituting the three data pairs A(X₁, Y₁) = (1, 1), B(X₂, Y₂) = (2, 8) and C(X₃, Y₃) = (3, 2) into the quadratic polynomial:

1 = a₀ + a₁(1) + a₂(1)²

8 = a₀ + a₁(2) + a₂(2)²

2 = a₀ + a₁(3) + a₂(3)²

we obtain the following set of equations, which can be solved as simultaneous equations:

a₀ + a₁ + a₂ = 1

a₀ + 2a₁ + 4a₂ = 8

a₀ + 3a₁ + 9a₂ = 2

From these three equations we obtain as solution the following coefficients:

a₀ = -19, a₁ = 26.5, a₂ = -6.5

The quadratic formula that exactly models the three pairs of data is then:

P(X) = -19 + 26.5X - 6.5X²
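The same three simultaneous equations can be solved mechanically; a minimal sketch (ours) using NumPy:

    import numpy as np

    # Rows: the equations at X = 1, 2, 3; columns: a0, a1, a2.
    V = np.array([[1.0, 1.0, 1.0],
                  [1.0, 2.0, 4.0],
                  [1.0, 3.0, 9.0]])
    print(np.linalg.solve(V, [1.0, 8.0, 2.0]))   # -> [-19. , 26.5, -6.5]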


The graph of this quadratic formula superimposed on the three discrete points that produced it is:



If the data on which the quadratic formula was forced had been collected from real life, the difficulty with this method of exact fitting is that if we subsequently collect additional values of Y at other points of X, such as X = 1.5 and X = 2.5, those additional points cannot be used to refine the model, since its derivation does not support more than three pairs; in that case the gathering of additional data can only serve to confirm or reject the quadratic formula obtained, not to improve and refine it.

PROBLEM:
Carry out an exact fit of the following data:

X₁ = -1, Y₁ = 0

X₂ = 0, Y₂ = 0

X₃ = 1, Y₃ = 0.1

X₄ = 1.3, Y₄ = 1

to a cubic polynomial:

P(X) = a₀ + a₁X + a₂X² + a₃X³

There are as many data points as there are coefficients in the polynomial, which allows us to perform an exact fit; the fit could not be carried out this way if there were fewer data points than coefficients, or more data points than coefficients. To perform the exact fit, we simply substitute the pairs of values into the cubic polynomial, obtaining four equations that can be solved as simultaneous linear equations:

a₀ + a₁(-1) + a₂(-1)² + a₃(-1)³ = 0

a₀ + a₁(0) + a₂(0)² + a₃(0)³ = 0

a₀ + a₁(1) + a₂(1)² + a₃(1)³ = 0.1

a₀ + a₁(1.3) + a₂(1.3)² + a₃(1.3)³ = 1

From these four equations we obtain the following coefficients:

a₀ = 0, a₁ = -0.898, a₂ = 0.05, a₃ = 0.948
The cubic polynomial representing the four pairs of data is then:

P(X) = -0.898X + 0.05X² + 0.948X³
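As a quick check (ours), NumPy's polyfit of degree 3 on these four points reproduces the same interpolating cubic, since there are exactly as many coefficients as data points:

    import numpy as np

    x = np.array([-1.0, 0.0, 1.0, 1.3])
    y = np.array([0.0, 0.0, 0.1, 1.0])
    # Highest power first: approximately [0.948, 0.05, -0.898, 0.0]
    print(np.polyfit(x, y, 3))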


The graph of the cubic polynomial superimposed on the four discrete points that produced it is:




All the points fall exactly on the curve, as we had anticipated. An inspection of the curve shows that three of the points appear to be clustered around what looks like an almost horizontal line. The only discordant note is given by the point at X₄ = 1.3, which should make us reflect.

If the four points provided for the exact fit to a cubic polynomial had been obtained experimentally, the fact that three of the four points seem to be located around a straight line should make us ask ourselves: is there not a possibility that the fourth point, the one not near that line, is the result of some serious experimental error rather than an error of a statistical nature? Either way, it is necessary to keep an open mind to the possibility that the discordant data point is genuine, so that if we repeated the experiment only for that point we would again get a result close to the one obtained previously. As an alternative for solving this mystery, we can obtain more information by gathering experimental data at other points that had not been considered, for example X = -0.5 and X = +0.5. But in that case it is no longer possible to carry out an exact fit to a cubic formula; we would require in any case a fifth-degree polynomial. And if we collected eleven experimental data points, we would require a polynomial of tenth degree to carry out an exact fit, running the curve through all eleven points. Regardless of the mathematical complexity of handling polynomials of ever higher degree, there remains the fact that we are attaching too much importance to forcing the curve we are modeling to pass exactly through all the points, which flatly ignores the fact that experimental data always carry some measure of statistical "noise," a dose of random error that prevents them from falling precisely on a curve, even when there exists a theoretically derived curve capable of describing what we observe. On the other hand, the disadvantage of high-degree polynomials is their tendency to oscillate violently, not only outside the range of values considered in an experiment, but even in the buffer zones between the points at which the measurements were carried out. Note how, on the curve of the third-degree polynomial, for values below X = -1.5 the vertical value falls sharply, with something similar happening for values above X = +1.5, where the values of P(X) rise sharply. For polynomials of high degree, this oscillatory behavior can turn violent in a completely unpredictable manner, being a direct consequence of insisting on an exact modeling that makes the curve pass through all the experimental data.

The interpolation procedure is appropriate for problems solved analytically and exactly, which is not the case with experimental data, where the data rarely "fall" exactly on a value that could be considered ideal, where the dispersion of the data with respect to an "ideal" set is due to experimental error, and where it is meaningless to try to fit a certain amount of data exactly to a polynomial formula. That is why, just as we fit a linear formula to a collection of data that seem to follow a linear trend using the method of least squares, the same method of least squares can be extended so that it applies to polynomial formulas, allowing us to keep the degree of the polynomial under control without letting it grow disproportionately as additional pairs of points are added (in other words, under the criterion of least squares we can try to fit 101 pairs of points with a quadratic polynomial or a cubic polynomial, instead of being forced to resort to a polynomial of degree 100 if we insisted on carrying out an exact fit of the data to the formula we are developing). If a set of data pairs, when plotted, shows a clustering not around a straight line but around a curve, then as a first approximation we can try to carry out a "fit" to the most basic curve of all, the parabola, which in simple terms means attempting to fit the data to a quadratic polynomial as follows:

Y = a₀ + a₁X + a₂X²

A slight change of notation has been made in the parameters of the polynomial, in preparation for the eventual generalization of the least-squares "fit" of a curve to a polynomial of degree p.

Proceeding in exactly the same way as we did with the least-squares line, we take the difference between each actual value of Y = Y₁, Y₂, Y₃, ..., Yₙ and each value calculated for the corresponding Xᵢ using the least-squares quadratic equation, which gives us the vertical "distance" Dᵢ that separates the two values:

D₁ = a₀ + a₁X₁ + a₂X₁² - Y₁

D₂ = a₀ + a₁X₂ + a₂X₂² - Y₂

D₃ = a₀ + a₁X₃ + a₂X₃² - Y₃

...

Dₙ = a₀ + a₁Xₙ + a₂Xₙ² - Yₙ

And just as we did to find the least-squares line, here too we take the approach of seeking the quadratic polynomial such that the sum of the squares of the vertical distances between each of the "real" points and the points calculated according to the polynomial is a minimum. In short, we want to minimize the function:

S = [a₀ + a₁X₁ + a₂X₁² - Y₁]² + [a₀ + a₁X₂ + a₂X₂² - Y₂]² + [a₀ + a₁X₃ + a₂X₃² - Y₃]² + ... + [a₀ + a₁Xₙ + a₂Xₙ² - Yₙ]²

Since we now have three parameters instead of two, we must carry out three partial differentiations, which eventually lead us to the following set of three equations:

a₀N + a₁ ΣX + a₂ ΣX² = ΣY

a₀ ΣX + a₁ ΣX² + a₂ ΣX³ = ΣXY

a₀ ΣX² + a₁ ΣX³ + a₂ ΣX⁴ = ΣX²Y

This set of equations is known as the normal equations for the least-squares parabola. Again, we have a system of simultaneous equations in three unknowns, the parameters a₀, a₁ and a₂, which define the least-squares curve for a given set of data that seem to follow a second-degree trend.
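These three normal equations can be assembled and solved mechanically from the raw data. A minimal sketch (ours):

    import numpy as np

    def least_squares_parabola(x, y):
        # Build the sums that appear in the normal equations.
        x, y = np.asarray(x, float), np.asarray(y, float)
        n = len(x)
        sx, sx2, sx3, sx4 = (np.sum(x ** k) for k in (1, 2, 3, 4))
        sy, sxy, sx2y = np.sum(y), np.sum(x * y), np.sum(x * x * y)
        M = np.array([[n,   sx,  sx2],
                      [sx,  sx2, sx3],
                      [sx2, sx3, sx4]])
        return np.linalg.solve(M, [sy, sxy, sx2y])   # a0, a1, a2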
PROBLEM:
Fit, as appropriate, a least-squares line or parabola to the data given by the following table:

The first step required before attempting to fit a formula to a set of data is to put the data on a chart, to try to discover the trend shown by the data. In this case, the graph turns out to be:


Although at first sight our first impulse would be to attempt a fit using a least-squares line, the point on the chart at X = 0, if it really is not a mistake in taking a reading but a genuinely valid data point, should lead us to consider the possibility that the data, rather than being modeled by a straight line, may be modeled by a curve. And the simplest curve of all is provided by a second-degree polynomial, a quadratic polynomial. Using the normal equations derived above, the least-squares parabola turns out to be:

Y = 2.51 - 1.20X + 0.733X²

The plot of this curve, superimposed on the experimental data, appears as follows:

We can see that the fit of the data to a quadratic formula is quite good. Not only that, but we can detect the presence of what appears to be a minimum. This minimum could well be an optimal point for minimizing losses in an industrial process, for obtaining the highest degree of purity in a chemical process, or for achieving the highest-quality alloy. And we used the seven pairs of experimental data to carry out the modeling without having to resort to a polynomial of degree six, as we would if we insisted on an exact fit of the data. We can see immediately from the graph that the minimum point of the parabola is located approximately at X = 0.25, and we can get a better numerical approximation by taking the derivative of the least-squares parabola and setting that derivative equal to zero. Armed with this information, we can plan the conduct of a single additional experiment in which we give the variable X (which presumably is under our control) the value 0.25, in order to confirm whether there really is a minimum there. Note the bold step we are taking here. From a series of discrete points, after fitting the data to a formula, we are anticipating the existence of a minimum, and not only that but we are anticipating the area in which that minimum is located. This is precisely one of the goals of fitting a series of data to a formula: being able to use the formula to make predictions within the range studied, or even to extrapolate the formula outside the ranges studied.

PROBLEM:
To determine the value of the constant g, the acceleration due to gravity at the Earth's surface, a group of students conducted an experiment in which they measured the time it took an object to fall along different heights of a building, measuring preset distances against the elapsed time. The results were as follows:

Considering t as the independent variable and y as the dependent variable, what is the parabola that best fits these data? Knowing that the theoretical formula is y = ½gt², where g is the acceleration of gravity, obtain the value of g from these experimental data. Also calculate the heights that, according to the least-squares curve, the students should have obtained for each of the elapsed times.

The plot of the points obtained experimentally is as follows:

The graph shows that, within the error margins to be expected in any experiment, the data seem to fit a parabolic curve better than a straight line. Using the normal equations derived above, the least-squares parabola turns out to be:

y = 5.089t²
The continuous plot of this formula superimposed on the experimental data from which it was obtained is as follows:

If the theoretical formula for a body falling under gravity at the Earth's surface is y = ½gt², then the value of that acceleration g will be:

½g = 5.089

g = 10.178 m/s²

a value reasonably close to the accepted g = 9.8 m/s². The problem shows that a least-squares fit captures the "average" trend of the experimental data, and the more experimental data are taken, the better. This problem is representative of those problems in which a theoretical model has previously been derived to explain certain behavior of some natural phenomenon, and in which the purpose of fitting a set of data to a formula is to obtain a value for some constant, as in this case the acceleration of gravity at the Earth's surface.

http://www.amstat.org/publications/jse/v3n1/datasets.dickey.html
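Since the theoretical model y = ½gt² has no constant or linear term, the least-squares fit here really involves a single parameter c = ½g: minimizing S = Σ(ctᵢ² - yᵢ)² gives c = Σtᵢ²yᵢ / Σtᵢ⁴. A minimal sketch (ours, with made-up readings, since the students' table is not reproduced here):

    import numpy as np

    t = np.array([0.5, 1.0, 1.5, 2.0])      # elapsed times, s (hypothetical)
    y = np.array([1.3, 5.1, 11.6, 20.2])    # fallen heights, m (hypothetical)

    c = np.sum(t ** 2 * y) / np.sum(t ** 4)   # best fit of y = c * t**2
    print(c, 2.0 * c)                         # c = g/2, so g = 2*c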

PROBLEM:
In the investigation of automobile accidents, the total time required to bring a car to a complete stop after the driver has perceived a danger is composed of the reaction time (the time lag between the detection of the danger and the application of the brakes) plus the braking time (the time it takes the car to stop after the brakes are applied). The following table provides the stopping distance D, in feet, of a car traveling at various speeds V, in miles per hour, at the moment the driver detects a hazard.

Obtain the least-squares parabola of the form

D = a₀ + a₁V + a₂V²

describing the data set. Based on this formula, estimate the stopping distance D when the car is moving at 45 mph and at 80 mph.

The least-squares parabola is found to be:

D = 41.77 - 1.096V + 0.08786V²

The graph of this formula superimposed on the given data is as follows:

Based on this formula, the stopping distances when the car is moving at 45 mph and at 80 mph are:

D = 41.77 - 1.096(45) + 0.08786(45)²

D = 170 feet

D = 41.77 - 1.096(80) + 0.08786(80)²

D = 516 feet
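A quick check of this arithmetic (ours), evaluating the fitted parabola at both speeds:

    import numpy as np

    coeffs = [0.08786, -1.096, 41.77]        # a2, a1, a0 (np.polyval order)
    print(np.polyval(coeffs, 45.0))          # about 170 feet
    print(np.polyval(coeffs, 80.0))          # about 516 feet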


Note that in this problem, in calculating the stopping distance D for a speed V = 80 mph, we are extrapolating the data, going beyond the speed V = 70 miles per hour for which data were obtained, making a prediction that goes beyond what we might call our "comfort zone." There is always a risk in making such extrapolations, and more than one statistician has ended up in ridicule for making them, although in this case the good fit of the data to a quadratic formula should give us some reassurance that the actual outcome will not be far from what we predicted.

This problem is representative of those problems whose conclusions can have an impact that is even legal.
The procedure we have studied in this section can be extended to fit a set of data to a third-degree polynomial, a cubic polynomial, whose general representation is:

Y = a₀ + a₁X + a₂X² + a₃X³

We now proceed in the same way as we did with the least-squares parabola, taking the difference between each actual value of Y = Y₁, Y₂, Y₃, ..., Yₙ and each value calculated for the corresponding Xᵢ using what will be the least-squares cubic equation, which gives us the vertical "distance" Dᵢ that separates the two values:

D₁ = a₀ + a₁X₁ + a₂X₁² + a₃X₁³ - Y₁

D₂ = a₀ + a₁X₂ + a₂X₂² + a₃X₂³ - Y₂

D₃ = a₀ + a₁X₃ + a₂X₃² + a₃X₃³ - Y₃

...

Dₙ = a₀ + a₁Xₙ + a₂Xₙ² + a₃Xₙ³ - Yₙ

And just as we did to find the least-squares line, here too we seek the cubic polynomial such that the sum of the squares of the vertical distances between each of the "real" points and the points calculated according to the polynomial is a minimum. In short, we want to minimize the function:

S = [a₀ + a₁X₁ + a₂X₁² + a₃X₁³ - Y₁]² + [a₀ + a₁X₂ + a₂X₂² + a₃X₂³ - Y₂]² + [a₀ + a₁X₃ + a₂X₃² + a₃X₃³ - Y₃]² + ... + [a₀ + a₁Xₙ + a₂Xₙ² + a₃Xₙ³ - Yₙ]²

Now we have four parameters instead of three, which means we have to perform four partial differentiations, with respect to a₀, a₁, a₂ and a₃, which eventually lead us to four simultaneous equations. Solving these equations proceeds in exactly the same way as in the cases of the linear regression equation and the least-squares parabola, and will not be repeated here. The end result of all this is, as might be suspected, a set of normal equations for the cubic polynomial:

a₀N + a₁ ΣX + a₂ ΣX² + a₃ ΣX³ = ΣY

a₀ ΣX + a₁ ΣX² + a₂ ΣX³ + a₃ ΣX⁴ = ΣXY

a₀ ΣX² + a₁ ΣX³ + a₂ ΣX⁴ + a₃ ΣX⁵ = ΣX²Y

a₀ ΣX³ + a₁ ΣX⁴ + a₂ ΣX⁵ + a₃ ΣX⁶ = ΣX³Y

Note that the formation of the normal equations for polynomials of higher degree follows a definite pattern, and we can even formulate a "rule" giving the normal equations for any polynomial of degree n. However, for polynomials of degree greater than 4 this exercise becomes futile because of the excessive amount of repetitive arithmetic that would have to be carried out if we resorted directly to the normal equations expressed as above; this is the reason why we will feel the need to develop somewhat more sophisticated techniques that allow us to solve the normal equations in an abbreviated way.

Just as the technique for obtaining the least-squares line in a single independent variable X was extended to cover a multiple regression in two or more variables X₁, X₂, X₃, etc., the least-squares parabola can also be extended to carry out a fit to a formula with two or more variables in linear and quadratic terms. The simplest possible general multiple-regression formula involving linear and quadratic terms, with only two independent variables X₁ and X₂, and ignoring the possibility of interaction terms, is the following:

Y = α + β₁X₁ + β₂X₂ + β₁₁X₁² + β₂₂X₂²

Given the difficulties of visualizing the relationships that take place when we are handling or modeling quadratic multiple-regression formulas, the Department of Mathematics and Statistics at York University in Ontario, Canada, has made available to students and the academic community a page on which such relationships can be viewed dynamically (either by rotating in three dimensions the surfaces corresponding to a multiple regression, or by varying parameters such as the interaction terms) using animated GIF files generated with the help of the SAS software package, together with materials from the division of Academic Technology Services (ATS) of the University of California at Los Angeles (UCLA). This page can be consulted at the following address:

http://www.math.yorku.ca/SCS/spida/lm/visreg.html

From this page a three-dimensional graphics file has been taken for the following formula:

Y = 20 - 2X₁ + 2X₂ - 0.2X₁² - 0.2X₂²

The file is as follows (the version with animated effects can be obtained from the page cited above):

While the modeling of data with quadratic surfaces can be carried out by solving the set of normal equations produced by the mathematical model under consideration, the calculations can become cumbersome and even tedious when made by hand at this level of complexity, which is why another method is preferable, in which all we have to do is assemble a vector or matrix of values on which the calculations can be performed in a short series of steps with the help of a computer program that can handle vectors and matrices. This is precisely what we will see in the next section, where we will take up a general matrix method abridging the steps to be carried out for this type of modeling.
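As a preview of that matrix method, note that the quadratic-surface model above is still linear in its parameters, so a single design-matrix least-squares solve fits it. A minimal sketch (ours, with made-up data):

    import numpy as np

    # Made-up sample of (X1, X2, Y) trios for illustration only.
    x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = np.array([1.0, 3.0, 0.0, 4.0, 2.0, 5.0, 6.0])
    y  = np.array([20.9, 23.1, 16.5, 21.8, 19.0, 18.7, 15.2])

    # Columns: 1, X1, X2, X1**2, X2**2 -- the terms of the model.
    M = np.column_stack([np.ones_like(x1), x1, x2, x1 ** 2, x2 ** 2])
    alpha, b1, b2, b11, b22 = np.linalg.lstsq(M, y, rcond=None)[0]
    print(alpha, b1, b2, b11, b22)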