Sunday, December 23, 2012

Correlation


The correlation is one of the most common and most useful statistics. A correlation is a single number that describes the degree of relationship between two variables. Let's work through an example to show you how this statistic is computed.

Correlation Example

Let's assume that we want to look at the relationship between two variables, height (in inches) and self esteem. Perhaps we have a hypothesis that how tall you are affects your self esteem (incidentally, I don't think we have to worry about the direction of causality here -- it's not likely that self esteem causes your height!). Let's say we collect some information on twenty individuals (all male -- we know that the average height differs for males and females so, to keep this example simple, we'll just use males). Height is measured in inches. Self esteem is measured as the average of ten 1-to-5 rating items (where higher scores mean higher self esteem). Here's the data for the 20 cases (don't take this too seriously -- I made this data up to illustrate what a correlation is):
Person  Height  Self Esteem
1       68      4.1
2       71      4.6
3       62      3.8
4       75      4.4
5       58      3.2
6       60      3.1
7       67      3.8
8       68      4.1
9       71      4.3
10      69      3.7
11      68      3.5
12      67      3.2
13      63      3.7
14      62      3.3
15      60      3.4
16      63      4.0
17      65      4.1
18      67      3.8
19      63      3.4
20      61      3.6
Now, let's take a quick look at the histogram for each variable:
[Figures: hist1.gif and hist2.gif, one histogram per variable]
And, here are the descriptive statistics:
Variable     Mean   StDev     Variance  Sum   Minimum  Maximum  Range
Height       65.4   4.40574   19.4105   1308  58       75       17
Self Esteem  3.755  0.426090  0.181553  75.1  3.1      4.6      1.5
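The figures in the descriptive statistics table can be reproduced with a few lines of Python. This is only a sketch: the variable names and the `describe` helper are made up for illustration, and it assumes the table reports the sample standard deviation and variance (dividing by N - 1), which is what the printed values are consistent with.

```python
import statistics

# The made-up data for the 20 cases, in person order.
height = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
          68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
self_esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
               3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]


def describe(values):
    """Return the summary statistics listed in the descriptive table."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),        # sample SD (divides by N - 1)
        "variance": statistics.variance(values),  # sample variance
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


height_stats = describe(height)
esteem_stats = describe(self_esteem)

for name, stats in (("Height", height_stats), ("Self Esteem", esteem_stats)):
    print(name, {key: round(value, 4) for key, value in stats.items()})
```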
Finally, we'll look at the simple bivariate (i.e., two-variable) plot:
[Figure: corrbv.gif, bivariate plot of height and self esteem]
You should immediately see in the bivariate plot that the relationship between the variables is a positive one (if you can't see that, review the section on types of relationships) because if you were to fit a single straight line through the dots it would have a positive slope or move up from left to right. Since the correlation is nothing more than a quantitative estimate of the relationship, we would expect a positive correlation.
What does a "positive relationship" mean in this context? It means that, in general, higher scores on one variable tend to be paired with higher scores on the other and that lower scores on one variable tend to be paired with lower scores on the other. You should confirm visually that this is generally true in the plot above.

Calculating the Correlation

Now we're ready to compute the correlation value. The formula for the correlation is:
$$r = \frac{N\sum xy - (\sum x)(\sum y)}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}$$
We use the symbol r to stand for the correlation. Through the magic of mathematics it turns out that r will always be between -1.0 and +1.0. If the correlation is negative, we have a negative relationship; if it's positive, the relationship is positive. You don't need to know how we came up with this formula unless you want to be a statistician. But you probably will need to know how the formula relates to real data -- how you can use the formula to compute the correlation. Let's look at the data we need for the formula. Here's the original data with the other necessary columns:
Person  Height (x)  Self Esteem (y)  x*y     x*x    y*y
1       68          4.1              278.8   4624   16.81
2       71          4.6              326.6   5041   21.16
3       62          3.8              235.6   3844   14.44
4       75          4.4              330     5625   19.36
5       58          3.2              185.6   3364   10.24
6       60          3.1              186     3600   9.61
7       67          3.8              254.6   4489   14.44
8       68          4.1              278.8   4624   16.81
9       71          4.3              305.3   5041   18.49
10      69          3.7              255.3   4761   13.69
11      68          3.5              238     4624   12.25
12      67          3.2              214.4   4489   10.24
13      63          3.7              233.1   3969   13.69
14      62          3.3              204.6   3844   10.89
15      60          3.4              204     3600   11.56
16      63          4                252     3969   16
17      65          4.1              266.5   4225   16.81
18      67          3.8              254.6   4489   14.44
19      63          3.4              214.2   3969   11.56
20      61          3.6              219.6   3721   12.96
Sum =   1308        75.1             4937.6  85912  285.45
The first three columns are the same as in the table above. The next three columns are simple computations based on the height and self esteem data. The bottom row consists of the sum of each column. This is all the information we need to compute the correlation. Here are the values from the bottom row of the table (where N is 20 people) as they are related to the symbols in the formula:
$N = 20$, $\sum xy = 4937.6$, $\sum x = 1308$, $\sum y = 75.1$, $\sum x^2 = 85912$, $\sum y^2 = 285.45$
Now, when we plug these values into the formula given above, we get the following (I show it here tediously, one step at a time):
[Figure: corrform3.gif, the values substituted into the formula step by step]
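Written out with the column sums from the table (N = 20), the substitution goes roughly as follows. The layout of the steps here is an assumption; only the sums and the final value of .73 come from the text, and the intermediate values are rounded.

```latex
\begin{aligned}
r &= \frac{N\sum xy - (\sum x)(\sum y)}
          {\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}} \\[4pt]
  &= \frac{20(4937.6) - (1308)(75.1)}
          {\sqrt{\left[20(85912) - (1308)^2\right]\left[20(285.45) - (75.1)^2\right]}} \\[4pt]
  &= \frac{98752 - 98230.8}
          {\sqrt{\left[1718240 - 1710864\right]\left[5709 - 5640.01\right]}} \\[4pt]
  &= \frac{521.2}{\sqrt{(7376)(68.99)}}
   = \frac{521.2}{\sqrt{508870.24}} \\[4pt]
  &\approx \frac{521.2}{713.35} \approx 0.73
\end{aligned}
```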
So, the correlation for our twenty cases is .73, which is a fairly strong positive relationship. I guess there is a relationship between height and self esteem, at least in this made-up data!
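If you would rather let a computer do the tedious part, the same calculation can be sketched in Python. The function name `pearson_r` and the variable names are illustrative only; the code simply builds the column sums (x*y, x*x, y*y) and plugs them into the computational formula for r.

```python
import math

# The made-up data for the 20 cases, in person order.
height = [68, 71, 62, 75, 58, 60, 67, 68, 71, 69,
          68, 67, 63, 62, 60, 63, 65, 67, 63, 61]
self_esteem = [4.1, 4.6, 3.8, 4.4, 3.2, 3.1, 3.8, 4.1, 4.3, 3.7,
               3.5, 3.2, 3.7, 3.3, 3.4, 4.0, 4.1, 3.8, 3.4, 3.6]


def pearson_r(x, y):
    """Correlation from the column sums used in the worked example."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))  # the x*y column
    sum_xx = sum(a * a for a in x)             # the x*x column
    sum_yy = sum(b * b for b in y)             # the y*y column
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) *
                            (n * sum_yy - sum_y ** 2))
    return numerator / denominator


r = pearson_r(height, self_esteem)
print(round(r, 2))  # prints 0.73
```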

Testing the Significance of a Correlation

Once you've computed a correlation, you can determine the probability that the observed correlation occurred by chance. That is, you can conduct a significance test. Most often you are interested in determining the probability that the correlation is a real one and not a chance occurrence. In this case, you are testing the mutually exclusive hypotheses:
Null Hypothesis: r = 0
Alternative Hypothesis: r ≠ 0
The easiest way to test this hypothesis is to find a statistics book that has a table of critical values of r. Most introductory statistics texts have a table like this. As in all hypothesis testing, you first need to determine the significance level. Here, I'll use the common significance level of alpha = .05. This means that, if there were truly no relationship, a correlation as large as the critical value would turn up by chance no more than 5 times out of 100. Before I look up the critical value in a table I also have to compute the degrees of freedom, or df. The df is simply equal to N-2 or, in this example, 20-2 = 18. Finally, I have to decide whether I am doing a one-tailed or two-tailed test. In this example, since I have no strong prior theory to suggest whether the relationship between height and self esteem would be positive or negative, I'll opt for the two-tailed test. With these three pieces of information -- the significance level (alpha = .05), degrees of freedom (df = 18), and type of test (two-tailed) -- I can now test the significance of the correlation I found. When I look up this value in the handy little table at the back of my statistics book I find that the critical value is .4438. This means that if my correlation is greater than .4438 or less than -.4438 (remember, this is a two-tailed test) I can conclude that a correlation this large would occur by chance fewer than 5 times out of 100 if the true correlation were zero. Since my correlation of .73 is actually quite a bit higher, I conclude that it is not a chance finding and that the correlation is "statistically significant" (given the parameters of the test). I can reject the null hypothesis and accept the alternative.
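If you don't have a table of critical values of r handy, the same decision can be sketched from a t table instead. This is not the lookup described above but an equivalent route: it relies on the standard relationship t = r * sqrt(df / (1 - r^2)), and it assumes a two-tailed critical t of 2.101 for df = 18 at alpha = .05 (a value taken from a standard t table, not from the text). The function names are illustrative.

```python
import math


def t_statistic(r, n):
    """Convert a correlation from n cases into a t value with n - 2 df."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2))


def critical_r(t_crit, df):
    """Smallest |r| that reaches the given critical t for df degrees of freedom."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


N = 20
R = 0.73
T_CRIT = 2.101  # assumed t-table value: two-tailed, alpha = .05, df = 18

df = N - 2
t = t_statistic(R, N)
r_crit = critical_r(T_CRIT, df)
significant = abs(R) > r_crit

print(f"df = {df}, t = {t:.2f}, critical r = {r_crit:.4f}")
print("reject the null hypothesis" if significant else "fail to reject the null hypothesis")
```

The critical r that comes out of this conversion agrees with the .4438 read from the table, so both routes lead to the same conclusion for these data.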

Thursday, December 20, 2012

B.COM PART 1: ENGLISH - IMPORTANT ESSAYS:



*Terrorism In Pakistan: Its Causes, Impacts And Remedies

*Democracy

*Flood: The worst calamity

*Interest free banking

*Power crises in Pakistan

*Inflation

*Street crimes and remedies

*Importance of commerce education

B.COM PART 1 PRINCIPLES OF ACCOUNTING MAY BE POSTPONED!



ATTENTION B.COM STUDENTS

JOIN KHALID AZIZ

B.COM PART 1 & 2

ACCOUNTING, STATISTICS & ECONOMICS OF PART 1

ADVANCED & COST ACCOUNTING, AUDITING, MANAGEMENT & BUSINESS LAW OF PART 2

GUARANTEED COMPLETION OF SYLLABUS.

QUALIFIED AND PROFESSIONAL TEACHER.

CONTACT:

KHALID AZIZ

0300-2540827


ADMISSION ALERT B.COM PART 1



Only students who have passed one of the following will be eligible for admission to the B.Com. Part-1 class:
i) Intermediate in Commerce OR
ii) Higher Secondary with Commerce OR
iii) Intermediate Arts/Higher Secondary Arts Group with Economics OR
iv) Higher Secondary or Intermediate Arts/Science/Home Economics in at least Second Division OR
v) Diploma in Commerce, Diploma of Business Administration, Diploma of Associate Engineer of the Technical Education Board OR
vi) Intermediate Agriculture with Economics OR
vii) Intermediate Science with Mathematics.
