Get Help And Discuss STEM Concepts From Math To Data Science & Financial Literacy
STEM Gender Equality | Join us on ZOOM | Spreading Mathematical Happiness

MathsGee is free of annoying ads. We want to keep it like this. You can help with your DONATION

What are measures of association?
in Data Science by Diamond (50,934 points) | 47 views

1 Answer

Best answer

Measures of association indicate whether two variables are related. Two measures are commonly used:

  • Chi-square
  • Correlation


  • As a measure of association, the chi-square test is applied to nominal data, that is, data sorted into classes (e.g., gender [male, female] or type of job [unskilled, semi-skilled, skilled]), to determine whether two such variables are associated.*
  • A chi-square result is called significant when the data provide evidence of an association between the two variables (typically a p-value below a chosen threshold such as 0.05), and nonsignificant when they do not.

To test for associations, a chi-square is calculated in the following way: Suppose a researcher wants to know whether there is a relationship between gender and two types of jobs, construction worker and administrative assistant. To perform a chi-square test, the researcher counts the number of female administrative assistants, female construction workers, male administrative assistants, and male construction workers in the data. These observed counts are compared with the counts that would be expected in each category if there were no association between job type and gender. For each category, the expected count is the row total multiplied by the column total, divided by the overall number of observations. If the observed counts differ from the expected counts by more than chance alone would plausibly produce, the chi-square test is significant, which indicates there is an association between the two variables.
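The calculation described above can be sketched in a few lines of Python. This is only an illustration: the counts are invented for the example, the function name `chi_square_2x2` is hypothetical, no continuity correction is applied, and the p-value shortcut is valid only for a 2x2 table (1 degree of freedom).

```python
import math

def chi_square_2x2(observed):
    """Pearson chi-square test of association for a 2x2 table of counts.

    Returns (statistic, p_value, expected). No continuity correction.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count under "no association": row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all four cells
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected

# Hypothetical counts (not from any real data set):
# rows = female, male; columns = administrative assistant, construction worker
observed = [[30, 10],
            [15, 25]]

statistic, p_value, expected = chi_square_2x2(observed)
print("expected counts:", expected)
print("chi-square statistic:", round(statistic, 3))
print("significant at 0.05:", p_value < 0.05)
```

For larger tables or routine work, a statistics library would normally be used instead of a hand-written function.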

*The chi-square test can also be used as a goodness-of-fit test, to check whether a sample comes from a population with a specific distribution, as an alternative to the Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests. In that role, the chi-square test is not restricted to nominal data; with continuous data that must first be binned, however, the results depend on how the bins or classes are created and on the size of the sample.
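As a rough illustration of the goodness-of-fit use, the sketch below checks whether a die looks fair. The roll counts are invented, the function name `chi_square_gof` is hypothetical, and the decision uses the standard 5% critical value for 5 degrees of freedom (about 11.07) rather than computing a p-value.

```python
def chi_square_gof(observed, expected_probs):
    """Chi-square goodness-of-fit statistic for counts vs. hypothesized probabilities."""
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of faces 1..6 from 60 rolls of a die
rolls = [8, 12, 9, 11, 10, 10]

# Hypothesized distribution: a fair die, each face with probability 1/6
statistic = chi_square_gof(rolls, [1 / 6] * 6)

# 6 categories - 1 = 5 degrees of freedom; 5% critical value is about 11.07
CRITICAL_5PCT_DF5 = 11.07
print("chi-square statistic:", round(statistic, 3))
print("reject fairness at 5%:", statistic > CRITICAL_5PCT_DF5)
```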


  • A correlation coefficient is used to measure the strength of the relationship between numeric variables (e.g., weight and height).
  • The most common correlation coefficient is Pearson’s r, which measures linear association and can range from -1 to +1. Values near 0 indicate little or no linear relationship.
  • If the coefficient is between 0 and 1, then as one variable increases, the other tends to increase as well. This is called a positive correlation. For example, height and weight are positively correlated because taller people usually weigh more.
  • If the correlation coefficient is between -1 and 0, then as one variable increases, the other tends to decrease. This is called a negative correlation. For example, age and hours slept per night are negatively correlated because older people usually sleep fewer hours per night.
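To make the sign and range of Pearson’s r concrete, here is a minimal sketch that computes it directly from its definition (the covariance of the two variables divided by the product of their standard deviations). The function name `pearson_r` and the height and weight values are assumptions made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements: heights in cm and weights in kg
heights = [150, 160, 165, 172, 180, 188]
weights = [52, 58, 63, 70, 79, 85]

r = pearson_r(heights, weights)
print("Pearson r:", round(r, 3))
print("positive correlation:", r > 0)
```

A value of r close to +1 or -1 indicates a strong linear relationship; it does not by itself show that one variable causes the other.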
by Diamond (50,934 points)
