You have a data set with p (number of variables) > n (number of observations). Why is OLS a bad option in this setting? Which techniques would be better to use, and why?
by Diamond (53,882 points) | 27 views

1 Answer

In such high-dimensional data sets, classical regression techniques break down because their assumptions fail. When p > n, the design matrix X has rank at most n < p, so X^T X is singular: the least-squares coefficient estimate is no longer unique, and the variance of the estimates blows up. OLS therefore cannot be used directly.
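As a minimal sketch of why this happens (using NumPy, with arbitrary dimensions chosen for illustration), we can check the rank of X^T X when there are fewer observations than variables:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50  # fewer observations than variables
X = rng.normal(size=(n, p))

# X^T X is p x p, but its rank is at most n < p, so it is singular:
# the normal equations X^T X beta = X^T y have no unique solution.
gram = X.T @ X
print(gram.shape)                    # (50, 50)
print(np.linalg.matrix_rank(gram))   # at most 20, well below 50
```

Because the Gram matrix cannot be inverted, the usual closed-form OLS solution beta = (X^T X)^{-1} X^T y does not exist.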

To handle this situation, we can use penalized regression methods such as the lasso, LARS, and ridge regression, which shrink the coefficients and thereby reduce variance. In particular, ridge regression works best in situations where the least-squares estimates would have high variance.
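A short sketch of these penalized fits, assuming scikit-learn is available (the penalty strengths `alpha` and the sparse true model are illustrative choices, not prescriptions):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(1)
n, p = 20, 50
X = rng.normal(size=(n, p))

# True model uses only the first 3 of the 50 variables.
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Ridge (L2 penalty) shrinks all coefficients toward zero;
# lasso (L1 penalty) can set many coefficients exactly to zero.
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print(np.count_nonzero(ridge.coef_))  # typically all p nonzero
print(np.count_nonzero(lasso.coef_))  # a sparse subset
```

Both fits are well defined even though p > n, because the penalty makes the underlying optimization problem strictly convex (ridge) or selects at most n variables (lasso).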

Other options include best-subset selection and forward stepwise regression.
by Wooden (3,542 points)

