Why use Information Gain to select attributes in decision trees? What other criteria seem reasonable, and what are the tradeoffs in making this choice?
in Data Science & Statistics by Platinum (130,522 points) | 276 views

1 Answer

Best answer
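Information gain is the reduction in entropy produced by splitting a node on a given attribute, i.e. the entropy before the split minus the size-weighted average entropy of the subsets after it:

\[\text{Information Gain} = \text{Entropy(before)} - \sum_{j=1}^K \frac{N_j}{N}\,\text{Entropy(j, after)}\]

Where "before" is the dataset before the split, $K$ is the number of subsets generated by the split, $\text{(j, after)}$ is subset $j$ after the split, $N$ is the number of examples before the split, and $N_j$ is the number of examples in subset $j$.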
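
As a rough illustration of the definition, here is a minimal Python sketch that computes entropy and size-weighted information gain for a categorical attribute. The function names and the toy data are assumptions made for this example, not part of any particular library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    """Entropy before the split minus the size-weighted entropy after it."""
    n = len(labels)
    # Group the labels by the value the attribute takes on each example.
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    weighted_after = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted_after


# Toy data (hypothetical): four examples, two classes.
labels = ["yes", "yes", "no", "no"]
separating = ["a", "a", "b", "b"]   # splits the classes perfectly
useless = ["a", "b", "a", "b"]      # each subset is still a 50/50 mix

gain_separating = information_gain(labels, separating)  # 1 bit
gain_useless = information_gain(labels, useless)        # 0 bits
```

An attribute that separates the classes perfectly recovers the full 1 bit of entropy, while one that leaves every subset as mixed as the parent gains nothing.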

So this method focuses on how well a given attribute separates the training examples according to their target classification.

The compromise this method makes is that it is locally greedy: at each node it picks the split that maximizes information gain (equivalently, minimizes the weighted entropy of the resulting subsets), without looking ahead to later splits.

The "Greedy Approach" is a heuristic: make the locally optimal choice at each node and treat the sequence of those choices as an approximation of the globally optimal tree. That approximation is not always the best global solution.
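
One way to see the limitation is an XOR-style target, sketched below in Python under assumed toy data. Neither attribute has any information gain on its own at the root, so a greedy learner has no reason to prefer either, even though splitting on one and then the other classifies every example. The helper names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    n = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


# Toy data (hypothetical): the label is a XOR b.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [a ^ b for a, b in rows]

# At the root, each attribute alone looks worthless.
root_gain_a = information_gain(labels, [a for a, _ in rows])
root_gain_b = information_gain(labels, [b for _, b in rows])

# After splitting on a anyway, b separates the a == 0 branch perfectly.
branch = [(b, y) for (a, b), y in zip(rows, labels) if a == 0]
branch_gain_b = information_gain([y for _, y in branch], [b for b, _ in branch])
```

Here `root_gain_a` and `root_gain_b` are both zero, while `branch_gain_b` is a full bit: the useful structure only becomes visible one level deeper than the greedy criterion looks.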

Information gain (IG) is also biased toward attributes with a large number of distinct values (not attributes whose observations have large values). An attribute with many distinct values can divide the data into many small chunks, and the fewer observations a chunk contains, the less likely it is to contain a mix of classes. Small chunks therefore tend to have low entropy, which inflates the gain even when the attribute has little real predictive value.
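
The bias can be illustrated with a small Python sketch. The data below is made up for the example: `row_id` is a hypothetical identifier that is unique per example, and `weather` is a hypothetical two-valued attribute that is only weakly related to the label.

```python
from collections import Counter
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, attribute_values):
    n = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    return entropy(labels) - sum(len(s) / n * entropy(s) for s in subsets.values())


# Toy data (hypothetical): eight examples, balanced classes.
labels = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
weather = ["sun", "sun", "sun", "rain", "rain", "rain", "rain", "sun"]
row_id = list(range(len(labels)))  # unique per example, no predictive meaning

gain_weather = information_gain(labels, weather)
gain_row_id = information_gain(labels, row_id)
```

Because every `row_id` subset holds a single example, each subset has zero entropy and `gain_row_id` equals the full entropy of the labels, so it beats `gain_weather` despite being useless on unseen data.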
by Platinum (130,522 points)
