Hello Everyone,
Welcome to the 31st edition of my newsletter ML & AI Cupcakes!
Today’s newsletter is a quick knowledge check of your understanding of the decision tree algorithm.
The decision tree algorithm is the basis for popular ensemble algorithms like random forest and XGBoost. If you want to understand how these ensemble algorithms work, you first need to understand decision trees well.
So, make sure you keep strengthening your decision tree fundamentals.
Without any further delay, let’s begin the test.
Good luck!
1. Which of the following is a node-splitting criterion used in decision trees?
A) Information gain
B) Gini impurity
C) Variance reduction
D) All of the above
2. Decision trees can be used for ______ .
A) Classification tasks only
B) Regression tasks only
C) Both classification and regression tasks
D) Neither classification nor regression tasks
3. What is the general impact of increasing tree depth on overfitting?
A) Increases overfitting
B) Decreases overfitting
C) No impact
4. Leaf nodes are also known as terminal nodes in a decision tree.
A) True
B) False
5. What is the Gini index of a perfectly pure node?
A) -1
B) 0
C) 0.5
D) 1
6. What is the default node-splitting criterion in DecisionTreeClassifier?
A) Gini index
B) Variance reduction
C) Mean squared error
D) Information gain
7. Can we change the default node-splitting criterion in DecisionTreeClassifier?
A) Yes
B) No
8. What does entropy do in the context of a decision tree?
A) Selects the depth of a tree
B) Decides the number of leaf nodes in a tree
C) Measures the impurity in an internal node
D) None of the above
9. Which of the following is a type of decision tree algorithm?
A) DBSCAN
B) C4.5
C) K-Means clustering
D) ADASYN
10. What is the entropy of a pure node in a binary classification problem?
A) -1
B) 0
C) 0.5
D) 1
11. When is the entropy highest for a binary classification problem with 20 observations?
A) When there is only one class in the dataset
B) When there is imbalance in the classes
C) When both the classes have equal number of observations
D) None of the above
12. Information gain and Gini index have the same mathematical formula.
A) True
B) False
13. Information gain and entropy are the same.
A) True
B) False
14. Which of the following is true when selecting features for splitting using the Gini index?
A) The feature with the lowest Gini index is chosen for splitting
B) The feature with the highest Gini index is chosen for splitting
15. Which of the following is true when selecting features for splitting using information gain?
A) The feature with the lowest information gain is chosen for splitting
B) The feature with the highest information gain is chosen for splitting
16. Which of the following is a parameter of DecisionTreeClassifier in scikit-learn?
A) max_depth
B) min_samples_split
C) max_features
D) All of the above
Answers
1. D  2. C  3. A  4. A  5. B  6. A  7. A  8. C  9. B  10. B  11. C  12. B  13. B  14. A  15. B  16. D
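If you want to double-check the impurity answers (questions 5, 10, and 11), here is a minimal sketch in plain Python. It computes the Gini impurity and the entropy of a binary node from the proportion p of one class; the function names are just for illustration.

```python
import math

def gini(p):
    """Gini impurity of a binary node with class proportion p."""
    return 1 - (p**2 + (1 - p)**2)

def entropy(p):
    """Entropy (in bits) of a binary node with class proportion p."""
    if p in (0, 1):  # a pure node contributes zero impurity
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

print(gini(1.0))     # 0.0 -> Gini index of a perfectly pure node (Q5)
print(entropy(1.0))  # 0.0 -> entropy of a pure node (Q10)
print(entropy(0.5))  # 1.0 -> entropy is highest when the 20 observations
                     #        split 10/10 between the two classes (Q11)
```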
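Similarly, for the scikit-learn answers (questions 6, 7, and 16): DecisionTreeClassifier uses the Gini criterion by default, the criterion parameter can be switched to "entropy", and max_depth, min_samples_split, and max_features are all constructor parameters. A quick sketch on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The default splitting criterion is "gini" (Q6)...
clf = DecisionTreeClassifier(random_state=42)
print(clf.criterion)  # "gini"

# ...and it can be changed (Q7), alongside the parameters from Q16.
clf = DecisionTreeClassifier(
    criterion="entropy",  # split on information gain instead of Gini
    max_depth=3,          # shallower trees overfit less (Q3)
    min_samples_split=4,  # need at least 4 samples to split a node
    max_features=2,       # consider at most 2 features per split
    random_state=42,
)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy of the fitted tree
```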
Writing each newsletter takes a lot of research, time, and effort. I just want to make sure it reaches as many people as possible and helps them grow in their AI/ML journey.
It would be great if you could share this newsletter with your network.
Also, please let me know your feedback and suggestions in the comments section. That will help me keep going. Even a “like” on my posts will tell me that they are helpful to you.
See you soon!
-Kavita