XGBoost (Part 2): Machine Learning Interview Prep 17

Shahidullah Kawsar
5 min read · Dec 19, 2023


eXtreme Gradient Boosting, better known as XGBoost, is a powerful Machine Learning algorithm. It works by building decision trees sequentially, each new tree correcting the errors of the trees before it. XGBoost is widely used because it’s highly efficient and often outperforms other algorithms in predictive accuracy. It’s like a team of experts collaborating to make better decisions, resulting in superior predictive models for tasks like classification and regression.
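Before the quiz, here’s a minimal, purely illustrative sketch of how XGBoost is typically used from Python via its scikit-learn-style wrapper; the synthetic dataset and every parameter value below are examples, not recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Example hyperparameter values; real projects would tune these.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```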

Photo: DuBois Park, Jupiter, Florida, USA Credit: Tasnim and Kawsar

Let’s check your basic knowledge of XGBoost. Here are 10 multiple-choice questions for you, and there’s no time limit. Have fun!

1. The XGBoost hyperparameter ‘subsample’ helps prevent overfitting. What does the ‘subsample’ parameter in XGBoost control?
(A) The size of the feature set
(B) The fraction of the total dataset to be randomly sampled for each tree
(C) The learning rate
(D) The number of trees in the model
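For question 1, here’s where that parameter would be set with the scikit-learn-style wrapper of the xgboost package; the 0.8 is an arbitrary example value, not a recommendation:

```python
from xgboost import XGBClassifier

# subsample < 1.0: each boosting round grows its tree on a random fraction
# of the training rows (0.8 here is just an example value).
model = XGBClassifier(subsample=0.8, n_estimators=200)
```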

2. When a tree is constructed in XGBoost, it keeps splitting nodes, producing a more complex model. If a split yields only an insignificant gain (i.e., little loss reduction), keeping it can lead to overfitting. The ‘gamma’ parameter in XGBoost constrains the complexity of the trees. What does the ‘gamma’ parameter in XGBoost control?
(A) Learning rate
(B) Minimum loss reduction required for a split
(C) Maximum depth of trees
(D) Number of boosting rounds
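A quick, illustrative sketch of where question 2’s parameter goes; the value 1.0 is arbitrary:

```python
from xgboost import XGBClassifier

# gamma (alias: min_split_loss) is a non-negative threshold; larger values make
# the algorithm more conservative about keeping candidate splits.
model = XGBClassifier(gamma=1.0, max_depth=6)
```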

3. The ‘eta’ parameter in XGBoost determines the step size at each iteration while moving toward a minimum of a loss function. In XGBoost, what does the ‘eta’ parameter represent?
(A) The learning rate
(B) The number of trees
(C) The depth of trees
(D) The regularization term
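For question 3, a small sketch using the native xgboost API, where the parameter appears under its own name (the scikit-learn wrapper exposes it as learning_rate); the data and all values are illustrative:

```python
import numpy as np
import xgboost as xgb

# Tiny synthetic DMatrix just so the training call below is runnable.
rng = np.random.default_rng(0)
dtrain = xgb.DMatrix(rng.normal(size=(200, 5)), label=rng.integers(0, 2, size=200))

# 'eta' is the native-API name; 0.1 is an arbitrary example value.
params = {"objective": "binary:logistic", "eta": 0.1}
booster = xgb.train(params, dtrain, num_boost_round=100)
```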

4. The ‘lambda’ parameter in XGBoost is used for one kind of regularization on the weights, which helps to reduce overfitting by penalizing complex models. What is the purpose of the ‘lambda’ parameter in XGBoost?
(A) To specify the learning rate
(B) To control the tree depth
(C) To add L1 (Lasso) regularization
(D) To add L2 (Ridge) regularization
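For question 4, a minimal sketch; in the scikit-learn wrapper the native ‘lambda’ and ‘alpha’ parameters appear as reg_lambda and reg_alpha, and the values below are arbitrary examples:

```python
from xgboost import XGBClassifier

# reg_lambda applies an L2 penalty to the leaf weights; reg_alpha is the
# L1 counterpart. Both default to conservative values and are tunable.
model = XGBClassifier(reg_lambda=1.0, reg_alpha=0.0)
```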

5. The ‘scale_pos_weight’ parameter in XGBoost gives more weight to the positive (minority) class, balancing the contribution of positive and negative examples. XGBoost’s ‘scale_pos_weight’ parameter is useful in which scenario?
(A) Multi-class classification
(B) Regression tasks
(C) Handling imbalanced classification datasets
(D) Time-series analysis
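For question 5, a small illustrative sketch on a synthetic imbalanced dataset; the ratio-based setting follows the heuristic suggested in the XGBoost documentation:

```python
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

# Illustrative imbalanced dataset: roughly 95% negatives, 5% positives.
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)

# The docs suggest sum(negatives) / sum(positives) as a starting point.
neg, pos = np.bincount(y)
model = XGBClassifier(scale_pos_weight=neg / pos).fit(X, y)
```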

6. The XGBoost algorithm uses decision trees as the base learners. There is a parameter in XGBoost that sets the number of gradient-boosted trees and directly impacts the computation time. More trees will generally increase the training time. A higher value can lead to overfitting if not controlled. In XGBoost, which parameter directly impacts the computation time for training a model?
(A) ‘n_estimators’
(B) ‘learning_rate’
(C) ‘max_depth’
(D) ‘subsample’
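For question 6, a rough, illustrative timing experiment on synthetic data (exact timings will vary by machine and xgboost version):

```python
import time
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Training time grows roughly in proportion to the number of boosted trees.
for n in (50, 200, 800):  # illustrative values
    start = time.perf_counter()
    XGBClassifier(n_estimators=n).fit(X, y)
    print(f"{n} trees: {time.perf_counter() - start:.2f} s")
```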

7. Compared to Logistic Regression (linear model), how does XGBoost (tree-based model) handle feature interactions?
(A) XGBoost cannot capture feature interactions.
(B) Both capture feature interactions equally well.
(C) Logistic Regression is better at capturing feature interactions.
(D) XGBoost is better at capturing non-linear feature interactions.
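For question 7, a tiny illustrative experiment on an XOR-style target, where the label depends only on how two features interact:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# XOR-style target: neither feature is predictive on its own.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 2)).astype(float)
y = np.logical_xor(X[:, 0], X[:, 1]).astype(int)

print("logistic regression:", LogisticRegression().fit(X, y).score(X, y))          # ~0.5
print("xgboost:", XGBClassifier(max_depth=2, n_estimators=50).fit(X, y).score(X, y))  # ~1.0
```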

8. XGBoost is an optimized distributed gradient boosting library designed to efficiently handle large-scale and high-dimensional data. What are the other benefits of using XGBoost?
(A) Efficient memory usage, faster and more accurate prediction
(B) Better handling of missing data
(C) Support for distributed computing, and parallel processing
(D) All of the above
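For question 8, a small sketch showing that rows with missing values can be passed to XGBoost directly as NaNs, with no imputation step (synthetic data, illustrative only):

```python
import numpy as np
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X[np.random.default_rng(0).random(X.shape) < 0.1] = np.nan  # blank out ~10% of the values

# No imputation step: XGBoost learns a default branch for missing values at each split.
model = XGBClassifier().fit(X, y)
print("training accuracy:", model.score(X, y))
```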

9. How can you prevent overfitting in XGBoost?
(A) L1 (Lasso) regularization (parameter: alpha) encourages the model to use a sparse set of features, while L2 (Ridge) regularization (parameter: lambda) encourages the model to use all the features but with smaller weights.
(B) Early stopping helps prevent overfitting by stopping training once the validation performance no longer improves.
(C) Reducing the learning rate (parameter: eta). The learning rate determines the step size at each iteration while moving toward a minimum of a loss function. A lower learning rate makes the optimization process more robust but slower.
(D) All of the above
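To see the ideas in question 9 together in code, here’s an illustrative sketch; note that setting early_stopping_rounds in the constructor assumes a reasonably recent xgboost release (roughly 1.6 or newer), and all values are arbitrary examples:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Lower learning rate + L1/L2 penalties + early stopping on a validation set.
model = XGBClassifier(
    n_estimators=1000,
    learning_rate=0.05,
    reg_alpha=0.1,
    reg_lambda=1.0,
    early_stopping_rounds=20,
    eval_metric="logloss",
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
print("stopped at round:", model.best_iteration)
```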

10. Which statement is correct about XGBoost?
(A) XGBoost uses gradient boosting while Random Forest uses bagging, which means that XGBoost creates new models that focus on correcting the errors of the previous models, while Random Forest creates independent decision trees.
(B) XGBoost can handle missing values by assigning them a default direction during the splitting process based on the distribution of the non-missing values.
(C) XGBoost is called “gradient boosting” because it minimizes a loss function by iteratively adding new models that minimize the negative gradient of the loss function.
(D) All of the above
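To illustrate what “fitting new models to the negative gradient of the loss” looks like, here’s a toy gradient-boosting loop for squared-error loss, written with plain scikit-learn decision trees. This is a simplified sketch of the general idea, not XGBoost’s actual implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression problem, purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
pred = np.zeros_like(y)                        # start from a constant prediction
for _ in range(100):
    residuals = y - pred                       # negative gradient of squared-error loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    pred += learning_rate * tree.predict(X)    # each new tree corrects the ensemble so far

print("training MSE:", np.mean((y - pred) ** 2))
```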

The solutions will be published in the next blog, Support Vector Machine (SVM) Part 2: Machine Learning Interview Prep 18.

Happy learning! If you like the questions and enjoy taking the test, please subscribe to my email list for the latest ML questions, follow my Medium profile, and leave a clap for me. Feel free to discuss your thoughts on these questions in the comment section. Don’t forget to share the quiz link with your friends or LinkedIn connections. If you want to connect with me on LinkedIn, here’s my LinkedIn profile.

The solutions to the previous quiz, XGBoost (Part 1): Machine Learning Interview Prep 16, are: 1(B), 2(A), 3(D), 4(A), 5(C), 6(A), 7(A), 8(B), 9(C), 10(A, C)

