Chapter 6: Model Evaluation & Selection
⚖️ After building a model, you must evaluate how well it performs and select the best one among alternatives.
🔍 Why Is This Chapter Important?
- Avoid overfitting (too good on training data, bad on test data).
- Detect underfitting (poor on both training and test data).
- Choose the right model for your data.
- Improve the model through tuning and validation.
📌 Key Concepts Covered:
- Train/Test Split
- Cross-Validation
- Bias-Variance Trade-off
- Overfitting vs. Underfitting
- Hyperparameter Tuning
- Model Selection Techniques
🔹 1. Train-Test Split
Divide your dataset into two parts:
- Training Set (typically 80%) – used to train the model
- Test Set (typically 20%) – used to check how well the model generalizes to unseen data
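A minimal sketch of this split using scikit-learn's `train_test_split` (the Iris dataset here is just a stand-in for your own `X` and `y`):

```python
# 80/20 train-test split sketch with scikit-learn.
# Iris is used as a stand-in dataset; replace X, y with your own data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# test_size=0.2 keeps 20% aside for testing; random_state makes the split reproducible;
# stratify=y keeps class proportions similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # e.g. (120, 4) (30, 4)
```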
🔹 2. Cross-Validation
K-Fold Cross Validation splits data into k parts, trains on k-1 parts, tests on the remaining, and repeats.
🔁 Example: 5-Fold CV
- Divide the data into 5 equal parts (folds).
- Train on 4 folds, test on the remaining 1; repeat 5 times so each fold serves as the test set exactly once.
📌 Benefit: More reliable than a single train/test split, because the final score is averaged across all 5 folds.
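A quick sketch of 5-Fold CV with scikit-learn's `cross_val_score` (Logistic Regression and Iris are placeholders here):

```python
# 5-Fold Cross-Validation sketch: the data is split into 5 folds,
# the model is trained on 4 and scored on the 5th, five times in total.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```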
⚖️ 3. Bias-Variance Trade-off
| Term | Meaning | Problem |
|---|---|---|
| High Bias | Model too simple | Underfitting |
| High Variance | Model too complex | Overfitting |
🧠 Goal: Balance bias and variance to get the best generalization.
🧱 4. Overfitting vs Underfitting
| Type | Training Accuracy | Test Accuracy | Looks Like |
|---|---|---|---|
| Overfitting | High | Low | Memorized the data |
| Underfitting | Low | Low | Didn’t learn enough |
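To see the difference in practice, one sketch is to compare training vs. test accuracy at different model complexities, here with a Decision Tree on a synthetic dataset (the exact numbers are illustrative only):

```python
# Spotting under/overfitting by comparing train vs. test accuracy
# for a Decision Tree at increasing depths (synthetic data for illustration).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for depth in [1, 5, None]:  # very shallow, moderate, unrestricted
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42).fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")

# High train + low test accuracy -> overfitting; low train + low test -> underfitting.
```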
📌 Fix Overfitting:
- Get more training data
- Use a simpler model
- Apply regularization (see the sketch after these lists)
📌 Fix Underfitting:
- Add more (or better) features
- Use a more complex model
- Train longer
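As one example of the regularization fix, this sketch assumes scikit-learn's `LogisticRegression`, where a smaller `C` means stronger regularization (a simpler, less overfit model):

```python
# Effect of regularization strength on overfitting.
# In scikit-learn's LogisticRegression, smaller C = stronger regularization.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=50, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

for C in [100, 1, 0.01]:  # weak -> strong regularization
    clf = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    print(f"C={C}: train={clf.score(X_train, y_train):.2f}, "
          f"test={clf.score(X_test, y_test):.2f}")
```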
🧪 5. Evaluation Metrics Recap
For Classification:
- Accuracy
- Precision / Recall / F1 Score
- Confusion Matrix
- ROC-AUC Curve
For Regression:
- MSE (Mean Squared Error)
- RMSE (Root Mean Squared Error)
- MAE (Mean Absolute Error)
- R² Score (Coefficient of Determination)
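All of these are available in `sklearn.metrics`; the sketch below uses tiny made-up arrays purely to show the function calls:

```python
# Classification and regression metrics from sklearn.metrics (toy numbers for illustration).
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             confusion_matrix, roc_auc_score,
                             mean_squared_error, mean_absolute_error, r2_score)

# Classification
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
y_prob = [0.2, 0.9, 0.4, 0.1, 0.8]  # predicted probabilities, needed for ROC-AUC
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("ROC-AUC  :", roc_auc_score(y_true, y_prob))

# Regression
y_true_r = [3.0, 2.5, 4.0]
y_pred_r = [2.8, 2.7, 3.6]
mse = mean_squared_error(y_true_r, y_pred_r)
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("MAE :", mean_absolute_error(y_true_r, y_pred_r))
print("R^2 :", r2_score(y_true_r, y_pred_r))
```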
🔧 6. Hyperparameter Tuning
Hyperparameters are settings you configure before training the model (e.g., the number of trees in a forest, the learning rate, or k in KNN).
Techniques:
✅ Grid Search
Tries every combination of the given parameter values; exhaustive, but slow when the grid is large.
✅ Randomized Search
Randomly samples parameter combinations; much faster when the search space is large.
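A sketch of both techniques using scikit-learn's `GridSearchCV` and `RandomizedSearchCV`, tuning k in KNN (Iris is a stand-in dataset):

```python
# Grid Search vs. Randomized Search for KNN hyperparameters.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11], "weights": ["uniform", "distance"]}

# Grid Search: tries every combination (here 6 x 2 = 12 candidates, each cross-validated).
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)
print("Grid Search best params:", grid.best_params_, "score:", round(grid.best_score_, 3))

# Randomized Search: samples only n_iter candidates at random (faster for large spaces).
rand = RandomizedSearchCV(KNeighborsClassifier(), param_grid, n_iter=5,
                          cv=5, scoring="accuracy", random_state=42)
rand.fit(X, y)
print("Randomized Search best params:", rand.best_params_, "score:", round(rand.best_score_, 3))
```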
📌 Example Workflow:
1. Train/test split → baseline model
2. K-Fold Cross-Validation → more reliable performance estimate
3. Choose the right evaluation metric (e.g., accuracy for classification, RMSE for regression)
4. Tune hyperparameters with Grid Search or Randomized Search
5. Select the final model → deploy
💻 Hands-On Mini Project Ideas
✅ Compare Models:
- Train Logistic Regression, KNN, and SVM on the same dataset
- Use Cross-Validation to compare their average accuracy (see the sketch below)
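One possible way to set this up (the built-in breast cancer dataset is a stand-in; feature scaling is added in a pipeline since KNN and SVM are scale-sensitive):

```python
# Compare three classifiers on the same dataset with 5-Fold Cross-Validation.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")
```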
✅ Tune Model:
- Use GridSearch to tune k in KNN or max_depth in Decision Trees
🧠 Summary of Chapter 6
| Concept | Role |
|---|---|
| Train-Test Split | Initial testing of the model |
| Cross-Validation | Robust performance checking |
| Overfitting | Model too specific to the training data |
| Underfitting | Model too general (too simple) |
| Hyperparameter Tuning | Improves model performance |
| Model Metrics | Help compare models |
✅ Mini Assignment:
- Use 5-Fold Cross-Validation on a classifier and report the average accuracy.
- Build a Decision Tree and tune max_depth using GridSearch.
- Explain a case where your model was overfitting. How did you fix it?