Chapter 6: Model Evaluation & Selection

⚖️ After building a model, you must evaluate how well it performs and select the best one among alternatives.


🔍 Why Is This Chapter Important?

  • Avoid overfitting (too good on training, bad on test).

  • Detect underfitting (poor on both training and test).

  • Choose the right model for your data.

  • Improve model through tuning and validation.


📌 Key Concepts Covered:

  1. Train/Test Split

  2. Cross-Validation

  3. Bias-Variance Trade-off

  4. Overfitting vs. Underfitting

  5. Evaluation Metrics Recap

  6. Hyperparameter Tuning & Model Selection


🔹 1. Train-Test Split

Divide your dataset into two parts:

  • Training Set (80%) – to train the model

  • Test Set (20%) – to test how well the model generalizes

python
from sklearn.model_selection import train_test_split

# Hold out 20% of the data for testing; the remaining 80% is used for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

🔹 2. Cross-Validation

K-Fold Cross Validation splits data into k parts, trains on k-1 parts, tests on the remaining, and repeats.

🔁 Example: 5-Fold CV

  • Divide data into 5 parts.

  • Train on 4, test on 1 — repeat 5 times.

python
from sklearn.model_selection import cross_val_score

# 5-fold cross-validation: returns one score per fold
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())

📌 Benefit: More reliable than a single train/test split.


⚖️ 3. Bias-Variance Trade-off

Term | Meaning | Problem
High Bias | Model too simple | Underfitting
High Variance | Model too complex | Overfitting

🧠 Goal: Balance bias and variance to get best generalization.
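
To see the trade-off in numbers, here is a minimal sketch (not part of the chapter's code; it assumes scikit-learn and a synthetic dataset from make_classification) that varies decision-tree depth, where very shallow trees lean toward high bias and very deep trees toward high variance:

python
# Illustrative sketch: training accuracy vs. cross-validated accuracy as model
# complexity (tree depth) grows. Data is synthetic; numbers are for illustration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = make_classification(n_samples=500, n_features=20, random_state=42)

for depth in (1, 3, 10, None):  # None lets the tree grow fully (most complex)
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    train_acc = tree.fit(X_demo, y_demo).score(X_demo, y_demo)
    cv_acc = cross_val_score(tree, X_demo, y_demo, cv=5).mean()
    print(f"max_depth={depth}: train accuracy={train_acc:.2f}, CV accuracy={cv_acc:.2f}")

Typically the training accuracy climbs toward 1.0 as depth grows while the cross-validated accuracy levels off or drops, which is the variance side of the trade-off.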


🧱 4. Overfitting vs Underfitting

Type | Training Accuracy | Test Accuracy | Looks Like
Overfitting | High | Low | Memorized the data
Underfitting | Low | Low | Didn't learn enough

📌 Fix Overfitting:

  • More data

  • Simpler model

  • Regularization (see the sketch after these lists)

📌 Fix Underfitting:

  • More features

  • A more complex model

  • Train longer
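
As a concrete example of the regularization fix, here is a hedged sketch (assumptions: scikit-learn, synthetic regression data; Ridge and its alpha value are illustrative choices, not the chapter's):

python
# Illustrative sketch: plain linear regression vs. Ridge (L2) regularization on
# noisy data with many features, where an unregularized model tends to overfit.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

X_reg, y_reg = make_regression(n_samples=100, n_features=80, noise=10.0, random_state=0)

for name, model in [("LinearRegression", LinearRegression()),
                    ("Ridge(alpha=10)", Ridge(alpha=10.0))]:
    r2 = cross_val_score(model, X_reg, y_reg, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.2f}")

The regularized model usually generalizes better in this setting because the alpha penalty shrinks the coefficients and reduces variance.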


🧪 5. Evaluation Metrics Recap

For Classification:

  • Accuracy

  • Precision / Recall / F1 Score

  • Confusion Matrix

  • ROC-AUC Curve

For Regression:

  • MSE (Mean Squared Error)

  • RMSE (Root Mean Squared Error)

  • MAE (Mean Absolute Error)

  • R² Score (Coefficient of Determination)
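
For reference, the sketch below (an illustration with tiny hand-made labels, not chapter data) shows where each of these metrics lives in scikit-learn's sklearn.metrics module:

python
# Sketch: computing the classification and regression metrics listed above.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             confusion_matrix, roc_auc_score,
                             mean_squared_error, mean_absolute_error, r2_score)

# Classification (y_proba = predicted probability of the positive class)
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
y_proba = [0.2, 0.9, 0.4, 0.1, 0.8]
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Precision/Recall/F1:", precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("ROC-AUC:", roc_auc_score(y_true, y_proba))

# Regression
y_true_r = [3.0, 5.0, 2.5, 7.0]
y_pred_r = [2.8, 5.4, 2.1, 6.5]
mse = mean_squared_error(y_true_r, y_pred_r)
print("MSE:", mse, "RMSE:", np.sqrt(mse),
      "MAE:", mean_absolute_error(y_true_r, y_pred_r),
      "R2:", r2_score(y_true_r, y_pred_r))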


🔧 6. Hyperparameter Tuning

Hyperparameters are settings you configure before training the model (like number of trees, learning rate, k in KNN).

Techniques:

✅ Grid Search

Try all combinations of given parameters.

python
from sklearn.model_selection import GridSearchCV

# Exhaustively try every combination in param_grid, scoring each with 5-fold CV
grid = GridSearchCV(model, param_grid, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_)

✅ Randomized Search

Randomly samples parameter combinations instead of trying them all, which is faster for large search spaces.
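
A minimal sketch of randomized search, assuming scikit-learn and using a RandomForestClassifier with illustrative parameter ranges (the model and ranges are examples, not prescriptions):

python
# Sketch: RandomizedSearchCV samples n_iter parameter combinations at random
# instead of trying every combination like GridSearchCV does.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

param_dist = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": [2, 5, 10],
}
search = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                            param_distributions=param_dist,
                            n_iter=10, cv=5, random_state=0)
search.fit(X_train, y_train)   # reuses the train split from earlier
print(search.best_params_, search.best_score_)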


📌 Example Workflow (sketched in code after the steps):

  1. Train/test split → baseline model

  2. K-Fold Cross-Validation → better estimate

  3. Choose the evaluation metric that fits the task (e.g., accuracy for classification, RMSE for regression)

  4. Tune hyperparameters with GridSearch

  5. Select final model → deploy
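
Putting the steps together, here is one possible end-to-end sketch (assumptions: scikit-learn, a decision-tree classifier, accuracy as the metric, and X, y already loaded as your features and labels; your data, model, and metric may differ):

python
# Sketch of the workflow: split -> baseline -> cross-validate -> tune -> final test.
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# 1. Train/test split and a baseline model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
baseline = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
print("Baseline test accuracy:", baseline.score(X_test, y_test))

# 2-3. Cross-validation on the training set for a more reliable estimate
print("CV accuracy:", cross_val_score(baseline, X_train, y_train, cv=5).mean())

# 4. Hyperparameter tuning with grid search
grid = GridSearchCV(DecisionTreeClassifier(random_state=42),
                    {"max_depth": [2, 4, 6, 8, None]}, cv=5)
grid.fit(X_train, y_train)

# 5. Final model, evaluated once on the held-out test set before deployment
final_model = grid.best_estimator_
print("Best params:", grid.best_params_)
print("Final test accuracy:", final_model.score(X_test, y_test))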


💻 Hands-On Mini Project Ideas

✅ Compare Models:

  • Train Logistic Regression, KNN, and SVM on the same dataset

  • Use Cross-Validation to compare average accuracy (see the sketch below)
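
One way to set this up, as a sketch (assumes scikit-learn, features X and labels y already loaded, and feature scaling folded into a pipeline for KNN and SVM):

python
# Sketch: compare three classifiers with 5-fold cross-validation.
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")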

✅ Tune Model:

  • Use GridSearch to tune K in KNN or max_depth in Decision Trees (see the sketch below)
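
A possible starting point for the tuning idea, as a sketch (assumes scikit-learn and the earlier train split; the parameter range is illustrative):

python
# Sketch: tune k (n_neighbors) in KNN with grid search.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {"n_neighbors": [1, 3, 5, 7, 9, 11]}
knn_grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
knn_grid.fit(X_train, y_train)
print(knn_grid.best_params_, knn_grid.best_score_)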


🧠 Summary of Chapter 6

Concept | Role
Train-Test Split | Initial testing of the model
Cross-Validation | Robust performance checking
Overfitting | Model too specific
Underfitting | Model too general
Hyperparameter Tuning | Improves model performance
Model Metrics | Help compare models

✅ Mini Assignment:

  1. Use 5-Fold Cross-Validation on a classifier and report average accuracy.

  2. Build a Decision Tree and tune max_depth using GridSearch.

  3. Explain a case where your model was overfitting. How did you fix it?
