Chapter 5: Unsupervised Learning

 


🧠 In Unsupervised Learning, the model discovers patterns or structures from data without any labels.
No answers are given — the algorithm finds hidden relationships on its own.


🔍 Key Idea:

“Group similar things together or reduce complexity without knowing the correct output.”


✳️ Types of Unsupervised Learning

| Type | Description | Use Case |
|---|---|---|
| Clustering | Group similar data points | Customer segmentation |
| Dimensionality Reduction | Reduce features while preserving information | Data visualization |
| Association Rule Learning | Find rules between variables | Market basket analysis |

🔹 1. Clustering

Grouping data points such that those in the same group are more similar to each other than to those in other groups.

a. K-Means Clustering

  • Partitions data into K clusters

  • Iteratively updates cluster centers (centroids)

```python
from sklearn.cluster import KMeans
import numpy as np

# Toy data: each row is one data point with two features
X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

model = KMeans(n_clusters=3, n_init=10, random_state=42)
model.fit(X)
print(model.labels_)  # cluster index assigned to each point
```

📌 Example: Grouping customers by spending habits


b. Hierarchical Clustering

  • Builds a tree (dendrogram) of clusters

  • Doesn’t require pre-specifying the number of clusters

📌 Example: Document or gene clustering
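The idea above can be sketched with scikit-learn's `AgglomerativeClustering` (a bottom-up hierarchical method); the data `X` here is a made-up toy example, not from the chapter:

```python
from sklearn.cluster import AgglomerativeClustering
import numpy as np

# Two obvious groups of points (toy data)
X = np.array([[1.0, 1.1], [1.2, 0.9], [8.0, 8.2], [8.1, 7.9]])

# Agglomerative = bottom-up: start with each point as its own cluster,
# then repeatedly merge the two closest clusters until n_clusters remain
model = AgglomerativeClustering(n_clusters=2)
labels = model.fit_predict(X)
print(labels)  # the first two points share one label, the last two another
```

To see the full dendrogram rather than a flat cut, `scipy.cluster.hierarchy.dendrogram` is the usual companion tool.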


🔹 2. Dimensionality Reduction

Reducing the number of features (columns) in the dataset while retaining the important information.

a. PCA (Principal Component Analysis)

  • Converts high-dimensional data into fewer dimensions (components)

```python
from sklearn.decomposition import PCA
import numpy as np

X = np.random.rand(100, 5)  # 100 samples, 5 features

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (100, 2)
```

📌 Used for: Visualization, noise removal, speeding up algorithms


b. t-SNE

  • Better for visualizing high-dimensional data

  • Preserves local structure (similar points stay close)

📌 Often used to visualize clusters in NLP or image features
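A minimal t-SNE sketch with scikit-learn's `TSNE` (the data is random, just to show the API; `perplexity` roughly controls how many neighbours each point "attends" to and must be smaller than the number of samples):

```python
from sklearn.manifold import TSNE
import numpy as np

X = np.random.RandomState(0).rand(50, 10)  # 50 points, 10 features

tsne = TSNE(n_components=2, perplexity=10, random_state=0)
X_2d = tsne.fit_transform(X)
print(X_2d.shape)  # (50, 2)
```

Unlike PCA, t-SNE has no `transform` for new data — it optimizes an embedding for the given points only, so it is a visualization tool rather than a general feature reducer.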


🔹 3. Association Rule Learning

Finds relationships between variables in large datasets.

a. Apriori Algorithm

  • Discovers frequent itemsets and rules

📌 Example: If a customer buys bread & butter, they’re likely to buy milk.
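The "frequent itemset" idea behind Apriori can be illustrated with a brute-force sketch in plain Python (the transactions are hypothetical; real Apriori additionally prunes candidate itemsets whose subsets are infrequent, and libraries such as mlxtend provide production implementations):

```python
from itertools import combinations

# Hypothetical transaction data: each set is one customer's basket
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

def frequent_itemsets(transactions, min_support=0.6, max_size=3):
    """Return itemsets whose support (fraction of transactions
    containing all their items) is at least min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            support = sum(set(combo) <= t for t in transactions) / n
            if support >= min_support:
                result[combo] = support
    return result

fs = frequent_itemsets(transactions)
print(fs)  # e.g. ('bread', 'butter') appears in 3 of 5 baskets -> support 0.6
```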

b. Eclat Algorithm

  • More memory-efficient than Apriori

  • Uses set intersections

📌 Used in: Market Basket Analysis, Recommender Systems
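Eclat's set-intersection idea can be shown in a few lines: store the data "vertically" as one transaction-ID set (tid-list) per item, so the support of an itemset is just the size of the intersection of its items' tid-lists. A toy sketch (hypothetical data):

```python
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
]

# Build vertical tid-lists: item -> set of transaction IDs containing it
tidlists = {}
for tid, items in enumerate(transactions):
    for item in items:
        tidlists.setdefault(item, set()).add(tid)

# Support of {bread, butter} = |tids(bread) ∩ tids(butter)|
support = len(tidlists["bread"] & tidlists["butter"])
print(support)  # 2 — both items appear together in transactions 0 and 1
```

Because intersecting two ID sets is cheap and the tid-lists shrink as itemsets grow, Eclat typically needs less memory than Apriori's repeated candidate-generation passes.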


📏 Evaluation Metrics in Unsupervised Learning

Since there are no ground-truth labels, evaluation is indirect.

| Metric | Use |
|---|---|
| Silhouette Score | How well-clustered the data points are |
| Inertia (for K-Means) | Measures compactness of clusters |
| Explained Variance (PCA) | How much information each component keeps |
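All three metrics are available in scikit-learn; a small sketch on synthetic data (two well-separated blobs, made up for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
import numpy as np

rng = np.random.RandomState(0)
# Two well-separated blobs of 50 points each
X = np.vstack([rng.randn(50, 2), rng.randn(50, 2) + 8])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Silhouette:", silhouette_score(X, km.labels_))  # closer to 1 = better separated
print("Inertia:", km.inertia_)  # sum of squared distances to nearest centroid; lower = tighter

pca = PCA(n_components=2).fit(X)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```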

💻 Hands-On Projects

✅ Customer Segmentation (K-Means)

  • Data: Age, Spending Score, Income

  • Output: Group customers into clusters
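A possible starting point for this project, using made-up customer rows (age, income, spending score); since the features are on different scales, standardizing before K-Means is the usual first step:

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import numpy as np

# Hypothetical customers: [age, annual income (k$), spending score]
customers = np.array([
    [25, 30, 80], [23, 28, 85], [40, 90, 20],
    [45, 95, 15], [35, 60, 50], [33, 58, 55],
])

# Standardize so no single feature dominates the distance calculation
X = StandardScaler().fit_transform(customers)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_)  # cluster (segment) assigned to each customer
```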

✅ Market Basket Analysis (Apriori)

  • Input: Transaction data

  • Output: Discover item combinations often bought together


🧠 Summary of Chapter 5

| Concept | Summary |
|---|---|
| Unsupervised Learning | No labels; model finds patterns |
| Clustering | K-Means, Hierarchical |
| Dimensionality Reduction | PCA, t-SNE |
| Association Rules | Apriori, Eclat |
| Hands-on | Customer groups, market rules |

✅ Mini Assignment:

  1. Use K-Means on a dataset with 2 features and visualize the clusters.

  2. Apply PCA to reduce a dataset to 2 dimensions.

  3. Try using the Apriori algorithm on transaction data.

homeacademy

Home Academy is JK's first e-learning platform, started by Er. Afzal Malik for competitive examinations and K-12 academics. We have a true desire to serve society by making educational content easy to understand. We specialize in STEM, conduct workshops in schools, and work on science and engineering projects. We also write theses for research work in physics, chemistry, biology, mechanical engineering, robotics, nanotechnology, materials science, industrial engineering, spectroscopy, and automotive technology, and we write content for coaching centers. Contact: infohomeacademy786@gmail.com
