The previous chapter covered supervised and unsupervised machine learning models, as well as regression. If you have not read it yet, please start there, since it will make this chapter on Classification in Machine Learning easier to follow.
What is Classification in Machine Learning
Classification is a type of supervised learning in machine learning. It is like sorting things into categories or groups. For example, suppose you have a bunch of fruits and you want to teach a computer to tell whether each fruit is an apple or a banana. That is what classification does: it makes computers capable of grouping things based on their features.
How Does Classification in Machine Learning Work?
Think of a teacher guiding a student. The teacher shows the student examples of apples and bananas and tells them which is which. The student learns from these examples. Then the teacher gives the student a new fruit they have never seen before and asks whether it is an apple or a banana. The student uses what they learned to make a guess. Similarly, a classifier in machine learning learns from labeled examples and then predicts the category of new things it has not seen before.
That should give you a basic idea of classification. Don't worry if it still feels abstract; we will also implement this concept in Python.
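The teacher and student analogy can be sketched in a few lines of scikit-learn. This is a minimal illustration, not a realistic model: the fruit measurements (length in cm, weight in grams) are made up for the example, and a decision tree is just one of several classifiers that could play the student.

```python
from sklearn.tree import DecisionTreeClassifier

# Made-up training examples: [length_cm, weight_g] for each fruit
features = [
    [7.0, 150], [8.0, 170], [7.5, 160],     # apples
    [18.0, 120], [20.0, 130], [19.0, 125],  # bananas
]
labels = ["apple", "apple", "apple", "banana", "banana", "banana"]

# "Teaching" step: the classifier learns from labeled examples
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(features, labels)

# "Quiz" step: guess the category of fruits it has never seen
new_fruits = [[19.5, 128], [7.2, 155]]
predictions = classifier.predict(new_fruits)
print(predictions)
```

The long, lighter fruit should come back as a banana and the short, heavier one as an apple, because the made-up examples separate cleanly on either feature.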
Benefits of Classification in Machine Learning
Classification has numerous advantages. To keep things short, we look at four of the most important:
- Informed Decision Making
- Pattern Recognition
- Efficiency
- Predictive Insights
Informed Decision Making
Classification aids in making well-informed decisions by predicting outcomes.
Pattern Recognition
It identifies patterns and trends that might not be obvious.
Efficiency
Classification automates the sorting and categorization of large datasets, which saves a great deal of manual effort.
Predictive Insights
It provides insights into future scenarios based on historical data.
That covers the main benefits of classification.
Applications of Classification in Machine Learning
Classification in Machine Learning has many applications. Some of them are listed below.
- Loan Default Prediction
- Medical Diagnosis
- Image Recognition
- Email Filtering
- Customer Churn Prediction
- Speech Recognition
Loan Default Prediction
Predicting loan repayment likelihood based on historical loan default data.
Medical Diagnosis
Diagnosing diseases by analyzing patient symptoms and medical tests.
Image Recognition
Identifying objects within images, such as facial recognition.
Email Filtering
Sorting emails into “spam” or “not spam” categories.
Customer Churn Prediction
Predicting if a customer will switch to a competitor.
Speech Recognition
Converting spoken language into text, enabling voice assistants.
Classification Algorithms in Machine Learning
Many algorithms can be used for classification. Some of the most common are listed below.
- Decision Trees
- Naive Bayes
- Linear Discriminant Analysis (LDA)
- k-Nearest Neighbors (k-NN)
- Logistic Regression
- Neural Networks
- Support Vector Machines (SVM)
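Python Implementation of the Algorithms
Below are implementations of the algorithms commonly used in classification. Each example uses only a few placeholder rows, so after the 80/20 split the test set holds a single sample and the printed accuracy can only be 0.0 or 1.0. Extend the data lists before reading anything into the scores.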
Decision Trees in Python
```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Decision Tree Accuracy: {accuracy}")
```
Output
Decision Tree Accuracy: 0.0
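A decision tree is easier to trust when you can read the rules it learned. The sketch below uses scikit-learn's `export_text` helper to print them. The six rows are hypothetical values invented for illustration, in the same age, income, education layout as the sample data.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical rows (age, income, education) and default labels, for illustration only
X = [
    [25, 50000, 1],
    [35, 75000, 0],
    [45, 90000, 1],
    [23, 30000, 0],
    [52, 110000, 1],
    [29, 42000, 0],
]
y = [1, 0, 0, 1, 0, 1]

classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X, y)

# export_text renders the learned if/else rules as plain text
rules = export_text(classifier, feature_names=["age", "income", "education"])
print(rules)
```

Each indented line in the printout is one branch of the tree, ending in the class the tree assigns to rows that reach it.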
Naive Bayes
```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample email data and labels (1 for spam, 0 for not spam)
emails = [
    ("Get a free iPhone now!", 1),
    ("Meeting agenda for today", 0),
    # ... more data
]

X = [email[0] for email in emails]
y = [email[1] for email in emails]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = MultinomialNB()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Naive Bayes Accuracy: {accuracy}")
```
Output
Naive Bayes Accuracy: 0.0
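Once a spam filter like this is trained, the usual next step is to classify messages it has never seen. Here is a minimal sketch of that step. The six training messages and the two new ones are hypothetical, written only for this illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical messages, for illustration only (1 = spam, 0 = not spam)
emails = [
    ("Get a free iPhone now!", 1),
    ("Win a free prize today", 1),
    ("Claim your free gift now", 1),
    ("Meeting agenda for today", 0),
    ("Project status report attached", 0),
    ("Lunch meeting moved to Friday", 0),
]
texts = [text for text, _ in emails]
labels = [label for _, label in emails]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)  # learn the vocabulary and count words

classifier = MultinomialNB()
classifier.fit(X, labels)

# New messages go through transform(), not fit_transform(),
# so they are encoded with the vocabulary learned above
new_messages = ["Free prize waiting, claim now", "Agenda for the project meeting"]
new_X = vectorizer.transform(new_messages)
predictions = classifier.predict(new_X)
print(predictions)
```

Words the vectorizer never saw during training (such as "waiting") are simply ignored, so the prediction rests on the words both messages share with the training set.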
Linear Discriminant Analysis (LDA)
```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 0],
    [40, 80000, 0, 1],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = LinearDiscriminantAnalysis()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"LDA Accuracy: {accuracy}")
```
Output
LDA Accuracy: 0.0
k-Nearest Neighbors (k-NN)
```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), class (0, 1, 2 for different classes)
data = [
    [25, 50000, 1, 0],
    [35, 75000, 0, 1],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decreased n_neighbors to 1 for a small dataset
classifier = KNeighborsClassifier(n_neighbors=1)
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"k-NN Accuracy: {accuracy}")
```
Output
k-NN Accuracy: 0.0
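k-NN picks a class by measuring distances between rows, so a column with large values (income, in the tens of thousands) can drown out small ones (age, education). One common remedy is to standardize the features first. The sketch below compares the two approaches on hypothetical rows made up for this illustration; `StandardScaler` and `make_pipeline` are standard scikit-learn utilities.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical rows (age, income, education) with made-up class labels
X = [
    [25, 50000, 1],
    [27, 52000, 1],
    [60, 50200, 0],
    [58, 52500, 0],
]
y = [0, 0, 1, 1]
query = [[59, 50000, 0]]

# Raw features: income differences swamp age and education in the distance
raw_knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
raw_prediction = raw_knn.predict(query)[0]

# Scaled features: each column is standardized before distances are computed
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=1))
scaled_knn.fit(X, y)
scaled_prediction = scaled_knn.predict(query)[0]

print(f"Without scaling: {raw_prediction}, with scaling: {scaled_prediction}")
```

On these made-up rows the two models disagree: without scaling, the query is matched to the row with the closest income, while with scaling it is matched to the row that is closest in age and education as well.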
Logistic Regression
```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 1],  # Added another sample with a different class
    [40, 80000, 0, 0],  # Added another sample with a different class
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Logistic Regression Accuracy: {accuracy}")
```
Output
Logistic Regression Accuracy: 1.0
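Logistic regression does more than output a label: it estimates a probability for each class, which is often what a loan decision actually needs. The following sketch shows `predict_proba` on hypothetical rows invented for illustration. The scaling step is an addition of this example, not part of the code above.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical rows (age, income, education) and default labels, for illustration only
X = [
    [25, 50000, 1],
    [35, 75000, 0],
    [30, 60000, 1],
    [40, 80000, 0],
    [28, 45000, 1],
    [45, 95000, 0],
]
y = [1, 0, 1, 0, 1, 0]

# Scaling first helps the solver converge when columns differ widely in size
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# predict_proba returns one column per class, ordered as in model.classes_
applicant = [[33, 70000, 0]]
probabilities = model.predict_proba(applicant)[0]
for label, probability in zip(model.classes_, probabilities):
    print(f"P(default={label}) = {probability:.2f}")
```

The two probabilities always sum to 1, and `predict` simply returns the class with the larger one.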
Neural Networks
```python
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = MLPClassifier()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Neural Network Accuracy: {accuracy}")
```
Output
Neural Network Accuracy: 0.0
Support Vector Machines (SVM)
```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 1],  # Added another sample with a different class
    [40, 80000, 0, 0],  # Added another sample with a different class
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = SVC()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"SVM Accuracy: {accuracy}")
```
Output
SVM Accuracy: 0.0
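To see accuracy figures that actually mean something, the same seven algorithms can be run on a dataset with enough rows. The sketch below uses the Iris dataset bundled with scikit-learn, which is an assumption of this example rather than data from this chapter. Note one swap: `GaussianNB` replaces `MultinomialNB`, because the Iris features are continuous measurements rather than word counts. The `max_iter` and `random_state` settings are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Iris ships with scikit-learn: 150 flowers, 4 numeric features, 3 classes
X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),  # Gaussian variant for continuous features
    "LDA": LinearDiscriminantAnalysis(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=42),
    "SVM": SVC(),
}

# Train every classifier on the same split and record its test accuracy
accuracies = {}
for name, classifier in classifiers.items():
    classifier.fit(X_train, y_train)
    predictions = classifier.predict(X_test)
    accuracies[name] = accuracy_score(y_test, predictions)
    print(f"{name} Accuracy: {accuracies[name]:.2f}")
```

With 30 test samples instead of one, each score becomes a rough but usable estimate of how well the algorithm generalizes on this dataset.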
Conclusion
To conclude, classification is a powerful machine learning technique for categorizing and labeling data into distinct classes. It involves learning patterns and relationships between features and target labels in order to make predictions on new, unlabeled data. Various classification algorithms, such as Logistic Regression, Neural Networks, and Support Vector Machines (SVM), offer different approaches to solving classification problems.
Link: https://Codelikechamp.com