Learn More About Classification in Machine Learning

In our previous chapter, we discussed supervised and unsupervised machine learning models and regression in machine learning. If you have not gone through it, please read that chapter first so that this chapter on Classification in Machine Learning is easier to follow.

Classification in Machine Learning

What is Classification in Machine Learning?

Classification is a type of supervised learning in machine learning. It is like sorting things into categories, or grouping them. For example, suppose you have a bunch of fruits and you want to teach a computer to tell whether each fruit is an apple or a banana. That is what classification does: it makes computers capable of grouping things based on their features.

How Does Classification in Machine Learning Work?

Think of a teacher guiding a student. The teacher shows the student examples of apples and bananas and tells them which is which. The student learns from these examples. Then the teacher gives the student a new fruit they have never seen before and asks whether it is an apple or a banana. The student uses what they learned to make a guess. Similarly, a classifier in machine learning learns from labeled examples and then predicts the category of new items it has not seen before.
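
The teacher and student analogy can be sketched in a few lines with scikit-learn. This is only a minimal illustration: the fruit measurements (weight in grams, length in centimeters) are made up for the example, and a decision tree is just one of several classifiers that would work here.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical measurements: [weight in grams, length in centimeters]
fruits = [
    [150, 7.0], [170, 8.0], [160, 7.5],     # apples
    [120, 18.0], [130, 20.0], [115, 19.0],  # bananas
]
labels = ["apple", "apple", "apple", "banana", "banana", "banana"]

# The "teacher" step: show the classifier labeled examples
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(fruits, labels)

# The "student" step: guess the category of fruits it has not seen before
new_fruits = [[125, 19.0], [165, 7.5]]
predictions = classifier.predict(new_fruits)
print(predictions)
```

The first new fruit is light and long, so the classifier labels it a banana; the second is heavier and short, so it is labeled an apple.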

I hope this gives you a basic idea of classification. Don’t worry, we will also implement this concept in Python.

Benefits of Classification in Machine Learning

Classification has numerous advantages, but for brevity we look at only four:

  1. Informed Decision Making
  2. Pattern Recognition
  3. Efficiency
  4. Predictive Insights

Informed Decision Making

Classification aids in making well-informed decisions by predicting outcomes.

Pattern Recognition

It identifies patterns and trends that might not be obvious.

Efficiency

Classification automates the sorting and categorization of large datasets, which saves considerable manual effort.

Predictive Insights

It provides insights into future scenarios based on historical data.

I hope this gives you a basic idea of the merits of classification.

Applications of Classification in Machine Learning

Classification has many applications in machine learning. Some of them are listed below.

  1. Loan Default Prediction
  2. Medical Diagnosis
  3. Image Recognition
  4. Email Filtering
  5. Customer Churn Prediction
  6. Speech Recognition

Loan Default Prediction

Predicting loan repayment likelihood based on historical loan default data.

Medical Diagnosis

Diagnosing diseases by analyzing patient symptoms and medical tests.

Image Recognition

Identifying objects within images, such as facial recognition.

Email Filtering

Sorting emails into “spam” or “not spam” categories.

Customer Churn Prediction

Predicting if a customer will switch to a competitor.

Speech Recognition

Converting spoken language into text, enabling voice assistants.

Classification Algorithms in Machine Learning

There are many algorithms that we can use for classification. Some of them are listed below.

  1. Decision Trees
  2. Naive Bayes
  3. Linear Discriminant Analysis (LDA)
  4. k-Nearest Neighbors (k-NN)
  5. Logistic Regression
  6. Neural Networks
  7. Support Vector Machines (SVM)

Python Implementation of All Algorithms

Below you can see a Python implementation, using scikit-learn, of each algorithm listed above.

Decision Trees in Python

from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Decision Tree Accuracy: {accuracy}")

Output

Decision Tree Accuracy: 0.0
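
With only two rows, test_size=0.2 leaves one sample for training and one for testing, so the tree sees a single class and the score says little about the algorithm. The sketch below runs the same steps on more data. It assumes the Iris dataset bundled with scikit-learn as a stand-in, since the full loan data is not shown here, and it adds stratify=y so that every class appears in both splits.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris (150 rows, 3 classes) stands in for a real loan dataset
X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

classifier = DecisionTreeClassifier(random_state=42)
classifier.fit(X_train, y_train)

accuracy = accuracy_score(y_test, classifier.predict(X_test))
print(f"Decision Tree Accuracy: {accuracy:.2f}")
```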

Naive Bayes

from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample email data and labels (1 for spam, 0 for not spam)
emails = [
    ("Get a free iPhone now!", 1),
    ("Meeting agenda for today", 0),
    # ... more data
]

X = [email[0] for email in emails]
y = [email[1] for email in emails]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = MultinomialNB()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Naive Bayes Accuracy: {accuracy}")

Output

Naive Bayes Accuracy: 0.0
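
The same idea can be tried on a slightly larger set of messages. This is a sketch under assumptions: all eight messages are invented for illustration, and the vectorizer and classifier are wrapped in a scikit-learn pipeline so that both are fitted together on whatever data is passed to fit, which keeps test text out of the vocabulary when a split is used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical messages and labels (1 for spam, 0 for not spam)
emails = [
    ("Get a free iPhone now!", 1),
    ("Win a free prize today", 1),
    ("Claim your free reward now", 1),
    ("Free cash offer, act now", 1),
    ("Meeting agenda for today", 0),
    ("Project status report attached", 0),
    ("Lunch at noon tomorrow", 0),
    ("Please review the meeting notes", 0),
]
texts = [text for text, _ in emails]
labels = [label for _, label in emails]

# Word counts feed directly into the Naive Bayes classifier
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(texts, labels)

new_messages = ["Free prize now", "Meeting notes attached"]
predictions = spam_filter.predict(new_messages)
print(predictions)
```

The first new message shares its words only with the spam examples, and the second only with the non-spam examples, so they are labeled 1 and 0 respectively.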

Linear Discriminant Analysis (LDA)

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 0],
    [40, 80000, 0, 1],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = LinearDiscriminantAnalysis()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"LDA Accuracy: {accuracy}")

Output

LDA Accuracy: 0.0

k-Nearest Neighbors (k-NN)

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), class (0, 1, 2 for different classes)
data = [
    [25, 50000, 1, 0],
    [35, 75000, 0, 1],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decreased n_neighbors to 1 for a small dataset
classifier = KNeighborsClassifier(n_neighbors=1)
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"k-NN Accuracy: {accuracy}")

Output

k-NN Accuracy: 0.0
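
k-NN decides by distance, so a feature measured in tens of thousands (income) can drown out one that is 0 or 1 (education). A common remedy is to standardize the features first. The sketch below compares both variants; it assumes the Wine dataset bundled with scikit-learn, whose columns also have very different ranges, and n_neighbors=5 is an arbitrary choice.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: Wine (178 rows, 13 features on different scales) as example data
X, y = load_wine(return_X_y=True)

unscaled_knn = KNeighborsClassifier(n_neighbors=5)
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# Mean accuracy over 5 cross-validation folds
unscaled_score = cross_val_score(unscaled_knn, X, y, cv=5).mean()
scaled_score = cross_val_score(scaled_knn, X, y, cv=5).mean()

print(f"k-NN without scaling: {unscaled_score:.2f}")
print(f"k-NN with scaling:    {scaled_score:.2f}")
```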

Logistic Regression

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 1],  # additional sample of class 1 (default)
    [40, 80000, 0, 0],  # additional sample of class 0 (no default)
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Logistic Regression Accuracy: {accuracy}")

Output

Logistic Regression Accuracy: 1.0
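
Besides a hard label, logistic regression can also report a probability for each class through predict_proba. The sketch below reuses the four sample rows; the new applicant and the StandardScaler step are assumptions added for illustration, and with so few rows the numbers are not meaningful estimates.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same columns as above: age, income, education, default
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 1],
    [40, 80000, 0, 0],
]
X = [row[:-1] for row in data]
y = [row[-1] for row in data]

# Scaling puts age, income, and education on comparable ranges
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)

# Hypothetical new applicant: age 28, income 55000, graduate
new_applicant = [[28, 55000, 1]]
probabilities = model.predict_proba(new_applicant)
print(model.classes_)  # column order of the probabilities
print(probabilities)   # one row per applicant, one column per class
```

The applicant resembles the two rows labeled 1, so the probability in the second column comes out higher than the first.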

Neural Networks

from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = MLPClassifier()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Neural Network Accuracy: {accuracy}")

Output

Neural Network Accuracy: 0.0

Support Vector Machines (SVM)

from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data: age, income, education (1 for Graduate, 0 for Not Graduate), default (1 for Yes, 0 for No)
data = [
    [25, 50000, 1, 1],
    [35, 75000, 0, 0],
    [30, 60000, 1, 1],  # additional sample of class 1 (default)
    [40, 80000, 0, 0],  # additional sample of class 0 (no default)
    # ... more data
]

X = [row[:-1] for row in data]
y = [row[-1] for row in data]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

classifier = SVC()
classifier.fit(X_train, y_train)

predictions = classifier.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"SVM Accuracy: {accuracy}")

Output

SVM Accuracy: 0.0

Conclusion

To conclude, classification in machine learning is a powerful technique for categorizing and labeling data into distinct classes. It involves learning patterns and relationships between features and target labels in order to make predictions on new, unlabeled data. Various classification algorithms, such as Logistic Regression, Neural Networks, and Support Vector Machines (SVM), offer different approaches to solving classification problems.
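
To see how the different approaches compare on the same data, the seven algorithms can be run side by side. This is a rough sketch under assumptions: it uses the Iris dataset bundled with scikit-learn, swaps in GaussianNB for the Naive Bayes entry because the features are continuous rather than word counts, and leaves every model close to its default settings.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris as a small, well-behaved example dataset
X, y = load_iris(return_X_y=True)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes (Gaussian)": GaussianNB(),
    "LDA": LinearDiscriminantAnalysis(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Neural Network": MLPClassifier(max_iter=2000, random_state=42),
    "SVM": SVC(),
}

scores = {}
for name, model in models.items():
    # Standardize features inside each fold, then fit the model
    pipeline = make_pipeline(StandardScaler(), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Scores on a small, clean dataset like this tend to be close together, so the ranking here should not be read as a general verdict on the algorithms.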

Link: https://Codelikechamp.com