Naïve Bayes is a supervised machine learning algorithm used mainly for classification tasks. It is based on Bayes' Theorem and assumes that all features are independent of each other given the class (which is why it is called naïve).
## Bayes' Theorem
P(C|X) = \frac{P(X|C)\cdot P(C)}{P(X)}
Where:
- P(C|X) = Posterior probability (probability of the class given the input)
- P(X|C) = Likelihood (probability of the input given the class)
- P(C) = Prior probability of the class
- P(X) = Evidence (probability of the input)
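The formula above can be checked with a small worked example. The numbers below are made up for illustration: a hypothetical spam filter where 30% of mail is spam and the word "free" appears in 80% of spam but only 10% of legitimate mail.

```python
# Hypothetical numbers for illustration only.
p_spam = 0.3              # prior P(spam)
p_free_given_spam = 0.8   # likelihood P("free" | spam)
p_free_given_ham = 0.1    # likelihood P("free" | not spam)

# Evidence P("free") via the law of total probability.
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Posterior P(spam | "free") by Bayes' theorem.
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 3))  # → 0.774
```

Seeing the word "free" raises the probability of spam from the 0.3 prior to about 0.77.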
## Working of Naïve Bayes
Naïve Bayes predicts the class C for a given input X by selecting the class with the highest posterior probability. Since the evidence P(X) is the same for every class, it can be dropped from the comparison:

C = \arg\max_{C} P(C)\cdot P(X|C)
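This argmax rule can be sketched in a few lines. The class names, priors, and likelihoods below are hypothetical values, not taken from any real dataset:

```python
# Hypothetical class priors P(C) and likelihoods P(X | C) for one input X.
priors = {"spam": 0.3, "ham": 0.7}
likelihoods = {"spam": 0.8, "ham": 0.1}

# Score each class by P(C) * P(X | C); the evidence P(X) is constant
# across classes, so it does not affect the argmax.
scores = {c: priors[c] * likelihoods[c] for c in priors}
predicted = max(scores, key=scores.get)
print(predicted)  # → spam (0.24 beats 0.07)
```

In practice, with many features, implementations sum log-probabilities instead of multiplying raw probabilities to avoid numerical underflow.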
## Types of Naïve Bayes Classifiers
1. Gaussian Naïve Bayes
- Used when features are continuous and follow a normal distribution.
- Example: Medical diagnosis, sensor data.
2. Multinomial Naïve Bayes
- Used for discrete data like word counts.
- Example: Text classification, spam filtering.
3. Bernoulli Naïve Bayes
- Used for binary features (0 or 1).
- Example: Presence/absence of a word in documents.
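All three variants are available in scikit-learn. A minimal sketch, assuming scikit-learn and NumPy are installed and using small synthetic datasets (the feature values and class labels here are generated, not real):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

rng = np.random.default_rng(0)
y = np.array([0] * 10 + [1] * 10)  # two synthetic classes

# Gaussian NB: continuous features; here class 0 is centred at (0, 0)
# and class 1 at (3, 3).
X_cont = rng.normal(loc=[[0.0, 0.0]] * 10 + [[3.0, 3.0]] * 10)
pred_gauss = GaussianNB().fit(X_cont, y).predict([[2.8, 3.1]])

# Multinomial NB: non-negative integer counts (e.g. word counts).
X_counts = rng.integers(0, 5, size=(20, 4))
pred_multi = MultinomialNB().fit(X_counts, y).predict(X_counts[:1])

# Bernoulli NB: binary presence/absence features.
X_bin = rng.integers(0, 2, size=(20, 4))
pred_bern = BernoulliNB().fit(X_bin, y).predict(X_bin[:1])

print(pred_gauss, pred_multi, pred_bern)
```

The choice of variant should match the feature type: Gaussian for real-valued measurements, Multinomial for counts, Bernoulli for binary indicators.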
## Advantages
- Simple and fast algorithm
- Works well with high-dimensional data
- Performs very well in text classification problems
- Requires relatively little training data to estimate its parameters
## Disadvantages
- Assumes independence between features (often unrealistic)
- Poor performance when features are highly correlated
- Probability estimates are often poorly calibrated, even when the predicted class is correct
## Applications
- Spam email detection
- Sentiment analysis
- Document classification
- Disease prediction
- Recommendation systems
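The spam-detection use case can be sketched end to end with scikit-learn's text tools. The corpus below is a tiny made-up toy example (real filters train on thousands of labelled emails), assuming scikit-learn is installed:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus for illustration.
texts = [
    "win a free prize now", "free money claim now",
    "meeting agenda attached", "lunch tomorrow at noon",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words counts feed directly into Multinomial Naive Bayes.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)
print(clf.predict(["claim your free prize"]))  # → ['spam']
```

Words like "claim", "free", and "prize" only appear in the spam examples, so the new message scores far higher under the spam class.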
