Hidden Markov Model (HMM)
A Hidden Markov Model (HMM) is a statistical model used to represent systems where the actual states are not directly observable (hidden), but we can observe outputs that depend on those hidden states.
It is widely used for sequence prediction and time-series modeling.
✅ Key Idea
In an HMM:
- The system moves through a sequence of hidden states
- At each time step, the current hidden state produces an observable output with some probability
Example:
In speech recognition, we cannot directly observe the speaker’s phoneme state (hidden), but we can observe sound signals (output).
✅ Components of HMM
An HMM consists of:
1. Hidden States (S)
States that cannot be observed directly.
Example: {Hot, Cold}
2. Observations (O)
Outputs that can be observed.
Example: the number of ice creams sold in a day, e.g. {1, 2, 3}
3. Transition Probability (A)
Probability of moving from one hidden state to another.
$A_{ij} = P(S_{t+1} = j \mid S_t = i)$
4. Emission Probability (B)
Probability of observing an output given a hidden state.
$B_j(k) = P(O_t = k \mid S_t = j)$
5. Initial State Probability (π)
Probability of starting in a particular hidden state.
$\pi_i = P(S_1 = i)$
✅ Representation
An HMM is defined by:
$\lambda = (A, B, \pi)$
Where:
- $A$ = Transition probability matrix
- $B$ = Emission probability matrix
- $\pi$ = Initial probability vector
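To make the triple (A, B, π) concrete, here is a minimal Python sketch of how the three components might be stored for the Hot/Cold example. All of the probability values, the observation set {1, 2, 3}, and the helper names `is_distribution` and `is_valid_hmm` are illustrative assumptions, not values taken from these notes.

```python
# Illustrative numbers only: these probabilities are assumptions for the sketch.
states = ["Hot", "Cold"]
observations = [1, 2, 3]  # ice creams sold in a day

# pi[i] = P(S_1 = i)
pi = {"Hot": 0.8, "Cold": 0.2}

# A[i][j] = P(S_{t+1} = j | S_t = i); each row is a distribution over next states
A = {
    "Hot": {"Hot": 0.6, "Cold": 0.4},
    "Cold": {"Hot": 0.5, "Cold": 0.5},
}

# B[j][k] = P(O_t = k | S_t = j); each row is a distribution over observations
B = {
    "Hot": {1: 0.2, 2: 0.4, 3: 0.4},
    "Cold": {1: 0.5, 2: 0.4, 3: 0.1},
}


def is_distribution(probs, tol=1e-9):
    """Return True if the values are non-negative and sum to 1 (within tol)."""
    values = list(probs.values())
    return all(v >= 0 for v in values) and abs(sum(values) - 1.0) < tol


def is_valid_hmm(pi, A, B):
    """Check that pi and every row of A and B are probability distributions."""
    return (
        is_distribution(pi)
        and all(is_distribution(row) for row in A.values())
        and all(is_distribution(row) for row in B.values())
    )


print(is_valid_hmm(pi, A, B))
```

The check reflects the one structural constraint on the model: π, each row of A, and each row of B must each sum to 1.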
✅ Example
The weather is hidden (Hot/Cold), but we observe the number of ice creams sold.
- If the weather is Hot, ice cream sales tend to be high.
- If the weather is Cold, ice cream sales tend to be low.
So, by observing sales, we can estimate the hidden weather state.
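For a single day, "estimating the hidden state from sales" amounts to applying Bayes' rule. The sketch below shows this under assumed numbers: a uniform prior over Hot/Cold and made-up emission probabilities for selling 1, 2, or 3 ice creams. The function name `posterior_over_states` is hypothetical.

```python
def posterior_over_states(prior, emission, observed):
    """Bayes' rule: P(state | observed) is proportional to P(state) * P(observed | state)."""
    joint = {s: prior[s] * emission[s][observed] for s in prior}
    total = sum(joint.values())
    return {s: p / total for s, p in joint.items()}


# Assumed values for illustration only.
prior = {"Hot": 0.5, "Cold": 0.5}
emission = {
    "Hot": {1: 0.2, 2: 0.4, 3: 0.4},
    "Cold": {1: 0.5, 2: 0.4, 3: 0.1},
}

for sales in (1, 3):
    post = posterior_over_states(prior, emission, sales)
    print(sales, {s: round(p, 3) for s, p in post.items()})
```

With these assumed numbers, seeing 3 sales gives P(Hot) = 0.8, while seeing 1 sale gives P(Hot) of about 0.286. A full HMM extends this by also accounting for how the weather on one day depends on the previous day.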
✅ Three Fundamental Problems of HMM
1. Evaluation Problem
Compute the probability of an observation sequence given the model, P(O | λ).
Solved using the Forward Algorithm.
2. Decoding Problem
Find the most likely hidden state sequence for a given observation sequence.
Solved using the Viterbi Algorithm.
3. Learning Problem
Estimate the model parameters (A, B, π) from observed data.
Solved using the Baum-Welch Algorithm (a special case of Expectation-Maximization).
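The first two problems can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the model values are assumed for the Hot/Cold example, and the function signatures are hypothetical. Baum-Welch is not shown; it reuses the forward quantities together with a matching backward pass.

```python
def forward(obs, states, pi, A, B):
    """Evaluation: return P(obs | model), summing over all hidden paths."""
    # alpha[s] = P(o_1..o_t, S_t = s)
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            j: sum(alpha[i] * A[i][j] for i in states) * B[j][o]
            for j in states
        }
    return sum(alpha.values())


def viterbi(obs, states, pi, A, B):
    """Decoding: return (most likely state path, joint probability of that path)."""
    # delta[s] = probability of the best path that ends in state s at the current step
    delta = {s: pi[s] * B[s][obs[0]] for s in states}
    backpointers = []
    for o in obs[1:]:
        prev = delta
        pointers, delta = {}, {}
        for j in states:
            best_i = max(states, key=lambda i: prev[i] * A[i][j])
            pointers[j] = best_i
            delta[j] = prev[best_i] * A[best_i][j] * B[j][o]
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Assumed model values for illustration only.
states = ["Hot", "Cold"]
pi = {"Hot": 0.8, "Cold": 0.2}
A = {"Hot": {"Hot": 0.6, "Cold": 0.4}, "Cold": {"Hot": 0.5, "Cold": 0.5}}
B = {"Hot": {1: 0.2, 2: 0.4, 3: 0.4}, "Cold": {1: 0.5, 2: 0.4, 3: 0.1}}

sales = [3, 1, 3]
print(forward(sales, states, pi, A, B))
print(viterbi(sales, states, pi, A, B))
```

Both functions multiply many probabilities together, so for long sequences a practical version would typically work with log-probabilities or rescale at each step to avoid numerical underflow.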
✅ Applications of HMM
- Speech recognition
- Handwriting recognition
- Part-of-speech tagging in NLP
- Bioinformatics (DNA sequence analysis)
- Gesture recognition
- Weather forecasting
✅ Advantages
- Good for sequential/time-series data
- Handles uncertainty effectively
- Useful when states are not directly visible
❌ Disadvantages
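- Assumes the Markov property (the next state depends only on the current state, not on earlier history)
- Can be too simple for very complex real-world systems, for example those with long-range dependencies
- Requires a large amount of data to estimate the parameters accurately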
