Decision Tree
A Decision Tree is a supervised machine learning algorithm used for classification and regression.
It builds a tree-shaped model in which each decision is made by testing a feature against a condition.
Structure
- Root Node: starting point
- Internal Nodes: decision conditions (feature tests)
- Branches: outcomes of decisions
- Leaf Nodes: final prediction/result
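The four parts above can be illustrated with a toy tree encoded as nested Python dicts (the weather features and labels are invented for illustration, not from a trained model):

```python
# A tiny decision tree as nested dicts: internal nodes hold a feature
# test, "yes"/"no" keys are the branches, and leaves hold the prediction.
tree = {
    "feature": "outlook_sunny",               # root node: first feature test
    "yes": {"prediction": "play"},            # leaf node
    "no": {
        "feature": "windy",                   # internal node: second test
        "yes": {"prediction": "stay home"},   # leaf node
        "no": {"prediction": "play"},         # leaf node
    },
}

def predict(node, sample):
    # Follow branches until a leaf node is reached.
    while "prediction" not in node:
        node = node["yes"] if sample[node["feature"]] else node["no"]
    return node["prediction"]

print(predict(tree, {"outlook_sunny": False, "windy": True}))  # -> stay home
```

Prediction is just a walk from the root to a leaf, applying one feature test per internal node.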
Working
The dataset is recursively split into smaller subsets based on feature values, choosing at each step the split that best separates the targets (commonly measured by Gini impurity or information gain), until the subsets are pure enough to assign a final prediction.
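The splitting step can be sketched in plain Python. This is a minimal illustration, assuming Gini impurity as the split criterion and a toy one-feature dataset; the helper names `gini` and `best_split` are made up for the example:

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    # Try every (feature, threshold) pair; keep the split with the
    # lowest weighted Gini impurity of the two child subsets.
    best = None  # (impurity, feature_index, threshold)
    n = len(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

# Toy data: one feature; the class changes between 2.0 and 3.0.
X = [[1.0], [2.0], [3.0], [4.0]]
y = ["A", "A", "B", "B"]
print(best_split(X, y))  # -> (0.0, 0, 2.0): splitting at <= 2.0 separates the classes perfectly
```

A full tree repeats this search recursively on each child subset until the leaves are pure (or another stopping rule, such as a depth limit, is hit).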
Advantages
- Easy to understand and interpret
- Works with both numerical and categorical data
- Requires little data preprocessing
Disadvantages
- Can easily overfit, especially when grown deep
- Sensitive to small changes in data: a slightly different training set can produce a very different tree
Random Forest
A Random Forest is an ensemble learning method that combines multiple decision trees to improve accuracy and reduce overfitting.
Working
- Creates many decision trees, each trained on a random sample of the dataset drawn with replacement (bootstrap sampling).
- At each split, each tree considers only a random subset of the features.
- Final output is obtained by:
  - Majority voting (classification)
  - Average prediction (regression)
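The steps above can be sketched with the standard library alone. As a simplification, each "tree" here is just a majority-class stump trained on its bootstrap sample; the function names are illustrative, not from any library:

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    # Draw len(X) rows with replacement (bootstrap sampling).
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def train_stump(X, y):
    # Stand-in for a full decision tree: predict the majority
    # class of the bootstrap sample it was trained on.
    return Counter(y).most_common(1)[0][0]

def forest_predict(tree_outputs):
    # Majority voting across the ensemble's predictions.
    return Counter(tree_outputs).most_common(1)[0][0]

rng = random.Random(0)
X = [[0], [1], [2], [3], [4]]
y = ["A", "A", "A", "B", "B"]
stumps = [train_stump(*bootstrap_sample(X, y, rng)) for _ in range(25)]
print(forest_predict(stumps))  # majority vote over 25 bootstrapped stumps
```

Because each tree sees a different resample of the data, their individual errors tend to differ, and the majority vote averages them out; a real random forest additionally restricts each split to a random feature subset.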
Advantages
- High accuracy
- Reduces overfitting compared to a single decision tree
- Works well with large datasets
Disadvantages
- More computationally expensive
- Less interpretable than a single decision tree
Difference Between Decision Tree and Random Forest
| Feature | Decision Tree | Random Forest |
|---|---|---|
| Model Type | Single tree | Collection of trees |
| Accuracy | Moderate | High |
| Overfitting | High chance | Less chance |
| Interpretability | Easy | Difficult |
| Speed | Faster | Slower |
