23ITB302 Data Analytics

Outlier detection plays a crucial role in data analytics for identifying abnormal data points that may impact analysis and decision-making. Choose a real-world numerical dataset available from the internet (for example, datasets from Kaggle, UCI Machine Learning Repository, or other open data sources). Using the selected dataset, perform outlier detection employing three categories of methods: Statistical methods (Z-Score and IQR), Visualization methods (Box Plot and Scatter Plot), and Algorithmic methods (Isolation Forest and DBSCAN). Write a Python program to implement each method, identify and label the outliers, and visualize the results appropriately. Compare the outliers detected by different methods and analyze the variations in their detection behavior. Submit the dataset source link, Python code, output plots, and a brief discussion of the results.
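
As a starting point for the statistical methods, the Z-Score and IQR rules can be sketched as follows. This is a minimal illustration, not a full solution: it uses only the Python standard library, the function names `zscore_outliers` and `iqr_outliers` are hypothetical, the `sample` list is made-up data rather than a real dataset, and the cutoffs (3 standard deviations, 1.5 times the IQR) are the conventional defaults and are assumptions here. The algorithmic methods (Isolation Forest and DBSCAN) are not shown.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return indices of values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be flagged
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Made-up illustrative data: eleven similar readings and one extreme value.
sample = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]

z_flags = zscore_outliers(sample)
iqr_flags = iqr_outliers(sample)

# Indices flagged by both rules, as a first step toward comparing methods.
agreed = sorted(set(z_flags) & set(iqr_flags))

print("Z-Score outlier indices:", z_flags)
print("IQR outlier indices:", iqr_flags)
print("Flagged by both:", agreed)
```

The two rules can disagree on real data. The Z-Score relies on the mean and standard deviation, which extreme values themselves inflate, while the IQR rule depends only on the quartiles and is less affected by them. Differences of this kind are what the comparison part of the assignment asks you to discuss.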
