Big Data and Data Science, A Project on Data Analytics - A Little History on Methodologies for Data Analytics, KDD Process, CRISP-DM Methodology; Data Analytics - Types, Tools and Applications
Descriptive Statistics - Scale Types, Descriptive Univariate Analysis, Descriptive Bivariate Analysis; Descriptive Multivariate Analysis - Multivariate Frequencies, Multivariate Data Visualization, Multivariate Statistics, Infographics and Word Clouds; Data Quality - Missing Values, Redundant Data, Inconsistent Data, Noisy Data, Outliers.
Distance Measures - Differences between Values of Common Attribute Types, Distance Measures for Objects with Quantitative Attributes, Distance Measures for Non-conventional Attributes; Clustering Validation, Clustering Techniques - K-means, Centroids and Distance Measures, DBSCAN
Binary Classification - Predictive Performance Measures for Classification; Distance-based Learning Algorithms - K-nearest Neighbor Algorithms, Case-based Reasoning; Probabilistic Classification Algorithms - Logistic Regression Algorithm, Naive Bayes Algorithm.
Regression and its types; DA Applications for Text, Web and Social Media - Working with Texts, Recommender Systems, Social Network Analysis
Reference Books:
1. Dean J, “Big Data, Data Mining and Machine Learning”, Wiley Publications, 2014.
2. Provost F and Fawcett T, “Data Science for Business”, O’Reilly Media Inc, 2013.
3. Janert PK, “Data Analysis with Open Source Tools”, O’Reilly Media Inc, 2011.
4. Weiss SM, Indurkhya N and Zhang T, “Fundamentals of Predictive Text Mining”, Springer-Verlag London Limited, 2010.
5. Runkler TA, “Data Analytics: Models and Algorithms for Intelligent Data Analysis”, Springer, 2012.
Text Book:
Joao Moreira, Andre Carvalho, Tomás Horvath, “A General Introduction to Data Analytics”, Wiley, 2018.