| Item | Details |
|---|---|
| Prerequisite | Python Language |
| Theory Fee | Rs. 6500/- (includes digital course material) |
| Digital or Physical Certificate Fee | Rs. 200/- |
| Software Required | Anaconda from anaconda.com/downloads |

- What is Data Science
- What is Machine Learning
- What is Deep Learning
- Role of a Data Scientist
- Applications of Data Science
- Data and its sources
- Overview of Data Science Life Cycle

- Downloading and installing Anaconda
- Starting Jupyter Notebook
- UI elements of Notebook
- Kernel and types of cells - Code and Markdown
- Modes - Edit and Command
- Magic commands - line magics and cell magics
- Keyboard shortcuts - Command mode and Edit mode shortcuts
- Saving and loading notebooks
- Using JupyterLab

- Mean, Median, Mode and Range
- Using the statistics module
- Variance and Standard Deviation
- Quartiles and IQR
- Understanding distribution of data using Histogram and Box plot
- Measuring Skewness and Kurtosis
- Probability
- Correlation between variables
- Using the scipy.stats module
- Scatter plot to understand correlation
- Regression Analysis
- Understanding intercept and slope - predict y given X
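As a rough illustration of the descriptive statistics and regression topics above, the following sketch uses only Python's standard `statistics` module. The `marks`, `x` and `y` lists are hypothetical sample data, and the slope and intercept come from the ordinary least-squares formulas written out by hand, not from any course material.

```python
import statistics

# Hypothetical sample data, used only for illustration.
marks = [40, 55, 55, 60, 70, 80, 95]

mean = statistics.mean(marks)            # arithmetic mean
median = statistics.median(marks)        # middle value of the sorted data
mode = statistics.mode(marks)            # most frequent value
data_range = max(marks) - min(marks)     # spread between extremes
variance = statistics.pvariance(marks)   # population variance
std_dev = statistics.pstdev(marks)       # population standard deviation

# Quartiles: quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(marks, n=4)
iqr = q3 - q1

# Simple linear regression (least squares) on hypothetical x/y pairs.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
mx, my = statistics.mean(x), statistics.mean(y)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
    (xi - mx) ** 2 for xi in x
)
intercept = my - slope * mx


def predict(new_x):
    """Predict y for a new x using the fitted line y = intercept + slope * x."""
    return intercept + slope * new_x


print(mean, median, mode, data_range, iqr)
print(slope, intercept, predict(6))
```

Here the fitted line is y = 1 + 2x, so `predict(6)` gives 13.0. In practice, `scipy.stats` offers ready-made equivalents for most of these calculations.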

- Creating single and multi-dimensional arrays
- Using indexing and slicing
- Using advanced indexing (boolean masks and arrays of indices)
- Array operations, methods of ndarray and universal functions
- View vs. Copy of array
- Reshaping arrays
- Stacking and splitting arrays
- Broadcasting
- Applying Linear Algebra
- Image processing with Arrays
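Several of the array topics above can be seen together in one short sketch. It assumes NumPy is installed (it ships with Anaconda); the array contents are arbitrary illustration values.

```python
import numpy as np

# A 3x4 array holding 0..11, built by reshaping a 1-D range.
a = np.arange(12).reshape(3, 4)

# Boolean indexing: keep only the even elements (result is 1-D).
evens = a[a % 2 == 0]

# Indexing with arrays of indices: picks elements (0, 1) and (2, 3).
picked = a[[0, 2], [1, 3]]

# Broadcasting: the (4,) vector of column means is stretched across
# all three rows of the (3, 4) array.
col_means = a.mean(axis=0)
centered = a - col_means

# Stacking and splitting.
stacked = np.vstack([a, a])           # shape (6, 4)
left, right = np.hsplit(a, 2)         # two (3, 2) halves

# View vs. copy: a slice is a view, so writing to it changes `a`;
# an explicit .copy() is independent of `a`.
view = a[0, :2]
view[0] = 100
independent = a[1, :2].copy()
independent[0] = -1

print(evens, picked, centered, stacked.shape, a[0, 0], a[1, 0], sep="\n")
```

After the last two assignments, `a[0, 0]` is 100 (changed through the view) while `a[1, 0]` is still 4 (the copy did not touch `a`).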

- Working with Series
- Applying methods on Series
- Working with DataFrame
- Reading data into DataFrame and writing DataFrame to other formats
- Selecting rows and columns in DataFrame
- Adding and deleting rows and columns in DataFrame
- Working with apply() and applymap() functions (applymap() is named DataFrame.map() from pandas 2.1 onward)
- Working with the str accessor for string manipulations
- Joining, Merging and Concatenating DataFrames
- Grouping data on one or more columns
- Using pivot_table()
- Data Wrangling - Binning, Encoding etc.
- Handling null values
- Drawing plots using Pandas

- Anatomy of a figure
- Working with the pyplot (module-level) API and the object-oriented API
- Working with different plots - Histogram, Bar, Stacked Bar, Pie, Scatter, Line
- Creating multiple axes in single figure
- Customizing plots - labels, legends, scales, titles, text etc.

- Figure-level vs. Axes-level plots
- Categorical, Relational, Distribution, Regression and Matrix Plots
- Using parameters like hue, row and col

- Defining the question
- Data Acquisition
- Preparing data - cleaning and organizing data
- Exploratory Data Analysis (EDA)
- Data Munging/Data Wrangling
- Feature Engineering
- Data Visualization
- Model Building
- Model Evaluation
- Model Deployment

- Understanding pre-processing concepts like Standardization, Encoding etc.
- Training a model using a train/test split
- Using different algorithms like Logistic Regression, Support Vector Machines, k-Nearest Neighbors, Naive Bayes, Decision Tree, Random Forest using Scikit-learn
- Evaluating the results of the model using metrics - classification report, confusion matrix
- Understanding Precision, Recall, F1 Score, Specificity and Sensitivity
- Understanding cross-validation and how to use it to train and test a model
- Presenting the model - Deployment of the model
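To make the evaluation metrics above concrete, here is a minimal sketch that computes them by hand for a binary classifier. The `y_true` and `y_pred` lists are hypothetical labels, and the helper `confusion_counts` is an illustrative name, not a Scikit-learn function; in the course setting, Scikit-learn's own metric functions would normally be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1          # correctly predicted positive
        elif predicted == positive:
            fp += 1          # predicted positive, actually negative
        elif actual == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # correctly predicted negative
    return tp, fp, fn, tn


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)      # of predicted positives, how many are right
recall = tp / (tp + fn)         # also called sensitivity
specificity = tn / (tn + fp)    # true negative rate
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn)
print(precision, recall, specificity, f1)
```

With these labels the counts are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four metrics come out to 0.75.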

- How to use metrics - MSE, RMSE, R2 Score etc. to evaluate model
- Understanding Regularization - Lasso and Ridge
- Understanding ensemble algorithms - Bagging and Boosting
- Stochastic Gradient Descent
- Using Grid Search to select the right hyperparameters
- How to use Pipeline
- Understanding non-linear data - polynomial features
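The regression metrics named above reduce to a few lines of arithmetic. The sketch below computes them in plain Python on hypothetical actual and predicted values; the function name `regression_metrics` is illustrative only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (mse, rmse, r2) for paired actual and predicted values."""
    n = len(y_true)
    # Residual sum of squares: squared gaps between actual and predicted.
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    # Total sum of squares: squared gaps between actual values and their mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    r2 = 1 - ss_res / ss_tot
    return mse, rmse, r2


# Hypothetical values, for illustration only.
y_true = [3, 5, 7, 9]
y_pred = [2.5, 5, 7.5, 9]

mse, rmse, r2 = regression_metrics(y_true, y_pred)
print(mse, rmse, r2)
```

For this data the MSE is 0.125 and the R2 score is 0.975. RMSE is in the same units as the target, which often makes it easier to interpret than MSE.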

- What is clustering
- How k-Means clustering works
- How DBSCAN works to create clusters
- How hierarchical clustering works - Agglomerative clustering and Dendrogram
- Recommender systems - Collaborative filtering and Content-based filtering
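As a sketch of how k-Means clustering works, the following plain-Python function alternates the two steps of the standard (Lloyd's) iteration on one-dimensional data: assign each point to its nearest centroid, then move each centroid to the mean of its members. The data, the starting centroids and the name `kmeans_1d` are assumptions made for illustration; a library implementation handles multiple dimensions and chooses starting centroids more carefully.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Cluster 1-D points; returns (labels, centroids)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centroids.append(sum(members) / len(members) if members else old)
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical data with two obvious groups, and fixed starting centroids
# so that the run is deterministic.
points = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
labels, centroids = kmeans_1d(points, centroids=[1.0, 10.0])
print(labels, centroids)
```

On this data the loop converges after two passes, with the first three points in one cluster (centroid 1.5) and the last three in the other (centroid 9.0).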