Topics — Ultimate Data Science & GenAI Bootcamp


19 Modules

Module 1 — Python Foundations

Core fundamentals to start programming in Python.
Topics
Introduction to Python: Comparison with other programming languages; Python objects: numbers, Booleans, strings.
Data Structures & Operations: Container objects and mutability; operators; operator precedence and associativity.
Control Flow: Conditional statements, loops, break and continue.
String Manipulation: Built-in string methods; splitting/joining; formatting.
Lists & Collections: List methods, comprehensions; tuples, sets, dictionaries; dictionary views.
Functions & Iterators: Function basics, parameter passing, iterators, generator functions, lambda, map/reduce/filter.

Module 2 — Advanced Python Programming

Object-oriented & advanced language features.
Topics
Object-Oriented Programming (OOP): Classes, inheritance, polymorphism, encapsulation, dunder methods, property decorators.
File Handling & Logging: Reading/writing files, buffered I/O, logging and debugging.
Modules & Exception Handling: Importing modules, creating packages, robust error handling.
Concurrency & Parallelism: Intro to multithreading and multiprocessing for performance.

Module 3 — Mastering Data Handling with Pandas

Data manipulation essentials using Pandas.
Topics
Data Structures & Fundamentals: Series, DataFrame, indexing, reindexing, iteration.
Data Operations & Transformations: Sorting, text processing, categorical data, date/time functions.
Data Analysis & Statistical Functions: Statistical functions, window functions.
Reading, Writing & Visualization: I/O from different file systems and basic visualization.

Module 4 — Mastering NumPy

Numerical computing with ndarray and linear algebra.
Topics
NumPy Basics & Array Creation: ndarray, dtypes, creation routines.
Indexing, Slicing & Advanced Indexing: Access patterns and views vs copies.
Array Operations & Manipulation: Broadcasting, arithmetic operations, string functions.
Mathematical & Statistical Analysis: Statistical functions, matrix library, linear algebra.
Advanced Concepts: Broadcasting deep-dive, iterating, byte swapping.

Module 5 — Data Visualization with Python

Visual storytelling using Matplotlib, Seaborn, and Plotly.
Topics
Introduction to Data Visualization: Principles and best practices.
Matplotlib: Basic plots, axes, legends, subplots, saving figures.
Seaborn: Distribution plots, pairplots, heatmaps, categorical plots.
Plotly: Interactive plots, dashboards, Plotly Express.

Module 6 — Advanced SQL & Database Management

From SQL fundamentals to advanced query design.
Topics
Introduction to SQL: SELECT, INSERT, UPDATE, DELETE.
SQL Functions & Procedures: Aggregates, stored procedures, UDFs.
Database Constraints: Primary/foreign keys and referential integrity.
Advanced SQL Techniques: Window functions, CTEs, partitioning, indexing.
SQL Joins & Unions: Inner, Left, Right, Full Outer, Cross Joins, Union.
Triggers & CASE: Before/after triggers and conditional logic.
Normalization & Pivoting: 1NF/2NF/3NF, pivot tables, aggregation.

Module 7 — Introduction to NoSQL with MongoDB

Working with document databases and flexible schemas.
Topics
Getting Started: MongoDB setup and shell commands.
Database & Collection Management: Create DBs and collections.
CRUD Operations: Insert,

Module 8 — Foundations of Statistics & Probability

The math that powers data-driven decisions.
Topics
Introduction to Statistics: Measures of central tendency and dispersion.
Random Variables & Probability: Set theory, covariance, correlation, PDFs.
Distributions: Binomial, Poisson, Normal, Bernoulli, Uniform.
Statistical Inference: Z-statistics, Central Limit Theorem, hypothesis testing.

Module 9 — Advanced Statistical Inference & Hypothesis Testing

Deeper statistical testing and interpretation.
Topics
Hypothesis Testing & Errors: Type I/II errors; t-test vs z-test guidance.
Statistical Distributions & Tests: t-statistics, chi-square, goodness of fit (with Python).
Bayesian Statistics: Bayes' theorem, confidence intervals, margin of error.
Statistical Significance: P-values and interpretation.

Module 10 — Feature Engineering & Data Preprocessing

Prepare production-ready datasets for modeling.
Topics
Handling Missing & Imbalanced Data: Strategies for imputation and resampling.
Outliers & Scaling: Outlier handling and feature scaling techniques.
Data Transformation & Encoding: Encoding categorical features and transformations.
Feature Selection: Backward/forward elimination, RFE.
Correlation & Multicollinearity: Covariance, correlation, VIF diagnostics.

Module 11 — Exploratory Data Analysis (EDA)

Discover insights and patterns in your data.
Topics
Trend Analysis & Segmentation: Example: bike sharing trends & customer segmentation.
Sentiment & Quality Analysis: Movie review sentiment; wine quality analysis.
Recommendation & Forecasting: Music recommendation systems; forecasting stock & commodity prices.

Module 12 — Machine Learning Foundations & Techniques

Classic ML algorithms and evaluation.
Topics (supervised + unsupervised)
Introduction to ML: AI vs ML vs DL vs DS; supervised/unsupervised/semi-supervised/reinforcement learning.
Linear Regression: MSE, MAE, RMSE, R², OLS.
Regularization: Ridge, Lasso, ElasticNet.
Logistic Regression: Confusion matrix, precision, recall, F-score, ROC-AUC.
SVM: Support vector classifiers and regressors.
Bayes & Naive Bayes: Bayes' theorem and applications.
KNN: KNN classifier and regressor.
Decision Trees: Tree-based models for classification and regression.
Ensemble Methods: Bagging, boosting, Random Forest, XGBoost.
Unsupervised Learning: Clustering overview: KMeans, Hierarchical, DBSCAN.
Clustering Evaluation: Silhouette coefficient and related metrics.

Module 13 — Natural Language Processing (NLP)

Basics to embeddings and Word2Vec.
Topics
Introduction to NLP for ML: Roadmap and practical use cases.
Text Preprocessing: Tokenization, stemming, lemmatization, stopwords.
Text Representation: One-hot, n-grams, BoW, TF-IDF.
POS Tagging: Part-of-speech tagging using NLTK.
Named Entity Recognition (NER): Basics and implementation with NLTK.
Word Embeddings: Intro and benefits.
Word2Vec: Skip-gram and CBOW, training Word2Vec models.

Module 14 — Introduction to Deep Learning & Neural Networks

ANN fundamentals, frameworks and best practices.
Topics
Intro to Deep Learning: Why deep learning matters.
Perceptron & ANN: Neuron models and multi-layer nets.
Backpropagation: Gradient descent and training networks.
Vanishing/Exploding Gradients: Causes and mitigation.
Activation Functions: Sigmoid, ReLU, Tanh, etc.
Loss Functions: Common losses for regression/classification.
Optimizers: SGD, Adam, RMSprop.
Weight Initialization: Xavier and He initialization.
Dropout & BatchNorm: Regularization and normalization techniques.
Keras & PyTorch: Framework fundamentals and model building.

Module 15 — CNNs, Object Detection & Segmentation

Computer vision architectures and deployment.
Topics
Introduction to CNN: CNN fundamentals and architecture overview.
CNN Deep-Dive: Tensor space, filters, feature maps.
CNN Architectures: ResNet and variants.
Training CNNs: Hyperparameter tuning, overfitting/underfitting.
Web Apps for CNN: Deploying with Flask/Django or TF.js.
Object Detection — YOLO: YOLO architecture, training and deployment.
Object Detection — Detectron2: Pre-trained models and fine-tuning.
Segmentation: Semantic & instance segmentation with YOLO and Detectron2.

Module 16 — RNNs & Transformer Models

Sequence modeling and attention-based architectures.
Topics
Introduction to RNNs: RNN fundamentals and use cases.
LSTM: LSTM cells and sequence modeling.
GRU: GRU vs LSTM, when to use which.
Encoders & Decoders: Seq2seq architectures.
Attention Mechanism: Soft/hard attention and variants.
Attention Neural Networks: Self-attention and Transformer basics.
BERT: Pre-training & fine-tuning BERT.
GPT-2: Autoregressive text generation and fine-tuning.

Module 17 — Introduction to Generative AI

Generative models, research, and applications.
Topics
Overview of Generative AI: Generative vs discriminative, definitions & significance.
Understanding Generative Models: GANs, VAEs, how they work and advantages.
Generative vs Discriminative: Key differences and use cases.
Recent Advancements: State-of-the-art techniques and trends.
Applications: Image synthesis, drug discovery, NLP, creative AI.

Module 18 — Vector Databases

Storing & querying embeddings for modern AI systems.
Topics
Overview of Vector DBs: What they are, key concepts and use cases.
Comparison with SQL/NoSQL: Differences, tradeoffs and performance considerations.
Capabilities: Similarity search, high-dimensional handling, real-time processing.
Storage & Architecture: Indexing techniques and optimizations.
Types: In-memory, disk-based, cloud-based vector DBs.
Popular DBs: ChromaDB, FAISS, Pinecone, LanceDB, Qdrant.
Vector Search with NoSQL: Integrating vector search with MongoDB/Cassandra; best practices.

Module 19 — Retrieval-Augmented Generation (RAG) & Capstone Projects

Combine retrieval systems with LLMs and build end-to-end projects.
Topics
RAG Overview: What RAG is, core components, why it matters.
RAG Pipeline: Retrieval, contextualization, generation phases and challenges.
LangChain Integration: Building end-to-end RAG pipelines with LangChain.
Leveraging Vector DBs: How vector DBs power retrieval in RAG.
Role of LLMs: Fine-tuning and LLM-based generation in RAG systems.
Hybrid Search & Reranking: Combining retrieval methods and reranking strategies.
Memory in RAG: Long-term memory and persistent context strategies.
Multimodal RAG: Combining text, images and other modalities.
Capstone — Python Projects: End-to-end application architecture and best practices.
Capstone — ML/Deep Learning/Generative Projects: Complete project lifecycles and deployment.
