Data Architect Certification Program
Course overview:
This six-week intensive prepares working technologists to design, govern, and operate modern data platforms end to end. Participants move from conceptual data modeling through cloud-native lakehouse architectures, streaming pipelines, and enterprise governance, finishing with a capstone where each student delivers a production-ready architecture blueprint for a real-world scenario.
The curriculum blends concise theory with hands-on labs, weekly quizzes, take-home assignments, and a graded capstone project. Tools and platforms covered include AWS (S3, Glue, Redshift, Lake Formation), Snowflake, Apache Kafka, dbt, Apache Airflow, Terraform, and modern observability stacks. Every week includes a live lab session in a sandboxed cloud environment so concepts are reinforced through building, not just reading.
Course Content
🔹 Week 1: Foundations and data modeling
Topics:
- Role of the data architect, stakeholder map, and architecture deliverables (C4, ADRs, reference diagrams).
- Conceptual, logical, and physical data modeling: when to use each.
- Normalization (1NF–3NF, BCNF) versus denormalization trade-offs.
- Dimensional modeling: star, snowflake, conformed dimensions, slowly changing dimensions (Type 1, 2, 3).
- Data Vault 2.0 introduction: hubs, links, satellites.
🔹 Week 2: Storage systems and data warehousing
Topics:
- OLTP versus OLAP workload characteristics; row-store versus columnar storage.
- RDBMS internals: indexing strategies, partitioning, query planning.
- NoSQL family deep dive: key-value, document, wide-column, graph; selection criteria.
- Cloud data warehouses: Snowflake, Redshift, BigQuery; architecture and pricing models.
- ETL versus ELT; ingestion patterns and orchestration with Airflow and dbt.
🔹 Week 3: Data lakes and lakehouse architecture
Topics:
- Data lake fundamentals: zones (raw, curated, consumption), file formats, and partition design.
- Open table formats: Delta Lake, Apache Iceberg, Apache Hudi; feature comparison.
- Medallion architecture: bronze, silver, gold layering and SLAs per layer.
- Query engines: Athena, Trino, Spark SQL, Databricks SQL; trade-offs.
- Cost optimization: tiered storage, lifecycle policies, file compaction, Z-ordering.
🔹 Week 4: Streaming and real-time data pipelines
Topics:
- Batch versus streaming versus micro-batch; latency budgets and use case mapping.
- Apache Kafka deep dive: brokers, topics, partitions, consumer groups, exactly-once semantics.
- Stream processing engines: Kafka Streams, Flink, Spark Structured Streaming.
- Change data capture (CDC) patterns with Debezium; outbox pattern.
- Event-driven architecture, schema registry, and contract testing for events.
🔹 Week 5: Data governance, security, and quality
Topics:
- Data governance frameworks: DAMA-DMBOK overview, RACI for data ownership.
- Catalog and lineage tooling: AWS Glue Data Catalog, Unity Catalog, OpenMetadata, Collibra.
- Data quality frameworks: Great Expectations, dbt tests, Soda; SLA and SLO definition.
- Security: IAM, row- and column-level security, tokenization, encryption at rest and in transit.
- PII handling, GDPR / CCPA / HIPAA constraints, data residency, and audit logging.
🔹 Week 6: Capstone project and architecture review
Capstone deliverables:
- Executive summary (1 page) covering business context, success criteria, and proposed approach.
- Reference architecture diagram (C4 levels 1–3) plus a deployment view with cloud services labeled.
- Data model artifacts: conceptual ERD, dimensional model for the analytics layer, and key DDL.
- Pipeline design: ingestion sources, batch versus streaming routing, orchestration, and SLAs per layer.
- Governance plan: catalog choice, lineage strategy, quality checks, security model, and compliance mapping.
- Cost model with three load scenarios (low, expected, peak) and a sensitivity analysis on the top two drivers.
- Three to five architecture decision records (ADRs) documenting major trade-offs.
- Risk register listing top ten risks with mitigations and owners.
Staffing Support
- Resume Preparation
- Mock Interview Preparation
- Phone Interview Preparation
- Face-to-Face Interview Preparation
- Project/Technology Preparation
- Internship with internal project work
- Externship with client project work
Our Salient Features:
- Hands-on labs and homework
- Group discussions and case studies
- Course project work
- Regular quizzes and exams
- Regular support beyond the classroom
- Students can retake the class at no cost
- Dedicated conference rooms for group project work
- Live streaming for remote students
- Video recordings for catching up on missed classes
