Data Architect Certification Program

Course overview:

This six-week intensive prepares working technologists to design, govern, and operate modern data platforms end-to-end. Participants move from conceptual data modeling through cloud-native lakehouse architectures, streaming pipelines, and enterprise governance, finishing with a capstone where each student delivers a production-ready architecture blueprint for a real-world scenario.

The curriculum blends concise theory with hands-on labs, weekly quizzes, take-home assignments, and a graded capstone project. Tools and platforms covered include AWS (S3, Glue, Redshift, Lake Formation), Snowflake, Apache Kafka, dbt, Apache Airflow, Terraform, and modern observability stacks. Every week includes a live lab session in a sandboxed cloud environment so concepts are reinforced through building, not just reading.

Course Content

🔹 Week 1: Foundations and data modeling

Topics:

Role of the data architect, stakeholder map, and architecture deliverables (C4, ADRs, reference diagrams).
Conceptual, logical, and physical data modeling — when to use each.
Normalization (1NF–3NF, BCNF) versus denormalization trade-offs.
Dimensional modeling: star, snowflake, conformed dimensions, slowly changing dimensions (Type 1, 2, 3).
Data Vault 2.0 introduction: hubs, links, satellites.
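
As an illustration of the slowly changing dimension topic above, here is a minimal Python sketch of a Type 2 update, in which the current row is closed and a new version is appended instead of overwriting the old value. The CustomerVersion class, the apply_scd2 function, and the column names are hypothetical and chosen only for this example; they are not part of the course materials.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    """One row of a hypothetical customer dimension (illustrative only)."""
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[CustomerVersion], customer_id: int,
               new_city: str, change_date: date) -> List[CustomerVersion]:
    """Return a new history list with a Type 2 change applied."""
    result: List[CustomerVersion] = []
    found_current = False
    changed = False
    for row in history:
        is_current = row.customer_id == customer_id and row.valid_to is None
        if is_current:
            found_current = True
            if row.city != new_city:
                # Close the old version instead of overwriting it.
                result.append(replace(row, valid_to=change_date))
                changed = True
                continue
        result.append(row)
    # Append a new current version for a changed or previously unseen key.
    if changed or not found_current:
        result.append(CustomerVersion(customer_id, new_city, change_date))
    return result


history = [CustomerVersion(1, "Austin", date(2023, 1, 1))]
history = apply_scd2(history, 1, "Denver", date(2024, 6, 1))
```

After the call, the history holds the closed Austin row and an open Denver row. A Type 1 change would instead overwrite the city in place and keep a single row.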

🔹 Week 2: Storage systems and data warehousing

Topics:

OLTP versus OLAP workload characteristics; row-store versus columnar storage.
RDBMS internals: indexing strategies, partitioning, query planning.
NoSQL family deep dive: key-value, document, wide-column, graph — selection criteria.
Cloud data warehouses: Snowflake, Redshift, BigQuery — architecture and pricing models.
ETL versus ELT; ingestion patterns and orchestration with Airflow and dbt.

🔹 Week 3: Data lakes and lakehouse architecture

Topics:

Data lake fundamentals: zones (raw, curated, consumption), file formats, and partition design.
Open table formats: Delta Lake, Apache Iceberg, Apache Hudi — feature comparison.
Medallion architecture: bronze, silver, gold layering and SLAs per layer.
Query engines: Athena, Trino, Spark SQL, Databricks SQL — trade-offs.
Cost optimization: tiered storage, lifecycle policies, file compaction, Z-ordering.

🔹 Week 4: Streaming and real-time data pipelines

Topics:

Batch versus streaming versus micro-batch; latency budgets and use case mapping.
Apache Kafka deep dive: brokers, topics, partitions, consumer groups, exactly-once semantics.
Stream processing engines: Kafka Streams, Flink, Spark Structured Streaming.
Change data capture (CDC) patterns with Debezium; outbox pattern.
Event-driven architecture, schema registry, and contract testing for events.

🔹 Week 5: Data governance, security, and quality

Topics:

Data governance frameworks: DAMA-DMBOK overview, RACI for data ownership.
Catalog and lineage tooling: AWS Glue Data Catalog, Unity Catalog, OpenMetadata, Collibra.
Data quality frameworks: Great Expectations, dbt tests, Soda; SLA and SLO definition.
Security: IAM, row- and column-level security, tokenization, encryption at rest and in transit.
PII handling, GDPR / CCPA / HIPAA constraints, data residency, and audit logging.

🔹 Week 6: Capstone project and architecture review

Deliverables:

Executive summary (1 page) covering business context, success criteria, and proposed approach.
Reference architecture diagram (C4 levels 1–3) plus a deployment view with cloud services labeled.
Data model artifacts: conceptual ERD, dimensional model for the analytics layer, and key DDL.
Pipeline design: ingestion sources, batch versus streaming routing, orchestration, and SLAs per layer.
Governance plan: catalog choice, lineage strategy, quality checks, security model, and compliance mapping.
Cost model with three load scenarios (low, expected, peak) and a sensitivity analysis on the top two drivers.
Three to five architecture decision records (ADRs) documenting major trade-offs.
Risk register listing top ten risks with mitigations and owners.
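
To make the cost model deliverable above more concrete, the following Python sketch shows one possible shape for it: three load scenarios and a one-at-a-time sensitivity check on two cost drivers. The unit prices, the scenario volumes, and the choice of storage and compute as the top two drivers are all assumptions made for illustration, not real cloud pricing or course requirements.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Scenario:
    """Monthly load assumptions for one scenario (illustrative values)."""
    name: str
    storage_tb: float
    compute_hours: float


# Hypothetical unit prices; substitute figures from the provider's price list.
STORAGE_USD_PER_TB = 23.0
COMPUTE_USD_PER_HOUR = 2.0


def monthly_cost(s: Scenario) -> float:
    """Linear cost model: storage volume plus compute time."""
    return (s.storage_tb * STORAGE_USD_PER_TB
            + s.compute_hours * COMPUTE_USD_PER_HOUR)


def sensitivity(s: Scenario, bump: float = 0.10) -> Dict[str, float]:
    """Cost increase when each driver alone grows by `bump` (10% by default)."""
    base = monthly_cost(s)
    more_storage = replace(s, storage_tb=s.storage_tb * (1 + bump))
    more_compute = replace(s, compute_hours=s.compute_hours * (1 + bump))
    return {
        "storage_tb": monthly_cost(more_storage) - base,
        "compute_hours": monthly_cost(more_compute) - base,
    }


SCENARIOS = [
    Scenario("low", storage_tb=10, compute_hours=100),
    Scenario("expected", storage_tb=50, compute_hours=500),
    Scenario("peak", storage_tb=200, compute_hours=2000),
]

for s in SCENARIOS:
    print(s.name, round(monthly_cost(s), 2), sensitivity(s))
```

A real capstone model would likely add more drivers (egress, requests, licensing) and non-linear pricing tiers; the point here is only the structure of scenarios plus a per-driver sensitivity table.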

Staffing Support

Resume Preparation
Mock Interview Preparation
Phone Interview Preparation
Face-to-Face Interview Preparation
Project/Technology Preparation
Internship with internal project work
Externship with client project work

Our Salient Features:

Hands-on Labs and Homework
Group discussion and Case Study
Course Project work
Regular Quizzes and Exams
Regular support beyond the classroom
Students can re-take the class at no cost
Dedicated conference rooms for group project work
Live streaming for remote students
Video recordings for catching up on missed classes