DEA-C01 — AWS Certified Data Engineer – Associate Study Plan

A practical 7-, 14-, 30-, and 60/90-day preparation schedule for the AWS Certified Data Engineer – Associate (DEA-C01) exam.

Study plan orientation

This study plan is for candidates preparing for the AWS Certified Data Engineer – Associate (DEA-C01) exam. It is organized around practical scheduling: diagnostic practice, focused service review, hands-on concept checks, missed-question review, timed mocks, and final-week consolidation.

Use it whether you have one week left or are starting earlier. The shorter plans prioritize weak-area repair and exam execution. The longer plans give you time to build stronger AWS data engineering judgment across ingestion, storage, transformation, orchestration, security, governance, monitoring, and troubleshooting.

Which plan should you use?

Time available | Best fit | Daily time target | Main goal | Mock exam approach
7 days | You already studied and need final review | 1.5-3 hours | Repair weak areas, review mistakes, sharpen timing | 1 full timed mock if you can review it fully
14 days | You know AWS basics but need focused DEA-C01 coverage | 1.5-2.5 hours | Cover core data services and build exam rhythm | 1 diagnostic set + 1-2 timed mocks
30 days | Balanced preparation | 60-120 minutes | Learn, practice, review, and simulate | Diagnostic early, mocks in weeks 3 and 4
60 days | Full preparation with steady practice | 5-8 hours/week | Build service-selection depth and hands-on confidence | Sectional practice first, timed mocks late
90 days | Newer to AWS data engineering or limited weekly time | 3-5 hours/week | Build foundations without cramming | More hands-on labs before full mocks

If your time is very limited, do not try to “read everything.” Start with a diagnostic, identify the highest-value gaps, and spend most of your time reviewing missed questions and practicing realistic AWS data engineering scenarios.

Core DEA-C01 study map

Use the official AWS exam guide for the current objective list, then organize your study into these working areas.

Study area | AWS topics to review | What you should be able to do
Data ingestion and movement | Amazon S3 ingestion patterns, AWS DMS, Amazon Kinesis Data Streams, Amazon Data Firehose, Amazon MSK, event-driven ingestion | Choose batch, streaming, replication, or event-based ingestion based on latency, volume, source type, and operational needs
Data storage and lake foundations | Amazon S3, AWS Glue Data Catalog, crawlers, partitioning, file formats, schema evolution, Amazon Redshift, Amazon DynamoDB, relational sources | Design a storage layout, catalog data, choose formats, and reason about query performance and cost
Transformation and processing | AWS Glue jobs, Spark concepts, AWS Glue Studio, Amazon EMR, AWS Lambda for lightweight transforms, SQL-based transforms | Pick the right processing service, troubleshoot failed jobs, and understand ETL vs ELT decisions
Orchestration and automation | AWS Step Functions, Amazon EventBridge, AWS Glue workflows and triggers, Amazon MWAA concepts | Coordinate multi-step pipelines, retries, dependencies, and scheduled or event-driven workflows
Analytics and query access | Amazon Athena, Amazon Redshift, Redshift Spectrum concepts, data warehouse vs data lake query patterns | Choose query engines and storage patterns for reporting, ad hoc analysis, and large-scale analytics
Security and governance | IAM roles and policies, AWS KMS, encryption, Lake Formation, S3 bucket policies, VPC endpoints, CloudTrail | Apply least privilege, protect data, manage access, and identify audit or governance controls
Monitoring and troubleshooting | Amazon CloudWatch logs and metrics, AWS CloudTrail, job run history, pipeline failures, schema and partition issues | Diagnose failures, identify bottlenecks, and select appropriate observability tools
Performance, reliability, and cost | Partitioning, compression, columnar formats, retries, idempotency, lifecycle management, workload sizing concepts | Improve pipeline efficiency without overbuilding or choosing unnecessarily complex services

Start with a diagnostic

Before following any plan, complete a diagnostic session.

Step | Action | Output
1 | Take a mixed DEA-C01 practice set under light timing | Baseline weak areas
2 | Mark every missed, guessed, or slow question | Missed-question log
3 | Group misses by topic | Top 3-5 repair areas
4 | Review the related AWS service behavior | Short notes in your own words
5 | Retest only those weak areas | Evidence of improvement

Do not treat the diagnostic as a pass/fail judgment. Its purpose is to decide where your limited study time should go.
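
Steps 2 and 3 of the diagnostic can be automated with a few lines of Python if you keep your misses in a simple list. The sketch below is only an illustration: the function name `top_repair_areas` and the sample topic labels are hypothetical, and any spreadsheet would do the same job.

```python
from collections import Counter


def top_repair_areas(missed_topics, limit=5):
    """Return the most frequently missed topics, most frequent first."""
    counts = Counter(missed_topics)
    return [topic for topic, _ in counts.most_common(limit)]


# Hypothetical diagnostic results: one topic label per missed,
# guessed, or slow question.
misses = [
    "Glue crawler",
    "Lake Formation",
    "Glue crawler",
    "Firehose",
    "Athena partitioning",
    "Lake Formation",
    "Glue crawler",
]

print(top_repair_areas(misses, limit=3))
```

Topics that appear most often in the list are the ones to schedule first in whichever plan you choose.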

Daily practice rhythm

Use the rhythm below for most study days. Adjust the minutes, but keep the order: recall, learn, practice, review.

Available time | Recommended session
45 minutes | 5 min recall, 20 min focused review, 15 min practice questions, 5 min missed-question notes
60 minutes | 10 min recall, 25 min topic review, 20 min practice, 5 min summary
90 minutes | 10 min recall, 35 min service review or hands-on concept check, 30 min practice, 15 min missed-question review
2+ hours | 15 min recall, 45 min focused study, 45 min scenario practice, 30 min review, 15 min flashcards or notes

Daily checklist

  • Review yesterday’s missed questions before adding new content.
  • Study one focused topic at a time, such as Glue job troubleshooting or Kinesis ingestion choices.
  • Practice with scenario questions, not only definition questions.
  • Write down the service-selection rule you learned.
  • End with a short note: “If I see this scenario, I will choose X because Y.”

Missed-question review method

Most DEA-C01 improvement comes from reviewing why an answer was wrong, not from taking more questions.

Log field | What to record
Topic | Example: Glue crawler, Lake Formation access, Firehose delivery, Athena partitioning
Mistake type | Knowledge gap, misread scenario, wrong service, missed security requirement, performance/cost tradeoff
Why the correct answer works | One or two plain-language sentences
Why your answer failed | The constraint you ignored
Rule to remember | A short decision rule
Retest date | 24-72 hours later
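
If you prefer to keep the log in code rather than a notebook, the fields above map naturally onto a small record type. The following is a minimal sketch, not a required tool: the class name `MissedQuestion`, its field names, and the sample entry are all hypothetical, and the 24-72 hour retest window is expressed as 1 to 3 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MissedQuestion:
    topic: str
    mistake_type: str
    why_correct_works: str
    why_mine_failed: str
    rule: str
    logged_on: date
    retest_after_days: int = 2  # must stay within the 1-3 day (24-72 hour) window

    def retest_date(self) -> date:
        """Return the date on which this question should be retested."""
        if not 1 <= self.retest_after_days <= 3:
            raise ValueError("retest delay should be 1 to 3 days")
        return self.logged_on + timedelta(days=self.retest_after_days)


# Hypothetical log entry for illustration only.
entry = MissedQuestion(
    topic="Athena partitioning",
    mistake_type="performance/cost tradeoff",
    why_correct_works="Filtering on a partition column lets the query skip unrelated data.",
    why_mine_failed="I ignored the requirement to reduce the amount of data scanned.",
    rule="If queries filter by date, consider partitioning by date.",
    logged_on=date(2025, 1, 10),
)

print(entry.retest_date())
```

Sorting entries by `retest_date()` gives you the next day's recall list.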

Common mistake categories

Mistake category | Example repair action
Wrong ingestion service | Build a comparison table for Kinesis Data Streams, Data Firehose, MSK, DMS, and S3 batch ingestion
Weak IAM/security reasoning | Review IAM roles, KMS keys, S3 policies, Lake Formation permissions, and audit trails together
Confused catalog behavior | Practice how Glue crawlers, the Data Catalog, schemas, and partitions relate
Weak ETL troubleshooting | Review job logs, permissions, source connectivity, schema changes, and data format issues
Overlooking cost/performance | Revisit partitioning, columnar formats, compression, lifecycle policies, and query engine choice
Memorizing instead of reasoning | Rewrite the question as a real pipeline design decision

Hands-on concept review

If you have access to a safe AWS practice environment, use small, controlled exercises. Clean up resources when finished and avoid deploying anything unnecessary.

Hands-on theme | Practice task | What to learn
S3 data lake basics | Place sample files in S3 using different prefixes and formats | How layout affects cataloging and query patterns
Glue Data Catalog | Create or inspect tables, crawler behavior, schemas, and partitions | How metadata supports Athena, Glue, and other analytics tools
Athena query practice | Query sample data and compare layout or format choices | How partitioning and file format affect query behavior
Glue ETL concepts | Review job parameters, source/target settings, logs, and retries | How to diagnose transformation and permission failures
Streaming ingestion | Compare stream-based and delivery-stream patterns conceptually | When to use near-real-time ingestion vs direct delivery
Orchestration | Map a pipeline with triggers, dependencies, retries, and notifications | How to coordinate multi-step data workflows
Security controls | Review IAM role assumptions, KMS use, S3 policies, and Lake Formation concepts | How least privilege and data governance apply to pipelines
Monitoring | Inspect where logs, metrics, and audit records would appear | How to troubleshoot a failing pipeline

If you cannot use hands-on labs, replace each lab with a diagram exercise: draw the pipeline, list the AWS services, list permissions, identify failure points, and explain how you would monitor it.
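
For the S3 data lake basics exercise, it helps to see what a partitioned prefix layout looks like before uploading anything. The sketch below builds object keys using Hive-style `key=value` prefixes, a layout that Glue crawlers and Athena can interpret as partitions. The function name `partitioned_key`, the `sales` dataset, and the file name are hypothetical, and the code only builds strings; it does not call AWS.

```python
from datetime import date


def partitioned_key(dataset: str, day: date, filename: str) -> str:
    """Build an S3 object key with Hive-style key=value partition prefixes."""
    return (
        f"{dataset}/year={day.year}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical dataset and file name, for illustration only.
print(partitioned_key("sales", date(2025, 3, 7), "part-0000.parquet"))
# sales/year=2025/month=03/day=07/part-0000.parquet
```

Comparing this layout with a flat prefix such as `sales/part-0000.parquet` is a quick way to reason about how much data a date-filtered query would need to read.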

7-day final review plan

Use this plan when the exam is close and you already have some preparation. Do not try to learn every AWS data service from scratch in one week.

Day | Focus | Study actions
Day 1 | Diagnostic and triage | Take a mixed practice set. Build a missed-question log. Pick your top 4 weak areas.
Day 2 | Ingestion and storage | Review S3, DMS, Kinesis options, Data Firehose, source-to-lake patterns, file formats, partitions, and schema handling.
Day 3 | Processing and orchestration | Review Glue jobs, Spark concepts, EMR use cases, Lambda limits as a pattern, Step Functions, EventBridge, Glue workflows, and pipeline dependencies.
Day 4 | Security and governance | Review IAM, KMS, encryption, S3 policies, Lake Formation, CloudTrail, VPC access patterns, and least-privilege pipeline roles.
Day 5 | Troubleshooting and performance | Drill Glue job failures, crawler/catalog issues, Athena query issues, permission failures, monitoring signals, and cost/performance tradeoffs.
Day 6 | Timed mock and deep review | Take one timed mock or the closest equivalent. Spend at least as much time reviewing as testing.
Day 7 | Light final review | Review notes, missed questions, service-selection tables, and exam logistics. Avoid heavy new content.

7-day rule

Stop adding unfamiliar services after Day 5 unless they directly explain a repeated missed question. The final 48 hours should be for consolidation, not expansion.

14-day focused plan

Use this plan if you have two weeks and can study most days.

Day | Focus | Output
1 | Diagnostic set and exam guide review | Topic ranking and schedule adjustments
2 | S3, file formats, partitioning, lifecycle concepts | Storage decision notes
3 | Glue Data Catalog, crawlers, schemas, Athena basics | Catalog and query notes
4 | Batch ingestion: S3, DMS, scheduled loads | Batch ingestion comparison
5 | Streaming ingestion: Kinesis Data Streams, Data Firehose, MSK concepts | Streaming decision table
6 | Glue ETL, Spark concepts, job configuration, retries | ETL troubleshooting notes
7 | Orchestration: Step Functions, EventBridge, Glue workflows, MWAA concepts | Pipeline dependency diagram
8 | Timed sectional practice | Missed-question log update
9 | Security: IAM, KMS, S3 policies, Lake Formation | Security access-control map
10 | Redshift, Athena, EMR, analytics service selection | Query and processing comparison
11 | Monitoring and troubleshooting | Failure-mode checklist
12 | Weak-area sprint | Retest of top weak topics
13 | Full timed mock or near-full simulation | Timing and readiness evidence
14 | Final review | Light notes, no major new topics

14-day priorities

Spend the most time on the areas that affect many question types:

  1. Service selection for ingestion, processing, storage, and analytics.
  2. IAM, KMS, S3, and Lake Formation access patterns.
  3. Glue, Data Catalog, crawlers, partitions, and ETL troubleshooting.
  4. Monitoring, logs, retries, and pipeline reliability.
  5. Performance and cost tradeoffs in data lake and analytics designs.

30-day balanced plan

The 30-day path is best if you want enough time to learn, practice, review, and simulate without stretching preparation too long.

Days | Focus | Primary tasks | Practice target
1-2 | Diagnostic and planning | Take diagnostic practice, review the official exam guide, rank weak areas | Mixed baseline set
3-5 | Data lake foundations | S3 layout, prefixes, file formats, compression, partitions, Glue Data Catalog | Storage and catalog questions
6-8 | Batch ingestion | DMS, S3 ingestion, scheduled loads, source/target decisions, error handling | Batch ingestion scenarios
9-11 | Streaming ingestion | Kinesis Data Streams, Data Firehose, MSK concepts, streaming-to-lake patterns | Streaming service selection
12-15 | Processing and transformation | Glue jobs, Spark concepts, EMR, Lambda use cases, ETL vs ELT | ETL and processing drills
16-17 | Orchestration | Step Functions, EventBridge, Glue workflows, scheduling, dependencies, retries | Pipeline workflow questions
18-20 | Analytics and warehouse patterns | Athena, Redshift, Redshift Spectrum concepts, query access, data modeling considerations | Query engine selection
21 | Timed mock 1 | Simulate exam conditions as closely as your practice tool allows | Full review afterward
22-23 | Mock review and repair | Re-study every missed or guessed question | Retest weak areas
24-25 | Security and governance | IAM, KMS, encryption, Lake Formation, S3 policies, auditability | Security scenario drills
26-27 | Monitoring, troubleshooting, cost/performance | CloudWatch, CloudTrail, job logs, crawler issues, query performance, lifecycle choices | Troubleshooting drills
28 | Timed mock 2 | Take a second full or near-full simulation | Timing and weak-area evidence
29 | Final weak-area sprint | Review only recurring misses and service-selection rules | Short targeted sets
30 | Light final review | Notes, flashcards, logistics, rest | No heavy new content

Weekly rhythm for the 30-day plan

Day type | What to do
New topic day | Learn the service patterns, then answer targeted questions
Review day | Revisit missed questions and draw architecture flows
Mock day | Test under timing, then review deeply
Repair day | Re-study only the topics that caused misses
Final day | Consolidate notes and protect energy

60/90-day full preparation path

Use the 60-day path if you can study consistently several hours per week. Use the 90-day path if you are newer to AWS data engineering, have limited weekly time, or want more hands-on reinforcement.

Phase | 60-day timing | 90-day timing | Focus | Deliverable
1 | Week 1 | Weeks 1-2 | Diagnostic, AWS data pipeline foundations, exam guide review | Baseline scorecard and study map
2 | Week 2 | Weeks 3-4 | S3, Glue Data Catalog, crawlers, schemas, partitions, Athena | Data lake notes and catalog diagram
3 | Week 3 | Weeks 5-6 | Batch and streaming ingestion: DMS, Kinesis, Data Firehose, MSK, S3 patterns | Ingestion service-selection table
4 | Week 4 | Weeks 7-8 | Glue ETL, Spark concepts, EMR, Lambda transforms, ELT patterns | Processing comparison notes
5 | Week 5 | Week 9 | Orchestration and automation: Step Functions, EventBridge, Glue workflows, MWAA concepts | Pipeline workflow diagram
6 | Week 6 | Week 10 | Security and governance: IAM, KMS, S3 policies, Lake Formation, auditability | Access-control checklist
7 | Week 7 | Week 11 | Monitoring, troubleshooting, performance, reliability, and cost | Failure-mode playbook
8 | Week 8 | Week 12 | Timed mocks, weak-area sprint, final review | Exam-readiness decision

60/90-day weekly structure

Weekly activity | Recommended amount
Focused reading or video review | 2 sessions
Hands-on or diagram-based concept practice | 1 session
Targeted practice questions | 2 sessions
Missed-question review | 2-3 short sessions
Architecture/service-selection drill | 1 session
Timed mock | Late phase only

Long-path checkpoint schedule

Checkpoint | When | What to decide
Baseline checkpoint | End of Phase 1 | Which topics are unfamiliar?
First repair checkpoint | End of Phase 3 | Can you choose ingestion services correctly?
Processing checkpoint | End of Phase 4 | Can you explain Glue, EMR, Lambda, and SQL transform tradeoffs?
Security checkpoint | End of Phase 6 | Can you reason through IAM, KMS, S3, and Lake Formation access?
Readiness checkpoint | Final 1-2 weeks | Are mistakes isolated and reviewable, or broad and repeated?

Timed mock exam strategy

Timed mocks are useful only if you review them thoroughly. A mock without review is mostly a stamina exercise.

Plan length | When to use timed mocks | How to review
7 days | Once, around Day 6, if you have time to review | Review every missed, guessed, and slow question
14 days | Around Days 8 and 13 | Use the first to repair, the second to confirm readiness
30 days | Around Days 21 and 28 | Compare mistake patterns across both mocks
60/90 days | Mostly in the final quarter of the plan | Use earlier practice as sectional drills, not full simulations

Mock review rules

  • Recreate the reasoning path for every miss.
  • Mark questions you answered correctly but guessed.
  • Identify whether the issue was AWS knowledge, scenario reading, or service selection.
  • Review related services together. For example, do not review Athena without also reviewing S3 layout, partitions, Glue Data Catalog, and permissions.
  • Avoid taking multiple full mocks back-to-back if you cannot review them the same day or next day.

Service-selection drills for DEA-C01

Many AWS data engineering questions test the ability to choose the right service or design pattern under constraints. Practice comparisons directly.

Decision area | Compare | Practice question to answer
Batch vs streaming | S3 batch loads, DMS, Kinesis, Data Firehose, MSK | How fresh does the data need to be, and who manages the streaming complexity?
Storage choice | S3, Redshift, DynamoDB, relational databases | Is this a data lake, warehouse, operational store, or source system?
Processing choice | Glue, EMR, Lambda, SQL in Athena or Redshift | Is the workload serverless ETL, large Spark processing, lightweight event handling, or SQL transformation?
Catalog and query | Glue Data Catalog, crawlers, Athena, Redshift Spectrum concepts | How will data be discovered, partitioned, and queried?
Orchestration choice | Step Functions, EventBridge, Glue workflows, MWAA concepts | Is the pipeline event-driven, scheduled, dependency-heavy, or workflow-managed?
Access control | IAM, S3 policies, KMS, Lake Formation | Which layer controls identity, encryption, object access, and governed table access?
Observability | CloudWatch, CloudTrail, job logs, service metrics | Are you debugging performance, failures, permissions, or audit history?

Final-week rules

Use these rules regardless of which plan you followed.

Stop adding new material

Stop adding broad new material about 2-3 days before the exam. Continue reviewing only:

  • Repeated missed-question topics.
  • Service comparisons you still confuse.
  • Security and permission patterns.
  • Troubleshooting workflows.
  • Your own summary notes.

Protect review quality

Do | Avoid
Review missed questions in detail | Skimming answer keys
Redraw common data pipelines | Memorizing isolated service names
Practice timing on mixed sets | Spending all day on one obscure topic
Sleep and keep a normal routine | Taking a full mock late the night before
Confirm exam logistics | Changing your entire strategy at the end

Exam-readiness checks

You are closer to ready when you can do the following without heavy notes.

Readiness signal | What it looks like
Explain an end-to-end AWS data pipeline | Source, ingestion, storage, catalog, transform, query, monitoring, and security are all included
Choose ingestion patterns | You can distinguish batch, streaming, replication, and event-driven designs
Reason about Glue and the Data Catalog | You understand crawlers, schemas, partitions, jobs, and common failure points
Apply security controls | You can reason through IAM, KMS, S3 policies, Lake Formation, and audit needs
Troubleshoot scenarios | You know where to look for logs, permissions, schema issues, and data layout problems
Manage timing | You can finish timed practice without rushing the final questions
Review effectively | Your missed-question log is shrinking and mistakes are less repetitive

If your misses are still broad across ingestion, storage, processing, security, and troubleshooting, use more targeted review before relying on another full mock.

Practical next step

Start with a diagnostic DEA-C01 practice set, create a missed-question log, and choose the plan that matches your remaining time. Then follow the daily rhythm: review yesterday’s misses, study one AWS data engineering topic, practice scenario questions, and write down the service-selection rule you learned.
