MLA-C01 — AWS Certified Machine Learning Engineer – Associate Exam Blueprint
A practical exam blueprint for AWS Certified Machine Learning Engineer – Associate (MLA-C01) readiness.
How to Use This Exam Blueprint
Use this checklist as a practical study map for the AWS Certified Machine Learning Engineer – Associate (MLA-C01) exam. It is designed to help you confirm whether you can apply AWS machine learning concepts in realistic engineering scenarios, not just recognize service names.
For each topic area:
- Read the readiness target.
- Check whether you can explain the decision, not only the definition.
- Practice scenario questions where multiple AWS services could work.
- Revisit weak areas during final review.
Ready means you can choose, configure, troubleshoot, and operate machine learning workflows on AWS at an associate engineer level.
MLA-C01 Topic-Area Readiness Table
| Readiness area | What to review | You are ready when you can… |
|---|---|---|
| ML problem framing | Business objective, ML task type, target variable, success metric, constraints | Identify whether a scenario needs classification, regression, forecasting, ranking, clustering, NLP, computer vision, or anomaly detection |
| Data collection and ingestion | S3, streaming, batch ingestion, data formats, data quality, data lineage | Choose an ingestion pattern and explain tradeoffs for latency, scale, reliability, and cost |
| Data preparation | Cleaning, transformation, joins, encoding, normalization, missing values, outliers | Select appropriate preprocessing steps and avoid leakage from validation or test data |
| Feature engineering | Feature selection, feature creation, feature stores, categorical handling, text/image features | Explain how features affect model performance, reproducibility, and online/offline consistency |
| Model training | SageMaker training jobs, built-in algorithms, custom containers, distributed training concepts | Match training options to workload size, framework, data location, and operational needs |
| Model evaluation | Metrics, validation strategy, confusion matrix, bias/variance, overfitting, underfitting | Choose the correct metric for the business goal and diagnose poor model behavior |
| Hyperparameter tuning | Search strategies, objective metric, early stopping concepts, tuning cost | Decide when tuning is useful and how to avoid wasting compute |
| Model deployment | Real-time endpoints, batch transform, async patterns, containers, inference pipelines | Select a deployment pattern based on latency, throughput, payload size, and update frequency |
| MLOps pipelines | SageMaker Pipelines, Step Functions, CI/CD concepts, model registry, approval gates | Describe a repeatable workflow from data prep through deployment and rollback |
| Monitoring and observability | CloudWatch, logs, metrics, alarms, model monitoring, drift detection | Detect degraded model or endpoint behavior and identify likely root causes |
| Security and access control | IAM, least privilege, encryption, VPC access, network isolation, secrets | Design secure ML workflows without over-permissive roles or exposed data |
| Governance and compliance support | Auditability, tagging, lineage, approvals, reproducibility | Explain how to track who trained, approved, deployed, and changed a model |
| Cost and performance optimization | Instance selection, autoscaling concepts, spot training concepts, right-sized inference | Choose options that reduce cost without breaking workload requirements |
| Troubleshooting | Failed jobs, data schema mismatch, container errors, endpoint latency, permission errors | Isolate whether a problem is caused by data, code, IAM, compute, networking, or service configuration |
Core AWS Machine Learning Workflow
Know the end-to-end workflow and the AWS services that commonly support each stage.
    flowchart LR
        A[Define ML problem] --> B[Collect and store data]
        B --> C[Prepare and validate data]
        C --> D[Engineer features]
        D --> E[Train model]
        E --> F[Evaluate model]
        F --> G{Meets objective?}
        G -- No --> C
        G -- Yes --> H[Register or approve model]
        H --> I[Deploy for inference]
        I --> J[Monitor data, model, and endpoint]
        J --> K{Drift or degradation?}
        K -- Yes --> C
        K -- No --> J
Readiness Checks
- I can explain each step in an AWS ML lifecycle.
- I can identify which stage a scenario is describing.
- I can choose between batch, streaming, and real-time workflows.
- I can recognize when retraining is needed.
- I can explain how monitoring connects back to model improvement.
ML Problem Framing and Task Selection
| Scenario clue | Likely ML task | Readiness prompt |
|---|---|---|
| Predict a numeric value such as demand, price, or duration | Regression | Can you choose metrics that penalize large errors appropriately? |
| Predict one of several categories | Classification | Can you interpret precision, recall, F1, ROC/AUC, and confusion matrix outcomes? |
| Forecast future values over time | Time-series forecasting | Can you account for seasonality, trend, and time-based validation? |
| Group similar records without labels | Clustering | Can you explain why there may be no single “correct” label? |
| Detect unusual behavior or fraud-like activity | Anomaly detection | Can you distinguish rare-event detection from standard classification? |
| Extract meaning from text | NLP | Can you identify tokenization, embeddings, classification, summarization, or entity extraction needs? |
| Analyze images or video frames | Computer vision | Can you explain labeling, augmentation, and inference performance tradeoffs? |
| Recommend items or rank results | Recommendation/ranking | Can you explain feedback loops and cold-start concerns? |
Can You Do This?
- Identify the target variable and prediction unit in a business scenario.
- Distinguish supervised, unsupervised, and reinforcement learning at a practical level.
- Explain why the wrong metric can produce a model that is technically accurate but operationally useless.
- Recognize data leakage when future information or target-derived values enter training data.
- Explain when a rules-based system may be preferable to ML.
Data Collection, Storage, and Ingestion
Expect scenarios that test whether you can choose a data path that fits the workload.
| Topic | Review focus | Ready means you can… |
|---|---|---|
| Amazon S3 | Object storage, prefixes, data lake patterns, versioning concepts, encryption | Use S3 as a source and destination for ML data and artifacts |
| Data formats | CSV, JSON, Parquet, images, text, compressed files | Explain tradeoffs in size, schema, query performance, and processing efficiency |
| Batch ingestion | Scheduled loads, ETL jobs, file-based workflows | Choose batch when latency requirements allow delayed processing |
| Streaming ingestion | Event-driven or near-real-time data | Recognize when low-latency data capture matters |
| Data cataloging | Schema discovery and metadata | Explain why searchable metadata helps repeatability and governance |
| Data quality | Missing values, invalid records, duplicates, drift, skew | Identify checks that should happen before model training |
| Permissions | Bucket policies, IAM roles, cross-service access | Diagnose access failures between ML services and storage |
Decision Checks
| If the scenario says… | Think about… |
|---|---|
| “Large historical dataset stored as objects” | S3-backed training, data partitioning, efficient formats |
| “Data arrives continuously from applications or devices” | Streaming ingestion, buffering, downstream processing |
| “Analysts and ML engineers need shared curated data” | Data lake, catalog, governance, access controls |
| “Training fails because input data cannot be read” | IAM role, S3 path, encryption permissions, VPC/network path |
| “Model quality changes after a new data source is added” | Schema changes, feature distribution shift, data validation |
Data Preparation and Feature Engineering
Data Preparation Checklist
- Handle missing values using an approach appropriate to the feature and model.
- Remove or correct duplicate and invalid records.
- Detect outliers and decide whether to remove, cap, transform, or preserve them.
- Split training, validation, and test data correctly.
- Avoid using test data during preprocessing decisions.
- Preserve time order for time-dependent problems.
- Encode categorical variables appropriately.
- Scale or normalize features when the algorithm benefits from it.
- Tokenize or vectorize text when needed.
- Resize, normalize, or augment images when needed.
- Track preprocessing code and parameters for reproducibility.
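The split and scaling items in this checklist can be illustrated with a minimal sketch in plain Python. It assumes the rows are already sorted oldest first. The function names, the 70/15/15 split, and the demand values are hypothetical and chosen only for illustration. The sketch learns the scaling statistics from the training slice alone and then reuses them for later data.

```python
from statistics import mean, pstdev

def time_ordered_split(rows, train_frac=0.7, val_frac=0.15):
    """Split rows that are already sorted by time into train/val/test slices."""
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]

def fit_scaler(values):
    """Learn mean and standard deviation from training values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # guard against zero variance
    return mu, sigma

def apply_scaler(values, mu, sigma):
    """Standardize values with statistics learned elsewhere."""
    return [(v - mu) / sigma for v in values]

# Hypothetical daily demand values, oldest first.
demand = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 40, 42]
train, val, test = time_ordered_split(demand)

mu, sigma = fit_scaler(train)              # statistics come from train only
val_scaled = apply_scaler(val, mu, sigma)  # reused, not refit, on validation data
```

Fitting the scaler on all twelve values would let the late spike (40, 42) influence how earlier rows are scaled, which is one form of leakage.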
Feature Engineering Readiness Table
| Feature issue | Why it matters | What to know |
|---|---|---|
| High cardinality categorical variables | Can increase dimensionality and overfitting risk | Encoding strategy, grouping rare categories, embeddings where appropriate |
| Time-based features | Can improve forecasting and behavioral models | Avoid future leakage; create lags, rolling windows, calendar features |
| Imbalanced labels | Accuracy may hide poor minority-class performance | Use recall, precision, F1, sampling strategies, class weights where appropriate |
| Online/offline mismatch | Model performs differently in production | Keep training and inference transformations consistent |
| Data leakage | Artificially high validation score | Exclude target-derived or future-known variables |
| Feature drift | Production distribution changes | Monitor input distributions and retrain when needed |
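The "avoid future leakage" guidance for time-based features can be sketched as follows. This is an illustrative example, not a prescribed method: the function name, the default lag and window, and the sample series are assumptions. Each feature is computed only from values strictly before the row it describes.

```python
def add_lag_features(series, lag=1, window=3):
    """Build per-row features that use only values strictly before each row."""
    rows = []
    for t in range(len(series)):
        past = series[:t]  # excludes series[t], so the target never leaks in
        lag_value = past[-lag] if len(past) >= lag else None
        recent = past[-window:]
        # Only emit a rolling mean once a full window of history exists.
        rolling_mean = sum(recent) / window if len(recent) == window else None
        rows.append({"target": series[t], "lag": lag_value, "rolling_mean": rolling_mean})
    return rows

# Hypothetical series, oldest first.
features = add_lag_features([5, 7, 9, 11, 13])
```

Rows without enough history carry `None` rather than a value borrowed from the future. How to treat those rows (drop or impute) is a separate preprocessing decision.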
SageMaker and AWS ML Service Readiness
You do not need to memorize every option of every service, but you should understand the role each service can play in an AWS ML workflow.
| AWS service or capability | Practical exam relevance | Can you explain when to use it? |
|---|---|---|
| Amazon SageMaker | Build, train, tune, deploy, and monitor ML models | [ ] |
| SageMaker training jobs | Managed training execution | [ ] |
| SageMaker Processing | Data preprocessing and evaluation workloads | [ ] |
| SageMaker Pipelines | Repeatable ML workflows | [ ] |
| SageMaker Experiments | Track experiment runs and comparisons | [ ] |
| SageMaker Model Registry | Register, approve, and manage model versions | [ ] |
| SageMaker endpoints | Real-time inference hosting | [ ] |
| SageMaker batch transform | Offline batch inference | [ ] |
| SageMaker Model Monitor | Monitor data and model behavior | [ ] |
| Amazon S3 | Store training data, model artifacts, logs, and outputs | [ ] |
| AWS Glue | ETL, data cataloging, data preparation support | [ ] |
| AWS Lambda | Lightweight event-driven orchestration or preprocessing | [ ] |
| AWS Step Functions | Workflow orchestration across services | [ ] |
| Amazon EventBridge | Event-driven automation | [ ] |
| Amazon ECR | Store custom container images | [ ] |
| AWS IAM | Control permissions for users, roles, and services | [ ] |
| Amazon VPC | Network isolation and private access patterns | [ ] |
| AWS KMS | Encryption key management | [ ] |
| Amazon CloudWatch | Metrics, logs, alarms, operational visibility | [ ] |
Model Training Readiness
Training Decision Table
| Requirement | Consider |
|---|---|
| Minimal infrastructure management | Managed SageMaker training |
| Custom framework or dependencies | Custom container or supported framework container |
| Large dataset | Efficient storage format, distributed processing, instance selection |
| Need repeatable experiments | Track parameters, code version, data version, metrics, artifacts |
| Need automated tuning | Hyperparameter tuning job with objective metric |
| Training must be isolated from public internet | VPC configuration, private access patterns, security controls |
| Training job fails quickly | IAM, S3 path, container entry point, dependency issue, input channel mismatch |
| Training job runs but model performs poorly | Data quality, features, metric choice, overfitting, underfitting |
Can You Do This?
- Explain what goes into a training job: data, code/container, compute, hyperparameters, output artifact.
- Choose between built-in algorithms, framework containers, and custom containers.
- Explain how training artifacts are stored and later used for deployment.
- Interpret training and validation metrics.
- Recognize overfitting and underfitting from metric patterns.
- Explain how early stopping can reduce unnecessary training.
- Identify IAM or S3 permission problems from error symptoms.
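To make the overfitting and early-stopping items concrete, here is a small sketch of a generic patience rule. It is not the algorithm any particular AWS service uses. The function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical curves: training loss keeps falling while validation loss turns upward.
train_losses = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22]
val_losses = [0.95, 0.70, 0.62, 0.64, 0.69, 0.75]

stop_at = early_stopping_epoch(val_losses, patience=2)
```

The widening gap between the two curves after the third epoch is the metric pattern usually read as overfitting. Stopping there avoids paying for epochs that no longer improve validation loss.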
Model Evaluation and Metrics
Classification Metrics
| Metric or artifact | What it tells you | Common trap |
|---|---|---|
| Accuracy | Overall proportion correct | Misleading with imbalanced classes |
| Precision | Of predicted positives, how many were correct | High precision may still miss many positives |
| Recall | Of actual positives, how many were found | High recall may increase false positives |
| F1 score | Harmonic mean of precision and recall | Weights both error types equally, which may not match the business cost |
| Confusion matrix | Counts of true/false positives/negatives | Must map positive class correctly |
| ROC/AUC | Ranking performance across thresholds | Does not choose the operating threshold by itself |
| PR curve | Precision-recall tradeoff | Often useful for imbalanced positive classes |
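The accuracy trap in the table can be worked through with a short sketch that derives each metric from confusion-matrix counts. The function name and the fraud-style counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical data: 1,000 transactions, of which 20 are truly fraudulent.
m = classification_metrics(tp=5, fp=5, fn=15, tn=975)
```

With these counts, accuracy is 0.98 while recall is only 0.25, so the model looks strong overall yet misses three out of four fraud cases.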
Regression Metrics
| Metric | What to know |
|---|---|
| MAE | Average absolute error; easier to interpret in original units |
| MSE | Penalizes larger errors more strongly |
| RMSE | Square root of MSE; same unit as target |
| R-squared | Proportion of target variance the model explains; not sufficient on its own to judge a model |
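A minimal sketch of how these four metrics are computed from the same set of errors may help when comparing them. The function name and the sample values are assumptions for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R-squared for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical predictions with one large miss (40 predicted as 50).
metrics = regression_metrics([10, 20, 30, 40], [12, 18, 30, 50])
```

Here MAE is 3.5 but RMSE is about 5.2. The single error of 10 dominates the squared metrics, which is what "penalizes larger errors more strongly" means in practice.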
Evaluation Readiness Checklist
- Match the metric to the business cost of errors.
- Explain why false positives and false negatives may have different impacts.
- Choose validation methods that avoid leakage.
- Compare candidate models using the same data split and metric.
- Explain threshold tuning for classification models.
- Recognize when a model is too simple or too complex.
- Explain why test data should be held back for final evaluation.
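Threshold tuning and unequal error costs can be sketched together. The example below is a simplified illustration: the function names, scores, labels, candidate thresholds, and cost values are all hypothetical, and in practice the scores would come from validation data rather than the test set.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count TP, FP, FN, TN when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def cheapest_threshold(scores, labels, thresholds, fp_cost, fn_cost):
    """Pick the candidate threshold with the lowest total error cost."""
    best = None
    for threshold in thresholds:
        _, fp, fn, _ = confusion_at_threshold(scores, labels, threshold)
        cost = fp * fp_cost + fn * fn_cost
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

# Hypothetical validation scores and true labels.
scores = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
candidates = [0.3, 0.5, 0.7]

missed_positive_is_costly = cheapest_threshold(scores, labels, candidates, fp_cost=1, fn_cost=10)
false_alarm_is_costly = cheapest_threshold(scores, labels, candidates, fp_cost=10, fn_cost=1)
```

The same model and the same scores lead to a low threshold when false negatives are expensive and a high threshold when false positives are expensive, which is why the metric and threshold have to follow the business cost of errors.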
Hyperparameter Tuning
| Topic | Readiness target |
|---|---|
| Hyperparameters vs parameters | Know that hyperparameters are set outside the learning process (typically before training starts), while model parameters are learned from the data |
| Objective metric | Choose the metric the tuning process should optimize |
| Search space | Define sensible ranges to avoid wasted trials |
| Resource tradeoff | More trials can improve results but increase cost and time |
| Early stopping | Stop unpromising training runs when appropriate |
| Validation data | Use validation results, not test data, for tuning decisions |
Common Tuning Traps
- Optimizing for accuracy on an imbalanced dataset.
- Tuning on the test set.
- Defining a search space that is too broad or unrealistic.
- Comparing models trained on different data splits.
- Ignoring training cost and deployment constraints.
- Selecting a model based only on metric improvement without considering latency or explainability needs.
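The ideas of a bounded search space, an objective metric, and a fixed trial budget can be sketched with a generic random search. This is an illustration only and does not reproduce any managed tuning service. The parameter names, ranges, and the `validation_score` stand-in are hypothetical; a real objective would train a model and score it on validation data.

```python
import math
import random

def sample_config(rng):
    """Draw one configuration from a deliberately narrow search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform between 1e-4 and 1e-1
        "max_depth": rng.randint(2, 10),
    }

def validation_score(config):
    """Stand-in objective: pretend the best region is lr near 1e-2 and depth near 6."""
    lr_penalty = 0.1 * abs(math.log10(config["learning_rate"]) + 2)
    depth_penalty = 0.05 * abs(config["max_depth"] - 6)
    return 1.0 - lr_penalty - depth_penalty

def random_search(max_trials, seed=0):
    """Run a fixed budget of trials and return the best (score, config) plus all trials."""
    rng = random.Random(seed)  # seeded so the search is reproducible
    trials = []
    for _ in range(max_trials):
        config = sample_config(rng)
        trials.append((validation_score(config), config))
    best = max(trials, key=lambda trial: trial[0])
    return best, trials

best, trials = random_search(max_trials=10)
```

The trial budget is the cost lever: every extra trial is another training run, so narrowing the ranges usually pays off more than raising `max_trials`.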
Deployment and Inference Patterns
Deployment Pattern Decision Table
| Requirement | Likely pattern to evaluate |
|---|---|
| User-facing low-latency predictions | Real-time endpoint |
| Large offline scoring job | Batch transform or batch inference workflow |
| Inference requests arrive intermittently | Serverless inference or another scale-to-zero, event-driven pattern |
| Large payload or longer processing time | Asynchronous inference, where requests are queued and results are collected later |
| Multiple preprocessing and model steps | Inference pipeline or orchestrated workflow |
| Need safe rollout | Versioned model, staged deployment, monitoring, rollback plan |
| Need to serve multiple model versions | Production variants on one endpoint with weighted traffic routing |
| Need custom inference logic | Custom container or inference script |
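As a study aid, the first rows of the table can be restated as a small decision function. This is a hypothetical heuristic, not AWS guidance: the function name, the traffic labels, and the 5 MB cutoff are assumptions chosen for illustration and are not service limits.

```python
def suggest_inference_pattern(needs_immediate_response, payload_mb, traffic,
                              large_payload_mb=5.0):
    """Return the pattern to evaluate first; thresholds are illustrative only."""
    if traffic not in {"steady", "intermittent", "scheduled_bulk"}:
        raise ValueError(f"unknown traffic profile: {traffic}")
    if traffic == "scheduled_bulk" and not needs_immediate_response:
        return "batch transform"
    if payload_mb > large_payload_mb or not needs_immediate_response:
        return "asynchronous inference"
    if traffic == "intermittent":
        return "cost-aware or event-driven pattern"
    return "real-time endpoint"

# Hypothetical scenarios.
checkout_fraud = suggest_inference_pattern(True, 0.1, "steady")
nightly_scoring = suggest_inference_pattern(False, 50, "scheduled_bulk")
video_analysis = suggest_inference_pattern(True, 200, "steady")
occasional_lookup = suggest_inference_pattern(True, 0.1, "intermittent")
```

The order of the checks mirrors how scenario questions tend to be read: first ask whether anyone is waiting for the answer, then consider payload size, and only then traffic shape.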
Deployment Readiness Checklist
- Explain the difference between training and inference containers.
- Choose real-time, batch, or asynchronous inference from scenario clues.
- Identify where model artifacts are stored before deployment.
- Explain how endpoint scaling relates to traffic and latency.
- Know why production preprocessing must match training preprocessing.
- Recognize deployment failures caused by container startup, missing artifacts, IAM, or incompatible input format.
- Explain a safe rollback strategy when a new model performs poorly.
MLOps, Automation, and Reproducibility
Pipeline Stages to Recognize
| Stage | Artifacts or decisions to track |
|---|---|
| Data extraction | Source, time window, schema, permissions |
| Data validation | Quality checks, schema expectations, drift checks |
| Processing | Transformation code, parameters, output location |
| Training | Code version, image, hyperparameters, metrics, model artifact |
| Evaluation | Metric threshold, validation set, approval criteria |
| Registration | Model version, metadata, status |
| Deployment | Endpoint configuration, environment, version |
| Monitoring | Logs, metrics, drift, alarms |
| Retraining | Trigger, data window, approval path |
Can You Do This?
- Describe why manual notebook-only workflows are risky in production.
- Explain how pipelines improve repeatability.
- Identify where approval gates fit before production deployment.
- Track which data and code produced a model.
- Explain how CI/CD concepts apply to ML workflows.
- Distinguish application deployment concerns from model lifecycle concerns.
- Identify when an event-driven workflow is more appropriate than a scheduled workflow.
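An approval gate and the lineage fields it depends on can be sketched without any AWS calls. The class, field names, version strings, and thresholds below are hypothetical. The status strings borrow the approval-status names used by SageMaker Model Registry (`PendingManualApproval`, `Rejected`) purely for familiarity.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCandidate:
    """Minimal record of what produced a model and how it scored."""
    name: str
    data_version: str
    code_version: str
    metrics: dict
    status: str = "Unevaluated"
    gate_failures: list = field(default_factory=list)

def apply_quality_gate(candidate, minimums):
    """Send a candidate to human approval only if every metric clears its minimum."""
    failures = []
    for metric, minimum in minimums.items():
        value = candidate.metrics.get(metric)
        if value is None or value < minimum:
            failures.append(f"{metric}={value} (minimum {minimum})")
    candidate.gate_failures = failures
    candidate.status = "Rejected" if failures else "PendingManualApproval"
    return candidate

minimums = {"recall": 0.80, "precision": 0.60}  # hypothetical evaluation thresholds

good = apply_quality_gate(
    ModelCandidate("churn-v7", "2024-05-snapshot", "git:abc123", {"recall": 0.82, "precision": 0.70}),
    minimums,
)
weak = apply_quality_gate(
    ModelCandidate("churn-v8", "2024-06-snapshot", "git:def456", {"recall": 0.50, "precision": 0.90}),
    minimums,
)
```

Passing the automated gate does not deploy anything here. It only moves the candidate to a state where a person can approve it, and the stored data and code versions are what make the model reproducible later.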
Monitoring, Logging, and Troubleshooting
What to Monitor
| Monitoring target | Examples of what can go wrong |
|---|---|
| Endpoint health | Invocation errors, latency increase, unavailable container |
| Infrastructure | CPU, memory, GPU utilization, scaling issues |
| Logs | Container errors, malformed input, dependency failure |
| Input data | Schema change, missing fields, distribution drift |
| Model output | Prediction distribution shift, confidence changes |
| Business metric | Conversion, fraud catch rate, churn reduction, cost per prediction |
| Security events | Unauthorized access attempts, unexpected role use |
| Cost | Over-provisioned endpoints, unnecessary training runs |
Troubleshooting Decision Checks
| Symptom | First areas to investigate |
|---|---|
| Training job cannot access data | IAM role, S3 path, bucket policy, encryption permissions |
| Training starts but fails inside container | Entry point, dependencies, environment variables, input channel paths |
| Model has excellent validation but poor production results | Data leakage, train/serve skew, drift, non-representative validation set |
| Endpoint latency is high | Instance type, model size, payload size, preprocessing, autoscaling, cold starts, downstream dependencies |
| Endpoint returns errors after deployment | Input schema mismatch, serialization format, container logs, missing model artifact |
| Batch inference output is incomplete | Input manifest/path, failed records, permissions, output path |
| Monitoring alarms fire after data source change | Schema drift, changed value ranges, missing features |
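One generic way to quantify "changed value ranges" is the population stability index (PSI), which compares how a feature is distributed across bins in a baseline versus live data. The sketch below is a stand-alone illustration and does not represent how any AWS monitoring feature computes drift. The function names, bin edges, sample values, and the 0.2 alert threshold are assumptions.

```python
import bisect
import math

def bin_fractions(values, edges, floor=1e-6):
    """Share of values in each bin; out-of-range values are clamped to the end bins."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        index = bisect.bisect_right(edges, value) - 1
        index = min(max(index, 0), len(counts) - 1)
        counts[index] += 1
    # The floor keeps empty bins from producing log(0) or division by zero.
    return [max(count / len(values), floor) for count in counts]

def population_stability_index(baseline, current, edges):
    """Sum of (current - baseline) * ln(current / baseline) over all bins."""
    base = bin_fractions(baseline, edges)
    live = bin_fractions(current, edges)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, live))

edges = [0, 10, 20, 30]                               # hypothetical bin edges
baseline_values = [1, 5, 9, 11, 15, 19, 21, 25, 29]   # captured at training time
live_values = [1, 11, 12, 21, 22, 23, 24, 25, 26]     # captured in production

psi_same = population_stability_index(baseline_values, baseline_values, edges)
psi_shifted = population_stability_index(baseline_values, live_values, edges)

ALERT_THRESHOLD = 0.2  # assumed team-chosen value, not a standard
drift_suspected = psi_shifted > ALERT_THRESHOLD
```

Identical distributions give a PSI of zero, and the score grows as mass moves between bins. A stored baseline is what makes the comparison possible in the first place.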
Security, Identity, and Network Controls
Security scenarios often test least privilege and service-to-service access.
| Topic | What to be ready for |
|---|---|
| IAM users, groups, roles, and policies | Choose roles for services and avoid broad permissions |
| Least privilege | Grant only the actions and resources needed |
| Service roles | Allow SageMaker or other AWS services to access S3, logs, KMS keys, or containers |
| S3 security | Bucket policies, encryption, access restrictions |
| KMS | Understand encryption key access and permission dependencies |
| VPC configuration | Private access patterns, isolation, security groups, subnets |
| Secrets handling | Avoid hardcoding credentials in notebooks, code, or containers |
| Logging and audit | Know why access and change history matter |
| Data privacy | Limit exposure of sensitive training and inference data |
Security Readiness Checklist
- I can identify the IAM role used by an ML service.
- I can diagnose an access denied error involving S3 or KMS.
- I know why embedding access keys in notebooks or containers is unsafe.
- I can explain encryption at rest and in transit at a practical level.
- I can choose private network access when public connectivity is not allowed.
- I can distinguish identity permissions from network reachability.
- I can explain why logs may contain sensitive data and need controls.
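The S3 and KMS items above can be illustrated with a sketch that builds a narrowly scoped identity policy as a Python dictionary. It is not a complete training role: a real job typically also needs permissions for writing output, pulling images, and logging. The function name, bucket, prefix, and key ARN are placeholders.

```python
import json

def training_data_read_policy(bucket, prefix, kms_key_arn):
    """Build a read-only policy scoped to one S3 prefix and one KMS key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListTrainingPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                "Sid": "ReadTrainingObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Without this statement, reads of SSE-KMS objects fail even though S3 access is allowed.
                "Sid": "DecryptTrainingObjects",
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": kms_key_arn,
            },
        ],
    }

policy = training_data_read_policy(
    bucket="example-bucket",
    prefix="training/",
    kms_key_arn="arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",  # placeholder
)
policy_json = json.dumps(policy, indent=2)
```

Every statement names a specific action and a specific resource. That is the shape to look for when a scenario asks for least privilege, and the separate KMS statement is the piece most often missing in access-denied scenarios. The key's own policy also has to permit the role.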
Cost, Performance, and Resiliency Tradeoffs
| Scenario pressure | What to evaluate |
|---|---|
| Training cost is too high | Right-size compute, reduce unnecessary trials, use efficient data formats, stop failed experiments early |
| Endpoint cost is too high | Match the deployment pattern to traffic, apply autoscaling, use batch when real time is unnecessary |
| Inference latency is too high | Model size, feature processing, instance choice, request batching, endpoint scaling |
| Throughput is too low | Scaling policy, parallelism, payload size, model optimization |
| Model must recover from failed deployment | Versioning, rollback, monitoring, staged release |
| Workflow must be repeatable | Pipelines, artifacts, model registry, infrastructure as code concepts |
| Data must be retained and traceable | S3 organization, metadata, tags, lineage, lifecycle concepts |
Ready Means You Can Balance
- Accuracy vs latency.
- Cost vs training duration.
- Real-time inference vs batch scoring.
- Managed service convenience vs custom container flexibility.
- Automation speed vs approval control.
- Broad access convenience vs least-privilege security.
- Frequent retraining vs operational stability.
AWS Artifact and Configuration Checks
You should be comfortable recognizing common ML workflow artifacts, even if the exam does not ask you to write full production templates.
| Artifact | Why it matters |
|---|---|
| Training data path | Tells the job where to read data |
| Output artifact path | Stores trained model output |
| Container image URI | Defines training or inference environment |
| IAM role ARN | Grants the service permission to access required resources |
| Hyperparameters | Configure training behavior |
| Environment variables | Pass runtime configuration |
| Model package or registry entry | Tracks a deployable model version |
| Endpoint configuration | Ties a model to its compute (instance type and count) and production variants |
| Monitoring baseline | Defines expected data or prediction behavior |
| CloudWatch logs | Primary place to inspect runtime errors |
Example: Configuration Fields to Recognize
    training_job:
      input_data: s3://example-bucket/training/
      output_path: s3://example-bucket/model-artifacts/
      role: arn:aws:iam::123456789012:role/example-ml-role
      image: 123456789012.dkr.ecr.region.amazonaws.com/example-training-image
      hyperparameters:
        learning_rate: "0.01"
        epochs: "10"
Focus on what each field does and what could fail, not on memorizing placeholder syntax.
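One way to practice spotting what could fail is to write the checks down. The sketch below validates a dictionary shaped like the example's `training_job` fields. The function name, the specific rules, and the error messages are assumptions, and the checks are deliberately shallow (they do not contact AWS or confirm that anything exists).

```python
import re

ROLE_ARN_PATTERN = re.compile(r"^arn:aws:iam::\d{12}:role/.+$")

def check_training_config(cfg):
    """Return a list of human-readable problems; an empty list means no obvious issues."""
    problems = []
    for field_name in ("input_data", "output_path"):
        if not str(cfg.get(field_name, "")).startswith("s3://"):
            problems.append(f"{field_name} should be an s3:// URI")
    if not ROLE_ARN_PATTERN.match(str(cfg.get("role", ""))):
        problems.append("role should be an IAM role ARN")
    if not cfg.get("image"):
        problems.append("image URI is missing")
    # The example quotes hyperparameter values, so this sketch expects strings.
    for name, value in cfg.get("hyperparameters", {}).items():
        if not isinstance(value, str):
            problems.append(f"hyperparameter {name} should be a string")
    return problems

example_config = {
    "input_data": "s3://example-bucket/training/",
    "output_path": "s3://example-bucket/model-artifacts/",
    "role": "arn:aws:iam::123456789012:role/example-ml-role",
    "image": "123456789012.dkr.ecr.region.amazonaws.com/example-training-image",
    "hyperparameters": {"learning_rate": "0.01", "epochs": "10"},
}
broken_config = dict(example_config, role="example-ml-role", hyperparameters={"epochs": 10})

ok_problems = check_training_config(example_config)
broken_problems = check_training_config(broken_config)
```

A config that passes these checks can still fail at run time, for example when the role lacks permission on the bucket, so a check like this complements the IAM and CloudWatch log review rather than replacing it.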
Scenario and Decision-Point Practice
Service Selection Prompts
| Question | Strong answer should consider |
|---|---|
| Should this workload use batch or real-time inference? | Latency requirement, volume, cost, user interaction, output freshness |
| Should preprocessing run inside the model container or as a separate processing step? | Reuse, consistency, complexity, latency, monitoring, pipeline design |
| Should the team use a managed training job or run training manually? | Reproducibility, scaling, monitoring, permissions, automation |
| Should retraining be scheduled or triggered? | Drift, new labels, data volume, seasonality, operational risk |
| Should a custom container be used? | Dependencies, framework, compliance needs, portability, operational overhead |
| Should the model be deployed immediately after training? | Evaluation threshold, approval gate, risk, rollback plan |
| Should the team optimize model accuracy or latency? | Business goal, user experience, cost, SLA-like expectations |
Scenario Cues to Recognize
“The model worked in validation but fails on live data” Think: train/serve skew, data leakage, drift, schema mismatch.
“The endpoint is expensive and receives requests only once per day” Think: batch inference or a more cost-appropriate deployment pattern.
“A new data field was added and predictions changed unexpectedly” Think: schema validation, feature pipeline, monitoring baseline.
“A training job cannot decrypt objects” Think: IAM plus KMS permissions, not only S3 access.
“The ML team cannot reproduce last month’s model” Think: data version, code version, hyperparameters, artifacts, experiment tracking.
“The model has high accuracy but misses most fraud cases” Think: class imbalance, recall, precision-recall tradeoff, threshold.
Common Weak Areas and Traps
| Weak area | Why candidates miss it | How to fix it |
|---|---|---|
| Metric selection | They memorize metrics without mapping to business cost | For every scenario, ask what error is most expensive |
| Data leakage | It can look like strong model performance | Practice identifying future-known and target-derived features |
| IAM troubleshooting | Access errors involve multiple layers | Check role, policy, resource policy, encryption key, and network path |
| Train/serve skew | Preprocessing is often treated as an afterthought | Track how each feature is produced in training and inference |
| Batch vs real-time inference | Candidates assume all deployments need endpoints | Start with latency and interaction requirements |
| Hyperparameter tuning | Tuning is seen as a default fix | First verify data quality, metric, and feature issues |
| Monitoring | Candidates focus only on endpoint uptime | Include data drift, prediction drift, logs, and business outcomes |
| Cost optimization | Candidates overprovision for simplicity | Match compute and deployment pattern to actual workload |
| Pipelines | Candidates know notebooks but not production flow | Study artifacts, approvals, model registry, and rollback |
| Security | Candidates choose broad permissions | Apply least privilege and service roles |
Final-Week Review Checklist
High-Value Review Tasks
- Revisit all major AWS ML workflow stages from data ingestion through monitoring.
- Practice choosing between real-time, batch, and asynchronous-style inference patterns.
- Review classification and regression metrics with scenario examples.
- Review IAM role-based access for SageMaker, S3, ECR, KMS, and CloudWatch.
- Practice diagnosing training job failures.
- Practice diagnosing endpoint deployment and latency issues.
- Review data leakage, drift, and train/serve skew.
- Review model registry, approval, deployment, and rollback concepts.
- Review cost and performance tradeoffs for training and inference.
- Practice reading small configuration snippets and identifying missing pieces.
Final Readiness Self-Test
Ask yourself these questions without notes:
- Can I explain the full AWS ML lifecycle in order?
- Can I choose the correct evaluation metric for an imbalanced classification problem?
- Can I explain why a model with high validation accuracy might fail in production?
- Can I identify the permissions needed for a training job to read encrypted S3 data?
- Can I choose between SageMaker training, processing, tuning, pipelines, registry, endpoints, and batch transform?
- Can I explain how model monitoring supports retraining decisions?
- Can I recognize when cost, latency, or governance should override pure accuracy?
- Can I troubleshoot a failed endpoint deployment from logs, artifacts, input schema, and IAM clues?
- Can I describe a safe path from experiment to production deployment?
- Can I explain what should be tracked so a model can be reproduced later?
Practical Next Step
Use this Exam Blueprint to mark weak areas, then work through scenario-based practice for the AWS Certified Machine Learning Engineer – Associate (MLA-C01) exam. Prioritize questions that force you to choose between AWS services, deployment patterns, metrics, security controls, and troubleshooting paths.