AI-300 — Machine Learning Operations Engineer Exam Blueprint
Independent exam blueprint for candidates preparing for the Microsoft AI-300 Machine Learning Operations Engineer Associate exam.
How to Use This Exam Blueprint
Use this page as a practical readiness map for the Microsoft Certified: Machine Learning Operations Engineer Associate (AI-300) exam. It is organized around the kinds of MLOps decisions, artifacts, workflows, and troubleshooting tasks a candidate should be ready to reason through.
Because official weights can change, the sections below are presented as topic areas and readiness areas, not as guaranteed exam sections or scoring percentages. Use the checklist to find weak spots, then validate them with hands-on practice and scenario-based questions.
A good AI-300 candidate should be able to move beyond definitions and explain how to build, secure, automate, deploy, monitor, and improve machine learning systems using Microsoft and Azure-based MLOps patterns.
Exam identity
| Field | Value |
|---|---|
| Vendor/provider | Microsoft |
| Official exam title | Microsoft Certified: Machine Learning Operations Engineer Associate (AI-300) |
| Official exam code | AI-300 |
| Professional vertical | IT |
| Public page concept | Exam Blueprint |
| Best use | Final review, gap analysis, hands-on lab planning, scenario readiness |
Topic-area readiness table
| Readiness area | What to review | You are ready when you can… |
|---|---|---|
| Azure Machine Learning workspace and core assets | Workspaces, computes, environments, data assets, models, jobs, pipelines, endpoints, registries where applicable | Identify which artifact belongs where and explain how it supports repeatable ML operations |
| Experiment tracking and reproducibility | Runs, metrics, parameters, artifacts, lineage, MLflow-style tracking, versioned assets | Recreate or compare training runs without relying on an untracked notebook |
| Data operations for ML | Data asset versioning, schema checks, train/validation/test separation, data access controls, drift considerations | Explain how a data change can break a model pipeline and how to detect it early |
| Training automation | Command jobs, pipeline jobs, reusable components, scheduled or triggered retraining, dependency management | Convert manual training steps into a repeatable pipeline with clear inputs and outputs |
| Model evaluation and promotion | Metrics, thresholds, validation gates, approval workflows, model registry, champion/challenger patterns | Decide whether a model should be promoted, held, retrained, or rolled back |
| CI/CD for ML systems | Source control, build validation, automated tests, deployment stages, approvals, rollback plans, Azure DevOps or GitHub Actions concepts | Describe a safe release path from code commit to production model endpoint |
| Deployment and inference | Online endpoints, batch inference, blue/green or canary-style release thinking, scoring scripts, environments, scaling and rollback concepts | Choose an appropriate serving pattern for latency, volume, cost, and operational risk |
| Monitoring and observability | Logs, metrics, model performance, data drift, prediction drift, endpoint health, alerts, dashboards | Distinguish infrastructure failure from model degradation and know what evidence to check |
| Security and governance | Managed identities, RBAC, Key Vault, private networking concepts, auditability, data protection, least privilege | Design an MLOps workflow that does not depend on hard-coded secrets or broad permissions |
| Responsible AI and model risk | Bias, fairness, explainability, error analysis, transparency, human review, documentation | Include responsible AI checks in the model lifecycle instead of treating them as afterthoughts |
| Troubleshooting and operations | Failed jobs, dependency conflicts, bad environments, missing permissions, schema mismatch, endpoint failures | Triage failures using logs, job history, configuration, data lineage, and deployment state |
| Cost and resource management | Compute selection, idle resources, pipeline efficiency, endpoint sizing, tagging, cleanup | Identify common cost leaks and operational tradeoffs without needing exact price memorization |
Workspace, asset, and environment checklist
A Machine Learning Operations Engineer Associate candidate should be comfortable with the operational shape of Azure-based ML projects, not just model training concepts.
Core workspace and asset readiness
Check each item when you can explain its purpose, lifecycle, and failure modes.
- Workspace: where experiments, jobs, assets, endpoints, and collaboration are managed.
- Compute target: why training compute may differ from inference compute.
- Data asset: how versioned data improves reproducibility.
- Environment: how dependencies are packaged and reused across jobs.
- Model asset or registry entry: how trained artifacts are versioned, described, and promoted.
- Component: how a reusable pipeline step is defined and parameterized.
- Pipeline: how multiple steps are orchestrated with inputs, outputs, dependencies, and gates.
- Endpoint: how a model is exposed for online or batch inference.
- Deployment: how a specific model/environment/scoring combination serves traffic.
- Job history: how past executions support troubleshooting, auditability, and comparison.
Environment and dependency checks
| Skill | Can you do this? |
|---|---|
| Identify dependency drift | Explain why code that worked in a notebook may fail in a scheduled job |
| Pin dependencies appropriately | Know when stable dependency versions matter for repeatability |
| Separate training and inference dependencies | Avoid shipping unnecessary training packages into production inference images |
| Diagnose environment build failure | Check package conflicts, missing system libraries, invalid base images, and authentication problems |
| Use reusable environments | Avoid recreating one-off environments for every run |
| Document environment purpose | Make it clear which environment supports training, evaluation, batch scoring, or real-time serving |
Data operations and lineage readiness
ML operations often fail because data assumptions are not controlled. Be ready for questions where the most accurate model is not the operationally safest choice.
| Topic | Review focus | Ready response |
|---|---|---|
| Data versioning | Stable references to training and evaluation datasets | “I can reproduce the run because I know which data version was used.” |
| Schema validation | Required columns, data types, allowed ranges, missing values | “The pipeline should fail early if the incoming data contract is broken.” |
| Data splits | Train, validation, test, holdout, time-based splits | “I can avoid leakage and evaluate on data that reflects production use.” |
| Data access | Identities, permissions, storage access, secrets handling | “The pipeline accesses data with least privilege and no embedded credentials.” |
| Data drift | Input distribution changes | “I know when production input data differs from training data.” |
| Concept drift | Relationship between features and target changes | “The same input pattern may now imply a different outcome.” |
| Label delay | Production labels arrive later than predictions | “Monitoring must account for delayed ground truth.” |
| Lineage | Data-to-run-to-model traceability | “I can trace which data and code produced a deployed model.” |
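
The schema-validation row above says a pipeline should fail early when the incoming data contract is broken. A minimal sketch of such a check in plain Python follows. The `CONTRACT` definition, the column names, and the `validate_rows` and `require_valid` helpers are all hypothetical; they stand in for whatever validation tooling a team actually uses.

```python
from typing import Any

# Hypothetical data contract: column -> (expected type, nullable, allowed range or None).
CONTRACT = {
    "customer_id": (int, False, None),
    "age": (int, False, (0, 120)),
    "segment": (str, True, None),
}


def validate_rows(rows: list[dict[str, Any]], contract: dict = CONTRACT) -> list[str]:
    """Return human-readable contract violations; an empty list means the data passed."""
    errors = []
    for i, row in enumerate(rows):
        for column, (expected_type, nullable, bounds) in contract.items():
            if column not in row:
                errors.append(f"row {i}: missing column '{column}'")
                continue
            value = row[column]
            if value is None:
                if not nullable:
                    errors.append(f"row {i}: '{column}' must not be null")
                continue
            if not isinstance(value, expected_type):
                errors.append(f"row {i}: '{column}' expected {expected_type.__name__}")
                continue
            if bounds is not None and not (bounds[0] <= value <= bounds[1]):
                errors.append(f"row {i}: '{column}'={value} outside {bounds}")
    return errors


def require_valid(rows: list[dict[str, Any]]) -> None:
    """Fail the pipeline step early instead of training or scoring on bad data."""
    errors = validate_rows(rows)
    if errors:
        raise ValueError("data contract broken: " + "; ".join(errors))
```

Calling `require_valid` as the first step of a training or scoring job turns a silent data problem into an explicit, logged failure.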
Data failure prompts
Can you diagnose these?
- A pipeline succeeds, but production accuracy drops after a source system changes column encoding.
- A model appears to improve because validation data leaked into training.
- A retraining job uses “latest” data unintentionally and cannot be reproduced.
- A scoring job fails because a nullable field becomes required.
- A drift alert fires, but labels are not yet available to confirm performance impact.
- A data engineer updates a feature definition without updating downstream model documentation.
Training, experimentation, and reproducibility checklist
Experiment tracking readiness
You should be able to compare experiments using more than a model file name.
- Track parameters, metrics, artifacts, code version, environment, and dataset version.
- Explain why run lineage matters for audit and rollback.
- Compare candidate models using the same evaluation dataset.
- Record both training metrics and validation/test metrics.
- Identify overfitting from a gap between training and validation performance.
- Preserve logs and artifacts needed to debug failed training jobs.
- Explain how MLflow-style tracking supports repeatability and comparison.
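
As a tool-agnostic illustration of the tracking items above, the sketch below captures one run as a structured record and refuses to compare runs evaluated on different data. The `RunRecord` class, its field names, and `compare_runs` are hypothetical; in practice an MLflow-compatible tracking service would hold this information.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunRecord:
    """Hypothetical minimal record of what a tracked run should capture."""
    run_id: str
    code_version: str        # for example, a git commit hash
    train_data_version: str  # versioned training data reference
    eval_data_version: str   # versioned evaluation data reference
    environment: str         # environment name and version
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Sorted keys keep the serialized record stable and easy to diff.
        return json.dumps(asdict(self), sort_keys=True)


def compare_runs(a: RunRecord, b: RunRecord, metric: str) -> str:
    """Return the run_id with the higher metric, rejecting unfair comparisons."""
    if a.eval_data_version != b.eval_data_version:
        raise ValueError("runs were evaluated on different data versions")
    return a.run_id if a.metrics[metric] >= b.metrics[metric] else b.run_id
```

The point of the record is that a model file name alone answers none of these questions, while a stored record answers all of them.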
Training pipeline readiness
| Pipeline step | What to verify |
|---|---|
| Data ingestion | Source, identity, permissions, version, schema |
| Data validation | Required fields, ranges, types, missing values, leakage checks |
| Feature preparation | Deterministic transformations, reusable code, no manual notebook-only logic |
| Training | Parameters, compute, environment, random seeds where appropriate |
| Evaluation | Standard metrics, holdout data, threshold checks, comparison to baseline |
| Registration | Model artifact, metadata, metrics, lineage, approval state |
| Deployment | Target endpoint or batch process, scoring code, environment, release gate |
| Monitoring | Logs, alerts, performance signals, drift checks, feedback loop |
Reproducibility traps
| Trap | Why it matters | Better approach |
|---|---|---|
| “It works in my notebook” | Notebook state is often hidden and not repeatable | Package code as scripts/components with explicit inputs |
| Unversioned data path | Future runs may train on different data | Use versioned data assets or controlled snapshots |
| Untracked environment | Dependency updates can change results | Define reusable environments with known dependencies |
| Manual model copy | No lineage or approval history | Register models with metadata and promotion state |
| Metric cherry-picking | Promotion decision may be biased | Define evaluation metrics and gates before comparison |
| No baseline | Improvement is unclear | Compare to current production or accepted baseline model |
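
One way to make the "controlled snapshot" idea from the table tangible is a content fingerprint: if the fingerprint recorded with a run matches the data on disk, the run used that exact snapshot. The sketch below is an illustration only; `fingerprint_directory` is a hypothetical helper, not part of any Azure tooling, and versioned data assets remain the managed alternative.

```python
import hashlib
from pathlib import Path


def fingerprint_directory(root: str) -> str:
    """Hash file paths and contents in a stable order to identify a data snapshot."""
    digest = hashlib.sha256()
    base = Path(root)
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        # Include the relative path so renames change the fingerprint too.
        digest.update(path.relative_to(base).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Reads each file fully into memory; fine for a sketch, not for huge files.
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Storing the returned hex string alongside the run's parameters and metrics lets a later reviewer confirm which data produced a model.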
CI/CD and automation readiness
For AI-300, think like an engineer responsible for reliable ML delivery. CI/CD for ML includes code, data contracts, environments, models, and infrastructure.
Source control and branching checks
- Can you explain why model training code belongs in source control?
- Can you separate experimentation branches from release-ready code?
- Can you identify which files should be reviewed before production deployment?
- Can you describe pull request validation for ML pipelines?
- Can you explain why large datasets and model binaries are usually handled differently from source code?
CI checks for ML projects
| Check type | Examples |
|---|---|
| Code quality | Linting, formatting, static checks, security scanning |
| Unit tests | Feature functions, preprocessing logic, scoring functions |
| Data contract tests | Column existence, type checks, null handling, category validation |
| Pipeline validation | Component syntax, expected inputs/outputs, dry-run style validation where supported |
| Environment validation | Dependency resolution, image build, import checks |
| Model validation | Metric thresholds, fairness checks, baseline comparison |
| Deployment validation | Smoke test, sample inference, health probe, rollback condition |
CD and release readiness
- Explain a staged path from development to test to production.
- Describe where human approval may be appropriate.
- Distinguish deploying code from promoting a model.
- Use release gates based on evaluation metrics and operational checks.
- Plan rollback before deployment, not after failure.
- Keep deployment configuration separate from experimental code.
- Know how to handle secrets, identities, and permissions in automation.
- Explain how infrastructure as code supports repeatable environments.
Model evaluation, promotion, and governance
Metrics you should recognize
Know what each metric answers and when it can mislead.
| Metric or concept | Use it when… | Watch out for… |
|---|---|---|
| Accuracy | Classes are balanced and error costs are similar | Misleading with class imbalance |
| Precision | False positives are expensive | May reduce recall |
| Recall | False negatives are expensive | May increase false positives |
| F1 score | You need balance between precision and recall | Hides tradeoffs between precision and recall |
| ROC AUC | You compare ranking quality across thresholds | Can look strong even when chosen threshold performs poorly |
| PR AUC | Positive class is rare | Baseline equals the positive-class prevalence, so scores are not comparable across datasets with different class balance |
| RMSE | Large regression errors should be penalized more | Sensitive to outliers |
| MAE | Errors should count in proportion to their size, in the target's units | May hide rare but severe errors |
| Confusion matrix | You need class-level error visibility | Counts depend on the chosen threshold and need domain context to weigh each error type |
| Calibration | Predicted probabilities must be reliable | Good ranking does not guarantee calibrated probabilities |
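
To make the accuracy, precision, recall, and F1 rows concrete, the sketch below computes them from confusion-matrix counts and applies them to an imbalanced case. The function name and the example counts are illustrative assumptions, not figures from any real model.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative imbalanced case: 1,000 samples with only 10 positives.
# The model finds 2 of the 10 positives and raises 3 false alarms.
imbalanced = classification_metrics(tp=2, fp=3, fn=8, tn=987)
```

Here accuracy is 0.989 while recall is only 0.2 and precision 0.4, which is the class-imbalance trap the table warns about.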
Key formulas to know conceptually:
\[ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \]
\[ \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \]
\[ \text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
Promotion decision checklist
Before promoting a model, can you answer these?
- What model is currently in production?
- What baseline is the candidate compared against?
- Which data version was used for evaluation?
- Are metrics better overall and for important subgroups?
- Are error cases acceptable for the business process?
- Were responsible AI checks completed where relevant?
- Is the scoring script compatible with the deployment target?
- Is the environment reproducible?
- Are required secrets and identities configured safely?
- Is rollback tested?
- Are monitoring and alerts configured before traffic shifts?
- Is model documentation updated?
Champion/challenger thinking
| Situation | Likely decision |
|---|---|
| Candidate has better offline metrics but poor latency | Hold promotion or optimize before deployment |
| Candidate improves average performance but worsens a protected or critical subgroup | Investigate before promotion |
| Candidate passes tests but was trained on unversioned data | Do not promote until reproducibility is fixed |
| Candidate performs well in batch but endpoint smoke test fails | Fix serving path before release |
| Candidate is slightly better but operational risk is high | Consider limited rollout or additional validation |
| Production model degrades due to drift | Retrain, investigate feature changes, or roll back depending on evidence |
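
The decisions in the table can be read as an ordered set of gates. The sketch below encodes a few of them under stated assumptions: the `ModelReport` fields, the gate order, and the default thresholds (`min_gain`, `max_latency_ms`) are hypothetical and would be set by each team's own promotion policy.

```python
from dataclasses import dataclass


@dataclass
class ModelReport:
    """Hypothetical summary of one model's evaluation and readiness evidence."""
    metric: float                 # primary offline metric, higher is better
    worst_subgroup_metric: float  # same metric on the weakest critical subgroup
    p95_latency_ms: float
    data_versioned: bool
    smoke_test_passed: bool


def promotion_decision(candidate: ModelReport, champion: ModelReport,
                       min_gain: float = 0.01, max_latency_ms: float = 200.0) -> str:
    """Apply gates in order and return the first decision that applies."""
    if not candidate.data_versioned:
        return "block: fix reproducibility"
    if not candidate.smoke_test_passed:
        return "block: fix serving path"
    if candidate.worst_subgroup_metric < champion.worst_subgroup_metric:
        return "investigate: subgroup regression"
    if candidate.p95_latency_ms > max_latency_ms:
        return "hold: optimize latency"
    if candidate.metric - champion.metric < min_gain:
        return "hold: gain too small"
    return "promote"
```

Writing the gates down before comparing models is what prevents metric cherry-picking: the candidate either clears the agreed checks or it does not.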
Deployment and inference readiness
Online versus batch inference
| Decision factor | Online endpoint | Batch inference |
|---|---|---|
| Latency need | Low-latency request/response | Not immediate; scheduled or bulk processing |
| Input pattern | Individual or small request payloads | Large datasets or files |
| Output use | Application decision, API response, near-real-time workflow | Reports, downstream datasets, periodic scoring |
| Operational focus | Endpoint health, scaling, latency, traffic routing | Job success, throughput, data access, output validation |
| Common risk | Endpoint errors, bad scoring code, scaling issues | Failed batch job, schema mismatch, incomplete outputs |
Deployment checks
- Can you explain what is being deployed: model, scoring code, environment, and endpoint configuration?
- Can you distinguish an endpoint from a deployment behind that endpoint?
- Can you describe blue/green or canary-style release logic without relying on exact traffic percentages?
- Can you run a smoke test with representative sample input?
- Can you interpret endpoint logs when scoring fails?
- Can you identify whether a failure is caused by authentication, input schema, code, environment, or resource pressure?
- Can you describe how rollback returns traffic to a known-good model?
- Can you explain why production monitoring must be enabled at deployment time?
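
The endpoint-versus-deployment distinction and the rollback question above can be modeled as a traffic allocation over named deployments behind one endpoint. The sketch below is a plain-Python illustration; the deployment names `blue` and `green` and both helper functions are hypothetical, and no Azure API is called.

```python
def shift_traffic(traffic: dict[str, int], source: str, target: str, percent: int) -> dict[str, int]:
    """Move `percent` points of traffic from `source` to `target`; returns a new allocation."""
    if source not in traffic:
        raise KeyError(f"unknown deployment: {source}")
    if not 0 <= percent <= traffic[source]:
        raise ValueError(f"cannot move {percent} points from {source}")
    updated = dict(traffic)  # leave the current allocation untouched
    updated[source] -= percent
    updated[target] = updated.get(target, 0) + percent
    return updated


def rollback(traffic: dict[str, int], known_good: str) -> dict[str, int]:
    """Send all traffic back to a known-good deployment, keeping the others at zero."""
    if known_good not in traffic:
        raise KeyError(f"unknown deployment: {known_good}")
    return {name: (100 if name == known_good else 0) for name in traffic}
```

A canary-style rollout is then a sequence of small `shift_traffic` calls with monitoring between them, and `rollback` is the single step that restores the previous state.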
Scoring script readiness
A scoring script should be predictable, minimal, and observable.
| Area | Readiness check |
|---|---|
| Initialization | Model is loaded once where appropriate, not repeatedly for every request |
| Input validation | Bad payloads fail clearly and safely |
| Preprocessing | Inference preprocessing matches training preprocessing |
| Output schema | Response format is stable for downstream consumers |
| Logging | Errors are logged without exposing sensitive data |
| Dependency use | Required packages are available in the inference environment |
| Performance | Avoid unnecessary heavyweight operations during each request |
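
The checks in the table map onto the common `init()` and `run()` layout of a scoring script. The sketch below follows that layout with a stand-in model so it stays self-contained; `_StubModel`, `EXPECTED_FEATURES`, and the `{"data": [...]}` payload shape are assumptions for illustration, and a real script would load the registered model artifact instead.

```python
import json
import logging

logger = logging.getLogger("score")
model = None           # populated once by init()
EXPECTED_FEATURES = 3  # hypothetical number of values per input row


class _StubModel:
    """Stand-in for a real model; a real script would deserialize the model artifact."""

    def predict(self, rows):
        return [sum(row) for row in rows]


def init():
    """Runs once at startup: load the model a single time, not per request."""
    global model
    model = _StubModel()


def run(raw_data: str) -> str:
    """Runs per request: validate input, predict, and return a stable JSON shape."""
    try:
        rows = json.loads(raw_data)["data"]
        valid = isinstance(rows, list) and rows and all(
            isinstance(r, list) and len(r) == EXPECTED_FEATURES
            and all(isinstance(v, (int, float)) for v in r)
            for r in rows
        )
        if not valid:
            raise ValueError("unexpected payload shape")
    except (KeyError, TypeError, ValueError) as exc:
        # Log only the error type, never the payload, to avoid leaking sensitive input.
        logger.warning("rejected request: %s", type(exc).__name__)
        return json.dumps({"error": "invalid input", "predictions": None})
    return json.dumps({"error": None, "predictions": model.predict(rows)})
```

Calling `init()` once and then `run('{"data": [[1, 2, 3]]}')` locally is also a cheap smoke test before any traffic reaches the deployment.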
Monitoring, observability, and incident response
Monitoring signals to understand
| Signal | What it tells you | Example response |
|---|---|---|
| Endpoint availability | Whether the service is reachable | Check deployment state, health, recent releases |
| Latency | Whether response time is acceptable | Review scaling, model size, code path, dependencies |
| Error rate | Whether requests are failing | Inspect logs, payloads, auth, schema, scoring code |
| Resource utilization | Whether compute is constrained | Adjust configuration or optimize workload |
| Data drift | Whether input distribution changed | Validate source data changes and consider retraining |
| Prediction drift | Whether model outputs changed | Compare against expected distribution and business context |
| Model performance | Whether predictions remain correct | Evaluate when labels become available |
| Pipeline failure rate | Whether automation is reliable | Inspect component logs, permissions, data availability |
| Cost trend | Whether resources are being used efficiently | Clean idle resources, right-size compute, review schedules |
Data drift, concept drift, and performance degradation
| Issue | Symptom | What to check |
|---|---|---|
| Data drift | Production feature distribution changes | Source systems, schema, ranges, categories, missing values |
| Concept drift | Same features no longer predict the target well | Recent events, business process changes, delayed labels |
| Model decay | Performance declines over time | Monitoring metrics, retraining cadence, validation results |
| Pipeline regression | New code changes model behavior unexpectedly | Recent commits, component versions, environment changes |
| Training-serving skew | Training preprocessing differs from inference preprocessing | Feature transformation parity and scoring code |
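
For the data drift row, one widely used way to quantify a change in a numeric feature's distribution is the population stability index (PSI). The sketch below is a simple equal-width-bin version written for illustration; it is not presented as the method any particular monitoring service uses, and the bin count and `eps` floor are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample and a production sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp values outside the baseline range into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so empty bins do not cause log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical samples give a PSI of 0 and larger values mean a larger shift; the alert threshold is a team decision, and a high PSI signals input change only, not a confirmed drop in model performance.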
Incident response checklist
When a production model incident occurs:
- Confirm the symptom: outage, high latency, incorrect predictions, drift alert, or failed job.
- Identify the blast radius: one endpoint, one deployment, one model, one data source, or all pipelines.
- Check recent changes: code, data, model, environment, infrastructure, permissions.
- Review logs and metrics before making assumptions.
- Decide whether to roll back, pause, reroute, retrain, or hotfix.
- Preserve evidence: run IDs, model versions, deployment versions, logs, sample payloads.
- Communicate impact and mitigation status.
- Add a test, alert, or gate that would have caught the issue earlier.
Security, identity, and governance checklist
Identity and access control
- Prefer managed identities, or service principals whose credentials are stored securely, over hard-coded credentials.
- Apply least privilege to workspaces, storage, registries, key stores, and deployment targets.
- Understand when a pipeline identity needs data read access versus model registration rights.
- Keep secrets in a managed secret store such as Key Vault rather than source code.
- Rotate and revoke credentials according to organizational process.
- Audit who can approve model promotion or deploy to production.
- Separate development, test, and production permissions where appropriate.
Network and data protection
| Topic | Be ready to explain |
|---|---|
| Private access patterns | Why some organizations restrict public network exposure |
| Storage protection | How data access should be controlled and audited |
| Secret handling | Why secrets should not appear in notebooks, logs, YAML, or plain-text pipeline variables |
| Sensitive data | How training and inference workflows should minimize exposure |
| Logging safety | Why logs must be useful but not leak personal or confidential data |
| Environment isolation | Why production inference should not depend on uncontrolled local packages |
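
The logging-safety row can be illustrated with a filter that masks secret-looking values before a log record is written. This is a minimal sketch using the standard `logging` module; the regular expression is a hypothetical pattern that will miss some secret formats and over-match some harmless text, so treat it as a last line of defense rather than a substitute for keeping secrets out of log calls.

```python
import logging
import re

# Hypothetical pattern: a sensitive-looking name, "=", then the value up to whitespace or ";".
_SECRET_PATTERN = re.compile(r"(?i)(password|secret|token|key)\s*=\s*[^\s;]+")


def redact(text: str) -> str:
    """Replace the value part of secret-looking assignments with a mask."""
    return _SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=***", text)


class RedactSecretsFilter(logging.Filter):
    """Logging filter that masks secrets in the fully formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format first so secrets passed as arguments are also covered.
        record.msg = redact(record.getMessage())
        record.args = None
        return True  # keep the record, just with a sanitized message
```

Attaching the filter to a handler with `handler.addFilter(RedactSecretsFilter())` applies it to every record that handler emits.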
Governance artifacts
A production model should be understandable to people other than the person who trained it.
- Model purpose and intended use.
- Training data source and version.
- Evaluation data source and version.
- Key metrics and threshold decisions.
- Known limitations.
- Responsible AI review status where relevant.
- Approval history.
- Deployment history.
- Monitoring plan.
- Rollback plan.
- Owner or on-call contact.
- Retirement or replacement criteria.
Responsible AI and model risk readiness
The exam may test whether you include responsible practices in the lifecycle, not just after deployment.
| Area | Readiness prompt |
|---|---|
| Fairness | Can you check whether performance differs across meaningful groups? |
| Explainability | Can you explain why stakeholders may need feature importance or local explanations? |
| Error analysis | Can you identify where the model fails and whether failures are concentrated? |
| Human oversight | Can you decide when predictions should assist, not replace, human judgment? |
| Transparency | Can you document intended use and limitations clearly? |
| Privacy | Can you avoid unnecessary exposure of sensitive data during training and inference? |
| Monitoring | Can you detect whether model behavior changes after deployment? |
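
The fairness prompt asks whether performance differs across meaningful groups. A minimal sketch of that check is per-group recall plus the gap between the best and worst group; the function names and the binary-label convention (1 is the positive class) are assumptions for illustration.

```python
from collections import defaultdict


def recall_by_group(y_true, y_pred, groups) -> dict:
    """Recall for the positive class (label 1), computed separately per group."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            if pred == 1:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    # Only groups with at least one positive example have a defined recall.
    names = sorted(set(true_pos) | set(false_neg))
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in names}


def recall_gap(per_group: dict) -> float:
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())
```

A large gap does not prove unfairness on its own, but it is the kind of evidence that should pause a promotion until error analysis explains it.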
Scenario and decision-point checks
Use these as rapid-fire final review prompts.
| Scenario | Best readiness question |
|---|---|
| Training job fails only in the cloud, not locally | Are dependencies, paths, identities, and environment definitions explicit? |
| New model has higher accuracy but worse recall for a critical class | Which error type matters more for the business risk? |
| Endpoint returns errors after deployment | Did scoring code, payload schema, environment, or permissions change? |
| Data drift alert triggers but labels are unavailable | What can be inferred now, and what must wait for ground truth? |
| Pipeline uses a storage key embedded in code | How should identity and secrets be redesigned? |
| Model was trained manually from a notebook | What must be packaged, tracked, and versioned before production? |
| Stakeholders want automatic deployment after every training run | What evaluation gates and approvals are needed? |
| Batch scoring job produces incomplete outputs | Were input partitions, permissions, failures, and output validation checked? |
| A feature pipeline changes a categorical encoding | How do you prevent serving skew and retraining surprises? |
| Production cost spikes after retraining automation | Are compute schedules, idle resources, endpoint sizing, and job frequency controlled? |
| Rollback is requested | Is there a known-good model, environment, scoring script, and endpoint configuration? |
| A model performs well overall but poorly for one subgroup | Should promotion pause for fairness/error analysis? |
Commands, configuration, and artifact recognition
You do not need to memorize every property name to be operationally ready, but you should recognize the purpose of common commands and configuration artifacts.
Azure CLI pattern recognition
Be ready to interpret commands like these conceptually:
az ml job create --file train-job.yml
az ml model create --name <model-name> --path <model-path>
az ml online-endpoint show --name <endpoint-name>
az ml online-deployment get-logs --endpoint-name <endpoint-name> --name <deployment-name>
Can you answer?
- Is this creating a training job, registering a model, inspecting an endpoint, or retrieving deployment logs?
- Which command would help diagnose a failed deployment?
- Which artifact file likely defines inputs, compute, command, and environment?
- Which command affects production serving state versus experiment tracking?
Example job configuration concepts
A training job or pipeline component usually makes key operational assumptions explicit.
command: python train.py --training-data ${{inputs.training_data}} --model-output ${{outputs.model_output}}
environment: azureml:<environment-name>:<version>  # or azureml:<environment-name>@<label>, for example @latest
compute: azureml:<compute-name>
inputs:
  training_data:
    type: uri_folder
    path: azureml:<data-asset-name>:<version>  # a pinned version is easier to reproduce than @latest
outputs:
  model_output:
    type: uri_folder
Readiness prompts:
- Where is the training data defined?
- Where are dependencies defined?
- Where does the trained model artifact go?
- What would make this job non-reproducible?
- What permissions are required for the job to read data and write outputs?
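
The job configuration passes its input and output locations to `train.py` as command-line arguments. The sketch below shows what the receiving side of such a script might look like. It is a hypothetical skeleton: the training step is a placeholder, `--seed` is an extra argument not present in the configuration above, and the output file name is invented for illustration.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical train.py entry point")
    parser.add_argument("--training-data", required=True, help="input folder provided by the job")
    parser.add_argument("--model-output", required=True, help="output folder captured by the job")
    parser.add_argument("--seed", type=int, default=42, help="hypothetical extra parameter")
    return parser.parse_args(argv)


def main(argv=None) -> Path:
    args = parse_args(argv)
    data_dir = Path(args.training_data)
    if not data_dir.is_dir():
        # Fail clearly when the input is missing or access was denied upstream.
        raise FileNotFoundError(f"training data folder not found: {data_dir}")
    out_dir = Path(args.model_output)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder for real training: record what was consumed so the run can be audited.
    files = sorted(p.name for p in data_dir.iterdir() if p.is_file())
    artifact = out_dir / "model.txt"
    artifact.write_text(f"seed={args.seed}\nfiles={files}\n")
    return artifact

# A real script would finish by calling main() under an `if __name__ == "__main__":` guard.
```

Because every path arrives as an argument, the same script runs locally and as a job, which addresses the "works only in my notebook" trap.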
Artifact inventory
| Artifact | What you should be able to verify |
|---|---|
| Repository | Code version, branch, pull request, review history |
| Training script | Inputs, outputs, parameters, logging, error handling |
| Pipeline definition | Step order, dependencies, reusable components, gates |
| Environment file | Packages, runtime dependencies, reproducibility |
| Data asset | Version, source, schema, access permissions |
| Model artifact | Version, metrics, lineage, approval status |
| Scoring script | Input handling, preprocessing, output format |
| Endpoint configuration | Deployment target, traffic routing concept, auth, monitoring |
| Release pipeline | Stages, approvals, validation, rollback |
| Monitoring dashboard | Health, drift, performance, cost, alerts |
| Runbook | Incident steps, owners, rollback procedure |
Troubleshooting checklist
Failed training job
Check in this order:
- Job logs: error message, failing step, stack trace.
- Data access: identity, path, permissions, network restrictions.
- Data contract: missing columns, type changes, empty input, corrupt files.
- Environment: dependency conflict, missing package, incompatible runtime.
- Compute: unavailable target, quota or capacity issue, startup failure.
- Code: parameter mismatch, path assumption, local-only dependency.
- Outputs: write permissions, invalid output path, disk pressure.
Failed deployment
Check in this order:
- Deployment status and recent changes.
- Endpoint and deployment logs.
- Scoring script initialization.
- Model file path and loading logic.
- Environment dependencies.
- Request payload schema.
- Authentication and authorization.
- Resource pressure or scaling behavior.
- Rollback option.
Bad predictions after successful deployment
Check in this order:
- Is the endpoint serving the intended model version?
- Is inference preprocessing identical to training preprocessing?
- Did feature definitions change?
- Is input data within expected ranges?
- Is there data drift, concept drift, or label delay?
- Are downstream systems interpreting outputs correctly?
- Was the model promoted using the correct evaluation data?
- Should traffic be shifted, rolled back, or monitored longer?
Common weak areas and traps
| Weak area | Why candidates miss it | What to practice |
|---|---|---|
| Treating MLOps as only CI/CD | ML systems also depend on data, metrics, models, and monitoring | Trace a model from data to deployment to monitoring |
| Ignoring data lineage | Model reproducibility depends on data versioning | Recreate a training run from recorded artifacts |
| Confusing drift types | Data drift, concept drift, and performance decline require different evidence | Match symptoms to likely root causes |
| Promoting models on one metric | Single metrics can hide business or subgroup risk | Use confusion matrices, thresholds, and subgroup checks |
| Overusing broad permissions | It may work in a lab but fail governance expectations | Design least-privilege identities |
| Hard-coding secrets | Security and rotation become operational risks | Use managed secret storage and identities |
| Forgetting rollback | Release is incomplete without recovery | Define known-good model and deployment state |
| Not testing scoring code | Training success does not guarantee inference success | Run sample payload tests before traffic shift |
| Assuming notebooks are production pipelines | Hidden state and manual steps break repeatability | Convert logic to scripts/components |
| Missing environment differences | Local packages differ from cloud execution | Build and test explicit environments |
| Focusing only on model accuracy | Production ML also needs latency, reliability, explainability, and cost control | Evaluate operational metrics with model metrics |
| Waiting to monitor until after incidents | You need baseline signals before problems occur | Configure logs, metrics, alerts, and dashboards at deployment |
Final-week checklist
Three to five days before the exam
- Review the AI-300 exam identity and current Microsoft exam page for any official updates.
- Revisit each readiness area in this checklist and mark red/yellow/green.
- Complete at least one end-to-end MLOps walkthrough: data asset, training job, model registration, deployment, monitoring concept.
- Practice interpreting pipeline definitions and deployment configurations.
- Review common failure patterns for jobs, environments, endpoints, and permissions.
- Rehearse model promotion decisions using metrics, risk, and operational readiness.
- Review security patterns: managed identity, RBAC, Key Vault, network restriction concepts, and auditability.
- Review monitoring vocabulary: logs, metrics, drift, performance, alerts, lineage.
- Practice explaining rollback and incident response without looking up notes.
One to two days before the exam
- Stop trying to memorize every command option; focus on recognizing intent and troubleshooting clues.
- Rework weak scenario questions and explain why each wrong answer is wrong.
- Review metric tradeoffs: precision, recall, F1, ROC AUC, PR AUC, RMSE, MAE.
- Review deployment choices: online versus batch, staged rollout, rollback.
- Review automation choices: CI validation, CD gates, approvals, model registry promotion.
- Review data risks: leakage, schema mismatch, drift, unversioned data, label delay.
- Review responsible AI checks and documentation artifacts.
- Prepare a short mental runbook for “job failed,” “deployment failed,” and “model degraded.”
Final readiness test
You are close to ready when you can answer these without notes:
- How do you make a training run reproducible?
- What artifacts must be versioned in an ML system?
- How do you decide whether a model should be promoted?
- What is the difference between data drift and concept drift?
- How do you safely deploy a new model version?
- What should be monitored after deployment?
- How do you troubleshoot endpoint failures?
- How do you avoid hard-coded secrets in ML pipelines?
- How do you design rollback for a model release?
- How do responsible AI checks fit into an MLOps workflow?
Practical next step
Turn this checklist into a scorecard. Mark each topic as ready, needs review, or needs hands-on practice. Then focus your remaining study time on scenario-based practice for the areas marked weakest, especially model promotion, deployment troubleshooting, monitoring, identity, reproducibility, and end-to-end MLOps workflow design.