AI-300 — Microsoft Certified: Machine Learning Operations Engineer Associate Quick Reference

Compact AI-300 reference for Azure Machine Learning MLOps workflows, deployment, monitoring, security, and CI/CD decisions.

Exam identity

| Item | Detail |
| --- | --- |
| Vendor/provider | Microsoft |
| Official title | Microsoft Certified: Machine Learning Operations Engineer Associate (AI-300) |
| Exam code | AI-300 |
| Page purpose | Independent Quick Reference for real-exam preparation and original practice support |

Use this as a compact decision guide for Azure Machine Learning operations: building repeatable training workflows, packaging assets, deploying models, monitoring production behavior, and securing MLOps automation.

High-yield MLOps mental model

    flowchart LR
        A[Source control<br/>code, YAML, tests] --> B[CI validation<br/>lint, unit tests, schema checks]
        B --> C[Training pipeline<br/>data, compute, components]
        C --> D[Evaluate<br/>metrics, bias, quality gates]
        D -->|passes| E[Register asset<br/>model, env, component]
        D -->|fails| C
        E --> F[Deploy<br/>online or batch endpoint]
        F --> G[Monitor<br/>logs, metrics, drift, quality]
        G --> H[Trigger retraining<br/>manual or automated]
        H --> C

| Exam decision point | Fast rule |
| --- | --- |
| Training needs repeatability | Use Azure Machine Learning jobs, components, environments, data assets, and pipelines rather than notebook-only work. |
| Promotion across workspaces | Use versioned assets and registries; avoid “copy files by hand” patterns. |
| Low-latency scoring | Use an online endpoint. |
| Large offline scoring | Use a batch endpoint. |
| Secret handling | Use managed identities, Key Vault-backed secrets, or workspace connections; do not hard-code secrets. |
| Production change | Use staged deployment, traffic control, tests, and rollback. |
| Monitoring asks “why did it fail?” | Check job/endpoint logs, environment build, identity permissions, data paths, and scoring code. |

Azure Machine Learning object map

| Object | What it represents | Exam-relevant use |
| --- | --- | --- |
| Workspace | Top-level Azure Machine Learning boundary for assets, jobs, compute, endpoints, and collaborators. | Central control plane for MLOps. |
| Datastore | Reference to storage such as Azure Blob Storage or Azure Data Lake Storage. | Connects workspace to data without embedding storage credentials in code. |
| Data asset | Versioned reference to data used by jobs and pipelines. | Reproducibility, lineage, input binding. |
| Environment | Runtime definition: base image, conda/pip dependencies, Docker context, or curated environment. | Ensures train/deploy consistency. |
| Compute instance | Managed development workstation. | Interactive authoring and debugging; not ideal as production training compute. |
| Compute cluster | Scalable managed compute for jobs. | Training, batch jobs, parallel workloads. |
| Job | Execution unit such as command, sweep, AutoML, or pipeline job. | Repeatable training and evaluation. |
| Component | Reusable pipeline step with inputs, outputs, code, environment, and command. | Modular pipeline design and reuse. |
| Pipeline | Directed workflow of components/jobs. | End-to-end MLOps orchestration. |
| Model asset | Registered model artifact, often MLflow or custom. | Versioned deployment candidate. |
| Registry | Cross-workspace sharing and promotion of models, components, and environments. | Dev/test/prod separation and enterprise reuse. |
| Online endpoint | HTTPS scoring endpoint with one or more deployments. | Real-time inference. |
| Batch endpoint | Endpoint for asynchronous batch inference over large input datasets. | Scheduled or offline scoring. |

Service and feature selection matrix

Compute choices

| Need | Choose | Avoid choosing when |
| --- | --- | --- |
| Interactive notebooks, debugging, small experiments | Compute instance | You need scalable, repeatable production training. |
| Scalable training jobs | Compute cluster | You need always-on interactive development. |
| Pipeline execution with managed scaling | Azure Machine Learning managed compute options | You require a custom Kubernetes platform. |
| Existing Kubernetes operations model | Attached Kubernetes / Kubernetes online deployment pattern | You want Microsoft-managed endpoint infrastructure. |
| Distributed data processing | Spark integration where appropriate | The task is simple model training and does not need Spark. |
| Local smoke test | Local execution or small dev compute | It must represent production security, networking, or scale behavior. |

Job and workflow choices

| Scenario | Best fit | Key exam clue |
| --- | --- | --- |
| Run a script with parameters | Command job | “Train this script with inputs and outputs.” |
| Compare hyperparameters | Sweep job | “Find best hyperparameters.” |
| Build reusable multi-step workflow | Pipeline job | “Preprocess, train, evaluate, register.” |
| Automate model search | AutoML job | “Try algorithms/features automatically.” |
| Score many files/rows offline | Batch endpoint/job | “No real-time response required.” |
| Trigger workflow from Git commit | CI/CD pipeline invoking Azure ML CLI/SDK | “Source-controlled MLOps.” |

Deployment choices

| Requirement | Choose | Why |
| --- | --- | --- |
| Real-time HTTPS inference | Managed online endpoint | Managed production endpoint with deployments and traffic control. |
| Real-time inference on organization-managed Kubernetes | Kubernetes online endpoint/deployment pattern | Use existing Kubernetes governance and runtime. |
| Offline scoring of large datasets | Batch endpoint | Asynchronous, file/data oriented scoring. |
| Blue/green or canary release | Multiple deployments under one online endpoint | Split or shift traffic between model versions. |
| Fast rollback | Keep previous deployment available and shift traffic back | Rollback should not require rebuilding from scratch. |
| Custom request handling | Custom scoring script | Needed for non-MLflow or custom preprocessing logic. |
| Standard MLflow model serving | MLflow model deployment path | Reduces custom serving code when compatible. |

Asset versioning and lineage

| Asset | Versioning guidance | Common trap |
| --- | --- | --- |
| Code | Keep in Git with tests and review gates. | Editing production code directly in a notebook or portal. |
| Data | Use versioned data assets or immutable paths for training inputs. | Training on “latest” data without recording the exact input. |
| Environment | Pin dependencies and version environments. | Using unpinned packages that change between train and deploy. |
| Model | Register only evaluated candidates with metadata and metrics. | Deploying an unregistered artifact with no lineage. |
| Component | Version reusable pipeline steps. | Breaking old pipelines by mutating component behavior. |
| Pipeline YAML | Store with code and parameterize environment-specific values. | Manually recreating pipelines in each workspace. |
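
One way to enforce the versioning guidance in CI is to reject asset references that are not pinned to an explicit version. The sketch below is a minimal illustration, assuming references follow the `azureml:<name>:<version>` form used in this reference's YAML examples. `parse_pinned_asset` is a hypothetical helper, not part of the Azure ML CLI or SDK.

```python
import re

# Hypothetical pattern for "azureml:<name>:<version>" references.
_ASSET_REF = re.compile(
    r"^azureml:(?P<name>[A-Za-z0-9_.\-]+):(?P<version>[A-Za-z0-9_.\-]+)$"
)


def parse_pinned_asset(ref: str) -> tuple[str, str]:
    """Return (name, version), or raise ValueError if no explicit version is given."""
    match = _ASSET_REF.match(ref)
    if match is None:
        raise ValueError(f"Asset reference is not pinned to a version: {ref!r}")
    return match.group("name"), match.group("version")


print(parse_pinned_asset("azureml:churn-data:1"))  # ('churn-data', '1')
```

Apply a check like this only to versioned asset types such as data, environments, and models. Compute references such as `azureml:cpu-cluster` carry no version and would need to be excluded.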

Azure ML CLI v2 patterns

Use YAML definitions for repeatability. The exact schema depends on the asset type, but the exam often tests whether you understand what belongs in code, YAML, identities, and CI/CD.

Command job pattern

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
type: command
display_name: train-classifier
experiment_name: churn-training
code: ./src
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
  --max_epochs ${{inputs.max_epochs}}
  --model_output ${{outputs.model_output}}
inputs:
  training_data:
    type: uri_folder
    path: azureml:churn-data:1
  max_epochs: 10
environment: azureml:sklearn-train-env:1
compute: azureml:cpu-cluster
outputs:
  model_output:
    type: uri_folder

az ml job create --file train-job.yml --resource-group <rg> --workspace-name <workspace>

Component pattern

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command
name: train_component
version: 1
display_name: Train model
inputs:
  training_data:
    type: uri_folder
  learning_rate:
    type: number
outputs:
  model_output:
    type: uri_folder
code: ./src
environment: azureml:sklearn-train-env:1
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
  --learning_rate ${{inputs.learning_rate}}
  --model_output ${{outputs.model_output}}

Online endpoint and deployment pattern

# endpoint.yml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: churn-endpoint
auth_mode: key

# blue-deployment.yml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: churn-endpoint
model: azureml:churn-model:3
environment: azureml:sklearn-infer-env:2
code_configuration:
  code: ./score
  scoring_script: score.py
instance_type: <vm_size>
instance_count: <count>

az ml online-endpoint create -f endpoint.yml
az ml online-deployment create -f blue-deployment.yml --all-traffic
az ml online-deployment get-logs \
  --endpoint-name churn-endpoint \
  --name blue \
  --resource-group <rg> \
  --workspace-name <workspace>

Traffic shift pattern

az ml online-endpoint update \
  --name churn-endpoint \
  --traffic blue=90 green=10 \
  --resource-group <rg> \
  --workspace-name <workspace>

Use this for canary-style validation. For rollback, shift traffic back to the previous known-good deployment.
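
A canary rollout is usually a sequence of such traffic updates. The sketch below generates the value passed to `--traffic` for each ramp step and for a rollback. It is illustrative only: `traffic_argument` and `canary_steps` are hypothetical helpers, the ramp percentages are made up, and the sketch assumes each split should total 100.

```python
def traffic_argument(split: dict[str, int]) -> str:
    """Format a split such as {"blue": 90, "green": 10} as 'blue=90 green=10'."""
    if any(pct < 0 or pct > 100 for pct in split.values()):
        raise ValueError("Each percentage must be between 0 and 100")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"Traffic must total 100, got {total}")
    return " ".join(f"{name}={pct}" for name, pct in split.items())


def canary_steps(stable: str, candidate: str, ramp: list[int]) -> list[str]:
    """Build one traffic argument per ramp step (candidate share in percent)."""
    return [traffic_argument({stable: 100 - pct, candidate: pct}) for pct in ramp]


print(canary_steps("blue", "green", [10, 50, 100]))
# ['blue=90 green=10', 'blue=50 green=50', 'blue=0 green=100']

# Rollback: send everything back to the known-good deployment.
print(traffic_argument({"blue": 100, "green": 0}))  # blue=100 green=0
```

Validating the split before calling the CLI keeps a typo in a pipeline variable from reaching the endpoint.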

MLflow quick reference

| Task | MLflow use |
| --- | --- |
| Track parameters | mlflow.log_param() |
| Track metrics | mlflow.log_metric() |
| Track artifacts | mlflow.log_artifact() or mlflow.log_artifacts() |
| Package model | Flavor-specific logging such as mlflow.sklearn.log_model() |
| Register model | Register from run artifact or use Azure ML model registration flow. |
| Reduce custom serving code | Prefer MLflow model format when the framework and inference contract fit. |
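
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data and a default model stand in for the real training inputs.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(random_state=42)

with mlflow.start_run():
    model.fit(X_train, y_train)

    # Evaluate on held-out data before logging.
    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)

    # Log the parameter, metric, and model to the active run.
    mlflow.log_param("model_type", "random_forest")
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, artifact_path="model")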

MLflow vs custom model

| Requirement | Prefer MLflow model | Prefer custom model |
| --- | --- | --- |
| Standard framework model packaging | Yes | Maybe |
| Minimal scoring boilerplate | Yes | No |
| Custom preprocessing at inference | Maybe | Yes |
| Nonstandard request/response handling | No | Yes |
| Full control of init() and run() logic | No | Yes |

Scoring script essentials

For custom online inference, the scoring script commonly exposes initialization and request handling.

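import json
import os

import joblib
import numpy as np

def init():
    # Runs once at container startup: load the model into a module-level global.
    global model
    # AZUREML_MODEL_DIR points at the directory that holds the deployed model files.
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    # Runs per request: parse the JSON body, score it, and return serializable output.
    payload = json.loads(raw_data)
    data = np.array(payload["data"])
    predictions = model.predict(data)
    return {"predictions": predictions.tolist()}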

| Function | Purpose | Common issue |
| --- | --- | --- |
| init() | Load model and shared resources once at startup. | Model path or dependency failure causes deployment startup errors. |
| run() | Handle each request and return serializable output. | Input schema mismatch or non-JSON-serializable result. |
| Environment | Supplies runtime packages. | Missing library works locally but fails in endpoint container. |
| Logs | Diagnose startup and request failures. | Not checking endpoint deployment logs before changing infrastructure. |
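
Input schema mismatches are easier to diagnose when `run()` validates the payload before calling the model. The following is a minimal sketch under stated assumptions: the request body has the `{"data": [[...]]}` shape used in the scoring script above, and `parse_payload` and `safe_run` are hypothetical helpers rather than Azure ML APIs. How the serving runtime maps a returned error payload to an HTTP status is not covered here.

```python
import json


def parse_payload(raw_data: str) -> list:
    """Validate a request body shaped like {"data": [[...], [...]]}."""
    try:
        payload = json.loads(raw_data)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Body is not valid JSON: {exc.msg}") from exc
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('Body must be a JSON object with a "data" field')
    rows = payload["data"]
    if not isinstance(rows, list) or not rows:
        raise ValueError('"data" must be a non-empty list of rows')
    for row in rows:
        is_numeric_row = isinstance(row, list) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in row
        )
        if not is_numeric_row:
            raise ValueError("Each row must be a list of numbers")
    return rows


def safe_run(raw_data: str, predict) -> dict:
    """Score valid input; return a readable error payload for schema problems."""
    try:
        rows = parse_payload(raw_data)
    except ValueError as exc:
        return {"error": str(exc)}
    return {"predictions": list(predict(rows))}


# A stand-in "model" that sums each row, used only to exercise the wrapper.
print(safe_run('{"data": [[1, 2], [3, 4]]}', lambda rows: [sum(r) for r in rows]))
# {'predictions': [3, 7]}
print(safe_run('{"rows": []}', lambda rows: []))
# {'error': 'Body must be a JSON object with a "data" field'}
```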

Pipeline design reference

| Design concern | Recommended pattern | Exam trap |
| --- | --- | --- |
| Reuse | Build components for preprocess, train, evaluate, register. | One huge script with no reusable boundaries. |
| Inputs/outputs | Declare typed inputs and outputs. | Hidden file paths inside scripts. |
| Parameters | Pass as component or pipeline parameters. | Hard-coded values that differ by environment. |
| Reproducibility | Version code, data, environment, and model. | Reruns produce untraceable differences. |
| Quality gates | Evaluate metrics before registering or deploying. | Registering every run as production-ready. |
| Promotion | Promote versioned assets through environments. | Retraining separately in each environment when it is not required. |
| Failure handling | Make steps idempotent where possible. | Partial outputs corrupt later steps. |

Quality gate examples

| Gate | Example check |
| --- | --- |
| Data validation | Required columns exist; schema and ranges are acceptable. |
| Unit tests | Feature functions and scoring code pass tests. |
| Training metrics | Candidate beats baseline or minimum threshold. |
| Responsible AI review | Error patterns, explainability, or fairness checks reviewed where required. |
| Security check | No secrets in code, images, logs, or YAML. |
| Deployment smoke test | Endpoint responds with expected schema. |
| Production approval | Human or policy approval before full traffic shift. |
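
The training-metrics gate can be expressed as a small function that an evaluation step runs before registration. This is a sketch, not a prescribed implementation: `passes_quality_gate`, the metric names, and all thresholds are hypothetical, and it assumes every metric is higher-is-better.

```python
def passes_quality_gate(
    candidate: dict[str, float],
    baseline: dict[str, float],
    minimums: dict[str, float],
    tolerance: float = 0.0,
) -> tuple[bool, list[str]]:
    """Check higher-is-better metrics against absolute floors and a baseline model."""
    failures = []
    for metric, floor in minimums.items():
        value = candidate.get(metric)
        if value is None:
            failures.append(f"{metric}: missing from candidate metrics")
            continue
        if value < floor:
            failures.append(f"{metric}: {value:.3f} is below minimum {floor:.3f}")
        base = baseline.get(metric)
        if base is not None and value < base - tolerance:
            failures.append(f"{metric}: {value:.3f} is below baseline {base:.3f}")
    return (not failures, failures)


ok, reasons = passes_quality_gate(
    candidate={"accuracy": 0.91, "recall": 0.78},
    baseline={"accuracy": 0.89, "recall": 0.80},
    minimums={"accuracy": 0.85, "recall": 0.75},
)
print(ok)       # False
print(reasons)  # ['recall: 0.780 is below baseline 0.800']
```

In a pipeline, the evaluation step would typically exit with a non-zero status when the gate fails so that registration and deployment never run.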

Model quality metrics for gates

Use the metric aligned to business risk. Do not optimize accuracy alone when false positives and false negatives have different costs.

\[ \text{Precision} = \frac{TP}{TP + FP} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \]
\[ F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
\[ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2} \]

| Metric | Use when | Watch for |
| --- | --- | --- |
| Accuracy | Classes are balanced and error costs are similar. | Misleading with imbalanced data. |
| Precision | False positives are costly. | May miss many actual positives. |
| Recall | False negatives are costly. | May increase false positives. |
| F1 | Need balance between precision and recall. | Hides business-specific cost differences. |
| ROC AUC | Need ranking/separation quality across thresholds. | Does not choose an operating threshold. |
| RMSE | Regression with larger errors needing stronger penalty. | Sensitive to outliers. |
| MAE | Regression with interpretable average absolute error. | Does not penalize large errors as strongly as RMSE. |
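
As a worked example of the formulas above, the sketch below computes precision, recall, and F1 from confusion-matrix counts, plus RMSE and MAE from paired values. MAE is taken as the mean of the absolute errors. The counts and values are made-up illustrations, and the function names are hypothetical.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def regression_errors(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """RMSE and MAE over paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return {"rmse": rmse, "mae": mae}


# 80 true positives, 20 false positives, 40 false negatives.
scores = classification_metrics(tp=80, fp=20, fn=40)
print({k: round(v, 3) for k, v in scores.items()})
# {'precision': 0.8, 'recall': 0.667, 'f1': 0.727}

# Errors are [1, 0, -2, 0]: the single error of 2 dominates RMSE more than MAE.
errs = regression_errors([3, 5, 2, 7], [2, 5, 4, 7])
print({k: round(v, 3) for k, v in errs.items()})
# {'rmse': 1.118, 'mae': 0.75}
```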

Security, identity, and networking

Identity choices

| Identity/control | Use for | Exam cue |
| --- | --- | --- |
| User identity | Interactive development and investigation. | “Data scientist runs notebook.” |
| Service principal | Automation from CI/CD where managed identity is not available. | “Pipeline outside Azure needs to submit jobs.” |
| System-assigned managed identity | Azure resource needs an automatically managed identity. | “No credential rotation; tied to resource lifecycle.” |
| User-assigned managed identity | Shared identity across resources or stable identity lifecycle. | “Reuse same identity across deployments.” |
| Azure RBAC | Grant access to Azure resources such as storage, Key Vault, workspace. | “Least privilege access.” |
| Key Vault | Store secrets that cannot be replaced with identity-based access. | “Do not put secret in code/YAML.” |

Security decision table

| Requirement | Pattern |
| --- | --- |
| CI/CD submits Azure ML jobs | Use federated identity or service principal/managed identity with least-privilege RBAC. |
| Job reads private storage | Grant the job identity appropriate storage permissions; reference data through datastore/data asset. |
| Endpoint accesses downstream service | Use managed identity and grant only required permissions. |
| Protect secrets | Use Key Vault or secure workspace connection; never log values. |
| Restrict public access | Use private endpoints and network controls where required. |
| Control outbound traffic | Use managed network/VNet patterns and explicit outbound rules where applicable. |
| Separate dev/test/prod | Use separate workspaces, resource groups, subscriptions, or registries according to governance. |
| Audit activity | Use Azure activity logs, Azure ML job history, and monitoring logs. |

Common security traps

| Trap | Correct exam response |
| --- | --- |
| Hard-code storage keys in training script | Use identity-based access or Key Vault-backed secret. |
| Give CI/CD Owner permissions broadly | Use least privilege. |
| Use personal account for production automation | Use managed identity or service principal. |
| Open endpoint publicly when private access is required | Use private networking design. |
| Assume workspace access grants storage access | Grant required permissions on the backing data resource too. |
| Put secrets in Docker image or environment YAML | Inject securely at runtime or use managed identity. |
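
A CI security check for hard-coded secrets can start as a simple pattern scan over scripts and YAML. The sketch below is a rough heuristic under stated assumptions: the two regular expressions are illustrative, not an exhaustive or official rule set, and `find_suspect_lines` is a hypothetical helper. It returns line numbers only, so the scanner does not echo secret values into logs.

```python
import re

# Illustrative heuristics only; real scanners use far broader rule sets.
_SUSPICIOUS = [
    # Storage connection strings embed the key after "AccountKey=".
    re.compile(r"AccountKey=[A-Za-z0-9+/=]{20,}"),
    # A secret-like name assigned a literal value (placeholders such as ${{ ... }} are skipped).
    re.compile(r"(?i)(password|secret|api[_-]?key)\s*[:=]\s*['\"]?[^\s'\"${<]{8,}"),
]


def find_suspect_lines(text: str) -> list[int]:
    """Return 1-based line numbers that look like they contain a hard-coded secret."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(pattern.search(line) for pattern in _SUSPICIOUS)
    ]


sample = "\n".join(
    [
        'conn = "DefaultEndpointsProtocol=https;AccountKey=' + "A" * 40 + '"',
        "client-id: ${{ secrets.AZURE_CLIENT_ID }}",
        "password: hunter2hunter2",
    ]
)
print(find_suspect_lines(sample))  # [1, 3]
```

A scan like this complements, rather than replaces, identity-based access: the goal is for there to be no secret to find.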

CI/CD for MLOps

Typical CI/CD stages

| Stage | Activities | Output |
| --- | --- | --- |
| Validate | Lint code/YAML, run unit tests, validate schemas. | Build is accepted or rejected. |
| Build/package | Build environment or image, package components. | Versioned runnable assets. |
| Train | Submit Azure ML pipeline/job. | Run history, metrics, candidate model. |
| Evaluate | Compare metrics to thresholds and baseline. | Pass/fail deployment gate. |
| Register | Register model with metadata if gate passes. | Versioned model asset. |
| Deploy to nonprod | Create/update endpoint deployment. | Testable endpoint. |
| Smoke/integration test | Send sample requests, verify schema and latency behavior. | Promotion decision. |
| Promote to prod | Shift traffic or deploy approved model. | Production endpoint update. |
| Monitor | Collect logs, metrics, model quality signals. | Retraining or rollback signal. |

GitHub Actions sketch

name: mlops-ci

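on:
  push:
    branches: [ main ]

# Required for OIDC (federated identity) login with azure/login.
permissions:
  id-token: write
  contents: read

# Replace the placeholders with the target resource group and workspace.
env:
  RG: <rg>
  WORKSPACE: <workspace>

jobs:
  validate-and-train:
    runs-on: ubuntu-latest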
    steps:
      - uses: actions/checkout@v4

      - name: Azure login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Install Azure ML extension
        run: az extension add -n ml -y

      - name: Submit training pipeline
        run: |
          az ml job create \
            --file pipelines/train.yml \
            --resource-group $RG \
            --workspace-name $WORKSPACE

| Exam point | What to remember |
| --- | --- |
| Store YAML in repo | Infrastructure and ML workflows should be reviewable and repeatable. |
| Do not store secrets in repo | Use secure pipeline secrets or federated identity. |
| Use approval gates | Especially before production traffic shift. |
| Keep environments versioned | CI/CD should not depend on mutable runtime state. |
| Capture run outputs | Metrics and model artifacts drive promotion decisions. |

Registries and environment promotion

| Need | Workspace asset | Registry |
| --- | --- | --- |
| Use asset in one workspace | Yes | Optional |
| Share model across workspaces | Limited | Yes |
| Promote from dev to prod | Possible manually | Better fit |
| Reuse components enterprise-wide | Limited | Yes |
| Keep approved environment versions | Yes | Yes |
| Avoid retraining just to move environments | Harder | Easier with promoted assets |

Promotion pattern:

  1. Train and evaluate in development workspace.
  2. Register candidate model with metrics and metadata.
  3. Promote approved model/environment/component to registry.
  4. Deploy from registry into test or production workspace.
  5. Monitor production and create retraining trigger when needed.

Online endpoint operations

| Task | Command/action pattern | Notes |
| --- | --- | --- |
| Create endpoint | az ml online-endpoint create | Endpoint is the stable scoring URL. |
| Create deployment | az ml online-deployment create | Deployment holds model, code, environment, compute config. |
| Allocate traffic | Endpoint traffic settings | Enables blue/green and canary. |
| View logs | az ml online-deployment get-logs | First stop for startup and scoring failures. |
| Test request | Invoke endpoint with sample payload | Confirms schema and runtime behavior. |
| Roll back | Shift traffic to previous deployment | Faster than rebuilding old model. |
| Remove unused deployment | Delete after validation period | Avoid confusion and unnecessary resource use. |
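
After a test request, a smoke test still has to decide whether the response is acceptable. The sketch below checks a response body against the `{"predictions": [...]}` shape returned by the scoring script earlier in this reference. `check_scoring_response` is a hypothetical helper, and the sketch assumes one prediction per input row; sending the request itself is out of scope.

```python
import json


def check_scoring_response(body: str, expected_rows: int) -> list[str]:
    """Return a list of problems; an empty list means the smoke check passed."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(parsed, dict) or "predictions" not in parsed:
        return ['response has no "predictions" field']
    predictions = parsed["predictions"]
    if not isinstance(predictions, list):
        return ['"predictions" is not a list']
    if len(predictions) != expected_rows:
        return [f"expected {expected_rows} predictions, got {len(predictions)}"]
    return []


print(check_scoring_response('{"predictions": [0, 1]}', expected_rows=2))  # []
print(check_scoring_response('{"predictions": [0]}', expected_rows=2))
# ['expected 2 predictions, got 1']
```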

Endpoint troubleshooting

| Symptom | Likely area | Check |
| --- | --- | --- |
| Deployment fails to provision | Environment/image/compute | Build logs, dependency versions, base image, quota and capacity constraints. |
| Container starts then crashes | init() failure | Model path, missing files, package imports. |
| 4xx response | Request/auth/schema | Endpoint keys/tokens, input JSON, content type, route. |
| 5xx response | Scoring code/runtime | run() exceptions, memory, dependency mismatch. |
| Slow responses | Model/runtime/compute | Model size, preprocessing, instance type, autoscale configuration. |
| Works locally, fails in Azure | Environment or identity | Pin packages; check managed identity and resource access. |
| No access to data/service | RBAC/networking | Storage permissions, private endpoints, outbound restrictions. |

Batch endpoint operations

| Batch need | Design choice |
| --- | --- |
| Score files or large tabular data | Use batch endpoint with data input. |
| Schedule recurring scoring | Trigger from orchestration or CI/CD scheduler. |
| Parallelize work | Configure batch deployment/job parallelism options appropriate to workload. |
| Store outputs | Write predictions to configured output location. |
| Troubleshoot failed records | Inspect batch job logs and per-task errors. |

| Online endpoint | Batch endpoint |
| --- | --- |
| Synchronous request/response | Asynchronous job-style scoring |
| Low latency | Throughput over latency |
| Small payloads per request | Large datasets or many files |
| User/application-facing API | Back-office scoring pipelines |
| Traffic splitting between deployments | Deployment chosen for batch invocation |
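
Batch parallelism comes down to splitting the input into groups that workers can score independently. The sketch below shows only that idea in plain Python and does not call Azure ML. `mini_batches`, the file names, and the batch size are all hypothetical.

```python
from itertools import islice


def mini_batches(items, size: int):
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


files = [f"part-{i:03d}.csv" for i in range(7)]
for batch in mini_batches(files, 3):
    print(batch)
# ['part-000.csv', 'part-001.csv', 'part-002.csv']
# ['part-003.csv', 'part-004.csv', 'part-005.csv']
# ['part-006.csv']
```

Smaller groups spread work more evenly and limit how much is redone when one group fails, at the cost of more per-group overhead.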

Monitoring and observability

| Signal | Why it matters | Where to look |
| --- | --- | --- |
| Job status and duration | Detect failed or slow training pipelines. | Azure ML job history and logs. |
| Training metrics | Determine whether a candidate should be registered. | MLflow/run metrics. |
| Endpoint request count | Understand usage and capacity. | Endpoint/Azure Monitor metrics. |
| Endpoint latency | Detect performance regression. | Endpoint metrics and logs. |
| Endpoint failures | Detect runtime or caller issues. | Deployment logs, application logs. |
| Resource utilization | Right-size compute and troubleshoot bottlenecks. | Azure Monitor metrics. |
| Data distribution | Detect potential feature/data drift. | Model monitoring or custom validation. |
| Prediction quality | Confirm model still performs after labels arrive. | Offline evaluation against ground truth. |
| Responsible AI findings | Identify fairness, explainability, or error-pattern concerns. | Responsible AI artifacts/dashboards where used. |

Monitoring decision cues

| If the question says… | Prefer… |
| --- | --- |
| “Endpoint returns errors after deployment” | Deployment logs and scoring script diagnostics. |
| “Need to detect degraded model quality when labels become available” | Compare predictions to ground truth and trigger retraining. |
| “Need operational metrics and alerts” | Azure Monitor metrics and alerts for endpoints and resources. |
| “Need experiment metrics and model lineage” | Azure ML run history and MLflow tracking. |
| “Need to understand model behavior across cohorts” | Error analysis, explainability, or responsible AI review artifacts. |
| “Need automatic retraining when data changes” | Scheduled/event-triggered pipeline with validation and approval gates. |
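
For the custom-validation route to drift detection, one widely used statistic is the Population Stability Index (PSI). It is introduced here purely as an illustration: it is not named elsewhere in this reference, and it is not claimed to be what Azure ML model monitoring computes. The sketch bins a baseline sample, compares the share of values per bin in a newer sample, and returns a score that is 0 for identical distributions and grows as they diverge. The bin count and sample data are arbitrary assumptions.

```python
import math


def population_stability_index(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """PSI between a baseline sample and a newer sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when the baseline is constant

    def shares(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # roughly uniform on [0, 1)
shifted = [0.3 + i / 100 for i in range(100)]   # same shape, shifted upward

print(population_stability_index(baseline, baseline))  # 0.0
print(population_stability_index(baseline, shifted) > 0)  # True
```

Any alerting threshold on the score is a tuning decision for the specific feature and business risk, and a drift signal should feed the validation and approval gates rather than trigger deployment directly.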

Responsible AI in MLOps

| Concern | Operational control |
| --- | --- |
| Explainability | Capture explanations or feature importance where appropriate. |
| Fairness | Evaluate performance across relevant cohorts. |
| Error analysis | Identify segments where the model fails disproportionately. |
| Transparency | Keep model metadata, intended use, limitations, and evaluation results. |
| Human review | Add approval gates for high-impact changes. |
| Monitoring | Recheck quality and cohort behavior after deployment. |

Common exam distinction: responsible AI is not only a training-time concern. Operational workflows should preserve evidence, review results, and monitor production behavior.

Data management decisions

| Need | Use | Avoid |
| --- | --- | --- |
| Reference existing cloud data | Datastore plus data asset | Copying data into arbitrary local folders. |
| Reproducible training | Versioned data asset or immutable path | “Latest” path with no version record. |
| Pipeline input binding | Declared input in YAML/component | Hidden path inside script. |
| Large file/folder input | URI folder/file style assets | Embedding large data in repo. |
| Tabular schema-aware input | MLTable-style data asset where appropriate | Manually parsing inconsistent files repeatedly. |
| Secure access | Managed identity/RBAC | Storage keys in scripts. |

Environment and dependency decisions

| Requirement | Recommended pattern |
| --- | --- |
| Fast start with common frameworks | Curated environment if it fits. |
| Custom packages | Custom environment with conda/pip dependencies. |
| Native libraries or system packages | Dockerfile/base image approach. |
| Training/deployment parity | Use compatible or same dependency versions for train and inference. |
| Reproducibility | Pin package versions and version the environment. |
| Security | Scan/review images and avoid secrets baked into images. |
| Troubleshooting | Check image build logs and import errors first. |
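
Pinning can be checked automatically during CI validation. The sketch below flags pip-style requirement lines that lack an exact `==` version. It is a simplified illustration: `unpinned_requirements` is a hypothetical helper, the package versions are examples only, and the pattern does not handle every requirement syntax (URLs, environment markers, and so on).

```python
import re

# Simplified: "<name>==<version>", optionally with extras such as name[extra].
_PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\]]+==[^=\s]+$")


def unpinned_requirements(lines: list[str]) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not _PINNED.match(line):
            problems.append(line)
    return problems


requirements = [
    "scikit-learn==1.4.2",
    "pandas>=2.0",
    "mlflow",
    "# inference-only dependencies",
    "numpy==1.26.4",
]
print(unpinned_requirements(requirements))  # ['pandas>=2.0', 'mlflow']
```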

Infrastructure as code and governance

| Governance need | Pattern |
| --- | --- |
| Repeat workspace creation | Use ARM/Bicep/Terraform or approved IaC tooling. |
| Environment separation | Separate dev/test/prod workspaces and controlled promotion. |
| Policy enforcement | Use Azure Policy/RBAC/resource locks where appropriate. |
| Auditability | Keep changes in source control and CI/CD logs. |
| Network consistency | Define private endpoints, VNets, DNS, and outbound rules as code. |
| Least privilege | Assign roles to managed identities/service principals per environment. |

Scenario cue table

| Scenario wording | Likely answer |
| --- | --- |
| “Data scientist needs a cloud VM for notebooks” | Compute instance. |
| “Training should scale down when jobs finish” | Compute cluster or managed job compute pattern. |
| “Reusable preprocessing step across pipelines” | Command component. |
| “Pipeline should fail if accuracy is below threshold” | Evaluation component with quality gate before registration/deployment. |
| “Model must be served with HTTPS for applications” | Online endpoint. |
| “Score millions of records overnight” | Batch endpoint. |
| “Deploy new model to 10% of users” | Second online deployment with traffic split. |
| “Rollback quickly after errors” | Shift traffic back to previous deployment. |
| “Share approved model between workspaces” | Registry. |
| “Pipeline needs storage access without credentials in code” | Managed identity with RBAC. |
| “Private access only” | Private endpoint/network-restricted workspace and endpoint design. |
| “Endpoint cannot import Python package” | Fix environment image/dependencies. |
| “Need run metrics and lineage” | MLflow/Azure ML experiment tracking. |
| “Need to trigger retraining from production drift signal” | Monitoring plus scheduled/event-triggered pipeline. |
| “Need manual approval before production” | CI/CD environment approval or release gate. |

Common traps to avoid

| Trap | Better answer |
| --- | --- |
| Treat notebooks as production workflows | Convert to scripts, components, jobs, and pipelines. |
| Deploy directly from local files | Register versioned model and environment assets. |
| Use online endpoint for offline bulk scoring | Use batch endpoint. |
| Use batch endpoint for user-facing low-latency API | Use online endpoint. |
| Rebuild a different model in prod | Promote the evaluated model artifact. |
| Ignore environment parity | Align train and inference dependencies. |
| Store secrets in YAML | Use secure identity or Key Vault-backed configuration. |
| Assume RBAC on workspace grants data permissions | Grant access on storage/data resources too. |
| Replace deployment in place with no fallback | Use blue/green or canary with rollback path. |
| Monitor only infrastructure | Also monitor data, predictions, labels, and business quality. |
| Register every experiment | Register only candidates that pass evaluation criteria. |
| Use broad permissions for automation | Apply least privilege to service principals or managed identities. |

Last-minute checklist

  • Know the difference between workspace, registry, model, environment, component, job, pipeline, endpoint, and deployment.
  • Choose online endpoints for real-time inference and batch endpoints for offline scoring.
  • Use versioned data, code, environments, and models for reproducibility.
  • Use MLflow/Azure ML tracking for metrics, artifacts, and lineage.
  • Use managed identities/RBAC instead of embedded credentials.
  • Use CI/CD to validate, train, evaluate, register, deploy, test, and promote.
  • Use traffic splitting for canary/blue-green deployment and rollback.
  • Troubleshoot endpoints from logs, scoring script behavior, dependencies, identity, and networking.
  • Include monitoring for operational health and model quality.
  • Treat responsible AI outputs as part of the operational evidence chain.

Practical next step

Use this Quick Reference to drill scenario questions: for each prompt, identify the Azure Machine Learning asset, compute option, identity pattern, deployment type, and monitoring action before checking the answer.
