AI-103 — Microsoft Azure AI Apps and Agents Developer Associate Quick Reference

Compact exam-prep reference for Microsoft AI-103 covering Azure AI Foundry, Azure OpenAI, agents, RAG, Azure AI Search, security, evaluation, and operations.

Exam identity and study focus

This independent Quick Reference supports candidates preparing for the Microsoft Microsoft Azure AI Apps and Agents Developer Associate (AI-103) exam. Use it as a compact review of high-yield design choices, implementation patterns, and troubleshooting points for Azure AI apps and agent-based solutions.

ItemReference
Vendor/providerMicrosoft
Exam titleMicrosoft Azure AI Apps and Agents Developer Associate (AI-103)
Exam codeAI-103
Candidate focusBuild, integrate, secure, evaluate, and operate AI apps and agents on Azure
Core services to recognizeAzure AI Foundry, Azure OpenAI in Azure AI Foundry, Azure AI Search, Azure AI services, Azure AI Content Safety, Azure Monitor/Application Insights, Microsoft Entra ID, Key Vault, Storage

High-yield architecture map

    flowchart LR
	    U[User or app client] --> A[AI app API / orchestration layer]
	    A --> ID[Microsoft Entra ID / managed identity]
	    A --> LLM[Azure OpenAI / model deployment]
	    A --> AG[Agent service or agent runtime]
	    AG --> T[Tools: functions, APIs, code, search, workflows]
	    A --> R[Retriever]
	    R --> S[Azure AI Search index]
	    S --> D[Blob, files, DBs, documents]
	    A --> CS[Content safety and policy checks]
	    A --> MON[Tracing, logs, evaluations, metrics]
	    LLM --> A
	    T --> AG
	    CS --> A

High-yield mental model:

  1. Model generates or reasons.
  2. Retrieval grounds answers in enterprise data.
  3. Tools let the model or agent take actions.
  4. Security controls identity, data access, networking, and secrets.
  5. Evaluation proves quality, safety, and groundedness before and after release.
  6. Observability helps troubleshoot latency, token use, model errors, unsafe outputs, and poor retrieval.

Service-selection matrix

NeedUsually chooseWhyExam trap
Build generative AI app with model deployments, prompts, evaluations, and project assetsAzure AI FoundryCentral workspace for model-centric AI app developmentDo not treat Foundry as only a portal; know project, model, deployment, connection, evaluation, and tracing concepts
Call GPT-style models from an appAzure OpenAI in Azure AI FoundryManaged access to OpenAI models through Azure controlsIn Azure calls, the model value often refers to the deployment name, not just the base model name
Chat over private documentsAzure AI Search + Azure OpenAIRetrieval-augmented generation with indexed chunks and citationsFine-tuning is not the default answer for changing private facts
Multi-step assistant that chooses toolsAzure AI Foundry Agent Service or agent frameworkAgent instructions, tools, threads/runs, and tool-call orchestrationAgents increase non-determinism; use deterministic workflows for fixed business processes
Enterprise search over text and vectorsAzure AI SearchKeyword, vector, hybrid, filtering, semantic rankingSemantic ranking is not a security boundary
Extract tables, key-value pairs, layout, or fields from formsAzure AI Document IntelligenceDocument layout and extraction modelsOCR alone is not enough for structured document extraction
Classify, extract, summarize, or analyze natural language with prebuilt APIsAzure AI Language or generative modelUse task-specific APIs for predictable NLP; use LLMs for flexible generationDo not overuse LLMs when a deterministic AI service API fits
Speech transcription or text-to-speechAzure AI SpeechSpeech-to-text, text-to-speech, speech translation patternsAudio quality, language, and diarization requirements affect design
Image analysis or OCRAzure AI Vision / Document IntelligenceImage tagging, OCR, document layout depending on inputChoose Document Intelligence for document structure, not just images
Moderate unsafe text or imagesAzure AI Content Safety and Azure OpenAI content filtersDetect harmful content, jailbreak attempts, protected categories, or policy violationsContent filtering is not a full compliance program
Store secrets and keysAzure Key VaultCentral secret management and rotation supportPrefer managed identity where possible instead of distributing keys
Monitor production AI appAzure Monitor, Application Insights, Foundry tracing/evaluation featuresLogs, traces, metrics, failures, latency, quality signalsDo not log sensitive prompts/responses without a privacy plan

Core app patterns

PatternUse whenMain componentsAvoid when
Direct chat/completionUser asks general questions or app needs generated textApp API, prompt, model deploymentAnswers require current private data or strict traceability
Grounded chat / RAGAnswers must use enterprise documentsChunking pipeline, embeddings, Azure AI Search, prompt with retrieved contextSource content is highly structured and better served by direct database queries
Agentic RAGAssistant must search, reason, call tools, and iterateAgent, tools, retrieval, thread/run state, policy controlsA fixed workflow can meet the requirement more reliably
Tool/function callingModel chooses from app-defined operationsFunction schema, tool-call handler, validation, execution layerThe action is high-risk and needs human approval or deterministic rules
Workflow-first automationSteps are known and must be auditableAPI workflow, rules engine, Logic Apps/Functions, optional LLM stepThe task requires flexible open-ended reasoning
Fine-tuningNeed consistent style, format, or task behavior from examplesTraining examples, evaluation set, model deploymentNeed to add frequently changing facts; use RAG instead
Task-specific AI serviceNeed predictable extraction/classification/speech/visionAzure AI Language, Speech, Vision, Document IntelligenceNeed open-ended reasoning across many task types

Azure AI Foundry concepts

ConceptWhat to know for AI-103
ProjectOrganizes app assets such as models, deployments, data connections, prompts, evaluations, and traces
Model catalogPlace to discover foundation models and select models for deployment or inference
Model deploymentApp-facing deployed model endpoint/configuration; applications call deployments
Prompt engineeringIterative design of instructions, examples, constraints, grounding, and output format
EvaluationMeasures quality and safety using test data, metrics, and comparison runs
TracingCaptures app/agent execution steps for debugging prompts, retrieval, tools, and latency
ConnectionsSecure references to resources such as storage, search, model endpoints, and external services
AgentsAssistants that use instructions, models, tools, and conversation state to perform tasks

Foundry development checklist

  • Create or select the Azure AI project/resource.
  • Deploy or select a suitable model.
  • Define the app pattern: direct model call, RAG, agent, or workflow.
  • Configure connections to data sources, indexes, tools, and storage.
  • Build prompts with clear instructions, grounding rules, and output constraints.
  • Add content safety and input/output validation.
  • Evaluate with representative prompts and expected outcomes.
  • Deploy through an app/API layer with managed identity where possible.
  • Monitor traces, latency, token use, model errors, safety flags, and user feedback.

Azure OpenAI and model interaction reference

Building blocks

Building blockPurposeCommon exam distinction
System/developer instructionsDefine assistant behavior, constraints, and roleMore durable than user text, but not a security boundary
User messageEnd-user requestMust be validated and checked for prompt injection
Assistant messageModel responseCan be used as conversation history, but manage token growth
ContextRetrieved or supplied factsThe model only knows private data if you provide or connect it
EmbeddingsNumeric representation of text for similarityQuery and indexed vectors must be generated consistently
Tool/function definitionSchema for actions the model may requestThe app executes the function; the model does not directly access your systems
Structured outputJSON or schema-constrained responseStill validate output before using it
StreamingIncremental token deliveryImproves perceived latency but complicates moderation and logging

Model parameter quick reference

ParameterEffectPractical guidance
TemperatureHigher means more varied/random outputLower for factual, deterministic, or formatted answers
Top-pControls nucleus samplingUsually tune either temperature or top-p, not both aggressively
Max output tokensCaps response lengthSet based on UX and cost/latency requirements
Stop sequencesStop generation at defined textUseful for templates, delimiters, or multi-part prompts
Frequency/presence penaltiesDiscourage repetition or encourage noveltyUse carefully; can reduce consistency
Response format / schemaRequests structured outputAlways parse and validate in code

Prompt design checklist

GoalPrompt tactic
Grounded answer“Use only the provided context. If context is insufficient, say what is missing.”
Citation supportInclude source IDs/URLs in retrieved context and require citations by source ID
Tool disciplineTell the model when it must use a tool versus when it may answer directly
JSON outputProvide schema, valid example, and instruction to return only JSON
SafetyInclude prohibited behaviors, escalation instructions, and human handoff rules
Injection resistanceTreat retrieved/user content as data, not as higher-priority instructions

Minimal Azure OpenAI call pattern

import os
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    azure_ad_token_provider=token_provider,
    api_version=os.environ["AZURE_OPENAI_API_VERSION"]
)

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"],  # Azure deployment name
    messages=[
        {"role": "system", "content": "Answer using concise technical language."},
        {"role": "user", "content": "Explain hybrid search in RAG."}
    ],
    temperature=0.2
)

print(response.choices[0].message.content)

Exam points:

  • Prefer Microsoft Entra ID and managed identities for production when supported.
  • API keys are easier for quick tests but increase secret-management risk.
  • The model deployment name is a frequent source of 404 or deployment-not-found errors.
  • Token budget includes instructions, history, retrieved context, tool schemas, and response.

RAG pipeline

    flowchart LR
	    A[Source documents] --> B[Load and crack documents]
	    B --> C[Clean, split, chunk]
	    C --> D[Enrich: OCR, metadata, extraction]
	    D --> E[Create embeddings]
	    E --> F[Index in Azure AI Search]
	    Q[User question] --> G[Embed / rewrite query]
	    G --> H[Retrieve: keyword, vector, hybrid]
	    F --> H
	    H --> I[Prompt with context + citations]
	    I --> J[Generate answer]
	    J --> K[Evaluate and monitor]

Chunking and indexing decisions

DecisionGood default thinkingTrap
Chunk sizeLarge enough for meaning, small enough for precise retrievalEntire documents often dilute relevance and exceed context budget
OverlapAdd overlap when concepts span chunk boundariesToo much overlap increases cost and duplicate results
MetadataStore source, page, section, timestamp, owner, ACLs, content typeWithout metadata, filtering and citations are weak
Embedding modelUse the same embedding approach for documents and queriesMixing incompatible embeddings breaks similarity quality
ReindexingRe-run indexing when source data or enrichment logic changesRAG does not automatically know changed documents unless ingestion updates the index
Security trimmingApply filters based on user authorizationSearch relevance is not authorization

Azure AI Search components

ComponentPurposeExam notes
IndexSearchable schema and stored document chunksFields can be searchable, filterable, sortable, facetable, retrievable, vectorized
Data sourceConnection to source data for indexersCommonly storage or supported data platforms
IndexerPulls data from source into indexUseful for scheduled or repeatable ingestion
SkillsetEnrichment pipeline such as OCR, extraction, language, or custom skillsAdds structure before indexing
AnalyzerControls tokenization and text processingImportant for language-specific search behavior
Vector fieldStores embedding vectorsQuery vectors must align with index configuration
Semantic rankingImproves natural-language ranking and captions where configuredEnhances relevance; does not enforce security
FiltersRestrict results by metadata or ACL fieldsCritical for tenant, user, or department isolation
Synonym mapExpands equivalent termsHelpful for domain vocabulary
Scoring profileBoosts selected fields or freshnessUseful when ranking needs business tuning

Retrieval modes

Retrieval modeBest forLimitations
Keyword searchExact terms, IDs, names, product codesMisses semantic matches
Vector searchConceptual similarity and paraphrasesCan return plausible but contextually wrong chunks
Hybrid searchCombines keyword and vector signalsOften strong for enterprise RAG
Semantic rankingRe-ranks top results for natural-language relevanceWorks after initial retrieval; not a replacement for good indexing
Filtered retrievalEnforces scope such as user, region, product, or document typeOverly strict filters can hide relevant context

RAG retrieval snippet

## Conceptual pattern: embed query, retrieve chunks, then pass context to the model.
query = "What is the refund exception process for enterprise customers?"

query_vector = embed(query)  # Use the same embedding strategy as the indexed chunks.

results = search_client.search(
    search_text=query,
    vector_queries=[
        {
            "vector": query_vector,
            "fields": "contentVector",
            "k_nearest_neighbors": 5
        }
    ],
    filter="department eq 'Support'",
    select=["content", "source", "page", "lastUpdated"],
    top=5
)

context = "\n\n".join(
    f"[{r['source']} p.{r['page']}]\n{r['content']}" for r in results
)

RAG failure-to-fix table

SymptomLikely causeFix
Answer is fluent but wrongRetrieved context is irrelevant or missingInspect retrieved chunks; tune chunking, hybrid search, filters, and prompts
Answer lacks citationsSource metadata missing or prompt does not require citationsStore source/page IDs and require citation format
User sees unauthorized contentNo security trimming or wrong filterAdd per-user/tenant ACL fields and enforce filters before generation
Model ignores contextPrompt allows outside knowledge or context too noisyStrengthen grounding instruction and improve retrieval precision
High latencyToo many retrieval calls, large context, slow toolsCache, reduce top-k, compress context, parallelize safe calls
Poor recallChunks too small/large, weak synonyms, no hybrid searchTune chunks, add metadata, use hybrid/semantic ranking
Stale answersIndex not refreshedSchedule or trigger ingestion updates

Agents and tool calling

Agent concepts

ConceptMeaningCandidate reminder
AgentModel-backed assistant configured with instructions and toolsUse for flexible multi-step tasks
InstructionsPersistent behavior and policy guidanceKeep concise, explicit, and testable
Thread/sessionConversation stateManage retention, privacy, and token growth
Run/executionOne agent processing cycleA run may require tool outputs before completion
ToolCapability exposed to the agentExamples: function, search, file retrieval, code, workflow, API
Tool callModel-requested action with argumentsValidate arguments before execution
Tool outputResult returned to agentSanitize tool output to reduce prompt injection
Human approvalManual gate for sensitive actionsUse for irreversible, financial, legal, or high-impact actions

Agent vs function vs workflow

RequirementBest fitWhy
“Answer questions about these files”RAG or file-search-capable agentRetrieval is the primary need
“Book a meeting, email summary, update CRM”Agent with tools, or workflow with LLM stepAgent can select tools; workflow is safer if sequence is fixed
“Always run these 5 steps in this order”Deterministic workflowEasier to audit and test
“Decide which diagnostic command to run next”AgentRequires iterative reasoning
“Call one known API based on user intent”Function callingLighter than a full agent
“Generate strictly formatted output”Direct model call with schemaAgent may be unnecessary

Tool/function calling pattern

## Pseudocode: the app, not the model, executes tools.
messages = [
    {"role": "system", "content": "Use tools for account lookups. Do not invent account data."},
    {"role": "user", "content": "What is the status of order A123?"}
]

model_response = call_model(messages, tools=[get_order_status_schema])

if model_response.requests_tool:
    tool_name = model_response.tool_name
    args = validate_json(model_response.tool_arguments)

    if tool_name == "get_order_status":
        tool_result = get_order_status(order_id=args["order_id"])

    messages.append(model_response.as_message())
    messages.append({
        "role": "tool",
        "tool_call_id": model_response.tool_call_id,
        "content": sanitize(tool_result)
    })

    final_response = call_model(messages, tools=[get_order_status_schema])

Tool-calling traps:

  • Validate tool arguments even if the schema is strict.
  • Apply authorization before executing the requested action.
  • Treat tool outputs and retrieved documents as untrusted text.
  • Use idempotency keys or confirmation for actions that change state.
  • Log tool traces without exposing secrets or sensitive data.
  • Set max iterations to avoid runaway agent loops.

Azure AI services quick grid

Service areaUse forHigh-yield distinction
Azure AI LanguageSentiment, key phrases, entity recognition, PII detection, classification, conversational language understandingUse when a prebuilt or custom NLP API is more predictable than an LLM prompt
Azure AI SpeechSpeech-to-text, text-to-speech, speech translationAudio format, language, latency, and speaker requirements matter
Azure AI VisionImage analysis, OCR/image understanding scenariosUse Document Intelligence when document structure is central
Azure AI Document IntelligenceLayout, tables, key-value pairs, prebuilt/custom document extractionBest for forms, invoices, receipts, contracts, and structured document processing
Azure AI TranslatorText translationPrefer for translation workloads instead of prompting a general model
Azure AI Content SafetyHarmful content detection and safety controlsComplements Azure OpenAI content filters and app policy logic
Azure AI SearchIndexing and retrieval for enterprise contentCore service for scalable RAG grounding

Security, identity, and governance

Identity and access choices

ControlPreferUse whenTrap
Managed identityAzure-hosted apps accessing Azure resourcesApp Service, Functions, AKS, VM, Container Apps, workflowsRole assignment still required
Microsoft Entra ID token authProduction service-to-service accessSupported SDKs and enterprise authWrong token scope or tenant causes auth failures
API keysQuick tests or unsupported identity scenarioLocal prototypes or simple integrationStore in Key Vault; do not hard-code
Key VaultSecrets, keys, certificatesCentral secret lifecycleApp still needs identity to read secrets
RBACResource and data-plane permissionsLeast privilege accessContributor at subscription scope is usually excessive
Private endpoint/network controlsRestrict public exposureSensitive data or enterprise network requirementsDNS and routing must be configured correctly

Data and prompt security checklist

  • Classify data before sending it to model, search, logging, or evaluation systems.
  • Use least privilege for app identity to Search, Storage, Key Vault, and AI resources.
  • Apply user-level or tenant-level filters before retrieval.
  • Remove or mask sensitive data in logs and traces.
  • Do not put secrets in prompts, tool schemas, system messages, or source documents.
  • Validate model output before database writes, API calls, or user-visible actions.
  • Use human approval for high-impact operations.
  • Treat prompt injection as an application security issue, not just a prompt wording issue.

Prompt injection defenses

Attack patternDefense
Retrieved document says “ignore previous instructions”Delimit retrieved content and state that it is untrusted data
User asks for hidden system promptRefuse disclosure and avoid placing secrets in prompts
User asks agent to call unauthorized toolCheck authorization in code before tool execution
Malicious source includes fake citationGenerate citations from metadata, not from document text alone
Tool output contains instructionsSanitize and summarize tool output before returning it to the model

Evaluation and responsible AI

Quality and safety evaluation matrix

Evaluation targetWhat to measurePractical method
GroundednessResponse is supported by retrieved contextCompare answer claims to source chunks
RelevanceResponse answers the user’s questionUse labeled test prompts or evaluator model
Retrieval qualityRight chunks appear in top resultsInspect recall/precision by query set
Citation qualityCitations point to correct sourcesValidate source IDs/pages against answer claims
CoherenceResponse is clear and logically structuredHuman review or automated scoring
SafetyHarmful, disallowed, or policy-violating contentContent Safety checks and adversarial tests
RobustnessHandles ambiguous, malicious, or edge-case promptsRed-team prompt set
LatencyMeets user experience needsTrace model, retrieval, and tool durations
Cost/token useFits budget and throughput goalsTrack prompt size, context size, completion size

Responsible AI controls

ControlUse forNotes
Content filtersModel input/output safety enforcementBuilt into Azure OpenAI flows depending on configuration
Azure AI Content SafetyModeration and harm detection across app contentUseful for custom moderation workflows
Grounding checksDetect unsupported claimsImportant for enterprise Q&A
Human reviewEscalation and high-impact decisionsEspecially for sensitive or irreversible actions
Abuse monitoringDetect misuse patternsCombine telemetry, rate limits, and policy
Feedback captureImprove prompts, retrieval, and toolsKeep feedback privacy-aware

Deployment and operations

Production readiness checklist

AreaCheck
App architectureSeparate client, orchestration/API layer, model calls, retrieval, and tools
IdentityUse managed identity or Entra ID where possible
SecretsStore keys in Key Vault; rotate and audit access
RetrievalTest index freshness, metadata filters, and citation accuracy
PromptingVersion prompts and evaluate before release
ToolsValidate arguments, authorize actions, handle retries and timeouts
SafetyRun input/output moderation and policy checks
ObservabilityTrace model calls, retrieval, tool calls, failures, latency, and token use
ReliabilityImplement retries with backoff for transient errors
PrivacyRedact or avoid sensitive prompt/response logging
EvaluationMaintain regression set for quality and safety
RollbackKeep known-good prompt/model/config versions

Troubleshooting quick table

Symptom/errorCommon causeResponse
401 UnauthorizedBad credential, expired token, wrong auth methodCheck identity, key, token acquisition, and SDK config
403 ForbiddenIdentity lacks role or network access blockedVerify RBAC/data-plane roles, private endpoint, firewall
404 deployment/resource not foundWrong endpoint, resource, deployment name, or regionConfirm endpoint and Azure deployment name
429 throttlingToo much concurrency or request volumeRetry with exponential backoff, queue, reduce parallelism
5xx/transient errorsService or network transient issueRetry safely, add circuit breaker, monitor status
JSON parse failureModel did not follow output formatUse schema/structured output, lower temperature, validate and retry
Tool loopAgent keeps requesting toolsLimit iterations, improve instructions, return clearer tool errors
Hallucinated answerWeak grounding or missing contextImprove retrieval, require “insufficient information” behavior
High token useLong history, excessive context, verbose toolsSummarize history, reduce chunks, compress tool output
Slow responseRetrieval/tool/model latencyTrace each step, stream output, cache safe results

Common AI-103 exam traps

TrapCorrect exam mindset
“Use fine-tuning for private knowledge”Use RAG for changing or source-grounded private data; fine-tune for behavior/style/task examples
“The LLM securely enforces permissions”Your app must enforce identity, authorization, filters, and tool permissions
“Prompt instructions are security controls”Prompts help behavior but are not sufficient security boundaries
“Vector search is always better than keyword search”Hybrid search often performs better for enterprise content
“Semantic ranking controls access”It ranks results; it does not authorize users
“Agent equals workflow”Agents choose steps dynamically; workflows execute defined logic
“Tool schemas guarantee safe execution”Validate, authorize, sanitize, and log in application code
“Content filters replace app policy”Filters are one layer; add business rules, review, and monitoring
“More retrieved chunks always improve answers”Too much context can add noise, cost, and latency
“Conversation history can grow forever”Summarize, truncate, or selectively retain context
“Logging everything helps debugging”AI logs may contain sensitive data; design privacy-aware telemetry
“Model name and deployment name are interchangeable”Azure app calls commonly use the deployment name configured in Azure

Rapid review checklist

Before practice, make sure you can explain:

  • When to use Azure AI Foundry, Azure OpenAI, Azure AI Search, Azure AI services, and Azure AI Content Safety.
  • The difference between direct prompting, RAG, tool calling, and agents.
  • How embeddings, chunking, metadata, filters, and hybrid search affect RAG quality.
  • Why managed identity, RBAC, Key Vault, private networking, and data filtering matter.
  • How to evaluate groundedness, relevance, safety, retrieval quality, and latency.
  • How to troubleshoot auth errors, deployment-name issues, throttling, poor retrieval, hallucinations, and tool loops.
  • Why prompt injection requires application-level defenses.

Next step for practice

Use this Quick Reference as a checklist while completing hands-on Azure AI Foundry, Azure OpenAI, Azure AI Search, and agent labs. Then move into timed AI-103-style practice questions that force you to choose the best service, pattern, security control, and troubleshooting action for each scenario.

Browse Certification Practice Tests by Exam Family