
Free AI-103 Full-Length Practice Exam: 50 Questions

Try 50 free AI-103 questions across the exam domains, with explanations, then continue with full IT Mastery practice.

This free full-length AI-103 practice exam includes 50 original IT Mastery questions across the exam domains.

These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.

Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.

Need concept review first? Read the AI-103 Cheat Sheet on Tech Exam Lexicon, then return here for timed mocks and full IT Mastery practice.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, and answer explanations.


Exam snapshot

  • Exam code: AI-103
  • Practice-set question count: 50
  • Time limit: 120 minutes
  • Practice style: mixed-domain diagnostic run with answer explanations

Full-length exam mix

  • Plan and Manage an Azure AI Solution: 27%
  • Implement Generative AI and Agentic Solutions: 33%
  • Implement Computer Vision Solutions: 13%
  • Implement Text Analysis Solutions: 13%
  • Implement Information Extraction Solutions: 14%

Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, and code-reading practice where relevant.

Practice questions

Questions 1-25

Question 1

Topic: Plan and Manage an Azure AI Solution

A Python support agent is deployed by the same pipeline to staging and production. Each environment has its own Foundry project and model deployment, but both deployments use the same model family. The release gate must validate that production traffic uses the production project and intended deployment, and must provide evidence if a request is misrouted. What should you implement?

Options:

  • A. Emit traces with the Foundry project endpoint and deployment name per call.

  • B. Run a safety evaluation on production responses.

  • C. Monitor aggregate token usage by model family.

  • D. Compare generated answers with a fixed golden dataset.

Best answer: A

Explanation: The goal is to verify connection context, not only model quality or safety. Trace logging that records the actual Foundry project endpoint and deployment name for each request provides direct evidence that the application is using the intended production context.

For Foundry app deployments, the project endpoint and model deployment name are part of the runtime context that determines where requests are sent. In a CI/CD release gate, instrument the app or SDK calls to emit traces with the configured project endpoint, deployment name, environment, and correlation ID. A smoke test or telemetry query can then confirm that production requests resolve to the production Foundry project and deployment. Quality evaluations can still be useful, but they do not prove that the app connected to the correct project context.

  • Golden dataset checks validate response quality, but they do not prove which Foundry project or deployment handled the request.
  • Token usage by model family can hide routing mistakes because both environments use the same model family.
  • Safety evaluation measures harmful or policy-violating responses, not whether the app used the intended project context.
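
For context, a minimal sketch of the kind of per-call trace record such a release gate can assert on. The emit_trace helper and its field names are illustrative, not a specific Foundry API:

import json
import logging
import uuid

logger = logging.getLogger("agent.telemetry")

def emit_trace(project_endpoint: str, deployment_name: str, environment: str) -> str:
    # Record the connection context actually used for this model call.
    correlation_id = str(uuid.uuid4())
    logger.info(json.dumps({
        "correlation_id": correlation_id,
        "project_endpoint": project_endpoint,
        "deployment_name": deployment_name,
        "environment": environment,
    }))
    return correlation_id

A smoke test in the release gate can then query these records and fail the release if any production-tagged trace references a non-production project endpoint or deployment name.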

Question 2

Topic: Implement Computer Vision Solutions

A Foundry project hosts a support agent that accepts customer screenshots and photos. The app already applies text safety filters to the user’s typed prompt. During red-team testing, a benign prompt such as “summarize this screenshot” is paired with an image that contains embedded policy-violating instructions and unsafe visual content. You need to reduce this risk without blocking normal screenshot analysis. What should you implement?

Options:

  • A. Disable all image uploads to the agent.

  • B. Run only the existing text filter on OCR output.

  • C. Apply multimodal visual safety moderation before model processing.

  • D. Require managed identity for the model endpoint.

Best answer: C

Explanation: The threat is in the image, so the control must inspect visual content, not just typed text. Multimodal visual moderation or risk detection can flag unsafe image content and embedded attacks while preserving legitimate screenshot workflows.

For multimodal agents, responsible AI controls must cover every input modality that can carry unsafe content or instructions. A text-only filter sees the typed prompt but may miss policy violations, visual harms, or adversarial content embedded in pixels. The better design is to moderate image inputs before they reach the model, apply guardrails based on the detected risk, and use review or refusal only for flagged cases. This reduces risk without removing the business value of visual support.

  • Endpoint identity secures access to the service but does not detect unsafe visual content.
  • OCR-only filtering still relies on text safety and can miss non-text visual risks or incomplete extraction.
  • Blocking uploads reduces risk by eliminating the feature, but it fails the requirement to allow normal screenshot analysis.
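
As a rough illustration, the Azure AI Content Safety image API can screen an uploaded image before it reaches the model. This sketch assumes the azure-ai-contentsafety Python package and an endpoint that accepts Entra ID authentication; the severity threshold is an arbitrary example value:

from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData
from azure.identity import DefaultAzureCredential

client = ContentSafetyClient("https://<resource>.cognitiveservices.azure.com", DefaultAzureCredential())

def image_is_allowed(image_bytes: bytes, max_severity: int = 2) -> bool:
    # Analyze the image across the built-in harm categories before
    # it is passed to the multimodal model.
    result = client.analyze_image(AnalyzeImageOptions(image=ImageData(content=image_bytes)))
    return all((item.severity or 0) <= max_severity for item in result.categories_analysis)

Flagged images can be routed to refusal or human review, while benign screenshots continue through the normal workflow.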

Question 3

Topic: Implement Generative AI and Agentic Solutions

A Foundry project hosts a customer-support agent that uses Azure AI Search for grounding and calls a refund-eligibility function. Before production, stakeholders ask whether the agent gives correct policy answers, cites retrieved sources, and blocks unsafe refund guidance. Which evaluation plan best addresses the request?

Options:

  • A. Track token spend per conversation and select the lowest-cost model.

  • B. Compare p95 latency across model deployments and select the fastest endpoint.

  • C. Run concurrent-user load tests against the agent endpoint.

  • D. Evaluate groundedness, answer correctness, safety, and tool-call correctness on traces.

Best answer: D

Explanation: The scenario asks about output quality and agent behavior, not operational efficiency. A suitable Foundry evaluation plan should use conversation traces to test correctness, groundedness, safety, and whether the refund function is called appropriately.

For a grounded agent workflow, evaluation should match the business risk being tested. Here, the risks are incorrect policy answers, unsupported claims, unsafe guidance, and improper function-calling. Use quality and safety evaluations over representative conversations, including retrieved context and tool-call traces, so reviewers can see whether the response is supported by sources and whether the agent followed the refund policy. Latency, throughput, and cost metrics are useful for performance planning, but they do not prove that answers are correct, grounded, or safe.

  • Latency focus fails because fast responses can still be incorrect, ungrounded, or unsafe.
  • Cost focus fails because lower token spend does not validate policy accuracy or safe behavior.
  • Load testing fails because concurrency results measure capacity, not answer quality or grounding.

Question 4

Topic: Implement Generative AI and Agentic Solutions

You maintain a Foundry RAG app that answers HR policy questions from documents indexed in Azure AI Search. The team changed retrieval from top-3 semantic hybrid results to top-5 and revised the grounding prompt. You must verify before deployment that the change does not reduce groundedness or citation accuracy for known questions. What should you do?

Options:

  • A. Run baseline-vs-candidate regression evaluation with groundedness checks.

  • B. Replace hybrid search with vector-only search for all queries.

  • C. Increase chunk size and redeploy after a smoke test.

  • D. Monitor production user feedback after releasing the change.

Best answer: A

Explanation: Prompt and retrieval changes can alter which passages are used and how answers cite them. A regression evaluation uses the same test cases against the current and candidate workflows to detect grounding or citation regressions before deployment.

The core concept is regression evaluation for a RAG workflow. Because the team changed both retrieval parameters and the grounding prompt, the safest predeployment check is to run a fixed evaluation dataset through the baseline and candidate versions, then compare groundedness, answer relevance, and citation accuracy. This verifies that known questions still produce answers supported by retrieved sources.

Smoke tests, search-mode changes, or post-release monitoring can be useful, but they do not provide controlled evidence that the new RAG workflow preserves grounding quality.

  • Smoke testing only fails because it checks basic functionality, not whether answers remain grounded and correctly cited.
  • Vector-only search changes retrieval behavior without proving it improves grounding for the known questions.
  • Production feedback detects issues too late and is not a controlled regression comparison.

Question 5

Topic: Implement Generative AI and Agentic Solutions

You are building a Microsoft Foundry app for a help desk. A user submits a natural-language prompt and a screenshot of an error dialog. The app must generate troubleshooting steps that depend on both the prompt text and the visible message in the screenshot. The app can call only one deployed model endpoint per request and cannot add an OCR or Vision preprocessing step. Which endpoint should the app call?

Options:

  • A. An embedding model deployment endpoint

  • B. A text-only generation LLM endpoint

  • C. A multimodal model deployment endpoint

  • D. A code model deployment endpoint

Best answer: C

Explanation: The task requires one endpoint that can process a prompt and an image together, then generate text. A multimodal model deployment is the right fit because it supports visual input plus natural-language generation without a separate OCR or Vision step.

In Foundry applications, the deployed endpoint should match the input modality and output task. This scenario needs text generation, but the generated answer must use information visible in an image. A multimodal model deployment can receive both the user prompt and screenshot and produce troubleshooting guidance in a single call. A text-only generation endpoint cannot directly inspect the screenshot, and an embedding endpoint produces vectors for retrieval rather than user-facing guidance. Choose the endpoint whose supported input and output capabilities match the workload.

  • Text-only generation fails because it cannot directly use the screenshot without a separate extraction step.
  • Code model is intended for code-focused generation or transformation, not visual troubleshooting.
  • Embedding endpoint supports vectorization for search or retrieval, not direct multimodal response generation.
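
For readers who want the shape of such a call, this is a minimal sketch of one request that sends the prompt and screenshot together to a multimodal deployment. It assumes the OpenAI Python SDK against an Azure endpoint with keyless auth; the deployment and file names are placeholders:

import base64
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01")

with open("error_dialog.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="<multimodal-deployment-name>",  # the deployment name, not the base model name
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Suggest troubleshooting steps for this error."},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
    ]}])
print(response.choices[0].message.content)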

Question 6

Topic: Plan and Manage an Azure AI Solution

A Foundry project deploys an enterprise procurement agent by using CI/CD. The approved production baseline requires business-system tools to use a user-assigned managed identity with a least-privilege role policy, and it prohibits stored API keys. A drift scan shows the erpLookup tool in production now authenticates by using an API key stored in app configuration, while the endpoint and tool schema are unchanged. Users must continue querying purchase order status.

What should you do?

Options:

  • A. Add a prompt guardrail that tells the model not to reveal secrets.

  • B. Disable the erpLookup tool until the next monthly release.

  • C. Redeploy the approved managed-identity tool configuration and remove the API key.

  • D. Leave the setting unchanged because the endpoint and schema match.

Best answer: C

Explanation: The drift is in the tool authentication setting, not in the tool schema. Restoring the approved managed identity and role policy removes the stored secret risk without preventing authorized purchase order lookups.

Configuration drift occurs when the deployed production settings no longer match the approved baseline from the Foundry CI/CD process. In this scenario, the erpLookup tool changed from keyless managed identity authentication to a stored API key. That creates a credential exposure and governance risk even though the endpoint and schema are unchanged. The right response is to reconcile production back to the approved deployment configuration, remove or rotate the exposed key, and keep the tool available through the managed identity with its least-privilege role policy. Blocking the tool would reduce business functionality unnecessarily, and prompt-only guardrails do not fix an authentication drift issue.

  • Disabling the tool blocks legitimate purchase order queries instead of restoring the approved secure configuration.
  • Matching endpoint and schema misses that authentication settings are part of the governed deployment baseline.
  • Prompt guardrails can reduce disclosure behavior but do not remove the stored credential or restore keyless access.

Question 7

Topic: Implement Text Analysis Solutions

A Foundry project processes support tickets before creating workflow tasks. The workflow expects valid JSON with customer, product, issue_type, severity, and topics. Recent runs fail with Invalid JSON.

Trace excerpt:

Tool selected: summarize_ticket
Prompt intent: concise summary and key themes
Model output: "Contoso reports intermittent sign-in failures..."
Parser error: expected object at line 1

What should you do next?

Options:

  • A. Use a structured JSON extraction tool with the required schema.

  • B. Add Azure AI Search grounding for each ticket.

  • C. Route tickets to topic classification only.

  • D. Increase max output tokens for the summarization prompt.

Best answer: A

Explanation: The failure is caused by using a summarization-oriented prompt/tool for a structured-output requirement. In Foundry, the fix is to use an extraction prompt or tool that constrains the response to the required JSON schema.

Language model text analysis tasks should match the required output type. Summaries produce human-readable prose, topic classification produces labels or themes, and entity extraction identifies specific values. When an automation requires machine-readable fields, use a structured-output extraction prompt or Foundry Tool with an explicit JSON schema so the model returns the fields the parser expects.

The trace shows the selected tool is summarize_ticket, and the model output is natural language. The issue is not missing context or response length; it is the mismatch between the task type and the required output format.

  • More tokens would allow a longer summary but would not force valid JSON or required fields.
  • Search grounding helps when answers need retrieved knowledge, but this ticket already contains the source text.
  • Topic classification can identify themes but does not extract all required fields into a JSON object.
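
A minimal sketch of the structured-output side of this fix: declare the required schema and reject any response that does not parse and validate. The severity enum values are assumptions for illustration; the jsonschema package provides the validation call:

import json
from jsonschema import validate

TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "customer": {"type": "string"},
        "product": {"type": "string"},
        "issue_type": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
        "topics": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["customer", "product", "issue_type", "severity", "topics"],
    "additionalProperties": False,
}

def parse_ticket(model_output: str) -> dict:
    # Fails fast if the extraction step returned prose instead of the
    # machine-readable object the workflow expects.
    ticket = json.loads(model_output)
    validate(instance=ticket, schema=TICKET_SCHEMA)
    return ticket

The same schema can be supplied to the extraction prompt or tool definition so the model is constrained toward valid output rather than only checked after the fact.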

Question 8

Topic: Implement Text Analysis Solutions

You are extending a Microsoft Foundry customer-support agent that already uses Azure AI Search for RAG over private product manuals. The mobile app must let field technicians ask questions hands-free and hear the agent’s answer while they work. The manuals must remain in the existing private search index. Which implementation should you use?

Options:

  • A. Expose the Foundry agent as a text chat and enlarge the response font.

  • B. Add Azure AI Speech for speech-to-text and text-to-speech around the Foundry agent.

  • C. Use Content Understanding to extract fields from uploaded manuals only.

  • D. Use Azure Translator to translate typed questions and text responses.

Best answer: B

Explanation: The requirement is a voice user experience, not just text analysis or document extraction. Azure AI Speech can convert technician audio to text for the Foundry agent and synthesize the agent’s response back to speech while leaving the existing Azure AI Search RAG flow in place.

For a hands-free agent interaction, the implementation must support both speech input and speech output. Azure AI Speech provides speech-to-text for the user’s spoken question and text-to-speech for the agent’s answer. The Foundry agent can still perform the reasoning step and use the existing private Azure AI Search index for grounding, so the retrieval architecture does not need to be replaced.

The key takeaway is to add speech modality support around the agent instead of choosing a text-only workflow.

  • Text chat only fails because larger text still requires reading and typing, which does not meet the hands-free requirement.
  • Translator only handles language conversion, not microphone input or spoken responses.
  • Manual extraction only addresses document processing, not real-time speech interaction with the agent.
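
A rough sketch of the speech wrapper, assuming the azure-cognitiveservices-speech package; call_foundry_agent stands in for the existing RAG agent call and is hypothetical:

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<speech-key>", region="<region>")

def voice_turn() -> None:
    # Speech-to-text: capture the technician's spoken question.
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    result = recognizer.recognize_once()
    if result.reason != speechsdk.ResultReason.RecognizedSpeech:
        return
    # The Foundry agent and its private Azure AI Search index stay unchanged.
    answer = call_foundry_agent(result.text)
    # Text-to-speech: read the grounded answer back to the technician.
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    synthesizer.speak_text_async(answer).get()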

Question 9

Topic: Plan and Manage an Azure AI Solution

You are designing a Microsoft Foundry HR benefits agent. The agent uses Azure AI Search for handbook grounding and a payroll tool that can submit compensation-change requests. The solution must block unsafe or prompt-injection input, avoid grounding on restricted content, limit tool calls to approved actions, prevent unsafe responses, and require HR approval before any payroll change is committed. Which guardrail placement should you use?

Options:

  • A. Use a system prompt to describe all restrictions and rely on conversation memory to avoid unsafe requests.

  • B. Run one safety check only on the final model response and send payroll changes for HR review after execution.

  • C. Guard input before retrieval, filter retrieved content before grounding, enforce tool policy before execution, scan output before return, and approve before commit.

  • D. Require HR approval for every user turn and skip retrieval and tool guardrails.

Best answer: C

Explanation: Guardrails should be placed as close as possible to the risk they control. In this workflow, input, retrieval, tool execution, output, and approval each need a separate checkpoint so unsafe content or unauthorized actions are stopped before they affect the next stage.

Agent governance in Foundry should use layered checkpoints rather than a single late filter. Input controls help stop prompt injection or unsafe requests before retrieval. Retrieval controls help ensure only permitted, trusted content is added to the model context. Tool policies and typed schemas constrain function-calling before execution. Output checks reduce the chance of unsafe or confidential responses reaching users. Human approval belongs before a high-impact action is committed, not afterward.

The key takeaway is to prevent risk before the agent uses the next capability, especially before external tool execution.

  • Final-only filtering fails because restricted retrieval content and unauthorized tool execution can already occur before the response is scanned.
  • Prompt-only control fails because instructions and memory do not enforce retrieval permissions, tool schemas, or approval workflows.
  • Approval-only governance fails because human review does not replace automated input, retrieval, tool, and output safeguards.

Question 10

Topic: Plan and Manage an Azure AI Solution

A team builds a RAG-backed enterprise assistant in a Microsoft Foundry project. New policy PDFs are synced from SharePoint, processed with OCR/layout extraction, chunked, embedded, and indexed in Azure AI Search. Users report that some newly uploaded policies are not being cited. The team already evaluates generated answer quality and needs an observability signal for data ingestion quality. What should they monitor?

Options:

  • A. Search click-through rate for cited documents

  • B. Source-to-index completeness and freshness per ingestion run

  • C. LLM token usage and response latency per conversation

  • D. Safety evaluator scores for generated responses

Best answer: B

Explanation: Ingestion quality monitoring should verify the path from source content to searchable grounding data. For this scenario, the useful signal is whether each source PDF was processed, chunked, embedded, indexed, and refreshed on time.

The core concept is source-to-index observability for a retrieval pipeline. When users cannot cite newly uploaded policies, the team must determine whether the issue occurred before retrieval: connector sync, OCR/layout extraction, chunking, embedding generation, indexing, or freshness. Per-run lineage metrics should compare expected source documents with successfully indexed chunks and embeddings, and record failures or stale items.

This is different from monitoring the model’s runtime behavior. Latency, token usage, response safety, and user clicks can be useful, but they do not prove that the grounding source was ingested correctly.

  • Runtime telemetry shows conversation cost and performance, not whether documents reached the search index.
  • Safety scoring evaluates generated content risk, not missing or stale grounding data.
  • Click-through tracking reflects user behavior after retrieval, not ingestion completeness.
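
A minimal sketch of a per-run completeness and freshness check, in plain Python over two hypothetical inventories: what the connector saw in the source and what actually landed in the index:

def ingestion_report(source_docs: dict, indexed_docs: dict) -> dict:
    # source_docs: {doc_id: last_modified}; indexed_docs: {doc_id: indexed_at}.
    missing = sorted(set(source_docs) - set(indexed_docs))
    stale = sorted(
        doc_id for doc_id, indexed_at in indexed_docs.items()
        if doc_id in source_docs and indexed_at < source_docs[doc_id])
    return {
        "expected": len(source_docs),
        "indexed": len(indexed_docs),
        "missing": missing,  # synced from the source but never indexed
        "stale": stale,      # indexed copy older than the source document
    }

Emitting this report for every ingestion run makes "new policy not cited" failures visible at the pipeline stage where they occur, instead of surfacing later as retrieval complaints.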

Question 11

Topic: Implement Generative AI and Agentic Solutions

A Microsoft Foundry project includes an agent that helps field technicians troubleshoot equipment. The agent keeps conversation state and uses Azure AI Search to retrieve manual passages. Some requests include an uploaded photo of the equipment panel, and the answer must reason over both the image and retrieved text. Which deployed model endpoint should you use as the agent’s primary reasoning and generation endpoint?

Options:

  • A. A code model deployment

  • B. A small text-only classifier deployment

  • C. A multimodal chat model deployment

  • D. An embedding model deployment

Best answer: C

Explanation: The agent needs a deployed endpoint that can generate answers while reasoning over image and text inputs. A multimodal chat model is the best fit because the scenario requires combining a technician photo with retrieved manual content.

In Foundry applications, choose the deployed model endpoint based on the task and input modality. This agent is not just classifying intent or retrieving passages; it must generate a response from both visual input and text grounding. A multimodal chat model supports image-plus-text reasoning and can be used with retrieval results from Azure AI Search as grounding context. Embedding endpoints support search/vectorization, not response generation, and code or text-only focused models do not satisfy the image requirement.

The key takeaway is to match the agent’s reasoning endpoint to the richest required input modality and generation goal.

  • Code endpoint fits code completion or code reasoning tasks, not visual troubleshooting from equipment photos.
  • Embedding endpoint supports retrieval and similarity search, but it does not generate the agent’s final answer.
  • Text-only classifier may route or label requests, but it cannot reason over uploaded images.

Question 12

Topic: Plan and Manage an Azure AI Solution

A support agent in a Microsoft Foundry project has intermittent latency spikes during morning sign-in. The model deployment traces show the following records, and the app currently retries failed calls immediately up to three times.

Status: 429 RateLimitExceeded
Message: Tokens-per-minute limit exceeded
Header: retry-after-ms=8000
Pattern: failures occur in 2-minute bursts

Which change is the best next fix?

Options:

  • A. Increase model temperature for the agent deployment.

  • B. Increase retry concurrency to clear the backlog.

  • C. Throttle and queue model calls, honoring Retry-After.

  • D. Rebuild the search index with larger vectors.

Best answer: C

Explanation: The symptom is a model deployment rate limit, not a grounding or search-quality issue. HTTP 429 with a tokens-per-minute message and a retry-after-ms header indicates the app should slow and schedule requests instead of retrying immediately.

Rate-limit troubleshooting starts with the error evidence. A 429 RateLimitExceeded response plus a tokens-per-minute message means the deployment is receiving more token demand than its current capacity allows during bursts. Immediate retries can multiply the load and make latency worse. A safe mitigation is to add client-side throttling, queue requests, track request and token budgets, and use exponential backoff that honors the Retry-After value. If the queued demand remains consistently high after smoothing bursts, use the trace data for capacity planning or quota adjustment. The key takeaway is to reduce retry amplification before adding more traffic pressure.

  • Temperature tuning affects generation variability, not tokens-per-minute throttling.
  • Index rebuilding addresses retrieval quality or index health, not 429 model deployment limits.
  • Higher retry concurrency amplifies the burst and can worsen rate limiting and latency.
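
A minimal sketch of client-side smoothing that honors the server's hint. The send callable and header shape mirror the trace above but are illustrative:

import random
import time

def call_with_backoff(send, max_retries: int = 5):
    # send() performs one model request and returns a response object
    # with status_code and headers (shape assumed for illustration).
    for attempt in range(max_retries):
        response = send()
        if response.status_code != 429:
            return response
        # Honor retry-after-ms when present; otherwise back off exponentially.
        wait_ms = int(response.headers.get("retry-after-ms", 1000 * 2 ** attempt))
        time.sleep(wait_ms / 1000 + random.uniform(0, 0.25))  # jitter avoids synchronized retries
    raise RuntimeError("rate limit persisted after retries")

Pairing this with a request queue that caps concurrent calls and tracks a tokens-per-minute budget removes the burst amplification that immediate retries cause.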

Question 13

Topic: Implement Generative AI and Agentic Solutions

A customer support agent in a Microsoft Foundry project must create incidents by calling the createIncident tool. Users report that the agent says it cannot create the incident after several attempts.

Trace excerpt:

User: Create a P1 incident for checkout failures.
Tool call: createIncident({"title":"checkout failures","priority":"P1"})
Tool result: 400 ValidationError
Detail: field "severity" is required; "priority" is not allowed;
        severity must be one of ["sev1","sev2","sev3"]
Tool call repeated 3 times with the same arguments

What is the best next fix?

Options:

  • A. Raise the model temperature for tool argument generation.

  • B. Increase the agent’s maximum tool-call iterations.

  • C. Correct the createIncident tool schema and enum descriptions.

  • D. Add Azure AI Search grounding for incident history.

Best answer: C

Explanation: This is a tool contract problem, not a retrieval problem. The agent repeatedly sends priority, but the tool requires severity with a specific enum, so the tool definition and descriptions must guide the model to produce valid arguments.

Tool-augmented generation depends on the tool schema as the contract the model uses to form calls. The trace shows the external API rejects the call before any incident is created because the argument names and allowed values do not match: priority is sent, but severity is required. In a Foundry agent or workflow, expose the correct parameter names, required fields, enum values, and descriptions such as mapping a user phrase like P1 to sev1. Adding retries or changing model randomness does not fix a deterministic schema mismatch.

  • Search grounding would help with knowledge lookup, but the failure is an API validation error during tool invocation.
  • More iterations would likely repeat the same invalid call and prolong the loop.
  • Higher temperature can make arguments less consistent and does not correct the required field name or enum.
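
A sketch of what the corrected tool contract could look like, using the common JSON-schema function-tool shape; the descriptions and the P1-to-sev1 mapping guidance are illustrative:

CREATE_INCIDENT_TOOL = {
    "name": "createIncident",
    "description": ("Create a support incident. Map user priority phrases to "
                    "severity: P1 -> sev1, P2 -> sev2, P3 -> sev3."),
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "Short incident title"},
            "severity": {
                "type": "string",
                "enum": ["sev1", "sev2", "sev3"],
                "description": "Required severity level; there is no 'priority' field",
            },
        },
        "required": ["title", "severity"],
    },
}

With the correct field name, enum, and mapping guidance in the schema, the model can translate "P1" into severity "sev1" on the first call instead of repeating the rejected arguments.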

Question 14

Topic: Implement Generative AI and Agentic Solutions

You are implementing a Microsoft Foundry customer-support agent. A user can ask for the current refund status and, if they approve, the agent must create a return authorization. The status and action are available only through an internal REST API that supports Microsoft Entra ID and is reachable from the Foundry project’s private network. Which implementation should you choose?

Options:

  • A. Put API examples and retry instructions in the system prompt.

  • B. Fine-tune the model on nightly exports of refund records.

  • C. Index refund policies in Azure AI Search for RAG answers.

  • D. Register managed-identity Foundry tools for lookup and confirmed creation.

Best answer: D

Explanation: The request requires both live external data and a controlled external action. Tool-augmented generation is the right pattern because the agent can call authenticated tools for the refund lookup and return creation instead of relying on model memory or static grounding.

Tool-augmented generation is used when the model must retrieve current data or perform an action outside the model. In this scenario, the Foundry agent should expose the internal REST API as tools with defined inputs and outputs, authenticate with managed identity, and enforce user confirmation before invoking the tool that creates a return authorization. The model can reason over the user request, call the lookup tool, present the result, and only call the creation tool after approval. Fine-tuning, RAG, and prompt instructions can improve language behavior or grounding, but they do not securely perform live API calls or side-effecting actions.

  • Nightly fine-tuning can make model knowledge stale and does not create an authorized return action.
  • Policy RAG helps answer policy questions but does not retrieve per-user live refund status or invoke the API.
  • Prompt-only API guidance describes what to do but does not provide secure authenticated tool execution.

Question 15

Topic: Plan and Manage an Azure AI Solution

A company is planning a Microsoft Foundry project for a procurement agent. The agent must search approved contract content, call a vendor-status REST API, and submit exception requests by running an internal custom function. Security policy requires no stored secrets, no public ingress to internal systems, least-privilege access, and human approval for write actions. Which integration approach should you choose?

Options:

  • A. Use Foundry Tools with managed identity, private endpoints, role policies, approval for write tools, and provenance tracing.

  • B. Store API keys in the agent tool configuration and restrict access by prompt instructions.

  • C. Give the agent a broad Contributor role and allow all tool calls automatically on the private network.

  • D. Disable all API and function tools and answer only from an indexed knowledge store.

Best answer: A

Explanation: The agent needs multiple tool types: search over knowledge, API calls, and a custom function. The safest workable design uses Foundry tool integrations with managed identity, private connectivity, least-privilege roles, approval controls for write actions, and traceable provenance.

For this scenario, the core concept is choosing secure tool integration patterns for an agent. Azure AI Search can serve as the knowledge/search tool, while the REST API and custom function can be exposed as controlled tools. Managed identity avoids embedded secrets, private networking avoids public ingress, role policies limit what the agent can access, and approval workflows reduce risk for state-changing operations. Trace logging and provenance metadata support audit and oversight without preventing legitimate read and write workflows.

The key takeaway is to secure each required tool path instead of relying on prompts alone or removing required capabilities.

  • Prompt-only restriction fails because API keys in tool configuration are stored secrets and prompts are not an access-control boundary.
  • Knowledge-only design blocks legitimate vendor-status calls and exception submissions required by the scenario.
  • Broad automatic access violates least privilege and omits the required human approval for write actions.

Question 16

Topic: Plan and Manage an Azure AI Solution

A team is moving a Microsoft Foundry agent from pilot to production. The agent retrieves regulated HR documents from Azure Blob Storage and Azure AI Search. The release gate requires proof that the runtime does not use public endpoints or stored access keys to reach those data sources. Which option best validates the deployment design?

Options:

  • A. Store connection strings in Key Vault; monitor secret expiration events.

  • B. Enable content safety filters; monitor blocked prompt categories.

  • C. Enable groundedness evaluation; monitor retrieval relevance scores.

  • D. Use managed identity and private endpoints; monitor audit and network logs.

Best answer: D

Explanation: This requirement is about identity and network isolation, not model quality. Managed identity is needed when the runtime must avoid stored secrets, and private networking is needed when access to Azure resources must not traverse public endpoints.

For production Foundry solutions that access regulated data, the deployment design must include managed identity when services should authenticate without embedded keys or connection strings. It must include private networking, such as private endpoints, when resource access must be restricted away from public endpoints. The release gate should validate both controls with observable evidence, such as identity/audit logs showing managed identity token-based access and network/resource logs showing private endpoint traffic. Quality evaluators for relevance, groundedness, or safety are useful for AI behavior, but they do not prove keyless authentication or private network routing.

  • Secret monitoring helps manage stored credentials but does not satisfy a no-stored-keys requirement.
  • Groundedness metrics validate answer quality and retrieval usefulness, not network path or authentication method.
  • Safety filter logs monitor harmful content controls, not private access to Blob Storage or Azure AI Search.

Question 17

Topic: Implement Text Analysis Solutions

A Foundry support agent uses a text classification tool to route employee chat messages to harassment, self_harm, security_incident, or normal. Monitoring shows many false positives. In traces, the tool receives only the latest sentence and bare category names. Phrases such as “kill the process,” “this build is sick,” and “blast radius review” are routed to harmful categories. What should you change in the agent workflow?

Options:

  • A. Increase the classifier response token limit.

  • B. Pass category definitions and domain glossary into classification.

  • C. Lower safety thresholds for all negative messages.

  • D. Disable conversation memory during triage.

Best answer: B

Explanation: The trace indicates that the agent is classifying domain-specific language without enough context. Supplying clear category definitions and a domain glossary helps the text classifier distinguish technical phrases from actual harmful or sensitive content.

Poor classification in text-analysis workflows is often caused by missing context, domain language, or ambiguous labels rather than model capacity. In this case, phrases like “kill the process” and “blast radius” are normal engineering terms, but the agent receives only bare labels and the latest sentence. The workflow should ground the classification step by passing label definitions, examples, and relevant domain terminology, potentially retrieved from an approved glossary or knowledge source. This improves routing without weakening safety controls or relying on unrelated tuning changes.

  • Lower thresholds would likely increase false positives because the issue is misinterpreted context, not missed unsafe content.
  • More tokens does not fix ambiguous category definitions or missing domain terminology.
  • No memory can remove useful conversation context and make short technical phrases even harder to classify.
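
A sketch of how the classification step can carry that context; the definitions and glossary entries are illustrative examples, not a complete policy:

CLASSIFY_PROMPT = """Classify the message into exactly one category.

Categories:
- harassment: threats or abuse directed at a person.
- self_harm: statements about harming oneself.
- security_incident: reports of breaches, leaked credentials, or attacks.
- normal: everything else, including routine engineering discussion.

Domain glossary (technical usage, not harmful):
- "kill the process": terminate a running program.
- "blast radius": the scope of impact of a change or failure.
- "sick": slang for impressive.

Recent conversation:
{context}

Message:
{message}

Category:"""

Passing a recent conversation window along with the definitions addresses both gaps seen in the traces: bare labels and a single-sentence input.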

Question 18

Topic: Implement Generative AI and Agentic Solutions

You are implementing a Python service that calls a Microsoft Foundry agent in the production Foundry project. The agent uses a Foundry Azure AI Search tool connector for RAG over kb-index. The production security standard requires keyless access, and the connector uses managed identity uami-agent-prod. Private endpoint connectivity tests pass, and model responses work, but grounded answers fail.

Trace excerpt:

Project endpoint: foundry-prod
Agent: support-agent-prod
Tool: azure_ai_search(kb-index)
Status: 403 Forbidden
Message: Principal uami-agent-prod is not authorized to read documents

Which implementation change should you make?

Options:

  • A. Replace managed identity with a Search admin key.

  • B. Reference the base model name instead of the deployment name.

  • C. Switch the SDK endpoint to the dev Foundry project.

  • D. Grant uami-agent-prod Search Index Data Reader on kb-index.

Best answer: D

Explanation: The failure is caused by connector permissions, not by the model deployment or project endpoint. The Search tool reaches Azure AI Search but receives a 403 for the managed identity, so the identity needs read permission on the target index while preserving keyless access.

In a Foundry agent integration, the model deployment call and the tool connector call can fail independently. Here, model responses work and private connectivity passes, so the request is reaching the Search tool. The 403 message names uami-agent-prod, which is the managed identity configured on the connector. To query documents from Azure AI Search, that identity needs an appropriate data-plane role such as Search Index Data Reader on the target index or service scope.

Changing the project endpoint or model reference would not address a tool authorization failure. Using an admin key would also violate the stated keyless requirement.

  • Wrong project would move production calls to the dev project and does not match the successful prod agent invocation.
  • Base model name is not the issue because model responses already work and Foundry app calls should target deployments.
  • Admin key fallback may bypass the permission error but violates the keyless security constraint and uses excessive privilege.
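
The fix itself is a role assignment (for example, az role assignment create with the Search Index Data Reader role scoped to the search service). Once granted, keyless read access can be verified with a short check; this sketch assumes the azure-search-documents package and placeholder resource names:

from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<search-service>.search.windows.net",
    index_name="kb-index",
    credential=DefaultAzureCredential())  # resolves to the managed identity at runtime

# Succeeds only when the identity holds a data-plane read role;
# a 403 here reproduces the connector failure from the trace.
first_hit = next(iter(client.search(search_text="*", top=1)), None)
print("read access verified" if first_hit is not None else "index reachable but empty")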

Question 19

Topic: Plan and Manage an Azure AI Solution

You are evaluating a Microsoft Foundry RAG assistant that answers HR policy questions by using Azure AI Search for grounding. Groundedness scores are low, but trace logs show that answers are consistent with the retrieved chunks. For failed questions, the relevant policy pages exist in the source files but are not in the top retrieved results.

Which validation should you prioritize to confirm that retrieval or indexing design is the primary fix?

Options:

  • A. Run a content safety evaluation on failed answers.

  • B. Increase the system prompt detail and retest fluency.

  • C. Compare answer length with the model token limit.

  • D. Measure retrieval recall@k against labeled relevant policy pages.

Best answer: D

Explanation: The trace evidence points to a retrieval failure, not a generation failure. If the model answers from the retrieved context but the correct source passages are missing, retrieval recall against labeled relevant content is the key validation.

For RAG grounding issues, first separate retrieval quality from generation quality. In this scenario, the source content exists, and the model is staying consistent with the retrieved chunks. The failure is that Azure AI Search is not returning the right pages in the top results. Measuring retrieval recall@k with a labeled evaluation set confirms whether chunking, metadata filtering, semantic ranking, vector strategy, or hybrid retrieval needs redesign.

Prompt tuning can improve how the model uses context, but it cannot ground answers in passages that were never retrieved.

  • Token limit check may explain truncation, but the issue is missing relevant retrieved content.
  • Safety evaluation measures harmful or sensitive output, not whether the right grounding passages were retrieved.
  • Prompt detail may improve formatting or reasoning, but it does not validate retrieval coverage.
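
A minimal sketch of the metric, computed over a labeled evaluation set in plain Python:

def recall_at_k(retrieved_ids: list, relevant_ids: set, k: int) -> float:
    # Fraction of the labeled relevant pages that appear in the top-k results.
    return len(set(retrieved_ids[:k]) & relevant_ids) / len(relevant_ids)

def mean_recall_at_k(eval_set: list, k: int = 5) -> float:
    # eval_set: [(retrieved_ids_for_question, labeled_relevant_ids), ...]
    scores = [recall_at_k(r, rel, k) for r, rel in eval_set if rel]
    return sum(scores) / len(scores)

If mean recall@k is low while answers stay consistent with what was retrieved, the evidence points at chunking, ranking, or index design rather than the generation prompt.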

Question 20

Topic: Implement Generative AI and Agentic Solutions

A company is building a Foundry-based assistant for internal policy questions. The source documents change daily and contain confidential data. Security requires keyless private access, least-privilege document visibility, and auditable source citations. The assistant must answer from current enterprise knowledge. Which implementation should you recommend?

Options:

  • A. Fine-tune an LLM on the documents and update the model monthly.

  • B. Place the latest documents in the system prompt for every request.

  • C. Disable document grounding and answer only from the base model.

  • D. Use RAG with Azure AI Search, managed identity, role filters, and provenance metadata.

Best answer: D

Explanation: The scenario needs current enterprise knowledge, not model training. A RAG design can retrieve fresh content at query time and enforce security controls such as managed identity, private access, role-based filtering, and citations without blocking valid user questions.

RAG is the right pattern when an app must use enterprise content that changes frequently. In this case, the assistant should retrieve from an Azure AI Search index or connected knowledge source at runtime, using managed identity for keyless access and private networking where required. Role-based filters preserve each user’s document visibility, and provenance metadata enables source citations and audit review. This keeps confidential knowledge outside model weights while still grounding answers in approved content. Fine-tuning is not a substitute for secure retrieval when facts change daily.

  • Fine-tuning confusion fails because training a model on confidential documents can make knowledge stale and does not provide per-user document access control.
  • Prompt stuffing fails because placing documents in prompts is hard to secure, audit, and scale for daily-changing enterprise content.
  • Base model only fails because it avoids risk by blocking the required legitimate use: grounded answers from current internal documents.

Question 21

Topic: Implement Computer Vision Solutions

An insurer is building a Microsoft Foundry claim-review agent. The agent must reason over uploaded photos and walkthrough videos by consuming structured outputs that include visual attributes, locations or timestamps, and source provenance. Which two backlog items are appropriate to implement by using Azure Content Understanding? Select TWO.

Options:

  • A. Translate adjuster comments from French to English before indexing.

  • B. Transcribe the adjuster’s spoken narration from the video audio track.

  • C. Index raw media files in Azure AI Search without visual extraction.

  • D. Generate photorealistic repair mockups from a text prompt.

  • E. Create timestamped records of visible leaks from walkthrough videos.

  • F. Extract damage cues and affected regions from roof photos as JSON.

Correct answers: E and F

Explanation: Azure Content Understanding is appropriate when an app needs structured, grounded representations of visual content for downstream reasoning. In this scenario, extracting visual attributes, regions, and timestamps from photos or videos matches that purpose.

Content Understanding fits workloads that convert unstructured visual media into structured outputs an agent or RAG workflow can reason over. For images, that can include domain-specific visual characteristics and affected regions. For video, it can include timestamped observations or segments that preserve where the evidence came from. The key requirement is not just storing media or producing a generic caption; it is creating a reliable visual representation with grounding/provenance for later reasoning.

A single-purpose translation, speech, search, or image generation capability may still be useful in the larger solution, but it does not satisfy the visual-characteristic extraction requirement.

  • Translation task handles existing text, not visual evidence from photos or videos.
  • Speech transcription processes audio narration, not visible objects, regions, or scenes.
  • Raw indexing supports retrieval but skips the required visual extraction step.
  • Image generation creates new media and does not analyze claim evidence.

Question 22

Topic: Implement Computer Vision Solutions

A retail team is building a Microsoft Foundry app that lets designers upload a product photo, mark the product area to preserve, and request a new campaign background. The app must return an edited image, not labels, captions, or search results. Which implementation should you use?

Options:

  • A. Index the photo in Azure AI Search for similar-image retrieval.

  • B. Run Azure AI Vision image analysis to generate tags and captions.

  • C. Use Content Understanding to extract product fields from the image.

  • D. Call a Foundry image editing model with the source image, mask, and prompt.

Best answer: D

Explanation: The task requires creating or editing media, so the implementation must use an image generation or image editing model. A classification, extraction, or retrieval workflow can describe or find images, but it will not produce the requested edited campaign image.

Image generation and editing workflows in Microsoft Foundry are used when an application must create new visual content or modify an existing image. In this scenario, the product photo and mask define what should be preserved, and the prompt describes the new background to generate. That requires calling an image editing-capable model deployment through the app or Foundry SDK.

Image analysis, information extraction, and search can be useful supporting features, but they do not perform the core requested behavior: generating an edited image.

  • Vision analysis returns descriptive outputs such as tags or captions, not a modified campaign image.
  • Content extraction is suited to structured understanding of documents or visual content, not pixel-level image editing.
  • Similarity search can find related images, but it does not create the requested new background.

Question 23

Topic: Implement Generative AI and Agentic Solutions

You are designing a Microsoft Foundry agent for insurance claims support. The agent uses a deployed GPT-4o mini model and Azure AI Search over private policy documents with managed identity. Users must resume a claim conversation from web or mobile for 30 days. The security team allows durable storage of claim state and communication preferences, but not full long-term transcripts containing PII. Which memory design is the best fit?

Options:

  • A. Append every prior chat turn to each prompt from the client app.

  • B. Store all chat transcripts as vectors in the policy search index.

  • C. Use Foundry agent threads and persist encrypted state summaries keyed by claim.

  • D. Fine-tune the deployed model monthly on resolved conversations.

Best answer: C

Explanation: The best design separates short-term conversation state from durable memory. Foundry agent threads can track the active exchange, while an encrypted application-managed summary stores only the claim state and preferences needed to resume later.

For agent continuity, use the agent thread for the current conversation and persist only the minimum durable memory required outside the model, such as claim status, unresolved questions, and approved user preferences. On resume, the app can load that summary into a new or existing thread and continue grounding policy answers with Azure AI Search. This satisfies the 30-day continuity requirement without retaining full PII-heavy transcripts. The key takeaway is to keep enterprise knowledge retrieval separate from per-user memory and avoid using model training as a substitute for conversation state.

  • Prompt-only history is not reliable cross-channel memory and increases context size while repeatedly exposing full PII.
  • Vectorized transcripts mix user-specific memory with shared policy retrieval and retain sensitive conversations beyond the stated allowance.
  • Fine-tuning conversations changes model behavior rather than preserving per-claim state and is excessive for the continuity requirement.

Question 24

Topic: Implement Computer Vision Solutions

You are building a Foundry agent that answers questions about uploaded equipment photos, such as “Is the oxygen valve open?” Answers must be grounded in the actual image and include the source photo and relevant region. The current RAG pipeline indexes only OCR text and file names in Azure AI Search, so visual-only details are missed. Which change should you implement?

Options:

  • A. Increase OCR chunk size and enable semantic ranking over file names.

  • B. Fine-tune a text LLM on previous image questions and keep the same text index.

  • C. Generate one page-level alt-text summary per photo and answer only from summaries.

  • D. Index visual captions, embeddings, and region metadata, then pass retrieved image crops to a multimodal model.

Best answer: D

Explanation: Visual question answering needs evidence from the image, not only nearby text. Indexing visual descriptors, embeddings, and region metadata enables retrieval of relevant visual evidence, and passing the retrieved crop or image region to a multimodal model lets the answer stay grounded.

For grounded visual Q&A, the retrieval pipeline must make image content searchable and preserve provenance. OCR helps with visible text, but questions like whether a valve is open depend on visual features. A better pipeline enriches each image with captions or visual embeddings, stores source photo and region metadata, retrieves the relevant visual evidence from Azure AI Search, and gives the multimodal model the original image or crop for final answering. The key takeaway is that text-only RAG cannot reliably answer questions that depend on visual details in the source image.

  • Fine-tuning only can learn patterns but does not supply current, cited visual evidence from the uploaded photo.
  • Larger OCR chunks still miss details that are visible but not represented as text.
  • Single summaries lose region-level provenance and may omit the specific visual detail needed for the question.

Question 25

Topic: Implement Generative AI and Agentic Solutions

A Foundry project includes a support agent that calls a schedule_pickup API tool. The tool schema should require orderId, require reasonCode from an approved enum, and require pickupDate in ISO date format. Recent traces show calls that omit reasonCode or send free-text dates. You need to validate that the revised schema supports reliable tool invocation before the API runs. Which observability approach is best?

Options:

  • A. Monitor only API 4xx and 5xx response counts

  • B. Validate traced tool-call arguments against the schema

  • C. Track total token usage for pickup conversations

  • D. Score final answers with a similarity evaluator

Best answer: B

Explanation: The goal is to validate whether the agent produces tool calls that conform to required arguments and constraints. The best signal is a trace-based tool-call evaluation that checks generated arguments against the declared schema before the API executes.

For tool schema quality, observe the model’s proposed function calls and validate their arguments against the schema used by the agent. This catches missing required properties, invalid enum values, and format violations such as non-ISO dates before the downstream API is called. In Foundry agent traces, tool-call records provide the right evidence because they show the selected tool and the exact arguments generated by the model. Operational metrics such as latency, token count, or API error rate can be useful, but they do not directly prove that the tool schema is constraining invocation arguments correctly.

  • Token analytics helps cost and performance monitoring, but it does not detect missing or invalid tool arguments.
  • Answer similarity evaluates response text quality, not whether the function call payload follows the schema.
  • API error counts can reveal downstream failures, but they are indirect and surface only after invalid calls have already reached the API.
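
A sketch of trace-based argument validation with the jsonschema package; the reasonCode enum values are placeholders, and the ISO date is enforced with a simple pattern for illustration:

from jsonschema import Draft202012Validator

PICKUP_SCHEMA = {
    "type": "object",
    "properties": {
        "orderId": {"type": "string"},
        "reasonCode": {"type": "string", "enum": ["damaged", "wrong_item", "late_delivery"]},
        "pickupDate": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
    },
    "required": ["orderId", "reasonCode", "pickupDate"],
    "additionalProperties": False,
}

validator = Draft202012Validator(PICKUP_SCHEMA)

def audit_tool_calls(traced_args: list) -> list:
    # traced_args: argument dicts captured from agent tool-call traces.
    failures = []
    for i, args in enumerate(traced_args):
        errors = [e.message for e in validator.iter_errors(args)]
        if errors:
            failures.append((i, errors))
    return failures

Running this over a batch of traced conversations shows directly whether the revised schema eliminates missing reasonCode values and free-text dates before the API is ever invoked.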

Questions 26-50

Question 26

Topic: Plan and Manage an Azure AI Solution

Your team is deploying a claims-assistance agent in a Microsoft Foundry project. The agent must retrieve internal policy content from Azure AI Search, maintain per-case memory, and call a refund API when a refund is justified. Security policy requires no stored service keys, private connectivity to data sources and tools, least-privilege tool access, human approval before any refund call, and audit evidence that includes tool calls, grounding sources, and safety events. Which deployment configuration should you choose?

Options:

  • A. Use managed identity, private endpoints, RBAC-scoped tools and memory, refund approval gating, and trace logging with provenance and safety events.

  • B. Disable the refund tool and require operators to process all refunds outside the agent workflow.

  • C. Use managed identity and private endpoints, but grant broad tool access and omit trace logging for privacy.

  • D. Use project-stored API keys, public endpoints, prompt-only refund rules, and logging of final responses only.

Best answer: A

Explanation: The scenario requires secure deployment controls without preventing approved refund actions. Managed identity and private endpoints address keyless private access, while RBAC-scoped tools, approval gating, trace logging, provenance, and safety events support governance and auditability.

For a Foundry agent that can use enterprise data and a high-impact action tool, the deployment should combine identity, network, tool, oversight, and monitoring controls. Managed identity avoids stored service keys. Private endpoints keep traffic to Azure AI Search, memory storage, and tools off public paths. Role-based tool access limits the agent to only the resources it needs. A human approval gate allows legitimate refund use while preventing autonomous high-risk actions. Trace logging with provenance and safety events gives auditors evidence of prompts, retrieval grounding, tool calls, and policy enforcement. The key takeaway is to control risky tool use with approval and observability, not by relying only on prompts or disabling the workflow.

  • Prompt-only controls fail because they do not meet keyless access, private networking, approval, or full audit requirements.
  • Broad tool access fails because least privilege and traceability are explicit policy requirements.
  • Disabling refunds fails because it blocks a legitimate approved workflow instead of adding human oversight.

Question 27

Topic: Implement Generative AI and Agentic Solutions

A finance team is building a Foundry agent that reconciles invoices by using retrieval over purchase orders and then calling an ERP tool named createPayment. Payments above a defined limit or payments involving changed bank details must be reviewed because they have financial and compliance impact. Which workflow should you implement?

Options:

  • A. Use only a system prompt that tells the agent to be careful.

  • B. Require approval before createPayment runs for flagged payments.

  • C. Run createPayment first and notify approvers afterward.

  • D. Route only ERP tool failures to a reviewer.

Best answer: B

Explanation: High-impact agent actions need a human-in-the-loop approval control before the side-effecting tool executes. The agent can still prepare the payment request, but the reviewed arguments and evidence should be approved before calling the ERP payment tool.

For financial, operational, privacy, or compliance-impacting actions, the approval workflow should gate the tool invocation, not just the conversation. In this scenario, the agent can retrieve purchase order evidence, draft the createPayment arguments, and present the payment summary for review. The workflow should execute the ERP tool only after an authorized approver accepts the proposed action, and the approval decision should be traceable for monitoring and audit. Post-action notifications or prompts alone do not reliably prevent an irreversible or regulated action.

  • Post-action notification fails because the payment may already have been created before review.
  • Prompt-only control fails because model instructions are not a sufficient governance boundary for financial tool use.
  • Failure-only review fails because risky successful payments still bypass required approval.
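
A minimal sketch of the gating logic; the limit, field names, and the approve and create_payment helpers are hypothetical stand-ins for the review workflow and the ERP tool:

PAYMENT_LIMIT = 10_000

def submit_payment(payment: dict, approve) -> dict:
    # approve(payment) presents the drafted arguments and supporting
    # evidence to an authorized reviewer and returns True only on
    # explicit approval; the decision should be logged for audit.
    flagged = payment["amount"] > PAYMENT_LIMIT or payment["bank_details_changed"]
    if flagged and not approve(payment):
        return {"status": "held", "reason": "approval denied or pending"}
    return create_payment(payment)  # the side-effecting ERP call runs last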

Question 28

Topic: Implement Text Analysis Solutions

You are building a Foundry app that triages customer emails in French, Spanish, and Japanese. The Azure AI Content Safety text check supports these source languages. The validated extraction and summarization prompts use English-only examples and return canonical English labels. Policy requires unsafe user content to be blocked before any transformation or domain LLM call. Which TWO ordering choices should you implement?

Options:

  • A. Summarize first, then scan the summary for safety.

  • B. Translate every message to English before safety detection.

  • C. Extract fields in the source language, then translate values.

  • D. Translate safe non-English messages before extraction and summarization.

  • E. Translate canonical labels before downstream routing.

  • F. Run safety detection on the original message first.

Correct answers: D and F

Explanation: The pipeline must preserve the safety policy first, then satisfy the language assumptions of the domain prompts. Because safety detection supports the incoming languages and must happen before transformations, scan the original text first. After content is allowed, translate non-English input to English for the validated extraction and summarization prompts.

Translation order depends on which component has the stricter language or governance requirement. Here, safety detection is both multilingual for the workload and required before any transformation, so it should run against the original user message. Once the message is allowed, translation to English is appropriate because the extraction and summarization prompts were validated only in English and produce canonical English labels. This reduces prompt drift and keeps downstream routing consistent. The key distinction is that safety policy controls should not be delayed by translation when the safety detector can process the source language.

  • Translate before safety violates the explicit requirement to block unsafe content before any transformation.
  • Source-language extraction ignores that the extraction and summarization prompts were validated only in English.
  • Summary-only safety scanning can miss unsafe details and occurs after a domain LLM call.
  • Translated routing labels break the requirement for canonical English labels downstream.
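
The ordering can be expressed as a short pipeline. This sketch assumes hypothetical `safety`, `translator`, and `extractor` clients; the point is the sequence, not the client APIs.

```python
def triage(message: str, source_lang: str, safety, translator, extractor) -> dict:
    # 1) Safety runs on the ORIGINAL text: the detector supports the source language.
    if not safety.is_allowed(message, language=source_lang):
        return {"status": "blocked"}

    # 2) Translate only content that has already passed the safety gate.
    english = message if source_lang == "en" else translator.to_english(message)

    # 3) The English-validated prompts receive English input and emit canonical labels.
    return {"status": "allowed", "fields": extractor.extract(english)}
```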

Question 29

Topic: Implement Information Extraction Solutions

You are building a Foundry agent workflow that reviews scanned supplier contracts. The reviewer agent must reason over clause order, headings, tables, and page provenance before it can invoke an approval tool. You need the preprocessing step to create a grounded artifact for downstream reasoning. Which implementation should you use?

Options:

  • A. Embed raw OCR text chunks and ignore layout structure.

  • B. Have the approval tool parse the original PDF.

  • C. Ask the model to summarize the PDF into Markdown.

  • D. Call a Content Understanding analyzer and pass its Markdown artifact.

Best answer: D

Explanation: The workflow needs a document-derived artifact that preserves layout and can be consumed by the reasoning agent. A Content Understanding analyzer that outputs Markdown is the appropriate preprocessing step because it keeps headings, tables, order, and provenance available before tool approval decisions.

For document extraction workflows, Content Understanding analyzers can produce clean, layout-aware representations such as Markdown for downstream reasoning. In this scenario, the agent should not reason directly from an unstructured PDF or a lossy summary. The analyzer output becomes the grounded intermediate artifact that the reviewer agent can use to inspect clauses, follow table context, and retain source/page metadata before invoking an approval tool.

The key takeaway is to generate Markdown from the source document before agent reasoning, not after a model has already summarized or discarded structure.

  • Raw OCR only loses important layout signals such as headings, table relationships, and clause order.
  • Model summary first creates Markdown after reasoning, which can omit evidence and weaken grounding.
  • Approval parsing puts extraction in the wrong step; the approval tool should receive decisions or structured inputs, not perform document understanding.
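
A rough sketch of the preprocessing call over plain HTTP. The endpoint path, API version, and response shape below are assumptions for illustration, and the real analyze operation is long-running with polling, which is omitted here; check the Content Understanding documentation before relying on any of these details.

```python
import requests

ENDPOINT = "https://<resource>.cognitiveservices.azure.com"  # placeholder
ANALYZE_URL = f"{ENDPOINT}/contentunderstanding/analyzers/contract-layout:analyze"

def analyze_contract(pdf_url: str, token: str) -> dict:
    """Submit a scanned contract for layout-aware analysis (polling omitted)."""
    resp = requests.post(
        ANALYZE_URL,
        params={"api-version": "2024-12-01-preview"},  # assumed preview version
        headers={"Authorization": f"Bearer {token}"},
        json={"url": pdf_url},
    )
    resp.raise_for_status()
    # The analyzer result carries the Markdown artifact, with headings, tables,
    # and page provenance, which becomes the grounded input for the reviewer agent.
    return resp.json()
```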

Question 30

Topic: Implement Computer Vision Solutions

You are implementing a product-photo editing workflow by using the Foundry SDK. An edit request uses a deployed image-editing model, a private source image, and a mask. The request fails before generation.

Prompt safety: passed
Reference media fetch: 200 OK
Source image: 1024 x 1024 PNG
Mask image: 768 x 768 PNG
Model capability: image editing with masks
Error: Invalid mask geometry

Which implementation change should you make?

Options:

  • A. Rewrite the prompt to remove unsafe content.

  • B. Regenerate the mask at 1024 x 1024 pixels.

  • C. Switch to a text-to-image-only deployment.

  • D. Move the source image to a public URL.

Best answer: B

Explanation: The failure is caused by the mask, not by the prompt, reference media, policy, or model choice. For masked image editing, the mask must align with the source image geometry so the model can identify the exact region to edit.

In an image editing workflow, validation can fail before generation if the provided mask is incompatible with the source image. The exhibit shows the prompt passed safety checks, the reference media was fetched successfully, and the deployed model supports masked editing. The only failing evidence is Invalid mask geometry, with a 768 x 768 mask for a 1024 x 1024 source image. Regenerating the mask at the same dimensions as the source preserves the intended private, edit-capable workflow.

Changing the prompt, media location, or model would not address the geometry mismatch shown in the failure details.

  • Prompt safety is not the issue because the safety check already passed.
  • Public media access is unnecessary because the reference media fetch succeeded, and moving the image to a public URL would weaken the private design.
  • Text-to-image only does not preserve the required source-image editing workflow with a mask.
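
A small preflight check like the following sketch, using Pillow, catches the mismatch in the client before the request is ever sent.

```python
from PIL import Image  # pip install pillow

def validate_mask(source_path: str, mask_path: str) -> None:
    """Fail fast when the mask does not match the source image geometry."""
    with Image.open(source_path) as source, Image.open(mask_path) as mask:
        if mask.size != source.size:
            raise ValueError(
                f"Invalid mask geometry: mask {mask.size} must match source "
                f"{source.size}; regenerate the mask at the source resolution."
            )
```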

Question 31

Topic: Implement Information Extraction Solutions

An insurance company is building a RAG-backed Foundry agent for claims specialists. Source content includes 80 GB of PDF manuals in Azure Storage, scanned claim forms, and SharePoint policy pages that change throughout the day. The agent must cite page or section sources and enforce each user’s document permissions at retrieval time. Newly changed content must be searchable within 30 minutes. Which ingestion approach should you choose?

Options:

  • A. Index whole-file embeddings with a nightly full refresh.

  • B. Use incremental Azure AI Search indexing with OCR-enriched chunks, source metadata, and ACL filters.

  • C. Load retrieved source files into the model prompt for filtering.

  • D. Create public summaries and index only the summaries.

Best answer: B

Explanation: The best ingestion design uses Azure AI Search with incremental indexing, OCR enrichment, chunking, and metadata needed for citations and security trimming. This matches the content types, 30-minute freshness requirement, large corpus size, and downstream grounding needs.

For grounded RAG, ingestion must prepare retrievable units that preserve meaning, source provenance, freshness, and access control. OCR enrichment makes scanned forms searchable, chunking creates passages suitable for retrieval, and source metadata supports page or section citations. Incremental indexing avoids reprocessing the full 80 GB corpus while meeting the 30-minute freshness goal. ACL metadata enables retrieval-time filtering so the Foundry agent only grounds answers in documents the user is allowed to access. Whole-document embeddings or prompt-time filtering do not provide the same grounding quality, freshness, or security guarantees.

  • Nightly refresh misses the 30-minute freshness requirement and whole-file embeddings reduce retrieval precision.
  • Prompt filtering is not a reliable access-control boundary and does not scale for 80 GB of content.
  • Public summaries remove source-level provenance and violate the permission-aware retrieval requirement.
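
At query time, the ACL metadata becomes an OData filter. A minimal sketch with the `azure-search-documents` SDK, where the index name and fields such as `group_ids` and `source_page` are illustrative assumptions:

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<search>.search.windows.net",  # placeholder
    index_name="claims-docs",                        # illustrative index name
    credential=DefaultAzureCredential(),
)

def retrieve(query: str, user_groups: list[str]):
    # Security trimming: only documents tagged with one of the caller's groups.
    acl_filter = " or ".join(f"group_ids/any(g: g eq '{g}')" for g in user_groups)
    return client.search(
        search_text=query,
        filter=acl_filter,
        select=["content", "source_doc", "source_page"],  # citation metadata
        top=5,
    )
```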

Question 32

Topic: Implement Information Extraction Solutions

A financial services team is building a Microsoft Foundry agent to review scanned onboarding packets. The packets contain PII, and policy permits processing them only if the solution uses keyless access, private network paths, clean extracted fields and layout-aware markdown, and source provenance for each recommendation. The team must not block valid packets solely because they contain PII.

Which implementation should you choose?

Options:

  • A. Pass raw PDFs to the agent with safety filters and trace logging.

  • B. Reject packets when PII is detected before agent processing.

  • C. Use Content Understanding extraction with managed identity and provenance outputs.

  • D. Use a stored storage key so the agent can read raw PDFs.

Best answer: C

Explanation: The requirement is not just to moderate unsafe content; it is to transform scanned documents into governed, traceable representations before the agent uses them. Content Understanding extraction with managed identity supports clean fields, layout-aware markdown, and provenance without blocking legitimate PII-containing packets.

For document extraction scenarios, an agent should not reason directly over raw unstructured scans when the business requires trusted fields, layout, markdown, or citations. Use Azure Content Understanding or document extraction capabilities first, secure the data path with managed identity and private networking, and pass the agent only the extracted representation plus provenance metadata. This lets the agent summarize or recommend using auditable source spans while role policies and keyless access reduce credential risk.

Safety filters and trace logs are still useful, but they do not replace OCR, layout analysis, field extraction, or provenance generation.

  • Raw PDF prompting fails because moderation and trace logging do not create clean fields, layout-aware markdown, or source provenance.
  • PII rejection fails because the policy allows PII processing and requires legitimate packets to continue.
  • Stored key access fails because it uses long-lived credentials and still gives the agent raw unstructured documents.

Question 33

Topic: Implement Text Analysis Solutions

A Microsoft Foundry agent triages maintenance voice notes. The current flow transcribes each audio file, summarizes the transcript, and uses the summary to query Azure AI Search for repair manuals.

Technicians report irrelevant answers when they say, “the pump makes this sound,” and then record 15 seconds of abnormal noise. Trace logs show accurate speech transcription, but the noise segment appears as [non-speech audio]. The agent must reason over both the spoken description and the recorded sound.

What is the best next fix?

Options:

  • A. Improve the text summarization prompt for the transcript.

  • B. Increase the number of Azure AI Search results returned.

  • C. Translate the transcript before querying the search index.

  • D. Route the original audio to a multimodal reasoning workflow.

Best answer: D

Explanation: The problem is not inaccurate transcription or weak search recall. The workflow loses the diagnostic pump sound because it converts the input to text and treats non-speech audio as unavailable content. A multimodal reasoning workflow can use the raw audio signal with the spoken context.

For speech-enabled agents, choose transcription when the relevant information is spoken words, translation when language conversion is needed, summarization when a long transcript must be condensed, and multimodal reasoning when the audio itself contains information the model must interpret. In this scenario, the transcript is accurate, but the decisive evidence is an abnormal non-speech sound. Summarizing or retrieving from the text transcript cannot recover information that was discarded before reasoning.

The key takeaway is to preserve and route the modality that contains the signal needed for the task.

  • Prompt tuning fails because the transcript does not contain the pump sound needed for diagnosis.
  • Translation does not address the symptom; no language mismatch is shown.
  • More search results cannot fix missing acoustic evidence before retrieval.
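
The routing decision can be a simple branch once the trace signal is known. The `text_agent` and `multimodal_agent` objects below are hypothetical stand-ins for the two workflows:

```python
NON_SPEECH_MARKER = "[non-speech audio]"

def route_voice_note(transcript: str, audio_path: str, text_agent, multimodal_agent):
    """Send the note to the pipeline that can actually use its evidence."""
    if NON_SPEECH_MARKER in transcript:
        # The diagnostic signal is in the sound itself: keep the raw audio.
        return multimodal_agent.analyze(audio=audio_path, context=transcript)
    # Pure speech: the existing transcribe -> summarize -> search flow is fine.
    return text_agent.answer(transcript)
```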

Question 34

Topic: Plan and Manage an Azure AI Solution

Your team is building an internal policy assistant in a Microsoft Foundry project. The agent is deployed to Azure Container Apps by CI/CD, calls a Foundry model deployment, and queries an Azure AI Search index for grounding. All traffic to Foundry and Search must use private endpoints, and security forbids API keys, connection strings, and service principal secrets. RBAC assignments must survive container app redeployments and blue/green replacements. Which identity design is the best fit?

Options:

  • A. Use a system-assigned managed identity with Contributor on the resource group.

  • B. Store Foundry and Search keys in Key Vault and inject them at deployment.

  • C. Use a service principal secret behind an API gateway to broker access.

  • D. Use a user-assigned managed identity with keyless SDK authentication and least-privilege RBAC.

Best answer: D

Explanation: A user-assigned managed identity is the best fit when the app identity must remain stable across redeployments. The agent can use Azure SDK keyless authentication over private endpoints, with RBAC scoped only to the Foundry and Azure AI Search resources it needs.

Managed identity removes stored credentials from the app and CI/CD pipeline. In this scenario, the redeployment constraint makes a user-assigned managed identity preferable because its principal is independent of the container app lifecycle. Assign only the required data-plane roles for model invocation and search queries, and keep network access restricted through private endpoints. This satisfies keyless authentication, least privilege, private networking, and stable RBAC without adding a custom credential broker.

  • System-assigned identity can work for simple deployments, but its principal changes if the hosting resource is ever replaced, and Contributor on the resource group grants far more than the app needs.
  • Key injection still depends on API keys and deployment-time secrets, which the security requirement explicitly forbids.
  • Secret broker pattern adds unnecessary complexity and still relies on a service principal secret rather than managed identity.
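
A minimal sketch with `azure-identity`: the user-assigned identity is selected by its client ID, and the data-plane client is constructed without any key or connection string. Endpoint and index names are placeholders.

```python
from azure.identity import ManagedIdentityCredential
from azure.search.documents import SearchClient

# The client ID identifies the user-assigned identity, which is a separate Azure
# resource and therefore survives container app replacements and blue/green swaps.
credential = ManagedIdentityCredential(client_id="<user-assigned-client-id>")

search = SearchClient(
    endpoint="https://<search>.search.windows.net",  # reached over a private endpoint
    index_name="grounding-index",                    # illustrative name
    credential=credential,                           # keyless: no key or secret anywhere
)
```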

Question 35

Topic: Implement Information Extraction Solutions

Your team is building a Foundry RAG assistant for contracts stored as PDF, TIFF, and DOCX files in Azure Blob Storage. Many PDFs are scanned images with no embedded text. The ingestion flow must populate an Azure AI Search-backed RAG index with searchable chunks from both native document text and image-based text, including source/page metadata for citations. Which TWO configurations can meet the requirement? Select TWO.

Options:

  • A. Store scanned pages in Foundry agent memory and skip ingestion OCR.

  • B. Run a Content Understanding OCR/layout analyzer, then index chunked page text.

  • C. Use an Azure AI Search indexer skillset with normalized-image OCR and chunk projection.

  • D. Apply semantic ranking to native extracted text without OCR.

  • E. Vectorize only file names and blob metadata.

  • F. Use image classification tags as the only text for scanned pages.

Correct answers: B and C

Explanation: Scanned pages must be converted to text during ingestion before retrieval can use them. Both Azure AI Search OCR enrichment and Content Understanding OCR/layout analysis can produce text that is chunked, indexed, and tied to source/page metadata for grounded RAG citations.

RAG ingestion for image-only documents needs an extraction stage before retrieval indexing. Azure AI Search can handle this in an indexer skillset by creating normalized page images, applying OCR, and projecting the resulting text into chunks with metadata. Alternatively, a Content Understanding analyzer configured for OCR/layout can produce page-level text or markdown before those chunks are embedded and indexed in Azure AI Search. The key is that text from scanned content must exist in the index as retrievable content; semantic ranking and vector search improve retrieval over indexed content but do not extract text from pixels.

  • Semantic-only indexing fails because semantic ranking reranks existing searchable text and does not OCR scanned pages.
  • Metadata embeddings fail because file names and blob metadata do not contain the contract body text.
  • Image tags or captions fail because they are not a full OCR extraction of image-based document text.
  • Agent memory fails because it is not a governed ingestion path for creating grounded, citeable search chunks.
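
A sketch of the skillset half of option C, using the `azure-search-documents` management models. The indexer must also be configured to generate normalized images, and the chunking and projection setup is omitted here:

```python
from azure.search.documents.indexes.models import (
    InputFieldMappingEntry,
    OcrSkill,
    OutputFieldMappingEntry,
    SearchIndexerSkillset,
)

# Assumes the indexer's imageAction is "generateNormalizedImages" so scanned
# pages arrive at /document/normalized_images/* for the OCR skill to consume.
ocr = OcrSkill(
    name="ocr-scanned-pages",
    context="/document/normalized_images/*",
    inputs=[InputFieldMappingEntry(name="image",
                                   source="/document/normalized_images/*")],
    outputs=[OutputFieldMappingEntry(name="text", target_name="ocr_text")],
)

skillset = SearchIndexerSkillset(
    name="contract-ingestion",
    description="OCR scanned pages so image-only text becomes indexable",
    skills=[ocr],
)
```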

Question 36

Topic: Implement Generative AI and Agentic Solutions

A support agent in a Foundry project uses a Foundry connector to retrieve policy articles from Azure AI Search before generating answers. After release, users report that some answers cite articles that do not support the response. Which monitoring approach best validates the quality goal?

Options:

  • A. Add tests for exact SDK method names

  • B. Track only model token usage per conversation

  • C. Trace retrieval spans and run groundedness evaluations

  • D. Log only successful connector HTTP status codes

Best answer: C

Explanation: The quality issue is grounding, not SDK syntax or basic connectivity. End-to-end traces show which content the connector retrieved, and groundedness evaluations measure whether responses are supported by that retrieved evidence.

For a RAG-backed agent, observability should connect the generated answer to the retrieval evidence used in the same run. Trace logging can capture the Azure AI Search connector call, retrieved chunks, citations, latency, and tool spans. Groundedness or relevance evaluations can then score whether the final answer is supported by those retrieved sources. This targets the stated failure: citations that do not support the response. Token totals, HTTP success codes, and SDK method tests can be useful in other contexts, but they do not validate answer support against retrieved content.

  • Token analytics can help with cost and prompt-size trends, but it does not prove cited policy articles support an answer.
  • HTTP status logging confirms connector availability, not retrieval quality or grounding.
  • SDK method tests check implementation details and do not monitor runtime answer quality.
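
A minimal sketch of the tracing half with the OpenTelemetry API; the `retriever` and `model` objects and the attribute names are illustrative:

```python
from opentelemetry import trace

tracer = trace.get_tracer("support-agent")

def answer_with_grounding(question: str, retriever, model):
    with tracer.start_as_current_span("retrieval") as span:
        chunks = retriever.search(question)
        # Record the evidence so a groundedness evaluator can compare the final
        # answer against exactly what was retrieved in this run.
        span.set_attribute("retrieval.chunk_ids", [c["id"] for c in chunks])
        span.set_attribute("retrieval.count", len(chunks))
    answer = model.generate(question, context=chunks)
    return answer, chunks  # both feed the offline groundedness evaluation
```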

Question 37

Topic: Implement Generative AI and Agentic Solutions

You are building a Microsoft Foundry agent for employee reimbursements. The agent can review receipts and call a finance connector that issues payments. Company policy requires reimbursements over USD 1,000 to be approved by a manager, and every payment must have an audit record before the connector is called. Which implementation should you use?

Options:

  • A. Use workflow steps for validation, approval, audit, and payment.

  • B. Issue payments immediately and flag exceptions in monitoring.

  • C. Let the agent call the connector after retrieving policy text.

  • D. Add the policy and payment rules to the system prompt.

Best answer: A

Explanation: Business-critical actions should be controlled by explicit workflow steps, not hidden inside a prompt. In this scenario, the payment connector must run only after deterministic validation, manager approval, and audit recording.

For agentic solutions in Microsoft Foundry, prompts can guide reasoning, but they should not be the only control for regulated or high-impact business actions. A reimbursement payment is an external side effect, so the flow should separate the model’s recommendation from the execution path. The workflow should validate the amount, route over-limit requests to an approval step, write the audit record, and only then allow the payment tool or connector to run. This makes the process inspectable, testable, and enforceable even if the model output is incomplete or ambiguous. The key takeaway is to place critical gates in the workflow/tool orchestration layer rather than relying on natural-language instructions alone.

  • Prompt-only control fails because prompt instructions are not a reliable enforcement boundary for payments.
  • Retrieved policy text can inform the agent, but it does not create a required approval gate.
  • After-the-fact monitoring detects violations too late because the payment has already been issued.

Question 38

Topic: Implement Information Extraction Solutions

A team is building a RAG-backed troubleshooting agent in a Microsoft Foundry project. The source articles have already been extracted and chunked. User queries often describe symptoms in different words than the articles, and the main requirement is to retrieve chunks with the closest embedding similarity for grounding. Which retrieval approach should the team implement?

Options:

  • A. Store chunk embeddings and query with vector search.

  • B. Enable OCR enrichment on the indexed articles.

  • C. Use keyword search with a synonym map.

  • D. Use semantic ranking over text-only fields.

Best answer: A

Explanation: The deciding requirement is similarity over embeddings, not text extraction or exact keyword matching. Azure AI Search vector search supports nearest-neighbor retrieval against stored chunk vectors, which is the appropriate grounding method for semantically similar wording.

Vector search is the right retrieval pattern when the application must compare embedding vectors for semantic similarity. In this scenario, the articles are already extracted and chunked, so the missing capability is not OCR or enrichment. The index should include a vector field containing embeddings for each chunk, and the RAG pipeline should embed the user query and use vector search to retrieve the nearest chunks for grounding. Semantic ranking and keyword techniques can improve some text retrieval scenarios, but they do not directly satisfy a requirement to rank by embedding similarity.

  • OCR enrichment does not address retrieval quality because the content is already extracted and chunked.
  • Keyword search depends on term overlap and synonyms, which misses the stated embedding-similarity requirement.
  • Semantic ranking can rerank text results, but it is not the primary mechanism for nearest-neighbor retrieval over vectors.
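
A minimal query-side sketch with `azure-search-documents` (11.4 or later). The `content_vector` field name and the `embed` function, which must use the same embedding model as ingestion, are assumptions:

```python
from azure.search.documents.models import VectorizedQuery

def ground(query: str, embed, search_client, k: int = 5):
    vector = embed(query)  # same embedding model used at indexing time
    results = search_client.search(
        search_text=None,  # pure vector query: term overlap is not required
        vector_queries=[VectorizedQuery(vector=vector, k_nearest_neighbors=k,
                                        fields="content_vector")],
        select=["content", "source"],
    )
    return list(results)
```

Passing a non-empty `search_text` alongside the vector query turns this into a hybrid query, which can help when exact terms such as part numbers still matter.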

Question 39

Topic: Implement Generative AI and Agentic Solutions

A team deployed a multi-agent claims assistant from a Microsoft Foundry project. The assistant uses Azure AI Search retrieval, a policy-check tool, and a human-approval tool before final responses. Managers need weekly error analysis that separates missing grounding, failed tool calls, safety blocks, and slow approval steps. Which monitoring approach should you implement?

Options:

  • A. Track only model token usage and aggregate request latency.

  • B. Capture Foundry agent traces and run evaluators on grounding, tools, safety, and latency spans.

  • C. Store final chat transcripts and manually review only low-rated sessions.

  • D. Evaluate the base model with a static prompt set outside the agent.

Best answer: B

Explanation: The goal is step-level observability for a deployed agent workflow. Foundry agent traces plus evaluators can show where failures occur across retrieval, tool invocation, safety handling, and human-approval latency instead of only measuring the final response.

For deployed agents, monitoring should preserve the execution path as trace spans: model calls, retrieval calls, tool invocations, safety decisions, and human-approval steps. Running evaluators over those captured runs lets the team score qualities such as groundedness, tool-call success, safety handling, and latency by component. This supports weekly error analysis because each bad outcome can be tied to a specific span or evaluator result. Aggregate model metrics are useful, but they cannot explain whether the agent failed because retrieval was weak, a tool call failed, or an approval step was slow.

  • Token-only monitoring misses retrieval, tool-call, safety, and approval-step causes.
  • Base-model evaluation excludes the deployed agent orchestration, tools, and grounding path.
  • Transcript-only review can surface examples but does not provide systematic step-level telemetry.

Question 40

Topic: Plan and Manage an Azure AI Solution

You are troubleshooting a Microsoft Foundry project agent for HR policy questions. Users report confident but outdated answers with no citations.

Exhibit: Trace summary

Requirement: answer only from current SharePoint policies and cite the policy URL
Agent trace: response used conversation memory; no retrieval or tool call occurred
Knowledge sources: none configured
Memory: enabled for user preferences and prior chats
Index status: SharePoint policy index in Azure AI Search is healthy with URL metadata

Which next fix best addresses the root cause?

Options:

  • A. Fine-tune the model daily on HR policies.

  • B. Store summarized HR policies in conversation memory.

  • C. Increase the model’s maximum output tokens.

  • D. Connect the Azure AI Search index as the agent’s retrieval source.

Best answer: D

Explanation: The issue is not index health; the index is healthy but unused. For grounded answers from enterprise content, the agent needs a retrieval or knowledge integration, such as an Azure AI Search-backed source, rather than relying on conversation memory.

Grounded enterprise answers require an authoritative retrieval path from the agent to the enterprise content. In this trace, the SharePoint content has already been indexed and includes URL metadata, but the agent has no knowledge source configured and no retrieval call occurs. Conversation memory is appropriate for prior interaction context or preferences, not for serving as the authoritative store for changing HR policies. Connecting the Azure AI Search index lets the agent retrieve current policy passages and use their metadata for citations.

The key takeaway is to fix the knowledge integration pattern, not the model generation settings.

  • Memory as knowledge fails because memory is not the authoritative retrieval layer for current enterprise policy documents.
  • Fine-tuning for facts fails because changing policy content should be retrieved, not baked into model weights.
  • Longer responses fail because token limits do not create grounding or citations when no retrieval source is called.

Question 41

Topic: Implement Text Analysis Solutions

A Foundry project app extracts customer, entities, topics, and summary from service-case notes into a JSON record. After deployment, 18% of runs fail schema validation because topics is sometimes a comma-separated string and entities sometimes contains prose.

Trace excerpt:

Prompt: Return valid JSON that matches this shape...
Enabled tools: none
Configured tool: emit_case_json (JSON schema)
Tool calls: 0
Parser error: $.topics expected array

What is the best next fix?

Options:

  • A. Add Azure AI Search grounding for case notes

  • B. Lower temperature and keep prompt-only JSON output

  • C. Enable and require the emit_case_json tool call

  • D. Increase the model deployment token limit

Best answer: C

Explanation: The failure is a structured-output enforcement issue, not a retrieval or capacity issue. The trace shows a JSON schema tool is configured but no tools are enabled and no tool call occurs, so the model is only following a prompt convention rather than a schema-bound contract.

For entity, topic, summary, and structured JSON extraction, a Foundry Tool with a JSON schema can make the model emit arguments that conform to the expected fields and types. The evidence shows prompt-only extraction: no enabled tools and zero tool calls. Prompt instructions such as “return valid JSON” are useful, but they do not reliably enforce arrays, objects, or required fields across all inputs. Enabling the schema-backed tool and requiring the extraction step to call it provides a stronger contract for downstream validation.

The key troubleshooting signal is the mismatch between a configured schema tool and the trace showing it was never available to the run.

  • Token limit does not explain type mismatches such as an array becoming a comma-separated string.
  • Search grounding helps when answers lack source context, but the case notes are already being processed.
  • Prompt-only control may reduce variation but still does not enforce the JSON schema as reliably as a tool call.
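
A sketch of the fix using an OpenAI-compatible chat client as an assumption (the Foundry SDK surface may differ): the tool's JSON schema declares `topics` and `entities` as arrays, and `tool_choice` forces the call on every run.

```python
from openai import OpenAI  # any OpenAI-compatible client; shown as an assumption

client = OpenAI()
case_notes = "..."  # the service-case note text

emit_case_json = {
    "type": "function",
    "function": {
        "name": "emit_case_json",
        "parameters": {
            "type": "object",
            "properties": {
                "customer": {"type": "string"},
                "entities": {"type": "array", "items": {"type": "string"}},
                "topics": {"type": "array", "items": {"type": "string"}},
                "summary": {"type": "string"},
            },
            "required": ["customer", "entities", "topics", "summary"],
        },
    },
}

response = client.chat.completions.create(
    model="<deployment-name>",
    messages=[{"role": "user", "content": case_notes}],
    tools=[emit_case_json],
    # Force the tool call so every run returns schema-shaped arguments, not prose.
    tool_choice={"type": "function", "function": {"name": "emit_case_json"}},
)
```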

Question 42

Topic: Implement Generative AI and Agentic Solutions

You are building a Microsoft Foundry HR policy assistant that uses RAG with an Azure AI Search index of approved policy documents. Testers report that the assistant sometimes gives confident answers that are not supported by the indexed documents. You need to reduce fabrication risk and make policy claims traceable to approved sources. Which TWO actions should you implement?

Options:

  • A. Require response citations that reference retrieved source metadata.

  • B. Fine-tune the model on historical HR chat transcripts.

  • C. Run Azure AI Content Safety moderation on every response.

  • D. Increase the generation temperature for more varied wording.

  • E. Use conversation memory as the primary source for policy details.

  • F. Pass retrieved document chunks into the model prompt as grounding context.

Correct answers: A and F

Explanation: RAG reduces fabrication by grounding generation in retrieved content from authoritative sources. Passing retrieved chunks into the prompt and requiring citations from retrieved metadata help ensure answers are source-backed and traceable.

For a RAG-backed Foundry app, the model should not answer policy questions only from its pretrained knowledge or conversation history. The app should retrieve relevant chunks from Azure AI Search, provide those chunks as grounding context, and instruct the model to answer only when the retrieved context supports the claim. Source metadata, such as document name, URL, page, or chunk ID, should be preserved so the response can cite the retrieved evidence. This does not guarantee correctness by itself, but it directly reduces unsupported fabrication and gives users provenance for validation. Moderation, fine-tuning, and memory can be useful in other parts of an AI solution, but they do not replace grounded retrieval and citations for authoritative policy answers.

  • Higher temperature makes wording more diverse and can increase unsupported variation rather than grounding the answer.
  • Conversation memory is not an authoritative policy source and can preserve user-provided or stale information.
  • Fine-tuning on chats may improve style or patterns, but it does not ensure current approved policy grounding or citations.
  • Content moderation helps detect unsafe content, but it does not prove that policy claims came from retrieved documents.
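
A minimal sketch of how retrieved chunks and their metadata can be threaded into the prompt so citations are possible; the chunk keys (`doc_id`, `page`, `text`) are illustrative:

```python
def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble the prompt so every claim can cite a retrieved source."""
    context = "\n\n".join(
        f"[{c['doc_id']} p.{c['page']}] {c['text']}" for c in chunks
    )
    return (
        "Answer ONLY from the sources below. Cite the [doc_id page] tag for each "
        "claim. If the sources do not cover the question, say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
```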

Question 43

Topic: Implement Information Extraction Solutions

You are designing a Microsoft Foundry project for compliance review. Azure Content Understanding extracts contract fields and markdown sections into an Azure AI Search index. The solution has these requirements: the batch validation workflow must always retrieve the current policy section and record source IDs before calling a model; a reviewer-facing agent must decide during multi-turn chats when policy search is needed before using other tools.

Which TWO design choices should you implement?

Options:

  • A. Embed the Azure AI Search query in the batch workflow before the model call.

  • B. Expose Azure AI Search as a retrieval tool for the reviewer agent.

  • C. Let the agent read raw PDFs directly instead of using the index.

  • D. Run a fixed retrieval query before every reviewer-agent turn.

  • E. Expose batch validation retrieval only as an optional agent tool.

  • F. Fine-tune the model on extracted contracts instead of using retrieval.

Correct answers: A and B

Explanation: Use an agent retrieval tool when the agent needs discretion about whether and when to search during a conversation. Embed retrieval in the application workflow when retrieval is mandatory, deterministic, and must produce auditable source IDs before model invocation.

The placement of retrieval depends on control. For the batch validation workflow, retrieval is a required processing step, so the application should call Azure AI Search directly, capture the selected chunks and source IDs, and pass that grounded context to the model. For the reviewer-facing experience, the agent is handling open-ended multi-turn questions and may need to search only when relevant before using other tools. That retrieval should be exposed as an agent tool with an appropriate description and access controls. The key distinction is deterministic orchestration by the app versus discretionary tool use by the agent.

  • Optional batch tool fails because the batch workflow must always retrieve policy content and record source IDs.
  • Fixed agent retrieval removes the agent’s discretion and may retrieve unnecessarily for turns that do not need policy grounding.
  • Fine-tuning substitute does not provide current retrieved evidence or citations from the extraction index.
  • Raw PDF access bypasses the prepared extraction and retrieval pipeline needed for consistent grounding and provenance.

Question 44

Topic: Plan and Manage an Azure AI Solution

You manage a Foundry project for an HR knowledge assistant grounded by Azure AI Search. Evaluation shows the assistant often cites retired benefits policies even though the current PDFs are also indexed. The retrieval trace shows the old and current chunks have similar embeddings, and no version or effective-date fields are used during retrieval.

What should you change first to improve grounding quality?

Options:

  • A. Lower the model temperature for all assistant responses.

  • B. Add conversation memory for recent HR topics.

  • C. Fine-tune the model on the latest policy PDFs.

  • D. Add version metadata and apply freshness-aware retrieval filtering.

Best answer: D

Explanation: This is a grounding-quality problem caused by retrieval design, not by generation style or memory. Because old and current chunks are both indexed and semantically similar, the retrieval layer needs metadata and filtering or scoring that favors the currently effective policy versions.

For RAG solutions, low grounding quality is often fixed at the retrieval and indexing layer when the wrong sources are being selected before the model generates an answer. In this case, Azure AI Search is returning retired and current policy chunks as near matches, and the system has no indexed fields to distinguish active content from archived content. Adding version, status, and effective-date metadata during ingestion, then using filters or scoring profiles at query time, gives the assistant a reliable way to ground answers in current policy unless the user explicitly requests archives.

Changing model behavior can make responses sound different, but it cannot reliably correct retrieval results that contain the wrong source set.

  • Temperature tuning may reduce variation, but it does not stop retired documents from being retrieved.
  • Conversation memory helps with session context, not source freshness or policy version selection.
  • Fine-tuning risks baking policy content into the model and does not solve citation grounding against indexed sources.
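
Once the metadata exists, the first-line fix is a retrieval-time filter. A sketch, assuming illustrative `status`, `effective_date`, and `policy_version` fields added during ingestion:

```python
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<search>.search.windows.net",  # placeholder
    index_name="hr-policies",                        # illustrative index name
    credential=DefaultAzureCredential(),
)

# Retired chunks never reach the model; a scoring profile could instead boost
# recent effective dates if archived content must stay retrievable on request.
results = client.search(
    search_text="dental benefits waiting period",
    filter="status eq 'active'",
    select=["content", "policy_version", "effective_date", "source_url"],
    top=5,
)
```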

Question 45

Topic: Implement Generative AI and Agentic Solutions

You are building an expense-policy agent in a Microsoft Foundry project. The current design sends every user turn to a high-capability LLM, causing high cost and high p95 latency. Requirements: simple FAQ answers must be fast, reimbursement approvals must follow deterministic policy rules, and complex exceptions must still use the high-capability LLM. Which TWO actions should you take? (Select TWO.)

Options:

  • A. Route by task type to small or high-capability models.

  • B. Run all models sequentially and vote on every response.

  • C. Use Azure AI Search semantic ranking as the orchestrator.

  • D. Execute approval decisions in a rules engine tool.

  • E. Use the high-capability LLM for every request.

  • F. Put all approval policy rules only in the prompt.

Correct answers: A and D

Explanation: The best design uses orchestration to match each task to the lowest-cost component that still meets quality and governance needs. Simple requests can use smaller models, complex exceptions can use the high-capability LLM, and deterministic approvals should be handled by rules rather than prompts alone.

The core concept is task-fit orchestration across models and non-LLM components. In a Foundry agent or flow, a router can classify the request and send simple FAQ tasks to a smaller, faster model while escalating ambiguous or complex exceptions to the high-capability LLM. For reimbursement approval, the agent should call a deterministic rules engine tool so the decision follows policy consistently and can be audited. The LLM can still summarize or explain the decision, but it should not be the source of truth for approval logic. This design optimizes cost, latency, and quality without weakening policy enforcement.

  • Largest model only misses the cost and latency goal by keeping the most expensive path for simple requests.
  • Sequential voting may improve robustness, but it increases latency and cost for every turn.
  • Search ranking helps retrieve grounding data, but it does not orchestrate model selection or deterministic approvals.
  • Prompt-only rules make approval behavior probabilistic and harder to audit than a rules engine tool.
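
A compact sketch of the routing shape; the upstream classifier, the model clients, and the `rules_engine` are hypothetical components:

```python
def handle_turn(request: dict, small_model, large_model, rules_engine):
    task = request["task_type"]  # set by a lightweight classifier upstream
    if task == "faq":
        return small_model.answer(request["text"])        # fast, cheap path
    if task == "approval":
        decision = rules_engine.decide(request["claim"])  # deterministic, auditable
        return small_model.explain(decision)              # the LLM only narrates
    return large_model.answer(request["text"])            # complex exceptions
```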

Question 46

Topic: Implement Generative AI and Agentic Solutions

Contoso is building a claims-support agent in a Microsoft Foundry project. The agent uses a deployed chat model, managed identity to query a private Azure AI Search index over policy PDFs, and tool calls for claim status. A release pipeline must block promotion when the agent gives incorrect benefit guidance, unsupported citations, or unsafe financial/legal recommendations. Operations already tracks token usage and p95 latency separately. Which evaluation architecture is the best fit?

Options:

  • A. Select the model deployment with the lowest cost per completed answer.

  • B. Gate CI/CD on average token count, endpoint cost, and p95 latency.

  • C. Gate CI/CD on Foundry quality, groundedness, citation, and safety evaluations.

  • D. Approve releases after manual spot checks of production chat transcripts.

Best answer: C

Explanation: The scenario asks whether the agent is correct, grounded in retrieved policy content, and safe. The evaluation plan should use Foundry app/output evaluations with curated test prompts, expected answers or rubrics, source evidence, and safety cases, then gate CI/CD on those quality signals.

For generative AI and agent evaluations, the metric set must match the release risk. Here, the risk is not performance or spend; it is whether the claims agent gives correct guidance, cites retrieved policy evidence, and avoids unsafe recommendations. A best-fit Azure-native design uses Foundry evaluations and traces over a representative test set, including expected outcomes, retrieved context, citation checks, and safety/adversarial prompts. Those results can be used as a release gate in CI/CD while operational telemetry continues to track latency and token usage separately. Cost and latency are valid operational metrics, but they do not prove correctness, groundedness, or safety.

  • Latency gate fails because p95 latency and token count do not show whether answers are correct or grounded.
  • Cost selection fails because cheaper answers can still be unsupported or unsafe.
  • Manual spot checks fail because they do not provide a repeatable CI/CD quality and safety gate before promotion.
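
The gate itself can be a small script in the pipeline. A sketch with illustrative metric names and thresholds; the scores would come from an exported Foundry evaluation run:

```python
import sys

# Thresholds are illustrative; tune them to the team's release risk tolerance.
GATES = {"groundedness": 4.0, "citation_accuracy": 0.95, "safety_pass_rate": 1.0}

def release_gate(scores: dict[str, float]) -> int:
    """Return a nonzero exit code when any quality gate fails."""
    failures = {m: (scores.get(m, 0.0), t) for m, t in GATES.items()
                if scores.get(m, 0.0) < t}
    for metric, (score, threshold) in failures.items():
        print(f"GATE FAIL: {metric}={score} < {threshold}")
    return 1 if failures else 0

if __name__ == "__main__":
    # Example scores; a real run would parse the evaluation output file.
    sys.exit(release_gate({"groundedness": 4.3, "citation_accuracy": 0.97,
                           "safety_pass_rate": 1.0}))
```

A nonzero exit code fails the CI/CD step, blocking promotion until the evaluation results meet every threshold.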

Question 47

Topic: Plan and Manage an Azure AI Solution

A bank is piloting a Microsoft Foundry agent for loan-servicing staff. The agent can retrieve policy documents and invoke a payment-deferral tool. Governance requires the tool to run only for authorized staff, and deferrals over 30 days must pause for human approval before execution. The compliance lead asks for evidence that the deployed agent is enforcing these oversight and tool-access controls during real conversations. What should you monitor?

Options:

  • A. Retrieval relevance scores for policy documents

  • B. Average response latency and token usage by conversation

  • C. User satisfaction ratings after each conversation

  • D. Agent traces with tool-call, authorization, and approval events

Best answer: D

Explanation: The stated goal is to validate governance enforcement, not general quality or performance. Agent trace logging is the best evidence because it records runtime tool attempts, authorization outcomes, approval handoffs, and execution results for each conversation.

For agent governance, the key observability signal is a per-run trace or audit trail that captures the agent’s decisions around tools and oversight. In this scenario, compliance needs to know whether unauthorized users were blocked and whether long deferrals paused for human approval before the payment-deferral tool executed. Aggregate metrics such as latency, token consumption, relevance, or satisfaction can be useful, but they do not prove that the agent enforced role-based tool access or human-in-the-loop approval requirements. The key takeaway is to monitor the control points the policy governs: tool invocation, authorization, approval, and execution.

  • Performance metrics show operational efficiency but not whether governance controls were enforced.
  • Retrieval scores help assess grounding quality, not access control or approval workflows.
  • Satisfaction ratings may reveal user experience issues but cannot provide compliance evidence for tool governance.

Question 48

Topic: Implement Computer Vision Solutions

You are building a Foundry agent that answers questions about store audit photos. The current retrieval index contains only file names and manual titles, so answers are not grounded when users ask about visual details such as damaged packaging, warning labels, object location, or dominant colors. You need the agent to retrieve fresh photo evidence and cite the source image. What should you implement?

Options:

  • A. Enable OCR only and index the recognized text from each photo.

  • B. Fine-tune the model on previous store-audit answer transcripts.

  • C. Add upload-date metadata filters to the existing title-only index.

  • D. Configure a Content Understanding image analyzer and index its extracted visual fields.

Best answer: D

Explanation: The grounding gap is that the retrieval source lacks visual characteristics. Configuring Azure Content Understanding in Foundry Tools to extract objects, regions, labels, colors, and related visual fields creates searchable evidence that the agent can retrieve and cite.

For visual-understanding grounding, the ingestion pipeline must convert each image into reliable, structured evidence before retrieval. Azure Content Understanding in Foundry Tools is designed to analyze visual content and produce extracted characteristics that can be stored with the source image ID, timestamp, and citation metadata. Those extracted fields can then be indexed for retrieval so the agent grounds answers in current photo evidence instead of filenames or manual titles. OCR can help when text is visible, but it does not cover non-text visual attributes such as damage, object placement, or color.

  • OCR-only extraction misses non-text visual evidence such as object regions, packaging damage, and colors.
  • Metadata filtering can improve freshness but cannot create the missing visual characteristics needed for grounding.
  • Fine-tuning transcripts changes model behavior but does not ground answers in newly uploaded photo evidence.

Question 49

Topic: Plan and Manage an Azure AI Solution

A company is designing Microsoft Foundry infrastructure for an internal policy assistant. Approved policy files are published throughout the day to a private Azure Storage account, and many files are scanned PDFs. The agent must ground answers in the latest approved content, return citations, and restrict retrieval by business unit. Which design should you use?

Options:

  • A. Cache policy summaries in conversation memory and filter them by user profile.

  • B. Load the PDFs into the agent prompt and rely on system instructions for citations.

  • C. Fine-tune the model on the policy files and redeploy it after each publishing cycle.

  • D. Connect Azure AI Search with OCR enrichment, hybrid/vector indexing, citations, and filterable business-unit metadata.

Best answer: D

Explanation: The requirement is a RAG infrastructure decision, not a model-training decision. Azure AI Search is the appropriate grounding layer because it can index enriched content from documents, support hybrid/vector retrieval, preserve citation metadata, and filter results by business unit.

For a Foundry-based grounded assistant, the infrastructure should separate approved source storage from the searchable grounding index. Azure AI Search can ingest approved files, use enrichment such as OCR for scanned PDFs, create searchable chunks and embeddings, and store provenance fields for citations. Filterable metadata such as business unit enables retrieval-time constraints before content is supplied to the agent. This design keeps answers tied to current indexed sources instead of relying on static model knowledge or fragile prompt-only controls.

  • Fine-tuning policies fails because training does not provide fresh retrieval, per-request filtering, or reliable source citations.
  • Prompt-loaded PDFs fail because prompt context is not a scalable or controlled indexing and filtering layer.
  • Conversation memory summaries fail because memory is not the authoritative source for current approved content or citation-grade grounding.

Question 50

Topic: Implement Information Extraction Solutions

You are building a Foundry agent that reviews uploaded supplier invoices and routes exceptions to an approval workflow. The uploads are scanned PDFs and phone images. The agent must reason over normalized invoice fields, table layout, and a markdown representation with provenance for each flagged discrepancy.

Which workflow should you implement?

Options:

  • A. Attach the raw files to conversation memory.

  • B. Let the agent retrieve raw PDF chunks only.

  • C. Prompt the agent with base64 file contents.

  • D. Run a Content Understanding analyzer before the agent.

Best answer: D

Explanation: The scenario requires clean extracted fields, layout, markdown, and provenance, not just access to raw documents. The agent should receive a structured representation produced by an extraction step, such as a Content Understanding analyzer, before making routing decisions.

For document extraction workflows, use OCR, layout, field extraction, and Content Understanding analyzers to transform scanned or image-based documents into grounded representations the agent can safely consume. In this scenario, the agent’s role is to review extracted invoice facts and route exceptions, while the analyzer’s role is to produce normalized fields, table structure, markdown output, and provenance. Passing raw PDFs, images, or encoded file contents directly to the agent makes extraction inconsistent and weakens traceability. Retrieval can help find content, but it does not replace a document-understanding step when structured fields and layout are required.

  • Raw memory fails because conversation memory is not a reliable extraction pipeline for scanned invoices.
  • Base64 prompting fails because it bypasses OCR, layout parsing, and structured field normalization.
  • Raw chunk retrieval fails because retrieving document chunks alone does not guarantee clean fields, table layout, or markdown provenance.

Continue with full practice

Use the AI-103 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try AI-103 on Web View AI-103 Practice Test

Free review resource

Read the AI-103 Cheat Sheet on Tech Exam Lexicon for concept review before another timed run.

Revised on Thursday, May 14, 2026