Browse Certification Practice Tests by Exam Family

Free AI-901 Full-Length Practice Exam: 50 Questions

Try 50 free AI-901 questions across the exam domains, with explanations, then continue with full IT Mastery practice.

This free full-length AI-901 practice exam includes 50 original IT Mastery questions across the exam domains.

These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.

Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try AI-901 on Web View full AI-901 practice page

Exam snapshot

  • Exam route: AI-901
  • Practice-set question count: 50
  • Time limit: 45 minutes
  • Practice style: mixed-domain diagnostic run with answer explanations

Full-length exam mix

DomainWeight
Identify AI Concepts and Capabilities43%
Implement AI Solutions by Using Microsoft Foundry57%

Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and full practice.

Practice questions

Questions 1-25

Question 1

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer configured a single-agent solution in Microsoft Foundry with specific instructions and Foundry Tools. A lightweight application must invoke that configured agent behavior instead of sending raw prompts directly to a deployed model. Which implementation choice best maps to this need?

Options:

  • A. Use a text analysis client

  • B. Use an image generation client

  • C. Use a chat model client

  • D. Use an agent client

Best answer: D

Explanation: In Microsoft Foundry, an agent client is the appropriate choice when an application needs to call a configured agent. The configured agent can include instructions, tool connections, and behavior that are managed as part of the agent setup. A chat model client is better suited for sending messages directly to a deployed chat-capable model, where the application is responsible for supplying the needed prompt context and behavior. The key distinction is whether the application is using an agent configuration or directly invoking a model endpoint.

  • Direct chat call misses the configured agent behavior and treats the deployed model as the primary interface.
  • Text analysis is for extracting insights such as sentiment or entities, not invoking an agent.
  • Image generation targets creating images from prompts, not using Foundry agent instructions and tools.

Question 2

Topic: Identify AI Concepts and Capabilities

A finance team receives scanned supplier invoices and PDF purchase forms. They need to capture vendor names, invoice numbers, totals, and line-item relationships so the data can be reviewed and imported into an accounting system. Which AI capability is the best fit?

Options:

  • A. Image generation from prompts

  • B. Document and form information extraction

  • C. Speech recognition and transcription

  • D. General text sentiment analysis

Best answer: B

Explanation: Document and form information extraction is the best fit when the goal is to turn content in forms, invoices, receipts, or similar documents into structured data. The key clues are the need to capture named fields, their values, and relationships such as line items connected to totals or invoice identifiers. In Microsoft Foundry scenarios, Azure Content Understanding in Foundry Tools can support extraction from documents and forms. This is different from simply analyzing whether text is positive or negative, generating new images, or converting speech to text.

  • Sentiment analysis evaluates opinion or tone, not field-value pairs and document structure.
  • Image generation creates new visual content, so it does not extract accounting data from existing forms.
  • Speech recognition converts audio to text, which does not match the PDF and scanned invoice requirement.

Question 3

Topic: Identify AI Concepts and Capabilities

A team wants users to ask questions about a product defect by typing a description and attaching a photo in the same prompt. Which model capability is the best fit for this requirement?

Options:

  • A. A multimodal model

  • B. A speech recognition model

  • C. A text-only generative model

  • D. An image generation model

Best answer: A

Explanation: A multimodal model is needed when a prompt includes multiple input types, such as text plus an image, text plus audio, or another combination of modalities. In this scenario, the user is not only typing a description; they are also attaching a photo that the model must interpret to answer the question. A text-only model would not be sufficient because it cannot use the visual information in the photo as part of the prompt. The key mapping is: more than one input type in the prompt means multimodal capability is required.

  • Text-only model fails because the photo is part of the prompt, not just the written description.
  • Speech recognition fails because the scenario does not require converting spoken audio to text.
  • Image generation fails because the user needs analysis of an attached image, not creation of a new image.

Question 4

Topic: Identify AI Concepts and Capabilities

A customer support team receives short ticket messages and wants to automatically identify specific items mentioned in each message, such as product names, order numbers, and locations. Which text-analysis technique best matches this need?

Options:

  • A. Named entity recognition

  • B. Language detection

  • C. Key phrase extraction

  • D. Sentiment analysis

Best answer: A

Explanation: Named entity recognition is the best text-analysis technique when the goal is to find and classify specific entities in text. In this scenario, the support team is not asking whether the customer is happy or unhappy, or for a general summary of the ticket. They need structured details such as product names, order numbers, and locations so the ticket can be routed or processed. Entity recognition turns those mentions into usable categories for downstream triage. The key distinction is that entities are specific values, while key phrases are broader important phrases.

  • Sentiment analysis fails because it measures positive, negative, or neutral attitude rather than extracting specific items.
  • Key phrase extraction is tempting, but it returns important phrases without reliably categorizing them as products, identifiers, or locations.
  • Language detection fails because it identifies the text language, not the objects mentioned in the message.

Question 5

Topic: Identify AI Concepts and Capabilities

A clinic is building a Microsoft Foundry chat assistant to summarize patient messages. Developers need enough diagnostic information to troubleshoot prompt failures, but the clinic must prevent patient names, IDs, and medical details from being exposed in logs or accessed by unauthorized users. Which responsible AI concern is the best fit?

Options:

  • A. Inclusiveness for diverse abilities

  • B. Transparency about AI limitations

  • C. Fairness across user groups

  • D. Privacy and security of AI data

Best answer: D

Explanation: Privacy and security focus on protecting data used by an AI system, including prompts, outputs, logs, and stored artifacts. In this scenario, the key issue is that patient identifiers and medical details could be captured for troubleshooting and viewed by people who should not have access. Appropriate controls would include minimizing sensitive data in logs, redacting identifiers, restricting access, and applying secure data-handling practices. Fairness, inclusiveness, and transparency are also responsible AI principles, but they address different concerns than sensitive data exposure.

  • Fairness is about avoiding unjust differences in outcomes across groups, not protecting patient data in logs.
  • Inclusiveness is about making the solution usable by people with diverse needs and abilities, not access control.
  • Transparency is about explaining AI use and limitations, not securing sensitive prompts and outputs.

Question 6

Topic: Identify AI Concepts and Capabilities

A bank wants to use a generative AI model to draft customer hardship-case recommendations for service agents. The output may affect payment plans, must be recorded in the case file, and must follow bank policy. Which oversight response is the best fit?

Options:

  • A. Use a larger model to reduce review needs

  • B. Store only the final AI recommendation

  • C. Require agent review and approval before case updates

  • D. Let the model update cases when confidence is high

Best answer: C

Explanation: Accountability and human oversight are essential when AI output influences meaningful decisions, customer outcomes, operations, or records. In this scenario, the model is helping draft recommendations, but those recommendations can affect payment plans and become part of an official case file. The best response is to keep a human service agent responsible for reviewing, validating, and approving the recommendation before it changes the customer’s case or record. This does not prevent AI assistance; it ensures that AI remains advisory where policy, judgment, and customer impact matter. A higher-confidence or larger model can still make errors, and recordkeeping should preserve enough context for accountability.

  • Confidence automation fails because high confidence does not remove the need for oversight in customer-impacting decisions.
  • Larger model fails because model size does not establish accountability or policy compliance by itself.
  • Final-only record fails because it weakens traceability for decisions that affect customer records.

Question 7

Topic: Implement AI Solutions by Using Microsoft Foundry

A team is choosing a model deployment in the Foundry portal for a support app. Users will submit a photo of a damaged product and type a question. The app must return a written troubleshooting response. Which deployment best matches the required input and output modalities?

Options:

  • A. A multimodal model with image and text input, text output

  • B. An image-generation model with text input and image output

  • C. A speech model with audio input and text output

  • D. A text-only chat model with text input and text output

Best answer: A

Explanation: Model deployment selection in Foundry should match the solution’s required modalities: what the app sends to the model and what it expects back. In this case, the app sends two inputs: an image of the damaged product and a typed text question. It expects a text answer. That points to a deployed multimodal model that supports image plus text input and text output. A text-only deployment cannot inspect the image, and image generation is for creating images rather than answering from an uploaded image. The key is to map input and output types before choosing the deployment.

  • Text-only chat fails because it can process the typed question but not the uploaded product photo.
  • Speech processing fails because the scenario does not use spoken audio as input.
  • Image generation fails because the required output is a written troubleshooting response, not a generated image.

Question 8

Topic: Identify AI Concepts and Capabilities

A customer support team wants an AI feature that reads a short case summary and drafts a polite reply for an agent to review. The reply must use natural language, adapt to the customer’s issue, and not simply label or extract facts from the text. Which model capability is the best fit?

Options:

  • A. Keyword extraction for case topics

  • B. Entity extraction for names and dates

  • C. A generative AI model for text generation

  • D. Sentiment analysis for customer tone

Best answer: C

Explanation: A text generation task requires a model that can produce new language based on a prompt or input text. In this scenario, the system is not only identifying what is already present in the case summary; it must draft a complete response that an agent can review. Entity extraction, sentiment analysis, and keyword extraction are text analysis capabilities that identify existing information or labels in text. They can support a workflow, but they do not create the final reply.

  • Entity extraction finds items such as names, dates, or locations, but it does not compose a customer response.
  • Sentiment analysis detects tone or opinion, but it does not generate the reply text.
  • Keyword extraction identifies important terms or topics, but it does not write natural-language content.

Question 9

Topic: Identify AI Concepts and Capabilities

A team uses a deployed generative AI model in Microsoft Foundry for an internal help chat. Test users report that the same question often produces very different, overly broad answers and sometimes drifts into unrelated advice. The team wants more consistent, focused responses without changing the application code. Which configuration change is the best fit?

Options:

  • A. Use Azure Speech in Foundry Tools

  • B. Lower the model temperature

  • C. Switch to an image-generation model

  • D. Increase the maximum response length

Best answer: B

Explanation: Temperature is a generation configuration parameter that affects how random or creative model responses are. A higher temperature can make answers more varied and exploratory, which may be useful for brainstorming but can also cause broad or off-task output. In this scenario, the team wants more consistent and focused chat responses without changing code, so reducing temperature is the most direct configuration adjustment. Response length controls how much text can be produced, not whether the answer stays focused. Changing workload tools or model types does not address the stated text-generation behavior.

  • Longer output fails because increasing response length can make broad answers even longer instead of more focused.
  • Image generation fails because the workload is text chat, not creating images.
  • Speech tooling fails because speech processing does not control randomness in generated text responses.

Question 10

Topic: Identify AI Concepts and Capabilities

A marketing team wants an AI feature that can draft a new product description from a short prompt. The description should vary naturally by audience and tone, and it should not be limited to prewritten phrases selected from a rules table. Which approach best fits this need?

Options:

  • A. Use an object detection model to identify product images

  • B. Use a generative AI model to create text from the prompt

  • C. Use a fixed template library with keyword substitution

  • D. Use a sentiment analysis model to score the prompt

Best answer: B

Explanation: Generative AI creates new content, such as text, images, or audio, in response to a prompt. It does this by using patterns learned from training data, not by simply selecting from fixed templates. In this scenario, the team needs flexible product descriptions that change based on audience and tone, so a generative text model is the best fit. Template substitution can be useful for predictable wording, but it does not provide the same open-ended content generation.

  • Template substitution fails because it reuses predefined wording rather than generating new content from learned patterns.
  • Sentiment analysis fails because it classifies or scores text instead of drafting new text.
  • Object detection fails because it identifies items in images, not written product descriptions.

Question 11

Topic: Identify AI Concepts and Capabilities

A marketing team wants to create a new visual asset for a product launch. The app must generate an original banner image from a text description, allow style guidance such as “minimalist,” and should not rely on analyzing an existing image. Which model capability is the best fit?

Options:

  • A. Speech recognition with transcription

  • B. Image generation from a text prompt

  • C. Content Understanding for forms

  • D. Computer vision image classification

Best answer: B

Explanation: The key requirement is to create a new image, not classify, extract, or transcribe existing content. An image generation capability uses a text prompt to produce an original visual asset and can often follow creative instructions such as style, subject, composition, and tone. Computer vision capabilities are used to analyze existing images, while Content Understanding is used to extract information from content such as documents, forms, images, audio, or video. Speech recognition converts spoken audio to text. The best fit is the capability that produces a new visual output from a prompt.

  • Image analysis fails because classification labels or describes an existing image instead of creating a new banner.
  • Information extraction fails because form extraction targets structured data, not creative visual generation.
  • Speech processing fails because transcription works with audio input and text output, not image creation.

Question 12

Topic: Identify AI Concepts and Capabilities

A user asks a generative AI model for a policy summary. The response is fluent and certain in tone, but it includes a policy rule that is not present in the source documents and omits an important exception. Which generative model behavior best explains this observation?

Options:

  • A. The model applied deterministic business rules

  • B. The model generated a plausible but ungrounded response

  • C. The model verified the answer against all source documents

  • D. The model converted speech input to text

Best answer: B

Explanation: Generative AI models create responses by predicting likely content based on patterns in their training data and the prompt context. A response can therefore sound natural and confident without being fully grounded in the provided sources. This behavior is often described as a hallucination or an ungrounded response, especially when the model invents details or omits key facts. Confidence in wording is not the same as evidence, validation, or completeness.

The key takeaway is that fluent generated text should be checked against trusted sources when accuracy matters.

  • Business rules is wrong because deterministic rules would apply predefined logic rather than invent unsupported policy details.
  • Source verification is wrong because the scenario says the answer includes content not found in the documents.
  • Speech conversion is wrong because the issue is answer grounding, not speech-to-text processing.

Question 13

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is testing a lightweight chat client that sends customer support notes to a deployed generative AI model in Microsoft Foundry for summarization. The summary only needs issue type, product name, and requested action. The notes may include customer names, phone numbers, and account IDs. What should the developer do before sending the prompt to the model?

Options:

  • A. Use a larger generative model without changing the prompt

  • B. Increase the model temperature for more varied summaries

  • C. Move all customer details into the system prompt

  • D. Remove personal identifiers not needed for the summary

Best answer: D

Explanation: Prompts should include only the information needed for the model to complete the task. In this scenario, the model needs the issue type, product name, and requested action, but customer names, phone numbers, and account IDs are not required for that summary. Removing or masking those identifiers before sending the prompt reduces unnecessary exposure of personal data and aligns with privacy and security expectations. Changing model settings or model size does not address the data-minimization problem.

  • Temperature setting affects output variety, not whether sensitive data is protected.
  • System prompt relocation still exposes unnecessary personal data to the model.
  • Larger model choice may improve capability, but it does not reduce privacy risk in the prompt.

Question 14

Topic: Identify AI Concepts and Capabilities

A support team wants an AI solution that can understand a customer’s request, decide whether it needs an account lookup, call an approved internal tool when needed, and then summarize the result for the customer. The team also wants the behavior constrained by a configured instruction set. Which workload is the best fit?

Options:

  • A. An image generation workload

  • B. A text sentiment analysis workload

  • C. A single-agent solution

  • D. A basic chatbot

Best answer: C

Explanation: An agentic workload is identified by configured behavior and task-oriented action. In this scenario, the solution must decide when an account lookup is needed, use an approved internal tool, and summarize the outcome. Those are agent-like behaviors because the system is not only generating a conversational response; it is following instructions, making a task decision, and invoking a tool to complete part of the task. A basic chatbot label usually implies responding to user messages without independent tool use or configured task steps. The key signal is action toward a goal, especially with tools or defined behavior constraints.

  • Basic chatbot is too limited because the scenario requires deciding when to call a tool, not only chatting.
  • Sentiment analysis fits detecting opinion or emotion in text, not performing account lookup actions.
  • Image generation creates visuals from prompts and does not match customer-support tool use.

Question 15

Topic: Implement AI Solutions by Using Microsoft Foundry

A team is testing a lightweight chat client that uses a deployed generative AI model in Microsoft Foundry. The assistant should answer only questions about the company’s benefits policy, but it sometimes responds to unrelated travel and entertainment questions. The team must reduce off-task responses without changing the user’s immediate request. Which change is best?

Options:

  • A. Switch to image input so the request includes more context.

  • B. Increase the model temperature to make responses more varied.

  • C. Rewrite each user prompt to mention the benefits-policy scope.

  • D. Add a system prompt that defines the benefits-policy scope and redirects off-scope requests.

Best answer: D

Explanation: The system prompt is the right place to define the assistant’s role, boundaries, and expected behavior across turns. In this scenario, the user’s immediate request should not be changed, so the best fix is to update the system prompt with instructions such as answering only benefits-policy questions and politely redirecting unrelated requests. This reduces off-task behavior while preserving the original user prompt. Changing model creativity or modality does not directly establish the assistant’s allowed topic scope.

  • User prompt rewriting fails because the requirement says not to change the user’s immediate request.
  • Higher temperature can make outputs less predictable and does not enforce topic boundaries.
  • Image input changes the modality but does not address off-task text responses.

Question 16

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is building a support assistant in Microsoft Foundry. Users will upload a photo of damaged equipment and ask questions such as “What seems broken?” and “Is there an obvious safety concern?” The app needs a natural-language interpretation of the whole image, not exact text extraction or bounding boxes. Which capability is the best fit?

Options:

  • A. Use a deployed multimodal model with an image and prompt

  • B. Use object detection to return labeled bounding boxes

  • C. Use image generation to create a replacement equipment photo

  • D. Use OCR to extract all visible text from the image

Best answer: A

Explanation: Prompt-based visual interpretation uses a multimodal model that can process an image along with a user prompt and produce a natural-language response. In this scenario, the assistant must reason over the overall photo and answer user questions about visible damage and safety concerns. OCR is best when the goal is to extract printed or handwritten text. Object detection is best when the goal is to identify object categories and locations, often with bounding boxes. The key distinction is that the user wants descriptive interpretation, not structured text extraction or object localization.

  • OCR extraction fails because the scenario explicitly does not require exact text from the image.
  • Object detection fails because bounding boxes and object labels are not the requested output.
  • Image generation fails because the app must analyze an uploaded photo, not create a new image.

Question 17

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is preparing a prototype in Microsoft Foundry. The app must answer typed support questions, reason over uploaded product photos, and be tested in the portal before any SDK code is written. Which deployment decision is the best fit?

Options:

  • A. Skip portal testing and connect the SDK to the first listed model.

  • B. Deploy a ready multimodal generative model and test sample prompts in the portal.

  • C. Use Azure Content Understanding to extract form fields from the photos.

  • D. Deploy a text-only chat model and handle photos later in the client.

Best answer: B

Explanation: Model deployment in the Foundry portal should start with the workload requirements: the model must be ready to deploy, support the required input modalities, and be testable before application code depends on it. In this scenario, the prototype needs both text interaction and image reasoning, so a deployed multimodal generative model is the best fit. Testing sample text and image prompts in the portal helps confirm behavior before building a lightweight SDK client. A text-only model misses the photo requirement, while Content Understanding is better for extracting structured information from documents, forms, images, audio, or video rather than general visual chat.

  • Text-only shortcut fails because client code cannot add image-understanding capability to a model that does not support visual input.
  • Extraction workload fails because field extraction is not the same as conversational reasoning over product photos.
  • Untested model choice fails because the scenario requires confirming model behavior in the portal before SDK work.

Question 18

Topic: Implement AI Solutions by Using Microsoft Foundry

A media company has recordings of product demonstrations. The team needs to extract spoken product names, visible actions in the demo, and the times when key events occur. Which Microsoft Foundry approach is the best fit?

Options:

  • A. Use image generation from text prompts

  • B. Use sentiment analysis on the transcript

  • C. Use Azure Content Understanding for video extraction

  • D. Use Azure Speech for transcription only

Best answer: C

Explanation: Azure Content Understanding in Foundry Tools is the best fit when the input is media and the required output includes more than a transcript. It can support extraction of information from video or audio, including spoken facts, visual facts, events, and time-based observations. In this scenario, the team needs both what was said and what was visible, plus when key events occurred. Azure Speech is useful for speech-to-text, but it does not address visible actions in the video. The key distinction is media information extraction versus a single-purpose speech or text-analysis workload.

  • Speech-only transcription misses the visual actions and event timing required from the video.
  • Image generation creates new images; it does not extract facts from existing recordings.
  • Sentiment analysis interprets text tone, but it does not identify visual events or timestamps.

Question 19

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is testing a lightweight chat client that calls a deployed generative AI model in Microsoft Foundry. The model answers accurately, but the responses are too general because users type vague requests such as “Explain this policy.” Which change best maps to prompt revision rather than model training or deployment configuration?

Options:

  • A. Switch the deployment to a larger model family

  • B. Redeploy the model with a different temperature setting

  • C. Add audience, format, and task details to the user prompt

  • D. Train a new model on company policy examples

Best answer: C

Explanation: Prompt revision changes the instructions supplied to the model for a specific interaction. In this case, the model is already producing accurate answers, but the user request lacks context. Adding details such as the intended audience, desired format, scope, and constraints helps the deployed model generate a more useful response without changing model weights or deployment settings. Training or fine-tuning is about changing model behavior through data, and deployment configuration changes adjust how a selected model is hosted or sampled. The key distinction is that prompt revision is the fastest run-time change when the issue is unclear or incomplete instructions.

  • Training examples would be appropriate when the model needs learned domain behavior, not when the request is merely vague.
  • Temperature setting is a deployment or inference configuration choice, not a rewrite of the user’s instruction.
  • Larger model family changes the deployed capability and may be unnecessary when better prompt detail solves the issue.

Question 20

Topic: Implement AI Solutions by Using Microsoft Foundry

A team uses Azure Content Understanding in Foundry Tools to extract customer names and totals from uploaded invoices. Reviewers say they need to know when a field value was produced by AI processing rather than entered by a person. Which response best supports this need?

Options:

  • A. Block all low-confidence extracted fields

  • B. Balance invoice samples across vendors

  • C. Label extracted fields as AI-processed

  • D. Encrypt uploaded invoices at rest

Best answer: C

Explanation: Transparency means users should be able to understand when AI is being used and how AI affects the information they see. In this scenario, reviewers specifically need to know that extracted invoice fields came from AI processing. A clear label or disclosure on AI-extracted fields addresses that need without changing the extraction workflow. Privacy protections, balanced data, and confidence handling may also be important, but they map to different concerns. The key takeaway is that disclosure and understandable communication are transparency responses.

  • Privacy control fails because encryption protects data but does not tell reviewers that AI produced a value.
  • Fairness measure fails because balancing samples addresses bias risk, not disclosure of AI involvement.
  • Validation rule fails because blocking low-confidence fields supports reliability, but it does not communicate AI use for accepted fields.

Question 21

Topic: Identify AI Concepts and Capabilities

A city department plans to use a generative AI assistant in Microsoft Foundry to draft plain-language summaries of public comments. Stakeholders need to know that the summaries are AI-generated, what source material they are based on, and that staff must review them before publication. Which response best supports transparency?

Options:

  • A. Increase the model temperature for more varied summaries

  • B. Replace staff review with automated content filtering

  • C. Publish a user-facing AI disclosure and usage guidance

  • D. Collect additional personal data to improve context

Best answer: C

Explanation: Transparency means people affected by or using an AI system should understand when AI is involved, what the system is intended to do, and its important limitations. In this scenario, stakeholders specifically need disclosure that summaries are AI-generated, context about the source material, and guidance that the output is draft content requiring staff review before publication. A user-facing disclosure with usage guidance directly addresses those needs without changing the model or expanding data collection. Transparency does not mean making the model more creative or replacing accountability with automation. The key takeaway is to communicate AI involvement and appropriate reliance on its outputs.

  • Temperature tuning affects response variability, not whether stakeholders understand AI use and limitations.
  • Automated filtering may support safety, but it does not replace human accountability for public summaries.
  • Extra personal data raises privacy concerns and does not address the stated need for disclosure and usage limits.

Question 22

Topic: Implement AI Solutions by Using Microsoft Foundry

A city maintenance team uses a deployed multimodal model in Microsoft Foundry to interpret uploaded street images. The result may update a public hazard record and dispatch a repair crew. The team needs a response process that supports quick triage but avoids unsafe automated actions when the image interpretation is uncertain. Which approach is the best fit?

Options:

  • A. Lower the confidence threshold to reduce manual reviews

  • B. Automatically dispatch crews for every detected hazard

  • C. Send uncertain hazard results to human review before updating records

  • D. Use image generation to recreate unclear street scenes

Best answer: C

Explanation: When vision results can affect safety, user records, or operational actions, the response process should include a review step for uncertain or high-impact interpretations. A deployed multimodal model can help triage images, but its output should not be the only authority for actions such as updating official hazard records or dispatching crews when confidence is low or the situation is ambiguous. Human review adds accountability and reduces the risk of acting on a mistaken visual interpretation.

The key takeaway is to use AI output to assist decisions, not to bypass review for safety-impacting actions.

  • Automatic dispatch fails because it turns every model detection into an operational action without checking uncertainty or impact.
  • Image recreation fails because generating a clearer-looking scene does not verify the real-world hazard.
  • Lower threshold fails because it increases likely false positives and weakens safeguards for safety-related records.

Question 23

Topic: Identify AI Concepts and Capabilities

A junior developer is building a lightweight chat client by using Microsoft Foundry. The team has already chosen a generative AI model from the model catalog and now needs to test it in the Foundry portal and call it from the application by using an endpoint. What should the developer do next?

Options:

  • A. Create an image generation resource

  • B. Rewrite the user prompt

  • C. Select a different base model

  • D. Deploy the selected model

Best answer: D

Explanation: Selecting a model and deploying a model are separate steps in Microsoft Foundry. Selecting a model identifies the model capability you want to use, such as a chat-capable generative AI model. Deploying that selected model creates a usable deployment that can be tested in the Foundry portal and consumed by a lightweight application through the required connection details, such as an endpoint. In this scenario, the model choice is already complete, so the next need is availability for testing and app calls. Changing prompts may improve behavior later, but it does not make the model consumable by the client application.

  • Model choice repeat fails because the team has already selected the model from the catalog.
  • Prompt rewrite may affect responses, but it does not create an endpoint for application use.
  • Image resource targets a different workload and does not satisfy a chat client requirement.

Question 24

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is creating a lightweight chat client with the Foundry SDK. Before the client can send a user prompt and receive a response, which implementation detail must the app have?

Options:

  • A. A custom model training pipeline

  • B. A Content Understanding analyzer schema

  • C. A deployed model reference and connection information

  • D. An image-generation safety filter configuration

Best answer: C

Explanation: A Foundry SDK chat client sends prompts to a specific deployed model. The application needs a reference to that deployment, such as its deployment name or model identifier, plus connection information such as the project, endpoint, and credentials required by the SDK. Without these details, the client has no target model and no authorized path for the request.

Training pipelines, analyzer schemas, and image-generation settings can be useful for other AI workloads, but they are not the basic requirement for sending chat prompts to a deployed model.

  • Training pipeline is unnecessary because the chat client can use an already deployed model.
  • Analyzer schema applies to Content Understanding extraction, not basic chat prompt submission.
  • Image-generation settings target image output scenarios, not connecting a chat client to a model deployment.

Question 25

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is configuring Azure Content Understanding in Foundry Tools for supplier invoices. The finance team needs the vendor identity, payment due date, amount owed, and the purchased products with quantities and prices. They do not need a summary of the invoice text. Which extraction targets are the best fit?

Options:

  • A. Sentiment, keywords, language, and summary

  • B. Customer address, document title, and page count

  • C. Handwritten text, image labels, and captions

  • D. Vendor name, due date, total, and line items

Best answer: D

Explanation: For document and form extraction, choose targets that match the specific business fields the app must capture. In an invoice scenario, common extraction targets include names or identifiers, dates, totals, addresses, form fields, and line items. Because the finance team needs vendor identity, due date, amount owed, and purchased product details, the best target set includes vendor name, due date, total, and line items. A summary or general text analysis output would not reliably provide structured accounting fields.

  • Text analysis outputs fail because sentiment, keywords, and summaries do not capture structured invoice fields.
  • Wrong field set fails because customer address, title, and page count omit the required due date, total, and product details.
  • Vision-style labels fail because image captions and labels describe visual content rather than extracting invoice form fields.

Questions 26-50

Question 26

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer is creating a lightweight chat app that calls a deployed generative AI model from Microsoft Foundry. Users ask follow-up questions, and the app must keep the assistant focused on the original tutoring goal during a short session. The team does not need retrieval, multiple agents, or workflow automation. Which client behavior best meets the need?

Options:

  • A. Retrain the model after each user message.

  • B. Send only the latest user message to reduce prompt length.

  • C. Create a multi-step agent workflow for every follow-up.

  • D. Include the system prompt and relevant chat history in each request.

Best answer: D

Explanation: A lightweight chat app should manage short-session context in the client by keeping the system prompt and relevant conversation turns, then sending them with each request to the deployed model. This preserves the conversation goal without adding unnecessary enterprise architecture such as agent orchestration, retrieval pipelines, or model training. Generative model calls do not automatically remember earlier API requests unless the application provides that context. The key takeaway is to use simple conversation-history management before adding heavier solution components.

  • Retraining per turn fails because adapting model weights is not how a basic chat client maintains short conversational context.
  • Agent workflow overreach fails because the stem explicitly does not need tools, automation, or multi-step orchestration.
  • Latest message only fails because the model may lose the tutoring goal and prior references without supplied context.

Question 27

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is planning a lightweight application that will call a model deployed from Microsoft Foundry. Which requirement best indicates that the application should include vision capability?

Options:

  • A. Describe defects visible in uploaded product photos

  • B. Summarize customer support chat transcripts

  • C. Transcribe recorded calls into text

  • D. Generate a logo from a text prompt

Best answer: A

Explanation: Vision capability is needed when an application must interpret images or other visual input, such as identifying objects, describing scenes, reading visual details, or answering questions about an uploaded image. In a lightweight Microsoft Foundry app, this usually means choosing a deployed multimodal model or vision-capable workflow that can accept image input, not just text. Text summarization, speech transcription, and image generation are related AI workloads, but they do not require analyzing an existing image as input. The key signal is that the app must understand what is visually present.

  • Text-only input fails because chat transcripts can be processed with text capabilities.
  • Speech input fails because recorded calls require speech recognition, not image analysis.
  • Image output fails because generating a logo from text is image generation, not vision analysis of visual input.

Question 28

Topic: Identify AI Concepts and Capabilities

A junior developer is building a lightweight Python chat client with the Foundry SDK. The app must call a selected generative AI model by using a deployment name or endpoint, and the team does not need to train or fine-tune a custom model. In the Foundry portal, the model is visible in the model catalog, but no deployment has been created. What should the developer do next?

Options:

  • A. Write a more detailed system prompt

  • B. Fine-tune the model before calling it

  • C. Deploy the selected model in Foundry

  • D. Create a Content Understanding project

Best answer: C

Explanation: A model deployment makes a selected model available for application calls. Seeing a model in a catalog means it can be chosen, but a lightweight client still needs a deployed model target, such as a deployment name or endpoint, before it can send prompts through the SDK. The scenario does not require custom training or document extraction; it only needs an app-callable generative model. Prompt design happens after there is a model target to call, not as a replacement for deployment.

  • Prompt-only change fails because a system prompt controls behavior but does not create an endpoint or deployment target.
  • Content Understanding is for extracting information from content such as documents, forms, images, audio, or video, not for exposing a chat model.
  • Fine-tuning first adds unnecessary customization and still would not replace the need for a deployment target.

Question 29

Topic: Implement AI Solutions by Using Microsoft Foundry

A media team stores recorded customer-support calls and short training videos. They need a Microsoft Foundry solution that can identify key events, speakers, and relevant details from the audio/video files so the data can be indexed for review. Which capability is the best fit?

Options:

  • A. Use a generative model to summarize typed transcripts only

  • B. Use Azure Content Understanding in Foundry Tools

  • C. Use Azure Speech to synthesize spoken responses

  • D. Use image generation to create training visuals

Best answer: B

Explanation: Audio and video information extraction focuses on finding useful structure and facts inside media files, such as events, speakers, scenes, or relevant details. In Microsoft Foundry, Azure Content Understanding in Foundry Tools is the best fit when the input is audio or video and the goal is extraction for indexing or review. Speech synthesis goes the opposite direction by creating spoken audio from text. Text summarization can help after a transcript exists, but it does not directly extract information from the original audio/video files. Image generation creates new images rather than analyzing existing media.

  • Speech synthesis fails because it produces spoken output instead of extracting details from recorded media.
  • Transcript-only summarization misses the requirement to process audio and video files directly.
  • Image generation fails because it creates visuals rather than extracting information from existing audio/video content.

Question 30

Topic: Identify AI Concepts and Capabilities

A bank uses one automated loan-screening model for all applicants. The team requires the same scoring rule for every person, uses past approval data as input, and wants to understand why complaints of unfair outcomes may still be valid. Which explanation best fits this situation?

Options:

  • A. The training data may reflect biased past decisions.

  • B. The model must use separate rules for each group.

  • C. The model needs a larger compute deployment.

  • D. The issue is only transparency, not fairness.

Best answer: A

Explanation: Fairness in AI is not guaranteed by applying the same automated rule to everyone. If the model learns from biased historical approvals, incomplete inputs, or output labels that reflect unfair decisions, it can treat applicants identically in process while still producing unequal or unjust results. The problem is in what the system learned and what outcomes it optimizes, not necessarily in whether the same scoring code runs for each applicant.

The key takeaway is that fairness requires checking inputs, outputs, and impacts, not only confirming that the automation is uniform.

  • Separate rules is not the best explanation because different rules by group are not required to identify biased outcomes.
  • Transparency only fails because explainability may help investigate the model, but the complaint is about unfair impact.
  • Compute deployment is unrelated because more capacity does not correct biased data or biased labels.

Question 31

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer created a single-agent solution in Microsoft Foundry. The agent has new instructions, a connected tool, and a rule that it must not answer outside the company travel policy. The team plans to build a lightweight client application that depends on the agent. What should the developer do first?

Options:

  • A. Deploy a different generative AI model

  • B. Write the client application with the Foundry SDK

  • C. Test the agent in the Foundry portal

  • D. Remove the boundary rule from the instructions

Best answer: C

Explanation: An agent should be tested in the Foundry portal before an application relies on it, especially after changing instructions, adding tools, or defining boundaries. Portal testing lets the developer confirm that the agent follows its intended scope, uses connected tools appropriately, and refuses or redirects out-of-scope requests. After that behavior is validated, a lightweight client can call the agent with more confidence. Writing the client first can hide agent design problems inside application code.

  • SDK first is tempting because the app is the final goal, but the agent behavior should be validated before client integration.
  • Different model does not address whether the configured agent, tool, and boundary instructions work.
  • Remove boundaries weakens the requirement to keep answers within the travel policy.

Question 32

Topic: Implement AI Solutions by Using Microsoft Foundry

A company uses Azure Content Understanding in Foundry Tools to extract damage details from uploaded vehicle images. The extracted data will update claim records and may affect repair authorization and payment amounts. The team needs a review step that reduces incorrect operational and financial decisions. Which step is the best fit?

Options:

  • A. Store only the raw images without extracted fields

  • B. Require human validation before record updates

  • C. Automatically approve claims with high confidence scores

  • D. Use image generation to recreate unclear damage areas

Best answer: B

Explanation: When extracted image information can affect user records, operations, safety, or financial outcomes, the solution should include a human review or validation step before the data is used for important actions. In this scenario, the extracted damage details can change claim records and influence payment or repair authorization, so relying only on automated extraction is too risky. Confidence scores can help prioritize review, but they should not replace review when the business impact is significant. The key takeaway is to add human oversight where AI output drives consequential decisions.

  • Image generation is the wrong workload because it creates images rather than validating extracted evidence.
  • Automatic approval is risky because a high confidence score does not eliminate the need for oversight in financial decisions.
  • Raw-image storage only avoids using extraction results but does not provide the needed validation workflow for claim processing.

Question 33

Topic: Implement AI Solutions by Using Microsoft Foundry

A team uses a deployed multimodal model in Microsoft Foundry to interpret images from field inspections. A result may update maintenance records and trigger an equipment shutdown. Which review response best fits this use of visual interpretation?

Options:

  • A. Require human review before updating records or triggering actions

  • B. Replace the vision model with a text-only model

  • C. Automatically accept high-confidence visual results as final

  • D. Store only the image and discard the interpretation

Best answer: A

Explanation: When a vision result can affect safety, user records, or operational actions, it should not be treated as automatically authoritative. A deployed multimodal model can help interpret images, but its output may be uncertain or context-dependent. The appropriate response is to include a review step so a qualified person confirms the interpretation before records are changed or actions such as shutdowns are triggered. This supports reliability, safety, and accountability while still using AI to assist the workflow. Confidence scores can help prioritize review, but they do not remove the need for oversight in high-impact decisions.

  • High confidence is not final because safety and operational decisions still need oversight when model errors could cause harm.
  • Discarding interpretation avoids risk but fails to use the model result as decision support.
  • Text-only replacement does not address the need to interpret inspection images.

Question 34

Topic: Implement AI Solutions by Using Microsoft Foundry

A team has already configured and tested a single-agent solution in Microsoft Foundry. A junior developer must build a lightweight web client that accepts user questions, sends them to the configured agent, and displays the response without changing the agent’s instructions or tools. Which implementation approach is the best fit?

Options:

  • A. Create a new model deployment for each user session before sending messages.

  • B. Use Azure Content Understanding to extract answers from the user messages.

  • C. Use the Foundry SDK to send user messages to the configured agent and display its replies.

  • D. Move the agent’s system instructions into each user prompt from the web form.

Best answer: C

Explanation: A lightweight agent client app does not usually define the agent’s behavior, tools, or model deployment at runtime. Those are configured in Microsoft Foundry. The client is responsible for connecting to the configured agent, sending the user’s request as a message, receiving the agent’s response, and presenting it in the application UI. It may also handle basic application concerns such as authentication, conversation state, and errors, but it should not replace the configured agent setup. Creating deployments or moving system instructions into user prompts changes the solution design rather than acting as a lightweight client.

  • New deployments are unnecessary because the agent is already configured and tested.
  • Prompt relocation weakens the separation between system instructions and user input.
  • Content extraction targets information extraction workloads, not sending requests to an existing agent.

Question 35

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is using Azure Content Understanding in Foundry Tools to process supplier invoices. The finance app must capture each purchased product row, including the description, quantity, unit price, and row subtotal. Which extraction target best matches this need?

Options:

  • A. Line items

  • B. Addresses

  • C. Dates

  • D. Totals

Best answer: A

Explanation: Document and form extraction targets should match the shape of the information needed from the document. Invoice rows that repeat across a document, such as product description, quantity, unit price, and row subtotal, are line items. A total, date, or address is usually a single field or small set of fields, while line items represent repeated structured records that must stay associated with each row.

  • Totals fails because it captures summary amounts, not each individual purchased product row.
  • Dates fails because invoice dates or due dates are single values, not repeated purchase rows.
  • Addresses fails because supplier or billing addresses identify parties or locations, not itemized charges.

Question 36

Topic: Implement AI Solutions by Using Microsoft Foundry

Which need is best mapped to a single-agent solution set up in the Foundry portal?

Options:

  • A. One agent answers HR policy questions using approved instructions and tools

  • B. A monitoring architecture audits all agents across multiple business units

  • C. A governance program defines enterprise-wide agent approval workflows

  • D. Several specialized agents negotiate and delegate tasks to each other

Best answer: A

Explanation: At Azure AI Fundamentals depth, a single-agent solution in the Foundry portal is a focused setup where one agent is configured with instructions, a model, and optional tools or knowledge to help users complete a task. The key signal is that one agent owns the interaction and does not require agent-to-agent coordination. Scenarios involving multiple specialized agents, enterprise approval processes, or cross-organization monitoring move beyond basic single-agent setup into orchestration or governance concerns. The task here is to identify the implementation choice that stays centered on one configured agent.

  • Multi-agent delegation fails because it describes coordination among several agents, not one agent handling a focused task.
  • Enterprise approval workflows fail because they describe governance processes rather than setting up an individual agent.
  • Cross-unit auditing fails because it describes organization-wide oversight, not a basic Foundry portal single-agent configuration.

Question 37

Topic: Identify AI Concepts and Capabilities

A bank is piloting an AI pre-screening assistant for loan applications. The model meets the required overall accuracy target, does not expose applicant data, and shows users a short explanation of the factors used. Testing shows that equally qualified applicants from one age group receive substantially fewer approvals than others. Which responsible AI concern is the BEST fit?

Options:

  • A. Reliability and safety issue

  • B. Privacy and security issue

  • C. Transparency issue

  • D. Fairness risk

Best answer: D

Explanation: Fairness in AI focuses on whether an AI system treats people and groups equitably, especially when decisions affect opportunities such as loans, hiring, or access to services. In this scenario, the system has acceptable overall accuracy, protects applicant data, and provides explanations, but one age group receives worse outcomes despite similar qualifications. That pattern points to potential bias or disparate impact, which is a fairness concern. Overall performance can look acceptable while still hiding unfair outcomes for a subgroup. The key distinction is that the problem is not missing disclosure, data exposure, or general model failure; it is unequal impact across people who should be treated comparably.

  • Reliability and safety is tempting because the model is making important decisions, but the stem says overall accuracy meets the target.
  • Privacy and security does not fit because the scenario states that applicant data is not exposed.
  • Transparency does not fit because users are already shown a short explanation of the factors used.

Question 38

Topic: Identify AI Concepts and Capabilities

A city services team wants to analyze resident feedback messages. The solution must find references to named parks, departments, dates, phone numbers, and monetary amounts so staff can route and summarize issues. Which text analysis capability is the best fit?

Options:

  • A. Language detection

  • B. Sentiment analysis

  • C. Key phrase extraction

  • D. Entity detection

Best answer: D

Explanation: Entity detection is the text analysis capability used to identify and label specific references in unstructured text, such as locations, organizations, dates, phone numbers, and quantities. In this scenario, the team needs to locate structured references inside feedback messages so the information can support routing and summarization. That requirement is different from determining emotional tone, extracting general topics, or identifying the language of the text. The key signal is the need to find named or typed references rather than classify the overall message.

  • Sentiment analysis fails because it assesses positive, negative, or neutral tone rather than locating named references.
  • Key phrase extraction fails because it returns important phrases or topics, not typed entities such as dates or money amounts.
  • Language detection fails because it identifies the text language, not parks, departments, dates, or quantities.

Question 39

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is building an app that must process recorded customer-support calls and meeting videos. The app needs structured fields such as speaker names, key discussion points, action items, and timestamps that can be stored in a database. Which Microsoft Foundry capability best matches this need?

Options:

  • A. Text sentiment analysis

  • B. Azure Content Understanding in Foundry Tools

  • C. Image generation model deployment

  • D. Azure Speech in Foundry Tools

Best answer: B

Explanation: Azure Content Understanding in Foundry Tools is the best fit when an application needs to extract structured information from unstructured content, including audio and video. In this scenario, the app is not only converting speech to text; it needs usable fields such as speakers, action items, topics, and timestamps from recorded media. That is an information-extraction workload for audio and video sources. Azure Speech is more appropriate for speech recognition or synthesis tasks, while sentiment analysis focuses on classifying text tone after text is available.

  • Speech-only mapping misses the structured extraction requirement across recorded media.
  • Image generation creates new images rather than extracting information from audio or video.
  • Sentiment analysis classifies text sentiment but does not extract meeting fields and timestamps from media.

Question 40

Topic: Identify AI Concepts and Capabilities

A support team uses a deployed multimodal model to analyze customer photos and short videos of damaged deliveries. For one case, the model suggests that the package was damaged before delivery, but the video is shaky and the label is partly obscured. Which concept does this observation best map to?

Options:

  • A. The model deployment has failed

  • B. Review is needed because source evidence is ambiguous

  • C. A higher temperature setting will improve factual certainty

  • D. A text-only model should replace the multimodal model

Best answer: B

Explanation: Multimodal models can combine information from inputs such as images, video, audio, and text, but their output still depends on the quality and completeness of the source evidence. If a video is shaky, a label is obscured, or an image lacks key context, the model may infer a plausible answer without enough reliable evidence. In that situation, the appropriate mapping is not a different workload or a failed deployment; it is a need for review before relying on the result. The key takeaway is that multimodal capability does not remove the need to validate uncertain outputs against source evidence.

  • Text-only replacement fails because the task depends on visual evidence from photos and videos.
  • Deployment failure fails because the model produced an output; the issue is evidence quality, not availability.
  • Higher temperature fails because more variation in output does not make ambiguous evidence more factual.

Question 41

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is adding a feature in a Microsoft Foundry app that accepts uploaded product photos and identifies visible objects, dominant colors, and brief captions for each image. Which capability best fits this need?

Options:

  • A. Image generation

  • B. Vision capability

  • C. Text analysis

  • D. Speech processing

Best answer: B

Explanation: A vision capability is the best fit when the input is an existing image and the app needs to understand what is visible in it. In this scenario, the app analyzes product photos to identify objects, colors, and captions, which are visual understanding tasks. Text analysis works on written language, speech processing works on audio or spoken language, and image generation creates new images from prompts rather than analyzing uploaded photos. The key distinction is whether the workload interprets existing visual content or creates new content.

  • Text input mismatch fails because text analysis processes written content, not pixels in uploaded photos.
  • Audio input mismatch fails because speech processing handles spoken input or output, not image understanding.
  • Creation vs analysis fails because image generation creates new images instead of extracting meaning from existing ones.

Question 42

Topic: Identify AI Concepts and Capabilities

A team uses a deployed generative AI model to convert support tickets into one of five approved status messages. They want repeated runs on the same ticket to produce the most consistent response possible, and they do not need creative variation. Which configuration approach best matches this need?

Options:

  • A. Raise the maximum token limit

  • B. Use a multimodal input model

  • C. Increase temperature for more variation

  • D. Set a low temperature value

Best answer: D

Explanation: Temperature is a generation configuration parameter that controls how random or varied a generative AI model’s output can be. For a constrained task, such as selecting or wording one of a small set of approved status messages, a lower temperature is preferred because it makes the model favor more likely, repeatable responses. This supports consistency when the same or similar input is submitted multiple times. A higher temperature is useful when creativity or varied wording is desired, but that conflicts with this scenario. Token limits affect response length, not predictability.

  • More variation fails because increasing temperature makes outputs less consistent.
  • Token limit fails because it controls output length rather than randomness.
  • Multimodal input fails because adding image or audio capability does not address predictable text generation.

Question 43

Topic: Implement AI Solutions by Using Microsoft Foundry

A developer is building a lightweight Foundry SDK chat client for a deployed multimodal model. Users must ask questions by speaking into a microphone, the app must preserve the spoken prompt for the model, and the solution should avoid sending only a text box value. Which implementation step is the best fit?

Options:

  • A. Add a longer system prompt describing speech recognition

  • B. Send only synthesized speech as the model response

  • C. Capture microphone audio and pass it as supported audio input

  • D. Use image generation to create a transcript image

Best answer: C

Explanation: For spoken prompts with a deployed multimodal model, the client application must capture the user’s speech in a usable input form, such as a supported audio file or audio stream, and include that input in the model request. The key requirement is not just that the user speaks, but that the app preserves the spoken input in a format the model can accept. A system prompt can guide behavior after input is received, but it does not capture audio. Speech synthesis is for producing spoken output, not accepting a spoken prompt.

  • System prompt only fails because instructions cannot replace capturing the user’s actual audio input.
  • Transcript image fails because image generation is unrelated to receiving spoken prompts.
  • Synthesized response fails because text-to-speech output does not provide the user’s speech to the model.

Question 44

Topic: Implement AI Solutions by Using Microsoft Foundry

A team uses text analysis in a Microsoft Foundry prototype to flag negative customer comments. The phrase “This update is sick” is sometimes flagged as negative, but in this community it often means “excellent.” Which result-handling concept does this observation best illustrate?

Options:

  • A. Text-analysis output may need context

  • B. Document extraction requires form fields

  • C. Speech recognition requires a custom voice

  • D. Image generation needs a style prompt

Best answer: A

Explanation: Text analysis can identify sentiment, key phrases, or entities, but language is often ambiguous. Slang, sarcasm, product names, regional wording, and surrounding sentences can change the meaning of a word or phrase. In the stem, “sick” may look negative in isolation but can be positive in the customer community. Result handling should account for context before the application takes action, especially when the result could affect users or business decisions.

The key takeaway is that text-analysis output is useful evidence, not a guaranteed interpretation of meaning in every context.

  • Speech recognition is unrelated because the input is already text, not spoken audio.
  • Image generation does not apply because the task is interpreting customer comments, not creating images.
  • Document extraction is about pulling structured data from documents or forms, not resolving ambiguous sentiment.

Question 45

Topic: Implement AI Solutions by Using Microsoft Foundry

A team uses Azure Content Understanding in Foundry Tools to extract fields from uploaded claim forms. Which observation best maps to a result that should be routed for review before the application uses it automatically?

Options:

  • A. A typed invoice number matches the expected format.

  • B. A blurred handwritten diagnosis code includes a patient identifier.

  • C. A vendor name is extracted from a clear company letterhead.

  • D. A standard date field is read from a clean printed form.

Best answer: B

Explanation: Extraction validation is especially important when the source content reduces reliability or raises privacy risk. Blurred handwriting can make a field ambiguous, so the extracted value may be wrong even if the system returns a value. A patient identifier also makes the content sensitive, so the application should avoid using or storing the result automatically until it is reviewed according to the organization’s validation and privacy process. Clear printed fields, expected formats, and standard templates are lower-risk signals and usually do not, by themselves, require special review.

  • Expected format does not signal a problem when the source is readable and the value matches validation rules.
  • Clear letterhead supports confidence in the extracted organization name rather than indicating ambiguity.
  • Clean printed form is a normal extraction case and does not add a quality or sensitivity concern by itself.

Question 46

Topic: Implement AI Solutions by Using Microsoft Foundry

A team uses Azure Content Understanding in Foundry Tools to extract fields from supplier invoices. The app detects that an extracted TotalAmount has low confidence and must be checked by an accounts-payable employee before it is sent to the accounting system. Which workflow action best matches this need?

Options:

  • A. Display the field only in a dashboard

  • B. Validate the field with no human review

  • C. Route the field for downstream review

  • D. Store the extracted field as final data

Best answer: C

Explanation: In an extraction app workflow, fields can be validated, stored, displayed, or routed depending on what must happen next. A low-confidence or business-critical value that needs human confirmation before being used by another system should be routed for downstream review. This allows a reviewer to confirm or correct the extracted field before it becomes operational data. Storing is appropriate after data is accepted, displaying helps users inspect results, and automated validation checks rules or formats but does not replace a required human review step.

  • Store as final fails because the stem says the value must be checked before it is sent onward.
  • Dashboard only fails because viewing the field does not create a review-and-approval workflow.
  • Automated validation only fails because the requirement includes employee review, not just a rule check.

Question 47

Topic: Implement AI Solutions by Using Microsoft Foundry

A junior developer has tested a deployed generative AI model in the Foundry portal. The team now needs an internal web page where employees can enter prompts, send them to the same deployment at runtime, and display responses without opening the portal. Which approach is the best fit?

Options:

  • A. Continue using the Foundry portal test pane for employee prompts.

  • B. Build a lightweight app that calls the deployment through the Foundry SDK.

  • C. Use Azure Content Understanding to extract fields from prompts.

  • D. Create a new model deployment for each employee prompt.

Best answer: B

Explanation: The key distinction is where the interaction happens. The Foundry portal test experience is useful for trying prompts, checking a deployment, and exploring model behavior during development. It is not the right interface when users need an application experience. When an app must collect user input, send requests at runtime, and render model responses, the developer should use the Foundry SDK or supported client APIs to call the existing deployed model from application code. The deployment remains the model endpoint; the lightweight app becomes the user-facing client. Creating deployments per prompt or using an extraction tool changes the workload instead of implementing the required app interaction.

  • Portal testing is plausible for development, but it does not provide the required employee-facing web page.
  • New deployments are not needed for each prompt; prompts are requests sent to an existing deployment.
  • Content Understanding targets information extraction from documents, forms, images, audio, or video, not chat-style runtime prompts.

Question 48

Topic: Identify AI Concepts and Capabilities

A clinic plans to use an AI feature in Microsoft Foundry to rank incoming patient messages by urgency. The clinic needs to use the AI output to support triage, explain triage decisions when challenged, and ensure accountability if a patient is harmed by an incorrect recommendation. Which approach is the best fit?

Options:

  • A. Rely on model confidence scores as the only oversight mechanism

  • B. Assign clinical owners to review and govern AI-supported triage decisions

  • C. State that the deployed model is responsible for triage outcomes

  • D. Require patients to accept that AI recommendations are final

Best answer: B

Explanation: Accountability is a responsible AI principle: people and organizations must remain responsible for decisions that use AI support. In this scenario, the clinic can use AI to help prioritize messages, but it should define human ownership, review processes, documentation, and escalation paths for triage decisions. A model can generate recommendations or rankings, but it cannot accept legal, ethical, or operational responsibility for harm. The key takeaway is that AI can support decision-making, but responsibility stays with the organization deploying and using it.

  • Model responsibility fails because a model is a tool, not an accountable decision-maker.
  • Confidence-only oversight fails because scores do not replace human governance or review.
  • Final AI recommendations fails because it removes meaningful human oversight for a patient-impacting process.

Question 49

Topic: Identify AI Concepts and Capabilities

A finance team has scanned images of supplier invoices. They need an AI capability that returns structured fields such as supplier name, invoice date, line items, and total amount for use in an accounting app. Which workload best matches this need?

Options:

  • A. Image generation

  • B. Image classification

  • C. Object detection

  • D. Information extraction from documents and forms

Best answer: D

Explanation: This is an information extraction task because the required output is structured data from a source document, not a general description of what is visible. At Azure AI Fundamentals depth, visual inputs such as scanned forms, receipts, or invoices can still be handled as information extraction when the business need is to populate fields like dates, names, totals, or line items. Computer vision interpretation may identify visual content, but the deciding factor here is the desired structured output for another system. The key takeaway is to classify the workload by the goal, not only by the input modality.

  • Image classification fails because assigning one label to the whole invoice would not return invoice fields.
  • Object detection fails because locating visual objects does not extract supplier names, dates, or totals.
  • Image generation fails because creating new images is unrelated to reading structured data from existing invoices.

Question 50

Topic: Identify AI Concepts and Capabilities

A team is adding image generation to a youth education app. They are concerned that prompts could produce violent, sexual, deceptive, or otherwise harmful images that users should not see. Which responsible AI principle is most directly involved?

Options:

  • A. Reliability and safety

  • B. Inclusiveness

  • C. Fairness

  • D. Transparency

Best answer: A

Explanation: For image generation, a key responsible AI concern is whether generated content could be unsafe, inappropriate, misleading, or harmful. This maps most directly to reliability and safety, because the system should reduce the risk of harmful outputs and behave in ways that are safe for the intended users and context. In practice, this can involve content filtering, prompt restrictions, user warnings, review processes, or escalation paths.

Transparency is still useful when explaining that images are AI-generated, but the main concern in the stem is preventing harmful content from reaching users.

  • Inclusiveness is about making solutions usable by people with diverse needs, not primarily filtering harmful generated images.
  • Transparency helps users understand AI-generated content, but it does not directly describe the harm-prevention concern.
  • Fairness focuses on avoiding unjust bias or unequal treatment, not the general safety of generated images.

Continue with full practice

Use the AI-901 Practice Test page for the full IT Mastery practice bank, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try AI-901 on Web View AI-901 Practice Test

Focused topic pages

Free review resource

Read the AI-901 Cheat Sheet for compact concept review before returning to timed practice.

Revised on Monday, May 25, 2026