Try 50 free AI-901 questions across the exam domains, with explanations, then continue with full IT Mastery practice.
This free full-length AI-901 practice exam includes 50 original IT Mastery questions across the exam domains.
These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.
Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
| Domain | Weight |
|---|---|
| Identify AI Concepts and Capabilities | 43% |
| Implement AI Solutions by Using Microsoft Foundry | 57% |
Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and full practice.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer configured a single-agent solution in Microsoft Foundry with specific instructions and Foundry Tools. A lightweight application must invoke that configured agent behavior instead of sending raw prompts directly to a deployed model. Which implementation choice best maps to this need?
Options:
A. Use a text analysis client
B. Use an image generation client
C. Use a chat model client
D. Use an agent client
Best answer: D
Explanation: In Microsoft Foundry, an agent client is the appropriate choice when an application needs to call a configured agent. The configured agent can include instructions, tool connections, and behavior that are managed as part of the agent setup. A chat model client is better suited for sending messages directly to a deployed chat-capable model, where the application is responsible for supplying the needed prompt context and behavior. The key distinction is whether the application is using an agent configuration or directly invoking a model endpoint.
Topic: Identify AI Concepts and Capabilities
A finance team receives scanned supplier invoices and PDF purchase forms. They need to capture vendor names, invoice numbers, totals, and line-item relationships so the data can be reviewed and imported into an accounting system. Which AI capability is the best fit?
Options:
A. Image generation from prompts
B. Document and form information extraction
C. Speech recognition and transcription
D. General text sentiment analysis
Best answer: B
Explanation: Document and form information extraction is the best fit when the goal is to turn content in forms, invoices, receipts, or similar documents into structured data. The key clues are the need to capture named fields, their values, and relationships such as line items connected to totals or invoice identifiers. In Microsoft Foundry scenarios, Azure Content Understanding in Foundry Tools can support extraction from documents and forms. This is different from simply analyzing whether text is positive or negative, generating new images, or converting speech to text.
Topic: Identify AI Concepts and Capabilities
A team wants users to ask questions about a product defect by typing a description and attaching a photo in the same prompt. Which model capability is the best fit for this requirement?
Options:
A. A multimodal model
B. A speech recognition model
C. A text-only generative model
D. An image generation model
Best answer: A
Explanation: A multimodal model is needed when a prompt includes multiple input types, such as text plus an image, text plus audio, or another combination of modalities. In this scenario, the user is not only typing a description; they are also attaching a photo that the model must interpret to answer the question. A text-only model would not be sufficient because it cannot use the visual information in the photo as part of the prompt. The key mapping is: more than one input type in the prompt means multimodal capability is required.
Topic: Identify AI Concepts and Capabilities
A customer support team receives short ticket messages and wants to automatically identify specific items mentioned in each message, such as product names, order numbers, and locations. Which text-analysis technique best matches this need?
Options:
A. Named entity recognition
B. Language detection
C. Key phrase extraction
D. Sentiment analysis
Best answer: A
Explanation: Named entity recognition is the best text-analysis technique when the goal is to find and classify specific entities in text. In this scenario, the support team is not asking whether the customer is happy or unhappy, or for a general summary of the ticket. They need structured details such as product names, order numbers, and locations so the ticket can be routed or processed. Entity recognition turns those mentions into usable categories for downstream triage. The key distinction is that entities are specific values, while key phrases are broader important phrases.
Topic: Identify AI Concepts and Capabilities
A clinic is building a Microsoft Foundry chat assistant to summarize patient messages. Developers need enough diagnostic information to troubleshoot prompt failures, but the clinic must prevent patient names, IDs, and medical details from being exposed in logs or accessed by unauthorized users. Which responsible AI concern is the best fit?
Options:
A. Inclusiveness for diverse abilities
B. Transparency about AI limitations
C. Fairness across user groups
D. Privacy and security of AI data
Best answer: D
Explanation: Privacy and security focus on protecting data used by an AI system, including prompts, outputs, logs, and stored artifacts. In this scenario, the key issue is that patient identifiers and medical details could be captured for troubleshooting and viewed by people who should not have access. Appropriate controls would include minimizing sensitive data in logs, redacting identifiers, restricting access, and applying secure data-handling practices. Fairness, inclusiveness, and transparency are also responsible AI principles, but they address different concerns than sensitive data exposure.
Topic: Identify AI Concepts and Capabilities
A bank wants to use a generative AI model to draft customer hardship-case recommendations for service agents. The output may affect payment plans, must be recorded in the case file, and must follow bank policy. Which oversight response is the best fit?
Options:
A. Use a larger model to reduce review needs
B. Store only the final AI recommendation
C. Require agent review and approval before case updates
D. Let the model update cases when confidence is high
Best answer: C
Explanation: Accountability and human oversight are essential when AI output influences meaningful decisions, customer outcomes, operations, or records. In this scenario, the model is helping draft recommendations, but those recommendations can affect payment plans and become part of an official case file. The best response is to keep a human service agent responsible for reviewing, validating, and approving the recommendation before it changes the customer’s case or record. This does not prevent AI assistance; it ensures that AI remains advisory where policy, judgment, and customer impact matter. A higher-confidence or larger model can still make errors, and recordkeeping should preserve enough context for accountability.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team is choosing a model deployment in the Foundry portal for a support app. Users will submit a photo of a damaged product and type a question. The app must return a written troubleshooting response. Which deployment best matches the required input and output modalities?
Options:
A. A multimodal model with image and text input, text output
B. An image-generation model with text input and image output
C. A speech model with audio input and text output
D. A text-only chat model with text input and text output
Best answer: A
Explanation: Model deployment selection in Foundry should match the solution’s required modalities: what the app sends to the model and what it expects back. In this case, the app sends two inputs: an image of the damaged product and a typed text question. It expects a text answer. That points to a deployed multimodal model that supports image plus text input and text output. A text-only deployment cannot inspect the image, and image generation is for creating images rather than answering from an uploaded image. The key is to map input and output types before choosing the deployment.
Topic: Identify AI Concepts and Capabilities
A customer support team wants an AI feature that reads a short case summary and drafts a polite reply for an agent to review. The reply must use natural language, adapt to the customer’s issue, and not simply label or extract facts from the text. Which model capability is the best fit?
Options:
A. Keyword extraction for case topics
B. Entity extraction for names and dates
C. A generative AI model for text generation
D. Sentiment analysis for customer tone
Best answer: C
Explanation: A text generation task requires a model that can produce new language based on a prompt or input text. In this scenario, the system is not only identifying what is already present in the case summary; it must draft a complete response that an agent can review. Entity extraction, sentiment analysis, and keyword extraction are text analysis capabilities that identify existing information or labels in text. They can support a workflow, but they do not create the final reply.
Topic: Identify AI Concepts and Capabilities
A team uses a deployed generative AI model in Microsoft Foundry for an internal help chat. Test users report that the same question often produces very different, overly broad answers and sometimes drifts into unrelated advice. The team wants more consistent, focused responses without changing the application code. Which configuration change is the best fit?
Options:
A. Use Azure Speech in Foundry Tools
B. Lower the model temperature
C. Switch to an image-generation model
D. Increase the maximum response length
Best answer: B
Explanation: Temperature is a generation configuration parameter that affects how random or creative model responses are. A higher temperature can make answers more varied and exploratory, which may be useful for brainstorming but can also cause broad or off-task output. In this scenario, the team wants more consistent and focused chat responses without changing code, so reducing temperature is the most direct configuration adjustment. Response length controls how much text can be produced, not whether the answer stays focused. Changing workload tools or model types does not address the stated text-generation behavior.
Topic: Identify AI Concepts and Capabilities
A marketing team wants an AI feature that can draft a new product description from a short prompt. The description should vary naturally by audience and tone, and it should not be limited to prewritten phrases selected from a rules table. Which approach best fits this need?
Options:
A. Use an object detection model to identify product images
B. Use a generative AI model to create text from the prompt
C. Use a fixed template library with keyword substitution
D. Use a sentiment analysis model to score the prompt
Best answer: B
Explanation: Generative AI creates new content, such as text, images, or audio, in response to a prompt. It does this by using patterns learned from training data, not by simply selecting from fixed templates. In this scenario, the team needs flexible product descriptions that change based on audience and tone, so a generative text model is the best fit. Template substitution can be useful for predictable wording, but it does not provide the same open-ended content generation.
Topic: Identify AI Concepts and Capabilities
A marketing team wants to create a new visual asset for a product launch. The app must generate an original banner image from a text description, allow style guidance such as “minimalist,” and should not rely on analyzing an existing image. Which model capability is the best fit?
Options:
A. Speech recognition with transcription
B. Image generation from a text prompt
C. Content Understanding for forms
D. Computer vision image classification
Best answer: B
Explanation: The key requirement is to create a new image, not classify, extract, or transcribe existing content. An image generation capability uses a text prompt to produce an original visual asset and can often follow creative instructions such as style, subject, composition, and tone. Computer vision capabilities are used to analyze existing images, while Content Understanding is used to extract information from content such as documents, forms, images, audio, or video. Speech recognition converts spoken audio to text. The best fit is the capability that produces a new visual output from a prompt.
Topic: Identify AI Concepts and Capabilities
A user asks a generative AI model for a policy summary. The response is fluent and certain in tone, but it includes a policy rule that is not present in the source documents and omits an important exception. Which generative model behavior best explains this observation?
Options:
A. The model applied deterministic business rules
B. The model generated a plausible but ungrounded response
C. The model verified the answer against all source documents
D. The model converted speech input to text
Best answer: B
Explanation: Generative AI models create responses by predicting likely content based on patterns in their training data and the prompt context. A response can therefore sound natural and confident without being fully grounded in the provided sources. This behavior is often described as a hallucination or an ungrounded response, especially when the model invents details or omits key facts. Confidence in wording is not the same as evidence, validation, or completeness.
The key takeaway is that fluent generated text should be checked against trusted sources when accuracy matters.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is testing a lightweight chat client that sends customer support notes to a deployed generative AI model in Microsoft Foundry for summarization. The summary only needs issue type, product name, and requested action. The notes may include customer names, phone numbers, and account IDs. What should the developer do before sending the prompt to the model?
Options:
A. Use a larger generative model without changing the prompt
B. Increase the model temperature for more varied summaries
C. Move all customer details into the system prompt
D. Remove personal identifiers not needed for the summary
Best answer: D
Explanation: Prompts should include only the information needed for the model to complete the task. In this scenario, the model needs the issue type, product name, and requested action, but customer names, phone numbers, and account IDs are not required for that summary. Removing or masking those identifiers before sending the prompt reduces unnecessary exposure of personal data and aligns with privacy and security expectations. Changing model settings or model size does not address the data-minimization problem.
Topic: Identify AI Concepts and Capabilities
A support team wants an AI solution that can understand a customer’s request, decide whether it needs an account lookup, call an approved internal tool when needed, and then summarize the result for the customer. The team also wants the behavior constrained by a configured instruction set. Which workload is the best fit?
Options:
A. An image generation workload
B. A text sentiment analysis workload
C. A single-agent solution
D. A basic chatbot
Best answer: C
Explanation: An agentic workload is identified by configured behavior and task-oriented action. In this scenario, the solution must decide when an account lookup is needed, use an approved internal tool, and summarize the outcome. Those are agent-like behaviors because the system is not only generating a conversational response; it is following instructions, making a task decision, and invoking a tool to complete part of the task. A basic chatbot label usually implies responding to user messages without independent tool use or configured task steps. The key signal is action toward a goal, especially with tools or defined behavior constraints.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team is testing a lightweight chat client that uses a deployed generative AI model in Microsoft Foundry. The assistant should answer only questions about the company’s benefits policy, but it sometimes responds to unrelated travel and entertainment questions. The team must reduce off-task responses without changing the user’s immediate request. Which change is best?
Options:
A. Switch to image input so the request includes more context.
B. Increase the model temperature to make responses more varied.
C. Rewrite each user prompt to mention the benefits-policy scope.
D. Add a system prompt that defines the benefits-policy scope and redirects off-scope requests.
Best answer: D
Explanation: The system prompt is the right place to define the assistant’s role, boundaries, and expected behavior across turns. In this scenario, the user’s immediate request should not be changed, so the best fix is to update the system prompt with instructions such as answering only benefits-policy questions and politely redirecting unrelated requests. This reduces off-task behavior while preserving the original user prompt. Changing model creativity or modality does not directly establish the assistant’s allowed topic scope.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is building a support assistant in Microsoft Foundry. Users will upload a photo of damaged equipment and ask questions such as “What seems broken?” and “Is there an obvious safety concern?” The app needs a natural-language interpretation of the whole image, not exact text extraction or bounding boxes. Which capability is the best fit?
Options:
A. Use a deployed multimodal model with an image and prompt
B. Use object detection to return labeled bounding boxes
C. Use image generation to create a replacement equipment photo
D. Use OCR to extract all visible text from the image
Best answer: A
Explanation: Prompt-based visual interpretation uses a multimodal model that can process an image along with a user prompt and produce a natural-language response. In this scenario, the assistant must reason over the overall photo and answer user questions about visible damage and safety concerns. OCR is best when the goal is to extract printed or handwritten text. Object detection is best when the goal is to identify object categories and locations, often with bounding boxes. The key distinction is that the user wants descriptive interpretation, not structured text extraction or object localization.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is preparing a prototype in Microsoft Foundry. The app must answer typed support questions, reason over uploaded product photos, and be tested in the portal before any SDK code is written. Which deployment decision is the best fit?
Options:
A. Skip portal testing and connect the SDK to the first listed model.
B. Deploy a ready multimodal generative model and test sample prompts in the portal.
C. Use Azure Content Understanding to extract form fields from the photos.
D. Deploy a text-only chat model and handle photos later in the client.
Best answer: B
Explanation: Model deployment in the Foundry portal should start with the workload requirements: the model must be ready to deploy, support the required input modalities, and be testable before application code depends on it. In this scenario, the prototype needs both text interaction and image reasoning, so a deployed multimodal generative model is the best fit. Testing sample text and image prompts in the portal helps confirm behavior before building a lightweight SDK client. A text-only model misses the photo requirement, while Content Understanding is better for extracting structured information from documents, forms, images, audio, or video rather than general visual chat.
Topic: Implement AI Solutions by Using Microsoft Foundry
A media company has recordings of product demonstrations. The team needs to extract spoken product names, visible actions in the demo, and the times when key events occur. Which Microsoft Foundry approach is the best fit?
Options:
A. Use image generation from text prompts
B. Use sentiment analysis on the transcript
C. Use Azure Content Understanding for video extraction
D. Use Azure Speech for transcription only
Best answer: C
Explanation: Azure Content Understanding in Foundry Tools is the best fit when the input is media and the required output includes more than a transcript. It can support extraction of information from video or audio, including spoken facts, visual facts, events, and time-based observations. In this scenario, the team needs both what was said and what was visible, plus when key events occurred. Azure Speech is useful for speech-to-text, but it does not address visible actions in the video. The key distinction is media information extraction versus a single-purpose speech or text-analysis workload.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is testing a lightweight chat client that calls a deployed generative AI model in Microsoft Foundry. The model answers accurately, but the responses are too general because users type vague requests such as “Explain this policy.” Which change best maps to prompt revision rather than model training or deployment configuration?
Options:
A. Switch the deployment to a larger model family
B. Redeploy the model with a different temperature setting
C. Add audience, format, and task details to the user prompt
D. Train a new model on company policy examples
Best answer: C
Explanation: Prompt revision changes the instructions supplied to the model for a specific interaction. In this case, the model is already producing accurate answers, but the user request lacks context. Adding details such as the intended audience, desired format, scope, and constraints helps the deployed model generate a more useful response without changing model weights or deployment settings. Training or fine-tuning is about changing model behavior through data, and deployment configuration changes adjust how a selected model is hosted or sampled. The key distinction is that prompt revision is the fastest run-time change when the issue is unclear or incomplete instructions.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team uses Azure Content Understanding in Foundry Tools to extract customer names and totals from uploaded invoices. Reviewers say they need to know when a field value was produced by AI processing rather than entered by a person. Which response best supports this need?
Options:
A. Block all low-confidence extracted fields
B. Balance invoice samples across vendors
C. Label extracted fields as AI-processed
D. Encrypt uploaded invoices at rest
Best answer: C
Explanation: Transparency means users should be able to understand when AI is being used and how AI affects the information they see. In this scenario, reviewers specifically need to know that extracted invoice fields came from AI processing. A clear label or disclosure on AI-extracted fields addresses that need without changing the extraction workflow. Privacy protections, balanced data, and confidence handling may also be important, but they map to different concerns. The key takeaway is that disclosure and understandable communication are transparency responses.
Topic: Identify AI Concepts and Capabilities
A city department plans to use a generative AI assistant in Microsoft Foundry to draft plain-language summaries of public comments. Stakeholders need to know that the summaries are AI-generated, what source material they are based on, and that staff must review them before publication. Which response best supports transparency?
Options:
A. Increase the model temperature for more varied summaries
B. Replace staff review with automated content filtering
C. Publish a user-facing AI disclosure and usage guidance
D. Collect additional personal data to improve context
Best answer: C
Explanation: Transparency means people affected by or using an AI system should understand when AI is involved, what the system is intended to do, and its important limitations. In this scenario, stakeholders specifically need disclosure that summaries are AI-generated, context about the source material, and guidance that the output is draft content requiring staff review before publication. A user-facing disclosure with usage guidance directly addresses those needs without changing the model or expanding data collection. Transparency does not mean making the model more creative or replacing accountability with automation. The key takeaway is to communicate AI involvement and appropriate reliance on its outputs.
Topic: Implement AI Solutions by Using Microsoft Foundry
A city maintenance team uses a deployed multimodal model in Microsoft Foundry to interpret uploaded street images. The result may update a public hazard record and dispatch a repair crew. The team needs a response process that supports quick triage but avoids unsafe automated actions when the image interpretation is uncertain. Which approach is the best fit?
Options:
A. Lower the confidence threshold to reduce manual reviews
B. Automatically dispatch crews for every detected hazard
C. Send uncertain hazard results to human review before updating records
D. Use image generation to recreate unclear street scenes
Best answer: C
Explanation: When vision results can affect safety, user records, or operational actions, the response process should include a review step for uncertain or high-impact interpretations. A deployed multimodal model can help triage images, but its output should not be the only authority for actions such as updating official hazard records or dispatching crews when confidence is low or the situation is ambiguous. Human review adds accountability and reduces the risk of acting on a mistaken visual interpretation.
The key takeaway is to use AI output to assist decisions, not to bypass review for safety-impacting actions.
Topic: Identify AI Concepts and Capabilities
A junior developer is building a lightweight chat client by using Microsoft Foundry. The team has already chosen a generative AI model from the model catalog and now needs to test it in the Foundry portal and call it from the application by using an endpoint. What should the developer do next?
Options:
A. Create an image generation resource
B. Rewrite the user prompt
C. Select a different base model
D. Deploy the selected model
Best answer: D
Explanation: Selecting a model and deploying a model are separate steps in Microsoft Foundry. Selecting a model identifies the model capability you want to use, such as a chat-capable generative AI model. Deploying that selected model creates a usable deployment that can be tested in the Foundry portal and consumed by a lightweight application through the required connection details, such as an endpoint. In this scenario, the model choice is already complete, so the next need is availability for testing and app calls. Changing prompts may improve behavior later, but it does not make the model consumable by the client application.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is creating a lightweight chat client with the Foundry SDK. Before the client can send a user prompt and receive a response, which implementation detail must the app have?
Options:
A. A custom model training pipeline
B. A Content Understanding analyzer schema
C. A deployed model reference and connection information
D. An image-generation safety filter configuration
Best answer: C
Explanation: A Foundry SDK chat client sends prompts to a specific deployed model. The application needs a reference to that deployment, such as its deployment name or model identifier, plus connection information such as the project, endpoint, and credentials required by the SDK. Without these details, the client has no target model and no authorized path for the request.
Training pipelines, analyzer schemas, and image-generation settings can be useful for other AI workloads, but they are not the basic requirement for sending chat prompts to a deployed model.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is configuring Azure Content Understanding in Foundry Tools for supplier invoices. The finance team needs the vendor identity, payment due date, amount owed, and the purchased products with quantities and prices. They do not need a summary of the invoice text. Which extraction targets are the best fit?
Options:
A. Sentiment, keywords, language, and summary
B. Customer address, document title, and page count
C. Handwritten text, image labels, and captions
D. Vendor name, due date, total, and line items
Best answer: D
Explanation: For document and form extraction, choose targets that match the specific business fields the app must capture. In an invoice scenario, common extraction targets include names or identifiers, dates, totals, addresses, form fields, and line items. Because the finance team needs vendor identity, due date, amount owed, and purchased product details, the best target set includes vendor name, due date, total, and line items. A summary or general text analysis output would not reliably provide structured accounting fields.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer is creating a lightweight chat app that calls a deployed generative AI model from Microsoft Foundry. Users ask follow-up questions, and the app must keep the assistant focused on the original tutoring goal during a short session. The team does not need retrieval, multiple agents, or workflow automation. Which client behavior best meets the need?
Options:
A. Retrain the model after each user message.
B. Send only the latest user message to reduce prompt length.
C. Create a multi-step agent workflow for every follow-up.
D. Include the system prompt and relevant chat history in each request.
Best answer: D
Explanation: A lightweight chat app should manage short-session context in the client by keeping the system prompt and relevant conversation turns, then sending them with each request to the deployed model. This preserves the conversation goal without adding unnecessary enterprise architecture such as agent orchestration, retrieval pipelines, or model training. Generative model calls do not automatically remember earlier API requests unless the application provides that context. The key takeaway is to use simple conversation-history management before adding heavier solution components.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is planning a lightweight application that will call a model deployed from Microsoft Foundry. Which requirement best indicates that the application should include vision capability?
Options:
A. Describe defects visible in uploaded product photos
B. Summarize customer support chat transcripts
C. Transcribe recorded calls into text
D. Generate a logo from a text prompt
Best answer: A
Explanation: Vision capability is needed when an application must interpret images or other visual input, such as identifying objects, describing scenes, reading visual details, or answering questions about an uploaded image. In a lightweight Microsoft Foundry app, this usually means choosing a deployed multimodal model or vision-capable workflow that can accept image input, not just text. Text summarization, speech transcription, and image generation are related AI workloads, but they do not require analyzing an existing image as input. The key signal is that the app must understand what is visually present.
Topic: Identify AI Concepts and Capabilities
A junior developer is building a lightweight Python chat client with the Foundry SDK. The app must call a selected generative AI model by using a deployment name or endpoint, and the team does not need to train or fine-tune a custom model. In the Foundry portal, the model is visible in the model catalog, but no deployment has been created. What should the developer do next?
Options:
A. Write a more detailed system prompt
B. Fine-tune the model before calling it
C. Deploy the selected model in Foundry
D. Create a Content Understanding project
Best answer: C
Explanation: A model deployment makes a selected model available for application calls. Seeing a model in a catalog means it can be chosen, but a lightweight client still needs a deployed model target, such as a deployment name or endpoint, before it can send prompts through the SDK. The scenario does not require custom training or document extraction; it only needs an app-callable generative model. Prompt design happens after there is a model target to call, not as a replacement for deployment.
Topic: Implement AI Solutions by Using Microsoft Foundry
A media team stores recorded customer-support calls and short training videos. They need a Microsoft Foundry solution that can identify key events, speakers, and relevant details from the audio/video files so the data can be indexed for review. Which capability is the best fit?
Options:
A. Use a generative model to summarize typed transcripts only
B. Use Azure Content Understanding in Foundry Tools
C. Use Azure Speech to synthesize spoken responses
D. Use image generation to create training visuals
Best answer: B
Explanation: Audio and video information extraction focuses on finding useful structure and facts inside media files, such as events, speakers, scenes, or relevant details. In Microsoft Foundry, Azure Content Understanding in Foundry Tools is the best fit when the input is audio or video and the goal is extraction for indexing or review. Speech synthesis goes the opposite direction by creating spoken audio from text. Text summarization can help after a transcript exists, but it does not directly extract information from the original audio/video files. Image generation creates new images rather than analyzing existing media.
Topic: Identify AI Concepts and Capabilities
A bank uses one automated loan-screening model for all applicants. The team requires the same scoring rule for every person, uses past approval data as input, and wants to understand why complaints of unfair outcomes may still be valid. Which explanation best fits this situation?
Options:
A. The training data may reflect biased past decisions.
B. The model must use separate rules for each group.
C. The model needs a larger compute deployment.
D. The issue is only transparency, not fairness.
Best answer: A
Explanation: Fairness in AI is not guaranteed by applying the same automated rule to everyone. If the model learns from biased historical approvals, incomplete inputs, or output labels that reflect unfair decisions, it can treat applicants identically in process while still producing unequal or unjust results. The problem is in what the system learned and what outcomes it optimizes, not necessarily in whether the same scoring code runs for each applicant.
The key takeaway is that fairness requires checking inputs, outputs, and impacts, not only confirming that the automation is uniform.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer created a single-agent solution in Microsoft Foundry. The agent has new instructions, a connected tool, and a rule that it must not answer outside the company travel policy. The team plans to build a lightweight client application that depends on the agent. What should the developer do first?
Options:
A. Deploy a different generative AI model
B. Write the client application with the Foundry SDK
C. Test the agent in the Foundry portal
D. Remove the boundary rule from the instructions
Best answer: C
Explanation: An agent should be tested in the Foundry portal before an application relies on it, especially after changing instructions, adding tools, or defining boundaries. Portal testing lets the developer confirm that the agent follows its intended scope, uses connected tools appropriately, and refuses or redirects out-of-scope requests. After that behavior is validated, a lightweight client can call the agent with more confidence. Writing the client first can hide agent design problems inside application code.
Topic: Implement AI Solutions by Using Microsoft Foundry
A company uses Azure Content Understanding in Foundry Tools to extract damage details from uploaded vehicle images. The extracted data will update claim records and may affect repair authorization and payment amounts. The team needs a review step that reduces incorrect operational and financial decisions. Which step is the best fit?
Options:
A. Store only the raw images without extracted fields
B. Require human validation before record updates
C. Automatically approve claims with high confidence scores
D. Use image generation to recreate unclear damage areas
Best answer: B
Explanation: When extracted image information can affect user records, operations, safety, or financial outcomes, the solution should include a human review or validation step before the data is used for important actions. In this scenario, the extracted damage details can change claim records and influence payment or repair authorization, so relying only on automated extraction is too risky. Confidence scores can help prioritize review, but they should not replace review when the business impact is significant. The key takeaway is to add human oversight where AI output drives consequential decisions.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team uses a deployed multimodal model in Microsoft Foundry to interpret images from field inspections. A result may update maintenance records and trigger an equipment shutdown. Which review response best fits this use of visual interpretation?
Options:
A. Require human review before updating records or triggering actions
B. Replace the vision model with a text-only model
C. Automatically accept high-confidence visual results as final
D. Store only the image and discard the interpretation
Best answer: A
Explanation: When a vision result can affect safety, user records, or operational actions, it should not be treated as automatically authoritative. A deployed multimodal model can help interpret images, but its output may be uncertain or context-dependent. The appropriate response is to include a review step so a qualified person confirms the interpretation before records are changed or actions such as shutdowns are triggered. This supports reliability, safety, and accountability while still using AI to assist the workflow. Confidence scores can help prioritize review, but they do not remove the need for oversight in high-impact decisions.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team has already configured and tested a single-agent solution in Microsoft Foundry. A junior developer must build a lightweight web client that accepts user questions, sends them to the configured agent, and displays the response without changing the agent’s instructions or tools. Which implementation approach is the best fit?
Options:
A. Create a new model deployment for each user session before sending messages.
B. Use Azure Content Understanding to extract answers from the user messages.
C. Use the Foundry SDK to send user messages to the configured agent and display its replies.
D. Move the agent’s system instructions into each user prompt from the web form.
Best answer: C
Explanation: A lightweight agent client app does not usually define the agent’s behavior, tools, or model deployment at runtime. Those are configured in Microsoft Foundry. The client is responsible for connecting to the configured agent, sending the user’s request as a message, receiving the agent’s response, and presenting it in the application UI. It may also handle basic application concerns such as authentication, conversation state, and errors, but it should not replace the configured agent setup. Creating deployments or moving system instructions into user prompts changes the solution design rather than acting as a lightweight client.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is using Azure Content Understanding in Foundry Tools to process supplier invoices. The finance app must capture each purchased product row, including the description, quantity, unit price, and row subtotal. Which extraction target best matches this need?
Options:
A. Line items
B. Addresses
C. Dates
D. Totals
Best answer: A
Explanation: Document and form extraction targets should match the shape of the information needed from the document. Invoice rows that repeat across a document, such as product description, quantity, unit price, and row subtotal, are line items. A total, date, or address is usually a single field or small set of fields, while line items represent repeated structured records that must stay associated with each row.
Topic: Implement AI Solutions by Using Microsoft Foundry
Which need is best mapped to a single-agent solution set up in the Foundry portal?
Options:
A. One agent answers HR policy questions using approved instructions and tools
B. A monitoring architecture audits all agents across multiple business units
C. A governance program defines enterprise-wide agent approval workflows
D. Several specialized agents negotiate and delegate tasks to each other
Best answer: A
Explanation: At Azure AI Fundamentals depth, a single-agent solution in the Foundry portal is a focused setup where one agent is configured with instructions, a model, and optional tools or knowledge to help users complete a task. The key signal is that one agent owns the interaction and does not require agent-to-agent coordination. Scenarios involving multiple specialized agents, enterprise approval processes, or cross-organization monitoring move beyond basic single-agent setup into orchestration or governance concerns. The task here is to identify the implementation choice that stays centered on one configured agent.
Topic: Identify AI Concepts and Capabilities
A bank is piloting an AI pre-screening assistant for loan applications. The model meets the required overall accuracy target, does not expose applicant data, and shows users a short explanation of the factors used. Testing shows that equally qualified applicants from one age group receive substantially fewer approvals than others. Which responsible AI concern is the BEST fit?
Options:
A. Reliability and safety issue
B. Privacy and security issue
C. Transparency issue
D. Fairness risk
Best answer: D
Explanation: Fairness in AI focuses on whether an AI system treats people and groups equitably, especially when decisions affect opportunities such as loans, hiring, or access to services. In this scenario, the system has acceptable overall accuracy, protects applicant data, and provides explanations, but one age group receives worse outcomes despite similar qualifications. That pattern points to potential bias or disparate impact, which is a fairness concern. Overall performance can look acceptable while still hiding unfair outcomes for a subgroup. The key distinction is that the problem is not missing disclosure, data exposure, or general model failure; it is unequal impact across people who should be treated comparably.
Topic: Identify AI Concepts and Capabilities
A city services team wants to analyze resident feedback messages. The solution must find references to named parks, departments, dates, phone numbers, and monetary amounts so staff can route and summarize issues. Which text analysis capability is the best fit?
Options:
A. Language detection
B. Sentiment analysis
C. Key phrase extraction
D. Entity detection
Best answer: D
Explanation: Entity detection is the text analysis capability used to identify and label specific references in unstructured text, such as locations, organizations, dates, phone numbers, and quantities. In this scenario, the team needs to locate structured references inside feedback messages so the information can support routing and summarization. That requirement is different from determining emotional tone, extracting general topics, or identifying the language of the text. The key signal is the need to find named or typed references rather than classify the overall message.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is building an app that must process recorded customer-support calls and meeting videos. The app needs structured fields such as speaker names, key discussion points, action items, and timestamps that can be stored in a database. Which Microsoft Foundry capability best matches this need?
Options:
A. Text sentiment analysis
B. Azure Content Understanding in Foundry Tools
C. Image generation model deployment
D. Azure Speech in Foundry Tools
Best answer: B
Explanation: Azure Content Understanding in Foundry Tools is the best fit when an application needs to extract structured information from unstructured content, including audio and video. In this scenario, the app is not only converting speech to text; it needs usable fields such as speakers, action items, topics, and timestamps from recorded media. That is an information-extraction workload for audio and video sources. Azure Speech is more appropriate for speech recognition or synthesis tasks, while sentiment analysis focuses on classifying text tone after text is available.
Topic: Identify AI Concepts and Capabilities
A support team uses a deployed multimodal model to analyze customer photos and short videos of damaged deliveries. For one case, the model suggests that the package was damaged before delivery, but the video is shaky and the label is partly obscured. Which concept does this observation best map to?
Options:
A. The model deployment has failed
B. Review is needed because source evidence is ambiguous
C. A higher temperature setting will improve factual certainty
D. A text-only model should replace the multimodal model
Best answer: B
Explanation: Multimodal models can combine information from inputs such as images, video, audio, and text, but their output still depends on the quality and completeness of the source evidence. If a video is shaky, a label is obscured, or an image lacks key context, the model may infer a plausible answer without enough reliable evidence. In that situation, the appropriate mapping is not a different workload or a failed deployment; it is a need for review before relying on the result. The key takeaway is that multimodal capability does not remove the need to validate uncertain outputs against source evidence.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is adding a feature in a Microsoft Foundry app that accepts uploaded product photos and identifies visible objects, dominant colors, and brief captions for each image. Which capability best fits this need?
Options:
A. Image generation
B. Vision capability
C. Text analysis
D. Speech processing
Best answer: B
Explanation: A vision capability is the best fit when the input is an existing image and the app needs to understand what is visible in it. In this scenario, the app analyzes product photos to identify objects, colors, and captions, which are visual understanding tasks. Text analysis works on written language, speech processing works on audio or spoken language, and image generation creates new images from prompts rather than analyzing uploaded photos. The key distinction is whether the workload interprets existing visual content or creates new content.
Topic: Identify AI Concepts and Capabilities
A team uses a deployed generative AI model to convert support tickets into one of five approved status messages. They want repeated runs on the same ticket to produce the most consistent response possible, and they do not need creative variation. Which configuration approach best matches this need?
Options:
A. Raise the maximum token limit
B. Use a multimodal input model
C. Increase temperature for more variation
D. Set a low temperature value
Best answer: D
Explanation: Temperature is a generation configuration parameter that controls how random or varied a generative AI model’s output can be. For a constrained task, such as selecting or wording one of a small set of approved status messages, a lower temperature is preferred because it makes the model favor more likely, repeatable responses. This supports consistency when the same or similar input is submitted multiple times. A higher temperature is useful when creativity or varied wording is desired, but that conflicts with this scenario. Token limits affect response length, not predictability.
Topic: Implement AI Solutions by Using Microsoft Foundry
A developer is building a lightweight Foundry SDK chat client for a deployed multimodal model. Users must ask questions by speaking into a microphone, the app must preserve the spoken prompt for the model, and the solution should avoid sending only a text box value. Which implementation step is the best fit?
Options:
A. Add a longer system prompt describing speech recognition
B. Send only synthesized speech as the model response
C. Capture microphone audio and pass it as supported audio input
D. Use image generation to create a transcript image
Best answer: C
Explanation: For spoken prompts with a deployed multimodal model, the client application must capture the user’s speech in a usable input form, such as a supported audio file or audio stream, and include that input in the model request. The key requirement is not just that the user speaks, but that the app preserves the spoken input in a format the model can accept. A system prompt can guide behavior after input is received, but it does not capture audio. Speech synthesis is for producing spoken output, not accepting a spoken prompt.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team uses text analysis in a Microsoft Foundry prototype to flag negative customer comments. The phrase “This update is sick” is sometimes flagged as negative, but in this community it often means “excellent.” Which result-handling concept does this observation best illustrate?
Options:
A. Text-analysis output may need context
B. Document extraction requires form fields
C. Speech recognition requires a custom voice
D. Image generation needs a style prompt
Best answer: A
Explanation: Text analysis can identify sentiment, key phrases, or entities, but language is often ambiguous. Slang, sarcasm, product names, regional wording, and surrounding sentences can change the meaning of a word or phrase. In the stem, “sick” may look negative in isolation but can be positive in the customer community. Result handling should account for context before the application takes action, especially when the result could affect users or business decisions.
The key takeaway is that text-analysis output is useful evidence, not a guaranteed interpretation of meaning in every context.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team uses Azure Content Understanding in Foundry Tools to extract fields from uploaded claim forms. Which observation best maps to a result that should be routed for review before the application uses it automatically?
Options:
A. A typed invoice number matches the expected format.
B. A blurred handwritten diagnosis code includes a patient identifier.
C. A vendor name is extracted from a clear company letterhead.
D. A standard date field is read from a clean printed form.
Best answer: B
Explanation: Extraction validation is especially important when the source content reduces reliability or raises privacy risk. Blurred handwriting can make a field ambiguous, so the extracted value may be wrong even if the system returns a value. A patient identifier also makes the content sensitive, so the application should avoid using or storing the result automatically until it is reviewed according to the organization’s validation and privacy process. Clear printed fields, expected formats, and standard templates are lower-risk signals and usually do not, by themselves, require special review.
Topic: Implement AI Solutions by Using Microsoft Foundry
A team uses Azure Content Understanding in Foundry Tools to extract fields from supplier invoices. The app detects that an extracted TotalAmount has low confidence and must be checked by an accounts-payable employee before it is sent to the accounting system. Which workflow action best matches this need?
Options:
A. Display the field only in a dashboard
B. Validate the field with no human review
C. Route the field for downstream review
D. Store the extracted field as final data
Best answer: C
Explanation: In an extraction app workflow, fields can be validated, stored, displayed, or routed depending on what must happen next. A low-confidence or business-critical value that needs human confirmation before being used by another system should be routed for downstream review. This allows a reviewer to confirm or correct the extracted field before it becomes operational data. Storing is appropriate after data is accepted, displaying helps users inspect results, and automated validation checks rules or formats but does not replace a required human review step.
Topic: Implement AI Solutions by Using Microsoft Foundry
A junior developer has tested a deployed generative AI model in the Foundry portal. The team now needs an internal web page where employees can enter prompts, send them to the same deployment at runtime, and display responses without opening the portal. Which approach is the best fit?
Options:
A. Continue using the Foundry portal test pane for employee prompts.
B. Build a lightweight app that calls the deployment through the Foundry SDK.
C. Use Azure Content Understanding to extract fields from prompts.
D. Create a new model deployment for each employee prompt.
Best answer: B
Explanation: The key distinction is where the interaction happens. The Foundry portal test experience is useful for trying prompts, checking a deployment, and exploring model behavior during development. It is not the right interface when users need an application experience. When an app must collect user input, send requests at runtime, and render model responses, the developer should use the Foundry SDK or supported client APIs to call the existing deployed model from application code. The deployment remains the model endpoint; the lightweight app becomes the user-facing client. Creating deployments per prompt or using an extraction tool changes the workload instead of implementing the required app interaction.
Topic: Identify AI Concepts and Capabilities
A clinic plans to use an AI feature in Microsoft Foundry to rank incoming patient messages by urgency. The clinic needs to use the AI output to support triage, explain triage decisions when challenged, and ensure accountability if a patient is harmed by an incorrect recommendation. Which approach is the best fit?
Options:
A. Rely on model confidence scores as the only oversight mechanism
B. Assign clinical owners to review and govern AI-supported triage decisions
C. State that the deployed model is responsible for triage outcomes
D. Require patients to accept that AI recommendations are final
Best answer: B
Explanation: Accountability is a responsible AI principle: people and organizations must remain responsible for decisions that use AI support. In this scenario, the clinic can use AI to help prioritize messages, but it should define human ownership, review processes, documentation, and escalation paths for triage decisions. A model can generate recommendations or rankings, but it cannot accept legal, ethical, or operational responsibility for harm. The key takeaway is that AI can support decision-making, but responsibility stays with the organization deploying and using it.
Topic: Identify AI Concepts and Capabilities
A finance team has scanned images of supplier invoices. They need an AI capability that returns structured fields such as supplier name, invoice date, line items, and total amount for use in an accounting app. Which workload best matches this need?
Options:
A. Image generation
B. Image classification
C. Object detection
D. Information extraction from documents and forms
Best answer: D
Explanation: This is an information extraction task because the required output is structured data from a source document, not a general description of what is visible. At Azure AI Fundamentals depth, visual inputs such as scanned forms, receipts, or invoices can still be handled as information extraction when the business need is to populate fields like dates, names, totals, or line items. Computer vision interpretation may identify visual content, but the deciding factor here is the desired structured output for another system. The key takeaway is to classify the workload by the goal, not only by the input modality.
Topic: Identify AI Concepts and Capabilities
A team is adding image generation to a youth education app. They are concerned that prompts could produce violent, sexual, deceptive, or otherwise harmful images that users should not see. Which responsible AI principle is most directly involved?
Options:
A. Reliability and safety
B. Inclusiveness
C. Fairness
D. Transparency
Best answer: A
Explanation: For image generation, a key responsible AI concern is whether generated content could be unsafe, inappropriate, misleading, or harmful. This maps most directly to reliability and safety, because the system should reduce the risk of harmful outputs and behave in ways that are safe for the intended users and context. In practice, this can involve content filtering, prompt restrictions, user warnings, review processes, or escalation paths.
Transparency is still useful when explaining that images are AI-generated, but the main concern in the stem is preventing harmful content from reaching users.
Use the AI-901 Practice Test page for the full IT Mastery practice bank, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Try AI-901 on Web View AI-901 Practice Test
Read the AI-901 Cheat Sheet for compact concept review before returning to timed practice.