Browse Certification Exams

Microsoft Azure AI Fundamentals (AI-900): 24 Sample Questions & Simulator

AI-900 sample questions, mock-exam practice, and simulator access with detailed explanations in IT Mastery on web, iOS, and Android.

AI-900 is the Microsoft Azure AI Fundamentals exam, aimed at candidates who need practical AI workload recognition, service-selection judgment, and a clear understanding of machine learning, vision, NLP, and generative AI on Azure. If you are searching for AI-900 sample questions, a practice test, mock exam, or exam simulator, this is the main IT Mastery page to start on web and continue on iOS or Android with the same account.

Interactive Practice Center

Start a practice session for Microsoft Azure AI Fundamentals (AI-900) below, or open the full app in a new tab for the best experience and navigate with swipes/gestures or the mouse wheel, just like on your phone or tablet.

Open Full App in a New Tab

A small set of questions is available for free preview. Subscribers can unlock full access by signing in with the same account used on mobile.

Prefer to practice on your phone or tablet? Download the IT Mastery – AWS, Azure, GCP & CompTIA exam prep app for iOS or the IT Mastery app on Google Play (Android), then sign in with the same account on the web to continue your sessions on desktop.

What this practice page gives you

  • a direct route into the live IT Mastery simulator for AI-900
  • 24 sample questions with detailed explanations across AI workloads, machine learning, computer vision, NLP, and generative AI
  • focused practice around Azure AI service selection and fundamentals instead of generic AI trivia
  • a clear free-preview path before you subscribe
  • the same account across web and mobile

AI-900 exam snapshot

  • Vendor: Microsoft (Azure)
  • Official exam name: Microsoft Azure AI Fundamentals (AI-900)
  • Exam code: AI-900
  • Items: 50 total
  • Exam time: 45 minutes
  • Practice format on this page: 24 single-answer sample questions with detailed explanations from the current local bank

Fundamentals-level Microsoft Azure AI practice across common AI workloads, ML basics, Azure AI services, and generative AI. This exam retires on June 30, 2026.

Topic coverage for AI-900 practice

  • Describe Artificial Intelligence workloads and considerations: 19%
  • Describe fundamental principles of machine learning on Azure: 19%
  • Describe features of computer vision workloads on Azure: 19%
  • Describe features of Natural Language Processing (NLP) workloads on Azure: 19%
  • Describe features of generative AI workloads on Azure: 24%

How to use the AI-900 simulator efficiently

  1. Start with workload recognition and service-selection questions so you can quickly tell document intelligence, vision, NLP, and generative AI scenarios apart.
  2. Review every miss until you can explain why the best Azure AI service fits the data, the user need, and the operating constraint.
  3. Move into mixed sets once you can switch comfortably between machine learning basics, Responsible AI, and Azure AI service choices.
  4. Finish with timed runs so the 45-minute pace feels normal before test day.

Free preview vs premium

  • Free preview: a smaller web set so you can validate the question style and explanation depth.
  • Premium: the full AI-900 bank, focused drills, mixed sets, detailed explanations, and progress tracking across web and mobile.

Good next pages after AI-900

  • AZ-900 if you also need broader Azure platform and governance fundamentals
  • Azure certification pages if you are still choosing between AI, fundamentals, administrator, developer, and architect tracks

24 AI-900 sample questions with detailed explanations

These sample questions are drawn from the current local bank for this exact exam code. Use them to check your readiness here, then continue into the full IT Mastery question bank for broader timed coverage.

Question 1

A customer support team stores recordings of call-center conversations in Azure. They want to automatically convert each conversation into searchable text transcripts for later review. Which Azure service category should they use?

Options:

  • A. Azure AI Language
  • B. Azure AI Vision
  • C. Azure Machine Learning
  • D. Azure AI Speech

Best answer: D

Explanation: The requirement is to turn spoken audio from call recordings into text. That is a speech recognition task, which is provided by Azure AI Speech through speech-to-text capabilities.

The core concept is matching the AI workload to the correct Azure service family. Transcribing call-center conversations means the input is audio and the desired output is text, so the needed capability is speech-to-text. Azure AI Speech is the prebuilt Azure AI service for recognizing spoken language and generating transcripts.

Azure AI Language works on text that already exists, such as analyzing sentiment or extracting key phrases. Azure AI Vision is for image and video analysis. Azure Machine Learning is typically used when you need to build or train custom models, not when a standard prebuilt speech transcription service already matches the requirement.

When the goal is converting spoken words into text, choose Azure AI Speech.

  • Text analysis only: Azure AI Language is for processing existing text, not for converting audio into text.
  • Wrong modality: Azure AI Vision focuses on visual content such as images and video rather than spoken conversations.
  • Custom model mismatch: Azure Machine Learning is less appropriate because this scenario is directly covered by a prebuilt Azure AI service.

Question 2

A retail company stores customer purchase histories in Azure. The marketing team wants to discover natural groups of customers for targeted promotions. The dataset has no existing segment labels, and the team is not trying to predict a number or generate text. Which approach is the BEST fit?

Options:

  • A. Use regression in Azure Machine Learning
  • B. Use classification in Azure Machine Learning
  • C. Use clustering in Azure Machine Learning
  • D. Use Azure OpenAI Service to summarize the records

Best answer: C

Explanation: This scenario calls for unsupervised learning because the company has no labeled customer segments and wants to discover patterns. Clustering in Azure Machine Learning groups similar records without requiring predefined outcomes.

When no labels are available and the goal is to find hidden structure, clustering is the appropriate machine learning technique. Clustering is an unsupervised learning method that groups similar items based on shared characteristics, such as purchase frequency, product mix, or average order value. In this scenario, the company wants to discover customer segments, not predict a known category and not forecast a numeric result.

  • Use clustering when labels are missing and pattern discovery is the goal.
  • Use classification when each record already has a known category.
  • Use regression when you need to predict a number.

A generative AI service can summarize or create text, but it does not replace a clustering model for discovering natural groups in customer data.

  • Classification is tempting, but it requires labeled examples such as existing customer segment names.
  • Regression is for predicting numeric values, such as future spending, not for finding groups.
  • Summarization with Azure OpenAI Service generates text output rather than identifying clusters in tabular data.
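To make the idea of "discovering groups without labels" concrete, here is a minimal k-means sketch in pure Python. This is an illustration of the clustering technique itself, not Azure Machine Learning code, and the customer values (monthly purchase frequency, average order value) are invented for the example.

```python
# Illustrative k-means clustering in pure Python (not Azure ML code).
# Each customer is (purchase frequency per month, average order value).
# No labels are provided; the algorithm discovers groups on its own.

def kmeans(points, k, iterations=20):
    # Start with the first k points as initial centroids.
    centroids = points[:k]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            best = min(range(k),
                       key=lambda i: (p[0] - centroids[i][0]) ** 2
                                     + (p[1] - centroids[i][1]) ** 2)
            groups[best].append(p)
        # Update step: move each centroid to the mean of its group.
        centroids = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups

customers = [(2, 15), (3, 18), (2, 20),      # infrequent, low-value buyers
             (12, 90), (14, 85), (13, 95)]   # frequent, high-value buyers
segments = kmeans(customers, k=2)
```

The algorithm ends up with the two intuitive segments even though no segment names were ever supplied, which is exactly the property the scenario asks for.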

Question 3

A retailer’s data science team already trains sales forecasting models in Azure Machine Learning. They now need to keep a record of each run’s parameters, evaluation metrics, and output files so they can compare runs and reproduce the best result later. Which Azure Machine Learning capability best meets this need?

Options:

  • A. Experiment tracking
  • B. Automated machine learning
  • C. Compute clusters
  • D. Managed online endpoints

Best answer: A

Explanation: The requirement is to log and compare repeated model runs, not to train a model automatically, host predictions, or provide compute. In Azure Machine Learning, experiment tracking stores run details such as parameters, metrics, and artifacts so the team can review outcomes and reproduce the best run.

This scenario is about organizing and evaluating multiple training runs. In Azure Machine Learning, experiment tracking captures what happened during each run, including parameter values, evaluation metrics, and produced artifacts, so a team can compare results and reproduce a strong run later.

  • Use experiment tracking when the main goal is visibility and comparison across runs.
  • Use automated machine learning when the goal is to have Azure try models and settings for you.
  • Use managed online endpoints when the goal is real-time deployment.
  • Use compute clusters when the goal is scalable processing power for training jobs.

The key distinction is that experiment tracking supports model development oversight, while deployment and compute capabilities serve different stages of the ML lifecycle.

  • Automated training fits when Azure should help build and tune models, not when the main need is logging and comparing existing runs.
  • Real-time deployment through managed online endpoints is for serving predictions, not for recording training history.
  • Training infrastructure with compute clusters provides scalable compute, but it does not by itself organize run metrics and artifacts for comparison.
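The kind of record experiment tracking keeps can be sketched in a few lines of plain Python. This is not the Azure Machine Learning SDK; the run IDs, parameter names, and metric values are invented to show the shape of the data that tracking captures per run.

```python
# Minimal illustration of what experiment tracking records
# (not the Azure Machine Learning SDK; all names/values are invented).
runs = []

def log_run(run_id, params, metrics, artifacts):
    """Record one training run's parameters, metrics, and output files."""
    runs.append({"run_id": run_id, "params": params,
                 "metrics": metrics, "artifacts": artifacts})

log_run("run-001", {"learning_rate": 0.10, "trees": 100},
        {"mae": 14.2}, ["model_v1.pkl"])
log_run("run-002", {"learning_rate": 0.05, "trees": 300},
        {"mae": 11.8}, ["model_v2.pkl"])

# Compare runs and pick the one to reproduce: lowest mean absolute error.
best = min(runs, key=lambda r: r["metrics"]["mae"])
```

Because the parameters and artifacts were logged alongside the metric, the team can rerun `run-002` with the exact settings that produced the best score.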

Question 4

Contoso is evaluating Azure AI for image analysis. Which requirement is primarily a facial detection workload?

Options:

  • A. Determine whether face images can be retained legally
  • B. Grant secure room access to approved employees
  • C. Locate faces in event photos so they can be blurred
  • D. Verify a visitor’s selfie against an ID photo

Best answer: C

Explanation: A facial detection workload focuses on finding human faces in an image. Locating faces for blurring fits that task, while selfie matching, access decisions, and legal retention questions are identity, security, or compliance concerns.

The key distinction is between detecting that a face appears in an image and making a broader decision about identity, access, or policy. A facial detection workload analyzes an image to locate faces, often so an app can blur, crop, or count them. In AI-900 terms, this is a computer vision use case.

By contrast, comparing a selfie to an ID photo is identity verification. Granting entry to a secure room is an access-control or security decision. Deciding whether face images may be stored is a compliance or governance question. Those scenarios may involve images of faces, but face detection is not the primary workload being asked for here.

  • Selfie matching is an identity verification task, not just face location.
  • Legal retention is a compliance question about policy and governance.
  • Secure room access adds authorization and security rules beyond detecting a face.

Question 5

Which responsible AI principle is defined as ensuring that AI systems work effectively for people with different backgrounds, conditions, and abilities?

Options:

  • A. Fairness
  • B. Transparency
  • C. Accountability
  • D. Inclusiveness

Best answer: D

Explanation: Inclusiveness is the responsible AI principle about making AI systems effective for people with varied backgrounds, conditions, and abilities. It focuses on designing for a broad range of users instead of assuming one typical user.

In responsible AI, inclusiveness means designing and evaluating AI systems so they can be used effectively by people with different backgrounds, experiences, conditions, and abilities. This includes considering users with disabilities, different language needs, varied technical skill levels, and different real-world contexts. An inclusive system is not built only for an assumed “average” user; it is built and tested with diverse users in mind.

This principle is closely related to accessibility and broad usability. For example, an inclusive AI system might support multiple input methods, work well with varied accents, or present information in ways that more people can understand and use. The closest confusion is fairness: fairness focuses on avoiding unjust bias in outcomes, while inclusiveness focuses on making the system effective for a wide range of people.

  • Fairness is about avoiding biased or unjust outcomes across groups, not primarily about usability for people with different abilities.
  • Transparency is about helping people understand how an AI system works or how it reaches results.
  • Accountability is about ensuring humans and organizations remain responsible for AI decisions and governance.

Question 6

A company wants Azure to create searchable text from recorded customer-support calls. Which AI workload type best fits this requirement?

Options:

  • A. Speech
  • B. Natural language processing
  • C. Generative AI
  • D. Computer vision

Best answer: A

Explanation: The input is recorded audio, and the desired output is written text. That makes this a speech workload, specifically speech-to-text transcription in Azure AI Speech, rather than a service that analyzes images or already-written text.

Choose the workload by matching the data type to the task. Here, the company wants spoken words from support-call recordings turned into searchable text, so the core capability is speech-to-text. That is a speech workload, and in Azure it aligns with Azure AI Speech.

Natural language processing is usually applied after text already exists, such as for sentiment analysis, entity extraction, or text translation. Computer vision works with images and video. Generative AI creates new content such as summaries or draft replies, but it is not the primary workload for accurately transcribing audio. The key takeaway is simple: when the source is audio and the output is text, start with speech.

  • NLP confusion: NLP fits text-based tasks like sentiment or translation, but this requirement begins with recorded audio.
  • Vision mismatch: computer vision is for images and video, not spoken conversations.
  • Generative AI mismatch: generative AI can draft or summarize content, but transcription first requires recognizing speech.

Question 7

A company is building a photo intake app for event registration. The app must detect each human face in uploaded images and return the location of each face so the photos can be cropped automatically. Which Azure service should the company choose?

Options:

  • A. Azure AI Face detection service
  • B. Azure AI Document Intelligence
  • C. Azure Machine Learning
  • D. Azure AI Vision

Best answer: A

Explanation: The requirement is specifically about finding human faces in images and returning where those faces appear. Azure AI Face detection service is the best match because it is built for face detection and face-focused analysis tasks.

When a scenario explicitly requires detecting or analyzing human faces, the best Azure choice is the Azure AI Face detection service. In this case, the app must locate each face in uploaded photos so the system can crop them automatically, which is a direct face-detection task.

Azure AI Vision is a broader computer vision service for general image analysis tasks, but this question asks for a face-specific capability. Azure AI Document Intelligence is used for extracting content from forms and documents, not photos of people. Azure Machine Learning is for building custom models and is not the best first choice when a prebuilt Azure AI service already matches the need.

For AI-900, the key idea is to choose the specialized prebuilt service when the requirement clearly centers on faces.

  • General vision: Azure AI Vision offers broader image analysis, but the stem calls for a face-specific service.
  • Document processing: Azure AI Document Intelligence fits forms, receipts, and scanned pages rather than event photos.
  • Custom ML: Azure Machine Learning is unnecessary when Azure already provides a prebuilt service for face detection.

Question 8

A retailer already uses Azure AI Language to classify customer emails by topic. The company wants to add a capability that shows the main value proposition of a generative AI workload compared with a classification-only workload. Which requirement best fits this need?

Options:

  • A. Detect whether each email is a complaint or a compliment
  • B. Draft a natural-language reply for an agent to review
  • C. Extract order numbers from each email
  • D. Identify the language used in each email

Best answer: B

Explanation: Generative AI is most useful when a solution must produce new text based on the input context. Drafting a reply uses the email content to generate a response, which is different from classification tasks that only return predefined categories.

The core difference is prediction versus creation. A classification-only workload maps input to a fixed set of labels, such as topic, sentiment, or language. A generative AI workload creates new content, such as a reply, summary, or suggested wording, based on the input.

In this scenario, the retailer already has email classification. The added value of generative AI is the ability to compose a context-aware response for a human agent to review. That goes beyond labeling the email and shows the main value proposition of generative AI at the fundamentals level.

A good rule is: if the output is a predefined category, think classification; if the output is newly written content, think generative AI.

  • Complaint detection is still classification because it returns one of a small set of labels.
  • Order-number extraction is information extraction, not content generation.
  • Language identification selects a predefined category rather than writing a response.
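The prediction-versus-creation distinction can be shown with two toy functions. These are deliberately hand-written stand-ins, not Azure AI Language or Azure OpenAI Service calls, and the keyword rule and reply wording are invented for illustration.

```python
# Toy contrast: a classifier returns a predefined label;
# a generator composes new text conditioned on the input.
# (Illustrative only; not Azure AI Language / Azure OpenAI Service code.)

def classify_email(text):
    # Classification: map input to one of a fixed set of labels.
    return "complaint" if "refund" in text.lower() else "compliment"

def draft_reply(text, label):
    # Generation: produce new content based on the input context.
    opening = ("We're sorry about the trouble"
               if label == "complaint"
               else "Thanks for the kind words")
    return f'{opening}. Regarding "{text}", an agent will follow up shortly.'

email = "I want a refund for my damaged order."
label = classify_email(email)      # output: a category from a fixed set
reply = draft_reply(email, label)  # output: newly written text
```

The classifier can only ever answer with one of its predefined labels; the reply function writes content that did not exist before, which is the generative AI value the question is probing.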

Question 9

A healthcare provider plans to use Azure OpenAI Service to draft visit summaries from clinician notes. The team discusses four design choices. Which choice most directly raises a privacy or security concern?

Options:

  • A. Assign one manager to approve model updates and monitor incidents.
  • B. Require a clinician to review each AI-generated summary before it is saved.
  • C. Test the system with notes from patients of different age groups and backgrounds.
  • D. Send full notes with patient names and insurance IDs when those details are not needed.

Best answer: D

Explanation: The privacy and security principle focuses on protecting sensitive data in AI workflows. Sending unredacted patient names and insurance IDs to a summarization system when those details are not needed creates the clearest privacy risk because unnecessary identifiable information is being shared.

Privacy and security is the responsible AI principle concerned with protecting sensitive data that an AI system collects, stores, or transmits. In this scenario, the summarization task does not require patient names or insurance IDs, so sending those identifiers to the AI system exposes unnecessary sensitive information. A key fundamentals practice is data minimization: provide only the data needed for the task, and protect any sensitive records that must be used with appropriate controls.

Human review of outputs, broad testing across different groups, and clear ownership for oversight are all valuable practices, but they address other responsible AI needs. The deciding signal here is unnecessary identifiable medical data being sent to the AI service.

  • Human review supports reliability and safety because staff can catch inaccurate draft summaries before use.
  • Broader testing supports inclusiveness and fairness by checking whether the system works well across different groups.
  • Named oversight supports accountability because someone is clearly responsible for monitoring and governance.

Question 10

A retail company is building a hands-free customer support kiosk in Azure. The solution must convert customers’ spoken questions into text, read answers back as natural-sounding audio, and use a prebuilt Azure AI service instead of custom model training.

Which service should the company choose?

Options:

  • A. Azure AI Vision
  • B. Azure AI Speech
  • C. Azure AI Language
  • D. Azure OpenAI Service

Best answer: B

Explanation: Azure AI Speech is the best fit because the kiosk must both transcribe spoken language and synthesize spoken responses. Those are core speech capabilities provided by a prebuilt Azure AI service.

This scenario is about a voice interface, not general text analysis, image analysis, or generative content creation. Azure AI Speech is designed for speech-to-text when users speak into the kiosk and text-to-speech when the system reads answers aloud. That makes it the most direct service match for a hands-free experience.

Azure AI Language works with text that already exists, such as analyzing sentiment or extracting key phrases. Azure AI Vision is for images and video. Azure OpenAI Service is for generative AI tasks such as drafting or summarizing content, but it is not the primary service for converting speech to text and text to spoken audio.

The key takeaway is to choose the service family that matches the input and output type: spoken audio points to Azure AI Speech.

  • Text analysis only: Azure AI Language misses the requirement to handle spoken input and spoken output.
  • Wrong modality: Azure AI Vision is for visual content such as images, not audio conversations.
  • Generative mismatch: Azure OpenAI Service focuses on generating content, not core speech transcription and synthesis.

Question 11

A retailer stores smartphone photos of store shelves. It needs to extract the printed price text from each image and identify common objects such as bottles and boxes. The team wants a prebuilt Azure service instead of training a custom model. Which service should they choose?

Options:

  • A. Azure AI Language
  • B. Azure OpenAI Service
  • C. Azure AI Vision
  • D. Azure Machine Learning

Best answer: C

Explanation: Azure AI Vision is the best fit because the retailer must analyze image content and read text that appears inside images. OCR is a computer vision capability, even when it is used as part of a larger document-processing workflow.

This scenario is mainly a computer vision problem. The company wants to work with photos, detect visual content in those photos, and read printed text from the images. In Azure, OCR is a core capability of Azure AI Vision, alongside image analysis features such as tagging and object detection. OCR is often used inside broader document-processing solutions, but the underlying task is still reading text from an image, which is why it belongs in computer vision. Azure AI Language works with text that is already available, Azure OpenAI Service focuses on generative AI tasks, and Azure Machine Learning is for building custom models when a prebuilt service is not enough. The key takeaway is to treat text-in-image tasks as computer vision first when choosing the Azure service.

  • Language-only focus: Azure AI Language fits analysis of existing text, but it does not read printed words from images.
  • Generative AI mismatch: Azure OpenAI Service can create or summarize content, but the main need here is OCR plus visual analysis.
  • Custom model overkill: Azure Machine Learning could build a solution, but the stem asks for a prebuilt service that already matches both needs.

Question 12

A retailer wants to build a custom machine learning solution that classifies product photos into categories. The team needs to store labeled training data, run multiple experiments on scalable compute, and retrain models as new images arrive. Which Azure service is the BEST fit?

Options:

  • A. Azure OpenAI Service
  • B. Azure Machine Learning
  • C. Azure AI Vision
  • D. Azure AI Language

Best answer: B

Explanation: Azure Machine Learning is the best choice because the team wants to build and retrain its own model. It provides managed resources for data, experimentation, training, and model development rather than only offering a prebuilt AI capability.

Azure Machine Learning is the Azure service for creating custom machine learning solutions. When a team needs to bring its own labeled data, run repeated experiments, use scalable compute, and train or retrain models, Azure Machine Learning is the right fit. It supports the core resources and workflow used in experimentation, training, and model development.

By contrast, prebuilt Azure AI services are typically used when you want ready-made capabilities without building a custom model. In this scenario, the deciding requirement is not just image analysis; it is managing data and compute for custom training over time.

  • Prebuilt vision: Azure AI Vision is tempting because the workload uses images, but the scenario requires custom experiments and training resources.
  • NLP mismatch: language services focus on text tasks such as sentiment analysis or entity extraction.
  • Generative AI mismatch: Azure OpenAI Service is for generative AI experiences, not the main platform for custom image-model training.

Question 13

Which AI workload is intended to extract existing information such as key-value pairs and table data from forms, receipts, or invoices, rather than generate new text?

Options:

  • A. Generative AI
  • B. Sentiment analysis
  • C. Document processing
  • D. Optical character recognition (OCR)

Best answer: C

Explanation: Document processing is used to capture information that already exists in documents, such as field values and table rows. The goal here is extraction, not creating summaries, drafts, or other new content, so generative AI is not the best fit.

Document processing focuses on reading and organizing content that is already present in a document. It can use OCR as part of the process, but it goes further by identifying structure such as key-value pairs, tables, and labeled fields in forms, receipts, and invoices. That makes it the right workload when a business wants to capture existing values and store them in another system.

Generative AI has a different purpose: it produces new content such as answers, summaries, rewritten text, or drafts. If the requirement is extraction of existing document data rather than content creation, document processing is the correct choice. OCR alone is narrower because it mainly converts printed text into machine-readable text.

  • The generative AI option fails because it focuses on creating new content, not extracting existing document fields.
  • The OCR-only option is too narrow because character reading is only one part of understanding document structure.
  • The sentiment analysis option fails because it detects opinion or emotion in text, not values from forms or invoices.
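The extraction idea can be sketched with a few lines of stdlib Python. This regex approach is a simplified stand-in for what Azure AI Document Intelligence does with trained models; the receipt layout and field names are invented for the example.

```python
# Sketch of key-value extraction from a receipt-like text
# (a stand-in for Azure AI Document Intelligence; invented layout).
import re

receipt = """\
Invoice No: INV-1042
Date: 2024-03-18
Total: 57.90
"""

def extract_fields(text):
    # Each "Key: value" line becomes one key-value pair --
    # extraction of data that already exists, not generation of new text.
    pairs = {}
    for m in re.finditer(r"^(?P<key>[\w .]+):\s*(?P<value>.+)$", text, re.M):
        pairs[m.group("key").strip()] = m.group("value").strip()
    return pairs

fields = extract_fields(receipt)
```

Every value in `fields` was already present in the document; nothing new was composed, which is the defining trait of a document processing workload.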

Question 14

A media company is building a review tool for uploaded event photos. The tool must find each human face so editors can automatically blur bystanders before publishing, and it does not need to identify people, authenticate users, or generate captions. Which Azure AI service is the best fit?

Options:

  • A. Azure AI Language
  • B. Azure AI Vision
  • C. Azure OpenAI Service
  • D. Azure AI Face detection service

Best answer: D

Explanation: The primary workload is detecting faces in images so the company can blur bystanders. Because the scenario explicitly excludes identity verification and access control, Azure AI Face detection service is the most direct fit. This is a computer vision choice, not an NLP or generative AI one.

Face detection means finding whether a human face appears in an image and where it is located. In this scenario, the company wants to detect faces in uploaded photos so it can blur them before publication. That makes the core need a computer vision task focused on facial detection, not identity verification, security screening, or compliance decisions. Azure AI Face detection service is the Azure service family aligned to that need at the AI-900 level. A language service would help with text, and a generative AI service would create or summarize content rather than detect faces. The broader Azure AI Vision family handles general image analysis, but the stem specifically points to facial detection as the primary workload.

  • General image analysis: the Azure AI Vision option is broader, but the stem specifically requires detecting faces rather than general tagging or OCR.
  • Text processing: the Azure AI Language option works with written language, not faces in images.
  • Generative AI: the Azure OpenAI Service option can generate or summarize content, but that is not the main task here.

Question 15

A company uses Azure Machine Learning to train a model that predicts home prices. In this context, what are features?

Options:

  • A. Groups of similar homes discovered without labeled data
  • B. Measurements such as accuracy or mean absolute error
  • C. Input variables such as square footage and number of bedrooms
  • D. Expected outputs such as the home price used for training

Best answer: C

Explanation: Features are the pieces of input data a model uses to learn patterns and make predictions. In a home-price model, values like square footage and bedroom count are features, while the home price is the label.

In machine learning, features are the input variables provided to the model for each record. The model analyzes these inputs to learn patterns that relate to the value it should predict. In a home-price scenario, columns such as square footage, neighborhood, and number of bedrooms are features because they help estimate the price.

  • The value being predicted is the label, not a feature.
  • Groups created from unlabeled data are clustering results.
  • Accuracy and mean absolute error are evaluation metrics used to assess a model.

If the model reads a value as an input to make a prediction, that value is a feature.

  • The option describing the home price used for training refers to the label or target.
  • The option describing groups of similar homes refers to clustering output, not model inputs.
  • The option describing accuracy or mean absolute error refers to evaluation metrics used after training.
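The feature/label split is easy to show on a small tabular dataset. The column names and values below are invented for illustration; the point is which columns the model reads versus which one it predicts.

```python
# Features vs. label in a tabular home-price dataset
# (illustrative values; column names are invented for the example).

rows = [
    # square_feet, bedrooms, neighborhood_index, price
    (1400, 3, 2, 310_000),
    (2100, 4, 1, 455_000),
    (900, 2, 3, 198_000),
]

# Features: the input columns the model reads to make a prediction.
features = [(sqft, beds, hood) for sqft, beds, hood, _price in rows]

# Label: the value the model is trained to predict.
labels = [price for *_inputs, price in rows]
```

Anything in `features` is something the model is given; the `labels` column is what it is asked to produce, which is exactly the distinction the question tests.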

Question 16

Which computer vision task assigns an entire image to one or more predefined categories, such as cat, car, or outdoor?

Options:

  • A. Face detection
  • B. Image classification
  • C. Optical character recognition (OCR)
  • D. Object detection

Best answer: B

Explanation: Image classification is the task of labeling an image with one or more known categories. It focuses on what the overall image contains, not where items appear in the image and not what text the image includes.

Image classification is a computer vision workload that predicts one or more predefined labels for an entire image. For example, a model might classify a photo as beach, sunset, or animal. This makes it the right concept when the goal is to place an image into known categories.

Other vision tasks answer different questions:

  • Object detection identifies and locates objects within an image.
  • OCR reads printed or handwritten text from an image.
  • Face detection finds whether faces are present and where they are.

The key clue is predefined categories for the whole image. In Azure fundamentals, this is the defining idea behind image classification.

  • Object detection is for finding and locating objects, usually with bounding boxes, not for labeling the whole image.
  • OCR is for extracting text from images or documents, not assigning visual category labels.
  • Face detection is a specialized task for identifying the presence and location of faces, not general image categorization.

Question 17

A retailer wants to build an internal assistant that can answer employee questions in natural language and summarize long policy documents. The company wants access to OpenAI models through a Microsoft Azure service instead of building its own model. Which service should it choose?

Options:

  • A. Azure OpenAI Service
  • B. Azure AI Language
  • C. Azure Machine Learning
  • D. Azure AI Vision

Best answer: A

Explanation: Azure OpenAI Service is Microsoft Azure’s service for accessing OpenAI models to build generative AI solutions such as chatbots and summarizers. Because the scenario specifically requires OpenAI models in Azure, it is the best fit.

The key concept is matching a generative AI requirement to the correct Azure service. When a business wants to use OpenAI models in Azure for tasks like conversational assistants, text generation, or summarization, the correct service is Azure OpenAI Service. It is designed for Azure-based access to OpenAI models and is the standard choice for beginner-level scenarios involving generative AI assistants.

Other Azure AI services may work with language, images, or machine learning, but they are not the primary service for accessing OpenAI models. The deciding clue here is the requirement for an assistant that generates natural-language answers and summaries by using OpenAI models within Azure.

  • Language analytics mismatch: Azure AI Language supports NLP features such as sentiment analysis and entity extraction, but it is not the main service for OpenAI model access.
  • Wrong workload type: Azure AI Vision analyzes images and video; it is not for building a text-based generative assistant.
  • Too broad a platform: Azure Machine Learning is a general machine learning platform, but the scenario asks for direct access to OpenAI models through an Azure service.
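The selection logic above can be sketched as a simple lookup. This is a study aid, not an Azure API; the requirement keywords are our own shorthand.

```python
# Illustrative sketch: match a workload requirement to an Azure AI
# service family, following the selection logic explained above.
# The requirement keys are our own shorthand, not official terms.
SERVICE_FOR = {
    "openai models / generative chat": "Azure OpenAI Service",
    "text analytics (sentiment, entities)": "Azure AI Language",
    "image and video analysis": "Azure AI Vision",
    "custom model training platform": "Azure Machine Learning",
}

def pick_service(requirement: str) -> str:
    """Return the Azure service family for a shorthand requirement."""
    return SERVICE_FOR[requirement]

# Question 17: the retailer needs OpenAI models through an Azure service.
print(pick_service("openai models / generative chat"))  # Azure OpenAI Service
```

The deciding key is the phrase "OpenAI models in Azure"; any of the other requirements would point to a different row of the table.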

Question 18

A logistics company wants a mobile app for drivers. Drivers will upload photos of package labels. The app must capture the tracking number and destination code printed on each label and store that text in a database. The company does not need to categorize the label images. Which solution is the BEST fit?

Options:

  • A. Use Azure OpenAI Service to summarize each image.
  • B. Use Azure AI Language to analyze label sentiment.
  • C. Use Azure AI Vision OCR to extract label text.
  • D. Train an image classification model for label photos.

Best answer: C

Explanation: This scenario requires reading exact text from photos, which is an OCR task. Azure AI Vision OCR is built to extract printed characters from images, while image classification would only assign categories to the whole image.

The key distinction is between reading text and classifying an image. OCR, or optical character recognition, takes an image that contains letters or numbers and converts those characters into machine-readable text. That matches the need to capture tracking numbers and destination codes from package label photos.

General image classification answers a different question: it predicts what category an image belongs to, such as “label,” “receipt,” or “damaged package.” It does not return the exact printed text shown in the image. In Azure, Azure AI Vision includes OCR capabilities for this kind of text extraction.

If the business needs the actual words or numbers from an image, choose OCR rather than image classification.

  • Image categories: classification could sort label photos into types, but it would not return the tracking number or destination code itself.
  • Sentiment analysis: this is an NLP task for opinions in text, which does not match package label processing.
  • Generative summary: a model might describe what an image contains, but it is not the right tool for precise text extraction.

Question 19

A retail company records customer support calls in English and wants to create written transcripts so supervisors can search and review what customers said later. Which Azure AI capability should the company use?

Options:

  • A. Sentiment analysis
  • B. Text translation
  • C. Optical character recognition (OCR)
  • D. Speech recognition

Best answer: D

Explanation: The company needs to turn spoken audio from calls into searchable written text. That is the speech recognition workload, commonly provided as speech-to-text capability in Azure AI Speech.

Speech recognition means converting spoken words in an audio source into text. In this scenario, the input is recorded phone calls and the required output is written transcripts, so the workload is speech-to-text. This is different from translating text between languages, reading printed text from images, or detecting opinion in existing text.

A quick way to classify it is:

  • Input: spoken audio
  • Output: text transcript
  • Workload: speech recognition

The closest distractor is translation, but the scenario does not ask to change languages; it asks to capture spoken words as text.

  • Translation mismatch: translation changes text or speech from one language to another; it does not produce a same-language transcript from audio.
  • OCR mismatch: OCR extracts printed or handwritten text from images or documents, not from recorded speech.
  • Sentiment mismatch: sentiment analysis evaluates opinion in text or speech content after it has been captured; it does not create the transcript.
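The input/output test above can be written out as a small classifier. This is an illustrative study sketch; the workload names paraphrase the explanation and are not an Azure API.

```python
# Illustrative sketch: classify the AI workload from what goes in and
# what comes out, as the quick test above suggests. The kind strings
# and workload names are our own shorthand, not an Azure API.
def classify_workload(input_kind: str, output_kind: str) -> str:
    if input_kind == "spoken audio" and output_kind == "text transcript":
        return "speech recognition (speech-to-text)"
    if output_kind == "text in another language":
        return "translation"
    if input_kind == "image" and output_kind == "extracted text":
        return "OCR"
    if output_kind == "sentiment score":
        return "sentiment analysis"
    return "unclassified"

# Question 19: recorded calls in, searchable transcripts out.
print(classify_workload("spoken audio", "text transcript"))
```

The distractors each fail one side of the test: translation changes the output language, OCR takes an image as input, and sentiment analysis produces a score rather than a transcript.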

Question 20

A retailer is choosing between Azure AI capabilities. Which requirement is the best fit for a generative AI solution?

Options:

  • A. Detect sentiment in customer reviews
  • B. Draft product descriptions from feature lists
  • C. Identify key phrases in news articles
  • D. Extract invoice totals from scanned PDFs

Best answer: B

Explanation: Generative AI is used when the goal is to create new content from a prompt or source input. Drafting product descriptions from feature lists fits that pattern because the system must produce original text rather than extract or classify existing content.

The main distinction is whether the AI must generate something new or analyze information that already exists. Generative AI workloads create fresh content such as descriptions, summaries, replies, or other text based on prompts and context. Turning a list of product features into polished product copy is generative because the desired wording is newly composed.

Classic NLP usually analyzes existing text, such as detecting sentiment, classifying documents, or extracting key phrases. Document processing focuses on reading documents and pulling out text or fields from forms, invoices, or receipts. When the requirement is content creation instead of extraction or analysis, a generative AI solution is the best fit.

Extracting values from invoices may use AI, but it is not generative because it retrieves existing information instead of composing new language.

  • Invoice extraction is document processing because it reads existing fields from scanned documents.
  • Sentiment detection is classic NLP because it classifies the tone of existing text.
  • Key phrase identification is classic NLP because it extracts important terms from text rather than generating original prose.
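The create-versus-analyze distinction above can be captured in a short chooser. This is an illustrative sketch under our own category names; the requirement strings are shorthand, not product terminology.

```python
# Illustrative sketch: decide whether a requirement calls for generative
# AI, document processing, or classic NLP, following the distinction
# explained above. Requirement strings are our own shorthand.
def solution_type(requirement: str) -> str:
    creates_new_content = {"draft product descriptions", "summarize text",
                           "generate replies"}
    extracts_from_documents = {"extract invoice totals", "read form fields"}
    if requirement in creates_new_content:
        return "generative AI"       # new content is composed
    if requirement in extracts_from_documents:
        return "document processing" # existing fields are read out
    return "classic NLP"             # existing text is analyzed

print(solution_type("draft product descriptions"))  # generative AI
print(solution_type("extract invoice totals"))      # document processing
print(solution_type("detect sentiment"))            # classic NLP
```

Only the first branch composes wording that did not exist before, which is why option B is the generative fit.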

Question 21

A company is building an employee badge photo checker. Before a photo is accepted, the app must confirm that exactly one human face is present and evaluate face-related qualities such as head pose and blur. The team wants a prebuilt Azure AI service. Which Azure service category is the best fit?

Options:

  • A. Azure OpenAI Service
  • B. Azure AI Face detection service
  • C. Azure AI Vision
  • D. Azure AI Language

Best answer: B

Explanation: This is a face-analysis scenario, so the best fit is Azure AI Face detection service. It is built for detecting human faces and returning face-specific information, unlike general image analysis, text analysis, or generative AI services.

The key requirement is not just analyzing an image, but specifically detecting whether a face is present and checking face-related characteristics such as head pose and blur. That points to Azure AI Face detection service, which is the Azure service family designed for face-focused analysis in images.

Azure AI Vision is broader and commonly used for image analysis, OCR, and other vision tasks, but the scenario is centered on face-specific detection. Azure AI Language is for text workloads such as sentiment analysis or entity extraction. Azure OpenAI Service supports generative AI scenarios like chat, summarization, and content generation.

When the requirement is specifically about faces, choose the face-focused service rather than a general vision, language, or generative AI service.

  • General vision is tempting because the input is an image, but the stem asks for face-specific detection and face-related characteristics.
  • Language analysis fits text tasks such as sentiment or entity extraction, not checking faces in photos.
  • Generative AI is useful for chat, summarization, and content generation, not for prebuilt face detection in images.

Question 22

A company uses Azure OpenAI Service to add a chat assistant to its employee portal. Employees will type questions about HR policies and receive human-like responses based on internal guidance. What is the main user benefit of this conversational generative AI system?

Options:

  • A. Ask questions in natural language and get contextual replies
  • B. Translate spoken conversations in real time
  • C. Forecast future staffing needs from historical trends
  • D. Extract text and fields from scanned forms

Best answer: A

Explanation: Conversational generative AI is designed for natural-language interaction. In this scenario, the benefit is that employees can ask policy questions in their own words and receive context-aware answers, rather than getting translations, extracted document fields, or forecasts.

The core benefit of a conversational generative AI system is interactive question answering in natural language. Users do not need to search documents manually or choose from rigid commands; they can ask a question conversationally and receive a generated response based on the provided business content. In Azure, this kind of chat experience is commonly associated with Azure OpenAI Service.

That is different from AI workloads that translate speech, extract printed text from forms, or predict future numeric outcomes from historical data. The key clue in the scenario is the need for human-like replies to typed HR questions. When the goal is a chat assistant that responds conversationally, the main user benefit is natural, contextual answers.

  • Speech translation fits a spoken multilingual scenario, but the stem describes typed questions and generated answers.
  • Document extraction is an OCR or form-processing task, not a conversational chat experience.
  • Forecasting is a predictive machine learning workload used for estimating future values, not answering user questions interactively.

Question 23

Contoso is reviewing four planned Azure AI solutions before release. Which scenario most directly raises a transparency concern?

Options:

  • A. A loan screening app uses Azure Machine Learning, but applicants are not told AI is used and staff cannot explain the score.
  • B. A kiosk that uses Azure AI Face detection service has lower accuracy for people with darker skin tones.
  • C. An Azure OpenAI Service chatbot can issue refunds, but no team is assigned to review harmful decisions.
  • D. An Azure AI Language sentiment analysis solution stores identifiable customer comments without adequate safeguards.

Best answer: A

Explanation: Transparency is about making AI use visible and understandable to affected users. A screening system that hides the use of AI and cannot provide a reason for its score most directly violates that principle.

Transparency in responsible AI means people should know when AI is being used and should be able to understand the basis for an output, especially when the output affects them. In the loan screening scenario, both parts are missing: applicants are not told that AI is involved, and employees cannot explain how the score was produced. That makes the system harder to trust, question, or appropriately use.

The other scenarios point to different responsible AI principles. Uneven performance across demographic groups is mainly a fairness issue. Weak protection of identifiable customer data is a privacy and security issue. Lack of assigned human ownership for reviewing harmful outcomes is an accountability issue. The key takeaway is that hidden AI use and unexplained outputs are classic transparency concerns.

  • Bias across groups: lower accuracy for people with darker skin tones is primarily a fairness concern.
  • Weak data protection: storing identifiable comments without safeguards is a privacy and security concern.
  • No clear owner: failing to assign human review responsibility is an accountability concern.
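The symptom-to-principle mapping above can be summarized in one table. This is a study sketch; the symptom phrasing is ours, while the principle names follow the responsible AI terms used in the explanation.

```python
# Illustrative sketch: map each Question 23 scenario symptom to the
# responsible AI principle it raises. Symptom phrasing is our own.
PRINCIPLE_FOR = {
    "AI use hidden and outputs unexplained": "transparency",
    "lower accuracy for some demographic groups": "fairness",
    "identifiable data stored without safeguards": "privacy and security",
    "no one assigned to review harmful decisions": "accountability",
}

for symptom, principle in PRINCIPLE_FOR.items():
    print(f"{symptom} -> {principle}")
```

Scanning the table shows why option A is the transparency answer: it is the only scenario where the use of AI itself is hidden and unexplained.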

Question 24

A retailer wants to analyze photos of store shelves. The solution must identify each product instance and return its location as a rectangle in the image so the app can count items by position. Which Azure AI capability best fits this requirement?

Options:

  • A. Azure AI Face detection service
  • B. Image classification in Azure AI Vision
  • C. Optical character recognition in Azure AI Vision
  • D. Object detection in Azure AI Vision

Best answer: D

Explanation: The key requirement is the output: a rectangle for each detected product. That means the workload must return bounding boxes for multiple objects, which is what object detection provides.

Choose the service or capability based on the required output, not just because the scenario involves images. Here, the app needs to find each product and return its location in the image. That is an object detection task, because object detection identifies objects and provides bounding boxes.

Other vision capabilities produce different outputs. Image classification assigns a label to an image, OCR extracts text, and face detection identifies human faces and their locations. Since the target is products on shelves rather than text or faces, and the solution needs coordinates for each item, object detection is the best fit.

The main takeaway is to match the needed result—labels, extracted text, bounding boxes, or face detection—to the capability.

  • Image label only: image classification typically labels an image but does not locate each product instance.
  • Text extraction: OCR is for reading printed or handwritten text, not for detecting general products.
  • Wrong object type: face detection locates human faces, not shelf items such as products.
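The match-output-to-capability takeaway above can be expressed as a lookup. This is an illustrative study sketch; the output labels are our own shorthand, not Azure API names.

```python
# Illustrative sketch: choose the vision capability from the required
# output, as the takeaway above suggests. Output labels are our own.
def capability_for_output(required_output: str) -> str:
    return {
        "labels for the whole image": "image classification",
        "extracted text": "OCR",
        "bounding boxes for objects": "object detection",
        "face presence and locations": "face detection",
    }[required_output]

# Question 24: a rectangle (bounding box) per product instance.
print(capability_for_output("bounding boxes for objects"))  # object detection
```

Because the scenario demands coordinates for each product rather than a label, text, or faces, only the bounding-box row fits.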