Microsoft AI-200 Practice Test: AI Cloud Developer

Prepare for Microsoft Azure AI Cloud Developer Associate (AI-200) with a stable, objective-mapped IT Mastery bank, 24 public sample questions, a free 60-question diagnostic, and drills on Azure containers, data services, integration, security, monitoring, and troubleshooting.

Start with the free diagnostic.

See how the questions feel, review detailed explanations, and identify weak topics before you subscribe. IT Mastery then gives you a stable, blueprint-mapped practice bank with timed mocks, topic drills, detailed explanations, glossary support, code and scenario practice, and progress tracking across web and mobile.

Current alignment: Mapped to the current vendor blueprint, domains, services, task statements, and skill areas where published.

Stable bank: Questions are prebuilt and reviewed before publication, so practice is stable and inspectable, not improvised during a quiz.

Teaching explanations: Explanations teach the service choice, configuration trade-off, command behavior, or troubleshooting rule behind the answer.

Free proof first: Use the public samples or free diagnostic first, then continue into IT Mastery for the full question bank.

Tip: Start with the free AI-200 web preview, then drill Azure containers, data services, integration, security, monitoring, and troubleshooting until cloud AI application decisions feel natural.

On page: 24 sample questions
Free preview: Included before subscription
Full bank: 1,980 total questions with subscription

Use the free preview first, then unlock the full bank, unlimited timed mock exams, and progress analytics. Start on web or mobile, then keep using the same IT Mastery account on each supported device.

Start with the free 60-question AI-200 diagnostic or the 24 public sample questions. See how the questions test Azure containers, serverless services, AI data services, service integration, identity, security, monitoring, and troubleshooting before you subscribe; IT Mastery then gives you a stable, objective-mapped AI-200 practice bank with 1,980 questions, timed mocks, topic drills, progress tracking, and detailed explanations across web and mobile.

Interactive Practice Center

Start a practice session for Microsoft Azure AI Cloud Developer Associate (AI-200) below. For the best experience, open the full app in a new tab and navigate with swipes, gestures, or the mouse wheel, just as you would on a phone or tablet.

A small set of questions is available for free preview. Subscribers can unlock full access by signing in with the same app-family account they use on web and mobile.

Prefer to practice on your phone or tablet? Download the IT Mastery – AWS, Azure, GCP & CompTIA exam prep app for iOS or IT Mastery app on Google Play (Android) and use the same IT Mastery account across web and mobile.

Free diagnostic: Try the AI-200 full-length practice exam before subscribing. Use it as an Azure AI cloud-developer baseline, then return to IT Mastery for timed mocks, topic drills, explanations, and the full AI-200 question bank.

What this AI-200 practice page gives you

a direct route into IT Mastery practice for AI-200
24 on-page sample questions selected from the live AI-200 practice bank
a free 60-question diagnostic across the AI-200 topic areas
topic drills for containers, AI data services, Azure service integration, security, monitoring, and troubleshooting
the same IT Mastery account across web and mobile

Who AI-200 is for

developers building backend and AI-enabled applications on Microsoft Azure
candidates comparing AZ-204-style Azure development with newer AI-first development routes
teams that need practice around containers, Azure Functions, messaging, data services, security, monitoring, and troubleshooting

AI-200 exam snapshot

Issuer: Microsoft
Certification lane: Microsoft Certified: Azure AI Cloud Developer Associate
Exam code: AI-200
Practice reference: 60 questions in 120 minutes in the Mastery catalog
Current IT Mastery status: live practice available

Topic coverage for AI-200

Domain	Weight
Develop Containerized Solutions on Azure	23%
Develop AI Solutions by Using Azure Data Management Services	29%
Connect to and Consume Azure Services	24%
Secure, Monitor, and Troubleshoot Azure Solutions	24%

AI-200 application build map

Use this map to connect individual questions to the Azure AI cloud-developer decisions this practice page tests.

    flowchart LR
      S1["App requirement"] --> S2
      S2["Choose compute boundary"] --> S3
      S3["Connect AI and data services"] --> S4
      S4["Secure identities and secrets"] --> S5
      S5["Add observability and resilience"] --> S6
      S6["Ship reviewed release"]

AI-200 readiness map

Area	What strong readiness looks like
Containerized solutions	You can choose container app, registry, revision, scaling, identity, and deployment patterns from scenario evidence.
AI data services	You can match vector search, document storage, caching, relational data, and data-governance requirements to Azure services.
Azure service integration	You can choose queues, events, API boundaries, Functions, and workflow patterns that avoid brittle synchronous designs.
Security and operations	You can apply managed identity, Key Vault, App Configuration, telemetry, KQL, retry strategy, and troubleshooting evidence.

Sample Exam Questions

Try these 24 original sample questions for Microsoft AI-200. They are selected from the live IT Mastery practice bank for self-assessment and are not official exam questions.

Question 1

Topic: Develop AI Solutions by Using Azure Data Management Services

A Python RAG API stores document chunks, embeddings, tenant_id, and source_system in Azure Database for PostgreSQL. Users from one tenant get low-confidence answers even though matching chunks exist. A trace for a failing request shows:

Request: tenant_id=contoso, source_system=policy
SQL executed:
SELECT chunk_id, tenant_id, source_system, content
FROM rag_chunks
ORDER BY embedding <=> :query_embedding
LIMIT 20;

App filter after SQL:
tenant_id=contoso AND source_system=policy
filtered_rows=1

Which retrieval pattern should you use to address this failure mode?

Options:

A. Return newest matching chunks and skip vector ranking.
B. Increase LIMIT and keep filtering in application code.
C. Append tenant metadata before generating the query embedding.
D. Filter metadata in SQL, then rank by vector distance.

Best answer: D

Explanation: The trace shows the top-k vector search is being run across all chunks before tenant and source metadata is applied in the app. Because LIMIT happens before filtering, the best matching rows for the requested tenant and source may never be returned from PostgreSQL. For semantic retrieval with metadata constraints, store embeddings and source metadata together and push the metadata predicates into the PostgreSQL query. The query should restrict rows with predicates such as tenant_id = :tenant_id and source_system = :source, then rank that filtered candidate set with vector distance before applying LIMIT. Increasing top-k is only a workaround; the core issue is the ordering of metadata filtering and similarity ranking.

Larger top-k may reduce misses, but it still performs global similarity first and can remain slow or incomplete.
Prompt metadata does not enforce tenant or source predicates against stored rows.
Newest chunks applies metadata but ignores semantic similarity, which is required for the retrieval pattern.
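
The ordering problem in this question can be reproduced without a database. The following is a minimal sketch, assuming a tiny in-memory stand-in for the rag_chunks table with made-up two-dimensional embeddings; the row values, the helper names, and the limit of 2 are illustrative assumptions, and only the SQL text at the top mirrors the query shape described in the explanation.

```python
import math

# Query shape described in the explanation: metadata predicates first,
# then vector-distance ranking, then LIMIT (parameter names follow the trace).
FILTERED_SQL = """
SELECT chunk_id, tenant_id, source_system, content
FROM rag_chunks
WHERE tenant_id = :tenant_id AND source_system = :source
ORDER BY embedding <=> :query_embedding
LIMIT 20;
"""

# Hypothetical rows standing in for rag_chunks (embeddings are made up).
ROWS = [
    {"chunk_id": 1, "tenant_id": "fabrikam", "source_system": "policy", "embedding": [1.0, 0.0]},
    {"chunk_id": 2, "tenant_id": "fabrikam", "source_system": "policy", "embedding": [0.9, 0.1]},
    {"chunk_id": 3, "tenant_id": "contoso", "source_system": "policy", "embedding": [0.8, 0.2]},
    {"chunk_id": 4, "tenant_id": "contoso", "source_system": "policy", "embedding": [0.7, 0.3]},
    {"chunk_id": 5, "tenant_id": "contoso", "source_system": "hr", "embedding": [1.0, 0.0]},
]


def cosine_distance(a, b):
    """Cosine distance, the metric behind the <=> operator in the trace."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def matches(row, tenant_id, source_system):
    return row["tenant_id"] == tenant_id and row["source_system"] == source_system


def rank_then_filter(query, tenant_id, source_system, limit):
    """Failing pattern: global top-k first, metadata filter afterwards."""
    top = sorted(ROWS, key=lambda r: cosine_distance(r["embedding"], query))[:limit]
    return [r["chunk_id"] for r in top if matches(r, tenant_id, source_system)]


def filter_then_rank(query, tenant_id, source_system, limit):
    """Intended pattern: restrict by metadata, then rank the candidates."""
    candidates = [r for r in ROWS if matches(r, tenant_id, source_system)]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r["embedding"], query))
    return [r["chunk_id"] for r in ranked[:limit]]


query = [1.0, 0.0]
lost = rank_then_filter(query, "contoso", "policy", limit=2)
found = filter_then_rank(query, "contoso", "policy", limit=2)
```

In this toy data the two globally closest rows belong to another tenant or source, so the first function returns nothing for contoso/policy while the second returns both matching chunks.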

Question 2

Topic: Develop Containerized Solutions on Azure

A team deploys a Python RAG API to Azure Container Apps from Azure Container Registry. The new revision never becomes ready.

Deployment path:

Build task -> ACR repository -> Container Apps revision -> Running replica

Build task: Succeeded; pushed rag-api:20260516
Revision: rag-api--k9r2 created; desired replicas: 1; ready: 0
Revision event: Image pull failed for contosoacr.azurecr.io/rag-api:20260516
Detail: dial tcp 10.4.2.5:443 i/o timeout
App logs: no stdout or stderr for rag-api--k9r2

Which evidence identifies the deployment failure point?

Options:

A. Application logs for the missing container stdout
B. Revision event with the ACR image-pull timeout
C. KEDA scale history for the revision replicas
D. Build log with the pushed image digest

Best answer: B

Explanation: In a Container Apps deployment, the image must first be pulled from the registry before the container process can start and emit application logs. The build task succeeded, so the image was created and pushed. The revision was created, but the event shows the revision could not pull the image from ACR because the connection to the registry endpoint timed out. That points to registry reachability or related connectivity evidence between the Container Apps environment and ACR. Missing stdout is expected when the container never starts, and scaling evidence is secondary because provisioning failed first.

Application logs are absent because the container never started, so they cannot diagnose this pre-start pull failure.
Build digest confirms an image was pushed, but not that Container Apps can reach the registry.
Scale history is not decisive because the revision failed during provisioning before scaling behavior matters.

Question 3

Topic: Connect to and Consume Azure Services

An AI document-processing API runs in Azure Container Apps and uses the Azure Service Bus SDK to enqueue one processing command per upload. During burst tests, p95 enqueue latency increases and Service Bus metrics show many short-lived AMQP connections from each replica. The current handler creates a ServiceBusClient and queue sender, sends one message, and closes them for every HTTP request. The app must continue using managed identity and the queue’s retry/dead-letter behavior. Which SDK-based access pattern best improves efficiency?

Options:

A. Reuse one ServiceBusClient and sender per process.
B. Store a Service Bus connection string in App Configuration.
C. Emit upload notifications through Event Grid instead.
D. Cache pending commands in Azure Managed Redis.

Best answer: A

Explanation: Azure SDK clients that manage network connections should usually be created once per application instance and reused. In this scenario, the performance symptom is many short-lived AMQP connections caused by constructing and closing Service Bus SDK objects inside every request. A long-lived ServiceBusClient created with managed identity, plus a reused queue sender, reduces connection churn and lowers enqueue latency under burst load. This keeps the existing Service Bus queue semantics, including retry and dead-letter handling. Switching services or moving credentials does not address the connection lifecycle problem.

Event Grid swap fails because it changes the integration pattern and does not preserve queue retry and dead-letter behavior.
Redis buffering adds another component but does not provide durable Service Bus queue semantics for commands.
Connection string storage weakens the managed identity requirement and does not reduce per-request client creation overhead.
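
The client-lifecycle point can be sketched without the Azure SDK. The example below is an illustration under stated assumptions: FakeServiceBusClient is a hypothetical stand-in that only counts how often a "connection" is opened, and get_client and handle_upload are made-up names. In a real app the cached object would be the SDK's ServiceBusClient (built with a managed identity credential) and its queue sender.

```python
from functools import lru_cache


class FakeServiceBusClient:
    """Hypothetical stand-in for an SDK client; counts opened connections."""

    connections_opened = 0

    def __init__(self):
        # Constructing a real client is what triggers connection setup.
        FakeServiceBusClient.connections_opened += 1
        self.sent = []

    def send(self, body):
        self.sent.append(body)


@lru_cache(maxsize=1)
def get_client():
    """Create the client on first use, then reuse it for the whole process."""
    return FakeServiceBusClient()


def handle_upload(document_id):
    """Request handler: reuses the shared client instead of building one."""
    get_client().send({"command": "process", "document_id": document_id})
    return 202


for doc_id in ("a", "b", "c"):
    handle_upload(doc_id)
```

Three requests share one client here, which is the behavior that removes the short-lived connection churn described in the scenario.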

Question 4

Topic: Connect to and Consume Azure Services

An Azure Function processes embedding jobs from Azure Service Bus. After deployment, queue depth grows and the host logs show the trigger cannot start. The team wants the Functions host to manage the Service Bus listener and avoid a secret lookup in user code on each invocation. The function app’s managed identity can read the secret from Key Vault.

Exhibit:

Trigger type: Service Bus queue
Queue: embedding-jobs
Binding connection: ServiceBusIngest
Log: connection setting 'ServiceBusIngest' was not found

Which Function app application setting should you add?

Options:

A. SERVICEBUS_NAMESPACE = Service Bus fully qualified namespace
B. ServiceBusIngest = Key Vault reference to the Service Bus connection string
C. ServiceBusIngest = App Configuration endpoint URL
D. AzureWebJobsStorage = Service Bus connection string

Best answer: B

Explanation: For an Azure Functions binding, the connection property names the Function app application setting the host uses to connect to the target service. Here, the Service Bus trigger specifies ServiceBusIngest, so the deployed app needs an application setting with that exact name. Using a Key Vault reference lets the platform resolve the secret securely while the Functions host initializes the trigger and manages the listener efficiently. That avoids adding per-invocation secret retrieval or client setup in function code.

AzureWebJobsStorage is for the Functions runtime storage account, not the Service Bus queue trigger connection.

Runtime storage mix-up fails because AzureWebJobsStorage is not the binding connection named in the trigger.
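Namespace only fails because SERVICEBUS_NAMESPACE is not the setting name the binding looks up; an identity-based connection would need settings under the ServiceBusIngest prefix, such as ServiceBusIngest__fullyQualifiedNamespace.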
App Configuration endpoint fails because it does not give the trigger host the Service Bus credential at startup.
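
The name-matching rule can be checked with a small pre-deployment script. This is a sketch under assumptions: the vault URL and secret name are placeholders, and unresolved_connections is a hypothetical helper, not part of any Azure tooling. It treats a connection as resolvable when an app setting has the exact binding name or uses the double-underscore prefix form used by identity-based connections.

```python
def key_vault_reference(secret_uri: str) -> str:
    """Build an app-setting value in Key Vault reference syntax."""
    return f"@Microsoft.KeyVault(SecretUri={secret_uri})"


def unresolved_connections(bindings, app_settings):
    """Return binding connection names that no app setting can satisfy."""
    missing = []
    for binding in bindings:
        name = binding.get("connection")
        if not name:
            continue
        has_exact = name in app_settings
        # Identity-based connections use settings such as <name>__fullyQualifiedNamespace.
        has_prefixed = any(key.startswith(name + "__") for key in app_settings)
        if not (has_exact or has_prefixed):
            missing.append(name)
    return missing


# Binding values taken from the exhibit.
bindings = [
    {"type": "serviceBusTrigger", "queueName": "embedding-jobs", "connection": "ServiceBusIngest"}
]

before = {"AzureWebJobsStorage": "<runtime storage setting>"}
after = dict(
    before,
    # Placeholder vault and secret names, for illustration only.
    ServiceBusIngest=key_vault_reference(
        "https://example-vault.vault.azure.net/secrets/servicebus-ingest/"
    ),
)
```

Running the check against `before` reports ServiceBusIngest as unresolved, matching the host log in the exhibit; against `after` it reports nothing.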

Question 5

Topic: Connect to and Consume Azure Services

A team is building a Python Azure Functions API for AI document enrichment. Each enrichment job can run for several minutes. The API endpoint must return in under 2 seconds with a job ID, create a durable work item that one background worker can claim, support retries with poison-job handling, and avoid hard-coded service secrets. Which implementation should you use?

Options:

A. Store pending jobs in Cosmos DB and poll them with a timer-triggered Function.
B. Publish an Event Grid custom event and process it with an Event Grid-triggered Function.
C. Use an HTTP trigger with an identity-based Service Bus queue output binding; process jobs with a Service Bus-triggered Function.
D. Run the enrichment inside the HTTP-triggered Function and increase the function timeout.

Best answer: C

Explanation: For long-running AI work, the Function-based API should accept the request, create a job ID, enqueue a command-style work item, and return 202 Accepted quickly. Azure Service Bus queues are designed for durable background work where one worker claims a message, failed processing can be retried, and poison messages can be moved to a dead-letter queue. A Service Bus-triggered Function then performs the enrichment asynchronously. Using an identity-based binding or secure configuration avoids embedding connection strings in code. Event Grid is better for event notification, and polling a database adds custom retry and poison-message logic.

Synchronous processing violates the under-2-second response goal and ties client availability to the AI job duration.
Event notification does not best match durable command work that needs worker claiming and poison-message handling.
Database polling adds latency and custom workflow logic instead of using native trigger and binding behavior.
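
The accept, enqueue, and dead-letter flow can be modeled in a few lines. The sketch below simulates the queue with an in-memory deque, so it only illustrates the control flow; MAX_DELIVERY_COUNT, accept_job, and process_next are assumptions for this example. In the design the question describes, Service Bus and the Functions trigger would provide these behaviors rather than application code.

```python
import uuid
from collections import deque

MAX_DELIVERY_COUNT = 3  # assumed value for this sketch

work_queue = deque()       # stands in for the Service Bus queue
dead_letter_queue = []     # stands in for the dead-letter queue


def accept_job(document_url):
    """HTTP side: create a job ID, enqueue a command, return quickly."""
    job_id = str(uuid.uuid4())
    work_queue.append(
        {"job_id": job_id, "document_url": document_url, "delivery_count": 0}
    )
    return 202, {"job_id": job_id}


def process_next(enrich):
    """Worker side: claim one message, retry on failure, dead-letter when exhausted."""
    if not work_queue:
        return None
    message = work_queue.popleft()
    message["delivery_count"] += 1
    try:
        enrich(message["document_url"])
        return "completed"
    except Exception:
        if message["delivery_count"] >= MAX_DELIVERY_COUNT:
            dead_letter_queue.append(message)
            return "dead-lettered"
        work_queue.append(message)
        return "retried"


def always_fails(url):
    raise RuntimeError("enrichment failed")


status, body = accept_job("https://example.invalid/doc.pdf")
outcomes = [process_next(always_fails) for _ in range(3)]
```

The caller receives 202 and a job ID immediately, while the poison job ends up in the dead-letter list after its third failed delivery.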

Question 6

Topic: Develop Containerized Solutions on Azure

An Azure Container Apps worker uses the Azure Service Bus SDK to process queue messages and then query Azure Cosmos DB for NoSQL. It scales with KEDA based on queue length. After a release, queue age increased, and the team suggests raising the replica limit. You must reduce delay without increasing Cosmos DB throttling.

Exhibit: Current observations

KEDA rule: Service Bus queue length, max replicas = 20
Current: 20 replicas, CPU 24%, memory 48%
Cosmos DB query p95: 1.9 seconds, cross-partition
Cosmos DB 429 retries: increased under load
Load test at 40 replicas: same throughput, more 429 retries

Which implementation should you choose?

Options:

A. Move the worker to an Azure Functions Service Bus trigger.
B. Add a CPU scaler to increase replicas when CPU is low.
C. Increase KEDA max replicas and lower the queue-length target.
D. Optimize the Cosmos DB query path before changing KEDA.

Best answer: D

Explanation: KEDA is appropriate when more replicas can independently process more work. Here, the worker is already at the replica limit, CPU is low, and the slow step is a cross-partition Cosmos DB query with 429 retries. The load test confirms that extra replicas do not drain the queue faster and instead increase downstream throttling. The implementation should keep scaling at a safe level and fix the per-message bottleneck, such as query shape, partition filtering, indexing, request-unit capacity, or client-side concurrency. Scaling the container app is not a substitute for resolving a saturated downstream dependency.

More replicas fails because the load test already showed unchanged throughput and more Cosmos DB 429 retries.
CPU scaling fails because CPU is not saturated and low CPU does not indicate a need for more workers.
Changing hosts fails because a Functions trigger would still hit the same slow and throttled Cosmos DB query path.

Question 7

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps-hosted Python RAG API started returning 500 errors immediately after a new revision was deployed. The revision is intended to use the production managed identity. You need to decide whether the cause is operational or design-related and choose the next step.

Evidence source	Finding
KQL logs	`SecretClient.get_secret` returns `403 Forbidden` for `cosmos-key`
OpenTelemetry trace	Failed span is `KeyVault.GetSecret`; no Cosmos DB span starts
Metrics	CPU, memory, and request volume are unchanged
Configuration	Expected identity is `mi-rag-prod`; new revision uses `mi-rag-test`

What should you do next?

Options:

A. Move Key Vault secrets into App Configuration.
B. Deploy a revision using the authorized managed identity.
C. Add Redis caching for retrieval responses.
D. Increase Cosmos DB Request Units for vector queries.

Best answer: B

Explanation: The combined evidence indicates an operational deployment/configuration issue, not a design flaw or capacity problem. The failure begins at KeyVault.GetSecret, the trace never reaches Cosmos DB, and platform metrics do not show increased load. The configuration also shows the new revision using mi-rag-test while the intended authorized identity is mi-rag-prod. The next step is to correct the revision’s managed identity configuration and redeploy, then validate that traces proceed past Key Vault to the downstream data call. Capacity or caching changes would optimize later parts of the request path, but this request is failing before retrieval starts.

RU increase targets database throughput, but the trace shows the request never reaches Cosmos DB.
App Configuration move misuses nonsecret configuration storage and does not fix the denied Key Vault access.
Redis caching is premature because the failure occurs before retrieval or response caching can help.

Question 8

Topic: Develop AI Solutions by Using Azure Data Management Services

A containerized RAG API on Azure Container Apps stores embeddings in Azure Database for PostgreSQL. The table uses the correct vector data type, metadata filters, and an appropriate vector index. KQL and database metrics show p95 similarity-search latency above target, CPU at 92%, memory pressure during queries, and storage I/O below 40%. Which resource adjustment is the best design fit?

Options:

A. Scale PostgreSQL to a larger memory-optimized compute SKU
B. Increase only the allocated PostgreSQL storage size
C. Move durable embeddings to Azure Managed Redis
D. Add Service Bus between the API and PostgreSQL

Best answer: A

Explanation: Vector similarity search in Azure Database for PostgreSQL can be compute- and memory-intensive even when the schema and vector index are appropriate. In this scenario, the key evidence is high CPU and memory pressure while storage I/O is not saturated. The best resource adjustment is to scale the PostgreSQL server to a compute option with more vCores and memory, such as a larger memory-optimized SKU. That preserves the existing PostgreSQL vector workload and addresses the measured bottleneck directly. Storage expansion helps only when capacity or I/O is the limiting factor, and queuing or cache substitution does not fix slow synchronous database search.

Storage-only scaling misses the observed CPU and memory bottleneck and would not materially reduce similarity-search latency.
Redis as durable storage drifts from the PostgreSQL vector workload and changes the source-of-truth design.
Service Bus buffering can smooth asynchronous work but does not improve the latency of a synchronous vector query.

Question 9

Topic: Develop AI Solutions by Using Azure Data Management Services

An AI metadata API uses Azure Cosmos DB for NoSQL to retrieve document records before vector retrieval. A new case-insensitive category filter caused RU spikes. The team must reduce RU without removing tenant isolation or changing the filter semantics. The container partition key is /tenantId.

Query:
SELECT c.id, c.title
FROM c
WHERE c.tenantId = @tenantId
  AND LOWER(c.category) = @category
  AND c.embeddingModel = @model

Observed:
Cross-partition: false
Consistency: Session
Index note: /tenantId and /embeddingModel used
LOWER(c.category): evaluated after retrieval
Retrieved: 24,900 docs
Returned: 37 docs
Request charge: 910 RU

Which control should the developer implement?

Options:

A. Add a composite index but keep LOWER(c.category).
B. Change the reads to Strong consistency.
C. Use an indexed normalizedCategory equality filter in the tenant-scoped query.
D. Repartition the container by /category.

Best answer: C

Explanation: The evidence points to application query shape, not partitioning or consistency. The request is already single-partition because the query includes /tenantId, and Session consistency is not the source of the high RU charge. The large gap between retrieved and returned documents, combined with the note that LOWER(c.category) is evaluated after retrieval, shows that many candidate documents are read before the final filter is applied. Storing a normalized category value, such as lowercasing it at write time, and querying that indexed field with equality preserves case-insensitive behavior while keeping tenant isolation.

Repartitioning by category ignores the evidence that the current query is already not cross-partition.
Strong consistency would typically increase read cost or latency and does not fix the inefficient predicate.
Composite index only does not make a function-wrapped LOWER(c.category) predicate index-served.
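
One way to picture the fix is to normalize at write time and compare with plain equality at read time. This is a minimal sketch: normalizedCategory comes from the answer option, while with_normalized_category and build_parameters are hypothetical helper names, and Python's str.lower() is assumed to be an acceptable match for the original LOWER() semantics.

```python
def with_normalized_category(document):
    """Return a copy of the item with a lowercased category stored alongside."""
    normalized = dict(document)
    normalized["normalizedCategory"] = document["category"].lower()
    return normalized


# Tenant-scoped query with an equality filter on the stored, normalized value.
QUERY = (
    "SELECT c.id, c.title FROM c "
    "WHERE c.tenantId = @tenantId "
    "AND c.normalizedCategory = @category "
    "AND c.embeddingModel = @model"
)


def build_parameters(tenant_id, category, model):
    """Lowercase the caller's category so matching stays case-insensitive."""
    return [
        {"name": "@tenantId", "value": tenant_id},
        {"name": "@category", "value": category.lower()},
        {"name": "@model", "value": model},
    ]


item = with_normalized_category(
    {"id": "1", "tenantId": "contoso", "category": "Policy", "title": "Leave policy"}
)
params = build_parameters("contoso", "POLICY", "example-embedding-model")
```

Existing items would also need the new property backfilled before the query switches over, which this sketch does not cover.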

Question 10

Topic: Develop Containerized Solutions on Azure

A team deploys a custom container to Azure App Service. The same image must be promoted unchanged from staging to production. The Python app reads MODEL_ENDPOINT and POSTGRES_PASSWORD only from environment variables. Security requires that secrets are not stored in the image, Dockerfile, or repository. What should you configure to meet the requirement with the least operational risk?

Options:

A. Set App Service app settings with Key Vault references.
B. Define the variables only in the ACR build task.
C. Store the values in a config file inside the image.
D. Add ENV values to the Dockerfile before building.

Best answer: A

Explanation: For a custom container hosted in Azure App Service, application settings are injected into the running container as environment variables. This lets the same image move across environments while each App Service instance supplies its own runtime configuration. Nonsecret values such as MODEL_ENDPOINT can be stored directly as app settings. Secret values such as POSTGRES_PASSWORD should be referenced from Azure Key Vault, typically using a managed identity, so the secret is retrieved at runtime without being committed to source or baked into the image. Build-time variables or image files do not satisfy the requirement because they make configuration part of the artifact rather than the deployment environment.

Dockerfile ENV values fail because they bake environment-specific configuration into the container image.
Image config files fail because updating settings would require rebuilding or republishing the image.
ACR build variables fail because they affect image build automation, not the App Service runtime environment.
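
On the application side, this pattern only requires reading the two variables at startup. The following is a small sketch, assuming a hypothetical Settings dataclass and load_settings helper; the variable names come from the question, and the sample values in the usage lines are placeholders.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    model_endpoint: str
    # repr=False keeps the secret out of logs that print the settings object.
    postgres_password: str = field(repr=False)


def load_settings(environ=None):
    """Read runtime configuration from environment variables, failing fast."""
    env = os.environ if environ is None else environ
    required = ("MODEL_ENDPOINT", "POSTGRES_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        model_endpoint=env["MODEL_ENDPOINT"],
        postgres_password=env["POSTGRES_PASSWORD"],
    )


# Placeholder values; at runtime these would come from App Service app settings.
settings = load_settings(
    {"MODEL_ENDPOINT": "https://example.invalid/model", "POSTGRES_PASSWORD": "placeholder"}
)
```

Because nothing environment-specific lives in the image, the same container can be promoted from staging to production and pick up different values in each.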

Question 11

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A containerized Python API uses a model endpoint name, feature flags, retry counts, and an Azure Database for PostgreSQL password. Security requires least-privilege access to secrets and rotation without redeploying the container. Operations must update feature flags and retry counts without being granted secret permissions. Which design best meets these requirements?

Options:

A. Store secrets in Key Vault and nonsecrets in App Configuration.
B. Store all settings in Key Vault.
C. Bake all settings into the container image.
D. Store all settings in App Configuration.

Best answer: A

Explanation: Key Vault is the appropriate store for sensitive values such as passwords, API keys, and connection secrets because it supports controlled access, auditing, and rotation patterns. Azure App Configuration is better suited for nonsecret application settings such as feature flags, endpoint names, retry counts, and other runtime behavior flags. In this scenario, separating the stores lets security grant the app identity secret access without giving operations secret permissions, while still allowing operations to change nonsecret behavior safely. Baking values into images or using one store for everything either weakens secret handling or makes operational changes harder than required.

All in Key Vault over-restricts ordinary configuration updates and does not fit the requirement for operations to manage nonsecret settings separately.
All in App Configuration fails because database passwords should not be treated as ordinary nonsecret configuration.
Container image settings fail because rotation or configuration changes would require rebuilding and redeploying the image.

Question 12

Topic: Develop AI Solutions by Using Azure Data Management Services

A chat retrieval API uses Azure Cosmos DB for NoSQL to store chunk metadata for an AI assistant. Users must see chunks they uploaded in their own next chat turn; global latest-order across users is not required.

Flow trace:

Upload -> write chunk item (partition key: tenantId)
Chat -> pass upload session token to read
Query -> same tenant:
  WHERE c.tenantId = @tenant AND c.docType = @type
  ORDER BY c.updatedAt DESC
Consistency: Session
Indexing:
  included paths: /tenantId/?, /docType/?, /updatedAt/?
  composite indexes: none
Metrics:
  high RU, high retrieved documents, low index utilization

Which change is the most durable Cosmos DB optimization for this query path?

Options:

A. Change chat reads to eventual consistency.
B. Remove updatedAt from the indexing policy.
C. Add a composite index matching docType and updatedAt.
D. Increase provisioned throughput for the container.

Best answer: C

Explanation: The durable optimization is to make the existing query use an index pattern that matches how it filters and sorts. The app already uses Session consistency with the upload session token, which supports the stated read-your-writes requirement. The visible failure point is the query plan evidence: high RU, high retrieved document count, low index utilization, and no composite index for the filter-plus-ORDER BY path. Increasing throughput may reduce throttling if throttling exists, but it does not reduce RU per query. Weakening consistency is not justified because the requirement depends on seeing the user’s own write.

Eventual consistency can break the stated read-your-writes behavior for the user’s next chat turn.
More throughput does not address the inefficient query shape shown by low index utilization.
Removing the sort path would make the ORDER BY updatedAt path less index-friendly, not more efficient.
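
For orientation, a composite index for this path might look like the policy fragment below, expressed as a Python dictionary in the JSON shape Cosmos DB indexing policies use. Treat it as a sketch rather than a verified policy: the included paths are copied from the trace, the sort orders are an assumption, and the query's ORDER BY may need to list the filter property as well (for example docType before updatedAt) for the composite index to be used.

```python
import json

# Indexing policy fragment; only compositeIndexes is new relative to the trace.
indexing_policy = {
    "indexingMode": "consistent",
    "includedPaths": [
        {"path": "/tenantId/?"},
        {"path": "/docType/?"},
        {"path": "/updatedAt/?"},
    ],
    "compositeIndexes": [
        [
            # Filter property first, then the sort property.
            {"path": "/docType", "order": "ascending"},
            {"path": "/updatedAt", "order": "descending"},
        ]
    ],
}

# Serialized form, as it would appear in a policy document.
policy_json = json.dumps(indexing_policy, indent=2)
composite_paths = [entry["path"] for entry in indexing_policy["compositeIndexes"][0]]
```

After a policy change like this, the request charge and retrieved-document count from the trace are the numbers to compare before and after.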

Question 13

Topic: Develop Containerized Solutions on Azure

A team is deploying a containerized Python RAG API for an Azure AI cloud solution. The image is already built and stored in Azure Container Registry. The platform team has provisioned the AKS cluster, namespace, node pools, networking, ingress controller, and ACR pull permissions. The app team must deploy each image version, set replicas and health probes, expose an internal endpoint, and keep deployment definitions in source control. Which design best fits?

Options:

A. Build a new AKS cluster with updated networking and identity.
B. Move the API to Azure Container Apps with KEDA scaling.
C. Version and apply Kubernetes Deployment, Service, and ConfigMap manifests.
D. Reconfigure AKS node pools and cluster autoscaler settings.

Best answer: C

Explanation: This scenario is about application lifecycle within an existing AKS cluster, not administering the cluster. The platform prerequisites—namespace, node pools, networking, ingress, and ACR permissions—are already in place, and the app team only needs to manage resources inside the namespace. For AI-200, that points to manifest-based application management: a Deployment for image tag, replicas, probes, and environment references; a Service for internal exposure; and ConfigMap references for nonsecret settings. CI can apply versioned YAML manifests for controlled rollouts. Changing node pools, networking, identities, or creating a new cluster shifts into broader AKS administration and ignores the stated boundary.

Cluster scaling changes miss the app-team scope because node pools and autoscaler settings are platform-level concerns.
Container Apps migration changes the hosting platform even though AKS is already selected and provisioned.
New cluster creation overbuilds the solution and ignores that the platform team has already provisioned the cluster, networking, and identity prerequisites.

Question 14

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A team monitors an Azure Container Apps-hosted Python API that serves RAG responses. The API uses Azure Cosmos DB for NoSQL and Azure Managed Redis. During a latency incident, all health probes remain healthy.

Signal	Observation
Container Apps	CPU 35%, memory 50%, replica cap not reached
Cosmos DB	0 throttles, p95 server latency 8 ms
Managed Redis	p95 server latency 2 ms, hit ratio 86%
App traces	p95 request 1,800 ms; client initialization spans run on each request

Which change is most likely to improve performance while keeping the current services?

Options:

A. Increase Cosmos DB Request Units for the container.
B. Shorten the Redis TTL so entries refresh more often.
C. Lower the Container Apps scale threshold to add replicas earlier.
D. Reuse Cosmos DB and Redis SDK clients per process.

Best answer: D

Explanation: The evidence points to an application-level bottleneck, not a service-level bottleneck. Container Apps is not CPU or memory constrained, Cosmos DB has no throttling and low server latency, and Redis has low server latency with a healthy hit ratio. The slow part appears in application traces: client initialization spans run on each request and consume much of the request time. Reusing SDK clients for Cosmos DB and Redis per process avoids repeated connection setup and improves efficiency without changing service capacity or weakening reliability. Scaling replicas or increasing database throughput would target symptoms that are not present in the measurements.

More RUs fails because there are no Cosmos DB throttles or high server-side latency.
Earlier scaling fails because Container Apps resource usage is low and the replica cap is not being reached.
Shorter TTL fails because it would reduce cache effectiveness and increase downstream work.

Question 15

Topic: Develop AI Solutions by Using Azure Data Management Services

An Azure Database for PostgreSQL-backed RAG service has finished generating 1,536-dimension embeddings. The prototype table stores each chunk in one column:

chunk_id uuid
payload text  -- chunk text, source URI, metadata JSON, embedding array

Production queries must run vector similarity search, filter by tenant, document type, and date range, and return source citations. What is the best next implementation step?

Options:

A. Cache similar query results in Azure Managed Redis first.
B. Load the prototype table and parse payload during each query.
C. Create a full-text index on payload before storing embeddings.
D. Redesign the table with vector(1536), typed filter/source columns, and jsonb metadata.

Best answer: D

Explanation: For PostgreSQL-backed RAG retrieval, the schema should separate data by how it will be used. Embeddings should be stored in a vector data type with the correct dimensionality so vector similarity operations and vector indexes can be applied. Frequently filtered attributes such as tenant, document type, and dates should be typed columns, not buried in text, so they remain easy to filter and index. Source references such as document ID, URI, and chunk position should also be explicit columns for reliable citations. jsonb is useful for flexible metadata that is not part of the primary filter path. Load data and tune indexes only after the table structure supports the required retrieval pattern.

Parsing on read keeps the prototype shape and makes filtering, citations, and vector operations harder to optimize.
Full-text first addresses keyword search, not the required vector similarity and structured filtering model.
Caching first is premature because the durable table design still cannot support the required queries cleanly.
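
One possible shape for the redesigned table is sketched below as SQL held in a Python constant. The table name, column names, and index names are hypothetical, and the HNSW index assumes the pgvector extension (version 0.5.0 or later) is available on the server.

```python
# Hypothetical schema for the redesigned chunk table; names are illustrative only.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS rag_chunks (
    chunk_id       uuid PRIMARY KEY,
    tenant_id      text         NOT NULL,
    document_type  text         NOT NULL,
    document_date  date         NOT NULL,
    document_id    text         NOT NULL,
    source_uri     text         NOT NULL,
    chunk_position integer      NOT NULL,
    chunk_text     text         NOT NULL,
    metadata       jsonb        NOT NULL DEFAULT '{}'::jsonb,
    embedding      vector(1536) NOT NULL
);

CREATE INDEX IF NOT EXISTS rag_chunks_filter_idx
    ON rag_chunks (tenant_id, document_type, document_date);

CREATE INDEX IF NOT EXISTS rag_chunks_embedding_idx
    ON rag_chunks USING hnsw (embedding vector_cosine_ops);
"""


def schema_statements():
    """Split the script into individual statements for execution one at a time."""
    return [part.strip() for part in SCHEMA_SQL.split(";") if part.strip()]
```

Typed filter and citation columns sit beside the vector column, while jsonb holds only the flexible metadata that is not on the primary filter path.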

Question 16

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A Python Azure Functions app uses a Service Bus trigger to process embedding-refresh messages and query an Azure Database for PostgreSQL vector store. After a deployment, message backlog and retrieval latency increased. You must fix the implementation without changing the database tier or vector query semantics.

Exhibit: Recent evidence

Signal	Value
Function scale-out	4 to 30 instances
PostgreSQL connections	490/500 active
PostgreSQL CPU/memory	35% / 48%
App traces	`PoolTimeout`, `too many clients`
Current code	New pool per invocation

Which change should you implement?

Options:

A. Cache every vector query result in Azure Managed Redis.
B. Rebuild the vector index with a new distance metric.
C. Increase Service Bus trigger concurrency.
D. Reuse a bounded PostgreSQL pool per worker process.

Best answer: D

Explanation: The evidence points to connection pressure, not database compute pressure or vector-index inefficiency. PostgreSQL CPU and memory are moderate, but active connections are nearly exhausted and the app logs show pool timeouts and too many clients. Because the current code creates a new pool per invocation, scale-out multiplies open connections quickly. A module-level or worker-level pool with a bounded maximum, reused across invocations, preserves the same query behavior while preventing connection storms. The key is to align Function concurrency and pool sizing with the database connection capacity instead of trying to drain the queue by creating more simultaneous database clients.

More trigger concurrency would likely worsen the connection storm because each concurrent invocation can demand more database connections.
Changing the vector metric addresses retrieval semantics or indexing, but the provided evidence shows connection exhaustion rather than poor vector index selection.
Caching every result does not fix saturated connections and may introduce stale retrieval results for embedding-refresh processing.
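
The sizing arithmetic can be made explicit. The sketch below divides the usable connection budget across the maximum number of Function instances from the exhibit (500 connections, 30 instances). The 50 reserved connections, the PG_CONNINFO setting name, and the helper names are assumptions, and the psycopg_pool import is deferred so the arithmetic runs without that package installed.

```python
import os
import threading


def pool_max_size(db_max_connections, reserved, max_instances, workers_per_instance=1):
    """Largest per-worker pool that keeps total connections within the database limit."""
    usable = db_max_connections - reserved
    workers = max_instances * workers_per_instance
    if usable <= 0 or workers <= 0:
        raise ValueError("connection budget and worker count must be positive")
    # Never return 0; a budget smaller than the worker count needs fewer instances instead.
    return max(1, usable // workers)


_pool = None
_pool_lock = threading.Lock()


def get_pool():
    """Create one bounded pool per worker process and reuse it across invocations."""
    global _pool
    with _pool_lock:
        if _pool is None:
            # Deferred import: assumes the psycopg_pool package is installed.
            from psycopg_pool import ConnectionPool

            _pool = ConnectionPool(
                os.environ["PG_CONNINFO"],  # hypothetical setting name
                min_size=1,
                max_size=pool_max_size(500, 50, 30),  # 50 reserved is an assumption
                open=True,
            )
    return _pool
```

Under these assumptions each worker may hold at most 15 connections, so 30 instances stay within 450 of the 500 available.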

Question 17

Topic: Develop AI Solutions by Using Azure Data Management Services

A Python RAG API in Azure Container Apps must query items in an Azure Cosmos DB for NoSQL container by using the Azure Cosmos DB SDK. The container image must not contain secrets, and the app’s managed identity has Cosmos DB data-plane permissions. Which TWO connection details or client configurations are needed for the SDK query path? Select TWO.

Options:

A. An Azure Storage connection string for the account data plane
B. A Service Bus queue name for routing query requests
C. The database ID and container ID for the container client
D. The account endpoint URI and managed identity credential for CosmosClient
E. A SQL Server connection string that uses TCP port 1433
F. A hard-coded Cosmos DB account key in the container image

Correct answers: C, D

Explanation: An SDK query path for Azure Cosmos DB for NoSQL starts by constructing a CosmosClient for the target account. In this scenario, the app should supply the account endpoint and use an Azure Identity credential, such as DefaultAzureCredential, so the managed identity is used instead of embedding a secret. After the client is created, the code must resolve the database and container by their configured IDs, then run the query on the container client. Other Azure connection strings or messaging settings do not establish a Cosmos DB for NoSQL query path.

Storage connection string fails because Azure Storage credentials do not connect the Cosmos DB for NoSQL SDK to a container.
SQL Server settings fail because Cosmos DB for NoSQL queries are not made through TDS on port 1433.
Hard-coded account key fails because it violates the stated no-secrets-in-image requirement.
Service Bus routing fails because message routing is unrelated to creating a Cosmos DB SDK query client.
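
The query path in the explanation can be sketched as follows. The setting names (COSMOS_ENDPOINT, COSMOS_DATABASE_ID, COSMOS_CONTAINER_ID) and the item properties tenantId and sourceUri are hypothetical. The SDK imports are deferred so the query helper can be tried against a stand-in container object.

```python
import os


def open_container():
    """Build the client from the endpoint and a credential, then resolve the container."""
    # Deferred imports: assumes azure-cosmos and azure-identity are installed.
    from azure.cosmos import CosmosClient
    from azure.identity import DefaultAzureCredential

    client = CosmosClient(os.environ["COSMOS_ENDPOINT"], credential=DefaultAzureCredential())
    database = client.get_database_client(os.environ["COSMOS_DATABASE_ID"])
    return database.get_container_client(os.environ["COSMOS_CONTAINER_ID"])


def find_chunks(container, tenant_id):
    """Run a parameterized query on the container client and return the items as a list."""
    items = container.query_items(
        query="SELECT c.id, c.sourceUri FROM c WHERE c.tenantId = @tenantId",
        parameters=[{"name": "@tenantId", "value": tenant_id}],
        enable_cross_partition_query=True,
    )
    return list(items)
```

No account key appears anywhere in this flow: the endpoint and the two IDs are nonsecret configuration, and the credential resolves to the managed identity at run time.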

Question 18

Topic: Connect to and Consume Azure Services

A containerized AI summarization API runs in Azure Container Apps. Its managed identity has the Azure Service Bus Data Sender role. The app must enqueue work items without storing connection strings.

Trace:

POST /summaries
read queueName from App Configuration
build https://orders.servicebus.windows.net/summarize/messages
POST JSON payload without an Authorization header
result: 401 Unauthorized

Which change supplies the missing SDK-based access pattern?

Options:

A. Publish the payload with EventGridPublisherClient to the queue URL.
B. Read a Key Vault secret and repeat the same unsigned REST call.
C. Store the payload as a key by using the App Configuration SDK.
D. Create a ServiceBusClient with DefaultAzureCredential and send to the queue.

Best answer: D

Explanation: For Service Bus access from an Azure-hosted AI service, the SDK-based pattern is to create a service-specific client using Microsoft Entra authentication, typically through DefaultAzureCredential or managed identity. The visible flow has the configuration lookup, queue name, and permission assignment, but it bypasses the Service Bus SDK and sends an unsigned REST request, so no bearer token is attached. Using ServiceBusClient with the fully qualified namespace and a queue sender lets the SDK handle token acquisition and message protocol details. App Configuration can provide nonsecret settings, but it is not the messaging boundary. The key takeaway is to use the target service SDK at the integration boundary, not a generic unauthenticated HTTP call.

Event publishing fails because Event Grid is for event notifications, not sending commands directly to a Service Bus queue URL.
Configuration storage fails because App Configuration stores settings, not transient work messages for queue processing.
Secret retrieval fails because repeating the same unsigned REST call still lacks a valid Service Bus authorization token.
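
A minimal sketch of the SDK-based send follows, using the namespace and queue name visible in the trace. The function names and payload shape are hypothetical, and the imports are deferred so the serialization helper runs without the azure-servicebus and azure-identity packages installed.

```python
import json

NAMESPACE = "orders.servicebus.windows.net"  # host taken from the trace URL
QUEUE_NAME = "summarize"  # queue segment taken from the trace URL


def encode_work_item(payload):
    """Serialize the work item as compact JSON for the message body."""
    return json.dumps(payload, separators=(",", ":"))


def enqueue_summary_request(payload):
    """Send one message; the SDK acquires the Microsoft Entra token from the credential."""
    # Deferred imports: assumes azure-servicebus and azure-identity are installed.
    from azure.identity import DefaultAzureCredential
    from azure.servicebus import ServiceBusClient, ServiceBusMessage

    credential = DefaultAzureCredential()
    with ServiceBusClient(fully_qualified_namespace=NAMESPACE, credential=credential) as client:
        with client.get_queue_sender(queue_name=QUEUE_NAME) as sender:
            message = ServiceBusMessage(encode_work_item(payload), content_type="application/json")
            sender.send_messages(message)
```

In a long-running API the client and sender would normally be created once and reused rather than opened per request; they are opened inline here only to keep the sketch short.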

Question 19

Topic: Connect to and Consume Azure Services

A Service Bus topic receives AI workflow events from several tenants. Each message already includes application properties tenantId and eventType. A new audit subscription currently uses the default rule and the audit app discards messages for other tenants in code. Security requires that the audit app must not receive out-of-scope tenant messages, while existing publishers and other subscribers must continue working unchanged. What should you do?

Options:

A. Change the producer to publish audit events to a separate topic.
B. Dead-letter messages that the audit app is not allowed to process.
C. Replace the audit subscription rule with a SQL filter on tenantId and eventType.
D. Keep the default rule and filter unauthorized messages in the audit app.

Best answer: C

Explanation: Service Bus topic subscription filters are the right control when a single subscriber is receiving more messages than it should, and the producer already sends properties that can classify the messages. A SQL filter or correlation filter is evaluated by Service Bus before messages are made available on that subscription. This preserves the existing topic contract and does not affect other subscriptions, because each subscription has its own rules. Filtering in consumer code still delivers unauthorized tenant messages to the app, which violates the security requirement. Dead-lettering is for messages that cannot be processed, not for normal routing decisions. The key distinction is routing control at the subscription versus changing producer or consumer behavior.

Producer split overbuilds the solution and risks disrupting existing publishers when the needed routing properties already exist.
Consumer filtering fails the isolation requirement because unauthorized messages are still delivered to the audit app.
Dead-letter handling is for failed or invalid messages, not legitimate messages that belong to other subscribers.
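
The rule replacement can be sketched with the Service Bus administration client. The tenant and event type values, the rule name audit-scope, and the function names are hypothetical. The sketch also assumes the caller has management rights on the namespace and that the subscription still carries the default rule named $Default.

```python
def audit_filter_expression(tenant_id, event_types):
    """Build a SQL filter over the tenantId and eventType application properties."""

    def quote(value):
        # SQL filter string literals escape a single quote by doubling it.
        return "'" + value.replace("'", "''") + "'"

    types = ", ".join(quote(event_type) for event_type in event_types)
    return f"tenantId = {quote(tenant_id)} AND eventType IN ({types})"


def replace_default_rule(namespace, topic, subscription, expression):
    """Swap the catch-all rule on one subscription for a scoped SQL filter."""
    # Deferred imports: assumes azure-servicebus and azure-identity are installed.
    from azure.identity import DefaultAzureCredential
    from azure.servicebus.management import ServiceBusAdministrationClient, SqlRuleFilter

    with ServiceBusAdministrationClient(namespace, DefaultAzureCredential()) as admin:
        # Add the scoped rule first so in-scope messages are not dropped in between.
        admin.create_rule(topic, subscription, "audit-scope", filter=SqlRuleFilter(expression))
        admin.delete_rule(topic, subscription, "$Default")
```

Only the audit subscription's rules change, so publishers and the other subscriptions on the topic are untouched.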

Question 20

Topic: Connect to and Consume Azure Services

A Python Azure Function in an AI ingestion API should generate embeddings when messages arrive in an Azure Service Bus queue. Messages remain active in ingest-requests, and the function never starts after deployment. You must remediate the deployed configuration without changing the function code or queue topology. Which TWO changes should you make? Select TWO.

Binding excerpt:
type: serviceBusTrigger
queueName: ingest-requests
connection: ServiceBusConnection

App settings:
FUNCTIONS_WORKER_RUNTIME=python
AzureWebJobsStorage=
SB_CONNECTION=Endpoint=sb://...

Startup log:
Service Bus connection setting 'ServiceBusConnection' was not found.
AzureWebJobsStorage is missing or empty.

Options:

A. Change FUNCTIONS_WORKER_RUNTIME to dotnet.
B. Rename the queue to ServiceBusConnection.
C. Add an app setting named ServiceBusConnection.
D. Add a Cosmos DB output binding.
E. Increase the Service Bus trigger concurrency.
F. Configure a valid AzureWebJobsStorage setting.

Correct answers: C, F

Explanation: For Azure Functions triggers, the binding’s connection property is the name of an application setting, not the queue name or an arbitrary alias. The deployed app has SB_CONNECTION, but the trigger is looking for ServiceBusConnection, so the listener cannot resolve the Service Bus connection. The host also reports that AzureWebJobsStorage is missing or empty, which prevents normal Functions host operation for this deployed trigger-based app. Fixing those two app settings addresses the startup failures shown in the log without changing code or the Service Bus queue. Scaling, output bindings, or worker runtime changes do not resolve missing configuration names.

Queue rename confuses the queue entity with the app setting name referenced by the trigger binding.
Concurrency tuning can affect throughput only after the listener starts successfully.
Worker runtime change is inappropriate because the app is a Python function and the runtime is already set to python.
Output binding does not fix a failed Service Bus trigger listener.
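
The name-resolution rule can be checked mechanically. The sketch below is a hypothetical pre-deployment check, not part of the Functions host. It also accepts the identity-based setting forms (a `__fullyQualifiedNamespace` suffix for Service Bus and `__accountName` for host storage) as alternatives to a connection string.

```python
def missing_settings(binding, app_settings):
    """List the configuration problems that would stop the trigger listener from starting."""
    problems = []

    # The binding's connection value is the NAME of an app setting, not the value itself.
    name = binding["connection"]
    has_connection_string = bool(app_settings.get(name))
    has_identity_form = bool(app_settings.get(f"{name}__fullyQualifiedNamespace"))
    if not (has_connection_string or has_identity_form):
        problems.append(f"app setting '{name}' is missing or empty")

    # The host storage setting must be present and non-empty as well.
    has_storage = bool(app_settings.get("AzureWebJobsStorage"))
    has_storage_identity = bool(app_settings.get("AzureWebJobsStorage__accountName"))
    if not (has_storage or has_storage_identity):
        problems.append("AzureWebJobsStorage is missing or empty")

    return problems
```

Run against the exhibit, the check reports both problems from the startup log; SB_CONNECTION holds a value but under a name the binding never looks up.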

Question 21

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A RAG API running in Azure Container Apps uses Azure App Configuration for runtime settings. After operators enabled vector retrieval for production, users still receive answers without citations.

Evidence:

Startup: AppConfig label selector = prod
FeatureManagement:UseVectorContext = false
Rag:RetrievalMode = keyword
Trace: skipped vector retrieval; feature disabled

Recent App Configuration changes:

Key	Label	Value
FeatureManagement:UseVectorContext	production	true
Rag:RetrievalMode	production	vector

Which root cause best fits the evidence?

Options:

A. The app is selecting a different App Configuration label.
B. The vector index must be redesigned to store citations.
C. The model endpoint lacks capacity for vector retrieval.
D. The container image must be rebuilt with updated prompt code.

Best answer: A

Explanation: Azure App Configuration labels let the same key have environment-specific values. In this case, the application is successfully reading configuration, but it is selecting the prod label while the updated values were stored with the production label. The trace also shows that vector retrieval was skipped because the feature flag was disabled, so the failure is in configuration selection rather than RAG feature design or model behavior. The next fix would be to align the label used by the app with the label used for the production settings, or update the keys under the label the app actually selects.

Vector redesign does not follow the trace because retrieval was never attempted.
Image rebuild is not supported by the evidence because Azure App Configuration supplies runtime settings outside the image.
Model capacity would typically show timeout, throttling, or latency symptoms, not a disabled feature flag.
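
The label mismatch is easy to reproduce with a toy model of a labeled key-value store. This is an illustration of the selection behavior only, not the App Configuration SDK; the dictionary layout and function name are hypothetical.

```python
# Values as recorded in the recent-changes table, keyed by (key, label).
STORE = {
    ("FeatureManagement:UseVectorContext", "production"): "true",
    ("Rag:RetrievalMode", "production"): "vector",
}


def select_settings(store, label):
    """Return only the keys whose label matches the selector exactly."""
    return {key: value for (key, stored_label), value in store.items() if stored_label == label}
```

Selecting with `prod` returns nothing from these entries, so the app keeps whatever values it already had; selecting with `production` returns both updated settings.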

Question 22

Topic: Develop AI Solutions by Using Azure Data Management Services

A Python FastAPI back end in Azure Container Apps must run semantic-search SQL against Azure Database for PostgreSQL. The deployment trace shows the request path below. Which change addresses the failure point in the flow?

Startup: DefaultAzureCredential created
Startup: PostgreSQL flexible server management client created
Request /ask:
  list servers -> 200 OK
  get database metadata -> 200 OK
  run SELECT ... ORDER BY embedding <-> $1
  error: AttributeError: no execute/cursor

Options:

A. Grant the managed identity Reader on the PostgreSQL resource.
B. Use the management client to open a server-level cursor.
C. Use a PostgreSQL driver connection pool, such as asyncpg or psycopg.
D. Publish the SQL command as an Event Grid custom event.

Best answer: C

Explanation: Azure Database for PostgreSQL is queried from application code by using standard PostgreSQL client libraries, such as asyncpg, psycopg, Npgsql, or JDBC. In a back-end service, the usual pattern is to create a bounded connection pool during startup and reuse pooled connections for SQL queries. The trace shows the app can call Azure Resource Manager operations, such as listing servers and reading database metadata, but then tries to execute SQL through that management client. Management clients do not expose database cursors or execute application queries. The missing link is a data-plane PostgreSQL driver connection to the database endpoint.

More RBAC does not add SQL execution methods to a management client.
Server-level cursor is not a capability exposed by the Azure resource management client.
Event Grid routing is for event delivery, not synchronous SQL query execution against PostgreSQL.
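
A data-plane query through a driver might look like the sketch below. It assumes a psycopg 3 connection pool (psycopg_pool) is passed in, uses psycopg's `%s` placeholders rather than the `$1` style shown in the trace, and relies on a hypothetical table rag_chunks with columns chunk_id, source_uri, and embedding.

```python
SEARCH_SQL = """
SELECT chunk_id, source_uri
FROM rag_chunks
ORDER BY embedding <-> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding):
    """Format a sequence of numbers as pgvector's text form, for example [0.1,2.0]."""
    return "[" + ",".join(repr(float(value)) for value in embedding) + "]"


def semantic_search(pool, embedding, limit=5):
    """Borrow a pooled connection, run the similarity query, and return the rows."""
    with pool.connection() as conn:
        cursor = conn.execute(SEARCH_SQL, (to_vector_literal(embedding), limit))
        return cursor.fetchall()
```

The management client stays useful for listing servers or reading metadata, while every SQL statement goes through the driver connection.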

Question 23

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps API uses OpenTelemetry correlation. During an incident, only the /rag/answer operation returns 500; /health and /orders/status succeed. The active revision has no restarts. The on-call team must avoid unnecessary rollbacks and keep unaffected routes available.

Trace excerpt for one failed request:

Span	Duration	Status
`api /rag/answer`	2,140 ms	Error
`redis lookup`	9 ms	OK
`postgres vector_query`	2,005 ms	Timeout
`cosmos read profile`	18 ms	OK

Which decision best meets the requirement?

Options:

A. Disable all API ingress until every dependency span succeeds.
B. Roll back the Container Apps revision because the API span failed.
C. Increase Container Apps replicas to remove the API errors.
D. Alert on PostgreSQL dependency failures and keep the revision active.

Best answer: D

Explanation: Distributed traces should be read as causal chains, not just by the root span status. The root API span is Error because /rag/answer depends on postgres vector_query, which times out and accounts for most of the request duration. Redis and Cosmos DB spans are OK, other routes and health checks succeed, and no container restarts are reported. That evidence isolates the problem to the PostgreSQL dependency path rather than the API revision or Container Apps runtime. A good control is to alert and triage on the dependency span and correlation ID while leaving the current revision available for healthy routes. Rolling back or disabling ingress would broaden the blast radius.

Revision rollback treats the root API error as a container defect, but the trace points to a timed-out downstream span.
Replica scaling sends more requests to the same failing dependency and does not address the causal timeout.
Ingress shutdown blocks healthy operations and violates the requirement to keep unaffected routes available.
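
Reading the trace as a causal chain can be sketched in a few lines. The span records below copy the exhibit; the dictionary layout and function name are hypothetical.

```python
SPANS = [
    {"name": "api /rag/answer", "duration_ms": 2140, "status": "Error"},
    {"name": "redis lookup", "duration_ms": 9, "status": "OK"},
    {"name": "postgres vector_query", "duration_ms": 2005, "status": "Timeout"},
    {"name": "cosmos read profile", "duration_ms": 18, "status": "OK"},
]


def failing_dependencies(spans, root_name):
    """Return each non-OK child span with its share of the root span's duration."""
    root = next(span for span in spans if span["name"] == root_name)
    return [
        (span["name"], span["status"], round(span["duration_ms"] / root["duration_ms"], 2))
        for span in spans
        if span["name"] != root_name and span["status"] != "OK"
    ]
```

For this trace the only failing dependency is the PostgreSQL query, which accounts for roughly 94% of the request time, so the alert belongs on that span rather than on the revision.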

Question 24

Topic: Develop Containerized Solutions on Azure

A team deployed a Python RAG API as a custom container to Azure App Service. The optimized release met the latency target in testing, but production p95 latency remains high.

Exhibit: Deployment evidence

Intended release:
Image: contoso.azurecr.io/rag-api:20260515.3
CACHE_TTL_SECONDS: 300

App Service diagnostics:
Configured image: contoso.azurecr.io/rag-api:latest
Resolved digest: sha256:0b7d...
CACHE_TTL_SECONDS: 0
Startup log: response cache disabled

Which action should the developer take first to improve efficiency without weakening deployment reliability?

Options:

A. Enable Always On to reduce container cold-start latency.
B. Scale up the App Service plan before changing configuration.
C. Pin the intended image tag and set CACHE_TTL_SECONDS=300.
D. Keep latest and bake the cache value into the image.

Best answer: C

Explanation: App Service custom container evidence should be checked against the intended release: the configured image, resolved image digest or tag, startup logs, and app settings exposed as environment variables. In this case, production is not using the intended immutable image tag and CACHE_TTL_SECONDS is 0, which the log confirms disables the response cache. Pinning the tested image tag and setting the App Service app setting to 300 directly applies the known performance change while avoiding the deployment risk of a mutable latest tag. Scaling resources would cost more without fixing the mismatch.

Scaling first misses the visible configuration mismatch and may increase cost without enabling the tested optimization.
Using latest weakens release reliability because the running digest may change without an explicit versioned deployment.
Always On can help startup behavior, but the evidence points to disabled caching during normal request processing.
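
Comparing the intended release with what is actually running can be automated. The sketch below is a hypothetical drift check over the two configurations in the exhibit; it does not call any Azure API.

```python
INTENDED = {"image": "contoso.azurecr.io/rag-api:20260515.3", "CACHE_TTL_SECONDS": "300"}
ACTUAL = {"image": "contoso.azurecr.io/rag-api:latest", "CACHE_TTL_SECONDS": "0"}


def deployment_drift(intended, actual):
    """Map each mismatched key to its (intended, actual) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in intended.items()
        if actual.get(key) != value
    }


def uses_mutable_tag(image):
    """True when the reference has no tag or uses latest; digest references count as pinned."""
    name = image.rsplit("/", 1)[-1]
    if "@" in name:
        return False
    tag = name.split(":", 1)[1] if ":" in name else "latest"
    return tag == "latest"
```

Both keys drift in the exhibit, and the running image reference is mutable, which matches the two parts of the correct answer.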

Quick Cheat Sheet

Cue	What to remember
Compute choice	Use Azure Functions, Azure Container Apps, Azure App Service, or other managed compute based on scale, latency, image, deployment, and operations needs.
Integration	Use queues, events, and durable workflow boundaries for long-running AI steps or dependency isolation.
Data fit	Match semantic search, document data, cache, and relational needs to the right Azure data service.
Security	Use managed identity, least privilege, Key Vault, masking, and environment separation for AI-enabled apps.
Resilience	Plan for throttling, retries, timeouts, dependency telemetry, KQL investigation, and user-safe fallback behavior.

Mini Glossary

Azure Container Apps: Managed container platform for microservices, APIs, and event-driven workloads without direct Kubernetes cluster management.
Embedding: Numeric representation used for semantic search and similarity matching.
Managed identity: Azure identity feature that lets services authenticate without stored application secrets.
Queue: Messaging pattern that decouples request intake from later processing.
Vector search: Retrieval method that compares embeddings by similarity rather than exact keywords only.

Focused sample questions

Use these child pages when you want focused IT Mastery practice before returning to mixed sets and timed mocks.

Free study resources

Need concept review first? Read the AI-200 Cheat Sheet before returning to timed practice.

Free preview vs premium

Free preview: a smaller web set so you can validate the question style and explanation depth.
Premium: the full AI-200 practice bank, focused drills, mixed sets, timed mock exams, detailed explanations, and progress tracking across web and mobile.

Good next pages after AI-200

AI-103 if you are comparing Azure AI apps-and-agents development with AI cloud-development work
AI-900 if you need Azure AI fundamentals first
AZ-104 if your weak point is Azure identity, networking, storage, and operations context
Microsoft Certification Practice Hub if you are comparing Azure, Fabric, security, Microsoft 365, Power Platform, Dynamics 365, GitHub, or Windows Server routes

Official sources

Microsoft Learn course AI-200T00-A

In this section

AI-200: Develop Containerized Solutions on Azure
Try 10 focused AI-200 questions on Develop Containerized Solutions on Azure, with explanations, then continue with IT Mastery.
AI-200: Develop AI Solutions by Using Azure Data Management Services
Try 10 focused AI-200 questions on Develop AI Solutions by Using Azure Data Management Services, with explanations, then continue with IT Mastery.
AI-200: Connect to and Consume Azure Services
Try 10 focused AI-200 questions on Connect to and Consume Azure Services, with explanations, then continue with IT Mastery.
AI-200: Secure, Monitor, and Troubleshoot Azure Solutions
Try 10 focused AI-200 questions on Secure, Monitor, and Troubleshoot Azure Solutions, with explanations, then continue with IT Mastery.
Microsoft AI-200 Cheat Sheet: AI Cloud Developer
Review the Microsoft Azure AI Cloud Developer Associate (AI-200) scope, Azure application patterns, data-service decisions, and troubleshooting traps before practicing in IT Mastery.
Free AI-200 Full-Length Practice Exam: 60 Questions
Try 60 free AI-200 questions across the exam domains, with explanations, then continue with full IT Mastery practice.

Revised on Monday, May 25, 2026
