AI-200 Practice Test & Mock Exam
Prepare for AI-200 with a free diagnostic page, topic drills, timed mocks, detailed explanations, and the current IT Mastery question bank.
Use IT Mastery for interactive practice with timed mocks, topic drills, progress tracking, and detailed explanations across web and mobile. Treat the diagnostic page and public sample questions as optional one-pass style checks for how this exam handles Azure containers, serverless services, AI data services, service integration, identity, security, monitoring, and troubleshooting.
Sample Exam Questions
Try these 24 original sample questions for Microsoft AI-200. They are selected from the live IT Mastery practice bank for study, self-assessment, and exam-scope review. They are not official Microsoft questions, copied live-exam content, or exam dumps.
Question 1
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python RAG API stores document chunks, embeddings, tenant_id, and source_system in Azure Database for PostgreSQL. Users from one tenant get low-confidence answers even though matching chunks exist. A trace for a failing request shows:
Request: tenant_id=contoso, source_system=policy
SQL executed:
SELECT chunk_id, tenant_id, source_system, content
FROM rag_chunks
ORDER BY embedding <=> :query_embedding
LIMIT 20;
App filter after SQL:
tenant_id=contoso AND source_system=policy
filtered_rows=1
Which retrieval pattern should you use to address this failure mode?
Options:
- A. Return newest matching chunks and skip vector ranking.
- B. Increase LIMIT and keep filtering in application code.
- C. Append tenant metadata before generating the query embedding.
- D. Filter metadata in SQL, then rank by vector distance.
Best answer: D
Explanation: The trace shows the top-k vector search is being run across all chunks before tenant and source metadata is applied in the app. Because LIMIT happens before filtering, the best matching rows for the requested tenant and source may never be returned from PostgreSQL. For semantic retrieval with metadata constraints, store embeddings and source metadata together and push the metadata predicates into the PostgreSQL query. The query should restrict rows with predicates such as tenant_id = :tenant_id and source_system = :source, then rank that filtered candidate set with vector distance before applying LIMIT. Increasing top-k is only a workaround; the core issue is the ordering of metadata filtering and similarity ranking.
- Larger top-k may reduce misses, but it still performs global similarity first and can remain slow or incomplete.
- Prompt metadata does not enforce tenant or source predicates against stored rows.
- Newest chunks applies metadata but ignores semantic similarity, which is required for the retrieval pattern.
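The filter-then-rank pattern from the explanation can be sketched as a small query builder. This is a minimal sketch that reuses the table, columns, and `:query_embedding` bind parameter from the trace; the function name `build_retrieval_query` and the `:tenant_id` and `:source` parameter names are assumptions for illustration.

```python
def build_retrieval_query(top_k: int = 20) -> str:
    """Return SQL that applies metadata predicates before vector ranking.

    The WHERE clause narrows the candidate set to one tenant and source,
    so ORDER BY and LIMIT only ever see rows the caller may use.
    """
    return (
        "SELECT chunk_id, tenant_id, source_system, content\n"
        "FROM rag_chunks\n"
        "WHERE tenant_id = :tenant_id AND source_system = :source\n"
        "ORDER BY embedding <=> :query_embedding\n"
        f"LIMIT {int(top_k)};"  # int() keeps the interpolated value numeric
    )


query = build_retrieval_query(20)
```

With this shape, the application-side filter from the trace becomes redundant, because every returned row already matches the requested tenant and source.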
Question 2
Topic: Develop Containerized Solutions on Azure
A team deploys a Python RAG API to Azure Container Apps from Azure Container Registry. The new revision never becomes ready.
Deployment path:
Build task -> ACR repository -> Container Apps revision -> Running replica
Build task: Succeeded; pushed rag-api:20260516
Revision: rag-api--k9r2 created; desired replicas: 1; ready: 0
Revision event: Image pull failed for contosoacr.azurecr.io/rag-api:20260516
Detail: dial tcp 10.4.2.5:443 i/o timeout
App logs: no stdout or stderr for rag-api--k9r2
Which evidence identifies the deployment failure point?
Options:
- A. Application logs for the missing container stdout
- B. Revision event with the ACR image-pull timeout
- C. KEDA scale history for the revision replicas
- D. Build log with the pushed image digest
Best answer: B
Explanation: In a Container Apps deployment, the image must first be pulled from the registry before the container process can start and emit application logs. The build task succeeded, so the image was created and pushed. The revision was created, but the event shows the revision could not pull the image from ACR because the connection to the registry endpoint timed out. That points to registry reachability or related connectivity evidence between the Container Apps environment and ACR. Missing stdout is expected when the container never starts, and scaling evidence is secondary because provisioning failed first.
- Application logs are absent because the container never started, so they cannot diagnose this pre-start pull failure.
- Build digest confirms an image was pushed, but not that Container Apps can reach the registry.
- Scale history is not decisive because the revision failed during provisioning before scaling behavior matters.
Question 3
Topic: Connect to and Consume Azure Services
An AI document-processing API runs in Azure Container Apps and uses the Azure Service Bus SDK to enqueue one processing command per upload. During burst tests, p95 enqueue latency increases and Service Bus metrics show many short-lived AMQP connections from each replica. The current handler creates a ServiceBusClient and queue sender, sends one message, and closes them for every HTTP request. The app must continue using managed identity and the queue’s retry/dead-letter behavior. Which SDK-based access pattern best improves efficiency?
Options:
- A. Reuse one ServiceBusClient and sender per process.
- B. Store a Service Bus connection string in App Configuration.
- C. Emit upload notifications through Event Grid instead.
- D. Cache pending commands in Azure Managed Redis.
Best answer: A
Explanation: Azure SDK clients that manage network connections should usually be created once per application instance and reused. In this scenario, the performance symptom is many short-lived AMQP connections caused by constructing and closing Service Bus SDK objects inside every request. A long-lived ServiceBusClient created with managed identity, plus a reused queue sender, reduces connection churn and lowers enqueue latency under burst load. This keeps the existing Service Bus queue semantics, including retry and dead-letter handling. Switching services or moving credentials does not address the connection lifecycle problem.
- Event Grid swap fails because it changes the integration pattern and does not preserve queue retry and dead-letter behavior.
- Redis buffering adds another component but does not provide durable Service Bus queue semantics for commands.
- Connection string storage weakens the managed identity requirement and does not reduce per-request client creation overhead.
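One way to get a single client and sender per process is a lazily built module-level holder, sketched below. The `ProcessWide` helper, the `build_sender` function, and the `SERVICEBUS_NAMESPACE` and `SERVICEBUS_QUEUE` setting names are assumptions for illustration; the SDK calls assume the `azure-servicebus` and `azure-identity` packages are installed.

```python
import os
import threading
from typing import Any, Callable


class ProcessWide:
    """Builds an object on first use and hands back the same one afterwards."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._value: Any = None
        self._built = False

    def get(self) -> Any:
        # Double-checked locking so concurrent requests build the object once.
        if not self._built:
            with self._lock:
                if not self._built:
                    self._value = self._factory()
                    self._built = True
        return self._value


def build_sender() -> Any:
    # Imported lazily so this module loads even where the SDK is absent.
    from azure.identity import DefaultAzureCredential
    from azure.servicebus import ServiceBusClient

    client = ServiceBusClient(
        fully_qualified_namespace=os.environ["SERVICEBUS_NAMESPACE"],  # hypothetical setting
        credential=DefaultAzureCredential(),  # uses the managed identity when run in Azure
    )
    return client.get_queue_sender(queue_name=os.environ["SERVICEBUS_QUEUE"])  # hypothetical setting


# One holder per process, shared by every request handler.
SENDER = ProcessWide(build_sender)
```

A request handler would then call something like `SENDER.get().send_messages(ServiceBusMessage(body))` instead of constructing and closing a client per request, and the application would close the sender and client once at shutdown.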
Question 4
Topic: Connect to and Consume Azure Services
An Azure Function processes embedding jobs from Azure Service Bus. After deployment, queue depth grows and the host logs show the trigger cannot start. The team wants the Functions host to manage the Service Bus listener and avoid a secret lookup in user code on each invocation. The function app’s managed identity can read the secret from Key Vault.
Exhibit:
Trigger type: Service Bus queue
Queue: embedding-jobs
Binding connection: ServiceBusIngest
Log: connection setting 'ServiceBusIngest' was not found
Which Function app application setting should you add?
Options:
- A. SERVICEBUS_NAMESPACE = Service Bus fully qualified namespace
- B. ServiceBusIngest = Key Vault reference to the Service Bus connection string
- C. ServiceBusIngest = App Configuration endpoint URL
- D. AzureWebJobsStorage = Service Bus connection string
Best answer: B
Explanation: For an Azure Functions binding, the connection property names the Function app application setting the host uses to connect to the target service. Here, the Service Bus trigger specifies ServiceBusIngest, so the deployed app needs an application setting with that exact name. Using a Key Vault reference lets the platform resolve the secret securely while the Functions host initializes the trigger and manages the listener efficiently. That avoids adding per-invocation secret retrieval or client setup in function code.
AzureWebJobsStorage is for the Functions runtime storage account, not the Service Bus queue trigger connection.
- Runtime storage mix-up fails because AzureWebJobsStorage is not the binding connection named in the trigger.
- Namespace only fails because the stem does not configure identity-based Service Bus binding settings under the required prefix.
- App Configuration endpoint fails because it does not give the trigger host the Service Bus credential at startup.
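As an illustration of what the application setting value could look like, the sketch below formats a Key Vault reference for the `ServiceBusIngest` setting. The vault name `contoso-kv`, the secret name `servicebus-ingest`, and the helper name are hypothetical, and the URI assumes the public-cloud `vault.azure.net` suffix.

```python
def key_vault_reference(vault_name: str, secret_name: str) -> str:
    """Format an app setting value that points at a Key Vault secret."""
    secret_uri = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}"
    return f"@Microsoft.KeyVault(SecretUri={secret_uri})"


# The setting name must match the binding's connection property exactly.
app_settings = {
    "ServiceBusIngest": key_vault_reference("contoso-kv", "servicebus-ingest"),
}
```

The platform resolves the reference with the function app's identity before the host starts the listener, so function code never reads the secret itself.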
Question 5
Topic: Connect to and Consume Azure Services
A team is building a Python Azure Functions API for AI document enrichment. Each enrichment job can run for several minutes. The API endpoint must return in under 2 seconds with a job ID, create a durable work item that one background worker can claim, support retries with poison-job handling, and avoid hard-coded service secrets. Which implementation should you use?
Options:
- A. Store pending jobs in Cosmos DB and poll them with a timer-triggered Function.
- B. Publish an Event Grid custom event and process it with an Event Grid-triggered Function.
- C. Use an HTTP trigger with an identity-based Service Bus queue output binding; process jobs with a Service Bus-triggered Function.
- D. Run the enrichment inside the HTTP-triggered Function and increase the function timeout.
Best answer: C
Explanation: For long-running AI work, the Function-based API should accept the request, create a job ID, enqueue a command-style work item, and return 202 Accepted quickly. Azure Service Bus queues are designed for durable background work where one worker claims a message, failed processing can be retried, and poison messages can be moved to a dead-letter queue. A Service Bus-triggered Function then performs the enrichment asynchronously. Using an identity-based binding or secure configuration avoids embedding connection strings in code. Event Grid is better for event notification, and polling a database adds custom retry and poison-message logic.
- Synchronous processing violates the under-2-second response goal and ties client availability to the AI job duration.
- Event notification does not best match durable command work that needs worker claiming and poison-message handling.
- Database polling adds latency and custom workflow logic instead of using native trigger and binding behavior.
Question 6
Topic: Develop Containerized Solutions on Azure
An Azure Container Apps worker uses the Azure Service Bus SDK to process queue messages and then query Azure Cosmos DB for NoSQL. It scales with KEDA based on queue length. After a release, queue age increased, and the team suggests raising the replica limit. You must reduce delay without increasing Cosmos DB throttling.
Exhibit: Current observations
KEDA rule: Service Bus queue length, max replicas = 20
Current: 20 replicas, CPU 24%, memory 48%
Cosmos DB query p95: 1.9 seconds, cross-partition
Cosmos DB 429 retries: increased under load
Load test at 40 replicas: same throughput, more 429 retries
Which implementation should you choose?
Options:
- A. Move the worker to an Azure Functions Service Bus trigger.
- B. Add a CPU scaler to increase replicas when CPU is low.
- C. Increase KEDA max replicas and lower the queue-length target.
- D. Optimize the Cosmos DB query path before changing KEDA.
Best answer: D
Explanation: KEDA is appropriate when more replicas can independently process more work. Here, the worker is already at the replica limit, CPU is low, and the slow step is a cross-partition Cosmos DB query with 429 retries. The load test confirms that extra replicas do not drain the queue faster and instead increase downstream throttling. The implementation should keep scaling at a safe level and fix the per-message bottleneck, such as query shape, partition filtering, indexing, request-unit capacity, or client-side concurrency. Scaling the container app is not a substitute for resolving a saturated downstream dependency.
- More replicas fails because the load test already showed unchanged throughput and more Cosmos DB 429 retries.
- CPU scaling fails because CPU is not saturated and low CPU does not indicate a need for more workers.
- Changing hosts fails because a Functions trigger would still hit the same slow and throttled Cosmos DB query path.
Question 7
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
An Azure Container Apps-hosted Python RAG API started returning 500 errors immediately after a new revision was deployed. The revision is intended to use the production managed identity. You need to decide whether the cause is operational or design-related and choose the next step.
| Evidence source | Finding |
|---|---|
| KQL logs | SecretClient.get_secret returns 403 Forbidden for cosmos-key |
| OpenTelemetry trace | Failed span is KeyVault.GetSecret; no Cosmos DB span starts |
| Metrics | CPU, memory, and request volume are unchanged |
| Configuration | Expected identity is mi-rag-prod; new revision uses mi-rag-test |
What should you do next?
Options:
- A. Move Key Vault secrets into App Configuration.
- B. Deploy a revision using the authorized managed identity.
- C. Add Redis caching for retrieval responses.
- D. Increase Cosmos DB Request Units for vector queries.
Best answer: B
Explanation: The combined evidence indicates an operational deployment/configuration issue, not a design flaw or capacity problem. The failure begins at KeyVault.GetSecret, the trace never reaches Cosmos DB, and platform metrics do not show increased load. The configuration also shows the new revision using mi-rag-test while the intended authorized identity is mi-rag-prod. The next step is to correct the revision’s managed identity configuration and redeploy, then validate that traces proceed past Key Vault to the downstream data call. Capacity or caching changes would optimize later parts of the request path, but this request is failing before retrieval starts.
- RU increase targets database throughput, but the trace shows the request never reaches Cosmos DB.
- App Configuration move misuses nonsecret configuration storage and does not fix the denied Key Vault access.
- Redis caching is premature because the failure occurs before retrieval or response caching can help.
Question 8
Topic: Develop AI Solutions by Using Azure Data Management Services
A containerized RAG API on Azure Container Apps stores embeddings in Azure Database for PostgreSQL. The table uses the correct vector data type, metadata filters, and an appropriate vector index. KQL and database metrics show p95 similarity-search latency above target, CPU at 92%, memory pressure during queries, and storage I/O below 40%. Which resource adjustment is the best design fit?
Options:
- A. Scale PostgreSQL to a larger memory-optimized compute SKU
- B. Increase only the allocated PostgreSQL storage size
- C. Move durable embeddings to Azure Managed Redis
- D. Add Service Bus between the API and PostgreSQL
Best answer: A
Explanation: Vector similarity search in Azure Database for PostgreSQL can be compute- and memory-intensive even when the schema and vector index are appropriate. In this scenario, the key evidence is high CPU and memory pressure while storage I/O is not saturated. The best resource adjustment is to scale the PostgreSQL server to a compute option with more vCores and memory, such as a larger memory-optimized SKU. That preserves the existing PostgreSQL vector workload and addresses the measured bottleneck directly. Storage expansion helps only when capacity or I/O is the limiting factor, and queuing or cache substitution does not fix slow synchronous database search.
- Storage-only scaling misses the observed CPU and memory bottleneck and would not materially reduce similarity-search latency.
- Redis as durable storage drifts from the PostgreSQL vector workload and changes the source-of-truth design.
- Service Bus buffering can smooth asynchronous work but does not improve the latency of a synchronous vector query.
Question 9
Topic: Develop AI Solutions by Using Azure Data Management Services
An AI metadata API uses Azure Cosmos DB for NoSQL to retrieve document records before vector retrieval. A new case-insensitive category filter caused RU spikes. The team must reduce RU without removing tenant isolation or changing the filter semantics. The container partition key is /tenantId.
Query:
SELECT c.id, c.title
FROM c
WHERE c.tenantId = @tenantId
AND LOWER(c.category) = @category
AND c.embeddingModel = @model
Observed:
Cross-partition: false
Consistency: Session
Index note: /tenantId and /embeddingModel used
LOWER(c.category): evaluated after retrieval
Retrieved: 24,900 docs
Returned: 37 docs
Request charge: 910 RU
Which control should the developer implement?
Options:
- A. Add a composite index but keep LOWER(c.category).
- B. Change the reads to Strong consistency.
- C. Use an indexed normalizedCategory equality filter in the tenant-scoped query.
- D. Repartition the container by /category.
Best answer: C
Explanation: The evidence points to application query shape, not partitioning or consistency. The request is already single-partition because the query includes /tenantId, and Session consistency is not the source of the high RU charge. The large gap between retrieved and returned documents, combined with the note that LOWER(c.category) is evaluated after retrieval, shows that many candidate documents are read before the final filter is applied. Storing a normalized category value, such as lowercasing it at write time, and querying that indexed field with equality preserves case-insensitive behavior while keeping tenant isolation.
- Repartitioning by category ignores the evidence that the current query is already not cross-partition.
- Strong consistency would typically increase read cost or latency and does not fix the inefficient predicate.
- Composite index only does not make a function-wrapped LOWER(c.category) predicate index-served.
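The write-time normalization described in the explanation might look like the following sketch. The helper names are assumptions; the `normalizedCategory` property comes from the answer option, and the query keeps the tenant and model predicates from the exhibit. Python's `str.lower()` is assumed to be an acceptable stand-in for the query's `LOWER()` semantics for the categories in use.

```python
def with_normalized_category(item: dict) -> dict:
    """Return a copy of the item with a lowercased category stored alongside it."""
    doc = dict(item)
    doc["normalizedCategory"] = doc["category"].lower()
    return doc


# Equality on a plain property can be served from the index, unlike LOWER(c.category).
QUERY = (
    "SELECT c.id, c.title FROM c "
    "WHERE c.tenantId = @tenantId "
    "AND c.normalizedCategory = @category "
    "AND c.embeddingModel = @model"
)


def query_parameters(tenant_id: str, category: str, model: str) -> list:
    """Build query parameters, lowercasing the category the same way as at write time."""
    return [
        {"name": "@tenantId", "value": tenant_id},
        {"name": "@category", "value": category.lower()},
        {"name": "@model", "value": model},
    ]
```

Existing documents would need a one-time backfill of `normalizedCategory` before the new query returns complete results.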
Question 10
Topic: Develop Containerized Solutions on Azure
A team deploys a custom container to Azure App Service. The same image must be promoted unchanged from staging to production. The Python app reads MODEL_ENDPOINT and POSTGRES_PASSWORD only from environment variables. Security requires that secrets are not stored in the image, Dockerfile, or repository. What should you configure to meet the requirement with the least operational risk?
Options:
- A. Set App Service app settings with Key Vault references.
- B. Define the variables only in the ACR build task.
- C. Store the values in a config file inside the image.
- D. Add ENV values to the Dockerfile before building.
Best answer: A
Explanation: For a custom container hosted in Azure App Service, application settings are injected into the running container as environment variables. This lets the same image move across environments while each App Service instance supplies its own runtime configuration. Nonsecret values such as MODEL_ENDPOINT can be stored directly as app settings. Secret values such as POSTGRES_PASSWORD should be referenced from Azure Key Vault, typically using a managed identity, so the secret is retrieved at runtime without being committed to source or baked into the image. Build-time variables or image files do not satisfy the requirement because they make configuration part of the artifact rather than the deployment environment.
- Dockerfile ENV values fail because they bake environment-specific configuration into the container image.
- Image config files fail because updating settings would require rebuilding or republishing the image.
- ACR build variables fail because they affect image build automation, not the App Service runtime environment.
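On the application side, the image only needs to read its configuration from the environment at startup. The sketch below shows one way to do that; the `Settings` class and `load_settings` function are hypothetical names, while `MODEL_ENDPOINT` and `POSTGRES_PASSWORD` come from the scenario.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    model_endpoint: str
    postgres_password: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read runtime configuration from environment variables and fail fast if any is missing."""
    required = ("MODEL_ENDPOINT", "POSTGRES_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return Settings(
        model_endpoint=env["MODEL_ENDPOINT"],
        postgres_password=env["POSTGRES_PASSWORD"],
    )
```

Because the code never distinguishes a plain app setting from a resolved Key Vault reference, the same image can run unchanged in staging and production.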
Question 11
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A containerized Python API uses a model endpoint name, feature flags, retry counts, and an Azure Database for PostgreSQL password. Security requires least-privilege access to secrets and rotation without redeploying the container. Operations must update feature flags and retry counts without being granted secret permissions. Which design best meets these requirements?
Options:
- A. Store secrets in Key Vault and nonsecrets in App Configuration.
- B. Store all settings in Key Vault.
- C. Bake all settings into the container image.
- D. Store all settings in App Configuration.
Best answer: A
Explanation: Key Vault is the appropriate store for sensitive values such as passwords, API keys, and connection secrets because it supports controlled access, auditing, and rotation patterns. Azure App Configuration is better suited for nonsecret application settings such as feature flags, endpoint names, retry counts, and other runtime behavior flags. In this scenario, separating the stores lets security grant the app identity secret access without giving operations secret permissions, while still allowing operations to change nonsecret behavior safely. Baking values into images or using one store for everything either weakens secret handling or makes operational changes harder than required.
- All in Key Vault over-restricts ordinary configuration updates and does not fit the requirement for operations to manage nonsecret settings separately.
- All in App Configuration fails because database passwords should not be treated as ordinary nonsecret configuration.
- Container image settings fail because rotation or configuration changes would require rebuilding and redeploying the image.
Question 12
Topic: Develop AI Solutions by Using Azure Data Management Services
A chat retrieval API uses Azure Cosmos DB for NoSQL to store chunk metadata for an AI assistant. Users must see chunks they uploaded in their own next chat turn; global latest-order across users is not required.
Flow trace:
Upload -> write chunk item (partition key: tenantId)
Chat -> pass upload session token to read
Query -> same tenant:
WHERE c.tenantId = @tenant AND c.docType = @type
ORDER BY c.updatedAt DESC
Consistency: Session
Indexing:
included paths: /tenantId/?, /docType/?, /updatedAt/?
composite indexes: none
Metrics:
high RU, high retrieved documents, low index utilization
Which change is the most durable Cosmos DB optimization for this query path?
Options:
- A. Change chat reads to eventual consistency.
- B. Remove updatedAt from the indexing policy.
- C. Add a composite index matching docType and updatedAt.
- D. Increase provisioned throughput for the container.
Best answer: C
Explanation: The durable optimization is to make the existing query use an index pattern that matches how it filters and sorts. The app already uses Session consistency with the upload session token, which supports the stated read-your-writes requirement. The visible failure point is the query plan evidence: high RU, high retrieved document count, low index utilization, and no composite index for the filter-plus-ORDER BY path. Increasing throughput may reduce throttling if throttling exists, but it does not reduce RU per query. Weakening consistency is not justified because the requirement depends on seeing the user’s own write.
- Eventual consistency can break the stated read-your-writes behavior for the user’s next chat turn.
- More throughput does not address the inefficient query shape shown by low index utilization.
- Removing the sort path would make the ORDER BY updatedAt path less index-friendly, not more efficient.
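For illustration, an indexing policy with a composite index matching this query could be expressed as the dictionary below. The included paths are taken from the exhibit; the `excludedPaths` entry and the ascending order on `docType` are assumptions, and the descending order on `updatedAt` mirrors the query's `ORDER BY c.updatedAt DESC`.

```python
INDEXING_POLICY = {
    "indexingMode": "consistent",
    "includedPaths": [
        {"path": "/tenantId/?"},
        {"path": "/docType/?"},
        {"path": "/updatedAt/?"},
    ],
    # Assumption: everything else is excluded, since the exhibit lists only three paths.
    "excludedPaths": [{"path": "/*"}],
    "compositeIndexes": [
        [
            # Filter property first, then the sort property in the query's sort direction.
            {"path": "/docType", "order": "ascending"},
            {"path": "/updatedAt", "order": "descending"},
        ]
    ],
}
```

Depending on the query engine's rules, the filter property may also need to appear in the ORDER BY clause before the composite index is used, so it is worth confirming the change with query metrics (RU charge and retrieved document count) after the policy is applied.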
Question 13
Topic: Develop Containerized Solutions on Azure
A team is deploying a containerized Python RAG API for an Azure AI cloud solution. The image is already built and stored in Azure Container Registry. The platform team has provisioned the AKS cluster, namespace, node pools, networking, ingress controller, and ACR pull permissions. The app team must deploy each image version, set replicas and health probes, expose an internal endpoint, and keep deployment definitions in source control. Which design best fits?
Options:
- A. Build a new AKS cluster with updated networking and identity.
- B. Move the API to Azure Container Apps with KEDA scaling.
- C. Version and apply Kubernetes Deployment, Service, and ConfigMap manifests.
- D. Reconfigure AKS node pools and cluster autoscaler settings.
Best answer: C
Explanation: This scenario is about application lifecycle within an existing AKS cluster, not administering the cluster. The platform prerequisites—namespace, node pools, networking, ingress, and ACR permissions—are already in place, and the app team only needs to manage resources inside the namespace. For AI-200, that points to manifest-based application management: a Deployment for image tag, replicas, probes, and environment references; a Service for internal exposure; and ConfigMap references for nonsecret settings. CI can apply versioned YAML manifests for controlled rollouts. Changing node pools, networking, identities, or creating a new cluster shifts into broader AKS administration and ignores the stated boundary.
- Cluster scaling changes miss the app-team scope because node pools and autoscaler settings are platform-level concerns.
- Container Apps migration changes the hosting platform even though AKS is already selected and provisioned.
- New cluster creation overbuilds the solution and violates the constraint to avoid cluster-level changes.
Question 14
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A team monitors an Azure Container Apps-hosted Python API that serves RAG responses. The API uses Azure Cosmos DB for NoSQL and Azure Managed Redis. During a latency incident, all health probes remain healthy.
| Signal | Observation |
|---|---|
| Container Apps | CPU 35%, memory 50%, replica cap not reached |
| Cosmos DB | 0 throttles, p95 server latency 8 ms |
| Managed Redis | p95 server latency 2 ms, hit ratio 86% |
| App traces | p95 request 1,800 ms; client initialization spans run on each request |
Which change is most likely to improve performance while keeping the current services?
Options:
- A. Increase Cosmos DB Request Units for the container.
- B. Shorten the Redis TTL so entries refresh more often.
- C. Lower the Container Apps scale threshold to add replicas earlier.
- D. Reuse Cosmos DB and Redis SDK clients per process.
Best answer: D
Explanation: The evidence points to an application-level bottleneck, not a service-level bottleneck. Container Apps is not CPU or memory constrained, Cosmos DB has no throttling and low server latency, and Redis has low server latency with a healthy hit ratio. The slow part appears in application traces: client initialization spans run on each request and consume much of the request time. Reusing SDK clients for Cosmos DB and Redis per process avoids repeated connection setup and improves efficiency without changing service capacity or weakening reliability. Scaling replicas or increasing database throughput would target symptoms that are not present in the measurements.
- More RUs fails because there are no Cosmos DB throttles or high server-side latency.
- Earlier scaling fails because Container Apps resource usage is low and the replica cap is not being reached.
- Shorter TTL fails because it would reduce cache effectiveness and increase downstream work.
Question 15
Topic: Develop AI Solutions by Using Azure Data Management Services
An Azure Database for PostgreSQL-backed RAG service has finished generating 1,536-dimension embeddings. The prototype table stores each chunk in one column:
chunk_id uuid
payload text -- chunk text, source URI, metadata JSON, embedding array
Production queries must run vector similarity search, filter by tenant, document type, and date range, and return source citations. What is the best next implementation step?
Options:
- A. Cache similar query results in Azure Managed Redis first.
- B. Load the prototype table and parse payload during each query.
- C. Create a full-text index on payload before storing embeddings.
- D. Redesign the table with vector(1536), typed filter/source columns, and jsonb metadata.
Best answer: D
Explanation: For PostgreSQL-backed RAG retrieval, the schema should separate data by how it will be used. Embeddings should be stored in a vector data type with the correct dimensionality so vector similarity operations and vector indexes can be applied. Frequently filtered attributes such as tenant, document type, and dates should be typed columns, not buried in text, so they remain easy to filter and index. Source references such as document ID, URI, and chunk position should also be explicit columns for reliable citations. jsonb is useful for flexible metadata that is not part of the primary filter path. Load data and tune indexes only after the table structure supports the required retrieval pattern.
- Parsing on read keeps the prototype shape and makes filtering, citations, and vector operations harder to optimize.
- Full-text first addresses keyword search, not the required vector similarity and structured filtering model.
- Caching first is premature because the durable table design still cannot support the required queries cleanly.
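A redesigned table along the lines of the explanation might look like the sketch below. Every column name other than `chunk_id` is an assumption chosen to match the stated filters and citation fields, and the `vector` type assumes the pgvector extension is enabled on the server. The helper formats an embedding in pgvector's bracketed text form.

```python
from typing import Sequence

CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE rag_chunks (
    chunk_id       uuid PRIMARY KEY,
    tenant_id      text NOT NULL,          -- typed filter column
    doc_type       text NOT NULL,          -- typed filter column
    doc_date       date NOT NULL,          -- typed filter column for date ranges
    source_uri     text NOT NULL,          -- explicit citation field
    chunk_position integer NOT NULL,       -- explicit citation field
    content        text NOT NULL,
    metadata       jsonb NOT NULL DEFAULT '{}'::jsonb,  -- flexible, non-filter metadata
    embedding      vector(1536) NOT NULL
);
"""


def to_vector_literal(values: Sequence[float], dims: int = 1536) -> str:
    """Format an embedding as a pgvector text literal, checking its dimensionality."""
    if len(values) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(values)}")
    return "[" + ",".join(str(float(v)) for v in values) + "]"
```

Indexes on the filter columns and a vector index on `embedding` would typically be added after the table is loaded, once the query patterns are confirmed.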
Question 16
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A Python Azure Functions app uses a Service Bus trigger to process embedding-refresh messages and query an Azure Database for PostgreSQL vector store. After a deployment, message backlog and retrieval latency increased. You must fix the implementation without changing the database tier or vector query semantics.
Exhibit: Recent evidence
| Signal | Value |
|---|---|
| Function scale-out | 4 to 30 instances |
| PostgreSQL connections | 490/500 active |
| PostgreSQL CPU/memory | 35% / 48% |
| App traces | PoolTimeout, too many clients |
| Current code | New pool per invocation |
Which change should you implement?
Options:
- A. Cache every vector query result in Azure Managed Redis.
- B. Rebuild the vector index with a new distance metric.
- C. Increase Service Bus trigger concurrency.
- D. Reuse a bounded PostgreSQL pool per worker process.
Best answer: D
Explanation: The evidence points to connection pressure, not database compute pressure or vector-index inefficiency. PostgreSQL CPU and memory are moderate, but active connections are nearly exhausted and the app logs show pool timeouts and too many clients. Because the current code creates a new pool per invocation, scale-out multiplies open connections quickly. A module-level or worker-level pool with a bounded maximum, reused across invocations, preserves the same query behavior while preventing connection storms. The key is to align Function concurrency and pool sizing with the database connection capacity instead of trying to drain the queue by creating more simultaneous database clients.
- More trigger concurrency would likely worsen the connection storm because each concurrent invocation can demand more database connections.
- Changing the vector metric addresses retrieval semantics or indexing, but the provided evidence shows connection exhaustion rather than poor vector index selection.
- Caching every result does not fix saturated connections and may introduce stale retrieval results for embedding-refresh processing.
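The sizing logic behind a bounded per-process pool can be sketched as follows. The numbers come from the exhibit (500 connections, 30 instances); the reserve of 50 connections, the `PG_CONNINFO` setting name, and the assumption of one worker process per instance are illustrative. The pool itself assumes the `psycopg_pool` package.

```python
import os
from functools import lru_cache


def pool_max_size(db_max_connections: int, reserved: int, max_instances: int) -> int:
    """Per-process connection cap so all instances together stay under the server limit."""
    usable = db_max_connections - reserved
    return max(1, usable // max_instances)


@lru_cache(maxsize=1)
def get_pool():
    """Create one bounded pool per worker process and reuse it across invocations."""
    # Imported lazily so this module loads even where psycopg_pool is absent.
    from psycopg_pool import ConnectionPool

    return ConnectionPool(
        conninfo=os.environ["PG_CONNINFO"],  # hypothetical setting name
        min_size=1,
        max_size=pool_max_size(db_max_connections=500, reserved=50, max_instances=30),
    )
```

An invocation would borrow a connection with `with get_pool().connection() as conn:` and return it on exit, so the total connection count is bounded by the pool size times the number of worker processes rather than by the number of invocations.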
Question 17
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python RAG API in Azure Container Apps must query items in an Azure Cosmos DB for NoSQL container by using the Azure Cosmos DB SDK. The container image must not contain secrets, and the app’s managed identity has Cosmos DB data-plane permissions. Which TWO connection details or client configurations are needed for the SDK query path? Select TWO.
Options:
- A. An Azure Storage connection string for the account data plane
- B. A Service Bus queue name for routing query requests
- C. The database ID and container ID for the container client
- D. The account endpoint URI and managed identity credential for CosmosClient
- E. A SQL Server connection string that uses TCP port 1433
- F. A hard-coded Cosmos DB account key in the container image
Correct answers: C, D
Explanation: An SDK query path for Azure Cosmos DB for NoSQL starts by constructing a CosmosClient for the target account. In this scenario, the app should supply the account endpoint and use an Azure Identity credential, such as DefaultAzureCredential, so the managed identity is used instead of embedding a secret. After the client is created, the code must resolve the database and container by their configured IDs, then run the query on the container client. Other Azure connection strings or messaging settings do not establish a Cosmos DB for NoSQL query path.
- Storage connection string fails because Azure Storage credentials do not connect the Cosmos DB for NoSQL SDK to a container.
- SQL Server settings fail because Cosmos DB for NoSQL queries are not made through TDS on port 1433.
- Hard-coded account key fails because it violates the stated no-secrets-in-image requirement.
- Service Bus routing fails because message routing is unrelated to creating a Cosmos DB SDK query client.
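As a rough sketch of that query path in Python, the code below combines the endpoint, an Azure Identity credential, and the database and container IDs. The environment variable names and the `c.tenant_id` property are assumptions made for illustration. The Azure SDK imports are deferred so the settings helper can be exercised without the SDKs installed.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CosmosSettings:
    endpoint: str      # account endpoint URI (not a key or connection string)
    database_id: str
    container_id: str


def load_cosmos_settings(env: Mapping[str, str]) -> CosmosSettings:
    """Read non-secret connection details; the variable names are hypothetical."""
    names = ("COSMOS_ENDPOINT", "COSMOS_DATABASE", "COSMOS_CONTAINER")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return CosmosSettings(*(env[name] for name in names))


def query_chunks(settings: CosmosSettings, tenant_id: str) -> list:
    # Deferred imports: requires the azure-cosmos and azure-identity packages.
    from azure.cosmos import CosmosClient
    from azure.identity import DefaultAzureCredential

    # The credential resolves to the app's managed identity when running in Azure.
    client = CosmosClient(settings.endpoint, credential=DefaultAzureCredential())
    container = client.get_database_client(settings.database_id).get_container_client(
        settings.container_id
    )
    items = container.query_items(
        query="SELECT * FROM c WHERE c.tenant_id = @tenant",  # assumed property name
        parameters=[{"name": "@tenant", "value": tenant_id}],
        enable_cross_partition_query=True,
    )
    return list(items)
```

In the container, `load_cosmos_settings(os.environ)` would supply the settings, so the image carries no secrets.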
Question 18
Topic: Connect to and Consume Azure Services
A containerized AI summarization API runs in Azure Container Apps. Its managed identity has the Azure Service Bus Data Sender role. The app must enqueue work items without storing connection strings.
Trace:
POST /summaries
read queueName from App Configuration
build https://orders.servicebus.windows.net/summarize/messages
POST JSON payload without an Authorization header
result: 401 Unauthorized
Which change supplies the missing SDK-based access pattern?
Options:
- A. Publish the payload with EventGridPublisherClient to the queue URL.
- B. Read a Key Vault secret and repeat the same unsigned REST call.
- C. Store the payload as a key by using the App Configuration SDK.
- D. Create a ServiceBusClient with DefaultAzureCredential and send to the queue.
Best answer: D
Explanation: For Service Bus access from an Azure-hosted AI service, the SDK-based pattern is to create a service-specific client using Microsoft Entra authentication, typically through DefaultAzureCredential or managed identity. The visible flow has the configuration lookup, queue name, and permission assignment, but it bypasses the Service Bus SDK and sends an unsigned REST request, so no bearer token is attached. Using ServiceBusClient with the fully qualified namespace and a queue sender lets the SDK handle token acquisition and message protocol details. App Configuration can provide nonsecret settings, but it is not the messaging boundary. The key takeaway is to use the target service SDK at the integration boundary, not a generic unauthenticated HTTP call.
- Event publishing fails because Event Grid is for event notifications, not sending commands directly to a Service Bus queue URL.
- Configuration storage fails because App Configuration stores settings, not transient work messages for queue processing.
- Secret retrieval fails because repeating the same unsigned REST call still lacks a valid Service Bus authorization token.
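A minimal Python sketch of the SDK-based pattern follows. It assumes the azure-servicebus and azure-identity packages. `split_queue_url` and `enqueue_work_item` are hypothetical helper names; the first only shows how the URL in the trace maps onto the namespace and queue name the SDK expects.

```python
import json
from urllib.parse import urlparse


def split_queue_url(url: str) -> tuple:
    """Return (fully_qualified_namespace, queue_name) from a REST-style queue URL."""
    parsed = urlparse(url)
    queue_name = parsed.path.strip("/").split("/")[0]
    return parsed.netloc, queue_name


def enqueue_work_item(namespace: str, queue_name: str, payload: dict) -> None:
    # Deferred imports: requires the azure-servicebus and azure-identity packages.
    from azure.identity import DefaultAzureCredential
    from azure.servicebus import ServiceBusClient, ServiceBusMessage

    credential = DefaultAzureCredential()  # uses the managed identity when hosted in Azure
    with ServiceBusClient(fully_qualified_namespace=namespace, credential=credential) as client:
        with client.get_queue_sender(queue_name=queue_name) as sender:
            # The SDK acquires the token and handles the messaging protocol.
            sender.send_messages(ServiceBusMessage(json.dumps(payload)))
```

For the URL in the trace, `split_queue_url` yields `orders.servicebus.windows.net` and `summarize`, which are the two values passed to `enqueue_work_item`.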
Question 19
Topic: Connect to and Consume Azure Services
A Service Bus topic receives AI workflow events from several tenants. Each message already includes application properties tenantId and eventType. A new audit subscription currently uses the default rule and the audit app discards messages for other tenants in code. Security requires that the audit app must not receive out-of-scope tenant messages, while existing publishers and other subscribers must continue working unchanged. What should you do?
Options:
- A. Change the producer to publish audit events to a separate topic.
- B. Dead-letter messages that the audit app is not allowed to process.
- C. Replace the audit subscription rule with a SQL filter on tenantId and eventType.
- D. Keep the default rule and filter unauthorized messages in the audit app.
Best answer: C
Explanation: Service Bus topic subscription filters are the right control when a single subscriber is receiving more messages than it should, and the producer already sends properties that can classify the messages. A SQL filter or correlation filter is evaluated by Service Bus before messages are made available on that subscription. This preserves the existing topic contract and does not affect other subscriptions, because each subscription has its own rules. Filtering in consumer code still delivers unauthorized tenant messages to the app, which violates the security requirement. Dead-lettering is for messages that cannot be processed, not for normal routing decisions. The key distinction is routing control at the subscription versus changing producer or consumer behavior.
- Producer split overbuilds the solution and risks disrupting existing publishers when the needed routing properties already exist.
- Consumer filtering fails the isolation requirement because unauthorized messages are still delivered to the audit app.
- Dead-letter handling is for failed or invalid messages, not legitimate messages that belong to other subscribers.
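One way to express the subscription rule change in Python is sketched below, assuming the azure-servicebus management client. The rule name `audit-scope`, the helper names, and any tenant or event-type values passed in are hypothetical.

```python
def build_audit_filter(tenant_id: str, event_types: list) -> str:
    """Build a SQL filter expression over the tenantId and eventType application properties."""
    if not event_types:
        raise ValueError("at least one event type is required")

    def quote(value: str) -> str:
        # SQL filter string literals escape a single quote by doubling it.
        return "'" + value.replace("'", "''") + "'"

    types = ", ".join(quote(t) for t in event_types)
    return f"tenantId = {quote(tenant_id)} AND eventType IN ({types})"


def replace_default_rule(namespace: str, topic: str, subscription: str, expression: str) -> None:
    # Deferred imports: requires the azure-servicebus and azure-identity packages.
    from azure.identity import DefaultAzureCredential
    from azure.servicebus.management import ServiceBusAdministrationClient, SqlRuleFilter

    with ServiceBusAdministrationClient(namespace, DefaultAzureCredential()) as admin:
        admin.create_rule(topic, subscription, "audit-scope", filter=SqlRuleFilter(expression))
        # Removing the catch-all default rule is what stops out-of-scope delivery.
        admin.delete_rule(topic, subscription, "$Default")
```

The order of the two rule operations is a judgment call. Creating the scoped rule first avoids a window in which the subscription receives nothing, but the default rule still matches everything until it is deleted.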
Question 20
Topic: Connect to and Consume Azure Services
A Python Azure Function in an AI ingestion API should generate embeddings when messages arrive in an Azure Service Bus queue. Messages remain active in ingest-requests, and the function never starts after deployment. You must remediate the deployed configuration without changing the function code or queue topology. Which TWO actions should you take? Select TWO.
Binding excerpt:
type: serviceBusTrigger
queueName: ingest-requests
connection: ServiceBusConnection
App settings:
FUNCTIONS_WORKER_RUNTIME=python
AzureWebJobsStorage=
SB_CONNECTION=Endpoint=sb://...
Startup log:
Service Bus connection setting 'ServiceBusConnection' was not found.
AzureWebJobsStorage is missing or empty.
Options:
- A. Change FUNCTIONS_WORKER_RUNTIME to dotnet.
- B. Rename the queue to ServiceBusConnection.
- C. Add an app setting named ServiceBusConnection.
- D. Add a Cosmos DB output binding.
- E. Increase the Service Bus trigger concurrency.
- F. Configure a valid AzureWebJobsStorage setting.
Correct answers: C, F
Explanation: For Azure Functions triggers, the binding’s connection property is the name of an application setting, not the queue name or an arbitrary alias. The deployed app has SB_CONNECTION, but the trigger is looking for ServiceBusConnection, so the listener cannot resolve the Service Bus connection. The host also reports that AzureWebJobsStorage is missing or empty, which prevents normal Functions host operation for this deployed trigger-based app. Fixing those two app settings addresses the startup failures shown in the log without changing code or the Service Bus queue. Scaling, output bindings, or worker runtime changes do not resolve missing configuration names.
- Queue rename confuses the queue entity with the app setting name referenced by the trigger binding.
- Concurrency tuning can affect throughput only after the listener starts successfully.
- Worker runtime change is inappropriate because the app is a Python function and the runtime is already set to python.
- Output binding does not fix a failed Service Bus trigger listener.
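The name-matching rule can be checked mechanically before deployment. The sketch below is a hypothetical pre-flight helper, not part of the Functions runtime. It covers only plain app settings, so an identity-based connection that uses a prefixed setting name would need extra handling.

```python
def missing_settings(binding: dict, app_settings: dict) -> list:
    """Return required app setting names that are absent or empty."""
    # The binding's "connection" value is the NAME of an app setting, not the queue.
    required = [binding["connection"], "AzureWebJobsStorage"]
    return [name for name in required if not app_settings.get(name)]


# Values taken from the binding excerpt and app settings in the question.
binding = {
    "type": "serviceBusTrigger",
    "queueName": "ingest-requests",
    "connection": "ServiceBusConnection",
}
deployed = {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "",
    "SB_CONNECTION": "Endpoint=sb://...",
}

print(missing_settings(binding, deployed))
```

The check reports both `ServiceBusConnection` and `AzureWebJobsStorage` for the deployed settings, which matches the two startup log lines.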
Question 21
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A RAG API running in Azure Container Apps uses Azure App Configuration for runtime settings. After operators enabled vector retrieval for production, users still receive answers without citations.
Evidence:
Startup: AppConfig label selector = prod
FeatureManagement:UseVectorContext = false
Rag:RetrievalMode = keyword
Trace: skipped vector retrieval; feature disabled
Recent App Configuration changes:
| Key | Label | Value |
|---|---|---|
| FeatureManagement:UseVectorContext | production | true |
| Rag:RetrievalMode | production | vector |
Which root cause best fits the evidence?
Options:
- A. The app is selecting a different App Configuration label.
- B. The vector index must be redesigned to store citations.
- C. The model endpoint lacks capacity for vector retrieval.
- D. The container image must be rebuilt with updated prompt code.
Best answer: A
Explanation: Azure App Configuration labels let the same key have environment-specific values. In this case, the application is successfully reading configuration, but it is selecting the prod label while the updated values were stored with the production label. The trace also shows that vector retrieval was skipped because the feature flag was disabled, so the failure is in configuration selection rather than RAG feature design or model behavior. The next fix would be to align the label used by the app with the label used for the production settings, or update the keys under the label the app actually selects.
- Vector redesign does not follow the trace because retrieval was never attempted.
- Image rebuild is not supported by the evidence because Azure App Configuration supplies runtime settings outside the image.
- Model capacity would typically show timeout, throttling, or latency symptoms, not a disabled feature flag.
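The label mismatch can be reproduced with a small in-memory model of labeled settings. This is an illustration only, not the App Configuration SDK. `STORE` and `select_by_label` are hypothetical, and the sketch assumes the values the app reads at startup are stored under the prod label.

```python
# Each setting is identified by (key, label); the same key can hold different values per label.
STORE = {
    ("FeatureManagement:UseVectorContext", "prod"): "false",
    ("Rag:RetrievalMode", "prod"): "keyword",
    ("FeatureManagement:UseVectorContext", "production"): "true",
    ("Rag:RetrievalMode", "production"): "vector",
}


def select_by_label(store: dict, label: str) -> dict:
    """Return only the settings whose label matches the selector exactly."""
    return {key: value for (key, lbl), value in store.items() if lbl == label}


app_view = select_by_label(STORE, "prod")             # what the running app selects
operator_view = select_by_label(STORE, "production")  # what operators changed

print(app_view["Rag:RetrievalMode"], operator_view["Rag:RetrievalMode"])
```

Label matching is exact, so a change made under production never reaches an app whose selector is prod.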
Question 22
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python FastAPI back end in Azure Container Apps must run semantic-search SQL against Azure Database for PostgreSQL. The deployment trace shows the request path below. Which change addresses the failure point in the flow?
Startup: DefaultAzureCredential created
Startup: PostgreSQL flexible server management client created
Request /ask:
list servers -> 200 OK
get database metadata -> 200 OK
run SELECT ... ORDER BY embedding <-> $1
error: AttributeError: no execute/cursor
Options:
- A. Grant the managed identity Reader on the PostgreSQL resource.
- B. Use the management client to open a server-level cursor.
- C. Use a PostgreSQL driver connection pool, such as asyncpg or psycopg.
- D. Publish the SQL command as an Event Grid custom event.
Best answer: C
Explanation: Azure Database for PostgreSQL is queried from application code by using standard PostgreSQL client libraries, such as asyncpg, psycopg, Npgsql, or JDBC. In a back-end service, the usual pattern is to create a bounded connection pool during startup and reuse pooled connections for SQL queries. The trace shows the app can call Azure Resource Manager operations, such as listing servers and reading database metadata, but then tries to execute SQL through that management client. Management clients do not expose database cursors or execute application queries. The missing link is a data-plane PostgreSQL driver connection to the database endpoint.
- More RBAC does not add SQL execution methods to a management client.
- Server-level cursor is not a capability exposed by the Azure resource management client.
- Event Grid routing is for event delivery, not synchronous SQL query execution against PostgreSQL.
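A hedged sketch of the data-plane path with psycopg 3 and its companion psycopg_pool package follows. The table and column names are borrowed from an earlier question on this page and are assumptions here, as is the `::vector` cast, which presumes the pgvector extension. A synchronous pool is used for brevity; an async FastAPI handler would more likely use the async variant.

```python
def to_vector_literal(embedding) -> str:
    """Format an embedding as pgvector text input, for example [0.1,0.2]."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


# Assumed schema: rag_chunks(chunk_id, content, embedding vector).
SEARCH_SQL = (
    "SELECT chunk_id, content FROM rag_chunks "
    "ORDER BY embedding <-> %s::vector LIMIT %s"
)


def open_pool(conninfo: str):
    # Deferred import: requires the psycopg and psycopg_pool packages.
    from psycopg_pool import ConnectionPool

    # Bounded pool, created once at startup and reused for every request.
    return ConnectionPool(conninfo, min_size=1, max_size=5)


def semantic_search(pool, embedding, limit: int = 5) -> list:
    """Run the similarity query on a pooled data-plane connection."""
    with pool.connection() as conn:
        cursor = conn.execute(SEARCH_SQL, (to_vector_literal(embedding), limit))
        return cursor.fetchall()
```

The management client can stay in the app for resource metadata, but SQL runs only through connections like the ones this pool hands out.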
Question 23
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
An Azure Container Apps API uses OpenTelemetry correlation. During an incident, only the /rag/answer operation returns 500; /health and /orders/status succeed. The active revision has no restarts. The on-call team must avoid unnecessary rollbacks and keep unaffected routes available.
Trace excerpt for one failed request:
| Span | Duration | Status |
|---|---|---|
| api /rag/answer | 2,140 ms | Error |
| redis lookup | 9 ms | OK |
| postgres vector_query | 2,005 ms | Timeout |
| cosmos read profile | 18 ms | OK |
Which decision best meets the requirement?
Options:
- A. Disable all API ingress until every dependency span succeeds.
- B. Roll back the Container Apps revision because the API span failed.
- C. Increase Container Apps replicas to remove the API errors.
- D. Alert on PostgreSQL dependency failures and keep the revision active.
Best answer: D
Explanation: Distributed traces should be read as causal chains, not just by the root span status. The root API span is Error because /rag/answer depends on postgres vector_query, which runs for 2,005 ms of the 2,140 ms request before timing out. Redis and Cosmos DB spans are OK, other routes and health checks succeed, and no container restarts are reported. That evidence isolates the problem to the PostgreSQL dependency path rather than the API revision or Container Apps runtime. A good control is to alert and triage on the dependency span and correlation ID while leaving the current revision available for healthy routes. Rolling back or disabling ingress would broaden the blast radius.
- Revision rollback treats the root API error as a container defect, but the trace points to a timed-out downstream span.
- Replica scaling sends more requests to the same failing dependency and does not address the causal timeout.
- Ingress shutdown blocks healthy operations and violates the requirement to keep unaffected routes available.
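The reasoning over the trace excerpt can be written as a short Python sketch. The `Span` type and both helper functions are hypothetical and do not use the OpenTelemetry API; the span data is copied from the excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    name: str
    duration_ms: int
    status: str


TRACE = [
    Span("api /rag/answer", 2140, "Error"),
    Span("redis lookup", 9, "OK"),
    Span("postgres vector_query", 2005, "Timeout"),
    Span("cosmos read profile", 18, "OK"),
]


def failing_dependencies(spans: list, root_name: str) -> list:
    """Return child spans whose status is not OK, ignoring the root span."""
    return [s for s in spans if s.name != root_name and s.status != "OK"]


def share_of_root(spans: list, root_name: str, span: Span) -> float:
    """Fraction of the root span's duration spent in the given child span."""
    root = next(s for s in spans if s.name == root_name)
    return span.duration_ms / root.duration_ms


culprits = failing_dependencies(TRACE, "api /rag/answer")
print([c.name for c in culprits])
```

Only the PostgreSQL span is flagged, and it accounts for roughly 94% of the request duration, so an alert keyed on that dependency targets the cause without touching the revision.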
Question 24
Topic: Develop Containerized Solutions on Azure
A team deployed a Python RAG API as a custom container to Azure App Service. The optimized release met the latency target in testing, but production p95 latency remains high.
Exhibit: Deployment evidence
Intended release:
Image: contoso.azurecr.io/rag-api:20260515.3
CACHE_TTL_SECONDS: 300
App Service diagnostics:
Configured image: contoso.azurecr.io/rag-api:latest
Resolved digest: sha256:0b7d...
CACHE_TTL_SECONDS: 0
Startup log: response cache disabled
Which action should the developer take first to improve efficiency without weakening deployment reliability?
Options:
- A. Enable Always On to reduce container cold-start latency.
- B. Scale up the App Service plan before changing configuration.
- C. Pin the intended image tag and set CACHE_TTL_SECONDS=300.
- D. Keep latest and bake the cache value into the image.
Best answer: C
Explanation: App Service custom container evidence should be checked against the intended release: the configured image, resolved image digest or tag, startup logs, and app settings exposed as environment variables. In this case, production is not using the intended immutable image tag and CACHE_TTL_SECONDS is 0, which the log confirms disables the response cache. Pinning the tested image tag and setting the App Service app setting to 300 directly applies the known performance change while avoiding the deployment risk of a mutable latest tag. Scaling resources would cost more without fixing the mismatch.
- Scaling first misses the visible configuration mismatch and may increase cost without enabling the tested optimization.
- Using latest weakens release reliability because the running digest may change without an explicit versioned deployment.
- Always On can help startup behavior, but the evidence points to disabled caching during normal request processing.
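Comparing the intended release with the running configuration is easy to automate. The helper below is a hypothetical sketch that uses the values from the exhibit; it is not an App Service API.

```python
def config_drift(intended: dict, actual: dict) -> dict:
    """Map each drifted key to an (intended, actual) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in intended.items()
        if actual.get(key) != value
    }


INTENDED = {
    "image": "contoso.azurecr.io/rag-api:20260515.3",
    "CACHE_TTL_SECONDS": "300",
}
RUNNING = {
    "image": "contoso.azurecr.io/rag-api:latest",
    "CACHE_TTL_SECONDS": "0",
}

print(config_drift(INTENDED, RUNNING))
```

Both keys are reported as drifted, which is why correcting the image tag and the app setting comes before any scaling decision.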
Public diagnostic page: a static diagnostic page is available for a one-pass self-check. Use IT Mastery for interactive practice with mixed sets, timed mocks, topic drills, explanations, and progress tracking.
What this AI-200 practice page gives you
- a direct web entry for AI-200 practice in IT Mastery
- 24 on-page sample questions selected from the live AI-200 practice bank
- a free diagnostic page across the AI-200 topic areas
- topic drills for containers, AI data services, Azure service integration, security, monitoring, and troubleshooting
- the same IT Mastery account across web and mobile
Who AI-200 is for
- developers building backend and AI-enabled applications on Microsoft Azure
- candidates moving from AZ-204-style Azure development into the current Azure AI cloud developer route
- teams that need practice around containers, Azure Functions, messaging, data services, security, monitoring, and troubleshooting
AI-200 exam snapshot
- Issuer: Microsoft
- Certification lane: Microsoft Certified: Azure AI Cloud Developer Associate
- Exam code: AI-200
- Practice support: public samples, a static diagnostic page, and live IT Mastery practice
- Current IT Mastery status: live practice available
Topic coverage for AI-200
| Domain | Weight |
|---|---|
| Develop Containerized Solutions on Azure | 23% |
| Develop AI Solutions by Using Azure Data Management Services | 29% |
| Connect to and Consume Azure Services | 24% |
| Secure, Monitor, and Troubleshoot Azure Solutions | 24% |
AI-200 application build map
Use this map to connect individual questions to the Azure AI cloud-developer decisions this practice page tests.
flowchart LR
S1["App requirement"] --> S2
S2["Choose compute boundary"] --> S3
S3["Connect AI and data services"] --> S4
S4["Secure identities and secrets"] --> S5
S5["Add observability and resilience"] --> S6
S6["Ship reviewed release"]
AI-200 readiness map
| Area | What strong readiness looks like |
|---|---|
| Containerized solutions | You can choose container app, registry, revision, scaling, identity, and deployment patterns from scenario evidence. |
| AI data services | You can match vector search, document storage, caching, relational data, and data-governance requirements to Azure services. |
| Azure service integration | You can choose queues, events, API boundaries, Functions, and workflow patterns that avoid brittle synchronous designs. |
| Security and operations | You can apply managed identity, Key Vault, App Configuration, telemetry, KQL, retry strategy, and troubleshooting evidence. |
Mini Glossary
- Azure Container Apps: Managed container platform for microservices, APIs, and event-driven workloads without direct Kubernetes cluster management.
- Embedding: Numeric representation used for semantic search and similarity matching.
- Managed identity: Azure identity feature that lets services authenticate without stored application secrets.
- Queue: Messaging pattern that decouples request intake from later processing.
- Vector search: Retrieval method that compares embeddings by similarity rather than exact keywords only.
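To make the Embedding and Vector search entries concrete, here is a minimal Python sketch of cosine similarity, one common way to compare embeddings. The function name and the toy vectors are illustrative. In pgvector, the `<=>` operator seen in some questions on this page returns cosine distance (1 minus this value), and `<->` returns Euclidean distance.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy two-dimensional "embeddings": same direction scores 1, orthogonal scores 0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))
```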
Free study resources
Use this IT Mastery page for live practice, topic drills, timed mocks, explanations, and app access.
Web preview and premium practice
- Web/public preview: a smaller web set so you can validate the question style and explanation depth.
- Premium: interactive web-app practice with focused drills, mixed sets, timed mock exams, detailed explanations, and progress tracking across web and mobile.
Good next pages after AI-200
- AI-103 if you are comparing Azure AI apps-and-agents development with AI cloud-development work
- AI-900 if you need Azure AI fundamentals first
- AZ-104 if your weak point is Azure identity, networking, storage, and operations context
- Microsoft Certification Practice Hub if you are comparing Azure, Fabric, security, Microsoft 365, Power Platform, Dynamics 365, GitHub, or Windows Server routes
Official sources
In this section
- Free AI-200 Practice Questions: Develop Containerized Solutions on Azure
  Practice 10 free Microsoft Azure AI Cloud Developer Associate (AI-200) questions on Develop Containerized Solutions on Azure, with answers, explanations, and the IT Mastery next step.
- Free AI-200 Practice Questions: Develop AI Solutions by Using Azure Data Management Services
  Practice 10 free Microsoft Azure AI Cloud Developer Associate (AI-200) questions on Develop AI Solutions by Using Azure Data Management Services, with answers, explanations, and the IT Mastery next step.
- Free AI-200 Practice Questions: Connect to and Consume Azure Services
  Practice 10 free Microsoft Azure AI Cloud Developer Associate (AI-200) questions on Connect to and Consume Azure Services, with answers, explanations, and the IT Mastery next step.
- Free AI-200 Practice Questions: Secure, Monitor, and Troubleshoot Azure Solutions
  Practice 10 free Microsoft Azure AI Cloud Developer Associate (AI-200) questions on Secure, Monitor, and Troubleshoot Azure Solutions, with answers, explanations, and the IT Mastery next step.
- Free AI-200 Practice Exam: Microsoft Azure AI Cloud Developer Associate
  Try 60 free Microsoft Azure AI Cloud Developer Associate (AI-200) questions across the exam domains, with explanations, then continue with IT Mastery practice.