Prepare for Microsoft Azure AI Cloud Developer Associate (AI-200) with a stable, objective-mapped IT Mastery bank, 24 public sample questions, a free 60-question diagnostic, and drills on Azure containers, data services, integration, security, monitoring, and troubleshooting.
Start with the free 60-question AI-200 diagnostic or the 24 public sample questions. See how the questions test Azure containers, serverless services, AI data services, service integration, identity, security, monitoring, and troubleshooting before you subscribe; IT Mastery then gives you a stable, objective-mapped AI-200 practice bank with 1,980 questions, timed mocks, topic drills, progress tracking, and detailed explanations across web and mobile.
Start a practice session for Microsoft Azure AI Cloud Developer Associate (AI-200) below, or open the full app in a new tab. For the best experience, open the full app in a new tab and navigate with swipes/gestures or the mouse wheel—just like on your phone or tablet.
A small set of questions is available for free preview. Subscribers can unlock full access by signing in with the same app-family account they use on web and mobile.
Prefer to practice on your phone or tablet? Download the IT Mastery – AWS, Azure, GCP & CompTIA exam prep app for iOS or IT Mastery app on Google Play (Android) and use the same IT Mastery account across web and mobile.
Free diagnostic: Try the AI-200 full-length practice exam before subscribing. Use it as a baseline for your Azure AI cloud-developer readiness, then return to IT Mastery for timed mocks, topic drills, explanations, and the full AI-200 question bank.
| Domain | Weight |
|---|---|
| Develop Containerized Solutions on Azure | 23% |
| Develop AI Solutions by Using Azure Data Management Services | 29% |
| Connect to and Consume Azure Services | 24% |
| Secure, Monitor, and Troubleshoot Azure Solutions | 24% |
Use this map to connect individual questions to the Azure AI cloud-developer decisions this practice page tests.
flowchart LR
S1["App requirement"] --> S2
S2["Choose compute boundary"] --> S3
S3["Connect AI and data services"] --> S4
S4["Secure identities and secrets"] --> S5
S5["Add observability and resilience"] --> S6
S6["Ship reviewed release"]
| Area | What strong readiness looks like |
|---|---|
| Containerized solutions | You can choose container app, registry, revision, scaling, identity, and deployment patterns from scenario evidence. |
| AI data services | You can match vector search, document storage, caching, relational data, and data-governance requirements to Azure services. |
| Azure service integration | You can choose queues, events, API boundaries, Functions, and workflow patterns that avoid brittle synchronous designs. |
| Security and operations | You can apply managed identity, Key Vault, App Configuration, telemetry, KQL, retry strategy, and troubleshooting evidence. |
Try these 24 original sample questions for Microsoft AI-200. They are selected from the live IT Mastery practice bank for self-assessment and are not official exam questions.
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python RAG API stores document chunks, embeddings, tenant_id, and source_system in Azure Database for PostgreSQL. Users from one tenant get low-confidence answers even though matching chunks exist. A trace for a failing request shows:
Request: tenant_id=contoso, source_system=policy
SQL executed:
SELECT chunk_id, tenant_id, source_system, content
FROM rag_chunks
ORDER BY embedding <=> :query_embedding
LIMIT 20;
App filter after SQL:
tenant_id=contoso AND source_system=policy
filtered_rows=1
Which retrieval pattern should you use to address this failure mode?
Options:
LIMIT and keep filtering in application code.
Best answer: D
Explanation: The trace shows the top-k vector search is being run across all chunks before tenant and source metadata is applied in the app. Because LIMIT happens before filtering, the best matching rows for the requested tenant and source may never be returned from PostgreSQL. For semantic retrieval with metadata constraints, store embeddings and source metadata together and push the metadata predicates into the PostgreSQL query. The query should restrict rows with predicates such as tenant_id = :tenant_id and source_system = :source, then rank that filtered candidate set with vector distance before applying LIMIT. Increasing top-k is only a workaround; the core issue is the ordering of metadata filtering and similarity ranking.
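The ordering problem described above can be reproduced without a database. The following is a minimal sketch, assuming hypothetical rows in which an integer `distance` stands in for the value PostgreSQL would compute with `embedding <=> :query_embedding`. The names `ROWS`, `rank_then_filter`, and `filter_then_rank` are illustrative and do not come from the question.

```python
from typing import Dict, List

# Query shape with the metadata predicates pushed into PostgreSQL,
# based on the traced statement plus the predicates from the explanation.
FILTERED_SQL = """
SELECT chunk_id, tenant_id, source_system, content
FROM rag_chunks
WHERE tenant_id = :tenant_id AND source_system = :source
ORDER BY embedding <=> :query_embedding
LIMIT 20;
"""

# Hypothetical data: 30 chunks from another tenant, 5 from the requested one.
ROWS: List[Dict] = [
    {"chunk_id": i, "tenant_id": "fabrikam", "source_system": "wiki", "distance": 2 * i}
    for i in range(1, 31)
] + [
    {"chunk_id": 100 + i, "tenant_id": "contoso", "source_system": "policy", "distance": d}
    for i, d in enumerate([5, 61, 63, 65, 67])
]


def matches(row: Dict, tenant_id: str, source_system: str) -> bool:
    return row["tenant_id"] == tenant_id and row["source_system"] == source_system


def rank_then_filter(rows: List[Dict], tenant_id: str, source_system: str, k: int) -> List[Dict]:
    """Traced behavior: top-k over all rows, metadata filter applied afterwards."""
    top_k = sorted(rows, key=lambda r: r["distance"])[:k]
    return [r for r in top_k if matches(r, tenant_id, source_system)]


def filter_then_rank(rows: List[Dict], tenant_id: str, source_system: str, k: int) -> List[Dict]:
    """Corrected behavior: restrict candidates first, then rank and limit."""
    candidates = [r for r in rows if matches(r, tenant_id, source_system)]
    return sorted(candidates, key=lambda r: r["distance"])[:k]
```

With this toy data, the first function keeps only one row for the requested tenant, while the second returns all five matching chunks in distance order.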
Topic: Develop Containerized Solutions on Azure
A team deploys a Python RAG API to Azure Container Apps from Azure Container Registry. The new revision never becomes ready.
Deployment path:
Build task -> ACR repository -> Container Apps revision -> Running replica
Build task: Succeeded; pushed rag-api:20260516
Revision: rag-api--k9r2 created; desired replicas: 1; ready: 0
Revision event: Image pull failed for contosoacr.azurecr.io/rag-api:20260516
Detail: dial tcp 10.4.2.5:443 i/o timeout
App logs: no stdout or stderr for rag-api--k9r2
Which evidence identifies the deployment failure point?
Options:
Best answer: B
Explanation: In a Container Apps deployment, the image must first be pulled from the registry before the container process can start and emit application logs. The build task succeeded, so the image was created and pushed. The revision was created, but the event shows the revision could not pull the image from ACR because the connection to the registry endpoint timed out. That points to registry reachability or related connectivity evidence between the Container Apps environment and ACR. Missing stdout is expected when the container never starts, and scaling evidence is secondary because provisioning failed first.
Topic: Connect to and Consume Azure Services
An AI document-processing API runs in Azure Container Apps and uses the Azure Service Bus SDK to enqueue one processing command per upload. During burst tests, p95 enqueue latency increases and Service Bus metrics show many short-lived AMQP connections from each replica. The current handler creates a ServiceBusClient and queue sender, sends one message, and closes them for every HTTP request. The app must continue using managed identity and the queue’s retry/dead-letter behavior. Which SDK-based access pattern best improves efficiency?
Options:
ServiceBusClient and sender per process.
Best answer: A
Explanation: Azure SDK clients that manage network connections should usually be created once per application instance and reused. In this scenario, the performance symptom is many short-lived AMQP connections caused by constructing and closing Service Bus SDK objects inside every request. A long-lived ServiceBusClient created with managed identity, plus a reused queue sender, reduces connection churn and lowers enqueue latency under burst load. This keeps the existing Service Bus queue semantics, including retry and dead-letter handling. Switching services or moving credentials does not address the connection lifecycle problem.
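The create-once, reuse-everywhere pattern can be sketched with a cached factory. This is an illustration only: `FakeQueueClient` is a hypothetical stand-in that counts how often a connection would be opened, and the namespace string is a placeholder. In a real service the factory would construct the Service Bus client with a managed-identity credential instead.

```python
from functools import lru_cache
from typing import List


class FakeQueueClient:
    """Hypothetical stand-in for an SDK client that opens a network connection."""

    connections_opened = 0

    def __init__(self, namespace: str) -> None:
        FakeQueueClient.connections_opened += 1  # one "connection" per instance
        self.namespace = namespace
        self.sent: List[str] = []

    def send(self, body: str) -> None:
        self.sent.append(body)


@lru_cache(maxsize=1)
def get_client() -> FakeQueueClient:
    """Build the client on first use, then return the same instance."""
    return FakeQueueClient("example.servicebus.windows.net")  # placeholder name


def handle_request(body: str) -> None:
    # The handler only sends; it never constructs or closes the client.
    get_client().send(body)
```

Calling `handle_request` many times still opens a single connection per process, which is the behavior the explanation recommends.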
Topic: Connect to and Consume Azure Services
An Azure Function processes embedding jobs from Azure Service Bus. After deployment, queue depth grows and the host logs show the trigger cannot start. The team wants the Functions host to manage the Service Bus listener and avoid a secret lookup in user code on each invocation. The function app’s managed identity can read the secret from Key Vault.
Exhibit:
Trigger type: Service Bus queue
Queue: embedding-jobs
Binding connection: ServiceBusIngest
Log: connection setting 'ServiceBusIngest' was not found
Which Function app application setting should you add?
Options:
A. SERVICEBUS_NAMESPACE = Service Bus fully qualified namespace
B. ServiceBusIngest = Key Vault reference to the Service Bus connection string
C. ServiceBusIngest = App Configuration endpoint URL
D. AzureWebJobsStorage = Service Bus connection string
Best answer: B
Explanation: For an Azure Functions binding, the connection property names the Function app application setting the host uses to connect to the target service. Here, the Service Bus trigger specifies ServiceBusIngest, so the deployed app needs an application setting with that exact name. Using a Key Vault reference lets the platform resolve the secret securely while the Functions host initializes the trigger and manages the listener efficiently. That avoids adding per-invocation secret retrieval or client setup in function code.
AzureWebJobsStorage is for the Functions runtime storage account, not the Service Bus queue trigger connection.
AzureWebJobsStorage is not the binding connection named in the trigger.
Topic: Connect to and Consume Azure Services
A team is building a Python Azure Functions API for AI document enrichment. Each enrichment job can run for several minutes. The API endpoint must return in under 2 seconds with a job ID, create a durable work item that one background worker can claim, support retries with poison-job handling, and avoid hard-coded service secrets. Which implementation should you use?
Options:
Best answer: C
Explanation: For long-running AI work, the Function-based API should accept the request, create a job ID, enqueue a command-style work item, and return 202 Accepted quickly. Azure Service Bus queues are designed for durable background work where one worker claims a message, failed processing can be retried, and poison messages can be moved to a dead-letter queue. A Service Bus-triggered Function then performs the enrichment asynchronously. Using an identity-based binding or secure configuration avoids embedding connection strings in code. Event Grid is better for event notification, and polling a database adds custom retry and poison-message logic.
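The accept, enqueue, and return-202 flow can be sketched with an in-memory queue. This is a simplified model, not Service Bus itself: `work_queue`, `dead_letter`, `MAX_DELIVERIES`, and both function names are hypothetical, and the retry logic only imitates what the queue service would do on the worker's behalf.

```python
import queue
import uuid
from typing import Any, Callable, Dict, List, Tuple

work_queue: "queue.Queue[Dict[str, Any]]" = queue.Queue()  # stand-in for the durable queue
dead_letter: List[Dict[str, Any]] = []                      # stand-in for the dead-letter queue
MAX_DELIVERIES = 3                                          # assumed delivery limit


def accept_job(document_url: str) -> Tuple[int, Dict[str, str]]:
    """API side: create a job ID, enqueue a command, and return immediately."""
    job_id = str(uuid.uuid4())
    work_queue.put({"job_id": job_id, "document_url": document_url, "deliveries": 0})
    return 202, {"job_id": job_id}


def process_next(handler: Callable[[Dict[str, Any]], None]) -> str:
    """Worker side: claim one message, retry on failure, dead-letter poison jobs."""
    message = work_queue.get_nowait()
    message["deliveries"] += 1
    try:
        handler(message)
        return "completed"
    except Exception:
        if message["deliveries"] >= MAX_DELIVERIES:
            dead_letter.append(message)
            return "dead-lettered"
        work_queue.put(message)  # make the message available again
        return "retry"
```

The API call never waits for enrichment to finish, and a job that keeps failing ends up in the dead-letter list instead of blocking the queue.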
Topic: Develop Containerized Solutions on Azure
An Azure Container Apps worker uses the Azure Service Bus SDK to process queue messages and then query Azure Cosmos DB for NoSQL. It scales with KEDA based on queue length. After a release, queue age increased, and the team suggests raising the replica limit. You must reduce delay without increasing Cosmos DB throttling.
Exhibit: Current observations
KEDA rule: Service Bus queue length, max replicas = 20
Current: 20 replicas, CPU 24%, memory 48%
Cosmos DB query p95: 1.9 seconds, cross-partition
Cosmos DB 429 retries: increased under load
Load test at 40 replicas: same throughput, more 429 retries
Which implementation should you choose?
Options:
Best answer: D
Explanation: KEDA is appropriate when more replicas can independently process more work. Here, the worker is already at the replica limit, CPU is low, and the slow step is a cross-partition Cosmos DB query with 429 retries. The load test confirms that extra replicas do not drain the queue faster and instead increase downstream throttling. The implementation should keep scaling at a safe level and fix the per-message bottleneck, such as query shape, partition filtering, indexing, request-unit capacity, or client-side concurrency. Scaling the container app is not a substitute for resolving a saturated downstream dependency.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
An Azure Container Apps-hosted Python RAG API started returning 500 errors immediately after a new revision was deployed. The revision is intended to use the production managed identity. You need to decide whether the cause is operational or design-related and choose the next step.
| Evidence source | Finding |
|---|---|
| KQL logs | SecretClient.get_secret returns 403 Forbidden for cosmos-key |
| OpenTelemetry trace | Failed span is KeyVault.GetSecret; no Cosmos DB span starts |
| Metrics | CPU, memory, and request volume are unchanged |
| Configuration | Expected identity is mi-rag-prod; new revision uses mi-rag-test |
What should you do next?
Options:
Best answer: B
Explanation: The combined evidence indicates an operational deployment/configuration issue, not a design flaw or capacity problem. The failure begins at KeyVault.GetSecret, the trace never reaches Cosmos DB, and platform metrics do not show increased load. The configuration also shows the new revision using mi-rag-test while the intended authorized identity is mi-rag-prod. The next step is to correct the revision’s managed identity configuration and redeploy, then validate that traces proceed past Key Vault to the downstream data call. Capacity or caching changes would optimize later parts of the request path, but this request is failing before retrieval starts.
Topic: Develop AI Solutions by Using Azure Data Management Services
A containerized RAG API on Azure Container Apps stores embeddings in Azure Database for PostgreSQL. The table uses the correct vector data type, metadata filters, and an appropriate vector index. KQL and database metrics show p95 similarity-search latency above target, CPU at 92%, memory pressure during queries, and storage I/O below 40%. Which resource adjustment is the best design fit?
Options:
Best answer: A
Explanation: Vector similarity search in Azure Database for PostgreSQL can be compute- and memory-intensive even when the schema and vector index are appropriate. In this scenario, the key evidence is high CPU and memory pressure while storage I/O is not saturated. The best resource adjustment is to scale the PostgreSQL server to a compute option with more vCores and memory, such as a larger memory-optimized SKU. That preserves the existing PostgreSQL vector workload and addresses the measured bottleneck directly. Storage expansion helps only when capacity or I/O is the limiting factor, and queuing or cache substitution does not fix slow synchronous database search.
Topic: Develop AI Solutions by Using Azure Data Management Services
An AI metadata API uses Azure Cosmos DB for NoSQL to retrieve document records before vector retrieval. A new case-insensitive category filter caused RU spikes. The team must reduce RU without removing tenant isolation or changing the filter semantics. The container partition key is /tenantId.
Query:
SELECT c.id, c.title
FROM c
WHERE c.tenantId = @tenantId
AND LOWER(c.category) = @category
AND c.embeddingModel = @model
Observed:
Cross-partition: false
Consistency: Session
Index note: /tenantId and /embeddingModel used
LOWER(c.category): evaluated after retrieval
Retrieved: 24,900 docs
Returned: 37 docs
Request charge: 910 RU
Which control should the developer implement?
Options:
LOWER(c.category).
normalizedCategory equality filter in the tenant-scoped query.
/category.
Best answer: C
Explanation: The evidence points to application query shape, not partitioning or consistency. The request is already single-partition because the query includes /tenantId, and Session consistency is not the source of the high RU charge. The large gap between retrieved and returned documents, combined with the note that LOWER(c.category) is evaluated after retrieval, shows that many candidate documents are read before the final filter is applied. Storing a normalized category value, such as lowercasing it at write time, and querying that indexed field with equality preserves case-insensitive behavior while keeping tenant isolation.
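A minimal sketch of the write-time normalization follows. The property name `normalizedCategory` comes from the answer option. The helper names are hypothetical, and the parameter list uses the name/value shape that the Cosmos DB Python SDK accepts for parameterized queries.

```python
from typing import Dict, List

# Same query as in the exhibit, with LOWER(c.category) replaced by an
# equality filter on the stored, already-lowercased property.
QUERY = (
    "SELECT c.id, c.title FROM c "
    "WHERE c.tenantId = @tenantId "
    "AND c.normalizedCategory = @category "
    "AND c.embeddingModel = @model"
)


def with_normalized_category(item: Dict) -> Dict:
    """Return a copy of the item with the lowercased category added at write time."""
    enriched = dict(item)
    enriched["normalizedCategory"] = item["category"].lower()
    return enriched


def build_parameters(tenant_id: str, category: str, model: str) -> List[Dict[str, str]]:
    """Lowercase the user-supplied category so the match stays case-insensitive."""
    return [
        {"name": "@tenantId", "value": tenant_id},
        {"name": "@category", "value": category.lower()},
        {"name": "@model", "value": model},
    ]
```

Because both the stored value and the query parameter are lowercased, the filter semantics stay case-insensitive while the predicate becomes a plain equality on a stored property.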
LOWER(c.category) predicate index-served.
Topic: Develop Containerized Solutions on Azure
A team deploys a custom container to Azure App Service. The same image must be promoted unchanged from staging to production. The Python app reads MODEL_ENDPOINT and POSTGRES_PASSWORD only from environment variables. Security requires that secrets are not stored in the image, Dockerfile, or repository. What should you configure to meet the requirement with the least operational risk?
Options:
ENV values to the Dockerfile before building.
Best answer: A
Explanation: For a custom container hosted in Azure App Service, application settings are injected into the running container as environment variables. This lets the same image move across environments while each App Service instance supplies its own runtime configuration. Nonsecret values such as MODEL_ENDPOINT can be stored directly as app settings. Secret values such as POSTGRES_PASSWORD should be referenced from Azure Key Vault, typically using a managed identity, so the secret is retrieved at runtime without being committed to source or baked into the image. Build-time variables or image files do not satisfy the requirement because they make configuration part of the artifact rather than the deployment environment.
ENV values fail because they bake environment-specific configuration into the container image.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A containerized Python API uses a model endpoint name, feature flags, retry counts, and an Azure Database for PostgreSQL password. Security requires least-privilege access to secrets and rotation without redeploying the container. Operations must update feature flags and retry counts without being granted secret permissions. Which design best meets these requirements?
Options:
Best answer: A
Explanation: Key Vault is the appropriate store for sensitive values such as passwords, API keys, and connection secrets because it supports controlled access, auditing, and rotation patterns. Azure App Configuration is better suited for nonsecret application settings such as feature flags, endpoint names, retry counts, and other runtime behavior flags. In this scenario, separating the stores lets security grant the app identity secret access without giving operations secret permissions, while still allowing operations to change nonsecret behavior safely. Baking values into images or using one store for everything either weakens secret handling or makes operational changes harder than required.
Topic: Develop AI Solutions by Using Azure Data Management Services
A chat retrieval API uses Azure Cosmos DB for NoSQL to store chunk metadata for an AI assistant. Users must see chunks they uploaded in their own next chat turn; global latest-order across users is not required.
Flow trace:
Upload -> write chunk item (partition key: tenantId)
Chat -> pass upload session token to read
Query -> same tenant:
WHERE c.tenantId = @tenant AND c.docType = @type
ORDER BY c.updatedAt DESC
Consistency: Session
Indexing:
included paths: /tenantId/?, /docType/?, /updatedAt/?
composite indexes: none
Metrics:
high RU, high retrieved documents, low index utilization
Which change is the most durable Cosmos DB optimization for this query path?
Options:
updatedAt from the indexing policy.
docType and updatedAt.
Best answer: C
Explanation: The durable optimization is to make the existing query use an index pattern that matches how it filters and sorts. The app already uses Session consistency with the upload session token, which supports the stated read-your-writes requirement. The visible failure point is the query plan evidence: high RU, high retrieved document count, low index utilization, and no composite index for the filter-plus-ORDER BY path. Increasing throughput may reduce throttling if throttling exists, but it does not reduce RU per query. Weakening consistency is not justified because the requirement depends on seeing the user’s own write.
ORDER BY updatedAt path less index-friendly, not more efficient.
Topic: Develop Containerized Solutions on Azure
A team is deploying a containerized Python RAG API for an Azure AI cloud solution. The image is already built and stored in Azure Container Registry. The platform team has provisioned the AKS cluster, namespace, node pools, networking, ingress controller, and ACR pull permissions. The app team must deploy each image version, set replicas and health probes, expose an internal endpoint, and keep deployment definitions in source control. Which design best fits?
Options:
Best answer: C
Explanation: This scenario is about application lifecycle within an existing AKS cluster, not administering the cluster. The platform prerequisites—namespace, node pools, networking, ingress, and ACR permissions—are already in place, and the app team only needs to manage resources inside the namespace. For AI-200, that points to manifest-based application management: a Deployment for image tag, replicas, probes, and environment references; a Service for internal exposure; and ConfigMap references for nonsecret settings. CI can apply versioned YAML manifests for controlled rollouts. Changing node pools, networking, identities, or creating a new cluster shifts into broader AKS administration and ignores the stated boundary.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A team monitors an Azure Container Apps-hosted Python API that serves RAG responses. The API uses Azure Cosmos DB for NoSQL and Azure Managed Redis. During a latency incident, all health probes remain healthy.
| Signal | Observation |
|---|---|
| Container Apps | CPU 35%, memory 50%, replica cap not reached |
| Cosmos DB | 0 throttles, p95 server latency 8 ms |
| Managed Redis | p95 server latency 2 ms, hit ratio 86% |
| App traces | p95 request 1,800 ms; client initialization spans run on each request |
Which change is most likely to improve performance while keeping the current services?
Options:
Best answer: D
Explanation: The evidence points to an application-level bottleneck, not a service-level bottleneck. Container Apps is not CPU or memory constrained, Cosmos DB has no throttling and low server latency, and Redis has low server latency with a healthy hit ratio. The slow part appears in application traces: client initialization spans run on each request and consume much of the request time. Reusing SDK clients for Cosmos DB and Redis per process avoids repeated connection setup and improves efficiency without changing service capacity or weakening reliability. Scaling replicas or increasing database throughput would target symptoms that are not present in the measurements.
Topic: Develop AI Solutions by Using Azure Data Management Services
An Azure Database for PostgreSQL-backed RAG service has finished generating 1,536-dimension embeddings. The prototype table stores each chunk in one column:
chunk_id uuid
payload text -- chunk text, source URI, metadata JSON, embedding array
Production queries must run vector similarity search, filter by tenant, document type, and date range, and return source citations. What is the best next implementation step?
Options:
payload during each query.
payload before storing embeddings.
vector(1536), typed filter/source columns, and jsonb metadata.
Best answer: D
Explanation: For PostgreSQL-backed RAG retrieval, the schema should separate data by how it will be used. Embeddings should be stored in a vector data type with the correct dimensionality so vector similarity operations and vector indexes can be applied. Frequently filtered attributes such as tenant, document type, and dates should be typed columns, not buried in text, so they remain easy to filter and index. Source references such as document ID, URI, and chunk position should also be explicit columns for reliable citations. jsonb is useful for flexible metadata that is not part of the primary filter path. Load data and tune indexes only after the table structure supports the required retrieval pattern.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A Python Azure Functions app uses a Service Bus trigger to process embedding-refresh messages and query an Azure Database for PostgreSQL vector store. After a deployment, message backlog and retrieval latency increased. You must fix the implementation without changing the database tier or vector query semantics.
Exhibit: Recent evidence
| Signal | Value |
|---|---|
| Function scale-out | 4 to 30 instances |
| PostgreSQL connections | 490/500 active |
| PostgreSQL CPU/memory | 35% / 48% |
| App traces | PoolTimeout, too many clients |
| Current code | New pool per invocation |
Which change should you implement?
Options:
Best answer: D
Explanation: The evidence points to connection pressure, not database compute pressure or vector-index inefficiency. PostgreSQL CPU and memory are moderate, but active connections are nearly exhausted and the app logs show pool timeouts and too many clients. Because the current code creates a new pool per invocation, scale-out multiplies open connections quickly. A module-level or worker-level pool with a bounded maximum, reused across invocations, preserves the same query behavior while preventing connection storms. The key is to align Function concurrency and pool sizing with the database connection capacity instead of trying to drain the queue by creating more simultaneous database clients.
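The module-level bounded pool can be illustrated with the standard library alone. This is a sketch under stated assumptions: `BoundedPool`, `POOL`, and `handle_message` are hypothetical names, the "connections" are plain objects, and the maximum of 5 is an arbitrary example. A real worker would more likely use the pool that ships with its PostgreSQL driver.

```python
import queue
import threading
from contextlib import contextmanager
from typing import Callable, Iterator


class BoundedPool:
    """Tiny illustrative pool: at most max_size connections ever exist."""

    def __init__(self, factory: Callable[[], object], max_size: int) -> None:
        self._factory = factory
        self._max_size = max_size
        self._idle: "queue.LifoQueue[object]" = queue.LifoQueue(maxsize=max_size)
        self._lock = threading.Lock()
        self.created = 0

    def _acquire(self, timeout: float) -> object:
        try:
            return self._idle.get_nowait()  # reuse an idle connection first
        except queue.Empty:
            pass
        with self._lock:
            if self.created < self._max_size:
                self.created += 1
                return self._factory()
        # Pool exhausted: wait for a release, or raise queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    @contextmanager
    def connection(self, timeout: float = 1.0) -> Iterator[object]:
        conn = self._acquire(timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


# Created once at import time and shared by every invocation in this worker.
POOL = BoundedPool(factory=object, max_size=5)


def handle_message(body: str) -> int:
    with POOL.connection() as conn:
        return id(conn)  # a real handler would run the vector query here
```

Sequential invocations reuse one connection, and concurrent ones can never open more than the configured maximum, so scale-out multiplies a small bounded number instead of an unbounded one.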
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python RAG API in Azure Container Apps must query items in an Azure Cosmos DB for NoSQL container by using the Azure Cosmos DB SDK. The container image must not contain secrets, and the app’s managed identity has Cosmos DB data-plane permissions. Which TWO connection details or client configurations are needed for the SDK query path? Select TWO.
Options:
CosmosClient
Correct answers: C, D
Explanation: An SDK query path for Azure Cosmos DB for NoSQL starts by constructing a CosmosClient for the target account. In this scenario, the app should supply the account endpoint and use an Azure Identity credential, such as DefaultAzureCredential, so the managed identity is used instead of embedding a secret. After the client is created, the code must resolve the database and container by their configured IDs, then run the query on the container client. Other Azure connection strings or messaging settings do not establish a Cosmos DB for NoSQL query path.
Topic: Connect to and Consume Azure Services
A containerized AI summarization API runs in Azure Container Apps. Its managed identity has the Azure Service Bus Data Sender role. The app must enqueue work items without storing connection strings.
Trace:
POST /summaries
read queueName from App Configuration
build https://orders.servicebus.windows.net/summarize/messages
POST JSON payload without an Authorization header
result: 401 Unauthorized
Which change supplies the missing SDK-based access pattern?
Options:
EventGridPublisherClient to the queue URL.
ServiceBusClient with DefaultAzureCredential and send to the queue.
Best answer: D
Explanation: For Service Bus access from an Azure-hosted AI service, the SDK-based pattern is to create a service-specific client using Microsoft Entra authentication, typically through DefaultAzureCredential or managed identity. The visible flow has the configuration lookup, queue name, and permission assignment, but it bypasses the Service Bus SDK and sends an unsigned REST request, so no bearer token is attached. Using ServiceBusClient with the fully qualified namespace and a queue sender lets the SDK handle token acquisition and message protocol details. App Configuration can provide nonsecret settings, but it is not the messaging boundary. The key takeaway is to use the target service SDK at the integration boundary, not a generic unauthenticated HTTP call.
Topic: Connect to and Consume Azure Services
A Service Bus topic receives AI workflow events from several tenants. Each message already includes application properties tenantId and eventType. A new audit subscription currently uses the default rule and the audit app discards messages for other tenants in code. Security requires that the audit app must not receive out-of-scope tenant messages, while existing publishers and other subscribers must continue working unchanged. What should you do?
Options:
tenantId and eventType.
Best answer: C
Explanation: Service Bus topic subscription filters are the right control when a single subscriber is receiving more messages than it should, and the producer already sends properties that can classify the messages. A SQL filter or correlation filter is evaluated by Service Bus before messages are made available on that subscription. This preserves the existing topic contract and does not affect other subscriptions, because each subscription has its own rules. Filtering in consumer code still delivers unauthorized tenant messages to the app, which violates the security requirement. Dead-lettering is for messages that cannot be processed, not for normal routing decisions. The key distinction is routing control at the subscription versus changing producer or consumer behavior.
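The difference between broker-side rules and consumer-side discarding can be modeled in a few lines. This is a toy simulation, not the Service Bus API: `Topic`, `match_all`, and `audit_rule` are hypothetical, and the values `contoso` and `audit` are made up for illustration. The rule corresponds roughly to a SQL filter expression such as `tenantId = 'contoso' AND eventType = 'audit'`.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]          # application properties only
Rule = Callable[[Message], bool]  # per-subscription filter


def match_all(message: Message) -> bool:
    """Models the default rule: every message is delivered."""
    return True


class Topic:
    """Toy topic that evaluates each subscription's rule before delivery."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}
        self.delivered: Dict[str, List[Message]] = {}

    def add_subscription(self, name: str, rule: Rule = match_all) -> None:
        self._rules[name] = rule
        self.delivered[name] = []

    def publish(self, message: Message) -> None:
        for name, rule in self._rules.items():
            if rule(message):  # filtered here, before the consumer sees anything
                self.delivered[name].append(message)


def audit_rule(message: Message) -> bool:
    # Hypothetical scope for the audit subscription.
    return message["tenantId"] == "contoso" and message["eventType"] == "audit"
```

Each subscription has its own rule, so narrowing the audit subscription leaves publishers and other subscribers untouched.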
Topic: Connect to and Consume Azure Services
A Python Azure Function in an AI ingestion API should generate embeddings when messages arrive in an Azure Service Bus queue. Messages remain active in ingest-requests, and the function never starts after deployment. You must remediate the deployed configuration without changing the function code or queue topology. Select TWO.
Binding excerpt:
type: serviceBusTrigger
queueName: ingest-requests
connection: ServiceBusConnection
App settings:
FUNCTIONS_WORKER_RUNTIME=python
AzureWebJobsStorage=
SB_CONNECTION=Endpoint=sb://...
Startup log:
Service Bus connection setting 'ServiceBusConnection' was not found.
AzureWebJobsStorage is missing or empty.
Options:
FUNCTIONS_WORKER_RUNTIME to dotnet.
ServiceBusConnection.
ServiceBusConnection.
AzureWebJobsStorage setting.
Correct answers: C, F
Explanation: For Azure Functions triggers, the binding’s connection property is the name of an application setting, not the queue name or an arbitrary alias. The deployed app has SB_CONNECTION, but the trigger is looking for ServiceBusConnection, so the listener cannot resolve the Service Bus connection. The host also reports that AzureWebJobsStorage is missing or empty, which prevents normal Functions host operation for this deployed trigger-based app. Fixing those two app settings addresses the startup failures shown in the log without changing code or the Service Bus queue. Scaling, output bindings, or worker runtime changes do not resolve missing configuration names.
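The name-matching rule can be checked mechanically against the exhibit. The helper below is a simplified sketch: `missing_settings` is a hypothetical function, it treats an empty value as missing, and it ignores identity-based connection settings.

```python
from typing import Dict, List

# Values copied from the exhibit; the connection string is elided there too.
BINDING = {
    "type": "serviceBusTrigger",
    "queueName": "ingest-requests",
    "connection": "ServiceBusConnection",
}
APP_SETTINGS = {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "",
    "SB_CONNECTION": "Endpoint=sb://...",
}


def missing_settings(binding: Dict[str, str], app_settings: Dict[str, str]) -> List[str]:
    """List the setting names the host needs that are absent or empty."""
    required = [binding["connection"], "AzureWebJobsStorage"]
    return [name for name in required if not app_settings.get(name)]
```

For the exhibit values this reports both `ServiceBusConnection` and `AzureWebJobsStorage`, which matches the two startup log messages.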
python.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
A RAG API running in Azure Container Apps uses Azure App Configuration for runtime settings. After operators enabled vector retrieval for production, users still receive answers without citations.
Evidence:
Startup: AppConfig label selector = prod
FeatureManagement:UseVectorContext = false
Rag:RetrievalMode = keyword
Trace: skipped vector retrieval; feature disabled
Recent App Configuration changes:
| Key | Label | Value |
|---|---|---|
| FeatureManagement:UseVectorContext | production | true |
| Rag:RetrievalMode | production | vector |
Which root cause best fits the evidence?
Options:
Best answer: A
Explanation: Azure App Configuration labels let the same key have environment-specific values. In this case, the application is successfully reading configuration, but it is selecting the prod label while the updated values were stored with the production label. The trace also shows that vector retrieval was skipped because the feature flag was disabled, so the failure is in configuration selection rather than RAG feature design or model behavior. The next fix would be to align the label used by the app with the label used for the production settings, or update the keys under the label the app actually selects.
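The label mismatch is easy to see when label selection is written out. In this sketch the `production` entries come from the change table, while storing the `false` and `keyword` values under a `prod` label is an assumption made so the example matches the startup evidence. `select_by_label` is a hypothetical helper, not the App Configuration SDK.

```python
from typing import Dict, List

SETTINGS: List[Dict[str, str]] = [
    # Values the operators changed (from the table).
    {"key": "FeatureManagement:UseVectorContext", "label": "production", "value": "true"},
    {"key": "Rag:RetrievalMode", "label": "production", "value": "vector"},
    # Assumed older values under the label the app actually selects.
    {"key": "FeatureManagement:UseVectorContext", "label": "prod", "value": "false"},
    {"key": "Rag:RetrievalMode", "label": "prod", "value": "keyword"},
]


def select_by_label(settings: List[Dict[str, str]], label: str) -> Dict[str, str]:
    """Return key/value pairs for exactly one label; labels match as plain strings."""
    return {s["key"]: s["value"] for s in settings if s["label"] == label}
```

Selecting `prod` yields the disabled feature flag even though the `production` values are correct, because the two labels are unrelated strings as far as selection is concerned.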
Topic: Develop AI Solutions by Using Azure Data Management Services
A Python FastAPI back end in Azure Container Apps must run semantic-search SQL against Azure Database for PostgreSQL. The deployment trace shows the request path below. Which change addresses the failure point in the flow?
Startup: DefaultAzureCredential created
Startup: PostgreSQL flexible server management client created
Request /ask:
list servers -> 200 OK
get database metadata -> 200 OK
run SELECT ... ORDER BY embedding <-> $1
error: AttributeError: no execute/cursor
Options:
Best answer: C
Explanation: Azure Database for PostgreSQL is queried from application code by using standard PostgreSQL client libraries, such as asyncpg, psycopg, Npgsql, or JDBC. In a back-end service, the usual pattern is to create a bounded connection pool during startup and reuse pooled connections for SQL queries. The trace shows the app can call Azure Resource Manager operations, such as listing servers and reading database metadata, but then tries to execute SQL through that management client. Management clients do not expose database cursors or execute application queries. The missing link is a data-plane PostgreSQL driver connection to the database endpoint.
Topic: Secure, Monitor, and Troubleshoot Azure Solutions
An Azure Container Apps API uses OpenTelemetry correlation. During an incident, only the /rag/answer operation returns 500; /health and /orders/status succeed. The active revision has no restarts. The on-call team must avoid unnecessary rollbacks and keep unaffected routes available.
Trace excerpt for one failed request:
| Span | Duration | Status |
|---|---|---|
| api /rag/answer | 2,140 ms | Error |
| redis lookup | 9 ms | OK |
| postgres vector_query | 2,005 ms | Timeout |
| cosmos read profile | 18 ms | OK |
Which decision best meets the requirement?
Options:
Best answer: D
Explanation: Distributed traces should be read as causal chains, not just by the root span status. The root API span is Error because /rag/answer depends on postgres vector_query, which times out for most of the request duration. Redis and Cosmos DB spans are OK, other routes and health checks succeed, and no container restarts are reported. That evidence isolates the problem to the PostgreSQL dependency path rather than the API revision or Container Apps runtime. A good control is to alert and triage on the dependency span and correlation ID while leaving the current revision available for healthy routes. Rolling back or disabling ingress would broaden the blast radius.
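Reading the trace as a causal chain can be expressed as a search for the deepest non-OK span. In this sketch the span names, durations, and statuses come from the trace excerpt. The parent links are an assumption, since the excerpt does not show span hierarchy, and `failing_leaf_spans` is a hypothetical helper.

```python
from typing import Dict, List, Optional

ROOT = "api /rag/answer"

# Parent links are assumed: every dependency call is a child of the API span.
SPANS: List[Dict[str, Optional[object]]] = [
    {"name": ROOT, "parent": None, "duration_ms": 2140, "status": "Error"},
    {"name": "redis lookup", "parent": ROOT, "duration_ms": 9, "status": "OK"},
    {"name": "postgres vector_query", "parent": ROOT, "duration_ms": 2005, "status": "Timeout"},
    {"name": "cosmos read profile", "parent": ROOT, "duration_ms": 18, "status": "OK"},
]


def failing_leaf_spans(spans: List[Dict[str, Optional[object]]]) -> List[str]:
    """Return non-OK spans that have no children, i.e. where the failure starts."""
    parents = {s["parent"] for s in spans}
    return [
        str(s["name"])
        for s in spans
        if s["status"] != "OK" and s["name"] not in parents
    ]
```

The root span is excluded because it only inherits the error. The result points at the dependency to alert and triage on.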
Topic: Develop Containerized Solutions on Azure
A team deployed a Python RAG API as a custom container to Azure App Service. The optimized release met the latency target in testing, but production p95 latency remains high.
Exhibit: Deployment evidence
Intended release:
Image: contoso.azurecr.io/rag-api:20260515.3
CACHE_TTL_SECONDS: 300
App Service diagnostics:
Configured image: contoso.azurecr.io/rag-api:latest
Resolved digest: sha256:0b7d...
CACHE_TTL_SECONDS: 0
Startup log: response cache disabled
Which action should the developer take first to improve efficiency without weakening deployment reliability?
Options:
CACHE_TTL_SECONDS=300.
latest and bake the cache value into the image.
Best answer: C
Explanation: App Service custom container evidence should be checked against the intended release: the configured image, resolved image digest or tag, startup logs, and app settings exposed as environment variables. In this case, production is not using the intended immutable image tag and CACHE_TTL_SECONDS is 0, which the log confirms disables the response cache. Pinning the tested image tag and setting the App Service app setting to 300 directly applies the known performance change while avoiding the deployment risk of a mutable latest tag. Scaling resources would cost more without fixing the mismatch.
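Checking deployed evidence against the intended release amounts to a small diff. The values below are copied from the exhibit. The `drift` helper is hypothetical and compares only the keys named in the intended release.

```python
from typing import Dict, Optional, Tuple

INTENDED = {
    "image": "contoso.azurecr.io/rag-api:20260515.3",
    "CACHE_TTL_SECONDS": "300",
}
OBSERVED = {
    "image": "contoso.azurecr.io/rag-api:latest",
    "CACHE_TTL_SECONDS": "0",
}


def drift(intended: Dict[str, str], observed: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatched key to (intended value, observed value)."""
    return {
        key: (value, observed.get(key))
        for key, value in intended.items()
        if observed.get(key) != value
    }
```

Both keys show up as drift here, which is why the first action is to correct the image tag and the app setting before considering scaling.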
| Cue | What to remember |
|---|---|
| Compute choice | Use Functions, container apps, app services, or managed compute based on scale, latency, image, deployment, and operations needs. |
| Integration | Use queues, events, and durable workflow boundaries for long-running AI steps or dependency isolation. |
| Data fit | Match semantic search, document data, cache, and relational needs to the right Azure data service. |
| Security | Use managed identity, least privilege, Key Vault, masking, and environment separation for AI-enabled apps. |
| Resilience | Plan for throttling, retries, timeouts, dependency telemetry, KQL investigation, and user-safe fallback behavior. |
Need concept review first? Read the AI-200 Cheat Sheet for compact concept review before returning to timed practice.