Browse Certification Practice Tests by Exam Family

Free AI-200 Full-Length Practice Exam: 60 Questions

Try 60 free AI-200 questions across the exam domains, with explanations, then continue with full IT Mastery practice.

This free full-length AI-200 practice exam includes 60 original IT Mastery questions across the exam domains.

These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.

Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.

Need concept review first? Read the Microsoft AI-200 cheat sheet for Azure AI Foundry, agents, retrieval, evaluation, responsible AI, deployment, and operations cues before starting another diagnostic.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try AI-200 on Web View full AI-200 practice page

Exam snapshot

  • Exam route: AI-200
  • Practice-set question count: 60
  • Time limit: 120 minutes
  • Practice style: mixed-domain diagnostic run with answer explanations

Full-length exam mix

DomainWeight
Develop Containerized Solutions on Azure23%
Develop AI Solutions by Using Azure Data Management Services29%
Connect to and Consume Azure Services24%
Secure, Monitor, and Troubleshoot Azure Solutions24%

Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and full practice.

Practice questions

Questions 1-25

Question 1

Topic: Develop Containerized Solutions on Azure

An Azure Container Apps Python RAG API was deployed as revision rag-api--v12. Traffic remains on the older revision, increasing latency during peak load. You need to make v12 available with the least wasted scaling or rebuild work. Interpret the startup evidence.

Exhibit: startup events and logs

Normal  Pulling   image acrprod.azurecr.io/rag-api:2026-05-15
Normal  Pulled    successfully pulled image
Normal  Created   created container rag-api
Warning BackOff   restarting failed container
app: RuntimeError: Missing required setting REDIS_ENDPOINT
app: startup aborted before listening on port 8080

Which action should you take?

Options:

  • A. Increase minimum replicas for rag-api--v12.

  • B. Configure the missing REDIS_ENDPOINT runtime setting.

  • C. Rebuild and push the container image.

  • D. Grant the app access to Azure Container Registry.

Best answer: B

Explanation: The startup evidence points to missing runtime configuration, not a pull or capacity problem. The image was pulled successfully and the container was created, so registry access and image availability are not the blockers. The BackOff follows an application startup abort, and the application log names the missing setting: REDIS_ENDPOINT. The efficient remediation is to provide that setting, commonly as an environment variable or secret reference, and let the corrected revision start. Scaling replicas would only create more failed starts, while rebuilding the image wastes time unless the image content is actually wrong.

The key takeaway is to read the event sequence first: pull failure, container crash, and missing configuration leave different evidence in startup logs.

  • More replicas fails because each replica would hit the same missing setting during startup.
  • Image rebuild fails because the image was already pulled and created successfully.
  • Registry access fails because a registry permission problem would appear before a successful pull.

Question 2

Topic: Connect to and Consume Azure Services

An Azure Functions HTTP API accepts requests to generate RAG summaries. The AI workflow can run for 1-4 minutes and traffic spikes 10x during business hours. Clients need an acknowledgment within 2 seconds after validation, and accepted work must be retried if a worker fails. Which design best improves API latency and throughput?

Options:

  • A. Start a background thread from the HTTP trigger

  • B. Enqueue a Service Bus message and return 202

  • C. Publish only an Event Grid notification for each request

  • D. Run the AI workflow synchronously in the HTTP trigger

Best answer: B

Explanation: For a Function-based API entry point, keep the HTTP-triggered function short: validate the request, persist a job/status record if needed, send a command-style message to Azure Service Bus, and return 202 Accepted. A Service Bus-triggered function can then run the longer AI workflow with retries, message completion, dead-letter handling, and scale behavior independent of API request latency. This improves throughput because HTTP workers are not held for 1-4 minute operations. It also preserves reliability when a worker fails. Background threads and synchronous processing tie long work to the request host lifecycle, while Event Grid is better for event notification than durable command processing for accepted jobs.

  • Synchronous processing keeps HTTP executions busy for minutes, so it misses the 2-second acknowledgment goal.
  • Background threads are not a reliable serverless handoff because host restarts or scale-in can interrupt work.
  • Event notification only does not provide the same command queue semantics, completion, and dead-letter handling needed for accepted jobs.

Question 3

Topic: Develop AI Solutions by Using Azure Data Management Services

A team is optimizing a Cosmos DB for NoSQL container used by a RAG API. Which TWO changes should reduce RU consumption for the shown access pattern without changing query results or weakening the stated freshness requirement? Select TWO.

Query:
SELECT TOP 20 c.id, c.title
FROM c
WHERE c.tenantId = @tenant
  AND c.docType = @docType
ORDER BY c.lastUpdated DESC

Indexing facts:
- included paths: /tenantId/?, /docType/?, /lastUpdated/?
- composite indexes: none
- excluded paths: /content/*, /embedding/*

Consistency facts:
- current account default: Strong
- required freshness: read-your-writes within a user session

Options:

  • A. Use Eventual consistency for all user reads.

  • B. Configure Session consistency for this workload.

  • C. Re-index /content/* and /embedding/* in the regular index.

  • D. Increase provisioned throughput for the container.

  • E. Add a composite index on tenantId, docType, and lastUpdated DESC.

  • F. Remove tenantId and docType from the index.

Correct answers: B and E

Explanation: The durable optimizations should match the documented access pattern. The query filters by tenantId and docType and sorts by lastUpdated, so a composite index aligned to those properties can reduce query work compared with separate single-property indexes. The current exclusions for content and embedding should not be reversed because the query does not filter or sort on those large fields. For consistency, Strong is more than the stated requirement. Session consistency still supports read-your-writes within a user session and can reduce the cost and latency associated with stronger consistency guarantees. The key is to optimize the index and consistency model based only on the facts provided.

  • Removing predicate indexes fails because the query uses tenantId and docType in its filter.
  • Eventual consistency fails because it does not provide the required read-your-writes behavior.
  • More throughput can reduce throttling, but it does not reduce RU consumed by this query.
  • Re-indexing large fields adds index maintenance for paths the shown query does not use.

Question 4

Topic: Develop Containerized Solutions on Azure

A team uses an Azure Container Registry Task to build a container for an AI inference API. The release gate can deploy only if the task built the image, applied the expected tag, and pushed it to the target registry.

Exhibit: ACR Task run summary

Run ID: ca-1842
Repository: rag-api
Expected tag: rag-api:2026.05.16.4
Build step: Succeeded
Tag step: myacr.azurecr.io/rag-api:2026.05.16.4
Push step: Failed
Error: denied: requested access to resource is denied
Image digest: not recorded
Run status: Failed

What should the release gate do?

Options:

  • A. Retag the previous image as the requested release tag.

  • B. Deploy from the build output because the image was tagged.

  • C. Block deployment, fix push authorization, and rerun the task.

  • D. Deploy the previous digest and record this run as successful.

Best answer: C

Explanation: An ACR Task result must show the full artifact lifecycle required by the release gate: successful build, expected tag, successful push, and a recorded registry digest. In this run, the build and tag steps completed, but the push failed with an authorization error and no digest was recorded. That means downstream services cannot reliably pull the expected rag-api:2026.05.16.4 artifact from Azure Container Registry. The control should prevent rollout of a non-existent or unverified image while allowing normal recovery: correct the push permission issue and rerun the task. Treating a local build, a reused tag, or a previous digest as equivalent would weaken release integrity.

  • Build-only evidence fails because a tagged local or intermediate image is not the registry artifact the deployment must pull.
  • Retagging the previous image risks deploying code that was not produced by this task run.
  • Using a previous digest may preserve service availability, but it cannot satisfy the requirement that this run produced the expected tag.

Question 5

Topic: Develop Containerized Solutions on Azure

A team uses Azure Container Registry to store images for a containerized RAG API. The CI/CD pipeline deploys each image to Azure Container Apps, where each deployment creates a revision. The team must trace any running revision back to the exact app version, Git commit, and pipeline run that built the image. Which tagging approach should the pipeline implement?

Options:

  • A. Use a unique tag like 1.8.3-<gitSha>-<runId> for each build.

  • B. Use only the minor version tag, such as 1.8.

  • C. Tag every successful build as latest before deployment.

  • D. Retag the promoted image as prod for production deployments.

Best answer: A

Explanation: Traceable container releases need a stable, unique identifier for each built image. A tag generated from the application version, source commit, and pipeline run ID lets the ACR image, CI run, and Azure Container Apps revision refer to the same artifact. The deployment should use that exact non-reused tag, and teams often also retain the image digest in deployment metadata or logs. Moving aliases such as latest or prod are convenient, but they become ambiguous after a later build repoints the tag. The key is to make each deployment reference a unique image identity, not just the current environment.

  • Mutable aliases such as latest can be repointed, so older revisions lose clear image traceability.
  • Environment tags such as prod show promotion state, not the exact build that produced the image.
  • Broad version tags such as 1.8 can collapse multiple patch builds and commits into one label.

Question 6

Topic: Develop AI Solutions by Using Azure Data Management Services

An AI document assistant stores embeddings in Azure Database for PostgreSQL with pgvector. Queries must keep tenant isolation and use:

WHERE tenant_id = @tenant ORDER BY embedding <=> @queryVector LIMIT 10

After the table grows to millions of rows, query plans show a sequential scan and sort. Ingestion must remain available during the change. Which indexing decision best reduces latency while preserving these requirements?

Options:

  • A. Build a concurrent HNSW index with vector_cosine_ops.

  • B. Build an IVFFlat index with vector_l2_ops.

  • C. Remove tenant_id filtering and authorize after retrieval.

  • D. Add only a B-tree index on tenant_id.

Best answer: A

Explanation: The query uses pgvector cosine distance through <=>, so the vector index must match that distance operation. An HNSW index with vector_cosine_ops is designed for approximate nearest-neighbor vector search and avoids the sequential scan and full sort pattern that appears as the embedding table grows. Building it concurrently addresses the operational requirement that ingestion remain available during the change. The tenant predicate should stay in the query to preserve isolation; depending on selectivity, a supporting tenant index or partitioning can also help, but it does not replace the vector similarity index. The key distinction is that the latency problem is caused by unindexed vector ranking, not by tenant filtering alone.

  • Wrong distance class fails because an L2 operator class does not match the cosine-distance query semantics.
  • Tenant-only indexing may reduce filtering cost but still leaves PostgreSQL to rank vectors without a suitable ANN index.
  • Post-retrieval authorization breaks tenant isolation because cross-tenant candidates are retrieved before filtering.

Question 7

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A containerized RAG API in Azure Container Apps calls a retriever service, Azure Cosmos DB for NoSQL, and an Azure Function through Service Bus. Intermittent failures occur only on one request path, and the team must identify the exact dependency call that caused a failed user request. Which observability configuration should you implement?

Options:

  • A. Increase stdout log verbosity for every container revision.

  • B. Run broad KQL error-text searches across all container logs.

  • C. Configure OpenTelemetry distributed tracing with propagated W3C trace context.

  • D. Create metric alerts for service-level 5xx rates and latency.

Best answer: C

Explanation: Trace-based root cause analysis is the right fit when the deciding evidence is the path of a single request across services. OpenTelemetry distributed tracing creates spans for each service and dependency call, then propagates a trace context across HTTP, messaging, and function boundaries. Exporting those spans to Azure Monitor or Application Insights lets you follow one failed request from the API through the retriever, Cosmos DB, and Service Bus-triggered function to find the failing span. Broad log searches can show that errors happened, but they do not reliably reconstruct the causal request path without correlation context. The key distinction is correlation-first tracing, not more unstructured logs.

  • Broad log search can find matching error text, but it does not prove which call in one request path failed.
  • Verbose stdout logs increase volume, but they still lack end-to-end span relationships unless trace context is propagated.
  • Metric alerts show aggregate symptoms such as 5xx or latency, not the dependency-level cause for a specific request.

Question 8

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A Python RAG API runs in Azure Container Apps. Each request calls Azure Managed Redis for cache lookup, Azure Cosmos DB for NoSQL vector search, and Azure Service Bus for enrichment work. Security requires end-to-end request flow, downstream-call latency, and failure context, but prompts and retrieved document text must not be stored in telemetry. Which instrumentation design is the best fit?

Options:

  • A. Use OpenTelemetry traces with propagated context and redacted span attributes

  • B. Rely on Service Bus dead-letter logs for request diagnostics

  • C. Log prompts and retrieved documents as custom telemetry events

  • D. Collect only Container Apps replica, CPU, and memory metrics

Best answer: A

Explanation: The best fit is distributed tracing with the OpenTelemetry SDK. Instrument the inbound API request and each downstream dependency call as spans, propagate trace context across Redis, Cosmos DB, and Service Bus operations, and record safe attributes such as operation name, duration, status code, exception type, and dependency target. Sensitive inputs and outputs, such as prompts, completions, embeddings source text, or retrieved document content, should be excluded or redacted before export to monitoring backends such as Azure Monitor. This captures the request path and failure context without turning telemetry into a data-leak channel.

  • Payload logging fails because storing prompts or retrieved text violates the security constraint.
  • Host metrics only can show resource pressure, but not request flow or downstream dependency latency.
  • Dead-letter logs help diagnose failed messages, but they do not trace the full API request path.

Question 9

Topic: Develop AI Solutions by Using Azure Data Management Services

A RAG API stores embeddings in Azure Database for PostgreSQL and runs pgvector similarity searches with metadata filters. During peak traffic, only the retrieval step is slow.

Exhibit: Recent PostgreSQL metrics

SignalObservation
CPU utilization32% average, 45% peak
Memory utilization91% average
Temp file writesIncrease during vector queries
Storage latency3 ms average

Which configuration decision should you make first?

Options:

  • A. Increase provisioned storage IOPS for the database

  • B. Add more vCores to the same compute family

  • C. Scale to a PostgreSQL compute size with more memory

  • D. Move embeddings to Azure Managed Redis as the source of truth

Best answer: C

Explanation: The metrics point to memory pressure in PostgreSQL. CPU is not saturated, and storage latency is low, so adding vCores or storage IOPS would not address the main bottleneck shown. Temp file writes during vector queries often indicate that operations cannot stay in memory and must spill to disk, which increases retrieval latency. For a pgvector workload, the first configuration move is to use a compute size or tier with more available memory before changing unrelated resources.

The key takeaway is to match the scaling action to the constrained resource shown by metrics, not just to the slow symptom.

  • Storage IOPS is not supported by the evidence because observed storage latency is low.
  • More vCores misses the bottleneck because CPU utilization remains well below saturation.
  • Redis source of truth changes the data architecture and does not address the PostgreSQL memory constraint shown.

Question 10

Topic: Develop AI Solutions by Using Azure Data Management Services

A Python RAG API stores chunk embeddings in Azure Database for PostgreSQL with pgvector. The team must keep cosine-distance ordering and the active-row filter; approximate nearest-neighbor retrieval is acceptable.

Exhibit: Current query pattern

CREATE TABLE chunks (
  id uuid PRIMARY KEY,
  is_active boolean NOT NULL,
  embedding vector(1536) NOT NULL,
  content text NOT NULL
);

-- 70% of rows have is_active = false.
-- Every retrieval query includes this filter.
SELECT id, content
FROM chunks
WHERE is_active = true
ORDER BY embedding <=> $1
LIMIT 10;

Which implementation should you choose?

Options:

  • A. Create a partial HNSW vector_cosine_ops index for active rows.

  • B. Create a B-tree index on the embedding column.

  • C. Create a full HNSW vector_l2_ops index on embeddings.

  • D. Scale compute only and keep the sequential scan.

Best answer: A

Explanation: For pgvector, the index strategy should follow the actual query pattern and provided selectivity facts. The query orders by <=>, which is cosine distance, so the vector index should use vector_cosine_ops. Because every retrieval filters is_active = true and 70% of rows are inactive, a partial HNSW index with the same predicate avoids maintaining and searching archived embeddings. Approximate nearest-neighbor retrieval is allowed, so HNSW fits the latency and CPU goal while preserving the required distance behavior. Scaling compute may mask the symptom, but it does not use the available query and data facts.

  • Wrong distance class fails because L2 indexing does not match the cosine-distance ordering used by <=>.
  • B-tree on embeddings fails because it does not support pgvector nearest-neighbor ordering.
  • Compute-only scaling ignores the active-row filter and keeps paying for broad scans.

Question 11

Topic: Develop AI Solutions by Using Azure Data Management Services

A team is building a containerized Python API for a RAG feature. The API must store customers, contracts, and document chunks in normalized tables, apply transactional updates when a contract changes, and run vector similarity searches over embeddings with SQL metadata filters such as tenant and contract status. The team wants one durable Azure data service for this data layer. Which implementation best preserves these constraints?

Options:

  • A. Use Azure Managed Redis as the authoritative vector store.

  • B. Use Azure Cosmos DB for NoSQL containers for all records.

  • C. Use Azure Database for PostgreSQL with vector-enabled columns and indexes.

  • D. Publish contract changes to Azure Service Bus topics only.

Best answer: C

Explanation: Azure Database for PostgreSQL is the best fit when the AI workload needs both relational modeling and vector search support. In this scenario, the application must keep normalized customer, contract, and document tables, use transactional updates, and combine embedding similarity with SQL metadata filters. PostgreSQL can support that pattern by storing embeddings alongside relational data and indexing them for vector similarity queries. It also allows the application to keep familiar SQL joins, constraints, and transactional behavior in one durable database. Cosmos DB for NoSQL and Azure Managed Redis can support AI-adjacent retrieval patterns, but they do not best satisfy the stated relational and transactional requirements as the primary store.

  • NoSQL document storage can support flexible documents and vector workloads, but it does not preserve the normalized relational SQL model required here.
  • Redis as source of truth is a poor fit because Redis is primarily used for caching or low-latency lookup, not durable relational transactions.
  • Service Bus topics coordinate message processing, but they do not provide queryable relational storage or vector similarity search.

Question 12

Topic: Develop AI Solutions by Using Azure Data Management Services

A team is building a RAG API for support articles. The app must use Azure Database for PostgreSQL for retrieval storage only, while an Azure-hosted generative model produces the final answer. Retrieval must support semantic similarity plus tenant and product metadata filters. Which TWO design choices meet the requirement? Select TWO.

Options:

  • A. Run vector search inside the generative model endpoint.

  • B. Store chunks, embeddings, and metadata in PostgreSQL.

  • C. Send the user question directly to PostgreSQL for synthesis.

  • D. Store only completed prompt-and-answer pairs in PostgreSQL.

  • E. Use PostgreSQL SQL functions to compose the final response.

  • F. Pass retrieved chunks to the model as grounding context.

Correct answers: B and F

Explanation: In this RAG pattern, Azure Database for PostgreSQL is the vector store and retrieval engine. It stores document chunks, embedding vectors, and metadata such as tenant and product so the application can run similarity search with filters. PostgreSQL returns relevant grounding content, but it does not create the natural-language answer. The application then sends the user question plus retrieved context to the generative model, which produces the response. A cache of completed answers or a SQL-generated response changes the role and does not satisfy the stated separation.

  • SQL-generated response fails because the requirement keeps natural-language answer generation outside PostgreSQL.
  • Completed answer storage may be useful for logging or caching, but it is not vector retrieval with metadata filters.
  • Model-side vector search fails because PostgreSQL would no longer be the vector store used for retrieval.

Question 13

Topic: Develop AI Solutions by Using Azure Data Management Services

A RAG API uses Azure Database for PostgreSQL to store document chunks with embedding, tenant_id, classification, and expires_at metadata. The API must return the most semantically relevant chunks, but only from the requesting tenant and allowed classifications. Which retrieval control should the API implement?

Options:

  • A. Embed tenant and classification text, then rank only by vector distance

  • B. Rank all chunks by vector distance, then filter results in the API

  • C. Filter metadata in SQL, then rank eligible rows by vector distance

  • D. Apply metadata filters only and skip vector similarity ranking

Best answer: C

Explanation: Metadata filtering and vector similarity ranking solve different parts of a RAG retrieval path. Metadata filters are control predicates: they determine which rows are eligible based on facts such as tenant, classification, expiration, or access scope. Vector similarity ranking is relevance logic: it orders the eligible rows by distance between the query embedding and stored chunk embeddings. For this scenario, the safe pattern is to include the tenant and classification constraints in the database query, then apply vector ordering and limits to that filtered candidate set. This preserves isolation while still returning the best semantically relevant allowed chunks.

  • Post-filtering top results can return too few valid chunks and may process unauthorized candidates before enforcement.
  • Embedding policy facts does not enforce exact tenant or classification boundaries; similarity is not an access-control mechanism.
  • Metadata-only retrieval preserves eligibility but loses semantic ranking, which is required for the RAG path.

Question 14

Topic: Develop AI Solutions by Using Azure Data Management Services

An Azure Container Apps API uses the Azure Cosmos DB for NoSQL SDK to fetch RAG metadata. The container is partitioned by /tenantId, and every request includes exactly one tenant ID. Under load, p95 latency increases and logs show a new SDK client is created inside each request handler. The team wants lower connection overhead and fewer RU-wasting fan-out queries. Which SDK setup should it use?

Options:

  • A. Reuse one client but omit the partition key for tenant queries.

  • B. Use one shared client with endpoint, managed identity, container IDs, and tenant partition key.

  • C. Create a new client per request using the account key from Key Vault.

  • D. Provide only database and container names so the SDK discovers the account.

Best answer: B

Explanation: An SDK query path for Azure Cosmos DB for NoSQL needs the account endpoint, an authentication credential such as managed identity, and the database and container identifiers. The CosmosClient is designed to be reused because it manages connections internally; creating it per request adds avoidable connection setup and resource pressure. Because each query is scoped to one /tenantId, passing that value as the request partition key lets the SDK route the query to the relevant logical partition instead of performing a cross-partition fan-out. The key performance idea is to combine correct connection information with a long-lived client and partition-aware query options.

  • Fresh clients add connection setup overhead and can increase socket/resource pressure under load.
  • Cross-partition queries waste RUs when the tenant partition key is already known.
  • Name-only setup is incomplete because the SDK still needs an account endpoint and credential.

Question 15

Topic: Develop AI Solutions by Using Azure Data Management Services

A team is building a RAG-backed product assistant hosted in Azure Container Apps. Azure Database for PostgreSQL stores document chunks with tenant_id, is_active, category, updated_at, and an embedding column. Each request must return the 10 chunks most semantically similar to the user’s question, but only for the caller’s tenant, active documents, and the selected category. The API should avoid retrieving broad results and filtering them in memory. Which query design is the best fit?

Options:

  • A. Apply only SQL predicates and sort by updated_at.

  • B. Publish requests to Event Grid before querying PostgreSQL.

  • C. Apply SQL predicates and vector similarity ranking in PostgreSQL.

  • D. Run vector similarity first, then filter rows in the API.

Best answer: C

Explanation: This requirement needs both structured filtering and vector search. The tenant, active-state, and category constraints are exact metadata conditions, so they belong in SQL predicates such as WHERE tenant_id = ... AND is_active = true. The “most semantically similar” requirement needs vector similarity ranking over the embedding column, typically combined with ORDER BY using the vector distance operator and LIMIT 10. Keeping both steps in PostgreSQL reduces unnecessary row transfer and avoids application-side filtering that can miss the best results after filters are applied. The key distinction is that SQL filters narrow eligible rows, while vector search ranks those eligible rows by semantic similarity.

  • SQL only misses the semantic similarity requirement because recency or metadata sorting cannot find meaningfully related chunks.
  • App-side filtering wastes retrieval work and can produce poor top-10 results after excluded rows are removed.
  • Event Grid routing is for event notification workflows, not for satisfying an interactive PostgreSQL query requirement.

Question 16

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps RAG API writes OpenTelemetry request telemetry to a Log Analytics workspace. During a deployment on March 15, 2026, the incident question is: Which revision generated the most HTTP 5xx responses from 09:00 through 09:30 UTC? A developer runs this query and blames rev-blue based on the result. What is the best next troubleshooting step?

AppRequests
| where TimeGenerated > ago(24h)
| where AppRoleName == "rag-api"
| where Success == false
| summarize Failures=count() by Revision=tostring(Properties["revision"])
| top 1 by Failures

Options:

  • A. Query trace messages instead of request telemetry.

  • B. Scale down rev-blue and monitor new failures.

  • C. Increase telemetry sampling and rerun the same query.

  • D. Rerun KQL with the deployment window and 5xx filter.

Best answer: D

Explanation: KQL troubleshooting starts by checking whether the query matches the question being asked. Here, the incident asks for HTTP 5xx responses only during 09:00-09:30 UTC, but the query uses a 24-hour window and Success == false, which can include failures outside the deployment window and non-5xx responses. The next step is to rerun the query with the precise TimeGenerated range and a ResultCode filter for 500-599, while keeping the revision aggregation. Operational action should wait until the evidence matches the incident scope.

  • Scaling first acts on an unverified conclusion and could disrupt traffic unnecessarily.
  • Changing sampling does not fix a misleading query scope or failure definition.
  • Using traces may help root-cause details later, but request telemetry is the right source for HTTP status counts.

Question 17

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A Python worker in Azure Container Apps consumes Azure Service Bus queue messages, writes embeddings to Azure Cosmos DB for NoSQL, and uses dead-letter handling for poison messages. After 429 responses caused a queue backlog, the team increased RU/s and narrowed the indexing policy. The team must validate recovery without purging messages or masking failures. Which evidence best confirms the fix?

Options:

  • A. Container CPU stays below 50% during the next run.

  • B. 429s drop, RU stabilizes, and active messages drain without dead-letter spikes.

  • C. The queue becomes empty after purging all active messages.

  • D. Retries are disabled and no dead-letter messages appear.

Best answer: B

Explanation: Validation evidence should prove that the original failure mode is resolved and that normal safeguards still work. In this case, the root cause was Cosmos DB RU pressure shown by 429 responses, and the user-visible effect was a Service Bus backlog. Good evidence therefore combines data-service metrics with messaging metrics: fewer 429s, RU consumption within available capacity, active messages decreasing, and no abnormal increase in dead-lettered messages. That confirms the system is processing legitimate work again without hiding poison messages or deleting work from the queue.

Evidence based only on an empty queue, disabled retries, or low CPU can hide the real problem or measure the wrong component.

  • Purging the queue removes evidence of recovery and can discard legitimate work instead of proving processing succeeded.
  • Disabling retries can suppress failure signals and prevents normal dead-letter handling from protecting the workflow.
  • Low container CPU does not validate the Cosmos DB RU fix or prove that Service Bus backlog is draining.

Question 18

Topic: Connect to and Consume Azure Services

An Azure Function app with a Service Bus trigger is configured to mount a ZIP package from the storage URI in WEBSITE_RUN_FROM_PACKAGE. A pipeline deployed a new artifact, but message processing still uses the old parsing logic.

Expected artifact: ingest-worker-20260515.4.zip
ZipDeploy result: 202 Accepted
App settings:
  WEBSITE_RUN_FROM_PACKAGE = package ending in ingest-worker-20260510.1.zip
  ServiceBusConnection = Key Vault reference for prod namespace
Host startup:
  Mounted package: ingest-worker-20260510.1.zip
  Connected namespace: prod Service Bus

Which implementation should you perform to run the intended package while preserving the prod Service Bus connection?

Options:

  • A. Change ServiceBusConnection to a new queue-specific setting.

  • B. Redeploy with ZipDeploy and leave WEBSITE_RUN_FROM_PACKAGE unchanged.

  • C. Point AzureWebJobsStorage to the package storage account.

  • D. Point WEBSITE_RUN_FROM_PACKAGE to the new ZIP and restart the app.

Best answer: D

Explanation: For a Function app configured to run from a package URI, the runtime mounts the ZIP indicated by WEBSITE_RUN_FROM_PACKAGE when the host starts. A successful ZipDeploy response does not prove that the host is executing the new artifact. The evidence names the old ZIP in both the app setting and the startup log, while the Service Bus evidence already shows the prod namespace. The implementation should change only the package pointer to the expected artifact and restart so the host remounts it. This preserves the existing trigger and connection configuration.

  • ZipDeploy-only retry leaves the app pointing at the old package URI shown in configuration.
  • Connection change treats the issue as a namespace problem, but the log already shows prod Service Bus is connected.
  • Storage setting change affects Functions host storage, not which ZIP package is mounted for execution.

Question 19

Topic: Develop Containerized Solutions on Azure

A team is deploying a containerized Python AI back-end to Azure Container Apps. The security requirement is that images remain private and are distributed to Azure runtimes without using a public registry account.

Deployment path:

CI build
  -> tag image ai-api:2.4.1
  -> push to image store
  -> Container Apps revision pulls image

Which image store should be used at the missing step?

Options:

  • A. Azure Blob Storage container

  • B. Azure Managed Redis cache

  • C. Private Docker Hub repository

  • D. Azure Container Registry

Best answer: D

Explanation: Azure Container Registry is the Azure-native service for storing, versioning, and distributing private container images. In this flow, the build produces a tagged container image that an Azure Container Apps revision must pull privately. ACR fits that missing step because it is designed for container image repositories and integrates with Azure deployment services and Azure identity-based access patterns. A private third-party registry can be private, but it does not provide the same Azure-integrated image distribution path. Blob Storage and Redis are not container image registries.

  • Third-party registry can store private images, but it adds external registry integration instead of the required Azure-integrated distribution path.
  • Blob Storage stores objects, not container image repositories with tags and manifests for runtime pulls.
  • Managed Redis is for caching and low-latency data access, not durable container image storage.

Question 20

Topic: Connect to and Consume Azure Services

An AI chat backend uses Azure Functions. The web client must call one function directly over HTTPS to start a prompt workflow and receive an immediate HTTP response. A separate function must run only when a message is added to the Azure Service Bus queue embedding-jobs. Which triggers should you use? Select TWO.

Options:

  • A. HTTP trigger for the prompt-start function

  • B. Cosmos DB trigger for the prompt-start function

  • C. Service Bus queue trigger for the embedding job function

  • D. Timer trigger for the embedding job function

  • E. Azure Queue Storage trigger for the embedding job function

  • F. Event Grid trigger for the embedding job function

Correct answers: A and C

Explanation: Azure Functions triggers determine what starts a function. A direct client-call API should use an HTTP trigger so the function exposes an HTTPS endpoint and can return an HTTP response, such as 202 Accepted. Work that must run from an Azure Service Bus queue should use a Service Bus queue trigger because the runtime listens to that queue and invokes the function for incoming messages. Other triggers may still run similar code, but they do not match the explicit invocation sources in the scenario. Match the trigger to the source event, not just to the processing logic.

  • Event Grid mismatch fails because Event Grid handles event notifications, not command-style processing from the specified Service Bus queue.
  • Timer mismatch fails because a schedule would not react to each arriving queue message.
  • Cosmos DB mismatch fails because it reacts to data changes, not direct HTTPS client calls.
  • Storage Queue mismatch fails because the queue source is Azure Service Bus, not Azure Queue Storage.

Question 21

Topic: Develop AI Solutions by Using Azure Data Management Services

A team is building a product-support API hosted in Azure Container Apps. Support articles are stored in Azure Cosmos DB for NoSQL with tenantId, category, productVersion, and an embedding generated during ingestion. Users submit natural-language questions, and results must be semantically relevant while staying within the user’s tenant and optional category. Which design best fits the requirement?

Options:

  • A. Use Cosmos DB vector search over stored embeddings with metadata filters.

  • B. Filter Cosmos DB items by tenantId, category, and keyword matches.

  • C. Return recent articles from the matching tenant partition only.

  • D. Send the full article corpus to the model on each request.

Best answer: A

Explanation: Semantic retrieval uses an embedding of the user’s question and compares it with embeddings stored with the content items. In this scenario, the article embedding supports similarity search, while fields such as tenantId and category remain structured filters that constrain which items are eligible. A design that uses Cosmos DB vector similarity search with metadata filters satisfies both needs: semantic relevance and tenant/category boundaries. Ordinary filters over properties can narrow records, but they do not find conceptually similar content when the words differ from the stored article text.

  • Keyword filtering can enforce tenant and category constraints, but it does not perform semantic similarity over embeddings.
  • Recent-article lookup uses partitioned data access but ignores the natural-language meaning of the question.
  • Full-corpus prompting increases latency and bypasses the intended Cosmos DB vector retrieval pattern.

Question 22

Topic: Connect to and Consume Azure Services

You are building an Azure Functions serverless API endpoint POST /users/{userId}/summary-jobs. The function must be called by web clients over HTTPS, read the user’s preferences from Azure Cosmos DB for NoSQL by userId, and send a summarization command to an Azure Service Bus queue. The team wants to avoid custom SDK client code when bindings can handle the integration. Which binding design should you implement?

Options:

  • A. Use a Service Bus trigger and Cosmos DB output binding.

  • B. Use an HTTP trigger, Cosmos DB input binding, and Service Bus output binding.

  • C. Use a Cosmos DB trigger and HTTP output binding.

  • D. Use an HTTP trigger and SDK clients for both data services.

Best answer: B

Explanation: Azure Functions bindings reduce integration code by declaratively connecting a function to external services. For this API, the entry point must be an HTTPS request, so the function should use an HTTP trigger. The user preferences are read-only data needed during the request, so a Cosmos DB input binding can retrieve the document using the route value. The summarization command is a message to be sent after the request is handled, so a Service Bus output binding is the appropriate write mechanism. This preserves the API behavior while avoiding unnecessary Cosmos DB and Service Bus SDK client code inside the function.

  • Service Bus trigger fails because the function must be invoked by web clients over HTTPS, not by a queued message.
  • Cosmos DB trigger fails because the API should run on an HTTP request, not on database changes.
  • SDK clients may work, but they ignore the requirement to use bindings where they can handle the integration.

Question 23

Topic: Develop Containerized Solutions on Azure

A team is deploying a new Azure Container Apps revision for a Python API. The new image requires VECTOR_DB_HOST, EMBEDDING_MODEL, and a sensitive REDIS_PASSWORD at runtime. Multi-revision mode is enabled so the team can validate the new revision before shifting traffic. Which configuration should they use?

Options:

  • A. Store all three values only as Container Apps secrets.

  • B. Set the values on the Container Apps environment resource.

  • C. Set revision template env entries and use a secretRef for the password.

  • D. Pass all three values as Azure Container Registry Task build arguments.

Best answer: C

Explanation: Azure Container Apps runtime configuration for a container should be placed in the container app revision template as environment variables. Nonsecret values can be stored directly as env entries, while sensitive values should be stored as Container Apps secrets and referenced from the environment variable by secretRef. Updating revision-scoped template settings, such as environment variables or the image, creates a new revision that can be validated before traffic is moved to it. Build-time settings and environment-level resources do not provide the required per-revision runtime configuration. The key distinction is runtime revision configuration versus image build or shared environment configuration.

  • Build arguments are available during image build, not as the correct runtime configuration mechanism for a Container Apps revision.
  • Secrets only store sensitive values but do not expose them to the container unless an environment variable references them.
  • Environment resource settings configure the shared Container Apps environment, not per-container runtime variables.

Question 24

Topic: Connect to and Consume Azure Services

An AI ingestion service publishes one DocumentReady message to an Azure Service Bus topic. Three Azure Functions should independently run enrichment, cache invalidation, and audit processing. Operations reports that each message is processed by only one function, with no dead-letter activity.

Evidence:

Entity: topic/doc-events
Subscriptions:
  shared-processing
Consumers:
  enrich-func -> shared-processing
  cache-func  -> shared-processing
  audit-func  -> shared-processing
Observation:
  msg-1042 processed by enrich-func only
  msg-1043 processed by audit-func only
Dead-letter count: 0

Which root cause and subscription pattern best match the evidence?

Options:

  • A. The functions require sessions; assign the same session ID to each message.

  • B. The topic drops messages after delivery; enable duplicate detection.

  • C. The functions share one subscription; create one subscription per processing path.

  • D. Messages are failing delivery; increase the subscription max delivery count.

Best answer: C

Explanation: Azure Service Bus topics fan out messages to subscriptions, not to each consumer attached to the same subscription. Each subscription behaves like an independent queue and receives its own copy of a matching published message. When multiple functions listen to one subscription, they are competing consumers, so only one function processes each message. The evidence shows one shared subscription and zero dead-lettered messages, which points to a subscription design issue rather than delivery failure. Use separate subscriptions such as enrich, cache, and audit, with filters as needed, and have each function listen to its own subscription.

  • Duplicate detection prevents repeated messages with the same ID; it does not create fan-out copies for multiple processors.
  • Max delivery count addresses repeated delivery failures, but the dead-letter count is zero.
  • Sessions provide ordered, stateful message handling; they do not make all consumers on one subscription receive the same message.

Question 25

Topic: Connect to and Consume Azure Services

A Python back-end service publishes operation updates that are consumed by billing, analytics, and notification workers. Today, the service sends every update to three Service Bus queues, and each worker discards messages outside its required status and tenant set. This adds publisher latency and unnecessary worker processing. You must keep durable delivery and independent retry/dead-letter handling per consumer. Which design should you implement?

Options:

  • A. Publish once to a Service Bus topic with filtered subscriptions.

  • B. Keep three queues and move filtering into each worker.

  • C. Replace Service Bus with Event Grid custom events.

  • D. Use one Service Bus queue with competing consumers.

Best answer: A

Explanation: Service Bus topics are designed for publish/subscribe messaging when multiple consumers need different views of the same published operation stream. The publisher sends each operation update once to the topic. Each subscription can apply filters, such as status or tenant, so billing, analytics, and notification workers receive only relevant messages. Each subscription also has its own delivery state and dead-letter handling, which preserves reliability and lets consumers scale independently. This reduces publisher fan-out work and avoids wasting worker capacity on discarded messages.

A single queue is efficient for competing consumers, but it does not deliver each relevant message to multiple independent consumer groups.

  • Single queue fan-out fails because competing consumers divide messages instead of giving each consumer group its own view.
  • Worker-side filtering keeps the wasteful full-feed delivery and does not reduce broker or worker processing.
  • Event Grid replacement is less appropriate when durable Service Bus processing and per-consumer dead-letter handling are required.

Questions 26-50

Question 26

Topic: Connect to and Consume Azure Services

A team is building a containerized Python API in Azure Container Apps for a RAG application. Each request may start a long-running embedding refresh. The design must keep HTTP responses fast, avoid hard-coded secrets, allow batch-size changes without image redeployment, and correlate traces across the API and worker. Which design is the best fit?

Options:

  • A. Call the worker synchronously and store secrets as image environment variables.

  • B. Use managed identity, Service Bus, Key Vault, App Configuration, and OpenTelemetry.

  • C. Queue refresh requests in Redis and change batch size by rebuilding the image.

  • D. Use Event Grid for refresh commands and store secrets in App Configuration.

Best answer: B

Explanation: The best design uses Azure SDK clients at clear service boundaries and assigns each Azure service to its intended role. Azure Service Bus decouples the HTTP API from long-running embedding refresh work, so the API can return quickly while a worker processes commands asynchronously. Managed identity lets the container access Azure services without embedded credentials. Azure Key Vault should hold secrets, while Azure App Configuration is appropriate for nonsecret operational settings such as batch size. OpenTelemetry provides distributed traces that can correlate the API request with downstream worker processing. The key is matching integration mechanisms to workload constraints rather than adding syntax-level SDK details or overloading the wrong service.

  • Synchronous worker calls miss the fast-response constraint and embedding secrets into image environment variables weakens secret management.
  • Event Grid for commands is a poor fit for durable command-style work, and App Configuration is not where secrets belong.
  • Redis as a queue overuses a cache for durable job coordination and rebuilding the image violates the configuration-change constraint.

Question 27

Topic: Develop Containerized Solutions on Azure

A Python API runs as a custom container in Azure App Service. The database password currently appears in the Dockerfile as an ENV value, so every rotation requires rebuilding and redeploying the image. The team must keep credentials out of image layers and avoid adding a Key Vault network call to every request. Which approach best meets the requirement?

Options:

  • A. Pass the password as a Docker build argument

  • B. Inject the password into the image from ACR during build

  • C. Fetch the password from Key Vault on every request

  • D. Use an App Service Key Vault reference with managed identity

Best answer: D

Explanation: For an App Service custom container, secrets should be supplied at runtime, not baked into the Dockerfile or image build process. A Key Vault reference in an App Service application setting lets App Service use managed identity to resolve the secret and expose it to the container as an environment variable. This keeps the image reusable across environments and rotations, avoids leaking credentials through image layers or build history, and avoids a Key Vault call on every API request. The app can read the setting during startup and use it for its database connection pool. Build-time injection is less efficient for rotation and weaker for secret isolation.

  • Build argument fails because build-time values can remain in image history or layers and still require image rebuilds for rotation.
  • Per-request retrieval protects the image but adds avoidable latency and dependency pressure on Key Vault for normal request handling.
  • ACR build injection still treats the secret as build-time data, so it does not solve image-layer exposure or efficient rotation.

Question 28

Topic: Develop Containerized Solutions on Azure

A Python AI back-end runs in Azure Container Apps. Chat requests are returning HTTP 500 because the app secret reference for Azure Cosmos DB for NoSQL was incorrect. You fix the secret reference and route traffic to a new active revision. The app emits OpenTelemetry request and dependency telemetry to Log Analytics. Which monitoring signal best confirms recovery?

Options:

  • A. The replica restart count has stopped increasing

  • B. The new Container Apps revision state is Active

  • C. The container image pull from Azure Container Registry succeeded

  • D. End-to-end traces with 2xx requests and successful Cosmos DB spans

Best answer: D

Explanation: Recovery should be confirmed with a workload-level signal that exercises the remediated failure path. Here, the incident was not merely that the container failed to start; requests failed because the app could not use its Cosmos DB secret. An end-to-end OpenTelemetry signal showing successful HTTP responses and successful Cosmos DB dependency spans proves that the new revision is receiving traffic, loading the corrected configuration, and reaching the data service. Platform signals such as revision state, image pull, or restart count are useful supporting evidence, but they can be healthy while the application still returns 500s.

  • Active revision only confirms the revision is deployed and eligible for traffic, not that the Cosmos DB path works.
  • Image pull success validates registry access, not runtime configuration or data-service connectivity.
  • Stable restarts can show the process is not crashing, but a running app can still fail requests.

Question 29

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps API and two Azure Functions workers read feature flags, service endpoints, and credentials from Azure Key Vault on every operation. During scale-out, Key Vault calls spike and p95 latency increases. Security requires no stored connection strings and each component must have only the permissions it needs for Cosmos DB for NoSQL, Service Bus, and configuration. Which remediation best improves efficiency with the least exposure?

Options:

  • A. Use App Configuration with Key Vault references and per-component managed identities.

  • B. Use one broad user-assigned managed identity for all components.

  • C. Use deployment-time environment variables containing shared connection strings.

  • D. Send connection strings in Service Bus messages to workers.

Best answer: A

Explanation: The best remediation separates configuration coordination from secret exposure. Azure App Configuration can centralize nonsecret settings and feature flags, while Key Vault references keep secrets out of application settings. The configuration provider can cache values and refresh them instead of forcing every operation to call Key Vault, reducing latency and dependency pressure during scale-out. Per-component managed identities then allow least-privilege access: for example, one worker can receive from a Service Bus queue without also receiving Cosmos DB write permissions if it does not need them. The key trade-off is improving runtime efficiency without replacing repeated secret calls with broader credential exposure.

  • Environment variables can reduce lookup latency, but shared stored connection strings violate the stated secret-handling requirement.
  • Shared broad identity simplifies access, but every component receives permissions beyond what it needs.
  • Secrets in messages avoid configuration lookups, but expose credentials through the messaging layer and to consumers.

Question 30

Topic: Connect to and Consume Azure Services

A Python worker in Azure Container Apps stopped receiving messages from Azure Service Bus after a redeployment. The app uses DefaultAzureCredential and no connection string. The Service Bus namespace and queue exist, and messages remain active in the queue.

Evidence:

SERVICEBUS_FQDN=sb-prod.servicebus.windows.net
AZURE_CLIENT_ID=8b1f...e42
Container app identity: system-assigned = enabled; user-assigned = none
RBAC: system-assigned identity has Azure Service Bus Data Receiver
Trace: token request for https://servicebus.azure.net/.default
Error: managed identity endpoint returned "identity not found for client_id 8b1f...e42"

What is the best root cause?

Options:

  • A. The system-assigned identity lacks the Service Bus receiver role.

  • B. The messages were moved to the dead-letter queue.

  • C. The queue path is missing from the namespace endpoint.

  • D. The SDK is using an unassigned user-assigned managed identity.

Best answer: D

Explanation: The failure occurs during managed identity token acquisition, before the worker can call Azure Service Bus. In this configuration, AZURE_CLIENT_ID tells DefaultAzureCredential to use a specific user-assigned managed identity. The evidence shows that no user-assigned identity is attached to the container app, so the managed identity endpoint cannot issue a token for that client ID. The Service Bus endpoint and system-assigned identity role are not the deciding problem because the trace never reaches a Service Bus authorization decision. Remove or update AZURE_CLIENT_ID, or attach the intended user-assigned identity and assign it the required Service Bus data role.

  • Endpoint path fails because Azure SDK clients use the namespace FQDN with a separate queue name, and the trace shows an identity failure.
  • Missing receiver role fails because the system-assigned identity already has the receiver role and token acquisition failed first.
  • Dead-letter queue fails because the stem says messages remain active and the error occurs before message processing.

Question 31

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A team monitors a Python Azure Functions app that processes embedding jobs from an Azure Service Bus queue. The requirement is to keep valid jobs flowing, preserve evidence for failed jobs, and notify service owners only for Azure service or connectivity issues.

Exhibit: Current signals

Function host: Running; trigger connected
Service Bus: throttling=0; server errors=0
Queue depth: within normal range
DLQ growth: +480 messages/hour
Function log: ValidationError: vector length 1,024; expected 1,536
DLQ reason: ApplicationError; deliveryCount=5

Which control decision best meets the requirement?

Options:

  • A. Dead-letter invalid payloads with an application reason and alert the app team.

  • B. Increase the queue max delivery count for all failed jobs.

  • C. Suppress DLQ monitoring until the producer is fixed.

  • D. Scale out the Function app and upgrade the Service Bus tier.

Best answer: A

Explanation: The signals separate a service-level condition from an application-level condition. The Function host is running, the trigger is connected, and Service Bus shows no throttling or server errors. The failure is tied to application validation: the payload contains an embedding vector with the wrong length, and the DLQ reason is already classified as an application error. The safest control is to preserve the failed-message evidence, route the alert to the application owners, and allow valid messages to continue processing. Service-owner alerts should remain reserved for platform symptoms such as trigger connectivity failures, throttling, server errors, or host instability.

Retrying or scaling does not fix deterministic bad payloads; it only increases noise and processing cost.

  • Longer retries fail because a malformed vector is not a transient messaging failure.
  • More capacity misses the evidence because queue depth and service throttling are normal.
  • Suppressing DLQ monitoring hides failed-message evidence and can mask new regressions.

Question 32

Topic: Develop AI Solutions by Using Azure Data Management Services

An Azure Database for PostgreSQL database stores document chunks for a RAG API. The documents table has B-tree indexes on tenant_id and created_at, and pgvector is enabled. A query filters by tenant_id, then orders by cosine distance between embedding and a query vector, returning the top 10. The plan still scans and sorts many rows. What is the best next implementation step?

Options:

  • A. Increase PostgreSQL compute before changing indexes.

  • B. Create an HNSW vector index on embedding for cosine similarity.

  • C. Create a B-tree index on embedding.

  • D. Add a full-text index on the chunk text.

Best answer: B

Explanation: Embedding similarity search needs an index type that understands vector distance, such as a pgvector HNSW or IVFFlat index with the operator class that matches the query distance metric. B-tree indexes are useful for ordinary relational predicates and ordering, such as filtering by tenant_id or sorting by created_at, but they do not accelerate nearest-neighbor ranking by cosine distance across high-dimensional vectors. In this state, the relational filters already have indexes, and the slow part is the vector-distance order and top-k retrieval. The next implementation step is to add a vector index on the embedding column, then validate the query plan and latency.

  • B-tree on embeddings fails because scalar relational ordering does not support nearest-neighbor vector distance search.
  • More compute first is premature because the missing index type directly matches the observed scan-and-sort behavior.
  • Full-text indexing solves lexical text search, not embedding similarity ranking.

Question 33

Topic: Develop Containerized Solutions on Azure

A team deploys image rag-api:1.8 to an Azure Container Apps API that supports a RAG workflow. The deployment succeeds, but requests to the production URL still use the old model configuration. You must make version 1.8 serve production traffic while keeping multiple-revision mode enabled. Select TWO.

Exhibit:

Revision mode: Multiple
Ingress: External
Scale: min 0, max 5

rag-api--rev17  Active  Healthy    Traffic 100%  Image rag-api:1.7
rag-api--rev18  Active  Unhealthy  Traffic 0%    Image rag-api:1.8

rev18 log:
KeyError: EMBEDDINGS_ENDPOINT
Readiness probe failed

Options:

  • A. Retag image rag-api:1.8 as latest.

  • B. Switch the app to single-revision mode.

  • C. Deploy a revision with EMBEDDINGS_ENDPOINT configured.

  • D. Increase the minimum replica count to 1.

  • E. Assign production traffic to the healthy 1.8 revision.

  • F. Disable external ingress for the container app.

Correct answers: C and E

Explanation: Azure Container Apps separates revision health from traffic routing. A successful image deployment can create a revision that is active but not serving requests. Here, rev18 is unhealthy because the container cannot start or pass readiness without EMBEDDINGS_ENDPOINT, so the runtime configuration must be supplied in a new healthy revision. Also, because the app is in multiple-revision mode, production requests continue going to rev17 until traffic is assigned to the healthy 1.8 revision. The scale setting with a minimum of 0 is not the blocker for an HTTP app; the shown blockers are failed readiness and 0% traffic assignment.

  • Image retagging fails because Container Apps routes by revision traffic assignment, not by requiring a latest tag.
  • Single-revision mode violates the requirement to keep multiple revisions enabled and does not fix the missing setting.
  • Replica minimum is not the shown blocker; HTTP scaling can activate from zero when traffic is routed.
  • Disabling ingress would prevent use of the production URL instead of serving the API externally.

Question 34

Topic: Connect to and Consume Azure Services

A Python ingestion service running in Azure Container Apps validates uploaded contracts and creates embeddings. After each contract is ready, the app must publish domain events such as ContractEmbedded and ContractRejected, including contractId and tenantId, so separate Azure Functions can process each event type. The events are not emitted by an Azure resource. Which configuration should the developer choose?

Options:

  • A. Publish custom events to an Event Grid custom topic.

  • B. Send all contract status changes to one Service Bus queue.

  • C. Use Event Grid partner events from the ingestion service.

  • D. Create an Event Grid system topic for the container app.

Best answer: A

Explanation: Event Grid custom events are used when an application, rather than an Azure resource, is the event source. The ingestion service needs to publish domain-specific events with its own event types and data fields, such as ContractEmbedded and tenantId. An Event Grid custom topic provides the publishing endpoint, and downstream subscriptions can route or filter events to different Azure Functions based on event type or subject. System topics are for Azure resource events, and partner topics are for supported external SaaS providers. Service Bus queues are better for command-style work items, not fan-out event notification with Event Grid filtering.

  • System topic fails because the events are application-defined, not built-in events from an Azure resource.
  • Partner events fails because the source is the team’s own application, not a supported partner provider.
  • Service Bus queue changes the pattern to queued command processing and does not match Event Grid custom event routing.

Question 35

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps RAG API is instrumented with OpenTelemetry and sends request and dependency telemetry to Application Insights. The telemetry includes a revisionName custom dimension. After a Redis secret rotation and rollout to revision rag-api--v12 at 10:15 UTC, failed requests appear to drop. The team must verify the remediation and still expose failures in any dependency.

Current query:

requests
| where timestamp between (datetime(2026-05-16T09:45:00Z) .. datetime(2026-05-16T10:45:00Z))
| where cloud_RoleName == "rag-api"
| where success == false

Which KQL refinement should you apply next?

Options:

  • A. Filter to successful requests after 10:15 UTC and count by revision.

  • B. Include all outcomes, join dependencies by operation_Id, and bin failure rate by target and revision.

  • C. Summarize all failed requests into one total for the full hour.

  • D. Exclude Redis dependencies and summarize the remaining failed requests.

Best answer: B

Explanation: To isolate a failing component or validate a remediation, keep the correlation and comparison dimensions in the query. operation_Id ties request telemetry to downstream dependency calls, while dependency target identifies whether Redis, Cosmos DB, or another component is involved. Binning by time around the rollout and grouping by revisionName shows whether the new revision changed the failure rate. Including both successful and failed outcomes avoids mistaking lower traffic for a fix. Filters that remove the suspected dependency or show only successful requests can make the remediation look effective while hiding residual failures.

  • Success-only filter hides remaining failures and provides no before/after component comparison.
  • Redis exclusion removes the suspected dependency, so it can falsely validate the secret rotation.
  • Single total loses target, revision, and timing dimensions needed to isolate the problem.

Question 36

Topic: Connect to and Consume Azure Services

A backend API accepts requests to create embeddings for uploaded documents. Each request can take several minutes, must survive worker restarts, and failed requests must be held for later inspection. Several worker instances should compete for the same backlog, with each request claimed by only one worker. Which integration configuration should you use?

Options:

  • A. Call an HTTP-only Azure Function directly.

  • B. Use an Event Grid subscription with a subject filter.

  • C. Use a Service Bus queue with worker triggers.

  • D. Publish custom events to Event Grid.

Best answer: C

Explanation: Service Bus is the better integration mechanism when the workload is a durable command or work item that must be processed by exactly one worker from a backlog. In this scenario, embedding creation is long-running backend work, not just a notification that something happened. A Service Bus queue supports durable storage, competing consumers, retry behavior, and a dead-letter queue for messages that cannot be processed successfully. Event Grid is optimized for event notification and routing, while an HTTP-only Function call ties the producer directly to the processing endpoint and does not provide the same durable queue semantics by itself.

The key distinction is command processing with a backlog versus event notification.

  • Event notification is not ideal because Event Grid signals that something happened; it is not the primary durable work queue for competing workers.
  • HTTP-only processing fails the durability requirement because the producer remains coupled to a live processing endpoint.
  • Subject filtering helps route events but does not provide queue-style backlog processing and dead-letter handling for work commands.

Question 37

Topic: Develop AI Solutions by Using Azure Data Management Services

A containerized Python RAG API on Azure Container Apps reads embeddings from Azure Database for PostgreSQL. After a scale-out event, p95 API latency increased. The team must reduce latency without changing retrieval semantics or rejecting normal bursts. Which operational control should you apply?

Evidence:

SignalObservation
App span db.connection.acquirep95 820 ms
App span db.query.executep95 45 ms
pg_stat_statementsretrieval query mean 39 ms
Database metricsCPU and I/O below 40%
Connection eventsspikes near connection limit

Options:

  • A. Increase query timeouts and retry reads immediately.

  • B. Add bounded connection pooling and reuse database connections.

  • C. Reduce database connection limits to fail excess requests.

  • D. Add a vector index for the retrieval query.

Best answer: B

Explanation: The slow path is connection handling, not query execution. The app span for acquiring a database connection is much higher than the query execution span, and PostgreSQL query statistics show the retrieval query itself is fast. Low CPU and I/O also argue against a database execution bottleneck. A bounded pool lets each app instance reuse established PostgreSQL connections and limits total concurrent connections so scale-out does not create connection storms. This reduces acquisition latency without changing retrieval logic or intentionally rejecting legitimate bursts.

  • Index tuning targets slow query execution, but both app and database evidence show the query is already fast.
  • Retrying reads can amplify pressure during connection spikes and does not reduce connection acquisition time.
  • Failing excess connections may protect the database, but it violates the requirement to avoid rejecting normal bursts.

Question 38

Topic: Develop AI Solutions by Using Azure Data Management Services

A Python RAG API uses Azure Database for PostgreSQL to retrieve policy documents. Every request must apply exact filters on tenant_id, region, and document_state, then rank the remaining rows by semantic closeness to the query embedding. Which table-modeling configuration should you choose?

Options:

  • A. A text search index on content instead of stored embeddings

  • B. One embedding column containing metadata and content semantics

  • C. Indexed relational filter columns plus a vector-indexed embedding column

  • D. Only a vector index, with filters applied in application code

Best answer: C

Explanation: Exact relational filtering and vector similarity serve different query needs. Model tenant_id, region, and document_state as normal typed PostgreSQL columns so SQL can filter them with relational predicates and appropriate indexes. Store the document embedding separately in a vector column and use a vector index for nearest-neighbor ranking. A typical query can combine WHERE clauses for metadata with an ORDER BY on vector distance and a LIMIT. Encoding metadata into embeddings or filtering only after vector search weakens exactness and can produce inefficient or incorrect candidate sets.

  • Metadata in embeddings fails because vectors do not enforce exact tenant, region, or state predicates.
  • Text search only misses the required embedding-based semantic similarity ranking.
  • Application-side filtering bypasses relational indexes and may retrieve the wrong candidate set before filtering.

Question 39

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps API uses OpenTelemetry. Users report slow RAG responses, but requests succeed. A representative trace for one slow request is shown; child spans ran sequentially under the request span.

SpanStartDuration
HTTP POST /rag0 ms2,030 ms
auth.validate_token12 ms34 ms
redis.get_context58 ms42 ms
cosmos.vector_query118 ms1,760 ms
servicebus.enqueue_audit1,890 ms31 ms
response.write1,930 ms68 ms

Which component is the best first focus for latency troubleshooting?

Options:

  • A. Investigate Service Bus audit enqueue latency

  • B. Investigate Cosmos DB vector query latency

  • C. Investigate token validation middleware latency

  • D. Investigate Azure Managed Redis lookup latency

Best answer: B

Explanation: Distributed tracing breaks one request into timed spans so you can identify where wall-clock time is spent. The root HTTP span represents the whole request, so the useful comparison is among child spans. Because the child spans are sequential, the cosmos.vector_query span accounts for most of the 2,030 ms request duration at 1,760 ms. The Redis lookup, token validation, Service Bus enqueue, and response write spans are all small by comparison.

The first diagnostic focus should be the Cosmos DB vector retrieval path, such as query shape, indexing, RU pressure, or vector search performance.

  • Cache lookup is only 42 ms, so it cannot explain most of the 2,030 ms request.
  • Audit enqueue completes in 31 ms and occurs after the long vector query span.
  • Token validation contributes 34 ms, which is minor compared with the slow trace duration.

Question 40

Topic: Develop Containerized Solutions on Azure

A team deploys a Python RAG API to Azure Container Apps. The image is stored in Azure Container Registry, and the app reads vector data from Azure Cosmos DB for NoSQL. A new revision never becomes ready.

Startup evidence:

Revision: rag-api--v42
Event: Successfully pulled image myacr.azurecr.io/rag-api:2026-05-16
Event: Started container rag-api
Log: INFO Loading runtime settings
Log: ERROR Missing required setting: COSMOS_DB_ENDPOINT
State: Terminated, exit code 1

Which interpretation and action is the best fit?

Options:

  • A. Add the missing setting as a Container Apps environment variable or secret reference.

  • B. Grant the Container Apps managed identity AcrPull on the registry.

  • C. Move the workload to AKS to control the pod startup sequence.

  • D. Increase replica count to keep at least one container warm.

Best answer: A

Explanation: The evidence shows a missing runtime configuration problem, not an image pull failure. The platform successfully pulled the ACR image and started the container, so registry permissions and image tags are not the blocking issue. The application then logged that COSMOS_DB_ENDPOINT was missing and terminated with exit code 1. For Azure Container Apps, required configuration should be supplied through environment variables, secret references, or connected configuration sources such as Key Vault-backed values where appropriate. The fastest targeted fix is to provide the missing setting and redeploy or activate a corrected revision.

  • Registry permission fails because the log explicitly shows the image was pulled successfully.
  • Replica count does not fix a container that exits immediately because required configuration is missing.
  • Moving to AKS overbuilds the solution and does not address the absent runtime setting.

Question 41

Topic: Connect to and Consume Azure Services

An Azure Functions app uses function.json metadata for its bindings. The deployment method replaces the app content with the uploaded package.

Current binding metadata:

BindingKey setting
Service Bus triggerqueueName=orders, connection=OrdersBus
Cosmos DB outputconnection=CosmosStore

You need to deploy updated Python business logic while preserving the trigger and output binding behavior. Which deployment action should you choose?

Options:

  • A. Replace binding connection names with literal connection strings.

  • B. Deploy updated code with unchanged binding metadata and app settings.

  • C. Convert the Service Bus trigger to an HTTP trigger.

  • D. Deploy only the updated Python files to the app content path.

Best answer: B

Explanation: Azure Functions discovers triggers and bindings from function metadata, such as function.json, and resolves connection values through Function app settings. Because this deployment replaces the app content, the package must include the existing binding metadata along with the updated code. Keeping OrdersBus and CosmosStore unchanged preserves the same Service Bus queue trigger and Cosmos DB output binding behavior. Changing the trigger type or embedding connection strings would alter the app’s activation model or secret-handling pattern rather than safely updating the business logic.

  • Deploying only Python files can remove the metadata the runtime needs to discover the trigger and output binding.
  • Literal connection strings weaken secret management and bypass the intended app-setting references.
  • Converting to HTTP changes how the function starts and no longer preserves Service Bus trigger behavior.

Question 42

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A Python API in Azure Container Apps reads nonsecret settings from Azure App Configuration. It uses dynamic refresh with a registered sentinel key, App:ConfigVersion, and a 60-second refresh interval. After Rag:RetrievalMode was changed from hybrid to vectorOnly, requests still use hybrid until the revision is restarted.

Exhibit:

10:04 loaded keys: label=prod
10:05 refresh check: App:ConfigVersion label=prod ETag=a1
10:12 refresh check: App:ConfigVersion label=prod ETag=a1
10:12 request used Rag:RetrievalMode=hybrid
KeyLabelValueLast modified
Rag:RetrievalModeprodvectorOnly10:08
App:ConfigVersionprod1409:30

What is the best root cause?

Options:

  • A. The Rag:RetrievalMode key is missing from the prod label.

  • B. The refresh sentinel was not updated after the setting change.

  • C. Traffic is still being routed to an old Container Apps revision.

  • D. The app is using an expired Key Vault secret reference.

Best answer: B

Explanation: Azure App Configuration clients commonly cache settings and use dynamic refresh to decide when to reload them. In this scenario, the app is configured to watch the sentinel key App:ConfigVersion. The logs show repeated refresh checks for that sentinel with the same ETag, while the actual Rag:RetrievalMode value changed later. Because the sentinel did not change, the provider had no signal to reload the cached configuration, so the app continued using hybrid until restart forced a full load. Update the sentinel after related configuration changes, or register the specific key for refresh with the correct label.

  • Missing key fails because the App Configuration query shows Rag:RetrievalMode exists under the prod label.
  • Secret reference fails because the setting is nonsecret and the evidence shows stale configuration, not a Key Vault retrieval error.
  • Old revision routing fails because no traffic-split evidence is shown, and the refresh log explains the stale value.

Question 43

Topic: Develop Containerized Solutions on Azure

Your team is deploying a new image for a Python API to Azure Container Apps in multiple revision mode. The image reads COSMOS_ENDPOINT, COSMOS_DATABASE, and COSMOS_KEY at startup. COSMOS_KEY must not be committed to the image or repository, and the current active revision must remain unchanged while the new revision is validated. Which implementation should you use?

Options:

  • A. Bake the values into Dockerfile ENV instructions.

  • B. Set app env vars with COSMOS_KEY as a secret reference.

  • C. Update only the Container Apps secret value.

  • D. Set the variables on the managed Container Apps environment.

Best answer: B

Explanation: Azure Container Apps revisions capture revision-scoped container template settings, including environment variables. For this deployment, set the nonsecret values as container environment variables and store COSMOS_KEY as a Container Apps secret that is referenced by an environment variable. Updating the container template with the new image and environment variable settings creates a new revision while the existing revision remains unchanged for validation. The key point is to externalize runtime configuration from the image and attach it to the app revision that will run the new container.

  • Dockerfile values fail because secrets and deployment-specific settings would be baked into the image instead of supplied at runtime.
  • Secret-only update fails because it does not configure the missing nonsecret environment variables or create the full new revision configuration.
  • Managed environment settings fail because the Container Apps environment resource is not where per-container revision environment variables are defined.

Question 44

Topic: Develop Containerized Solutions on Azure

A team deploys a Python embedding worker to Azure Container Apps with a KEDA Service Bus scale rule. They expect the worker to have no running containers when the queue is idle.

Trace:

Scale rule: Service Bus queue, target messages: 5
minReplicas: 1
maxReplicas: 10

10:00 queue length = 0  -> replicas = 1
10:05 queue length = 30 -> replicas increase
10:30 queue length = 0  -> replicas = 1

Which behavior does the trace show?

Options:

  • A. minReplicas prevents scale-to-zero during idle periods.

  • B. maxReplicas keeps one container running when idle.

  • C. The target message count requires one replica for an empty queue.

  • D. The Service Bus scaler cannot scale Container Apps from zero.

Best answer: A

Explanation: In Azure Container Apps, KEDA scale rules decide when to add or remove replicas based on signals such as Service Bus queue depth, but minReplicas sets the lower bound. If minReplicas is 1, the revision keeps at least one replica running even when the queue has no messages. To allow true scale-to-zero for an event-driven worker, set the minimum replica count to 0 and use an appropriate event-based scale rule. The target message count affects scale-out behavior when work exists; it does not override the configured minimum.

  • Scaler limitation fails because KEDA-based Container Apps can scale from zero when the minimum replica count allows it.
  • Maximum replicas is only an upper bound and does not define the idle floor.
  • Target messages influences scaling under load, not the number of replicas required for an empty queue.

Question 45

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps API connects to Azure Database for PostgreSQL using a password stored as a versioned secret in Azure Key Vault. During rotation, operations create a new secret version and update the database password; the old password can stop working immediately. Which configuration and retrieval behavior should the API use to handle rotation safely?

Options:

  • A. Move the password to App Configuration with dynamic refresh.

  • B. Use a versionless Key Vault URI and refetch on authentication failure.

  • C. Use a versioned Key Vault URI and refresh every 24 hours.

  • D. Inject the password as an image build environment variable.

Best answer: B

Explanation: Secret rotation is safest when the application does not pin itself to a specific Key Vault secret version and does not rely only on a startup-time value. A versionless Key Vault URI, or an equivalent lookup by secret name, allows retrieval of the current active secret version. Because the old database password can stop working immediately, the app should also invalidate its cached value and refetch from Key Vault when authentication fails or when its refresh policy indicates the cached value is stale. This avoids rebuilding the container image and reduces outages caused by stale credentials. App Configuration is appropriate for nonsecret settings, but secrets should remain in Key Vault.

  • Versioned URI fails because refreshing a pinned version can still return the old secret value.
  • Image environment variable fails because the secret is baked into the image lifecycle and will not track rotation safely.
  • App Configuration storage fails because moving the password there weakens the Key Vault secret-management pattern.

Question 46

Topic: Connect to and Consume Azure Services

An API uses an Azure SDK to submit indexing jobs to Azure Service Bus. The team expects an Azure Function with a Service Bus subscription trigger to process only high-priority jobs. The send call succeeds, but the Function never runs for this job.

Flow trace:

API -> topic indexing-jobs -> subscription highPriority -> Function
Send result: succeeded, messageId=job-742
Message body fields: tenantId=contoso, priority=high
Application properties: correlationId=tr-18
Subscription rules: only tenantId='contoso' AND priority='high'
Function invocations for job-742: 0
Dead-letter count: 0

Which integration issue best explains the missing Function invocation?

Options:

  • A. The SDK send call must wait for processing

  • B. The Function should use an Event Grid trigger

  • C. Filter fields were placed only in the message body

  • D. The job was dead-lettered before the trigger ran

Best answer: C

Explanation: This is an integration-boundary issue between the SDK sender and the Service Bus broker. A message body is opaque payload for the receiving application, while subscription rules evaluate broker-visible values such as system properties and application properties. In the trace, tenantId and priority exist only inside the body, but the only subscription rule filters on those values. Because the broker cannot match those body fields, no copy is routed into the highPriority subscription, so the Azure Function trigger has nothing to invoke on. A zero dead-letter count also fits this behavior because an unmatched topic message is not the same as a failed subscription delivery. Put routing fields in application properties when subscription filters must use them.

  • Event Grid trigger is a different eventing pattern and does not fix a Service Bus topic subscription filter mismatch.
  • Waiting for processing misunderstands the boundary; sending a message succeeds independently of later Function execution.
  • Dead-letter assumption conflicts with the visible dead-letter count and the fact that unmatched topic messages are not trigger failures.

Question 47

Topic: Develop Containerized Solutions on Azure

A team deploys a new Python API revision to Azure Container Apps and routes 100% of traffic to it. Users receive HTTP 502 responses. The container logs show:

INFO: Uvicorn running on http://0.0.0.0:8000
ERROR: Ingress failed to connect to container on target port 80

The revision has external ingress enabled with targetPort set to 80. What should you do next?

Options:

  • A. Update ingress targetPort to 8000.

  • B. Rotate the Azure Container Registry credentials.

  • C. Add a private DNS zone for the app.

  • D. Increase the revision replica count.

Best answer: A

Explanation: The evidence points to a port mismatch, not a scaling, image-pull, or DNS problem. Azure Container Apps ingress forwards requests to the configured targetPort. If the container process listens on 0.0.0.0:8000 but ingress targets 80, the platform cannot connect to the app and clients can receive 502 responses. The next step is to align the Container Apps ingress port with the application listener, then validate the new revision. Scaling should come after basic connectivity is correct.

  • Scaling replicas does not fix a listener-port mismatch; more replicas would still be unreachable on port 80.
  • Rotating registry credentials targets image-pull authentication, but this container is already running and logging.
  • Adding DNS addresses name resolution, while the logs show ingress reaching the revision but using the wrong port.

Question 48

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A containerized Python API on Azure Container Apps connects to Azure Database for PostgreSQL. A database password was accidentally added as ENV PG_PASSWORD=... in the Dockerfile and pushed to Azure Container Registry. A prototype fix that reads the password from Azure Key Vault on every request adds noticeable latency. Which remediation best preserves least exposure while improving request efficiency?

Options:

  • A. Rotate the password and store the replacement as a plain Container Apps setting.

  • B. Rotate the password, delete the exposed image, rebuild secret-free, and cache managed-identity Key Vault retrieval.

  • C. Restrict Azure Container Registry pulls and keep using the published image password.

  • D. Rotate the password, keep it out of code, and call Key Vault on every request.

Best answer: B

Explanation: When a secret is baked into an image, the exposure has already occurred, so the secret must be treated as compromised. Rotate or revoke the database password, remove it from the Dockerfile, and publish a replacement image with no secret material. Store the new password in Key Vault and let the workload access it with managed identity. To meet the latency constraint, fetch the secret outside the request path and cache it only in process memory, with a refresh or restart plan for rotation. This avoids repeated Key Vault calls while avoiding plain settings or image-baked secrets.

  • Per-request retrieval can protect the secret but fails the efficiency constraint because the stem says it adds latency.
  • Plain settings reduce lookup cost but reintroduce secret exposure outside Key Vault.
  • ACR pull restriction does not invalidate a password that may already have been extracted from the image.

Question 49

Topic: Develop AI Solutions by Using Azure Data Management Services

An AI support API caches generated answers in Azure Managed Redis to reduce repeated vector lookups. Each catalog publish updates a lightweight catalogVersion value that the API can read before checking the answer cache. Users report wrong answers after catalog updates, although Cosmos DB shows the updated product record.

Exhibit:

Request: tenant=contoso, query=Use model X in EU, filter=region:EU
Current Redis key: answer:{sha256(normalizedQuery)}
TTL: 12 hours
Current catalogVersion: 2026-05-14T18:21Z
Cached answer version: 2026-05-13T09:05Z

Which change best improves correctness while preserving cache efficiency?

Options:

  • A. Flush all cached answers whenever any answer is reported wrong.

  • B. Reduce TTL to 30 seconds but keep the current key.

  • C. Use versioned keys with tenant, filters, query, and catalogVersion.

  • D. Scale up Azure Managed Redis and keep the current key.

Best answer: C

Explanation: This is a cache-coherency problem, not a Redis capacity problem. The cached answer depends on tenant, filters, query text, and the source-of-truth catalog version, but the current key uses only the normalized query. A versioned key such as answer:{tenant}:{filterHash}:{queryHash}:{catalogVersion} makes a catalog update naturally miss the stale entry while preserving cache hits for repeated requests against the same data version. The TTL still cleans up old entries, but correctness no longer depends only on waiting 12 hours for expiration. Short TTLs and cache flushes are blunt invalidation strategies; scaling Redis does not change which value is returned.

  • Short TTL reduces the stale window but still ignores tenant, filter, and version differences.
  • Redis scale-up can improve latency or capacity, but it cannot fix stale or incorrectly keyed responses.
  • Full cache flushing is reactive and inefficient compared with deterministic versioned cache lookup.

Question 50

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A team deploys a containerized chat API to Azure Container Apps. The API already retrieves secrets from Azure Key Vault. After each environment promotion, developers rebuild the image only to change nonsecret values such as ModelDeploymentName, RetrieverTopK, and a UseSemanticReranker feature flag. The requirement is to change these values per environment without rebuilding and without changing user-facing AI behavior. What is the best next implementation step?

Options:

  • A. Store the values as Key Vault secrets.

  • B. Add KQL queries for request latency logs.

  • C. Redesign the retrieval pipeline and reranker prompt.

  • D. Load the values from Azure App Configuration.

Best answer: D

Explanation: This is an application configuration problem, not an AI feature-design problem. The values are nonsecret, environment-specific, and need to change independently of the container image. Azure App Configuration is designed to centralize application settings and feature flags, while Key Vault remains the right place for secrets. The next step is to externalize these settings into Azure App Configuration and update the app to read them at startup or refresh them through the supported configuration provider. Tuning prompts or adding telemetry may be useful later, but they do not meet the stated deployment requirement.

  • Pipeline redesign changes user-facing behavior and skips the configuration prerequisite.
  • Key Vault storage is for secrets; these are nonsecret settings and a feature flag.
  • KQL latency queries help troubleshoot performance but do not externalize environment configuration.

Questions 51-60

Question 51

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

A Python RAG API runs in Azure Container Apps, reads embeddings from Azure Cosmos DB for NoSQL, and queues enrichment work to Azure Service Bus for Azure Functions processing. Users report intermittent latency during peak traffic. Security requires using approved Azure-native telemetry, and the team must identify which existing component is causing delays. Which monitoring design is the best fit?

Options:

  • A. Use OpenTelemetry, Azure Monitor, and KQL across the app and platform signals.

  • B. Create an Azure AI Foundry prompt-evaluation pipeline.

  • C. Monitor only Cosmos DB RU and vector index metrics.

  • D. Forward all telemetry to a new third-party observability platform.

Best answer: A

Explanation: This scenario needs operational monitoring of the existing Azure components, not a new AI evaluation workflow or external observability migration. OpenTelemetry can capture application spans from the containerized API and the function, while Azure platform diagnostics and metrics cover Container Apps, Cosmos DB, Service Bus, and Azure Functions. Sending those signals to Azure Monitor and querying them with KQL helps correlate request latency, dependency calls, RU or throttling signals, queue backlog, and function execution timing. That fits the security constraint to stay with approved Azure-native telemetry and supports isolating the slow component in the current architecture.

  • Third-party migration fails because the stem requires approved Azure-native telemetry and rapid component-level diagnosis.
  • Cosmos-only monitoring fails because latency could also come from the container app, queue, or function processing.
  • Prompt evaluation fails because it assesses AI behavior, not runtime operational signals across Azure services.

Question 52

Topic: Develop Containerized Solutions on Azure

A Python RAG API runs in Azure Container Apps with multiple revisions enabled. After a new revision is assigned 20% traffic, users routed to it receive 503 responses. The operations requirement is to restore service quickly without dropping messages or weakening health checks.

Evidence:

SignalObservation
New revision eventsReadiness probe failed on port 8080
New revision logsApp listening on port 8000
Cosmos DB latencyNormal p95 latency
Service Bus queueNo backlog or dead-letter spike

Which control decision should you make?

Options:

  • A. Increase Cosmos DB Request Units for the RAG container.

  • B. Route traffic back to the previous healthy revision.

  • C. Purge and recreate the Service Bus queue subscription.

  • D. Disable the readiness probe on the new revision.

Best answer: B

Explanation: The symptoms isolate the failure to the new Container Apps revision: its readiness probe checks port 8080, but the application is listening on port 8000. Cosmos DB latency is normal, and Service Bus has no backlog or dead-letter spike, so scaling data services or changing messaging state does not address the observed 503s. A safe operational control is to remove traffic from the unhealthy revision and send users to the previous healthy revision. This restores availability without dropping queued work or weakening platform health checks. The failed revision can then be corrected and redeployed through a controlled rollout.

  • Cosmos DB scaling does not fit because the latency signal is normal and the failures are tied to revision readiness.
  • Queue recreation risks message loss or disruption and is unsupported by the no-backlog evidence.
  • Probe removal hides the unhealthy revision instead of fixing the port mismatch and could route users to a broken container.

Question 53

Topic: Connect to and Consume Azure Services

A containerized ingestion API publishes a notification after a document is indexed. Downstream services should react independently, but the notification is not a command that must be locked, settled, or completed by a worker. Handlers should receive only events with eventType of DocumentIndexed and subjects that start with /customers/contoso/. Which configuration should you use?

Options:

  • A. Service Bus queue with competing consumers

  • B. Service Bus topic with message settlement

  • C. Event Grid topic with filtered event subscriptions

  • D. Single Event Grid subscription that filters in code

Best answer: C

Explanation: Event Grid is designed for event notifications: a publisher announces that something happened, and subscribers independently react. Event subscriptions can filter by event type and subject, so handlers can receive only DocumentIndexed events for /customers/contoso/ without treating the event as a durable work item that requires lock and settlement. Service Bus is better when the application is processing commands or jobs that must be queued, received, completed, abandoned, deferred, or dead-lettered. The key distinction is notification routing versus brokered message processing.

  • Competing consumers would allow only one worker instance to process each queued message, not independent notification to multiple handlers.
  • Message settlement fits Service Bus work-item processing, but the stem states the event is not a command requiring completion.
  • Filtering in code wastes delivery and processing because Event Grid can filter before invoking handlers.

Question 54

Topic: Develop AI Solutions by Using Azure Data Management Services

A containerized RAG API queries Azure Database for PostgreSQL. Each chunk row has tenant_id, classification, and an embedding vector. The API must return the 8 most semantically similar chunks, but only for the caller’s tenant and allowed classifications. Which implementation should you use?

Options:

  • A. Sort by metadata match first, then use vector distance as a tie-breaker.

  • B. Order all chunks by vector distance, then filter in application code.

  • C. Embed metadata into the chunk text and use vector ranking only.

  • D. Filter metadata in WHERE, then order by vector distance.

Best answer: D

Explanation: In a RAG retrieval path, metadata filtering and vector similarity ranking serve different purposes. Metadata filters are deterministic constraints: they decide which rows are eligible based on fields such as tenant, classification, language, or document type. Vector similarity ranking then orders only those eligible rows by semantic closeness to the query embedding. In PostgreSQL, this typically means using parameterized metadata predicates in the WHERE clause and ordering by a vector distance operator before applying LIMIT. This preserves authorization and business constraints while still selecting the most relevant chunks. Filtering after a global vector search can miss better matches inside the allowed subset and may retrieve data the caller should not process.

  • Application-side filtering fails because the top global vector results may exclude the best matches within the caller’s allowed subset.
  • Embedding metadata only fails because tenant and classification are exact constraints, not semantic preferences.
  • Metadata-first sorting fails because eligible rows should be ranked by semantic similarity, not by arbitrary metadata order.

Question 55

Topic: Develop AI Solutions by Using Azure Data Management Services

An Azure Database for PostgreSQL flexible server stores embeddings for a RAG API. After a content rollout, vector searches are slow only during peak traffic. The team must choose the first configuration change.

SignalObservation
CPU and memory92-96% CPU, 89-93% memory
Connections42 active; pool wait time is 0
Query planUses vector and metadata indexes
Query patternSame top-k query; no new predicates

Which change addresses the most likely bottleneck?

Options:

  • A. Scale the server to more vCores and memory.

  • B. Rewrite retrieval to remove metadata filters.

  • C. Increase the application connection pool size.

  • D. Add another vector index on the embedding column.

Best answer: A

Explanation: The evidence points to database resource pressure, not an application or query-design problem. For PostgreSQL vector workloads, sustained high CPU and memory during peak vector search means the server may not have enough compute or memory for the retrieval load. Connection pooling is not the constraint because active connections are modest and pool wait time is 0. Indexing is also unlikely to be the first fix because the query plan already uses both the vector index and metadata index. When the query shape has not changed and indexes are being used, scaling compute and memory is the most direct configuration response.

  • Pool sizing fails because there is no pool wait and connection count is not near capacity.
  • Extra vector index fails because the plan already uses the relevant vector index.
  • Removing filters is not supported because the metadata filter index is already used and predicates did not change.

Question 56

Topic: Secure, Monitor, and Troubleshoot Azure Solutions

An Azure Container Apps API uses OpenTelemetry to call two backend services and then Azure Cosmos DB for NoSQL. One user request returns HTTP 504. You need to choose the telemetry evidence that best reconstructs the single request path and shows the missing downstream call.

Exhibit: Telemetry for failed request

Expected path:
client -> chat-api -> rag-orchestrator -> vector-worker -> Cosmos DB

TraceId: 8ad4
chat-api POST /chat             OK
  rag-orchestrator /retrieve     OK
    vector-worker /lookup        ERROR
    (no Cosmos DB child span)

Logs:
vector-worker: "Cosmos DB timeout" at 10:42:17
chat-api: "returned 504" at 10:42:18

Metrics:
chat-api p95 latency: 1.9s
vector-worker CPU: 38%
Cosmos DB RU consumption: normal

Which evidence identifies where this request path failed?

Options:

  • A. The p95 latency and CPU metrics

  • B. The span tree for TraceId 8ad4

  • C. The normal Cosmos DB RU metric

  • D. The vector-worker timeout log line

Best answer: B

Explanation: Distributed traces are the right evidence for reconstructing a single request path. OpenTelemetry traces group spans by trace ID and preserve parent-child relationships across service calls, so they show where the request moved and where the next expected span is missing. In this case, TraceId 8ad4 reaches vector-worker /lookup, that span is marked ERROR, and no Cosmos DB child span appears. Logs can confirm that a timeout was recorded, but a log line alone does not prove the end-to-end call chain. Metrics summarize behavior over time, such as latency, CPU, or RU usage, and are not designed to identify the exact failure point of one distributed request.

  • Timeout log is useful context, but it does not show the full parent-child request path.
  • Aggregate metrics can reveal service health trends, but they cannot reconstruct a single request flow.
  • Normal RU usage does not rule out a per-request failure before or during the Cosmos DB call.

Question 57

Topic: Connect to and Consume Azure Services

A team routes document-processing work through Azure Event Grid for a RAG ingestion service. The trace for one document is shown:

API publishes commands for doc-42:
  extractText -> createEmbedding -> updateVectorIndex

Event Grid delivers events independently.
Function returns HTTP 200 before the database write finishes.
Observed trace:
  updateVectorIndex ran before createEmbedding completed.

The requirement is that commands for the same document run in order and are acknowledged only after the durable write succeeds. Which assessment identifies the failure point?

Options:

  • A. Use Service Bus with sessions and explicit settlement.

  • B. Add an Event Grid subject filter for each document.

  • C. Return HTTP 500 until later events finish.

  • D. Enable Event Grid dead-lettering as the settlement mechanism.

Best answer: A

Explanation: Event Grid is designed for event notification, not as an ordered command queue. The trace shows command-style work: each step depends on the previous step, and completion must be acknowledged only after a durable write succeeds. Event Grid can retry event delivery and route events with filters, but it does not provide per-message settlement such as complete, abandon, defer, or dead-letter after application processing. Azure Service Bus is the better fit for command workloads that require durable queues, explicit settlement, and ordered processing with sessions for the same document key. Event Grid can still notify that a document was uploaded or indexing completed, but it should not coordinate this ordered command lifecycle.

  • Subject filtering routes events to subscribers but does not serialize dependent commands or confirm durable processing.
  • HTTP retry control affects delivery attempts, not ordered command execution across independent events.
  • Event Grid dead-lettering captures undelivered events after retries; it is not an application-level settlement model.

Question 58

Topic: Develop Containerized Solutions on Azure

A team builds a container image for a Python RAG API in Azure Container Registry. The app source changes infrequently, but the shared base image is patched at unpredictable times.

Exhibit:

FactValue
App DockerfileFROM aiacr.azurecr.io/runtime/python:3.12
Current processNightly rebuild of app image
Build timeAbout 18 minutes per run
RequirementRebuild soon after base-image patches, with fewer unnecessary builds

Which ACR Task pattern should the team use?

Options:

  • A. Source-commit-triggered task only

  • B. Base-image update-triggered task using the repo Dockerfile

  • C. Nightly timer-triggered task for the same Dockerfile

  • D. Manual az acr build quick task for each patch

Best answer: B

Explanation: Azure Container Registry Tasks can automate image builds for different change sources. In this scenario, the performance problem is unnecessary nightly build time, and the operational delay is waiting for the next scheduled build after a base image is patched. A base-image update trigger is the efficient pattern because the app image depends on runtime/python:3.12 through the Dockerfile. When that base image changes, ACR Tasks can rebuild the dependent image without requiring an app source commit or a manual build. This keeps patched images moving through the container pipeline while reducing wasted build minutes.

  • Source-only trigger misses the main change source because app commits are infrequent and base-image patches would not start the build.
  • Manual quick task can build on demand, but it relies on human action and does not meet the automatic patch requirement.
  • Nightly timer still spends build minutes when nothing changed and can delay patched images until the next run.

Question 59

Topic: Develop AI Solutions by Using Azure Data Management Services

A RAG API stores document chunks in Azure Database for PostgreSQL with columns embedding, tenant_id, and doc_type. The current retrieval step uses only vector similarity:

ORDER BY embedding <=> @query_embedding
LIMIT 8

The LLM prompt already says to answer only from the user’s tenant, but traces show retrieved chunks sometimes come from other tenants before answer generation. Which configuration change best addresses the issue?

Options:

  • A. Add metadata predicates to the PostgreSQL retrieval query.

  • B. Increase the number of retrieved chunks.

  • C. Lower the LLM temperature setting.

  • D. Rewrite the system prompt to reject other tenants.

Best answer: A

Explanation: This is a semantic retrieval problem, not an answer-generation problem. The retrieval layer decides which chunks are provided as grounding context to the LLM. If chunks from other tenants are being retrieved, the PostgreSQL vector search should combine similarity ranking with metadata filters such as WHERE tenant_id = @tenant_id AND doc_type = @doc_type before returning the top matches. The LLM prompt can guide response style, but it should not be used as the main control for data isolation or retrieval scope.

The key takeaway is to fix the retrieval query so only eligible chunks can be selected.

  • Prompt-only control fails because the wrong context has already been selected before generation starts.
  • More chunks can make leakage worse by increasing the amount of irrelevant or unauthorized context.
  • Temperature tuning affects generation variability, not which PostgreSQL rows are retrieved.

Question 60

Topic: Develop AI Solutions by Using Azure Data Management Services

A containerized Python enrichment worker must process every new or updated document in an Azure Cosmos DB for NoSQL container to generate embeddings and update a projection store. The design must keep progress across restarts, support multiple worker replicas, and avoid adding message-publishing logic to the write API. Which design best fits?

Options:

  • A. Use a Cosmos DB change feed processor with a lease container.

  • B. Publish each document change to a Service Bus queue from the API.

  • C. Run polling queries filtered by the document update timestamp.

  • D. Subscribe Event Grid to Cosmos DB account events.

Best answer: A

Explanation: Cosmos DB change feed processing is designed for item-level reactions to inserts and updates in a container. A change feed processor uses a lease container to coordinate ownership of feed ranges, checkpoint progress, and allow multiple worker instances to scale processing without making the write API publish separate messages. This fits enrichment workflows such as generating embeddings after source documents change.

Polling queries can miss or duplicate work unless you build your own watermarking, retry, and partition coordination. Event Grid is event-notification oriented and is not the right mechanism for reading the ordered item change stream from a Cosmos DB container. Service Bus is useful for explicit command messages, but it would require changing the write path to publish messages.

  • Polling timestamp queries shift checkpointing, partition coordination, and duplicate handling into custom code.
  • Event Grid account events do not provide the Cosmos DB item change stream needed for enrichment.
  • Service Bus publishing violates the requirement to avoid adding message logic to the write API.

Continue with full practice

Use the AI-200 Practice Test page for the full IT Mastery practice bank, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try AI-200 on Web View AI-200 Practice Test

Focused topic pages

Free review resource

Read the AI-200 Cheat Sheet for compact concept review before returning to timed practice.

Revised on Monday, May 25, 2026