Free Databricks Generative AI Engineer Associate Practice Questions: Application Development

Practice 10 free Databricks Certified Generative AI Engineer Associate questions on Application Development, with answers, explanations, and the IT Mastery next step.

Try the IT Mastery web app for a richer interactive practice experience with mixed sets, timed mocks, topic drills, explanations, and progress tracking.

Try Databricks Generative AI Engineer Associate on Web

Topic snapshot

Field | Detail
Practice target | Databricks Generative AI Engineer Associate
Topic area | Application Development
Blueprint weight | 30%
Page purpose | Focused sample questions before returning to mixed practice

How to use this topic drill

Use this page to isolate Application Development for Databricks Generative AI Engineer Associate. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.

Pass | What to do | What to record
First attempt | Answer without checking the explanation first. | The fact, rule, calculation, or judgment point that controlled your answer.
Review | Read the explanation even when you were correct. | Why the best answer is stronger than the closest distractor.
Repair | Repeat only missed or uncertain items after a short break. | The pattern behind misses, not the answer letter.
Transfer | Return to mixed practice once the topic feels stable. | Whether the same skill holds up when the topic is no longer obvious.

Blueprint context: 30% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.

Sample questions

These are original IT Mastery practice questions aligned to this topic area. They are not official Databricks questions, copied live-exam content, or exam dumps. Use them to preview question style and explanation depth before continuing with topic drills, mixed sets, and timed mocks in IT Mastery.

Question 1

Topic: Application Development

A team is building a multi-agent Databricks app. One agent must answer sales and inventory questions by conversing over governed Unity Catalog Delta tables, and responses must respect existing data permissions. The team wants to avoid building a custom SQL-generation and permission layer. Which component should they configure for the data-retrieval agent?

Options:

  • A. Index the tables in Mosaic AI Vector Search

  • B. Use a Genie Space with the conversational API

  • C. Expose the tables through a custom external MCP server

  • D. Serve a foundation model with a custom SQL prompt

Best answer: B

Explanation: Genie Spaces are designed for conversational access to business data in Databricks. In this scenario, the key requirements are natural-language business questions, governed Unity Catalog data, and integration with a multi-agent application. Configuring the agent to use a Genie Space through the conversational API lets the application retrieve governed business answers without building its own SQL-generation, semantic interpretation, and permission-handling layer. Vector Search is a better fit for retrieval over embedded unstructured or semi-structured content, not governed conversational analytics over business tables. A custom prompt or external tool could work only by recreating governance-sensitive behavior that Genie already provides.

  • Vector Search layer misses the conversational business-data requirement and is mainly for embedding-based retrieval over indexed content.
  • Custom SQL prompting shifts semantic parsing and permission-sensitive query behavior into application code instead of using the governed Genie capability.
  • External MCP access may connect tools, but it does not by itself provide Genie’s governed conversational interface over Databricks business data.

Question 2

Topic: Application Development

A support agent built with Agent Framework must answer account-status questions from governed Delta tables. Business owners already maintain approved metrics, table relationships, and natural-language examples in a Genie Space. The agent must not generate SQL directly; it must retrieve structured answers only through the approved conversational interface. Which integration should the engineer configure?

Options:

  • A. Call the Genie Space through its conversational API

  • B. Call a Foundation Model API with table schemas

  • C. Register the Delta tables as MLflow models

  • D. Build a Vector Search retriever over the tables

Best answer: A

Explanation: Genie Spaces are the Databricks choice for controlled conversational retrieval over structured data when business logic, approved metrics, and semantic context are already curated there. In this scenario, the agent should treat the Genie Space as the data-access component and call it through the conversational API. That keeps natural-language-to-data access inside the approved Genie interface instead of letting the agent invent SQL or bypass the curated definitions. Vector Search is better for semantic retrieval over indexed text or records, not this governed conversational SQL-style workflow. The key distinction is using Genie for structured data conversation, not a generic LLM or retrieval layer.

  • Vector Search solves semantic retrieval, but it does not use the approved Genie conversational interface for structured metrics.
  • Foundation Model API could generate SQL-like text, but it bypasses the curated Genie Space and its controlled data semantics.
  • MLflow registration manages models and lifecycle artifacts, not conversational access to governed Delta tables.
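
The integration described above can be sketched as a thin wrapper that forwards the user's question to the Genie Space and returns only its text answers. This is a minimal sketch, not a definitive implementation: it assumes `genie_api` behaves like `WorkspaceClient().genie` from the `databricks-sdk` package, with a `start_conversation_and_wait(space_id, content)` method whose result carries `attachments` with optional `text.content`. Verify those names against the SDK version you have installed. `ask_genie` is a hypothetical helper name.

```python
from typing import Any, List


def ask_genie(genie_api: Any, space_id: str, question: str) -> List[str]:
    """Send one question to a Genie Space and return its text answers.

    Assumption: genie_api mirrors WorkspaceClient().genie (databricks-sdk).
    The agent never writes SQL itself; Genie owns query generation.
    """
    message = genie_api.start_conversation_and_wait(space_id, question)
    answers = []
    # A reply may carry several attachments; keep only the text ones.
    for attachment in message.attachments or []:
        text = getattr(attachment, "text", None)
        if text is not None and text.content:
            answers.append(text.content)
    return answers
```

Passing the client in as an argument keeps the agent code testable with a stand-in object and leaves authentication to the caller.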

Question 3

Topic: Application Development

A Databricks team uses Mosaic AI Vector Search for a RAG assistant over HTML product docs stored in Delta. The current pipeline creates fixed 600-token chunks with 75-token overlap. Offline retrieval evaluation shows:

Signal | Result
Table questions | Top chunk has matching row values but not headers
Procedure questions | Top chunk starts mid-list and misses warning text
Neighbor check | Missing context is usually adjacent in the same HTML section
Broad text questions | Current recall target is met

Which chunking configuration should be implemented next?

Options:

  • A. Decrease fixed-size chunks to 300 tokens.

  • B. Increase fixed-size chunks to 2,000 tokens.

  • C. Switch to structure-aware chunks that preserve headings and tables.

  • D. Keep chunks and increase Vector Search top_k.

Best answer: C

Explanation: The retrieval results point to a structural chunking problem, not just a token-count problem. The retriever is finding the right region, but important semantic labels are separated from the content: table rows from headers, and procedure steps from headings or warnings. In a Databricks RAG pipeline, the next change should be to chunk by document structure, store the revised chunks, re-embed them, and update the Vector Search index. Increasing chunk size may hide some boundary loss but adds unrelated context. Decreasing chunk size worsens fragmentation. Raising top_k increases context volume without making each chunk self-contained.

  • Larger chunks may reduce some splits, but they add noise when broad-text recall is already meeting the target.
  • Smaller chunks make table and procedure fragmentation worse under the shown failures.
  • Higher top_k retrieves more neighboring text, but it does not fix non-self-contained chunks.
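
To make the structure-aware option concrete, here is a minimal sketch that uses only Python's standard `html.parser`. It emits one chunk per heading section and repeats the table header labels on every row, so a retrieved row is self-describing. The class and function names are hypothetical, it assumes simple well-formed HTML, and it does not enforce a token limit or perform the re-embedding and index update steps.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
TEXT_TAGS = HEADINGS | {"p", "li", "th", "td"}


class StructureChunker(HTMLParser):
    """Build one chunk per heading section; table rows carry their headers."""

    def __init__(self):
        super().__init__()
        self.chunks = []    # finished chunks: {"heading": str, "text": str}
        self.heading = ""   # heading of the section being built
        self.lines = []     # text lines collected for the current section
        self.buffer = []    # character data of the element being read
        self.headers = []   # header cells of the current table
        self.row = []       # data cells of the current table row

    def flush_section(self):
        if self.lines:
            self.chunks.append(
                {"heading": self.heading, "text": "\n".join(self.lines)}
            )
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.flush_section()  # a new heading closes the previous section
        if tag == "table":
            self.headers = []
        if tag == "tr":
            self.row = []
        if tag in TEXT_TAGS:
            self.buffer = []

    def handle_data(self, data):
        self.buffer.append(data)

    def handle_endtag(self, tag):
        text = " ".join("".join(self.buffer).split())
        if tag in HEADINGS:
            self.heading = text
        elif tag in {"p", "li"} and text:
            self.lines.append(text)
        elif tag == "th":
            self.headers.append(text)
        elif tag == "td":
            self.row.append(text)
        elif tag == "tr" and self.row:
            # Pair each value with its header; zip drops unmatched cells.
            pairs = zip(self.headers, self.row)
            self.lines.append("; ".join(f"{h}: {v}" for h, v in pairs))


def chunk_html(html):
    parser = StructureChunker()
    parser.feed(html)
    parser.close()
    parser.flush_section()  # emit the last open section
    return parser.chunks
```

Keeping the heading as a separate field lets the pipeline prepend it to the chunk text before embedding, or store it as metadata for filtering.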

Question 4

Topic: Application Development

A Databricks team is building a LangChain-based support agent. The agent already uses a Vector Search retriever for product documentation. A new requirement says the agent must answer order-status questions by fetching the current status from an approved internal API at request time, only when the user asks about a specific order. Which component should the team add?

Options:

  • A. A larger prompt template with order-status examples

  • B. A Vector Search retriever over product documentation

  • C. An MLflow model registration step for the API

  • D. A LangChain tool that calls the order-status API

Best answer: D

Explanation: In a LangChain agent or chain, a tool is used when the application needs the model to invoke an external capability, such as a governed API, function, or live lookup, based on the user request. The existing Vector Search retriever is still useful for semantic retrieval from indexed product documentation, but it cannot provide current order status unless that live data is indexed and refreshed for that purpose. A prompt template can guide format and reasoning, but it does not perform the API call. MLflow can package, track, and deploy components, but registration alone does not make the status API available as an agent action.

  • Prompt-only approach misses the request-time lookup requirement and would risk stale or fabricated status values.
  • Documentation retriever solves semantic search over indexed text, not live order-status retrieval from an API.
  • MLflow registration supports lifecycle management, but it is not the callable LangChain component the agent needs.
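
The tool in this answer can be sketched as a plain Python callable whose docstring tells the model when to use it. This is an illustrative sketch only: `make_order_status_tool` and `fetch_status` are hypothetical names, and `fetch_status` stands in for the approved internal API client, which the scenario does not describe.

```python
from typing import Callable


def make_order_status_tool(
    fetch_status: Callable[[str], str],
) -> Callable[[str], str]:
    """Build the callable an agent would expose as its order-status tool.

    fetch_status is a stand-in for the approved internal API client
    (hypothetical; the real client is not specified in the scenario).
    """

    def get_order_status(order_id: str) -> str:
        """Look up the current status of a specific order by its order ID.

        Use only when the user asks about a particular order.
        """
        # Request-time lookup: the status is fetched live, never indexed.
        status = fetch_status(order_id)
        return f"Order {order_id} is currently: {status}"

    return get_order_status
```

With LangChain installed, the returned function can be wrapped with the `tool` decorator from `langchain_core.tools`, which uses the function name, type hints, and docstring to describe the tool to the model. The existing Vector Search retriever stays in place for documentation questions.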

Question 5

Topic: Application Development

A Databricks team evaluated four foundation-model endpoints with the same prompt and retrieval configuration for a support-ticket RAG chain. The launch gate is: answer_quality >= 0.86, safety_violation_rate <= 0.5%, p95_latency_ms <= 1,800, and cost <= $4.00 per 1,000 requests. If more than one candidate passes all gates, choose the highest answer_quality.

MLflow evaluation summary:

Candidate | Quality | Safety violations | p95 latency | Cost per 1,000 requests
model-sm | 0.84 | 0.1% | 900 ms | $1.20
model-fast | 0.86 | 0.7% | 850 ms | $1.60
model-balanced | 0.88 | 0.2% | 1,450 ms | $3.20
model-premium | 0.91 | 0.2% | 2,400 ms | $5.80

Which candidate should the team select for launch?

Options:

  • A. model-balanced

  • B. model-fast

  • C. model-sm

  • D. model-premium

Best answer: A

Explanation: Experiment-based model selection starts by applying hard release requirements before optimizing a favorite metric. In this MLflow summary, quality must be at least 0.86, safety violations at most 0.5%, p95 latency at most 1,800 ms, and cost at most $4.00 per 1,000 requests. The low-cost candidate misses the quality floor, the fast candidate misses the safety gate, and the premium candidate exceeds both latency and cost gates despite the highest quality. The balanced candidate is above the quality floor and within all operational and safety limits, so it is the only launch-ready choice. The key takeaway is to avoid selecting a model on a single metric when release constraints span quality, latency, cost, and safety.

  • Lowest cost fails because the cheapest candidate has quality below the minimum release score.
  • Fastest latency fails because the fastest candidate exceeds the safety-violation limit.
  • Highest quality fails because the premium candidate is over both the latency and cost gates.
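
The gate-then-rank logic in this explanation can be checked with a short script. The sketch below uses the values from the evaluation summary; the `Candidate` class and function names are hypothetical, and in practice the numbers would come from the logged MLflow evaluation results rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    quality: float
    safety_violation_pct: float
    p95_latency_ms: int
    cost_per_1k: float


CANDIDATES = [
    Candidate("model-sm", 0.84, 0.1, 900, 1.20),
    Candidate("model-fast", 0.86, 0.7, 850, 1.60),
    Candidate("model-balanced", 0.88, 0.2, 1450, 3.20),
    Candidate("model-premium", 0.91, 0.2, 2400, 5.80),
]


def passes_gates(c: Candidate) -> bool:
    """Hard launch gates from the question stem; all must hold."""
    return (
        c.quality >= 0.86
        and c.safety_violation_pct <= 0.5
        and c.p95_latency_ms <= 1800
        and c.cost_per_1k <= 4.00
    )


def select_for_launch(candidates):
    """Filter by the gates first, then pick the highest quality."""
    passing = [c for c in candidates if passes_gates(c)]
    return max(passing, key=lambda c: c.quality, default=None)


print(select_for_launch(CANDIDATES).name)  # model-balanced
```

Filtering before ranking is the point: sorting by quality alone would surface model-premium, which fails two gates.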

Question 6

Topic: Application Development

A Databricks team is choosing a Foundation Model API for a RAG support assistant. Vector Search retrieves 3 short policy chunks per query, and the assistant must return 1-2 sentence answers. The endpoint has a p95 latency target under 2 seconds, traffic is high enough that cost is a major concern, and the release gate requires a held-out quality score of at least 0.85.

Approved candidate metadata:

Candidate | Best fit | Latency/cost | Quality score
Small instruct | Short Q&A and summaries | Low/low | 0.86
Large reasoning | Complex multi-step reasoning | High/high | 0.92
Code specialist | Code generation | Low/medium | 0.74
Long-context instruct | Very large prompts | Medium/high | 0.88

Which model is the best engineering decision?

Options:

  • A. Long-context instruct model

  • B. Large reasoning model

  • C. Code specialist model

  • D. Small instruct model

Best answer: D

Explanation: Model selection should match the application attributes, not just the highest quality score. This RAG assistant receives a small retrieved context, produces short responses, and has strict latency and cost constraints. The small instruct model is the only candidate that satisfies the quality threshold while also matching the task type, response length, and operational constraints. A larger model can improve quality, but the stem does not require complex reasoning or long synthesis, so the extra latency and cost are not justified. The key is to choose the least complex model that meets the quality requirement and deployment constraints.

  • Large reasoning is tempting because it has the highest score, but it violates the latency and cost priorities for a simple short-answer task.
  • Code specialist has acceptable latency, but it is tuned for code generation and its 0.74 quality score is below the 0.85 release gate.
  • Long-context instruct is unnecessary because Vector Search returns only a few short chunks, so its larger context capability overbuilds the solution.
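
The "least complex model that meets the quality requirement" rule can be sketched as a filter followed by a minimum over operational burden. The ordinal mapping of Low/Medium/High to 1/2/3 and the choice to sum latency and cost are simplifying assumptions made for this illustration, not part of the scenario.

```python
# Assumed ordinal scale for the qualitative ratings in the table.
LEVEL = {"low": 1, "medium": 2, "high": 3}

MODELS = [
    {"name": "Small instruct", "latency": "low", "cost": "low", "quality": 0.86},
    {"name": "Large reasoning", "latency": "high", "cost": "high", "quality": 0.92},
    {"name": "Code specialist", "latency": "low", "cost": "medium", "quality": 0.74},
    {"name": "Long-context instruct", "latency": "medium", "cost": "high", "quality": 0.88},
]


def burden(model):
    """Combined latency and cost rank; lower is cheaper to operate."""
    return LEVEL[model["latency"]] + LEVEL[model["cost"]]


def cheapest_adequate(models, min_quality=0.85):
    """Keep models that clear the quality gate, then minimize burden.

    Ties on burden are broken in favor of higher quality.
    """
    adequate = [m for m in models if m["quality"] >= min_quality]
    return min(adequate, key=lambda m: (burden(m), -m["quality"]), default=None)


print(cheapest_adequate(MODELS)["name"])  # Small instruct
```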

Question 7

Topic: Application Development

A team is choosing a Foundation Model API model for a Databricks RAG chain. Each prompt contains a user question plus up to 12,000 tokens of retrieved text; the downstream application accepts only a schema-constrained JSON object. Based on the model-card excerpt, which model can support the required input and output behavior?

Model | Accepted input | Max context | Output behavior
Model A | text | 8,000 tokens | free-form text
Model B | text | 32,000 tokens | schema-constrained JSON
Model C | text + image | 16,000 tokens | free-form text
Model D | text | 128,000 tokens | embedding vector

Options:

  • A. Model D

  • B. Model C

  • C. Model A

  • D. Model B

Best answer: D

Explanation: Model-card facts should be matched directly to the application attributes: input modality, context length, and output behavior. The chain sends text prompts containing a user question plus up to 12,000 tokens of retrieved text, so the selected model must accept text input and have a context window larger than 12,000 tokens. The downstream application also requires schema-constrained JSON, not just natural-language text. A model with extra modalities is not enough if it cannot produce the required structured output. An embedding model is also the wrong output type for a generative RAG response.

  • Insufficient context rules out the 8,000-token text model, because prompts can exceed 12,000 tokens.
  • Extra modality in the text-plus-image model does not solve the missing schema-constrained JSON requirement.
  • Embedding output returns vectors for retrieval use, not the JSON response needed by the downstream application.
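
Matching model-card facts to requirements amounts to checking three fields per model. The sketch below encodes the excerpt from the question; the dictionary layout, the function name, and the 500-token allowance for the user question are assumptions made for illustration.

```python
MODEL_CARDS = {
    "Model A": {"inputs": {"text"}, "max_context": 8_000, "output": "free-form text"},
    "Model B": {"inputs": {"text"}, "max_context": 32_000, "output": "schema-constrained JSON"},
    "Model C": {"inputs": {"text", "image"}, "max_context": 16_000, "output": "free-form text"},
    "Model D": {"inputs": {"text"}, "max_context": 128_000, "output": "embedding vector"},
}


def compatible_models(cards, input_type, prompt_tokens, output):
    """Return models that accept the input, fit the prompt, and emit the output."""
    return [
        name
        for name, card in cards.items()
        if input_type in card["inputs"]
        and card["max_context"] > prompt_tokens
        and card["output"] == output
    ]


# 12,000 tokens of retrieved text plus an assumed 500-token question budget.
required_tokens = 12_000 + 500
print(compatible_models(MODEL_CARDS, "text", required_tokens, "schema-constrained JSON"))
# ['Model B']
```

A production check would also reserve room in the context window for the generated output, which this sketch leaves out.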

Question 8

Topic: Application Development

A Databricks Lakehouse app uses a Foundation Model API endpoint to summarize support tickets into a fixed JSON schema that is written to a Delta table. The schema cannot change this sprint, retrieval is not used, and MLflow traces show misses mainly when a ticket contains multiple issues.

Prompt and response pair

System prompt:
Extract a concise ticket summary. Return JSON with:
- issue_type: one of billing, login, shipment, other
- summary: one sentence
- next_action: one sentence

User ticket:
I was charged twice for order 4442, and I also cannot reset my password.

Model response:
{
  "issue_type": "billing",
  "summary": "Customer was charged twice for order 4442.",
  "next_action": "Review the duplicate charge."
}

Which remediation is the best engineering decision?

Options:

  • A. Change the Delta schema to store an array of issue objects

  • B. Increase retriever top_k and rebuild the Vector Search index

  • C. Retarget the prompt for multi-issue summaries within the existing schema

  • D. Mask order numbers before sending tickets to the model

Best answer: C

Explanation: The core issue is response completeness, not retrieval, storage, or masking. The model produced valid JSON, but it failed to capture the password-reset problem. Because the downstream Delta schema is fixed, the best remediation is to make the prompt explicit about preserving every distinct user-reported issue inside the existing fields, such as using issue_type for the primary issue while requiring summary and next_action to mention all issues. This directly targets the failure shown in the trace without changing architecture or data contracts.

A larger schema redesign might be useful later, but it violates the current constraint and is unnecessary for the immediate quality problem.

  • Retriever tuning fails because the scenario states retrieval is not used, so a Vector Search change cannot fix the omission.
  • Schema redesign may model multi-issue tickets better, but it conflicts with the fixed downstream contract this sprint.
  • Order masking addresses a potential privacy concern, but the visible quality failure is missing information, not exposure of the order ID.
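
One way to apply this remediation is to tighten the field instructions while keeping the same three fields, then validate every response against the unchanged contract before it is written to Delta. The prompt wording and helper names below are hypothetical. The validator checks only the schema, so whether every issue is actually mentioned still needs evaluation against traces.

```python
import json

ISSUE_TYPES = {"billing", "login", "shipment", "other"}
REQUIRED_KEYS = {"issue_type", "summary", "next_action"}

# Hypothetical retargeted prompt: same fields, explicit multi-issue rules.
SYSTEM_PROMPT = """Extract a concise ticket summary. Return JSON with:
- issue_type: one of billing, login, shipment, other (the primary issue)
- summary: one sentence that mentions every distinct issue in the ticket
- next_action: one sentence that covers an action for every distinct issue
Do not add, remove, or rename fields."""


def validate_summary(raw: str) -> dict:
    """Parse a model response and enforce the fixed downstream schema."""
    record = json.loads(raw)
    if not isinstance(record, dict) or set(record) != REQUIRED_KEYS:
        raise ValueError(f"expected exactly the keys {sorted(REQUIRED_KEYS)}")
    if record["issue_type"] not in ISSUE_TYPES:
        raise ValueError(f"unknown issue_type: {record['issue_type']!r}")
    for key in ("summary", "next_action"):
        if not isinstance(record[key], str) or not record[key].strip():
            raise ValueError(f"{key} must be a non-empty string")
    return record
```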

Question 9

Topic: Application Development

A team is iterating on a Databricks RAG assistant. Each candidate version changes the prompt template, foundation model endpoint, retriever settings, and guardrail logic. Before deployment, the team must compare evaluation metrics, inspect example traces, and preserve evidence for the selected application version. What is the best MLflow setup?

Options:

  • A. Store evaluation notes only in a Delta table

  • B. Use one MLflow experiment with a run per candidate version

  • C. Create a new Vector Search index for each prompt change

  • D. Use AI Gateway inference tables as the development record

Best answer: B

Explanation: MLflow is the development evidence system for this scenario. A common pattern is to create an experiment for the application and log each candidate app version as a separate run. Each run can capture parameters such as prompt version, model endpoint, retriever settings, and guardrail configuration, plus artifacts, evaluation results, and traces. This makes versions comparable and auditable before the selected version is registered or deployed. Vector Search, Delta tables, and AI Gateway can support the application, but they do not replace MLflow’s role in organizing experiment evidence across prompts, models, traces, evaluations, and application versions.

  • Vector Search layer misses the requirement because indexes support retrieval, not organizing prompt, trace, and evaluation evidence.
  • Delta-only notes may store data, but they do not provide MLflow run structure, metric comparison, or artifact lineage.
  • AI Gateway logs are more useful for live usage and monitoring than for organizing predeployment development iterations.
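
The one-experiment, run-per-candidate pattern can be sketched with MLflow's tracking calls `set_experiment`, `start_run`, `log_params`, and `log_metrics`. The candidate dictionary keys (`version`, `prompt_version`, `model_endpoint`, `retriever_top_k`, `guardrail_version`, `metrics`) are hypothetical names chosen for this illustration, and logging of traces and artifacts is left out.

```python
from typing import Dict, List


def run_params(candidate: Dict) -> Dict[str, str]:
    """Flatten one candidate's configuration into string parameters."""
    keys = ("prompt_version", "model_endpoint", "retriever_top_k", "guardrail_version")
    return {key: str(candidate[key]) for key in keys}


def log_candidates(experiment_name: str, candidates: List[Dict]) -> None:
    """Log each candidate application version as its own run."""
    import mlflow  # imported here so run_params works without MLflow installed

    mlflow.set_experiment(experiment_name)
    for candidate in candidates:
        # One run per version keeps params and metrics comparable side by side.
        with mlflow.start_run(run_name=candidate["version"]):
            mlflow.log_params(run_params(candidate))
            mlflow.log_metrics(candidate["metrics"])
```

Because every run records the same parameter names, the versions can be compared directly in the experiment view before one is registered or deployed.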

Question 10

Topic: Application Development

A team is selecting a foundation model for a Databricks GenAI app that summarizes internal incident tickets. Model Hub shows Model A has stronger public benchmark scores. An MLflow experiment using 300 representative tickets, the intended prompt, and the app’s scoring rubric shows Model B produces more accurate summaries with fewer unsupported details. Which model-selection action is best?

Options:

  • A. Select Model B based on the MLflow task experiment

  • B. Select Model A based on the public benchmark scores

  • C. Register both models in Unity Catalog and let serving choose

  • D. Create a new Vector Search index before choosing a model

Best answer: A

Explanation: Public benchmark results are useful for shortlisting candidate models, but they are not the strongest evidence for a specific application. The MLflow experiment uses the app’s real ticket style, final prompt, and scoring rubric, so it directly measures the behavior the production system needs. If that experiment is representative and well designed, its results should drive the model choice. Benchmark scores can still be kept as supporting metadata for governance or candidate filtering, but they should not override better task-specific evidence.

  • Benchmark-only selection misses that public leaderboards may not reflect the incident-summary task, prompt, or quality rubric.
  • Vector Search indexing solves retrieval problems, but the stem is about choosing between summarization models using existing evaluation evidence.
  • Serving auto-selection confuses deployment with evaluation; Unity Catalog and Model Serving do not automatically determine the best model for the task.

Continue in the web app

Use IT Mastery for interactive Databricks Generative AI Engineer Associate practice with mixed sets, timed mocks, topic drills, explanations, and progress tracking.
