Databricks Certified Generative AI Engineer Associate Exam Blueprint

Practical exam blueprint for the Databricks Certified Generative AI Engineer Associate exam.

How to Use This Exam Blueprint

Use this independent Exam Blueprint as a practical study map for the Databricks Certified Generative AI Engineer Associate exam, code GenAI Engineer. It is designed to help you verify that you can apply Databricks generative AI concepts in realistic engineering scenarios, not just recognize terms.

Work through the checklist in three passes:

  1. Concept pass: Confirm you understand the purpose of each component, service, and workflow.
  2. Scenario pass: Practice choosing the right design, data flow, or troubleshooting step from a short business requirement.
  3. Final-readiness pass: Use the checkbox sections to identify weak areas before taking the exam.

Because exact exam weights are not provided here, the areas below are presented as readiness areas, not official weighted domains.

Exam Identity and Readiness Scope

ItemWhat to Know
Vendor/providerDatabricks
Exam titleDatabricks Certified Generative AI Engineer Associate
Exam codeGenAI Engineer
Professional verticalIT, data, AI engineering
Main readiness focusBuilding, evaluating, deploying, governing, and troubleshooting generative AI solutions on Databricks
Practical emphasisRAG applications, model serving, vector search, prompt engineering, evaluation, MLflow, governance, and production operations
Study approachCombine Databricks platform knowledge with applied generative AI engineering judgment

Topic-Area Readiness Table

Readiness areaYou should be able to…Common exam-style cue
Generative AI fundamentalsExplain LLM behavior, tokens, context windows, embeddings, hallucination risk, grounding, and prompt design“The model gives plausible but unsupported answers. What should you improve?”
Databricks GenAI architectureMap a GenAI application to Databricks components such as notebooks, workflows, model serving, vector search, Unity Catalog, and MLflow“Which component stores governed data or serves an endpoint?”
Retrieval-augmented generationDesign a RAG flow from source data to chunking, embeddings, indexing, retrieval, prompt construction, response generation, and evaluation“Users need answers grounded in internal documents.”
Embeddings and vector searchChoose when to embed text, how to create searchable chunks, and how retrieval quality affects answer quality“Search returns irrelevant passages even though documents exist.”
Prompt engineeringImprove instructions, context, examples, formatting constraints, and safety boundaries“The model ignores output format or invents fields.”
Model selection and servingUnderstand tradeoffs among hosted models, external models, foundation models, custom models, latency, cost, quality, and governance“The team needs a low-latency endpoint with controlled access.”
MLflow and experiment trackingTrack prompts, parameters, model versions, evaluation outputs, traces, and artifacts“You need reproducibility across model and prompt versions.”
Evaluation and qualitySelect offline and online evaluation approaches, use human feedback where appropriate, and diagnose retrieval vs generation failures“Answers are fluent but fail factual correctness tests.”
Agents and tool useRecognize when an agentic workflow, function/tool calling, or multi-step orchestration is appropriate“The assistant must query data, call tools, and decide next steps.”
Data governance and securityApply Unity Catalog concepts, access control, lineage, data privacy, secrets, and responsible AI controls“Sensitive documents should be available only to approved users.”
Deployment and operationsMove from notebook prototype to production workflow, endpoint, monitoring, evaluation, and rollback plan“The prototype works, but failures occur after deployment.”
TroubleshootingIdentify likely causes for poor responses, slow latency, failed indexing, missing permissions, stale data, or excessive cost“Quality dropped after a document refresh.”

Generative AI Fundamentals

Core Concepts to Review

ConceptReady means you can explain…Watch for
Large language modelHow an LLM generates text from input context and learned patternsTreating LLMs as deterministic databases
TokenWhy input and output length affect cost, latency, and context limitsAssuming characters, words, and tokens are the same
Context windowWhy only supplied context and model-accessible information can influence a responseExpecting the model to “know” private data without retrieval
TemperatureHow higher or lower randomness can affect consistency and creativityIncreasing randomness when consistency is required
PromptInstructions, user task, retrieved context, examples, constraints, and formatting guidanceMixing system rules, user content, and retrieved facts carelessly
HallucinationPlausible but unsupported outputSolving hallucination only by changing wording instead of grounding
EmbeddingNumeric representation used for semantic similarityUsing embeddings without considering chunk quality
RAGRetrieval-augmented generation: retrieve relevant context, then generate grounded outputConfusing RAG with model fine-tuning

Can You Do This?

  • Explain why an LLM may answer incorrectly even when the prompt is well written.
  • Distinguish between model knowledge, prompt-provided context, and retrieved enterprise data.
  • Describe how embeddings support semantic search.
  • Explain why context quality matters more than simply adding more text.
  • Identify when a problem is likely a retrieval issue rather than a generation issue.
  • Explain why evaluation should include factuality, relevance, safety, and usefulness.
  • Describe the difference between prototype behavior and production GenAI requirements.

Databricks Platform Concepts for GenAI Engineering

You do not need to memorize every product detail, but you should understand how Databricks platform components fit together in a GenAI application.

Platform areaGenAI engineering roleReadiness check
Workspace and notebooksExploration, prototyping, prompt tests, data preparation, and model experimentsCan you describe when notebooks are useful and when to productionize with jobs/workflows?
Delta tablesReliable structured storage for source data, prepared chunks, logs, and evaluation dataCan you identify where intermediate RAG artifacts might be stored?
Unity CatalogGovernance, permissions, lineage, discovery, and secure access to data and AI assetsCan you reason about who should access source data, indexes, models, and endpoints?
Databricks SQLQuerying governed datasets and supporting analytics use casesCan you tell when a natural language assistant should query structured data instead of only documents?
MLflowTracking experiments, prompts, models, parameters, metrics, artifacts, and evaluation resultsCan you explain why reproducibility matters for GenAI systems?
Model ServingHosting model or application endpoints for inferenceCan you choose serving when an application needs API-based inference?
Vector SearchSemantic retrieval over embedded contentCan you diagnose poor retrieval from chunking, embedding, indexing, or query problems?
Workflows/jobsScheduled or triggered production pipelinesCan you describe how to refresh embeddings or evaluations automatically?
Secrets and credentialsSecure handling of tokens and external service credentialsCan you avoid hard-coding credentials in notebooks or applications?
Monitoring and logsObservability for latency, errors, quality, and usageCan you name signals that indicate a production GenAI issue?

Retrieval-Augmented Generation Readiness

RAG is a central pattern for enterprise GenAI because it grounds model output in data that the model may not have learned during training.

RAG Workflow Checklist

StepWhat happensCandidate readiness
Source selectionIdentify documents, tables, pages, tickets, policies, or knowledge basesKnow how source quality and permissions affect answers
IngestionLoad data into a reliable processing environmentUnderstand batch, incremental, and refresh considerations
CleaningRemove noise, duplicates, irrelevant markup, or broken contentRecognize that noisy input creates noisy answers
ChunkingSplit content into retrievable unitsBalance context completeness against retrieval precision
Metadata enrichmentAdd source, owner, date, category, access level, or document IDUse metadata for filtering, citation, and governance
EmbeddingConvert chunks into vectorsUnderstand model compatibility and semantic similarity
IndexingStore embeddings for vector searchKnow that the index must reflect current, authorized data
RetrievalFind relevant chunks for a user queryTune query handling, filters, and top results
Prompt constructionCombine instructions, user question, and retrieved contextAvoid prompt injection and context confusion
GenerationProduce answer using the modelConstrain answer style, citations, and uncertainty handling
EvaluationMeasure retrieval and answer qualitySeparate retrieval failure from generation failure
MonitoringTrack performance in productionWatch latency, errors, drift, freshness, and user feedback

RAG Design Prompts

Ask yourself these before choosing a design:

  • Is the data primarily unstructured text, structured tables, or a mix?
  • Does the answer require semantic search, SQL aggregation, or both?
  • Must answers cite source documents?
  • Are users allowed to access all retrieved content?
  • How often does source data change?
  • What happens when no relevant context is retrieved?
  • Should the assistant refuse, ask a clarifying question, or answer with uncertainty?
  • How will the team test whether retrieval is working?
  • How will refreshed documents update the vector index?
  • How will incorrect or outdated source content be removed?

Common RAG Weak Areas

Weak areaSymptomBetter thinking
Poor chunkingRetrieved context is incomplete or too broadChunk by semantic sections where possible; preserve useful metadata
Missing metadataCannot filter by source, user group, date, or document typeAdd metadata during ingestion, not as an afterthought
Stale indexAnswers reference outdated policy or old documentationPlan index refresh and validation
Over-retrievalPrompt includes irrelevant chunks and confuses the modelImprove retrieval precision and ranking
Under-retrievalModel lacks enough context to answerImprove query transformation, chunking, embedding, or top result strategy
No fallback behaviorModel invents an answer when context is absentInstruct the model to say when information is unavailable
Ignoring permissionsUsers see content they should not accessAlign retrieval with governance and access controls

What to Be Ready For

TopicReadiness target
Embedding purposeExplain how embeddings represent semantic meaning for similarity search
Query embeddingUnderstand that the user query is embedded and compared with indexed content
Chunk embeddingKnow that retrieval quality depends on the embedded chunk content
Metadata filteringUse filters to restrict candidate results before or during retrieval
SimilarityUnderstand that “similar” does not always mean “correct” or “sufficient”
Index refreshRecognize that new or changed source data requires refreshed searchable representations
Retrieval evaluationMeasure whether the right passages are retrieved, not just whether the final answer sounds good

Can You Diagnose This?

ScenarioLikely issue to investigate
The correct document exists, but it is never retrievedChunking, embedding model choice, index freshness, metadata filter, query wording
Results are semantically related but not answer-bearingChunk granularity, ranking, metadata, query transformation
Sensitive documents appear for unauthorized usersPermission model, metadata filters, Unity Catalog governance, application-layer checks
Latency is high during retrievalIndex design, query pattern, result count, filtering, endpoint load
Answers cite irrelevant sourcesRetrieval quality, prompt structure, citation logic, post-processing

Prompt Engineering Checklist

Prompt engineering for the Databricks Certified Generative AI Engineer Associate exam is not just writing clever instructions. Be ready to reason about prompts as production artifacts that need testing, versioning, and governance.

Prompt Components

Prompt componentPurposeExample readiness question
Role or task instructionDefines what the assistant should doCan you make the task unambiguous?
ConstraintsLimits scope, tone, format, or allowed sourcesCan you prevent unsupported claims?
Retrieved contextProvides grounded factsCan you distinguish context from user instructions?
ExamplesDemonstrate desired outputCan you use examples without overfitting the response?
Output schemaEnforces structured responseCan you specify JSON-like fields or bullet structure?
Refusal/uncertainty ruleHandles missing or unsafe informationCan you tell the model not to guess?
Citation ruleRequires source referencesCan you tie claims to retrieved passages?

Prompt Readiness Checklist

  • Write prompts that separate system instructions, developer/application instructions, retrieved context, and user input.
  • Include rules for missing context, uncertainty, and unsupported questions.
  • Constrain output format when downstream systems need structured results.
  • Add examples only when they improve consistency.
  • Avoid placing untrusted retrieved text where it can override safety instructions.
  • Test prompts with normal, ambiguous, adversarial, and out-of-scope questions.
  • Track prompt versions and evaluation results.
  • Recognize when prompt tuning is not enough and retrieval, data quality, or model choice must change.

Example Prompt Skeleton

System:
You are a support assistant. Answer only from the provided context.
If the context does not contain the answer, say that the information is not available.

Context:
{retrieved_chunks}

User question:
{question}

Response requirements:
- Use concise language.
- Cite the source title when possible.
- Do not invent policy details.

Model Selection, Serving, and Inference

Model Choice Decision Table

RequirementConsider
Fast prototypeUse an available model endpoint or managed model access pattern suitable for experimentation
Enterprise data groundingAdd RAG rather than relying only on model pretraining
Strict output structurePrompt constraints, schema validation, post-processing, or model/tool strategy
Domain-specific languageBetter retrieval, examples, fine-tuning, or specialized model selection depending on need
Low latencySmaller/faster model, efficient prompt, fewer retrieved chunks, optimized serving
High factual accuracyBetter grounding, retrieval evaluation, citations, and human review
Cost controlPrompt length, model size, request volume, caching, batching where appropriate
GovernanceAccess controls, lineage, approval process, tracking, and monitoring

Serving Readiness

You should be able to reason about:

  • When a model or GenAI application should be exposed through a serving endpoint.
  • Why production inference needs authentication, authorization, and monitoring.
  • How input size, output length, retrieval calls, and model choice affect latency.
  • How to separate development, staging, and production behavior.
  • How to compare model versions or prompt versions before rollout.
  • How to handle endpoint errors, timeouts, and fallback responses.
  • Why logging prompts and responses may require privacy controls.

Inference Failure Cues

CueWhat to check
Endpoint works in notebook but not applicationAuthentication, endpoint name, network path, request format, permissions
Responses are slowPrompt size, retrieved context count, model latency, tool calls, concurrency, downstream systems
Responses changed after deploymentModel version, prompt version, retrieval index, configuration, data refresh
Cost increasedRequest volume, token usage, large context, inefficient retrieval, model selection
Users receive inconsistent answersTemperature/settings, prompt ambiguity, retrieval variability, missing deterministic constraints

MLflow, Tracking, and Evaluation

What MLflow Readiness Looks Like

AreaYou should be able to…
Experiment trackingTrack prompt versions, model parameters, evaluation data, and outputs
Artifact loggingStore relevant files, examples, metrics, and evaluation results
Model lifecycleUnderstand why versioning and reproducibility matter
ComparisonCompare runs across prompts, models, retrieval settings, and datasets
EvaluationUse metrics and qualitative review to select better candidates
TraceabilityConnect a production issue back to prompt, model, data, and code changes

Evaluation Checklist

  • Define what a good answer means before optimizing.
  • Use representative questions, not only easy examples.
  • Include negative tests where the answer should be “not enough information.”
  • Evaluate retrieval separately from final answer quality.
  • Check factual correctness against source context.
  • Check relevance, completeness, conciseness, and citation accuracy.
  • Include safety and privacy tests.
  • Compare model and prompt versions using the same evaluation set.
  • Review failures manually to categorize root cause.
  • Keep evaluation artifacts so results can be reproduced.

Evaluation Dimensions

DimensionGood resultFailure signal
GroundednessClaims are supported by retrieved contextUnsupported facts or invented details
RelevanceAnswer directly addresses the questionTangential or generic response
CompletenessIncludes necessary details without excessMissing key steps or conditions
FaithfulnessDoes not contradict sourceConflicts with retrieved document
Citation qualitySources match claimsCitations are absent, wrong, or decorative
SafetyAvoids unsafe, private, or prohibited outputReveals sensitive content or follows malicious instructions
Format complianceMatches required structureInvalid JSON, missing fields, wrong schema
LatencyMeets application needsToo slow for user workflow

Agents, Tools, and Application Orchestration

Agentic patterns can be useful when an application must decide among actions, call tools, retrieve information, or complete multi-step tasks. Be ready to distinguish agent use cases from simpler RAG or single-prompt applications.

PatternUse when…Avoid when…
Simple promptTask is self-contained and does not need external dataEnterprise facts or current data are required
RAGModel needs grounded unstructured contextThe answer requires precise structured calculation only
Text-to-SQL or tool callAssistant needs to query structured data or call an APIA free-form answer is enough
Agent workflowTask requires planning, multiple steps, tool selection, or iterative reasoningDeterministic workflow is simpler and safer
Human-in-the-loopOutput affects high-risk decisions or needs expert approvalFully automated action is acceptable and low risk

Agent Readiness Checklist

  • Explain why tool selection increases both capability and risk.
  • Identify when a deterministic workflow is better than an agent.
  • Define allowed tools, inputs, outputs, and stopping conditions.
  • Validate tool outputs before using them in final responses.
  • Prevent the model from calling tools with unauthorized or unsafe parameters.
  • Log traces for debugging multi-step behavior.
  • Evaluate not only the final answer but also the path taken.

Data Governance, Security, and Responsible AI

For Databricks GenAI engineering, governance is part of the design. Be ready for scenarios where the technically easiest solution is not the correct production answer.

Governance Readiness Table

AreaWhat to reviewScenario cue
Unity CatalogGoverned access to data and AI assets“Only HR users should retrieve HR documents.”
Access controlLeast privilege for users, jobs, endpoints, and service principals“The notebook owner can access data, but the app user cannot.”
Data lineageUnderstanding where source data, chunks, indexes, and outputs came from“Which documents influenced this answer?”
Sensitive dataHandling PII, secrets, regulated data, or confidential text“Logs contain full user prompts with private information.”
Secrets managementAvoiding hard-coded credentials“A token is stored directly in a notebook.”
Prompt injectionDefending against malicious instructions in user input or retrieved content“A document says: ignore previous instructions.”
Output safetyRefusal, redaction, review, or policy rules“The assistant returns restricted information.”
AuditabilityTracking requests, versions, and decisions“The team must explain why a response changed.”

Security and Privacy Checklist

  • Apply least privilege to source tables, files, models, indexes, and endpoints.
  • Avoid embedding or indexing data that users should not be able to retrieve.
  • Use metadata and governance controls to enforce access boundaries.
  • Do not hard-code secrets in prompts, notebooks, jobs, or application code.
  • Treat retrieved documents as untrusted content that may contain malicious instructions.
  • Decide what prompt, response, and trace data may be logged.
  • Redact or avoid storing sensitive information when not needed.
  • Validate generated output before downstream use in high-impact workflows.
  • Maintain lineage from answer to source where citations or audit are required.

Data Preparation for GenAI Applications

Artifact Checklist

ArtifactWhy it mattersReady when you can…
Raw source dataOriginal enterprise knowledgeIdentify source owner, freshness, and permissions
Cleaned documentsReduced noise and duplicationExplain cleaning rules and what was removed
Chunk tableRetrieval-ready text unitsChoose chunk size strategy based on content type
Metadata columnsFiltering, governance, and citationsName useful metadata fields for a scenario
Embedding table or index sourceVector search inputExplain how embeddings are regenerated
Vector indexFast semantic retrievalDescribe refresh and access considerations
Evaluation datasetRepeatable quality measurementBuild questions with expected evidence
Prompt templateControlled generation behaviorVersion and test changes
Inference endpointProduction access pathMonitor latency, errors, and usage
Feedback tableUser or reviewer signalsUse feedback to prioritize improvements

Example RAG Data Fields

document_id
source_system
source_title
source_uri_or_reference
owner_team
access_group
last_updated
chunk_id
chunk_text
chunk_order
embedding

You do not need to use these exact names, but you should understand why fields like source, owner, access group, and last-updated date are useful.

Troubleshooting Decision Points

Retrieval or Generation?

SymptomMore likely retrieval issueMore likely generation issue
Correct source is absent from contextYesNo
Context is present but answer contradicts itPossibleYes
Answer is generic and lacks detailYesPossible
Answer invents a policy not in contextPossibleYes
Citations point to irrelevant documentsYesPossible
Output format is wrongNoYes
Answer omits required field from JSONNoYes
Model refuses safe questionsPossible prompt or safety configuration issueYes

Production Troubleshooting Checklist

  • Check whether source data changed.
  • Check whether the vector index was refreshed successfully.
  • Check whether the user has permission to retrieve needed documents.
  • Check whether metadata filters are too restrictive.
  • Check whether prompts or model parameters changed.
  • Check whether the endpoint version changed.
  • Check recent logs for errors, timeouts, or malformed requests.
  • Compare failing examples against the evaluation set.
  • Reproduce the issue with the exact prompt, context, model, and configuration.
  • Categorize the failure before changing the system.

Troubleshooting Flow

    flowchart TD
	    A[Bad or unexpected answer] --> B{Was relevant context retrieved?}
	    B -- No --> C[Check source data, chunking, embeddings, index freshness, filters, permissions]
	    B -- Yes --> D{Does the answer follow the retrieved context?}
	    D -- No --> E[Check prompt instructions, model behavior, safety rules, output constraints]
	    D -- Yes --> F{Is the answer still incomplete?}
	    F -- Yes --> G[Improve retrieval depth, context selection, prompt specificity, or source coverage]
	    F -- No --> H[Review evaluation criteria and user expectation]

Scenario and Decision-Point Practice

Use these prompts to test whether you can make exam-ready choices.

ScenarioBetter answer should consider…
A legal team wants an assistant that answers only from approved policy documentsRAG, governed source data, metadata, access control, citations, refusal when context is missing
A support bot gives outdated answers after a documentation updateIndex refresh, data pipeline schedule, source versioning, cache behavior, evaluation after refresh
A model gives correct answers in testing but leaks sensitive content in productionPermissions, retrieval filters, logging, prompt injection, user identity propagation
A finance analyst asks natural language questions about sales totalsStructured query/tool use may be better than document-only RAG
A chatbot response is too slow for end usersPrompt length, retrieval count, model choice, endpoint performance, tool call chain
A team wants to compare two prompts and two modelsMLflow tracking, fixed evaluation set, metrics, artifacts, side-by-side review
A retrieved document contains “ignore previous instructions”Treat retrieved text as untrusted context, reinforce system instructions, guard against prompt injection
A generated JSON response fails downstream parsingStronger output schema, validation, retry logic, lower randomness, post-processing
A user asks a question outside the knowledge baseRefusal or clarification, not hallucination
A team cannot reproduce a bad production responseLog prompt version, retrieved chunks, model version, parameters, endpoint, and trace where appropriate

Code and Configuration Awareness

The exam may test whether you understand the shape of GenAI engineering workflows. You should not rely on memorizing long code blocks, but you should recognize concise patterns.

Embedding and Retrieval Pseudocode

question = "What is the escalation policy for priority incidents?"

query_embedding = embed(question)

results = vector_search(
    embedding=query_embedding,
    filters={"document_type": "policy"},
    top_k=5
)

context = format_context(results)

answer = llm_generate(
    instructions="Answer only from the provided context. Cite sources.",
    context=context,
    question=question
)

Readiness checks:

  • Can you identify where permissions and filters should apply?
  • Can you explain what happens if results is empty?
  • Can you explain why top_k affects context quality, latency, and cost?
  • Can you explain why the prompt should tell the model not to invent missing facts?

Evaluation Pseudocode

for example in evaluation_set:
    retrieved = retrieve(example.question)
    response = generate(example.question, retrieved)

    score = evaluate(
        question=example.question,
        expected_evidence=example.expected_evidence,
        retrieved_context=retrieved,
        response=response
    )

    log_result(example.id, score, response, retrieved)

Readiness checks:

  • Can you separate retrieval quality from answer quality?
  • Can you explain why the same evaluation set should be used when comparing changes?
  • Can you describe what artifacts should be logged for reproducibility?

Common Traps and Weak Areas

TrapWhy it hurtsCorrect exam-prep mindset
Thinking prompt engineering solves everythingPoor data and retrieval still produce poor answersDiagnose the full pipeline
Treating RAG as fine-tuningRAG retrieves external context at inference timeChoose RAG for current, governed knowledge
Ignoring access control in retrievalUsers may see unauthorized contentDesign permissions into the retrieval layer
Evaluating only happy pathsReal users ask ambiguous, incomplete, and unsafe questionsInclude edge cases and negative tests
Logging everything by defaultPrompts and responses may contain sensitive dataLog intentionally with privacy controls
Using too much contextMore text can increase latency and confuse the modelRetrieve relevant, compact context
Trusting citations automaticallyA model can cite incorrectly if not constrained and checkedValidate citation grounding
Not versioning promptsSmall prompt changes can alter behaviorTrack prompts, models, data, and evaluations
Confusing semantic similarity with correctnessSimilar chunks may not answer the questionEvaluate answer-bearing retrieval
Choosing agents too earlyAgents add complexity and riskUse the simplest reliable architecture

Final-Week Readiness Checklist

Platform and Architecture

  • I can draw a basic Databricks GenAI architecture for a RAG application.
  • I can identify where data is stored, governed, transformed, embedded, indexed, served, and monitored.
  • I can explain the role of Unity Catalog in governance scenarios.
  • I can explain why MLflow tracking matters for GenAI experimentation.
  • I can distinguish notebook prototyping from production workflow deployment.
  • I can describe the full RAG flow without looking at notes.
  • I can choose useful metadata for filtering, citation, and access control.
  • I can diagnose stale, missing, irrelevant, or unauthorized retrieval results.
  • I can explain chunking tradeoffs.
  • I can describe how source data updates affect embeddings and indexes.

Prompting and Model Behavior

  • I can write a prompt that restricts answers to provided context.
  • I can add refusal behavior for missing information.
  • I can enforce structured output requirements.
  • I can identify prompt injection risk.
  • I can tell when the problem is prompt design versus data or retrieval quality.

Evaluation and Operations

  • I can define evaluation criteria for a GenAI assistant.
  • I can compare prompts or models using a consistent evaluation set.
  • I can identify useful logs and traces for troubleshooting.
  • I can reason about latency, cost, and quality tradeoffs.
  • I can explain how to monitor a deployed GenAI application.

Security and Governance

  • I can apply least privilege to data, indexes, models, and endpoints.
  • I can identify when logs may expose sensitive information.
  • I can explain why retrieved content should not override system instructions.
  • I can design a response strategy for restricted, missing, or unsafe information.
  • I can connect lineage and citations to auditability.

Quick Self-Test Prompts

Before exam day, answer these without notes:

  1. A RAG assistant hallucinates when no context is retrieved. What changes would you make?
  2. Users receive answers from documents they should not access. Where do you investigate first?
  3. Retrieval returns long but irrelevant chunks. What parts of the pipeline might need adjustment?
  4. Two prompts appear equally good in manual testing. How would you compare them more reliably?
  5. A production endpoint becomes slow after adding citations and more retrieved context. What tradeoffs are involved?
  6. A model answers structured sales questions incorrectly from PDFs. What alternative design might be better?
  7. A document contains malicious instructions aimed at the assistant. How should the application treat it?
  8. A team cannot reproduce yesterday’s bad answer. What should have been tracked?
  9. A new document is added but not used in answers. What refresh or indexing steps might be missing?
  10. The output must be valid JSON for an application. What prompt and validation strategies help?

Practical Next Step

Use this checklist to mark weak areas, then practice with scenario-based questions that force you to choose an architecture, diagnose failures, and justify tradeoffs on Databricks. Focus your final review on the areas where you cannot yet explain both the correct action and the reason it is correct.