Orientation
This study plan is for candidates preparing for the Databricks Certified Generative AI Engineer Associate exam (exam code: GenAI Engineer).
Use it to turn your remaining time into a practical schedule for generative AI engineering topics on Databricks: retrieval-augmented generation, vector search, embeddings, prompt design, model serving, evaluation, governance, security, MLflow/Mosaic AI workflows, troubleshooting, and production-readiness decisions.
This is an independent study planning resource. Always use the current Databricks exam guide as your final scope reference.
Which plan should you use?
| Time available | Best for | Primary goal | Mock exam use |
|---|---|---|---|
| 7 days | You have studied before or work with Databricks regularly | Final review, weak-area repair, timed practice | 1 timed mock early, 1 final timed set |
| 14 days | You know GenAI concepts but need Databricks-specific review | Focused domain coverage plus hands-on reinforcement | Diagnostic on Day 1, timed mock near Days 10-12 |
| 30 days | You want a balanced plan while working full time | Build coverage, practice scenarios, review misses | Diagnostic Week 1, mocks in Weeks 3 and 4 |
| 60/90 days | You are newer to Databricks GenAI engineering | Full preparation with labs, notes, and repeated practice | Monthly checkpoint mocks, final-week timed mock |
Core exam-prep priorities
Prioritize the work that most directly supports exam decisions.
| Priority area | What to be able to do |
|---|---|
| Generative AI architecture | Choose patterns for RAG, tool use, agents, chat applications, and evaluation loops |
| Databricks platform workflow | Understand how notebooks, jobs, model serving, vector search, governance, and monitoring fit together |
| Retrieval-augmented generation | Explain chunking, embeddings, vector indexes, retrieval quality, grounding, and hallucination reduction |
| Model serving and endpoints | Know when to use hosted model serving, foundation model APIs, endpoint configuration concepts, and operational tradeoffs |
| Evaluation | Compare expected output quality, retrieval quality, safety, latency, cost, and regression checks |
| MLflow and Mosaic AI concepts | Track prompts, parameters, traces, evaluations, model versions, and application behavior where applicable |
| Security and governance | Apply Unity Catalog concepts, data permissions, access controls, secret handling, and safe data use |
| Troubleshooting | Diagnose poor responses, irrelevant retrieval, latency, permissions, deployment errors, and evaluation failures |
Daily practice rhythm
Use this rhythm on most study days. Adjust the duration, not the order.
| Block | 45-minute day | 90-minute day | 2-3 hour day |
|---|---|---|---|
| Warm-up recall | 5 min | 10 min | 15 min |
| Learn or review one topic | 15 min | 25 min | 40 min |
| Hands-on or scenario practice | 15 min | 30 min | 45-60 min |
| Practice questions | 5-7 min | 15 min | 25-35 min |
| Missed-question review | 5 min | 10 min | 20 min |
| Update weak-area list | 2 min | 5 min | 5-10 min |
Daily rule: every study session should produce one of these outputs:
- A corrected misunderstanding.
- A short architecture decision note.
- A reviewed set of missed questions.
- A hands-on note explaining what failed and how you fixed it.
- A flashcard or checklist item for final review.
Start with a diagnostic
Before choosing what to study, find your gaps.
Diagnostic setup
| Step | Action |
|---|---|
| 1 | Take a mixed practice set without notes. |
| 2 | Mark each question as confident, guessed, or unfamiliar. |
| 3 | Review every missed and guessed question. |
| 4 | Tag each miss by topic: RAG, embeddings, vector search, serving, evaluation, MLflow, security, governance, troubleshooting, or architecture. |
| 5 | Build a weak-area list and use it to drive the next 3-5 study sessions. |
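
The tagging and ranking steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `results` records and the `rank_weak_areas` helper are hypothetical, and the rule of counting guessed-but-correct answers as needing review is an assumption taken from step 3.

```python
from collections import Counter

# Hypothetical diagnostic results: (topic tag, confidence label, answered correctly?)
results = [
    ("RAG", "guessed", True),
    ("vector search", "unfamiliar", False),
    ("evaluation", "confident", False),
    ("RAG", "unfamiliar", False),
    ("security", "confident", True),
    ("vector search", "guessed", False),
    ("RAG", "guessed", False),
]

def rank_weak_areas(results, top_n=5):
    """Count missed or guessed questions per topic and return the top_n topics."""
    needs_review = Counter(
        topic
        for topic, confidence, correct in results
        if not correct or confidence == "guessed"
    )
    return needs_review.most_common(top_n)

weak_areas = rank_weak_areas(results)
print(weak_areas)
```

The ranked list is what drives the next few study sessions; a spreadsheet with the same columns works equally well.
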
Diagnostic scoring categories
Do not only track right and wrong answers. Track why you missed.
| Miss type | Meaning | Fix |
|---|---|---|
| Concept gap | You did not know the idea | Study the topic from the exam guide and create a short note |
| Databricks workflow gap | You knew GenAI generally but not how it maps to Databricks | Do a focused platform review or mini-lab |
| Scenario misread | You missed a key constraint | Practice slower reading and underline requirements |
| Two-answer confusion | You narrowed it down but chose wrong | Write the decision rule that separates the answers |
| Terminology gap | Product or feature term was unclear | Build a term list and review it daily |
| Overengineering | You picked a complex design when a simpler one fit | Practice architecture tradeoff questions |
7-day final review plan
Use this if the exam is within one week. Do not try to relearn everything. Your goal is to stabilize your score, reduce careless misses, and repair the highest-value gaps.
| Day | Main focus | Study actions | Output |
|---|---|---|---|
| 1 | Diagnostic and triage | Take a timed mixed set. Review every miss. Rank weak areas. | Top 5 weak-area list |
| 2 | RAG and retrieval | Review embeddings, chunking, vector search, retrieval quality, grounding, and failure modes. Do targeted questions. | RAG decision checklist |
| 3 | Model serving and app flow | Review model endpoints, foundation model usage patterns, latency/cost tradeoffs, deployment flow, and request path. | Serving workflow note |
| 4 | Evaluation and MLflow | Review prompt evaluation, retrieval evaluation, regression testing, traces, logged parameters, and comparison of candidate approaches. | Evaluation checklist |
| 5 | Governance and security | Review Unity Catalog concepts, permissions, data access, secrets, PII/sensitive data handling, and safe GenAI application patterns. | Security review sheet |
| 6 | Timed mock and weak sprint | Take a timed mock or large timed set. Review deeply. Rework only the weakest 2-3 topics. | Final miss log |
| 7 | Light final review | Review notes, wrong-answer log, and decision rules. Do a small confidence set only. Stop heavy new study. | Exam-day checklist |
7-day rules
- Stop adding new material after Day 5 unless it appears repeatedly in missed questions.
- Do not spend Day 7 on a full new mock if it will leave you tired.
- Revisit all guessed questions, even if they were correct.
- Practice reading scenario constraints: data access, latency, quality, governance, deployment, and evaluation requirements.
14-day focused plan
Use this if you have two weeks and need both review and practice.
| Day | Focus | Actions |
|---|---|---|
| 1 | Diagnostic | Mixed practice set, miss tagging, weak-area ranking |
| 2 | Databricks GenAI workflow | Review how data, prompts, models, endpoints, evaluation, and governance connect |
| 3 | RAG foundations | Chunking, embeddings, vector similarity, retrieval quality, hallucination reduction |
| 4 | Databricks Vector Search and retrieval patterns | Review index concepts, source data preparation, metadata filtering concepts, and troubleshooting poor retrieval |
| 5 | Prompt engineering | Instructions, context, examples, output format, safety constraints, prompt iteration |
| 6 | Model serving | Serving endpoint concepts, foundation model usage, latency, cost, scaling, deployment tradeoffs |
| 7 | Review checkpoint | Timed practice set, review misses, update weak-area list |
| 8 | Evaluation | Quality metrics, human review, automated checks, regression testing, retrieval evaluation |
| 9 | MLflow and tracking | Track prompts, parameters, versions, traces, evaluations, and compare experiments |
| 10 | Security and governance | Unity Catalog concepts, access control, sensitive data, secrets, least privilege, auditability |
| 11 | Architecture scenarios | Choose designs for chat, RAG, summarization, classification, and knowledge assistant use cases |
| 12 | Timed mock | Full timed mock or largest available timed set; review all misses |
| 13 | Weak-area sprint | Drill the 2-4 topics with the most misses; retake similar questions |
| 14 | Final review | Light practice, review decision rules, prepare exam-day plan |
14-day emphasis
Spend more time on scenario reasoning than memorization. The exam is likely to reward knowing which approach fits a given application requirement, not just recognizing product names.
30-day balanced plan
Use this if you can study consistently for a month. This is the best path for many working candidates.
Weekly structure
| Week | Goal | Practice target |
|---|---|---|
| Week 1 | Build baseline and cover platform workflow | Diagnostic plus topic drills |
| Week 2 | Master RAG, embeddings, vector search, and prompt patterns | Scenario practice and mini-labs |
| Week 3 | Cover serving, evaluation, MLflow, governance, and troubleshooting | Timed sets and hands-on review |
| Week 4 | Convert knowledge into exam readiness | Mock exams, weak-area sprint, final review |
30-day schedule
| Day | Focus | Actions |
|---|---|---|
| 1 | Diagnostic | Take mixed set, tag misses, create study tracker |
| 2 | Exam guide mapping | Map official objectives to topics you know, partly know, and do not know |
| 3 | Databricks GenAI workflow | Review workspace flow, notebooks, data sources, endpoints, evaluation, governance |
| 4 | Core GenAI concepts | LLM behavior, tokens (at a conceptual level), prompts, embeddings, grounding, hallucination risks |
| 5 | RAG architecture | Retrieval pipeline, chunking strategy, embedding selection concepts, answer generation |
| 6 | Practice and review | RAG and prompt questions; update miss log |
| 7 | Weekly checkpoint | Timed set; review weak areas |
| 8 | Vector search | Index concepts, data preparation, update patterns, metadata and filtering concepts |
| 9 | Retrieval troubleshooting | Irrelevant results, stale data, poor chunking, missing permissions, weak prompts |
| 10 | Prompt engineering | System/user instructions, examples, formatting, guardrails, evaluation prompts |
| 11 | Application patterns | Chatbot, summarization, classification, Q&A, knowledge assistant scenarios |
| 12 | Hands-on consolidation | Build a simple RAG flow in Databricks, or walk through one conceptually |
| 13 | Practice and review | Targeted questions on retrieval and architecture |
| 14 | Weekly checkpoint | Timed set; update top weak areas |
| 15 | Model serving | Endpoint concepts, deployment flow, endpoint selection, operational considerations |
| 16 | Foundation models and APIs | When to use hosted foundation models, external models, or custom models |
| 17 | MLflow and tracking | Experiments, parameters, prompts, traces, model versions, comparisons |
| 18 | Evaluation | Answer quality, retrieval quality, safety, human review, regression tests |
| 19 | Governance and security | Unity Catalog, permissions, secrets, sensitive data, least privilege |
| 20 | Troubleshooting | Latency, permissions, retrieval failures, poor output, failed deployments |
| 21 | Mock 1 | Timed mock or large timed set; deep review |
| 22 | Mock review | Rework misses; write decision rules for repeated errors |
| 23 | Weak area 1 | Target your largest miss category |
| 24 | Weak area 2 | Target your second largest miss category |
| 25 | Architecture scenarios | Practice choosing between design options under constraints |
| 26 | Security and evaluation review | Revisit governance and evaluation because they affect many scenario questions |
| 27 | Mock 2 | Timed mock or large timed set |
| 28 | Final weak sprint | Drill repeated misses; review guessed questions |
| 29 | Final review | Review notes, checklists, and decision rules; light timed set only |
| 30 | Exam readiness | Rest, logistics, confidence set, no heavy new material |
60/90-day full preparation path
Use this if you are newer to Databricks, newer to generative AI engineering, or want deeper hands-on practice.
60-day path
| Phase | Days | Goal | What to do |
|---|---|---|---|
| Foundation | 1-10 | Understand exam scope and GenAI basics | Read exam guide, take diagnostic, review LLMs, prompts, embeddings, RAG concepts |
| Databricks workflow | 11-20 | Connect GenAI concepts to Databricks | Study notebooks, data preparation, model serving concepts, Unity Catalog, MLflow/Mosaic AI workflow |
| RAG depth | 21-32 | Build strong retrieval reasoning | Study chunking, embeddings, vector search, metadata, grounding, retrieval evaluation, failure modes |
| Serving and operations | 33-42 | Prepare for deployment decisions | Study serving endpoints, foundation model usage, app patterns, latency/cost/governance tradeoffs |
| Evaluation and governance | 43-50 | Improve production-readiness judgment | Review evaluation, tracing, monitoring concepts, security, access control, safe data use |
| Exam conversion | 51-60 | Convert knowledge into timed performance | Timed mocks, weak-area sprints, final review, exam-day checklist |
90-day path
Use the 60-day path, but add deeper practice between phases.
| Added time | Use it for |
|---|---|
| Extra 10 days after Foundation | Hands-on notebooks, basic prompt experiments, terminology review |
| Extra 10 days after RAG depth | Build or review multiple RAG scenarios: internal docs, support Q&A, summarization with retrieval, metadata filters |
| Extra 10 days before Exam conversion | More timed mixed practice, governance scenarios, evaluation comparisons, troubleshooting drills |
Long-path weekly cadence
| Day type | Activity |
|---|---|
| 2 days per week | Learn or review concepts |
| 1-2 days per week | Hands-on Databricks workflow or architecture walkthrough |
| 1 day per week | Practice questions |
| 1 day per week | Missed-question review and notes |
| Every 2-3 weeks | Timed mixed checkpoint |
Hands-on concept review checklist
You do not need to build a production system for exam prep, but you should understand how a GenAI application would be assembled and operated on Databricks.
| Area | Hands-on or walkthrough task |
|---|---|
| Data preparation | Identify source data, clean text, decide chunking approach, and explain metadata use |
| Embeddings | Explain how documents and queries become vectors and why embedding choice matters |
| Vector search | Walk through index creation, retrieval, filters, freshness, and failure modes at a conceptual level |
| RAG request path | Trace a user question through retrieval, prompt construction, and the model response |
| Prompt iteration | Compare a vague prompt with a constrained prompt that includes role, context, format, and refusal rules |
| Model serving | Explain how an application calls a model endpoint and what operational constraints matter |
| Evaluation | Compare two versions of a prompt or retriever and decide which is better |
| Governance | Identify who can access data, models, endpoints, and logs |
| Troubleshooting | Diagnose poor answer quality, missing context, slow response, permission error, or unsafe output |
RAG flow you should be able to explain
User question
-> validate and prepare request
-> embed query
-> retrieve relevant chunks from vector index
-> apply filters and ranking
-> build prompt with instructions and context
-> call model endpoint
-> return answer with appropriate grounding or citations
-> log trace, feedback, and evaluation signals
Use this flow for scenario questions. Ask: where is the failure happening, and which control fixes it?
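
The flow above can be sketched end to end in plain Python. This is a toy walkthrough under stated assumptions, not Databricks code: `embed` is a bag-of-words stand-in for a real embedding model, `retrieve` scans an in-memory list instead of a vector index, `call_model` is a placeholder for a model endpoint, and every function and field name here is hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2, source=None):
    """Apply an optional metadata filter, then rank the candidates by similarity."""
    query_vec = embed(question)
    candidates = [c for c in chunks if source is None or c["source"] == source]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Combine instructions, retrieved context, and the question into one prompt."""
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in context_chunks)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def call_model(prompt):
    """Placeholder for a model endpoint call; returns a canned string."""
    return f"(model response to a {len(prompt)}-character prompt)"

def answer(question, chunks, trace_log, **retrieval_kwargs):
    """Validate, retrieve, build the prompt, call the model, and log a trace."""
    question = question.strip()
    if not question:
        raise ValueError("empty question")
    retrieved = retrieve(question, chunks, **retrieval_kwargs)
    response = call_model(build_prompt(question, retrieved))
    cited = [c["id"] for c in retrieved]
    trace_log.append({"question": question, "retrieved_ids": cited})
    return response, cited

chunks = [
    {"id": "doc1", "source": "hr", "text": "Employees accrue vacation days every month."},
    {"id": "doc2", "source": "it", "text": "Reset your password from the login page."},
    {"id": "doc3", "source": "hr", "text": "Vacation requests need manager approval."},
]
trace_log = []
response, cited = answer("How do I request vacation days?", chunks, trace_log)
print(cited)
```

Each function maps to one arrow in the flow, which makes it easier to ask where a failure originates: a wrong `cited` list points at retrieval or filtering, while a good `cited` list with a poor answer points at prompt construction or the model.
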
Domain-by-domain study actions
RAG, embeddings, and vector search
| Study task | Questions to ask yourself |
|---|---|
| Review chunking | Are chunks too large, too small, or missing context? |
| Review embeddings | Does the embedding model fit the data and query style? |
| Review metadata | Can filters improve relevance or enforce scope? |
| Review grounding | Does the response use retrieved context or unsupported model knowledge? |
| Review freshness | How are document updates reflected in retrieval? |
| Review troubleshooting | Is the problem retrieval, prompt construction, permissions, or model behavior? |
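
To make the chunk-size question concrete, here is a minimal sketch of word-based chunking with overlap. The `chunk_words` helper and its default sizes are hypothetical; real pipelines often split on tokens, sentences, or document structure instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
```

The overlap repeats the boundary words in neighboring chunks, which is one way to reduce the "missing context" failure at the cost of a larger index.
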
Prompt engineering
| Study task | Practice action |
|---|---|
| Role and task | Write the instruction so the model knows exactly what to do |
| Context boundaries | Separate retrieved context from user input |
| Output format | Specify JSON, bullet list, short answer, or citation style when needed |
| Safety behavior | Define what the model should do when context is insufficient |
| Few-shot examples | Know when examples help and when they add noise |
| Prompt regression | Compare output before and after prompt changes |
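
As an illustration of these practice actions, the sketch below assembles a constrained prompt with a role, delimited context, an output format, and a refusal rule. The wording, the tag names, and the `build_constrained_prompt` helper are all assumptions chosen for the example, not a required template.

```python
def build_constrained_prompt(question, context_chunks):
    """Assemble a prompt with a role, delimited context, an output format, and a refusal rule."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return "\n".join([
        "You are a support assistant for internal documentation.",  # role and task
        "Use only the text between the <context> tags to answer.",  # context boundary
        "If the context does not contain the answer, reply exactly: I don't know.",  # safety behavior
        'Respond as JSON: {"answer": "...", "sources": [...]}.',  # output format
        "<context>",
        context,
        "</context>",
        "<question>",
        question,  # user input is kept separate from retrieved context
        "</question>",
    ])

vague_prompt = "Tell me about vacation policy."
constrained_prompt = build_constrained_prompt(
    "How many vacation days do new employees get?",
    ["New employees accrue 1.5 vacation days per month."],
)
print(constrained_prompt)
```

Comparing model output for `vague_prompt` and `constrained_prompt` on the same questions is a simple form of prompt regression.
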
Model serving and application design
| Scenario constraint | Design consideration |
|---|---|
| Low latency | Simplify retrieval, reduce unnecessary calls, review endpoint and app design |
| Higher quality | Improve retrieval, prompts, evaluation, and feedback loop |
| Sensitive data | Apply access control, governance, cautious logging, and least privilege |
| Frequent updates | Consider data refresh and index update workflow |
| Multiple users | Consider endpoint access, scaling concepts, monitoring, and governance |
| Cost pressure | Avoid unnecessary model calls, oversized context, and inefficient evaluation runs |
Evaluation and MLflow/Mosaic AI review
| Topic | What to know |
|---|---|
| Offline evaluation | Compare versions before release |
| Human evaluation | Use reviewers for subjective quality or safety concerns |
| Retrieval evaluation | Check whether the right context is retrieved |
| Response evaluation | Check correctness, helpfulness, format, safety, and groundedness |
| Regression testing | Ensure prompt or retriever changes do not break previous behavior |
| Tracking | Track prompts, parameters, versions, traces, metrics, and feedback where applicable |
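
Retrieval evaluation and regression testing can be illustrated with recall@k on a small labeled set. The document IDs, the `baseline` and `candidate` runs, and the helper names below are invented for the example; recall@k is only one of several metrics you could use here.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)

def mean_recall(run, labels, k):
    """Average recall@k over every labeled question."""
    return sum(recall_at_k(run[q], labels[q], k) for q in labels) / len(labels)

# Hypothetical labels: which documents should be retrieved for each question.
labels = {"q1": ["d1", "d4"], "q2": ["d2"]}
# Hypothetical ranked results from two retriever versions.
baseline = {"q1": ["d3", "d1", "d5"], "q2": ["d7", "d8", "d2"]}
candidate = {"q1": ["d1", "d4", "d3"], "q2": ["d2", "d7", "d8"]}

baseline_score = mean_recall(baseline, labels, k=2)
candidate_score = mean_recall(candidate, labels, k=2)
print(baseline_score, candidate_score)

# A simple regression gate: do not release a retriever that scores worse than the baseline.
release_ok = candidate_score >= baseline_score
```
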
Security, governance, and responsible use
| Area | Review points |
|---|---|
| Unity Catalog | Data governance, permissions, lineage concepts, managed access patterns |
| Access control | Who can read source data, query indexes, call endpoints, and view logs |
| Secrets | Avoid hard-coded credentials and unsafe token handling |
| Sensitive data | Consider PII, proprietary data, and whether it should be used in prompts or logs |
| Least privilege | Grant only the access needed for the application or user |
| Auditability | Understand why tracking, logs, and governance matter in GenAI systems |
| Safe responses | Know how to handle insufficient context, unsafe requests, or unsupported claims |
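
The secrets row can be illustrated with a minimal sketch that reads a credential from the environment and masks it before logging. The variable name `MODEL_API_TOKEN` and both helpers are hypothetical; on Databricks, a secret scope would typically play the role the environment variable plays here.

```python
import os

def get_api_token(env_var="MODEL_API_TOKEN"):
    """Read a credential from the environment instead of hard-coding it in source."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"{env_var} is not set; configure it outside the codebase")
    return token

def redact(token, visible=4):
    """Mask a token for logs, keeping only the last few characters."""
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]

os.environ["MODEL_API_TOKEN"] = "abcd1234"  # set here only so the example runs
print(redact(get_api_token()))
```
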
Missed-question review method
A missed-question log is more useful than rereading notes.
| Field | What to record |
|---|---|
| Date | When you missed it |
| Topic | RAG, serving, evaluation, security, etc. |
| Question type | Concept, scenario, troubleshooting, terminology |
| Your answer | What you chose |
| Correct idea | The principle you should have applied |
| Why you missed it | Gap, misread, confusion, or overengineering |
| Fix | Note, flashcard, mini-lab, or retest |
| Retest date | When you will verify the fix |
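
If you prefer to keep the log in code rather than a spreadsheet, the fields above map onto a small record type. This is a sketch only: the `MissedQuestion` class, the `due_for_retest` helper, and the three-day default retest interval are assumptions, not a required format.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class MissedQuestion:
    missed_on: date
    topic: str
    question_type: str
    your_answer: str
    correct_idea: str
    why_missed: str
    fix: str
    retest_on: Optional[date] = None

    def __post_init__(self):
        # Assumed default: schedule the retest three days after the miss.
        if self.retest_on is None:
            self.retest_on = self.missed_on + timedelta(days=3)

def due_for_retest(log, today):
    """Return the entries whose retest date has arrived."""
    return [entry for entry in log if entry.retest_on <= today]

entry = MissedQuestion(
    missed_on=date(2025, 1, 1),
    topic="RAG",
    question_type="scenario",
    your_answer="Switch to a larger model",
    correct_idea="Check retrieval quality before blaming the model",
    why_missed="overengineering",
    fix="flashcard",
)
print(entry.retest_on)
```
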
Review loop
- Review missed questions within 24 hours.
- Rewrite the correct decision rule in your own words.
- Add one similar practice question or scenario.
- Retest the topic 2-3 days later.
- Keep the item active until you can explain why the wrong answers are wrong.
| Weak area | Decision rule |
|---|---|
| Poor RAG answers | First check retrieval quality and prompt grounding before blaming the model |
| Sensitive source data | Apply governance and least privilege before designing the app flow |
| Irrelevant retrieved chunks | Revisit chunking, embeddings, metadata filters, and index freshness |
| Unstable output after changes | Add evaluation and regression checks before release |
| Slow application | Check retrieval path, prompt size, model call pattern, and serving design |
When to use timed mock exams
Timed mocks are for exam performance, not initial learning. Use them after you have reviewed enough content to learn from the results.
| Plan | First timed mock | Second timed mock | Final timed practice |
|---|---|---|---|
| 7 days | Day 1 | Day 6 | Small confidence set Day 7 |
| 14 days | Day 7 | Day 12 | Light review Day 14 |
| 30 days | Day 21 | Day 27 | Small set Day 29 |
| 60/90 days | Every 2-3 weeks after foundation phase | Final 10 days | 2-3 days before exam |
How to review a timed mock
| Review step | Action |
|---|---|
| First pass | Mark every miss and every guess |
| Second pass | Categorize by topic and miss type |
| Third pass | Identify repeated patterns |
| Fourth pass | Create 3-5 study tasks, not 20 |
| Final pass | Re-answer missed questions without looking at the explanation |
Avoid taking mock after mock without review. One deeply reviewed mock is usually more valuable than several unreviewed attempts.
Final-week rules
| Rule | Why it matters |
|---|---|
| Stop adding broad new material 48-72 hours before the exam | Prevents overload and confusion |
| Keep studying weak areas, not favorite areas | Favorite topics create false confidence |
| Review correct answers that you guessed | They reveal unstable knowledge |
| Use timed sets sparingly | Protect energy and focus |
| Practice scenario reading | Many errors come from missing constraints |
| Sleep and logistics matter | Fatigue increases careless misses |
Final 48-hour checklist
- Review your top 10 decision rules.
- Review all high-frequency missed topics.
- Revisit the RAG, evaluation, and governance flows.
- Do one small timed set if it builds confidence.
- Stop heavy study the evening before the exam.
- Prepare identification, appointment details, workspace requirements, and timing plan.
Exam-readiness checks
You are likely ready when you can do the following without notes.
| Readiness check | Can you do it? |
|---|---|
| Explain a Databricks GenAI application flow from source data to model response | Yes / No |
| Diagnose poor RAG output using retrieval, prompt, data, and model factors | Yes / No |
| Choose when vector search and embeddings are appropriate | Yes / No |
| Explain how evaluation supports prompt and application changes | Yes / No |
| Describe how MLflow/Mosaic AI concepts support tracking and comparison | Yes / No |
| Apply governance and access-control thinking to GenAI scenarios | Yes / No |
| Handle timed questions without repeatedly running out of time | Yes / No |
| Explain why the wrong answers are wrong on reviewed practice questions | Yes / No |
If several answers are “No,” do not simply read more. Pick the weakest area, do targeted practice, and review misses until the decision rule is clear.
Practical next step
Choose the plan that matches your exam date, take a diagnostic practice set, and build a missed-question log today. Use each study session to repair one specific weakness in your preparation for the Databricks Certified Generative AI Engineer Associate (GenAI Engineer) exam.