GENAI-ASSOC Cheatsheet — RAG, Vector Search, Evaluation & Deployment on Databricks

Last-mile GENAI-ASSOC review: embeddings and chunking, vector search relevance, RAG prompt patterns, evaluation loops, and production trade-offs (cost, latency, governance).

Use this for last-mile review; pair it with the Syllabus and practice drills.


1) The RAG pipeline (the canonical mental model)

    flowchart LR
      DOC["Docs"] --> CH["Chunk + clean"]
      CH --> EMB["Embeddings"]
      EMB --> IDX["Vector index"]
      Q["User query"] --> QEMB["Query embedding"]
      QEMB --> RET["Retrieve top-k"]
      RET --> PROMPT["Prompt with context"]
      PROMPT --> LLM["LLM"]
      LLM --> OUT["Answer + citations"]

Rule: good RAG is mostly data + retrieval quality, not clever prompts.
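
A minimal retrieve-then-generate sketch of this flow is below; embed_fn, search_fn, and llm_fn are hypothetical callables standing in for whatever embedding model, vector index, and chat model you actually deploy.

    # Minimal sketch of the retrieve-then-generate flow above.
    # embed_fn, search_fn, and llm_fn are hypothetical placeholders for your
    # embedding model, vector index, and chat model.
    from typing import Callable, List

    def answer(question: str,
               embed_fn: Callable[[str], List[float]],
               search_fn: Callable[[List[float], int], List[dict]],
               llm_fn: Callable[[str], str],
               top_k: int = 5) -> str:
        q_vec = embed_fn(question)                        # query embedding
        hits = search_fn(q_vec, top_k)                    # retrieve top-k chunks
        context = "\n\n".join(h["text"] for h in hits)    # assemble grounded context
        prompt = ("Answer using only the context below and cite your sources.\n\n"
                  f"Context:\n{context}\n\nQuestion: {question}")
        return llm_fn(prompt)                             # answer + citations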


2) Chunking and embeddings (high-yield pickers)

Decision    | Trade-off              | Rule of thumb
Chunk size  | recall vs precision    | chunks should fit the model context with room for instructions
Overlap     | redundancy vs cost     | small overlap helps continuity
Metadata    | filtering and security | store source, date, tenant, access tags
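
A minimal sketch of the chunk size and overlap rows above: fixed-size chunking with a small overlap. The character-based sizes are illustrative placeholders; production splitters usually work on tokens or sentence boundaries and attach metadata to each chunk.

    # Fixed-size chunking with a small overlap. Sizes are in characters purely
    # for illustration; token- or sentence-aware splitting follows the same pattern.
    def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
        assert 0 <= overlap < chunk_size, "overlap must be smaller than chunk_size"
        step = chunk_size - overlap                 # small overlap preserves continuity
        chunks = []
        for start in range(0, len(text), step):
            chunk = text[start:start + chunk_size]
            if chunk.strip():                       # skip empty or whitespace-only tails
                chunks.append(chunk)
        return chunks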

3) Retrieval relevance (why results look wrong)

Common causes:

  • poor chunking (too big/too small)
  • missing metadata filters (wrong tenant/version; see the filter sketch after this list)
  • query mismatch (user question needs reformulation)
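
As an example of the second cause, here is a hedged sketch of tenant- and version-scoped retrieval with the databricks-vectorsearch client. The endpoint, index, and column names and the exact filters format are assumptions; verify them against the current SDK docs.

    # Sketch: scope retrieval to the right tenant and document version.
    # Endpoint/index/column names and the filters dict shape are assumptions;
    # check the current databricks-vectorsearch documentation.
    from databricks.vector_search.client import VectorSearchClient

    index = VectorSearchClient().get_index(
        endpoint_name="rag_endpoint",               # hypothetical endpoint
        index_name="main.rag.docs_index",           # hypothetical index
    )
    results = index.similarity_search(
        query_text="How do I rotate service credentials?",
        columns=["chunk_text", "source", "tenant"],
        filters={"tenant": "acme", "doc_version": "2024-06"},
        num_results=5,
    )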

4) Evaluation loop (production-safe approach)

What to test      | Examples
Retrieval quality | top-k hit rate, groundedness
Answer quality    | correctness, citation quality
Safety            | leakage, prompt injection resilience
Regression        | keep a fixed eval set
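
A minimal regression-style check for the first and last rows above: top-k hit rate over a fixed eval set. The eval item fields and the search_fn callable are hypothetical; in practice you track this alongside LLM-judged groundedness and correctness.

    # Top-k hit rate over a fixed, version-controlled eval set.
    # Each item maps a question to the chunk id it should retrieve; search_fn is a
    # hypothetical callable returning retrieved chunk ids in rank order.
    from typing import Callable, Dict, List

    def top_k_hit_rate(eval_set: List[Dict],
                       search_fn: Callable[[str, int], List[str]],
                       k: int = 5) -> float:
        hits = sum(1 for item in eval_set
                   if item["expected_chunk_id"] in search_fn(item["question"], k))
        return hits / len(eval_set)

    # Usage: fail the build if retrieval quality regresses.
    # assert top_k_hit_rate(eval_set, search_fn) >= 0.80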

5) Cost/latency controls (exam-friendly)

  • Cache embeddings and reuse indexes (see the sketch after this list).
  • Use metadata filters to reduce candidate set.
  • Limit top-k and context length intentionally.
  • Monitor token usage and tail latency.
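
A sketch of the first and third controls: cache query embeddings and cap top-k and context length before the prompt is built. embed_fn and the limits are illustrative placeholders.

    # Cache query embeddings and cap top-k / context length before prompting.
    # embed_fn is a hypothetical embedding callable; the limits are illustrative.
    from functools import lru_cache
    from typing import List

    MAX_TOP_K = 5
    MAX_CONTEXT_CHARS = 6000        # cap by tokens in practice

    @lru_cache(maxsize=10_000)
    def cached_query_embedding(query: str) -> tuple:
        return tuple(embed_fn(query))               # tuple so the result is hashable

    def build_context(chunks: List[str]) -> str:
        kept, used = [], 0
        for chunk in chunks[:MAX_TOP_K]:            # limit top-k intentionally
            if used + len(chunk) > MAX_CONTEXT_CHARS:
                break                               # stop before the prompt balloons
            kept.append(chunk)
            used += len(chunk)
        return "\n\n".join(kept)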