Last-mile GENAI-ASSOC review: embeddings and chunking, vector search relevance, RAG prompt patterns, evaluation loops, and production trade-offs (cost, latency, governance).
Use this for last-mile review. Pair it with the Syllabus and practice drills.
```mermaid
flowchart LR
  DOC["Docs"] --> CH["Chunk + clean"]
  CH --> EMB["Embeddings"]
  EMB --> IDX["Vector index"]
  Q["User query"] --> QEMB["Query embedding"]
  QEMB --> RET["Retrieve top-k"]
  RET --> PROMPT["Prompt with context"]
  PROMPT --> LLM["LLM"]
  LLM --> OUT["Answer + citations"]
```
Rule: good RAG is mostly data + retrieval quality, not clever prompts.
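A minimal sketch of the pipeline above, assuming a toy hashed bag-of-words `embed()` as a stand-in for a real embedding model and an in-memory matrix as the vector index; the chunk texts and prompt wording are illustrative only.

```python
import numpy as np

DIM = 256

def embed(text: str) -> np.ndarray:
    """Toy embedding: hashed bag-of-words, L2-normalised (stand-in for a real model)."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# "Vector index": here just a matrix of chunk embeddings.
chunks = [
    "Refunds are processed within 5 business days.",
    "Support is available Monday to Friday, 9am-5pm.",
    "Enterprise plans include a dedicated account manager.",
]
index = np.stack([embed(c) for c in chunks])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieve top-k chunks by cosine similarity (vectors are already normalised)."""
    scores = index @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def build_prompt(query: str, context: list[str]) -> str:
    """Prompt-with-context pattern: number the sources so the answer can cite them."""
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return (
        "Answer the question using only the context below. "
        "Cite sources as [n]. If the answer is not in the context, say so.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {query}\nAnswer:"
    )

prompt = build_prompt("How long do refunds take?",
                      retrieve("How long do refunds take?"))
# `prompt` is what gets sent to the LLM; the grounded answer should cite [1].
```

Swapping in a real embedding model and vector store only changes `embed()` and `index`; the retrieve-then-prompt-with-citations shape stays the same.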
| Decision | Trade-off | Rule of thumb |
|---|---|---|
| Chunk size | recall vs precision | chunks should fit the model context with room for instructions |
| Overlap | redundancy vs cost | small overlap helps continuity |
| Metadata | filtering and security | store source, date, tenant, access tags |
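A sketch of the chunk-size and overlap decisions from the table: a fixed-size word-window chunker that also carries per-chunk metadata. The `size` and `overlap` defaults and the `source`/`tenant` fields are assumptions for illustration, not recommendations.

```python
def chunk_text(text: str, source: str, tenant: str,
               size: int = 200, overlap: int = 30) -> list[dict]:
    """Split text into word windows of `size`, repeating `overlap` words for continuity."""
    assert 0 <= overlap < size, "overlap must be smaller than chunk size"
    words = text.split()
    step = size - overlap
    out = []
    for start in range(0, max(len(words) - overlap, 1), step):
        out.append({
            "text": " ".join(words[start:start + size]),
            "source": source,      # for citations in the answer
            "tenant": tenant,      # for access filtering at query time
            "chunk_start": start,  # position in the document, useful for debugging
        })
    return out
```

Larger chunks raise recall per retrieved item but dilute precision and eat context budget; the small overlap keeps sentences that straddle a boundary answerable from at least one chunk.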
What to evaluate:
| What to test | Examples |
|---|---|
| Retrieval quality | top-k hit rate, groundedness |
| Answer quality | correctness, citation quality |
| Safety | leakage, prompt injection resilience |
| Regression | keep a fixed eval set |
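A sketch of the regression row: a fixed eval set plus a top-k hit-rate metric, assuming `retrieve()` from the pipeline sketch above is in scope; the eval cases here are made up purely to show the shape of the loop.

```python
# Fixed eval set: (question, substring expected in a retrieved chunk) pairs.
eval_set = [
    {"question": "How long do refunds take?", "expected": "Refunds are processed"},
    {"question": "When can I reach support?", "expected": "Support is available"},
]

def hit_rate(k: int = 3) -> float:
    """Fraction of eval questions whose expected chunk appears in the top-k results."""
    hits = 0
    for case in eval_set:
        retrieved = retrieve(case["question"], k=k)
        if any(case["expected"] in chunk for chunk in retrieved):
            hits += 1
    return hits / len(eval_set)

# Track this number per release: a drop flags a retrieval regression
# before answer-quality or groundedness metrics move.
print(f"top-3 hit rate: {hit_rate():.2f}")
```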