Keep this page open while drilling questions. AIF‑C01 rewards clean definitions, best-fit service selection, and risk-aware design (hallucinations, privacy, prompt injection, responsible use).
Quick facts (AIF-C01)
| Item | Value |
| --- | --- |
| Questions | 65 (multiple-choice + multiple-response) |
| Time | 90 minutes |
| Passing score | 700 (scaled 100–1000) |
| Cost | 100 USD |
| Domains | D1 20% • D2 24% • D3 28% • D4 14% • D5 14% |
How AIF-C01 questions work (fast strategy)
- If the prompt says “least operational effort,” prefer managed services (and native integrations).
- If the question is about “improving factual accuracy,” the best answer is often grounding (RAG + citations) rather than “make the model bigger.”
- If the question is about “sensitive data,” the best answer usually includes least privilege, encryption, and minimizing what you send to the model.
- If the scenario includes “untrusted user input,” think prompt injection defenses and safe tool use (allowlists, scoped permissions).
- Read the last sentence first to capture the constraint (cost, latency, safety, compliance).
0) Core mental model: a GenAI app (with RAG)
```mermaid
flowchart LR
  U[User] --> A[App / API]
  A -->|Prompt + context| FM[Foundation Model]
  A -->|Embed query| E[Embeddings]
  E --> VS[(Vector Store)]
  VS -->|Top-k chunks| A
  A -->|Policy filters| G[Guardrails / Moderation]
  A -->|Logs/metrics| O[Observability]
  A -->|AuthN/AuthZ| IAM[IAM / Identity]
```
RAG in one sentence: retrieve relevant private content, then ask the model to answer using only that content (ideally with citations).
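To make that concrete, here is a minimal Python sketch of the retrieve-then-ask step. `embed` and `vector_store.search` are hypothetical stand-ins for whatever embedding model and vector store you actually use, and the chunk fields are assumed.

```python
# Minimal RAG request assembly (sketch). embed() and vector_store.search() are
# hypothetical helpers; swap in your real embedding model and vector store.

def build_grounded_prompt(question: str, vector_store, embed, k: int = 4) -> str:
    """Retrieve top-k chunks for the question and wrap them in a grounded prompt."""
    query_vector = embed(question)                       # 1) embed the user question
    chunks = vector_store.search(query_vector, top_k=k)  # 2) similarity search over private docs
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return (
        "Answer using ONLY the context below. "
        'If the answer is not in the context, reply "Insufficient context".\n\n'
        f"Context:\n<<<\n{context}\n>>>\n\n"
        f"Question: {question}\n"
        "Cite the [source] ids you used."
    )
```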
1) AI/ML fundamentals (Domain 1)
Core terminology (must know)
| Term | Exam-friendly meaning |
| --- | --- |
| AI | Broad goal: machines doing tasks that appear intelligent (perception, language, planning). |
| ML | Subset of AI: models learn patterns from data to make predictions/decisions. |
| Deep learning | ML with neural networks (often needs more data/compute; strong for vision/language). |
| Supervised learning | Learn from labeled examples (classification/regression). |
| Unsupervised learning | Find structure without labels (clustering, dimensionality reduction). |
| Reinforcement learning | Learn actions via rewards/penalties (policies). |
| Feature / label | Input signal vs correct output. |
| Training vs inference | Fit the model vs use the model to predict/generate. |
| Overfitting | Great on training data, poor on new data (memorization). |
| Data leakage | Training sees information it shouldn’t (inflates metrics). |
| Drift | Data or reality changes → performance decays over time. |
Metrics (common, conceptual)
| Use case | Useful metrics | What to watch for |
| --- | --- | --- |
| Classification | Precision/recall/F1, ROC-AUC | Class imbalance; false positives vs false negatives |
| Regression | MAE/MSE/RMSE | Outliers; error tolerance |
| Ranking/retrieval | Precision@k / Recall@k | “Did we retrieve the right things?” |
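If you want the arithmetic behind those metrics, here is a small dependency-free sketch; the counts and document ids are made up for illustration.

```python
# Toy metric calculations matching the table above (pure Python, no libraries).

def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many we found
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k retrieved items that are actually relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

print(precision_recall_f1(tp=80, fp=20, fn=40))             # (0.8, 0.666..., 0.727...)
print(precision_at_k(["d1", "d7", "d3"], {"d1", "d3"}, 3))  # 0.666...
```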
ML lifecycle (high level)
```mermaid
flowchart LR
  P[Define problem + metric] --> D[Collect/prepare data]
  D --> T[Train + tune]
  T --> E[Evaluate]
  E --> DEP[Deploy]
  DEP --> M[Monitor + feedback]
  M --> D
```
Common best answer patterns:
- If you can’t define a metric or get data, ML is usually the wrong first move.
- Production ML needs monitoring (quality/latency/cost) and retraining plans.
2) Generative AI fundamentals (Domain 2)
Key GenAI terms (must know)
| Term | Exam-friendly meaning |
| --- | --- |
| LLM | Language model that generates text from prompts. |
| Tokens | Model “chunks” of text; drives cost/limits. |
| Context window | Max tokens the model can consider in one request. |
| Embeddings | Numeric vectors that capture semantic meaning for similarity search. |
| Vector store | Database/index optimized for similarity search over embeddings. |
| RAG | Retrieve relevant data and include it in the prompt to ground answers. |
| Temperature / top-p | Controls randomness vs determinism. |
| Hallucination | Output that sounds plausible but isn’t supported by facts. |
| Prompt injection | Untrusted text attempts to override instructions (“ignore previous”). |
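Mechanically, “embeddings + vector store” boils down to comparing vectors and keeping the closest. A toy sketch, with made-up 3-dimensional vectors standing in for real embedding-model output:

```python
# What "embeddings + similarity search" means mechanically: compare vectors, keep the closest.
# The 3-dimensional vectors below are invented for illustration; real embeddings have hundreds
# or thousands of dimensions and come from an embedding model.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.2],
    "API rate limits": [0.0, 0.2, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend this is the embedded user question

ranked = sorted(documents.items(),
                key=lambda item: cosine_similarity(query_vector, item[1]),
                reverse=True)
print(ranked[0][0])  # "refund policy" — the most semantically similar chunk
```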
Prompting vs RAG vs fine-tuning (decision table)
| Need | Best starting point | Why |
| --- | --- | --- |
| Better instructions/format | Prompt engineering | Fast, cheap, reversible |
| Fresh/private knowledge | RAG | Grounds answers in your content without retraining |
| Consistent style/behavior | Fine-tuning | Teaches patterns; reduces prompt complexity |
| A completely new capability | Usually not AIF-C01 scope | Consider specialist ML work |
GenAI limitations to recognize
- Factuality isn’t guaranteed → use grounding/citations and “unknown” responses.
- Context is limited → don’t paste entire corpora; retrieve and summarize.
- Outputs can be unsafe/biased → add guardrails, evaluation, and human review paths.
- Costs scale with tokens → control prompt size, choose smaller models when acceptable, cache repeated work.
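To see how token counts drive cost, here is a back-of-the-envelope sketch. The per-1K-token prices are placeholder assumptions, not actual Bedrock pricing; check the current per-model pricing before budgeting.

```python
# Back-of-the-envelope token cost estimate. Prices are placeholder assumptions.

PRICE_PER_1K_INPUT = 0.003   # USD, assumed
PRICE_PER_1K_OUTPUT = 0.015  # USD, assumed

def monthly_cost(requests_per_day: int, input_tokens: int, output_tokens: int) -> float:
    per_request = (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
                  (output_tokens / 1000) * PRICE_PER_1K_OUTPUT
    return per_request * requests_per_day * 30

# Big RAG context vs trimmed context: prompt size dominates the bill.
print(monthly_cost(10_000, input_tokens=6_000, output_tokens=400))  # 7200.0
print(monthly_cost(10_000, input_tokens=1_500, output_tokens=400))  # 3150.0
```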
3) AWS service map (what to pick when)
| You need… | Typical AWS answer |
| --- | --- |
| Managed foundation model access for GenAI apps | Amazon Bedrock |
| Build/train/tune/deploy custom ML models | Amazon SageMaker |
| A GenAI assistant for work/dev tasks | Amazon Q |
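As a rough idea of what “managed foundation model access” looks like in code, here is a hedged boto3 sketch using the Bedrock Converse API. The model ID and region are examples only; your account needs access to that model plus the relevant IAM permissions.

```python
# Minimal Bedrock call via boto3's Converse API (sketch — model ID and region are examples;
# the caller needs model access enabled and bedrock:InvokeModel permissions).
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user", "content": [{"text": "Explain RAG in one sentence."}]}],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```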
Pre-built AI services (use-case driven)
| Use case | Typical AWS service |
| --- | --- |
| Extract text/forms from documents | Amazon Textract |
| NLP (entities, sentiment, classification) | Amazon Comprehend |
| Image/video analysis | Amazon Rekognition |
| Speech-to-text | Amazon Transcribe |
| Translation | Amazon Translate |
| Text-to-speech | Amazon Polly |
| Chatbot interfaces | Amazon Lex |
| Enterprise search | Amazon Kendra |
Common building blocks for GenAI apps (glue)
| Need | Typical AWS building blocks |
| --- | --- |
| Store docs and artifacts | Amazon S3 |
| Orchestrate workflows | AWS Step Functions |
| Serverless compute | AWS Lambda |
| Containerized APIs | Amazon ECS/Fargate or Amazon EKS |
| Vector search | Amazon OpenSearch Service, Aurora PostgreSQL with pgvector |
| Secrets and keys | AWS Secrets Manager, AWS KMS |
| Audit + monitoring | AWS CloudTrail, Amazon CloudWatch |
4) RAG: design notes that show up in exam scenarios (Domain 3)
RAG architecture (end-to-end)
```mermaid
flowchart TB
  subgraph Ingestion
    S3[(Docs in S3)] --> C[Chunk + clean]
    C --> EMB1[Create embeddings]
    EMB1 --> VS[(Vector store)]
  end
  subgraph Answering
    Q[User question] --> EMB2[Embed query]
    EMB2 --> VS
    VS --> K[Top-k chunks]
    K --> P[Prompt template: instructions + context]
    P --> FM[Foundation model]
    FM --> A[Answer + citations]
  end
```
High-yield design choices
- Chunking: smaller chunks improve precision; larger chunks improve context. The exam often wants “tune chunking for relevance” (see the chunking sketch after this list).
- Citations: if the requirement says “trust” or “audit,” add citations/source links.
- Freshness: if content changes often, prefer RAG over fine-tuning.
- Privacy: don’t send more data than needed; redact PII; restrict who can retrieve what (multi-tenant boundaries).
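A minimal character-based chunker with overlap, to make the chunking trade-off tangible. Real pipelines usually chunk by tokens, sentences, or document structure instead; this is a sketch.

```python
# Minimal fixed-size chunker with overlap (character-based for simplicity).

def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so facts near a boundary appear in two chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "..." * 1000  # placeholder document text
print(len(chunk_text(doc)))  # smaller chunk_size => more (and more precise) chunks
```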
5) Prompt engineering patterns (Domain 3)
Techniques you should recognize
| Technique | What it does | When to use |
| --- | --- | --- |
| Clear instructions + constraints | Reduces ambiguity | Most questions |
| Few-shot examples | Improves formatting/edge cases | Structured outputs |
| Delimiters | Separates instructions vs data | Untrusted input scenarios |
| Output schema | Produces predictable JSON | App integrations |
| Grounding instructions | Reduces hallucinations | RAG and knowledge tasks |
| Refusal/escalation | Safer behavior | Policy/safety constraints |
Prompt template (practical)
```text
Goal: Answer the user question using ONLY the provided context.
Context:
<<<
{retrieved_chunks}
>>>
Rules:
- If the answer is not in the context, say "Insufficient context".
- Provide 2-3 bullet citations (source titles/ids).
Output format (JSON):
{"answer":"...", "citations":[{"source":"...","quote":"..."}]}
User question: {question}
```
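A sketch of how an app might fill that template and validate the model’s JSON reply. `call_model` is a hypothetical stand-in for your actual inference call (e.g., a Bedrock wrapper).

```python
# Fill the template above and validate the JSON the model returns.
# call_model() is a hypothetical stand-in for your actual inference call.
import json

TEMPLATE = """Goal: Answer the user question using ONLY the provided context.
Context:
<<<
{retrieved_chunks}
>>>
Rules:
- If the answer is not in the context, say "Insufficient context".
- Provide 2-3 bullet citations (source titles/ids).
Output format (JSON):
{{"answer":"...", "citations":[{{"source":"...","quote":"..."}}]}}
User question: {question}"""

def answer(question: str, retrieved_chunks: str, call_model) -> dict:
    prompt = TEMPLATE.format(retrieved_chunks=retrieved_chunks, question=question)
    raw = call_model(prompt)
    try:
        result = json.loads(raw)
    except json.JSONDecodeError:
        return {"answer": "Insufficient context", "citations": []}  # fail closed
    if not result.get("citations"):
        return {"answer": "Insufficient context", "citations": []}  # no sources, no answer
    return result
```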
Anti-prompt-injection rule of thumb
Treat user-provided text as data, not instructions. If the model is allowed to call tools/actions, use allowlists and scoped permissions.
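For the tool-use half of that rule, a minimal allowlist dispatcher looks like the sketch below; the tool names and functions are illustrative.

```python
# "Safe tool use" in miniature: the model may only request tools on an explicit allowlist,
# and anything else is refused. Tool names and implementations here are illustrative.

ALLOWED_TOOLS = {
    "lookup_order_status": lambda order_id: f"status for {order_id}",  # read-only, scoped
    "search_kb": lambda query: f"top KB hits for {query!r}",
}

def dispatch_tool(requested_name: str, argument: str) -> str:
    tool = ALLOWED_TOOLS.get(requested_name)
    if tool is None:
        # Never execute arbitrary tool names coming from model output or user text.
        return f"Tool '{requested_name}' is not allowed."
    return tool(argument)

print(dispatch_tool("lookup_order_status", "A-123"))
print(dispatch_tool("delete_all_records", ""))  # refused
```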
6) Evaluation and monitoring (Domain 3)
What to evaluate
| Dimension | How to test it (high level) |
| --- | --- |
| Correctness | Gold questions, expert review, spot checks |
| Groundedness | Require citations; verify claims against sources |
| Safety | Toxicity/harm prompts; policy violations; refusal behavior |
| Bias | Compare outcomes across groups; document disparities |
| Reliability | Regression tests for prompt/model changes |
| Latency/cost | Measure P50/P95 and token usage; set budgets |
Common “best answers”:
- Use a representative test set (not just a few demos).
- Do A/B testing when changing prompts/models.
- Monitor production for quality regressions and abuse.
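A regression-style evaluation loop can be very small. In the sketch below, `ask` is a hypothetical wrapper around your app (retrieval + prompt + model) and the gold set is illustrative.

```python
# Tiny regression-style evaluation over a gold set. ask() is a hypothetical wrapper
# around your app; the gold questions and expected phrases below are illustrative.

GOLD_SET = [
    {"question": "What is our refund window?", "must_contain": "30 days"},
    {"question": "Who is the CEO of Mars?", "must_contain": "Insufficient context"},
]

def evaluate(ask) -> float:
    passed = 0
    for case in GOLD_SET:
        answer = ask(case["question"])
        if case["must_contain"].lower() in answer.lower():
            passed += 1
        else:
            print(f"FAIL: {case['question']!r} -> {answer[:80]!r}")
    return passed / len(GOLD_SET)

# Run this before and after every prompt or model change; block the change if the score drops.
```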
7) Responsible AI (Domain 4)
Responsible AI checklist (high signal)
- Define intended use + out-of-scope use (avoid “silent expansion”).
- Add human oversight for high-impact decisions.
- Evaluate for bias and document limitations.
- Implement safety policies (harmful content, privacy leakage).
- Be transparent with users (what it is, what it isn’t, how to verify).
Common risks and mitigations
| Risk | Typical mitigation |
| --- | --- |
| Hallucinations | RAG + citations; “unknown” responses |
| Unsafe content | Guardrails/moderation + refusal behavior |
| Privacy leakage | Data minimization; redaction; access controls |
| Bias/unfairness | Diverse evaluation sets; monitoring and remediation |
| Over-trust | User messaging + explainability + source links |
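To make “redaction” concrete: a naive regex-based sketch that masks obvious emails and phone-like numbers before text reaches the model. Real PII detection needs far more than this (Amazon Comprehend offers a managed option); treat it as illustration only.

```python
# Naive PII redaction before sending text to a model. Regexes catch only obvious patterns;
# this is illustrative, not a complete solution.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(redact("Contact jane.doe@example.com or +1 (555) 123-4567 about the claim."))
# Contact [EMAIL] or [PHONE] about the claim.
```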
8) Security, compliance, and governance (Domain 5)
Security “gotchas” the exam expects you to notice
- Over-permissive IAM roles (“*” actions/resources)
- Secrets embedded in prompts, logs, or code (fetch them at runtime instead; see the sketch after this list)
- Sending unnecessary sensitive data to the model
- No audit trail for access and changes
- Tool use without constraints (model can “do anything”)
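For the secrets gotcha above, the fix is to fetch credentials at runtime with scoped permissions rather than embedding them. A hedged boto3 sketch; the secret name is an example.

```python
# Fetch credentials at runtime instead of hard-coding them in prompts, code, or logs.
# The secret name is an example; the caller's IAM role needs secretsmanager:GetSecretValue
# on that specific secret (least privilege), not on "*".
import boto3

secrets = boto3.client("secretsmanager", region_name="us-east-1")

def get_api_key(secret_id: str = "genai/app/api-key") -> str:  # example secret name
    response = secrets.get_secret_value(SecretId=secret_id)
    return response["SecretString"]

# Never log or echo the returned value, and never paste it into a prompt.
```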
AWS controls to name in answers (by theme)
| Theme | Common AWS controls |
| --- | --- |
| Identity | IAM roles/policies, least privilege |
| Encryption | AWS KMS, TLS |
| Secrets | AWS Secrets Manager |
| Network | VPC endpoints/PrivateLink, security groups |
| Audit | AWS CloudTrail |
| Monitoring | Amazon CloudWatch, AWS Security Hub, Amazon GuardDuty |
| Governance | AWS Organizations (accounts, SCPs), tagging |
| Compliance evidence | AWS Artifact |
Next steps
- Use the Syllabus as your checklist (objective-by-objective).
- Use Practice to drill weak tasks fast.
- Use the Study Plan if you want a 30/60/90-day schedule.