AIF-C01 Syllabus — Objectives by Domain

Blueprint-aligned learning objectives for AWS Certified AI Practitioner (AIF-C01), organized by domain with quick links to targeted practice.

Use this syllabus as your source of truth for AIF-C01. Work through each domain in order and drill targeted sets after every task.

What’s covered

Domain 1: Fundamentals of AI and ML (20%)

Practice this topic →

Task 1.1 - Explain basic AI concepts and terminologies

  • Differentiate artificial intelligence (AI), machine learning (ML), and deep learning, and identify when each is most applicable.
  • Distinguish supervised learning, unsupervised learning, and reinforcement learning using simple real-world examples.
  • Compare classification, regression, and clustering problems and choose the correct framing for a described business requirement.
  • Define common ML dataset terms (feature, label, training example, ground truth) and explain why data quality matters.
  • Differentiate model parameters and hyperparameters and describe how hyperparameter tuning affects model performance.
  • Explain underfitting and overfitting and relate them to bias/variance trade-offs at a high level.
  • Describe training, validation, and test splits and identify common causes of data leakage.
  • Select appropriate evaluation metrics (accuracy, precision, recall, F1) based on the problem type and error costs.
  • Interpret a confusion matrix and recognize the effect of class imbalance on model evaluation (a worked sketch follows this list).
  • Differentiate training from inference and relate inference latency and throughput to real-time and batch use cases.
  • Explain what an embedding represents and how vector similarity search enables semantic retrieval.
  • Identify common AI subfields (NLP, computer vision, speech) and the kinds of inputs/outputs each typically uses.
  • Differentiate generative models and discriminative models at a conceptual level and recognize example outputs of each.
  • Explain what a foundation model is and why organizations use foundation models as a base for multiple downstream tasks.
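
A minimal sketch of the metrics above, computed directly from binary confusion-matrix counts; the counts are made-up illustration values chosen to show how class imbalance inflates accuracy:

```python
# Derive accuracy, precision, recall, and F1 from confusion-matrix counts.
tp, fp, fn, tn = 80, 10, 40, 870   # imbalanced: 120 positives vs 880 negatives

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)   # of predicted positives, how many were correct
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.95 precision=0.89 recall=0.67 f1=0.76
# 95% accuracy while missing a third of all positives: imbalance at work.
```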

Task 1.2 - Identify practical use cases for AI

  • Identify when AI/ML is a good fit versus when a rules-based or traditional analytics solution is more appropriate.
  • Select AI/ML approaches that support forecasting and time-series prediction use cases.
  • Recognize anomaly detection scenarios (operations, fraud, security) and the signals typically required.
  • Identify recommendation and personalization use cases and the data types commonly used to power them.
  • Recognize document processing use cases (OCR, forms extraction, summarization) and the typical workflow from ingest to output.
  • Identify NLP use cases (sentiment analysis, entity extraction, classification) and the expected outputs.
  • Identify computer vision use cases (image classification, object detection, face analysis) and the required input media.
  • Identify speech and language use cases (speech-to-text, translation, text-to-speech) and common constraints (latency, accuracy, accents).
  • Identify conversational AI use cases and recognize when chatbots should route to human agents.
  • Map common use cases to AWS AI services (for example: Amazon Rekognition, Amazon Textract, Amazon Comprehend, Amazon Transcribe, Amazon Translate, Amazon Polly, Amazon Lex, Amazon Kendra, Amazon Personalize, Amazon Forecast).
  • Choose between using a managed, pre-trained AWS AI service and building a custom ML model based on required customization and operational effort (a service-call sketch follows this list).
  • Identify business scenarios where generative AI adds value (summarization, content creation, Q&A, code assistance) and where it is risky.
  • Recognize human-in-the-loop patterns and acceptance criteria (precision/recall targets, review thresholds) for higher-impact decisions.
  • Identify key data considerations for AI use cases, including privacy/PII, bias, representativeness, and data access constraints.
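
To make the managed-service option concrete, here is a minimal sketch of calling Amazon Comprehend for sentiment analysis via boto3; it assumes AWS credentials are already configured, and the region shown is only an example:

```python
import boto3

# One API call replaces training, hosting, and maintaining a custom model.
comprehend = boto3.client("comprehend", region_name="us-east-1")

resp = comprehend.detect_sentiment(
    Text="The checkout flow was fast, but support never replied.",
    LanguageCode="en",
)
print(resp["Sentiment"])       # e.g. "MIXED"
print(resp["SentimentScore"])  # confidence per sentiment label
```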

Task 1.3 - Describe the ML development lifecycle

  • Translate a business problem into an ML problem statement and define measurable success criteria and baseline performance.
  • Identify common data sources for ML and describe the role of data labeling and ground truth creation.
  • Describe typical data preparation steps (cleaning, normalization, deduplication) and why these steps affect model outcomes.
  • Explain feature engineering and feature selection concepts at a high level and when they are most important.
  • Describe model selection at a high level, including choosing a simple baseline before using more complex models.
  • Explain the purpose of hyperparameter tuning and how to avoid overfitting while tuning.
  • Describe evaluation practices for ML models, including holdout testing and avoiding data leakage (a split sketch follows this list).
  • Describe deployment options for ML inference (batch vs real-time) and how models integrate into applications.
  • Identify common production monitoring needs for ML (data drift, concept drift, performance regression, latency).
  • Describe retraining triggers and feedback loops and why model performance can degrade over time.
  • Explain MLOps at a high level, including automation, versioning, repeatable pipelines, and model governance.
  • Identify AWS components commonly used in an ML lifecycle (for example: Amazon S3 for data/artifacts and Amazon SageMaker for training and inference).
  • Describe reproducibility concepts, including tracking datasets, code, parameters, and model versions across experiments.
  • Recognize responsible ML lifecycle needs (documentation, approvals, audits) before promoting a model to production.
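
A minimal sketch of leakage-safe data splitting, assuming scikit-learn: the scaler is fit only on training data, because fitting preprocessing on the full dataset before splitting leaks test-set statistics into training:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, random_state=42)

# Carve out the held-out test set first, then split train vs validation.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2,
                                                random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25,
                                                  random_state=42)

scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train, X_val, X_test = (scaler.transform(s) for s in (X_train, X_val, X_test))
```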

Domain 2: Fundamentals of Generative AI (24%)

Practice this topic →

Task 2.1 - Explain the basic concepts of generative AI

  • Define generative AI and describe how it differs from traditional predictive ML in terms of outputs and use cases.
  • Identify common generative model families (large language models, diffusion models, GANs) and the types of content they generate.
  • Explain the transformer concept at a high level and why attention mechanisms are useful for language tasks.
  • Define tokens, tokenization, and context window and relate them to what a model can "see" in a single request.
  • Explain prompt and completion concepts and identify common generation settings (temperature, top-k, top-p) and their effects (a sampling sketch follows this list).
  • Define embeddings and explain how embeddings enable semantic search and retrieval for generative AI applications.
  • Explain the concept of foundation models and why foundation models are adaptable to many downstream tasks.
  • Describe retrieval-augmented generation (RAG) at a high level and why it can reduce hallucinations and improve grounding.
  • Differentiate prompt engineering, RAG, and fine-tuning as methods to improve model usefulness for a specific task.
  • Explain hallucinations in generative AI and identify mitigation strategies (grounding, constraints, verification, human review).
  • Recognize generative AI safety risks (toxic content, harmful instructions) and the need for moderation and guardrails.
  • Explain prompt injection and jailbreak concepts at a high level and why untrusted input requires defensive design.
  • Identify key drivers of generative AI cost and performance (token volume, context size, model size, latency).
  • Describe model selection trade-offs (capability, speed, cost, context length, modality) when choosing a model for a workload.
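
A toy sketch of how temperature and top-p reshape a next-token distribution; the four-word vocabulary and logits are made up, and real models apply the same idea over tens of thousands of tokens:

```python
import math, random

vocab = ["cat", "dog", "car", "cloud"]
logits = [2.0, 1.5, 0.3, -1.0]

def sample(logits, temperature=1.0, top_p=1.0):
    scaled = [l / temperature for l in logits]   # temperature < 1 sharpens
    exps = [math.exp(s - max(scaled)) for s in scaled]
    probs = [e / sum(exps) for e in exps]
    # Nucleus (top-p) filter: smallest token set covering top_p of the mass.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    total = sum(probs[i] for i in keep)
    weights = [probs[i] / total for i in keep]
    return random.choices([vocab[i] for i in keep], weights)[0]

print(sample(logits, temperature=0.2, top_p=0.9))  # only "cat" survives the filter
print(sample(logits, temperature=1.5, top_p=1.0))  # noticeably more varied
```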

Task 2.2 - Understand the capabilities and limitations of generative AI for solving business problems

  • Identify business problems that generative AI commonly solves well (summarization, Q&A, extraction, drafting, ideation).
  • Recognize common limitations of generative AI (hallucinations, context limits, brittle prompts) and how they affect solution design.
  • Determine when human review or human-in-the-loop workflows are required to manage risk and quality.
  • Identify bias and fairness risks in generative AI outputs and the need for evaluation across user groups.
  • Recognize privacy and confidentiality concerns when sending prompts and documents to models, including handling of PII.
  • Recognize intellectual property and content provenance considerations, including licensed data and ownership of outputs.
  • Explain latency and cost trade-offs in production gen AI, including when caching and smaller models are appropriate.
  • Select appropriate generation settings (for example, temperature) based on whether the task requires creativity or determinism.
  • Determine when to use RAG versus fine-tuning based on knowledge freshness, proprietary data, and required style consistency.
  • Recognize multimodal use cases (text + image + audio) and constraints that affect model choice and evaluation.
  • Explain why evaluation and monitoring are required for generative AI systems, including regression testing for prompts and models.
  • Identify fallback and fail-safe strategies for gen AI outputs (refusal, escalation, "I don't know", human handoff).
  • Recognize the role of guardrails and content filtering for policy compliance (safety, brand voice, privacy constraints).
  • Identify organizational readiness requirements for gen AI adoption (policy, training, risk ownership, and change management).

Task 2.3 - Describe AWS infrastructure and technologies for building generative AI applications

  • Describe Amazon Bedrock at a high level and how it provides managed access to foundation models for inference (an invocation sketch follows this list).
  • Identify common Amazon Bedrock capabilities used in applications (model selection, prompt management patterns, guardrails) at a conceptual level.
  • Describe Amazon SageMaker at a high level as a platform for building, training, tuning, and deploying ML models.
  • Explain the role of SageMaker JumpStart and pre-built models as a starting point for ML and foundation model workloads.
  • Describe the purpose of vector databases and identify common AWS options for vector search (for example: Amazon OpenSearch Service and Aurora PostgreSQL with pgvector).
  • Describe typical storage layers for generative AI applications (Amazon S3 for documents and artifacts, databases for metadata) and why they are used.
  • Identify common orchestration and automation services for gen AI pipelines (AWS Lambda, AWS Step Functions, AWS Glue) and what each is used for.
  • Describe deployment patterns for gen AI APIs (Amazon API Gateway + AWS Lambda, containers on Amazon ECS/Fargate or Amazon EKS) and basic scaling considerations.
  • Identify observability components for AI apps (Amazon CloudWatch, AWS X-Ray) and audit logging needs (AWS CloudTrail) at a high level.
  • Describe security building blocks for gen AI solutions (IAM, AWS KMS, AWS Secrets Manager, VPC endpoints/PrivateLink) and why they are used.
  • Identify responsible AI tooling patterns on AWS (for example: content filters/guardrails and bias/explainability tooling) and where they fit in the lifecycle.
  • Describe Amazon Q at a high level and recognize common assistant use cases (business Q&A, developer assistance) and constraints.
  • Explain high-level cost management strategies for AI workloads (token usage awareness, right-sizing, caching, and setting budgets).
  • Recognize reliability considerations for gen AI applications (timeouts, retries, fallbacks, multi-Region design) and their trade-offs.
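
A minimal sketch of invoking a foundation model through the Amazon Bedrock Converse API with boto3; the model ID is a placeholder for one your account has been granted access to, and credentials/region are assumed to be configured:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    messages=[{"role": "user",
               "content": [{"text": "Summarize RAG in two sentences."}]}],
    inferenceConfig={"temperature": 0.2, "maxTokens": 200},
)
print(resp["output"]["message"]["content"][0]["text"])
print(resp["usage"])  # input/output token counts -- the main cost driver
```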

Domain 3: Applications of Foundation Models (28%)

Practice this topic →

Task 3.1 - Describe design considerations for applications that use foundation models

  • Select an appropriate foundation model based on modality, context length, latency, cost, and licensing/commercial terms at a high level.
  • Compare using a managed foundation model service (for example, Amazon Bedrock) versus self-hosting a model, including operational trade-offs.
  • Design a basic gen AI application architecture including prompt templates, retrieval (optional), and tool/function integration.
  • Describe document ingestion and chunking strategies for RAG and how chunking affects retrieval quality and cost (a chunking sketch follows this list).
  • Select an embedding strategy (embedding model choice, similarity function) and explain how it impacts semantic search results.
  • Describe conversation memory concepts and identify options for storing session state and chat history.
  • Explain grounding and citation patterns for RAG-based answers and why source attribution improves trust and auditability.
  • Describe caching patterns for gen AI (prompt/result caching, embedding caching) and their cost/latency impact.
  • Design for rate limits, throttling, and retries, including backpressure patterns to protect downstream model endpoints.
  • Identify safe tool-use design practices (allowlists, scoped permissions, validation) when models are used to invoke actions.
  • Explain trade-offs between streaming and non-streaming responses for user experience and system design.
  • Describe multi-tenant isolation considerations for gen AI, including data segregation and least-privilege access boundaries.
  • Identify observability requirements for gen AI systems (metrics, traces, error rates) while minimizing sensitive prompt/response exposure.
  • Design failure handling for AI features (fallback answers, escalation paths, safe defaults) to maintain user trust and continuity.
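
A minimal sketch of fixed-size chunking with overlap for RAG ingestion; sizes are character counts for simplicity, and production pipelines often split on sentence or section boundaries instead:

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # overlap preserves context across boundaries
    return chunks

# Smaller chunks retrieve more precisely but multiply embedding cost;
# larger chunks carry more context but dilute similarity scores.
pieces = chunk("your ingested document text here " * 200)
```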

Task 3.2 - Choose effective prompt engineering techniques

  • Write clear prompts that specify role, goal, constraints, and success criteria to reduce ambiguity in model responses.
  • Use structure (headings, bullet lists, delimiters) to separate instructions, context, and user input to improve reliability.
  • Apply few-shot prompting (examples) to guide formatting, style, and edge-case handling.
  • Specify output schemas (for example, valid JSON fields) and validation rules to produce structured and machine-readable outputs (a validation sketch follows this list).
  • Use decomposition techniques (break down a complex task into steps or sub-questions) to improve response quality.
  • Inject retrieved context for RAG with clear grounding instructions (use sources, avoid unsupported claims).
  • Separate system-level instructions from user content and recognize why mixing untrusted input with instructions increases risk.
  • Use prompt chaining or iterative refinement (draft → critique → improve) to increase consistency and reduce omissions.
  • Tune generation settings (temperature, top-p, max tokens) to balance determinism, creativity, and response length.
  • Handle long contexts by summarizing, chunking, and selecting the most relevant information rather than pasting entire corpora.
  • Reduce hallucinations by requesting citations, constraining the answer space, and instructing the model to ask clarifying questions.
  • Apply prompt-injection defenses in prompt design (treat user content as data, not instructions; refuse unsafe tool actions).
  • Use evaluation-driven prompt iteration by testing prompts against a representative set of inputs and tracking results over time.
  • Manage prompts as versioned assets (prompt libraries, change control) to support repeatability and safe rollbacks.
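
A minimal sketch of a structured prompt that separates instructions from untrusted input with delimiters and validates the reply as JSON before use; `call_model` is a hypothetical stand-in for whatever inference call the application makes:

```python
import json

PROMPT = """You are a support-ticket triager.
Return ONLY valid JSON with keys: "category" (string) and "urgency" (1-5).

<ticket>
{ticket}
</ticket>"""

def triage(ticket: str, call_model) -> dict:
    raw = call_model(PROMPT.format(ticket=ticket))
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict) or not {"category", "urgency"} <= data.keys():
        raise ValueError(f"unexpected model output: {raw!r}")
    return data
```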

Task 3.3 - Describe the training and fine-tuning process for foundation models

  • Differentiate pretraining, fine-tuning, and instruction tuning at a conceptual level and describe what each aims to improve.
  • Identify dataset requirements for supervised fine-tuning (quality, representativeness, labeling) and the impact of poor data.
  • Describe parameter-efficient fine-tuning approaches (for example, LoRA/adapters) at a high level and why they reduce cost (a parameter-count sketch follows this list).
  • Select between prompt engineering, RAG, and fine-tuning based on whether the need is knowledge, style, or task-specific behavior.
  • Explain the risk of overfitting in fine-tuning and the need for validation sets and early stopping concepts.
  • Identify compute and cost considerations for training and fine-tuning (GPU/accelerator needs, time-to-train, experimentation).
  • Recognize training data governance needs, including removing sensitive data and ensuring proper licensing and permissions.
  • Describe customization options on AWS at a high level (for example, model customization via Amazon Bedrock and training jobs in Amazon SageMaker).
  • Describe the role of embeddings models and how customizing embeddings can affect retrieval quality in RAG solutions.
  • Explain the idea of alignment (for example, reinforcement learning from human feedback) at a high level and why it matters.
  • Describe model deployment considerations after fine-tuning, including versioning, rollback, and controlled rollout.
  • Identify monitoring needs after customization, including regression testing and detecting shifts in output quality over time.
  • Recognize the value of experiment tracking and reproducibility (datasets, prompts, training code, parameters) during customization.
  • Describe continuous improvement for foundation models using feedback data while maintaining governance and change control.
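
A toy illustration of why low-rank adaptation (LoRA) is parameter-efficient: instead of updating a full d x d weight matrix, it trains two small matrices B (d x r) and A (r x d) with r much smaller than d; the numbers below are illustrative:

```python
d, r = 4096, 8                   # hidden size, adapter rank

full_update = d * d              # trainable params for a full-weight update
lora_update = d * r + r * d      # trainable params for B and A

print(f"full: {full_update:,}")  # 16,777,216
print(f"lora: {lora_update:,}")  # 65,536 -- about 0.4% of the full update
# Effective weight at inference: W' = W + B @ A, with W kept frozen.
```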

Task 3.4 - Describe methods to evaluate foundation model performance

  • Define evaluation dimensions for foundation models, including quality, groundedness, safety, latency, and cost.
  • Create a representative evaluation set (prompts + expected outputs or rubrics) and explain why coverage matters (a harness sketch follows this list).
  • Describe human evaluation using scoring rubrics and why consistency across reviewers improves reliability.
  • Identify when automated metrics (for example, ROUGE/BLEU or embedding similarity) are appropriate and when they are misleading.
  • Evaluate hallucinations and factuality risks and identify techniques for measuring groundedness and correctness.
  • Evaluate safety risks (toxicity, self-harm, policy violations) and define acceptable thresholds and refusal behavior.
  • Evaluate bias and fairness by comparing behavior across demographic or user groups and documenting disparities.
  • Evaluate RAG systems end-to-end, including retrieval accuracy, chunk quality, and the impact of sources on final answers.
  • Run A/B tests for prompts, models, or retrieval settings and interpret results to select the best-performing configuration.
  • Describe production monitoring for foundation model applications, including user feedback signals and quality regression detection.
  • Explain red teaming and adversarial testing at a high level and why it is essential for high-risk applications.
  • Evaluate multi-turn conversations for consistency, instruction adherence, and appropriate use of conversation memory.
  • Recognize evaluation needs for multimodal models (text + image/audio) and modality-specific quality and safety checks.
  • Define acceptance criteria and rollout/rollback triggers based on evaluation outcomes and business risk tolerance.
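
A minimal sketch of an evaluation set and harness: each case pairs a prompt with required keywords and the harness reports a pass rate. Keyword checks are crude, so rubric-based or model-graded scoring is common beyond smoke tests; `call_model` is again a hypothetical stub:

```python
EVAL_SET = [
    {"prompt": "What does RAG stand for?",
     "must_contain": ["retrieval", "generation"]},
    {"prompt": "Name one way to reduce hallucinations.",
     "must_contain": ["ground"]},
]

def evaluate(call_model) -> float:
    passed = 0
    for case in EVAL_SET:
        answer = call_model(case["prompt"]).lower()
        if all(kw in answer for kw in case["must_contain"]):
            passed += 1
    return passed / len(EVAL_SET)  # track across prompt/model versions
```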

Domain 4: Guidelines for Responsible AI (14%)

Practice this topic →

Task 4.1 - Explain the development of AI systems that are responsible

  • Define responsible AI principles (fairness, accountability, transparency, privacy, safety) and how they guide system design.
  • Identify common sources of bias in AI systems (data collection, labeling, sampling) and high-level mitigation strategies.
  • Explain data minimization and privacy-by-design concepts and how they reduce risk when building AI systems.
  • Identify when human-in-the-loop review is appropriate and how escalation paths reduce harm for high-impact decisions.
  • Perform a high-level risk assessment by defining intended use, out-of-scope use, and failure modes for an AI feature.
  • Describe content safety controls (policy-based filtering, refusal behavior) and why they are necessary for generative AI.
  • Recognize the value of documentation (model cards, datasheets for datasets) for governance and stakeholder trust.
  • Design feedback and incident response loops for AI harms (reporting, triage, remediation, post-incident learning).
  • Describe monitoring for misuse and harmful outputs, including abuse detection and policy violation signals.
  • Recognize responsible AI risk management frameworks (for example, NIST AI RMF) and why structured controls matter.
  • Identify AWS services and features that support responsible AI patterns (for example, Amazon Bedrock Guardrails and Amazon SageMaker Clarify).
  • Explain consent, data rights, and appropriate use boundaries for training data and user-provided inputs.
  • Recognize environmental and cost considerations of AI systems and why efficiency can be a responsible design goal.
  • Design inclusive and accessible AI experiences, including clear user communication and accommodations for diverse users.

Task 4.2 - Recognize the importance of transparent and explainable models

  • Differentiate transparency, explainability, and interpretability and identify why each matters to different stakeholders.
  • Identify scenarios where explainability is required (regulated industries, high-impact decisions) and what evidence may be needed.
  • Describe feature importance at a high level and how it can be used to explain model predictions (a permutation-importance sketch follows this list).
  • Recognize common explainability techniques (for example, LIME/SHAP) and what kinds of explanations they provide.
  • Explain how to communicate model limitations, uncertainty, and appropriate use to users to build trust and reduce misuse.
  • Describe citation and source attribution patterns for generative AI (especially RAG) and how they support transparency.
  • Design user-facing messaging that clarifies whether outputs are generated, what data sources were used, and how to verify results.
  • Recognize the role of audit logs and traceability for investigating model behavior and supporting compliance requirements.
  • Explain the trade-off between simple, interpretable models and complex, higher-performing models and how to choose appropriately.
  • Explain why some explainability techniques can be misleading if they do not faithfully represent the model’s true behavior.
  • Recognize explainability risks such as leaking sensitive information through explanations and the need for safe disclosure.
  • Identify AWS tooling that supports explainability and bias analysis (for example, Amazon SageMaker Clarify) and what it can provide.
  • Describe decision thresholds and decision boundaries at a high level and how they relate to transparency in classification tasks.
  • Recognize that AI systems often combine multiple models and components, and explainability may require tracing the full pipeline.
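
A minimal sketch of permutation feature importance, assuming scikit-learn: shuffle one feature at a time and measure how much held-out score drops; larger drops mean the model leaned on that feature more:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=10,
                                random_state=0)

# Top five features by mean importance (score drop when shuffled).
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda p: -p[1])[:5]:
    print(f"{name}: {score:.3f}")
```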

Domain 5: Security, Compliance, and Governance for AI Solutions (14%)

Practice this topic →

Task 5.1 - Explain methods to secure AI systems

  • Apply least-privilege IAM design for AI systems, including scoping permissions for data access, model invocation, and administration.
  • Describe encryption in transit and at rest for AI data and artifacts using TLS and AWS KMS-managed keys.
  • Explain network isolation concepts for AI workloads, including private subnets, security groups, and private connectivity (VPC endpoints/PrivateLink).
  • Protect secrets used by AI applications (API keys, database credentials) using AWS Secrets Manager and secure rotation practices.
  • Secure retrieval data stores used for RAG with strong access controls, encryption, and appropriate data partitioning.
  • Recognize prompt injection and jailbreak risks and apply high-level mitigations (input boundaries, guardrails, tool allowlists).
  • Implement protections against data exfiltration through model outputs, including output filtering and PII redaction patterns (a redaction sketch follows this list).
  • Recognize threats to training pipelines such as data poisoning and how governance, validation, and provenance checks reduce risk.
  • Describe model artifact protection concepts (integrity, signing, secure storage) and why tampered artifacts are a risk.
  • Explain audit logging needs for AI systems, including capturing access events with AWS CloudTrail and controlling log access.
  • Identify security monitoring services and patterns (Amazon GuardDuty, AWS Security Hub, Amazon CloudWatch alarms) to detect threats.
  • Protect AI-facing APIs using controls such as AWS WAF, throttling/rate limiting, and authentication/authorization mechanisms.
  • Apply secure SDLC concepts to AI applications, including threat modeling, dependency hygiene, and secure CI/CD practices.
  • Design multi-tenant isolation for AI workloads, including data boundaries, per-tenant encryption strategies, and access segmentation.
  • Recognize adversarial attacks (for example, adversarial examples and evasion) and why robustness testing is part of security.
  • Describe incident response basics for AI systems, including containment, rotation of credentials/keys, and post-incident review.
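
A minimal sketch of redacting common PII patterns from model output before it is returned or logged; regexes like these catch only obvious formats, and managed options (for example, Amazon Comprehend PII detection or Bedrock guardrails) cover far more entity types:

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane@example.com or 555-867-5309."))
# -> Reach me at [EMAIL] or [PHONE].
```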

Task 5.2 - Recognize governance and compliance regulations for AI systems

  • Define AI governance and explain why organizations implement oversight for data, models, and AI-enabled decisions.
  • Describe data governance concepts relevant to AI (classification, lineage, retention, access) and why they matter for compliance.
  • Recognize privacy regulations (for example, GDPR and CCPA) and identify how they influence data handling for AI workloads.
  • Recognize industry regulations (for example, HIPAA or PCI DSS) and why regulated workloads require additional controls and auditing.
  • Recognize emerging AI regulations and frameworks (for example, the EU AI Act) at a high level and why risk classification matters.
  • Identify AWS Artifact as a source for compliance reports and agreements and explain how it supports audits.
  • Describe auditability requirements, including capturing access logs, configuration history, and model version changes.
  • Describe change management for AI systems, including approvals for model updates, prompt changes, and deployment rollouts.
  • Explain how AWS Organizations, multi-account strategy, and tagging support governance, separation of duties, and cost allocation.
  • Recognize third-party model and provider considerations, including contracts, licensing terms, and data usage restrictions.
  • Define policies for prompt and response logging that balance audit needs with privacy requirements and data minimization.
  • Describe acceptable use policies for generative AI, including prohibited content categories and organizational guardrails.
  • Identify documentation artifacts that support governance (model cards, risk assessments, evaluation results, monitoring dashboards).
  • Recognize security and compliance frameworks (for example, ISO 27001, SOC reports, NIST) and the concept of control mapping.
  • Describe data residency and cross-border transfer considerations and how region selection can support compliance requirements.
  • Explain retention and deletion workflows (including right-to-be-forgotten concepts) and how they affect AI datasets and logs.

Tip: For AIF-C01, learn the concept first, then drill until you can explain the trade-off in one sentence (for example: prompting vs RAG vs fine-tuning).