AIF-C01 Syllabus — Objectives by Domain

Blueprint-aligned learning objectives for AWS Certified AI Practitioner (AIF-C01), organized by domain with quick links to targeted practice.

Use this syllabus as your source of truth for AIF-C01. Work through each domain in order and drill targeted sets after every task.

What’s covered

Domain 1: Fundamentals of AI and ML (20%)

Practice this topic →

Task 1.1 - Explain basic AI concepts and terminologies

  • Differentiate artificial intelligence (AI), machine learning (ML), and deep learning, and identify when each is most applicable.
  • Distinguish supervised learning, unsupervised learning, and reinforcement learning using simple real-world examples.
  • Compare classification, regression, and clustering problems and choose the correct framing for a described business requirement.
  • Define common ML dataset terms (feature, label, training example, ground truth) and explain why data quality matters.
  • Differentiate model parameters and hyperparameters and describe how hyperparameter tuning affects model performance.
  • Explain underfitting and overfitting and relate them to bias/variance trade-offs at a high level.
  • Describe training, validation, and test splits and identify common causes of data leakage.
  • Select appropriate evaluation metrics (accuracy, precision, recall, F1) based on the problem type and error costs.
  • Interpret a confusion matrix and recognize the effect of class imbalance on model evaluation (a worked sketch follows this list).
  • Differentiate training from inference and relate inference latency and throughput to real-time and batch use cases.
  • Explain what an embedding represents and how vector similarity search enables semantic retrieval.
  • Identify common AI subfields (NLP, computer vision, speech) and the kinds of inputs/outputs each typically uses.
  • Differentiate generative models and discriminative models at a conceptual level and recognize example outputs of each.
  • Explain what a foundation model is and why organizations use foundation models as a base for multiple downstream tasks.
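
A minimal sketch of the metrics above, computed directly from binary confusion-matrix counts; the counts are made-up illustration values chosen to show how class imbalance inflates accuracy:

```python
# Derive accuracy, precision, recall, and F1 from confusion-matrix counts.
tp, fp, fn, tn = 80, 10, 40, 870   # imbalanced: 120 positives vs 880 negatives

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)   # of predicted positives, how many were correct
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.95 precision=0.89 recall=0.67 f1=0.76
# 95% accuracy while missing a third of all positives: imbalance at work.
```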

Task 1.2 - Identify practical use cases for AI

  • Identify when AI/ML is a good fit versus when a rules-based or traditional analytics solution is more appropriate.
  • Select AI/ML approaches that support forecasting and time-series prediction use cases.
  • Recognize anomaly detection scenarios (operations, fraud, security) and the signals typically required.
  • Identify recommendation and personalization use cases and the data types commonly used to power them.
  • Recognize document processing use cases (OCR, forms extraction, summarization) and the typical workflow from ingest to output.
  • Identify NLP use cases (sentiment analysis, entity extraction, classification) and the expected outputs.
  • Identify computer vision use cases (image classification, object detection, face analysis) and the required input media.
  • Identify speech and language use cases (speech-to-text, translation, text-to-speech) and common constraints (latency, accuracy, accents).
  • Identify conversational AI use cases and recognize when chatbots should route to human agents.
  • Map common use cases to AWS AI services (for example: Amazon Rekognition, Amazon Textract, Amazon Comprehend, Amazon Transcribe, Amazon Translate, Amazon Polly, Amazon Lex, Amazon Kendra, Amazon Personalize, Amazon Forecast).
  • Choose between using a managed, pre-trained AWS AI service and building a custom ML model based on required customization and operational effort (a service-call sketch follows this list).
  • Identify business scenarios where generative AI adds value (summarization, content creation, Q&A, code assistance) and where it is risky.
  • Recognize human-in-the-loop patterns and acceptance criteria (precision/recall targets, review thresholds) for higher-impact decisions.
  • Identify key data considerations for AI use cases, including privacy/PII, bias, representativeness, and data access constraints.
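
To make the managed-service option concrete, here is a minimal sketch of calling Amazon Comprehend for sentiment analysis via boto3; it assumes AWS credentials are already configured, and the region shown is only an example:

```python
import boto3

# One API call replaces training, hosting, and maintaining a custom model.
comprehend = boto3.client("comprehend", region_name="us-east-1")

resp = comprehend.detect_sentiment(
    Text="The checkout flow was fast, but support never replied.",
    LanguageCode="en",
)
print(resp["Sentiment"])       # e.g. "MIXED"
print(resp["SentimentScore"])  # confidence per sentiment label
```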

Task 1.3 - Describe the ML development lifecycle

  • Translate a business problem into an ML problem statement and define measurable success criteria and baseline performance.
  • Identify common data sources for ML and describe the role of data labeling and ground truth creation.
  • Describe typical data preparation steps (cleaning, normalization, deduplication) and why these steps affect model outcomes.
  • Explain feature engineering and feature selection concepts at a high level and when they are most important.
  • Describe model selection at a high level, including choosing a simple baseline before using more complex models.
  • Explain the purpose of hyperparameter tuning and how to avoid overfitting while tuning.
  • Describe evaluation practices for ML models, including holdout testing and avoiding data leakage (a split sketch follows this list).
  • Describe deployment options for ML inference (batch vs real-time) and how models integrate into applications.
  • Identify common production monitoring needs for ML (data drift, concept drift, performance regression, latency).
  • Describe retraining triggers and feedback loops and why model performance can degrade over time.
  • Explain MLOps at a high level, including automation, versioning, repeatable pipelines, and model governance.
  • Identify AWS components commonly used in an ML lifecycle (for example: Amazon S3 for data/artifacts and Amazon SageMaker for training and inference).
  • Describe reproducibility concepts, including tracking datasets, code, parameters, and model versions across experiments.
  • Recognize responsible ML lifecycle needs (documentation, approvals, audits) before promoting a model to production.
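
A minimal sketch of leakage-safe data splitting, assuming scikit-learn: the scaler is fit only on training data, because fitting preprocessing on the full dataset before splitting leaks test-set statistics into training:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, random_state=42)

# Carve out the held-out test set first, then split train vs validation.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2,
                                                random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25,
                                                  random_state=42)

scaler = StandardScaler().fit(X_train)  # fit on training data only
X_train, X_val, X_test = (scaler.transform(s) for s in (X_train, X_val, X_test))
```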

Domain 2: Fundamentals of Generative AI (24%)

Practice this topic →

Task 2.1 - Explain the basic concepts of generative AI

  • Define generative AI and describe how it differs from traditional predictive ML in terms of outputs and use cases.
  • Identify common generative model families (large language models, diffusion models, GANs) and the types of content they generate.
  • Explain the transformer concept at a high level and why attention mechanisms are useful for language tasks.
  • Define tokens, tokenization, and context window and relate them to what a model can "see" in a single request.
  • Explain prompt and completion concepts and identify common generation settings (temperature, top-k, top-p) and their effects (a sampling sketch follows this list).
  • Define embeddings and explain how embeddings enable semantic search and retrieval for generative AI applications.
  • Explain the concept of foundation models and why foundation models are adaptable to many downstream tasks.
  • Describe retrieval-augmented generation (RAG) at a high level and why it can reduce hallucinations and improve grounding.
  • Differentiate prompt engineering, RAG, and fine-tuning as methods to improve model usefulness for a specific task.
  • Explain hallucinations in generative AI and identify mitigation strategies (grounding, constraints, verification, human review).
  • Recognize generative AI safety risks (toxic content, harmful instructions) and the need for moderation and guardrails.
  • Explain prompt injection and jailbreak concepts at a high level and why untrusted input requires defensive design.
  • Identify key drivers of generative AI cost and performance (token volume, context size, model size, latency).
  • Describe model selection trade-offs (capability, speed, cost, context length, modality) when choosing a model for a workload.
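
A toy sketch of how temperature and top-p reshape a next-token distribution; the four-word vocabulary and logits are made up, and real models apply the same idea over tens of thousands of tokens:

```python
import math, random

vocab = ["cat", "dog", "car", "cloud"]
logits = [2.0, 1.5, 0.3, -1.0]

def sample(logits, temperature=1.0, top_p=1.0):
    scaled = [l / temperature for l in logits]   # temperature < 1 sharpens
    exps = [math.exp(s - max(scaled)) for s in scaled]
    probs = [e / sum(exps) for e in exps]
    # Nucleus (top-p) filter: smallest token set covering top_p of the mass.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    total = sum(probs[i] for i in keep)
    weights = [probs[i] / total for i in keep]
    return random.choices([vocab[i] for i in keep], weights)[0]

print(sample(logits, temperature=0.2, top_p=0.9))  # only "cat" survives the filter
print(sample(logits, temperature=1.5, top_p=1.0))  # noticeably more varied
```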

Task 2.2 - Understand the capabilities and limitations of generative AI for solving business problems

  • Identify business problems that generative AI commonly solves well (summarization, Q&A, extraction, drafting, ideation).
  • Recognize common limitations of generative AI (hallucinations, context limits, brittle prompts) and how they affect solution design.
  • Determine when human review or human-in-the-loop workflows are required to manage risk and quality.
  • Identify bias and fairness risks in generative AI outputs and the need for evaluation across user groups.
  • Recognize privacy and confidentiality concerns when sending prompts and documents to models, including handling of PII.
  • Recognize intellectual property and content provenance considerations, including licensed data and ownership of outputs.
  • Explain latency and cost trade-offs in production gen AI, including when caching and smaller models are appropriate.
  • Select appropriate generation settings (for example, temperature) based on whether the task requires creativity or determinism.
  • Determine when to use RAG versus fine-tuning based on knowledge freshness, proprietary data, and required style consistency.
  • Recognize multimodal use cases (text + image + audio) and constraints that affect model choice and evaluation.
  • Explain why evaluation and monitoring are required for generative AI systems, including regression testing for prompts and models.
  • Identify fallback and fail-safe strategies for gen AI outputs (refusal, escalation, "I don't know", human handoff).
  • Recognize the role of guardrails and content filtering for policy compliance (safety, brand voice, privacy constraints).
  • Identify organizational readiness requirements for gen AI adoption (policy, training, risk ownership, and change management).

Task 2.3 - Describe AWS infrastructure and technologies for building generative AI applications

  • Describe Amazon Bedrock at a high level and how it provides managed access to foundation models for inference (an invocation sketch follows this list).
  • Identify common Amazon Bedrock capabilities used in applications (model selection, prompt management patterns, guardrails) at a conceptual level.
  • Describe Amazon SageMaker at a high level as a platform for building, training, tuning, and deploying ML models.
  • Explain the role of SageMaker JumpStart and pre-built models as a starting point for ML and foundation model workloads.
  • Describe the purpose of vector databases and identify common AWS options for vector search (for example: Amazon OpenSearch Service and Aurora PostgreSQL with pgvector).
  • Describe typical storage layers for generative AI applications (Amazon S3 for documents and artifacts, databases for metadata) and why they are used.
  • Identify common orchestration and automation services for gen AI pipelines (AWS Lambda, AWS Step Functions, AWS Glue) and what each is used for.
  • Describe deployment patterns for gen AI APIs (Amazon API Gateway + AWS Lambda, containers on Amazon ECS/Fargate or Amazon EKS) and basic scaling considerations.
  • Identify observability components for AI apps (Amazon CloudWatch, AWS X-Ray) and audit logging needs (AWS CloudTrail) at a high level.
  • Describe security building blocks for gen AI solutions (IAM, AWS KMS, AWS Secrets Manager, VPC endpoints/PrivateLink) and why they are used.
  • Identify responsible AI tooling patterns on AWS (for example: content filters/guardrails and bias/explainability tooling) and where they fit in the lifecycle.
  • Describe Amazon Q at a high level and recognize common assistant use cases (business Q&A, developer assistance) and constraints.
  • Explain high-level cost management strategies for AI workloads (token usage awareness, right-sizing, caching, and setting budgets).
  • Recognize reliability considerations for gen AI applications (timeouts, retries, fallbacks, multi-Region design) and their trade-offs.
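
A minimal sketch of invoking a foundation model through the Amazon Bedrock Converse API with boto3; the model ID is a placeholder for one your account has been granted access to, and credentials/region are assumed to be configured:

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder model ID
    messages=[{"role": "user",
               "content": [{"text": "Summarize RAG in two sentences."}]}],
    inferenceConfig={"temperature": 0.2, "maxTokens": 200},
)
print(resp["output"]["message"]["content"][0]["text"])
print(resp["usage"])  # input/output token counts -- the main cost driver
```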

Domain 3: Applications of Foundation Models (28%)

Practice this topic →

Task 3.1 - Describe design considerations for applications that use foundation models

  • Select an appropriate foundation model based on modality, context length, latency, cost, and licensing/commercial terms at a high level.
  • Compare using a managed foundation model service (for example, Amazon Bedrock) versus self-hosting a model, including operational trade-offs.
  • Design a basic gen AI application architecture including prompt templates, retrieval (optional), and tool/function integration.
  • Describe document ingestion and chunking strategies for RAG and how chunking affects retrieval quality and cost (a chunking sketch follows this list).
  • Select an embedding strategy (embedding model choice, similarity function) and explain how it impacts semantic search results.
  • Describe conversation memory concepts and identify options for storing session state and chat history.
  • Explain grounding and citation patterns for RAG-based answers and why source attribution improves trust and auditability.
  • Describe caching patterns for gen AI (prompt/result caching, embedding caching) and their cost/latency impact.
  • Design for rate limits, throttling, and retries, including backpressure patterns to protect downstream model endpoints.
  • Identify safe tool-use design practices (allowlists, scoped permissions, validation) when models are used to invoke actions.
  • Explain trade-offs between streaming and non-streaming responses for user experience and system design.
  • Describe multi-tenant isolation considerations for gen AI, including data segregation and least-privilege access boundaries.
  • Identify observability requirements for gen AI systems (metrics, traces, error rates) while minimizing sensitive prompt/response exposure.
  • Design failure handling for AI features (fallback answers, escalation paths, safe defaults) to maintain user trust and continuity.
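
A minimal sketch of fixed-size chunking with overlap for RAG ingestion; sizes are character counts for simplicity, and production pipelines often split on sentence or section boundaries instead:

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # overlap preserves context across boundaries
    return chunks

# Smaller chunks retrieve more precisely but multiply embedding cost;
# larger chunks carry more context but dilute similarity scores.
pieces = chunk("your ingested document text here " * 200)
```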

Task 3.2 - Choose effective prompt engineering techniques

  • Write clear prompts that specify role, goal, constraints, and success criteria to reduce ambiguity in model responses.
  • Use structure (headings, bullet lists, delimiters) to separate instructions, context, and user input to improve reliability.
  • Apply few-shot prompting (examples) to guide formatting, style, and edge-case handling.
  • Specify output schemas (for example, valid JSON fields) and validation rules to produce structured and machine-readable outputs (a validation sketch follows this list).
  • Use decomposition techniques (break down a complex task into steps or sub-questions) to improve response quality.
  • Inject retrieved context for RAG with clear grounding instructions (use sources, avoid unsupported claims).
  • Separate system-level instructions from user content and recognize why mixing untrusted input with instructions increases risk.
  • Use prompt chaining or iterative refinement (draft → critique → improve) to increase consistency and reduce omissions.
  • Tune generation settings (temperature, top-p, max tokens) to balance determinism, creativity, and response length.
  • Handle long contexts by summarizing, chunking, and selecting the most relevant information rather than pasting entire corpora.
  • Reduce hallucinations by requesting citations, constraining the answer space, and instructing the model to ask clarifying questions.
  • Apply prompt-injection defenses in prompt design (treat user content as data, not instructions; refuse unsafe tool actions).
  • Use evaluation-driven prompt iteration by testing prompts against a representative set of inputs and tracking results over time.
  • Manage prompts as versioned assets (prompt libraries, change control) to support repeatability and safe rollbacks.
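
A minimal sketch of a structured prompt that separates instructions from untrusted input with delimiters and validates the reply as JSON before use; `call_model` is a hypothetical stand-in for whatever inference call the application makes:

```python
import json

PROMPT = """You are a support-ticket triager.
Return ONLY valid JSON with keys: "category" (string) and "urgency" (1-5).

<ticket>
{ticket}
</ticket>"""

def triage(ticket: str, call_model) -> dict:
    raw = call_model(PROMPT.format(ticket=ticket))
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict) or not {"category", "urgency"} <= data.keys():
        raise ValueError(f"unexpected model output: {raw!r}")
    return data
```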

Task 3.3 - Describe the training and fine-tuning process for foundation models

  • Differentiate pretraining, fine-tuning, and instruction tuning at a conceptual level and describe what each aims to improve.
  • Identify dataset requirements for supervised fine-tuning (quality, representativeness, labeling) and the impact of poor data.
  • Describe parameter-efficient fine-tuning approaches (for example, LoRA/adapters) at a high level and why they reduce cost (a parameter-count sketch follows this list).
  • Select between prompt engineering, RAG, and fine-tuning based on whether the need is knowledge, style, or task-specific behavior.
  • Explain the risk of overfitting in fine-tuning and the need for validation sets and early stopping concepts.
  • Identify compute and cost considerations for training and fine-tuning (GPU/accelerator needs, time-to-train, experimentation).
  • Recognize training data governance needs, including removing sensitive data and ensuring proper licensing and permissions.
  • Describe customization options on AWS at a high level (for example, model customization via Amazon Bedrock and training jobs in Amazon SageMaker).
  • Describe the role of embeddings models and how customizing embeddings can affect retrieval quality in RAG solutions.
  • Explain the idea of alignment (for example, reinforcement learning from human feedback) at a high level and why it matters.
  • Describe model deployment considerations after fine-tuning, including versioning, rollback, and controlled rollout.
  • Identify monitoring needs after customization, including regression testing and detecting shifts in output quality over time.
  • Recognize the value of experiment tracking and reproducibility (datasets, prompts, training code, parameters) during customization.
  • Describe continuous improvement for foundation models using feedback data while maintaining governance and change control.
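
A toy illustration of why low-rank adaptation (LoRA) is parameter-efficient: instead of updating a full d x d weight matrix, it trains two small matrices B (d x r) and A (r x d) with r much smaller than d; the numbers below are illustrative:

```python
d, r = 4096, 8                   # hidden size, adapter rank

full_update = d * d              # trainable params for a full-weight update
lora_update = d * r + r * d      # trainable params for B and A

print(f"full: {full_update:,}")  # 16,777,216
print(f"lora: {lora_update:,}")  # 65,536 -- about 0.4% of the full update
# Effective weight at inference: W' = W + B @ A, with W kept frozen.
```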

Task 3.4 - Describe methods to evaluate foundation model performance

  • Define evaluation dimensions for foundation models, including quality, groundedness, safety, latency, and cost.
  • Create a representative evaluation set (prompts + expected outputs or rubrics) and explain why coverage matters (a harness sketch follows this list).
  • Describe human evaluation using scoring rubrics and why consistency across reviewers improves reliability.
  • Identify when automated metrics (for example, ROUGE/BLEU or embedding similarity) are appropriate and when they are misleading.
  • Evaluate hallucinations and factuality risks and identify techniques for measuring groundedness and correctness.
  • Evaluate safety risks (toxicity, self-harm, policy violations) and define acceptable thresholds and refusal behavior.
  • Evaluate bias and fairness by comparing behavior across demographic or user groups and documenting disparities.
  • Evaluate RAG systems end-to-end, including retrieval accuracy, chunk quality, and the impact of sources on final answers.
  • Run A/B tests for prompts, models, or retrieval settings and interpret results to select the best-performing configuration.
  • Describe production monitoring for foundation model applications, including user feedback signals and quality regression detection.
  • Explain red teaming and adversarial testing at a high level and why it is essential for high-risk applications.
  • Evaluate multi-turn conversations for consistency, instruction adherence, and appropriate use of conversation memory.
  • Recognize evaluation needs for multimodal models (text + image/audio) and modality-specific quality and safety checks.
  • Define acceptance criteria and rollout/rollback triggers based on evaluation outcomes and business risk tolerance.
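
A minimal sketch of an evaluation set and harness: each case pairs a prompt with required keywords and the harness reports a pass rate. Keyword checks are crude, so rubric-based or model-graded scoring is common beyond smoke tests; `call_model` is again a hypothetical stub:

```python
EVAL_SET = [
    {"prompt": "What does RAG stand for?",
     "must_contain": ["retrieval", "generation"]},
    {"prompt": "Name one way to reduce hallucinations.",
     "must_contain": ["ground"]},
]

def evaluate(call_model) -> float:
    passed = 0
    for case in EVAL_SET:
        answer = call_model(case["prompt"]).lower()
        if all(kw in answer for kw in case["must_contain"]):
            passed += 1
    return passed / len(EVAL_SET)  # track across prompt/model versions
```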

Domain 4: Guidelines for Responsible AI (14%)

Practice this topic →

Task 4.1 - Explain the development of AI systems that are responsible

  • Define responsible AI principles (fairness, accountability, transparency, privacy, safety) and how they guide system design.
  • Identify common sources of bias in AI systems (data collection, labeling, sampling) and high-level mitigation strategies.
  • Explain data minimization and privacy-by-design concepts and how they reduce risk when building AI systems.
  • Identify when human-in-the-loop review is appropriate and how escalation paths reduce harm for high-impact decisions.
  • Perform a high-level risk assessment by defining intended use, out-of-scope use, and failure modes for an AI feature.
  • Describe content safety controls (policy-based filtering, refusal behavior) and why they are necessary for generative AI.
  • Recognize the value of documentation (model cards, datasheets for datasets) for governance and stakeholder trust.
  • Design feedback and incident response loops for AI harms (reporting, triage, remediation, post-incident learning).
  • Describe monitoring for misuse and harmful outputs, including abuse detection and policy violation signals.
  • Recognize responsible AI risk management frameworks (for example, NIST AI RMF) and why structured controls matter.
  • Identify AWS services and features that support responsible AI patterns (for example, Amazon Bedrock Guardrails and Amazon SageMaker Clarify).
  • Explain consent, data rights, and appropriate use boundaries for training data and user-provided inputs.
  • Recognize environmental and cost considerations of AI systems and why efficiency can be a responsible design goal.
  • Design inclusive and accessible AI experiences, including clear user communication and accommodations for diverse users.

Task 4.2 - Recognize the importance of transparent and explainable models

  • Differentiate transparency, explainability, and interpretability and identify why each matters to different stakeholders.
  • Identify scenarios where explainability is required (regulated industries, high-impact decisions) and what evidence may be needed.
  • Describe feature importance at a high level and how it can be used to explain model predictions (a permutation-importance sketch follows this list).
  • Recognize common explainability techniques (for example, LIME/SHAP) and what kinds of explanations they provide.
  • Explain how to communicate model limitations, uncertainty, and appropriate use to users to build trust and reduce misuse.
  • Describe citation and source attribution patterns for generative AI (especially RAG) and how they support transparency.
  • Design user-facing messaging that clarifies whether outputs are generated, what data sources were used, and how to verify results.
  • Recognize the role of audit logs and traceability for investigating model behavior and supporting compliance requirements.
  • Explain the trade-off between simple, interpretable models and complex, higher-performing models and how to choose appropriately.
  • Explain why some explainability techniques can be misleading if they do not faithfully represent the model’s true behavior.
  • Recognize explainability risks such as leaking sensitive information through explanations and the need for safe disclosure.
  • Identify AWS tooling that supports explainability and bias analysis (for example, Amazon SageMaker Clarify) and what it can provide.
  • Describe decision thresholds and decision boundaries at a high level and how they relate to transparency in classification tasks.
  • Recognize that AI systems often combine multiple models and components, and explainability may require tracing the full pipeline.
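
A minimal sketch of permutation feature importance, assuming scikit-learn: shuffle one feature at a time and measure how much held-out score drops; larger drops mean the model leaned on that feature more:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=10,
                                random_state=0)

# Top five features by mean importance (score drop when shuffled).
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda p: -p[1])[:5]:
    print(f"{name}: {score:.3f}")
```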

Domain 5: Security, Compliance, and Governance for AI Solutions (14%)

Practice this topic →

Task 5.1 - Explain methods to secure AI systems

  • Apply least-privilege IAM design for AI systems, including scoping permissions for data access, model invocation, and administration.
  • Describe encryption in transit and at rest for AI data and artifacts using TLS and AWS KMS-managed keys.
  • Explain network isolation concepts for AI workloads, including private subnets, security groups, and private connectivity (VPC endpoints/PrivateLink).
  • Protect secrets used by AI applications (API keys, database credentials) using AWS Secrets Manager and secure rotation practices.
  • Secure retrieval data stores used for RAG with strong access controls, encryption, and appropriate data partitioning.
  • Recognize prompt injection and jailbreak risks and apply high-level mitigations (input boundaries, guardrails, tool allowlists).
  • Implement protections against data exfiltration through model outputs, including output filtering and PII redaction patterns (a redaction sketch follows this list).
  • Recognize threats to training pipelines such as data poisoning and how governance, validation, and provenance checks reduce risk.
  • Describe model artifact protection concepts (integrity, signing, secure storage) and why tampered artifacts are a risk.
  • Explain audit logging needs for AI systems, including capturing access events with AWS CloudTrail and controlling log access.
  • Identify security monitoring services and patterns (Amazon GuardDuty, AWS Security Hub, Amazon CloudWatch alarms) to detect threats.
  • Protect AI-facing APIs using controls such as AWS WAF, throttling/rate limiting, and authentication/authorization mechanisms.
  • Apply secure SDLC concepts to AI applications, including threat modeling, dependency hygiene, and secure CI/CD practices.
  • Design multi-tenant isolation for AI workloads, including data boundaries, per-tenant encryption strategies, and access segmentation.
  • Recognize adversarial attacks (for example, adversarial examples and evasion) and why robustness testing is part of security.
  • Describe incident response basics for AI systems, including containment, rotation of credentials/keys, and post-incident review.
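
A minimal sketch of redacting common PII patterns from model output before it is returned or logged; regexes like these catch only obvious formats, and managed options (for example, Amazon Comprehend PII detection or Bedrock guardrails) cover far more entity types:

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane@example.com or 555-867-5309."))
# -> Reach me at [EMAIL] or [PHONE].
```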

Task 5.2 - Recognize governance and compliance regulations for AI systems

  • Define AI governance and explain why organizations implement oversight for data, models, and AI-enabled decisions.
  • Describe data governance concepts relevant to AI (classification, lineage, retention, access) and why they matter for compliance.
  • Recognize privacy regulations (for example, GDPR and CCPA) and identify how they influence data handling for AI workloads.
  • Recognize industry regulations (for example, HIPAA or PCI DSS) and why regulated workloads require additional controls and auditing.
  • Recognize emerging AI regulations and frameworks (for example, the EU AI Act) at a high level and why risk classification matters.
  • Identify AWS Artifact as a source for compliance reports and agreements and explain how it supports audits.
  • Describe auditability requirements, including capturing access logs, configuration history, and model version changes.
  • Describe change management for AI systems, including approvals for model updates, prompt changes, and deployment rollouts.
  • Explain how AWS Organizations, multi-account strategy, and tagging support governance, separation of duties, and cost allocation.
  • Recognize third-party model and provider considerations, including contracts, licensing terms, and data usage restrictions.
  • Define policies for prompt and response logging that balance audit needs with privacy requirements and data minimization.
  • Describe acceptable use policies for generative AI, including prohibited content categories and organizational guardrails.
  • Identify documentation artifacts that support governance (model cards, risk assessments, evaluation results, monitoring dashboards).
  • Recognize security and compliance frameworks (for example, ISO 27001, SOC reports, NIST) and the concept of control mapping.
  • Describe data residency and cross-border transfer considerations and how region selection can support compliance requirements.
  • Explain retention and deletion workflows (including right-to-be-forgotten concepts) and how they affect AI datasets and logs.

Tip: For AIF-C01, learn the concept first, then drill until you can explain the trade-off in one sentence (for example: prompting vs RAG vs fine-tuning).