AIF-C01 — AWS Certified AI Practitioner Quick Review

Last revised: June 29, 2026

Quick Review for AWS Certified AI Practitioner (AIF-C01): high-yield AI, ML, generative AI, AWS service selection, security, evaluation, and practice focus.

Quick Review purpose

This Quick Review is for candidates preparing for the AWS Certified AI Practitioner (AIF-C01) exam. Use it to refresh the concepts that are easiest to confuse before moving into IT Mastery practice: original practice questions, topic drills, mock exams, and detailed explanations.

The AIF-C01 exam is foundational. Expect questions that test whether you can:

Explain core AI, ML, and generative AI concepts.
Match business use cases to appropriate AWS services.
Recognize responsible AI, security, privacy, and governance concerns.
Understand high-level model lifecycle, data preparation, evaluation, and monitoring.
Choose between managed AI services, Amazon Bedrock, and Amazon SageMaker based on scenario clues.

This page is IT Mastery review support and is not affiliated with AWS.

Exam mindset: how to choose the best answer

Many AIF-C01 questions are scenario-based. Do not answer only from memorized service names. First identify the task, then the level of customization, then the operational responsibility.

If the scenario says…	Think first…	Common trap
“Extract text, forms, or tables from scanned documents”	Amazon Textract	Choosing Amazon Rekognition just because an image is involved
“Analyze sentiment, entities, key phrases, or language in text”	Amazon Comprehend	Choosing Amazon Bedrock when a managed NLP service is enough
“Convert speech to text”	Amazon Transcribe	Confusing with Amazon Polly
“Convert text to lifelike speech”	Amazon Polly	Confusing with Amazon Transcribe
“Translate text between languages”	Amazon Translate	Confusing translation with summarization
“Build a conversational bot with intents and slots”	Amazon Lex	Choosing a general LLM when the exam emphasizes intent-based bot design
“Use foundation models through an API without managing infrastructure”	Amazon Bedrock	Choosing Amazon SageMaker by default
“Train, tune, build, deploy, or monitor custom ML models”	Amazon SageMaker	Choosing Bedrock when the task is custom ML lifecycle work
“No-code or low-code ML predictions for business users”	Amazon SageMaker Canvas	Choosing full SageMaker Studio-style development
“Enterprise generative AI assistant over business data”	Amazon Q Business or Bedrock with retrieval	Treating all chatbots as Amazon Lex
“Developer coding assistant”	Amazon Q Developer	Confusing with Amazon Q Business
“Search internal enterprise content”	Amazon Kendra or retrieval architecture	Confusing keyword search, semantic search, and generative answering

High-yield concepts to know cold

AI, ML, deep learning, and generative AI

Concept	Quick definition	What AIF-C01 may test
Artificial intelligence	Broad field of systems performing tasks associated with human intelligence	AI is the umbrella term
Machine learning	Systems learn patterns from data rather than being explicitly programmed for every rule	Training data, features, labels, evaluation
Deep learning	ML using neural networks with many layers	Often used for images, speech, NLP, and foundation models
Generative AI	AI that creates new content such as text, images, code, or summaries	Prompts, tokens, foundation models, hallucinations, responsible use
Foundation model	Large model trained on broad data and adaptable to many tasks	Often accessed through Amazon Bedrock
Large language model	Foundation model focused on language tasks	Summarization, Q&A, generation, reasoning-like responses
Embedding	Numeric vector representation of meaning	Search, recommendations, similarity, RAG
Token	Unit of text processed by a model	Cost, latency, context window, output length
Inference	Using a trained model to make predictions or generate output	Production use, latency, throughput, cost
Training	Learning model parameters from data	Requires data, compute, evaluation, iteration

Supervised, unsupervised, and reinforcement learning

Learning type	Uses	Examples	Watch for
Supervised learning	Learn from labeled examples	Classification, regression	Needs labeled training data
Unsupervised learning	Find structure without labels	Clustering, anomaly patterns, dimensionality reduction	No “correct label” in training data
Reinforcement learning	Learn actions using rewards or penalties	Optimization, game-like decision environments, agent policies	Not the default answer for ordinary prediction
Semi-supervised learning	Mix of small labeled data plus larger unlabeled data	Reducing labeling effort	Useful when labels are expensive
Self-supervised learning	Model learns from data structure itself	Many foundation model pretraining approaches	Often foundational for generative AI

Classification, regression, clustering, and anomaly detection

Task	Output	Example	Best metric clue
Classification	Category/class	Fraud vs not fraud, image label, sentiment class	Accuracy, precision, recall, F1, ROC-AUC
Regression	Numeric value	Price, demand, wait time	MAE, RMSE, R-squared
Clustering	Groups	Customer segments	Silhouette score, business usefulness
Anomaly detection	Unusual events	Unusual transactions, abnormal sensor readings	False positives vs missed anomalies
Recommendation	Ranked items	Products, media, content	Click-through, conversion, ranking metrics

Core AWS service selection review

Managed AI services vs Amazon Bedrock vs Amazon SageMaker

Choice	Use when…	Candidate mistake to avoid
Managed AWS AI service	The task is standard and specific: OCR, speech, translation, sentiment, image labels, chatbot intents	Over-engineering with custom ML
Amazon Bedrock	You need foundation models, generative AI, embeddings, RAG, agents, or guardrails through managed APIs	Treating Bedrock as a traditional custom model training platform
Amazon SageMaker	You need to prepare data, train, tune, deploy, monitor, or manage custom ML models	Choosing SageMaker when a simpler managed AI API satisfies the use case
Amazon SageMaker Canvas	Business users need no-code or low-code predictions	Assuming every SageMaker scenario requires data scientists writing code
Amazon Q Business	Organization wants a generative AI assistant connected to company data and business apps	Confusing enterprise assistant use cases with Amazon Lex
Amazon Q Developer	Developers want coding, AWS guidance, or software development assistance	Confusing with business-user knowledge assistant use cases
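
The ordering of checks in the table above can be written down as a small decision rule. This is only a study mnemonic, not an AWS tool: the function name `first_guess` and its flags are hypothetical, and real scenarios often need more nuance than four booleans.

```python
def first_guess(*, standard_task=False, generative_ai=False,
                no_code_users=False, custom_ml=False):
    """Return a first-guess service family from scenario clues (study aid only)."""
    # A standard, specific task (OCR, speech, translation, sentiment) comes first:
    # the simplest managed AI service that fits is usually the intended answer.
    if standard_task:
        return "Managed AWS AI service"
    # Foundation models, embeddings, RAG, agents, or guardrails via managed APIs.
    if generative_ai:
        return "Amazon Bedrock"
    # Business users who want predictions without writing code.
    if no_code_users:
        return "Amazon SageMaker Canvas"
    # Preparing data, training, tuning, deploying, or monitoring custom models.
    if custom_ml:
        return "Amazon SageMaker"
    return "Unclear: re-read the scenario for clues"


print(first_guess(standard_task=True))
print(first_guess(generative_ai=True))
print(first_guess(custom_ml=True))
```

The order of the checks is the point: the simpler option is considered before the more customizable one.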

AWS AI and ML services at a glance

Service	Primary use	Fast exam cue
Amazon Bedrock	Build generative AI applications with foundation models	LLMs, embeddings, RAG, agents, guardrails
Amazon SageMaker	Build, train, tune, deploy, and monitor ML models	Full ML lifecycle
Amazon SageMaker Canvas	No-code ML for business analysts	Predictions without writing code
Amazon SageMaker Ground Truth	Data labeling workflows	Human labeling, annotation
Amazon SageMaker Clarify	Bias detection and model explainability	Fairness, explainability
Amazon SageMaker Model Monitor	Monitor deployed model quality and drift	Production ML monitoring
Amazon Textract	Extract printed/handwritten text, forms, and tables from documents	OCR plus document structure
Amazon Comprehend	NLP for text insights	Sentiment, entities, key phrases, language
Amazon Transcribe	Speech to text	Audio becomes text
Amazon Polly	Text to speech	Text becomes audio
Amazon Translate	Language translation	Translate between languages
Amazon Lex	Conversational interfaces using voice/text	Intents, slots, chatbot flow
Amazon Rekognition	Image and video analysis	Objects, scenes, faces, moderation labels
Amazon Personalize	Personalized recommendations	User-item recommendations
Amazon Kendra	Intelligent enterprise search	Search across internal documents
Amazon OpenSearch Service	Search, analytics, and vector search patterns	Semantic search, vector retrieval
Amazon Q Business	Generative AI assistant for enterprise knowledge	Business assistant over company data
Amazon Q Developer	Generative AI assistant for developers	Code, AWS development help
AWS Glue	ETL and Data Catalog	Prepare/catalog data
Amazon S3	Object storage for data lakes, datasets, artifacts	Durable storage foundation
AWS Lake Formation	Data lake governance	Permissions and governance for data lakes
Amazon Athena	Query S3 data using SQL	Serverless interactive query
Amazon Redshift	Data warehouse analytics	Large-scale structured analytics
Amazon QuickSight	Business intelligence dashboards	Visualize and share insights
Amazon Macie	Discover sensitive data in S3	PII/sensitive data detection
AWS IAM	Identity and access control	Least privilege
AWS KMS	Encryption key management	Protect data at rest
AWS CloudTrail	API activity audit logs	Who did what, when
Amazon CloudWatch	Metrics, logs, alarms	Operational monitoring

ML lifecycle quick review

AIF-C01 usually tests lifecycle understanding at a conceptual level: what happens before, during, and after model development.

    flowchart LR
      A[Define business problem] --> B[Collect and govern data]
      B --> C[Prepare, clean, label, and split data]
      C --> D[Train or select model]
      D --> E[Evaluate against metrics]
      E --> F[Deploy for inference]
      F --> G[Monitor quality, drift, latency, and cost]
      G --> H[Retrain, tune, or improve]
      H --> C

Stage	Know this	Common trap
Define problem	Convert business goal into ML task and success metric	Starting with a model before defining success
Collect data	Data must be relevant, permitted, representative, and high quality	Assuming more data always fixes poor data quality
Label data	Supervised learning needs correct labels	Ignoring label noise and inconsistent annotation
Prepare data	Clean, normalize, transform, handle missing values, remove duplicates	Accidentally introducing data leakage
Split data	Use training, validation, and test data appropriately	Evaluating on the same data used to train
Train/select model	Choose model based on task, data, cost, latency, and explainability	Picking the largest model by default
Evaluate	Use metrics aligned with business risk	Relying on accuracy for imbalanced data
Deploy	Make model available for inference	Ignoring latency, scale, and security
Monitor	Watch for drift, degraded quality, bias, errors, and cost	Treating deployment as the finish line
Improve	Tune, retrain, add data, change prompts, or redesign	Changing the model without measuring impact

Data concepts that commonly appear

Data types and storage patterns

Data concept	Meaning	AWS-related clue
Structured data	Rows and columns with schema	Databases, warehouses, SQL analytics
Semi-structured data	Flexible structure such as JSON, logs, XML	Data lakes, Glue, Athena
Unstructured data	Text, images, audio, video, documents	S3, Textract, Comprehend, Rekognition, Transcribe
Data lake	Central storage for raw and processed data	Amazon S3 plus governance/catalog tools
Data warehouse	Optimized analytics on structured data	Amazon Redshift
Data catalog	Metadata about data assets	AWS Glue Data Catalog
Feature	Input variable used by a model	Customer age, text embedding, transaction amount
Label	Correct answer used in supervised learning	Fraud/not fraud, category, price
Feature engineering	Transforming data into useful model inputs	Scaling, encoding, extracting features
Data leakage	Training uses information unavailable at prediction time	Inflated test results, poor real-world performance
Data drift	Input data distribution changes over time	Monitoring and retraining needed
Concept drift	Relationship between inputs and target changes	Model may become stale even if pipeline works
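
To make data drift concrete, the sketch below compares a recent batch of one feature against a baseline batch and reports how far the mean has moved, measured in baseline standard deviations. The sample values and the threshold of 2.0 are hypothetical, and this is a deliberately simple heuristic; managed monitoring tools use richer statistics.

```python
from statistics import mean, stdev


def mean_shift_in_std_units(baseline, recent):
    """How far the recent mean moved, in units of the baseline standard deviation."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; cannot scale the shift")
    return abs(mean(recent) - mean(baseline)) / spread


# Hypothetical transaction amounts seen at training time vs. in production.
baseline_amounts = [20.0, 22.0, 19.0, 21.0, 18.0, 20.0]
recent_amounts = [35.0, 38.0, 33.0, 36.0, 37.0, 34.0]

DRIFT_THRESHOLD = 2.0  # hypothetical cut-off chosen for illustration
shift = mean_shift_in_std_units(baseline_amounts, recent_amounts)
drift_suspected = shift > DRIFT_THRESHOLD
print(f"shift={shift:.2f} baseline std devs, drift suspected: {drift_suspected}")
```

A check like this only detects a change in inputs (data drift). Concept drift needs labeled outcomes, because the inputs can look the same while the correct answer changes.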

Data quality and bias checks

High-yield review points:

Representative data matters. If training data excludes important populations, conditions, products, geographies, or use cases, predictions may be biased or unreliable.
Labels must be accurate. Bad labels create bad supervised models.
Missing values need deliberate handling. Dropping records may bias the dataset; imputing values may introduce assumptions.
Outliers are not always errors. In fraud or anomaly detection, unusual points may be the signal.
PII and sensitive data require controls. Use data minimization, access control, encryption, masking/redaction where appropriate, and auditability.
Training and test sets must remain separate. If the model “sees” test data during training or tuning, evaluation is not trustworthy.
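
One common way leakage creeps in is computing preprocessing statistics on the full dataset before splitting. The sketch below splits first, then derives the scaling mean and standard deviation from the training portion only. The function name `split_then_scale`, the 25% test fraction, and the sample data are assumptions for illustration.

```python
import random
from statistics import mean, pstdev


def split_then_scale(values, test_fraction=0.25, seed=0):
    """Split first, then fit scaling statistics on the training part only."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]

    # Statistics come from training data only, so the test set stays unseen.
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # guard against a constant feature

    def scale(x):
        return (x - mu) / sigma

    return [scale(x) for x in train], [scale(x) for x in test]


values = [float(v) for v in range(1, 21)]  # hypothetical feature values
train_scaled, test_scaled = split_then_scale(values)
print(len(train_scaled), "training rows,", len(test_scaled), "test rows")
```

The test rows are transformed with the training statistics, which mirrors what happens at prediction time when future data is not available.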

Generative AI quick review

Foundation model concepts

Concept	What to remember
Prompt	Input instructions and context given to a generative model
System instruction	Higher-level behavior or constraints for the model
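Context window	Maximum number of tokens (input plus generated output) the model can consider at once
Temperature	Controls randomness; lower is more predictable, higher is more varied
Top-p	Nucleus sampling: limits sampling to the smallest set of most probable tokens whose cumulative probability reaches p; lower values are more focused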
Max tokens	Limits output length and affects cost/latency
Stop sequence	Text pattern that tells generation to stop
Embeddings	Vector representations used for semantic similarity and retrieval
Hallucination	Plausible but incorrect or unsupported output
Grounding	Tying model output to trusted context or source data
RAG	Retrieval-Augmented Generation: retrieve relevant content, then generate an answer using it
Fine-tuning	Adapting a model’s behavior using task-specific examples
Agent	System that uses a model to reason over tasks and call tools/APIs
Guardrail	Control to reduce unsafe, unwanted, or noncompliant outputs
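
Temperature and top-p are easier to remember after seeing them act on a toy distribution. The sketch below applies the usual textbook definitions to three hypothetical token scores (logits). Model providers may implement sampling differently, so treat this as an illustration of the idea rather than how any specific model works.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most probable tokens reaching top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.5)  # more predictable
warm = softmax_with_temperature(logits, 1.5)  # more varied
print("top token probability, T=0.5:", round(cool[0], 3))
print("top token probability, T=1.5:", round(warm[0], 3))
print("tokens kept with top_p=0.9:", sorted(top_p_filter(cool, 0.9)))
```

Lower temperature concentrates probability on the top token, and a lower top-p removes unlikely tokens from consideration entirely.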

RAG vs fine-tuning vs prompt engineering

Need	Best first approach	Why
Improve instructions, format, tone, or constraints	Prompt engineering	Fastest and lowest operational change
Answer using current or private company knowledge	RAG	Adds external context without retraining the model
Reduce hallucinations by grounding in approved documents	RAG plus evaluation and guardrails	The model can cite or use retrieved sources
Teach a repeated task style or domain-specific output pattern	Fine-tuning	Changes behavior based on examples
Add new factual knowledge that changes often	RAG	Easier to update documents than retrain
Enforce safety boundaries	Guardrails plus prompt controls	Do not rely on prompt wording alone
Connect model to actions or APIs	Agent architecture	Model can plan and invoke tools under controls

Typical RAG flow

Store trusted documents in a searchable knowledge source.
Convert document chunks into embeddings.
Store embeddings in a vector-capable store.
Convert the user query into an embedding.
Retrieve the most relevant chunks.
Add retrieved context to the prompt.
Generate a grounded response.
Apply guardrails, logging, evaluation, and human review where needed.
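
The retrieval steps above can be sketched in a few lines. This toy version uses word counts as a stand-in for a real embedding model and an in-memory list as a stand-in for a vector store; the document chunks, function names, and prompt wording are all hypothetical. It stops at building the grounded prompt and does not call a model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Steps 1-3: trusted chunks, their "embeddings", and a tiny in-memory index.
chunks = [
    "Refunds are available within 30 days of purchase.",
    "Support is open Monday to Friday from 9 to 5.",
    "Shipping takes 3 to 5 business days.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query, k=1):
    """Steps 4-5: embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, k=1):
    """Step 6: add the retrieved context to the prompt."""
    context = "\n".join(retrieve(query, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("Can I get refunds after purchase?"))
```

Word counts only match exact words, which is why real systems use learned embeddings that capture meaning. The flow of embed, retrieve, and augment is the same.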

Common RAG traps:

Poor chunking can retrieve irrelevant or incomplete context.
Stale source documents produce stale answers.
Retrieval does not guarantee correctness; evaluate generated answers.
RAG helps with knowledge grounding but does not automatically solve authorization. Users should only retrieve data they are allowed to access.
Prompt injection can occur when retrieved content contains malicious instructions. Guardrails and input/output controls matter.

Model evaluation and metrics

Classification metrics

Know what each metric favors. You usually do not need heavy math, but you should understand the tradeoff.

Metric	Plain meaning	Use when…	Trap
Accuracy	Overall percent correct	Classes are balanced and errors have similar cost	Misleading for imbalanced data
Precision	Of predicted positives, how many were correct	False positives are costly	High precision may miss true positives
Recall	Of actual positives, how many were found	False negatives are costly	High recall may create many false positives
F1 score	Balance of precision and recall	Need one combined metric	Hides which error type matters more
ROC-AUC	Ranking/separation quality across thresholds	Comparing binary classifiers	Does not directly pick the operating threshold
Confusion matrix	Counts true/false positives/negatives	Understanding error types	Must interpret positive class correctly

Useful formulas:

\[ \text{Accuracy} = \frac{\text{correct predictions}}{\text{all predictions}} \]
\[ \text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \]
\[ \text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \]
\[ \text{F1} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \]
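
The formulas above can be applied directly to confusion-matrix counts. The sketch below uses hypothetical numbers for 1,000 transactions with 10 fraud cases to show why accuracy alone misleads on imbalanced data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model never predicts positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical model A: always predicts "not fraud".
always_negative = classification_metrics(tp=0, fp=0, fn=10, tn=990)
# Hypothetical model B: catches 8 of 10 fraud cases with 12 false alarms.
better_model = classification_metrics(tp=8, fp=12, fn=2, tn=978)

print("always negative:", always_negative)
print("better model:   ", better_model)
```

Model A reaches 99% accuracy while finding no fraud at all (recall 0). Model B has slightly lower accuracy but recall of 0.8, which is the number that matters when missed fraud is costly.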

Regression and generative AI evaluation

Evaluation area	Metric or method	What it tells you
Regression error	MAE	Average absolute error; easier to explain
Regression error	RMSE	Penalizes large errors more strongly
Regression fit	R-squared	Proportion of variance in the target explained by the model
Generative AI quality	Human evaluation	Whether output is useful, accurate, and appropriate
Generative AI grounding	Factuality/groundedness checks	Whether response is supported by source context
Generative AI safety	Toxicity, harmful content, policy checks	Whether output violates safety requirements
Generative AI relevance	Relevance scoring	Whether output answers the user’s question
Operations	Latency, throughput, error rate	Whether the solution performs in production
Cost	Cost per request, token usage, infrastructure cost	Whether the solution is economically viable
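
For the regression rows in the table above, a short calculation shows how MAE, RMSE, and R-squared respond to the same predictions. The actual and predicted values are hypothetical; one prediction is deliberately far off to show how RMSE reacts to a large error.

```python
import math


def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R-squared) for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    r_squared = 1 - ss_res / ss_tot
    return mae, rmse, r_squared


actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 50.0]  # the last prediction misses by 10
mae, rmse, r_squared = regression_metrics(actual, predicted)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R-squared={r_squared:.3f}")
```

RMSE comes out larger than MAE here because squaring gives the single large miss more weight.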

Responsible AI review

AIF-C01 candidates should recognize responsible AI as a lifecycle concern, not a single feature.

Theme	Meaning	Practical controls
Fairness	Avoid unjustified performance gaps or harmful bias	Representative data, bias checks, SageMaker Clarify, human review
Explainability	Understand why a model produced an output	Feature attribution, interpretable models, documentation
Transparency	Communicate AI use, limitations, and confidence appropriately	User notices, model documentation, clear escalation paths
Privacy	Protect personal and sensitive data	Data minimization, masking, encryption, access control
Security	Protect systems, models, data, and prompts	IAM, KMS, network controls, logging, secure APIs
Safety	Reduce harmful, toxic, or inappropriate outputs	Guardrails, content filters, testing, human oversight
Robustness	Maintain quality under realistic input variation	Evaluation, adversarial testing, monitoring
Governance	Manage approvals, accountability, and auditability	Policies, versioning, logs, risk review, ownership
Controllability	Keep humans and systems in control of AI behavior	Constraints, approval workflows, rollback options

Common responsible AI mistakes:

Treating fairness as only a data science issue. It also involves product design, monitoring, and governance.
Assuming a model is objective because it is mathematical.
Using sensitive data without a clear purpose or access controls.
Deploying generative AI without testing for hallucinations, unsafe output, and prompt injection.
Failing to document known limitations.
Ignoring human review for high-impact or ambiguous decisions.

Security, privacy, and governance decision rules

Core AWS controls

Requirement	AWS control to consider	Exam cue
Restrict who can call a service or access data	AWS IAM	Least privilege, roles, policies
Encrypt data at rest	AWS KMS with service encryption features	Key management, encryption
Protect data in transit	TLS/HTTPS	Secure communication
Audit API activity	AWS CloudTrail	Who called which API
Monitor metrics and logs	Amazon CloudWatch	Alarms, logs, dashboards
Detect sensitive data in S3	Amazon Macie	PII discovery
Govern data lake access	AWS Lake Formation	Data lake permissions
Avoid hardcoded secrets	AWS Secrets Manager	Secure secret storage
Private connectivity to supported services	VPC endpoints / AWS PrivateLink patterns	Avoid public internet paths where required
Control S3 access	Bucket policies, IAM, encryption, block public access	Protect datasets and artifacts
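
As an illustration of least privilege, the sketch below builds an IAM policy document that allows a caller to invoke one foundation model and nothing else. The `bedrock:InvokeModel` action and the policy structure follow standard IAM conventions, but the model ID and Region in the ARN are placeholders, and a real policy should be checked against current AWS documentation before use.

```python
import json


def invoke_only_policy(model_arn):
    """Build an IAM policy document allowing invocation of a single model."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeOneModelOnly",
                "Effect": "Allow",
                # One action on one resource: no wildcards.
                "Action": ["bedrock:InvokeModel"],
                "Resource": [model_arn],
            }
        ],
    }


# Placeholder ARN: the Region and model ID are hypothetical.
EXAMPLE_MODEL_ARN = "arn:aws:bedrock:us-east-1::foundation-model/example-model-id"
policy = invoke_only_policy(EXAMPLE_MODEL_ARN)
print(json.dumps(policy, indent=2))
```

The exam-level takeaway is the shape of the policy: a specific action and a specific resource instead of `*`.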

AI-specific security concerns

Concern	Why it matters	Mitigation direction
Prompt injection	Malicious input tries to override instructions	Input validation, guardrails, isolation, retrieval controls
Data leakage	Sensitive data appears in prompts, logs, or outputs	Data minimization, redaction, access control
Unauthorized retrieval	RAG returns documents a user should not see	Enforce permissions before retrieval and generation
Hallucinated authority	Model fabricates policies, citations, or facts	Grounding, citations, human review, evaluation
Model drift	Production behavior degrades over time	Monitoring, retraining, rollback
Over-permissioned agents	Agent can perform actions beyond user intent	Least privilege, scoped tools, approvals
Unsafe output	Harmful, biased, or noncompliant content	Guardrails, filters, testing, escalation
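
The "enforce permissions before retrieval" mitigation can be sketched as a filter that runs before any matching takes place. The `Document` class, group names, and keyword match below are hypothetical simplifications; a production system would rely on the access controls of its document source or search service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def authorized_candidates(documents, user_groups):
    """Keep only documents the user is allowed to see."""
    groups = set(user_groups)
    return [d for d in documents if d.allowed_groups & groups]


def retrieve_for_user(documents, user_groups, keyword):
    """Filter by permission first, then search the remaining documents."""
    candidates = authorized_candidates(documents, user_groups)
    return [d.text for d in candidates if keyword.lower() in d.text.lower()]


documents = [
    Document("Salary bands by job level", frozenset({"hr"})),
    Document("Holiday schedule and salary payment dates", frozenset({"hr", "all-staff"})),
]

staff_results = retrieve_for_user(documents, {"all-staff"}, "salary")
hr_results = retrieve_for_user(documents, {"hr"}, "salary")
print("all-staff sees:", staff_results)
print("hr sees:", hr_results)
```

Because restricted documents never reach the prompt, the model cannot reveal them no matter how the question is phrased.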

Cost, performance, and operational tradeoffs

AIF-C01 questions may include practical constraints such as budget, latency, scale, and maintainability.

Decision factor	What to remember
Model size	Larger models may improve quality but often increase cost and latency
Token volume	More input/output tokens usually increase cost and response time
Context length	Longer context can help but may add cost and noise
Prompt quality	Better prompts can improve results without changing models
RAG retrieval quality	Good retrieval can reduce hallucinations and improve relevance
Batch vs real time	Batch processing can be cheaper or simpler when immediate response is not needed
Managed services	Reduce operational burden for common AI tasks
Monitoring	Needed for errors, latency, drift, quality, and cost
Human review	Adds cost but may be necessary for high-risk or low-confidence outputs
Right-sizing	Match solution complexity to business value and risk
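
The token-volume row can be turned into quick arithmetic. The prices below are made-up placeholders, not real AWS pricing, and the function name is hypothetical. The point is only that cost scales with both input and output tokens, so adding retrieved context to a prompt is not free.

```python
def estimate_request_cost(input_tokens, output_tokens,
                          price_in_per_1k, price_out_per_1k):
    """Estimate the cost of one request from token counts and per-1,000-token prices."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


# Hypothetical prices per 1,000 tokens, for illustration only.
PRICE_IN = 0.003
PRICE_OUT = 0.015

short_prompt = estimate_request_cost(2000, 500, PRICE_IN, PRICE_OUT)
# Same question with 6,000 extra tokens of retrieved context added to the prompt.
with_context = estimate_request_cost(8000, 500, PRICE_IN, PRICE_OUT)

print(f"short prompt: ${short_prompt:.4f} per request")
print(f"with context: ${with_context:.4f} per request")
```

Multiplying the per-request figure by expected request volume gives a rough budget check before choosing a larger model or a longer context.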

High-yield scenario patterns

Scenario clue	Likely answer direction
“Business users want predictions without coding”	Amazon SageMaker Canvas
“Data scientists need to build, train, and deploy a custom model”	Amazon SageMaker
“Use multiple foundation models through a managed service”	Amazon Bedrock
“Add enterprise documents to a generative AI Q&A workflow”	RAG, an Amazon Bedrock Knowledge Bases-style architecture, or Amazon Q Business, depending on wording
“Prevent harmful generative AI responses”	Guardrails, content filtering, evaluation, human review
“Find sensitive data in S3 before using it for ML”	Amazon Macie
“Catalog and prepare data for analytics or ML”	AWS Glue and AWS Glue Data Catalog
“Query data directly in S3 with SQL”	Amazon Athena
“Central data lake governance”	AWS Lake Formation
“Analyze call recordings by converting audio to text”	Amazon Transcribe, then text analysis if needed
“Extract fields from invoices or forms”	Amazon Textract
“Detect objects or moderation labels in images”	Amazon Rekognition
“Identify sentiment and entities in customer reviews”	Amazon Comprehend
“Create natural-sounding audio from text”	Amazon Polly
“Build a bot that collects required fields from users”	Amazon Lex
“Translate support content into another language”	Amazon Translate
“Personalized product recommendations”	Amazon Personalize
“Audit who accessed AI resources”	AWS CloudTrail
“Encrypt data used by AI workloads”	AWS KMS and service-level encryption settings

Common candidate mistakes

Choosing the most advanced service instead of the most appropriate service. If a managed AI service directly solves the use case, it is often the best foundational answer.
Confusing Amazon Bedrock and Amazon SageMaker. Bedrock is the first thought for managed foundation model and generative AI application patterns. SageMaker is the first thought for custom ML lifecycle work.
Using fine-tuning when RAG is the better fit. If the problem is “answer from current company documents,” think retrieval and grounding before fine-tuning.
Using accuracy for imbalanced classification. A fraud model that predicts “not fraud” almost every time may have high accuracy and still be useless. Think precision, recall, F1, and business cost of errors.
Ignoring data leakage. If future information appears in training data, evaluation results may look excellent but fail in production.
Treating deployment as the end. Real systems require monitoring for drift, quality, latency, errors, security, and cost.
Assuming generative AI output is always correct. LLMs can hallucinate. Use grounding, evaluation, guardrails, citations, and human review where appropriate.
Forgetting authorization in RAG. Retrieval must respect user permissions. A model should not expose documents just because they exist in the vector store.
Confusing speech, text, and language services. Transcribe is speech-to-text. Polly is text-to-speech. Translate changes language. Comprehend analyzes text.
Overlooking responsible AI. Fairness, privacy, security, explainability, safety, robustness, transparency, and governance are all testable themes.

Fast final review checklist

Before starting topic drills or a mock exam, make sure you can answer these without hesitation:

Can you explain AI vs ML vs deep learning vs generative AI?
Can you distinguish supervised, unsupervised, and reinforcement learning?
Can you identify classification, regression, clustering, recommendation, and anomaly detection scenarios?
Can you explain features, labels, training, validation, testing, and inference?
Can you identify overfitting, underfitting, data leakage, drift, and bias?
Can you choose between managed AI services, Amazon Bedrock, and Amazon SageMaker?
Can you explain embeddings, vector search, semantic similarity, and RAG?
Can you decide when prompt engineering, RAG, fine-tuning, agents, or guardrails are appropriate?
Can you choose the right evaluation metric for common scenarios?
Can you recognize responsible AI risks and mitigation controls?
Can you map IAM, KMS, CloudTrail, CloudWatch, Macie, Glue, S3, and Lake Formation to security and governance needs?

Practice plan after this Quick Review

Use this Quick Review as a checkpoint, then move into IT Mastery practice:

Start with topic drills on AI/ML fundamentals, generative AI, AWS service selection, responsible AI, and security.
Use original practice questions to force scenario recognition rather than memorization.
Read detailed explanations for every missed question and every guessed question.
Create a miss log with three columns: concept missed, why the wrong answer was tempting, and the decision rule to remember.
Take a mixed mock exam only after your topic drills show consistent performance across service selection, generative AI, evaluation, and governance.

Next step: choose a focused AIF-C01 question bank topic drill, answer without notes, then review the detailed explanations until you can explain why each wrong option is wrong.

Continue in IT Mastery

Use this Quick Review as a final concept map, then move into IT Mastery for focused topic drills, mixed practice sets, timed mock exams, and detailed explanations. The practice questions are original IT Mastery practice items; they are not official AWS questions, copied live-exam content, or exam dumps.
