SAA-C03 — AWS Certified Solutions Architect – Associate Exam Blueprint
A practical SAA-C03 exam blueprint for AWS Certified Solutions Architect – Associate candidates reviewing architecture, security, resiliency, cost, and operations.
How to Use This Exam Blueprint
Use this independent Exam Blueprint as a practical study map for the AWS Certified Solutions Architect – Associate (SAA-C03) exam. It is designed to help you turn broad exam areas into specific readiness checks: what you should recognize, compare, design, troubleshoot, and eliminate in scenario questions.
Do not treat this as a list of exact exam weights or guaranteed question coverage. Instead, use it to find weak areas before final review.
For each section:
- Mark items as ready, needs review, or uncertain.
- For every uncertain item, write one sentence explaining when to use it and one sentence explaining when not to use it.
- Practice mixed scenarios where security, availability, performance, cost, and operations compete.
- Avoid memorizing service names without understanding the architecture tradeoff.
SAA-C03 Readiness Area Map
| Readiness area | What you should be able to decide | AWS services, features, or artifacts to review | Ready signal |
|---|---|---|---|
| Secure access design | Who should access what, from where, and under which conditions | IAM users, groups, roles, policies, resource policies, STS, Organizations, SCPs, KMS, Secrets Manager | You can choose role-based access over static credentials and explain policy evaluation at a high level |
| Network architecture | How traffic enters, leaves, and stays private inside AWS | VPC, subnets, route tables, internet gateway, NAT gateway, security groups, network ACLs, VPC endpoints, peering, Transit Gateway, VPN, Direct Connect | You can trace packet flow from client to workload and identify the control that allows or blocks traffic |
| Compute selection | Which compute model fits the workload and operating responsibility | EC2, Auto Scaling, AMIs, launch templates, Lambda, ECS, EKS, Fargate, Elastic Beanstalk, Batch | You can choose between server-based, container-based, and serverless designs from scenario cues |
| Load balancing and scaling | How to distribute traffic and scale safely | ALB, NLB, Gateway Load Balancer, Auto Scaling groups, target groups, health checks, CloudFront, Route 53 | You can match the load balancer and scaling pattern to protocol, latency, and availability needs |
| Storage architecture | Which storage service fits access pattern, durability, sharing, and lifecycle needs | S3, EBS, EFS, FSx, S3 lifecycle, S3 replication, S3 Object Lock, Storage Gateway, DataSync | You can distinguish object, block, and file storage quickly |
| Database and data services | Which data platform fits consistency, access pattern, scale, and operations | RDS, Aurora, DynamoDB, ElastiCache, Redshift, OpenSearch Service, Neptune, DMS | You can explain why a scenario needs relational, key-value, cache, search, warehouse, or graph capabilities |
| Resiliency and disaster recovery | How to recover from component, AZ, Region, or data failure | Multi-AZ, backups, snapshots, replication, Route 53 routing, CloudFront, S3 versioning, cross-Region patterns | You can align designs to RTO/RPO language without overengineering |
| Application integration | How to decouple, buffer, fan out, orchestrate, and stream | SQS, SNS, EventBridge, Step Functions, Kinesis, Amazon MQ, API Gateway | You can choose queue vs notification vs event bus vs workflow vs stream |
| Monitoring and operations | How to detect, audit, automate, and respond | CloudWatch, CloudTrail, AWS Config, Systems Manager, X-Ray, EventBridge, AWS Health, Trusted Advisor | You can separate metrics, logs, traces, audit events, configuration history, and operational automation |
| Cost optimization | How to reduce waste while preserving requirements | Cost Explorer, Budgets, Trusted Advisor, Savings Plans, Reserved Instances, Spot, S3 storage classes, lifecycle policies | You can select cost levers that do not violate availability, durability, or performance needs |
| Governance and account strategy | How to organize accounts, apply guardrails, and centralize control | AWS Organizations, SCPs, IAM Identity Center, Control Tower concepts, tagging, Config, CloudTrail | You can distinguish guardrails from permissions and monitoring |
| Hybrid and migration | How to connect or move workloads and data | Site-to-Site VPN, Direct Connect, Transit Gateway, Storage Gateway, DataSync, DMS, Snow Family, Transfer Family | You can choose migration tools based on data type, downtime tolerance, and connectivity |
High-Value “Can You Do This?” Checklist
Before you sit for SAA-C03, you should be able to do the following without relying on notes.
Architecture judgment
- Convert business requirements into architecture qualities: security, reliability, performance, cost, and operational simplicity.
- Identify the managed AWS service that reduces operational burden compared with self-managed infrastructure.
- Decide when a workload should be stateless, stateful, event-driven, batch-oriented, or streaming.
- Separate high availability, disaster recovery, backup, replication, and scaling.
- Recognize when “lowest cost” conflicts with “highest availability” or “lowest latency.”
- Explain why a design should use multiple Availability Zones.
- Explain when a multi-Region design is justified and when it is unnecessary.
- Choose an edge service when users are globally distributed or static content must be cached close to users.
- Identify single points of failure in compute, networking, storage, and database designs.
- Choose the simplest architecture that satisfies the stated requirements.
AWS service selection
- Choose between EC2, Lambda, containers, and managed platforms.
- Choose between S3, EBS, EFS, and FSx.
- Choose between RDS/Aurora, DynamoDB, ElastiCache, Redshift, OpenSearch Service, and Neptune.
- Choose between ALB, NLB, Gateway Load Balancer, CloudFront, and Route 53.
- Choose between SQS, SNS, EventBridge, Step Functions, and Kinesis.
- Choose between NAT gateway, internet gateway, VPC endpoint, VPN, Direct Connect, VPC peering, and Transit Gateway.
- Choose between CloudWatch, CloudTrail, AWS Config, X-Ray, and Systems Manager for an operational scenario.
- Choose between KMS, Secrets Manager, Systems Manager Parameter Store, IAM policies, and resource policies.
Scenario elimination
- Eliminate answers that expose private resources directly to the internet.
- Eliminate answers that use long-term credentials where roles or temporary credentials are more appropriate.
- Eliminate answers that scale only vertically when horizontal scaling is required.
- Eliminate answers that use a database read replica as a substitute for Multi-AZ availability.
- Eliminate answers that solve asynchronous decoupling with synchronous service calls.
- Eliminate answers that require excessive custom operations when a managed AWS service fits.
- Eliminate answers that improve cost by breaking a stated durability, security, or availability requirement.
- Eliminate answers that use public connectivity when private connectivity is explicitly required.
Core Architecture Decision Points
| Scenario cue | Ask yourself | Strong candidate response |
|---|---|---|
| “Highly available web application” | Are compute, database, and load balancer spread across failure boundaries? | Use multiple AZs, load balancing, Auto Scaling, and managed database availability features |
| “Unpredictable traffic” | Can capacity scale automatically and safely? | Use Auto Scaling, serverless options, queues, or managed scaling depending on workload type |
| “Global users with static assets” | Is edge caching appropriate? | Use CloudFront with an appropriate origin such as S3 or an Application Load Balancer |
| “Private application instances need software updates” | Do they need inbound internet access? | Use private subnets with controlled outbound access or private endpoints where possible |
| “Loose coupling between services” | Is the caller waiting for the callee? | Use SQS, SNS, EventBridge, or Step Functions based on queueing, fanout, event routing, or orchestration needs |
| “Strict audit requirements” | Are management events, configuration changes, and access attempts visible? | Use CloudTrail, AWS Config, CloudWatch, and centralized logging patterns |
| “Minimize operational overhead” | Is there a managed service that handles scaling, patching, backups, or failover? | Prefer managed AWS services where they meet the requirement |
| “Large data migration” | What is the data source, transfer path, and downtime tolerance? | Consider DataSync, DMS, Storage Gateway, Snow Family, or network connectivity choices |
| “Cost must be reduced” | Which cost lever preserves the stated requirement? | Use right sizing, lifecycle policies, purchase options, Spot for interruptible work, and managed scaling |
Secure Architecture Checklist
IAM and access control
| Topic | Review focus | Ready check |
|---|---|---|
| IAM identities | Users, groups, roles, and temporary credentials | You know why applications should usually use roles instead of embedded access keys |
| IAM policies | Allow, deny, action, resource, principal, condition | You can interpret a policy fragment and identify what is allowed or denied |
| Resource policies | Bucket policies, key policies, queue policies, trust policies | You can tell when access is controlled by the resource rather than only by identity policy |
| Role assumption | Trust policy, permissions policy, STS | You can explain cross-account access at a high level |
| Least privilege | Narrow actions, resources, and conditions | You can identify overly broad permissions in a scenario |
| AWS Organizations | Accounts, organizational units, SCPs | You know SCPs set guardrails but do not grant permissions by themselves |
| Federation | External identity provider and temporary access | You can choose federation over creating many long-term IAM users |
| Permission boundaries | Maximum permissions for a principal | You can recognize when boundaries limit delegated administration |
Can you do these?
- Explain the difference between an IAM identity policy and an S3 bucket policy.
- Explain why an explicit deny overrides an allow.
- Identify the purpose of a role trust policy.
- Choose IAM roles for EC2, Lambda, ECS tasks, and cross-account access.
- Identify when to use temporary credentials.
- Recognize that SCPs restrict maximum permissions for accounts or organizational units.
- Use conditions conceptually, such as source VPC endpoint, MFA, encryption, or organization membership.
- Avoid using root credentials for routine administrative work.
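The evaluation order behind several of these items (explicit deny wins, SCPs cap but never grant, everything else is implicitly denied) can be sketched in a few lines. This is a deliberately simplified model for a same-account request: the `Statement` class and `evaluate` function are hypothetical names, matching is exact-string only, and resource policies, permission boundaries, and conditions are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    effect: str           # "Allow" or "Deny"
    actions: frozenset    # exact action names (real IAM also supports wildcards)
    resources: frozenset  # exact ARNs (real IAM also supports wildcards)


def matches(stmt, action, resource):
    return action in stmt.actions and resource in stmt.resources


def evaluate(action, resource, identity_policy, scp=None):
    """Return "Allow" or "Deny" for a same-account request (simplified model)."""
    policies = [identity_policy] + ([scp] if scp is not None else [])
    # 1. An explicit Deny in any applicable policy wins immediately.
    for policy in policies:
        if any(s.effect == "Deny" and matches(s, action, resource) for s in policy):
            return "Deny"
    # 2. An SCP grants nothing, but the action must also fall inside it.
    if scp is not None and not any(
        s.effect == "Allow" and matches(s, action, resource) for s in scp
    ):
        return "Deny"
    # 3. An identity-policy Allow is still required; otherwise implicit deny.
    if any(s.effect == "Allow" and matches(s, action, resource) for s in identity_policy):
        return "Allow"
    return "Deny"
```

Passing an empty list as `scp` models an account whose guardrails allow nothing: the identity policy's Allow is then not enough on its own.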
Example policy-reading readiness:
{
  "Effect": "Allow",
  "Action": ["s3:GetObject"],
  "Resource": "arn:aws:s3:::example-bucket/*",
  "Condition": {
    "StringEquals": {
      "aws:PrincipalOrgID": "o-example"
    }
  }
}
You should be able to say: this allows `s3:GetObject` on objects in `example-bucket` only when the calling principal belongs to the organization named in the condition, assuming no other policy (for example an explicit deny or an SCP) blocks the request.
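To practice reading the fragment mechanically, the sketch below checks a request against that one statement. It is an illustration, not how AWS evaluates policies internally: `statement_allows` is a hypothetical helper, only a trailing `*` in the resource and the `StringEquals` operator are handled, and the object keys and second organization ID are made up.

```python
STATEMENT = {
    "Effect": "Allow",
    "Action": ["s3:GetObject"],
    "Resource": "arn:aws:s3:::example-bucket/*",
    "Condition": {"StringEquals": {"aws:PrincipalOrgID": "o-example"}},
}


def statement_allows(statement, action, resource_arn, context):
    """Return True if this single Allow statement matches the request."""
    if statement["Effect"] != "Allow" or action not in statement["Action"]:
        return False
    pattern = statement["Resource"]
    if pattern.endswith("*"):
        # Simplification: only a trailing wildcard is supported here.
        if not resource_arn.startswith(pattern[:-1]):
            return False
    elif resource_arn != pattern:
        return False
    # Every StringEquals key must be present in the request context and equal.
    expected = statement.get("Condition", {}).get("StringEquals", {})
    return all(context.get(key) == value for key, value in expected.items())


in_org = {"aws:PrincipalOrgID": "o-example"}
outside_org = {"aws:PrincipalOrgID": "o-other"}  # hypothetical second organization
print(statement_allows(STATEMENT, "s3:GetObject",
                       "arn:aws:s3:::example-bucket/a.txt", in_org))
print(statement_allows(STATEMENT, "s3:GetObject",
                       "arn:aws:s3:::example-bucket/a.txt", outside_org))
```

A `True` result here only means this statement matches; the final decision still depends on every other policy that applies to the request.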
Network security
| Control | What it does | Common exam trap |
|---|---|---|
| Security group | Stateful instance or ENI-level traffic control | Confusing it with a network ACL |
| Network ACL | Stateless subnet-level traffic control | Forgetting return traffic must be allowed |
| Route table | Determines where traffic is sent | Assuming a subnet is public without a route to an internet gateway |
| Internet gateway | Enables internet routing for public subnets | Placing private resources in a public subnet unnecessarily |
| NAT gateway | Allows private subnet resources to initiate outbound internet access | Assuming it supports inbound internet access to private instances |
| VPC endpoint | Private access to supported AWS services | Choosing NAT when the scenario asks to avoid internet traversal |
| WAF | Web-layer protection for HTTP/S traffic | Using it for non-web network filtering |
| Shield | DDoS protection for internet-facing resources | Confusing it with application authorization |
| KMS | Key management and encryption integration | Assuming encryption and access authorization are the same thing |
Checklist:
- Design public and private subnets correctly.
- Place load balancers, NAT gateways, and application instances in appropriate subnet types.
- Keep databases private unless the scenario explicitly requires otherwise.
- Use security groups for instance-level allow rules.
- Use network ACLs when subnet-level stateless filtering is required.
- Use VPC endpoints to keep supported service traffic private.
- Choose AWS WAF for web request filtering.
- Choose AWS KMS where encryption key control or auditability is required.
- Use Secrets Manager or Parameter Store instead of hardcoded secrets.
- Recognize when CloudTrail is required for API audit visibility.
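The stateful versus stateless distinction behind the security group and network ACL items is easier to remember once you see the return traffic checked separately. The toy model below is an assumption-laden sketch: it tracks ports only, ignores rule numbers, deny rules, and CIDR ranges, and both function names are hypothetical.

```python
def sg_allows_round_trip(inbound_ports, dest_port):
    """Security group: stateful, so an allowed request implies the reply is allowed."""
    return dest_port in inbound_ports


def nacl_allows_round_trip(inbound_ports, outbound_ports, dest_port, client_port):
    """Network ACL: stateless, so the request and the reply are checked separately."""
    request_ok = dest_port in inbound_ports
    # The reply leaves the subnet toward the client's ephemeral source port.
    reply_ok = client_port in outbound_ports
    return request_ok and reply_ok


EPHEMERAL = range(1024, 65536)  # a common ephemeral port range for clients

# HTTPS request from a client that picked source port 50000.
print(sg_allows_round_trip({443}, 443))                      # inbound rule alone is enough
print(nacl_allows_round_trip({443}, set(), 443, 50000))      # reply is dropped
print(nacl_allows_round_trip({443}, EPHEMERAL, 443, 50000))  # reply is allowed
```

The second call is the classic exam trap: the inbound rule is correct, but without an outbound rule covering ephemeral ports the response never reaches the client.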
Encryption, secrets, and auditability
| Need | Likely AWS capability to review | Ready signal |
|---|---|---|
| Encrypt data at rest | KMS-integrated encryption, service-managed or customer-managed keys | You can distinguish encryption from access permission |
| Encrypt data in transit | TLS, HTTPS listeners, certificates | You know where TLS is terminated in common architectures |
| Rotate database credentials | Secrets Manager | You can choose managed secret storage over application config files |
| Store application parameters | Systems Manager Parameter Store or Secrets Manager | You can choose based on sensitivity and rotation needs |
| Audit API activity | CloudTrail | You know CloudTrail is for account/API activity, not application performance metrics |
| Track resource configuration | AWS Config | You can use Config for compliance and configuration history |
| Detect threats | GuardDuty and related security services | You can recognize threat detection versus preventive access control |
| Inspect vulnerabilities | Inspector | You can distinguish vulnerability assessment from logging |
| Discover sensitive data | Macie | You can connect Macie with sensitive data discovery in S3 |
Network and Connectivity Readiness
VPC fundamentals
| Concept | You should know |
|---|---|
| VPC | Logical network boundary for AWS resources |
| Subnet | AZ-scoped segment inside a VPC |
| Public subnet | Route table has a route to an internet gateway; resources also need a public or Elastic IP address to be reachable from the internet |
| Private subnet | Route table has no route to an internet gateway; any outbound internet access goes through a NAT gateway |
| Route table | Controls traffic destination |
| Security group | Stateful allow rules associated with resources |
| Network ACL | Stateless subnet boundary rules |
| NAT gateway | Outbound internet path for private resources |
| VPC endpoint | Private path to supported AWS services |
| Peering | Direct private connectivity between VPCs with routing considerations |
| Transit Gateway | Hub-style connectivity for multiple networks |
| VPN | Encrypted connectivity over the internet |
| Direct Connect | Dedicated private network connection between on-premises and AWS that does not traverse the public internet |
| Route 53 | DNS, routing policies, and health-check-aware patterns |
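What makes a subnet public or private is its route table, and routes are chosen by most-specific match. The sketch below models that with the standard-library `ipaddress` module. The route tables and the `igw-example` and `nat-example` targets are hypothetical placeholders, and the model ignores everything except destination-based routing.

```python
import ipaddress


def next_hop(route_table, destination_ip):
    """Return the target of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, target)
    return best[1] if best else None


def is_public_subnet(route_table):
    """A subnet is treated as public when a route points at an internet gateway."""
    return any(target.startswith("igw-") for target in route_table.values())


PUBLIC_RT = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-example"}
PRIVATE_RT = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-example"}

print(next_hop(PUBLIC_RT, "10.0.1.5"))       # stays inside the VPC
print(next_hop(PRIVATE_RT, "203.0.113.10"))  # leaves through the NAT gateway
print(is_public_subnet(PRIVATE_RT))
```

Tracing a scenario then becomes a sequence of lookups: pick the subnet's route table, find the next hop, and check which security control sits on that hop.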
Can you trace this?
- A user reaches a public ALB.
- The ALB sends traffic to private application instances.
- Application instances connect to a private database.
- Application instances retrieve objects from S3 without public internet exposure when required.
- Operations teams access instances without opening broad inbound SSH/RDP access.
- Logs and metrics are sent to monitoring services.
- Failed instances are replaced automatically.
Connectivity decision table
| Requirement | Consider first | Watch for |
|---|---|---|
| Private access from VPC to supported AWS service | VPC endpoint | Gateway vs interface endpoint fit |
| Many VPCs and on-premises networks need hub connectivity | Transit Gateway | Routing, segmentation, and governance |
| Two VPCs need direct private connectivity | VPC peering | Non-transitive routing considerations |
| Encrypted on-premises connectivity over internet | Site-to-Site VPN | Bandwidth and resilience expectations in the scenario |
| Dedicated private connectivity | Direct Connect | A single connection is a single point of failure; resilient designs add a second connection or a VPN backup |
| Private resources need outbound updates | NAT gateway or endpoints | Do not confuse outbound access with inbound exposure |
| Global DNS routing or failover | Route 53 routing policies | Match routing policy to latency, failover, weighted, or geolocation needs |
| Global acceleration for application endpoints | Global Accelerator or CloudFront depending on use case | Do not use CDN-only thinking for all global routing needs |
Compute, Scaling, and Load Balancing Checklist
Compute selection
| Workload cue | Better-fit option to consider | Why |
|---|---|---|
| Full OS control, custom agents, legacy software | EC2 | Maximum control over instance environment |
| Predictable fleet of web/app servers | EC2 Auto Scaling | Horizontal scaling and replacement |
| Event-driven short tasks | Lambda | Serverless execution with reduced infrastructure management |
| Containerized application | ECS, EKS, or Fargate | Container orchestration and deployment flexibility |
| Containers without managing servers | Fargate | Reduced host management |
| Simple managed application deployment | Elastic Beanstalk | Platform abstraction while retaining AWS resource visibility |
| Batch jobs | AWS Batch | Managed batch scheduling pattern |
| Interruptible, flexible compute | Spot-capable designs | Cost optimization when interruption is acceptable |
Checklist:
- Choose EC2 when the workload requires server-level control.
- Choose Auto Scaling groups for elastic fleets and self-healing instance replacement.
- Choose Lambda for event-driven workloads when the execution model fits (short-lived, stateless invocations).
- Choose containers when packaging, portability, or microservice deployment matters.
- Choose Fargate when the scenario emphasizes avoiding server management.
- Choose Spot only for workloads that tolerate interruption.
- Use launch templates or reusable configuration patterns for repeatable EC2 deployments.
- Store state outside ephemeral compute when instances may be replaced.
- Recognize when a queue protects compute from traffic spikes.
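For the Auto Scaling items, it helps to see how a metric target translates into a fleet size. The function below is a simplified proportional rule written for illustration, not the exact algorithm any AWS scaling policy uses; real policies also apply cooldowns, warm-up periods, and alarm evaluation windows. The function name and the numbers are assumptions.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value, min_size, max_size):
    """Scale the fleet so the per-instance metric moves back toward the target."""
    raw = math.ceil(current_capacity * metric_value / target_value)
    # The group's min and max sizes always bound the result.
    return max(min_size, min(max_size, raw))


# Four instances at 90% average CPU with a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))
# Four instances at 30% average CPU: scale in, but not below the minimum.
print(desired_capacity(4, 30.0, 60.0, min_size=2, max_size=10))
```

The clamp is the part scenario questions tend to probe: a group cannot scale past its maximum no matter how high the metric climbs, which is where a queue in front of the fleet helps.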
Load balancer selection
| Service | Best-fit scenario cues | Avoid confusing with |
|---|---|---|
| Application Load Balancer | HTTP/HTTPS, path-based or host-based routing, web applications | Network-level ultra-low-latency TCP routing |
| Network Load Balancer | TCP/UDP/TLS, static IP addresses per AZ, high-performance network traffic | Layer 7 web routing |
| Gateway Load Balancer | Third-party virtual appliances, traffic inspection | Standard web application load balancing |
| CloudFront | Global content delivery, edge caching, TLS at edge | Regional load balancing only |
| Route 53 | DNS routing and failover patterns | Replacing application-level health and scaling design |
Ready checks:
- Match ALB to HTTP/S application routing.
- Match NLB to transport-level traffic.
- Match CloudFront to edge caching and global delivery.
- Use health checks to remove unhealthy targets.
- Design target groups and scaling together.
- Know that load balancing does not replace application or database resiliency.
Storage Architecture Checklist
Storage service selection
| Requirement | Service to consider | Why |
|---|---|---|
| Object storage, static content, backups, data lake storage | S3 | Durable object storage with lifecycle and policy controls |
| Block storage for EC2 instance | EBS | Attached block volume for instance workloads |
| Shared Linux file system | EFS | Managed elastic file storage for multiple clients |
| Windows file shares or specialized file systems | FSx family | Managed file systems for specific workload needs |
| Hybrid file access to cloud-backed storage | Storage Gateway | Integrates on-premises environments with AWS storage |
| Data transfer between locations or services | DataSync | Managed data movement |
| Long-term archive or infrequent access pattern | S3 storage classes and lifecycle | Cost optimization through access-pattern alignment |
Checklist:
- Distinguish object, block, and file storage.
- Choose S3 for object storage, static websites, backups, logs, and data lakes.
- Choose EBS for EC2-attached block storage.
- Choose EFS when multiple compute resources need shared file access.
- Choose FSx when the scenario names Windows file shares or a specific file system requirement.
- Use S3 lifecycle policies to transition or expire objects based on access and retention needs.
- Use S3 versioning when accidental overwrite or deletion protection is needed.
- Use replication when data must be copied across buckets or Regions for a stated purpose.
- Use Object Lock only when write-once-read-many or retention enforcement is required.
- Use S3 event notifications or EventBridge when object changes should trigger processing.
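A lifecycle rule is a small declarative schedule, so it can be reasoned about as data. The dictionary below follows the general shape of an S3 lifecycle rule, but the rule ID, prefix, and day counts are hypothetical, and `storage_class_at` is an illustrative helper that assumes objects are uploaded to STANDARD. It does not call AWS.

```python
LIFECYCLE_RULE = {
    "ID": "archive-logs",  # hypothetical rule name
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},
}


def storage_class_at(rule, key, age_days):
    """Return the storage class for an object of this age, or None once expired."""
    applies = rule["Status"] == "Enabled" and key.startswith(rule["Filter"]["Prefix"])
    if not applies:
        return "STANDARD"  # rule does not touch this object
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(LIFECYCLE_RULE, "logs/app.log", age))
```

Note what the rule does not do: it never copies the object anywhere else, which is why lifecycle transition and replication answer different requirements.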
S3 access and protection readiness
| Topic | Ready if you can explain |
|---|---|
| Bucket policy | Resource-based access control for bucket and objects |
| IAM policy | Identity-based permissions for principals |
| Block Public Access | Guardrail against unintended public exposure |
| Versioning | Retains multiple versions of objects |
| Lifecycle | Automates transition or expiration |
| Replication | Copies objects to another bucket based on configured rules |
| Encryption | Protects data at rest using appropriate key management |
| Pre-signed URL | Time-limited delegated object access pattern |
| Static website hosting | Public website pattern with routing and access considerations |
| CloudFront origin access | Edge distribution pattern that can reduce direct S3 exposure |
Common S3 traps:
- Assuming encryption automatically grants access.
- Assuming public access is required for CloudFront-backed delivery.
- Confusing lifecycle transition with replication.
- Confusing versioning with backup strategy.
- Forgetting that object storage is not block storage for an EC2 boot volume.
- Choosing EFS or EBS when the scenario clearly describes object storage.
Database and Data Platform Checklist
Database service selection
| Scenario cue | Service to consider | Readiness point |
|---|---|---|
| Managed relational database | RDS | SQL engine, backups, patching, Multi-AZ patterns |
| High-performance managed relational architecture | Aurora | MySQL- and PostgreSQL-compatible engine with a distributed, replicated storage layer |
| Key-value or document access at scale | DynamoDB | Partition key design, indexes, capacity model concepts, global tables concepts |
| Low-latency in-memory cache layer or session store | ElastiCache | Redis or Memcached fit, cache-aside patterns |
| Data warehouse and analytics queries | Redshift | Analytical workloads, not OLTP replacement |
| Search and log analytics | OpenSearch Service | Search indexing and analysis |
| Graph relationships | Neptune | Highly connected graph data |
| Database migration | DMS | Homogeneous or heterogeneous migration patterns |
| Schema conversion | Schema Conversion Tool concepts | Used when database engines differ |
Checklist:
- Choose RDS/Aurora for relational requirements and SQL transactions.
- Choose DynamoDB for key-value access patterns and serverless-style NoSQL needs.
- Identify partition key and access pattern importance for DynamoDB.
- Choose read replicas for read scaling where appropriate.
- Choose Multi-AZ patterns for availability and failover.
- Choose ElastiCache to reduce repeated database reads or handle session/cache needs.
- Choose Redshift for analytical queries over large datasets.
- Choose OpenSearch Service for search use cases.
- Choose DMS when migrating databases with minimal disruption requirements.
- Use backups, snapshots, point-in-time concepts, and restore testing as part of resilience planning.
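The ElastiCache item usually implies the cache-aside pattern: read the cache first, fall back to the database on a miss, then populate the cache. The sketch below uses an in-process dictionary as a stand-in for a real cache, so the class name, the TTL handling, and the injected `load_from_db` callable are all illustrative assumptions.

```python
import time


class CacheAside:
    """Read-through-on-miss cache with a time-to-live, backed by a loader function."""

    def __init__(self, load_from_db, ttl_seconds, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit
        value = self._load(key)                  # cache miss: go to the database
        self._cache[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Call after a write so the next read sees fresh data."""
        self._cache.pop(key, None)


db_reads = []


def fake_db_lookup(key):
    db_reads.append(key)  # count how often the "database" is hit
    return f"row-for-{key}"


cache = CacheAside(fake_db_lookup, ttl_seconds=60)
cache.get("user:1")
cache.get("user:1")
print(len(db_reads))  # the second read was served from the cache
```

The TTL and the explicit `invalidate` are the two freshness controls; if the scenario says stale reads are unacceptable, caching alone is not the answer.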
Database decision traps
| Trap | Correct reasoning |
|---|---|
| “Read replica means high availability” | Read replicas primarily address read scaling and can support some recovery patterns; Multi-AZ is the common availability cue |
| “DynamoDB works without data modeling” | Access patterns and key design are central |
| “Cache fixes all database problems” | Cache helps repeated reads but does not replace correct database design |
| “Redshift is for transactional app writes” | Redshift is a data warehouse pattern |
| “Backups equal zero downtime” | Backups help recovery; they do not automatically provide continuous availability |
| “Multi-Region is always better” | It adds complexity and cost; use it when requirements justify it |
| “Encryption handles authorization” | Encryption protects data; IAM, resource policies, and application controls authorize access |
Application Integration and Event-Driven Design
Service selection
| Requirement | Service to consider | Key distinction |
|---|---|---|
| Decouple producers and consumers with buffering | SQS | Queue-based asynchronous processing |
| Fan out messages to multiple subscribers | SNS | Pub/sub notification pattern |
| Route events from many sources to targets | EventBridge | Event bus and rule-based routing |
| Coordinate multi-step workflows | Step Functions | State machine orchestration |
| Process high-volume ordered event streams | Kinesis | Streaming data ingestion and processing |
| Managed message broker compatibility | Amazon MQ | Broker-based migration or protocol compatibility |
| Build managed APIs | API Gateway | Front door for APIs, often with Lambda or private integrations |
Can you do this?
- Choose SQS when work should wait in a queue until a consumer processes it.
- Choose SNS when one message should notify multiple subscribers.
- Choose EventBridge when events need routing, filtering, or SaaS/service integration patterns.
- Choose Step Functions when business logic requires explicit workflow states and retries.
- Choose Kinesis when the scenario describes streaming data records.
- Choose dead-letter queues (DLQs) to isolate messages that repeatedly fail asynchronous processing.
- Add idempotency when retries may cause duplicate processing.
- Use queues to absorb traffic spikes and protect downstream services.
- Avoid tightly coupling synchronous services when the scenario asks for resiliency.
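Retries, dead-letter queues, and idempotency fit together as one loop. The simulation below is an in-memory sketch, not SQS itself: `drain`, the message shape, and the `max_receives` value are hypothetical, and a real queue would redeliver after a visibility timeout rather than immediately.

```python
from collections import deque


def drain(messages, handler, max_receives=3):
    """Process messages at-least-once; return (processed_ids, dead_letters)."""
    pending = deque((msg, 0) for msg in messages)
    processed, dead_letters = set(), []
    while pending:
        msg, receives = pending.popleft()
        receives += 1
        if msg["id"] in processed:
            continue  # idempotency: a duplicate delivery is acknowledged, not redone
        try:
            handler(msg)
            processed.add(msg["id"])
        except Exception:
            if receives >= max_receives:
                dead_letters.append(msg)         # give up and park it for inspection
            else:
                pending.append((msg, receives))  # make it available for another try
    return processed, dead_letters


def handler(msg):
    if msg["body"] == "bad":
        raise ValueError("cannot process")


inbox = [
    {"id": "1", "body": "ok"},
    {"id": "1", "body": "ok"},   # duplicate delivery of the same message
    {"id": "2", "body": "bad"},  # always fails
]
done, dlq = drain(inbox, handler)
print(done, [m["id"] for m in dlq])
```

The poison message ends up in the dead-letter list after three attempts instead of blocking the queue, and the duplicate is processed only once.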
Event pattern cues
| Cue in question | Think |
|---|---|
| “Order processing has multiple steps and compensation logic” | Step Functions |
| “Image uploads trigger thumbnail generation” | S3 event notification/EventBridge plus Lambda or queue |
| “Payment service must not be overwhelmed” | SQS buffer and controlled consumers |
| “Multiple systems need the same update” | SNS fanout or EventBridge routing |
| “Clickstream or telemetry ingestion” | Kinesis-style streaming |
| “Legacy application uses standard broker protocols” | Amazon MQ |
Resilient Architecture Checklist
Availability and recovery design
| Requirement language | Architecture thinking |
|---|---|
| “Highly available” | Remove single points of failure; use multiple AZs and managed failover where suitable |
| “Fault tolerant” | Continue operating despite component failure |
| “Disaster recovery” | Define recovery strategy for larger failures, including Region-level events when required |
| “Backup and restore” | Recovery depends on backup frequency, restore process, and validation |
| “Low RTO” | Favor warm, active, or automated failover patterns over manual rebuilds |
| “Low RPO” | Favor frequent replication or continuous data protection patterns |
| “Stateless web tier” | Store session/data outside replaceable instances |
| “Self-healing” | Use health checks, Auto Scaling, managed services, and automation |
Checklist:
- Design web/application tiers across multiple AZs.
- Use load balancer health checks and Auto Scaling replacement.
- Keep application state out of individual instances.
- Use managed database availability features where requirements call for them.
- Use backups and snapshots for recoverability.
- Use S3 versioning and replication when object recovery or geographic copy is required.
- Use Route 53 routing policies for DNS-level failover or routing scenarios.
- Use CloudFront to improve availability and performance for cacheable global content.
- Test whether a design has a single NAT, single instance, single database, or single network dependency.
- Match DR pattern complexity to stated business requirements.
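RTO and RPO language becomes concrete once it is turned into a comparison. The helper below is a rough sketch under stated assumptions: worst-case data loss equals the interval between backups or replication points, and the restore duration already includes detection and decision time. The function name and all durations are hypothetical.

```python
from datetime import timedelta


def meets_objectives(backup_interval, restore_duration, rpo, rto):
    """Check a backup-and-restore plan against recovery objectives."""
    worst_case_data_loss = backup_interval  # failure just before the next backup
    worst_case_downtime = restore_duration  # assumed to include detection time
    return worst_case_data_loss <= rpo and worst_case_downtime <= rto


# Nightly backups cannot satisfy a one-hour RPO, however fast the restore is.
print(meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_duration=timedelta(hours=2),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=8),
))
# Five-minute replication points with the same restore path do.
print(meets_objectives(
    backup_interval=timedelta(minutes=5),
    restore_duration=timedelta(hours=2),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=8),
))
```

Running the check in the other direction is just as useful: if a cheap plan already meets both objectives, a more elaborate multi-Region design is overengineering.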
Resiliency scenario checks
| Scenario | Poor answer pattern | Better direction |
|---|---|---|
| EC2-hosted web app fails when one instance dies | Single instance | ALB plus Auto Scaling across AZs |
| Database outage causes application outage | Single database instance | Managed Multi-AZ or replicated architecture depending on service |
| Users lose session on instance replacement | Local session storage | External session store or stateless design |
| Downstream consumers fail during a traffic spike | Direct synchronous calls only | SQS buffering and scalable consumers |
| Accidental object deletion | No object protection | Versioning, retention controls, backups, or replication as required |
| Region-level recovery required | Single-Region only | Cross-Region backup, replication, or active patterns as requirements justify |
High-Performing Architecture Checklist
Performance selection areas
| Area | Review | Ready signal |
|---|---|---|
| Compute performance | Instance families conceptually, Auto Scaling, serverless concurrency concepts | You choose scaling pattern before tuning individual servers |
| Storage performance | S3 request patterns, EBS fit, EFS fit, caching | You select storage based on workload access pattern |
| Database performance | Read replicas, caching, partitioning, indexes, query pattern | You know when the bottleneck is read load, write design, or query model |
| Network performance | Placement, edge caching, Direct Connect, Global Accelerator, CloudFront | You reduce latency using the correct network or edge service |
| Application performance | Decoupling, async processing, batching, caching | You reduce synchronous bottlenecks |
| Analytics performance | Redshift, Athena, OpenSearch Service, data lake patterns | You avoid using OLTP stores for heavy analytics when inappropriate |
Checklist:
- Add caching when repeated reads are slowing the system and freshness requirements allow it.
- Use CloudFront for cacheable content close to users.
- Use read replicas or caching for read-heavy relational workloads when appropriate.
- Use DynamoDB access-pattern design rather than relational thinking for key-value workloads.
- Use SQS to smooth bursts and prevent overload.
- Use asynchronous processing when users do not need to wait for background work.
- Choose the right storage class and storage type for latency and throughput needs.
- Avoid assuming bigger instances are always the best performance answer.
- Recognize when a managed service automatically handles scaling dimensions that would otherwise be operational work.
Cost-Optimized Architecture Checklist
Cost levers to know
| Cost lever | Use when | Be careful not to |
|---|---|---|
| Right sizing | Compute or database is overprovisioned | Undersize and violate performance requirements |
| Auto Scaling | Demand varies | Treat scaling as only a performance feature |
| Savings Plans or Reserved Instances concepts | Usage is predictable | Apply to workloads that may disappear or change significantly |
| Spot | Workload is fault-tolerant and interruptible | Use for critical non-interruptible stateful work |
| S3 lifecycle | Access frequency changes over time | Move data to a class that violates retrieval needs |
| Managed services | Operations cost and complexity matter | Ignore feature fit or workload constraints |
| Caching | Repeated expensive reads | Cache stale data when freshness is critical |
| Serverless | Event-driven or variable demand | Force-fit serverless where workload model does not fit |
| Budgets and Cost Explorer | Visibility and cost monitoring | Treat monitoring as optimization by itself |
| Tagging | Allocation and governance | Assume tags reduce cost without policy or action |
Checklist:
- Choose Spot only for workloads designed for interruption.
- Use lifecycle policies for storage cost optimization.
- Use managed scaling to avoid paying for idle fixed capacity.
- Choose serverless for spiky event-driven workloads when requirements fit.
- Consider commitment-based purchase options (Savings Plans or Reserved Instances) for steady-state workloads.
- Identify NAT gateway charges, data transfer, idle resources, and overprovisioned compute as cost-review areas.
- Use budgets and cost tools for visibility, not as substitutes for architecture changes.
- Avoid selecting the cheapest answer if it violates durability, availability, or security.
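To make the lifecycle item above concrete, here is a sketch that builds an S3 lifecycle configuration as a plain Python dictionary. The structure follows the shape the S3 lifecycle API accepts as far as I know it (for example, as the `LifecycleConfiguration` argument to boto3's `put_bucket_lifecycle_configuration`), but check the current documentation before relying on it. The helper name `build_lifecycle_rule`, the `logs/` prefix, and the day counts are assumptions made for illustration. Nothing here calls AWS.

```python
import json
from typing import Any, Dict


def build_lifecycle_rule(prefix: str, ia_after_days: int,
                         archive_after_days: int, expire_after_days: int) -> Dict[str, Any]:
    """Build one lifecycle rule that tiers objects down, then expires them (hypothetical helper)."""
    if not (ia_after_days < archive_after_days < expire_after_days):
        raise ValueError("day thresholds must increase: infrequent access < archive < expire")
    return {
        "ID": f"tier-down-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Example values only: pick thresholds from the scenario's access and retrieval requirements.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(lifecycle_configuration, indent=2))
```

The validation step mirrors the caution in the cost-lever table: a lifecycle rule saves money only if each transition still meets the retrieval needs stated in the scenario.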
Monitoring, Logging, and Operations Checklist
Observability and governance services
| Need | AWS service or feature to review | Ready distinction |
|---|---|---|
| Infrastructure metrics and alarms | CloudWatch metrics and alarms | Performance/health signals |
| Application and system logs | CloudWatch Logs | Log collection and search |
| Distributed tracing | X-Ray | Request path and latency analysis |
| API activity audit | CloudTrail | Who did what through AWS APIs |
| Configuration history/compliance | AWS Config | What changed in resource configuration |
| Automated operational tasks | Systems Manager | Patch Manager, Run Command, Inventory, and Session Manager concepts |
| Event-driven operations | EventBridge | React to service events or scheduled rules |
| Service health visibility | AWS Health | AWS service event awareness |
| Cost and best-practice checks | Trusted Advisor concepts | Recommendations, not enforcement by itself |
| Security findings aggregation | Security Hub | Centralized security posture findings |
Checklist:
- Choose CloudWatch for metrics, logs, and alarms.
- Choose CloudTrail for API audit trails.
- Choose AWS Config for configuration history and compliance rules.
- Choose X-Ray when tracing application requests is required.
- Choose Systems Manager for operational management without opening broad inbound access.
- Use EventBridge to react to operational events.
- Centralize logs for multi-account or compliance scenarios.
- Alarm on symptoms that affect users, not only infrastructure internals.
- Know the difference between detection, alerting, audit, and remediation.
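As an illustration of reacting to operational events, the sketch below shows an EventBridge-style event pattern and a deliberately simplified matcher. The pattern fields (`source`, `detail-type`, `detail`) follow the EventBridge event structure as I understand it; the `matches` function is a hypothetical teaching aid that supports only exact-value lists and nested objects, not the full pattern language (prefix, numeric, and `anything-but` matching are left out).

```python
from typing import Any, Dict


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Simplified matcher: nested dicts recurse; a list means 'any of these exact values'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Pattern: EC2 instances that have stopped or been terminated.
stopped_instance_pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

# Sample events with made-up instance IDs.
stopped_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
running_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}

stopped_matches = matches(stopped_instance_pattern, stopped_event)
running_matches = matches(stopped_instance_pattern, running_event)
```

This is the detection step only. Alerting (for example, an SNS target) and remediation (for example, a Lambda or Systems Manager target) are separate decisions, which is the distinction the last checklist item asks for.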
Migration and Hybrid Architecture Checklist
| Requirement | Service or pattern to consider | Ready check |
|---|---|---|
| Move files or objects efficiently | DataSync | You know it is for managed data transfer workflows |
| Connect on-premises applications to AWS storage | Storage Gateway | You can match file, volume, or tape-style concepts at a high level |
| Migrate databases | DMS | You can identify database migration and replication scenarios |
| Transfer data when network is constrained | Snow Family | You recognize offline or edge data movement patterns |
| Secure managed file transfer | Transfer Family | You can identify SFTP/FTPS/FTP-style managed transfer needs |
| Hybrid private connectivity | VPN or Direct Connect | You can choose based on dedicated connectivity and encryption requirements |
| Many networks need centralized routing | Transit Gateway | You understand hub-style routing |
| Migrate DNS or route users | Route 53 | You can apply routing policies conceptually |
| Choose between rehost, replatform, and refactor | Migration strategy concepts | You can identify operational tradeoffs without overcomplicating |
Can you do this?
- Choose DMS for database migration rather than general file transfer.
- Choose DataSync for file/object transfer rather than application messaging.
- Choose Storage Gateway for hybrid storage access patterns.
- Choose VPN for encrypted internet-based connectivity.
- Choose Direct Connect when dedicated connectivity is the key requirement.
- Choose Transit Gateway when many VPCs and networks need centralized connectivity.
- Identify migration answers that minimize downtime when the scenario requires it.
- Avoid choosing a complex migration tool when a managed service import/export or replication feature directly fits.
Service Selection Quick Reference
Compute and application front end
| If the scenario says… | Think first |
|---|---|
| “HTTP path-based routing” | ALB |
| “TCP/UDP high-performance traffic” | NLB |
| “Third-party firewall appliance insertion” | Gateway Load Balancer |
| “Global static content delivery” | CloudFront |
| “DNS failover or weighted routing” | Route 53 |
| “No server management for events” | Lambda |
| “Container orchestration” | ECS or EKS |
| “Containers without managing hosts” | Fargate |
| “Self-healing fleet of instances” | Auto Scaling group |
| “Custom OS dependencies” | EC2 |
Data and storage
| If the scenario says… | Think first |
|---|---|
| “Object storage” | S3 |
| “Shared file system for Linux workloads” | EFS |
| “Block volume attached to EC2” | EBS |
| “Windows file share” | FSx for Windows File Server |
| “Relational database” | RDS or Aurora |
| “Key-value at scale” | DynamoDB |
| “Read-heavy relational workload” | Read replicas or cache, depending on context |
| “Session store or repeated reads” | ElastiCache or DynamoDB, depending on access pattern |
| “Analytics warehouse” | Redshift |
| “Search” | OpenSearch Service |
| “Graph relationships” | Neptune |
Integration
| If the scenario says… | Think first |
|---|---|
| “Buffer work” | SQS |
| “Fan out to subscribers” | SNS |
| “Route events by rules” | EventBridge |
| “Orchestrate steps” | Step Functions |
| “Streaming records” | Kinesis |
| “Legacy broker compatibility” | Amazon MQ |
| “Dead-letter handling” | Dead-letter queue (DLQ) configured on an SQS queue or SNS subscription |
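The dead-letter row above can be sketched as a small simulation of retry-then-park behavior. This is an in-memory model, not SQS itself: `drain_with_dlq` and the sample handler are hypothetical, message bodies are assumed unique, and the exact moment a real queue moves a message to its DLQ (based on its receive count and the configured `maxReceiveCount`) is simplified here to "after the configured number of failed attempts".

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def drain_with_dlq(messages: List[str], handler: Callable[[str], None],
                   max_receive_count: int) -> Dict[str, List[str]]:
    """Process messages, retrying failures and parking repeat offenders in a dead-letter list."""
    queue: Deque[str] = deque(messages)
    receive_counts: Dict[str, int] = {message: 0 for message in messages}
    processed: List[str] = []
    dead_letter: List[str] = []
    while queue:
        message = queue.popleft()
        receive_counts[message] += 1
        try:
            handler(message)
            processed.append(message)  # success: comparable to deleting the message
        except Exception:
            if receive_counts[message] >= max_receive_count:
                dead_letter.append(message)  # attempts exhausted: set aside for inspection
            else:
                queue.append(message)  # becomes available again for another attempt
    return {"processed": processed, "dead_letter": dead_letter}


attempts: Dict[str, int] = {}


def sample_handler(message: str) -> None:
    """Hypothetical consumer: 'poison' always fails, 'flaky' fails once, others succeed."""
    attempts[message] = attempts.get(message, 0) + 1
    if message == "poison":
        raise ValueError("cannot parse message")
    if message == "flaky" and attempts[message] == 1:
        raise TimeoutError("downstream timeout")


result = drain_with_dlq(["ok", "flaky", "poison"], sample_handler, max_receive_count=3)
```

The transient failure recovers on retry, while the message that can never succeed stops blocking the consumer after three attempts. That separation is what a DLQ buys in a decoupled design.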
Common Weak Areas and Exam Traps
| Weak area | Why candidates miss it | How to fix it |
|---|---|---|
| Security group vs network ACL | Both filter network traffic | Remember that security groups are stateful and attach to resources, while network ACLs are stateless and apply at the subnet level |
| Public vs private subnet | Candidates focus on subnet name instead of route path | Determine whether there is a route to an internet gateway and appropriate addressing |
| NAT gateway vs internet gateway | Both relate to internet access | A NAT gateway lets private-subnet resources initiate outbound connections; an internet gateway provides internet routing for resources with public addresses |
| VPC endpoint vs NAT | Both can reach AWS services | Endpoint keeps supported service traffic private without general internet traversal |
| IAM role vs IAM user | Static credentials feel simpler | Prefer roles and temporary credentials for AWS services and cross-account access |
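| SCP vs IAM policy | Both mention permissions | SCPs set the maximum permissions available in the accounts they apply to and never grant access; IAM and resource policies grant or deny access within those limits |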
| KMS vs Secrets Manager | Both appear in security scenarios | KMS manages encryption keys; Secrets Manager stores and can rotate secrets |
| CloudTrail vs CloudWatch | Both are monitoring-related | CloudTrail audits API activity; CloudWatch handles metrics, logs, and alarms |
| AWS Config vs CloudTrail | Both track changes | Config records resource configuration state; CloudTrail records API calls |
| Multi-AZ vs read replica | Both improve database architecture | Multi-AZ is availability-focused; read replicas are read-scaling-focused |
| Backup vs replication | Both protect data | Backup supports restore; replication keeps another copy synchronized according to configuration |
| SQS vs SNS | Both are messaging services | SQS queues work; SNS publishes notifications to subscribers |
| EventBridge vs SNS | Both distribute events | EventBridge adds event bus routing and filtering patterns |
| Step Functions vs Lambda chain | Both can coordinate tasks | Step Functions makes workflow state, retries, and branching explicit |
| DynamoDB vs RDS | Both store application data | DynamoDB requires key/access-pattern design; RDS supports relational SQL patterns |
| EBS vs EFS vs S3 | All are storage | EBS is block, EFS is file, S3 is object |
| CloudFront vs Route 53 | Both affect user traffic | CloudFront caches/delivers content; Route 53 resolves and routes DNS |
| Cost-first answers | Lowest cost may violate requirements | Re-check security, durability, availability, and performance constraints before choosing |
| Overengineering | Candidates choose the most advanced service | Prefer the simplest managed design that satisfies all stated requirements |
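The "public vs private subnet" row is a good one to practice as a rule rather than a definition. The sketch below classifies a subnet from its route table alone, assuming a simplified route representation: the `destination` and `target` keys and the sample IDs are hypothetical, and the check looks only for a default route whose target is an internet gateway (`igw-` prefix). It ignores addressing, which the table above also calls out as a requirement for real internet reachability.

```python
from typing import Dict, List


def is_public_subnet(routes: List[Dict[str, str]]) -> bool:
    """Return True when the route table has a default route to an internet gateway."""
    return any(
        route["destination"] in ("0.0.0.0/0", "::/0") and route["target"].startswith("igw-")
        for route in routes
    )


# Hypothetical route tables; the gateway IDs are made up.
web_subnet_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0abc1234"},
]
app_subnet_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "nat-0def5678"},
]
isolated_subnet_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
]

web_is_public = is_public_subnet(web_subnet_routes)
app_is_public = is_public_subnet(app_subnet_routes)
isolated_is_public = is_public_subnet(isolated_subnet_routes)
```

A subnet named "public" with the second route table would still be private by this rule, because its default route points at a NAT gateway rather than an internet gateway.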
Scenario Practice Prompts
Use these prompts to test whether you can reason like a solutions architect instead of recalling isolated facts.
Secure design prompts
- A Lambda function needs to read from an S3 bucket. What identity should it use, and where are permissions defined?
- An application in a private subnet needs to call AWS APIs without traversing the public internet. What network pattern should you consider?
- A company needs to prevent accounts in an organization from using disallowed services. Is this an IAM policy or an SCP use case?
- A database password is currently stored in application code. Which service pattern improves this?
- Auditors ask who changed a security group. Which logging or audit service is most relevant?
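For the first prompt above, one way to check your answer is to write out the two policy documents involved. The sketch below builds them as Python dictionaries: a trust policy that lets the Lambda service assume an execution role, and a permissions policy that allows reading objects from one bucket. The helper names and the bucket name `example-reports-bucket` are hypothetical, and the policies are intentionally minimal; a real function usually also needs logging permissions.

```python
import json
from typing import Any, Dict


def lambda_trust_policy() -> Dict[str, Any]:
    """Trust policy: who may assume the role (here, the Lambda service)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_read_policy(bucket_name: str, prefix: str = "") -> Dict[str, Any]:
    """Permissions policy: what the role may do (read objects under an optional prefix)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}*",
            }
        ],
    }


trust = lambda_trust_policy()
permissions = s3_read_policy("example-reports-bucket", prefix="daily/")  # hypothetical bucket
print(json.dumps(permissions, indent=2))
```

The split answers both halves of the prompt: the identity is an IAM role assumed by the function, and the permissions live in a policy attached to that role, with no long-lived access keys stored in code. A bucket policy can also grant or restrict access on the resource side.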
Resilient design prompts
- A web tier runs on one EC2 instance and must survive instance failure. What changes are required?
- A relational database must remain available during an AZ issue. Which managed availability pattern applies?
- A queue consumer occasionally fails after receiving a message. What retry and failure-handling pattern should you review?
- Users in different parts of the world report slow static content downloads. Which edge service should you consider?
- An accidental S3 object delete must be recoverable. Which S3 features should you review?
Performance and cost prompts
- A read-heavy application is overloading its database. Should you consider a cache, a read replica, a data model change, or some combination, depending on the scenario?
- A nightly batch job can be restarted if interrupted. Which compute purchase or capacity pattern might reduce cost?
- A workload has unpredictable traffic and spends much of the day idle. Which serverless or scaling options fit?
- Old S3 data is rarely accessed but must be retained. Which lifecycle decision is likely?
- An application performs long synchronous tasks during user checkout. How can decoupling improve performance and reliability?
Final-Week Review Checklist
| Timeframe | Focus | Actions |
|---|---|---|
| 7 days out | Identify weak domains | Take a mixed practice set, tag every miss by topic, and list the top five recurring causes |
| 6 days out | Security and IAM | Rework IAM, KMS, S3 access, VPC security, CloudTrail, and Config scenarios |
| 5 days out | Networking | Draw VPC flows, route tables, subnet types, NAT, endpoints, peering, Transit Gateway, VPN, and Direct Connect decisions |
| 4 days out | Compute and scaling | Review EC2, Auto Scaling, load balancers, Lambda, containers, and event-driven scaling |
| 3 days out | Storage and databases | Compare S3/EBS/EFS/FSx and RDS/Aurora/DynamoDB/ElastiCache/Redshift/OpenSearch |
| 2 days out | Resiliency and cost | Review Multi-AZ, backup, replication, DR cues, Spot, lifecycle, right sizing, and managed scaling |
| 1 day out | Mixed scenario judgment | Do light review only; focus on eliminating bad answers and explaining why the best answer satisfies all requirements |
Final checks:
- I can explain the difference between security, reliability, performance, cost, and operations requirements in a scenario.
- I can choose a service because of a requirement, not because it sounds familiar.
- I can identify when AWS managed services reduce operational overhead.
- I can read policy, network, and architecture descriptions carefully before answering.
- I can eliminate answers that violate private networking, least privilege, or high availability requirements.
- I can explain why the correct answer is better than the second-best answer.
- I have practiced mixed questions, not only single-service flashcards.
Practical Next Step
Turn every unchecked item into a targeted practice task. For each weak topic, review the AWS service decision points, then answer several scenario-based questions that force you to choose between similar services. Focus especially on IAM, VPC networking, storage selection, database availability, decoupling, monitoring, and cost tradeoffs, because these areas often separate memorization from architect-level readiness for SAA-C03.