AWS SAA-C03: Design High-Performing Architectures

Try 10 focused AWS SAA-C03 questions on Design High-Performing Architectures, with explanations, then continue with IT Mastery.


Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try AWS SAA-C03 on Web · View full AWS SAA-C03 practice page

Topic snapshot

  • Exam route: AWS SAA-C03

  • Topic area: Design High-Performing Architectures

  • Blueprint weight: 24%

  • Page purpose: Focused sample questions before returning to mixed practice

How to use this topic drill

Use this page to isolate Design High-Performing Architectures for AWS SAA-C03. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.

  • First attempt: Answer without checking the explanation first. Record the fact, rule, calculation, or judgment point that controlled your answer.

  • Review: Read the explanation even when you were correct. Record why the best answer is stronger than the closest distractor.

  • Repair: Repeat only missed or uncertain items after a short break. Record the pattern behind misses, not the answer letter.

  • Transfer: Return to mixed practice once the topic feels stable. Record whether the same skill holds up when the topic is no longer obvious.

Blueprint context: 24% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.

Sample questions

These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.

Question 1

Topic: Design High-Performing Architectures

A company runs a stateless web application on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. Load testing shows that a single instance can reliably process 400 HTTP requests/second while keeping latency within the target SLO.

During a major marketing event, the company expects a peak of 2,100 requests/second. The solutions architect will configure a scheduled scaling action to set the Auto Scaling group’s desired capacity 5 minutes before the event so that performance stays within targets while minimizing overprovisioning.

To meet the expected peak load, what desired capacity (number of instances) should the scheduled action set? Round up to the nearest whole instance.

Options:

  • A. 8 instances

  • B. 6 instances

  • C. 5 instances

  • D. 7 instances

Best answer: B

Explanation: The key requirement is to size the Auto Scaling group so that the total instance capacity meets or exceeds the expected peak of 2,100 requests/second, based on the known per-instance throughput.

From load testing:

  • Each instance can handle: 400 requests/second (while staying within latency SLO)
  • Expected peak load: 2,100 requests/second

We need to find the minimum whole number of instances such that:

\[ \text{instances} \times 400 \ge 2,100 \]

Variables used:

  • \(R_{\text{per instance}} = 400\) requests/second per instance
  • \(R_{\text{peak}} = 2,100\) requests/second (peak)

One-line calculation:

\[ \text{instances} = \left\lceil \frac{R_{\text{peak}}}{R_{\text{per\ instance}}} \right\rceil = \left\lceil \frac{2,100}{400} \right\rceil = \left\lceil 5.25 \right\rceil = 6 \]

Rounding up is required because you cannot run a fractional instance, and using fewer than 6 instances would not provide enough capacity. Therefore, the scheduled action should set the desired capacity to 6 instances.

This aligns with good Auto Scaling design: provision just enough instances to meet performance targets while avoiding unnecessary cost.
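As a quick self-check, the same ceiling-division sizing can be sketched in Python (the helper name is illustrative, not part of any AWS API):

```python
import math

def required_instances(peak_rps: float, per_instance_rps: float) -> int:
    """Smallest whole number of instances whose combined throughput
    meets or exceeds the expected peak load."""
    return math.ceil(peak_rps / per_instance_rps)

# Values from the question: 2,100 req/s peak, 400 req/s per instance.
print(required_instances(2_100, 400))  # 6
```

The same helper reproduces the read-replica sizing in Question 2: `required_instances(18_000, 5_000)` returns 4.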


Question 2

Topic: Design High-Performing Architectures

A company runs an Amazon RDS for MySQL DB instance with a write-intensive primary. To scale a read-intensive reporting workload, the company will send all read traffic to RDS read replicas only. Each read replica can handle 5,000 read requests/second. Peak read traffic is estimated at 18,000 read requests/second. Assuming traffic is evenly balanced and you must round up to the next whole instance, how many read replicas are required to meet the peak read load?

Options:

  • A. 4 read replicas

  • B. 3 read replicas

  • C. 2 read replicas

  • D. 5 read replicas

Best answer: A

Explanation: This scenario focuses on scaling a read-intensive workload by offloading reads from the primary Amazon RDS for MySQL instance to read replicas. Read replicas are designed to horizontally scale read throughput by adding additional instances that serve read-only queries.

The stem specifies that all read traffic (18,000 read requests/second at peak) will be sent to read replicas, and each replica can handle 5,000 read requests/second. The primary is assumed to handle only writes, so its read capacity is not counted.

To determine the number of replicas, divide the required read throughput by the per-replica capacity and then round up to the next whole instance, because you cannot provision a fraction of a replica and you must ensure capacity meets or exceeds the requirement.

Using the numbers provided:

  • Required read throughput: 18,000 read requests/second
  • Per-replica capacity: 5,000 read requests/second

Number of replicas needed is:

\[ \text{replicas} = \left\lceil \frac{18{,}000}{5{,}000} \right\rceil = \left\lceil 3.6 \right\rceil = 4 \]

Thus, four read replicas are required to meet the peak read demand while keeping the primary focused on write operations. This design aligns with the performance efficiency pillar by right-sizing capacity and using RDS read replicas to support a read-intensive workload without overprovisioning.


Question 3

Topic: Design High-Performing Architectures

An e-commerce company uses a Multi-AZ Amazon RDS for MySQL db.r6g.large instance for all reads and writes. About 80% of queries are read-only product lookups. CPU averages 85%, and read latency is increasing. The application is mission-critical and must retain Multi-AZ high availability. The company must keep the existing instance size due to reserved instances and wants minimal application changes. Which change best improves performance and cost-efficiency?

Options:

  • A. Convert the RDS instance from Multi-AZ to Single-AZ to free capacity for additional read queries.

  • B. Upgrade the RDS instance to a larger instance class to increase CPU and memory capacity for all queries.

  • C. Create one or more Amazon RDS for MySQL read replicas and modify the application to send read-only queries to the replicas.

  • D. Migrate the database to Amazon DynamoDB and redesign the application to use a key-value access pattern.

Best answer: C

Explanation: The workload is clearly read-heavy: 80% of queries are read-only product lookups. The current Amazon RDS for MySQL instance is under high CPU load and experiencing increasing read latency, but the company must keep the existing instance size because of reserved instances and must maintain Multi-AZ high availability.

In such a scenario, the most effective optimization is to scale reads horizontally rather than vertically. Amazon RDS for MySQL read replicas are designed for exactly this pattern: they asynchronously replicate data from the primary and can serve read-only traffic. By directing read-heavy queries (such as product lookups and catalog browsing) to the read replicas, the primary instance can focus on writes and critical transactional operations.

This approach:

  • Preserves the existing primary instance size (satisfies the reserved instance constraint).
  • Retains Multi-AZ on the primary for high availability.
  • Targets the dominant bottleneck (read queries) without a disruptive engine or schema change.
  • Allows incremental scaling: each additional read replica adds roughly one more unit of read capacity, improving performance per dollar in a predictable way.

Other options either violate explicit constraints (changing instance size or removing Multi-AZ), or require a full redesign of the data model and application, which is not acceptable given the requirement for minimal application changes.


Question 4

Topic: Design High-Performing Architectures

A retail company ingests clickstream events into a single Amazon Kinesis Data Stream. Each event averages 2 KB. Peak volume is 30,000 events per second. A shard supports up to 1 MB/s or 1,000 records/s of writes. Assume 1 MB = 1,000 KB and round up. What is the minimum shard count required?

Options:

  • A. 60 shards

  • B. 90 shards

  • C. 15 shards

  • D. 30 shards

Best answer: A

Explanation: This question tests how to size an Amazon Kinesis Data Stream based on required write throughput and records per second.

Variables used:

  • R = 30,000 events/s (record rate)
  • S = 2KB/event (average event size)
  • shard_data_limit = 1,000KB/s (1MB/s per shard)
  • shard_record_limit = 1,000 records/s per shard

Step 1 – Calculate required data throughput.

Total write throughput required:

\[ \text{throughput (KB/s)} = R \times S = 30{,}000 \times 2 = 60{,}000\,\text{KB/s} = 60\,\text{MB/s} \]

Number of shards needed to satisfy data throughput:

\[ \text{shards}_{\text{data}} = \left\lceil \frac{60{,}000}{1{,}000} \right\rceil = \lceil 60 \rceil = 60 \]

Step 2 – Check the records-per-second limit.

Each shard can ingest up to 1,000 records/s, so:

\[ \text{shards}_{\text{records}} = \left\lceil \frac{30{,}000}{1{,}000} \right\rceil = \lceil 30 \rceil = 30 \]

Step 3 – Take the maximum of the two.

Kinesis must satisfy both limits, so the required shard count is the maximum:

\[ \text{shards} = \max(\text{shards}_{\text{data}},\,\text{shards}_{\text{records}}) = \max(60, 30) = 60 \]

Therefore, the minimum shard count that meets both throughput and record-rate requirements is 60 shards.
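The two-limit check above can be expressed as a short Python sketch (the function name is illustrative):

```python
import math

def min_shards(events_per_s: int, event_kb: float,
               shard_kb_per_s: int = 1_000,
               shard_records_per_s: int = 1_000) -> int:
    """A stream must satisfy BOTH per-shard limits, so take the max."""
    by_data = math.ceil(events_per_s * event_kb / shard_kb_per_s)
    by_records = math.ceil(events_per_s / shard_records_per_s)
    return max(by_data, by_records)

# 30,000 events/s at 2 KB each: the data limit dominates.
print(min_shards(30_000, 2))  # 60
```

With smaller records the record-rate limit can dominate instead; for 0.5 KB events the same call returns 30, set by the 1,000 records/s limit.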


Question 5

Topic: Design High-Performing Architectures

A company has an on-premises data center and 20 VPCs across two AWS Regions. The company needs:

  • Private, high-throughput connectivity from on premises to all VPCs
  • Centralized, scalable routing (no full-mesh VPC peering)
  • Connectivity that continues if any single network link fails
  • Minimal management of individual VPN tunnels

Which THREE architectures meet these requirements? (Select THREE.)

Options:

  • A. Create a 10 Gbps AWS Direct Connect link aggregation group (LAG) with two physical connections to a Direct Connect gateway. Associate the Direct Connect gateway with a Transit Gateway in each Region and attach all VPCs using Transit Gateway VPC attachments.

  • B. Establish individual Site-to-Site VPN connections from the on-premises router to each VPC over the internet. Use dynamic routing (BGP) on each VPN to exchange routes.

  • C. Create a single AWS Direct Connect connection to a Direct Connect gateway associated with a Transit Gateway in each Region. Attach all VPCs to the Transit Gateways. Rely on the inherent availability of Direct Connect and do not configure any backup connectivity.

  • D. Provision a single AWS Direct Connect connection to one shared “hub” VPC. Peer that VPC with all other VPCs in both Regions in a hub-and-spoke pattern. Configure static routes on the on-premises router and VPC route tables.

  • E. Provision one AWS Direct Connect connection to a Direct Connect gateway associated with a Transit Gateway in each Region. Attach all VPCs to their Regional Transit Gateway. Also configure two Site-to-Site VPN connections from on premises to the Transit Gateways as backup paths using BGP for automatic failover.

  • F. Provision two separate AWS Direct Connect connections at different locations, both to the same Direct Connect gateway. Associate the Direct Connect gateway with a Transit Gateway in each Region and attach all VPCs. Use BGP so traffic can fail over between Direct Connect connections.

Correct answers: A, E and F

Explanation: The scenario calls for a global hybrid network that connects an on-premises data center to many VPCs across two Regions. The key requirements are:

  • High-throughput, private connectivity → favors AWS Direct Connect over VPN-only designs.
  • Centralized, scalable routing without full-mesh VPC peering → favors AWS Transit Gateway or similar hub-and-spoke constructs.
  • Resilience to a single network link failure → needs more than one physical or logical path.
  • Minimal management of individual VPN tunnels → avoid per-VPC VPN connections.

Architectures that use Direct Connect + Direct Connect gateway + Transit Gateway provide scalable hub-and-spoke routing for many VPCs and can be made highly available either by adding multiple Direct Connect circuits (LAG or separate locations) or by combining Direct Connect with a small number of backup VPN tunnels using BGP-based failover.

Designs that rely on a single Direct Connect without backup, or many individual VPNs, either fail the resiliency requirement or create significant operational overhead and performance risk.


Question 6

Topic: Design High-Performing Architectures

A solutions architect adds Amazon RDS read replicas and introduces an in-memory cache with Amazon ElastiCache to reduce query latency and handle more concurrent read requests without changing the primary database instance size. Which AWS Well-Architected Framework pillar does this action primarily support?

Options:

  • A. Performance Efficiency

  • B. Cost Optimization

  • C. Security

  • D. Reliability

Best answer: A

Explanation: The scenario describes architectural changes aimed at reducing query latency and supporting more concurrent read traffic: adding Amazon RDS read replicas and introducing an in-memory cache using Amazon ElastiCache. Both techniques are core patterns in the Performance Efficiency pillar of the AWS Well-Architected Framework.

Performance Efficiency is about using computing resources efficiently to meet requirements and maintaining that efficiency as demand changes and technologies evolve. Read replicas increase read throughput by horizontally scaling reads, while caching returns results from memory instead of disk-based storage, decreasing latency and offloading work from the primary database. These changes directly target performance metrics such as response time and throughput.

Although these decisions can also influence other pillars (for example, potentially improving perceived availability or even cost), the primary intent and most direct impact in this scenario is on performance, not reliability, cost, or security.


Question 7

Topic: Design High-Performing Architectures

A company runs a Java web application on Amazon ECS tasks in private subnets. The application connects to an Amazon RDS for MySQL DB and experiences connection storms during traffic spikes, exhausting DB connections and increasing latency. The company requires improved connection scalability through pooling, no hard-coded database credentials, and that all database traffic stay within the VPC without traversing the internet. The operations team also wants the solution to require minimal application code changes. Which solution BEST meets these requirements?

Options:

  • A. Increase the RDS instance class to a larger size to support more connections, and move database credentials into AWS Systems Manager Parameter Store for secure retrieval at startup.

  • B. Move the RDS instance to a public subnet, route ECS traffic through a NAT gateway to the DB’s public endpoint, implement client-side connection pooling in the application, and store credentials in AWS Secrets Manager.

  • C. Create an Amazon RDS Proxy in the same VPC, enable IAM authentication, update the application to connect to the proxy endpoint instead of the DB endpoint, and attach an IAM role to the ECS task for authentication.

  • D. Enable the RDS Data API, create an interface VPC endpoint for the API, and modify the application to call the Data API using the AWS SDK with credentials from AWS Secrets Manager.

Best answer: C

Explanation: Amazon RDS Proxy provides managed connection pooling, IAM-based authentication (no stored database password), and an endpoint inside the VPC, allowing the application to scale database connections securely with only a minor connection-string change. Option A does not add pooling, option B sends database traffic outside the VPC, and option D requires rewriting data access around the Data API, which conflicts with the minimal-code-change requirement.
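In practice, the minor change is typically just the hostname in the connection string. For example (the endpoint names below are placeholders, not real resources):

```text
# Before: application connects directly to the DB endpoint
jdbc:mysql://mydb.abc123xyz.us-east-1.rds.amazonaws.com:3306/appdb

# After: application connects to the RDS Proxy endpoint instead
jdbc:mysql://app-proxy.proxy-abc123xyz.us-east-1.rds.amazonaws.com:3306/appdb
```

With IAM authentication enabled, the ECS task's IAM role is used to generate short-lived auth tokens in place of a static password, so no credentials need to be hard-coded.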


Question 8

Topic: Design High-Performing Architectures

Which of the following statements about read-intensive, write-intensive, and mixed database workloads are INCORRECT? (Select THREE.)

Options:

  • A. Write-optimized architectures, such as using a streaming ingestion layer before the database, are commonly used for high-ingest workloads like clickstream or IoT data.

  • B. For a write-intensive workload with very few reads, adding more read replicas is the recommended way to scale write throughput.

  • C. Amazon DynamoDB is primarily optimized for read-heavy workloads and should be avoided for write-intensive use cases.

  • D. Amazon RDS read replicas are primarily used to offload read traffic and do not increase the write capacity of the primary DB instance.

  • E. For mixed workloads with unpredictable spikes, using a single large database instance and turning off autoscaling features is preferred to avoid scaling overhead.

  • F. Horizontally sharding data across multiple database instances can improve the maximum aggregate throughput for both reads and writes.

Correct answers: B, C and E

Explanation: Read- and write-intensive workloads drive different database scaling patterns. For read-heavy workloads, you typically add read replicas or caching layers so the primary database handles fewer read queries. In Amazon RDS, read replicas asynchronously copy data from the primary instance and therefore increase read capacity but not write capacity.

To scale both reads and writes, you can horizontally partition (shard) data across multiple databases so each instance handles a subset of the data and load. For high-ingest or write-heavy workloads such as clickstream, IoT, or time-series data, architectures often use write-optimized systems like Amazon DynamoDB, Amazon Timestream, or a streaming ingestion layer (for example, Amazon Kinesis Data Streams) in front of downstream storage. For mixed and spiky workloads, combining horizontal scaling and autoscaling is usually better than relying on a single, large fixed instance, because it improves elasticity and resilience.
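The horizontal-sharding idea in the explanation can be sketched as key-based routing (a simplified illustration; the function name and key format are hypothetical):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard with a stable hash so the same key always
    lands on the same database instance. hashlib is used instead of
    Python's built-in hash(), which is randomized per process."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Both reads and writes for a key go to its shard, so aggregate
# throughput grows with the number of shards.
print(shard_for("customer-1042", 4))
```

Because each instance owns a disjoint subset of keys, adding shards raises the aggregate ceiling for reads and writes alike, which is what distinguishes sharding from read replicas.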


Question 9

Topic: Design High-Performing Architectures

A company runs an OLTP workload on Amazon RDS for MySQL. The team collected the following metrics and requirements:

  • DB deployment: Amazon RDS MySQL, us-east-1, Multi-AZ. Notes: All writes must stay in us-east-1. Priority: Must.

  • Traffic & latency: EU users: 220 ms avg read latency. Notes: Reads: 80%, Writes: 20%. Priority: High.

  • Availability: RPO≤5 min, RTO≤15 min. Notes: Primary region only (us-east-1). Priority: High.

  • Budget: +25% monthly DB cost maximum. Notes: Prefer read-optimized scaling. Priority: Medium.

Based only on this information, which change is the most appropriate to meet the requirements for EU users while maintaining availability and cost goals?

Options:

  • A. Migrate to an Aurora MySQL global database with the primary cluster in eu-west-1 and a secondary cluster in us-east-1, sending all writes to eu-west-1.

  • B. Convert the us-east-1 RDS instance to single-AZ and create two additional read replicas in us-east-1 to handle global reads.

  • C. Add a cross-Region read replica of the existing RDS MySQL DB in eu-west-1 and route EU read-only traffic there while keeping the us-east-1 Multi-AZ primary.

  • D. Create a separate standalone RDS MySQL instance in eu-west-1 and synchronize data using custom application-level replication between regions.

Best answer: C

Explanation: The exhibit highlights the key constraints: the database currently runs on “Amazon RDS MySQL, us-east-1, Multi-AZ,” EU users experience 220 ms average read latency, the availability requirements of “RPO≤5 min, RTO≤15 min” apply to the primary region only, the budget allows at most a 25% increase in monthly DB cost, and the company prefers read-optimized scaling.

Because reads are 80% of traffic and EU users see high latency, the best approach is to move read workloads for EU users closer to them, while keeping writes and HA in us-east-1. An RDS cross-Region read replica in eu-west-1 allows asynchronous replication of data from the us-east-1 Multi-AZ primary and provides low-latency local reads to EU users.

This design keeps the Multi-AZ deployment in us-east-1 to satisfy the high-priority RPO/RTO goals for the primary region, respects the constraint “All writes must stay in us-east-1,” and uses a cost-effective read replica pattern rather than creating a second full Multi-AZ cluster or a more expensive global configuration. Other options either fail the latency requirement, violate the write-location constraint, or undermine availability.


Question 10

Topic: Design High-Performing Architectures

A company is migrating a CPU-intensive analytics batch job to Amazon EC2. The workload runs 3 hours every evening at 80–90% CPU utilization and does not use GPUs. The following instance options have been shortlisted:

  • t3.large: 2 vCPUs; CPU credits: Yes (burstable); General purpose; for spiky CPU workloads

  • m6i.large: 2 vCPUs; CPU credits: No (fixed CPU); General purpose

  • c6i.large: 2 vCPUs; CPU credits: No (fixed CPU); Compute optimized for high-CPU workloads

  • g5.xlarge: 4 vCPUs; CPU credits: No (fixed CPU); Includes GPU; intended for graphics/ML

Based only on the information in the exhibit, which instance type is the MOST appropriate choice for this workload?

Options:

  • A. t3.large

  • B. g5.xlarge

  • C. c6i.large

  • D. m6i.large

Best answer: C

Explanation: The exhibit compares four instance options and explicitly calls out which workloads they target. The key line is in the c6i.large row: Compute optimized for high-CPU workloads with CPU credits: No (fixed CPU). This directly matches a batch job that runs at 80–90% CPU for several hours.

By contrast, the t3.large line shows CPU credits: Yes (burstable) and notes it is for spiky CPU workloads. Burstable instances are designed for occasional peaks, not long periods of sustained high utilization, where they can exhaust CPU credits and throttle. The g5.xlarge row includes Includes GPU; intended for graphics/ML, which clashes with the stem stating the workload does not use GPUs. The m6i.large row indicates a fixed CPU, general-purpose instance, which would be acceptable but is less targeted than the compute-optimized c6i.large specifically described as suitable for high-CPU workloads.

A common misread is to focus only on vCPUs or to assume that the GPU in g5.xlarge must be better because it has more hardware. However, the exhibit clearly limits g5.xlarge to graphics/ML, and the scenario rules out GPU use, so the compute-optimized, fixed-CPU c6i.large is the best fit based solely on the provided data.

Continue with full practice

Use the AWS SAA-C03 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try AWS SAA-C03 on Web · View AWS SAA-C03 Practice Test

Free review resource

Read the AWS SAA-C03 Cheat Sheet on Tech Exam Lexicon, then return to IT Mastery for timed practice.

Revised on Thursday, May 14, 2026