Try 10 focused AWS SAA-C03 questions on Design High-Performing Architectures, with explanations, then continue with IT Mastery.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
| Field | Detail |
|---|---|
| Exam route | AWS SAA-C03 |
| Topic area | Design High-Performing Architectures |
| Blueprint weight | 24% |
| Page purpose | Focused sample questions before returning to mixed practice |
Use this page to isolate Design High-Performing Architectures for AWS SAA-C03. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.
| Pass | What to do | What to record |
|---|---|---|
| First attempt | Answer without checking the explanation first. | The fact, rule, calculation, or judgment point that controlled your answer. |
| Review | Read the explanation even when you were correct. | Why the best answer is stronger than the closest distractor. |
| Repair | Repeat only missed or uncertain items after a short break. | The pattern behind misses, not the answer letter. |
| Transfer | Return to mixed practice once the topic feels stable. | Whether the same skill holds up when the topic is no longer obvious. |
Blueprint context: 24% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.
These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.
Topic: Design High-Performing Architectures
A company runs a stateless web application on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. Load testing shows that a single instance can reliably process 400 HTTP requests/second while keeping latency within the target SLO.
During a major marketing event, the company expects a peak of 2,100 requests/second. The solutions architect will configure a scheduled scaling action to set the Auto Scaling group’s desired capacity 5 minutes before the event so that performance stays within targets while minimizing overprovisioning.
To meet the expected peak load, what should the scheduled action set as the desired capacity (number of instances)? Round up to the nearest whole instance.
Select the correct answer, expressing your reasoning in whole instances.
Options:
A. 8 instances
B. 6 instances
C. 5 instances
D. 7 instances
Best answer: B
Explanation: The key requirement is to size the Auto Scaling group so that the total instance capacity meets or exceeds the expected peak of 2,100 requests/second, based on the known per-instance throughput.
From load testing, one instance sustains 400 requests/second while staying within the latency SLO, and the expected event peak is 2,100 requests/second.
Variables used:
- \(R_{\text{peak}} = 2{,}100\) requests/second (expected peak load)
- \(R_{\text{per instance}} = 400\) requests/second (measured per-instance throughput)
We need to find the minimum whole number of instances such that:
\[ \text{instances} \times 400 \ge 2{,}100 \]
One-line calculation:
\[ \text{instances} = \left\lceil \frac{R_{\text{peak}}}{R_{\text{per instance}}} \right\rceil = \left\lceil \frac{2{,}100}{400} \right\rceil = \left\lceil 5.25 \right\rceil = 6 \]
Rounding up is required because you cannot run a fractional instance, and fewer than 6 instances would not provide enough capacity. Therefore, the scheduled action should set the desired capacity to 6 instances.
This aligns with good Auto Scaling design: provision just enough instances to meet performance targets while avoiding unnecessary cost.
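The ceiling calculation above can be sketched as a small Python helper (a hypothetical self-check, not part of any AWS SDK):

```python
import math

def required_instances(peak_rps: float, per_instance_rps: float) -> int:
    """Minimum whole instance count whose combined throughput covers peak demand."""
    n = math.ceil(peak_rps / per_instance_rps)
    # Sanity check: total capacity meets or exceeds the peak.
    assert n * per_instance_rps >= peak_rps
    return n

# 2,100 req/s peak at 400 req/s per instance.
print(required_instances(2_100, 400))  # → 6
```

The same formula applies to the read-replica question that follows: ceil(18,000 / 5,000) = 4.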
Topic: Design High-Performing Architectures
A company runs an Amazon RDS for MySQL DB instance with a write-intensive primary. To scale a read-intensive reporting workload, the company will send all read traffic to RDS read replicas only. Each read replica can handle 5,000 read requests/second. Peak read traffic is estimated at 18,000 read requests/second. Assuming traffic is evenly balanced and you must round up to the next whole instance, how many read replicas are required to meet the peak read load?
Options:
A. 4 read replicas
B. 3 read replicas
C. 2 read replicas
D. 5 read replicas
Best answer: A
Explanation: This scenario focuses on scaling a read-intensive workload by offloading reads from the primary Amazon RDS for MySQL instance to read replicas. Read replicas are designed to horizontally scale read throughput by adding additional instances that serve read-only queries.
The stem specifies that all read traffic (18,000 read requests/second at peak) will be sent to read replicas, and each replica can handle 5,000 read requests/second. The primary is assumed to handle only writes, so its read capacity is not counted.
To determine the number of replicas, divide the required read throughput by the per-replica capacity and then round up to the next whole instance, because you cannot provision a fraction of a replica and you must ensure capacity meets or exceeds the requirement.
Using the numbers provided, the number of replicas needed is:
\[ \text{replicas} = \left\lceil \frac{18{,}000}{5{,}000} \right\rceil = \left\lceil 3.6 \right\rceil = 4 \]
Thus, four read replicas are required to meet the peak read demand while keeping the primary focused on write operations. This design aligns with the performance efficiency pillar by right-sizing capacity and using RDS read replicas to support a read-intensive workload without overprovisioning.
Topic: Design High-Performing Architectures
An e-commerce company uses a Multi-AZ Amazon RDS for MySQL db.r6g.large instance for all reads and writes. About 80% of queries are read-only product lookups. CPU averages 85%, and read latency is increasing. The application is mission-critical and must retain Multi-AZ high availability. The company must keep the existing instance size due to reserved instances and wants minimal application changes. Which change best improves performance and cost-efficiency?
Options:
A. Convert the RDS instance from Multi-AZ to Single-AZ to free capacity for additional read queries.
B. Upgrade the RDS instance to a larger instance class to increase CPU and memory capacity for all queries.
C. Create one or more Amazon RDS for MySQL read replicas and modify the application to send read-only queries to the replicas.
D. Migrate the database to Amazon DynamoDB and redesign the application to use a key-value access pattern.
Best answer: C
Explanation: The workload is clearly read-heavy: 80% of queries are read-only product lookups. The current Amazon RDS for MySQL instance is under high CPU load and experiencing increasing read latency, but the company must keep the existing instance size because of reserved instances and must maintain Multi-AZ high availability.
In such a scenario, the most effective optimization is to scale reads horizontally rather than vertically. Amazon RDS for MySQL read replicas are designed for exactly this pattern: they asynchronously replicate data from the primary and can serve read-only traffic. By directing read-heavy queries (such as product lookups and catalog browsing) to the read replicas, the primary instance can focus on writes and critical transactional operations.
This approach:
- keeps the existing instance size, preserving the reserved instance commitment;
- retains the Multi-AZ configuration of the primary, so high availability is unchanged;
- offloads the read-heavy 80% of traffic from the overloaded primary, reducing CPU pressure and read latency;
- requires only a routing change so read-only queries target the replica endpoints, satisfying the minimal-change constraint.
The other options either violate explicit constraints (changing the instance size or removing Multi-AZ) or require a full redesign of the data model and application, which is not acceptable given the requirement for minimal application changes.
Topic: Design High-Performing Architectures
A retail company ingests clickstream events into a single Amazon Kinesis Data Stream. Each event averages 2 KB. Peak volume is 30,000 events per second. A shard supports up to 1 MB/s or 1,000 records/s of writes. Assume 1 MB = 1,000 KB and round up. What is the minimum shard count required?
Options:
A. 60 shards
B. 90 shards
C. 15 shards
D. 30 shards
Best answer: A
Explanation: This question tests how to size an Amazon Kinesis Data Stream based on required write throughput and records per second.
Variables used:
- \(R = 30{,}000\) records/second (peak event rate)
- \(S = 2\) KB (average record size)
- Per-shard write limits: 1 MB/s (1,000 KB/s) and 1,000 records/s
Step 1 – Calculate required data throughput.
Total write throughput required:
\[ \text{throughput (KB/s)} = R \times S = 30{,}000 \times 2 = 60{,}000\,\text{KB/s} = 60\,\text{MB/s} \]
Number of shards needed to satisfy data throughput:
\[ \text{shards}_{\text{data}} = \left\lceil \frac{60{,}000}{1{,}000} \right\rceil = 60 \]
Step 2 – Check the records-per-second limit.
Each shard can ingest up to 1,000 records/s, so:
\[ \text{shards}_{\text{records}} = \left\lceil \frac{30{,}000}{1{,}000} \right\rceil = 30 \]
Step 3 – Take the maximum of the two.
Kinesis must satisfy both limits, so the required shard count is the maximum:
\[ \text{shards} = \max(\text{shards}_{\text{data}},\,\text{shards}_{\text{records}}) = \max(60, 30) = 60 \]
Therefore, the minimum shard count that meets both throughput and record-rate requirements is 60 shards.
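The two-constraint sizing can be checked with a short Python sketch (a hypothetical helper using the per-shard limits stated in the question):

```python
import math

def min_shards(records_per_s: int, avg_record_kb: float,
               shard_kb_per_s: int = 1_000, shard_records_per_s: int = 1_000) -> int:
    """Shard count is bound by BOTH the data-rate and record-rate limits;
    take the maximum of the two ceilings."""
    by_data = math.ceil(records_per_s * avg_record_kb / shard_kb_per_s)
    by_records = math.ceil(records_per_s / shard_records_per_s)
    return max(by_data, by_records)

print(min_shards(30_000, 2))  # → 60 (the data-rate limit dominates here)
```

With smaller records the record-rate limit can dominate instead: at 0.5 KB per event, 30,000 events/s needs only 15 shards of bandwidth but still 30 shards for record throughput.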
Topic: Design High-Performing Architectures
A company has an on-premises data center and 20 VPCs across two AWS Regions. The company needs:
- private, high-bandwidth connectivity between the data center and all VPCs;
- highly available connectivity with automatic failover if a link fails;
- a routing design that scales to many VPCs without per-VPC connection sprawl.
Which THREE architectures meet these requirements? (Select THREE.)
Options:
A. Create a 10 Gbps AWS Direct Connect link aggregation group (LAG) with two physical connections to a Direct Connect gateway. Associate the Direct Connect gateway with a Transit Gateway in each Region and attach all VPCs using Transit Gateway VPC attachments.
B. Establish individual Site-to-Site VPN connections from the on-premises router to each VPC over the internet. Use dynamic routing (BGP) on each VPN to exchange routes.
C. Create a single AWS Direct Connect connection to a Direct Connect gateway associated with a Transit Gateway in each Region. Attach all VPCs to the Transit Gateways. Rely on the inherent availability of Direct Connect and do not configure any backup connectivity.
D. Provision a single AWS Direct Connect connection to one shared “hub” VPC. Peer that VPC with all other VPCs in both Regions in a hub-and-spoke pattern. Configure static routes on the on-premises router and VPC route tables.
E. Provision one AWS Direct Connect connection to a Direct Connect gateway associated with a Transit Gateway in each Region. Attach all VPCs to their Regional Transit Gateway. Also configure two Site-to-Site VPN connections from on premises to the Transit Gateways as backup paths using BGP for automatic failover.
F. Provision two separate AWS Direct Connect connections at different locations, both to the same Direct Connect gateway. Associate the Direct Connect gateway with a Transit Gateway in each Region and attach all VPCs. Use BGP so traffic can fail over between Direct Connect connections.
Correct answers: A, E and F
Explanation: The scenario calls for a global hybrid network that connects an on-premises data center to many VPCs across two Regions. The key requirements are scalable routing to many VPCs, resilient connectivity with automatic failover, and consistent, private network performance.
Architectures that use Direct Connect + Direct Connect gateway + Transit Gateway provide scalable hub-and-spoke routing for many VPCs and can be made highly available either by adding multiple Direct Connect circuits (LAG or separate locations) or by combining Direct Connect with a small number of backup VPN tunnels using BGP-based failover.
Designs that rely on a single Direct Connect without backup, or many individual VPNs, either fail the resiliency requirement or create significant operational overhead and performance risk.
Topic: Design High-Performing Architectures
A solutions architect adds Amazon RDS read replicas and introduces an in-memory cache with Amazon ElastiCache to reduce query latency and handle more concurrent read requests without changing the primary database instance size. Which AWS Well-Architected Framework pillar does this action primarily support?
Options:
A. Performance Efficiency
B. Cost Optimization
C. Security
D. Reliability
Best answer: A
Explanation: The scenario describes architectural changes aimed at reducing query latency and supporting more concurrent read traffic: adding Amazon RDS read replicas and introducing an in-memory cache using Amazon ElastiCache. Both techniques are core patterns in the Performance Efficiency pillar of the AWS Well-Architected Framework.
Performance Efficiency is about using computing resources efficiently to meet requirements and maintaining that efficiency as demand changes and technologies evolve. Read replicas increase read throughput by horizontally scaling reads, while caching returns results from memory instead of disk-based storage, decreasing latency and offloading work from the primary database. These changes directly target performance metrics such as response time and throughput.
Although these decisions can also influence other pillars (for example, potentially improving perceived availability or even cost), the primary intent and most direct impact in this scenario is on performance, not reliability, cost, or security.
Topic: Design High-Performing Architectures
A company runs a Java web application on Amazon ECS tasks in private subnets. The application connects to an Amazon RDS for MySQL DB and experiences connection storms during traffic spikes, exhausting DB connections and increasing latency. The company requires improved connection scalability through pooling, no hard-coded database credentials, and that all database traffic stay within the VPC without traversing the internet. The operations team also wants the solution to require minimal application code changes. Which solution BEST meets these requirements?
Options:
A. Increase the RDS instance class to a larger size to support more connections, and move database credentials into AWS Systems Manager Parameter Store for secure retrieval at startup.
B. Move the RDS instance to a public subnet, route ECS traffic through a NAT gateway to the DB’s public endpoint, implement client-side connection pooling in the application, and store credentials in AWS Secrets Manager.
C. Create an Amazon RDS Proxy in the same VPC, enable IAM authentication, update the application to connect to the proxy endpoint instead of the DB endpoint, and attach an IAM role to the ECS task for authentication.
D. Enable the RDS Data API, create an interface VPC endpoint for the API, and modify the application to call the Data API using the AWS SDK with credentials from AWS Secrets Manager.
Best answer: C
Explanation: Amazon RDS Proxy provides managed connection pooling, IAM-based authentication, and an in-VPC endpoint, so the application can scale database connections securely with only a minor connection-string change. Option A adds capacity but no pooling, so connection storms persist; option B exposes the database outside the VPC, violating the in-VPC requirement; option D (the RDS Data API) targets Aurora rather than RDS for MySQL and would require rewriting the data access layer.
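The pooling idea that RDS Proxy delivers as a managed service can be illustrated with a minimal stdlib sketch (assumption: this is illustrative code, not the proxy's actual implementation): many logical requests share a small, fixed set of connections instead of each opening its own.

```python
import queue

class ConnectionPool:
    """Toy pool: callers borrow a connection and return it, so the total
    number of open connections never exceeds the pool size."""
    def __init__(self, connect, size: int):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(connect())

    def run(self, fn):
        conn = self._pool.get()       # blocks instead of opening a new connection
        try:
            return fn(conn)
        finally:
            self._pool.put(conn)      # connection is reused, not closed

# Usage: 100 logical requests, but only 5 "connections" are ever opened.
opened = []
pool = ConnectionPool(lambda: opened.append(1) or len(opened), size=5)
results = [pool.run(lambda conn: conn) for _ in range(100)]
print(len(opened))  # → 5
```

This is why pooling tames connection storms: the database sees a bounded, stable connection count while the application tier scales out.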
Topic: Design High-Performing Architectures
Which of the following statements about read-intensive, write-intensive, and mixed database workloads are INCORRECT? (Select THREE.)
Options:
A. Write-optimized architectures, such as using a streaming ingestion layer before the database, are commonly used for high-ingest workloads like clickstream or IoT data.
B. For a write-intensive workload with very few reads, adding more read replicas is the recommended way to scale write throughput.
C. Amazon DynamoDB is primarily optimized for read-heavy workloads and should be avoided for write-intensive use cases.
D. Amazon RDS read replicas are primarily used to offload read traffic and do not increase the write capacity of the primary DB instance.
E. For mixed workloads with unpredictable spikes, using a single large database instance and turning off autoscaling features is preferred to avoid scaling overhead.
F. Horizontally sharding data across multiple database instances can improve the maximum aggregate throughput for both reads and writes.
Correct answers: B, C and E
Explanation: Read- and write-intensive workloads drive different database scaling patterns. For read-heavy workloads, you typically add read replicas or caching layers so the primary database handles fewer read queries. In Amazon RDS, read replicas asynchronously copy data from the primary instance and therefore increase read capacity but not write capacity.
To scale both reads and writes, you can horizontally partition (shard) data across multiple databases so each instance handles a subset of the data and load. For high-ingest or write-heavy workloads such as clickstream, IoT, or time-series data, architectures often use write-optimized systems like Amazon DynamoDB, Amazon Timestream, or a streaming ingestion layer (for example, Amazon Kinesis Data Streams) in front of downstream storage. For mixed and spiky workloads, combining horizontal scaling and autoscaling is usually better than relying on a single, large fixed instance, because it improves elasticity and resilience.
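The horizontal-sharding pattern described above can be sketched with a minimal hash-based router (illustrative only; a production design would also need resharding, replication, and hot-key handling):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash routing: a given key always maps to the same shard,
    so both reads and writes for that key hit only that shard."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Four in-memory "databases" standing in for four DB instances.
shards = [dict() for _ in range(4)]

def put(key, value):
    shards[shard_for(key, len(shards))][key] = value

def get(key):
    return shards[shard_for(key, len(shards))].get(key)

put("user:42", {"name": "Ada"})
print(get("user:42"))  # → {'name': 'Ada'}
```

Because each instance holds only a subset of the keys, aggregate read and write throughput grows with the shard count, which is exactly why statement F is correct.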
Topic: Design High-Performing Architectures
A company runs an OLTP workload on Amazon RDS for MySQL. The team collected the following metrics and requirements:
| Item | Value | Notes | Priority |
|---|---|---|---|
| DB deployment | Amazon RDS MySQL, us-east-1, Multi-AZ | All writes must stay in us-east-1. | Must |
| Traffic & latency | EU users: 220 ms avg read latency | Reads: 80%, Writes: 20% | High |
| Availability | RPO ≤ 5 min, RTO ≤ 15 min | Primary region only (us-east-1) | High |
| Budget | +25% monthly DB cost maximum | Prefer read-optimized scaling | Medium |
Based only on this information, which change is the most appropriate to meet the requirements for EU users while maintaining availability and cost goals?
Options:
A. Migrate to an Aurora MySQL global database with the primary cluster in eu-west-1 and a secondary cluster in us-east-1, sending all writes to eu-west-1.
B. Convert the us-east-1 RDS instance to single-AZ and create two additional read replicas in us-east-1 to handle global reads.
C. Add a cross-Region read replica of the existing RDS MySQL DB in eu-west-1 and route EU read-only traffic there while keeping the us-east-1 Multi-AZ primary.
D. Create a separate standalone RDS MySQL instance in eu-west-1 and synchronize data using custom application-level replication between regions.
Best answer: C
Explanation: The exhibit highlights three key points: the database currently runs on “Amazon RDS MySQL, us-east-1, Multi-AZ,” EU users experience “EU users: 220 ms avg read latency,” and availability requirements of “RPO≤5 min, RTO≤15 min” apply to the primary region only, with a budget of “+25% monthly DB cost maximum” and a preference to “Prefer read-optimized scaling.”
Because reads are 80% of traffic and EU users see high latency, the best approach is to move read workloads for EU users closer to them, while keeping writes and HA in us-east-1. An RDS cross-Region read replica in eu-west-1 allows asynchronous replication of data from the us-east-1 Multi-AZ primary and provides low-latency local reads to EU users.
This design keeps the Multi-AZ deployment in us-east-1 to satisfy the high-priority RPO/RTO goals for the primary region, respects the constraint “All writes must stay in us-east-1,” and uses a cost-effective read replica pattern rather than creating a second full Multi-AZ cluster or a more expensive global configuration. Other options either fail the latency requirement, violate the write-location constraint, or undermine availability.
Topic: Design High-Performing Architectures
A company is migrating a CPU-intensive analytics batch job to Amazon EC2. The workload runs 3 hours every evening at 80–90% CPU utilization and does not use GPUs. The following instance options have been shortlisted:
| Instance type | vCPUs | CPU credits | Notes |
|---|---|---|---|
| t3.large | 2 | Yes (burstable) | General purpose; for spiky CPU workloads |
| m6i.large | 2 | No (fixed CPU) | General purpose |
| c6i.large | 2 | No (fixed CPU) | Compute optimized for high-CPU workloads |
| g5.xlarge | 4 | No (fixed CPU) | Includes GPU; intended for graphics/ML |
Based only on the information in the exhibit, which instance type is the MOST appropriate choice for this workload?
Options:
A. t3.large
B. g5.xlarge
C. c6i.large
D. m6i.large
Best answer: C
Explanation: The exhibit compares four instance options and explicitly calls out which workloads they target. The key line is the c6i.large row: “Compute optimized for high-CPU workloads” with CPU credits “No (fixed CPU).” This directly matches a batch job that runs at 80–90% CPU for several hours.
By contrast, the t3.large row shows CPU credits “Yes (burstable)” and notes it is for spiky CPU workloads. Burstable instances are designed for occasional peaks, not long periods of sustained high utilization, during which they can exhaust CPU credits and throttle. The g5.xlarge row notes “Includes GPU; intended for graphics/ML,” which clashes with the stem stating the workload does not use GPUs. The m6i.large row indicates a fixed-CPU, general-purpose instance, which would be acceptable but is less targeted than the compute-optimized c6i.large specifically described as suitable for high-CPU workloads.
A common misread is to focus only on vCPUs or to assume that the GPU in g5.xlarge must be better because it has more hardware. However, the exhibit clearly limits g5.xlarge to graphics/ML, and the scenario rules out GPU use, so the compute-optimized, fixed-CPU c6i.large is the best fit based solely on the provided data.
Use the AWS SAA-C03 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Try AWS SAA-C03 on Web View AWS SAA-C03 Practice Test
Read the AWS SAA-C03 Cheat Sheet on Tech Exam Lexicon, then return to IT Mastery for timed practice.