Try 10 focused AWS SOA-C03 questions on Reliability and Business Continuity, with explanations, then continue with IT Mastery.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
| Field | Detail |
|---|---|
| Exam route | AWS SOA-C03 |
| Topic area | Reliability and Business Continuity |
| Blueprint weight | 22% |
| Page purpose | Focused sample questions before returning to mixed practice |
Use this page to isolate Reliability and Business Continuity for AWS SOA-C03. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.
| Pass | What to do | What to record |
|---|---|---|
| First attempt | Answer without checking the explanation first. | The fact, rule, calculation, or judgment point that controlled your answer. |
| Review | Read the explanation even when you were correct. | Why the best answer is stronger than the closest distractor. |
| Repair | Repeat only missed or uncertain items after a short break. | The pattern behind misses, not the answer letter. |
| Transfer | Return to mixed practice once the topic feels stable. | Whether the same skill holds up when the topic is no longer obvious. |
Blueprint context: 22% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.
These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.
Topic: Reliability and Business Continuity
Which statement is INCORRECT about mapping RPO and RTO to backup frequency, restore approach, and operational runbooks on AWS?
Options:
A. RPO sets maximum data loss; it drives backup frequency.
B. RTO sets maximum downtime; it drives restore approach and automation.
C. RTO sets maximum data loss; it determines backup frequency.
D. Runbooks should be tested and timed to meet the stated RTO.
Best answer: C
Explanation: RPO measures how much data loss is acceptable, so it maps to how frequently you back up or replicate data. RTO measures how long the service can be down, so it maps to how you restore and how operational runbooks are executed and tested. The incorrect statement swaps these definitions and mappings.
RPO (Recovery Point Objective) is the maximum acceptable amount of data loss measured in time (for example, “up to 15 minutes of changes”). Operationally, this maps to backup/replication frequency and whether point-in-time recovery is available.
RTO (Recovery Time Objective) is the maximum acceptable downtime (for example, “service must be back within 30 minutes”). Operationally, this maps to the restore approach and the runbooks needed to hit the time target, such as automated restore pipelines, pre-provisioned standby capacity, and rehearsed, timed failover procedures.
Key takeaway: RPO drives “how much you can lose”; RTO drives “how fast you must recover.”
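To make the mapping concrete, here is a minimal boto3 sketch (the `Orders` table and `prod-db` instance names are hypothetical) showing how a tight RPO translates into backup mechanics: continuous backups for DynamoDB and automated backups with point-in-time restore for RDS. RTO, by contrast, is met on the restore side by automating and rehearsing the runbook, as the PITR examples later on this page show.

```python
import boto3

dynamodb = boto3.client("dynamodb")
rds = boto3.client("rds")

# RPO: tighten the recovery point by enabling continuous backups (PITR),
# which supports restores to any second within the retention window.
dynamodb.update_continuous_backups(
    TableName="Orders",  # hypothetical table name
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# RPO for RDS: automated backups (retention > 0) enable point-in-time
# restore, bounding data loss to minutes of transaction log, not a full day.
rds.modify_db_instance(
    DBInstanceIdentifier="prod-db",  # hypothetical instance identifier
    BackupRetentionPeriod=7,
    ApplyImmediately=True,
)
```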
Topic: Reliability and Business Continuity
A public application runs on an Auto Scaling group that spans two Availability Zones behind an internet-facing Application Load Balancer (ALB). The operations team wants to improve resiliency and make it easy to shift traffic during maintenance by changing ALB listeners and routing rules.
Which change should the team NOT make?
Options:
A. Configure target group health checks and ensure targets are registered in multiple AZs
B. Add an HTTP (port 80) listener that redirects requests to HTTPS (port 443)
C. Create a listener rule that forwards all traffic to one fixed EC2 instance target
D. Configure the HTTPS listener to forward to two target groups with adjustable weights
Best answer: C
Explanation: For resilient traffic patterns, ALB listener routing should keep multiple healthy targets available across Availability Zones and allow safe traffic shifting between target groups. Sending all requests to one fixed instance removes redundancy and increases blast radius during failures or maintenance.
The core principle is to keep load balancer routing highly available by distributing traffic across multiple healthy targets (ideally across multiple Availability Zones) and using listener behavior to support safe operational changes. ALB listeners can redirect HTTP to HTTPS, and they can use a single forward action to distribute traffic across multiple target groups with weights, which is a common way to do canary releases or maintenance cutovers. Target group health checks are essential so the ALB stops routing to unhealthy targets automatically. In contrast, routing all traffic to one fixed EC2 instance is an operations anti-pattern because it introduces a single point of failure and undermines the resilience gained from Auto Scaling and multi-AZ deployment.
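As a sketch of the weighted-forward pattern described above (both target group ARNs and the listener ARN are placeholders), a single forward action can split traffic between two target groups, supporting canary releases or maintenance cutovers without ever pinning traffic to one fixed target:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Shift 10% of traffic to the "green" target group while "blue" keeps 90%.
# ARNs are placeholders; targets in each group should span multiple AZs.
elbv2.modify_listener(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/web/...",  # placeholder
    DefaultActions=[
        {
            "Type": "forward",
            "ForwardConfig": {
                "TargetGroups": [
                    {"TargetGroupArn": "arn:...:targetgroup/blue/...", "Weight": 90},   # placeholder
                    {"TargetGroupArn": "arn:...:targetgroup/green/...", "Weight": 10},  # placeholder
                ]
            },
        }
    ],
)
```

Adjusting the weights over time (90/10, 50/50, 0/100) gives a gradual, reversible cutover while health checks keep unhealthy targets out of rotation.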
Topic: Reliability and Business Continuity
You need to restore an Amazon DynamoDB table after accidental writes and then confirm the restore is usable before switching the application. Which THREE statements about DynamoDB point-in-time recovery (PITR) and validating the restore are true? (Select THREE.)
Options:
A. A PITR restore overwrites the existing table in place when you specify the same table name.
B. A PITR restore creates a new table and does not modify the source table.
C. Enabling PITR today allows restores to times from before PITR was enabled.
D. A practical validation step is to run representative read queries (GetItem/Query) against the restored table before redirecting traffic.
E. PITR must have been enabled before the incident to restore to a time within the PITR window.
F. A PITR restore requires an on-demand backup to exist before the restore can start.
Correct answers: B, D and E
Explanation: DynamoDB PITR relies on continuous backups that only exist while PITR is enabled, and restoring writes recovered data into a new table. Operationally, you validate the restore by checking the restored table can serve expected reads (and matches required schema/index behavior) before you repoint the application to it.
The core PITR workflow is: ensure PITR was enabled, restore to a specific timestamp within the PITR window, and then validate the new table before cutover. A PITR restore never “rolls back” the existing table in place; it creates a separate restored table that you can test safely.
| Question | Answer |
|---|---|
| Must PITR be enabled before the incident? | Yes — otherwise you can’t restore to that time |
| Does a restore modify the source table? | No — restore creates a new table |
| How do you validate the restore? | Run representative reads on the restored table |
| Can you overwrite the existing table in place? | No — you must cut over to the new table |
| Can you restore to a time before PITR was enabled? | No — no continuous backups exist for that period |
| Does PITR need an on-demand backup? | No — PITR restore uses continuous backups |
Key takeaway: restore to a new table, verify it, then switch traffic to the restored table.
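A minimal boto3 sketch of the restore-to-new-table workflow (table names and the timestamp are illustrative):

```python
import boto3
from datetime import datetime, timezone

dynamodb = boto3.client("dynamodb")

# The restore creates a NEW table; the source table "Orders" is not modified.
dynamodb.restore_table_to_point_in_time(
    SourceTableName="Orders",           # illustrative source table
    TargetTableName="Orders-restored",  # new table to validate before cutover
    RestoreDateTime=datetime(2024, 5, 1, 10, 15, tzinfo=timezone.utc),  # illustrative
)

# Wait until the restored table exists and is ACTIVE before validation reads.
waiter = dynamodb.get_waiter("table_exists")
waiter.wait(TableName="Orders-restored")
```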
Topic: Reliability and Business Continuity
An operations team needs to automate backups for EC2/EBS volumes and Amazon RDS databases. Which statement is INCORRECT?
Options:
A. To automate EBS snapshots, you must stop the EC2 instance before each snapshot.
B. AWS Backup can automate backups using backup plans and can assign resources by tags.
C. Amazon RDS automated backups enable point-in-time restore and restore creates a new DB instance.
D. Amazon Data Lifecycle Manager (DLM) can automate EBS snapshot creation and retention based on tags.
Best answer: A
Explanation: EBS snapshots can be scheduled and taken without stopping an EC2 instance, and they are typically crash-consistent by default. Stopping the instance is not required for automation; it is an optional step when you need stronger, application-consistent backups. AWS Backup, DLM, and RDS automated backups all provide native ways to automate backups and restores.
The incorrect statement is the claim that you must stop an EC2 instance to automate EBS snapshots. Amazon EBS supports taking snapshots of in-use volumes, and services like AWS Backup or Amazon Data Lifecycle Manager can schedule and manage those snapshots without downtime. By default, EBS snapshots are crash-consistent; if you need application-consistent backups, you can coordinate quiescing (for example, using pre/post scripts via AWS Systems Manager) but it is not a requirement to stop the instance.
AWS Backup centralizes scheduling, retention, and lifecycle policies across supported services and can select resources dynamically using tags. DLM is a native EBS feature specifically for automated snapshot and retention management. For databases, RDS automated backups support point-in-time restore by restoring to a new DB instance.
Key takeaway: stopping instances is an optional consistency measure, not a prerequisite for automated EBS snapshots.
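As a sketch of the DLM approach (assuming volumes tagged Backup=Daily and an existing IAM role for DLM; the role ARN is a placeholder), the policy below snapshots in-use volumes on a schedule, with no instance stop required:

```python
import boto3

dlm = boto3.client("dlm")

# Daily crash-consistent snapshots of in-use EBS volumes tagged Backup=Daily.
dlm.create_lifecycle_policy(
    ExecutionRoleArn="arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole",  # placeholder
    Description="Daily EBS snapshots, 7-day retention",
    State="ENABLED",
    PolicyDetails={
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": "Backup", "Value": "Daily"}],
        "Schedules": [
            {
                "Name": "daily-0300-utc",
                "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
                "RetainRule": {"Count": 7},
                "CopyTags": True,
            }
        ],
    },
)
```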
Topic: Reliability and Business Continuity
A web application must remain available during an Availability Zone impairment. You are reviewing a recent change to the application’s Auto Scaling group.
Exhibit: CloudTrail event snippet
```yaml
eventName: UpdateAutoScalingGroup
requestParameters:
  autoScalingGroupName: web-asg
  vpcZoneIdentifier: "subnet-0a1b2c3d4e5f6a7b"
  availabilityZones: ["us-east-1a"]
  desiredCapacity: 6
```
Based only on the exhibit, what is the best next step to implement a multi-AZ compute pattern for this Auto Scaling group?
Options:
A. Add subnets from other AZs to VPCZoneIdentifier
B. Enable the AZRebalance scaling process for the ASG
C. Perform an instance refresh to replace all running instances
D. Increase desiredCapacity to reduce per-instance load
Best answer: A
Explanation: The exhibit indicates the Auto Scaling group is configured for a single Availability Zone. To make the compute layer resilient to an AZ impairment, the ASG must be attached to multiple subnets that map to different AZs. Updating the ASG’s subnet list causes instances to launch across those AZs.
Multi-AZ for an Auto Scaling group in a VPC is achieved by configuring the group to use multiple subnets that reside in different Availability Zones. In the exhibit, the ASG is effectively single-AZ because vpcZoneIdentifier contains only one subnet and availabilityZones lists only one AZ (["us-east-1a"]).
Operationally, the next step is to update the ASG to include additional subnet IDs (for example, subnets in us-east-1b and us-east-1c) in vpcZoneIdentifier so new and replacement instances can launch in multiple AZs. The key takeaway is that subnets (not instance count) determine AZ placement for an ASG in a VPC.
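A one-call sketch of that change (the two additional subnet IDs are placeholders for subnets in us-east-1b and us-east-1c):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# VPCZoneIdentifier is a comma-separated subnet list; adding subnets in other
# AZs lets the ASG launch and replace instances across multiple AZs.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    VPCZoneIdentifier=(
        "subnet-0a1b2c3d4e5f6a7b,"   # existing us-east-1a subnet from the exhibit
        "subnet-0b2c3d4e5f6a7b8c,"   # placeholder us-east-1b subnet
        "subnet-0c3d4e5f6a7b8c9d"    # placeholder us-east-1c subnet
    ),
)
```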
Topic: Reliability and Business Continuity
An operations team uses AWS Backup and wants consistent backup coverage for new production resources without manually adding each resource to a backup plan. The team assigns resources to the backup plan by using a tag-based resource assignment (for example, Environment=Prod) and requires the tag to be applied when resources are created.
Which operations principle does this action most directly demonstrate?
Options:
A. Shared responsibility
B. Automation and standardization
C. Blast-radius reduction
D. Least privilege
Best answer: B
Explanation: Using tags to assign resources to AWS Backup plans makes backup coverage consistent and repeatable as environments grow. This reduces manual configuration drift and the risk of forgetting to include new resources. The core principle is automation and standardization to improve operational reliability.
The core concept is automation/standardization: implementing repeatable mechanisms so that routine operational controls (like backups) are applied consistently as resources change. In AWS Backup, tag-based resource assignments let you automatically include resources in a backup plan based on required tags (for example, Environment=Prod) instead of manually selecting each EBS volume, RDS database, or DynamoDB table. This approach reduces human error, improves auditability, and helps ensure new production resources receive the same protection level without additional change tickets. The key takeaway is that tag-driven assignments operationalize a standard backup policy across resources as they are created.
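A minimal boto3 sketch of a tag-based assignment (the vault name, role ARN, and schedule are assumptions):

```python
import boto3

backup = boto3.client("backup")

# A simple daily plan; rule details are illustrative.
plan = backup.create_backup_plan(
    BackupPlan={
        "BackupPlanName": "prod-daily",
        "Rules": [
            {
                "RuleName": "daily-0300-utc",
                "TargetBackupVaultName": "Default",        # assumed vault name
                "ScheduleExpression": "cron(0 3 * * ? *)",
                "Lifecycle": {"DeleteAfterDays": 35},
            }
        ],
    }
)

# Tag-based selection: any supported resource tagged Environment=Prod is
# included automatically, with no per-resource assignment.
backup.create_backup_selection(
    BackupPlanId=plan["BackupPlanId"],
    BackupSelection={
        "SelectionName": "prod-by-tag",
        "IamRoleArn": "arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole",  # placeholder
        "ListOfTags": [
            {
                "ConditionType": "STRINGEQUALS",
                "ConditionKey": "Environment",
                "ConditionValue": "Prod",
            }
        ],
    },
)
```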
Topic: Reliability and Business Continuity
An application team accidentally deployed code that wrote incorrect values to a DynamoDB table named Orders in us-east-1 at 10:20 UTC. The table had point-in-time recovery (PITR) enabled before the incident. Operations must restore the data to its state at 10:15 UTC and confirm the restore is correct before cutting traffic over.
Which THREE actions should the operations engineer take? (Select THREE.)
Options:
A. Export the table to S3 and import it back into Orders to achieve a point-in-time restore
B. Verify that Orders has point-in-time recovery enabled (continuous backups status is ENABLED)
C. After restore completes, confirm the new table is ACTIVE and validate data with DescribeTable restore details plus a few representative GetItem/Query checks
D. Restore the existing Orders table in place to 10:15 UTC so the application does not need any changes
E. Run RestoreTableToPointInTime to create a new table restored to 10:15 UTC
F. Create an on-demand backup of the corrupted table and restore that backup to 10:15 UTC
Correct answers: B, C and E
Explanation: DynamoDB PITR lets you restore a table to a chosen second within the retention window, but the restore operation creates a new table. Operationally, you confirm PITR is enabled, run a point-in-time restore to the target timestamp, and then validate completion and correctness by checking restore metadata and performing targeted reads against known keys.
The core concept is DynamoDB point-in-time recovery (PITR): it restores table data to an exact timestamp (to the second) within the PITR window and creates a new table rather than changing the existing table.
At a high level you:
1. Verify that PITR (continuous backups) is enabled for the table, as sketched below.
2. Run RestoreTableToPointInTime, giving the restored table a new name.
3. Confirm the restore completed, check the restore details (DescribeTable), and run a small set of representative GetItem/Query reads to confirm expected pre-incident values.

In-place rewinds and “time-travel” from an on-demand backup are not supported by DynamoDB.
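As a sketch of the verification step (option B), the continuous backups status can be read directly; the table name matches the scenario:

```python
import boto3

dynamodb = boto3.client("dynamodb")

desc = dynamodb.describe_continuous_backups(TableName="Orders")
cb = desc["ContinuousBackupsDescription"]
pitr = cb["PointInTimeRecoveryDescription"]

# PointInTimeRecoveryStatus must be ENABLED, and the target timestamp
# (10:15 UTC) must fall inside the [Earliest, Latest] restorable window.
print(cb["ContinuousBackupsStatus"])
print(pitr["PointInTimeRecoveryStatus"])
print(pitr["EarliestRestorableDateTime"], pitr["LatestRestorableDateTime"])
```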
Topic: Reliability and Business Continuity
A company serves a global user base from an Application Load Balancer (ALB) origin in us-east-1. During peak hours, the ALB and web tier show high request rates and increased latency. The operations team is adding Amazon CloudFront to reduce origin load while keeping content reasonably fresh.
Which CloudFront configuration action should be AVOIDED because it prevents effective caching at edge locations?
Options:
A. Use an AWS managed cache policy optimized for caching static content and enable CloudFront compression
B. Enable Origin Shield in the AWS Region closest to the ALB origin to improve cache hit ratios
C. Forward all headers, all cookies, and all query strings and set the default TTL to 0 seconds for all paths
D. Create separate cache behaviors so static paths use long TTLs while dynamic paths use low or disabled caching
Best answer: C
Explanation: CloudFront reduces origin load only when many requests can be served from edge caches. Forwarding all request attributes and setting a 0-second TTL for all paths makes most requests uncacheable or uniquely cached, driving traffic back to the ALB. This violates the principle of minimizing cache-key variation and using appropriate TTLs to increase cache hit ratio.
The core operational goal is to increase CloudFront cache hits so fewer requests reach the origin. CloudFront caches objects based on the cache key (what you include from the request) and how long objects are allowed to stay in cache (TTL). If you forward many values (headers/cookies/query strings), you create many unique cache keys; if you set TTLs to 0, CloudFront must revalidate or fetch from the origin for every request.
To reduce origin load while keeping content fresh:
- Use cache policies that include only the headers, cookies, and query strings the content actually varies on, keeping the cache key small.
- Create separate cache behaviors so static paths get long TTLs while dynamic paths use low or disabled caching.
- Enable compression, and consider Origin Shield in the Region closest to the origin to improve cache hit ratios.
The key takeaway is to avoid configurations that make nearly every request a cache miss.
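As a hedged sketch of a cache-friendly policy for static paths (the policy name is an assumption; in practice the AWS managed CachingOptimized policy covers this case), note how it is the opposite of the configuration to avoid: a minimal cache key and long TTLs instead of all-attributes forwarding with TTL 0:

```python
import boto3

cloudfront = boto3.client("cloudfront")

# Keep the cache key small (no headers/cookies/query strings) and allow long
# TTLs so static objects can be served from edge caches.
cloudfront.create_cache_policy(
    CachePolicyConfig={
        "Name": "static-long-ttl",  # assumed policy name
        "MinTTL": 1,
        "DefaultTTL": 86400,        # 1 day
        "MaxTTL": 31536000,         # 1 year
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }
)
```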
Topic: Reliability and Business Continuity
An operations engineer accidentally deleted many items from a DynamoDB table named Orders in us-east-1 at 09:14 UTC. Point-in-time recovery (PITR) was enabled before the incident. The engineer must restore the data to its state at 09:10 UTC and confirm the restore worked without overwriting the current Orders table.
Which TWO actions should the engineer take?
Options:
A. Create an on-demand backup now and restore it to recover deleted items
B. Use UpdateTable to roll back the existing Orders table in place
C. After the restore is ACTIVE, read from the restored table to verify expected items
D. Enable PITR now and restore the table to 09:10 UTC
E. Create a DynamoDB VPC endpoint and validate the restore using private DNS
F. Restore to a new table using RestoreTableToPointInTime at 09:10 UTC
Correct answers: C and F
Explanation: With DynamoDB PITR, you restore to a new table at the desired point in time, then validate the outcome by confirming the restored table is ACTIVE and that expected data is present. PITR cannot retroactively cover time periods before it was enabled, and DynamoDB does not support in-place rollback of an existing table.
DynamoDB point-in-time recovery lets you recover a table to a specific second within the PITR window, but the restore operation always creates a new table (so you don’t overwrite the current one). In this scenario, the correct operational flow is to restore Orders to a new table name at 09:10 UTC, wait for the restore to complete, and then validate the restored content by performing reads against the restored table.
Typical high-level steps:
1. Call RestoreTableToPointInTime with a new target table name and the 09:10 UTC timestamp.
2. Wait for the restored table to become ACTIVE.
3. Read from the restored table to verify the expected items are present (see the validation sketch below).

Key takeaway: PITR is a restore-to-new-table operation, and validation must include checking the restored data, not just configuration.
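To make the validation step concrete, a small read check against known keys might look like this (the restored table name, key attribute, and key values are illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Confirm the restored table is ACTIVE and carries restore metadata.
table = dynamodb.describe_table(TableName="Orders-restored")["Table"]
assert table["TableStatus"] == "ACTIVE"
print(table.get("RestoreSummary"))  # source table/ARN and restore time

# Spot-check items deleted at 09:14 UTC; they should be present in the
# 09:10 UTC restore. Key name and values are illustrative.
for order_id in ["order-1001", "order-1002"]:
    item = dynamodb.get_item(
        TableName="Orders-restored",
        Key={"OrderId": {"S": order_id}},
    )
    print(order_id, "present" if "Item" in item else "MISSING")
```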
Topic: Reliability and Business Continuity
An operations team uses Amazon FSx and wants to confirm the basic recovery workflow. Which statement about Amazon FSx backups is correct?
Options:
A. FSx backups are stored on the file system itself and are deleted when the file system is deleted.
B. FSx backups automatically replicate to all AWS Regions without additional configuration.
C. An FSx backup is used to create a new FSx file system from that point in time.
D. An FSx backup can be mounted directly by clients as a read-only file share.
Best answer: C
Explanation: Amazon FSx backups are point-in-time restore points. To recover, you restore a backup by creating a new FSx file system that contains the backed-up data. This is the fundamental mechanism FSx provides to meet recovery objectives after data loss or file system failure.
The core concept is that an Amazon FSx backup is not a share you attach to; it is a recovery artifact. Operationally, when you need to recover from accidental deletion or a corrupted file system, you select the appropriate backup (based on the desired recovery point) and restore it by creating a new FSx file system from that backup. After the new file system is available, you update clients (for example, DNS or mount targets) to use the restored file system. The key takeaway is that backups support recovery by enabling point-in-time creation of a replacement file system rather than providing a directly mountable backup volume.
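A minimal restore sketch (the backup ID and subnet are placeholders; required parameters vary by FSx file system type):

```python
import boto3

fsx = boto3.client("fsx")

# Recovery = create a NEW file system from the chosen backup, then repoint
# clients (DNS/mounts) to it. The backup itself is never mounted directly.
fsx.create_file_system_from_backup(
    BackupId="backup-0123456789abcdef0",    # placeholder backup ID
    SubnetIds=["subnet-0a1b2c3d4e5f6a7b"],  # placeholder subnet
)
```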
Use the AWS SOA-C03 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Try AWS SOA-C03 on Web View AWS SOA-C03 Practice Test
Read the AWS SOA-C03 Cheat Sheet on Tech Exam Lexicon, then return to IT Mastery for timed practice.