AWS SOA-C03: Reliability and Business Continuity

Try 10 focused AWS SOA-C03 questions on Reliability and Business Continuity, with explanations, then continue with IT Mastery.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try AWS SOA-C03 on Web
View full AWS SOA-C03 practice page

Topic snapshot

  • Exam route: AWS SOA-C03
  • Topic area: Reliability and Business Continuity
  • Blueprint weight: 22%
  • Page purpose: focused sample questions before returning to mixed practice

How to use this topic drill

Use this page to isolate Reliability and Business Continuity for AWS SOA-C03. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.

  • First attempt: answer without checking the explanation first. Record the fact, rule, calculation, or judgment point that controlled your answer.
  • Review: read the explanation even when you were correct. Record why the best answer is stronger than the closest distractor.
  • Repair: repeat only missed or uncertain items after a short break. Record the pattern behind misses, not the answer letter.
  • Transfer: return to mixed practice once the topic feels stable. Record whether the same skill holds up when the topic is no longer obvious.

Blueprint context: 22% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.

Sample questions

These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.

Question 1

Topic: Reliability and Business Continuity

Which statement is INCORRECT about mapping RPO and RTO to backup frequency, restore approach, and operational runbooks on AWS?

Options:

  • A. RPO sets maximum data loss; it drives backup frequency.

  • B. RTO sets maximum downtime; it drives restore approach and automation.

  • C. RTO sets maximum data loss; it determines backup frequency.

  • D. Runbooks should be tested and timed to meet the stated RTO.

Best answer: C

Explanation: RPO measures how much data loss is acceptable, so it maps to how frequently you back up or replicate data. RTO measures how long the service can be down, so it maps to how you restore and how operational runbooks are executed and tested. The incorrect statement swaps these definitions and mappings.

RPO (Recovery Point Objective) is the maximum acceptable amount of data loss measured in time (for example, “up to 15 minutes of changes”). Operationally, this maps to backup/replication frequency and whether point-in-time recovery is available.

RTO (Recovery Time Objective) is the maximum acceptable downtime (for example, “service must be back within 30 minutes”). Operationally, this maps to the restore approach and runbooks needed to hit the time target, such as:

  • Pre-staged capacity (warm/standby) vs. rebuild-from-backup
  • Automation (AWS Systems Manager Automation, scripts) vs. manual steps
  • Regular restore drills with measured end-to-end times

Key takeaway: RPO drives “how much you can lose”; RTO drives “how fast you must recover.”

  • RPO drives frequency is accurate because tighter RPO requires more frequent backups/replication.
  • RTO drives restore approach is accurate because faster recovery often needs pre-provisioning and automation.
  • Runbooks aligned to RTO is accurate because you must validate the procedure fits the downtime budget.
  • RTO equals data loss is incorrect because data loss targets are defined by RPO, not RTO.
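The RPO-to-frequency mapping above can be sketched as a tiny check. This is an illustrative helper, not an AWS API; the function name and values are assumptions for the example:

```python
# Hypothetical helper showing why RPO drives backup frequency: the
# worst-case data loss is the time since the last backup, so the
# backup interval must not exceed the RPO.

def meets_rpo(backup_interval_minutes: float, rpo_minutes: float) -> bool:
    """Return True when the backup interval satisfies the RPO.

    Worst-case data loss equals the backup interval, so the interval
    must be less than or equal to the RPO for the objective to hold.
    """
    return backup_interval_minutes <= rpo_minutes

# A 15-minute RPO is met by 5-minute backups, but not by hourly backups.
print(meets_rpo(5, 15))    # True
print(meets_rpo(60, 15))   # False
```

The same reasoning does not apply to RTO, which is validated by timing restore drills end to end rather than by a backup schedule.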

Question 2

Topic: Reliability and Business Continuity

A public application runs on an Auto Scaling group that spans two Availability Zones behind an internet-facing Application Load Balancer (ALB). The operations team wants to improve resiliency and make it easy to shift traffic during maintenance by changing ALB listeners and routing rules.

Which change should the team NOT make?

Options:

  • A. Configure target group health checks and ensure targets are registered in multiple AZs

  • B. Add an HTTP (port 80) listener that redirects requests to HTTPS (port 443)

  • C. Create a listener rule that forwards all traffic to one fixed EC2 instance target

  • D. Configure the HTTPS listener to forward to two target groups with adjustable weights

Best answer: C

Explanation: For resilient traffic patterns, ALB listener routing should keep multiple healthy targets available across Availability Zones and allow safe traffic shifting between target groups. Sending all requests to one fixed instance removes redundancy and increases blast radius during failures or maintenance.

The core principle is to keep load balancer routing highly available by distributing traffic across multiple healthy targets (ideally across multiple Availability Zones) and using listener behavior to support safe operational changes. ALB listeners can redirect HTTP to HTTPS, and they can use a single forward action to distribute traffic across multiple target groups with weights, which is a common way to do canary releases or maintenance cutovers. Target group health checks are essential so the ALB stops routing to unhealthy targets automatically. In contrast, routing all traffic to one fixed EC2 instance is an operations anti-pattern because it introduces a single point of failure and undermines the resilience gained from Auto Scaling and multi-AZ deployment.

  • Weighted forwarding is a standard way to shift traffic gradually or during maintenance.
  • HTTP to HTTPS redirect is a common listener configuration and doesn’t reduce availability.
  • Health checks and multi-AZ targets enable automatic failover within the load balancer.
  • Single fixed instance target concentrates all traffic on one host and increases outage risk.
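The weighted forwarding pattern above can be sketched as the request shape the ELBv2 ModifyListener API expects (for example via boto3's `modify_listener`). The target group ARNs here are placeholders, not values from the scenario:

```python
# Sketch of a single forward action that splits traffic across two
# target groups. Adjusting the weights shifts traffic gradually for
# maintenance cutovers or canary releases.

def weighted_forward_action(blue_tg_arn: str, green_tg_arn: str,
                            blue_weight: int, green_weight: int) -> dict:
    """Build the DefaultActions entry for an ALB listener that
    forwards to two target groups with adjustable weights."""
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": blue_weight},
                {"TargetGroupArn": green_tg_arn, "Weight": green_weight},
            ]
        },
    }

# Shift 10% of traffic to the green target group (placeholder ARNs):
action = weighted_forward_action("arn:aws:elasticloadbalancing:blue-tg",
                                 "arn:aws:elasticloadbalancing:green-tg",
                                 90, 10)
# boto3.client("elbv2").modify_listener(
#     ListenerArn="<listener-arn>", DefaultActions=[action])
```

Because both target groups stay registered with health checks, the ALB keeps multiple healthy targets available while the weights change.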

Question 3

Topic: Reliability and Business Continuity

You need to restore an Amazon DynamoDB table after accidental writes and then confirm the restore is usable before switching the application. Which THREE statements about DynamoDB point-in-time recovery (PITR) and validating the restore are true? (Select THREE.)

Options:

  • A. A PITR restore overwrites the existing table in place when you specify the same table name.

  • B. A PITR restore creates a new table and does not modify the source table.

  • C. Enabling PITR today allows restores to times from before PITR was enabled.

  • D. A practical validation step is to run representative read queries (GetItem/Query) against the restored table before redirecting traffic.

  • E. PITR must have been enabled before the incident to restore to a time within the PITR window.

  • F. A PITR restore requires an on-demand backup to exist before the restore can start.

Correct answers: B, D and E

Explanation: DynamoDB PITR relies on continuous backups that only exist while PITR is enabled, and restoring writes recovered data into a new table. Operationally, you validate the restore by checking the restored table can serve expected reads (and matches required schema/index behavior) before you repoint the application to it.

The core PITR workflow is: ensure PITR was enabled, restore to a specific timestamp within the PITR window, and then validate the new table before cutover. A PITR restore never “rolls back” the existing table in place; it creates a separate restored table that you can test safely.

PITR enabled before incident?  OK — otherwise you can’t restore to that time
Restore modifies source table? NO — restore creates a new table
Validation approach?          OK — run representative reads on restored table
Overwrite existing in place?  NO — you must cut over to the new table
Restore earlier than enable?  NO — no continuous backups exist for that period
Needs on-demand backup?       NO — PITR restore uses continuous backups

Key takeaway: restore to a new table, verify it, then switch traffic to the restored table.

  • NO: In-place rollback. DynamoDB PITR does not overwrite an existing table; it restores into a new table.
  • NO: Retroactive history. You cannot restore to timestamps from before PITR was enabled.
  • NO: Backup prerequisite. A PITR restore does not require creating an on-demand backup first.
  • OK: Validate before cutover. Testing real read patterns against the restored table helps confirm restore outcomes safely.
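The restore-to-new-table workflow can be sketched with the boto3 request shape for `restore_table_to_point_in_time`. The table names and timestamp are illustrative, and the live call is commented out so the parameter structure is the focus:

```python
from datetime import datetime, timezone

# Sketch of the DynamoDB PITR restore parameters: the restore always
# targets a NEW table name and never modifies the source table.

def pitr_restore_params(source: str, target: str, when: datetime) -> dict:
    """Build parameters for RestoreTableToPointInTime."""
    return {
        "SourceTableName": source,
        "TargetTableName": target,   # must differ from the source table
        "RestoreDateTime": when,     # any second inside the PITR window
    }

params = pitr_restore_params(
    "Orders", "Orders-restored",
    datetime(2026, 5, 14, 10, 15, tzinfo=timezone.utc))
# ddb = boto3.client("dynamodb")
# ddb.restore_table_to_point_in_time(**params)
# Then validate "Orders-restored" with representative reads before cutover.
```

No on-demand backup is involved; the restore draws on the continuous backups that PITR maintains.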

Question 4

Topic: Reliability and Business Continuity

An operations team needs to automate backups for EC2/EBS volumes and Amazon RDS databases. Which statement is INCORRECT?

Options:

  • A. To automate EBS snapshots, you must stop the EC2 instance before each snapshot.

  • B. AWS Backup can automate backups using backup plans and can assign resources by tags.

  • C. Amazon RDS automated backups enable point-in-time restore and restore creates a new DB instance.

  • D. Amazon Data Lifecycle Manager (DLM) can automate EBS snapshot creation and retention based on tags.

Best answer: A

Explanation: EBS snapshots can be scheduled and taken without stopping an EC2 instance, and they are typically crash-consistent by default. Stopping the instance is not required for automation; it is an optional step when you need stronger, application-consistent backups. AWS Backup, DLM, and RDS automated backups all provide native ways to automate backups and restores.

The incorrect statement is the claim that you must stop an EC2 instance to automate EBS snapshots. Amazon EBS supports taking snapshots of in-use volumes, and services like AWS Backup or Amazon Data Lifecycle Manager can schedule and manage those snapshots without downtime. By default, EBS snapshots are crash-consistent; if you need application-consistent backups, you can coordinate quiescing (for example, using pre/post scripts via AWS Systems Manager) but it is not a requirement to stop the instance.

AWS Backup centralizes scheduling, retention, and lifecycle policies across supported services and can select resources dynamically using tags. DLM is a native EBS feature specifically for automated snapshot and retention management. For databases, RDS automated backups support point-in-time restore by restoring to a new DB instance.

Key takeaway: stopping instances is an optional consistency measure, not a prerequisite for automated EBS snapshots.

  • Stopping required is wrong because EBS snapshots can be taken while volumes are attached and in use.
  • AWS Backup plans/tags is accurate; plans and tag-based assignments are common for automation at scale.
  • DLM for EBS is accurate; DLM automates snapshot scheduling and retention using tag targeting.
  • RDS point-in-time restore is accurate; restores create a new DB instance rather than overwriting the existing one.
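The tag-targeted DLM automation described above can be sketched as a `PolicyDetails` structure for `create_lifecycle_policy`. The tag key/value and schedule are assumptions for illustration, not a recommendation; note that nothing in the policy stops any instance:

```python
# Illustrative DLM PolicyDetails: snapshot in-use EBS volumes carrying
# a given tag on a fixed interval, retaining a fixed number of
# snapshots. No instance stop is involved at any point.

def dlm_policy_details(tag_key: str, tag_value: str,
                       interval_hours: int, retain_count: int) -> dict:
    """Build PolicyDetails for an EBS snapshot lifecycle policy."""
    return {
        "ResourceTypes": ["VOLUME"],
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": "DailySnapshots",
            "CreateRule": {"Interval": interval_hours,
                           "IntervalUnit": "HOURS"},
            "RetainRule": {"Count": retain_count},
        }],
    }

details = dlm_policy_details("Backup", "true", 24, 7)
# boto3.client("dlm").create_lifecycle_policy(
#     ExecutionRoleArn="<role-arn>", Description="daily EBS snapshots",
#     State="ENABLED", PolicyDetails=details)
```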

Question 5

Topic: Reliability and Business Continuity

A web application must remain available during an Availability Zone impairment. You are reviewing a recent change to the application’s Auto Scaling group.

Exhibit: CloudTrail event snippet

eventName: UpdateAutoScalingGroup
requestParameters:
  autoScalingGroupName: web-asg
  vpcZoneIdentifier: "subnet-0a1b2c3d4e5f6a7b"
  availabilityZones: ["us-east-1a"]
  desiredCapacity: 6

Based only on the exhibit, what is the best next step to implement a multi-AZ compute pattern for this Auto Scaling group?

Options:

  • A. Add subnets from other AZs to VPCZoneIdentifier

  • B. Enable the AZRebalance scaling process for the ASG

  • C. Perform an instance refresh to replace all running instances

  • D. Increase desiredCapacity to reduce per-instance load

Best answer: A

Explanation: The exhibit indicates the Auto Scaling group is configured for a single Availability Zone. To make the compute layer resilient to an AZ impairment, the ASG must be attached to multiple subnets that map to different AZs. Updating the ASG’s subnet list causes instances to launch across those AZs.

Multi-AZ for an Auto Scaling group in a VPC is achieved by configuring the group to use multiple subnets that reside in different Availability Zones. In the exhibit, the ASG is effectively single-AZ because vpcZoneIdentifier contains only one subnet and availabilityZones lists only one AZ (["us-east-1a"]).

Operationally, the next step is to update the ASG to include additional subnet IDs (for example, subnets in us-east-1b and us-east-1c) in vpcZoneIdentifier so new and replacement instances can launch in multiple AZs. The key takeaway is that subnets (not instance count) determine AZ placement for an ASG in a VPC.

  • Increase capacity only scales within the same AZ when only us-east-1a is configured.
  • AZRebalance can help redistribute instances, but it cannot use AZs not listed/configured.
  • Instance refresh replaces instances but still launches them in the currently configured subnet/AZ.
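The fix above can be sketched as an UpdateAutoScalingGroup request that extends the subnet list. Only `subnet-0a1b2c3d4e5f6a7b` comes from the exhibit; the other subnet IDs are placeholders for subnets in us-east-1b and us-east-1c:

```python
# Sketch of attaching the ASG to subnets in multiple AZs. Subnets, not
# desired capacity, control where an ASG can place instances in a VPC.

def multi_az_update(asg_name: str, subnet_ids: list) -> dict:
    """Build UpdateAutoScalingGroup parameters with a multi-AZ
    comma-separated subnet list."""
    return {
        "AutoScalingGroupName": asg_name,
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }

params = multi_az_update("web-asg", [
    "subnet-0a1b2c3d4e5f6a7b",  # existing us-east-1a subnet (from exhibit)
    "subnet-placeholder-1b",    # hypothetical us-east-1b subnet
    "subnet-placeholder-1c",    # hypothetical us-east-1c subnet
])
# boto3.client("autoscaling").update_auto_scaling_group(**params)
```

After the update, new and replacement instances can launch across all listed AZs, so an impairment of us-east-1a no longer takes down the whole fleet.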

Question 6

Topic: Reliability and Business Continuity

An operations team uses AWS Backup and wants consistent backup coverage for new production resources without manually adding each resource to a backup plan. The team assigns resources to the backup plan by using a tag-based resource assignment (for example, Environment=Prod) and requires the tag to be applied when resources are created.

Which operations principle does this action most directly demonstrate?

Options:

  • A. Shared responsibility

  • B. Automation and standardization

  • C. Blast-radius reduction

  • D. Least privilege

Best answer: B

Explanation: Using tags to assign resources to AWS Backup plans makes backup coverage consistent and repeatable as environments grow. This reduces manual configuration drift and the risk of forgetting to include new resources. The core principle is automation and standardization to improve operational reliability.

The core concept is automation/standardization: implementing repeatable mechanisms so that routine operational controls (like backups) are applied consistently as resources change. In AWS Backup, tag-based resource assignments let you automatically include resources in a backup plan based on required tags (for example, Environment=Prod) instead of manually selecting each EBS volume, RDS database, or DynamoDB table. This approach reduces human error, improves auditability, and helps ensure new production resources receive the same protection level without additional change tickets. The key takeaway is that tag-driven assignments operationalize a standard backup policy across resources as they are created.

  • Least privilege focuses on restricting IAM permissions, not on ensuring uniform backup coverage.
  • Blast-radius reduction is about limiting the scope of failures/changes (for example, segmentation), not selecting backup resources.
  • Shared responsibility describes who secures what; it does not directly address standardizing resource inclusion in backups.
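The tag-based assignment described above can be sketched as the `BackupSelection` shape passed to AWS Backup's CreateBackupSelection API. The selection name, role ARN, and plan ID are placeholders:

```python
# Sketch of a tag-based AWS Backup resource assignment: any supported
# resource carrying the given tag is included in the plan automatically,
# so new Environment=Prod resources get covered without manual steps.

def tag_based_selection(name: str, iam_role_arn: str,
                        tag_key: str, tag_value: str) -> dict:
    """Build a BackupSelection that matches resources by tag."""
    return {
        "SelectionName": name,
        "IamRoleArn": iam_role_arn,
        "ListOfTags": [{
            "ConditionType": "STRINGEQUALS",
            "ConditionKey": tag_key,
            "ConditionValue": tag_value,
        }],
    }

selection = tag_based_selection(
    "prod-coverage", "arn:aws:iam::123456789012:role/backup-role",
    "Environment", "Prod")
# boto3.client("backup").create_backup_selection(
#     BackupPlanId="<plan-id>", BackupSelection=selection)
```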

Question 7

Topic: Reliability and Business Continuity

An application team accidentally deployed code that wrote incorrect values to a DynamoDB table named Orders in us-east-1 at 10:20 UTC. The table had point-in-time recovery (PITR) enabled before the incident. Operations must restore the data to its state at 10:15 UTC and confirm the restore is correct before cutting traffic over.

Which THREE actions should the operations engineer take? (Select THREE.)

Options:

  • A. Export the table to S3 and import it back into Orders to achieve a point-in-time restore

  • B. Verify that Orders has point-in-time recovery enabled (continuous backups status is ENABLED)

  • C. After restore completes, confirm the new table is ACTIVE and validate data with DescribeTable restore details plus a few representative GetItem/Query checks

  • D. Restore the existing Orders table in place to 10:15 UTC so the application does not need any changes

  • E. Run RestoreTableToPointInTime to create a new table restored to 10:15 UTC

  • F. Create an on-demand backup of the corrupted table and restore that backup to 10:15 UTC

Correct answers: B, C and E

Explanation: DynamoDB PITR lets you restore a table to a chosen second within the retention window, but the restore operation creates a new table. Operationally, you confirm PITR is enabled, run a point-in-time restore to the target timestamp, and then validate completion and correctness by checking restore metadata and performing targeted reads against known keys.

The core concept is DynamoDB point-in-time recovery (PITR): it restores table data to an exact timestamp (to the second) within the PITR window and creates a new table rather than changing the existing table.

At a high level you:

  • Confirm PITR is enabled on the source table (continuous backups).
  • Restore to the required timestamp using RestoreTableToPointInTime, giving the restored table a new name.
  • Validate the outcome by ensuring the restored table is ACTIVE, checking restore metadata (for example, restore time in DescribeTable), and running a small set of representative GetItem/Query reads to confirm expected pre-incident values.

In-place rewinds and “time-travel” from an on-demand backup are not supported by DynamoDB.

  • OK — Verify PITR enabled: Required prerequisite for a PITR restore.
  • OK — Restore to new table: PITR restore creates a separate restored table at 10:15 UTC.
  • OK — Validate restore outcome: Confirm ACTIVE/restore metadata and verify key reads match expected state.
  • NO — In-place restore: DynamoDB PITR does not overwrite the existing table in place.
  • NO — On-demand backup to a time: An on-demand backup captures a point-in-time at backup creation, not an earlier timestamp.
  • NO — Export/import as PITR: Export/import is not the PITR restore workflow and is not the appropriate method here for restoring to 10:15 UTC.
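The validation step above can be sketched as a small comparison of known keys against their expected pre-incident values. The reader callable stands in for GetItem against the restored table; the sample data is an in-memory stub, not live API output:

```python
# Sketch of validating a restored table: sample a handful of known keys
# and confirm each returns its expected pre-incident value before
# cutting traffic over.

def validate_restore(read_item, expected_items: dict) -> bool:
    """read_item(key) -> item dict or None. Return True only when every
    sampled key matches its expected pre-incident value."""
    return all(read_item(k) == v for k, v in expected_items.items())

# In-memory stand-in for GetItem on the restored table:
restored = {"order-1": {"status": "SHIPPED"}, "order-2": {"status": "NEW"}}
ok = validate_restore(restored.get, {"order-1": {"status": "SHIPPED"}})
print(ok)  # True
```

In a real cutover, `read_item` would wrap `GetItem`/`Query` calls against the restored table, and the expected values would come from application records known to predate the incident.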

Question 8

Topic: Reliability and Business Continuity

A company serves a global user base from an Application Load Balancer (ALB) origin in us-east-1. During peak hours, the ALB and web tier show high request rates and increased latency. The operations team is adding Amazon CloudFront to reduce origin load while keeping content reasonably fresh.

Which CloudFront configuration action should be AVOIDED because it prevents effective caching at edge locations?

Options:

  • A. Use an AWS managed cache policy optimized for caching static content and enable CloudFront compression

  • B. Enable Origin Shield in the AWS Region closest to the ALB origin to improve cache hit ratios

  • C. Forward all headers, all cookies, and all query strings and set the default TTL to 0 seconds for all paths

  • D. Create separate cache behaviors so static paths use long TTLs while dynamic paths use low or disabled caching

Best answer: C

Explanation: CloudFront reduces origin load only when many requests can be served from edge caches. Forwarding all request attributes and setting a 0-second TTL for all paths makes most requests uncacheable or uniquely cached, driving traffic back to the ALB. This violates the principle of minimizing cache-key variation and using appropriate TTLs to increase cache hit ratio.

The core operational goal is to increase CloudFront cache hits so fewer requests reach the origin. CloudFront caches objects based on the cache key (what you include from the request) and how long objects are allowed to stay in cache (TTL). If you forward many values (headers/cookies/query strings), you create many unique cache keys; if you set TTLs to 0, CloudFront must revalidate or fetch from the origin for every request.

To reduce origin load while keeping content fresh:

  • Cache static assets with longer TTLs (and version object names when you deploy changes)
  • Limit cache-key inputs to only what the application truly needs
  • Use features like compression and Origin Shield to improve efficiency

The key takeaway is to avoid configurations that make nearly every request a cache miss.

  • Split behaviors is a common way to cache static content aggressively while treating dynamic routes differently.
  • Managed cache policy + compression increases cacheability and reduces bytes served without breaking correctness.
  • Origin Shield can reduce origin load by consolidating cache misses before they reach the origin.
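The contrast with option C can be sketched as a CloudFront `CachePolicyConfig` for static paths: long TTLs and a cache key that does not vary on headers, cookies, or query strings. The TTL values are assumptions for illustration:

```python
# Illustrative cache policy for static assets: bounded TTLs plus a
# minimal cache key, the opposite of forward-everything-with-TTL-0.

def static_cache_policy(name: str, default_ttl: int, max_ttl: int) -> dict:
    """Build a CachePolicyConfig whose cache key ignores headers,
    cookies, and query strings, maximizing edge cache hits."""
    return {
        "Name": name,
        "MinTTL": 1,
        "DefaultTTL": default_ttl,
        "MaxTTL": max_ttl,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,    # allow compressed variants
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }

config = static_cache_policy("static-long-ttl", 86400, 31536000)
# boto3.client("cloudfront").create_cache_policy(CachePolicyConfig=config)
```

Dynamic paths would use a separate behavior with a different policy; the point is that the static policy keeps cache-key variation to a minimum.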

Question 9

Topic: Reliability and Business Continuity

An operations engineer accidentally deleted many items from a DynamoDB table named Orders in us-east-1 at 09:14 UTC. Point-in-time recovery (PITR) was enabled before the incident. The engineer must restore the data to its state at 09:10 UTC and confirm the restore worked without overwriting the current Orders table.

Which TWO actions should the engineer take?

Options:

  • A. Create an on-demand backup now and restore it to recover deleted items

  • B. Use UpdateTable to roll back the existing Orders table in place

  • C. After the restore is ACTIVE, read from the restored table to verify expected items

  • D. Enable PITR now and restore the table to 09:10 UTC

  • E. Create a DynamoDB VPC endpoint and validate the restore using private DNS

  • F. Restore to a new table using RestoreTableToPointInTime at 09:10 UTC

Correct answers: C and F

Explanation: With DynamoDB PITR, you restore to a new table at the desired point in time, then validate the outcome by confirming the restored table is ACTIVE and that expected data is present. PITR cannot retroactively cover time periods before it was enabled, and DynamoDB does not support in-place rollback of an existing table.

DynamoDB point-in-time recovery lets you recover a table to a specific second within the PITR window, but the restore operation always creates a new table (so you don’t overwrite the current one). In this scenario, the correct operational flow is to restore Orders to a new table name at 09:10 UTC, wait for the restore to complete, and then validate the restored content by performing reads against the restored table.

Typical high-level steps:

  • Start RestoreTableToPointInTime with a new target table name and the 09:10 UTC timestamp.
  • Wait until the restored table status is ACTIVE.
  • Validate by querying/scanning for known keys or counts that should exist as of 09:10.

Key takeaway: PITR is a restore-to-new-table operation, and validation must include checking the restored data, not just configuration.

  • OK: Restore to a new table using RestoreTableToPointInTime at 09:10 UTC. This is the PITR mechanism and avoids overwriting the current table.
  • OK: Read from the restored table to verify expected items. This confirms the restore outcome by validating the recovered data.
  • NO: Rolling back the existing table in place. DynamoDB does not support in-place point-in-time rollback of an existing table.
  • NO: Enabling PITR now and restoring to 09:10 UTC. PITR cannot restore to times before it was enabled.
  • NO: Restoring from an on-demand backup created now. A backup taken after the deletion will not contain the deleted items.
  • NO: Using a VPC endpoint or private DNS. Connectivity settings neither perform nor validate a PITR restore.
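The "wait until ACTIVE, then validate" step can be sketched as a check on a DescribeTable-shaped response; tables created by a PITR restore carry a `RestoreSummary` in that output. The sample response is a stub, not live API data:

```python
# Sketch of confirming a PITR restore finished: the restored table must
# be ACTIVE and carry RestoreSummary metadata before validation reads.

def restore_ready(describe_response: dict) -> bool:
    """Return True when a DescribeTable response shows an ACTIVE table
    with PITR restore metadata attached."""
    table = describe_response.get("Table", {})
    return (table.get("TableStatus") == "ACTIVE"
            and "RestoreSummary" in table)

stub = {"Table": {"TableStatus": "ACTIVE",
                  "RestoreSummary": {"RestoreInProgress": False}}}
print(restore_ready(stub))  # True
# In practice:
# restore_ready(boto3.client("dynamodb").describe_table(
#     TableName="Orders-restored"))
```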

Question 10

Topic: Reliability and Business Continuity

An operations team uses Amazon FSx and wants to confirm the basic recovery workflow. Which statement about Amazon FSx backups is correct?

Options:

  • A. FSx backups are stored on the file system itself and are deleted when the file system is deleted.

  • B. FSx backups automatically replicate to all AWS Regions without additional configuration.

  • C. An FSx backup is used to create a new FSx file system from that point in time.

  • D. An FSx backup can be mounted directly by clients as a read-only file share.

Best answer: C

Explanation: Amazon FSx backups are point-in-time restore points. To recover, you restore a backup by creating a new FSx file system that contains the backed-up data. This is the fundamental mechanism FSx provides to meet recovery objectives after data loss or file system failure.

The core concept is that an Amazon FSx backup is not a share you attach to; it is a recovery artifact. Operationally, when you need to recover from accidental deletion or a corrupted file system, you select the appropriate backup (based on the desired recovery point) and restore it by creating a new FSx file system from that backup. After the new file system is available, you update clients (for example, DNS or mount targets) to use the restored file system. The key takeaway is that backups support recovery by enabling point-in-time creation of a replacement file system rather than providing a directly mountable backup volume.

  • Directly mounting backups is not how FSx backups work; you restore to a new file system.
  • Backups stored on the file system would not be reliable for disaster recovery if the file system is lost.
  • Automatic multi-Region replication is not inherent to FSx backups without additional DR configuration.
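The recovery mechanism above can be sketched with the request shape for FSx's CreateFileSystemFromBackup API, which always produces a new file system. The backup and subnet IDs are placeholders:

```python
# Sketch of restoring an FSx backup: the backup is not mounted; it is
# used to create a NEW file system, after which clients are repointed
# (for example via DNS).

def restore_params(backup_id: str, subnet_ids: list) -> dict:
    """Build minimal parameters for fsx.create_file_system_from_backup."""
    return {
        "BackupId": backup_id,
        "SubnetIds": subnet_ids,
    }

params = restore_params("backup-0123456789abcdef0", ["subnet-aaaa1111"])
# fsx = boto3.client("fsx")
# new_fs = fsx.create_file_system_from_backup(**params)
# Then update DNS or mount configuration to point clients at new_fs.
```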

Continue with full practice

Use the AWS SOA-C03 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try AWS SOA-C03 on Web
View AWS SOA-C03 Practice Test

Free review resource

Read the AWS SOA-C03 Cheat Sheet on Tech Exam Lexicon, then return to IT Mastery for timed practice.

Revised on Thursday, May 14, 2026