DS0-002 — CompTIA DataSys+ V2 Quick Reference

Compact DS0-002 Quick Reference for CompTIA DataSys+ V2 candidates covering database design, SQL, security, operations, performance, and recovery.

Exam-use focus

This independent Quick Reference is for candidates preparing for CompTIA DataSys+ V2 (DS0-002). Use it to review database administration decisions quickly: data modeling, SQL behavior, deployment patterns, access control, backup and recovery, monitoring, performance, and operational change management.

High-yield DBA decision map

TaskChoose / rememberCommon exam trap
Design transactional systemNormalize, enforce constraints, index frequent predicatesAdding indexes for every column without considering write cost
Design analytics systemDimensional model, columnar storage, aggregates, ETL/ELT pipelinesTreating OLAP like OLTP and over-normalizing reporting tables
Protect dataLeast privilege, encryption, auditing, masking/tokenization where neededConfusing encryption with authorization
Improve slow queryCheck plan, indexes, statistics, joins, predicates, locksScaling hardware before reading the execution plan
Recover from data lossRestore last good full backup, apply differential/incremental/logs as appropriate, validateHaving backups but never testing restores
Meet low downtime requirementReplication, clustering, failover, load balancing, tested runbooksAssuming backup alone provides high availability
Move data between systemsMigration plan, validation, rollback, CDC or replication if low downtimeBig-bang migration without reconciliation
Troubleshoot contentionIdentify locks, long transactions, isolation level, deadlock victimsKilling sessions without preserving evidence
Improve data qualityConstraints, validation rules, profiling, deduplication, lineageFixing reports instead of correcting source data
Manage changesVersioned scripts, pre-prod testing, approvals, rollback planManual schema changes directly in production

Data types, structures, and workload fit

Data category reference

Data categoryExamplesTypical handling
StructuredRows and columns, well-defined schemaRelational DBMS, SQL, constraints
Semi-structuredJSON, XML, logs with variable fieldsDocument stores, data lakes, schema-on-read
UnstructuredImages, audio, PDFs, free textObject storage, search indexes, metadata catalog
Master dataCustomer, product, supplier recordsGovernance, deduplication, stewardship
Transactional dataOrders, payments, inventory movementsACID controls, constraints, auditability
Reference dataCountry codes, status codes, taxonomiesControlled updates, versioning
MetadataSchema, lineage, ownership, sensitivity labelsCataloging, governance, impact analysis

Workload selection matrix

WorkloadPrimary goalCommon designStorage/query pattern
OLTPFast, consistent transactionsNormalized relational schemaShort reads/writes, indexed lookups
OLAPAnalysis across large historyStar/snowflake schema, aggregatesLarge scans, joins, grouping
HTAP / mixedTransactional plus near-real-time analyticsSeparate serving paths or specialized platformAvoid analytics queries degrading OLTP
StreamingContinuous event processingEvent logs, stream processors, time windowsAppend-first, low-latency processing
Data lakeFlexible raw and curated dataObject storage zones, catalog, governanceBatch/ELT, schema-on-read
SearchText relevance and filteringInverted indexesTokenization, ranking, faceting
Time-seriesMetrics, telemetry, sensor dataTimestamp-based partitions/retentionRange scans, downsampling

Database model selection

ModelBest forStrengthsWatch for
RelationalStructured business transactionsACID, SQL, constraints, joinsRigid schema changes if poorly managed
Key-valueSession state, cache, simple lookupsVery fast access by keyLimited querying and relationships
DocumentJSON-like entities, flexible attributesSchema flexibility, nested dataDuplicated data and inconsistent shapes
Column-family / wide-columnMassive sparse datasets, high write scaleHorizontal scale, high throughputQuery design must follow access patterns
GraphHighly connected relationshipsTraversal, path analysisNot ideal for simple tabular reporting
Time-seriesMetrics over timeRetention, compression, time windowsCardinality management
Object storage + catalogRaw files, lake architecturesLow-cost durable storage, many formatsGovernance and query performance depend on design

Relational design quick reference

Core relational terms

TermMeaningExam note
EntityThing represented by a tableExample: Customer, Order, Product
AttributeColumn describing an entityChoose data type and constraints carefully
Tuple / rowOne recordShould represent one instance of the entity
Primary keyUnique row identifierShould be stable, unique, and non-null
Foreign keyReference to another table’s keyEnforces referential integrity
Candidate keyColumn set that could uniquely identify rowsOne is selected as primary key
Surrogate keyArtificial key, such as generated IDUseful when natural keys are unstable
Natural keyReal-world identifierCan change or contain business meaning
CardinalityRelationship countOne-to-one, one-to-many, many-to-many
OptionalityWhether relationship is requiredImplemented through nullability and constraints
Junction tableResolves many-to-many relationshipContains foreign keys to both parent tables

Relationship patterns

PatternImplementationExample
One-to-oneFK with unique constraint, or shared PKUser and user profile
One-to-manyFK on child tableCustomer to orders
Many-to-manyJunction/bridge tableStudents to courses
HierarchicalSelf-referencing FKEmployee to manager
Recursive graph-likeEdge table with source/target IDsNetwork links, dependencies

Constraint reference

ConstraintProtects againstExample
NOT NULLMissing required dataOrder date must exist
UNIQUEDuplicate valuesEmail address must be unique
PRIMARY KEYMissing or duplicate row identityCustomerID
FOREIGN KEYOrphan recordsOrder must reference valid customer
CHECKInvalid domain valuesQuantity greater than 0
DEFAULTMissing routine valueCreatedDate defaults to current timestamp
EXCLUDE / specialized constraintOverlapping or conflicting rangesRoom bookings cannot overlap, if supported

Normalization decision table

FormMain ruleFixesPractical exam cue
1NFAtomic values; no repeating groupsMulti-value columnsSplit comma-separated phone numbers into child table
2NF1NF plus no partial dependency on composite keyRedundant data in composite-key tablesNon-key attribute depends on only part of key
3NF2NF plus no transitive dependencyNon-key data depending on other non-key dataMove ZIP-to-city mapping to reference table
BCNFEvery determinant is a candidate keyEdge cases with overlapping candidate keysStricter than 3NF
DenormalizationIntentionally duplicate or precompute dataRead/report performanceRequires consistency strategy

High-yield distinction: normalization reduces update anomalies; denormalization can improve read performance but increases maintenance and consistency risk.

Dimensional and analytics modeling

ConceptMeaningUse when
Fact tableNumeric events or measurementsSales amount, clicks, shipments
Dimension tableDescriptive contextDate, customer, product, region
Star schemaFact table directly linked to dimensionsSimpler, faster BI queries
Snowflake schemaDimensions further normalizedLower redundancy, more joins
GrainLevel of detail in fact table“One row per order line”
Slowly changing dimension Type 1Overwrite old valueCurrent-state reporting only
Slowly changing dimension Type 2Add new versioned rowHistorical reporting needed
Aggregate tablePre-summarized dataImprove frequent reports
Data martSubject-specific analytics storeDepartment or domain reporting

SQL compact reference

SQL statement classes

ClassPurposeExamples
DDLDefine structuresCREATE, ALTER, DROP, TRUNCATE
DMLManipulate dataSELECT, INSERT, UPDATE, DELETE, MERGE
DCLControl permissionsGRANT, REVOKE
TCLControl transactionsCOMMIT, ROLLBACK, SAVEPOINT
DQLQuery dataSELECT; often treated as part of DML

Logical SELECT processing order

OrderClausePurpose
1FROM / JOINIdentify source rows
2WHEREFilter individual rows
3GROUP BYForm groups
4HAVINGFilter groups
5SELECTReturn expressions
6DISTINCTRemove duplicates
7ORDER BYSort result
8LIMIT / OFFSET / FETCHReturn subset, syntax varies

Trap: WHERE filters rows before grouping. HAVING filters groups after aggregation.

SELECT c.customer_id, COUNT(*) AS order_count
FROM customers c
JOIN orders o
  ON o.customer_id = c.customer_id
WHERE o.order_date >= DATE '2026-01-01'
GROUP BY c.customer_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;

Join behavior

Join typeReturnsCommon use
INNER JOINMatching rows onlyRequired relationship
LEFT OUTER JOINAll left rows plus matching right rowsFind optional related data
RIGHT OUTER JOINAll right rows plus matching left rowsLess common; can rewrite as left join
FULL OUTER JOINAll rows from both sidesReconciliation, data comparison
CROSS JOINCartesian productGenerate combinations; dangerous if accidental
SELF JOINTable joined to itselfHierarchies, comparisons
-- Find customers with no orders
SELECT c.customer_id
FROM customers c
LEFT JOIN orders o
  ON o.customer_id = c.customer_id
WHERE o.order_id IS NULL;

NULL behavior

ExpressionResult / note
column = NULLIncorrect; comparisons to NULL are unknown
column IS NULLCorrect NULL test
column IS NOT NULLCorrect non-NULL test
COUNT(*)Counts rows
COUNT(column)Counts non-NULL values
COALESCE(a, b)Returns first non-NULL value
NULL in arithmeticUsually produces NULL

Aggregation and window functions

TechniquePurposeExample use
GROUP BYCollapse rows into groupsSales by region
HAVINGFilter aggregate groupsRegions with sales above threshold
Window functionCalculate across related rows without collapsingRank orders by customer
PARTITION BYDefines window groupsPer customer, per department
ORDER BY inside windowDefines calculation orderRunning total, row number
SELECT
  customer_id,
  order_id,
  order_total,
  ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY order_date DESC
  ) AS order_rank
FROM orders;

DDL and constraints example

CREATE TABLE orders (
  order_id      INTEGER PRIMARY KEY,
  customer_id   INTEGER NOT NULL,
  order_date    DATE NOT NULL,
  order_total   DECIMAL(12,2) NOT NULL CHECK (order_total >= 0),
  status        VARCHAR(20) NOT NULL DEFAULT 'NEW',
  CONSTRAINT fk_orders_customer
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Transaction example

BEGIN;

UPDATE accounts
SET balance = balance - 100
WHERE account_id = 10;

UPDATE accounts
SET balance = balance + 100
WHERE account_id = 20;

COMMIT;
-- Use ROLLBACK instead of COMMIT if validation fails.

Transactions, concurrency, and consistency

ACID reference

PropertyMeaningDBA relevance
AtomicityAll-or-nothing transactionPrevent partial transfers or half-written changes
ConsistencyDatabase moves between valid statesConstraints and business rules remain valid
IsolationConcurrent transactions do not improperly interfereControlled by isolation levels and locks
DurabilityCommitted data survives failureLogs, storage, checkpoints, replication

Isolation and anomalies

Isolation conceptPrevents / allowsExam cue
Read uncommittedMay allow dirty readsFast but unsafe for correctness
Read committedPrevents dirty readsCommon baseline behavior
Repeatable readPrevents non-repeatable readsSame row reread remains stable
SerializableStrongest isolation; behaves like serial executionMore blocking/overhead possible
Snapshot / MVCCReaders see consistent versionReduces read/write blocking, may create version storage pressure
AnomalyMeaning
Dirty readRead data from uncommitted transaction
Non-repeatable readSame row read twice returns different committed values
Phantom readRe-running query returns new/deleted matching rows
Lost updateOne update overwrites another without detection
DeadlockTransactions wait on each other in a cycle

Locking and contention

SymptomLikely causeFirst checksPossible action
Queries hang or time outBlocking locksActive sessions, wait events, lock tablesCommit/rollback long transaction; tune query
Deadlock errorsConflicting access orderDeadlock logs/graphsAccess objects in consistent order; shorten transactions
High write latencyIndex overhead, log pressure, contentionWrite waits, index count, log I/OReduce unnecessary indexes, batch carefully
Readers block writersLock-based isolationIsolation settings, query durationUse appropriate isolation/MVCC if supported
Version store growsLong snapshot readersLong-running queriesEnd stale sessions; tune reporting workload

Indexing and query performance

Index selection

Index type / patternUse whenAvoid / watch for
B-tree / balanced treeEquality and range predicates, sortingLow-selectivity columns may not help
Composite indexQueries filter/sort by multiple columnsColumn order matters
Covering indexQuery can be satisfied from indexExtra storage and write overhead
Unique indexEnforce uniqueness and speed lookupDuplicates will fail
Filtered / partial indexOnly subset is frequently queriedPredicate must match supported syntax
Full-text indexNatural language searchNot same as LIKE '%term%'
Hash indexEquality lookup, if supportedUsually not for range queries
Bitmap indexLow-cardinality analytics, if supportedOften poor for high-concurrency OLTP

SARGability checklist

A predicate is more index-friendly when the database can search the index directly.

Less index-friendlyMore index-friendly
WHERE YEAR(order_date) = 2026WHERE order_date >= DATE '2026-01-01' AND order_date < DATE '2027-01-01'
WHERE LOWER(email) = 'a@x.com'Store normalized email or use supported function-based index
WHERE amount + 10 > 100WHERE amount > 90
WHERE name LIKE '%son'Use full-text/search index if suffix search is required

Query tuning sequence

  1. Confirm the performance symptom and baseline.
  2. Review execution plan: scan vs seek, join method, sort, spill, estimated vs actual rows if available.
  3. Check predicates for SARGability.
  4. Validate relevant indexes and index column order.
  5. Check statistics freshness and cardinality estimates.
  6. Inspect joins, missing filters, accidental cross joins, and implicit conversions.
  7. Check locks, waits, I/O, CPU, memory, and temp space.
  8. Test changes in non-production and compare measured results.

Performance tools and signals

SignalIndicatesTypical response
Full table scanNo useful index, low selectivity, or optimizer choiceAdd/tune index only if beneficial
High CPUComplex calculations, poor plan, too many executionsTune query, cache results, reduce frequency
High disk I/OLarge scans, missing indexes, poor cachingIndex, partition, archive, tune queries
Memory pressureSort/hash spills, insufficient cacheTune query, review memory config/capacity
Temp space growthLarge sorts, hash joins, temp tablesAdd indexes, reduce intermediate rows
Stale statisticsBad cardinality estimatesUpdate statistics/maintenance
Parameter sensitivityPlan good for one value, bad for anotherRecompile/plan strategy, query rewrite where supported

Storage, partitioning, and capacity

Storage concepts

ConceptDBA meaningExam note
Data fileStores table/index dataPlacement affects I/O
Log / journalRecords changes for durability and recoveryCritical for point-in-time restore
Tablespace / filegroupLogical storage grouping, vendor term variesCan separate data, indexes, partitions
Temp spaceUsed for sorts, joins, temp objectsRunning out can break queries
CheckpointFlushes dirty pages to durable storageInteracts with recovery time
CompressionReduces storage and I/OMay increase CPU
TieringMove data to different storage classesBalance performance, cost, retention

Partitioning patterns

PatternUse whenBenefitTrap
Range partitionDate/time or ordered keyFast pruning and archivalBad partition key creates imbalance
List partitionKnown categoriesRegion/status separationToo many categories can be hard to manage
Hash partitionSpread data evenlyReduces hotspot riskLess intuitive for pruning
Composite partitionMultiple strategiesLarge complex workloadsMore operational complexity

Partitioning is not automatically a performance fix. It helps most when queries filter on the partition key, maintenance can operate per partition, or data lifecycle actions need efficient archive/drop.

Deployment and architecture patterns

Database deployment decision table

PatternBest forStrengthsRisks / controls
Single instanceSmall/simple workload, dev/testSimple administrationSingle point of failure unless backed by platform features
Primary-replicaRead scaling, reporting, HA supportOffload reads, failover optionReplication lag, consistency expectations
Active-active / multi-writerGlobal or high-write availabilityContinuous service potentialConflict resolution and complexity
Clustered databaseHigh availability within environmentAutomated failover, shared governanceQuorum/split-brain design matters
ShardingVery large scale-out workloadHorizontal write/read scaleCross-shard queries and rebalancing complexity
Managed database serviceReduce operational burdenAutomated maintenance options, platform integrationShared responsibility still applies
Containerized databaseDev/test portability, some specialized deploymentsRepeatable deploymentPersistent storage, backup, and performance need care
Serverless databaseVariable workload, simplified scalingOperational simplicityCold starts/latency/cost behavior may vary by platform

Replication choices

Replication typeDescriptionChoose when
SynchronousCommit waits for replica acknowledgmentVery low data-loss tolerance
AsynchronousPrimary commits before replica catches upDistance/performance more important than zero-lag
Physical/block-levelReplicates storage/log changesDisaster recovery or exact copy
LogicalReplicates rows/statements/changesSelective replication, migration, integration
SnapshotPeriodic copyReporting or initial synchronization
CDCCaptures data changes over timeNear-real-time pipelines and migrations

CAP and distributed data systems

ConceptMeaning
ConsistencyReads return latest correct data according to model
AvailabilityRequests receive non-error responses
Partition toleranceSystem continues despite network partitions
Eventual consistencyReplicas converge over time
Strong consistencyReads reflect committed writes according to defined rules

Exam distinction: CAP applies under network partition conditions. It does not mean every system permanently chooses only two characteristics in all situations.

Security quick reference

Security control matrix

ControlProtectsExamples / notes
AuthenticationWho are you?Passwords, MFA, certificates, federation
AuthorizationWhat can you do?Roles, grants, row-level access
Accounting / auditingWhat happened?Login logs, query audit, DDL audit
Encryption in transitNetwork confidentialityTLS/secure client connections
Encryption at restStored data protectionDatabase/file/storage encryption
Key managementControl encryption keysRotation, separation of duties, access policy
MaskingHide sensitive values in displaysUseful for non-prod or limited users
TokenizationReplace sensitive data with tokenReduces exposure of original value
HashingOne-way transformationPassword verification, integrity checks
SaltingRandom value added before hashingDefends against precomputed hash attacks
Data classificationLabel sensitivityDrives access, retention, and monitoring
Secrets managementProtect credentialsAvoid hardcoded passwords
Network segmentationLimit reachable pathsPrivate subnets, firewalls, allowlists
PatchingRemove known vulnerabilitiesTest, schedule, rollback plan

Least privilege SQL example

-- Example pattern; exact syntax varies by platform.
CREATE ROLE reporting_reader;

GRANT SELECT ON sales_summary TO reporting_reader;

GRANT reporting_reader TO analyst_user;

REVOKE INSERT, UPDATE, DELETE ON sales_summary FROM reporting_reader;

Role design

PatternGood practiceTrap
User-specific grantsUse sparinglyHard to audit and revoke
Role-based access controlGrant privileges to roles, users to rolesOverly broad shared roles
Separation of dutiesSplit DBA, security admin, developer dutiesOne account can alter, approve, and audit itself
Break-glass accessEmergency elevated accessMust be logged, time-limited, reviewed
Service accountsApplication/database connectivityUse rotation and scoped permissions

Sensitive data handling

TechniqueReversible?Primary use
EncryptionYes, with keyProtect stored/transmitted data
HashingNoVerify password or integrity
MaskingUsually display-levelReduce exposure to users
TokenizationYes, through token vault/mappingReplace sensitive values in workflows
RedactionUsually no in outputRemove sensitive text from logs/documents
AnonymizationIntended noAnalytics without identifying individuals
PseudonymizationPossible with mappingReduce direct identifiability

Audit and logging focus

Event typeWhy it matters
Failed loginsBrute force or credential misuse
Privilege changesEscalation or misconfiguration
DDL changesSchema drift, unauthorized alteration
Access to sensitive tablesData exposure investigation
Bulk exportPotential exfiltration
Backup/restore eventsData movement and recovery assurance
Configuration changesSecurity or availability impact

Trap: audit logs must be protected from tampering. Logging sensitive values can create a second data exposure location.

Backup, restore, and disaster recovery

RPO, RTO, and availability

TermMeaningPractical implication
RPOMaximum acceptable data lossDetermines backup/log replication frequency
RTOMaximum acceptable recovery timeDetermines restore automation and HA design
MTTDMean time to detectMonitoring and alerting quality
MTTRMean time to repair/recoverRunbooks, automation, staff readiness
HAReduce downtime during component failureClustering, failover, redundancy
DRRecover from site/region/system disasterBackups, replication, alternate environment
\[ \text{Availability} = \frac{\text{Total time} - \text{Downtime}}{\text{Total time}} \times 100 \]

Backup type comparison

Backup typeCapturesStrengthsWatch for
FullEntire database/data setSimplest restore baseLargest time/storage
DifferentialChanges since last fullFaster restore than many incrementalsGrows until next full
IncrementalChanges since last backupEfficient storageRestore may require chain
Transaction log / redo logOrdered changesPoint-in-time recoveryLog chain must be intact
SnapshotPoint-in-time storage imageFast creationMust understand consistency and dependency on storage
Logical exportSchema/data as statements/filesMigration, selective restoreMay be slower; consistency must be managed
Physical backupData files/logsFast full restorePlatform/version compatibility matters

Restore sequence reference

ScenarioTypical restore approach
Full backup onlyRestore full backup
Full + differentialRestore full, then latest differential
Full + incrementalsRestore full, then each incremental in order
Full + logsRestore full, then logs to target point
Full + differential + logsRestore full, latest differential, then logs
Corrupt object onlyConsider object-level restore/export if supported
Accidental DELETEPoint-in-time restore to separate environment, extract/replay valid data

Backup validation checklist

  • Confirm backups complete successfully.
  • Test restore in a non-production environment.
  • Verify application can connect after restore.
  • Validate row counts, checksums, or reconciliation totals.
  • Document restore steps and owners.
  • Protect backups with encryption and access controls.
  • Store backups separately from the primary failure domain.
  • Monitor backup age, duration, failure, and capacity.
  • Review retention and deletion settings against business requirements.

HA/DR decision path

    flowchart TD
	    A[Requirement: protect database service] --> B{Main concern?}
	    B -->|Short outage from server failure| C[High availability: cluster or failover replica]
	    B -->|Data loss tolerance is very low| D[Synchronous replication or frequent log shipping]
	    B -->|Regional/site disaster| E[Disaster recovery environment plus offsite backups]
	    B -->|Accidental data change| F[Point-in-time restore and audit trail]
	    C --> G[Test failover runbook]
	    D --> G
	    E --> G
	    F --> H[Test restore and data reconciliation]

Data lifecycle, integration, and governance

ETL, ELT, CDC, and streaming

PatternDescriptionChoose when
ETLTransform before loading targetTarget requires curated data before storage
ELTLoad raw first, transform in targetScalable analytics platform handles transforms
CDCCapture inserts/updates/deletes from sourceLow-latency sync, migrations, audit feeds
BatchPeriodic bulk movementLarge scheduled processing
StreamingContinuous event processingReal-time alerts, telemetry, near-real-time analytics
API integrationApplication-level data exchangeControlled business operations and validation
Message queueDecouple producers and consumersResilience and asynchronous processing

Data quality checks

CheckDetects
CompletenessMissing required values
ValidityValues outside allowed domain
UniquenessDuplicate keys/entities
ConsistencyConflicting values across systems
AccuracyValues not matching real-world source
TimelinessStale or late-arriving data
IntegrityBroken relationships or corrupted values
ConformityWrong format, unit, code set, or standard
-- Basic reconciliation checks after migration/load
SELECT COUNT(*) AS source_count FROM source_orders;
SELECT COUNT(*) AS target_count FROM target_orders;

SELECT SUM(order_total) AS source_total FROM source_orders;
SELECT SUM(order_total) AS target_total FROM target_orders;

Governance and metadata

AreaDBA relevance
Data catalogFind datasets, owners, schemas, descriptions
LineageUnderstand where data came from and downstream impact
ClassificationIdentify sensitive, regulated, or business-critical data
RetentionKeep or dispose data according to policy
Data ownershipAssign accountability for definitions and quality
StewardshipDay-to-day data quality and definition management
Change impactKnow which reports, apps, or pipelines depend on a table
Master data managementCreate trusted shared entities across systems

Monitoring and troubleshooting

Baseline metrics

CategoryMetrics / signals
AvailabilityUptime, connection success, failover events
WorkloadTransactions per second, query rate, batch duration
LatencyQuery response time, commit latency, replication lag
ResourceCPU, memory, disk I/O, network throughput
StorageFree space, growth rate, temp usage, log usage
Locks/waitsBlocking sessions, deadlocks, wait classes
CacheBuffer/cache hit ratio, plan cache behavior
BackupSuccess/failure, duration, backup age
SecurityFailed logins, privilege changes, unusual access
Data pipelineJob failures, late data, rejected records

Symptom-to-action table

SymptomLikely causesFirst action
Database unavailableService down, network issue, storage failure, auth issueCheck service health, connectivity, logs
Sudden slow queriesBad plan, stale stats, blocking, workload spikeCompare to baseline; review plan and waits
Disk fullData growth, logs not truncating, temp spill, failed cleanupIdentify consuming files; protect data before cleanup
Replication lagNetwork latency, slow replica, large transactionCheck replica health and apply queue
Backup failurePermission, capacity, I/O, schedule conflictReview job logs and destination capacity
Login failuresCredential changes, lockout, expired secret, attackValidate auth path and security logs
Data mismatchETL bug, partial load, constraint disabled, duplicate sourceReconcile counts/totals and review load logs
DeadlocksConflicting transaction order, missing indexesReview deadlock details and access pattern
Corruption alertStorage fault, software issue, abrupt failureStop risky writes if needed; follow vendor recovery guidance

Incident response for database issues

  1. Detect and classify severity.
  2. Preserve logs, alerts, and current state.
  3. Stabilize service; avoid destructive “quick fixes.”
  4. Identify blast radius: users, applications, tables, replicas, backups.
  5. Communicate status through the agreed channel.
  6. Apply tested recovery or remediation steps.
  7. Validate service and data correctness.
  8. Document root cause, timeline, and preventive actions.

Maintenance, change, and release management

Routine DBA operations

OperationPurposeKey caution
Patch database engineSecurity and stabilityTest compatibility and rollback
Update statisticsBetter optimizer estimatesSchedule to avoid peak impact
Rebuild/reorganize indexesAddress fragmentation where relevantDo not perform blindly; measure benefit
Archive/purge dataControl growth and performanceValidate retention and dependencies
Rotate credentials/keysReduce credential exposureCoordinate application updates
Review permissionsEnforce least privilegeRemove stale users and excessive grants
Test restoresConfirm recoverabilityTest regularly, not only after incidents
Capacity reviewAvoid outages and performance issuesTrend storage, CPU, memory, I/O

Schema change checklist

  • Use version-controlled migration scripts.
  • Test in development and staging with realistic data volume.
  • Identify dependent applications, views, reports, jobs, and permissions.
  • Back up or snapshot before high-risk changes.
  • Plan locking and downtime impact.
  • Use backward-compatible changes where possible.
  • Include rollback or roll-forward strategy.
  • Validate data after migration.
  • Document change, owner, approval, and implementation time.

Safe migration sequence

PhaseActivities
AssessInventory objects, dependencies, data volume, compatibility
PlanChoose migration method, window, rollback, validation
PrepareProvision target, security, connectivity, schema
LoadInitial copy or replication setup
ValidateCounts, checksums, sample queries, application tests
Cut overFreeze or sync final changes, switch traffic
MonitorWatch errors, latency, locks, data drift
DecommissionRetire old system only after acceptance and backup

Cloud and managed database responsibilities

Responsibility areaCustomer usually still managesProvider/platform may manage
Data modelSchema, indexes, constraintsNot usually automatic
AccessUsers, roles, secrets, app permissionsIAM integration features
Data protectionClassification, encryption choices, backup policy validationInfrastructure encryption options, backup tooling
PatchingApplication compatibility testing, scheduling decisionsEngine or host patch automation depending on service
AvailabilityArchitecture choice, failover testingPlatform redundancy mechanisms
MonitoringAlerts, query performance, business KPIsBuilt-in metrics/log delivery
Compliance supportPolicies, evidence, data handlingPlatform controls and reports
Cost/capacityWorkload sizing, scaling choicesMetering and scaling mechanisms

Trap: managed database does not mean unmanaged data. The organization still owns data quality, access design, query behavior, and recovery requirements.

Common exam distinctions

DistinctionRemember
Backup vs replicationBackup protects against deletion/corruption history; replication can copy bad changes quickly
HA vs DRHA handles local/component failure; DR handles broader disaster recovery
Authentication vs authorizationAuthentication proves identity; authorization grants actions
Encryption vs hashingEncryption is reversible with key; hashing is one-way
Masking vs encryptionMasking changes what users see; encryption protects stored/transmitted data
OLTP vs OLAPOLTP optimizes transactions; OLAP optimizes analysis
Primary key vs unique keyBoth enforce uniqueness; primary key is main row identifier and non-null
Foreign key vs joinFK enforces relationship; join retrieves related data
WHERE vs HAVINGWHERE filters rows; HAVING filters groups
DELETE vs TRUNCATEDELETE is row operation and can filter; TRUNCATE removes all rows more directly, behavior varies by platform
Logical vs physical backupLogical exports objects/data; physical backs up files/pages/logs
Incremental vs differentialIncremental since last backup; differential since last full
Scale up vs scale outScale up adds resources to one node; scale out adds nodes/partitions
Normalization vs denormalizationNormalize for integrity; denormalize intentionally for read performance
Data lake vs data warehouseLake stores flexible raw/curated files; warehouse stores structured optimized analytics
CDC vs full reloadCDC captures changes; full reload replaces or reloads entire set

Compact command/query patterns to recognize

-- Add an index for a frequent lookup pattern
CREATE INDEX idx_orders_customer_date
ON orders (customer_id, order_date);

-- Enforce valid status values
ALTER TABLE orders
ADD CONSTRAINT chk_orders_status
CHECK (status IN ('NEW', 'PAID', 'SHIPPED', 'CANCELLED'));

-- Identify duplicate business keys
SELECT email, COUNT(*) AS duplicate_count
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;

-- Find orphaned child rows
SELECT o.order_id
FROM orders o
LEFT JOIN customers c
  ON c.customer_id = o.customer_id
WHERE c.customer_id IS NULL;

Final DS0-002 review checklist

  • Can you select relational, NoSQL, warehouse, lake, or streaming architecture based on workload?
  • Can you explain primary keys, foreign keys, constraints, joins, NULLs, and transaction isolation?
  • Can you read SQL and predict result-set behavior?
  • Can you choose an index strategy and a query-tuning sequence?
  • Can you distinguish authentication, authorization, auditing, encryption, hashing, masking, and tokenization?
  • Can you map RPO/RTO requirements to backup, restore, replication, HA, and DR designs?
  • Can you troubleshoot slow queries, blocking, failed backups, replication lag, and storage growth?
  • Can you describe safe schema changes, migrations, validation, and rollback planning?
  • Can you connect data governance, classification, lineage, and quality checks to DBA operations?

Practical next step

Use this Quick Reference as a checklist, then move into timed DS0-002 practice questions and hands-on SQL/database administration scenarios. Focus review on any item where you cannot explain both the correct choice and why the tempting alternatives are wrong.

Browse Certification Practice Tests by Exam Family