Browse Certification Practice Tests by Exam Family

CompTIA DataSys+ DS0-002: Data Integration

Try 10 focused CompTIA DataSys+ DS0-002 questions on Data Integration, with explanations, then continue with IT Mastery.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try CompTIA DataSys+ DS0-002 on Web View full CompTIA DataSys+ DS0-002 practice page

Topic snapshot

FieldDetail
Exam routeCompTIA DataSys+ DS0-002
Topic areaData Integration
Blueprint weight13%
Page purposeFocused sample questions before returning to mixed practice

How to use this topic drill

Use this page to isolate Data Integration for CompTIA DataSys+ DS0-002. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.

PassWhat to doWhat to record
First attemptAnswer without checking the explanation first.The fact, rule, calculation, or judgment point that controlled your answer.
ReviewRead the explanation even when you were correct.Why the best answer is stronger than the closest distractor.
RepairRepeat only missed or uncertain items after a short break.The pattern behind misses, not the answer letter.
TransferReturn to mixed practice once the topic feels stable.Whether the same skill holds up when the topic is no longer obvious.

Blueprint context: 13% of the practice outline. A focused topic score can overstate readiness if you recognize the pattern too quickly, so use it as repair work before timed mixed sets.

Sample questions

These original IT Mastery practice questions are aligned to this topic area. Use them for self-assessment, scope review, and deciding what to drill next.

Question 1

Topic: Data Integration

A nightly data acquisition job was moved from a test server to a new production server. It now fails before running the extraction query. The source database, query text, and credentials are unchanged, and security policy requires encrypted connections.

Exhibit: Production job log

ETL runtime: 64-bit
Source database version: 14
Installed database driver: version 10, 32-bit
Driver support note: supports database versions 9-11
Error: connection initialization failed before query execution

Which implementation choice best addresses the likely infrastructure limitation?

Options:

  • A. Disable encrypted connections for the job

  • B. Rewrite the extraction query syntax

  • C. Install a supported 64-bit database driver

  • D. Increase the connection retry timeout

Best answer: C

Explanation: The evidence points to a driver and platform compatibility issue, not a query, credential, or network timing problem. The job fails during connection initialization, before the SQL runs. The production server also has a 32-bit driver while the ETL runtime is 64-bit, and the driver support note excludes the current source database version. The best implementation choice is to install a database driver that matches the runtime architecture and is supported for the source database version while preserving the required encrypted connection. Query changes do not fix a pre-query connection failure, and weakening encryption would violate the stated security requirement.

  • Query rewrite fails because the log shows the connection fails before the extraction SQL executes.
  • Disable encryption conflicts with the security requirement and does not address the unsupported driver version.
  • Retry timeout is unlikely to help because the error is a deterministic initialization failure, not an intermittent availability issue.

Question 2

Topic: Data Integration

An ELT job that has run successfully for months began failing after a source API release. A DBA must identify the failure category before escalating.

Exhibit: Job error excerpt

Job: orders_api_to_stage
API response: 200 OK
TLS handshake: succeeded
Database driver: same version as last successful run
Target column: stage_orders.order_total DECIMAL(10,2) NOT NULL
Raw field sample: "order_total": "1,299.95"
Load error: Row 42: cannot convert varchar value '1,299.95' to DECIMAL(10,2)

Which interpretation is best supported by the exhibit?

Options:

  • A. Driver incompatibility with the database

  • B. Encryption failure during transport

  • C. Data format failure in the source field

  • D. Connection failure to the source API

Best answer: C

Explanation: The decisive evidence is the successful API response and TLS handshake, followed by a row-level conversion error. The connection and encryption layers completed, and the database driver did not change from the last successful run. The failing value is presented as a quoted string with a thousands separator, while the target column requires DECIMAL(10,2). That points to a data format issue in the incoming payload or parsing rules, not a connectivity, encryption, or driver problem.

A practical next step would be to confirm the provider’s new format and update the mapping or cleansing logic to remove separators and cast the value safely before loading.

  • Connection failure is not supported because the API returned 200 OK.
  • Encryption failure is not supported because the TLS handshake succeeded.
  • Driver incompatibility is weak because the driver version is unchanged and the error is specific to value conversion.

Question 3

Topic: Data Integration

A data integration team is onboarding a monthly vendor extract into an analytics platform. The DBA must decide how the file should be acquired before scheduling the ETL job.

Exhibit: Data classification note

FieldClassificationHandling note
customer_idConfidentialRetain 7 years
emailPIIMask in nonproduction
diagnosis_codePHIRestricted access; encrypt in transit and at rest
ticket_textInternalRetain 2 years

Which acquisition action best follows from the exhibit?

Options:

  • A. Treat the extract as internal data for standard ETL

  • B. Reject the extract because it contains regulated data

  • C. Use a restricted encrypted pipeline with retention controls

  • D. Load the file directly into a shared staging schema

Best answer: C

Explanation: Data classification drives acquisition handling before data is copied into an integration or analytics environment. The exhibit shows multiple sensitivity levels, including PII and PHI. Those classifications do not automatically block acquisition, but they require stronger controls during transfer, staging, processing, and retention. For this file, the acquisition path should use encryption in transit and at rest, restrict access to authorized roles, preserve the 7-year retention requirement for confidential customer data, and ensure PII such as email is masked in nonproduction environments.

The key takeaway is that sensitive classifications change the acquisition design, not just the downstream reporting permissions.

  • Shared staging fails because PHI and PII should not be placed in broadly accessible staging areas.
  • Internal-only handling ignores the higher classifications explicitly shown for email and diagnosis_code.
  • Rejecting the extract is too extreme because regulated data can be acquired when required controls are applied.

Question 4

Topic: Data Integration

A database team is designing ingestion for sensor readings from manufacturing equipment. Operators need dashboard updates within 5 seconds of each reading so they can stop a line before parts are damaged. Historical reports can still run from the warehouse after the data is stored. Which ingestion design best fits the timing requirement?

Options:

  • A. Run a nightly database dump into the warehouse

  • B. Request weekly API extracts from the equipment system

  • C. Load hourly CSV exports with an ETL job

  • D. Use a streaming source with continuous event ingestion

Best answer: D

Explanation: Streaming data sources continuously produce events that must be ingested and processed as they arrive or within a short latency target. In this scenario, each sensor reading must reach an operational dashboard within 5 seconds, so the ingestion design should support continuous event flow rather than scheduled file or extract processing. The warehouse can still store the data for later reporting, but that does not change the operational ingestion need. Non-streaming sources, such as hourly files, nightly dumps, or weekly extracts, are better for batch analytics when delayed availability is acceptable.

  • Hourly files miss the 5-second update target because the data is only available after each scheduled export.
  • Nightly dumps are useful for bulk transfers or reporting but are too delayed for operational monitoring.
  • Weekly extracts are non-streaming and fit periodic analysis, not immediate line-stop decisions.

Question 5

Topic: Data Integration

A DBA is asked to improve quality control for a nightly ELT pipeline that loads order data into a warehouse. The business needs support staff to verify each run and trace future changes.

Exhibit: Last run summary

CheckpointResult
Source rows read48,250
Target rows inserted48,238
Job statusSucceeded
Transform definitionEdited in scheduler UI
Reject detailsOverwritten on next run
Mapping versionNot recorded

What is the best next action?

Options:

  • A. Increase the job timeout and rerun the pipeline nightly

  • B. Add version-controlled mappings with automated checks and reject logs

  • C. Disable target constraints until the load completes

  • D. Manually correct missing rows after reports are generated

Best answer: B

Explanation: Pipeline quality control should make each run measurable, repeatable, and supportable. The exhibit shows a successful job status even though 12 rows did not reach the target, while the transform was edited outside version control and reject details are not retained. A good QC action should add automated validations, such as row-count reconciliation and rule checks, plus persistent exception logging and version-controlled mappings. That combination lets support staff prove what happened, identify rejected records, and connect behavior to a specific approved mapping version. Simply rerunning or relaxing constraints may hide the defect instead of making the integration behavior verifiable.

  • Longer timeout fails because the exhibit shows a quality and traceability gap, not a runtime limit.
  • Disabled constraints can allow bad data into the warehouse and reduce verification.
  • Manual correction is not maintainable because it happens after reporting and does not preserve repeatable evidence.

Question 6

Topic: Data Integration

A data team is building a nightly acquisition pipeline that pulls customer records from a partner API and loads them into an internal database. The records include PII, and the partner requires an API credential. The DBA must reduce exposure of keys, credentials, and sensitive data during acquisition and transfer. Which implementation choice is best?

Options:

  • A. Transfer files by FTP and encrypt only the target database

  • B. Store the API key in the job script and use masking after load

  • C. Use TLS, managed secrets storage, and encrypted staging

  • D. Pass credentials in URL parameters and restrict database roles

Best answer: C

Explanation: Protecting acquisition and transfer requires controls at each exposure point: the network path, the credential store, and any temporary landing area. TLS protects sensitive records while they move between systems. A managed secrets store or vault keeps API credentials out of scripts, logs, and configuration files and supports controlled access and rotation. Encrypting staged files or landing tables reduces exposure if temporary storage is accessed outside the normal load process. Masking and database permissions can still be useful after loading, but they do not protect the credential or the data while it is being acquired.

  • Hard-coded credential fails because scripts can be copied, logged, or checked into version control, and masking after load is too late for transfer risk.
  • FTP-only transfer fails because FTP does not provide protected transport for sensitive records in transit.
  • URL credentials fail because query strings are commonly captured in logs, browser histories, proxies, and monitoring tools.

Question 7

Topic: Data Integration

A DBA is preparing a new acquisition job that imports customer support tickets from a partner API into a staging table. The job must confirm that the data is handled as confidential, arrives as JSON, matches the expected ticket schema, and can be reached reliably before loading production tables. Which validation approach is best?

Options:

  • A. Run classification, format, schema, and connectivity checks before staging load

  • B. Validate only row counts and file size after the load completes

  • C. Convert the response to CSV and bypass schema validation

  • D. Load the API response directly into production and review rejected rows

Best answer: A

Explanation: Acquisition validation should confirm the assumptions that make the incoming data safe and usable before it affects downstream tables. In this scenario, the DBA needs checks for classification (confidential handling), format (JSON), schema (required fields and data types), and connectivity (the partner API is reachable and responding as expected). Running these checks before or during staging reduces the chance of loading misclassified, malformed, incompatible, or unavailable data into production workflows.

Post-load row counts can help with reconciliation, but they do not prove the source classification, JSON structure, expected schema, or API availability. The key takeaway is to validate source assumptions at acquisition time, before trusting the data in production.

  • Production-first loading increases risk because bad or misclassified data can reach production before validation occurs.
  • Row-count-only checks may detect volume differences but miss classification, JSON format, schema compatibility, and API reachability.
  • Format conversion bypass changes the acquisition format and removes the schema check needed to confirm expected fields and types.

Question 8

Topic: Data Integration

A DBA needs to automate daily acquisition of order records from a partner’s hosted platform. The partner does not permit direct database connections or file shares. Access is provided through HTTPS endpoints that return JSON, require token authentication, and support pagination for large result sets. Which connectivity option best fits this need?

Options:

  • A. SOAP

  • B. NFS

  • C. ODBC

  • D. API

Best answer: D

Explanation: The core decision is matching the access method to how the source exposes data. The partner is offering application endpoints over HTTPS with JSON responses, token authentication, and pagination. That pattern is best handled as API-based acquisition, commonly through scheduled requests that parse responses and load the target database. ODBC would be appropriate for direct relational database connectivity through a driver. NFS would be appropriate for shared file-system access. SOAP is a specific XML-based web service style, but the stem describes JSON endpoints rather than SOAP messages.

  • Direct database driver fails because the partner explicitly does not allow direct database connections.
  • File share access fails because no shared file system is available.
  • SOAP service fails because the described interface uses JSON HTTPS endpoints, not SOAP XML envelopes.

Question 9

Topic: Data Integration

A DBA is planning a new feed into a relational analytics warehouse. Which acquisition approach best matches the requirements in the exhibit?

Exhibit: Intake requirements

RequirementDetail
SourceSaaS CRM REST endpoint
ClassificationRestricted PII: email, phone
ConnectivityHTTPS API
FormatJSON payloads
PolicyMask PII before storing it in analytics landing tables

Options:

  • A. Use FTP to receive CSV files and classify PII after loading.

  • B. Use ELT to load raw JSON first, then mask PII in the warehouse.

  • C. Use ETL from the HTTPS API, parse JSON, and mask PII before loading.

  • D. Use JDBC to extract relational rows directly from the SaaS source.

Best answer: C

Explanation: The exhibit points to an ETL acquisition pattern. The source is a SaaS REST endpoint reached through an HTTPS API, and the payloads are JSON, so the acquisition process should call the API and parse JSON rather than assume database connectivity or flat files. The data is classified as restricted PII, and the policy requires masking before the data is stored in analytics landing tables. That transformation-order requirement rules out loading raw data first and transforming later. ETL fits because extraction and transformation occur before loading into the target warehouse. The key takeaway is to let the source type and policy-driven transformation order drive the acquisition choice.

  • Raw-first ELT fails because the policy prohibits storing unmasked PII in landing tables.
  • JDBC extraction fails because the source is an HTTPS REST endpoint, not a directly queried relational source.
  • FTP and CSV fail because the exhibit specifies API connectivity and JSON payloads, not file transfer and flat files.

Question 10

Topic: Data Integration

A nightly acquisition job imports customer records from a partner feed into a staging table. The latest run stopped before loading any rows.

Exhibit: Job evidence

Source connection test: success
Read file: denied for service account svc_ingest
Previous mapping: customer_id, email, status
Incoming header: customer_id, email, status, consent_code
Target staging table: customer_id, email, status

Which troubleshooting path is best?

Options:

  • A. Ignore the schema change until the permission error is resolved

  • B. Add consent_code to production tables immediately

  • C. Restore least-privilege read access, then validate the schema mapping

  • D. Grant administrator access to svc_ingest and rerun the job

Best answer: C

Explanation: The best path is to handle the blocking permission issue first, then investigate the visible schema mismatch before allowing the acquisition to load data. The job cannot read the file with the current service account permissions, so schema troubleshooting alone will not produce a valid test. However, the incoming header contains consent_code, which is not in the previous mapping or target staging table. After restoring only the needed read access, the DBA should compare the feed, mapping, and staging schema, then apply an approved mapping or schema change if required. This avoids both over-permissioning and loading data with an unvalidated structure.

  • Immediate production change skips impact review and change control, and the evidence only proves the staging/mapping needs validation.
  • Administrator access may make the error disappear but violates least privilege and does not address the schema mismatch.
  • Permission-only troubleshooting leaves a known incoming-field mismatch unresolved before the rerun.

Continue with full practice

Use the CompTIA DataSys+ DS0-002 Practice Test page for the full IT Mastery practice bank, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try CompTIA DataSys+ DS0-002 on Web View CompTIA DataSys+ DS0-002 Practice Test

Free review resource

Use the full IT Mastery practice page above for the latest review links and practice page.

Revised on Thursday, May 28, 2026