DA0-002 — CompTIA Data+ V2 Exam Blueprint

Last revised: June 18, 2026

Practical DA0-002 exam blueprint for CompTIA Data+ V2 candidates reviewing data concepts, analytics, visualization, governance, and exam-ready decision skills.

How to Use This Exam Blueprint

Use this page as an independent readiness map for the CompTIA Data+ V2 (DA0-002) exam. It is organized around the types of data tasks, decisions, and analysis concepts a candidate should be able to handle on exam day.

Work through it in three passes:

Recognition pass: Can you define the term and identify it in a scenario?
Application pass: Can you choose the right method, chart, query logic, or control?
Justification pass: Can you explain why the chosen answer is better than the distractors?

Mark an item as ready only when you can apply it to a realistic business or technical data scenario, not just recall a definition.

Topic-Area Readiness Map

Readiness area	What to review	What “ready” looks like
Data lifecycle and analytics workflow	Business question, data acquisition, preparation, analysis, visualization, reporting, monitoring	You can place a task in the correct phase and identify what should happen before analysis begins.
Data types and structures	Structured, semi-structured, unstructured, categorical, numerical, ordinal, nominal, discrete, continuous	You can classify data correctly and choose suitable storage, analysis, and visualization approaches.
Data sources	Databases, flat files, APIs, logs, surveys, transactional systems, external datasets	You can compare source reliability, latency, format, and fitness for a business question.
Databases and storage concepts	OLTP, OLAP, data warehouse, data lake, relational tables, keys, schemas, metadata	You can distinguish operational storage from analytical storage and explain when each fits.
Data acquisition and integration	ETL, ELT, batch, streaming, full load, incremental load, joins, unions, append operations	You can identify integration risks such as mismatched keys, duplicate records, and inconsistent formats.
Data profiling and quality	Completeness, accuracy, consistency, validity, timeliness, uniqueness, outliers, missing values	You can diagnose quality issues and select an appropriate correction or escalation path.
Data preparation	Cleaning, standardization, normalization, deduplication, parsing, type conversion, encoding	You can decide whether to transform, exclude, impute, or preserve data based on context.
Descriptive statistics	Mean, median, mode, range, variance, standard deviation, percentiles, distributions	You can interpret summaries and recognize when an average hides important variation.
Analytical methods	Descriptive, diagnostic, predictive, prescriptive, segmentation, trend analysis, regression, correlation	You can match the method to the question being asked and avoid overclaiming results.
Query and transformation logic	Filtering, sorting, grouping, aggregation, joins, calculated fields, conditional logic	You can read SQL-like logic and predict the resulting dataset or metric.
Visualization and dashboard design	Chart selection, visual encoding, accessibility, filters, drilldowns, KPIs, layout	You can choose a chart that fits the data type, audience, and message.
Reporting and communication	Executive summaries, assumptions, limitations, recommendations, stakeholder review	You can explain findings clearly without hiding uncertainty or data-quality constraints.
Governance and privacy	Data classification, stewardship, lineage, retention, consent, access control, sensitive data	You can identify governance risks and choose appropriate controls.
Ethics and bias	Sampling bias, algorithmic bias, fairness, transparency, misuse of data, correlation vs causation	You can spot misleading conclusions and recommend responsible analysis practices.
Operational readiness	Documentation, reproducibility, monitoring, refresh schedules, version control, handoff	You can explain how an analysis or dashboard should be maintained after delivery.

Analytics Workflow Check

A typical data analytics scenario is not just “run a report.” Be ready to reason through the order of work.

    flowchart LR
        A[Define the business question] --> B[Identify required data]
        B --> C[Acquire or connect to data]
        C --> D[Profile data quality]
        D --> E[Clean and transform]
        E --> F[Analyze]
        F --> G[Visualize and report]
        G --> H[Validate with stakeholders]
        H --> I[Monitor, refresh, and improve]
        D --> B
        H --> A

Can You Do This?

Identify the business decision a dataset is supposed to support.
Separate a stakeholder’s desired output from the real analytical question.
Determine whether more data is needed before analysis.
Recognize when poor data quality makes an analysis unreliable.
Explain assumptions and limitations before presenting conclusions.
Recommend a follow-up metric, dashboard change, or data-quality control.

Data Concepts Checklist

Data Types and Measurement Levels

Concept	Be ready to recognize	Example exam cue
Nominal data	Categories with no natural order	Region, product type, department
Ordinal data	Categories with a ranked order	Satisfaction rating, risk level, priority
Interval data	Numeric scale with meaningful differences but no true zero	Temperature in Celsius or Fahrenheit
Ratio data	Numeric scale with a true zero	Revenue, duration, quantity
Discrete data	Countable values	Number of tickets, transactions, defects
Continuous data	Measurable values on a range	Response time, weight, temperature
Structured data	Organized in rows, columns, defined fields	Relational database table
Semi-structured data	Has tags or keys but flexible structure	JSON, XML, log records
Unstructured data	No fixed tabular structure	Free text, images, audio

Data Source Readiness

Distinguish primary data collected for the current purpose from secondary data reused from another purpose.
Identify risks in third-party or external data, including unknown collection methods and outdated definitions.
Compare data collected from surveys, transactions, sensors, logs, and APIs.
Recognize when a source is authoritative for a specific attribute.
Check whether a dataset’s grain matches the analysis question.
Explain how latency affects reporting: real-time, near-real-time, scheduled batch, or ad hoc refresh.
Identify when a dataset is too aggregated or too detailed for the requested metric.

Grain, Keys, and Relationships

Topic	Ready means you can answer
Grain	What does one row represent? A customer, transaction, day, event, or product?
Primary key	Which field uniquely identifies a record?
Foreign key	Which field links one table to another?
Cardinality	Is the relationship one-to-one, one-to-many, or many-to-many?
Join risk	Will the join duplicate rows, drop records, or create nulls?
Surrogate key	Why might a system-generated key be used instead of a natural key?
Data dictionary	Where are field names, definitions, formats, and business rules documented?
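
The grain and key questions above can be checked mechanically before any join. The following is a minimal sketch in plain Python; the `order_lines` rows, field names, and helper functions are hypothetical and only illustrate the idea of testing a candidate key for uniqueness.

```python
from collections import Counter

# Hypothetical rows: the intended grain is one row per order line.
order_lines = [
    {"order_id": 1001, "line_no": 1, "customer_id": "C1", "amount": 40.0},
    {"order_id": 1001, "line_no": 2, "customer_id": "C1", "amount": 15.0},
    {"order_id": 1002, "line_no": 1, "customer_id": "C2", "amount": 99.0},
]

def is_unique_key(rows, key_fields):
    """Return True when no combination of key_fields repeats across rows."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return all(n == 1 for n in counts.values())

def max_rows_per_key(rows, key_field):
    """Largest number of rows that share a single value of key_field."""
    return max(Counter(row[key_field] for row in rows).values())

# order_id alone repeats, so it cannot be the primary key at this grain.
order_id_alone_is_key = is_unique_key(order_lines, ["order_id"])
# The composite (order_id, line_no) identifies each row.
composite_is_key = is_unique_key(order_lines, ["order_id", "line_no"])
# More than 1 means joining on order_id will repeat the other table's rows.
lines_per_order = max_rows_per_key(order_lines, "order_id")
```

If `lines_per_order` is greater than 1, joining an order-level table to these rows on `order_id` produces a one-to-many result, which is the cue to state the grain before summarizing.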

Data Storage and Architecture Decision Checks

Scenario cue	Strong answer direction
High-volume transaction processing is needed	Consider an operational database or OLTP-oriented system.
Historical trend analysis across multiple systems is needed	Consider a data warehouse or analytical repository.
Raw files in varied formats must be retained for future analysis	Consider a data lake-style approach with governance and metadata.
Business users need consistent KPI definitions	Centralize definitions through semantic layers, governed datasets, or documented metrics.
Analysts need to explore data flexibly	Provide governed access to analytical datasets with appropriate controls.
A report is slow because it queries transactional systems directly	Consider extracts, aggregation, indexing, or analytical storage depending on context.
Data arrives continuously from logs or events	Consider streaming or near-real-time ingestion patterns.
Data is refreshed nightly from source systems	Batch processing may be sufficient.

Common Architecture Traps

Choosing a storage pattern before understanding access patterns.
Treating a data lake as automatically governed because data is centralized.
Running heavy analytical queries on systems intended for transactions.
Ignoring metadata, lineage, and ownership.
Assuming all data must be real-time when scheduled refresh would meet the requirement.
Confusing a dashboard tool with a complete data architecture.

Data Acquisition, Integration, and Preparation

ETL and ELT Readiness

Concept	What to know
ETL	Extract, transform, load. Transformations happen before loading into the target.
ELT	Extract, load, transform. Transformations happen after loading into the target environment.
Full load	Replaces or reloads the entire dataset.
Incremental load	Loads only new or changed records.
Batch processing	Processes data on a schedule or in groups.
Streaming	Processes data as events arrive.
Data mapping	Aligns fields from source to target definitions.
Data validation	Confirms that values meet expected rules before or after loading.
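
The difference between a full load and an incremental load can be sketched with in-memory data. This is an illustration only: the `target` and `source` records, the `updated_at` watermark field, and both functions are assumptions made for the example, not part of any specific tool.

```python
# Hypothetical target table keyed by customer_id, plus a fresh source extract.
target = {
    "C1": {"customer_id": "C1", "status": "active", "updated_at": "2026-01-05"},
    "C2": {"customer_id": "C2", "status": "active", "updated_at": "2026-01-06"},
}
source = [
    {"customer_id": "C1", "status": "active", "updated_at": "2026-01-05"},  # unchanged
    {"customer_id": "C2", "status": "closed", "updated_at": "2026-02-01"},  # changed
    {"customer_id": "C3", "status": "active", "updated_at": "2026-02-02"},  # new
]

def full_load(source_rows):
    """Rebuild the whole target from the source extract."""
    return {row["customer_id"]: dict(row) for row in source_rows}

def incremental_load(target_rows, source_rows, last_loaded):
    """Upsert only rows changed after the last_loaded watermark (ISO date string)."""
    result = dict(target_rows)
    changed = [r for r in source_rows if r["updated_at"] > last_loaded]
    for row in changed:
        result[row["customer_id"]] = dict(row)
    return result, len(changed)

after_incremental, rows_touched = incremental_load(target, source, "2026-01-06")
after_full = full_load(source)
```

Here both approaches end with the same target, but the incremental load touches only two rows. A watermark-based load like this one does not notice records deleted at the source, which is one reason a periodic full load or reconciliation check may still be needed.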

Preparation Checklist

Identify null, blank, malformed, duplicated, or invalid values.
Confirm data types before aggregation or filtering.
Standardize date, time, currency, units, and categorical values.
Normalize inconsistent labels such as “NY,” “New York,” and “N.Y.”
Detect duplicate records and determine whether they are true duplicates or valid repeated events.
Parse combined fields when needed, such as full name or address components.
Create derived fields only when the formula and assumptions are documented.
Preserve raw data or a reproducible transformation path.
Validate record counts before and after transformation.
Check whether joins changed the number of rows unexpectedly.
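
Several items in the checklist above (label standardization, type conversion, deduplication, and row-count validation) can be sketched together. The rows, the `STATE_LABELS` mapping, and the helper names are hypothetical; a real mapping would come from a documented business rule.

```python
# Hypothetical raw rows with inconsistent state labels and one repeated load.
raw = [
    {"txn_id": "T1", "state": "NY", "amount": "100.50"},
    {"txn_id": "T2", "state": "New York", "amount": "20"},
    {"txn_id": "T2", "state": "New York", "amount": "20"},  # same transaction loaded twice
    {"txn_id": "T3", "state": "N.Y.", "amount": "7.25"},
]

# Assumed mapping from observed labels to one canonical value.
STATE_LABELS = {"ny": "NY", "new york": "NY", "n.y.": "NY"}

def standardize(row):
    """Return a cleaned copy with a canonical state label and a numeric amount."""
    label = row["state"].strip().lower()
    return {
        "txn_id": row["txn_id"],
        "state": STATE_LABELS.get(label, row["state"].strip()),
        "amount": float(row["amount"]),  # convert the type before aggregating
    }

def deduplicate(rows, key):
    """Keep the first row for each key value, preserving input order."""
    seen, kept = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept

cleaned = deduplicate([standardize(r) for r in raw], key="txn_id")
rows_before, rows_after = len(raw), len(cleaned)  # record counts for validation
total_amount = sum(r["amount"] for r in cleaned)
```

The raw list is left untouched, so the transformation path stays reproducible. Deduplicating on `txn_id` assumes a repeated ID is a true duplicate; if repeated rows can be valid events, the key needs to change.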

Missing Data Decision Guide

Situation	Better exam-day reasoning
A few random missing values	Consider whether removal or imputation affects results.
Missing values concentrated in one group	Investigate bias before excluding records.
Missing value means “not applicable”	Do not treat it the same as unknown.
Missing value means zero	Convert only if the business rule supports it.
Critical field is missing	Escalate, validate source logic, or exclude from specific analysis.
Missingness is caused by system failure	Address data pipeline or collection process, not only the dataset.
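
The guide above depends on keeping different kinds of "missing" apart. The sketch below shows one way to do that for a hypothetical discount field; treating a blank as unknown is an assumption that a real business rule would have to confirm.

```python
# Hypothetical raw values for one field; each encoding may mean something different.
raw_values = ["0", "", "N/A", None, "12.5"]

def interpret(value):
    """Map a raw value to (status, number) without collapsing distinct meanings."""
    if value is None:
        return ("unknown", None)         # true null: nothing was recorded
    text = value.strip()
    if text == "":
        return ("unknown", None)         # blank: assumed here to mean unknown
    if text.upper() == "N/A":
        return ("not_applicable", None)  # the field does not apply to this record
    return ("known", float(text))        # includes a genuine zero

interpreted = [interpret(v) for v in raw_values]
known = [number for status, number in interpreted if status == "known"]

mean_known = sum(known) / len(known)                # average over known values only
mean_if_zero_filled = sum(known) / len(raw_values)  # what blanket zero-filling would report
```

With these values the two averages differ by more than a factor of two, which shows why converting missing values to zero is only safe when the business rule supports it.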

Data Quality Checklist

Dimension	Question to ask	Example issue
Accuracy	Does the value reflect reality?	Incorrect customer age
Completeness	Are required values present?	Missing product category
Consistency	Do values agree across systems?	Different customer status in two systems
Validity	Does the value follow expected format or rule?	Invalid date or impossible value
Timeliness	Is the data current enough?	Dashboard uses stale records
Uniqueness	Are records duplicated?	Same transaction loaded twice
Integrity	Do relationships remain valid?	Order references a missing customer
Relevance	Does the data answer the question?	Collecting clicks for a retention analysis without user identifiers
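
A few of the dimensions in the table above translate directly into checks. The following is a rough profiling sketch over hypothetical `customers` and `orders` rows; it covers completeness, validity, uniqueness, and integrity only, and the field names are assumptions.

```python
from datetime import date

# Hypothetical tables; orders reference customers through customer_id.
customers = [{"customer_id": "C1"}, {"customer_id": "C2"}]
orders = [
    {"order_id": 1, "customer_id": "C1", "order_date": "2026-01-15", "category": "Toys"},
    {"order_id": 2, "customer_id": "C9", "order_date": "2026-01-20", "category": "Books"},
    {"order_id": 3, "customer_id": "C2", "order_date": "2026-02-30", "category": None},
    {"order_id": 3, "customer_id": "C2", "order_date": "2026-03-01", "category": "Books"},
]

def is_valid_iso_date(text):
    """Validity: the value must parse as a real calendar date."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False

known_customers = {c["customer_id"] for c in customers}
order_ids = [o["order_id"] for o in orders]

profile = {
    # Completeness: share of rows with a category present.
    "completeness_category": sum(o["category"] is not None for o in orders) / len(orders),
    # Validity: order IDs whose date is not a real calendar date.
    "invalid_dates": [o["order_id"] for o in orders if not is_valid_iso_date(o["order_date"])],
    # Uniqueness: order IDs that appear more than once.
    "duplicate_order_ids": sorted({i for i in order_ids if order_ids.count(i) > 1}),
    # Integrity: orders that reference a customer missing from the customer table.
    "orphan_orders": [o["order_id"] for o in orders if o["customer_id"] not in known_customers],
}
```

Accuracy, timeliness, and relevance usually cannot be checked from the dataset alone; they need a trusted reference, a refresh timestamp, or the business question.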

Can You Diagnose the Quality Problem?

A sudden revenue spike appears after a new import process.
A customer count increases after joining customer and order tables.
A dashboard total differs from the finance system.
Date filters exclude records because time zones differ.
“Unknown” values are mixed with blank values and true nulls.
Survey responses overrepresent one customer segment.
A calculated KPI changed after a field definition was updated.

Statistics and Calculation Readiness

The CompTIA Data+ V2 (DA0-002) exam may require practical interpretation of common analytical measures. Focus on knowing when to use each measure and how to interpret the result.

Core Measures

Measure	Use it for	Watch out for
Mean	General average	Sensitive to outliers
Median	Middle value; preferred for skewed distributions	Does not show how extreme the outliers are
Mode	Most frequent value	May be non-numeric or have multiple modes
Range	Spread from minimum to maximum	Can be distorted by outliers
Variance	Average squared deviation from the mean	Expressed in squared units, so less intuitive for business users
Standard deviation	Typical distance from the mean	Needs context, such as the mean and the units, to interpret
Percentile	Relative standing	Interpretation depends on the underlying distribution
Rate	Events divided by opportunities or population	Numerator and denominator must cover the same population and period
Ratio	Relationship between two quantities	Units and definitions matter
Percentage change	Relative increase or decrease	A small baseline exaggerates the change, and a zero baseline makes it undefined

Formulas to Be Comfortable With

\[ \text{Percentage change} = \frac{\text{New value} - \text{Old value}}{\text{Old value}} \times 100 \]

\[ \text{Weighted average} = \frac{\sum(\text{value} \times \text{weight})}{\sum \text{weight}} \]

\[ z = \frac{x - \mu}{\sigma} \]

\[ \text{Conversion rate} = \frac{\text{Conversions}}{\text{Total eligible events}} \times 100 \]

\[ \text{Error rate} = \frac{\text{Errors}}{\text{Total observations}} \times 100 \]
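
As a quick self-check, the formulas above can be written as small functions and run on made-up numbers. The function names and sample values below are illustrative assumptions; in the z-score, x is the observed value, μ the mean, and σ the standard deviation.

```python
def percentage_change(old_value, new_value):
    """(new - old) / old * 100; undefined when the old value is zero."""
    if old_value == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (new_value - old_value) / old_value * 100

def weighted_average(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

def rate_percent(events, total):
    """Shared form of the conversion-rate and error-rate formulas."""
    return events / total * 100

sales_change = percentage_change(80, 100)       # sales rose from 80 to 100
course_grade = weighted_average([90, 70], [3, 1])  # exam weighted 3x the quiz
response_z = z_score(130, mean=100, std_dev=15)    # how unusual is 130?
conversion_rate = rate_percent(25, 200)            # 25 conversions in 200 eligible visits
```

These inputs give a 25 percent increase, a weighted average of 85, a z-score of 2, and a conversion rate of 12.5 percent. Note that the unweighted average of 90 and 70 would be 80, so ignoring the weights changes the answer.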

Calculation Checklist

Calculate a percentage increase or decrease.
Choose median instead of mean when outliers distort the average.
Interpret a standard deviation in business language.
Explain whether a value is unusually high or low using a z-score concept.
Compare rates fairly when group sizes differ.
Avoid comparing raw counts when the denominator matters.
Recognize when a metric needs weighting.
Identify whether a KPI is a count, sum, average, ratio, or rate.
Check whether time periods are comparable before reporting growth.
Distinguish cumulative totals from period-specific values.
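
Three of the checklist items above (median versus mean, rates versus raw counts, and cumulative versus period values) are easy to see with small numbers. The data below is invented for illustration.

```python
from statistics import mean, median

# Hypothetical ticket resolution times in hours; one ticket took 60 hours.
resolution_hours = [2, 3, 3, 4, 4, 5, 60]
mean_hours = mean(resolution_hours)      # pulled upward by the single outlier
median_hours = median(resolution_hours)  # middle value, unaffected by the outlier

# Hypothetical defect data: plant A has more defects but far more units produced.
plants = {"A": {"defects": 50, "units": 10_000}, "B": {"defects": 20, "units": 1_000}}
defect_rates = {name: p["defects"] / p["units"] * 100 for name, p in plants.items()}
worst_by_count = max(plants, key=lambda name: plants[name]["defects"])
worst_by_rate = max(defect_rates, key=defect_rates.get)

# Cumulative signups at the end of each month, converted to period-specific values.
cumulative_signups = [100, 250, 450]
period_signups = [cumulative_signups[0]] + [
    later - earlier for earlier, later in zip(cumulative_signups, cumulative_signups[1:])
]
```

The mean resolution time is above 11 hours while the median is 4, and the plant with the most defects is not the plant with the highest defect rate. Reporting the cumulative series as if it were monthly values would overstate growth.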

Analysis Method Checklist

Analysis type	Purpose	Scenario cue
Descriptive	Explain what happened	Monthly sales summary, ticket counts
Diagnostic	Explain why it happened	Root-cause investigation, drilldown by region
Predictive	Estimate what may happen	Forecast demand, predict churn risk
Prescriptive	Recommend what to do	Optimize staffing, suggest next-best action
Exploratory	Find patterns or generate hypotheses	Initial review of an unfamiliar dataset
Confirmatory	Test a specific hypothesis	Validate whether a change affected conversion
Segmentation	Compare groups	Customer cohorts, product categories
Trend analysis	Examine change over time	Revenue by month, defect rate by week
Correlation analysis	Measure relationship between variables	Relationship between ad spend and leads
Regression	Model relationship or estimate outcome	Predict sales from multiple factors
Classification	Assign categories	Flag transactions as likely fraudulent
Clustering	Group similar records	Discover customer segments

Interpretation Traps

Correlation does not prove causation.
A statistically interesting result may not be operationally useful.
A business-significant change may be hidden by aggregation.
A trend can be caused by seasonality, data collection changes, or missing periods.
A model can perform well overall while failing for an important subgroup.
A sample can be large and still biased.
A dashboard can be precise without being accurate.
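
The trap about aggregation hiding a meaningful change can be shown with a small, invented funnel. In this sketch the segment names and counts are hypothetical, chosen so that the overall rate stays flat while the segments move in opposite directions.

```python
# Hypothetical funnel data: (conversions, visits) per segment and period.
funnel = {
    "last_month": {"new": (100, 1000), "returning": (100, 1000)},
    "this_month": {"new": (50, 1000), "returning": (150, 1000)},
}

def rate_percent(conversions, visits):
    """Conversion rate as a percentage."""
    return conversions * 100 / visits

def overall_rate(segments):
    """Rate computed from summed conversions and summed visits."""
    total_conversions = sum(c for c, _ in segments.values())
    total_visits = sum(v for _, v in segments.values())
    return rate_percent(total_conversions, total_visits)

overall = {period: overall_rate(segments) for period, segments in funnel.items()}
by_segment = {
    period: {name: rate_percent(c, v) for name, (c, v) in segments.items()}
    for period, segments in funnel.items()
}
```

The overall rate is 10 percent in both months, yet the rate for new visitors halves from 10 to 5 percent. A dashboard that shows only the aggregate would report no change.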

Query and Transformation Logic Readiness

You do not need to think like a database administrator for every data scenario, but you should be able to read common query patterns and understand their effect.

SQL-Like Pattern to Recognize

SELECT
    region,
    COUNT(*) AS order_count,
    SUM(order_amount) AS total_sales
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY region
HAVING SUM(order_amount) > 10000
ORDER BY total_sales DESC;
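
To check a prediction about the query above, it can be run against a few made-up rows. This sketch uses Python's built-in `sqlite3` module; the table contents are hypothetical, and other database engines may differ in details such as date handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, order_date TEXT, order_amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("East", "2025-12-30", 50000.0),   # dropped by WHERE: before 2026-01-01
        ("East", "2026-01-10", 6000.0),
        ("East", "2026-02-03", 7000.0),
        ("West", "2026-01-12", 4000.0),
        ("West", "2026-03-01", 5000.0),    # West totals 9000: dropped by HAVING
        ("North", "2026-02-20", 20000.0),
    ],
)

query = """
SELECT
    region,
    COUNT(*) AS order_count,
    SUM(order_amount) AS total_sales
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY region
HAVING SUM(order_amount) > 10000
ORDER BY total_sales DESC;
"""
rows = conn.execute(query).fetchall()
conn.close()
```

With this data, `rows` holds North (1 order, 20000) followed by East (2 orders, 13000). The large December order never reaches the East total because WHERE removes it before grouping, and West disappears only after its group total is computed.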

Can You Predict What Happens?

WHERE filters rows before aggregation.
GROUP BY creates one result row per group.
HAVING filters aggregated groups.
ORDER BY sorts the final result.
COUNT(*) counts rows, including rows with nulls in other fields.
COUNT(column_name) counts non-null values in that column.
An inner join returns matching records only.
A left join keeps all records from the left table and adds matches from the right table.
A many-to-many join can inflate totals.
Aggregating before joining can sometimes prevent duplicate-driven overcounting.
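
The counting and join rules above can be verified on a tiny dataset. The sketch below again uses Python's built-in `sqlite3` module; the `customers` and `orders` tables and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT, email TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id TEXT);
    INSERT INTO customers VALUES ('C1', 'a@example.com'), ('C2', NULL), ('C3', 'c@example.com');
    INSERT INTO orders VALUES (1, 'C1'), (2, 'C1'), (3, 'C2');
""")

# COUNT(*) counts every row; COUNT(email) skips the row where email is NULL.
count_rows, count_emails = conn.execute(
    "SELECT COUNT(*), COUNT(email) FROM customers"
).fetchone()

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.customer_id, o.order_id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# LEFT JOIN keeps every customer; C3 has no order, so order_id is NULL (None).
left_rows = conn.execute("""
    SELECT c.customer_id, o.order_id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()
conn.close()
```

Customer C1 appears twice in both results because it has two orders, so counting rows after the join does not count customers.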

Join Decision Table

Join situation	What to check
Customer table joined to order table	One customer may have many orders, so customer rows can repeat.
Product table joined to sales table	Missing product IDs may create null product details.
Two fact tables joined directly	Risk of many-to-many multiplication.
Lookup table has duplicate keys	Join may create duplicate output rows.
Left join produces many nulls	Keys may not match due to format, timing, or missing reference data.
Totals change after join	Check row counts, key uniqueness, and aggregation level.
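
The last row of the table above, totals changing after a join, is worth seeing once with numbers. In this sketch the `orders` and `shipments` tables are hypothetical; order 1 ships in two parcels, so a direct join repeats its amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Baseline: revenue at the order grain.
true_revenue = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Direct join: order 1 matches two shipments, so its amount is summed twice.
inflated_revenue = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.order_id
""").fetchone()[0]

# Aggregate shipments to one row per order first, then join at the same grain.
safe_revenue, shipment_count = conn.execute("""
    SELECT SUM(o.amount), SUM(s.shipment_count)
    FROM orders o
    JOIN (
        SELECT order_id, COUNT(*) AS shipment_count
        FROM shipments
        GROUP BY order_id
    ) s ON s.order_id = o.order_id
""").fetchone()
conn.close()
```

The direct join reports 250 instead of 150. Aggregating the many-side table before joining restores the correct revenue and still yields the shipment count.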

Visualization and Dashboard Checklist

Chart Selection

Goal	Better chart choices	Common trap
Compare categories	Bar chart, column chart	Using pie charts with too many slices
Show trend over time	Line chart, area chart	Using unordered categories on the x-axis
Show part-to-whole	Stacked bar, 100 percent stacked bar, pie or donut chart with few slices	Hiding small but important segments
Show distribution	Histogram, box plot	Reporting only the mean
Show relationship	Scatter plot	Implying causation from visual association
Show geographic pattern	Map	Using maps when location is not analytically important
Show KPI status	Scorecard, gauge, bullet chart	Missing target, time period, or definition
Show process flow	Flow diagram	Overloading a dashboard with workflow detail

Dashboard Design Readiness

Identify the dashboard audience: executive, manager, analyst, operational team.
Place the most important KPI where it is seen first.
Include definitions for metrics that may be interpreted differently.
Use consistent colors, scales, and date ranges.
Avoid truncated axes that exaggerate differences.
Use filters that match user decision needs.
Provide drilldown where users need investigation, not just summary.
Show data freshness or refresh date when it affects trust.
Make visualizations accessible with readable labels and sufficient contrast.
Avoid chart junk, unnecessary 3D effects, and decorative elements that reduce clarity.

Scenario Checks

Scenario	Strong response
Executives need a quick view of performance against targets	Use concise KPIs, trend indicators, and exception-focused visuals.
Analysts need to investigate root causes	Provide filters, drilldowns, detail tables, and segmentation.
A chart shows a huge change because the y-axis starts near the data values	Identify the truncated axis and recommend a zero-based or clearly labeled scale.
A pie chart has many small categories	Replace with a sorted bar chart or group minor categories if appropriate.
A dashboard combines metrics from different time periods	Align periods or clearly label differences.
A color palette makes red/green status difficult to read	Use accessible colors, labels, or symbols.
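
For the truncated-axis scenario above, the size of the distortion can be estimated without drawing anything. This is a rough sketch with invented values; `drawn_height_ratio` is a hypothetical helper that compares bar heights as they would appear for a given axis start.

```python
def drawn_height_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of bar heights (b over a) as drawn when the y-axis begins at axis_start."""
    if min(value_a, value_b) <= axis_start:
        raise ValueError("both values must be above the axis start")
    return (value_b - axis_start) / (value_a - axis_start)

last_quarter, this_quarter = 100.0, 104.0

# With a zero baseline, the bars differ by the true 4 percent.
honest_ratio = drawn_height_ratio(last_quarter, this_quarter)

# With the axis starting at 98, the second bar is drawn three times as tall.
truncated_ratio = drawn_height_ratio(last_quarter, this_quarter, axis_start=98.0)
```

A 4 percent increase drawn as a bar three times taller is the kind of exaggeration the scenario asks a candidate to spot.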

Governance, Security, and Privacy Checklist

Governance Artifacts and Roles

Item	What to know
Data owner	Accountable for data domain decisions and access approval.
Data steward	Helps maintain definitions, quality, and proper use.
Data custodian	Manages technical storage, protection, and availability.
Data dictionary	Defines fields, formats, accepted values, and meanings.
Metadata	Describes data, including source, structure, lineage, and context.
Data lineage	Shows where data came from and how it changed.
Data catalog	Helps users discover governed datasets.
Retention policy	Defines how long data is kept and when it is disposed.
Classification	Labels data by sensitivity or business importance.
Access control	Limits use based on role, need, and authorization.

Privacy and Security Controls

Identify sensitive data such as personal, financial, health, credential, or confidential business data.
Apply least privilege when granting access.
Use role-based access where appropriate.
Mask or tokenize sensitive fields when full values are not needed.
Distinguish anonymization from pseudonymization.
Protect data in transit and at rest where required by policy.
Avoid exporting sensitive data to unmanaged locations.
Apply retention and disposal rules.
Maintain auditability for access and changes.
Escalate when a requested report exposes unnecessary sensitive data.
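
Masking and pseudonymization, both mentioned in the list above, behave differently in practice. The sketch below uses only Python's standard `hashlib` and `hmac` modules; the function names, sample values, and key are hypothetical, and a real key would be held in a managed secrets store rather than in code.

```python
import hashlib
import hmac

def mask_card(card_number, visible=4):
    """Masking: hide all but the last few digits for display purposes."""
    digits = card_number.replace(" ", "").replace("-", "")
    return "*" * (len(digits) - visible) + digits[-visible:]

def pseudonymize(value, secret_key):
    """Pseudonymization: a keyed hash gives a stable token for the same input."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

SECRET_KEY = b"example-only-key"  # hypothetical; never hard-code a real key

masked = mask_card("4111 1111 1111 1111")
token_a = pseudonymize("alice@example.com", SECRET_KEY)
token_b = pseudonymize("alice@example.com", SECRET_KEY)
token_c = pseudonymize("bob@example.com", SECRET_KEY)
```

Because the same email always maps to the same token, analysts can still join and count by customer without seeing the address. Anyone who holds the key can recompute tokens for known values and re-link records, which is why pseudonymized data is not the same as anonymized data.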

Ethics and Responsible Use

Risk	Exam-ready response
Biased sample	Question representativeness before drawing conclusions.
Proxy variable	Recognize that a harmless-looking field may stand in for a sensitive trait.
Overcollection	Collect only data needed for the stated purpose.
Reidentification	Avoid assuming de-identified data is risk-free.
Misleading visualization	Correct scale, labels, context, or chart type.
Unsupported causal claim	State that the data shows association unless causation is justified.
Hidden limitation	Disclose data gaps, assumptions, and confidence concerns.

Business Requirements and Stakeholder Communication

Requirements Checklist

Identify the decision the stakeholder needs to make.
Define the audience and level of detail.
Confirm KPI definitions before building reports.
Clarify time period, filters, exclusions, and update frequency.
Determine whether the output should be a one-time analysis, recurring report, or dashboard.
Capture acceptance criteria.
Confirm security and access requirements.
Document assumptions and known data limitations.
Validate results with subject matter experts.
Explain tradeoffs between speed, accuracy, completeness, and maintainability.

Good Questions to Ask in a Scenario

If the prompt says…	Ask yourself…
“The manager wants a dashboard”	What decision will the dashboard support?
“Sales are down”	Compared to what period, target, segment, or baseline?
“The data is inaccurate”	Which quality dimension is failing?
“Users disagree on the metric”	Is there a documented business definition?
“The report is too slow”	Is the issue source system performance, query design, volume, or dashboard design?
“The analysis must be real time”	Is real time truly required, or is frequent refresh enough?
“The dataset contains customer information”	What privacy, masking, retention, and access controls apply?

Scenario-Based Decision Points

Use these prompts to test exam judgment. For each one, identify the best next action before looking for a tool-specific answer.

Scenario	What a ready candidate notices
A report total does not match the source system	Check filters, timing, definitions, joins, and transformation logic.
A stakeholder asks for all customer data “just in case”	Apply data minimization and clarify the business need.
A model performs well overall but poorly for a subgroup	Investigate bias, representation, and segment-level performance.
A chart shows average ticket resolution time only	Ask whether median, distribution, or outliers are needed.
A table join doubles revenue	Check relationship cardinality and duplicate keys.
A dashboard is refreshed monthly but users make daily decisions	Assess update frequency against business need.
A dataset contains multiple date fields	Clarify whether to use order date, ship date, close date, or event date.
A survey result is used to represent all customers	Check sample size, response bias, and population coverage.
A field has values of “0,” blank, “N/A,” and null	Determine the business meaning of each before cleaning.
A KPI is green but customer complaints increased	Check whether the KPI measures the right outcome.

“Can You Do This?” Master Checklist

Concepts and Definitions

Explain the difference between data, information, and insight.
Identify structured, semi-structured, and unstructured data.
Classify variables as categorical or numerical.
Distinguish nominal, ordinal, interval, and ratio data.
Explain the difference between OLTP and OLAP use cases.
Describe the purpose of metadata, lineage, and a data dictionary.
Define data quality dimensions and recognize examples.
Explain ETL and ELT at a practical level.
Distinguish descriptive, diagnostic, predictive, and prescriptive analytics.
Explain why correlation is not causation.

Practical Analysis Skills

Choose an appropriate metric for a business question.
Calculate and interpret percentage change.
Select mean, median, or mode appropriately.
Recognize when outliers affect interpretation.
Compare rates instead of raw counts when group sizes differ.
Read a basic aggregation query.
Predict the effect of filters, joins, and grouping.
Identify data-quality problems from symptoms.
Recommend a cleaning approach based on business meaning.
Validate an analysis against known totals or source records.

Visualization and Reporting Skills

Choose a chart based on comparison, trend, relationship, distribution, or part-to-whole needs.
Identify misleading visualizations.
Design a dashboard for the intended audience.
Include context, targets, and definitions for KPIs.
Communicate limitations without undermining useful findings.
Recommend next steps based on data rather than personal preference.
Explain findings in business language.
Identify when a table is better than a chart.
Use filters and drilldowns appropriately.
Avoid clutter and unnecessary visual complexity.

Governance and Risk Skills

Identify sensitive or regulated data in a scenario.
Recommend least-privilege access.
Know when masking, anonymization, or pseudonymization may be appropriate.
Recognize retention and disposal concerns.
Explain why lineage matters for trust.
Identify ethical risks in collection, analysis, or reporting.
Detect sampling bias or representational gaps.
Avoid unsupported causal claims.
Escalate governance issues when data use is unclear.
Document assumptions, definitions, and transformation steps.

Common Weak Areas and Traps

Weak area	Why candidates miss it	How to fix it
Join logic	They memorize join names but do not track row counts	Practice predicting output rows and duplicate effects.
Metric definitions	They calculate correctly but use the wrong denominator	Always identify numerator, denominator, period, and exclusions.
Data grain	They aggregate data without knowing what one row represents	State the grain before joining or summarizing.
Missing data	They delete nulls automatically	Determine whether null means unknown, not applicable, zero, or error.
Chart selection	They choose familiar charts instead of purpose-fit charts	Match chart to comparison, trend, distribution, or relationship.
Outliers	They ignore how extreme values affect mean and scale	Compare mean vs median and inspect distribution.
Correlation	They infer cause from relationship	Look for experimental design, controls, or plausible alternatives.
Governance	They treat access as a convenience issue	Apply classification, least privilege, and business need.
Data freshness	They overlook refresh schedules	Match latency to decision timing.
Communication	They present numbers without context	Include baseline, target, limitation, and recommended action.

Final-Week Review Checklist

Seven to Five Days Out

Re-read CompTIA's official exam objectives for Data+ V2 (DA0-002).
Mark each topic in this checklist as strong, shaky, or weak.
Prioritize weak areas that affect many scenarios: joins, quality, metrics, visualization, and governance.
Redo missed practice questions by explaining why each wrong answer is wrong.
Build a one-page formula and interpretation sheet.
Review chart selection and misleading-visual examples.
Practice scenario questions without looking up definitions.

Four to Two Days Out

Complete mixed-topic practice sets under timed conditions.
Review all missed questions by topic, not just by score.
Practice reading SQL-like queries and join scenarios.
Rehearse data-quality diagnosis from symptoms.
Review privacy, access, masking, lineage, and retention concepts.
Memorize only what supports decisions; focus on application.
Make a short list of recurring traps you personally miss.

Day Before

Do a light review of formulas, chart choices, and governance terms.
Review your personal weak-area notes.
Avoid cramming unfamiliar advanced material.
Prepare identification, scheduling details, and testing environment requirements.
Sleep and reset; scenario judgment is harder when fatigued.

Exam-Day Mindset

Read the business goal before choosing a technical answer.
Watch for words such as “best,” “first,” “most appropriate,” and “least.”
Eliminate answers that ignore data quality, privacy, or stakeholder requirements.
Check whether the scenario is asking for diagnosis, visualization, calculation, governance, or communication.
Do not overengineer; choose the answer that fits the stated requirement.
Flag time-consuming questions and return if needed.

Practical Next Step

Pick the three weakest areas from this checklist and complete targeted practice on those topics first. Then move to mixed, timed practice so you can apply CompTIA Data+ V2 (DA0-002) concepts under exam-like conditions and justify each answer choice.
