DA0-002 — CompTIA Data+ V2 Exam Blueprint

Practical DA0-002 exam blueprint for CompTIA Data+ V2 candidates reviewing data concepts, analytics, visualization, governance, and exam-ready decision skills.

How to Use This Exam Blueprint

Use this page as an independent readiness map for the CompTIA Data+ V2 (DA0-002) exam. It is organized around the types of data tasks, decisions, and analysis concepts a candidate should be able to handle on exam day.

Work through it in three passes:

  1. Recognition pass: Can you define the term and identify it in a scenario?
  2. Application pass: Can you choose the right method, chart, query logic, or control?
  3. Justification pass: Can you explain why the chosen answer is better than the distractors?

Mark an item as ready only when you can apply it to a realistic business or technical data scenario, not just recall a definition.

Topic-Area Readiness Map

| Readiness area | What to review | What “ready” looks like |
| --- | --- | --- |
| Data lifecycle and analytics workflow | Business question, data acquisition, preparation, analysis, visualization, reporting, monitoring | You can place a task in the correct phase and identify what should happen before analysis begins. |
| Data types and structures | Structured, semi-structured, unstructured, categorical, numerical, ordinal, nominal, discrete, continuous | You can classify data correctly and choose suitable storage, analysis, and visualization approaches. |
| Data sources | Databases, flat files, APIs, logs, surveys, transactional systems, external datasets | You can compare source reliability, latency, format, and fitness for a business question. |
| Databases and storage concepts | OLTP, OLAP, data warehouse, data lake, relational tables, keys, schemas, metadata | You can distinguish operational storage from analytical storage and explain when each fits. |
| Data acquisition and integration | ETL, ELT, batch, streaming, full load, incremental load, joins, unions, append operations | You can identify integration risks such as mismatched keys, duplicate records, and inconsistent formats. |
| Data profiling and quality | Completeness, accuracy, consistency, validity, timeliness, uniqueness, outliers, missing values | You can diagnose quality issues and select an appropriate correction or escalation path. |
| Data preparation | Cleaning, standardization, normalization, deduplication, parsing, type conversion, encoding | You can decide whether to transform, exclude, impute, or preserve data based on context. |
| Descriptive statistics | Mean, median, mode, range, variance, standard deviation, percentiles, distributions | You can interpret summaries and recognize when an average hides important variation. |
| Analytical methods | Descriptive, diagnostic, predictive, prescriptive, segmentation, trend analysis, regression, correlation | You can match the method to the question being asked and avoid overclaiming results. |
| Query and transformation logic | Filtering, sorting, grouping, aggregation, joins, calculated fields, conditional logic | You can read SQL-like logic and predict the resulting dataset or metric. |
| Visualization and dashboard design | Chart selection, visual encoding, accessibility, filters, drilldowns, KPIs, layout | You can choose a chart that fits the data type, audience, and message. |
| Reporting and communication | Executive summaries, assumptions, limitations, recommendations, stakeholder review | You can explain findings clearly without hiding uncertainty or data-quality constraints. |
| Governance and privacy | Data classification, stewardship, lineage, retention, consent, access control, sensitive data | You can identify governance risks and choose appropriate controls. |
| Ethics and bias | Sampling bias, algorithmic bias, fairness, transparency, misuse of data, correlation vs causation | You can spot misleading conclusions and recommend responsible analysis practices. |
| Operational readiness | Documentation, reproducibility, monitoring, refresh schedules, version control, handoff | You can explain how an analysis or dashboard should be maintained after delivery. |

Analytics Workflow Check

A typical data analytics scenario is not just “run a report.” Be ready to reason through the order of work.

    flowchart LR
        A[Define the business question] --> B[Identify required data]
        B --> C[Acquire or connect to data]
        C --> D[Profile data quality]
        D --> E[Clean and transform]
        E --> F[Analyze]
        F --> G[Visualize and report]
        G --> H[Validate with stakeholders]
        H --> I[Monitor, refresh, and improve]
        D --> B
        H --> A

Can You Do This?

  • Identify the business decision a dataset is supposed to support.
  • Separate a stakeholder’s desired output from the real analytical question.
  • Determine whether more data is needed before analysis.
  • Recognize when poor data quality makes an analysis unreliable.
  • Explain assumptions and limitations before presenting conclusions.
  • Recommend a follow-up metric, dashboard change, or data-quality control.

Data Concepts Checklist

Data Types and Measurement Levels

| Concept | Be ready to recognize | Example exam cue |
| --- | --- | --- |
| Nominal data | Categories with no natural order | Region, product type, department |
| Ordinal data | Categories with a ranked order | Satisfaction rating, risk level, priority |
| Interval data | Numeric scale with meaningful differences but no true zero | Temperature in Celsius or Fahrenheit |
| Ratio data | Numeric scale with a true zero | Revenue, duration, quantity |
| Discrete data | Countable values | Number of tickets, transactions, defects |
| Continuous data | Measurable values on a range | Response time, weight, temperature |
| Structured data | Organized in rows, columns, and defined fields | Relational database table |
| Semi-structured data | Has tags or keys but a flexible structure | JSON, XML, log records |
| Unstructured data | No fixed tabular structure | Free text, images, audio |

Data Source Readiness

  • Distinguish primary data collected for the current purpose from secondary data reused from another purpose.
  • Identify risks in third-party or external data, including unknown collection methods and outdated definitions.
  • Compare data collected from surveys, transactions, sensors, logs, and APIs.
  • Recognize when a source is authoritative for a specific attribute.
  • Check whether a dataset’s grain matches the analysis question.
  • Explain how latency affects reporting: real-time, near-real-time, scheduled batch, or ad hoc refresh.
  • Identify when a dataset is too aggregated or too detailed for the requested metric.

Grain, Keys, and Relationships

| Topic | Ready means you can answer |
| --- | --- |
| Grain | What does one row represent? A customer, transaction, day, event, or product? |
| Primary key | Which field uniquely identifies a record? |
| Foreign key | Which field links one table to another? |
| Cardinality | Is the relationship one-to-one, one-to-many, or many-to-many? |
| Join risk | Will the join duplicate rows, drop records, or create nulls? |
| Surrogate key | Why might a system-generated key be used instead of a natural key? |
| Data dictionary | Where are field names, definitions, formats, and business rules documented? |

Data Storage and Architecture Decision Checks

| Scenario cue | Strong answer direction |
| --- | --- |
| High-volume transaction processing is needed | Consider an operational database or OLTP-oriented system. |
| Historical trend analysis across multiple systems is needed | Consider a data warehouse or analytical repository. |
| Raw files in varied formats must be retained for future analysis | Consider a data lake-style approach with governance and metadata. |
| Business users need consistent KPI definitions | Centralize definitions through semantic layers, governed datasets, or documented metrics. |
| Analysts need to explore data flexibly | Provide governed access to analytical datasets with appropriate controls. |
| A report is slow because it queries transactional systems directly | Consider extracts, aggregation, indexing, or analytical storage depending on context. |
| Data arrives continuously from logs or events | Consider streaming or near-real-time ingestion patterns. |
| Data is refreshed nightly from source systems | Batch processing may be sufficient. |

Common Architecture Traps

  • Choosing a storage pattern before understanding access patterns.
  • Treating a data lake as automatically governed because data is centralized.
  • Running heavy analytical queries on systems intended for transactions.
  • Ignoring metadata, lineage, and ownership.
  • Assuming all data must be real-time when scheduled refresh would meet the requirement.
  • Confusing a dashboard tool with a complete data architecture.

Data Acquisition, Integration, and Preparation

ETL and ELT Readiness

| Concept | What to know |
| --- | --- |
| ETL | Extract, transform, load. Transformations happen before loading into the target. |
| ELT | Extract, load, transform. Transformations happen after loading into the target environment. |
| Full load | Replaces or reloads the entire dataset. |
| Incremental load | Loads only new or changed records. |
| Batch processing | Processes data on a schedule or in groups. |
| Streaming | Processes data as events arrive. |
| Data mapping | Aligns fields from source to target definitions. |
| Data validation | Confirms that values meet expected rules before or after loading. |
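To make the full-load versus incremental-load distinction concrete, here is a minimal Python sketch. It assumes the source exposes a last-modified date (the hypothetical `updated` field) that can serve as a high-water mark; the row data and function names are illustrative only, not part of any exam material or specific tool.

```python
from datetime import date

# Hypothetical source rows; "updated" is an assumed last-modified date column.
source = [
    {"id": 1, "amount": 100, "updated": date(2026, 1, 1)},
    {"id": 2, "amount": 250, "updated": date(2026, 1, 3)},
    {"id": 3, "amount": 75, "updated": date(2026, 1, 5)},
]

def full_load(rows):
    """Rebuild the target from scratch using every source row."""
    return {row["id"]: row for row in rows}

def incremental_load(target, rows, watermark):
    """Upsert only rows changed after the watermark; return count and new watermark."""
    changed = [row for row in rows if row["updated"] > watermark]
    for row in changed:
        target[row["id"]] = row
    new_watermark = max((row["updated"] for row in changed), default=watermark)
    return len(changed), new_watermark

target = full_load(source[:2])  # an earlier run loaded ids 1 and 2
loaded, watermark = incremental_load(target, source, date(2026, 1, 3))
print(loaded, watermark, sorted(target))
```

Only the one row changed after the watermark is processed on the second run, which is why incremental loads tend to suit large sources with reliable change tracking, while a full load is simpler when the dataset is small or change tracking is missing.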

Preparation Checklist

  • Identify null, blank, malformed, duplicated, or invalid values.
  • Confirm data types before aggregation or filtering.
  • Standardize date, time, currency, units, and categorical values.
  • Normalize inconsistent labels such as “NY,” “New York,” and “N.Y.”
  • Detect duplicate records and determine whether they are true duplicates or valid repeated events.
  • Parse combined fields when needed, such as full name or address components.
  • Create derived fields only when the formula and assumptions are documented.
  • Preserve raw data or a reproducible transformation path.
  • Validate record counts before and after transformation.
  • Check whether joins changed the number of rows unexpectedly.
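Several of the checklist items above can be sketched in a few lines of Python. The records, the label map, and the rule that a repeated `order_id` is a true duplicate are all assumptions made for illustration; in a real scenario the business rule decides whether a repeat is a duplicate or a valid repeated event.

```python
# Hypothetical raw records; the label map below is an assumed business rule.
raw = [
    {"order_id": "A1", "state": "NY", "amount": "120.50"},
    {"order_id": "A2", "state": "New York", "amount": "80"},
    {"order_id": "A2", "state": "N.Y.", "amount": "80"},   # same order loaded twice
    {"order_id": "A3", "state": "ny ", "amount": "n/a"},   # malformed amount
]

STATE_LABELS = {"ny": "NY", "n.y.": "NY", "new york": "NY"}

def clean(records):
    seen, cleaned, rejected = set(), [], []
    for rec in records:
        key = rec["order_id"]
        if key in seen:                      # deduplicate on the business key
            continue
        seen.add(key)
        try:
            amount = float(rec["amount"])    # confirm the type before aggregating
        except ValueError:
            rejected.append(rec)             # keep rejects for review instead of dropping silently
            continue
        label = rec["state"].strip()
        state = STATE_LABELS.get(label.lower(), label)
        cleaned.append({"order_id": key, "state": state, "amount": amount})
    return cleaned, rejected

cleaned, rejected = clean(raw)
duplicates = len(raw) - len(cleaned) - len(rejected)
# Reconcile record counts before and after transformation.
print(len(raw), len(cleaned), len(rejected), duplicates)
```

The final line accounts for every input row as cleaned, rejected, or duplicate, which is the kind of count reconciliation the checklist asks for.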

Missing Data Decision Guide

| Situation | Better exam-day reasoning |
| --- | --- |
| A few random missing values | Consider whether removal or imputation affects results. |
| Missing values concentrated in one group | Investigate bias before excluding records. |
| Missing value means “not applicable” | Do not treat it the same as unknown. |
| Missing value means zero | Convert only if the business rule supports it. |
| Critical field is missing | Escalate, validate source logic, or exclude from specific analysis. |
| Missingness is caused by system failure | Address data pipeline or collection process, not only the dataset. |
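The distinction between "unknown," "not applicable," and a real zero can be sketched as a small classifier. The token list and the meaning assigned to each token are assumed business rules for this example; an actual dataset needs its own documented definitions.

```python
# Hypothetical tokens; the meaning assigned to each one is an assumed business rule.
NOT_APPLICABLE = {"n/a", "na", "not applicable"}

def classify_missing(value):
    """Label a raw value as 'unknown', 'not_applicable', or 'present'."""
    if value is None:
        return "unknown"
    text = str(value).strip().lower()
    if text == "":
        return "unknown"
    if text in NOT_APPLICABLE:
        return "not_applicable"
    return "present"  # includes "0", which stays a real value unless a rule says otherwise

values = ["0", "", "N/A", None, "15"]
labels = [classify_missing(v) for v in values]
print(labels)
```

Labeling values first, and only then deciding whether to impute, exclude, or escalate, keeps the cleaning decision tied to business meaning rather than to how the value happens to be stored.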

Data Quality Checklist

| Dimension | Question to ask | Example issue |
| --- | --- | --- |
| Accuracy | Does the value reflect reality? | Incorrect customer age |
| Completeness | Are required values present? | Missing product category |
| Consistency | Do values agree across systems? | Different customer status in two systems |
| Validity | Does the value follow expected format or rule? | Invalid date or impossible value |
| Timeliness | Is the data current enough? | Dashboard uses stale records |
| Uniqueness | Are records duplicated? | Same transaction loaded twice |
| Integrity | Do relationships remain valid? | Order references a missing customer |
| Relevance | Does the data answer the question? | Collecting clicks for a retention analysis without user identifiers |
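Three of these dimensions (completeness, uniqueness, validity) lend themselves to simple profiling checks. The sketch below uses made-up rows and hypothetical field names, and assumes dates are expected in `YYYY-MM-DD` format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records used only to illustrate the checks.
rows = [
    {"txn_id": "T1", "category": "Toys", "txn_date": "2026-01-15"},
    {"txn_id": "T2", "category": "", "txn_date": "2026-02-30"},       # blank category, impossible date
    {"txn_id": "T2", "category": "Games", "txn_date": "2026-02-10"},  # repeated id
    {"txn_id": "T3", "category": "Toys", "txn_date": "2026-03-01"},
]

def is_valid_date(text):
    """Validity: the value must parse as a real calendar date."""
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False

# Completeness: share of rows with a non-blank category.
completeness = sum(1 for r in rows if r["category"]) / len(rows)

# Uniqueness: ids that appear more than once.
id_counts = Counter(r["txn_id"] for r in rows)
duplicate_ids = [k for k, n in id_counts.items() if n > 1]

# Validity: rows whose date fails the format or calendar rule.
invalid_dates = [r["txn_id"] for r in rows if not is_valid_date(r["txn_date"])]

print(completeness, duplicate_ids, invalid_dates)
```

Accuracy, consistency, and relevance usually cannot be checked from the dataset alone; they need a reference source or a conversation with the data owner.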

Can You Diagnose the Quality Problem?

  • A sudden revenue spike appears after a new import process.
  • A customer count increases after joining customer and order tables.
  • A dashboard total differs from the finance system.
  • Date filters exclude records because time zones differ.
  • “Unknown” values are mixed with blank values and true nulls.
  • Survey responses overrepresent one customer segment.
  • A calculated KPI changed after a field definition was updated.

Statistics and Calculation Readiness

The CompTIA Data+ V2 (DA0-002) exam may require practical interpretation of common analytical measures. Focus on knowing when to use each measure and how to interpret the result.

Core Measures

| Measure | Use it for | Watch out for |
| --- | --- | --- |
| Mean | General average | Sensitive to outliers |
| Median | Middle value | Ignores the magnitude of extreme values; preferred for skewed distributions |
| Mode | Most frequent value | May be non-numeric or have multiple modes |
| Range | Spread from minimum to maximum | Can be distorted by outliers |
| Variance | Average squared deviation from the mean | Expressed in squared units; less intuitive for business users |
| Standard deviation | Typical distance from the mean | Needs context (units, mean, distribution shape) to interpret |
| Percentile | Relative standing | Requires understanding the distribution |
| Rate | Events divided by opportunities or population | Numerator and denominator must match |
| Ratio | Relationship between two quantities | Units and definitions matter |
| Percentage change | Relative increase or decrease | Depends on the baseline; a small baseline exaggerates the change |

Formulas to Be Comfortable With

\[ \text{Percentage change} = \frac{\text{New value} - \text{Old value}}{\text{Old value}} \times 100 \]

\[ \text{Weighted average} = \frac{\sum(\text{value} \times \text{weight})}{\sum \text{weight}} \]

\[ z = \frac{x - \mu}{\sigma} \]

\[ \text{Conversion rate} = \frac{\text{Conversions}}{\text{Total eligible events}} \times 100 \]

\[ \text{Error rate} = \frac{\text{Errors}}{\text{Total observations}} \times 100 \]
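The formulas above translate directly into small Python helpers. The function names and sample numbers are illustrative assumptions, chosen so the arithmetic is easy to verify by hand.

```python
def percentage_change(old, new):
    """(new - old) / old * 100; undefined when the baseline is zero."""
    if old == 0:
        raise ValueError("percentage change is undefined when the old value is 0")
    return (new - old) / old * 100

def weighted_average(values, weights):
    """Sum of value * weight divided by the sum of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def rate_percent(events, total):
    """Works for conversion rate or error rate, given a matching denominator."""
    return events / total * 100

print(percentage_change(80, 100))          # 25.0
print(weighted_average([90, 70], [3, 1]))  # 85.0
print(z_score(130, 100, 15))               # 2.0
print(rate_percent(150, 1200))             # 12.5
```

Note that the same change read in the other direction, from 100 down to 80, is a 20 percent decrease rather than 25 percent, because the baseline differs.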

Calculation Checklist

  • Calculate a percentage increase or decrease.
  • Choose median instead of mean when outliers distort the average.
  • Interpret a standard deviation in business language.
  • Explain whether a value is unusually high or low using a z-score concept.
  • Compare rates fairly when group sizes differ.
  • Avoid comparing raw counts when the denominator matters.
  • Recognize when a metric needs weighting.
  • Identify whether a KPI is a count, sum, average, ratio, or rate.
  • Check whether time periods are comparable before reporting growth.
  • Distinguish cumulative totals from period-specific values.
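Two items from this checklist, choosing median over mean and comparing rates instead of raw counts, are illustrated below with invented numbers. The ticket times and plant figures are hypothetical.

```python
from statistics import mean, median

# Hypothetical ticket resolution times in hours; one extreme value skews the mean.
resolution_hours = [2, 3, 3, 4, 5, 48]
avg_hours = mean(resolution_hours)
mid_hours = median(resolution_hours)
print(round(avg_hours, 2), mid_hours)

# Hypothetical defect data: North has more defects but a lower defect rate.
plants = {
    "North": {"defects": 40, "units": 10000},
    "South": {"defects": 15, "units": 1500},
}
rates = {name: p["defects"] / p["units"] * 100 for name, p in plants.items()}
print({name: round(r, 2) for name, r in rates.items()})
```

The mean resolution time is pulled well above the typical ticket by the single 48-hour case, while the median stays near the bulk of the data. Likewise, ranking the plants by raw defect count would reverse the conclusion that the rates support.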

Analysis Method Checklist

| Analysis type | Purpose | Scenario cue |
| --- | --- | --- |
| Descriptive | Explain what happened | Monthly sales summary, ticket counts |
| Diagnostic | Explain why it happened | Root-cause investigation, drilldown by region |
| Predictive | Estimate what may happen | Forecast demand, predict churn risk |
| Prescriptive | Recommend what to do | Optimize staffing, suggest next-best action |
| Exploratory | Find patterns or generate hypotheses | Initial review of an unfamiliar dataset |
| Confirmatory | Test a specific hypothesis | Validate whether a change affected conversion |
| Segmentation | Compare groups | Customer cohorts, product categories |
| Trend analysis | Examine change over time | Revenue by month, defect rate by week |
| Correlation analysis | Measure relationship between variables | Relationship between ad spend and leads |
| Regression | Model relationship or estimate outcome | Predict sales from multiple factors |
| Classification | Assign categories | Flag transactions as likely fraudulent |
| Clustering | Group similar records | Discover customer segments |

Interpretation Traps

  • Correlation does not prove causation.
  • A statistically interesting result may not be operationally useful.
  • A business-significant change may be hidden by aggregation.
  • A trend can be caused by seasonality, data collection changes, or missing periods.
  • A model can perform well overall while failing for an important subgroup.
  • A sample can be large and still biased.
  • A dashboard can be precise without being accurate.
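The trap of a model that performs well overall while failing a subgroup can be shown with a few lines of Python. The outcome counts below are invented for illustration, with segment B deliberately small.

```python
# Hypothetical prediction outcomes as (segment, prediction_was_correct) pairs.
outcomes = (
    [("A", True)] * 90 + [("A", False)] * 5
    + [("B", True)] * 2 + [("B", False)] * 3
)

def accuracy(rows):
    """Share of rows where the prediction was correct."""
    return sum(1 for _, ok in rows if ok) / len(rows)

overall = accuracy(outcomes)
by_segment = {
    seg: accuracy([r for r in outcomes if r[0] == seg])
    for seg in sorted({s for s, _ in outcomes})
}
print(round(overall, 2), {k: round(v, 2) for k, v in by_segment.items()})
```

Overall accuracy looks strong because segment A dominates the sample, yet segment B is wrong more often than right. Reporting only the aggregate would hide that gap.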

Query and Transformation Logic Readiness

You do not need to think like a database administrator for every data scenario, but you should be able to read common query patterns and understand their effect.

SQL-Like Pattern to Recognize

SELECT
    region,
    COUNT(*) AS order_count,
    SUM(order_amount) AS total_sales
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY region
HAVING SUM(order_amount) > 10000
ORDER BY total_sales DESC;

Can You Predict What Happens?

  • WHERE filters rows before aggregation.
  • GROUP BY creates one result row per group.
  • HAVING filters aggregated groups.
  • ORDER BY sorts the final result.
  • COUNT(*) counts rows, including rows with nulls in other fields.
  • COUNT(column_name) counts non-null values in that column.
  • An inner join returns matching records only.
  • A left join keeps all records from the left table and adds matches from the right table.
  • A many-to-many join can inflate totals.
  • Aggregating before joining can sometimes prevent duplicate-driven overcounting.
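To check predictions like these, the query pattern can be run against a tiny in-memory table using Python's built-in `sqlite3` module. The sample rows and the `coupon` column (added to contrast `COUNT(*)` with `COUNT(column_name)`) are assumptions for this sketch, and SQLite is only a stand-in; other SQL dialects may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (region TEXT, order_amount REAL, order_date TEXT, coupon TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("East", 9000, "2025-12-15", None),    # removed by WHERE (before 2026)
        ("East", 7000, "2026-01-10", "NEW10"),
        ("East", 6000, "2026-02-02", None),
        ("West", 4000, "2026-01-20", None),    # West totals 4000, removed by HAVING
        ("North", 12000, "2026-03-05", "VIP"),
    ],
)

query = """
SELECT region,
       COUNT(*) AS order_count,        -- counts rows, nulls included
       COUNT(coupon) AS coupon_orders, -- counts only non-null coupons
       SUM(order_amount) AS total_sales
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY region
HAVING SUM(order_amount) > 10000
ORDER BY total_sales DESC;
"""
result = conn.execute(query).fetchall()
print(result)
```

East qualifies only on its two 2026 orders (13,000 in total), because the 2025 row is filtered out before aggregation. West survives `WHERE` but is dropped by `HAVING`, and East shows two orders but only one coupon order.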

Join Decision Table

| Join situation | What to check |
| --- | --- |
| Customer table joined to order table | One customer may have many orders, so customer rows can repeat. |
| Product table joined to sales table | Missing product IDs may create null product details. |
| Two fact tables joined directly | Risk of many-to-many multiplication. |
| Lookup table has duplicate keys | Join may create duplicate output rows. |
| Left join produces many nulls | Keys may not match due to format, timing, or missing reference data. |
| Totals change after join | Check row counts, key uniqueness, and aggregation level. |
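The "lookup table has duplicate keys" and "totals change after join" rows can be reproduced with a small `sqlite3` sketch. The table names, columns, and values are hypothetical; the point is only to show how a non-unique join key inflates a sum and how reducing the lookup to one row per key avoids it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, revenue REAL);
CREATE TABLE contacts (customer_id INTEGER, email TEXT);
INSERT INTO orders VALUES (1, 10, 500), (2, 10, 300), (3, 20, 200);
-- customer 10 has two contact rows, so the lookup key is not unique
INSERT INTO contacts VALUES
    (10, 'a@example.com'), (10, 'b@example.com'), (20, 'c@example.com');
""")

true_total = conn.execute("SELECT SUM(revenue) FROM orders").fetchone()[0]

# Joining on the non-unique key repeats each order for customer 10.
inflated_total = conn.execute("""
    SELECT SUM(o.revenue)
    FROM orders o
    JOIN contacts c ON c.customer_id = o.customer_id
""").fetchone()[0]

# Reducing the lookup to one row per key first keeps the order grain intact.
safe_total = conn.execute("""
    SELECT SUM(o.revenue)
    FROM orders o
    JOIN (SELECT DISTINCT customer_id FROM contacts) c
      ON c.customer_id = o.customer_id
""").fetchone()[0]

print(true_total, inflated_total, safe_total)
```

Comparing the total before and after the join is the quickest symptom check: here the naive join reports 1,800 against a true total of 1,000.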

Visualization and Dashboard Checklist

Chart Selection

| Goal | Better chart choices | Common trap |
| --- | --- | --- |
| Compare categories | Bar chart, column chart | Using pie charts with too many slices |
| Show trend over time | Line chart, area chart | Using unordered categories on the x-axis |
| Show part-to-whole | Stacked bar, 100 percent stacked bar, limited pie/donut | Hiding small but important segments |
| Show distribution | Histogram, box plot | Reporting only the mean |
| Show relationship | Scatter plot | Implying causation from visual association |
| Show geographic pattern | Map | Using maps when location is not analytically important |
| Show KPI status | Scorecard, gauge, bullet chart | Missing target, time period, or definition |
| Show process flow | Flow diagram | Overloading a dashboard with workflow detail |

Dashboard Design Readiness

  • Identify the dashboard audience: executive, manager, analyst, operational team.
  • Place the most important KPI where it is seen first.
  • Include definitions for metrics that may be interpreted differently.
  • Use consistent colors, scales, and date ranges.
  • Avoid truncated axes that exaggerate differences.
  • Use filters that match user decision needs.
  • Provide drilldown where users need investigation, not just summary.
  • Show data freshness or refresh date when it affects trust.
  • Make visualizations accessible with readable labels and sufficient contrast.
  • Avoid chart junk, unnecessary 3D effects, and decorative elements that reduce clarity.

Scenario Checks

| Scenario | Strong response |
| --- | --- |
| Executives need a quick view of performance against targets | Use concise KPIs, trend indicators, and exception-focused visuals. |
| Analysts need to investigate root causes | Provide filters, drilldowns, detail tables, and segmentation. |
| A chart shows a huge change because the y-axis starts near the data values | Identify the misleading scale and recommend a clearer axis. |
| A pie chart has many small categories | Replace with a sorted bar chart or group minor categories if appropriate. |
| A dashboard combines metrics from different time periods | Align periods or clearly label differences. |
| A color palette makes red/green status difficult to read | Use accessible colors, labels, or symbols. |
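The truncated-axis scenario comes down to simple arithmetic: the drawn height of a bar is its value minus the axis start. The helper below is a hypothetical illustration with made-up values, not a charting function.

```python
def visual_ratio(a, b, axis_start=0):
    """Ratio of drawn bar heights for values a and b when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

honest = visual_ratio(98, 100)                    # axis starts at zero
truncated = visual_ratio(98, 100, axis_start=95)  # axis starts near the data
print(round(honest, 2), round(truncated, 2))
```

With a zero baseline the second bar is drawn about 2 percent taller, matching the real difference. Starting the axis at 95 makes it look roughly two-thirds taller for the same data.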

Governance, Security, and Privacy Checklist

Governance Artifacts and Roles

| Item | What to know |
| --- | --- |
| Data owner | Accountable for data domain decisions and access approval. |
| Data steward | Helps maintain definitions, quality, and proper use. |
| Data custodian | Manages technical storage, protection, and availability. |
| Data dictionary | Defines fields, formats, accepted values, and meanings. |
| Metadata | Describes data, including source, structure, lineage, and context. |
| Data lineage | Shows where data came from and how it changed. |
| Data catalog | Helps users discover governed datasets. |
| Retention policy | Defines how long data is kept and when it is disposed. |
| Classification | Labels data by sensitivity or business importance. |
| Access control | Limits use based on role, need, and authorization. |

Privacy and Security Controls

  • Identify sensitive data such as personal, financial, health, credential, or confidential business data.
  • Apply least privilege when granting access.
  • Use role-based access where appropriate.
  • Mask or tokenize sensitive fields when full values are not needed.
  • Distinguish anonymization from pseudonymization.
  • Protect data in transit and at rest where required by policy.
  • Avoid exporting sensitive data to unmanaged locations.
  • Apply retention and disposal rules.
  • Maintain auditability for access and changes.
  • Escalate when a requested report exposes unnecessary sensitive data.
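Masking and pseudonymization from the list above can be contrasted in a short Python sketch using the standard `hashlib` and `hmac` modules. The secret key, function names, and sample identifiers are assumptions for illustration; a real deployment would follow its own key-management and policy requirements.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a managed secret store.
SECRET_KEY = b"example-key-not-for-production"

def mask_card(number):
    """Masking: hide all but the last four characters for display."""
    return "*" * (len(number) - 4) + number[-4:]

def pseudonymize(customer_id):
    """Pseudonymization: a keyed hash yields a stable token that still links records."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

masked = mask_card("4111111111111111")
token_a = pseudonymize("CUST-001")
token_b = pseudonymize("CUST-001")
token_c = pseudonymize("CUST-002")
print(masked, token_a == token_b, token_a != token_c)
```

Because the same customer always maps to the same token, records can still be joined and counted per customer, and anyone holding the key can regenerate the mapping. That is why pseudonymized data is not anonymized data: anonymization requires that individuals can no longer be reidentified.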

Ethics and Responsible Use

| Risk | Exam-ready response |
| --- | --- |
| Biased sample | Question representativeness before drawing conclusions. |
| Proxy variable | Recognize that a harmless-looking field may stand in for a sensitive trait. |
| Overcollection | Collect only data needed for the stated purpose. |
| Reidentification | Avoid assuming de-identified data is risk-free. |
| Misleading visualization | Correct scale, labels, context, or chart type. |
| Unsupported causal claim | State that the data shows association unless causation is justified. |
| Hidden limitation | Disclose data gaps, assumptions, and confidence concerns. |

Business Requirements and Stakeholder Communication

Requirements Checklist

  • Identify the decision the stakeholder needs to make.
  • Define the audience and level of detail.
  • Confirm KPI definitions before building reports.
  • Clarify time period, filters, exclusions, and update frequency.
  • Determine whether the output should be a one-time analysis, recurring report, or dashboard.
  • Capture acceptance criteria.
  • Confirm security and access requirements.
  • Document assumptions and known data limitations.
  • Validate results with subject matter experts.
  • Explain tradeoffs between speed, accuracy, completeness, and maintainability.

Good Questions to Ask in a Scenario

| If the prompt says… | Ask yourself… |
| --- | --- |
| “The manager wants a dashboard” | What decision will the dashboard support? |
| “Sales are down” | Compared to what period, target, segment, or baseline? |
| “The data is inaccurate” | Which quality dimension is failing? |
| “Users disagree on the metric” | Is there a documented business definition? |
| “The report is too slow” | Is the issue source system performance, query design, volume, or dashboard design? |
| “The analysis must be real time” | Is real time truly required, or is frequent refresh enough? |
| “The dataset contains customer information” | What privacy, masking, retention, and access controls apply? |

Scenario-Based Decision Points

Use these prompts to test exam judgment. For each one, identify the best next action before looking for a tool-specific answer.

| Scenario | What a ready candidate notices |
| --- | --- |
| A report total does not match the source system | Check filters, timing, definitions, joins, and transformation logic. |
| A stakeholder asks for all customer data “just in case” | Apply data minimization and clarify the business need. |
| A model performs well overall but poorly for a subgroup | Investigate bias, representation, and segment-level performance. |
| A chart shows average ticket resolution time only | Ask whether median, distribution, or outliers are needed. |
| A table join doubles revenue | Check relationship cardinality and duplicate keys. |
| A dashboard is refreshed monthly but users make daily decisions | Assess update frequency against business need. |
| A dataset contains multiple date fields | Clarify whether to use order date, ship date, close date, or event date. |
| A survey result is used to represent all customers | Check sample size, response bias, and population coverage. |
| A field has values of “0,” blank, “N/A,” and null | Determine the business meaning of each before cleaning. |
| A KPI is green but customer complaints increased | Check whether the KPI measures the right outcome. |

“Can You Do This?” Master Checklist

Concepts and Definitions

  • Explain the difference between data, information, and insight.
  • Identify structured, semi-structured, and unstructured data.
  • Classify variables as categorical or numerical.
  • Distinguish nominal, ordinal, interval, and ratio data.
  • Explain the difference between OLTP and OLAP use cases.
  • Describe the purpose of metadata, lineage, and a data dictionary.
  • Define data quality dimensions and recognize examples.
  • Explain ETL and ELT at a practical level.
  • Distinguish descriptive, diagnostic, predictive, and prescriptive analytics.
  • Explain why correlation is not causation.

Practical Analysis Skills

  • Choose an appropriate metric for a business question.
  • Calculate and interpret percentage change.
  • Select mean, median, or mode appropriately.
  • Recognize when outliers affect interpretation.
  • Compare rates instead of raw counts when group sizes differ.
  • Read a basic aggregation query.
  • Predict the effect of filters, joins, and grouping.
  • Identify data-quality problems from symptoms.
  • Recommend a cleaning approach based on business meaning.
  • Validate an analysis against known totals or source records.

Visualization and Reporting Skills

  • Choose a chart based on comparison, trend, relationship, distribution, or part-to-whole needs.
  • Identify misleading visualizations.
  • Design a dashboard for the intended audience.
  • Include context, targets, and definitions for KPIs.
  • Communicate limitations without undermining useful findings.
  • Recommend next steps based on data rather than personal preference.
  • Explain findings in business language.
  • Identify when a table is better than a chart.
  • Use filters and drilldowns appropriately.
  • Avoid clutter and unnecessary visual complexity.

Governance and Risk Skills

  • Identify sensitive or regulated data in a scenario.
  • Recommend least-privilege access.
  • Know when masking, anonymization, or pseudonymization may be appropriate.
  • Recognize retention and disposal concerns.
  • Explain why lineage matters for trust.
  • Identify ethical risks in collection, analysis, or reporting.
  • Detect sampling bias or representational gaps.
  • Avoid unsupported causal claims.
  • Escalate governance issues when data use is unclear.
  • Document assumptions, definitions, and transformation steps.

Common Weak Areas and Traps

| Weak area | Why candidates miss it | How to fix it |
| --- | --- | --- |
| Join logic | They memorize join names but do not track row counts | Practice predicting output rows and duplicate effects. |
| Metric definitions | They calculate correctly but use the wrong denominator | Always identify numerator, denominator, period, and exclusions. |
| Data grain | They aggregate data without knowing what one row represents | State the grain before joining or summarizing. |
| Missing data | They delete nulls automatically | Determine whether null means unknown, not applicable, zero, or error. |
| Chart selection | They choose familiar charts instead of purpose-fit charts | Match chart to comparison, trend, distribution, or relationship. |
| Outliers | They ignore how extreme values affect mean and scale | Compare mean vs median and inspect distribution. |
| Correlation | They infer cause from relationship | Look for experimental design, controls, or plausible alternatives. |
| Governance | They treat access as a convenience issue | Apply classification, least privilege, and business need. |
| Data freshness | They overlook refresh schedules | Match latency to decision timing. |
| Communication | They present numbers without context | Include baseline, target, limitation, and recommended action. |

Final-Week Review Checklist

Seven to Five Days Out

  • Re-read CompTIA's official exam objectives for Data+ V2 (DA0-002).
  • Mark each topic in this checklist as strong, shaky, or weak.
  • Prioritize weak areas that affect many scenarios: joins, quality, metrics, visualization, and governance.
  • Redo missed practice questions by explaining why each wrong answer is wrong.
  • Build a one-page formula and interpretation sheet.
  • Review chart selection and misleading-visual examples.
  • Practice scenario questions without looking up definitions.

Four to Two Days Out

  • Complete mixed-topic practice sets under timed conditions.
  • Review all missed questions by topic, not just by score.
  • Practice reading SQL-like queries and join scenarios.
  • Rehearse data-quality diagnosis from symptoms.
  • Review privacy, access, masking, lineage, and retention concepts.
  • Memorize only what supports decisions; focus on application.
  • Make a short list of recurring traps you personally miss.

Day Before

  • Do a light review of formulas, chart choices, and governance terms.
  • Review your personal weak-area notes.
  • Avoid cramming unfamiliar advanced material.
  • Prepare identification, scheduling details, and testing environment requirements.
  • Sleep and reset; scenario judgment is harder when fatigued.

Exam-Day Mindset

  • Read the business goal before choosing a technical answer.
  • Watch for words such as “best,” “first,” “most appropriate,” and “least.”
  • Eliminate answers that ignore data quality, privacy, or stakeholder requirements.
  • Check whether the scenario is asking for diagnosis, visualization, calculation, governance, or communication.
  • Do not overengineer; choose the answer that fits the stated requirement.
  • Flag time-consuming questions and return if needed.

Practical Next Step

Pick the three weakest areas from this checklist and complete targeted practice on those topics first. Then move to mixed, timed practice so you can apply CompTIA Data+ V2 (DA0-002) concepts under exam-like conditions and justify each answer choice.
