Free DAMA CDMP Quality Practice Exam: Data Quality Specialist
This free full-length DAMA CDMP Data Quality Specialist practice exam includes 100 original IT Mastery questions across the exam domains.
These are original IT Mastery practice questions. They are not official exam questions, copied live-exam content, or exam dumps. Use them for self-assessment, scope review, and deciding what to drill next.
Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.
Try the IT Mastery web app for a richer interactive practice experience with mixed sets, timed mocks, topic drills, explanations, and progress tracking.
Exam snapshot
- Practice target: DAMA CDMP Data Quality Specialist
- Practice-set question count: 100
- Time limit: 90 minutes
- Practice style: mixed-domain diagnostic run with answer explanations
Full-length exam mix
| Domain | Weight |
|---|---|
| Data Quality Foundations and Business Fitness | 8% |
| Quality Strategy Business Case and Prioritization | 8% |
| Profiling Discovery and Assessment | 9% |
| Quality Rules Standards and Requirements | 9% |
| Root Cause Analysis and Remediation | 10% |
| Monitoring Scorecards and Measurement | 10% |
| Quality Governance Roles and Stewardship | 8% |
| Metadata Lineage and Quality Evidence | 7% |
| Master Reference Integration and Warehouse Quality | 8% |
| Quality in Data Lifecycle and Operations | 7% |
| Quality Maturity and Continuous Improvement | 8% |
| Specialist Cross-Discipline Judgment | 8% |
Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and interactive practice.
Practice questions
Questions 1-25
Question 1
Topic: Quality Maturity and Continuous Improvement
A bank’s customer data quality scorecard shows a recurring defect that delays KYC checks. The Customer Data Steward has approved the quality rule and threshold. Lineage shows most failures are created by the web onboarding process after a recent form change.
| Month | Invalid tax ID rate | Failures from web onboarding | Cleansing SLA met |
|---|---|---|---|
| April | 1.8% | 41% | Yes |
| May | 4.9% | 86% | No |
| June | 7.5% | 92% | No |
Which next action is the BEST quality decision?
Options:
A. Add source validation to web onboarding and monitor residual exceptions
B. Assign more analysts to cleanse invalid tax IDs after loading
C. Increase the monthly profiling sample size for customer records
D. Escalate the tax ID definition to the governance council
Best answer: A
Explanation: Continuous improvement uses trend evidence to choose the next quality response. Here, the rule and threshold are already approved, so the issue is not unclear stewardship or definition. The trend is worsening, the remediation SLA is failing, and lineage identifies the web onboarding process as the dominant defect source after a process change. That evidence supports a prevention action: embed the approved rule into the source process so invalid tax IDs are stopped or corrected before records are created. Ongoing monitoring remains useful to confirm the control works and to catch residual exceptions. Adding cleanup capacity treats the symptom, and additional profiling delays action when the pattern is already clear.
- More profiling adds measurement, but the existing trend and lineage already identify the main source of the defect.
- More cleansing may reduce backlog, but it does not stop the recurring creation of invalid tax IDs.
- Governance escalation is not the priority because the rule, threshold, and steward ownership are already established.
Question 2
Topic: Monitoring Scorecards and Measurement
A customer master data quality scorecard tracks a rule requiring valid country and postal-code combinations. The enterprise threshold is a pass rate of at least 95%.
| Month | Overall pass | New CRM pass | Legacy billing pass | New CRM volume |
|---|---|---|---|---|
| April | 96.2% | 98.1% | 95.4% | 15% |
| May | 96.1% | 96.8% | 95.8% | 25% |
| June | 96.0% | 94.5% | 96.4% | 35% |
| July | 96.1% | 92.8% | 97.9% | 45% |
What is the best interpretation and next action?
Options:
A. Stable overall, but hiding a new CRM issue
B. Improving, because legacy billing pass rates increased
C. Deteriorating overall, because one segment is below threshold
D. Stable, so continue reporting only the overall score
Best answer: A
Explanation: A quality trend should be interpreted at the level needed to reveal business risk, not only at the aggregate level. The overall pass rate is stable and remains above the 95% threshold, but that stability masks a meaningful segment change. New CRM performance drops from 98.1% to 92.8%, crossing below threshold, while New CRM volume rises from 15% to 45%. The improving legacy billing results offset the visible enterprise score, so the scorecard is hiding a new issue. The appropriate response is to drill into the New CRM source, open or escalate a quality issue, and work with the responsible steward or process owner on root-cause analysis and remediation.
- Legacy-only improvement is misleading because one source improves while the growing New CRM source worsens.
- Overall deterioration overstates the aggregate trend because the enterprise pass rate remains nearly flat and above threshold.
- Aggregate-only reporting fails because it would continue masking the source-specific defect that has crossed the threshold.
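The segment-level reading in Question 2 can be sketched as a small check that flags any source falling below the threshold in a month where the overall score still passes. This is only an illustration: the pass rates are copied from the scorecard table, while the dictionary layout and the function name `masked_breaches` are assumptions made for the example.

```python
THRESHOLD = 95.0  # enterprise pass-rate threshold from the scenario, in percent

# Monthly pass rates (percent) copied from the scorecard table in Question 2.
scorecard = {
    "April": {"overall": 96.2, "new_crm": 98.1, "legacy_billing": 95.4},
    "May": {"overall": 96.1, "new_crm": 96.8, "legacy_billing": 95.8},
    "June": {"overall": 96.0, "new_crm": 94.5, "legacy_billing": 96.4},
    "July": {"overall": 96.1, "new_crm": 92.8, "legacy_billing": 97.9},
}


def masked_breaches(scorecard, threshold):
    """Return (month, segment, rate) for segments failing while overall passes."""
    findings = []
    for month, rates in scorecard.items():
        if rates["overall"] < threshold:
            continue  # an overall breach is already visible on the scorecard
        for segment, rate in rates.items():
            if segment != "overall" and rate < threshold:
                findings.append((month, segment, rate))
    return findings


findings = masked_breaches(scorecard, THRESHOLD)
for month, segment, rate in findings:
    print(f"{month}: {segment} at {rate}% is below {THRESHOLD}% while overall passes")
```

With the table's values, the check reports New CRM for June and July, which is the hidden issue the best answer describes.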
Question 3
Topic: Quality Strategy Business Case and Prioritization
A health insurer is prioritizing a first-phase data quality initiative. Profiling shows 18% of new claims are missing injury_severity_code; analysts currently impute averages for reserve modeling and product pricing. No regulatory report uses the field, manual rework is minimal, and the claims data steward can approve one intake validation rule this quarter. Which proposed benefit is most defensible for the business case?
Options:
A. Improved enterprise trust in all claims data assets
B. Major operational efficiency from eliminating manual rework
C. Better reserve and pricing decisions from more complete severity data
D. Reduced regulatory reporting risk across claims submissions
Best answer: C
Explanation: A defensible data quality benefit should connect the defect, its business use, and the expected value. Here, the missing injury_severity_code affects reserve modeling and product pricing because analysts must impute values. That makes the clearest benefit better decisions: more complete severity data should improve the reliability of pricing and reserve outputs. The scenario weakens other benefit claims by stating that no regulatory report uses the field and that manual rework is minimal. The available stewardship capacity also supports a focused rule rather than a broad enterprise trust claim. The strongest business case is the one with a visible cause-and-effect path from quality improvement to business outcome.
- Regulatory risk is not supported because the field is not used in regulatory reporting.
- Manual rework savings are weak because the scenario says rework is minimal.
- Enterprise trust is too broad because only one field and one intake rule are in scope.
Question 4
Topic: Quality Strategy Business Case and Prioritization
A data quality council can fund remediation for only one issue in the next month. The decision must balance business risk, stakeholder impact, rule maturity, ownership, and feasibility.
| Issue | Evidence and context |
|---|---|
| Loan delinquency status | 7% stale by more than 2 days; feeds regulatory and collections reporting due in 3 weeks; approved timeliness rule; source owner has sprint capacity |
| Marketing preference code | 22% invalid; affects campaign targeting; no approved business rule or steward |
| Vendor duplicate records | 4% duplicates; causes AP rework; process owner unavailable this quarter |
| Product color | 35% missing; used only for optional website filtering; no reported business impact |
Which issue should be addressed first?
Options:
A. Marketing preference code
B. Vendor duplicate records
C. Product color
D. Loan delinquency status
Best answer: D
Explanation: Risk-based data quality triage should prioritize issues where the defect threatens important business outcomes and can realistically be remediated. The stale loan delinquency status affects regulatory and collections reporting, has a near-term deadline, already has an approved timeliness rule, and has an available source owner. That combination makes it both high value and feasible. A larger defect percentage alone is not enough if the business rule is immature, ownership is unclear, or impact is low. The strongest prioritization decisions consider risk, value, readiness, ownership, and capacity together.
- Marketing preference code has a high invalid rate, but the lack of an approved rule or steward makes immediate remediation less feasible.
- Vendor duplicates have a real operational impact, but unavailable ownership limits near-term remediation.
- Product color has the largest missing rate, but the current stakeholder impact is low compared with regulatory and collections use.
Question 5
Topic: Quality Rules Standards and Requirements
A customer analytics team has implemented data quality rules for email_address, customer_status, and preferred_channel. The rules generate many exceptions, but sales, service, and marketing users frequently disagree about whether the exceptions are true defects or acceptable business cases. What is the best improvement to the rule-definition process?
Options:
A. Have data stewards approve defect criteria and acceptable exceptions
B. Send all exceptions directly to source-system developers
C. Cleanse disputed records in the reporting layer
D. Increase profiling frequency before each rule run
Best answer: A
Explanation: Quality rules need business-approved interpretation, not just technical execution. When users disagree about whether rule exceptions are real defects, the weak point is usually the definition and approval of defect criteria: business meaning, valid exceptions, thresholds, severity, and ownership. Data stewards and accountable business representatives should agree how each exception is classified and when it is acceptable for the intended use. This turns a rule from a technical check into a governed quality requirement that can support consistent triage, remediation, and scorecarding. More profiling may reveal patterns, but it will not settle business fitness-for-purpose disputes by itself.
- More profiling may show exception volumes and patterns, but it does not decide whether an exception is acceptable for business use.
- Developer routing assumes every exception is a source-system defect, which is premature when users have not agreed on defect criteria.
- Reporting-layer cleansing masks disagreement downstream and does not establish governed rule meaning or prevention.
Question 6
Topic: Monitoring Scorecards and Measurement
A finance data quality team is updating its scorecard for customer master data used in invoicing. Invalid billing identifiers caused invoice rejections and delayed cash collection. Profiling already tracks invalid-value counts, and stewards log each exception and correction. Leadership wants to know whether the quality work improved the business outcome. Which KPI best shows that?
Options:
A. Number of customer master defects opened each month
B. Average time to close billing identifier exceptions
C. Reduction in invoice rejections caused by invalid billing identifiers
D. Percentage of profiled customer records reviewed by stewards
Best answer: C
Explanation: A business outcome KPI shows whether better data quality changed a business result, not just whether the team found or processed defects. In this case, the quality issue is invalid billing identifiers in customer master data, and the business impact is invoice rejection and delayed cash collection. A KPI based on fewer invoice rejections caused by those invalid identifiers connects the quality rule to a measurable business result. Defect counts, review volume, and closure time are useful operational metrics, but they mainly describe data quality activity and remediation workflow. They do not prove that invoicing outcomes improved.
- Defects opened can rise when monitoring improves, so it measures detection activity rather than business improvement.
- Records reviewed shows stewardship workload, not whether invoice processing performed better.
- Closure time measures remediation efficiency, but faster closure may not reduce invoice rejection impact.
Question 7
Topic: Quality in Data Lifecycle and Operations
A retailer finds that age-restricted orders are being delayed because customer birth dates are missing or invalid. Profiling shows that 21% of records created by the call-center application have a blank birth date, while web-created records are 99.7% complete. The call-center screen allows agents to skip the field, and no approved quality rule exists for the point of capture. Which conclusion best explains why the defects enter the data lifecycle?
Options:
A. Customer master matching is creating duplicate records
B. Warehouse transformation logic is calculating age incorrectly
C. Reporting users are applying inconsistent age filters
D. Capture controls do not enforce the business requirement
Best answer: D
Explanation: Defects often enter the data lifecycle at creation or capture when the process does not enforce the data needed for business fitness for purpose. Here, the issue is concentrated in records from one source process, and that process allows agents to skip a field needed for age-restricted fulfillment. The profiling evidence points to a capture-process weakness, not a downstream reporting or warehouse problem. A mature quality response would define and approve the quality rule, assign stewardship, and implement validation at the point of capture so bad or incomplete data is prevented rather than cleaned later.
- Warehouse logic is not supported because the profile shows missing source values before age calculation would matter.
- Master matching may affect uniqueness, but the visible defect is missing or invalid birth dates from one capture channel.
- Reporting filters could change analysis results, but they do not explain why source records lack required values.
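The profiling evidence in Question 7 comes from comparing blank rates by capture channel. A minimal sketch of that comparison follows; the record layout, the field names `channel` and `birth_date`, and the sample rows are hypothetical and are not the retailer's data.

```python
from collections import defaultdict


def blank_rate_by_channel(records, field="birth_date", channel_key="channel"):
    """Return {channel: fraction of records where `field` is missing or blank}."""
    totals = defaultdict(int)
    blanks = defaultdict(int)
    for record in records:
        channel = record[channel_key]
        totals[channel] += 1
        value = record.get(field)
        if value is None or str(value).strip() == "":
            blanks[channel] += 1
    return {channel: blanks[channel] / totals[channel] for channel in totals}


# Tiny illustrative sample (hypothetical values).
sample = [
    {"channel": "call_center", "birth_date": ""},
    {"channel": "call_center", "birth_date": "1990-04-12"},
    {"channel": "web", "birth_date": "1985-11-03"},
    {"channel": "web", "birth_date": "2001-07-30"},
]
rates = blank_rate_by_channel(sample)
print(rates)
```

A large gap between channels, as in the scenario, points to the capture process of one channel rather than to downstream logic.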
Question 8
Topic: Profiling Discovery and Assessment
During a baseline data quality assessment at a bank, the governance group can assign enhanced rules, monitoring, ownership, and remediation capacity to only one data element this quarter.
| Data element | Downstream use | Profiling result |
|---|---|---|
| Loan maturity date | Payment schedule and regulatory liquidity report | 3% invalid or earlier than origination date; no named steward |
| Customer mobile phone | Marketing campaigns | 18% blank; optional for most products |
| Branch code | Internal routing reports | 4 obsolete codes remapped nightly |
| Customer nickname | Online display only | 25% blank |
Which decision best fits the assessment evidence?
Options:
A. Prioritize customer mobile phone because it has the highest blank rate
B. Prioritize branch code because obsolete values still appear
C. Prioritize loan maturity date for enhanced quality management
D. Prioritize customer nickname because completeness is lowest
Best answer: C
Explanation: A critical data element should receive stronger quality attention when poor quality could materially affect business operations, regulatory reporting, customer outcomes, or other high-value decisions. The loan maturity date supports both payment scheduling and a regulatory liquidity report, so even a smaller defect rate can be more important than a larger defect rate on a low-impact field. The profiling evidence also shows a clear validity problem and an ownership gap, which makes stronger rules, monitoring, stewardship assignment, and remediation planning appropriate. High defect volume alone does not make a field critical; fitness for purpose and business impact drive prioritization.
- Highest blank rate is tempting, but mobile phone is optional for most products and mainly affects marketing use.
- Obsolete branch codes are already handled by a nightly remapping control, reducing the need for new priority treatment.
- Low nickname completeness has limited business impact because the field is used only for online display.
Question 9
Topic: Quality Rules Standards and Requirements
A bank’s customer-risk scorecard depends on valid customer country and risk status values. Nightly warehouse monitoring shows 6% invalid risk_status values against a threshold of less than 1%. Data stewards correct the scorecard manually each week, but the same branches keep entering retired local codes in the CRM onboarding workflow. The CRM owner controls that workflow, and an approved reference data list and quality rule already exist. What is the best quality control improvement?
Options:
A. Map retired codes to valid codes in the warehouse
B. Create a monthly exception trend dashboard for branches
C. Increase steward capacity for weekly scorecard corrections
D. Add source-system validation using the approved reference list
Best answer: D
Explanation: When downstream monitoring repeatedly detects defects created by a source process, the strongest improvement is to add or strengthen a preventive control at the point of creation. The approved rule and reference list are already available, and the CRM owner controls the onboarding workflow, so validation can reject or prevent retired codes before they enter the pipeline. Downstream monitoring should continue as a detective control, but it should not be the primary way to manage a recurring, source-created defect. Manual correction and warehouse mapping may be useful as temporary containment, but they do not create sustainable quality.
- More manual correction treats symptoms by adding steward capacity, but it leaves the branch onboarding defect unchanged.
- Trend dashboarding improves visibility, but it remains detective and does not stop invalid values at capture.
- Warehouse mapping can mask the defect for reporting, but it bypasses source-process accountability and may obscure lineage.
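A preventive control of the kind described in Question 9 can be sketched as a validation step that runs before a record is saved. The approved codes below are placeholders, since the scenario does not list them, and the names `APPROVED_RISK_STATUS`, `InvalidReferenceValue`, and `validate_risk_status` are assumptions made for the example.

```python
# Hypothetical approved reference list; the real list is owned by the steward.
APPROVED_RISK_STATUS = frozenset({"LOW", "MEDIUM", "HIGH"})


class InvalidReferenceValue(ValueError):
    """Raised when a captured value is not in the approved reference list."""


def validate_risk_status(value, approved=APPROVED_RISK_STATUS):
    """Return the normalized code, or raise so the record is not saved."""
    if not isinstance(value, str):
        raise InvalidReferenceValue(f"risk_status {value!r} is not a text code")
    normalized = value.strip().upper()
    if normalized not in approved:
        raise InvalidReferenceValue(f"risk_status {value!r} is not an approved code")
    return normalized


print(validate_risk_status(" high "))  # accepted and normalized
try:
    validate_risk_status("R7")  # stands in for a retired local code
except InvalidReferenceValue as error:
    print(f"Rejected at capture: {error}")
```

Rejecting the value at entry keeps the retired code out of the pipeline, while the nightly warehouse monitoring continues as the detective control.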
Question 10
Topic: Quality in Data Lifecycle and Operations
A bank plans to reduce storage costs for loan-servicing data. The proposal moves loan records older than 24 months to offline storage and retains only monthly balance totals in the analytics warehouse. Compliance and customer-resolution teams must investigate individual loan events for 7 years, including status changes and code meanings. Which decision best protects data quality and usability across retention?
Options:
A. Purge records after 24 months and document the purge
B. Retain only monthly totals after validating their formats
C. Archive event-level records with lineage and code metadata for 7 years
D. Keep event records but discard historical reference-code mappings
Best answer: C
Explanation: Retention decisions affect data quality because data can be technically stored but no longer usable for its intended business purpose. In this case, compliance and customer-resolution work require event-level detail, lineage, and the historical meaning of status codes for 7 years. Monthly totals may be valid and cheaper to keep, but they remove the granularity needed to reconstruct individual loan events. Keeping data without code metadata also weakens interpretability, especially when code sets change over time. The quality issue is fitness for purpose across the lifecycle, not only storage efficiency or file validity.
- Aggregated totals fail because valid summaries cannot support investigations of individual loan events.
- Missing code history fails because retained records lose business meaning when historical reference data is unavailable.
- Documented purge fails because metadata about deletion does not satisfy a 7-year investigation need.
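The point in Question 10 about historical code meanings can be illustrated with effective-dated reference data. The sketch below is hypothetical throughout: the code `S2`, its two meanings, and the dates are invented to show why an archived event needs the mapping that was valid on the event date.

```python
from datetime import date

# Hypothetical history: code "S2" changed meaning at the start of 2021.
CODE_HISTORY = {
    "S2": [
        (date(2015, 1, 1), date(2020, 12, 31), "Suspended"),
        (date(2021, 1, 1), date.max, "Settled"),
    ],
}


def meaning_on(code, event_date, history=CODE_HISTORY):
    """Return the meaning of `code` on `event_date`, or None if unknown."""
    for valid_from, valid_to, meaning in history.get(code, []):
        if valid_from <= event_date <= valid_to:
            return meaning
    return None


print(meaning_on("S2", date(2019, 6, 1)))
print(meaning_on("S2", date(2022, 6, 1)))
```

If only the current mapping were retained, the older event would be read with the newer meaning, which is the interpretability loss the explanation warns about.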
Question 11
Topic: Specialist Cross-Discipline Judgment
A bank’s customer risk reports are being rejected because 18% of new commercial customer records have invalid legal_entity_type values. Profiling shows the defects originate in one onboarding channel. A draft glossary definition exists, but no business owner has approved the rule or acceptable threshold. Downstream analysts can cleanse the current report, but the defect recurs each week. Which action is the best quality decision?
Options:
A. Cleanse the rejected report and document the analyst correction steps
B. Track the defect on a scorecard until analysts agree on a threshold
C. Let IT enforce the draft code list immediately in the onboarding system
D. Assign ownership, approve the rule and threshold, escalate the issue, and remediate the source process
Best answer: D
Explanation: Data governance supports data quality by making quality expectations owned, approved, standardized, and enforceable. In this situation, the defect has business impact, a known source process, an unapproved rule, and recurring downstream rework. The right response is not only cleansing the current report; it is to establish accountable ownership, approve the business rule and threshold, escalate through the governance process, and coordinate remediation at the onboarding source. That approach links data quality management to stewardship, standards, issue management, and sustained prevention. A scorecard can help monitor results, but measurement without ownership and approved standards will not resolve the recurring defect.
- Downstream cleansing addresses the immediate rejected report but leaves the weekly source-process defect unresolved.
- IT-only enforcement may implement the wrong rule because the business definition and threshold have not been approved.
- Scorecard-only tracking measures the problem but does not establish accountability or trigger remediation.
Question 12
Topic: Metadata Lineage and Quality Evidence
A data quality manager is preparing a customer churn scorecard. Marketing and Finance disagree on the churn rate because they use different meanings of “active customer” and different source systems. The scorecard must support issue ownership and traceable remediation. Which metadata artifact best fits this need?
Options:
A. An ETL job log showing load times, row counts, and error messages
B. A data profile showing null counts and frequency distributions for customer status
C. A remediation ticket listing duplicate customer records to merge
D. A governed quality-rule record with definition, owner, lineage, threshold, and usage context
Best answer: D
Explanation: Quality management depends on metadata that explains what a data element or metric means, who is accountable for it, how it is calculated, where it comes from, how it changes, and what level of quality is acceptable for a given business use. In this case, the disagreement is not just a profiling issue; it involves conflicting definitions, source-system differences, lineage, transformations, ownership, thresholds, and scorecard usage. A governed quality-rule record provides the evidence needed to interpret results, assign stewardship, monitor thresholds, and trace defects back to source or transformation points. Profiling and logs can support investigation, but they do not provide the full management context needed for a defensible quality scorecard.
- Profiling alone helps reveal patterns such as nulls or value frequencies, but it does not establish meaning, ownership, lineage, or acceptable thresholds.
- Job logs support operational troubleshooting, but row counts and load times do not resolve conflicting business definitions.
- Duplicate remediation addresses one possible defect type, but the scenario requires metadata for scorecard governance and traceable quality management.
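One way to picture the governed quality-rule record from Question 12 is as a small structured record. The field names mirror the attributes named in the best answer; the class name, the `breached` helper, and every sample value are placeholders invented for illustration, not content from the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRuleRecord:
    """Metadata needed to interpret, own, and trace one scorecard rule."""
    rule_id: str
    business_definition: str
    owner: str
    lineage: tuple  # ordered source-to-scorecard hops
    threshold_pct: float  # minimum acceptable pass rate, in percent
    usage_context: str

    def breached(self, observed_pass_pct: float) -> bool:
        """True when the observed pass rate falls below the threshold."""
        return observed_pass_pct < self.threshold_pct


# Placeholder values only.
rule = QualityRuleRecord(
    rule_id="DQ-EXAMPLE-001",
    business_definition="Agreed definition of 'active customer' goes here",
    owner="Customer Data Steward",
    lineage=("source_system", "integration_layer", "churn_scorecard"),
    threshold_pct=98.0,
    usage_context="Churn scorecard",
)
print(rule.breached(96.5))
```

Keeping definition, owner, lineage, threshold, and usage in one record is what lets a scorecard result be traced to an accountable person and a source.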
Question 13
Topic: Master Reference Integration and Warehouse Quality
A regional insurer is integrating claims data from two policy systems into an enterprise warehouse. Profiling shows that both systems pass format validation, but claim status counts differ because one source maps C to “Closed” and the other maps C to “Cancelled.” Finance uses the warehouse for monthly loss reporting, and the claims data steward owns the business definition of claim status. What is the best quality decision?
Options:
A. Create an approved canonical status definition and crosswalk
B. Report separate claim status metrics by source system
C. Cleanse the incorrect rows after each monthly load
D. Add a warehouse filter for ambiguous status codes
Best answer: A
Explanation: Interoperability quality depends on shared meaning, not just valid formats. Here, the defect is semantic inconsistency: the same code has different business meanings across sources. Because finance relies on consolidated warehouse reporting and the claims data steward owns the definition, the sustainable action is to establish a governed canonical definition and source-to-target crosswalk for claim status. That action aligns metadata, reference mapping, stewardship approval, and integration rules so the warehouse applies consistent meaning during ingestion. Downstream filters or recurring cleansing may reduce visible errors, but they do not prevent the source-mapping mismatch from recurring.
- Warehouse filtering hides ambiguous values but does not resolve the definition conflict or preserve business meaning.
- Monthly cleansing treats symptoms after the load and creates recurring manual remediation.
- Separate source metrics avoids consolidation rather than fixing the interoperability issue needed for finance reporting.
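The canonical definition and crosswalk from Question 13 can be sketched as a lookup keyed by source system and source code. The two meanings of C come from the scenario; the system names `policy_system_1` and `policy_system_2`, the canonical labels, and the function name are assumptions made for the example.

```python
# (source system, source code) -> canonical claim status.
# System names are hypothetical; the two meanings of "C" come from the scenario.
CLAIM_STATUS_CROSSWALK = {
    ("policy_system_1", "C"): "CLOSED",
    ("policy_system_2", "C"): "CANCELLED",
}


def to_canonical_status(source_system, source_code):
    """Translate a source code to its canonical status, or fail loudly."""
    key = (source_system, source_code)
    if key not in CLAIM_STATUS_CROSSWALK:
        # Unmapped codes are raised for steward review rather than guessed.
        raise KeyError(f"No approved mapping for {key}")
    return CLAIM_STATUS_CROSSWALK[key]


print(to_canonical_status("policy_system_1", "C"))
print(to_canonical_status("policy_system_2", "C"))
```

Applying the lookup during ingestion means the warehouse stores one agreed meaning per status, so consolidated loss reporting no longer depends on which system a claim came from.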
Question 14
Topic: Profiling Discovery and Assessment
A data quality team profiles a customer mart and finds frequent invalid values in customer_status. Lineage shows the values flow unchanged from the account-servicing application through ETL into the mart. Business users confirm the status is selected when a service representative updates an account, but the application allows free-text entry.
Where should the recurring quality control be placed?
Options:
A. At account status update in the source application
B. After mart load in a manual cleansing queue
C. In the BI report filter for customer status
D. During monthly profiling of the customer mart
Best answer: A
Explanation: Lifecycle reasoning places a control where the defect enters or can be most effectively prevented, not merely where it is discovered. Profiling found the invalid statuses in the mart, but lineage shows the values pass through unchanged from the account-servicing application. Since representatives create the invalid values during account updates, the sustainable control is source-process validation against the governed reference list at that update point. Downstream cleanup or reporting filters may reduce visible impact, but they do not stop recurrence or correct the originating process.
- Profiling the mart detects and measures the issue, but monthly discovery is not a preventive lifecycle control.
- Filtering reports hides invalid statuses from users, but leaves defective data in the pipeline.
- Manual cleansing after load can repair existing records, but it treats symptoms rather than the source-process cause.
Question 15
Topic: Quality Rules Standards and Requirements
A data quality analyst profiles customer onboarding data and finds that 7% of records have a blank date_of_birth. Marketing explains that age is needed for consent checks in some jurisdictions, but not for every campaign. Before a quality rule is approved, which clarification best supports a business-fit requirement?
Options:
A. Can the ETL replace blank birth dates with a standard default value?
B. How many blank birth dates appeared in the most recent profile run?
C. Can every source system make birth date a mandatory field?
D. Which customers require birth date, what tolerance applies, and who remediates exceptions?
Best answer: D
Explanation: A quality requirement should express fitness for purpose, not just a technical defect. Here, date_of_birth is required only where age-based consent checks depend on it. The best clarification identifies the population in scope, the acceptable tolerance for missing values, and the accountable remediation path. That gives stewards and implementers enough context to write a meaningful rule and manage exceptions. Profiling counts help quantify the issue, but they do not define the business requirement. Mandatory capture or default values may be possible controls, but they should follow from the agreed requirement rather than replace it.
- Default value fix risks creating inaccurate data and does not clarify whether birth date is truly required for each customer.
- Profile count only measures the observed defect but does not define business impact, tolerance, or accountability.
- Mandatory field everywhere may over-control the process because the business need applies only to specific jurisdictions or uses.
Question 16
Topic: Profiling Discovery and Assessment
An analyst profiles ship_date in an order mart and finds that 18% of records are null. The shipping team confirms that digital products do not ship, and physical orders receive ship_date only after carrier handoff. The intended dashboard uses fulfilled physical orders to calculate delivery cycle time. What is the most appropriate interpretation of the evidence?
Options:
A. Treat all null ship_date values as completeness defects
B. Classify the finding primarily as a timeliness defect
C. Assess ship_date with a conditional completeness rule
D. Reject the profile because marts are not source systems
Best answer: C
Explanation: A profiling result is evidence, not a final quality judgment by itself. The 18% null rate must be interpreted against the business rule for when ship_date should exist, the order lifecycle, the source process that populates the field, and the dashboard’s intended use. Digital-product orders and physical orders before carrier handoff may be validly null. For delivery cycle-time reporting, the relevant population is fulfilled physical orders, so the quality rule should test completeness only within that scope. The key distinction is between a raw profiling finding and a contextual business-quality assessment.
- All nulls as defects ignores cases where the attribute is not applicable or not yet expected in the lifecycle.
- Timeliness defect would concern whether expected values arrive late, not whether all nulls are invalid.
- Rejecting mart evidence goes too far because mart profiling can still reveal useful assessment evidence when lineage and intended use are considered.
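The conditional completeness rule from Question 16 can be sketched by restricting the test to the population where ship_date is expected. The field names `product_type` and `fulfilled` and the sample orders are hypothetical; only ship_date and the scope (fulfilled physical orders) come from the scenario.

```python
def ship_date_completeness(orders):
    """Completeness of ship_date among fulfilled physical orders only."""
    in_scope = [
        order for order in orders
        if order["product_type"] == "physical" and order["fulfilled"]
    ]
    if not in_scope:
        return None  # nothing to assess
    populated = sum(1 for order in in_scope if order["ship_date"] is not None)
    return populated / len(in_scope)


# Hypothetical sample orders.
orders = [
    {"product_type": "digital", "fulfilled": True, "ship_date": None},    # not applicable
    {"product_type": "physical", "fulfilled": False, "ship_date": None},  # not yet expected
    {"product_type": "physical", "fulfilled": True, "ship_date": "2024-05-02"},
    {"product_type": "physical", "fulfilled": True, "ship_date": None},   # true defect
]

raw_null_rate = sum(1 for order in orders if order["ship_date"] is None) / len(orders)
score = ship_date_completeness(orders)
print(raw_null_rate, score)
```

In this sample three of four rows are null, yet only one in-scope order is actually defective, which is the gap between a raw profiling finding and a contextual assessment.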
Question 17
Topic: Data Quality Foundations and Business Fitness
A marketing team cannot determine campaign eligibility for many new customers. Profiling shows 18% of customer records created in the last two weeks have a null consent_status. Lineage review shows the new web signup form stopped requiring the field after a release. The data steward must log the issue so governance can track the quality dimension, measurement, root cause, and remediation. Which entry is the best fit?
Options:
A. Dimension: completeness; measure: null rate; root cause: form change; remediation: restore control and backfill values
B. Dimension: validity; measure: allowed-code failure rate; root cause: reference list gap; remediation: add consent codes
C. Dimension: timeliness; measure: load latency; root cause: delayed batch; remediation: increase refresh frequency
D. Dimension: accuracy; measure: duplicate count; root cause: weak matching; remediation: merge customer records
Best answer: A
Explanation: The central defect is that a required attribute is absent, so the quality dimension is completeness. The measurement method should quantify the defect directly, such as the percentage of records where consent_status is null. The root cause is not the null value itself; it is the web signup process change that stopped requiring the field. A sustainable remediation addresses both prevention and correction: restore the source control so new records are captured correctly, then backfill or otherwise resolve the affected records according to approved business rules. Accuracy, timeliness, and validity are plausible dimensions in other cases, but they do not match the visible evidence of missing required data.
- Accuracy trap fails because no evidence shows incorrect values or duplicate customer records.
- Timeliness trap fails because the records are present; the defect is a missing attribute, not late delivery.
- Validity trap fails because the issue is null consent_status, not values failing an approved code set.
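The null-rate measure named in the best answer can be sketched in a few lines. This is a minimal illustration only: the sample records, the `null_rate` helper, and the consent values are hypothetical, and the sketch treats both `None` and an empty string as missing.

```python
# Hypothetical sample of recently created customer records (illustrative data only).
records = [
    {"customer_id": "C1", "consent_status": "OPT_IN"},
    {"customer_id": "C2", "consent_status": None},
    {"customer_id": "C3", "consent_status": "OPT_OUT"},
    {"customer_id": "C4", "consent_status": None},
    {"customer_id": "C5", "consent_status": "OPT_IN"},
]


def null_rate(rows, field):
    """Return the share of rows where `field` is missing (None or empty string)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(field) in (None, ""))
    return missing / len(rows)


# Completeness is measured directly on the attribute that the business use needs.
consent_null_rate = null_rate(records, "consent_status")
print(f"consent_status null rate: {consent_null_rate:.0%}")
```

A measure like this quantifies the defect; it does not identify the root cause, which in the scenario came from lineage review of the signup form.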
Question 18
Topic: Quality Strategy Business Case and Prioritization
A bank is building a data quality roadmap for customer data. Profiling has found duplicate records, missing contact fields, and inconsistent customer status values. Each line of business defines “active customer” differently, so proposed quality metrics, rules, steward assignments, and remediation priorities do not align. Which dependency should the roadmap resolve first?
Options:
A. Create a monthly defect-count dashboard
B. Approve shared business definitions and quality requirements
C. Cleanse the highest-volume duplicate records
D. Purchase a data quality profiling tool
Best answer: B
Explanation: A quality roadmap needs stable business meaning before operational quality management can work consistently. If “active customer” means different things across business areas, then rules may test different conditions, metrics may report incompatible results, stewards may dispute ownership, and remediation may target the wrong priorities. DAMA-aligned data quality practice starts from fitness for purpose: define the critical data elements, agree their business definitions, document quality expectations, and confirm accountable ownership. Profiling evidence is useful, but it cannot decide business meaning by itself.
Tooling, dashboards, and cleansing can follow, but they should implement agreed quality requirements rather than substitute for them.
- Tool first fails because profiling capability does not resolve conflicting business definitions or accountability.
- Dashboard first fails because defect counts will not be comparable when the measured terms differ by business area.
- Cleansing first fails because duplicate remediation may be inconsistent without agreed customer definitions and survivorship expectations.
Question 19
Topic: Metadata Lineage and Quality Evidence
A data quality team publishes monthly quality scores for customer data, but business users distrust the scores. They say they cannot tell what each score means, which source systems were measured, or how the results relate to approved business definitions. What metadata improvement would most directly improve trust in the scores?
Options:
A. Cleanse failed records before publishing the scores
B. Increase the frequency of automated profiling jobs
C. Add more visual indicators to the quality scorecard
D. Publish business definitions, rule logic, source lineage, and score ownership
Best answer: D
Explanation: Quality scores need supporting metadata to be credible and fit for purpose. When users do not know what a score means, where the measured data came from, or how the score maps to approved business definitions, the main gap is not measurement volume or presentation. The improvement should connect each score to business glossary definitions, quality rules, thresholds, source lineage, calculation logic, and stewardship accountability. This evidence lets users understand whether the score measures the right thing and whether it applies to their business use.
More profiling or cleansing may improve data quality operations, but it does not explain the meaning, provenance, or governance basis of the published scores.
- More profiling can produce additional measurements, but it does not clarify approved definitions or source lineage.
- Better visuals may make a scorecard easier to read, but they do not provide evidence behind the scores.
- Pre-publication cleansing may reduce visible defects, but it can hide quality issues without explaining score meaning or sources.
Question 20
Topic: Root Cause Analysis and Remediation
A data quality team is investigating recurring invalid shipping-country codes in the monthly customer analytics dataset. The approved quality rule requires an ISO country code, and invalid values cause 6-8% of regional sales records to be excluded from the scorecard. Profiling shows the spike began after a new partner order portal went live. Lineage shows the analytics load maps the ERP Ship Country field through the enterprise country reference table; unmatched values are loaded as ZZ. The reference table and CRM customer master have not changed. Which underlying cause is most likely?
Options:
A. The enterprise country reference table is missing valid codes
B. The CRM customer master contains duplicate customer records
C. The regional sales scorecard threshold is too strict
D. The partner portal is not enforcing enterprise country codes
Best answer: D
Explanation: Recurring defect evidence should be traced to the point where the defect first enters the data lifecycle. Here, the timing, lineage, and unchanged reference/master data all point upstream to the new partner portal. The analytics process is behaving consistently: it maps ERP Ship Country values to the approved reference table and assigns ZZ when no match exists. Because the spike begins after the portal launch and not after a reference-data or CRM change, the likely cause is weak data capture control in the source process, such as accepting free text or unapproved country values. The best root-cause finding identifies the process that creates recurring defects, not only the downstream symptom.
- Missing reference codes is unlikely because the enterprise reference table has not changed and the spike began only after the new portal went live.
- Duplicate customer records would more directly affect uniqueness or survivorship, not country-code validity from order entry.
- Strict scorecard threshold describes business tolerance, not the mechanism creating invalid Ship Country values.
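The tracing logic in this explanation can be illustrated with a small sketch: apply the same reference lookup the load uses, then count unmatched values by entry channel. Everything here is assumed for illustration, including the reference subset, the `source` labels, and the sample orders; only the `ZZ` fallback comes from the scenario.

```python
from collections import Counter

# Hypothetical subset of the enterprise country reference table (ISO alpha-2 codes).
COUNTRY_REFERENCE = {"US", "CA", "DE", "FR"}
UNMATCHED = "ZZ"  # fallback code described in the scenario

# Hypothetical order rows; the "source" labels are illustrative entry channels.
orders = [
    {"source": "partner_portal", "ship_country": "Germany"},
    {"source": "partner_portal", "ship_country": "US"},
    {"source": "partner_portal", "ship_country": "u.s."},
    {"source": "erp_direct", "ship_country": "CA"},
    {"source": "erp_direct", "ship_country": "FR"},
]


def map_country(value):
    """Return the value if it is an approved code, otherwise the fallback."""
    return value if value in COUNTRY_REFERENCE else UNMATCHED


# Count fallback assignments per entry channel to see where defects enter.
unmatched_by_source = Counter(
    order["source"]
    for order in orders
    if map_country(order["ship_country"]) == UNMATCHED
)
print(dict(unmatched_by_source))
```

When unmatched values cluster in one channel while the mapping and reference data are unchanged, the evidence points to capture control in that channel rather than to the downstream load.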
Question 21
Topic: Root Cause Analysis and Remediation
A bank profiles customer onboarding data and finds that 18% of online applications have a blank country_of_residence. Lineage shows the web form allows the field to be skipped, but regulatory reporting requires the value before account activation. Operations currently fills blanks after loading by reading free-text addresses. Which remediation action best fits this situation?
Options:
A. Exclude online applications from the completeness scorecard
B. Run a weekly cleansing job to infer the country from addresses
C. Make the field mandatory with controlled-list validation at onboarding
D. Send monthly exceptions to a data steward for manual review
Best answer: C
Explanation: Remediation should target the root cause and the business impact. Here, the defect is not just a backlog of missing values; it is caused by an onboarding process that permits a mandatory regulatory attribute to be blank. A source correction or process redesign, such as making the field required and validating it against an approved controlled list before activation, prevents new defects and aligns the process with fitness for purpose. Cleansing and stewardship review may help with existing exceptions or interim handling, but they do not remove the cause. Adjusting the scorecard would hide the quality issue rather than remediate it.
- Cleansing only treats symptoms after load and leaves the broken capture process in place.
- Scorecard exclusion weakens measurement and masks a regulatory quality requirement.
- Manual review may support exception handling, but monthly review is too late and does not prevent recurrence.
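A source-side control of the kind described in the best answer might look like the sketch below. The controlled list, the function name, and the choice to normalize case and surrounding whitespace are all assumptions for illustration; a real onboarding form would use the organization's approved reference data.

```python
# Hypothetical controlled list; a real one would come from approved reference data.
ALLOWED_COUNTRIES = frozenset({"US", "CA", "GB", "DE"})


def validate_country_of_residence(value):
    """Return a list of error messages; an empty list means the value passes."""
    if value is None or not value.strip():
        return ["country_of_residence is required before account activation"]
    # Assumption: codes are compared case-insensitively after trimming.
    if value.strip().upper() not in ALLOWED_COUNTRIES:
        return [f"country_of_residence {value!r} is not in the approved list"]
    return []


for candidate in (None, "", "ca", "Canada"):
    print(repr(candidate), validate_country_of_residence(candidate))
```

The point of the sketch is placement: the check runs at capture, so blanks and free-text values are rejected before they reach regulatory reporting.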
Question 22
Topic: Monitoring Scorecards and Measurement
A data steward reports that the customer onboarding data set has “1,250 defects this month.” Business owners say the number does not help them decide what to fix first. Which metric would provide the most actionable evidence for a quality scorecard?
Options:
A. Count of open data quality tickets
B. Number of records scanned by the profiling tool
C. Defect rate by rule, source, trend, and business threshold
D. Total defects found during monthly profiling
Best answer: C
Explanation: A useful data quality metric supports a decision, not just activity reporting. A raw count such as “1,250 defects” lacks context: it does not show the size of the population, which quality rules failed, which sources or processes caused the failures, whether the trend is improving, or whether a threshold was breached. For a scorecard, the metric should connect quality results to business fitness for purpose, agreed rules, and remediation priority. A defect rate by rule and source, compared with thresholds and trends, gives stewards and owners evidence for triage and root-cause work.
The key takeaway is that actionable metrics explain significance and priority; generic counts only show that defects exist.
- Monthly defect total fails because it lacks population size, business impact, rule context, and priority signal.
- Records scanned measures profiling activity, not quality outcomes or remediation urgency.
- Open tickets may help manage workflow, but they do not directly measure the quality condition of the data.
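To make the contrast with a raw count concrete, the sketch below breaks a total of 1,250 defects into defect rates by rule and source and compares each rate with a threshold. The rule names, sources, record counts, and thresholds are invented for illustration, and trend is left out to keep the example short.

```python
# Hypothetical rule results: (rule, source, records_checked, records_failed).
results = [
    ("tax_id_valid", "web_form", 4000, 600),
    ("tax_id_valid", "branch", 6000, 50),
    ("email_populated", "web_form", 4000, 400),
    ("email_populated", "branch", 6000, 200),
]

# Hypothetical maximum acceptable defect rate per rule.
THRESHOLDS = {"tax_id_valid": 0.02, "email_populated": 0.05}


def scorecard(rows, thresholds):
    """Return one line per rule and source, worst defect rate first."""
    lines = []
    for rule, source, checked, failed in rows:
        rate = failed / checked if checked else 0.0
        lines.append(
            {
                "rule": rule,
                "source": source,
                "defect_rate": rate,
                "breached": rate > thresholds[rule],
            }
        )
    return sorted(lines, key=lambda line: line["defect_rate"], reverse=True)


card = scorecard(results, THRESHOLDS)
for line in card:
    flag = "BREACH" if line["breached"] else "ok"
    print(f'{line["rule"]:<16} {line["source"]:<9} {line["defect_rate"]:.1%} {flag}')
```

The same 1,250 defects now show which rule and source to investigate first, which is the decision the business owners asked for.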
Question 23
Topic: Quality Maturity and Continuous Improvement
A retailer’s data quality maturity assessment finds a reactive approach to customer address quality. Missed deliveries are increasing, and profiling shows 9% of new customer addresses fail postal-format validation. Exceptions are corrected in the data warehouse after the monthly load, but no named business steward approves rules, reviews trends, or escalates recurring source-process defects.
Which capability improvement would best strengthen sustained data quality management?
Options:
A. Establish steward-owned rules, thresholds, monitoring, and issue escalation
B. Run a one-time cleansing project in the data warehouse
C. Increase ad hoc profiling before each monthly load
D. Ask downstream analysts to document manual corrections
Best answer: A
Explanation: The maturity finding points to weak sustained management, not just dirty address values. The organization already profiles defects and performs downstream corrections, but the process remains reactive because ownership, thresholds, monitoring, and escalation are missing. A stronger capability would assign accountable stewardship for address quality rules, define acceptable thresholds, monitor results over time, and route recurring exceptions to the source-process owner for remediation. This turns isolated detection and cleanup into managed prevention and continuous improvement. One-time cleansing or more frequent profiling may improve short-term reporting, but they do not create accountability or prevent recurrence.
- Warehouse cleansing treats symptoms after the monthly load and does not address rule ownership or source-process prevention.
- More profiling increases detection, but profiling alone does not establish thresholds, accountability, or remediation governance.
- Manual correction logs may provide evidence, but they leave sustained quality management dependent on downstream workarounds.
Question 24
Topic: Metadata Lineage and Quality Evidence
A data quality team is reviewing catalog entries after two certified reports show different “active customer” counts. The glossary contains two approved definitions: Sales defines an active customer as “placed an order in the last 12 months,” while Support defines it as “has an open account with no suspension.” Both reports use valid source fields and documented lineage.
What is the most direct quality effect of this glossary conflict?
Options:
A. Invalid reference data values in source systems
B. Inconsistent quality measurement across reports
C. Duplicate customer records requiring survivorship rules
D. Incomplete lineage for report certification
Best answer: B
Explanation: A glossary conflict affects data quality by undermining shared meaning. Here, the source fields are valid and lineage is documented, but the business term “active customer” has two approved meanings. That means each report can be technically correct while measuring a different customer population. The quality issue is not primarily validity, completeness, or duplication; it is inconsistent interpretation that makes scorecards, thresholds, and business reporting difficult to compare. Resolving the conflict requires governance agreement on the term definition or clearly governed term variants for different business purposes.
- Reference data values are not the issue because the facts say the source fields are valid.
- Lineage certification is not the issue because lineage is already documented.
- Duplicate records are not indicated; conflicting definitions can change counts even when each customer record is unique.
Question 25
Topic: Profiling Discovery and Assessment
A data steward receives complaints that the current rule, “customer email must be populated,” is producing too many exceptions for records loaded from a legacy call-center source. Business users suggest changing the rule to apply only to online customers. Before changing the rule, what evidence is most appropriate to collect through profiling?
Options:
A. A new scorecard showing fewer email exceptions
B. A revised exception threshold approved by the data quality team
C. Completeness and usage patterns by source and customer channel
D. A one-time cleanup list for all missing email values
Best answer: C
Explanation: Profiling supports rule definition by discovering actual data patterns before a quality rule is created or changed. In this case, the proposed rule change depends on whether missing email values are concentrated in a source, associated with a customer channel, and meaningful for business use. The strongest evidence is a profile that segments completeness results by source system and customer channel, ideally paired with usage context from stakeholders. That evidence helps distinguish a valid business exception from a recurring defect or source-process issue. Changing thresholds, cleansing records, or adjusting scorecards before understanding the pattern can hide quality problems rather than improve fitness for purpose.
- Threshold approval skips the discovery needed to know whether the proposed exception is justified.
- One-time cleanup treats symptoms but does not determine whether the rule itself fits the data and business need.
- Scorecard reduction improves reported results only cosmetically if the underlying pattern and rule rationale are not understood.
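A segmented completeness profile of the kind described in the best answer could be sketched as follows. The records, source names, and channel values are hypothetical, and the usage context from stakeholders still has to be gathered separately.

```python
from collections import defaultdict

# Hypothetical customer rows (illustrative data only).
customers = [
    {"source": "legacy_call_center", "channel": "phone", "email": None},
    {"source": "legacy_call_center", "channel": "phone", "email": None},
    {"source": "legacy_call_center", "channel": "phone", "email": "a@example.com"},
    {"source": "web_signup", "channel": "online", "email": "b@example.com"},
    {"source": "web_signup", "channel": "online", "email": "c@example.com"},
    {"source": "web_signup", "channel": "online", "email": None},
]


def completeness_by_segment(rows, field, keys):
    """Return the populated share of `field` for each combination of `keys`."""
    totals = defaultdict(int)
    populated = defaultdict(int)
    for row in rows:
        segment = tuple(row[key] for key in keys)
        totals[segment] += 1
        if row.get(field):  # None and empty string both count as missing
            populated[segment] += 1
    return {segment: populated[segment] / totals[segment] for segment in totals}


profile = completeness_by_segment(customers, "email", ("source", "channel"))
for segment, share in profile.items():
    print(segment, f"{share:.0%}")
```

A profile like this shows whether missing emails are concentrated in one source or channel, which is the evidence needed before narrowing the rule.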
Questions 26-50
Question 26
Topic: Root Cause Analysis and Remediation
A data quality issue has already been prioritized by the stewardship council. Profiling confirmed that 8% of customer records contain an expired status code, and root-cause analysis traced the defect to a missing validation check in the customer onboarding application. The approved plan is to prevent new invalid codes and repair affected records. Which action best represents remediation execution?
Options:
A. Add the defect rate to the monthly scorecard
B. Rank the issue by business impact and urgency
C. Implement the validation fix and correct affected records
D. Run a frequency profile on status code values
Best answer: C
Explanation: Remediation execution is the point where approved fixes are applied. In this case, the issue has already been triaged, profiled, and analyzed for root cause, so the next step is to implement the onboarding validation change and repair the records affected by the expired code. That action addresses both prevention and correction. Ranking the issue belongs to triage, profiling measures and discovers the defect pattern, and scorecarding supports ongoing monitoring after or during remediation. Sustainable remediation should focus on the source process when the root cause is known, not only on downstream cleanup.
- Business ranking is triage because it decides priority before the fix is executed.
- Frequency profiling is diagnostic because it quantifies the defect but does not repair or prevent it.
- Monthly scorecarding is control monitoring because it tracks quality results over time rather than applying the fix.
Question 27
Topic: Data Quality Foundations and Business Fitness
A data steward reviews a recurring customer onboarding issue. Sales reports lost orders when new customer records are created with invalid tax classification codes. Profiling shows 7% invalid codes in the last month, all originating from a free-text field in the onboarding form. Governance has approved a standard reference list, and the process owner can change the form before the next sales cycle. Which classification best fits the quality initiative that should be prioritized?
Options:
A. Corrective cleansing of existing customer records
B. Reactive handling of sales order failures
C. Preventive control at the source process
D. Detective monitoring through a monthly scorecard
Best answer: C
Explanation: A preventive quality initiative stops defects from being created, usually by improving the source process, control, rule, or data entry design. The facts point to a known root cause: a free-text onboarding field allows invalid tax classification codes. Because governance has approved the reference list and the process owner can change the form before the next cycle, the best priority is to prevent future invalid values at capture. Detective monitoring would find defects after entry, corrective cleansing would repair existing records, and reactive handling would respond only after sales orders fail.
- Monthly scorecard detects and reports the 7% invalid-code rate but does not stop new invalid values from being created.
- Record cleansing corrects current customer data but leaves the free-text root cause in place.
- Order-failure handling reacts to business disruption after the defect has already affected sales.
Question 28
Topic: Monitoring Scorecards and Measurement
A data quality team reports major scorecard improvement after a customer onboarding remediation. The approved scorecard contains only these measures:
| Measure | Current result |
|---|---|
| Email format validity | 99.4% |
| Mandatory field completeness | 98.1% |
| Duplicate customer ID rate | 0.8% |
Sales operations still cannot prioritize high-value leads because industry_segment is inconsistent across channels, and forecast accuracy has not improved. What should the data quality lead do next?
Options:
A. Increase the email-format threshold until forecast accuracy improves
B. Ask BI analysts to relabel the forecast dashboard metric
C. Reframe the scorecard to measure segment fitness for lead prioritization
D. Close the remediation because all reported quality measures improved
Best answer: C
Explanation: Data quality is judged by fitness for purpose, not by improved scores alone. The scorecard shows better email validity, completeness, and uniqueness, but the remaining business problem depends on consistent industry_segment values. The next action is to align measurement with the downstream decision, define or approve a quality rule for the segment data with the appropriate steward and business owner, and track whether that rule supports lead prioritization and forecasting. A scorecard can be green while the business outcome remains unsupported if it measures the wrong attributes or dimensions. The key takeaway is to interpret metrics in relation to the intended business use, not as isolated technical achievements.
- Closing too early treats metric improvement as success even though the stated sales outcome is still unresolved.
- Raising email validity optimizes a measure that is already strong and not tied to the segment defect.
- Relabeling the dashboard changes presentation, not the underlying data fitness problem affecting prioritization.
Question 29
Topic: Master Reference Integration and Warehouse Quality
A data quality team investigates why the warehouse KPI for current billable subscribers is overstated. CRM uses status ACTIVE for signed subscriptions, including future start dates. Billing uses status ACTIVE only when a subscription is currently billable. The integration maps both source values named ACTIVE to the warehouse value Active. Profiling shows status values are populated and valid, and the nightly load completed before the scorecard cutoff. Which cause classification best fits the defect?
Options:
A. Semantic mismatch between source definitions
B. Timing defect in the nightly load
C. Source-system quality defect in CRM
D. Technical transformation error in the warehouse
Best answer: A
Explanation: The defect is a semantic mismatch: the same code value, ACTIVE, is valid in both systems but means different things. CRM includes future-start subscriptions because its status reflects a signed subscription, while Billing limits ACTIVE to currently billable subscriptions. The warehouse KPI is wrong because integration treated identical labels as identical meanings. Profiling rules for completeness and validity do not expose this type of issue, because the source values are present and allowed. The proper quality response would involve clarifying business definitions, lineage, and integration rules so the warehouse distinguishes current billable subscribers from future-start subscribers.
- CRM source defect does not fit because CRM is using a valid value according to its own business process.
- Nightly timing does not fit because the load completed before the cutoff and no late-arriving data fact is given.
- Transformation error is not the primary cause because the stated problem is meaning alignment, not a failed technical conversion.
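One way to picture the semantic mismatch is to compare a label-only mapping with a source-aware one. This is a sketch under stated assumptions: the function names, the `start_date` comparison, and the warehouse value `Signed (future start)` are hypothetical, and the actual integration rule would need to be agreed by the business owners of both definitions.

```python
from datetime import date


def naive_status(status):
    """Label-only mapping as described in the scenario: ACTIVE always becomes Active."""
    return "Active" if status == "ACTIVE" else "Other"


def source_aware_status(source_system, status, start_date, as_of):
    """Hypothetical mapping that keeps CRM future-start subscriptions separate."""
    if status != "ACTIVE":
        return "Other"
    if source_system == "CRM" and start_date > as_of:
        return "Signed (future start)"
    return "Active"


as_of = date(2025, 1, 31)
future_crm = ("CRM", "ACTIVE", date(2025, 3, 1))
current_billing = ("Billing", "ACTIVE", date(2024, 6, 1))

print(naive_status(future_crm[1]))                   # counted as a billable subscriber
print(source_aware_status(*future_crm, as_of))       # kept out of the billable count
print(source_aware_status(*current_billing, as_of))  # still counted as Active
```

Both source values pass completeness and validity checks in either version; only the mapping that encodes the agreed meaning changes the KPI.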
Question 30
Topic: Quality Strategy Business Case and Prioritization
A product leadership team has stopped using a sales-performance dashboard because revenue, product category, and customer-segment totals do not reconcile across source systems. Teams now delay pricing and portfolio decisions until analysts manually investigate the differences. Which data quality value driver is most directly demonstrated?
Options:
A. Trusted analytics
B. Reference data standardization
C. Customer experience
D. Regulatory support
Best answer: A
Explanation: A data quality business case should link defects to business value. Here, the visible impact is that leaders do not trust analytical outputs and delay management decisions until reconciliation work is completed. That points most directly to trusted analytics: improving quality so reports, dashboards, and metrics are fit for decision use. The manual investigation also creates inefficiency, but it is secondary to the stated business problem of unreliable performance information. Regulatory support would require a compliance or reporting obligation, and customer experience would require a direct customer-facing impact.
- Regulatory support would fit if the defect affected statutory reporting, audit evidence, or compliance controls, which is not stated.
- Customer experience would fit if customers received wrong messages, products, bills, or service outcomes, not just internal decision delays.
- Reference data standardization may be part of remediation, but it is not the business value driver being demonstrated.
Question 31
Topic: Specialist Cross-Discipline Judgment
A bank’s customer analytics and regulatory reporting teams are affected by duplicate customer records and invalid tax identifiers. Profiling shows that most new defects originate in two regional onboarding forms where validation is optional. The warehouse team already performs monthly cleanup but has no capacity for more manual matching. No enterprise owner has approved the customer identity quality rules. Which action is the best quality decision for sustainable improvement?
Options:
A. Create a local duplicate list for each region
B. Add a reporting caveat to affected dashboards
C. Increase monthly warehouse cleansing frequency
D. Assign rule ownership and approve source-entry controls
Best answer: D
Explanation: Sustainable data quality improvement depends on governance decisions that make quality expectations explicit, owned, and embedded in the process where defects arise. Here, profiling points to the onboarding forms as the main source of duplicate records and invalid identifiers, while downstream cleanup is already capacity constrained. The most effective governance action is to assign ownership for the customer identity rules, approve standards and thresholds, and require source-entry validation or stewardship workflow changes. That shifts the response from repeated cleanup to prevention and ongoing accountability. A warehouse or reporting fix may reduce visible symptoms, but it does not establish durable rule ownership or correct the source process.
- More cleanup treats symptoms after defects enter the environment and conflicts with the stated capacity limit.
- Reporting caveats may warn users, but they do not improve the underlying quality of customer data.
- Regional lists create fragmented local handling and do not establish enterprise rules for shared customer identity.
Question 32
Topic: Monitoring Scorecards and Measurement
A data steward is designing a monthly scorecard for the customer master data domain. The business wants a measure that shows whether the data is fit for purpose, not just how much work the team performed or how users feel about the system. Which scorecard entry best fits that need?
Options:
A. Sales team comments about the ease of finding customers
B. Number of customer records processed by the matching job
C. Number of duplicate-resolution tickets closed during the month
D. Percentage of active customer records passing approved validity and uniqueness rules
Best answer: D
Explanation: A data quality metric measures the condition of data against defined expectations, such as validity, completeness, consistency, timeliness, uniqueness, or integrity. For a customer master scorecard, the strongest measure is the percentage of active records that pass approved quality rules, because it reports the quality state of the data itself. Operational volume metrics, project activity measures, and satisfaction comments can provide useful context, but they do not directly show whether the data meets quality requirements. A good quality KPI should connect a rule or threshold to business use, not merely count processing activity or remediation workload.
- Processing volume can show workload or system throughput, but it does not indicate whether the processed records are correct or fit for use.
- Tickets closed measures remediation activity, but defects may remain or recur even when many tickets are completed.
- User comments can signal perceived problems, but they are qualitative feedback rather than a controlled quality metric against rules.
Question 33
Topic: Quality Maturity and Continuous Improvement
A customer service data quality scorecard has shown 99% validity for delivery_status for three months, above the 98% threshold. However, returned shipments are rising because orders marked Delivered often have a blank delivery_date. Profiling confirms the status codes are valid, but the date is missing mainly from one fulfillment workflow. What improvement best closes the learning loop?
Options:
A. Lower the validity threshold so the scorecard reflects the returns issue
B. Transfer ownership of the rule from the steward to the ETL team
C. Cleanse missing delivery dates in the reporting database each month
D. Add a conditional rule for delivered orders and monitor source exceptions
Best answer: D
Explanation: Continuous improvement uses monitoring evidence to refine rules, controls, ownership, and follow-up action. The scorecard result is not wrong, but it measures only whether delivery_status contains valid codes. The business problem shows a missed condition: when an order is marked Delivered, delivery_date must be present. The better improvement is to add or revise a quality rule tied to business fitness for purpose, apply it near the fulfillment source process, and monitor exceptions so the recurring workflow defect can be corrected. Lowering the threshold would only move the pass mark; it would not make the scorecard measure the real defect. Downstream cleansing may help reports temporarily, but it does not prevent recurrence.
- Threshold tuning fails because the measured rule is incomplete; changing the threshold does not add the missing delivery-date condition.
- Monthly cleansing treats symptoms in reporting but leaves the fulfillment workflow defect recurring.
- ETL ownership confuses implementation responsibility with stewardship accountability for approving business quality rules.
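The difference between the existing validity rule and the conditional rule in the best answer can be sketched as below. The status list, rule names, and sample order are assumptions for illustration; the approved rule wording would come from the steward and business owner.

```python
# Hypothetical approved status codes.
VALID_STATUSES = {"Pending", "Shipped", "Delivered", "Returned"}


def check_order(order):
    """Return the names of the rules this order fails."""
    failures = []
    status = order.get("delivery_status")
    # Existing rule: the status must be an approved code.
    if status not in VALID_STATUSES:
        failures.append("delivery_status_valid")
    # Conditional rule: a delivered order must carry a delivery date.
    if status == "Delivered" and not order.get("delivery_date"):
        failures.append("delivered_requires_delivery_date")
    return failures


order = {"order_id": "O-1", "delivery_status": "Delivered", "delivery_date": None}
print(check_order(order))
```

The sample order passes the validity rule yet fails the conditional rule, which is why the scorecard stayed above threshold while returns were rising.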
Question 34
Topic: Quality Governance Roles and Stewardship
A customer master data domain has produced duplicate customer exceptions for three monthly cycles. Profiling shows most duplicates are created when regional onboarding teams use different matching criteria. The duplicates delay billing and affect customer service reporting. The data steward has limited remediation capacity and the matching rule is still informal. Which stewardship practice is the BEST quality decision?
Options:
A. Cleanse the current duplicate records and report the reduced exception count
B. Publish a dashboard showing duplicate trends by region each month
C. Convene rule owners, document the matching decision, assign remediation actions, and monitor recurrence
D. Ask the integration team to add stricter technical validation on required fields
Best answer: C
Explanation: Recurring quality exceptions require stewardship practices that move beyond cleanup. In this case, the defect is tied to inconsistent regional matching criteria, so the steward should coordinate a business decision about the customer matching rule, document it, assign accountable remediation or process changes, and monitor whether recurrence declines. This connects quality governance with operational follow-up: rule ownership, decision records, issue management, and sustained monitoring. Cleansing existing duplicates may reduce today’s exception count, but it does not address the informal rule or inconsistent source process that creates new duplicates.
- Current-state cleanup helps the immediate backlog but does not create durable prevention or a documented matching decision.
- Technical validation targets field-level completeness or validity, while the visible cause is inconsistent duplicate-matching logic.
- Trend reporting provides useful visibility, but reporting alone does not assign ownership or change the process creating exceptions.
Question 35
Topic: Quality Governance Roles and Stewardship
A regional bank’s customer risk scorecard shows that 18% of commercial customers are missing a required industry classification. Profiling shows the gap comes from three onboarding systems that use different code lists and have no shared owner for the classification standard. Compliance reporting uses the scorecard monthly, and the data quality team has capacity only for temporary exception fixes. Which decision best supports sustained data quality improvement?
Options:
A. Escalate to governance to assign ownership and standardize the code list
B. Ask each onboarding team to choose its preferred classification values
C. Cleanse the missing classifications in the scorecard each month
D. Lower the completeness threshold until onboarding systems improve
Best answer: A
Explanation: Governance escalation is appropriate when a data quality issue cannot be sustainably resolved by local cleansing or technical correction alone. Here, the defect affects a compliance-facing scorecard, originates in multiple onboarding systems, and depends on a shared industry classification standard with no clear owner. The data quality team can identify and temporarily repair exceptions, but it should not unilaterally decide enterprise policy, ownership, or cross-domain standards. Governance should assign accountable ownership, approve the standard code list, and ensure source-process changes are funded and implemented. Temporary cleanup may reduce immediate reporting risk, but it does not address the root cause.
- Monthly cleansing treats symptoms in the scorecard but leaves inconsistent onboarding code lists and missing ownership unchanged.
- Lowering the threshold hides the compliance-facing quality problem instead of addressing fitness for purpose.
- Local preferences would increase inconsistency because each onboarding team could maintain a different standard.
Question 36
Topic: Quality Strategy Business Case and Prioritization
A data quality lead can assign one remediation team this week. The business has asked for a risk-based triage decision, not a general cleanup plan.
| Logged issue | Evidence | Business use |
|---|---|---|
| Missing tax jurisdiction code | 8% of new-portal invoices; threshold 0.5% | Tax calculation and billing release |
| Mixed-case customer names | 35,000 records | No affected downstream rule |
| Duplicate inactive suppliers | 900 records | Archived purchasing history |
| Blank legacy product notes | 18% of discontinued SKUs | Not used in reporting |
Which action is the BEST quality decision?
Options:
A. Prioritize customer name standardization because it affects the most records.
B. Prioritize inactive supplier deduplication because uniqueness defects are serious.
C. Prioritize the tax jurisdiction defect and engage the process owner.
D. Backfill tax codes manually and close the defect as remediated.
Best answer: C
Explanation: Risk-based data quality triage should focus scarce remediation capacity on defects that threaten business fitness for purpose, not simply the largest or messiest cleanup queue. The missing tax jurisdiction code is above the agreed threshold and blocks tax calculation and billing release, so it has immediate business impact. Because the evidence points to the new portal, the triage decision should include the source process owner, not only downstream correction. The other issues may deserve backlog items, but the visible facts show low current value or limited operational impact.
- Largest row count is a weak priority signal when the affected attribute has no downstream rule or business impact.
- Generic uniqueness concern overstates risk because the duplicate suppliers are inactive and tied to archived history.
- Manual backfill only treats symptoms and does not address the recurring source-process defect.
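
The triage reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `Issue` fields, the `blocks_active_process` flag, and the ranking rule are assumptions introduced here. Only the 8% defect rate, the 0.5% threshold, and the 18% blank rate come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Issue:
    name: str
    defect_rate: Optional[float]   # observed defect rate, if one was measured
    threshold: Optional[float]     # agreed threshold, if one exists
    blocks_active_process: bool    # assumption: read from the "Business use" column


def breach_ratio(issue: Issue) -> float:
    """How many times the observed rate exceeds the agreed threshold (0 if unknown)."""
    if issue.defect_rate is None or not issue.threshold:
        return 0.0
    return issue.defect_rate / issue.threshold


def triage(issues: List[Issue]) -> List[Issue]:
    """Rank by business impact first, then by the size of the threshold breach."""
    return sorted(
        issues,
        key=lambda i: (i.blocks_active_process, breach_ratio(i)),
        reverse=True,
    )


issues = [
    Issue("Mixed-case customer names", None, None, False),
    Issue("Missing tax jurisdiction code", 0.08, 0.005, True),
    Issue("Duplicate inactive suppliers", None, None, False),
    Issue("Blank legacy product notes", 0.18, None, False),
]

ranked = triage(issues)
print(ranked[0].name)
```

Record count is deliberately absent from the sort key, which mirrors the point that the largest row count is a weak priority signal.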
Question 37
Topic: Specialist Cross-Discipline Judgment
A customer analytics team finds that the same quality rule for “active customer” is implemented differently in three reports. The differences affect retention metrics, and no team can show who approved the rule or who owns future changes. Which action best uses data governance to support sustained data quality?
Options:
A. Clean the three reports to use the most common definition
B. Ask each report team to document its local definition
C. Add a dashboard note warning users about possible differences
D. Assign a data owner and steward to approve and maintain the rule
Best answer: D
Explanation: Data governance strengthens data quality by defining decision rights and accountability for shared data. In this case, the problem is not only inconsistent report logic; it is the absence of an approved business rule and accountable ownership. A data owner should have authority for the business definition, while a steward helps maintain the rule, coordinate implementation, and monitor adherence. Once approved, the rule can be published as a standard, linked to metadata, and used consistently in quality checks and scorecards. Cleaning reports without governance may fix symptoms temporarily, but it does not prevent future conflicting definitions.
- Local cleanup may align current reports, but it does not establish approval authority or prevent recurrence.
- Dashboard warning communicates uncertainty, but it leaves the quality rule unresolved.
- Local documentation improves transparency, but separate definitions preserve inconsistency for a shared metric.
Question 38
Topic: Metadata Lineage and Quality Evidence
A data quality team is investigating inconsistent “active customer” counts used in executive reporting. Profiling shows the CRM source has valid status codes, no unexpected nulls, and stable record counts. Lineage shows the warehouse job and marketing dashboard each apply a different business rule, and monitoring shows no recent load failures. The catalog contains two unapproved glossary definitions for “active customer” with no assigned data steward. What is the best quality decision?
Options:
A. Cleanse CRM customer status values before the next load
B. Let each report keep its local active-customer rule
C. Rewrite the warehouse transformation as a technical defect
D. Escalate the glossary conflict for steward-approved definition alignment
Best answer: D
Explanation: A glossary definition problem exists when data is technically valid but stakeholders apply different meanings or rules to the same term. Here, profiling does not show source-data defects, and monitoring does not show a load or transformation failure. Lineage reveals that different downstream processes are applying different definitions of “active customer,” while the catalog lacks an approved glossary definition and steward ownership. The appropriate quality response is governance-led definition alignment: assign or engage the steward, approve one business definition or clearly governed variants, and then align quality rules and transformations to that decision. Fixing records or code before resolving meaning risks making the wrong process consistent.
- Source cleansing fails because the CRM values are profiled as valid, complete, and stable.
- Technical rewrite fails because there is no evidence of a failed or incorrect transformation implementation.
- Local rules fail because unmanaged competing definitions preserve the reporting inconsistency.
Question 39
Topic: Quality Rules Standards and Requirements
A bank is defining quality standards for the customer_mobile_number data element. Fraud alerts use the number within minutes of account changes, marketing campaigns use it monthly, and profiling shows 7% of active customer records have missing or invalid numbers. The customer data steward can approve standards, but remediation capacity is limited this quarter. Which quality decision best fits these facts?
Options:
A. Delay defining thresholds until all invalid numbers are corrected
B. Set one enterprise threshold of 100% validity for all uses
C. Set separate thresholds by use, with a stricter threshold for fraud alerts
D. Use the current 93% valid profile as the approved standard
Best answer: C
Explanation: Quality standards and thresholds define what level of quality is acceptable for a specific data element in a specific business context. Here, the same mobile number supports two uses with different risk and timing: near-real-time fraud alerts and monthly marketing. A stricter validity and timeliness threshold is justified for fraud alerts because the business impact of failure is higher. Marketing may tolerate a lower threshold while remediation is prioritized. The steward should approve the standards, and profiling evidence should inform but not automatically become the target. The key is to connect thresholds to fitness for purpose, business impact, and feasible remediation priorities.
- One universal target ignores different business uses and may create an unrealistic standard without prioritization.
- Waiting for cleanup confuses remediation work with the need to define approved quality expectations.
- Using the profile result treats current performance as acceptable without considering business risk or desired quality level.
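
A small sketch of use-specific thresholds, assuming Python. The observed validity follows from the 7% missing or invalid rate in the question. The two threshold values are hypothetical placeholders for whatever the steward would actually approve.

```python
from typing import Dict

# Observed validity from the profile in the question: 7% missing or invalid.
observed_validity = 1 - 0.07

# Hypothetical steward-approved minimums per business use (illustrative values only).
thresholds_by_use = {
    "fraud_alerts": 0.99,
    "marketing_campaigns": 0.90,
}


def evaluate(observed: float, thresholds: Dict[str, float]) -> Dict[str, bool]:
    """Return, for each use, whether observed quality meets that use's threshold."""
    return {use: observed >= minimum for use, minimum in thresholds.items()}


results = evaluate(observed_validity, thresholds_by_use)
for use, passed in results.items():
    print(f"{use}: {'meets' if passed else 'below'} threshold")
```

The same measurement passes for one use and fails for another, which is why a single profile result cannot serve as the approved standard.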
Question 40
Topic: Quality Rules Standards and Requirements
A customer analytics team says, “The monthly retention dashboard is not trustworthy because customer records are missing consent status and some customers appear twice.” The dashboard is used for marketing eligibility decisions. Which action best translates the concern into measurable quality expectations and ownership needs?
Options:
A. Facilitate rule definition with stewards and business owners
B. Run a one-time duplicate removal before the next dashboard refresh
C. Profile all source tables and publish the raw findings
D. Ask the reporting developer to relabel the dashboard as provisional
Best answer: A
Explanation: Stakeholder concerns should be converted into data quality requirements that are measurable and owned. Here, “missing consent status” points to a completeness expectation, and “customers appear twice” points to a uniqueness expectation. Because the dashboard supports marketing eligibility, business owners and data stewards should agree on the acceptable thresholds, exception handling, and remediation accountability. Profiling can provide evidence, but it does not by itself define fitness for purpose or assign responsibility. A sustainable response links rules, thresholds, scorecard measures, and issue ownership to the business use.
- One-time cleanup may reduce immediate duplicates, but it does not define the required quality level or prevent recurrence.
- Dashboard relabeling communicates uncertainty, but it does not establish measurable expectations or ownership.
- Raw profiling finds patterns and defects, but unapproved findings are not the same as governed quality requirements.
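
As a rough sketch of how the two concerns become measurable, the Python below computes a completeness rate for consent status and a uniqueness rate for the customer key. The field names and sample rows are hypothetical, and acceptable thresholds would still need to be agreed by business owners and stewards.

```python
# Hypothetical sample rows; field names are assumptions for illustration.
records = [
    {"customer_id": "C001", "consent_status": "opted_in"},
    {"customer_id": "C002", "consent_status": None},
    {"customer_id": "C003", "consent_status": "opted_out"},
    {"customer_id": "C001", "consent_status": "opted_in"},  # same customer appears twice
]


def completeness(rows, field):
    """Share of rows where the field is populated."""
    populated = sum(1 for r in rows if r.get(field) not in (None, ""))
    return populated / len(rows)


def uniqueness(rows, key):
    """Share of rows that carry a distinct key value."""
    return len({r[key] for r in rows}) / len(rows)


consent_completeness = completeness(records, "consent_status")
customer_uniqueness = uniqueness(records, "customer_id")
print(consent_completeness, customer_uniqueness)
```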
Question 41
Topic: Quality in Data Lifecycle and Operations
A bank buys a weekly external credit-sensitivity score used to prioritize retention offers. The latest feed is 99.8% complete and all scores pass the numeric range rule, but 22% of high-value customers changed bands since last week. The provider file contains only score and customer key; it does not state the original source, transformation logic, or data-as-of time, and campaign decisions are due tomorrow. Which quality risk should be treated as the primary concern?
Options:
A. Invalid score values in the provider feed
B. Unverifiable lineage, derivation, and currency of the score
C. Duplicate customer records from weak matching rules
D. Low completeness of the external customer population
Best answer: B
Explanation: External data can pass basic profiling checks and still be risky if its lineage, derivation, and currency are unclear. Here, completeness and value validity look acceptable, but the large band movement affects a time-sensitive business decision. Without the original source, transformation logic, and data-as-of time, the bank cannot tell whether the change reflects real customer behavior, a provider processing change, stale data, or an undocumented calculation. The quality concern is fitness for purpose, especially timeliness and traceability, not only technical validity. The appropriate response is to record and escalate the uncertainty as a quality risk before relying on the score for retention targeting.
- Completeness concern is less compelling because the feed is 99.8% complete and the visible problem is unexplained movement.
- Validity concern does not fit because all scores pass the numeric range rule.
- Duplicate matching is not supported by the facts; no duplicate profile or matching exception is described.
Question 42
Topic: Quality Strategy Business Case and Prioritization
A company’s executives say customer churn reports and product profitability reports are trusted by some departments but rejected by others. They want a 6-month initiative that shows measurable improvement in the data most tied to revenue decisions, not just a technical cleanup. Which initiative best fits this need?
Options:
A. Replace the reporting tool used by finance
B. Profile every table in the enterprise data warehouse
C. Create a business-driven data quality roadmap for critical data elements
D. Cleanse duplicate customer records as a one-time project
Best answer: C
Explanation: A quality strategy and roadmap should focus effort where better data creates measurable business value. For customer churn and product profitability, the strongest initiative is to identify critical data elements, define business-approved quality rules and thresholds, assign stewardship, baseline current quality, track scorecards, and prioritize remediation by business impact. This turns broad distrust into a managed improvement program with visible measures over the 6-month period. Technical activities such as profiling and cleansing can support the roadmap, but they should not drive it without business priorities, ownership, and measurable targets.
- Profiling everything may produce useful discovery results, but it does not by itself prioritize revenue-critical data or deliver measured improvement.
- One-time cleansing can fix visible defects temporarily, but it does not create sustained controls, ownership, or progress measures.
- Tool replacement may improve presentation, but it does not resolve inconsistent definitions, rules, or source-process defects.
Question 43
Topic: Data Quality Foundations and Business Fitness
A claims analytics team no longer trusts the monthly denial-rate report because provider identifiers are often missing after intake. Analysts have been filling blanks in the warehouse using previous-month values, but the same defect returns every cycle. Profiling shows that 82% of missing identifiers come from one intake form used by a regional operations team. What response best improves upstream behavior and downstream trust?
Options:
A. Create a new dashboard footnote describing the defect
B. Continue warehouse imputation with a documented assumption
C. Exclude regional records from the denial-rate report
D. Add a governed intake validation rule and monitor exceptions
Best answer: D
Explanation: Sustainable data quality management focuses on fitness for purpose and prevention, not only downstream repair. The defect is recurring and has a clear upstream concentration: one intake form used by a regional operations team. A governed validation rule at intake, with steward-approved thresholds and exception monitoring, changes the process that creates the defect. It also gives downstream users evidence that quality is being controlled, not merely hidden or patched. Warehouse imputation may be a temporary workaround, but it does not improve the source behavior that undermines trust.
- Warehouse imputation may reduce blanks temporarily, but it keeps the recurring upstream defect in place.
- Record exclusion can distort the denial-rate metric and does not correct the intake process.
- Dashboard footnotes improve transparency, but disclosure alone does not prevent future missing identifiers.
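
One way to picture an intake validation rule with exception monitoring is the Python sketch below. The record layout, the form names, and the rule name `provider_id_required` are assumptions made for illustration; a real control would live in the intake application and follow the steward-approved rule.

```python
from collections import Counter


def validate_intake(record: dict, exceptions: list) -> bool:
    """Reject a claim at intake when provider_id is missing, and log the exception."""
    provider_id = (record.get("provider_id") or "").strip()
    if not provider_id:
        exceptions.append(
            {
                "claim_id": record.get("claim_id"),
                "form": record.get("form"),
                "rule": "provider_id_required",  # hypothetical rule name
            }
        )
        return False
    return True


# Hypothetical intake batch.
intake = [
    {"claim_id": "CL-1", "form": "regional_form", "provider_id": ""},
    {"claim_id": "CL-2", "form": "regional_form", "provider_id": None},
    {"claim_id": "CL-3", "form": "standard_form", "provider_id": "PRV-100"},
    {"claim_id": "CL-4", "form": "regional_form", "provider_id": "PRV-200"},
]

exceptions = []
accepted = [r for r in intake if validate_intake(r, exceptions)]

# Monitoring view: exception counts by intake form.
exceptions_by_form = Counter(e["form"] for e in exceptions)
print(exceptions_by_form)
```

Counting exceptions by form keeps the upstream concentration visible, so the process owner can see whether the control is reducing recurrence.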
Question 44
Topic: Metadata Lineage and Quality Evidence
A customer data quality scorecard shows Customer status validity = 93%. Business users do not trust the result because they cannot tell which status values were allowed, which source systems were measured, when the check ran, or who can approve a rule change. What metadata would best make this quality result trusted and actionable?
Options:
A. Distinct counts, null percentages, and value frequency rankings
B. Server name, CPU usage, and job duration statistics
C. Approved rule context, lineage, run time, and steward accountability
D. Dashboard layout, chart type, and refresh animation settings
Best answer: C
Explanation: Trusted data quality results need metadata that explains what was measured, against which approved rule, over which data scope, at what time, and under whose stewardship. A validity score alone is not enough to support remediation or acceptance because users cannot judge whether the rule is authoritative, whether the right sources were included, or who can resolve disputes. Lineage and run context make the evidence reproducible; rule metadata and stewardship make it governable. Profiling statistics can help discovery, but they do not by themselves make a published quality result fit for decision making.
- Profiling-only evidence is useful for discovery, but distinct counts and frequencies do not show approved rule authority or accountability.
- Presentation metadata may improve readability, but it does not make the score auditable or actionable.
- Technical operations data can support job monitoring, but CPU and duration do not explain business validity or ownership.
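
The metadata described above could be carried alongside the score in a structure like the following Python sketch. All field names and sample values are hypothetical except the 93% score from the question. The `missing_context` helper is an assumed convenience for spotting a result that lacks rule, lineage, run-time, or stewardship context.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class QualityResult:
    measure: str
    score: float
    rule_id: str                  # approved rule the score was measured against
    allowed_values: tuple         # values the rule accepts
    source_systems: tuple         # lineage: which sources were measured
    run_at: Optional[datetime]    # when the check ran
    steward: str                  # who can approve a rule change

    def missing_context(self) -> list:
        """Names of fields that are empty, so the result cannot be fully traced."""
        return [
            f.name
            for f in fields(self)
            if getattr(self, f.name) in (None, "", ())
        ]


documented = QualityResult(
    measure="Customer status validity",
    score=0.93,
    rule_id="DQ-CUST-STATUS-01",                      # hypothetical identifier
    allowed_values=("ACTIVE", "INACTIVE", "SUSPENDED"),  # hypothetical values
    source_systems=("CRM",),                          # hypothetical source
    run_at=datetime(2024, 1, 31, 6, 0, tzinfo=timezone.utc),
    steward="customer-data-steward",
)

bare_score = QualityResult(
    measure="Customer status validity",
    score=0.93,
    rule_id="",
    allowed_values=(),
    source_systems=(),
    run_at=None,
    steward="",
)

print(documented.missing_context())
print(bare_score.missing_context())
```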
Question 45
Topic: Master Reference Integration and Warehouse Quality
A bank finds that customer segment codes are maintained separately in CRM, loan servicing, and the data warehouse. The same code value means different segments in different systems, causing conflicting regulatory and profitability reports. Which quality response best addresses the defect across the affected environment?
Options:
A. Correct the conflicting values in the warehouse tables
B. Govern a shared reference code set and update mappings
C. Run profiling weekly to count segment-code exceptions
D. Add report footnotes describing each system’s segment meaning
Best answer: B
Explanation: A reference data defect that affects multiple systems should be handled as a governed reference data quality issue, not as isolated report cleanup. The core problem is that the same code is not consistently defined and mapped across CRM, loan servicing, and the warehouse. A shared reference code set, clear ownership, approved definitions, controlled mappings, and remediation of affected data provide a sustainable response. This also supports consistent reporting because consuming systems use the same meaning or an explicit crosswalk. Warehouse correction may repair one reporting layer temporarily, but it does not prevent the operational systems or future integrations from reintroducing the conflict.
- Warehouse-only cleanup treats a symptom in one downstream store and leaves CRM and loan servicing meanings unresolved.
- Report footnotes disclose inconsistency but do not correct the reference data or improve fitness for purpose.
- Weekly profiling can detect recurrence, but measurement alone does not define or govern the shared code meanings.
Question 46
Topic: Quality Maturity and Continuous Improvement
A company has recurring customer data defects that affect order fulfillment. A central data quality team has been fixing errors after month-end, but the same defects keep returning. Business leaders want sales and operations teams to take more responsibility for preventing defects during daily work. Which adoption approach best fits this goal?
Options:
A. Assign business stewards with quality rules, scorecards, and review routines
B. Require all users to complete one-time data quality tool training
C. Have the central quality team cleanse failed records more frequently
D. Publish detailed profiling reports for analysts to inspect monthly
Best answer: A
Explanation: Sustainable data quality adoption depends on making quality part of business work, not only a central cleanup activity. Business teams should own the meaning of critical data, approve quality rules and thresholds, monitor scorecards, and act on defects through regular operating routines. This connects data quality to fitness for purpose and business outcomes such as order fulfillment. A central team can facilitate methods, tooling, and analysis, but recurring defects usually require source-process ownership and prevention by the teams that create or use the data. The key is an operating model that combines stewardship accountability, measurement, issue management, and continuous improvement.
- More cleansing treats symptoms faster but does not shift ownership or prevent the same defects from recurring.
- Monthly profiling reports can reveal issues, but passive reporting does not create daily accountability or remediation routines.
- One-time training may build awareness, but it does not establish ownership, metrics, thresholds, or sustained improvement behavior.
Question 47
Topic: Quality Rules Standards and Requirements
A multinational company feeds regional customer onboarding data into one enterprise revenue dashboard. Profiling shows that core reporting attributes, such as customer identifier, country code, and legal entity status, are common across regions, while several optional local attributes vary by regulation and sales process maturity. Executives need comparable quality expectations for quarterly reporting, but regional stewards must retain flexibility for local requirements. Which standard is the BEST quality decision?
Options:
A. Require one identical rule set for all regional and local attributes
B. Set enterprise minimum thresholds for shared attributes and allow documented local extensions
C. Apply central cleansing after integration without source-level standards
D. Let each region define its own thresholds and note differences in reports
Best answer: B
Explanation: A fit-for-purpose quality standard should separate enterprise comparability from legitimate local variation. For shared reporting attributes, enterprise minimum thresholds and standard definitions support consistent measurement, scorecards, and governance decisions across regions. Local stewards can then define additional or stricter rules for region-specific attributes, provided those extensions are documented and aligned with the enterprise standard. This balances standardization with flexibility and avoids treating every local requirement as an enterprise mandate. The key is to standardize what must be comparable and govern variation where business context justifies it.
- Uniform rule set overreaches because optional local attributes vary by regulation and process maturity.
- Regional-only thresholds weaken enterprise comparability for quarterly reporting.
- Downstream cleansing may improve outputs temporarily but does not establish sustainable source quality expectations.
Question 48
Topic: Specialist Cross-Discipline Judgment
A city uses a risk score to prioritize rental-housing inspections. The score influences which buildings are inspected first, and inspection capacity is limited. Profiling shows building_age is missing for 38% of records from a new registration portal, mostly in lower-income districts. The model treats missing building_age as low risk, and no steward-approved quality rule exists for this field by district. What is the BEST quality decision?
Options:
A. Escalate rule ownership, measure district-level impact, and apply an interim control.
B. Exclude affected districts until every missing value is corrected.
C. Approve use because overall field completeness exceeds 90%.
D. Replace missing values with the citywide median building age.
Best answer: A
Explanation: Quality defects can create ethical harm when they affect groups unevenly and influence consequential decisions. Here, missing building_age is not just a completeness issue. It is concentrated in lower-income districts and is interpreted by the model as low risk, which could systematically delay inspections where tenant safety may be at stake. The best response combines data quality governance and harm reduction: assign stewardship for an approved rule, measure results by relevant groups or districts, investigate the source-process defect in the portal, and use an interim control such as manual review or conservative handling of missing values. Overall completeness is insufficient when subgroup impact is material.
- Overall completeness hides concentration of missing values in affected districts and does not test fitness for this inspection use.
- District exclusion would likely worsen service inequity by removing the very areas at risk from prioritization.
- Median imputation may mask the portal defect and can preserve biased prioritization if missingness is not random.
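
To illustrate why an overall rate can hide subgroup impact, the Python sketch below compares the overall missing rate with the rate per district. The district names and row counts are invented for the example and are not the figures from the question.

```python
from collections import defaultdict


def missing_rate_by_group(rows, group_field, value_field):
    """Missing-value rate for value_field within each group."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        totals[row[group_field]] += 1
        if row.get(value_field) is None:
            missing[row[group_field]] += 1
    return {group: missing[group] / totals[group] for group in totals}


# Hypothetical records: missing values are concentrated in one district.
rows = (
    [{"district": "North", "building_age": 40}] * 10
    + [{"district": "South", "building_age": None}] * 6
    + [{"district": "South", "building_age": 55}] * 4
)

overall_missing = sum(1 for r in rows if r["building_age"] is None) / len(rows)
by_district = missing_rate_by_group(rows, "district", "building_age")

print(overall_missing)
print(by_district)
```

In this made-up sample the overall rate is half the rate in the affected district, which is the pattern a district-level measure is meant to expose.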
Question 49
Topic: Quality Maturity and Continuous Improvement
A data quality assessment finds that a customer analytics team cleans duplicate and incomplete records before each monthly report. The team has no named data stewards, quality rules are stored in individual spreadsheets, scorecards are not produced, and recurring defects are not traced back to source processes. Business leaders want the next improvement to build sustainable capability, not just improve the next report. What should be prioritized first?
Options:
A. Publish a dashboard of current defect counts
B. Increase manual cleansing before each monthly report
C. Buy a profiling tool for the analytics team
D. Assign stewardship and establish governed quality rules
Best answer: D
Explanation: Maturity improvement should address the weakest foundational capability that prevents sustainable data quality management. In this case, the organization is operating reactively: defects are cleaned downstream, rules are informal, no one owns rule decisions, and recurring causes are not addressed. Establishing named stewardship and governed quality rules creates accountability for defining fitness-for-purpose criteria, approving thresholds, coordinating remediation, and escalating source-process issues. Once those foundations exist, scorecards can report against agreed rules and remediation can be managed as a repeatable process.
Manual cleansing may help a single report, and tooling or dashboards can support later stages, but they do not by themselves create ownership, standards, or prevention.
- More cleansing stays reactive and does not prevent the same defects from recurring.
- Defect dashboards can expose problems, but counts without governed rules and owners do not create sustained control.
- Profiling tools help discover patterns, but tooling is premature when stewardship and rule governance are absent.
Question 50
Topic: Quality Governance Roles and Stewardship
A customer data domain has an automated rule that rejects records when country_code is not in the approved reference list. A new sales channel begins sending XK for Kosovo. The integration team can add the value technically, but the sales, compliance, and reporting teams disagree about whether and how it should be represented. What action best fits the data steward’s responsibility?
Options:
A. Ask developers to hard-code XK as a valid value
B. Disable the validation rule until all teams agree
C. Facilitate a business decision on the approved code and rule change
D. Patch rejected records directly in the target database
Best answer: C
Explanation: Data stewardship is responsible for business-facing quality judgment: clarifying meaning, coordinating rule ownership, resolving cross-functional conflicts, and ensuring approved changes are reflected in standards, reference data, and controls. Automated validation executes an agreed rule; it does not decide whether a new code is semantically acceptable or compliant. Technical teams can implement the change after the stewarded decision is made, but the decision itself requires accountable business approval and documentation.
- Disabling validation removes a control and allows unmanaged defects rather than resolving the meaning and approval of the code.
- Patching target records treats symptoms downstream and bypasses the governed reference-data decision.
- Hard-coding the value makes a technical correction without confirming business definition, compliance impact, or rule ownership.
Questions 51-75
Question 51
Topic: Root Cause Analysis and Remediation
A data quality team is investigating why the monthly customer profitability dashboard shows 14% of active customers with an unknown segment. Profiling shows the CRM source has segment populated for 99.7% of active customers. Lineage shows the data warehouse load rejects segment codes added during a recent marketing reclassification because the validation lookup still uses the prior code set. The dashboard is used for executive pricing decisions next week. Which remediation is the BEST quality decision?
Options:
A. Suppress unknown-segment rows from the dashboard
B. Open a governance issue to redefine customer profitability
C. Ask CRM users to re-enter missing customer segments
D. Update the governed segment code set used by the warehouse validation
Best answer: D
Explanation: Root-cause reasoning should follow the evidence from the defect back through profiling and lineage. The source CRM is nearly complete, so the main issue is not missing source entry. The warehouse rejects newly valid segment codes because its validation lookup was not updated after the marketing reclassification. Remediation should therefore target the reference metadata or validation rule used by the load, with proper stewardship control so future code changes are synchronized. Masking the symptom in reporting would reduce visibility, and redefining the profitability metric does not address the failed segment classification.
- Source re-entry fails because profiling shows CRM already contains the segment for almost all active customers.
- Dashboard suppression hides the business impact but does not restore valid classifications or prevent recurrence.
- Metric redefinition is not supported by the evidence; the defect is in segment code validation, not profitability logic.
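
The failure mode can be reproduced with a toy lookup, sketched here in Python. The segment codes are hypothetical; the point is only that valid source values fall to "UNKNOWN" when the load validates against the prior code set.

```python
# Hypothetical code sets; the real codes are not given in the question.
prior_codes = {"RETAIL", "SMB", "CORP"}
current_codes = prior_codes | {"SMB_PLUS", "ENT"}  # codes added by the reclassification


def classify(segment, valid_codes):
    """Mimic a load step that maps unrecognized segment codes to UNKNOWN."""
    return segment if segment in valid_codes else "UNKNOWN"


# Source values are populated and valid under the current code set.
source_segments = ["RETAIL", "SMB_PLUS", "ENT", "CORP", "SMB"]

with_stale_lookup = [classify(s, prior_codes) for s in source_segments]
with_updated_lookup = [classify(s, current_codes) for s in source_segments]

print(with_stale_lookup.count("UNKNOWN"), with_updated_lookup.count("UNKNOWN"))
```

Nothing changes in the source data between the two runs. Only the governed code set used by the validation differs.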
Question 52
Topic: Quality Strategy Business Case and Prioritization
A data quality analyst is triaging defects found during monthly customer billing reconciliation. The business sponsor has asked the team to address the defect that creates the most material business risk first.
| Defect | Profile result | Business context |
|---|---|---|
| Missing apartment number | 8,400 records | Paper statements may be delayed; e-statements unaffected |
| Invalid marketing consent code | 1,200 records | Affects campaign segmentation only |
| Duplicate customer record | 430 records | May cause duplicate invoices and incorrect credit exposure |
| Blank preferred language | 6,900 records | Default language is used on service emails |
Which defect should be prioritized first?
Options:
A. Invalid marketing consent code
B. Blank preferred language
C. Duplicate customer record
D. Missing apartment number
Best answer: C
Explanation: Risk-based data quality triage considers business impact, likelihood, urgency, and control exposure, not only defect volume. The duplicate customer records affect billing and credit exposure, so the defect can create direct financial loss, customer harm, and downstream reconciliation issues. A smaller count can outrank larger populations when the consequences are more material. The other defects may still need remediation, but their stated impacts are operational delays, campaign quality, or default communication behavior rather than direct invoicing and credit-risk errors.
The key takeaway is to prioritize defects by fitness-for-purpose risk to the business process, not by the largest profile count alone.
- Largest count is tempting, but missing apartment numbers mainly delay paper statements while e-statements are unaffected.
- Segmentation impact matters, but the consent-code issue is described as limited to campaign segmentation.
- Default handling reduces urgency because blank preferred language still results in service emails being sent.
Question 53
Topic: Quality Maturity and Continuous Improvement
A bank’s customer onboarding domain has mature data quality practices: approved rules, active monitoring, and timely remediation. However, customer identifiers, contact preferences, and risk classifications are reused by lending, marketing, and compliance teams, where definitions and thresholds differ. A maturity assessment rates onboarding highly but finds recurring enterprise reporting defects in the shared customer data.
Which action best addresses the maturity gap?
Options:
A. Establish enterprise stewardship for shared customer data standards
B. Run additional profiling within the onboarding domain
C. Raise onboarding’s quality thresholds for all customer fields
D. Assign each consuming team to maintain local customer rules
Best answer: A
Explanation: A maturity gap can exist when one domain has strong local quality controls but shared enterprise data remains weak because definitions, rules, thresholds, and accountability are not consistent across consumers. The appropriate response is to extend quality management to the shared data context: define enterprise-level standards, assign stewardship, align rules to business use, and monitor outcomes across domains. This does not mean weakening the strong domain’s practices; it means adding governance and coordination where reuse creates enterprise risk.
Local profiling or stricter source thresholds may help detect defects, but they do not resolve conflicting definitions or fragmented ownership across lending, marketing, and compliance.
- Higher local thresholds miss the cross-domain definition and accountability problem affecting shared customer data.
- More source profiling may reveal issues, but the stem already shows recurring enterprise defects and inconsistent rules.
- Local consumer rules preserve fragmentation and can increase inconsistency across shared enterprise data.
Question 54
Topic: Profiling Discovery and Assessment
A data quality analyst is assessing customer address records after delivery failures increased. Initial checks show postal_code, city, and state are populated and each field usually matches its allowed format. The business suspects that some addresses combine valid values in invalid ways, such as a valid postal code paired with the wrong city. Which profiling technique best fits this concern?
Options:
A. Cross-column relationship profiling using postal reference data
B. Duplicate detection across customer records
C. Single-column pattern analysis of postal_code
D. Completeness profiling for address fields
Best answer: A
Explanation: The concern is not whether each address field is present or formatted correctly; it is whether the fields fit together as a valid business fact. Cross-column relationship profiling examines dependencies between fields, such as whether a postal code is valid for a specific city and state. Using authoritative postal reference data strengthens the profile because it checks the observed combinations against an accepted code set or relationship, not just against patterns seen in the file. Single-column techniques can show formats, lengths, null rates, or value frequencies, but they miss defects where each value is valid in isolation. The key distinction is profiling the relationship among values, not profiling each value alone.
- Pattern-only checking can confirm the postal code shape but cannot prove that it belongs with the city and state.
- Completeness checking addresses missing values, while the scenario says the fields are generally populated.
- Duplicate detection finds repeated customer entities, not invalid address component combinations.
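As a rough illustration of the cross-column check described above, the sketch below compares each record's postal code, city, and state against a reference set. The names `POSTAL_REFERENCE` and `find_invalid_combinations` and the sample rows are hypothetical; a real profile would load the triples from an authoritative postal reference source.

```python
# Hypothetical reference triples (postal_code, city, state), for illustration only.
POSTAL_REFERENCE = {
    ("10001", "New York", "NY"),
    ("60601", "Chicago", "IL"),
}


def find_invalid_combinations(records, reference):
    """Return records whose (postal_code, city, state) triple is not in the reference."""
    return [
        r for r in records
        if (r["postal_code"], r["city"], r["state"]) not in reference
    ]


records = [
    {"id": 1, "postal_code": "10001", "city": "New York", "state": "NY"},
    # Each value is valid on its own, but the combination is not in the reference.
    {"id": 2, "postal_code": "10001", "city": "Chicago", "state": "IL"},
]

invalid = find_invalid_combinations(records, POSTAL_REFERENCE)
print([r["id"] for r in invalid])
```

A single-column pattern check would pass both sample rows, which is why the relationship check is the better fit for this scenario.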
Question 55
Topic: Specialist Cross-Discipline Judgment
A monthly customer quality scorecard shows that 18% of new customer records are missing a required tax classification. Operations has been manually filling the field after each load, but the defect recurs because sales teams disagree on when the field is mandatory. Which governance action would most directly enable sustainable improvement?
Options:
A. Add another downstream cleansing step before reporting
B. Have the data owner approve a shared rule and assign stewardship accountability
C. Ask operations to document each manual correction
D. Increase the scorecard frequency from monthly to daily
Best answer: B
Explanation: Recurring defects usually require governance, not only more cleanup. In this case, the field is missing because business areas do not agree on when it is required. A data owner should approve the authoritative quality rule, and stewards should be accountable for applying, monitoring, and escalating it. That creates a shared definition of fitness for purpose and supports prevention in the source process.
More frequent measurement may reveal the defect sooner, and documentation may help analysis, but neither resolves disagreement about the rule. Downstream cleansing treats symptoms and can hide process weakness.
- More frequent scorecards improve visibility but do not establish the mandatory-field rule or accountability.
- Downstream cleansing repairs records after the fact and leaves the source-process cause unchanged.
- Correction logs can support issue analysis, but manual documentation alone does not prevent recurrence.
Question 56
Topic: Master Reference Integration and Warehouse Quality
A telecommunications company’s churn dashboard shows a sudden 18% increase in inactive business customers. The CRM and billing source profiles show valid status codes, expected completeness, and no abnormal duplicate account identifiers. Lineage review shows the increase starts when the integration layer maps CRM SUSPENDED and billing ON_HOLD to warehouse status Inactive, while the approved glossary defines inactive as “no billable service for 90 days.” What is the best quality decision?
Options:
A. Treat it as an integration mapping quality issue
B. Cleanse inactive statuses directly in the warehouse
C. Open source defects against CRM and billing
D. Ask BI developers to hide suspended accounts
Best answer: A
Explanation: The strongest decision is to classify the defect where the evidence shows it was introduced: the integration layer. The sources pass profiling checks for validity, completeness, and uniqueness, so there is no visible source quality defect. The dashboard is consuming warehouse data according to its design, so the issue is not primarily a reporting layout or filter problem. The mismatch is semantic consistency: the integration crosswalk maps operational statuses to a warehouse value that conflicts with the governed business definition. Remediation should focus on correcting the mapping rule, confirming ownership with the integration steward or data owner, and adding monitoring so the same semantic defect does not recur.
- Source correction fails because the source values are valid and complete under their own approved code sets.
- Report masking fails because hiding records would obscure the defect instead of fixing the governed transformation rule.
- Warehouse cleansing fails because downstream cleanup treats symptoms and leaves the integration rule producing bad values.
Question 57
Topic: Master Reference Integration and Warehouse Quality
A manufacturer is consolidating customer master data into a golden record. Matching has already confirmed that the two active records below refer to the same legal customer.
| Attribute | Record 118 | Record 742 |
|---|---|---|
| Tax ID | 98-7654321 | 98-7654321 |
| Credit status | On hold | Active |
| Last update reason | Credit review | Address change |
| Source | Credit system | CRM |
The current merge rule keeps the most recently updated value for every attribute. Which quality risk is the most direct concern?
Options:
A. Weak survivorship rules
B. Uncontrolled code sets
C. Outdated hierarchies
D. Unresolved duplicate detection
Best answer: A
Explanation: Survivorship rules determine which source value becomes the trusted value when matched master records contain conflicting attributes. Here, the records have already been matched as the same customer, so the main risk is not duplicate detection. The dangerous rule is using the most recent update for every attribute, even when the update reason is unrelated to the attribute being selected. A CRM address change should not necessarily override a credit-system status after a credit review. A stronger approach would define attribute-level source precedence, business ownership, and exception handling for conflicts. The key distinction is that matching identifies the same party; survivorship determines the correct golden-record values.
- Code-set control is not the issue because both credit statuses appear to be recognizable values, not unmanaged local codes.
- Hierarchy currency is unrelated because no parent-child, territory, or reporting hierarchy is shown.
- Duplicate detection has already succeeded because matching confirmed both records represent the same legal customer.
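One way to picture attribute-level source precedence is the sketch below. It assumes, for illustration only, that the credit system is the trusted source for credit status; `SOURCE_PRECEDENCE` and `survive` are hypothetical names, and an actual rule set would be approved by the business owner.

```python
# Hypothetical precedence table: trusted source order per attribute.
SOURCE_PRECEDENCE = {
    "credit_status": ["Credit system", "CRM"],
}


def survive(attribute, candidates, precedence):
    """Pick the value from the highest-precedence source.

    candidates is a list of (source, value) pairs from matched records.
    Returns None when no rule covers the attribute or no listed source
    supplied a value, so the conflict can go to exception handling.
    """
    order = precedence.get(attribute)
    if order is None:
        return None
    by_source = dict(candidates)
    for source in order:
        if source in by_source:
            return by_source[source]
    return None


candidates = [("Credit system", "On hold"), ("CRM", "Active")]
golden_credit_status = survive("credit_status", candidates, SOURCE_PRECEDENCE)
print(golden_credit_status)
```

Under this assumed rule the golden record keeps the credit-system value even though the CRM record may carry a later update timestamp.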
Question 58
Topic: Quality in Data Lifecycle and Operations
A retailer is evaluating a third-party household-income dataset to decide which customers should receive premium loyalty offers. The file passes schema checks and has few nulls, but the business risk is making offers to the wrong income segment. Which assessment or control best validates the data for this use?
Options:
A. Verify that the vendor encrypts the delivery file in transit
B. Count the number of records delivered in each monthly file
C. Benchmark income segments against an approved internal sample and acceptance threshold
D. Confirm that all required fields use the agreed data types
Best answer: C
Explanation: External data quality should be assessed against the intended business use and risk. In this case, the key concern is whether household-income segments support accurate offer targeting. A benchmark against an approved internal sample, trusted reference, or known outcome set, combined with an agreed acceptance threshold, directly tests fitness for purpose. Schema validity, secure transfer, and record counts are useful controls, but they do not show whether the income classification is accurate enough for premium-offer decisions. The strongest validation connects the external attribute to the business decision it will drive.
- Format-only checking can show valid data types, but it does not prove the income segment is correct for targeting.
- Security control protects the delivery process, but it does not assess the business accuracy of the supplied values.
- Volume monitoring may detect missing files or major delivery issues, but it does not validate customer-level income classification.
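The benchmark idea can be sketched as an agreement rate between the vendor's segments and an approved internal sample, compared with an acceptance threshold. The function name, the sample data, and the 0.90 threshold are assumptions for illustration; the real threshold would be agreed with the business.

```python
def agreement_rate(vendor_segments, benchmark_segments):
    """Share of benchmark customers whose vendor segment matches the benchmark."""
    shared = [cid for cid in benchmark_segments if cid in vendor_segments]
    if not shared:
        raise ValueError("no overlapping customers to compare")
    matches = sum(vendor_segments[c] == benchmark_segments[c] for c in shared)
    return matches / len(shared)


ACCEPTANCE_THRESHOLD = 0.90  # hypothetical value, set by the business owner

vendor = {"c1": "high", "c2": "mid", "c3": "low", "c4": "high"}
benchmark = {"c1": "high", "c2": "mid", "c3": "mid", "c4": "high"}

rate = agreement_rate(vendor, benchmark)
accepted = rate >= ACCEPTANCE_THRESHOLD
print(rate, accepted)
```

In this made-up sample three of four customers agree, so the file would fail the assumed threshold even though it passes schema and null checks.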
Question 59
Topic: Quality Governance Roles and Stewardship
A bank finds recurring defects in the customer_risk_rating field used for regulatory reporting. Database administrators maintain the platform, branch staff enter updates, and compliance analysts report defects when reports fail review. A new quality rule and acceptance threshold must be approved, and recurring breaches must be escalated for remediation. Which role should be accountable for this quality ownership?
Options:
A. Compliance analyst consuming the report
B. Database administrator maintaining the table
C. Branch employee entering the updates
D. Business data owner or delegated data steward
Best answer: D
Explanation: Data quality accountability is not the same as technical custody, data capture, or consumer feedback. In DAMA-aligned practice, the business data owner, often supported by a delegated data steward, is accountable for defining acceptable quality, approving rules and thresholds, prioritizing remediation, and escalating persistent issues. Technical teams may implement controls and maintain systems, data entry staff may follow procedures, and report consumers may identify defects. Those roles contribute evidence or execution, but they do not own the quality standard for a business-critical data element. The key distinction is accountability for fitness for purpose and sustained governance decisions.
- Technical custody supports storage, access, and implementation, but it does not decide the business quality threshold.
- Data entry responsibility affects defect prevention, but entering values is not the same as owning the rule.
- Report-consumer feedback helps detect issues, but consumers usually do not approve enterprise quality standards.
Question 60
Topic: Data Quality Foundations and Business Fitness
A customer service team configures its CRM to reject lowercase state codes and to auto-capitalize contact names. The team asks the data quality council to treat both settings as enterprise data quality principles. Which response best distinguishes a data quality principle from a local cleanup preference or tool validation feature?
Options:
A. Define principles by the controls available in the current CRM.
B. Adopt any validation feature that prevents invalid field entry.
C. Treat local formatting preferences as principles when users prefer them.
D. Base principles on governed fitness-for-purpose expectations across data use.
Best answer: D
Explanation: A data quality principle is a durable business and governance position about what makes data fit for purpose. It should be tied to agreed definitions, business impact, stewardship, and consistent use across relevant processes. A CRM validation feature can help enforce a rule, but the feature does not make the rule a principle. A local cleanup preference, such as auto-capitalizing names, may be cosmetic and can even reduce accuracy for some real names. Rejecting lowercase state codes may be useful if it supports an approved reference data standard, but the principle is the governed standard and business need, not the CRM setting itself. The key distinction is business-governed fitness for purpose versus local convenience or tool capability.
- Feature-first thinking fails because tool controls can enforce rules, but they do not define enterprise quality principles.
- Preference as principle fails because user preference alone does not establish business fitness, stewardship, or cross-process consistency.
- CRM-defined quality fails because available controls are implementation details, not the basis for quality governance.
Question 61
Topic: Quality Strategy Business Case and Prioritization
A data quality team proposes funding for a customer data improvement initiative. The proposal emphasizes duplicate rates, invalid email formats, and the number of failed validation rules. Sales leaders resist, saying the issue is “data hygiene” and not a priority compared with improving renewal rates. What adjustment would most directly improve the quality value proposition?
Options:
A. Translate defects into renewal-risk impact and sales process costs
B. Add more profiling statistics for every customer data field
C. Ask governance to mandate participation before revising the case
D. Lower the proposed quality thresholds to reduce resistance
Best answer: A
Explanation: A quality value proposition is strongest when it expresses data quality in terms of fitness for purpose and stakeholder value. Here, sales leaders are resisting because the proposal is framed as technical hygiene rather than as a business problem affecting renewals, productivity, customer contactability, or forecast confidence. The better adjustment is to convert defects into business consequences: lost renewal opportunities, time spent manually correcting records, duplicate account handling, or failed outreach. Profiling evidence still matters, but it supports the business case rather than replacing it. A mandate may be needed later for accountability, but it does not by itself create stakeholder buy-in.
- More profiling may provide evidence, but it does not address why sales leaders should care.
- Lowering thresholds weakens the quality target without proving business value.
- Governance mandate can enforce participation, but it does not repair a poorly framed value proposition.
Question 62
Topic: Quality Maturity and Continuous Improvement
A data governance council reviews a data quality maturity profile for customer data. The scale is 1 = reactive and 5 = optimized.
| Capability | Rating | Evidence |
|---|---|---|
| Profiling coverage | 4 | Monthly source profiles |
| Cleansing execution | 4 | Backlog reduced after each batch fix |
| Rule ownership | 2 | Rules maintained by IT analysts |
| Issue management | 1 | No root-cause tracking or owner follow-up |
| Scorecard use | 2 | Counts defects, not prevention progress |
Duplicate and invalid customer records return within two months after each cleansing cycle. Which quality capability should be strengthened next?
Options:
A. More frequent one-time cleansing of customer records
B. Additional profiling reports for each source system
C. A refreshed visual scorecard using the same measures
D. Quality issue management with root-cause remediation ownership
Best answer: D
Explanation: A maturity profile should guide the next improvement toward the weakest capability that limits sustainable quality. Here, profiling and cleansing are already relatively mature, but defects recur because issue management and ownership are immature. DAMA-aligned quality improvement emphasizes preventing recurrence through root-cause analysis, accountable stewardship, remediation tracking, and governance follow-up. Strengthening issue management connects detected defects to process fixes and responsible data owners or stewards. Better dashboards or more frequent cleanup may improve visibility or short-term appearance, but they do not address why invalid and duplicate records keep reappearing.
- More cleansing treats symptoms and may reduce the backlog temporarily, but it does not prevent the same defects from returning.
- More profiling adds detection capacity, but the profile already shows strong profiling and weak follow-through.
- Refreshed scorecards may improve presentation, but using the same measures leaves prevention and accountability unresolved.
Question 63
Topic: Quality Governance Roles and Stewardship
A customer onboarding team sees the same address validation exceptions every month. Analysts correct the rejected records before reporting, but the defect returns because sales teams use different entry practices. Which stewardship practice best creates durable prevention, documented decisions, and accountable follow-up?
Options:
A. Run monthly profiling and publish the exception counts
B. Convene a stewardship review to approve rule changes and assign remediation owners
C. Cleanse the rejected addresses before each reporting cycle
D. Lower the validation threshold to reduce exception volume
Best answer: B
Explanation: Recurring quality exceptions usually require stewardship, not just measurement or cleanup. In DAMA-aligned quality practice, stewards help convert repeated defects into governed action: clarify the business rule, document the decision, assign accountable owners, track remediation, and monitor whether the source-process change prevents recurrence. Here, the visible root cause is inconsistent sales entry practice, so durable improvement depends on an approved rule and accountable process follow-up, not only correcting downstream records. The key distinction is prevention through governed stewardship versus repeated detection or repair.
- Profiling only identifies and trends exceptions, but it does not by itself assign ownership or change the entry process.
- Repeated cleansing fixes reported data temporarily while leaving the recurring source-process defect in place.
- Lowering thresholds may hide exceptions and weaken fitness for purpose unless a governed business decision supports the change.
Question 64
Topic: Metadata Lineage and Quality Evidence
A data quality scorecard flags many exceptions for the metric active customer count used in executive reporting. Profiling shows customer status codes are valid, complete, and loaded correctly from source systems. Lineage shows Sales counts customers with any quote in the last 12 months, while Finance counts only customers with an invoiced transaction in the last 12 months. The metric is business-critical and must remain auditable. What is the best quality decision?
Options:
A. Resolve and approve a shared glossary definition for the metric
B. Lower the scorecard threshold until exceptions are manageable
C. Ask the integration team to reload the source status codes
D. Cleanse customer records that fail the current scorecard rule
Best answer: A
Explanation: When profiling confirms values are valid and complete, repeated exceptions may reflect a definition conflict rather than poor data values. Here, Sales and Finance are applying different meanings to the same metric, so the quality response should focus on metadata and governance: agree the business definition, record it in the glossary, align lineage and quality rules to that approved meaning, and make the scorecard auditable. Cleansing or reloading records would not remove the underlying disagreement about what should be counted. The durable fix is to standardize the meaning before measuring conformance.
- Record cleansing treats valid source data as defective, even though profiling shows the values are not the cause.
- Threshold lowering hides the symptom and weakens trust without resolving the business definition conflict.
- Source reload targets a technical load issue, but the lineage evidence points to different metric semantics.
Question 65
Topic: Root Cause Analysis and Remediation
A retailer’s customer analytics team cleans duplicate customer records every month before producing churn reports. Profiling shows the duplicate rate returns to about 8% after each cleanup, mainly for accounts created through the mobile sign-up process. The churn model and campaign suppression rules both use this customer file, and the data stewardship group has limited capacity for manual review. Which action is the best quality decision?
Options:
A. Increase monthly manual deduplication before churn reporting
B. Analyze and correct the mobile sign-up creation process
C. Exclude mobile sign-up records from churn reporting
D. Publish a duplicate-rate scorecard without remediation
Best answer: B
Explanation: Sustainable data quality improvement focuses on root-cause analysis and remediation, not repeated downstream correction. The evidence shows a recurring uniqueness defect that reappears after each cleanup and is concentrated in one source process: mobile sign-up. Because the same customer file supports multiple downstream uses, repeated deduplication treats only the symptom and consumes scarce stewardship capacity. The better response is to investigate how mobile records are created, identify the process or rule gap, and implement a preventive control at the point of capture or integration. Monitoring can then confirm whether the duplicate rate stays reduced. Downstream cleanup may still be needed temporarily, but it should not be the main long-term control.
- More cleanup leaves the mobile sign-up defect in place, so the duplicate rate is likely to recur.
- Excluding records avoids some bad data but harms completeness and does not address the cause.
- Scorecard only improves visibility, but measurement without remediation will not prevent the recurring defect.
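Before correcting a source process, it helps to confirm where duplicates concentrate. The sketch below computes a duplicate rate per creation channel. The `match_key` field (for example, a normalized email) and the sample records are hypothetical.

```python
from collections import Counter


def duplicate_rate_by_channel(records):
    """Per channel, the share of records whose match key occurs more than once overall."""
    key_counts = Counter(r["match_key"] for r in records)
    totals, dupes = Counter(), Counter()
    for r in records:
        totals[r["channel"]] += 1
        if key_counts[r["match_key"]] > 1:
            dupes[r["channel"]] += 1
    return {channel: dupes[channel] / totals[channel] for channel in totals}


records = [
    {"channel": "mobile", "match_key": "a"},
    {"channel": "mobile", "match_key": "a"},
    {"channel": "mobile", "match_key": "b"},
    {"channel": "mobile", "match_key": "c"},
    {"channel": "web", "match_key": "d"},
    {"channel": "web", "match_key": "e"},
]

rates = duplicate_rate_by_channel(records)
print(rates)
```

A breakdown like this supports the root-cause investigation; it does not replace the preventive control at the point of capture.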
Question 66
Topic: Quality Rules Standards and Requirements
A customer onboarding process is creating recurring data quality defects. Profiling shows 8% of new records contain invalid country_code values, causing tax calculations to fail in downstream billing. The approved reference list is owned by a data steward and changes monthly. The remediation team has limited capacity, and leadership wants to stop new defects before they enter the pipeline. Which control type best addresses this situation?
Options:
A. Detective scorecard reporting invalid codes weekly
B. Preventive validation against the governed reference list
C. Corrective cleansing of invalid codes after billing failure
D. Manual review of tax calculation exceptions
Best answer: B
Explanation: The risk is recurring invalid reference data entering at the source, and the business goal is to prevent new defects rather than expand cleanup. A preventive control, such as source-system validation against the governed country_code list, addresses the defect at the point of creation. Because the list changes monthly and has a steward owner, the control should use the governed reference source rather than a locally maintained copy. Detective controls are useful for monitoring, and corrective controls are needed for existing bad records, but neither is the primary response when leadership wants to stop the defect before it enters the pipeline.
- Weekly reporting detects trends, but it allows invalid values to continue entering the process.
- Post-failure cleansing fixes symptoms after billing is affected and increases load on a constrained remediation team.
- Manual exception review may handle urgent cases, but it does not prevent invalid codes at capture.
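A preventive control of this kind might look like the sketch below, which rejects a record at capture when its country_code is not in the governed list. `load_governed_country_codes` is a hypothetical stand-in for reading the steward-owned reference source at validation time, so that monthly changes are picked up without a local copy.

```python
def load_governed_country_codes():
    """Stand-in for reading the steward-owned reference list (hypothetical subset)."""
    return {"US", "CA", "GB"}


def validate_new_record(record, load_codes=load_governed_country_codes):
    """Reject a record at capture if its country_code is not in the governed list."""
    codes = load_codes()
    code = record.get("country_code")
    if code not in codes:
        raise ValueError(f"invalid country_code: {code!r}")
    return record


validated = validate_new_record({"customer_id": 1, "country_code": "CA"})
print(validated)
```

Passing the loader as a parameter keeps the rule tied to the governed source rather than to values hard-coded in the onboarding application.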
Question 67
Topic: Quality Governance Roles and Stewardship
A bank has named data stewards for customer data, and defect logs show recurring duplicate records and invalid status codes. The stewards review scorecards but cannot require source-system changes, business managers rarely attend triage meetings, and stewardship tasks are treated as extra work outside normal duties. What is the most appropriate action to make issue resolution sustainable?
Options:
A. Assign IT to cleanse the warehouse records after each load
B. Give stewards a more detailed profiling report for each recurring defect
C. Publish scorecards that rank stewards by unresolved issue count
D. Escalate to governance leaders to formalize authority, participation, and steward capacity
Best answer: D
Explanation: Assigning stewards is not enough if they lack the authority, business participation, or time needed to resolve root causes. In DAMA-aligned quality governance, stewardship must be supported by decision rights, accountable data owners, issue escalation paths, and practical capacity to act. Recurring duplicates and invalid codes point to source-process or reference/master data controls, not only to review activity. The appropriate response is to strengthen the stewardship operating model so business owners can approve rules, prioritize remediation, and require process changes. More reports, downstream cleansing, or public ranking may create activity, but they do not fix the governance gap that prevents sustained resolution.
- More profiling identifies defects, but the facts already show recurring issues and blocked resolution authority.
- Warehouse cleansing treats symptoms downstream and leaves the source-process causes in place.
- Steward ranking may increase pressure, but it unfairly holds stewards accountable without giving them authority or capacity.
Question 68
Topic: Quality in Data Lifecycle and Operations
A marketing analytics team wants to add a third-party demographic file to a customer segmentation model because early tests show stronger predictive results. The provider cannot show lineage, refresh history, profiling results, or how key fields such as household_income_band are defined. What is the best next action before the data is adopted for production use?
Options:
A. Approve use because predictive performance improved
B. Cleanse invalid values during ingestion
C. Load the file and monitor model lift after deployment
D. Start a data quality and metadata assessment with the provider
Best answer: D
Explanation: External and third-party data can add value, but DAMA-aligned quality practice requires evidence that it is fit for purpose. Useful analytical performance is not enough when lineage, definitions, refresh behavior, and quality results are unknown. The next step is to obtain or create quality and metadata evidence: source lineage, business definitions, profiling results, refresh controls, permitted use, and compatibility with internal definitions. If the provider cannot supply adequate evidence, the organization should treat the data as untrusted for production decisions or limit it to controlled evaluation until risks are resolved.
Cleansing can address visible defects, but it cannot prove meaning, provenance, or sustainable quality.
- Deploy and monitor is reactive because it exposes production decisions before provenance and meaning are understood.
- Ingestion cleansing fixes some format or value issues but does not establish lineage, definitions, or provider controls.
- Performance-only approval confuses model usefulness with data fitness for governed production use.
Question 69
Topic: Data Quality Foundations and Business Fitness
A data steward reviews a monthly customer risk report used for credit-limit approvals. Profiling shows the customer ID is unique, the risk rating field is populated for all records, each rating uses an approved code, and the feeds arrived before the reporting cutoff. However, 18% of customers have different risk ratings in the CRM feed and the onboarding feed, and the report sometimes uses one value and sometimes the other. Which data quality issue is the main issue?
Options:
A. Late data
B. Duplicated data
C. Missing data
D. Conflicting data
Best answer: D
Explanation: The main issue is conflicting data: two sources provide different values for the same customer risk rating. The evidence rules out several other dimensions. The field is fully populated, so the problem is not completeness. The feeds arrived before cutoff, so timeliness is not the main concern. Customer ID is unique, so duplication is not driving the issue. The business impact comes from inconsistent source values being selected unpredictably for credit-limit decisions. A sustainable response would clarify the authoritative source or survivorship rule, then monitor the agreed rule through stewardship and governance.
- Missing data fails because profiling shows every risk rating field is populated.
- Late data fails because both feeds arrived before the reporting cutoff.
- Duplicated data fails because customer ID uniqueness checks passed.
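Conflicting values across two feeds can be surfaced with a simple comparison on the shared key, as in this sketch. The feed contents and the function name `find_conflicts` are hypothetical.

```python
def find_conflicts(crm_feed, onboarding_feed):
    """Return customer IDs present in both feeds whose risk ratings differ."""
    shared_ids = crm_feed.keys() & onboarding_feed.keys()
    return sorted(cid for cid in shared_ids if crm_feed[cid] != onboarding_feed[cid])


crm = {"C001": "LOW", "C002": "HIGH", "C003": "MED"}
onboarding = {"C001": "LOW", "C002": "MED", "C003": "MED"}

conflicts = find_conflicts(crm, onboarding)
conflict_rate = len(conflicts) / len(crm.keys() & onboarding.keys())
print(conflicts, conflict_rate)
```

Detecting the conflict is only the first step; deciding which value wins still requires an agreed authoritative source or survivorship rule.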
Question 70
Topic: Metadata Lineage and Quality Evidence
A finance team finds that 8% of records in a regulatory dashboard contain invalid branch codes. Lineage evidence shows where the values change.
| Stage | Evidence |
|---|---|
| Loan system | branch_id is mandatory and matches the enterprise code set |
| Integration mapping | Local spreadsheet maps some closed-branch aliases to old codes |
| Warehouse | Invalid branch_code values first appear after the integration load |
| Dashboard | Reads branch_code from the warehouse without transformation |
Which action best fits the defect?
Options:
A. Cleanse the regulatory extract each month
B. Control the integration mapping against governed reference data
C. Filter invalid branch codes in the dashboard
D. Ask loan officers to re-enter branch IDs
Best answer: B
Explanation: Lineage helps identify where a data value is created, transformed, and consumed. Here, the source loan system already has mandatory, valid branch IDs, and the invalid values first appear after the integration load. That points to the transformation and mapping process, not to source data entry or dashboard presentation. A governed reference-data control at the integration point prevents recurrence and aligns the mapping with the enterprise code set. Downstream filtering or monthly cleansing may hide symptoms, but it does not remove the process that creates the defect.
- Source re-entry fails because the visible evidence shows valid source values before integration.
- Dashboard filtering fails because it masks bad values after they have already entered the warehouse.
- Monthly cleansing fails because it is a recurring workaround rather than a preventive control at the defect point.
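Controlling the mapping against governed reference data could be sketched as a check that every mapping target exists in the enterprise code set before the integration load runs. The branch codes, aliases, and function name below are made up for illustration.

```python
ENTERPRISE_BRANCH_CODES = {"B100", "B200"}  # hypothetical governed code set


def invalid_mapping_targets(mapping, valid_codes):
    """Return alias -> target entries whose target is not in the governed code set."""
    return {
        alias: target
        for alias, target in mapping.items()
        if target not in valid_codes
    }


# Hypothetical alias mapping, standing in for the local spreadsheet.
mapping = {"Main St": "B100", "Old Mill": "B050", "Harbor": "B200"}

bad = invalid_mapping_targets(mapping, ENTERPRISE_BRANCH_CODES)
print(bad)
```

Running a check like this as a gate on the mapping, rather than on the dashboard, places the control where the lineage evidence shows the defect is introduced.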
Question 71
Topic: Monitoring Scorecards and Measurement
A data quality team publishes a weekly dashboard for customer master data. It currently shows completeness percentages, duplicate counts, and invalid postal-code counts by week. Sales Operations says the report is accurate but does not help managers prioritize remediation or know who must act. Which enhancement best supports the dashboard’s purpose?
Options:
A. Add rule status, trend, exception owner, and business impact
B. Add a record-level list of all corrected values
C. Add ETL job duration and server utilization by load
D. Add detailed frequency distributions for every profiled attribute
Best answer: A
Explanation: Scorecards and dashboards should translate data quality measurements into management action. Counts and percentages show quality status, but they are not enough when stakeholders must decide what to fix first and who is responsible. A useful data quality dashboard connects rules and thresholds to trends, exceptions, accountable owners or stewards, and business impact such as affected orders, customers, revenue, compliance exposure, or operational delay. That makes the display fit for governance, escalation, and remediation planning rather than only technical inspection. Detailed profiling can support analysis, but dashboard reporting should emphasize decisions and accountability.
- Profiling detail can help analysts investigate patterns, but it does not directly show ownership or business impact for prioritization.
- Operational metrics such as job duration and server utilization concern system performance, not customer data fitness for purpose.
- Corrected records provide audit detail, but a record-level list is too granular for communicating trends, exceptions, and accountability.
Question 72
Topic: Root Cause Analysis and Remediation
A data quality issue found 18,000 customer records with account_status values outside the approved reference list. Root cause analysis traced the defect to an integration mapping error. The mapping has been corrected, and the affected records have been reloaded. What action best supports remediation validation before closure?
Options:
A. Ask business stakeholders whether the status values look reasonable
B. Monitor the monthly invalid-status trend for two cycles
C. Retest the status validity rule on the reloaded records
D. Review the integration control design at the next governance meeting
Best answer: C
Explanation: Remediation validation should prove that the specific defect has been corrected in the affected data and that the agreed quality rule is now satisfied. Here, the issue was an objective validity failure: account_status values did not match the approved reference list. Because the mapping was fixed and records were reloaded, the most direct closure evidence is to rerun the same validity rule against the remediated population, especially the records that previously failed. Trend monitoring and control review are useful for sustained prevention, but they do not by themselves confirm that this remediation corrected the known failed records.
- Trend monitoring helps detect recurrence over time, but it is not immediate evidence that the corrected records now pass.
- Stakeholder confirmation is useful for ambiguous fitness-for-purpose concerns, but this case has an objective reference-data rule.
- Control review can strengthen prevention, but it does not validate the reloaded data for closure.
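As a rough illustration of the retest, the sketch below reruns a validity rule against only the records that previously failed. The status values, the `APPROVED_STATUSES` set, the record layout, and the `retest_validity` helper are all hypothetical; a real retest would read the governed reference list and the logged failure population.

```python
# Hypothetical approved reference list; real values come from the governed code set.
APPROVED_STATUSES = {"ACTIVE", "SUSPENDED", "CLOSED"}

def retest_validity(records, previously_failed_ids):
    """Rerun the validity rule on reloaded records that failed before remediation."""
    retested = [r for r in records if r["customer_id"] in previously_failed_ids]
    still_failing = [
        r["customer_id"]
        for r in retested
        if r["account_status"] not in APPROVED_STATUSES
    ]
    return {"retested": len(retested), "still_failing": still_failing}

# Illustrative reloaded data; customer 3 still carries an unapproved value.
reloaded = [
    {"customer_id": 1, "account_status": "ACTIVE"},
    {"customer_id": 2, "account_status": "CLOSED"},
    {"customer_id": 3, "account_status": "ACTV"},
]
result = retest_validity(reloaded, previously_failed_ids={1, 2, 3})
print(result)
```

An empty `still_failing` list would be the closure evidence; any remaining IDs keep the issue open.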
Question 73
Topic: Monitoring Scorecards and Measurement
A company’s customer onboarding is delayed when tax identifiers are unusable for credit checks. Profiling shows tax_id is 96% populated, but 7% of new records fail the approved validation rule, mostly from the partner portal. The Customer Master Data Steward owns the rule and can sponsor source-process fixes. Which measurement approach is the best quality decision?
Options:
A. Track validation pass rate by source, with steward-owned targets
B. Track the number of tax ID defects manually corrected each week
C. Track average CRM load time for new customer records
D. Track overall tax_id completeness across all customer records
Best answer: A
Explanation: A quality KPI should reflect business fitness for purpose, not just convenient technical counts. Here, the business impact is failed credit checks during onboarding, and the known defect is tax identifiers failing an approved validation rule. Measuring validation pass rate for new records by source channel connects the metric to the failing process, enables partner-portal root-cause action, and gives the Customer Master Data Steward a clear accountability point for targets and escalation. Completeness alone would miss invalid identifiers, and counting manual fixes would measure cleanup activity rather than sustained prevention.
The strongest scorecard metric combines the relevant quality dimension, business outcome, source-process view, and stewardship ownership.
- Completeness only misses the visible problem because most records are populated but still fail validation.
- Manual correction counts can rise even when prevention is poor, so they do not show whether onboarding risk is being reduced.
- CRM load time may be operationally useful, but it does not measure tax identifier quality or steward-owned rule performance.
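One way to picture the metric is a pass rate grouped by source channel and compared with a steward-owned target. In this sketch the nine-digit check, the 0.98 target, the source names, and the function names are placeholders, since the scenario does not state the approved validation rule or target.

```python
from collections import defaultdict

TARGET_PASS_RATE = 0.98  # hypothetical steward-owned target

def is_valid_tax_id(tax_id):
    """Placeholder rule for illustration only: exactly nine digits."""
    return tax_id is not None and tax_id.isdigit() and len(tax_id) == 9

def pass_rate_by_source(records):
    """Return {source: share of records passing the validation rule}."""
    totals, passes = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["source"]] += 1
        if is_valid_tax_id(r["tax_id"]):
            passes[r["source"]] += 1
    return {source: passes[source] / totals[source] for source in totals}

# Illustrative new-customer records from two intake channels.
records = [
    {"source": "partner_portal", "tax_id": "12345"},
    {"source": "partner_portal", "tax_id": "123456789"},
    {"source": "branch", "tax_id": "987654321"},
    {"source": "branch", "tax_id": "111222333"},
]
rates = pass_rate_by_source(records)
below_target = sorted(s for s, rate in rates.items() if rate < TARGET_PASS_RATE)
print(rates, below_target)
```

Grouping by source is what points remediation at the partner portal rather than at the customer master as a whole.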
Question 74
Topic: Master Reference Integration and Warehouse Quality
A retailer’s customer master feeds sales reporting and account hierarchy analytics. Profiling shows 8% likely duplicate customer records, conflicting legal names and tax IDs across CRM and billing, locally maintained region codes, and a parent-child hierarchy last approved 18 months ago. BI reports are double-counting revenue, and the data stewardship council can approve enterprise rules but has limited remediation capacity. Which quality decision best addresses the risk?
Options:
A. Deduplicate customer records in the warehouse before each BI refresh.
B. Ask each sales region to correct its local customer and region lists.
C. Use the most recently updated source value for all conflicting customer attributes.
D. Prioritize governed master data remediation with survivorship, code-set control, hierarchy refresh, and monitoring.
Best answer: D
Explanation: Master and reference data quality risks should be managed at the governed master data level, not only where defects appear in reports. The facts show multiple connected risks: duplicate customer masters affect uniqueness, conflicting legal names and tax IDs affect accuracy and integrity, local region codes indicate uncontrolled reference data, and the stale hierarchy affects downstream rollups. With limited remediation capacity, the strongest decision is to prioritize high-impact master data controls: approved match and merge rules, explicit survivorship by attribute, governed code sets, hierarchy review, and ongoing monitoring. This improves fitness for purpose for BI while preventing recurring defects from re-entering the warehouse.
- Warehouse-only cleanup reduces visible reporting defects but leaves source master records, survivorship, codes, and hierarchies uncontrolled.
- Local list correction may fix some regional symptoms but reinforces inconsistent ownership and uncontrolled reference data.
- Most recent value wins is a weak survivorship rule because recency does not prove accuracy for legal names, tax IDs, or hierarchy attributes.
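Explicit survivorship by attribute can be sketched as a precedence list per attribute instead of a single "latest wins" rule. The precedence order, attribute names, and sample values below are assumptions for illustration; the scenario does not say which source is authoritative for which attribute.

```python
# Hypothetical precedence per attribute; real precedence would be approved
# by the data stewardship council.
SURVIVORSHIP = {
    "legal_name": ["billing", "crm"],
    "tax_id": ["billing", "crm"],
    "phone": ["crm", "billing"],
}

def build_golden_record(candidates):
    """candidates: {source: {attribute: value}}.

    For each attribute, keep the value from the first source in its
    precedence list that has a non-empty value, and record that source.
    """
    golden = {}
    for attribute, precedence in SURVIVORSHIP.items():
        for source in precedence:
            value = candidates.get(source, {}).get(attribute)
            if value:
                golden[attribute] = {"value": value, "source": source}
                break
    return golden

candidates = {
    "crm": {"legal_name": "Acme Co", "tax_id": "", "phone": "555-0100"},
    "billing": {"legal_name": "Acme Corporation Ltd", "tax_id": "98-7654321"},
}
golden = build_golden_record(candidates)
print(golden)
```

Recording the winning source alongside each value keeps the merge decision auditable.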
Question 75
Topic: Master Reference Integration and Warehouse Quality
A regional sales dashboard is used for weekly pricing and inventory decisions. Business users report that the dashboard totals sometimes change after meetings, even when source transactions were already closed. Profiling shows that late warehouse loads and unmatched product reference codes affect 3% of fact rows. Which quality control best protects decision trust?
Options:
A. Publish a BI quality gate with reconciliation and freshness thresholds
B. Increase the dashboard refresh frequency to hourly
C. Ask analysts to manually adjust meeting reports
D. Add a glossary definition for sales revenue
Best answer: A
Explanation: Decision trust in BI depends on visible, repeatable controls that show whether analytical data is fit for its intended use before stakeholders rely on it. In this scenario, the defects are measurable and recurring: late warehouse loads and unmatched reference codes are changing published totals. A BI quality gate or scorecard should test reconciliation to closed source transactions, freshness of loads, and validity of product reference mappings against agreed thresholds. The dashboard can then be certified, held, or flagged based on quality status. This protects business decisions better than faster refreshes or manual adjustments because it addresses whether the published numbers are trustworthy at decision time.
- Hourly refresh may make unstable data appear sooner, but it does not verify reconciliation, reference-code validity, or readiness for use.
- Manual adjustment creates inconsistent downstream fixes and weak auditability instead of a repeatable warehouse quality control.
- Glossary definition helps semantic consistency, but the visible defects are operational quality failures in loads and reference mapping.
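A quality gate of this kind might be sketched as three checks evaluated before publication. The threshold values, the `CERTIFIED`/`HELD` labels, and the function signature are assumptions for illustration, and the sketch assumes a non-zero source total and row count.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; real values would be agreed with the business.
MAX_LOAD_AGE = timedelta(hours=24)
MAX_RECON_DIFF = 0.001      # allowed relative difference vs closed source total
MAX_UNMATCHED_RATE = 0.005  # allowed share of fact rows with unmatched product codes

def quality_gate(source_total, warehouse_total, last_load, now,
                 unmatched_rows, total_rows):
    """Return (status, checks) for reconciliation, freshness, and reference match."""
    checks = {
        "reconciliation":
            abs(warehouse_total - source_total) / source_total <= MAX_RECON_DIFF,
        "freshness": now - last_load <= MAX_LOAD_AGE,
        "reference_match": unmatched_rows / total_rows <= MAX_UNMATCHED_RATE,
    }
    status = "CERTIFIED" if all(checks.values()) else "HELD"
    return status, checks

# Illustrative run: the load is fresh, but totals and reference codes are off.
status, checks = quality_gate(
    source_total=1_000_000,
    warehouse_total=970_000,
    last_load=datetime(2024, 1, 8, 7, 0),
    now=datetime(2024, 1, 8, 9, 0),
    unmatched_rows=3_000,
    total_rows=100_000,
)
print(status, checks)
```

With 3% of fact rows unmatched, as in the scenario, the dashboard would be held or flagged rather than published as certified.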
Questions 76-100
Question 76
Topic: Specialist Cross-Discipline Judgment
A data quality team is triaging defects found during profiling of customer service data. Leadership wants to know which item should be escalated to the data governance council as an ethical data quality concern, not just handled as an operational cleanup or report presentation issue. Which defect best fits that escalation?
Options:
A. Missing accessibility needs flags suppress service accommodations
B. Duplicate supplier rows increase invoice matching effort
C. A daily exception report arrives 30 minutes late
D. A dashboard label uses an outdated department name
Best answer: A
Explanation: Ethical data quality concerns arise when poor quality can materially affect people’s rights, access, safety, fairness, or dignity. A missing accessibility needs flag is not merely incomplete data; it can prevent customers from receiving needed accommodations and create unequal treatment. That makes the issue appropriate for governance escalation, stewardship attention, root-cause analysis, and sustained controls. Operational inefficiency, cosmetic reporting defects, and minor timing issues may still need remediation, but they do not automatically create an ethical concern unless they materially affect decisions about people or expose them to harm.
- Supplier duplication creates process inefficiency and possible payment risk, but the facts point to operational cleanup rather than ethical impact.
- Outdated labeling is a presentation and metadata issue, but it does not show unfair treatment or harm.
- Late reporting may affect operations, but a 30-minute delay alone does not indicate an ethical quality concern.
Question 77
Topic: Quality Governance Roles and Stewardship
A data steward has drafted a quality rule for the customer domain: every active customer record must have a validated tax identifier before quarterly regulatory reporting. Profiling shows 12% exceptions, and the data engineering team can add the validation to the ingestion pipeline. Who should be accountable for approving the rule threshold and remediation priority?
Options:
A. The data engineering team
B. The customer data owner
C. The data custodian
D. The data steward
Best answer: B
Explanation: Data quality accountability should sit with the business role that owns the data domain and its fitness for purpose. A data owner has authority to approve quality expectations, thresholds, priorities, and acceptance of residual quality risk. The steward is central to coordinating the work: documenting the rule, analyzing exceptions, facilitating issue management, and helping stakeholders agree on definitions. Custodians and technical teams implement controls, pipelines, monitoring, and fixes, but they should not decide the business tolerance for missing or invalid regulatory identifiers. The key distinction is accountability versus execution: technical implementation supports quality management, while the data owner approves the business standard and remediation priority.
- Steward authority is tempting because the steward drafts and coordinates the rule, but approval of business tolerance belongs to the accountable owner.
- Custodian execution fits platform and data handling responsibilities, not business approval of thresholds or priorities.
- Engineering implementation is needed for pipeline validation, but implementation responsibility does not create ownership of the quality standard.
Question 78
Topic: Monitoring Scorecards and Measurement
A data quality scorecard flags customer onboarding as red for Tax ID completeness. The rule and recent evidence are shown below.
| Item | Evidence |
|---|---|
| Current rule | Tax ID must be populated for every active customer |
| Threshold | At least 98% complete |
| Current result | 94% complete |
| Exception profile | 100% of missing Tax IDs have reason code MINOR_EXEMPT |
| Business impact | KYC accepts the exemption code; no failed reviews |
What is the most appropriate action?
Options:
A. Remediate the missing Tax IDs in the source system
B. Accept the red status and continue routine monitoring
C. Refine the quality rule to measure conditional completeness
D. Escalate the result as an unresolved governance issue
Best answer: C
Explanation: Measurement results need interpretation against business meaning, not only threshold math. The scorecard is technically red because the current rule requires Tax ID for every active customer, but the profile shows every exception has a valid exemption reason and the KYC process accepts that condition. That means the issue is not missing data that must be corrected; it is a rule-definition mismatch. A better rule would test conditional completeness, such as requiring Tax ID unless a valid exemption code is present. The scorecard should then report whether records meet that business-approved condition. Acceptance alone would leave a misleading red metric in place.
- Source remediation fails because adding Tax IDs for exempt minors would contradict the documented business condition.
- Governance escalation is premature because the evidence points to rule refinement, not an unresolved ownership conflict or material business harm.
- Routine monitoring fails because continuing to report a known false breach weakens scorecard credibility.
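The difference between the current rule and a conditional rule can be worked through with the figures from the table. This is a minimal sketch; the record layout and function names are assumed, and only `MINOR_EXEMPT` is taken from the scenario.

```python
VALID_EXEMPTIONS = {"MINOR_EXEMPT"}  # exemption code accepted by KYC in the scenario

def completeness(records):
    """Current rule: Tax ID populated for every active customer."""
    return sum(1 for r in records if r["tax_id"]) / len(records)

def conditional_completeness(records):
    """Refined rule: Tax ID populated unless a valid exemption code is present."""
    compliant = sum(
        1 for r in records
        if r["tax_id"] or r.get("reason_code") in VALID_EXEMPTIONS
    )
    return compliant / len(records)

# 100 illustrative active customers: 94 populated, 6 exempt minors.
records = (
    [{"tax_id": "123456789", "reason_code": None}] * 94
    + [{"tax_id": None, "reason_code": "MINOR_EXEMPT"}] * 6
)
print(completeness(records))              # 0.94, below the 98% threshold
print(conditional_completeness(records))  # 1.0, meets the threshold
```

A record that is missing a Tax ID without a valid exemption code would still count against the refined rule, so genuine defects remain visible.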
Question 79
Topic: Quality Rules Standards and Requirements
A data steward is documenting quality requirements for customer contact data. Marketing can run campaigns if at least 98% of email addresses pass syntax and domain checks. Regulatory notices require every active customer to have a deliverable postal address before the monthly notice file is released. Which artifact best defines acceptable quality for these different uses?
Options:
A. A profiling report showing current contact-data defect counts
B. A single 100% quality target for all customer fields
C. A standard with use-specific rules, thresholds, and checkpoints
D. A cleansing backlog ordered by easiest defects first
Best answer: C
Explanation: Data quality standards translate fitness for purpose into approved expectations for specific data elements and uses. In this case, email quality for marketing has an acceptable threshold of 98%, while postal address quality for regulatory notices has a stricter release requirement. A useful standard states the rule, threshold, measurement point, owner, and expected response when results fall below tolerance. Those thresholds can then feed scorecards, controls, and remediation priorities. Profiling may help discover current defect levels, but it does not by itself define what is acceptable.
- Current defect counts show the existing condition, but they do not establish approved tolerance levels or release criteria.
- Universal perfection target ignores fitness for purpose and may misallocate effort across different business uses.
- Ease-based cleansing prioritizes work execution, not the standard that defines acceptable quality.
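To make the idea of use-specific rules concrete, a standard can be represented as a small set of rule records, each with its own threshold and checkpoint. The class name, field names, and the marketing checkpoint wording are assumptions; the thresholds come from the scenario.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QualityRule:
    use: str          # business use the rule protects
    rule: str         # what is measured
    threshold: float  # minimum acceptable pass rate
    checkpoint: str   # when the rule is evaluated

STANDARD = [
    QualityRule(
        use="marketing campaigns",
        rule="email address passes syntax and domain checks",
        threshold=0.98,
        checkpoint="before campaign run",  # assumed checkpoint wording
    ),
    QualityRule(
        use="regulatory notices",
        rule="active customer has a deliverable postal address",
        threshold=1.00,
        checkpoint="before monthly notice file release",
    ),
]

def meets_standard(rule, pass_rate):
    """True when the measured pass rate satisfies the rule's threshold."""
    return pass_rate >= rule.threshold

# The same measured pass rate is acceptable for one use and not the other.
results = {r.use: meets_standard(r, 0.985) for r in STANDARD}
print(results)
```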
Question 80
Topic: Monitoring Scorecards and Measurement
A data governance council funded a remediation program because poor customer contact data was causing failed regulatory notices and delayed complaint responses. Executives now want one KPI on the data quality scorecard that shows whether the improvement is reducing business risk. Which KPI best fits this need?
Options:
A. Percentage of contact-data fields passing format validation rules
B. Number of customer records cleansed during the remediation cycle
C. Number of new data quality rules added to the scorecard
D. Percentage reduction in failed regulatory notices caused by contact-data defects
Best answer: D
Explanation: Executives need a KPI that connects data quality improvement to reduced business risk, not just evidence that data quality work occurred. In this case, the funded risk is failed regulatory notices and delayed complaint responses caused by poor contact data. A KPI based on the reduction of failed notices attributable to contact-data defects is outcome-oriented, business-relevant, and traceable to the stated risk. It can still be supported by operational measures such as completeness, validity, and cleansing volume, but those measures do not by themselves prove risk reduction. The strongest executive KPI shows whether fewer harmful business events are occurring because quality has improved.
- Cleansing volume shows activity completed, but not whether risk events decreased.
- Format validation pass rate measures technical validity, but valid formatting can still fail business fitness for purpose.
- Rules added measures scorecard expansion, not actual improvement or risk reduction.
Question 81
Topic: Quality Governance Roles and Stewardship
A data quality council receives an escalation about a customer master rule: “Tax Registration Number is mandatory for all B2B customers.” Profiling shows 12% missing values, almost all for public-sector agencies in countries where some agencies are not issued that identifier. The data owner and steward are already assigned, the rule was approved last quarter, and the source team can add an exception reason if policy allows it. What is the best quality governance decision?
Options:
A. Re-approve the existing tax identifier completeness rule
B. Define approved exceptions and tolerance for public-sector accounts
C. Rank the missing values against other remediation backlog items
D. Assign a new owner for customer master tax identifiers
Best answer: B
Explanation: Governance should decide how legitimate exceptions to an approved quality rule are handled. The defect appears as a completeness failure, but the profile and source-process facts show that some missing values may be valid for a defined public-sector population. Since ownership and rule approval are already in place, the unresolved governance issue is exception tolerance: which cases are allowed, what evidence or reason code is required, and how those records should be excluded or reported in scorecards. This preserves the rule while preventing valid business cases from being treated as ordinary defects.
- Ownership change is not the issue because the data owner and steward are already assigned.
- Rule re-approval does not address the legitimate exception population or how it should be measured.
- Backlog ranking may help remediation work, but the escalation first needs a policy decision on valid exceptions.
Question 82
Topic: Quality in Data Lifecycle and Operations
A retailer plans to enrich its customer master with a partner-provided propensity_segment feed for campaign targeting. Initial profiling shows 99.8% completeness, valid code formats, and no duplicate customer IDs. The partner provides no business definition for the segment, no lineage for how it is calculated, and no commitment to notify the retailer if the calculation changes.
Which data quality risk is most important to identify before approving use of the feed?
Options:
A. Invalid code format in the segment field
B. Excessive storage cost for the enriched master data
C. Duplicate customer records in the customer master
D. Uncontrolled meaning and comparability of the segment over time
Best answer: D
Explanation: Third-party data quality risk is not limited to profile results such as completeness, validity, or uniqueness. A feed can pass technical checks but still be risky if the consuming organization cannot understand its meaning, provenance, calculation method, refresh behavior, or change controls. In this case, the decisive issue is business fitness for purpose: campaign decisions may depend on a segment whose definition or derivation could change without notice. That threatens consistency and comparability over time, even though the values are populated and formatted correctly.
The key takeaway is to assess external data for provenance, semantics, lineage, and provider change management, not just field-level conformance.
- Format validity is already supported by profiling, so it does not address the missing definition and change-control concern.
- Duplicate customers are not indicated because the profile found no duplicate customer IDs.
- Storage cost may be an operational concern, but it is not the data quality risk created by unclear external data provenance.
Question 83
Topic: Quality in Data Lifecycle and Operations
A bank wants to use a third-party business dataset to enrich customer risk models. Initial profiling shows valid file formats, 98% populated key fields, and no duplicate supplier IDs. Before approving it for production use, the data quality lead must decide whether the dataset is fit for the intended business purpose. What evidence is most important to request next?
Options:
A. Supplier provenance, definitions, rights, cadence, coverage, and validation evidence
B. A one-time cleansing plan for nulls and duplicate records
C. A mapping from supplier fields to internal database columns
D. A dashboard showing file delivery success and row counts
Best answer: A
Explanation: External and third-party data quality requires more than technical profiling. Valid formats, populated fields, and uniqueness checks show useful quality signals, but they do not prove that the data is suitable for the bank’s risk purpose. The data quality lead should verify where the data came from, what each field means, how often it is refreshed, whether the bank has rights to use it, whether it covers the intended population, and what validation has been performed against reliable sources. These factors connect data quality to fitness for purpose, governance, and operational risk. Mapping and monitoring help later, but approval should not rely on them before the external data’s trustworthiness and permitted use are established.
- Cleansing first misses the main risk because the visible defects are not the decisive issue; external suitability must be established before cleanup.
- Delivery metrics confirm operational receipt, but they do not prove meaning, rights, coverage, or value accuracy.
- Field mapping supports integration, but it cannot resolve whether the supplier data is trustworthy or permitted for the intended use.
Question 84
Topic: Specialist Cross-Discipline Judgment
A regional insurer reports conflicting “active policy” counts between claims analytics and finance reporting. Profiling shows both marts are technically valid and complete, but they use different policy status definitions and receive status updates through separate feeds from the policy administration system. Finance uses the count for regulatory reporting, and the data governance council has asked for a sustainable quality decision rather than another reconciliation spreadsheet. Which decision best improves data quality?
Options:
A. Ask each reporting team to document its local definition
B. Create a monthly manual tie-out between marts
C. Define an authoritative policy source and shared status model
D. Increase completeness checks on both reporting marts
Best answer: C
Explanation: Conflicting counts can occur even when each dataset is valid and complete if architecture allows multiple flows, duplicated transformations, and inconsistent business definitions. The sustainable data quality response is to clarify the authoritative source, standardize the policy status model, and route downstream reporting through governed, well-defined flows. That improves consistency and fitness for purpose, especially where finance relies on the measure for regulatory reporting. Local documentation or recurring reconciliations may expose differences, but they do not remove the redundant logic that creates them.
- More completeness checks miss the issue because the facts say the marts are complete; the defect is inconsistent definition and flow.
- Local definitions preserve conflicting interpretations instead of governing a shared business meaning.
- Manual tie-out treats symptoms after the fact and does not prevent recurring inconsistency.
Question 85
Topic: Quality Maturity and Continuous Improvement
A retailer has recurring fulfillment failures caused by invalid shipping addresses and missing customer contact details. Profiling shows most defects are introduced during call-center order entry, and the call-center director owns that process. The marketing warehouse is affected downstream, but the governance council has limited central remediation capacity. Which adoption approach is the BEST quality decision?
Options:
A. Run annual data quality awareness training for call-center staff
B. Embed business-owned rules, scorecards, and exception handling in order entry
C. Have IT cleanse defective records before warehouse loads
D. Ask the governance council to review quarterly defect reports
Best answer: B
Explanation: Sustainable data quality adoption works best when the business team that creates or uses the data owns the quality expectations and improvement actions. In this case, the defects originate in call-center order entry, so the call-center process should include agreed quality rules, visible scorecards, exception handling, and steward-led follow-up. That makes quality part of daily work rather than a downstream cleanup activity. Central governance still supports standards and escalation, but it should not become the only place where quality is measured or fixed. The key is shifting from episodic remediation to business-owned prevention and continuous improvement.
- Downstream cleansing reduces immediate warehouse impact but leaves the source process unchanged.
- Annual training may raise awareness, but it does not create ongoing measurement or accountability.
- Quarterly review is too removed from daily operations and does not give the process owner active control.
Question 86
Topic: Master Reference Integration and Warehouse Quality
A data warehouse integrates order status from CRM and ERP for an executive fulfillment KPI. Profiling shows each source uses only valid local codes, but reconciliation after the warehouse load shows CRM C means Cancelled while ERP C means Closed. The enterprise reference data standard exists, the ERP mapping changed during an upgrade, and audit users need traceability from the KPI back to source status values. Which action is the best quality decision?
Options:
A. Add a dashboard filter for records with status C.
B. Require both source systems to rename their local codes.
C. Validate only that warehouse status values use approved codes.
D. Review integration mapping and capture lineage before warehouse load.
Best answer: D
Explanation: The key issue is not simple code validity in either source. Each source value is technically valid, but the same code has different business meanings. That makes this an integration-quality problem involving reference data mapping, reconciliation, and lineage. The best control point is before the warehouse load, where source-specific values are translated to the enterprise standard and lineage is recorded for auditability. Reconciliation evidence shows the defect appears after integration, so downstream reporting cleanup would mask the issue rather than prevent recurrence. A domain validation that checks only approved warehouse codes would also miss the semantic mismatch.
- Dashboard filtering hides affected records and weakens the KPI rather than correcting the integration defect.
- Source renaming is disruptive and unnecessary because local codes can remain valid if the integration mapping is governed.
- Warehouse-only validation checks format or domain conformance but does not prove that source meanings were translated correctly.
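The control point described above can be sketched as a source-aware mapping that translates each local code to an enterprise value and keeps the source value as lineage. The enterprise status names, the `STATUS_MAP` table, and the `translate_status` helper are hypothetical.

```python
# Hypothetical mapping from (source system, local code) to enterprise status.
STATUS_MAP = {
    ("CRM", "C"): "CANCELLED",
    ("ERP", "C"): "CLOSED",
}

def translate_status(source_system, source_code):
    """Translate a local code and keep the source values for audit traceability."""
    key = (source_system, source_code)
    if key not in STATUS_MAP:
        # Reject unmapped codes before the warehouse load instead of guessing.
        raise ValueError(f"Unmapped status {source_code!r} from {source_system}")
    return {
        "enterprise_status": STATUS_MAP[key],
        "lineage": {"source_system": source_system, "source_code": source_code},
    }

crm_row = translate_status("CRM", "C")
erp_row = translate_status("ERP", "C")
print(crm_row)
print(erp_row)
```

Keying the mapping on the source system as well as the code is what separates the two meanings of C, which a warehouse-only domain check cannot do.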
Question 87
Topic: Monitoring Scorecards and Measurement
A bank’s customer master feed updates credit-risk attributes every hour and is used for same-day lending decisions. A validated quality rule already exists: risk_rating_code must be in the approved reference list and populated for all active customers. Recent profiling found no major issue, but a batch incident last week caused invalid codes for 40 minutes before users noticed. What is the best quality decision?
Options:
A. Audit a small sample of lending decisions each month
B. Implement continuous rule monitoring with threshold-based alerts
C. Profile the customer master table at the end of each quarter
D. Ask lending users to report suspicious risk ratings
Best answer: B
Explanation: Continuous monitoring is the best fit when a known, approved data quality rule protects time-sensitive business use. The feed changes hourly, the data supports same-day lending decisions, and a recent incident showed that waiting for users to notice creates business risk. Monitoring applies the rule repeatedly or in near real time, compares results with defined thresholds, and alerts accountable parties for timely remediation. Periodic profiling is useful for discovery and baseline assessment, but it is not enough once the rule is mature and the process needs rapid detection. Audit sampling and complaint handling can provide evidence or feedback, but they are reactive and may miss short-lived defects.
- Quarterly profiling is too infrequent for an hourly feed used in same-day decisions.
- Monthly audit sampling may detect some impact after the fact but will not reliably catch a 40-minute defect window.
- User reporting depends on manual discovery and repeats the weakness shown by the recent incident.
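A minimal sketch of threshold-based rule monitoring, run once per hourly feed, might look like the following. The approved rating codes, the 1% threshold, the record layout, and the alert structure are assumptions; in practice the alert would be routed to the accountable steward or owner.

```python
APPROVED_RATINGS = {"A", "B", "C", "D"}  # hypothetical approved reference list

def check_batch(batch, max_failure_rate=0.01):
    """Apply the rule to active customers and return an alert dict on breach.

    A missing code counts as a failure, so the check covers both
    population and reference-list validity. Returns None when within threshold.
    """
    active = [r for r in batch if r["active"]]
    if not active:
        return None
    failed = [r for r in active if r.get("risk_rating_code") not in APPROVED_RATINGS]
    failure_rate = len(failed) / len(active)
    if failure_rate > max_failure_rate:
        return {
            "rule": "risk_rating_code_valid_and_populated",
            "failure_rate": failure_rate,
            "failed_count": len(failed),
        }
    return None

# Illustrative hourly batch with one invalid code among four active customers.
batch = [
    {"active": True, "risk_rating_code": "A"},
    {"active": True, "risk_rating_code": "B"},
    {"active": True, "risk_rating_code": "ZZ"},
    {"active": True, "risk_rating_code": "C"},
    {"active": False, "risk_rating_code": None},
]
alert = check_batch(batch)
print(alert)
```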
Question 88
Topic: Data Quality Foundations and Business Fitness
A customer analytics team cleans duplicate customer records every month before publishing a retention dashboard. The duplicates keep returning because three intake systems use different matching rules and no business owner has approved a shared customer identity standard. Which principle best supports sustainable improvement?
Options:
A. Increase the frequency of downstream cleansing jobs
B. Prevent defects at the source through governed standards
C. Report the duplicate count as an activity metric
D. Accept local matching rules for each intake system
Best answer: B
Explanation: Sustainable data quality improvement focuses on preventing recurring defects, not repeatedly correcting symptoms. In this scenario, monthly cleanup improves the published dashboard temporarily, but the same duplicates return because source systems apply different matching rules and no shared business standard exists. A DAMA-aligned response would establish ownership, define and approve the customer identity rule, align intake processes, and monitor whether duplicates decline over time. Cleansing may still be needed for existing records, but it should not be the main control for a recurring source-process problem. The key principle is prevention through governed, business-approved standards.
- More cleansing treats the symptom and may improve a single reporting cycle, but it does not stop the intake systems from creating new duplicates.
- Activity reporting can show work performed or defects found, but measurement alone does not remediate the root cause.
- Local rules preserve inconsistency across systems, which is the visible cause of the recurring duplicate records.
Question 89
Topic: Root Cause Analysis and Remediation
A claims processor has a critical data quality issue: 6% of approved claims have invalid provider specialty codes, causing payment holds. Profiling has confirmed the affected records, root-cause analysis traced the defect to a recent reference-data mapping change, and the data owner has approved a correction rule and threshold. The next payment run is in 36 hours, and the remediation team has capacity. What is the best quality decision?
Options:
A. Reopen triage to reassess business priority
B. Add the defect rate to the monthly scorecard
C. Re-profile the provider specialty code field
D. Execute the approved correction and mapping update
Best answer: D
Explanation: Remediation execution is the act of applying approved corrective actions to affected data and, where appropriate, the process or mapping that produced the defect. In this case, the organization already has the key prerequisites: confirmed scope from profiling, known root cause, an approved correction rule, business urgency, and available remediation capacity. Repeating earlier diagnostic steps would delay correction without adding necessary decision evidence. Monitoring and scorecarding remain important after the fix, but they do not resolve the immediate payment holds. The quality decision should move from analysis to controlled execution, with evidence retained for later verification and governance review.
- More profiling is unnecessary because the defect population and quality dimension have already been confirmed.
- Reopened triage adds delay because priority, ownership, and business impact are already clear.
- Scorecard monitoring tracks quality over time but does not correct the invalid codes before the payment run.
Question 90
Topic: Profiling Discovery and Assessment
A data quality team has completed a baseline assessment for customer onboarding data used in regulatory reporting and fraud scoring. The assessment includes record counts, null percentages, and format failures for 40 columns. It does not identify critical data elements, approved business rules, quality thresholds, data owners, or source-process lineage. Business sponsors want an improvement plan for the next quarter. What is the best quality decision?
Options:
A. Delay all planning until every enterprise data set is profiled
B. Accept it as the official scorecard baseline for quarterly targets
C. Use it only as discovery and complete a CDE-based baseline
D. Rank defects by null percentage and remediate the highest rates first
Best answer: C
Explanation: A baseline assessment supports improvement planning only when it is tied to fitness for purpose. For customer onboarding data used in regulatory reporting and fraud scoring, raw profiling results are useful discovery evidence, but they are not sufficient by themselves. Improvement planning needs identified critical data elements, approved quality rules, agreed thresholds, accountable owners, and enough lineage or source-process context to distinguish symptoms from causes. Without those elements, the team may spend limited remediation capacity on visible defects that have little business impact while missing high-risk issues. The practical decision is to treat the work as preliminary profiling and extend it into a governed, CDE-focused baseline.
- Null-rate ranking ignores business criticality, rule approval, and downstream impact, so it may misdirect remediation effort.
- Official scorecard use is premature because scorecards require agreed rules, thresholds, and ownership.
- Enterprise-wide delay is unnecessary because planning can proceed for the relevant critical data once the baseline is made fit for purpose.
Question 91
Topic: Profiling Discovery and Assessment
An insurer is profiling a policy table before loading it into a governed warehouse. The migration team needs to separate a structural issue in effective_date from a business-rule quality issue involving application_date. Which profiling output best makes that distinction?
Options:
A. Date-pattern counts plus effective_date >= application_date exceptions
B. effective_date null count, minimum date, and maximum date
C. Top distinct effective_date values and their frequencies
D. Duplicate policy_id counts grouped by source system
Best answer: A
Explanation: A structural data issue is about whether the data can be represented and interpreted in the expected form, such as a date field containing non-date patterns or mixed formats. A business-rule quality issue occurs when data is structurally valid but not fit for the business rule, such as an effective date earlier than the application date. The strongest profiling output shows both layers: format or pattern results for structural assessment, and cross-field rule exceptions for business fitness. Basic summaries can reveal anomalies, but they do not clearly separate invalid structure from a valid value that fails a business rule.
- Range summary may show unusual dates, but it does not prove whether the issue is format structure or rule failure.
- Frequency counts help find common values or outliers, but they do not test the relationship between dates.
- Duplicate policy counts address uniqueness of policy identifiers, not the distinction between date structure and date-rule compliance.
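The two layers described in this explanation can be sketched in a few lines of Python. This is only an illustrative sketch: the sample rows, the assumed ISO `YYYY-MM-DD` target format, and the helper names `shape`, `parse_iso`, and `profile_policy_dates` are hypothetical and do not come from the question.

```python
import re
from collections import Counter
from datetime import date

ISO_SHAPE = "9999-99-99"  # assumption: date columns are expected in YYYY-MM-DD form


def shape(value):
    """Reduce a raw value to a pattern: digits become 9, letters become A."""
    if value is None or value == "":
        return "<blank>"
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def parse_iso(value):
    """Return a date when the value is structurally valid, otherwise None."""
    if shape(value) != ISO_SHAPE:
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def profile_policy_dates(rows):
    """Return (pattern counts for effective_date, policy_ids failing the rule)."""
    patterns = Counter(shape(r["effective_date"]) for r in rows)
    rule_exceptions = []
    for r in rows:
        effective = parse_iso(r["effective_date"])
        application = parse_iso(r["application_date"])
        # The business rule is only testable when both dates are structurally valid.
        if effective is not None and application is not None and effective < application:
            rule_exceptions.append(r["policy_id"])
    return patterns, rule_exceptions


# Hypothetical sample rows for illustration only.
rows = [
    {"policy_id": "P1", "application_date": "2024-01-10", "effective_date": "2024-02-01"},
    {"policy_id": "P2", "application_date": "2024-03-05", "effective_date": "03/01/2024"},
    {"policy_id": "P3", "application_date": "2024-05-20", "effective_date": "2024-05-01"},
    {"policy_id": "P4", "application_date": "2024-06-01", "effective_date": ""},
]
patterns, rule_exceptions = profile_policy_dates(rows)
```

In this sketch, P2 and P4 surface in the pattern counts as structural issues, while P3 is structurally valid but fails effective_date >= application_date, so it appears only in the rule-exception list.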
Question 92
Topic: Data Quality Foundations and Business Fitness
A retailer’s regional revenue dashboard is losing trust because 8% of new orders have ship_to_country = "UNK". Profiling shows that most exceptions come from two sales portals after an approved country reference list changed. The warehouse team can infer many countries from postal codes, but the same defect reappears each week.
Which response best applies data quality management principles?
Options:
A. Infer missing countries during each warehouse load.
B. Mark the dashboard as provisional until volumes improve.
C. Lower the completeness threshold for country reporting.
D. Enforce the source country list and publish exception trends.
Best answer: D
Explanation: Sustainable data quality management focuses on fitness for purpose and prevention, not only downstream repair. The profile identifies a recurring upstream process failure: two portals are not applying the approved country reference list after it changed. Enforcing the reference list at order entry changes the process behavior that creates the defect. Publishing exception trends through monitoring or a scorecard helps report users understand current quality, see improvement, and regain confidence in the data. Inferring values in the warehouse may be useful as a temporary remediation, but it does not stop the weekly recurrence or provide transparent quality evidence.
- Warehouse inference treats the symptom after ingestion and leaves the defective order-entry behavior unchanged.
- Provisional labeling warns users but does not correct the source process or show measurable improvement.
- Lowering the threshold changes the target instead of improving fitness for purpose or trust in the dashboard.
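As a rough illustration of "enforce the source list and publish exception trends", the sketch below checks each order against an approved reference list and reports the exception rate by week and portal. The country codes, the portal names, and the functions `is_approved_country` and `exception_trend` are assumptions made for the example, not part of the scenario.

```python
from collections import defaultdict

# Assumption: a small stand-in for the approved country reference list.
APPROVED_COUNTRIES = {"US", "CA", "MX"}


def is_approved_country(code):
    """Source-side check an order-entry portal could apply before saving."""
    return code in APPROVED_COUNTRIES


def exception_trend(orders):
    """Return {(week, portal): share of orders failing the reference check}."""
    totals = defaultdict(int)
    exceptions = defaultdict(int)
    for order in orders:
        key = (order["week"], order["portal"])
        totals[key] += 1
        if not is_approved_country(order["ship_to_country"]):
            exceptions[key] += 1
    return {key: exceptions[key] / totals[key] for key in sorted(totals)}


# Hypothetical orders: week 1 before enforcement, week 2 after.
orders = [
    {"week": "2024-W01", "portal": "portal_a", "ship_to_country": "UNK"},
    {"week": "2024-W01", "portal": "portal_a", "ship_to_country": "US"},
    {"week": "2024-W02", "portal": "portal_a", "ship_to_country": "CA"},
    {"week": "2024-W02", "portal": "portal_a", "ship_to_country": "MX"},
]
trend = exception_trend(orders)
```

A declining rate per portal is the kind of evidence a scorecard can publish to show that the source control, rather than warehouse inference, is what removed the defect.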
Question 93
Topic: Root Cause Analysis and Remediation
A data quality defect was opened after duplicate customer records caused duplicate invoices. Remediation changed the customer onboarding validation and the merge rule in the master data hub. Governance-approved closure criteria require evidence that the source process was corrected, impacted records were remediated, and duplicate-invoice exceptions stayed below 0.5% for two monthly billing cycles. Which evidence best supports closing the defect?
Options:
A. Two monthly control reports with validation active, merges completed, and exceptions at 0.2% and 0.1%
B. A development test showing the new merge rule blocks duplicate customer IDs
C. A one-time profile showing no duplicate customer records after cleansing
D. A steward approval note stating that users are satisfied with the cleanup
Best answer: A
Explanation: Defect closure should be supported by validation evidence tied to the agreed closure criteria, not just proof that an activity occurred. Here, closure requires three things: the source process correction is operating, impacted records have been remediated, and the business-impacting exception rate remains below the approved threshold for two monthly cycles. Evidence from production control reports across both cycles is stronger than a single test, a stakeholder sign-off, or a one-time cleansing result because it demonstrates sustained fitness for purpose after remediation. The key distinction is between proving that a fix was implemented and proving that the defect is controlled in normal operation.
- Development testing confirms rule behavior before or outside full operation, but it does not prove sustained production performance.
- User satisfaction may support acceptance, but it does not verify the source correction, remediated records, or threshold results.
- One-time profiling can show cleanup success at a point in time, but it does not demonstrate two-cycle monitoring or source-process control.
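The closure criteria in this question can be read as a simple check over consecutive control reports. The sketch below is one possible encoding, assuming each monthly report is a dictionary with the hypothetical keys `validation_active`, `merges_completed`, and `exception_rate_pct`.

```python
def closure_supported(cycles, threshold_pct=0.5, required_cycles=2):
    """Return True when the most recent cycles all satisfy the closure criteria."""
    if len(cycles) < required_cycles:
        return False  # not enough production evidence yet
    recent = cycles[-required_cycles:]
    return all(
        c["validation_active"]
        and c["merges_completed"]
        and c["exception_rate_pct"] < threshold_pct
        for c in recent
    )


# Values mirror option A; the report structure itself is an assumption.
monthly_reports = [
    {"validation_active": True, "merges_completed": True, "exception_rate_pct": 0.2},
    {"validation_active": True, "merges_completed": True, "exception_rate_pct": 0.1},
]
one_time_profile = [
    {"validation_active": True, "merges_completed": True, "exception_rate_pct": 0.0},
]
```

Under these assumptions, the two monthly reports satisfy the check, while a single clean profile does not, because it cannot demonstrate two cycles of sustained control.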
Question 94
Topic: Profiling Discovery and Assessment
A bank’s marketing team wants to change a draft data quality rule so date_of_birth is mandatory for every customer record. An initial profile shows 18% completeness failures, and age-based campaign exclusions are affected. The customer hub receives both individual and business customers from branch, web, and acquired-portfolio sources. Data stewards have not yet agreed which customer types are in scope for the rule. Which quality decision is best before approving the rule change?
Options:
A. Set the threshold to the current 82% completeness baseline
B. Make date_of_birth mandatory for all customer records
C. Populate missing values with a standard unknown date
D. Profile completeness by customer type, source, and onboarding process
Best answer: D
Explanation: Profiling should clarify the business context and defect pattern before a quality rule is defined or changed. The aggregate 18% failure rate is not enough evidence because date_of_birth may be required for individual customers but not meaningful for business customers. Profiling by customer type, source, and onboarding process can show whether the issue is a true completeness defect, a scope-definition problem, or a source-specific process gap. Stewardship input is then needed to confirm the rule’s applicable population and threshold. A sustainable rule should reflect fitness for purpose, not simply force every record to meet a technical field requirement.
- Universal mandatory rule ignores that business customers may be legitimately out of scope for a birth-date requirement.
- Baseline threshold measures the current state but does not establish whether 82% is acceptable for the intended business use.
- Unknown date default masks missingness and can create inaccurate age-based exclusions downstream.
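Segmented profiling of the kind recommended here might look like the following sketch. The sample records, the segment keys, and the function `completeness_by` are hypothetical; the point is only that an aggregate failure rate can hide very different rates per customer type and source.

```python
from collections import defaultdict


def completeness_by(rows, field, keys):
    """Return {segment: share of rows where field is populated}."""
    totals = defaultdict(int)
    filled = defaultdict(int)
    for row in rows:
        segment = tuple(row[k] for k in keys)
        totals[segment] += 1
        if row.get(field) not in (None, ""):
            filled[segment] += 1
    return {segment: filled[segment] / totals[segment] for segment in totals}


# Hypothetical customer hub records for illustration only.
customers = [
    {"customer_type": "individual", "source": "web", "date_of_birth": "1990-04-12"},
    {"customer_type": "individual", "source": "web", "date_of_birth": ""},
    {"customer_type": "business", "source": "branch", "date_of_birth": None},
    {"customer_type": "business", "source": "branch", "date_of_birth": None},
    {"customer_type": "individual", "source": "acquired", "date_of_birth": "1985-11-02"},
]
by_segment = completeness_by(customers, "date_of_birth", ["customer_type", "source"])
```

If most blanks sit in the business-customer segment, as in this made-up data, the finding points toward a scope-definition question for stewards rather than a capture defect.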
Question 95
Topic: Quality Rules Standards and Requirements
A customer data quality scorecard uses approved rules for campaign eligibility. The rules identify exceptions for invalid email formats, inactive accounts, and missing consent dates. In review meetings, Sales, Marketing, and Compliance regularly disagree about whether specific exceptions are true defects because each group applies different business interpretations. What improvement best fits this situation?
Options:
A. Assign all disputed exceptions to IT for correction
B. Add approved defect criteria and exception categories to each rule
C. Lower the exception threshold to reduce escalations
D. Run more frequent profiling on the customer file
Best answer: B
Explanation: Quality rules need more than technical test logic. When users disagree about whether exceptions are true defects, the weakness is usually in rule definition and approval: unclear business intent, ambiguous defect criteria, or missing exception categories. The improvement should clarify what counts as a defect, what counts as an allowed exception, who approves those interpretations, and how results are reflected in scorecards. This aligns measurement with fitness for purpose and gives stewards a repeatable basis for dispositioning exceptions. Changing thresholds or increasing profiling may change volume or visibility, but it does not resolve inconsistent business interpretation.
- Lowering thresholds changes tolerance levels but does not define which exceptions are valid business exceptions versus defects.
- More profiling can reveal patterns and volumes, but it will not settle stakeholder disagreement about meaning.
- IT correction treats the issue as a technical fix even though the visible problem is business rule interpretation and approval.
Question 96
Topic: Specialist Cross-Discipline Judgment
A hospital publishes a monthly readmission dashboard used to allocate outreach resources. Profiling shows that 18% of patients whose preferred language is not English have preferred_language defaulted to English when registration staff leave the field blank. The dashboard is visually consistent and delivered on time, but outreach teams use it to decide who receives translated discharge follow-up. What is the best quality decision?
Options:
A. Rename the field label to clarify that the value may be defaulted
B. Keep the dashboard unchanged because it is timely and visually consistent
C. Treat it as an ethical data quality issue and escalate for stewardship action
D. Log it as a low-priority completeness defect for the next release
Best answer: C
Explanation: Ethical data quality concerns arise when poor-quality data can create unfair, harmful, or exclusionary outcomes, not merely when a report is inconvenient or unattractive. Here, a default value masks missing language preference and affects which patients receive translated discharge follow-up. The issue involves completeness and accuracy, but the deciding factor is the downstream impact on equitable care. A DAMA-aligned response should involve stewardship, rule correction, source-process remediation, monitoring, and governance escalation because the data is used for consequential decisions. Cosmetic consistency and timely delivery do not make the data fit for purpose when it can systematically disadvantage a patient group.
- Low priority fails because the defect affects care allocation, not just backlog hygiene.
- Timely dashboard fails because operational timeliness does not offset harmful misclassification.
- Label change fails because warning users does not correct the defaulting process or prevent inequitable use.
Question 97
Topic: Data Quality Foundations and Business Fitness
A customer dataset will be used tomorrow to mail required service-renewal notices. The business owner says the mailing is fit for use only if at least 98% of records have a current postal address. Email and phone are not used for this mailing.
| Profile finding | Result |
|---|---|
| Postal address populated | 99.6% |
| Postal address valid format | 99.1% |
| Postal address confirmed current | 94.7% |
| Email address valid format | 82.0% |
Which assessment best applies fitness-for-purpose reasoning?
Options:
A. Not acceptable because current postal addresses are below the threshold.
B. Acceptable after removing records with missing postal addresses.
C. Not acceptable because email validity is low for customer communication.
D. Acceptable because postal completeness and format validity exceed 98%.
Best answer: A
Explanation: Fitness for purpose evaluates data against the requirements of a specific business use, not against every possible quality concern. Here, the operational use is mailing required notices, and the decisive requirement is at least 98% current postal addresses. Although postal address population and format validity are above 98%, currency is only 94.7%, so the dataset does not meet the stated business threshold. The low email validity rate is not decisive because email is not used for this mailing. The key distinction is that technically valid or complete data may still be unfit when it is not current enough for the intended process.
- Completeness trap fails because populated postal addresses do not prove the addresses are current.
- Irrelevant channel fails because email quality does not determine fitness for a postal mailing.
- Record removal fails because excluding missing addresses does not fix the larger current-address shortfall.
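The reasoning above amounts to comparing only the metrics that matter for the stated use against their thresholds. A minimal sketch, assuming the profile results are held in a dictionary with hypothetical metric names and that the business owner's requirement is expressed as a single 98% threshold:

```python
def assess_fitness(profile, requirements):
    """Return (fit, failures); failures maps metric -> (observed, required)."""
    failures = {
        metric: (profile[metric], required)
        for metric, required in requirements.items()
        if profile[metric] < required
    }
    return (not failures), failures


# Profile values come from the question; the metric names are assumptions.
profile = {
    "postal_populated": 99.6,
    "postal_valid_format": 99.1,
    "postal_confirmed_current": 94.7,
    "email_valid_format": 82.0,
}
# Only the requirement stated for this mailing is tested; email is out of scope.
requirements = {"postal_confirmed_current": 98.0}

fit, failures = assess_fitness(profile, requirements)
```

Here the dataset is judged unfit solely because confirmed-current addresses (94.7%) fall below 98%; the low email validity never enters the check because it is not a requirement for this use.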
Question 98
Topic: Root Cause Analysis and Remediation
A data quality team is reviewing a recurring defect in customer onboarding data. The same issue appears each month after a batch load from a regional sales system.
Evidence:
| Finding | Detail |
|---|---|
| Profile result | 18% of new customer records have a blank tax_id |
| Business effect | Credit checks are delayed for high-value accounts |
| Process review | The regional sales form allows submission when tax_id is blank |
| Current response | Operations manually adds missing values before month-end reporting |
Which finding identifies the root cause of the defect?
Options:
A. Credit checks are delayed for high-value accounts
B. 18% of new customer records have a blank tax_id
C. The regional sales form allows blank tax_id submissions
D. Operations manually adds missing values before reporting
Best answer: C
Explanation: Root cause analysis seeks the underlying condition or process failure that produces a recurring data quality defect. In this case, missing tax_id values are not just appearing downstream; they are allowed at the point of capture because the regional sales form does not require the field. That explains why the same completeness defect recurs after each batch load.
The exception percentage measures the scale of the symptom, the delayed credit checks describe business impact, and manual updates are an immediate correction action. Sustainable remediation would address the capture control or business rule at the source, not only repair records after loading.
- Exception count describes how often the defect occurs, but not why it is being created.
- Business impact explains the consequence of poor data quality, not the originating process failure.
- Manual correction fixes current records, but it does not prevent the next batch from having the same defect.
Question 99
Topic: Root Cause Analysis and Remediation
A bank’s regulatory reporting feed is being rejected because 12% of new customer records lack tax_residency_code. Profiling shows the defect occurs only when branches use the “quick add” onboarding path. The approved business rule says tax residency is required before account activation. The data steward owns the rule, and the application team owns the onboarding workflow. Which remediation decision provides the best quality outcome?
Options:
A. Ask the steward to lower the completeness threshold
B. Enforce the rule in quick add and monitor exceptions
C. Exclude incomplete records from the regulatory feed
D. Populate missing values nightly with UNKNOWN
Best answer: B
Explanation: Sustainable remediation should address the cause of the defect, not only reduce visible errors downstream. The evidence points to a source-process failure: one onboarding path does not enforce an approved mandatory rule. The best quality decision is to make the quick add workflow enforce tax_residency_code before activation, with the application team correcting the control and the steward confirming the rule and monitoring results. This also closes the accountability gap because rule ownership and process ownership are both used appropriately. Filling blanks or excluding records may reduce report rejects temporarily, but they leave the defective capture process in place.
- Nightly defaulting masks missing data and may create inaccurate regulatory values rather than correcting capture at the source.
- Feed exclusion reduces rejected submissions but leaves active customer records incomplete for downstream use.
- Lowering the threshold treats the approved business requirement as negotiable without evidence that the rule is wrong.
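One way to picture "enforce the rule in quick add and monitor exceptions" is a capture-time check that blocks activation and records each blocked attempt. The sketch below is hypothetical: the record layout, the `status` field, and the functions `validate_quick_add` and `activate_account` are assumptions, not a description of any real onboarding system.

```python
def validate_quick_add(record):
    """Return a list of rule violations; an empty list means activation may proceed."""
    violations = []
    code = (record.get("tax_residency_code") or "").strip()
    if not code:
        violations.append("tax_residency_code is required before account activation")
    return violations


def activate_account(record, exception_log):
    """Activate the account only when the approved rule is satisfied."""
    violations = validate_quick_add(record)
    if violations:
        # Blocked attempts are logged so the steward can monitor exception volume.
        exception_log.append(
            {"customer_id": record.get("customer_id"), "violations": violations}
        )
        return False
    record["status"] = "active"
    return True


# Hypothetical usage.
exception_log = []
blocked = activate_account({"customer_id": "C1", "tax_residency_code": ""}, exception_log)
allowed = activate_account({"customer_id": "C2", "tax_residency_code": "GB"}, exception_log)
```

The design keeps the two ownership roles separate: the application team owns the enforcement in the workflow, and the steward uses the exception log to confirm that the rule is working as approved.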
Question 100
Topic: Profiling Discovery and Assessment
A data steward is assessing a customer contact dataset for a product safety recall. The business purpose is to reach affected customers within 10 days using a legally permitted channel.
| Assessment result | Finding |
|---|---|
| Email format validity | 98% valid |
| Phone format validity | 96% valid |
| Mailing address completeness | 72% complete |
| Contact permission populated | 99% populated |
| Last contact verification | 65% older than 3 years |
Which interpretation best fits these results and the stated business purpose?
Options:
A. The primary defect is uniqueness because duplicate customers may exist.
B. The data is fit because most contact fields are technically valid.
C. The data is not fit because contactability and timeliness are insufficient.
D. The assessment proves the contact permission rule is the only concern.
Best answer: C
Explanation: Fitness for purpose is judged against the business use, not by a single strong profile result. For a safety recall, the organization must be able to contact customers quickly through permitted channels. High email and phone format validity shows many values have acceptable structure, but it does not prove the contacts are current, reachable, or sufficient across channels. The low mailing-address completeness and old verification dates create a material risk that customers cannot be reached within 10 days. The populated permission field is useful, but it does not offset stale or incomplete contact data. The strongest interpretation combines the assessment evidence with the recall objective.
- Format validity trap fails because syntactically valid contact values may still be stale or unusable for the recall purpose.
- Duplicate assumption fails because the assessment provides no uniqueness or matching evidence.
- Permission-only focus fails because the permission field is almost fully populated, while completeness and timeliness create the clearer business risk.
Continue in the web app
Use IT Mastery for interactive DAMA CDMP Data Quality Specialist practice with mixed sets, timed mocks, topic drills, explanations, and progress tracking.
Focused topic pages
- Data Quality Foundations and Business Fitness
- Quality Strategy Business Case and Prioritization
- Profiling Discovery and Assessment
- Quality Rules Standards and Requirements
- Root Cause Analysis and Remediation
- Monitoring Scorecards and Measurement
- Quality Governance Roles and Stewardship
- Metadata Lineage and Quality Evidence
- Master Reference Integration and Warehouse Quality
- Quality in Data Lifecycle and Operations
- Quality Maturity and Continuous Improvement
- Specialist Cross-Discipline Judgment