Try 120 free PMI-CPMAI questions across the exam domains, with answers and explanations, then continue in PM Mastery.
This free full-length PMI-CPMAI practice exam includes 120 original PM Mastery questions across the exam domains.
The questions are original PM Mastery practice questions aligned to the exam outline. They are not official exam questions and are not copied from any exam sponsor.
Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some exam sponsors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.
For concept review before or after this set, use the PMI-CPMAI guide on PMExams.com.
Set a 160-minute timer and answer the 120 questions in one sitting. Track each miss by AI initiative stage: responsible AI, business need, data need, model development/evaluation, or operationalization.
Suggested timing checkpoints:
| Question range | Target elapsed time |
|---|---|
| 1-40 | 53 minutes |
| 41-80 | 107 minutes |
| 81-120 | 160 minutes |
| Item | Detail |
|---|---|
| Issuer | PMI |
| Exam route | PMI-CPMAI |
| Official exam name | PMI Certified Professional in Managing AI (PMI-CPMAI) |
| Full-length set on this page | 120 questions |
| Exam time | 160 minutes |
| Topic areas represented | 5 |
| Topic | Approximate official weight | Questions used |
|---|---|---|
| Support Responsible and Trustworthy AI Efforts | 15% | 18 |
| Identify Business Needs and Solutions | 26% | 31 |
| Identify Data Needs | 26% | 31 |
| Manage AI Model Development and Evaluation | 16% | 19 |
| Operationalize AI Solution | 17% | 21 |
Topic: Operationalize AI Solution
A bank is preparing to operationalize an AI assistant that summarizes customer-service chats and drafts agent replies. The model will process regulated PII, and the bank’s AI policy requires: (1) approved data access and retention controls, (2) security review before production, (3) responsible-AI documentation (intended use, limitations, bias risks), and (4) registration in the model inventory with an accountable owner.
The sponsor wants the fastest path to launch without violating policy or increasing privacy/security risk. What should the AI project manager do?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: Use a pre-production governance gate that maps directly to the bank’s required controls: privacy/retention, security review, responsible-AI documentation, and model inventory ownership. Pairing this with a limited pilot and explicit approval checkpoints reduces compliance and operational risk while still enabling a timely release. This optimizes speed without violating stated policy constraints.
When operationalizing an AI capability that processes regulated PII, governance must be implemented as enforceable controls aligned to organizational policy—not as after-the-fact paperwork. In this scenario, the optimized approach is to run a focused governance gate that completes the required policy items and produces auditable evidence, then reduce delivery risk with a staged rollout.
Practical governance steps here include:
This approach meets privacy/security/responsible-AI constraints while keeping time-to-market competitive through a pilot and clear go/no-go criteria, rather than delaying indefinitely or cutting controls that harm operations.
It satisfies mandatory privacy, security, and responsible-AI controls while minimizing time-to-market via a focused pre-production gate and staged release.
Topic: Identify Data Needs
You are leading a 10-week AI initiative to predict hospital appointment no-shows to reduce wasted clinician time. In week 3, the data team reports: 18 months of scheduling data; 12% missing values in appointment_reason; a new intake form changed the meaning of cancellation_code six months ago; and patient ZIP code is considered sensitive, so leadership prefers it not be used unless clearly justified. The COO is non-technical and wants a decision this week on whether to proceed.
What is the BEST next action to convey the data situation to leadership and enable a decision?
Best answer: B
What this tests: Identify Data Needs
Explanation: The COO needs a timely go/no-go decision, not a technical readout. The best action is to translate data issues (missingness, definition changes, and sensitive attributes) into business impact, risk, and what choices are available within the 10-week timeline. This enables informed tradeoffs and clear next steps without overpromising model performance.
In CPMAI work, conveying data understanding to leadership means converting technical observations into decision-ready language: what the issue is, why it matters to outcomes, and what will be done about it within constraints. Here, missing values and a changed code definition affect reliability and may create inconsistent patterns over time (a business risk to forecast accuracy and trust). The sensitive ZIP code constraint is a governance/adoption risk that should be framed as an explicit tradeoff (exclude it, justify limited use, or use privacy-preserving alternatives).
A decision-ready update should include:
This is more effective than overwhelming the COO with technical artifacts or making performance commitments before feasibility is proven.
This reframes data quality, drift, and privacy constraints into business-relevant risks, tradeoffs, and choices needed for a timely go/no-go.
Topic: Identify Data Needs
A project team is preparing data for a customer churn prediction use case. During data profiling, the data lead finds: 12% missing values in key fields, inconsistent customer identifiers across two sources, and unmasked PII in a shared analytics workspace. The project manager asks for evidence that documents the evaluation results and clearly communicates risks and recommended actions for a go/no-go decision.
Which metric/evidence/artifact best meets this need?
Best answer: A
What this tests: Identify Data Needs
Explanation: A data readiness assessment report is the most decision-useful way to document data evaluation results and communicate risks and recommended actions. It ties observed data issues (quality, identity resolution, and PII handling) to impact on the use case, proposes remediation steps and owners, and supports a governance checkpoint. This directly enables a go/no-go decision based on evidence rather than raw outputs.
To validate data readiness and communicate evaluation outcomes, the project team needs an artifact that translates profiling and checks into actionable risk information. A data readiness assessment (or data quality assessment) report typically summarizes key results (e.g., completeness, consistency, identifier integrity, privacy/control gaps), describes business and model impact, assigns a risk level, and recommends specific actions (masking/controls for PII, identifier reconciliation, missing-data mitigation) with owners and timing. This provides an auditable, governance-aligned basis for proceeding, pausing, or revising scope. Raw metrics, source lists, or task backlogs can support the work, but they do not, by themselves, document conclusions and risk-informed recommendations for the broader project team.
It consolidates evaluation results into decision-ready risks, impacts, and recommended actions for the team.
Topic: Identify Data Needs
An AI team needs access to a customer dataset that includes PII to build a churn model. The data owner approves use for this project but wants to minimize exposure by ensuring each person can only access the specific tables/fields needed for their tasks, with access reviewed regularly and removed when no longer needed.
Which governance approach best matches this practice?
Best answer: C
What this tests: Identify Data Needs
Explanation: The practice described is an access-control strategy centered on least privilege: granting only the minimum necessary permissions for defined roles and routinely re-validating them. This directly reduces exposure to PII while still enabling project work. Regular review and timely removal of access are core to maintaining the control over time.
The core concept is enforcing least-privilege access control (often implemented via role-based access control) so people and systems can only view or manipulate the minimum data required for their responsibilities. In AI initiatives, this is especially important for datasets containing PII because it reduces the blast radius of misuse, mistakes, or compromised credentials while still supporting development and evaluation work.
In practice this commonly includes:
Other controls like de-identification, encryption, and data-sharing agreements are valuable, but they do not replace the need to control who can access what data.
It limits data exposure by granting only the minimum, job-aligned access needed and routinely validating that access remains appropriate.
Topic: Support Responsible and Trustworthy AI Efforts
In production, a loan-approval model’s overall AUC remains stable, but the approval rate and false-negative rate gap between two protected subgroups has steadily widened over the last month. Which term best describes this monitoring signal that should trigger a bias investigation and potential retraining or pause?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: A widening disparity between protected subgroups while overall performance remains steady indicates the model’s fairness characteristics are changing over time. This is specifically captured by the term fairness drift. It warrants investigation because it can signal emerging bias even when aggregate KPIs look healthy.
Fairness drift (also called bias drift) refers to a shift over time in fairness-related outcomes across groups, such as changes in disparate impact, approval rates, or error-rate gaps by protected subgroup. In the scenario, the model’s aggregate metric (AUC) stays stable, but subgroup gaps worsen, which is a classic sign that monitoring should focus beyond overall performance.
Practically, fairness drift triggers actions such as:
This differs from general drift concepts that describe changes in inputs or labels without explicitly indicating a growing group disparity.
Fairness drift is a change over time in group fairness metrics (e.g., disparate error/approval rates) despite stable aggregate performance.
Topic: Identify Business Needs and Solutions
A retail bank is rolling out an AI “agent assist” tool that suggests next-best actions during customer calls. A pilot shows acceptable accuracy, but adoption is low: agents say it disrupts their workflow and they fear the suggestions will be used for performance surveillance. Compliance also requires minimizing exposure of customer PII and being able to explain recommendations at a high level. The product owner wants the fastest path to sustained adoption without increasing privacy risk.
What should the AI project manager do next?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: Low adoption here is driven by workflow friction and fear of surveillance, not primarily by model accuracy. The best mitigation optimizes sustained adoption quickly by involving users in redesign, building trust with clear usage policy and transparency, and using change champions in a controlled pilot. This approach also maintains privacy safeguards and supports the required high-level explainability.
When users resist an AI solution, the fastest route to sustained adoption is to address the root barriers that block day-to-day use—typically workflow misfit, low trust, and misaligned incentives—while staying within privacy and governance constraints. In this scenario, agents perceive monitoring risk and experience disruption, so the next step is a change-and-integration intervention, not more model tuning.
This optimizes adoption and time-to-value without trading away PII protections or waiting for unnecessary perfect transparency.
This directly targets key resistance drivers (workflow fit and trust) while preserving privacy controls and meeting explainability needs through a controlled, supported rollout.
Topic: Identify Data Needs
A team is building a model to flag fraudulent insurance claims. Fraud occurs in about 0.5% of claims. To meet a deadline, the team proposes training and evaluating using a simple random sample of 3,000 recently labeled claims (about 15 fraud cases expected) with an 80/20 split, rather than expanding the sample size or using stratified sampling.
What is the most likely near-term impact of this decision?
Best answer: D
What this tests: Identify Data Needs
Explanation: Because fraud is rare, a small simple random sample yields very few positive examples for training and especially for the test set. That makes minority-class performance metrics (e.g., recall/precision) highly variable and sensitive to which few cases land in each split. The immediate outcome is unreliable evaluation and weak evidence for a go/no-go decision.
For robust training and evaluation, you need enough data volume—especially enough examples of the outcome of interest—to learn meaningful patterns and to estimate performance with acceptable uncertainty. With a 0.5% event rate, 3,000 claims provides only ~15 fraud cases total, and an 80/20 split leaves only ~3 fraud cases in the test set on average. That is too few to reliably measure fraud recall/precision or to compare models, so results will swing dramatically based on random partitioning.
Practical fixes include:
The key near-term consequence is unreliable model quality evidence, not downstream operational or drift effects.
With very few fraud examples, both training and test results will have high variance, making minority-class metrics unreliable.
Topic: Identify Data Needs
A lender is building an AI model to predict early loan default to reduce charge-offs. Data profiling found that default labels are missing for 28% of accounts due to a system migration, and key income fields have inconsistent definitions across two origination systems. The sponsor asks whether the project can still meet the target KPI and launch date.
What is the best next step?
Best answer: A
What this tests: Identify Data Needs
Explanation: After identifying major label and definition gaps, the immediate need is to translate those findings into business impacts and explicit decision options. An executive-ready recommendation connects data quality and availability risks to KPI confidence, schedule, cost, and risk exposure. That enables a deliberate go/no-go, rescope, or investment decision before further build work continues.
The core skill is communicating data readiness in a way that drives business decisions. With missing labels and inconsistent field definitions, projected model performance and fairness risks are uncertain, so continuing execution without a sponsor decision can waste effort or create misleading results.
Best next step is to package the findings into a leadership-facing recommendation that explicitly connects:
Once leadership selects a path, you can update the plan, success criteria, and downstream model/deployment activities accordingly. The key is decision-oriented communication, not immediate technical continuation.
Leadership needs a clear set of business-impact tradeoffs (delay, invest, rescope) tied to the data readiness findings to make a go/no-go decision.
Topic: Manage AI Model Development and Evaluation
A health insurer has a claims-triage model that meets agreed offline performance and fairness checks. The goal is to route low-risk claims faster, but the model will consume PII and must integrate with the existing claims management system and IAM. Leadership has low risk tolerance for outages or unauthorized access, and the target production date is in 4 weeks.
What is the BEST next action to support the go/no-go operationalization decision?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: Before a go/no-go decision, the team must confirm the solution can run safely and reliably in its target environment. That means validating infrastructure sizing and resiliency, security and access controls appropriate for PII, and end-to-end integration with upstream/downstream systems. A cross-functional readiness review provides evidence and accountable approvals within the 4-week timeline.
Deployment readiness is more than model quality; it is evidence that the solution can operate in production under real constraints. In this scenario, the model already meets offline performance and fairness criteria, but it will handle PII, must integrate with the claims platform and IAM, and the organization has low tolerance for outages and security incidents. The best next action is to run a structured operational readiness review with IT/SecOps and system owners to validate:
This produces concrete go/no-go evidence rather than assuming production will behave like offline evaluation.
A formal readiness review validates infrastructure capacity, security controls, and end-to-end integration before approving production go/no-go.
Topic: Identify Business Needs and Solutions
A manufacturing firm wants to reduce unplanned downtime by flagging machines likely to fail in the next 24 hours. It has three years of high-frequency sensor readings for all machines, but only a small number of confirmed failure events and inconsistent failure labels across plants. At a conceptual level, which AI approach best fits the problem and available data?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: The key constraint is limited, inconsistent labeled failure data despite plentiful sensor time series. An unsupervised or semi-supervised anomaly detection approach can model normal operating behavior and identify unusual patterns that often precede breakdowns. This matches the goal (early warning) without requiring large volumes of high-quality labels.
Model/approach selection should align to the decision to be made and the kind of training signal available. Here, the organization wants near-term failure risk alerts, but it lacks enough clean, consistently defined failure labels to support a reliable supervised classifier.
A better fit is to leverage the abundant sensor data to learn normal operating patterns and detect deviations:
A supervised classifier may become appropriate later if the team standardizes event definitions and accumulates sufficient, trustworthy labels.
With abundant sensor data but scarce, unreliable labels, anomaly detection can learn “normal” behavior and flag deviations likely to precede failures.
Topic: Operationalize AI Solution
An AI team is deploying a predictive maintenance model to 12 plants. Validation shows good performance for plants using Sensor Suite A, but the model has not been validated for plants using Sensor Suite B (different sampling rate), and results may be unreliable there. Operations leadership wants a single “go-live” announcement for all plants.
What is the best way to communicate deployment status and constraints to stakeholders?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: Because the model is only validated for Sensor Suite A, the key need is transparent communication that clearly limits authorized use and sets expectations. A deployment communication artifact like release notes can summarize what is live, what is not approved, and what stakeholders must do differently for Sensor Suite B plants. This reduces misuse risk and supports accountable operations.
Transparent deployment communication should state current deployment status (what is live), the validated operating envelope (where it is approved), and explicit constraints/limitations that affect decisions. In this scenario, the single decisive factor is the unvalidated Sensor Suite B condition, which creates a high risk of misapplication if a blanket go-live message is sent.
A strong deployment release note/bulletin should include:
Dashboards and deep dives can support the message, but they do not replace a clear, stakeholder-friendly statement of constraints and authorized use.
Release notes (or a deployment bulletin) explicitly communicate what is live, where it is approved to be used, and known limitations to prevent misuse.
Topic: Identify Data Needs
A people-analytics team is launching an AI model to predict employee attrition to target retention programs. The model would use HRIS data plus manager performance notes and employee survey comments.
Constraints:
What is the BEST next action?
Best answer: C
What this tests: Identify Data Needs
Explanation: Before accessing restricted HR text fields, the team must engage the data steward(s) and governance owners to confirm data ownership, permissible use, and required controls. This enables a compliant access request aligned to the privacy office’s documentation needs and the CHRO’s low risk tolerance. It also reduces rework risk within the 6-week pilot timeline.
When ownership and permitted use are unclear—especially for sensitive employee text like performance notes and survey comments—the next step is to engage the data stewards and governance/privacy owners. Their input determines who can authorize access, what the approved purpose and usage boundaries are (e.g., retention limits, aggregation requirements, prohibited decisions), and what documentation must be completed before data is released. In practice, this typically results in an approved data access request and a documented data usage agreement that aligns the pilot scope with stakeholder risk tolerance.
Key actions include:
Moving fast by bypassing governance often causes delays later (blocked access, forced scope changes) and increases trust and adoption risk.
This clarifies who can authorize access and what uses/controls are permitted before requesting restricted data.
Topic: Operationalize AI Solution
An organization deploys a new credit-risk model behind an API. Within hours, operations reports a spike in declined applications and asks the AI team to “quickly adjust the threshold in production” to reduce declines. The deployment plan includes a monitored rollout, a rollback procedure, and a required approval record for any production change.
Which CPMAI-aligned governance approach best matches how the project manager should handle this implementation issue while monitoring deployment execution?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: The right response is to treat the KPI spike as a governed production incident and use the preapproved deployment controls. That means investigating with monitoring data, applying changes through formal change control, and using rollback if necessary to protect customers and the business. This preserves auditability and accountability during deployment execution.
During AI solution deployment, monitoring often reveals unexpected performance or impact issues (e.g., business KPIs moving in the wrong direction). CPMAI practice is to resolve these issues through operational governance, not ad-hoc production edits. In this scenario, the organization already defined a monitored rollout, a rollback procedure, and a required approval record for production changes.
A governance-preserving response is to:
The key takeaway is that speed in production must be balanced with controlled, auditable execution.
It resolves the issue through the approved, auditable deployment path (monitor, rollback/hotfix via change control) without bypassing governance.
Topic: Manage AI Model Development and Evaluation
You are preparing a go/no-go recommendation for a computer-vision model that detects surface defects on manufactured parts. The team reports 96% F1 on the validation set, but all training/validation images came from Plant A. Operations now wants to deploy the model to Plant B, where the camera model, lighting setup, and product mix differ.
What should you ask for FIRST to assess robustness and generalization risk before deciding go/no-go?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: High validation performance on Plant A data does not demonstrate the model will generalize to Plant B. The fastest way to assess robustness is to obtain representative Plant B data and evaluate performance across expected variations (camera, lighting, product mix) and likely edge cases. This directly tests whether distribution shift will create unacceptable failures in the new operating conditions.
Robustness and generalization are primarily threatened by distribution shift: when the operational data (Plant B) differs from the data used to train and validate (Plant A). Before a go/no-go decision, you should verify how the model behaves under the new conditions by obtaining a representative Plant B sample and running an evaluation that includes performance by slices (e.g., product types, lighting conditions, camera angles) and targeted edge cases (rare defect types, borderline defects). This provides evidence about likely failure modes and whether additional data collection, recalibration, or retraining is needed prior to deployment. Governance artifacts and deployment details matter, but they do not answer the core question of whether the model will work reliably in Plant B.
Generalization risk is driven by data shift, so you first need target-environment data to test performance across Plant B conditions.
Topic: Identify Data Needs
A bank is starting an AI initiative to auto-preapprove personal loan applicants. The proposed training data combines customers’ transaction histories with third-party demographic attributes that have not been used before for credit decisions. The model’s output would automatically approve/decline most applications, with minimal human review.
What is the best next step to address data usage privacy requirements and capture the outcome?
Best answer: D
What this tests: Identify Data Needs
Explanation: Because the solution introduces a new purpose for personal data and drives largely automated credit decisions, the privacy risk is inherently elevated. A privacy impact assessment is the appropriate artifact to evaluate necessity, proportionality, risks, and mitigations before data access/processing proceeds. The PIA output should be documented with decisions, residual risk, and accountable sign-offs.
A privacy impact assessment (PIA) is warranted when planned data processing is likely to create elevated privacy risk—commonly when using personal data for a new purpose, combining datasets (especially with third-party sources), processing at scale, using sensitive attributes, or enabling automated decisions that materially affect individuals. In this scenario, the decisive factor is the high-impact, mostly automated credit decision combined with a new use of customer transaction data and third-party demographic attributes.
The PM should trigger the PIA with privacy/compliance stakeholders and ensure outcomes are documented, such as:
This creates auditable evidence that privacy risks were assessed and managed before implementation.
This is a new, high-impact use of personal data with automated decisions, so a PIA is needed and its findings must be recorded and signed off.
Topic: Identify Business Needs and Solutions
A retail bank has approved an AI use case to reduce call-center wait times using an intent classification model. A data science team built a promising prototype, and the business sponsor wants to deploy it within 6 weeks. However, there is no agreed model ownership, no documented approval process for model changes, and operations reports they lack staff to monitor model performance or handle escalations.
What is the best next step?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: The main blocker is not model performance but organizational readiness to operate and control the AI solution. Establishing governance, clear accountability, and sufficient monitoring/change capacity is a prerequisite for responsible deployment. A focused readiness assessment identifies gaps and creates an actionable plan to address them before go-live.
AI feasibility includes confirming the organization can sustain the solution after launch, not just that a prototype works. In this scenario, missing ownership, approval controls for model changes, and operational monitoring capacity are readiness gaps that increase risk and jeopardize adoption. The next step is to assess readiness and implement the minimum operating model needed for production.
Tooling and model tuning can help later, but they do not substitute for governance and operational capability.
Before production, the organization needs defined ownership, governance, and operational capacity to run and control the AI safely.
Topic: Identify Data Needs
A retailer is building a churn-prediction model using loyalty-program transactions combined with third-party mobile location data. The team can technically access the data, but the privacy office requires proof that the planned use, sharing, retention, and access controls comply with internal policy and applicable data protection requirements before any data is provided to the data scientists.
Which evidence best validates compliance readiness for data usage in this situation?
Best answer: A
What this tests: Identify Data Needs
Explanation: Compliance readiness is best validated by evidence that the intended processing is permitted and controlled, not by data quality or project progress outputs. An approved privacy impact assessment (or equivalent) documents the lawful basis, purpose limitation, sharing constraints, retention, and required safeguards, creating an auditable trail that the organization can rely on before access is granted.
To ensure data usage complies with policies and data protection requirements, you need evidence that directly ties the proposed processing to allowed purposes and required safeguards. In this scenario, combining loyalty transactions with third-party location data increases privacy risk and often introduces stricter constraints on consent/lawful basis, sharing, retention, and access.
The strongest validation artifact is an approved privacy impact assessment (or equivalent governance gate) that explicitly documents:
Operational metrics (volume/freshness) and delivery artifacts (feature lists) may be useful, but they do not demonstrate that processing is permitted under policy and data protection obligations.
This provides auditable, cross-functional sign-off that the specific intended use and controls meet policy and data protection requirements.
Topic: Operationalize AI Solution
A team deploys a new ML model that auto-approves/declines consumer credit applications in real time. The deployment plan includes monitoring and alert thresholds, but it omits a defined rollback procedure and contingency workflow (e.g., failover to the prior rules-based service).
Two hours after go-live, alerts show a sharp increase in false declines tied to a data pipeline change, and customer support calls are rising. What is the most likely near-term impact of this omission?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: Rollback and contingency plans are operational controls that limit blast radius when production behavior becomes unacceptable. Because the team lacks a pre-approved way to revert or fail over, the system continues producing bad outcomes while the team diagnoses and coordinates a recovery. The immediate result is prolonged customer harm and operational disruption.
A rollback procedure and contingency plan (often including failover to a previous version or a safe manual/rules-based path) are core parts of a deployment plan because production issues must be handled in minutes, not days. In this scenario, monitoring correctly detects unacceptable behavior (false declines) linked to an upstream data change, but the team has no predefined, tested, and authorized way to quickly restore a known-good state.
Near-term, this typically leads to:
The key takeaway is that monitoring without rollback/contingency converts a detectable issue into a prolonged production incident.
Without a predefined rollback/failover, the team cannot quickly revert service, prolonging incorrect declines and customer impact.
Topic: Manage AI Model Development and Evaluation
Your team is building an ML model to predict delinquency risk for a lender so the collections team can prioritize outreach. You have 12,000 labeled accounts (8% delinquent) and multiple monthly records per customer. The model will score future months, and compliance requires a reproducible evaluation with no data leakage. A quick random split shows excellent AUC, but performance varies widely when rerun.
What is the BEST next action to reduce overfitting risk while selecting a model within a two-week deadline?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: The key risk is leakage and optimistic estimates caused by mixing the same customer across folds and mixing past/future months. A grouped, time-aware cross-validation approach evaluates generalization in a way that matches how the model will be used and makes results stable and reproducible. Keeping a final holdout test set untouched preserves an unbiased check before release.
To reduce overfitting during model selection, the validation design must reflect the data-generating process and deployment conditions. Here, there are repeated records per customer and a forward-in-time scoring use case; random splits can leak customer-specific patterns and future information into training, inflating AUC and creating unstable results.
A sound approach is:
This produces a defensible, reproducible comparison and minimizes leakage-driven overfitting.
This aligns validation to the deployment setting (future months, repeated customers) and prevents leakage while comparing models on consistent folds.
Topic: Identify Business Needs and Solutions
You are building an ROI comparison for an AI-based invoice processing solution versus the current manual process. Finance requires ROI to be based on cashable benefits. IT confirms the annual run cost is fixed under an enterprise agreement, and the vendor states labeling effort is included in the one-time implementation cost. Finance also provides the standard discount rate.
Exhibit (inputs):
Which assumption should you validate first because it most materially affects whether the ROI stays positive?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: The projected benefit comes primarily from reducing staffing from 12 to 6 FTE, which drives most of the savings over the 3-year horizon. If those labor savings cannot be realized as cashable savings (e.g., headcount cannot be reduced or avoided), the ROI outcome can flip even if the solution works technically. Validating this assumption is therefore the most material to the cost-benefit comparison.
A cost-benefit comparison for ROI is only as valid as its most sensitive assumptions. Here, the largest benefit is the labor reduction: 6 FTE \(\times\) $70,000/year \(=\) $420,000/year, or about $1.26M over 3 years. Total cost over 3 years is $250,000 \(+\) $100,000/year \(\times 3\) \(=\) $550,000, so the business case depends heavily on realizing most of that labor benefit as cashable savings.
To validate the material assumption, confirm with operations/HR how the staffing reduction will be achieved (e.g., attrition plan, reassignment that avoids new hires, contractor reduction) within the ROI horizon. If savings are only “time freed” with no cost takeout, ROI is overstated.
Other inputs matter, but they are less likely to swing the decision given the stated constraints.
The ROI is dominated by labor savings, so if headcount reduction cannot be realized as cash savings, the investment can quickly turn negative.
Topic: Identify Business Needs and Solutions
A health insurer is introducing an AI-assisted prior-authorization triage tool that recommends approve/deny and provides reason codes. Constraints: protected health information must remain internal, the tool must be explainable to clinicians, and operations cannot tolerate major workflow disruption. Early demos show clinician skepticism and the call center supervisor warns that “people will work around it” unless it fits their daily process. What should the project manager do next to best optimize adoption while reducing integration risk?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: A stakeholder impact assessment followed by a controlled pilot directly targets adoption and integration risk by identifying who is affected, what must change in their workflow, and what support they need. Role-based training and a feedback loop help clinicians understand and trust explainable recommendations while keeping PHI handling and operational disruption within constraints.
The core change-management requirement in AI adoption is to anticipate how work will change for each impacted stakeholder group and to proactively manage readiness, not just deliver a model. In this scenario, clinicians, supervisors, QA, IT operations, compliance/privacy, and patient services are all affected by new decision support, explainability needs, and PHI constraints. The best optimization is to reduce workarounds and disruption while building trust.
A practical next step is to:
This approach optimizes adoption and integration quality without violating the stated privacy and explainability constraints, unlike mandate-only or big-bang approaches.
This identifies impacted roles and embeds communications, process updates, and feedback into a low-disruption pilot that supports explainable, compliant adoption.
Topic: Identify Data Needs
You are leading data readiness for an AI model that will auto-approve “low-risk” consumer loans. The sponsor wants production in 12 weeks and prohibits collecting new sensitive attributes (e.g., race). A data assessment finds:
- Training labels: repayment outcome exists only for approved loans
- Coverage: 70% of records come from two urban regions
- Drift risk: underwriting policy changed 6 months ago
- Data quality: 4% missing income fields (not missing at random)
To best optimize time-to-market while reducing performance, fairness, and operational risk, what should you communicate and recommend to the steering committee?
Best answer: C
What this tests: Identify Data Needs
Explanation: The data limitations create predictable risks: selective labels can inflate apparent performance and hide errors on rejected applicants, skewed regional coverage can create disparate outcomes, and the policy change increases drift risk. A phased, human-in-the-loop rollout with explicit guardrails and monitoring delivers faster value while making residual risk visible and manageable without violating constraints.
When labels exist only for approved loans, you have a “selective labels” problem: offline metrics may look strong while real-world errors increase when the model changes who gets approved, creating both performance and fairness uncertainty. Skewed geographic coverage means the model may generalize poorly and disproportionately misclassify underrepresented groups, even without collecting protected attributes. Recent underwriting changes add drift risk (the model may learn patterns that no longer hold), and non-random missing income can systematically bias predictions and downstream decisions.
The most decision-useful communication is a clear statement of:
A constrained decision-support launch with phased rollout and monitoring optimizes speed while bounding risk until data and evaluation gaps are closed.
This communicates how the data limits accuracy and fairness confidence, and reduces operational risk via constrained use, monitoring, and a data improvement plan.
Topic: Identify Data Needs
An AI team is struggling to reproduce a prior model result during handoff to a new data scientist. Review the artifact excerpt.
Exhibit: Experiment run record (excerpt)
Experiment: churn_v2
Run ID: 84f2
Code reference: "churn_model_final.ipynb" (emailed)
Training data: s3://data/churn/customers.csv
Data snapshot ID: N/A
Environment: "python 3.10" (no lockfile)
Notes: "tuned features; see my notebook"
Which next action is best supported by the exhibit to improve reproducibility?
Best answer: A
What this tests: Identify Data Needs
Explanation: The exhibit shows no reliable linkage between the training run and a specific code version, data snapshot, or dependency set. Reproducibility requires each experiment/run to reference immutable identifiers (e.g., commit hash and data snapshot ID) and a captured environment (lockfile) in a shared workspace. Establishing these controls enables consistent reruns and auditable collaboration during handoffs.
Reproducibility in AI work depends on being able to re-run the same experiment with the same code, the same data, and the same execution environment. The exhibit shows the code was shared via email, the dataset path has no snapshot identifier, and the environment has no dependency lockfile—each is a common failure mode in collaborative AI workspaces.
Best-practice next action is to implement lightweight, enforceable standards such as:
This directly addresses what the artifact reveals is missing, whereas adding narrative or changing scope does not make the prior result reproducible.
The record lacks immutable code, data, and environment references, preventing others from recreating the exact run.
Topic: Support Responsible and Trustworthy AI Efforts
In an AI product, engineering teams capture detailed model-inference logs for debugging. Which term refers to systematically removing or masking sensitive values (for example, PII and API keys) from logs so the logs remain useful but do not expose secrets if accessed?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Log redaction (sanitization) is the practice of preventing sensitive data from being written to logs by masking, removing, or truncating specific fields. This reduces privacy and security risk because logs are widely replicated, searched, and retained. It complements encryption controls for data at rest and in transit.
The core concept is secure data handling across the full data lifecycle, including observability artifacts like logs. Log redaction means designing logging so sensitive values (PII, credentials, tokens, full request/response bodies) are never persisted in plaintext—typically by masking or omitting specific fields and using safe identifiers for correlation.
Practical controls often include:
Encryption at rest and TLS are necessary, but they do not prevent sensitive values from being recorded and proliferating throughout logging systems.
Log redaction masks or removes sensitive fields from logs while retaining operational context.
Topic: Operationalize AI Solution
A retailer is launching an AI fraud-scoring API that is called for every online checkout. Constraints: RTO is 15 minutes and RPO is 5 minutes; transaction data contains PII and must remain in the single approved cloud region; go-live is in 4 weeks; and the risk committee requires evidence of recoverability before launch.
As the AI product manager, what is the BEST next action to plan backup and disaster recovery for this critical AI-dependent service?
Best answer: B
What this tests: Operationalize AI Solution
Explanation: With strict RTO/RPO and a requirement to demonstrate recoverability before launch, the next step is to implement and validate backups and restoration procedures for the full AI service in the allowed region. A timed recovery test produces objective evidence that the plan works under realistic conditions. This also avoids violating data residency constraints for PII.
Backup and disaster recovery for AI-dependent services must cover more than the trained model; it also includes the data stores and operational dependencies needed to restore service within agreed RTO/RPO. In this scenario, the organization must keep PII in a single approved region and provide evidence before go-live, so the most effective next action is to establish automated, in-region backups and validate recovery through a timed test.
A practical approach is to ensure recoverability for:
Testing the restore confirms the end-to-end recovery path and produces auditable proof, whereas replication outside the approved region or partial backups won’t satisfy the constraints.
This directly proves the service can meet RTO/RPO while keeping PII in-region and producing audit-ready evidence before go-live.
Topic: Identify Business Needs and Solutions
A bank is piloting an AI model to prioritize investigation of potentially fraudulent transactions. Investigators work inside a legacy case-management system that can only import a nightly CSV file and cannot call external APIs; any interface change requires a 6‑month security review. The pilot must start in 8 weeks with minimal disruption to investigator workflow.
Which integration approach is MOST appropriate?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: The deciding factor is the operational constraint: the case-management system only supports nightly CSV imports and new interfaces take too long to approve. A batch scoring pipeline that outputs a CSV aligns to the existing workflow and timeline while still enabling a controlled pilot. More invasive integrations would delay delivery and increase operational risk.
Integration planning should start from real operational constraints (interfaces available, change-control lead times, workflow disruption tolerance) and then shape the technical design to fit. Here, the case tool cannot call APIs and any interface change exceeds the pilot timeline, so the solution must integrate through the existing nightly import.
A practical plan is to:
Approaches that require new application interfaces or major platform replacement conflict with the stated constraints and jeopardize timely adoption.
It fits the legacy system’s file-based constraint and enables adoption without waiting for new API approvals.
Topic: Support Responsible and Trustworthy AI Efforts
A bank is preparing to roll out an AI model that flags potentially fraudulent transactions for analyst review. Business stakeholders want a simple “confidence score” shown in the case-management tool, but the data science team warns that scores are not well-calibrated for all customer segments and that some merchant categories are underrepresented in training data.
Which metric/evidence/artifact best validates the team is ready to communicate transparency limitations and assumptions without overstating model certainty?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Readiness to avoid overstating certainty is best evidenced by clear, user-facing documentation that states intended use, assumptions, and limitations, and that explains how to interpret outputs. A model card-style artifact ties these transparency constraints to actual usage and decision-making, including where confidence scores can mislead. This is stronger governance evidence than performance, explainability visuals, or adoption outcomes alone.
The core need is transparency that prevents misuse: users must understand what the model is for, where it can fail, and what the output does (and does not) mean. In this scenario, uncalibrated scores across segments and underrepresented merchant categories are explicit limitations that must be communicated in a controlled, auditable way before deployment.
The strongest validation artifact is one that:
Pure performance, explainability plots, or pilot productivity do not ensure stakeholders will avoid overconfidence or misinterpretation in production.
This directly operationalizes transparent communication by documenting assumptions/limits and preventing users from over-trusting raw scores.
Topic: Manage AI Model Development and Evaluation
A team is preparing data for a customer churn model. Numeric features include monthly spend (different currencies by region), tenure (days), and call minutes. Several candidate algorithms are scale-sensitive, and the solution must be reproducible and auditable.
Which action is INCORRECT when managing normalization/standardization decisions and documenting their purpose and impact?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: Normalization/standardization choices must be deliberate, consistent, and documented so results can be reproduced and audited. The key is to treat transformations as part of the model artifact: version the logic and parameters, and apply the same fitted parameters across validation, test, and production. Allowing ad hoc, undocumented scaling decisions undermines traceability and makes performance comparisons unreliable.
In AI model development, normalization/standardization is not just a technical tweak; it is a controlled decision that affects model behavior, comparability across experiments, and downstream monitoring. Because the organization requires reproducibility and auditability, you should (1) select the transformation approach based on data characteristics and algorithm sensitivity, (2) fit transformation parameters on the training data only, and (3) apply the same versioned code and fitted parameters consistently to validation/test and production scoring.
Documenting the purpose and impact typically includes what was transformed, why it was needed (e.g., unit/currency differences, scale-sensitive models), where it is applied in the pipeline, and what performance or stability effects were expected/observed. The anti-pattern is allowing transformation choices to vary across runs without documentation, which prevents reliable evaluation and governance.
Undocumented, inconsistent transformations break reproducibility, comparability, and auditability.
Topic: Identify Data Needs
A bank deployed an AI model to predict which customers need proactive support. After a successful pilot, business users report low trust and adoption because recommendations feel “two weeks late.” Monitoring shows a sharp input-data drift starting the week after a CRM migration. The latest production dataset has 20% nulls in last_contact_date, and account_status values no longer match the training data dictionary. A separate incident review found a new CRM export briefly included an internal notes field containing sensitive PII.
What is the most likely underlying cause?
Best answer: A
What this tests: Identify Data Needs
Explanation: The evidence points to a data-source problem introduced by the CRM migration: freshness issues (“two weeks late”), reliability issues (null spikes), and relevance/definition changes (account_status code set mismatch). These gaps commonly drive sudden drift and performance drops even when the model code is unchanged. The brief PII incident further signals weak source controls and unstable extracts during the migration.
In root-cause diagnosis for deteriorating AI performance, a sudden drift that aligns with an upstream system change is a strong indicator of data-source reliability/freshness/relevance issues. Here, the CRM migration coincides with drift onset, and multiple clues show the production inputs no longer match what the model was trained to expect: increased missingness, stale last_contact_date, and changed account_status semantics versus the documented training definitions. Those gaps can make predictions inaccurate and untimely, which directly explains low user trust and adoption. The PII incident is consistent with unstable exports and insufficient source governance/controls during the migration, increasing the likelihood that the feed changed in content and meaning.
Key takeaway: when performance drops immediately after a data-source change, validate lineage, refresh cadence, field definitions, and missingness before changing the model.
The timing of drift plus stale timestamps, new nulls, and changed codes indicate a reliability/freshness/relevance gap in the source after the migration.
Topic: Identify Business Needs and Solutions
A team is preparing to pilot an AI-powered customer support assistant that uses retrieval over internal policy documents and exposes a public chat interface. The risk workshop identifies threats such as prompt injection causing the bot to reveal restricted content, automated querying to extract sensitive training data, and spoofed API calls that could abuse the service. Which CPMAI-aligned governance approach best matches this situation?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: The described risks are cybersecurity threats specific to AI interfaces (prompt injection, model/data extraction, and API abuse). The best-matching governance approach is to use security threat modeling plus adversarial testing (often via red teaming) to identify attack paths and verify mitigations before exposing the system publicly. This aligns risk assessment to likely misuse and attacker behavior, not just performance outcomes.
This scenario calls for an AI-focused cybersecurity risk assessment, not a general project plan activity. When an AI system is exposed via an interface (chat or API) and connected to internal knowledge sources, teams should map likely threat actors, assets (restricted documents, training data, credentials), and attack paths (prompt injection, data/model extraction, endpoint abuse). Then they should validate mitigations through adversarial testing/red teaming before pilot to reduce the chance of data exposure or model misuse.
Practical outputs often include:
The key takeaway is to treat AI systems as attack surfaces and assess adversarial risks early, before deployment decisions are finalized.
These risks are best addressed by AI-specific threat modeling (misuse/abuse cases) and adversarial testing to validate controls before pilot release.
Topic: Identify Data Needs
You are leading an AI initiative to prioritize potentially fraudulent insurance claims for investigation. The sponsor requires evidence the model will be reliable across product lines, not just overall, and you cannot use third-party data due to privacy constraints. A pilot must start in 6 weeks.
Exhibit: Labeled history (last 12 months)
| Product line | Total claims | Confirmed fraud labels |
|---|---|---|
| A | 80,000 | 220 |
| B | 60,000 | 190 |
| C | 70,000 | 170 |
| D | 40,000 | 18 |
Which approach best optimizes time-to-market while ensuring the training and evaluation data will support an acceptably reliable assessment of model performance?
Best answer: B
What this tests: Identify Data Needs
Explanation: Reliable model evaluation requires enough labeled examples in each segment where performance must be trusted. With only 18 positive labels in product line D, any segment-level precision/recall estimate would be too unstable for decision-making. Scoping the pilot to product lines with sufficient labels and executing a focused labeling plan best balances speed with defensible reliability.
To assess whether data is sufficient, focus on whether you can both train and validate performance with acceptable stability for the populations that matter (here, each product line). While lines A–C have hundreds of positive fraud labels, line D has only 18, making product-line-specific evaluation highly uncertain and easy to overfit.
A reliability-optimizing, time-aware approach is to:
This meets the 6-week pilot constraint without presenting unsupported claims about segment-level performance.
It delivers a timely pilot where each included segment has enough labeled positives to evaluate reliably, while addressing the clear insufficiency in line D before scaling.
Topic: Identify Business Needs and Solutions
When building an AI business case, which term describes the full lifecycle cost of the solution (e.g., data acquisition, model development, deployment, monitoring, retraining, support, and eventual retirement), not just the initial build expense?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: Total cost of ownership (TCO) is the cost-side view required to make ROI credible because it includes ongoing AI costs like monitoring and retraining. AI solutions often incur substantial post-deployment effort to maintain performance and manage drift. A business case that ignores these lifecycle costs will overstate financial value.
The core concept is separating value from cost so AI outcomes can be translated into measurable financial results. TCO is the comprehensive, end-to-end cost of delivering and sustaining an AI solution across its lifecycle, including one-time build costs and recurring costs such as infrastructure, data operations, MLOps, monitoring, incident response, and model refresh/retraining. Using TCO helps ensure ROI and payback estimates reflect what it truly takes to operate the solution at the required service level and risk posture.
A practical approach is:
Confusing TCO with only build cost (or only run cost) is a common reason AI business cases fail during operationalization.
TCO captures end-to-end costs across building, running, and sustaining the AI capability over its lifetime.
Topic: Identify Business Needs and Solutions
An AI product team completes an initial risk assessment for a customer-service chatbot. The top risks are (1) PII exposure through prompt injection, (2) training data access delays, and (3) low stakeholder trust if answers are not explainable.
The AI project manager proposes adding a data-masking work package to scope, inserting a security/red-team review before pilot, sequencing data access approvals as a critical-path activity, and adding a formal go/no-go checkpoint tied to agreed risk thresholds.
Which AI project management principle or governance approach best matches this practice?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: The practice uses a risk-based planning approach: high-impact risks drive added scope items, schedule sequencing, and explicit governance checkpoints. This makes mitigation executable (work packages and dependencies) and creates clear decision points (go/no-go) aligned to risk thresholds. It directly operationalizes risk priority into the integrated plan.
Prioritizing AI project risks is only useful if the results change how the project is planned and governed. A risk-based approach takes the highest exposure items (likelihood " impact) and converts responses into concrete plan elements: scoped mitigation work, scheduled activities and dependencies, and governance checkpoints where the team can make a deliberate go/no-go decision based on agreed criteria.
In practice this means:
Simply tracking risks without integrating responses into the plan reduces accountability and often delays mitigation until late-stage failures.
It prioritizes risks and embeds mitigations as planned work and decision checkpoints rather than treating risk as a separate log.
Topic: Identify Business Needs and Solutions
You are defining success criteria for an AI model that will automatically approve or decline consumer credit applications. The governance board has classified the use case as high risk due to potential customer harm and regulatory scrutiny.
Exhibit: Model card (draft excerpt)
Intended use: Auto-approve/auto-decline credit apps
Risk tier: High
Primary metric proposed: Accuracy >= 85%
Secondary metric proposed: AUC >= 0.75
Acceptance thresholds: Same for all customer segments
Fairness metrics: TBD
Cost of errors: Not documented
Based on the exhibit, what is the best next action to define appropriate performance metrics and acceptance thresholds?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: Because the model will make high-risk automated credit decisions, success criteria must go beyond aggregate accuracy and AUC. The team should define acceptance thresholds that reflect the business and harm of different error types and ensure performance is acceptable across relevant customer segments. Fairness thresholds and documented error-cost tradeoffs are needed before governance approval.
For high-risk use cases, acceptance thresholds must be tied to the decision impact and the cost of mistakes, not just overall discrimination metrics like accuracy or AUC. The exhibit shows missing definition of error costs and fairness, plus a single threshold applied to all segments—both are misaligned with a high-risk, automated approve/decline workflow.
A risk-aligned definition of “good enough” typically includes:
The key is to set measurable, testable acceptance gates that reflect the use case’s risk tier and stakeholder risk tolerance.
High-risk automated decisions require explicit, risk-based acceptance thresholds (e.g., FNR/FPR tradeoffs, calibration, and fairness) rather than only global accuracy/AUC.
Topic: Identify Data Needs
Your team is building a customer retention model using data from CRM and billing systems. A key feature depends on the definition of “active customer,” and different departments use different rules; in addition, one requested field may contain regulated personal data, so access must follow formal governance. To resolve data questions quickly without rework, what is the best way to define communication and decision-making paths with SMEs?
Best answer: A
What this tests: Identify Data Needs
Explanation: When data definitions conflict and regulated data access is possible, the team needs explicit decision rights and a governance-aligned escalation path. A data-question intake/triage process plus a RACI that names the accountable data owner/steward prevents endless debate and ensures the approved definition and access decisions are recorded and reusable.
The core need is to resolve data questions efficiently by pre-defining who provides input, who decides, and how exceptions escalate—especially when governance constraints apply. In this scenario, conflicting definitions of “active customer” and the possibility of regulated personal data mean informal consensus or “whoever answers first” will create rework and audit risk. A lightweight intake log (single source of truth for questions, decisions, and due dates) paired with a RACI assigns clear accountability to the business/data owner (decision authority) and uses data stewards/custodians and SMEs for consultation. An explicit escalation path to the data governance body ensures sensitive-field access and definitions are approved through the required control point. The key takeaway is to match the decision path to the highest-governance constraint, not to convenience.
This creates clear decision rights and a fast triage/escalation route aligned to governance constraints.
Topic: Identify Business Needs and Solutions
An insurer deployed an ML model to triage incoming claims into “fast-track” vs “complex.” After 8 weeks, monitoring shows stable input distributions (no drift), offline metrics remain on target, fairness checks show no new bias signal, and there have been no privacy/security incidents. Users completed training, yet supervisors frequently override recommendations and leadership says “we can’t tell if this is delivering value” because the only documented target was “AUC -> 0.85.”
What is the most likely underlying cause?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: The scenario shows technical health is stable (no drift, no incidents, no new bias) but leaders cannot assess value because success was defined only as a model metric. Defining business impact KPIs with baselines and explicit post-deployment thresholds (e.g., cycle time, cost, throughput, quality) is necessary to demonstrate benefits and align operations to the expected outcomes.
Business success criteria for AI must translate model outputs into measurable operational and financial impact after deployment. Here, the team documented only a technical metric target (AUC), so leadership lacks agreed, outcome-based KPIs and thresholds to judge whether triage improves claims processing and to reinforce adoption. With no drift, no incidents, and training completed, the most plausible root cause is mis-specified success criteria: missing baselines, target thresholds, and measurement plan for value (e.g., reduction in average cycle time, increased straight-through processing rate, lower cost per claim, fewer rework loops). When those are defined up front and tracked post-launch, stakeholders can validate value and resolve “override” behavior through aligned goals rather than debating technical performance.
Key takeaway: technical model metrics are necessary but not sufficient to prove business value.
The team set a technical threshold (AUC) but did not define post-deployment business outcome metrics and targets to demonstrate value and guide adoption.
Topic: Support Responsible and Trustworthy AI Efforts
A project team is defining a secure end-to-end data handling procedure for an AI solution that uses customer interaction transcripts. The procedure specifies: (1) encrypt data in transit and at rest from collection through model training and inference, (2) restrict access by role with least privilege and maintain audit logs, (3) store only the minimum necessary fields and de-identify data before feature engineering, and (4) retain training and inference data only for a defined period with automated deletion. Which governance approach best matches this practice?
Best answer: D
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: The described procedure defines safeguards that span the full data lifecycle: protecting data during collection, training, and inference, limiting access, minimizing exposure through de-identification, and enforcing retention/deletion. That is a privacy-and-security-by-design approach focused on secure end-to-end data handling rather than a model quality or documentation practice.
Secure end-to-end data handling is a data governance approach that applies privacy and security controls across every lifecycle stage where data is created, moved, transformed, used, and stored. In the scenario, encryption protects confidentiality during transit and storage; least-privilege access with audit logs supports accountability and reduces insider risk; minimization and de-identification reduce exposure of sensitive attributes before they enter pipelines; and defined retention with automated deletion limits long-term risk and supports policy compliance. This is best characterized as data lifecycle security with privacy-by-design, because the controls are embedded into the process from collection through training and inference, not added after the model is built.
It applies security controls, minimization, and retention rules across collection, training, inference, and deletion.
Topic: Operationalize AI Solution
An insurer has deployed an AI model to prioritize claims for fraud investigation. The business objective is to reduce fraud losses while maintaining customer experience and staying within a privacy policy that prohibits storing raw free-text claim notes in monitoring logs.
In production, the model’s precision has stayed around 0.82, but the special investigations unit (SIU) backlog has doubled and customer complaints about delayed payouts have increased. The team’s operational dashboard currently tracks only precision.
Which next step best optimizes business value and risk reduction while satisfying the stated constraints?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: The current KPI is misaligned: stable model precision can coexist with worsening business outcomes when capacity and customer-impact measures are not tracked. The best response is to redefine success measures to reflect end-to-end value (losses avoided) and operational/customer guardrails (backlog and delays). Monitoring must also respect the stated privacy constraint by using derived or aggregated signals instead of raw text logs.
Operational AI metrics should connect model behavior to the business outcome and the operating environment. Here, tracking only precision hides the real problem: the system is creating downstream workload and customer harm (backlog and payout delays), so the monitored KPIs are misaligned with the stated objective.
A better KPI set balances:
These measures should be monitored using privacy-safe logging (aggregations, derived labels, limited retention) consistent with the restriction against storing raw free-text notes. The key takeaway is to manage to outcome-and-guardrail KPIs, not a single offline-style model metric.
It realigns monitoring to business outcomes and operational constraints (capacity and customer impact) without violating the privacy restriction on raw text logging.
Topic: Identify Business Needs and Solutions
An AI team is piloting a customer-churn model using regulated customer data. The pilot shows unstable performance because different team members are training on different CSV extracts, adoption is low because business users have not received any demos, and a privacy incident occurred when a contractor emailed a dataset with PII to get help “since they couldn’t access the approved workspace.”
The project manager learns the contractor’s background check is complete, but no role-based access, approved development environment access, or data access path was provisioned as part of onboarding.
What is the most likely underlying cause?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: The clues point to preventable operational breakdowns: lack of role-based access, no approved development environment, and no compliant data access path for a cleared contractor. That gap predictably drives workarounds (emailing PII), inconsistent data copies, and slowed delivery, which in turn harms model consistency and user uptake.
The core issue is inadequate onboarding and access provisioning, which should enable people to work quickly while enforcing least-privilege and approved data-handling controls. In the scenario, a cleared contractor still lacked an approved workspace and authorized data access, so the team exported ad hoc CSVs and used insecure transfer methods, creating version inconsistency (performance variability) and a privacy incident.
Effective onboarding/access provisioning typically includes:
The symptoms described are consequences of access gaps, not primarily a modeling or KPI definition problem.
Missing role-based access and an approved workspace forced shadow data handling, causing both productivity delays and a privacy breach.
Topic: Identify Business Needs and Solutions
You are drafting the AI solution approach for an e-commerce checkout fraud screening use case. Review the excerpt from the requirements artifact.
Performance: fraud decision <=150 ms (p95) at checkout
Reliability: 99.95% availability; degrade gracefully on failure
Scalability: peak 2,000 transactions/sec (holiday); avg 200/sec
Constraint: PII must remain in-region
Fallback: if AI unavailable, route to rules engine within 50 ms
Which solution approach should you capture because it is most directly driven by these non-functional requirements?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: The exhibit specifies strict real-time latency at checkout, high peak throughput, and in-region processing, which materially constrain the serving architecture. The appropriate draft solution is an online inference service designed and tested to meet the p95 latency target under peak load while keeping PII in-region.
Non-functional requirements (NFRs) like latency, reliability, and scalability are design drivers, not “nice-to-haves.” Here, the 150 ms p95 constraint at checkout rules out offline/batch patterns and pushes toward an online inference path with performance budgets across preprocessing, model inference, and network hops. The peak 2,000 tx/sec requirement means capacity planning and load/performance testing must be part of the solution design, not deferred. The in-region PII constraint limits where the service can run and which integrations are feasible. Separately, the graceful-degradation and 50 ms fallback requirement implies a resilient operational design, but the first-order architectural choice still needs to be an in-region, low-latency, high-throughput real-time service.
Key takeaway: capture NFR-driven serving architecture decisions early because they constrain model size, hosting, and integration patterns.
The latency and peak-throughput targets require an online, in-region serving design validated at scale.
Topic: Identify Business Needs and Solutions
A retail bank is preparing to launch an AI model that prioritizes mortgage applications for underwriters. The organizational objective is to reduce time-to-decision by 20% while maintaining fair-lending expectations. The bank’s AI governance requires documented evidence of bias testing and an approved set of acceptance thresholds before production go-live.
Which metric/evidence best validates readiness to launch in this context?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: The best validation combines evidence that the solution achieves the stated business outcome and meets governance acceptance criteria. A pilot-based measure of time-to-decision improvement demonstrates organizational value in the real workflow. Pairing it with fairness results compared against preapproved thresholds satisfies governance expectations for responsible deployment.
Success criteria for AI should be traceable to organizational objectives (value) and to governance expectations (risk/compliance). In this scenario, the objective is a 20% reduction in time-to-decision, so readiness evidence should reflect end-to-end process impact, not just offline model accuracy. Because the bank also requires bias testing with approved thresholds, the validation package must include fairness results (e.g., parity or adverse-impact measures) evaluated against those predefined acceptance criteria.
The strongest readiness evidence therefore combines:
Purely technical metrics, activity completion, or awareness/engagement measures do not validate both value and governance readiness.
This directly ties business value (cycle time) to governance acceptance (fairness thresholds) for a go/no-go decision.
Topic: Identify Business Needs and Solutions
While drafting an AI solution for real-time fraud screening at checkout, the team focuses on model features and offline accuracy. Operations notes the service must return a score in under 150 ms, run 24/7 with 99.9% availability, and handle 10 d7 traffic spikes on holidays. Which CPMAI concept is most directly being introduced and should be captured to shape the solution design?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: The constraints described (response time, uptime, and handling spikes) are non-functional requirements that must be made explicit as service-level objectives/agreements. These targets materially affect the solution draft by driving design decisions such as hosting, scaling approach, redundancy, and performance testing. Capturing them early prevents building a model that cannot meet operational expectations.
Non-functional requirements (NFRs) define how well the AI solution must operate, not what it predicts. In this scenario, checkout fraud screening is operationally constrained by strict latency, high availability, and burst scalability; these become measurable SLOs/SLAs (or equivalent internal targets) that shape the draft solution design.
Practically, the draft should translate these into testable operational requirements such as:
Model accuracy remains important, but without NFRs the system may fail in production despite good offline metrics.
Latency, availability, and peak-load capacity are non-functional targets that drive architecture and deployment choices.
Topic: Support Responsible and Trustworthy AI Efforts
A bank is preparing to pilot an AI model that recommends approve/deny decisions for personal loans. The team has a high-performing gradient-boosted model and a completed data quality review, but compliance requires transparent, case-specific explanations for adverse decisions and evidence the model relies on legitimate factors before any customer impact.
What is the best next step?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Because the use case affects individual customers in a regulated context, the team needs interpretability appropriate to the risk level before rollout. The next step is to produce and validate both global understanding of key drivers and local, case-specific explanations that can support adverse action reviews and auditability.
Interpretability techniques should be selected based on the decision risk and the explanation need. For high-stakes, customer-impacting recommendations, you typically need (1) global interpretability to confirm the model’s drivers align with policy and domain expectations, and (2) local interpretability to explain individual outcomes (especially adverse actions) in a consistent, reviewable way.
A practical next step is to run model-agnostic explanation analyses (for example, SHAP for global and local attributions) on representative cases, review results with compliance and subject-matter experts, and capture findings in transparency documentation to support go/no-go for the pilot. This addresses the immediate transparency gap without prematurely changing the model or deferring controls until after deployment.
High-stakes individual decisions require validated, case-level and overall interpretability evidence before piloting.
Topic: Identify Business Needs and Solutions
A payments company is evaluating an AI-based fraud scoring service that must run inline during checkout. Peak traffic is 1,200 transactions/second, and the business requires a p95 end-to-end decision latency under 150 ms. The product owner also needs an expected monthly run-cost estimate before approving a pilot.
Which metric/evidence best validates computing and infrastructure feasibility for this use case?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: For infrastructure feasibility, the most convincing validation is evidence that the end-to-end system meets required throughput and latency under realistic peak conditions, with an associated cost projection. A load/performance test report that measures p95 latency at peak transactions per second and translates resource use into expected monthly cost best supports a go/no-go decision.
Estimating compute and infrastructure constraints is about proving the solution can meet operational nonfunctional requirements (latency and throughput) within an acceptable run-cost envelope. In this scenario, the decision hinges on whether an inline fraud scorer can sustain 1,200 TPS while keeping p95 end-to-end latency under 150 ms and staying within an acceptable monthly operating cost.
The strongest feasibility evidence is an end-to-end performance/capacity benchmark that:
Model quality metrics and conceptual designs are useful, but they do not validate runtime performance and cost under production-like load.
It directly demonstrates throughput-capacity and latency performance at peak load and ties it to projected run costs.
Topic: Operationalize AI Solution
A claims AI solution has completed a 12-week pilot and is ready for handover to operations. The steering committee asked for a final report that documents whether the project achieved its objectives and what was learned, using verifiable evidence. Constraints: customer data cannot be exposed in reports, and adoption by call-center agents is a stated success criterion.
Which reporting approach best optimizes credible outcome documentation while meeting these constraints?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: To document outcomes credibly, the final report should tie objectives to evidence: pre/post KPIs, model performance against acceptance criteria, and real adoption/usage. Because adoption is a success criterion, stakeholder feedback and usability signals must be included. Privacy constraints require aggregation and anonymization rather than exposing customer-level artifacts.
A strong final AI project report demonstrates objective achievement using multiple, auditable evidence sources—not just technical metrics or forward-looking promises. In this scenario, you should connect the original success criteria to (1) pre/post business outcomes from the pilot, (2) model quality and operational measures versus agreed thresholds (e.g., error rates, latency, stability), and (3) adoption evidence, including usage analytics and structured stakeholder feedback from agents and supervisors. Privacy constraints mean the report should use aggregated metrics and anonymized/abstracted feedback (themes, counts, representative redacted quotes) rather than raw customer records. The goal is a traceable narrative from objectives → metrics/feedback → conclusion and lessons learned that supports a go/no-go or scaling decision.
It triangulates objective achievement with measurable outcomes and stakeholder feedback while protecting privacy via anonymization.
Topic: Manage AI Model Development and Evaluation
A team is building a loan-default model (label: default within 90 days). For data preparation, they plan to apply target encoding to several high-cardinality categorical fields and median imputation/standardization to numeric fields. You are concerned these transformations could introduce leakage and/or amplify bias across customer subgroups. What should you verify or obtain FIRST before deciding whether the transformation approach is acceptable?
Best answer: B
What this tests: Manage AI Model Development and Evaluation
Explanation: Leakage often enters through transformations that are fit using information outside the training split or that incorporate post-outcome data. Verifying the split strategy, the exact “fit” scope for encoders/imputers/scalers, and feature availability at prediction time is the fastest way to determine whether the pipeline is even valid to evaluate. Only after this check do fairness and performance metrics become trustworthy.
The core check is transformation integrity: each transform must be derived only from data that would be legitimately available when the model is used, and it must be fit without peeking at validation/test data. With target encoding in particular, leakage can occur if category statistics are computed using the full dataset (or the label) outside a training-only, fold-safe process; similarly, imputers/scalers can leak if fitted on all rows before splitting. Ask for the pipeline specification that shows (1) the train/validation/test (or time-based) split, (2) when the label is determined relative to each feature, and (3) that all “fit” steps occur on training data only (then applied to holdout sets). After leakage is ruled out, you can assess whether transformations shift distributions or error rates across subgroups and apply mitigations if needed.
You must first confirm transformations use only training data (and only pre-outcome information) to prevent leakage before interpreting any performance or fairness results.
Topic: Operationalize AI Solution
A fraud-detection model is live and supports real-time payment approvals. The operations team requests production alerting for threshold breaches and response playbooks for common failures (e.g., data pipeline delays, rising false declines, latency spikes).
Which approach SHOULD AVOID when setting up alerting and response playbooks?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: Operational AI monitoring should detect threshold breaches quickly and trigger a consistent, accountable response. Automated alerting with clear ownership and severity prevents prolonged customer impact when latency, data freshness, or quality metrics degrade. Playbooks (runbooks) translate alerts into repeatable triage and remediation actions, including escalation and rollback when needed.
The core concept is closing the loop from metric thresholds to action. In production, threshold breaches (e.g., latency, data freshness, model performance proxies like false-decline rate) need timely detection plus a defined response so incidents are handled consistently.
Effective setup typically includes:
Depending on the failure, the playbook may direct steps like verifying upstream data health, switching to a fallback rule set, rolling back to a prior model version, or throttling traffic. The key takeaway: monitoring without real-time alerting and an executable response plan is an operational anti-pattern.
This delays detection and response and provides no actionable, time-bound playbook for threshold breaches.
Topic: Manage AI Model Development and Evaluation
A team is requesting a go/no-go decision to operationalize a customer-churn prediction model. In the readiness review, they mention they will “monitor performance and retrain if it drifts,” but they provide no specifics.
What should you verify/ask for first before making the operationalization decision?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: Operationalization readiness depends on whether the solution can be safely controlled in production. Vague statements like “monitor and retrain” are insufficient without a defined baseline, measurable drift/performance thresholds, and pre-agreed response actions with clear ownership. Verifying this plan first confirms the model can be operated and corrected when behavior changes.
For a go/no-go decision, you must confirm the model can be managed once exposed to real-world data changes. A monitoring and drift detection plan is only actionable when it specifies (1) what metrics will be tracked (business and model), (2) the baseline values used for comparison, (3) thresholds that trigger an alert or escalation, and (4) predetermined response actions (e.g., investigate data pipeline issues, fall back to a prior model, throttle usage, or retrain) with owners and timing. Without these elements, “we’ll monitor” does not provide operational control, making the deployment decision premature. Details like algorithm choice or future feature ideas may be useful, but they do not close the operational risk created by undefined drift triggers and responses.
A go/no-go requires explicit drift/performance thresholds and a clear playbook (owners, actions, rollback/retrain) to manage degradation after deployment.
Topic: Identify Data Needs
You are leading an AI initiative to build a churn prediction model using customer support transcripts that contain PII. The data must stay in-region, the organization has a low risk tolerance, and a third-party security audit is scheduled in 8 weeks. Data scientists need an environment in 2 weeks to start experimentation, and the audit will require evidence of approved access, segregation between dev/test/prod, and immutable activity logs.
What is the BEST next action?
Best answer: D
What this tests: Identify Data Needs
Explanation: With PII, low risk tolerance, and an upcoming audit, the environment must be secure-by-design and auditable before data access begins. Establishing a standardized workspace with least-privilege IAM, dev/test/prod segregation, encryption, and immutable centralized logs supports the 2-week start while producing the evidence auditors will require.
The core need is to coordinate an AI workspace and infrastructure that meets security standards and produces audit-ready evidence. In this scenario, experimentation speed cannot come at the expense of controls required for regulated data: approved access, separation of environments, and tamper-resistant logs.
Best next action is to implement a standardized, baselined workspace (often a “landing zone”) that includes:
This sequencing allows the team to start within 2 weeks without creating unmanaged copies of PII or generating gaps that are difficult to remediate before the audit.
It enables rapid experimentation while meeting security standards and creating auditable evidence of access, segregation, and logging controls.
Topic: Identify Business Needs and Solutions
You are planning an AI project to predict customer churn. The highest-signal features come from CRM notes that contain personal data and require security review, data-sharing approvals, and a de-identification pipeline before analysts can access them. The approvals typically take 4–6 weeks and cannot be expedited.
Which resource allocation and phase timing approach is best?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: When the decisive constraint is restricted access to sensitive data, the schedule is driven by governance lead time and data preparation prerequisites. The best plan therefore allocates resources early to privacy, security, and data engineering, and sequences approvals and de-identification before committing significant effort to model development. This prevents idle modeling capacity and rework from building on unusable data.
Resource planning for AI must reflect the true gating items across phases, and data access/readiness often dominates the critical path. In this scenario, CRM notes contain personal data and require approvals plus a de-identification pipeline, with a known 4–6 week lead time that cannot be shortened. The most effective allocation is to staff discovery and the data phase with the roles that unblock access (legal/privacy, security, data governance, and data engineering) and treat approvals + de-identification as explicit prerequisites before scaling model development.
A practical sequencing is:
The key takeaway is to align staffing and timing to the highest-risk, longest-lead dependency—here, sensitive-data access.
Sensitive-data access is the critical-path constraint, so governance and de-identification resources must be staffed early and planned as prerequisites to modeling.
Topic: Identify Data Needs
You are preparing a tabular training dataset for a claim fraud classification model. During data evaluation, you review this profiling excerpt.
Field Observed issue (sample)
claim_amount "1,200.50"; "\$980" (string w/ symbols)
incident_date "2026-01-05"; "01/05/26" (mixed formats)
diagnosis_codes ["E11","I10"] (array/nested)
customer_id 3.1% null; 0.8% duplicate IDs
fraud_flag present for 72% of rows (missing labels)
What is the best next action supported by this exhibit?
Best answer: C
What this tests: Identify Data Needs
Explanation: The profiling excerpt indicates clear schema incompatibilities: numeric values stored as strings with currency symbols, inconsistent date formats, nested arrays, identifier quality issues, and missing labels. The correct response is to define an agreed target schema and specify transformations so the training data is consistent, auditable, and suitable for modeling.
In Domain III data evaluation, schema and structure checks confirm whether data can be reliably used by the intended modeling approach and pipeline. Here, the exhibit shows multiple incompatibilities that will break or silently distort feature creation (string-encoded currency, mixed date formats, nested arrays) plus data quality concerns that affect joins and training set definition (null/duplicate IDs, missing labels). The right next action is to create a canonical (target) schema and a transformation plan, then update the data pipeline accordingly (e.g., parse and cast amounts, standardize timestamps, flatten/encode arrays, enforce key constraints, and define how unlabeled rows are handled). This prevents inconsistent preprocessing between training and inference and supports traceability for later audits and retraining.
The exhibit shows type/format/nesting and label-coverage issues that must be standardized and transformed to make the dataset model-ready and reproducible.
Topic: Identify Data Needs
A retail bank wants to build a model to predict which new personal-loan applicants will default within 12 months. The team has access to current loan-application fields, but it is unclear whether historical repayment outcomes can be linked back to the original applications, and “default” is defined differently across business units.
Before deciding whether to collect more data, adjust the project scope, or use data augmentation, what should the project manager ask for FIRST?
Best answer: A
What this tests: Identify Data Needs
Explanation: To assess data gaps, you first need to confirm that the target outcome (label) is consistently defined and that labeled historical outcomes can be joined to the input records. Without a usable, linkable ground truth, you cannot determine whether the right response is to collect/label more data, narrow the use case, or augment with additional sources.
The core data-gap decision starts with label readiness: a consistent target definition and enough trustworthy labeled examples that can be connected to the inputs. In this scenario, “default” varies by business unit and it’s uncertain whether repayment outcomes can be linked to the original application records. If the label is inconsistent or not linkable, the immediate gap is not model choice or hosting—it’s the absence of a reliable supervised-learning dataset.
Once label definition and linkage are verified, you can select the right strategy based on what’s missing:
Platform, schedule, and algorithm discussions come after confirming the dataset can be constructed and audited.
You must first verify label definition and linkable ground truth to identify the true data gap and choose an appropriate remediation strategy.
Topic: Identify Business Needs and Solutions
You are standing up a cross-functional team to deliver a customer churn prediction MVP for a regulated financial services firm. A newly onboarded contract data scientist is scheduled to start feature engineering on Monday, but currently has no access to the approved development environment or the customer dataset (contains PII). The data owner requires documented purpose, least-privilege role mapping, and completion of mandatory privacy/security training before granting access.
What is the best next step to keep the team productive and compliant?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: The immediate blocker is lack of compliant access to the environment and PII dataset. The correct next step is to complete required onboarding prerequisites (role mapping and mandatory training) and then request least-privilege access through the formal governance workflow. This preserves auditability while unblocking work as soon as approvals are granted.
When an AI team cannot access a needed environment or dataset—especially one containing PII—the next step is not to bypass controls, but to complete the organization’s access provisioning path so work can proceed under approved, auditable conditions. That typically means defining what the person needs to do (role mapping), ensuring prerequisite trainings/acknowledgements are complete, and submitting access requests that specify purpose and scope so the data owner and IAM/security can grant least-privilege access.
A practical sequence is:
This avoids creating unmanaged copies of sensitive data or over-privileging users, both of which increase risk and can halt the initiative later.
This sequences compliance prerequisites (role definition and training) before provisioning least-privilege access through the data owner and IAM process.
Topic: Identify Data Needs
A utility company wants a supervised ML model to predict which residential customers will miss their next monthly bill so the collections team can intervene early. Data will come from billing transactions, service calls, and CRM. The team is writing the data requirements for the training dataset.
Which option best matches what they must specify to support the use case?
Best answer: A
What this tests: Identify Data Needs
Explanation: For a supervised prediction use case, the most important data-requirements artifact is a precise feature/label schema. That includes the unique identifier(s) to join sources, the time grain (as-of date) for each row, and the label definition aligned to the prediction horizon to avoid leakage. Without these fields and definitions, data extraction and modeling cannot be done consistently or audited.
The core concept is defining a fit-for-purpose dataset schema for the AI use case. For supervised learning, you must specify what each training row represents and how it is constructed so different data sources can be reliably integrated and the outcome can be learned.
At minimum, the data requirements should state:
customer_id) used to join sourcesbilling_cycle_end_date) that anchors featuresQuality, governance, and monitoring are important, but they come after (and depend on) a clear definition of required fields and labels.
A usable training set requires a clear join key, an as-of date/time grain, and explicit feature and label fields aligned to the prediction window.
Topic: Identify Data Needs
You are preparing an executive summary for an AI initiative to predict 30-day hospital readmissions using EHR data. The summary states “data is available and ready for modeling,” but it omits that (1) access requires IRB approval and a data-use agreement expected to take 6–8 weeks, and (2) the readmission label definition differs across two source systems and must be reconciled.
Leadership uses the summary to approve a 12-week pilot timeline and releases funding. What is the most likely near-term impact of this omission?
Best answer: D
What this tests: Identify Data Needs
Explanation: Executive summaries that overstate data readiness cause leadership to make commitments (timeline, funding, scope) based on incorrect assumptions. In this case, the missing access lead time and label inconsistency will block modeling work and force rework early in execution. The near-term outcome is delay and loss of confidence due to re-baselining and scope adjustments.
An executive summary should clearly communicate current data readiness plus constraints and risks that affect near-term feasibility (e.g., access approvals, privacy reviews, data-use agreements, label availability/quality, and system inconsistencies). Here, the pilot timeline was approved assuming data could be used immediately and the target label was already consistent. When the team begins execution, they will quickly encounter the IRB/DUA lead time and the need to reconcile label definitions, creating a work stoppage or significant rework. The most likely near-term consequence is a schedule slip and re-planning, which can also reduce stakeholder trust because the original commitment was made without key constraints.
Omitting readiness constraints leads to near-term approval of an infeasible plan, followed quickly by delays and rework once governance and label reconciliation are discovered.
Topic: Identify Data Needs
A retailer is piloting an AI model that predicts weekly stockouts. The pilot used a one-time export of POS transactions and inventory snapshots. The business wants the model to run every Monday with updated data, but data engineering has not defined how new data will be collected, validated, or refreshed, and ownership for ongoing feeds is unclear.
What is the best next step?
Best answer: C
What this tests: Identify Data Needs
Explanation: The use case requires recurring, production-grade data updates, so the immediate need is an agreed process for ongoing data collection and refresh. That includes defining the refresh cadence, data sources and access path, accountable owners, and validation/quality gates so weekly runs are reproducible. Without this, operational runs will be unreliable regardless of model quality or monitoring.
For AI solutions that must run on a schedule, “data readiness” includes more than initial access—it requires an operational process for continuous data collection and refresh. In this scenario, the pilot relied on a one-time export, and there is no defined mechanism or accountability for weekly updates, so the next step is to design and document the refresh process before scaling.
Key elements to define include:
Monitoring and deployment are important, but they assume the underlying data pipeline and refresh controls exist and are owned.
A documented refresh cadence, data owners, access method, and quality checks are prerequisites for reliable continuous updates.
Topic: Manage AI Model Development and Evaluation
You are managing the next iteration of a churn-prediction model and must plan the team’s training schedule.
Exhibit: Training capacity check (next 10 business days)
Compute: Shared GPU pool = 4 GPUs x 8 hrs/day
Other teams reserved: 60% of GPU time
Planned experiments: 30 runs x 8 GPU-hrs/run
People: 1 DS (full-time), 1 MLE (50%);
Data pipeline changes pending from Data Eng (20% capacity)
Go/no-go review scheduled in 2 weeks
Based on the exhibit, what is the best next action?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: The exhibit shows a capacity mismatch: available GPU-hours and dependent Data Engineering support are both constrained relative to the planned experiment volume and the fixed go/no-go date. The most appropriate response is to replan training work so the schedule and experiment scope are achievable with reserved compute and available roles, rather than relying on optimistic execution or lowering quality gates.
Training schedules and resource allocation should be driven by a quick feasibility check that compares required compute and staffing (including dependencies) against actual capacity within the iteration window. Here, the GPU pool is heavily reserved by other teams and the experiment plan is large, while a needed data pipeline change depends on limited Data Engineering capacity. Before running jobs, the team should reconcile demand to supply by updating the iteration plan and securing reservations/support.
This keeps iteration predictable and auditable without weakening evaluation standards.
The plan exceeds available compute and dependent staffing, so the iteration scope and schedule must be reconciled to the constrained resources before execution.
Topic: Identify Data Needs
A product team proposes using historical customer support chat transcripts to fine-tune an internal generative AI assistant. The dataset may include names, account numbers, and free-text that could contain sensitive information. Before approving data access for the project, what should the AI project manager verify first to ensure the intended data use complies with organizational policy and data protection requirements?
Best answer: D
What this tests: Identify Data Needs
Explanation: Before deciding on access or processing, you need to confirm whether the transcripts are allowed to be used for this purpose under policy and data protection obligations. Verifying data classification and approved use establishes what safeguards, approvals, and constraints (e.g., minimization, redaction, retention) must be applied. Only then can technical and schedule decisions proceed responsibly.
The core compliance step is to determine whether the proposed use of the data is permitted and under what conditions. For chat transcripts that may contain personal or sensitive data, start by verifying the dataset’s classification (e.g., public/internal/confidential/restricted) and the approved purpose(s) for use under organizational policy and applicable data protection requirements. This “permissioning” decision drives everything that follows: whether access can be granted at all, what minimum data is required, what de-identification or redaction is needed, where the data may be stored/processed, who can access it, and what retention and audit requirements apply. Technical choices like model architecture and resourcing should be made only after the allowable data handling constraints are known.
You must first confirm the data classification and permitted purpose to determine whether access, minimization, controls, and approvals are allowed.
Topic: Manage AI Model Development and Evaluation
Which term best describes the documented operational artifact that specifies what production metrics to monitor for model/data drift, the alerting thresholds, and the predefined response actions (for example, retrain, rollback, or human review) when thresholds are exceeded?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: A model monitoring runbook is the operations-ready plan that turns drift detection into actionable controls. It documents the specific monitoring metrics, the thresholds that trigger alerts, and the agreed response actions to restore performance and manage risk. This is what supports a go/no-go decision for operationalization readiness.
The core concept is that drift monitoring must be operationalized, not just described. A model monitoring runbook (sometimes called a monitoring and response plan) documents: which data/model health signals are tracked in production, the alert thresholds for those signals, and what the team will do when thresholds are crossed (who responds, how quickly, and what actions are allowed such as rollback, retraining, or routing to human review). In a go/no-go readiness check, confirming this artifact exists helps ensure the solution can be safely operated and issues can be detected and mitigated consistently, rather than handled ad hoc.
It defines monitored metrics, drift thresholds, alerting, and the required response actions when breaches occur.
Topic: Operationalize AI Solution
Your organization plans to deploy an AI-powered customer support triage solution in 3 weeks. You must coordinate deployment work across ML engineering, platform/DevOps, application, security, and the service desk, and report weekly progress “against the plan.” You see multiple team backlogs and a draft runbook, but nothing that clearly ties work together.
What should you verify or obtain FIRST?
Best answer: D
What this tests: Operationalize AI Solution
Explanation: To coordinate deployment across multiple technical teams, you first need a shared, approved baseline plan that integrates milestones, dependencies, and owners. That plan becomes the reference for sequencing work, resolving handoffs, and reporting status consistently. Without it, “progress vs plan” is undefined and updates will be inconsistent across teams.
Cross-team AI deployment coordination depends on having one integrated, baselined deployment plan that everyone agrees to execute. In this scenario, you have fragmented signals (multiple backlogs and a draft runbook) but no single source of truth that connects work across teams, clarifies handoffs, and establishes what “on track” means.
Start by obtaining or creating an approved integrated plan that includes:
Once that exists, you can align the teams’ boards to it, track actuals vs baseline, and report credible progress and risks. Technical details (metrics, SIEM configuration) are important, but they come after the coordination baseline is established.
You can’t coordinate cross-team deployment or track progress without a single baselined, dependency-aware plan and clear ownership.
Topic: Identify Data Needs
You are scoping data for a telecom churn prediction use case. The data team provides the following excerpt from the data profiling report.
Exhibit: Data profiling report (excerpt)
Dataset: Billing + CRM extract (last 12 months)
Field: account_status_cd Values: A,S,T Meaning: unknown
Field: cancel_reason_cd Values: 1–7 Mapping: not provided
Field: save_offer_cd Values: 0–15 Notes: “legacy marketing codes”
Risk: Ambiguous business semantics may invalidate labels/features
Owner: TBD
What is the best next action to address the risk shown in the exhibit?
Best answer: A
What this tests: Identify Data Needs
Explanation: The exhibit shows key fields whose business meaning is unknown, which can invalidate labels and features even if the data is technically accessible. The right response is to identify and engage domain experts who own or use the billing/retention processes and code sets to provide authoritative definitions and mappings. This creates a trustworthy data dictionary for downstream modeling and evaluation.
When a data artifact flags “unknown” meanings or missing code mappings, the primary gap is business semantics, not engineering or modeling technique. To resolve it, you need the domain SMEs who can explain how the organization uses those codes in real workflows and what each value represents (including historical/legacy nuances). In this scenario, billing operations and retention/marketing process owners are best positioned to confirm whether account_status_cd, cancel_reason_cd, and save_offer_cd align with the churn definition and to provide an approved mapping for the data dictionary. Once semantics are validated, the team can revisit feature engineering, label definition, and any needed data transformations. Encoding ambiguous fields without meaning just makes an opaque model faster.
These domain SMEs can explain what each status/reason/offer code means in the business process and validate feature/label interpretation.
Topic: Operationalize AI Solution
A retail bank has deployed a customer churn prediction model that drives retention offers. True churn labels are only available about 45 days after scoring, and the model must pass a 2-week validation and change-approval process before any production update. The MLOps team can support at most 6 full retrains per year without increasing run costs.
Which approach SHOULD AVOID when planning update and retraining cadences?
Best answer: B
What this tests: Operationalize AI Solution
Explanation: A sustainable cadence aligns retraining and releases with label latency, governance lead time, and operational cost constraints. The cadence should favor stability in production while still enabling timely refresh when monitoring shows meaningful degradation. Automating frequent retrains and deployments without approvals undermines controlled change management and can increase instability and operational burden.
Retraining and release cadences should be anchored to (1) how quickly reliable labels arrive, (2) the time needed for validation/approval, and (3) the organization’s run-cost capacity. In this scenario, labels lag by ~45 days and each update requires a 2-week governed validation, so extremely frequent retrains (and especially auto-deployments) cannot be evaluated responsibly and will create unnecessary production volatility. Good cadence design combines scheduled refreshes (for predictability and budgeting) with monitoring-based triggers (to preserve performance), and may separate “train often” from “release less often” via shadow/champion-challenger practices. The key takeaway is to avoid update patterns that outpace label readiness and governance, even if they seem to maximize freshness.
This creates unnecessary churn and risk by retraining/deploying faster than label availability and bypassing required governance gates.
Topic: Support Responsible and Trustworthy AI Efforts
During an internal audit, your team is asked to reproduce last quarter’s fraud-detection model results. The model was trained from a data lake extract, and the team used notebooks plus a manual spreadsheet for training parameters. You are deciding what version-control and audit-trail improvements are needed.
What should you verify or request FIRST before choosing specific practices?
Best answer: A
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: To enable reproducibility, you must first confirm whether the audited model can be traced to exact, retrievable versions of the model artifact, training data, and training configuration. Without those immutable identifiers (and their linkage), you cannot recreate the same training run or prove what was used. Establishing this provenance drives which version-control gaps to fix.
The core requirement for reproducibility is provenance: a verifiable linkage from a specific deployed model back to the exact inputs and settings that produced it. In the scenario, notebooks and a parameter spreadsheet suggest the run may not be uniquely reconstructable, so the first thing to verify is whether the organization already has (or can reconstruct) a single run record that captures:
Once you know whether these identifiers exist and are consistently linked, you can choose appropriate version-control practices (for data, configs, and model artifacts) to close the audit-trail gap rather than optimizing unrelated concerns.
Reproducibility starts by confirming you can uniquely identify and retrieve the exact model, data, and configuration used for the audited run.
Topic: Operationalize AI Solution
A retail bank has deployed an AI model that recommends credit-limit increases. The solution is now in production and must meet strict privacy controls (no production data may be exported) and a low risk tolerance (any wrong recommendation can trigger customer complaints). A rollback runbook and a manual-override process exist, but the operations team has never executed them end-to-end. The model will be expanded to three more regions in 6 weeks.
What is the BEST next action?
Best answer: B
What this tests: Operationalize AI Solution
Explanation: Before expanding the AI solution, the team should validate that contingency procedures actually work under operational conditions. A controlled drill can be designed to respect the privacy constraint by using approved test data and realistic production-like steps. Capturing outcomes and updating runbooks ensures lessons learned improve readiness ahead of broader rollout.
In operationalizing AI, contingency procedures (rollback, manual override, incident communications, and recovery steps) must be tested regularly, not just documented. Here, the organization has low risk tolerance and an imminent scale-up, but the team has never executed the rollback/override end-to-end—creating a high likelihood of operational failure during an incident.
The best next step is to run a controlled contingency exercise that:
This validates real readiness and improves the procedures before expanding to additional regions.
An end-to-end drill (using approved test data) validates rollback/override readiness and feeds lessons learned into updated runbooks before scaling.
Topic: Identify Business Needs and Solutions
A retail bank pilots a customer-support chatbot that uses retrieval-augmented generation (RAG) to pull from an internal CRM and knowledge base. After rollout, adoption drops and a privacy incident is reported: an external pilot user was able to get another customer’s address and recent case notes.
Clues: offline answer-quality metrics are unchanged from pre-release, no recent model retraining occurred, and audit logs show the chatbot’s backend service account queried CRM records without an end-user identifier on the request.
What is the most likely underlying cause?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: The key clue is that the chatbot accessed CRM data without an end-user identifier, which implies the system is not enforcing user-level authorization. In RAG systems, privacy incidents like cross-customer disclosure often result from over-broad service accounts or missing row-level security, not from changes in model quality.
This scenario points to a cybersecurity data-exposure risk introduced by the integration layer, not the model itself. Stable offline quality and no retraining make drift or training-data issues unlikely as the primary cause. The decisive evidence is the audit trail: CRM queries were executed by a backend identity without an end-user context, which commonly means the chatbot can retrieve data beyond what the requester is authorized to see.
A practical risk-assessment response is to verify and enforce:
When a generative component is connected to sensitive systems, controlling tool/data permissions is a primary safeguard against unintended or malicious data exfiltration.
The logs indicate CRM queries executed without binding to the requesting user, pointing to an access-control design flaw enabling data exposure.
Topic: Manage AI Model Development and Evaluation
A team is training a churn prediction model to target retention offers. After adding dozens of new engineered features and increasing model complexity, the latest results are:
The product owner decides to deploy this version because the training score is “excellent.” What is the most likely near-term impact after rollout?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: The pattern of much higher training performance with worse validation performance is a classic sign of overfitting. Deploying an overfit model typically leads to poorer generalization on real-world data, so business outcomes tied to prediction quality (like retention uplift) will degrade quickly. The decision to prioritize training score over validation performance drives the immediate impact.
Overfitting is indicated when a model performs very well on training data but worse on validation data, especially when a change (more features and higher complexity) increases training metrics while decreasing validation metrics. In this scenario, the model has likely learned noise or idiosyncrasies in the training set and will not generalize to new customers.
Near-term, that shows up as weaker real-world predictive quality (e.g., lower precision/recall), causing mis-targeted retention offers and reduced campaign uplift. Typical corrections would be to reduce model capacity, add regularization/early stopping, simplify or vet engineered features, improve data quantity/representativeness, and select the model based on validation (or cross-validation) performance rather than training scores.
The widening training–validation gap indicates overfitting, so performance will drop on new customers despite high training F1.
Topic: Identify Business Needs and Solutions
A retail bank plans to launch an AI-driven “instant pre-approval” feature in its mobile app that also recommends next-best products. In the risk assessment, stakeholders raise concerns about potential bias against underrepresented groups, manipulative nudging in product recommendations, and exclusion of customers with limited digital literacy.
Which mitigation approach the team proposes SHOULD AVOID?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: Ethical risk mitigation reduces bias, manipulation, and exclusion without designing groups out of the solution. Removing a population (such as minority-language users) is an exclusionary anti-pattern that can worsen inequity and undermine the business objective of serving the full customer base. Effective mitigations instead measure disparate impact, increase transparency and oversight, and test for harmful interaction patterns.
Ethical risk assessment for AI solutions should lead to mitigations that reduce harm while preserving equitable access. In this scenario, the concerns map to (1) bias in automated decisions, (2) manipulation via product “nudges,” and (3) exclusion of users who interact differently (language, accessibility, digital literacy). A mitigation that simply removes an affected group is not risk reduction—it is institutionalizing exclusion and likely increases disparate impact while shrinking who can benefit from the feature.
Practical mitigations typically include:
The goal is to make the system safer and more inclusive, not to avoid accountability by filtering out impacted users.
This is an exclusionary mitigation that increases harm by design rather than reducing ethical risk.
Topic: Identify Business Needs and Solutions
A product team proposes an AI-driven call summarization service. The sponsor asks for an ROI estimate before approving rollout to 2,000 agents. The team has strong pilot quality results, but Finance flags that the cost estimate appears to include only initial build.
Which evidence/artifact best validates that the total cost of ownership (TCO) estimate is decision-ready and includes ongoing monitoring and maintenance?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: To validate ROI readiness, you need evidence that the cost side is complete over the solution’s life, not just the build. A TCO model that itemizes and forecasts ongoing costs—monitoring, incident response, retraining, platform operations, and support—provides auditable assumptions and enables Finance to evaluate ROI credibly.
TCO for an AI solution must include both one-time implementation costs and recurring costs to keep the model reliable in production. In this scenario, the decision risk is underestimating ongoing MLOps and operational workload (monitoring for drift, alert triage, data pipeline maintenance, scheduled/triggered retraining, validation, deployment, security reviews, and support). The most valid evidence is a TCO model that explicitly itemizes these run costs, states assumptions (volumes, SLAs, monitoring frequency, retraining triggers/cadence, human-in-the-loop time), and ties them to the ROI timeframe. Quality and adoption metrics are important, but they do not validate that the cost estimate includes operationalization and maintenance.
A decision-ready TCO model explicitly forecasts ongoing operational costs (monitoring, drift response, retraining, support) and documents assumptions used for ROI.
Topic: Identify Data Needs
You are leading an AI initiative to predict insurance claim denials. In a steering committee, the data team reports: “Only 65% of claims have reliable denial labels; last year’s policy change introduced concept drift; and there’s class imbalance.” No one translates what this means for business outcomes, timeline, or achievable accuracy. Leadership approves a production date and announces expected savings.
What is the most likely near-term impact of this omission?
Best answer: A
What this tests: Identify Data Needs
Explanation: Failing to translate data quality and representativeness into business terms leads stakeholders to assume the data can support promised outcomes. That typically drives near-term overcommitment on timeline, ROI, and performance targets. The immediate consequence is re-planning, scope changes, and loss of confidence when constraints surface.
The core issue is communication: technical findings about label reliability, distribution imbalance, and data drift must be converted into business implications (what performance is realistic, which segments are in/out of scope, what remediation is needed, and how that affects timeline and benefits). If leadership hears only jargon, they often interpret it as “the team has it handled” and proceed with commitments.
Near-term impacts commonly include:
Longer-term monitoring issues can occur, but the first consequence is usually decision and expectation misalignment that delays delivery.
Without business translation of data limits, leadership makes near-term decisions (scope, timeline, expectations) on incorrect assumptions, leading to rework and slippage.
Topic: Support Responsible and Trustworthy AI Efforts
In an AI project, which term best describes maintaining a documented record of where training, validation, and test data came from, how they were transformed and versioned, and who accessed or approved them so the dataset can be audited end-to-end?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: The scenario describes end-to-end traceability for datasets: source-to-use provenance, transformation history, versioning, and access/approval records. This is the purpose of data lineage, which supports auditability and accountability documentation for AI initiatives. The other terms address model behavior, interpretability, or privacy scope rather than dataset provenance and chain-of-custody.
The core concept is dataset traceability for audit and accountability. Maintaining chain-of-custody and provenance for training, validation, and test data means you can reconstruct what data was used, where it originated, what transformations and joins were applied, which versions were used, and who had access or made approvals at each step. The governance term that covers this end-to-end trace is data lineage (often documented via lineage graphs, metadata logs, and version-controlled data pipelines). This supports reproducibility, investigations (e.g., biased outcomes), and compliance audits because you can show a defensible, tamper-evident history of the data lifecycle rather than relying on informal descriptions.
Data lineage captures dataset provenance and the traceable chain of custody across sources, transformations, versions, and access/approvals.
Topic: Support Responsible and Trustworthy AI Efforts
A bank is building an ML model to recommend approve/decline decisions for small-business loans. The model will be used for high-impact decisions, and internal audit requires that for every decline the team can provide a clear, stable, feature-level rationale that a non-technical reviewer can understand and that can be reproduced months later. Accuracy is important, but the business will not accept an approach that relies primarily on post-hoc explanations.
Which model selection approach best fits this situation?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: When transparency requirements are strict for high-impact decisions, model selection should prioritize inherent interpretability over marginal performance gains. An interpretable model makes the decision logic directly inspectable, easier to validate for consistency, and more defensible in audits. This aligns the algorithm choice to the primary operational and governance constraint: reliable, reproducible explanations per decision.
This scenario’s decisive factor is the requirement for clear, stable, reproducible explanations that do not depend primarily on post-hoc interpretation. In high-impact decision settings, transparency and auditability often become selection criteria that outweigh small performance improvements.
A practical selection approach is:
Post-hoc methods can support understanding, but when the organization will not accept reliance on them, choosing a simpler, transparent model is the defensible tradeoff.
Strict, audit-ready transparency needs are best met by an inherently interpretable model whose decision logic is directly explainable and reproducible.
Topic: Identify Business Needs and Solutions
You are a project manager starting discovery for an AI-assisted customer support triage initiative. You need to elicit and document business pain points from agents, supervisors, and quality analysts through interviews and observation.
Which approach should you AVOID when gathering pain points?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: Effective pain-point elicitation relies on neutral, open-ended interviewing and direct observation to surface real constraints, workarounds, and impacts. Leading stakeholders to “confirm” a preselected hypothesis creates confirmation bias and limits discovery to what the sponsor already believes. That undermines accurate documentation of the current-state problems AI should address.
The core concept is unbiased discovery: you want stakeholders to describe their work, friction points, and impacts in their own words, then corroborate what you heard with real-world observation. When you push a sponsor’s hypothesis and rely on yes/no confirmation, you turn discovery into validation of a predetermined solution direction, which hides conflicting perspectives and misses root causes.
Practical tactics that support strong pain-point documentation include:
Key takeaway: discovery should expand understanding before narrowing to a solution.
This is leading and confirmation-biased, which suppresses unanticipated pain points and distorts the problem definition.
Topic: Identify Data Needs
You are briefing non-technical executives on a pilot customer-churn prediction use case. During data profiling, the team found that 18% of customer records are missing recent support-interaction data and the source systems define “active customer” differently, which changes churn labels.
Which statement should you AVOID when translating these data findings to leadership?
Best answer: C
What this tests: Identify Data Needs
Explanation: When conveying data readiness to leadership, the goal is to connect technical findings to business risk, decision points, and impact on outcomes. A statement full of database/ETL jargon without consequences or a requested decision is an anti-pattern because it does not enable prioritization or informed tradeoffs. The best communication reframes data issues in terms of reliability of results, customer impact, and actions needed.
Translating data concepts for non-technical stakeholders means expressing “what we found” as “what it means for the business” and “what decision or action is required.” In this scenario, missing support-interaction data and inconsistent churn labeling directly affect model reliability, reported pilot performance, and which customers get targeted.
Good executive-level translation typically includes:
Purely technical implementation details (normal forms, slowly changing dimensions, mart design) are usually noise unless they are explicitly tied to risk, cost, and a decision.
It uses internal jargon and omits business impact, decision needed, and the consequence to success metrics.
Topic: Operationalize AI Solution
A team is ready to deploy an internal AI assistant that uses retrieval-augmented generation (RAG) over HR policies and employee case notes. The pilot used shared test accounts, but the production rollout will include 1,500 employees. Security has flagged that the knowledge base contains PII and that the assistant must restrict who can view case-note excerpts and must keep an audit trail of access.
What is the best next step to enable safe use before expanding access beyond the pilot?
Best answer: A
What this tests: Operationalize AI Solution
Explanation: Before scaling an AI solution that can surface PII, the deployment must enforce identity-based access controls and secure configuration. The immediate need is to move from shared pilot accounts to governed, least-privilege access with a repeatable provisioning/deprovisioning process and auditability. This establishes safe, accountable use as a prerequisite to broader rollout.
The core concept is secure operational readiness: user access provisioning and security configurations must be in place before expanding production access, especially when the AI system can retrieve or generate content derived from sensitive data. In this scenario, the pilot’s shared accounts and undefined entitlements create a high likelihood of unauthorized exposure and no reliable accountability.
The best next step is to implement the access control and provisioning foundation:
Only after these controls are verified should rollout expand; testing and monitoring complement but do not replace access governance for safe use.
Production rollout requires verified identity, least-privilege entitlements, controlled provisioning/deprovisioning, and auditable access before broader exposure to PII.
Topic: Identify Data Needs
A team is collecting weekly transaction and customer-support data to train a service-level prediction model. The AI project manager requires automated checks at ingestion (e.g., record-count reconciliation to source systems, schema/format validation, missing-value thresholds, and outlier flags) and blocks downstream use until issues are fixed so defects are caught early.
Which CPMAI-aligned principle or governance approach does this practice best represent?
Best answer: D
What this tests: Identify Data Needs
Explanation: The described practice is a proactive data-quality control applied at ingestion to confirm completeness and accuracy before the dataset is used. By defining automated validation checks and preventing downstream consumption until defects are resolved, the team “shifts left” quality and reduces rework later in the lifecycle.
This situation describes implementing data quality gates during collection/ingestion—automated validation that verifies the incoming data is complete, accurate, and conformant before it is accepted for analysis or model development. In AI initiatives, catching defects early is achieved by making data acceptance criteria explicit (counts reconcile, required fields populated, schemas match, values in expected ranges) and enforcing stop/go controls when checks fail. These controls help prevent training on corrupted or incomplete data, reduce downstream debugging, and improve traceability of issues back to the source.
The key distinguishing feature is timing and intent: these checks occur as data is collected/ingested to prevent bad data from propagating, rather than controlling access, evaluating model performance, or monitoring behavior after deployment.
It establishes completeness and accuracy checks with stop/go criteria to detect and correct data defects as early as possible.
Topic: Support Responsible and Trustworthy AI Efforts
A retail bank is developing an AI system that will automatically approve or decline small-business loan applications. The system will use applicant financial statements and credit history, and the decision will be communicated to customers. Your organization’s policy classifies this as a high-impact automated decision requiring legal/compliance oversight.
Which approach best plans compliance checkpoints and coordinates with legal/compliance across the AI lifecycle?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Because the solution makes high-impact automated decisions, compliance needs to be built in as formal stage gates, not treated as a final hurdle. Defining required artifacts and legal/compliance sign-offs at each gate makes obligations testable, auditable, and repeatable as the model and data change. This also reduces late rework and deployment risk.
For high-impact automated decisions, the most effective compliance strategy is proactive and lifecycle-based: define compliance checkpoints (stage gates) that legal/compliance co-owns, along with the evidence needed to pass each gate. In this scenario, that means aligning on what must be reviewed and approved at key moments (use-case approval, data acquisition/processing, pre-release validation, and post-release monitoring) and ensuring traceability for audits and change control.
A practical checkpoint plan includes:
Compared with a single end-of-project review, stage gates catch noncompliance early and keep the system compliant as it evolves.
High-impact automated decisions require defined, repeatable legal/compliance checkpoints and evidence (privacy, fairness, explainability, audit trail) before moving to the next lifecycle stage.
Topic: Operationalize AI Solution
A retail bank has deployed an AI model that pre-screens credit card applications. Business stakeholders require 24/7 availability and a low risk tolerance for incorrect declines. A dashboard exists, but recent incidents showed the team noticed problems hours later and responses were ad hoc. Constraints: alerts must not expose PII, the on-call team is small, and an internal model risk audit is in 2 weeks.
What is the BEST next action to improve operational control of this AI solution?
Best answer: B
What this tests: Operationalize AI Solution
Explanation: Operationalizing an AI solution requires proactive detection and consistent response. Defining thresholds for key metrics and configuring alerts enables timely notification without relying on manual dashboard checks. Pairing alerts with response playbooks (triage, escalation, rollback, communications) reduces variability, fits a small on-call team, and provides audit-ready evidence of control.
The core concept is closing the loop between monitoring and action: specify what “bad” looks like (thresholds) and what to do when it happens (playbooks). In this scenario, delayed detection and ad hoc responses are the main operational gaps, and the constraints require PII-safe alerting, efficient on-call handling, and auditability.
A practical next step is to:
Dashboards and retraining can be part of the broader plan, but they do not replace actionable alerting plus standardized response.
This establishes actionable alerting for metric breaches and standardized runbooks that meet the PII and audit constraints.
Topic: Manage AI Model Development and Evaluation
A company plans to pilot an ML model that ranks job applicants. To meet a deadline, the team tests only overall precision/recall on a random holdout set and skips subgroup performance and fairness testing (e.g., by gender and ethnicity). During the pilot, the bottom 30% of applicants will be automatically rejected with no recruiter review.
What is the most likely near-term impact of this testing omission?
Best answer: C
What this tests: Manage AI Model Development and Evaluation
Explanation: For a high-impact hiring decision, QA protocols should include subgroup and fairness testing before any automated rejection is allowed. Without it, the most immediate consequence is that disparate error rates can ship to production-like conditions and surface quickly as candidate complaints and stakeholder pushback. This typically disrupts adoption sooner than longer-cycle outcomes like regulatory actions or drift.
Model testing protocols should scale with risk and impact. In an applicant rejection use case, skipping subgroup performance and fairness checks leaves a critical failure mode untested: the model can perform acceptably in aggregate while systematically rejecting qualified candidates in certain protected groups. Because the pilot includes automated rejection with no human review, these errors will be experienced immediately by real applicants and internal recruiters/HR, often triggering rapid escalation, loss of trust, and a stop/pause to add required QA gates.
Appropriate high-impact QA would typically add:
Compared with drift or audit findings, fairness-related harm is the most direct near-term impact here.
Skipping subgroup testing in a high-impact workflow makes immediate disparate errors likely and quickly visible to applicants and HR.
Topic: Operationalize AI Solution
You are planning deployment for a customer-facing ML scoring API that will support both a mobile app and a call center. Leadership asks you to “size the infrastructure and on-call support” for launch, but the request contains no operational targets or usage estimates.
What should you ask for first before selecting compute, scaling, and support resources?
Best answer: C
What this tests: Operationalize AI Solution
Explanation: Infrastructure and resource planning depends primarily on the workload and the required service levels. Without request volume patterns and targets like latency and availability, you cannot defensibly choose an architecture, scale strategy, or on-call staffing. Establish these operational requirements first, then evaluate options that meet them within constraints.
For deployment and ongoing operations, the first clarifying input is the set of nonfunctional requirements and demand assumptions that drive capacity planning. In this scenario, selecting compute, scaling, and support coverage requires understanding how many inferences will be served (average and peak), how fast responses must be, and what uptime/availability target the business expects.
A practical sequence is:
Other details may matter later, but they do not replace the need for workload and SLOs as the foundation for right-sizing and operational readiness.
Capacity and support planning must start from workload demand and operational targets to size compute, scaling, and on-call needs.
Topic: Identify Business Needs and Solutions
A retail bank is launching an AI solution to auto-triage customer emails to reduce handling time and overtime costs. Constraints: (1) privacy policy prohibits storing raw email text longer than 24 hours, (2) executives want ROI evidence within 90 days of go-live, (3) operations will only accept low-risk rollout with clear rollback criteria, and (4) the pilot must start in 8 weeks. As the AI product manager, what is the BEST next action?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: The next step is to operationalize how ROI will be proven after go-live, not just to assume it will appear. That means selecting ROI metrics tied to the business objective, establishing baselines, and designing a collection and attribution approach that works under the 24-hour raw-text retention limit. A defined cadence, ownership, and rollback-linked success criteria enable credible ROI reporting within 90 days.
ROI for an AI initiative must be measurable post-deployment, traceable to the business objective, and feasible under governance constraints. Here, the team needs a measurement plan that links triage automation to reduced handling time and overtime while respecting the 24-hour raw-text retention rule and the low-risk rollout requirement.
A practical next action is to document a plan that includes:
This produces credible, decision-ready ROI evidence and prevents post-launch “we can’t measure it” surprises.
This creates an auditable ROI measurement plan (baseline, data capture, attribution, cadence, owners) that fits the privacy and rollout constraints and can be executed after deployment.
Topic: Identify Business Needs and Solutions
A retail bank has deployed an AI-assisted triage tool that suggests next-best actions to call-center agents. The sponsor asks you to define ROI metrics and a measurement plan that can be applied over the next 90 days to confirm value after deployment.
Which approach should you NOT use?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: An ROI measurement plan must focus on business outcomes and how they will be measured and translated into financial value after deployment. Offline model metrics indicate technical performance but do not quantify realized benefits in production. A strong plan also defines baselines, attribution, and operational ownership so observed changes can be credibly linked to the AI solution.
ROI metrics are defined in terms of measurable business outcomes (e.g., reduced average handle time, improved resolution rate, increased sales conversion) and a repeatable way to translate those outcomes into financial impact. After deployment, you need a plan that specifies what data will be collected, how you will establish a baseline, how you will attribute changes to the AI solution (vs. seasonality or process changes), and who will review results on a set cadence.
Relying on offline model metrics as “ROI” is an anti-pattern because it ignores adoption, process effects, and real-world constraints; ROI must be demonstrated through production outcomes and credible attribution.
Model quality metrics (e.g., accuracy/F1) are not ROI and must be paired with post-deployment business outcome measurement.
Topic: Support Responsible and Trustworthy AI Efforts
A bank is preparing to operationalize an AI model for loan pre-approval decisions. Internal audit requires evidence that the training dataset can be traced back to its original systems and that every cleaning, filtering, and feature-engineering step can be reproduced for any given model version.
Which artifact best validates this transparency and traceability readiness?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Readiness for transparency and traceability is best validated by documentation that ties the model’s training data to specific source systems and captures each preparation step in a reproducible, version-controlled way. A lineage and data-prep log provides auditable evidence of provenance, transformations, and when/why changes occurred, supporting re-creation of the exact training set used for a given model release.
To demonstrate AI/ML transparency for data, you need evidence that an independent reviewer can reconstruct the training dataset used for a specific model version and trace each field back to its origin. The strongest validation is a version-controlled artifact that captures data provenance and the full preparation pipeline (what was done, in what order, with what logic, and when).
This typically includes:
Charts, meeting notes, or model metrics may be useful project outputs, but they do not provide end-to-end data traceability required for audit and reproducibility.
It directly records where data came from and exactly how it was transformed, enabling reproducibility and auditability.
Topic: Identify Data Needs
You lead an AI initiative to predict 60-day hospital readmissions to reduce penalties by 8% within 9 months. A data readiness assessment finds: 30% of discharge summaries are missing/late, readmission labels differ by facility, and PHI cannot leave the hospital network (no external vendor access).
The CFO wants a recommendation this week on whether to fund full development and commit to an enterprise rollout date. What is the BEST next action?
Best answer: B
What this tests: Identify Data Needs
Explanation: Leadership needs a recommendation that explicitly connects data readiness gaps to outcomes like penalty reduction, timeline risk, and deployment scope. A decision brief that presents options (for example, phased rollout, remediation investment, or re-scoping) enables an informed funding and schedule decision while honoring PHI and governance constraints.
The core practice is to convert data readiness findings into decision-ready guidance: what the gaps mean for expected business value, delivery risk, and what choices leadership can make now. In this scenario, missing/late discharge summaries and inconsistent label definitions directly threaten model validity and the 9-month target, while PHI restrictions constrain staffing/vendor options. The best next step is to synthesize these findings into a concise recommendation that includes business impact, risk exposure, and clear alternatives with decision points (for example, fund data standardization first and run a limited pilot on facilities with consistent labels, or revise the rollout date/scope). This keeps the conversation outcome-focused and supports an explicit go/no-go or phased-go decision rather than premature build commitments.
It translates data readiness findings into business impacts, risks, and phased decision choices leadership can approve under the stated constraints.
Topic: Operationalize AI Solution
When transitioning an AI model from the project team to operational support, which term refers to standardized documentation that summarizes the model’s intended use, key assumptions, evaluation results, limitations/risks, and monitoring/ownership details to support ongoing operations?
Best answer: C
What this tests: Operationalize AI Solution
Explanation: A model card is a concise, standardized artifact that helps operational teams understand what a model is for, how it was evaluated, what its limitations are, and how it should be monitored and governed after deployment. It directly supports a clean handover by clarifying responsibilities and expectations for ongoing support.
A strong AI transition plan includes operational documentation that enables support teams to run, monitor, and govern the model without relying on the original project team. A model card is the commonly used governance artifact for this purpose: it captures the model’s intended use and out-of-scope use, evaluation approach and results, key assumptions and limitations, known risks (e.g., bias or failure modes), and practical operational details such as monitoring signals, escalation paths, and ownership. This reduces ambiguity during handover and makes responsibilities and timelines actionable by giving operations a shared, durable reference point. By contrast, other artifacts focus on data tracking, contractual commitments, or field-level definitions rather than model-specific operational readiness.
A model card is the operational-facing summary of a model’s purpose, performance, limitations, and governance/monitoring details used during handover.
Topic: Identify Business Needs and Solutions
You are initiating an AI project for a bank’s contact center to use a model to recommend whether to approve a customer’s fee waiver request in real time. The sponsor says only, “We need faster decisions and lower costs,” and asks you to define the project’s success metrics.
What should you ask for FIRST before finalizing the success metrics?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: Success metrics must trace to organizational objectives and governance expectations for the decision being automated. Before choosing KPIs, confirm what outcomes matter (e.g., cost reduction vs. customer experience) and what “acceptable” behavior looks like under governance (e.g., explainability, audit requirements, and error tolerance). Without that alignment, metrics can optimize the wrong goal or violate oversight expectations.
Defining success criteria for an AI initiative starts with clarifying what the organization is trying to achieve and what governance will require for the specific decision context. In this scenario, “faster decisions and lower costs” is directionally helpful but not sufficient to select KPIs because it omits the decision owner’s intended outcomes (financial, operational, and customer) and the oversight expectations that constrain how the model can be judged and used.
Ask stakeholders to confirm:
Once these are agreed, you can choose metrics that are both outcome-aligned and governance-compliant, rather than optimizing accuracy or speed in isolation.
You need agreed outcomes and governance expectations (e.g., customer impact, auditability, fairness, error tolerance) to set aligned, measurable success metrics.
Topic: Manage AI Model Development and Evaluation
A team has built a supervised ML model to recommend credit line increases. Performance meets the agreed KPI, but the organization’s model risk committee requires evidence that the solution can be supported and audited after deployment because it affects customer outcomes.
As the AI project lead preparing the go/no-go recommendation, what is the best next step to validate that documentation and operational procedures are sufficient for support and governance?
Best answer: C
What this tests: Manage AI Model Development and Evaluation
Explanation: Because the model impacts customer outcomes and is subject to committee oversight, the decisive factor is governance and auditability at go/no-go. The strongest way to validate sufficiency is an operational readiness review that checks required documentation and procedures and confirms clear ownership for ongoing support.
At go/no-go, “ready” means more than meeting model metrics; it also means the organization can operate, govern, and defend the model in production. For a high-impact decisioning use case, the most direct validation is an operational readiness/governance checkpoint that verifies required artifacts exist, are consistent, and have accountable owners.
A practical readiness review typically confirms:
Further model tuning, security testing, or training may still be needed, but none substitutes for verifying supportability and governance completeness before production.
A formal readiness review validates runbooks, monitoring, escalation, change control, and audit documentation are complete and owned before go-live.
Topic: Identify Business Needs and Solutions
A health insurer launched an AI-assisted claims triage tool. The adoption dashboard shows “strong adoption” based on 1,100 staff completing training and 25 roadshow sessions, yet only 6% of eligible claims are processed using the tool and average claim cycle time has not improved.
Stakeholders report “the model isn’t working,” but recent monitoring shows stable precision/recall, no meaningful drift, no reported privacy incidents, and no new bias signals.
What is the most likely underlying cause of the perceived adoption problem?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: The clues show the model is performing as expected and there are no new privacy or bias issues. The main mismatch is that “adoption” is being measured with activity (training/roadshows) instead of true usage and outcome metrics (eligible-case utilization and cycle-time impact). That makes adoption look healthy while the solution is not being used or delivering value.
Adoption should be measured with metrics that reflect actual product use and realized outcomes, not just enablement activity. In this scenario, training completion and roadshows are leading indicators, but the critical adoption signals are low utilization (only 6% of eligible claims) and no improvement in the intended business KPI (cycle time). Because model monitoring shows stable performance with no meaningful drift—and there are no privacy or bias signals—the most likely root cause is a measurement approach that overweights activity metrics and fails to surface workflow/integration barriers, incentives, or process fit issues.
A better adoption measurement set would include, for example:
When metrics emphasize outcomes and real usage, adoption risks become visible early and can be addressed.
Counting trainings and sessions can look “successful” while masking low real usage and no business impact, driving a false narrative that the model is failing.
Topic: Identify Data Needs
A healthcare company is building an AI model for appointment no-show prediction. The solution uses regulated patient data and must follow governance requirements: least-privilege access, separation of development and testing environments, and auditable access logging. The team is setting up secure workspaces for data science.
Which approach SHOULD AVOID to align environments with governance and access controls?
Best answer: C
What this tests: Identify Data Needs
Explanation: Secure AI workspaces rely on separation of environments, least-privilege access, and traceable user actions. A shared admin account breaks these controls by expanding privileges beyond need and eliminating individual accountability. Governance-aligned setups require identity-based access and auditable logs tied to specific users and roles.
The core control for secure AI development and testing environments is governance-aligned access: users get only the minimum permissions needed, environments are separated to reduce blast radius, and all access is attributable to an individual identity for auditability. In the scenario, regulated patient data increases the importance of accountability and controlled access paths.
Practical governance-aligned measures include:
Using a shared admin account undermines least privilege and makes it difficult to prove who accessed what, which is a direct misalignment with the stated governance requirements.
Shared admin credentials bypass least-privilege, weaken accountability, and make access auditing unreliable.
Topic: Manage AI Model Development and Evaluation
A team has completed a candidate churn prediction model. Offline testing meets the agreed accuracy and latency targets, but the fairness assessment shows higher false-positive rates for one customer segment, and the operations team has not yet finalized monitoring thresholds or a rollback plan. The business asks to deploy next week to hit a campaign date.
Which governance approach best matches an appropriate go/no-go operationalization decision in this situation?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: This calls for a structured go/no-go decision based on model and operational readiness, not just offline performance. A conditional approval allows limited deployment only when the team documents specific entry conditions, remediation actions, and accountable owners. It also ensures follow-ups (monitoring, rollback readiness, and fairness checks) are defined before exposure increases.
The core concept is a model operationalization readiness gate (go/no-go) that explicitly captures the decision rationale, release conditions, and required follow-ups. Here, offline metrics are adequate, but there are unresolved risks: a fairness gap for a segment and incomplete production safeguards (monitoring thresholds and rollback plan). A risk-based gate can proceed only as a conditional go (often for a limited pilot) when the decision record documents what must be true before expanding use.
Typical conditions and follow-ups to document include:
Pushing to production without these controls turns the go/no-go into an implicit risk acceptance without accountability.
A formal go/no-go gate can approve only with explicit conditions, owners, and follow-ups to close fairness and operational readiness gaps.
Topic: Support Responsible and Trustworthy AI Efforts
A health insurer is deploying an AI model that automatically approves or denies certain medical claims. During a pilot, a model update caused a spike in incorrect denials, and operations teams were unsure who could authorize a rollback, who owned customer-impact decisions, and when to escalate to compliance. Because the system makes high-impact decisions, what is the best way to clarify ownership and escalation paths for AI outcomes, changes, and incidents in production?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: High-impact AI decisions require explicit decision rights, not just technical documentation or monitoring. A documented accountability structure that names who is accountable for business outcomes and who is responsible for technical actions—paired with a defined escalation and rollback path—prevents confusion during incidents. Tying it to formal incident/change processes also preserves the audit trail needed for review.
For AI in operations—especially when it can directly affect customers—accountability must be operationalized: who is accountable for the decision outcomes, who can approve changes, and who can trigger/authorize rollback or suspension. The most effective approach is to document and socialize a clear ownership and escalation structure (often a RACI) and connect it to existing incident and change management so every event is traceable.
A practical setup includes:
Model documentation and monitoring support governance, but they do not, by themselves, establish decision rights and escalation paths.
A named accountability matrix plus a documented incident/change escalation runbook creates clear decision rights (including rollback) and an auditable trail for high-impact AI operations.
Topic: Identify Business Needs and Solutions
You are rolling out an AI agent-assist tool for a customer support center to reduce average handle time by 8% this quarter. After a 3-week pilot, model quality meets targets, but adoption is only 30% and supervisors report agents don’t trust when the tool suggests next-best actions. Policy requires a governance checkpoint and change control before expanding to 5 more sites in 6 weeks, and no customer PII can be shown in the UI. What is the BEST next action?
Best answer: C
What this tests: Identify Business Needs and Solutions
Explanation: Adoption signals show the solution is not operationally ready to scale even if model metrics look acceptable. The best action is to use the scheduled governance checkpoint to re-baseline the rollout plan (phasing, training, communications, and human oversight) and route any changes through change control. This preserves trust, policy compliance, and a controlled expansion timeline.
In CPMAI, rollout readiness includes adoption and integration—not just model performance. With only 30% usage and low trust, scaling to additional sites would likely fail to realize the business outcome (handle-time reduction) and could increase operational risk. Because policy requires a governance checkpoint and change control, the appropriate next step is to pause expansion, review adoption evidence (usage, override rates, qualitative feedback), and update the rollout and change-management plan while confirming safeguards (including the no-PII UI constraint) remain intact.
A practical adjustment package typically includes:
This approach addresses the adoption risk while maintaining required governance discipline.
Low adoption is a readiness risk, so you should use the required checkpoint to adjust rollout and change-manage under governance before scaling.
Topic: Support Responsible and Trustworthy AI Efforts
Your team is iterating a credit-limit recommendation model. After a recent performance update, internal audit asks you to prove (1) which training dataset version was used, (2) what model changes were made and why, and (3) who approved the change for production.
Which evidence/artifact best validates that you maintained an auditable trail of key AI decisions across iterations?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: An audit trail requires traceability, not just outcomes or communications. A versioned lineage and decision log ties each release to the exact data snapshot, the specific model change, the documented rationale, and the recorded approver(s). That directly supports governance, reproducibility, and accountability across iterations.
To demonstrate an auditable trail for AI decisions, you need evidence that connects “what changed” and “who approved” to “what was trained and deployed.” The strongest validation is a controlled, versioned set of records that links:
This allows an auditor to reconstruct each production release and verify that changes followed the organization’s approval process. Performance dashboards, delivery artifacts, or informal notes may be helpful context, but they do not reliably provide end-to-end traceability across data, model, and approvals.
This provides end-to-end traceability from data selection through model changes and formal approvals for each release.
Topic: Manage AI Model Development and Evaluation
A team is building a loan pre-approval model and has run extensive hyperparameter tuning over the last two sprints. Reported AUC ranges from 0.71 to 0.83 for what the team calls “the same dataset,” and the fairness metric for one protected group swings widely across reruns. In a shadow deployment, a drift alert appears immediately after a retrain, but no one can confirm which exact model configuration is running. Business stakeholders pause adoption because results seem inconsistent week to week.
The team admits most tuning was done in shared notebooks and only the “best” metrics were copied into a spreadsheet; there are no saved run configurations, random seeds, or dataset/version identifiers tied to each result.
What is the most likely underlying cause?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: The clues point to uncontrolled hyperparameter tuning with no run-level logging of configurations, seeds, or data versions. That breaks reproducibility, prevents confident selection of the true “best” model, and makes it impossible to verify what was deployed when performance, fairness, and drift signals appear. Coordinated experiment tracking is the core control that resolves these symptoms.
The core issue is lack of traceability across hyperparameter tuning experiments. When teams run many trials without capturing the full run context (hyperparameters, preprocessing code version, random seed, train/validation split, and dataset/feature version), the same “experiment” can produce different outcomes and the chosen model cannot be reproduced or audited. This also makes operational signals (like an immediate drift alert after retraining) ambiguous because the team cannot definitively map monitoring results to a specific model build.
A practical control is to coordinate tuning through an experiment management approach that records, for each run:
This directly addresses inconsistent metrics, unstable fairness results, and uncertainty about what is running.
Without logging hyperparameters, seeds, and data/model versions per run, results become irreproducible and the deployed model cannot be confidently identified.
Topic: Identify Business Needs and Solutions
A healthcare provider is launching a 12-week pilot to predict appointment no-shows. The team needs an MLOps toolset and a data-labeling service, but patient data (PHI) cannot leave the organization’s controlled environment, and security requires audit logs and role-based access. Procurement warns the solution may need to scale later to additional use cases, so the sponsor wants to avoid vendor lock-in while still meeting the pilot timeline.
What is the BEST next action to coordinate procurement?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: To avoid vendor lock-in while meeting strict PHI and audit requirements, procurement should be driven by vendor-agnostic capabilities and nonfunctional requirements rather than a preselected product. A structured RFI/RFP with clear evaluation criteria enables rapid comparison of toolsets and services that can operate in the required environment. Including portability and exit terms preserves long-term flexibility if the pilot scales.
The core practice is to procure AI tools and services using outcome- and capability-based requirements, not vendor-specific implementations. In this scenario, constraints (PHI staying in a controlled environment, required auditability/RBAC, and a short pilot timeline) must be translated into measurable requirements and selection criteria that multiple vendors can satisfy.
A strong next step is to align procurement and stakeholders on:
This approach supports rapid selection without locking the organization into one proprietary stack, unlike prematurely committing to a single platform or bypassing procurement governance.
This aligns procurement to functional needs and security constraints while preserving optionality through portability and contractual exit provisions.
Topic: Operationalize AI Solution
A bank’s AI-based fraud scoring service is in production. Overnight, customer support reports a surge in declined legitimate transactions, and the operations lead wants to push an emergency configuration change (lower the decision threshold) immediately.
As the AI product manager, what should you verify or obtain FIRST before deciding whether to implement an emergency change or rollback?
Best answer: C
What this tests: Operationalize AI Solution
Explanation: Before acting on an emergency change in production, you need to know whether this qualifies as a critical incident and who is empowered to make the decision. Confirming the predefined escalation procedure (including roles like incident commander and change approver) ensures rapid action is taken by the right authority with proper accountability.
In operationalizing AI solutions, critical incidents and emergency changes must follow a predefined incident response and change control process. When a production issue emerges, the first clarification is governance-related: is the event a formally defined critical incident (by severity/impact criteria), and who has decision authority to approve an emergency change or rollback. This prevents ad hoc actions, ensures accountability, and triggers the correct escalation, communications, and logging requirements.
Practical items to confirm include:
Performance metrics and business impact can inform the decision, but they come after authority and escalation are established.
You must confirm the escalation path and decision authority for a critical incident before executing emergency change actions.
Topic: Operationalize AI Solution
A fraud-detection model is in production. The data science team wants to deploy an “improved” model this week after retraining on newer data, and they plan to overwrite the current model artifact in the deployment pipeline.
As the AI project manager overseeing model governance, what should you ask to verify/obtain FIRST to ensure model versioning and change control maintain end-to-end traceability across this update?
Best answer: C
What this tests: Operationalize AI Solution
Explanation: Overwriting a production model breaks traceability unless the new model is uniquely versioned and linked to the exact training data, code, and configuration used to produce it. The first step is to confirm there is a controlled change record and that all lineage artifacts are captured in a registry or equivalent system. This enables auditability, rollback, and consistent comparisons across versions.
The core control for traceability across model updates is a governed, uniquely identifiable model version that is linked to the evidence needed to reproduce and audit it. Before approving a deployment that would overwrite an existing artifact, confirm that the change is recorded and that the new model is registered with pointers to immutable lineage artifacts (for example, code commit, feature pipeline version, training/validation data snapshot or dataset version, configuration/hyperparameters, evaluation report, and approvers). With those in place, you can compare performance to the baseline, support rollback, and demonstrate what changed and why. Cost, new features, and contracting may matter later, but they do not establish traceability for this specific update.
Traceability requires a uniquely identified version tied to immutable lineage artifacts and an auditable change approval record before any overwrite.
Topic: Identify Data Needs
An AI project manager must write a one-page executive summary after the initial data assessment for a customer churn prediction use case. Leadership wants to decide whether to fund a full model build next quarter.
Which principle best fits what the executive summary should emphasize to support this decision?
Best answer: C
What this tests: Identify Data Needs
Explanation: For an executive funding decision, leaders need a concise, decision-oriented view of whether the data can support the use case and what could derail delivery or outcomes. The summary should translate the data assessment into readiness status, constraints (access, quality, coverage, governance), and the most important risks with business impact. This supports a clear go/no-go or conditional go recommendation.
In CPMAI data understanding communications, an executive summary is not a technical report; it is a decision artifact. It should state whether the available data is fit (or not yet fit) for the intended AI outcome, what constraints must be resolved (e.g., missing target labels, limited history, restricted access, unclear lineage/consent), and the highest-impact risks and dependencies (e.g., bias risk from unrepresentative segments, data leakage risk, inability to refresh data on schedule).
A practical structure is:
Technical deep dives belong in appendices; leaders need clarity on feasibility and risk first.
An executive summary for a go/no-go decision should clearly state data readiness, constraints, and material risks in business terms.
Topic: Manage AI Model Development and Evaluation
You are selecting an initial model technique for a credit pre-screening use case. Business requirements include: minimum AUC of 0.80, average scoring latency under 50 ms, and the ability to provide human-understandable reason codes for decline explanations.
Exhibit: Evaluation summary (validation set)
Logistic regression: AUC 0.79 | Latency 5 ms | Explainability High
Gradient-boosted trees: AUC 0.82 | Latency 35 ms | Explainability Medium (reason codes feasible)
Neural network: AUC 0.83 | Latency 120 ms | Explainability Low
kNN: AUC 0.81 | Latency 210 ms | Explainability Low
Based on the exhibit, which model technique should you advance to the next stage (robustness/fairness testing and pilot planning)?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: The best technique is the one that satisfies the stated acceptance constraints while remaining fit for the use case. Gradient-boosted trees clear the minimum AUC and latency targets and are noted as capable of producing reason codes, making them suitable to move forward into deeper validation and piloting.
Model technique selection should be driven by the use case’s success criteria and operational constraints, not just the highest accuracy metric. Here, the gate criteria include a minimum AUC, a strict latency ceiling for real-time decisions, and explainability sufficient to communicate decline reasons. The evaluation summary shows gradient-boosted trees are the only candidate that simultaneously meets the AUC threshold and the latency requirement and is explicitly assessed as able to generate reason codes. That makes it the best-fit technique to advance into the next stage (e.g., robustness checks, fairness analysis, calibration review, and pilot planning) before committing to production.
It is the only option that meets both the AUC and latency requirements while still supporting usable reason codes.
Topic: Support Responsible and Trustworthy AI Efforts
A retail bank is building an AI model to pre-screen credit card applications to reduce manual review time by 30%. Constraints: policy prohibits using protected attributes (e.g., race/ethnicity) in model training, but allows their use for fairness auditing in a restricted-access environment; the bank has low risk tolerance for disparate impact; go/no-go is in 3 weeks; operations will only adopt the model if bias findings include a clear root-cause hypothesis.
After initial testing, the model performs well overall but has a much higher false-negative rate for applicants from certain postal codes. What is the BEST next action to identify likely sources of bias before deciding on mitigation?
Best answer: A
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: The disparity tied to postal codes could come from biased sampling in data collection, inconsistent or biased labels, proxy effects in model design, or feedback loops from prior decisions. The best next step is a structured bias triage that uses governed access to protected attributes for auditing and slice-based evaluation, while also examining how outcomes/labels were created. This produces defensible root-cause hypotheses before choosing mitigations.
When a fairness issue appears in a subgroup (here, certain postal codes), the priority is to identify the most likely bias source(s) before applying fixes. In CPMAI terms, that means checking the full pipeline: whether the data collected is representative of the population, whether labels reflect biased historical decisions (or inconsistent labeling practices), whether the model is relying on proxy variables, and whether downstream decisions could create feedback loops (e.g., fewer approvals leading to fewer “good outcome” labels for that area).
A practical triage is to:
Changing features, thresholds, or deploying first are mitigation/operational steps that risk treating symptoms without locating the cause.
It directly investigates bias sources across collection, labeling, model behavior, and potential feedback effects while respecting the training restriction and governance needs.
Topic: Manage AI Model Development and Evaluation
Which term best describes a governance artifact that documents a model’s algorithm selection criteria and the rationale for key design choices (e.g., intended use, evaluation results, limitations) to support review and approval?
Best answer: A
What this tests: Manage AI Model Development and Evaluation
Explanation: A model card is a standardized artifact for recording why an approach was chosen and how it was evaluated, including intended use, performance, and known limitations. This documentation supports governance, auditability, and stakeholder review of the model development decisions.
To support review and governance, teams need a consistent way to explain what model was built, why specific techniques were selected, how the model was evaluated, and what constraints or risks remain. A model card provides that decision rationale and summary evidence in a single, review-friendly format (often including intended use, out-of-scope use, data/evaluation notes, performance metrics, limitations, and ethical considerations). This makes model development decisions traceable and easier to approve, monitor, and revisit during future changes or incidents. In contrast, artifacts like lineage or PII inventories address data traceability and privacy, not algorithm selection rationale.
A model card is used to document model choices, performance, intended use, and limitations for governance review.
Topic: Manage AI Model Development and Evaluation
You are leading an AI initiative to predict loan default risk. During initial data profiling (before any data preparation), the team finds 18% missing income values, 6% duplicate customer records, and inconsistent product codes across two source systems. A governance board meeting is scheduled tomorrow and requires an auditable record of the findings, the go/no-go decision, and the agreed remediation next steps.
Which deliverable best meets this need?
Best answer: C
What this tests: Manage AI Model Development and Evaluation
Explanation: A governance review needs a traceable, stakeholder-ready artifact that summarizes data quality results and ties them to a decision and an action plan. A data quality assessment report (often paired with a decision and remediation log) provides the evidence, rationale, owners, and timelines needed for a go/no-go checkpoint. This directly supports accountable sign-off before data preparation begins.
The core need is not just identifying data defects, but documenting them in a way that supports a governance checkpoint and an auditable go/no-go decision. A data quality assessment report should summarize the profiling methods and key metrics (e.g., missingness, duplicates, consistency), compare results to agreed acceptance criteria, and record the decision rationale.
It should also include a clear remediation plan so stakeholders can act:
Artifacts like model cards, data dictionaries, and experiment logs are useful, but they do not primarily document data-quality findings and the governance decision path at this stage.
It captures quantified issues, the go/no-go rationale, owners, and next steps in an auditable format for governance review.
Topic: Identify Data Needs
A retailer is launching an AI demand-forecasting solution that must incorporate new sales and inventory transactions daily to remain accurate. The data includes customer-linked identifiers, and operations requires an auditable process for ongoing data collection, refresh, and quality control before go-live.
Which metric/evidence/artifact best validates that the team is ready to support the required continuous updates?
Best answer: C
What this tests: Identify Data Needs
Explanation: Readiness for continuous updates is best validated by evidence that the team has operationalized a repeatable, governed data refresh process. An approved runbook (or equivalent operating procedure) makes the refresh cadence, ownership, SLAs, and data-quality gates explicit and auditable. This directly supports ongoing data collection and refresh required by the use case.
For use cases that need continual data updates, the key validation is not a one-time dataset or model score—it is proof that the organization can reliably run the data lifecycle in production. The strongest evidence is an agreed and approved data refresh operating procedure (runbook) that defines how new data is ingested, validated, and made available for downstream training/scoring while meeting governance needs (auditability, access control, and accountability).
A fit-for-purpose runbook typically specifies:
This demonstrates the process for ongoing data collection and refresh that the use case requires, rather than a point-in-time artifact or a vanity measure of progress.
A signed-off operational runbook demonstrates a defined, governed process to continuously collect, validate, and refresh data in production.
Topic: Operationalize AI Solution
A retail bank is ready to deploy a new machine-learning model that recommends credit limit changes. The decision affects many customers, customer service workload, and compliance reporting. The MLOps team has only basic uptime monitoring in place; automated drift and fairness alerts are planned but not yet implemented. Which rollout approach SHOULD AVOID for the initial production release?
Best answer: C
What this tests: Operationalize AI Solution
Explanation: When stakeholder impact is high and monitoring maturity is low, the safest deployment is incremental and easily reversible. Approaches like shadow mode, canary, or phased release reduce blast radius while the team validates performance and establishes operational monitoring. A one-time full cutover concentrates risk and makes issues harder to detect and contain early.
Rollout strategy should match operational risk. With high-impact decisions (customer outcomes, regulatory reporting) and immature monitoring (limited drift/fairness detection), the priority is to limit blast radius and create fast feedback loops. Incremental releases also provide controlled data to validate real-world behavior before scaling.
Practical choices include:
The anti-pattern is a “big-bang” switch that exposes the entire population before observability and safeguards are proven in production.
A full “big-bang” cutover is highest risk when monitoring and rollback controls are not yet mature.
Topic: Support Responsible and Trustworthy AI Efforts
A bank piloted an AI model to pre-approve small-business loans. Overall performance meets targets, but the pilot shows materially lower approval rates for applicants from rural areas. The team suspects the issue may come from how training data were collected, how historical approvals were used as labels, feature choices (e.g., location-related proxies), or a post-deployment feedback loop.
What is the best next step?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Before changing the model or expanding data collection, the team should pinpoint where the disparity originates. A structured bias check combines subgroup metric analysis with audits of data collection, labeling practices, and feature proxies to determine whether the issue is representation, measurement/label bias, or model design. That evidence then drives the right mitigation and governance decision.
When a disparity appears despite acceptable overall metrics, the next step is to localize the likely bias source rather than jumping straight to mitigation. In this scenario, rural applicants could be underrepresented in training data (collection bias), historical approval labels may encode past inequities (label/measurement bias), location-related variables may act as proxies for protected characteristics (model/feature design bias), or approvals could influence future applicant pools (feedback loop).
A practical next step is to:
This creates defensible evidence for targeted remediation (data changes, relabeling, feature constraints, or process controls) instead of premature retraining or deployment.
Bias source identification starts with subgroup performance, representativeness, label validity, and proxy/feature review before choosing mitigations.
Topic: Identify Data Needs
During discovery for a customer churn model, Marketing defines “active customer” as anyone with a login in the last 90 days, while Finance defines it as anyone with a paid invoice in the last 90 days. Each team points to different reports, and the mismatch is blocking data extraction and KPI baselining. Which data governance approach best matches the principle for resolving these conflicting interpretations?
Best answer: D
What this tests: Identify Data Needs
Explanation: Conflicting metric meanings are a semantic governance issue, not a modeling or data-quality optimization problem. The right move is to align to an authoritative, documented definition (business glossary/data dictionary) and record the approved definition and source with accountable data SMEs (data owner/steward) so downstream extraction and KPIs are consistent.
When teams disagree on what a field or KPI means (for example, “active customer”), the primary risk is inconsistent labeling, inconsistent extracts, and non-reproducible results. The appropriate principle is to resolve the conflict by anchoring the term to governed metadata: a documented definition in a business glossary/data dictionary, tied to an authoritative source (system of record) and approved by the accountable data SME (data owner/steward). This creates a durable reference that the team can use for data requirements, extraction logic, and KPI baselines, and it provides traceability for audits and future iterations.
Key takeaway: don’t “pick a definition” based on convenience or model performance; formalize the definition and its authoritative source through data governance.
It resolves semantic conflicts by aligning to a documented, governed definition and an authoritative source of record owned by accountable data SMEs.
Topic: Support Responsible and Trustworthy AI Efforts
A team is deploying a customer-support AI assistant that uses conversation transcripts containing PII (names, emails, account numbers). The solution includes a data lake for training, an API for real-time inference, and centralized logging for troubleshooting. Security requires encryption for data at rest and in transit, and privacy requires that logs must not store PII; the team also needs fast incident investigation without slowing the release.
Which approach best optimizes risk reduction while meeting these constraints?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: The best choice applies secure data handling consistently across the full data path: storage, network transport, and operational telemetry. Encrypting at rest and in transit reduces exposure from compromised storage or network interception. Preventing PII from being written to logs (via redaction/tokenization) meets privacy constraints while still enabling effective troubleshooting.
Optimizing privacy and security for AI solutions requires end-to-end controls because sensitive data flows through multiple components (data stores, APIs, and observability tooling). In this scenario, encryption at rest protects training data, features, model artifacts, and backups; encryption in transit (e.g., TLS) protects data moving between services; and log hygiene ensures centralized logs don’t become an ungoverned copy of PII.
A practical, low-friction pattern is:
This satisfies the “no PII in logs” constraint while preserving incident response capabilities.
It covers at-rest, in-transit, and logging controls while preserving operational observability without retaining sensitive data.
Topic: Manage AI Model Development and Evaluation
Which term refers to the documented, step-by-step operational procedures used to support a production AI model (for example monitoring checks, alert triage, escalation, rollback, and who is on call) as part of a go/no-go readiness review?
Best answer: C
What this tests: Manage AI Model Development and Evaluation
Explanation: A model operations runbook is the primary artifact for validating that operational procedures are in place before production release. It enables consistent monitoring, incident response, rollback, and clear handoffs to support teams. This directly supports governance by clarifying roles, controls, and operational decisions during issues.
For a go/no-go decision, you must confirm the solution can be supported day-to-day and governed when something goes wrong. The core documentation for that operational readiness is a model operations runbook: a practical, step-by-step guide describing how to run, monitor, troubleshoot, and recover the model service, including ownership (on-call/escalation) and actions like rollback.
A runbook typically covers:
A model card supports transparency about the model, but it does not replace operational procedures needed for support.
A runbook captures the repeatable procedures and ownership needed to operate and support the model in production.
Topic: Identify Data Needs
A team is preparing a customer-support text dataset to send to an external annotation vendor for an AI intent-classification pilot. The team masked obvious PII columns (name, email) and replaced customer IDs with tokens.
Two days after transfer, the privacy office reports a potential incident: the vendor found several records containing unredacted payment card numbers and home addresses inside the free-text “case notes” field. The vendor also received a full database extract that included fields not listed in the data-sharing request.
Which is the most likely underlying cause?
Best answer: C
What this tests: Identify Data Needs
Explanation: The incident points to a breakdown in pre-sharing privacy controls: sensitive data was not discovered and removed from unstructured fields, and the shared extract included unnecessary attributes. The appropriate diagnosis is inadequate data discovery/classification, de-identification, and data minimization before providing data to a vendor.
Before sharing data with broader teams or vendors, you must confirm that all sensitive data is identified and controlled—not just obvious structured columns. Here, the clues are that payment card numbers and addresses were embedded in free-text notes (often missed by column-based masking) and that the vendor received extra fields beyond the approved request (a minimization and access-control failure).
A sound remediation approach is to run sensitive-data discovery on the entire dataset (including unstructured text), apply approved de-identification/redaction, minimize to only the fields required for the use case, and enforce the approved extract via governed access and transfer controls. The other choices describe downstream outcomes (drift, adoption, performance) and do not explain a confirmed privacy exposure during data sharing.
PII remained in unstructured text and extra fields were shared, indicating missing end-to-end data scanning and minimization before transfer.
Topic: Support Responsible and Trustworthy AI Efforts
An insurer is preparing to deploy an AI assistant that summarizes customer claim calls and suggests next steps to adjusters. The assistant will process call transcripts that may include health details and will send prompts and context to a third-party model hosting provider. This is a new use of the data and will influence claim handling decisions.
Which evidence best validates privacy governance readiness for this deployment?
Best answer: C
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Because the solution introduces a new, high-impact processing activity involving potentially sensitive personal data and third-party processing, privacy readiness is best validated through a completed and approved privacy impact assessment (PIA/DPIA). The assessment should map data collection and sharing, evaluate necessity and proportionality, identify privacy risks, and document mitigations and residual-risk approvals before deployment.
A privacy impact assessment is warranted when an AI deployment creates or materially changes personal-data processing in ways that increase privacy risk (for example, new purposes, sensitive data, third-party sharing, or automated decision support affecting individuals). In this scenario, call transcripts may contain health information, data will be transmitted to a model provider, and outputs will influence claim handling—together signaling elevated privacy risk.
A PIA/DPIA is the best readiness evidence because it evaluates and documents:
Security testing and model quality evidence are important, but they do not replace the privacy risk assessment and governance decision needed before go-live.
This high-risk, new processing of sensitive data requires a PIA/DPIA that assesses privacy risks end-to-end and documents controls and sign-offs.
Topic: Operationalize AI Solution
A customer-facing AI virtual assistant for a health insurer begins returning snippets of other members’ claim notes in its responses. The on-call team confirms the issue is reproducible and could expose personal data to any user.
Which contingency/incident response procedure is the BEST immediate action to follow, given this situation?
Best answer: A
What this tests: Operationalize AI Solution
Explanation: Because the assistant is exposing member information, the decisive factor is a potential privacy/security incident. The procedure should prioritize rapid containment to stop harm, preserve evidence for investigation, and route escalation through predefined privacy/legal and executive channels. Model improvements can follow only after the incident is controlled.
Incident response for harmful AI outputs should be risk-based, and suspected personal-data exposure demands the highest urgency. In this scenario, the right procedure is to execute a predefined incident response runbook that both limits further leakage and supports accountable investigation and communications.
A practical immediate sequence is:
Tuning prompts, retraining, or “observing longer” can reduce future recurrence, but they do not stop current harm and can compromise evidence if changes are made before capture.
Potential personal-data exposure requires immediate containment, evidence preservation, and formal escalation per the incident response procedure.
Topic: Operationalize AI Solution
An AI team operates a customer-service chatbot in production. The product owner asks for a monthly performance report to executives after seeing a spike in containment rate last week. You notice (1) the last 48 hours of interaction logs are not yet ingested, (2) a new policy article went live midweek, and (3) the report audience is likely to treat the metric change as “proven improvement.”
Which CPMAI-aligned principle best matches how you should communicate the performance report?
Best answer: A
What this tests: Operationalize AI Solution
Explanation: Operational AI metrics must be communicated transparently with sufficient context so stakeholders do not misinterpret short-term changes. In this case, known data latency and a midweek knowledge-base change can inflate or distort the observed containment rate. The best practice is to report the metric along with uncertainty and clear caveats about what the numbers do and do not yet represent.
The core concept is trustworthy operational performance reporting: provide regular metrics, but also the conditions required to interpret them correctly. When you know the data is incomplete (ingestion latency) or the operating environment changed (new policy content), a raw KPI can look better or worse for reasons unrelated to model quality. Your report should therefore include caveats (data freshness window, known change events, comparison baseline, and any uncertainty bands or confidence ranges where appropriate) and clearly separate observed results from validated, sustained improvement. This preserves decision quality while maintaining a reliable governance trail for how performance was assessed and communicated.
Stakeholder performance reporting should include limitations (e.g., data latency, change events) to prevent overinterpretation and support sound decisions.
Topic: Identify Data Needs
Which statement best defines a privacy impact assessment (PIA) for an AI initiative’s data usage?
Best answer: C
What this tests: Identify Data Needs
Explanation: A privacy impact assessment is performed when an AI use case introduces or significantly changes the processing of personal data in ways that could increase privacy risk. Its value is not just the analysis but the documented outcomes: data processing description, identified risks, planned mitigations, residual risk acceptance, and sign-offs needed to proceed.
A PIA is a structured privacy risk assessment focused on how data is collected, used, shared, stored, and retained—especially when an AI initiative proposes new or materially changed processing of personal data or otherwise elevates privacy risk (for example, new data combinations, expanded purposes, or increased data access). The key deliverable is documented outcomes that make the decision auditable, typically including the processing description/data flows, identified privacy risks, recommended controls/mitigations, residual risk and who accepts it, and any required approvals or follow-up actions. This differs from security-only reviews (which focus on technical safeguards), data quality work (which focuses on fitness for modeling), and model documentation (which focuses on model behavior and intended use).
A PIA is triggered by potentially higher-risk personal-data use and produces documented privacy risks, mitigations, and decisions.
Topic: Support Responsible and Trustworthy AI Efforts
A retail bank is building an AI model to route and summarize customer support chat transcripts. The data includes account numbers and other PII. The product owner wants fast iteration, but the solution must pass internal security review, support audits (traceability), and limit privacy risk across collection, training, inference, and retention.
Which end-to-end data handling procedure best optimizes risk reduction while meeting these constraints?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: The best choice uses layered controls across the full lifecycle: data minimization, strong protection of sensitive fields, environment separation, and controlled logging and retention. This reduces the likelihood and impact of unauthorized access while still supporting traceability for audits. It also keeps iteration possible without spreading raw PII across teams and systems.
Secure end-to-end data handling is a lifecycle design problem: you want to reduce exposure of sensitive data while preserving the minimum traceability needed for operations and audits. In this scenario, the strongest approach is defense-in-depth applied consistently from collection through deletion.
This balances speed with privacy/security by enabling iterative work on protected data rather than distributing raw transcripts or retaining sensitive artifacts unnecessarily.
It applies defense-in-depth controls across collection, training, inference, and retention while preserving auditability and enabling iterative development.
Topic: Identify Business Needs and Solutions
A product owner asks an AI team to “use machine learning to reduce call-center costs.” The team drafts a problem statement but does not specify the target users, the decision the model will support, a baseline, or measurable success criteria. The sponsor tells the team to start data exploration anyway to “show progress” and align later.
What is the most likely near-term impact of proceeding without clear desired outcomes and testable success criteria?
Best answer: A
What this tests: Identify Business Needs and Solutions
Explanation: Clear problem statements and desired outcomes translate into measurable success criteria and acceptance tests. When a team starts work without them, early decisions about data, labels, and evaluation targets are made without a shared definition of value. The immediate result is misalignment, scope churn, and rework when stakeholders later clarify what they actually need.
The core concept is that a problem statement must lead to a desired outcome that can be validated with explicit success criteria (who benefits, what decision improves, baseline, and measurable target). In the scenario, “reduce call-center costs” is an aspirational goal, not a testable outcome for an AI MVP, so the team cannot set acceptance criteria or choose appropriate evaluation metrics and thresholds.
Near-term, this causes:
Operational concerns like drift, audits, and infrastructure cost are important, but they typically become dominant later, after the use case and success criteria are defined.
Without a testable outcome and success criteria, the team cannot align on what “good” looks like, causing mis-scoped work and rework during evaluation and stakeholder reviews.
Topic: Identify Business Needs and Solutions
A project team proposes an AI assistant to deflect contact-center calls. The business case claims $2.0M annual savings based on a 20% reduction in average handle time, but the team did not validate the baseline metrics with Operations or confirm that a planned policy change next quarter will also reduce handle time.
The steering committee is about to decide whether to fund the pilot. What is the most likely near-term impact of this omission?
Best answer: D
What this tests: Identify Business Needs and Solutions
Explanation: Because the savings estimate is built on unvalidated baselines and ignores a confounding policy change, the financial benefits are not credible enough for an investment decision. The most immediate consequence is a governance or finance challenge that forces the team to revise assumptions and provide evidence. This typically delays approval or reduces committed funding until the benefits are substantiated.
For an AI business case, you must gather and validate financial inputs (costs, baselines) and benefit assumptions (causal drivers, attribution, and timing). In this scenario, the claimed $2.0M benefit depends on a baseline average-handle-time metric that was not confirmed with Operations and is likely to be affected by a separate policy change. That creates an attribution risk (you cannot tell what savings come from AI versus other changes) and a credibility gap that investment committees commonly address immediately.
Practical validation steps include:
The near-term effect is decision friction (rework, delayed approval, or reduced scope), not model-quality or post-deployment outcomes.
Unvalidated baselines and overlapping initiatives undermine the credibility of savings estimates, triggering rework and postponing the investment decision.
Topic: Manage AI Model Development and Evaluation
A team built a customer credit line increase model using customer transaction history and demographic attributes (PII). The organization’s model risk committee requires a go/no-go package for high-impact decisions, and internal audit may review the evidence later. The model meets performance targets, but the product owner wants to launch quickly.
What should the AI project manager request as the best go/no-go evidence package to confirm responsible AI checks are complete and recorded?
Best answer: D
What this tests: Manage AI Model Development and Evaluation
Explanation: Because the model supports a high-impact decision and uses PII, go/no-go must include documented responsible AI checks with durable evidence. A signed model card-style package that links to privacy assessment artifacts, bias/fairness results, and explainability outputs provides traceability for governance and audit. This best demonstrates completion and proper recordkeeping, not just assurances or partial technical outputs.
For operationalization readiness, responsible AI checks must be both completed and verifiably recorded in artifacts that can be reviewed later. In a high-impact decision context using PII, the go/no-go package should provide end-to-end traceability: what data was used, which privacy controls/assessments were performed, what bias/fairness tests were run and their outcomes, what transparency/explainability materials are provided, and who approved the release.
A practical way to package this is a model card (or equivalent governance dossier) that links/attaches:
The key takeaway: “ready” requires audit-ready evidence, not informal confirmation or post-release plans.
This bundles documented privacy, bias/fairness, and transparency checks with traceable approvals for audit-ready go/no-go.
Topic: Operationalize AI Solution
Which term best describes a documented, step-by-step set of operational instructions used during an AI service disruption to restore service and, if needed, switch processing to a defined manual fallback while the AI is unavailable?
Best answer: A
What this tests: Operationalize AI Solution
Explanation: An operational runbook is the practitioner-facing, step-by-step guide used during incidents to keep operations running. It typically includes triggers, roles/escalations, recovery actions, and explicit manual fallback procedures when an AI service is degraded or down. This directly supports business continuity for AI-enabled processes.
For AI solutions in production, business continuity requires more than high-level intentions; teams need executable procedures for handling outages and degraded performance. An operational runbook is the artifact that translates contingency planning into concrete actions operators can follow under time pressure, including how to switch to a manual process, how to roll back or fail over, who must approve/communicate changes, and how to validate recovery before returning to normal operations.
A model card documents a model’s purpose and evaluation, data lineage traces data origins and transformations, and concept drift describes changing data relationships over time—important operational concepts, but they are not the incident-response instructions for manual fallback and service restoration.
A runbook provides actionable incident steps such as failover/rollback, escalation, and manual fallback procedures.
Topic: Support Responsible and Trustworthy AI Efforts
A product team is preparing a quarterly accountability report for executives about a customer-service chatbot that uses a third-party LLM. The report must summarize current risk posture, key controls, and major decisions made since the last release. Which action should the team NOT take when creating this report?
Best answer: B
What this tests: Support Responsible and Trustworthy AI Efforts
Explanation: Accountability reports for leaders should provide an auditable view of risk posture, the controls in place (and evidence of them), and a transparent decision history. Removing the decision log details or selectively omitting rejected options undermines traceability and makes it difficult to understand how risk tradeoffs were evaluated. The other actions strengthen oversight and support audit readiness.
An accountability report is an executive-friendly summary that still preserves traceability. For AI initiatives, leaders need a clear view of (1) current risk posture (key risks, residual risk, and what is changing), (2) the controls/guardrails that mitigate those risks and where evidence can be found (policies, tests, monitoring, approvals), and (3) decision history (what was decided, by whom, when, and why).
A common anti-pattern is “sanitizing” the story by replacing the decision log with a curated narrative that omits rejected alternatives or dissenting assessments. That breaks the audit trail, hides tradeoffs, and makes it harder to demonstrate accountable governance when questions arise later. The goal is concise reporting without sacrificing traceable, decision-relevant records.
Omitting rejected alternatives and rationale weakens the audit trail and obscures decision history leaders need for accountability.
Topic: Identify Business Needs and Solutions
You are planning an AI initiative to predict customer churn and trigger retention offers. The sponsor will approve funding only if you can demonstrate that the schedule and staffing plan realistically covers discovery, data acquisition/engineering, model development, and production operations (monitoring and support).
Which metric/evidence/artifact best validates that your resource allocation and timing are credible across these phases?
Best answer: B
What this tests: Identify Business Needs and Solutions
Explanation: The strongest validation is evidence that explicitly maps required roles and effort to each project phase and checks feasibility against available capacity. A resource-loaded, skill-based roadmap with phase gates makes timing, dependencies, and cross-functional staffing needs visible from discovery through operations. This is the most direct way to validate the credibility of the resourcing plan.
To validate resource allocation and timing for an AI initiative, you need evidence that connects (1) phase-specific work and dependencies to (2) the skills required and (3) the actual capacity available over time. A resource-loaded, skill-based roadmap (often paired with a skills matrix) does this by showing when data engineers, SMEs, data scientists, security/privacy, and MLOps/operations are needed, and whether the plan includes time for data access, governance checkpoints, deployment, monitoring, and handover.
This type of artifact typically includes:
Performance reports, backlogs, and status dashboards may be useful, but they don’t validate end-to-end resourcing feasibility across all phases.
It directly shows when each role is needed across phases and whether planned capacity covers the workload through operationalization.
Topic: Manage AI Model Development and Evaluation
An AI team has completed a candidate credit-risk model and reports strong offline metrics. Before approving the model for deployment, the AI project manager schedules an independent review by another data scientist and an MLOps engineer to examine the modeling approach, data split/leakage risks, evaluation methodology, reproducibility, and whether results support the stated acceptance criteria. Findings are documented and must be resolved before the go/no-go decision.
Which CPMAI-aligned QA/QC approach is this practice best mapping to?
Best answer: B
What this tests: Manage AI Model Development and Evaluation
Explanation: The described practice is a pre-deployment QA/QC checkpoint that uses independent reviewers to validate the model design choices and the credibility of evaluation results. It emphasizes detecting methodological issues (like leakage), confirming reproducibility, and documenting findings before a go/no-go decision. This directly aligns to coordinated peer review and technical validation of model designs and evaluation outcomes.
In Domain IV QA/QC, peer review and independent technical validation provide a structured, unbiased check that the model was built and evaluated correctly before it is promoted. In the scenario, reviewers are explicitly tasked to verify evaluation integrity (e.g., leakage, split strategy), confirm the approach is reproducible/auditable, and ensure results are sufficient against predefined acceptance criteria; documenting and remediating findings turns the review into a true governance gate.
A practical validation checkpoint commonly includes:
This differs from operational monitoring activities that happen after deployment or from data-only governance checks that do not validate model evaluation rigor.
It establishes an independent, documented checkpoint to technically validate model design and evaluation results before a release decision.
The most important pattern is whether you connect business need, data feasibility, model evaluation, governance, and production monitoring in one decision chain.
This page gives one complete public CPMAI diagnostic. PM Mastery adds the larger PMI-CPMAI bank, domain drills, mixed timed mocks, progress tracking, and explanations for business framing, data readiness, model evaluation, responsible AI, and operational rollout.
Do not retake immediately. Review every miss, write the lifecycle step you skipped, then drill that domain before another full set. Repeated high scores on unseen mixed attempts should move you toward the real exam rather than more memorization.
Use the PMI-CPMAI Practice Test page for the full PM Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Read the PMI-CPMAI guide on PMExams.com for concept review, then return here for PM Mastery practice.