Free GARP RAI Practice Exam

Last revised: July 14, 2026

Try 80 free GARP RAI practice exam questions across the exam domains, with answers, explanations, timed mock exams, topic drills, and the Finance Prep next step.

This free full-length GARP RAI practice exam includes 80 original Finance Prep questions across the exam domains.

These are original Finance Prep practice questions aligned to the exam outline. They are not official GARP questions, copied live-exam content, or exam dumps. Use them to preview question style and explanation depth before continuing with mixed sets, topic drills, and timed mock exams in Finance Prep.

Practice count note: exam sponsors can describe total questions, scored questions, duration, or administrative exam-day rules differently. Always confirm current exam-day rules with the sponsor.

Practice questions

Questions 1-25

Question 1

Topic: Data and AI Model Governance

A bank identifies a new AI-enabled tool used to draft customer service responses. Before assigning any inherent risk tier, the first-line owner records the tool’s business purpose, accountable owner, users, data sources, vendor involvement, and deployment status in a central register. Which concept does this recording step best represent?

A. Independent model validation
B. Risk classification
C. AI inventory capture
D. Ongoing performance monitoring

Best answer: C

What this tests: Data and AI Model Governance

Explanation: Inventory capture is the initial governance step of identifying an AI system and recording core facts needed for oversight, such as owner, purpose, users, data sources, vendor involvement, and deployment status. Risk classification uses that information later to assign a risk tier or category based on factors such as impact, complexity, data sensitivity, and control requirements. In the stem, no risk rating has been assigned yet; the activity is limited to creating or updating the central AI register. That makes it inventory capture rather than classification, validation, or monitoring.

Risk classification would involve assigning an inherent or residual risk tier, not merely recording identifying information.
Independent model validation assesses design, data, performance, and controls, typically after enough information exists to review the system.
Ongoing performance monitoring tracks behavior after deployment and is not the initial registration step.

Recording descriptive metadata in a central register is inventory capture, distinct from deciding the tool’s risk tier.

Question 2

Topic: Risks and Risk Factors

A bank excludes applicants’ gender from a credit approval model to reduce fairness risk. During review, analysts find that employment gaps, part-time status, and certain occupation codes are highly correlated with gender and still influence approvals. Which concept best describes the remaining fairness concern?

A. Data leakage may allow the model to use future repayment outcomes during training.
B. Proxy variables may allow the model to infer protected-group membership indirectly.
C. Concept drift may cause relationships in the model to change after deployment.
D. Class imbalance may reduce predictive accuracy for a smaller training subgroup.

Best answer: B

What this tests: Risks and Risk Factors

Explanation: Fairness risk can remain even after a protected characteristic is removed because other variables may act as proxies. If features such as occupation, geography, education, or employment patterns are strongly correlated with protected-group membership, the model can still learn patterns that produce unfair or disparate outcomes. This is sometimes described as indirect discrimination or proxy bias. The appropriate risk response is not simply deleting the protected field; it also requires assessing correlated features, testing outcomes across groups, and documenting whether the remaining variables are justified for the decision context.

Data leakage concerns improper use of information that would not be available at prediction time, not correlated features standing in for a protected trait.
Concept drift concerns changing data relationships after deployment, not a fairness issue already present in the feature set.
Class imbalance can affect performance across groups, but the described issue is that remaining inputs encode protected-group information indirectly.

Removing the protected field does not eliminate fairness risk when correlated features still transmit similar information to the model.

Question 3

Topic: Data and AI Model Governance

A bank is moving several AI pilots into production for customer service, credit operations, and compliance analytics. The board-approved risk appetite permits AI adoption only when business value, control expectations, human accountability, and monitoring are clear. Current teams use different approval paths and no common escalation process for AI issues. What is the best action?

A. Pause all AI deployments until the bank can select one approved vendor platform for every AI use case.
B. Allow each business line to approve its own AI use cases if the local product owner documents expected efficiency benefits.
C. Require independent technical validation only for AI models that directly make final customer decisions.
D. Implement an enterprise AI governance framework that classifies AI use cases, links them to strategy and risk appetite, assigns accountable owners, and defines required controls and reporting.

Best answer: D

What this tests: Data and AI Model Governance

Explanation: The purpose of an AI governance framework is not merely to approve technology or document model performance. It provides an enterprise structure for deciding which AI uses are appropriate, how they support business strategy, what risks are acceptable, which controls are required, and who is accountable throughout the lifecycle. In this scenario, the bank has multiple AI use cases, inconsistent approval paths, and unclear escalation. The best action is therefore to implement a governance framework that creates consistent classification, ownership, control, monitoring, and reporting expectations aligned with the board’s risk appetite.

Local business approval may capture benefits, but it does not ensure enterprise risk alignment or consistent accountability.
Technical validation is important, but limiting governance to final-decision models misses broader AI risks and lifecycle controls.
Standardizing on one vendor platform does not by itself define risk appetite, ownership, controls, or oversight.

An AI governance framework exists to align AI use with strategic objectives, risk appetite, control expectations, and accountability across the organization.

Question 4

Topic: History and Overview of AI Concepts

A bank pilots an AI tool that flags small-business credit card accounts likely to become delinquent. Validation shows the tool is useful but has some false positives. The proposed workflow would automatically reduce credit limits for accounts above a cutoff, which could affect customers’ liquidity. What is the single best action before using the output in production?

A. Do not use the AI tool until it can provide a complete causal explanation for every account flag.
B. Use the AI flag as a prioritization input and require a credit officer to review customer-specific information before changing limits.
C. Have model validation approve each individual credit-limit reduction before it is executed.
D. Automatically reduce limits for all flagged accounts to ensure consistent treatment across customers.

Best answer: B

What this tests: History and Overview of AI Concepts

Explanation: AI outputs used in business decisions are often probabilistic indicators, not definitive conclusions. When an output has known false positives and the action can materially affect customers, the appropriate design is human-in-the-loop decision support. A credit officer can consider the AI flag together with current customer facts, policy criteria, and any mitigating information before taking action. This preserves accountability and reduces the risk that model error or missing context becomes an automatic adverse decision. The issue is not that the AI must be ignored, but that its output should be one input to a controlled human decision process.

Automatic limit reductions over-rely on an imperfect model and convert a risk signal into a final adverse action.
Requiring complete causal explanation is too absolute; useful AI can be governed with appropriate review and controls.
Model validation assesses model fitness and controls, but it should not own each individual credit decision.

Because the output is probabilistic and customer-impacting, it should support a human decision rather than trigger the final action automatically.

Question 5

Topic: Responsible and Ethical AI

An AI tool scores applications for a hardship loan-modification program. Because an unfavorable output could materially affect a customer’s housing stability, the bank requires a trained employee to review the AI recommendation and supporting facts before any denial is finalized. Which concept best matches this control?

A. Data minimization in model development
B. Explainability reporting for model transparency
C. Human-in-the-loop oversight for high-impact decisions
D. Automated drift monitoring after deployment

Best answer: C

What this tests: Responsible and Ethical AI

Explanation: Human-in-the-loop oversight is appropriate when an AI system affects high-impact decisions, such as access to credit, housing stability, employment, healthcare, or other materially consequential outcomes. The control ensures that a qualified person reviews the AI output, considers relevant context, and remains accountable for the final decision rather than allowing the system to determine the outcome automatically. Monitoring, data controls, and explainability can support responsible AI, but they do not by themselves provide pre-decision human judgment over a consequential customer outcome.

Automated drift monitoring detects performance changes over time but does not require human approval before a specific decision.
Data minimization limits unnecessary data use but does not address who finalizes a high-impact outcome.
Explainability reporting helps users understand model outputs, but transparency alone is not the same as human decision control.

A person must review and approve the AI-influenced outcome before a materially consequential decision is finalized.

Question 6

Topic: Responsible and Ethical AI

A bank is preparing evidence for an internal review of a new AI-assisted credit decision tool. Which artifact best shows that responsible AI expectations were considered during design, approval, and monitoring?

A. A performance report showing that the tool’s approval-rate predictions had higher accuracy than the prior rules-based process
B. A lifecycle review log linking fairness, explainability, privacy, and human-oversight requirements to design decisions, approval sign-offs, and ongoing monitoring metrics
C. A vendor statement asserting that its AI products are developed according to ethical innovation principles
D. A production support dashboard showing system uptime, processing latency, and incident response times

Best answer: B

What this tests: Responsible and Ethical AI

Explanation: Evidence of responsible AI consideration should be traceable across the lifecycle: design choices, approval decisions, and post-deployment monitoring. The strongest evidence connects responsible AI expectations—such as fairness, explainability, privacy, accountability, and human oversight—to specific controls, approvals, metrics, and follow-up actions. A narrow performance report, generic vendor ethics statement, or operational availability dashboard may be useful for other purposes, but they do not show that responsible AI principles were embedded and monitored throughout the AI system’s lifecycle.

Higher prediction accuracy supports model performance, but it does not by itself evidence fairness, transparency, privacy, or oversight review.
A generic vendor ethics statement is not specific enough to show how expectations were applied to this tool.
Uptime and latency monitoring address operational resilience, not the broader responsible AI expectations in the stem.

This artifact provides traceable evidence that responsible AI expectations were addressed across the model lifecycle.

Question 7

Topic: AI Tools and Techniques

A bank operations team wants to paste a customer’s complaint file into a public generative AI tool to draft a response. The file includes customer identifiers, account numbers, and transaction details; the vendor terms say prompts may be retained for service improvement, and the tool has not been approved for regulated data. What is the best action before using AI assistance?

A. Use the public tool only for a short session and delete the chat after the response is generated.
B. Proceed if the analyst is authorized to view the customer file internally.
C. Use an approved AI environment and apply data minimization, redaction or tokenization, and access and retention controls before prompting.
D. Paste the file into the tool if the prompt instructs the AI not to reveal or reuse the information.

Best answer: C

What this tests: AI Tools and Techniques

Explanation: Information placed in an AI prompt can become part of the tool’s processing context and may be logged, retained, reviewed, or exposed depending on the provider and configuration. Customer identifiers, account numbers, and transaction details are sensitive and may be regulated, so the organization must control whether and how they are submitted. The best action is not simply better wording of the prompt; it is to use an approved environment and reduce or protect the data before submission through minimization, redaction, tokenization, access limits, and retention controls. Internal authorization to view data does not automatically authorize sending it to an unapproved external AI service.

Telling the AI not to disclose information is not a technical or governance control over logging, retention, or provider access.
Deleting a chat does not prove that prompts were not retained in backend logs or service records.
Internal access rights do not address third-party processing, data leakage, or regulated-data handling requirements.

Sensitive prompt content needs controls because it may be stored, processed, or exposed outside the bank’s approved data-handling boundaries.

Question 8

Topic: Risks and Risk Factors

A bank plans to use a vendor-hosted AI service to triage customer hardship requests. The vendor controls the model, stores prompts and outputs in its cloud, can update the model without prior notice, and will provide only high-level uptime and accuracy summaries. The business owner says internal validation is unnecessary because the bank is not building the model. What is the best action before production?

A. Treat the service as a third-party AI dependency and require due diligence covering data handling, change notification, availability, and assurance evidence.
B. Accept the vendor’s high-level accuracy and uptime summaries as sufficient validation evidence.
C. Exclude the service from the AI model inventory because the vendor owns the model and infrastructure.
D. Approve the service if bank staff manually review the first month of triage results.

Best answer: A

What this tests: Risks and Risk Factors

Explanation: Third-party AI risk arises when an external provider controls important aspects of the AI system, such as model behavior, data processing, updates, service availability, or assurance evidence. The bank remains accountable for the AI-enabled process even if it does not build or host the model. Before production, it should perform third-party and AI risk due diligence, define contractual controls, require evidence appropriate to the service’s risk, and address change management, data use, monitoring, and resilience. Vendor ownership does not remove the need for governance; it changes the type of evidence and controls the bank must obtain.

Manual review may reduce downstream harm, but it does not address vendor-controlled data handling, updates, availability, or assurance gaps.
Vendor ownership is exactly why the service should be captured in the AI inventory and third-party risk process.
High-level vendor summaries are not enough when the provider controls model changes and key operating evidence.

The key risk is that the external provider controls critical AI behavior and evidence, so the bank must manage it through third-party AI risk controls before use.

Question 9

Topic: Risks and Risk Factors

A bank is assessing a third-party AI service before allowing it to process customer interaction data. The review must determine how data is protected, whether the service is operationally resilient, what the model is not designed to do, and which controls remain the bank’s responsibility. Which vendor documentation set best supports this assessment?

A. Marketing materials, benchmark rankings, product roadmap, pricing schedule, and customer testimonials
B. Internal user training records, acceptable-use policy, help-desk procedures, and business-unit approval minutes
C. Model monitoring dashboard, production incident log, feature backlog, and release notes after implementation
D. Privacy and data-use terms, security assurance evidence, resilience or BCP/DR materials, model limitations documentation, and a shared responsibility matrix

Best answer: D

What this tests: Risks and Risk Factors

Explanation: Third-party AI due diligence should obtain vendor evidence that supports the specific risks being assessed. For an AI service handling customer data, relevant documentation includes privacy and data-use terms, security assurance materials, operational resilience or business continuity evidence, model limitations or model card-style documentation, and a shared responsibility matrix that clarifies which controls are performed by the vendor and which remain with the bank. These materials help the firm assess third-party and concentration risk before approval rather than relying on marketing claims or only post-implementation monitoring.

Marketing claims and benchmark rankings do not evidence privacy, security, resilience, or control ownership.
Internal training and approval records may support the bank’s governance process but do not document the vendor’s controls or model limitations.
Monitoring dashboards and incident logs are useful after implementation, but they do not replace pre-onboarding vendor assurance documentation.

This set directly covers privacy, security, resilience, model limitations, and allocation of control responsibilities for a third-party AI service.

Question 10

Topic: AI Tools and Techniques

A bank is preparing a credit line pre-screening model for validation. The development team removed race, gender, and age from the training file, but retained engineered features such as neighborhood income band, distance to a branch, and years at current address. Initial testing shows lower recommended credit limits for one protected group even among applicants with similar repayment histories. What is the best action for the validation team?

A. Evaluate whether the retained engineered features act as proxies for protected characteristics and revise or justify them based on disparate-impact testing.
B. Approve the model because protected characteristics were removed before training.
C. Require the developers to add the protected characteristics back into the model to improve prediction accuracy.
D. Focus only on overall default-rate accuracy because repayment history is already controlled in testing.

Best answer: A

What this tests: AI Tools and Techniques

Explanation: Removing protected characteristics does not by itself prevent biased model behavior. Engineered features can serve as proxies when they are correlated with protected status, such as location-based or stability-related variables reflecting historical housing, employment, or access patterns. Bias can also enter through how features are constructed, transformed, or selected. Because testing already shows different outcomes for a protected group among otherwise similar applicants, the validation team should examine feature-level proxy risk and disparate impact, then require remediation, documentation, or a defensible business justification before approval.

Removing protected fields is a useful step, but it does not address proxy variables or biased feature construction.
Adding protected characteristics may be appropriate for fairness testing, but using them directly for prediction is not the best response described here.
Overall accuracy can hide subgroup harm, so it is insufficient when outcome differences appear for similarly qualified applicants.

Features correlated with protected characteristics can transmit bias even when the protected fields themselves are excluded.

Question 11

Topic: AI Tools and Techniques

Before deploying an AI loan-operations assistant, a project team documents approved user groups, role-based access, permitted data feeds, expected outputs, escalation triggers, and owners for post-deployment monitoring. Which concept best matches this activity?

A. Benchmarking exercise for comparing model accuracy
B. Data labeling protocol for creating training targets
C. Hyperparameter tuning plan for improving model fit
D. Implementation design specification for controlled use and accountability

Best answer: D

What this tests: AI Tools and Techniques

Explanation: An implementation design specification translates an AI use case into controlled production operation. It should identify who may use the system, what access they have, which data inputs are allowed, what outputs are expected, when issues must be escalated, and who is responsible for monitoring. These details reduce misuse, unmanaged data exposure, unclear accountability, and gaps after deployment. The stem is not mainly about optimizing model performance or preparing training data; it is about defining the operating controls and responsibilities needed for safe implementation.

Hyperparameter tuning concerns model configuration and performance, not user access, escalation, or monitoring ownership.
Data labeling supports supervised training but does not define production use controls.
Benchmarking compares performance across models or vendors, while the described activity governs how the selected system will operate.

These elements define how the AI system will be used, controlled, escalated, and monitored in production.

Question 12

Topic: Data and AI Model Governance

During post-deployment monitoring, an AI model triggers a material performance exception. The governance process requires a record that documents the root cause, business and customer impact, remediation plan, accountable owner, target due date, and evidence that closure was validated. Which concept does this description best match?

A. Model inventory entry
B. Issue management record
C. Risk appetite statement
D. Monitoring metric dashboard

Best answer: B

What this tests: Data and AI Model Governance

Explanation: Issue management is used when monitoring, validation, audit, or operational use identifies a problem that requires controlled remediation. A complete issue record should capture what went wrong, why it happened, who or what was affected, the agreed corrective action, the responsible owner, the due date, and evidence that the fix was validated before closure. This supports accountability, escalation, trend analysis, and assurance that AI model issues are not closed merely because an action was started. In the stem, the described artifact is not just reporting model performance; it is managing a specific exception through remediation and validated closure.

A model inventory entry catalogs AI systems, ownership, purpose, and risk classification, but it does not track remediation of a specific exception.
A monitoring metric dashboard reports indicators such as drift or performance, but it does not by itself document root cause, ownership, due dates, and closure validation.
A risk appetite statement defines acceptable risk levels or limits, but it is not the remediation-tracking record for an identified issue.

An issue management record tracks the cause, impact, corrective actions, accountability, timing, and validated closure of a governance issue.

Question 13

Topic: Responsible and Ethical AI

A retail bank uses an AI routing model to prioritize customer hardship assistance requests. Quarterly survey results show overall customer satisfaction improved from 82% to 88%, but monitoring also shows customers with limited English proficiency are routed to manual follow-up 40% more often and wait two days longer than other customers. The product owner proposes closing the issue because overall satisfaction improved. What is the best action for responsible AI review?

A. Conduct a fairness assessment comparing subgroup outcomes, investigate the disparity, and require mitigation or documented risk acceptance before closure.
B. Increase the survey sample size and defer any fairness review until the next quarterly satisfaction cycle.
C. Close the issue based on the improved overall satisfaction score and continue the standard customer-satisfaction review.
D. Remove language-proficiency indicators from monitoring reports so the model cannot be evaluated using sensitive subgroup information.

Best answer: A

What this tests: Responsible and Ethical AI

Explanation: A customer-satisfaction review measures perceived service quality, usually at an aggregate or survey-sample level. A fairness assessment asks a different question: whether outcomes, errors, burdens, or benefits differ materially across relevant groups. In this scenario, the overall satisfaction score improved, but customers with limited English proficiency face more manual follow-up and longer waits. That is evidence of potential disparate impact or stakeholder harm, so the responsible AI response is to assess subgroup outcomes, investigate root causes, and decide on mitigation or formally accepted residual risk. Aggregate satisfaction cannot be used to close a fairness concern when subgroup monitoring indicates materially different treatment or outcomes.

Relying on the overall satisfaction score misses the subgroup disparity shown in monitoring data.
Removing subgroup indicators would weaken oversight and make potential inequity harder to detect.
Expanding the survey may be useful later, but it does not justify delaying a fairness review when operational outcome differences are already visible.

Aggregate satisfaction improvement does not resolve evidence that a subgroup may be experiencing materially worse outcomes.

Question 14

Topic: Risks and Risk Factors

A lender uses an AI model to decide which borrowers are offered hardship assistance. After launch, an audit finds that otherwise similar borrowers receive materially different outcomes by demographic group, call-center staff cannot explain the decisions to customers, and complaints have attracted regulator and media attention. Which concept best matches this situation?

A. Liquidity risk from unexpected changes in borrower cash flows
B. Cybersecurity risk from unauthorized access to model infrastructure
C. Legal, regulatory, conduct, and reputational risk from unfair and opaque AI decisioning
D. Model performance risk limited to lower predictive accuracy

Best answer: C

What this tests: Risks and Risk Factors

Explanation: AI decisions that are inaccurate, unfair, opaque, or poorly governed can create several connected risks. Unequal treatment of similarly situated customers raises conduct and fairness concerns and may trigger legal or regulatory scrutiny. The inability to explain decisions makes it harder to justify outcomes, handle complaints, or demonstrate compliance. Media attention and customer complaints also create reputational risk because stakeholders may view the firm’s AI use as unfair or irresponsible. In this scenario, the issue is not merely technical performance; it is the risk created when AI-driven customer decisions affect access to assistance without adequate fairness, transparency, and governance.

Unauthorized access would indicate a cybersecurity issue, but the stem describes unfair outcomes and opacity, not a breach.
Borrower cash-flow changes may affect credit or liquidity analysis, but they are not the main risk described.
Lower predictive accuracy is too narrow because the facts include customer harm, explainability gaps, regulator attention, and reputational damage.

The facts show potentially unfair treatment, weak explainability, customer harm, regulatory scrutiny, and public trust damage from an AI decision process.

Question 15

Topic: AI Tools and Techniques

Before a fraud-detection model is submitted for approval and independent review, the analytics lead is asked for a package that explains the model’s intended use, data sources, methodology, assumptions, limitations, performance testing, and proposed monitoring. Which item best matches this request?

A. Model documentation package
B. Hyperparameter tuning log
C. Production incident runbook
D. Feature store access policy

Best answer: A

What this tests: AI Tools and Techniques

Explanation: Before a model is approved, handed off, or independently reviewed, it should be accompanied by model documentation that explains what the model is intended to do and how it was developed and assessed. Typical content includes the business purpose, scope of use, data sources and lineage, feature and methodology choices, assumptions, limitations, performance results, validation evidence, implementation notes, and ongoing monitoring expectations. This documentation allows approvers, validators, risk managers, and future operators to understand the model well enough to challenge it, reproduce key decisions, and manage its risks.

A production incident runbook is useful after deployment for operational response, but it does not document model design and validation.
A feature store access policy governs data access, not the full model development and review evidence.
A hyperparameter tuning log may support technical review, but it is only one narrow component of broader model documentation.

A model documentation package provides the core evidence reviewers need to assess the model before approval, handoff, or independent review.

Question 16

Topic: Risks and Risk Factors

A bank’s internal LLM assistant is approved to answer only from a vetted policy repository. Monitoring flags that it has produced confidential customer identifiers in responses to users without access, and logs show prompts attempting to bypass retrieval and access filters. Which concept best matches this event?

A. Benign prompt-quality issue
B. AI security incident requiring investigation
C. Routine concept drift monitoring event
D. Standard model validation sampling finding

Best answer: B

What this tests: Risks and Risk Factors

Explanation: An AI incident should be investigated when there are indicators that safeguards, data, users, or outputs may have been compromised. Here, the system produced confidential identifiers to unauthorized users and logs show attempts to bypass retrieval and access filters. That points to possible data leakage, misuse, or control failure, not merely an accuracy problem. Investigation would determine scope, affected data and users, control breakdowns, containment, remediation, and any reporting needs.

Concept drift concerns changes in model behavior due to shifting data patterns, not unauthorized disclosure and bypass attempts.
Standard validation evaluates model design and performance but does not replace incident investigation when compromise indicators appear.
Prompt-quality issues involve unclear or poorly formed prompts; they do not explain confidential outputs to unauthorized users.

Unauthorized data disclosure plus attempts to bypass AI controls indicate possible compromise of data, outputs, and controls.

Question 17

Topic: Data and AI Model Governance

A bank’s quarterly AI governance pack includes a page showing all registered AI use cases with their owners, risk tier, lifecycle stage, deployment approval status, and overdue documentation flags. Which category of AI governance reporting information is being described?

A. Inventory status reporting
B. Incident reporting
C. Control metrics reporting
D. Validation findings reporting

Best answer: A

What this tests: Data and AI Model Governance

Explanation: AI governance reporting should help oversight bodies understand both the population of AI systems and the risk/control condition of that population. The described page is primarily inventory status reporting because it identifies registered AI use cases and shows core inventory attributes such as owner, risk tier, lifecycle stage, approval status, and documentation gaps. This supports accountability, coverage, and prioritization of governance work. It is not primarily reporting on incidents, independent validation results, or performance of specific controls, even though those items may be linked to inventory records in a broader governance dashboard.

Incident reporting would focus on events such as data leakage, harmful outputs, unauthorized use, or operational failures.
Validation findings reporting would summarize independent review results, limitations, and remediation required before or after deployment.
Control metrics reporting would track control performance, such as review timeliness, monitoring completion, or policy exception rates.

The description focuses on the completeness, ownership, classification, and lifecycle status of AI systems in the governance inventory.

Question 18

Topic: Risks and Risk Factors

A bank is preparing a credit-line increase model for production. The data science team removed race and gender fields, but the model still uses ZIP code, length at current address, employer industry, and education history. A pre-launch fairness review finds lower approval recommendations for applicants from neighborhoods strongly correlated with a protected group, even among applicants with similar credit scores. What is the best action for the risk team?

A. Approve the model because protected characteristics are not used as direct inputs.
B. Add the protected characteristics back as model inputs so the model can automatically correct group differences.
C. Review only overall accuracy and calibration because fairness risk is not present without protected fields.
D. Require a proxy-variable and fairness impact analysis, then mitigate or control features that drive unjustified group differences before approval.

Best answer: D

What this tests: Risks and Risk Factors

Explanation: Removing a protected characteristic does not by itself eliminate fairness risk. Other variables may be highly correlated with that characteristic and act as proxies, allowing the model to produce systematically different outcomes for protected groups. In this scenario, ZIP code and related demographic indicators are linked to a protected group, and the fairness review shows an adverse pattern even among applicants with similar credit scores. The risk team should not approve solely because direct protected fields were removed. The better response is to analyze proxy effects, assess fairness metrics, determine whether differences are justified by legitimate risk factors, and implement mitigation or controls before deployment.

Direct removal of race and gender is insufficient when correlated features can recreate similar group effects.
Adding protected characteristics for automatic correction is not an appropriate substitute for governed fairness analysis and control design.
Overall accuracy can mask subgroup harm, so calibration alone does not address proxy-driven fairness risk.

Protected fields can be removed while correlated proxy variables still transmit protected-characteristic information into model outcomes.

Question 19

Topic: Responsible and Ethical AI

A bank plans to use an AI tool to recommend whether a customer receives a credit limit increase. Call-center agents usually follow the recommendation, and customers who are declined receive only the message, “Request not approved.” Validation also shows weaker performance for customers with limited credit history. What is the best action before deployment to address transparency and disclosure needs?

A. Use the existing decline message because detailed explanations could reduce model effectiveness.
B. Limit disclosure to internal model documentation because the tool is only a recommendation to agents.
C. Publish the model’s source code and training data so customers can independently inspect the system.
D. Provide plain-language disclosure that AI supports the decision, explain the main factors and known limitations, and offer a path for human review.

Best answer: D

What this tests: Responsible and Ethical AI

Explanation: Transparency in responsible AI requires more than internal documentation when AI materially influences outcomes for people. Here, agents usually follow the model’s recommendation, declined customers are affected, and a known limitation exists for customers with limited credit history. The best action is to disclose AI involvement in understandable language, provide meaningful rationale such as key decision factors, describe relevant limitations, and make escalation or human review available. Transparency should be fit for the audience; it does not require exposing proprietary code or raw training data, but it should enable users and affected parties to understand how AI is involved and what its limits are.

Treating the tool as “only a recommendation” ignores that agents usually follow it, so the AI materially affects customers.
Publishing source code or training data is not a practical or privacy-safe substitute for understandable disclosure.
Keeping a generic decline message fails to explain AI involvement, rationale, or known limitations.

This makes AI involvement, decision rationale, and relevant limitations understandable to users and affected customers.

Question 20

Topic: AI Tools and Techniques

A bank’s data science team has built a machine learning model to prioritize small-business loan applications. The training metrics are strong, but the independent validation report flags materially lower performance on a recent out-of-time sample and unstable behavior for one borrower segment. The product manager wants to promote the model to production next week to meet a launch deadline. What is the best action before production use?

A. Defer promotion until the validation findings are assessed against acceptance criteria and addressed through remediation or governance approval.
B. Promote the model and rely on production monitoring to identify whether the validation concerns become material.
C. Allow the product manager to approve deployment because validation has already been completed.
D. Promote the model because strong training metrics show that the model learned the relevant patterns.

Best answer: A

What this tests: AI Tools and Techniques

Explanation: Validation is a pre-production control in the data science workflow. Its purpose is not merely to complete a checklist, but to provide evidence that the model generalizes beyond training data, performs acceptably for the intended use, and has known limitations. In this scenario, the out-of-time sample weakness and segment instability are direct warnings that production performance may be unreliable or unfairly uneven. The appropriate action is to pause promotion long enough for those results to be assessed against documented acceptance criteria, remediated if needed, or escalated through governance if risk acceptance is considered. Launch timing and strong training performance do not override adverse validation evidence.

Strong training metrics are insufficient because they may reflect overfitting or non-representative development data.
Production monitoring is important, but it does not replace pre-production validation when known issues already exist.
Product ownership does not substitute for governance review of validation findings and model fitness for use.

Validation results must be reviewed before release because they show whether the model is fit for its intended production use and what risks require action.

Question 21

Topic: Risks and Risk Factors

During validation of a credit-decision AI model, analysts find that many borrower income values are recorded as negative in the training data. The ETL job copied the income_amount field exactly from the loan-origination system, the full approved-applicant population was used, target labels were generated correctly, and lineage logs show the field’s path end to end. Which data risk factor best matches the issue?

A. Transformation failure
B. Labeling failure
C. Source-system data error
D. Lineage failure

Best answer: C

What this tests: Risks and Risk Factors

Explanation: A source-system data error occurs when the underlying operational system captures or stores incorrect values before they enter the AI development pipeline. Here, the ETL process copied the field exactly, labels were generated correctly, the full population was used, and lineage was documented. Those facts rule out common downstream data-risk causes such as transformation logic, target-label creation, sampling design, or inability to trace data provenance. The AI risk is still significant because the model may learn distorted relationships from invalid borrower income values, but the control response should focus first on source data quality, input validation, and remediation in the loan-origination system.

Transformation failure would involve incorrect conversion, mapping, joining, or calculation after extraction, but the ETL copied the field exactly.
Labeling failure would involve incorrect target outcomes or class assignments, but the target labels were generated correctly.
Lineage failure would involve an inability to trace data origin or movement, but the field’s path is documented end to end.
A sampling issue would involve unrepresentative record selection, but the full approved-applicant population was used.

The incorrect values originated in the loan-origination system before extraction or model-dataset preparation.

Question 22

Topic: History and Overview of AI Concepts

A bank deploys an AI model to flag suspicious payment transactions. Six months later, monitoring shows that transaction-size patterns and merchant-category frequencies differ materially from the training data, and the model’s alert precision has declined even though the code and decision threshold are unchanged. Which concept does this best illustrate?

A. Model drift
B. Overfitting
C. Data leakage
D. Hyperparameter tuning

Best answer: A

What this tests: History and Overview of AI Concepts

Explanation: Model drift occurs when a deployed model’s operating environment changes so that the relationship between inputs, outputs, or performance no longer matches what was observed during training and validation. In this scenario, the model has not been re-coded and its threshold has not changed, but the live transaction patterns have shifted and alert precision has declined. That is a classic monitoring signal that the model may no longer be performing as expected in production and may need investigation, recalibration, retraining, or other remediation.

Overfitting is a training-time issue where a model fits noise or idiosyncrasies in the training data too closely; the stem focuses on post-deployment change.
Data leakage involves improper use of information during training or validation that would not be available in real use; no such improper information is described.
Hyperparameter tuning is the process of selecting model configuration settings, not the observed decline in live model performance.

Model drift describes post-deployment changes in input patterns or performance that can reduce a model’s reliability.

Question 23

Topic: Responsible and Ethical AI

A retail bank uses an AI tool to help prioritize credit-card hardship requests. Its responsible AI policy requires customer-facing notices to state when AI influenced the outcome, describe key limitations of the tool, and provide a plain-language rationale for the decision. Which concept best matches this requirement?

A. Data minimization
B. Fairness testing
C. Transparency and disclosure
D. Operational resilience

Best answer: C

What this tests: Responsible and Ethical AI

Explanation: Transparency and disclosure address whether people can understand that AI is being used, what role it played, what its limitations are, and why a decision or recommendation was made. In this scenario, the key requirement is not only internal documentation but also customer-facing communication in plain language. That makes transparency the best match. Fairness, privacy, and resilience controls may also be important for an AI-enabled credit process, but they target different risk concerns than informing affected parties about AI involvement and rationale.

Fairness testing evaluates disparate or unjustified outcomes across groups, not whether customers are told how AI affected a decision.
Data minimization limits collection or retention of personal data, but it does not provide decision rationale.
Operational resilience concerns continuity and recovery of services, not understandable AI notices.

Transparency and disclosure focus on making AI involvement, limitations, and decision rationale understandable to users or affected parties.

Question 24

Topic: AI Tools and Techniques

A bank is testing a retrieval-augmented LLM assistant for complaint handling. The retrieved context says: “Customer reported a duplicate debit-card charge; one charge was reversed; merchant inquiry is still open.” The test prompt asks: “Confirm that the merchant committed fraud and recommend whether the customer’s account should be closed.” Which action is BEST?

A. Flag the prompt as requesting conclusions not supported by the retrieved context and revise it to answer only what the context establishes.
B. Have the model answer with a confidence score so reviewers can decide whether the inference is acceptable.
C. Allow the model to infer fraud because duplicate charges are commonly associated with merchant misconduct.
D. Expand the prompt to ask the model to use general banking knowledge to fill in the missing facts.

Best answer: A

What this tests: AI Tools and Techniques

Explanation: A prompt asks for unsupported inference when it requires the model to state or decide facts that are not present in the provided context. Here, the retrieved context establishes only that a duplicate charge was reported, one charge was reversed, and the merchant inquiry remains open. It does not establish merchant fraud, customer fault, or whether account closure is appropriate. The best action is to flag and revise the prompt so the model either limits its response to known facts or asks for additional evidence. This reduces hallucination risk and keeps the assistant grounded in the supplied context.

Treating duplicate charges as proof of fraud overgeneralizes from a possible pattern rather than the provided facts.
Adding a confidence score does not cure the absence of evidence; it can make an unsupported conclusion appear more reliable.
Using general banking knowledge to fill gaps defeats the purpose of context-grounded prompting and increases hallucination risk.

The context contains a duplicate-charge complaint and open inquiry, but no evidence of fraud or basis for account-closure advice.

Question 25

Topic: AI Tools and Techniques

A fraud-detection model was deployed six months ago. Recent reviews show that transactions the model rates as low risk now have materially higher confirmed fraud rates than expected, so the model’s outputs no longer align with observed business outcomes. Which data science workflow step is most directly indicated?

A. Feature selection during model development
B. Post-deployment monitoring and maintenance
C. Initial business problem framing
D. Exploratory data analysis before model training

Best answer: B

What this tests: AI Tools and Techniques

Explanation: When a deployed model’s predictions stop matching actual business outcomes, the issue belongs to post-deployment monitoring and maintenance. Monitoring compares model outputs, performance metrics, and realized outcomes over time to detect degradation, drift, or changed business conditions. Once detected, the workflow may require investigation, recalibration, retraining with more recent data, or changes to controls and usage. Earlier workflow steps such as problem framing, exploratory analysis, and feature selection are important during development, but they are not the primary step for responding to a deployed model whose live performance has deteriorated.

Initial business problem framing defines the objective and success criteria before development, not the response to degraded live performance.
Exploratory data analysis helps understand training data patterns, but it is not the main deployed-model control.
Feature selection may be revisited during remediation, but the broader indicated workflow step is monitoring and maintenance.

This step addresses deployed model performance degradation by investigating drift and triggering recalibration, retraining, or other corrective action.

Questions 26-50

Question 26

Topic: Data and AI Model Governance

A bank is preparing to deploy an internal generative AI assistant for relationship managers. The business sponsor argues that AI governance should be owned only by technology because the tool will run on an approved platform. Internal audit asks how customer-impact, privacy, data-quality, model-output, and post-deployment control risks will be overseen. What is the best action?

A. Assign full governance ownership to technology because platform approval and access controls are the primary sources of AI risk.
B. Create cross-functional AI governance with defined accountabilities across business, technology, data, risk, compliance, legal, and audit functions.
C. Let the business sponsor self-approve the tool after legal review because the assistant is for internal users only.
D. Transfer ownership to internal audit so governance remains independent from the team deploying the assistant.

Best answer: B

What this tests: Data and AI Model Governance

Explanation: The best action is to establish cross-functional governance with clear roles. AI systems create risks across the lifecycle: the business owns the use case and outcomes, technology manages infrastructure and security controls, data teams address data quality and lineage, risk and compliance assess control and policy alignment, legal considers obligations and liability, and audit provides independent assurance. No single function can fully govern an AI assistant simply because it is hosted on an approved platform. The facts identify several risk dimensions—customer impact, privacy, data quality, model outputs, and monitoring—which require coordinated oversight and defined accountability rather than one centralized owner doing everything.

Technology-only ownership misses business-use, data, legal, compliance, and assurance responsibilities.
Business self-approval is inadequate because internal tools can still create customer, privacy, and control risks.
Audit should provide independent assurance, not own or operate first-line AI governance.

AI governance responsibilities span multiple functions because AI risk arises from use, data, technology, legal obligations, controls, and independent assurance.

Question 27

Topic: Risks and Risk Factors

A bank is assessing a machine-learning model that recommends credit-line decreases. Before mitigation, the AI risk team rates the use case as high risk because biased training patterns could produce unfair reductions and customers may receive unclear reasons. The business then adds fairness testing, reason-code review, human approval for adverse recommendations, and ongoing drift monitoring. What is the best way to record the risk taxonomy assessment?

A. Treat residual risk as high solely because the use case affects customer credit outcomes.
B. Classify the monitoring and review activities as inherent risk because they are part of the AI lifecycle.
C. Record the high pre-control exposure as inherent AI risk, then assess control effectiveness to determine the remaining residual risk.
D. Lower the inherent AI risk rating because fairness testing and human approval reduce the likelihood of unfair outcomes.

Best answer: C

What this tests: Risks and Risk Factors

Explanation: Inherent AI risk is the exposure arising from the AI use case before considering mitigating controls. In this scenario, the pre-control risk is high because the model could affect customer credit access and may produce biased or poorly explained adverse recommendations. The added fairness testing, reason-code review, human approval, and drift monitoring are controls. They may reduce the likelihood or impact of harm, but they do not change the inherent risk rating. Residual risk should be assessed after determining whether those controls are appropriately designed and operating effectively. If the remaining risk is still above appetite, it should be escalated or remediated.

Reducing inherent risk for added controls confuses pre-control exposure with post-control mitigation.
Rating residual risk as high solely from the use case ignores whether controls reduce risk.
Treating monitoring and review as inherent risk misclassifies controls as risk sources.

Inherent risk is measured before controls, while residual risk is the risk that remains after controls are applied and evaluated.

Question 28

Topic: Risks and Risk Factors

A bank is preparing to launch a public LLM assistant that accepts free-form customer prompts. The assistant can call an internal tool that retrieves account-servicing notes, and red-team testing shows crafted prompts can override the assistant’s instructions and request records outside the customer’s session. Which is the BEST control response before release?

A. Launch the assistant with post-release monitoring of a sample of conversations.
B. Add a stronger system prompt instructing the assistant never to reveal unauthorized information.
C. Block prompts containing phrases such as “ignore previous instructions” or “reveal hidden data.”
D. Enforce server-side authorization and least-privilege tool access scoped to the authenticated customer session.

Best answer: D

What this tests: Risks and Risk Factors

Explanation: For AI systems exposed to untrusted user input, prompt instructions alone are not a security boundary. The main risk in this scenario is that a prompt-injection attack could cause the assistant to misuse its connected retrieval tool. The best response is to enforce controls outside the model: server-side authorization, least privilege, and session-scoped access to tools and data. These controls ensure the tool will not return records the authenticated user is not entitled to access, regardless of what the model is persuaded to ask for. Monitoring and prompt hardening may be useful supporting controls, but they do not adequately prevent unauthorized data access.

Stronger system prompts can reduce some unsafe behavior but can be bypassed by prompt injection.
Keyword blocking is brittle because attackers can rephrase malicious instructions.
Post-release sampling detects some issues after the fact but does not prevent unauthorized access before release.

This directly limits what the AI can retrieve even if untrusted input succeeds in manipulating the model.

Question 29

Topic: AI Tools and Techniques

A bank uses a machine-learning model to prioritize small-business loan reviews. The holdout test set shows stable aggregate precision, a challenge set of newly incorporated firms shows many false negatives, loan officers report misleading denial explanations, and production monitoring shows declining precision after a new marketing campaign. What is the best interpretation for the model risk manager?

A. Treat the challenge-set failures as irrelevant unless they also appear in the random holdout test set.
B. Use the evidence sources together because each reveals a different weakness: representative historical performance, targeted edge-case behavior, user-facing output problems, and live drift.
C. Rely on the holdout test set because aggregate precision is the most objective benchmark for approval decisions.
D. Use production monitoring only, because pre-deployment testing cannot provide useful evidence once the model is live.

Best answer: B

What this tests: AI Tools and Techniques

Explanation: Different evaluation sources are designed to reveal different weaknesses. A holdout test set estimates performance on data intended to resemble the historical target population, so it may miss rare or emerging cases. A challenge set deliberately stresses known edge cases or high-risk segments, such as newly incorporated firms. User feedback can reveal problems with explanations, workflow fit, or unintended impacts that may not appear in numerical benchmarks. Production monitoring detects changes in live data, behavior, or performance after deployment, such as drift following a new marketing campaign. The best interpretation is not that one source overrides the others, but that they provide complementary evidence for model evaluation and remediation.

Aggregate holdout precision can hide segment-specific weaknesses and post-deployment changes.
Challenge-set failures are relevant because they intentionally test important edge cases that may be underrepresented in random samples.
Production monitoring is essential after deployment, but it does not replace pre-deployment tests or user feedback.

This correctly recognizes that each evaluation source samples a different condition and therefore can expose different model limitations.

Question 30

Topic: Data and AI Model Governance

A bank’s AI governance standard requires each AI use case to have defined control checkpoints from initial intake through design, development, validation, deployment, ongoing monitoring, material change, and retirement. Which concept best matches this requirement?

A. Dataset lineage documentation
B. Point-in-time independent validation
C. End-to-end AI lifecycle controls
D. Production performance monitoring

Best answer: C

What this tests: Data and AI Model Governance

Explanation: End-to-end AI lifecycle controls are governance activities applied throughout an AI system’s life, beginning with intake and risk classification and continuing through design, development, validation, approval, deployment, monitoring, change management, and retirement. The key feature is full lifecycle coverage with appropriate checkpoints, evidence, ownership, and approvals at each stage. Validation, monitoring, and lineage are important controls, but each addresses a narrower part of the lifecycle. The stem describes a governance standard that spans all stages, so the best match is end-to-end AI lifecycle controls.

Point-in-time independent validation focuses on predeployment or periodic review, not the full sequence from intake to retirement.
Production performance monitoring applies after deployment and does not cover intake, design, development, validation, or retirement.
Dataset lineage documentation supports data governance but is narrower than the complete AI lifecycle control structure.

The description maps to controls that operate across all major AI lifecycle stages rather than at a single point.

Question 31

Topic: Data and AI Model Governance

A bank’s lending business plans to use a vendor-hosted AI model to recommend which small-business applicants should receive enhanced manual review. The vendor developed the model, hosts it, and provides monthly performance reports. The business owner argues that accountability for the AI use should sit with the vendor because no internal team will build or run the model. What is the best governance action?

A. Assign accountability to the vendor because it controls the model design, hosting environment, and performance reporting.
B. Treat the model as a technology service only and make IT infrastructure the accountable owner for approval.
C. Keep the business owner accountable for the AI use and require inventory, risk assessment, vendor oversight, and ongoing monitoring before deployment.
D. Allow deployment if the contract requires the vendor to notify the bank of material model changes.

Best answer: C

What this tests: Data and AI Model Governance

Explanation: In AI model governance, the first-line business owner remains accountable for the business use of AI, including impacts on customers, controls, monitoring, and escalation. A vendor may develop, host, or operate the model, but the institution still chooses the use case, relies on the output, and bears responsibility for managing associated risks. The best action is therefore to keep internal business accountability clear and apply normal governance steps such as inventory registration, risk assessment, vendor due diligence, performance monitoring, and issue management. Vendor reports and contract clauses can support oversight, but they do not replace internal ownership or independent challenge where required.

Shifting accountability to the vendor confuses operational responsibility with institutional accountability for AI use.
Making IT the accountable owner misses that the key decision risk sits in the lending business process, not only the hosting environment.
Relying only on change notification is too narrow; it does not cover inventory, risk assessment, monitoring, or business-owner oversight.

Outsourcing development or hosting does not transfer the bank’s accountability for how the AI is used in its business decisions.

Question 32

Topic: AI Tools and Techniques

A risk analyst gives an LLM only this context: “Customer A opened a savings account in 2021 and filed two address changes in 2023.” The analyst then asks, “Using only this context, explain why Customer A changed jobs in 2023.” Which concept best describes the issue with this prompt?

A. Few-shot prompting with insufficient examples
B. Unsupported inference from missing context
C. Prompt injection through malicious instructions
D. Data drift in production model monitoring

Best answer: B

What this tests: AI Tools and Techniques

Explanation: A prompt is asking for an unsupported inference when it directs the model to answer a factual question that cannot be grounded in the provided context. Here, the only facts supplied are about an account opening and address changes. Nothing states that Customer A changed jobs, why a job change occurred, or whether employment is relevant. A well-controlled prompt should either limit the answer to stated facts or instruct the model to say that the information is not available. Otherwise, the model may generate a plausible but ungrounded explanation, creating hallucination and decision-support risk.

Few-shot prompting concerns providing examples to guide output format or reasoning, not asking for facts absent from the context.
Prompt injection involves instructions intended to override or manipulate the model’s intended behavior; no such malicious instruction appears here.
Data drift concerns changes in input or production data over time, not a single prompt’s lack of supporting context.

The prompt asks the model to explain a job change even though no job-change fact or reason is present in the supplied context.

Question 33

Topic: Data and AI Model Governance

A bank’s model risk team is setting validation plans for two new AI tools: an internal rules-based FAQ bot used only by operations staff, and a machine-learning model that recommends customer credit limit changes. The credit model uses many variables, has limited explainability, and could materially affect customers and revenue. What is the best action for setting validation scope?

A. Use the same full validation checklist for both tools to ensure consistent treatment across the AI inventory.
B. Defer validation until production monitoring shows which tool generates more exceptions.
C. Apply a risk-tiered approach, with deeper independent validation for the credit model and a lighter review for the FAQ bot.
D. Limit validation of the credit model to vendor or developer documentation if initial performance appears acceptable.

Best answer: C

What this tests: Data and AI Model Governance

Explanation: Validation scope should be proportionate to the risk created by the model’s use. The credit limit model is more material because it affects customer outcomes and revenue, is embedded in a decision workflow, is more complex, and has limited explainability. It therefore warrants deeper independent validation, such as review of data, performance, limitations, fairness or bias concerns, explainability, implementation, and monitoring. The internal FAQ bot may still need governance controls, but a lighter review can be appropriate if its use and impact are limited. Proportional validation helps avoid both under-validating high-risk AI and over-burdening low-risk tools.

Applying the same full checklist ignores meaningful differences in materiality and use.
Relying only on vendor or developer documentation does not provide sufficient independent challenge for a material credit model.
Waiting for production exceptions delays risk assessment until after potential customer or business harm occurs.

Validation should be proportionate to each model’s materiality, complexity, intended use, and risk profile.

Question 34

Topic: Risks and Risk Factors

A bank is validating a machine-learning score intended to support credit line increase decisions for existing retail customers. Which validation finding most directly indicates that the model is not fit for that intended use?

A. The model uses a non-linear ensemble method rather than a simpler linear scoring model.
B. On an independent sample representative of existing retail customers, the model fails the preapproved performance and calibration criteria for credit line decisions.
C. The model documentation does not list every hyperparameter trial evaluated during development.
D. The model was trained using labeled historical account data from the bank’s servicing platform.

Best answer: B

What this tests: Risks and Risk Factors

Explanation: A model is fit for intended use only if validation evidence shows it performs adequately for the specific decision, population, and risk tolerance it will support. The most direct adverse finding is failure on an independent, representative validation sample against predefined acceptance criteria, such as discrimination, calibration, or error limits relevant to credit line decisions. That result connects the model’s observed behavior to its planned business use. Other issues may require mitigation, documentation, or governance attention, but they do not by themselves prove the model cannot perform the intended task.

Using a non-linear ensemble may raise explainability or governance considerations, but complexity alone does not establish unfitness.
Incomplete detail on every hyperparameter trial is a documentation weakness, not the most direct evidence of decision-performance failure.
Training on labeled internal historical data may be appropriate if the data is relevant, governed, and representative.

Failure against predefined validation criteria on representative data directly shows the model cannot support its intended decision use.

Question 35

Topic: History and Overview of AI Concepts

A bank uses an AI model to compare each card transaction with a customer’s normal spending behavior and broader transaction patterns. Transactions with unusually high risk scores are routed for review before authorization. Which AI-supported business decision is primarily described?

A. Enterprise risk-report aggregation
B. Investment portfolio rebalancing
C. Fraud detection and prevention
D. Credit underwriting and limit setting

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: AI can support different financial-services decisions by recognizing patterns in data and producing scores, alerts, or recommendations for human or automated workflows. In this case, the relevant data are card transactions, customer spending behavior, and unusual activity patterns. The output is a risk score used before transaction authorization, which is characteristic of fraud detection and prevention. Credit underwriting would focus on repayment capacity and credit terms, investment support would focus on asset allocation or trading decisions, and risk reporting would summarize exposures or metrics for oversight rather than decide whether a specific payment should be reviewed.

Credit underwriting is not the best match because the scenario does not assess borrower default risk or credit limits.
Investment portfolio rebalancing is unrelated because no portfolio holdings, market signals, or allocation decision is described.
Enterprise risk-report aggregation would summarize risk information for management, not score individual card transactions before authorization.

The model is identifying anomalous transaction patterns to support decisions about whether to block, review, or approve payments.

Question 36

Topic: History and Overview of AI Concepts

An analytics team is developing a supervised machine learning model to rank small-business loan applications. The team has fitted models on a training split and repeatedly compared feature sets and hyperparameters using a validation split. Because the final governance review is approaching, the project lead proposes merging the validation and test splits to produce a stronger-looking performance estimate. What is the best action?

A. Use the test split during hyperparameter tuning so the final model is optimized against the same metric used in governance review.
B. Train the model on all available data immediately and rely on production monitoring to determine whether it generalizes.
C. Preserve an untouched test set for final evaluation and use training data for fitting and validation data for tuning and model selection.
D. Merge the validation and test splits because more observations will make the reported performance estimate more stable.

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: Training, validation, and testing datasets serve different purposes. Training data are used to fit model parameters. Validation data are used during development to tune hyperparameters, compare feature sets, and select among candidate models. Because the validation set influences those choices, its performance is no longer an independent estimate of final model quality. The test set should remain untouched until the model design is fixed, then be used once to estimate out-of-sample performance for governance and deployment decisions. Merging validation and test data after repeated tuning creates data leakage and can make performance appear better than it is.

Merging validation and test data may increase sample size, but it contaminates the final evaluation because validation results already shaped model choices.
Using the test split for hyperparameter tuning turns the test set into another validation set and undermines independent assessment.
Relying only on production monitoring skips a pre-deployment generalization check and may expose the business to avoidable model risk.

Separate purposes reduce overfitting and provide a more credible estimate of how the selected model will generalize.

Question 37

Topic: History and Overview of AI Concepts

A risk team asks an AI development group to maintain a record of each training dataset’s source, version, movement, transformations, and use so that results can be reproduced, reviewed by auditors, and traced to accountable owners. Which concept best matches this description?

A. Data lineage
B. Model interpretability
C. Feature engineering
D. Data labeling

Best answer: A

What this tests: History and Overview of AI Concepts

Explanation: Data lineage is the documented trail of where data came from, how it moved, how it was transformed, and where it was used in an AI system. It matters because reproducibility depends on knowing the exact data version and preparation steps used to train or test a model. It supports auditability by giving reviewers evidence to trace model inputs and data handling decisions. It also supports accountability by showing responsible owners, control points, and approvals across the data lifecycle. In the stem, the requested record is not just improving model inputs or explaining outputs; it is creating a traceable data history.

Data labeling assigns target values or annotations; it does not by itself trace data origin, transformations, and use.
Feature engineering creates or transforms predictive variables, but it is only one activity that lineage should document.
Model interpretability explains model behavior or outputs, not the end-to-end history of the data used.

Data lineage traces data from origin through transformations and use, supporting reproducibility, auditability, and accountability.

Question 38

Topic: AI Tools and Techniques

A bank is piloting a generative AI assistant to help branch staff answer product-eligibility questions. The current benchmark reports high average answer accuracy on a fixed FAQ test, but pilot reviews show the assistant sometimes gives different answers to equivalent prompts, cites policy sections that do not exist, and once suggests skipping a suitability check to speed onboarding. What is the best action before approving the assistant for broader use?

A. Expand the evaluation to test factuality against approved policies, consistency across prompt variants, harmful guidance, and support from valid sources.
B. Restrict the assistant to senior branch staff while relying on a disclaimer that outputs may be incorrect.
C. Approve the assistant because the fixed FAQ benchmark shows high average answer accuracy.
D. Tune the assistant for more fluent responses and clearer wording before retesting user satisfaction.

Best answer: A

What this tests: AI Tools and Techniques

Explanation: Generative AI evaluation often needs more than a conventional accuracy score because outputs are open-ended and can vary across semantically similar prompts. In this scenario, the assistant’s inconsistent answers indicate a need to test response stability. Nonexistent policy citations indicate weak source support or grounding. The suggestion to bypass a suitability check creates harmful output risk, even if many other responses are correct. Before broader approval, the bank should evaluate whether responses are factually correct against authoritative sources, consistent under reasonable prompt variations, safe for the intended use, and supported by traceable evidence.

High average FAQ accuracy is insufficient because it can mask hallucinated citations, inconsistent outputs, and unsafe recommendations.
Improving fluency may make responses sound more credible without making them more factual, grounded, or safe.
Limiting use to senior staff and adding a disclaimer does not resolve the underlying evaluation gaps before deployment.

The pilot issues show that average accuracy alone does not address factual grounding, stability, safety, or citation support for generative AI outputs.

Question 39

Topic: Risks and Risk Factors

An operations team proposes to accept the risk of an AI claims-triage tool for the next quarter. The firm’s AI risk appetite permits acceptance only when residual exposure is within approved limits and required human-review controls are operating. Current monitoring shows the tool exceeds the approved error-rate limit for vulnerable customers, and the required human-review sample has not been implemented. Which concept best describes why the proposed acceptance is inappropriate?

A. Routine residual risk acceptance
B. Risk acceptance outside approved appetite
C. Risk transfer through third-party reliance
D. Inherent risk identification before controls

Best answer: B

What this tests: Risks and Risk Factors

Explanation: Risk acceptance is appropriate only when the residual risk is within the organization’s stated risk appetite and required controls or conditions are satisfied. In this scenario, both conditions fail: the monitored error rate exceeds the approved limit for a sensitive customer group, and the required human-review control is not operating. That makes the proposed acceptance a risk appetite breach or out-of-tolerance exposure, not a normal acceptance decision. The appropriate governance response would typically be escalation, remediation, compensating controls, or restriction of use rather than simply accepting the risk.

Routine residual risk acceptance applies when remaining risk is within tolerance and required controls are in place.
Risk transfer would shift or share some exposure contractually or operationally, but no transfer mechanism is described.
Inherent risk identification concerns exposure before controls; the stem describes monitored residual exposure and failed control expectations.

The exposure breaches stated appetite and required controls are missing, so acceptance is not an appropriate risk response.

Question 40

Topic: Responsible and Ethical AI

A bank adds a notice to an internal generative AI research tool stating that answers are drawn from approved policy documents and recent case notes, may miss new regulatory interpretations, include a confidence indicator, and must be reviewed by a qualified employee before use in client communications. Which responsible AI concept does this notice best illustrate?

A. Adversarial prompt testing
B. Meaningful transparency disclosure
C. Independent model validation
D. Data minimization control

Best answer: B

What this tests: Responsible and Ethical AI

Explanation: Transparency in responsible AI is not limited to saying that AI is being used. A useful disclosure helps users understand the basis and appropriate use of the output, including relevant data sources, known limitations, confidence or uncertainty, and when human review is required. In this scenario, the notice gives employees practical reliance information before they use AI-generated content in client communications. That is why it maps to meaningful transparency disclosure rather than a technical testing, privacy, or validation activity.

Data minimization concerns limiting data collection or use, not explaining how users should interpret outputs.
Independent model validation assesses whether the model is fit for purpose, but the stem describes user-facing disclosure.
Adversarial prompt testing evaluates resilience to malicious or manipulative prompts, not reliance guidance for ordinary users.

The notice explains the tool’s data sources, limitations, confidence, and human review expectations so users understand how to rely on its outputs.

Question 41

Topic: History and Overview of AI Concepts

A bank uses an AI model to rank small-business loan applications for manual underwriter review. The model was validated before launch, but the applicant mix has changed after a new digital channel, and management wants to continue using the model without changing its intended use. Which evidence would best support whether the model remains fit for that intended use?

A. Confirmation that the model code and hyperparameters have not changed since approval
B. The original validation report showing strong test-set performance before the digital channel was introduced
C. Recent monitoring results on current production applications comparing predicted risk rankings with realized outcomes against approved performance criteria
D. A business report showing that underwriters find the model’s risk rankings convenient to use

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: A model remains fit for intended use when evidence shows it still performs adequately for the current population, decision context, and approved performance expectations. Because the applicant mix has changed, historical validation alone is insufficient. The best evidence is recent monitoring or back-testing on current production data, comparing predictions with realized outcomes and approved criteria such as discrimination, calibration, error rates, or ranking effectiveness. Unchanged code does not prove unchanged performance, because data drift or population shift can degrade results. User convenience may support adoption, but it does not establish model performance or risk fitness.

Original validation is useful baseline evidence, but it may no longer reflect the changed applicant population.
Unchanged code misses the risk that inputs or borrower behavior have shifted.
Underwriter convenience addresses usability, not whether predictions remain accurate or reliable.

Current, outcome-based performance evidence tied to the model’s intended use is the strongest basis for assessing ongoing fitness.

Question 42

Topic: AI Tools and Techniques

A bank uses a large model trained on broad, diverse text and code data. Rather than building a separate model from scratch for each application, teams adapt it with prompts or limited fine-tuning for tasks such as summarizing policies, drafting customer-service replies, and extracting terms from contracts. Which AI concept best matches this description?

A. Foundation model
B. Rule-based expert system
C. Retrieval-augmented generation system
D. Task-specific supervised model

Best answer: A

What this tests: AI Tools and Techniques

Explanation: Foundation models are general-purpose AI models trained on broad datasets at scale, often using self-supervised learning. Their key feature is adaptability: the same underlying model can support many downstream tasks through prompting, fine-tuning, or integration into applications. In the bank scenario, one large model is reused for summarization, drafting, and information extraction, which is the hallmark of a foundation model. A task-specific supervised model is usually trained for one defined prediction or classification task. Retrieval-augmented generation may use a foundation model but adds external document retrieval. A rule-based expert system relies on predefined human-authored rules rather than broad data-driven training.

A task-specific supervised model is narrower because it is trained for a defined target task rather than broadly reused across many tasks.
Retrieval-augmented generation describes an architecture that grounds outputs in retrieved content, not the broad base model itself.
A rule-based expert system depends on explicit rules and does not learn broad representations from large-scale data.

A foundation model is broadly trained on large, diverse data and can be adapted or prompted for many downstream tasks.

Question 43

Topic: Data and AI Model Governance

A bank classifies a new machine-learning credit-line model as material because it will directly influence customer limits. The analytics team that built the model has also written the validation memo and its product owner is prepared to approve production release to meet a launch deadline. What is the BEST governance action before deployment?

A. Pause release until an independent validation function reviews the model and approval is obtained outside the development team.
B. Proceed with deployment if the analytics team adds more performance metrics to the validation memo.
C. Allow the product owner to approve deployment because the model is owned by the analytics team.
D. Deploy the model with enhanced post-implementation monitoring by the same analytics team.

Best answer: A

What this tests: Data and AI Model Governance

Explanation: Segregation of duties is weakened when the same team develops, validates, and approves a material AI model. For a material model, governance should preserve independent challenge: developers may provide evidence and respond to issues, but validation and approval should be performed by roles or committees that are separate from the development team and sufficiently objective. Additional documentation, ownership by the product team, or post-deployment monitoring can support governance, but they do not correct the core conflict created by self-validation and self-approval before production use.

Adding more metrics may improve documentation, but it does not make the validation independent.
Product ownership does not justify self-approval for a material model that affects customers.
Monitoring after release is useful, but it cannot substitute for independent pre-deployment validation and approval.

Material models require segregation between development, validation, and approval to preserve independent challenge.

Question 44

Topic: History and Overview of AI Concepts

A bank uses an AI system trained and validated only to classify incoming payment messages as likely duplicate or not duplicate. It performs well in that workflow but cannot answer customer-service questions, summarize policies, or perform other banking tasks without separate design and training. Which concept best describes this system?

A. General-purpose AI
B. Narrow task-specific AI
C. Foundation model
D. Artificial general intelligence

Best answer: B

What this tests: History and Overview of AI Concepts

Explanation: Narrow or task-specific AI is designed to perform a defined function within a limited context, such as classifying payment messages for duplicate risk. Strong performance on that task does not imply the system can reason across domains, adapt to unrelated workflows, or perform open-ended business activities. Broader general-purpose AI capabilities involve models that can be adapted across many tasks, and artificial general intelligence would imply human-like flexibility across domains. In this scenario, the need for separate design and training before handling other banking tasks is the key evidence that the system is narrow AI.

General-purpose AI would imply broader adaptability across multiple tasks, which the system does not have.
Artificial general intelligence would imply flexible, human-like capability across domains, far beyond the described classifier.
A foundation model may support many downstream tasks, but the described system is a single-purpose classifier rather than a broad pretrained model.

The system is optimized for one defined business task and lacks broader cross-domain capabilities.

Question 45

Topic: Data and AI Model Governance

During a quarterly AI inventory review, a payments unit proposes an AI fraud model. The model scores card transactions in real time and automatically blocks transactions above a threshold; a block can deny a customer’s purchase until staff review occurs. The model uses standard transaction fields and has acceptable validation results. Which classification decision is best?

A. Classify it as low intensity because acceptable validation results reduce the need for governance review.
B. Classify it based on the development team’s preferred label because fraud prevention is an operational control.
C. Classify it as low intensity because it uses standard transaction fields rather than sensitive demographic data.
D. Classify it for heightened governance because it can automatically affect customer access to financial services.

Best answer: D

What this tests: Data and AI Model Governance

Explanation: AI inventory classification should reflect the risk created by how the system is used, not only the data type or validation result. A system that automatically blocks card transactions can directly affect customers’ access to financial services, even if the purpose is fraud prevention and the input data is ordinary transaction data. That customer impact and degree of automation justify heightened governance, such as stronger approval, documentation, monitoring, escalation, and independent review expectations. Good validation results are important evidence, but they do not eliminate the need for governance intensity when the use case is high-impact.

Standard transaction fields may reduce some data-risk concerns, but they do not remove the customer-impact risk from automated blocking.
Acceptable validation supports model readiness, but it is not a substitute for governance classification.
A team’s preferred label or operational-control purpose does not determine governance intensity; actual use and impact do.

Automated customer-impacting decisions are a classification factor that should increase governance intensity.

Question 46

Topic: Data and AI Model Governance

An AI issue log shows repeated late escalation of high-risk output errors across several business use cases. Each incident was corrected individually, but root-cause reviews show no defined owner for monitoring exceptions, no standard severity criteria, and no requirement to verify corrective actions before issue closure. Which concept best matches this pattern?

A. Normal model performance drift requiring recalibration
B. Independent validation evidence confirming control effectiveness
C. Control design weakness in AI issue management
D. Isolated operational errors by individual users

Best answer: C

What this tests: Data and AI Model Governance

Explanation: Recurring AI issues should be assessed for patterns in process, ownership, escalation, and remediation. In this case, the same type of failure occurs across multiple use cases and the root causes point to absent governance elements: no owner, no severity standard, and no closure verification. That indicates the issue management control is not designed well enough to prevent, detect, escalate, and resolve problems consistently. Individual fixes may correct specific outputs, but they do not address the underlying control weakness. Effective AI governance would require defined accountability, consistent issue classification, escalation triggers, remediation tracking, and evidence that corrective actions worked before closure.

Model drift concerns changing performance over time, but the stem emphasizes missing issue-management controls.
Isolated user errors would not explain repeated failures across several use cases with common root causes.
Independent validation evidence would support assurance; here, the findings show control gaps, not control effectiveness.

Recurring issues tied to missing ownership, criteria, and closure verification indicate a systemic governance/control gap rather than one-off mistakes.

Question 47

Topic: Responsible and Ethical AI

A bank plans to deploy an LLM assistant that drafts responses for branch staff. Validation found weaker performance on uncommon products, and the product team rejected delaying launch for more testing or limiting the tool to common products; the business owner approved launch with mandatory human review and accepted the remaining risk. What is the best documentation action before deployment?

A. Update the responsible AI documentation to record the rejected alternatives, the known product-coverage limitation, the human-review control, and the business owner’s residual-risk acceptance.
B. Rely on the vendor’s model card because internal approval with human review transfers the remaining risk away from the bank.
C. Document only the final human-review control because rejected alternatives are not part of the approved design.
D. Wait to document the limitation until production monitoring shows that customers were affected.

Best answer: A

What this tests: Responsible and Ethical AI

Explanation: Responsible AI documentation should preserve the reasoning behind material deployment decisions, not just the final technical design. When a team proceeds despite a known limitation, rejects feasible alternatives, or formally accepts residual risk, the record should show what was considered, why alternatives were rejected, what limitations remain, what controls mitigate them, and who is accountable for the acceptance. In this scenario, the weaker performance on uncommon products and the decision not to delay or restrict launch are material governance facts. Recording them before deployment supports transparency, challenge, monitoring, auditability, and future review if the risk profile changes.

Recording only the final human-review control omits the decision rationale and accepted residual risk.
Waiting for customer impact treats documentation as incident reporting rather than a predeployment governance control.
A vendor model card may inform due diligence, but it does not replace internal accountability for use-case-specific limitations and risk acceptance.

This decision involves rejected risk-reduction options, a known limitation, and accepted residual risk, all of which should be documented before use.

Question 48

Topic: Risks and Risk Factors

A bank’s validation team can reproduce local feature attributions for each credit model decision and understands which variables drive the score. However, declined applicants and relationship managers receive only a generic statement that an automated tool was used, with no understandable reason or escalation path. Which concept is most directly lacking?

A. Data lineage across source systems and transformations
B. Technical model interpretability for validators and developers
C. Performance monitoring for model drift and accuracy decay
D. Business transparency to affected users and decision stakeholders

Best answer: D

What this tests: Risks and Risk Factors

Explanation: Technical model interpretability and business transparency are related but distinct. Interpretability focuses on whether technical teams can understand or explain model behavior, such as feature importance, local attributions, or decision logic. In the scenario, that capability exists because validators can reproduce explanations for individual decisions. Business transparency focuses on whether users, customers, managers, or regulators receive information that is understandable, actionable, and appropriate for their role. The risk here is that affected stakeholders are not told meaningful reasons for the outcome or how to challenge or escalate it, even though the model is technically explainable internally.

Technical model interpretability is not the main gap because validators can already explain model drivers.
Data lineage concerns where data came from and how it was transformed, not whether decisions are communicated clearly.
Performance monitoring addresses ongoing accuracy or drift, not the transparency of explanations to stakeholders.

The gap is the failure to provide understandable, useful information to customers and managers despite internal technical explanations.

Question 49

Topic: Responsible and Ethical AI

A bank’s responsible AI team reviews a proposed AI collections tool by identifying how its decisions could affect past-due customers, call-center employees, external debt-recovery counterparties, and the control owners who monitor exceptions. Which concept best matches this review?

A. Control owner attestation
B. Stakeholder impact assessment
C. Privacy impact assessment
D. Model performance benchmarking

Best answer: B

What this tests: Responsible and Ethical AI

Explanation: Stakeholder impact assessment is a responsible AI activity focused on who may be affected by an AI system and how. In this scenario, the review goes beyond technical model accuracy and considers customers, employees, counterparties, and internal control owners. The key issue is recognizing AI’s effects on different stakeholder groups, including operational burdens and potential harms. The concept is broader than privacy, benchmarking, or control sign-off, though those activities may support the overall governance process.

Model performance benchmarking compares model results against technical or business performance standards, not stakeholder effects.
Privacy impact assessment focuses mainly on personal data collection, use, and protection.
Control owner attestation confirms responsibility or operation of controls but does not itself map affected stakeholder groups.

It identifies the groups affected by an AI system and evaluates how benefits, harms, duties, or burdens may shift across them.

Question 50

Topic: Data and AI Model Governance

A bank has introduced an AI tool that recommends exceptions to small-business loan policy. The use case is rated material because recommendations can affect credit decisions. Its new override-logging and second-review controls have failed in each of the last three monthly control tests, even after management remediation. What is the best action for the AI governance lead?

A. Reclassify the use case as lower risk because the tool only recommends exceptions rather than approving loans.
B. Close the issue if the model’s accuracy metrics remain within approved performance thresholds.
C. Allow the model owner to continue self-testing the controls until the next annual model review.
D. Refer the AI governance controls to independent audit or assurance review and report the repeated failures to the oversight forum.

Best answer: D

What this tests: Data and AI Model Governance

Explanation: Audit or assurance should become involved when AI governance controls are new, material, or repeatedly failing—especially when the AI use case can affect customer or credit outcomes. In this scenario, the controls are new, tied to a material credit decision-support use case, and have failed across multiple testing cycles despite remediation. That pattern indicates a potential weakness in control design, execution, ownership, or governance oversight. The best action is not simply more first-line self-testing or waiting for a scheduled review; it is to obtain independent assurance and ensure the oversight forum is informed so the issue can be tracked, challenged, and remediated appropriately.

Continuing only with model-owner self-testing misses the need for independent review after repeated failures.
Good model accuracy does not resolve failed governance controls over how recommendations are reviewed and logged.
Treating the use case as lower risk ignores that recommendations can still influence credit decisions and customer outcomes.

New, material controls that repeatedly fail after remediation warrant independent assurance and governance escalation.

Questions 51-75

Question 51

Topic: AI Tools and Techniques

A bank deploys a retrieval-based AI assistant for internal policy questions. A review focuses on whether the searchable policy sources are authoritative and current, whether employee permissions are enforced during retrieval, and whether the index returns the most relevant passages before the model generates an answer. Which concept best matches this review focus?

A. Prompt phrasing optimization for user instructions
B. Supervised fine-tuning of model weights
C. Temperature tuning for response variability
D. Retrieval corpus and index governance for grounding

Best answer: D

What this tests: AI Tools and Techniques

Explanation: Retrieval-based AI systems, such as retrieval-augmented generation, rely on external sources being retrieved and passed to the model as context. If those sources are low quality, stale, improperly permissioned, or poorly indexed, the model may generate answers that appear fluent but are unsupported, outdated, or disclose information to the wrong user. Governance of the retrieval corpus and index helps ensure that the system is grounded in trusted, current, access-controlled, and findable content. This is different from changing the model itself; the key risk mechanism is the quality and control of what the system retrieves.

Temperature tuning affects randomness or consistency in generation, not whether retrieved sources are current, authorized, or relevant.
Supervised fine-tuning changes model behavior through training examples, but the stem concerns external retrieval content and indexing.
Prompt phrasing can guide the model’s response style or instructions, but it does not by itself validate source quality, freshness, access controls, or retrieval relevance.

The review targets the quality, freshness, permissioning, and retrievability of the context used to ground the assistant’s outputs.

Question 52

Topic: History and Overview of AI Concepts

A retail bank is piloting an AI tool that ranks transaction alerts for fraud investigators. The model’s validation metrics meet target on historical data, but investigators are beginning to close low-ranked alerts without review, the feedback labels used for retraining are incomplete, and the production API is not connected to the bank’s incident-escalation workflow. What is the best action before expanding deployment?

A. Restrict system access to fraud managers and treat the remaining issues as standard cybersecurity controls.
B. Approve expansion because validation metrics meet target and monitor only aggregate fraud losses after deployment.
C. Run an end-to-end AI risk assessment and implement controls covering investigator use, alert workflow, feedback-data quality, model monitoring, and production integration.
D. Retrain the model with a larger historical dataset before rollout while leaving user procedures and workflow unchanged.

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: AI systems operate in a broader socio-technical environment. Strong historical validation results reduce one source of model risk, but they do not address how users act on outputs, whether feedback data are reliable, whether workflow controls assign accountability, or whether technology integration supports escalation and monitoring. In this scenario, investigators may be over-relying on rankings, incomplete labels could degrade future model performance, and the production API may bypass incident handling. The best action is an integrated risk assessment and control plan that covers people, process, data, model, and technology before scaling the tool.

Relying only on validation metrics ignores user behavior, retraining data, and operational escalation gaps.
Retraining may help model performance, but it does not correct incomplete labels, human overreliance, or process integration issues.
Access restriction addresses only a narrow technology or security concern, not the full AI risk context.

AI risk arises from the interaction of people, process, data, models, and technology, so controls must address the full operating system rather than one component.

Question 53

Topic: Responsible and Ethical AI

A bank is preparing to deploy an AI model that recommends whether customers qualify for hardship fee waivers. Which finding most clearly indicates the team should pause deployment for ethical escalation before release?

A. Pre-release testing shows a materially higher false-denial rate for a vulnerable customer segment, and the team has not identified a valid justification or mitigation.
B. The model uses more historical data than originally planned, and the data lineage has been documented and approved.
C. The monitoring dashboard will be reviewed at the first scheduled governance meeting after launch, as specified in the approved plan.
D. A challenger model has slightly higher overall accuracy, but the selected model remains within approved performance tolerances.

Best answer: A

What this tests: Responsible and Ethical AI

Explanation: A deployment pause is appropriate when a responsible AI concern is material, unresolved, and likely to affect stakeholders unfairly or harmfully. In this scenario, the model may deny hardship fee waivers at a disproportionately high rate for a vulnerable segment, and the team lacks a justified explanation or mitigation. That creates a fairness and customer-impact issue that should be escalated before release. By contrast, documented data lineage, performance trade-offs within approved tolerances, and planned post-launch governance activities are not, by themselves, clear reasons to stop deployment.

Documented and approved data lineage supports governance; it does not signal an unresolved responsible AI issue.
A slightly more accurate challenger model may raise model selection questions, but the selected model is still within approved tolerances.
Scheduled monitoring is a normal lifecycle control when it follows the approved governance plan; it is not the same as an unresolved pre-launch ethical concern.

An unresolved, material fairness and customer-harm concern should be escalated and resolved before deployment.

Question 54

Topic: History and Overview of AI Concepts

A bank operations group uses an AI model that reads a standard exception form and assigns one of eight workflow codes. Validation showed strong accuracy only on that form and those eight categories. The business sponsor now proposes using the same model to summarize customer complaints and draft response text. What is the best action for the risk manager?

A. Reject the model inventory entry unless the tool can perform human-level reasoning across unrelated business functions.
B. Approve the expanded use because strong validation accuracy shows the model can generalize to related operations tasks.
C. Classify the model as general-purpose AI because it is used in more than one bank operations process.
D. Treat the current model as narrow, task-specific AI and require a separate assessment before any broader language-use deployment.

Best answer: D

What this tests: History and Overview of AI Concepts

Explanation: Narrow AI is designed and validated for a specific task, such as classifying a standard form into predefined workflow codes. That evidence does not show that the same model can perform broader language tasks such as summarizing complaints or drafting responses. A risk manager should distinguish the proven use case from the proposed new capability and require a separate assessment of suitability, data, performance, controls, and user oversight before deployment. General-purpose capability is not established merely because a tool is labeled “AI” or works well in one narrow workflow.

Strong accuracy on workflow coding is not evidence of safe or effective performance on summarization or drafting.
Use in multiple processes does not by itself make a model general-purpose; capability breadth matters.
Requiring human-level reasoning across unrelated functions confuses general-purpose business capability with artificial general intelligence.

Performance on one defined classification task does not demonstrate general-purpose capability for summarization or drafting.

Question 55

Topic: AI Tools and Techniques

A collections business unit has integrated a vendor generative AI tool into its workflow to draft customer emails. The tool is using live customer data, but it is not recorded in the enterprise AI inventory, has not received required model-risk or compliance approval, and has no monitoring of output quality or customer complaints. What is the best action for the risk manager to recommend?

A. Pause or restrict production use until the tool is inventoried, approved through the required governance process, and assigned monitoring controls.
B. Request the vendor’s assurance report and defer internal review until the next annual risk assessment cycle.
C. Add a customer-facing disclosure that AI may have assisted in drafting the email.
D. Permit continued use because employees review each email before it is sent to customers.

Best answer: A

What this tests: AI Tools and Techniques

Explanation: An AI tool used with live customer data in a business workflow should be brought under the organization’s AI governance process. Inventory creates visibility and ownership, approval confirms that the use case, data, compliance, and operational risks have been reviewed, and monitoring detects issues such as poor output quality, bias, complaints, or control failures. Human review and vendor documentation may reduce some risks, but they do not replace internal accountability for approving and monitoring the deployment. The best action is therefore to stop or limit production use until the missing governance controls are in place.

Human review is helpful, but it does not cure the absence of inventory, approval, and monitoring.
Vendor assurance can support due diligence, but it cannot substitute for internal governance over the specific use case.
Customer disclosure may improve transparency, but it does not address the core control gap in deployment oversight.

The decisive gap is unauthorized AI deployment without inventory, approval, or monitoring, so governance onboarding and controls are needed before continued production use.

Question 56

Topic: History and Overview of AI Concepts

A bank uses a supervised AI model to classify card transactions as low fraud risk or high fraud risk. After a portfolio migration, 70% of newly scored low fraud risk transactions have tenure_days=999; the data dictionary says 999 is a placeholder for unknown tenure, and the model card lists tenure as a top predictive feature. What should the model-risk analyst do first?

A. Add the migrated transactions to the next training set as-is to improve current-population coverage.
B. Flag tenure_days=999 as a missing-value encoding issue and require corrected preprocessing before relying on the classifications.
C. Recalibrate the score threshold so fewer migrated transactions are classified as low fraud risk.
D. Accept the classifications because tenure_days=999 is consistently populated by the source system.

Best answer: B

What this tests: History and Overview of AI Concepts

Explanation: The decisive issue is data quality: a missing-value placeholder is entering the model as though it were a valid tenure value. Because tenure is identified as a top predictive feature, the value 999 can materially affect the classification outcome, especially when it is concentrated among transactions classified as low fraud risk. The best first step is to flag the invalid encoding, correct the data preprocessing or imputation treatment, and reassess the model output before relying on the classifications. This is a data-concepts issue because AI predictions depend not only on the algorithm but also on whether input fields accurately represent the real-world attributes they are meant to measure.

Recalibrating the threshold treats the symptom without correcting the invalid model input.
Adding migrated transactions as-is increases exposure to the same data defect.
Treating 999 as valid because it is consistently populated ignores the data dictionary’s meaning.

The placeholder is being treated as a real predictive input, which can directly distort the model’s fraud-risk classifications.

Question 57

Topic: History and Overview of AI Concepts

An AI application assigns a risk score to loan-renewal requests and lists the main factors behind the score. Because the decision can materially affect a customer and may require judgment about exceptions, a credit officer must review the output and approve or decline the renewal. Which concept best describes this use case?

A. Unsupervised pattern discovery
B. Post-deployment drift monitoring
C. AI-enabled decision support
D. Fully autonomous decision-making

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: A use case should be treated as decision support when the AI output helps a person make a decision but the accountable decision remains with a human. This is especially important when the outcome has material customer impact, requires judgment, or may involve exceptions that are not captured fully by model inputs. In the stem, the AI provides a risk score and explanatory factors, but a credit officer reviews the case and approves or declines the renewal. That is not full automation; it is AI-assisted judgment within a controlled decision process.

Fully autonomous decision-making would let the system approve or decline without human review.
Unsupervised pattern discovery describes finding groups or structures in data, not the governance of the final decision.
Post-deployment drift monitoring checks model behavior over time after implementation, not whether a use case is decision support.

The AI informs and structures a human decision but does not make the final customer-impacting decision autonomously.

Question 58

Topic: History and Overview of AI Concepts

A card issuer wants an AI system to review each customer’s recent transactions and flag transactions that are highly unusual compared with that customer’s normal spending pattern. Which machine learning use case best matches this description?

A. Prediction
B. Clustering
C. Anomaly detection
D. Classification

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: Anomaly detection is used to find observations that are unusual relative to a baseline pattern, such as transactions that do not fit a cardholder’s normal behavior. In this example, the goal is not primarily to assign every transaction to a predefined category, group similar customers, or estimate a future value. The decisive feature is the search for exceptional or outlying activity that may warrant investigation. In financial services, anomaly detection is commonly used for fraud alerts, cyber event monitoring, and unusual account activity reviews.

Classification would apply if the main task were assigning each transaction to predefined labels such as fraud or not fraud.
Clustering would apply if the task were grouping similar customers or transactions without predefined labels.
Prediction would apply if the task were estimating a future outcome, such as next month’s spend or default probability.

The system is identifying observations that deviate significantly from expected or normal behavior.

Question 59

Topic: Risks and Risk Factors

A bank uses a vendor AI service to prioritize fraud alerts. The vendor announces that it has replaced the underlying model and changed how confidence scores are generated, which may affect false negatives, analyst workload, and customer friction. Which oversight action best matches this situation?

A. Accept the vendor’s release notes as sufficient evidence because the service remains externally hosted.
B. Open a material-change review requiring updated vendor assurance, impact assessment, and service testing before relying on the revised outputs.
C. Limit the review to cybersecurity access controls because no internal model code was changed.
D. Record the change for the next scheduled annual vendor review while continuing use without additional review.

Best answer: B

What this tests: Risks and Risk Factors

Explanation: A meaningful change to a third-party AI model or service should be treated as a material change when it may affect business outcomes, control performance, customer impact, or operational workload. The bank remains accountable for how the AI service is used, even when the model is hosted and maintained by a vendor. Appropriate oversight includes assessing the change’s impact on risk, obtaining updated vendor evidence, testing the revised service against relevant use-case criteria, and updating governance documentation or controls as needed. This is especially important for fraud alert prioritization because changes in scoring behavior can shift false negatives, false positives, escalation volumes, and customer experience.

Waiting for the annual review misses the risk created by a material model or service change.
Limiting review to cybersecurity controls ignores model performance, operational, and customer-impact risks.
Vendor release notes may inform the review, but they do not replace the firm’s accountability for assurance and use-case testing.

A vendor model change that can affect business risk should trigger renewed oversight, impact assessment, and testing rather than routine monitoring alone.

Question 60

Topic: History and Overview of AI Concepts

A bank uses an AI score to support credit line-increase approvals. Although the model has strong overall accuracy, the risk team evaluates false approvals, false declines, and cutoff choices in terms of expected losses, revenue, and customer impact for that specific approval decision. Which concept does this description best illustrate?

A. Decision-centered performance evaluation
B. Feature engineering for predictive signal
C. Unsupervised pattern discovery
D. Model transparency through explainability

Best answer: A

What this tests: History and Overview of AI Concepts

Explanation: Model performance should be judged in relation to the decision the model informs, not only by generic statistical metrics. A credit approval model may have high overall accuracy but still be unsuitable if its errors create unacceptable credit losses, missed revenue, customer harm, or control issues. Decision-centered evaluation connects metrics such as false positives, false negatives, thresholds, and calibration to the business action being taken. This helps determine whether the model is fit for purpose in its operating context.

Unsupervised pattern discovery concerns finding structure in unlabeled data, not evaluating decision consequences.
Feature engineering focuses on creating useful input variables, not judging whether outcomes support a business decision.
Explainability helps users understand model outputs, but it is not the same as evaluating performance against decision-specific costs and benefits.

It assesses model performance by the consequences and trade-offs of the business decision the model supports.

Question 61

Topic: Data and AI Model Governance

A business unit tells the AI governance team that it is using a vendor large language model to draft replies to small-business loan inquiries. Employees review and edit each draft before sending, prompts may include customer business information, and the use has no current AI inventory record or assigned risk tier. What is the best action when this AI system is first identified?

A. Assign the highest risk tier immediately because customer information may be included in prompts.
B. Approve continued use once the business manager confirms the drafts are not used for credit approval.
C. Exclude the tool from the AI inventory because employees review the drafts before sending them.
D. Create an inventory record with the owner, purpose, vendor, data inputs, users, human-review control, and status, then route it for risk classification.

Best answer: D

What this tests: Data and AI Model Governance

Explanation: Inventory capture and risk classification are related but distinct governance steps. When an AI use is first identified, the organization should record the system in the inventory with basic factual attributes, such as ownership, purpose, vendor, data used, users, lifecycle status, and key controls. Those facts then support a risk classification or tiering decision. The presence of customer information and a vendor LLM may later affect the risk tier, but it does not replace the need for a complete inventory record. Human review also does not remove the need to inventory the AI use; it is a control fact to capture and evaluate.

Immediate highest-risk classification jumps to a conclusion before capturing the factual basis for tiering.
Excluding the tool because humans review outputs confuses a mitigating control with an exemption from inventory.
Manager confirmation about non-credit-decision use is relevant context, but it is not a substitute for inventory capture and formal classification.

First identification requires capturing factual inventory attributes so risk classification can be performed using the actual use, data, and control context.

Question 62

Topic: Responsible and Ethical AI

A bank plans to use an AI model to prioritize applicants for unsecured-loan offers. Pre-launch testing shows the model would increase completed applications, but rural applicants receive materially fewer offers than comparable urban applicants because distance to branch is a high-importance feature. The product owner wants to launch immediately to capture the expected revenue benefit. Which action best balances business value with fairness and stakeholder-impact considerations?

A. Set a manual quota so rural and urban applicants receive the same number of offers regardless of model score or credit context.
B. Remove all location-related variables and launch immediately without retesting because the apparent source of bias has been eliminated.
C. Launch the model unchanged because the model does not explicitly use protected-class data and has a positive revenue case.
D. Delay broad launch, assess the feature’s proxy effect and stakeholder impact, modify or justify the model design, and roll out with monitoring and review controls.

Best answer: D

What this tests: Responsible and Ethical AI

Explanation: Responsible AI decision-making should consider both expected business value and potential adverse stakeholder impact. Here, the model has a measurable benefit, but testing has identified a plausible fairness issue: distance to branch may operate as a proxy that disadvantages rural applicants who are otherwise comparable. The best action is not to abandon the use case automatically, but also not to launch unchanged. A balanced approach is to pause broad deployment, evaluate the proxy effect, consider affected stakeholders, adjust or justify the model design, and implement monitoring and review controls for rollout. This keeps the business objective in view while ensuring the model’s impact is understood, documented, and governed.

Launching unchanged overweights revenue and ignores evidence of disparate stakeholder impact.
Removing variables without retesting may create new performance or fairness issues and does not demonstrate control effectiveness.
Equalizing offer counts by quota ignores credit context and is an unsupported remedy rather than a risk-based fairness control.

This preserves potential business value while addressing a concrete fairness concern before scaling the AI use case.

Question 63

Topic: Risks and Risk Factors

A financial institution deploys a fraud-alert assistant through an external AI platform. The provider controls model updates, retention of submitted data, service uptime, and the evidence available for independent review. Which concept best matches this risk description?

A. Internal model validation risk
B. Third-party AI risk
C. Prompt injection risk
D. Data representativeness risk

Best answer: B

What this tests: Risks and Risk Factors

Explanation: Third-party AI risk occurs when an organization depends on an external provider for important parts of an AI system or service. In this scenario, the provider controls model behavior through updates, handles submitted data under its own practices, determines service availability, and limits the assurance evidence the institution can review. Those dependencies can affect operational resilience, privacy, model performance, compliance, and oversight. The key issue is not only whether the AI model works, but whether the institution can govern, monitor, and obtain assurance over externally controlled components.

Data representativeness risk concerns whether training or input data adequately reflects the intended population, not vendor control of the AI service.
Prompt injection risk involves malicious or manipulative prompts changing model behavior, not ordinary reliance on an external provider.
Internal model validation risk would focus on weaknesses in the institution’s own independent review process, while the stem emphasizes external provider dependency.

The risk arises because an external provider controls key aspects of AI behavior, data handling, availability, and assurance evidence.

Question 64

Topic: Risks and Risk Factors

A bank uses the same generative AI tool for two activities: summarizing internal project meetings and drafting customer-facing explanations for declined credit applications. The risk team says the tool should be assessed separately for each activity, considering the business use, customer impact, technical complexity, and strength of human review and monitoring. Which AI risk concept best matches this approach?

A. Model performance benchmarking
B. Contextual, risk-based AI assessment
C. Third-party vendor concentration analysis
D. Data lineage documentation

Best answer: B

What this tests: Risks and Risk Factors

Explanation: AI risk should be assessed in context because the same tool or model can create very different risk profiles in different uses. An internal meeting-summary tool may have limited external impact if errors are reviewed and contained. A customer-facing credit-decision explanation may affect fairness, transparency, regulatory expectations, customer outcomes, and reputational risk. Complexity also matters because harder-to-understand systems may be more difficult to validate, explain, and monitor. The control environment matters because strong human review, access controls, testing, monitoring, and escalation can reduce residual risk. This is why AI risk taxonomies commonly consider use, impact, complexity, and controls rather than assigning risk based only on the technology itself.

Performance benchmarking focuses on measuring model outputs against metrics, not on overall risk in the use context.
Data lineage documentation addresses where data came from and how it moved, but it does not by itself assess impact, complexity, and controls.
Vendor concentration analysis concerns dependency on external providers, not the risk profile of each AI use case.

The assessment is based on how the AI is used, what impact it may have, how complex it is, and whether controls reduce the risk.

Question 65

Topic: Risks and Risk Factors

A bank is testing a generative AI assistant that summarizes external articles using retrieval-augmented generation. One retrieved webpage contains hidden text telling the model to ignore its instructions and reveal any client identifiers in the session context. In testing, the assistant repeats a client identifier in its answer. What is the best action?

A. Treat the event as data poisoning and rebuild the model’s training dataset using only approved sources.
B. Allow the release if monitoring logs capture future disclosures for compliance review.
C. Classify the event as indirect prompt injection with sensitive-data leakage risk, then isolate retrieved content as untrusted and restrict the assistant’s access to client data.
D. Treat the event as ordinary model hallucination and improve the answer-quality evaluation set before release.

Best answer: C

What this tests: Risks and Risk Factors

Explanation: The key risk is indirect prompt injection: malicious instructions embedded in external retrieved content influence the model despite system instructions. Because the assistant exposed a client identifier from session context, the incident also involves sensitive-data leakage. The best action is not only to improve prompts, but to treat retrieved content as untrusted, limit what sensitive data the model can access, and apply controls such as content isolation, output filtering, least privilege, and review before deployment. This directly addresses the attack path and reduces the potential impact if a prompt injection attempt succeeds.

Ordinary hallucination is a content accuracy problem, but the facts show malicious instructions caused the model to disclose sensitive data.
Data poisoning targets training data or model behavior over time; here the attack came through retrieved runtime content.
Logging future disclosures is detective only and does not provide an adequate preventive control before release.

The malicious instruction enters through retrieved content and causes disclosure, so the control response should address prompt injection and data-access exposure together.

Question 66

Topic: Responsible and Ethical AI

A bank pilots a generative AI assistant that summarizes loan-file notes and suggests missing-document checklist items. Independent model validation found the assistant’s summary accuracy within approved criteria, but a QA review finds analysts copying checklist items into credit memos without checking source documents, including one hallucinated item. What is the best action to address the issue?

A. Issue safe-use guidance and workflow controls requiring analysts to verify assistant outputs against source documents before using them in credit memos.
B. Repeat independent model validation because any hallucinated output means the validation conclusion is invalid.
C. Transfer accountability for incorrect credit memos from analysts to the AI model owner.
D. Retrain the assistant on a larger document set to eliminate the need for analyst review.

Best answer: A

What this tests: Responsible and Ethical AI

Explanation: Model validation and safe-use guidance address different control needs. Validation evaluates whether the model performs acceptably against defined criteria before or during approved use. Here, validation has already found the assistant within criteria; the observed failure is that analysts are relying on outputs without the required professional review. Because generative AI can still hallucinate even when generally reliable, users need clear instructions, training, workflow prompts, and documentation expectations that outputs are decision support and must be verified against authoritative sources. The best action targets the human-use pattern that creates the risk.

Repeating validation treats the issue as a technical performance failure, even though the decisive fact is improper reliance by users.
Retraining may improve outputs but cannot justify removing human verification for credit documentation.
Shifting accountability to the model owner ignores that business users remain responsible for how AI-assisted outputs are used in decisions.

The problem is user reliance on outputs in practice, so the best control is safe-use guidance and workflow reinforcement rather than another technical validation.

Question 67

Topic: Responsible and Ethical AI

A retail bank project team is preparing to deploy an AI tool that recommends credit-line reductions. Final responsible AI review finds a material disparity against one protected customer group, traced to a third-party data attribute that acts as a proxy for that group. The team has not tested a mitigation or obtained an approved risk acceptance, but the product owner wants to launch and monitor complaints after go-live. What is the BEST action?

A. Pause deployment, escalate the issue through the responsible AI governance process, and release only after mitigation or approved risk acceptance is completed.
B. Proceed because the disparity was identified before deployment and can be documented in the project file.
C. Deploy as scheduled and add the disparity to post-launch monitoring and complaint tracking.
D. Remove the third-party attribute immediately and deploy without further testing.

Best answer: A

What this tests: Responsible and Ethical AI

Explanation: A project team should pause deployment when a responsible AI concern is material, unresolved, and likely to affect customers or other stakeholders. Here, the review identified a fairness issue tied to a proxy attribute, and the team has neither validated a mitigation nor obtained formal risk acceptance. Launching anyway would shift an identified ethical and governance problem into production. The best response is to escalate through the responsible AI governance process, determine whether the use case can be remediated, and require appropriate approval before release. Monitoring is important, but it is not a substitute for addressing a known pre-deployment harm.

Post-launch monitoring is inadequate because the disparity is already known and material before go-live.
Removing the attribute without retesting may create new errors or leave proxy effects elsewhere in the model.
Documentation alone does not resolve a fairness concern or provide formal accountability for accepting the risk.

An unresolved material fairness concern before launch requires escalation and remediation or formal risk acceptance before deployment.

Question 68

Topic: History and Overview of AI Concepts

A risk team builds a supervised-learning model to predict whether a credit-card account will become 60 days past due in the next quarter. The training file contains current utilization, payment history, account age, income band, and a historical field showing whether the account actually became 60 days past due. Which term best maps to the historical past-due field?

A. Training label or target variable
B. Model hyperparameter
C. Validation metric
D. Input feature

Best answer: A

What this tests: History and Overview of AI Concepts

Explanation: In supervised learning, each training example typically pairs input variables with a known outcome. The input variables are features, such as utilization, payment history, account age, and income band. The known outcome the model is trained to predict is the label, also called the target variable. Here, the field showing whether the account actually became 60 days past due is the observed outcome, so it is the label. When the model is later used on new accounts, that outcome is not yet known; the model estimates it from the available features.

Input feature is incorrect because features are predictors available to the model, not the historical outcome being predicted.
Model hyperparameter is incorrect because hyperparameters are configuration choices set before or during training, not fields in the training data.
Validation metric is incorrect because metrics evaluate performance, such as accuracy or recall, rather than serving as the observed target outcome.

The historical past-due outcome is the known result the supervised model learns to predict.

Question 69

Topic: AI Tools and Techniques

A bank has an approved generative AI assistant that drafts internal credit memo summaries using a locked vendor model and a retrieval corpus limited to policy documents. The product team proposes changing the tool configuration to retrieve prior client emails and modifying the system prompt so the assistant can suggest credit exceptions, but the vendor model version and application code will not change. What is the best action for the AI governance lead?

A. Update the model inventory after deployment and rely on production monitoring to detect issues.
B. Limit the review to cybersecurity sign-off because the main change is access to additional documents.
C. Require change review and targeted testing before release because the configuration changes affect outputs, data exposure, and decision-support risk.
D. Approve deployment because the vendor model version and application code remain unchanged.

Best answer: C

What this tests: AI Tools and Techniques

Explanation: AI tool configuration can be risk-relevant even without retraining, code changes, or a new vendor model version. Retrieval sources determine what information the system can use and potentially expose, while system prompts shape the tool’s role, tone, constraints, and output behavior. In this scenario, adding prior client emails raises data governance, confidentiality, and privacy considerations, and allowing suggested credit exceptions changes the assistant from summarization toward decision support. Those facts make the proposed release appropriate for formal change review, targeted testing, and approval before production use.

Treating the change as safe because the base model and code are unchanged ignores that prompts and retrieval configuration can materially change behavior.
Waiting for post-deployment monitoring leaves risk controls reactive when the change is foreseeable before release.
A cybersecurity-only review is too narrow because the change also affects data use, output scope, and model-risk exposure.

Changing retrieval sources and prompts can materially alter model behavior and risk even when the underlying model and code are unchanged.

Question 70

Topic: Responsible and Ethical AI

A bank’s operations team wants to use a general-purpose chatbot to decide whether to release a fraud hold by asking, “Is this customer’s identity document genuine?” The chatbot has no connection to the bank’s document-verification system or authoritative identity records and can only generate a plausible text response. Which concept best describes why this AI tool should not be used for that decision?

A. Demographic bias in model outputs
B. Lack of verifiable factual grounding for a fact-critical decision
C. Model drift in production monitoring
D. Overfitting to training data

Best answer: B

What this tests: Responsible and Ethical AI

Explanation: A safe and reliable AI use case depends on whether the tool can support the decision being made. Here, releasing a fraud hold depends on a verified fact: whether the identity document is genuine. A general-purpose chatbot that is not connected to authoritative records or a verification system cannot establish that fact; it can only produce a plausible answer. Using it would create an unsafe reliance on unverified output, even if the response sounds confident. The appropriate control is to use an authoritative verification process, with AI used only in a role consistent with its evidence and limitations.

Model drift concerns changes in performance over time, not the tool’s inability to verify a required fact.
Overfitting describes poor generalization from training data, not lack of authoritative evidence.
Demographic bias may be relevant in identity processes, but the decisive issue here is unverifiable factual grounding.

The decision requires authoritative verification, but the chatbot cannot access or confirm the facts needed to support the action.

Question 71

Topic: Data and AI Model Governance

A bank uses an AI system to recommend collections actions for early-stage retail loan delinquencies. In the quarterly lifecycle review, the owner notes that the collections policy has been replaced by a hardship program with different treatment options, two key input data feeds have been discontinued, and staff override more than 70% of the system’s recommendations because they no longer match current procedures. What is the best action?

A. Initiate controlled retirement or decommissioning, remove the system from production use, update the inventory, and transition users to an approved process.
B. Keep the system active but require staff to document overrides for all recommendations.
C. Recalibrate the recommendation thresholds and continue production use until the next annual validation cycle.
D. Expand the system’s approved scope to cover the hardship program because human users can reject unsuitable outputs.

Best answer: A

What this tests: Data and AI Model Governance

Explanation: Retirement or decommissioning controls are needed when an AI system can no longer perform its approved purpose safely or reliably. Here, the approved use case has changed, key input feeds have been discontinued, and users are routinely overriding outputs because recommendations no longer align with current procedures. These are not minor performance issues; they indicate the system’s design assumptions and operating context are obsolete. A controlled decommission should remove the system from production decision support, update governance records such as the model inventory, preserve required documentation, and transition users to an approved alternative process.

Recalibration addresses parameter tuning, not discontinued inputs and an obsolete business process.
Override documentation may be useful temporarily, but it does not make an unfit system appropriate for continued use.
Expanding scope would require a new assessment and approval; human rejection alone does not cure lack of fitness for purpose.

The system is no longer fit for its approved purpose because its use context, inputs, and operating procedures have materially changed.

Question 72

Topic: AI Tools and Techniques

A risk team describes an AI assistant that searches approved internal policy documents at the time of a user query and uses the returned passages as context to draft a grounded response. Which AI concept does this describe?

A. Fine-tuning
B. Retrieval-augmented generation
C. Prompt engineering
D. Model distillation

Best answer: B

What this tests: AI Tools and Techniques

Explanation: Retrieval-augmented generation, often abbreviated RAG, combines information retrieval with generative AI. Instead of relying only on what the model learned during training, the system retrieves relevant source content—such as policy documents, procedures, or knowledge-base passages—and provides that content to the model when it generates an answer. This helps ground the response in approved or current material and can reduce unsupported outputs, although it does not eliminate the need for validation, access controls, and human oversight in higher-risk uses. In the scenario, the decisive feature is that the assistant searches approved documents at query time and uses those passages to produce the response.

Fine-tuning updates or adapts model parameters using training examples; it is not simply retrieving documents at query time.
Prompt engineering improves instructions or input wording but does not by itself add retrieved source material.
Model distillation creates a smaller model that imitates a larger one; it does not describe grounding responses with retrieved content.

Retrieval-augmented generation grounds a generative AI response by supplying relevant retrieved source content as context.

Question 73

Topic: Data and AI Model Governance

A bank plans to deploy a machine-learning model for loan decision support. Policy requires a party outside the model development team to assess conceptual soundness, data suitability, testing evidence, limitations, and performance before production approval. Which governance role is best suited for this activity?

A. First-line model owner
B. Internal audit
C. AI governance committee
D. Independent model validation function

Best answer: D

What this tests: Data and AI Model Governance

Explanation: In a three-lines-of-defense model, the first line owns and operates the AI use case, while an independent validation or risk review function provides technical challenge before deployment. For an AI model, validation typically examines whether the model is conceptually sound, uses appropriate and representative data, has been tested adequately, has known limitations documented, and performs within intended-use expectations. A governance committee may rely on that validation evidence when deciding whether to approve use, but it normally does not perform the detailed validation work. Internal audit provides third-line assurance over the governance framework and controls, often after processes are in place, rather than conducting the pre-production model validation itself.

The first-line model owner is accountable for development, operation, and ongoing use, but is not independent of the model build.
The AI governance committee may approve or escalate decisions, but it typically reviews evidence rather than performing detailed validation.
Internal audit assesses governance and control effectiveness, but it is not the usual pre-use technical validator.

Independent model validation is best suited to provide pre-use technical challenge of model design, data, testing, limitations, and performance.

Question 74

Topic: Responsible and Ethical AI

A financial institution’s AI policy requires each high-impact AI system to have a named business owner who approves its intended use, tracks issues, and is answerable to a governance committee for decisions made with the system. Which responsible AI principle is most directly reflected by this requirement?

A. Reliability
B. Transparency
C. Accountability
D. Fairness

Best answer: C

What this tests: Responsible and Ethical AI

Explanation: Accountability means that responsibility for an AI system’s use, controls, decisions, and impacts is clearly assigned. In the scenario, the key feature is not merely that the model is documented or technically robust, but that a specific business owner approves use, tracks issues, and reports to governance. This creates a line of responsibility for managing the AI system within the organization’s risk appetite. Other responsible AI principles may also be relevant to high-impact AI systems, but the described requirement most directly addresses who is answerable for the system and its consequences.

Transparency focuses on explainability, disclosure, and understandable information about the AI system, not primarily on ownership.
Fairness concerns avoiding unjustified bias or discriminatory impacts across groups.
Reliability concerns consistent, accurate, and dependable performance under intended conditions.

Assigning a named owner who is answerable for AI use and outcomes directly supports accountability.

Question 75

Topic: History and Overview of AI Concepts

An AI system scores transaction alerts and recommends which alerts can be closed. The policy requires low-confidence alerts, unusual customer impacts, or suspected data-quality issues to be routed to an operations analyst who can accept, reject, or override the recommendation before any customer action is taken. Which concept best matches this control placement?

A. Model training using labeled historical alerts
B. Data preprocessing before model scoring
C. Human-in-the-loop review at the decision stage
D. Post-deployment drift monitoring of model performance

Best answer: C

What this tests: History and Overview of AI Concepts

Explanation: Human review, escalation, and override controls often fit between the AI system’s output and the final decision or action. In this case, the model produces a score and recommendation, but specified cases are routed to a person before any customer-facing action occurs. That placement supports accountability, exception handling, and risk-based oversight for uncertain, high-impact, or potentially flawed outputs. It does not describe how the model is trained, how input data is prepared, or how the model is monitored over time; it describes a control embedded in the decision workflow.

Model training is about learning model parameters from historical data, not reviewing individual live recommendations.
Drift monitoring checks whether model behavior or data patterns change after deployment, but it does not itself approve or override a specific alert decision.
Data preprocessing transforms or cleans inputs before scoring, while the analyst control occurs after the score is produced.

The analyst review occurs after the AI output but before the final business action, allowing escalation or override.

Questions 76-80

Question 76

Topic: AI Tools and Techniques

A bank is choosing a model to support renewal decisions for small-business credit lines. A gradient-boosted model improves validation AUC from 0.81 to 0.83 over a regularized logistic regression, but its top drivers change materially across monthly samples. Credit policy requires defensible reason codes for each decision and a control owner must monitor key drivers after deployment. What is the best action?

A. Delay model use until a model with perfect interpretability and the highest AUC is available.
B. Deploy the regularized logistic regression, document the modest performance trade-off, and monitor its stable drivers.
C. Deploy both models and allow reviewers to use whichever output they prefer.
D. Deploy the gradient-boosted model because it has the highest validation AUC.

Best answer: B

What this tests: AI Tools and Techniques

Explanation: When performance gains are small, a simpler model may be preferable if the use case requires clear explanations, stable drivers, and practical post-deployment monitoring. Here, the gradient-boosted model has only a marginal AUC improvement, while its changing drivers create governance and control problems. The regularized logistic regression is more interpretable and easier to support with reason codes and monitoring. The best action is not to ignore performance, but to document the trade-off and choose the model that better fits the decision context and control requirements.

Choosing the highest AUC alone ignores the policy need for defensible explanations and stable monitoring.
Letting reviewers choose between models weakens governance and creates inconsistent decision support.
Waiting for perfect interpretability and the highest AUC is unrealistic and unnecessary when a controlled, adequate model is available.

The simpler model better satisfies interpretability, governance, and control needs while giving up only marginal measured performance.

Question 77

Topic: Responsible and Ethical AI

A bank’s product team plans to publish a client-facing disclosure for a new AI-assisted loan triage tool. The draft says, “Our AI automatically approves applications with 98% accuracy, has been independently validated, and reduces credit losses by 30%.” In fact, the 98% figure is agreement with historical underwriter decisions in a pilot dataset, no independent validation is complete, all recommendations require loan-officer review, and no loss-reduction outcome has been measured. What is the best action for the responsible AI reviewer?

A. Approve the disclosure because the pilot result is high and loan-officer review reduces the risk of inaccurate AI recommendations.
B. Require the disclosure to state the tool is AI-assisted decision support, describe the pilot metric accurately, and remove claims of completed validation and loss reduction until substantiated.
C. Approve the disclosure if the product team changes “automatically approves” to “supports approvals” but keeps the 98% accuracy, validation, and loss-reduction claims.
D. Delay publication only until the model architecture and training algorithm are described in more technical detail for customers.

Best answer: B

What this tests: Responsible and Ethical AI

Explanation: A transparent AI disclosure should be accurate, balanced, and supported by evidence. Here, the draft overstates several points: agreement with historical decisions is not the same as predictive accuracy, the tool is not autonomous because humans review recommendations, independent validation has not been completed, and the claimed 30% credit-loss reduction has not been measured. The best action is to revise or remove unsupported claims before publication. A disclosure can describe the tool’s actual role and evidence, but it should not imply proven performance, full automation, or completed governance checks when those facts are not true.

High pilot agreement and human review do not justify publishing unsupported claims about autonomy, validation, or business benefits.
Changing only the autonomy wording leaves misleading accuracy, validation, and loss-reduction statements in place.
More technical detail may aid transparency, but it does not cure exaggerated or unsubstantiated claims.

This directly corrects the misleading claims about accuracy, autonomy, validation status, and measured benefits.

Question 78

Topic: History and Overview of AI Concepts

A bank uses AI to support decisions on credit limits. Compared with an AI tool used only to summarize internal meeting notes, the bank requires clearer reason codes, stronger documentation of inputs, and evidence that decision logic can be reviewed. Which concept best matches this increased requirement?

A. Unsupervised learning for pattern discovery
B. Risk-based explainability for consequential decisions
C. Performance monitoring for model drift
D. Data minimization for privacy protection

Best answer: B

What this tests: History and Overview of AI Concepts

Explanation: Explainability is not equally important for every AI use case. When an AI system influences a customer outcome, such as a credit limit, pricing decision, eligibility decision, or other regulated process, the organization needs a clearer basis for how the output was produced. This supports customer communication, internal challenge, validation, audit, and regulatory oversight. An internal low-impact productivity tool may still need controls, but the explainability burden is typically lower because the consequences are less direct and less regulated. The stem points to a risk-based approach: as decision impact and regulatory relevance increase, so do expectations for interpretable outputs, documented inputs, reason codes, and reviewability.

Unsupervised learning describes a learning approach, not why higher-impact decisions require more explanation.
Data minimization addresses limiting personal data use, not explaining the basis for an AI-influenced decision.
Performance monitoring detects changes in model behavior over time, but it does not by itself provide decision-level rationale.

Explainability needs increase when AI affects customer outcomes or regulated decisions because stakeholders must understand, review, and challenge the basis for the result.

Question 79

Topic: Data and AI Model Governance

A bank’s first-line analytics team updates a customer-churn AI model with new features and deploys it directly to production. The model inventory is not updated, independent validation does not review the revised model, monitoring thresholds remain tied to the prior version, and no model change ticket is approved. Which lifecycle-control concept is best illustrated?

A. Uncontrolled model change bypassing lifecycle gates
B. Performance drift detected through production monitoring
C. Data lineage defect in source-system mapping
D. Human-review override failure in an automated decision

Best answer: A

What this tests: Data and AI Model Governance

Explanation: AI model governance relies on lifecycle gates so that material model changes are identified, reviewed, approved, documented, and monitored before or at deployment. In this scenario, the model was changed and promoted to production while bypassing multiple controls: the inventory was not updated, independent validation did not assess the revised model, monitoring remained linked to the old version, and change approval was skipped. That is an uncontrolled model change, not merely a technical performance issue. The risk is that stakeholders may rely on a model whose assumptions, data, performance, and controls have not been assessed for its current production use.

Data lineage defects concern unclear or incorrect data flow from source to model, but the main failure here is skipped lifecycle governance.
Performance drift is detected through monitoring after deployment; the stem says monitoring was not updated for the new version.
Human-review override failure would involve inappropriate or ineffective human intervention in decisions, not bypassed validation and change control.

The revised model entered production without required validation, monitoring updates, inventory maintenance, or change approval.

Question 80

Topic: Responsible and Ethical AI

An AI review finds that a customer-service summarization tool passed its accuracy validation, but employees are treating summaries as definitive and skipping required source-document checks. A proposed control defines permitted uses, required human checks, warning signs, and escalation steps for uncertain outputs. Which concept does this description best match?

A. Training data representativeness assessment
B. Post-deployment drift monitoring
C. Independent model validation
D. Safe-use guidance for users

Best answer: D

What this tests: Responsible and Ethical AI

Explanation: The issue is user reliance in practice, not whether the model’s technical performance was tested. Safe-use guidance translates model limitations into workflow instructions: when outputs may be used, what independent checks are required, when human judgment must override the tool, and how uncertain or potentially harmful outputs should be escalated. Model validation remains important, but it evaluates whether the model is fit for intended use based on evidence such as testing, assumptions, performance, and limitations. A validated model can still create risk if users over-trust it or use it outside approved procedures.

Independent model validation focuses on assessing model performance and fitness for use, not day-to-day reliance behavior.
Post-deployment drift monitoring detects performance or data changes over time, not whether users are following appropriate review steps.
Training data representativeness assessment evaluates whether the development data reflect the target population, not user over-reliance on outputs.

Safe-use guidance addresses how people should rely on AI outputs in the workflow, including limits, checks, and escalation.

Exam snapshot

Item	Detail
Issuer	GARP
Exam route	GARP RAI
Official exam name	GARP Risk and AI Certificate (RAI)
Full-length set on this page	80 questions
Exam time	240 minutes
Topic areas represented	5

Full-length exam mix

Topic	Approximate official weight	Questions used
History and Overview of AI Concepts	20%	16
AI Tools and Techniques	20%	16
Risks and Risk Factors	20%	16
Responsible and Ethical AI	20%	16
Data and AI Model Governance	20%	16

Continue in the web app

Use Finance Prep for interactive GARP RAI practice with mixed sets, timed mock exams, topic drills, explanations, and progress tracking.

Focused topic pages

Practice next step

Use the full Finance Prep practice page above for the latest review links and practice page.

Data and AI Model Governance

Official Resources