RAI — GARP Risk and AI Certificate Quick Review

Fast review for the GARP Risk and AI Certificate (RAI): AI risk, governance, model validation, data, bias, explainability, and GenAI controls.

Quick Orientation for RAI Candidates

This quick review is for candidates preparing for the GARP Risk and AI Certificate (exam code RAI). Use it as a final-pass review before moving into independent companion practice: topic drills, mock exams, and detailed explanations.

The exam rewards more than vocabulary. Be ready to apply AI risk concepts to practical scenarios involving governance, data, model development, validation, deployment, monitoring, bias, explainability, operational resilience, and generative AI.

Practical study rule: if you can explain what can go wrong, why it matters, how to detect it, and what control reduces the risk, you are reviewing at the right level.

High-Yield RAI Review Map

| Area | What to Know | Common Exam Angle |
| --- | --- | --- |
| AI and ML fundamentals | Supervised, unsupervised, reinforcement, generative AI, model training, inference | Match model type to use case and risk |
| Data risk | Quality, lineage, representativeness, privacy, leakage, bias | Identify flawed data assumptions |
| Model development | Feature engineering, train/test split, overfitting, hyperparameters | Choose better development practice |
| Model validation | Independent review, conceptual soundness, performance testing, limitations | Distinguish validation from monitoring |
| Governance | Accountability, roles, risk appetite, policies, escalation | Identify missing ownership or weak controls |
| Explainability | Global/local explanations, SHAP, LIME, counterfactuals | Select explanation method by audience and use |
| Fairness and bias | Disparate outcomes, proxy variables, label bias, fairness trade-offs | Avoid “remove protected variable” trap |
| Operational risk | Deployment, change management, outages, human override, vendor reliance | Recognize production risk beyond model accuracy |
| Cyber and adversarial risk | Data poisoning, evasion, prompt injection, model extraction | Select control for attack type |
| Generative AI | Hallucination, grounding, RAG, guardrails, red teaming, human review | Apply controls to LLM-specific risks |
| Monitoring | Drift, degradation, threshold breaches, incident response | Know what to monitor and when to retrain |
| Third-party risk | Vendor models, APIs, cloud, documentation, auditability | Identify due diligence gaps |

AI, ML, and Model Types

Core Distinctions

| Concept | Meaning | Watch For |
| --- | --- | --- |
| Artificial intelligence | Systems performing tasks associated with human intelligence | Broad umbrella; not all AI is machine learning |
| Machine learning | Models learn patterns from data rather than explicit rules | Data quality and representativeness become central risks |
| Deep learning | Neural network methods with many layers | Often high performance but less transparent |
| Generative AI | Produces new text, images, code, or other content | Hallucination, misuse, IP/privacy, prompt risks |
| Foundation model | Large pretrained model adaptable to many tasks | Broad capability creates broad risk surface |
| Inference | Applying a trained model to new inputs | Production controls matter here |
| Training | Estimating model parameters from data | Leakage, bias, overfitting, and compute risk arise here |

Model Family Quick Review

| Model Type | Typical Use | Strength | Key Risk |
| --- | --- | --- | --- |
| Linear/logistic regression | Scoring, classification, interpretable baselines | Transparent, stable | Misses nonlinear relationships |
| Decision tree | Rules-based classification/regression | Easy to explain | Overfits if unconstrained |
| Random forest | Ensemble prediction | Robust, strong performance | Less interpretable than single tree |
| Gradient boosting | Credit, fraud, pricing, risk scoring | High predictive power | Sensitive to tuning; explainability challenge |
| Neural network | Complex patterns, images, language, nonlinear prediction | Flexible | Opaque, data/compute intensive |
| Clustering | Segmentation, anomaly grouping | Finds structure without labels | Clusters may lack business meaning |
| Anomaly detection | Fraud, cyber, operational exceptions | Identifies rare patterns | High false positives if poorly calibrated |
| Reinforcement learning | Sequential decisions, optimization | Learns from reward feedback | Safety and unintended strategy risk |
| Large language model | Text generation, summarization, assistants | Flexible natural language capability | Hallucination, prompt injection, data leakage |

Supervised vs. Unsupervised vs. Generative

| Question | Likely Category |
| --- | --- |
| “We have labeled historical outcomes and want to predict a future outcome.” | Supervised learning |
| “We want to group customers or transactions without labels.” | Unsupervised learning |
| “We want the model to create text, code, or synthetic content.” | Generative AI |
| “We want an agent to learn actions over time based on rewards.” | Reinforcement learning |

AI Risk Lifecycle

    flowchart LR
        A[Use Case Definition] --> B[Data Sourcing and Governance]
        B --> C[Model Development]
        C --> D[Independent Review and Validation]
        D --> E[Approval and Deployment]
        E --> F[Production Monitoring]
        F --> G[Change Management]
        G --> H[Retire, Replace, or Revalidate]
        F --> C

Lifecycle Control Points

| Stage | Key Question | Control Focus |
| --- | --- | --- |
| Use case definition | Is AI appropriate for the decision? | Materiality, risk appetite, human impact |
| Data sourcing | Is the data fit for purpose? | Lineage, quality, permissions, representativeness |
| Development | Is the model designed soundly? | Method selection, feature review, documentation |
| Validation | Does the model work as intended? | Independent challenge, testing, limitations |
| Approval | Who accepts the residual risk? | Governance, sign-off, accountability |
| Deployment | Is implementation faithful and secure? | Access control, testing, rollback plans |
| Monitoring | Is performance stable over time? | Drift, accuracy, bias, usage, incidents |
| Change management | What changed and who approved it? | Version control, revalidation triggers |
| Retirement | Is the model still needed? | Decommissioning, replacement, record retention |

Governance and Accountability

Governance Building Blocks

| Element | Purpose | Weak Signal |
| --- | --- | --- |
| Risk appetite | Defines acceptable risk levels | No escalation when thresholds are breached |
| Policy and standards | Create consistent minimum expectations | Teams invent their own controls |
| Roles and responsibilities | Clarify ownership | “The model owns itself” or no named accountable owner |
| Independent challenge | Tests assumptions and limitations | Development team validates its own work without review |
| Documentation | Supports auditability and repeatability | Key decisions exist only in emails or code comments |
| Inventory | Tracks AI systems and materiality | Shadow AI tools used outside approval |
| Escalation process | Ensures issues reach decision-makers | Monitoring flags ignored |
| Human oversight | Keeps accountable judgment in the loop | Rubber-stamp review or no override path |

Three-Lines-of-Defense Style Thinking

| Function | Typical Responsibility | Exam Trap |
| --- | --- | --- |
| First line | Owns and operates the AI use case | Cannot outsource accountability to validation or vendor |
| Second line | Sets risk framework, challenges, oversees | Should not become the model developer |
| Third line | Independent audit/assurance | Reviews framework effectiveness, not daily tuning |

Governance Decision Rules

  • High materiality + low explainability requires stronger documentation, validation, monitoring, and human oversight.
  • Customer-impacting decisions require special attention to fairness, transparency, complaint handling, and override processes.
  • Automated decisions without review raise governance stakes, especially if adverse outcomes are possible.
  • Vendor-provided AI still needs internal accountability, due diligence, performance monitoring, and exit planning.
  • GenAI used for advice, summaries, or decisions needs controls for hallucination, grounding, prompt injection, and user misuse.

Data Risk Quick Review

Common Data Risks

| Risk | Meaning | Example | Control |
| --- | --- | --- | --- |
| Poor quality | Inaccurate, incomplete, inconsistent data | Missing income fields | Data checks, cleansing, reconciliation |
| Bias | Data reflects historical inequities or sampling issues | Underrepresented borrower group | Bias testing, representative sampling |
| Leakage | Training data contains future or target information | Using post-default collection status to predict default | Feature review, temporal validation |
| Drift | Production data changes over time | New customer mix after product launch | Drift monitoring, retraining triggers |
| Lineage gaps | Unknown origin or transformations | Vendor data field cannot be traced | Data lineage documentation |
| Privacy exposure | Sensitive information used improperly | PII in prompts or logs | Minimization, masking, access controls |
| Proxy variables | Non-sensitive variables approximate sensitive traits | ZIP code as socioeconomic proxy | Fairness review, feature testing |
| Label error | Outcome variable is wrong or biased | Fraud labels based only on detected fraud | Label audit, alternative labels |

Data Leakage Traps

Data leakage is one of the most important candidate traps. Look for features that would not be available at the time of decision.

| Suspicious Feature | Why It May Leak |
| --- | --- |
| Collection outcome used in credit approval model | Outcome occurs after approval |
| Claim settlement amount used in claim triage at filing | Settlement occurs later |
| Fraud investigation result used at transaction authorization | Investigation occurs after transaction |
| Customer churn reason used to predict churn | Reason is known only after churn |
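
The "would this be available at decision time?" test can be sketched as a simple date comparison. This is a minimal illustration, assuming each feature has a known date on which its value becomes available; the function name, feature names, and dates are hypothetical.

```python
from datetime import date

def find_leaky_features(feature_available_on, decision_date):
    """Return names of features whose values only become known after the decision date."""
    return sorted(
        name
        for name, available in feature_available_on.items()
        if available > decision_date
    )

# Hypothetical credit approval decided on 1 March 2024.
decision_date = date(2024, 3, 1)
feature_available_on = {
    "stated_income": date(2024, 2, 20),
    "bureau_score": date(2024, 2, 28),
    "collection_outcome": date(2024, 9, 15),  # known only months after approval
}

leaky = find_leaky_features(feature_available_on, decision_date)
print(leaky)
```

In this example only `collection_outcome` is flagged, matching the first row of the table above.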

Train, Validation, and Test Sets

| Dataset | Purpose | Candidate Trap |
| --- | --- | --- |
| Training set | Fit model parameters | Do not use it as proof of real-world performance |
| Validation set | Tune hyperparameters and select model | Repeated tuning can overfit validation data |
| Test set | Final unbiased performance estimate | Do not tune after looking at test results |
| Out-of-time sample | Tests temporal stability | Often more realistic for financial data |
| Production data | Live operating environment | Must be monitored; not the same as test data |
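
An out-of-time sample can be built by splitting on a cutoff date instead of at random. The sketch below assumes each record carries an `as_of` observation date; the record layout and function name are hypothetical.

```python
from datetime import date

def out_of_time_split(records, cutoff):
    """Hold out every record observed on or after the cutoff date."""
    development = [r for r in records if r["as_of"] < cutoff]
    holdout = [r for r in records if r["as_of"] >= cutoff]
    return development, holdout

# Hypothetical loan observations.
loans = [
    {"id": 1, "as_of": date(2022, 6, 30), "defaulted": 0},
    {"id": 2, "as_of": date(2022, 12, 31), "defaulted": 1},
    {"id": 3, "as_of": date(2023, 6, 30), "defaulted": 0},
    {"id": 4, "as_of": date(2023, 12, 31), "defaulted": 0},
]

development, holdout = out_of_time_split(loans, cutoff=date(2023, 1, 1))
print([r["id"] for r in development], [r["id"] for r in holdout])
```

Because every holdout record is later than every development record, the holdout result reflects how the model behaves on a period it has not seen, which a random split cannot show.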

Model Development and Performance

Overfitting vs. Underfitting

| Pattern | Meaning | Typical Evidence | Response |
| --- | --- | --- | --- |
| Overfitting | Learns noise or idiosyncrasies | High training performance, weak test performance | Simplify model, regularize, add data, cross-validate |
| Underfitting | Too simple to capture signal | Weak training and test performance | Add features, richer model, improve data |
| Data drift | Input distribution changes after deployment | Performance degrades over time | Monitor, recalibrate, retrain |
| Concept drift | Relationship between inputs and target changes | Old predictors no longer work | Reassess model design and assumptions |

Classification Metrics

Use the confusion matrix language carefully:

| Term | Meaning |
| --- | --- |
| True positive | Model predicts positive and actual outcome is positive |
| False positive | Model predicts positive but actual outcome is negative |
| True negative | Model predicts negative and actual outcome is negative |
| False negative | Model predicts negative but actual outcome is positive |

Key formulas:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \]

\[ \text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \]
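
The formulas above can be checked with a short Python sketch. The counts are hypothetical (10 fraud cases in 1,000 transactions) and are chosen only to show how accuracy can mislead on rare events; the function name is illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Baseline that always predicts "no fraud".
always_no_fraud = classification_metrics(tp=0, fp=0, tn=990, fn=10)

# Hypothetical detector that catches 8 of 10 fraud cases but raises 40 false alerts.
simple_detector = classification_metrics(tp=8, fp=40, tn=950, fn=2)

print(always_no_fraud)
print(simple_detector)
```

The always-negative baseline reaches 99% accuracy with zero recall. The detector has lower accuracy (95.8%) but 80% recall, and its precision of about 17% shows the alert-volume cost that a review team would have to absorb.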

Metric Selection Traps

| Situation | Better Focus | Why |
| --- | --- | --- |
| Rare fraud events | Precision, recall, PR curve, false positives | Accuracy can look high by predicting “no fraud” |
| Safety-critical missed events | Recall / false negative rate | Missing positives is costly |
| Manual review capacity is limited | Precision and alert volume | Too many false positives overwhelm teams |
| Ranking customers by risk | AUC, lift, calibration, decile analysis | Threshold may be chosen later |
| Probability used in pricing or capital | Calibration | Ranking alone is insufficient |
| Highly imbalanced data | Precision-recall metrics | ROC-AUC may appear optimistic |

Regression Metrics

| Metric | Use | Trap |
| --- | --- | --- |
| Mean absolute error | Average absolute prediction error | Easy to interpret but treats all errors linearly |
| Mean squared error | Penalizes large errors | Sensitive to outliers |
| Root mean squared error | Error in original units | Still outlier-sensitive |
| R-squared | Share of variance explained | In-sample value can overstate fit on new data |
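
A small sketch makes the outlier sensitivity concrete. The data are hypothetical; the second prediction set differs from the first only in one badly missed value.

```python
import math

def regression_metrics(actual, predicted):
    """Compute MAE, MSE, RMSE, and R-squared for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum(e * e for e in errors)
    r_squared = 1 - ss_residual / ss_total if ss_total else float("nan")
    return {"mae": mae, "mse": mse, "rmse": rmse, "r_squared": r_squared}

actual = [10, 12, 14, 16]
close_fit = regression_metrics(actual, [11, 12, 13, 16])
one_outlier = regression_metrics(actual, [11, 12, 13, 26])  # last prediction is off by 10

print(close_fit)
print(one_outlier)
```

With the single large miss, MAE rises from 0.5 to 3.0 while MSE rises from 0.5 to 25.5, and R-squared turns negative, meaning the predictions fit worse than simply predicting the mean.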

Validation and Independent Challenge

What Validation Should Cover

| Validation Area | Questions to Ask |
| --- | --- |
| Conceptual soundness | Does the model approach make sense for the use case? |
| Data suitability | Is the data accurate, representative, and available at decision time? |
| Implementation | Was the model coded and deployed correctly? |
| Performance | Does it perform well on relevant samples and segments? |
| Stability | Does performance hold over time and across conditions? |
| Explainability | Can users and reviewers understand drivers and limitations? |
| Fairness | Are outcomes unfairly adverse for groups or segments? |
| Limitations | Are weaknesses documented and accepted? |
| Monitoring plan | Are thresholds, owners, and actions defined? |

Validation Is Not the Same as Monitoring

| Activity | Timing | Purpose |
| --- | --- | --- |
| Validation | Before approval and after material change | Assess fitness for intended use |
| Monitoring | Ongoing after deployment | Detect degradation, drift, misuse, or control failures |
| Audit | Periodic independent assurance | Assess whether governance and controls work |
| Development testing | During model build | Improve the model, not independently challenge it |

Common Validation Mistakes

  • Validating only accuracy while ignoring data quality, implementation, fairness, and explainability.
  • Treating vendor documentation as a substitute for internal review.
  • Testing on random splits when time-based splits are more appropriate.
  • Ignoring subgroup performance because aggregate performance looks strong.
  • Approving a model without defined monitoring thresholds.
  • Failing to document known limitations and compensating controls.

Explainability and Transparency

Explainability Methods

| Method | Best For | Limitation |
| --- | --- | --- |
| Feature importance | Global driver overview | May not explain individual decisions |
| Partial dependence plot | Average effect of a feature | Can mislead when features are correlated |
| SHAP-style explanation | Local and global contribution analysis | Can be complex and computationally expensive |
| LIME-style explanation | Local approximation around one prediction | Approximation may be unstable |
| Counterfactual explanation | “What would need to change?” | Must be realistic and actionable |
| Surrogate model | Simple model approximates complex model | Approximation is not the true model |
| Model cards / documentation | Communicate intended use, data, limits | Only useful if accurate and maintained |
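
To make "local contribution" and "counterfactual" concrete, here is a minimal sketch for a two-feature logistic scoring model. The coefficients, intercept, baseline, and applicant values are all hypothetical, and the approach relies on the model being linear in log-odds; it is not a substitute for SHAP or LIME on a complex model.

```python
import math

# Hypothetical logistic model: log-odds of approval.
COEFFICIENTS = {"debt_to_income": -4.0, "years_employed": 0.3}
INTERCEPT = 0.5

def approval_probability(applicant):
    """Logistic probability of approval for one applicant."""
    z = INTERCEPT + sum(COEFFICIENTS[k] * applicant[k] for k in COEFFICIENTS)
    return 1 / (1 + math.exp(-z))

def local_contributions(applicant, baseline):
    """Log-odds contribution of each feature relative to a baseline applicant."""
    return {k: COEFFICIENTS[k] * (applicant[k] - baseline[k]) for k in COEFFICIENTS}

def counterfactual_value(applicant, feature, target_probability=0.5):
    """Value of one feature that reaches the target probability, others held fixed."""
    target_z = math.log(target_probability / (1 - target_probability))
    other = INTERCEPT + sum(
        COEFFICIENTS[k] * applicant[k] for k in COEFFICIENTS if k != feature
    )
    return (target_z - other) / COEFFICIENTS[feature]

applicant = {"debt_to_income": 0.5, "years_employed": 2}
baseline = {"debt_to_income": 0.3, "years_employed": 5}

contributions = local_contributions(applicant, baseline)
needed_dti = counterfactual_value(applicant, "debt_to_income")
print(round(approval_probability(applicant), 3), contributions, needed_dti)
```

The contributions explain this one applicant (local), while the counterfactual answers "what would need to change": here, the debt-to-income ratio would need to fall to 0.275 to reach a 50% approval probability. Whether that change is realistic and actionable for the customer is a separate judgment.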

Explanation Audience Matters

| Audience | Needs |
| --- | --- |
| Model developer | Technical diagnostics and feature behavior |
| Validator | Assumptions, limitations, robustness, implementation evidence |
| Business owner | Decision drivers, risk trade-offs, control implications |
| Customer or affected party | Clear, actionable reason for outcome where applicable |
| Senior management | Material risk, residual risk, accountability, escalation |

Explainability Traps

  • A model can be explainable but still biased.
  • A model can be accurate but not appropriate for a high-impact decision.
  • Global explanations do not necessarily explain a specific individual outcome.
  • Removing complex algorithms does not automatically remove risk.
  • More explanation is not always better; the explanation must be truthful, relevant, and usable.

Fairness, Bias, and Responsible AI

Sources of Bias

| Source | Description | Example |
| --- | --- | --- |
| Historical bias | Past decisions reflect unfair patterns | Historical lending approvals encode discrimination |
| Sampling bias | Training data underrepresents a group | Few observations for a region or customer type |
| Measurement bias | Variables measure groups differently | Inconsistent income verification methods |
| Label bias | Target variable reflects imperfect process | Fraud labels only for investigated cases |
| Proxy bias | Neutral-looking variable captures sensitive trait | Location or education as proxy |
| Deployment bias | Model used differently than intended | Advisory score becomes automatic rejection |

Fairness Metrics: Conceptual Differences

| Concept | Basic Idea | Trap |
| --- | --- | --- |
| Demographic parity | Similar positive prediction rates across groups | May ignore true risk differences |
| Equal opportunity | Similar true positive rates across groups | Constrains only true positive rates; false positive rates can still differ |
| Equalized odds | Similar true positive and false positive rates | Often hard to satisfy with other goals |
| Calibration by group | Predicted probabilities mean the same across groups | May conflict with parity metrics |
| Individual fairness | Similar individuals treated similarly | Requires defining “similar” appropriately |
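
The first three concepts in the table can be compared from the same per-group counts. The sketch below uses a tiny hypothetical dataset of `(group, actual, predicted)` rows, where 1 is the favorable outcome; the group labels and function name are illustrative.

```python
def group_rates(records):
    """Compute selection rate, true positive rate, and false positive rate per group."""
    counts = {}
    for group, actual, predicted in records:
        c = counts.setdefault(group, {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
        if predicted and actual:
            c["tp"] += 1
        elif predicted and not actual:
            c["fp"] += 1
        elif not predicted and actual:
            c["fn"] += 1
        else:
            c["tn"] += 1

    rates = {}
    for group, c in counts.items():
        n = sum(c.values())
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        rates[group] = {
            # Demographic parity compares this rate across groups.
            "selection_rate": (c["tp"] + c["fp"]) / n,
            # Equal opportunity compares TPR; equalized odds compares TPR and FPR.
            "tpr": c["tp"] / positives if positives else None,
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return rates

records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = group_rates(records)
print(rates)
```

In this toy data both groups have the same base rate of actual positives, yet group A is selected at 75% and group B at 25%, and their true positive and false positive rates also differ. On real data the metrics often disagree with each other, which is why a single metric rarely settles a fairness question.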

High-Yield Fairness Decision Rules

  • Do not assume fairness because protected attributes are excluded. Proxies may remain.
  • Do not assume equal accuracy means equal impact. Error types may differ by group.
  • Do not assume one fairness metric solves all fairness concerns. Metrics can conflict.
  • Do test subgroup performance. Aggregate metrics can hide harm.
  • Do connect fairness findings to governance. Someone must decide, document, and monitor residual risk.

Generative AI and LLM Risk

GenAI Risk Quick Review

| Risk | Meaning | Control |
| --- | --- | --- |
| Hallucination | Plausible but false output | Grounding, retrieval, citations, human review |
| Prompt injection | User manipulates model instructions | Input filtering, instruction hierarchy, sandboxing |
| Data leakage | Sensitive data exposed in prompts, outputs, or logs | Data minimization, masking, access controls |
| Toxic or harmful output | Unsafe, biased, or inappropriate generation | Guardrails, moderation, red teaming |
| Model misuse | Users rely on outputs beyond intended use | Usage policy, training, disclaimers, monitoring |
| Overreliance | Human accepts output without review | Human-in-the-loop, confidence indicators |
| Model drift/version change | Provider updates affect behavior | Version tracking, regression testing |
| Retrieval error | RAG system retrieves wrong or stale context | Curated knowledge base, freshness controls |
| Automation bias | Users defer to AI recommendation | Review requirements, challenge prompts |

RAG and Grounding

Retrieval-augmented generation, or RAG, connects a generative model to external documents or databases. It can reduce hallucination, but it does not eliminate risk.

| RAG Component | Risk |
| --- | --- |
| Source documents | May be stale, wrong, or unauthorized |
| Retrieval ranking | May retrieve irrelevant context |
| Prompt assembly | May expose sensitive data |
| Generation | May distort retrieved content |
| Citation output | May cite sources incorrectly |
| User interface | May encourage overtrust |
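
Two of these risks (stale or unauthorized sources, and incorrect citations) lend themselves to simple programmatic checks. The sketch below is illustrative only: the document fields (`last_reviewed`, `allowed_roles`), the one-year freshness window, and both function names are assumptions, not features of any particular RAG product.

```python
from datetime import date, timedelta

def usable_context(documents, today, max_age_days, user_roles):
    """Keep documents that were reviewed recently and that the user is allowed to read."""
    oldest_allowed = today - timedelta(days=max_age_days)
    return [
        doc
        for doc in documents
        if doc["last_reviewed"] >= oldest_allowed and doc["allowed_roles"] & user_roles
    ]

def unsupported_citations(cited_ids, context_docs):
    """Return cited document ids that were never supplied to the model as context."""
    supplied = {doc["id"] for doc in context_docs}
    return sorted(set(cited_ids) - supplied)

# Hypothetical knowledge base.
documents = [
    {"id": "policy-2024", "last_reviewed": date(2024, 5, 1), "allowed_roles": {"analyst", "risk"}},
    {"id": "policy-2019", "last_reviewed": date(2019, 3, 1), "allowed_roles": {"analyst"}},
    {"id": "hr-file", "last_reviewed": date(2024, 4, 1), "allowed_roles": {"hr"}},
]

context = usable_context(
    documents, today=date(2024, 6, 1), max_age_days=365, user_roles={"analyst"}
)
flagged = unsupported_citations(["policy-2024", "policy-2019"], context)
print([d["id"] for d in context], flagged)
```

Only `policy-2024` passes both filters, so a generated answer that cites `policy-2019` is flagged for review. Checks like these reduce risk at the source and citation layers but do not address distortion during generation, which still needs output review.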

GenAI Control Stack

| Layer | Examples |
| --- | --- |
| Use-case control | Approved use cases, prohibited uses, risk tiering |
| Data control | No sensitive data in prompts unless authorized and protected |
| Prompt control | Templates, system instructions, prompt injection defenses |
| Model control | Approved models, versioning, performance testing |
| Output control | Human review, moderation, citations, confidence warnings |
| Access control | Role-based access, logging, authentication |
| Monitoring | Usage analytics, incidents, quality samples, abuse detection |
| Incident response | Escalation, containment, user notification process where applicable |

Cyber, Operational, and Third-Party AI Risk

Adversarial and Cyber Risks

| Attack / Risk | Description | Likely Control |
| --- | --- | --- |
| Data poisoning | Training data manipulated | Data provenance, anomaly checks, trusted sources |
| Evasion attack | Inputs crafted to avoid detection | Robust testing, adversarial testing |
| Model extraction | Attacker replicates model through queries | Rate limits, monitoring, API controls |
| Membership inference | Attacker infers whether data was in training set | Privacy controls, differential privacy where appropriate |
| Prompt injection | Malicious instructions override intended behavior | Prompt defenses, tool-use restrictions |
| Jailbreak | User bypasses safety constraints | Red teaming, guardrails, monitoring |
| Supply chain risk | Dependency, model, or library compromised | Vendor review, dependency management |

Operational Risk Questions

Ask these for any AI deployment:

  1. Who can access the system?
  2. What happens if the model fails or becomes unavailable?
  3. Can humans override the model?
  4. Are overrides tracked and reviewed?
  5. Is there a rollback plan?
  6. Are model versions controlled?
  7. Are inputs and outputs logged appropriately?
  8. Are incidents escalated?
  9. Are users trained on limitations?
  10. Is the model being used only for its approved purpose?

Third-Party and Vendor AI

| Due Diligence Area | What to Review |
| --- | --- |
| Intended use | Is the vendor solution appropriate for the business decision? |
| Data usage | What data is sent, stored, retained, or used for training? |
| Model transparency | What documentation, limitations, and testing evidence are available? |
| Security | Access controls, encryption, incident procedures |
| Resilience | Availability, service continuity, fallback options |
| Change management | How are updates communicated and tested? |
| Audit rights | Can the organization obtain needed assurance? |
| Exit strategy | Can the organization replace the service if needed? |

Monitoring, Drift, and Ongoing Control

What to Monitor

| Monitoring Area | Examples |
| --- | --- |
| Input data | Missing values, distributions, outliers, population shifts |
| Output data | Score distributions, approval rates, alert volume |
| Performance | Accuracy, recall, precision, error rates, calibration |
| Fairness | Group outcomes, error rates, adverse impact indicators |
| Stability | Drift, volatility, threshold breaches |
| Usage | Approved vs. actual use, user behavior, overrides |
| Operations | Latency, availability, failures, incidents |
| GenAI quality | Hallucination samples, unsafe outputs, user feedback |
| Security | Suspicious queries, abuse, unauthorized access |

Drift Types

| Drift Type | Meaning | Example |
| --- | --- | --- |
| Data drift | Input distribution changes | New customer segment uses product |
| Concept drift | Relationship between inputs and target changes | Fraud patterns evolve |
| Prediction drift | Output distribution changes | Sudden spike in high-risk scores |
| Performance drift | Actual model quality declines | Recall falls after market change |
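
One common way to quantify data drift or prediction drift is the population stability index (PSI), which compares the share of observations in each bin between a baseline and a current sample. PSI is not named in this review; it is offered here as an illustrative technique, and the bin counts and the `floor` parameter (which avoids taking the log of zero) are assumptions.

```python
import math

def population_stability_index(expected_counts, actual_counts, floor=1e-6):
    """PSI across matching bins: sum of (actual - expected) * ln(actual / expected) shares."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = max(expected / expected_total, floor)
        actual_share = max(actual / actual_total, floor)
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi

# Hypothetical score distribution in four bins at validation time.
baseline = [250, 250, 250, 250]

stable = population_stability_index(baseline, [250, 250, 250, 250])
shifted = population_stability_index(baseline, [100, 200, 300, 400])
print(stable, round(shifted, 4))
```

An unchanged distribution gives a PSI of zero, and the shifted one gives roughly 0.23. The value at which a PSI triggers investigation or retraining is a threshold the institution sets and documents in its monitoring plan.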

Revalidation Triggers

  • Material model change.
  • New data source or feature set.
  • New use case or user population.
  • Significant performance degradation.
  • Significant drift or threshold breach.
  • Vendor model update.
  • Change in operating environment.
  • Incident, complaint trend, or unexpected harm.
  • Regulatory, policy, or governance framework change where relevant.
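
Several of these triggers can be expressed as a rule check that runs alongside monitoring. The sketch below is a simplified illustration: the status keys, the drift limit of 0.25, and the maximum recall drop of 0.05 are hypothetical placeholders for thresholds an institution would define itself.

```python
def revalidation_reasons(status, drift_limit=0.25, max_recall_drop=0.05):
    """Return the triggers that fired; an empty list means none did."""
    flag_labels = {
        "material_model_change": "material model change",
        "new_data_source": "new data source or feature set",
        "new_use_case": "new use case or user population",
        "vendor_model_update": "vendor model update",
        "incident_or_complaint_trend": "incident, complaint trend, or unexpected harm",
    }
    reasons = [label for key, label in flag_labels.items() if status.get(key, False)]

    # Quantitative triggers compare monitored values with assumed thresholds.
    if status.get("drift_score", 0.0) > drift_limit:
        reasons.append("drift threshold breach")
    recall_drop = status.get("baseline_recall", 0.0) - status.get("current_recall", 0.0)
    if recall_drop > max_recall_drop:
        reasons.append("significant performance degradation")
    return reasons

# Hypothetical monitoring snapshot.
status = {
    "vendor_model_update": True,
    "drift_score": 0.31,
    "baseline_recall": 0.82,
    "current_recall": 0.80,
}
reasons = revalidation_reasons(status)
print(reasons)
```

Here the vendor update and the drift breach both fire, while the two-point recall drop stays within tolerance. A rule check like this only flags the need; the decision to revalidate, and who approves it, remains a governance step.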

Risk Assessment and Control Thinking

Inherent vs. Residual Risk

| Term | Meaning |
| --- | --- |
| Inherent risk | Risk before controls |
| Control | Measure designed to prevent, detect, or correct risk |
| Residual risk | Risk remaining after controls |
| Risk appetite | Level of risk the organization is willing to accept |
| Risk tolerance | Specific thresholds or limits supporting appetite |

Control Types

| Control Type | Purpose | Example |
| --- | --- | --- |
| Preventive | Stop issue before it occurs | Access restriction, approved feature list |
| Detective | Identify issue after or during occurrence | Drift monitoring, exception reports |
| Corrective | Fix or mitigate issue | Rollback, retraining, incident remediation |
| Compensating | Reduce risk when primary control is imperfect | Human review for low-explainability model |

Control Matching Drill

| If the Problem Is… | A Stronger Control Is Usually… |
| --- | --- |
| Unclear accountability | Named model owner and governance approval |
| Poor data lineage | Data documentation and lineage controls |
| Overfitting | Out-of-sample testing, regularization, simpler model |
| Biased outcomes | Fairness testing, feature review, governance decision |
| Hallucination | RAG, human review, output verification |
| Prompt injection | Input filtering, tool restrictions, red teaming |
| Vendor opacity | Due diligence, monitoring, contractual assurance |
| Drift | Monitoring thresholds and retraining process |
| Overreliance | Human-in-the-loop and user training |
| Unapproved usage | Access control and usage monitoring |

Scenario Decision Rules

Choose the Best Answer by Asking

  1. What is the primary risk? Data, model, governance, fairness, cyber, operational, or third-party?
  2. Where in the lifecycle is the issue? Development, validation, deployment, monitoring, or change?
  3. Is the control preventive, detective, or corrective?
  4. Who should own the action? Developer, business owner, risk function, validator, audit, vendor manager?
  5. Is the proposed action sufficient for materiality?
  6. Does the answer confuse performance with governance?
  7. Does the answer rely on a simplistic fix, such as “remove the variable” or “use a more accurate model”?

Common “Best Answer” Patterns

| Scenario | Strong Answer |
| --- | --- |
| Model performs well overall but poorly for one group | Investigate subgroup performance, fairness, data quality, and mitigation |
| New vendor AI tool is proposed | Conduct due diligence, assess data/security/model risk, define monitoring |
| LLM gives confident false answers | Add grounding, verification, human review, and monitoring |
| Model accuracy declines after launch | Investigate drift, data changes, implementation, and retraining triggers |
| Business wants to bypass validation to meet deadline | Escalate governance issue; do not skip independent review for material models |
| Feature is highly predictive but may be a proxy | Test for proxy effects and fairness implications |
| Model is used for a new decision | Reassess intended use, materiality, validation, and approval |
| Users ignore model limitations | Improve training, interface controls, oversight, and documentation |

Candidate Mistakes to Avoid

  • Equating AI risk management with model accuracy only.
  • Forgetting that data problems can dominate model problems.
  • Treating explainability as the same thing as fairness.
  • Assuming a black-box model is unacceptable in all cases.
  • Assuming a simple model is automatically low risk.
  • Ignoring human process risk around the model.
  • Overlooking production implementation and monitoring.
  • Thinking vendor AI removes internal responsibility.
  • Choosing a technical fix when the scenario is actually a governance failure.
  • Choosing a governance policy when the scenario needs a specific operational control.
  • Ignoring materiality: higher-impact use cases require stronger control.
  • Relying on aggregate metrics without segment analysis.
  • Treating GenAI outputs as reliable because they sound confident.
  • Forgetting that monitoring must have thresholds, owners, and escalation.

Fast Final Review Checklist

Before you move into topic drills or a mock exam, make sure you can answer these quickly:

  • Can you distinguish supervised, unsupervised, reinforcement, and generative AI?
  • Can you identify data leakage in a scenario?
  • Can you choose the right metric for imbalanced classification?
  • Can you explain overfitting and how to reduce it?
  • Can you separate development testing, validation, monitoring, and audit?
  • Can you identify fairness risks even when protected variables are removed?
  • Can you match explainability tools to the audience and decision type?
  • Can you name practical controls for hallucination and prompt injection?
  • Can you explain why vendor AI still requires internal oversight?
  • Can you identify when revalidation is needed?
  • Can you connect materiality to stronger governance?
  • Can you distinguish inherent risk, controls, and residual risk?

How to Use Practice Questions After This Review

Use this Quick Review as a bridge into original practice questions. For the GARP Risk and AI Certificate (RAI), efficient practice should include:

  • Topic drills for data risk, validation, fairness, explainability, GenAI, and governance.
  • Scenario questions that require selecting the best control, not just defining terms.
  • Mock exams to practice pacing and mixed-topic recognition.
  • Detailed explanations to understand why tempting answers are incomplete or misaligned.

Next step: work through independent companion practice questions by topic, review every explanation, and keep a short error log of missed concepts, especially around data leakage, model monitoring, fairness trade-offs, vendor risk, and generative AI controls.