Governing AI In The Post–Reg B World: Scorecards, Guardrails, And The Operational Architecture That Actually Holds Up
Why lenders still need strong AI governance, even after the CFPB’s Regulation B rollback
In a final rule effective July 21, 2026, the Consumer Financial Protection Bureau (CFPB) removed the disparate-impact (effects-test) provisions from Regulation B and took the position that the Equal Credit Opportunity Act (ECOA) does not recognize disparate-impact liability. That position leaves open important questions about the scope of lenders’ AI governance programs, particularly with respect to unintentional bias.
For residential mortgage AI specifically, the CFPB’s final Regulation B rule does not eliminate fair lending risk. Disparate-impact theories remain available under other statutes and frameworks, including the Fair Housing Act, government-sponsored enterprise (GSE) contractual requirements, and various state laws. State regulators are continuing to pursue disparate-impact-based theories and AI-related fair lending enforcement under their own laws, regardless of ECOA’s disparate-impact rollback. For example, Massachusetts reached a $2.5 million settlement with Earnest Operations LLC in 2025 resolving allegations that its AI-driven lending practices caused disparate harm to certain protected groups. New Jersey’s 2025 regulations under the Law Against Discrimination (LAD) expressly codify disparate-impact standards, including for mortgage lending and the use of AI and automated decision-making tools. California, New York, Colorado, and Illinois likewise maintain relatively expansive anti-discrimination and fair lending regimes that can reach algorithmic practices, including certain AI-driven lending activities.
Freddie Mac’s Guide Bulletin 2025-16, effective March 3, 2026, establishes AI/ML governance requirements for approved sellers/servicers, including expectations around performance monitoring, bias and fairness controls, and auditable documentation. Fannie Mae’s Lender Letter LL-2026-04, effective August 6, 2026, sets out an AI/ML governance framework and extends certain expectations to vendors and subcontractors through a “no less protective” standard. Disparate-treatment theories under ECOA, including claims based on the use of facially neutral criteria as proxies for protected characteristics with discriminatory intent, remain fully intact, and the CFPB’s adverse-action guidance concerning AI and complex models continues to apply. Because ECOA’s limitations framework permits certain agency and Department of Justice actions beyond the two-year private-action period, future administrations could still pursue ECOA-based AI enforcement for conduct occurring after the rule’s effective date, subject to applicable limits and defenses.
For lenders using AI in underwriting, pricing, marketing, or servicing, the takeaway is straightforward: the same core controls still apply, and they remain grounded in Fair Housing Act requirements, state law, and counterparty obligations such as GSE seller/servicer standards. Lenders that treat this rule as a reason to relax controls when rolling out AI models may later find themselves rebuilding their framework under less favorable conditions, with a record designed for a regulatory landscape that no longer offers the same level of protection.
The more practical question, and the focus here, is what the operational structure behind those controls looks like once AI becomes part of the process. In practice, the programs that regulators and counterparties tend to view as effective often share three characteristics: usable model scorecards, clear containment layers for agentic systems, and governance processes that connect the two to prevent operational and compliance gaps.
The Model Scorecard
A model scorecard is more than a dashboard showing live metrics or error rates. It is a governance record that follows the model throughout its lifecycle. It documents what the model is intended to do, the populations it was developed and tested on, the data it relies upon, and the limitations placed on its use.
The scorecard should also capture fairness testing, less discriminatory alternative analysis, explainability limitations, escalation thresholds, and where human review is required. Just as importantly, it should track how the model performs over time and identify when intervention is necessary.
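To make the idea of a scorecard as a governance record more concrete, the following is a minimal sketch of how one might be represented in code. Every class name, field name, and threshold here is a hypothetical illustration rather than a regulatory or GSE-prescribed schema, and the intervention rule (flag the model when its most recent fairness test falls below the escalation threshold, or when no test is on file) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FairnessTest:
    """One fairness test result recorded on the scorecard (hypothetical shape)."""
    run_on: date
    metric: str        # e.g. "adverse_impact_ratio"; the metric choice is an assumption
    value: float
    threshold: float   # escalation threshold set by the lender's own policy

    def breached(self) -> bool:
        # Assumes lower values are worse, as with an adverse impact ratio.
        return self.value < self.threshold


@dataclass
class ModelScorecard:
    """Governance record that travels with the model through its lifecycle."""
    model_id: str
    owner: str
    intended_use: str
    development_populations: list[str]
    data_sources: list[str]
    use_limitations: list[str]
    explainability_limits: list[str] = field(default_factory=list)
    human_review_required_for: list[str] = field(default_factory=list)
    lda_analyses: list[str] = field(default_factory=list)  # less discriminatory alternatives
    fairness_tests: list[FairnessTest] = field(default_factory=list)

    def needs_intervention(self) -> bool:
        # No test on file is treated as a gap that requires attention.
        if not self.fairness_tests:
            return True
        latest = max(self.fairness_tests, key=lambda t: t.run_on)
        return latest.breached()
```

Keeping the fairness history on the same record as the intended use and limitations means a reviewer can see, in one place, both what the model was approved to do and whether it is still performing within the limits that approval assumed.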
Containment Of Agentic AI
Agentic systems create risks different from traditional automation. Rather than performing a narrow task, these systems may retrieve information dynamically, operate across multiple workflow stages, and generate recommendations that extend beyond their intended role.
Problems can emerge when an agent accesses unauthorized data, recommends actions outside policy limits, or links together otherwise acceptable steps in ways that create uneven outcomes across borrower populations. The concern is not simply what the system outputs, but how it reaches those outcomes.
Managing that risk generally requires three controls working together: clear scope restrictions defining what the system can access and decide, human review triggers for defined exceptions, and a tested kill switch allowing the lender to disable the system when necessary. More mature governance programs regularly test all three controls rather than assuming they will work when needed.
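The three controls described above can be sketched as a single gate that every proposed agent action passes through. This is an illustrative outline only: the class, the action and data names, and the "block / human_review / allow" outcomes are hypothetical, and a production system would enforce these checks at the infrastructure level rather than relying on application code alone.

```python
from dataclasses import dataclass, field


class AgentDisabled(Exception):
    """Raised when the kill switch is engaged."""


@dataclass
class ContainmentPolicy:
    allowed_data: set[str]      # scope: data the agent may access
    allowed_actions: set[str]   # scope: actions the agent may propose
    review_triggers: set[str]   # exception flags that force human review
    kill_switch_engaged: bool = False
    audit_log: list[dict] = field(default_factory=list)

    def engage_kill_switch(self) -> None:
        self.kill_switch_engaged = True

    def evaluate(self, action: str, data_used: set[str], flags: set[str]) -> str:
        # Kill switch is checked first so nothing runs once it is engaged.
        if self.kill_switch_engaged:
            raise AgentDisabled("agent disabled by kill switch")

        in_scope = action in self.allowed_actions and data_used <= self.allowed_data
        if not in_scope:
            decision = "block"
        elif flags & self.review_triggers:
            decision = "human_review"
        else:
            decision = "allow"

        # Record every evaluation so the path to an outcome can be reviewed later.
        self.audit_log.append({
            "action": action,
            "data_used": sorted(data_used),
            "flags": sorted(flags),
            "decision": decision,
        })
        return decision
```

Because each control is an explicit branch, each can be exercised in a routine test, which is the practical meaning of testing all three controls rather than assuming they will work when needed.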
Connecting Containment With Model Scorecards
A model scorecard alone is not enough if the system is allowed to operate beyond its intended role. At the same time, controls and restrictions are difficult to enforce if no one has clearly documented what the system is supposed to do in the first place. Effective AI governance requires both.
Lenders that handle this well tend to follow a few common practices. They keep model scorecards updated and assign clear ownership over them, maintain records showing how important decisions were made so those decisions can be reviewed later if necessary, and run regular internal exercises across compliance, legal, fair lending, and model risk teams to test how the organization would respond if an issue arose.
Timing matters too. Testing a system before deployment is only the starting point. Annual fair lending testing may provide a reasonable baseline, but agentic systems often require more frequent review because their behavior can change over time. Generative AI tools can also create additional considerations, including tracking prompts, reviewing outputs, and preserving records.
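One way to keep review timing from slipping is to encode the cadence and check it automatically. The sketch below assumes a hypothetical schedule: the annual interval reflects the baseline mentioned above, while the 90-day intervals for agentic and generative systems are illustrative placeholders, not figures drawn from any rule or guidance. Each lender would set its own values.

```python
from datetime import date, timedelta

# Hypothetical review intervals in days; only the annual baseline comes from the text.
REVIEW_INTERVAL_DAYS = {
    "traditional_model": 365,
    "agentic_system": 90,
    "generative_tool": 90,
}


def review_overdue(system_type: str, last_review: date, today: date) -> bool:
    """Return True when more time has passed than the interval for this system type."""
    interval = timedelta(days=REVIEW_INTERVAL_DAYS[system_type])
    return today - last_review > interval
```

A check like this could run against the inventory of deployed systems on a schedule, with overdue items routed to the scorecard owner.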
The programs that tend to function most effectively in practice are the ones that identify issues early, document how those issues were handled, and maintain governance structures that remain understandable even as personnel, vendors, and technology change over time.