AI Vendor Audits: Why Lenders Need Them And What They Should Cover
New GSE rules are pushing mortgage lenders to conduct deeper audits of vendor AI systems beyond traditional SOC 2 reviews
The mortgage banking industry has now entered a procurement environment in which a vendor’s SOC 2 Type II report is necessary but no longer sufficient. Freddie Mac’s Bulletin 2025-16, amending Guide §§ 1302.2 and 1302.8, has been in effect since March 3, 2026. Fannie Mae’s Lender Letter LL-2026-04, issued April 8, 2026, takes effect August 6, 2026, 120 days after publication. Both reach a similar operational result by different architectural routes.
Each GSE requires seller/servicers to govern vendor AI use, but the formulations differ. Fannie Mae’s LL-2026-04 expressly requires seller/servicers to manage the risks and governance of subcontractor and vendor use of AI/ML under standards “no less protective” than the lender letter’s own requirements. Freddie Mac reaches vendor-embedded AI through its Guide § 1302.8 AI/ML governance obligations and its Guide § 1302.2 third-party risk controls; the operational result is similar, but the language is not identical and should not be quoted as such.
Both authorize disclosure on request. Fannie Mae’s LL-2026-04 requires seller/servicers to disclose, on Fannie Mae’s request, the types of AI/ML used, the purpose and manner of use, the safeguards implemented to mitigate risks, and “such other information as Fannie Mae may require.” Freddie Mac similarly requires prompt disclosure of AI/ML used in connection with Freddie Mac activity and its purpose and manner of use. That “such other information” clause in LL-2026-04 is the one that should be driving chief risk officers, general counsel, and chief technology officers back to their vendor files this quarter.
SOC 2 was designed against the AICPA Trust Services Criteria: security, availability, processing integrity, confidentiality, and privacy. It does not address model card completeness. It does not address training-data provenance. It does not address fair-lending testing methodology or results by protected class. It does not address whether the vendor’s output can support an ECOA-compliant adverse action notice under 12 C.F.R. § 1002.9. It does not address change-notification protocol when the vendor retrains the model. None of those questions is within the scope SOC 2 was built to answer. Yet every one of those questions is squarely within what a GSE may request, as well as within what a seller/servicer is now contractually obligated to be able to produce.
Regulators Are Moving Fast. Vendors Are Falling Behind.
On February 19, 2026, the U.S. Department of the Treasury released the Financial Services AI Risk Management Framework (“FS AI RMF”), developed by the Cyber Risk Institute in coordination with the FSSCC and FBIIC, with input from more than 100 financial institutions. The framework is voluntary. It introduces 230 control objectives across four functions adapted from the NIST AI Risk Management Framework: Govern, Map, Measure, and Manage. It is not a regulation. But in supervisory environments, voluntary frameworks often become the control taxonomy that exam teams expect management to understand. Depositories already using the Cyber Risk Institute Profile for FFIEC cybersecurity assessments will recognize the design philosophy. For seller/servicers and their vendors, the FS AI RMF is now the most credible sector-specific operational benchmark in the market.
OCC Bulletin 2023-17, the Interagency Guidance on Third-Party Relationships, remains the banking-sector reference point that mid-size seller/servicers’ counterparty risk teams track, even though that guidance is directed to banking organizations supervised by the federal banking agencies rather than to independent mortgage banks. Reading Bulletin 2023-17 alongside the GSE AI governance frameworks produces a coherent picture of where AI vendor diligence is heading across the institutional spectrum: documented governance, documented model lifecycle, documented fair-lending review, documented vendor change control, and documented audit access, all anchored to obligations the lender carries regardless of what the vendor’s standard contract concedes.
The vendor evidence stack has not caught up. AICPA has not substantively updated the SOC 2 criteria to address AI. The AIUC-1 program is new and is beginning to function as an AI assurance reporting structure. ISO/IEC 42001, the international AI management system standard, is being adopted by frontier AI developers and is now appearing as a discriminator in financial-services procurement, but mortgage-specific implementation guidance is still emerging. In the meantime, the gap between what lenders are now required to produce on request and what their vendors can produce on demand is widening, not narrowing.
What An AI-Specific Vendor Audit Actually Has To Cover
An AI-specific vendor audit calibrated to the obligations a mortgage lender now carries needs to assess, at a minimum, the items set out below. The evidence has to be documented, current, version-anchored to the specific deployed model, and testable. The list is organized by functional ownership so a general counsel, chief risk officer, chief technology officer, or chief vendor officer can assign the work without re-sorting it.
Governance And Inventory
Owner: Legal/Risk
- AI inventory and version control. Every AI or machine-learning model in production, including vendor-embedded models, the version deployed, the deployment date, upstream and downstream dependencies, and the customer configuration. Vendor disclosure practices on these elements vary. The diligence question is what the vendor will share in what form, and whether that disclosure pathway is sufficient to support the lender’s own inventory.
- Governance documentation. Senior-management-approved AI governance policy, named accountability, annual review evidence, escalation protocols, and documented segregation of duties between AI development and AI risk management. Freddie Mac requires this explicitly. Fannie Mae’s principles-based formulation reaches the same result through the “trustworthy and ethical AI/ML” standard and the designated-owner annual review requirement.
- Model card and data lineage. A model card identifying purpose, inputs, outputs, training-data source and vintage, feature list, feature importance at an appropriate level of generality, known limitations, validation results, and prohibited uses. Vendors and their lender customers face a real tension here between IP protection and disclosure adequacy, and the resolution is rarely all-or-nothing. The workable structure is tiered disclosure — fuller access to the lender’s counsel or risk function under NDA, summary materials for broader operational use, and third-party-attested summaries where direct access is not commercially feasible. The diligence question is not whether the vendor produces every internal artifact, but whether the disclosure structure the vendor offers is sufficient for the lender to support its own governance and examination obligations.
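As an illustration of how the inventory and model-card items above might be tracked, the following is a minimal sketch, not a prescribed schema. The class name, field names, and the sample vendor model are hypothetical; the required-field list simply mirrors the model-card elements named in this section, and a lender would adapt it to whatever disclosure tier the vendor actually offers.

```python
from dataclasses import dataclass, field
from datetime import date

# Model-card elements named in this section; the key names are illustrative.
REQUIRED_MODEL_CARD_FIELDS = (
    "purpose", "inputs", "outputs", "training_data_source",
    "training_data_vintage", "feature_list", "feature_importance",
    "known_limitations", "validation_results", "prohibited_uses",
)


@dataclass
class InventoryEntry:
    """One row in the lender's AI inventory (hypothetical schema)."""
    model_name: str
    version: str
    deployed_on: date
    vendor_embedded: bool
    upstream_dependencies: list[str] = field(default_factory=list)
    downstream_dependencies: list[str] = field(default_factory=list)
    customer_configuration: dict[str, str] = field(default_factory=dict)
    model_card: dict[str, object] = field(default_factory=dict)


def missing_model_card_fields(entry: InventoryEntry) -> list[str]:
    """Return required model-card fields that are absent or empty."""
    return [
        name for name in REQUIRED_MODEL_CARD_FIELDS
        if entry.model_card.get(name) in (None, "", [], {})
    ]


# Hypothetical vendor-embedded model with a partially completed model card.
entry = InventoryEntry(
    model_name="income-classifier",
    version="2.3.1",
    deployed_on=date(2026, 1, 15),
    vendor_embedded=True,
    model_card={"purpose": "document classification",
                "inputs": ["paystub image"]},
)
gaps = missing_model_card_fields(entry)
print(gaps)  # the model-card elements still to be requested from the vendor
```

The list of gaps can feed directly into an evidence request to the vendor, keyed to the specific deployed version.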
Validation, Fair Lending, And Explainability
Owner: Model Risk/Compliance
- Independent validation. Validation scope, methodology, tests performed, findings, severity ratings, limitations, approval status, and remediation tracking. Conceptual soundness, benchmark and challenger comparisons, calibration testing, and out-of-sample performance. This is the function depositories know best, extended to non-deterministic outputs, training-data bias, and feature drift.
- Fair-lending testing. Pre-production and ongoing testing methodology, validation population, protected-class proxy approach (Bayesian Improved Surname Geocoding, or BISG, or another documented methodology), approval-rate and pricing disparity metrics, false-positive-rate analysis disaggregated by protected class, statistical tests, mitigation decisions, residual-risk assessment, and an ongoing-monitoring schedule. Aggregated testing across the vendor’s lender base, segmented by appropriate proxies, is testing. An unsupported assertion is not.
- Adverse action explainability. Whether the vendor’s outputs can support specific principal reasons tied to factors actually considered, at the loan level, in a form stable enough for an ECOA-compliant notice under 12 C.F.R. § 1002.9. Generic feature-importance charts and post-hoc reason codes do not meet that standard. If the vendor cannot explain the mechanism by which model feature importance translates into borrower-readable reason codes, the lender cannot deploy the model in a credit-decisioning workflow without creating an open Regulation B exposure.
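The disparity metrics named in the fair-lending item above can be made concrete with a short sketch. This is an illustration under stated assumptions, not a testing methodology: the function names and sample data are hypothetical, the groups are hard labels (a real analysis would typically work from proxy-assigned probabilities and add statistical tests), and no pass/fail threshold is implied.

```python
from collections import defaultdict


def approval_rates(records):
    """Approval rate per group. Each record is (group, approved)."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in records:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {g: approved[g] / totals[g] for g in totals}


def rate_ratios(rates, reference_group):
    """Each group's rate divided by the reference group's rate."""
    ref = rates[reference_group]
    if ref == 0:
        raise ValueError("reference group rate is zero; ratio undefined")
    return {g: r / ref for g, r in rates.items()}


def false_positive_rates(records):
    """False-positive rate per group.

    Each record is (group, flagged, actually_positive).
    FPR = flagged true negatives / all true negatives in the group.
    """
    negatives = defaultdict(int)
    false_pos = defaultdict(int)
    for group, flagged, actual in records:
        if not actual:
            negatives[group] += 1
            false_pos[group] += int(flagged)
    return {g: false_pos[g] / negatives[g] for g in negatives}


# Made-up decisions: group A approved 8 of 10, group B approved 6 of 10.
decisions = ([("A", True)] * 8 + [("A", False)] * 2
             + [("B", True)] * 6 + [("B", False)] * 4)
rates = approval_rates(decisions)
ratios = rate_ratios(rates, reference_group="A")

# Made-up flags on applications that were in fact not positive.
flags = ([("A", True, False)] * 1 + [("A", False, False)] * 3
         + [("B", True, False)] * 2 + [("B", False, False)] * 2)
fpr = false_positive_rates(flags)
```

A vendor's evidence package would be expected to show metrics of this kind by version and by testing period, together with the methodology used to assign groups.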
Security, Logging, And Dependency Management
Owner: Security/Engineering
- AI-specific security testing. Coverage of the AI-specific threat vectors that SOC 2 was not designed to address and that supervisory and industry frameworks now expect management to address, including prompt injection, model inversion, training-data poisoning, adversarial inputs, hallucination, and unauthorized data exposure.
- Change-notification protocol. Written advance notice of every material model change (retraining, new features, threshold adjustments, foundation-model dependency changes), with release notes describing expected impact, validation completed, rollback approach, and effective date. A lender running an unnotified retrain is running a model to which its prior fair-lending analysis no longer applies.
- Logging, monitoring, and audit access. Inference-level logs (timestamp, model name, model version, output, reason codes, traceable application identifiers) retained for the relevant period, available in machine-readable format on request, and enforceable audit rights.
- Subprocessor and embedded-model disclosure. Identification of subprocessors, cloud services, and embedded foundation models in the production path, with controls over silent vendor changes in upstream dependencies.
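The logging and change-notification items above lend themselves to a simple cross-check: if inference logs carry the model version, a lender can detect versions running in production that it was never notified about. The sketch below assumes a hypothetical log schema built from the fields listed in this section; the class, function names, and sample values are illustrative only.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InferenceLog:
    """One inference-level log record (hypothetical schema)."""
    timestamp: str                 # ISO 8601, UTC
    model_name: str
    model_version: str
    application_id: str            # traceable application identifier
    output: str
    reason_codes: tuple[str, ...]


def to_json_line(log: InferenceLog) -> str:
    """Serialize one record as a machine-readable JSON line."""
    return json.dumps(asdict(log), sort_keys=True)


def unreviewed_versions(logs, reviewed_versions):
    """Versions seen in production logs that the lender has not reviewed."""
    seen = {log.model_version for log in logs}
    return sorted(seen - set(reviewed_versions))


# Made-up records: the second reflects a retrain the lender was not told about.
logs = [
    InferenceLog("2026-05-01T14:02:11Z", "income-classifier", "2.3.1",
                 "APP-0001", "approve", ()),
    InferenceLog("2026-05-09T09:40:57Z", "income-classifier", "2.4.0",
                 "APP-0002", "refer", ("R12", "R07")),
]
unreviewed = unreviewed_versions(logs, reviewed_versions={"2.3.1"})
print(unreviewed)
```

Any version returned by the check is a prompt to ask the vendor for the release notes and validation evidence that should have preceded the change.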
Contract Terms The Audit Should Inform
Owner: Legal/Procurement
- Use restrictions on borrower data. Whether the AI agent or system has access only to the data it needs to perform the specific task it was designed to perform. The question every institution should ask at the design stage, and re-ask at every model change, is straightforward: does the agent or system need every piece of information it can access in order to do that task? Excess data access expands the attack surface, complicates fair-lending and information-security testing, and creates avoidable regulatory exposure under privacy, GLBA safeguards, and GSE confidential-information frameworks.
- Indemnification and termination posture. Allocation of regulatory and litigation risk arising from the model’s outputs, and termination rights triggered by documented fair-lending or governance breach.
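The data-access question in the use-restrictions item above reduces to a comparison between what a system can read and what its task requires. The following is a minimal sketch of that comparison; the function name and the borrower field names are hypothetical, and in practice the "required" list would come from the documented design for the specific task.

```python
def excess_access(granted_fields, required_fields):
    """Fields the system can read but the task does not need."""
    return sorted(set(granted_fields) - set(required_fields))


# Made-up example: a document classifier granted more borrower data than it uses.
granted = ["paystub_image", "employer_name", "ssn", "credit_score"]
required = ["paystub_image", "employer_name"]
excess = excess_access(granted, required)
print(excess)  # fields to remove or justify in the contract's use restrictions
```

Re-running the comparison at each notified model change keeps the contract's use restrictions aligned with what the system actually touches.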
The Procurement Read
The CFPB’s existing regulations remain valid and enforceable. Regulation B and the § 1002.9 specificity standard for adverse action have not moved. Fair Housing Act disparate-impact theory, state fair-lending and consumer protection law, ECOA proxy and disparate-treatment analysis, UDAAP, and the GSE counterparty obligations under LL-2026-04 and §§ 1302.2 and 1302.8 all continue to govern. The audit a lender now needs from its vendors is calibrated to that surviving body of doctrine.
For lenders, the operational priorities are an AI inventory completed against vendor-embedded models as well as direct deployments, tiering by underwriting and borrower-impact exposure, evidence requests issued to priority vendors against a documented deadline, and remediation or replacement decisions made before the first GSE inquiry arrives. For vendors, the priority is to build a tiered, audit-ready evidence package mapped to the GSE governance frameworks — not as a defensive expense, but as a procurement asset that supports lender customers without compromising legitimate IP protection.
The lenders that build the inventory now, and the vendors that build the evidence package now, will be the ones positioned to respond to the first request on the counterparty’s timetable rather than the vendor’s. That is the difference between counterparty discipline and counterparty risk.