
Turning good data into good information makes good loans: Credibility, accuracy and depth are key

Brad Kelso
Feb 04, 2010

Never has it been more important for underwriters and brokers to assess and understand the reliability of the loan data they rely upon. A large portion of underwriting is essentially about turning provided data into decisionable loan information. “Bad” or, more likely, “fuzzy” data going into underwriting increases the risk of generating poor loan information and funding riskier loans.

But how can brokers help their own cause and streamline their deals through underwriting? The answer is that brokers must begin packaging their loans with full knowledge of the best practices for the underwriter at the other end. Failure to understand the downstream use of packaged data only slows down underwriting, which ultimately slows closings.

It’s easy to confuse “data” with “information.” So here’s a review and, perhaps more importantly, two detailed examples as you seek to improve the quality of your underwriting by providing solid information to meet the end needs of investors. Webster’s defines them plainly this way:

Data: Individual facts, statistics, or items
Information: Knowledge gained through study, communication, research; instruction

Understanding that good information requires good data is key to ensuring the quality of the mortgage lending decision. Underwriters essentially gather and compile all the data (individual facts, statistics or items) provided and available to them. Then, with analysis, they transform this data into usable information (knowledge gained) to provide a complete and accurate file about the borrower’s likelihood of repaying a mortgage.

The truly differentiated quality of a loan’s forecast risk, then, is found in the quality of the information that supports it, starting with the original data and how it is analyzed. The process of using the data as information can define the difference between a quality loan and one that carries a greater risk of default.

Data quality, or its integrity, can be qualified by evaluating three key elements:

1. Credibility of its source: From whom was the data gathered, what are their credentials, and in what form or manner was the data received?
2. Accuracy: Can the data be confirmed or verified? By what manner was it gathered? What quality control was employed at the roots of its collection?
3. Depth: What is included in or behind the data: score models, logic, as reported? Is this an interpreted fact or purely subjective?

By way of example, let’s consider these data attributes for two common underwriting issues: previous income documentation and identity verification.

Previous income documentation: Comparing 4506-T to manual tax returns

It is now common to order IRS income tax transcripts (using 4506-T forms) for the two prior years as an alternative to retrieving hard copies from the borrower. Here’s how those two products stack up from a data integrity perspective, comparing each across credibility, accuracy and depth.

4506 credibility: Excellent
The IRS is an absolute source (albeit secondary) for accessing tax return transcripts. The data is delivered directly and cannot be easily forged or amended in the process. It comes in an unalterable file over a secure connection.

Manual tax return credibility: Fair to low
It’s too easy for a copied tax return to be fraudulently altered, if not by the borrower, then by someone with a vested interest in the loan. In addition, a copy is always less credible than an original. The exception is a return prepared by a CPA, who can then serve as a verifiable source.

4506 accuracy: Excellent
The IRS has built data and logic checks that essentially act as quality control of an individual’s tax return. Adjustments to the math or to the data itself make the transcript more accurate than the original return. While the IRS’s accuracy is still limited to the quality of the consumer’s original return, by adding its own integrity checks (W-2 and 1099 income sources at a minimum), the IRS provides exceptional accuracy for an underwriter.

Manual tax return accuracy: Low
A manual return’s data accuracy is essentially “as stated,” with few secondary checks possible except against the copied W-2s or 1099s. Fairly obviously, returns prepared by the applicant carry far greater risk of error or outright fraud.

4506 depth: Excellent
Beyond having logic that checks the math and compares it with W-2 and 1099 data, the IRS also offers access to any adjustments to the return through a log file that shows dates and filing amendments. This adds value for underwriters, as they are assured of getting not just the best known copy of the return, but also the lineage of what was amended, when and why. Often, this lineage itself is information that should be scrutinized.

Manual return depth: Excellent
Ironically, a full copy set of tax returns offers good secondary data, as the schedules, signatures and secondary identifying data can be gleaned from the actual returns. None of these are easily accessed through IRS transcripts.

By this comparison, it’s fairly easy to see how 4506 income verifications have become the de facto tool for underwriters in a very short time. But let’s review a slightly trickier comparison requiring a much deeper understanding of the source data.

Identity verifications: Comparing Social Security Number traces to direct Social Security Administration verifications

With the advent of the Fair and Accurate Credit Transactions Act (FACTA) Red Flags regulation, identity confirmation and identity theft mitigation in underwriting have risen to a much higher standard. For that reason, underwriters are turning to identity trace reports or to verifications directly to the Social Security Administration (SSA) for escalation products.
But what exactly are the tradeoffs in these products’ credibility, accuracy and depth?

As a review, Social Security Number (SSN) identity trace reports are commonly offered by the bureaus and other aggregators. These reports list all reported SSN usage information and combine it with reported names, addresses, and number and frequency of use. They had been most commonly used for skip tracing efforts in collection.

The alternative escalation product for identity verification is a consent-based Social Security verification directly to the SSA. This begins with a required consent form (form SSA-89), signed by the applicant, that allows the lender, through an approved vendor, to submit data directly to the SSA in order to verify the name with the SSN and date of birth.

As you will see, these products carry very different data integrity levels while answering the same question: Is my applicant legitimately who he or she claims to be? Below is the comparison of the data integrity attributes of credibility, accuracy and depth.

Trace credibility: Fair
Why? One would think the bureaus are an excellent and very credible source of high-integrity data, and typically, this is the case. However, looking deeper, most trace data is simply an aggregation of what subscribing creditors report to the bureaus. In general, these creditors do operate and report credibly, but in practice the input of names, spellings and SSNs is poorly controlled, and this lowers the overall quality of the data.

SSA direct credibility: Exceptional
As the IRS is with income, the SSA is not only a credible source but, in this case, the definitive primary source for identifying an SSN with a name, issued date and address. The data is passed only in a secure environment. Better still, because the applicant must consent to this verification, fraudsters are often stopped at the gate when they realize that the SSA will be used to verify their identity.

Trace accuracy: Fair
As suggested, there is wide disparity in the level of data quality control among the credit subscribers who contribute to the bureaus. A trace report is actually a data dump of all the reported combinations of names, addresses and SSNs (misspellings and all) used to report credit. So yes, it is data, but treating it as high-quality data may mislead underwriting and waste effort on trying to reconcile conflicting information.

SSA direct accuracy: Exceptional
No database match is ever without the possibility of error, but by and large the results of such a request, “Match” or “No Match,” are very reliable. It’s hard to argue with a primary source that is as tightly controlled as the SSA.

Trace depth: Good/excellent
Here, the bureaus and aggregators provide good, and sometimes excellent, context and depth by showing time-stamped traces of the reported addresses. This presents the data in a more usable framework. For example, the underwriter can orally quiz the applicant about their address lineage, using the trace data as a way to validate its accuracy.

SSA direct depth: Low
Ironically, on this point the SSA falls short because its service only confirms or denies the validity of what the requestor has asked. If one part of the request does not match the records, the entire validation is returned as a “No Match.” So the cost of having a definitive source is that no additional data is returned. This is why combining both tools, trace and SSA direct, is a best practice.

Through just these two comparisons, you can see that understanding data integrity (credibility, accuracy and depth) can be key to obtaining information that will ensure the quality of the mortgage lending decision.

Brad Kelso is the vice president, director of marketing and product development at Informative Research, with a cumulative 22 years in financial services. Prior to joining Informative Research, Brad led Countrywide’s credit fraud initiatives and system development efforts and is credited as a national expert and speaker on “Authorized User Score Fraud.” He is the primary architect of two products related to identity fraud for the mortgage industry.