<style type='text/css'> /**figure { display: block; margin-left: auto; margin-right: auto; width: 75%; margin-top: 50px; margin-bottom: 50px; }**/ /*.image-caption{ text-align: center; font-size: 80%; color: #AAA; padding-bottom: 25px; } .code-caption{ font-size: 80%; color: #AAA; text-align: left; padding-bottom: 25px; } img{ margin-top: 25px; width: 75%; display: block; margin-left: auto; margin-right: auto; } h1{ padding-top: 40px; } h2{ padding-top: 40px; } h3{ padding-top: 25px; } h4{ padding-top: 15px; }*/ </style>

# Q

## User interface

## Login and KYC

Q is easy for borrowers to use: getting a mortgage requires no paper documentation and no human interaction. To create an account and start a loan application, borrowers must provide:

- A current mobile phone number
- A scan of their government-issued ID
- A video selfie

With this information alone, Q is able to verify the borrower's identity and perform KYC.

### Login security

Once the borrower has logged in, the app enables FaceID/TouchID for future logins. The borrower's phone number is confirmed by a one-time passcode (OTP) sent using the [Twilio Verify API](https://www.twilio.com/docs/verify/api).

Q does not use emails or passwords, which reduces the attack surface for account access: passwords can't be guessed or cracked, and malicious access to a borrower's email can't be used to reset their account.

If a phone is lost or wiped, the borrower can re-download the Q app and regain access to their account via a one-time passcode sent to their phone number. If their phone number changes, the borrower can regain access by repeating the first-time photo ID identity verification.

### Identity

Q checks the borrower's ID for built-in security features and signs of tampering. Police databases are also searched for reports of the document being lost or stolen, protecting against forged, counterfeit, stolen, and compromised IDs.
```json "breakdown": { "age_validation": { "breakdown": { "minimum_accepted_age": { "properties": {}, "result": "clear" } }, "result": "clear" }, "compromised_document": { "result": "clear" }, "data_comparison": { "breakdown": { "date_of_birth": { "properties": {}, "result": "clear" }, "date_of_expiry": { "properties": {}, "result": "clear" }, "document_numbers": { "properties": {}, "result": "clear" }, "document_type": { "properties": {}, "result": "clear" }, "first_name": { "properties": {}, "result": "clear" }, "gender": { "properties": {}, "result": "clear" }, "issuing_country": { "properties": {}, "result": "clear" }, "last_name": { "properties": {}, "result": "clear" } }, "result": "clear" }, "data_consistency": { "breakdown": { "date_of_birth": { "properties": {}, "result": "clear" }, "date_of_expiry": { "properties": {}, "result": "clear" }, "document_numbers": { "properties": {}, "result": "clear" }, "document_type": { "properties": {}, "result": "clear" }, "first_name": { "properties": {}, "result": "clear" }, "gender": { "properties": {}, "result": "clear" }, "issuing_country": { "properties": {}, "result": "clear" }, "last_name": { "properties": {}, "result": "clear" }, "multiple_data_sources_present": { "properties": {}, "result": "clear" }, "nationality": { "properties": {}, "result": "clear" } }, "result": "clear" }, "data_validation": { "breakdown": { "date_of_birth": { "properties": {}, "result": "clear" }, "document_expiration": { "properties": {}, "result": "clear" }, "document_numbers": { "properties": {}, "result": "clear" }, "expiry_date": { "properties": {}, "result": "clear" }, "gender": { "properties": {}, "result": "clear" }, "mrz": { "properties": {}, "result": "clear" } }, "result": "clear" }, "image_integrity": { "breakdown": { "colour_picture": { "properties": {}, "result": "clear" }, "conclusive_document_quality": { "properties": {}, "result": "clear" }, "image_quality": { "properties": {}, "result": "clear" }, "supported_document": { 
"properties": {}, "result": "clear" } }, "result": "clear" }, "police_record": { "result": "clear" }, "visual_authenticity": { "breakdown": { "digital_tampering": { "properties": {}, "result": "clear" }, "face_detection": { "properties": {}, "result": "clear" }, "fonts": { "properties": {}, "result": "clear" }, "original_document_present": { "properties": {}, "result": "clear" }, "other": { "properties": {}, "result": "clear" }, "picture_face_integrity": { "properties": {}, "result": "clear" }, "security_features": { "properties": {}, "result": "clear" }, "template": { "properties": {}, "result": "clear" } }, "result": "clear" } } ``` :::info ℹ️ Onfido report callback, notice 'results' for each check, including ID imagine integrity and cross referencing of police DBs ::: If the ID is clear, Q leverages facial recognition to compare the borrower's photo ID to the borrower's selfie video, ensuring that the ID belongs to the person using the mobile device. The process also checks the image data coming from the mobile device's image sensor for signs of additional manipulation, protecting against malicious users trying to manipulate their selfie by: * taking a selfie of a photo from the web * taking a selfie of a screenshot * using a photo of a photo or digital screen for their selfie * using digitally-modified selfies or IDs * or using a 2D mask / printout to augment their selfie ![](https://i.imgur.com/to71CBR.jpg) ```json { "created_at": "2019-12-11T09:39:05Z", "href": "/v3.1/reports/<REPORT_ID>", "id": "<REPORT_ID>", "name": "facial_similarity_photo", "properties": {}, "result": "clear", "status": "complete", "sub_result": null, "breakdown": { "face_comparison": { "result": "clear", "breakdown": { "face_match": { "result": "clear", "properties": { "score": 0.6512 } } } }, "image_integrity": { "result": "clear", "breakdown": { "face_detected": { "result": "clear", "properties": {} }, "source_integrity": { "result": "clear", "properties": {} } } }, 
"visual_authenticity": { "result": "clear", "breakdown": { "spoofing_detection": { "result": "clear", "properties": { "score": 0.9512 } } } } }, "check_id": "<CHECK_ID>", "documents": [] } ``` :::info Facial similarity result from Onfido ::: Once a borrower's identification is validated, Q extracts PIIs from their ID. Data retrieved from the ID includes name, address, and DOB ```json { "check_id": "<CHECK_ID>", "created_at": "2021-03-22T17:13:12Z", "documents": [ { "id": "<DOCUMENT_ID>" } ], "href": "/v3.1/reports/<REPORT_ID>", "id": "<REPORT_ID>", "name": "document", "properties": { "date_of_birth": "1990-01-01", "date_of_expiry": "2030-01-01", "document_numbers": [ { "type": "document_number", "value": "999999999" } ], "document_type": "drivers_license", "first_name": "Jane", "gender": "", "address": "2570 24TH STREET ANYTOWN, CA 95018", "issuing_country": "USA", "issuing_state": "CA", "last_name": "Doe", "nationality": "" }, "result": "clear", "status": "complete", "sub_result": "clear", ... ``` ### KYC Q cross references the extracted PII above across 400+ identity databases and watchlists, using Onfido's KYC checks and Socure's Identity API, confirming that the person is not a potential risk to transact with, and to satisfy all KYC / AML compliance requirements. 
Q specifically scans for:

* Sanctions - Government and International Organizations Sanctions Lists
* Politically Exposed Persons - Proprietary database of Politically Exposed Persons sourced from government lists, websites and other media sources
* Monitored Lists - Law-enforcement and Regulatory bodies Monitored Lists (including Terrorism, Money Laundering and Most Wanted lists)
* Adverse Media - Negative events reported by publicly and generally available media sources

Across the following databases:

```csvpreview {header=true}
Source,Definition
Credit Agencies,Data comprised of consumer credit applications
Voting Register,Data comprised of voter registration within a country
Telephone Database,Data provided by both landline and mobile providers.
Government,"Any standard publicly accessible data collected by government entities. These include driving licence data, motor vehicle registration, court filings, property ownership registers, permanent place of residence registration and other similar data sets"
Business Registration,"Data comprised of business registrations, corporate directors filings and business hierarchy data"
Consumer Database,Opt-in consumer data leveraging database marketing and other similar opt-in data sources
Utility Registration,"Data comprised of utility registrations such as electricity, gas, water accounts"
Postal Authorities,Data provided by postal authorities
Commercial Database,These are corporate/private databases where users have opted-in and allowed for their information to be used for the purpose of verification of their identities
Proprietary,"This is when a data provider chooses not to divulge the source of the data to us for varied reasons, and also includes social media based data"
```

### Location tracking

Real-time location data is collected from the GPS on the borrower's mobile device to check that the borrower is located in the home they are mortgaging.
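A minimal sketch of such a GPS check, assuming the property's coordinates are already known (for example from geocoding its address). The function names, the 100 m radius, and the sample coordinates are hypothetical illustrations, not part of Q's implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_at_property(device, home, radius_m=100.0):
    """True when the device's GPS fix lies within radius_m of the property.

    device and home are (lat, lon) tuples; radius_m is an assumed tolerance
    that would need tuning for GPS accuracy and lot size.
    """
    return haversine_m(*device, *home) <= radius_m


# Hypothetical coordinates: the same point, then a point several km away.
home = (37.7594, -122.4099)
print(is_at_property((37.7594, -122.4099), home))
print(is_at_property((37.8000, -122.4099), home))
```

A single GPS fix is easy to spoof or to mis-read indoors, so a check like this would only be one input among several.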
Q also checks spending transactions from the bank data collected later in the application process (detailed under Financial verification). Combined, these signals let Q estimate the probability that the current location is the borrower's primary residence (i.e. a property occupancy check).

![](https://i.imgur.com/iQAAIPR.png)

## Financial verification

### Financial links

Q supports digital links with financial institutions, using account logins. With the data from these links, a financial profile is built for each borrower.

We can use account logins to access a majority of financial institutions, including:

- Banks like Chase
- Credit unions like Navy Federal Credit Union
- 401K and IRA custodians like Fidelity
- Stock brokers like Robinhood
- Tax filings from [IRS.gov](http://irs.gov)
- Insurance companies like Allstate

Borrowers log in through native iOS/Android integrations in our mobile app, managed by [Plaid](http://plaid.com), [Teller.io](http://teller.io), [MX](https://www.mx.com/), and some custom-built integrations.

<img src="https://i.imgur.com/PT22JKz.png" class="image">

Multiple vendors are used to facilitate account logins in order to maximize the reliability of our connections without sacrificing coverage. The highest-reliability connections are provided by Teller, which lets customers log in to their accounts using internal native mobile APIs, the same APIs that the banks use for their mobile apps. However, Teller's coverage is relatively small. We fall back to Plaid for the majority of customers; Plaid maintains its connections using headless browsers, giving us access to the same data and connection quality as a customer using the bank's web interface. Finally, we rely on MX for the remainder of customers. MX has the broadest coverage, reaching over 95% of US financial institutions, but its reliance on OAuth for logins results in a more limited set of data available to us and a worse user experience for the customer.
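The fallback order described above can be sketched as a simple priority waterfall. This is illustrative only: the `supports` callables stand in for each vendor's institution-coverage lookup, the coverage sets are made up, and none of the names below are real Teller, Plaid, or MX API calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class LinkVendor:
    name: str
    supports: Callable[[str], bool]  # hypothetical coverage lookup


def pick_vendor(institution: str, vendors: Sequence[LinkVendor]) -> Optional[str]:
    """Return the first vendor, in priority order, that covers the institution."""
    for vendor in vendors:
        if vendor.supports(institution):
            return vendor.name
    return None  # no vendor covers this institution


# Made-up coverage sets, ordered from most reliable to broadest coverage.
WATERFALL = [
    LinkVendor("teller", lambda inst: inst in {"chase"}),
    LinkVendor("plaid", lambda inst: inst in {"chase", "navy_federal"}),
    LinkVendor("mx", lambda inst: inst in {"chase", "navy_federal", "small_credit_union"}),
]

for institution in ("chase", "navy_federal", "small_credit_union", "unknown_bank"):
    print(institution, "->", pick_vendor(institution, WATERFALL))
```

Keeping the priority in a single ordered list means that changing the preference between vendors is a one-line change.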
<img src="https://i.imgur.com/uhNRnxJ.png" class="image">
<p class="image-caption">Waterfall through account linking vendors to maximize reliability and coverage</p>

Once a borrower logs in, we receive an API callback and can immediately download financial data:

- Transaction history
- Account balances
- Investment positions and trading history
- Account ownership and type

Co-borrowers are added to a loan by having them log in to their own accounts separately. Legal names and ownership types (Individual, Joint, Trust, Business, etc.) allow us to understand which accounts are owned by whom.

Bank data is returned in JSON format, specific to each login vendor. For example, a history of transactions from a checking account via Plaid:

```json
"transactions": [
  {
    "account_id": "BxBXxLj1m4H",
    "amount": "10200.33",
    "category": ["Income", "Paycheck"],
    "category_id": "2000",
    "date": "2020-01-29",
    "authorized_date": "2020-01-28",
    "payer_name": "Microsoft Inc",
    "payment_channel": "ACH"
  },
  {
    "account_id": "BxBXxLj1m4H",
    "amount": "-2307.21",
    "category": ["Shops", "Computers and Electronics"],
    "category_id": "19013000",
    "date": "2020-01-29",
    "authorized_date": "2020-01-27",
    "location": "300 Post St, San Francisco, CA, 94108",
    "merchant_name": "Apple",
    "payment_channel": "in store"
  }
]
```
<p class="code-caption"> Transaction history data received from Plaid API in JSON format </p>

Transaction history is available for up to 24 months on most accounts, and includes enhanced geolocation, merchant, and category information. To process this, we're creating a machine learning model to find patterns in amounts, payers, and account numbers across all accounts. This allows us to classify patterns such as regular deposits of a paycheck from an employer, or unusually large transactions that could be indicative of a new loan or gift.

### Assets

Current account balances are added together to estimate the borrower's liquid assets.
These are needed to fund a downpayment in a purchase transaction, or to act as reserves in the event of income loss. Eligible assets for this calculation include cash, short-term investments, and liquid securities. We can retrieve current account balances and holdings across connected accounts including: - Checking, savings, CDs - Stocks, bonds, money market funds, mutual funds - 401K and retirement accounts (subject to liquidation penalty) For example, a Plaid holdings investment holdings report returns a list of accounts with balances, individual assets held and their recent prices, and recent transactions: ```json { "accounts": [ { "account_id": "5Bvpj4QknlhVWk7GygpwfVKdd133GoCxB814g", "balances": { "available": 43200, "current": 43200, "iso_currency_code": "USD", "limit": null, "unofficial_currency_code": null }, "mask": "4444", "name": "Plaid Money Market", "official_name": "Plaid Platinum Standard 1.85% Interest Money Market", "subtype": "money market", "type": "depository" }, { "account_id": "JqMLm4rJwpF6gMPJwBqdh9ZjjPvvpDcb7kDK1", "balances": { "available": null, "current": 320.76, "iso_currency_code": "USD", "limit": null, "unofficial_currency_code": null }, "mask": "5555", "name": "Plaid IRA", "official_name": null, "subtype": "ira", "type": "investment" }, ... ], "holdings": [ { "account_id": "k67E4xKvMlhmleEa4pg9hlwGGNnnEeixPolGm", "cost_basis": 1.5, "institution_price": 2.11, "institution_price_as_of": null, "institution_value": 2.11, "iso_currency_code": "USD", "quantity": 1, "security_id": "KDwjlXj1Rqt58dVvmzRguxJybmyQL8FgeWWAy", "unofficial_currency_code": null }, { "account_id": "k67E4xKvMlhmleEa4pg9hlwGGNnnEeixPolGm", "cost_basis": 10, "institution_price": 10.42, "institution_price_as_of": null, "institution_value": 20.84, "iso_currency_code": "USD", "quantity": 2, "security_id": "NDVQrXQoqzt5v3bAe8qRt4A7mK7wvZCLEBBJk", "unofficial_currency_code": null }, ... 
], "request_id": "l68wb8zpS0hqmsJ", "securities": [ { "close_price": 27, "close_price_as_of": null, "cusip": "577130834", "institution_id": null, "institution_security_id": null, "is_cash_equivalent": false, "isin": "US5771308344", "iso_currency_code": "USD", "name": "Matthews Pacific Tiger Fund Insti Class", "proxy_security_id": null, "security_id": "JDdP7XPMklt5vwPmDN45t3KAoWAPmjtpaW7DP", "sedol": null, "ticker_symbol": "MIPTX", "type": "mutual fund", "unofficial_currency_code": null } ... ``` Various adjustments are then made to account for factors which could cause the current account balances to overstate the borrower's personal liquid assets: - **Gifts and personal unsecured loans** require special handling when counting them towards liquid assets. They are detected by looking for statistically anomalous deposit amounts and accounting for the sender account name. - **Investments** are valued at a varying discount to fair market value to account for the tax impact on sale (for accounts that aren’t tax advantaged) and disallowed when they can’t be withdrawn at all (e.g. 401k withdrawal restrictions). This information can be inferred based on account type data returned from the provider. 
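The gift and personal-loan detection described in the first adjustment can be sketched with a simple rule: flag a deposit when its sender appears only once and its amount is far from the median deposit. This is a simplified stand-in for whatever statistical model is actually used; the function name, the 25% tolerance, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def flag_potential_gifts(deposits, tolerance=0.25):
    """Return deposits from one-off senders whose amount is far from the median.

    deposits: list of dicts with "amount" and "source" keys.
    tolerance: allowed relative deviation from the median (assumed value).
    """
    if not deposits:
        return []
    typical = median(d["amount"] for d in deposits)
    sender_counts = Counter(d["source"] for d in deposits)
    flagged = []
    for d in deposits:
        one_off_sender = sender_counts[d["source"]] == 1
        unusual_amount = abs(d["amount"] - typical) > tolerance * typical
        if one_off_sender and unusual_amount:
            flagged.append(d)
    return flagged


# Illustrative data: three regular paychecks and one large one-off transfer.
sample = [
    {"amount": 8452, "source": "Employer Inc"},
    {"amount": 8452, "source": "Employer Inc"},
    {"amount": 14000, "source": "B Spencer"},
    {"amount": 8452, "source": "Employer Inc"},
]
print(flag_potential_gifts(sample))
```

The median is used rather than the mean so that a single large deposit does not shift the baseline it is being compared against.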
For example, we can detect potential gifts as statistical outliers in a set of deposit transactions:

```yaml
deposit_history:
  - date: 01/05/2020
    amount: 8452
    source: Amazon Inc XXX0218
  - date: 02/06/2020
    amount: 8452
    source: Amazon Inc XXX0218
  - date: 02/18/2020
    amount: 14000
    source: B Spencer XXX5281
  - date: 03/06/2020
    amount: 8452
    source: Amazon Inc XXX0218
```
<p class="code-caption"> Selected fields from bank account deposit history allow us to classify potential gifts and personal loans </p>

When underwriting the loan, the lender can determine to what extent these adjustments contribute towards a total asset value:

```yaml
liquid_assets:
  checking_savings: 121028
  potential_gifts: 14000
  stocks: 59109
  estimated_liquidation_cost: 11802
  money_market: 2990
```

### Income

Borrower income is calculated from several independent sources, which makes the estimate more robust. IRS tax filings are the most reliable source for historical income, which is useful for borrowers that may appear riskier, such as business owners. However, tax filings only provide data up to the most recent filing date. Paystub data from payroll providers, cross-checked against actual deposits seen through bank logins, provides up-to-date confirmation that a borrower is still getting paid.

Payroll data comes from payroll providers (ADP, Gusto, etc.) and employers. It includes payroll amounts, frequency, and employment information (job title, history, location), equivalent to the information found on paystubs. Payroll data is gathered using vendors like [TheWorkNumber](https://theworknumber.com/) and [TrueWork](https://www.truework.com/).
```yaml Employee: Employee Name: Roseanne Smith SSN: 123456789 Employment: DATE: 01/01/2021 Employer Name: Enterprise USA Employer Code: 91001 Employer Address: 11432 Lackland Road, St Louis, MO 63146 Information Current as of: 12/28/2020 Total time with employer: 5 years Job title: DEMO MANAGER - OPERATIONS Rate of pay: $18.00 / hr Original hire date: 01/01/2016 Income: - Year: 2020 Base Pay: 30400.00 Overtime: 4750.00 Commission: 2850.00 Bonuses: 950.00 Other Income: 190.00 Total Pay: 39140.00 ``` Using multiple payroll vendors increases our coverage, similar to how we handle bank account logins. Out of 150+ million employed workers in the US, TheWorkNumber covers 114 million. TrueWork supports 35 million employees, although some of this coverage overlaps with TheWorkNumber. In addition, we plan to incorporate payroll logins directly in our app in the future, most likely powered by Plaid. Employees and employers that do not have instant verification coverage through these providers can still be verified, although it can require several days for a response. To handle these cases, we can rely on bank account deposits and IRS income history to underwrite, and then confirm with the employer prior to closing. Bank account deposit data is classified as salary, bonuses, and commissions. This is possible by pattern matching sending account names, deposit amounts, and frequency. This data is accessible through bank account logins. Historical tax filings from the IRS provide a breakdown of past income, along with other data on personal tax returns. This includes 3 years of Full Return Transcripts (includes Form 1040 with all schedules), and up to 10 years of Wage & Income Transcripts (W-2s, 1099s, and 1098s). These are downloaded from the [IRS.gov personal filings web portal](https://www.irs.gov/individuals/get-transcript). ```= ... 
INCOME WAGES, SALARIES, TIPS, ETC: $13,000.00 TAXABLE INTEREST INCOME: SCH B: $0.00 TAX-EXEMPT INTEREST: $0.00 ORDINARY DIVIDEND INCOME: SCH B: $0.00 QUALIFIED DIVIDENDS: $0.00 REFUNDS OF STATE/LOCAL TAXES: $0.00 ALIMONY RECEIVED: $0.00 BUSINESS INCOME OR LOSS (Schedule C): $2,500.00 CAPITAL GAIN OR LOSS: (Schedule D): $0.00 CAPITAL GAINS OR LOSS: SCH D PER COMPUTER: $0.00 OTHER GAINS OR LOSSES (Form 4797): $0.00 TOTAL IRA DISTRIBUTIONS: $0.00 TAXABLE IRA DISTRIBUTIONS: $0.00 TOTAL PENSIONS AND ANNUITIES: $0.00 TAXABLE PENSION/ANNUITY AMOUNT: $0.00 RENT/ROYALTY/PARTNERSHIP/ESTATE (Schedule E): $0.00 RENT/ROYALTY INCOME/LOSS PER COMPUTER: $0.00 ESTATE/TRUST INCOME/LOSS PER COMPUTER: $0.00 PARTNERSHIP/S-CORP INCOME/LOSS PER COMPUTER: $0.00 FARM INCOME OR LOSS (Schedule F): $0.00 UNEMPLOYMENT COMPENSATION: $0.00 TOTAL SOCIAL SECURITY BENEFITS: $0.00 TAXABLE SOCIAL SECURITY BENEFITS: $0.00 OTHER INCOME: $0.00 SCHEDULE EIC SE INCOME PER COMPUTER: $2,323.00 SCH EIC DISQUALIFIED INC COMPUTER: $0.00 TOTAL INCOME: $15,500.00 ... ``` To understand a borrower's self-employment and business income, we start by parsing their personal tax returns. This provides us with a breakdown of several types of income: - Self-employment W2s - Sole proprietorship income from Schedule C - S-corporation and partnership (LLC, LLP) income from Schedule E [Middesk](https://www.middesk.com/) API is used to gather information on the borrower's business. This allows us to confirm the entity's ownership structure, tax ID number, and business's industry in order to identify potentially high risk entities. ```yaml { "object": "business", "id": "32e7fb2c-60ca-4ddc-add2-694475b73f2b", "name": "Middesk Inc. 
", "created_at": "2019-01-21T06:43:23.313Z", "updated_at": "2019-01-21T06:44:07.184Z", "status": "approved", "addresses": [ { "object": "address", "address_line1": "2180 Bryant St Ste 210", "address_line2": null, "city": "San Francisco", "state": "CA", "postal_code": "94110-2141", "full_address": "2180 Bryant St Ste 210, San Francisco, CA 94110-2141", "latitude": 37.75938, "longitude": -122.40994, "created_at": "2019-01-30T23:49:01.100Z", "updated_at": "2019-01-30T23:49:01.100Z" } ], "people": [ { "object": "person", "name": "John Doe", "titles": [ { "object": "person_title", "title": "president" }, { "object": "person_titles", "title": "registered agent" } ], "sources": [ { "id": "5d1308ad-0d33-472d-8e61-ed223649d074", "type": "registration", "metadata": { "state": "CA", "status": "active", "file_number": "C4221590" } } ] } ], "tin": { "object": "tin", "updated_at": "2019-01-21T06:43:31.201Z", "name": "Middesk Inc. ", "tin": "37-1883180", "mismatch": false, "unknown": false, "verified": true }, "formation": { "object": "formation", "entity_type": "CORPORATION", "formation_date": "2018-03-05", "formation_state": "DE", "created_at": "2019-01-30T23:49:01.164Z", "updated_at": "2019-01-30T23:49:01.217Z" }, "website": { "object": "website", "domain": { "domain": "middesk.com", "creation_date": "2018-11-20T07:02:53.000Z", "expiration_date": "2019-11-20T07:02:53.000Z", "registrar" {} }, "pages": [ { "category": "home", "url": "https://www.middesk.com/" } ], "entities": [ { "text": "Middesk" } ], "parked": false }, "watchlist": { "object": "watchlist", "lists": [ { "title": "Denied Persons List (DPL) - Bureau of Industry and Security", "results": [ ], "source_last_updated": "2019-01-03T18:03:07.761+00:00" } ] }, "review": { "object": "review", "id": "0eb9094c-33fc-49eb-b5a5-18101d948857", "created_at": "2020-03-06 19:20:29 UTC", "updated_at": "2020-03-06 19:23:24 UTC", "completed_at": "2020-03-06 19:23:24 UTC", "tasks": [ { "category": "tin", "key": "tin", "label": "TIN 
Match", "message": "The IRS does not have a record for the submitted TIN and Business Name combination", "status": "failure", "sub_label": "Not Found" }, { "category": "name", "key": "name", "label": "Business Name", "message": "Match identified to the submitted Business Name", "status": "success", "sub_label": "Verified" }, { "category": "watchlist", "key": "watchlist", "label": "Watchlist", "message": "No Watchlist hits were identified", "status": "success", "sub_label": "No Hits" } ] }, "registrations": [ { "object": "registration", "id": "53a101f2-d671-4c0a-89b6-6c069e5a3857", "name": "MIDDESK, INC.", "status": "active", "jurisdiction": "FOREIGN", "entity_type": "CORPORATION", "file_number": "C4221590", "addresses": [ "2180 BRYANT ST UNIT 210 SAN FRANCISCO CA 94110" ], "registration_date": "2018-12-21", "state": "CA", "source": "https://businesssearch.sos.ca.gov/CBS" }, { "object": "registration", "id": "53a101f2-d671-4c0a-89b6-6c069e5a3864", "name": "MIDDESK, INC.", "status": "active", "jurisdiction": "DOMESTIC", "entity_type": "CORPORATION", "file_number": "6782397", "registration_date": "2018-03-05", "state": "DE", "source": "https://icis.corp.delaware.gov" } ], "subscription": { "object": "subscription", "id": "2b58c9a9-8279-4e85-bcf9-60dc2a0241cb", "created_at": "2020-09-01T21:05:52.100Z", "event_types": [ { "type": "bankruptcy.created", "status": "active" } ] } } ``` We also need to confirm the receipt of cash from the business to the borrower — profits on a tax return do not necessarily represent money that a borrower has immediate access to. To do so, we compare tax records with personal bank account deposit activity. Since tax filings only give us a picture of the business up to the filing date, we need to confirm that there haven't been any significant decreases in its earning power year-to-date. To do so, we can have the borrower login to their business bank accounts and we can review the cashflow activity. 
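One way to quantify a year-to-date drop from business bank transactions is to compare total credits so far this year with the same calendar window in the previous year. The sketch below is a hypothetical illustration: the `(date, amount)` tuple shape and the function name are assumptions, and a real check would also need to account for seasonality.

```python
from datetime import date


def ytd_change(transactions, as_of):
    """Relative change in total credits, year-to-date vs. the same window last year.

    transactions: list of (date, amount) pairs; positive amounts are credits.
    Returns None when there were no credits in the prior-year window.
    """
    def credits_between(start, end):
        return sum(amt for day, amt in transactions if start <= day <= end and amt > 0)

    try:
        prior_end = as_of.replace(year=as_of.year - 1)
    except ValueError:  # as_of is Feb 29 and the prior year is not a leap year
        prior_end = date(as_of.year - 1, 2, 28)

    current = credits_between(date(as_of.year, 1, 1), as_of)
    prior = credits_between(date(as_of.year - 1, 1, 1), prior_end)
    if prior == 0:
        return None
    return (current - prior) / prior


# Illustrative data: credits halve year-over-year for the Jan-Mar window.
txns = [
    (date(2020, 2, 1), 1000.0),
    (date(2020, 3, 1), 1000.0),
    (date(2020, 9, 1), 5000.0),   # outside the comparison window
    (date(2021, 2, 1), 600.0),
    (date(2021, 2, 15), -300.0),  # debit, ignored
    (date(2021, 3, 1), 400.0),
]
print(ytd_change(txns, date(2021, 3, 31)))
```

A strongly negative result would be a signal to review the business more closely, not an automatic decision.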
We are exploring additional third-party transaction enrichment providers (like [Heron](https://www.herondata.io/)) that specialize in SME/SMB bank data, giving us more context on each line item:

![](https://i.imgur.com/aimjrAB.png)
<p class="code-caption">Business bank transaction enrichment by Heron </p>

We combine all this data into a statistical model for each borrower, comparing YTD debit and credit activity to the previous year. This allows us to identify businesses which have experienced an unusual drop in income.

### Credit and liabilities

Credit reports provide the necessary data on current liabilities and past credit history. They also serve as additional checks on identity and employment.

To collect this data, we obtain credit reports from all three major reporting bureaus: Equifax, Experian, and TransUnion. They contain information on:

- **Identity:** Name, DOB, SSN, current and previous addresses
- **Credit history:** Outstanding and past debt, monthly balances, credit limits, credit utilization, creditor, account numbers, and delinquency history across credit cards, mortgages, student loans, installment loans, personal loans, auto loans, etc.
- **Credit scores:** Score, model version
- **Public records:** Bankruptcies, judgements, foreclosures, tax liens
- **Fraud alerts:** Address variations between records and inquiries, credit locks, identity theft reports
- **Employment:** Employer address, occupation, hire date
- **Consumer statements:** Consumer-provided statements related to bankruptcies, delinquencies, and credit disputes

Aggregating data across all bureaus helps catch events or credit lines that may have been reported to only a single agency.
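To illustrate why aggregation matters, the sketch below groups tradelines from several bureaus under a shared key and lists the ones that only one bureau reported. The input shape and function names are simplified assumptions for illustration; a production merge needs much fuzzier matching than an exact creditor and account comparison.

```python
from collections import defaultdict


def merge_tradelines(reports):
    """Group tradelines reported by different bureaus under one key.

    reports: dict mapping bureau name -> list of tradeline dicts with
    "creditor" and "account" keys (a simplified, assumed shape).
    Returns {(creditor, account): sorted list of bureaus that reported it}.
    """
    seen = defaultdict(set)
    for bureau, tradelines in reports.items():
        for line in tradelines:
            key = (line["creditor"].strip().upper(), line["account"])
            seen[key].add(bureau)
    return {key: sorted(bureaus) for key, bureaus in seen.items()}


def single_bureau_only(merged):
    """Tradelines that would be missed by pulling a report from any other bureau."""
    return {key: bureaus[0] for key, bureaus in merged.items() if len(bureaus) == 1}


# Hypothetical input: one card reported everywhere, one loan on Experian only.
reports = {
    "Equifax": [{"creditor": "Capital One Bank USA", "account": "ACCOUNT12345"}],
    "Experian": [
        {"creditor": "CAPITAL ONE BANK USA", "account": "ACCOUNT12345"},
        {"creditor": "Local Auto Finance", "account": "LOAN999"},
    ],
    "TransUnion": [{"creditor": "Capital One Bank USA ", "account": "ACCOUNT12345"}],
}
print(single_bureau_only(merge_tradelines(reports)))
```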
We use [Array.com](http://array.com) trimerge API to aggregate all reports into a single JSON: ```json "CREDIT_RESPONSE": { "@MISMOVersionID": "2.4", "@CreditResponseID": "CRRep0001", "@CreditReportIdentifier": "TESTCUSTOMER1234", "@CreditReportFirstIssuedDate": "2020-05-01T12:00:00.000Z", "@CreditReportMergeTypeIndicator": "ListAndStack", "@CreditRatingCodeType": "Equifax", "_DATA_INFORMATION": { "DATA_VERSION": [{ "@_Name": "Credmo", "@_Number": "1.3" }, { "@_Name": "Equifax", "@_Number": "5" }, { "@_Name": "Experian", "@_Number": "7" }, { "@_Name": "TransUnion", "@_Number": "4" }] }, "BORROWER": { "@BorrowerID": "Borrower01", "@_BirthDate": "1975-01-14", "@_FirstName": "JOHN", "@_MiddleName": "EDWARD", "@_LastName": "DOE", "@_SSN": "666661234", "@_UnparsedName": "JOHN EDWARD DOE ", "@_PrintPositionType": "Borrower", "_RESIDENCE": [{ "@_StreetAddress": "1234 PRIMARY ST", "@_City": "JACKSONVILLE", "@_State": "FL", "@_PostalCode": "32205", "@BorrowerResidencyType": "Current" }, { "@_StreetAddress": "1234 SECONDARY LN", "@_City": "JACKSONVILLE", "@_State": "FL", "@_PostalCode": "32206", "@BorrowerResidencyType": "Prior" }] } ... 
{ "@CreditLiabilityID": "TRADE017", "@BorrowerID": "Borrower01", "@CreditFileID": "EA01", "@CreditTradeReferenceID": "Secondary", "@_AccountIdentifier": "ACCOUNT12345", "@_AccountOpenedDate": "2016-12-23", "@_AccountOwnershipType": "Individual", "@_AccountReportedDate": "2020-05-21", "@_AccountStatusType": "Open", "@_AccountType": "Revolving", "@_ConsumerDisputeIndicator": "N", "@_DerogatoryDataIndicator": "N", "@_CreditLimitAmount": "1500", "@_HighBalanceAmount": "856", "@_HighCreditAmount": "856", "@_LastActivityDate": "2020-05", "@_MonthlyPaymentAmount": "25", "@_MonthsReviewedCount": "40", "@_TermsDescription": "M$25", "@_TermsSourceType": "Provided", "@_UnpaidBalanceAmount": "817", "@CreditBusinessType": "Banking", "@CreditLoanType": "CreditCard", "_CREDITOR": { "@_Name": "CAPITAL ONE BANK USA", "@_StreetAddress": "PO BOX 85015", "@_City": "RICHMOND", "@_State": "VA", "@_PostalCode": "23285-507", "CONTACT_DETAIL": { "CONTACT_POINT": { "@_Type": "Phone", "@_Value": "8009557070" } } }, "_CURRENT_RATING": { "@_Code": "1", "@_Type": "AsAgreed" }, "_PAYMENT_PATTERN": { "@_Data": "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCX", "@_StartDate": "2020-04-21" }, "CREDIT_COMMENT": { "@_SourceType": "Equifax", "@_Type": "Other", "@_TypeOtherDescripton": "TrendedData", "_Text": "<CREDIT_LIABILITY_PRIOR_INFORMATIONS> <CREDIT_LIABILITY_PRIOR_INFORMATION> <CREDIT_LIABILITY_PRIOR_DETAIL> <CreditLiabilityAccountReportedDate>2020-04/> <CreditLiabilityUnpaidBalanceAmount>337/> <CreditLiabilityCreditLimitAmount>1500/> <CreditLiabilityHighCreditAmount>466/> <CreditLiabilityMonthlyPaymentAmount>25/> <CreditLiabilityLastPaymentDate>2020-04/> <CreditLiabilityAccountType>18/> <CreditLiabilityAccountReportedDate>2020-03/> <CreditLiabilityUnpaidBalanceAmount>370/> <CreditLiabilityCreditLimitAmount>1500/> <CreditLiabilityHighCreditAmount>466/> <CreditLiabilityMonthlyPaymentAmount>25/> <CreditLiabilityLastPaymentDate>2020-03/> <CreditLiabilityAccountType>18/> ... 
```
<p class="code-caption"> Snippet of a sample trimerge report from Array.com in MISMO XML format received as JSON </p>

The credit report needs to be parsed to extract the attributes needed in mortgage underwriting. Basics like credit scores and past delinquency counts are readily available. We also use the credit report to calculate the monthly debt service as input into DTI (debt-to-income) ratios:

- Installment loans at full monthly payment
- Revolving and credit lines at minimum required payments
- Student loans at full monthly payment, even if the account is in deferment, forbearance, or an income-driven repayment plan

The credit report is also used to identify outstanding mortgage loans. To map a mortgage to a property, we can cross-reference the loan amount and start date with deed histories of properties owned by the borrower. This tells us the remaining balance on each mortgage, the address of the home, and the current monthly payment.

```yaml
credit_report:
  ...
  - tradeline:
      type: mortgage
      name: Chase Manhattan Home Mortgage
      account: 3331334557
      opened: 02/2018
      reported: 03/2021
      balance: 265130
      scheduled_payment: 2120
      collateral: FRD680474439
  ...
```
<p class="code-caption"> Post-processed mortgage tradeline showing remaining balance, next payment, etc. </p>

### Homeowner's insurance

Homeowner's insurance coverage is required for a mortgage. We can verify an existing policy or allow a borrower to get a new one.

Obtaining a new policy is possible by integrating [Lemonade](https://www.lemonade.com/) homeowner's coverage into our app, using their iOS/Android SDKs.

To verify an existing policy, we use account logins, similar to bank account logins but with insurance companies. Using the [Canopy](https://usecanopy.com/) API, we can allow logins to the largest homeowner's insurance providers:

![](https://i.imgur.com/K3P8k7O.png)

After a borrower logs in, we download and parse their policies.
This gives us active policies and their declaration pages, including: policy status, policy price, coverage amounts, and coverage types (property and special hazards like flood, fire, windstorm). ```json { "dwelling_id": "63f9a99f-ab9d-4faf-83a6-5ae128451627", "address": { "address_id": "a32b0e3d-a5f5-48d8-a572-6ae561815055", "full_address": "609 Market St, San Francisco, CA, 94105, United States", "number": "609", "prefix": "string", "street": "Market", "type": "St", "suffix": "string", "city": "San Francisco", "state": "CA", "sec_unit_type": "string", "sec_unit_num": "string", "zip": "94105", "country": "US" }, "coverages": [ { "dwelling_coverage_id": "48706bc1-c862-4bd1-bcb8-f5eaa5ea23ce", "name": "DWELLING", "friendly_name": "Dwelling", "type": "HO-3", "premium_cents": 52500, "per_person_limit_cents": 5000, "per_incident_limit_cents": 50000000, "deductible_cents": 100000, "is_declined": false } ] } ``` <p class="code-caption"> Homeowner's insurance coverage details returned by Canopy </p> ## Property appraisal Q creates a property profile for every home. This allows it to instantly generate a valuation for a borrower's home from a proprietary AVM. Simultaneously, Q checks with the GSEs to see if they would require an appraisal for the home, or whether the requirement is waived. For homes that require an appraisal, a virtual inspection is performed. This allows a licensed appraiser to complete the appraisal in just a few minutes without a home visit. ### Property profile For all homes within markets where Q is active, a property profile is generated in advance. 
The profile aggregates information on the home, across both public and private data sources: ```csvpreview {header=true} Data type,Description,Sources Land registry,"Legal description, parcel number, tax rates, assessed values, square footage, beds/baths, age of construction","Estated, PropLogix, Rets.ly/MLS, Zillow" Neighborhood and site,"Dimensions and boundaries, geography, zoning, planned developments, flood and disaster history, natural features, street view, satellite view","County websites, Google maps, Zillow, ArcGIS, CoreLogic" Title and deeds,"Sales, transfers, liens","Estated, CoreLogic" Title insurance,Title insurance underwriting and policies,States Title Listings history,"Listing prices and dates, descriptions, property attributes, photos, valuations","Rets.ly/MLS, Zillow, HouseCanary" Community associations (HOA),"Project status, financial condition, property characteristics, insurance coverage","CoreLogic, PropLogix" ``` ```json "owner": { "name": "C & C MCCMORMICK", "second_name": null, "unit_type": null, "unit_number": null, "formatted_street_address": "2163 N LINCOLN AVE", "city": "CHICAGO", "state": "IL", "zip_code": "60614", "zip_plus_four_code": "4510", "owner_occupied": "YES" }, "deeds": [ { "document_type": "WARRANTY DEED", "recording_date": "2012-06-18", "original_contract_date": "2012-05-31", "document_id": "1217042094", "sale_price": 325000, "sale_price_description": "FULL AMOUNT COMPUTED FROM TRANSFER TAX OR EXCISE TAX", "transfer_tax": 325, "distressed_sale": false, "real_estate_owned": "NO", "seller_first_name": "MICHAEL J", "seller_last_name": "MCCARVILLE", "buyer_first_name": "CORBETT", "buyer_last_name": "MCCORMICK", "buyer2_first_name": "CASEY", "buyer2_last_name": "MCCORMICK", "buyer_address": "137 CENTER ST", "buyer_city": "NAPERVILLE", "buyer_state": "IL", "buyer_zip_code": "60540", "buyer_zip_plus_four_code": "4612" }, { "document_type": "SPECIAL WARRANTY DEED", "recording_date": "2004-10-20", "original_contract_date": "2004-09-24", 
"document_id": "0429405331", "sale_price": 370000, "sale_price_description": "FULL AMOUNT COMPUTED FROM TRANSFER TAX OR EXCISE TAX", "transfer_tax": 370, "distressed_sale": false, "real_estate_owned": "NO", "seller_last_name": "OZ PARK TOWNHOMES & CONDOMINIUMS LLC", "buyer_first_name": "MARK J", "buyer_last_name": "MCCARVILLE", "buyer_address": "2163 N LINCOLN AVE", "buyer_city": "CHICAGO", "buyer_state": "IL", "buyer_zip_code": "60614", "buyer_zip_plus_four_code": "4510", "lender_name": "CHASE MANHATTAN MORTGAGE CORP", "lender_type": "MORTGAGE COMPANY", "loan_amount": 295900, "loan_type": "UNKNOWN", "loan_due_date": "2034-10-01", "loan_finance_type": "FIXED RATE", "loan_interest_rate": 4.75 } ... ``` <p class="code-caption"> Example of Owner and Deed history data from Estated </p> Statistical models synthesize this data to extract valuation inputs such as: - Each home's physical location (including site and neighborhood characteristics) - Outstanding mortgages and other lien - Current and previous owners - Whether it was recently listed for sale - History of selling prices for the home - Candidate comparable sales - Whether the home's HOA is in a healthy condition. Data is collected using both APIs and screen-scrapers — scripts that use headless browsers to interact with websites and extract data. Once the data is collected, it needs to be parsed. This often requires custom logic for each county, as specific fields may be published in some counties but not others. For some data sets, we use additional object detection machine learning. Specifically with satellite (ArcGIS), street view (Google Maps), and home photographs (MLS, Zillow) we need to parse the images to extract relevant information. 
![](https://i.imgur.com/QxlE96k.jpg) <p class="image-caption"> Example ArcGIS data showing property boundaries and swimming pools detected </p> #### Title search and commitment For each home added to Q, a title commitment letter is obtained via the [States Title](https://www.statestitle.com/) API. This is a third-party check to confirm the vesting deed and current owners of the property. At closing, this is converted to a lender's title insurance policy. #### HOA and condo association data For larger multifamily buildings (over 4 units) and housing communities with amenities, Q needs to know the state of the HOA or Condo Association before lending against a unit. Unfortunately, this information changes regularly and there's no centralized data set for it. The only way to capture it is by direct outreach to the associations themselves. Public records are used to identify the buildings that are large enough to require HOA analysis, and their associations are then contacted with a questionnaire ([Fannie Mae's Form 1076 Condo Project Questionnaire - Full Form](https://singlefamily.fanniemae.com/media/document/pdf/form-1076)). Responses are parsed and become usable as inputs to underwriting. [CoreLogic's CondoSafe](https://www.corelogic.com/products/condosafe.aspx) and PropLogix obtain the questionnaire responses on Q's behalf. ### Automated valuation A machine learning model generates automated valuations for each property. It is trained on all available historical data for markets Q is active in, with a target variable of the historical sales price of each home. To improve the accuracy of the model, additional data on economic variables and other factors that could potentially impact local housing markets is included. In addition, all available data is passed into unsupervised auto-encoders to extract more complex, hidden features, without overfitting. Finally, a separate model is created to help identify comparable sales by comparing property conditions at the neighborhood level.
Using MLS and Zillow listing photos and descriptions, ML object detection can classify the types of home features and appearances which correspond to varying local condition levels. On representative test data, these types of models achieve high correlation with condition as assessed by certified appraisers. ![](https://i.imgur.com/BDmiVbq.png) <p class="image-caption"> Statistically-estimated property condition ("Agent Assessed MLS Based") has high correspondence with Appraiser Assessed property conditions, see: https://collateralanalytics.com/wp-content/uploads/2016/05/CA-RESEARCH-How-Much-Does-Property-Condition-Affect-Residential-Property-Value.pdf </p> These are combined into a single AVM to generate valuations for all properties. ![](https://i.imgur.com/A8t1Npv.png) <p class="image-caption"> Example AVM schematic -- https://bit.ly/3gEdEYM </p> ### GSE appraisal waivers Requests for a GSE appraisal waiver are placed at the same time as the automated valuation. Receiving a GSE appraisal waiver allows Q to skip the appraisal step for those homes. The GSEs waive their appraisal requirement for a large percentage of loans. They allow this flexibility for homes that are easier to value (many comparables in the area) and loans that are less risky (lower LTVs). ![](https://i.imgur.com/2NOn0LW.png) <p class="image-caption">See: https://www.aei.org/research-products/report/prevalence-of-gse-appraisal-waivers-november-2020-originations/ </p> Loan applications are submitted into the GSE automated underwriting system (e.g. Desktop Underwriter for Fannie Mae) in order to check if they will receive an appraisal waiver. Once the borrower's finances have been collected and the target home is known, the loan application is submitted through the [Desktop Underwriter API](https://singlefamily.fanniemae.com/media/6276/display).
Within the typical response time of 3-5 seconds, Q is informed of whether the loan is eligible for GSE delivery and whether it will receive an appraisal waiver. ```json "Loan_file": "SAMPLE123", "Result": "Approved/Eligible", "Appraisal": "Waived" ... ``` ### Virtual appraisal For homes that do require an appraisal, Q's mobile app guides the borrower through a walkthrough of their home. The guidance ensures the borrower doesn't miss any part of their home, exterior or interior. The result is an inspection video that the appraiser can use to complete the appraisal. Once the walkthrough begins, the mobile device's cameras identify the user's surroundings and begin building a map of their home. Using [Apple ARKit](https://developer.apple.com/augmented-reality/arkit/) (and [ARCore](https://developers.google.com/ar) for Android devices) allows Q to estimate distances to objects in front of the user in real time. As the user walks around, a 3D mesh of their surroundings is constructed: ```swift arView.automaticallyConfigureSession = false let configuration = ARWorldTrackingConfiguration() configuration.sceneReconstruction = .meshWithClassification arView.session.run(configuration) ``` <p class="code-caption"> Enabling ARKit mesh </p> ![](https://i.imgur.com/szA6Wes.jpg) <p class="image-caption">3D mesh overlay</p> - Meshing is performed using photogrammetry - Assisted by the LiDAR sensors on newer mobile devices - From the mesh, we reconstruct a 3D model of the home - 3D model is used to guide the borrower towards areas that have not been captured yet ![](https://i.imgur.com/nCiRNwm.png) <p class="image-caption"> Example 3D model generated from mesh </p> ![](https://i.imgur.com/3VPeZN1.jpg) <p class="image-caption"> Borrower guidance instructions </p> During the walkthrough, video data is uploaded over an end-to-end encrypted channel to cloud storage.
Bandwidth issues are avoided by selectively sub-sampling the video for frames ("keyframes") with minimal motion blur and sufficient visual difference from previously collected frames. That way, only a few frames per second are uploaded and the system can work even with slower internet connections. This can be done using a variety of techniques, including on-device machine learning: ![](https://i.imgur.com/6So5PxX.jpg) <p class="image-caption"> Identifying key frames to reduce bandwidth usage </p> #### Appraisal assistant backup Some borrowers will be unable to perform their home walkthrough. This can happen for many reasons, including physical disabilities and unique homes that require more thorough inspection. To support these cases, Q needs to send an appraisal assistant to record the walkthrough video. In a traditional full appraisal, the appraiser may use assistance from employees and contractors as long as they maintain direct supervision. The GSEs allow assistance to come from unlicensed or uncertified personnel, subject to USPAP guidelines and state law. See: [B4-1.1-03, Appraiser Selection Criteria (01/31/2017)](https://selling-guide.fanniemae.com/Selling-Guide/Origination-thru-Closing/Subpart-B4-Underwriting-Property/Chapter-B4-1-Appraisal-Requirements/Section-B4-1-1-General-Appraisal-Requirements/1032991761/B4-1-1-03-Appraiser-Selection-Criteria-01-31-2017.htm). Q will operate a network of appraisal assistants under a gig work model, similar to Uber or DoorDash. Appraisal assistants, classified as independent workers, will get paid per home visit and subsequent data collection. In the event that there are difficulties with GSE acceptance of our appraisals, the appraisal assistant network will be used for a larger percentage of Q's appraisals (e.g. those that don't receive an appraisal waiver). #### Object detection As the collected video is uploaded, objects and features are identified that are relevant for appraising a home. 
Relevant detected features include: - Interior materials — floors, walls, trim - Exterior materials — roof surface, window types, foundation type - Views — window locations, directions, obstructions - Kitchen — countertops, appliances, cabinets - Special features — solar panels, accessory dwelling unit - Damage — water, insect, lack of maintenance ![](https://i.imgur.com/QVLYVvq.png) Machine learning is used for object detection. This speeds up the process for the appraiser and helps to avoid mistakes. As the video is recorded, it is streamed into an encrypted [AWS S3](https://aws.amazon.com/s3/) data store. [AWS Lambdas](https://aws.amazon.com/lambda/) monitor the buckets and trigger a series of [AWS SageMaker](https://aws.amazon.com/sagemaker/) image recognition models — each trained to specific types of features and objects — to classify them in parallel. ![](https://i.imgur.com/dUt9yFD.png) <p class="image-caption"> Example schema of a Mold Detector from "Deep Learning for Detecting Building Defects Using Convolutional Neural Networks" [https://arxiv.org/pdf/1908.04392.pdf] </p> The models are custom trained convolutional neural networks that can improve over time as more home video data is captured and new objects are labeled. To iteratively improve the model, we use human-in-the-loop labeling of ambiguous cases and new data to better classify objects. We use [Scale](https://scale.com/) to access a workforce of human labelers for generic labeling tasks (identifying a bed or window) and appraisers for specialized tasks (identifying damage or degree of wear and tear). ![](https://i.imgur.com/umhiAjF.png) <p class="image-caption"> Human-in-the-loop training for machine learning models — [https://bit.ly/3azS0RD] </p> By detecting these objects and other features of the home, the model can understand a home's condition and quality. A home that has significant damage or is uninhabitable is disallowed as collateral. 
Appraisers using the automated labeling will be able to perform their appraisals more quickly, with fewer errors, and with more accurate valuations. #### Independent certified appraisers For homes that require an appraisal, the appraiser steps in to complete it after the home's inspection video is captured. To complete a USPAP-compliant FIRREA Title XI real estate appraisal, the appraisal must be performed by a state certified appraiser. Appraisers must also satisfy Appraisal Independence Requirements (AIR) to ensure they are not biased by the borrower, the lender, or their service providers (e.g. Q). These requirements are satisfied by contracting state certified appraisers from an independent Appraisal Management Company (AMC) owned by Q. Appraisers in the AMC have a bonus-heavy compensation structure. This is more similar to a trading firm than to a traditional appraisal company, where appraisers are generally paid a fixed fee per completed appraisal. By tracking appraised homes into the future, we can wait until a subsequent purchase transaction and subtract out a time series trend factor to calculate the accuracy of the appraiser's original appraisal. Appraisers with the most accurate valuations over time will receive disproportionately higher bonuses. This will allow Q to attract talented personnel and generate the most accurate appraisals possible. #### Appraiser dashboard A web-based Appraiser Dashboard will guide appraisers through completing appraisals. The user interface is optimized to speed up the appraisal process without sacrificing quality.
Appraiser dashboard highlights: - Property profile info, especially on legal status of the land and building and historical sales - Comparable sales sorted by similarity with most recent photos and Google Maps data - Inspection video, with a list of detected objects/features sorted by uniqueness and expected impact on valuation Using the dashboard, the appraiser can set parameters which modify the AVM and generate an updated valuation. Once an appraiser inputs the final valuation and any notes, they complete the appraisal by cryptographically sealing it. Each appraiser is issued a hardware security key secured by biometric verification using their mobile device (FaceID). Sealing an appraisal with the security key appends the appraiser's license information — license number, issuing state, and dates — to the appraisal, and records a SHA256 hash to prevent any further modification of the file. ## Unverifiable applicants Some borrowers are not automatically supported by the Q app. Since underwriting relies on electronic account connections, not all borrower income and assets may be verifiable without additional documentation and manual review. For example, when a borrower relies on alimony or trust income. To support these borrowers, a fallback solution is needed. ![](https://i.imgur.com/e7gP6Zr.png) <p class="image-caption">Expected causes of unverifiable applicants</p> Loans that require manual underwriting are detected by a machine learning classifier — for example, by checking for large deposits that can't be traced to a verified source or identifying that the majority of a borrower's income can't be detected. Once manual underwriting is triggered, a process exports any data already collected from the borrower and sends them by email to a team at Flagstar. A loan officer then takes over the rest of the underwriting process with the borrower. 
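The two example signals mentioned above (untraced large deposits and mostly undetected income) can be illustrated with a small rule-based check. This is only a sketch and a stand-in for the machine learning classifier, not the classifier itself: the field names, the `manualReviewReasons` function, and both thresholds are assumptions made for illustration.

```typescript
// Hypothetical shapes; field names are illustrative, not Q's actual schema.
interface Deposit {
  amount: number;          // deposit amount in dollars
  sourceVerified: boolean; // true if traced to a verified source such as payroll
}

interface IncomeSummary {
  statedMonthlyIncome: number;   // income claimed on the application
  verifiedMonthlyIncome: number; // income detected through account connections
}

// Assumed thresholds, chosen only for illustration.
const LARGE_DEPOSIT_SHARE = 0.5; // "large" = more than 50% of stated monthly income
const MIN_VERIFIED_SHARE = 0.5;  // at least half of stated income must be detected

// Returns the reasons a loan would be routed to manual underwriting.
// An empty array means no trigger fired.
function manualReviewReasons(deposits: Deposit[], income: IncomeSummary): string[] {
  const reasons: string[] = [];
  const largeThreshold = income.statedMonthlyIncome * LARGE_DEPOSIT_SHARE;

  // Signal 1: a large deposit that cannot be traced to a verified source.
  if (deposits.some((d) => !d.sourceVerified && d.amount > largeThreshold)) {
    reasons.push("untraced_large_deposit");
  }

  // Signal 2: the majority of the borrower's income cannot be detected.
  if (income.verifiedMonthlyIncome < income.statedMonthlyIncome * MIN_VERIFIED_SHARE) {
    reasons.push("income_mostly_undetected");
  }
  return reasons;
}

const exampleReasons = manualReviewReasons(
  [{ amount: 9000, sourceVerified: false }],
  { statedMonthlyIncome: 10000, verifiedMonthlyIncome: 3000 },
);
console.log(exampleReasons); // both triggers fire for this applicant
```

A trained classifier would replace the fixed thresholds with learned ones, but returning explicit reason codes is still useful so the loan officer can see why a file was handed over.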
Data exported to the loan officer can include: - Borrower identity profile and KYC results - Property information including address and ownership - Trimerge credit report data - Assets and income, often partially verified ![](https://i.imgur.com/WPVqCqF.png) <p class="image-caption"> [https://bit.ly/2QiAj22] </p> Some types of income will be difficult for our technology to validate. These are mainly income streams that are not employment-related and are sourced from a legal settlement or a separate legal entity, for example **alimony** or distributions from **trusts**. While we can identify these deposits, we can't automatically verify the source or the expected length of the distributions. In addition, some complex or unstable small businesses will be too difficult for our system to analyze, particularly those where the borrower is relying on undistributed business assets or income to qualify for the mortgage. We expect these cases to represent about 6% of potential customers. ## Loan types Q supports a limited set of self-configurable products: - 30 Yr fixed - Adjustable rate (ARM) - Merge Refinance -- borrower's loan keeps the same term and loan balance Once a borrower chooses their loan type, Q instantly underwrites them and provides the lender's offered interest rate (or rejection reason). ## Mortgage smart contracts Q uses **mortgage smart contracts** to execute mortgage loans. By using code-based smart contracts, Q can process a loan application into a closed loan in just a few minutes. The smart contracts take in a digital loan application and a lender's configuration (including their underwriting and pricing rules). The contract's logic then executes the verifications and actions required to close and deliver an electronic mortgage. ![](https://i.imgur.com/3BxcZQr.png) <p class="image-caption"> [https://bit.ly/3gyUH9D] </p> Smart contract inputs and outputs are recorded to a cryptographically-secured database.
The database, qDB, is custom-built for recording mortgage origination data and ensures the origination process is not tampered with by any parties to the transaction. ### Composition Mortgage smart contracts are written in TypeScript and consist of **Rules** and **Actions**. The contracts themselves are written and maintained by Q, although lenders inject their own lending configurations. **Rules** are conditions that check inputs and outputs of the contract for risk and regulatory compliance. A few examples of rules: ```javascript Borrowers.forEach(borrower => kyc.verify(borrower)); ``` <p class="code-caption"> Verify that each borrower passes KYC checks </p> ```javascript calculateBusinessDaysDifference(new Date().getTime(), closingDisclosure.getIssuanceTime()) >= 7 ``` <p class="code-caption"> Verify that at least 7 business days have elapsed since the Closing Disclosure was issued </p> **Actions** are origination activities triggered by the contract, either performed internally or through an external service. A few examples of actions: ```javascript const result = underwriting_engine.underwrite(loanApplication, underwritingRules); ``` <p class="code-caption"> Underwrite a loan application against the provided "underwritingRules" underwriting criteria </p> ```javascript const result = mers_api.register(loan.getElectronicNote()); ``` <p class="code-caption"> Register an eNote with MERS </p> ## Lender configuration Q allows lenders to configure their own underwriting criteria within a smart contract. This gives lenders control over which borrowers are approved for a loan and what their interest rate is. Underwriting criteria can be codified in just a few lines of code, allowing lenders to get started with almost no integration. Underwriting rules are stored as YAML code rather than the PDFs and Excel files that are typically used for traditional mortgage products. This allows an instant underwrite of a borrower's profile without the support of human underwriters or processors.
```yaml= # Example rule that requires the property type to be a single family home (SFH) # or a 1-unit condo property_type: [sfh, 1-unit condo] ``` The underwriting rules YAML is provided as an input to a mortgage smart contract, which underwrites a borrower's loan application according to the rules. The file contains rules that determine a borrower's eligibility for a particular mortgage product, and the corresponding interest rate. A traditional mortgage product can be converted into this format or new ones can be made from scratch. There are two components to a rules file: - **Eligibility:** Loan eligibility rules - **Pricing:** Benchmark interest rates and loan-level pricing adjustments #### Eligibility configuration **Eligibility** criteria can be set on any attribute captured from the loan application — equivalent to [Uniform Residential Loan Application](https://singlefamily.fanniemae.com/delivering/uniform-mortgage-data-program/uniform-residential-loan-application) and the appraisal [URAR](https://singlefamily.fanniemae.com/media/12371/display) — with information on the borrower's identity, finances, and home. This includes attributes like a borrower's income, DTI, LTV, property type, location, occupancy, etc. ```yaml # eligibility_example.yaml # A rule that limits LTV to 75% and allows for rate/term refinances on single-family homes and condos eligibility: - property_type: [sfh, condo] - loan_type: [rate_term_refinance] - ltv: { max: 75 } ``` Lenders can add overlays on top of an existing set of rules by specifying a *template*. A template refers to another config file by name, and automatically imports all of its eligibility criteria. Q maintains templates for Fannie Mae and Freddie Mac [underwriting eligibility guidelines](https://singlefamily.fanniemae.com/originating-underwriting). 
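To make the eligibility and template mechanics concrete, here is a minimal sketch of how parsed rules might be expanded and checked against an application. It assumes the YAML has already been parsed into plain objects; the `Criterion` type, `expandTemplates`, `failedCriteria`, and the overlay with a credit score minimum are hypothetical names and data for illustration, not Q's actual rule engine.

```typescript
// Hypothetical in-memory form of parsed eligibility rules (names are illustrative).
type Criterion =
  | { kind: "oneOf"; field: string; allowed: string[] }        // e.g. property_type: [sfh, condo]
  | { kind: "range"; field: string; min?: number; max?: number } // e.g. ltv: { max: 75 }
  | { kind: "template"; name: string };                          // e.g. template: eligibility_example

type Application = Record<string, string | number>;

// Replace each template reference with the criteria it imports, recursively.
function expandTemplates(
  criteria: Criterion[],
  templates: Record<string, Criterion[]>,
  seen: string[] = [],
): Criterion[] {
  const expanded: Criterion[] = [];
  for (const c of criteria) {
    if (c.kind !== "template") {
      expanded.push(c);
      continue;
    }
    if (seen.indexOf(c.name) !== -1) throw new Error(`circular template: ${c.name}`);
    const imported = templates[c.name];
    if (!imported) throw new Error(`unknown template: ${c.name}`);
    for (const inner of expandTemplates(imported, templates, seen.concat(c.name))) {
      expanded.push(inner);
    }
  }
  return expanded;
}

// Return the fields that fail; an empty array means the application is eligible.
function failedCriteria(app: Application, criteria: Criterion[]): string[] {
  const failed: string[] = [];
  for (const c of criteria) {
    if (c.kind === "template") continue; // templates are expected to be expanded first
    const value = app[c.field];
    if (c.kind === "oneOf") {
      if (typeof value !== "string" || c.allowed.indexOf(value) === -1) failed.push(c.field);
    } else {
      const ok =
        typeof value === "number" &&
        (c.min === undefined || value >= c.min) &&
        (c.max === undefined || value <= c.max);
      if (!ok) failed.push(c.field);
    }
  }
  return failed;
}

// Mirrors eligibility_example, plus a hypothetical overlay with a credit score minimum.
const templates: Record<string, Criterion[]> = {
  eligibility_example: [
    { kind: "oneOf", field: "property_type", allowed: ["sfh", "condo"] },
    { kind: "oneOf", field: "loan_type", allowed: ["rate_term_refinance"] },
    { kind: "range", field: "ltv", max: 75 },
  ],
};
const overlay: Criterion[] = [
  { kind: "template", name: "eligibility_example" },
  { kind: "range", field: "credit_score", min: 600 },
];

const expandedRules = expandTemplates(overlay, templates);
const application: Application = {
  property_type: "condo",
  loan_type: "rate_term_refinance",
  ltv: 78,
  credit_score: 640,
};
console.log(failedCriteria(application, expandedRules)); // only "ltv" fails: 78 exceeds the max of 75
```

Returning the failing fields rather than a bare boolean is one way to produce the rejection reason that is shown to the borrower.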
```yaml # The previous eligibility_example YAML is injected as a template # An additional constraint is added: credit score minimum of 600 eligibility: - template: eligibility_example - credit_score: { min: 600 } ``` The Fannie Mae and Freddie Mac templates are kept up to date with each agency's underwriting guidelines. Here is a more complete example of eligibility criteria, built on top of a Fannie Mae 2021 template: ```yaml eligibility: - template: fannie_mae_2021 - property_type: [sfh, condo, pud, modular] - loan_type: [purchase, rate_term_refinance] - locations: [ alameda, contra costa, los angeles, marin, napa, orange, san benito, san francisco, san mateo, santa clara, santa cruz, ] - - occupancy - - occupancy: [primary_home] - cltv: { max: 80 } - credit_score: { min: 680 } - - occupancy: [second_home] - cltv: { max: 75 } - credit_score: { min: 680 } - term: { max: 30, min: 15, type: years } - loan_size: { max: 679650, min: conforming_limit } - eligible_borrowers: [citizens, permanent_resident_aliens, occupant_coborrowers] - acreage: { max: 10 } - properties_owned: { max: 4 } - dti: { max: 43 } - - interested_party_contributions - - ltv: { max: 80 } - interested_party_contributions: { max: 9% } - - ltv: { max: 75 } - interested_party_contributions: { max: 6% } - appraisal_condition: { max: C1, min: C4 } - - hazard_insurance - - hazard_insurance_deductible: { max: 5% } - - accessory_units - - accessory_units: { max: 1 } ``` In addition to eligibility rules that the lender defines, Q imposes additional constraints. These constraints help ensure a good experience for eligible borrowers and protect Q's brand. The primary constraint applied is a maximum LTV ratio of 80%, to minimize default risk. #### Pricing configuration **Pricing** is set by lenders and is customized to each loan's level of risk.
Loans are priced in terms of interest rate, and there are two components to the rate: **benchmarks** and **adjustments**. Benchmarks and adjustments are automatically calculated and added together for each loan to arrive at the interest rate offered to the borrower. ==*Mortgages on Q have no fees or cash to close for the borrower.*== This also means there are no *points* (credits/debits to the borrower in exchange for interest rate adjustments). As a result, the lender only needs to provide par pricing interest rates for each loan term. The **benchmark** interest rate is the par rate for each loan term and lock window. Benchmark interest rates are the raw rates used when calculating a borrower's rate at each loan term, before any loan-level adjustments for margins and risk (see: Adjustments section below). Benchmark rates can be updated manually throughout the day or injected via API links with pricing services such as Bloomberg, or via a webhook to a lender's custom pricing system. Since benchmarks can change throughout the day, lenders may prefer to have them streamed in using API links to their pricing software. For example, to integrate Q with [OptimalBlue's](https://www2.optimalblue.com/) pricing API, the lender simply needs to provide an OAuth API key to Q. Then, in their benchmark YAML they can reference benchmarks directly: ```yaml - rate: { source: optimal_blue, productId: optimalBlueProductId } ``` Q will automatically request best execution pricing for the corresponding OptimalBlue product (based on `optimalBlueProductId`) and extract the par rate for each lock period: ```json # Sample API callback from OptimalBlue best ex pricing API ... 
"quotes": [ { "rate": 0.0, "lockPeriod": 0, "lockExpirationDate": "string", "apr": 0.0, "price": 0.0, "armMargin": 0.0, "closingCost": 0.0, "discountDollar": 0.0, "discountPercent": 0.0, "rebateDollar": 0.0, "rebatePercent": 0.0, "loCompensationDollar": 0.0, "loCompensationPercent": 0.0, "principalAndInterest": 0.0, "totalCredit": 0.0, "monthlyMi": 0.0, "totalPayment": 0.0 } ], "parQuotes": [ { "parRate": 0.0, "parPrice": 0.0, "parLock": 0 } ], ... ``` Traditional pricing grids require setting a loan price for each loan's coupon and lock window. However, mortgages on Q do not have their rates limited to 1/8% increments and borrowers are not given the option to buy up/down points. As a result, only a single best execution rate is needed from the lender for each loan term and lock window. ![](https://i.imgur.com/OBWJqD2.png) <p class="image-caption"> Traditional pricing grid </p> Benchmark rates can also be provided in raw YAML format, in lieu of API data: ```yaml eligibility: ... pricing: benchmarks: - - term: { max: 30, type: years } - rate_15d: 3.01% - rate_30d: 3.05% - rate_45d: 3.10% - - term: { max: 15, type: years } - rate_15d: 2.10% - rate_30d: 2.13% - rate_45d: 2.21% ``` Interest rates are interpolated between the tenors provided. For example, a 15 day lock for `28.23` year merge term refinance using the benchmark rates above is calculated as `(28.23 − 15)/(30 − 15) × (3.01 − 2.1) + 2.1 = 2.90%` **Adjustments** are added on top of the benchmark rate to account for profit margins and loan-level risks. Adjustments are configured in the YAML in a section below the benchmarks. Since adjustments rarely change, they are maintained manually by the lender's pricing team. Similar to traditional loan-level pricing adjustments (LLPAs), adjustments are cumulative for each loan. Adjustments can be set for specific loan-level conditions: ```yaml # Add 0.25% to all loans for California homes eligibility: ... pricing: benchmarks: ... 
adjustments: - - property_location: {state: CA} - rate: +0.25% ``` Conditions can be nested. Also, an unconditional adjustment can be provided to add a profit margin to all loans: ```yaml # Add 0.5% to all loans, and 0.25% to single-family homes in California eligibility: ... pricing: benchmarks: ... adjustments: - - rate: +0.5% - - property_location: {state: CA} - - property_type: [sfh] - rate: +0.25% ``` Using conditional statements, lenders can easily recreate the LLPA grids they are familiar with, as well as more sophisticated adjustments to target specific risks and opportunities. ![](https://i.imgur.com/yA8YUQ4.png) <p class="image-caption"> Traditional LLPAs (loan-level pricing adjustments) from Fannie Mae </p> ```yaml # Example of using conditional statements to begin to replicate an LLPA grid eligibility: ... pricing: benchmarks: ... adjustments: - - rate: +0.5% - - ltv: { max: 80, min: 75.01 } - - fico: { min: 760 } - rate: +0.05% - - fico: { max: 759, min: 740 } - rate: +0.08% - - fico: { max: 739, min: 720 } - rate: +0.1% - - fico: { max: 719, min: 700 } - rate: +0.13% - - fico: { max: 699, min: 680 } - rate: +0.3% ``` A graphical user interface will be provided to lenders to make it easier for them to visualize and modify their adjustments. ## Closing Mortgages originated on Q are closed within the mobile app, in an electronic closing. This allows closing to be completed within minutes. To make this experience possible, the smart contract controls each step of the process. Once the smart contract completes underwriting, the mortgage is ready to close. Closing requires signatures on closing docs and transferring funds. After closing, an electronic mortgage note is created and delivered to the lender upon completion of the transaction. ### Borrower acceptance Borrowers receive their mortgage offer immediately upon underwriting by the smart contract. If they accept the terms, they proceed to loan closing.
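The rate in that offer follows from the pricing configuration: a benchmark interpolated between the provided tenors, plus every adjustment whose conditions match. The sketch below illustrates that arithmetic under stated assumptions: the `Benchmark`, `Adjustment`, and `Loan` shapes and the function names are hypothetical, adjustments are expressed as predicates rather than parsed YAML, and the loan term is assumed to lie between the two tenors.

```typescript
// Hypothetical shapes for illustration; rates are in percent.
interface Benchmark { termYears: number; rate: number }
interface Loan { termYears: number; state: string; propertyType: string }
interface Adjustment {
  rate: number;                     // percentage points added when the condition holds
  applies: (loan: Loan) => boolean; // always true for an unconditional margin
}

// Linear interpolation between two benchmark tenors for the same lock window.
// Assumes short.termYears <= termYears <= long.termYears.
function interpolateRate(termYears: number, short: Benchmark, long: Benchmark): number {
  const weight = (termYears - short.termYears) / (long.termYears - short.termYears);
  return short.rate + weight * (long.rate - short.rate);
}

// Offered rate = interpolated benchmark + sum of all matching adjustments.
function offeredRate(loan: Loan, short: Benchmark, long: Benchmark, adjustments: Adjustment[]): number {
  let rate = interpolateRate(loan.termYears, short, long);
  for (const adjustment of adjustments) {
    if (adjustment.applies(loan)) rate += adjustment.rate; // adjustments are cumulative
  }
  return rate;
}

// 15-day lock benchmarks from the raw YAML example: 2.10% at 15 years, 3.01% at 30 years.
const fifteenYear: Benchmark = { termYears: 15, rate: 2.1 };
const thirtyYear: Benchmark = { termYears: 30, rate: 3.01 };

// Mirrors the nested example: +0.5% on all loans, +0.25% on California single-family homes.
const adjustments: Adjustment[] = [
  { rate: 0.5, applies: () => true },
  { rate: 0.25, applies: (loan) => loan.state === "CA" && loan.propertyType === "sfh" },
];

const loan: Loan = { termYears: 28.23, state: "CA", propertyType: "sfh" };
console.log(interpolateRate(loan.termYears, fifteenYear, thirtyYear).toFixed(2)); // "2.90"
console.log(offeredRate(loan, fifteenYear, thirtyYear, adjustments).toFixed(2)); // "3.65"
```

The first printed value reproduces the 2.90% interpolation example from the benchmark section; the second adds the 0.5% margin and the 0.25% California single-family adjustment.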
### Closing docs Closing documents form the legal basis of the mortgage and associated loan note. These docs are generated while a borrower is configuring their loan. In addition to the mortgage note, the docs include consumer disclosures mandated by the CFPB (RESPA/TILA Regulation X and Z). The disclosures present the terms of the loan in a standardized way, making it easier for borrowers to compare offers between lenders. Disclosures contain information like: interest rate, amount, closing date, property address, closing costs, settlement services, monthly loan payment, and estimated property tax payment. An example of a CFPB mortgage *Closing Disclosure*: ![](https://i.imgur.com/tkcJHzG.png) ![](https://i.imgur.com/5wkPCBK.png) #### Rules engine A rules engine generates the closing disclosure and other closing documents, including: the promissory note, mortgage deed, subordination agreement, and right to rescind (for refinance). Additional documents are required for some counties or communities (e.g. HOAs), as well as state-specific language for notes or deeds. The rules engine is built as a set of functions which take in the borrower and property profile as input, and output the closing docs from pre-built templates. As new condo projects, counties, and states are added, the rules and templates will be expanded out to support them. 
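As a rough illustration of a rules engine built from functions over the borrower and property profiles, the sketch below selects which documents apply and fills pre-built templates. All names (`DocRule`, `renderTemplate`, `generateClosingDocs`), the profile fields, and the template wording are hypothetical; real closing documents and state-specific language would be far more involved.

```typescript
// Hypothetical profile and document shapes; field names are illustrative only.
interface BorrowerProfile { name: string; isRefinance: boolean }
interface PropertyProfile { address: string; state: string; hasHoa: boolean }
interface ClosingDoc { title: string; body: string }

// A rule pairs a pre-built template with the condition under which it is required.
interface DocRule {
  title: string;
  template: string; // placeholders use the assumed form {{key}}
  appliesTo: (borrower: BorrowerProfile, property: PropertyProfile) => boolean;
}

// Fill {{key}} placeholders, failing loudly if a value is missing.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) throw new Error(`missing template value: ${key}`);
    return value;
  });
}

// Placeholder templates; the wording is illustrative, not legal language.
const docRules: DocRule[] = [
  {
    title: "Promissory Note",
    template: "{{name}} promises to repay the loan secured by {{address}}, {{state}}.",
    appliesTo: () => true,
  },
  {
    title: "Right to Rescind",
    template: "{{name}} may cancel this refinance within the rescission period.",
    appliesTo: (borrower) => borrower.isRefinance,
  },
  {
    title: "HOA Addendum",
    template: "The property at {{address}} is subject to community association requirements.",
    appliesTo: (_borrower, property) => property.hasHoa,
  },
];

// Select the applicable rules and render each one from its template.
function generateClosingDocs(
  borrower: BorrowerProfile,
  property: PropertyProfile,
  rules: DocRule[] = docRules,
): ClosingDoc[] {
  const values: Record<string, string> = {
    name: borrower.name,
    address: property.address,
    state: property.state,
  };
  return rules
    .filter((rule) => rule.appliesTo(borrower, property))
    .map((rule) => ({ title: rule.title, body: renderTemplate(rule.template, values) }));
}

const docs = generateClosingDocs(
  { name: "Jane Doe", isRefinance: true },
  { address: "609 Market St", state: "CA", hasHoa: false },
);
console.log(docs.map((doc) => doc.title)); // note and rescission notice, no HOA addendum
```

Supporting a new county, state, or condo project then amounts to appending rules and templates rather than changing the engine itself.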
![](https://i.imgur.com/g3T1ltz.png) The rules engine calculates the values necessary to inject into the templates, including: - Refinance loan amount from credit report data, checked against lien amount from title history - Taxes from property address data, provided by borrower and confirmed with credit report and device location, and county tax assessor databases - Interest rate from the mortgage offer YAML configuration and loan term from loan type - Homeowner's insurance payment from new Lemonade policy letter delivered over API or from existing HOI policy electronic declaration pages Closing documents are stored immutably and are downloaded to the borrower's mobile app to view at any time. ### Regulatory monitoring CFPB regulations mandate waiting periods between disclosures and closing — specifically 7 business days between delivery of a Loan Estimate and consummation of the loan agreement (TILA window). A waiting period monitoring function periodically checks unclosed loans to alert when the period has elapsed. In response, a notification is sent to the borrower's mobile device to invite them into a video chat to sign the docs. ![](https://i.imgur.com/UeCCpSg.png) In addition to the TILA window, the smart contract provides support for mortgage refinance "right to rescind". This provides a 3 day window within which the borrower can cancel their refinance request and prevent the transaction from settling. ### Video notarization Video chat is used to complete the closing. The loan documents must be signed under observation by a notary -- as required by state notarization laws which require audiovisual interaction. To achieve this without an in-person meeting, video conferencing that can support signatures and notarizations is integrated into the Q mobile app. The video tech allows a notary to communicate with a borrower over a secure communications channel while simultaneously overlaying loan documents and electronic signatures. 
This is enabled by the [Twilio Video](https://www.twilio.com/video) iOS/Android SDKs, which establish an encrypted [WebRTC](https://webrtc.org/) connection. Loan documents are simultaneously overlaid with video chat between borrower and notary. Then, eSignatures are collected and stored alongside an encrypted recording of the notarization. ![](https://i.imgur.com/3c16gWu.png) To perform notarizations, notaries trained to use the Q system must be available on demand. The notaries must be state-licensed notaries public and operate from jurisdictions with established Remote Online Notarization laws (URPERA and NASS eNotarization standards). Notaries use a Q web application to communicate with borrowers. It allows them to apply their electronic notarization seal to documents signed during the video chat. The app also guides them through a scripted procedure to ensure that the notarization complies with notarization laws. During closing, the borrower signs loan documents with a selfie photo. The selfie is registered as a legally-binding signature through a signature affidavit. The selfie can then be encrypted and stored as an eSignature, complying with eSign/UETA laws. The notary applies their electronic seal to the mortgage, making it eligible for acceptance by county deed recorders. Simultaneously, the document is cryptographically signed with the notary's unique digital certificate, ensuring that no further changes can be made. The seal and certificate are both unique to the notary, and the notary's desktop app confirms their identity using a biometric check on the notary's 2FA device with FaceID/TouchID. ### Electronic Notes After the closing docs are signed, the electronic mortgage note is created. The mortgage note is an XML file in [MISMO SMART Doc format](https://www.mismo.org/get-started/participate-in-a-mismo-workgroup/---verifiable-smartdoc%C2%AE-30-profile-development-workgroup).
The note is digitally signed using a SHA-2 tamper-evident digital signature with an [X.509 certificate](https://en.wikipedia.org/wiki/X.509). The note is then a transferable record of the loan, an [eMortgage](https://en.wikipedia.org/wiki/EMortgage). The note is stored with the lender recorded as the owner. Ownership of the note is registered electronically with [MERS](https://www.mersinc.org/index) via their [eRegistry](https://www.mersinc.org/products-services/mers-esuite/eregistry) protocol. Once the note is created, the mortgage deed can be recorded with the county using [Simplifile](https://simplifile.com/) electronic recording. The property's title is then monitored to confirm the recording, by pulling daily updates from the county recorder using the same infrastructure we use for property profile data. ![](https://i.imgur.com/MuX0EUF.png) <p class="image-caption"> Example of a tamper seal using an X.509 certificate on a MISMO eNote </p> ### Settlement Once the note is created, the mortgage funds are ready to be transferred. To settle the mortgage, Q needs to know the amount and the recipient's bank account information. The process to obtain this info is different for purchase and refinance mortgages. To prepare for settlement, a mortgage escrow account is created at Flagstar. [ModernTreasury's API](https://www.moderntreasury.com/) is used to manage accounts and outgoing payments.
For escrow accounts, a virtual account (aka sub-account) is created for each loan: ```bash # ModernTreasury call to create a virtual account curl --request POST \ -u ORGANIZATION_ID:API_KEY \ --url https://app.moderntreasury.com/api/virtual_accounts \ -H 'Content-Type: application/json' \ -d '{ "name": "Escrow account for Loan #123", "internal_account_id": "c743edb7-4059-496a-94b8-06fc081156fd", "counterparty_id": "b4313c66-3892-416d-995f-f5b6044b5c7a", "account_details": [ { "account_number": "2000001", "account_number_type": "other" } ] }' ``` The escrow account serves as the intermediary for fund transfers, allowing for funds to be disbursed to other parties, including: real estate brokers, state and county tax collectors, and property insurance companies. The initial escrow balance is calculated to estimate the next expected property tax bill, less a monthly pro-rated annual tax amount which is expected to be received from the borrower with each interim monthly mortgage payment. #### Refinance mortgage settlement For a refinance mortgage, the mortgage amount is initially estimated from the credit report and public title/deed records. The exact payoff amount is unknown until a payoff statement is received. ```yaml # Mortgage tradeline on credit report with remaining balance credit_report: ... - tradeline: type: mortgage name: Chase Manhattan Home Mortgage account: 3331334557 opened: 02/2018 reported: 03/2021 balance: 265130 scheduled_payment: 2120 collateral: FRD680474439 ... ``` ```yaml # Corresponding mortgage deed from title search title: - date: 2018-02-18 type: mortgage from: CORBETT MCCORMICK to: CHASE MANHATTAN MTG CORP loan_amount: 325123 loan_due_date: 2/22/2048 loan_finance_type: adjustable_rate document_id: 0907729017 ... ``` In order to request a payoff statement, the existing loan number and servicer are parsed from the credit report and the [MERS servicer portal](https://www.mers-servicerid.org/sis/). Then, a payoff statement request is sent to the servicer.
Q adapts the request to each servicer's requirements, e.g. over email (using [SendGrid](https://sendgrid.com/) or [AWS SES](https://aws.amazon.com/ses/)) or automated phone call (with [Twilio Voice API](https://www.twilio.com/voice)). ![](https://i.imgur.com/ufbHHbL.png) <p class="image-caption"> Sample mortgage payoff statement </p> Once the statement is received, it can be parsed to extract the payoff amount and other loan attributes. [AWS Textract](https://aws.amazon.com/textract/) is used to read unstructured data using OCR and machine learning: ![](https://i.imgur.com/ZQItcvW.png) <p class="image-caption">Automated data extraction from forms using Textract (source: https://aws.amazon.com/textract/features/)</p> Any difference between estimated (from credit report) and actual payoff (from payoff statement) amount must be reflected in the closing docs. In the event that it would increase the borrower's closing costs, a lender credit is added to offset the amount (paid for by the lender). Increases in lender credits do not require a reset of the TILA window, thus preventing any impact to the borrower. ![](https://i.imgur.com/KyL9SEn.png) Once the video notarization is complete, an outgoing wire must be sent from the lender to the loan's servicer in order to pay off the existing mortgage loan. This process varies for each lender and depends on their available payments infrastructure: for some banks Q can issue an API request with [ModernTreasury](https://www.moderntreasury.com/), and for others a bank-specific PDF file is generated. 
```bash # Outgoing wire request with ModernTreasury curl --request POST \ -u ORGANIZATION_ID:API_KEY \ --url https://app.moderntreasury.com/api/payment_orders \ -H 'Content-Type: application/json' \ -d '{ "type": "wire", "amount": 265130, "direction": "credit", "currency": "USD", "originating_account_id": "0f8e3719-3dfd-4613-9bbf-c0333781b59f", "receiving_account_id": "5acec2ef-987b-4260-aa97-b719eeb0a8d5" }' ``` #### Purchase mortgage settlement For a purchase mortgage, the purchase contract is required to determine the final settlement amount and where it needs to be disbursed. Arika is used to obtain the seller's contact information and reach out over SMS/email to request the contract. ![](https://i.imgur.com/M9yF7vc.png) <p class="image-caption"> Email generated to seller's broker by Arika </p> The purchase contract is parsed with [Textract](https://aws.amazon.com/textract/) to read the amount of the purchase, and therefore the amount of the loan (less the downpayment). Simultaneously, the borrower's downpayment is collected over ACH using [Dwolla](https://www.dwolla.com/) and placed in a dedicated purchase escrow account until settlement is complete. ```javascript // Creating a same-day ACH pull request with Dwolla var transferRequest = { _links: { source: { href: "https://api-sandbox.dwolla.com/funding-sources/b5e68264-7d4d-42a9-88d4-5616c77c6baa", }, destination: { href: "https://api-sandbox.dwolla.com/funding-sources/3152c22b-3d72-442d-a83b-e575df3a043e", }, }, amount: { currency: "USD", value: "93019.19", }, clearing: { destination: "next-available", }, }; appToken.post("transfers", transferRequest).then(function (res) { res.headers.get("location"); // => 'https://api-sandbox.dwolla.com/transfers/d76265cd-0951-e511-80da-0aa34a9b2388' }); ``` At closing time, the seller is invited into the Q app to notarize the title transfer for their home.
Once both buyer and seller have closed, the smart contract releases the buyer's downpayment from escrow and sends a wire from the lender to the seller's bank account. ### Considerations #### Scaling to all 50 states Q is initially launching in California, but is designed to scale to all 50 states. Due to variations in state law, the closing process will be different in some states in order to accommodate their requirements — potentially impacting notarization, recording, and attorney presence. #### Notarization issues While many states do not have explicit laws approving of Remote Online Notarization, almost all accept out-of-state notary seals for recorded documents (provided that the out-of-state notary complies with the laws of their respective state). Currently, Iowa, South Carolina, and Oregon are unclear in their willingness to accept [Remote Online Notarization](https://www.notarize.com/knowledge-center/is-online-notary-legal) of real estate transactions. These laws are changing rapidly and there will likely be more clarity by the time we expand to those states. As Q scales to close in more states, a larger network of notaries will be required to support increased loan volumes. To achieve this scale, Q will need to build out a network of independent third-party notaries with access to our closing system. This will follow the gig-work "Uber model", as most notaries will not be utilized most of the time but nevertheless need to be available in case of a spike in demand. #### Attorney closing states [Several states](http://media.octoberresearch.com/pdfs/2020_Attorney_State_Breakdown.pdf) require an attorney's involvement in mortgage closing, either in the form of closing presence or for a title opinion. For almost all of these, an attorney does not need to be physically present but may need to be involved in the closing to a varying degree, including presiding over the notarization. 
These attorney closing states include Connecticut, Delaware, Georgia, Massachusetts, North Carolina, Rhode Island, South Carolina, and West Virginia. To support closing in those states, Q will operate an internal team of attorneys to work with the notary during each closing. A separate closing tool is needed for the attorney in order to satisfy their requirements under each state's laws. ![](https://i.imgur.com/OkDC0S5.png) #### eRecording limitations The vast majority of homes are located in counties that allow for electronic recording of deeds: over [2,200 of 3,006 counties](https://simplifile.com/e-recording/e-recording-network/) nationwide, with mostly rural counties not supported. To supplement this, 13 states have enacted laws which allow "papering-out" of documents, i.e. printing electronic documents to be physically filed. For the remaining counties, electronic closing will not be available until their respective states adopt "papering-out" legislation or an equivalent. ## qDB: a trustless mortgage ledger qDB is a proprietary database built to securely store mortgage data, written directly from mortgage smart contracts. It allows lenders to rely on origination data without needing to trust other parties to the mortgage transaction. Data is retained in its original electronic state with proof that it was sourced from an independent third party. The database combines the immutability and ordering guarantees of public blockchains with the multimedia storage capabilities and performance of traditional databases. By retaining all mortgage origination data in this database, originators and investors will be able to perform loan-level auditing on a reliable and completely digital mortgage dataset. ### Data storage qDB stores all mortgage origination data in one place. This creates a single source of truth for the loan throughout its life.
Traditional tamper-proof databases can only store a pointer to large data sets, with the data itself stored elsewhere. In qDB, a full manifest of data is stored for each mortgage: - **Customer identity verification:** including selfie video, photo of government-issued picture ID, phone number, credit report identity data, customer location history, and verifications from third-party databases for identity verification including SSA, PEP, OFAC, etc. - **Financial documentation:** bank account and investment accounts (activity and balances), verification of employment and income from employer, IRS tax filing history, credit reports, and proof of homeowner's insurance. - **Property data:** title history, tax assessment history, parcel description and legal/zoning status, HOA and condo association questionnaire (financials, occupancy, insurance, legal structure) - **Property appraisal data:** video walkthrough and analysis, full appraisal - **Loan applications:** loan request, including all variables derived from raw inputs above, such as LTV, DTI - **Underwriting decisions** including rejection reasons - **Closing packages**, including all disclosures, promissory note (eNote), mortgage deed - **Notarization ceremony** and eSignatures attached to closing package - Evidence of **recorded mortgage deed** - **Servicing activity:** payments, delinquencies, customer support interactions, modifications, and legal actions - **Note transfers:** sales to GSEs, updates to MERS registry Arbitrary data types can be stored in the database, such as images, video, and sealed electronic mortgage notes. This enables the database to store all raw data collected during origination, including: ID photographs, home walkthrough videos, 3D models of homes and neighborhoods, and notarization videos. Raw data is stored in its original format. ### Proof of source All database entries are stored alongside proof of where and when they were obtained.
Source data from third-party vendors is signed with their encryption keys. For example, video and photo data collected from mobile devices are stored with original metadata proving the time and location of collection. ![](https://i.imgur.com/yOCuXFz.png) <p class="image-caption">TLS certificate example</p> Data collected from third-party websites or APIs is stored encrypted with their TLS certificate. TLS uses the HMAC algorithm — a specific construction for calculating a message authentication code (MAC) involving a cryptographic hash function in combination with a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authentication of a message. ![](https://i.imgur.com/PWpaNkH.png) <p class="image-caption">HMAC verification performed during TLS handshake in HTTPS protocol</p> ### Proof of integrity Database entries are stored in a journal that is append-only — entries can only be added, never modified or deleted. Each entry is hash chained to the previous entry inserted into the database. Hash chaining uses a cryptographic hash (SHA-256) of the previous entry and includes it inside each new entry. This creates proof of entry order and that underlying data hasn't changed — a given entry is guaranteed to have existed before any entries that contain a hash of its data. ![](https://i.imgur.com/uFvulK3.png) <p class="image-caption">Hash chaining example from Bitcoin whitepaper (source: https://bitcoin.org/bitcoin.pdf)</p> Periodically (e.g. every hour), an encrypted snapshot of the entire database is taken by calculating a Merkle Tree root hash. This hash is written as a transaction to the Bitcoin blockchain, serving as permanent proof of the database contents at that time. This protects against database attacks which may try to recreate an entire database at once, including past history, in order to tamper with the data. 
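The hash chaining and periodic Merkle snapshot described above can be sketched as follows. This is a minimal illustration, not the qDB implementation: the `Journal` class, the all-zero genesis hash, and hashing hex strings (rather than raw bytes) are simplifying assumptions, and duplicating the last hash on odd-sized levels follows the Bitcoin convention. Publishing the root to the Bitcoin blockchain is out of scope here.

```javascript
const { createHash } = require("crypto");

// SHA-256 of a string or Buffer, returned as a hex string.
const sha256 = (data) => createHash("sha256").update(data).digest("hex");

// Assumed placeholder for the "previous hash" of the very first entry.
const GENESIS = "0".repeat(64);

// Append-only journal: every entry embeds the hash of the entry before it.
class Journal {
  constructor() {
    this.entries = [];
  }

  append(payload) {
    const last = this.entries[this.entries.length - 1];
    const prevHash = last ? last.hash : GENESIS;
    const hash = sha256(prevHash + sha256(payload));
    const entry = Object.freeze({ payload, prevHash, hash });
    this.entries.push(entry);
    return entry;
  }

  // Recompute the chain; a modified, removed, or reordered entry breaks it.
  verify() {
    let prevHash = GENESIS;
    for (const entry of this.entries) {
      const expected = sha256(prevHash + sha256(entry.payload));
      if (entry.prevHash !== prevHash || entry.hash !== expected) return false;
      prevHash = entry.hash;
    }
    return true;
  }
}

// Merkle root over a list of entry hashes, computed separately from writes
// so that large payloads do not slow down insertion.
function merkleRoot(leafHashes) {
  if (leafHashes.length === 0) return sha256("");
  let level = leafHashes;
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      // Odd number of nodes: pair the last one with itself.
      const right = i + 1 < level.length ? level[i + 1] : level[i];
      next.push(sha256(level[i] + right));
    }
    level = next;
  }
  return level[0];
}
```

A periodic job in this sketch would call `merkleRoot(journal.entries.map((e) => e.hash))` and publish the resulting 64-character hex digest as the snapshot.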
![](https://i.imgur.com/mG4U7rz.png) <p class="image-caption">Calculating a Merkle root hash by combining hashes of each transaction (individual database entry)</p> qDB is unique in its ability to prove data integrity for large files, e.g. a video. Other databases (like QLDB) that calculate Merkle roots synchronously are restricted to small data sizes because calculating the root hash is computationally expensive. These databases are designed for an environment with many simultaneous writers and readers. We can design for a more specific use case: a single writer from the Q system, and delayed reads from the lender (for auditing and QC). Given our usage pattern, we can calculate the Merkle root asynchronously, which allows us to store much larger database entries. ![](https://i.imgur.com/SDTE5s7.png) ### Private instances Data recorded in qDB is accessible exclusively by the owner of the instance: the lender. This is made possible by encrypting data stored in qDB using a dedicated Hardware Security Module (HSM) from [AWS CloudHSM](https://aws.amazon.com/cloudhsm/). The HSM is FIPS 140-2 Level 3 compliant and performs all encryption/decryption operations for data entering/leaving the database. Database entries are encrypted at rest (in the database) and in transit (using TLS). Since an HSM is used, encryption keys never leave the hardware device, reducing risk of key theft or loss. The HSM device itself is physically tamper-proof and tamper-evident, protecting against a data center breach. ![](https://i.imgur.com/mrpve68.png) <p class="image-caption">Hardware Security Module to protect encryption keys</p> Each database entry is *envelope encrypted*, using a mix of symmetric (e.g. [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard)) and asymmetric (e.g. [RSA](https://en.wikipedia.org/wiki/RSA_(cryptosystem))) cryptography. Symmetric key algorithms are faster and produce smaller ciphertexts than public key algorithms.
But public key (asymmetric) algorithms provide inherent separation of roles and easier key management. Envelope encryption lets you combine the strengths of each strategy. This increases security by preventing excessive key reuse and enables higher performance encryption for large objects like videos. Access to encryption and decryption operations is restricted to authorized users selected by the lender. Q administers the HSM appliance but does not have access to the keys or crypto operations. Adding or removing a user creates an immutable entry in an [AWS CloudTrail](https://aws.amazon.com/cloudtrail/) log for auditing and compliance. In addition, users require multi-factor authentication to access the data. We plan to use [Yubico FIDO U2F devices with built-in fingerprint reader](https://www.yubico.com/blog/getting-a-biometric-security-key-right/) for additional biometric security. ![](https://i.imgur.com/cAkPcVg.png) <p class="image-caption">Lenders retain exclusive access to crypto operations</p> ![](https://i.imgur.com/t2iTvMU.png) <p class="image-caption">Biometric MFA key for authorized users</p> ### Compliance accessibility Lenders can grant selective access to their data to select third-parties, e.g. the CFPB or OCC for auditing purposes. Data access can be granted for a predefined period of time and for a subset of data, e.g. for specific mortgages or dates. 
Access is controlled by creating separate [AWS IAM](https://aws.amazon.com/iam/) users for the regulator's agent and granting them access to a subset of the lender's data through an ["encryption context"](https://docs.aws.amazon.com/whitepapers/latest/kms-best-practices/encryption-context.html): ```json # AWS IAM policy to restrict decryption access to LoanID = ABC12345 { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::111122223333:role/RoleForExampleApp" }, "Action": [ "kms:Decrypt" ], "Resource": "*", "Condition": { "StringEquals": { "kms:EncryptionContext:LoanID": "ABC12345" } } } ``` ### Engineering consideration Q is exploring several ways of implementing the qDB design. In any implementation, Q would build off of existing technologies by forking them and adding or removing functionality as necessary. The two leading approaches are currently: #### Forking Hyperledger Linux Foundation's Hyperledger is an open-source blockchain library. It is used as the basis for many other blockchain implementations, e.g. QLDB is a fork of Hyperledger. It may serve as a good starting point for building out our own tech, but would require significant changes. Specifically, Hyperledger's block design is for distributed consensus, which adds overhead that we would want to remove. In addition, we'd have to modify the Merkle tree hashing to happen periodically rather than at write time for each entry, and add the public attestation to the Bitcoin blockchain. #### Forking CouchDB Traditional databases (PostgreSQL, DynamoDB, CouchDB, etc.) are flexible and performant enough to store mortgage origination data. NoSQL document-oriented data stores are a better fit for our use case of storing relatively large documents rather than small individual database entries (e.g. individual fields). The biggest issue is that these databases are not provably tamper-proof.
Without enforced hash chaining of individual entries, a sophisticated attacker could modify and retroactively insert data without it being detectable. We would have to implement this crypto logic ourselves. #### Other technologies [**QLDB**](https://aws.amazon.com/qldb/) Amazon AWS's QLDB is the closest alternative solution to our database. Its largest limitations are: small document sizes (can't store images and video) and inability to use customer encryption keys. QLDB was designed to calculate Merkle tree root hashes with every insertion, which creates large performance issues with larger entries. In addition, the encryption limitations create a potential compliance liability as Amazon AWS controls the private keys to the data. **Public decentralized blockchains (Bitcoin and Ethereum)** Public blockchains are designed for distributed consensus: thousands of participants all calculating a copy of the database for themselves. This places tremendous limitations on the amount of data that can be stored in them, e.g. we wouldn't be able to store images or video. ### The path forward ==todo== ## Servicing ==todo== ### Payment processing ### In-App Notification System ### Support ## GSE Delivery # Considerations and conclusions ## Security ### Tokenizing sensitive data ### PII detectors ### Differential privacy for internal data use ### Hardware encryption ### Red teaming ## Fraud Detection Module A fraud detection module monitors user activity and blocks behavior that is suspicious. The module uses data collected throughout the origination process as inputs: - **Identity** **data** including PII (names, DOB, addresses, etc.), selfie video, photo ID, and KYC/AML verification - **Financial data** including names on accounts, assets, income, employment history, and alerts on a credit report - **Property** **data** including the home's location, title history, and sales history.
To supplement this, telemetry data is collected from user mobile devices (using [Seon](https://seon.io/) and [Shield](https://www.shield.com/)): - **Behavior biometrics** including screen taps, motion sensor, GPS and radio signals - **Device metadata** including hardware specifications, IP/network configuration, and kernel data Combining this data allows for pattern detection to identify **identity theft, synthetic identities, account takeover fraud, and occupancy fraud.** User behavior which appears sufficiently risky is automatically rejected and the borrower is either removed from the Q platform or transferred to Flagstar's retail lending division for manual review. ![](https://i.imgur.com/Yqt0gHB.png) ### Explainable machine learning The fraud detection module uses explainable machine learning models, like classification decision trees. This makes it clear which specific criteria triggered a fraud alert, enabling more effective manual review when a case is passed to Flagstar, and supporting model improvement by our internal engineering team. ![](https://i.imgur.com/DCXY8PJ.png) <p class="image-caption"> Example branches from a decision tree fraud classifier model [https://bit.ly/3neA3gt] </p> Initial data to train the model will be sourced from historical applications to Flagstar's retail mortgage division. ### Fraud red teaming In order to capture more training data for the fraud detection system, regular stress testing is performed. This is done via "Red Team" exercises where third-party security organizations are hired to attempt to create fraudulent transactions on the Q platform, or otherwise defeat the fraud detection modules. We are exploring various partners for fraud red teaming, including [HackerOne](https://www.hackerone.com/) and [Positive Technologies](https://www.ptsecurity.com/).
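When reviewing red-team attempts or cases passed to manual review, it helps to see exactly which criteria led to a decision. The sketch below shows one way an explainable decision tree can return its decision path alongside the label. The tree is hand-built for illustration, and the feature names and thresholds (`idSelfieMatchScore`, `milesFromPropertyAtNight`) are hypothetical, not values from a trained Q model.

```javascript
// Hypothetical tree: internal nodes test one numeric feature against a
// threshold; leaves carry the final label.
const TREE = {
  feature: "idSelfieMatchScore",
  threshold: 0.8,
  below: { label: "reject" },
  atOrAbove: {
    feature: "milesFromPropertyAtNight",
    threshold: 50,
    below: { label: "approve" },
    atOrAbove: { label: "manual_review" },
  },
};

// Walk the tree and record every comparison, so a reviewer can see exactly
// which criteria produced the outcome.
function classify(tree, features) {
  const path = [];
  let node = tree;
  while (node.label === undefined) {
    const value = features[node.feature];
    if (typeof value !== "number") {
      throw new Error(`Missing numeric feature: ${node.feature}`);
    }
    const isBelow = value < node.threshold;
    path.push(`${node.feature} = ${value} ${isBelow ? "<" : ">="} ${node.threshold}`);
    node = isBelow ? node.below : node.atOrAbove;
  }
  return { label: node.label, path };
}
```

Calling `classify(TREE, { idSelfieMatchScore: 0.9, milesFromPropertyAtNight: 120 })` in this sketch yields the label `manual_review` together with the two comparisons that led there.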
## Open Architecture ### Enabling other banks to lend through Q ### Integration lite #### Funding source #### Lender configuration for smart contract ### Q lender screening and quality control ### Solving the mortgage brand problem ## Moving from Prototype to Production ### Engineering for scalability ### Internal mortgage origination costs ### Implementation risks