Production LLM Risk Assessment System for Regional Banking Institution

The case study below details the technical architecture, implementation methodology, challenges overcome, and quantifiable business results of this project.

A 90-day, multi-phased implementation delivered extraordinary results:

75% reduction in underwriting time (5-7 days → 1.5 days)
30% improvement in risk prediction accuracy
40% reduction in underwriting costs ($1.2M annual savings)
3x increase in processing capacity (200+ daily assessments)
99.95% uptime achieved (exceeding 99.9% SLA)
$4.2M fraud prevented in first 6 months
15% increase in commercial loan originations
$2.8M estimated annual value created

Our customer, a regional bank in Charlotte, NC with $8B in assets, needed to modernize its credit risk assessment process while maintaining strict regulatory compliance. Manual underwriting took 5-7 days per commercial loan application, creating a competitive disadvantage against larger banks with automated systems.

Remaker Digital built a production LLM risk assessment platform that reduced underwriting time by 75% while improving risk prediction accuracy by 30%, all within a secure, auditable, and regulatory-compliant MLOps framework.

Business Context

The bank specialized in commercial lending to small and mid-sized businesses across the Southeast. With 50 branches and 200 loan officers, the institution processed 3,000+ loan applications annually:

Average commercial loan size: $500K-$5M
Manual underwriting process: 5-7 days per application
15-person underwriting team reviewing financial statements, tax returns, credit reports
Regulatory requirements: FDIC, OCC, Fair Lending Act compliance
Competition: Larger banks with automated underwriting gaining market share

Technical and Regulatory Challenge

The bank needed an AI solution that could:

Analyze complex financial documents (income statements, balance sheets, tax returns)
Integrate with legacy core banking system (Fiserv Premier)
Maintain explainability for regulatory audits and fair lending compliance
Achieve production-grade reliability (99.9% uptime, <2 second response time)
Support continuous model improvement without disrupting operations
Provide fraud detection and anomaly flagging in real-time

Existing credit scoring models (FICO-based) couldn’t analyze unstructured data or provide nuanced risk assessment for complex commercial lending scenarios.

Existing Infrastructure

The bank’s technology environment included:

Core Banking System: Fiserv Premier (legacy on-premises deployment)
Credit Bureau Integration: Experian, Equifax, TransUnion APIs
Document Management: SharePoint for loan application files (PDFs, scanned documents)
Data Warehouse: SQL Server 2019 with historical loan performance data (10 years)
Regulatory Reporting: Manual Excel-based processes for FDIC/OCC reporting

Technical Requirements

The solution needed to:

Deploy on-premises or in private cloud (data residency requirements)
Integrate with Fiserv Premier via SOAP APIs
Process 50-100 loan applications daily with <2 second response time
Maintain complete audit trail for regulatory examination
Support model versioning and A/B testing for continuous improvement
Provide explainable AI outputs for underwriter review
Achieve SOC 2 Type II compliance for third-party AI vendor requirements

Elapsed time (days): 21

Discovery and Planning

Discovery & Regulatory Assessment (3 weeks)

Conducted stakeholder interviews with underwriting team, compliance officers, and IT leadership. Analyzed 500+ historical loan applications and performance data. Documented regulatory requirements (FDIC, OCC, Fair Lending Act). Assessed Fiserv Premier integration capabilities and data quality in existing systems.

Elapsed time (days): 28

Architecture Design

Architecture Design & Compliance Framework (4 weeks)

Designed MLOps architecture with Azure Machine Learning and Databricks. Created explainability framework using SHAP and GPT-4 narrative generation. Designed bias monitoring and fairness testing procedures. Developed integration architecture for Fiserv Premier SOAP APIs. Obtained the compliance team's approval of the architecture approach against regulatory requirements.

Elapsed time (days): 42

Development and Integration

Development & Model Training (6 weeks)

Built data pipeline extracting 10 years of historical loan data from SQL Server. Developed document processing pipeline with Azure Document Intelligence. Fine-tuned FinBERT for financial sentiment analysis. Trained fraud detection models on historical fraud cases. Built custom Fiserv API wrapper and caching layer. Developed underwriter dashboard with React.

Elapsed time (days): 28

Testing and Training

Testing & Bias Validation (4 weeks)

Conducted comprehensive testing across 1,000+ historical loan applications. Performed bias testing across protected classes (race, gender, age). Validated explainability outputs with compliance team. Load tested system for 200+ concurrent requests. Achieved 99.95% uptime in staging environment. Obtained preliminary regulatory approval for testing results.

Elapsed time (days): 21

Deployment

Pilot Deployment & A/B Testing (3 weeks)

Deployed to 5 branches for pilot with 20 loan officers. Ran A/B test comparing AI-assisted underwriting to traditional process. Collected feedback and refined UI based on underwriter usage patterns. Processed 200+ real loan applications during pilot. Achieved 30% accuracy improvement and 75% time reduction in pilot metrics.

Elapsed time (days): 14

Handoff to Operations

Production Rollout & MLOps Enablement (2 weeks)

Full rollout to all 50 branches and 200 loan officers. Conducted training sessions on AI-assisted underwriting workflow. Deployed production monitoring dashboards for model performance and bias detection. Established weekly model performance review process. Achieved 99.95% uptime in first 2 weeks. Delivered comprehensive documentation for regulatory examination.

Regulatory Explainability Requirements

Financial regulators require complete transparency in lending decisions. LLMs are often “black boxes.” We addressed this through:

SHAP Analysis: Generated feature importance scores for every prediction
Narrative Explanations: GPT-4 produced human-readable explanations citing specific financial metrics
Audit Trail: Logged every input document, model version, and output for regulatory examination
Human-in-the-Loop: Underwriters review and approve all AI recommendations (AI as decision support, not autonomous decision-making)
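
The audit trail and feature-importance items above can be illustrated with a minimal sketch of a single audit record. This is not the production implementation: the function name, field names, feature names, and model version string are all hypothetical, and the SHAP values are passed in as a plain dictionary rather than computed from a real model.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(document_bytes, model_version, shap_values,
                       recommendation, reviewer_decision=None):
    """Assemble one audit-trail entry for a single risk assessment.

    All field names are illustrative assumptions, not the bank's schema.
    """
    # Rank features by absolute contribution so the largest drivers come first.
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Store a hash of the input document rather than the document itself.
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "model_version": model_version,
        "top_factors": [
            {"feature": name, "contribution": value} for name, value in ranked[:3]
        ],
        "recommendation": recommendation,
        # Stays None until an underwriter approves or overrides the output.
        "reviewer_decision": reviewer_decision,
    }
    # Hash the serialized record so later modification can be detected.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_sha256"] = hashlib.sha256(payload).hexdigest()
    return record


example = build_audit_record(
    document_bytes=b"%PDF-1.7 sample balance sheet",
    model_version="risk-model-1.4.2",  # hypothetical version label
    shap_values={
        "debt_service_coverage": -0.31,
        "current_ratio": 0.12,
        "revenue_growth": 0.05,
        "years_in_business": -0.02,
    },
    recommendation="refer_to_underwriter",
)
print(json.dumps(example, indent=2))
```

Keeping the reviewer decision in the same record as the model output is one way to make the human-in-the-loop step visible to an examiner. A real deployment would write these records to durable, append-only storage.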

Legacy System Integration Complexity

Fiserv Premier’s SOAP-based APIs were designed for batch processing, not real-time AI integration. We solved this by:

API Wrapper: Built custom REST API layer translating modern API calls to SOAP
Caching Layer: Redis cache for frequently accessed customer data to reduce latency
Async Processing: Decoupled document analysis from real-time API responses using message queues
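
The wrapper and caching ideas above can be sketched with the Python standard library alone. The operation name `GetCustomer` and the field `CustomerId` are hypothetical placeholders, not actual Fiserv Premier operations. The in-memory `TTLCache` stands in for Redis so the example runs without external services.

```python
import time
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 namespace
ET.register_namespace("soapenv", SOAP_NS)


def to_soap_envelope(operation, params):
    """Translate a JSON-style request (operation + flat dict) into a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for key, value in params.items():
        ET.SubElement(op, key).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


class TTLCache:
    """Small in-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def fetch_customer(customer_id, cache, call_backend):
    """Serve from cache when possible; otherwise call the SOAP backend once."""
    cached = cache.get(customer_id)
    if cached is not None:
        return cached
    request_xml = to_soap_envelope("GetCustomer", {"CustomerId": customer_id})
    response = call_backend(request_xml)
    cache.set(customer_id, response)
    return response


# Usage with a fake backend that records each SOAP request it receives.
calls = []

def fake_backend(request_xml):
    calls.append(request_xml)
    return {"customer_id": "C-1001", "name": "Example Co"}

cache = TTLCache(ttl_seconds=300)
first = fetch_customer("C-1001", cache, fake_backend)
second = fetch_customer("C-1001", cache, fake_backend)  # served from cache
print(calls[0])
```

Passing the backend call in as a function keeps the translation and caching logic testable without a live core banking connection.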

Model Performance and Bias Monitoring

Ensuring fair lending compliance and detecting model drift required continuous monitoring:

Bias Detection: Automated analysis for disparate impact across protected classes (race, gender, age)
Drift Detection: Statistical tests comparing current predictions to historical baseline
Champion/Challenger Testing: A/B testing new model versions on 10% of traffic before full rollout
Performance Dashboards: Real-time metrics on accuracy, precision, recall, and regulatory compliance
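
Two of the checks above, disparate impact and drift, reduce to short calculations. The sketch below assumes approval counts per group and binned score counts are already available. It uses the four-fifths rule and the population stability index (PSI) as illustrative metrics. The case study does not state which statistical tests or thresholds were actually used, and the group labels and counts are made up.

```python
import math


def disparate_impact_ratio(approvals_by_group):
    """Return lowest approval rate divided by highest approval rate.

    approvals_by_group maps a group label to (approved, total).
    """
    rates = [
        approved / total
        for approved, total in approvals_by_group.values()
        if total > 0
    ]
    highest = max(rates)
    return min(rates) / highest if highest > 0 else 1.0


def population_stability_index(baseline_counts, current_counts, eps=1e-6):
    """PSI over matching score bins; larger values indicate more drift."""
    base_total = sum(baseline_counts)
    cur_total = sum(current_counts)
    psi = 0.0
    for b, c in zip(baseline_counts, current_counts):
        p = max(b / base_total, eps)  # eps avoids log(0) for empty bins
        q = max(c / cur_total, eps)
        psi += (q - p) * math.log(q / p)
    return psi


# Made-up example data for illustration only.
ratio = disparate_impact_ratio({"group_a": (80, 100), "group_b": (60, 100)})
drift = population_stability_index([100, 200, 300, 400], [400, 300, 200, 100])

# 0.8 is the common four-fifths rule of thumb; 0.25 is a commonly cited
# PSI alert level. Both are conventions, not values from this project.
print("disparate impact flag:", ratio < 0.8)
print("drift flag:", drift > 0.25)
```

In a monitoring job, these functions would run on a schedule against recent decisions and feed the dashboards described above.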

Data Quality and Historical Bias

Historical loan data contained biases from previous manual underwriting. We mitigated this through:

Data Augmentation: Synthetic data generation for underrepresented borrower segments
Fairness Constraints: Added constraints to model training ensuring equitable treatment
Bias Testing: Pre-deployment testing across demographic groups to identify and correct disparities

Operational Efficiency Gains

The LLM risk assessment platform transformed commercial lending operations:

75% faster underwriting: 5-7 days reduced to 1.5 days average
30% higher risk prediction accuracy: Compared to traditional FICO-only scoring
40% reduction in underwriting costs: $1.2M annual savings through automation
200+ daily risk assessments: Processing capacity increased 3x
99.95% uptime achieved: Exceeding 99.9% SLA commitment

Competitive Advantage

Beyond operational efficiency, the system improved market competitiveness:

Same-day loan decisions: For qualified borrowers, enabling faster closing
Improved customer satisfaction: NPS score increased from 42 to 68
Market share growth: 15% increase in commercial loan originations (first year)
Fraud prevention: Detected $4.2M in potentially fraudulent applications (first 6 months)

Regulatory Compliance ROI

Estimated $2.8M annual value created through efficiency gains, fraud prevention, and competitive advantage. System achieved full FDIC and OCC approval in regulatory examination.

Lessons Learned

Explainability is Non-Negotiable: For regulated industries, AI systems must provide complete transparency. SHAP analysis combined with narrative explanations satisfied regulators without giving up the analytical depth of the LLM.
MLOps Maturity Determines Success: Production LLM systems require robust CI/CD, monitoring, and rollback capabilities. Investing in MLOps infrastructure upfront prevented operational issues.
Human-in-the-Loop for High-Stakes Decisions: AI as decision support (not autonomous decision-making) balanced efficiency gains with regulatory compliance and risk management.
Legacy Integration Requires Custom Solutions: Modern AI platforms often need custom API wrappers and caching layers to integrate with legacy banking systems effectively.
Bias Monitoring is Continuous: One-time bias testing is insufficient. Ongoing monitoring and A/B testing are essential for fair lending compliance.

Appendices

Integration Overview

The system integrates with existing banking infrastructure through:

Fiserv Premier Connector: Custom SOAP-to-REST API wrapper for bidirectional loan data synchronization
Credit Bureau APIs: Real-time integration with Experian, Equifax, and TransUnion
Azure AD SSO: Single sign-on for loan officer access with MFA
SharePoint Integration: Automated document extraction from loan application files
SQL Server Sync: Daily data warehouse updates for historical performance tracking

Model Selection Rationale

Azure OpenAI GPT-4: Selected for superior financial document comprehension and narrative explanation generation. Provides nuanced risk assessment beyond traditional scoring models.

FinBERT: Specialized financial sentiment analysis model outperformed general-purpose models for extracting risk signals from financial statements and management discussion.

Custom Fraud Detection Model: XGBoost ensemble trained on historical fraud cases achieved 95% precision while maintaining a low false positive rate.

Azure Document Intelligence: Extracted structured data from PDFs and scanned documents with 98% accuracy, superior to open-source OCR solutions.

Cost Analysis

Monthly operational costs approximately $9,000-$10,000:

Azure OpenAI API: $4,000-$5,000/month (GPT-4 for 200 daily assessments)
Azure ML & Databricks: $2,500/month (compute for training and inference)
Azure Document Intelligence: $800/month (50-100 daily document processing)
AKS & Storage: $1,200/month (container hosting and data storage)
Monitoring & Logging: $500/month (Azure Monitor, Event Hub)

ROI: $1.2M annual cost savings + $4.2M fraud prevented (first 6 months) = $5.4M in value, measured against roughly $120K in annual operating costs. That is approximately a 45x return on operating spend.
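
The arithmetic behind these figures can be checked with a short calculation. This is a sketch using only the numbers listed in this section. It treats the ROI multiple as value divided by annual operating cost, which is an assumption about how the 45x figure was derived, and it excludes implementation costs, which are not given.

```python
# Monthly line items from the cost list above, as (low, high) in USD.
monthly_costs = {
    "Azure OpenAI API": (4000, 5000),
    "Azure ML & Databricks": (2500, 2500),
    "Azure Document Intelligence": (800, 800),
    "AKS & Storage": (1200, 1200),
    "Monitoring & Logging": (500, 500),
}

monthly_low = sum(low for low, _ in monthly_costs.values())
monthly_high = sum(high for _, high in monthly_costs.values())
annual_low, annual_high = 12 * monthly_low, 12 * monthly_high

# Value figures from the case study: annual savings plus fraud prevented.
value = 1_200_000 + 4_200_000

print(f"Monthly cost range: ${monthly_low:,} to ${monthly_high:,}")
print(f"Annual cost range:  ${annual_low:,} to ${annual_high:,}")
print(f"Return multiple:    {value / annual_high:.0f}x to {value / annual_low:.0f}x")
```

Summing the line items as listed gives $9,000 to $10,000 per month, so the 45x figure corresponds to the upper end of the cost range.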

Security Architecture

Enterprise-grade security for financial services compliance:

Cloud Deployment: Azure Government Cloud in private VNet
Network Security: Azure Private Link for all Azure services, NSG rules, Azure Firewall
Authentication: Azure AD with MFA and conditional access policies
Encryption: TLS 1.3 in transit, AES-256 encryption at rest, HSM key management
Compliance: SOC 2 Type II, PCI DSS Level 1, GLBA compliance
Audit Logging: All API calls, model predictions, and user actions logged to Azure Event Hub with 7-year retention
Disaster Recovery: Multi-region deployment with automated failover, 15-minute RTO, 1-hour RPO
Availability: 99.9% uptime SLA (99.95% achieved) with load balancing and auto-scaling