Agent AI Loan Underwriter with AWS and Arize AI

How AWS SageMaker, MCP servers, and Arize AI enable production-ready, observable, and scalable agentic AI workflows

πŸš€ Introduction

The hype around AI agents and the Model Context Protocol (MCP) is everywhere. But how do these concepts translate into real, enterprise-grade solutions?

In this post, we distill a financial services loan underwriting use case built on AWS SageMaker with Arize AI for observability. The goal: show how to design a scalable, compliant, and production-ready agentic AI architecture.

🧩 AWS Services for Generative AI

AWS offers two primary paths for deploying generative AI models:

β€’ Amazon Bedrock β†’ Provides API-based access to managed models (e.g., Claude).

– Best for developers who want to build apps quickly, without worrying about infrastructure.

β€’ Amazon SageMaker β†’ Full-control platform to train and deploy open-source or custom models.

– Ideal for ML teams who need GPU control, fine-tuned training, and flexible deployment.

πŸ‘‰ For our loan underwriting demo, we chose SageMaker to deploy an open-source Qwen model on GPU infrastructure.
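As a rough sketch of what that deployment looks like with the SageMaker Python SDK (the JumpStart model id below is an illustrative placeholder, not the exact id used in the demo; the `ml.g5` instance type matches the demo's GPU tier):

```python
# Sketch: deploy an open-source Qwen model to a SageMaker GPU endpoint
# via JumpStart. Requires AWS credentials and a SageMaker execution role.
from sagemaker.jumpstart.model import JumpStartModel

# The model id is an assumed placeholder; look up the current Qwen id
# in the SageMaker JumpStart catalog.
model = JumpStartModel(model_id="huggingface-llm-qwen2-7b-instruct")

predictor = model.deploy(
    instance_type="ml.g5.2xlarge",  # GPU instance, as in the demo
    initial_instance_count=1,
)

# Invoke the hosted model like any SageMaker endpoint.
response = predictor.predict(
    {"inputs": "Summarize this applicant profile: ..."}
)
```

This is deployment configuration rather than application logic; once the endpoint is up, every MCP server in the pipeline can call it.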

πŸ€– Agents, Tools, and MCP Servers

The Challenge with Agents

β€’ Agents excel at specific tasks.

β€’ But enterprise workflows (like loan underwriting or insurance claims) often require multiple steps, context handoffs, and orchestration.

β€’ Writing custom multi-agent DAGs for every use case is not scalable.

The Solution: Tools + MCP Servers

β€’ Instead of rigid orchestration, create tools that agents can call for specialized tasks:

– Fetch credit data

– Process PDFs

– Clean applicant profiles

β€’ Wrap these tools as MCP servers:

– Agnostic to models and frameworks

– Reusable across agent frameworks (LangGraph, LangChain, AWS Strands)

– Scalable via containers (Kubernetes/ECS/EKS)

– Discoverable at runtime (agents can query available MCP servers dynamically)

– Flexible (implemented in Python, JavaScript, or any language)
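At its core, a tool is just a function with a well-defined contract that an agent can invoke. A minimal Python sketch, using a hypothetical `clean_applicant_profile` tool whose field names mirror the demo's input schema (in practice you would register such a function with an MCP server framework rather than call it directly):

```python
# A "tool" as a plain function with a typed contract. An MCP server would
# expose this so any agent framework can discover and call it.
def clean_applicant_profile(raw: dict) -> dict:
    """Normalize an applicant profile: coerce types, drop unknown fields."""
    fields = {
        "age": int, "income": float, "loan_amount": float,
        "credit_score": int, "liabilities": float, "purpose": str,
    }
    return {k: cast(raw[k]) for k, cast in fields.items() if k in raw}


profile = clean_applicant_profile({
    "age": "34", "income": "85000", "loan_amount": "250000",
    "credit_score": "712", "liabilities": "12000",
    "purpose": "home", "untrusted_extra": 1,
})
print(profile)
```

The point of the MCP wrapper is that this same function becomes callable from LangGraph, LangChain, or AWS Strands without any framework-specific glue.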

🌐 Advantages of MCP Servers

β€’ Agnosticism β†’ Standard protocol works with any framework.

β€’ Scalability β†’ Containerized MCP servers auto-scale with usage.

β€’ Dynamic Discovery β†’ No pre-wired DAGs; agents discover servers on the fly.

β€’ Flexibility β†’ Build in any language, deploy anywhere.
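To make the dynamic-discovery idea concrete, here is a toy sketch. The in-memory registry below is a stand-in for MCP's real capability-listing mechanism, and the server and tool names are hypothetical:

```python
# Toy model of runtime discovery: the agent queries a registry for servers
# that expose a needed capability, instead of following a pre-wired DAG.
REGISTRY = {
    "loan-officer":    ["clean_applicant_profile", "summarize_profile"],
    "credit-analyzer": ["build_credit_profile", "assess_creditworthiness"],
    "risk-assessor":   ["final_decision"],
}


def discover(capability: str) -> list[str]:
    """Return the names of servers that expose a given tool."""
    return [name for name, tools in REGISTRY.items() if capability in tools]


print(discover("final_decision"))  # ['risk-assessor']
```

Because discovery happens at call time, adding a new MCP server to the registry makes it available to every agent with no orchestration-code changes.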

πŸ”Ž The Importance of Observability

For industries like finance and healthcare, observability is not optional.

β€’ Tracing β†’ Identify where errors or latencies occur in complex, multi-layer workflows.

β€’ Explainability β†’ Capture why a decision was made (key for compliance and auditing).

β€’ Compliance β†’ Maintain audit trails that regulators can inspect.

This is where Arize AI comes in.

🏦 Loan Underwriter Demo Architecture

We implemented a simplified loan underwriting pipeline with three MCP servers arranged in a DAG:

1. Loan Officer MCP Server

– Cleans and summarizes applicant profile

– Input: {age, income, loan_amount, credit_score, liabilities, purpose}

2. Credit Analyzer MCP Server

– Builds credit profile

– Assesses creditworthiness (low / medium / high)

3. Risk Assessor MCP Server

– Consumes credit assessment

– Issues final decision (approve / deny)

Backend details:

β€’ SageMaker hosts the Qwen model on an ml.g5 GPU instance.

β€’ Input was provided as JSON, but natural-language input could also be parsed by the LLM.

β€’ This demo did not use RAG, though it could be extended with retrieval pipelines.
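The three-step DAG can be sketched end to end in a few lines. The rule thresholds below are illustrative stand-ins for the LLM calls the demo routes through the SageMaker-hosted model; only the stage boundaries and data handoffs mirror the actual pipeline:

```python
# Sketch of the loan-underwriting DAG: three stages standing in for the
# Loan Officer, Credit Analyzer, and Risk Assessor MCP servers.
# Thresholds are illustrative, not the demo's real decision logic.

def loan_officer(app: dict) -> dict:
    """Clean and summarize the applicant profile."""
    return {"summary": f"{app['purpose']} loan of {app['loan_amount']}", **app}


def credit_analyzer(profile: dict) -> dict:
    """Assess creditworthiness as low / medium / high."""
    score = profile["credit_score"]
    tier = "high" if score >= 720 else "medium" if score >= 640 else "low"
    return {**profile, "creditworthiness": tier}


def risk_assessor(assessment: dict) -> str:
    """Issue the final approve / deny decision."""
    ok = (assessment["creditworthiness"] != "low"
          and assessment["loan_amount"] <= 5 * assessment["income"])
    return "approve" if ok else "deny"


application = {"age": 34, "income": 85000, "loan_amount": 250000,
               "credit_score": 712, "liabilities": 12000, "purpose": "home"}
decision = risk_assessor(credit_analyzer(loan_officer(application)))
print(decision)  # approve
```

In the real architecture each function is a separate MCP server, so the stages scale and deploy independently while keeping the same handoff contract.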

πŸ“Š Observability with Arize AI

Arize AI is integrated to provide end-to-end visibility:

β€’ Agent-Level Tracing β†’ See which agents and MCP servers were invoked.

β€’ Granularity β†’ Inspect inputs/outputs for every step in the decision chain.

β€’ Evaluation Metrics β†’ Track latency, execution time, and performance.

β€’ OpenTelemetry β†’ Native integration with LangChain, LangSmith, and other open-source observability stacks.

β€’ Simple Integration β†’ Just a few lines of tracer initialization code instrument the entire workflow.
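The "few lines" of setup look roughly like this, using the arize-otel helper and OpenInference's LangChain instrumentation; the space id, API key, and project name are placeholders you would pull from your own Arize account:

```python
# Sketch of Arize AI tracer initialization (configuration only; requires
# valid Arize credentials to actually export traces).
from arize.otel import register
from openinference.instrumentation.langchain import LangChainInstrumentor

tracer_provider = register(
    space_id="YOUR_SPACE_ID",      # placeholder: from the Arize UI
    api_key="YOUR_API_KEY",        # placeholder
    project_name="loan-underwriter",
)

# Auto-instrument LangChain calls so every agent/MCP step emits spans.
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)
```

After this runs once at startup, every downstream agent call and MCP server invocation shows up as a traced span in Arize.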

πŸ—οΈ Architecture Diagram

(Architecture diagram: applicant input flows through the Loan Officer, Credit Analyzer, and Risk Assessor MCP servers, backed by the SageMaker-hosted Qwen model, with Arize AI tracing each step.)

βœ… Key Takeaways

β€’ Agents alone are not enough β†’ Enterprise workflows demand scalable, composable tools.

β€’ MCP servers provide a universal protocol for agents to call specialized services.

β€’ SageMaker powers flexible model deployment, while Bedrock fits lightweight API-based use cases.

β€’ Observability with Arize AI ensures compliance, explainability, and production readiness.

🎯 Conclusion

Agentic AI is moving fast from POCs to production in regulated industries. By combining SageMaker, MCP servers, and Arize AI, enterprises can:

β€’ Build modular, reusable workflows

β€’ Scale reliably across business units

β€’ Meet the stringent compliance and observability requirements of finance and healthcare

This architecture isn’t just experimental β€” it’s the blueprint for real-world, production-grade agentic AI.