Middleware

Master this essential integration concept

Quick Definition

Software that acts as a bridge or translator between two otherwise incompatible systems, enabling them to communicate and share data without direct integration.

How Middleware Works

```mermaid
graph TD
    ClientApp["🖥️ Client Application (React / Mobile App)"] -->|HTTP Request| MW["⚙️ Middleware Layer (API Gateway / Message Broker)"]
    LegacyERP["🏭 Legacy ERP System (SAP / Oracle)"] -->|Proprietary Protocol| MW
    MW -->|Translated REST API| ModernDB["🗄️ Modern Database (PostgreSQL / MongoDB)"]
    MW -->|Normalized JSON| Analytics["📊 Analytics Service (Tableau / Snowflake)"]
    MW -->|Auth Token Validation| AuthSvc["🔐 Auth Service (OAuth2 / JWT)"]
    MW -->|Event Publishing| MsgQueue["📨 Message Queue (RabbitMQ / Kafka)"]
    MsgQueue -->|Async Events| Microservice["🔧 Microservice (Order / Payment Processing)"]
    style MW fill:#f4a261,stroke:#e76f51,stroke-width:3px,color:#000
    style ClientApp fill:#457b9d,stroke:#1d3557,color:#fff
    style LegacyERP fill:#6c757d,stroke:#495057,color:#fff
    style ModernDB fill:#2a9d8f,stroke:#264653,color:#fff
    style Analytics fill:#2a9d8f,stroke:#264653,color:#fff
    style AuthSvc fill:#e9c46a,stroke:#f4a261,color:#000
    style MsgQueue fill:#e9c46a,stroke:#f4a261,color:#000
    style Microservice fill:#2a9d8f,stroke:#264653,color:#fff
```

Understanding Middleware

Middleware is the software layer that sits between applications, data sources, and services that were never designed to talk to each other. Rather than forcing each system to learn the other's protocol or data format, the middleware layer handles translation, routing, authentication, and message delivery in one place. API gateways, message brokers, and enterprise service buses (ESBs) are all common forms of middleware.

Key Features

  • Protocol and data-format translation between incompatible systems
  • Centralized authentication, rate limiting, and request routing
  • Reliable, asynchronous message delivery between services
  • Decoupling, so neither connected system has to change its native format

Benefits for Documentation Teams

  • Reduces repetitive documentation tasks
  • Improves content consistency
  • Enables better content reuse
  • Streamlines review processes

Turning Middleware Knowledge from Recorded Walkthroughs into Searchable Reference Docs

When your team onboards engineers or documents integration architecture, middleware configurations are often explained through recorded walkthroughs, architecture review sessions, or troubleshooting calls. Someone shares their screen, traces the data flow between two systems, and explains how the middleware layer handles translation and routing. It makes sense in the moment — but that knowledge gets buried the instant the recording ends.

The core challenge with video-only approaches is that middleware is inherently reference material. When a developer needs to understand why a specific message broker is translating payloads between your legacy ERP and a modern API, they need to find that answer in under two minutes — not scrub through a 45-minute architecture call hoping the relevant segment appears. Video simply wasn't designed for that kind of lookup.

Converting those recordings into structured documentation changes the dynamic entirely. A middleware integration walkthrough becomes a searchable article your team can query by system name, protocol, or error type. For example, a recorded session explaining how your ESB handles authentication between two incompatible identity systems becomes a reusable reference that new engineers can actually find and act on — without interrupting a senior engineer.

If your team regularly explains middleware configurations through meetings or training videos, see how converting those recordings into documentation can make that knowledge genuinely accessible. Explore the video-to-documentation workflow →

Real-World Documentation Use Cases

Connecting a Legacy SAP ERP to a Modern React E-Commerce Frontend

Problem

Enterprise retail teams have a decades-old SAP ERP managing inventory and orders, but their new React storefront speaks REST/JSON. SAP exposes data as IDocs and BAPI calls, formats the frontend cannot parse, so developers end up writing brittle, one-off translation scripts embedded directly in the frontend codebase.

Solution

An API Gateway middleware (e.g., MuleSoft or AWS API Gateway with Lambda) sits between SAP and the React app, translating BAPI calls into RESTful JSON endpoints. It normalizes data schemas, handles authentication, and rate-limits requests so neither system needs to change its native format.

Implementation

1. Deploy MuleSoft Anypoint Platform and configure an inbound REST listener that accepts product and order requests from the React frontend.
2. Build a DataWeave transformation map that converts SAP BAPI_MATERIAL_GETLIST responses into a normalized JSON product schema (id, name, sku, price, stockLevel).
3. Add an OAuth2 policy on the middleware endpoint so the React app authenticates via JWT without SAP needing to manage web tokens natively.
4. Publish the middleware API contract as an OpenAPI 3.0 spec in the developer portal so frontend engineers can mock and test without SAP access.
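The transformation in step 2 can be sketched in plain Python to show the shape of the mapping. The SAP-style field names below (MATNR, MAKTX, PRICE, STOCK) are illustrative assumptions, not an exact BAPI_MATERIAL_GETLIST payload; in MuleSoft this logic would live in a DataWeave map.

```python
# Sketch of the BAPI-to-JSON normalization performed by the middleware.
# SAP field names here are illustrative assumptions, not a real payload.

def normalize_material(bapi_row: dict) -> dict:
    """Map one SAP material row onto the frontend's product schema."""
    return {
        "id": bapi_row["MATNR"].lstrip("0"),   # strip SAP zero-padding
        "name": bapi_row["MAKTX"],
        "sku": bapi_row["MATNR"],
        "price": float(bapi_row["PRICE"]),
        "stockLevel": int(bapi_row["STOCK"]),
    }

row = {"MATNR": "000000000000123456", "MAKTX": "Steel Bolt M8",
       "PRICE": "0.42", "STOCK": "1800"}
product = normalize_material(row)
print(product["id"], product["stockLevel"])  # → 123456 1800
```

Because the mapping lives in the middleware, adding a new product attribute means editing this one function (or DataWeave map) rather than touching SAP or the frontend.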

Expected Outcome

Frontend developers gain a stable, versioned REST API contract; SAP integration errors drop by ~80%; new product attributes can be added in the DataWeave map without touching frontend or SAP code.

Real-Time Data Synchronization Between a CRM and a Marketing Automation Platform

Problem

Sales teams update lead statuses in Salesforce while the marketing team runs campaigns in HubSpot. Without a sync mechanism, leads are emailed after they have already converted, or suppression lists are days out of date, causing compliance risks and wasted ad spend.

Solution

A message-broker middleware (Apache Kafka with Kafka Connect) captures change-data events from Salesforce via the Salesforce CDC connector and streams normalized lead-status events to HubSpot using a HubSpot sink connector. The middleware guarantees at-least-once delivery and schema validation via Confluent Schema Registry.

Implementation

["Enable Salesforce Change Data Capture on the Lead and Contact objects and configure the Kafka Salesforce Source Connector to publish events to a 'crm.lead.status' Kafka topic.", 'Define an Avro schema in Confluent Schema Registry that enforces required fields (leadId, email, lifecycleStage, updatedAt) and rejects malformed events before they reach downstream consumers.', "Deploy a HubSpot Sink Connector that consumes the 'crm.lead.status' topic and upserts contact records in HubSpot using the email field as the deduplication key.", "Configure dead-letter queue (DLQ) routing so failed HubSpot API calls are captured in a 'crm.lead.status.dlq' topic with alerting via PagerDuty."]

Expected Outcome

Lead suppression lists update within 90 seconds of a Salesforce status change; marketing team eliminates post-conversion email complaints; audit logs in Kafka provide a 30-day replay window for compliance reviews.

Unifying Authentication Across Microservices Using an API Gateway as Auth Middleware

Problem

A fintech company's microservices (payments, KYC, notifications) each implement their own session-token validation logic. When the security team mandates a switch from API keys to short-lived JWTs, every service team must update their auth code independently, creating a weeks-long migration with inconsistent rollout and security gaps.

Solution

Kong API Gateway is placed in front of all microservices as authentication middleware. The JWT validation plugin is configured once at the gateway level, stripping the token and forwarding a verified X-User-ID header downstream. Individual services trust the header without re-validating the token.

Implementation

1. Deploy Kong Gateway on Kubernetes and register each microservice (payments-svc, kyc-svc, notifications-svc) as upstream services with their internal cluster DNS addresses.
2. Enable the Kong JWT plugin globally and configure it to validate tokens against the company's Auth0 JWKS endpoint, enforcing 15-minute token expiry and the RS256 algorithm.
3. Configure Kong to forward X-User-ID, X-User-Roles, and X-Tenant-ID headers extracted from validated JWT claims so microservices can perform authorization without token parsing.
4. Document the new auth contract in Confluence: which headers are guaranteed, what happens on 401 vs 403, and how services should handle missing headers in local development.
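On the service side, the contract from step 3 reduces to reading verified headers instead of parsing tokens. A minimal sketch, assuming the header names above; the handler itself and its fail-closed behavior are illustrative, not Kong-mandated.

```python
# Sketch of a downstream microservice consuming the headers the gateway
# injects after JWT validation. The service never touches the token.

def get_caller(headers: dict) -> dict:
    """Extract caller identity from gateway-verified headers; fail closed."""
    user_id = headers.get("X-User-ID")
    if not user_id:
        # Should never happen behind the gateway; treat as a bypass attempt.
        raise PermissionError("missing X-User-ID: request bypassed gateway?")
    return {
        "user_id": user_id,
        "roles": headers.get("X-User-Roles", "").split(","),
        "tenant": headers.get("X-Tenant-ID"),
    }

caller = get_caller({"X-User-ID": "u-42",
                     "X-User-Roles": "payments,admin",
                     "X-Tenant-ID": "acme"})
```

Note the fail-closed branch: if the header is absent, the request is rejected rather than assumed anonymous, which matters when services are reachable inside the cluster without passing through the gateway.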

Expected Outcome

Auth logic is maintained in one place; the migration from API keys to JWT is completed in 2 days instead of 6 weeks; security audits now target a single gateway configuration rather than reviewing 12 microservice codebases.

Protocol Translation for IoT Sensor Data Ingestion into a Cloud Analytics Pipeline

Problem

A manufacturing company has 500 factory floor sensors publishing telemetry over MQTT (a lightweight IoT protocol). Their cloud analytics platform (Azure Synapse) only accepts data via HTTPS REST or Azure Event Hubs. Engineers lack a way to bridge the protocol gap without writing a custom server that must be maintained indefinitely.

Solution

Azure IoT Hub acts as middleware, accepting MQTT connections from sensors and translating device telemetry into Azure Event Hub-compatible messages. IoT Hub's message routing rules filter and enrich messages with device metadata before forwarding to downstream analytics.

Implementation

1. Register all 500 sensors in the Azure IoT Hub Device Registry with X.509 certificates for mutual TLS authentication over MQTT port 8883.
2. Configure IoT Hub message routing rules to filter high-temperature alerts (temperature > 85°C) to a dedicated 'critical-alerts' Event Hub endpoint and route all telemetry to a 'raw-telemetry' endpoint.
3. Enable IoT Hub message enrichment to stamp each message with device location, production line ID, and firmware version from the Device Twin registry before forwarding.
4. Connect Azure Stream Analytics to the 'raw-telemetry' Event Hub to perform 5-minute tumbling window aggregations and write results to Azure Synapse Analytics tables.
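The 5-minute tumbling window in step 4 is easy to picture in plain Python. In the real pipeline Stream Analytics does this with a SQL-like query; the epoch-second timestamps and (timestamp, temperature) pairs below are illustrative.

```python
# Sketch of a 5-minute tumbling-window average over sensor telemetry.
# Readings are (epoch_seconds, temperature) pairs; each reading falls
# into exactly one non-overlapping 300-second window.

from collections import defaultdict

WINDOW = 300  # 5 minutes in seconds

def tumbling_avg(readings):
    """Average temperature per tumbling window, keyed by window start."""
    buckets = defaultdict(list)
    for ts, temp in readings:
        buckets[ts // WINDOW * WINDOW].append(temp)
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}

readings = [(0, 80.0), (120, 90.0), (310, 70.0)]
print(tumbling_avg(readings))  # → {0: 85.0, 300: 70.0}
```

Tumbling (as opposed to sliding) windows never overlap, so each reading is counted once, which keeps downstream Synapse aggregates additive.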

Expected Outcome

500 MQTT sensors are integrated with zero custom server code; critical alerts reach the operations dashboard within 3 seconds; device metadata enrichment eliminates a manual join step that previously took analysts 2 hours per report.

Best Practices

Define and Version Middleware API Contracts Before Writing Transformation Logic

Middleware sits at the intersection of two systems, making its interface the most critical contract in the integration. Defining the input and output schemas (using OpenAPI, Avro, or JSON Schema) before building transformation logic prevents both upstream and downstream teams from making incompatible assumptions. Versioning these contracts (v1, v2) allows non-breaking evolution without forcing simultaneous deployments across all connected systems.

✓ Do: Publish an OpenAPI 3.0 spec or Avro schema to a shared schema registry or developer portal before implementing any transformation code, and increment the version number when making breaking changes to field names or data types.
✗ Don't: Hardcode field mappings directly in application code on either side of the middleware or assume the middleware's output format is 'understood'; undocumented implicit contracts break silently when either system is updated.
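A minimal sketch of what a versioned contract check looks like, using a plain field-and-type table; real deployments enforce this with OpenAPI or Avro tooling, and the v1/v2 schemas below are illustrative. Note that v2 only adds a field, so v1 payloads are the ones that fail v2 validation, never the reverse direction of the integration.

```python
# Sketch of validating a payload against a versioned contract before it
# crosses the middleware boundary. The schemas are illustrative.

CONTRACTS = {
    "v1": {"id": str, "sku": str, "price": float},
    "v2": {"id": str, "sku": str, "price": float, "currency": str},  # additive
}

def conforms(payload: dict, version: str) -> bool:
    """True if every field the contract requires is present with the right type."""
    schema = CONTRACTS[version]
    return all(isinstance(payload.get(f), t) for f, t in schema.items())

ok = conforms({"id": "1", "sku": "A-1", "price": 9.5}, "v1")     # valid v1
stale = conforms({"id": "1", "sku": "A-1", "price": 9.5}, "v2")  # missing currency
```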

Implement Idempotent Message Handling to Prevent Duplicate Processing

Middleware that uses message queues or event streams (Kafka, RabbitMQ, SQS) operates under at-least-once delivery guarantees, meaning the same message may be delivered multiple times during retries or failover. Without idempotency, a payment processed twice or an order created twice causes real business damage. Idempotency keys (a unique message ID checked against a deduplication store) ensure repeated delivery of the same message has no additional effect.

✓ Do: Assign a globally unique idempotency key (e.g., UUID generated at the source) to every message, and have the middleware consumer check a Redis or DynamoDB deduplication cache before processing — skip and acknowledge duplicates.
✗ Don't: Assume your message broker guarantees exactly-once delivery by default and skip deduplication logic; even Kafka's exactly-once semantics require careful producer and consumer configuration and are frequently misconfigured in practice.
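The dedup-cache check described above can be sketched in a few lines. An in-memory set stands in for the Redis or DynamoDB store, and the side effect is a list append; both are illustrative placeholders.

```python
# Sketch of idempotent message handling. A set stands in for the
# Redis/DynamoDB dedup cache; appending to `processed` stands in for
# the real side effect (a charge, an upsert, ...).

import uuid

seen = set()        # dedup cache stand-in
processed = []      # side-effect stand-in

def handle(message: dict) -> bool:
    """Process a message once; skip duplicates. Returns True if processed."""
    key = message["idempotency_key"]
    if key in seen:
        return False            # redelivery: acknowledge, no side effects
    seen.add(key)
    processed.append(message)
    return True

msg = {"idempotency_key": str(uuid.uuid4()), "amount": 100}
handle(msg)   # first delivery: processed
handle(msg)   # at-least-once redelivery: skipped
```

In production the cache needs an expiry (e.g., a TTL longer than the broker's maximum redelivery window) so it does not grow without bound.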

Centralize Error Handling and Dead-Letter Queue Routing at the Middleware Layer

When a downstream system is unavailable or rejects a transformed message, the middleware must handle the failure gracefully rather than silently dropping data or crashing. Dead-letter queues (DLQs) capture failed messages with their original payload, error reason, and retry count, enabling operations teams to inspect, fix, and replay them without data loss. Centralizing this logic in the middleware prevents each downstream service from implementing its own error handling inconsistently.

✓ Do: Configure a dedicated DLQ for each integration path (e.g., 'orders.payments.dlq'), include the original message payload, error code, timestamp, and retry count in DLQ messages, and set up PagerDuty or Slack alerts when DLQ depth exceeds a threshold.
✗ Don't: Configure middleware to silently discard messages that fail transformation or delivery; 'fire and forget' middleware that drops failed messages makes data reconciliation impossible and erodes trust in the integration.
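The retry-then-DLQ pattern can be sketched as follows. A list stands in for the DLQ topic, and the retry budget and simulated downstream failure are illustrative assumptions.

```python
# Sketch of centralized DLQ routing: retry delivery, and on exhaustion
# capture the original payload plus failure context so operators can
# inspect and replay. A list stands in for the DLQ topic.

import time

MAX_RETRIES = 3

def deliver_or_dlq(message: dict, deliver, dlq: list) -> None:
    """Attempt delivery up to MAX_RETRIES times, then route to the DLQ."""
    for _attempt in range(MAX_RETRIES):
        try:
            deliver(message)
            return
        except Exception as exc:
            last_error = str(exc)
    dlq.append({
        "payload": message,        # original payload, never discarded
        "error": last_error,
        "retries": MAX_RETRIES,
        "failed_at": time.time(),
    })

def always_down(_msg):             # simulated unavailable downstream system
    raise ConnectionError("payments-svc unreachable")

dlq = []
deliver_or_dlq({"orderId": "o-1"}, always_down, dlq)
```

A real implementation would also add backoff between attempts and emit an alert when DLQ depth crosses the threshold mentioned above.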

Log Correlation IDs Across Every System the Middleware Touches

When a request passes through middleware and touches multiple systems (frontend → API Gateway → legacy ERP → database), debugging failures requires tracing the request across all system logs. Without a shared correlation ID propagated through every hop, engineers spend hours matching timestamps across log files from different systems. Middleware is the ideal place to generate and inject this ID because it sits at the boundary of all connected systems.

✓ Do: Generate a UUID correlation ID at the first middleware entry point (or accept one from the client via X-Correlation-ID header), log it with every middleware operation, and forward it as a header or message attribute to all downstream systems so their logs reference the same ID.
✗ Don't: Allow each system to generate its own independent request ID; disconnected IDs mean a single user-facing error requires manually correlating four separate log streams by timestamp, which is error-prone and slow during incidents.
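The accept-or-mint behavior at the middleware entry point is small enough to show whole. The header name matches the X-Correlation-ID convention above; the function is a sketch of the idea, not a specific framework's API.

```python
# Sketch of correlation-ID handling at the middleware entry point:
# honor the client's X-Correlation-ID if present, otherwise mint one,
# and forward the resulting headers on every downstream hop.

import uuid

def with_correlation_id(headers: dict) -> dict:
    """Return a copy of the headers guaranteed to carry X-Correlation-ID."""
    out = dict(headers)
    out.setdefault("X-Correlation-ID", str(uuid.uuid4()))
    return out

client = with_correlation_id({"X-Correlation-ID": "abc-123"})  # preserved
fresh = with_correlation_id({})                                # minted here
```

Every log line the middleware emits, and every downstream call it makes, should then include this single ID so one grep reconstructs the whole request path.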

Isolate Middleware Configuration from Business Logic Using Environment-Specific Config Maps

Middleware often connects to different endpoints in development, staging, and production environments (different SAP hostnames, different Kafka brokers, different OAuth2 token URLs). Hardcoding these values in transformation logic or deployment scripts creates environments that diverge silently and makes promoting middleware changes from staging to production risky. Externalizing all environment-specific values into config maps or secret managers keeps the middleware logic itself environment-agnostic.

✓ Do: Store all environment-specific values (upstream URLs, API keys, topic names, timeout values) in Kubernetes ConfigMaps and Secrets or AWS Parameter Store, and inject them as environment variables at runtime so the same middleware container image runs in all environments.
✗ Don't: Embed environment-specific hostnames, credentials, or topic names directly in middleware source code or Docker images; this forces a code change and rebuild for every environment promotion and risks accidentally pointing staging middleware at production systems.
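On the middleware side, externalized configuration typically reduces to reading the process environment at startup and failing fast when required values are missing. The variable names below are illustrative; the values would be injected from a ConfigMap, Secret, or Parameter Store.

```python
# Sketch of environment-agnostic startup config: every environment-
# specific value comes from the process environment, never from source.
# Variable names are illustrative assumptions.

import os

def load_config(env=os.environ) -> dict:
    missing = [k for k in ("UPSTREAM_URL", "KAFKA_TOPIC") if k not in env]
    if missing:
        # Fail fast at startup instead of running against the wrong system.
        raise RuntimeError(f"missing required config: {missing}")
    return {
        "upstream_url": env["UPSTREAM_URL"],
        "topic": env["KAFKA_TOPIC"],
        "timeout_s": float(env.get("TIMEOUT_S", "5")),  # optional, with default
    }

cfg = load_config({"UPSTREAM_URL": "https://sap.staging.internal",
                   "KAFKA_TOPIC": "crm.lead.status"})
```

Because the image contains no environment-specific values, the exact container promoted from staging is the one that runs in production.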


Build Better Documentation with Docsie

Join thousands of teams creating outstanding documentation

Start Free Trial