AWS: Asynchronous Event Ingestion and Processing Architecture

This documentation outlines the Asynchronous Event Ingestion and Processing Architecture designed for high-scale webhook integration from clients or third-party providers.

  1. Overview
  2. Architecture
  3. High-Level Architecture Diagram
  4. Component Specifications
  5. Request Flow Sequence
  6. Architectural Justification: Why Asynchronous Validation?
  7. Future enhancement
  8. Code
  9. References

1. Overview

The architecture follows a Decoupled Producer-Consumer pattern. Its primary objective is to provide a highly available entry point that captures external events with minimal latency and persists them durably in a queue before any business logic or deep validation runs.

2. Architecture

Component          AWS
Entry Point        API Gateway
Authentication     Lambda
Ingestion          Lambda
Messaging/Queue    SQS
Worker/Processor   Lambda
Scaling            Automatic (Scale-to-Zero)

3. High-Level Architecture Diagram


[Figure: High-Level Architecture Diagram]

4. Component Specifications

1. API Gateway

The entry point for all incoming webhook requests. Role: Acts as the managed interface for the system.

  • Key Responsibilities: Terminating TLS, request routing, and basic protocol validation.
  • Design Choice: By using API Gateway, we offload authentication and throttling concerns, ensuring the underlying compute resources are only used for legitimate traffic.

2. Auth Lambda (Authorizer)

A dedicated function for request validation.

  • Role: Performs security checks (e.g., verifying Shopify HMAC signatures or API keys).
  • Interaction: If validation succeeds, it returns an IAM policy allowing the API Gateway to invoke the next stage. If it fails, the request is rejected at the edge with a 401 Unauthorized.
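As an illustration of the signature check mentioned above, here is a minimal sketch of HMAC verification in TypeScript. It assumes a Shopify-style scheme in which the provider sends a base64-encoded HMAC-SHA256 digest of the raw request body in a header; the function names and the secret handling are hypothetical. It also assumes the raw body is available to the caller: API Gateway Lambda authorizers receive headers and request metadata but not the body, so a body-based check may need to run in the first function that sees the payload.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Computes the base64-encoded HMAC-SHA256 digest of the raw request body.
// `secret` is the shared webhook secret (how it is stored is out of scope here).
export function computeHmac(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("base64");
}

// Returns true only when the signature header matches the digest of the body.
// `headerValue` is the value of the provider's signature header, if present.
export function isValidSignature(
  rawBody: string,
  headerValue: string | undefined,
  secret: string
): boolean {
  if (!headerValue) return false;
  const expected = Buffer.from(computeHmac(rawBody, secret), "utf8");
  const received = Buffer.from(headerValue, "utf8");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct. The digest must be computed over the exact bytes received, before any JSON parsing or re-serialization.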

3. Ingestion / Producer Lambda

The ingestion layer is designed for speed and reliability.

  • Role: Receives the raw payload from the API Gateway and pushes it to the designated message queue.
  • Validation Strategy: This layer uses Shallow Validation. It checks if the payload is valid JSON but does not enforce a strict schema (DTO). This ensures that if the provider adds new fields unexpectedly, the event is still captured.
  • Outcome: Once the message is in SQS, it returns a 202 Accepted to the client.
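The shallow-validation step described above could look roughly like the following sketch. The `Enqueue` type is a hypothetical stand-in for whatever wraps the SQS send call, and the 400 response for unparseable bodies is an assumption, since the document does not state what is returned in that case.

```typescript
// Hypothetical port: in a real deployment this would wrap an SQS SendMessage call.
type Enqueue = (messageBody: string) => Promise<void>;

interface HttpResponse {
  statusCode: number;
  body: string;
}

// Shallow validation: the body only has to parse as JSON; no schema is enforced.
export function isShallowValid(rawBody: string | null | undefined): boolean {
  if (!rawBody) return false;
  try {
    JSON.parse(rawBody);
    return true;
  } catch {
    return false;
  }
}

// Enqueues the untouched raw payload, then acknowledges with 202 Accepted.
export async function ingest(
  rawBody: string | null | undefined,
  enqueue: Enqueue
): Promise<HttpResponse> {
  if (!isShallowValid(rawBody)) {
    // Assumption: malformed JSON is rejected synchronously with a 400.
    return { statusCode: 400, body: JSON.stringify({ error: "Body is not valid JSON" }) };
  }
  // The 202 is returned only after the queue has accepted the message.
  await enqueue(rawBody as string);
  return { statusCode: 202, body: JSON.stringify({ status: "accepted" }) };
}
```

Forwarding the raw string rather than the parsed object keeps unknown fields intact for the worker and for any later replay.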

4. SQS FIFO (Simple Queue Service)

The durability and ordering layer.

  • Role: Buffers events and ensures they are processed in the order they were received (First-In-First-Out). In SQS FIFO queues, this ordering is guaranteed within each message group.
  • Benefit: Decouples the ingestion speed from the processing speed, protecting downstream services from traffic spikes.
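To make the FIFO behavior concrete, the sketch below builds the parameters a producer might pass when sending to a FIFO queue. `MessageGroupId` and `MessageDeduplicationId` are the real SQS parameter names; the helper name, the choice of group key, and the use of a content hash as the deduplication ID are assumptions for illustration.

```typescript
import { createHash } from "crypto";

// Shape of the fields relevant to a FIFO SendMessage request.
interface FifoSendParams {
  QueueUrl: string;
  MessageBody: string;
  MessageGroupId: string;
  MessageDeduplicationId: string;
}

// Builds send parameters for one webhook payload.
// `groupId` decides the ordering scope: messages sharing a group ID are
// delivered in order, while different groups can be processed in parallel.
export function buildFifoMessage(
  queueUrl: string,
  rawBody: string,
  groupId: string
): FifoSendParams {
  return {
    QueueUrl: queueUrl,
    MessageBody: rawBody,
    MessageGroupId: groupId,
    // Assumption: identical payloads are treated as duplicates, so a SHA-256
    // hash of the body (64 hex characters) serves as the deduplication ID.
    MessageDeduplicationId: createHash("sha256").update(rawBody, "utf8").digest("hex"),
  };
}
```

A coarse group ID (for example, one per provider) gives strict global ordering but limits throughput; a finer one (for example, one per shop or per resource) trades global ordering for parallelism.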

5. Consumer / Worker Lambda

The core business logic and validation layer.

  • Role: Triggered by messages in SQS to perform heavy lifting.
  • Validation Strategy: This layer performs Deep Validation (Schema/DTO checks). It maps the incoming data to the internal system requirements.
  • Processing: If validation passes, it applies the business logic, updating the database or triggering downstream business workflows.
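A worker along these lines might be sketched as follows. The `OrderEvent` DTO and its fields are hypothetical, and the event and response interfaces are trimmed-down versions of the SQS-to-Lambda shapes. The sketch assumes the event source mapping has partial batch responses (`ReportBatchItemFailures`) enabled and, because the queue is FIFO, stops at the first failure and reports that message and everything after it so that ordering is preserved.

```typescript
// Trimmed-down shapes of the SQS event and the partial batch response.
interface SqsRecord { messageId: string; body: string; }
interface SqsEvent { Records: SqsRecord[]; }
interface BatchResponse { batchItemFailures: { itemIdentifier: string }[]; }

// Hypothetical internal DTO used only for this illustration.
interface OrderEvent { id: string; total: number; }

// Deep validation: enforce the schema the internal system requires.
export function parseOrderEvent(body: string): OrderEvent {
  const data: unknown = JSON.parse(body);
  if (typeof data !== "object" || data === null) throw new Error("Schema mismatch: not an object");
  const fields = data as Record<string, unknown>;
  if (typeof fields.id !== "string" || typeof fields.total !== "number") {
    throw new Error("Schema mismatch: id or total missing or of wrong type");
  }
  // Unknown extra fields are ignored rather than rejected.
  return { id: fields.id, total: fields.total };
}

// Processes records in order; `handle` stands in for the business logic.
export function processBatch(event: SqsEvent, handle: (e: OrderEvent) => void): BatchResponse {
  for (let i = 0; i < event.Records.length; i++) {
    try {
      handle(parseOrderEvent(event.Records[i].body));
    } catch {
      // FIFO: stop at the first failure and report it plus all unprocessed
      // records, so later messages are not handled out of order.
      const failed = event.Records.slice(i).map((r) => ({ itemIdentifier: r.messageId }));
      return { batchItemFailures: failed };
    }
  }
  return { batchItemFailures: [] };
}
```

Messages reported as failures become visible again and are retried; once the queue's maximum receive count is exceeded, they move to the DLQ.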

6. DLQ (Dead Letter Queue) & Fix-and-Replay

The resilience and recovery mechanism.

  • Role: Captures events that fail processing in the Worker Lambda (e.g., schema mismatches or transient database errors).
  • Fix-and-Replay Path: Allows developers to inspect failed events in the DLQ, fix the underlying Worker code or schema, and then re-inject the messages for processing by the Worker without losing data.
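One possible shape for the replay step is sketched below. The `ReplayPorts` interface is hypothetical: its three methods stand in for receiving from the DLQ, re-sending to the source queue, and deleting from the DLQ. Re-sending to the source queue (rather than invoking the Worker directly) is an assumption of this sketch.

```typescript
interface DlqMessage {
  body: string;
  receiptHandle: string;
}

// Hypothetical ports wrapping the queue operations needed for a replay.
interface ReplayPorts {
  receive(): Promise<DlqMessage[]>;              // read a batch from the DLQ
  resend(body: string): Promise<void>;           // send back to the source queue
  remove(receiptHandle: string): Promise<void>;  // delete from the DLQ
}

// Moves messages from the DLQ back to the source queue and returns the count.
// `maxMessages` is a soft cap that is checked between batches.
export async function replayDlq(ports: ReplayPorts, maxMessages = 100): Promise<number> {
  let replayed = 0;
  while (replayed < maxMessages) {
    const batch = await ports.receive();
    if (batch.length === 0) break; // DLQ drained
    for (const msg of batch) {
      // Re-send first, delete second: a crash in between duplicates the
      // message instead of losing it.
      await ports.resend(msg.body);
      await ports.remove(msg.receiptHandle);
      replayed++;
    }
  }
  return replayed;
}
```

Because a crash between re-send and delete can produce a duplicate, the Worker should tolerate seeing the same event more than once.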

5. Request Flow Sequence

  1. Ingestion: The Client sends a webhook. API Gateway triggers the Auth Lambda.
  2. Verification: Upon successful authentication, API Gateway passes the request to the Ingestion Lambda.
  3. Persistence: The Ingestion Lambda performs a structural check and sends the payload to SQS FIFO.
  4. Acknowledgement: The system returns an immediate 202 Accepted to the client.
  5. Processing: SQS triggers the Worker Lambda.
  6. Deep Validation: The Worker validates the schema.
    • If Valid: The event is processed.
    • If Invalid: The event is moved to the DLQ.

6. Architectural Justification: Why Asynchronous Validation?

This design prioritizes Durability over Immediate Rejection.

  • Resilience to External Changes: Third-party webhooks (like Shopify) are subject to change. If we enforced strict validation at the API Gateway (as suggested in the peer review), a new, unmapped field from Shopify would cause a 400 Bad Request, and the data would be lost forever.
  • Reliability: By accepting the data first, we ensure we have a “copy of record.” If validation fails in the Worker, we can fix our code and replay the event from the DLQ.
  • Client Experience: Webhook providers require fast response times to prevent retries and back-offs. This architecture minimizes the synchronous work, ensuring we meet these strict time constraints.

7. Future enhancement

Entry Point: Global External HTTP(S) Load Balancer (ALB equivalent)

  • Static IP: Provides a single, static Anycast IP address to whitelist for any third party.

8. Code

A fully serverless, async message processing system built with AWS Lambda, API Gateway, and SQS, deployed via Terraform and TypeScript.

9. References