This session presents a reference architecture for a multi-modal data lakehouse on AWS that closes this gap. We combine Amazon Bedrock Data Automation (BDA) for intelligent content extraction with Apache Iceberg for scalable, ACID-compliant data management, turning documents, images, and media into queryable, governed datasets alongside traditional structured data.
We walk through how to design a multi-modal ingestion pipeline, extract structured signals from unstructured content using AI, store everything in Iceberg tables with schema evolution and time travel, and query across modalities using a single engine. Real-world patterns from customer support analytics, retail intelligence, and compliance document processing illustrate the architecture in action. Attendees leave with a clear blueprint for moving beyond traditional data lakes toward AI-ready, multi-modal data platforms.
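As a rough illustration of why a shared table shape makes cross-modal queries possible, the sketch below flattens extraction results from a document, a call recording, and an image into one row format and filters them with a single predicate. It is a plain-Python stand-in under stated assumptions: the `AssetRow` columns, the `to_row` helper, the sample extraction dictionaries, and the `s3://example-bucket/...` paths are all hypothetical, and nothing here calls Bedrock Data Automation or Iceberg APIs.

```python
import json
from dataclasses import dataclass
from typing import Any


# Hypothetical unified row shape; column names are illustrative only,
# not a schema prescribed by Bedrock Data Automation or Apache Iceberg.
@dataclass
class AssetRow:
    asset_id: str
    modality: str           # "document", "image", "audio", or "video"
    source_uri: str
    summary: str
    attributes: str = "{}"  # modality-specific fields kept as a JSON string


def to_row(asset_id: str, modality: str, source_uri: str,
           extraction: dict[str, Any]) -> AssetRow:
    """Flatten one (made-up) extraction result into the shared row shape."""
    extraction = dict(extraction)  # copy so the caller's dict is not mutated
    summary = extraction.pop("summary", "")
    return AssetRow(asset_id, modality, source_uri, summary,
                    json.dumps(extraction, sort_keys=True))


# Sample inputs are invented for illustration.
rows = [
    to_row("a1", "document", "s3://example-bucket/claims/claim-001.pdf",
           {"summary": "Damaged item refund request", "doc_type": "claim"}),
    to_row("a2", "audio", "s3://example-bucket/calls/call-417.wav",
           {"summary": "Customer asks about refund status", "duration_s": 184}),
    to_row("a3", "image", "s3://example-bucket/shelves/aisle-3.jpg",
           {"summary": "Shelf photo with two empty facings"}),
]

# One predicate spans every modality because all rows share the same columns.
refund_related = [r.asset_id for r in rows if "refund" in r.summary.lower()]
print(refund_related)
```

In a real deployment the rows would presumably land in an Iceberg table and the filter would be a SQL predicate run by the query engine; keeping modality-specific fields in a single semi-structured column is one possible design choice among several, not necessarily the one presented in the session.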
Gopalakrishnan Marimuthu is a seasoned Cloud Application Architect at Amazon Web Services with over 20 years of cross-domain industry experience spanning telecommunications, supply chain, banking, and healthcare.