Overview
A US-based insurance organization sought to modernize its data governance framework by migrating from Hive Metastore to Databricks Unity Catalog. The scale was significant, spanning approximately 10,000 tables, refactoring 800 Databricks notebooks, and updating 250 Azure Data Factory pipelines, all while transitioning to a standardized three-level namespace model.More
A US-based insurance organization sought to modernize its data governance framework by migrating from Hive Metastore to Databricks Unity Catalog. The scale was significant, spanning approximately 10,000 tables, refactoring 800 Databricks notebooks, and updating 250 Azure Data Factory pipelines, all while transitioning to a standardized three-level namespace model. By co-creating this transformation, Tech Mahindra executed a seamless, controlled, metadata-driven migration that unified governance, improved auditability, and reduced operational risk, enabling scalable, compliant, and business-aligned data access across analytics, engineering, and reporting platforms.
LessIndustry Challenge
he client is an insurance giant in the US with a strong global presence. The organization, limited by legacy systems, faced increasing complexity in managing and governing its data ecosystem.
Key challenges included:
- A large-scale Hive Metastore environment with approximately 10,000 tables
- A two-level namespace that limited governance and scalability
- Extensive hard-coded references across notebooks, pipelines, and reports
- Downstream dependency misalignment across SQL Server and Power BI
- Limited automation for complex migration patterns at scale
Our Approach and Solution
Tech Mahindra designed and delivered a strategic migration involving:
- End-to-end Unity Catalog enablement with business-aligned catalogs and schemas
- Deep clone–based migration preserving Delta history and metadata
- Conversion of managed tables to external tables with standardized storage paths
- Regex-driven refactoring of Databricks notebooks and ADF JSON pipelines
- Comprehensive downstream alignment across SQL Server objects and Power BI assets
- Multi-layer validation across data, code, pipelines, and reports
Business and Community Impact
The engagement delivered extended impact:
- Migration of approximately 10,000 tables to Unity Catalog
- Standardized three-level namespace across all data assets
- Centralized governance with fine-grained access control
- Improved auditability, compliance, and reporting accuracy
- Reduced operational risk and long-term data management costs