Designed and implemented a scalable, production-grade web analytics pipeline that ingests raw Google Analytics 4 (GA4) event data into Databricks using a Structured Streaming pipeline. Used Delta Lake and the Medallion Architecture (Bronze–Silver–Gold) to transform nested GA4 data into optimized, analytics-ready datasets that power real-time BI dashboards.
• GA4 event ingestion using a Structured Streaming ingestion pipeline
• Medallion Architecture implementation (Bronze → Silver → Gold)
• Flattening of complex nested GA4 schema
• Schema evolution & late-arriving event handling
• Delta Lake ACID transactions & checkpointing
• Performance optimization using partitioning & Z-Ordering
• BI-ready KPI datasets powering Power BI dashboards
• Integration of the Silver & Gold layers with Databricks Genie
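
A minimal sketch of the schema-flattening step, written in plain Python so it runs without a Spark cluster. It assumes events follow the GA4 BigQuery export layout, where `event_params` is an array of key/value records with typed value fields. The function name `flatten_event`, the `param_` column prefix, and the sample event are hypothetical and are not taken from the project itself.

```python
from typing import Any

# Typed value fields the GA4 BigQuery export uses for each event parameter.
_VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def flatten_event(event: dict[str, Any]) -> dict[str, Any]:
    """Turn one nested GA4 event into a flat, column-like dict (hypothetical helper)."""
    # Copy every top-level field except the nested parameter array.
    flat = {k: v for k, v in event.items() if k != "event_params"}
    for param in event.get("event_params") or []:
        value = param.get("value") or {}
        # Keep the first typed field that is populated; None if all are empty.
        flat[f"param_{param['key']}"] = next(
            (value[f] for f in _VALUE_FIELDS if value.get(f) is not None),
            None,
        )
    return flat


# Illustrative event only; the values are made up for the example.
sample_event = {
    "event_name": "page_view",
    "event_timestamp": 1700000000000000,
    "event_params": [
        {"key": "page_title", "value": {"string_value": "Home"}},
        {"key": "ga_session_id", "value": {"int_value": 1700000000}},
    ],
}

flat_event = flatten_event(sample_event)
```

In a Silver-layer job the same idea would typically be expressed with Spark DataFrame operations rather than per-row Python, so treat this as an illustration of the target shape, not the pipeline's actual code.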
✉ Reach me:
jharajnish@outlook.in