Category: Courses
-
Introduction to data pipeline monitoring and alerting
A practical introduction to monitoring and alerting in data pipelines. Learn the core concepts, tools, and patterns that help engineers ensure reliability, detect failures early, and maintain confidence in their data systems.
-
Tools: AWS Athena Federation, Starburst, Trino
This post explores how AWS Athena Federation, Starburst, and Trino power federated data queries in 2025. Learn how these tools integrate across cloud and on-prem systems, their architectural strengths, and how enterprises leverage them for modern data lakehouse and data mesh analytics.
-
Tools: AWS Athena Federation, Starburst, Trino
A deep dive into AWS Athena Federation, Trino, and Starburst, the leading tools powering federated data querying in 2025. Learn how these engines unify analytics across S3, databases, and warehouses, their architectures, and when to choose each for modern data mesh and lakehouse environments.
-
Introduction to developer productivity fundamentals
This post explores the core principles of developer productivity, from mindset and habits to tools and metrics. Learn how modern developers structure their environments, automate workflows, and maintain sustainable focus in 2025 to deliver better software faster and with less stress.
-
Empirical: batch vs streaming stores
This empirical post explores the modern trade-offs between batch and streaming data stores. Using benchmarks from real-world systems like Spark, Flink, and Pinot, it examines performance, cost, and operational complexity in 2025. Learn how unified architectures and hybrid designs are shaping the next generation of data processing systems.
-
Best practices for evaluating clusters
Evaluating clustering models goes far beyond picking the highest silhouette score. This post explores modern best practices for evaluating clusters in unsupervised learning, combining internal and external validation metrics, visualization techniques, and domain-driven evaluation frameworks that leading data teams use in 2025 to ensure meaningful, actionable segmentation results.
-
Introduction to SOLID principles in Python
An introduction to SOLID principles for Python developers. Learn how to write cleaner, more modular, and maintainable code by applying the five foundational software design principles (SRP, OCP, LSP, ISP, and DIP) with practical Python examples and real-world best practices.
-
Tools: Evidently AI, WhyLabs
Evidently AI and WhyLabs are two leading tools shaping how teams monitor data drift and model health in production ML systems. This post explores their architectures, features, integrations, and best practices for using them together in modern data observability workflows.
