Data engineers and pipeline managers know that producing data lineage – end-to-end pipeline metadata instrumented at runtime or parsed at design time – is a heavy lift without a shared standard for lineage metadata. It requires duplication of effort across pipeline tooling, and deployment of new tools can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.
Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model can be extended by defining facets that enrich those entities.
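To make the model concrete, here is a minimal sketch of a run event built as a plain Python dictionary. The top-level keys (`eventType`, `eventTime`, `producer`, `run`, `job`, `inputs`, `outputs`) and the `schema` dataset facet follow the published spec as we understand it. The helper `build_run_event`, the producer URL, and all namespaces and names are hypothetical, and the bookkeeping fields the spec expects inside each facet are omitted for brevity.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, run_id, inputs, outputs):
    """Assemble a minimal OpenLineage-style run event (hypothetical helper)."""
    return {
        "eventType": event_type,  # e.g. "START", "COMPLETE", "FAIL"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/hypothetical-producer",  # placeholder
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": inputs,
        "outputs": outputs,
    }


# A dataset is identified by namespace + name; facets enrich it with extra metadata.
orders = {
    "namespace": "postgres://db.example.com:5432",  # illustrative only
    "name": "shop.public.orders",
    "facets": {
        "schema": {
            "fields": [
                {"name": "order_id", "type": "INTEGER"},
                {"name": "total", "type": "DECIMAL"},
            ]
        }
    },
}

daily_revenue = {
    "namespace": "postgres://db.example.com:5432",
    "name": "shop.analytics.daily_revenue",
}

event = build_run_event(
    event_type="COMPLETE",
    job_namespace="example-scheduler",
    job_name="revenue.aggregate_daily",
    run_id=str(uuid.uuid4()),
    inputs=[orders],
    outputs=[daily_revenue],
)

print(json.dumps(event, indent=2))
```

Because every producer names jobs and datasets the same way, a consumer can stitch events from different tools into one lineage graph by matching on namespace and name.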
Join us at the Datadog Headquarters in NYC on November 19th at 6:00 pm to learn more about the OpenLineage spec and integrations. You’ll meet other members of the ecosystem, learn about the project’s goals and fundamental design, and participate in a robust discussion about the future of the project.
Agenda:
- OpenLineage overview & project update
Julien Le Dem (Project lead) & Harel Shein (TSC member), OpenLineage
- Interpreting OpenLineage events to power Astro Observe
Julian LaNeve (CTO) & Christine Shen (Software Engineer), Astronomer
- How OpenLineage fits into the Datadog ecosystem
Igor Kravchenko (Staff Engineer) & Ryan Warrier (Sr. Product Manager), Datadog
Interpreting OpenLineage events to power Astro Observe
Astronomer recently announced Observe, an observability platform powered by OpenLineage. We've created experiences designed around the reliability and efficiency of datasets delivered by Apache Airflow. To do so, we leverage the Airflow OpenLineage integration to get telemetry and lineage out of Airflow regardless of where it's running. We've created a processor framework that runs custom business logic depending on the event type and metadata included. In this talk, we'll dig under the covers and show our processor framework that allows us to easily extend the Observe platform into new use cases.
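The abstract does not describe the framework's internals, so the following is only a generic sketch of dispatching on event type and is not Astronomer's implementation. Every name in it (`ProcessorRegistry`, `on`, `process`, and the two handlers) is hypothetical.

```python
class ProcessorRegistry:
    """Routes lineage events to handlers registered for their event type."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_type):
        """Decorator that registers a handler for one event type."""
        def decorator(func):
            self._handlers.setdefault(event_type, []).append(func)
            return func
        return decorator

    def process(self, event):
        """Run every handler registered for the event's type; collect results."""
        handlers = self._handlers.get(event.get("eventType"), [])
        return [handler(event) for handler in handlers]


registry = ProcessorRegistry()


@registry.on("COMPLETE")
def record_dataset_updates(event):
    # Illustrative business logic: note which datasets were just written.
    names = [dataset["name"] for dataset in event.get("outputs", [])]
    return {"updated": names, "at": event["eventTime"]}


@registry.on("FAIL")
def flag_failed_job(event):
    # Illustrative business logic: surface the job that failed.
    return {"failed_job": event["job"]["name"]}


sample = {
    "eventType": "COMPLETE",
    "eventTime": "2024-11-19T18:00:00Z",
    "job": {"namespace": "airflow", "name": "daily_report.load"},
    "outputs": [{"namespace": "warehouse", "name": "reports.daily"}],
}

results = registry.process(sample)
print(results)
```

Keeping handlers in a registry keyed by event type means a new use case can be added by registering another function, without touching the ingestion path.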
How OpenLineage fits into the Datadog ecosystem
This talk explores how Datadog integrates with OpenLineage to provide observability solutions for data pipelines. The first part is a technical walkthrough of how Datadog ingests, processes, and stores OpenLineage events. The second part gives a brief overview of Datadog’s data observability capabilities, explains how Datadog sees OpenLineage providing value to users, and closes with a demo of Airflow and dbt monitoring built on OpenLineage events.
Thank you to our sponsors:
Datadog
LFAI & Data Foundation
Enter at 620 8th Ave, check in at the desk on the left, and take the elevators up to the 46th floor.