In the oil and gas sector, data is vast, complex, and highly interconnected. From exploration to production, every well, license, and facility carries relationships that often get buried inside traditional tabular systems.
Recently, I built a graph database from the Factpages dataset (official public data from the Norwegian Offshore Directorate). The goal was to model how different entities in oil and gas relate to each other and uncover insights that would otherwise remain hidden.
Why Graph Databases?
Traditional relational databases are great for structured records, but exploring connections is harder. A question such as "which operator managed a specific wellbore drilled from a facility in a given license area?" means joining several tables, and the queries quickly become complex and inefficient.
Graph databases like Neo4j are designed for this very challenge. They allow us to model, store, and query relationships directly, making it easier to:
- Visualize hierarchies between Areas, Licenses, Fields, and Wells.
- Identify operational dependencies and potential bottlenecks.
- Enable faster decision-making by showing a connected picture of the ecosystem.
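To make the idea of "following relationships" concrete, here is a small sketch in plain Python, with no database involved. It is not code from the project: the node identifiers are placeholders rather than real Factpages records, and the relationship directions are my assumption for illustration.

```python
from collections import defaultdict

# Toy graph as (start, relationship, end) triples.
# All identifiers are placeholders, not real Factpages records.
EDGES = [
    ("Area:A1", "HAS_LICENSE", "License:L1"),
    ("License:L1", "HAS_FIELD", "Field:F1"),
    ("Operator:O1", "MANAGES", "Field:F1"),
    ("License:L1", "HAS_FIELD", "Field:F2"),
    ("Operator:O2", "MANAGES", "Field:F2"),
    ("Area:A2", "HAS_LICENSE", "License:L2"),
]

# Index edges in both directions so each hop is a dictionary lookup.
outgoing = defaultdict(list)
incoming = defaultdict(list)
for start, rel, end in EDGES:
    outgoing[(start, rel)].append(end)
    incoming[(end, rel)].append(start)


def operators_for_area(area):
    """Follow Area -> License -> Field <- Operator, one hop at a time."""
    found = []
    for lic in outgoing[(area, "HAS_LICENSE")]:
        for field in outgoing[(lic, "HAS_FIELD")]:
            for op in incoming[(field, "MANAGES")]:
                found.append((op, field))
    return sorted(found)


print(operators_for_area("Area:A1"))
```

Each hop is a direct lookup from one node to its neighbours, which is roughly the access pattern a graph database gives you natively instead of a chain of table joins.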
The Model I Built
Using Neo4j Community Edition, I created nodes and relationships representing key oil and gas concepts:
Nodes
Area, License, Field, Discovery, Well, Wellbore, Drilling Facility, Facility Type, Operator, Base
Relationships
HAS_LICENSE, HAS_FIELD, HAS_DISCOVERY, HAS_WELLBORE, WAS_OPERATED_BY, MANAGES, HAS_SUPPLYBASE_OF, and more…
This model mirrors the real-world hierarchy:
- An Area contains Licenses
- A License has Fields
- A Field may have Discoveries and is managed by Operators
- Wells and Wellbores are tied to Facilities and Bases
Instead of looking at isolated spreadsheets, the entire ecosystem becomes a living map of connections.
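One way to picture this hierarchy in code is as a list of (start label, relationship type, end label) triples, from which Cypher MERGE statements can be generated. The sketch below is illustrative and not taken from the repository: the relationship types are the ones listed above, but the directions, the endpoints marked as assumed, and the `name` property used as a key are my assumptions.

```python
from typing import NamedTuple


class Rel(NamedTuple):
    start: str     # label of the start node
    rel_type: str  # relationship type
    end: str       # label of the end node


# Relationship types come from the model above. Directions, and the
# endpoints marked "assumed", are guesses made for illustration only.
SCHEMA = [
    Rel("Area", "HAS_LICENSE", "License"),
    Rel("License", "HAS_FIELD", "Field"),
    Rel("Field", "HAS_DISCOVERY", "Discovery"),
    Rel("Well", "HAS_WELLBORE", "Wellbore"),  # assumed endpoints
    Rel("Operator", "MANAGES", "Field"),      # assumed direction
]


def merge_statement(rel: Rel, key: str = "name") -> str:
    """Build a parameterised Cypher MERGE for one relationship type.

    `key` is a hypothetical identifying property; the real dataset may
    use a different one.
    """
    return (
        f"MERGE (a:{rel.start} {{{key}: $start}}) "
        f"MERGE (b:{rel.end} {{{key}: $end}}) "
        f"MERGE (a)-[:{rel.rel_type}]->(b)"
    )


for rel in SCHEMA:
    print(merge_statement(rel))
```

Using MERGE rather than CREATE means that loading the same row twice does not duplicate nodes or relationships, which is handy when the source files repeat parent entities on every row.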
Why Neo4j?
I specifically chose Neo4j for this project because:
- Ease of use – Cypher query language is intuitive.
- Visualization tools – Neo4j Bloom makes it simple to present relationships interactively.
- Integration capabilities – From RAG (Retrieval-Augmented Generation) to APIs, Neo4j fits modern AI-driven workflows.
- Community Edition – Free to experiment with, yet powerful enough for real projects.
- Certification – I am certified in Neo4j, which gave me confidence in modeling and querying.
Other graph databases exist, but Neo4j's maturity and ecosystem made it a natural fit.
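To give a feel for how Cypher reads, here is a hedged sketch of a multi-hop query wrapped in a small Python helper. The labels and relationship types come from the model described earlier; the relationship directions, the `name` property, and the helper itself are assumptions for illustration, not the project's actual code. The helper expects a driver object with an `execute_query` method, as provided by recent 5.x versions of the official Neo4j Python driver.

```python
# Cypher for: "which operators manage fields under licenses in a given area?"
# Relationship directions and the `name` property are assumptions.
QUERY = """
MATCH (a:Area {name: $area})-[:HAS_LICENSE]->(:License)
      -[:HAS_FIELD]->(f:Field)<-[:MANAGES]-(o:Operator)
RETURN DISTINCT o.name AS operator, f.name AS field
ORDER BY operator, field
"""


def operators_in_area(driver, area):
    """Run QUERY through a driver exposing `execute_query`.

    `driver` is assumed to behave like neo4j.Driver (5.x), whose
    execute_query returns a (records, summary, keys) triple.
    """
    records, _summary, _keys = driver.execute_query(
        QUERY, parameters_={"area": area}
    )
    return [(rec["operator"], rec["field"]) for rec in records]
```

With the official driver installed, a driver object would typically be created with `GraphDatabase.driver(uri, auth=(user, password))`, using whatever connection details your own Neo4j instance has. The pattern in the MATCH clause reads almost like the sentence it answers, which is a large part of why I find Cypher intuitive.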
Benefits for Oil & Gas Operators
By adopting graph databases, operators can:
- Gain a single connected view of licenses, fields, wells, and facilities.
- Quickly identify dependencies and optimize operational decisions.
- Support regulatory reporting with traceable relationships.
- Enable knowledge graphs that power AI applications in exploration and production.
Try It Yourself
This project is available on GitHub. Anyone with Docker can set it up — I've included a detailed README with step-by-step instructions.
🔗 View on GitHub
Why I Did This Project
At its core, this project reflects my passion and career direction:
- I genuinely enjoy working with data—wrangling, modeling, and finding patterns.
- It helps me increase my domain knowledge in oil and gas.
- It becomes part of my resume portfolio, strengthening my profile for future opportunities in data engineering and energy domain projects.
Have questions about implementing graph databases in your organization? Feel free to reach out — I'd love to discuss your use case.