Digitally Drilling For Oil And Gas

Data is the lifeblood of modern Oil & Gas exploration and production (E&P). From the initial search through to the recovery of crude oil and natural gas, the upstream sector of the oil and gas industry must manage, analyze, and store digital information ranging from 3D and 4D seismic surveys and downhole sensor readings to high-resolution satellite images. To remain competitive and agile, strategies for lowering operating costs and increasing discovery and recovery rates through data-driven exploration must address seemingly contradictory goals:

  • Support higher resolution capture methods and new seismic formats
  • Accelerate analysis of massive amounts of unstructured data
  • Collect and analyze metadata from field sensors
  • Control rising data storage costs
  • Actively archive data to cloud or object storage while keeping it accessible indefinitely

In practice, Oil and Gas companies first drill virtually, in the digital domain, to determine how much to bid on a site and where to physically place a well. With billions at stake, geoscientists apply the most sophisticated modelling and simulation technology to ensure a high return on investment. This requires analyzing vast amounts of data and metadata drawn from both current and historical data points. It is not uncommon for a large exploration project to produce several petabytes of data. Scientific discovery in the digital domain pushes the boundaries of data management and requires a storage solution that can:

  • Accelerate modern data-intensive seismic and E&P applications
  • Reduce costs by automating the flow of data across all storage
  • Integrate with cloud or object storage as an active archive


DataSphere is a metadata engine designed to decouple applications from the physical location of their data, replacing the architecturally rigid relationship between the two. Offloading metadata access to DataSphere delivers predictable, low-latency operations because metadata operations no longer get “stuck” in the queue behind other data requests. Rather than waiting for sequential operations to complete, DataSphere can use parallel access through the latest optimizations of the standard NFS v4.2 protocol. NFS v4.2 significantly speeds up metadata and small-file operations by requiring fewer than half the protocol-specific network round trips that NFS v3 needs.
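To see why the number of round trips matters for metadata-heavy workloads, the following is a rough latency model rather than a measurement. The per-operation round-trip counts (5 and 2) and the 0.5 ms network round-trip time are illustrative assumptions chosen only to satisfy the "fewer than half" relationship described above. They are not figures from the NFS specifications or from DataSphere.

```python
# Rough model: time spent on the network for serialized metadata operations.
# All numeric inputs below are hypothetical and for illustration only.


def metadata_network_seconds(operations: int,
                             round_trips_per_op: int,
                             rtt_ms: float) -> float:
    """Network time, in seconds, if every round trip is issued one after another."""
    return operations * round_trips_per_op * rtt_ms / 1000.0


if __name__ == "__main__":
    small_files = 1_000_000   # hypothetical count of small metadata files
    rtt_ms = 0.5              # hypothetical network round-trip time

    baseline = metadata_network_seconds(small_files, 5, rtt_ms)   # assumed count
    reduced = metadata_network_seconds(small_files, 2, rtt_ms)    # assumed count

    print(f"more round trips:  {baseline:.0f} s")
    print(f"fewer round trips: {reduced:.0f} s")
    print(f"time saved:        {baseline - reduced:.0f} s")
```

Because the model is linear, halving the round trips per operation halves the network time, however many files are involved.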

Products | Primary Data

Figure 1 - The DataSphere architecture separates the metadata path from the data path so that data can be abstracted from its physical location.

DataSphere Extended Services (DSX) serve legacy NFS v3 and SMB environments and can be used in tandem with clients that use the latest NFS v4.2 advancements. Regardless of client type, DataSphere provides insight into how an application makes use of its data. With this knowledge, DataSphere can place or move files to different storage types or tiers without disrupting an application’s access, even while files are open and data is in flight. This enables IT to ensure that applications always meet E&P performance, price, and protection requirements.


Success or failure in finding oil and gas depends on the quantity and quality of exploration data. E&P applications have evolved with the deployment of more advanced sensors to provide a more accurate view of the underlying geology. Survey technologies such as orthogonal WAZ (wide-azimuth) reflection seismology use a multi-sensor array to build a high-resolution subsurface 3D model covering tens of thousands of square kilometers, thousands of meters under water. Full 3D acquisitions generate between 8 and 20 GB/s of seismic sensor data, totaling 250 TB to 1 PB per square kilometer. This seismic data consists of both extremely large data acquisition files and a very large number of smaller files containing metadata. For targeted physical surveys, individual 45-60 meter core samples are scanned at the drill site, cubic centimeter by cubic centimeter, producing hundreds of gigabytes of highly detailed images. Exploration data processing and analytics require low-latency storage to process these massive, unstructured files and datasets quickly, while high bandwidth is required to render and display results. Because storage demands differ from stage to stage, IT must quickly adapt where it places data to meet changing requirements during the exploration, interpretation, and analysis needed to make a production decision.
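As a back-of-the-envelope check on these figures, the short sketch below computes how long continuous acquisition at the quoted rates would take to accumulate the quoted volumes. It assumes decimal units (1 TB = 1000 GB, 1 PB = 1000 TB) and a sustained, uninterrupted data rate, neither of which the brief states explicitly.

```python
# Time to accumulate a data volume at a sustained acquisition rate.
# Assumption: decimal units (1 TB = 1000 GB, 1 PB = 1000 TB).

GB_PER_TB = 1000
TB_PER_PB = 1000


def hours_to_accumulate(volume_tb: float, rate_gb_per_s: float) -> float:
    """Hours of continuous acquisition needed to produce volume_tb terabytes."""
    seconds = volume_tb * GB_PER_TB / rate_gb_per_s
    return seconds / 3600


if __name__ == "__main__":
    for volume_tb in (250, 1 * TB_PER_PB):      # 250 TB and 1 PB
        for rate_gb_per_s in (8, 20):           # low and high end of the range
            hours = hours_to_accumulate(volume_tb, rate_gb_per_s)
            print(f"{volume_tb:>5} TB at {rate_gb_per_s:>2} GB/s: {hours:5.1f} hours")
```

Under these assumptions, 250 TB arrives in roughly 3.5 to 8.7 hours and 1 PB in roughly 14 to 35 hours, so storage has to absorb new data at this pace while earlier data is still being processed.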

DataSphere can non-disruptively place data on the storage best suited to application requirements at each step of the E&P pipeline. For example, DataSphere can place transient data, such as scratch and OS swap space, on NVMe flash in application servers, active application data on performance NAS tiers, and less frequently used data on capacity NAS or cloud tiers. Admins can even create volume groups for data that must meet compliance standards. These capabilities can turn multi-day E&P tasks into single-day tasks, greatly accelerating productivity. See the Primary Data brief on Data Migration for detailed information about how DataSphere optimizes performance.
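The placement example above can be pictured as a simple rule table. The following sketch is purely illustrative: `Tier`, `FileInfo`, `choose_tier`, and the 7-day and 90-day thresholds are hypothetical names and values invented here, not part of any DataSphere interface, and a real policy engine would weigh far more signals than last-access age.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SERVER_NVME = "NVMe flash in application servers"
    PERFORMANCE_NAS = "performance NAS"
    CAPACITY_NAS = "capacity NAS"
    CLOUD = "cloud or object storage"


@dataclass
class FileInfo:
    path: str
    transient: bool          # scratch or swap data (hypothetical flag)
    days_since_access: int   # age of the most recent access, in days


def choose_tier(f: FileInfo, active_days: int = 7, archive_days: int = 90) -> Tier:
    """Pick a tier from the example rules; both thresholds are assumptions."""
    if f.transient:
        return Tier.SERVER_NVME
    if f.days_since_access <= active_days:
        return Tier.PERFORMANCE_NAS
    if f.days_since_access <= archive_days:
        return Tier.CAPACITY_NAS
    return Tier.CLOUD


if __name__ == "__main__":
    samples = [
        FileInfo("/scratch/job42.tmp", transient=True, days_since_access=0),
        FileInfo("/projects/survey_a/stack.segy", transient=False, days_since_access=2),
        FileInfo("/projects/survey_b/stack.segy", transient=False, days_since_access=30),
        FileInfo("/projects/2009/survey_c.segy", transient=False, days_since_access=400),
    ]
    for f in samples:
        print(f"{f.path} -> {choose_tier(f).value}")
```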

Figure 2 - DataSphere places data on the ideal storage to meet application requirements for each stage of the E&P pipeline, without application interruption.


E&P applications support a wide variety of users and draw on data from many different sources. This generates diverse, random workloads that only high-performance storage can handle. The problem is that once data becomes idle, it becomes a source of waste in the datacenter. Diligently moving cold data off performance storage is increasingly difficult, because the 24/7/365 operation of global companies leaves IT shorter windows in which data can be archived without affecting applications. As a result, many companies simply buy enough capacity that data never has to be moved over its lifetime. Not only is this costly and inefficient, but eventually even this capacity runs out, requiring a massive project to archive data, upgrade storage, or both.

DataSphere provides comprehensive visibility into application workloads, resources, and usage patterns to predictively determine the best place to provision data, what data might need to be moved to protect service levels, and what data can be safely archived, all without impacting running applications. This gives oil and gas companies the best of both worlds: active data gets the performance it needs to maximize processing, while cold data is automatically archived to reduce costs and keep performance capacity available for the data that needs it.
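One way to picture the "what can be safely archived" decision is to scan file records for data that has been idle beyond a threshold and total the capacity that archiving it would free. This is a minimal sketch under stated assumptions: `FileRecord`, `find_cold`, `reclaimable_bytes`, and the 90-day idle threshold are hypothetical, and last-access time stands in for the richer usage patterns the text describes.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class FileRecord(NamedTuple):
    path: str
    size_bytes: int
    last_access: datetime


def find_cold(records: List[FileRecord], now: datetime,
              idle_threshold: timedelta = timedelta(days=90)) -> List[FileRecord]:
    """Return records whose last access is older than the idle threshold."""
    return [r for r in records if now - r.last_access > idle_threshold]


def reclaimable_bytes(records: List[FileRecord], now: datetime,
                      idle_threshold: timedelta = timedelta(days=90)) -> int:
    """Capacity that would be freed on performance storage by archiving cold files."""
    return sum(r.size_bytes for r in find_cold(records, now, idle_threshold))


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    records = [
        FileRecord("/perf/active.segy", 4_000_000_000, datetime(2023, 12, 30)),
        FileRecord("/perf/old_a.segy", 6_000_000_000, datetime(2023, 6, 1)),
        FileRecord("/perf/old_b.segy", 1_000_000_000, datetime(2022, 1, 15)),
    ]
    cold = find_cold(records, now)
    print([r.path for r in cold])
    print(f"{reclaimable_bytes(records, now) / 1e9:.1f} GB reclaimable")
```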


As applications consume increasing volumes and varieties of data from oil prospecting to production, it seems unrealistic that companies will be able to slow storage spending, yet DataSphere enables just that. By aligning data to the right resource for business needs, DataSphere reduces infrastructure needs, expands storage choice, and increases operational efficiency to dramatically reduce O&G datacenter costs. These savings free overprovisioned resources for an immediate ROI on existing investments and make more efficient use of future purchases.

DataSphere delivers these savings through a number of unique capabilities, including:

  • A global namespace that keeps cold data accessible from cloud storage
  • Reduced overprovisioning, achieved by moving cool data to Tier 2 storage
  • Improved utilization that reduces the cost of new storage
  • An agnostic architecture that expands storage choice


A key challenge faced by oil and gas companies is that while they would like to archive a lot of data, it must be kept accessible in case it is required for new projects. DataSphere makes data available to applications across all storage in its namespace, including the cloud. Not only can DataSphere automatically move cold files to the cloud or to on-premises object storage, it can also automatically retrieve the data to primary storage if applications need it. Better still, DataSphere does this without requiring applications to be modified to use the retrieved data. In addition, DataSphere maintains access to data in the cloud as files. This means that companies can restore just the file that is needed, minimizing cloud bandwidth charges. These features make it easy for companies to add object or cloud storage as a native active archive tier. For more in-depth details on this use case, see the Primary Data brief on Simple Cloud Adoption.
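The active-archive behavior described above (same path before and after archiving, per-file recall on access) can be illustrated with a toy in-memory model. It is only a conceptual sketch: `ActiveArchiveNamespace` and its methods are hypothetical, two dictionaries stand in for primary and archive storage, and nothing here reflects how DataSphere is actually implemented.

```python
class ActiveArchiveNamespace:
    """Toy model of one namespace spanning a primary tier and an archive tier."""

    def __init__(self) -> None:
        self._primary = {}   # path -> bytes held on primary storage
        self._archive = {}   # path -> bytes held on the archive tier

    def write(self, path: str, data: bytes) -> None:
        """Store a file on primary storage, replacing any archived copy."""
        self._primary[path] = data
        self._archive.pop(path, None)

    def archive(self, path: str) -> None:
        """Move one file to the archive tier; its path does not change."""
        self._archive[path] = self._primary.pop(path)

    def location(self, path: str) -> str:
        if path in self._primary:
            return "primary"
        if path in self._archive:
            return "archive"
        raise FileNotFoundError(path)

    def read(self, path: str) -> bytes:
        """Read by path; an archived file is recalled on its own, on demand."""
        if path in self._archive:
            self._primary[path] = self._archive.pop(path)
        if path not in self._primary:
            raise FileNotFoundError(path)
        return self._primary[path]


if __name__ == "__main__":
    ns = ActiveArchiveNamespace()
    ns.write("/surveys/2012/block7.segy", b"seismic traces")
    ns.write("/surveys/2012/block8.segy", b"more traces")
    ns.archive("/surveys/2012/block7.segy")
    ns.archive("/surveys/2012/block8.segy")

    # The application reads the same path it always used.
    data = ns.read("/surveys/2012/block7.segy")
    print(data, ns.location("/surveys/2012/block7.segy"))
    print(ns.location("/surveys/2012/block8.segy"))  # untouched file stays archived
```

In this model, reading one archived file recalls only that file, which mirrors the point that restoring a single file avoids moving the rest of the archive.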

Figure 3 - DataSphere makes it easy to integrate on-premises and cloud storage resources.


DataSphere gives Oil and Gas leaders the ability to automate the placement and movement of data across all storage, including the cloud. This makes it possible to accelerate the performance of E&P applications, even as applications become more powerful and thus more demanding, and as user and workload diversity increases. DataSphere also automates many core management tasks, freeing IT resources to support other initiatives. DataSphere can free terabytes of capacity on existing storage while seamlessly integrating traditional storage with on-premises object storage or public cloud storage as an active archive. Whichever use case best fits your architecture and business needs, DataSphere can help you achieve dramatic cost savings across E&P projects and overall operations.

Calculate Your Savings

Use the Primary Data TCO Calculator on our website to see how much DataSphere might save you.
