How DataSphere Uses Machine Learning to Automate Data Management

DataSphere uses machine learning to solve resource management problems, fixing numerous other IT problems along the way. To meet service level agreements (SLAs), enterprise IT pros need to allocate finite resources in storage infrastructure: capacity, performance load, and anything else consumable, such as IOPS or how much data can be written over a device's lifetime before the media wears out. These resources have to be allocated to deliver the most value to the business. IT is expected to keep business data humming while also containing costs. Historically, IT has allocated storage resources manually, often reacting to events only after they have affected the business.

DataSphere begins with an enterprise’s existing infrastructure and uses reinforcement learning to continually optimize the placement and movement of data across all storage, including flash, shared, and cloud storage, to meet SLAs. Its algorithm simulates a market economy in which storage containers are “landlords” with floor space to lease, and data objects (individual files) are “tenants” looking to lease real estate that suits their tastes and needs.
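
To make the analogy concrete, here is a minimal Python sketch of the landlord/tenant model. It is an illustration only, not DataSphere's actual implementation: the class names, the fields (`iops`, `price_per_gb`, and so on), the sample values, and the "cheapest container that fits" rule are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Container:
    """Hypothetical 'landlord': a storage container with space to lease."""
    name: str
    free_gb: float
    iops: int
    price_per_gb: float

@dataclass
class DataObject:
    """Hypothetical 'tenant': a file with requirements ('tastes')."""
    path: str
    size_gb: float
    min_iops: int
    max_price_per_gb: float

def suitable(container, obj):
    # A container suits a tenant if it has room, is fast enough,
    # and costs no more than the tenant is willing to pay.
    return (container.free_gb >= obj.size_gb
            and container.iops >= obj.min_iops
            and container.price_per_gb <= obj.max_price_per_gb)

def cheapest_fit(containers, obj):
    # Assumed placement rule: the cheapest suitable container, or None.
    candidates = [c for c in containers if suitable(c, obj)]
    return min(candidates, key=lambda c: c.price_per_gb, default=None)

# Made-up sample data for illustration.
containers = [
    Container("flash", free_gb=500, iops=100_000, price_per_gb=0.50),
    Container("nas", free_gb=5_000, iops=5_000, price_per_gb=0.10),
    Container("cloud", free_gb=1_000_000, iops=500, price_per_gb=0.02),
]
database = DataObject("/db/orders.dat", size_gb=200, min_iops=50_000, max_price_per_gb=1.00)
old_log = DataObject("/logs/old.log", size_gb=10, min_iops=100, max_price_per_gb=0.05)

print(cheapest_fit(containers, database).name)
print(cheapest_fit(containers, old_log).name)
```

In this toy setup the busy database file lands on flash because nothing else is fast enough, while the cold log file lands on the cheapest tier that still meets its modest requirements.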

Following is a simplified summary of how DataSphere’s machine learning engine generally works:

  • Admins assign Service Level Objectives (SLOs) to data that define the performance, price, and protection that the data requires. Admins can do this using predefined workflows or powerful objective expressions.
  • DataSphere discovers the capabilities of the storage resources and maps the active performance of the entire topology, from storage to clients. This end-to-end view lets DataSphere measure the net performance of the IT infrastructure and learn how to better allocate resources to the clients that need them. It also eliminates the need to map network performance manually, since DataSphere measures the net impact of all layers on performance.
  • DataSphere continuously profiles the environment, “auctioning” resources off to data objects, granting them capabilities, performance, and priority.
  • Using machine learning, DataSphere maps data objects to the IT-defined objectives. When data falls out of alignment with objectives, it makes adjustments, feeding the results of changes to the infrastructure back into the engine’s economic simulation. As it learns what data needs, DataSphere can recommend changes to objectives to better meet the business’s requirements, and it can notify admins when more performance or capacity is needed.
  • When contention occurs, or when DataSphere predicts it might occur, it can perform “financial arbitrage” to move data in ways that deliver the highest value to the business. For example, performance for a mission-critical application should be protected over that of data used by a back-end process. DataSphere determines this value, the “income” the data tenant has to spend, based on two factors: the price paid for a given service level and the penalties incurred when service levels are not met.
  • Over time, DataSphere learns what data’s “tastes” are and will be able to predict the needs of all data in the system. Just as a robot using machine learning to navigate runs into fewer and fewer objects over time, DataSphere will need to perform arbitrage less and less often. 
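
The contention step above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DataSphere's algorithm: the `Tenant` class, the choice to add the two factors together to get "income", the greedy highest-income-first allocation, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Tenant:
    """Hypothetical data 'tenant' competing for a scarce resource (IOPS)."""
    name: str
    iops_needed: int
    price_paid: float    # price paid for the service level
    miss_penalty: float  # penalty if the service level is not met

    @property
    def income(self):
        # Assumption: the two factors named in the text are simply summed.
        return self.price_paid + self.miss_penalty

def auction(tenants, available_iops):
    """Grant IOPS to the highest-income tenants first.

    Returns (winners, displaced): names of tenants that keep the resource
    and names of tenants that would have to move elsewhere.
    """
    winners, displaced = [], []
    for tenant in sorted(tenants, key=lambda t: t.income, reverse=True):
        if tenant.iops_needed <= available_iops:
            available_iops -= tenant.iops_needed
            winners.append(tenant.name)
        else:
            displaced.append(tenant.name)
    return winners, displaced

# Made-up sample data: 140,000 IOPS requested, only 100,000 available.
tenants = [
    Tenant("orders-db", iops_needed=60_000, price_paid=100.0, miss_penalty=500.0),
    Tenant("nightly-report", iops_needed=50_000, price_paid=20.0, miss_penalty=10.0),
    Tenant("web-cache", iops_needed=30_000, price_paid=40.0, miss_penalty=50.0),
]
winners, displaced = auction(tenants, available_iops=100_000)
print("stay:", winners)
print("move:", displaced)
```

With these numbers the mission-critical database and the web cache keep their performance, and the back-end report is the candidate to move, which mirrors the example in the list above.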

DataSphere installs in just a few hours, and from that day on it enables enterprises to automate the placement and movement of data without impact to applications. As DataSphere learns about the topology, from storage to clients, and about what data needs, it optimizes how data is managed to meet business goals, getting better and better over time.

With DataSphere, IT can deliver more consistent application response times and greater application uptime, and contribute to higher profits through increased utilization and the savings that come with reducing overprovisioning. Want to learn more about how DataSphere can automate intelligent data management? Connect with us at deepdive@primarydata.com to schedule a meeting or demo.



