The global electronic design automation (EDA) market relies heavily on computer-aided design (CAD) software to aid in the creation and production of semiconductor devices, integrated circuits (ICs) and printed circuit boards (PCBs). This multi-billion-dollar industry is fueled by the need for faster development cycles, low-power designs, lower costs and the complexities driven by Moore's Law, the observation that transistor density doubles approximately every two years. The need to rapidly develop new systems-on-chips (SoCs), embedded microprocessors, and analog circuits for industrial and consumer products is critical to EDA, and it creates significant engineering costs. In chip design, EDA lets architects and designers virtually create, build and test simulations of their end products before committing to the cost of validating a physical prototype (semiconductor wafer starts, packaging and physical test). To accomplish this, IT must provide computational, networking and storage capacity capable of handling not only millions to billions of files at high speed, but also the many iterations of a design throughout its development cycle.
- Overcome performance bottlenecks by offloading metadata operations
- Reclaim Tier 1 capacity by automatically moving cold data to object or cloud storage
- Move data non-disruptively with automated data tiering
THE EDA STORAGE BOTTLENECK
It is well documented that storage is the biggest bottleneck for EDA application performance. Common EDA deployments use large compute farms in a high-performance computing (HPC) approach to distribute parallel simulation runs that verify logic designs, which creates intense demands on storage. When ASIC designers open an IT support ticket for slow performance, the problem sometimes lies in the EDA application software, but most of the time it is tied to infrastructure bottlenecks caused by the type and volume of I/O traffic generated. Modern EDA applications are file-based and best deployed with network file system (NFS) based storage solutions. To help build an appropriate infrastructure for EDA applications, the Standard Performance Evaluation Corporation (SPEC) created standardized benchmarks profiling workloads such as Database, Software Build, Video Data Acquisition and Virtual Desktop Infrastructure, and proposals are now in the works to add EDA workloads to SPEC SFS® 2014 SP2. The SPEC working group has profiled many EDA vendors' products, showing heavily metadata-intensive workloads. As shown in the EDA Workload diagram the working group presented at the 2016 Storage Developer Conference, EDA software jobs run concurrently, producing a combined, random mix of workloads across multiple design phases in which about 60% of NFS operations are metadata, approximately 15% are reads and approximately 25% are writes. Armed with this knowledge, Research and Development (R&D) IT teams can take advantage of DataSphere's performance-enhanced metadata engine, scale-out NAS and data tiering features for EDA workloads.
Figure 1 — EDA software produces heavy metadata activity, according to SPEC SFS 2014. Source: SNIA.org
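To put the workload mix above in concrete terms, the short Python sketch below (illustrative only, not part of any SPEC tooling) splits a run's total NFS operation count using those approximate ratios:

```python
# Approximate NFS operation mix for EDA workloads, per the SPEC working
# group figures cited above (illustrative sketch, not SPEC tooling).
SPEC_EDA_MIX = {"metadata": 0.60, "read": 0.15, "write": 0.25}

def op_breakdown(total_ops, mix=SPEC_EDA_MIX):
    """Split a total NFS operation count by workload category."""
    return {category: round(total_ops * fraction)
            for category, fraction in mix.items()}

# A verification run issuing 1,000,000 NFS operations would generate on the
# order of 600,000 metadata operations alone:
breakdown = op_breakdown(1_000_000)
```

Seen this way, a storage system sized purely for read/write bandwidth leaves the majority of the operation load, the metadata traffic, unaddressed.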
OPTIMIZING FOR EDA WORKLOADS
Unlike traditional corporate IT, R&D IT must provide compute, networking and storage configurations that can handle environments with millions of small and large files, depending on the type of R&D workflow deployed during chip development. The workloads vary not only with the design process, but also with which EDA software products have been installed. The NFS protocol and NAS storage are used in the majority of EDA application deployments. Although workloads differ with the design and development phase, EDA applications ubiquitously use NFS remote procedure calls (RPCs) to access NAS storage. The challenge is not only the serialization of RPCs, but also the mixing of metadata and file requests and the constant demand for low-latency, high-bandwidth file access. Generically speaking, the design process can be broken into two development workflows, frontend and backend, discussed in the sections below.
FRONTEND IC DEVELOPMENT FLOW (GENERIC)
The frontend integrated circuit (IC) design workflow describes the process of translating a product's requirements into a set of specifications, which engineers then use to develop a functional design in a specialized hardware description language (HDL) that can be synthesized, or compiled, into the logic design.
The last stage of the frontend workflow places the most demand on storage. In this verification stage, engineers must verify their design in simulation. The storage demands of frontend activity consist of millions of small files requiring fast storage for transient output that can reach terabytes in size. Again, the SPEC working group profiled several EDA vendor solutions, each consistently showing similar storage transaction attributes: a higher percentage of write than read file operations, with nearly 50% of transactions relating to metadata operations. Metadata operations are easily blocked behind file accesses in the storage controller's queue. To balance these storage calls, R&D IT will leverage multiple storage controllers across multiple storage arrays, which is costly, does not guarantee consistent metadata performance, and leaves each storage array as its own storage island.
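The head-of-line blocking described above can be illustrated with a toy single-server FIFO model (a deliberate simplification; real controllers use more sophisticated queueing):

```python
# Toy single-server FIFO model of a storage controller queue (illustrative
# only): a cheap metadata op queued behind a large file read inherits the
# read's service time as latency.
def completion_times(queue):
    """queue: list of (op_name, service_time_ms) served in FIFO order."""
    clock, finished = 0.0, {}
    for name, service_ms in queue:
        clock += service_ms          # the server handles one op at a time
        finished[name] = clock       # completion time includes all waiting
    return finished

# A 1 ms GETATTR behind a 200 ms large read completes at 201 ms, even
# though its own service time is negligible.
times = completion_times([("large-read", 200.0), ("getattr", 1.0)])
```

With roughly half of frontend transactions being metadata operations, this queueing effect is why adding controllers helps only partially: it spreads the queues but does not separate metadata traffic from bulk file I/O.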
ACCELERATING METADATA ACCESSES
DataSphere is a metadata engine that collects performance telemetry from clients accessing data within its global namespace. With this information, DataSphere runs analytics to drive data management decisions and ensure that business objectives for applications are always met. By separating metadata from the data itself, DataSphere offloads metadata activity from NAS storage arrays, allowing them to devote all resources to file accesses. This out-of-band approach solves a key performance bottleneck for frontend EDA applications, or any metadata-heavy workload. Once a functional design is complete, it moves to the next phase of development: physical design and manufacturing, or the backend flow.
Figure 2 — DataSphere’s out-of-band access ensures high performance for EDA applications.
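Conceptually, the out-of-band separation can be sketched as follows; the class and method names here are hypothetical illustrations of the architecture, not DataSphere's actual API:

```python
# Hypothetical sketch of an out-of-band metadata architecture: namespace
# queries go to a dedicated metadata service, while file contents are
# accessed directly on the storage nodes. Names are illustrative only.
class MetadataEngine:
    """Answers namespace queries (create, lookup) with no data-path I/O."""
    def __init__(self):
        self._namespace = {}                  # path -> storage location

    def create(self, path, location):
        self._namespace[path] = location

    def lookup(self, path):
        return self._namespace[path]          # pure metadata operation

class DataNode:
    """Serves file contents only; metadata load is handled elsewhere."""
    def __init__(self):
        self._contents = {}

    def write(self, path, data):
        self._contents[path] = data

    def read(self, path):
        return self._contents[path]

# Clients resolve a file's location out of band, then access data directly,
# so heavy metadata traffic never queues behind file I/O on the array.
engine, node = MetadataEngine(), DataNode()
engine.create("/sim/run/out.log", "nas-array-1")
node.write("/sim/run/out.log", b"pass")
```

The design point is that the two request streams never share a queue: metadata-heavy frontend jobs load the metadata service, while the arrays see only file traffic.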
BACKEND IC DEVELOPMENT FLOW (GENERIC)
In contrast to the frontend process, the backend IC development workflow moves a design from the virtual to the physical world. In other words, it translates the schematic of a digital design, or the netlist of logic, into transistors, gates, memory, I/O cells and so on. The backend process is similar to building a house. First, you start with a floorplan. Floor planning determines where to best place the functional logic blocks on a silicon chip for speed and the smallest die; in the house example, it is where the architect places the entrance, living room, dining room, bathrooms, kitchen and bedrooms for the best look and use of the square footage. Next, the logic must be electrically connected for communications and power, which in a house is analogous to adding plumbing and electrical outlets. At this point, engineering must make sure everything theoretically works, and begins physical verification to check that the design fits into the floorplan and that all the logic and I/O are connected correctly, and then estimates power consumption and timing for the targeted operating speeds.
The SPEC group also profiled typical EDA backend storage operations for the purpose of simulating loads. Across several vendors' EDA solutions, file sizes were two to three times larger for backend than for frontend loads, with more sequential access patterns and heavier write loads. The backend flow points R&D IT toward a storage solution with large bandwidth and huge write caches. However, IT budgets usually cannot support dedicated resources for each phase of development, so systems end up overprovisioned in performance or capacity to handle the varying demands of an engineering project. Fixing this requires the ability to combine a broad range of storage solutions, from high-performance flash-based storage arrays to low-cost object storage in the cloud, to manage the changing demands of an EDA workload automatically.
MEETING DATA DEMANDS IN REAL-TIME
Figure 3 — DataSphere enables IT to easily tier data to the resource that meets current application demands.
With its metadata engine and DSX extended services, DataSphere can match the performance, cost and reliability attributes of a storage resource with an application's requirements in real-time. This enables IT to overlay various storage options and configurations to meet EDA development needs. For example, R&D IT can easily tier storage and, using objectives, automate movement of latency-sensitive data to a high-performance all-flash array. Transient datasets can be placed on PCI Express-attached or NVMe-based flash drives. With many EDA jobs running in parallel, clustering storage devices under DataSphere allows scale-out configurations that load-balance files across any number of NAS arrays, giving EDA applications simultaneous access to multiple files. Less frequently used data can remain on lower-cost centralized hard disk drive-based or hybrid NAS arrays. Snapshots, older simulations and verification results can be pushed to the cloud for retention at the lowest cost of ownership. DataSphere's architecture even allows file movement between tiers while files are open and data is actively being accessed; DSX Data Movers ensure that all reads and writes complete atomically even while data is in flight to the target storage tier.
Figure 4 — DataSphere enables admins to set objectives across attributes including performance, cost and reliability to ensure application service levels are met automatically.
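One way to picture objective-based placement is as a policy that selects the cheapest tier still meeting a latency objective. The tier names, latency figures and cost figures below are invented for illustration and do not represent DataSphere's actual objective engine:

```python
# Illustrative tiering policy (hypothetical tiers and numbers): place data
# on the lowest-cost tier whose latency still meets the objective.
TIERS = [
    # (name, typical latency in ms, relative cost per GB)
    ("nvme-flash", 0.1, 10.0),
    ("hybrid-nas", 5.0, 3.0),
    ("cloud-object", 100.0, 1.0),
]

def place(latency_objective_ms):
    """Return the cheapest tier meeting the latency objective."""
    eligible = [t for t in TIERS if t[1] <= latency_objective_ms]
    return min(eligible, key=lambda t: t[2])[0]

# Latency-sensitive simulation scratch data lands on flash, while cold
# verification results can drift to object storage:
hot_tier = place(1.0)       # only nvme-flash meets a 1 ms objective
cold_tier = place(1000.0)   # all tiers qualify, so the cheapest wins
```

A real objective engine would weigh more attributes (bandwidth, reliability, capacity headroom) and re-evaluate placement continuously as telemetry changes, but the cost-versus-objective trade-off is the core idea.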
RIGHT DATA, RIGHT PLACE, RIGHT TIME WITH DATASPHERE
With DataSphere, admins can ensure the varying demands of EDA applications are met throughout an engineering project by offloading metadata operations. This delivers higher performance with existing storage as well as with new scale-out NAS solutions built from existing deployments or tiered storage types. DataSphere makes it possible to combine NAS arrays from different vendors for cost savings and agility, create logical storage tiers for improved capacity efficiency, define performance tiers for increased application throughput, automate demands using objectives, upgrade storage without disruption, and leverage the cloud today – seamlessly and without changing applications. DataSphere expands architectural storage choices to meet both IT's budget constraints and the application demands of the business.
Use the Primary Data TCO Calculator on our web site to see how much DataSphere might save you.