The annual FAST conference, put on by the USENIX Association, is an event that dives into advanced storage system and device technologies. This year’s event had sessions on key-value storage, advanced file systems, deduplication, cloud and distributed systems, data caching and advanced SSD technology. It also featured an interesting keynote talk by Netflix.
Jonathan Looney from Netflix gave a keynote at the 2021 FAST conference on Open Connect, the purpose-built content delivery network (CDN) that the company uses to stream video to its customers around the world. Open Connect serves over 200 million members at peak data rates of over 100 Tb/s. The figure below gives a schematic of Netflix’s online streaming services.
Netflix’s Open Connect Appliance (OCA) is the workhorse for the network. The OCA runs almost exclusively on open-source software including its OS (FreeBSD). The figure below shows a typical OCA workload that includes stored content on HDDs and SSDs and content stored in RAM for fast response.
Jonathan said that the 2RU OCA uses commodity parts and achieves 180 Gb/s serving TLS-encrypted connections with less than 50% CPU utilization on a single 32-core 2.5-GHz CPU. He said that the company plans to double this bandwidth using the same hardware in the next year.
There are three OCA flavors for different workloads: an all-flash appliance (the 180+ Gb/s configuration), a large HDD/SSD combination that delivers up to 80 Gb/s, and a small HDD/SSD combination that delivers up to 7 Gb/s. He said that the company will start to use dual-actuator HDDs in its operations later this year.
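To put those per-appliance figures next to the 100 Tb/s peak quoted earlier, here is a rough back-of-the-envelope sketch. It assumes 1 Tb/s = 1,000 Gb/s and that every appliance runs at its quoted maximum, which real deployments would not do, so the results are idealized lower bounds and not Netflix's actual fleet size. The function and variable names are made up for illustration.

```python
import math

PEAK_TBPS = 100  # peak aggregate rate quoted in the keynote, in Tb/s

# Per-appliance throughput in Gb/s, as quoted in the talk
OCA_GBPS = {"all_flash": 180, "large_hdd_ssd": 80, "small_hdd_ssd": 7}


def min_appliances(peak_tbps, per_box_gbps):
    """Lower bound on appliances needed if each ran at its quoted maximum."""
    return math.ceil(peak_tbps * 1000 / per_box_gbps)


# Hypothetical scenario: the whole peak served by one appliance type at a time
counts = {name: min_appliances(PEAK_TBPS, gbps) for name, gbps in OCA_GBPS.items()}

for name, n in counts.items():
    print(f"{name}: at least {n} appliances")
```

Even in this idealized case, the all-flash configuration would need several hundred appliances to carry the peak on its own, which gives a sense of why per-box efficiency matters so much at this scale.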
UFS is the file system used on content drives, and Jonathan said that Netflix plans to deploy ZFS shortly on non-content drives. The Netflix CDN treats the data as disposable since there are multiple redundant copies throughout the CDN and most content is pre-positioned near where it is being used. Content placement is very important for efficient content delivery and requires the right number of copies, in locations close enough to the members, spread across the right servers and across the correct disks in each server.
Smart disk I/O requires readahead, keeping heavily accessed files in memory, a careful mix of reads and writes, and pacing out disruptive operations like trims, which can impact response times for SSDs. To increase operational efficiency, the company seeks to reduce memory bandwidth use and to make sure active content is hot in the cache memory. It also uses hardware offloads to reduce memory bandwidth and CPU usage and seeks to use PCIe bandwidth and I/O controller resources efficiently. The company is open to new platforms and designs that further these goals.
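The idea of keeping heavily accessed content in memory can be illustrated with a minimal sketch. The talk, as reported here, did not say which caching policy the OCA uses, so the example below swaps in a plain least-recently-used (LRU) policy purely for illustration. The class name `HotContentCache` and the `read_from_disk` callback are hypothetical.

```python
from collections import OrderedDict


class HotContentCache:
    """Illustrative byte-budgeted LRU cache (an assumed policy, not Netflix's)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key, read_from_disk):
        """Return content from memory if cached, otherwise read it and cache it."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        data = read_from_disk(key)  # cache miss: fall back to the drive
        self._put(key, data)
        return data

    def _put(self, key, data):
        if len(data) > self.capacity_bytes:
            return  # larger than the whole budget, so do not cache it
        self._items[key] = data
        self.used_bytes += len(data)
        # Evict the least recently used entries until back under budget
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
```

A caller would pass a function that reads the requested file or chunk from an HDD or SSD. Repeated requests for popular content are then served from RAM, and only the first request or a request after eviction touches the drive.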
I recently did some work with a company called Versity, whose products, ScoutAM and ScoutFS, provide an archive manager and a scale-out file system for data archiving applications. Versity was founded by Harriet Coverston and Bruce Gilpin in 2011. Ms. Coverston (shown below) has over 30 years of experience developing archiving software at Oracle, Sun, LSC and CDC.
Harriet is the author of the clustered archiving file system, Quick File System (QFS), initially developed at Large Storage Configurations (LSC). LSC was purchased by Sun Microsystems and QFS was integrated with SAM (Storage and Archive Manager). This archiving platform was referred to as SAM-QFS. After Oracle acquired Sun, SAM-QFS was renamed Oracle HSM (OHSM). In 2014, Versity released Versity Storage Manager, a Linux-compatible variant of the open-source SAM-QFS product.
Versity’s new scale-out file system (ScoutFS) is a key component of the company’s ScoutAM product. ScoutFS is a POSIX, kernel-based, scale-out, shared-block file system designed for archiving and released as open source under the GPL (General Public License). The product scales out on commodity hardware. Metadata is processed on all nodes or a subset of nodes in a cluster. As a result, there is no central metadata controller or any single point of failure. ScoutFS enables much larger archives, with hundreds of billions of files, by distributing workloads across nodes, and it leverages metadata stored on NVMe SSDs to speed up metadata operations. It also handles both large and small files efficiently from a single converged point of control.
The FAST conference offers insight into important industry developments in storage and storage systems, and this year it included an insightful keynote talk by Jonathan Looney from Netflix on how the company uses digital storage to support its content delivery network. Separately, Harriet Coverston and her company, Versity, have developed the ScoutFS scale-out archive file system.