2019 Storage Performance Development Kit (SPDK) Summit
April 16th-17th, 2019

Day 1

Time	Location	Title	Presenter	Abstract
7:00am	Conference Hall	Registration & Breakfast
8:20am	Hayes Ballroom	Welcome Note	Nathan Marushak
8:30am	Hayes Ballroom	SPDK: State of the Project	Jim Harris Principal Engineer Chief SPDK Architect Intel Corp.	SPDK is continuing to evolve, from both a technology and a community perspective. In this talk, Jim will review the significant changes over the last year and provide insight into what to expect from SPDK over the coming year.
9:00am	Hayes Ballroom	PMDK: State of the Project	Andy Rudoff Senior Principal Engineer Intel Corp.	The SNIA NVM Programming Model is an agreement between dozens of companies on how an OS exposes persistent memory, building on the standard storage APIs. But to make the most of persistent memory, application writers want to access persistence directly using memory semantics, and this can be tricky programming. This is where PMDK comes in. In this talk, Andy will explain the goals of PMDK, the primary motivation for creating it, and how well it has met those goals so far. Andy will talk about what has worked well, as well as some of the challenges we still have ahead of us for PMDK.
9:30am	Ballrooms	Breakout Sessions #1
10:15am	Conference Hall	Break
10:30am	Ballrooms	Breakout Sessions #2
12:05pm	Dolce Hayes Mansion	Lunch
1:00pm	Hayes Ballroom	Keynote	Jennifer Huffstetler Vice President, Data Center Product Marketing Intel Corp
1:30pm	Hayes Ballroom	Keynote	Ken Gibson Director, Persistent Software Architecture Sri Doddapaneni Director, VTune tools. Intel Corp.
2:00pm	Conference Hall	Break
2:10pm	Ballrooms	Breakout Sessions #3
5:30pm-7:00pm	Conference Hall	Meet the Experts/Happy Hour

Day 2

Time	Location	Title
7:30am	Conference Hall	Breakfast/Registration
8:30am	Ballrooms	Breakout Session #4
10:05am	Conference Hall	Break
10:20am	Ballrooms	Breakout Session #5
12:00pm	Dolce Hayes Mansion	Lunch
1:00pm-3:00pm	Ballrooms	Hands-on Labs

Breakout Session 1

Time	Hayes Ballroom	Morgan Hill	Monterey
9:30am	PMDK Essentials Part 1 Andy Rudoff and Pawel Skowron Intel Corp The SNIA NVM Programming Model is an agreement between dozens of companies on how an OS exposes persistent memory, building on the standard storage APIs. But to make the most of persistent memory, application writers want to access persistence directly using memory semantics, and this can be tricky programming. This is where PMDK comes in. In this talk, Andy will explain the goals of PMDK, the primary motivation for creating it, and how well it has met those goals so far. Andy will talk about what has worked well, as well as some of the challenges we still have ahead of us for PMDK.	SPDK Software Quality Paul Luse, Seth Howell and Darek Stojaczyk Intel Corp Software Quality can mean a lot of different things to different people. Most developers equate quality to testing which is very reasonable. But in the SPDK Community it’s about much more than just testing, it’s about automation, code reviews, and clear communication to name just a few. This talk will provide a bird’s eye view of the SPDK automated testing infrastructure and how it’s evolved over the years. We will also demonstrate how developer interactions are another key to SPDK’s continued commitment to quality.	Prepare for the next generation of memory. Is your application a good candidate? Kevin P O'leary and Vineet Singh Intel Corp Will larger Intel® Optane™ DC persistent memory increase performance if I make code changes? What needs to change? Can it speed up I/O delays in my application? Every application is different. VTune Amplifier’s Platform profiler can give a coarse grain overview. In this session we’ll use VTune Amplifier’s memory and I/O analysis to take a fine grained look at specific memory objects. Learn how to use the Intel® VTune™ Amplifier performance profiler to plan design changes and tune your final implementation to take the best advantage of the new memory technology.

Breakout Session 2

Time	Hayes Ballroom	Morgan Hill	Monterey
10:30am	PMDK Essentials Part 2 Andy Rudoff and Pawel Skowron Intel Corp The SNIA NVM Programming Model is an agreement between dozens of companies on how an OS exposes persistent memory, building on the standard storage APIs. But to make the most of persistent memory, application writers want to access persistence directly using memory semantics, and this can be tricky programming. This is where PMDK comes in. In this talk, Andy will explain the goals of PMDK, the primary motivation for creating it, and how well it has met those goals so far. Andy will talk about what has worked well, as well as some of the challenges we still have ahead of us for PMDK.	High Performance Pooled Storage for RSD Architectures Steve Miller Intel Corp Driving efficiency and performance is critical to modern data center architectures. This talk will cover how the features of SPDK (lockless design, user space library, core affinity, bdev stacking, and debuggability) were used to provide high performance nondurable block storage with RAID 0, thin provisioning, QoS, clones, snapshots, and Redfish/RSD compliant management.	Optimize system configurations and workloads for Intel® Optane™ DC persistent memory Vineet Singh and Asaf Yaffe Intel Corp Have you ever wondered if your system is configured well for its typical loads? Or if your typical workloads are well optimized for your system? Will your workloads benefit from Intel Optane DC persistent memory? State of the art performance analysis tools, for longer runs, do not always give sufficiently detailed performance metrics. More detailed performance analysis tools can overwhelm the user with a huge amount of fine-grained data. Intel® VTune™ Amplifier’s Platform Profiler provides an adequate amount of data for a user to detect if there is any problem with the system configuration, or if there is any pressure on specific system components like memory or I/O that causes performance bottlenecks.
This presentation focuses on how to use Intel® VTune™ Amplifier Platform Profiler for (1) Analyzing suitability of your workload for Intel® Optane™ DC PMM and (2) Analyzing performance on an Intel® Optane™ DC PMM enabled system.
11:20am	Flexible and Dynamic Resource Use with SPDK Ben Walker and Darek Stojaczyk Intel Corp SPDK was originally designed to run in a model where all memory was allocated up front, a system thread was spawned per CPU core, and the process was executed with elevated privileges. While suitable for a large number of use cases, a more flexible and dynamic model is required, especially for use within cloud deployments. Today, SPDK supports fully dynamic memory allocation, a new flexible threading model designed to integrate seamlessly into existing code bases, and full support for running without elevated privileges. This talk will cover that transformation process and outline how to take advantage of all of SPDK’s new-found flexibility in your application.	Is your data really persistent? Or did you forget to flush? Kevin P O'leary Intel Corp Data cannot persist when power is removed unless it is actually written to persistent memory. It must be flushed from volatile caches. Flush too little and you lose data. Too much and you reduce performance. Intel Inspector’s Persistence Inspector finds persistent memory errors like missing and redundant cache flushes, missing store fences, out of order persistent memory stores and PMDK transaction logging errors. Come learn how it can be used both as a design tool to find the best spot to insert flushes and as a performance tool to eliminate redundant flushes.	Persistent Memory Programming with Java Olasoji Denloye Intel Corp Persistent memory offers new ways to program with long-lived data. Java is a popular language for data center applications such as databases. This session will describe ways to access persistent memory from Java and relate some of our application experience.

Breakout Session 3

Time	Hayes Ballroom	Morgan Hill	Monterey
2:10pm	Hardware offloads for SPDK Sasha Kotchubievsky and Oren Duer Mellanox A SmartNIC is a NIC with on-board processing units that can run software. Such a SmartNIC can present a standard NVMe device to the host while implementing the device logic in software on the SmartNIC's processors. Use cases include: bare-metal and virtual cloud providers; storage systems with proprietary protocols; storage device security and isolation; compression and encryption over the standard NVMe interface; and more. SPDK is a great match as a framework for the software processing on the SmartNIC. NVMe-oF target offload is a hardware implementation of the NVMe-oF RDMA protocol on the network card. Employing this technology, the network adapter terminates the target side of the NVMe-oF protocol and submits the NVMe requests directly to NVMe devices over PCI using peer-to-peer. Integrating this technology with SPDK will allow users to manage such configurations from a known and flexible environment.	Open Source Caching Solutions Powered by CAS Jay Guilmart and Tomasz Zawadzki Intel Corp Intel is releasing an open framework for its Cache Acceleration Software product (Open Cache Framework, OCF). OCF will be maintained by the Intel team and available to all members of the community. This package can be wrapped in other storage applications to improve caching effectiveness and improve overall platform performance. Current examples from Intel include integration into SPDK and QEMU. The details of this OCF package, how it can be used, and potential benefits will be reviewed.	Persistent Memory Provisioning/Configuration tools Steve Scargall Intel Corp This session is aimed at System Administrators or Application Developers with minimal or no experience working with persistent memory. Steve will introduce and demonstrate how to provision persistent memory in Linux using the open source ndctl utility.
3:00pm	End-to-end Data Protection with SPDK Shuhei Matsumoto Hitachi Dealing with data corruption is important for storage applications, including those built on SPDK. This presentation will describe the new SPDK DIF library and its use cases in SPDK. The library leverages the Intel® Intelligent Storage Acceleration Library (ISA-L) for efficient CRC calculations and provides routines to simplify integration of DIF operations into other libraries. The SPDK iSCSI target now leverages this DIF library to support insert/strip for read and write I/O. This demonstrates that end-to-end data protection can be implemented in host software without specialized hardware while still providing compelling performance and efficiency.	Translate in a flash with SPDK Flash Translation Layer library Wojciech Malikowski Intel Corp The Flash Translation Layer library provides block device access on top of non-block SSDs implementing the Open Channel interface. Such SSDs provide more flexibility with regard to data placement decisions, over-provisioning, I/O operations scheduling, garbage collection and wear leveling. The FTL library handles logical to physical address mapping, responds to the asynchronous media management events and manages the defragmentation process. The presentation will focus on explaining the library's core concepts and components, how the library should be used and what benefits can be obtained. Future plans to support power loss protected 4K writes and the Zoned Namespace interface will be covered.	Persistent Memory Programming Made Easy with pmemkv Rob Dickenson Intel Corp Introducing pmemkv, an open-source local key/value store for persistent memory based on PMDK. Written in C/C++, pmemkv provides optimized language bindings for Java, JavaScript and Ruby. Pmemkv includes multiple storage engines that are tailored for different use-cases. Fast, flexible and bulletproof, pmemkv is an easy way to modify applications to use persistent memory.
3:50pm	NVMe over TCP Storage with SPDK Scott Schweitzer and Patrick Dehkordi Solarflare Solarflare has a mature and proven user space TCP/IP stack that has been integrated into every capital market, powering exchanges for over a decade. We recently ported this stack, called Onload®, onto SPDK. Solarflare has spent the past 24 months testing and proving the value of NVMe over TCP. Solarflare will present our work and roadmap going forward while exploring ways to work closer with the SPDK ecosystem.	Persistent Memory – which mode do I want? Where are the “gotchas” hidden? -- Part 1 Sudha Udanapalli Thiagarajan Intel Corp Intel® Optane™ DC persistent memory can be configured either as persistent memory (AppDirect) or as main memory, with DRAM used as a cache (memory mode). Each mode has some challenges for adoption. To extract the best performance from this memory technology and identify which configuration mode is best suited for your application, it is necessary to understand the architectural flow from the core to the memory and some key challenges in optimization. In this presentation, we will present an overview of the uncore architecture leading to architectural conditions to monitor when using Intel® Optane™ DC persistent memory in both memory configurations. This presentation will also cover recommendations on what tools/profiles to use for profiling. With the relevant architectural background and the recommendation on what tools/profiles to use, profiling and optimizing applications will be that much easier!	Lessons learned from MemVerge, an avid PMDK user Yue Li Intel Corp
4:40pm	Integrating SPDK with Oracle RDBMS Akshay Shah and Zahra Khatami Oracle The goal of integrating Oracle database with SPDK is to build highly available and scalable applications where the database is serving millions of IOPS at very low latencies. However, there are a few challenges with integrating SPDK with the Oracle database. The multi-process Oracle database architecture quickly exhausts the PCIe hardware queues. The DPDK memory model conflicts with Oracle's shared memory model and introduces inefficiencies in the RDMA data access path. In this talk, we will go over the Oracle dispatcher and Oracle environment abstraction layer (EAL) design and implementation. Additionally, we will describe other enhancements that strengthen the security model for SPDK memory management, provide better isolation between different databases and improve the fault tolerance of the database. These features help address the above challenges and enable creation of an enterprise-grade solution that can unlock the full potential of the NVMe devices. Lastly, we will talk about some of the deployment and packaging related challenges with delivering this solution for on-premises and cloud deployments.	Persistent Memory – which mode do I want? Where are the “gotchas” hidden? -- Part 2 Sudha Udanapalli Thiagarajan Intel Corp Intel® Optane™ DC persistent memory can be configured either as persistent memory (AppDirect) or as main memory, with DRAM used as a cache (memory mode). Each mode has some challenges for adoption. To extract the best performance from this memory technology and identify which configuration mode is best suited for your application, it is necessary to understand the architectural flow from the core to the memory and some key challenges in optimization. In this presentation, we will present an overview of the uncore architecture leading to architectural conditions to monitor when using Intel® Optane™ DC persistent memory in both memory configurations.
This presentation will also cover recommendations on what tools/profiles to use for profiling. With the relevant architectural background and the recommendation on what tools/profiles to use, profiling and optimizing applications will be that much easier!	Creating C++ apps with PMDK Piotr Balcer Intel Corp With persistent memory, data can be retained after a program crash or power failure. In this session, learn how to make your C++ application persistent memory aware using the Persistent Memory Developers Kit (PMDK). The presentation includes C++ code samples walkthrough.

Breakout Session 4

Time	Hayes Ballroom	Morgan Hill	Monterey
8:30am	Tiered approach with different kinds of volatile memory Usha Upadhyayula and Piotr Balcer Intel Corp Data generated from emerging use cases like HPC, AI and IOT can be enormous and is often prohibitive for DRAM. With the advent of large capacity persistent memory modules like Intel® Optane DC Persistent Memory, an alternative for a large capacity tier sitting close to the processor has emerged. libmemkind is a general purpose volatile memory allocation library that supports different "kinds" of memory. In the first half of the session, Usha will explore, through examples, how libmemkind can be used to allocate and manage memory from different volatile memory sources. In the second half of the session, Piotr Balcer will go over a new library that’s in the works, called libvmemcache. libvmemcache is a dedicated caching library that implements a more optimal memory management scheme.	My experience tuning big-data workloads and configurations Milind Damle Intel Corp For big-data and analytics workloads, performance improvement is critical. Learn about my experience using Intel® VTune™ Amplifier Platform Profiler to identify performance bottlenecks, mitigate them and arrive at an optimal configuration. As distributed systems get more complex and the use of large clusters increases, end-users and customers need to become familiar with these tools and methodologies to get the best performance and TCO for their investment.	It’s a Bird…It’s a Plane... It's NVMe over TCP and more!!! Seth Howell and Ziye Yang Intel Corp Since its release in 2016, the SPDK NVMe-oF target has leveraged innovative features to provide low latency, high throughput access to remote NVMe drives. This last year has provided many opportunities to expand on that foundation to broaden the applicability of the NVMe-oF target, further increase performance, and decrease its memory footprint.
In this talk, we introduce the NVMe-oF TCP target and initiator, describe their position in the SPDK NVMe-oF architecture and provide benchmarking data for the new transport. We also detail several new features of the RDMA transport including shared receive queue and scatter gather list support and their impact on performance and memory consumption.
9:20am	SPDK’s OCSSD FTLs on Zao Platform to Solve Cloud’s Noisy Neighbor Problem David Recker, Circuit Blvd Madhukiran Vaddi, Marvell Since the SPDK community introduced OCSSD FTL bdev in January 2019, Circuit Blvd and Marvell have been jointly evaluating it on Marvell's Zao SSD SoC platform. We will first present Zao platform technical details including OCSSD SDK features. We measured various performance metrics from the initial prototype in comparison with the standard NVMe devices. In conclusion, we will demonstrate how SPDK OCSSDs can solve the noisy neighbor problem in multi-tenant environments.	Optimize your PMDK application’s performance with the help of Intel® VTune™ Amplifier profiler Dmitry Ryabtsev and Sergey Vinogradov Intel Corp If you want to take advantage of Intel® Optane™ DC persistent memory in AppDirect mode, then the PMDK library is probably your best bet. But what can you do if the performance you get doesn’t satisfy you? In this talk you will learn how to use Intel VTune Amplifier to optimize PMDK-based applications.	Debugging Techniques for Persistent Memory Programming Eduardo Berrocal Intel Corp In this talk, Eduardo will go over the new potential errors that persistent memory programmers need to be aware of and the tools available to overcome them. The first part of the talk will focus on pmempool, a tool available in the Persistent Memory Development Kit (PMDK) that helps prevent, diagnose, and recover from unplanned data corruption due to hardware issues. Eduardo will also show how errors can be injected to test the redundancy of your system. The second part will focus on the new types of bugs that can arise in persistent memory programming, and how you can protect yourself using the pmemcheck, pmreorder, and Intel® Inspector – Persistence Inspector tools.

Breakout Session 5

Time	Hayes Ballroom	Morgan Hill	Monterey
10:20am	RMI: High performance user-level multi-threading on SPDK/DPDK Munenori Maeda and Shinji Kobayashi Fujitsu Our two years of experience developing an enterprise AFA storage system with high-speed RDMA show that its complicated storage stack requires a threading environment to reduce implementation complexity. However, it is a big challenge to implement a threading environment including synchronous communication primitives without degrading the performance of SPDK/DPDK. The RMI threading environment, which we developed on SPDK 16.12, supports over 16K threads and provides dynamic load-balancing and simple synchronous communication primitives using RDMA. Our preliminary evaluation shows that the cost of load balancing is quite low, and the communication performance scales well on modern multicore systems.	OpenStack, NVMe-over-Fabrics and SPDK – High Performance Pooled Block Storage with Cinder Maciej Szwed and Tushar Gohad Intel Corp NVMe-over-Fabrics (NVMe-oF) is an architecture that allows low latency access over RDMA and TCP to remote NVMe-attached storage. It is fast becoming a key enabler for disaggregated storage and NVMe pooling in data centers, driving increased storage efficiency. Intel, Mellanox and Mirantis joined forces to enable NVMe-oF in OpenStack Cinder and Nova: a new SPDK NVMe-oF driver was recently merged upstream, which paves the way for consuming SPDK-based high performance storage in OpenStack (and, soon to be enabled, Kubernetes) cloud deployments.	Oops! Most profilers can’t analyze SPDK polled I/O. Learn how Intel® VTune™ Amplifier can. Roman Sudarikov, Ilya Kurakin, Roman Khatko, and Jeffrey Reinemann Intel Corp With traditional interrupt driven I/O, the CPU is either doing something useful or waiting. With SPDK’s polled I/O the CPU is always 100% busy, so traditional profiling techniques don’t work.
Intel VTune Amplifier can identify “empty” spinning so you can balance core loading, balance SSDs, see the throughput per device, PCIe traffic breakdown and lots of good stuff. Learn how to use Intel VTune Amplifier to optimize your I/O performance.
11:10am	Pachyzoom: Understanding & Optimizing Hadoop servers with Intel VPP Matt Singer Twitter Twitter partnered with Intel to investigate options to increase the storage density of Hadoop nodes. The project began with a focus on iCAS and Optane SSDs, but evolved into a deeper dive into Twitter's existing Hadoop infrastructure using Intel's VTune Platform Profiler and internal tooling. As bottlenecks were removed, new ones took their place, causing a shift in the focus of our testing. By collaborating with experts from both companies on Hadoop, storage, caching and telemetry, we were able to challenge several assumptions about Twitter's desired compute/storage balance. Many months of reconfiguration, benchmark testing, and analysis resulted in a clear direction for the shape of Twitter's next generation of Hadoop hardware. The presentation will discuss the evolution of the project and key results of the collaboration.	Squeezing Compression into SPDK Paul Luse and Jim Harris Intel Corp Last year we introduced the crypto bdev module, which made use of DPDK’s existing variety of drivers to usher in the capability. This year we are expanding our use of DPDK with the addition of a compression bdev module. This talk will outline the overall architecture of the compression module and explain in detail how we are managing the layout of the device and leveraging PMDK to store metadata in super-fast persistent memory.	Big data usages with PMDK Di Wang and Jakub Radtke Intel Corp The Distributed Asynchronous Object Storage (DAOS) is an open-source software-defined object store designed from the ground up for massively distributed Non Volatile Memory (NVM). In this presentation, Di will introduce the DAOS architecture and how DAOS is built on PMDK.
DAOS takes advantage of next generation NVM technology like Storage Class Memory (SCM) and NVM Express (NVMe) while presenting a key-value storage interface and providing features such as transactional non-blocking I/O, advanced data protection with self-healing on top of commodity hardware, end-to-end data integrity, fine grained data control and elastic storage to optimize performance and cost. The second part covers the main challenge in the design of a key-value store for DAQ: handling the bandwidth and capacity requirements. D

2019 SPDK US Summit