ublk: the new and improved way to serve SPDK storage locally!
SPDK v23.01 added a new ublk library that can expose SPDK block devices as Linux kernel block devices using the ublk framework introduced in Linux 6.0. The Linux kernel documentation describes ublk as:
ublk is a generic framework for implementing block device logic from userspace. The motivation behind it is that moving virtual block drivers into userspace, such as loop, nbd and similar can be very helpful.
Why is this useful to SPDK? Let’s start with looking back at how SPDK has evolved over the years.
SPDK started with a polled-mode userspace NVMe driver, and soon followed with an iSCSI target that could expose NVMe SSDs over a network fabric. So very early on, SPDK was focused not just on direct-attached storage, but also on storage networking.
But then we looked at what QEMU and DPDK had done with vhost for virtual NICs, and how we could extend it to storage virtualization. SPDK v17.03 added a vhost-scsi target to allow SPDK to serve its block storage to QEMU-based virtual machines. Since then, SPDK storage virtualization has grown to also include vhost-blk and nvme-vfio-user.
Serving SPDK block storage to virtual machines is great, but what about serving that storage directly to other processes on the host - i.e. containers? This requires a driver in the host kernel that is able to connect to an SPDK storage application running on the same host.
SPDK v17.07 provided the initial attempt at this with nbd (network block device). The name is really a misnomer here - nbd simply allows a userspace process to act as the server for a block device presented by the Linux kernel, no network required. But nbd has severe scalability and performance problems, so SPDK nbd support was primarily used for non-performance-critical tasks - e.g. using standard Linux tools to write a GPT partition table to an SPDK block device, rather than SPDK writing its own tools to do that.
In November 2018, NVMe ratified the NVMe/TCP specification. This enabled a more efficient way of presenting SPDK block storage to host processes via NVMe over TCP loopback. We use the SPDK NVMe-oF target as the server, and then the kernel can connect to it using nvme-cli.
This has been the recommended way to present SPDK block storage to host processes. Solutions such as OpenEBS/Mayastor do exactly this. Performance scales better, due to NVMe’s inherent multi-queue architecture. But this has always felt a bit kludgy, using TCP loopback as the communication path between the application and the SPDK target. Wouldn’t it be nice to have something a bit more native for this use case?
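For reference, the loopback path looks roughly like this. It is a minimal sketch, assuming the SPDK target is already running with a bdev named malloc0; the subsystem NQN, serial number, and port are placeholder values chosen for illustration.

# SPDK side: export malloc0 over NVMe/TCP on the loopback address
scripts/rpc.py nvmf_create_transport -t TCP
scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK00000000000001
scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 malloc0
scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t tcp -a 127.0.0.1 -s 4420

# Kernel side: load the NVMe/TCP initiator and connect with nvme-cli
sudo modprobe nvme-tcp
sudo nvme connect -t tcp -a 127.0.0.1 -s 4420 -n nqn.2016-06.io.spdk:cnode1

Every I/O issued to the resulting /dev/nvmeXnY device then traverses the kernel NVMe/TCP initiator and the TCP loopback stack before it reaches SPDK - that round trip through the networking stack is the part that feels kludgy.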
Enter ublk. It was designed specifically for this purpose. It is all based around io_uring and blk-mq (multi-queue) to help with both performance and scalability. Here’s a simple example to get ublk up and running with SPDK.
- Make sure the ublk_drv module is loaded on your system (a quick way to check is sketched after this list). This requires Linux kernel version 6.0 or greater.
- Configure SPDK to build the ublk library:
./configure --with-ublk
- Build SPDK:
make -j
- Start SPDK target:
build/bin/spdk_tgt
- From a separate terminal, execute the following RPCs:
scripts/rpc.py bdev_malloc_create 100 512 -b malloc0
: this creates a 100MB malloc block device with a 512-byte block size
scripts/rpc.py ublk_create_target
: this initializes the ublk target
scripts/rpc.py ublk_start_disk malloc0 0 -q 4 -d 64
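Backing up to the first step for a moment: if you are not sure whether ublk is available on your system, a quick check with standard kernel tooling (nothing SPDK-specific here) looks like this.

uname -r                  # should report 6.0 or newer
sudo modprobe ublk_drv    # load the ublk kernel driver if it is built as a module
lsmod | grep ublk         # confirm the module is loaded
ls -l /dev/ublk-control   # control node created by ublk_drv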
Let’s explain that last RPC, ublk_start_disk, in a bit more detail.
- malloc0 : specifies which SPDK block device will be exported over ublk
- 0 : specifies the index of the ublk device node. In this case, it will map to /dev/ublkb0.
- -q 4 : specifies the creation of 4 queues for this device. blk-mq will share these 4 queues among the available CPUs on the host system.
- -d 64 : specifies a queue depth of 64 for each queue.
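Since -q and -d map onto blk-mq’s hardware queue count and queue depth, one way to confirm the resulting configuration is to read the generic blk-mq sysfs attributes once the device exists. The paths below assume the device came up as /dev/ublkb0 with the flags shown above.

ls /sys/block/ublkb0/mq/              # one subdirectory per hardware queue; expect 0 1 2 3
cat /sys/block/ublkb0/mq/0/nr_tags    # outstanding requests allowed per queue; expect 64
cat /sys/block/ublkb0/mq/0/cpu_list   # which host CPUs are mapped to queue 0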
After you’ve run these RPCs, you should see /dev/ublkb0 on your system. Run some simple dd commands against it, or your favorite fio script!
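For example, something along these lines. The dd and fio parameters are purely illustrative, and the teardown RPC names at the end are the ublk counterparts that ship alongside ublk_start_disk in SPDK v23.01; double-check them against your SPDK build.

# Simple write test with dd (O_DIRECT to bypass the page cache)
sudo dd if=/dev/zero of=/dev/ublkb0 bs=128k count=1024 oflag=direct

# 4K random read test with fio, using io_uring on the application side as well
sudo fio --name=ublk-randread --filename=/dev/ublkb0 --ioengine=io_uring \
    --direct=1 --rw=randread --bs=4k --iodepth=64 --numjobs=4 \
    --time_based --runtime=30 --group_reporting

# When you are done, tear everything down
scripts/rpc.py ublk_stop_disk 0
scripts/rpc.py ublk_destroy_target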
Stay tuned for additional posts on ublk, focused on specifics of the SPDK ublk design as well as performance comparisons against NVMe/TCP loopback.
Thanks for reading!