Flexible Data Placement

This section is a guide to FDP (Flexible Data Placement) support in xNVMe. It walks through examples of all the supported FDP log pages, Set/Get Features, and I/O Management commands, and also covers the Write command with placement hints from the perspective of fio's xNVMe ioengine.

Concepts and Prelude

FDP enhances the NVM Command Set by enabling host-guided data placement. It introduces the Reclaim Unit (RU): a logical representation of non-volatile storage within a Reclaim Group that the controller is able to physically erase without disturbing any other Reclaim Units.

A Placement Identifier is a data structure that specifies a Reclaim Group Identifier and a Placement Handle that references a Reclaim Unit.

A Placement Handle is a namespace-scoped handle that maps to an Endurance Group-scoped Reclaim Unit Handle, which references a Reclaim Unit in each Reclaim Group.

A Reclaim Unit Handle (RUH) is a controller resource that references a Reclaim Unit in each Reclaim Group.
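
The pi values shown by the CLI examples later in this guide follow the layout defined by TP4146: when the Reclaim Group Identifier Format (RGIF) field of the FDP configuration is non-zero, the most significant RGIF bits of the 16-bit Placement Identifier carry the Reclaim Group Identifier and the remaining bits carry the Placement Handle. A minimal C sketch of packing such an identifier (the helper name is illustrative, not part of the xNVMe API):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Pack a 16-bit Placement Identifier: the upper 'rgif' bits carry the
// Reclaim Group Identifier, the lower (16 - rgif) bits the Placement Handle.
static uint16_t pid_pack(uint8_t rgif, uint16_t rgid, uint16_t phndl)
{
    assert(rgif > 0 && rgif < 16);
    return (uint16_t)(((uint32_t)rgid << (16 - rgif)) | phndl);
}

int main(void)
{
    uint8_t rgif = 6; // as reported by the FDP configurations log page below

    // Reclaim Group 1, Placement Handle 0 => 1024 (0x0400)
    printf("pid: %u\n", pid_pack(rgif, 1, 0));
    return 0;
}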

For complete information on FDP, see the ratified technical proposal TP4146 (Flexible Data Placement, ratified 2022.11.30), available at https://nvmexpress.org/wp-content/uploads/NVM-Express-2.0-Ratified-TPs_20230111.zip

Get log page

FDP introduces four new log pages: FDP Configurations, Reclaim Unit Handle Usage, FDP Statistics, and FDP Events. All of them are Endurance Group scoped, so you need to specify the Endurance Group Identifier in the Log Specific Identifier field.

Each of the four log pages can be retrieved with the xNVMe CLI:

The FDP Configurations log page requires dynamic memory allocation, as there can be multiple configurations, each with multiple Reclaim Unit Handles. You will have to specify the data size in bytes. The command can be run like:

xnvme log-fdp-config /dev/nvme3n1 --data-nbytes=512 --lsi 0x1

The command should produce output similar to:

# Allocating and clearing buffer...
# Retrieving FDP configurations log page ...
xnvme_spec_log_fdp_conf:
  ncfg: 0
  version: 0
  size: 112
  config_desc: 0
  ds: 96
  fdp attributes: {    rgif: 6    fdpvwc: 0    fdpcv: 1    val: 0x86  }
  vss: 0
  nrg: 32
  nruh: 8
  maxpids: 127
  nns: 256
  runs: 40960
  erutl: 0
   - ruht[0]: 1
   - ruht[1]: 1
   - ruht[2]: 1
   - ruht[3]: 1
   - ruht[4]: 1
   - ruht[5]: 1
   - ruht[6]: 1
   - ruht[7]: 1
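
Under the hood, the CLI issues a Get Log Page admin command. Below is a minimal sketch of the same retrieval via the xNVMe C API, assuming xNVMe's xnvme_adm_log() helper; LID 0x20 (FDP Configurations) is defined by TP4146, and the struct name matches the one printed above. Treat the exact signature and field names as assumptions to be checked against your xNVMe headers:

#include <errno.h>
#include <stdio.h>
#include <libxnvme.h>

int main(void)
{
    struct xnvme_opts opts = xnvme_opts_default();
    struct xnvme_dev *dev = xnvme_dev_open("/dev/nvme3n1", &opts);
    size_t buf_nbytes = 512; // same as --data-nbytes in the CLI example
    int err;

    if (!dev) {
        perror("xnvme_dev_open");
        return errno;
    }

    struct xnvme_spec_log_fdp_conf *log = xnvme_buf_alloc(dev, buf_nbytes);
    struct xnvme_cmd_ctx ctx = xnvme_cmd_ctx_from_dev(dev);

    // Get Log Page with LID 0x20 (FDP Configurations, per TP4146); as with
    // the CLI's --lsi option, the Endurance Group Identifier belongs in the
    // Log Specific Identifier field (CDW11 bits 31:16), elided in this sketch.
    err = xnvme_adm_log(&ctx, 0x20, 0x0, 0x0, xnvme_dev_get_nsid(dev), 0, log, buf_nbytes);
    if (err) {
        fprintf(stderr, "xnvme_adm_log(): err: %d\n", err);
    } else {
        printf("ncfg: %u\n", log->ncfg); // field name as printed above
    }

    xnvme_buf_free(dev, log);
    xnvme_dev_close(dev);
    return err ? 1 : 0;
}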

For the FDP Statistics log page, the command can be run like:

xnvme log-fdp-stats /dev/nvme3n1 --lsi 0x1

The command should produce output similar to:

# Allocating and clearing buffer...
# Retrieving FDP statistics log page ...
xnvme_spec_log_fdp_stats:
  hbmw: [2097152, 0]
  mbmw: [2342912, 0]
  mbe: [0, 0]
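
The statistics are 128-bit counters printed as [low, high] 64-bit pairs: hbmw is Host Bytes with Metadata Written, mbmw is Media Bytes with Metadata Written, and mbe is Media Bytes Erased. Since the high halves are zero in this sample, 64-bit arithmetic suffices for a coarse write-amplification estimate, which for the values above comes out to roughly 1.12:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    // Low halves of the 128-bit counters from the output above;
    // the high halves are zero, so 64-bit arithmetic is enough here.
    uint64_t hbmw = 2097152; // Host Bytes with Metadata Written
    uint64_t mbmw = 2342912; // Media Bytes with Metadata Written

    // Coarse write-amplification estimate: media writes per host write.
    printf("write amplification: %.3f\n", (double)mbmw / (double)hbmw);
    return 0;
}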

Similar to the FDP Configurations log page, you will have to specify the number of Reclaim Unit Handle Usage descriptors to fetch. The command can be run like:

xnvme log-ruhu /dev/nvme3n1 --lsi 0x1 --limit 4

The command should produce output similar to:

# Allocating and clearing buffer...
# Retrieving ruhu-log ...
# 4 reclaim unit handle usage:
xnvme_spec_log_ruhu:
  nruh: 8
  - ruhu_desc[0]:  0x1
  - ruhu_desc[1]:  0
  - ruhu_desc[2]:  0
  - ruhu_desc[3]:  0
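
Each descriptor in this log page reports the Reclaim Unit Handle Attributes (RUHA) of one handle; per TP4146, 0 means unused, 1 means host specified, and 2 means controller specified, so in the output above only handle 0 has been used by the host. A trivial decode of the sample values (the helper name is illustrative):

#include <stdint.h>
#include <stdio.h>

// Reclaim Unit Handle Attributes (RUHA) values, per TP4146.
static const char *ruha_str(uint8_t ruha)
{
    switch (ruha) {
    case 0x0: return "unused";
    case 0x1: return "host specified";
    case 0x2: return "controller specified";
    default:  return "reserved";
    }
}

int main(void)
{
    uint8_t ruhu_desc[] = {0x1, 0x0, 0x0, 0x0}; // values from the output above

    for (unsigned i = 0; i < 4; ++i) {
        printf("ruhu_desc[%u]: %s\n", i, ruha_str(ruhu_desc[i]));
    }
    return 0;
}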

The FDP Events log page can contain multiple events. You will have to specify the number of events to fetch, and whether you want host or controller events; the latter is selected via the Log Specific Parameter field. The complete command can be run like:

xnvme log-fdp-events /dev/nvme3n1 --nsid 0x1 --limit 2 --lsi 0x1 --lsp 0x1

The command should produce output similar to:

# Allocating and clearing buffer...
# Retrieving fdp-events-log ...
# 2 fdp events log page entries:
xnvme_spec_log_fdp_events:
  nevents: 1
  - {type: 0, fdpef: 0x7, pid: 0, timestamp: 564656954151826, nsid: 1, rgid: 0, ruhid: 0, }
  - {type: 0, fdpef: 0, pid: 0, timestamp: 0, nsid: 0, rgid: 0, ruhid: 0, }
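
The fdpef field holds the FDP event flags. Per TP4146, bit 0 indicates that the Placement Identifier field is valid, bit 1 that the NSID field is valid, and bit 2 that the Location field is valid, so fdpef: 0x7 in the first entry above marks all three as valid. A small decode of that sample value:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    // FDP event flags from the first event entry above.
    uint8_t fdpef = 0x7;

    // Bit 0: Placement Identifier Valid, bit 1: NSID Valid,
    // bit 2: Location Valid (per TP4146).
    printf("piv: %u, nsidv: %u, lv: %u\n",
           fdpef & 0x1, (fdpef >> 1) & 0x1, (fdpef >> 2) & 0x1);
    return 0;
}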

Set and get-feature

FDP introduces two new Feature Identifiers for the Set Features and Get Features commands: Flexible Data Placement (FID 0x1D), which controls the operation of the FDP capability in the specified Endurance Group, and FDP Events (FID 0x1E), which controls whether a controller generates FDP events associated with a specific Reclaim Unit Handle.

xNVMe does not support the Namespace Management commands. We therefore cannot enable or disable FDP by sending a Set Features command to the Endurance Group, as that requires deleting all namespaces in the Endurance Group first. However, you can check the FDP capability with a Get Features command. The command can be run like:

xnvme feature-get /dev/nvme3n1 --fid 0x1d --cdw11 0x1

The command should produce output similar to:

# cmd_gfeat: {nsid: 0x1, fid: 0x1d, sel: 0x0}
feat: { fdpe: 1, fdpci: 0 }
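
The returned value is completion queue entry Dword 0 of the Get Features command: bit 0 (FDPE) reports whether FDP is enabled for the Endurance Group, and per TP4146 the FDP Configuration Index (FDPCI) occupies bits 15:08. A sketch of that decode:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    // Completion Dword 0 for Get Features with FID 0x1d; 0x1 matches the
    // output above: FDP enabled, configuration index 0.
    uint32_t cdw0 = 0x1;

    printf("fdpe: %u, fdpci: %u\n", cdw0 & 0x1, (cdw0 >> 8) & 0xFF);
    return 0;
}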

Command Dword 12 controls whether you enable or disable FDP events. You will have to specify the number of event types to enable or disable, and the Placement Handle they are associated with; these are packed into the Feat field. To enable all the FDP events, you can run a command like:

xnvme set-fdp-events /dev/nvme3n1 --fid 0x1e --feat 0x60000 --cdw12 0x1

The command should produce output similar to:

# cmd_sfeat: {nsid: 01, fid: 0x1e, save: 0x0, feat: 0x60000, cdw12: 0x1}
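
The Feat value (Command Dword 11) packs the Placement Handle into the lower 16 bits and the number of FDP event types into the upper 16 bits, so 0x60000 above means six event types on Placement Handle 0, with bit 0 of Command Dword 12 set to enable them. A sketch of how the value is composed (layout inferred from TP4146 and the example above):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t phndl = 0; // Placement Handle the events apply to
    uint16_t noet = 6;  // number of FDP event types to affect

    uint32_t feat = ((uint32_t)noet << 16) | phndl; // --feat (Command Dword 11)
    uint32_t cdw12 = 0x1; // bit 0: 1 = enable, 0 = disable

    printf("--feat 0x%x --cdw12 0x%x\n", feat, cdw12);
    return 0;
}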

You can get the status of the supported FDP events. Command Dword 11 remains the same as for the Set Features command, and you will have to specify the size in bytes of the data buffer that receives the event descriptors. You can run the command like:

xnvme feature-get /dev/nvme3n1 --fid 0x1e --cdw11 0xFF0000 --data-nbytes 510

The command should produce output similar to:

# cmd_gfeat: {nsid: 0x1, fid: 0x1e, sel: 0x0}
nevents: 6 }
{ type: 0, event enabled: 1 }
{ type: 0x1, event enabled: 1 }
{ type: 0x2, event enabled: 1 }
{ type: 0x3, event enabled: 1 }
{ type: 0x80, event enabled: 0 }
{ type: 0x81, event enabled: 0 }

I/O Management

Two I/O Management commands are introduced with FDP. These are I/O Management Send and I/O Management Receive.

I/O Management Receive supports the Reclaim Unit Handle Status operation. You will have to specify the number of Reclaim Unit Handle Status descriptors to fetch. You can run the command like:

xnvme fdp-ruhs /dev/nvme3n1 --limit 4

The command should produce output similar to:

# Allocating and clearing buffer...
# Retrieving ruhs ...
# 4 reclaim unit handle status:
xnvme_spec_ruhs:
  nruhsd: 128
  - ruhs_desc[0] : { pi: 0 ruhi: 0 earutr: 0 ruamw: 10}
  - ruhs_desc[1] : { pi: 1024 ruhi: 0 earutr: 0 ruamw: 10}
  - ruhs_desc[2] : { pi: 2048 ruhi: 0 earutr: 0 ruamw: 10}
  - ruhs_desc[3] : { pi: 3072 ruhi: 0 earutr: 0 ruamw: 10}
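
Given the rgif: 6 reported by the FDP configurations log page earlier, the lower 10 bits of each pi hold the Placement Handle and the upper 6 bits the Reclaim Group Identifier; the four descriptors above are therefore Placement Handle 0 in Reclaim Groups 0 through 3. A quick decode of the sample values:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t rgif = 6; // from the FDP configurations log page earlier
    uint16_t pi[] = {0, 1024, 2048, 3072}; // values from the output above

    for (unsigned i = 0; i < 4; ++i) {
        printf("pi: %4u => rgid: %u, phndl: %u\n", pi[i],
               pi[i] >> (16 - rgif), pi[i] & ((1u << (16 - rgif)) - 1));
    }
    return 0;
}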

I/O Management Send supports the Reclaim Unit Handle Update operation. You will have to specify a Placement Identifier for this. You can run the command like:

xnvme fdp-ruhu /dev/nvme3n1 --pid 0x0

The command should produce output similar to:

# Updating ruh ...

FIO xnvme ioengine

fio's xNVMe ioengine provides FDP support since the 3.35 release. This support is only available with the NVMe character device (e.g. /dev/ng0n1) and with user-space drivers such as SPDK. Since kernel support is limited to the NVMe character device, you can only use the FDP functionality with the xnvme_sync=nvme or xnvme_async=io_uring_cmd backends.

To enable FDP mode, you will have to specify the fio option fdp=1.

Two additional, optional FDP-specific fio options can be specified. These are:

fdp_pli=x,y,.. This specifies an index, or a comma-separated list of indices, of placement identifiers. The indices refer to the placement identifiers returned by the Reclaim Unit Handle Status command. If you don't specify this option, fio will use all the placement identifiers available from the Reclaim Unit Handle Status command.

fdp_pli_select=str You can specify random or roundrobin as the string literal. This tells fio which placement identifier to select next after every write operation. If you don't specify this option, fio will round robin over the available placement identifiers.

Have a look at example configuration file at: https://github.com/axboe/fio/blob/master/examples/xnvme-fdp.fio

This configuration tells fio to use the placement identifiers present at index 4 and 5 in the Reclaim Unit Handle Status command. By default, fio uses a round-robin mechanism for selecting the next placement identifier.
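
In case the link is unavailable, the FDP-relevant part of such a job file looks roughly like the sketch below; this is not the verbatim example file, but the option names are fio's documented FDP options:

[global]
ioengine=xnvme
xnvme_async=io_uring_cmd
filename=/dev/ng3n1
rw=randwrite
bs=4k
size=2M

[default]
fdp=1
; indices 4 and 5 from the reclaim unit handle status command
fdp_pli=4,5
; fdp_pli_select is left unset; round robin is the default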

Using the above-mentioned configuration, you can run the fio command like this:

fio ../tutorial/fdp/examples/xnvme-fdp.fio --section=default --ioengine=xnvme --xnvme_async=io_uring_cmd --filename=/dev/ng3n1

The command should produce output similar to:

default: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=xnvme, iodepth=1
fio-3.36
Starting 1 thread

default: (groupid=0, jobs=1): err= 0: pid=107765: Sat Feb  3 22:52:09 2024
  write: IOPS=6243, BW=24.4MiB/s (25.6MB/s)(2048KiB/82msec); 0 zone resets
    slat (nsec): min=340, max=9355, avg=718.33, stdev=417.33
    clat (usec): min=71, max=746, avg=156.46, stdev=59.42
     lat (usec): min=72, max=747, avg=157.18, stdev=59.49
    clat percentiles (usec):
     |  1.00th=[   82],  5.00th=[   87], 10.00th=[   95], 20.00th=[  105],
     | 30.00th=[  122], 40.00th=[  145], 50.00th=[  147], 60.00th=[  159],
     | 70.00th=[  176], 80.00th=[  208], 90.00th=[  223], 95.00th=[  249],
     | 99.00th=[  314], 99.50th=[  404], 99.90th=[  750], 99.95th=[  750],
     | 99.99th=[  750]
  lat (usec)   : 100=14.84%, 250=80.66%, 500=4.30%, 750=0.20%
  cpu          : usr=41.98%, sys=56.79%, ctx=3, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,512,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=24.4MiB/s (25.6MB/s), 24.4MiB/s-24.4MiB/s (25.6MB/s-25.6MB/s), io=2048KiB (2097kB), run=82-82msec

Note

If you see no output, then try running it as super-user or via sudo