iVersion: 3 with xFetch: None causes SEGFAULT under concurrent WAL #79

@russellromney

Bug

sqlite3_io_methods.iVersion is set to 3 in register_inner() (src/vfs.rs:316), but xFetch and xUnfetch are set to None (null function pointers).

SQLite interprets iVersion >= 3 as "xFetch and xUnfetch are implemented and callable." It does not null-check these function pointers before calling them. When SQLite's pager decides to memory-map a page (which happens during WAL checkpoint page reads under concurrent load), it calls xFetch through the null pointer, causing a SEGFAULT.
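The failure mode described above can be modelled in a few lines. This is only a sketch: `IoMethodsModel`, `caller_assumes_fetch`, and `would_call_null` are hypothetical names invented for illustration, and the struct keeps just the fields relevant to this bug rather than the full `sqlite3_io_methods` layout.

```rust
use std::os::raw::{c_int, c_void};

/// Hypothetical, cut-down stand-in for `sqlite3_io_methods`.
/// Only the fields relevant to this bug are modelled.
pub struct IoMethodsModel {
    pub i_version: c_int,
    pub x_fetch:
        Option<unsafe extern "C" fn(*mut c_void, i64, c_int, *mut *mut c_void) -> c_int>,
    pub x_unfetch: Option<unsafe extern "C" fn(*mut c_void, i64, *mut c_void) -> c_int>,
}

/// Models the caller's decision: it looks at the version number alone,
/// not at the pointers themselves.
pub fn caller_assumes_fetch(m: &IoMethodsModel) -> bool {
    m.i_version >= 3
}

/// True when the table claims version 3 but leaves either pointer null,
/// which is the combination reported in this issue.
pub fn would_call_null(m: &IoMethodsModel) -> bool {
    caller_assumes_fetch(m) && (m.x_fetch.is_none() || m.x_unfetch.is_none())
}
```

In this model, `iVersion: 3` with both pointers `None` makes `would_call_null` return `true`, while the same table with `iVersion: 2` returns `false`.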

Reproduction

Any VFS built on sqlite-plugin can hit this in WAL mode with concurrent readers and writers. The crash is intermittent because SQLite's mmap heuristic doesn't always choose to mmap pages.

Minimal reproduction: 4 reader threads + 2 writer threads, WAL mode, 2000 rows of 5 KB each, running for 3 seconds. ~66% crash rate on macOS aarch64.

ASAN output:

==59391==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
(pc 0x000000000000 bp ... sp ... T8)
==59391==Hint: pc points to the zero page.

    #0 0x000000000000  (<unknown module>)
    #1 ... in sqlite3WalCheckpoint
    #2 ... in sqlite3PagerCheckpoint
    #3 ... in sqlite3BtreeCheckpoint
    ...

pc=0x0 confirms a null function pointer call (not a null data pointer dereference).

Fix

Set iVersion: 2 instead of iVersion: 3. Version 2 declares SHM/WAL support (xShmMap, xShmLock, xShmBarrier, xShmUnmap) without claiming xFetch/xUnfetch support.

let io_methods = ffi::sqlite3_io_methods {
    iVersion: 2,  // was 3
    // ...
    xFetch: None,
    xUnfetch: None,
};

With this one-line change the crash no longer reproduces: the concurrent stress test passes 30/30 runs, compared with roughly 10/30 before.

Context

From the SQLite header (sqlite3.h):

struct sqlite3_io_methods {
  int iVersion;
  // ... version 1 methods ...
  /* Methods above are valid for version 1 */
  int (*xShmMap)(...);
  int (*xShmLock)(...);
  void (*xShmBarrier)(...);
  int (*xShmUnmap)(...);
  /* Methods above are valid for version 2 */
  int (*xFetch)(...);
  int (*xUnfetch)(...);
  /* Methods above are valid for version 3 */
};

"Valid for version N" means "guaranteed non-null and callable when iVersion >= N." Setting iVersion=3 with null xFetch violates this contract.
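One way to keep the declared version from drifting away from what is actually implemented is to derive it from the populated method groups instead of hard-coding it. The sketch below assumes that reading of the contract; `Implemented` and `highest_honest_version` are hypothetical names and are not part of sqlite-plugin.

```rust
/// Hypothetical summary of which optional method groups a VFS implements.
#[derive(Clone, Copy)]
pub struct Implemented {
    /// All of xShmMap, xShmLock, xShmBarrier, xShmUnmap are non-null.
    pub shm: bool,
    /// Both xFetch and xUnfetch are non-null.
    pub fetch: bool,
}

/// Highest iVersion the method table can declare without promising
/// a method group that is missing.
pub fn highest_honest_version(i: Implemented) -> i32 {
    match (i.shm, i.fetch) {
        (true, true) => 3,
        (true, false) => 2,
        // Version 3 implies version 2, so without the SHM group
        // the table can only claim version 1.
        (false, _) => 1,
    }
}
```

For the table in this issue (SHM methods present, xFetch and xUnfetch absent) this yields 2, matching the proposed fix.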

Longer term

Once sqlite-plugin implements actual xFetch/xUnfetch (memory-mapped page reads), iVersion can be bumped back to 3. This would enable zero-copy page reads for VFS implementations that support it, which is a significant performance win for warm-cache workloads.
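Short of a real memory-mapped implementation, an interim option might be to install non-null stubs that always decline to map. My understanding is that SQLite treats an `SQLITE_OK` return with `*pp` set to null as "no mapping available" and falls back to a normal read, as the built-in unix VFS does when mmap is disabled, but that should be verified against the SQLite source before relying on it. The parameter lists follow the `xFetch`/`xUnfetch` declarations in sqlite3.h; `Sqlite3File`, `x_fetch_decline`, and `x_unfetch_noop` are hypothetical names, and a real VFS would use the `sqlite3_file` type from its bindings.

```rust
use std::os::raw::{c_int, c_void};
use std::ptr;

/// SQLite's success code (0 in sqlite3.h).
pub const SQLITE_OK: c_int = 0;

/// Opaque stand-in for `sqlite3_file` (hypothetical; illustration only).
#[repr(C)]
pub struct Sqlite3File {
    _private: [u8; 0],
}

/// Hypothetical xFetch stub: never maps anything.
/// Leaves `*pp` null and reports success so the caller can fall back
/// to an ordinary read.
pub unsafe extern "C" fn x_fetch_decline(
    _file: *mut Sqlite3File,
    _offset: i64,
    _amount: c_int,
    pp: *mut *mut c_void,
) -> c_int {
    if !pp.is_null() {
        unsafe { *pp = ptr::null_mut() };
    }
    SQLITE_OK
}

/// Hypothetical xUnfetch stub: nothing was ever mapped, so there is
/// nothing to release.
pub unsafe extern "C" fn x_unfetch_noop(
    _file: *mut Sqlite3File,
    _offset: i64,
    _page: *mut c_void,
) -> c_int {
    SQLITE_OK
}
```

With stubs like these in place of `None`, the table would no longer contain null pointers at version 3, though it would gain none of the zero-copy benefit described above.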

Found while building turbolite.
