Skip to content

Use virtiofsd for sharing file system data with host#88

Open
d-e-s-o wants to merge 2 commits intodanobi:masterfrom
d-e-s-o:topic/virtiofs
Open

Use virtiofsd for sharing file system data with host#88
d-e-s-o wants to merge 2 commits intodanobi:masterfrom
d-e-s-o:topic/virtiofs

Conversation

@d-e-s-o
Copy link
Contributor

@d-e-s-o d-e-s-o commented Aug 28, 2024

Switch over to using virtiofsd for sharing file system data with the
host. virtiofs is a file system designed for the needs of virtual
machines and environments. That is in contrast to 9P fs, which we
currently use for sharing data with the host, which is first and
foremost a network file system.
9P is problematic if for no other reason that it lacks proper support
for usage of the "open-unlink-fstat idiom", in which files are unlinked
and later referenced via file descriptor (see #83). virtiofs does not
have this problem.

This change replaces usage of 9P with that of virtiofs. In order to
work, virtiofs needs a user space server. The current state-of-the-art
implementation (virtiofsd) is implemented in Rust and so we interface
directly with the library. Most of this code is extracted straight from
virtiofsd, as it's a lot of boilerplate. An alternative approach is to
install the binary via distribution packages or from crates.io, but
availability (and discovery) can be a bit of a challenge. Note that this
now means that both libcap-ng as well as libseccomp need to be
installed.

I benchmarked both the current master as well as this version with a
bare-bones custom kernel:
Benchmark 1: target/release/vmtest -k bzImage-9p 'echo test'
Time (mean ± σ): 1.316 s ± 0.087 s [User: 0.462 s, System: 1.104 s]
Range (min … max): 1.232 s … 1.463 s 10 runs

Benchmark 1: target/release/vmtest -k bzImage-virtiofsd 'echo test'
Time (mean ± σ): 1.244 s ± 0.011 s [User: 0.307 s, System: 0.358 s]
Range (min … max): 1.227 s … 1.260 s 10 runs

So it seems there is a slight speed up, on average (and significantly
less system time being used). This is great, but I suspect a more
pronounced speed advantage will be visible when working with large
files, in which virtiofs is said to significantly outperform 9P
(typically >2x from what I understand, but I have not done any
benchmarks of that nature).

A few other notes:

  • we solely rely on guest level read-only mounts to enforce read-only
    state. The virtiofsd recommended way is to use read-only bind mounts
    [0], but doing so would require root.
  • we are not using DAX, because it still is still incomplete and
    apparently requires building Qemu (?) from source. In any event, it
    should not change anything functionally and be solely a performance
    improvement.

I have adjusted the configs, but because I don't have Docker handy I
can't really create those kernel. CI seems incapable of producing the
artifacts without doing a fully-blown release dance. No idea what empty
is about, really. I suspect the test failures we see are because it
lacks support?

[0] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md?ref_type=heads#faq

@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 2 times, most recently from 62c084e to 3596ee3 Compare August 29, 2024 00:01
@danielocfb
Copy link

danielocfb commented Aug 29, 2024

Can't undraft right now, but this change can probably be looked at. Please advise re: empty kernels & test failures (which I suspect are related to them as well as the images containing incompat kernels).

@danielocfb danielocfb force-pushed the topic/virtiofs branch 6 times, most recently from f382adb to be7aefd Compare August 29, 2024 22:20
@d-e-s-o d-e-s-o changed the title [RFC][prototype] Use virtiofsd for sharing file system data with host Use virtiofsd for sharing file system data with host Aug 30, 2024
@d-e-s-o d-e-s-o marked this pull request as ready for review August 30, 2024 04:46
@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 3 times, most recently from f022560 to 2d776c6 Compare August 30, 2024 13:08
@d-e-s-o
Copy link
Contributor Author

d-e-s-o commented Aug 30, 2024

Found a few more things we could remove -> decreased patch size.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop these changes, please. all the config happens in scripts/docker/config_kernel.sh

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump; drop kconfig changes please. custom config goes in 2466d90

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@d-e-s-o d-e-s-o force-pushed the topic/virtiofs branch 3 times, most recently from e149558 to f83412b Compare September 5, 2024 15:40
Copy link
Owner

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm. but I think the non-vendored libcap-ng and libseccomp deps break the static builds:

$ RUSTFLAGS="-C target-feature=+crt-static" cargo build --target x86_64-unknown-linux-gnu
[...]
  = note: /usr/bin/ld: cannot find -lm: No such file or directory
          /usr/bin/ld: cannot find -lc: No such file or directory
          /usr/bin/ld: cannot find -lc: No such file or directory
          collect2: error: ld returned 1 exit status


error: could not compile `vmtest` (bin "vmtest") due to 1 previous error

looking at capng and libseccomp-sys, I see they support static linking:

https://github.com/slp/capng/blob/25304f0b998d7a8a44d7e3b03598d73b56e2a555/build.rs#L17
https://github.com/libseccomp-rs/libseccomp-rs/blob/7d372c4cb2b11a666c58ed07b7c07f211092c55e/libseccomp-sys/build.rs#L23

and ubuntu ships static archives for both:

root@e4ca9ec837d8:/# ls /usr/lib/x86_64-linux-gnu/ -l | grep -e "\.a$"
-rw-r--r--. 1 root root      1918 Aug  8 14:47 libBrokenLocale.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libanl.a
-rw-r--r--. 1 root root   6134904 Aug  8 14:47 libc.a
-rw-r--r--. 1 root root      5594 Aug  8 14:47 libc_nonshared.a
-rw-r--r--. 1 root root     34208 Mar 31 07:45 libcap-ng.a
-rw-r--r--. 1 root root    271404 Apr  8 16:09 libcrypt.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libdl.a
-rw-r--r--. 1 root root      1686 Mar 31 07:45 libdrop_ambient.a
-rw-r--r--. 1 root root      1210 Aug  8 14:47 libg.a
-rw-r--r--. 1 root root   2607988 Aug  8 14:47 libm-2.39.a
-rw-r--r--. 1 root root       132 Aug  8 14:47 libm.a
-rw-r--r--. 1 root root      1480 Aug  8 14:47 libmcheck.a
-rw-r--r--. 1 root root   1830076 Aug  8 14:47 libmvec.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libpthread.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libpthread_nonshared.a
-rw-r--r--. 1 root root     88100 Aug  8 14:47 libresolv.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 librt.a
-rw-r--r--. 1 root root    273376 Aug  9 02:33 libseccomp.a
-rw-r--r--. 1 root root         8 Aug  8 14:47 libutil.a

Lemme mess around with it a bit and see if we can get away with just tweaking the CI

@danielocfb
Copy link

danielocfb commented Sep 5, 2024

Lemme mess around with it a bit and see if we can get away with just tweaking the CI

Perhaps try setting LIBSECCOMP_LIB_TYPE and LIBCAPNG_LINK_TYPEto static.

Edit: Oops, sorry, I missed that you found those already. Haha.

kernel-patches-daemon-bpf bot pushed a commit to kernel-patches/bpf that referenced this pull request Sep 25, 2024
danobi/vmtest is going to migrate from using 9p to using virtio_fs to
mount the local rootfs: danobi/vmtest#88

BPF CI uses danobi/vmtest to run bpf selftests and will need to support
VIRTIO_FS.

This change enables new kconfigs to be able to support the upcoming
danobi/vmtest.

Tested by building a new kernel with those config and confirming it
would successfully run with 9p (currently what is used by vmtest), and
with virtio_fs (using a local build of vmtest).

  $ vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE    FSTYPE OPTIONS
  /      /dev/root 9p     rw,relatime,cache=5,access=client,msize=512000,trans=virtio
  $ /home/chantra/local/danobi-vmtest/target/debug/vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Initializing host environment
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE FSTYPE   OPTIONS
  /      rootfs virtiofs rw,relatime

Changes in v2:
* Sorted configs alphabetically

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
@d-e-s-o
Copy link
Contributor Author

d-e-s-o commented Sep 25, 2024

Got it working without these libraries. Updated accordingly.

ldd vmtest/target/release/vmtest
        linux-vdso.so.1 (0x00007ffe6dfc0000)
        libgcc_s.so.1 => /usr/lib/gcc/x86_64-pc-linux-gnu/13/libgcc_s.so.1 (0x00007f09a2d46000)
        libm.so.6 => /usr/lib64/libm.so.6 (0x00007f09a2c9e000)
        libc.so.6 => /usr/lib64/libc.so.6 (0x00007f09a2ae2000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f09a31f4000)

Upstream pull requests:

kernel-patches-daemon-bpf bot pushed a commit to kernel-patches/bpf that referenced this pull request Sep 26, 2024
danobi/vmtest is going to migrate from using 9p to using virtio_fs to
mount the local rootfs: danobi/vmtest#88

BPF CI uses danobi/vmtest to run bpf selftests and will need to support
VIRTIO_FS.

This change enables new kconfigs to be able to support the upcoming
danobi/vmtest.

Tested by building a new kernel with those config and confirming it
would successfully run with 9p (currently what is used by vmtest), and
with virtio_fs (using a local build of vmtest).

  $ vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE    FSTYPE OPTIONS
  /      /dev/root 9p     rw,relatime,cache=5,access=client,msize=512000,trans=virtio
  $ /home/chantra/local/danobi-vmtest/target/debug/vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Initializing host environment
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE FSTYPE   OPTIONS
  /      rootfs virtiofs rw,relatime

Changes in v2:
* Sorted configs alphabetically

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
kernel-patches-daemon-bpf-rc bot pushed a commit to kernel-patches/bpf-rc that referenced this pull request Sep 26, 2024
danobi/vmtest is going to migrate from using 9p to using virtio_fs to
mount the local rootfs: danobi/vmtest#88

BPF CI uses danobi/vmtest to run bpf selftests and will need to support
VIRTIO_FS.

This change enables new kconfigs to be able to support the upcoming
danobi/vmtest.

Tested by building a new kernel with those config and confirming it
would successfully run with 9p (currently what is used by vmtest), and
with virtio_fs (using a local build of vmtest).

  $ vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE    FSTYPE OPTIONS
  /      /dev/root 9p     rw,relatime,cache=5,access=client,msize=512000,trans=virtio
  $ /home/chantra/local/danobi-vmtest/target/debug/vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Initializing host environment
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE FSTYPE   OPTIONS
  /      rootfs virtiofs rw,relatime

Changes in v2:
* Sorted configs alphabetically

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
kernel-patches-daemon-bpf bot pushed a commit to kernel-patches/bpf that referenced this pull request Sep 27, 2024
danobi/vmtest is going to migrate from using 9p to using virtio_fs to
mount the local rootfs: danobi/vmtest#88

BPF CI uses danobi/vmtest to run bpf selftests and will need to support
VIRTIO_FS.

This change enables new kconfigs to be able to support the upcoming
danobi/vmtest.

Tested by building a new kernel with those config and confirming it
would successfully run with 9p (currently what is used by vmtest), and
with virtio_fs (using a local build of vmtest).

  $ vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE    FSTYPE OPTIONS
  /      /dev/root 9p     rw,relatime,cache=5,access=client,msize=512000,trans=virtio
  $ /home/chantra/local/danobi-vmtest/target/debug/vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Initializing host environment
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE FSTYPE   OPTIONS
  /      rootfs virtiofs rw,relatime

Changes in v2:
* Sorted configs alphabetically

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/bpf/20240925002210.501266-1-chantr4@gmail.com
kernel-patches-daemon-bpf-rc bot pushed a commit to kernel-patches/bpf-rc that referenced this pull request Sep 27, 2024
danobi/vmtest is going to migrate from using 9p to using virtio_fs to
mount the local rootfs: danobi/vmtest#88

BPF CI uses danobi/vmtest to run bpf selftests and will need to support
VIRTIO_FS.

This change enables new kconfigs to be able to support the upcoming
danobi/vmtest.

Tested by building a new kernel with those config and confirming it
would successfully run with 9p (currently what is used by vmtest), and
with virtio_fs (using a local build of vmtest).

  $ vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE    FSTYPE OPTIONS
  /      /dev/root 9p     rw,relatime,cache=5,access=client,msize=512000,trans=virtio
  $ /home/chantra/local/danobi-vmtest/target/debug/vmtest -k arch/x86/boot/bzImage "findmnt /"
  => bzImage
  ===> Initializing host environment
  ===> Booting
  ===> Setting up VM
  ===> Running command
  TARGET SOURCE FSTYPE   OPTIONS
  /      rootfs virtiofs rw,relatime

Changes in v2:
* Sorted configs alphabetically

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Acked-by: Daniel Xu <dxu@dxuuu.xyz>
Copy link
Owner

@danobi danobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm! seems like there's a bit of a delay on virtiofsd upstream side, so we could merge sooner than them, esp. b/c it looks fairly uncontentious.

thanks for pushing this through!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump; drop kconfig changes please. custom config goes in 2466d90

@danielocfb
Copy link

seems like there's a bit of a delay on virtiofsd upstream side, so we could merge sooner than them, esp. b/c it looks fairly uncontentious.

I'd be fine with that, but I think it would prevent you from publishing until we drop the git dependency, if I remember correctly.

@danobi
Copy link
Owner

danobi commented Oct 1, 2024

Right, forgot about that. Let's give upstream a bit of time first then

@d-e-s-o
Copy link
Contributor Author

d-e-s-o commented Nov 7, 2024

@danielocfb danielocfb force-pushed the topic/virtiofs branch 3 times, most recently from 42ea6a6 to 91aff2e Compare November 7, 2024 19:48
Cargo.toml Outdated
tinytemplate = "1.2.1"
toml = "0.5.9"
vhost-user-backend = "0.15.0"
virtiofsd = { version = "1.11.1", default-features = false, git = "https://gitlab.com/d-e-s-o/virtiofsd.git", branch = "topic/different-caps-backend" }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it still need to be pinned to your fork?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danobi
Copy link
Owner

danobi commented Dec 9, 2024

[reminder to delete the uid hint in #104 when this merges]

danobi added a commit to danobi/bpftrace that referenced this pull request Jan 1, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
@danobi
Copy link
Owner

danobi commented Jan 1, 2025

I think bpftrace will also need this for its vmtest in CI work: bpftrace/bpftrace#3657

danobi added a commit to danobi/bpftrace that referenced this pull request Jan 4, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 5, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 6, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
danobi added a commit to danobi/bpftrace that referenced this pull request Jan 10, 2025
vmtest uses 9pfs under the hood. But
danobi/vmtest#88 switches the FS to virtiofs
which should be more reliable with stuff like this.

So for now, disable this test on 9pfs rootfs.
Move the gen_sock() helper function into the util module, where it will
be accessible by more than just the qemu module. Also switch over to
using std::env::temp_dir() for retrieving the path to the hosts temp
directory instead of hard coding it.

Signed-off-by: Daniel Müller <deso@posteo.net>
Switch over to using virtiofsd for sharing file system data with the
host. virtiofs is a file system designed for the needs of virtual
machines and environments. That is in contrast to 9P fs, which we
currently use for sharing data with the host, which is first and
foremost a network file system.
9P is problematic if for no other reason that it lacks proper support
for usage of the "open-unlink-fstat idiom", in which files are unlinked
and later referenced via file descriptor (see danobi#83). virtiofs does not
have this problem.

This change replaces usage of 9P with that of virtiofs. In order to
work, virtiofs needs a user space server. The current state-of-the-art
implementation (virtiofsd) is implemented in Rust and so we interface
directly with the library. Most of this code is extracted straight from
virtiofsd, as it's a lot of boilerplate. An alternative approach is to
install the binary via distribution packages or from crates.io, but
availability (and discovery) can be a bit of a challenge.

I benchmarked both the current master as well as this version with a
bare-bones custom kernel:
  Benchmark 1: target/release/vmtest -k bzImage-9p 'echo test'
    Time (mean ± σ):      1.316 s ±  0.087 s    [User: 0.462 s, System: 1.104 s]
    Range (min … max):    1.232 s …  1.463 s    10 runs

  Benchmark 1: target/release/vmtest -k bzImage-virtiofsd 'echo test'
    Time (mean ± σ):      1.244 s ±  0.011 s    [User: 0.307 s, System: 0.358 s]
    Range (min … max):    1.227 s …  1.260 s    10 runs

So it seems there is a slight speed up, on average (and significantly
less system time being used). This is great, but I suspect a more
pronounced speed advantage will be visible when working with large
files, in which virtiofs is said to significantly outperform 9P
(typically >2x from what I understand, but I have not done any
benchmarks of that nature).

A few other notes:
- we solely rely on guest level read-only mounts to enforce read-only
  state. The virtiofsd recommended way is to use read-only bind mounts
  [0], but doing so would require root.
- we are not using DAX, because it still is still incomplete and
  apparently requires building Qemu (?) from source. In any event, it
  should not change anything functionally and be solely a performance
  improvement. Given that we are not regressing in terms of performance,
  this is strictly future work.

Some additional resources worth keeping around:
  - https://virtio-fs.gitlab.io/howto-boot.html
  - https://virtio-fs.gitlab.io/howto-qemu.html

[0] https://gitlab.com/virtio-fs/virtiofsd/-/blob/main/README.md?ref_type=heads#faq

Closes: danobi#16
Closes: danobi#83

Signed-off-by: Daniel Müller <deso@posteo.net>
@cccheng
Copy link

cccheng commented Nov 17, 2025

Virtiofs is expected to offer better latency performance compared to 9pfs. Is there a plan to merge this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants