Fix readonly check for vdev user properties
VDEV_PROP_USERPROP is equal do VDEV_PROP_INVAL and so is not a real
property. That's why vdev_prop_readonly() does not work right for
it. In particular it may declare all vdev user properties readonly
on FreeBSD.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16890
Skip iterating over snapshots for share properties
Setting sharenfs and sharesmb properties on a dataset can become costly
if there are large number of snapshots, since setting the share
properties iterates over all snapshots present for a dataset. If it is
the root dataset for which we are trying to set the share property,
snapshots for all child datasets and their children will also be
iterated.
There is no need to iterate over snapshots for share properties
because we do not allow share properties or any other property,
to be set on a snapshot itself execpt for user properties.
This commit skips iterating over snapshots for share properties,
instead iterate over all child dataset and their children for share
properties.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem at ixsystems.com>
Closes #16877
zfs_main: fix alignment on props usage output
I guess we've got some long property names since this was first set up!
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16883
CI: Fix FreeBSD 13.4 STABLE build
In #16869 we added FreeBSD 13.4 STABLE, but forget the special
thing, that the virtio nic within FreeBSD 13.x is buggy.
This fix adds the needed rtl8139 nic to the VM.
Reviewed-by: George Melikov <mail at gmelikov.ru>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs at mcmilk.de>
Closes #16885
Fix compile-time warnings caused by duplicate struct typedefs (#16880)
Some compiler/versions warn these typedefs according to #16660.
The platform specific header sys/abd_os.h shouldn't define or use abd_t,
as it's defined in its non-platform specific consumer sys/abd.h.
Do the same as what FreeBSD header does.
Original-patch-by: Tomohiro Kusumi <kusumi.tomohiro at gmail.com>
Signed-off-by: Rob Norris <robn at despairlabs.com>
(cherry picked from commit a9851ea3dd6af6f789e221f2ccd7ad43561eff81)
Co-authored-by: Tomohiro Kusumi <kusumi.tomohiro at gmail.com>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Tony Hutter <hutter2 at llnl.gov>
zprop: fix value help for ZPOOL_PROP_CAPACITY
It's a percentage and documented as such, but we were showing it as
<size>.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: George Melikov <mail at gmelikov.ru>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16881
CI: Add FreeBSD 14.2 RELEASE+STABLE builds
Update the CI to include FreeBSD 14.2 as a regularly tested platform.
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #16869
config: fix dequeue_signal check for kernels <4.20
Before 4.20, kernel_siginfo_t was just called siginfo_t. This was
causing the kthread_dequeue_signal_3arg_task check, which uses
kernel_siginfo_t, to fail on older kernels.
In d6b8c17f1, we started checking for the "new" three-arg
dequeue_signal() by testing for the "old" version. Because that test is
explicitly using kernel_siginfo_t, it would fail, leading to the build
trying to use the new three-arg version, which would then not comile.
This commit fixes that by avoiding checking for the old 3-arg
dequeue_signal entirely. Instead, we check for the new one, as well as
the 4-arg form, and we use the old form as a fallback. This way, we
never have to test for it explicitly, and once we're building
HAVE_SIGINFO will make sure we get the right kernel_siginfo_t for it, so
everything works out nice.
Original-patch-by: Finix <yancw at info2soft.com>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Use pin_user_pages API for Direct I/O requests
As of kernel v5.8, pin_user_pages* interfaced were introduced. These
interfaces use the FOLL_PIN flag. This is preferred interface now for
Direct I/O requests in the kernel. The reasoning for using this new
interface for Direct I/O requests is explained in the kernel
documenetation:
Documentation/core-api/pin_user_pages.rst
If pin_user_pages_unlocked is available, the all Direct I/O requests
will use this new API to stay uptodate with the kernel API requirements.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Brian Atkinson <batkinson at lanl.gov>
Closes #16856
Removing old code outside of 4.18 kernsls
There were checks still in place to verify we could completely use
iov_iter's on the Linux side. All interfaces are available as of kernel
4.18, so there is no reason to check whether we should use that
interface at this point. This PR completely removes the UIO_USERSPACE
type. It also removes the check for the direct_IO interface checks.
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Brian Atkinson <batkinson at lanl.gov>
Closes #16856
simd_stat: fix undefined CONFIG_KERNEL_MODE_NEON error on armel
CONFIG_KERNEL_MODE_NEON depends on CONFIG_NEON. Neither is defined
on armel. Add a guard to avoid compilation errors.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen at outlook.com>
Closes #16871
Fix stray "no" in configure output
This is purely a cosmetic fix which removes a stray "no" from
the configure output.
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Closes #16867
Fix use-afer-free regression in RAIDZ expansion
We should not dereference rra after the last zio_nowait() is called.
It seems very unlikely, but ASAN in ztest managed to catch it.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16868
Remount datasets on soft-reboot
The one-shot zfs-mount.service is incorrectly deemed active by
Systemd after a systemctl soft-reboot. As such, soft-rebooting
prevents zfs mount -a from being ran automatically.
This commit makes it so that zfs-mount.service is marked as being
undone by the time umount.target is reached, so that zfs.target then
pulls it in again and gets it restarted after a soft reboot.
Reviewed by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: kotauskas <v.toncharov at gmail.com>
Closes #16845
flush: only detect lack of flush support in one place
It seems there's no good reason for vdev_disk & vdev_geom to explicitly
detect no support for flush and set vdev_nowritecache. Instead, just
signal it by setting the error to ENOTSUP, and let zio_vdev_io_assess()
take care of it in one place.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Closes #16855
flush: don't report flush error when disabling flush support
The first time a device returns ENOTSUP in repsonse to a flush request,
we set vdev_nowritecache so we don't issue flushes in the future and
instead just pretend the succeeded. However, we still return an error
for the initial flush, even though we just decided such errors are
meaningless!
So, when setting vdev_nowritecache in response to a flush error, also
reset the error code to assume success.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Closes #16855
build: use correct bashcompletiondir on arch
Reviewed-by: Tino Reichardt <milky-zfs at mcmilk.de>
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: poscat <poscat at poscat.moe>
Closes #16861
backtrace: fix off-by-one on string output
sizeof("foo") includes the trailing null byte, so all the output had
nulls through it. Most terminals quietly ignore it, but it makes some
tools misdetect file types and other annoyances.
Easy fix: subtract 1.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Alexander Motin <mav at FreeBSD.org>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16862
BRT: Check bv_mos_entries in brt_entry_lookup()
When vdev first sees some block cloning, there is a window when
brt_maybe_exists() might already return true since something was
cloned, but bv_mos_entries is still 0 since BRT ZAP was not yet
created. In such case we should not try to look into the ZAP
and dereference NULL bv_mos_entries_dnode.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16851
Fix DR_OVERRIDDEN use-after-free race in dbuf_sync_leaf
In dbuf_sync_leaf, we clone the arc_buf in dr if we share it with db
except for overridden case. However, this exception causes a race where
dbuf_new_size could free the arc_buf after the last dereference of
*datap and causes use-after-free. We fix this by cloning the buf
regardless if it's overridden.
The race:
--
P0 P1
dbuf_hold_impl()
// dbuf_hold_copy passed
// because db_data_pending NULL
dbuf_sync_leaf()
// doesn't clone *datap
// *datap derefed to db_buf
[37 lines not shown]
Fix DR_OVERRIDDEN use-after-free race in dbuf_sync_leaf
In dbuf_sync_leaf, we clone the arc_buf in dr if we share it with db
except for overridden case. However, this exception causes a race where
dbuf_new_size could free the arc_buf after the last dereference of
*datap and causes use-after-free. We fix this by cloning the buf
regardless if it's overridden.
The race:
--
P0 P1
dbuf_hold_impl()
// dbuf_hold_copy passed
// because db_data_pending NULL
dbuf_sync_leaf()
// doesn't clone *datap
// *datap derefed to db_buf
[37 lines not shown]
BRT: Check bv_mos_entries in brt_entry_lookup()
When vdev first sees some block cloning, there is a window when
brt_maybe_exists() might already return true since something was
cloned, but bv_mos_entries is still 0 since BRT ZAP was not yet
created. In such case we should not try to look into the ZAP
and dereference NULL bv_mos_entries_dnode.
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Reviewed-by: Rob Norris <robn at despairlabs.com>
Signed-off-by: Alexander Motin <mav at FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16851
Remove unnecessary CSTYLED escapes on top-level macro invocations
cstyle can handle these cases now, so we don't need to disable it.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840
cstyle: understand macro params can be empty
It's not uncommon to have empty parameters in code generator macros,
usually when multiple parameters are concatenated or stringified into a
single token or literal. So, exclude the space-before-comma check, which
will allow construction like `MACRO_CALL(foo, , baz)`.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840
cstyle: understand basic top-level macro invocations
We quite often invoke macros outside of functions, usually to generate
functions or data. cstyle inteprets these as function headers, which at
least have opinions for indenting.
This introduces a separate state for top-level macro invocations, and
excludes it from matching functions. For the moment, most of the
existing rules will continue to apply, but this gives us a way to add or
removes rules targeting macros specifically.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840
Optimize RAIDZ expansion
- Instead of copying one ashift-sized block per ZIO, copy as much
as we have contiguous data up to 16MB per old vdev. To avoid data
moves use gang ABDs, so that read ZIOs can directly fill buffers
for write ZIOs. ABDs have much smaller overhead than ZIOs in both
memory usage and processing time, plus big I/Os do not depend on
I/O aggregation and scheduling to reach decent performance on HDDs.
- Reduce raidz_expand_max_copy_bytes to 16MB on 32bit platforms.
- Use 32bit range tree when possible (practically always now) to
slightly reduce memory usage.
- Use ZIO_PRIORITY_REMOVAL for early stages of expansion, same as
for main ones.
- Fix rate overflows in `zpool status` reporting.
With these changes expanding RAIDZ1 from 4 to 5 children I am able
to reach 6-12GB/s rate on SSDs and ~500MB/s on HDDs, both are
limited by devices instead of CPU.
[4 lines not shown]
cstyle: ignore old non-POSIX types in macro invocations
In code generation macros, we often use names like `uint` when
constructing handler functions. These are not being used as types, so
exclude them from the admonishment to use POSIX type names.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840
Remove unnecessary CSTYLED escapes on top-level macro invocations
cstyle can handle these cases now, so we don't need to disable it.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840
cstyle: ignore old non-POSIX types in macro invocations
In code generation macros, we often use names like `uint` when
constructing handler functions. These are not being used as types, so
exclude them from the admonishment to use POSIX type names.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1 at llnl.gov>
Signed-off-by: Rob Norris <robn at despairlabs.com>
Closes #16840