Coly Li [Wed, 27 May 2020 04:01:55 +0000 (12:01 +0800)]
bcache: configure the asynchronous registertion to be experimental
In order to avoid the experimental async registration interface to
be treated as new kernel ABI for common users, this patch makes it
as an experimental kernel configure BCACHE_ASYNC_REGISTRAION.
This interface is for extreme large cached data situation, to make sure
the bcache device can always created without the udev timeout issue. For
normal users the async or sync registration does not make difference.
In future when we decide to use the asynchronous registration as default
behavior, this experimental interface may be removed.
Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Wed, 27 May 2020 04:01:54 +0000 (12:01 +0800)]
bcache: asynchronous devices registration
When there is a lot of data cached on cache device, the bcach internal
btree can take a very long to validate during the backing device and
cache device registration. In my test, it may takes 55+ minutes to check
all the internal btree nodes.
The problem is that the registration is invoked by udev rules and the
udevd has 180 seconds timeout by default. If the btree node checking
time is longer than udevd timeout, the registering process will be
killed by udevd with SIGKILL. If the registering process has pending
sigal, creating kthread for bcache will fail and the device registration
will fail. The result is, for bcache device which cached a lot of data
on cache device, the bcache device node like /dev/bcache<N> won't create
always due to the very long btree checking time.
A solution to avoid the udevd 180 seconds timeout is to register devices
in an asynchronous way. Which is, after writing cache or backing device
path into /sys/fs/bcache/register_async, the kernel code will create a
kworker and move all the btree node checking (for cache device) or dirty
data counting (for cached device) in the kwork context. Then the kworder
is scheduled on system_wq and the registration code just returned to
user space udev rule task. By this asynchronous way, the udev task for
bcache rule will complete in seconds, no matter how long time spent in
the kworker context, it won't be killed by udevd for a timeout.
After all the checking and counting are done asynchronously in the
kworker, the bcache device will eventually be created successfully.
This patch does the above chagne and add a register sysfs file
/sys/fs/bcache/register_async. Writing the registering device path into
this sysfs file will do the asynchronous registration.
The register_async interface is for very rare condition and won't be
used for common users. In future I plan to make the asynchronous
registration as default behavior, which depends on feedback for this
patch.
Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
At line 808, put_disk(disk) may encounter kobject refcount of 'disk'
being underflow.
Here is how to reproduce the issue,
- Attche the backing device to a cache device and do random write to
make the cache being dirty.
- Stop the bcache device while the cache device has dirty data of the
backing device.
- Only register the backing device back, NOT register cache device.
- The bcache device node /dev/bcache0 won't show up, because backing
device waits for the cache device shows up for the missing dirty
data.
- Now echo 1 into /sys/fs/bcache/pendings_cleanup, to stop the pending
backing device.
- After the pending backing device stopped, use 'dmesg' to check kernel
message, a use-after-free warning from KASA reported the refcount of
kobject linked to the 'disk' is underflow.
The dropping refcount at line 808 in the above code piece is added by
add_disk(d->disk) in bch_cached_dev_run(). But in the above condition
the cache device is not registered, bch_cached_dev_run() has no chance
to be called and the refcount is not added. The put_disk() for a non-
added refcount of gendisk kobject triggers a underflow warning.
This patch checks whether GENHD_FL_UP is set in disk->flags, if it is
not set then the bcache device was not added, don't call put_disk()
and the the underflow issue can be avoided.
Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Joe Perches [Wed, 27 May 2020 04:01:52 +0000 (12:01 +0800)]
bcache: Convert pr_<level> uses to a more typical style
Remove the trailing newline from the define of pr_fmt and add newlines
to the uses.
Miscellanea:
o Convert bch_bkey_dump from multiple uses of pr_err to pr_cont
as the earlier conversion was inappropriate done causing multiple
lines to be emitted where only a single output line was desired
o Use vsprintf extension %pV in bch_cache_set_error to avoid multiple
line output where only a single line output was desired
o Coalesce formats
Fixes: 6ae63e3501c4 ("bcache: replace printk() by pr_*() routines") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Colin Ian King [Wed, 27 May 2020 04:01:51 +0000 (12:01 +0800)]
bcache: remove redundant variables i and n
Variables i and n are being assigned but are never used. They are
redundant and can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Coly Li <colyli@suse.de>
Addresses-Coverity: ("Unused value") Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 27 May 2020 11:17:10 +0000 (05:17 -0600)]
Merge branch 'nvme-5.8' of git://git.infradead.org/nvme into for-5.8/drivers
Pull NVMe updates from Christoph:
"The second large batch of nvme updates:
- t10 protection information support for nvme-rdma and nvmet-rdma
(Israel Rukshin and Max Gurtovoy)
- target side AEN improvements (Chaitanya Kulkarni)
- various fixes and minor improvements all over, icluding the nvme part
of the lpfc driver"
* 'nvme-5.8' of git://git.infradead.org/nvme: (38 commits)
lpfc: Fix return value in __lpfc_nvme_ls_abort
lpfc: fix axchg pointer reference after free and double frees
lpfc: Fix pointer checks and comments in LS receive refactoring
nvme: set dma alignment to qword
nvmet: cleanups the loop in nvmet_async_events_process
nvmet: fix memory leak when removing namespaces and controllers concurrently
nvmet-rdma: add metadata/T10-PI support
nvmet: add metadata support for block devices
nvmet: add metadata/T10-PI support
nvme: add Metadata Capabilities enumerations
nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
nvmet: rename nvmet_rw_len to nvmet_rw_data_len
nvmet: add metadata characteristics for a namespace
nvme-rdma: add metadata/T10-PI support
nvme-rdma: introduce nvme_rdma_sgl structure
nvme: introduce NVME_INLINE_METADATA_SG_CNT
nvme: enforce extended LBA format for fabrics metadata
nvme: introduce max_integrity_segments ctrl attribute
nvme: make nvme_ns_has_pi accessible to transports
nvme: introduce NVME_NS_METADATA_SUPPORTED flag
...
James Smart [Wed, 20 May 2020 18:59:29 +0000 (11:59 -0700)]
lpfc: Fix return value in __lpfc_nvme_ls_abort
A static checker reported the following issue:
drivers/scsi/lpfc/lpfc_nvmet.c:1366 lpfc_nvmet_ls_abort()
warn: 'ret' can be either negative or positive
The comment indicates a non-zero value indicates error in the
form of -Exxx, but the code is returning "1".
Fix the code to return -EINVAL to be compliant to comment.
Fixes: e96a22b0b7c2 ("lpfc: Refactor Send LS Abort support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
James Smart [Wed, 20 May 2020 18:59:28 +0000 (11:59 -0700)]
lpfc: fix axchg pointer reference after free and double frees
The axchg structure is a structure allocated early in the
lpfc_nvme_unsol_ls_handler() to represent the newly received exchange.
Upon error, the out_fail path in the routine unconditionally frees the
pointer, yet subsequently passes the pointer to the abort routine.
Additionally, the abort routine, lpfc_nvme_unsol_ls_issue_abort(), also
has a failure path that will attempt to delete the pointer on error.
Fix these errors by:
- Removing the unconditional free so that it stays valid if passed
to the abort routine.
- Revise the abort routine to not free the pointer. Instead, return
a success/failure status. Note: if success, the later completion of
the abort frees the structure.
- Back in the unsol_ls_handler() error path, if the abort routine was
skipped (thus no possible reference) or the abort routine returned
error, free the pointer.
Fixes: 3a8070c567aa ("lpfc: Refactor NVME LS receive handling") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
James Smart [Wed, 20 May 2020 18:59:27 +0000 (11:59 -0700)]
lpfc: Fix pointer checks and comments in LS receive refactoring
Additional testing encountered null pointers that weren't fully qualified
in lpfc_nvmet_xmt_ls_abort_cmp() and lpfc_nvmet_unsol_issue_abort().
The same error was detected and reported by static checker reporting:
drivers/scsi/lpfc/lpfc_sli.c:2905 lpfc_nvme_unsol_ls_handler()
error: we previously assumed 'phba->targetport' could be null
(see line 2837)
Fix by making phba->nvmet_support and phba->targetport validity checks
in lpfc_nvmet_xmt_ls_abort_cmp() and lpfc_nvmet_unsol_issue_abort().
Fixes: 3a8070c567aaa (“lpfc: Refactor NVME LS receive handling”) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paul Ely <paul.ely@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Keith Busch [Thu, 21 May 2020 02:22:53 +0000 (19:22 -0700)]
nvme: set dma alignment to qword
The default dma alignment mask is 511, which is much larger than any nvme
controller requires. NVMe controllers accept qword aligned DMA addresses,
so set the request_queue constraints to that. This can help avoid bounce
buffers on user passthrough commands.
Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
David Milburn [Mon, 18 May 2020 18:59:55 +0000 (13:59 -0500)]
nvmet: cleanups the loop in nvmet_async_events_process
Based-on-a-patch-by: Christoph Hellwig <hch@infradead.org> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Wed, 20 May 2020 19:48:12 +0000 (12:48 -0700)]
nvmet: fix memory leak when removing namespaces and controllers concurrently
When removing a namespace, we add an NS_CHANGE async event, however if
the controller admin queue is removed after the event was added but not
yet processed, we won't free the aens, resulting in the below memory
leak [1].
Fix that by moving nvmet_async_event_free to the final controller
release after it is detached from subsys->ctrls ensuring no async
events are added, and modify it to simply remove all pending aens.
Israel Rukshin [Tue, 19 May 2020 14:06:03 +0000 (17:06 +0300)]
nvmet-rdma: add metadata/T10-PI support
For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
protection information passthrough and validation for NVMe over RDMA
transport. Metadata support was implemented over the new RDMA signature
verbs API.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:06:02 +0000 (17:06 +0300)]
nvmet: add metadata support for block devices
Allocate the metadata SGL buffers and set metadata fields for the
request. Then create a block IO request for the metadata from the
protection SG list.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:06:01 +0000 (17:06 +0300)]
nvmet: add metadata/T10-PI support
Expose the namespace metadata format when PI is enabled. The user needs
to enable the capability per subsystem and per port. The other metadata
properties are taken from the namespace/bdev.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:06:00 +0000 (17:06 +0300)]
nvme: add Metadata Capabilities enumerations
The enumerations will be used to expose the namespace metadata format by
the target.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:05:59 +0000 (17:05 +0300)]
nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
The function doesn't check only the data length, because the transfer
length includes also the metadata length in some cases. This is
preparation for adding metadata (T10-PI) support.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:05:58 +0000 (17:05 +0300)]
nvmet: rename nvmet_rw_len to nvmet_rw_data_len
The function doesn't add the metadata length (only data length is
calculated). This is preparation for adding metadata (T10-PI) support.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:05:57 +0000 (17:05 +0300)]
nvmet: add metadata characteristics for a namespace
Fill those namespace fields from the block device format for adding
metadata (T10-PI) over fabric support with block devices.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Max Gurtovoy [Tue, 19 May 2020 14:05:56 +0000 (17:05 +0300)]
nvme-rdma: add metadata/T10-PI support
For capable HCAs (e.g. ConnectX-5/ConnectX-6) this will allow end-to-end
protection information passthrough and validation for NVMe over RDMA
transport. Metadata offload support was implemented over the new RDMA
signature verbs API and it is enabled for capable controllers.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:05:55 +0000 (17:05 +0300)]
nvme-rdma: introduce nvme_rdma_sgl structure
Remove first_sgl pointer from struct nvme_rdma_request and use pointer
arithmetic instead. The inline scatterlist, if exists, will be located
right after the nvme_rdma_request. This patch is needed as a preparation
for adding PI support.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Israel Rukshin [Tue, 19 May 2020 14:05:54 +0000 (17:05 +0300)]
nvme: introduce NVME_INLINE_METADATA_SG_CNT
SGL size of metadata is usually small. Thus, 1 inline sg should cover
most cases. The macro will be used for pre-allocate a single SGL entry
for metadata. The preallocation of small inline SGLs depends on SG_CHAIN
capability so if the ARCH doesn't support SG_CHAIN, use the runtime
allocation for the SGL. This patch is a preparation for adding metadata
(T10-PI) over fabric support.
Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Max Gurtovoy [Tue, 19 May 2020 14:05:53 +0000 (17:05 +0300)]
nvme: enforce extended LBA format for fabrics metadata
An extended LBA is a larger LBA that is created when metadata associated
with the LBA is transferred contiguously with the LBA data (AKA
interleaved). The metadata may be either transferred as part of the LBA
(creating an extended LBA) or it may be transferred as a separate
contiguous buffer of data. According to the NVMeoF spec, a fabrics ctrl
supports only an Extended LBA format. Fail revalidation in case we have a
spec violation. Also add a flag that will imply on capable transports and
controllers as part of a preparation for allowing end-to-end protection
information for fabric controllers.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
This patch doesn't change any logic, and is needed as a preparation
for adding PI support for fabrics drivers that will use an extended
LBA format for metadata and will support more than 1 integrity segment.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
James Smart [Tue, 19 May 2020 14:05:51 +0000 (17:05 +0300)]
nvme: make nvme_ns_has_pi accessible to transports
Move the nvme_ns_has_pi() inline from core.c to the nvme.h header.
This allows use by the transports.
Signed-off-by: James Smart <jsmart2021@gmail.com>
[maxg: added a comment for nvme_ns_has_pi()] Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Max Gurtovoy [Tue, 19 May 2020 14:05:50 +0000 (17:05 +0300)]
nvme: introduce NVME_NS_METADATA_SUPPORTED flag
This is a preparation for adding support for metadata in fabric
controllers. New flag will imply that NVMe namespace supports getting
metadata that was originally generated by host's block layer.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Max Gurtovoy [Tue, 19 May 2020 14:05:49 +0000 (17:05 +0300)]
nvme: introduce namespace features flag
Replace the specific ext boolean (that implies on extended LBA format)
with a feature in the new namespace features flag. This is a preparation
for adding more namespace features (such as metadata specific features).
Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Max Gurtovoy [Tue, 19 May 2020 14:05:48 +0000 (17:05 +0300)]
block: always define struct blk_integrity in genhd.h
This will reduce the amount of ifdefs inside the source code for various
drivers and also will reduce the amount of stub functions that were
created for the !CONFIG_BLK_DEV_INTEGRITY case.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Add a new attribute "revalidate_size" for the namespace which allows
user to revalidate and generate the AEN if needed. This attribute is
needed so that we can install userspace rules with systemd service based
on inotify/fsnotify/uevent. The registered callback for such a service
will end up writing to this attribute to generate AEN if needed.
The newly added function nvmet_ns_revalidate() does update the ns size
in the identify namespace in-core target data structure when host issues
id-ns command. This can lead to host having inconsistencies between size
of the namespace present in the id-ns command result and size of the
corresponding block device until host scans the namespaces explicitly.
To avoid this scenario generate AEN if old size is not same as the new
one in nvmet_ns_revalidate().
This will allow automatic AEN generation when host calls id-ns command
and also allows target to install userspace rules so that it can trigger
nvmet_ns_revalidate() (using configfs interface with the help of next
patch) resulting in appropriate AEN generation when underlying namespace
size change is detected.
This patch adds a wrapper helper to indicate size change in the bdev &
file-backed namespace when revalidating ns. This helper is needed in
order to minimize code repetition in the next patch for configfs.c and
existing admin-cmd.c.
This adds a new tracepoint for the target to trace async event. This is
helpful in debugging and comparing host and target side async events
especially when host is connected to different targets on different
machines and now that we rely on userspace components to generate AEN.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
nvme: replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
Damien Le Moal [Thu, 14 May 2020 05:56:26 +0000 (14:56 +0900)]
nvme: fix io_opt limit setting
Currently, a namespace io_opt queue limit is set by default to the
physical sector size of the namespace and to the the write optimal
size (NOWS) when the namespace reports optimal IO sizes. This causes
problems with block limits stacking in blk_stack_limits() when a
namespace block device is combined with an HDD which generally do not
report any optimal transfer size (io_opt limit is 0). The code:
/* Optimal I/O a multiple of the physical block size? */
if (t->io_opt & (t->physical_block_size - 1)) {
t->io_opt = 0;
t->misaligned = 1;
ret = -1;
}
in blk_stack_limits() results in an error return for this function when
the combined devices have different but compatible physical sector
sizes (e.g. 512B sector SSD with 4KB sector disks).
Fix this by not setting the optimal IO size queue limit if the namespace
does not report an optimal write size value.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Bart van Assche <bvanassche@acm.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
Martin George [Tue, 12 May 2020 16:47:04 +0000 (22:17 +0530)]
nvme-fc: print proper nvme-fc devloss_tmo value
The nvme-fc devloss_tmo is computed as the min of either the
ctrl_loss_tmo (max_retries * reconnect_delay) or the remote port's
devloss_tmo. But what gets printed as the nvme-fc devloss_tmo in
nvme_fc_reconnect_or_delete() is always the remote port's devloss_tmo
value. So correct this by printing the min value instead.
Signed-off-by: Martin George <marting@netapp.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Mon, 18 May 2020 17:47:48 +0000 (10:47 -0700)]
nvmet-tcp: move send/recv error handling in the send/recv methods instead of call-sites
Have routines handle errors and just bail out of the poll loop.
This simplifies the code and will help as we may enhance the poll
loop logic and these are somewhat in the way.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
Jiri Kosina [Tue, 26 May 2020 09:49:18 +0000 (11:49 +0200)]
block/floppy: fix contended case in floppy_queue_rq()
Since the switch of floppy driver to blk-mq, the contended (fdc_busy) case
in floppy_queue_rq() is not handled correctly.
In case we reach floppy_queue_rq() with fdc_busy set (i.e. with the floppy
locked due to another request still being in-flight), we put the request
on the list of requests and return BLK_STS_OK to the block core, without
actually scheduling delayed work / doing further processing of the
request. This means that processing of this request is postponed until
another request comes and passess uncontended.
Which in some cases might actually never happen and we keep waiting
indefinitely. The simple testcase is
for i in `seq 1 2000`; do echo -en $i '\r'; blkid --info /dev/fd0 2> /dev/null; done
run in quemu. That reliably causes blkid eventually indefinitely hanging
in __floppy_read_block_0() waiting for completion, as the BIO callback
never happens, and no further IO is ever submitted on the (non-existent)
floppy device. This was observed reliably on qemu-emulated device.
Fix that by not queuing the request in the contended case, and return
BLK_STS_RESOURCE instead, so that blk core handles the request
rescheduling and let it pass properly non-contended later.
Stefan Haberland [Tue, 19 May 2020 14:22:59 +0000 (16:22 +0200)]
s390/dasd: remove ioctl_by_bdev calls
The IBM partition parser requires device type specific information only
available to the DASD driver to correctly register partitions. The
current approach of using ioctl_by_bdev with a fake user space pointer
is discouraged.
Fix this by replacing IOCTL calls with direct in-kernel function calls.
Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Martijn Coenen [Wed, 13 May 2020 13:38:45 +0000 (15:38 +0200)]
loop: Add LOOP_CONFIGURE ioctl
This allows userspace to completely setup a loop device with a single
ioctl, removing the in-between state where the device can be partially
configured - eg the loop device has a backing file associated with it,
but is reading from the wrong offset.
Besides removing the intermediate state, another big benefit of this
ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
freeze the associated queue; this requires waiting for RCU
synchronization, which I've measured can take about 15-20ms on this
device on average.
In addition to doing what LOOP_SET_STATUS can do, LOOP_CONFIGURE can
also be used to:
- Set the correct block size immediately by setting
loop_config.block_size (avoids LOOP_SET_BLOCK_SIZE)
- Explicitly request direct I/O mode by setting LO_FLAGS_DIRECT_IO
in loop_config.info.lo_flags (avoids LOOP_SET_DIRECT_IO)
- Explicitly request read-only mode by setting LO_FLAGS_READ_ONLY
in loop_config.info.lo_flags
Here's setting up ~70 regular loop devices with an offset on an x86
Android device, using LOOP_SET_FD and LOOP_SET_STATUS:
vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
0m03.40s real 0m00.02s user 0m00.03s system
Here's configuring ~70 devices in the same way, but using a modified
losetup that uses the new LOOP_CONFIGURE ioctl:
vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
0m01.94s real 0m00.01s user 0m00.01s system
Martijn Coenen [Wed, 13 May 2020 13:38:44 +0000 (15:38 +0200)]
loop: Clean up LOOP_SET_STATUS lo_flags handling
LOOP_SET_STATUS(64) will actually allow some lo_flags to be modified; in
particular, LO_FLAGS_AUTOCLEAR can be set and cleared, whereas
LO_FLAGS_PARTSCAN can be set to request a partition scan. Make this
explicit by updating the UAPI to include the flags that can be
set/cleared using this ioctl.
The implementation can then blindly take over the passed in flags,
and use the previous flags for those flags that can't be set / cleared
using LOOP_SET_STATUS.
Martijn Coenen [Wed, 13 May 2020 13:38:42 +0000 (15:38 +0200)]
loop: Move loop_set_status_from_info() and friends up
So we can use it without forward declaration. This is a separate commit
to make it easier to verify that this is just a move, without functional
modifications.
Martijn Coenen [Wed, 13 May 2020 13:38:39 +0000 (15:38 +0200)]
loop: Refactor loop_set_status() size calculation
figure_loop_size() calculates the loop size based on the passed in
parameters, but at the same time it updates the offset and sizelimit
parameters in the loop device configuration. That is a somewhat
unexpected side effect of a function with this name, and it is only only
needed by one of the two callers of this function - loop_set_status().
Move the lo_offset and lo_sizelimit assignment back into loop_set_status(),
and use the newly factored out functions to validate and apply the newly
calculated size. This allows us to get rid of figure_loop_size() in a
follow-up commit.
Martijn Coenen [Wed, 13 May 2020 13:38:35 +0000 (15:38 +0200)]
loop: Call loop_config_discard() only after new config is applied
loop_set_status() calls loop_config_discard() to configure discard for
the loop device; however, the discard configuration depends on whether
the loop device uses encryption, and when we call it the encryption
configuration has not been updated yet. Move the call down so we apply
the correct discard configuration based on the new configuration.
Signed-off-by: Martijn Coenen <maco@android.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 13 May 2020 21:17:01 +0000 (15:17 -0600)]
Merge branch 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.8/drivers
Pull MD changes from Song.
* 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md/raid1: Replace zero-length array with flexible-array
md: add a newline when printing parameter 'start_ro' by sysfs
md: stop using ->queuedata
md/raid1: release pending accounting for an I/O only after write-behind is also finished
md: remove redundant memalloc scope API usage
raid5: update code comment of scribble_alloc()
raid5: remove gfp flags from scribble_alloc()
md: use memalloc scope APIs in mddev_suspend()/mddev_resume()
md: remove the extra line for ->hot_add_disk
md: flush md_rdev_misc_wq for HOT_ADD_DISK case
md: don't flush workqueue unconditionally in md_open
md: add new workqueue for delete rdev
md: add checkings before flush md_misc_wq
md/raid1: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
David Jeffery [Mon, 27 Jan 2020 15:26:19 +0000 (10:26 -0500)]
md/raid1: release pending accounting for an I/O only after write-behind is also finished
When using RAID1 and write-behind, md can deadlock when errors occur. With
write-behind, r1bio structs can be accounted by raid1 as queued but not
counted as pending. The pending count is dropped when the original bio is
returned complete but write-behind for the r1bio may still be active.
This breaks the accounting used in some conditions to know when the raid1
md device has reached an idle state. It can result in calls to
freeze_array deadlocking. freeze_array will never complete from a negative
"unqueued" value being calculated due to a queued count larger than the
pending count.
To properly account for write-behind, move the call to allow_barrier from
call_bio_endio to raid_end_bio_io. When using write-behind, md can call
call_bio_endio before all write-behind I/O is complete. Using
raid_end_bio_io for the point to call allow_barrier will release the
pending count at a point where all I/O for an r1bio, even write-behind, is
done.
Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Coly Li [Thu, 9 Apr 2020 14:17:23 +0000 (22:17 +0800)]
md: remove redundant memalloc scope API usage
In mddev_create_serial_pool(), memalloc scope APIs memalloc_noio_save()
and memalloc_noio_restore() are used when allocating memory by calling
mempool_create_kmalloc_pool(). After adding the memalloc scope APIs in
raid array suspend context, it is unncessary to explicitly call them
around mempool_create_kmalloc_pool() any longer.
This patch removes the redundant memalloc scope APIs in
mddev_create_serial_pool().
Signed-off-by: Coly Li <colyli@suse.de> Cc: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Coly Li [Thu, 9 Apr 2020 14:17:21 +0000 (22:17 +0800)]
raid5: remove gfp flags from scribble_alloc()
Using GFP_NOIO flag to call scribble_alloc() from resize_chunk() does
not have the expected behavior. kvmalloc_array() inside scribble_alloc()
which receives the GFP_NOIO flag will eventually call kmalloc_node() to
allocate physically continuous pages.
Now we have memalloc scope APIs in mddev_suspend()/mddev_resume() to
prevent memory reclaim I/Os during raid array suspend context, calling
to kvmalloc_array() with GFP_KERNEL flag may avoid deadlock of recursive
I/O as expected.
This patch removes the useless gfp flags from parameters list of
scribble_alloc(), and call kvmalloc_array() with GFP_KERNEL flag. The
incorrect GFP_NOIO flag does not exist anymore.
Fixes: b330e6a49dc3 ("md: convert to kvmalloc") Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Song Liu <songliubraving@fb.com>
Coly Li [Thu, 9 Apr 2020 14:17:20 +0000 (22:17 +0800)]
md: use memalloc scope APIs in mddev_suspend()/mddev_resume()
In raid5.c:resize_chunk(), scribble_alloc() is called with GFP_NOIO
flag, then it is sent into kvmalloc_array() inside scribble_alloc().
The problem is kvmalloc_array() eventually calls kvmalloc_node() which
does not accept non GFP_KERNEL compatible flag like GFP_NOIO, then
kmalloc_node() is called indeed to allocate physically continuous
pages. When system memory is under heavy pressure, and the requesting
size is large, there is high probability that allocating continueous
pages will fail.
But simply using GFP_KERNEL flag to call kvmalloc_array() is also
progblematic. In the code path where scribble_alloc() is called, the
raid array is suspended, if kvmalloc_node() triggers memory reclaim I/Os
and such I/Os go back to the suspend raid array, deadlock will happen.
What is desired here is to allocate non-physically (a.k.a virtually)
continuous pages and avoid memory reclaim I/Os. Michal Hocko suggests
to use the mmealloc sceope APIs to restrict memory reclaim I/O in
allocating context, specifically to call memalloc_noio_save() when
suspend the raid array and to call memalloc_noio_restore() when
resume the raid array.
This patch adds the memalloc scope APIs in mddev_suspend() and
mddev_resume(), to restrict memory reclaim I/Os during the raid array
is suspended. The benifit of adding the memalloc scope API in the
unified entry point mddev_suspend()/mddev_resume() is, no matter which
md raid array type (personality), we are sure the deadlock by recursive
memory reclaim I/O won't happen on the suspending context.
Please notice that the memalloc scope APIs only take effect on the raid
array suspending context, if the memory allocation is from another new
created kthread after raid array suspended, the recursive memory reclaim
I/Os won't be restricted. The mddev_suspend()/mddev_resume() entries are
used for the critical section where the raid metadata is modifying,
creating a kthread to allocate memory inside the critical section is
queer and very probably being buggy.
Fixes: b330e6a49dc3 ("md: convert to kvmalloc") Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Song Liu <songliubraving@fb.com>
It is not not necessary to add a newline for them since they don't exceed
80 characters, and it is not intutive to distinguish ->hot_add_disk() from
hot_add_disk() too.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Since rdev->kobj is removed asynchronously, it is possible that the
rdev->kobj still exists when try to add the rdev again after rdev
is removed. But this path md_ioctl (HOT_ADD_DISK) -> hot_add_disk
-> bind_rdev_to_array missed it.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
md: don't flush workqueue unconditionally in md_open
We need to check mddev->del_work before flush workqueu since the purpose
of flush is to ensure the previous md is disappeared. Otherwise the similar
deadlock appeared if LOCKDEP is enabled, it is due to md_open holds the
bdev->bd_mutex before flush workqueue.
And md_alloc also flushed the same workqueue, but the thing is different
here. Because all the paths call md_alloc don't hold bdev->bd_mutex, and
the flush is necessary to avoid race condition, so leave it as it is.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Since the purpose of call flush_workqueue in new_dev_store is to ensure
md_delayed_delete() has completed, so we should check rdev->del_work is
pending or not.
To suppress lockdep warning, we have to check mddev->del_work while
md_delayed_delete is attached to rdev->del_work, so it is not aligned
to the purpose of flush workquee. So a new workqueue is needed to avoid
the awkward situation, and introduce a new func flush_rdev_wq to flush
the new workqueue after check if there was pending work.
Also like new_dev_store, ADD_NEW_DISK ioctl has the same purpose to flush
workqueue while it holds bdev->bd_mutex, so make the same change applies
to the ioctl to avoid similar lock issue.
And md_delayed_delete actually wants to delete rdev, so rename the function
to rdev_delayed_delete.
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Since works (rdev->del_work and mddev->del_work) are queued in md_misc_wq,
then lockdep_map lock is held if either of them are running, then both of
them try to hold kernfs lock by call kobject_del. Then if new_dev_store
or array_state_store are triggered by write to the related sysfs node, so
the write operation gets kernfs lock, but need the lockdep_map because all
of them would trigger flush_workqueue(md_misc_wq) finally, then the same
lockdep_map lock is needed.
To suppress the lockdep warnning, we should flush the workqueue in case the
related work is pending. And several works are attached to md_misc_wq, so
we need to check which work should be checked:
1. for __md_stop_writes, the purpose of call flush workqueue is ensure sync
thread is started if it was starting, so check mddev->del_work is pending
or not since md_start_sync is attached to mddev->del_work.
2. __md_stop flushes md_misc_wq to ensure event_work is done, check the
event_work is enough. Assume raid_{ctr,dtr} -> md_stop -> __md_stop doesn't
need the kernfs lock.
3. both new_dev_store (holds kernfs lock) and ADD_NEW_DISK ioctl (holds the
bdev->bd_mutex) call flush_workqueue to ensure md_delayed_delete has
completed, this case will be handled in next patch.
4. md_open flushes workqueue to ensure the previous md is disappeared, but
it holds bdev->bd_mutex then try to flush workqueue, so it is better to
check mddev->del_work as well to avoid potential lock issue, this will be
done in another patch.
Cc: Coly Li <colyli@suse.de> Reported-by: Coly Li <colyli@suse.de> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com>
Jens Axboe [Tue, 12 May 2020 17:35:49 +0000 (11:35 -0600)]
Merge tag 'floppy-for-5.8' of https://github.com/evdenis/linux-floppy into for-5.8/drivers
Floppy patches for 5.8
Cleanups:
- symbolic register names for x86,sparc64,sparc32,powerpc,parisc,m68k
- split of local/global variables for drive,fdc
- UBSAN warning suppress in setup_rw_floppy()
Changes were compile tested on arm, sparc64, powerpc, m68k. Many patches
introduce no binary changes by using defines instead of magic numbers.
The patches were also tested with syzkaller and simple write/read/format
tests on real hardware.
Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* tag 'floppy-for-5.8' of https://github.com/evdenis/linux-floppy: (31 commits)
floppy: suppress UBSAN warning in setup_rw_floppy()
floppy: add defines for sizes of cmd & reply buffers of floppy_raw_cmd
floppy: add FD_AUTODETECT_SIZE define for struct floppy_drive_params
floppy: use print_hex_dump() in setup_DMA()
floppy: cleanup: make set_fdc() always set current_drive and current_fd
floppy: cleanup: get rid of current_reqD in favor of current_drive
floppy: make sure to reset all FDCs upon resume()
floppy: cleanup: do not iterate on current_fdc in do_floppy_init()
floppy: cleanup: add a few comments about expectations in certain functions
floppy: cleanup: do not iterate on current_fdc in DMA grab/release functions
floppy: cleanup: make get_fdc_version() not rely on current_fdc anymore
floppy: cleanup: make next_valid_format() not rely on current_drive anymore
floppy: cleanup: make check_wp() not rely on current_{fdc,drive} anymore
floppy: cleanup: make fdc_specify() not rely on current_{fdc,drive} anymore
floppy: cleanup: make fdc_configure() not rely on current_fdc anymore
floppy: cleanup: make perpendicular_mode() not rely on current_fdc anymore
floppy: cleanup: make need_more_output() not rely on current_fdc anymore
floppy: cleanup: make result() not rely on current_fdc anymore
floppy: cleanup: make output_byte() not rely on current_fdc anymore
floppy: cleanup: make wait_til_ready() not rely on current_fdc anymore
...
Denis Efremov [Fri, 1 May 2020 13:44:16 +0000 (16:44 +0300)]
floppy: suppress UBSAN warning in setup_rw_floppy()
UBSAN: array-index-out-of-bounds in drivers/block/floppy.c:1521:45
index 16 is out of range for type 'unsigned char [16]'
Call Trace:
...
setup_rw_floppy+0x5c3/0x7f0
floppy_ready+0x2be/0x13b0
process_one_work+0x2c1/0x5d0
worker_thread+0x56/0x5e0
kthread+0x122/0x170
ret_from_fork+0x35/0x40
This out-of-bounds access is intentional. The command in struct
floppy_raw_cmd may take up the space initially intended for the reply
and the reply count. It is needed for long 82078 commands such as
RESTORE, which takes 17 command bytes. Initial cmd size is not enough
and since struct setup_rw_floppy is a part of uapi we check that
cmd_count is in [0:16+1+16] in raw_cmd_copyin().
The patch adds union with original cmd,reply_count,reply fields and
fullcmd field of equivalent size. The cmd accesses are turned to
fullcmd where appropriate to suppress UBSAN warning.
Denis Efremov [Fri, 1 May 2020 13:44:15 +0000 (16:44 +0300)]
floppy: add defines for sizes of cmd & reply buffers of floppy_raw_cmd
Use FD_RAW_CMD_SIZE, FD_RAW_REPLY_SIZE defines instead of magic numbers
for cmd & reply buffers of struct floppy_raw_cmd. Remove local to
floppy.c MAX_REPLIES define, as it is now FD_RAW_REPLY_SIZE.
FD_RAW_CMD_FULLSIZE added as we allow command to also fill reply_count
and reply fields.
floppy: cleanup: make set_fdc() always set current_drive and current_fd
When called with a negative drive value, set_fdc() would stick to the
current fdc (which was assumed to reflect the current_drive's FDC). We
do not need this anymore as the last call place with a negative value
was just addressed. Let's make this function always set both current_fdc
and current_drive so that there's no more ambiguity. A few comments
stating this were added to a few non-obvious places.
floppy: cleanup: get rid of current_reqD in favor of current_drive
This macro equals -1 and is used as an alternative for current_drive when
calling reschedule_timeout(), which in turn needs to remap it. This only
adds obfuscation, let's simply use current_drive.
In floppy_resume() we don't properly reinitialize all FDCs, instead
we reinitialize the current FDC once per available FDC because value
-1 is passed to user_reset_fdc(). Let's simply save the current drive
and properly reinitialize each FDC.
floppy: cleanup: do not iterate on current_fdc in do_floppy_init()
There's no need to iterate on current_fdc in do_floppy_init() anymore,
in the first case it's only used as an array index to access fdc_state[],
so let's get rid of this confusing assignment. The second case is a bit
trickier because user_reset_fdc() needs to already know current_fdc when
called with drive==-1 due to this call chain:
Note that current_drive is not used in this code part and may even not
match a unit belonging to current_fdc. Instead of passing -1 we can
simply pass the first drive of the FDC being initialized, which is even
cleaner as it will allow the function chain above to consistently assign
both variables.
Willy Tarreau [Tue, 31 Mar 2020 09:40:54 +0000 (11:40 +0200)]
floppy: cleanup: add a few comments about expectations in certain functions
The locking in the driver is far from being obvious, with unlocking
automatically happening at end of operations scheduled by interrupt,
especially for the error paths where one does not necessarily expect
that such an interrupt may be triggered. Let's add a few comments
about what to expect at certain places to avoid misdetecting bugs
which are not.
Willy Tarreau [Tue, 31 Mar 2020 09:40:53 +0000 (11:40 +0200)]
floppy: cleanup: do not iterate on current_fdc in DMA grab/release functions
Both floppy_grab_irq_and_dma() and floppy_release_irq_and_dma() used to
iterate on the global variable while setting up or freeing resources.
Now that they exclusively rely on functions which take the fdc as an
argument, so let's not touch the global one anymore.
Willy Tarreau [Tue, 31 Mar 2020 09:40:47 +0000 (11:40 +0200)]
floppy: cleanup: make perpendicular_mode() not rely on current_fdc anymore
Now the fdc is passed in argument so that the function does not
use current_fdc anymore.
It's worth noting that there's still a single raw_cmd pointer
specific to the current fdc. It may make sense to have one per
fdc in the future. In addition, cont->done() still relies on the
current drive and current raw_cmd.
Willy Tarreau [Tue, 31 Mar 2020 09:40:45 +0000 (11:40 +0200)]
floppy: cleanup: make result() not rely on current_fdc anymore
Now the fdc is passed in argument so that the function does not
use current_fdc anymore.
It's worth noting that there's still a single reply_buffer[] which
will store the result for the current fdc. It may or may not make
sense to implement one buffer per fdc in the future.