[PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.
(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:
(*) Block I/O tracing.
(*) Disk partition code.
(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.
(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.
(*) MTD blockdev handling and FTL.
(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.
(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.
(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.
(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.
(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:
(*) Default blockdev file operations (to give error ENODEV on opening).
(*) Makes some /proc changes:
(*) /proc/devices does not list any blockdevs.
(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.
(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).
(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:45:40 +02:00
|
|
|
if BLOCK
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
menu "IO Schedulers"
|
|
|
|
|
|
|
|
config IOSCHED_NOOP
|
|
|
|
bool
|
|
|
|
default y
|
|
|
|
---help---
|
|
|
|
The no-op I/O scheduler is a minimal scheduler that does basic merging
|
|
|
|
and sorting. Its main uses include non-disk based block devices like
|
|
|
|
memory devices, and specialised software or hardware environments
|
|
|
|
that do their own scheduling and require only minimal assistance from
|
|
|
|
the kernel.
|
|
|
|
|
2012-06-27 11:25:26 +03:00
|
|
|
config IOSCHED_TEST
|
|
|
|
tristate "Test I/O scheduler"
|
|
|
|
depends on DEBUG_FS
|
2014-12-04 10:10:07 +02:00
|
|
|
default m
|
2012-06-27 11:25:26 +03:00
|
|
|
---help---
|
|
|
|
The test I/O scheduler is a duplicate of the noop scheduler with
|
|
|
|
addition of test utlity.
|
|
|
|
It allows testing a block device by dispatching specific requests
|
|
|
|
according to the test case and declare PASS/FAIL according to the
|
|
|
|
requests completion error code.
|
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
config IOSCHED_DEADLINE
|
|
|
|
tristate "Deadline I/O scheduler"
|
|
|
|
default y
|
|
|
|
---help---
|
2009-10-03 09:37:51 +02:00
|
|
|
The deadline I/O scheduler is simple and compact. It will provide
|
|
|
|
CSCAN service with FIFO expiration of requests, switching to
|
|
|
|
a new point in the service tree and doing a batch of IO from there
|
|
|
|
in case of expiry.
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
config IOSCHED_CFQ
|
|
|
|
tristate "CFQ I/O scheduler"
|
|
|
|
default y
|
|
|
|
---help---
|
|
|
|
The CFQ I/O scheduler tries to distribute bandwidth equally
|
|
|
|
among all processes in the system. It should provide a fair
|
2009-10-03 09:40:47 +02:00
|
|
|
and low latency working environment, suitable for both desktop
|
|
|
|
and server systems.
|
|
|
|
|
2007-02-17 20:08:22 +01:00
|
|
|
This is the default I/O scheduler.
|
2005-04-16 15:20:36 -07:00
|
|
|
|
2009-12-03 12:59:43 -05:00
|
|
|
config CFQ_GROUP_IOSCHED
|
|
|
|
bool "CFQ Group Scheduling support"
|
2010-04-26 19:27:56 +02:00
|
|
|
depends on IOSCHED_CFQ && BLK_CGROUP
|
2009-12-03 12:59:43 -05:00
|
|
|
default n
|
|
|
|
---help---
|
|
|
|
Enable group IO scheduling in CFQ.
|
|
|
|
|
block: introduce the BFQ-v7r11 I/O sched for 4.4.0
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.
- Slices are given in the service domain: tasks are assigned
budgets, measured in number of sectors. Once got the disk, a task
must however consume its assigned budget within a configurable
maximum time (by default, the maximum possible value of the
budgets is automatically computed to comply with this timeout).
This allows the desired latency vs "throughput boosting" tradeoff
to be set.
- Budgets are scheduled according to a variant of WF2Q+, implemented
using an augmented rb-tree to take eligibility into account while
preserving an O(log N) overall complexity.
- A low-latency tunable is provided; if enabled, both interactive
and soft real-time applications are guaranteed a very low latency.
- Latency guarantees are preserved also in the presence of NCQ.
- Also with flash-based devices, a high throughput is achieved
while still preserving latency guarantees.
- BFQ features Early Queue Merge (EQM), a sort of fusion of the
cooperating-queue-merging and the preemption mechanisms present
in CFQ. EQM is in fact a unified mechanism that tries to get a
sequential read pattern, and hence a high throughput, with any
set of processes performing interleaved I/O over a contiguous
sequence of sectors.
- BFQ supports full hierarchical scheduling, exporting a cgroups
interface. Since each node has a full scheduler, each group can
be assigned its own weight.
- If the cgroups interface is not used, only I/O priorities can be
assigned to processes, with ioprio values mapped to weights
with the relation weight = IOPRIO_BE_NR - ioprio.
- ioprio classes are served in strict priority order, i.e., lower
priority queues are not served as long as there are higher
priority queues. Among queues in the same class the bandwidth is
distributed in proportion to the weight of each queue. A very
thin extra bandwidth is however guaranteed to the Idle class, to
prevent it from starving.
Change-Id: I1c789ca3c2eb93972d742f82ee729cfe5fb7170c
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini@google.com>
2013-05-09 19:10:02 +02:00
|
|
|
config IOSCHED_BFQ
|
|
|
|
tristate "BFQ I/O scheduler"
|
|
|
|
default n
|
|
|
|
---help---
|
2018-01-16 22:15:14 +01:00
|
|
|
The BFQ I/O scheduler distributes bandwidth among all
|
|
|
|
processes according to their weights, regardless of the
|
|
|
|
device parameters and with any workload. It also guarantees
|
|
|
|
a low latency to interactive and soft real-time applications.
|
|
|
|
Details in Documentation/block/bfq-iosched.txt
|
block: introduce the BFQ-v7r11 I/O sched for 4.4.0
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.
- Slices are given in the service domain: tasks are assigned
budgets, measured in number of sectors. Once got the disk, a task
must however consume its assigned budget within a configurable
maximum time (by default, the maximum possible value of the
budgets is automatically computed to comply with this timeout).
This allows the desired latency vs "throughput boosting" tradeoff
to be set.
- Budgets are scheduled according to a variant of WF2Q+, implemented
using an augmented rb-tree to take eligibility into account while
preserving an O(log N) overall complexity.
- A low-latency tunable is provided; if enabled, both interactive
and soft real-time applications are guaranteed a very low latency.
- Latency guarantees are preserved also in the presence of NCQ.
- Also with flash-based devices, a high throughput is achieved
while still preserving latency guarantees.
- BFQ features Early Queue Merge (EQM), a sort of fusion of the
cooperating-queue-merging and the preemption mechanisms present
in CFQ. EQM is in fact a unified mechanism that tries to get a
sequential read pattern, and hence a high throughput, with any
set of processes performing interleaved I/O over a contiguous
sequence of sectors.
- BFQ supports full hierarchical scheduling, exporting a cgroups
interface. Since each node has a full scheduler, each group can
be assigned its own weight.
- If the cgroups interface is not used, only I/O priorities can be
assigned to processes, with ioprio values mapped to weights
with the relation weight = IOPRIO_BE_NR - ioprio.
- ioprio classes are served in strict priority order, i.e., lower
priority queues are not served as long as there are higher
priority queues. Among queues in the same class the bandwidth is
distributed in proportion to the weight of each queue. A very
thin extra bandwidth is however guaranteed to the Idle class, to
prevent it from starving.
Change-Id: I1c789ca3c2eb93972d742f82ee729cfe5fb7170c
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini@google.com>
2013-05-09 19:10:02 +02:00
|
|
|
|
|
|
|
config BFQ_GROUP_IOSCHED
|
|
|
|
bool "BFQ hierarchical scheduling support"
|
2018-01-16 22:15:14 +01:00
|
|
|
depends on IOSCHED_BFQ && BLK_CGROUP
|
block: introduce the BFQ-v7r11 I/O sched for 4.4.0
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.
- Slices are given in the service domain: tasks are assigned
budgets, measured in number of sectors. Once got the disk, a task
must however consume its assigned budget within a configurable
maximum time (by default, the maximum possible value of the
budgets is automatically computed to comply with this timeout).
This allows the desired latency vs "throughput boosting" tradeoff
to be set.
- Budgets are scheduled according to a variant of WF2Q+, implemented
using an augmented rb-tree to take eligibility into account while
preserving an O(log N) overall complexity.
- A low-latency tunable is provided; if enabled, both interactive
and soft real-time applications are guaranteed a very low latency.
- Latency guarantees are preserved also in the presence of NCQ.
- Also with flash-based devices, a high throughput is achieved
while still preserving latency guarantees.
- BFQ features Early Queue Merge (EQM), a sort of fusion of the
cooperating-queue-merging and the preemption mechanisms present
in CFQ. EQM is in fact a unified mechanism that tries to get a
sequential read pattern, and hence a high throughput, with any
set of processes performing interleaved I/O over a contiguous
sequence of sectors.
- BFQ supports full hierarchical scheduling, exporting a cgroups
interface. Since each node has a full scheduler, each group can
be assigned its own weight.
- If the cgroups interface is not used, only I/O priorities can be
assigned to processes, with ioprio values mapped to weights
with the relation weight = IOPRIO_BE_NR - ioprio.
- ioprio classes are served in strict priority order, i.e., lower
priority queues are not served as long as there are higher
priority queues. Among queues in the same class the bandwidth is
distributed in proportion to the weight of each queue. A very
thin extra bandwidth is however guaranteed to the Idle class, to
prevent it from starving.
Change-Id: I1c789ca3c2eb93972d742f82ee729cfe5fb7170c
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini@google.com>
2013-05-09 19:10:02 +02:00
|
|
|
default n
|
|
|
|
---help---
|
2018-01-16 22:15:14 +01:00
|
|
|
|
|
|
|
Enable hierarchical scheduling in BFQ, using the blkio
|
|
|
|
(cgroups-v1) or io (cgroups-v2) controller.
|
block: introduce the BFQ-v7r11 I/O sched for 4.4.0
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.
- Slices are given in the service domain: tasks are assigned
budgets, measured in number of sectors. Once got the disk, a task
must however consume its assigned budget within a configurable
maximum time (by default, the maximum possible value of the
budgets is automatically computed to comply with this timeout).
This allows the desired latency vs "throughput boosting" tradeoff
to be set.
- Budgets are scheduled according to a variant of WF2Q+, implemented
using an augmented rb-tree to take eligibility into account while
preserving an O(log N) overall complexity.
- A low-latency tunable is provided; if enabled, both interactive
and soft real-time applications are guaranteed a very low latency.
- Latency guarantees are preserved also in the presence of NCQ.
- Also with flash-based devices, a high throughput is achieved
while still preserving latency guarantees.
- BFQ features Early Queue Merge (EQM), a sort of fusion of the
cooperating-queue-merging and the preemption mechanisms present
in CFQ. EQM is in fact a unified mechanism that tries to get a
sequential read pattern, and hence a high throughput, with any
set of processes performing interleaved I/O over a contiguous
sequence of sectors.
- BFQ supports full hierarchical scheduling, exporting a cgroups
interface. Since each node has a full scheduler, each group can
be assigned its own weight.
- If the cgroups interface is not used, only I/O priorities can be
assigned to processes, with ioprio values mapped to weights
with the relation weight = IOPRIO_BE_NR - ioprio.
- ioprio classes are served in strict priority order, i.e., lower
priority queues are not served as long as there are higher
priority queues. Among queues in the same class the bandwidth is
distributed in proportion to the weight of each queue. A very
thin extra bandwidth is however guaranteed to the Idle class, to
prevent it from starving.
Change-Id: I1c789ca3c2eb93972d742f82ee729cfe5fb7170c
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini@google.com>
2013-05-09 19:10:02 +02:00
|
|
|
|
2005-10-30 15:02:19 -08:00
|
|
|
choice
|
|
|
|
prompt "Default I/O scheduler"
|
2006-06-19 10:06:48 +02:00
|
|
|
default DEFAULT_CFQ
|
2005-10-30 15:02:19 -08:00
|
|
|
help
|
|
|
|
Select the I/O scheduler which will be used by default for all
|
|
|
|
block devices.
|
|
|
|
|
|
|
|
config DEFAULT_DEADLINE
|
2005-11-04 08:44:58 +01:00
|
|
|
bool "Deadline" if IOSCHED_DEADLINE=y
|
2005-10-30 15:02:19 -08:00
|
|
|
|
|
|
|
config DEFAULT_CFQ
|
2005-11-04 08:44:58 +01:00
|
|
|
bool "CFQ" if IOSCHED_CFQ=y
|
2005-10-30 15:02:19 -08:00
|
|
|
|
|
|
|
config DEFAULT_NOOP
|
|
|
|
bool "No-op"
|
|
|
|
|
|
|
|
endchoice
|
|
|
|
|
|
|
|
config DEFAULT_IOSCHED
|
|
|
|
string
|
|
|
|
default "deadline" if DEFAULT_DEADLINE
|
|
|
|
default "cfq" if DEFAULT_CFQ
|
|
|
|
default "noop" if DEFAULT_NOOP
|
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
endmenu
|
[PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.
(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:
(*) Block I/O tracing.
(*) Disk partition code.
(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.
(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.
(*) MTD blockdev handling and FTL.
(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.
(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.
(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.
(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.
(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:
(*) Default blockdev file operations (to give error ENODEV on opening).
(*) Makes some /proc changes:
(*) /proc/devices does not list any blockdevs.
(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.
(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).
(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-09-30 20:45:40 +02:00
|
|
|
|
|
|
|
endif
|