Budget Fair Queueing I/O Scheduler
==================================

This patchset introduces BFQ into 2.6.34-zen1.
For further information: http://algo.ing.unimo.it/people/paolo/disk_sched/

The overall diffstat is the following:

 block/Kconfig.iosched         |   26 +
 block/Makefile                |    1 +
 block/bfq-cgroup.c            |  767 ++++++++++++++
 block/bfq-ioc.c               |  375 +++++++
 block/bfq-iosched.c           | 2279 +++++++++++++++++++++++++++++++++++++++++
 block/bfq-sched.c             | 1010 ++++++++++++++++++
 block/bfq.h                   |  550 ++++++++++
 block/blk-ioc.c               |   27 +-
 block/cfq-iosched.c           |   10 +-
 fs/ioprio.c                   |    9 +-
 include/linux/cgroup_subsys.h |    6 +
 include/linux/iocontext.h     |   18 +-
 12 files changed, 5058 insertions(+), 20 deletions(-)


CHANGELOG
 
IMPORTANT NOTE: This version of BFQ contains a bug that may cause a panic
when a process is moved to a new cgroup. This functionality seems to be
rarely used, as we have not yet received any report of this panic. We
will be glad to fix the bug after the first report. This bug has however been
removed from version v3r3 onwards.


v1:

This is a new version of BFQ with respect to the versions you can
find on Fabio's site: http://feanor.sssup.it/~fabio/linux/bfq.
Here is what we changed with respect to the previous versions:

1) re-tuned the budget feedback mechanism: it is now slighlty more
biased toward assigning high budgets, to boost the aggregated
throughput more, and more quickly as new processes are started

2) introduced more tolerance toward seeky queues (I verified that the
phenomenona described below used to occurr systematically):

   2a: if a queue is expired after having received very little
       service, then it is not punished as a seeky queue, even if it
       occurred to consume that little service too slowly; the
       rationale is that, if the new active queue has been served for
       a too short time interval, then its possible sequential
       accesses may not yet prevail on the initial latencies for
       moving the disk head on the first sector requested

   2b: the waiting time (disk idling) of a queue detected as seeky as
       a function of the position of the requests it issued is reduced
       to a very low value only after the queue has consumed a minimum
       fraction of the assigned budget; this prevents processes
       generating (partly) seeky workloads from being too ill-treated

   2c: if a queue has consumed 'enough' budget upon a budget timeout, then,
       even if it did not consume all of its budget, that queue is not punished
       as any seeky queue; the rationale is that, depending on the disk zones,
       a queue may be served at a lower rate than the estimated peak rate.

   Changes 2a and 2b have been critical in lowering latencies, whereas
   change 2c, in addition to change 1, helped a lot increase the disk
   throughput.

3) slightly changed the peak rate estimator: a low-pass filter is now
used instead of just keeping the highest rate sampled; the rationale
is that the peak rate of a disk should be quite stable, so the filter
should converge more or less smoothly to the right value; it seemed to
correctly catch the peak rate with all disks we used

4) added the low latency mechanism described in detail in
http://algo.ing.unimo.it/people/paolo/disk_sched/description.php.