[RFC] writeback and cgroup

Fengguang Wu fengguang.wu at intel.com
Wed Apr 25 12:05:02 UTC 2012


> > So the cfq behavior is pretty nondeterministic. I more or less realized
> > this from the experiments. For example, when starting 2+ "dd oflag=direct"
> > tasks in one single cgroup, they _sometimes_ progress at different rates.
> > See the attached graphs for two such examples on XFS. ext4 is fine.
> > 
> > The 2-dd test case is:
> > 
> > mkdir /cgroup/dd
> > echo $$ > /cgroup/dd/tasks
> > 
> > dd if=/dev/zero of=/fs/zero1 bs=1M oflag=direct &
> > dd if=/dev/zero of=/fs/zero2 bs=1M oflag=direct &
> > 
> > The 6-dd test case is similar.
>   Hum, interesting. I would not expect that. Maybe it's because the files
> are allocated in different areas of the disk. But even then the difference
> should not be *that* big.

Agreed.

> > > > Look at this graph, the 4 dd tasks are granted the same weight (2 of
> > > > them are buffered writes). I guess the 2 buffered dd tasks managed to
> > > > progress much faster than the 2 direct dd tasks just because the async
> > > > IOs are much more efficient than the bs=64k direct IOs.
> > >   Likely because 64k is too low to get good bandwidth with direct IO. If
> > > it was 4M, I believe you would get similar throughput for buffered and
> > > direct IO. So essentially you are right, small IO benefits from caching
> > > effects since they allow you to submit larger requests to the device which
> > > is more efficient.
> > 
> > I didn't directly compare the effects; however, here is an example of
> > doing 1M, 64k, 4k direct writes in parallel. It _seems_ bs=1M has only
> > marginal benefits over 64k, assuming cfq is behaving well.
> > 
> > https://github.com/fengguang/io-controller-tests/raw/master/log/snb/ext4/direct-write-1M-64k-4k.2012-04-19-10-50/balance_dirty_pages-task-bw.png
> > 
> > The test case is:
> > 
> > # cgroup 1
> > echo 500 > /cgroup/cp/blkio.weight
> > 
> > dd if=/dev/zero of=/fs/zero-1M bs=1M oflag=direct &
> > 
> > # cgroup 2
> > echo 1000 > /cgroup/dd/blkio.weight
> > 
> > dd if=/dev/zero of=/fs/zero-64k bs=64k oflag=direct &
> > dd if=/dev/zero of=/fs/zero-4k  bs=4k  oflag=direct &
>   Um, I'm not completely sure what you tried to test in the above test.

Yeah, it's not a good test case. I've changed it to run the 3 dd tasks
in 3 cgroups with equal weight. The new results are attached (they look
the same as the original ones).
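
For reference, the revised setup can be sketched roughly as below. This is
only an illustration, not the exact script used for the attached results:
the cgroup names (dd-1M, dd-64k, dd-4k), the weight value 500, and the
helper function names are assumptions. It prints the commands by default
and only executes them when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: three direct-IO dd writers, one per cgroup, all with the same
# blkio.weight. Cgroup names and the weight of 500 are assumptions.

CGROUP_ROOT=${CGROUP_ROOT:-/cgroup}
FS_ROOT=${FS_ROOT:-/fs}
WEIGHT=${WEIGHT:-500}
DRY_RUN=${DRY_RUN:-1}

# Print a command line in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        sh -c "$*"
    fi
}

# Create a cgroup for block size $1, set its weight, and start a
# direct-IO dd inside it.
start_dd_in_cgroup() {
    bs=$1
    cg=$CGROUP_ROOT/dd-$bs
    # The child shell moves itself into the cgroup, then execs dd, so dd
    # runs with the same PID that was written to the tasks file.
    cmd="echo \$\$ > $cg/tasks && exec dd if=/dev/zero of=$FS_ROOT/zero-$bs bs=$bs oflag=direct"
    run "mkdir -p $cg"
    run "echo $WEIGHT > $cg/blkio.weight"
    if [ "$DRY_RUN" = 1 ]; then
        echo "sh -c '$cmd' &"
    else
        sh -c "$cmd" &
    fi
}

# Start the 1M, 64k and 4k writers, each in its own equal-weight cgroup.
run_equal_weight_test() {
    for size in 1M 64k 4k; do
        start_dd_in_cgroup "$size"
    done
}
```

Calling run_equal_weight_test with the defaults only lists the commands;
running it as root with DRY_RUN=0 starts the three writers, which then
have to be stopped by hand.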

> What I wanted to point out is that direct IO is not necessarily less
> efficient than buffered IO. Look:
> xen-node0:~ # uname -a
> Linux xen-node0 3.3.0-rc4-xen+ #6 SMP PREEMPT Tue Apr 17 06:48:08 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.5304 s, 102 MB/s
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync
> 1024+0 records in
> 1024+0 records out
> 1073741824 bytes (1.1 GB) copied, 10.3678 s, 104 MB/s
> 
> So both direct and buffered IO are about the same. Note that I used the
> conv=fsync flag to eliminate the effect of part of the buffered write still
> remaining in the cache when dd finishes writing, which is unfair to the
> direct writer...

OK, I also find direct writes to be a bit faster than buffered writes:

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 conv=fsync

1073741824 bytes (1.1 GB) copied, 10.4039 s, 103 MB/s
1073741824 bytes (1.1 GB) copied, 10.4143 s, 103 MB/s

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=1M count=1024 oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 9.9006 s, 108 MB/s
1073741824 bytes (1.1 GB) copied, 9.55173 s, 112 MB/s

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 9.83902 s, 109 MB/s
1073741824 bytes (1.1 GB) copied, 9.61725 s, 112 MB/s
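
The count= values in these runs are chosen so that every block size writes
the same 1 GiB (1073741824 bytes). A small sketch of that arithmetic,
assuming block sizes are given in bytes (blocks_for is a hypothetical
helper name, not a dd option):

```shell
# Number of blocks of size $1 bytes needed to write $2 bytes in total.
blocks_for() {
    echo $(( $2 / $1 ))
}

TOTAL=$((1 << 30))                  # 1 GiB, the size written by each run

blocks_for $((1 << 20)) "$TOTAL"    # count for bs=1M
blocks_for $((64 << 10)) "$TOTAL"   # count for bs=64k
blocks_for $((4 << 10)) "$TOTAL"    # count for bs=4k
```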

> And actually 64k vs 1M makes a big difference on my machine:
> xen-node0:~ # dd if=/dev/zero of=/mnt/file bs=64k count=16384 oflag=direct conv=fsync
> 16384+0 records in
> 16384+0 records out
> 1073741824 bytes (1.1 GB) copied, 19.3176 s, 55.6 MB/s

Interestingly, my 64k direct writes are as fast as 1M direct writes...
and 4k writes run at ~1/4 speed:

root@snb /home/wfg# dd if=/dev/zero of=/mnt/file bs=4k count=$((256<<10)) oflag=direct conv=fsync

1073741824 bytes (1.1 GB) copied, 42.0726 s, 25.5 MB/s
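
The rates dd prints can be rechecked from the byte count and elapsed time,
which also makes the "~1/4 speed" comparison explicit. A minimal sketch,
assuming dd's convention of 1 MB = 10^6 bytes (mbps and speed_ratio are
hypothetical helper names):

```shell
# Throughput in MB/s from a byte count ($1) and elapsed seconds ($2).
mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Relative speed of a slow run versus a fast run that wrote the same
# amount of data: fast elapsed time ($1) divided by slow elapsed time ($2).
speed_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

mbps 1073741824 42.0726        # the 4k direct write above
speed_ratio 9.61725 42.0726    # 4k run relative to the second 64k run
```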

Thanks,
Fengguang
-------------- next part --------------
A non-text attachment was scrubbed...
Name: balance_dirty_pages-task-bw.png
Type: image/png
Size: 61279 bytes
Desc: not available
URL: <http://lists.linuxfoundation.org/pipermail/containers/attachments/20120425/2b157094/attachment-0001.png>

