From: Alexei_Roudnev (Alexei_Roudnev_at_exigengroup.com)
Date: Mon Dec 10 2007 - 20:21:22 CET
Message-ID: <0e5301c83b61$d93058f0$7031a8c0@exigengroup.com>
Subject: Re: [suse-sles-e] Very high MD latencies with SLES10 SP1.
Your numbers look very strange (2 times slower on SLES10). Moreover, they
cannot be explained by different cache settings, for example, because
random reads don't depend on the cache at all.
Can you test a plain direct read from the disk, such as
time dd if=/dev/sdb of=/dev/null bs=1M count=8000
(the total read must be more than your RAM + RAID cache, and you can try
different block sizes)?
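A rough sketch of how the count could be sized so that the read exceeds RAM plus the controller cache, and then repeated for a few block sizes. The helper names dd_count and run_dd_tests, the 512 MB cache figure and the block sizes 1/4/16 MB are assumptions for illustration only; /dev/sdb is the device from the command above.

```shell
#!/bin/bash
# Hypothetical helper: number of blocks of BS_MB megabytes needed to read
# more than RAM + RAID cache, so that no cache can satisfy the whole test.
# Usage: dd_count RAM_KB CACHE_MB BS_MB
dd_count() {
    ram_mb=$(( ($1 + 1023) / 1024 ))      # round RAM up to whole MB
    echo $(( (ram_mb + $2) / $3 + 1 ))    # +1 block: exceed, not just match
}

# Sequential read test at several block sizes (read-only, needs root).
# The 512 MB controller cache is an assumed value; adjust to your hardware.
run_dd_tests() {
    dev=${1:-/dev/sdb}
    ram_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    for bs_mb in 1 4 16; do
        count=$(dd_count "$ram_kb" 512 "$bs_mb")
        echo "bs=${bs_mb}M count=$count"
        time dd if="$dev" of=/dev/null bs="${bs_mb}M" count="$count"
    done
}
```

Calling run_dd_tests /dev/sdb would run the three reads in turn. Nothing is executed until the function is called.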
Is it possible that you simply have different FS block sizes in SLES9 vs
SLES10, so that the I/O subsystems have the same performance BUT filesystem
operations require more I/O on your SLES10 system?
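One way to compare the block sizes, sketched under the assumption that GNU coreutils stat is available on both hosts. The function name fs_block_size and the mount point /data are hypothetical.

```shell
#!/bin/sh
# Print the fundamental block size (in bytes) of the filesystem that holds
# the given path. Run this on both the SLES9 and the SLES10 host and compare.
fs_block_size() {
    stat -f -c %S "${1:-.}"
}

fs_block_size /
# For an XFS filesystem, xfs_info reports the block size (bsize) together
# with the sector size; /data is a hypothetical mount point:
#   xfs_info /data
```

If the two hosts report different values for the filesystem under test, the same benchmark may be issuing a different number of I/Os on each.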
(Run your tests with iostat 5 running at the same time, and post the results
here, please. Then do the same for the dd command: run it together with
iostat 5 and post those results here too, please.)
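A minimal sketch of running iostat alongside a test and capturing its output in one log. The wrapper name with_iostat and the log file names are assumptions; the -x flag (extended per-device statistics) is my addition, and plain iostat 5 works the same way.

```shell
#!/bin/sh
# Run a workload while iostat samples in the background, then stop iostat.
# Usage: with_iostat LOGFILE COMMAND [ARGS...]
with_iostat() {
    log=$1; shift
    iostat -x 5 > "$log" 2>&1 &        # one sample every 5 seconds
    mon=$!
    rc=0
    "$@" || rc=$?                      # run the workload, keep its exit status
    kill "$mon" 2>/dev/null || true    # stop the sampler
    wait "$mon" 2>/dev/null || true
    return "$rc"
}

# Example (not executed here):
#   with_iostat iostat-dd.log dd if=/dev/sdb of=/dev/null bs=1M count=8000
```

The wrapper returns the exit status of the workload, so it can stand in for the bare command in an existing test script.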
----- Original Message -----
From: "Norbert Paschedag" <noe@physik.unizh.ch>
To: "Neil Brown" <neilb@suse.de>
Cc: <suse-sles-e@suse.com>
Sent: Monday, December 10, 2007 3:55 AM
Subject: Re: [suse-sles-e] Very high MD latencies with SLES10 SP1.
>
>
> On Fri, 7 Dec 2007, Neil Brown wrote:
>
>> > we're having a bit of a performance problem with our MD RAID1 setup
>> > after having switched to SLES10 SP1 (from SLES9 SP3). For random-I/O
>> > in particular, we're seeing maximum read and write latencies up to
>> > _several seconds_ (with averages around 30ms) where SLES9 showed
>> > maxima around 500ms with avg around 20ms.
>> >
>> > Typical outputs (randomio-1.3, using O_DIRECT):
>> >
>> > sles9sp3:
>> >   total |  read:         latency (ms)       |  write:        latency (ms)
>> >    iops |   iops   min    avg    max   sdev |   iops   min    avg    max   sdev
>> > --------+-----------------------------------+----------------------------------
>> >   636.8 |  476.7   0.2   20.6  501.6   39.1 |  160.1   0.4    1.0   25.0    1.4
>> >
>> > sles10sp1:
>> >   total |  read:         latency (ms)       |  write:        latency (ms)
>> >    iops |   iops   min    avg    max   sdev |   iops   min    avg    max   sdev
>> > --------+-----------------------------------+----------------------------------
>> >   350.1 |  262.5   0.2   29.9 7114.5  284.6 |   87.6   0.5   22.0 6511.6  253.6
>>
>> This looks bad. I've tried on my machine and I cannot reproduce it,
>> so presumably the drive subsystem has a significant effect (I've got some
>> fairly boring SATA drives on a cheap SATA card).
>>
>> I presume you are running 'randomio' on a file that you created in the
>> XFS partition - is that correct?
>
> Correct.
>
>> Could you try failing one half of the mirror so that all the requests go
>> just to the one drive?
>
> O.k, I failed both (not at the same time, of course) and the timings
> didn't change.
>
>> Also, would it be possible to make a single drive into a RAID0? That
>> might help show if the problem is specific to raid1, or if it affects
>> md more generally.
>
> With RAID0, both on a single drive and using both drives, I did not see
> these latencies. So I suspect the raid1 driver.
>
> However, I also tried the same benchmark with two internal SCSI drives
> (attached to an AIC-7902B) and did not see any strange latencies
> with RAID1 either.
> Is there anything FC-specific in the raid1 driver ?
>
> Regards,
> Norbert
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: suse-sles-e-unsubscribe@suse.com
> For additional commands, e-mail: suse-sles-e-help@suse.com
>
>
This archive was generated by hypermail 2.1.7 : Mon Dec 10 2007 - 09:21:49 CET