Re: [suse-sles-e] Very high MD latencies with SLES10 SP1.

From: Norbert Paschedag (noe_at_physik.unizh.ch)
Date: Mon Dec 10 2007 - 12:55:24 CET

Message-ID: <Pine.LNX.4.64.0712101244580.6336@hesse.math.tu-berlin.de>

On Fri, 7 Dec 2007, Neil Brown wrote:

> > we're having a bit of a performance problem with our MD RAID1 setup after
> > having switched to SLES10 SP1 (from SLES9 SP3). For random-I/O in particular,
> > we're seeing maximum read and write latencies up to _several seconds_ (with averages
> > around 30ms) where SLES9 showed maxima around 500ms with avg around 20ms.
> >
> > Typical outputs (randomio-1.3, using O_DIRECT):
> >
> > sles9sp3:
> >   total | read: latency (ms)                 | write: latency (ms)
> >    iops |   iops   min    avg     max   sdev |   iops   min    avg     max   sdev
> > --------+------------------------------------+-----------------------------------
> >   636.8 |  476.7   0.2   20.6   501.6   39.1 |  160.1   0.4    1.0    25.0    1.4
> >
> > sles10sp1:
> >   total | read: latency (ms)                 | write: latency (ms)
> >    iops |   iops   min    avg     max   sdev |   iops   min    avg     max   sdev
> > --------+------------------------------------+-----------------------------------
> >   350.1 |  262.5   0.2   29.9  7114.5  284.6 |   87.6   0.5   22.0  6511.6  253.6
>
> This looks bad. I've tried on my machine and I cannot reproduce it,
> so presumably the drive subsystem has a significant effect (I've got some
> fairly boring SATA drives on a cheap SATA card).
>
> I presume you are running 'randomio' on a file that you created in the
> XFS partition - is that correct?

Correct.
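
For readers who want to collect comparable numbers without randomio, the
following is a minimal sketch that times random reads on a file and
summarizes them in the same min/avg/max/sdev columns. It is not randomio
and not the method used in this thread: it reads through the page cache
rather than with O_DIRECT, the function names and defaults are made up
for illustration, and the use of the population standard deviation is an
assumption about how sdev is defined.

```python
import math
import os
import random
import time


def time_random_reads(path, count=100, block=4096, seed=0):
    """Time `count` block-aligned random reads; return latencies in ms.

    Note: plain buffered reads, NOT O_DIRECT as in the thread, so cached
    data will look much faster than the underlying device.
    """
    rng = random.Random(seed)
    blocks = max(os.path.getsize(path) // block, 1)
    fd = os.open(path, os.O_RDONLY)
    latencies = []
    try:
        for _ in range(count):
            offset = rng.randrange(blocks) * block
            start = time.perf_counter()
            os.pread(fd, block, offset)
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
    return latencies


def summarize(latencies_ms):
    """Return (min, avg, max, sdev) for a list of latencies in ms."""
    n = len(latencies_ms)
    if n == 0:
        raise ValueError("no samples")
    avg = sum(latencies_ms) / n
    # Population standard deviation (assumed definition of "sdev").
    var = sum((x - avg) ** 2 for x in latencies_ms) / n
    return min(latencies_ms), avg, max(latencies_ms), math.sqrt(var)
```

Because the maximum is set by a single sample, it is worth keeping the
raw latency list so that outliers can be inspected individually.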
 
> Could you try failing one half of the mirror so that all the requests go
> just to the one drive?

OK, I failed each half in turn (not both at the same time, of course),
and the timings didn't change.
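
As a sketch of how such a test might be carried out with mdadm's manage
mode, the helper below only builds the command lines and does not run
them. The array and member names (/dev/md0, /dev/sda1, /dev/sdb1) are
placeholders, not the devices used in this thread, and the helper
functions themselves are hypothetical.

```python
def fail_half_commands(array, member):
    """Commands to mark one mirror member faulty and detach it."""
    return [
        ["mdadm", "--manage", array, "--fail", member],
        ["mdadm", "--manage", array, "--remove", member],
    ]


def restore_half_commands(array, member):
    """Command to put the member back; md then resyncs the mirror."""
    return [["mdadm", "--manage", array, "--add", member]]


if __name__ == "__main__":
    # Placeholder device names, for illustration only.
    for member in ("/dev/sda1", "/dev/sdb1"):
        for cmd in fail_half_commands("/dev/md0", member):
            print(" ".join(cmd))
        print("# run the benchmark against the degraded array here")
        for cmd in restore_half_commands("/dev/md0", member):
            print(" ".join(cmd))
        print("# wait for the resync in /proc/mdstat to finish")
```

The resync after each re-add has to complete before the other half is
failed, otherwise the array is left without a current copy of the data.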

> Also, would it be possible to make a single drive into a RAID0? That
> might help show if the problem is specific to raid1, or if it affects
> md more generally.

With RAID0, both on a single drive and using both drives, I did not see
these latencies. So I suspect the raid1 driver.
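
For anyone repeating the RAID0 comparison, this is a rough sketch of the
mdadm invocations involved, again only building the command lines. The
device names are placeholders and the helper is hypothetical. It assumes
that mdadm refuses a one-device array unless --force is given before the
device count, which is my understanding of its behaviour. Note that
--create overwrites the md metadata on the listed devices, so it should
only be pointed at scratch partitions.

```python
def raid0_create_command(array, members):
    """Build an `mdadm --create` command line for a RAID0 array."""
    if not members:
        raise ValueError("at least one member device is required")
    cmd = ["mdadm", "--create", array, "--level=0"]
    if len(members) == 1:
        # A single-device array is unusual; mdadm wants --force for it.
        cmd.append("--force")
    cmd.append(f"--raid-devices={len(members)}")
    cmd.extend(members)
    return cmd


if __name__ == "__main__":
    # Placeholder device names, for illustration only.
    print(" ".join(raid0_create_command("/dev/md1", ["/dev/sdc1"])))
    print(" ".join(raid0_create_command("/dev/md2", ["/dev/sdc1", "/dev/sdd1"])))
```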

However, I also tried the same benchmark with two internal SCSI drives
(attached to an AIC-7902B) and did not see any strange latencies
with RAID1 either.
Is there anything FC-specific in the raid1 driver?

Regards,
  Norbert
