Re: Antw: Re: [suse-sles-e] Re: catastrophic disk-performance on SLES10 with SATA drives

From: matilda matilda (matilda_at_grandel.de)
Date: Tue Apr 03 2007 - 09:55:54 CEST


Message-Id: <461224AA.117B.0044.0@grandel.de>

Hi Markus,

do you see behaviour similar to ours when you try to write a file bigger
than what fits into memory? That is, a growing number of [pdflush] kernel threads and
an increasing average load on the machine.

Could you please try this?
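
A rough way to watch for this while a large dd runs in another shell is to sample the number of [pdflush] threads and the 1-minute load average at a fixed interval. This is only a sketch and not something taken from this thread: the function names count_pdflush, load1 and watch_writeback are made up for illustration, and it assumes a Linux /proc filesystem. On kernels that have no pdflush threads the count simply stays at 0.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the original thread.

# Count processes whose command name is "pdflush".
# $1 = proc root (defaults to /proc); each <pid>/stat starts with "pid (comm) ...".
count_pdflush() {
    root="${1:-/proc}"
    n=0
    for f in "$root"/[0-9]*/stat; do
        [ -r "$f" ] || continue
        # A process may exit between the glob and the read; skip it then.
        read -r _pid comm _rest 2>/dev/null < "$f" || continue
        if [ "$comm" = "(pdflush)" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Print the 1-minute load average (first field of the loadavg file).
# $1 = loadavg file (defaults to /proc/loadavg).
load1() {
    cut -d' ' -f1 "${1:-/proc/loadavg}"
}

# Print one sample line per interval.
# $1 = number of samples, $2 = seconds between samples.
watch_writeback() {
    samples="$1"
    interval="$2"
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date +%T) pdflush=$(count_pdflush) load1=$(load1)"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Start something like `watch_writeback 60 1` first and then the dd in a second terminal. If both columns jump once the data no longer fits into memory, that would match the behaviour asked about here.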

Are you working with the HP-provided drivers or with the drivers that come with SLES?

Which SLES version do you have?

We're working with an internal RAID. As soon as all the data fits into memory, the "benchmark"
is not worth much. :-) 551 MB/s, that is a RAID I would like. What do you want for it? ;-)
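
The small run says little because dd returns as soon as the data sits in the page cache. As a sketch (not part of the original exchange), asking dd to flush the file before it exits makes the reported time include the actual disk writes. It assumes a dd that accepts conv=fsync, as GNU coreutils does; the function name sync_write_test and the target path in the comment are placeholders.

```shell
#!/bin/sh
# Sketch only: sync_write_test is a hypothetical helper name.

# Write $2 blocks of 16 KiB zeros to $1 and fsync the file before dd exits,
# so the throughput dd prints covers the flush to disk, not just the cache fill.
sync_write_test() {
    dd if=/dev/zero of="$1" bs=16k count="$2" conv=fsync
}

# Example (writes about 164 MB; adjust path and count to your system):
#   sync_write_test /frei/test 10000
```

With the flush included, a file that fits into memory should no longer report cache speed, which makes the small and the large run easier to compare.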

Best regards
Andreas Mock

>>> Markus Roeffen <markus.roeffen@iws.uni-stuttgart.de> 03.04.2007 09:39 >>>
Hello Andreas,

we've run a few benchmarks and the results are similar.

On the internal SCSI RAID (RAID 1) we have had the same experience. As
soon as the data grows large, the performance drops.

server:~ # time dd if=/dev/zero of=/frei/test bs=16k count=100000
100000+0 records in
100000+0 records out
1638400000 bytes (1.6 GB) copied, 48.4464 seconds, 33.8 MB/s

real 0m48.789s
user 0m0.060s
sys 0m3.404s

server:~ # time dd if=/dev/zero of=/frei/test bs=16k count=10000
10000+0 records in
10000+0 records out
163840000 bytes (164 MB) copied, 0.297434 seconds, 551 MB/s

On our external RAID (RAID 5) the results are different. The performance
is poor in both cases.

server:~ # time dd if=/dev/zero of=/home/jan/test bs=16k count=10000
10000+0 records in
10000+0 records out
163840000 bytes (164 MB) copied, 4.13542 seconds, 39.6 MB/s

real 0m6.016s
user 0m0.000s
sys 0m0.476s

server:~ # time dd if=/dev/zero of=/home/jan/test bs=16k count=100000
100000+0 records in
100000+0 records out
1638400000 bytes (1.6 GB) copied, 43.4186 seconds, 37.7 MB/s

real 0m43.457s
user 0m0.048s
sys 0m4.412s

Do you use an internal or an external RAID?

I've got news from HP, too. They wanted us to check our server with a
special version of cfg2html-linux (cfg2html-linux HP Proliant Version
1.24). The results have been mailed to HP, so maybe we'll get an answer
this week.

Best regards
Markus

matilda matilda wrote:
> Hi all,
>
> we experienced two main problems:
>
> 1) As soon as we have write requests for large files in combination with regular read requests, performance
> suffers dramatically.
>
> 2) We have a second problem, and I would be interested to see whether other people with that controller see
> the same phenomenon. I never got a satisfying answer from Novell support. (SLES9 SP3; I'm also interested in
> seeing that with SLES 10.)
> When you start the command 'time dd if=/dev/zero of=data.out bs=8192 count=1310720' (10 737 418 240 bytes,
> 10 GB, something much bigger than physical memory) everything seems fine at first. As soon as the data can no
> longer be held in memory (dirty buffers), the performance of the whole system decreases dramatically.
> cat /proc/loadavg shows an increase of the average system load of approx. 10 (!). The shell is very, very
> unresponsive and you suddenly have many more [pdflush] kernel threads than when you started the test.
>
> As far as I know/have read, the pdflush kernel thread is responsible for writing the dirty buffers back to
> physical storage, and it tries to do this as quickly as possible. BUT: in my opinion it does not make sense that
> so many of them try to write to the very same physical disk. It will not be faster because the "spindle" is the
> bottleneck. If we try the very same on a physical device not controlled by the cciss driver we never see such
> a big increase of [pdflush] threads. In that case the system behaves as I expect it to.
>
> Anyway, the system should stay responsive, as the processor has nothing to do but wait for the blocks to be
> written to disk, but the system's performance goes down dramatically. I don't understand this behaviour.
> At the moment it's the best local DoS attack you can think of. :-))
>
> Would someone please run such a test? It's rather simple and would be very interesting.
>
> Thank you in advance
> Andreas Mock
>

-- 
-------------------------------
Markus Roeffen
Institut für Wasserbau IWS
Universität Stuttgart
Pfaffenwaldring 61
70569 Stuttgart
Tel: +49(0)711/685-67010
Fax: +49(0)711/685-67020
Mail: markus.roeffen@iws.uni-stuttgart.de 
---------------------------------------------------------------------
To unsubscribe, e-mail: suse-sles-e-unsubscribe@suse.com 
For additional commands, e-mail: suse-sles-e-help@suse.com 
---------------------------------------------------------------------

