[suse-sles-e] Node freeze (almost)

From: Silviu Marin-Caea (silviu_marin-caea_at_fieldinsights.ro)
Date: Wed Jul 11 2007 - 15:22:28 CEST


Message-Id: <200707111622.28148.silviu_marin-caea@fieldinsights.ro>

One of our cluster nodes keeps freezing all of a sudden. We haven't found
any probable cause. It has happened twice in a three-month period. This is
a mission-critical cluster, so such freezes are quite bad.

It only responds to ping. This is interesting, because other than that we
cannot get any reaction out of it and have to power cycle. We cannot connect
through ssh, the local console does not respond, and [NumLock] and
[CapsLock] don't work. It's not a kernel panic.

And there's nothing suspicious in any log; the entries just cut off.

It's a Dell PowerEdge 6650.

We have run diagnostic software and it didn't report anything.

I have enabled the magic SysRq key and set up a cron job running
top -b -n 1, hoping that it will give an indication of what's going on
before the crash.
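
For reference, a minimal sketch of the kind of setup described above. The
log path, the script name, the 40-line limit and the --run switch are all
assumptions made for illustration, not our exact configuration. The sync at
the end is there so that the last snapshot has a better chance of reaching
the disk before the node stops responding.

```shell
#!/bin/sh
# Hypothetical helper: append one timestamped batch-mode top snapshot to a log.
#
# Magic SysRq is enabled separately, as root, for example with:
#   echo 1 > /proc/sys/kernel/sysrq
# (or kernel.sysrq = 1 in /etc/sysctl.conf to make it persistent).

# Assumed log location; override by exporting LOG before running.
LOG="${LOG:-/var/log/top-snapshots.log}"
# Command that produces the snapshot; top -b -n 1 prints one batch iteration.
SNAPSHOT_CMD="${SNAPSHOT_CMD:-top -b -n 1}"

take_snapshot() {
    {
        # Header line so that each snapshot can be located by time.
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Keep only the summary and the first processes listed.
        $SNAPSHOT_CMD | head -n 40
    } >> "$LOG" 2>&1
    # Flush buffers so the data is on disk if the machine freezes right after.
    sync
}

# Only take a snapshot when explicitly asked to, e.g. from cron.
if [ "${1:-}" = "--run" ]; then
    take_snapshot
fi
```

Saved as, say, /usr/local/sbin/top-snapshot.sh (a made-up path), it could be
run every minute with a crontab line such as:

  * * * * * /usr/local/sbin/top-snapshot.sh --run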

Do you have any other ideas about what we could do?
