[Deepsea-users] Deepsea fails to deploy OSDs in stage 3

Allen Sellars asellars at vigilantnow.com
Mon Jul 8 11:15:43 MDT 2019


So I’ve run through the documented disk reset process that you attached and rebooted the storage server, but I’m still getting the same error running stage 3. I adjusted the dd bs settings to accommodate the disks’ 4 KiB block size.
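
For reference, here’s roughly what the wipe looked like adjusted for the 4 KiB sectors (the sector counts are my own reading of the GPT layout on 4Kn disks, so treat them as a best guess):

# blockdev --getsz always reports 512-byte units, so derive the
# 4 KiB sector count from the byte size instead
sectors=$(( $(blockdev --getsize64 /dev/sdX) / 4096 ))

# primary GPT on 4Kn: protective MBR + header + 4 sectors of entries
dd if=/dev/zero of=/dev/sdX bs=4096 count=6 oflag=direct

# backup GPT on 4Kn: 4 entry sectors + header = the last 5 sectors
dd if=/dev/zero of=/dev/sdX bs=4096 count=5 seek=$((sectors - 5)) oflag=direct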

Is there somewhere that may give me more verbose log info on the specific task that’s failing, or a way that I can run that particular state by hand and debug the steps?
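
Something along these lines is what I have in mind (I’m guessing at the exact state name for the OSD step, and storage* stands in for my obscured FQDNs):

# run the whole orchestration with debug-level logging
salt-run -l debug state.orch ceph.stage.3

# or apply just the OSD deployment state to the storage node directly
salt 'storage*' state.apply ceph.osd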

I’m also open to other troubleshooting suggestions. I was able to get everything deployed on these OSDs with deepsea and ceph mimic, so I know the configuration was at least supported by the automation at some point.

Allen


From: Strahil <hunter86_bg at yahoo.com>
Date: Saturday, July 6, 2019 at 9:16 AM
To: Allen Sellars <asellars at vigilantnow.com>
Cc: deepsea-users <deepsea-users at lists.suse.com>
Subject: Re: [Deepsea-users] Deepsea fails to deploy OSDs in stage 3


Also one more thing: check if ceph-osd@X.service has failed and, if so, the reasons behind that.
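
For example (replace X with the OSD id):

systemctl status ceph-osd@X.service
journalctl -u ceph-osd@X.service -n 100 --no-pager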

Best Regards,
Strahil Nikolov
On Jul 6, 2019 14:24, Allen Sellars <asellars at vigilantnow.com> wrote:
gdisk was reporting no MBR and no GPT partitions, so I assumed they were safe to use.

I’ll go through zeroing them out with this process and report back.

Thanks
Allen Sellars
asellars at vigilantnow.com

-Sent from my iPhone

On Jul 5, 2019, at 18:04, Strahil <hunter86_bg at yahoo.com> wrote:

Hi Allen,

I think that you need empty disks for deepsea to 'target' them.

Can you wipe the beginning of each partition, the beginning of the disk, and the end of the disk?

Should be something like:

# wipe the first 4 KiB of each partition
for partition in /dev/sdX[0-9]*
do
    dd if=/dev/zero of=$partition bs=4096 count=1 oflag=direct
done

# wipe the primary GPT (first 34 512-byte sectors: protective MBR, header, entries)
dd if=/dev/zero of=/dev/sdX bs=512 count=34 oflag=direct

# wipe the backup GPT (last 33 512-byte sectors: entries, header)
dd if=/dev/zero of=/dev/sdX bs=512 count=33 \
    seek=$(( $(blockdev --getsz /dev/sdX) - 33 )) oflag=direct

And then create a new GPT partition table:

sgdisk -Z --clear -g /dev/sdX
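
You can then print the partition table to verify the disk is clean, e.g.:

sgdisk -p /dev/sdX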

Source: https://www.suse.com/documentation/suse-enterprise-storage-5/pdfdoc/book_storage_deployment/book_storage_deployment.pdf

Best Regards,
Strahil Nikolov
On Jul 6, 2019 00:41, Allen Sellars <asellars at vigilantnow.com> wrote:

I have a Cisco UCS S3260 with 52 x 6 TB spinning disks and 4 SSDs as DB disks.

I have no profile-* configs in the proposals directory.

I’ve obscured FQDNs.

Stages 0-2 run fine with no failures. When I run salt-run state.orch ceph.stage.3, my salt master returns this:

firewall                 : disabled
apparmor                 : disabled
subvolume                : skipping
DEV_ENV                  : True