[Deepsea-users] Deepsea fails to deploy OSDs in stage 3
Allen Sellars
asellars at vigilantnow.com
Sat Jul 6 05:24:36 MDT 2019
gdisk was reporting no MBR and no GPT partitions, so I assumed they were safe to use.
I’ll go through zeroing them out with this process and report back.
Thanks
Allen Sellars
asellars at vigilantnow.com
Sent from my iPhone
On Jul 5, 2019, at 18:04, Strahil <hunter86_bg at yahoo.com> wrote:
Hi Allen,
I think you need empty disks for DeepSea to 'target' them.
Can you wipe the beginning of each partition, plus the beginning and end of each disk?
Should be something like:
for partition in /dev/sdX[0-9]*
do
    dd if=/dev/zero of=$partition bs=4096 count=1 oflag=direct
done
dd if=/dev/zero of=/dev/sdX bs=512 count=34 oflag=direct
dd if=/dev/zero of=/dev/sdX bs=512 count=33 \
    seek=$(( $(blockdev --getsz /dev/sdX) - 33 )) oflag=direct
And then create a new GPT partition table:
sgdisk -Z --clear -g /dev/sdX
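With 52 disks to clean, the three dd invocations plus sgdisk can be wrapped per disk. Below is a dry-run sketch (the device name /dev/sdaa is only an example): it prints each command instead of executing it, so the output can be reviewed and then piped to sh on the node.

```shell
# Dry-run helper: print (not execute) the wipe commands for one disk.
# The device argument is an assumption -- substitute your real device names.
wipe_cmds() {
    dev=$1
    # zero the first 4 KiB of every existing partition
    for part in "$dev"[0-9]*; do
        [ -e "$part" ] && echo "dd if=/dev/zero of=$part bs=4096 count=1 oflag=direct"
    done
    # the MBR plus the primary GPT live in the first 34 sectors
    echo "dd if=/dev/zero of=$dev bs=512 count=34 oflag=direct"
    # the backup GPT occupies the last 33 sectors
    echo "dd if=/dev/zero of=$dev bs=512 count=33 seek=\$((\$(blockdev --getsz $dev) - 33)) oflag=direct"
    # finish with a fresh, empty GPT
    echo "sgdisk -Z --clear -g $dev"
}

wipe_cmds /dev/sdaa    # review the output, then: wipe_cmds /dev/sdaa | sh
```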
Source: https://www.suse.com/documentation/suse-enterprise-storage-5/pdfdoc/book_storage_deployment/book_storage_deployment.pdf
Best Regards,
Strahil Nikolov
On Jul 6, 2019 00:41, Allen Sellars <asellars at vigilantnow.com> wrote:
I have a Cisco UCS S3260 with 52 6TB spinning disks and 4 SSDs as DB disks.
I have no profile-* configs in the proposals directory.
I've obscured FQDNs.
Stages 0-2 run fine with no failures. When I run salt-run state.orch ceph.stage.3, my salt master returns this:
firewall : disabled
apparmor : disabled
subvolume : skipping
DEV_ENV : True
fsid : valid
public_network : valid
public_interface : valid
cluster_network : valid
cluster_interface : valid
ip_version : valid
monitors : valid
mgrs : valid
storage : valid
storage_role : valid
rgw : valid
ganesha : valid
master_role : valid
time_server : valid
fqdn : valid
Found DriveGroup <drive_group_default>
Calling dg.deploy on compound target I@roles:storage
[ERROR ] {'out': 'highstate', 'ret': {'s1103': {'module_|-wait for osd processes_|-cephprocesses.wait_|-run': {'name': 'cephprocesses.wait', 'changes': {'ret': False}, 'comment': 'Module function cephprocesses.wait executed', 'result': False, '__sls__': 'ceph.processes.osd.default', '__run_num__': 0, 'start_time': '21:11:52.567915', 'duration': 933229.3, '__id__': 'wait for osd processes'}}}}
admin_master:
Name: populate scrape configs - Function: salt.state - Result: Changed Started: - 21:03:44.756802 Duration: 6245.346 ms
Name: populate alertmanager peers - Function: salt.state - Result: Changed Started: - 21:03:51.002480 Duration: 2236.262 ms
Name: fileserver.clear_file_list_cache - Function: salt.runner - Result: Changed Started: - 21:03:53.239063 Duration: 795.163 ms
Name: install prometheus - Function: salt.state - Result: Clean Started: - 21:03:54.034548 Duration: 62470.849 ms
Name: push scrape configs - Function: salt.state - Result: Changed Started: - 21:04:56.505708 Duration: 1903.123 ms
Name: install alertmanager - Function: salt.state - Result: Clean Started: - 21:04:58.409127 Duration: 38233.69 ms
Name: populate grafana config fragments - Function: salt.state - Result: Clean Started: - 21:05:36.643118 Duration: 2777.631 ms
Name: install grafana - Function: salt.state - Result: Clean Started: - 21:05:39.420958 Duration: 109235.045 ms
Name: time - Function: salt.state - Result: Changed Started: - 21:07:28.656309 Duration: 46245.898 ms
Name: configuration check - Function: salt.state - Result: Clean Started: - 21:08:14.902504 Duration: 920.39 ms
Name: create ceph.conf - Function: salt.state - Result: Changed Started: - 21:08:15.823194 Duration: 5960.673 ms
Name: configuration - Function: salt.state - Result: Changed Started: - 21:08:21.784167 Duration: 2212.586 ms
Name: admin - Function: salt.state - Result: Clean Started: - 21:08:23.996960 Duration: 962.78 ms
Name: mgr keyrings - Function: salt.state - Result: Clean Started: - 21:08:24.960033 Duration: 1011.557 ms
Name: monitors - Function: salt.state - Result: Changed Started: - 21:08:25.971888 Duration: 17081.809 ms
Name: mgr auth - Function: salt.state - Result: Changed Started: - 21:08:43.054000 Duration: 5225.853 ms
Name: mgrs - Function: salt.state - Result: Changed Started: - 21:08:48.280055 Duration: 44721.595 ms
Name: install ca cert in mgr minions - Function: salt.state - Result: Changed Started: - 21:09:33.001952 Duration: 2575.651 ms
Name: retry.cmd - Function: salt.function - Result: Changed Started: - 21:09:35.577911 Duration: 1095.445 ms
Name: dashboard - Function: salt.state - Result: Changed Started: - 21:09:36.673648 Duration: 7548.494 ms
Name: osd auth - Function: salt.state - Result: Changed Started: - 21:09:44.222446 Duration: 2545.84 ms
Name: sysctl - Function: salt.state - Result: Changed Started: - 21:09:46.768599 Duration: 758.771 ms
Name: set osd keyrings - Function: salt.state - Result: Clean Started: - 21:09:47.527655 Duration: 749.077 ms
Name: disks.deploy - Function: salt.runner - Result: Changed Started: - 21:09:48.277007 Duration: 30133.606 ms
Name: mgr tuned - Function: salt.state - Result: Changed Started: - 21:10:18.410965 Duration: 19570.008 ms
Name: mon tuned - Function: salt.state - Result: Changed Started: - 21:10:37.981281 Duration: 18494.201 ms
Name: osd tuned - Function: salt.state - Result: Changed Started: - 21:10:56.475786 Duration: 13570.12 ms
Name: pools - Function: salt.state - Result: Changed Started: - 21:11:10.046234 Duration: 6984.189 ms
Name: wait until mon2 with role mon can be restarted - Function: salt.state - Result: Changed Started: - 21:11:17.030734 Duration: 7574.266 ms
Name: check if mon processes are still running on mon2 after restarting mons - Function: salt.state - Result: Changed Started: - 21:11:24.605309 Duration: 992.112 ms
Name: restarting mons on mon2 - Function: salt.state - Result: Clean Started: - 21:11:25.597732 Duration: 1626.034 ms
Name: wait until mon3 with role mon can be restarted - Function: salt.state - Result: Changed Started: - 21:11:27.224053 Duration: 6971.541 ms
Name: check if mon processes are still running on mon3 after restarting mons - Function: salt.state - Result: Changed Started: - 21:11:34.195896 Duration: 994.762 ms
Name: restarting mons on mon3 - Function: salt.state - Result: Clean Started: - 21:11:35.190958 Duration: 1620.595 ms
Name: wait until mon1 with role mon can be restarted - Function: salt.state - Result: Changed Started: - 21:11:36.811852 Duration: 7322.913 ms
Name: check if mon processes are still running on mon1 after restarting mons - Function: salt.state - Result: Changed Started: - 21:11:44.135071 Duration: 988.726 ms
Name: restarting mons on mon1 - Function: salt.state - Result: Clean Started: - 21:11:45.124101 Duration: 1625.383 ms
Name: wait until mon2 with role mgr can be restarted - Function: salt.state - Result: Changed Started: - 21:11:46.749778 Duration: 6997.631 ms
Name: check if mgr processes are still running on mon2 after restarting mgrs - Function: salt.state - Result: Changed Started: - 21:11:53.747726 Duration: 990.767 ms
Name: restarting mgr on mon2 - Function: salt.state - Result: Clean Started: - 21:11:54.738787 Duration: 1666.851 ms
Name: wait until mon3 with role mgr can be restarted - Function: salt.state - Result: Changed Started: - 21:11:56.405937 Duration: 6978.253 ms
Name: check if mgr processes are still running on mon3 after restarting mgrs - Function: salt.state - Result: Changed Started: - 21:12:03.384521 Duration: 2233.584 ms
Name: restarting mgr on mon3 - Function: salt.state - Result: Clean Started: - 21:12:05.618424 Duration: 1572.339 ms
Name: wait until mon1 with role mgr can be restarted - Function: salt.state - Result: Changed Started: - 21:12:07.191059 Duration: 6980.657 ms
Name: check if mgr processes are still running on mon1 after restarting mgrs - Function: salt.state - Result: Changed Started: - 21:12:14.172025 Duration: 992.372 ms
Name: restarting mgr on mon1 - Function: salt.state - Result: Clean Started: - 21:12:15.164731 Duration: 1618.461 ms
Name: wait until s1103 with role osd can be restarted - Function: salt.state - Result: Changed Started: - 21:12:16.783490 Duration: 6973.487 ms
----------
ID: check if osd processes are still running on s1103 after restarting osds
Function: salt.state
Result: False
Comment: Run failed on minions: s1103
Started: 21:12:23.757280
Duration: 934163.423 ms
Changes:
s1103:
----------
ID: wait for osd processes
Function: module.run
Name: cephprocesses.wait
Result: False
Comment: Module function cephprocesses.wait executed
Started: 21:11:52.567915
Duration: 933229.3 ms
Changes:
----------
ret:
False
Summary for s1103
------------
Succeeded: 0 (changed=1)
Failed: 1
------------
Total states run: 1
Total run time: 933.229 s
Summary for admin_master
-------------
Succeeded: 47 (changed=33)
Failed: 1
When running salt-run disks.report I get this:
salt-run disks.report
Found DriveGroup <drive_group_default>
Calling dg.report on compound target I@roles:storage
|_
----------
s1103:
|_
- 0
-
Total OSDs: 13
Solid State VG:
Targets: block.db Total size: 185.00 GB
Total LVs: 13 Size per LV: 14.23 GB
Devices: /dev/sdbc
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
[data] /dev/sdaa 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdae 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdai 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdam 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdaq 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdau 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sday 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdc 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdg 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdk 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdo 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sds 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdw 5.46 TB 100.0%
[block.db] vg: vg/lv 14.23 GB 7%
|_
- 0
-
Total OSDs: 13
Solid State VG:
Targets: block.db Total size: 371.00 GB
Total LVs: 13 Size per LV: 28.54 GB
Devices: /dev/sdbd
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
[data] /dev/sdab 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdaf 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdaj 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdan 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdar 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdav 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdaz 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdd 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdh 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdl 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdp 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdt 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdx 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
|_
- 0
-
Total OSDs: 13
Solid State VG:
Targets: block.db Total size: 371.00 GB
Total LVs: 13 Size per LV: 28.54 GB
Devices: /dev/sdbe
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
[data] /dev/sdac 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdag 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdak 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdao 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdas 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdaw 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdba 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sde 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdi 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdm 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdq 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdu 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdy 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
|_
- 0
-
Total OSDs: 13
Solid State VG:
Targets: block.db Total size: 371.00 GB
Total LVs: 13 Size per LV: 28.54 GB
Devices: /dev/sdbf
Type Path LV Size % of device
----------------------------------------------------------------------------------------------------
[data] /dev/sdad 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdah 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdal 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdap 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdat 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdax 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdbb 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdf 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdj 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdn 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdr 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdv 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
----------------------------------------------------------------------------------------------------
[data] /dev/sdz 5.46 TB 100.0%
[block.db] vg: vg/lv 28.54 GB 7%
Here’s my drive_groups.yml:
drive_group_default:
  target: 'I@roles:storage'
  data_devices:
    size: '2TB:'
    rotational: 1
  db_devices:
    rotational: 0
    size: '180GB:'
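The size filters are lower bounds ('2TB:' means 2 TB and larger, no upper limit). As a sanity check, the bucket logic can be approximated offline with a small awk filter; the sample rows below are invented values, and on a real minion you would feed it actual lsblk output instead.

```shell
# Offline sanity check of the drive group buckets: rotational disks of
# 2 TB and up -> data_devices, non-rotational disks of 180 GB and up
# -> db_devices. Input rows are "name rota size-in-bytes".
classify() {
    awk '$2 == 1 && $3 >= 2e12  { print $1, "data_devices"; next }
         $2 == 0 && $3 >= 180e9 { print $1, "db_devices";   next }
         { print $1, "no match" }'
}

# On a real minion:  lsblk -dbn -o NAME,ROTA,SIZE | classify
classify <<EOF
sdaa 1 6001175126016
sdbc 0 199049404416
sdb 1 1000204886016
EOF
```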
Here’s what I see on the minion itself:
2019-07-05 20:52:27,750 [salt.utils.decorators:613 ][WARNING ][65394] The function "module.run" is using its deprecated version and will expire in version "Sodium".
2019-07-05 21:07:26,314 [salt.utils.decorators:613 ][WARNING ][66129] The function "module.run" is using its deprecated version and will expire in version "Sodium".
2019-07-05 21:11:52,568 [salt.utils.decorators:613 ][WARNING ][70185] The function "module.run" is using its deprecated version and will expire in version "Sodium".
2019-07-05 21:11:52,692 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:11:55,801 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:12:01,899 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:12:11,001 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:12:23,104 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:12:38,210 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:12:56,321 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:13:17,437 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:13:41,557 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:14:08,676 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:14:38,801 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:15:11,930 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:15:48,031 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:16:27,128 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:17:09,232 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:17:54,371 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:18:42,510 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:19:33,653 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:20:27,799 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:21:24,951 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:22:25,107 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:23:25,270 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:24:25,425 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:25:25,582 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:26:25,737 [salt.loaded.ext.module.cephprocesses:252 ][ERROR ][70185] ERROR: process ceph-osd for role storage is not running
2019-07-05 21:27:25,796 [salt.loaded.ext.module.cephprocesses:389 ][ERROR ][70185] Timeout expired
2019-07-05 21:27:25,797 [salt.state :323 ][ERROR ][70185] {'ret': False}
What am I missing here? The OSDs deployed fine with DeepSea and Ceph 13 (Mimic) using the storage profiles in profile-*; I seem to be missing something with the new drive_groups setup.
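Given that disks.deploy reported Changed but cephprocesses.wait then timed out waiting for ceph-osd, a plausible next step is to check the minion itself for what ceph-volume actually created and why the units never started. These are standard Ceph/systemd tools (the "ceph-osd@*" unit glob assumes systemd-managed OSDs); the helper only prints the checklist for manual execution.

```shell
# Checklist of diagnostics to run by hand on the failing minion (s1103).
# The "ceph-osd@*" unit glob assumes systemd-managed OSD units.
triage_cmds() {
    echo 'ceph-volume lvm list'
    echo 'systemctl list-units "ceph-osd@*" --all'
    echo 'journalctl -u "ceph-osd@*" --since today'
}

triage_cmds
```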
Allen