[Deepsea-users] stage 1 errors on Azure
Kevin Ayres
kevin.ayres at suse.com
Wed Sep 19 13:20:43 MDT 2018
Thanks Joel, yes - DNS and NTP are configured and behaving correctly. SLES 12 SP3/SES 5 from the current repos. The salt-api, salt-master, and salt-minion services are all running (with one error, shown below).
I’m walking through the Deployment Guide line by line with the same result, now on my second freshly built master node. Salt output is at the bottom of this message. Key point: after stage 0, the */proposals directory has NOT been created.
Here’s my build on a single flat network (Azure vNet 172.19.20.0/24):
Root SSH enabled, with key-based login from the master to all nodes as root. All nodes were rebooted before the Salt stages.
All nodes use an identical, fully patched image (CPE_NAME="cpe:/o:suse:sles:12:sp3"), firewall off, etc. - the Azure instance defaults.
Salt (and all nodes):~ # zypper lr -E
Repository priorities are without effect. All enabled repositories share the same priority.
# | Alias | Name | Enabled | GPG Check | Refresh
---+--------------------------------------------------------------------+-----------------------------------+---------+-----------+--------
3 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Pool | SUSE-Enterprise-Storage-5-Pool | Yes | (r ) Yes | No
5 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Updates | SUSE-Enterprise-Storage-5-Updates | Yes | (r ) Yes | Yes
8 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Pool | SLES12-SP3-Pool | Yes | (r ) Yes | No
10 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Updates | SLES12-SP3-Updates | Yes | (r ) Yes | Yes
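In case it helps to rule out drift between nodes, here is a sketch of how I would compare repo and package state across all minions from the master with plain Salt execution calls (the master-side package name deepsea is an assumption on my part):
```bash
# Compare enabled repos and salt-minion versions across all minions;
# per-minion output makes any drift between nodes easy to spot.
salt '*' cmd.run 'zypper lr -E'
salt '*' cmd.run 'rpm -q salt-minion'
# DeepSea itself only needs to be on the master (assumed package name: deepsea).
rpm -q deepsea
```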
**DNS** All nodes resolve bidirectionally. Azure handles DNS, but I’ve also updated the hosts files.
salt:~ # hostname
salt
salt:~ # ping salt
PING salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1) 56(84) bytes of data.
64 bytes from salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1): icmp_seq=1 ttl=64 time=0.030 ms
104.211.27.224 Outside NAT to 172.19.20.10 salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net salt
172.19.20.12 mon1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon1
172.19.20.13 mon2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon2
172.19.20.14 mon3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon3
172.19.20.15 osd1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd1
172.19.20.16 osd2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd2
172.19.20.17 osd3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd3
172.19.20.18 igw1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net ogw1
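A sketch of how the same resolution checks could be driven from the master for every node at once, using standard Salt calls (getent goes through the same resolver order the services use):
```bash
# Resolve the master's short name from every node through the normal
# resolver order, and show each minion's own FQDN/IP as Salt sees it;
# a name that resolves to 127.0.0.1 on only one node will stand out.
salt '*' cmd.run 'getent hosts salt'
salt '*' grains.item fqdn ipv4
```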
**NTP** All nodes sync at a 5-minute interval to the same Stratum 1 server in the same geo as the Azure region (US East), navobs1.gatech.edu, as shown:
bash-3.2$ pssh -h pssh-hosts -l sesuser -i sudo ntpq -p
[1] 11:15:27 [SUCCESS] mon3
remote refid st t when poll reach delay offset jitter
==============================================================================
*navobs1.gatech. .GPS. 1 u 19 64 1 15.596 -4.863 0.333
[2] 11:15:27 [SUCCESS] salt
remote refid st t when poll reach delay offset jitter
==============================================================================
navobs1.gatech. .GPS. 1 u 42 64 1 17.063 -6.702 0.000
[3] 11:15:27 [SUCCESS] igw1
remote refid st t when poll reach delay offset jitter
==============================================================================
*navobs1.gatech. .GPS. 1 u 18 64 1 17.394 -27.874 7.663
[4] 11:15:27 [SUCCESS] osd1
remote refid st t when poll reach delay offset jitter
==============================================================================
*navobs1.gatech. .GPS. 1 u 21 64 1 16.962 -3.755 0.813
[5] 11:15:27 [SUCCESS] osd2
remote refid st t when poll reach delay offset jitter
==============================================================================
*navobs1.gatech. .GPS. 1 u 22 64 1 15.832 -4.709 3.062
[6] 11:15:27 [SUCCESS] osd3
remote refid st t when poll reach delay offset jitter
==============================================================================
*navobs1.gatech. .GPS. 1 u 26 64 1 15.877 -3.252 19.131
[7] 11:15:27 [SUCCESS] mon1
remote refid st t when poll reach delay offset jitter
==============================================================================
navobs1.gatech. .GPS. 1 u 2 64 1 16.120 -4.263 0.000
[8] 11:15:27 [SUCCESS] mon2
remote refid st t when poll reach delay offset jitter
==============================================================================
navobs1.gatech. .GPS. 1 u 2 64 1 16.108 -7.713 0.959
**SALT**
salt:~ # systemctl status salt-api salt-master salt-minion |grep 'active (running)'
Active: active (running) since Wed 2018-09-19 18:14:39 UTC; 13min ago
Active: active (running) since Wed 2018-09-19 18:14:42 UTC; 13min ago
Active: active (running) since Wed 2018-09-19 18:14:41 UTC; 13min ago
salt:~ # systemctl status salt-api salt-master salt-minion |grep ERROR
Sep 19 18:14:49 salt salt-minion[1413]: [ERROR ] Function cephimages.list in mine_functions failed to execute
salt:~ # salt-key --list-all
Accepted Keys:
igw1
mon1
mon2
mon3
osd1
osd2
osd3
salt
Denied Keys:
Unaccepted Keys:
Rejected Keys:
salt:~ # salt '*' test.ping
salt:
True
osd2:
True
mon3:
True
osd3:
True
osd1:
True
mon2:
True
igw1:
True
mon1:
True
salt:~ # cat /srv/pillar/ceph/master_minion.sls
master_minion: salt
salt:~ # cat /srv/pillar/ceph/deepsea_minions.sls
...
# Choose all minions
deepsea_minions: '*'
...
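A sketch of how the pillar targeting could be double-checked after editing that file, using stock Salt calls (whether pillar.get returns the value depends on the default pillar top file exposing it to the minions):
```bash
# Refresh pillar and modules on all minions, then confirm the
# deepsea_minions target is visible where expected.
salt '*' saltutil.refresh_pillar
salt '*' saltutil.sync_all
salt '*' pillar.get deepsea_minions
```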
**SALT STAGES**
Stage 0 is successful with no errors but does not create the proposals folder.
salt:~ # salt-run state.orch ceph.stage.prep
deepsea_minions : valid
master_minion : valid
ceph_version : valid
[WARNING ] All minions are ready
salt_master:
Name: sync master - Function: salt.state - Result: Changed Started: - 18:44:20.440255 Duration: 949.98 ms
Name: salt-api - Function: salt.state - Result: Changed Started: - 18:44:21.390365 Duration: 3256.749 ms
Name: repo master - Function: salt.state - Result: Clean Started: - 18:44:24.647227 Duration: 351.0 ms
Name: metapackage master - Function: salt.state - Result: Clean Started: - 18:44:24.998333 Duration: 1127.063 ms
Name: prepare master - Function: salt.state - Result: Changed Started: - 18:44:26.125514 Duration: 4109.917 ms
Name: filequeue.remove - Function: salt.runner - Result: Changed Started: - 18:44:30.235610 Duration: 2071.199 ms
Name: restart master - Function: salt.state - Result: Clean Started: - 18:44:32.306972 Duration: 1006.268 ms
Name: filequeue.add - Function: salt.runner - Result: Changed Started: - 18:44:33.313369 Duration: 1352.98 ms
Name: minions.ready - Function: salt.runner - Result: Changed Started: - 18:44:34.666528 Duration: 1891.677 ms
Name: repo - Function: salt.state - Result: Clean Started: - 18:44:36.558363 Duration: 553.342 ms
Name: metapackage minions - Function: salt.state - Result: Clean Started: - 18:44:37.111825 Duration: 3993.733 ms
Name: common packages - Function: salt.state - Result: Clean Started: - 18:44:41.105706 Duration: 2434.079 ms
Name: sync - Function: salt.state - Result: Changed Started: - 18:44:43.539897 Duration: 1381.692 ms
Name: mines - Function: salt.state - Result: Clean Started: - 18:44:44.921708 Duration: 1657.019 ms
Name: updates - Function: salt.state - Result: Changed Started: - 18:44:46.578853 Duration: 11183.347 ms
Name: restart - Function: salt.state - Result: Clean Started: - 18:44:57.762346 Duration: 1553.957 ms
Name: mds restart noop - Function: test.nop - Result: Clean Started: - 18:44:59.316442 Duration: 0.348 ms
Summary for salt_master
-------------
Succeeded: 17 (changed=8)
Failed: 0
-------------
Total states run: 17
Total run time: 38.874 s
Before running Stage 1, the /srv/pillar/ceph/proposals directory does not exist.
salt:~ # ls /srv/pillar/ceph/proposals/
ls: cannot access '/srv/pillar/ceph/proposals/': No such file or directory
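For what it’s worth, as far as I can tell from the deployment guide, /srv/pillar/ceph/proposals is populated by Stage 1 (discovery) rather than Stage 0, so the next thing I plan to try is restarting the master-side services (as the salt-api error from my earlier message suggests) and re-running discovery - roughly:
```bash
# Restart the master-side services, re-run discovery, then look for the
# generated fragments (role-*, profile-*, etc.) under proposals.
systemctl restart salt-master salt-api
salt-run state.orch ceph.stage.discovery
ls /srv/pillar/ceph/proposals/
```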
That’s where I’m at – Googling..
~ Kevin
From: <deepsea-users-bounces at lists.suse.com> on behalf of Joel Zhou <joel.zhou at suse.com>
Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Date: Tuesday, September 18, 2018 at 11:34 PM
To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Subject: Re: [Deepsea-users] stage 1 errors on Azure
Hi Kevin,
My short answer is,
Step 1, before stage 0, check the salt-api service on your salt master node first.
```bash
zypper install -y salt-api
systemctl enable salt-api.service
systemctl start salt-api.service
```
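A rough way to verify the API actually came up and is reachable (just a sketch; if I remember correctly DeepSea configures salt-api on port 8000, so adjust if your config differs):
```bash
# Confirm the API service is up and listening (port 8000 assumed), and
# bounce the master as well if the orchestration still fails to authenticate.
systemctl status salt-api.service
ss -ltnp | grep ':8000'
systemctl restart salt-master salt-api
```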
Step 2, make sure the NTP service works correctly on all nodes, i.e. time is synchronized correctly everywhere.
Step 3, reboot all your nodes if acceptable, in case the kernel was updated at some point.
Step 4, then start over again from stage 0 to 5.
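For reference, the full run is just the stage orchestrations in order - a sketch assuming the standard DeepSea stage aliases:
```bash
# Stage 0 (prep) through Stage 4 (services); Stage 5 (removal) only
# removes roles/nodes, so it is not part of a normal deployment run.
salt-run state.orch ceph.stage.0   # == ceph.stage.prep
salt-run state.orch ceph.stage.1   # == ceph.stage.discovery
salt-run state.orch ceph.stage.2   # == ceph.stage.configure
salt-run state.orch ceph.stage.3   # == ceph.stage.deploy
salt-run state.orch ceph.stage.4   # == ceph.stage.services
```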
Basically, DeepSea is a bunch of Salt scripts, and Salt is based on Python 2 and/or Python 3.
I have no details about your whole running stack, so I assume SLES 12 SP3 + SES 5, which works fine and is supported.
More info would be helpful, and also your purpose: for example, practice on your own, or PoC/testing to meet a customer's demands.
Regards,
--
Joel Zhou 周维伟
Senior Storage Technologist, APJ
Mobile: +86 18514577601
Email: joel.zhou at suse.com
From: <deepsea-users-bounces at lists.suse.com> on behalf of Kevin Ayres <kevin.ayres at suse.com>
Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Date: Tuesday, September 18, 2018 at 4:49 PM
To: "deepsea-users at lists.suse.com" <deepsea-users at lists.suse.com>
Subject: [Deepsea-users] stage 1 errors on Azure
Hey guys, I can’t seem to get past stage 1. Stage 0 completes successfully (same output with the deepsea command). The master and minion services are running, and bidirectional host resolution is good. Keys are all accepted. From what I can determine, the default files are not created by stage 0 for some reason. Thoughts? What I’m seeing is that it fails to create /srv/pillar/ceph/proposals
I’m running through this doc line by line: https://www.suse.com/documentation/suse-enterprise-storage-5/singlehtml/book_storage_deployment/book_storage_deployment.html#deepsea.cli
~ Kevin
salt:~ # salt-run state.orch ceph.stage.discovery
salt-api : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
deepsea_minions : valid
master_minion : valid
ceph_version : valid
[ERROR ] No highstate or sls specified, no execution made
salt_master:
----------
ID: salt-api failed
Function: salt.state
Name: just.exit
Result: False
Comment: No highstate or sls specified, no execution made
Started: 22:30:53.628882
Duration: 0.647 ms
Changes:
Summary for salt_master
------------
Succeeded: 0
Failed: 1
------------
Total states run: 1
Total run time: 0.647 ms
salt:~ # !tail
tail -f /var/log/salt/master
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-igw/cluster/igw*.sls matched no files
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-openattic/cluster/salt.sls matched no files
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/global.yml matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/ceph/cluster.yml matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] cluster/*.sls matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] stack/default/ceph/minions/*.yml matched no files
2018-09-18 22:29:08,822 [salt.state ][ERROR ][8499] No highstate or sls specified, no execution made
2018-09-18 22:29:52,472 [salt.transport.ipc][ERROR ][5672] Exception occurred while handling stream: [Errno 0] Success
2018-09-18 22:29:56,797 [salt.state ][ERROR ][8759] No highstate or sls specified, no execution made
2018-09-18 22:30:53,629 [salt.state ][ERROR ][9272] No highstate or sls specified, no execution made
There’s also some issue with the salt-minion.service:
● salt-minion.service - The Salt Minion
Loaded: loaded (/usr/lib/systemd/system/salt-minion.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2018-09-18 22:46:54 UTC; 12s ago
Main PID: 11082 (salt-minion)
…
Sep 18 22:46:54 salt systemd[1]: Started The Salt Minion.
Sep 18 22:47:00 salt salt-minion[11082]: [ERROR ] Function cephimages.list in mine_functions failed to execute
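On that cephimages.list error, a quick sketch of how to see which mine functions each minion is configured to run and force a refresh, using stock Salt mine calls:
```bash
# Show which mine functions each minion will run and force a refresh;
# if cephimages.list is the only one failing, it may just be noise until
# the cluster and its images actually exist.
salt '*' mine.valid
salt '*' mine.update
```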