[Deepsea-users] stage 1 errors on Azure
Kevin Ayres
kevin.ayres at suse.com
Wed Sep 19 13:56:17 MDT 2018
Thanks Eric. Yes, I understand this but worded it poorly. I don't see any issues with NTP or DNS; something else is amiss.
Should DeepSea be installed after Salt, as outlined in the deployment doc, or before?
salt:~ # salt-run state.orch ceph.stage.discovery
salt-api : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
deepsea_minions : valid
master_minion : valid
ceph_version : valid
[ERROR ] No highstate or sls specified, no execution made
salt_master:
----------
          ID: salt-api failed
    Function: salt.state
        Name: just.exit
      Result: False
     Comment: No highstate or sls specified, no execution made
     Started: 19:38:41.962044
    Duration: 0.734 ms
     Changes:
Summary for salt_master
------------
Succeeded: 0
Failed: 1
------------
Total states run: 1
Total run time: 0.734 ms
salt:~ # tail -f /var/log/salt/master
2018-09-19 18:44:36,555 [salt.loaded.ext.runners.minions][WARNING ][15319] All minions are ready
2018-09-19 19:38:41,955 [salt.transport.ipc][ERROR ][1626] Exception occurred while handling stream: [Errno 0] Success
2018-09-19 19:38:41,962 [salt.state ][ERROR ][40826] No highstate or sls specified, no execution made
salt:~ # ls /srv/pillar/ceph/proposals
ls: cannot access '/srv/pillar/ceph/proposals': No such file or directory
salt:~ # ls /srv/pillar/ceph/
benchmarks deepsea_minions.sls deepsea_minions.sls.rpmsave init.sls master_minion.sls master_minion.sls.rpmsave stack
~ Kevin
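The salt-api error above carries its own first remedy. A minimal sketch of that step, assuming the stock SES 5 unit names (`salt-master`, `salt-api`, `salt-minion`); the `check_active` helper is purely illustrative:

```shell
# Restart the services the "Salt API is failing to authenticate" message
# points at, then confirm each unit reports "active" before re-running
# ceph.stage.discovery.
systemctl restart salt-master salt-api

# Illustrative helper: reduce `systemctl is-active` output to ok/FAILED.
check_active() {
    [ "$1" = "active" ] && echo "$2: ok" || echo "$2: FAILED"
}

for svc in salt-api salt-master salt-minion; do
    check_active "$(systemctl is-active "$svc")" "$svc"
done
```

If salt-api still fails after a restart, the `.rpmsave` files in the `ls` output above suggest that locally edited configs were replaced by a package operation, and may be worth diffing against their current counterparts.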
On 9/19/18, 12:37 PM, "deepsea-users-bounces at lists.suse.com on behalf of Eric Jackson" <ejackson at suse.com> wrote:
Hi Kevin,
Stage 0 only does the "preparation" part: that is, syncing Salt modules,
zypper updates, etc. Stage 1 is the "discovery" part that interrogates the
minions and then creates the role and storage fragments. If your salt-api
issue is resolved, Stage 1 should run relatively quickly.
Eric
On Wednesday, September 19, 2018 3:20:43 PM EDT Kevin Ayres wrote:
> Thanks Joel, yes, DNS and NTP are configured and behaving correctly. SP3/SES5
> from the current repo. The salt-api, salt-master, and salt-minion services are
> running (with one error). I'm walking through the Deployment Guide line by line
> with the same result, now on my second freshly built master node. Salt output
> is at the bottom of this message. Key point: after stage 0, the */proposals
> directory has NOT been created.
> Here's my build on a single flat network (Azure vNet 172.19.20.0/24):
> Root ssh enabled and key-based login from master to all nodes as root. All
> nodes rebooted before the salt stages. All nodes use an identical image and are
> fully patched (CPE_NAME="cpe:/o:suse:sles:12:sp3"), firewall off, etc. - the
> Azure instance defaults.
> Salt (and all nodes):~ # zypper lr -E
> Repository priorities are without effect. All enabled repositories share the same priority.
>  #  | Alias                                                              | Name                              | Enabled | GPG Check | Refresh
> ----+--------------------------------------------------------------------+-----------------------------------+---------+-----------+--------
>  3  | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Pool    | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
>  5  | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Updates | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
>  8  | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Pool         | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
> 10  | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Updates      | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes
> **DNS** all nodes resolve bidirectionally. Azure takes care of DNS, but I've also
> updated the hosts files.
> salt:~ # hostname
> salt
> salt:~ # ping salt
> PING salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1) 56(84) bytes of data.
> 64 bytes from salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1): icmp_seq=1 ttl=64 time=0.030 ms
> 104.211.27.224 Outside NAT to 172.19.20.10 salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net salt
> 172.19.20.12 mon1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon1
> 172.19.20.13 mon2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon2
> 172.19.20.14 mon3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon3
> 172.19.20.15 osd1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd1
> 172.19.20.16 osd2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd2
> 172.19.20.17 osd3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd3
> 172.19.20.18 igw1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net ogw1
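One detail worth flagging in the ping output above: `salt` resolves to 127.0.0.1, i.e. the master's own FQDN is mapped to loopback in /etc/hosts (a common cloud image default). That can be harmless for Salt itself, but Ceph daemons need a routable address, so such entries are worth hunting down. A small sketch; the hostnames and the shortened domain in the sample are illustrative only:

```shell
# Print every name /etc/hosts maps to 127.0.0.1 other than localhost; on
# this cluster it would catch "salt" resolving to loopback.
flag_loopback() {
    awk '$1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i !~ /localhost/) print $i }'
}

# Sample hosts entries (domain shortened for illustration):
cat <<'EOF' | flag_loopback
127.0.0.1    localhost
127.0.0.1    salt.internal.cloudapp.net salt
172.19.20.12 mon1.internal.cloudapp.net mon1
EOF
```

On a live node this would be `flag_loopback < /etc/hosts`.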
> **NTP** all nodes sync at a 5-minute interval to the same Stratum 1 server in
> the same GEO as the Azure AZ (US East), navobs1.gatech.edu, as shown:
> bash-3.2$ pssh -h pssh-hosts -l sesuser -i sudo ntpq -p
> [1] 11:15:27 [SUCCESS] mon3
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   19   64    1   15.596   -4.863   0.333
> [2] 11:15:27 [SUCCESS] salt
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u   42   64    1   17.063   -6.702   0.000
> [3] 11:15:27 [SUCCESS] igw1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   18   64    1   17.394  -27.874   7.663
> [4] 11:15:27 [SUCCESS] osd1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   21   64    1   16.962   -3.755   0.813
> [5] 11:15:27 [SUCCESS] osd2
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   22   64    1   15.832   -4.709   3.062
> [6] 11:15:27 [SUCCESS] osd3
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   26   64    1   15.877   -3.252  19.131
> [7] 11:15:27 [SUCCESS] mon1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u    2   64    1   16.120   -4.263   0.000
> [8] 11:15:27 [SUCCESS] mon2
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u    2   64    1   16.108   -7.713   0.959
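A side note on reading the ntpq output above: the leading `*` tally code marks the peer ntpd has selected as its system sync source. The salt, mon1, and mon2 lines lack the `*` (and reach is still 1), so those hosts had not actually finished synchronizing when this was captured. A tiny illustrative check (`check_sync` is a made-up helper):

```shell
# Read `ntpq -p` peer lines on stdin and report whether any peer carries
# the "*" (selected system peer) tally code in column one.
check_sync() {
    if grep -q '^\*'; then echo synced; else echo not-synced; fi
}

# Two sample lines from the output above:
echo '*navobs1.gatech. .GPS. 1 u 19 64 1 15.596 -4.863 0.333' | check_sync  # mon3 -> synced
echo ' navobs1.gatech. .GPS. 1 u 42 64 1 17.063 -6.702 0.000' | check_sync  # salt -> not-synced
```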
>
> **SALT**
> salt:~ # systemctl status salt-api salt-master salt-minion | grep 'active (running)'
>    Active: active (running) since Wed 2018-09-19 18:14:39 UTC; 13min ago
>    Active: active (running) since Wed 2018-09-19 18:14:42 UTC; 13min ago
>    Active: active (running) since Wed 2018-09-19 18:14:41 UTC; 13min ago
> salt:~ # systemctl status salt-api salt-master salt-minion | grep ERROR
> Sep 19 18:14:49 salt salt-minion[1413]: [ERROR ] Function cephimages.list in mine_functions failed to execute
> salt:~ # salt-key --list-all
> Accepted Keys:
> igw1
> mon1
> mon2
> mon3
> osd1
> osd2
> osd3
> salt
> Denied Keys:
> Unaccepted Keys:
> Rejected Keys:
>
> salt:~ # salt '*' test.ping
> salt:
> True
> osd2:
> True
> mon3:
> True
> osd3:
> True
> osd1:
> True
> mon2:
> True
> igw1:
> True
> mon1:
> True
>
> salt:~ # cat /srv/pillar/ceph/master_minion.sls
> master_minion: salt
>
> salt:~ # cat /srv/pillar/ceph/deepsea_minions.sls
> ...
> # Choose all minions
> deepsea_minions: '*'
> ...
>
> **SALT STAGES**
> Stage 0 is successful with no errors but does not create the proposals
> folder.
> salt:~ # salt-run state.orch ceph.stage.prep
> deepsea_minions : valid
> master_minion : valid
> ceph_version : valid
> [WARNING ] All minions are ready
> salt_master:
>   Name: sync master - Function: salt.state - Result: Changed Started: - 18:44:20.440255 Duration: 949.98 ms
>   Name: salt-api - Function: salt.state - Result: Changed Started: - 18:44:21.390365 Duration: 3256.749 ms
>   Name: repo master - Function: salt.state - Result: Clean Started: - 18:44:24.647227 Duration: 351.0 ms
>   Name: metapackage master - Function: salt.state - Result: Clean Started: - 18:44:24.998333 Duration: 1127.063 ms
>   Name: prepare master - Function: salt.state - Result: Changed Started: - 18:44:26.125514 Duration: 4109.917 ms
>   Name: filequeue.remove - Function: salt.runner - Result: Changed Started: - 18:44:30.235610 Duration: 2071.199 ms
>   Name: restart master - Function: salt.state - Result: Clean Started: - 18:44:32.306972 Duration: 1006.268 ms
>   Name: filequeue.add - Function: salt.runner - Result: Changed Started: - 18:44:33.313369 Duration: 1352.98 ms
>   Name: minions.ready - Function: salt.runner - Result: Changed Started: - 18:44:34.666528 Duration: 1891.677 ms
>   Name: repo - Function: salt.state - Result: Clean Started: - 18:44:36.558363 Duration: 553.342 ms
>   Name: metapackage minions - Function: salt.state - Result: Clean Started: - 18:44:37.111825 Duration: 3993.733 ms
>   Name: common packages - Function: salt.state - Result: Clean Started: - 18:44:41.105706 Duration: 2434.079 ms
>   Name: sync - Function: salt.state - Result: Changed Started: - 18:44:43.539897 Duration: 1381.692 ms
>   Name: mines - Function: salt.state - Result: Clean Started: - 18:44:44.921708 Duration: 1657.019 ms
>   Name: updates - Function: salt.state - Result: Changed Started: - 18:44:46.578853 Duration: 11183.347 ms
>   Name: restart - Function: salt.state - Result: Clean Started: - 18:44:57.762346 Duration: 1553.957 ms
>   Name: mds restart noop - Function: test.nop - Result: Clean Started: - 18:44:59.316442 Duration: 0.348 ms
> Summary for salt_master
> -------------
> Succeeded: 17 (changed=8)
> Failed: 0
> -------------
> Total states run: 17
> Total run time: 38.874 s
>
>
> Before running Stage 1, the /srv/pillar/ceph/proposals directory does not exist:
> salt:~ # ls /srv/pillar/ceph/proposals/
> ls: cannot access '/srv/pillar/ceph/proposals/': No such file or directory
> That's where I'm at, still Googling.
>
> ~ Kevin
>
> From: <deepsea-users-bounces at lists.suse.com> on behalf of Joel Zhou <joel.zhou at suse.com>
> Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
> Date: Tuesday, September 18, 2018 at 11:34 PM
> To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
> Subject: Re: [Deepsea-users] stage 1 errors on Azure
>
> Hi Kevin,
>
> My short answer is:
>
> Step 1, before stage 0, check the salt-api service on the salt-master node first:
> ```bash
> zypper install -y salt-api
> systemctl enable salt-api.service
> systemctl start salt-api.service
> ```
> Step 2, make sure the NTP service works correctly on all nodes, which means time is
> synchronized correctly on all nodes.
> Step 3, reboot all your nodes, if acceptable, in case the kernel was updated somehow.
> Step 4, then start over again from stage 0 through 5.
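Before starting over from stage 0 (step 4 above), it is also worth confirming that every minion still answers. A throwaway sketch that counts `True` responses in `salt '*' test.ping`-style output (the sample data below is made up; the cluster in this thread expects 8 minions):

```shell
# Count "True" responses in test.ping output; compare the result with the
# number of accepted keys before re-running the stages.
count_up() { grep -c '^[[:space:]]*True$'; }

cat <<'EOF' | count_up
salt:
    True
mon1:
    True
igw1:
    True
EOF
```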
>
> Basically, DeepSea is a bunch of Salt scripts, and Salt is based on Python 2
> and/or Python 3. I have no clues about your whole running stack, so I assume
> SLES 12 SP3 + SES 5, which works fine and is supported. More info would be
> helpful, and also your purpose, such as practice on your own, or PoC/testing
> to meet a customer's demands.
> Regards,
>
> --
> Joel Zhou 周维伟
> Senior Storage Technologist, APJ
>
> Mobile: +86 18514577601
> Email: joel.zhou at suse.com
>
> From: <deepsea-users-bounces at lists.suse.com> on behalf of Kevin Ayres <kevin.ayres at suse.com>
> Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
> Date: Tuesday, September 18, 2018 at 4:49 PM
> To: "deepsea-users at lists.suse.com" <deepsea-users at lists.suse.com>
> Subject: [Deepsea-users] stage 1 errors on Azure
>
> Hey guys, I can't seem to get past stage 1. Stage 0 completes successfully.
> Same output with the deepsea command. The master and minion services are
> running and bidirectional host resolution is good. Keys are all accepted.
> From what I can determine, the default files are not created by stage 0 for
> some reason. Thoughts? What I'm seeing is that it fails to create
> /srv/pillar/ceph/proposals.
> I'm running through this doc line by line:
> https://www.suse.com/documentation/suse-enterprise-storage-5/singlehtml/book_storage_deployment/book_storage_deployment.html#deepsea.cli
> ~ Kevin
>
>
> salt:~ # salt-run state.orch ceph.stage.discovery
> salt-api : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
> deepsea_minions : valid
> master_minion : valid
> ceph_version : valid
> [ERROR ] No highstate or sls specified, no execution made
> salt_master:
> ----------
>           ID: salt-api failed
>     Function: salt.state
>         Name: just.exit
>       Result: False
>      Comment: No highstate or sls specified, no execution made
>      Started: 22:30:53.628882
>     Duration: 0.647 ms
>      Changes:
>
> Summary for salt_master
> ------------
> Succeeded: 0
> Failed: 1
> ------------
> Total states run: 1
> Total run time: 0.647 ms
>
> salt:~ # !tail
> tail -f /var/log/salt/master
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-igw/cluster/igw*.sls matched no files
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-openattic/cluster/salt.sls matched no files
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/global.yml matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/ceph/cluster.yml matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] cluster/*.sls matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] stack/default/ceph/minions/*.yml matched no files
> 2018-09-18 22:29:08,822 [salt.state       ][ERROR   ][8499] No highstate or sls specified, no execution made
> 2018-09-18 22:29:52,472 [salt.transport.ipc][ERROR  ][5672] Exception occurred while handling stream: [Errno 0] Success
> 2018-09-18 22:29:56,797 [salt.state       ][ERROR   ][8759] No highstate or sls specified, no execution made
> 2018-09-18 22:30:53,629 [salt.state       ][ERROR   ][9272] No highstate or sls specified, no execution made
>
> There's also some issue with the salt-minion.service:
> ● salt-minion.service - The Salt Minion
>    Loaded: loaded (/usr/lib/systemd/system/salt-minion.service; enabled; vendor preset: disabled)
>    Active: active (running) since Tue 2018-09-18 22:46:54 UTC; 12s ago
>  Main PID: 11082 (salt-minion)
> ...
> Sep 18 22:46:54 salt systemd[1]: Started The Salt Minion.
> Sep 18 22:47:00 salt salt-minion[11082]: [ERROR ] Function cephimages.list in mine_functions failed to execute
>
More information about the Deepsea-users mailing list