[Deepsea-users] stage 1 errors on Azure

Kevin Ayres kevin.ayres at suse.com
Wed Sep 19 13:20:43 MDT 2018


Thanks Joel. Yes, DNS and NTP are configured and behaving correctly. SP3/SES5 from the current repo. The salt-api, salt-master, and salt-minion services are all running (with one error).
I’m walking through the Deployment Guide line by line with the same result, now on my second freshly built master node. Salt output is at the bottom of this message. Key point: after stage 0, the */proposals directory has NOT been created.

Here’s my build, on a single flat network (Azure vNet 172.19.20.0/24):
Root SSH enabled, with key-based login from the master to all nodes as root. All nodes rebooted before the Salt stages.
All nodes use an identical image and are fully patched (CPE_NAME="cpe:/o:suse:sles:12:sp3"), firewall off, etc. - the Azure instance defaults.

Salt (and all nodes):~ # zypper lr -E
Repository priorities are without effect. All enabled repositories share the same priority.
#  | Alias                                                              | Name                              | Enabled | GPG Check | Refresh
---+--------------------------------------------------------------------+-----------------------------------+---------+-----------+--------
  3 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Pool    | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
  5 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Updates | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
  8 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Pool         | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
 10 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Updates      | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes

**DNS** all nodes resolve bidirectionally. Azure handles DNS, but I’ve also updated the hosts files.
salt:~ # hostname
salt
salt:~ # ping salt
PING salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1) 56(84) bytes of data.
64 bytes from salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1): icmp_seq=1 ttl=64 time=0.030 ms

104.211.27.224 Outside NAT to 172.19.20.10    salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net salt
172.19.20.12    mon1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon1
172.19.20.13    mon2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon2
172.19.20.14      mon3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon3
172.19.20.15      osd1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd1
172.19.20.16      osd2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd2
172.19.20.17      osd3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd3
172.19.20.18      igw1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net ogw1

**NTP** all nodes use a 5-minute sync interval to the same Stratum 1 server in the same geo as the Azure AZ (US East), navobs1.gatech.edu, as shown:
bash-3.2$ pssh -h pssh-hosts -l sesuser -i sudo ntpq -p
[1] 11:15:27 [SUCCESS] mon3
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*navobs1.gatech. .GPS.            1 u   19   64    1   15.596   -4.863   0.333
[2] 11:15:27 [SUCCESS] salt
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
navobs1.gatech. .GPS.            1 u   42   64    1   17.063   -6.702   0.000
[3] 11:15:27 [SUCCESS] igw1
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*navobs1.gatech. .GPS.            1 u   18   64    1   17.394  -27.874   7.663
[4] 11:15:27 [SUCCESS] osd1
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*navobs1.gatech. .GPS.            1 u   21   64    1   16.962   -3.755   0.813
[5] 11:15:27 [SUCCESS] osd2
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*navobs1.gatech. .GPS.            1 u   22   64    1   15.832   -4.709   3.062
[6] 11:15:27 [SUCCESS] osd3
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*navobs1.gatech. .GPS.            1 u   26   64    1   15.877   -3.252  19.131
[7] 11:15:27 [SUCCESS] mon1
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
navobs1.gatech. .GPS.            1 u    2   64    1   16.120   -4.263   0.000
[8] 11:15:27 [SUCCESS] mon2
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
navobs1.gatech. .GPS.            1 u    2   64    1   16.108   -7.713   0.959


**SALT**
salt:~ # systemctl status salt-api salt-master salt-minion |grep 'active (running)'
   Active: active (running) since Wed 2018-09-19 18:14:39 UTC; 13min ago
   Active: active (running) since Wed 2018-09-19 18:14:42 UTC; 13min ago
   Active: active (running) since Wed 2018-09-19 18:14:41 UTC; 13min ago

salt:~ # systemctl status salt-api salt-master salt-minion |grep ERROR
   Sep 19 18:14:49 salt salt-minion[1413]: [ERROR   ] Function cephimages.list in mine_functions failed to execute

salt:~ # salt-key --list-all
   Accepted Keys:
   igw1
   mon1
   mon2
   mon3
   osd1
   osd2
   osd3
   salt
   Denied Keys:
   Unaccepted Keys:
   Rejected Keys:

salt:~ # salt '*' test.ping
salt:
    True
osd2:
    True
mon3:
    True
osd3:
    True
osd1:
    True
mon2:
    True
igw1:
    True
mon1:
    True

salt:~ # cat /srv/pillar/ceph/master_minion.sls
master_minion: salt

salt:~ # cat /srv/pillar/ceph/deepsea_minions.sls
...
# Choose all minions
deepsea_minions: '*'
...
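As a cross-check that those two pillar files are actually being rendered, the master can query the keys back from the minions. A minimal sketch (command transcript only; assumes a working master and the minion keys shown above):

```shell
# Both keys should come back non-empty on every targeted minion if the
# .sls files above are referenced by the pillar top file and render cleanly.
salt '*' pillar.get deepsea_minions
salt 'salt' pillar.get master_minion
```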

**SALT STAGES**
Stage 0 completes successfully with no errors, but does not create the proposals folder.

salt:~ # salt-run state.orch ceph.stage.prep
    deepsea_minions          : valid
    master_minion            : valid
    ceph_version             : valid
    [WARNING ] All minions are ready
    salt_master:
      Name: sync master - Function: salt.state - Result: Changed Started: - 18:44:20.440255 Duration: 949.98 ms
      Name: salt-api - Function: salt.state - Result: Changed Started: - 18:44:21.390365 Duration: 3256.749 ms
      Name: repo master - Function: salt.state - Result: Clean Started: - 18:44:24.647227 Duration: 351.0 ms
      Name: metapackage master - Function: salt.state - Result: Clean Started: - 18:44:24.998333 Duration: 1127.063 ms
      Name: prepare master - Function: salt.state - Result: Changed Started: - 18:44:26.125514 Duration: 4109.917 ms
      Name: filequeue.remove - Function: salt.runner - Result: Changed Started: - 18:44:30.235610 Duration: 2071.199 ms
      Name: restart master - Function: salt.state - Result: Clean Started: - 18:44:32.306972 Duration: 1006.268 ms
      Name: filequeue.add - Function: salt.runner - Result: Changed Started: - 18:44:33.313369 Duration: 1352.98 ms
      Name: minions.ready - Function: salt.runner - Result: Changed Started: - 18:44:34.666528 Duration: 1891.677 ms
      Name: repo - Function: salt.state - Result: Clean Started: - 18:44:36.558363 Duration: 553.342 ms
      Name: metapackage minions - Function: salt.state - Result: Clean Started: - 18:44:37.111825 Duration: 3993.733 ms
      Name: common packages - Function: salt.state - Result: Clean Started: - 18:44:41.105706 Duration: 2434.079 ms
      Name: sync - Function: salt.state - Result: Changed Started: - 18:44:43.539897 Duration: 1381.692 ms
      Name: mines - Function: salt.state - Result: Clean Started: - 18:44:44.921708 Duration: 1657.019 ms
      Name: updates - Function: salt.state - Result: Changed Started: - 18:44:46.578853 Duration: 11183.347 ms
      Name: restart - Function: salt.state - Result: Clean Started: - 18:44:57.762346 Duration: 1553.957 ms
      Name: mds restart noop - Function: test.nop - Result: Clean Started: - 18:44:59.316442 Duration: 0.348 ms

    Summary for salt_master
    -------------
    Succeeded: 17 (changed=8)
    Failed:     0
    -------------
    Total states run:     17
    Total run time:   38.874 s


Before running Stage 1, the /srv/pillar/ceph/proposals directory does not exist.
salt:~ # ls /srv/pillar/ceph/proposals/
    ls: cannot access '/srv/pillar/ceph/proposals/': No such file or directory
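The same check can be scripted so that later stages are never attempted against an empty pillar. A minimal sketch, assuming the default DeepSea pillar path (per the SES 5 deployment guide, stage 1/discovery is what populates this directory):

```shell
#!/bin/sh
# Guard: stage 1 (ceph.stage.discovery) is expected to populate this
# directory; if it is missing or empty, re-run discovery, don't proceed.
PROPOSALS=/srv/pillar/ceph/proposals
if [ -d "$PROPOSALS" ] && [ -n "$(ls -A "$PROPOSALS" 2>/dev/null)" ]; then
    echo "proposals populated"
else
    echo "proposals missing or empty"
fi
```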

That’s where I’m at for now; still googling.

~ Kevin

From: <deepsea-users-bounces at lists.suse.com> on behalf of Joel Zhou <joel.zhou at suse.com>
Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Date: Tuesday, September 18, 2018 at 11:34 PM
To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Subject: Re: [Deepsea-users] stage 1 errors on Azure

Hi Kevin,

My short answer is,

Step 1, before stage 0, check the salt-api service on the salt-master node first.
```bash
zypper install -y salt-api
systemctl enable salt-api.service
systemctl start salt-api.service
```
Step 2, make sure the NTP service works correctly on all nodes, i.e. time is synchronized on every node.
Step 3, reboot all your nodes if that is acceptable, in case a kernel update is pending.
Step 4, then start over again from stage 0 through stage 5.
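The four steps above can be sketched as one sequence (a sketch only; it assumes DeepSea's numeric stage aliases ceph.stage.0 through ceph.stage.5, and the reboot in step 3 is left to the operator):

```shell
# Step 1: salt-api installed, enabled, and started on the master.
zypper install -y salt-api
systemctl enable --now salt-api.service

# Step 2: spot-check time sync; the offset column should be small and
# one peer should carry the '*' (selected) tally mark.
ntpq -p

# Step 4: after rebooting (step 3), re-run the stages in order,
# stopping at the first failure.
for n in 0 1 2 3 4 5; do
    salt-run state.orch ceph.stage.$n || break
done
```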

Basically, DeepSea is a collection of Salt scripts, and Salt is based on Python 2 and/or Python 3.
I have no details about your running stack, so I'll assume SLES 12 SP3 + SES 5, which works fine and is supported.
More info would be helpful, as well as your purpose: practice on your own, or a PoC/test to meet a customer's demands?

Regards,

--
Joel Zhou 周维伟
Senior Storage Technologist, APJ

Mobile: +86 18514577601
Email: joel.zhou at suse.com

From: <deepsea-users-bounces at lists.suse.com> on behalf of Kevin Ayres <kevin.ayres at suse.com>
Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
Date: Tuesday, September 18, 2018 at 4:49 PM
To: "deepsea-users at lists.suse.com" <deepsea-users at lists.suse.com>
Subject: [Deepsea-users] stage 1 errors on Azure

Hey guys, I can’t seem to get past stage 1. Stage 0 completes successfully, and I get the same output with the deepsea CLI. The master and minion services are running, and bidirectional host resolution is good. Keys are all accepted. From what I can determine, the default files are not created by stage 0 for some reason. Thoughts? What I’m seeing is that it fails to create /srv/pillar/ceph/proposals.

I’m running through this doc line by line: https://www.suse.com/documentation/suse-enterprise-storage-5/singlehtml/book_storage_deployment/book_storage_deployment.html#deepsea.cli

~ Kevin


salt:~ # salt-run state.orch ceph.stage.discovery
    salt-api                 : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
    deepsea_minions          : valid
    master_minion            : valid
    ceph_version             : valid
    [ERROR   ] No highstate or sls specified, no execution made
    salt_master:
    ----------
              ID: salt-api failed
        Function: salt.state
            Name: just.exit
          Result: False
         Comment: No highstate or sls specified, no execution made
         Started: 22:30:53.628882
        Duration: 0.647 ms
         Changes:

    Summary for salt_master
    ------------
    Succeeded: 0
    Failed:    1
    ------------
    Total states run:     1
    Total run time:   0.647 ms

salt:~ # !tail
tail -f /var/log/salt/master
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-igw/cluster/igw*.sls matched no files
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-openattic/cluster/salt.sls matched no files
2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/global.yml matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/ceph/cluster.yml matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] cluster/*.sls matched no files
2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] stack/default/ceph/minions/*.yml matched no files
2018-09-18 22:29:08,822 [salt.state       ][ERROR   ][8499] No highstate or sls specified, no execution made
2018-09-18 22:29:52,472 [salt.transport.ipc][ERROR   ][5672] Exception occurred while handling stream: [Errno 0] Success
2018-09-18 22:29:56,797 [salt.state       ][ERROR   ][8759] No highstate or sls specified, no execution made
2018-09-18 22:30:53,629 [salt.state       ][ERROR   ][9272] No highstate or sls specified, no execution made


There’s also some issue with the salt-minion.service:
● salt-minion.service - The Salt Minion
   Loaded: loaded (/usr/lib/systemd/system/salt-minion.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-09-18 22:46:54 UTC; 12s ago
 Main PID: 11082 (salt-minion)
...
Sep 18 22:46:54 salt systemd[1]: Started The Salt Minion.
Sep 18 22:47:00 salt salt-minion[11082]: [ERROR   ] Function cephimages.list in mine_functions failed to execute

