[Deepsea-users] stage 1 errors on Azure

Kevin Ayres kevin.ayres at suse.com
Wed Sep 19 13:56:17 MDT 2018


Thanks Eric. Yes, I understand that; I just worded it poorly. I don't see any issues with NTP or DNS, so something else is amiss.
Should DeepSea be installed after Salt, as outlined in the deployment doc, or before?

salt:~ # salt-run state.orch ceph.stage.discovery
salt-api                 : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
deepsea_minions          : valid
master_minion            : valid
ceph_version             : valid
[ERROR   ] No highstate or sls specified, no execution made
salt_master:
----------
          ID: salt-api failed
    Function: salt.state
        Name: just.exit
      Result: False
     Comment: No highstate or sls specified, no execution made
     Started: 19:38:41.962044
    Duration: 0.734 ms
     Changes:   

Summary for salt_master
------------
Succeeded: 0
Failed:    1
------------
Total states run:     1
Total run time:   0.734 ms
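
The pre-flight lines at the top of that output can be checked mechanically. As a minimal sketch, assuming the output has been saved to a file and that each check is printed as `<name> : <status>` starting in column 0 (as it is above), the following prints the checks that are not `valid`. The function name `failed_checks` and the file name in the usage comment are made up for illustration; neither is part of DeepSea or Salt.

```shell
#!/bin/sh
# Hypothetical helper (not part of DeepSea): read saved salt-run output on
# stdin and print the name of every pre-flight check whose status is not
# exactly "valid". Only lines shaped like "<name> : <status>" are considered.
failed_checks() {
    awk '/^[a-z_-]+ +: / {
        status = $0
        sub(/^[^:]*: */, "", status)   # drop the "<name> : " prefix
        sub(/ +$/, "", status)         # drop trailing blanks
        if (status != "valid") print $1
    }'
}

# Example (hypothetical file name): failed_checks < stage1-output.txt
```

Run against the output above, this should print only `salt-api`, which matches the one state that failed.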


salt:~ # tail -f /var/log/salt/master
2018-09-19 18:44:36,555 [salt.loaded.ext.runners.minions][WARNING ][15319] All minions are ready
2018-09-19 19:38:41,955 [salt.transport.ipc][ERROR   ][1626] Exception occurred while handling stream: [Errno 0] Success
2018-09-19 19:38:41,962 [salt.state       ][ERROR   ][40826] No highstate or sls specified, no execution made
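
To separate the errors from the warnings in that log, a small filter may help. This is only a sketch, assuming the `[logger][LEVEL   ][pid]` layout seen in the lines above; `log_level_lines` is a hypothetical name, not a Salt tool.

```shell
#!/bin/sh
# Hypothetical helper: read a Salt log on stdin and print only the lines
# logged at the given level (for example ERROR or WARNING).
# Assumes the "[logger][LEVEL   ][pid]" layout seen in the excerpt above.
log_level_lines() {
    level="$1"
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E "\]\[${level} *\]\[" || true
}

# Example: log_level_lines ERROR < /var/log/salt/master
```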

salt:~ # ls /srv/pillar/ceph/proposals
ls: cannot access '/srv/pillar/ceph/proposals': No such file or directory

salt:~ # ls /srv/pillar/ceph/
benchmarks  deepsea_minions.sls  deepsea_minions.sls.rpmsave  init.sls	master_minion.sls  master_minion.sls.rpmsave  stack
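
The same directory check can be wrapped so it is easy to repeat after each stage run. A minimal sketch, assuming the default pillar root `/srv/pillar/ceph` used in this thread; `proposals_status` is a hypothetical helper name. Going by Eric's description below, the directory would be expected only once Stage 1 has run successfully.

```shell
#!/bin/sh
# Hypothetical helper: report whether the proposals directory exists under a
# pillar root (default: /srv/pillar/ceph, the path used in this thread).
# Returns 0 when the directory is present and 1 when it is missing.
proposals_status() {
    pillar_root="${1:-/srv/pillar/ceph}"
    if [ -d "$pillar_root/proposals" ]; then
        echo "present: $pillar_root/proposals"
    else
        echo "missing: $pillar_root/proposals"
        return 1
    fi
}

# Example: proposals_status || echo "no proposals yet"
```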


~ Kevin

On 9/19/18, 12:37 PM, "deepsea-users-bounces at lists.suse.com on behalf of Eric Jackson" <deepsea-users-bounces at lists.suse.com on behalf of ejackson at suse.com> wrote:

    Hi Kevin,
      Stage 0 only does the "preparation" part.  That is, sync'ing salt modules, 
    zypper updates, etc.  Stage 1 is the "discovery" part that interrogates the 
    minions and then creates the roles and storage fragments.  If your salt-api 
    issue is resolved, Stage 1 should run relatively quickly.
    
    Eric
    
    On Wednesday, September 19, 2018 3:20:43 PM EDT Kevin Ayres wrote:
    > Thanks Joel, yes, DNS and NTP are configured and behaving correctly. SP3/SES5
    > from current repo. The salt-api, salt-master and salt-minion services are
    > running (with one error).
    >
    > I’m walking through the Deployment Guide line by line with the same result,
    > now on my second freshly built master node. Salt output is at the bottom of
    > this message. Key: after Stage 0, the */proposals directory has NOT been
    > created.
    >
    > Here’s my build on a single flat network (Azure vNet 172.19.20.0/24):
    > Root SSH is enabled, with key-based login from the master to all nodes as
    > root. All nodes were rebooted before the Salt stages.
    >
    > All nodes use an identical, fully patched image
    > (CPE_NAME="cpe:/o:suse:sles:12:sp3"), firewall off, etc. - the Azure
    > instance defaults.
    >
    > Salt (and all nodes):~ # zypper lr -E
    > Repository priorities are without effect. All enabled repositories share the
    > same priority.
    >
    > #  | Alias                                                              | Name                              | Enabled | GPG Check | Refresh
    > ---+--------------------------------------------------------------------+-----------------------------------+---------+-----------+--------
    >  3 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Pool    | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
    >  5 | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Updates | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
    >  8 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Pool         | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
    > 10 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Updates      | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes
    >
    > **DNS** all nodes resolve bidirectionally. Azure takes care of DNS, but I’ve
    > also updated the hosts files.
    >
    > salt:~ # hostname
    > salt
    > salt:~ # ping salt
    > PING salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1) 56(84) bytes of data.
    > 64 bytes from salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1): icmp_seq=1 ttl=64 time=0.030 ms
    >
    > 104.211.27.224 Outside NAT to 172.19.20.10    salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net salt
    > 172.19.20.12    mon1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon1
    > 172.19.20.13    mon2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon2
    > 172.19.20.14    mon3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon3
    > 172.19.20.15    osd1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd1
    > 172.19.20.16    osd2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd2
    > 172.19.20.17    osd3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd3
    > 172.19.20.18    igw1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net ogw1
    >
    > **NTP** all nodes use a 5-minute sync interval against the same Stratum 1
    > server in the same GEO as the Azure AZ (US East), navobs1.gatech.edu, as shown:
    >
    > bash-3.2$ pssh -h pssh-hosts -l sesuser -i sudo ntpq -p
    > [1] 11:15:27 [SUCCESS] mon3
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *navobs1.gatech. .GPS.            1 u   19   64    1   15.596   -4.863   0.333
    > [2] 11:15:27 [SUCCESS] salt
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    >  navobs1.gatech. .GPS.            1 u   42   64    1   17.063   -6.702   0.000
    > [3] 11:15:27 [SUCCESS] igw1
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *navobs1.gatech. .GPS.            1 u   18   64    1   17.394  -27.874   7.663
    > [4] 11:15:27 [SUCCESS] osd1
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *navobs1.gatech. .GPS.            1 u   21   64    1   16.962   -3.755   0.813
    > [5] 11:15:27 [SUCCESS] osd2
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *navobs1.gatech. .GPS.            1 u   22   64    1   15.832   -4.709   3.062
    > [6] 11:15:27 [SUCCESS] osd3
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    > *navobs1.gatech. .GPS.            1 u   26   64    1   15.877   -3.252  19.131
    > [7] 11:15:27 [SUCCESS] mon1
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    >  navobs1.gatech. .GPS.            1 u    2   64    1   16.120   -4.263   0.000
    > [8] 11:15:27 [SUCCESS] mon2
    >      remote           refid      st t when poll reach   delay   offset  jitter
    > ==============================================================================
    >  navobs1.gatech. .GPS.            1 u    2   64    1   16.108   -7.713   0.959
    > 
    > **SALT**
    > salt:~ # systemctl status salt-api salt-master salt-minion |grep 'active (running)'
    >    Active: active (running) since Wed 2018-09-19 18:14:39 UTC; 13min ago
    >    Active: active (running) since Wed 2018-09-19 18:14:42 UTC; 13min ago
    >    Active: active (running) since Wed 2018-09-19 18:14:41 UTC; 13min ago
    > salt:~ # systemctl status salt-api salt-master salt-minion |grep ERROR
    >    Sep 19 18:14:49 salt salt-minion[1413]: [ERROR   ] Function cephimages.list in mine_functions failed to execute
    >
    > salt:~ # salt-key --list-all
    >    Accepted Keys:
    >    igw1
    >    mon1
    >    mon2
    >    mon3
    >    osd1
    >    osd2
    >    osd3
    >    salt
    >    Denied Keys:
    >    Unaccepted Keys:
    >    Rejected Keys:
    > 
    > salt:~ # salt '*' test.ping
    > salt:
    >     True
    > osd2:
    >     True
    > mon3:
    >     True
    > osd3:
    >     True
    > osd1:
    >     True
    > mon2:
    >     True
    > igw1:
    >     True
    > mon1:
    >     True
    > 
    > salt:~ # cat /srv/pillar/ceph/master_minion.sls
    > master_minion: salt
    > 
    > salt:~ # cat /srv/pillar/ceph/deepsea_minions.sls
    > ...
    > # Choose all minions
    > deepsea_minions: '*'
    > ...
    > 
    > **SALT STAGES**
    > Stage 0 is successful with no errors but does not create the proposals
    > folder.
    >
    > salt:~ # salt-run state.orch ceph.stage.prep
    >     deepsea_minions          : valid
    >     master_minion            : valid
    >     ceph_version             : valid
    >     [WARNING ] All minions are ready
    >     salt_master:
    >       Name: sync master - Function: salt.state - Result: Changed Started: - 18:44:20.440255 Duration: 949.98 ms
    >       Name: salt-api - Function: salt.state - Result: Changed Started: - 18:44:21.390365 Duration: 3256.749 ms
    >       Name: repo master - Function: salt.state - Result: Clean Started: - 18:44:24.647227 Duration: 351.0 ms
    >       Name: metapackage master - Function: salt.state - Result: Clean Started: - 18:44:24.998333 Duration: 1127.063 ms
    >       Name: prepare master - Function: salt.state - Result: Changed Started: - 18:44:26.125514 Duration: 4109.917 ms
    >       Name: filequeue.remove - Function: salt.runner - Result: Changed Started: - 18:44:30.235610 Duration: 2071.199 ms
    >       Name: restart master - Function: salt.state - Result: Clean Started: - 18:44:32.306972 Duration: 1006.268 ms
    >       Name: filequeue.add - Function: salt.runner - Result: Changed Started: - 18:44:33.313369 Duration: 1352.98 ms
    >       Name: minions.ready - Function: salt.runner - Result: Changed Started: - 18:44:34.666528 Duration: 1891.677 ms
    >       Name: repo - Function: salt.state - Result: Clean Started: - 18:44:36.558363 Duration: 553.342 ms
    >       Name: metapackage minions - Function: salt.state - Result: Clean Started: - 18:44:37.111825 Duration: 3993.733 ms
    >       Name: common packages - Function: salt.state - Result: Clean Started: - 18:44:41.105706 Duration: 2434.079 ms
    >       Name: sync - Function: salt.state - Result: Changed Started: - 18:44:43.539897 Duration: 1381.692 ms
    >       Name: mines - Function: salt.state - Result: Clean Started: - 18:44:44.921708 Duration: 1657.019 ms
    >       Name: updates - Function: salt.state - Result: Changed Started: - 18:44:46.578853 Duration: 11183.347 ms
    >       Name: restart - Function: salt.state - Result: Clean Started: - 18:44:57.762346 Duration: 1553.957 ms
    >       Name: mds restart noop - Function: test.nop - Result: Clean Started: - 18:44:59.316442 Duration: 0.348 ms
    >
    >     Summary for salt_master
    >     -------------
    >     Succeeded: 17 (changed=8)
    >     Failed:     0
    >     -------------
    >     Total states run:     17
    >     Total run time:   38.874 s
    > 
    > 
    > Before running Stage 1, the /srv/pillar/ceph/proposals directory does not
    > exist.
    >
    > salt:~ # ls /srv/pillar/ceph/proposals/
    >     ls: cannot access '/srv/pillar/ceph/proposals/': No such file or directory
    >
    > That’s where I’m at – Googling..
    > 
    > ~ Kevin
    > 
    > From: <deepsea-users-bounces at lists.suse.com> on behalf of Joel Zhou <joel.zhou at suse.com>
    > Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
    > Date: Tuesday, September 18, 2018 at 11:34 PM
    > To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
    > Subject: Re: [Deepsea-users] stage 1 errors on Azure
    > 
    > Hi Kevin,
    > 
    > My short answer is,
    > 
    > Step 1, before stage 0, check the salt-api service on the salt-master node
    > first.
    > ```bash
    > zypper install -y salt-api
    > systemctl enable salt-api.service
    > systemctl start salt-api.service
    > ```
    > Step 2, make sure the NTP service works correctly on all nodes, meaning the
    > time is synchronized on every node.
    > Step 3, reboot all your nodes, if acceptable, in case the kernel was updated.
    > Step 4, start over again from stage 0 through 5.
    >
    > Basically, DeepSea is a collection of Salt scripts, and Salt is based on
    > python2 and/or python3.
    > I have no information about your whole running stack, so I assume
    > SLES 12 SP3 + SES 5, which works fine and is supported. More info would be
    > helpful, and also your purpose, such as practice on your own, or
    > PoC/testing to meet a customer’s demands.
    >
    > Regards,
    > 
    > --
    > Joel Zhou 周维伟
    > Senior Storage Technologist, APJ
    > 
    > Mobile: +86 18514577601
    > Email: joel.zhou at suse.com
    > 
    > From: <deepsea-users-bounces at lists.suse.com> on behalf of Kevin Ayres <kevin.ayres at suse.com>
    > Reply-To: Discussions about the DeepSea management framework for Ceph <deepsea-users at lists.suse.com>
    > Date: Tuesday, September 18, 2018 at 4:49 PM
    > To: "deepsea-users at lists.suse.com" <deepsea-users at lists.suse.com>
    > Subject: [Deepsea-users] stage 1 errors on Azure
    > 
    > Hey guys, I can’t seem to get past stage 1. Stage 0 completes successfully.
    > I get the same output with the deepsea command. The master and minion
    > services are running and bidirectional host resolution is good. Keys are all
    > accepted. From what I can determine, the default files are not created by
    > stage 0 for some reason. Thoughts? What I’m seeing is that it fails to create
    > the /srv/pillar/ceph/proposals directory.
    >
    > I’m running through this doc line by line:
    > https://www.suse.com/documentation/suse-enterprise-storage-5/singlehtml/book_storage_deployment/book_storage_deployment.html#deepsea.cli
    >
    > ~ Kevin
    > 
    > 
    > salt:~ # salt-run state.orch ceph.stage.discovery
    > salt-api                 : ["Salt API is failing to authenticate - try 'systemctl restart salt-master': list index out of range"]
    > deepsea_minions          : valid
    > master_minion            : valid
    > ceph_version             : valid
    > [ERROR   ] No highstate or sls specified, no execution made
    > salt_master:
    > ----------
    >           ID: salt-api failed
    >     Function: salt.state
    >         Name: just.exit
    >       Result: False
    >      Comment: No highstate or sls specified, no execution made
    >      Started: 22:30:53.628882
    >     Duration: 0.647 ms
    >      Changes:
    >
    > Summary for salt_master
    > ------------
    > Succeeded: 0
    > Failed:    1
    > ------------
    > Total states run:     1
    > Total run time:   0.647 ms
    > 
    > salt:~ # !tail
    > tail -f /var/log/salt/master
    > 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-igw/cluster/igw*.sls matched no files
    > 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-openattic/cluster/salt.sls matched no files
    > 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/global.yml matched no files
    > 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/ceph/cluster.yml matched no files
    > 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] cluster/*.sls matched no files
    > 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] stack/default/ceph/minions/*.yml matched no files
    > 2018-09-18 22:29:08,822 [salt.state       ][ERROR   ][8499] No highstate or sls specified, no execution made
    > 2018-09-18 22:29:52,472 [salt.transport.ipc][ERROR   ][5672] Exception occurred while handling stream: [Errno 0] Success
    > 2018-09-18 22:29:56,797 [salt.state       ][ERROR   ][8759] No highstate or sls specified, no execution made
    > 2018-09-18 22:30:53,629 [salt.state       ][ERROR   ][9272] No highstate or sls specified, no execution made
    >
    > 
    > There’s also some issue with the salt-minion.service:
    > ● salt-minion.service - The Salt Minion
    >    Loaded: loaded (/usr/lib/systemd/system/salt-minion.service; enabled; vendor preset: disabled)
    >    Active: active (running) since Tue 2018-09-18 22:46:54 UTC; 12s ago
    >  Main PID: 11082 (salt-minion)
    > …
    > .....
    > Sep 18 22:46:54 salt systemd[1]: Started The Salt Minion.
    > Sep 18 22:47:00 salt salt-minion[11082]: [ERROR   ] Function cephimages.list in mine_functions failed to execute
    >
    > 
    
    


