[Deepsea-users] stage 1 errors on Azure

Eric Jackson ejackson at suse.com
Wed Sep 19 13:37:00 MDT 2018


Hi Kevin,
  Stage 0 only does the "preparation" part.  That is, syncing Salt modules, 
applying zypper updates, etc.  Stage 1 is the "discovery" part that 
interrogates the minions and then creates the roles and storage fragments.  
If your salt-api issue is resolved, Stage 1 should run relatively quickly.
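
If it helps, a minimal way to verify that ordering, assuming the standard 
SES 5 paths from the deployment guide:

```bash
# Confirm salt-api is healthy first; a failing API is what aborts
# the discovery orchestration early.
systemctl status salt-api.service

# Run the discovery stage; only after this should the proposals tree exist.
salt-run state.orch ceph.stage.discovery
ls /srv/pillar/ceph/proposals/
```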

Eric

On Wednesday, September 19, 2018 3:20:43 PM EDT Kevin Ayres wrote:
> Thanks Joel, yes, DNS and NTP are configured and behaving correctly. SP3/SES5
> from the current repo. The salt-api, salt-master, and salt-minion services
> are running (with one error). I’m walking through the Deployment Guide line
> by line with the same result, now on my second freshly built master node.
> Salt output is at the bottom of this message. Key point: after stage 0, the
> */proposals directory has NOT been created.
> 
> Here’s my build on a single flat network (Azure vNet 172.19.20.0/24):
> Root ssh is enabled with key-based login from the master to all nodes as
> root. All nodes were rebooted before the salt stages. All nodes use an
> identical, fully patched image (CPE_NAME="cpe:/o:suse:sles:12:sp3"),
> firewall off, etc. - the Azure instance defaults.
> 
> Salt (and all nodes):~ # zypper lr -E
> Repository priorities are without effect. All enabled repositories share the same priority.
> #  | Alias                                                               | Name                              | Enabled | GPG Check | Refresh
> ---+---------------------------------------------------------------------+-----------------------------------+---------+-----------+--------
> 3  | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Pool     | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
> 5  | SUSE_Enterprise_Storage_5_x86_64:SUSE-Enterprise-Storage-5-Updates  | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
> 8  | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Pool          | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
> 10 | SUSE_Linux_Enterprise_Server_12_SP3_x86_64:SLES12-SP3-Updates       | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes
> 
> **DNS** All nodes resolve bidirectionally. Azure takes care of DNS, but I’ve
> also updated the hosts files.
> 
> salt:~ # hostname
> salt
> salt:~ # ping salt
> PING salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1) 56(84) bytes of data.
> 64 bytes from salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net (127.0.0.1): icmp_seq=1 ttl=64 time=0.030 ms
> 
> 104.211.27.224 Outside NAT to 172.19.20.10   salt.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net salt
> 172.19.20.12    mon1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon1
> 172.19.20.13    mon2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon2
> 172.19.20.14    mon3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net mon3
> 172.19.20.15    osd1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd1
> 172.19.20.16    osd2.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd2
> 172.19.20.17    osd3.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net osd3
> 172.19.20.18    igw1.acylew2ti3nulm1e5a1hcxdv0h.bx.internal.cloudapp.net ogw1
> 
> **NTP** All nodes sync at a 5-minute interval to the same Stratum 1 server
> in the same geo as the Azure AZ (US East), navobs1.gatech.edu, as shown:
> 
> bash-3.2$ pssh -h pssh-hosts -l sesuser -i sudo ntpq -p
> [1] 11:15:27 [SUCCESS] mon3
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   19   64    1   15.596   -4.863   0.333
> [2] 11:15:27 [SUCCESS] salt
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u   42   64    1   17.063   -6.702   0.000
> [3] 11:15:27 [SUCCESS] igw1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   18   64    1   17.394  -27.874   7.663
> [4] 11:15:27 [SUCCESS] osd1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   21   64    1   16.962   -3.755   0.813
> [5] 11:15:27 [SUCCESS] osd2
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   22   64    1   15.832   -4.709   3.062
> [6] 11:15:27 [SUCCESS] osd3
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *navobs1.gatech. .GPS.            1 u   26   64    1   15.877   -3.252  19.131
> [7] 11:15:27 [SUCCESS] mon1
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u    2   64    1   16.120   -4.263   0.000
> [8] 11:15:27 [SUCCESS] mon2
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  navobs1.gatech. .GPS.            1 u    2   64    1   16.108   -7.713   0.959
> 
> **SALT**
> salt:~ # systemctl status salt-api salt-master salt-minion |grep 'active (running)'
>    Active: active (running) since Wed 2018-09-19 18:14:39 UTC; 13min ago
>    Active: active (running) since Wed 2018-09-19 18:14:42 UTC; 13min ago
>    Active: active (running) since Wed 2018-09-19 18:14:41 UTC; 13min ago
> 
> salt:~ # systemctl status salt-api salt-master salt-minion |grep ERROR
>    Sep 19 18:14:49 salt salt-minion[1413]: [ERROR   ] Function cephimages.list in mine_functions failed to execute
> 
> salt:~ # salt-key --list-all
>    Accepted Keys:
>    igw1
>    mon1
>    mon2
>    mon3
>    osd1
>    osd2
>    osd3
>    salt
>    Denied Keys:
>    Unaccepted Keys:
>    Rejected Keys:
> 
> salt:~ # salt '*' test.ping
> salt:
>     True
> osd2:
>     True
> mon3:
>     True
> osd3:
>     True
> osd1:
>     True
> mon2:
>     True
> igw1:
>     True
> mon1:
>     True
> 
> salt:~ # cat /srv/pillar/ceph/master_minion.sls
> master_minion: salt
> 
> salt:~ # cat /srv/pillar/ceph/deepsea_minions.sls
> ...
> # Choose all minions
> deepsea_minions: '*'
> ...
> 
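> As a sanity check that the minions actually see these pillar values, I’d
> expect something like the following to echo them back from every node
> (pillar.get is standard Salt; the keys are the ones shown above):
> ```bash
> salt '*' pillar.get master_minion
> salt '*' pillar.get deepsea_minions
> ```
> 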
> **SALT STAGES**
> Stage 0 is successful with no errors, but does not create the proposals
> folder.
> 
> salt:~ # salt-run state.orch ceph.stage.prep
>     deepsea_minions          : valid
>     master_minion            : valid
>     ceph_version             : valid
>     [WARNING ] All minions are ready
>     salt_master:
>       Name: sync master - Function: salt.state - Result: Changed Started: - 18:44:20.440255 Duration: 949.98 ms
>       Name: salt-api - Function: salt.state - Result: Changed Started: - 18:44:21.390365 Duration: 3256.749 ms
>       Name: repo master - Function: salt.state - Result: Clean Started: - 18:44:24.647227 Duration: 351.0 ms
>       Name: metapackage master - Function: salt.state - Result: Clean Started: - 18:44:24.998333 Duration: 1127.063 ms
>       Name: prepare master - Function: salt.state - Result: Changed Started: - 18:44:26.125514 Duration: 4109.917 ms
>       Name: filequeue.remove - Function: salt.runner - Result: Changed Started: - 18:44:30.235610 Duration: 2071.199 ms
>       Name: restart master - Function: salt.state - Result: Clean Started: - 18:44:32.306972 Duration: 1006.268 ms
>       Name: filequeue.add - Function: salt.runner - Result: Changed Started: - 18:44:33.313369 Duration: 1352.98 ms
>       Name: minions.ready - Function: salt.runner - Result: Changed Started: - 18:44:34.666528 Duration: 1891.677 ms
>       Name: repo - Function: salt.state - Result: Clean Started: - 18:44:36.558363 Duration: 553.342 ms
>       Name: metapackage minions - Function: salt.state - Result: Clean Started: - 18:44:37.111825 Duration: 3993.733 ms
>       Name: common packages - Function: salt.state - Result: Clean Started: - 18:44:41.105706 Duration: 2434.079 ms
>       Name: sync - Function: salt.state - Result: Changed Started: - 18:44:43.539897 Duration: 1381.692 ms
>       Name: mines - Function: salt.state - Result: Clean Started: - 18:44:44.921708 Duration: 1657.019 ms
>       Name: updates - Function: salt.state - Result: Changed Started: - 18:44:46.578853 Duration: 11183.347 ms
>       Name: restart - Function: salt.state - Result: Clean Started: - 18:44:57.762346 Duration: 1553.957 ms
>       Name: mds restart noop - Function: test.nop - Result: Clean Started: - 18:44:59.316442 Duration: 0.348 ms
> 
>     Summary for salt_master
>     -------------
>     Succeeded: 17 (changed=8)
>     Failed:     0
>     -------------
>     Total states run:     17
>     Total run time:   38.874 s
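> 
> Since I’m following the guide’s DeepSea CLI section, the same stage can also
> be driven through the deepsea command for progress output; this is from
> memory, so treat the exact invocation as an assumption:
> ```bash
> deepsea stage run ceph.stage.0   # same orchestration as the salt-run call above
> ```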
> 
> 
> Before running Stage 1, the /srv/pillar/ceph/proposals directory does not
> exist:
> 
> salt:~ # ls /srv/pillar/ceph/proposals/
> ls: cannot access '/srv/pillar/ceph/proposals/': No such file or directory
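> 
> For what it’s worth, discovery (stage 1) is what should build this tree; a
> minimal sketch, assuming the populate runner that I believe stage 1 calls
> internally (treat that as an assumption):
> ```bash
> # Assumption: populate.proposals is the DeepSea runner behind stage 1's
> # proposal generation; running it directly should create the tree.
> salt-run populate.proposals
> ls /srv/pillar/ceph/proposals/
> ```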
 
> That’s where I’m at – Googling..
> 
> ~ Kevin
> 
> From: <deepsea-users-bounces at lists.suse.com> on behalf of Joel Zhou
> <joel.zhou at suse.com>
> Reply-To: Discussions about the DeepSea management framework for Ceph
> <deepsea-users at lists.suse.com>
> Date: Tuesday, September 18, 2018 at 11:34 PM
> To: Discussions about the DeepSea management framework for Ceph
> <deepsea-users at lists.suse.com>
> Subject: Re: [Deepsea-users] stage 1 errors on Azure
> 
> Hi Kevin,
> 
> My short answer is,
> 
> Step 1: before stage 0, check the salt-api service on the salt-master node
> first:
> ```bash
> zypper install -y salt-api
> systemctl enable salt-api.service
> systemctl start salt-api.service
> ```
> Step 2: make sure the NTP service works correctly on all nodes, i.e. time is
> synchronized correctly everywhere.
> Step 3: reboot all your nodes, if acceptable, in case the kernel was updated
> at some point.
> Step 4: then start over again from stage 0 through stage 5 (see the sketch
> below).
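> 
> A minimal sketch of steps 2 and 4 combined (the host list is hypothetical,
> adjust to your node names; stages 2-5 follow once discovery succeeds):
> ```bash
> # Step 2: confirm every node is actually syncing time (ntpq is from the
> # ntp package).
> for h in salt mon1 mon2 mon3 osd1 osd2 osd3 igw1; do
>     ssh "$h" ntpq -p
> done
> 
> # Step 4: re-run the stages in order from the master.
> salt-run state.orch ceph.stage.prep       # stage 0
> salt-run state.orch ceph.stage.discovery  # stage 1
> ```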
> 
> Basically, DeepSea is a collection of Salt scripts, and Salt is based on
> Python 2 and/or Python 3. I have no clue about your full running stack, so
> I’ll assume SLES 12 SP3 + SES 5, which works fine and is supported. More
> info would be helpful, as well as your purpose, such as practice on your
> own, or PoC/testing to meet a customer’s demands.
> 
> Regards,
> 
> --
> Joel Zhou 周维伟
> Senior Storage Technologist, APJ
> 
> Mobile: +86 18514577601
> Email: joel.zhou at suse.com
> 
> From: <deepsea-users-bounces at lists.suse.com> on behalf of Kevin Ayres
> <kevin.ayres at suse.com>
> Reply-To: Discussions about the DeepSea management framework for Ceph
> <deepsea-users at lists.suse.com>
> Date: Tuesday, September 18, 2018 at 4:49 PM
> To: "deepsea-users at lists.suse.com" <deepsea-users at lists.suse.com>
> Subject: [Deepsea-users] stage 1 errors on Azure
> 
> Hey guys, I can’t seem to get past stage 1. Stage 0 completes successfully,
> and I get the same output with the deepsea command. The master and minion
> services are running, and bidirectional host resolution is good. Keys are
> all accepted. From what I can determine, the default files are not created
> by stage 0 for some reason. Thoughts? What I’m seeing is that it fails to
> create /srv/pillar/ceph/proposals.
> 
> I’m running through this doc line by line:
> https://www.suse.com/documentation/suse-enterprise-storage-5/singlehtml/book_storage_deployment/book_storage_deployment.html#deepsea.cli
> 
> ~ Kevin
> 
> 
> salt:~ # salt-run state.orch ceph.stage.discovery
> salt-api                 : ["Salt API is failing to authenticate - try
> 'systemctl restart salt-master': list index out of range"]
> deepsea_minions          : valid
> master_minion            : valid
> ceph_version             : valid
> [ERROR   ] No highstate or sls specified, no execution made
> salt_master:
> ----------
>           ID: salt-api failed
>     Function: salt.state
>         Name: just.exit
>       Result: False
>      Comment: No highstate or sls specified, no execution made
>      Started: 22:30:53.628882
>     Duration: 0.647 ms
>      Changes:
> 
> Summary for salt_master
> ------------
> Succeeded: 0
> Failed:    1
> ------------
> Total states run:     1
> Total run time:   0.647 ms
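> 
> The failure message itself suggests a fix; a minimal sketch of acting on it
> (the restart order is my assumption):
> ```bash
> # Restart the master and the API that fronts it, then retry discovery to
> # see whether authentication recovers.
> systemctl restart salt-master.service
> systemctl restart salt-api.service
> salt-run state.orch ceph.stage.discovery
> ```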
> 
> salt:~ # !tail
> tail -f /var/log/salt/master
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-igw/cluster/igw*.sls matched no files
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] role-openattic/cluster/salt.sls matched no files
> 2018-09-18 22:29:08,797 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/global.yml matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] config/stack/default/ceph/cluster.yml matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] cluster/*.sls matched no files
> 2018-09-18 22:29:08,798 [salt.loaded.ext.runners.validate][WARNING ][8499] stack/default/ceph/minions/*.yml matched no files
> 2018-09-18 22:29:08,822 [salt.state       ][ERROR   ][8499] No highstate or sls specified, no execution made
> 2018-09-18 22:29:52,472 [salt.transport.ipc][ERROR   ][5672] Exception occurred while handling stream: [Errno 0] Success
> 2018-09-18 22:29:56,797 [salt.state       ][ERROR   ][8759] No highstate or sls specified, no execution made
> 2018-09-18 22:30:53,629 [salt.state       ][ERROR   ][9272] No highstate or sls specified, no execution made
> 
> There’s also some issue with the salt-minion.service:
> ● salt-minion.service - The Salt Minion
>    Loaded: loaded (/usr/lib/systemd/system/salt-minion.service; enabled; vendor preset: disabled)
>    Active: active (running) since Tue 2018-09-18 22:46:54 UTC; 12s ago
>  Main PID: 11082 (salt-minion)
> .....
> Sep 18 22:46:54 salt systemd[1]: Started The Salt Minion.
> Sep 18 22:47:00 salt salt-minion[11082]: [ERROR   ] Function cephimages.list in mine_functions failed to execute
> 
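> The cephimages.list error looks to me like the custom mine module never made
> it to the minions; a minimal sketch of re-syncing and refreshing the mine
> (an assumption, not a confirmed fix):
> ```bash
> # Push custom modules (including mine functions) out to every minion,
> # then ask the minions to re-run their mine functions.
> salt '*' saltutil.sync_all
> salt '*' mine.update
> ```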


