From lgrimmer at suse.com  Fri Dec 1 06:35:29 2017
From: lgrimmer at suse.com (Lenz Grimmer)
Date: Fri, 1 Dec 2017 14:35:29 +0100
Subject: [Deepsea-users] Running stage 0 fails with some errors ("rpm: -1: unknown option")
In-Reply-To: 
References: <1511974779.3686.9.camel@suse.com>
Message-ID: 

On 11/30/2017 04:26 PM, Lenz Grimmer wrote:

>> Currently DeepSea only supports salt version 2016.11.4
>
> Oh, so users on Leap can't actually use DeepSea at all? Or is there
> a recommended way to downgrade Salt to the previous version?

Answering myself: there is a dedicated repo on OBS that provides Salt
2016.11.4 for Leap 42.3:

https://software.opensuse.org//download.html?project=systemsmanagement%3Asaltstack%3Aproducts&package=salt

You can then install Salt by providing an explicit version number, e.g.
"zypper install salt-2016.11.04".

I wonder if it would make sense to add this version requirement as an
installation dependency to the DeepSea RPM spec file?

I will continue testing this and report back. I need to figure out if
it's possible to "pin" Salt to this version, to prevent zypper from
accidentally updating it to the newer version shipped with Leap...

Lenz

--
SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: OpenPGP digital signature
URL: 

From ncutler at suse.cz  Fri Dec 1 09:22:45 2017
From: ncutler at suse.cz (Nathan Cutler)
Date: Fri, 1 Dec 2017 17:22:45 +0100
Subject: [Deepsea-users] Running stage 0 fails with some errors ("rpm: -1: unknown option")
In-Reply-To: 
References: <1511974779.3686.9.camel@suse.com>
Message-ID: <50d556d4-66a3-2b59-8ac1-aa4e5de5c30e@suse.cz>

> I will continue testing this and report back. I need to figure out if
> it's possible to "pin" Salt to this version, to prevent zypper from
> accidentally updating it to the newer version shipped with Leap...

Hi Lenz:

There are two ways, and you can use one, the other, or both.

First way: increase the priority of the repo containing the "right" Salt
version. The default priority is 99, and the lower the number, the higher
the priority. Use the "-p" option to "zypper lr" to see the priorities of
all your repos:

# zypper lr -p

For example, if the right version is coming from repo "foo", increase the
priority of that repo by giving it a number lower than 99, e.g.:

# zypper mr -p 98 foo
# zypper lr -p | grep foo

The second way is package locks. Read the "Package Lock Management" section
of the zypper manpage for details, but the tl;dr would be (after installing
the "correct" version of salt that you wish to lock):

# zypper al 'salt*'

HTH
Nathan

From david.smith at pdfcomplete.com  Mon Dec 4 07:23:39 2017
From: david.smith at pdfcomplete.com (David Smith)
Date: Mon, 4 Dec 2017 08:23:39 -0600
Subject: [Deepsea-users] Leap 42.3 with Salt 2016.11.04
In-Reply-To: <2523366.36MuXs8VOy@fury.home>
References: <393AD1F2613C884CBCEF977564E6BB4101A2A6881901@SBSEXCHSRVR.pdfcomplete.local> <2523366.36MuXs8VOy@fury.home>
Message-ID: <393AD1F2613C884CBCEF977564E6BB4101A2A6881906@SBSEXCHSRVR.pdfcomplete.local>

Hi Eric,

Thank you so much for your help.

> # salt '*' saltutil.sync_all

Solved the issue with Stage 1. Forging on ... :)

David
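(To tie together the Salt-pinning replies from Lenz and Nathan earlier in
this thread, here is a consolidated sketch of the repo-priority-plus-lock
approach. The repo alias "salt-2016" is invented for the example, and the
URL placeholder stands for the repository behind the download link Lenz
posted; this is only a sketch, not an exact recipe from the thread:)

# zypper ar -p 98 <URL-of-the-systemsmanagement:saltstack:products-repo> salt-2016
# zypper ref salt-2016
# zypper install salt-2016.11.04
# zypper al 'salt*'
# zypper ll

The last command lists the active package locks, so the pin can be
confirmed before the next "zypper up".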
-----Original Message-----
From: deepsea-users-bounces at lists.suse.com [mailto:deepsea-users-bounces at lists.suse.com] On Behalf Of Eric Jackson
Sent: Monday, December 04, 2017 6:48 AM
To: deepsea-users at lists.suse.com
Subject: Re: [Deepsea-users] Leap 42.3 with Salt 2016.11.04

Hi David,
  Whenever Salt gives the 'module.method is not available' error, there are
a few possibilities. One is that the sync never occurred and the module is
missing from the minion. Give this a try:

# salt '*' saltutil.sync_all

If you see an entry for modules getting updated, then give Stage 2 a try
again.

If that does not resolve the problem, then the module may be failing. The
module normally runs on all minions, which includes the master node. So,

# salt-call -l debug proposal.generate

might show an exception or stack trace. If not, then

# salt '*' proposal.generate

should show which minion(s) are getting an error. If only one minion is
complaining but the others are fine, suspect that node. Set

log_level: debug

in /etc/salt/minion. Restart the minion. Check /var/log/salt/minion for
error messages. Rerun the command above.

At this point, hopefully the problem is resolved. If we have a failure on
Leap, opening an issue on github with your findings would be helpful.

Eric

On Saturday, December 02, 2017 09:42:51 PM David Smith wrote:
> I am trying to use the Leap 42.3 Deepsea install of Ceph as described
> in the Quick Start Guide.
>
> Since the current version of Salt hangs in Stage 0, I set up Salt
> 2016.11.04 before adding and refreshing the repo for luminous and
> openAttic. Stage 1 completes successfully, but Stage 2 fails in step
> 4 with "proposal.generate is not available" for all the minions except
> the master.
>
> I presume that Salt 2016.11.04 minion installs are missing
> 'proposal.generate'? What should I do?
>
> Do you know when Deepsea will be compatible with the current Salt version?
>
> Thanks,
>
> David

From thsundel at gmail.com  Fri Dec 8 06:04:25 2017
From: thsundel at gmail.com (Thomas Sundell)
Date: Fri, 8 Dec 2017 14:04:25 +0100
Subject: [Deepsea-users] filestore to bluestore migration
Message-ID: 

Hi all,

we recently upgraded our SES4 cluster to SES5 successfully; there were a
couple of very minor issues on the way that we managed to solve. Good work
and great documentation!

Now we are at the step of migrating from the filestore to the bluestore
backend. We followed the documentation
( https://www.suse.com/documentation/suse-enterprise-storage-5/book_storage_deployment/data/ceph_upgrade_4to5.html )
and ran "salt-run state.orch ceph.migrate.policy". This created a new
policy.cfg and a policy.cfg-original, and also a directory named
migrated-profile-1Intel238GB-8ST4000VN0001-1SF3726GB-1.

Then we ran "salt-run state.orch ceph.migrate.osds", which ran successfully,
but watching "ceph osd tree" the weight of the osds stayed the same, so my
guess was that the migration did not start. So I did a diff on policy.cfg
and policy.cfg-original and noticed the content was the same. I edited the
file and changed the profile to point to the migrated folder:

migrated-profile-1Intel238GB-8ST4000VN0001-1SF3726GB-1/cluster/*.sls
migrated-profile-1Intel238GB-8ST4000VN0001-1SF3726GB-1/stack/default/ceph/minions/*.yml

Then I reran "salt-run state.orch ceph.migrate.osds", which completed
successfully again, but still the osd weight didn't change.

After this I noticed that another directory had also been created during
the upgrade, named profile-default.
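(A quick way to compare the two generated profile trees side by side is a
recursive diff. This is only a sketch; the /srv/pillar/ceph/proposals
location is an assumption about where DeepSea keeps its proposals, so
adjust the path to wherever policy.cfg lives on the admin node:)

# cd /srv/pillar/ceph/proposals
# diff -ru migrated-profile-1Intel238GB-8ST4000VN0001-1SF3726GB-1 profile-default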
Comparing the content of the yml files in that directory vs the migrated
directory, I noticed some differences.

This is from the migrated profile directory:

ceph:
  storage:
    osds:
      /dev/disk/by-id/ata-ST4000VN0001-1SF178_Z4F0PS2P:
        format: bluestore
      ...

This is from the profile-default directory:

ceph:
  storage:
    osds:
      /dev/disk/by-id/ata-ST4000VN0001-1SF178_Z4F0PS49:
        db: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        db_size: 500m
        format: bluestore
        wal: /dev/disk/by-id/nvme-20000000001000000e4d25c0bf42f4c00
        wal_size: 500m
      ...

So now I'm thinking that maybe policy.cfg should point to profile-default/../
instead; maybe that change would get the migration to bluestore rolling? I
wanted to ask here before I try. Anyone have a clue if I'm onto something,
or am I completely lost? :)

First-time poster, and new to this list and SES. Thanks for listening.

Thomas

From thsundel at gmail.com  Fri Dec 8 07:50:20 2017
From: thsundel at gmail.com (Thomas Sundell)
Date: Fri, 8 Dec 2017 15:50:20 +0100
Subject: [Deepsea-users] filestore to bluestore migration
In-Reply-To: <1797088.mreHHa35iL@fury.home>
References: <1797088.mreHHa35iL@fury.home>
Message-ID: 

Thanks Eric for your response, see my inline comments.

On Fri, Dec 8, 2017 at 3:10 PM, Eric Jackson wrote:
> Hi Thomas,
>   Were you running Stage 2 after making changes to the policy.cfg? The
> reason I ask is that the behavior you describe (i.e. migration runs, but
> nothing changed) gives the impression that the migration "thinks" that you
> are already configured.
>
> Try running
>
> salt -I roles:storage osd.report
>
> That will compare the existing configuration in the pillar against the
> configuration of the storage node. If you see messages like
>
> All configured OSDs are active

salt -I roles:storage osd.report
osd1.ceph.mydomain.fi:
    All configured OSDS are active
osd2.ceph.mydomain.fi:
    All configured OSDS are active
osd3.ceph.mydomain.fi:
    All configured OSDS are active
osd4.ceph.mydomain.fi:
    All configured OSDS are active

>
> and a query of the pillar
>
> salt -I roles:storage pillar.get ceph
>

salt -I roles:storage pillar.get ceph
osd1.ceph.mydomain.fi:
osd2.ceph.mydomain.fi:
osd3.ceph.mydomain.fi:
osd4.ceph.mydomain.fi:

> reflects the policy.cfg correctly, then you should be migrated to
> bluestore. To manually verify, pick a storage node and run
>
> cat /var/lib/ceph/osd/ceph-*/type
>

cat /var/lib/ceph/osd/ceph-*/type
filestore
filestore
filestore
filestore
filestore
filestore
filestore
filestore

We only have 4 storage nodes in this test cluster and each of them has 8
OSDs.

> ***
> With respect to the migration, the commands to migrate one node are in
> /srv/salt/ceph/migrate/osds/default.sls. Effectively,
>
> salt 'data1*' state.apply ceph.redeploy.osds
> salt 'admin*' state.apply ceph.remove.migrated
>
> The rest of the state file is waiting for a healthy cluster. Notice that
> the cleanup is batched (i.e. the 'old' OSDs are not removed until the end).
>
> ***
> With respect to the profile-default and the one generated by the
> ceph.migrate.policy orchestration, either should be fine given some
> conditions.
>
> Here's a quick history: DeepSea originally tried to encode a useful name
> for a profile for a range of defaults. The general issue is that most
> sites fell outside of the guessed defaults and admins were left
> hand-crafting their hardware profiles.
>
> The strategy now is to create 'profile-default' with a hardcoded 1 to 5
> ratio or standalone OSDs depending on the hardware available.
> If you prefer the configuration provided by the profile-default, then feel
> free to use it. The only caveat is to verify that devices have not been
> left off or used in ways not originally intended. The purpose of the
> ceph.migrate.policy was to keep any manual modifications and only change
> the type. The journal media would be used for the wal and db for bluestore.
>
> Feel free to experiment with `salt-run proposal.help` and `salt-run
> proposal.peek`. Once you decide on what you really want, the migration
> will carry it out.
>

Another thing struck me: if the "old" profile was not correctly formatted
when they deployed the cluster, could "ceph.migrate.policy" have created a
"faulty" one? Here is a snip of the premigrated yml:

storage:
  data+journals: []
  osds:
  - /dev/disk/by-id/ata-ST4000VN0001-1SF178_Z4F0PS2P

Because when I create a new proposal (salt-run proposal.populate
name=my7to1profile ratio=7) I see there are several attributes, like db,
db_size, wal and wal_size, that are not in the migrated profile.

One last thing: if I were to set the new my7to1profile in policy.cfg, will
"salt-run state.orch ceph.migrate.osds" magically migrate the old setup to
the new one without destroying our test data? :)

Thomas

From thsundel at gmail.com  Mon Dec 11 05:26:28 2017
From: thsundel at gmail.com (Thomas Sundell)
Date: Mon, 11 Dec 2017 13:26:28 +0100
Subject: [Deepsea-users] filestore to bluestore migration
In-Reply-To: <3449161.RPsEhlAETZ@fury.home>
References: <1797088.mreHHa35iL@fury.home> <3449161.RPsEhlAETZ@fury.home>
Message-ID: 

Back at work.

On Fri, Dec 8, 2017 at 5:57 PM, Eric Jackson wrote:
> It's okay to pull yaml files or even parts of them from the existing
> configuration, the results from proposal.populate and from
> ceph.migrate.policy. As long as the policy.cfg points to the one profile
> you want, you can evolve your configuration at your pace.
>
> I would suggest changing a yaml file and running through
>
> salt-run state.orch ceph.stage.2
> salt 'storage_minion*' pillar.get ceph
> salt 'storage_minion*' osd.report
>
> until you see what you want. Then, run the migration
>
> salt-run state.orch ceph.migrate.osds
>

So I did the above and the migration began; I noticed the osd weight
dropping to zero, and the osds went down and then came back up again as
bluestore :) 5 of them got migrated but then I hit an error:
"Module function osd.redeploy threw an exception. Exception: Device
/dev/sda is not defined in pillar"

So I took a look at the osd*.yml files and noticed the first disk of every
node was missing. I manually edited the files, added the missing disk of
each node, and reran the above steps, but I'm still hitting the same error.

And now after I edited the files and ran osd.report it states "No OSD
configured for" the disks I added manually.

Any ideas what I could try next?

Thanks again!
Thomas

From thsundel at gmail.com  Mon Dec 11 08:24:00 2017
From: thsundel at gmail.com (Thomas Sundell)
Date: Mon, 11 Dec 2017 16:24:00 +0100
Subject: [Deepsea-users] filestore to bluestore migration
In-Reply-To: <34397414.MGAn1gEDzT@fury.home>
References: <3449161.RPsEhlAETZ@fury.home> <34397414.MGAn1gEDzT@fury.home>
Message-ID: 

Hi again,

On Mon, Dec 11, 2017 at 2:19 PM, Eric Jackson wrote:
>>
>> :) 5 of them got migrated but then I hit an error:
>> "Module function osd.redeploy threw an exception. Exception: Device
>> /dev/sda is not defined in pillar"
>>
> I expect the device is defined in the local grains, but not in the pillar.
Exactly. When I ran proposal.populate to create a new profile, it seems not
to have picked up the first disk of each node, yet looking in
/etc/salt/grains on the storage nodes the disk is defined there.

>
>> So I took a look at the osd*.yml files and noticed the first disk of every
>> node was missing. I manually edited the files, added the missing disk of
>> each node, and reran the above steps, but I'm still hitting the same error.
>>
>> And now after I edited the files and ran osd.report it states "No OSD
>> configured for" the disks I added manually.
>
> So, these first disks never had an OSD? That would seem to contradict the
> grains above. Were these deleted manually?

I found why it stated "No OSD configured for" the disks I manually added to
the profile: there was a typo in the /dev/disk/by-id.... line in the yml. It
had a capital O instead of a zero, oops ;)

Migration to bluestore is now done, but for some reason one OSD is missing.
I have to investigate.

Thank you once again Eric!

Thomas
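(A closing sketch for anyone who hits the same two problems as above, i.e. a
device that exists in a minion's grains but not in the pillar, and a
mistyped /dev/disk/by-id path in a profile yml. It only combines commands
already shown in this thread with standard shell; the ceph:storage:osds
pillar path is inferred from the yml layout quoted earlier and may differ
on your setup:)

Compare what the pillar expects with what the nodes actually have:

# salt -I roles:storage pillar.get ceph:storage:osds
# salt -I roles:storage cmd.run 'cat /etc/salt/grains'

Double-check the exact by-id spellings on the storage nodes (this is where
a capital O vs. a zero hides):

# salt -I roles:storage cmd.run 'ls /dev/disk/by-id/ | grep ST4000'

A device present on the node but absent from the pillar gives the
"Device ... is not defined in pillar" exception during ceph.migrate.osds,
and a mistyped by-id entry shows up as "No OSD configured for" in
osd.report.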