From swamireddy at gmail.com Thu Dec 1 02:55:52 2016
From: swamireddy at gmail.com (M Ranga Swami Reddy)
Date: Thu, 1 Dec 2016 15:25:52 +0530
Subject: [Deepsea-users] DeepSea support Ubuntu OS
In-Reply-To: <4614823.cYj6PdUEVI@ruby>
References: <4614823.cYj6PdUEVI@ruby>
Message-ID:

Thanks Eric. Please share the files where we need to make the changes for
Ubuntu... I have already set up on Ubuntu and installed the salt-master and
salt-minion...

Thanks
Swami

On Thu, Dec 1, 2016 at 1:46 PM, Eric Jackson wrote:
> Hi Swami,
>   The current master is only working on SUSE, but we do have an issue opened
> to support Ubuntu as well: https://github.com/SUSE/DeepSea/issues/57.
>   The necessary changes should not be large; finding the Ubuntu equivalents
> of the specific zypper and hwinfo calls would be most of the work. I will
> likely turn these existing pieces into Salt modules.
>
>   There's also the matter of creating an Ubuntu development environment,
> which I do not have at the moment. If you are interested in helping before I
> get my Ubuntu environment completed, I can give the specific filenames of
> where the changes are needed.
>
>   And welcome to the list, and thanks for your interest.
>
> Eric
>
> On Thursday, December 01, 2016 12:15:03 PM M Ranga Swami Reddy wrote:
> > Hello,
> > We would like to use the DeepSea tool on Ubuntu OS. Please let me know if
> > this tool works on Ubuntu as well, or if it works on SUSE only.
> >
> > Thanks
> > Swami
>
> _______________________________________________
> Deepsea-users mailing list
> Deepsea-users at lists.suse.com
> http://lists.suse.com/mailman/listinfo/deepsea-users
>

From swamireddy at gmail.com Thu Dec 1 07:03:38 2016
From: swamireddy at gmail.com (M Ranga Swami Reddy)
Date: Thu, 1 Dec 2016 19:33:38 +0530
Subject: [Deepsea-users] DeepSea support Ubuntu OS
In-Reply-To:
References: <4614823.cYj6PdUEVI@ruby>
Message-ID:

Hi Lenz,
Yes, hwinfo can be replaced with lsblk on Ubuntu.

Agreed with you that the DeepSea code should be independent of the OS and,
based on the OS in use, should pick up the appropriate commands/tools. But
this requires modularization of the DeepSea code, which is not there at
present.

Thanks
Swami

On Thu, Dec 1, 2016 at 6:43 PM, Lenz Grimmer wrote:
> Hi,
>
> On 12/01/2016 10:55 AM, M Ranga Swami Reddy wrote:
>
> > Please share the files where we need to make the changes for Ubuntu... I
> > have already set up on Ubuntu and installed the salt-master and
> > salt-minion...
>
> A quick recursive grep for "hwinfo" and "zypper" turned up the following
> potential files:
>
> srv/salt/_modules/examples/freedisks.py
> srv/salt/_modules/cephdisks.py
> srv/modules/runners/validate.py
> srv/salt/ceph/rgw/default-zypper.sls
> srv/salt/ceph/updates/default.sls
> srv/salt/ceph/packages/common/default-zypper.sls
> srv/salt/ceph/igw/default-zypper.sls
> srv/salt/ceph/igw/import/default-zypper.sls
> srv/salt/_modules/zypper_locks.py
>
> It would make sense to figure out if instead of using "zypper" we could
> use the functions built into Salt for installing packages. Not sure how
> to replace the "hwinfo" calls, though. From what I could tell, it's
> being used to list disks attached to the given node. Maybe "lsblk" could
> be used here instead?
>
> In general and if possible, we should only use functions that are
> completely distribution-independent.
> In addition to Debian/Ubuntu, it'd be excellent if DeepSea would also be
> supported on Red Hat distributions like CentOS or Fedora...
>
> Lenz
>
> _______________________________________________
> Deepsea-users mailing list
> Deepsea-users at lists.suse.com
> http://lists.suse.com/mailman/listinfo/deepsea-users
>

From lmb at suse.com Thu Dec 1 07:23:07 2016
From: lmb at suse.com (Lars Marowsky-Bree)
Date: Thu, 1 Dec 2016 15:23:07 +0100
Subject: [Deepsea-users] DeepSea support Ubuntu OS
In-Reply-To:
References: <4614823.cYj6PdUEVI@ruby>
Message-ID: <20161201142307.dqdordjeejpczco4@suse.de>

On 2016-12-01T19:33:38, M Ranga Swami Reddy wrote:

> Hi Lenz,
> Yes, hwinfo can be replaced with lsblk on Ubuntu.
>
> Agreed with you that the DeepSea code should be independent of the OS and,
> based on the OS in use, should pick up the appropriate commands/tools. But
> this requires modularization of the DeepSea code, which is not there at
> present.

We're very happy to take patches on this though. That's currently simply a
question of bandwidth on our part to drive the other platforms.

So please, let us know!

Regards,
    Lars

--
Architect SDS, Distinguished Engineer
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

From loic.devulder at mpsa.com Thu Dec 1 08:44:18 2016
From: loic.devulder at mpsa.com (LOIC DEVULDER)
Date: Thu, 1 Dec 2016 15:44:18 +0000
Subject: [Deepsea-users] DeepSea support Ubuntu OS
In-Reply-To:
References: <4614823.cYj6PdUEVI@ruby>
Message-ID: <3CBFA7CC2505A74B9C172B35128B88637ED73971@YLAV4460.INETPSA.com>

Hi guys,

>-----Message d'origine-----
>De : deepsea-users-bounces at lists.suse.com [mailto:deepsea-users-
>bounces at lists.suse.com] De la part de Lenz Grimmer
>Envoyé : jeudi 1 décembre 2016 14:13
>À : deepsea-users at lists.suse.com
>Objet : Re: [Deepsea-users] DeepSea support Ubuntu OS
>...
>It would make sense to figure out if instead of using "zypper" we could
>use the functions built into Salt for installing packages. Not sure how
>to replace the "hwinfo" calls, though. From what I could tell, it's
>being used to list disks attached to the given node. Maybe "lsblk" could
>be used here instead?
>
>In general and if possible, we should only use functions that are
>completely distribution-independent. In addition to Debian/Ubuntu, it'd
>be excellent if DeepSea would also be supported on Red Hat distributions
>like CentOS or Fedora...
>
>Lenz

Only one remark as a SUSE/SLES user: is it not "bad" to install packages
bypassing zypper? Zypper has a lot of advantages compared to the "rpm"
command.

Or does Salt natively use zypper for package installation when it detects a
SUSE OS? If so, yes, it could be better to use these functions.

I haven't tried DeepSea yet (I plan to do so soon), but it looks promising :-)

Loic

From tserong at suse.com Thu Dec 1 17:58:45 2016
From: tserong at suse.com (Tim Serong)
Date: Fri, 2 Dec 2016 11:58:45 +1100
Subject: [Deepsea-users] DeepSea support Ubuntu OS
In-Reply-To: <3CBFA7CC2505A74B9C172B35128B88637ED73971@YLAV4460.INETPSA.com>
References: <4614823.cYj6PdUEVI@ruby> <3CBFA7CC2505A74B9C172B35128B88637ED73971@YLAV4460.INETPSA.com>
Message-ID:

On 12/02/2016 02:44 AM, LOIC DEVULDER wrote:
> Hi guys,
>
>> -----Message d'origine-----
>> De : deepsea-users-bounces at lists.suse.com [mailto:deepsea-users-
>> bounces at lists.suse.com] De la part de Lenz Grimmer
>> Envoyé : jeudi 1 décembre 2016 14:13
>> À : deepsea-users at lists.suse.com
>> Objet : Re: [Deepsea-users] DeepSea support Ubuntu OS
>> ...
>> It would make sense to figure out if instead of using "zypper" we could
>> use the functions built into Salt for installing packages. Not sure how
>> to replace the "hwinfo" calls, though. From what I could tell, it's
>> being used to list disks attached to the given node. Maybe "lsblk" could
>> be used here instead?
>>
>> In general and if possible, we should only use functions that are
>> completely distribution-independent. In addition to Debian/Ubuntu, it'd
>> be excellent if DeepSea would also be supported on Red Hat distributions
>> like CentOS or Fedora...
>>
>> Lenz
>
> Only one remark as a SUSE/SLES user: is it not "bad" to install packages
> bypassing zypper? Zypper has a lot of advantages compared to the "rpm"
> command.
>
> Or does Salt natively use zypper for package installation when it detects a
> SUSE OS? If so, yes, it could be better to use these functions.

Salt's pkg.installed natively uses zypper on SUSE Linux (and yum on Red Hat,
apt on Debian and derivatives, etc.). IIRC there was a bug at some point in
the past with the zypper module inside salt itself, which is why we ended up
with explicit zypper invocations in DeepSea, as a short term workaround for
that issue. I can't find a reference for that bug right now though.

Anyway, from the list of files Lenz posted earlier, non-SUSE distros can
probably ignore all the ones named default-zypper.sls. For example,
srv/salt/ceph/packages/common/init.sls says:

  include:
    - .{{ salt['pillar.get']('packages_common', 'default') }}

This means it will use srv/salt/ceph/packages/default.sls (which in turn uses
the generic pkg.installed), unless explicitly overridden by pillar data to use
default-zypper.sls.

I think this means the only zypper invocations that should need "fixing" are
in these two files:

  srv/salt/ceph/updates/default.sls
  srv/modules/runners/validate.py

Regards,

Tim

--
Tim Serong
Senior Clustering Engineer
SUSE
tserong at suse.com

From rdias at suse.com Fri Dec 2 05:00:52 2016
From: rdias at suse.com (Ricardo Dias)
Date: Fri, 2 Dec 2016 12:00:52 +0000
Subject: [Deepsea-users] Pull-requests and automated testing
Message-ID: <20161202120052.GB16044@rdias-suse-laptop>

Hi folks,

Now that DeepSea (DS) is starting to get public attention and will start
getting more and more contributions, it is very important to "agile-ize"
the reviewing process of pull-request submissions.

Every time a pull-request (PR) is submitted, besides reviewing the code,
the assigned reviewer must also deploy DS and test it manually to check
that nothing stops working due to the PR changes. While this process has
worked until now, it will not scale once we start to get several PR
submissions.

A good starting point to solve this scaling problem is to rely on
GitHub's "hooks" and PR "status checks" to run a set of automated
tests that will ensure that everything is working correctly and save the
reviewers the time of doing the manual testing. This is similar to what
the Ceph project on GitHub is doing with PR submissions.

Of course all of this assumes that we have tests to run, and we should
start planning how unit tests should be developed for DS, based on Salt
unit tests. But for a start, I think we should have some way to pick up
the changes from the submitted PR, spawn some virtual machines, deploy a
Ceph cluster and the respective services with DS+changes, and check that
everything is working correctly.
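Purely as an illustration of the GitHub glue this involves: a status check
ultimately boils down to posting a commit status for the PR's head SHA once
the test run finishes. The sketch below is hypothetical and is not existing
DeepSea tooling; only the GitHub commit-status endpoint and the repository
name are taken as given, and token handling is left out.

```python
#!/usr/bin/env python
# Hypothetical glue script: report a test result back to a PR's head commit
# as a GitHub commit status, which is what branch "status checks" build on.
import requests

STATUS_URL = "https://api.github.com/repos/SUSE/DeepSea/statuses/{sha}"


def report_status(sha, token, success, log_url=None):
    """Mark the given commit as success/failure for a 'deepsea-smoke' context."""
    payload = {
        "state": "success" if success else "failure",
        "context": "deepsea-smoke",
        "description": "DeepSea smoke test",
    }
    if log_url:
        payload["target_url"] = log_url  # link back to the CI job output
    resp = requests.post(STATUS_URL.format(sha=sha), json=payload,
                         headers={"Authorization": "token " + token})
    resp.raise_for_status()
```

The CI job would typically post a "pending" state when it starts and the
final "success"/"failure" state when it finishes, so the PR page shows the
check in progress.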
A side effect of having status checks activated for the PRs is that we protect the master branch from "force" pushes and from deletion, which is also nice. Now what do we need to actually put this plan to work? Here is a checklist (maybe incomplete): * public jenkins instance * computing resources for testing DS * Scripts to glue everything with github If anyone has suggestions, see any problem with this plan, or knows about other components/software to achieve the same plan, please go forth and present it. I truly believe that this is an important matter to allow a faster development cycle for DS. Thanks, Ricardo Dias From jfajerski at suse.com Fri Dec 2 09:11:50 2016 From: jfajerski at suse.com (Jan Fajerski) Date: Fri, 2 Dec 2016 17:11:50 +0100 Subject: [Deepsea-users] Pull-requests and automated testing In-Reply-To: <20161202120052.GB16044@rdias-suse-laptop> References: <20161202120052.GB16044@rdias-suse-laptop> Message-ID: <20161202161150.bqple4xvpz365iws@jf_suse_laptop.fritz.box> On Fri, Dec 02, 2016 at 12:00:52PM +0000, Ricardo Dias wrote: >Hi folks, > >Now that DeepSea (DS) is starting to get public attention and will start >getting more and more contributions, it is very important to "agile-ize" >the reviewing process of pull-requests submissions. > >Every time a pull-request (PR) is submitted, besides reviewing the code, >the assigned reviewer must also deploy DS and test it manually to check >if nothing stops working due to the PR changes. >While this process has worked until now, it will not scale when we start >to get several PR submissions. > >A good starting point to solve this scaling problem, is to rely on >GitHub's "hooks", and PR's "status checks", to run a set of automated >tests that will ensure that everything is working correctly and save the >reviewers time from doing the manual testing. This is similar to what >the Ceph project in github is doing with PR submissions. > >Of course all of this assumes that we have tests to run, and we should >start planning how unit tests should be developed for DS, based on Salt >unit tests. But for the start, I think we should have some way to pick >the changes from the submitted PR, spawn some virtual machines and >deploy a Ceph cluster, and respective services with DS+changes, and >check that everything is working correctly. I'm not sure if unit tests will be sufficient. salt is hard to test and to my knowledge we can only test runners with unit tests. The real question is how we will test the rest. > >A side effect of having status checks activated for the PRs is that we >protect the master branch from "force" pushes and from deletion, which >is also nice. > >Now what do we need to actually put this plan to work? >Here is a checklist (maybe incomplete): >* public jenkins instance >* computing resources for testing DS >* Scripts to glue everything with github Being on my way back from the QA/Testing/Automation meeting in Prag, I would strongly suggest we look into integrating DeepSea testing into Teuthology. This has various advantages and shouldn't be too hard. Ceph deploy is tested in Teuthology too...so why not DS. Imho the tests to run (besides unit tests for the runners) would simply exercise DeepSeas functionality. > >If anyone has suggestions, see any problem with this plan, or knows about >other components/software to achieve the same plan, please go forth and >present it. > >I truly believe that this is an important matter to allow a faster >development cycle for DS. Absolutely agreed...we need to get this on the road. 
> >Thanks, >Ricardo Dias > >_______________________________________________ >Deepsea-users mailing list >Deepsea-users at lists.suse.com >http://lists.suse.com/mailman/listinfo/deepsea-users -- Jan Fajerski Engineer Enterprise Storage SUSE Linux GmbH jfajerski at suse.com From rdias at suse.com Fri Dec 2 09:32:30 2016 From: rdias at suse.com (Ricardo Dias) Date: Fri, 2 Dec 2016 16:32:30 +0000 Subject: [Deepsea-users] Pull-requests and automated testing In-Reply-To: <20161202161150.bqple4xvpz365iws@jf_suse_laptop.fritz.box> References: <20161202120052.GB16044@rdias-suse-laptop> <20161202161150.bqple4xvpz365iws@jf_suse_laptop.fritz.box> Message-ID: <20161202163230.GC16044@rdias-suse-laptop> On 2016-12-02 17:11:50, Jan Fajerski wrote: > On Fri, Dec 02, 2016 at 12:00:52PM +0000, Ricardo Dias wrote: > > Of course all of this assumes that we have tests to run, and we should > > start planning how unit tests should be developed for DS, based on Salt > > unit tests. But for the start, I think we should have some way to pick > > the changes from the submitted PR, spawn some virtual machines and > > deploy a Ceph cluster, and respective services with DS+changes, and > > check that everything is working correctly. > I'm not sure if unit tests will be sufficient. salt is hard to test and to > my knowledge we can only test runners with unit tests. The real question is > how we will test the rest. As a first step, I will be happy to just deploy a Ceph cluster using the PR branch and check for HEALTH_OK. Also, we can check if each stage has succeeded. > > > > A side effect of having status checks activated for the PRs is that we > > protect the master branch from "force" pushes and from deletion, which > > is also nice. > > > > Now what do we need to actually put this plan to work? > > Here is a checklist (maybe incomplete): > > * public jenkins instance > > * computing resources for testing DS > > * Scripts to glue everything with github > Being on my way back from the QA/Testing/Automation meeting in Prag, I would > strongly suggest we look into integrating DeepSea testing into Teuthology. > This has various advantages and shouldn't be too hard. Ceph deploy is tested > in Teuthology too...so why not DS. > Imho the tests to run (besides unit tests for the runners) would simply > exercise DeepSeas functionality. Yes, testing DS in teuthology is an important goal, which I think it's still a bit far away from being reached. There are two ways of testing DS within teuthology in my opinion: a) swap ceph-ansible for DS to deploy the cluster as configured in each test suite; b) test DS as a single suite test. Either approaches will require significant work, specially a). But I think using teuthology to test DeepSea in the context of a GitHub PR's status check is not an adequate solution. We need something to be run quickly, no more than 30 minutes in my opinion. 
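To make that concrete, here is a deliberately naive sketch of such a
single-VM smoke check. It is a hypothetical script, assuming the usual
DeepSea stage orchestrations, an admin keyring on the master, and that
salt-run returns a non-zero exit code when an orchestration fails (which may
itself need verifying).

```python
#!/usr/bin/env python
# Naive smoke test: run the DeepSea stages in order, then require HEALTH_OK.
# Meant to run on the salt master of a freshly provisioned test cluster/VM.
import subprocess
import sys
import time

STAGES = ["ceph.stage.0", "ceph.stage.1", "ceph.stage.2",
          "ceph.stage.3", "ceph.stage.4"]


def run_stages():
    for stage in STAGES:
        print("running {}".format(stage))
        # check_call raises CalledProcessError if the orchestration exits non-zero
        subprocess.check_call(["salt-run", "state.orch", stage])


def wait_for_health_ok(timeout=600):
    deadline = time.time() + timeout
    while time.time() < deadline:
        health = subprocess.check_output(["ceph", "health"]).strip()
        if health.startswith(b"HEALTH_OK"):
            return True
        time.sleep(10)
    return False


if __name__ == "__main__":
    run_stages()
    sys.exit(0 if wait_for_health_ok() else 1)
```

Anything beyond that (removing nodes, multi-node failure cases, etc.) would
stay in teuthology.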
> > > > _______________________________________________ > > Deepsea-users mailing list > > Deepsea-users at lists.suse.com > > http://lists.suse.com/mailman/listinfo/deepsea-users > > -- > Jan Fajerski > Engineer Enterprise Storage > SUSE Linux GmbH > jfajerski at suse.com > _______________________________________________ > Deepsea-users mailing list > Deepsea-users at lists.suse.com > http://lists.suse.com/mailman/listinfo/deepsea-users From jfajerski at suse.com Fri Dec 2 09:54:54 2016 From: jfajerski at suse.com (Jan Fajerski) Date: Fri, 2 Dec 2016 17:54:54 +0100 Subject: [Deepsea-users] Pull-requests and automated testing In-Reply-To: <20161202163230.GC16044@rdias-suse-laptop> References: <20161202120052.GB16044@rdias-suse-laptop> <20161202161150.bqple4xvpz365iws@jf_suse_laptop.fritz.box> <20161202163230.GC16044@rdias-suse-laptop> Message-ID: <20161202165453.evqgl7kk7prdmotl@jf_suse_laptop.fritz.box> On Fri, Dec 02, 2016 at 04:32:30PM +0000, Ricardo Dias wrote: >On 2016-12-02 17:11:50, Jan Fajerski wrote: >> On Fri, Dec 02, 2016 at 12:00:52PM +0000, Ricardo Dias wrote: >> > Of course all of this assumes that we have tests to run, and we should >> > start planning how unit tests should be developed for DS, based on Salt >> > unit tests. But for the start, I think we should have some way to pick >> > the changes from the submitted PR, spawn some virtual machines and >> > deploy a Ceph cluster, and respective services with DS+changes, and >> > check that everything is working correctly. >> I'm not sure if unit tests will be sufficient. salt is hard to test and to >> my knowledge we can only test runners with unit tests. The real question is >> how we will test the rest. > >As a first step, I will be happy to just deploy a Ceph cluster using the PR >branch and check for HEALTH_OK. >Also, we can check if each stage has succeeded. Yeah same here. I'm also looking to somewhat automate this process with vagrant. > >> > >> > A side effect of having status checks activated for the PRs is that we >> > protect the master branch from "force" pushes and from deletion, which >> > is also nice. >> > >> > Now what do we need to actually put this plan to work? >> > Here is a checklist (maybe incomplete): >> > * public jenkins instance >> > * computing resources for testing DS >> > * Scripts to glue everything with github >> Being on my way back from the QA/Testing/Automation meeting in Prag, I would >> strongly suggest we look into integrating DeepSea testing into Teuthology. >> This has various advantages and shouldn't be too hard. Ceph deploy is tested >> in Teuthology too...so why not DS. >> Imho the tests to run (besides unit tests for the runners) would simply >> exercise DeepSeas functionality. > >Yes, testing DS in teuthology is an important goal, which I think it's >still a bit far away from being reached. There are two ways of testing >DS within teuthology in my opinion: a) swap ceph-ansible for DS to >deploy the cluster as configured in each test suite; b) test DS as a >single suite test. Either approaches will require significant work, >specially a). We probably want to have both really. From what I understood from Nathans explanations re. Teuthology, it doesn't actually use Teuthology for cluster deployment. There is a ceph-cm-ansible component that takes care of provisioning the nodes, but does not actually deploy it. Is ceph-ansible also used? > >But I think using teuthology to test DeepSea in the context of a GitHub >PR's status check is not an adequate solution. 
We need something to be >run quickly, no more than 30 minutes in my opinion. Would this be impossible with Teuthology? Can't we just run our DS specific test suite? If that gets so extensive, that it takes so long we can use Teuthologies approach of partitioning suites and run a partition at a time. > >> > >> > _______________________________________________ >> > Deepsea-users mailing list >> > Deepsea-users at lists.suse.com >> > http://lists.suse.com/mailman/listinfo/deepsea-users >> >> -- >> Jan Fajerski >> Engineer Enterprise Storage >> SUSE Linux GmbH >> jfajerski at suse.com >> _______________________________________________ >> Deepsea-users mailing list >> Deepsea-users at lists.suse.com >> http://lists.suse.com/mailman/listinfo/deepsea-users >_______________________________________________ >Deepsea-users mailing list >Deepsea-users at lists.suse.com >http://lists.suse.com/mailman/listinfo/deepsea-users From rdias at suse.com Fri Dec 2 10:04:24 2016 From: rdias at suse.com (Ricardo Dias) Date: Fri, 2 Dec 2016 17:04:24 +0000 Subject: [Deepsea-users] Pull-requests and automated testing In-Reply-To: <20161202165453.evqgl7kk7prdmotl@jf_suse_laptop.fritz.box> References: <20161202120052.GB16044@rdias-suse-laptop> <20161202161150.bqple4xvpz365iws@jf_suse_laptop.fritz.box> <20161202163230.GC16044@rdias-suse-laptop> <20161202165453.evqgl7kk7prdmotl@jf_suse_laptop.fritz.box> Message-ID: <20161202170424.GD16044@rdias-suse-laptop> On 2016-12-02 17:54:54, Jan Fajerski wrote: > On Fri, Dec 02, 2016 at 04:32:30PM +0000, Ricardo Dias wrote: > > On 2016-12-02 17:11:50, Jan Fajerski wrote: > > > On Fri, Dec 02, 2016 at 12:00:52PM +0000, Ricardo Dias wrote: > > > > Of course all of this assumes that we have tests to run, and we should > > > > start planning how unit tests should be developed for DS, based on Salt > > > > unit tests. But for the start, I think we should have some way to pick > > > > the changes from the submitted PR, spawn some virtual machines and > > > > deploy a Ceph cluster, and respective services with DS+changes, and > > > > check that everything is working correctly. > > > I'm not sure if unit tests will be sufficient. salt is hard to test and to > > > my knowledge we can only test runners with unit tests. The real question is > > > how we will test the rest. > > > > As a first step, I will be happy to just deploy a Ceph cluster using the PR > > branch and check for HEALTH_OK. > > Also, we can check if each stage has succeeded. > Yeah same here. I'm also looking to somewhat automate this process with vagrant. > > > > > > > > > > A side effect of having status checks activated for the PRs is that we > > > > protect the master branch from "force" pushes and from deletion, which > > > > is also nice. > > > > > > > > Now what do we need to actually put this plan to work? > > > > Here is a checklist (maybe incomplete): > > > > * public jenkins instance > > > > * computing resources for testing DS > > > > * Scripts to glue everything with github > > > Being on my way back from the QA/Testing/Automation meeting in Prag, I would > > > strongly suggest we look into integrating DeepSea testing into Teuthology. > > > This has various advantages and shouldn't be too hard. Ceph deploy is tested > > > in Teuthology too...so why not DS. > > > Imho the tests to run (besides unit tests for the runners) would simply > > > exercise DeepSeas functionality. > > > > Yes, testing DS in teuthology is an important goal, which I think it's > > still a bit far away from being reached. 
There are two ways of testing > > DS within teuthology in my opinion: a) swap ceph-ansible for DS to > > deploy the cluster as configured in each test suite; b) test DS as a > > single suite test. Either approaches will require significant work, > > specially a). > We probably want to have both really. From what I understood from Nathans > explanations re. Teuthology, it doesn't actually use Teuthology for cluster > deployment. There is a ceph-cm-ansible component that takes care of > provisioning the nodes, but does not actually deploy it. Is ceph-ansible > also used? Sorry, I was referring to ceph-cm-ansible and yes, you're right, ceph-cm-ansible just provisions. The deployment per se is done by teuthology itself if I'm not mistaken. > > > > But I think using teuthology to test DeepSea in the context of a GitHub > > PR's status check is not an adequate solution. We need something to be > > run quickly, no more than 30 minutes in my opinion. > Would this be impossible with Teuthology? Can't we just run our DS specific > test suite? If that gets so extensive, that it takes so long we can use > Teuthologies approach of partitioning suites and run a partition at a time. Teuthology is a heavyweight and resource hungry testing framework, not sure if we can run a simple DeepSea suite within a small time-frame, but without really run it I cannot have 100% sure. > > > > > > > > > > _______________________________________________ > > > > Deepsea-users mailing list > > > > Deepsea-users at lists.suse.com > > > > http://lists.suse.com/mailman/listinfo/deepsea-users > > > > > > -- > > > Jan Fajerski > > > Engineer Enterprise Storage > > > SUSE Linux GmbH > > > jfajerski at suse.com > > > _______________________________________________ > > > Deepsea-users mailing list > > > Deepsea-users at lists.suse.com > > > http://lists.suse.com/mailman/listinfo/deepsea-users > > _______________________________________________ > > Deepsea-users mailing list > > Deepsea-users at lists.suse.com > > http://lists.suse.com/mailman/listinfo/deepsea-users > _______________________________________________ > Deepsea-users mailing list > Deepsea-users at lists.suse.com > http://lists.suse.com/mailman/listinfo/deepsea-users From ncutler at suse.cz Fri Dec 2 15:54:19 2016 From: ncutler at suse.cz (Nathan Cutler) Date: Fri, 2 Dec 2016 23:54:19 +0100 Subject: [Deepsea-users] Teuthology is not for unit testing Message-ID: Sorry I was not subscribed until just now. Let me say I heartily agree with the general direction of Ricardo's proposal, i.e. that automated PR validation testing is desirable/needed. Jan makes a good point that any meaningful testing of DeepSea requires that a Ceph cluster be deployed. Right now, DeepSea expects that cluster to be on a certain minimum number of nodes? I know it's possible to run a Ceph cluster on a single node. Can DeepSea do that kind of deployment right now? If not, what kind of modifications would be needed? (The ability to deploy one- and two-node clusters would be useful for DeepSea to have, since many teuthology test cases are designed to run on such minimal configurations.) If the whole thing could run on a single VM, that (in my mind) would satisfy the definition of a unit test suite. When Jenkins is triggered on a PR, it would create a VM in OpenStack, git clone the branch/SHA1, run the test suite, get the result, and then destroy the VM. The Jenkins machine could also be a VM in OpenStack. Teuthology could be used as well, but not without significant modifications. 
In my mind, unit testing should be kept as simple and light-weight as possible. Teuthology does not fit into that mould. -- Nathan Cutler Software Engineer Distributed Storage SUSE LINUX, s.r.o. Tel.: +420 284 084 037 From ncutler at suse.cz Fri Dec 2 16:04:47 2016 From: ncutler at suse.cz (Nathan Cutler) Date: Sat, 3 Dec 2016 00:04:47 +0100 Subject: [Deepsea-users] Teuthology is for functional testing Message-ID: In the upstream Ceph testing model, you've got two types of tests: 1. make check tests (unit tests) - consume source code 2. teuthology tests (functional tests) - consume packages The make check tests all run on a single machine - if the environment is set up correctly you just type "make check" and patiently await the result. (There is also run-make-check.sh which attempts to automate even setting up the environment.) These "make check" tests are what is run (via Jenkins in OVH) on upstream Ceph PRs. Teuthology is a heavy-duty tool for functional testing of clusters, and it's designed to run entire suites of tests. For example, 250 tests running in parallel with each test using 1-5 VMs. For each of these 250 tests, a set of VMs is provisioned, the test is run, results gathered, and then the VMs are destroyed. Needless to say, even in a high-powered teuthology lab environment it can take hours for a suite run to complete, and failures due to "infrastructure noise" are inevitable since there are so many moving parts. Analyzing the failures requires special skills and is generally done by developers. So, we should think about how much testing we want to do for DeepSea itself and whether it would make sense to implement such a distinction between unit tests (which run on a single machine) and functional tests (which run on clusters). -- Nathan Cutler Software Engineer Distributed Storage SUSE LINUX, s.r.o. Tel.: +420 284 084 037 From tserong at suse.com Mon Dec 5 00:55:17 2016 From: tserong at suse.com (Tim Serong) Date: Mon, 5 Dec 2016 18:55:17 +1100 Subject: [Deepsea-users] Teuthology is not for unit testing In-Reply-To: References: Message-ID: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> On 12/03/2016 09:54 AM, Nathan Cutler wrote: > Sorry I was not subscribed until just now. > > Let me say I heartily agree with the general direction of Ricardo's > proposal, i.e. that automated PR validation testing is desirable/needed. > > Jan makes a good point that any meaningful testing of DeepSea requires > that a Ceph cluster be deployed. > > Right now, DeepSea expects that cluster to be on a certain minimum > number of nodes? DeepSea's validate runner (invoked at the start of stage 3/deploy) will fail if you've got less than three mon nodes and/or less than four storage nodes. AIUI this restriction was deliberately put in place to discourage building tiny clusters :) > I know it's possible to run a Ceph cluster on a single node. Can DeepSea > do that kind of deployment right now? If not, what kind of modifications > would be needed? It might just be getting rid of those node count checks (or adding an option to skip them, which could be enabled when running tests?) > (The ability to deploy one- and two-node clusters would be useful for > DeepSea to have, since many teuthology test cases are designed to run on > such minimal configurations.) > > If the whole thing could run on a single VM, that (in my mind) would > satisfy the definition of a unit test suite. 
If I'm right about the node count checks being the only thing to change, the minimum would probably be one VM, with one extra disk (or something that looks like a disk) to use for an OSD. > > When Jenkins is triggered on a PR, it would create a VM in OpenStack, > git clone the branch/SHA1, run the test suite, get the result, and then > destroy the VM. The Jenkins machine could also be a VM in OpenStack. > > Teuthology could be used as well, but not without significant > modifications. In my mind, unit testing should be kept as simple and > light-weight as possible. Teuthology does not fit into that mould. > -- Tim Serong Senior Clustering Engineer SUSE tserong at suse.com From lmb at suse.com Mon Dec 5 01:41:07 2016 From: lmb at suse.com (Lars Marowsky-Bree) Date: Mon, 5 Dec 2016 09:41:07 +0100 Subject: [Deepsea-users] Teuthology is not for unit testing In-Reply-To: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> References: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> Message-ID: <20161205084107.omprld3o62uqdfpt@suse.de> On 2016-12-05T18:55:17, Tim Serong wrote: > DeepSea's validate runner (invoked at the start of stage 3/deploy) will > fail if you've got less than three mon nodes and/or less than four > storage nodes. AIUI this restriction was deliberately put in place to > discourage building tiny clusters :) > > > I know it's possible to run a Ceph cluster on a single node. Can DeepSea > > do that kind of deployment right now? If not, what kind of modifications > > would be needed? > > It might just be getting rid of those node count checks (or adding an > option to skip them, which could be enabled when running tests?) Indeed, I think having an override option here would be beneficial. For production, sure. But in a lab, one might want to even stand up a single node Ceph instance. > If I'm right about the node count checks being the only thing to change, > the minimum would probably be one VM, with one extra disk (or something > that looks like a disk) to use for an OSD. That sounds good to me. Regards, Lars -- SUSE Linux GmbH, GF: Felix Imend?rffer, Jane Smithard, Graham Norton, HRB 21284 (AG N?rnberg) "Experience is the name everyone gives to their mistakes." -- Oscar Wilde From jfajerski at suse.com Mon Dec 5 02:08:11 2016 From: jfajerski at suse.com (Jan Fajerski) Date: Mon, 5 Dec 2016 10:08:11 +0100 Subject: [Deepsea-users] Teuthology is not for unit testing In-Reply-To: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> References: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> Message-ID: <20161205090810.47vbg4vzfpsn7low@jf_suse_laptop.fritz.box> On Mon, Dec 05, 2016 at 06:55:17PM +1100, Tim Serong wrote: >On 12/03/2016 09:54 AM, Nathan Cutler wrote: >> Sorry I was not subscribed until just now. >> >> Let me say I heartily agree with the general direction of Ricardo's >> proposal, i.e. that automated PR validation testing is desirable/needed. >> >> Jan makes a good point that any meaningful testing of DeepSea requires >> that a Ceph cluster be deployed. >> >> Right now, DeepSea expects that cluster to be on a certain minimum >> number of nodes? > >DeepSea's validate runner (invoked at the start of stage 3/deploy) will >fail if you've got less than three mon nodes and/or less than four >storage nodes. AIUI this restriction was deliberately put in place to >discourage building tiny clusters :) > >> I know it's possible to run a Ceph cluster on a single node. Can DeepSea >> do that kind of deployment right now? If not, what kind of modifications >> would be needed? 
> >It might just be getting rid of those node count checks (or adding an >option to skip them, which could be enabled when running tests?) That is exactly right. Should be easy enough to include an option. On the other hand it is already possible if one would call the respective salt states individually instead of the orchestration file. Either way, that is also the only 'blocker' I see. > >> (The ability to deploy one- and two-node clusters would be useful for >> DeepSea to have, since many teuthology test cases are designed to run on >> such minimal configurations.) >> >> If the whole thing could run on a single VM, that (in my mind) would >> satisfy the definition of a unit test suite. With a single VM we can run a basic test suite yes. Without having a particular example at the ready I think we might miss some features if we limit ourselves to one VM. Think everything that includes removing a node. > >If I'm right about the node count checks being the only thing to change, >the minimum would probably be one VM, with one extra disk (or something >that looks like a disk) to use for an OSD. > >> >> When Jenkins is triggered on a PR, it would create a VM in OpenStack, >> git clone the branch/SHA1, run the test suite, get the result, and then >> destroy the VM. The Jenkins machine could also be a VM in OpenStack. >> >> Teuthology could be used as well, but not without significant >> modifications. In my mind, unit testing should be kept as simple and >> light-weight as possible. Teuthology does not fit into that mould. >> > >-- >Tim Serong >Senior Clustering Engineer >SUSE >tserong at suse.com >_______________________________________________ >Deepsea-users mailing list >Deepsea-users at lists.suse.com >http://lists.suse.com/mailman/listinfo/deepsea-users From ncutler at suse.cz Mon Dec 5 02:35:08 2016 From: ncutler at suse.cz (Nathan Cutler) Date: Mon, 5 Dec 2016 10:35:08 +0100 Subject: [Deepsea-users] Teuthology is not for unit testing In-Reply-To: <20161205090810.47vbg4vzfpsn7low@jf_suse_laptop.fritz.box> References: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> <20161205090810.47vbg4vzfpsn7low@jf_suse_laptop.fritz.box> Message-ID: > With a single VM we can run a basic test suite yes. Without having a > particular example at the ready I think we might miss some features if > we limit ourselves to one VM. Think everything that includes removing a > node. A basic test suite that runs in a single VM and completes within a reasonably short time can be triggered by Jenkins on every PR. All tests that require multiple machines, or take a long time to run, would go into teuthology. That's the upstream Ceph model, anyway. Nathan From rdias at suse.com Mon Dec 5 02:49:07 2016 From: rdias at suse.com (Ricardo Dias) Date: Mon, 5 Dec 2016 09:49:07 +0000 Subject: [Deepsea-users] Teuthology is not for unit testing In-Reply-To: References: <7dd73d72-f711-9ad9-bdfa-03d384334af5@suse.com> <20161205090810.47vbg4vzfpsn7low@jf_suse_laptop.fritz.box> Message-ID: <20161205094907.GE16044@rdias-suse-laptop> On 2016-12-05 10:35:08, Nathan Cutler wrote: > > With a single VM we can run a basic test suite yes. Without having a > > particular example at the ready I think we might miss some features if > > we limit ourselves to one VM. Think everything that includes removing a > > node. > > A basic test suite that runs in a single VM and completes within a > reasonably short time can be triggered by Jenkins on every PR. 
>
> All tests that require multiple machines, or take a long time to run, would
> go into teuthology.
>
> That's the upstream Ceph model, anyway.
>

Yes, this is what I had in mind when I made the proposal.

It is the PR reviewer's responsibility to decide whether the PR requires more
rigorous/thorough testing and to schedule some runs of the teuthology suites;
otherwise the PR can be merged with only a simple test using a single VM.

Ricardo

> _______________________________________________
> Deepsea-users mailing list
> Deepsea-users at lists.suse.com
> http://lists.suse.com/mailman/listinfo/deepsea-users

From rdias at suse.com Mon Dec 12 08:12:50 2016
From: rdias at suse.com (Ricardo Dias)
Date: Mon, 12 Dec 2016 15:12:50 +0000
Subject: [Deepsea-users] stage 1: network discovery
Message-ID: <20161212151250.GB14338@rdias-suse-laptop>

Hi,

I've been running DeepSea (DS) in a virtual environment and I'm wondering
what the network requirements of DeepSea are to successfully deploy a
Ceph cluster.

In my current environment, every node has a single network interface
configured with a netmask of 255.255.255.255.

I've looked into how the discovery code in populate.py is identifying
the network "topology" of the cluster and I have a set of
questions/doubts:

- Does DS expect two networks (cluster and public) to exist in the
  cluster, or can it cope with the existence of only one network and
  use it as both the "cluster" and "public" network?
  -- Just found out in the code that if only one network exists it is used
  as both the "cluster" and "public" networks

- DS is identifying a network by using the netmask of the network
  interface. In my case, the netmask is 255.255.255.255 and DS is unable
  to configure the MONs' public address due to the way the IP addresses are
  verified to belong to the "chosen" public network.
  Maybe we should rethink the way we are currently discovering the
  existing networks of the cluster.
- Why is DS storing the "public_network" and "cluster_network" in
  "pillar/ceph/stack/default/ceph/cluster.yml"? What are their purposes?
  Do we really need them?

Thanks,
Ricardo Dias

From rdias at suse.com Mon Dec 12 08:50:15 2016
From: rdias at suse.com (Ricardo Dias)
Date: Mon, 12 Dec 2016 15:50:15 +0000
Subject: [Deepsea-users] stage 1: network discovery
In-Reply-To: <13863070.TY5ryHy21E@ruby>
References: <20161212151250.GB14338@rdias-suse-laptop> <13863070.TY5ryHy21E@ruby>
Message-ID: <20161212155015.GC14338@rdias-suse-laptop>

On 2016-12-12 10:28:10, Eric Jackson wrote:
> On Monday, December 12, 2016 03:12:50 PM Ricardo Dias wrote:
> > Hi,
> >
> > I've been running DeepSea (DS) in a virtual environment and I'm wondering
> > what the network requirements of DeepSea are to successfully deploy a
> > Ceph cluster.
> >
> > In my current environment, every node has a single network interface
> > configured with a netmask of 255.255.255.255.
>
> Are you using virtualbox where every node has the same address? I cannot
> think of any production network where all Ceph nodes would use host routes.

I'm using a cluster created in the OVH cloud provider. Every node has a
different IP address, but has a netmask of /32. The IP addresses are
public and the access rules are controlled by the security group of
the instances.

> >
> > I've looked into how the discovery code in populate.py is identifying
> > the network "topology" of the cluster and I have a set of
> > questions/doubts:
> >
> > - Does DS expect two networks (cluster and public) to exist in the
> > cluster, or can it cope with the existence of only one network and
> > use it as both the "cluster" and "public" network?
> > -- Just found out in the code that if only one network exists it is used
> > as both the "cluster" and "public" networks
>
> Right. The simplest network is normally a single one used for management,
> public and cluster traffic. I personally use three separate networks since I
> believe many production users would be similar. I am aware some have multiple
> management networks.
>
> >
> > - DS is identifying a network by using the netmask of the network
> > interface. In my case, the netmask is 255.255.255.255 and DS is unable
> > to configure the MONs' public address due to the way the IP addresses are
> > verified to belong to the "chosen" public network.
> > Maybe we should rethink the way we are currently discovering the
> > existing networks of the cluster.
>
> I never considered a netmask of /32. I would like to understand your
> topology. If this is easy enough to detect (i.e. no network seems to exist
> and all are /32), what is the correct answer?

I think we can assume that all networks (in this case there is a network
for each node) are public. And by reading the "Network Configuration"
section [1] in the Ceph documentation, we can specify several public
networks. This means we just need to extend the code a bit to deal with
a list of public networks instead of only one.

The rule for network detection should be:

Given a set of networks NTS:
1) if each network in NTS contains only one node, then we consider all
   networks public
2) otherwise we fall back to the current logic, which is:
   pick the network with the most nodes as the public network, and the
   second with the most nodes as the cluster network.

> >
> > - Why is DS storing the "public_network" and "cluster_network" in
> > "pillar/ceph/stack/default/ceph/cluster.yml"? What are their purposes?
> > Do we really need them?
> > Take a look at /srv/salt/ceph/configuration/files/ceph.conf.j2. The public and > cluster networks are specified there. As I said in the comment above, we can add a list of public networks in the "public_network" config option. [1] http://docs.ceph.com/docs/jewel/rados/configuration/network-config-ref/ From rdias at suse.com Mon Dec 12 09:54:58 2016 From: rdias at suse.com (Ricardo Dias) Date: Mon, 12 Dec 2016 16:54:58 +0000 Subject: [Deepsea-users] stage 1: network discovery In-Reply-To: <1875668.bV8R9UkyE1@ruby> References: <20161212151250.GB14338@rdias-suse-laptop> <13863070.TY5ryHy21E@ruby> <20161212155015.GC14338@rdias-suse-laptop> <1875668.bV8R9UkyE1@ruby> Message-ID: <20161212165458.GD14338@rdias-suse-laptop> On 2016-12-12 11:26:13, Eric Jackson wrote: > On Monday, December 12, 2016 03:50:15 PM Ricardo Dias wrote: > > On 2016-12-12 10:28:10, Eric Jackson wrote: > > > > I'm using a cluster created in OVH cloud provider. Every node has a > > different IP address, but have a netmask of /32. The IP addresses are > > public and the access rules are controlled by the security group of > > the instances. > > I haven't tried OVH. > > > > > > I think we can assume that all networks (in this case there is a network > > for each node) are public. And by reading the "Network Configuration" > > Maybe in this instance, but if I have 4 network interfaces on a host, why > would I assume that all are public? In the comments of the code, I wrote that > this is a best guess but it seems to be reasonable so far. > > > section [1] in Ceph documentation, we can specify several public > > networks. This means we just need to extend a bit the code to deal with > > a list of public networks instead of only one. > > > > The rule for network detection should be: > > Given a set of networks NTS: > > 1) if each network in NTS only contains one node, then we consider all > > networks public > > 2) otherwise we fallback to the current logic, which is: > > pick the network with more nodes as the public network, and the > > second with more nodes as the cluster network. > > I think I would prefer reversing the behavior. That is, if nothing can be > deduced, then check if all nodes have a single network and apply the list to > both the cluster and public networks. I agree. > > If I have several hosts with two networks, but a few of the hosts are > experiencing an issue with one network, I would prefer the current behavior. > Stage 1 would write the configuration and the validation would fail that some > nodes do not seem to have a cluster network. (I think it would be hard for > Salt to work without a public network, but I'm sure there's a way.) > > If I follow the above, the entire cluster will be deployed and the admin will > learn that all nodes are configured to use a single network. While there is a > few ways out of that scenario, I do not want to put them into it initially. I don't understand what you said above. Is this a justification to reverse the order that I proposed? If it is, I agree :) > > > > > > > - Why is DS storing the "public_network" and "cluster_network" in > > > > > > > > "pillar/ceph/stack/default/ceph/cluster.yml"? What are their purposes? > > > > Do we really need them? > > > > > > Take a look at /srv/salt/ceph/configuration/files/ceph.conf.j2. The > > > public and cluster networks are specified there. > > > > As I said in the comment above, we can add a list of public networks in > > the "public_network" config option. > > You asked why the variables are defined in the pillar. 
So, that this Jinja > template can use them is my answer. I understood what you meant. My comment was just a followup of my comment above. > > Converting the public_network/cluster_network to lists is not difficult, but > thinking about the upgrade to SES5 needs to be thought about too. I don't understand what converting public_network to lists has to do with upgrading to SES5. Can you clarify? > > > > > > [1] http://docs.ceph.com/docs/jewel/rados/configuration/network-config-ref/ From lmb at suse.com Wed Dec 14 03:19:13 2016 From: lmb at suse.com (Lars Marowsky-Bree) Date: Wed, 14 Dec 2016 11:19:13 +0100 Subject: [Deepsea-users] Fwd: v11.1.0 kraken candidate released In-Reply-To: <4349026.ftKBgFnQMR@ruby> References: <4349026.ftKBgFnQMR@ruby> Message-ID: <20161214101913.ebcpmmqmxgil6h6a@suse.de> On 2016-12-13T09:28:31, Eric Jackson wrote: > * The new *BlueStore* backend now has a stable disk format and is > passing our failure and stress testing. Although the backend is > still flagged as experimental, we encourage users to try it out > for non-production clusters and non-critical data sets. > > I think we should create an alternate default now (e.g. default-bluestore.sls) > for the OSDs. However, by the time this is tested/verified, maybe BlueStore > should be the default prior to Luminous. Yes, we should make the OSD deployment more easily customizable; I'd personally think that this configuration should be in the pillars somewhere and expose the various options. > * The list of monitor hosts/addresses for building the monmap can now be > obtained from DNS SRV records. The service name used in when querying the > DNS > is defined in the "mon_dns_srv_name" config option, which defaults to > "ceph-mon". > > Sites that wish to remove the hardcoded monitor names/addresses from ceph.conf > will like this. I think the ceph/configuration needs an alternate default > (default-dns-srv.sls or something). Yes. And then we may want to consider how to push the IPs into the DNS server, but that may be out of our scope if we make it really easy to "list all MON public IPs" so they can push this. > * If you have manually specified the monitor user rocksdb via the > ``mon keyvaluedb = rocksdb`` option, you will need to manually add a > file to the mon data directory to preserve this option:: > > echo rocksdb > /var/lib/ceph/mon/ceph-`hostname`/kv_backend > > New monitors will now use rocksdb by default, but if that file is > not present, existing monitors will use leveldb. The ``mon > keyvaluedb`` option now only affects the backend chosen when a > monitor is created. > > What do others think of rocksdb? Would you want this as an alternate default? > Should this be the default? I think it should be the default so we can test it better. BlueStore also uses this in the back. Should be another configuration option. > * The jerasure and shec plugins can now detect SIMD instruction at > runtime and no longer need to be explicitly configured for different > processors. The following plugins are now deprecated: > jerasure_generic, jerasure_sse3, jerasure_sse4, jerasure_neon, > shec_generic, shec_sse3, shec_sse4, and shec_neon. If you use any of > these plugins directly you will see a warning in the mon log file. > Please switch to using just 'jerasure' or 'shec'. > > I've never known what the "right" answer is to erasure coded pools. Do you > leave it to interested parties to go "read the documentation" or is this > something that is worthy of a working example for both jerasure and shec? 
I think this is something that users need to configure, and deciding how the pools should be configured is more of a topic for openATTIC's UI in our stack, I believe. -- SUSE Linux GmbH, GF: Felix Imend?rffer, Jane Smithard, Graham Norton, HRB 21284 (AG N?rnberg) "Experience is the name everyone gives to their mistakes." -- Oscar Wilde From jfajerski at suse.com Wed Dec 14 07:58:21 2016 From: jfajerski at suse.com (Jan Fajerski) Date: Wed, 14 Dec 2016 15:58:21 +0100 Subject: [Deepsea-users] ganesha direction In-Reply-To: <30256021.vFgTp4qRGA@ruby> References: <30256021.vFgTp4qRGA@ruby> Message-ID: <20161214145821.274mnmqlh5d77g3n@jf_suse_laptop.fritz.box> On Wed, Dec 14, 2016 at 09:06:50AM -0500, Eric Jackson wrote: >Hello all, > Most of the ganesha configuration seems doable for both the rgw and cephfs >backend. Supriti and I are just going through the motions on a feature branch >https://github.com/SUSE/DeepSea/tree/wip_ganesha. It's not working yet, so >sit tight. Excellent..nice job! > > The reason for this message is that the CephFS backend only supports the >admin keyring at this moment. While there is the possibility of configuring >the CephFS backend to use a different ceph.conf (e.g. a second >/etc/ceph/ceph.conf), that feels convoluted. +1 This seems unnecessarily hacky. > > In general, DeepSea does not require the admin keyring anywhere but on a >master. That is still the goal. However, ganesha is a new feature so I think >we can relax this constraint for the time being until the CephFS backend is >configurable or has its own default. This way, others can try out ganesha >sooner than later. Agreed. Considering that ganesha is a tech preview for ses4 I think this is fine. For a proper release we should probably change the ceph FSAL in ganesha to accept an option specifying a keyring or user plus secret. > > The standard "we'll add validations, documentation, smoke signals, etc." is >part of that, but simply not done at this time. > >Eric > > >_______________________________________________ >Deepsea-users mailing list >Deepsea-users at lists.suse.com >http://lists.suse.com/mailman/listinfo/deepsea-users From lmb at suse.com Wed Dec 21 01:57:22 2016 From: lmb at suse.com (Lars Marowsky-Bree) Date: Wed, 21 Dec 2016 09:57:22 +0100 Subject: [Deepsea-users] Fwd: Detecting disks by their properties rather than by their path In-Reply-To: <1ab0549b-fe84-e004-cc11-722bf82b9626@suse.com> References: <543124068.5941150.1481233886382.JavaMail.zimbra@redhat.com> <1ab0549b-fe84-e004-cc11-722bf82b9626@suse.com> Message-ID: <20161221085722.r7qz6r7kh5ezfsn2@suse.de> On 2016-12-21T09:46:48, Lenz Grimmer wrote: > Hi all, > > stumbled over this idea in the ceph-ansible mailing list and found it > quite interesting. Is there some merit in being able to select disks > based on this kind information? How does DeepSea handle this? Being able to have OSD generation consume/select disks based on attribute filters has been a long-running discourse for DeepSea, but didn't make it into the release yet. I'm hopeful we can match this. Regards, Lars -- SUSE Linux GmbH, GF: Felix Imend?rffer, Jane Smithard, Graham Norton, HRB 21284 (AG N?rnberg) "Experience is the name everyone gives to their mistakes." 
-- Oscar Wilde From jfajerski at suse.com Wed Dec 21 02:09:05 2016 From: jfajerski at suse.com (Jan Fajerski) Date: Wed, 21 Dec 2016 10:09:05 +0100 Subject: [Deepsea-users] Fwd: Detecting disks by their properties rather than by their path In-Reply-To: <1ab0549b-fe84-e004-cc11-722bf82b9626@suse.com> References: <543124068.5941150.1481233886382.JavaMail.zimbra@redhat.com> <1ab0549b-fe84-e004-cc11-722bf82b9626@suse.com> Message-ID: <20161221090905.wncytozeq34ovdwi@jf_suse_laptop> On Wed, Dec 21, 2016 at 09:46:48AM +0100, Lenz Grimmer wrote: >Hi all, > >stumbled over this idea in the ceph-ansible mailing list and found it >quite interesting. Is there some merit in being able to select disks >based on this kind information? How does DeepSea handle this? > >Lenz Yeah I think this is definitely a feature we want. As to how high it is on the priority list I don't know. To a degree though its less of an issue in DeepSea since this (admittedly) long list of disks is generated. So while you still need to look at a long and weird looking list (the by-id paths are used) if you want to change the proposal, you don't need to write it. > > >-------- Forwarded Message -------- >Subject: [Ceph-ansible] Detecting disks by their properties rather than >by their path >Date: Thu, 8 Dec 2016 16:51:26 -0500 (EST) >From: Erwan Velu >To: Ceph-ansible at lists.ceph.com, lgrimmer at suse.de > >Hi list, > >This is my first contribution to ceph-ansible and wanted to share my >idea with you and get preliminary feedbacks on it. >First time I saw ceph-ansible, I was surprised on the way disks are >defined by naming disks per node with their path like "/dev/sdx" > >To my low-level background this is a big issue as : >- if you have 20-30 disks per node that make a serious list to maintain >- doesn't garantee what device you select : does /dev/sdx is a usb key, >an SSD or a rotational drive ? >- name can change over time : inserting a new disk or rebooting can lead >to a different name > > >My first approach was about using /dev/disk/by-id/ paths but that tigger >the following issues : >- still need a serious list of devices, which is even longer ... >- name integrate serial number or hexa strings making difficult to >maintain it, every device is different > >I eneded up with the following idea: >- Why should'nt we choose the disks by their features and let a module >find the associated path. > >The syntax looks like : >"{'samsung_journal_disks': {'model': 'SAMSUNG MZ7LN512', 'rotational': >'0', 'count': 2, 'ceph_type': 'journal' }}" > >With this syntax, you can select disk by the model, vendor, size, SSD or >not. >Then you can specify to get more of that type by specifing the count' >number : having 10 disks of the same type doesn't make the definition longer >It is then possible to define what disks are "journals" or "data" by >defining the ceph_type attribute. > >If that definition match the actual system, the module returns the >associated /dev/disk/by-id/ path like : >samsung_journal_disks_000 : >/dev/disk/by-id/scsi-36848f690e68a50001e428e511e4a6c20 >samsung_journal_disks_000 : >/dev/disk/by-id/scsi-36848f690e68a50001e428e521e55c62b > >The real benefit of that is the disks path become the result of a search >and not a starting point. The disk path have very few value then. > >Ceph-ansible will be able to select what disks should be used for what : >this part is under work. 
>
>I wrote some documentation about it; you can have an overview here:
>https://gist.github.com/ErwanAliasr1/b19c62fac061f4f924f587b1973cf0ea
>
>All this work can be found in https://github.com/ceph/ceph-ansible/pull/1128
>
>I didn't detail everything in this first mail to avoid being too verbose.
>
>I'd love to get your feedback on this idea:
>- does it solve issues you already had?
>- could it be useful for you?
>
>Cheers,
>Erwan
>_______________________________________________
>Ceph-ansible mailing list
>Ceph-ansible at lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
>
>
>_______________________________________________
>Deepsea-users mailing list
>Deepsea-users at lists.suse.com
>http://lists.suse.com/mailman/listinfo/deepsea-users

--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH
jfajerski at suse.com

From dbyte at suse.com  Wed Dec 21 07:41:19 2016
From: dbyte at suse.com (David Byte)
Date: Wed, 21 Dec 2016 14:41:19 +0000
Subject: [Deepsea-users] Fwd: Detecting disks by their properties rather than by their path
In-Reply-To: <3974773.GTqgBxxajX@ruby>
References: <543124068.5941150.1481233886382.JavaMail.zimbra@redhat.com> <1ab0549b-fe84-e004-cc11-722bf82b9626@suse.com>, <3974773.GTqgBxxajX@ruby>
Message-ID: 

I wrote a script about 6 months ago that uses lsblk output to compare the configuration of storage nodes. This includes device manufacturer, model, rotational speed and maybe a few other attributes. This is enough on most systems to get you pretty close. I've been doing some additional work on a script that also picks up the connection type and speed (12G SAS, for example).

With all of this data, it would be easy to group the disks by type and, if the ratios are within bounds, to say "use group B as journals for group A. Group C should be in its own bucket", etc.

David Byte
Sr. Technical Strategist
IHV Alliances and Embedded
SUSE

Sent from my iPhone. Typos are Apple's fault.

> On Dec 21, 2016, at 6:31 AM, Eric Jackson wrote:
>
> Hi Lenz,
>    As Lars mentioned, this has been an ongoing conversation. Currently, DeepSea is using the long list, but at least it's generated by Salt. At some point, the list does need to exist on the minion during the creation of OSDs. Now, that could be a Salt module instead of the Salt pillar, effectively generating the list dynamically.
>
>    I only have a couple of concerns with no solution at this time. How do I trivially know what models I have? I most likely cannot say simply Samsung or Intel, but will need to collect the alphanumeric model numbers of every drive on all nodes. Ideally, I'd like to believe that it's only a few at a given time in the life of a Ceph cluster, but over the course of a couple of years, I could see that list becoming rather lengthy as well with replacements and upgrades.
>    Another concern is debugging. While a hardcoded list of devices in the Salt pillar is not elegant, there are no moving parts. If a device in the list is not an OSD, the debugging question is currently just "how did ceph-disk fail?". Using a Salt module that accepts various filters to return a list may not be seen as an improvement over a static file when trying to determine why a drive is not an OSD.
>    The last issue is one I haven't had enough time to think about entirely, but I wonder if temporary failure conditions will create any unintended side effects. Currently, the Salt pillar is authoritative.
>    If the list becomes ephemeral and changes between runs (not due to actual intended changes, but more like flaky hardware), does that make any other operations more difficult?
>
>    What would you think of a filter-based solution that generates the list as a static file for each storage node? This would mimic the current behavior of keyrings in DeepSea. It's a two-step process. The keyrings are generated and stored on the Salt master. The second step includes adding the keyring to Ceph and distributing it to the correct roles. That keeps debugging simple and avoids regenerating keyrings unnecessarily. I think this addresses the second and third issues above. Additionally, the static file may provide a history via version control. That could prove useful for sysadmin teams that want to know the previously detected hardware (e.g. did somebody change the filter for this node, or did they change the hardware?).
>
>    These lists would still feed into the generation of hardware profiles, unless the goal is not only to include a filter but to have the admin provide a desired profile. Maybe only adding the filter would be a sufficient first step? I believe the default filter would be all available devices, as it is now.
>
> Eric
>
>
>
>> On Wednesday, December 21, 2016 09:46:48 AM Lenz Grimmer wrote:
>> Hi all,
>>
>> stumbled over this idea in the ceph-ansible mailing list and found it quite interesting. Is there some merit in being able to select disks based on this kind of information? How does DeepSea handle this?
>>
>> Lenz
>>
>>
>> -------- Forwarded Message --------
>> Subject: [Ceph-ansible] Detecting disks by their properties rather than by their path
>> Date: Thu, 8 Dec 2016 16:51:26 -0500 (EST)
>> From: Erwan Velu
>> To: Ceph-ansible at lists.ceph.com, lgrimmer at suse.de
>>
>> Hi list,
>>
>> This is my first contribution to ceph-ansible and I wanted to share my idea with you and get preliminary feedback on it.
>> The first time I saw ceph-ansible, I was surprised by the way disks are defined: naming disks per node with their path like "/dev/sdx".
>>
>> To my low-level background this is a big issue, as:
>> - if you have 20-30 disks per node, that makes for a serious list to maintain
>> - it doesn't guarantee what device you select: is /dev/sdx a USB key, an SSD or a rotational drive?
>> - names can change over time: inserting a new disk or rebooting can lead to a different name
>>
>>
>> My first approach was to use /dev/disk/by-id/ paths, but that triggers the following issues:
>> - you still need a serious list of devices, which is even longer ...
>> - names integrate serial numbers or hex strings, making them difficult to maintain; every device is different
>>
>> I ended up with the following idea:
>> - Why shouldn't we choose the disks by their features and let a module find the associated path?
>>
>> The syntax looks like:
>> "{'samsung_journal_disks': {'model': 'SAMSUNG MZ7LN512', 'rotational': '0', 'count': 2, 'ceph_type': 'journal' }}"
>>
>> With this syntax, you can select disks by model, vendor, size, SSD or not.
>> Then you can specify to get more of that type by specifying the 'count' number: having 10 disks of the same type doesn't make the definition longer.
>> It is then possible to define which disks are "journals" or "data" by defining the ceph_type attribute.
>>
>> If that definition matches the actual system, the module returns the associated /dev/disk/by-id/ paths like:
>> samsung_journal_disks_000 : /dev/disk/by-id/scsi-36848f690e68a50001e428e511e4a6c20
>> samsung_journal_disks_000 : /dev/disk/by-id/scsi-36848f690e68a50001e428e521e55c62b
>>
>> The real benefit of that is that the disk path becomes the result of a search and not a starting point. The disk path has very little value then.
>>
>> Ceph-ansible will be able to select which disks should be used for what: this part is under work.
>>
>> I wrote some documentation about it; you can have an overview here:
>> https://gist.github.com/ErwanAliasr1/b19c62fac061f4f924f587b1973cf0ea
>>
>> All this work can be found in https://github.com/ceph/ceph-ansible/pull/1128
>>
>> I didn't detail everything in this first mail to avoid being too verbose.
>>
>> I'd love to get your feedback on this idea:
>> - does it solve issues you already had?
>> - could it be useful for you?
>>
>> Cheers,
>> Erwan
>> _______________________________________________
>> Ceph-ansible mailing list
>> Ceph-ansible at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
> _______________________________________________
> Deepsea-users mailing list
> Deepsea-users at lists.suse.com
> http://lists.suse.com/mailman/listinfo/deepsea-users
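(Illustration only, not DeepSea's implementation: a rough sketch combining the lsblk-based grouping described by David with Eric's idea of rendering a filter into a static, version-controllable file per storage node. The file layout, key names and lsblk columns are assumptions; PyYAML is assumed to be available.)

"""Sketch: group a node's disks by model/media type and write a static profile."""
import json
import subprocess
from collections import defaultdict

import yaml  # PyYAML


def disk_groups():
    """Group whole disks by (model, media type) using lsblk JSON output."""
    out = subprocess.check_output(
        ['lsblk', '-J', '-d', '-p', '-o', 'NAME,MODEL,ROTA,TRAN,SIZE'])
    groups = defaultdict(list)
    for dev in json.loads(out.decode())['blockdevices']:
        model = (dev.get('model') or 'unknown').strip()
        # ROTA=1/true means rotational media (HDD); anything else is treated as SSD.
        media = 'hdd' if str(dev.get('rota')).lower() in ('1', 'true') else 'ssd'
        groups[(model, media)].append(dev['name'])
    return groups


def write_profile(node, directory='.'):
    """Write a reviewable per-node disk profile as YAML (layout is made up)."""
    profile = {
        node: [
            {'model': model, 'media': media, 'devices': sorted(devs)}
            for (model, media), devs in sorted(disk_groups().items())
        ]
    }
    path = '{}/{}.yml'.format(directory, node)
    with open(path, 'w') as handle:
        yaml.safe_dump(profile, handle, default_flow_style=False)
    return path


# e.g. write_profile('data1.ceph') produces a file that can be kept in version
# control, so later runs show whether the filter or the hardware changed.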