[caasp-beta] [Marketing Mail] Re: How to deploy CaaSP on EC2?

Le Bihan Stéphane (AMUNDI-ITS) stephane.lebihan at amundi.com
Wed May 30 23:29:37 MDT 2018


Hi all,

I asked for the bug to be updated with this information, but I did not use "reply all".

So it seemed OK on Friday and Saturday, but on Monday there were new pods in "Terminating" state.
We observed nodes switching to NotReady for a few moments as they hosted more pods in "Terminating" state.

We have searched for more information, and we think it is a Docker problem: Docker does not terminate containers correctly.

The only way we have found to clear "Terminating" containers is:
for pod2delete in $(kubectl get po -n <namespace> | grep Terminating | awk '{ print $1 }'); do 
    kubectl delete po/$pod2delete -n <namespace> --force --grace-period=0
done
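
A slightly more defensive variant of the loop above, as a sketch only. It matches the STATUS column exactly instead of grepping the whole line, so a pod whose name happens to contain "Terminating" is not selected. The function names terminating_pods and force_delete_terminating are hypothetical helpers, not part of kubectl, and the sketch assumes the default `kubectl get po` table layout (NAME READY STATUS RESTARTS AGE).

```shell
#!/bin/sh
# Print the names of pods whose STATUS column (3rd field) is exactly "Terminating".
# Reads the table printed by `kubectl get po -n <namespace>` on stdin.
terminating_pods() {
    awk '$3 == "Terminating" { print $1 }'
}

# Force-delete every Terminating pod in the namespace given as $1.
# Hypothetical wrapper around the same kubectl commands used in the loop above.
force_delete_terminating() {
    ns="$1"
    kubectl get po -n "$ns" | terminating_pods | while read -r pod; do
        kubectl delete "po/$pod" -n "$ns" --force --grace-period=0
    done
}
```

Usage would be `force_delete_terminating <namespace>`. As with the original loop, this only removes the pod objects from the API server; it does not guarantee that the underlying containers have stopped on the node.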

And sometimes, because a node is "NotReady" more often, or all nodes are "NotReady", we restart docker and kubelet.

For information, we restarted docker and kubelet at 17:00 yesterday, and this morning I have 286 pods in "Terminating" state.
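
To see how those pods are spread across nodes, a count per node can help. This is a sketch under assumptions: the helper name terminating_per_node is hypothetical, and it assumes the `-o wide` table has STATUS as field 4 and NODE as field 8, which depends on the kubectl version in use.

```shell
#!/bin/sh
# Count pods in "Terminating" state per node.
# Reads the table printed by `kubectl get po --all-namespaces -o wide` on stdin.
# Assumption: STATUS is field 4 and NODE is field 8 (layout varies by kubectl version).
terminating_per_node() {
    awk '$4 == "Terminating" { count[$8]++ }
         END { for (node in count) print node, count[node] }' | sort
}
```

It would be used as `kubectl get po --all-namespaces -o wide | terminating_per_node`, printing one "node count" pair per line.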

I have attached the status of the nodes and pods from this morning.

Regards,

Stéphane Le Bihan
SDE/DSI/IPR/SSD/UNX
90, Boulevard Pasteur - 75015 Paris
Web: www.amundi.com
Tel: +33 1 76 32 32 08
Tel Unix Team: +33 1 76 32 02 30
@: stephane.lebihan at amundi.com
@: sits.unix at amundi.com


-----Original Message-----
From: Valentin Rothberg [mailto:vrothberg at suse.com]
Sent: Wednesday, May 30, 2018 18:03
To: Le Bihan Stéphane (AMUNDI-ITS)
Cc: vrothberg at suse.de; 'Junichi Yoshise'; 'caasp-beta at lists.suse.com'; Liam Proven
Subject: Re: [caasp-beta] [Marketing Mail] Re: How to deploy CaaSP on EC2?

Hi Stéphane,

On May/30/18 13:51, Le Bihan Stéphane (AMUNDI-ITS)  wrote:
> Hi all,
> 
> Is there any more news on this bug?
> I am also not authorized to see the bug:
> https://bugzilla.suse.com/show_bug.cgi?id=1094680

I am a bit confused.  Didn't you write earlier this week that issue doesn't occur anymore?  Maybe, I misread your email.  Can you clarify?

> For information, we have found this:
> 
> So I've found related PR: 
> https://github.com/opencontainers/runc/pull/1477
> (possibly)
> This is something in logs just before it all goes crazy:
> level=error msg="Handler for POST /v1.31/containers/0ebea7ea2522e14a1fec46ee8991ea2195b76898b5f778dc2f9e57dd36f52efe/start returned error: oci runtime error: container_linux.go:258: creating new parent process caused \"container_linux.go:1545: running lstat on namespace path \\"/proc/49701/ns/ipc\\" caused \\"lstat /proc/49701/ns/ipc: no such file or directory\\"\"
> The problem is that the version of runc with the PR merged is still
> only 1.0 RC5. I can't quite find out whether Docker is using the same
> code or if they import some RC code.

Thanks a lot for checking.  We will look into the issue as soon as possible.

Kind regards,
 Valentin

> Stéphane Le Bihan
> SDE/DSI/IPR/SSD/UNX
> 90, Boulevard Pasteur - 75015 Paris
> Web: www.amundi.com
> Tel: +33 1 76 32 32 08
> Tel Unix Team: +33 1 76 32 02 30
> @: stephane.lebihan at amundi.com
> @: sits.unix at amundi.com
> 
> 
> -----Original Message-----
> From: Le Bihan Stéphane (AMUNDI-ITS)
> Sent: Monday, May 28, 2018 11:20
> To: Valentin Rothberg
> Cc: vrothberg at suse.de; Junichi Yoshise; caasp-beta at lists.suse.com; Liam Proven
> Subject: RE: [caasp-beta] [Marketing Mail] Re: How to deploy CaaSP on EC2?
> 
> Hi all.
> 
> After successfully upgrading all nodes with the SUSE CaaSP 3 Beta 2 DVD last Friday, we have no Docker containers in "Terminating" state this Monday.
> We will leave the sleeper running this week.
> 
> ---
> apiVersion: batch/v1beta1
> kind: CronJob
> metadata:
>   name: sleeper
> spec:
>   schedule: "*/1 * * * *"
>   concurrencyPolicy: Replace
>   successfulJobsHistoryLimit: 3
>   failedJobsHistoryLimit: 3
>   jobTemplate:
>     spec:
>       template:
>         spec:
>           containers:
>             - name: sleeper
>               image: registry-docker.intramundi.com/library/ubuntu:18.04
>               args:
>                 - "/bin/sleep"
>                 - "5"
>           restartPolicy: OnFailure
>           imagePullSecrets:
>             - name: regcred
> 
> Here's the source code for sleep: 
> https://github.com/coreutils/coreutils/blob/master/src/sleep.c
> 
> The state of the cluster this morning is in the "state_2018-05-28_11-15.txt" file.
> I have also attached the supportconfig of the same node.
> 
> Regards,
>  
> Stéphane Le Bihan
> Systems and Production Engineer
> 90, Boulevard Pasteur – 75015 Paris
> Web: www.amundi.com
> Tel: +33 1 76 32 32 08
> Unix Team: +33 1 76 32 02 30
> @ : stephane.lebihan at amundi.com
> @ : sits.unix at amundi.com
> 
> -----Original Message-----
> From: Valentin Rothberg [mailto:vrothberg at suse.com]
> Sent: Friday, May 25, 2018 13:45
> To: Le Bihan Stéphane (AMUNDI-ITS)
> Cc: vrothberg at suse.de; Junichi Yoshise; caasp-beta at lists.suse.com; Liam Proven
> Subject: Re: [caasp-beta] [Marketing Mail] Re: How to deploy CaaSP on EC2?
> 
> On May/25/18 11:29, Le Bihan Stéphane (AMUNDI-ITS)  wrote:
> > Yes, you may. :D
> > 
> > It's on this page, in the Bugzilla section:
> > https://www.suse.com/betaprogram/caasp-beta/#documentation
> 
> Thanks a lot, and sorry for the inconvenience.  The link will be fixed.
> 
> > " Please refer to our complete Bugzilla guideline for more information and before creating your first bug report: how to use Bugzilla"
> 
> I would appreciate if you could open the bug in Bugzilla, so we can move communication there.  You can also upload the logs there directly.  What you need to do is to login, so you may create an account if not present.
> You can go on bugzilla.suse.com, click on login and follow the steps from there.
> 
> Thanks a lot, and kind regards,
>  Valentin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: state_no-po_20180531.txt
URL: <http://lists.suse.com/pipermail/caasp-beta/attachments/20180531/d2630b7c/attachment.txt>

