[caasp-beta] BTRFS space and quota]

Le Bihan Stéphane (AMUNDI-ITS) stephane.lebihan at amundi.com
Fri Nov 3 02:48:44 MDT 2017


Hello Ludovic,

I can provide the output now, although we managed to recover the free space yesterday, and I think we have found the cause.

To recover the free space, we stopped etcd.service, removed all files in /var/lib/etcd, and restarted etcd.service.
# systemctl stop etcd
# rm -rf /etc/sysconfig/etcd/member
# systemctl start etcd
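
The three steps above can be wrapped in a small dry-run helper. This is only a sketch: the function names `run` and `reset_etcd_member` and the `DRY_RUN` variable are made up for illustration. The text above mentions /var/lib/etcd while the command shows /etc/sysconfig/etcd/member, so the sketch takes the data directory as an argument instead of guessing it.

```shell
#!/bin/sh
# Sketch only: dry-run by default, prints each command instead of running it.
# Set DRY_RUN=0 to really execute (this discards the local etcd member data).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: etcd data directory (assumption: the caller supplies the right path).
reset_etcd_member() {
    data_dir="${1:?usage: reset_etcd_member <etcd-data-dir>}"
    run systemctl stop etcd
    run rm -rf "${data_dir}/member"
    run systemctl start etcd
}

# Example (dry run, prints the three commands):
#   reset_etcd_member /var/lib/etcd
```

Keeping the dry run as the default makes it possible to check the exact path that would be removed before anything is deleted.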

# du -csh /*
4.6M    /bin
44M     /boot
0       /cloud-init-config
8.0K    /dev
12M     /etc
0       /home
318M    /lib
14M     /lib64
0       /mnt
0       /opt
du: cannot access '/proc/24205/task/24205/fd/4': No such file or directory
du: cannot access '/proc/24205/task/24205/fdinfo/4': No such file or directory
du: cannot access '/proc/24205/fd/4': No such file or directory
du: cannot access '/proc/24205/fdinfo/4': No such file or directory
0       /proc
3.4M    /root
218M    /run
5.7M    /sbin
0       /selinux
0       /srv
0       /sys
48K     /tmp
1.8G    /usr
5.4G    /var
7.8G    total

# btrfs fi usage /
Overall:
    Device size:                  30.00GiB
    Device allocated:              5.02GiB
    Device unallocated:           24.99GiB
    Device missing:                  0.00B
    Used:                          2.55GiB
    Free (estimated):             25.50GiB      (min: 13.00GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,single: Size:3.00GiB, Used:2.49GiB
   /dev/vda6       3.00GiB

Metadata,DUP: Size:1.00GiB, Used:32.59MiB
   /dev/vda6       2.00GiB

System,DUP: Size:9.50MiB, Used:16.00KiB
   /dev/vda6      19.00MiB

Unallocated:
   /dev/vda6      24.99GiB


etcd seems OK, but flannel is down.

After investigating, I discovered that we cannot ping any other server (inside or outside CaaSP) from the master or the workers.
I connected to the admin node and the result is the same.

So I looked through the history and found that my team patched the hypervisor OS on 22 October.
My architecture is based on KVM on a single physical server running SLES12 SP2, and I think that since the hypervisor was upgraded to SLES12 SP3, the KVM virtio card no longer works correctly…

# cat /etc/hosts
#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#
# IP-Address  Full-Qualified-Hostname  Short-Hostname
#

127.0.0.1       localhost

# special IPv6 addresses
::1             localhost ipv6-localhost ipv6-loopback

fe00::0         ipv6-localnet

ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts

#-- start Salt-CaaSP managed hosts - DO NOT MODIFY --
### service names ###
127.0.0.1 api api.infra.caasp.local dev-kubm01.unix.sits.credit-agricole.fr

### admin nodes ###
10.198.47.219 admin admin.infra.caasp.local

### kubernetes masters ###
10.198.47.220 f74967034d3743f1b843d227df61c7ad f74967034d3743f1b843d227df61c7ad.infra.caasp.local

### kubernetes workers ###
10.198.47.224 82c1065b62f84a508a9e1ffeb45a5cf2 82c1065b62f84a508a9e1ffeb45a5cf2.infra.caasp.local
10.198.47.223 afbe67218e5b4807a16e84997de79c6f afbe67218e5b4807a16e84997de79c6f.infra.caasp.local
10.198.47.221 12b79838fd734263830ffeb74dbb35bb 12b79838fd734263830ffeb74dbb35bb.infra.caasp.local
10.198.47.222 d246e0d7ff5b49c0996ea10c7bb8ca43 d246e0d7ff5b49c0996ea10c7bb8ca43.infra.caasp.local
#-- end Salt-CaaSP managed hosts --

# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:49:ee:13 brd ff:ff:ff:ff:ff:ff
    inet 10.198.47.220/24 brd 10.198.47.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe49:ee13/64 scope link
       valid_lft forever preferred_lft forever

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.198.47.253   0.0.0.0         UG    0      0        0 eth0
10.198.47.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0

# ping 10.198.47.219
PING 10.198.47.219 (10.198.47.219) 56(84) bytes of data.
^C
--- 10.198.47.219 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 3999ms

# ping 10.198.47.221
PING 10.198.47.221 (10.198.47.221) 56(84) bytes of data.
^C
--- 10.198.47.221 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2014ms

# ping 10.198.47.253
PING 10.198.47.253 (10.198.47.253) 56(84) bytes of data.
^C
--- 10.198.47.253 ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4002ms
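
The manual pings above could be repeated for every node in the Salt-managed part of /etc/hosts with something like the following. This is a rough sketch, not a CaaSP tool: the function names `managed_ips` and `check_hosts` are hypothetical, and it assumes the iputils `ping` options `-c` and `-W`.

```shell
#!/bin/sh
# Print the IPv4 addresses listed between the Salt-CaaSP markers of a hosts
# file, skipping the loopback entry used for the "api" service name.
managed_ips() {
    sed -n '/start Salt-CaaSP managed hosts/,/end Salt-CaaSP managed hosts/p' "$1" |
        awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $1 != "127.0.0.1" { print $1 }'
}

# Send three pings (2 s timeout each) to every managed address and report
# whether any reply came back.
check_hosts() {
    for ip in $(managed_ips "${1:-/etc/hosts}"); do
        if ping -c 3 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip UNREACHABLE"
        fi
    done
}

# Example: check_hosts /etc/hosts
```

The default gateway (10.198.47.253 here) is not in the managed block, so it still has to be pinged separately.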

Regards,

Stéphane Le Bihan

SDE/DSI/IPR/SSD/UNX

90, Boulevard Pasteur - 75015 Paris

Web: http://www.amundi.com

Tel: +33 1 76 32 32 08
Unix team: +33 1 76 32 02 30

@: stephane.lebihan at amundi.com
@: sits.unix at amundi.com



From: Ludovic Cavajani [mailto:ludovic.cavajani at suse.com]
Sent: Thursday, 2 November 2017 16:47
To: Paul Gonin; caasp-beta at lists.suse.com; Le Bihan Stéphane (AMUNDI-ITS)
Subject: Re: [caasp-beta] BTRFS space and quota]


Hello Stéphane,

Can you provide us with the output of:

# du -csh /*
Regards,
On 11/02/2017 11:54 AM, Paul Gonin wrote:
-------- Forwarded message --------

Date: Thu, 2 Nov 2017 10:35:13 +0000
Subject: Re: [caasp-beta] BTRFS space and quota
To: Paul Gonin <paul.gonin at suse.com>, caasp-beta at lists.suse.com
From: Le Bihan Stéphane (AMUNDI-ITS) <stephane.lebihan at amundi.com>
Hi Paul,

Here is the output of snapper ls.

# snapper ls
Type   | # | Pre # | Date                            | User | Cleanup | Description           | Userdata
-------+---+-------+---------------------------------+------+---------+-----------------------+---------
single | 0 |       |                                 | root |         | current               |
single | 1 |       | Fri 06 Oct 2017 08:47:14 AM UTC | root |         | first root filesystem |

I deleted the quota on /var/lib/etcd and tried a balance, but that did not help.
I recreated the quota and ran a rescan, and the value is the same as before the deletion.

For information, I ran du -sh on / and the result is 7.8 GB.

# du -sh /
du: cannot access '/proc/7982/task/7982/fd/4': No such file or directory
du: cannot access '/proc/7982/task/7982/fdinfo/4': No such file or directory
du: cannot access '/proc/7982/fd/3': No such file or directory
du: cannot access '/proc/7982/fdinfo/3': No such file or directory
7.8G    /

Regards,


Stéphane Le Bihan



From: Paul Gonin [mailto:paul.gonin at suse.com]
Sent: Thursday, 2 November 2017 10:55
To: Le Bihan Stéphane (AMUNDI-ITS); caasp-beta at lists.suse.com
Subject: Re: [caasp-beta] BTRFS space and quota

Hi Stephane,

What is the output of
# snapper ls
?

I assume that, since there have been no updates yet, it should look like

Type   | # | Pre # | Date                     | User | Cleanup | Description           | Userdata
-------+---+-------+--------------------------+------+---------+-----------------------+--------------
single | 0 |       |                          | root |         | current               |
single | 1 |       | Tue Oct 31 09:07:13 2017 | root |         | first root filesystem |
single | 2 |       | Tue Oct 31 09:10:42 2017 | root | number  | after installation    | important=yes

rgds
Paul

On Tuesday, 31 October 2017 at 13:38 +0000, Le Bihan Stéphane (AMUNDI-ITS) wrote:
Hi Paul,

We are running CaaSP2.

Regards,

Stéphane Le Bihan



From: Paul Gonin [mailto:paul.gonin at suse.com]
Sent: Tuesday, 31 October 2017 14:34
To: Le Bihan Stéphane (AMUNDI-ITS); caasp-beta at lists.suse.com
Subject: Re: [caasp-beta] BTRFS space and quota

Hi Stéphane,

Not that it should make a difference for the issue described, but what version of CaaSP is the cluster running?
Is it CaaSP2? RC1?

thanks
Paul

On Tuesday, 31 October 2017 at 08:35 +0000, Le Bihan Stéphane (AMUNDI-ITS) wrote:
Hello,

We have a strange case on our CaaSP platform with btrfs quotas.

For background, I was out of the office for three weeks while other colleagues tested the Kubernetes platform.
When I returned, they came to me because the filesystem is full on the master and worker nodes.
I do not know the cause, but I think that, because of a bad configuration, the /var/lib/etcd subvolume grew and then shrank again after the configuration was corrected, yet the quota still holds all the space.

When I check the btrfs usage, the filesystem really is full, but a balance has no effect.
After some searching, I saw that quota is enabled and that the /var/lib/etcd subvolume holds about 90% of the space, but I have not succeeded in releasing it.

Can you help me release this disk space?


· On master:

# btrfs filesystem usage /
Overall:
    Device size:                  30.00GiB
    Device allocated:             29.99GiB
    Device unallocated:           17.00MiB
    Device missing:                  0.00B
    Used:                         27.56GiB
    Free (estimated):            504.93MiB      (min: 496.43MiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,single: Size:27.97GiB, Used:27.49GiB
   /dev/vda6      27.97GiB

Metadata,DUP: Size:1.00GiB, Used:32.64MiB
   /dev/vda6       2.00GiB

System,DUP: Size:9.50MiB, Used:16.00KiB
   /dev/vda6      19.00MiB

Unallocated:
   /dev/vda6      17.00MiB

# btrfs fi df /
Data, single: total=27.97GiB, used=27.50GiB
System, DUP: total=9.50MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=32.66MiB
GlobalReserve, single: total=16.00MiB, used=0.00B

# btrfs fi show /
Label: none  uuid: 1b0614eb-fc59-4841-bbc5-5318087f6432
        Total devices 1 FS bytes used 27.53GiB
        devid    1 size 30.00GiB used 29.99GiB path /dev/vda6

# btrfs subvolume list /
ID 257 gen 40 top level 5 path @
ID 258 gen 194820 top level 257 path @/.snapshots
ID 259 gen 197128 top level 258 path @/.snapshots/1/snapshot
ID 260 gen 194810 top level 257 path @/boot/grub2/i386-pc
ID 261 gen 194810 top level 257 path @/boot/grub2/x86_64-efi
ID 262 gen 194810 top level 257 path @/cloud-init-config
ID 263 gen 194810 top level 257 path @/home
ID 264 gen 197081 top level 257 path @/root
ID 265 gen 197111 top level 257 path @/tmp
ID 266 gen 194809 top level 257 path @/var/cache
ID 267 gen 194809 top level 257 path @/var/crash
ID 268 gen 195783 top level 257 path @/var/lib/ca-certificates
ID 269 gen 195783 top level 257 path @/var/lib/cloud
ID 270 gen 24 top level 257 path @/var/lib/docker
ID 271 gen 194810 top level 257 path @/var/lib/dockershim
ID 272 gen 195719 top level 257 path @/var/lib/etcd
ID 273 gen 194810 top level 257 path @/var/lib/kubelet
ID 274 gen 194810 top level 257 path @/var/lib/machines
ID 275 gen 196430 top level 257 path @/var/lib/misc
ID 276 gen 194810 top level 257 path @/var/lib/mysql
ID 277 gen 194810 top level 257 path @/var/lib/nfs
ID 278 gen 194810 top level 257 path @/var/lib/ntp
ID 279 gen 196428 top level 257 path @/var/lib/overlay
ID 280 gen 194810 top level 257 path @/var/lib/rollback
ID 281 gen 196427 top level 257 path @/var/lib/systemd
ID 282 gen 194810 top level 257 path @/var/lib/vmware
ID 283 gen 194810 top level 257 path @/var/lib/wicked
ID 284 gen 197128 top level 257 path @/var/log
ID 285 gen 197111 top level 257 path @/var/spool
ID 286 gen 196428 top level 257 path @/var/tmp

# btrfs qgroup show -pcreFf /var/lib/etcd
qgroupid         rfer         excl     max_rfer     max_excl parent  child
--------         ----         ----     --------     -------- ------  -----
0/272        25.14GiB     25.14GiB         none         none ---     ---

# du -sh /var/lib/etcd/
417M    /var/lib/etcd/



· On one worker:


# btrfs fi usage /
Overall:
    Device size:                  30.00GiB
    Device allocated:             30.00GiB
    Device unallocated:            1.00MiB
    Device missing:                  0.00B
    Used:                         27.94GiB
    Free (estimated):            135.28MiB      (min: 135.28MiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)

Data,single: Size:27.99GiB, Used:27.86GiB
   /dev/vda6      27.99GiB

Metadata,DUP: Size:1.00GiB, Used:43.44MiB
   /dev/vda6       2.00GiB

System,DUP: Size:8.00MiB, Used:16.00KiB
   /dev/vda6      16.00MiB

Unallocated:
   /dev/vda6       1.00MiB


# btrfs fi df /
Data, single: total=27.99GiB, used=27.86GiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=43.44MiB
GlobalReserve, single: total=16.00MiB, used=0.00B

# btrfs fi show /
Label: none  uuid: 1d7b76f8-f91c-47e0-8be2-a3f02f90ac96
        Total devices 1 FS bytes used 27.90GiB
        devid    1 size 30.00GiB used 30.00GiB path /dev/vda6

# btrfs qgroup show -pcreFf /var/lib/etcd
qgroupid         rfer         excl     max_rfer     max_excl parent  child
--------         ----         ----     --------     -------- ------  -----
0/272        20.99GiB     20.99GiB         none         none ---     ---

# du -sh /var/lib/etcd/
452M    /var/lib/etcd/
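
On both nodes the qgroup reports tens of times more referenced data for /var/lib/etcd than du does (25.14GiB against 417M on the master, 20.99GiB against 452M on the worker). A small helper like the one below could put a number on that gap. It is only a sketch: the function names `qgroup_rfer_bytes` and `usage_ratio` are invented for illustration, and the usage lines assume the `--raw` option of `btrfs qgroup show` and GNU `du --block-size=1`.

```shell
#!/bin/sh
# Read "btrfs qgroup show" output on stdin and print the rfer column of the
# first level-0 qgroup row (header and separator rows are skipped).
qgroup_rfer_bytes() {
    awk '$1 ~ /^0\// { print $2; exit }'
}

# Print how many times larger the qgroup figure ($1) is than the du figure ($2).
usage_ratio() {
    awk -v q="$1" -v d="$2" 'BEGIN {
        if (d > 0) printf "%.1f\n", q / d
        else print "n/a"
    }'
}

# Example usage on a node (both values in bytes):
#   q=$(btrfs qgroup show --raw -reF /var/lib/etcd | qgroup_rfer_bytes)
#   d=$(du -s --block-size=1 /var/lib/etcd | cut -f1)
#   usage_ratio "$q" "$d"
```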


Regards,

Stéphane Le Bihan




_______________________________________________
caasp-beta mailing list
caasp-beta at lists.suse.com
http://lists.suse.com/mailman/listinfo/caasp-beta
