[ceph-users] Full L3 Ceph
Lazuardi Nasution
2018-11-22 21:03:25 UTC
Hi,

I'm looking for an example Ceph configuration and topology for a full layer 3
networking deployment. Maybe all daemons could use a loopback alias address in
this case. But how should the cluster network and public network be configured:
with a supernet? I think using a loopback alias address would keep the daemons
from going down when a physical interface is disconnected, and would let
traffic be load balanced across the physical interfaces with ECMP instead of
interface bonding.

Best regards,
Robin H. Johnson
2018-11-23 00:29:17 UTC
I can say I've done something similar**, but I no longer have access to that
environment or to most*** of the configuration.

One part I do recall was explicitly setting cluster_network and
public_network to empty strings, AND using public_addr + cluster_addr
instead, with routable addressing on dummy interfaces (NOT loopback).
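
For illustration only, here is a minimal Python sketch of what that recollection might look like when rendered as a ceph.conf fragment plus the iproute2 commands for a dummy interface. The option names (public_network, cluster_network, public_addr, cluster_addr) come from the description above; everything else is an assumption: the helper names, the per-daemon section layout, the /128 and /32 host prefixes, the interface name dummy0, and the 2001:db8:: documentation addresses. It is not the original configuration, and how a given Ceph release parses an empty value should be checked before relying on it.

```python
import ipaddress


def dummy_interface_commands(ifname, addrs):
    """Return iproute2 commands that create a dummy interface holding addrs.

    Hypothetical helper: host-length prefixes (/128 or /32) are an assumption,
    chosen because the addresses are meant to be announced as host routes.
    """
    cmds = [f"ip link add {ifname} type dummy"]
    for addr in addrs:
        ip = ipaddress.ip_address(addr)  # raises ValueError on a bad address
        prefix = 128 if ip.version == 6 else 32
        cmds.append(f"ip addr add {ip}/{prefix} dev {ifname}")
    cmds.append(f"ip link set {ifname} up")
    return cmds


def render_ceph_conf(daemon_addrs):
    """Render a ceph.conf fragment pinning each daemon to explicit addresses.

    daemon_addrs maps a section name such as "osd.0" to a
    (public_addr, cluster_addr) pair. The networks are left empty on purpose,
    as described in the post; per-daemon sections are an assumption.
    """
    lines = ["[global]", "public_network =", "cluster_network ="]
    for section, (public_addr, cluster_addr) in sorted(daemon_addrs.items()):
        public = ipaddress.ip_address(public_addr)
        cluster = ipaddress.ip_address(cluster_addr)
        lines += [
            "",
            f"[{section}]",
            f"public_addr = {public}",
            f"cluster_addr = {cluster}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses from the IPv6 documentation prefix, not a real cluster.
    addrs = {"osd.0": ("2001:db8:0:1::10", "2001:db8:0:2::10")}
    for cmd in dummy_interface_commands("dummy0", addrs["osd.0"]):
        print(cmd)
    print(render_ceph_conf(addrs))
```

The addresses on the dummy interface would then be announced by the routing daemon on the host, so they stay reachable as long as at least one uplink is up.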

**:For values of similar:
- 99.9% IPv6 environment
- BGP everywhere
- The only IPv4 was on the outside of HAProxy for legacy IPv4 clients.
- Quanta switchgear running Cumulus Linux, 10Gbit ports
- Hosts running Cumulus quagga fork (REQUIRED)
- Host to 2xToR using IPv6 link-local addressing only:
  https://blog.ipspace.net/2015/02/bgp-configuration-made-simple-with.html
- Reliable ~19Gbit aggregate (2x10Gbit)
- watch out for NIC overheating: no warning, just thermal throttle down
  to ~2.5Gbit/port.

***:Some parts of the configuration ARE public:
https://github.com/dreamhost/ceph-chef/tree/dokken
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : ***@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
Stefan Kooman
2018-11-25 20:17:34 UTC
You can do this with MP-BGP (VXLAN) EVPN. We are running it like that:
IPv6 overlay network only, with ECMP to make use of all the links. We don't
use a separate cluster network. That only complicates things, and
there's no real use for it (trademark by Wido den Hollander). If you
want to use BGP on the hosts themselves, have a look at this post by
Vincent Bernat (great writeups of complex networking stuff) [1]. You can
use "MC-LAG" on the host to get redundant connectivity, or use "Type 4"
EVPN to get endpoint redundancy (Ethernet Segment Route). FRR 6.0 has
support for most of this (not yet "Type 4" EVPN support IIRC) [2].

We use a network namespace to separate (IPv6) management traffic
from production traffic. This complicates Ceph deployment a lot, but in
the end it's worth it.

Gr. Stefan

[1]: https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn
[2]: https://frrouting.org/
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl