Discussion: Latency for the Public Network
Tobias Kropf
2018-02-05 21:04:00 UTC
Hi ceph list,

we have a hyperconverged Ceph cluster with KVM on 8 nodes, running Ceph
Hammer 0.94.10. The cluster is now 3 years old and we are planning a new
cluster for a high-IOPS project. We use replicated pools (size 3,
min_size 2) and don't have the best latency on our switch backend.


ping -s 8192 10.10.10.40

8200 bytes from 10.10.10.40: icmp_seq=1 ttl=64 time=0.153 ms


We plan to split the hyperconverged setup into storage and compute nodes
and want to separate the Ceph cluster and public networks: the cluster
network on 40 Gbit Mellanox switches and the public network on the
existing 10 Gbit switches.

Now my question... is 0.153 ms - 0.170 ms fast enough for the public
network? We have to deploy a setup with 1,500 - 2,000 terminal servers...


Does anyone have experience with running a lot of terminal servers on a
Ceph backend?


Thanks for any replies...
--
Tobias Kropf



Christian Balzer
2018-02-06 03:03:13 UTC
Hello,
Post by Tobias Kropf
Hi ceph list,
we have a hyperconverged Ceph cluster with KVM on 8 nodes, running Ceph
Hammer 0.94.10.
Do I smell Proxmox?
Post by Tobias Kropf
The cluster is now 3 years old and we are planning a new
cluster for a high-IOPS project. We use replicated pools (size 3,
min_size 2) and don't have the best latency on our switch backend.
ping -s 8192 10.10.10.40
8200 bytes from 10.10.10.40: icmp_seq=1 ttl=64 time=0.153 ms
Not particularly great, indeed.
However, your network latency is only one factor; the Ceph OSDs add quite
another layer on top and usually affect IOPS even more.
For high IOPS you of course need fast storage, network AND CPUs.
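
To put a number on that OSD layer, one quick check is single-threaded
small writes with rados bench; the pool name and runtime below are
placeholders, a rough sketch rather than a tuned benchmark:

rados bench -p rbd 30 write -b 4096 -t 1

With -t 1 every 4 KB write has to complete before the next one starts, so
the reported average latency is the full client -> primary OSD -> replica
round trip, i.e. what a single guest actually feels on top of the 0.15 ms
network RTT.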
Post by Tobias Kropf
We plan to split the hyperconverged setup into storage and compute nodes
and want to separate the Ceph cluster and public networks: the cluster
network on 40 Gbit Mellanox switches and the public network on the
existing 10 Gbit switches.
You'd do a lot better if you were to go all 40Gb/s and forget about
splitting networks.

The faster replication network will:
a) be underutilized all of the time in terms of bandwidth
b) not help with read IOPS at all
c) still be hobbled by the public network latency when it comes to write
IOPS (but of course help in regards to replication latency).
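
In ceph.conf terms, "not splitting" simply means both roles on one subnet;
the addresses here are illustrative:

[global]
    public network  = 10.10.10.0/24
    cluster network = 10.10.10.0/24

or just leave "cluster network" out entirely, in which case replication
traffic uses the public network by default.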
Post by Tobias Kropf
Now my question... is 0.153 ms - 0.170 ms fast enough for the public
network? We have to deploy a setup with 1,500 - 2,000 terminal servers...
Define terminal server: are we talking Windows Virtual Desktops with RDP?
Windows is quite the hog when it comes to I/O.

Regards,

Christian
--
Christian Balzer Network/Systems Engineer
***@gol.com Rakuten Communications
Tobias Kropf
2018-02-06 08:21:22 UTC
Post by Christian Balzer
Hello,
Post by Tobias Kropf
Hi ceph list,
we have a hyperconverged Ceph cluster with KVM on 8 nodes, running Ceph
Hammer 0.94.10.
Do I smell Proxmox?
Yes, we currently use Proxmox.
Post by Christian Balzer
Post by Tobias Kropf
The cluster is now 3 years old and we are planning a new
cluster for a high-IOPS project. We use replicated pools (size 3,
min_size 2) and don't have the best latency on our switch backend.
ping -s 8192 10.10.10.40
8200 bytes from 10.10.10.40: icmp_seq=1 ttl=64 time=0.153 ms
Not particularly great, indeed.
However, your network latency is only one factor; the Ceph OSDs add quite
another layer on top and usually affect IOPS even more.
For high IOPS you of course need fast storage, network AND CPUs.
Yes, we know that... the network is our first job. We are planning new
hardware for the MON and OSD services, with a lot of NVMe flash disks and
high-GHz CPUs.
Post by Christian Balzer
Post by Tobias Kropf
We plan to split the hyperconverged setup into storage and compute nodes
and want to separate the Ceph cluster and public networks: the cluster
network on 40 Gbit Mellanox switches and the public network on the
existing 10 Gbit switches.
You'd do a lot better if you were to go all 40Gb/s and forget about
splitting networks.
Use the public and cluster networks over the same NICs and the same subnet?
Post by Christian Balzer
a) be underutilized all of the time in terms of bandwidth
b) not help with read IOPS at all
c) still be hobbled by the public network latency when it comes to write
IOPS (but of course help in regards to replication latency).
Post by Tobias Kropf
Now my question... is 0.153 ms - 0.170 ms fast enough for the public
network? We have to deploy a setup with 1,500 - 2,000 terminal servers...
Define terminal server: are we talking Windows Virtual Desktops with RDP?
Windows is quite the hog when it comes to I/O.
Yes, we are talking about Windows virtual desktops with RDP...
Our calculation is: 1x DC = 60-80 IOPS, 1x TS = 60-80 IOPS, plus N users
* 10 IOPS each...
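
Spelled out against the 1,500 - 2,000 terminal servers from the first mail
(base load only, before the per-user share):

2,000 TS x 60-80 IOPS = 120,000 - 160,000 IOPS
plus N users x 10 IOPS on top of that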

For this system we want to work with cache tiering: an NVMe cache tier in
front of an EC pool on SATA disks. Is it a good idea to use cache tiering
in this setup?
Post by Christian Balzer
Regards,
Christian
--
Tobias Kropf



Christian Balzer
2018-02-06 09:02:04 UTC
Hello,
Post by Tobias Kropf
Post by Christian Balzer
Hello,
Post by Tobias Kropf
Hi ceph list,
we have a hyperconverged Ceph cluster with KVM on 8 nodes, running Ceph
Hammer 0.94.10.
Do I smell Proxmox?
Yes, we currently use Proxmox.
Post by Christian Balzer
Post by Tobias Kropf
The cluster is now 3 years old and we are planning a new
cluster for a high-IOPS project. We use replicated pools (size 3,
min_size 2) and don't have the best latency on our switch backend.
ping -s 8192 10.10.10.40
8200 bytes from 10.10.10.40: icmp_seq=1 ttl=64 time=0.153 ms
Not particularly great, indeed.
However, your network latency is only one factor; the Ceph OSDs add quite
another layer on top and usually affect IOPS even more.
For high IOPS you of course need fast storage, network AND CPUs.
Yes, we know that... the network is our first job. We are planning new
hardware for the MON and OSD services, with a lot of NVMe flash disks and
high-GHz CPUs.
Post by Christian Balzer
Post by Tobias Kropf
We plan to split the hyperconverged setup into storage and compute nodes
and want to separate the Ceph cluster and public networks: the cluster
network on 40 Gbit Mellanox switches and the public network on the
existing 10 Gbit switches.
You'd do a lot better if you were to go all 40Gb/s and forget about
splitting networks.
Use the public and cluster networks over the same NICs and the same subnet?
Yes, at least for NICs.
If for some reason your compute nodes have no dedicated links/NICs for the
Ceph cluster and it makes you feel warm and fuzzy, you can segregate
traffic with VLANs.
But in most cases that really comes down to "security theater": if a
compute node gets compromised, they have access to your Ceph cluster
network anyway.

When looking at the ML archives you'll find a number of people suggesting
to keep things simple unless there's a concrete need to do otherwise.
Post by Tobias Kropf
Post by Christian Balzer
a) be underutilized all of the time in terms of bandwidth
b) not help with read IOPS at all
c) still be hobbled by the public network latency when it comes to write
IOPS (but of course help in regards to replication latency).
Post by Tobias Kropf
Now my question... is 0.153 ms - 0.170 ms fast enough for the public
network? We have to deploy a setup with 1,500 - 2,000 terminal servers...
Define terminal server: are we talking Windows Virtual Desktops with RDP?
Windows is quite the hog when it comes to I/O.
Yes, we are talking about Windows virtual desktops with RDP...
Our calculation is: 1x DC = 60-80 IOPS, 1x TS = 60-80 IOPS, plus N users
* 10 IOPS each...
For this system we want to work with cache tiering: an NVMe cache tier in
front of an EC pool on SATA disks. Is it a good idea to use cache tiering
in this setup?
Depends on the size of your cache-tier, really.
I have done no analysis of Windows I/O behavior other than it being
insanely swap-happy without need, so if you can, eliminate the pagefile.
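
(For what it's worth, one way to do that on plain Windows guests is via
the classic WMIC tooling; treat this as a sketch and test it on a single
VM first:

wmic computersystem where name="%computername%" set AutomaticManagedPagefile=False
wmic pagefileset where name="C:\\pagefile.sys" delete

followed by a reboot. Some workloads genuinely need a pagefile, so
evaluate rather than blanket-apply.)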

If all your typical writes can be satisfied from the cache-tier, good.
Reads (like OS boot, etc) should be fine from the EC pool, so cache-tier
in read-forward mode.
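
For reference, the basic wiring looks roughly like this; the pool names
are placeholders and the exact mode names/confirmation flags vary by
release, so check the docs for your version:

ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool readforward
ceph osd tier set-overlay ecpool cachepool
ceph osd pool set cachepool hit_set_type bloom
ceph osd pool set cachepool target_max_bytes 1000000000000

Getting target_max_bytes (and the dirty/full ratios) right is exactly the
sizing question above.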

But you _really_ need to test this; an ill-fitting cache-tier can be worse
than no cache at all.

Christian
Post by Tobias Kropf
Post by Christian Balzer
Regards,
Christian
--
Christian Balzer Network/Systems Engineer
***@gol.com Rakuten Communications