I am also concerned about this problem. My question is how many threads a qemu-system-x86 process will have.
From what I tested, it can be anywhere between 100 and 800; it may be related to the number of OSDs. But
having many threads seems to affect performance: in my tests, 4k randwrite dropped from 15k to 4k.
That's really unacceptable!
1. nine OSD storage servers with two Intel DC 3500 SSDs each
2. hammer 0.94.3
3. QEMU emulator version 2.1.2 (Debian 1:2.1+dfsg-12+deb8u4~bpo70+1)
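For anyone who wants to reproduce the thread count above, here is a minimal sketch, assuming a Linux host where each thread of a process shows up as one entry under /proc/<pid>/task. The function name count_threads and the proc_root parameter are my own, for illustration only; this is not a Ceph or QEMU tool.

```python
import os


def count_threads(pid, proc_root="/proc"):
    """Return the number of threads of a process on Linux.

    Each thread of the process appears as one directory entry
    under <proc_root>/<pid>/task.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return len(os.listdir(task_dir))
```

Calling count_threads(pid) with the PID of the qemu process, once while the guest is idle and once during the 4k randwrite run, should show whether the thread count moves with the load.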
From: Jan Schermer
Date: 2015-10-27 05:48
To: Rick Balsano
Subject: Re: [ceph-users] Understanding the number of TCP connections between clients and OSDs
If we're talking about RBD clients (qemu) then the number also grows with the number of volumes attached to the client. With a single volume it was <1000. It grows when there's heavy IO happening in the guest.
I had to bump up the file open limits to several thousand (8000 was it?) to accommodate a client with 10 volumes in our cluster. We just scaled the number of OSDs down so hopefully I could have a graph of that.
But I just guesstimated what it could become, and that's not necessarily what the theoretical limit is. Very bad things happen when you reach that threshold. It could also depend on the guest settings (like queue depth), and how much it seeks over the drive (how many different PGs it hits), but knowing the upper bound is most critical.
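To see how close a running client is to that threshold, something like the following could be used. It is only a sketch, assuming Linux: it counts the socket descriptors listed under /proc/<pid>/fd and parses the "Max open files" line of /proc/<pid>/limits. The function names are hypothetical, the count includes UNIX sockets as well as TCP ones, and reading another user's process usually requires root.

```python
import os


def count_socket_fds(pid, proc_root="/proc"):
    """Count file descriptors of a process that refer to sockets (Linux)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        # Socket descriptors are symlinks of the form "socket:[inode]".
        if target.startswith("socket:"):
            count += 1
    return count


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits.

    A value of None stands for "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")
```

Comparing count_socket_fds(pid) with the soft limit during heavy guest IO gives an idea of the remaining headroom.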
On 26 Oct 2015, at 21:32, Rick Balsano <***@opower.com> wrote:
We've run into issues with the number of open TCP connections from a single client to the OSDs in our Ceph cluster.
We can increase the open file limit to work around this (and have done so), but we're looking to understand what determines the number of open connections maintained between a client and a particular OSD. Our naive assumption was 1 open TCP connection per OSD or per port made available by the Ceph node. There are many more than this, presumably to allow parallel connections, because we see 1-4 connections from each client per open port on a Ceph node.
Here is some background on our cluster:
* still running Firefly 0.80.8
* 414 OSDs, 35 nodes, one massive pool
* clients are KVM processes, accessing Ceph RBD images using virtio
* total number of open TCP connections from one client to all nodes is between 500 and 1000
Is there any way to either know or cap the maximum number of connections we should expect?
I can provide more info as required. I've done some searches and found references to "huge number of TCP connections" but nothing concrete to tell me how to predict how that scales.
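As a rough way to reason about an upper bound, here is a back-of-the-envelope sketch. It rests on an assumption that is not confirmed anywhere in this thread: that each attached RBD volume gets its own client instance, and that each instance keeps at most conns_per_osd TCP connections to every OSD it has talked to. The function and parameter names are hypothetical, and connections to monitors are not counted.

```python
def estimate_max_osd_connections(num_osds, num_volumes, conns_per_osd=1):
    """Worst-case number of client-to-OSD TCP connections for one client.

    Assumption (not confirmed): every attached volume may end up with
    conns_per_osd connections to every OSD in the cluster.
    """
    return num_osds * num_volumes * conns_per_osd


# Illustration only: a 414-OSD cluster and a client with 2 volumes.
worst_case = estimate_max_osd_connections(num_osds=414, num_volumes=2)
```

If the assumption holds, the open file limit would need to sit comfortably above this figure plus whatever descriptors the process uses for everything else.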