Discussion:
[ceph-users] Crush, data placement and randomness
Franck Desjeunes
2018-12-06 07:00:48 UTC
Hi all cephers.

I don't know if this is the right place to ask this kind of question, but
I'll give it a try.

I'm getting interested in Ceph and have dived deep into its technical
details, but I'm struggling to understand a few things.

When I run ceph osd map on a hypothetical object that does not exist,
the command always gives me the same set of OSDs to store the object.
So where is the randomness in the CRUSH algorithm if an object A will
always be stored on the same set of OSDs?

In the same way, when I use librados to read an object, why does the stack
trace show the code going through exactly the same function calls to get
the set of OSDs as it does when creating an object?

As far as I can see, CRUSH is fully deterministic, and I don't
understand why it is described as a pseudo-random algorithm.

Thank you for your help.

Best regards.
Marc Roos
2018-12-06 08:44:42 UTC
AFAIK it is not random; where your objects are stored is calculated by
an algorithm that probably takes into account how many OSDs you have
and their sizes.
How could placement be random? You would never be able to find the object
again, because there is no such thing as a 'file allocation table'.

But better to search for this, I am not that deep into Ceph ;)
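
To make the "calculated, not looked up" point concrete, here is a minimal Python sketch. It is not Ceph's code: the place() function, the SHA-256 hash, and the flat list of 12 OSD ids are all assumptions made for illustration (real CRUSH uses its own hash function and walks a hierarchy described by the CRUSH map). It shows the two properties in question: a writer and a reader compute the same location independently, yet across many object names the results spread out as if they were random.

```python
import hashlib
from collections import Counter


def place(object_name: str, osd_ids: list[int], replicas: int = 3) -> list[int]:
    """Hypothetical placement: rank OSDs by a per-(object, OSD) hash, keep the top few."""
    def score(osd_id: int) -> int:
        digest = hashlib.sha256(f"{object_name}:{osd_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osd_ids, key=score, reverse=True)[:replicas]


osds = list(range(12))  # assumed cluster of 12 equally sized OSDs

# A writer and a reader each compute the location on their own; no table is consulted.
written_to = place("object-A", osds)
read_from = place("object-A", osds)
print(written_to == read_from)

# Over many object names the primary OSD is spread roughly evenly.
primaries = Counter(place(f"obj-{i}", osds)[0] for i in range(12000))
print(sorted(primaries.items()))
```

The same name always yields the same OSDs, which is why a read can reuse the exact code path of a write. "Pseudo-random" describes the even, uncorrelated spread over many names, not any variation between runs.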




Leon Robinson
2018-12-06 10:11:06 UTC
The most important thing to remember about CRUSH is that the H stands for hashing.

If you hash the same object you're going to get the same result.

e.g. cat /etc/fstab | md5sum always gives the same output, unless you change the file contents.

CRUSH uses the number of OSDs, the object, the pool, and a bunch of other things to create a hash that determines placement. If any of those change, the hash changes and so does the placement; if everything is restored to exactly how it was, the placement returns to how it was.
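
A small Python sketch of that behaviour, under stated assumptions: it uses a weighted rendezvous (highest-random-weight) hash rather than CRUSH's actual algorithm, and the OSD names, weights, and the draw() and choose() helpers are hypothetical. It only illustrates that placement is a pure function of its inputs: add an OSD and some objects move, restore the original inputs and every object maps back to where it was.

```python
import hashlib
import math


def draw(key: str, osd: str, weight: float) -> float:
    """Hash (key, osd) to a value in (0, 1], then scale by weight; highest draw wins."""
    digest = hashlib.sha256(f"{key}|{osd}".encode()).digest()
    u = (int.from_bytes(digest[:8], "big") + 1) / 2**64  # never 0, so log() is defined
    return math.log(u) / weight  # a larger weight pulls the draw towards 0, so it wins more often


def choose(key: str, weights: dict[str, float], replicas: int = 2) -> list[str]:
    """Return the `replicas` OSDs with the highest draws for this key."""
    return sorted(weights, key=lambda osd: draw(key, osd, weights[osd]), reverse=True)[:replicas]


before = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 1.0, "osd.3": 2.0}  # assumed weights
after = {**before, "osd.4": 1.0}                                    # one OSD added
keys = [f"pool1/obj-{i}" for i in range(2000)]

orig = {k: choose(k, before) for k in keys}
changed = {k: choose(k, after) for k in keys}
restored = {k: choose(k, dict(before)) for k in keys}

moved = sum(orig[k] != changed[k] for k in keys)
print(f"{moved} of {len(keys)} objects changed placement after adding osd.4")
print("placement restored:", restored == orig)
```

In this sketch only the objects that now land on the new OSD change placement; everything else stays put, and removing the OSD again reproduces the original mapping exactly.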

Brad Hubbard
2018-12-07 00:44:55 UTC
https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf
Cheers,
Brad