How to export a Ceph RBD image from one cluster to another without using a bridge server

If you have two Ceph clusters - Cluster-A and Cluster-B - and want to export a really big Ceph RBD image from one to the other, this is what you do...

Published on: May 11, 2020 by Website Admin

I've been working with Ceph for about two years now, and I've deployed three major Ceph clusters so far (from Luminous to Mimic) on Ubuntu 14.04 and Ubuntu 16.04. Two of the clusters were all-SSD and one used HDD disks (7200 rpm) for capacity, and I've been faced with the following situation.

On an RBD pool I have some really big images - more than 5T of disk space used - and I was confronted with a situation where I needed to move the data from one cluster to another. You can create a snapshot, download it locally, then upload it to the second Ceph cluster and import the RBD snapshot there, or use rsync or whatever method you have, but... there is another way, which involves a Linux pipe 😊. This is what I use: SSH, a Linux pipe and an RBD snapshot on the cluster.

1. First you create the snapshot on Ceph Cluster-A

rbd snap create rbd/<<your-rbd-image-name>>@<<your-rbd-image-name>>.snapshot

Now make sure the snapshot was created, with the following command:

rbd ls -l | grep <<your-rbd-image-name>>

2. Now the magic happens. I will use a Linux pipe and SSH to export and import my Ceph RBD snapshot image from Cluster-A to Cluster-B, directly:

rbd export rbd/<<your-rbd-image-name>>@<<your-rbd-image-name>>.snapshot - | ssh <<your-username>>@<<IP-of-cluster-B>> "sudo rbd import --image-format 2 --image-feature layering - rbd/<<your-rbd-image-name>>"

IMPORTANT - you must be able to connect via SSH from one cluster to the other and have sudo rights on the destination.
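Putting the two steps together, here is a minimal sketch of the whole transfer as a script. The image name, pool name and user@IP are placeholders of my own (not from the commands above), and by default it only prints the commands (DRY_RUN=1) so you can review them before touching your clusters:

```shell
#!/usr/bin/env bash
# Sketch of the snapshot-and-pipe transfer. IMAGE, REMOTE and the "rbd"
# pool name are placeholders -- adjust them for your clusters.
# With DRY_RUN=1 (the default) commands are only printed, not executed.
set -u

DRY_RUN="${DRY_RUN:-1}"
IMAGE="${IMAGE:-my-big-image}"
REMOTE="${REMOTE:-admin@192.0.2.10}"   # user@IP-of-cluster-B

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else eval "$*"; fi
}

# 1. Snapshot the image on Cluster-A
run "rbd snap create rbd/${IMAGE}@${IMAGE}.snapshot"

# 2. Stream the snapshot straight into Cluster-B over SSH
run "rbd export rbd/${IMAGE}@${IMAGE}.snapshot - | ssh ${REMOTE} 'sudo rbd import --image-format 2 --image-feature layering - rbd/${IMAGE}'"
```

Set DRY_RUN=0 only once the printed commands look right for your environment.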

Now, on Ceph Cluster-B run the following command to see the import happening.

rbd ls -l | grep <<your-rbd-image-name>>

Now you should see your RBD image being imported...

That's it. I hope I did not forget anything... If I did, drop me a line on Twitter. Thank you.

How to increase the MDS Cache Memory Limit on your Ceph cluster on the fly

In the following I will show you how to increase the MDS Cache Memory Limit (mds_cache_memory_limit) on a Ceph cluster (Mimic 13.2.6) without downtime.

Published on: May 11, 2020 by Website Admin

A Ceph cluster (at least in the Mimic version) comes, by default, with an MDS cache memory limit (mds_cache_memory_limit) of 1G... and that is not enough if you are running some heavy-load clients with CephFS; you will soon start to get warnings like client X failing to respond to cache pressure.

How do I know that a Ceph cluster comes with mds_cache_memory_limit of 1G, you ask? Well, I ran the following command on a Ceph MDS server:

ceph daemon mds.<<your_ceph_mds_server_name>> config get mds_cache_memory_limit
... and you should get the following output:
    "mds_cache_memory_limit": "1073741824"

Now, the important part...

Always perform modifications on a standby MDS server. Do not perform modifications on an active server because (from my experience) the MDS will get stuck for some time or restart. At least, this is what happened on my Mimic 13.2.6 Ceph cluster.

The command to increase the MDS Cache Memory Limit from 1G to 64G on your Ceph cluster is (if you want a different size, do the math - the limit is expressed in bytes, and 1073741824 bytes is 1G 😛 ):

ceph daemon mds.<<your_ceph_mds_server_name>> config set mds_cache_memory_limit 68719476736
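As a sanity check on the numbers (plain shell arithmetic on my part, nothing Ceph-specific), the limit is set in bytes, so converting from GiB looks like this:

```shell
# mds_cache_memory_limit is expressed in bytes; convert GiB to bytes.
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

gib_to_bytes 1    # 1073741824  (the default limit)
gib_to_bytes 64   # 68719476736 (the value used above)
```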

Do the above modification on all your standby MDS servers, and I truly hope you have more than one MDS server in your cluster; otherwise, you are screwed. Or you can read this and quickly deploy extra MDS servers, and it's all good 😊.

Now stop the active MDS daemon with systemctl (I am running my cluster on Ubuntu 16.04) and watch one of your standby MDS servers become active.

Remember to perform the above modifications on the ex-active MDS server and that's it.

Oh, one more thing... If you reboot the servers, the default value (a 1G mds_cache_memory_limit) comes back and every modification made with ceph daemon is lost, because it only lives in the running daemon's memory. To prevent that from happening, add the following config to the /etc/ceph/ceph.conf file on every MDS server.

mds_cache_memory_limit = 68719476736
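A note on placement (my assumption, not something the post specifies): daemon-type options like this usually go under the matching section of ceph.conf, so the line above would typically sit under [mds], for example:

```ini
# /etc/ceph/ceph.conf -- persistent MDS cache limit (value in bytes, 64G here)
[mds]
mds_cache_memory_limit = 68719476736
```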

That's it!