- Feb 28, 2019
-
Guillaume Abrioux authored
In addition to 15812970f033206b8680cc68351952d49cc18314.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d5be83e5042a5e22ace6250234ccd81acaffb0a2)
-
- Feb 20, 2019
-
Guillaume Abrioux authored
Introduce two new variables to make the 'wait for all osd to be up' check configurable. It's possible that for some deployments, OSDs can take longer to be seen as UP and IN.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1676763
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 21e5db8982afd6e075541e7fc88620d59a1df498)
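A minimal sketch of what making that check configurable can look like in a role's defaults; the variable names and values below are illustrative, not necessarily the exact ones introduced here:
```
# group_vars / role defaults -- illustrative names and values
nb_retry_wait_osd_up: 60
delay_wait_osd_up: 10
```
Deployments whose OSDs take longer to be reported UP and IN can then raise these values in their inventory instead of patching the playbook.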
-
- Feb 19, 2019
-
David Waiting authored
The existing task checks that the number of OSDs is equal to the number of up OSDs before continuing. The problem is that if none of the OSDs have been discovered yet, the task will exit immediately and subsequent pool creation will fail (num_osds = 0, num_up_osds = 0). This is related to Bugzilla 1578086. In this change, we also check that at least one OSD is present. In our testing, this results in the task correctly waiting for all OSDs to come up before continuing.
Signed-off-by: David Waiting <david_waiting@comcast.com> (cherry picked from commit 3930791cb7d2872e3388d33713171d7a0c1951e8)
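A hedged sketch of the guarded check described above, assuming `ceph osd stat -f json` exposes `num_osds` and `num_up_osds` at the top level (the exact task and JSON path differ between Ceph releases):
```
- name: wait for all osd to be up
  command: ceph --cluster ceph osd stat -f json
  register: osd_stat
  changed_when: false
  retries: "{{ nb_retry_wait_osd_up | default(60) }}"
  delay: "{{ delay_wait_osd_up | default(10) }}"
  until:
    - (osd_stat.stdout | from_json)["num_osds"] | int > 0
    - (osd_stat.stdout | from_json)["num_osds"] == (osd_stat.stdout | from_json)["num_up_osds"]
```
The extra `> 0` condition is what keeps the task from passing trivially when no OSDs have been discovered yet.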
-
- Feb 06, 2019
-
Sébastien Han authored
In order to be able to retrieve udev information, we must expose its socket. As per https://github.com/ceph/ceph/pull/25201, ceph-volume will start consuming udev output.
Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit 997667a8734eddaa616fe642e57f6378408736a9)
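In container terms, exposing the socket means bind-mounting the host's `/run/udev` into the container. A hedged sketch of what that enables (the wrapping task, image tag and other volumes are assumptions, not the exact template change):
```
- name: run ceph-volume inside the container with the udev socket exposed
  command: >
    docker run --rm --privileged=true --net=host
    -v /run/udev:/run/udev:z
    -v /dev:/dev
    -v /etc/ceph:/etc/ceph:z
    -v /var/lib/ceph:/var/lib/ceph:z
    --entrypoint=ceph-volume
    docker.io/ceph/daemon:latest-luminous
    lvm list --format json
```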
-
Guillaume Abrioux authored
Without this, the command `ceph-volume lvm list --format json` hangs and takes a very long time to complete.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7ade0328072896e99817b070b6a82448024bfb84)
-
- Jan 16, 2019
-
Noah Watkins authored
The python3 fix merged by https://github.com/ceph/ceph-ansible/pull/3346 was undone a few days later by https://github.com/ceph/ceph-ansible/commit/82a6b5adec4d72eb4b7219147f2225b7b2904460, and this patch fixes it again :)
Signed-off-by: Noah Watkins <nwatkins@redhat.com> (cherry picked from commit 3cf5fd2c3ee1fc342ac8dc3365ed82d863c7127e)
-
- Dec 20, 2018
-
Kai Wembacher authored
Signed-off-by: Kai Wembacher <kai@ktwe.de> (cherry picked from commit a273ed7f6038b51d3ddb5198d4f3ab57d45bc328)
-
- Dec 04, 2018
-
Sébastien Han authored
Applying and passing OSD_BLUESTORE/FILESTORE on the fly is wrong for existing clusters, as their config will be changed. Typically, if an OSD was prepared with ceph-disk on filestore and we change the default objectstore to bluestore, the activation will fail. The osd_objectstore flag should only be used for the preparation, not the activation. The activation in this case detects the OSD objectstore, which prevents failures like the one described above.
Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit 4c5113019893c92c4d75c9fc457b04158b86398b)
-
Sébastien Han authored
If an existing cluster runs this config and has ceph-disk OSDs, `expose_partitions` won't be evaluated by jinja since it's inside the 'old' if. We need it as part of the osd_scenario != 'lvm' condition.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1640273
Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit bef522627e1e9827b86710c7a54f35a0cd596fbb)
-
- Nov 29, 2018
-
Sébastien Han authored
The code is now able (again) to start OSDs that were configured with ceph-disk in a non-container scenario.
Closes: https://github.com/ceph/ceph-ansible/issues/3388
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Guillaume Abrioux authored
Add a real default value for the osd pool size customization. Ceph itself has `osd_pool_default_size` defaulting to `3`. If users don't specify a pool size in the various pool definitions within ceph-ansible, we should default to `3`. By the way, this kind of condition isn't really clear:
```
when:
  - rbd_pool_size | default ("")
```
We should try to get the customized value, then default to what is in `osd_pool_default_size` (which has its default value pointing to `ceph_osd_pool_default_size` (`3`) as well), and compare it to `ceph_osd_pool_default_size`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit 7774069d45477df9f37c98bc414b3bf38cf41feb)
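A hedged group_vars sketch of the default chain described above (the pool name is an example):
```
# ceph_osd_pool_default_size mirrors Ceph's own default of 3,
# and osd_pool_default_size points at it
ceph_osd_pool_default_size: 3
osd_pool_default_size: "{{ ceph_osd_pool_default_size }}"

# a pool definition then falls back through the chain instead of relying on
# the truthiness of an empty string
openstack_glance_pool:
  name: images
  size: "{{ rbd_pool_size | default(osd_pool_default_size) }}"
```
The customization check then becomes an explicit comparison such as `(rbd_pool_size | default(ceph_osd_pool_default_size)) | int != ceph_osd_pool_default_size | int` rather than a truthiness test on `default("")`.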
-
Guillaume Abrioux authored
The `osd_pool_default_pg_num` parameter is set in `ceph-mon`. When using ceph-ansible with `--limit` on a specific group of nodes, it will fail when trying to access this variable since it isn't defined.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1518696
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit d4c0960f04342e995db2453b50940aa9933ceb09)
-
- Nov 28, 2018
-
Sébastien Han authored
This commit https://github.com/ceph/ceph-ansible/commit/4cc1506303739f13bb7a6e1022646ef90e004c90#diff-51bbe3572e46e3b219ad726da44b64ebL13 accidentally removed this check. It is a must-have for ceph-disk based containerized OSDs.
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Guillaume Abrioux authored
Since the `ceph-volume` introduction, there is no need to split those tasks. Let's refactor this part of the code so it's clearer. By the way, this was breaking rolling_update.yml when `openstack_config: true` because nothing ensured OSDs were started by the ceph-osd role (in `openstack_config.yml` there is a check ensuring all OSDs are UP, which was obviously failing), and OSDs on the last OSD node ended up not being started anyway.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com> (cherry picked from commit f7fcc012e9a5b5d37bcffd39f3062adbc2886006)
-
- Oct 29, 2018
-
Guillaume Abrioux authored
Append the 'm' suffix to specify the unit used by all `*_docker_memory_limit` variables.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
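For reference, a group_vars override with the unit spelled out; the exact variable names depend on the daemon role, so treat these as examples:
```
ceph_osd_docker_memory_limit: 5120m
ceph_mds_docker_memory_limit: 4096m
```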
-
Neha Ojha authored
Since we do not have enough data to put valid upper bounds for the memory usage of these daemons, do not put artificial limits by default. This will help us avoid failures like OOM kills due to low default values. Whenever required, these limits can be manually enforced by the user. More details in https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1638148
Signed-off-by: Neha Ojha <nojha@redhat.com>
-
- Oct 22, 2018
-
Rishabh Dave authored
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1596339
Signed-off-by: Rishabh Dave <ridave@redhat.com>
-
- Oct 17, 2018
-
Sébastien Han authored
The playbook has various improvements:
* run the ceph-validate role before doing anything
* run ceph-fetch-keys only on the first monitor of the inventory list
* set the noup flag so PGs get distributed once all the new OSDs have been added to the cluster, and unset it when they are up and running
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1624962
Signed-off-by: Sébastien Han <seb@redhat.com>
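The noup item boils down to the pattern below; a hedged sketch assuming the flag is toggled from the first monitor (task names are illustrative):
```
- name: set noup flag before the new OSDs join
  command: ceph --cluster ceph osd set noup
  delegate_to: "{{ groups['mons'][0] }}"
  run_once: true

# ... the ceph-osd role prepares and activates the new OSDs here ...

- name: unset noup flag once the new OSDs are up
  command: ceph --cluster ceph osd unset noup
  delegate_to: "{{ groups['mons'][0] }}"
  run_once: true
```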
-
- Oct 12, 2018
-
Guillaume Abrioux authored
As of now, we should no longer support Jewel in ceph-ansible. The latest ceph-ansible release supporting Jewel is `stable-3.1`.
Signed-off-by: Guillaume Abrioux <gabrioux@redhat.com>
-
- Oct 10, 2018
-
Sébastien Han authored
This commit does a couple of things:
* avoid code duplication
* clarify the code
* add more unit tests
* add myself as an author of the module
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Sébastien Han authored
This task was created for ceph-disk based deployments, so it's not needed when OSDs are prepared with ceph-volume.
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Sébastien Han authored
We don't need to pass the device and discover the OSD ID. We have a task that gathers all the OSD IDs present on the machine, so we simply re-use them and activate them. This also handles the situation where you have multiple OSDs running on the same device.
Signed-off-by: Sébastien Han <seb@redhat.com>
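A hedged sketch of the "gather the IDs once, then act on all of them" pattern (the path, the sed expression and the unit name are assumptions):
```
- name: collect the osd ids present on this host
  shell: ls /var/lib/ceph/osd/ | sed 's/.*-//'
  register: osd_ids
  changed_when: false

- name: start the container unit for each discovered osd id
  systemd:
    name: "ceph-osd@{{ item }}"
    state: started
    enabled: yes
  with_items: "{{ osd_ids.stdout_lines }}"
```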
-
Sébastien Han authored
We don't need to include the hostname in the container name; we can keep it simple and just call it ceph-osd-$id.
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Sébastien Han authored
expose_partitions is only needed on ceph-disk OSDs, so we don't need to activate this code when running lvm-prepared OSDs.
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Sébastien Han authored
The batch option was added recently; while rebasing this patch it was necessary to implement it. So now the batch option also works on containerized environments.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1630977
Signed-off-by: Sébastien Han <seb@redhat.com>
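Concretely, batch on a containerized environment amounts to running the subcommand inside the ceph container; a hedged sketch (image tag, volumes and devices are assumptions):
```
- name: prepare OSDs with ceph-volume lvm batch inside a container
  command: >
    docker run --rm --privileged=true --net=host
    -v /dev:/dev
    -v /etc/ceph:/etc/ceph:z
    -v /var/lib/ceph:/var/lib/ceph:z
    --entrypoint=ceph-volume
    docker.io/ceph/daemon:latest-luminous
    lvm batch --bluestore --yes /dev/sdb /dev/sdc
```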
-
Sébastien Han authored
If we run on a containerized deployment, we pass an env variable which contains the container image.
Signed-off-by: Sébastien Han <seb@redhat.com>
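A hedged sketch of the calling side; the environment variable name and the module parameters are assumptions inferred from the description above, not a verified interface:
```
- name: create an OSD via the ceph_volume module (containerized deployment)
  ceph_volume:
    objectstore: bluestore
    data: /dev/sdb
    action: create
  environment:
    CEPH_CONTAINER_IMAGE: "{{ ceph_docker_registry }}/{{ ceph_docker_image }}:{{ ceph_docker_image_tag }}"
```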
-
Sébastien Han authored
Signed-off-by: Sébastien Han <seb@redhat.com>
-
Noah Watkins authored
Fixes the deprecation warning: [DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result|search` use `result is search`.
Signed-off-by: Noah Watkins <nwatkins@redhat.com>
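For illustration, the before/after shape of such a condition (the registered variable and the pattern are placeholders):
```
# deprecated filter syntax
- debug:
    msg: "matched"
  when: some_result.stdout | search('already exists')

# test syntax expected by newer Ansible
- debug:
    msg: "matched"
  when: some_result.stdout is search('already exists')
```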
-
- Oct 09, 2018
-
Andrew Schoen authored
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
-
- Oct 04, 2018
-
Rishabh Dave authored
Instead, use "import_tasks" and "include_tasks" to tell whether tasks must be included statically or dynamically.
Fixes: https://github.com/ceph/ceph-ansible/issues/2998
Signed-off-by: Rishabh Dave <ridave@redhat.com>
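A short illustration of the distinction (file names are examples):
```
# static: resolved when the playbook is parsed
- import_tasks: configure_firewall.yml

# dynamic: resolved at run time, e.g. when the file name depends on facts
- include_tasks: "install_{{ ansible_os_family | lower }}.yml"
```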
-
- Sep 27, 2018
-
Rishabh Dave authored
Use "import_tasks" or "include_tasks" instead.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
-
- Sep 12, 2018
-
Andrew Schoen authored
If this is set to anything other than the default value of 1, then the --osds-per-device flag will be used by the batch command to define how many OSDs will be created per device.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
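A group_vars sketch assuming two OSDs per NVMe device (device paths are examples):
```
osd_scenario: lvm
osds_per_device: 2
devices:
  - /dev/nvme0n1
  - /dev/nvme1n1
# roughly expands to:
#   ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1 /dev/nvme1n1
```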
-
- Aug 28, 2018
-
Sébastien Han authored
As promised, these will go unsupported for 3.1, so let's actually remove them :)
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1622729
Signed-off-by: Sébastien Han <seb@redhat.com>
-
- Aug 20, 2018
-
Sébastien Han authored
We need ceph_release in the condition, not ceph_stable_release.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1619255
Signed-off-by: Sébastien Han <seb@redhat.com>
-
- Aug 16, 2018
-
Sébastien Han authored
This reverts commit e84f11e99ef42057cd1c3fbfab41ef66cda27302, which introduced a new failure later during the rolling_update process. Basically, it was modifying the list of devices and started impacting the ceph-osd role itself. The modification to accommodate the osd_auto_discovery parameter should happen outside of ceph-osd. Also, we are trying not to play the ceph-osd role during the rolling_update process so we can speed up the upgrade.
Signed-off-by: Sébastien Han <seb@redhat.com>
-
- Aug 10, 2018
-
Andrew Schoen authored
devices and lvm_volumes will always be defined, so we need to instead check their length before deciding to run the scenario. This fixes the failure here: https://2.jenkins.ceph.com/job/ceph-ansible-prs-luminous-bluestore_lvm_osds/86/consoleFull#1667273050b5dd38fa-a56e-4233-a5ca-584604e56e3a
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
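The resulting guard looks roughly like this (the included file name is an assumption):
```
- name: include the lvm scenario
  include_tasks: scenarios/lvm.yml
  when:
    - osd_scenario == 'lvm'
    - devices | length > 0 or lvm_volumes | length > 0
```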
-
Sébastien Han authored
rolling_update relies on the list of devices when performing the restart of the OSDs. The task that builds the devices list out of the ansible_devices dict only runs when there are no partitions on the drives. However, during an upgrade the OSDs are already configured: they have been prepared and have partitions, so this task won't run, the devices list will be empty, and the restart will be skipped during rolling_update. We now run the same task under different requirements when rolling_update is true and build a list when:
* osd_auto_discovery is true
* rolling_update is true
* ansible_devices exists
* no dm/lv are part of the discovery
* the device is not removable
* the device has more than 1 sector
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1613626
Signed-off-by: Sébastien Han <seb@redhat.com>
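A simplified, hedged sketch of rebuilding the list from the standard Ansible device facts under the conditions listed above (the filtering is not the commit's exact expression):
```
- name: build devices list during rolling_update
  set_fact:
    devices: "{{ devices | default([]) + ['/dev/' + item.key] }}"
  with_dict: "{{ ansible_devices }}"
  when:
    - rolling_update | bool
    - osd_auto_discovery | bool
    - ansible_devices is defined
    - "'dm-' not in item.key"
    - item.value.holders | length == 0
    - item.value.removable == "0"
    - item.value.sectors != "1"
```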
-
- Aug 09, 2018
-
Andrew Schoen authored
This is used with the lvm osd scenario. When using `devices`, you need the option to set the crush device class for all of the OSDs that are created from those devices.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
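A group_vars sketch (the class value and the devices are examples):
```
osd_scenario: lvm
crush_device_class: ssd
devices:
  - /dev/sdb
  - /dev/sdc
# roughly expands to:
#   ceph-volume lvm batch --crush-device-class ssd /dev/sdb /dev/sdc
```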
-
Andrew Schoen authored
This adds the action 'batch' to the ceph-volume module so that we can run the new 'ceph-volume lvm batch' subcommand. A functional test is also included. If devices is defined and osd_scenario is lvm, then the 'ceph-volume lvm batch' command will be used to create the OSDs.
Signed-off-by: Andrew Schoen <aschoen@redhat.com>
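A hedged sketch of calling the module's batch action; the parameter names are assumptions based on the description above, not a verified module interface:
```
- name: batch create OSDs from the devices list
  ceph_volume:
    cluster: "{{ cluster }}"
    objectstore: "{{ osd_objectstore }}"
    batch_devices: "{{ devices }}"
    action: batch
  when:
    - osd_scenario == 'lvm'
    - devices | length > 0
```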
-
- Jul 30, 2018
-
Sébastien Han authored
The container runs with --rm, which means it will be deleted by Docker when exiting. Also, 'docker rm -f' is not idempotent and returns 1 if the container does not exist.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1609007
Signed-off-by: Sébastien Han <seb@redhat.com>
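If such a removal ever does need to stay in a playbook, the non-zero return code can be tolerated explicitly; a hedged sketch (the container name is an example):
```
- name: remove a stale ceph-osd container if present
  command: docker rm -f ceph-osd-0
  register: rm_result
  failed_when: rm_result.rc not in [0, 1]
  changed_when: rm_result.rc == 0
```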
-