Commits · 250dfded0d70beb33c260b44df125ddea6c919f2 · 小白蛋 / Rook

This project is mirrored from https://gitee.com/wangmingco/rook.git. Pull mirroring failed 2 years ago.

30 Oct, 2019 8 commits

ceph: stop orchestration restart on device config map update for osd on pvc · 250dfded

Santosh Pillai authored 5 years ago


When running OSDs on PVCs, orchestration should not be restarted when the device config
map changes. Added a check to skip orchestration on device CM updates when StorageClassDeviceSets
are available, that is, when running OSDs on PVCs.
Signed-off-by: Santosh Pillai <sapillai@redhat.com>
(cherry picked from commit d1c82015)

250dfded
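
The check described in this commit can be pictured with a minimal sketch. It is not the operator's actual code: the `storageClassDeviceSet` type and the `shouldOrchestrateOnDeviceCMUpdate` function are hypothetical names introduced here only to illustrate the decision.

```go
package main

import "fmt"

// storageClassDeviceSet is a hypothetical stand-in for one entry of the
// cluster's StorageClassDeviceSets; only its presence matters in this sketch.
type storageClassDeviceSet struct {
	Name string
}

// shouldOrchestrateOnDeviceCMUpdate reports whether a device config map
// update should trigger orchestration. When any device set is configured
// (OSDs running on PVCs), the update is skipped.
func shouldOrchestrateOnDeviceCMUpdate(deviceSets []storageClassDeviceSet) bool {
	return len(deviceSets) == 0
}

func main() {
	onPVC := []storageClassDeviceSet{{Name: "set1"}}
	fmt.Println(shouldOrchestrateOnDeviceCMUpdate(nil))   // no device sets: orchestrate
	fmt.Println(shouldOrchestrateOnDeviceCMUpdate(onPVC)) // OSDs on PVCs: skip
}
```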

Merge pull request #4225 from rook/mergify/bp/release-1.1/pr-4216 · b7d3ef01
mergify[bot] authored 5 years ago
```
Continue with orchestration if a single mon pod fails to start (bp #4216)
```
b7d3ef01
Merge pull request #4226 from leseb/rook/mergify/bp/release-1.1/pr-4214 · 3c12c0bc
Sébastien Han authored 5 years ago
```
ceph: osd hide 'restorecon' when necessary (bp #4214)
```
3c12c0bc

ceph: osd hide 'restorecon' when necessary · 05a8d61a

Sébastien Han authored 5 years ago


Some testing on OpenShift has shown that enabling host networking causes
issues when ceph-volume creates OSDs.
This is a pure Red Hat OpenShift downstream issue.

This is meant to be temporary while we are waiting for the SELinux team
to get back to us.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit d224fbde)

05a8d61a

ceph: continue with orchestration if a single mon pod fails to start · 67b4de1f

Travis Nielsen authored 5 years ago


If the cluster has previously been initialized and the operator
is restarted, the operator needs to perform the first orchestration
to confirm basic cluster health, then start watching for CR events.
If even a single mon pod was in a failed or pending state, the
orchestration was blocked.

The scenario is when a node is lost and a mon and the operator happen
to be running on that node. The operator will restart, but the mon
will not restart because it is tied to the node that died (unless running
on a PVC). The mon would remain pending indefinitely while attempting
to schedule on the node that is down, and the operator would
be blocked indefinitely. Now we will just log the error and attempt
to continue.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
(cherry picked from commit 95171c68)

67b4de1f
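
The "log the error and attempt to continue" behavior can be sketched as follows. This is an illustration under assumptions, not the operator's implementation: `startMons`, the `start` callback, and `errPending` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errPending is a hypothetical error used to simulate a mon pod that
// cannot be scheduled.
var errPending = errors.New("mon pod is pending")

// startMons tries to start every mon. A failure is logged and recorded,
// but it does not stop the loop, so the remaining mons are still started.
func startMons(names []string, start func(name string) error) []string {
	var failed []string
	for _, name := range names {
		if err := start(name); err != nil {
			log.Printf("failed to start mon %q, continuing: %v", name, err)
			failed = append(failed, name)
		}
	}
	return failed
}

func main() {
	start := func(name string) error {
		if name == "b" {
			return errPending
		}
		return nil
	}
	fmt.Println(startMons([]string{"a", "b", "c"}, start))
}
```

Returning the list of failed mons leaves it to the caller to decide whether enough mons came up to proceed.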

ceph: osd update error message · 5e38244a

Sébastien Han authored 5 years ago


Add a more meaningful message.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 6dff2afe)

5e38244a

Merge pull request #4221 from rook/mergify/bp/release-1.1/pr-4213 · 02a673de
mergify[bot] authored 5 years ago
```
Ceph: use the rook image for drain-canaries. (bp #4213)
```
02a673de

Ceph: use the rook image for drain-canaries. · 30e9e430

Rohan CJ authored 5 years ago


Change the canaries to use the rook image instead of the busybox image to
run `sleep infinity`. This removes our dependency on an external image that
we do not control.
Signed-off-by: Rohan CJ <rohantmp@gmail.com>
(cherry picked from commit 17f6a25b)

30e9e430

25 Oct, 2019 10 commits

Merge pull request #4188 from rook/mergify/bp/release-1.1/pr-4179 · 57eda788
mergify[bot] authored 5 years ago
```
NFS: Set correct owner reference for garbage collection (bp #4179)
```
57eda788
Merge pull request #4192 from rook/mergify/bp/release-1.1/pr-4189 · 0603b478
mergify[bot] authored 5 years ago
```
ceph: fix set osd prepare resources (bp #4189)
```
0603b478
Merge pull request #4193 from rook/mergify/bp/release-1.1/pr-4096 · 3c351a26
mergify[bot] authored 5 years ago
```
Run the distributed test matrix in PRs (bp #4096)
```
3c351a26

tests: by default run the minimum test matrix in PRs · c37de77c

travisn authored 5 years ago


The CI runs the integration tests on several versions of K8s. By default
the integration tests have all been running on all versions of K8s. The
recent option to run a minimum matrix for the ceph tests has proven reliable
for PRs, and distributing the tests this way is much faster.

Master and release builds will still run the full set of test suites on all
versions of K8s.
Signed-off-by: travisn <tnielsen@redhat.com>
(cherry picked from commit ac8f0dff)

c37de77c

Merge pull request #4186 from rook/mergify/bp/release-1.1/pr-4180 · 2b1fe101
mergify[bot] authored 5 years ago
```
Flex driver should not allow attach before detach on a different node (bp #4180)
```
2b1fe101

ceph: fix set osd prepare resources · 5fc16c0f

Sébastien Han authored 5 years ago

Use the right struct field. Previously it pointed to the global osd
resources; now it points to the osd prepare resources.

Closes: https://github.com/rook/rook/issues/4182

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 56629113)

5fc16c0f

Merge pull request #4187 from rook/mergify/bp/release-1.1/pr-4184 · 3ba9b340
mergify[bot] authored 5 years ago
```
Improve reliability of the Ceph mon failover test (bp #4184)
```
3ba9b340

nfs: set correct owner reference name for garbage collection · b8380087

Travis Nielsen authored 5 years ago


The owner reference for the NFS custom resource was using the namespace
instead of the name. This causes the k8s garbage collector to remove the
nfs resources in some scenarios such as restarting etcd.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
(cherry picked from commit fa834d9a)

b8380087
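
The name-versus-namespace mix-up is easy to see in a small sketch. The types below are hypothetical, trimmed-down stand-ins rather than the real Kubernetes or Rook types, and the `Kind` value is illustrative only.

```go
package main

import "fmt"

// ownerRef is a hypothetical, trimmed-down mirror of the fields of a
// Kubernetes owner reference that matter for garbage collection.
type ownerRef struct {
	Kind string
	Name string
	UID  string
}

// nfsServer is a hypothetical stand-in for the NFS custom resource.
type nfsServer struct {
	Name      string
	Namespace string
	UID       string
}

// ownerRefFor builds the owner reference from the resource's name.
// Using s.Namespace for the Name field is the mistake described above:
// the garbage collector then cannot find an owner with that name.
func ownerRefFor(s nfsServer) ownerRef {
	return ownerRef{Kind: "NFSServer", Name: s.Name, UID: s.UID}
}

func main() {
	server := nfsServer{Name: "my-nfs", Namespace: "rook-nfs", UID: "1234"}
	fmt.Printf("%+v\n", ownerRefFor(server))
}
```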

ceph: improve reliability of mon failover test · 37a7124e

Travis Nielsen authored 5 years ago


Depending on the state of the orchestration, the operator might trigger
re-creation of the deleted mon. In this case, consider the test successful
rather than waiting for a failover that will never occur.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
(cherry picked from commit f5b032ad)

37a7124e

ceph: flex driver should not allow attach before detach · bc71de45

Travis Nielsen authored 5 years ago


If a volume is being attached, it should be verified that the volume
is safe to attach. The volume was assumed to be safe to attach if it was
for the same pod. But this was assuming the pod of the same name would be
on the same node. This is true for pods created from deployments, but not
for pods that are part of a stateful set. A stateful set will maintain the
pod name even as the pod is failed over to a new node. Therefore, the fencing
must check if the pod is on the same node before allowing the attach to
continue. Otherwise, we need to wait for the volume to be detached from the
other node.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
(cherry picked from commit f7d7aa0b)

bc71de45
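
The fencing rule from this commit can be sketched as a single predicate. This is a simplified illustration, not the flex driver's code: the `attachment` type and `safeToAttach` function are hypothetical.

```go
package main

import "fmt"

// attachment is a hypothetical record of where a volume is currently attached.
type attachment struct {
	PodName string
	Node    string
}

// safeToAttach reports whether an attach request may proceed. A volume with
// no existing attachment is safe. Otherwise the request must come from the
// same pod on the same node; matching the pod name alone is not enough,
// because a stateful set keeps the pod name when the pod moves to a new node.
func safeToAttach(existing *attachment, podName, node string) bool {
	if existing == nil {
		return true
	}
	return existing.PodName == podName && existing.Node == node
}

func main() {
	current := &attachment{PodName: "db-0", Node: "node-a"}
	fmt.Println(safeToAttach(current, "db-0", "node-a")) // same pod, same node
	fmt.Println(safeToAttach(current, "db-0", "node-b")) // same name, other node: wait for detach
}
```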

24 Oct, 2019 5 commits

Merge pull request #4175 from travisn/release-1.1.4 · 7dfaf2c5
Travis Nielsen authored 5 years ago
```
Set the v1.1.4 release tag
```
7dfaf2c5

build: set the v1.1.4 release tag · 61f2888a

Travis Nielsen authored 5 years ago


With the v1.1.4 patch release we need to set the new tag in the
deployment manifests and documentation.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>

61f2888a

Merge pull request #4165 from travisn/osd-config-overrides · fac5d702
Travis Nielsen authored 5 years ago
```
Restore --conf option for filestore osds and config override settings
```
fac5d702

Revert "ceph: osd remove --conf flag" · 78922f6a

Travis Nielsen authored 5 years ago

The --conf flag is necessary in the release-1.1 branch. Due to changes
in master that have not been backported, this flag is not needed there.

This reverts commit 9b614df4.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>

78922f6a

ceph: re-enable config overrides for osds · fbd30dcb

Travis Nielsen authored 5 years ago


The Ceph config overrides in the release-1.1 branch require
merging the overrides from /etc/ceph/ceph.conf into the generated ceph.conf.
This is different from the master branch where the osd configuration
has changed such that the overrides will automatically be picked up
from /etc/ceph/ceph.conf.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>

fbd30dcb

23 Oct, 2019 13 commits

Merge pull request #4163 from travisn/release-1.1.3 · 137f1a86
Travis Nielsen authored 5 years ago
```
Set the v1.1.3 release tag
```
137f1a86

build: set the v1.1.3 release tag · c0cf08b7

Travis Nielsen authored 5 years ago


With the v1.1.3 patch release we need to set the new tag
in the deployment manifests and documentation.
Signed-off-by: Travis Nielsen <tnielsen@redhat.com>

c0cf08b7

Merge pull request #4162 from rook/mergify/bp/release-1.1/pr-4161 · 2eecdf42
mergify[bot] authored 5 years ago
```
ceph: osd remove --conf flag (bp #4161)
```
2eecdf42
Merge pull request #4159 from rook/mergify/bp/release-1.1/pr-3996 · e81fab89
mergify[bot] authored 5 years ago
```
fix operator reconcile to restart osd daemons (bp #3996)
```
e81fab89
Merge pull request #4160 from rook/mergify/bp/release-1.1/pr-4021 · 88f9e100
mergify[bot] authored 5 years ago
```
Enable restoring a cluster after disaster recovery (bp #4021)
```
88f9e100

ceph: osd remove --conf flag · 9b614df4

Sébastien Han authored 5 years ago

Even legacy osds were still running with this flag. In my previous patch
I was trying to minimize the impact of the change and was afraid the
config file was holding important information but that appears not to be
true.

The only relevant option the config could hold is the keyring path, but it's
passed in the CLI startup line.

Closes: https://github.com/rook/rook/issues/4063

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 9f053ca4)

9b614df4

Merge pull request #4157 from rook/mergify/bp/release-1.1/pr-4116 · 4e2fa771
mergify[bot] authored 5 years ago
```
Ceph: Added removeOSDsIfOutAndSafeToRemove to Cluster CR (bp #4116)
```
4e2fa771

ceph: create missing mon deployments without waiting for quorum · 2b8f1be0

travisn authored 5 years ago


If mon deployments need to be created, the operator should
go ahead and create all of them instead of waiting for quorum
in between checking each mon. In particular, if a cluster is
being restored in a disaster recovery situation, none of the mon
deployments will exist. All of them should be created before
checking for quorum.
Signed-off-by: travisn <tnielsen@redhat.com>
(cherry picked from commit 53812c00)

2b8f1be0
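
The ordering change can be sketched as two phases: create everything that is missing, then check quorum once. This is an assumption-laden illustration; `ensureMons` and its callbacks are hypothetical names, not the operator's API.

```go
package main

import "fmt"

// ensureMons creates every missing mon deployment first and only then
// waits for quorum, instead of waiting for quorum between individual mons.
func ensureMons(
	desired []string,
	exists func(name string) bool,
	create func(name string) error,
	waitForQuorum func() error,
) error {
	for _, name := range desired {
		if exists(name) {
			continue
		}
		if err := create(name); err != nil {
			return fmt.Errorf("failed to create mon %q: %w", name, err)
		}
	}
	// Quorum is checked once, after all deployments exist.
	return waitForQuorum()
}

func main() {
	var calls []string
	err := ensureMons(
		[]string{"a", "b", "c"},
		func(string) bool { return false }, // disaster recovery: nothing exists yet
		func(name string) error { calls = append(calls, "create "+name); return nil },
		func() error { calls = append(calls, "quorum"); return nil },
	)
	fmt.Println(calls, err)
}
```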

ceph: mgr modules requests should timeout · 3347a637

travisn authored 5 years ago


Requests to enable mgr modules or call mgr modules should
time out rather than hang. For example, if creating a self
signed cert request hangs, the mgr should continue with other
actions in the orchestration.
Signed-off-by: travisn <tnielsen@redhat.com>
(cherry picked from commit a72a9816)

3347a637
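
One common way to bound a call that may hang is to run it in a goroutine and race it against a timer. The sketch below shows that general pattern only; `callWithTimeout` and `errTimedOut` are hypothetical names, and the commit itself may implement the timeout differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimedOut is a hypothetical sentinel error returned when the request
// does not finish within the allowed time.
var errTimedOut = errors.New("mgr module request timed out")

// callWithTimeout runs request in a goroutine and returns its result, or
// errTimedOut if it has not finished when the timeout expires.
func callWithTimeout(timeout time.Duration, request func() error) error {
	// Buffered so the goroutine can always send and exit, even after a timeout.
	done := make(chan error, 1)
	go func() { done <- request() }()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case err := <-done:
		return err
	case <-timer.C:
		return errTimedOut
	}
}

func main() {
	slow := func() error { time.Sleep(time.Second); return nil }
	fast := func() error { return nil }
	fmt.Println(callWithTimeout(50*time.Millisecond, slow))
	fmt.Println(callWithTimeout(time.Second, fast))
}
```

Note that this pattern stops waiting but does not cancel the underlying work; a request that truly hangs keeps its goroutine alive, so real code would also want a way to abort the request.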

ceph: reduce verbose logging checking for clean PGs · d1880c2c

travisn authored 5 years ago


Checking for clean PGs is a recurring event when reconciling nodes
for upgrades and fencing. No need to log every request.
Signed-off-by: travisn <tnielsen@redhat.com>
(cherry picked from commit 43979747)

d1880c2c

ceph: fix operator reconcile to restart osds daemons · 36b7dbfa

Santosh Pillai authored 5 years ago


OSDs on PVCs are not upgraded when the user upgrades the ceph version in the cluster-on-pvc yaml.
The solution includes:
   - upgrade osd prepare and daemon pods on upgrade
   - skip c-v prepare if filesystem is already present on the pvc device.
   - skip lvm release in case of upgrade.
Signed-off-by: Santosh Pillai <sapillai@redhat.com>
(cherry picked from commit 8ea693a7)

36b7dbfa

Ceph: Added removeOSDsIfOutAndSafeToRemove to Cluster CR · 932d7e2d

rohan47 authored 5 years ago


OSDs can be removed automatically with the current mechanism if a new
setting removeOSDsIfOutAndSafeToRemove is set to true. The default for
all new or upgraded clusters should be false.
Signed-off-by: rohan47 <rohgupta@redhat.com>
(cherry picked from commit 7f9611d4)

932d7e2d
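
The gating described in this commit amounts to a three-way condition. The sketch below is hypothetical: only the setting name `removeOSDsIfOutAndSafeToRemove` comes from the commit message, while the `osdState` type and `shouldRemoveOSD` function are invented for illustration.

```go
package main

import "fmt"

// osdState is a hypothetical summary of what is known about one OSD.
type osdState struct {
	Out          bool // the OSD is marked out of the cluster
	SafeToRemove bool // removing it would not put data at risk
}

// shouldRemoveOSD reports whether an OSD may be removed automatically.
// The setting defaults to false, so nothing is removed unless the user
// opts in, and even then the OSD must be both out and safe to remove.
func shouldRemoveOSD(removeOSDsIfOutAndSafeToRemove bool, s osdState) bool {
	return removeOSDsIfOutAndSafeToRemove && s.Out && s.SafeToRemove
}

func main() {
	osd := osdState{Out: true, SafeToRemove: true}
	fmt.Println(shouldRemoveOSD(false, osd)) // default setting: never removed
	fmt.Println(shouldRemoveOSD(true, osd))  // opted in, out, and safe: removed
}
```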

Merge pull request #4144 from leseb/bp-4086 · da6b8592
Travis Nielsen authored 5 years ago
```
ceph: rework csi keys and secrets (bp #4086)
```
da6b8592

22 Oct, 2019 4 commits

Merge pull request #4136 from rook/mergify/bp/release-1.1/pr-4009 · 7b6b6fe3
mergify[bot] authored 5 years ago
```
[OSD on PVC] add kubernetes version check. (bp #4009)
```
7b6b6fe3

ceph: rework csi keys and secrets · 772ed887

Sébastien Han authored 5 years ago

We now create 4 secrets containing 4 ceph keys, each of which has
limited permissions to access the cluster.

Closes: https://github.com/rook/rook/issues/4074

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit fcc2e21f)

772ed887

ceph: do not enable app on pool · ba7fed2b

Sébastien Han authored 5 years ago


The command `fs new` already enables the application pool settings for
the cephfs pools so we don't need to run that command again.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit eafc2a47)

ba7fed2b

ci: more debug · 42009014

Sébastien Han authored 5 years ago


When we give up waiting for the pod to be running, we describe it
to see what's wrong.
Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit eccea9fa)

42009014