This project is mirrored from https://gitee.com/wangmingco/rook.git.
- 01 Dec, 2020 4 commits
-
-
Sébastien Han authored
We can now collect logs directly in a sidecar container. A new CRD setting has been added under `spec.logCollector`, with `enabled: true` and `periodicity: 24h`. Every 24h we will rotate the log files for each Ceph daemon. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit c6a87203)
# Conflicts:
#	Documentation/ceph-cluster-crd.md
#	cluster/examples/kubernetes/ceph/cluster.yaml
#	pkg/operator/ceph/cluster/crash/crash.go
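To make the shape of the new setting concrete, here is a minimal, self-contained Go sketch; the type and field names are local illustrations of the CRD keys above, not the actual Rook API types:
```go
package main

import "fmt"

// Local, illustrative mirror of the new logCollector block in the CephCluster CRD
// (spec.logCollector.enabled / spec.logCollector.periodicity). The real Rook API
// types may use different names.
type LogCollectorSpec struct {
	Enabled     bool   `json:"enabled"`
	Periodicity string `json:"periodicity"` // e.g. "24h": rotate each daemon's log files every 24 hours
}

type ClusterSpec struct {
	LogCollector LogCollectorSpec `json:"logCollector"`
}

func main() {
	spec := ClusterSpec{LogCollector: LogCollectorSpec{Enabled: true, Periodicity: "24h"}}
	fmt.Printf("log collector enabled=%t, rotating every %s\n",
		spec.LogCollector.Enabled, spec.LogCollector.Periodicity)
}
```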
-
Sébastien Han authored
We don't need to print this every 60s in the operator log. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit aca5a8cc)
-
mergify[bot] authored
ceph: cleanup should ignore ceph daemon pods that are not scheduled on any node. (bp #6719)
-
Santosh Pillai authored
Before cleaning up the cluster, we wait for all the daemon pods to be deleted. This fails when a daemon pod is in a pending state and has no NodeName. This PR ignores daemon pods that are not scheduled on any node. Signed-off-by:
Santosh Pillai <sapillai@redhat.com> (cherry picked from commit 753bdb35)
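A minimal Go sketch of the filtering described above; the helper name is hypothetical and this is not the actual Rook cleanup code:
```go
package cleanup

import (
	corev1 "k8s.io/api/core/v1"
)

// filterScheduledPods illustrates the idea above: when waiting for Ceph daemon
// pods to be removed, skip pods that were never scheduled onto a node
// (Spec.NodeName is empty, e.g. stuck in Pending).
func filterScheduledPods(pods []corev1.Pod) []corev1.Pod {
	scheduled := make([]corev1.Pod, 0, len(pods))
	for _, pod := range pods {
		if pod.Spec.NodeName == "" {
			// Pod is not assigned to any node; ignore it so cleanup can proceed.
			continue
		}
		scheduled = append(scheduled, pod)
	}
	return scheduled
}
```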
-
- 26 Nov, 2020 3 commits
-
-
mergify[bot] authored
ceph: fix pod labels set on csi components (bp #6702)
-
Alexander Trost authored
Signed-off-by:
Alexander Trost <galexrt@googlemail.com> (cherry picked from commit b04f8823)
-
Satoru Takeuchi authored
ceph: fix metadata device passed by-id (bp #6696)
-
- 25 Nov, 2020 6 commits
-
-
Sébastien Han authored
The code assumed that devices were passed by the user as "/dev/sda", which is bad! We all know people should be using paths like /dev/disk/by-id, so we must support them.
Closes: https://github.com/rook/rook/issues/6685
Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 5d9612c2)
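A minimal Go sketch of one way to support by-id paths, resolving the symlink to the underlying device node; the helper name is hypothetical and this is not the actual Rook implementation:
```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveDevicePath shows how a stable path such as /dev/disk/by-id/... can be
// resolved to the underlying block device (e.g. /dev/sda), so callers are no
// longer forced to pass bare /dev/sdX names.
func resolveDevicePath(device string) (string, error) {
	// /dev/disk/by-id entries are symlinks to the real device node.
	resolved, err := filepath.EvalSymlinks(device)
	if err != nil {
		return "", fmt.Errorf("failed to resolve device path %q: %w", device, err)
	}
	return resolved, nil
}
```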
-
mergify[bot] authored
ceph: ability to abort orchestration (bp #6693)
-
Sébastien Han authored
We can now prioritize orchestrations on certain events. Today only two events will cancel an ongoing orchestration (if any):
* request for cluster deletion
* request for cluster upgrade
If one of the two is caught by the watcher, we cancel the ongoing orchestration. For that we implemented a simple approach based on checkpoints: we check for a cancellation request at certain points of the orchestration, mainly before each of the mon/mgr/osd orchestration loops. This solution is not perfect, but we are waiting for controller-runtime to release its 0.7 version, which will embed context support. With that we will be able to cancel reconciles more precisely and rapidly. Operator log example:
```
2020-11-24 13:54:59.499719 I | op-mon: parsing mon endpoints: a=10.109.126.120:6789
2020-11-24 13:54:59.499719 I | op-mon: parsing mon endpoints: a=10.109.126.120:6789
2020-11-25 12:59:12.986264 I | ceph-cluster-controller: done reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:33.776947 I | ceph-cluster-controller: CR has changed for "rook-ceph". diff=
  v1.ClusterSpec{
    CephVersion: v1.CephVersionSpec{
      Image: "ceph/ceph:v15.2.5",
-     AllowUnsupported: true,
+     AllowUnsupported: false,
    },
    DriveGroups: nil,
    Storage: {UseAllNodes: true, Selection: {UseAllDevices: &true}},
    ... // 20 identical fields
  }
2020-11-25 13:07:33.777039 I | ceph-cluster-controller: reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:33.785088 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:33.788626 I | ceph-cluster-controller: detecting the ceph image version for image ceph/ceph:v15.2.5...
2020-11-25 13:07:35.280789 I | ceph-cluster-controller: detected ceph image version: "15.2.5-0 octopus"
2020-11-25 13:07:35.280806 I | ceph-cluster-controller: validating ceph version from provided image
2020-11-25 13:07:35.285888 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:35.287828 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:35.288082 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:35.621625 I | ceph-cluster-controller: cluster "rook-ceph": version "15.2.5-0 octopus" detected for image "ceph/ceph:v15.2.5"
2020-11-25 13:07:35.642688 I | op-mon: start running mons
2020-11-25 13:07:35.646323 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:35.654070 I | op-mon: saved mon endpoints to config map map[csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.107.242.49:6789","10.109.71.30:6789","10.98.93.224:6789"]}] data:a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789 mapping:{"node":{"a":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"b":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"c":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"}}} maxMonId:2]
2020-11-25 13:07:35.868253 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:35.868573 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:37.074353 I | op-mon: targeting the mon count 3
2020-11-25 13:07:38.153435 I | op-mon: checking for basic quorum with existing mons
2020-11-25 13:07:38.178029 I | op-mon: mon "a" endpoint is [v2:10.107.242.49:3300,v1:10.107.242.49:6789]
2020-11-25 13:07:38.670191 I | op-mon: mon "b" endpoint is [v2:10.109.71.30:3300,v1:10.109.71.30:6789]
2020-11-25 13:07:39.477820 I | op-mon: mon "c" endpoint is [v2:10.98.93.224:3300,v1:10.98.93.224:6789]
2020-11-25 13:07:39.874094 I | op-mon: saved mon endpoints to config map map[csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.107.242.49:6789","10.109.71.30:6789","10.98.93.224:6789"]}] data:a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789 mapping:{"node":{"a":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"b":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"c":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"}}} maxMonId:2]
2020-11-25 13:07:40.467999 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:40.469733 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:41.071710 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:41.078903 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:41.125233 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2020-11-25 13:07:41.327778 I | op-k8sutil: updating deployment "rook-ceph-mon-a" after verifying it is safe to stop
2020-11-25 13:07:41.327895 I | op-mon: checking if we can stop the deployment rook-ceph-mon-a
2020-11-25 13:07:44.045644 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-a"
2020-11-25 13:07:44.045706 I | op-mon: checking if we can continue the deployment rook-ceph-mon-a
2020-11-25 13:07:44.045740 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:44.109159 I | op-mon: mons running: [a b c]
2020-11-25 13:07:44.474596 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:44.478565 I | op-mon: deployment for mon rook-ceph-mon-b already exists. updating if needed
2020-11-25 13:07:44.493374 I | op-k8sutil: updating deployment "rook-ceph-mon-b" after verifying it is safe to stop
2020-11-25 13:07:44.493403 I | op-mon: checking if we can stop the deployment rook-ceph-mon-b
2020-11-25 13:07:47.135524 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-b"
2020-11-25 13:07:47.135542 I | op-mon: checking if we can continue the deployment rook-ceph-mon-b
2020-11-25 13:07:47.135551 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:47.148820 I | op-mon: mons running: [a b c]
2020-11-25 13:07:47.445946 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:47.448991 I | op-mon: deployment for mon rook-ceph-mon-c already exists. updating if needed
2020-11-25 13:07:47.462041 I | op-k8sutil: updating deployment "rook-ceph-mon-c" after verifying it is safe to stop
2020-11-25 13:07:47.462060 I | op-mon: checking if we can stop the deployment rook-ceph-mon-c
2020-11-25 13:07:48.853118 I | ceph-cluster-controller: CR has changed for "rook-ceph". diff=
  v1.ClusterSpec{
    CephVersion: v1.CephVersionSpec{
-     Image: "ceph/ceph:v15.2.5",
+     Image: "ceph/ceph:v15.2.6",
      AllowUnsupported: false,
    },
    DriveGroups: nil,
    Storage: {UseAllNodes: true, Selection: {UseAllDevices: &true}},
    ... // 20 identical fields
  }
2020-11-25 13:07:48.853140 I | ceph-cluster-controller: upgrade requested, cancelling any ongoing orchestration
2020-11-25 13:07:50.119584 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-c"
2020-11-25 13:07:50.119606 I | op-mon: checking if we can continue the deployment rook-ceph-mon-c
2020-11-25 13:07:50.119619 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:50.130860 I | op-mon: mons running: [a b c]
2020-11-25 13:07:50.431341 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:50.431361 I | op-mon: mons created: 3
2020-11-25 13:07:50.734156 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:50.745763 I | op-mon: mons running: [a b c]
2020-11-25 13:07:51.045108 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:51.054497 E | ceph-cluster-controller: failed to reconcile. failed to reconcile cluster "rook-ceph": failed to configure local ceph cluster: failed to create cluster: CANCELLING CURRENT ORCHESTATION
2020-11-25 13:07:52.055208 I | ceph-cluster-controller: reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:52.070690 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:52.088979 I | ceph-cluster-controller: detecting the ceph image version for image ceph/ceph:v15.2.6...
2020-11-25 13:07:53.904811 I | ceph-cluster-controller: detected ceph image version: "15.2.6-0 octopus"
2020-11-25 13:07:53.904862 I | ceph-cluster-controller: validating ceph version from provided image
```
Closes: https://github.com/rook/rook/issues/6587
Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit ad249904)
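A minimal Go sketch of the checkpoint-based cancellation described above; all names are illustrative and the real operator wiring differs:
```go
package cluster

import (
	"errors"
	"sync/atomic"
)

// errOrchestrationCancelled is a hypothetical sentinel error for this sketch.
var errOrchestrationCancelled = errors.New("cancelling current orchestration")

// orchestrator sketches the checkpoint approach: the watcher flips a flag when
// a cluster deletion or upgrade request arrives, and the orchestration checks
// that flag before each major phase (mons, mgr, osds).
type orchestrator struct {
	cancelRequested atomic.Bool
}

// requestCancellation is called by the watcher on a deletion or upgrade request.
func (o *orchestrator) requestCancellation() { o.cancelRequested.Store(true) }

// checkCancellation is the "checkpoint" evaluated between orchestration phases.
func (o *orchestrator) checkCancellation() error {
	if o.cancelRequested.Load() {
		return errOrchestrationCancelled
	}
	return nil
}

func (o *orchestrator) orchestrate() error {
	for _, phase := range []func() error{o.reconcileMons, o.reconcileMgr, o.reconcileOSDs} {
		if err := o.checkCancellation(); err != nil {
			return err
		}
		if err := phase(); err != nil {
			return err
		}
	}
	return nil
}

// Placeholder phases; the real loops live in the mon/mgr/osd controllers.
func (o *orchestrator) reconcileMons() error { return nil }
func (o *orchestrator) reconcileMgr() error  { return nil }
func (o *orchestrator) reconcileOSDs() error { return nil }
```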
-
Sébastien Han authored
Since we want to pass a context to it, let's extract the logic into its own predicate. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 70c6752e)
-
Sébastien Han authored
Since the predicate for the CephCluster object will soon move into its own predicate, we need to export isDoNotReconcile so that it can be consumed by the "cluster" package. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 9b9f45b7)
-
Sébastien Han authored
Since we moved to the controller-runtime, events are processed one by one and so are reconciles. This means we won't have multiple orchestrations happening at the same time. Thus removing this code. Also removing one unused variable. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 10dd8e11)
-
- 24 Nov, 2020 1 commit
-
-
mergify[bot] authored
Bump Controller Runtime version to 0.6 (bp #6568)
-
- 20 Nov, 2020 6 commits
-
-
Sébastien Han authored
We must add the finalizer right after the object creation, otherwise the server will later return an error on update saying that the object has been modified. Indeed, it has been modified by the task that updates the status when the object is first created. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 97be23e3)
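A minimal Go sketch of the ordering described above, using recent controller-runtime helpers (the function name is hypothetical; this is not the actual Rook code):
```go
package controller

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// ensureFinalizer adds the finalizer as the very first step of the reconcile,
// before any status update, so a later Update does not fail with a
// "the object has been modified" conflict.
func ensureFinalizer(ctx context.Context, c client.Client, obj client.Object, finalizer string) error {
	if controllerutil.ContainsFinalizer(obj, finalizer) {
		return nil
	}
	controllerutil.AddFinalizer(obj, finalizer)
	// Persist the finalizer immediately; status updates happen afterwards on a
	// freshly fetched copy of the object.
	return c.Update(ctx, obj)
}
```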
-
Arun Kumar Mohan authored
Signed-off-by:
Arun Kumar Mohan <amohan@redhat.com> (cherry picked from commit f5fc08b8)
-
Arun Kumar Mohan authored
Signed-off-by:
Arun Kumar Mohan <amohan@redhat.com> (cherry picked from commit ded16f77)
-
Arun Kumar Mohan authored
Fetched latest lib-bucket-provisioner changes as well. Signed-off-by:
Arun Kumar Mohan <amohan@redhat.com> (cherry picked from commit 65d16bfc)
-
Arun Kumar Mohan authored
Signed-off-by:
Arun Kumar Mohan <amohan@redhat.com> (cherry picked from commit 9ff98995)
-
Arun Kumar Mohan authored
Updating the dependencies' versions to match the newer Operator SDK version v1.x. Signed-off-by:
Arun Kumar Mohan <amohan@redhat.com> (cherry picked from commit 421f340c)
-
- 19 Nov, 2020 6 commits
-
-
Travis Nielsen authored
build: Update release version to v1.5.1
-
Travis Nielsen authored
For the patch release we update the version to v1.5.1 Signed-off-by:
Travis Nielsen <tnielsen@redhat.com>
-
mergify[bot] authored
ceph: update cephcsi to latest v3.1.2 release (bp #6676)
-
Madhu Rajanna authored
Updating cephcsi to v3.1.2, which is the latest bugfix release. Signed-off-by:
Madhu Rajanna <madhupr007@gmail.com> (cherry picked from commit ca3e2385)
-
mergify[bot] authored
ceph: OSD PDB reconciler changes (bp #6497)
-
Santosh Pillai authored
- Creates a single PDB (max-unavailable=1) for all OSDs. This PDB allows one OSD to go down at a given time.
- When a drain is detected, blocking PDBs (max-unavailable=0) will be created for each failure domain that is not being drained, and the main PDB (max-unavailable=1) will be deleted. This will allow all the OSDs in the currently drained failure domain to be removed while blocking the deletion of OSDs in other failure domains.
- Once the PGs are healthy again, the blocking PDBs will be deleted and the main PDB will be restored.
- Add a PG healthcheck timeout.
- Delete any legacy node drain pods and blocking OSD PDBs.
Signed-off-by:
Santosh Pillai <sapillai@redhat.com> (cherry picked from commit 8602b9c1)
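A minimal Go sketch of the main "allow one OSD down" budget described above, built with the Kubernetes policy/v1 API; the name, namespace, and labels are illustrative, not the actual Rook resources:
```go
package osd

import (
	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// mainOSDPDB sketches the single PDB covering all OSDs. During a drain this
// budget would be deleted and replaced by blocking (maxUnavailable=0) PDBs for
// the failure domains that are NOT being drained.
func mainOSDPDB(namespace string) *policyv1.PodDisruptionBudget {
	maxUnavailable := intstr.FromInt(1)
	return &policyv1.PodDisruptionBudget{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "rook-ceph-osd", // illustrative name
			Namespace: namespace,
		},
		Spec: policyv1.PodDisruptionBudgetSpec{
			// Allow exactly one OSD to be unavailable at a time.
			MaxUnavailable: &maxUnavailable,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "rook-ceph-osd"},
			},
		},
	}
}
```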
-
- 18 Nov, 2020 10 commits
-
-
mergify[bot] authored
ci: fix device intermittent failure (bp #6666)
-
mergify[bot] authored
ceph: add snapshot scheduling for mirrored pools (bp #6553)
-
Sébastien Han authored
When we are done creating the partitions it's good to give the kernel some time to reprobe the device and for udev to finish syncing up. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit 38e47560)
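An illustrative Go sketch (not the actual CI change) of one way to wait for the kernel and udev to settle after partitioning:
```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// settleAfterPartitioning waits for udev to finish processing pending events
// and gives the kernel a moment to reprobe the device before it is used.
func settleAfterPartitioning() {
	// "udevadm settle" blocks until the udev event queue is empty.
	if err := exec.Command("udevadm", "settle").Run(); err != nil {
		log.Printf("udevadm settle failed: %v", err)
	}
	// Small extra grace period for the kernel to reread the partition table.
	time.Sleep(5 * time.Second)
}

func main() {
	settleAfterPartitioning()
}
```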
-
Sébastien Han authored
Permissions on the disk might have changed due to the partitions being created, so the CI user is not able to read the device correctly.
Closes: https://github.com/rook/rook/issues/6580
Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit dcd84e63)
-
Sébastien Han authored
Now, we can schedule snapshots on pools from the CephBlockPool CR when the pool is mirrored. It can be enabled like this:
```
mirroring:
  enabled: true
  mode: pool
  snapshotSchedules:
    - interval: 24h # daily snapshots
      startTime: 14:00:00-05:00
```
Multiple schedules are supported since snapshotSchedules is a list. Signed-off-by:
Sébastien Han <seb@redhat.com> (cherry picked from commit afc7ecff)
-
mergify[bot] authored
docs: Clarify helm warning that could delete cluster (bp #6655)
-
mergify[bot] authored
ceph: Restore mon clusterIP if the service is missing (bp #6658)
-
mergify[bot] authored
ceph: update cleanupPolicy design doc (bp #6592)
-
Travis Nielsen authored
In a disaster recovery scenario, the mon service may have been accidentally deleted, while the expected mon endpoint is still found in the mon endpoints configmap. In this case, we recreate the mon service with the same endpoint as before. Signed-off-by:
Travis Nielsen <tnielsen@redhat.com> (cherry picked from commit 66de535a)
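A minimal Go sketch of recreating a mon service with the previously recorded ClusterIP; the names, labels, and ports are illustrative, not the actual Rook objects:
```go
package mon

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// restoredMonService rebuilds a deleted mon service using the ClusterIP saved
// in the mon endpoints configmap, so the stored mon endpoint stays valid.
func restoredMonService(namespace, monName, savedClusterIP string) *corev1.Service {
	return &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "rook-ceph-mon-" + monName,
			Namespace: namespace,
			Labels:    map[string]string{"app": "rook-ceph-mon", "mon": monName},
		},
		Spec: corev1.ServiceSpec{
			// Reuse the previously assigned ClusterIP instead of letting
			// Kubernetes allocate a new one.
			ClusterIP: savedClusterIP,
			Selector:  map[string]string{"app": "rook-ceph-mon", "mon": monName},
			Ports: []corev1.ServicePort{
				{Name: "msgr1", Port: 6789, TargetPort: intstr.FromInt(6789)},
			},
		},
	}
}
```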
-
Santosh Pillai authored
The cleanup policy design doc is not up to date with respect to the latest implementation. This PR updates the design doc. Signed-off-by:
Santosh Pillai <sapillai@redhat.com> (cherry picked from commit cd936b64)
-
- 17 Nov, 2020 4 commits
-
-
Travis Nielsen authored
In the helm chart the CRDs are installed if crds.enabled is set to true. If false, the helm chart will not install them. If it is changed to false during an upgrade, the CRDs will be removed and the cluster will be destroyed. There is no way to prevent this while still being flexible about CRD management, so we make the warnings as clear as possible. Signed-off-by:
Travis Nielsen <tnielsen@redhat.com> (cherry picked from commit ba534ce2)
-
mergify[bot] authored
ceph: support ceph cluster and CSI on multus in different namespace (bp #6396)
-
mergify[bot] authored
ceph: add external script to the container image (bp #6648)
-
mergify[bot] authored
ceph: update ceph quick start doc to use new crds.yaml file (bp #6646)
-