Commit 202650f6 authored by Jared Watts

Add and remove storage nodes

parent 378ec903
Showing with 90 additions and 25 deletions
......@@ -38,9 +38,15 @@ For more details on the mons and when to choose a number other than `3`, see the
 - [storage selection settings](#storage-selection-settings)
 - [storage configuration settings](#storage-configuration-settings)
+
+#### Node updates
+Nodes can be added and removed over time by updating the Cluster CRD, for example with `kubectl -n rook edit cluster rook`.
+This will bring up your default text editor and allow you to add and remove storage nodes from the cluster.
+This feature is only available when `useAllNodes` has been set to `false`.

 ### Node settings
-In addition to the cluster level settings specified above, each individual node can also specify configuration to override the cluster level settings and defaults. If a node does not specify any configuration then it will inherit the cluster level settings.
+In addition to the cluster level settings specified above, each individual node can also specify configuration to override the cluster level settings and defaults.
+If a node does not specify any configuration then it will inherit the cluster level settings.
 - `name`: The name of the node, which should match its `kubernetes.io/hostname` label.
 - `devices`: A list of individual device names belonging to this node to include in the storage cluster.
   - `name`: The name of the device (e.g., `sda`).
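For example, a cluster spec that explicitly lists its storage nodes might look like the following. This is a hedged sketch: the node and device names are hypothetical and the `apiVersion`/`kind` are assumptions; only the `useAllNodes`, `nodes`, `name`, and `devices` fields come from the settings described above.

```yaml
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook
spec:
  dataDirHostPath: /var/lib/rook
  storage:
    useAllNodes: false
    nodes:
    - name: node1          # must match the node's kubernetes.io/hostname label
      devices:
      - name: "sda"
    - name: node2
      devices:
      - name: "sdb"
      - name: "sdc"
```

Adding or removing an entry under `nodes` (e.g., via `kubectl -n rook edit cluster rook`) is what triggers the operator to add or remove that node's storage.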
......
......@@ -3,6 +3,7 @@
 ## Action Required

 ## Notable Features
+- The Cluster CRD can now be edited/updated to add and remove storage nodes. Note that only adding/removing entire nodes is currently supported; adding/removing individual disks and directories will be supported soon.
 - Monitoring is now done through the Ceph MGR service for Ceph storage.
 - The CRUSH root can be specified for pools with the `crushRoot` property, rather than always using the `default` root. The CRUSH hierarchy for that root must first be configured with the `ceph osd crush` commands in the toolbox.
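As an illustrative sketch only (the pool name and CRUSH root are hypothetical, and the surrounding Pool CRD fields are assumptions; only `crushRoot` itself comes from the note above):

```yaml
apiVersion: rook.io/v1alpha1
kind: Pool
metadata:
  name: replicapool
  namespace: rook
spec:
  crushRoot: ssdroot   # must already exist in the CRUSH hierarchy
  replicated:
    size: 3
```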
......
......@@ -42,6 +42,7 @@ rules:
   - list
   - watch
   - create
+  - update
   - delete
 - apiGroups:
   - apiextensions.k8s.io
......
......@@ -24,6 +24,8 @@ import (
"github.com/rook/rook/pkg/daemon/ceph/mon"
"github.com/rook/rook/pkg/daemon/ceph/osd"
"github.com/rook/rook/pkg/operator/cluster"
oposd "github.com/rook/rook/pkg/operator/cluster/ceph/osd"
osdcfg "github.com/rook/rook/pkg/operator/cluster/ceph/osd/config"
"github.com/rook/rook/pkg/operator/k8sutil"
"github.com/rook/rook/pkg/util/flags"
"github.com/spf13/cobra"
......@@ -51,10 +53,10 @@ func addOSDFlags(command *cobra.Command) {
 	command.Flags().StringVar(&cfg.nodeName, "node-name", os.Getenv("HOSTNAME"), "the host name of the node")

 	// OSD store config flags
-	command.Flags().IntVar(&cfg.storeConfig.WalSizeMB, "osd-wal-size", osd.WalDefaultSizeMB, "default size (MB) for OSD write ahead log (WAL) (bluestore)")
-	command.Flags().IntVar(&cfg.storeConfig.DatabaseSizeMB, "osd-database-size", osd.DBDefaultSizeMB, "default size (MB) for OSD database (bluestore)")
-	command.Flags().IntVar(&cfg.storeConfig.JournalSizeMB, "osd-journal-size", osd.JournalDefaultSizeMB, "default size (MB) for OSD journal (filestore)")
-	command.Flags().StringVar(&cfg.storeConfig.StoreType, "osd-store", osd.DefaultStore, "type of backing OSD store to use (bluestore or filestore)")
+	command.Flags().IntVar(&cfg.storeConfig.WalSizeMB, "osd-wal-size", osdcfg.WalDefaultSizeMB, "default size (MB) for OSD write ahead log (WAL) (bluestore)")
+	command.Flags().IntVar(&cfg.storeConfig.DatabaseSizeMB, "osd-database-size", osdcfg.DBDefaultSizeMB, "default size (MB) for OSD database (bluestore)")
+	command.Flags().IntVar(&cfg.storeConfig.JournalSizeMB, "osd-journal-size", osdcfg.JournalDefaultSizeMB, "default size (MB) for OSD journal (filestore)")
+	command.Flags().StringVar(&cfg.storeConfig.StoreType, "osd-store", osdcfg.DefaultStore, "type of backing OSD store to use (bluestore or filestore)")
 }

 func init() {
......@@ -110,8 +112,15 @@ func startOSD(cmd *cobra.Command, args []string) error {
 	agent := osd.NewAgent(context, dataDevices, usingDeviceFilter, cfg.metadataDevice, cfg.directories, forceFormat,
 		crushLocation, cfg.storeConfig, &clusterInfo, cfg.nodeName, kv)
-	err = osd.Run(context, agent)
+	err = osd.Run(context, agent, nil)
 	if err != nil {
+		// something failed in the OSD orchestration, update the status map with failure details
+		status := oposd.OrchestrationStatus{
+			Status:  oposd.OrchestrationStatusFailed,
+			Message: err.Error(),
+		}
+		oposd.UpdateOrchestrationStatusMap(clientset, clusterInfo.Name, cfg.nodeName, status)
 		terminateFatal(err)
 	}
......
......@@ -163,13 +163,14 @@ Of special note for removing storage is that a check should be performed to ensu
 If the cluster does not have enough space for this (e.g., it would hit the `full` ratio), then the removal should not proceed.

 For each OSD to remove, the following steps should be performed:
-* mark the OSD as `out` with `ceph osd out <osd.id>`, which will trigger data migration from the OSD.
+* reweight the OSD to 0.0 with `ceph osd crush reweight osd.<id> 0.0`, which will trigger data migration from the OSD.
 * wait for all data to finish migrating from the OSD, meaning all placement groups return to the `active+clean` state
+* mark the OSD as `out` with `ceph osd out osd.<id>`
 * stop the OSD process and remove it from monitoring
-* remove the OSD from the CRUSH map: `ceph osd crush remove <osd.id>`
-* delete the OSD's auth info: `ceph auth del <osd.id>`
-* delete the OSD from the cluster: `ceph osd rm <osd.id>`
-* delete the OSD directory from local storage (if using `dataDirHostPath`): `rm -fr /var/lib/rook/<osd.id>`
+* remove the OSD from the CRUSH map: `ceph osd crush remove osd.<id>`
+* delete the OSD's auth info: `ceph auth del osd.<id>`
+* delete the OSD from the cluster: `ceph osd rm osd.<id>`
+* delete the OSD directory from local storage (if using `dataDirHostPath`): `rm -fr /var/lib/rook/<osdID>`
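Concretely, the updated sequence for a hypothetical `osd.3` would read as follows (a sketch; the OSD id is illustrative and the wait step is shown as a comment):

```console
ceph osd crush reweight osd.3 0.0
# wait until all PGs report active+clean, e.g. by watching `ceph status`
ceph osd out osd.3
# stop the osd.3 process and remove it from monitoring, then:
ceph osd crush remove osd.3
ceph auth del osd.3
ceph osd rm osd.3
```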
If the entire node is being removed, ensure that the host node is also removed from the CRUSH map:
```console
......
......@@ -124,3 +124,18 @@ func resolveInt(setting *int, parent, defaultVal int) {
 func newBool(val bool) *bool {
 	return &val
 }
+
+// NodesByName implements an interface to sort nodes by name
+type NodesByName []Node
+
+func (s NodesByName) Len() int {
+	return len(s)
+}
+
+func (s NodesByName) Swap(i, j int) {
+	s[i], s[j] = s[j], s[i]
+}
+
+func (s NodesByName) Less(i, j int) bool {
+	return s[i].Name < s[j].Name
+}
......@@ -33,7 +33,8 @@ import (
 type Cluster struct {
 	metav1.TypeMeta   `json:",inline"`
 	metav1.ObjectMeta `json:"metadata"`
-	Spec              ClusterSpec `json:"spec"`
+	Spec              ClusterSpec   `json:"spec"`
+	Status            ClusterStatus `json:"status,omitempty"`
 }
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
......@@ -67,6 +68,20 @@ type ClusterSpec struct {
 	Resources ResourceSpec `json:"resources,omitempty"`
 }
+
+type ClusterStatus struct {
+	State   ClusterState `json:"state,omitempty"`
+	Message string       `json:"message,omitempty"`
+}
+
+type ClusterState string
+
+const (
+	ClusterStateCreating ClusterState = "Creating"
+	ClusterStateCreated  ClusterState = "Created"
+	ClusterStateUpdating ClusterState = "Updating"
+	ClusterStateError    ClusterState = "Error"
+)
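For illustration, an operator's reconcile flow might record transitions with a small helper like this (hypothetical; the helper and call sites are not part of this commit, only the types above are):

```go
// setClusterState is a hypothetical convenience wrapper around the new
// Status field; persisting the updated object back to the API server
// would still be the caller's job.
func setClusterState(c *Cluster, state ClusterState, msg string) {
	c.Status = ClusterStatus{State: state, Message: msg}
}

// e.g. around a node add/remove operation:
//   setClusterState(cluster, ClusterStateUpdating, "adding storage node node1")
//   ... orchestrate OSDs ...
//   setClusterState(cluster, ClusterStateCreated, "")
```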
 type ResourceSpec struct {
 	API v1.ResourceRequirements `json:"api,omitempty"`
 	Mgr v1.ResourceRequirements `json:"mgr,omitempty"`
......
// +build !ignore_autogenerated
 /*
-Copyright 2017 The Kubernetes Authors All rights reserved.
+Copyright 2018 The Kubernetes Authors.

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
......@@ -47,6 +47,7 @@ func (in *Cluster) DeepCopyInto(out *Cluster) {
 	out.TypeMeta = in.TypeMeta
 	in.ObjectMeta.DeepCopyInto(&out.ObjectMeta)
 	in.Spec.DeepCopyInto(&out.Spec)
+	out.Status = in.Status
 	return
 }
......@@ -122,6 +123,22 @@ func (in *ClusterSpec) DeepCopy() *ClusterSpec {
 	return out
 }
+
+// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
+func (in *ClusterStatus) DeepCopyInto(out *ClusterStatus) {
+	*out = *in
+	return
+}
+
+// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ClusterStatus.
+func (in *ClusterStatus) DeepCopy() *ClusterStatus {
+	if in == nil {
+		return nil
+	}
+	out := new(ClusterStatus)
+	in.DeepCopyInto(out)
+	return out
+}

 // DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
 func (in *Config) DeepCopyInto(out *Config) {
 	*out = *in
......