Unverified commit f449e238, authored by Andy Goldstein, committed by GitHub

Merge pull request #313 from Bradamant3/0.7-doc-updates

Add doc changes for 0.7.0
parents dc5bbada 3b8e32fa
# Heptio Ark
[![Build Status][1]][2]
## Overview
Ark gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes. Ark lets you:
* Take backups of your cluster and restore in case of loss.
* Copy cluster resources across cloud providers. NOTE: Cloud volume migrations are not yet supported.
* Replicate your production environment for development and testing environments.
Ark consists of:
* A server that runs on your cluster
* A command-line client that runs locally
## Getting started
The following example sets up the Ark server and client, then backs up and restores a sample application.
For simplicity, the example uses Minio, an S3-compatible storage service that runs locally on your cluster. See [Set up Ark with your cloud provider][3] for how to run on a cloud provider.
### Prerequisites
* Access to a Kubernetes cluster, version 1.7 or later. Version 1.7.5 or later is required to run `ark backup delete`.
* A DNS server on the cluster
* `kubectl` installed
### Download
Clone or fork the Ark repository:
```
git clone git@github.com:heptio/ark.git
```
NOTE: Make sure to check out the appropriate version. We recommend that you check out the latest tagged version. The master branch is under active development and might not be stable.
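For example, to work from the 0.7.0 release (assuming that tag exists in your clone):

```
git checkout v0.7.0
```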
### Set up server
1. Start the server and the local storage service. In the root directory of Ark, run:
```bash
kubectl apply -f examples/common/00-prereqs.yaml
kubectl apply -f examples/minio/
kubectl apply -f examples/common/10-deployment.yaml
```
NOTE: If you get an error about Config creation, wait for a minute, then run the commands again.
1. Deploy the example nginx application:
```bash
kubectl apply -f examples/nginx-app/base.yaml
```
1. Check to see that both the Ark and nginx deployments are successfully created:
```
kubectl get deployments -l component=ark --namespace=heptio-ark
kubectl get deployments --namespace=nginx-example
```
### Install client
For this example, we recommend that you [download a pre-built release][26].
You can also [build from source][7].
Make sure that you install somewhere in your `$PATH`.
### Back up
1. Create a backup for any object that matches the `app=nginx` label selector:
```
ark backup create nginx-backup --selector app=nginx
```
1. Simulate a disaster:
```
kubectl delete namespace nginx-example
```
1. To check that the nginx deployment and service are gone, run:
```
kubectl get deployments --namespace=nginx-example
kubectl get services --namespace=nginx-example
kubectl get namespace/nginx-example
```
You should get no results.
NOTE: You might need to wait for a few minutes for the namespace to be fully cleaned up.
### Restore
1. Run:
```
ark restore create nginx-backup
```
1. Run:
```
ark restore get
```
After the restore finishes, the output looks like the following:
```
NAME BACKUP STATUS WARNINGS ERRORS CREATED SELECTOR
nginx-backup-20170727200524 nginx-backup Completed 0 0 2017-07-27 20:05:24 +0000 UTC <none>
```
NOTE: The restore can take a few moments to finish. During this time, the `STATUS` column reads `InProgress`.
After a successful restore, the `STATUS` column is `Completed`, and `WARNINGS` and `ERRORS` are 0. All objects in the `nginx-example` namespace should be just as they were before you deleted them.
If there are errors or warnings, you can look at them in detail:
```
ark restore describe <RESTORE_NAME>
```
For more information, see [the debugging information][18].
### Clean up
To remove the Kubernetes objects for this example from your cluster, run:
```
kubectl delete -f examples/common/
kubectl delete -f examples/minio/
kubectl delete -f examples/nginx-app/base.yaml
```
## More information
[The documentation][29] provides detailed information about building from source, architecture, extending Ark, and more.
## Troubleshooting
Feedback and discussion is available on [the mailing list][24].
* We welcome pull requests. Feel free to dig through the [issues][4] and jump in.
## Changelog
See [the list of releases][6] to find out about feature changes.
[0]: https://github.com/heptio
[1]: https://travis-ci.org/heptio/ark.svg?branch=master
[2]: https://travis-ci.org/heptio/ark
[3]: /docs/cloud-common.md
[4]: https://github.com/heptio/ark/issues
[5]: https://github.com/heptio/ark/blob/master/CONTRIBUTING.md
[6]: https://github.com/heptio/ark/releases
[7]: /docs/build-from-scratch.md
[8]: https://github.com/heptio/ark/blob/master/CODE_OF_CONDUCT.md
[9]: https://kubernetes.io/docs/setup/
[10]: https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-with-homebrew-on-macos
[11]: https://kubernetes.io/docs/tasks/tools/install-kubectl/#tabset-1
[18]: /docs/debugging-restores.md
[20]: https://kubernetes.io/docs/concepts/api-extension/custom-resources/#customresourcedefinitions
[21]: https://kubernetes.io/docs/concepts/api-extension/custom-resources/#custom-controllers
[22]: https://github.com/coreos/etcd
[23]: /docs/cloud-provider-specifics.md
[24]: http://j.hept.io/ark-list
[25]: http://slack.kubernetes.io/
[26]: https://github.com/heptio/ark/releases
[27]: /docs/hooks.md
[28]: /docs/plugins.md
[29]: https://heptio.github.io/ark/
# Table of Contents
## User Guide
* [Concepts][1]
* [Build from scratch][0]
* [Cloud provider specifics][9]
* [Debugging restores][4]
* [FAQ][10]
## Reference
* [CLI reference][2]
* [Config definition][5]
* [Output file format][6]
* [Sample YAML files][3]
## Scenarios
* [Disaster recovery][7]
* [Cluster migration][8]
[0]: build-from-scratch.md
[1]: concepts.md
[2]: cli-reference
[3]: /examples
[4]: debugging-restores.md
[5]: config-definition.md
[6]: output-file-format.md
[7]: use-cases.md#disaster-recovery
[8]: use-cases.md#cluster-migration
[9]: cloud-provider-specifics.md
[10]: faq.md
# About Heptio Ark
Heptio Ark provides customizable degrees of recovery for all Kubernetes objects (Pods, Deployments, Jobs, Custom Resource Definitions, etc.), as well as for persistent volumes. This recovery can be cluster-wide, or fine-tuned according to object type, namespace, or labels.
Ark is ideal for the disaster recovery use case, as well as for snapshotting your application state, prior to performing system operations on your cluster (e.g. upgrades).
## Features
Ark provides the following operations:
* On-demand backups
* Scheduled backups
* Restores
Each operation is a custom resource, defined with a Kubernetes [Custom Resource Definition (CRD)][20] and stored in [etcd][22]. An additional custom resource, Config, specifies required information and customized options, such as cloud provider settings. These resources are handled by [custom controllers][21] when their corresponding requests are submitted to the Kubernetes API server.
Each controller watches its custom resource for API requests (Ark operations), performs validations, and handles the logic for interacting with the cloud provider API -- for example, managing object storage and persistent volumes.
### On-demand backups
The **backup** operation:
1. Uploads a tarball of copied Kubernetes objects into cloud object storage.
1. Calls the cloud provider API to make disk snapshots of persistent volumes, if specified.
You can optionally specify hooks to be executed during the backup. For example, you might
need to tell a database to flush its in-memory buffers to disk before taking a snapshot. [More about hooks][10].
Note that cluster backups are not strictly atomic. If Kubernetes objects are being created or edited at the time of backup, they might not be included in the backup. The odds of capturing inconsistent information are low, but it is possible.
### Scheduled backups
The **schedule** operation allows you to back up your data at recurring intervals. The first backup is performed when the schedule is first created, and subsequent backups happen at the schedule's specified interval. These intervals are specified by a Cron expression.
A Schedule acts as a wrapper for Backups; when triggered, it creates them behind the scenes.
Scheduled backups are saved with the name `<SCHEDULE NAME>-<TIMESTAMP>`, where `<TIMESTAMP>` is formatted as *YYYYMMDDhhmmss*.
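As an example, here is a sketch of creating a schedule with the Ark CLI; the name, cron expression, and selector are illustrative:

```
ark schedule create daily-nginx --schedule "0 7 * * *" --selector app=nginx
```

This would produce backups with names like `daily-nginx-20180101070000`.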
### Restores
The **restore** operation allows you to restore all of the objects and persistent volumes from a previously created Backup. Heptio Ark supports multiple namespace remapping--for example, in a single restore, objects in namespace "abc" can be recreated under namespace "def", and the ones in "123" under "456".
Kubernetes objects that have been restored can be identified with a label that looks like `ark-restore=<BACKUP NAME>-<TIMESTAMP>`, where `<TIMESTAMP>` is formatted as *YYYYMMDDhhmmss*.
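For example, to list everything that a particular restore recreated in a namespace (a sketch; substitute your own backup name and timestamp):

```
kubectl get all --namespace nginx-example -l ark-restore=nginx-backup-20170727200524
```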
You can also run the Ark server in restore-only mode, which disables backup, schedule, and garbage collection functionality during disaster recovery.
## Backup workflow
Here's what happens when you run `ark backup create test-backup`:
1. The Ark client makes a call to the Kubernetes API server to create a `Backup` object.
1. The `BackupController` notices the new `Backup` object and performs validation.
1. The `BackupController` begins the backup process. It collects the data to back up by querying the API server for resources.
1. The `BackupController` makes a call to the object storage service -- for example, AWS S3 -- to upload the backup file.
By default, `ark backup create` makes disk snapshots of any persistent volumes. You can adjust the snapshots by specifying additional flags. See [the CLI help][30] for more information. Snapshots can be disabled with the option `--snapshot-volumes=false`.
![19]
## Set a backup to expire
When you create a backup, you can specify a TTL by adding the flag `--ttl <DURATION>`. If Ark sees that an existing Backup resource is expired, it removes the following (see the example after this list):
* The Backup resource
* The backup file from cloud object storage
* All PersistentVolume snapshots
* All associated Restores
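For example, a sketch of creating a backup that Ark treats as expired after 72 hours (the backup name is illustrative):

```
ark backup create nginx-backup --ttl 72h0m0s
```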
## Object storage sync
Heptio Ark treats object storage as the source of truth. It continuously checks to see that the correct Backup resources are always present. If there is a properly formatted backup file in the storage bucket, but no corresponding Backup resources in the Kubernetes API, Ark synchronizes the information from object storage to Kubernetes.
This allows restore functionality to work in a cluster migration scenario, where the original Backup objects do not exist in the new cluster. See the tutorials for details.
[19]: /img/backup-process.png
[30]: https://github.com/heptio/ark/blob/master/docs/cli-reference/ark_create_backup.md
```yaml
kind: Backup
metadata:
  # Backup name. May be any valid Kubernetes object name. Required.
  name: a
  # Backup namespace. Required. In version 0.7.0 and later, can be any string. Must be the namespace of the Ark server.
  namespace: heptio-ark
# Parameters about the backup. Required.
spec:
  # ...
```
# Run Ark on AWS
To set up Ark on AWS, you:
* Create your S3 bucket
* Create an AWS IAM user for Ark
* Configure the server
* Create a Secret for your credentials
If you do not have the `aws` CLI locally installed, follow the [user guide][5] to set it up.
## Create S3 bucket
Heptio Ark requires an object storage bucket to store backups in. Create an S3 bucket, replacing placeholders appropriately:
```bash
aws s3api create-bucket \
--bucket <YOUR_BUCKET> \
--region <YOUR_REGION> \
--create-bucket-configuration LocationConstraint=<YOUR_REGION>
```
NOTE: us-east-1 does not support a `LocationConstraint`. If your region is `us-east-1`, omit the bucket configuration:
```bash
aws s3api create-bucket \
--bucket <YOUR_BUCKET> \
--region us-east-1
```
## Create IAM user
For more information, see [the AWS documentation on IAM users][14].
1. Create the IAM user:
```bash
aws iam create-user --user-name heptio-ark
```
2. Attach policies to give `heptio-ark` the necessary permissions:
```bash
aws iam attach-user-policy \
--policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess \
--user-name heptio-ark
aws iam attach-user-policy \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess \
--user-name heptio-ark
```
3. Create an access key for the user:
```bash
aws iam create-access-key --user-name heptio-ark
```
The result should look like:
```json
{
"AccessKey": {
"UserName": "heptio-ark",
"Status": "Active",
"CreateDate": "2017-07-31T22:24:41.576Z",
"SecretAccessKey": <AWS_SECRET_ACCESS_KEY>,
"AccessKeyId": <AWS_ACCESS_KEY_ID>
}
}
```
4. Create an Ark-specific credentials file (`credentials-ark`) in your local directory:
```
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
```
where the access key id and secret are the values returned from the `create-access-key` request.
## Credentials and configuration
In the Ark root directory, run the following to first set up namespaces, RBAC, and other scaffolding. To run in a custom namespace, make sure that you have edited the YAML files to specify the namespace. See [Run in custom namespace][0].
```bash
kubectl apply -f examples/common/00-prereqs.yaml
```
Create a Secret. In the directory of the credentials file you just created, run:
```bash
kubectl create secret generic cloud-credentials \
--namespace <ARK_NAMESPACE> \
--from-file cloud=credentials-ark
```
Specify the following values in the example files:
* In `examples/aws/00-ark-config.yaml`:
* Replace `<YOUR_BUCKET>` and `<YOUR_REGION>`. See the [Config definition][6] for details, and the sketch after this list.
* In `examples/common/10-deployment.yaml`:
* Make sure that `spec.template.spec.containers[*].env.name` is "AWS_SHARED_CREDENTIALS_FILE".
* (Optional) If you run the nginx example, in file `examples/nginx-app/with-pv.yaml`:
* Replace `<YOUR_STORAGE_CLASS_NAME>` with `gp2`. This is AWS's default `StorageClass` name.
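As a reference point, the resulting Config in `examples/aws/00-ark-config.yaml` looks roughly like the following sketch. Treat the [Config definition][6] as authoritative, and replace the placeholders with your own values:

```yaml
apiVersion: ark.heptio.com/v1
kind: Config
metadata:
  # Must match the namespace of the Ark server
  namespace: heptio-ark
  name: default
# EBS disk snapshots of persistent volumes are taken in this region
persistentVolumeProvider:
  name: aws
  config:
    region: <YOUR_REGION>
# Backup tarballs are stored in this S3 bucket
backupStorageProvider:
  name: aws
  bucket: <YOUR_BUCKET>
  config:
    region: <YOUR_REGION>
```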
## Start the server
In the root of your Ark directory, run:
```bash
kubectl apply -f examples/aws/00-ark-config.yaml
kubectl apply -f examples/common/10-deployment.yaml
```
[0]: /namespace.md
[6]: /config-definition.md#aws
[14]: http://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html
# Run Ark on Azure
To configure Ark on Azure, you:
* Create your Azure storage account and blob container
* Create Azure service principal for Ark
* Configure the server
* Create a Secret for your credentials
If you do not have the `az` Azure CLI 2.0 installed locally, follow the [install guide][18] to set it up.
Run:
```bash
az login
```
## Kubernetes cluster prerequisites
Ensure that the VMs for your agent pool allow Managed Disks. If I/O performance is critical,
consider using Premium Managed Disks, which are SSD backed.
## Create Azure storage account and blob container
Heptio Ark requires a storage account and blob container in which to store backups.
Create the storage account and blob container, and store the storage account access key in the `AZURE_STORAGE_KEY` environment variable (using `az storage account keys list ... -o tsv`).
## Create service principal
To integrate Ark with Azure, you must create an Ark-specific [service principal][17]. Note that seven environment variables must be set for Ark to work properly.
1. Obtain your Azure Account Subscription ID and Tenant ID:
```bash
AZURE_SUBSCRIPTION_ID=`az account list --query '[?isDefault].id' -o tsv`
AZURE_TENANT_ID=`az account list --query '[?isDefault].tenantId' -o tsv`
```
1. Set the name of the Resource Group that contains your Kubernetes cluster.
```bash
# Make sure this is the name of the second resource group. See warning.
AZURE_RESOURCE_GROUP=<NAME_OF_RESOURCE_GROUP_2>
```
WARNING: `AZURE_RESOURCE_GROUP` must be set to the name of the second resource group that is created when you provision your cluster in Azure. Your cluster is provisioned in the resource group that you specified when you created the cluster. Your disks, however, are provisioned in the second resource group.
If you are unsure of the Resource Group name, run `az group list` to get a list that you can select from, then set the `AZURE_RESOURCE_GROUP` environment variable to the appropriate value.
Get your cluster's Resource Group name from the `ResourceGroup` value in the response, and use it to set `$AZURE_RESOURCE_GROUP`. (Also note the `Location` value in the response -- this is later used in the Azure-specific portion of the Ark Config).
1. Create a service principal with `Contributor` role. This will have subscription-wide access, so protect this credential. You can specify a password or let the `az ad sp create-for-rbac` command create one for you.
```bash
# Create service principal and specify your own password
AZURE_CLIENT_SECRET=super_secret_and_high_entropy_password_replace_me_with_your_own
az ad sp create-for-rbac --name "heptio-ark" --role "Contributor" --password $AZURE_CLIENT_SECRET
# Or create service principal and let the CLI generate a password for you. Make sure to capture the password.
AZURE_CLIENT_SECRET=`az ad sp create-for-rbac --name "heptio-ark" --role "Contributor" --query 'password' -o tsv`
# After creating the service principal, obtain the client id
AZURE_CLIENT_ID=`az ad sp list --display-name "heptio-ark" --query '[0].appId' -o tsv`
```
## Credentials and configuration
In the Ark root directory, run the following to first set up namespaces, RBAC, and other scaffolding. To run in a custom namespace, make sure that you have edited the YAML file to specify the namespace. See [Run in custom namespace][0].
```bash
kubectl apply -f examples/common/00-prereqs.yaml
```

Now you need to create a Secret that contains all the seven environment variables:
```bash
kubectl create secret generic cloud-credentials \
--namespace <ARK_NAMESPACE> \
--from-literal AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID} \
--from-literal AZURE_TENANT_ID=${AZURE_TENANT_ID} \
--from-literal AZURE_RESOURCE_GROUP=${AZURE_RESOURCE_GROUP} \
    ...
```

You can get a complete list of Azure locations with the following command:

```bash
az account list-locations --query "sort([].displayName)" -o tsv
```
## Start the server
In the root of your Ark directory, run:
```bash
kubectl apply -f examples/azure/
```
[0]: /namespace.md
[8]: config-definition.md#azure
[17]: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-application-objects
[18]: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli
# Build from source
While the [README][0] pulls from the Heptio image registry, you can also build your own Heptio Ark container with the following steps:
* [Prerequisites][1]
* [Download][2]
* [Build][3]
* [Test][12]
* [Run][7]
* [Vendoring dependencies][10]
## Prerequisites
* Access to a Kubernetes cluster, version 1.7 or later. Version 1.7.5 or later is required to run `ark backup delete`.
* A DNS server on the cluster
* `kubectl` installed
* [Go][5] installed (minimum version 1.8)
## Download
Install with go:
```
go get github.com/heptio/ark
```
The files are installed in `$GOPATH/src/github.com/heptio/ark`.
## Build
You can build your Ark image locally on the machine where you run your cluster, or you can push it to a private registry. This section covers both workflows.
Set the `$REGISTRY` environment variable (used in the `Makefile`) to push the Heptio Ark images to your own registry. This allows any node in your cluster to pull your locally built image.
`$PROJECT` and `$VERSION` environment variables are also specified in the `Makefile`, and can be similarly modified as desired.
Run the following in the Ark root directory to build your container with the tag `$REGISTRY/$PROJECT:$VERSION`:
```
make container
```
To push your image to a registry, use `make push`.
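For example, a hypothetical invocation that overrides the `Makefile` defaults from the environment (the registry and version values are illustrative):

```
REGISTRY=gcr.io/my-registry VERSION=v0.7.0-custom make container
REGISTRY=gcr.io/my-registry VERSION=v0.7.0-custom make push
```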
### Update generated files
The following files are automatically generated from the source code:
* The clientset
* Listers
* Shared informers
* Documentation
* Protobuf/gRPC types
If you make any of the following changes, you must run `make update` to regenerate
the files:
* Add/edit/remove command line flags and/or their help text
* Add/edit/remove commands or subcommands
* Add new API types
If you make the following change, you must run [generate-proto.sh][13] to regenerate files:
* Add/edit/remove protobuf message or service definitions. These changes require the [proto compiler][14].
### Cross compiling
By default, `make` builds an `ark` binary that runs on your host operating system and architecture.
To build for another platform, run `make build-<GOOS>-<GOARCH>`.
For example, to build for the Mac, run `make build-darwin-amd64`.
All binaries are placed in `_output/bin/<GOOS>/<GOARCH>`-- for example, `_output/bin/darwin/amd64/ark`.
Ark's `Makefile` has a convenience target, `all-build`, that builds the following platforms:
* linux-amd64
* ...

## Test

To run unit tests, run `make test`. Verification checks also ensure that the automatically generated
files (clientset, listers, shared informers, docs) are up to date.
## Run
### Considerations
When running Heptio Ark, you will need to account for the following (all of which are handled in the [`/examples`][6] manifests):
* Appropriate RBAC permissions in the cluster
* Read access for all data from the source cluster and namespaces
* Write access to the target cluster and namespaces
* Cloud provider credentials
* Read/write access to volumes
* Read/write access to object storage for backup data
* A [Config object][8] definition for the Ark server
See [Cloud Provider Specifics][9] for more details.
### Specifying your image
When your Ark deployment is up and running, you must replace the Heptio-provided Ark image with the image that you built. Run:
```
kubectl set image deployment/ark ark=$REGISTRY/$PROJECT:$VERSION
```
where `$REGISTRY`, `$PROJECT`, and `$VERSION` are the values that you built with.
## Vendoring dependencies
If you need to add or update the vendored dependencies, see [Vendoring dependencies][11].
[0]: ../README.md
[1]: #prerequisites
[2]: #download
[3]: #build
[4]: ../README.md#quickstart
[5]: https://golang.org/doc/install
[6]: https://github.com/heptio/ark/tree/master/examples
[7]: #run
[8]: /config-definition.md
[9]: /cloud-common.md
[10]: #vendoring-dependencies
[11]: /vendoring-dependencies.md
[12]: #test
[13]: https://github.com/heptio/ark/blob/master/hack/generate-proto.sh
[14]: https://grpc.io/docs/quickstart/go.html#install-protocol-buffers-v3
# CLI reference
The Ark client provides a CLI that allows you to initiate ad-hoc backups, scheduled backups, or restores.
[The files in the CLI reference directory][1] in the repository enumerate each of the possible `ark` commands and their flags.
This information is available in the CLI, using the `--help` flag.
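For example, to see the flags for creating backups:

```
ark backup create --help
```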
## Running the client
We recommend that you [download a pre-built release][26], but you can also build and run the `ark` executable.
## Kubernetes cluster credentials
In general, Ark will search for your cluster credentials in the following order (see the example after this list):
* `--kubeconfig` command line flag
* `$KUBECONFIG` environment variable
* In-cluster credentials--this only works when you are running Ark in a pod
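For example, a sketch of pointing the client at a specific cluster (the kubeconfig path is illustrative):

```
ark --kubeconfig /path/to/kubeconfig backup get
```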
[1]: https://github.com/heptio/ark/tree/master/docs/cli-reference
[26]: https://github.com/heptio/ark/releases
# Set up Ark with your cloud provider
To run Ark with your cloud provider, you specify provider-specific settings for the Ark server. In version 0.7.0 and later, you can run Ark in any namespace, which requires additional customization. See [Run in custom namespace][3].
The Ark repository includes a set of example YAML files that specify the settings for each cloud provider. For provider-specific instructions, see:
* [Run Ark on AWS][0]
* [Run Ark on GCP][1]
* [Run Ark on Azure][2]
## Examples
After you set up the Ark server, try these examples:
### Basic example (without PersistentVolumes)
1. Start the sample nginx app:
```bash
kubectl apply -f examples/nginx-app/base.yaml
```
1. Create a backup:
```bash
ark backup create nginx-backup --include-namespaces nginx-example
```
1. Simulate a disaster:
```bash
kubectl delete namespaces nginx-example
```
Wait for the namespace to be deleted.
1. Restore your lost resources:
```bash
ark restore create nginx-backup
```
### Snapshot example (with PersistentVolumes)
> NOTE: For Azure, your Kubernetes cluster needs to be version 1.7.2+ to support PV snapshotting of its managed disks.
1. Start the sample nginx app:
```bash
kubectl apply -f examples/nginx-app/with-pv.yaml
```
1. Create a backup with PV snapshotting:
```bash
ark backup create nginx-backup --include-namespaces nginx-example
```
1. Simulate a disaster:
```bash
kubectl delete namespaces nginx-example
```
Because the default [reclaim policy][19] for dynamically-provisioned PVs is "Delete", these commands should trigger your cloud provider to delete the disk backing the PV. The deletion process is asynchronous so this may take some time. **Before continuing to the next step, check your cloud provider to confirm that the disk no longer exists.**
1. Restore your lost resources:
```bash
ark restore create nginx-backup
```
[0]: /aws-config.md
[1]: /gcp-config.md
[2]: /azure-config.md
[3]: /namespace.md
[19]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reclaiming
# Concepts
* [Overview][0]
* [Operation types][1]
* [1. Backups][2]
* [2. Schedules][3]
* [3. Restores][4]
* [API types][8]
* [Expired backup deletion][5]
* [Cloud storage sync][6]
## Overview
Heptio Ark provides customizable degrees of recovery for all Kubernetes API objects (Pods, Deployments, Jobs, Custom Resource Definitions, etc.), as well as for persistent volumes. This recovery can be cluster-wide, or fine-tuned according to object type, namespace, or labels.
Ark is ideal for the disaster recovery use case, as well as for snapshotting your application state, prior to performing system operations on your cluster (e.g. upgrades).
## Operation types
This section gives a quick overview of the Ark operation types.
### 1. Backups
The *backup* operation (1) uploads a tarball of copied Kubernetes resources into cloud object storage and (2) uses the cloud provider API to make disk snapshots of persistent volumes, if specified.
You can optionally specify hooks that should be executed during the backup. For example, you may
need to tell a database to flush its in-memory buffers to disk prior to taking a snapshot. You can
find more information about hooks [here][10].
Some things to be aware of:
* *Cluster backups are not strictly atomic.* If API objects are being created or edited at the time of backup, they may or may not be included in the backup. In practice, backups happen very quickly, so the odds of capturing inconsistent information are low, but it is still possible.
* *A backup usually takes no more than a few seconds.* The snapshotting process for persistent volumes is asynchronous, so the runtime of the `ark backup` command isn't dependent on disk size.
These ad-hoc backups are saved with the name `<BACKUP NAME>` specified during creation.
### 2. Schedules
The *schedule* operation allows you to back up your data at recurring intervals. The first backup is performed when the schedule is first created, and subsequent backups happen at the schedule's specified interval. These intervals are specified by a Cron expression.
A Schedule acts as a wrapper for Backups; when triggered, it creates them behind the scenes.
Scheduled backups are saved with the name `<SCHEDULE NAME>-<TIMESTAMP>`, where `<TIMESTAMP>` is formatted as *YYYYMMDDhhmmss*.
### 3. Restores
The *restore* operation allows you to restore all of the objects and persistent volumes from a previously created Backup. Heptio Ark supports multiple namespace remapping--for example, in a single restore, objects in namespace "abc" can be recreated under namespace "def", and the ones in "123" under "456".
Kubernetes API objects that have been restored can be identified with a label that looks like `ark-restore=<BACKUP NAME>-<TIMESTAMP>`, where `<TIMESTAMP>` is formatted as *YYYYMMDDhhmmss*.
You can also run the Ark server in *restore-only* mode, which disables backup, schedule, and garbage collection functionality during disaster recovery.
## API types
For information about the individual API types Ark uses, please see the [API types reference][9].
## Expired backup deletion
When first creating a backup, you can specify a TTL. If Ark sees that an existing Backup resource has expired, it removes both:
* The Backup resource itself
* The actual backup file from cloud object storage
## Cloud storage sync
Heptio Ark treats object storage as the source of truth. It continuously checks to see that the correct Backup resources are always present. If there is a properly formatted backup file in the storage bucket, but no corresponding Backup resources in the Kubernetes API, Ark synchronizes the information from object storage to Kubernetes.
This allows *restore* functionality to work in a cluster migration scenario, where the original Backup objects do not exist in the new cluster. See the [use case guide][7] for details.
[0]: #overview
[1]: #operation-types
[2]: #1-backups
[3]: #2-schedules
[4]: #3-restores
[5]: #expired-backup-deletion
[6]: #cloud-storage-sync
[7]: use-cases.md#cluster-migration
[8]: #api-types
[9]: api-types/
[10]: hooks.md
# Extend Ark
Ark includes mechanisms for extending the core functionality to meet your individual backup/restore needs:
* [Hooks][27] allow you to specify commands to be executed within running pods during a backup. This is useful if you need to run a workload-specific command prior to taking a backup (for example, to flush disk buffers or to freeze a database).
* [Plugins][28] allow you to develop custom object/block storage back-ends or per-item backup/restore actions that can execute arbitrary logic, including modifying the items being backed up/restored. Plugins can be used by Ark without needing to be compiled into the core Ark binary.
[27]: /hooks.md
[28]: /plugins.md
# Run Ark on GCP
You can run Kubernetes on Google Cloud Platform in either of the following ways:
* Kubernetes on Google Compute Engine virtual machines
* Google Kubernetes Engine
If you do not have the `gcloud` and `gsutil` CLIs locally installed, follow the [user guide][16] to set them up.
## Create GCS bucket
Heptio Ark requires an object storage bucket in which to store backups. Create a GCS bucket, replacing the placeholder appropriately:
```bash
gsutil mb gs://<YOUR_BUCKET>/
```
## Create service account
To integrate Heptio Ark with GCP, create an Ark-specific [Service Account][15]:
1. View your current config settings:
```bash
gcloud config list
```
Store the `project` value from the results in the environment variable `$PROJECT_ID`.
2. Create a service account:
```bash
gcloud iam service-accounts create heptio-ark \
--display-name "Heptio Ark service account"
```
Then list all accounts and find the `heptio-ark` account you just created:
```bash
gcloud iam service-accounts list
```
Set the `$SERVICE_ACCOUNT_EMAIL` variable to match its `email` value.
3. Attach policies to give `heptio-ark` the necessary permissions to function:
```bash
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role roles/compute.storageAdmin
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role roles/storage.admin
```
4. Create a service account key, specifying an output file (`credentials-ark`) in your local directory:
```bash
gcloud iam service-accounts keys create credentials-ark \
--iam-account $SERVICE_ACCOUNT_EMAIL
```
## Credentials and configuration
If you run Google Kubernetes Engine (GKE), make sure that your current IAM user is a cluster-admin. This role is required to create RBAC objects.
See [the GKE documentation][22] for more information.
In the Ark root directory, run the following to first set up namespaces, RBAC, and other scaffolding. To run in a custom namespace, make sure that you have edited the YAML files to specify the namespace. See [Run in custom namespace][0].
```bash
kubectl apply -f examples/common/00-prereqs.yaml
```
Create a Secret. In the directory of the credentials file you just created, run:
```bash
kubectl create secret generic cloud-credentials \
--namespace <ARK_NAMESPACE> \
--from-file cloud=credentials-ark
```
Specify the following values in the example files:
* In file `examples/gcp/00-ark-config.yaml`:
* Replace `<YOUR_BUCKET>` and `<YOUR_PROJECT>`. See the [Config definition][7] for details, and the sketch after this list.
* In file `examples/common/10-deployment.yaml`:
* Change `spec.template.spec.containers[*].env.name` to "GOOGLE_APPLICATION_CREDENTIALS".
* (Optional) If you run the nginx example, in file `examples/nginx-app/with-pv.yaml`:
* Replace `<YOUR_STORAGE_CLASS_NAME>` with `standard`. This is GCP's default `StorageClass` name.
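As a reference point, the resulting Config in `examples/gcp/00-ark-config.yaml` looks roughly like the following sketch. Treat the [Config definition][7] as authoritative, and replace the placeholders with your own values:

```yaml
apiVersion: ark.heptio.com/v1
kind: Config
metadata:
  # Must match the namespace of the Ark server
  namespace: heptio-ark
  name: default
# Disk snapshots of persistent volumes are taken in this project
persistentVolumeProvider:
  name: gcp
  config:
    project: <YOUR_PROJECT>
# Backup tarballs are stored in this GCS bucket
backupStorageProvider:
  name: gcp
  bucket: <YOUR_BUCKET>
```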
## Start the server
In the root of your Ark directory, run:
```bash
kubectl apply -f examples/gcp/00-ark-config.yaml
kubectl apply -f examples/common/10-deployment.yaml
```
[0]: /namespace.md
[7]: /config-definition.md#gcp
[15]: https://cloud.google.com/compute/docs/access/service-accounts
[16]: https://cloud.google.com/sdk/docs/
[22]: https://cloud.google.com/kubernetes-engine/docs/how-to/role-based-access-control#prerequisites_for_using_role-based_access_control
# Run in custom namespace
In Ark version 0.7.0 and later, you can run Ark in any namespace. To do so, you specify the namespace in the YAML files that configure the Ark server. You then also specify the namespace when you run Ark client commands.
## Edit the example files
The Ark repository includes [a set of examples][0] that you can use to set up your Ark server. The examples specify only the default `heptio-ark` namespace. To run in another namespace, you edit the relevant files to specify your custom namespace.
For all cloud providers, edit `https://github.com/heptio/ark/blob/master/examples/common/00-prereqs.yaml`. This file defines:
* CustomResourceDefinitions for the Ark objects (backups, schedules, restores, configs, downloadrequests)
* The Ark namespace
* The Ark service account
* The RBAC rules to grant permissions to the Ark service account
### AWS
For AWS, edit:
* `https://github.com/heptio/ark/blob/master/examples/common/10-deployment.yaml`
* `https://github.com/heptio/ark/blob/master/examples/aws/00-ark-config.yaml`
### GCP
For GCP, edit:
* `https://github.com/heptio/ark/blob/master/examples/common/10-deployment.yaml`
* `https://github.com/heptio/ark/blob/master/examples/gcp/00-ark-config.yaml`
### Azure
For Azure, edit:
* `https://github.com/heptio/ark/blob/master/examples/azure/00-ark-deployment.yaml`
* `https://github.com/heptio/ark/blob/master/examples/azure/10-ark-config.yaml`
## Specify the namespace in client commands
To specify the namespace for all Ark client commands, run:
```
ark client config set namespace=<NAMESPACE_VALUE>
```
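For example, assuming the Ark server runs in a hypothetical `backup-system` namespace:

```
ark client config set namespace=backup-system
ark backup get
```

After the first command, subsequent client commands such as `ark backup get` operate against that namespace.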
[0]: https://github.com/heptio/ark/tree/master/examples
# Output file format
A backup is a gzip-compressed tar file whose name matches the Backup API resource's `metadata.name` (what is specified during `ark backup create <NAME>`).
In cloud object storage, each backup file is stored in its own subdirectory in the bucket specified in the Ark server configuration. This subdirectory includes an additional file called `ark-backup.json`. The JSON file lists all information about your associated Backup resource, including any default values. This gives you a complete historical record of the backup configuration. The JSON file also specifies `status.version`, which corresponds to the output file format.
The directory structure in your cloud storage looks something like:
```
rootBucket/
    backup1234/
        ark-backup.json
        backup1234.tar.gz
```
## Example backup JSON file
```
{
"kind": "Backup",
  ...
}
```
## Cluster migration

Heptio Ark can help you port your resources from one cluster to another, as long as each cluster's Ark Config points at the same cloud object storage location. To migrate:

1. *(Cluster 1)* Create a backup of your entire cluster with `ark backup create <BACKUP-NAME>`.
2. *(Cluster 2)* Make sure that the `persistentVolumeProvider` and `backupStorageProvider` fields in the Ark Config match the ones from *Cluster 1*, so that your new Ark server instance is pointing to the same bucket.
3. *(Cluster 2)* Make sure that the Ark Backup object has been created. Ark resources are synced with the backup files available in cloud storage.
4. *(Cluster 2)* Once you have confirmed that the right Backup (`<BACKUP-NAME>`) is now present, you can restore everything with:
```
ark restore create <BACKUP-NAME>
```
[0]: #disaster-recovery
[1]: #cluster-migration
[2]: concepts.md#cloud-storage-sync
[3]: /config-definition.md#main-config-parameters