This project is mirrored from https://gitee.com/mirrors/nomad.git.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
- 13 Mar, 2020 40 commits
-
Tim Gross authored
This changeset provides two basic e2e tests for CSI plugins targeting common AWS use cases.

The EBS test launches the EBS plugin (controller + nodes) and registers an EBS volume as a Nomad CSI volume. We deploy a job that writes to the volume, stop that job, and reuse the volume for another job which should be able to read the data written by the first job.

The EFS test launches the EFS plugin (nodes-only) and registers an EFS volume as a Nomad CSI volume. We deploy a job that writes to the volume, and share the volume with another job which should be able to read the data written by the first job.
-
Tim Gross authored
Nomad clients will push node updates during client restart which can cause an extra claim for a volume by the same alloc. If an alloc already claims a volume, we can allow it to be treated as a valid claim and continue.
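The idempotent-claim behavior described above can be sketched as follows. This is a minimal illustration, not Nomad's actual code: the `CSIVolume` struct and `claim` method here are simplified stand-ins for the real types in `nomad/structs`.

```go
package main

import "fmt"

// CSIVolume is a simplified stand-in for Nomad's volume struct; the
// field names here are illustrative, not Nomad's actual API.
type CSIVolume struct {
	ID         string
	ReadAllocs map[string]bool // alloc IDs holding claims
}

// claim treats a repeated claim by the same alloc as a no-op success,
// so a node update replayed during client restart cannot double-claim.
func (v *CSIVolume) claim(allocID string) error {
	if v.ReadAllocs[allocID] {
		return nil // already claimed by this alloc: treat as valid
	}
	v.ReadAllocs[allocID] = true
	return nil
}

func main() {
	v := &CSIVolume{ID: "vol-1", ReadAllocs: map[string]bool{}}
	_ = v.claim("alloc-1")
	_ = v.claim("alloc-1") // replayed claim from a restarted client
	fmt.Println(len(v.ReadAllocs)) // still a single claim entry
}
```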
-
Tim Gross authored
In order to correctly fingerprint dynamic plugins on client restarts, we need to persist a handle to the plugin (that is, connection info) to the client state store. The dynamic registry will sync automatically to the client state whenever it receives a register/deregister call.
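The sync-on-register pattern might look like the sketch below, assuming a persistence callback standing in for the client state store write; the `PluginInfo` and `Registry` names are hypothetical, not the real `client/dynamicplugins` types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// PluginInfo holds the connection handle needed to re-fingerprint a
// dynamic plugin after a client restart. Names are illustrative.
type PluginInfo struct {
	Name       string `json:"name"`
	SocketPath string `json:"socket_path"`
}

// Registry syncs on every register call via a persistence callback,
// standing in for the client state store write.
type Registry struct {
	mu      sync.Mutex
	plugins map[string]PluginInfo
	persist func(state []byte) error
}

func (r *Registry) Register(p PluginInfo) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.plugins[p.Name] = p
	return r.sync() // automatic sync to client state
}

func (r *Registry) sync() error {
	blob, err := json.Marshal(r.plugins)
	if err != nil {
		return err
	}
	return r.persist(blob)
}

func main() {
	var saved []byte
	r := &Registry{
		plugins: map[string]PluginInfo{},
		persist: func(b []byte) error { saved = b; return nil },
	}
	_ = r.Register(PluginInfo{Name: "ebs", SocketPath: "/csi/ebs.sock"})
	fmt.Println(len(saved) > 0) // handle persisted for restart recovery
}
```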
-
Lang Martin authored
* nomad/structs/csi: new RemoteID() uses the ExternalID if set
* nomad/csi_endpoint: pass RemoteID to volume request types
* client/pluginmanager/csimanager/volume: pass RemoteID to NodePublishVolume
-
Tim Gross authored
Fix some docstring typos and fix noisy log message during client restarts. A log for the common case where the plugin socket isn't ready yet isn't actionable by the operator so having it at info is just noise.
-
Lang Martin authored
* command/agent/csi_endpoint: support type filter in volumes & plugins
* command/agent/http: use /v1/volume/csi & /v1/plugin/csi
* api/csi: use /v1/volume/csi & /v1/plugin/csi
* api/nodes: use /v1/volume/csi & /v1/plugin/csi
* api/nodes: not /volumes/csi, just /volumes
* command/agent/csi_endpoint: fix ot parameter parsing
-
Lang Martin authored
* api/allocations: GetTaskGroup finds the taskgroup struct
* command/node_status: display CSI volume names
* nomad/state/state_store: new CSIVolumesByNodeID
* nomad/state/iterator: new SliceIterator type implements memdb.ResultIterator
* nomad/csi_endpoint: deal with a slice of volumes
* nomad/state/state_store: CSIVolumesByNodeID return a SliceIterator
* nomad/structs/csi: CSIVolumeListRequest takes a NodeID
* nomad/csi_endpoint: use the return iterator
* command/agent/csi_endpoint: parse query params for CSIVolumes.List
* api/nodes: new CSIVolumes to list volumes by node
* command/node_status: use the new list endpoint to print volumes
* nomad/state/state_store: error messages consider the operator
* command/node_status: include the Provider
-
Lang Martin authored
* client/allocrunner/csi_hook: tag errors
* nomad/client_csi_endpoint: tag errors
* nomad/client_rpc: remove an unnecessary error tag
* nomad/state/state_store: fix ControllerRequired intent. We use ControllerRequired to indicate that a volume should use the publish/unpublish workflow, rather than that it has a controller. We need to check both RequiresControllerPlugin and SupportsAttachDetach from the fingerprint to determine that.
* nomad/csi_endpoint: tag errors
* nomad/csi_endpoint_test: longer error messages, mock fingerprints
-
Tim Gross authored
We denormalize the `CSIVolume` struct when we query it from the state store by getting the plugin and its health. But unless we copy the volume, this denormalization gets synced back to the state store without passing through the fsm (which is invalid).
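The copy-before-denormalize pattern can be sketched as below. This is an illustrative reduction, not the real `nomad/structs` code: the struct is cut down to one denormalized field.

```go
package main

import "fmt"

// CSIVolume is a cut-down stand-in for the real struct in nomad/structs.
type CSIVolume struct {
	ID      string
	Healthy bool // denormalized from the plugin at query time
}

// Copy returns a copy so query-time denormalization cannot mutate the
// object shared with the state store.
func (v *CSIVolume) Copy() *CSIVolume {
	nv := *v
	return &nv
}

// denormalize fills in plugin health on a copy, leaving the stored
// struct untouched: writes must go through the FSM.
func denormalize(stored *CSIVolume, pluginHealthy bool) *CSIVolume {
	v := stored.Copy()
	v.Healthy = pluginHealthy
	return v
}

func main() {
	stored := &CSIVolume{ID: "vol-1"}
	out := denormalize(stored, true)
	fmt.Println(stored.Healthy, out.Healthy) // stored state is unchanged
}
```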
-
Tim Gross authored
We denormalize the `CSIPlugin` struct when we query it from the state store by getting the current set of allocations that provide the plugin. But unless we copy the plugin, this denormalization gets synced back to the state store and each time we query we'll add another copy of the current allocations.
-
Tim Gross authored
Derive a provider name and version for plugins (and the volumes that use them) from the CSI identity API `GetPluginInfo`. Expose the vendor name as `Provider` in the API and CLI commands.
-
Lang Martin authored
* command/csi: csi, csi_plugin, csi_volume
* helper/funcs: move ExtraKeys from parse_config to UnusedKeys
* command/agent/config_parse: use helper.UnusedKeys
* api/csi: annotate CSIVolumes with hcl fields
* command/csi_plugin: add Synopsis
* command/csi_volume_register: use hcl.Decode style parsing
* command/csi_volume_list
* command/csi_volume_status: list format, cleanup
* command/csi_plugin_list
* command/csi_plugin_status
* command/csi_volume_deregister
* command/csi_volume: add Synopsis
* api/contexts/contexts: add csi search contexts to the constants
* command/commands: register csi commands
* api/csi: fix struct tag for linter
* command/csi_plugin_list: unused struct vars
* command/csi_plugin_status: unused struct vars
* command/csi_volume_list: unused struct vars
* api/csi: add allocs to CSIPlugin
* command/csi_plugin_status: format the allocs
* api/allocations: copy Allocation.Stub in from structs
* nomad/client_rpc: add some error context with Errorf
* api/csi: collapse read & write alloc maps to a stub list
* command/csi_volume_status: cleanup allocation display
* command/csi_volume_list: use Schedulable instead of Healthy
* command/csi_volume_status: use Schedulable instead of Healthy
* command/csi_volume_list: sprintf string
* command/csi: delete csi.go, csi_plugin.go
* command/plugin: refactor csi components to sub-command plugin status
* command/plugin: remove csi
* command/plugin_status: remove csi
* command/volume: remove csi
* command/volume_status: split out csi specific
* helper/funcs: add RemoveEqualFold
* command/agent/config_parse: use helper.RemoveEqualFold
* api/csi: do ,unusedKeys right
* command/volume: refactor csi components to `nomad volume`
* command/volume_register: split out csi specific
* command/commands: use the new top level commands
* command/volume_deregister: hardwired type csi for now
* command/volume_status: csiFormatVolumes rescued from volume_list
* command/plugin_status: avoid a panic on no args
* command/volume_status: avoid a panic on no args
* command/plugin_status: predictVolumeType
* command/volume_status: predictVolumeType
* nomad/csi_endpoint_test: move CreateTestPlugin to testing
* command/plugin_status_test: use CreateTestCSIPlugin
* nomad/structs/structs: add CSIPlugins and CSIVolumes search consts
* nomad/state/state_store: add CSIPlugins and CSIVolumesByIDPrefix
* nomad/search_endpoint: add CSIPlugins and CSIVolumes
* command/plugin_status: move the header to the csi specific
* command/volume_status: move the header to the csi specific
* nomad/state/state_store: CSIPluginByID prefix
* command/status: rename the search context to just Plugins/Volumes
* command/plugin,volume_status: test return ids now
* command/status: rename the search context to just Plugins/Volumes
* command/plugin_status: support -json and -t
* command/volume_status: support -json and -t
* command/plugin_status_csi: comments
* command/*_status: clean up text
* api/csi: fix stale comments
* command/volume: make deregister sound less fearsome
* command/plugin_status: set the id length
* command/plugin_status_csi: more compact plugin health
* command/volume: better error message, comment
-
Tim Gross authored
Adds a stanza for both Host Volumes and CSI Volumes to the CLI output for `nomad alloc status`. Mostly relies on information already in the API structs, but in the case where there are CSI Volumes we need to make extra API calls to get the volume status. To reduce overhead, these extra calls are hidden behind the `-verbose` flag.
-
Tim Gross authored
In #7252 we removed the `DevDisableBootstrap` flag to require tests to honor only `BootstrapExpect`, in order to reduce a source of test flakiness. This changeset applies the same fix to the CSI tests.
-
Lang Martin authored
* structs: add ControllerRequired, volume.Name, no plug.Type
* structs: Healthy -> Schedulable
* state_store: Healthy -> Schedulable
* api: add ControllerRequired to api data types
* api: copy csi structs changes
* nomad/structs/csi: include name and external id
* api/csi: include Name and ExternalID
* nomad/structs/csi: comments for the 3 ids
-
Lang Martin authored
* structs: CSIInfo include AllocID, CSIPlugins no Jobs
* state_store: eliminate plugin Jobs, delete an empty plugin
* nomad/structs/csi: detect empty plugins correctly
* client/allocrunner/taskrunner/plugin_supervisor_hook: option AllocID
* client/pluginmanager/csimanager/instance: allocID
* client/pluginmanager/csimanager/fingerprint: set AllocID
* client/node_updater: split controller and node plugins
* api/csi: remove Jobs. The CSI Plugin API will map plugins to allocations, which allows plugins to be defined by jobs in many configurations. In particular, multiple plugins can be defined in the same job, and multiple jobs can be used to define a single plugin. Because we now map the allocation context directly from the node, it's no longer necessary to track the jobs associated with a plugin directly.
* nomad/csi_endpoint_test: CreateTestPlugin & register via fingerprint
* client/dynamicplugins: lift AllocID into the struct from Options
* api/csi_test: remove Jobs test
* nomad/structs/csi: CSIPlugins has an array of allocs
* nomad/state/state_store: implement CSIPluginDenormalize
* nomad/state/state_store: CSIPluginDenormalize npe on missing alloc
* nomad/csi_endpoint_test: defer deleteNodes for clarity
* api/csi_test: disable this test awaiting mocks: https://github.com/hashicorp/nomad/issues/7123
-
Danielle Lancashire authored
This commit introduces support for providing VolumeCapabilities during requests to `ControllerPublishVolume`, as this is a required field.
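The shape of the requirement can be illustrated with the sketch below. The structs here loosely mirror the CSI spec's `ControllerPublishVolume` request but are hypothetical Go types, not the generated proto API.

```go
package main

import "fmt"

// VolumeCapability loosely mirrors the CSI spec's capability message;
// string fields stand in for the real enum/oneof types.
type VolumeCapability struct {
	AccessMode string // e.g. "single-node-writer"
	FsType     string
}

// ControllerPublishVolumeRequest is an illustrative stand-in for the
// CSI request; the spec makes the capability a required field.
type ControllerPublishVolumeRequest struct {
	VolumeID         string
	NodeID           string
	VolumeCapability *VolumeCapability
}

func (r *ControllerPublishVolumeRequest) Validate() error {
	if r.VolumeCapability == nil {
		return fmt.Errorf("volume capability is required")
	}
	return nil
}

func main() {
	req := &ControllerPublishVolumeRequest{VolumeID: "vol-1", NodeID: "node-1"}
	fmt.Println(req.Validate()) // rejected: missing capability
	req.VolumeCapability = &VolumeCapability{AccessMode: "single-node-writer", FsType: "ext4"}
	fmt.Println(req.Validate()) // accepted
}
```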
-
Danielle Lancashire authored
Currently the handling of CSINode RPCs does not correctly handle forwarding RPCs to Nodes. This commit fixes this by introducing a shim RPC (nomad/client_csi_endpoint) that will correctly forward the request to the owning node, or submit the RPC to the client. In the process it also cleans up handling a little by adding the `CSIControllerQuery` embedded struct for required forwarding state. Because `CSIControllerQuery` embeds the requirement of a `PluginID`, we could also move node targeting into the shim RPC in the future if wanted.
-
Danielle Lancashire authored
-
Danielle Lancashire authored
CSI Plugins that manage devices need not just access to the CSI directory, but also to manage devices inside `/dev`. This commit introduces a `/dev:/dev` mount to the container so that they may do so.
-
Tim Gross authored
When an alloc is marked terminal (and after node unstage/unpublish have been called), the client syncs the terminal alloc state with the server via the `Node.UpdateAlloc` RPC. For each job that has a terminal alloc, the `Node.UpdateAlloc` RPC handler at the server will emit an eval for a new core job to garbage collect CSI volume claims. When this eval is handled on the core scheduler, it will call a `volumeReap` method to release the claims for all terminal allocs on the job. The volume reap will issue a `ControllerUnpublishVolume` RPC for any node that has no alloc claiming the volume. Once this returns (or is skipped), the volume reap will send a new `CSIVolume.Claim` RPC that releases the volume claim for that allocation in the state store, making it available for scheduling again. This same `volumeReap` method will be called from the core job GC, which gives us a second chance to reclaim volumes during GC if there were controller RPC failures.
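The reap ordering described above (unpublish only when a node has no remaining live claim, then release the claim) can be sketched as follows. All names here are illustrative; the real logic lives in Nomad's core scheduler.

```go
package main

import "fmt"

// claim is an illustrative stand-in for a volume claim record.
type claim struct {
	volID, nodeID, allocID string
	terminal               bool
}

// volumeReap releases claims for terminal allocs. For each node with no
// remaining live claim on a volume it first unpublishes at the
// controller, then releases the claim itself.
func volumeReap(claims []claim, unpublish func(volID, nodeID string), release func(volID, allocID string)) {
	live := map[string]int{} // volID/nodeID -> live claim count
	for _, c := range claims {
		if !c.terminal {
			live[c.volID+"/"+c.nodeID]++
		}
	}
	for _, c := range claims {
		if !c.terminal {
			continue
		}
		if live[c.volID+"/"+c.nodeID] == 0 {
			unpublish(c.volID, c.nodeID) // ControllerUnpublishVolume
		}
		release(c.volID, c.allocID) // CSIVolume.Claim release
	}
}

func main() {
	var unpubs, releases int
	claims := []claim{
		{"vol-1", "node-a", "alloc-1", true},
		{"vol-1", "node-a", "alloc-2", false}, // still live on node-a
	}
	volumeReap(claims,
		func(v, n string) { unpubs++ },
		func(v, a string) { releases++ })
	fmt.Println(unpubs, releases) // no unpublish: node-a still has a live claim
}
```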
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
This commit is the initial implementation of claiming volumes from the server and passes through any publishContext information as appropriate. There's nothing too fancy here.
-
Danielle Lancashire authored
Currently, the client has to ship an entire allocation to the server as part of performing a VolumeClaim, which has a few problems. Firstly, it means the client is sending significantly more data than is required (an allocation contains the entire contents of a Nomad job, alongside other irrelevant state), which has a non-zero (de)serialization cost. Secondly, because the allocation was never re-fetched from the state store, it means that we were potentially open to issues caused by stale state on a misbehaving or malicious client. The change removes both of those issues at the cost of a couple more state store lookups, but they should be relatively cheap. We also now provide the CSIVolume in the response for a claim, so the client can perform a Claim without first fetching all of the volumes.
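A slimmed-down request along these lines might look like the sketch below; the struct and function names are hypothetical, but the idea matches the commit: send only identifiers, re-read the alloc from authoritative state, and return the volume in the response.

```go
package main

import "fmt"

// VolumeClaimRequest carries only the identifiers the server needs,
// instead of shipping a whole Allocation. Names are illustrative.
type VolumeClaimRequest struct {
	VolumeID   string
	AllocID    string
	NodeID     string
	ClaimWrite bool
}

// serverClaim re-reads the alloc from authoritative state rather than
// trusting client-supplied data, then returns the volume ID so the
// client need not fetch volumes separately.
func serverClaim(req *VolumeClaimRequest, allocExists func(id string) bool) (string, error) {
	if !allocExists(req.AllocID) {
		return "", fmt.Errorf("unknown alloc %q", req.AllocID)
	}
	return req.VolumeID, nil
}

func main() {
	state := map[string]bool{"alloc-1": true} // stand-in for the state store
	vol, err := serverClaim(
		&VolumeClaimRequest{VolumeID: "vol-1", AllocID: "alloc-1", ClaimWrite: true},
		func(id string) bool { return state[id] })
	fmt.Println(vol, err)
}
```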
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
This PR implements some initial support for doing deeper validation of a volume during its registration with the server. This allows us to validate the capabilities before users attempt to use the volumes in most cases, and also prevents registering volumes without first setting up a plugin, which should help to catch typos and the like during registration. This does have the downside of requiring users to wait for at least one instance of a plugin to be running in their cluster before they can register volumes.
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
The CSI Spec requires us to attach and stage volumes based on different types of usage information when it may affect how they are bound. Here we pass through some basic usage options in the CSI Hook (specifically the volume alias's ReadOnly field), and the attachment/access mode from the volume. We pass the attachment/access mode separately from the volume as it simplifies some handling and doesn't necessarily force every attachment to use the same mode, should more be supported (i.e. if we let each `volume "foo" {}` specify an override in the future).
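A usage-options struct along these lines might look like the sketch below; the type, its fields, and the `ToFS` helper are illustrative stand-ins for the `csimanager` types, keyed so that claims with different modes don't share staged state.

```go
package main

import "fmt"

// UsageOptions captures how a claim will use the volume; a CSI node
// plugin stages and publishes differently depending on these. This is
// an illustrative stand-in for the real csimanager types.
type UsageOptions struct {
	ReadOnly       bool   // from the volume stanza's read-only setting
	AttachmentMode string // e.g. "file-system"
	AccessMode     string // e.g. "single-node-writer"
}

// ToFS renders a per-usage staging directory name, so two claims with
// different modes do not share staged state on the node.
func (u *UsageOptions) ToFS() string {
	ro := "rw"
	if u.ReadOnly {
		ro = "ro"
	}
	return fmt.Sprintf("%s-%s-%s", ro, u.AttachmentMode, u.AccessMode)
}

func main() {
	u := &UsageOptions{ReadOnly: true, AttachmentMode: "file-system", AccessMode: "single-node-writer"}
	fmt.Println(u.ToFS()) // ro-file-system-single-node-writer
}
```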
-
Danielle Lancashire authored
This commit introduces initial support for unmounting csi volumes. It takes a relatively simplistic approach to performing NodeUnpublishVolume calls, optimising for cleaning up any leftover state rather than terminating early in the case of errors. This is because it happens during an allocation's shutdown flow and may not always have a corresponding call to `NodePublishVolume` that succeeded.
-
Danielle Lancashire authored
-
Danielle Lancashire authored
-
Danielle Lancashire authored
This commit implements support for creating driver mounts for CSI Volumes. It works by fetching the created mounts from the allocation resources and then iterates through the volume requests, creating driver mount configs as required. It's a little bit messy primarily because there's _so_ much terminology overlap and it's a bit difficult to follow.
-
Danielle Lancashire authored
-
Danielle Lancashire authored
This commit is an initial (read: janky) approach to forwarding state from an allocrunner hook to a taskrunner, using a `hookResources` approach similar to the one taskrunners use internally. It should eventually be replaced with something a little more message-based, but for things that only come from pre-run hooks and don't change, it's probably fine for now.
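The shared-resources pattern can be sketched as follows: a struct set once by a pre-run hook and read by taskrunners, guarded by a mutex. The field names here are illustrative, not Nomad's actual hook types.

```go
package main

import (
	"fmt"
	"sync"
)

// hookResources is a minimal sketch of passing state from an
// allocrunner hook to taskrunners: written in a pre-run hook, read
// afterwards. Field names are illustrative.
type hookResources struct {
	mu     sync.RWMutex
	mounts map[string]string // volume name -> host mount path
}

func (h *hookResources) setMounts(m map[string]string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.mounts = m
}

func (h *hookResources) getMounts() map[string]string {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.mounts
}

func main() {
	res := &hookResources{}
	// Written once by the alloc-level pre-run hook...
	res.setMounts(map[string]string{"data": "/csi/per-alloc/vol-1"})
	// ...read later by a taskrunner when building task mounts.
	fmt.Println(res.getMounts()["data"])
}
```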
-