This project is mirrored from https://gitee.com/mirrors/nomad.git.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
- 17 Feb, 2022 1 commit
-
-
Luiz Aoqui authored
-
- 16 Feb, 2022 5 commits
-
-
Luiz Aoqui authored
-
Seth Hoenig authored
build: respect GOBIN when using make targets
-
Seth Hoenig authored
This PR updates GNUMakefile to respect $GOBIN if it is set in the environment or via a $GOENV file. Previously we hard-coded the output to $GOPATH/bin, which is not necessarily the desired behavior.
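The lookup order described in this commit can be sketched in Go. This is only an illustration and not the GNUMakefile change itself: `binDir` and its injected `getenv` parameter are hypothetical names, and the sketch ignores `$GOENV` files and multi-entry `$GOPATH` values.

```go
package main

import "path/filepath"

// binDir returns the directory where built tools should be installed:
// $GOBIN when it is set, otherwise $GOPATH/bin (the previously hard-coded
// location). getenv is injected (hypothetical) so the lookup can be
// exercised without touching the real environment.
func binDir(getenv func(string) string) string {
	if gobin := getenv("GOBIN"); gobin != "" {
		return gobin
	}
	return filepath.Join(getenv("GOPATH"), "bin")
}
```

In real use, `os.Getenv` could be passed as `getenv`.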
-
Luiz Aoqui authored
-
Tiernan authored
-
- 15 Feb, 2022 10 commits
-
-
Tim Gross authored
Nomad communicates with CSI plugin tasks via gRPC. The plugin supervisor hook uses this to ping the plugin for health checks which it emits as task events. After the first successful health check the plugin supervisor registers the plugin in the client's dynamic plugin registry, which in turn creates a CSI plugin manager instance that has its own gRPC client for fingerprinting the plugin and sending mount requests. If the plugin manager instance fails to connect to the plugin on its first attempt, it exits. The plugin supervisor hook is unaware that connection failed so long as its own pings continue to work. A transient failure during plugin startup may mislead the plugin supervisor hook into thinking the plugin is up (so there's no need to restart the allocation) but no fingerprinter is started. * Refactors the gRPC client to connect on first use. This provides the plugin manager instance the ability to retry the gRPC client conn...
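The "connect on first use" idea can be sketched as follows. This is a minimal illustration, not Nomad's gRPC client: `conn`, `lazyClient`, and `flakyDialer` are hypothetical stand-ins, and a real implementation would dial a gRPC connection instead of building a struct.

```go
package main

import (
	"errors"
	"sync"
)

// conn stands in for a gRPC client connection (hypothetical type).
type conn struct{ target string }

// lazyClient defers dialing until the first call that needs a connection.
type lazyClient struct {
	mu   sync.Mutex
	dial func() (*conn, error) // hypothetical dialer
	c    *conn
}

// get returns the cached connection, dialing only if none exists yet.
// A failed dial caches nothing, so the next caller simply tries again.
func (l *lazyClient) get() (*conn, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.c != nil {
		return l.c, nil
	}
	c, err := l.dial()
	if err != nil {
		return nil, err
	}
	l.c = c
	return c, nil
}

// flakyDialer simulates a plugin whose socket is not ready for the
// first few attempts (illustration only).
func flakyDialer(failures int) func() (*conn, error) {
	attempts := 0
	return func() (*conn, error) {
		attempts++
		if attempts <= failures {
			return nil, errors.New("plugin socket not ready")
		}
		return &conn{target: "csi.sock"}, nil
	}
}
```

Because the failure is not sticky, a transient error during plugin startup no longer leaves the caller without a usable client.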
-
Seth Hoenig authored
api: return sorted results in certain list endpoints
-
Seth Hoenig authored
These API endpoints now return results in chronological order. They can return results in reverse chronological order by setting the query parameter ascending=true. Affected endpoints: Eval.List and Deployment.List.
-
Seth Hoenig authored
Update gopsutil to 3.21.12
-
Seth Hoenig authored
-
Tim Gross authored
-
Seth Hoenig authored
build: allow golangci-lint to use more than 1 core
-
Alex Holyoake authored
-
Seth Hoenig authored
scheduler: fix dropped test error
-
Lars Lehtonen authored
-
- 14 Feb, 2022 2 commits
-
-
Seth Hoenig authored
Since switching to `golangci-lint` we have set the `-j 1` flag, which restricts the tool to using 1 CPU thread. This PR removes the flag so `make check` takes less time on multi-core machines.
-
James Rasell authored
client: track service deregister call so it's only called once.
-
- 11 Feb, 2022 5 commits
-
-
Tim Gross authored
The `volume detach`, `volume deregister`, and `volume status` commands accept a prefix argument for the volume ID. Update the behavior on exact matches so that if there is more than one volume that matches the prefix, we should only return an error if one of the volume IDs is not an exact match. Otherwise we won't be able to use these commands at all on those volumes. This also makes the behavior of these commands consistent with `job stop`.
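The matching rule can be sketched in Go. This is an illustration under assumptions, not the CLI's actual code: `resolveVolumeID` is a hypothetical helper operating on a plain slice of IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveVolumeID picks the volume that a prefix argument refers to.
// An exact ID match wins even when other IDs share the prefix;
// otherwise the prefix must identify exactly one volume.
func resolveVolumeID(prefix string, ids []string) (string, error) {
	var matches []string
	for _, id := range ids {
		if id == prefix {
			return id, nil // exact match: never ambiguous
		}
		if strings.HasPrefix(id, prefix) {
			matches = append(matches, id)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("no volume matches prefix %q", prefix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("prefix %q matches multiple volumes: %s",
			prefix, strings.Join(matches, ", "))
	}
}
```

With IDs `vol` and `vol-2`, the argument `vol` resolves to `vol` instead of failing as ambiguous, which is the case the commit fixes.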
-
Tim Gross authored
The CSI specification says: "The CO SHALL provide the listen-address for the Plugin by way of the `CSI_ENDPOINT` environment variable." Note that plugins without filesystem isolation won't have the plugin dir bind-mounted to their alloc dir, but we can provide a path to the socket anyway. Refactor to use an opts struct for plugin supervisor hook config. The parameter list for configuring the plugin supervisor hook has grown enough that it makes sense to use an options struct similar to many of the other task runner hooks (e.g. template).
-
James Rasell authored
doc(typo): technical typo in advertised example
-
James Rasell authored
changelog: add entry for #12040
-
James Rasell authored
In certain task lifecycles the taskrunner service deregister call could be called three times for a task that is exiting. Whilst each hook caller of deregister has its own purpose, we should try to ensure it is only called once during the shutdown lifecycle of a task. This change therefore tracks when deregister has been called, so that subsequent calls are no-ops. In the event the task is restarting, the deregister value is reset to ensure proper operation.
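A minimal sketch of the tracking described here, assuming a hypothetical `serviceTracker` type and `deregister` callback (these names are illustrative and not taken from the taskrunner code):

```go
package main

import "sync"

// serviceTracker remembers whether deregistration has already happened
// during the current run of a task.
type serviceTracker struct {
	mu           sync.Mutex
	deregistered bool
	deregister   func() // hypothetical callback doing the real work
}

// Deregister performs the deregistration at most once per task run;
// later calls from other hooks are no-ops.
func (s *serviceTracker) Deregister() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.deregistered {
		return
	}
	s.deregister()
	s.deregistered = true
}

// Restarting resets the flag so the next shutdown deregisters again.
func (s *serviceTracker) Restarting() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.deregistered = false
}
```

A plain flag with a reset is used rather than `sync.Once` because the task may restart and need to deregister again.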
-
- 10 Feb, 2022 16 commits
-
-
Derek Strickland authored
The allocReconciler's computeGroup function contained a significant amount of inline logic whose intent was difficult to understand. This commit extracts that inline logic into intention-revealing subroutines. It also updates the function internals to improve maintainability and renames some existing functions for the same purpose. Renamed functions: handleGroupCanaries -> cancelUnneededCanaries, handleDelayedLost -> createLostLaterEvals, handleDelayedReschedules -> createRescheduleLaterEvals. New functions: filterAndStopAll, initializeDeploymentState, requiresCanaries, computeCanaries, computeUnderProvisionedBy, computeReplacements, computeDestructiveUpdates, computeMigrations, createDeployment, isDeploymentComplete.
-
Luiz Aoqui authored
-
Luiz Aoqui authored
-
Luiz Aoqui authored
Merge release 1.2.6 branch
-
Luiz Aoqui authored
-
Luiz Aoqui authored
Version 1.2.6
-
Marc-Aurèle Brothier authored
-
James Rasell authored
-
Nomad Release Bot authored
-
Nomad Release bot authored
-
Luiz Aoqui authored
-
Tim Gross authored
The spread iterator can panic when processing an evaluation, resulting in an unrecoverable state in the cluster. Whenever a panicked server restarts and quorum is restored, the next server to dequeue the evaluation will panic. To trigger this state: * The job must have `max_parallel = 0` and a `canary >= 1`. * The job must not have a `spread` block. * The job must have a previous version. * The previous version must have a `spread` block and at least one failed allocation. In this scenario, the desired changes include `(place 1+) (stop 1+), (ignore n) (canary 1)`. Before the scheduler can place the canary allocation, it tries to find out which allocations can be stopped. This passes back through the stack so that we can determine previous-node penalties, etc. We call `SetJob` on the stack with the previous version of the job, which will include assessing the `spread` block (even though the results are unused). The task group spread info sta...
-
Luiz Aoqui authored
Add new namespace ACL requirement for the /v1/jobs/parse endpoint and return early if HCLv2 parsing fails. The endpoint now requires the new `parse-job` ACL capability or `submit-job`.
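The capability rule amounts to an either-or check, sketched below. `canParseJob` is a hypothetical helper over a plain slice of capability names; it does not reflect how Nomad's ACL objects are actually structured.

```go
package main

// canParseJob reports whether a namespace's capability list permits a
// call to the job parse endpoint: either `parse-job` or `submit-job`
// is sufficient.
func canParseJob(capabilities []string) bool {
	for _, c := range capabilities {
		if c == "parse-job" || c == "submit-job" {
			return true
		}
	}
	return false
}
```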
-
Seth Hoenig authored
This PR adds symlink resolution when doing validation of paths to ensure they do not escape client allocation directories.
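One way such a check could look in Go is sketched below. `escapesDir` is a hypothetical helper built on the standard library, not Nomad's validation code, and it assumes both the root and the target path exist on disk (since `filepath.EvalSymlinks` fails otherwise).

```go
package main

import (
	"path/filepath"
	"strings"
)

// escapesDir reports whether rel, interpreted relative to root, points
// outside root once all symlinks have been resolved.
func escapesDir(root, rel string) (bool, error) {
	resolvedRoot, err := filepath.EvalSymlinks(root)
	if err != nil {
		return false, err
	}
	// Resolve the target too, so a symlink inside root that points
	// elsewhere is detected rather than trusted.
	resolved, err := filepath.EvalSymlinks(filepath.Join(resolvedRoot, rel))
	if err != nil {
		return false, err
	}
	relPath, err := filepath.Rel(resolvedRoot, resolved)
	if err != nil {
		return false, err
	}
	up := ".."
	return relPath == up || strings.HasPrefix(relPath, up+string(filepath.Separator)), nil
}
```

A purely lexical check would accept a path such as `local/link` even when `link` is a symlink to a directory outside the allocation directory; resolving first closes that gap.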
-
Seth Hoenig authored
go-getter creates a circular dependency between a Client and Getter, which means each is inherently thread-unsafe if you try to re-use one or the other. This PR fixes Nomad to no longer make use of the default Getter objects provided by the go-getter package. Nomad must create a new Client object on every artifact download, as the Client object controls the Src and Dst among other things. When calling Client.Get, the Getter modifies its own Client reference, creating the circular reference and race condition. We can still achieve most of the desired connection caching behavior by re-using a shared HTTP client with transport pooling enabled.
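The shape of this design can be sketched with the standard library alone. The `download` type and `newDownload` function below are hypothetical stand-ins for the per-request go-getter Client, and the transport settings are arbitrary example values; none of this is the go-getter API.

```go
package main

import (
	"net/http"
	"time"
)

// sharedHTTP is reused by every download so that idle connections are
// pooled across requests. http.Client is safe for concurrent use.
var sharedHTTP = &http.Client{
	Transport: &http.Transport{
		MaxIdleConnsPerHost: 4,
		IdleConnTimeout:     90 * time.Second,
	},
}

// download is a hypothetical per-request object holding the mutable
// state (source and destination) that must not be shared.
type download struct {
	Src, Dst string
	HTTP     *http.Client
}

// newDownload builds a fresh download for each artifact while pointing
// it at the shared, pooled HTTP client.
func newDownload(src, dst string) *download {
	return &download{Src: src, Dst: dst, HTTP: sharedHTTP}
}
```

The per-request object carries everything that changes between downloads, while only the concurrency-safe HTTP client is shared.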
-
Charlie Voiselle authored
-
- 09 Feb, 2022 1 commit
-
-
Tim Gross authored
When an allocation is updated, the job summary for the associated job is also updated. CSI uses the job summary to set the expected count for controller and node plugins. We incorrectly used the allocation's server status instead of the job status when deciding whether to update or remove the job from the plugins. This caused a node drain or other terminal state for an allocation to clear the expected count for the entire plugin. Use the job status to guide whether to update or remove the expected count. The existing CSI tests for the state store incorrectly modeled the updates we received from servers vs those we received from clients, leading to test assertions that passed when they should not. Rework the tests to clarify each step in the lifecycle and rename CSI state store functions for clarity.
-