This project is mirrored from https://gitee.com/mirrors/nomad.git.
- 20 Aug, 2021 4 commits
  - Luiz Aoqui authored
  - Luiz Aoqui authored
  - Luiz Aoqui authored
  - Luiz Aoqui authored
- 19 Aug, 2021 1 commit
  - Mahmood Ali authored: Attempts to deflake tests
- 18 Aug, 2021 3 commits
  - Mahmood Ali authored: Attempt to deflake the test by avoiding shutting down the leaders: leadership recovery takes time, so raft configuration changes are processed more slowly and the test can fail.
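A minimal, self-contained sketch of the approach, using stand-in types rather than Nomad's real test harness (`Server`, `Shutdown`, and the `servers` slice are all hypothetical here): shut down a follower so the raft leader stays up.

```go
package main

import "fmt"

// Server is a stand-in for the test cluster's server handle.
type Server struct {
	name   string
	leader bool
}

func (s *Server) IsLeader() bool { return s.leader }
func (s *Server) Shutdown()      { fmt.Printf("shutting down %s\n", s.name) }

func main() {
	servers := []*Server{
		{name: "s1", leader: true},
		{name: "s2"},
		{name: "s3"},
	}

	// Shut down a follower rather than the leader: killing the leader
	// forces an election, and raft configuration changes stall until a
	// new leader is established, which is what made the test flaky.
	for _, s := range servers {
		if !s.IsLeader() {
			s.Shutdown()
			break
		}
	}
}
```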
  - Mahmood Ali authored: Wait for leadership to be established before killing the leader.
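Nomad's test suite has a `testutil.WaitForLeader` helper for exactly this ordering; a sketch of how a test might use it (`newTestServer` is a hypothetical stand-in for the real test harness setup):

```go
package nomad_test

import (
	"testing"

	"github.com/hashicorp/nomad/testutil"
)

// TestLeaderFailover sketches the ordering fix: establish leadership
// first, then exercise the failure path.
func TestLeaderFailover(t *testing.T) {
	s1, cleanup := newTestServer(t) // hypothetical harness helper
	defer cleanup()

	// Block until a raft leader exists; killing a server before
	// leadership is established makes the test flaky.
	testutil.WaitForLeader(t, s1.RPC)

	// ... now it is safe to shut down the leader and assert recovery.
}
```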
  - Mahmood Ali authored: When a node becomes ready, create an eval for all system jobs across namespaces. The previous code used `job.ID` to deduplicate evals, which ignores the job namespace: if multiple jobs in different namespaces share the same ID/Name, only one of them is considered for running on the new node, so Nomad may skip running some system jobs on that node.
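A self-contained sketch of the fix with hypothetical types in place of Nomad's internal structs: key the deduplication on the (namespace, ID) pair rather than the bare ID.

```go
package main

import "fmt"

// Job is a stand-in for Nomad's job struct.
type Job struct {
	Namespace string
	ID        string
}

// jobKey makes the namespace part of the deduplication key, so two
// system jobs that share an ID in different namespaces both get evals.
type jobKey struct {
	namespace string
	id        string
}

func main() {
	systemJobs := []*Job{
		{Namespace: "default", ID: "metrics"},
		{Namespace: "infra", ID: "metrics"}, // same ID, different namespace
	}

	seen := make(map[jobKey]struct{})
	for _, job := range systemJobs {
		key := jobKey{namespace: job.Namespace, id: job.ID}
		if _, ok := seen[key]; ok {
			continue // keying on job.ID alone would wrongly skip here
		}
		seen[key] = struct{}{}
		fmt.Printf("creating eval for %s/%s\n", job.Namespace, job.ID)
	}
}
```

Running this prints one eval per namespace/ID pair; the old ID-only key would have produced just one.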
- 17 Aug, 2021 1 commit
  - Mahmood Ali authored: Target all e2e datacenters for the system and sysbatch e2e tests. They require that the system jobs run on all Linux clients, but the jobs currently only target the `dc1` datacenter, while the nightly e2e cluster has 4 clients spread across the `dc1` and `dc2` datacenters, causing the tests to fail. I missed this problem in the e2e dev cluster because it only used a single `dc1` datacenter.
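For illustration, a job registered through the `nomad/api` package could target both datacenters like this (the job ID here is made up):

```go
package main

import "github.com/hashicorp/nomad/api"

func stringPtr(s string) *string { return &s }

func main() {
	job := &api.Job{
		ID:   stringPtr("e2e-system-check"), // hypothetical job ID
		Type: stringPtr("system"),
		// Target every datacenter the nightly cluster uses, not just dc1.
		Datacenters: []string{"dc1", "dc2"},
	}
	_ = job // register with an api.Client in a real test
}
```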
- 16 Aug, 2021 2 commits
  - Mahmood Ali authored: core: implement system batch scheduler
  - James Rasell authored: tlsutil: update testing certificates that are close to expiry.
- 13 Aug, 2021 1 commit
  - James Rasell authored
- 11 Aug, 2021 3 commits
  - Mahmood Ali authored: Tweaks to the commands in the Consul Connect page. For multi-command scripts, the leading `$` is a bit annoying because it makes copying the text harder, and the `copy` button would only copy the first command and ignore the rest. The `echo 1 > ...` commands must also run as root, unlike the rest, so I switched them to the `echo 1 | sudo tee` pattern to ease copy and paste as well. Lastly, update the CNI plugin links to 1.0.0, which was released less than an hour ago: https://github.com/containernetworking/plugins/releases/tag/v1.0.0
  - Mahmood Ali authored: docs: note CNI requirement for bridge networking
  - Tim Gross authored: Using `bridge` networking requires that you have CNI plugins installed on the client, but this isn't mentioned in the jobspec `network` docs, which are the first place someone will look when trying to configure task networking.
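For reference, the equivalent group configuration via the `nomad/api` package looks roughly like this (the group name is made up); `bridge` mode only works when the CNI reference plugins are present on the client:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/nomad/api"
)

func main() {
	name := "web" // hypothetical group name
	group := &api.TaskGroup{
		Name: &name,
		// bridge mode requires the CNI reference plugins to be
		// installed on the client node.
		Networks: []*api.NetworkResource{{Mode: "bridge"}},
	}
	fmt.Println(*group.Name)
}
```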
- 10 Aug, 2021 9 commits
  - Michael Schurter authored: CSI ListSnapshots secrets support
  - Mahmood Ali authored: Fix a bug where system jobs may fail to be placed on a node that initially was not eligible for system job placement. This change re-evaluates the node whenever any attribute used in feasibility checks changes. Fixes https://github.com/hashicorp/nomad/issues/8448
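A simplified, self-contained sketch of the idea, not Nomad's actual code: compare the attributes that feasibility checks read and trigger a new evaluation when any of them changed.

```go
package main

import "fmt"

// shouldCreateEval reports whether a node update touched any attribute
// that system-job feasibility checks depend on, in which case the node
// must be re-evaluated even if it was previously ineligible.
func shouldCreateEval(old, updated map[string]string, checked []string) bool {
	for _, attr := range checked {
		if old[attr] != updated[attr] {
			return true
		}
	}
	return false
}

func main() {
	old := map[string]string{"driver.docker": "0"}
	updated := map[string]string{"driver.docker": "1"} // docker came up
	checked := []string{"driver.docker"}

	if shouldCreateEval(old, updated, checked) {
		fmt.Println("re-evaluating system jobs for this node")
	}
}
```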
  - Mahmood Ali authored: In a multi-task-group job, treat groups with 0 canaries as auto-promote. This fixes an edge case where Nomad required a manual promotion if the job had any group with `canary = 0` while the rest of the groups had `auto_promote` set. Co-authored-by: Michael Schurter <mschurter@hashicorp.com>
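A self-contained sketch of the promotion rule after the fix (the `Group` type is hypothetical): a group with zero canaries has nothing to promote, so only groups that actually define canaries get a vote on auto-promotion.

```go
package main

import "fmt"

// Group is a stand-in for a task group's update configuration.
type Group struct {
	Name        string
	Canary      int
	AutoPromote bool
}

// autoPromote returns true when every group that uses canaries has
// auto_promote set; canary=0 groups are ignored instead of forcing a
// manual promotion for the whole job.
func autoPromote(groups []*Group) bool {
	for _, g := range groups {
		if g.Canary > 0 && !g.AutoPromote {
			return false
		}
	}
	return true
}

func main() {
	groups := []*Group{
		{Name: "api", Canary: 2, AutoPromote: true},
		{Name: "cron", Canary: 0}, // previously blocked auto-promote
	}
	fmt.Println("auto-promote:", autoPromote(groups)) // auto-promote: true
}
```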
  - Mahmood Ali authored: Speed up client startup by retrying node registration until the servers are known. Currently, if client fingerprinting is fast and finishes before the client connects to a server, node registration may be delayed by 15 seconds or so! Ideally, we'd wait until the client discovers the servers and then retry immediately, but that requires significant code changes. Here, we simply retry the node registration request every second. That is effectively checking whether the client has discovered servers once per second, which should be a cheap operation. When testing this change on my local computer, where the servers and clients are co-located, the time from startup to node registration dropped from 34 seconds to 8 seconds!
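A sketch of the retry loop described above, with a hypothetical `registerNode` standing in for the real registration RPC: retry every second until the servers are known, instead of waiting out a long backoff.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errNoServers = errors.New("no known servers")

// registerNode is a stand-in for the client's registration RPC; it
// fails until server discovery has populated the server list.
func registerNode(attempt int) error {
	if attempt < 3 {
		return errNoServers
	}
	return nil
}

func main() {
	// Retry every second rather than using a long backoff: this
	// approximates "check whether servers were discovered once per
	// second", which is cheap.
	for attempt := 0; ; attempt++ {
		if err := registerNode(attempt); err == nil {
			fmt.Println("node registered")
			return
		}
		time.Sleep(time.Second)
	}
}
```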
  - Luiz Aoqui authored
  - Michael Schurter authored: docs: Add replication_token link with authoritative_region
  - Jai authored: ui: Fix fuzzy search namespace-handling
  - Mike Wickett authored
  - Jai Bhagat authored
- 09 Aug, 2021 3 commits
  - Luiz Aoqui authored
  - Luiz Aoqui authored
  - Lir (Rookout) authored
- 06 Aug, 2021 2 commits
  - Michael Schurter authored: consul/connect: avoid warn messages on connect proxy errors
  - Michael Schurter authored: docs: add backward incompatibility note about #10875
- 05 Aug, 2021 5 commits
  - Michael Schurter authored: Fixes #11002
  - James Rasell authored: changelog: add entry for #10929
  - James Rasell authored: When creating a TCP proxy bridge for Connect tasks, we are at the mercy of either end for managing the connection state. For long-lived gRPC connections the proxy can reasonably expect to stay open until the context is cancelled, but for the HTTP connections used by connect-native tasks, we experience connection disconnects. The proxy gets recreated as needed on follow-up requests; however, we also emit a WARN log when the connection is broken. This PR lowers the WARN to a TRACE, because these disconnects are to be expected. Ideally we would proxy at the HTTP layer, but Consul or the connect-native task could be configured to expect mTLS, preventing Nomad from man-in-the-middling the requests. We also can't manage the proxy lifecycle more intelligently, because we have no control over the HTTP client or server and how they wish to manage connection state. What we have now works; it's just noisy. Fixes #10933
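A rough, self-contained sketch of such a bidirectional copy loop with the lowered log level (illustrative wiring, not Nomad's actual code; `hclog` is HashiCorp's logging library):

```go
package connectproxy

import (
	"io"
	"net"

	hclog "github.com/hashicorp/go-hclog"
)

// proxy copies bytes in both directions until either side hangs up.
// Disconnects are expected for connect-native HTTP traffic, so broken
// connections are logged at TRACE rather than WARN.
func proxy(logger hclog.Logger, a, b net.Conn) {
	done := make(chan struct{}, 2)
	cp := func(dst, src net.Conn) {
		if _, err := io.Copy(dst, src); err != nil {
			logger.Trace("proxy connection closed", "error", err)
		}
		done <- struct{}{}
	}
	go cp(a, b)
	go cp(b, a)
	<-done // the first teardown ends the proxy; the other copy unwinds
	a.Close()
	b.Close()
}
```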
  - James Rasell authored
  - James Rasell authored: fix: load token in docker auth config
- 04 Aug, 2021 2 commits
  - Luiz Aoqui authored
  - James Rasell authored: cli: fix minor format error within `-ca-cert` help text.
- 03 Aug, 2021 4 commits
  - Luiz Aoqui authored
  - Mahmood Ali authored: Use basic `sleep` commands in busybox images. busybox images are very light, while `ping` has permission complications and may fail for network-related reasons.
  - Seth Hoenig authored: This PR implements a new "System Batch" scheduler type. Jobs can make use of this new scheduler by setting their type to `sysbatch`. As the name implies, sysbatch can be thought of as a hybrid between system and batch jobs: it is for running short-lived jobs intended to run on every compatible node in the cluster. As with batch jobs, sysbatch jobs can also be periodic and/or parameterized dispatch jobs. A sysbatch job is considered complete when it has run on all compatible nodes until reaching a terminal state (success, or failed on retries). Feasibility and preemption are governed the same as with system jobs. In this PR, the update stanza is not yet supported; it is still limited in functionality for the underlying system scheduler and is not yet useful for sysbatch jobs. Further work in #4740 will improve support for the update stanza and deployments. Closes #2527
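As a sketch, registering a sysbatch job through the `nomad/api` package might look like this (the job ID and periodic settings are made up for illustration):

```go
package main

import "github.com/hashicorp/nomad/api"

func stringPtr(s string) *string { return &s }

func main() {
	job := &api.Job{
		ID: stringPtr("node-cleanup"), // hypothetical job ID
		// "sysbatch" runs the job to completion once on every
		// compatible node, like a hybrid of system and batch.
		Type:        stringPtr("sysbatch"),
		Datacenters: []string{"dc1"},
		// sysbatch jobs may also be periodic or parameterized, e.g.:
		Periodic: &api.PeriodicConfig{
			SpecType: stringPtr("cron"),
			Spec:     stringPtr("@daily"),
		},
	}
	_ = job // submit via (*api.Jobs).Register in a real client
}
```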
  - James Rasell authored