This project is mirrored from https://gitee.com/mirrors/nomad.git.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
- 16 Dec, 2019 7 commits
-
Drew Bailey authored
- copy struct values
- ensure `groupServiceHook` implements `RunnerPreKillHook`
- run deregister first
- test that shutdown times are delayed
- move magic number into variable
-
Danielle authored
env_aws: Disable Retries and set Session cfg
-
Danielle authored
Co-Authored-By: Mahmood Ali <mahmood@hashicorp.com>
-
Seth Hoenig authored
tests: remove trace statements from nodeDrainWatcher.watch
-
Tim Gross authored
Adds Windows targets to the client/allocs metrics tests. Removes the `allocstats` test, which covers less than these tests and is now redundant. Adds a firewall rule to our Windows instances so that the prometheus server can scrape the Nomad HTTP API for metrics.
-
Seth Hoenig authored
Avoid logging in the `watch` function as much as possible, since it is not waited on during a server shutdown. When the logger logs after a test passes, it may or may not cause the testing framework to panic. More info in: https://github.com/golang/go/issues/29388#issuecomment-453648436
-
Danielle Lancashire authored
Previously, Nomad used hand-rolled HTTP requests to interact with the EC2 metadata API. Recently, however, we switched to using the AWS SDK for this fingerprinting. The default behaviour of the AWS SDK is to perform retries with exponential backoff when a request fails. This is problematic for Nomad, because interacting with the EC2 API is in our client start path. Here we revert to our pre-existing behaviour of not performing retries in the fast path: if the metadata service is unavailable, it's likely that Nomad is not running in AWS.
-
- 13 Dec, 2019 13 commits
-
Michael Schurter authored
connect: canonicalize before adding sidecar
-
Mahmood Ali authored
driver: allow disabling log collection
-
Mahmood Ali authored
Update go-multierror library
-
Mahmood Ali authored
-
Mahmood Ali authored
multierror library changed formatting slightly.
-
Buck Doyle authored
-
Mahmood Ali authored
To pick up https://github.com/hashicorp/go-multierror/pull/28
-
Buck Doyle authored
I unintentionally introduced a flapping test in #6817. The draining status of the node will be randomly chosen and that flag takes precedence over eligibility. This forces the draining flag to be false rather than random so the test should no longer flap. See here for an example failure: https://circleci.com/gh/hashicorp/nomad/26368
-
Preetha Appan authored
-
Mahmood Ali authored
executor: stop joining executor to container cgroup
-
Michael Schurter authored
Also make Connect related fixes more consistent in the changelog. I suspect users won't care if a Connect related fix is in the server's admission controller or in the client's groupservice hook or somewhere else, so I think grouping them by `consul/connect:` makes the most sense.
-
Michael Schurter authored
Fixes #6853. Canonicalize jobs before adding any sidecars. This fixes a bug where sidecar tasks were added without interpolated names and broke validation. Sidecar tasks must be canonicalized independently. Also adds a group network to the mock Connect job, because it wasn't a valid Connect job before!
-
Mahmood Ali authored
Add notarization details to changelog
-
- 12 Dec, 2019 9 commits
-
Michele authored
-
Michele authored
-
Preetha authored
Use debug logging for scheduler internals
-
Preetha Appan authored
-
ebarriosjr authored
-
Buck Doyle authored
There are two changes here, and some caveats/commentary:

1. The “State” table column was actually sorting only by status. The state was not an actual property, just something calculated in each client row, as a product of status, isEligible, and isDraining. This PR adds isDraining as a component of compositeState so it can be used for sorting.

2. The Sortable mixin declares dependent keys that cause the sort to be live-updating, but only if the members of the array change, such as if a new client is added, not if any of the sortable properties change. This PR adds a SortableFactory function that generates a mixin whose listSorted computed property includes dependent keys for the sortable properties, so the table will live-update if any of the sortable properties change, not just the array members.

There’s a warning if you use SortableFactory without dependent keys via the original Sortable interface, so we can eventually migrate away from it.
-
Michael Lange authored
UI: Unclosed log streams
-
Preetha Appan authored
We currently log an error if preemption is unable to find a suitable set of allocations to preempt. This commit changes that to debug level since not finding preemptable allocations is not an error condition.
-
Tim Gross authored
Refactor the metrics end-to-end tests so they can be run with our e2e test framework. Runs fabio/prometheus and a collection of jobs that will cause metrics to be measured. We then query Prometheus to ensure we're publishing those allocation metrics and some metrics from the clients as well. Includes adding a placeholder for running the same tests on Windows.
-
- 11 Dec, 2019 7 commits
-
Mahmood Ali authored
-
Seth Hoenig authored
tests: parallelize state store tests
-
Mahmood Ali authored
Stop joining the libcontainer executor process into the newly created task container cgroup, to ensure that the cgroups are fully destroyed on shutdown, and to make it consistent with other plugin processes.

Previously, the executor process was added to the container cgroup so that its resource usage was aggregated along with user processes in our metric aggregation. However, adding the executor process to the container cgroup adds several complications without much benefit:

First, it complicates cleanup. We must ensure that the executor is removed from the container cgroup on shutdown. Indeed, we had a bug where we missed removing it from the systemd cgroup, because the executor uses `containerState.CgroupPaths` on launch, which includes systemd, but cleans up with `cgroups.GetAllSubsystems`, which doesn't.

Second, it may have adverse side effects. When a user process is CPU-bound or uses too much memory, the executor should remain functioning without risk of being killed (by the OOM killer) or throttled.

Third, it is inconsistent with other drivers and plugins. Logmon and DockerLogger processes aren't in the task cgroups. Neither are containerd processes, though containerd is equivalent to the executor in responsibility.

Fourth, in my experience, when the executor process moves cgroups while it's running, the cgroup aggregation is odd. The cgroup `memory.usage_in_bytes` doesn't seem to capture the full memory usage of the executor process and becomes a red herring when investigating memory issues.

For all the reasons above, I opted to have the executor remain in the Nomad agent cgroup, and we can revisit this when we have a better story for plugin process cgroup management.
-
Mahmood Ali authored
-
Seth Hoenig authored
It has been decided we're going to live in a many-core world. Let's take advantage of that and parallelize these state store tests, which all run in memory and are largely CPU-bound. An unscientific benchmark demonstrating the improvement:

    [mp state (master)] $ go test
    PASS
    ok      github.com/hashicorp/nomad/nomad/state  5.162s

    [mp state (f-parallelize-state-store-tests)] $ go test
    PASS
    ok      github.com/hashicorp/nomad/nomad/state  1.527s
-
Tim Gross authored
-
Drew Bailey authored
add 6828 to changelog
-
- 10 Dec, 2019 4 commits
-
Michael Schurter authored
Make note of Sentinel standard imports
-
Chris Arcand authored
> Sentinel-embedded applications can choose to whitelist or blacklist certain standard imports. Please reference the documentation for the Sentinel-enabled application you're using to determine if all standard imports are available.
-
Drew Bailey authored
-
Tim Gross authored
The `ALLOC_INDEX` isn't guaranteed to be unique, and this has caused some user confusion. The servers make a best-effort attempt to keep this value unique from 0 to count-1, but when you have canaries on the task group, indexes are reused because multiple job versions are running at the same time. If users need a unique number for interpolating a value in their application, they can get one by combining the job version and the alloc index.

Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
-