This project is mirrored from https://gitee.com/mirrors/nomad.git. Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
  1. 23 Dec, 2021 2 commits
  2. 02 Dec, 2021 1 commit
  3. 01 Dec, 2021 1 commit
  4. 30 Nov, 2021 1 commit
    • client: respect `client_auto_join` after connection loss (#11585) · d38266ae
      Tim Gross authored
      The `consul.client_auto_join` configuration block tells the Nomad
      client whether to use Consul service discovery to find Nomad
      servers. By default it is set to `true`, but contrary to the
      documentation it was only respected during the initial client
      registration. If a client missed a heartbeat, failed a
      `Node.UpdateStatus` RPC, or if there was no Nomad leader, the client
      would fall back to Consul even if `client_auto_join` was set to
      `false`. This changeset returns early from the client's trigger for
      Consul discovery if the `client_auto_join` field is set to `false`.
      d38266ae
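
      The early return described in this commit can be sketched as follows. This is a minimal illustration, not the actual Nomad code: the `Client`, `ConsulConfig`, and `TriggerDiscovery` names are hypothetical, and treating an unset value as `true` is an assumption based on the documented default.

      ```go
      package main

      import "fmt"

      // ConsulConfig is a hypothetical stand-in for the client's Consul settings.
      type ConsulConfig struct {
      	ClientAutoJoin *bool // nil means "not set"; assumed to default to true
      }

      // Client is a hypothetical, heavily simplified client.
      type Client struct {
      	consul      ConsulConfig
      	discoveries int // how many times Consul discovery actually ran
      }

      // autoJoinEnabled reports whether Consul discovery is allowed.
      func (c *Client) autoJoinEnabled() bool {
      	return c.consul.ClientAutoJoin == nil || *c.consul.ClientAutoJoin
      }

      // TriggerDiscovery stands in for the trigger reached after a missed
      // heartbeat, a failed RPC, or a missing leader. It returns early when
      // auto-join is disabled and reports whether discovery ran.
      func (c *Client) TriggerDiscovery() bool {
      	if !c.autoJoinEnabled() {
      		return false
      	}
      	c.discoveries++
      	return true
      }

      func boolPtr(b bool) *bool { return &b }

      func main() {
      	off := &Client{consul: ConsulConfig{ClientAutoJoin: boolPtr(false)}}
      	on := &Client{}
      	fmt.Println(off.TriggerDiscovery(), on.TriggerDiscovery())
      }
      ```

      Placing the check inside the trigger itself, rather than at each call site, means every failure path is covered by the same guard.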
  5. 26 Nov, 2021 1 commit
  6. 24 Nov, 2021 2 commits
    • scheduler: fix panic in system jobs when nodes filtered by class (#11565) · e50fe8c7
      Tim Gross authored
      In the system scheduler, if a subset of clients are filtered by class,
      we hit a code path where the `AllocMetric` has been copied, but the
      `Copy` method does not instantiate the various maps. This leads to an
      assignment to a nil map. This changeset ensures that the maps are
      non-nil before continuing.
      
      The `Copy` method relies on functions in the `helper` package that all
      return nil slices or maps when passed zero-length inputs. This
      changeset to fix the panic bug intentionally defers updating those
      functions because it'll have potential impact on memory usage. See
      https://github.com/hashicorp/nomad/issues/11564 for more details.
      e50fe8c7
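
      The failure mode and the guard can be sketched as follows. The `Metric` type and its methods are hypothetical and much smaller than the real `AllocMetric`; the sketch only shows why a copy that yields nil maps panics on assignment, and how a non-nil check avoids that.

      ```go
      package main

      import "fmt"

      // Metric is a hypothetical, reduced stand-in for a scheduler metric struct.
      type Metric struct {
      	ClassFiltered map[string]int
      }

      // Copy mimics a copy helper that returns a nil map for zero-length input.
      func (m *Metric) Copy() *Metric {
      	out := &Metric{}
      	if len(m.ClassFiltered) > 0 {
      		out.ClassFiltered = make(map[string]int, len(m.ClassFiltered))
      		for k, v := range m.ClassFiltered {
      			out.ClassFiltered[k] = v
      		}
      	}
      	return out
      }

      // FilterNode records a node filtered by class. Without the nil check,
      // the assignment below would panic on a freshly copied, empty Metric.
      func (m *Metric) FilterNode(class string) {
      	if m.ClassFiltered == nil {
      		m.ClassFiltered = make(map[string]int)
      	}
      	m.ClassFiltered[class]++
      }

      func main() {
      	copied := (&Metric{}).Copy() // ClassFiltered is nil here
      	copied.FilterNode("gpu")     // safe because of the guard
      	fmt.Println(copied.ClassFiltered["gpu"])
      }
      ```

      Guarding at the point of assignment leaves the copy helpers untouched, which matches the commit's stated intent to defer changing them.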
    • scheduler: fix panic in system jobs when nodes filtered by class (#11565) · 036282ba
      Tim Gross authored
      In the system scheduler, if a subset of clients are filtered by class,
      we hit a code path where the `AllocMetric` has been copied, but the
      `Copy` method does not instantiate the various maps. This leads to an
      assignment to a nil map. This changeset ensures that the maps are
      non-nil before continuing.
      
      The `Copy` method relies on functions in the `helper` package that all
      return nil slices or maps when passed zero-length inputs. This
      changeset to fix the panic bug intentionally defers updating those
      functions because it'll have potential impact on memory usage. See
      https://github.com/hashicorp/nomad/issues/11564 for more details.
      036282ba
  7. 23 Nov, 2021 2 commits
    • core: allow setting and propagation of eval priority on job de/registration (#11532) · 80dcae72
      James Rasell authored
      This change modifies the Nomad job register and deregister RPCs to
      accept an updated option set which includes eval priority. This
      param is optional and overrides the use of the job priority to set
      the eval priority.
      
      In order to ensure all evaluations as a result of the request use
      the same eval priority, the priority is shared to the
      allocReconciler and deploymentWatcher. This creates a new
      distinction between eval priority and job priority.
      
      The Nomad agent HTTP API has been modified to allow setting the
      eval priority on job update and delete. To keep consistency with
      the current v1 API, job update accepts this as a payload param;
      job delete accepts this as a query param.
      
      Any user supplied value is validated within the agent HTTP handler
      removing the need to pass invalid requests to the server.
      
      The register and deregister opts functions now allow for setting
      the eval priority on requests.
      
      The change includes a small change to the DeregisterOpts function
      which handles nil opts. This brings the function in line with the
      RegisterOpts.
      80dcae72
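
      The override and validation behavior described above can be sketched as follows. The function names are hypothetical, the 1 to 100 range is an assumption borrowed from Nomad's job priority bounds, and treating zero as "not set" is an assumption of this sketch rather than something the commit message states.

      ```go
      package main

      import (
      	"errors"
      	"fmt"
      )

      const (
      	minPriority = 1   // assumed lower bound
      	maxPriority = 100 // assumed upper bound
      )

      var errInvalidPriority = errors.New("eval priority out of range")

      // validateEvalPriority rejects out-of-range values before a request is
      // forwarded. Zero is treated as "not set" in this sketch.
      func validateEvalPriority(p int) error {
      	if p == 0 {
      		return nil
      	}
      	if p < minPriority || p > maxPriority {
      		return errInvalidPriority
      	}
      	return nil
      }

      // effectiveEvalPriority returns the optional override when it is set,
      // and falls back to the job priority otherwise.
      func effectiveEvalPriority(jobPriority, evalPriority int) int {
      	if evalPriority > 0 {
      		return evalPriority
      	}
      	return jobPriority
      }

      func main() {
      	fmt.Println(effectiveEvalPriority(50, 0), effectiveEvalPriority(50, 80))
      	fmt.Println(validateEvalPriority(101))
      }
      ```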
  8. 19 Nov, 2021 1 commit
    • qemu: add `args_allowlist` to sandbox VM command line inputs · 40de248b
      Tim Gross authored
      The QEMU driver allows arbitrary command line options, but many of
      these options give access to host resources that operators may not
      want to expose such as devices. Add an optional allowlist to the
      plugin configuration so that operators can limit the resources for
      QEMU.
      40de248b
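
      An allowlist check of this kind might look like the sketch below. This is not the driver's implementation: `validateArgs` is a hypothetical helper, and both the rule that only arguments starting with `-` are checked and the rule that an empty allowlist means "no restriction" are assumptions made for illustration.

      ```go
      package main

      import (
      	"fmt"
      	"strings"
      )

      // validateArgs returns an error for the first flag-like argument that is
      // not in the allowlist. An empty allowlist disables the check (assumed).
      func validateArgs(allowlist, args []string) error {
      	if len(allowlist) == 0 {
      		return nil
      	}
      	allowed := make(map[string]struct{}, len(allowlist))
      	for _, a := range allowlist {
      		allowed[a] = struct{}{}
      	}
      	for _, arg := range args {
      		if !strings.HasPrefix(arg, "-") {
      			continue // values that follow a flag are not checked in this sketch
      		}
      		if _, ok := allowed[arg]; !ok {
      			return fmt.Errorf("%q is not in args_allowlist", arg)
      		}
      	}
      	return nil
      }

      func main() {
      	allow := []string{"-nographic", "-m"}
      	fmt.Println(validateArgs(allow, []string{"-m", "512"}))
      	fmt.Println(validateArgs(allow, []string{"-device", "usb-host"}))
      }
      ```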
  9. 17 Nov, 2021 3 commits
    • changelog batch (#11517) · ff7a6545
      Tim Gross authored
      ff7a6545
    • api: return 404 for alloc FS list/stat endpoints (#11482) · 55c29fbf
      Tim Gross authored
      
      * api: return 404 for alloc FS list/stat endpoints
      
      If the alloc filesystem doesn't have a file requested by the List
      Files or Stat File API, we currently return a HTTP 500 error with the
      expected "file not found" error message. Return a HTTP 404 error
      instead.
      
      * update FS Handler
      
      Previously the FS handler would interpret a 500 status as a 404
      in the adapter layer by checking if the response body contained
      the text  or if the response status
      was 500, and would then throw an error code for 404.
      Co-authored-by: Jai Bhagat <jaybhagat841@gmail.com>
      55c29fbf
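
      The status mapping described in this commit can be sketched with the standard library alone. The `statFile` lookup, the handler, and the `/stat` route are hypothetical; only the idea of returning 404 for a missing file and 500 for other errors comes from the commit message.

      ```go
      package main

      import (
      	"errors"
      	"fmt"
      	"io/fs"
      	"net/http"
      	"net/http/httptest"
      )

      // statFile is a hypothetical lookup that knows about a single path.
      func statFile(path string) error {
      	if path != "/alloc/logs" {
      		return fmt.Errorf("stat %s: %w", path, fs.ErrNotExist)
      	}
      	return nil
      }

      // statHandler maps "file not found" to 404 and other errors to 500.
      func statHandler(w http.ResponseWriter, r *http.Request) {
      	err := statFile(r.URL.Query().Get("path"))
      	switch {
      	case err == nil:
      		w.WriteHeader(http.StatusOK)
      	case errors.Is(err, fs.ErrNotExist):
      		http.Error(w, err.Error(), http.StatusNotFound)
      	default:
      		http.Error(w, err.Error(), http.StatusInternalServerError)
      	}
      }

      // statusFor runs the handler in-process and returns the status code.
      func statusFor(path string) int {
      	req := httptest.NewRequest(http.MethodGet, "/stat?path="+path, nil)
      	rec := httptest.NewRecorder()
      	statHandler(rec, req)
      	return rec.Code
      }

      func main() {
      	fmt.Println(statusFor("/alloc/logs"), statusFor("/missing"))
      }
      ```

      With a distinct status code, a caller no longer needs to inspect the response body to tell a missing file from a server failure.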
    • deps: update go-getter to 1.5.9 (#11481) · df41bded
      Tim Gross authored
      go-getter 1.5.9 includes a patch from 1.5.6 that automatically unpacks
      uncompressed tar archives. Previously Nomad only unpacked compressed
      archives, but documented that it unpacked all archives.
      df41bded
  10. 15 Nov, 2021 1 commit
  11. 05 Nov, 2021 3 commits
  12. 04 Nov, 2021 3 commits
  13. 03 Nov, 2021 1 commit
  14. 02 Nov, 2021 3 commits
  15. 01 Nov, 2021 2 commits
  16. 31 Oct, 2021 1 commit
    • core: bump rejected plans from debug -> info · 0813de89
      Michael Schurter authored
      As we have continued to see reports of #9506 we need to elevate this log
      line as it is the only way to detect when plans are being *erroneously*
      rejected.
      
      Users who see this log line repeatedly should drain and restart the node
      in the log line. This seems to work around the issue.
      
      Please post any details on #9506!
      0813de89
  17. 29 Oct, 2021 1 commit
  18. 27 Oct, 2021 6 commits
    • debug: update default node-id and docs (#11398) · f46b97b2
      Dave May authored
      * debug: default node-id to all
      * debug: align cli help and website documentation
      f46b97b2
    • logging: Log the cause behind agent startup failure (#11353) · 3ce89b75
      Mahmood Ali authored
      Log the failure error when the agent fails to start. Previously, the
      agent startup failure error would be emitted to the command UI but not
      logged. So it doesn't get emitted to syslog or `log_file` if they are
      set, and it makes debugging much harder. Also, logging the error again
      before exit makes the error more visible: previously, the operator
      needed to scroll to the top to find the error.
      
      On a sample failure, the output will look like:
      ```
      ==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
      ==> Loaded configuration from sample-configs/config-bad
      ==> Starting Nomad agent...
      ==> Error starting agent: setting up server node ID failed: mkdir /path-without-permission: read-only file system
          2021-10-20T14:38:51.179-0400 [WARN]  agent.plugin_loader: skipping external plugins since plugin_dir doesn't exist: plugin_dir=/path-without-permission/plugins
          2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins
          2021-10-20T14:38:51.181-0400 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=/path-without-permission/plugins
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=mock_driver type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
          2021-10-20T14:38:51.181-0400 [ERROR] agent: error starting agent: error="setting up server node ID failed: mkdir /path-without-permission: read-only file system"
      ```
      
      This change adds the final `ERROR` message. It's easy to miss the `==>
      Error starting agent` above.
      3ce89b75
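
      The behavior can be sketched as writing the startup error to both sinks. The `reportStartupError` helper is hypothetical and uses the standard `log` package rather than the agent's own logger; the message formats are copied from the sample output above.

      ```go
      package main

      import (
      	"bytes"
      	"errors"
      	"fmt"
      	"io"
      	"log"
      	"strings"
      )

      // reportStartupError emits the error to the command UI and also logs it,
      // so that it reaches whatever sink the logger writes to.
      func reportStartupError(ui io.Writer, logger *log.Logger, err error) {
      	fmt.Fprintf(ui, "==> Error starting agent: %v\n", err)
      	logger.Printf("[ERROR] agent: error starting agent: error=%q", err.Error())
      }

      func main() {
      	var ui, logFile bytes.Buffer
      	logger := log.New(&logFile, "", 0) // stands in for syslog or log_file
      	reportStartupError(&ui, logger, errors.New("read-only file system"))
      	fmt.Println(strings.Contains(logFile.String(), "[ERROR]"))
      }
      ```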
    • vault: set JobID in Vault metadata (#11397) · 54ca97fe
      Mahmood Ali authored
      Closes: #11395 .
      54ca97fe
    • scheduler: stop allocs in unrelated nodes (#11391) · 56a7cc61
      Mahmood Ali authored
      The system scheduler should leave allocs on draining nodes as-is, but
      stop allocs on nodes that are no longer part of the job
      datacenters.
      
      Previously, the scheduler did not make the distinction and left system
      job allocs intact if they are already running.
      
      I've added a failing test first, which you can see in https://app.circleci.com/jobs/github/hashicorp/nomad/179661 .
      
      Fixes https://github.com/hashicorp/nomad/issues/11373
      56a7cc61
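
      The distinction this commit introduces can be sketched as follows. The `Node` and `Alloc` types and the `allocsToStop` helper are hypothetical and far simpler than the scheduler's data model. How a node that is both draining and outside the job's datacenters should be handled is not stated in the commit message; this sketch stops its allocs, which is an assumption.

      ```go
      package main

      import "fmt"

      // Node is a hypothetical, reduced view of a client node.
      type Node struct {
      	Datacenter string
      	Draining   bool
      }

      // Alloc is a hypothetical, reduced view of a system job allocation.
      type Alloc struct {
      	ID     string
      	NodeID string
      }

      // inDatacenters reports whether dc is one of the job's datacenters.
      func inDatacenters(dc string, jobDCs []string) bool {
      	for _, d := range jobDCs {
      		if d == dc {
      			return true
      		}
      	}
      	return false
      }

      // allocsToStop returns the IDs of allocs on nodes outside the job's
      // datacenters. Allocs on draining nodes inside them are left as-is.
      func allocsToStop(jobDCs []string, nodes map[string]Node, allocs []Alloc) []string {
      	var stop []string
      	for _, a := range allocs {
      		node, ok := nodes[a.NodeID]
      		if !ok {
      			continue // unknown nodes are out of scope for this sketch
      		}
      		if !inDatacenters(node.Datacenter, jobDCs) {
      			stop = append(stop, a.ID)
      		}
      	}
      	return stop
      }

      func main() {
      	nodes := map[string]Node{
      		"n1": {Datacenter: "dc1"},
      		"n2": {Datacenter: "dc2"},
      		"n3": {Datacenter: "dc1", Draining: true},
      	}
      	allocs := []Alloc{{"a1", "n1"}, {"a2", "n2"}, {"a3", "n3"}}
      	fmt.Println(allocsToStop([]string{"dc1"}, nodes, allocs))
      }
      ```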
    • Fix arm64 panics by updating google/snappy library to latest, 0.0.4 (#11396) · 5f6ad87c
      Mahmood Ali authored
      Pick up https://github.com/golang/snappy/pull/56 to handle arm64 architectures and fix panics. tl;dr: Golang 1.16 changed the `memmove` implementation for arm64, requiring additional CPU registers that snappy wasn't preserving in its assembly implementation.
      
      Other projects have experienced this issue as well; searching for `encode_arm64.s:666` on your favorite search engine will reveal some. Vault updated the dependency earlier this August: https://github.com/hashicorp/vault/pull/12371 .
      
      I believe this issue affects Nomad 1.2.x and 1.1.x. Nomad 1.0.x uses Golang 1.15 and isn't affected. However, backporting the change to 1.0.x should be harmless.
      
      Fixed https://github.com/hashicorp/nomad/issues/11385 .
      5f6ad87c
  19. 22 Oct, 2021 4 commits
  20. 21 Oct, 2021 1 commit