This project is mirrored from https://gitee.com/mirrors/nomad.git. Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
  1. 08 Mar, 2022 1 commit
  2. 07 Mar, 2022 1 commit
  3. 01 Mar, 2022 1 commit
  4. 28 Feb, 2022 1 commit
  5. 18 Feb, 2022 1 commit
  6. 01 Feb, 2022 1 commit
  7. 31 Jan, 2022 3 commits
  8. 28 Jan, 2022 10 commits
    • docs: add 1.1.17 to changelog · beb996b3
      Tim Gross authored
    • docs: missing changelog for #11892 (#11959) · 8665724c
      Tim Gross authored
    • 45b7aab8
    • CSI: node unmount from the client before unpublish RPC (#11892) · 136e2a8e
      Tim Gross authored
      When an allocation stops, the `csi_hook` sends an unpublish RPC to the
      servers, which then unpublish the volume via the CSI RPCs: first to the
      node plugins and then to the controller plugins. The controller RPCs
      must happen after the node RPCs so that the node has had a chance to
      unmount the volume before the controller tries to detach the
      associated device.
      
      But the client has local access to the node plugins and can
      independently determine if it's safe to send unpublish RPCs to those
      plugins. This will allow the server to treat the node plugin as
      abandoned if a client is disconnected and `stop_after_client_disconnect`
      is set. This will let the server try to send unpublish RPCs to the
      controller plugins, under the assumption that the client will be
      trying to unmount the volume on its end first.
      
      Note that the CSI `NodeUnpublishVolume`/`NodeUnstageVolume` RPCs can
      return ignorable errors in the case where the volume has already been
      unmounted from the node. Handle all other errors by retrying until we
      get success so as to give operators the opportunity to reschedule a
      failed node plugin (e.g., in the case where they accidentally drained a
      node without `-ignore-system`). Fan out the work for each volume into
      its own goroutine so that we can release a subset of volumes if only
      one is stuck.
    • CSI: tests to exercise csi_hook (#11788) · 37dea7c2
      Tim Gross authored
      Small refactoring of the allocrunner hook for CSI to make it more
      testable, and a unit test that covers most of its logic.
    • CSI: resolve invalid claim states (#11890) · 9154033a
      Tim Gross authored
      * csi: resolve invalid claim states on read
      
      It's currently possible for CSI volumes to be claimed by allocations
      that no longer exist. This changeset asserts a reasonable state at
      the state store level by registering these nil allocations as "past
      claims" on any read. This will cause any pass through the periodic GC
      or volumewatcher to trigger the unpublishing workflow for those claims.
      
      * csi: make feasibility check errors more understandable
      
      When the feasibility checker finds we have no free write claims, it
      checks to see if any of those claims are for the job we're currently
      scheduling (so that earlier versions of a job can't block claims for
      new versions) and reports a conflict if the volume can't be scheduled
      so that the user can fix their claims. But when the checker hits a
      claim that has a GC'd allocation, the state is recoverable by the
      server once claim reaping completes and no user intervention is
      required; the blocked eval should complete. Differentia...
    • tests: use standard library testing.TB · 73740b28
      Mahmood Ali authored
      Glint pulled in an updated version of mitchellh/go-testing-interface,
      which broke some existing tests because the update added a Parallel()
      method to testing.T. This switches to the standard library testing.TB,
      which doesn't have a Parallel() method.
    • csi: update leader's ACL in volumewatcher (#11891) · 0d8952a3
      Tim Gross authored
      The volumewatcher that runs on the leader needs to make RPC calls
      rather than writing to raft (as we do in the deploymentwatcher)
      because the unpublish workflow needs to make RPC calls to the
      clients. This requires that the volumewatcher has access to the
      leader's ACL token.
      
      But when leadership transitions, the new leader creates a new leader
      ACL token. This ACL token needs to be passed into the volumewatcher
      when we enable it; otherwise the volumewatcher can find itself with a
      stale token.
    • csi: reap unused volume claims at leadership transitions (#11776) · d30ceb6f
      Tim Gross authored
      When `volumewatcher.Watcher` starts on the leader, it starts a watch
      on every volume and triggers a reap of unused claims on any change to
      that volume. But if a reap is in flight during a leadership
      transition, it will fail and the event that triggered the reap will
      be dropped. Perform one reap of unused claims at the start of the
      watcher so that leadership transitions don't drop this event.
    • Merge pull request #10752 from hashicorp/b-fix-test-datarace-volumewatcher · 628959b3
      James Rasell authored
      volumewatcher: fix test data race.
  9. 18 Jan, 2022 5 commits
  10. 17 Jan, 2022 16 commits