  1. 04 Nov, 2022 2 commits
    • rpc: read node ID from allocs in UpdateAlloc · e57379c2
      Luiz Aoqui authored
      The AllocUpdateRequest struct is used in three disjoint use cases:
      
      1. Stripped allocs from clients Node.UpdateAlloc RPC using the Allocs,
         and WriteRequest fields
      2. Raft log message using the Allocs, Evals, and WriteRequest fields
      3. Plan updates using the AllocsStopped, AllocsUpdated, and Job fields
      
      Adding a new field that would only be used in one of these cases (1) made
      things more confusing and error prone. While in theory an
      AllocUpdateRequest could send allocations from different nodes, in
      practice this never actually happens since only clients call this method
      with their own allocations.
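
      Since the struct itself is not shown here, a hedged sketch of its shape,
      with field names taken from the three cases above and placeholder types
      (the real definition lives in Nomad's `nomad/structs` package):

      ```go
      package structs

      // Placeholder types so the sketch compiles on its own.
      type Allocation struct{}
      type AllocationDiff struct{}
      type Evaluation struct{}
      type Job struct{}
      type WriteRequest struct{}

      // AllocUpdateRequest as described above: three disjoint use cases
      // sharing one struct (field names follow the commit message and may
      // differ slightly from the real definition).
      type AllocUpdateRequest struct {
          // Case 1 (Node.UpdateAlloc) and case 2 (Raft log message).
          Allocs []*Allocation
          // Case 2 only.
          Evals []*Evaluation
          // Case 3 (plan updates) only.
          AllocsStopped []*AllocationDiff
          AllocsUpdated []*Allocation
          Job           *Job

          WriteRequest
      }
      ```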
    • scheduler: persist changes to reconnected allocs · cb51a281
      Luiz Aoqui authored
      Reconnected allocs have a new AllocState entry that must be persisted by
      the plan applier.
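
      A minimal sketch of what the plan applier must preserve, using
      illustrative stand-in types (the real ones live in `nomad/structs`):

      ```go
      package main

      import (
          "fmt"
          "time"
      )

      // Illustrative stand-ins for the Nomad types involved.
      type AllocState struct {
          Field string // e.g. "ClientStatus"
          Value string // e.g. "running"
          Time  time.Time
      }

      type Allocation struct {
          ClientStatus string
          AllocStates  []*AllocState
      }

      // persistAllocUpdate sketches the fix: when the plan applier copies
      // client-derived fields onto the alloc it persists, AllocStates must
      // be carried over too, or the reconnect entry is lost.
      func persistAllocUpdate(existing, updated *Allocation) *Allocation {
          out := *existing
          out.ClientStatus = updated.ClientStatus
          out.AllocStates = updated.AllocStates // keep the reconnect entry
          return &out
      }

      func main() {
          updated := &Allocation{
              ClientStatus: "running",
              AllocStates: []*AllocState{
                  {Field: "ClientStatus", Value: "running", Time: time.Now()},
              },
          }
          merged := persistAllocUpdate(&Allocation{ClientStatus: "unknown"}, updated)
          fmt.Println(merged.ClientStatus, len(merged.AllocStates)) // running 1
      }
      ```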
  2. 03 Nov, 2022 1 commit
  3. 02 Nov, 2022 6 commits
    • client: skip terminal allocations on reconnect · f5ce8a96
      Luiz Aoqui authored
      When the client reconnects with the server it synchronizes the state of
      its allocations by sending data using the `Node.UpdateAlloc` RPC and
      fetching data using the `Node.GetClientAllocs` RPC.
      
      If the data fetch happens before the data write, allocations in the
      `unknown` state will still appear as such and will trigger the
      `allocRunner.Reconnect` flow.
      
      But when the server's `DesiredStatus` for the allocation is `stop`, the
      client should not reconnect it.
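
      A hedged sketch of that client-side check, with illustrative names
      (the real status constants and the `allocRunner.Reconnect` flow live
      in Nomad's client code):

      ```go
      package main

      import "fmt"

      // Illustrative stand-ins for Nomad's alloc status values.
      const (
          AllocDesiredStatusStop    = "stop"
          AllocClientStatusComplete = "complete"
          AllocClientStatusFailed   = "failed"
      )

      type Allocation struct {
          DesiredStatus string
          ClientStatus  string
      }

      // terminal reports whether the alloc reached a state the client
      // should not resurrect.
      func (a *Allocation) terminal() bool {
          return a.DesiredStatus == AllocDesiredStatusStop ||
              a.ClientStatus == AllocClientStatusComplete ||
              a.ClientStatus == AllocClientStatusFailed
      }

      // shouldReconnect sketches the fix: an `unknown` alloc fetched from
      // the server only goes through the reconnect flow if the server
      // still wants it running.
      func shouldReconnect(serverAlloc *Allocation) bool {
          return !serverAlloc.terminal()
      }

      func main() {
          stopped := &Allocation{DesiredStatus: AllocDesiredStatusStop}
          fmt.Println(shouldReconnect(stopped)) // false: skip terminal allocs
      }
      ```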
    • code review · aa324ad0
      Luiz Aoqui authored
    • changelog: add entry for #15068 · 3e081c1f
      Luiz Aoqui authored
    • rpc: only allow alloc updates from `ready` nodes · 956a3fa9
      Luiz Aoqui authored
      Clients interact with servers using three main RPC methods:
      
        - `Node.GetAllocs` reads allocation data from the server and writes it
          to the client.
        - `Node.UpdateAlloc` reads allocation data from the client and writes
          it to the server.
        - `Node.UpdateStatus` writes the client status to the server and is
          used as the heartbeat mechanism.
      
      Clients call these three methods periodically and independently of each
      other, so no assumptions can be made about their ordering.
      
      This can generate scenarios that are hard to reason about and to code
      for. For example, when a client misses too many heartbeats it will be
      considered `down` or `disconnected` and the allocations it was running
      are set to `lost` or `unknown`.
      
      When connectivity to the rest of the cluster is restored, the natural
      mental model is that the client will heartbeat first and then update the
      status of its allocations on the servers.
      
      But since there's no inherent order between these calls, the reverse is just as
      possible: the client updates the alloc status and then heartbeats. This
      results in a state where allocs are, for example, `running` while the
      client is still `disconnected`.
      
      This commit adds a new verification to the `Node.UpdateAlloc` method to
      reject updates from nodes that are not `ready`, forcing clients to
      heartbeat first. Since the check is done server-side there is no need to
      coordinate operations client-side: clients can keep sending these requests
      independently, and the alloc update will succeed once the heartbeat has
      gone through.
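
      A minimal sketch of that server-side guard, assuming a hypothetical
      state-store lookup (names are illustrative, not Nomad's exact API):

      ```go
      package rpc

      import (
          "errors"
          "fmt"
      )

      const NodeStatusReady = "ready" // illustrative stand-in

      type Node struct {
          ID     string
          Status string
      }

      // NodeGetter is a hypothetical slice of the state-store interface.
      type NodeGetter interface {
          NodeByID(id string) (*Node, error)
      }

      // validateNodeForAllocUpdate sketches the guard added to
      // Node.UpdateAlloc: reject alloc updates unless the calling node has
      // already heartbeated its way back to ready.
      func validateNodeForAllocUpdate(state NodeGetter, nodeID string) error {
          node, err := state.NodeByID(nodeID)
          if err != nil {
              return err
          }
          if node == nil {
              return errors.New("node not found")
          }
          if node.Status != NodeStatusReady {
              return fmt.Errorf("node %s is %s, not %s", node.ID, node.Status, NodeStatusReady)
          }
          return nil
      }
      ```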
    • scheduler: prevent spurious placement on reconnect · 720513f0
      Luiz Aoqui authored
      When a client reconnects it makes two independent RPC calls:
      
        - `Node.UpdateStatus` to heartbeat and set its status as `ready`.
        - `Node.UpdateAlloc` to update the status of its allocations.
      
      These two calls can happen in any order, and if the allocations are
      updated before the heartbeat the state looks the same as a node being
      disconnected: the node status will still be `disconnected` while the
      allocation `ClientStatus` is set to `running`.
      
      The existing implementation did not handle this order of events properly:
      the scheduler would create an unnecessary placement because it treated the
      allocation as disconnecting. This extra allocation would then be quickly
      stopped by the heartbeat eval.
      
      This commit adds a new code path to handle this order of events. If the
      node is `disconnected` and the allocation `ClientStatus` is `running`,
      the scheduler checks whether the allocation actually reconnected by
      inspecting its `AllocState` events.
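
      A hedged sketch of the decision, with the reconnect detection reduced
      to a boolean (how it is derived from `AllocState` is sketched under
      the next commit):

      ```go
      package scheduler

      // Illustrative stand-in; the real reconciler lives in Nomad's
      // scheduler package.
      type allocInfo struct {
          ClientStatus string
          NodeStatus   string // status of the node the alloc is placed on
          Reconnected  bool   // derived from the alloc's AllocState events
      }

      // needsReplacement sketches the new code path: previously a running
      // alloc on a disconnected node always looked like a fresh disconnect,
      // so the scheduler placed a replacement that the heartbeat eval then
      // had to stop.
      func needsReplacement(a allocInfo) bool {
          if a.NodeStatus != "disconnected" {
              return false
          }
          if a.ClientStatus == "running" && a.Reconnected {
              // The alloc already reconnected; the node's heartbeat just
              // hasn't landed yet. No replacement needed.
              return false
          }
          return true
      }
      ```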
    • scheduler: allow updates after alloc reconnects · 882904bf
      Luiz Aoqui authored
      When an allocation reconnects to a cluster the scheduler needs to run
      special logic to handle the reconnection, check if a replacement was
      created, and stop one of them.
      
      If the allocation kept running while the node was disconnected, it will
      be reconnected with `ClientStatus: running` and the node will have
      `Status: ready`. This combination is indistinguishable from the normal
      steady state of an allocation, where everything is running as expected.
      
      In order to differentiate between the two states (an allocation that is
      reconnecting and one that is just running) the scheduler needs an extra
      piece of state.
      
      The current implementation uses the presence of a
      `TaskClientReconnected` task event to detect when the allocation has
      reconnected and thus must go through the reconnection process. But this
      event remains even after the allocation is reconnected, causing all
      future evals to consider the allocation as still reconnecting.
      
      This commit changes the reconnect logic to use an `AllocState` to
      register when the allocation was reconnected. This provides the
      following benefits:
      
        - Only a limited number of task events are kept, and they are used for
          many other purposes. It's possible that, upon reconnecting, several
          actions are triggered that could cause the `TaskClientReconnected`
          event to be dropped.
        - Task events are set by clients and so their timestamps are subject
          to time skew from servers. This prevents using time to determine if
          an allocation reconnected after a disconnect event.
        - Disconnect events are already stored as `AllocState` and so storing
          reconnects there as well makes it the only source of information
          required.
      
      With this change, the reconnect logic is only triggered if the last
      `AllocState` entry is a disconnect event, meaning that the allocation has
      not reconnected yet. Once the reconnection is handled, the new
      `ClientStatus` is stored in `AllocState`, allowing future evals to skip
      the reconnect logic.
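
      A minimal sketch of both sides of that logic, using illustrative
      stand-in types (the real `AllocState` handling lives in
      `nomad/structs`):

      ```go
      package structs

      import "time"

      // Illustrative stand-ins for the real types.
      type AllocState struct {
          Field string // e.g. "ClientStatus"
          Value string // e.g. "unknown", "running"
          Time  time.Time
      }

      type Allocation struct {
          ClientStatus string
          AllocStates  []*AllocState
      }

      // NeedsReconnect sketches the new check: the reconnect logic only
      // runs if the last recorded client-status transition is the
      // disconnect (unknown), i.e. nothing has been stored since.
      func (a *Allocation) NeedsReconnect() bool {
          for i := len(a.AllocStates) - 1; i >= 0; i-- {
              if s := a.AllocStates[i]; s.Field == "ClientStatus" {
                  return s.Value == "unknown"
              }
          }
          return false
      }

      // Reconnect sketches the bookkeeping once the reconnection has been
      // handled: storing the new ClientStatus in AllocState lets future
      // evals skip the reconnect logic for this alloc.
      func (a *Allocation) Reconnect() {
          a.AllocStates = append(a.AllocStates, &AllocState{
              Field: "ClientStatus",
              Value: a.ClientStatus,
              Time:  time.Now().UTC(),
          })
      }
      ```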
  4. 26 Oct, 2022 1 commit
  5. 25 Oct, 2022 1 commit
  6. 24 Oct, 2022 7 commits
  7. 21 Oct, 2022 6 commits
    • ci: use the same go mod cache across test-core jobs (#15006) · dbd742d8
      Seth Hoenig authored
      * ci: use the same go mod cache for test-core jobs
      
      * ci: precache go modules
      
      * ci: add a mods precache job
    • keyring: fixes for keyring replication on cluster join (#14987) · 5732eb2c
      Tim Gross authored
      * keyring: don't unblock early if rate limit burst exceeded
      
      The rate limiter returns an error and unblocks early if its burst limit is
      exceeded (unless the burst limit is Inf). Ensure we're not unblocking early,
      otherwise we'll only slow down the cases where we're already pausing to make
      external RPC requests.
      
      * keyring: set MinQueryIndex on stale queries
      
      When keyring replication makes a stale query to non-leader peers to find a key
      the leader doesn't have, we need to make sure the peer we're querying has had a
      chance to catch up to the most current index for that key. Otherwise it's
      possible for newly-added servers to query another newly-added server and get a
      non-error nil response for that key ID.
      
      Ensure that we're setting the correct reply index in the blocking query.
      
      Note that the "not found" case does not return an error, just an empty key. So
      as a belt-and-suspenders, update the handling of empty responses so tha...
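
      The burst pitfall from the first fix is easy to reproduce with
      `golang.org/x/time/rate`; a minimal sketch (the limiter configuration
      in Nomad's keyring replicator may differ):

      ```go
      package main

      import (
          "context"
          "fmt"
          "time"

          "golang.org/x/time/rate"
      )

      func main() {
          // With a finite limit and a burst of 0, Wait does not throttle:
          // it returns an error immediately, so the replication loop it
          // was meant to slow down runs unthrottled.
          bad := rate.NewLimiter(rate.Every(100*time.Millisecond), 0)
          fmt.Println(bad.Wait(context.Background())) // error, no blocking

          // A burst of at least 1 makes Wait block between calls as
          // intended instead of unblocking early.
          good := rate.NewLimiter(rate.Every(100*time.Millisecond), 1)
          fmt.Println(good.Wait(context.Background())) // <nil>
      }
      ```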
    • test: use port collision instead of cpu exhaustion (#14994) · 5ed74049
      Michael Schurter authored
      Originally this test relied on Job 1 blocking Job 2 until Job 1 had a
      terminal *ClientStatus.* Job 2 ensured it would get blocked using 2
      mechanisms:
      
      1. A constraint requiring it to be placed on the same node as Job 1.
      2. Job 2 would require all unreserved CPU on the node to ensure it would
         be blocked until Job 1's resources were free.
      
      That 2nd assertion breaks if *any previous job is still running on the
      target node!* That seems very likely to happen in the flaky world of our
      e2e tests. In fact there may be some jobs we intentionally want running
      throughout; in hindsight it was never safe to assume my test would be
      the only thing scheduled when it ran.
      
      *Ports to the rescue!* Reserving a static port means that Job 2 will now
      block on Job 1 being terminal. It will only conflict with other tests if
      those tests use that port *on every node.* I ensured no existing tests
      were using the port I chose.
      
      Other changes:
      - Gave job a bit more breathing room resource-wise.
      - Tightened timings a bit since previous failure ran into the `go test`
        time limit.
      - Cleaned up the DumpEvals output. It's quite nice and handy now!
    • docs: use of `node_class` when autoscaling (#14950) · f2318ed2
      Luiz Aoqui authored
      Document how the value of `node_class` is used during cluster scaling.
      
      https://github.com/hashicorp/nomad-autoscaler/issues/255
    • ci: use gotestsum for CI tests (#14995) · b52d40d4
      Seth Hoenig authored
      Use gotestsum in both GHA and Circle with retries enabled.
    • acl: allow tokens to read policies linked via roles to the token. (#14982) · fbe9f590
      James Rasell authored
      ACL tokens are granted permissions either by direct policy links or via
      ACL role links. Callers should therefore be able to read policies that are
      assigned to their token directly or linked indirectly through ACL roles.
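
      A hedged sketch of the resulting policy resolution, using illustrative
      stand-ins for Nomad's ACL token and role shapes:

      ```go
      package acl

      // Illustrative stand-ins for the real ACL types.
      type ACLRoleLink struct{ ID string }

      type ACLToken struct {
          Policies []string       // direct policy links
          Roles    []*ACLRoleLink // indirect links via ACL roles
      }

      type ACLRole struct {
          ID       string
          Policies []string
      }

      // readablePolicies sketches the change: a token may read any policy
      // it is linked to, whether directly or through one of its roles.
      func readablePolicies(token *ACLToken, rolesByID map[string]*ACLRole) map[string]bool {
          allowed := make(map[string]bool, len(token.Policies))
          for _, p := range token.Policies {
              allowed[p] = true
          }
          for _, link := range token.Roles {
              if role, ok := rolesByID[link.ID]; ok {
                  for _, p := range role.Policies {
                      allowed[p] = true
                  }
              }
          }
          return allowed
      }
      ```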
  8. 20 Oct, 2022 7 commits
  9. 19 Oct, 2022 9 commits