This project is mirrored from https://gitee.com/mirrors/nomad.git. Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer.
  1. 23 Feb, 2022 1 commit
  2. 08 Feb, 2022 2 commits
  3. 28 Jan, 2022 3 commits
      CSI: move terminal alloc handling into denormalization (#11931) · 2c6de3e8
      Tim Gross authored
      * The volume claim GC method and volumewatcher both have logic
      collecting terminal allocations that duplicates most of the logic
      that's now in the state store's `CSIVolumeDenormalize` method. Copy
      this logic into the state store so that all code paths have the same
      view of the past claims.
      * Remove logic in the volume claim GC that now lives in the state
      store's `CSIVolumeDenormalize` method.
      * Remove logic in the volumewatcher that now lives in the state
      store's `CSIVolumeDenormalize` method.
      * Remove logic in the node unpublish RPC that now lives in the state
      store's `CSIVolumeDenormalize` method.
      csi: ensure that PastClaims are populated with correct mode (#11932) · 26b50083
      Tim Gross authored
      In the client's `(*csiHook) Postrun()` method, we make an unpublish
      RPC that includes a claim in the `CSIVolumeClaimStateUnpublishing`
      state and uses the mode from the client. But then in the
      `(*CSIVolume) Unpublish` RPC handler, we query the volume from the
      state store (because we only get an ID from the client). And when we
      make the client RPC for the node unpublish step, we use the _current
      volume's_ view of the mode. If the volume's mode has been changed
      before the old allocations can have their claims released, then we end
      up making a CSI RPC that will never succeed.
      
      Why does this code path get the mode from the volume and not the
      claim? Because the claim written by the GC job in `(*CoreScheduler)
      csiVolumeClaimGC` doesn't have a mode. Instead it just writes a claim
      in the unpublishing state to ensure the volumewatcher detects a "past
      claim" change and reaps all the claims on the volumes.
      
      Fix this by ensuring that the `CSIVolumeDenormalize` creates past
      claims for all nil allocations with a correct access mode set.
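
A minimal Go sketch of the fix described above, under simplified assumptions. The `Volume` and `Claim` types, the string modes, and the `denormalize` helper are hypothetical stand-ins, not Nomad's actual structs or the real `CSIVolumeDenormalize` signature. It shows a past claim being created for each claim whose allocation is gone, carrying over the mode of the original claim.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for Nomad's CSI structs.
type Claim struct {
	AllocID string
	Mode    string // "reader" or "writer" in this sketch
	State   string
}

type Volume struct {
	ReadClaims  map[string]*Claim
	WriteClaims map[string]*Claim
	PastClaims  map[string]*Claim
}

// denormalize adds a past claim for every read or write claim whose
// allocation no longer exists, preserving that claim's mode.
func denormalize(vol *Volume, allocExists func(allocID string) bool) {
	if vol.PastClaims == nil {
		vol.PastClaims = map[string]*Claim{}
	}
	for _, claims := range []map[string]*Claim{vol.ReadClaims, vol.WriteClaims} {
		for id, c := range claims {
			if allocExists(id) {
				continue
			}
			if _, ok := vol.PastClaims[id]; ok {
				continue // a past claim is already recorded
			}
			vol.PastClaims[id] = &Claim{AllocID: id, Mode: c.Mode, State: "unpublishing"}
		}
	}
}

func main() {
	vol := &Volume{
		ReadClaims:  map[string]*Claim{"alloc-1": {AllocID: "alloc-1", Mode: "reader"}},
		WriteClaims: map[string]*Claim{"alloc-2": {AllocID: "alloc-2", Mode: "writer"}},
	}
	// Pretend only alloc-2 still exists in the state store.
	denormalize(vol, func(id string) bool { return id == "alloc-2" })
	for id, c := range vol.PastClaims {
		fmt.Println(id, c.Mode, c.State)
	}
}
```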
      CSI: resolve invalid claim states (#11890) · 6e0119de
      Tim Gross authored
      * csi: resolve invalid claim states on read
      
      It's currently possible for CSI volumes to be claimed by allocations
      that no longer exist. This changeset asserts a reasonable state at
      the state store level by registering these nil allocations as "past
      claims" on any read. This will cause any pass through the periodic GC
      or volumewatcher to trigger the unpublishing workflow for those claims.
      
      * csi: make feasibility check errors more understandable
      
      When the feasibility checker finds we have no free write claims, it
      checks to see if any of those claims are for the job we're currently
      scheduling (so that earlier versions of a job can't block claims for
      new versions) and reports a conflict if the volume can't be scheduled
      so that the user can fix their claims. But when the checker hits a
      claim that has a GCd allocation, the state is recoverable by the
      server once claim reaping completes and no user intervention is
      required; the blocked eval should complete. Differentiate the
      scheduler error produced by these two conditions.
  4. 18 Jan, 2022 1 commit
      csi: volume deregistration should require exact ID (#11852) · cd0139d1
      Tim Gross authored
      The command line client sends a specific volume ID, but this isn't
      enforced at the API level and we were incorrectly using a prefix match
      for volume deregistration, resulting in cases where a volume with a
      shorter ID that's a prefix of another volume would be deregistered
      instead of the intended volume.
  5. 07 May, 2021 1 commit
  6. 07 Apr, 2021 1 commit
      CSI: use AccessMode/AttachmentMode from CSIVolumeClaim · a37af310
      Tim Gross authored
      Registration of Nomad volumes previously allowed for a single volume
      capability (access mode + attachment mode pair). The recent `volume create`
      command requires that we pass a list of requested capabilities, but the
      existing workflow for claiming volumes and attaching them on the client
      assumed that the volume's single capability was correct and unchanging.
      
      Add `AccessMode` and `AttachmentMode` to `CSIVolumeClaim`, use these fields to
      set the initial claim value, and add backwards compatibility logic to handle
      the existing volumes that already have claims without these fields.
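
The backwards compatibility step can be sketched in Go as follows. This is an illustration only: the trimmed-down `Volume` and `Claim` types and the `claimModes` helper are assumptions made for the example, and the mode strings are just sample values. The idea is that a claim written before these fields existed falls back to the volume's single capability.

```go
package main

import "fmt"

// Hypothetical, trimmed-down versions of the volume and claim structs.
type Volume struct {
	AccessMode     string
	AttachmentMode string
}

type Claim struct {
	AccessMode     string // empty on claims written before the field existed
	AttachmentMode string
}

// claimModes prefers the modes recorded on the claim and falls back to the
// volume's modes for older claims that lack them.
func claimModes(vol Volume, c Claim) (access, attachment string) {
	access, attachment = c.AccessMode, c.AttachmentMode
	if access == "" {
		access = vol.AccessMode
	}
	if attachment == "" {
		attachment = vol.AttachmentMode
	}
	return access, attachment
}

func main() {
	vol := Volume{AccessMode: "single-node-writer", AttachmentMode: "file-system"}
	fmt.Println(claimModes(vol, Claim{}))
	fmt.Println(claimModes(vol, Claim{AccessMode: "single-node-reader-only", AttachmentMode: "file-system"}))
}
```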
  7. 21 Mar, 2021 1 commit
      removed deprecated fields from Drain structs and API · 93d5187e
      Chris Baker authored
      node drain: use msgtype on txn so that events are emitted
      wip: encoding extension to add Node.Drain field back to API responses
      
      new approach for hiding Node.SecretID in the API, using `json` tag
      documented this approach in the contributing guide
      refactored the JSON handlers with extensions
      modified event stream encoding to use the go-msgpack encoders with the extensions
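
As a small illustration of hiding a field with a `json` tag, the Go sketch below uses the standard library's `encoding/json`. The two-field `Node` struct and the `encodeNode` helper are assumptions for the example and do not reproduce Nomad's real struct or its encoding extensions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is a hypothetical two-field struct used only for this example.
type Node struct {
	ID       string
	SecretID string `json:"-"` // never written by encoding/json
}

// encodeNode marshals a Node to its JSON string form.
func encodeNode(n Node) (string, error) {
	b, err := json.Marshal(n)
	return string(b), err
}

func main() {
	out, err := encodeNode(Node{ID: "node-1", SecretID: "s3cr3t"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints {"ID":"node-1"}
}
```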
  8. 18 Mar, 2021 2 commits
      CSI: unique volume per allocation · 7c756967
      Tim Gross authored
      Add a `PerAlloc` field to volume requests that directs the scheduler to test
      feasibility for volumes with a source ID that includes the allocation index
      suffix (ex. `[0]`), rather than the exact source ID.
      
      Read the `PerAlloc` field when making the volume claim at the client to
      determine if the allocation index suffix (ex. `[0]`) should be added to the
      volume source ID.
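
The suffix rule described above might look like the following in Go. The `volumeSourceID` helper and the choice to pass the allocation index directly are assumptions made for this sketch; how Nomad derives the index is not shown in the commit message.

```go
package main

import "fmt"

// volumeSourceID returns the volume ID to claim. With perAlloc set, the
// allocation index is appended as a suffix such as "[0]".
// The helper name and its parameters are hypothetical.
func volumeSourceID(source string, perAlloc bool, allocIndex int) string {
	if perAlloc {
		return fmt.Sprintf("%s[%d]", source, allocIndex)
	}
	return source
}

func main() {
	fmt.Println(volumeSourceID("data", false, 2)) // data
	fmt.Println(volumeSourceID("data", true, 2))  // data[2]
}
```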
      CSI: remove prefix matching from CSIVolumeByID and fix CLI prefix matching (#10158) · a1eaad9c
      Tim Gross authored
      Callers of `CSIVolumeByID` are generally assuming they should receive a single
      volume. This potentially results in feasibility checking being performed
      against the wrong volume if a volume's ID is a prefix of another
      volume's ID (for example: "test" and "testing").
      
      Removing the incorrect prefix matching from `CSIVolumeByID` breaks prefix
      matching in the command line client. Add the required elements for prefix
      matching to the commands and API.
  9. 16 Mar, 2021 1 commit
      Fixup uses of `sanity` (#10187) · d914990e
      Charlie Voiselle authored
      * Fixup uses of `sanity`
      * Remove unnecessary comments.
      
      These checks are better explained by earlier comments about
      the context of the test. Per @tgross, moved the tests together
      to better reinforce the overall shared context.
      
      * Update nomad/fsm_test.go
  10. 10 Mar, 2021 2 commits
      RPC endpoints to support 'nomad ui -login' · a12f4470
      Tim Gross authored
      RPC endpoints for the user-driven APIs (`UpsertOneTimeToken` and
      `ExchangeOneTimeToken`) and token expiration (`ExpireOneTimeTokens`).
      Includes adding expiration to the periodic core GC job.
      state store updates for one-time tokens · b4a516be
      Tim Gross authored
      The `OneTimeToken` struct is to support the `nomad ui -login` command. This
      changeset adds the struct to the Nomad state store.
  11. 22 Feb, 2021 1 commit
      deploymentwatcher: reset progress deadline on promotion (#10042) · 174c206b
      Tim Gross authored
      In a deployment with two groups (ex. A and B), if group A's canary becomes
      healthy before group B's, the deadline for the overall deployment will be set
      to that of group A. When the deployment is promoted, if group A is done it
      will not contribute to the next deadline cutoff. Group B's old deadline will
      be used instead, which will be in the past and immediately trigger a
      deployment progress failure. Reset the progress deadline when the job is
      promoted to avoid this bug, and to better conform with implicit user
      expectations around how the progress deadline should interact with promotions.
  12. 25 Jan, 2021 1 commit
  13. 22 Jan, 2021 1 commit
      prevent double job status update (#9768) · 3cb11326
      Drew Bailey authored
      * Prevent Job Statuses from being calculated twice
      
      https://github.com/hashicorp/nomad/pull/8435 introduced atomic eval
      insertion with job (de-)registration. This change removes a now obsolete
      guard which checked if the index was equal to the job.CreateIndex, which
      would empty the status. Now that the job registration eval insertion is
      atomic with the registration this check is no longer necessary to set
      the job statuses correctly.
      
      * test to ensure only single job event for job register
      
      * periodic e2e
      
      * separate job update summary step
      
      * fix updatejobstability to use copy instead of modified reference of job
      
      * update envoygatewaybindaddresses copy to prevent job diff on null vs empty
      
      * set ConsulGatewayBindAddress to empty map instead of nil
      
      fix nil assertions for empty map
      
      rm unnecessary guard
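
The "null vs empty" job diff mentioned above comes down to how Go compares maps. The sketch below uses `reflect.DeepEqual` from the standard library; the `normalizeBindAddrs` and `sameBindAddrs` helpers and the `map[string]string` type are assumptions for illustration, not the actual field type.

```go
package main

import (
	"fmt"
	"reflect"
)

// normalizeBindAddrs returns an empty, non-nil map when given nil so that
// later comparisons do not see a nil-versus-empty difference.
func normalizeBindAddrs(m map[string]string) map[string]string {
	if m == nil {
		return map[string]string{}
	}
	return m
}

// sameBindAddrs compares two maps the way a deep structural diff would.
func sameBindAddrs(a, b map[string]string) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	var submitted map[string]string // nil
	stored := map[string]string{}   // empty but non-nil

	fmt.Println(sameBindAddrs(submitted, stored))                     // false
	fmt.Println(sameBindAddrs(normalizeBindAddrs(submitted), stored)) // true
}
```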
  14. 11 Dec, 2020 1 commit
      Events/acl events (#9595) · 3e793ea3
      Drew Bailey authored
      * fix acl event creation
      
      * allow way to access secretID without exposing it to stream
      
      test that values are omitted
      
      test event creation
      
      test acl events
      
      payloads are pointers
      
      fix failing tests, do all security steps inside constructor
      
      * increase time
      
      * ignore empty tokens
      
      * uncomment line
      
      * changelog
  15. 08 Dec, 2020 1 commit
  16. 01 Dec, 2020 3 commits
  17. 30 Nov, 2020 1 commit
      Remove Managed Sinks from Nomad (#9470) · bf225f71
      Drew Bailey authored
      * Remove Managed Sinks from Nomad
      
      Managed Sinks were a beta feature in Nomad 1.0-beta2. During the beta
      period it was determined that this was not a scalable approach to
      support community and third party sinks.
      
      * update comment
      
      * changelog
  18. 25 Nov, 2020 1 commit
      CSI: fix transaction handling in state store (#9438) · c2aaa517
      Tim Gross authored
      When making updates to CSI plugins, the state store methods that have open
      write transactions were querying the state store using the same methods used
      by the CSI RPC endpoint, but these methods create their own top-level read
      transactions. During concurrent plugin updates (as happens when a plugin job
      is stopped), this can cause write skew in the plugin counts.
      
      * Refactor the CSIPlugin query methods to have an implementation method that
      accepts a transaction, which can be called with either a read txn or a write
      txn.
      * Refactor the CSIVolume query methods to have an implementation method that
      accepts a transaction, which can be called with either a read txn or a write
      txn.
      * CSI volumes need to be "denormalized" with their plugins and (optionally)
      allocations. Read-only RPC endpoints should take a snapshot so that we can
      make multiple state store method calls with a consistent view.
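
The refactoring pattern in the list above (a public query method that wraps an implementation method taking a transaction) can be sketched in Go. This toy `StateStore` copies a map to stand in for a transaction; it is an assumption-laden model, not go-memdb and not Nomad's state store, and every name in it is hypothetical.

```go
package main

import "fmt"

type Plugin struct {
	ID    string
	Nodes int
}

// txn is a toy transaction: a private view of the plugins table.
type txn struct{ plugins map[string]Plugin }

type StateStore struct{ plugins map[string]Plugin }

// newTxn copies the table so the caller works on a consistent view.
func (s *StateStore) newTxn() *txn {
	view := make(map[string]Plugin, len(s.plugins))
	for k, v := range s.plugins {
		view[k] = v
	}
	return &txn{plugins: view}
}

// PluginByID is the public query; it opens its own top-level transaction.
func (s *StateStore) PluginByID(id string) (Plugin, bool) {
	return pluginByIDTxn(s.newTxn(), id)
}

// pluginByIDTxn is the implementation method; it reads through whichever
// transaction the caller already holds.
func pluginByIDTxn(t *txn, id string) (Plugin, bool) {
	p, ok := t.plugins[id]
	return p, ok
}

// AddPluginNode reads and writes inside one transaction, so the count it
// increments is the one it will commit.
func (s *StateStore) AddPluginNode(id string) {
	t := s.newTxn()
	p, _ := pluginByIDTxn(t, id)
	p.ID = id
	p.Nodes++
	t.plugins[id] = p
	s.plugins = t.plugins // commit
}

func main() {
	s := &StateStore{plugins: map[string]Plugin{}}
	s.AddPluginNode("ebs")
	s.AddPluginNode("ebs")
	p, _ := s.PluginByID("ebs")
	fmt.Println(p.ID, p.Nodes)
}
```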
  19. 18 Nov, 2020 1 commit
      CSI: fix struct copying errors (#9239) · 71a378e6
      Tim Gross authored
      The CSIVolume struct "denormalizes" allocations when it's first queried from
      the state store. The CSIVolumeByID method on the state store copies the volume
      before denormalizing so that we don't end up with unexpected changes. The
      copying has some subtle bugs that meant that Allocations (as well as
      Topologies and MountOptions) were not getting copied when expected.
      
      Also, ensure we never write allocations attached to volumes to the state store
      during claims.
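
The class of copying bug described here is easy to hit in Go, because assigning a struct copies map, slice, and pointer fields by reference. The sketch below uses hypothetical, reduced `Volume` and `MountOptions` types to show a `Copy` method that duplicates those fields explicitly; it is not the actual `CSIVolume.Copy` code.

```go
package main

import "fmt"

// Reduced, hypothetical versions of the structs involved.
type MountOptions struct {
	FSType     string
	MountFlags []string
}

type Volume struct {
	ID           string
	Allocations  map[string]string
	MountOptions *MountOptions
}

// Copy returns a deep copy. A plain `out := *v` alone would leave the map,
// the pointer, and the slice shared with the original.
func (v *Volume) Copy() *Volume {
	if v == nil {
		return nil
	}
	out := *v
	out.Allocations = make(map[string]string, len(v.Allocations))
	for k, val := range v.Allocations {
		out.Allocations[k] = val
	}
	if v.MountOptions != nil {
		mo := *v.MountOptions
		mo.MountFlags = append([]string(nil), v.MountOptions.MountFlags...)
		out.MountOptions = &mo
	}
	return &out
}

func main() {
	orig := &Volume{
		ID:           "vol-1",
		Allocations:  map[string]string{},
		MountOptions: &MountOptions{FSType: "ext4", MountFlags: []string{"ro"}},
	}
	c := orig.Copy()
	c.Allocations["alloc-1"] = "running"
	c.MountOptions.MountFlags[0] = "rw"
	fmt.Println(len(orig.Allocations), orig.MountOptions.MountFlags[0])
}
```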
  20. 11 Nov, 2020 1 commit
      csi: Postrun hook should not change mode (#9323) · 0ed0b945
      Tim Gross authored
      The unpublish workflow requires that we know the mode (RW vs RO) if we want to
      unpublish the node. Update the hook and the Unpublish RPC so that we mark the
      claim for release in a new state but leave the mode alone. This fixes a bug
      where RO claims were failing node unpublish.
      
      The core job GC doesn't know the mode, but we don't need it for that workflow,
      so add a mode specifically for GC; the volumewatcher uses this as a sentinel
      to check whether claims (with their specific RW vs RO modes) need to be claimed.
  21. 10 Nov, 2020 1 commit
  22. 05 Nov, 2020 2 commits
  23. 28 Oct, 2020 1 commit
      added new policy capabilities for recommendations API · 9e2eadc7
      Chris Baker authored
      state store: call-out to generic update of job recommendations from job update method
      recommendations API work, and http endpoint errors for OSS
      support for scaling policies in task block of job spec
      add query filters for ScalingPolicy list endpoint
      command: nomad scaling policy list: added -job and -type
  24. 26 Oct, 2020 1 commit
      Send events to EventSinks (#9171) · da45c959
      Drew Bailey authored
      * Process to send events to configured sinks
      
      This PR adds a SinkManager to a server which is responsible for managing
      managed sinks. Managed sinks subscribe to the event broker and send
      events to a sink writer (webhook). When changes to the eventstore are
      made the sinkmanager and managed sink are responsible for reloading or
      starting a new managed sink.
      
      * periodically check in sink progress to raft
      
      Save progress on the last successfully sent index to raft. This allows a
      managed sink to resume close to where it left off in the event of a lost
      server or leadership change
      
      dereference eventsink so we can accurately use the watchch
      
      When using a pointer to eventsink struct it was updated immediately and our reload logic would not trigger
  25. 23 Oct, 2020 1 commit
      event sink crud operation api (#9155) · fbb199d4
      Drew Bailey authored
      * network sink rpc/api plumbing
      
      state store methods and restore
      
      upsert sink test
      
      get sink
      
      delete sink
      
      event sink list and tests
      
      go generate new msg types
      
      validate sink on upsert
      
      * go generate
  26. 22 Oct, 2020 2 commits
  27. 19 Oct, 2020 1 commit
      Events/msgtype cleanup (#9117) · 7ce0b501
      Drew Bailey authored
      * use msgtype in upsert node
      
      adds message type to signature for upsert node, update tests, remove placeholder method
      
      * UpsertAllocs msg type test setup
      
      * use upsertallocs with msg type in signature
      
      update test usage of delete node
      
      delete placeholder msgtype method
      
      * add msgtype to upsert evals signature, update test call sites with test setup msg type
      
      handle snapshot upsert eval outside of FSM and ignore eval event
      
      remove placeholder upsertevalsmsgtype
      
      handle job plan rpc and prevent event creation for plan
      
      msgtype cleanup upsertnodeevents
      
      updatenodedrain msgtype
      
      msg type 0 is a node registration event, so set the default to the ignore type
      
      * fix named import
      
      * fix signature ordering on upsertnode to match
  28. 14 Oct, 2020 4 commits
      filter on additional filter keys, remove switch statement duplication · 3c15f414
      Drew Bailey authored
      properly wire up durable event count
      
      move newline responsibility
      
      moves newline creation from NDJson to the http handler, json stream only encodes and sends now
      
      ignore snapshot restore if broker is disabled
      
      enable dev mode to access event stream without acl
      
      use mapping instead of switch
      
      use pointers for config sizes, remove unused ttl, simplify closed conn logic
      api: add field filters to /v1/{allocations,nodes} · a55f46e9
      Michael Schurter authored
      Fixes #9017
      
      The ?resources=true query parameter includes resources in the object
      stub listings. Specifically:
      
      - For `/v1/nodes?resources=true` both the `NodeResources` and
        `ReservedResources` fields are included.
      - For `/v1/allocations?resources=true` the `AllocatedResources` field is
        included.
      
      The ?task_states=false query parameter removes TaskStates from
      /v1/allocations responses. (By default TaskStates are included.)
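
The defaults described above (resources excluded unless requested, TaskStates included unless disabled) could be parsed along these lines in Go. The `boolParam` and `stubOptions` helpers are hypothetical and only use the standard `net/url` and `strconv` packages; they are not Nomad's HTTP handler code.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// boolParam reads a boolean query parameter, using def when it is absent.
func boolParam(q url.Values, key string, def bool) (bool, error) {
	raw := q.Get(key)
	if raw == "" {
		return def, nil
	}
	return strconv.ParseBool(raw)
}

// stubOptions extracts the two field filters from a raw query string.
// resources defaults to false and task_states defaults to true.
func stubOptions(rawQuery string) (resources, taskStates bool, err error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false, false, err
	}
	if resources, err = boolParam(q, "resources", false); err != nil {
		return false, false, err
	}
	if taskStates, err = boolParam(q, "task_states", true); err != nil {
		return false, false, err
	}
	return resources, taskStates, nil
}

func main() {
	fmt.Println(stubOptions(""))                                 // false true <nil>
	fmt.Println(stubOptions("resources=true&task_states=false")) // true false <nil>
}
```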
      handle txn returning error · 8711376e
      Drew Bailey authored
      Add EvictCallbackFn to handle removing entries from go-memdb when they · 39ef3263
      Drew Bailey authored
      are removed from the event buffer.
      
      Wire up event buffer size config, use pointers for structs.Events
      instead of copying.
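
A bounded buffer with an eviction callback might be sketched in Go as follows. Only the name `EvictCallbackFn` comes from the commit title; its signature here, the slice-backed `eventBuffer`, and the `Event` type are assumptions, and the callback simply records what a real one would delete from a secondary index.

```go
package main

import "fmt"

// Event is a hypothetical minimal event record.
type Event struct{ Index uint64 }

// EvictCallbackFn is called for each event dropped from the buffer.
// The signature is assumed for this sketch.
type EvictCallbackFn func(e *Event)

type eventBuffer struct {
	maxSize int
	items   []*Event // pointers, so appending does not copy events
	onEvict EvictCallbackFn
}

// Append adds an event and evicts the oldest entries once the buffer
// exceeds maxSize, notifying the callback for each one.
func (b *eventBuffer) Append(e *Event) {
	b.items = append(b.items, e)
	for len(b.items) > b.maxSize {
		oldest := b.items[0]
		b.items = b.items[1:]
		if b.onEvict != nil {
			b.onEvict(oldest)
		}
	}
}

func main() {
	var evicted []uint64
	buf := &eventBuffer{
		maxSize: 2,
		onEvict: func(e *Event) { evicted = append(evicted, e.Index) },
	}
	for i := uint64(1); i <= 4; i++ {
		buf.Append(&Event{Index: i})
	}
	fmt.Println(evicted, len(buf.items))
}
```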