  03 Apr, 2020 (6 commits)
    • Merge pull request #7574 from hashicorp/f-ui/configurable-page-sizes · 98cbe43c
      Michael Lange authored
      UI Configurable Page Sizes
    • csi: run volume claim GC on `job stop -purge` (#7615) · 2149cc69
      Lang Martin authored
      * nomad/state/state_store: error message copy/paste error
      
      * nomad/structs/structs: add a VolumeEval to the JobDeregisterResponse
      
      * nomad/job_endpoint: synchronously, volumeClaimReap on job Deregister
      
      * nomad/core_sched: make volumeClaimReap available without a CoreSched
      
      * nomad/job_endpoint: Deregister return early if the job is missing
      
      * nomad/job_endpoint_test: job Deregistration is idempotent
      
      * nomad/core_sched: conditionally ignore alloc status in volumeClaimReap
      
      * nomad/job_endpoint: volumeClaimReap all allocations, even running
      
      * nomad/core_sched_test: extra argument to collectClaimsToGCImpl
      
      * nomad/job_endpoint: job deregistration is not idempotent
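      The following is a minimal, self-contained sketch of the claim-reaping
      idea described in the changes above, using illustrative types and
      helper names rather than Nomad's real state store or core scheduler
      code:

      ```
      package main

      import "fmt"

      // Alloc is an illustrative stand-in for Nomad's allocation struct.
      type Alloc struct {
          ID       string
          VolumeID string
          Running  bool
      }

      // collectClaims gathers CSI volume claims from every allocation,
      // including running ones, mirroring the "all allocations, even
      // running" change above.
      func collectClaims(allocs []Alloc) []string {
          var claims []string
          for _, a := range allocs {
              if a.VolumeID != "" {
                  claims = append(claims, a.VolumeID)
              }
          }
          return claims
      }

      // deregisterJob sketches reaping volume claims synchronously as part
      // of job deregistration, rather than waiting for the periodic core
      // scheduler GC to run.
      func deregisterJob(jobID string, allocs []Alloc) {
          for _, vol := range collectClaims(allocs) {
              fmt.Printf("releasing claim on volume %q for job %q\n", vol, jobID)
          }
      }

      func main() {
          deregisterJob("example", []Alloc{
              {ID: "alloc-1", VolumeID: "vol-1", Running: true},
              {ID: "alloc-2", VolumeID: "", Running: false},
          })
      }
      ```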
    • Merge pull request #7622 from hashicorp/tests-deflake-TestAutopilot_RollingUpdate · 59402ff2
      Mahmood Ali authored
      tests: deflake TestAutopilot_RollingUpdate
    • tests: deflake TestAutopilot_RollingUpdate · f5439259
      Mahmood Ali authored
      I hypothesize that the flakiness in the rolling update test is due to
      shutting down the s3 server before s4 is properly added as a voter.
      
      The chain of events behind the flakiness is as follows:
      
      1. Bootstrap with s1, s2, s3
      2. Add s4
      3. Wait for servers to register with 3 voting peers
         * But we already have 3 voters (s1, s2, and s3)
         * s4 is added as a non-voter in Raft v3 and must wait until autopilot promotes it
      4. Test proceeds without s4 being a voter
      5. s3 shutdown
      6. Cluster changes stall due to leader election and too many pending
         configuration changes (e.g. removing s3 from raft, promoting s4).
      
      Here, I have the test wait until s4 is marked as a voter before shutting
      down s3, so we don't have too many configuration changes at once.
      
      In https://circleci.com/gh/hashicorp/nomad/57092, I noticed the
      following events:
      
      ```
      TestAutopilot_RollingUpdate: autopilot_test.go:204: adding server s4
          TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO]  nomad/serf.go:60: nomad: adding server: server="nomad-137.global (Addr: 127.0.0.1:9177) (DC: dc1)"
          TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.789Z [INFO]  raft/raft.go:1018: nomad.raft: updating configuration: command=AddNonvoter server-id=c54b5bf4-1159-34f6-032d-56aefeb08425 server-addr=127.0.0.1:9177 servers="[{Suffrage:Voter ID:df01ba65-d1b2-17a9-f792-a4459b3a7c09 Address:127.0.0.1:9171} {Suffrage:Voter ID:c3337778-811e-2675-87f5-006309888387 Address:127.0.0.1:9173} {Suffrage:Voter ID:186d5e15-c473-e2b3-b5a4-3259a84e10ef Address:127.0.0.1:9169} {Suffrage:Nonvoter ID:c54b5bf4-1159-34f6-032d-56aefeb08425 Address:127.0.0.1:9177}]"
      
          TestAutopilot_RollingUpdate: autopilot_test.go:218: shutting down server s3
          TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.797Z [INFO]  raft/replication.go:456: nomad.raft: aborting pipeline replication: peer="{Nonvoter c54b5bf4-1159-34f6-032d-56aefeb08425 127.0.0.1:9177}"
          TestAutopilot_RollingUpdate: autopilot_test.go:235: waiting for s4 to stabalize and be promoted
          TestAutopilot_RollingUpdate: testlog.go:34: 2020-04-03T20:08:19.975Z [ERROR] raft/raft.go:1656: nomad.raft: failed to make requestVote RPC: target="{Voter c3337778-811e-2675-87f5-006309888387 127.0.0.1:9173}" error="dial tcp 127.0.0.1:9173: connect: connection refused"
          TestAutopilot_RollingUpdate: retry.go:121: autopilot_test.go:241: don't want "c3337778-811e-2675-87f5-006309888387"
              autopilot_test.go:241: didn't find map[c54b5bf4-1159-34f6-032d-56aefeb08425:true] in []raft.ServerID{"df01ba65-d1b2-17a9-f792-a4459b3a7c09", "186d5e15-c473-e2b3-b5a4-3259a84e10ef"}
      ```
      
      Note how s3, c3337778, is present in the peers list in the final
      failure, but s4, c54b5bf4, is added as a Nonvoter and isn't present in
      the final peers list.
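      Below is a minimal sketch of the waiting pattern described above,
      assuming a hypothetical `isVoter` callback rather than Nomad's real
      test helpers or Raft configuration accessors:

      ```
      package autopilot_test

      import (
          "testing"
          "time"
      )

      // waitForVoter polls until the given server ID reports voter
      // suffrage, failing the test if it is never promoted before the
      // deadline. In a real test, isVoter would inspect the current Raft
      // configuration.
      func waitForVoter(t *testing.T, isVoter func(id string) (bool, error), id string) {
          t.Helper()
          deadline := time.Now().Add(10 * time.Second)
          for time.Now().Before(deadline) {
              if ok, err := isVoter(id); err == nil && ok {
                  return
              }
              time.Sleep(100 * time.Millisecond)
          }
          t.Fatalf("server %s was never promoted to voter", id)
      }
      ```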
    • fix encoding/decoding tags for api.Task (#7620) · f24d2514
      Tim Gross authored
      When `nomad job inspect` encodes the response, any field whose decoded
      JSON doesn't exactly match the API struct is omitted even if it has a
      value. We only want the JSON struct tag to carry `omitempty`.
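      A small illustration of the `omitempty` behavior in question, using a
      stand-in struct rather than the real `api.Task`:

      ```
      package main

      import (
          "encoding/json"
          "fmt"
      )

      // Task is a stand-in for api.Task: only the JSON struct tag carries
      // omitempty, so empty fields are dropped when encoding while
      // populated fields survive a decode/encode round trip.
      type Task struct {
          Name      string            `json:"Name,omitempty"`
          Resources map[string]string `json:"Resources,omitempty"`
      }

      func main() {
          out, _ := json.Marshal(Task{Name: "redis"})
          fmt.Println(string(out)) // {"Name":"redis"} -- empty Resources omitted
      }
      ```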
    • e2e: improve test reliability for CSI (#7616) · 2468b385
      Tim Gross authored
      This changeset:
      
      * adds eval status to the error messages emitted when we have a
        placement failure in tests. The implementation here isn't quite
        perfect, but it's a lot better than "condition not met".
      * enforces the ordering of teardown of the CSI test
      * doesn't pass the purge flag to one of the two CSI tests, so that we
        exercise both code paths.
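      As a rough illustration of the first point above, a hypothetical
      helper (the names and fields are illustrative, not the e2e framework's
      real API) that folds eval status and failed task-group counts into the
      error instead of a bare "condition not met":

      ```
      package main

      import (
          "errors"
          "fmt"
          "strings"
      )

      // placementError builds an error message carrying the eval status
      // and per-task-group placement failures, so a failed wait in an e2e
      // test reports more than "condition not met".
      func placementError(evalStatus string, failedTaskGroups map[string]int) error {
          var sb strings.Builder
          fmt.Fprintf(&sb, "placement failure, eval status %q:", evalStatus)
          for tg, n := range failedTaskGroups {
              fmt.Fprintf(&sb, " task group %q has %d unplaced allocs;", tg, n)
          }
          return errors.New(sb.String())
      }

      func main() {
          fmt.Println(placementError("blocked", map[string]int{"csi-plugin": 1}))
      }
      ```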