  1. 31 Aug, 2020 3 commits
    • Omid Azizi's avatar
      Move parts of String/ByteArray processing from CodeGen to Dwarvifier · ad2cb1ca
      Omid Azizi authored
      Summary:
      String/ByteArray support was done very quickly for the demo, so some short-cuts were taken.
      
      This diff rectifies part of that. It is a step in the right direction; there is more to do.
      
      Test Plan: Existing tests.
      
      Reviewers: yzhao, #engineering
      
      Reviewed By: yzhao, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6123
      
      GitOrigin-RevId: 87d558c15c54abb3a61221b0e052b151933773ac
      ad2cb1ca
    • Michelle Nguyen's avatar
      Fix updater timeout commit · 11bc6f4d
      Michelle Nguyen authored
      Summary: a wrong version, left over from when I was testing different ways to handle the timeout error, got committed instead of the actual solution that was in the diff.
      
      Test Plan: n/a
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6132
      
      GitOrigin-RevId: bf22569b64cf79e4655d3ba659c9766c5073a7a7
      11bc6f4d
    • Michelle Nguyen's avatar
      Update job: Keep cloud-connector alive · dcd9b993
      Michelle Nguyen authored
      Summary:
      We want to keep the cloud connector alive as much as possible, so that we always have a way to recover without requiring a fresh reinstall from our customers.
      The current update flow is:
      - delete all vizier resources (this includes cloudconn + its deps, minus etcd/nats)
      - launch new versions of all vizier resources
      In between the time of deletion and launching, the cloud connector deployment is completely gone from the namespace. If something happens in that time, it is possible that we may never get the cloud connector back.
      
      Instead, the flow is updated to be more like this:
      - delete all vizier resources (minus cloudconn + its deps)
      - launch new versions of all vizier resources. for cloudconn + its deps, this is an update.
      Now, there is no period of time where there is no cloud connector deployment.
      
      We need to additionally bounce the cloudconnector pod to handle a case where we try to update to the same version we're already on. since this is an update now, the cloudconnector pod just keeps running. however, we expect the cloudconnector to clean up the update job upon startup.
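
The revised flow can be sketched as follows. This is a minimal in-memory model, not the updater's actual code: the resource names, the `KEEP_ALIVE` set, and `run_update` are hypothetical, and the real job operates on Kubernetes objects.

```python
# Hypothetical names: resources that must never be deleted during an update.
KEEP_ALIVE = {"cloudconn", "cloudconn-deps"}


def run_update(current, new_versions):
    """Return the resource state after an update and a log of the actions taken."""
    log = []
    state = dict(current)
    # Step 1: delete everything except the cloud connector and its deps.
    for name in list(state):
        if name not in KEEP_ALIVE:
            del state[name]
            log.append(("delete", name))
    # Step 2: launch new versions. Resources that were kept are updated in place.
    for name, version in new_versions.items():
        action = "update" if name in state else "create"
        state[name] = version
        log.append((action, name))
    # Step 3: bounce the cloud connector so it restarts and cleans up the
    # update job, even when the version did not change.
    log.append(("restart", "cloudconn"))
    return state, log
```

In this model the cloud connector is never absent from `state`, which is the property the new flow is after.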
      
      Test Plan:
      tried 4 cases:
      - update 0.4.5 to RC with new update changes
      - deploy RC with update changes
      - update RC to same RC version
      - update RC to newer RC version
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6129
      
      GitOrigin-RevId: 365e9e74707e5d89a7f9c8794f8830635cf524fb
      dcd9b993
  2. 28 Aug, 2020 2 commits
    • Michelle Nguyen's avatar
      Fix updater timeout issue · 7ea98e01
      Michelle Nguyen authored
      Summary:
      turns out the previous fix didn't work, because the error wasn't actually the type I expected it to be.
      turns out there's no specific error type that we can check for this timeout error, so we have to check it manually.
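
A manual check of this kind usually means inspecting the error message. A minimal sketch, assuming the message contains a recognizable timeout phrase; the substrings and the `is_timeout_error` helper are illustrative assumptions, not the strings or code used in the diff:

```python
def is_timeout_error(err: Exception) -> bool:
    """Treat an error as a timeout if its message says so.

    The substrings below are assumptions for illustration only.
    """
    message = str(err).lower()
    return "timed out" in message or "timeout" in message
```

Matching on message text is brittle if the wording changes upstream, which is the trade-off when no dedicated error type is available.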
      
      Test Plan: updated the job timeout to be 2s instead of 10 mins. confirm that my cluster updated properly anyway
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6112
      
      GitOrigin-RevId: 5eaebfd5049f1ff5ab9e4badce9ea8d19ff7c406
      7ea98e01
    • Michelle Nguyen's avatar
      Fix create release script bazel path · b2bb3e7e
      Michelle Nguyen authored
      Summary:
      when we made our vizier images public/private we removed :vizier_images_bundle and replaced it with :public_vizier_images_bundle and :private_vizier_images_bundle.
      This broke the release script for rcs because it couldn't properly get the number of changed commits and the name would always be something like 0.4.6-pre-master.0, which made creating multiple rcs from the same branch very annoying
      
      Test Plan: ran it
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6126
      
      GitOrigin-RevId: 3bae66f02ca08924a909875440547a2567c27554
      b2bb3e7e
  3. 29 Aug, 2020 1 commit
    • Michelle Nguyen's avatar
      Fix LogCollector index settings · a3691180
      Michelle Nguyen authored
      Summary:
      the log-collector was constantly erroring every deploy with a "reindex not supported" error, although there was no reindexing required.
      @jamesbartlett added some logic for comparing the index settings for the existing index, and the new index, to see whether reindexing is actually required or not.
      unfortunately, even after that change, it was still complaining about reindexing not being supported.
      after taking a look, it's because we were passing in settings that looked like:
      ```
      settings: {
          number_of_shards: 2
      }
      ```
      Which elastic does accept. However, the index that's actually stored in elastic looks more like:
      ```
      settings: {
          index: {
            number_of_shards: 2
          }
      }
      ```
      so the comparison actually fails and still thinks it needs to be reindexed.
      
      to fix this, I updated our index settings to look more like the elastic-representation.
      perhaps in the future we can make our elastic index checker more robust to this case, but that doesn't really seem worthwhile to me right now
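
One way a more robust checker could look is to normalize both sides to the nested form before comparing. This is only a sketch of that idea, not part of this change; `normalize_settings` and `needs_reindex` are hypothetical helpers, and it handles only the top-level versus `index`-nested difference described above.

```python
def normalize_settings(settings: dict) -> dict:
    """Move top-level keys under "index", mirroring the stored form."""
    index = dict(settings.get("index", {}))
    for key, value in settings.items():
        if key != "index":
            # Keys already present under "index" take precedence.
            index.setdefault(key, value)
    return {"index": index}


def needs_reindex(requested: dict, stored: dict) -> bool:
    """Compare settings only after both are in the same shape."""
    return normalize_settings(requested) != normalize_settings(stored)
```

With this, `{"number_of_shards": 2}` and `{"index": {"number_of_shards": 2}}` compare as equal.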
      
      Test Plan: ran plc-dev and made sure logcollector no longer crashes
      
      Reviewers: jamesbartlett, zasgar, #engineering
      
      Reviewed By: jamesbartlett, #engineering
      
      Subscribers: jamesbartlett
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6131
      
      GitOrigin-RevId: 8af6ff1b23952f78d09ef1332c1760ec1b1b3bb4
      a3691180
  4. 28 Aug, 2020 1 commit
    • Yaxiong Zhao's avatar
      [CLEANUP] Cleanups in code_gen.cc · f99dae69
      Yaxiong Zhao authored
      Summary:
      Remove TODO that is no longer relevant: the API is no longer exposing
      code generation details.
      
      GenProgram() -> GenBCCProgram()
      
      Test Plan: Jenkins
      
      Reviewers: oazizi, #engineering
      
      Reviewed By: oazizi, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6128
      
      GitOrigin-RevId: cac1315c19003576a8de7c319632577a70c0567c
      f99dae69
  5. 27 Aug, 2020 1 commit
    • Natalie Serrino's avatar
      PP-2115: Add a heartbeat query result message to carnotpb to send to keep streaming queries alive. · 47193066
      Natalie Serrino authored
      Summary:
      added handling in the query broker for these heartbeats, and add some unit tests that were missing.
      this will help for streaming queries with sparse data where data isn't being sent over that often but we don't want to timeout.
      we can use the heartbeats to send over the execution stats for the query up to that point as well, so that streaming queries
      can still show something for the execution stats.
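
The heartbeat idea can be sketched as: wait for the next result batch, and if none arrives in time, send a heartbeat carrying the execution stats so far. The message shapes and the `next_message` helper below are hypothetical and do not reflect the carnotpb definitions.

```python
import queue


def next_message(results: "queue.Queue", timeout_s: float, exec_stats: dict) -> dict:
    """Return the next data batch, or a heartbeat if no data arrives in time."""
    try:
        batch = results.get(timeout=timeout_s)
        return {"type": "data", "batch": batch}
    except queue.Empty:
        # No data yet: keep the stream alive and report progress so far.
        return {"type": "heartbeat", "exec_stats": dict(exec_stats)}
```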
      
      Test Plan: added tests
      
      Reviewers: michelle, zasgar, #engineering, philkuz
      
      Reviewed By: #engineering, philkuz
      
      JIRA Issues: PP-2115
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6113
      
      GitOrigin-RevId: d4df2b1e4da244425b98ccd30cc16a620908ae5d
      47193066
  6. 28 Aug, 2020 2 commits
    • Natalie Serrino's avatar
      PP-2115: Deprecate old QueryBrokerService (qb implements other services) and... · a6a559b8
      Natalie Serrino authored
      PP-2115: Deprecate old QueryBrokerService (qb implements other services) and its only API, ReceiveAgentResult.
      
      Summary: These have now been subsumed by ResultSinkService, TransferResultChunk.
      
      Test Plan: existing
      
      Reviewers: michelle, philkuz, zasgar, #engineering
      
      Reviewed By: philkuz, #engineering
      
      JIRA Issues: PP-2115
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6119
      
      GitOrigin-RevId: d8580bbeb74bc77d06f8f57cdd98fa320ad371de
      a6a559b8
    • Michelle Nguyen's avatar
      PP-2175 Fix bug for registering viziers with the same name · 97659dea
      Michelle Nguyen authored
      Summary:
      when registering a vizier with a name that we already have in the database, we do a loop 10 times to try registering with the name <name_%d>, where d is the number of times the loop has run.
      this works when there are only 10 clusters with the same name, but once the 11th one tries to register, we error. obviously this isn't scalable. instead, we can try to get a random number from a large pool of large numbers.
      this is happening especially in the case where everyone's cluster is named "minikube"
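
A minimal sketch of the random-suffix approach, assuming the set of taken names can be checked directly. The function name, the pool size, and the retry count are illustrative assumptions, not the values used in the change.

```python
import random


def pick_unique_name(name, taken, rng=random, attempts=10, pool=10**9):
    """Return name if free, otherwise name plus a random numeric suffix."""
    if name not in taken:
        return name
    for _ in range(attempts):
        # Draw from a large pool so collisions stay unlikely as clusters grow.
        candidate = f"{name}_{rng.randrange(pool)}"
        if candidate not in taken:
            return candidate
    raise RuntimeError(f"could not find a free name for {name!r}")
```

Unlike a sequential suffix, the chance of exhausting all attempts does not grow with the number of clusters sharing a name until the pool itself starts to fill.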
      
      Test Plan: unit test
      
      Reviewers: zasgar, #engineering, oazizi
      
      Reviewed By: #engineering, oazizi
      
      Subscribers: oazizi
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6107
      
      GitOrigin-RevId: 19cab69e994285f865f7e1f11c2849552140073b
      97659dea
  7. 27 Aug, 2020 1 commit
    • Michelle Nguyen's avatar
      Periodically log etcd status · 791bcb11
      Michelle Nguyen authored
      Summary:
      currently we have no insight into the etcd running on our customer's clusters.
      luckily, with the client we can hit some of the etcd endpoints to see the amount of space we're currently using.
      we're constantly running into space issues on customer, so this should give us a sense of how often we may need to defrag or if a defrag won't be enough.
      
      Test Plan: ran in skaffold
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6115
      
      GitOrigin-RevId: cd3756359cf209ccfd3d11df28726e6959fd98e5
      791bcb11
  8. 28 Aug, 2020 4 commits
    • Michelle Nguyen's avatar
      PP-2178 Fix metadata agent data inconsistency · 67466b07
      Michelle Nguyen authored
      Summary:
      We ran into an error in the querybroker where the agent state was unable to update because of this error occurring every 5s:
      ```Received error running agent tracker loop. Retrying in 5 seconds. Could not update agent table metadata of unknown agent 7309788f-79d8-4f80-ad7d-b6c3e13b47aa```
      
      This means that the metadata sent the qb schema/data for an agent which has been deleted.
      Taking a look at the code, it is definitely possible to update schema/data for an agent which no longer exists.
      since we process agent updates in a queue, it's possible for us to put an update on the queue and not process it before the agent has already been deleted. this is especially possible in larger clusters where there may be many updates in the queue.
      before updating the datastore with the new schema/data, we should check whether the agent actually exists.
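
The existence check can be sketched as below, using a plain dict in place of the metadata datastore. `apply_schema_updates` and the data shapes are hypothetical.

```python
def apply_schema_updates(agents: dict, pending: list) -> list:
    """Apply queued (agent_id, schema) updates, skipping deleted agents.

    Returns the IDs of the agents whose updates were dropped.
    """
    skipped = []
    for agent_id, schema in pending:
        if agent_id not in agents:
            # The agent was deleted while its update sat in the queue.
            skipped.append(agent_id)
            continue
        agents[agent_id]["schema"] = schema
    return skipped
```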
      
      Test Plan: unit test
      
      Reviewers: nserrino, #engineering, philkuz
      
      Reviewed By: #engineering, philkuz
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6116
      
      GitOrigin-RevId: 37b1eab1cd03cdf796d701e35d951fc8f5e7e582
      67466b07
    • Natalie Serrino's avatar
      PP-2170: Update message in GRPCSink for when server context has been cancelled. · f57981aa
      Natalie Serrino authored
      Summary: tsia.
      
      Test Plan: none
      
      Reviewers: michelle, zasgar, #engineering, philkuz
      
      Reviewed By: #engineering, philkuz
      
      JIRA Issues: PP-2170
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6117
      
      GitOrigin-RevId: 861fd6deb3b513092eba783d7a2b0b1fa9a8f7e0
      f57981aa
    • Omid Azizi's avatar
      A script for testing GKE deployments · bbaf8aa1
      Omid Azizi authored
      Summary:
      For convenience.
      
      Could consider turning it into a bazel sh_test.
      
      Test Plan: None
      
      Reviewers: yzhao, #engineering
      
      Reviewed By: yzhao, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6110
      
      GitOrigin-RevId: f694fcda9a55ecc4fc201a3958b9c5f00f0137f7
      bbaf8aa1
    • Natalie Serrino's avatar
      Remove deprecated Done() API on Vizier ResultSinkService. · c0242996
      Natalie Serrino authored
      Summary: TSIA
      
      Test Plan: existing.
      
      Reviewers: michelle, philkuz, zasgar, #engineering
      
      Reviewed By: philkuz, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6109
      
      GitOrigin-RevId: 1916cbe92049a0c77bdb5c2f9649c4ad3eb33e13
      c0242996
  9. 27 Aug, 2020 5 commits
  10. 26 Aug, 2020 1 commit
  11. 27 Aug, 2020 2 commits
    • Michelle Nguyen's avatar
      Handle timeouts better in update job · 0d7d48fc
      Michelle Nguyen authored
      Summary:
      we saw a case in customer where deleting objects took longer than the 5 min timeout, which caused the updater job to just exit.
      instead, if there's a timeout error, we should try to proceed as normal. it should be OK to apply a YAML when a previous one is still deleting (unless deleting a namespace, which we aren't in this case).
      also made the timeout a little longer, which should help. however, k8s pods can be known to take hours to terminate sometimes.
      
      Test Plan: n/a
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6099
      
      GitOrigin-RevId: 20bb25ef8c583bb1b3c0a9193cb23f62317e0853
      0d7d48fc
    • Michelle Nguyen's avatar
      Update CLI to fetch multiple bundles · 08efbf55
      Michelle Nguyen authored
      Summary: tsia
      
      Test Plan: ran CLI
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6090
      
      GitOrigin-RevId: 4e25aba747b1f15e286addb9ae30e4774365085b
      08efbf55
  12. 26 Aug, 2020 2 commits
    • Natalie Serrino's avatar
      PP-2161: Add StreamIR conversion rule to analyzer. · 0f7b4429
      Natalie Serrino authored
      Summary: Depends on D6089. Add the rule to convert StreamIR nodes into setting streaming=true on their parent memory sources to the analyzer. It should run before the rule that automatically adds a limit (which it does), since that rule will skip over DataFrames that have streaming MemorySource ancestors.
      
      Test Plan: unit
      
      Reviewers: philkuz, jamesbartlett, zasgar, #engineering
      
      Reviewed By: philkuz, #engineering
      
      JIRA Issues: PP-2161
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6095
      
      GitOrigin-RevId: 407d1922d53b587e9cdfd15e19051896e359aea8
      0f7b4429
    • Natalie Serrino's avatar
      PP-2161: Don't apply automatic limits to streaming queries. · b3cfe6b3
      Natalie Serrino authored
      Summary: Streaming queries that are map only shouldn't be subject to a limit because they can execute indefinitely. In the future we may want to apply a limit per window for streaming windowed queries, which are not currently supported.
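
As a rough illustration of the rule's behavior, the sketch below walks a toy operator graph and adds a limit only when no ancestor is a streaming memory source. The `Node` class, `has_streaming_source`, and `add_limit` are hypothetical stand-ins for the planner's IR and rule, and the limit value used is arbitrary.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str                                  # e.g. "MemorySource", "Map", "Sink"
    parents: List["Node"] = field(default_factory=list)
    streaming: bool = False                    # only meaningful on MemorySource
    limit: Optional[int] = None                # only meaningful on Limit


def has_streaming_source(node: Node) -> bool:
    """True if this node or any ancestor is a streaming MemorySource."""
    if node.kind == "MemorySource" and node.streaming:
        return True
    return any(has_streaming_source(p) for p in node.parents)


def add_limit(sink: Node, n: int) -> bool:
    """Insert Limit(n) above sink unless it descends from a streaming source."""
    if has_streaming_source(sink):
        return False
    sink.parents = [Node("Limit", parents=sink.parents, limit=n)]
    return True
```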
      
      Test Plan: added unit test
      
      Reviewers: philkuz, jamesbartlett, zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      JIRA Issues: PP-2161
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6089
      
      GitOrigin-RevId: e3a14bdbbd322ef2a2b000a237fcd1d2c557ea83
      b3cfe6b3
  13. 27 Aug, 2020 1 commit
  14. 26 Aug, 2020 2 commits
    • Phillip Kuznetsov's avatar
      Auto load DataFrame docs · 1b227d77
      Phillip Kuznetsov authored
      Summary: Although we plan on manually editing the DataFrame docs, it helps to write it in one place so we can save ourselves the effort before we write them out.
      
      Test Plan: Tested and can extract them through the docstring integration thing
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6097
      
      GitOrigin-RevId: 678da0fd32bc9314827bf2dd767c6d1728047566
      1b227d77
    • Phillip Kuznetsov's avatar
      Wrap examples in backticks and fix docs formatting · 28bb23a2
      Phillip Kuznetsov authored
      Summary: Simple fix for examples to render properly and fix some docs formatting
      
      Test Plan: rendered
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6093
      
      GitOrigin-RevId: 79a85967f43d73a03d72d4defeb0b175e316ca71
      28bb23a2
  15. 27 Aug, 2020 1 commit
  16. 26 Aug, 2020 11 commits
    • Michelle Nguyen's avatar
      Update UI to fetch both opensource + px scripts · 98b46179
      Michelle Nguyen authored
      Summary:
      made two copies of the bundle.json and named them bundle-px.json and bundle-os.json for testing.
      this just updates the UI so it fetches both px and open-source bundles in a Promise.all.
      
      The pxl dev server should operate as normal, unless we also want the pxl dev server to host opensource scripts as well. that would just be a small modification to the current code.
      
      Test Plan: ran webpack
      
      Reviewers: zasgar, #engineering
      
      Reviewed By: zasgar, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6085
      
      GitOrigin-RevId: 5b3fb7f8034d5a97497d799084f42631c0f333bc
      98b46179
    • James Bartlett's avatar
      UDF Docs set 5&6 · 71808c89
      James Bartlett authored
      Summary: TSIA
      
      Test Plan: N/A
      
      Reviewers: philkuz, #engineering, zasgar
      
      Reviewed By: #engineering, zasgar
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6096
      
      GitOrigin-RevId: 32cce6277180ca1bdd5418f612b7436fd483ca81
      71808c89
    • Michelle Nguyen's avatar
      Fix high CPU usage on login/signup page · 9d0d5597
      Michelle Nguyen authored
      Summary:
      High CPU is caused by having a background SVG image. It's not clear exactly why, but
      the PNG is smaller and does not cause the CPU spike.
      
      Test Plan: Tested on Chrome.
      
      Reviewers: philkuz, michelle, #engineering
      
      Reviewed By: michelle, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6022
      
      GitOrigin-RevId: b6406f1dd813a779ffbf28e39930fb369169abf6
      9d0d5597
    • Phillip Kuznetsov's avatar
      Fixing registry test · f9a4b900
      Phillip Kuznetsov authored
      Summary: tsia
      
      Test Plan: tested
      
      Reviewers: zasgar, nserrino, #engineering
      
      Reviewed By: nserrino, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6094
      
      GitOrigin-RevId: f9d7311c4d01196df9ec3d15602c920a89f591b9
      f9a4b900
    • Natalie Serrino's avatar
      Add ResolveStreamRule to turn df.stream() into streaming memory sources · ced6cb8d
      Natalie Serrino authored
      Summary:
      This rule will return an error if someone calls df.stream() on a node with a blocking ancestor, or a node that has children besides sinks.
      PP-2115 tracks the work to add things like streaming union, agg, join in order to remove these limitations.
      Depends on D6080.
      
      Test Plan: added unit test
      
      Reviewers: philkuz, jamesbartlett, #engineering, zasgar
      
      Reviewed By: #engineering, zasgar
      
      JIRA Issues: PP-2161
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6087
      
      GitOrigin-RevId: 0699ae9e1f67c50fc1efab3aa8f1026074a3d2a8
      ced6cb8d
    • Zain Asgar's avatar
      Update service edge stats to remove broken section · add16975
      Zain Asgar authored
      Summary:
      Removes unused function.
      
      Fixup
      
      Test Plan: UI/Linter
      
      Reviewers: michelle, nserrino, philkuz, #engineering
      
      Reviewed By: michelle, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6092
      
      GitOrigin-RevId: 9cee2ecbd3d8860846aeb6153afe6bbdb5e030b1
      add16975
    • Zain Asgar's avatar
      Add additional sort to avoid infinite loop in union · 87ed6aa7
      Zain Asgar authored
      Summary:
      If we get the exact same timestamp and it happens to sort so that parent > next_parent then the loop hangs forever.
      
      In the longer term we should refactor to do this merge as a heap.
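
A heap-based merge of this kind could look like the sketch below, which is not the union node's code. It assumes each input is a time-sorted list of `(timestamp, value)` rows; `merge_by_time` is a hypothetical name. Equal timestamps are ordered by stream index, and every pop consumes one row, so the loop always terminates.

```python
import heapq


def merge_by_time(streams):
    """Merge time-sorted lists of (timestamp, value) rows into one sorted list."""
    heap = []
    for idx, rows in enumerate(streams):
        if rows:
            # Entry: (timestamp, stream index, row position). The stream
            # index breaks ties between rows with identical timestamps.
            heap.append((rows[0][0], idx, 0))
    heapq.heapify(heap)
    merged = []
    while heap:
        _, idx, pos = heapq.heappop(heap)
        merged.append(streams[idx][pos])
        if pos + 1 < len(streams[idx]):
            heapq.heappush(heap, (streams[idx][pos + 1][0], idx, pos + 1))
    return merged
```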
      
      Test Plan: Existing
      
      Reviewers: michelle, philkuz, nserrino, #engineering
      
      Reviewed By: michelle, nserrino, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6078
      
      GitOrigin-RevId: bc802d88fabca64adc0549a04bcffef8ce39cd74
      87ed6aa7
    • Natalie Serrino's avatar
      PP-2117: ExecuteScript API should stream results as soon as they are... · 609464ff
      Natalie Serrino authored
      PP-2117: ExecuteScript API should stream results as soon as they are available, and not batch them up in the qb.
      
      Summary:
      This completes the series of diffs that refactor the query broker to stream results using the query forwarder, rather than batching them up using the now-deleted query executor before sending them back.
      I also added new tests, since we didn't actually have any ExecuteScript tests before. Depends on D6073.
      
      Test Plan: added.
      
      Reviewers: michelle, philkuz, zasgar, #engineering
      
      Reviewed By: philkuz, #engineering
      
      JIRA Issues: PP-2117
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6074
      
      GitOrigin-RevId: d494e7de693ddc747538e4802125d74c431f1446
      609464ff
    • Natalie Serrino's avatar
      PP-2161: Introduce df.stream() operator and IRNode. · 76e4051c
      Natalie Serrino authored
      Summary:
      This is going to be the pxl way to mark that a query should be executed in "infinite" streaming mode.
      Next up, a rule that turns StreamIR nodes into a modification on their parent MemorySources (setting stream=true).
      
      Test Plan: added a unit test
      
      Reviewers: philkuz, jamesbartlett, zasgar, #engineering
      
      Reviewed By: jamesbartlett, #engineering
      
      JIRA Issues: PP-2161
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6080
      
      GitOrigin-RevId: 5cacfefec022888c7f2a7b9c44e28a8676aa3d02
      76e4051c
    • Zain Asgar's avatar
      Add Apache2 header to PXL scripts under px · 1bbf8486
      Zain Asgar authored
      Summary: Required as part of Apache2 license.
      
      Test Plan: N/A
      
      Reviewers: michelle, philkuz, nserrino, #engineering
      
      Reviewed By: philkuz, #engineering
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6088
      
      GitOrigin-RevId: e1a89b2bd5a76fdf747971603baec1e1c326a5ee
      1bbf8486
    • Natalie Serrino's avatar
      PP-2161: Add streaming option to memory source IR node and propagate it down... · dac67304
      Natalie Serrino authored
      PP-2161: Add streaming option to memory source IR node and propagate it down to the memory source plan node.
      
      Summary: Next up is an analyzer rule that takes StreamIRs, and sets streaming=true on their parent memory sources.
      
      Test Plan: added unit tests
      
      Reviewers: philkuz, jamesbartlett, zasgar, #engineering
      
      Reviewed By: philkuz, jamesbartlett, #engineering
      
      JIRA Issues: PP-2161
      
      Differential Revision: https://phab.corp.pixielabs.ai/D6082
      
      GitOrigin-RevId: a63c45e6f97bb2763c89d1cd8b7a696e81af2402
      dac67304