    PP-1928: Add annotations struct to ExpressionIR and rule to propagate... · de3b8060
    Natalie Serrino authored
    PP-1928: Add annotations struct to ExpressionIR and rule to propagate annotations between operators.
    
    Summary:
    We are moving toward a new model for handling metadata about expressions. This is the first diff in a sequence that will replace the existing handling of metadata in our compiler. We want to track annotations, such as metadata type, on ExpressionIRs. That way, any consumer of an ExpressionIR can easily tell whether it represents a metadata field, even if it originated from a column that has been processed and renamed since assignment. The annotations concept can be extended to other types of annotations in the future, but for now it is limited to metadata. (The major use case right now is generalizing agent pruning.)
    
    If a column has annotations and is then reassigned or processed by a downstream operator, the output column may share the same annotations, depending on the context. If it is a simple name reassignment in a map, the annotations from the input column should be copied over. There are more complex cases, such as union, where all of the input columns need to agree on a particular annotation for it to go in the output. As a result, this diff adds a rule that computes downstream annotations for columns derived from other columns or expressions that have annotations associated with them.
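
    As a rough illustration of the two cases described above (copying on a simple map rename, and requiring agreement across union inputs), here is a minimal Python sketch. It is not the code in this diff: the names Annotations, MetadataType, propagate_through_map and propagate_through_union are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class MetadataType(Enum):
    # Hypothetical metadata kinds, for illustration only.
    POD_ID = "pod_id"
    SERVICE_NAME = "service_name"


@dataclass(frozen=True)
class Annotations:
    # None means the expression carries no metadata annotation.
    metadata_type: Optional[MetadataType] = None


def propagate_through_map(input_annotations: Annotations) -> Annotations:
    """Simple rename in a map: the output column inherits the input's annotations."""
    return input_annotations


def propagate_through_union(inputs: Sequence[Annotations]) -> Annotations:
    """Union: keep an annotation only if every input column agrees on it."""
    if not inputs:
        return Annotations()
    first = inputs[0]
    if all(a.metadata_type == first.metadata_type for a in inputs):
        return Annotations(metadata_type=first.metadata_type)
    # Inputs disagree, so the output column gets no metadata annotation.
    return Annotations()
```

    In this sketch, a union of two pod_id columns keeps the pod_id annotation, while a union of a pod_id column with a service_name column (or with an unannotated column) produces an unannotated output.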
    
    Next up will be to tie the rule into the analyzer, and to set the metadata type annotation for metadata fields produced by expressions such as df.ctx['pod_id']. Then, some of the existing metadata logic/classes will be removed.
    
    Test Plan: added
    
    Reviewers: philkuz, jamesbartlett, zasgar, #engineering
    
    Reviewed By: philkuz, #engineering
    
    JIRA Issues: PP-1928
    
    Differential Revision: https://phab.corp.pixielabs.ai/D4868
    
    GitOrigin-RevId: 49f425b6e6ada2e2b3f7ec1a5c9ff4936d91bc6f