description: An expression represents the calculation of a prensor object.

s2t.Expression

View source on GitHub

An expression represents the calculation of a prensor object.

s2t.Expression(
    is_repeated: bool,
    my_type: Optional[tf.DType],
    schema_feature: Optional[schema_pb2.Feature] = None
)

Args

`is_repeated` whether the expression is repeated.
`my_type` the DType of a field, or None for an internal node.
`schema_feature` the local schema (StructDomain information should not be present).

Attributes

`is_leaf` True iff the node tensor is a LeafNodeTensor.
`is_repeated` True iff the same parent value can have multiple children values.
`schema_feature` The schema of the field.
`type` dtype of the expression, or None if not a leaf expression.

Methods

apply

View source

apply(
    transform: Callable[['Expression'], 'Expression']
) -> "Expression"

apply_schema

View source

apply_schema(
    schema: schema_pb2.Schema
) -> "Expression"

broadcast

View source

broadcast(
    source_path: s2t.Path,
    sibling_field: s2t.Step,
    new_field_name: s2t.Step
) -> "Expression"

Broadcasts the existing field at source_path to the sibling_field.
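As a rough illustration of what broadcasting means, the sketch below models it with plain Python lists rather than the struct2tensor API. The function name `broadcast_to_children` and the session/event data are hypothetical, and the sketch assumes each parent holds exactly one value of the source field.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def broadcast_to_children(parent_values: Sequence[T],
                          children_per_parent: Sequence[int]) -> List[T]:
    """Repeats each parent's value once for every child of that parent.

    Hypothetical helper: a plain-list model, not a struct2tensor call.
    """
    if len(parent_values) != len(children_per_parent):
        raise ValueError("need one child count per parent value")
    out: List[T] = []
    for value, n_children in zip(parent_values, children_per_parent):
        out.extend([value] * n_children)
    return out


# Two sessions with ids 7 and 9; the first has 2 events, the second has 3.
session_ids = [7, 9]
events_per_session = [2, 3]
event_session_ids = broadcast_to_children(session_ids, events_per_session)
print(event_session_ids)  # prints [7, 7, 9, 9, 9]
```

Each event ends up with a copy of its own session's id, which is the effect of copying a field into a sibling submessage.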

calculate

View source

@abc.abstractmethod
calculate(
    source_tensors: Sequence[s2t.NodeTensor],
    destinations: Sequence['Expression'],
    options: calculate_options.Options,
    side_info: Optional[s2t.Prensor] = None
) -> s2t.NodeTensor

Calculates the node tensor of the expression.

The node tensor must be a function of the properties of the expression and the node tensors of the expressions from get_source_expressions().

If is_leaf, then calculate must return a LeafNodeTensor. Otherwise, it must return a ChildNodeTensor or RootNodeTensor.

If calculation_is_identity() is true, then this must return source_tensors[0].

Sometimes, for operations such as parsing the proto, calculate will return additional information. For example, calculate() for the root of the proto expression also parses out the tensors required to calculate the tensors of the children. This is why destinations are required.

For a reference use, see calculate_value_slowly(...) below.

Args
`source_tensors` The node tensors of the expressions in get_source_expressions().
`destinations` The expressions that will use the output of this method.
`options` Options for the calculation.
`side_info` An optional prensor that is used to bind to a placeholder expression.
Returns
A NodeTensor representing the output of this expression.
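The contract between get_source_expressions() and calculate() can be sketched with a toy class hierarchy. Everything below is a hypothetical plain-Python model: `ToyExpression`, `Constant`, `Identity`, `Doubled`, and `calculate_slowly` are illustrative names, node tensors are replaced by lists, and the `destinations`, `options`, and `side_info` arguments are omitted.

```python
import abc
from typing import Any, Dict, List, Optional, Sequence


class ToyExpression(abc.ABC):
    """Hypothetical stand-in for Expression; values are plain lists."""

    @abc.abstractmethod
    def get_source_expressions(self) -> Sequence["ToyExpression"]:
        """Expressions whose values calculate() needs."""

    @abc.abstractmethod
    def calculate(self, source_values: Sequence[Any]) -> Any:
        """Depends only on this expression and on source_values."""

    def calculation_is_identity(self) -> bool:
        return False


class Constant(ToyExpression):
    def __init__(self, value: List[int]):
        self._value = value

    def get_source_expressions(self) -> Sequence[ToyExpression]:
        return []

    def calculate(self, source_values: Sequence[Any]) -> Any:
        return self._value


class Identity(ToyExpression):
    def __init__(self, source: ToyExpression):
        self._source = source

    def get_source_expressions(self) -> Sequence[ToyExpression]:
        return [self._source]

    def calculation_is_identity(self) -> bool:
        return True

    def calculate(self, source_values: Sequence[Any]) -> Any:
        # An identity calculation must hand back its single source unchanged.
        return source_values[0]


class Doubled(ToyExpression):
    def __init__(self, source: ToyExpression):
        self._source = source

    def get_source_expressions(self) -> Sequence[ToyExpression]:
        return [self._source]

    def calculate(self, source_values: Sequence[Any]) -> Any:
        return [2 * v for v in source_values[0]]


def calculate_slowly(expr: ToyExpression,
                     cache: Optional[Dict[int, Any]] = None) -> Any:
    """Evaluates sources first, then the expression itself."""
    if cache is None:
        cache = {}
    if id(expr) not in cache:
        sources = [calculate_slowly(s, cache)
                   for s in expr.get_source_expressions()]
        cache[id(expr)] = expr.calculate(sources)
    return cache[id(expr)]


leaf = Constant([1, 2, 3])
alias = Identity(leaf)
result = calculate_slowly(Doubled(alias))
print(result)  # prints [2, 4, 6]
```

The recursion mirrors the rule stated above: the value of an expression is a function of its own properties and of the values of its sources, and nothing else.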

calculation_equal

View source

@abc.abstractmethod
calculation_equal(
    expression: "Expression"
) -> bool

Returns True iff self.calculate is equivalent to expression.calculate.

Given the same source node tensors, self.calculate(...) and expression.calculate(...) will have the same result.

Note that this does not check that the source expressions of the two expressions are the same. Therefore, two operations can have the same calculation, but not the same output, because their sources are different. For example, if a.calculation_is_identity() is True and b.calculation_is_identity() is True, then a.calculation_equal(b) is True. However, unless a and b have the same source, the expressions themselves are not equal.

Args
`expression` The expression to compare to.
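The distinction between an equal calculation and an equal output can be made concrete with a small model. `ToyIdentity` below is a hypothetical class, not part of struct2tensor, and lists stand in for node tensors.

```python
from typing import Any, List, Sequence


class ToyIdentity:
    """Hypothetical identity expression over a single list-valued source."""

    def __init__(self, source: List[int]):
        self.source = source

    def calculation_is_identity(self) -> bool:
        return True

    def calculation_equal(self, other: "ToyIdentity") -> bool:
        # Compares only the calculation, never the sources.
        return other.calculation_is_identity()

    def calculate(self, source_values: Sequence[Any]) -> Any:
        return source_values[0]


a = ToyIdentity(source=[1, 2, 3])
b = ToyIdentity(source=[4, 5])

same_calculation = a.calculation_equal(b)
same_output = a.calculate([a.source]) == b.calculate([b.source])
print(same_calculation, same_output)  # prints True False
```

Both objects compute the identity, so their calculations are equal, but their outputs differ because they read from different sources.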

calculation_is_identity

View source

@abc.abstractmethod
calculation_is_identity() -> bool

True iff self.calculate is the identity.

There is exactly one source, and the output of self.calculate(...) is the node tensor of this source.

cogroup_by_index

View source

cogroup_by_index(
    source_path: s2t.Path,
    left_name: s2t.Step,
    right_name: s2t.Step,
    new_field_name: s2t.Step
) -> "Expression"

Creates a cogroup of left_name and right_name at new_field_name.

create_has_field

View source

create_has_field(
    source_path: s2t.Path,
    new_field_name: s2t.Step
) -> "Expression"

Creates a field that is the presence of the source path.

create_proto_index

View source

create_proto_index(
    field_name: s2t.Step
) -> "Expression"

Creates a proto index field as a direct child of the current root.

The proto index maps each root element to the original batch index. For example: [0, 2] means the first element came from the first proto in the original input tensor and the second element came from the third proto. The created field is always "dense" -- it has the same valency as the current root.

Args
`field_name` the name of the field to be created.
Returns
An Expression object representing the result of the operation.
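The [0, 2] example can be reproduced with a plain-Python model. The helper `select_roots` and the filtering step are assumptions made for illustration; the text above does not say how root elements come to be dropped, only what the index records.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Proto = Dict[str, int]


def select_roots(batch: Sequence[Proto],
                 keep: Callable[[Proto], bool]) -> Tuple[List[Proto], List[int]]:
    """Keeps some root elements and records where each one came from.

    Hypothetical helper: a model of the proto index, not a library call.
    """
    kept: List[Proto] = []
    proto_index: List[int] = []
    for batch_position, proto in enumerate(batch):
        if keep(proto):
            kept.append(proto)
            proto_index.append(batch_position)
    return kept, proto_index


batch = [{"x": 1}, {"x": -5}, {"x": 3}]
roots, proto_index = select_roots(batch, lambda p: p["x"] > 0)
print(proto_index)  # prints [0, 2]
```

The index has exactly one entry per remaining root element, which is what "dense" means here.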

create_size_field

View source

create_size_field(
    source_path: s2t.Path,
    new_field_name: s2t.Step
) -> "Expression"

Creates a field that is the size of the source path.
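A minimal sketch of the size field, together with the presence field created by create_has_field, using nested lists instead of expressions. The helper names are hypothetical, and the sketch assumes that "presence" means the parent has at least one value at the source path.

```python
from typing import Any, List, Sequence


def size_field(values_per_parent: Sequence[Sequence[Any]]) -> List[int]:
    """Number of values each parent has at the source path."""
    return [len(values) for values in values_per_parent]


def has_field(values_per_parent: Sequence[Sequence[Any]]) -> List[bool]:
    """Assumed semantics: True when a parent has at least one value."""
    return [len(values) > 0 for values in values_per_parent]


# Three parents holding 2, 0, and 1 values of a repeated field.
values_per_parent = [["a", "b"], [], ["c"]]
sizes = size_field(values_per_parent)
presence = has_field(values_per_parent)
print(sizes)     # prints [2, 0, 1]
print(presence)  # prints [True, False, True]
```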

get_child

View source

get_child(
    field_name: s2t.Step
) -> Optional['Expression']

Gets a named child.

get_child_or_error

View source

get_child_or_error(
    field_name: s2t.Step
) -> "Expression"

Gets a named child.

get_descendant

View source

get_descendant(
    p: s2t.Path
) -> Optional['Expression']

Finds the descendant at the path.

get_descendant_or_error

View source

get_descendant_or_error(
    p: s2t.Path
) -> "Expression"

Finds the descendant at the path.

get_known_children

View source

get_known_children() -> Mapping[path.Step, 'Expression']

get_known_descendants

View source

get_known_descendants() -> Mapping[path.Path, 'Expression']

Gets a mapping from known paths to subexpressions.

The difference between this and get_descendants in Prensor is that all paths in a Prensor are realized, thus all known. But an Expression's descendants might not all be known at the point this method is called, because an expression may have an infinite number of children.

Returns
A mapping from paths (relative to the root of the subexpression) to expressions.

get_paths_with_schema

View source

get_paths_with_schema() -> List[s2t.Path]

Extract only paths that contain schema information.

get_schema

View source

get_schema(
    create_schema_features=True
) -> schema_pb2.Schema

Returns a schema for the entire tree.

Args
`create_schema_features` If True, schema features are added for all children and a schema entry is created if not available on the child. If False, features are left off of the returned schema if there is no schema_feature on the child.

get_source_expressions

View source

@abc.abstractmethod
get_source_expressions() -> Sequence['Expression']

Gets the sources of this expression.

The node tensors of the source expressions must be sufficient to calculate the node tensor of this expression (see calculate and calculate_value_slowly).

Returns
The sources of this expression.

known_field_names

View source

@abc.abstractmethod
known_field_names() -> FrozenSet[s2t.Step]

Returns known field names of the expression.

Known field names of a parsed proto correspond to the fields declared in the message. Examples of "unknown" fields are extensions and explicit casts in an any field. The only way to know if an unknown field "(foo.bar)" is present in an expression expr is to call (expr["(foo.bar)"] is not None).

Notice that simply accessing a field does not make it "known". However, setting a field (or setting a descendant of a field) will make it known.

project(...) returns an expression where the known field names are the only field names. In general, if you want to depend upon known_field_names (e.g., if you want to compile an expression), then the best approach is to project() the expression first.

Returns
An immutable set of field names.

map_field_values

View source

map_field_values(
    source_path: s2t.Path,
    operator: Callable[[tf.Tensor], tf.Tensor],
    dtype: tf.DType,
    new_field_name: s2t.Step
) -> "Expression"

Map a primitive field to create a new primitive field.

Note: the dtype argument was added after the v1 API.

Args
`source_path` the origin path.
`operator` an element-wise operator that takes a 1-dimensional vector.
`dtype` the type of the output.
`new_field_name` the name of a new sibling of source_path.
Returns
the resulting root expression.
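The element-wise requirement on `operator` can be sketched as follows. This is a plain-list model, not the library call: `map_values` and `parent_index` are hypothetical names, and a Python list stands in for the 1-dimensional tensor.

```python
from typing import Callable, List, Sequence, Tuple


def map_values(values: Sequence[int],
               parent_index: Sequence[int],
               operator: Callable[[Sequence[int]], List[int]]
               ) -> Tuple[List[int], List[int]]:
    """Applies an element-wise operator to a flat vector of field values.

    parent_index[i] says which parent owns values[i]; because the operator
    is element-wise, the new sibling field reuses it unchanged.
    """
    new_values = operator(values)
    if len(new_values) != len(values):
        raise ValueError("operator must be element-wise (same length out)")
    return new_values, list(parent_index)


values = [1, 2, 3, 4]
parent_index = [0, 0, 1, 3]  # parent 2 has no values
new_values, new_parent_index = map_values(
    values, parent_index, lambda xs: [x * 10 for x in xs])
print(new_values)        # prints [10, 20, 30, 40]
print(new_parent_index)  # prints [0, 0, 1, 3]
```

Because only the values change, the new field has the same structure as the field at the source path.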

map_ragged_tensors

View source

map_ragged_tensors(
    parent_path: s2t.Path,
    source_fields: Sequence[s2t.Step],
    operator: Callable[..., tf.SparseTensor],
    is_repeated: bool,
    dtype: tf.DType,
    new_field_name: s2t.Step
) -> "Expression"

Maps a set of primitive fields of a message to a new field.

Unlike map_field_values, this operation allows you to some degree reshape the field. For instance, you can take two optional fields and create a repeated field, or perform a reduce_sum on the last dimension of a repeated field and create an optional field. The key constraint is that the operator must return a sparse tensor of the correct dimension: i.e., a 2D sparse tensor if is_repeated is true, or a 1D sparse tensor if is_repeated is false. Moreover, the first dimension of the sparse tensor must be equal to the first dimension of the input tensor.

Args
`parent_path` the parent of the input and output fields.
`source_fields` the nonempty list of names of the source fields.
`operator` an operator that takes len(source_fields) sparse tensors and returns a sparse tensor of the appropriate shape.
`is_repeated` whether the output is repeated.
`dtype` the dtype of the result.
`new_field_name` the name of the resulting field.
Returns
A new query.

map_sparse_tensors

View source

map_sparse_tensors(
    parent_path: s2t.Path,
    source_fields: Sequence[s2t.Step],
    operator: Callable[..., tf.SparseTensor],
    is_repeated: bool,
    dtype: tf.DType,
    new_field_name: s2t.Step
) -> "Expression"

Maps a set of primitive fields of a message to a new field.

Unlike map_field_values, this operation allows you to some degree reshape the field. For instance, you can take two optional fields and create a repeated field, or perform a reduce_sum on the last dimension of a repeated field and create an optional field. The key constraint is that the operator must return a sparse tensor of the correct dimension: i.e., a 2D sparse tensor if is_repeated is true, or a 1D sparse tensor if is_repeated is false. Moreover, the first dimension of the sparse tensor must be equal to the first dimension of the input tensor.

Args
`parent_path` the parent of the input and output fields.
`source_fields` the nonempty list of names of the source fields.
`operator` an operator that takes len(source_fields) sparse tensors and returns a sparse tensor of the appropriate shape.
`is_repeated` whether the output is repeated.
`dtype` the dtype of the result.
`new_field_name` the name of the resulting field.
Returns
A new query.
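The two reshaping cases mentioned above, and the first-dimension constraint, can be modelled with plain lists. This sketch does not use tf.SparseTensor: an optional field is a list with one entry per parent (None when absent), a repeated field is a list of lists, and both helper names are hypothetical. Treating an empty row as an absent sum is a choice made for this model only.

```python
from typing import List, Optional, Sequence


def merge_optional_fields(a: Sequence[Optional[int]],
                          b: Sequence[Optional[int]]) -> List[List[int]]:
    """Two optional fields in, one repeated field out (is_repeated=True)."""
    if len(a) != len(b):
        raise ValueError("source fields must share the same first dimension")
    return [[v for v in pair if v is not None] for pair in zip(a, b)]


def sum_repeated_field(rows: Sequence[Sequence[int]]) -> List[Optional[int]]:
    """One repeated field in, one optional field out (is_repeated=False)."""
    # Modelling choice: a parent with no values gets no sum.
    return [sum(row) if row else None for row in rows]


field_a = [1, None, 3]
field_b = [10, 20, None]

merged = merge_optional_fields(field_a, field_b)
print(merged)  # prints [[1, 10], [20], [3]]

summed = sum_repeated_field([[1, 2], [], [5]])
print(summed)  # prints [3, None, 5]
```

In both cases the output has one row per parent, matching the rule that the first dimension of the result equals the first dimension of the input.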

project

View source

project(
    path_list: Sequence[CoercableToPath]
) -> "Expression"

Constrains the paths to those listed.

promote

View source

promote(
    source_path: s2t.Path,
    new_field_name: s2t.Step
)

Promotes source_path to be a field new_field_name in its grandparent.
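To illustrate promotion to the grandparent, the sketch below uses nested dictionaries and lists instead of expressions. The `session`, `event`, and `action` names and the `promote_action` helper are hypothetical.

```python
from typing import Dict, List

Event = Dict[str, List[str]]
Session = Dict[str, List[Event]]


def promote_action(session: Session) -> List[str]:
    """Collects event.action values directly under the session."""
    return [action
            for event in session["event"]
            for action in event["action"]]


sessions: List[Session] = [
    {"event": [{"action": ["a", "b"]}, {"action": ["c"]}]},
    {"event": []},
]
promoted = [promote_action(s) for s in sessions]
print(promoted)  # prints [['a', 'b', 'c'], []]
```

The values that lived two levels down (session, event, action) now sit one level down (session, action), in their original order.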

promote_and_broadcast

View source

promote_and_broadcast(
    path_dictionary: Mapping[path.Step, CoercableToPath],
    dest_path_parent: s2t.Path
) -> "Expression"

reroot

View source

reroot(
    new_root: s2t.Path
) -> "Expression"

Returns a new list of protocol buffers available at new_root.
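A plain-Python picture of rerooting, assuming a single-step path for simplicity. The `reroot_at` helper and the session/event data are hypothetical and do not call struct2tensor.

```python
from typing import Any, Dict, List, Sequence


def reroot_at(roots: Sequence[Dict[str, Any]], field: str) -> List[Any]:
    """Returns every message found at `field` as a new flat list of roots."""
    return [child for root in roots for child in root[field]]


sessions = [
    {"event": [{"id": 1}, {"id": 2}]},
    {"event": []},
    {"event": [{"id": 3}]},
]
events = reroot_at(sessions, "event")
print(events)  # prints [{'id': 1}, {'id': 2}, {'id': 3}]
```

After rerooting, the events are the top-level elements and the grouping by session is no longer visible in the list itself.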

schema_string

View source

schema_string(
    limit: Optional[int] = None
) -> str

Returns a schema for the expression.

E.g.

    repeated root:
      optional int32 foo
      optional bar:
        optional string baz
      optional int64 bak

Note that unknown fields and subexpressions are not displayed.

Args
`limit` if present, limit the recursion.
Returns
A string, describing (a part of) the schema.

slice

View source

slice(
    source_path: s2t.Path,
    new_field_name: s2t.Step,
    begin: Optional[IndexValue] = None,
    end: Optional[IndexValue] = None
) -> "Expression"

Creates a slice copy of source_path at new_field_name.

Note that if begin or end is negative, it is considered relative to the size of the array. e.g., slice(...,begin=-1) will get the last element of every array.

Args
`source_path` the source of the slice.
`new_field_name` the new field that is generated.
`begin` the beginning of the slice (inclusive).
`end` the end of the slice (exclusive).
Returns
An Expression object representing the result of the operation.
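The begin and end semantics described above (inclusive begin, exclusive end, negative values relative to the array size) match ordinary Python slicing applied to each array separately. The sketch below relies on that assumption; `slice_each` is a hypothetical helper, not the library method.

```python
from typing import List, Optional, Sequence


def slice_each(arrays: Sequence[Sequence[int]],
               begin: Optional[int] = None,
               end: Optional[int] = None) -> List[List[int]]:
    """Slices every array independently with Python slice semantics."""
    return [list(array[begin:end]) for array in arrays]


arrays = [[1, 2, 3], [4], []]

last_elements = slice_each(arrays, begin=-1)
print(last_elements)  # prints [[3], [4], []]

middle = slice_each(arrays, begin=1, end=3)
print(middle)  # prints [[2, 3], [], []]
```

Arrays that are too short simply yield fewer elements, or none, rather than raising an error in this model.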

truncate

View source

truncate(
    source_path: s2t.Path,
    limit: Union[int, tf.Tensor],
    new_field_name: s2t.Step
) -> "Expression"

Creates a truncated copy of source_path at new_field_name.

__eq__

View source

__eq__(
    expr: "Expression"
) -> bool

If hash(expr1) == hash(expr2), then expr1 == expr2.

Do not override this method.

Args
`expr` The expression to check equality against.
Returns
Boolean indicating whether the two expressions are equal.