Orbital Internals
orbitalml.translate
Translate a pipeline into an Ibis expression.
orbitalml.translate.ResultsProjection
Projection of the results of the pipeline.
This class is used to select the columns to be returned from the pipeline. It can be used to select specific columns to include in the final result set.
It can also be used to skip the column selection step of the pipeline.
You can use the omit method to skip the projection step entirely.
Source code in orbitalml/translate.py
__init__
omit
classmethod
omit() -> ResultsProjection
orbitalml.translate.translate
translate(
table: Table,
pipeline: ParsedPipeline,
projection: ResultsProjection = ResultsProjection(),
) -> Table
Translate a pipeline into an Ibis expression.
This function takes a pipeline and a table and translates the pipeline into an Ibis expression applied to the table.
It is possible to further chain operations on the result to allow post processing of the prediction.
Source code in orbitalml/translate.py
orbitalml.translation.variables
Define the variables and group of variables used in the translation process.
orbitalml.translation.variables.VariablesGroup
Bases: dict[str, VariablesGroupVarT], Generic[VariablesGroupVarT]
A group of variables that can be used to represent a set of expressions.
This is used to represent a group of columns in a table. The group acts as a single entity on which expressions are applied.
If an expression is applied to the group, it will be applied to all columns in the group.
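The "apply to the group, apply to every column" behaviour can be sketched in plain Python. This is only an illustration: `MiniVariablesGroup` and its `apply` method are hypothetical names, not part of orbitalml, and column expressions are represented as strings rather than Ibis expressions.

```python
from typing import Any, Callable


class MiniVariablesGroup(dict):
    """Hypothetical stand-in for a group of named column expressions."""

    def apply(self, fn: Callable[[Any], Any]) -> "MiniVariablesGroup":
        # Apply the same expression to every column in the group,
        # keeping the column names and their order.
        return MiniVariablesGroup({name: fn(expr) for name, expr in self.items()})


# Column "expressions" are plain strings to keep the sketch dependency-free.
features = MiniVariablesGroup(
    {"sepal_length": "t.sepal_length", "sepal_width": "t.sepal_width"}
)
doubled = features.apply(lambda expr: f"({expr} * 2.0)")
print(doubled)
```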
Source code in orbitalml/translation/variables.py
__init__
__init__(vargroup: dict | None = None) -> None
Parameters:
Name | Type | Description | Default |
---|---|---|---|
vargroup | dict \| None | A dictionary of names and expressions that are part of the group. | None |
Source code in orbitalml/translation/variables.py
as_value
as_value(name: str) -> Value
Return a subvariable as a Value.
Values are expressions on top of which operations like comparisons, mathematical operations, etc. can be applied.
Source code in orbitalml/translation/variables.py
values_value
values_value() -> list[Value]
Return all subvariables as a list of Values.
Source code in orbitalml/translation/variables.py
orbitalml.translation.variables.ValueVariablesGroup
Bases: VariablesGroup[Value]
A group of value variables that can be used to represent a set of values.
This is used to represent a group of columns in a table. The group acts as a single entity on which expressions are applied.
If an expression is applied to the group, it will be applied to all columns in the group.
Source code in orbitalml/translation/variables.py
orbitalml.translation.variables.NumericVariablesGroup
Bases: VariablesGroup[NumericValue]
A group of numeric variables that can be used to represent a set of numeric values.
This is used to represent a group of numeric columns in a table. Steps that expect to perform mathematical operations over a variables group will create a NumericVariablesGroup from it, which guarantees that all subvariables are numeric.
Source code in orbitalml/translation/variables.py
orbitalml.translation.variables.GraphVariables
A class to manage the variables used in the translation process.
This class is responsible for managing the variables and constants used in the translation process. It keeps track of the variables that have been consumed and the variables that are still available.
When a variable is consumed it will be hidden from the list of available variables. This makes sure that the remaining variables that were not consumed are only the variables that should appear in the output (as they were set with no one consuming them afterward).
This class also manages constants (initializers) that are used in the translation process. The name being consumed can refer to either a constant or a variable. If it's a constant it won't actually be consumed, as constants never appear in the output, and thus it remains available for other nodes that need it.
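The bookkeeping described above can be illustrated with a small, dependency-free model. `MiniGraphVariables` is a hypothetical simplification written for this page, not the actual GraphVariables implementation; variables are plain strings and constants plain Python values.

```python
class MiniGraphVariables:
    """Hypothetical, simplified model of the consume/remaining bookkeeping."""

    def __init__(self, variables: dict, constants: dict) -> None:
        self._variables = dict(variables)
        self._constants = dict(constants)
        self._consumed = set()

    def consume(self, name: str):
        # Constants are returned but never marked as consumed,
        # so other nodes can keep reading them.
        if name in self._constants:
            return self._constants[name]
        self._consumed.add(name)
        return self._variables[name]

    def remaining(self) -> dict:
        # Only variables that nobody consumed are candidates for the output.
        return {k: v for k, v in self._variables.items() if k not in self._consumed}


gv = MiniGraphVariables(
    variables={"x": "t.x", "scaled": "t.x * 2"},
    constants={"scale": 2.0},
)
gv.consume("x")
gv.consume("scale")
gv.consume("scale")  # constants can be consumed any number of times
print(gv.remaining())  # {'scaled': 't.x * 2'}
```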
Source code in orbitalml/translation/variables.py
__init__
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | Table | The table the variables came from. | required |
graph | GraphProto | The pipeline graph requiring the variables and providing the constants. | required |
Source code in orbitalml/translation/variables.py
consume
consume(name: str) -> Expr | VariableTypes | VariablesGroup
Consume a variable or a constant.
Return a Python value for constants and an Expression or VariablesGroup for variables.
When a variable is consumed it will be hidden from the list of remaining variables.
Source code in orbitalml/translation/variables.py
peek_variable
peek_variable(
name: str, default: None | Expr = None
) -> Expr | VariablesGroup | None
Peek a variable without consuming it.
get_initializer
get_initializer(
name: str, default: None | TensorProto = None
) -> TensorProto | None
get_initializer_value
get_initializer_value(
name: str, default: None | VariableTypes = None
) -> VariableTypes | None
keys
nested_len
nested_len() -> int
Get the total number of variables and subvariables.
Source code in orbitalml/translation/variables.py
remaining
remaining() -> dict[str, Expr | VariablesGroup]
Return the variables that were not consumed.
generate_unique_shortname
generate_unique_shortname() -> str
orbitalml.translation.translator
Base class for the translators of each pipeline step.
orbitalml.translation.translator.Translator
Bases: ABC
Base class for all translators.
This class is responsible for translating pipeline steps into Ibis expressions.
Source code in orbitalml/translation/translator.py
mutated_table
property
The table as it is being mutated by the translator.
This is required for the translator to be able to set temporary variables that are not part of the final output.
For example when an expression is used many times, the translator can create a temporary column in the SQL query to avoid recomputing the same expression. That leads to new columns being added to the table.
__init__
__init__(
table: Table,
node: NodeProto,
variables: GraphVariables,
optimizer: Optimizer,
) -> None
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table | Table | The table the generated query should target. | required |
node | NodeProto | The pipeline node to be translated. | required |
variables | GraphVariables | The variables used during the translation process. | required |
optimizer | Optimizer | The optimizer used for the translation. | required |
Source code in orbitalml/translation/translator.py
process
abstractmethod
set_output
set_output(
value: Deferred | Expr | VariablesGroup | VariableTypes,
index: int = 0,
) -> None
Set the output variable for the translator.
This is only allowed if the translator has a single output. Otherwise the node is expected to explicitly set every variable.
Source code in orbitalml/translation/translator.py
preserve
preserve(*variables) -> list[Expr]
Preserve the given variables in the table.
This causes the variables to be projected in the table, so that future expressions can use them instead of repeating the expression.
Source code in orbitalml/translation/translator.py
variable_unique_short_alias
Generate a unique short name for a variable.
This is generally used to generate names for temporary variables that are used in the translation process.
The names are as short as possible to minimize the SQL query length.
Source code in orbitalml/translation/translator.py
orbitalml.translation.optimizer
Implement optimizations to the Ibis expression tree.
Primarily it takes care of folding constant expressions and removing unnecessary casts.
orbitalml.translation.optimizer.Optimizer
Optimizer for Ibis expressions.
This class is responsible for applying a set of optimization processes to Ibis expressions to remove unnecessary operations and reduce query complexity.
Source code in orbitalml/translation/optimizer.py
__init__
__init__(enabled: bool = True) -> None
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enabled | bool | Whether to enable the optimizer. When disabled, the optimizer will return the expression unchanged. | True |
fold_contiguous_sum
Precompute constants in a list of sums
fold_contiguous_product
Precompute constants in a list of multiplications
Source code in orbitalml/translation/optimizer.py
fold_case
Apply different folding strategies to CASE WHEN expressions.
- If the CASE is a constant, it will evaluate it immediately.
- If the CASE is an IF ELSE statement returning 1 or 0, it will be converted to a boolean expression.
- When the results and the default are the same, just return the default value.
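The three strategies above can be sketched on a toy representation. This is an illustrative assumption, not the optimizer's code: `fold_case_sketch` is a hypothetical helper, a CASE is reduced to a single condition with one result and a default, and conditions are strings instead of Ibis expressions.

```python
def fold_case_sketch(condition, result, default):
    """Fold `CASE WHEN condition THEN result ELSE default END` when possible."""
    # Constant condition: evaluate the CASE immediately.
    if isinstance(condition, bool):
        return result if condition else default
    # Result and default are the same: the condition is irrelevant.
    if result == default:
        return default
    # IF ... THEN 1 ELSE 0: the condition itself already carries the answer.
    if (result, default) == (1, 0):
        return condition
    # Nothing to fold: keep the CASE as a symbolic tuple.
    return ("case", condition, result, default)


print(fold_case_sketch(True, "a", "b"))
print(fold_case_sketch("x > 3", 1, 0))
print(fold_case_sketch("x > 3", 5, 5))
```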
Source code in orbitalml/translation/optimizer.py
fold_cast
Given a cast expression, precompute it if possible.
Source code in orbitalml/translation/optimizer.py
fold_zeros
Given a binary expression, precompute the result if it contains zeros.
Operations like x + 0, x * 0, x - 0, etc. can be folded into just x or 0 without the need to compute the operation.
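A minimal sketch of this kind of folding, assuming a toy representation where symbolic operands are strings and unfolded operations are tuples. `fold_zeros_sketch` is a hypothetical helper, not the optimizer's method. Note that 0 - x is deliberately left alone, since it is not equal to x.

```python
def fold_zeros_sketch(op: str, left, right):
    """Simplify a binary operation when one operand is the literal 0."""
    if op == "+":
        if left == 0:
            return right
        if right == 0:
            return left
    elif op == "-" and right == 0:
        return left
    elif op == "*" and (left == 0 or right == 0):
        return 0
    # Nothing to fold: keep the operation as a symbolic tuple.
    return (op, left, right)


print(fold_zeros_sketch("+", "x", 0))
print(fold_zeros_sketch("*", "x", 0))
print(fold_zeros_sketch("-", 0, "x"))
```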
Source code in orbitalml/translation/optimizer.py
fold_operation
Given a node (an Ibis expression) fold constant expressions.
If all of the node's immediate children are constant (i.e. NumericScalar), compute the operation in Python and return a literal with the result.
Otherwise, simply return the expression unchanged.
This function assumes that constant folding has already been applied to the children.
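The idea can be sketched as follows, again on a toy representation rather than on Ibis nodes. `fold_operation_sketch` and the `OPS` table are hypothetical names for this illustration. Because children are folded first, a nested constant expression collapses bottom-up.

```python
import operator

# Python implementations of the supported binary operations.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def fold_operation_sketch(op: str, left, right):
    """Compute the operation in Python when both children are constants."""
    if all(isinstance(v, (int, float)) for v in (left, right)):
        return OPS[op](left, right)
    # At least one child is symbolic: return the expression unchanged.
    return (op, left, right)


# Children are folded before their parent: (1 + 2) * 4
inner = fold_operation_sketch("+", 1, 2)
print(fold_operation_sketch("*", inner, 4))
print(fold_operation_sketch("*", "x", 4))
```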
Source code in orbitalml/translation/optimizer.py
orbitalml.translation.steps
Translators for each ParsedPipeline step
orbitalml.translation.steps.add
Translate an Add operation to the equivalent query expression.
AddTranslator
Bases: Translator
Processes an Add node and updates the variables with the output expression.
Given the node to translate and the variables and constants available in the translation context, generates a query expression that processes the input variables and produces a new output variable computed by the Add operation.
Source code in orbitalml/translation/steps/add.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/add.py
orbitalml.translation.steps.argmax
Defines the translation step for the ArgMax operation.
ArgMaxTranslator
Bases: Translator
Processes an ArgMax node and updates the variables with the output expression.
Given the node to translate and the variables and constants available in the translation context, generates a query expression that processes the input variables and produces a new output variable computed by the ArgMax operation.
The ArgMax implementation is currently limited to emitting a variable that represents the index of the column with the maximum value in a group of columns. It is not possible to compute the max of a set of rows thus axis must be 1 and keepdims must be 1.
As it computes the maximum out of a set of columns, argmax expects a columns group as its input.
The limitation is due to the fact that we can't mix variables with a different amount of rows and MAX(col) would end up producing a single row. This is usually ok, as ArgMax is primarily used to pick the values with the maximum value out of the features analyzed by the model, and thus it is only required to produce a new value for each entry on which to perform a prediction/classification (row).
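As a row-level illustration of what "the index of the column with the maximum value" means, here is a plain-Python sketch. `argmax_columns` is a hypothetical helper and rows are dictionaries standing in for a column group; the real translator emits a query expression instead of computing values.

```python
def argmax_columns(row: dict) -> int:
    """Return the position of the column holding the largest value in one row."""
    values = list(row.values())
    return max(range(len(values)), key=lambda i: values[i])


rows = [
    {"p_setosa": 0.1, "p_versicolor": 0.7, "p_virginica": 0.2},
    {"p_setosa": 0.8, "p_versicolor": 0.1, "p_virginica": 0.1},
]
# One index per row, so the output has as many rows as the input.
print([argmax_columns(r) for r in rows])  # [1, 0]
```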
Source code in orbitalml/translation/steps/argmax.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/argmax.py
orbitalml.translation.steps.arrayfeatureextractor
ArrayFeatureExtractorTranslator
Bases: Translator
Processes an ArrayFeatureExtractor node and updates the variables with the output expression.
ArrayFeatureExtractor can be considered the opposite of ConcatTranslator, as in most cases it will be used to pick one or more features out of a group of columns previously concatenated, or to pick a specific feature out of the result of an ArgMax operation.
The provided indices always refer to the last axis of the input tensor.
If the input is a 2D tensor, the last axis is the column axis, so an index of 0 means the first column. If the input is a 1D tensor instead, the last axis is the row axis, so an index of 0 means the first row.
This can be confusing because axes are inverted between tensors and orbitalml column groups. For tensors, index=0 means row=0, while for orbitalml column groups (by virtue of being a group of columns), index=0 means the first column.
For column groups, the indices we receive are therefore column indices, not row indices, because for a 2D tensor the last axis is the column axis. For single columns, the index is instead the index of a row, as it would be with a 1D tensor.
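The two interpretations of an index can be sketched in plain Python. `extract_features` is a hypothetical helper: a column group is modeled as a dictionary of column expressions and a single column as a list of row values.

```python
def extract_features(data, indices):
    """Pick entries along the last axis of `data`."""
    if isinstance(data, dict):
        # Column group: indices select columns by position.
        names = list(data)
        return {names[i]: data[names[i]] for i in indices}
    # Single column (1D): indices select rows.
    return [data[i] for i in indices]


group = {"a": "t.a", "b": "t.b", "c": "t.c"}
print(extract_features(group, [0, 2]))      # first and third column
print(extract_features([10, 20, 30], [0]))  # first row
```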
Source code in orbitalml/translation/steps/arrayfeatureextractor.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/arrayfeatureextractor.py
orbitalml.translation.steps.cast
Translators for Cast and CastLike operations
CastTranslator
Bases: Translator
Processes a Cast node and updates the variables with the output expression.
The Cast operation is used to convert a variable from one type to another one, provided by the attribute to.
Source code in orbitalml/translation/steps/cast.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/cast.py
CastLikeTranslator
Bases: Translator
Processes a CastLike node and updates the variables with the output expression.
The CastLike operation is used to convert a variable to the type of another variable, so that the two have the same type.
Source code in orbitalml/translation/steps/cast.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/cast.py
orbitalml.translation.steps.concat
Translator for Concat and FeatureVectorizer operations.
ConcatTranslator
Bases: Translator
Concatenate multiple columns into a single group of columns.
In tensor terms, this is meant to create a new tensor by concatenating the inputs along a given axis. In most cases, this is used to concatenate multiple features into a single one, thus its purpose is usually to create a column group from separate columns.
This means that the most common use case is axis=1, which means concatenating over the columns (by virtue of columns and rows in tensors being flipped with respect to column groups), and thus only the axis=1 case is supported.
Source code in orbitalml/translation/steps/concat.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/concat.py
FeatureVectorizerTranslator
Bases: Translator
Concatenate multiple columns into a single group of columns.
This is similar to Concat, but it is a simplified version that always acts only on columns and does not support concatenating over rows. Concat could in theory support row concatenation, even though orbitalml doesn't implement it.
Source code in orbitalml/translation/steps/concat.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/concat.py
orbitalml.translation.steps.div
Defines the translation step for the Div operation.
DivTranslator
Bases: Translator
Processes a Div node and updates the variables with the output expression.
This class is responsible for handling the division operation in the translation process. It takes two inputs: the first operand and the second operand (divisor).
The first operand can be a column group or a single column, while the second operand must be a constant value.
When the second operand is a single value, all columns of the column group are divided by that value. If the second operand is instead a list, each column of the column group is divided by the corresponding value in the list.
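The scalar and list cases can be sketched as below. `divide_group` is a hypothetical helper that builds expression strings for illustration; the length check is an assumption of this sketch, not documented behaviour.

```python
def divide_group(group: dict, divisor) -> dict:
    """Divide every column of a group by a scalar or by per-column constants."""
    if isinstance(divisor, (int, float)):
        # Single value: the same divisor applies to every column.
        divisors = [divisor] * len(group)
    else:
        # List: one divisor per column, matched by position.
        divisors = list(divisor)
        if len(divisors) != len(group):
            raise ValueError("expected one divisor per column")
    return {
        name: f"({expr} / {d})"
        for (name, expr), d in zip(group.items(), divisors)
    }


group = {"a": "t.a", "b": "t.b"}
print(divide_group(group, 2.0))
print(divide_group(group, [2.0, 4.0]))
```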
Source code in orbitalml/translation/steps/div.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/div.py
orbitalml.translation.steps.gather
Defines the translation step for the Gather operation.
GatherTranslator
Bases: Translator
Processes a Gather node and updates the variables with the output expression.
The Gather operation is meant to pick a specific value out of a column or column group.
The first operand can be a column group or a single column, while the second operand must be a constant value.
When the first operand is a column, the second operand must be 0 as there is only one column.
The operation could in theory be used to pick a specific row of columns by setting axis=0, but this is not supported in the current implementation.
Source code in orbitalml/translation/steps/gather.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/gather.py
orbitalml.translation.steps.identity
Implementation of the Identity operator.
IdentityTranslator
Bases: Translator
Processes an Identity node and updates the variables with the output expression.
The identity node is a no-op: it simply passes the input to the output. It is meant to copy the input into the output, but as there can be multiple references to the same expression, it doesn't actually need to perform a copy.
Source code in orbitalml/translation/steps/identity.py
process
Performs the translation and sets the output variable.
orbitalml.translation.steps.imputer
Implementation of the Imputer operator.
ImputerTranslator
Bases: Translator
Processes an Imputer node and updates the variables with the output expression.
The imputer node replaces missing values in the input expression with another value. Currently the only supported value is a float, which is used to replace all missing values in the input expression.
Source code in orbitalml/translation/steps/imputer.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/imputer.py
orbitalml.translation.steps.labelencoder
Implementation of the LabelEncoder operator.
LabelEncoderTranslator
Bases: Translator
Processes a LabelEncoder node and updates the variables with the output expression.
LabelEncoder is used to map values from one variable to values of another one. It is usually meant to map numeric values to categories.
Source code in orbitalml/translation/steps/labelencoder.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/labelencoder.py
orbitalml.translation.steps.linearclass
Implementation of the LinearClassifier operator.
LinearClassifierTranslator
Bases: Translator
Processes a LinearClassifier node and updates variables with the classification results.
The LinearClassifier operator computes classification outputs as: Scores = X * coefficients + intercepts
For more complex pipelines the LinearClassifier operator is not always used; usually a combination of Mul and Add operations is used instead.
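The formula above can be worked through on a single row. This is a numeric sketch under simplifying assumptions: `linear_classifier` is a hypothetical helper, the prediction is taken as the label with the highest score, and no post-transform is applied to the scores.

```python
def linear_classifier(row: dict, coefficients: list, intercepts: list, labels: list):
    """Compute per-class scores for one row and pick the best label."""
    features = list(row.values())
    # Scores = X * coefficients + intercepts, one score per class.
    scores = [
        sum(x * c for x, c in zip(features, class_coefs)) + intercept
        for class_coefs, intercept in zip(coefficients, intercepts)
    ]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores


label, scores = linear_classifier(
    {"a": 1.0, "b": 2.0},
    coefficients=[[1.0, 0.0], [0.0, 1.0]],
    intercepts=[0.5, 0.0],
    labels=["no", "yes"],
)
print(label, scores)  # yes [1.5, 2.0]
```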
Source code in orbitalml/translation/steps/linearclass.py
process
Performs the translation and sets the output variables Y (predictions) and Z (scores).
Source code in orbitalml/translation/steps/linearclass.py
orbitalml.translation.steps.linearreg
Implementation of the LinearRegression operator.
LinearRegressorTranslator
Bases: Translator
Processes a LinearRegression node and updates variables with the predicted expression.
The LinearRegression operator computes predictions as: Y = X * coefficients + intercept
For more complex pipelines the LinearRegression operator is not always used; usually a combination of Mul and Add operations is used instead.
Source code in orbitalml/translation/steps/linearreg.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/linearreg.py
orbitalml.translation.steps.matmul
Implementation of the MatMul operator.
MatMulTranslator
Bases: Translator
Processes a MatMul node and updates the variables with the output expression.
This class is responsible for handling the matrix multiplication operation in the translation process. It takes two inputs: the first operand and the second operand (coefficient tensor). The first operand can be a column group or a single column, while the second operand must be a constant value.
When the second operand is a single value, all columns of the column group are multiplied by that value. If the second operand is instead a list, each column of the column group is multiplied by the corresponding value in the list.
Source code in orbitalml/translation/steps/matmul.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/matmul.py
orbitalml.translation.steps.mul
Translate a Mul operation to the equivalent query expression.
MulTranslator
Bases: Translator
Processes a Mul node and updates the variables with the output expression.
Given the node to translate and the variables and constants available in the translation context, generates a query expression that processes the input variables and produces a new output variable computed by the Mul operation.
Source code in orbitalml/translation/steps/mul.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/mul.py
orbitalml.translation.steps.onehotencoder
Implementation of the OneHotEncoder operator.
OneHotEncoderTranslator
Bases: Translator
Processes a OneHotEncoder node and updates the variables with the output expression.
Given a categorical variable, this class creates a new group of columns, with one column for each category. The values of the column are 1.0 if the original column value is equal to the category, and 0.0 otherwise.
It supports only strings for categories and emits floats as column values.
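A row-level sketch of the encoding, assuming string categories and float outputs as stated above. `one_hot` is a hypothetical helper and the `is_` column naming is invented for the example, not the naming scheme orbitalml uses.

```python
def one_hot(value: str, categories: list) -> dict:
    """Emit one float column per category: 1.0 on a match, 0.0 otherwise."""
    return {f"is_{category}": 1.0 if value == category else 0.0 for category in categories}


print(one_hot("red", ["red", "green", "blue"]))
# {'is_red': 1.0, 'is_green': 0.0, 'is_blue': 0.0}
```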
Source code in orbitalml/translation/steps/onehotencoder.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/onehotencoder.py
orbitalml.translation.steps.reshape
Implementation of the Reshape operator.
ReshapeTranslator
Bases: Translator
Processes a Reshape node and updates the variables with the output expression.
Reshape is currently a no-op: it only supports cases where it doesn't have to change the data shape. It is generally not possible to support columns of different lengths in the same expression or table, so we can't really change the shape of a column, as that implies changing its length.
Source code in orbitalml/translation/steps/reshape.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/reshape.py
orbitalml.translation.steps.scaler
Implementation of the Scaler operator.
ScalerTranslator
Bases: Translator
Processes a Scaler node and updates variables with the scaled expression.
The Scaler operator applies a scaling and offset to the input: Y = (X - offset) * scale
The Scaler operation is not always used; for more complex pipelines usually a combination of Sub and Mul operations is used instead.
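The formula can be checked on one row of a column group. `scale_row` is a hypothetical helper that assumes one offset and one scale per column, matched by position.

```python
def scale_row(row: dict, offsets: list, scales: list) -> dict:
    """Apply Y = (X - offset) * scale to each column of one row."""
    return {
        name: (x - offset) * scale
        for (name, x), offset, scale in zip(row.items(), offsets, scales)
    }


print(scale_row({"a": 10.0, "b": 3.0}, offsets=[4.0, 1.0], scales=[0.5, 2.0]))
# {'a': 3.0, 'b': 4.0}
```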
Source code in orbitalml/translation/steps/scaler.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/scaler.py
orbitalml.translation.steps.softmax
Implementation of the Softmax operator.
SoftmaxTranslator
Bases: Translator
Processes a Softmax node and updates the variables with the output expression.
The operation computes the normalized exponential of the input:
Softmax = Exp(input) / Sum(Exp(input))
Currently the Softmax operation is supported only for axis=-1 or axis=1, which for a column group means that the softmax is computed independently for each column in the group.
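A numeric sketch of the formula for one row of a column group. It assumes that the sum in the denominator runs over the columns of the group, so every column gets its own output value and each row's outputs add up to 1. `softmax_row` is a hypothetical helper, not the translator's compute_softmax.

```python
import math


def softmax_row(row: dict) -> dict:
    """Compute Exp(x) / Sum(Exp(x)) across the columns of one row."""
    exps = {name: math.exp(value) for name, value in row.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


print(softmax_row({"a": 0.0, "b": 0.0}))  # {'a': 0.5, 'b': 0.5}
probabilities = softmax_row({"a": 1.0, "b": 2.0, "c": 3.0})
print(probabilities)
```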
Source code in orbitalml/translation/steps/softmax.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/softmax.py
compute_softmax
classmethod
compute_softmax(
translator: Translator,
data: NumericValue | VariablesGroup,
) -> Expr | VariablesGroup
Computes the actual softmax operation over a column or column group.
Source code in orbitalml/translation/steps/softmax.py
orbitalml.translation.steps.sub
Implementation of the Sub operator.
SubTranslator
Bases: Translator
Processes a Sub node and updates the variables with the output expression.
Given the node to translate and the variables and constants available in the translation context, generates a query expression that processes the input variables and produces a new output variable computed by the Sub operation.
Source code in orbitalml/translation/steps/sub.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/sub.py
orbitalml.translation.steps.trees
Translators for tree-based models.
TreeEnsembleClassifierTranslator
Bases: Translator
Processes a TreeEnsembleClassifier node and updates the variables with the output expression.
This node is foundational for most tree-based models:
- Random Forest
- Gradient Boosted Trees
- Decision Trees
The parsing of the tree is done by the build_tree function, which results in a dictionary of trees.
The class parses the trees to generate a set of CASE WHEN THEN ELSE expressions that are used to compute the votes for each class.
The class also computes the probability of each class by dividing the votes by the sum of all votes.
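How a tree of nested dictionaries can turn into nested CASE WHEN expressions, and how votes become probabilities, can be sketched as follows. The node layout (`feature`, `threshold`, `true`, `false`, `leaf`) and both helpers are assumptions made for this illustration; the structure produced by build_tree may differ.

```python
def tree_to_case(node: dict) -> str:
    """Render a nested-dictionary decision tree as a SQL CASE expression."""
    if "leaf" in node:
        return str(node["leaf"])
    condition = f"{node['feature']} <= {node['threshold']}"
    return (
        f"CASE WHEN {condition} "
        f"THEN {tree_to_case(node['true'])} "
        f"ELSE {tree_to_case(node['false'])} END"
    )


def votes_to_probabilities(votes: dict) -> dict:
    """Divide each class's votes by the sum of all votes."""
    total = sum(votes.values())
    return {label: count / total for label, count in votes.items()}


tree = {
    "feature": "petal_length",
    "threshold": 2.5,
    "true": {"leaf": 0},
    "false": {
        "feature": "petal_width",
        "threshold": 1.7,
        "true": {"leaf": 1},
        "false": {"leaf": 2},
    },
}
print(tree_to_case(tree))
print(votes_to_probabilities({"setosa": 3, "versicolor": 1}))
```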
Source code in orbitalml/translation/steps/trees/classifier.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/trees/classifier.py
build_classifier
build_classifier(
input_expr: Expr | VariablesGroup,
) -> tuple[Expr, VariablesGroup]
Build the classification expression and the probabilities expressions
Return the classification expression as the first argument and a group of variables (one for each category) for the probability expressions.
Source code in orbitalml/translation/steps/trees/classifier.py
TreeEnsembleRegressorTranslator
Bases: Translator
Processes a TreeEnsembleRegressor node and updates the variables with the output expression.
This node is foundational for most tree-based models:
- Gradient Boosted Trees
- Decision Trees
The parsing of the tree is done by the build_tree function, which results in a dictionary of trees.
The class parses the trees to generate a set of CASE WHEN THEN ELSE expressions that are used to compute the prediction for each tree.
Source code in orbitalml/translation/steps/trees/regressor.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/trees/regressor.py
build_regressor
build_regressor(input_expr: VariablesGroup | Expr) -> Expr
Build the regression expression
Source code in orbitalml/translation/steps/trees/regressor.py
classifier
Implement classification based on trees
TreeEnsembleClassifierTranslator
Bases: Translator
Processes a TreeEnsembleClassifier node and updates the variables with the output expression.
This node is foundational for most tree-based models:
- Random Forest
- Gradient Boosted Trees
- Decision Trees
The parsing of the tree is done by the build_tree function, which results in a dictionary of trees.
The class parses the trees to generate a set of CASE WHEN THEN ELSE expressions that are used to compute the votes for each class.
The class also computes the probability of each class by dividing the votes by the sum of all votes.
Source code in orbitalml/translation/steps/trees/classifier.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/trees/classifier.py
build_classifier
build_classifier(
input_expr: Expr | VariablesGroup,
) -> tuple[Expr, VariablesGroup]
Build the classification expression and the probabilities expressions
Returns a tuple: the first element is the classification expression, and the second is a group of variables (one for each category) holding the probability expressions.
Source code in orbitalml/translation/steps/trees/classifier.py
regressor
Implement regression based on trees
TreeEnsembleRegressorTranslator
Bases: Translator
Processes a TreeEnsembleRegressor node and updates the variables with the output expression.
This node is foundational for most tree-based models:
- Gradient Boosted Trees
- Decision Trees
The parsing of the tree is done by the build_tree function, which results in a dictionary of trees.
The class parses the trees to generate a set of CASE WHEN THEN ELSE expressions that are used to compute the prediction for each tree.
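To make the CASE WHEN THEN ELSE idea concrete, the sketch below renders one nested-dictionary tree as a SQL string. It is only an illustration: the translator builds Ibis expressions rather than strings, and the `tree_to_case` helper, the dictionary keys, and the fixed `<=` comparison are all assumptions made for this example.

```python
def tree_to_case(node: dict) -> str:
    """Render a nested-dictionary tree as a SQL CASE WHEN expression."""
    if "value" in node:
        # Leaf node: emit the prediction stored in the leaf.
        return str(node["value"])
    condition = f'{node["feature"]} <= {node["threshold"]}'
    return (
        f"CASE WHEN {condition} "
        f"THEN {tree_to_case(node['true'])} "
        f"ELSE {tree_to_case(node['false'])} END"
    )


# A hypothetical two-level tree; the key names are made up for this sketch.
tree = {
    "feature": "petal_length",
    "threshold": 2.5,
    "true": {"value": 0.1},
    "false": {
        "feature": "petal_width",
        "threshold": 1.7,
        "true": {"value": 0.6},
        "false": {"value": 0.9},
    },
}
sql = tree_to_case(tree)
```

Each branch node becomes one CASE expression and each level of the tree adds one level of nesting, so a single tree yields a single expression.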
Source code in orbitalml/translation/steps/trees/regressor.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/trees/regressor.py
build_regressor
build_regressor(input_expr: VariablesGroup | Expr) -> Expr
Build the regression expression
Source code in orbitalml/translation/steps/trees/regressor.py
tree
Parse tree definitions and return a graph of nodes.
build_tree
Build a tree based on nested dictionaries of nodes.
The tree is built based on the node and attributes of the translator.
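As a rough illustration of what building nested dictionaries from node attributes can look like, the sketch below nests the flat, parallel arrays used by the ONNX tree ensemble operators. The attribute names (`nodes_treeids`, `nodes_nodeids`, and so on) come from the ONNX operator definition; the `nest_nodes` helper, the output layout, and the assumption that each root has node id 0 are hypothetical and may differ from what build_tree actually does.

```python
def nest_nodes(attrs: dict) -> dict:
    """Group flat node arrays by tree id and nest children under their parents."""
    flat = {}
    for i, key in enumerate(zip(attrs["nodes_treeids"], attrs["nodes_nodeids"])):
        flat[key] = {
            "mode": attrs["nodes_modes"][i],
            "feature_id": attrs["nodes_featureids"][i],
            "threshold": attrs["nodes_values"][i],
            "true_id": attrs["nodes_truenodeids"][i],
            "false_id": attrs["nodes_falsenodeids"][i],
        }

    def nest(tree_id: int, node_id: int) -> dict:
        node = flat[(tree_id, node_id)]
        if node["mode"] == "LEAF":
            # Leaf weights live in separate attributes and are omitted here.
            return {"mode": "LEAF", "node_id": node_id}
        return {
            "mode": node["mode"],
            "feature_id": node["feature_id"],
            "threshold": node["threshold"],
            "true": nest(tree_id, node["true_id"]),
            "false": nest(tree_id, node["false_id"]),
        }

    # Assumption: every tree's root has node id 0.
    return {tree_id: nest(tree_id, 0) for tree_id in sorted({t for t, _ in flat})}


# One tree with a root branch and two leaves.
attrs = {
    "nodes_treeids": [0, 0, 0],
    "nodes_nodeids": [0, 1, 2],
    "nodes_modes": ["BRANCH_LEQ", "LEAF", "LEAF"],
    "nodes_featureids": [2, 0, 0],
    "nodes_values": [2.5, 0.0, 0.0],
    "nodes_truenodeids": [1, 0, 0],
    "nodes_falsenodeids": [2, 0, 0],
}
trees = nest_nodes(attrs)
```

The result is keyed by tree id, which matches the "dictionary of trees" that the translators consume.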
Source code in orbitalml/translation/steps/trees/tree.py
mode_to_condition
mode_to_condition(node: dict, feature_expr: Expr) -> Expr
Build a comparison expression for a branch node.
The comparison is based on the mode of the node and the threshold for that node. The feature will be compared to the threshold using the operator defined by the mode.
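One way to picture the mode-to-operator mapping is a lookup table over Python's `operator` module. The mode names below are the branch modes defined by the ONNX tree ensemble operators; the `condition_for` helper and the `"mode"` / `"threshold"` keys are assumptions for this sketch and do not mirror the signature of mode_to_condition.

```python
import operator

# Branch modes defined by the ONNX tree ensemble operators.
MODE_OPERATORS = {
    "BRANCH_LEQ": operator.le,
    "BRANCH_LT": operator.lt,
    "BRANCH_GTE": operator.ge,
    "BRANCH_GT": operator.gt,
    "BRANCH_EQ": operator.eq,
    "BRANCH_NEQ": operator.ne,
}


def condition_for(node: dict, feature_value):
    """Compare the feature to the node threshold using the node's mode."""
    try:
        compare = MODE_OPERATORS[node["mode"]]
    except KeyError:
        raise ValueError(f"Unsupported node mode: {node['mode']}") from None
    return compare(feature_value, node["threshold"])


goes_left = condition_for({"mode": "BRANCH_LEQ", "threshold": 2.5}, 2.0)
```

Here the comparison runs on plain numbers and returns a bool. Since column expressions typically overload the comparison operators, the same lookup should produce a boolean expression when given a column instead of a number.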
Source code in orbitalml/translation/steps/trees/tree.py
orbitalml.translation.steps.where
Implementation of the Where operator.
WhereTranslator
Bases: Translator
Processes a Where node and updates the variables with the output expression.
The where operation is expected to return either its first or second input depending on a condition variable. When the variable is true, the first input is returned, otherwise the second input is returned.
The condition variable will usually be a column computed through an expression that represents a boolean predicate.
The first and second inputs can be either a single column or a group of columns. If either of the two is a group of columns, a new group of columns is produced as the result. If both are single columns, the result is a single column.
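The single-column versus group behaviour can be sketched in plain Python, with scalars standing in for columns and dictionaries standing in for groups of columns. The `where` function below is a hypothetical illustration of these rules, not the translator's implementation, and it assumes that two groups share the same column names.

```python
def where(condition: bool, first, second):
    """Pick `first` when `condition` holds, else `second`, expanding over groups."""
    first_is_group = isinstance(first, dict)
    second_is_group = isinstance(second, dict)
    if not first_is_group and not second_is_group:
        # Both inputs are single columns: the result is a single column.
        return first if condition else second

    # At least one input is a group: produce a new group with the same names.
    names = list(first) if first_is_group else list(second)
    result = {}
    for name in names:
        a = first[name] if first_is_group else first
        b = second[name] if second_is_group else second
        result[name] = a if condition else b
    return result


single = where(True, 1, 2)
group = where(False, {"a": 1, "b": 2}, 0)
```

In the second call the single value `0` is paired with every member of the group, which is why the result is itself a group.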
Source code in orbitalml/translation/steps/where.py
process
Performs the translation and sets the output variable.
Source code in orbitalml/translation/steps/where.py
orbitalml.translation.steps.zipmap
Implementation of the ZipMap operator.
ZipMapTranslator
Bases: Translator
Processes a ZipMap node and updates the variables with the output expression.
The ZipMap operator is used to map values from one variable to another set of values. It is usually meant to map numeric values to categories.
If the input is a group of columns, all columns in the group will be remapped according to the class labels.
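The remapping of a group can be pictured as pairing each column with a class label by position. The sketch below assumes that positional pairing and uses a hypothetical `zipmap` helper over a plain dictionary; the column names and labels are invented for the example.

```python
def zipmap(columns: dict, class_labels: list) -> dict:
    """Rename the columns of a group after the class labels, by position."""
    if len(columns) != len(class_labels):
        raise ValueError("Expected one class label per column in the group")
    return {
        str(label): value
        for label, value in zip(class_labels, columns.values())
    }


# Hypothetical per-class probability columns, keyed by position.
probabilities = {"prob_0": 0.7, "prob_1": 0.2, "prob_2": 0.1}
labelled = zipmap(probabilities, ["setosa", "versicolor", "virginica"])
```

After the remapping, each value is addressed by its category name instead of its numeric position.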
Source code in orbitalml/translation/steps/zipmap.py
process
Performs the translation and sets the output variable.