AggregateSelector
Source code in src/logos/aggregate_selector.py
_entropy(col)
Calculates the entropy of a column.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col | Series | The column for which to calculate the entropy. | required |
Returns:
Type | Description |
---|---|
float | The entropy of the column. |
Source code in src/logos/aggregate_selector.py
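
The following is a minimal sketch of an empirical entropy calculation over a column, not the code in src/logos/aggregate_selector.py. The function name `empirical_entropy`, the natural-log base, and the choice to drop NA values before counting are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def empirical_entropy(col: pd.Series) -> float:
    """Shannon entropy (in nats) of the empirical distribution of `col`.

    Hypothetical helper: NA values are dropped, and an all-NA or empty
    column is assigned an entropy of 0.0 by convention.
    """
    values = col.dropna()
    if values.empty:
        return 0.0
    # Relative frequency of each distinct value.
    probs = values.value_counts(normalize=True)
    return float(-(probs * np.log(probs)).sum())


constant = pd.Series([5, 5, 5, 5])
balanced = pd.Series(["a", "a", "b", "b"])
print(empirical_entropy(constant))  # a constant column carries no information
print(empirical_entropy(balanced))  # two equally likely values give log(2)
```

A column that takes the same value in every row has zero entropy, which is why an aggregate that collapses to a constant is a natural candidate for being uninformative.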
find_uninformative_aggregates(prepared_log, parsed_variables, causal_unit_var)
Find aggregates that are uninformative for each column in prepared_log.
Aggregates are uninformative unless they maximize the empirical entropy across causal units.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prepared_log | DataFrame | The prepared log. | required |
parsed_variables | DataFrame | The parsed variables. | required |
causal_unit_var | str | The name of the causal unit variable. | required |
Returns:
Type | Description |
---|---|
list[str] | A list of uninformative aggregates for the prepared log. |
Source code in src/logos/aggregate_selector.py
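
The selection rule described above (keep the aggregate with maximal empirical entropy, treat the rest as uninformative) might be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the `candidates` mapping is a hypothetical stand-in for whatever parsed_variables encodes, the frame is assumed to hold one row per causal unit, and the column names are invented.

```python
import numpy as np
import pandas as pd


def entropy_sketch(col: pd.Series) -> float:
    """Hypothetical helper: empirical entropy in nats, ignoring NA values."""
    values = col.dropna()
    if values.empty:
        return 0.0
    probs = values.value_counts(normalize=True)
    return float(-(probs * np.log(probs)).sum())


def uninformative_aggregates_sketch(
    prepared: pd.DataFrame, candidates: dict[str, list[str]]
) -> list[str]:
    """Return every aggregate column that does not maximize entropy.

    `candidates` maps a base variable to its aggregate columns (an
    assumed structure). Ties are broken in favor of the first column.
    """
    dropped: list[str] = []
    for cols in candidates.values():
        entropies = {c: entropy_sketch(prepared[c]) for c in cols}
        best = max(entropies, key=entropies.get)
        dropped.extend(c for c in cols if c != best)
    return dropped


# One row per (hypothetical) causal unit, three aggregates of one variable.
prepared = pd.DataFrame(
    {
        "latency+mean": [1.0, 2.0, 3.0, 4.0],  # four distinct values
        "latency+min": [1.0, 1.0, 2.0, 2.0],   # two distinct values
        "latency+max": [5.0, 5.0, 5.0, 5.0],   # constant across units
    }
)
candidates = {"latency": ["latency+mean", "latency+min", "latency+max"]}
dropped = uninformative_aggregates_sketch(prepared, candidates)
print(dropped)
```

In this toy input the mean varies most across units, so the min and max columns are the ones reported as uninformative.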
mean(x)
Calculates the mean of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the mean will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The mean of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
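
The "ignore NA values, return None when everything is NA" contract shared by these aggregation functions could look roughly like the sketch below. The name `na_aware_mean` is hypothetical, and the sketch returns a plain float rather than the documented Optional[Series], so it should be read as an illustration of the contract rather than as the function in src/logos/aggimp/agg_funcs.py.

```python
from typing import Optional

import pandas as pd


def na_aware_mean(x: pd.Series) -> Optional[float]:
    """Mean of the non-NA values, or None if there are none (a sketch)."""
    values = x.dropna()
    if values.empty:
        return None
    return float(values.mean())


print(na_aware_mean(pd.Series([1.0, None, 3.0])))  # mean of 1.0 and 3.0
print(na_aware_mean(pd.Series([None, None], dtype="float64")))  # None
```

Returning None instead of NaN makes the "no data" case explicit to callers that would otherwise have to test for NaN.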
min(x)
Calculates the minimum of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the minimum will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The minimum of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
max(x)
Calculates the maximum of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the maximum will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The maximum of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
median(x)
Calculates the median of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the median will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The median of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
mode(x)
Calculates the mode of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the mode will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The mode of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
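
A mode can be ambiguous when several values are equally frequent, and the documentation above does not say how ties are resolved. The sketch below shows one possible convention built on pandas' `Series.mode`, which returns all modes in sorted order; picking the first of them is an assumption, as is the name `na_aware_mode`.

```python
import pandas as pd


def na_aware_mode(x: pd.Series):
    """Most frequent non-NA value, or None if the series is all NA.

    Sketch only: on ties, the smallest mode is returned because
    Series.mode() yields the modes sorted.
    """
    modes = x.mode(dropna=True)
    if modes.empty:
        return None
    return modes.iloc[0]


print(na_aware_mode(pd.Series(["b", "a", "b", None])))  # "b" occurs twice
print(na_aware_mode(pd.Series([2, 1, 1, 2])))           # tie between 1 and 2
```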
std(x)
Calculates the standard deviation of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the standard deviation will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The standard deviation of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
last(x)
Returns the last non-NA value in a series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the last non-NA value will be returned. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The last non-NA value of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
first(x)
Returns the first non-NA value in a series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the first non-NA value will be returned. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The first non-NA value of the series, or None if the series is all NA. |
Source code in src/logos/aggimp/agg_funcs.py
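
Positional aggregates such as first and last depend on row order, so NA values at either end of the series have to be skipped before indexing. A possible sketch, with the hypothetical names `first_non_na` and `last_non_na`, and assuming the series is already in the intended order:

```python
import pandas as pd


def first_non_na(x: pd.Series):
    """First non-NA value in row order, or None if there is none (a sketch)."""
    values = x.dropna()
    return None if values.empty else values.iloc[0]


def last_non_na(x: pd.Series):
    """Last non-NA value in row order, or None if there is none (a sketch)."""
    values = x.dropna()
    return None if values.empty else values.iloc[-1]


s = pd.Series([None, 3.0, 5.0, None])
print(first_non_na(s))  # skips the leading NA
print(last_non_na(s))   # skips the trailing NA
```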
sum(x)
Calculates the sum of a series, ignoring NA values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Series | The series for which the sum will be calculated. | required |
Returns:
Type | Description |
---|---|
Optional[Series] | The sum of the series, or None if the series is all NA. |
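
The all-NA case matters most for sums: by default, pandas' own `Series.sum` returns 0 for an all-NA series, which is indistinguishable from a genuine total of zero. The sketch below contrasts that default with a None-returning variant and applies it per group. The names `na_aware_sum`, `unit`, and `bytes_sent` are hypothetical, and the code is an illustration rather than the function in src/logos/aggimp/agg_funcs.py.

```python
from typing import Optional

import numpy as np
import pandas as pd


def na_aware_sum(x: pd.Series) -> Optional[float]:
    """Sum of the non-NA values, or None if there are none (a sketch)."""
    values = x.dropna()
    if values.empty:
        return None
    return float(values.sum())


# Hypothetical log: unit "b" never reported a value.
log = pd.DataFrame(
    {
        "unit": ["a", "a", "b", "b"],
        "bytes_sent": [1.0, np.nan, np.nan, np.nan],
    }
)

# Plain pandas reports 0.0 for the all-NA group.
pandas_default = log.loc[log["unit"] == "b", "bytes_sent"].sum()

# The NA-aware variant keeps "no data" distinct from "zero".
per_unit = {
    unit: na_aware_sum(group["bytes_sent"]) for unit, group in log.groupby("unit")
}
print(pandas_default)
print(per_unit)
```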