
Miscellaneous

These general-purpose helper functions are used in several places and do not belong to a specific module.

get_flag_names

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `cols_values` | list | Each element is a | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `flag_names` | list | Each element is the flag column name corresponding to an element of `cols_values`. |

Source code in pycelldyn/miscellaneous.py
def get_flag_names(cols_values):
    """ `get_flag_names`

    Parameters
    ----------
    cols_values : list
        Each element is a

    Returns
    -------
    flag_names : list
        Each element is the flag column name corresponding to an element
        of `cols_values`.
    """

    flag_names = []
    for col in cols_values:

        # First, handle the special cases (i.e., exceptions): the `_nl` and
        # `_usa` columns of a parameter map to a single flag column.
        if col in ('hb_nl', 'hb_usa'):
            flag_name = 'hb_flag'
        elif col in ('mch_nl', 'mch_usa'):
            flag_name = 'mch_flag'
        elif col in ('mchc_nl', 'mchc_usa'):
            flag_name = 'mchc_flag'
        elif col in ('mchr_nl', 'mchr_usa'):
            flag_name = 'mchr_flag'
        elif col == '':
            flag_name = ''

        # Otherwise, we can just append `_flag` at the end.
        else:
            flag_name = col + '_flag'

        flag_names.append(flag_name)

    return flag_names
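
The special cases in the listing above can also be written as a lookup table. The following is a minimal sketch, not the library's implementation: `SPECIAL_FLAGS` and `get_flag_names_sketch` are hypothetical names, `wbc` is an illustrative column name that does not come from the source, and the empty-string placeholder branch is left out.

```python
# Hypothetical lookup table built from the special cases in the listing above.
SPECIAL_FLAGS = {
    'hb_nl': 'hb_flag', 'hb_usa': 'hb_flag',
    'mch_nl': 'mch_flag', 'mch_usa': 'mch_flag',
    'mchc_nl': 'mchc_flag', 'mchc_usa': 'mchc_flag',
    'mchr_nl': 'mchr_flag', 'mchr_usa': 'mchr_flag',
}


def get_flag_names_sketch(cols_values):
    """Table-driven variant of `get_flag_names` (illustrative only)."""
    # Fall back to appending `_flag` when the column is not a special case.
    return [SPECIAL_FLAGS.get(col, col + '_flag') for col in cols_values]


# 'wbc' is a made-up column name used only for this example.
print(get_flag_names_sketch(['hb_nl', 'mchr_usa', 'wbc']))
# ['hb_flag', 'mchr_flag', 'wbc_flag']
```

With a table like this, adding another shared flag would mean adding entries to the dictionary rather than another `elif` branch.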

get_cols_wbc_scatter

Get a list of column names that correspond to scatter measurements of white blood cells (WBCs).

Warning

Not to be confused with columns corresponding to WBC counts or sizes.

Tip

To get the coefficient of variation (CV) columns, append _cv to each element of the returned list. For example:

cols_cv = [c + '_cv' for c in get_cols_wbc_scatter()]

Parameters:

This function takes no parameters.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `cols_wbc_scatter` | list | List of column names that correspond to scatter measurements of WBCs. |

Source code in pycelldyn/miscellaneous.py
def get_cols_wbc_scatter():
    """ `get_cols_wbc_scatter`

    Get a list of column names that correspond to scatter measurements of
    white blood cells (WBCs).

    !!! warning
        Not to be confused with columns corresponding to WBC counts or sizes.

    !!! tip
        To get the coefficient of variation (CV) columns, append `_cv`
        to each element of the returned list. For example:

        `cols_cv = [c + '_cv' for c in get_cols_wbc_scatter()]`

    Parameters
    ----------
    None

    Returns
    -------
    cols_wbc_scatter : list
        List of column names that correspond to scatter measurements of WBCs.
    """
    # TODO: Should we keep this function at all or should this be relegated
    # to the QC functions exclusively?

    cols_wbc_scatter = ['neutrophil_size_mean',
                        'neutrophil_intracellular_complexity',
                        'neutrophil_lobularity_polarized',
                        'neutrophil_lobularity_depolarized',
                        'neutrophil_dna_staining',
                        'lymphocyte_size_mean',
                        'lymphocyte_intracellular_complexity'
                        ]

    return cols_wbc_scatter
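
As a usage sketch for the tip in the docstring, the snippet below derives the CV column names and checks which of them are present in a dataset. To stay self-contained it copies the returned list into a local variable instead of importing pycelldyn, and `available_cols` is a made-up stand-in for what `list(df.columns)` might contain.

```python
# Local copy of the list returned by get_cols_wbc_scatter() in the listing above.
cols_wbc_scatter = [
    'neutrophil_size_mean',
    'neutrophil_intracellular_complexity',
    'neutrophil_lobularity_polarized',
    'neutrophil_lobularity_depolarized',
    'neutrophil_dna_staining',
    'lymphocyte_size_mean',
    'lymphocyte_intracellular_complexity',
]

# Derive the CV column names by appending the `_cv` suffix.
cols_cv = [c + '_cv' for c in cols_wbc_scatter]

# Hypothetical columns of a DataFrame (assumption, for illustration only).
available_cols = [
    'neutrophil_size_mean',
    'neutrophil_size_mean_cv',
    'lymphocyte_size_mean',
]

available = set(available_cols)
present_cv = [c for c in cols_cv if c in available]
missing_cv = [c for c in cols_cv if c not in available]

print(present_cv)       # ['neutrophil_size_mean_cv']
print(len(missing_cv))  # 6
```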

get_cols_with_values

Get a list of column names that correspond to actual parameter values (i.e., neither flags nor alerts).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `df` | pandas DataFrame | Original DataFrame | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `cols_with_values` | list | Names of the value columns, derived from their corresponding `_flag` counterparts. |

Source code in pycelldyn/miscellaneous.py
def get_cols_with_values(df):
    """ `get_cols_with_values`

    Get a list of column names that correspond to actual parameter values
    (i.e., neither flags nor alerts).

    Parameters
    ----------
    df : pandas DataFrame
        Original DataFrame

    Returns
    -------
    cols_with_values : list
        Columns with values based on their corresponding _flag counterparts.

    """
    # Get columns from the original DataFrame that contain `_flag`.
    cols_flags = get_elements_with_substring(df.columns, ['_flag'])

    # Obtain the original parameter name by removing `_flag`.
    cols_with_values = [col_flag.replace('_flag', '') for col_flag in cols_flags]

    # TODO
    # Correct for flags that correspond to two variables
    # (e.g., hb_flag applies to both hb_usa and hb_nl).

    return cols_with_values
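
The TODO in the listing above notes that some flags apply to two value columns. One possible way to handle that is sketched below. This is an assumption about how the gap could be closed, not the library's behavior: `SHARED_FLAGS` and `value_cols_from_flags` are hypothetical names, the function works on a plain list of column names instead of a DataFrame, it matches `_flag` only as a suffix, and `plt` is an illustrative column name.

```python
# Hypothetical mapping from shared flags to their value columns, based on the
# special cases handled in get_flag_names.
SHARED_FLAGS = {
    'hb_flag': ['hb_nl', 'hb_usa'],
    'mch_flag': ['mch_nl', 'mch_usa'],
    'mchc_flag': ['mchc_nl', 'mchc_usa'],
    'mchr_flag': ['mchr_nl', 'mchr_usa'],
}

SUFFIX = '_flag'


def value_cols_from_flags(columns):
    """Return the value columns for every column ending in `_flag`."""
    value_cols = []
    for col in columns:
        if not col.endswith(SUFFIX):
            continue
        if col in SHARED_FLAGS:
            # Keep only the variants that are actually present.
            value_cols.extend(c for c in SHARED_FLAGS[col] if c in columns)
        else:
            # Strip the suffix to recover the parameter name.
            value_cols.append(col[:-len(SUFFIX)])
    return value_cols


# Made-up column names, e.g. what list(df.columns) might contain.
columns = ['hb_nl', 'hb_usa', 'hb_flag', 'plt', 'plt_flag']
print(value_cols_from_flags(columns))
# ['hb_nl', 'hb_usa', 'plt']
```

Without the mapping, `hb_flag` would yield `hb`, which is not a column in this example.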

get_elements_with_substring

Get elements of a list that have a specific substring.

Tip

This is a handy function to get flag columns (e.g., columns whose names end with _flag or start with flag_).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `base_list` | list | Base list (for example, `list(df.columns)`) | required |
| `substr_list` | list | Each element of the list is a substring to be found. Tip: if interested in only one substring, pass a list with one element. | required |

Returns:

| Type | Description |
| --- | --- |
| list | List of the elements that contain at least one of the given substrings. If there are none, an empty list is returned. |

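References

* [Filtering a list of strings based on a substring](https://www.geeksforgeeks.org/python-filter-list-of-strings-based-on-the-substring-list/)
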
Source code in pycelldyn/miscellaneous.py
def get_elements_with_substring(base_list, substr_list):
    """ `get_elements_with_substring`

    Get elements of a list that have a specific substring.

    !!! tip
        This is a handy function to get flag columns (e.g., columns whose
        names end with `_flag` or start with `flag_`).

    Parameters
    ----------
    base_list : list
        Base list (for example, `list(df.columns)`)

    substr_list : list
        Each element of the list is a substring to be found.

        !!! tip
            If interested in only one substring, pass a list with one element.

    Returns
    -------
    list
        List with the elements that have the given substring.
        If none, returns an empty list.

    References
    ----------
    * [Filtering a list of strings based on a substring](https://www.geeksforgeeks.org/python-filter-list-of-strings-based-on-the-substring-list/)
    """

    # Use `elem` rather than `str` to avoid shadowing the built-in type.
    return [elem for elem in base_list if
            any(sub in elem for sub in substr_list)]
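
A short usage sketch follows. It defines a local copy of the helper so that it runs without pycelldyn installed, and all column names are made up for illustration. It also shows that the match is on substrings anywhere in the name, so a suffix-only filter needs a different check.

```python
def get_elements_with_substring(base_list, substr_list):
    """Local copy of the helper from the listing above."""
    return [elem for elem in base_list
            if any(sub in elem for sub in substr_list)]


# Hypothetical column names (assumption, for illustration only).
columns = ['wbc', 'wbc_flag', 'flag_sample', 'wbc_flag_raw']

# A single substring is still passed as a one-element list.
print(get_elements_with_substring(columns, ['_flag']))
# ['wbc_flag', 'wbc_flag_raw']

# Several substrings: an element is kept if it contains any of them.
print(get_elements_with_substring(columns, ['_flag', 'flag_']))
# ['wbc_flag', 'flag_sample', 'wbc_flag_raw']

# The match is not anchored; for a strict suffix test use str.endswith.
print([c for c in columns if c.endswith('_flag')])
# ['wbc_flag']
```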