
medcat.storage.serialisables

AbstractManualSerialisable

Methods:

get_init_attrs classmethod

get_init_attrs() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def get_init_attrs(cls) -> list[str]:
    return []

get_strategy

get_strategy() -> SerialisingStrategy
Source code in medcat-v2/medcat/storage/serialisables.py
def get_strategy(self) -> SerialisingStrategy:
    return SerialisingStrategy.MANUAL

ignore_attrs classmethod

ignore_attrs() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def ignore_attrs(cls) -> list[str]:
    return []

include_properties classmethod

include_properties() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def include_properties(cls) -> list[str]:
    return []

AbstractSerialisable

The abstract serialisable base class.

This defines some common defaults.

Methods:

get_init_attrs classmethod

get_init_attrs() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def get_init_attrs(cls) -> list[str]:
    return []

get_strategy

get_strategy() -> SerialisingStrategy
Source code in medcat-v2/medcat/storage/serialisables.py
def get_strategy(self) -> SerialisingStrategy:
    return SerialisingStrategy.SERIALISABLES_AND_DICT

ignore_attrs classmethod

ignore_attrs() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def ignore_attrs(cls) -> list[str]:
    return []

include_properties classmethod

include_properties() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def include_properties(cls) -> list[str]:
    return []
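The three class-method hooks above all default to an empty list, so a subclass only overrides the ones it needs. The following is a minimal sketch of that pattern. `AbstractSerialisableSketch` is a stand-in that mirrors the defaults listed above so the example runs without MedCAT installed, and `TokenCounter` with its attributes is a hypothetical component invented for illustration; in real use one would presumably subclass `AbstractSerialisable` from `medcat.storage.serialisables` instead.

```python
class AbstractSerialisableSketch:
    """Stand-in mirroring the defaults listed above (not the real class)."""

    @classmethod
    def get_init_attrs(cls) -> list[str]:
        return []

    @classmethod
    def ignore_attrs(cls) -> list[str]:
        return []

    @classmethod
    def include_properties(cls) -> list[str]:
        return []


class TokenCounter(AbstractSerialisableSketch):
    """Hypothetical component, used only for illustration."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.counts: dict[str, int] = {}
        self._cache: dict[str, int] = {}  # transient state, not worth saving

    @property
    def total(self) -> int:
        return sum(self.counts.values())

    @classmethod
    def get_init_attrs(cls) -> list[str]:
        # Names of the arguments __init__ needs upon deserialisation.
        return ["name"]

    @classmethod
    def ignore_attrs(cls) -> list[str]:
        # Attributes that should not be serialised.
        return ["_cache"]

    @classmethod
    def include_properties(cls) -> list[str]:
        # Properties to include in addition to plain attributes.
        return ["total"]


counter = TokenCounter("demo")
counter.counts["fever"] = 2
```

Because the hooks are class methods, they can be queried without an instance, for example `TokenCounter.ignore_attrs()`.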

ManualSerialisable

Bases: Serialisable, Protocol

Methods:

deserialise_from classmethod

deserialise_from(folder_path: str, **init_kwargs) -> ManualSerialisable

Deserialise from a specific path.

The init keyword arguments are generally:

  • cnf: The config relevant to the components
  • tokenizer (BaseTokenizer): The base tokenizer for the model
  • cdb (CDB): The CDB for the model
  • vocab (Vocab): The Vocab for the model
  • model_load_path (Optional[str]): The model load path, but not the component load path

Parameters:

  • folder_path

    (str) –

    The path to deserialise from.

Returns:

  • ManualSerialisable

    The deserialised object.

Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def deserialise_from(cls, folder_path: str, **init_kwargs
                     ) -> 'ManualSerialisable':
    """Deserialise from a specific path.

    The init keyword arguments are generally:
    - cnf: The config relevant to the components
    - tokenizer (BaseTokenizer): The base tokenizer for the model
    - cdb (CDB): The CDB for the model
    - vocab (Vocab): The Vocab for the model
    - model_load_path (Optional[str]): The model load path,
        but not the component load path

    Args:
        folder_path (str): The path to deserialise from.

    Returns:
        ManualSerialisable: The deserialised object.
    """
    pass

serialise_to

serialise_to(folder_path: str) -> None

Serialise to a folder.

Parameters:

  • folder_path

    (str) –

    The folder to serialise to.

Source code in medcat-v2/medcat/storage/serialisables.py
def serialise_to(self, folder_path: str) -> None:
    """Serialise to a folder.

    Args:
        folder_path (str): The folder to serialise to.
    """
    pass
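The two methods above only fix the shape of manual serialisation: write everything into a folder, then rebuild the object from that folder. A minimal sketch of a class with those method shapes follows. `KeywordList`, the `keywords.json` file name, and the JSON layout are assumptions made for illustration and are not taken from MedCAT.

```python
import json
import os
import tempfile


class KeywordList:
    """Hypothetical class following the ManualSerialisable method shapes."""

    _FILE_NAME = "keywords.json"  # assumed file name, not from MedCAT

    def __init__(self, keywords: list[str]) -> None:
        self.keywords = keywords

    def serialise_to(self, folder_path: str) -> None:
        # Write the full state into the target folder.
        os.makedirs(folder_path, exist_ok=True)
        with open(os.path.join(folder_path, self._FILE_NAME), "w") as f:
            json.dump({"keywords": self.keywords}, f)

    @classmethod
    def deserialise_from(cls, folder_path: str,
                         **init_kwargs) -> "KeywordList":
        # init_kwargs (cnf, tokenizer, cdb, ...) are accepted but unused here.
        with open(os.path.join(folder_path, cls._FILE_NAME)) as f:
            data = json.load(f)
        return cls(data["keywords"])


# Round trip through a temporary folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    original = KeywordList(["fever", "cough"])
    original.serialise_to(tmp_dir)
    restored = KeywordList.deserialise_from(tmp_dir, model_load_path=None)
```

Accepting `**init_kwargs` even when they go unused keeps the signature compatible with callers that pass the generally available keyword arguments described above.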

Serialisable

Bases: Protocol

The base serialisable protocol.

Methods:

get_init_attrs classmethod

get_init_attrs() -> list[str]

Get the names of the arguments needed for init upon deserialisation.

Returns:

  • list[str]

    list[str]: The list of init arguments' names.

Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def get_init_attrs(cls) -> list[str]:
    """Get the names of the arguments needed for init upon deserialisation.

    Returns:
        list[str]: The list of init arguments' names.
    """
    pass

get_strategy

get_strategy() -> SerialisingStrategy

Get the serialisation strategy.

Returns:

  • SerialisingStrategy

    The strategy.

Source code in medcat-v2/medcat/storage/serialisables.py
def get_strategy(self) -> SerialisingStrategy:
    """Get the serialisation strategy.

    Returns:
        SerialisingStrategy: The strategy.
    """
    pass

ignore_attrs classmethod

ignore_attrs() -> list[str]

Get the names of attributes not to serialise.

Returns:

  • list[str]

    list[str]: The attribute names that should not be serialised.

Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def ignore_attrs(cls) -> list[str]:
    """Get the names of attributes not to serialise.

    Returns:
        list[str]: The attribute names that should not be serialised.
    """
    pass

include_properties classmethod

include_properties() -> list[str]
Source code in medcat-v2/medcat/storage/serialisables.py
@classmethod
def include_properties(cls) -> list[str]:
    pass

SerialisingStrategy

Bases: Enum

Describes the strategy for serialising.

Methods:

  • get_dict

    Gets the appropriate parts of the __dict__ of the object.

  • get_parts

    Gets the matching serialisable parts of the object.

Attributes:

  • DICT_ONLY

    Only include the object's .__dict__

  • MANUAL

    Use manual serialisation defined by the object itself.

  • SERIALISABLES_AND_DICT

    Serialise attributes that are Serialisable as well as the rest of .__dict__

  • SERIALISABLE_ONLY

    Only serialise attributes that are of Serialisable type

DICT_ONLY class-attribute instance-attribute

DICT_ONLY = auto()

Only include the object's .__dict__

MANUAL class-attribute instance-attribute

MANUAL = auto()

Use manual serialisation defined by the object itself.

In this case, most of the logic defined within here will likely be ignored.

SERIALISABLES_AND_DICT class-attribute instance-attribute

SERIALISABLES_AND_DICT = auto()

Serialise attributes that are Serialisable as well as the rest of .__dict__

SERIALISABLE_ONLY class-attribute instance-attribute

SERIALISABLE_ONLY = auto()

Only serialise attributes that are of Serialisable type

get_dict

get_dict(obj: Serialisable) -> dict[str, Any]

Gets the appropriate parts of the __dict__ of the object.

I.e. this filters out parts that shouldn't be included.

Parameters:

  • obj

    (Serialisable) –

    The serialisable object.

Returns:

  • dict[str, Any]

    dict[str, Any]: The filtered attributes map.

Source code in medcat-v2/medcat/storage/serialisables.py
def get_dict(self, obj: 'Serialisable') -> dict[str, Any]:
    """Gets the appropriate parts of the __dict__ of the object.

    I.e. this filters out parts that shouldn't be included.

    Args:
        obj (Serialisable): The serialisable object.

    Returns:
        dict[str, Any]: The filtered attributes map.
    """
    out_dict = {
        attr_name: attr for attr_name, attr in self._iter_obj_items(obj)
        if self._is_suitable_in_dict(attr_name, attr, obj)
    }
    # do properties
    # NOTE: these are explicitly declared, so suitability is not checked
    out_dict.update({
        property_name: getattr(obj, property_name)
        for property_name in obj.include_properties()
    })
    return out_dict
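The two-step shape of `get_dict` (filter the instance attributes, then add the explicitly declared properties) can be illustrated with a simplified sketch. The private helpers `_iter_obj_items` and `_is_suitable_in_dict` are not shown on this page, so the version below makes an assumption: suitability is reduced to "the name is not listed in `ignore_attrs()`". `Report` and `filtered_dict` are hypothetical names used only for illustration.

```python
from typing import Any


class Report:
    """Hypothetical object; only the hook names mirror the listing above."""

    def __init__(self) -> None:
        self.title = "scan"
        self.pages = 3
        self._scratch = object()  # something we do not want saved

    @property
    def summary(self) -> str:
        return f"{self.title} ({self.pages} pages)"

    @classmethod
    def ignore_attrs(cls) -> list[str]:
        return ["_scratch"]

    @classmethod
    def include_properties(cls) -> list[str]:
        return ["summary"]


def filtered_dict(obj: Any) -> dict[str, Any]:
    # Assumption: the suitability check is reduced to the ignore list.
    out_dict = {
        name: value for name, value in vars(obj).items()
        if name not in obj.ignore_attrs()
    }
    # Declared properties are added without any suitability check,
    # matching the NOTE in the listing above.
    out_dict.update({
        name: getattr(obj, name) for name in obj.include_properties()
    })
    return out_dict


result = filtered_dict(Report())
```

Here `result` holds `title`, `pages`, and the computed `summary`, while `_scratch` is left out.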

get_parts

get_parts(obj: Serialisable) -> list[tuple[Serialisable, str]]

Gets the matching serialisable parts of the object.

This includes only serialisable parts, and only if specified by the strategy.

Returns:

  • list[tuple[Serialisable, str]]

    The serialisable parts with names.

Source code in medcat-v2/medcat/storage/serialisables.py
def get_parts(self, obj: 'Serialisable'
              ) -> list[tuple['Serialisable', str]]:
    """Gets the matching serialisable parts of the object.

    This includes only serialisable parts, and only if specified
    by the strategy.

    Returns:
        list[tuple[Serialisable, str]]: The serialisable parts with names.
    """
    out_list: list[tuple[Serialisable, str]] = [
        (attr, attr_name) for attr_name, attr in self._iter_obj_items(obj)
        if self._is_suitable_part(attr_name, attr, obj)
    ]
    return out_list

get_all_serialisable_members

get_all_serialisable_members(object: Serialisable) -> tuple[list[tuple[Serialisable, str]], dict[str, Any]]

Gets all serialisable members of an object.

This looks for public and protected members, but not private ones. It should also be able to return parts of lists and tuples. It also provides the name of each serialisable object.

Parameters:

  • object

    (Any) –

    The target object.

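Returns:

  • tuple[list[tuple[Serialisable, str]], dict[str, Any]]

    The list of serialisable objects along with their names, followed by the filtered attribute dict.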

Source code in medcat-v2/medcat/storage/serialisables.py
def get_all_serialisable_members(object: Serialisable
                                 ) -> tuple[list[tuple[Serialisable, str]],
                                            dict[str, Any]]:
    """Gets all serialisable members of an object.

    This looks for public and protected members, but not private ones.
    It should also be able to return parts of lists and tuples.
    It also provides the name of each serialisable object.

    Args:
        object (Any): The target object.

    Returns:
        tuple[list[tuple[Serialisable, str]], dict[str, Any]]:
            list of serialisable objects along with their names
    """
    strat = object.get_strategy()
    return strat.get_parts(object), strat.get_dict(object)

name_all_serialisable_elements

name_all_serialisable_elements(target_list: Union[list, tuple], name_start: str = '', all_or_nothing: bool = True) -> list[tuple[Serialisable, str]]

Gets all serialisable elements from a list or tuple.

There are two strategies for finding the parts:

  1) If all_or_nothing == True, either all the elements in the list must be Serialisable or none of them.
  2) If all_or_nothing == False, some elements may be serialisable while others may not be.

Parameters:

  • target_list

    (Union[list, tuple]) –

    The list/tuple of objects to look in.

  • name_start

    (str, default: '' ) –

    The start of the name. Defaults to ''.

  • all_or_nothing

    (bool, default: True ) –

    Whether to disallow lists/tuple where only some elements are serialisable. Defaults to True.

Raises:

  • ValueError

    If all_or_nothing is specified and not all elements are serialisable.

Returns:

  • list[tuple[Serialisable, str]]

    The serialisable parts along with name.

Source code in medcat-v2/medcat/storage/serialisables.py
def name_all_serialisable_elements(target_list: Union[list, tuple],
                                   name_start: str = '',
                                   all_or_nothing: bool = True
                                   ) -> list[tuple[Serialisable, str]]:
    """Gets all serialisable elements from a list or tuple.

    There's two strategies for finding the parts:
    1) If `all_or_nothing == True` either all the elements
        in the list must be Serialisable or None of them.
    2) If `all_or_nothing == False` some elements may be
        serialisable while others may not be.

    Args:
        target_list (Union[list, tuple]): The list/tuple of objects to look in.
        name_start (str, optional): The start of the name. Defaults to ''.
        all_or_nothing (bool, optional):
            Whether to disallow lists/tuple where only some elements are
            serialisable. Defaults to True.

    Raises:
        ValueError: If `all_or_nothing` is specified and not all elements
            are serialisable.

    Returns:
        list[tuple[Serialisable, str]]: The serialisable parts along with name.
    """
    out_parts: list[tuple[Serialisable, str]] = []
    if not target_list:
        return out_parts
    for el_nr, el in enumerate(target_list):
        if isinstance(el, Serialisable):
            out_parts.append((el, name_start + f"_el_{el_nr}"))
        elif all_or_nothing and out_parts:
            raise ValueError(f"The first {len(out_parts)} were serialisable "
                             "whereas the next one was not. Specify "
                             "`all_or_nothing=False` to allow for only "
                             "some of the list elements to be serialisable.")
    return out_parts
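The naming scheme and the `all_or_nothing` check can be tried out with a small stand-alone sketch. It reuses the control flow of the listing above but swaps in stand-ins so that it runs without MedCAT: `SerialisableLike`, `Part`, and `name_elements` are hypothetical names, and the protocol is cut down to a single method. One detail worth noting from the listing as written: the error is only raised once at least one serialisable element has been collected, so a plain element at the front of the list does not trigger it.

```python
from typing import Protocol, Union, runtime_checkable


@runtime_checkable
class SerialisableLike(Protocol):
    """Stand-in for Serialisable, reduced to one method."""

    def get_strategy(self): ...


class Part:
    """Hypothetical serialisable element."""

    def get_strategy(self):
        return None


def name_elements(target_list: Union[list, tuple], name_start: str = "",
                  all_or_nothing: bool = True) -> list[tuple[object, str]]:
    # Same control flow as the listing above, using the stand-in protocol.
    out_parts: list[tuple[object, str]] = []
    for el_nr, el in enumerate(target_list):
        if isinstance(el, SerialisableLike):
            out_parts.append((el, name_start + f"_el_{el_nr}"))
        elif all_or_nothing and out_parts:
            raise ValueError("mixed serialisable and plain elements")
    return out_parts


a, b = Part(), Part()

# All elements serialisable: names are built from name_start and the index.
names = [name for _, name in name_elements([a, b], name_start="parts")]

# A plain element after a serialisable one raises under all_or_nothing.
try:
    name_elements([a, "plain"])
    raised = False
except ValueError:
    raised = True

# A plain element in front passes, since nothing was collected yet.
leading_plain = [name for _, name in name_elements(["plain", b])]

# With all_or_nothing=False, plain elements are skipped, indices are kept.
lenient = [name for _, name in
           name_elements([a, "plain", b], all_or_nothing=False)]
```

The element index in each name refers to the position in the original list, so skipped elements leave gaps in the numbering.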