medcat.components.types

Classes:

Functions:

Attributes:

CompClass module-attribute

HashableComponet module-attribute

HashableComponet = HashableComponent

AbstractCoreComponent

Bases: ABC, CoreComponent

Methods:

Attributes:

NAME_PREFIX class-attribute instance-attribute

NAME_PREFIX = 'core_'

full_name property

full_name: str

is_core

is_core() -> bool
Source code in medcat-v2/medcat/components/types.py
def is_core(self) -> bool:
    return True

AbstractEntityProvidingComponent

AbstractEntityProvidingComponent(read_from_linked_ents: bool | Literal['auto'] = 'auto', write_to_linked_ents: bool | Literal['auto'] = 'auto')

Bases: AbstractCoreComponent

This is an abstract NER or linker component.

The class simplifies some things so that they don't have to be re-implemented in each implementation.

Methods:

Source code in medcat-v2/medcat/components/types.py
def __init__(self,
             read_from_linked_ents: bool | Literal['auto'] = 'auto',
             write_to_linked_ents: bool | Literal['auto'] = 'auto'):
    is_linker = self.get_type() == CoreComponentType.linking
    if read_from_linked_ents == 'auto':
        self._read_from_linked_ents = is_linker
    else:
        self._read_from_linked_ents = read_from_linked_ents
    if write_to_linked_ents == 'auto':
        self._write_to_linked_ents = is_linker
    else:
        self._write_to_linked_ents = write_to_linked_ents
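
The 'auto' handling in the constructor above can be tried out without MedCAT installed. The following is a minimal standalone sketch; `resolve_flag` is a hypothetical helper written for illustration and is not part of medcat.

```python
from typing import Literal, Union


def resolve_flag(value: Union[bool, Literal['auto']], is_linker: bool) -> bool:
    """Mirror the constructor logic: 'auto' falls back to is_linker."""
    if value == 'auto':
        return is_linker
    return value


# With 'auto', a linker reads from and writes to the linked entities ...
linker_default = resolve_flag('auto', is_linker=True)
# ... while an NER component does not ...
ner_default = resolve_flag('auto', is_linker=False)
# ... and an explicit bool always overrides the component type.
ner_forced = resolve_flag(True, is_linker=False)
```

In other words, 'auto' only matters when the caller has not made an explicit choice.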

get_ents_in

get_ents_in(doc: MutableDocument) -> list[MutableEntity] | None
Source code in medcat-v2/medcat/components/types.py
def get_ents_in(self, doc: MutableDocument) -> list[MutableEntity] | None:
    return doc.ner_ents.copy() if self._read_from_linked_ents else None

predict_entities abstractmethod

predict_entities(doc: MutableDocument, ents: list[MutableEntity] | None = None) -> list[MutableEntity]

Predict the relevant entities for the document.

This is meant to be used for the NER or the Linker component. The idea is that a specific implementation only needs to implement this method for inference to work.

Parameters:

  • doc

    (MutableDocument) –

    The document.

  • ents

    (list[MutableEntity] | None, default: None ) –

    The entities to consider (if any). If None, all possible entities in the document are considered. Defaults to None.

Returns:

  • list[MutableEntity] –

    The predicted entities in the document.

Source code in medcat-v2/medcat/components/types.py
@abstractmethod
def predict_entities(self, doc: MutableDocument,
                     ents: list[MutableEntity] | None = None
                     ) -> list[MutableEntity]:
    """Predict the relevant entities for the document.

    This is meant to be used for the NER or the Linker component.
    The idea is that a specific implementation only really
    needs to implement this method for inference to work.

    Args:
        doc (MutableDocument): The document.
        ents (list[MutableEntity] | None, optional): The entities to
            consider (if any). If None, all possible entities in the
            document are considered. Defaults to None.

    Returns:
        list[MutableEntity]: The predicted entities in document.
    """
    pass
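
As a rough illustration of "only this method needs implementing", the sketch below uses stand-in classes rather than the real medcat types. `SketchComponent`, `UppercaseNER`, and the use of a plain string as the document are all assumptions made so that the example runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Optional


class SketchComponent(ABC):
    """Hypothetical stand-in for AbstractEntityProvidingComponent."""

    @abstractmethod
    def predict_entities(self, doc: str,
                         ents: Optional[list] = None) -> list:
        ...


class UppercaseNER(SketchComponent):
    """Toy implementation: every fully upper-case word is an 'entity'."""

    def predict_entities(self, doc: str,
                         ents: Optional[list] = None) -> list:
        # If no candidates are passed in, consider the whole document.
        candidates = doc.split() if ents is None else ents
        return [tok for tok in candidates if tok.isupper()]


found = UppercaseNER().predict_entities("patient has COPD and CKD")

# The abstract base itself cannot be instantiated.
try:
    SketchComponent()
    instantiated = True
except TypeError:
    instantiated = False
```

Because the base class marks the method as abstract, forgetting to implement it fails at instantiation time rather than during inference.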

set_ents

set_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
def set_ents(self, doc: MutableDocument, ents: list[MutableEntity]
             ) -> None:
    if self._write_to_linked_ents:
        self.set_linked_ents(doc, ents)
    else:
        self.set_ner_ents(doc, ents)

set_linked_ents classmethod

set_linked_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
@classmethod
def set_linked_ents(cls, doc: MutableDocument, ents: list[MutableEntity]
                    ) -> None:
    doc.linked_ents.clear()
    doc.linked_ents.extend(ents)

set_ner_ents classmethod

set_ner_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
@classmethod
def set_ner_ents(cls, doc: MutableDocument, ents: list[MutableEntity]
                 ) -> None:
    doc.ner_ents.clear()
    doc.ner_ents.extend(ents)
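
The three setters above replace the contents of the target list in place (clear, then extend) instead of assigning a new list. A minimal sketch of that behaviour, assuming a hypothetical `FakeDoc` with only the two list attributes:

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for MutableDocument (only the two lists)."""
    ner_ents: list = field(default_factory=list)
    linked_ents: list = field(default_factory=list)


def set_ents(doc: FakeDoc, ents: list, write_to_linked_ents: bool) -> None:
    # Same routing as set_ents above: pick the target list by flag,
    # then replace its contents in place.
    target = doc.linked_ents if write_to_linked_ents else doc.ner_ents
    target.clear()
    target.extend(ents)


doc = FakeDoc(ner_ents=["old"])
alias = doc.ner_ents  # another reference to the same list object
set_ents(doc, ["a", "b"], write_to_linked_ents=False)
set_ents(doc, ["a"], write_to_linked_ents=True)
```

Since the list object is reused, any code that already holds a reference to `doc.ner_ents` (here `alias`) sees the new entities too.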

BaseComponent

Bases: Protocol

Methods:

  • create_new_component

    Create a new component or load one from disk if a load path is given.

  • is_core

    Whether the component is a core component or not.

Attributes:

full_name property

full_name: Optional[str]

Name with the component type (e.g. ner, linking, meta).

name property

name: str

The name of the component.

create_new_component classmethod

Create a new component or load one from disk if a load path is given.

This may raise an exception if the wrong type of config is provided.

Parameters:

  • cnf

    (ComponentConfig) –

    The config relevant to this component.

  • tokenizer

    (BaseTokenizer) –

    The base tokenizer.

  • cdb

    (CDB) –

    The CDB.

  • vocab

    (Vocab) –

    The Vocab.

  • model_load_path

    (Optional[str]) –

    Model load path (if present).

Returns:

  • Self ( Self ) –

    The new component.

Source code in medcat-v2/medcat/components/types.py
@classmethod
def create_new_component(
        cls, cnf: ComponentConfig, tokenizer: BaseTokenizer,
        cdb: CDB, vocab: Vocab, model_load_path: Optional[str]) -> Self:
    """Create a new component or load one from disk if a load path is given.

    This may raise an exception if the wrong type of config is provided.

    Args:
        cnf (ComponentConfig): The config relevant to this component.
        tokenizer (BaseTokenizer): The base tokenizer.
        cdb (CDB): The CDB.
        vocab (Vocab): The Vocab.
        model_load_path (Optional[str]): Model load path (if present).

    Returns:
        Self: The new component.
    """
    pass

is_core

is_core() -> bool

Whether the component is a core component or not.

Returns:

  • bool ( bool ) –

    Whether this is a core component.

Source code in medcat-v2/medcat/components/types.py
def is_core(self) -> bool:
    """Whether the component is a core component or not.

    Returns:
        bool: Whether this is a core component.
    """
    pass

CoreComponent

Bases: BaseComponent, Protocol

Methods:

get_type

get_type() -> CoreComponentType
Source code in medcat-v2/medcat/components/types.py
def get_type(self) -> CoreComponentType:
    pass

CoreComponentType

Bases: Enum

Attributes:

linking class-attribute instance-attribute

linking = auto()

ner class-attribute instance-attribute

ner = auto()

tagging class-attribute instance-attribute

tagging = auto()

token_normalizing class-attribute instance-attribute

token_normalizing = auto()
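
Because every member is defined with auto(), the integer values depend on the declaration order in the source file, which this page does not show. The sketch below uses a hypothetical copy of the enum (with an assumed order) to illustrate why looking members up by name is the safer choice.

```python
from enum import Enum, auto


class SketchComponentType(Enum):
    """Hypothetical copy of the four members; the order is an assumption."""
    tagging = auto()
    token_normalizing = auto()
    ner = auto()
    linking = auto()


# Name-based lookup does not depend on the auto() values.
by_name = SketchComponentType['ner']
names = [member.name for member in SketchComponentType]
```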

HashableComponent

Bases: Protocol

Methods:

get_hash

get_hash() -> str
Source code in medcat-v2/medcat/components/types.py
def get_hash(self) -> str:
    pass
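
Since HashableComponent is a Protocol, a class satisfies it by having a matching get_hash method; no inheritance is needed. The sketch below is standalone and makes two assumptions: the `@runtime_checkable` decorator (not shown in the source above) is added so that isinstance works, and `ThresholdComponent` with its SHA-256 hash is a made-up example.

```python
import hashlib
from typing import Protocol, runtime_checkable


@runtime_checkable
class SketchHashable(Protocol):
    def get_hash(self) -> str:
        ...


class ThresholdComponent:
    """Hypothetical component; it does not inherit from the protocol."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def get_hash(self) -> str:
        # Hash the state that defines this component's behaviour.
        return hashlib.sha256(repr(self.threshold).encode()).hexdigest()


comp = ThresholdComponent(0.5)
matches_protocol = isinstance(comp, SketchHashable)
```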

TrainableComponent

Bases: Protocol

Methods:

  • train

    Train the component.

train

train(cui: str, entity: MutableEntity, doc: MutableDocument, negative: bool = False, names: Union[list[str], dict] = []) -> None

Train the component.

This should only apply to the linker.

Parameters:

  • cui

    (str) –

    The CUI to train.

  • entity

    (MutableEntity) –

    The entity we're at.

  • doc

    (MutableDocument) –

    The document within which we're working.

  • negative

    (bool, default: False ) –

    Whether or not the example is negative. Defaults to False.

  • names

    (list[str] / dict, default: [] ) –

    Optionally used to update the status of a name-cui pair in the CDB.

Source code in medcat-v2/medcat/components/types.py
def train(self, cui: str,
          entity: MutableEntity,
          doc: MutableDocument,
          negative: bool = False,
          names: Union[list[str], dict] = []) -> None:
    """Train the component.

    This should only apply to the linker.

    Args:
        cui (str): The CUI to train.
        entity (MutableEntity): The entity we're at.
        doc (MutableDocument): The document within which we're working.
        negative (bool): Whether or not the example is negative.
            Defaults to False.
        names (list[str]/dict):
            Optionally used to update the `status` of a name-cui
            pair in the CDB.
    """
    pass

create_core_component

Create a core component.

Parameters:

  • comp_type

    (CoreComponentType) –

    The component type.

  • comp_name

    (str) –

    The name of the component.

  • cnf

    (ComponentConfig) –

    The config to be passed to creator.

  • tokenizer

    (BaseTokenizer) –

    The tokenizer to be passed to creator.

  • cdb

    (CDB) –

    The CDB to be passed to creator.

  • vocab

    (Vocab) –

    The vocab to be passed to the creator.

  • model_load_path

    (Optional[str]) –

    The optional load path to be passed to the creators.

Returns:

  • CoreComponent ( CoreComponent ) –

    The resulting / created component.

Source code in medcat-v2/medcat/components/types.py
def create_core_component(
        comp_type: CoreComponentType, comp_name: str, cnf: ComponentConfig,
        tokenizer: BaseTokenizer, cdb: CDB, vocab: Vocab,
        model_load_path: Optional[str]) -> CoreComponent:
    """Create a core component.

    Args:
        comp_type (CoreComponentType): The component type.
        comp_name (str): The name of the component.
        cnf (ComponentConfig): The config to be passed to creator.
        tokenizer (BaseTokenizer): The tokenizer to be passed to creator.
        cdb (CDB): The CDB to be passed to creator.
        vocab (Vocab): The vocab to be passed to the creator.
        model_load_path (Optional[str]): The optional load path to be passed
            to the creators.

    Returns:
        CoreComponent: The resulting / created component.
    """
    try:
        comp_getter = get_core_registry(comp_type).get_component(comp_name)
    except MedCATRegistryException as err:
        raise MedCATRegistryException(f"With comp type '{comp_type}'") from err
    return comp_getter(cnf, tokenizer, cdb, vocab, model_load_path)
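
The try/except above re-raises the registry error with the component type added, while `from err` keeps the original error attached. A minimal standalone sketch of that pattern follows; `SketchRegistryException`, `_REGISTRY`, `get_creator`, and `create` are hypothetical stand-ins, not medcat names.

```python
class SketchRegistryException(Exception):
    """Hypothetical stand-in for MedCATRegistryException."""


_REGISTRY = {"ner": {"toy_ner": lambda cnf: f"toy_ner({cnf})"}}


def get_creator(comp_type: str, comp_name: str):
    try:
        return _REGISTRY[comp_type][comp_name]
    except KeyError as err:
        raise SketchRegistryException(f"No component '{comp_name}'") from err


def create(comp_type: str, comp_name: str, cnf: str):
    try:
        creator = get_creator(comp_type, comp_name)
    except SketchRegistryException as err:
        # Add the component type; the original error stays reachable
        # through __cause__.
        raise SketchRegistryException(f"With comp type '{comp_type}'") from err
    return creator(cnf)


created = create("ner", "toy_ner", "cfg")
try:
    create("ner", "missing", "cfg")
except SketchRegistryException as err:
    outer, inner = err, err.__cause__
```

In a traceback this shows up as two chained errors, so the caller sees both which name was missing and which component type was being looked up.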

get_component_creator

get_component_creator(comp_type: CoreComponentType, comp_name: str) -> CompClass

Get the component creator.

Parameters:

  • comp_type

    (CoreComponentType) –

    The core component type.

  • comp_name

    (str) –

    The component name.

Returns:

  • CompClass

    Callable[..., CoreComponent]: The creator for the component.

Source code in medcat-v2/medcat/components/types.py
def get_component_creator(comp_type: CoreComponentType,
                          comp_name: str) -> CompClass:
    """Get the component creator.

    Args:
        comp_type (CoreComponentType): The core component type.
        comp_name (str): The component name.

    Returns:
        Callable[..., CoreComponent]: The creator for the component.
    """
    return get_core_registry(comp_type).get_component(comp_name)

get_core_registry

Get the registry for a core component type.

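Parameters:

  • comp_type

    (CoreComponentType) –

    The core component type.

Returns:

  • Registry[CoreComponent] –

    The corresponding registry.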

Source code in medcat-v2/medcat/components/types.py
def get_core_registry(comp_type: CoreComponentType) -> Registry[CoreComponent]:
    """Get the registry for a core component type.

    Args:
        comp_type (CoreComponentType): The core component type.

    Returns:
        Registry[CoreComponent]: The corresponding registry.
    """
    return _CORE_REGISTRIES[comp_type]

get_registered_components

get_registered_components(comp_type: CoreComponentType) -> list[tuple[str, str]]

Get all registered components (name and class name for each).

Parameters:

  • comp_type

    (CoreComponentType) –

    The core component type.

Returns:

  • list[tuple[str, str]]

    list[tuple[str, str]]: The name and class name for each registered component.

Source code in medcat-v2/medcat/components/types.py
def get_registered_components(comp_type: CoreComponentType
                              ) -> list[tuple[str, str]]:
    """Get all registered components (name and class name for each).

    Args:
        comp_type (CoreComponentType): The core component type.

    Returns:
        list[tuple[str, str]]: The name and class name for each
            registered component.
    """
    registry = get_core_registry(comp_type)
    return registry.list_components()

lazy_register_core_component

lazy_register_core_component(comp_type: CoreComponentType, comp_name: str, comp_module: str, comp_cls_and_init: str) -> None

Register a new core component in a lazy way.

This avoids importing the registered component and its transitive imports unless the component is actually used.

For instance, if your NER-providing class MySpecialNER is in the module my_addon.my_module and uses the class method create_new_component to initialise (so the complete path is my_addon.my_module.MySpecialNER.create_new_component), we would expect the following arguments:

  • comp_type=CoreComponentType.ner
  • comp_name="my_special_ner"
  • comp_module="my_addon.my_module"
  • comp_cls_and_init="MySpecialNER.create_new_component"

Parameters:

  • comp_type

    (CoreComponentType) –

    The component type.

  • comp_name

    (str) –

    The component name.

  • comp_module

    (str) –

    The path to the component module.

  • comp_cls_and_init

    (str) –

    The component class and init method.

Source code in medcat-v2/medcat/components/types.py
def lazy_register_core_component(
        comp_type: CoreComponentType,
        comp_name: str,
        comp_module: str,
        comp_cls_and_init: str) -> None:
    """Register a new core component in a lazy way.

    This avoids importing the registered component and its
    transitive imports unless the component is actually used.

    For instance if your NER providing class `MySpecialNER`
    is in the module `my_addon.my_module` and uses the class method
    `create_new_component` to initialise (thus the complete path is
    `my_addon.my_module.MySpecialNER.create_new_component`) we
    would expect the following arguments:
        comp_type=CoreComponentType.ner,
        comp_name="my_special_ner",
        comp_module="my_addon.my_module",
        comp_cls_and_init="MySpecialNER.create_new_component"

    Args:
        comp_type (CoreComponentType): The component type.
        comp_name (str): The component name.
        comp_module (str): The path to the component module.
        comp_cls_and_init (str): The component class and init method.
    """
    _CORE_REGISTRIES[comp_type].register_lazy(
        comp_name, comp_module, comp_cls_and_init)
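
How register_lazy resolves the two strings is not shown on this page. One plausible approach, sketched below under that assumption, is to store the strings and only import the module when the component is first requested. `LazyRegistry` is a hypothetical class, and a standard-library target is used in place of MySpecialNER so that the sketch runs anywhere.

```python
import importlib
from typing import Callable


class LazyRegistry:
    """Hypothetical sketch of lazy registration; not medcat's Registry."""

    def __init__(self) -> None:
        self._lazy: dict[str, tuple[str, str]] = {}

    def register_lazy(self, name: str, module: str, cls_and_init: str) -> None:
        # Only strings are stored here; nothing is imported yet.
        self._lazy[name] = (module, cls_and_init)

    def get_component(self, name: str) -> Callable:
        module_path, attr_path = self._lazy[name]
        obj = importlib.import_module(module_path)  # the import happens here
        for part in attr_path.split("."):
            obj = getattr(obj, part)
        return obj


registry = LazyRegistry()
# collections.OrderedDict.fromkeys plays the role of
# my_addon.my_module.MySpecialNER.create_new_component.
registry.register_lazy("ordered", "collections", "OrderedDict.fromkeys")
creator = registry.get_component("ordered")
result = creator(["a", "b"])
```

The split between comp_module and comp_cls_and_init matches this two-step lookup: the first string is importable, the second is a dotted attribute path inside the imported module.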

register_core_component

register_core_component(comp_type: CoreComponentType, comp_name: str, comp_clazz: CompClass) -> None

Register a new core component.

Parameters:

  • comp_type

    (CoreComponentType) –

    The component type.

  • comp_name

    (str) –

    The component name.

  • comp_clazz

    (CompClass) –

    The component creator.

Source code in medcat-v2/medcat/components/types.py
def register_core_component(comp_type: CoreComponentType,
                            comp_name: str,
                            comp_clazz: CompClass) -> None:
    """Register a new core component.

    Args:
        comp_type (CoreComponentType): The component type.
        comp_name (str): The component name.
        comp_clazz (CompClass): The component creator.
    """
    _CORE_REGISTRIES[comp_type].register(comp_name, comp_clazz)
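
To show how eager registration and later lookup fit together, here is a minimal standalone sketch with one registry per component type. `SketchRegistry`, `ToyNER`, and the string keys are hypothetical, and raising on a duplicate name is an assumption about sensible behaviour, not something the source above confirms.

```python
from typing import Callable


class SketchRegistry:
    """Hypothetical registry keyed by component name."""

    def __init__(self) -> None:
        self._components: dict[str, Callable] = {}

    def register(self, name: str, creator: Callable) -> None:
        # Assumption: registering the same name twice is an error.
        if name in self._components:
            raise ValueError(f"Component '{name}' already registered")
        self._components[name] = creator

    def get_component(self, name: str) -> Callable:
        return self._components[name]


class ToyNER:
    @classmethod
    def create_new_component(cls, cnf, tokenizer, cdb, vocab, model_load_path):
        # Same five-argument shape that create_core_component passes on.
        return cls()


registries = {"ner": SketchRegistry(), "linking": SketchRegistry()}
registries["ner"].register("toy_ner", ToyNER.create_new_component)
creator = registries["ner"].get_component("toy_ner")
component = creator(None, None, None, None, None)
```

The registered object is the creator (here a bound class method), not an instance; the instance is only built when the creator is called with the config, tokenizer, CDB, vocab, and load path.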