medcat.components.ner.trf.model

Classes:

NerModel

NerModel(cat: CAT)

The NER model.

This wraps a CAT instance and simplifies its use as a NER model.

It provides methods for creating one from a TransformersNER as well as loading from a model pack (along with some validation).

It also exposes some useful parts of the CAT it wraps such as the config and the concept database.

Methods:

  • add_new_concepts

    Add new concepts to the model and the concept database.

  • eval

    Evaluate the underlying transformers NER model.

  • get_entities

    Gets the entities recognized within a given text.

  • load_model_pack

    Load NER model from model pack.

  • train

    Train the underlying transformers NER model.

Attributes:

  • cat (CAT)

  • cdb (CDB)

  • config (Config)

  • trf_ner (TransformersNER)

Source code in medcat-v2/medcat/components/ner/trf/model.py
def __init__(self, cat: CAT) -> None:
    self.cat = cat

cat instance-attribute

cat = cat

cdb property

cdb: CDB

config property

config: Config

trf_ner property

trf_ner: TransformersNER

add_new_concepts

add_new_concepts(cui2preferred_name: dict[str, str], with_random_init: bool = False) -> None

Add new concepts to the model and the concept database.

After invoking this, the model must be retrained.

Parameters:

  • cui2preferred_name

    (dict[str, str]) –

    Dictionary where each key is the literal ID of the concept to be added and each value is its preferred name.

  • with_random_init

    (bool, default: False ) –

    Whether to use the random init strategy for the new concepts. When False (the default), the source below passes use_avg_init=True to the underlying component instead.

Source code in medcat-v2/medcat/components/ner/trf/model.py
def add_new_concepts(self,
                     cui2preferred_name: dict[str, str],
                     with_random_init: bool = False) -> None:
    """Add new concepts to the model and the concept database.

    Invoking this requires subsequent retraining on the model.

    Args:
        cui2preferred_name (dict[str, str]): Dictionary where each key is
            the literal ID of the concept to be added and each value is
            its preferred name.
        with_random_init (bool): Whether to use the random init strategy
            for the new concepts. Defaults to False.
    """
    self.trf_ner._component.expand_model_with_concepts(
        cui2preferred_name, use_avg_init=not with_random_init)
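
The flag inversion in the call above is easy to misread: with_random_init=False (the default) becomes use_avg_init=True. The following is a minimal sketch of that mapping using stand-in classes. RecordingComponent and SketchNerModel are hypothetical names for illustration only and are not part of MedCAT, the concept IDs are placeholders, and what the average init strategy does internally is not covered here.

```python
from typing import Optional


class RecordingComponent:
    """Hypothetical stand-in for the wrapped component; it only records calls."""

    def __init__(self) -> None:
        self.last_call: Optional[tuple] = None

    def expand_model_with_concepts(self, cui2preferred_name: dict[str, str],
                                   use_avg_init: bool = True) -> None:
        # Store a copy of the arguments so they can be inspected afterwards.
        self.last_call = (dict(cui2preferred_name), use_avg_init)


class SketchNerModel:
    """Hypothetical wrapper mirroring the delegation shown in the source above."""

    def __init__(self, component: RecordingComponent) -> None:
        self._component = component

    def add_new_concepts(self, cui2preferred_name: dict[str, str],
                         with_random_init: bool = False) -> None:
        # Same inversion as in the source: random init off means average init on.
        self._component.expand_model_with_concepts(
            cui2preferred_name, use_avg_init=not with_random_init)


component = RecordingComponent()
model = SketchNerModel(component)

# Default call: with_random_init is False, so use_avg_init is True.
model.add_new_concepts({"C0000001": "Example concept"})
default_call = component.last_call

# Explicit random init: use_avg_init becomes False.
model.add_new_concepts({"C0000002": "Another concept"}, with_random_init=True)
random_call = component.last_call
```

With the real class, the call would be followed by retraining, as noted in the description.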

eval

eval(json_path: Union[str, list, None], *args, **kwargs) -> tuple[Any, Any, Any]

Evaluate the underlying transformers NER model.

All the extra arguments are passed to the TransformersNER eval method.

Parameters:

  • json_path

    (Union[str, list, None]) –

    The JSON file path to read the training data from.

  • *args

    Additional arguments for TransformersNER.eval.

  • **kwargs

    Additional keyword arguments for TransformersNER.eval.

Returns:

  • tuple[Any, Any, Any] –

    df, examples, dataset

Source code in medcat-v2/medcat/components/ner/trf/model.py
def eval(self, json_path: Union[str, list, None],
         *args, **kwargs) -> tuple[Any, Any, Any]:
    """Evaluate the underlying transformers NER model.
    All the extra arguments are passed to the TransformersNER eval method.
    Args:
        json_path (Union[str, list, None]):
            The JSON file path to read the training data from.
        *args: Additional arguments for TransformersNER.eval .
        **kwargs: Additional keyword arguments for TransformersNER.eval .
    Returns:
        Tuple[Any, Any, Any]: df, examples, dataset
    """
    return self.trf_ner._component.eval(json_path, *args, **kwargs)
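
Because eval forwards *args and **kwargs unchanged, any extra options reach the underlying method exactly as given. The following is a minimal sketch of that forwarding with stand-in classes. EchoComponent and SketchNerModel are hypothetical, and some_option is a made-up keyword used only to make the forwarding visible; it is not a confirmed TransformersNER.eval parameter. Unlike the real method, which returns df, examples, dataset, the stand-in returns the arguments it received.

```python
from typing import Any, Union


class EchoComponent:
    """Hypothetical stand-in: returns what it was called with as a 3-tuple."""

    def eval(self, json_path: Union[str, list, None],
             *args: Any, **kwargs: Any) -> tuple[Any, Any, Any]:
        return json_path, args, kwargs


class SketchNerModel:
    """Hypothetical wrapper mirroring the delegation shown in the source above."""

    def __init__(self, component: EchoComponent) -> None:
        self._component = component

    def eval(self, json_path: Union[str, list, None],
             *args: Any, **kwargs: Any) -> tuple[Any, Any, Any]:
        # Positional and keyword extras are passed through untouched.
        return self._component.eval(json_path, *args, **kwargs)


model = SketchNerModel(EchoComponent())
path, extra_args, extra_kwargs = model.eval("eval_data.json", 1, some_option=True)
```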

get_entities

get_entities(text: str, *args, **kwargs) -> Union[dict, Entities, OnlyCUIEntities]

Gets the entities recognized within a given text.

The output format is identical to CAT.get_entities.

Any additional positional and keyword arguments are passed on to CAT.get_entities.

Parameters:

  • text

    (str) –

    The input text.

  • *args

    Additional arguments for cat.get_entities.

  • **kwargs

    Additional keyword arguments for cat.get_entities.

Returns:

  • Union[dict, Entities, OnlyCUIEntities] –

    The output entities.

Source code in medcat-v2/medcat/components/ner/trf/model.py
def get_entities(self, text: str, *args, **kwargs
                 ) -> Union[dict, Entities, OnlyCUIEntities]:
    """Gets the entities recognized within a given text.

    The output format is identical to `CAT.get_entities`.

    Undefined arguments and keyword arguments get passed on to
    CAT.get_entities.

    Args:
        text (str): The input text.
        *args: Additional arguments for cat.get_entities .
        **kwargs: Additional keyword arguments for cat.get_entities .

    Returns:
        dict: The output entities.
    """
    return self.cat.get_entities(text, *args, **kwargs)
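
Since get_entities takes one text at a time, annotating several documents means calling it in a loop. The following is a minimal sketch of such a helper, written against a structural type so it works with any object exposing get_entities(text). HasGetEntities, annotate_all, and FakeModel are hypothetical names introduced for illustration; FakeModel does not reproduce the real CAT.get_entities output format.

```python
from typing import Any, Iterable, Protocol


class HasGetEntities(Protocol):
    """Anything exposing get_entities(text, ...), such as a NerModel."""

    def get_entities(self, text: str, *args: Any, **kwargs: Any) -> Any: ...


def annotate_all(model: HasGetEntities, texts: Iterable[str]) -> list[Any]:
    """Run get_entities on each text and collect the raw outputs in order."""
    return [model.get_entities(text) for text in texts]


class FakeModel:
    """Hypothetical stand-in so the sketch runs without a model pack."""

    def get_entities(self, text: str, *args: Any, **kwargs: Any) -> dict:
        # Not the real output format; this just reports the input length.
        return {"text_length": len(text)}


texts = ["Patient has a cough.", "No fever."]
results = annotate_all(FakeModel(), texts)
```

In real use, a NerModel instance would be passed in place of FakeModel.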

load_model_pack classmethod

load_model_pack(model_pack_path: str, config: Optional[dict] = None) -> NerModel

Load NER model from model pack.

The method loads a CAT instance from the model pack and wraps it in a NerModel.

Parameters:

  • config

    (Optional[dict], default: None ) –

    Config for DeId model pack (primarily for stride of overlap window). In the source shown below, this argument is not passed on to CAT.load_model_pack.

  • model_pack_path

    (str) –

    The model pack path.

Returns:

  • NerModel –

    The resulting NER model.

Source code in medcat-v2/medcat/components/ner/trf/model.py
@classmethod
def load_model_pack(cls, model_pack_path: str,
                    config: Optional[dict] = None) -> 'NerModel':
    """Load NER model from model pack.

    The method first wraps the loaded CAT instance.

    Args:
        config: Config for DeId model pack (primarily for stride of
            overlap window)
        model_pack_path (str): The model pack path.

    Returns:
        NerModel: The resulting NER model.
    """
    cat = CAT.load_model_pack(model_pack_path)  # , ner_config_dict=config)
    return cls(cat)
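
Because the method is a classmethod that ends with cls(cat), calling it on a subclass yields an instance of that subclass. The source above also shows that config is accepted but not forwarded. The following is a minimal sketch of both points with stand-ins. FakeCAT, SketchNerModel, and SketchDeIdModel are hypothetical, nothing is read from disk, and the config key is a placeholder rather than a known option.

```python
from typing import Optional


class FakeCAT:
    """Hypothetical stand-in for CAT; remembers the path it was 'loaded' from."""

    def __init__(self, path: str) -> None:
        self.path = path

    @classmethod
    def load_model_pack(cls, model_pack_path: str) -> "FakeCAT":
        return cls(model_pack_path)


class SketchNerModel:
    """Hypothetical wrapper mirroring the alternate constructor shown above."""

    def __init__(self, cat: FakeCAT) -> None:
        self.cat = cat

    @classmethod
    def load_model_pack(cls, model_pack_path: str,
                        config: Optional[dict] = None) -> "SketchNerModel":
        # As in the source above, config is accepted but not passed on.
        cat = FakeCAT.load_model_pack(model_pack_path)
        return cls(cat)


class SketchDeIdModel(SketchNerModel):
    """Subclass used only to show that cls(cat) preserves the calling class."""


base = SketchNerModel.load_model_pack("path/to/model_pack")
sub = SketchDeIdModel.load_model_pack("path/to/model_pack",
                                      config={"placeholder_key": 1})
```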

train

train(json_path: Union[str, list, None], *args, **kwargs) -> tuple[Any, Any, Any]

Train the underlying transformers NER model.

All the extra arguments are passed to the TransformersNER train method.

Parameters:

  • json_path

    (Union[str, list, None]) –

    The JSON file path to read the training data from.

  • *args

    Additional arguments for TransformersNER.train.

  • **kwargs

    Additional keyword arguments for TransformersNER.train.

Returns:

  • tuple[Any, Any, Any] –

    df, examples, dataset

Source code in medcat-v2/medcat/components/ner/trf/model.py
def train(self, json_path: Union[str, list, None],
          *args, **kwargs) -> tuple[Any, Any, Any]:
    """Train the underlying transformers NER model.

    All the extra arguments are passed to the TransformersNER train method.

    Args:
        json_path (Union[str, list, None]): The JSON file path to read the
            training data from.
        *args: Additional arguments for TransformersNER.train .
        **kwargs: Additional keyword arguments for TransformersNER.train .

    Returns:
        Tuple[Any, Any, Any]: df, examples, dataset
    """
    return self.trf_ner._component.train(json_path, *args, **kwargs)
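
Both train and eval return a three-element tuple described as df, examples, dataset, so a typical workflow unpacks each result. The following is a minimal sketch of a train-then-evaluate helper, assuming only the signatures documented above. Trainable, train_then_eval, and FakeModel are hypothetical names, the file names are placeholders, and FakeModel returns labelled strings rather than real results.

```python
from typing import Any, Protocol, Union

Triple = tuple[Any, Any, Any]


class Trainable(Protocol):
    """Anything exposing train and eval with the signatures documented above."""

    def train(self, json_path: Union[str, list, None],
              *args: Any, **kwargs: Any) -> Triple: ...

    def eval(self, json_path: Union[str, list, None],
             *args: Any, **kwargs: Any) -> Triple: ...


def train_then_eval(model: Trainable, train_path: str,
                    eval_path: str) -> dict[str, Any]:
    """Train on one file, evaluate on another, keep the first element of each."""
    train_df, _train_examples, _train_dataset = model.train(train_path)
    eval_df, _eval_examples, _eval_dataset = model.eval(eval_path)
    return {"train_df": train_df, "eval_df": eval_df}


class FakeModel:
    """Hypothetical stand-in that returns labelled placeholders."""

    def train(self, json_path, *args, **kwargs):
        return (f"df:{json_path}", "examples", "dataset")

    def eval(self, json_path, *args, **kwargs):
        return (f"df:{json_path}", "examples", "dataset")


summary = train_then_eval(FakeModel(), "train.json", "test.json")
```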