medcat.components.types
Classes:
- AbstractCoreComponent
- AbstractEntityProvidingComponent – This is an abstract NER or linker component.
- BaseComponent
- CoreComponent
- CoreComponentType
- HashableComponent
- TrainableComponent
Functions:
- create_core_component – Create a core component.
- get_component_creator – Get the component creator.
- get_core_registry – Get the registry for a core component type.
- get_registered_components – Get all registered components (name and class name for each).
- lazy_register_core_component – Register a new core component in a lazy way.
- register_core_component – Register a new core component.
Attributes:
CompClass
module-attribute
CompClass = Callable[[ComponentConfig, BaseTokenizer, CDB, Vocab, Optional[str]], CoreComponent]
AbstractCoreComponent
AbstractEntityProvidingComponent
AbstractEntityProvidingComponent(read_from_linked_ents: bool | Literal['auto'] = 'auto', write_to_linked_ents: bool | Literal['auto'] = 'auto')
Bases: AbstractCoreComponent
This is an abstract NER or linker component.
The class handles reading entities from the document and writing them back (see get_ents_in and set_ents), so that this logic does not have to be re-implemented in each concrete component.
Methods:
- get_ents_in
- predict_entities – Predict the relevant entities for the document.
- set_ents
- set_linked_ents
- set_ner_ents
Source code in medcat-v2/medcat/components/types.py
get_ents_in
get_ents_in(doc: MutableDocument) -> list[MutableEntity] | None
Source code in medcat-v2/medcat/components/types.py
predict_entities
abstractmethod
predict_entities(doc: MutableDocument, ents: list[MutableEntity] | None = None) -> list[MutableEntity]
Predict the relevant entities for the document.
This is meant to be used by the NER or the linker component. The idea is that a specific implementation only needs to implement this method for inference to work.
Parameters:
- doc (MutableDocument) – The document.
- ents (list[MutableEntity] | None, default: None) – The entities to consider (if any). If None, all possible entities in the document are considered.
Returns:
- list[MutableEntity] – The predicted entities in the document.
Source code in medcat-v2/medcat/components/types.py
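The division of labour described for predict_entities can be sketched without MedCAT installed. The snippet below is a minimal illustration under stated assumptions: `Doc`, `Ent`, `SketchEntityProvider`, and `KeywordNER` are hypothetical stand-ins rather than MedCAT classes, and the real base class additionally decides whether results are read from or written to the NER or the linked entities.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ent:
    """Hypothetical stand-in for MutableEntity."""
    text: str
    start: int
    end: int


@dataclass
class Doc:
    """Hypothetical stand-in for MutableDocument."""
    text: str
    ents: list[Ent] = field(default_factory=list)


class SketchEntityProvider(ABC):
    """Stand-in base class: owns the call flow, delegates the prediction."""

    def __call__(self, doc: Doc) -> Doc:
        # Pass existing entities (or None if there are none), store the result.
        existing = doc.ents or None
        doc.ents = self.predict_entities(doc, existing)
        return doc

    @abstractmethod
    def predict_entities(self, doc: Doc,
                         ents: Optional[list[Ent]] = None) -> list[Ent]:
        """The only method a concrete component has to provide."""


class KeywordNER(SketchEntityProvider):
    """Toy NER: marks every occurrence of one fixed keyword."""

    def __init__(self, keyword: str) -> None:
        self.keyword = keyword

    def predict_entities(self, doc, ents=None):
        found, pos = [], doc.text.find(self.keyword)
        while pos != -1:
            found.append(Ent(self.keyword, pos, pos + len(self.keyword)))
            pos = doc.text.find(self.keyword, pos + 1)
        return found


doc = KeywordNER("fever")(Doc("fever and chills, then fever again"))
spans = [(e.start, e.end) for e in doc.ents]
```

The concrete class contains no document bookkeeping at all; everything except the prediction itself lives in the base class.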
set_ents
set_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
set_linked_ents
classmethod
set_linked_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
set_ner_ents
classmethod
set_ner_ents(doc: MutableDocument, ents: list[MutableEntity]) -> None
Source code in medcat-v2/medcat/components/types.py
BaseComponent
Bases: Protocol
Methods:
- create_new_component – Create a new component or load one off disk if a load path is provided.
- is_core – Whether the component is a core component or not.
Attributes:
- full_name (Optional[str]) – Name with the component type (e.g. ner, linking, meta).
- name (str) – The name of the component.
create_new_component
classmethod
create_new_component(cnf: ComponentConfig, tokenizer: BaseTokenizer, cdb: CDB, vocab: Vocab, model_load_path: Optional[str]) -> Self
Create a new component or load one off disk if a load path is provided.
This may raise an exception if the wrong type of config is provided.
Parameters:
- cnf (ComponentConfig) – The config relevant to this component.
- tokenizer (BaseTokenizer) – The base tokenizer.
- cdb (CDB) – The CDB.
- vocab (Vocab) – The Vocab.
- model_load_path (Optional[str]) – Model load path (if present).
Returns:
- Self – The new component.
Source code in medcat-v2/medcat/components/types.py
is_core
is_core() -> bool
Whether the component is a core component or not.
Returns:
- bool – Whether this is a core component.
Source code in medcat-v2/medcat/components/types.py
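The create-or-load behaviour of create_new_component can be illustrated with a small stand-alone sketch. Everything here is hypothetical: `ToyComponent` is not a MedCAT class, the signature is cut down to a config and a load path (the documented one also takes a tokenizer, a CDB, and a vocab), and the JSON file layout is an assumption made only for the demonstration.

```python
import json
import os
import tempfile
from typing import Optional


class ToyComponent:
    """Hypothetical component; not part of MedCAT."""
    name = "toy"

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def is_core(self) -> bool:
        return True

    @classmethod
    def create_new_component(cls, cnf: dict,
                             model_load_path: Optional[str]) -> "ToyComponent":
        # With a load path, restore saved state (assumed here to be JSON);
        # without one, build a fresh instance from the config.
        if model_load_path is not None:
            with open(model_load_path) as f:
                return cls(**json.load(f))
        return cls(threshold=cnf["threshold"])


# No load path: the component is created from the config.
fresh = ToyComponent.create_new_component({"threshold": 0.5}, None)

# With a load path: the saved state takes precedence over the config.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.json")
    with open(path, "w") as f:
        json.dump({"threshold": 0.9}, f)
    loaded = ToyComponent.create_new_component({"threshold": 0.5}, path)
```

Using a class method as the single entry point keeps callers indifferent to whether a component was built from scratch or restored from disk.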
CoreComponent
Bases: BaseComponent, Protocol
Methods:
- get_type
get_type
get_type() -> CoreComponentType
Source code in medcat-v2/medcat/components/types.py
CoreComponentType
HashableComponent
TrainableComponent
Bases: Protocol
Methods:
- train – Train the component.
train
train(cui: str, entity: MutableEntity, doc: MutableDocument, negative: bool = False, names: Union[list[str], dict] = []) -> None
Train the component.
This should only apply to the linker.
Parameters:
- cui (str) – The CUI to train.
- entity (MutableEntity) – The entity we're at.
- doc (MutableDocument) – The document within which we're working.
- negative (bool, default: False) – Whether or not the example is negative.
- names (Union[list[str], dict], default: []) – Optionally used to update the status of a name-cui pair in the CDB.
Source code in medcat-v2/medcat/components/types.py
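Because TrainableComponent is a protocol, training code can select components by capability instead of by class hierarchy. The sketch below shows that idea with a hypothetical `SketchTrainable` protocol and toy classes; none of these names come from MedCAT, the `train` signature is shortened, and whether the real protocol supports `isinstance` checks is not stated in this reference.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SketchTrainable(Protocol):
    """Hypothetical, reduced mirror of a trainable-component protocol."""

    def train(self, cui: str, entity: str, doc: str,
              negative: bool = False) -> None: ...


class ToyLinker:
    """Has a train method, so it satisfies the protocol structurally."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}

    def train(self, cui, entity, doc, negative=False):
        # Positive examples raise the count for a CUI, negative ones lower it.
        self.counts[cui] = self.counts.get(cui, 0) + (-1 if negative else 1)


class ToyNER:
    """No train method, so it does not satisfy the protocol."""


components = [ToyNER(), ToyLinker()]
trainable = [c for c in components if isinstance(c, SketchTrainable)]

for comp in trainable:
    # "C123" is a made-up CUI used only for illustration.
    comp.train("C123", "fever", "patient has fever")
    comp.train("C123", "fever", "unrelated text", negative=True)
    comp.train("C123", "fever", "high fever")
```

Only the toy linker is picked up, which mirrors the note that training should only apply to the linker.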
create_core_component
create_core_component(comp_type: CoreComponentType, comp_name: str, cnf: ComponentConfig, tokenizer: BaseTokenizer, cdb: CDB, vocab: Vocab, model_load_path: Optional[str]) -> CoreComponent
Create a core component.
Parameters:
- comp_type (CoreComponentType) – The component type.
- comp_name (str) – The name of the component.
- cnf (ComponentConfig) – The config to be passed to the creator.
- tokenizer (BaseTokenizer) – The tokenizer to be passed to the creator.
- cdb (CDB) – The CDB to be passed to the creator.
- vocab (Vocab) – The vocab to be passed to the creator.
- model_load_path (Optional[str]) – The optional load path to be passed to the creator.
Returns:
- CoreComponent – The created component.
Source code in medcat-v2/medcat/components/types.py
get_component_creator
get_component_creator(comp_type: CoreComponentType, comp_name: str) -> CompClass
Get the component creator.
Parameters:
- comp_type (CoreComponentType) – The core component type.
- comp_name (str) – The component name.
Returns:
- CompClass – The creator for the component (a callable returning a CoreComponent).
Source code in medcat-v2/medcat/components/types.py
get_core_registry
get_core_registry(comp_type: CoreComponentType) -> Registry[CoreComponent]
Get the registry for a core component type.
Parameters:
- comp_type (CoreComponentType) – The core component type.
Returns:
- Registry[CoreComponent] – The corresponding registry.
Source code in medcat-v2/medcat/components/types.py
get_registered_components
get_registered_components(comp_type: CoreComponentType) -> list[tuple[str, str]]
Get all registered components (name and class name for each).
Parameters:
- comp_type (CoreComponentType) – The core component type.
Returns:
- list[tuple[str, str]] – The name and class name for each registered component.
Source code in medcat-v2/medcat/components/types.py
lazy_register_core_component
lazy_register_core_component(comp_type: CoreComponentType, comp_name: str, comp_module: str, comp_cls_and_init: str) -> None
Register a new core component in a lazy way.
This avoids importing the registered component and its transitive imports unless the component is actually used.
For instance, if your NER-providing class MySpecialNER is in the module my_addon.my_module and uses the class method create_new_component to initialise (so the complete path is my_addon.my_module.MySpecialNER.create_new_component), we would expect the following arguments:
    comp_type=CoreComponentType.ner,
    comp_name="my_special_ner",
    comp_module="my_addon.my_module",
    comp_cls_and_init="MySpecialNER.create_new_component"
Parameters:
- comp_type (CoreComponentType) – The component type.
- comp_name (str) – The component name.
- comp_module (str) – The path to the component module.
- comp_cls_and_init (str) – The component class and init method.
Source code in medcat-v2/medcat/components/types.py
register_core_component
register_core_component(comp_type: CoreComponentType, comp_name: str, comp_clazz: CompClass) -> None
Register a new core component.
Parameters:
- comp_type (CoreComponentType) – The component type.
- comp_name (str) – The component name.
- comp_clazz (CompClass) – The component creator.
Source code in medcat-v2/medcat/components/types.py
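Taken together, the functions in this module describe a register, look up, and create cycle keyed by component type and name. The following is a minimal sketch of that pattern with hypothetical names (`SketchType`, `register`, `get_creator`, `create`, `registered`); it mirrors the documented roles only loosely, passes a plain dict instead of the config, tokenizer, CDB, and vocab, and is not the library's code.

```python
from enum import Enum, auto
from typing import Callable


class SketchType(Enum):
    """Hypothetical stand-in for CoreComponentType."""
    ner = auto()
    linking = auto()


class ToyNER:
    """Hypothetical component built from a plain dict config."""

    def __init__(self, cnf: dict) -> None:
        self.cnf = cnf


Creator = Callable[[dict], object]

# One name -> creator mapping per component type.
_registries: dict[SketchType, dict[str, Creator]] = {t: {} for t in SketchType}


def register(comp_type: SketchType, name: str, creator: Creator) -> None:
    _registries[comp_type][name] = creator


def get_creator(comp_type: SketchType, name: str) -> Creator:
    return _registries[comp_type][name]


def create(comp_type: SketchType, name: str, cnf: dict) -> object:
    # Look up the creator, then pass the construction arguments through.
    return get_creator(comp_type, name)(cnf)


def registered(comp_type: SketchType) -> list[tuple[str, str]]:
    # Pair each registered name with the name of its creator class.
    return [(n, c.__name__) for n, c in _registries[comp_type].items()]


register(SketchType.ner, "toy_ner", ToyNER)
comp = create(SketchType.ner, "toy_ner", {"min_len": 3})
```

Keeping one mapping per component type lets the same component name be reused across types without collisions.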