medcat.deid
Classes:
- DeIdModel: The DeID model.
DeIdModel
DeIdModel(cat: CAT)
Bases: NerModel
The DeID model.
This wraps a CAT instance and simplifies its use as a de-identification model.
It provides methods for creating one from a TransformersNER as well as loading from a model pack (along with some validation).
It also exposes some useful parts of the CAT it wraps, such as the config and the concept database.
Methods:
- create
- deid_multi_text
- deid_multi_texts
- deid_text: Deidentify text and potentially redact information.
- deid_text_with_entities: Deidentify text and potentially redact information.
- load_model_pack: Load DeId model from model pack.
- train
Attributes:
- cat
Source code in medcat-v2/medcat/components/ner/trf/deid.py
cat
instance-attribute
cat = cat
create
classmethod
create(cdb: CDB, cnf: ConfigTransformersNER)
Source code in medcat-v2/medcat/components/ner/trf/deid.py
deid_multi_text
deid_multi_text(texts: Iterable[str], redact: bool = False, n_process: Optional[int] = None) -> list[str]
Source code in medcat-v2/medcat/components/ner/trf/deid.py
deid_multi_texts
deid_multi_texts(texts: Iterable[str], redact: bool = False, n_process: Optional[int] = None) -> list[str]
Source code in medcat-v2/medcat/components/ner/trf/deid.py
deid_text
Deidentify text and potentially redact information, returning the de-identified text.
If redaction is enabled, identifiable entities are replaced with stars (e.g. *****).
Otherwise, each entity is replaced with its CUI, that is, the type of information that was hidden (e.g. [PATIENT]).
Parameters:
- text (str): The text to deidentify.
- redact (bool, default: False): Whether to redact the information.
Returns:
- str: The deidentified text.
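The two replacement modes described above can be illustrated with a standalone sketch. It is not MedCAT's implementation: `replace_spans` and the `(start, end, label)` span tuples are hypothetical, and using one star per hidden character is an assumption.

```python
def replace_spans(text, spans, redact=False):
    """Replace character spans in `text` (hypothetical helper, not part of MedCAT).

    spans: non-overlapping (start, end, label) tuples with character offsets.
    """
    result = text
    # Work from the last span to the first so earlier offsets stay valid.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        # Assumption: one star per hidden character when redacting.
        replacement = "*" * (end - start) if redact else f"[{label}]"
        result = result[:start] + replacement + result[end:]
    return result


note = "Patient John Smith was seen."
spans = [(8, 18, "PATIENT")]  # offsets of "John Smith"
labelled = replace_spans(note, spans)
redacted = replace_spans(note, spans, redact=True)
```

With a loaded model, `deid_text(text, redact=True)` handles both the detection and the replacement in a single call.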
Source code in medcat-v2/medcat/components/ner/trf/deid.py
deid_text_with_entities
Deidentify text and potentially redact information, returning the de-identified text together with the detected entities.
If redaction is enabled, identifiable entities are replaced with stars (e.g. *****).
Otherwise, each entity is replaced with its CUI, that is, the type of information that was hidden (e.g. [PATIENT]).
Parameters:
- text (str): The text to deidentify.
- redact (bool, default: False): Whether to redact the information.
Returns:
- tuple[str, Entities]: A tuple containing the deidentified text as a string and the entities found and linked within the text.
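A short sketch of consuming the returned tuple. The function accepts any object exposing `deid_text_with_entities`. This page does not confirm the layout of `Entities`, so treating it as a mapping with an `"entities"` key whose values carry a `"cui"` field is an assumption, and `count_by_cui` is a hypothetical helper.

```python
from collections import Counter


def count_by_cui(model, text, redact=False):
    """Return the de-identified text and a count of hidden entities per CUI.

    Assumes (not confirmed here) that the entities object looks like
    {"entities": {id: {"cui": ..., ...}, ...}}.
    """
    deid_text, entities = model.deid_text_with_entities(text, redact=redact)
    counts = Counter(ent["cui"] for ent in entities["entities"].values())
    return deid_text, counts
```

A per-type count like this can serve as a quick audit of what was hidden without exposing the original values.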
Source code in medcat-v2/medcat/components/ner/trf/deid.py
load_model_pack
classmethod
Load DeId model from model pack.
The method first loads the CAT instance.
It then makes sure that the model pack corresponds to a valid DeId model.
Parameters:
- config (Optional[dict], default: None): Config for the DeId model pack (primarily for the stride of the overlap window).
- model_pack_path (str): The model pack path.
Raises:
- ValueError: If the model pack does not correspond to a DeId model.
Returns:
- DeIdModel: The resulting DeId model.
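Because `load_model_pack` raises `ValueError` for a pack that is not a DeId model, callers may want to handle that case explicitly. The sketch below takes the loader as an argument so it can be exercised without a model pack. `try_load` is a hypothetical helper, and the import path in the closing comment is inferred from the source file location shown on this page.

```python
def try_load(loader, model_pack_path):
    """Call `loader(model_pack_path)`; return (model, None) or (None, message).

    Hypothetical helper. `loader` is expected to behave like
    DeIdModel.load_model_pack, raising ValueError for a non-DeId pack.
    """
    try:
        return loader(model_pack_path), None
    except ValueError as err:
        return None, str(err)


# Intended use (assumed import path; requires MedCAT and a real model pack):
#   from medcat.components.ner.trf.deid import DeIdModel
#   model, error = try_load(DeIdModel.load_model_pack, "path/to/deid_pack.zip")
```

Returning the error message instead of re-raising lets a batch job skip an invalid pack and continue. Other exceptions, such as a missing file, still propagate.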
Source code in medcat-v2/medcat/components/ner/trf/deid.py
train
Source code in medcat-v2/medcat/components/ner/trf/deid.py