medcat.utils.legacy.convert_cdb
Classes:
Functions:
- convert_data – Convert the raw v1 data into a CDB.
- get_cdb_from_old – Get the v2 CDB from a v1 CDB path.
- load_old_raw_data – Loads the raw data from the old file.
- update_names
Attributes:
- CUI2KEYS
- CUI2KEYS_OPTIONAL
- EXPECTED_USEFUL_KEYS
- NAME2KEYS
- OPTIONAL_NAME2_KEYS
- TO_RENAME
- logger
CUI2KEYS
module-attribute
CUI2KEYS = {'cui2names', 'cui2snames', 'cui2context_vectors', 'cui2count_train', 'cui2info', 'cui2tags', 'cui2type_ids', 'cui2preferred_name', 'cui2average_confidence'}
CUI2KEYS_OPTIONAL
module-attribute
CUI2KEYS_OPTIONAL = {'cui2info'}
EXPECTED_USEFUL_KEYS
module-attribute
EXPECTED_USEFUL_KEYS = ['name2cuis', 'name2cuis2status', 'name2count_train', 'name_isupper', 'snames', 'cui2names', 'cui2snames', 'cui2context_vectors', 'cui2count_train', 'cui2tags', 'cui2type_ids', 'cui2preferred_name', 'cui2average_confidence', 'addl_info', 'vocab']
NAME2KEYS
module-attribute
NAME2KEYS = {'name2cuis', 'name2cuis2status', 'name2count_train', 'name_isupper'}
OPTIONAL_NAME2_KEYS
module-attribute
OPTIONAL_NAME2_KEYS = {'name_isupper'}
TO_RENAME
module-attribute
TO_RENAME = {'vocab': 'token_counts'}
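The page does not show how these constants are consumed. As a minimal sketch, assuming the optional sets mark keys that may be absent from a v1 file and that TO_RENAME maps v1 key names to their v2 names, a raw data dict could be checked and renamed as follows. The helper names `missing_required_keys` and `rename_keys` are hypothetical and not part of the module; the constant values are copied from the listing above.

```python
# Constant values copied from the module attributes documented above.
NAME2KEYS = {'name2cuis', 'name2cuis2status', 'name2count_train', 'name_isupper'}
OPTIONAL_NAME2_KEYS = {'name_isupper'}
CUI2KEYS = {'cui2names', 'cui2snames', 'cui2context_vectors', 'cui2count_train',
            'cui2info', 'cui2tags', 'cui2type_ids', 'cui2preferred_name',
            'cui2average_confidence'}
CUI2KEYS_OPTIONAL = {'cui2info'}
TO_RENAME = {'vocab': 'token_counts'}


def missing_required_keys(raw: dict) -> set:
    """Hypothetical helper: return the non-optional keys absent from raw."""
    # Assumption: a key is required unless it appears in an optional set.
    required = (NAME2KEYS - OPTIONAL_NAME2_KEYS) | (CUI2KEYS - CUI2KEYS_OPTIONAL)
    return required - set(raw)


def rename_keys(raw: dict) -> dict:
    """Hypothetical helper: apply TO_RENAME to the top-level keys of raw."""
    # Keys without an entry in TO_RENAME are kept unchanged.
    return {TO_RENAME.get(key, key): value for key, value in raw.items()}


example_raw = {'vocab': {'kidney': 3}, 'name2cuis': {}}
renamed = rename_keys(example_raw)
missing = missing_required_keys(example_raw)
```

The actual conversion code may treat these sets differently; the sketch only illustrates one plausible reading of their names.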
CustomUnpickler
Bases: Unpickler
Methods:
find_class
find_class(module, name)
Source code in medcat-v2/medcat/utils/legacy/convert_cdb.py
LegacyClassNotFound
LegacyClassNotFound(*args, **kwargs)
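The source of these two classes is not reproduced on this page. A common way to load pickles whose classes no longer exist is to override `pickle.Unpickler.find_class` and return a placeholder when the lookup fails. The sketch below illustrates that general technique using only the standard library; `TolerantUnpickler` and `PlaceholderClass` are hypothetical names, and the real `CustomUnpickler` and `LegacyClassNotFound` may behave differently.

```python
import io
import pickle


class PlaceholderClass:
    """Hypothetical stand-in for a class that can no longer be imported."""

    def __init__(self, *args, **kwargs):
        # Keep the constructor arguments so the data is not silently lost.
        self.args = args
        self.kwargs = kwargs


class TolerantUnpickler(pickle.Unpickler):
    """Unpickler that substitutes a placeholder for unresolvable classes."""

    def find_class(self, module, name):
        try:
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            # The module or the attribute is missing: fall back to the placeholder.
            return PlaceholderClass


# A hand-written protocol-0 pickle that calls a class which does not exist.
payload = b"cno_such_module\nNoSuchClass\n(tR."
obj = TolerantUnpickler(io.BytesIO(payload)).load()

# Ordinary data is unaffected by the override.
plain = TolerantUnpickler(io.BytesIO(pickle.dumps({"a": 1}))).load()
```

As with any use of pickle, this should only be applied to files from a trusted source.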
convert_data
convert_data(all_data: dict, fix_spacy_model_name: bool = True) -> CDB
Convert the raw v1 data into a CDB.
Parameters:
- all_data (dict) – The raw v1 data off disk.
- fix_spacy_model_name (bool, default: True) – Whether to fix the spaCy model name. Older models may have unsupported spaCy model names, which sometimes need to be fixed.
Returns:
- CDB – The v2 CDB.
Source code in medcat-v2/medcat/utils/legacy/convert_cdb.py
get_cdb_from_old
get_cdb_from_old(old_path: str, fix_spacy_model_name: bool = True) -> CDB
Get the v2 CDB from a v1 CDB path.
Parameters:
- old_path (str) – The v1 CDB path.
- fix_spacy_model_name (bool, default: True) – Whether to fix the spaCy model name. Older models may have unsupported spaCy model names, which sometimes need to be fixed.
Returns:
- CDB – The v2 CDB.
Source code in medcat-v2/medcat/utils/legacy/convert_cdb.py
load_old_raw_data
Loads the raw data from the old file.
This uses a wrapper that allows the data to be loaded even if the pickled classes no longer exist.
Parameters:
- old_path (str) – The path of the file to read.
Returns:
- dict – The resulting raw data.
Source code in medcat-v2/medcat/utils/legacy/convert_cdb.py
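A minimal usage sketch of the functions documented above, assuming MedCAT v2 is installed. The wrapper names `convert_in_one_step` and `convert_in_two_steps` are hypothetical, and so is any path passed to them. Treating the two routes as equivalent is an assumption based on the signatures, not something this page states.

```python
def convert_in_one_step(old_path: str):
    """Convert a v1 CDB file straight to a v2 CDB."""
    # Imported lazily so the sketch can be defined without MedCAT installed.
    from medcat.utils.legacy.convert_cdb import get_cdb_from_old

    return get_cdb_from_old(old_path, fix_spacy_model_name=True)


def convert_in_two_steps(old_path: str):
    """Load the raw v1 data first, then convert it to a v2 CDB."""
    from medcat.utils.legacy.convert_cdb import convert_data, load_old_raw_data

    raw = load_old_raw_data(old_path)  # documented to return a dict
    # The raw dict could be inspected here before conversion.
    return convert_data(raw, fix_spacy_model_name=True)
```

The two-step route is useful when the raw dict needs to be examined, for example to see which of the EXPECTED_USEFUL_KEYS are present, before the CDB is built.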