medcat.components.addons.meta_cat.mctokenizers.bpe_tokenizer
Classes:
- TokenizerWrapperBPE – Wrapper around a huggingface tokenizer so that it works with the MetaCAT models.
Attributes:
FAKE_TOKENIZER_PATH
module-attribute
FAKE_TOKENIZER_PATH = '#\n/fake-path-not-exist#/'
TokenizerWrapperBPE
TokenizerWrapperBPE(hf_tokenizers: Optional[ByteLevelBPETokenizer] = None)
Bases: TokenizerWrapperBase
Wrapper around a huggingface tokenizer so that it works with the MetaCAT models.
Parameters:
- hf_tokenizers (tokenizers.ByteLevelBPETokenizer) – A huggingface BBPE tokenizer.
Methods:
- create_new
- get_pad_id
- get_size
- load
- save
- token_to_id
Attributes:
- name
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
name
class-attribute
instance-attribute
name = 'bbpe'
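The method list above suggests how calling code might depend on the wrapper: look up ids with `token_to_id`, query the vocabulary size with `get_size`, and pad with `get_pad_id`. The following is a minimal, stdlib-only sketch of that pattern. It does not import MedCAT: `MetaCATTokenizerLike`, `ToyTokenizer`, and `pad_batch` are hypothetical names introduced for illustration, the signatures of `get_pad_id` and `token_to_id` are assumed (this page does not show them), and the `<PAD>` token name is likewise an assumption.

```python
from typing import Protocol, Sequence


class MetaCATTokenizerLike(Protocol):
    """Hypothetical protocol mirroring three methods listed for the wrapper."""

    def get_size(self) -> int: ...

    def get_pad_id(self) -> int: ...  # return type assumed, not documented here

    def token_to_id(self, token: str) -> int: ...  # signature assumed


class ToyTokenizer:
    """In-memory stand-in for illustration; NOT the real TokenizerWrapperBPE."""

    def __init__(self, vocab: Sequence[str]) -> None:
        # Assign ids in the order the tokens are given.
        self._ids = {tok: i for i, tok in enumerate(vocab)}

    def get_size(self) -> int:
        return len(self._ids)

    def get_pad_id(self) -> int:
        # Assumes the padding token is spelled "<PAD>".
        return self._ids["<PAD>"]

    def token_to_id(self, token: str) -> int:
        return self._ids[token]


def pad_batch(
    tokenizer: MetaCATTokenizerLike, batch: list[list[int]]
) -> list[list[int]]:
    """Right-pad every sequence to the longest one using the tokenizer's pad id."""
    pad_id = tokenizer.get_pad_id()
    width = max((len(seq) for seq in batch), default=0)
    return [seq + [pad_id] * (width - len(seq)) for seq in batch]


toy = ToyTokenizer(["<PAD>", "chest", "pain", "denied"])
ids = [
    [toy.token_to_id("chest"), toy.token_to_id("pain")],
    [toy.token_to_id("denied")],
]
padded = pad_batch(toy, ids)
```

Because `pad_batch` only relies on the protocol, an object exposing the same three methods could in principle be passed in its place, provided its real signatures match the assumptions stated above.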
create_new
classmethod
create_new()
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
get_pad_id
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
get_size
get_size() -> int
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
load
classmethod
load(dir_path: str, model_variant: Optional[str] = '', **kwargs) -> TokenizerWrapperBPE
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
save
save(dir_path: str) -> None
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py
token_to_id
Source code in medcat-v2/medcat/components/addons/meta_cat/mctokenizers/bpe_tokenizer.py