
BERT Text Alignment: How to Pad Sequences with the BERT Tokenizer


After searching at length for ways to align (pad) BERT input sequences, I found nothing better than the method built into Huggingface's transformers library:

  from transformers import BertTokenizer

  # Load the tokenizer matching the pretrained checkpoint
  tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

  sequence_a = "This is a short sequence."
  sequence_b = "This is a rather long sequence. It is at least longer than the sequence A."

  # padding=True pads every sequence in the batch to the length of the longest one;
  # the result contains aligned input_ids plus an attention_mask marking real tokens
  padded_sequences = tokenizer([sequence_a, sequence_b], padding=True)
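To see what `padding=True` actually does, here is a minimal sketch of the padding logic in plain Python (the `pad_sequences` helper and the example token ids are hypothetical; 0 is assumed as BERT's `[PAD]` id, which holds for the standard BERT vocabularies):

```python
def pad_sequences(batch_ids, pad_id=0):
    """Pad each list of token ids to the length of the longest one.

    Returns a dict mimicking the tokenizer output: padded input_ids
    plus an attention_mask (1 = real token, 0 = padding).
    """
    max_len = max(len(ids) for ids in batch_ids)
    input_ids, attention_mask = [], []
    for ids in batch_ids:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Illustrative ids only: 101 = [CLS], 102 = [SEP] in BERT vocabularies
batch = pad_sequences([[101, 2023, 102],
                       [101, 2023, 2003, 1037, 102]])
```

The attention mask is what lets the model ignore the padding positions, so both outputs should always be passed to the model together.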
