>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
>>> model = AutoModel.from_pretrained("bert-base-chinese")
>>> model.eval()
BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(21128, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (token_type_embeddings): Embedding(2, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): BertEncoder(
    (layer): ModuleList(
      (0): BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (intermediate): BertIntermediate(
          (dense): Linear(in_features=768, out_features=3072, bias=True)
        )
        (output): BertOutput(
          (dense): Linear(in_features=3072, out_features=768, bias=True)
          (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
      ... (layers (1) through (11) are identical to layer (0)) ...
    )
  )
  (pooler): BertPooler(
    (dense): Linear(in_features=768, out_features=768, bias=True)
    (activation): Tanh()
  )
)
>>> inputs = tokenizer("我想快点发论文", return_tensors="pt")
>>> outputs = model(**inputs)
>>> print(inputs)
{'input_ids': tensor([[ 101, 2769, 2682, 2571, 4157, 1355, 6389, 3152,  102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1]])}

> token_type_ids: This tensor maps every token to its corresponding segment.
> attention_mask: This tensor masks out padded positions in a batch of sequences with different lengths.
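The attention_mask above is all ones because the batch holds a single sequence. As a minimal sketch of what it looks like once padding is involved (pure Python, no model download; the pad_batch helper is a made-up name, not part of transformers):

```python
# Sketch: how attention_mask marks real vs. padded positions when
# sequences of different lengths are batched together.
# `pad_batch` is a hypothetical helper, not part of transformers.

PAD_ID = 0  # bert-base-chinese uses id 0 for [PAD] (see padding_idx=0 above)

def pad_batch(sequences):
    """Right-pad id sequences to equal length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [PAD_ID] * n_pad)           # pad with PAD_ID
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real token
    return input_ids, attention_mask

# Two sequences of different lengths (ids are illustrative):
ids, mask = pad_batch([[101, 2769, 102], [101, 2682, 2571, 4157, 102]])
print(mask)  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

This is what `tokenizer(..., padding=True)` does for you internally.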
>>> print(outputs)
(tensor([[[-0.2791,  0.3020,  0.4071,  ...,  0.2707,  0.5302, -0.7799],
          [ 0.5064, -0.5631,  0.6345,  ..., -1.0366,  0.2625, -0.1994],
          [ 0.2194, -1.4004, -0.6083,  ..., -0.0954,  1.3527,  0.1086],
          ...,
          [-1.3459,  0.2393, -0.1635,  ...,  0.2543,  0.3820, -0.6676],
          [-0.1877,  0.2440, -0.7461,  ...,  0.7493,  1.2351, -0.6387],
          [-0.6366,  0.0020,  0.1719,  ...,  0.5995,  1.0313, -0.5601]]],
        grad_fn=<NativeLayerNormBackward>),
 tensor([[ 0.9998,  1.0000,  0.9645,  ..., -0.9775, -0.9966,  0.9923]],
        grad_fn=<TanhBackward>))
>>> print(outputs[0].shape)
torch.Size([1, 9, 768])
>>> print(outputs[1].shape)
torch.Size([1, 768])
BERT outputs two tensors (two more, the per-layer hidden states and the attention weights, can optionally be requested):
1. outputs[0] is last_hidden_state
outputs[0] holds the representation of each token, with shape (1, NB_TOKENS, REPRESENTATION_SIZE).
It is a token-level representation.
The first, token-based, representation can be leveraged if your task requires keeping the sequence representation and operating at the token level. This is particularly useful for Named Entity Recognition and Question Answering.
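For instance, a token-level task like NER would put a small classification head on top of last_hidden_state. A rough sketch with random arrays standing in for the real tensors (numpy instead of the model, so nothing is downloaded; num_labels and the weight matrix are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for outputs[0] (last_hidden_state): (batch, tokens, features)
last_hidden_state = rng.standard_normal((1, 9, 768))

# Hypothetical per-token classification head for, say, 5 NER labels
num_labels = 5
W = rng.standard_normal((768, num_labels))
b = np.zeros(num_labels)

logits = last_hidden_state @ W + b    # one score vector per token
predictions = logits.argmax(axis=-1)  # one label id per token

print(logits.shape)       # (1, 9, 5)
print(predictions.shape)  # (1, 9)
```

The point is just the shapes: every one of the 9 tokens keeps its own 768-dim vector, so every token gets its own prediction.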
If the per-layer hidden states are requested as well, the stacked result actually spans four dimensions, in order [# layers, # batches, # tokens, # features]:
The layer number (12 layers)
The batch number (1 sentence)
The word / token number (9 tokens in our sentence)
The hidden unit / feature number (768 features)
In the older pytorch-pretrained-bert API, these per-layer outputs were returned as a Python list named encoded_layers:
>>> # `encoded_layers` is a Python list.
>>> print('Type of encoded_layers: ', type(encoded_layers))
Type of encoded_layers:  <class 'list'>
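Stacking those per-layer tensors makes the four dimensions concrete. A sketch with dummy arrays in place of the real hidden states (in the current transformers API, output_hidden_states=True yields 13 tensors: the embedding output plus the 12 encoder layers):

```python
import numpy as np

# 13 dummy layer outputs (embedding output + 12 encoder layers),
# each shaped (batch, tokens, features) like our example: (1, 9, 768)
hidden_states = [np.zeros((1, 9, 768)) for _ in range(13)]

# Stack along a new leading axis: [# layers, # batches, # tokens, # features]
stacked = np.stack(hidden_states)
print(stacked.shape)  # (13, 1, 9, 768)
```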
2. outputs[1] is pooler_output
outputs[1] is an aggregated representation of the whole input, with shape (1, REPRESENTATION_SIZE).
It captures the entire sequence rather than individual tokens.
The second, aggregated, representation is especially useful if you need to extract the overall context of the sequence and don't require fine-grained token-level detail. This is the case for sentiment analysis of the sequence, or for information retrieval.
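A typical use of the pooled vector is comparing two sequences by cosine similarity. A minimal sketch, with random vectors standing in for the pooler_output of two sentences (numpy, no model; cosine_similarity is our own helper, not a transformers function):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for outputs[1] (pooler_output) of two sentences: (1, 768) each
pooled_a = rng.standard_normal((1, 768))
pooled_b = rng.standard_normal((1, 768))

def cosine_similarity(u, v):
    """Cosine similarity between two 1-D vectors, in [-1, 1]."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

score = cosine_similarity(pooled_a[0], pooled_b[0])
```

With real pooler_output vectors, a higher score suggests the two sequences are more similar overall.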
>>> options = ['意气风发', '街谈巷议']
>>> inputs = tokenizer(options, return_tensors="pt")
>>> outputs = model(**inputs)
>>> print(outputs[0].shape)
torch.Size([2, 6, 768])
Each four-character idiom becomes 6 tokens (4 characters plus [CLS] and [SEP]). Both sequences here happen to be the same length; batching sequences of different lengths would require padding=True, with the resulting attention_mask marking the padded positions.
>>> from transformers import pipeline
>>> question_answerer = pipeline('question-answering')
>>> question_answerer({
... 'question': 'What is the name of the repository ?',
... 'context': 'Pipeline have been included in the huggingface/transformers repository'
... })
{'score': 0.5135956406593323, 'start': 35, 'end': 59, 'answer': 'huggingface/transformers'}
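The start and end fields in that result are character offsets into the context string, so the answer can be recovered by slicing:

```python
# The context and result from the pipeline call above
context = 'Pipeline have been included in the huggingface/transformers repository'
result = {'score': 0.5135956406593323, 'start': 35, 'end': 59,
          'answer': 'huggingface/transformers'}

# start/end index directly into the original context string
span = context[result['start']:result['end']]
print(span)  # huggingface/transformers
```

This is handy when you need to highlight the answer inside the original document rather than just display the extracted string.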