Storing data with a tuple: data = (1, 2, "abc")
Storing data with a dict: data = {"name": "Alice"}
Storing data with a namedtuple:
from collections import namedtuple
Player = namedtuple('Player', ['name', 'number', 'position', 'age', 'grade'])
jordan = Player('Michael Jordan', 23, 'PG', 29, 'S+')
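As a minimal sketch of what the namedtuple above gives you beyond a plain tuple (the `older` variable and the printed values are only illustrative), fields can be read by name or by index, exported with `_asdict()`, and copied with `_replace()`, while direct assignment is rejected:

```python
from collections import namedtuple

Player = namedtuple("Player", ["name", "number", "position", "age", "grade"])
jordan = Player("Michael Jordan", 23, "PG", 29, "S+")

# Fields are readable by name or by position.
print(jordan.name, jordan[1])

# _asdict() returns a mapping of field names to values.
print(jordan._asdict()["position"])

# Instances are immutable; _replace() builds a modified copy instead.
older = jordan._replace(age=30)
print(older.age, jordan.age)

# Direct assignment raises AttributeError.
assignment_rejected = False
try:
    jordan.age = 30
except AttributeError:
    assignment_rejected = True
print("assignment rejected:", assignment_rejected)
```

Because the instance cannot be modified in place, a namedtuple suits read-only records; when fields need to change, a dataclass is usually the more convenient choice.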
Storing data with a dataclass:
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    number: int
    position: str
    age: int
    grade: str

james = Player('LeBron James', 23, 'SF', 25, 'S')
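Field annotations can use types such as typing.Any and typing.List.
Fields can be given default values, dataclasses can be nested inside one another, and passing @dataclass(frozen=True) makes instances immutable.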
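The features listed above can be sketched together in one small example. The `Team` class, its `city` default, and the `scores` and `note` fields are hypothetical names introduced only for illustration:

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Any, List


@dataclass(frozen=True)
class Team:
    name: str
    city: str = "Unknown"   # default value


@dataclass
class Player:
    name: str
    number: int
    team: Team              # nested dataclass
    scores: List[int]       # typing.List annotation
    note: Any = None        # typing.Any accepts any value


james = Player("LeBron James", 23, Team("Lakers"), [30, 25])

# __repr__ and __eq__ are generated automatically.
print(james)
print(james == Player("LeBron James", 23, Team("Lakers"), [30, 25]))

# Player is not frozen, so its fields can be reassigned.
james.note = "MVP"

# Team is frozen, so assignment raises FrozenInstanceError.
frozen_rejected = False
try:
    james.team.city = "Los Angeles"
except FrozenInstanceError:
    frozen_rejected = True
print("frozen assignment rejected:", frozen_rejected)
```

Note that `frozen=True` only protects the attributes of the frozen class itself; the outer `Player` here remains mutable.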
field: the cornerstone of dataclasses
# This function is used instead of exposing Field creation directly,
# so that a type checker can be told (via overloads) that this is a
# function whose type depends on its parameters.
def field(*, default=MISSING, default_factory=MISSING, init=True, repr=True,
          hash=None, compare=True, metadata=None):
    """Return an object to identify dataclass fields.

    default is the default value of the field.  default_factory is a
    0-argument function called to initialize a field's value.  If init
    is True, the field will be a parameter to the class's __init__()
    function.  If repr is True, the field will be included in the
    object's repr().  If hash is True, the field will be included in
    the object's hash().  If compare is True, the field will be used
    in comparison functions.  metadata, if specified, must be a
    mapping which is stored but not otherwise examined by dataclass.

    It is an error to specify both default and default_factory.
    """
    if default is not MISSING and default_factory is not MISSING:
        raise ValueError('cannot specify both default and default_factory')
    return Field(default, default_factory, init, repr, hash, compare,
                 metadata)
price: float = 0.0
is equivalent to price: float = field(default=0.0)
default vs. default_factory
For mutable types (such as list), you must use field(default_factory=list).
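A short sketch of why `default_factory` matters for mutable types. The `Item` and `Broken` classes and their field names are hypothetical; the point is that each instance gets its own list, and that a bare `[]` default is refused when the class is defined:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    name: str
    price: float = 0.0                             # shorthand default
    discount: float = field(default=0.0)           # explicit equivalent
    tags: List[str] = field(default_factory=list)  # new list per instance


a = Item("pen")
b = Item("ink")
a.tags.append("office")
print(a.tags, b.tags)  # the two instances do not share a list

# A bare mutable default is rejected at class-definition time.
error_message = None
try:
    @dataclass
    class Broken:
        tags: List[str] = []
except ValueError as exc:
    error_message = str(exc)
print("rejected:", error_message)
```

The factory is called once per instance with no arguments, so any zero-argument callable (for example `dict` or a `lambda`) can be used in the same way.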
metadata is a dict that carries extra supplementary data; dataclasses itself does not use it.
asdict converts an object to a dict, and fields returns the object's Field objects (roughly, the collection of its field definitions):
from dataclasses import asdict, dataclass, field, fields

@dataclass
class TrainingArguments:
    framework = "pt"
    output_dir: str = field(
        metadata={"help": "The output directory where the model predictions and checkpoints will be written."},
    )
    overwrite_output_dir: bool = field(
        default=False,
        metadata={
            "help": (
                "Overwrite the content of the output directory. "
                "Use this to continue training if output_dir points to a checkpoint directory."
            )
        },
    )
    do_train: bool = field(default=False, metadata={"help": "Whether to run training."})
    do_eval: bool = field(default=False, metadata={"help": "Whether to run eval on the dev set."})
    do_predict: bool = field(default=False, metadata={"help": "Whether to run predictions on the test set."})
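To make the roles of `metadata`, `asdict`, and `fields` concrete, here is a minimal sketch on a cut-down class. `RunArguments` and its help strings are hypothetical stand-ins, not the real `TrainingArguments`:

```python
from dataclasses import asdict, dataclass, field, fields


@dataclass
class RunArguments:
    output_dir: str = field(metadata={"help": "Where to write results."})
    do_train: bool = field(default=False, metadata={"help": "Whether to run training."})


args = RunArguments(output_dir="./out")

# asdict() turns the instance into a plain dict of field values.
as_dict = asdict(args)
print(as_dict)

# fields() returns the Field objects; metadata is available on each one.
help_text = {f.name: f.metadata["help"] for f in fields(args)}
for name, text in help_text.items():
    print(f"--{name}: {text}")
```

Since dataclasses never inspects `metadata`, it is a convenient place to attach things like help strings that another tool can read back through `fields()`.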
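Note: on the official documentation site, the transformers version can be changed on the left.
Then look up the description of the task parameter: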
task (str) — The task defining which pipeline will be returned. Currently accepted tasks are:
"audio-classification": will return a AudioClassificationPipeline.
"automatic-speech-recognition": will return a AutomaticSpeechRecognitionPipeline.
"conversational": will return a ConversationalPipeline.
"depth-estimation": will return a DepthEstimationPipeline.
"document-question-answering": will return a DocumentQuestionAnsweringPipeline.
"feature-extraction": will return a FeatureExtractionPipeline.
"fill-mask": will return a FillMaskPipeline.
"image-classification": will return a ImageClassificationPipeline.
"image-feature-extraction": will return an ImageFeatureExtractionPipeline.
"image-segmentation": will return a ImageSegmentationPipeline.
"image-to-image": will return a ImageToImagePipeline.
"image-to-text": will return a ImageToTextPipeline.
"mask-generation": will return a MaskGenerationPipeline.
"object-detection": will return a ObjectDetectionPipeline.
"question-answering": will return a QuestionAnsweringPipeline.
"summarization": will return a SummarizationPipeline.
"table-question-answering": will return a TableQuestionAnsweringPipeline.
"text2text-generation": will return a Text2TextGenerationPipeline.
"text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
"text-generation": will return a TextGenerationPipeline.
"text-to-audio" (alias "text-to-speech" available): will return a TextToAudioPipeline.
"token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
"translation": will return a TranslationPipeline.
"translation_xx_to_yy": will return a TranslationPipeline.
"video-classification": will return a VideoClassificationPipeline.
"visual-question-answering": will return a VisualQuestionAnsweringPipeline.
"zero-shot-classification": will return a ZeroShotClassificationPipeline.
"zero-shot-image-classification": will return a ZeroShotImageClassificationPipeline.
"zero-shot-audio-classification": will return a ZeroShotAudioClassificationPipeline.
"zero-shot-object-detection": will return a ZeroShotObjectDetectionPipeline.
Then click the SummarizationPipeline link that follows the task name to look up its usage:
from transformers import pipeline
# use bart in pytorch
summarizer = pipeline("summarization")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)
# use t5 in tf
summarizer = pipeline("summarization", model="google-t5/t5-base", tokenizer="google-t5/t5-base", framework="tf")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)
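model is an optional parameter. The default model sometimes only handles English tasks; in that case, find a suitable model on the Hugging Face website and pass it in as model.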
device sets the single device to run on:
device (int, optional, defaults to -1) — Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive will run the model on the associated CUDA device id. You can pass native torch.device or a str too.
device_map: note that it cannot be used together with device.
device_map (str or Dict[str, Union[int, str, torch.device], optional) — Sent directly as model_kwargs (just a simpler shortcut). When accelerate library is present, set device_map="auto" to compute the most optimized
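One way to keep the "device or device_map, never both" rule from slipping through is to assemble the arguments in a small helper before calling pipeline. The sketch below is only an illustration: `build_pipeline_kwargs` is a hypothetical function, not part of transformers, and the actual `pipeline(**kwargs)` call is left as a comment because it requires transformers to be installed and a model to be downloaded.

```python
from typing import Any, Dict, Optional, Union


def build_pipeline_kwargs(
    task: str,
    model: Optional[str] = None,
    device: Optional[Union[int, str]] = None,
    device_map: Optional[Union[str, Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for pipeline(), rejecting device + device_map."""
    if device is not None and device_map is not None:
        raise ValueError("pass either device or device_map, not both")
    kwargs: Dict[str, Any] = {"task": task}
    if model is not None:
        kwargs["model"] = model
    if device is not None:
        kwargs["device"] = device
    if device_map is not None:
        kwargs["device_map"] = device_map
    return kwargs


# Single GPU (CUDA device 0) with an explicitly chosen model.
single_gpu = build_pipeline_kwargs("summarization", model="google-t5/t5-base", device=0)
print(single_gpu)

# Let accelerate place the model automatically.
auto_placed = build_pipeline_kwargs("summarization", device_map="auto")
print(auto_placed)

# With transformers installed, the result would be passed on like this:
# from transformers import pipeline
# summarizer = pipeline(**single_gpu)
```

Leaving both arguments unset keeps the library's defaults, so the helper only adds keys that were explicitly requested.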
PreTrainedModel: the type of the model parameter
PretrainedConfig: the type of the config parameter
PreTrainedTokenizer: the type of the tokenizer parameter
Data Collator