Code error log: TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str

TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str

  • Background
  • Solution 1 (modify the output directly)
  • Full code
  • Solution 2 (modify the model directly)
  • Reference links

    Background

    This error came up while using a pre-trained model from Hugging Face for a text classification task.


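    Tracing the error showed that it is raised on the line that calls cls_layer(), that is, logits = self.cls_layer(self.dropout(pooler_output)).

    The error is a type mismatch: dropout received a str instead of a Tensor. The next step is therefore to check where pooler_output is produced and what type it has.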

    Solution 1 (modify the output directly)

    Locate the line that produces pooler_output.

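    pooler_output is meant to be the output vector for the [CLS] token from bert_layer. In transformers v4 and later, however, the model returns a dictionary-like object by default rather than a tuple. Unpacking that object with cont_reps, pooler_output = ... iterates over its keys, so pooler_output ends up as the string 'pooler_output' instead of a Tensor. The fix is to control the return type: pass return_dict=False so that the call returns a plain tuple.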
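
    The effect can be reproduced without loading a model. Below is a minimal sketch that uses an OrderedDict as a stand-in for the dictionary-like object the model returns; the values are made-up placeholders, not real model outputs.

```python
from collections import OrderedDict

# Stand-in for the dict-like model output (placeholder values, not real tensors).
outputs = OrderedDict(
    last_hidden_state=[[0.1, 0.2], [0.3, 0.4]],
    pooler_output=[0.5, 0.6],
)

# Tuple-unpacking a dict iterates over its KEYS, so both names receive strings.
cont_reps, pooler_output = outputs
print(repr(pooler_output))  # prints 'pooler_output'

# Looking the field up by key returns the actual value instead.
pooled = outputs["pooler_output"]
print(pooled)  # prints [0.5, 0.6]
```

    This is why dropout complains about receiving a str: the variable holds the key name, not the pooled vector.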

    Full code

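    import torch.nn as nn
    from torch.cuda.amp import autocast  # assumed source of autocast; the original does not show its import
    from transformers import AutoModel


    class SentencePairClassifier(nn.Module):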
        def __init__(self, bert_model="albert-base-v2", freeze_bert=False):
            super(SentencePairClassifier, self).__init__()
            #  Instantiating BERT-based model object
            self.bert_layer = AutoModel.from_pretrained(bert_model)
            
            #  Fix the hidden-state size of the encoder outputs (If you want to add other pre-trained models here, search for the encoder output size)
            if bert_model == "albert-base-v2":  # 12M parameters
                hidden_size = 768
            elif bert_model == "albert-large-v2":  # 18M parameters
                hidden_size = 1024
            elif bert_model == "albert-xlarge-v2":  # 60M parameters
                hidden_size = 2048
            elif bert_model == "albert-xxlarge-v2":  # 235M parameters
                hidden_size = 4096
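            elif bert_model == "bert-base-uncased": # 110M parameters
                hidden_size = 768
            else:
                # Fail early instead of hitting an undefined hidden_size below
                raise ValueError(f"Unsupported bert_model: {bert_model}")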
                
            # Freeze bert layers and only train the classification layer weights
            if freeze_bert:
                for p in self.bert_layer.parameters():
                    p.requires_grad = False
                    
            # Classification layer
            self.cls_layer = nn.Linear(hidden_size, 1)
            self.dropout = nn.Dropout(p=0.1)
            
            
        @autocast()  # run in mixed precision
        def forward(self, input_ids, attn_masks, token_type_ids):
            '''
            Inputs:
                -input_ids : Tensor  containing token ids
                -attn_masks : Tensor containing attention masks to be used to focus on non-padded values
                -token_type_ids : Tensor containing token type ids to be used to identify sentence1 and sentence2
            
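            Outputs of bert_layer:
                - last_hidden_state: hidden-state vectors of the last layer
                - pooler_output: output of the last layer (pooled [CLS] vector)
                - all_hidden_state: hidden-state vectors of all layers
            Note: the last 4 layers of all_hidden_state can be taken, averaged, and then concatenated onto the classifier input.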
            '''
            # Feeding the inputs to the BERT-based model to obtain contextualized representations
            # return_dict=False makes the model return a tuple, so the unpacking below yields Tensors
            cont_reps, pooler_output = self.bert_layer(input_ids, attn_masks, token_type_ids, return_dict=False)
            
            # Feeding to the classifier layer the last layer hidden-state of the [CLS] token further processed by a
            # Linear Layer and a Tanh activation. The Linear layer weights were trained from the sentence order prediction (ALBERT) or next sentence prediction (BERT)
            # objective during pre-training.
            logits = self.cls_layer(self.dropout(pooler_output))
            
            return logits
    

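    The idea: pass return_dict=False in the call to bert_layer so that it returns a plain tuple, which can then be unpacked into cont_reps and pooler_output.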

    Solution 2 (modify the model directly)
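
    A minimal sketch of one way to apply the fix on the model itself, assuming this means setting return_dict=False once when the model is constructed, so that every forward call returns a tuple. The tiny AlbertConfig sizes below are arbitrary values chosen so that the sketch runs without downloading weights; with a real checkpoint, the same keyword can be passed to AutoModel.from_pretrained.

```python
import torch
from transformers import AlbertConfig, AlbertModel

# Tiny, randomly initialised ALBERT (arbitrary sizes) so the sketch runs offline.
# With a real checkpoint the equivalent would be:
#   AutoModel.from_pretrained("albert-base-v2", return_dict=False)
config = AlbertConfig(
    vocab_size=100,
    embedding_size=16,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    return_dict=False,  # tuple output becomes the model's default
)
model = AlbertModel(config).eval()

# Dummy batch: 2 sequences of 8 token ids each.
input_ids = torch.randint(0, 100, (2, 8))
attn_masks = torch.ones_like(input_ids)
token_type_ids = torch.zeros_like(input_ids)

with torch.no_grad():
    # No return_dict argument is needed here; the config already disables it.
    cont_reps, pooler_output = model(
        input_ids=input_ids,
        attention_mask=attn_masks,
        token_type_ids=token_type_ids,
    )

print(type(pooler_output))  # a torch.Tensor, not a str
print(pooler_output.shape)  # (batch_size, hidden_size)
```

    With this setting the forward method of the classifier can keep the plain tuple unpacking and no longer needs return_dict=False at the call site.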

    Reference links

    https://stackoverflow.com/questions/65082243/dropout-argument-input-position-1-must-be-tensor-not-str-when-using-bert

    Source: 小王做笔记
