ValueError: Shapes (None, 27) and (None, 1) are incompatible
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from transformers import BertTokenizer, TFBertForSequenceClassification

# df: DataFrame with a text column `body` and a label column `target`
X = df.body
y = df.target
num_classes = len(y.unique())

# Tokenize all texts to fixed-length integer id arrays
tokenizer = BertTokenizer.from_pretrained('DeepPavlov/rubert-base-cased')
X_enc = tokenizer(X.tolist(), padding=True, truncation=True, max_length=100, return_tensors='np')

# Stratified 70/30 train/validation split
X_train, X_val, y_train, y_val = train_test_split(
    X_enc['input_ids'],
    y,
    random_state=42,
    stratify=y,
    test_size=0.3,
)
X_train = tf.convert_to_tensor(X_train)
X_val = tf.convert_to_tensor(X_val)
y_train = np.array(y_train)
y_val = np.array(y_val)

model = TFBertForSequenceClassification.from_pretrained(
    'DeepPavlov/rubert-base-cased', num_labels=num_classes, from_pt=True)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.AUC(multi_label=True)],
)
model.fit(X_train, y_train, epochs=3, batch_size=32)
The code fails on the last line with the ValueError.
I have rechecked the shapes a thousand times:
X_train shape: (39522, 100)
y_train shape: (39522,)
X_val shape: (16938, 100)
y_val shape: (16938,)
X_train type: <class 'tensorflow.python.framework.ops.EagerTensor'>
y_train type: <class 'numpy.ndarray'>
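A note on the shapes in the message: (None, 27) matches the model output (one logit per class) and (None, 1) matches the integer labels after Keras adds a trailing axis. SparseCategoricalCrossentropy accepts integer labels, so a likely suspect is the AUC(multi_label=True) metric, which compares labels and predictions column by column and therefore expects labels of shape (N, num_classes). The sketch below only illustrates that shape conversion with NumPy. The sample labels, the value 27, and the helper name `to_one_hot` are hypothetical stand-ins, and it assumes the labels are integer ids in the range 0..num_classes-1.

```python
import numpy as np

# Hypothetical stand-in for y_train: integer class ids, shape (N,)
y_train = np.array([3, 0, 26, 3, 11])
num_classes = 27  # assumed from the (None, 27) shape in the error message


def to_one_hot(labels, num_classes):
    """Convert integer labels of shape (N,) to a float32 array of shape (N, num_classes)."""
    labels = np.asarray(labels).reshape(-1)
    one_hot = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1.0 per row, in the column given by the class id
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot


y_train_oh = to_one_hot(y_train, num_classes)
print(y_train.shape)     # (5,)
print(y_train_oh.shape)  # (5, 27)
```

If the labels are passed to fit in this one-hot form, the loss would presumably need to change to tf.keras.losses.CategoricalCrossentropy(from_logits=True) as well. Another option, if AUC is not essential, is to keep the integer labels and use a metric that accepts them, such as tf.keras.metrics.SparseCategoricalAccuracy.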