Can apply be used to create and fill several columns at once?

Question

My DataFrame df has a column "text" that I want to split into three columns. Right now I do this with three separate apply calls:

def text_process1(text):
  index = text.find(", 20")+6
  begin = text[:index].split()
  source_name = " ".join(begin[:-3])
  return source_name

def text_process2(text):
  index = text.find(", 20")+6
  begin = text[:index].split()
  date = " ".join(begin[-3:])
  return date

def text_process3(text):
  index = text.find(", 20")+6
  new_text = text[index:]
  return new_text

df["source_name_tx"] = df["text"].apply(text_process1)
df["date"] = df["text"].apply(text_process2)
df["text"] = df["text"].apply(text_process3)

This code doesn't look very efficient, and I'd like to do everything with a single apply. I tried this:

def text_process(text):
  index = text.find(", 20")+6
  begin = text[:index].split()
  source_name = " ".join(begin[:-3])
  date = " ".join(begin[-3:])
  new_text = text[index:]
  return source_name, date, new_text

df[["source_name_tx", "date", "text"]]= df["text"].apply(text_process)

Running it raises an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-a3c56a97d75e> in <module>()
      7   return source_name, date, new_text
      8 
----> 9 df[["source_name_tx", "date", "text"]]= df["text"].apply(text_process)

2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/frame.py in _iset_not_inplace(self, key, value)
   3673         if self.columns.is_unique:
   3674             if np.shape(value)[-1] != len(key):
-> 3675                 raise ValueError("Columns must be same length as key")
   3676 
   3677             for i, col in enumerate(key):

ValueError: Columns must be same length as key

Is it possible to generate these three columns with a single apply after all?

Answer 1

Yes, it is. I don't have your data to test with, but in theory this should work:

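def text_process(text):
  # ", 20" matches the comma before the year; +6 skips past ", 20xx"
  index = text.find(", 20")+6
  begin = text[:index].split()
  source_name = " ".join(begin[:-3])
  date = " ".join(begin[-3:])
  new_text = text[index:]
  return source_name, date, new_text

# map returns a Series of 3-tuples; zip(*...) regroups them into three column-wise tuples
df["source_name_tx"], df["date"], df["text"] = zip(*df["text"].map(text_process))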
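
To try this without the real data, here is a minimal sketch on two made-up rows. It assumes each text looks like "<source> <Month> <day>, <year> <body>"; the names sample, cols, df_zip and df_expand are introduced only for illustration. It also shows why the original attempt failed: apply returns one Series of tuples rather than three columns, so the tuples have to be unpacked with zip or expanded into a DataFrame before assignment.

```python
import pandas as pd


def text_process(text):
    # ", 20" matches the comma before the year; +6 skips past ", 20xx"
    index = text.find(", 20") + 6
    begin = text[:index].split()
    source_name = " ".join(begin[:-3])  # everything before the date
    date = " ".join(begin[-3:])         # month, day, year
    new_text = text[index:]             # remainder after the year
    return source_name, date, new_text


# Made-up sample rows (assumed format, not the asker's real data)
sample = pd.DataFrame({
    "text": [
        "Daily News March 5, 2021 Markets rallied today.",
        "The Tech Times April 12, 2020 A new phone was released.",
    ]
})
cols = ["source_name_tx", "date", "text"]

# Variant 1: unpack the tuples column-wise with zip
df_zip = sample.copy()
df_zip["source_name_tx"], df_zip["date"], df_zip["text"] = zip(
    *df_zip["text"].map(text_process)
)

# Variant 2: expand the Series of tuples into a 3-column DataFrame first
df_expand = sample.copy()
tuples = df_expand["text"].apply(text_process)  # Series of 3-tuples
df_expand[cols] = pd.DataFrame(
    tuples.tolist(), index=df_expand.index, columns=cols
)

print(df_zip)
```

Both variants should give the same three columns. Two things to watch for: the remaining text keeps its leading space (add .strip() if that is unwanted), and find returns -1 when ", 20" is missing, so for such rows the split point silently becomes 5.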
