How do I remove extra characters when parsing?

How do I remove the extra characters when parsing? I just want clean output, without the parsing time being shown.

Code:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent
url = 'https://books.toscrape.com/'
headers = {"User-Agent": UserAgent().random}
# Pass headers by keyword: the second positional argument of requests.get is params
full_page = requests.get(url, headers=headers)
soup = BeautifulSoup(full_page.text, 'html.parser')
titles = soup.find_all('a')
for title in titles:
    print(title.text)

Output:

# Books to Scrape
# Home


#                                 Books




#                                 Travel




#                                 Mystery




#                                 Historical Fiction




#                                 Sequential Art




#                                 Classics




#                                 Philosophy




#                                 Romance




#                                 Womens Fiction




#                                 Fiction




#                                 Childrens




#                                 Religion




#                                 Nonfiction




#                                 Music




#                                 Default




#                                 Science Fiction




#                                 Sports and Games




#                                 Add a comment




#                                 Fantasy




#                                 New Adult




#                                 Young Adult




#                                 Science




#                                 Poetry




#                                 Paranormal




#                                 Art




#                                 Psychology




#                                 Autobiography




#                                 Parenting




#                                 Adult Fiction




#                                 Humor




#                                 Horror




#                                 History




#                                 Food and Drink




#                                 Christian Fiction




#                                 Business




#                                 Biography




#                                 Thriller




#                                 Contemporary




#                                 Spirituality




#                                 Academic




#                                 Self Help




#                                 Historical




#                                 Christian




#                                 Suspense




#                                 Short Stories




#                                 Novels




#                                 Health




#                                 Politics




#                                 Cultural




#                                 Erotica




#                                 Crime



# A Light in the ...

# Tipping the Velvet

# Soumission

# Sharp Objects

# Sapiens: A Brief History ...

# The Requiem Red

# The Dirty Little Secrets ...

# The Coming Woman: A ...

# The Boys in the ...

# The Black Maria

# Starving Hearts (Triangular Trade ...

# Shakespeare's Sonnets

# Set Me Free

# Scott Pilgrim's Precious Little ...

# Rip it Up and ...

# Our Band Could Be ...

# Olio

# Mesaerion: The Best Science ...

# Libertarianism for Beginners

# It's Only the Himalayas
# next
# [Finished in 5.6s]

Answers (1):

Answer by: CrazyElf

Well, for example, like this:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

url = 'https://books.toscrape.com/'
headers = {"User-Agent": UserAgent().random}
# Pass headers by keyword: the second positional argument of requests.get is params
full_page = requests.get(url, headers=headers)
soup = BeautifulSoup(full_page.content, 'html.parser')
titles = soup.find_all('a')
for title in titles:
    if 'title' in title.attrs:
        print(title.attrs['title'])

Output:

A Light in the Attic
Tipping the Velvet
Soumission
Sharp Objects
Sapiens: A Brief History of Humankind
The Requiem Red
The Dirty Little Secrets of Getting Your Dream Job
The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull
The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics
The Black Maria
Starving Hearts (Triangular Trade Trilogy, #1)
Shakespeare's Sonnets
Set Me Free
Scott Pilgrim's Precious Little Life (Scott Pilgrim #1)
Rip it Up and Start Again
Our Band Could Be Your Life: Scenes from the American Indie Underground, 1981-1991
Olio
Mesaerion: The Best Science Fiction Stories 1800-1849
Libertarianism for Beginners
It's Only the Himalayas

I simply looked at what the title variable holds inside the loop and searched it for the attribute I needed. It contains something like this:

<a href="catalogue/a-light-in-the-attic_1000/index.html" title="A Light in the Attic">A Light in the ...</a>
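To make that inspection step concrete, here is a minimal sketch that parses the tag above from an inline string, so it runs without touching the network. The variable names html and tag are mine, not part of the original code.

```python
from bs4 import BeautifulSoup

# Inline copy of the tag shown above, so the sketch needs no network access.
html = (
    '<a href="catalogue/a-light-in-the-attic_1000/index.html" '
    'title="A Light in the Attic">A Light in the ...</a>'
)

tag = BeautifulSoup(html, "html.parser").a

print(tag.attrs)         # dict with every attribute of the tag (href, title)
print(tag.get_text())    # the visible link text, which the site truncates
print(tag.get("title"))  # the full title; .get() returns None if it is missing
```

Using tag.get("title") instead of tag.attrs["title"] avoids a KeyError on links that have no title attribute, such as the category links in the sidebar.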

In fact, BS should offer an even simpler way to do this; you just need to read the documentation.
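One possibly simpler route, sketched here under assumptions: BeautifulSoup can filter on the presence of an attribute directly, either with find_all("a", title=True) or with the CSS selector a[title]. The inline html below is a made-up stand-in for the real page (the href="#" values are placeholders), and get_text(strip=True) is shown as one way to trim the surrounding whitespace that produced the ragged output in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: one sidebar link padded with
# whitespace and two book links that carry a title attribute.
html = """
<ul>
  <li><a href="#">
        Travel
  </a></li>
</ul>
<h3><a href="#" title="A Light in the Attic">A Light in the ...</a></h3>
<h3><a href="#" title="Tipping the Velvet">Tipping the Velvet</a></h3>
"""

soup = BeautifulSoup(html, "html.parser")

# title=True keeps only the <a> tags that have a title attribute at all.
titles = [a["title"] for a in soup.find_all("a", title=True)]

# The same filter written as a CSS attribute selector.
titles_css = [a["title"] for a in soup.select("a[title]")]

# strip=True removes leading and trailing whitespace from the link text.
link_texts = [a.get_text(strip=True) for a in soup.find_all("a")]

for title in titles:
    print(title)
```

Either filter replaces the explicit check on title.attrs inside the loop. Whether the link text or the title attribute is the better source depends on the page: on this site the link text is truncated, so the attribute holds the full name.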
