Images downloaded with requests open as 'Invalid image'
I wrote a parser that downloads certain images from a site, but when I open a downloaded image I get the message 'Invalid image'. I used to fix this by switching to a different User-Agent, but that no longer helps. The parser code:
from selenium import webdriver
from selenium.webdriver.edge.service import Service
from selenium.webdriver.edge.options import Options
from selenium.webdriver.common.by import By
import requests
import os
import time

number = input(">>> ")
title_name = "Sweet Home"
if not os.path.exists(title_name):
    os.mkdir(title_name)
if not os.path.exists(f"{title_name}/chapter {number}"):
    os.mkdir(f"{title_name}/chapter {number}")

service = Service("C:/edgedriver_win64/msedgedriver.exe")
driver = webdriver.Edge(service=service)
driver.get(f"https://mangaweebs.in/manga/sweet-home/chapter-{number}/")
time.sleep(10)

images = driver.find_elements(By.CLASS_NAME, 'wp-manga-chapter-img')
for image in images:
    image_src = image.get_attribute("src")
    print(image_src)
    image_num = image.get_attribute("id").split('-')
    print(image_num)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
        "Referer": "https://www.google.com/"
    }
    response = requests.get(image_src, stream=True, headers=headers)
    with open(f"{title_name}/chapter {number}/{number}_{image_num[1]}.jpg", 'wb') as file:
        file.write(response.content)
The parser itself also prints this error:
[1792:1984:1027/200955.100:ERROR:fallback_task_provider.cc(124)] Every renderer should have at least one task provided by a primary task provider. If a "Renderer" fallback task is shown, it is a bug. If you have repro steps, please file a new bug and tag it as a dependency of crbug.com/739782.
Answers (1):
Answer by: Сергей Ш
import os
from io import BytesIO

import requests
from bs4 import BeautifulSoup
from PIL import Image

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64;'
                         ' rv:109.0) Gecko/20100101 Firefox/119.0'}
number = 33
title_name = "Sweet Home"
path_number = f"{title_name}/chapter {number}"

# Create the title and chapter folders if they do not exist yet.
if not os.path.exists(title_name):
    os.mkdir(title_name)
if not os.path.exists(path_number):
    os.mkdir(path_number)

# One session so every request sends the same headers.
sess = requests.Session()
sess.headers.update(headers)

url = f'https://mangaweebs.in/manga/sweet-home/chapter-{number}'
response = sess.get(url)
if not response.ok:
    print(f"{response.status_code}", url)
    raise SystemExit(1)

# Collect the image URLs straight from the chapter page HTML.
soup = BeautifulSoup(response.text, 'lxml')
img = (x['src'].strip() for x in soup.find_all(class_="wp-manga-chapter-img"))

for index, url_img in enumerate(img, 1):
    response = sess.get(url_img)
    if not response.ok:
        # Report the image URL that failed, not the chapter URL.
        print(f"{response.status_code}", url_img)
        continue
    # Decode the downloaded bytes with Pillow and re-encode them as JPEG.
    # JPEG has no alpha channel, so convert to RGB before saving.
    with Image.open(BytesIO(response.content)) as webp_image:
        webp_image.convert("RGB").save(
            f"{path_number}/{number}_{index:03d}.jpg", "JPEG")
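The answer above comes without an explanation. One likely reading, suggested by the variable name `webp_image`, is that the server returns WebP data, so writing the raw bytes into a file named `.jpg` produces something viewers refuse to open. This is an assumption, not something the thread confirms; the response could also be an HTML error page. A minimal sketch for checking what was actually downloaded, where `sniff_image_format` is a hypothetical helper and not part of requests or Pillow:

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the file format from the leading magic bytes.

    Hypothetical helper for illustration; it only knows a few
    common image signatures and returns "unknown" otherwise.
    """
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "unknown"
```

Calling `sniff_image_format(response.content)` before saving shows whether the bytes match the `.jpg` extension. If it returns "webp", re-encoding with Pillow as in the answer is one way to get a real JPEG; if it returns "unknown", it may be worth printing `response.status_code` and the `Content-Type` header to see what the server sent.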