Hint
Not sure where to start?
Break this problem into clear steps:
Fetch the HTML content of the page using the requests library:
import requests
response = requests.get("https://books.toscrape.com/")
html = response.text
Parse the HTML to find all image tags with BeautifulSoup:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "html.parser")
img_tags = soup.find_all("img")
Each <img> tag has a src attribute. You’ll need to turn it into a full URL, because it might be relative:
from urllib.parse import urljoin
img_src = img_tags[0]["src"]  # src of one tag, here the first image found
full_url = urljoin("https://books.toscrape.com/", img_src)
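To see why urljoin is the right tool here, the short sketch below resolves three kinds of src values against a base URL. The image paths and the example.com address are made up for illustration and are not taken from the real page.

```python
from urllib.parse import urljoin

base = "https://books.toscrape.com/"

# A relative src is resolved against the base URL.
relative = urljoin(base, "media/cache/example.jpg")

# A src that is already absolute is returned unchanged.
absolute = urljoin(base, "https://example.com/images/cover.jpg")

# "../" segments are resolved relative to the page the tag came from.
nested = urljoin(
    "https://books.toscrape.com/catalogue/page-2.html",
    "../media/cache/example.jpg",
)

print(relative)  # https://books.toscrape.com/media/cache/example.jpg
print(absolute)  # https://example.com/images/cover.jpg
print(nested)    # https://books.toscrape.com/media/cache/example.jpg
```

Because urljoin handles all three cases, you can pass every src through it without checking first whether it is relative.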
Finally, download the image with requests.get and save it as a file using open in binary mode:
img_data = requests.get(full_url).content
with open("filename.jpg", "wb") as f:
    f.write(img_data)
Loop through all image tags and repeat!
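One possible way to arrange that loop is sketched below. It is not necessarily how the official solution does it. The names filename_for and download_images, the images output folder, and the 10-second timeout are all assumptions chosen for this example.

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://books.toscrape.com/"


def filename_for(url, index):
    """Hypothetical helper: take the file name from the URL path."""
    name = os.path.basename(urlparse(url).path)
    # Fall back to a numbered name if the path has no file name.
    return name or f"image_{index}.jpg"


def download_images(base_url=BASE_URL, out_dir="images"):
    """Download every <img> on the page into out_dir (assumed folder name)."""
    os.makedirs(out_dir, exist_ok=True)

    response = requests.get(base_url, timeout=10)
    response.raise_for_status()  # stop early on an HTTP error
    soup = BeautifulSoup(response.text, "html.parser")

    for index, img in enumerate(soup.find_all("img")):
        src = img.get("src")
        if not src:
            continue  # skip tags without a src attribute
        full_url = urljoin(base_url, src)

        img_response = requests.get(full_url, timeout=10)
        img_response.raise_for_status()

        path = os.path.join(out_dir, filename_for(full_url, index))
        with open(path, "wb") as f:
            f.write(img_response.content)
```

Calling download_images() runs the whole pipeline. Note that two images sharing the same file name would overwrite each other in this sketch; prefixing the index to the name is one way around that.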
Click Show Solution if you’d like to see the full working code tied together.