Python Web Scraper Tutorial - Scrape Any Website in 10 Lines

Posted on 2026-3-22 17:04:46 | Beijing
Learn Web Scraping with Python - Beginner Friendly Guide

What You Need:
- Python 3.6+
- requests library: pip install requests
- beautifulsoup4: pip install beautifulsoup4

Setup:
pip install requests beautifulsoup4

Example 1: Scrape a webpage title and links

import requests
from bs4 import BeautifulSoup

url = "https://example.com"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# <title> may be missing on some pages, so soup.title can be None
print("Title:", soup.title.string if soup.title else "(no title)")

# print the href of every <a> tag (None if the tag has no href attribute)
for link in soup.find_all("a"):
    print(link.get("href"))

Example 2: Scrape table data

import requests
from bs4 import BeautifulSoup

url = "https://example.com/table"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find("table")  # first <table> on the page, or None if absent
if table is not None:
    for row in table.find_all("tr"):
        cells = row.find_all("td")  # header rows use <th>, so they yield []
        data = [cell.text.strip() for cell in cells]
        print(data)

Example 3: Save data to CSV

import csv
import requests
from bs4 import BeautifulSoup

url = "https://example.com/data"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

items = soup.find_all("div", class_="item")
results = []
for item in items:
    name = item.find("h3")
    price = item.find("span")
    if name and price:  # skip items missing either tag to avoid AttributeError
        results.append({"name": name.text.strip(), "price": price.text.strip()})

with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(results)

print("Data saved to output.csv")

Useful Tips:
1. Always add headers: requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
2. Handle errors: use try/except for network issues
3. Add delays: import time; time.sleep(2) between requests
4. Respect robots.txt: check target website rules
5. Use sessions for multiple requests to same site
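The tips above can be combined into one small fetch helper. This is a minimal sketch, not part of the tutorial's original code; the function name `polite_get` and the default 2-second delay are my own choices:

```python
import time
import requests

def polite_get(session, url, delay=2.0):
    """Fetch a URL with a browser-like User-Agent, basic error
    handling, and a pause afterwards (tips 1, 2, 3 and 5)."""
    headers = {"User-Agent": "Mozilla/5.0"}  # tip 1: always send headers
    try:
        response = session.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # raise on 4xx/5xx responses
    except requests.RequestException as exc:  # tip 2: catch network errors
        print("Request failed:", exc)
        return None
    time.sleep(delay)  # tip 3: be polite between requests
    return response

session = requests.Session()  # tip 5: reuse the connection for one site
page = polite_get(session, "https://example.com", delay=0)
if page is not None:
    print("Fetched", len(page.text), "characters")
```

Passing the same `Session` to every `polite_get` call keeps the TCP connection and cookies alive across requests to the same site.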

Common HTTP Status Codes:
- 200: OK, success
- 403: Forbidden, need headers
- 404: Page not found
- 429: Too many requests, slow down
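After any request you can branch on `response.status_code` to decide what to do next. A small sketch; the helper `describe_status` is hypothetical, not a requests API:

```python
def describe_status(code):
    """Map the common scraping status codes above to a suggested action."""
    messages = {
        200: "OK - parse the page",
        403: "Forbidden - add browser-like headers",
        404: "Not found - check the URL",
        429: "Too many requests - slow down and retry later",
    }
    return messages.get(code, "Unexpected status: %d" % code)

# typical usage after a request:
# response = requests.get(url)
# print(describe_status(response.status_code))
print(describe_status(200))
print(describe_status(429))
```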

That's it! Web scraping is powerful; use it responsibly. Happy coding!