Gemini CLI · Python + BeautifulSoup
Build a Web Scraper with Gemini CLI
Prerequisites
- ✓ Gemini CLI installed (npm install -g @google/gemini-cli)
- ✓ Python installed
- ✓ Git installed
- ✓ Qmmit installed
- ✓ A Qmmit account with a CLI token
1
Create the project
Set up a Python project.
mkdir web-scraper
cd web-scraper
git init
pip install requests beautifulsoup4
echo "# Web Scraper" > README.md
git add . && git commit -m "initial commit"
2
Set up Qmmit
Initialize tracking.
qmmit init
⚡
What Qmmit does here
Detects Gemini CLI via the ~/.gemini/tmp/ directory and installs its hooks.
3
Launch Gemini and build the scraper
Run Gemini CLI and ask it to build a scraper.
gemini

# In Gemini, type:
# "Create a Python web scraper that:
# 1. Takes a URL as argument
# 2. Fetches the page with requests
# 3. Extracts all links, headings, and paragraphs
# 4. Saves results to a JSON file
# Use BeautifulSoup for parsing."
💡
Gemini CLI saves sessions at ~/.gemini/tmp/<project-hash>/chats/session-*.json. Qmmit reads these automatically.
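The exact code Gemini generates will vary from run to run. As a reference point, a minimal scraper matching that prompt might look like this (function and file names are illustrative, not what Gemini will necessarily produce):

```python
import json
import sys

import requests
from bs4 import BeautifulSoup


def parse_page(html):
    """Extract links, headings, and paragraphs from an HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "links": [a["href"] for a in soup.find_all("a", href=True)],
        "headings": [h.get_text(strip=True)
                     for h in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])],
        "paragraphs": [p.get_text(strip=True) for p in soup.find_all("p")],
    }


def main():
    url = sys.argv[1]
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    with open("output.json", "w", encoding="utf-8") as f:
        json.dump(parse_page(resp.text), f, indent=2)


if __name__ == "__main__":
    main()
```

Keeping the parsing in a pure function like parse_page makes the scraper easy to test without hitting the network.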
4
Ask Gemini to add rate limiting
Add polite scraping.
# In Gemini:
# "Add rate limiting: 1 request per second, retry on 429,
# respect robots.txt, add a User-Agent header."
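Again, what Gemini writes will differ, but the two core pieces of polite scraping can be sketched with the standard library alone (class and function names here are hypothetical):

```python
import time
import urllib.robotparser
from urllib.parse import urlsplit


class RateLimiter:
    """Enforce a minimum interval between requests."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        # Sleep just long enough so calls are at least min_interval apart.
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()


def allowed_by_robots(url, user_agent="my-scraper/0.1"):
    """Best-effort robots.txt check for a URL."""
    parts = urlsplit(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        rp.read()
    except OSError:
        return True  # robots.txt unreachable: default to allowed
    return rp.can_fetch(user_agent, url)
```

The remaining requirements are small additions on the request itself: pass headers={"User-Agent": "my-scraper/0.1"} to requests.get, and wrap the call in a loop that sleeps and retries when the response status is 429.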
5
Test it
Run the scraper.
python scraper.py https://example.com
cat output.json
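Assuming the scraper writes top-level links, headings, and paragraphs keys as prompted (check what Gemini actually generated), a quick sanity check on the result might look like this:

```python
import json

# Hypothetical output.json contents for https://example.com;
# the real values depend on the page and the generated scraper.
sample = {
    "links": ["https://www.iana.org/domains/example"],
    "headings": ["Example Domain"],
    "paragraphs": ["This domain is for use in illustrative examples."],
}


def summarize(data):
    """One-line summary of a scrape result dict."""
    return ", ".join(f"{len(data[k])} {k}"
                     for k in ("links", "headings", "paragraphs"))


print(summarize(json.loads(json.dumps(sample))))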
6
Commit
Commit everything.
git add .
git commit -m "feat: web scraper with rate limiting"
# [qmmit] 2 prompt(s) tracked (gemini-cli) → pqr1234
7
Push and verify
Push, then check the Qmmit dashboard.
git push -u origin main