Gemini CLI · Python + BeautifulSoup
Build a Web Scraper with Gemini CLI
Prerequisites
- ✓ Gemini CLI installed (npm install -g @google/gemini-cli)
- ✓ Python installed
- ✓ Git installed
- ✓ Qmmit installed
- ✓ A Qmmit account with a CLI token
1
Create the project
Set up a Python project.
mkdir web-scraper
cd web-scraper
git init
pip install requests beautifulsoup4
echo "# Web Scraper" > README.md
git add . && git commit -m "initial commit"
2
Set up Qmmit
Initialize tracking.
qmmit init
⚡
What Qmmit does here
Detects Gemini CLI via the ~/.gemini/tmp/ directory and installs its hooks.
3
Launch Gemini and build the scraper
Run Gemini CLI and ask it to build a scraper.
gemini

# In Gemini, type:
# "Create a Python web scraper that:
# 1. Takes a URL as argument
# 2. Fetches the page with requests
# 3. Extracts all links, headings, and paragraphs
# 4. Saves results to a JSON file
# Use BeautifulSoup for parsing."
💡
Gemini CLI saves sessions at ~/.gemini/tmp/<project-hash>/chats/session-*.json. Qmmit reads these automatically.
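The exact code Gemini generates will vary from run to run. As a reference point, a minimal scraper matching that prompt might look like this (function and file names are illustrative, not what Gemini will necessarily produce):

```python
import json
import sys

import requests
from bs4 import BeautifulSoup


def parse_page(html):
    """Extract links, headings, and paragraphs from an HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "links": [a["href"] for a in soup.find_all("a", href=True)],
        "headings": [h.get_text(strip=True)
                     for h in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])],
        "paragraphs": [p.get_text(strip=True) for p in soup.find_all("p")],
    }


def main():
    url = sys.argv[1]
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    with open("output.json", "w", encoding="utf-8") as f:
        json.dump(parse_page(resp.text), f, indent=2)


if __name__ == "__main__":
    main()
```

Keeping the parsing in a pure function like parse_page makes the scraper easy to test without hitting the network.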
4
Ask Gemini to add rate limiting
Add polite scraping.
# In Gemini:
# "Add rate limiting: 1 request per second, retry on 429,
# respect robots.txt, add a User-Agent header."
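Again, what Gemini writes will differ, but the two core pieces of polite scraping can be sketched with the standard library alone (class and function names here are hypothetical):

```python
import time
import urllib.robotparser
from urllib.parse import urlsplit


class RateLimiter:
    """Enforce a minimum interval between requests."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        # Sleep just long enough so calls are at least min_interval apart.
        elapsed = time.monotonic() - self._last
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self._last = time.monotonic()


def allowed_by_robots(url, user_agent="my-scraper/0.1"):
    """Best-effort robots.txt check for a URL."""
    parts = urlsplit(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    try:
        rp.read()
    except OSError:
        return True  # robots.txt unreachable: default to allowed
    return rp.can_fetch(user_agent, url)
```

The remaining requirements are small additions on the request itself: pass headers={"User-Agent": "my-scraper/0.1"} to requests.get, and wrap the call in a loop that sleeps and retries when the response status is 429.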
5
Test it
Run the scraper.
python scraper.py https://example.com
cat output.json
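Assuming the scraper writes top-level links, headings, and paragraphs keys as prompted (check what Gemini actually generated), a quick sanity check on the result might look like this:

```python
import json

# Hypothetical output.json contents for https://example.com;
# the real values depend on the page and the generated scraper.
sample = {
    "links": ["https://www.iana.org/domains/example"],
    "headings": ["Example Domain"],
    "paragraphs": ["This domain is for use in illustrative examples."],
}


def summarize(data):
    """One-line summary of a scrape result dict."""
    return ", ".join(f"{len(data[k])} {k}"
                     for k in ("links", "headings", "paragraphs"))


print(summarize(json.loads(json.dumps(sample))))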
6
Commit
Commit everything.
git add .
git commit -m "feat: web scraper with rate limiting"
# [qmmit] 2 prompt(s) tracked (gemini-cli) → pqr1234
7
Push and verify
Push, then check the Qmmit dashboard.
git push -u origin main