158 lines
4.0 KiB
Markdown
158 lines
4.0 KiB
Markdown
# AIEA Local Chrome Screenshot + OpenAI Assistant
|
|
|
|
Local-only (unpacked) Chrome extension + Python bridge that:
|
|
|
|
1. Captures the visible tab screenshot.
|
|
2. Extracts visible page text hierarchy (reduced HTML tree).
|
|
3. Saves files to `./screenshots`.
|
|
4. Calls OpenAI with:
|
|
- screenshot image
|
|
- extracted page content
|
|
- system instructions from `AI_EA_INSTRUCTIONS.MD`
|
|
- optional extra instructions from the extension UI
|
|
5. Returns short + medium response suggestions per visible post back to the extension popup.
|
|
|
|
No Chrome Web Store publishing is required.
|
|
|
|
## Project Structure
|
|
|
|
- `chrome_screenshot_ext/` unpacked MV3 extension (popup UI + capture + extraction)
|
|
- `tools/local_screenshot_bridge.py` local HTTP server (`127.0.0.1`) + OpenAI call
|
|
- `AI_EA_INSTRUCTIONS.MD` system instructions fed to the model
|
|
- `screenshots/` generated outputs (`.png`, `.json`, `.content.json`, `.ai.json`)
|
|
|
|
## Prerequisites
|
|
|
|
- macOS/Linux shell
|
|
- Python 3.9+
|
|
- Google Chrome
|
|
- OpenAI API key
|
|
|
|
## Setup (venv)
|
|
|
|
From repo root:
|
|
|
|
```bash
|
|
python3 -m venv .venv
|
|
source .venv/bin/activate
|
|
pip install -U pip
|
|
pip install -r requirements.txt
|
|
```
|
|
|
|
Set API key with either:
|
|
|
|
1. Environment variable
|
|
```bash
|
|
export OPENAI_API_KEY="your_api_key_here"
|
|
```
|
|
|
|
2. `.env` file in project root
|
|
```bash
|
|
OPENAI_API_KEY=your_api_key_here
|
|
```
|
|
|
|
## Run the Local Server
|
|
|
|
With OpenAI enabled:
|
|
|
|
```bash
|
|
python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots --ai
|
|
```
|
|
|
|
Without OpenAI (save files only):
|
|
|
|
```bash
|
|
python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots
|
|
```
|
|
|
|
More logging (includes AI request/response ids + parse details):
|
|
|
|
```bash
|
|
python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots --ai --log-level debug
|
|
|
|
python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots --ai --log-level debug --ai-max-output-tokens 2500
|
|
```
|
|
|
|
Health check:
|
|
|
|
```bash
|
|
curl http://127.0.0.1:8765/health
|
|
```
|
|
|
|
## Load the Chrome Extension (Unpacked)
|
|
|
|
1. Open `chrome://extensions`
|
|
2. Enable Developer mode
|
|
3. Click `Load unpacked`
|
|
4. Select `chrome_screenshot_ext/`
|
|
|
|
If you change extension code, click `Reload` on this extension page.
|
|
|
|
## Use the Extension
|
|
|
|
1. Open any normal webpage (not `chrome://` pages).
|
|
2. Click extension icon.
|
|
3. Keep endpoint: `http://127.0.0.1:8765/screenshot`
|
|
4. Optional: add `Extra instructions` (2-line input in popup).
|
|
5. Click `Capture`.
|
|
|
|
Popup shows:
|
|
|
|
- Saved file paths
|
|
- AI suggestions per detected post (6 total):
|
|
- improved short + medium
|
|
- critical short + medium
|
|
- suggested short + medium
|
|
- Any AI error details
|
|
|
|
History in popup:
|
|
|
|
- `History` dropdown stores recent generations locally (`chrome.storage.local`)
|
|
- `Show` displays a previous result
|
|
- `Clear` removes local popup history
|
|
|
|
## Output Files
|
|
|
|
Generated in `screenshots/`:
|
|
|
|
- `YYYYMMDDTHHMMSSZ-<title-slug>.png` screenshot
|
|
- `YYYYMMDDTHHMMSSZ-<title-slug>.json` metadata
|
|
- `YYYYMMDDTHHMMSSZ-<title-slug>.content.json` extracted visible content tree
|
|
- `YYYYMMDDTHHMMSSZ-<title-slug>.ai.json` structured AI suggestions
|
|
|
|
## AI Behavior Notes
|
|
|
|
- System instructions source: `AI_EA_INSTRUCTIONS.MD`
|
|
- Extra per-capture instructions source: popup `Extra instructions`
|
|
- AI output is sanitized post-generation:
|
|
- quote/dash/space cleanup
|
|
- invisible unicode cleanup
|
|
- collapsed extra spaces/newlines
|
|
- forced lowercase (no capital letters)
|
|
|
|
## Useful CLI Options
|
|
|
|
```bash
|
|
python3 tools/local_screenshot_bridge.py --help
|
|
```
|
|
|
|
Common options:
|
|
|
|
- `--ai-model` (default: `gpt-5.2`)
|
|
- `--ai-max-posts` (default: `12`)
|
|
- `--ai-content-max-chars` (default: `120000`)
|
|
- `--ai-image-detail` (default: `auto`)
|
|
- `--ai-max-output-tokens` (default: `1400`)
|
|
- `--run ...` optional post-save hook command
|
|
|
|
## Troubleshooting
|
|
|
|
- `missing_openai_api_key`:
|
|
set `OPENAI_API_KEY` in shell or `.env`, then restart server.
|
|
- `missing_openai_sdk`:
|
|
run `pip install -r requirements.txt` inside your venv.
|
|
- Capture fails on Chrome internal pages:
|
|
`chrome://*`, Web Store, and some protected tabs cannot be scripted.
|
|
- No AI results in popup:
|
|
verify server started with `--ai`.
|