add readme

2026-02-11 13:58:13 +01:00 · 2026-02-11 13:58:13 +01:00 · 0f7e69c43e
commit 0f7e69c43e
parent a5ab3640ec
6 changed files with 181 additions and 7 deletions
--- a/LOCAL_SCREENSHOT_EXTENSION.md
+++ b/LOCAL_SCREENSHOT_EXTENSION.md
@ -40,8 +40,9 @@ Notes:
 1. Click the extension icon.
 2. Confirm the endpoint is `http://127.0.0.1:8765/screenshot`.
-3. Click "Capture".
+3. Optionally add "Extra instructions" (saved locally in the extension).
-4. Use the "History" dropdown + "Show" to view older AI suggestions.
+4. Click "Capture".
 5. Use the "History" dropdown + "Show" to view older AI suggestions.
 Saved files land in `screenshots/`:
--- a/README.md
+++ b/README.md
@ -0,0 +1,149 @@
 # AIEA Local Chrome Screenshot + OpenAI Assistant
 Local-only (unpacked) Chrome extension + Python bridge that:
 1. Captures the visible tab screenshot.
 2. Extracts visible page text hierarchy (reduced HTML tree).
 3. Saves files to `./screenshots`.
 4. Calls OpenAI with:
   - screenshot image
   - extracted page content
   - system instructions from `AI_EA_INSTRUCTIONS.MD`
   - optional extra instructions from the extension UI
 5. Returns short + medium response suggestions per visible post back to the extension popup.
 No Chrome Web Store publishing is required.
 ## Project Structure
 - `chrome_screenshot_ext/` unpacked MV3 extension (popup UI + capture + extraction)
 - `tools/local_screenshot_bridge.py` local HTTP server (`127.0.0.1`) + OpenAI call
 - `AI_EA_INSTRUCTIONS.MD` system instructions fed to the model
 - `screenshots/` generated outputs (`.png`, `.json`, `.content.json`, `.ai.json`)
 ## Prerequisites
 - macOS/Linux shell
 - Python 3.9+
 - Google Chrome
 - OpenAI API key
 ## Setup (venv)
 From repo root:
 ```bash
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -U pip
 pip install -r requirements.txt
 ```
 Set API key with either:
 1. Environment variable
 ```bash
 export OPENAI_API_KEY="your_api_key_here"
 ```
 2. `.env` file in project root
 ```bash
 OPENAI_API_KEY=your_api_key_here
 ```
 ## Run the Local Server
 With OpenAI enabled:
 ```bash
 python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots --ai
 ```
 Without OpenAI (save files only):
 ```bash
 python3 tools/local_screenshot_bridge.py --port 8765 --out-dir screenshots
 ```
 Health check:
 ```bash
 curl http://127.0.0.1:8765/health
 ```
 ## Load the Chrome Extension (Unpacked)
 1. Open `chrome://extensions`
 2. Enable Developer mode
 3. Click `Load unpacked`
 4. Select `chrome_screenshot_ext/`
 If you change extension code, click `Reload` on this extension page.
 ## Use the Extension
 1. Open any normal webpage (not `chrome://` pages).
 2. Click extension icon.
 3. Keep endpoint: `http://127.0.0.1:8765/screenshot`
 4. Optional: add `Extra instructions` (2-line input in popup).
 5. Click `Capture`.
 Popup shows:
 - Saved file paths
 - AI suggestions per detected post:
  - short response
  - medium response
 - Any AI error details
 History in popup:
 - `History` dropdown stores recent generations locally (`chrome.storage.local`)
 - `Show` displays a previous result
 - `Clear` removes local popup history
 ## Output Files
 Generated in `screenshots/`:
 - `YYYYMMDDTHHMMSSZ-<title-slug>.png` screenshot
 - `YYYYMMDDTHHMMSSZ-<title-slug>.json` metadata
 - `YYYYMMDDTHHMMSSZ-<title-slug>.content.json` extracted visible content tree
 - `YYYYMMDDTHHMMSSZ-<title-slug>.ai.json` structured AI suggestions
 ## AI Behavior Notes
 - System instructions source: `AI_EA_INSTRUCTIONS.MD`
 - Extra per-capture instructions source: popup `Extra instructions`
 - AI output is sanitized post-generation:
  - quote/dash/space cleanup
  - invisible unicode cleanup
  - collapsed extra spaces/newlines
  - forced lowercase (no capital letters)
 ## Useful CLI Options
 ```bash
 python3 tools/local_screenshot_bridge.py --help
 ```
 Common options:
 - `--ai-model` (default: `gpt-5.2`)
 - `--ai-max-posts` (default: `12`)
 - `--ai-content-max-chars` (default: `120000`)
 - `--ai-image-detail` (default: `auto`)
 - `--ai-max-output-tokens` (default: `1400`)
 - `--run ...` optional post-save hook command
 ## Troubleshooting
 - `missing_openai_api_key`:
  set `OPENAI_API_KEY` in shell or `.env`, then restart server.
 - `missing_openai_sdk`:
  run `pip install -r requirements.txt` inside your venv.
 - Capture fails on Chrome internal pages:
  `chrome://*`, Web Store, and some protected tabs cannot be scripted.
 - No AI results in popup:
  verify server started with `--ai`.
--- a/chrome_screenshot_ext/popup.html
+++ b/chrome_screenshot_ext/popup.html
@ -42,7 +42,8 @@
        margin: 10px 0 6px 0;
      }
      input,
-      select {
+      select,
      textarea {
        width: 100%;
        box-sizing: border-box;
        padding: 10px 10px;
@ -52,8 +53,12 @@
        color: var(--text);
        outline: none;
      }
      textarea {
        resize: vertical;
      }
      input:focus,
-      select:focus {
+      select:focus,
      textarea:focus {
        border-color: rgba(52, 211, 153, 0.6);
        box-shadow: 0 0 0 3px rgba(52, 211, 153, 0.15);
      }
@ -117,6 +122,8 @@
      <div class="title">Local Screenshot Saver</div>
      <label for="endpoint">Endpoint</label>
      <input id="endpoint" type="text" spellcheck="false" />
      <label for="extra_instructions">Extra instructions</label>
      <textarea id="extra_instructions" rows="2" spellcheck="false" placeholder="optional: tone, goal, constraints..."></textarea>
      <div class="row">
        <button id="ping" type="button">Ping</button>
        <button id="capture" type="button">Capture</button>
--- a/chrome_screenshot_ext/popup.js
+++ b/chrome_screenshot_ext/popup.js
@ -1,6 +1,7 @@
 const DEFAULT_ENDPOINT = "http://127.0.0.1:8765/screenshot";
 const HISTORY_KEY = "capture_history_v1";
 const HISTORY_LIMIT = 25;
 const EXTRA_INSTRUCTIONS_KEY = "extra_instructions_v1";
 function $(id) {
  return document.getElementById(id);
@ -40,6 +41,7 @@ function renderResult(entryOrResp) {
  lines.push(`  META: ${resp.meta_path || "(unknown)"}`);
  if (resp.content_path) lines.push(`  CONTENT: ${resp.content_path}`);
  if (meta && meta.url) lines.push(`  URL:  ${meta.url}`);
  if (meta && meta.extra_instructions) lines.push(`  EXTRA: ${clampString(meta.extra_instructions, 220)}`);
  if (resp.ai_result) {
    if (resp.ai_result.ok && resp.ai_result.ai && Array.isArray(resp.ai_result.ai.posts)) {
@ -355,6 +357,7 @@ async function ping(endpoint) {
 async function main() {
  const endpointEl = $("endpoint");
  const extraEl = $("extra_instructions");
  const captureBtn = $("capture");
  const pingBtn = $("ping");
  const historySel = $("history");
@ -362,11 +365,15 @@ async function main() {
  const clearHistoryBtn = $("clear_history");
  endpointEl.value = (await storageGet("endpoint")) || DEFAULT_ENDPOINT;
  extraEl.value = (await storageGet(EXTRA_INSTRUCTIONS_KEY)) || "";
  await refreshHistoryUI();
  endpointEl.addEventListener("change", async () => {
    await storageSet({ endpoint: endpointEl.value.trim() });
  });
  extraEl.addEventListener("change", async () => {
    await storageSet({ [EXTRA_INSTRUCTIONS_KEY]: extraEl.value });
  });
  pingBtn.addEventListener("click", async () => {
    const endpoint = endpointEl.value.trim() || DEFAULT_ENDPOINT;
@ -398,6 +405,7 @@ async function main() {
  captureBtn.addEventListener("click", async () => {
    const endpoint = endpointEl.value.trim() || DEFAULT_ENDPOINT;
    const extraInstructions = (extraEl.value || "").trim();
    captureBtn.disabled = true;
    setStatus("Extracting page content...", "");
@ -423,6 +431,7 @@ async function main() {
        url: tab.url || "",
        ts: new Date().toISOString(),
        content,
        extra_instructions: extraInstructions,
      });
      // Persist for later viewing in the popup.
@ -432,6 +441,7 @@ async function main() {
        saved_at: new Date().toISOString(),
        title: tab.title || "",
        url: tab.url || "",
        extra_instructions: extraInstructions,
        resp: {
          png_path: resp.png_path || "",
          meta_path: resp.meta_path || "",
--- a/scripts/ai_prepare_responses.py
+++ b/scripts/ai_prepare_responses.py
@ -193,6 +193,7 @@ def main(argv: list[str]) -> int:
    page_url = str(meta.get("url") or "")
    page_title = str(meta.get("title") or "")
    extra_instructions = str(meta.get("extra_instructions") or "").strip()
    user_payload = {
        "page_url": page_url,
@ -216,7 +217,8 @@ def main(argv: list[str]) -> int:
        "Do not invent facts not present in the screenshot/content.\n"
        "Return JSON matching the provided schema. Include all top-level keys: page_url, page_title, posts, notes.\n"
        "If a value is unknown, use an empty string.\n\n"
-        f"PAGE_DATA_JSON={_safe_json_dump(user_payload, args.content_max_chars)}"
+        + (f"EXTRA_INSTRUCTIONS={extra_instructions}\n\n" if extra_instructions else "")
        + f"PAGE_DATA_JSON={_safe_json_dump(user_payload, args.content_max_chars)}"
    )
    # Screenshot is provided as a base64 data URL image input.
--- a/tools/local_screenshot_bridge.py
+++ b/tools/local_screenshot_bridge.py
@ -208,6 +208,7 @@ def _maybe_generate_ai(server, png_path: Path, meta: dict, content: object) -> d
    page_url = str(meta.get("url") or "")
    page_title = str(meta.get("title") or "")
    extra_instructions = str(meta.get("extra_instructions") or "").strip()
    user_payload = {
        "page_url": page_url,
@ -231,7 +232,8 @@ def _maybe_generate_ai(server, png_path: Path, meta: dict, content: object) -> d
        "Do not invent facts not present in the screenshot/content.\n"
        "Return JSON matching the provided schema. Include all top-level keys: page_url, page_title, posts, notes.\n"
        "If a value is unknown, use an empty string.\n\n"
-        f"PAGE_DATA_JSON={_safe_json_dump(user_payload, content_max_chars)}"
+        + (f"EXTRA_INSTRUCTIONS={extra_instructions}\n\n" if extra_instructions else "")
        + f"PAGE_DATA_JSON={_safe_json_dump(user_payload, content_max_chars)}"
    )
    b64 = base64.b64encode(png_path.read_bytes()).decode("ascii")
@ -339,6 +341,7 @@ class Handler(BaseHTTPRequestHandler):
        page_url = req.get("url") or ""
        client_ts = req.get("ts") or ""
        content = req.get("content", None)
        extra_instructions = req.get("extra_instructions") or ""
        m = re.match(r"^data:image/png;base64,(.*)$", data_url)
        if not m:
@ -393,6 +396,7 @@ class Handler(BaseHTTPRequestHandler):
                        "saved_utc": now.isoformat(),
                        "png_path": str(png_path),
                        "content_path": final_content_path,
                        "extra_instructions": extra_instructions,
                    },
                    indent=2,
                    ensure_ascii=True,
@ -411,6 +415,7 @@ class Handler(BaseHTTPRequestHandler):
            "saved_utc": now.isoformat(),
            "png_path": str(png_path),
            "content_path": final_content_path,
            "extra_instructions": extra_instructions,
        }
        run = getattr(self.server, "run_cmd", None)  # type: ignore[attr-defined]
@ -472,7 +477,7 @@ def main(argv: list[str]) -> int:
        "--run",
        nargs="+",
        default=None,
-        help="Optional command to run after saving. Screenshot paths are appended as args: PNG then JSON.",
+        help="Optional command to run after saving. Args appended: <png_path> <meta_path> [content_path].",
    )
    args = p.parse_args(argv)