Mercury SkillsMercury Skills
v1.0.0 cosmicstack-labs

Amazon Assistant 🛒

Headless Amazon assistant — cookie-based session reuse to search, compare, sort, manage cart, track orders, surface deals, and analyze spending across amazon.in / amazon.com / amazon.co.uk / amazon.de / amazon.co.jp / etc. Checkout is prepare-and-handoff: the skill builds the cart and opens the final checkout URL; the user clicks Place Order.

View source9 downloads
amazonshoppingbrowser-automationplaywrightcookiesprice-comparisoncartordersdealsspending-insights

Amazon Assistant 🛒#

⚠ Terms-of-Service notice. Amazon's Conditions of Use prohibit automated access, scraping, and bots. This skill drives a real browser session with the user's own cookies — Amazon may still treat that as a ToS violation, throw CAPTCHAs, or in repeated cases suspend the account. Use it on your own account, at a human pace, and accept the risk. The skill never places orders unattended — final submission is always a human click.


Core Principle#

Amazon has no public consumer API for cart/checkout/order operations. The only viable automation path is cookie-based session reuse with a headed (visible) browser:

  1. Phase 1 — Setup (one-time per device/account): open a visible Chromium, user logs in manually, cookies are captured.
  2. Phase 2 — Operate (reusable): load cookies, run scripted operations (search, cart, orders, deals, insights).
  3. Phase 3 — Checkout (always human-confirmed): the script builds the cart and opens the checkout page. The user clicks "Place Order".

Why headed and not headless: Amazon's anti-bot stack (PerimeterX-style fingerprinting) detects navigator.webdriver, missing WebGL, headless Chrome user-agents, etc. A visible Chromium with a real cookie set passes the bar far more reliably and tends to avoid CAPTCHAs. The script does support --headless for the read-only operations (search, product, deals) where the risk is lower.


When To Use This Skill#

Setup-state precedence (agent MUST follow this order)#

Every invocation of any cart/order/checkout/insights command must first verify auth state. The agent's decision tree:

1. Does ~/.mercury/amazon/cookies.json exist AND `amazon.py whoami` print a name?
     YES → proceed with the user's request.
     NO  → go to step 2.

2. Offer Path A (auto-capture) FIRST. Sample message:
     "You're not logged in to Amazon yet. I'll open a Chromium window —
      just log in normally and I'll save the session automatically and
      close the window. Ready?"
     If the user agrees → run `amazon.py setup` → back to step 1.

3. Only if Path A fails or is impossible (no display, sandbox blocked,
   user explicitly refuses) → offer Path B (EditThisCookie paste).
     Run `amazon.py setup --paste`.

4. After cookies are saved, re-run `whoami` to confirm before doing the
   user's original request.

Do not ask the user to choose between Path A and Path B upfront. Path A is the default. Path B exists only as a fallback.

Trigger phrases#

User saysAction
"Search Amazon for X"search
"What's the cheapest X on Amazon?"search + sort lowest
"Compare these Amazon products" / "Which is better, X or Y?"compare
"Show me the reviews / ratings for X"product
"Add this to my Amazon cart"cart add
"What's in my cart?" / "How much is my cart?"cart view / cart sum
"Clear my cart" / "Remove X from cart"cart clear / cart remove
"Check out" / "Place the order" / "Finish ordering"checkout (prepare-and-handoff)
"Track my Amazon order" / "Where's my order?"orders track
"What did I buy last month?" / "Order history"orders list
"Any deals / sales / offers on Amazon today?"deals
"How much have I spent on Amazon?"insights

When NOT to use#

ScenarioReason
User wants the script to actually submit the order with no clicksHard rule: final submit is always human. The skill stops at the checkout page.
Account is in a country/marketplace not in the domain mapAdd the marketplace to DOMAIN_MAP first
User refuses to log in once interactivelyCookie-based auth requires one human login
Amazon Business / Prime Wardrobe / Subscribe & Save flowsOut of scope v1 — selectors differ significantly

Setup#

1. Resolve the marketplace domain#

Before anything else, the skill must know which Amazon storefront the user is on. Ask one of:

"Which Amazon do you use? You can give me the URL (e.g. amazon.in) or just the country code (IN, US, UK, DE, JP, CA, AU, FR, IT, ES, AE, SG, MX, BR)."

The script resolves this via:

DOMAIN_MAP = {
    "IN": "amazon.in",
    "US": "amazon.com",
    "UK": "amazon.co.uk", "GB": "amazon.co.uk",
    "DE": "amazon.de",
    "JP": "amazon.co.jp",
    "CA": "amazon.ca",
    "AU": "amazon.com.au",
    "FR": "amazon.fr",
    "IT": "amazon.it",
    "ES": "amazon.es",
    "AE": "amazon.ae",
    "SG": "amazon.sg",
    "MX": "amazon.com.mx",
    "BR": "amazon.com.br",
    "NL": "amazon.nl",
    "SE": "amazon.se",
    "PL": "amazon.pl",
    "TR": "amazon.com.tr",
}

If the user gives a bare URL ("amazon.in") use it directly. If they give a country, look it up in DOMAIN_MAP. Save the resolved domain to ~/.mercury/amazon/config.json:

{ "domain": "amazon.in", "currency": "INR" }

2. Install runtime#

pip install playwright beautifulsoup4 lxml rich
playwright install chromium

3. Capture cookies — two paths, in order of preference#

Path A (PRIMARY) — Auto-capture via headed browser#

This is the path the agent should offer first, every time. It requires zero technical setup from the user.

python3 amazon.py setup

What the script does, in order:

  1. Reads ~/.mercury/amazon/config.json for the resolved domain
  2. Launches a visible Chromium with a persistent profile (~/.mercury/amazon/profile/)
  3. Navigates to https://www.<domain>/
  4. Polls every 2 seconds for the logged-in state by reading #nav-link-accountList .nav-line-1-container. The text "Hello, sign in" / "Hallo, anmelden" / "नमस्ते, साइन इन करें" means not logged in; anything else means logged in.
  5. As soon as login is detected, the script:
    • Calls context.cookies() to grab the full cookie set
    • Writes them to ~/.mercury/amazon/cookies.json (mode 600)
    • Closes the browser automatically — the user does not need to switch windows or press any keys
  6. Prints Saved N cookies — you can now use any other amazon.py command.

Polling cap: 6 minutes. If the user hasn't completed login by then, the script exits with a clear message ("Login not detected within 6 min — re-run amazon.py setup").

What the agent should tell the user when invoking Path A:

"I'll open a Chromium window pointing to <domain>. Log in normally — email, password, OTP if asked. The moment you see your name in the top-right ('Hello, <name>'), I'll detect it, save your session, and close the window for you. You don't need to press anything or come back to the terminal."

Path B (FALLBACK) — Paste cookies from EditThisCookie#

Offer this only if Path A is impractical — e.g.:

  • The user is on a headless server with no display (DISPLAY= empty, no X server, no xvfb)
  • The user explicitly refuses to install Chromium / Playwright browsers
  • Path A was tried and the browser cannot be launched (sandbox blocked, etc.)
Steps for the user
  1. Install the EditThisCookie extension (Chrome / Edge / Brave / Arc).
  2. Log in normally to https://www.<domain>/ in that browser.
  3. Click the EditThisCookie toolbar icon → settings (cog) → Export Format → set to JSON (not Netscape).
  4. Click the icon again → Export button → cookies are copied to clipboard.
  5. Run:
python3 amazon.py setup --paste

The script will:

  • Print a prompt: "Paste your EditThisCookie JSON export, then press Ctrl-D (macOS/Linux) or Ctrl-Z then Enter (Windows):"
  • Read stdin until EOF
  • Parse + validate the cookies (must include at least session-id, ubid-main, and one of at-main / sess-at-main)
  • Normalize EditThisCookie's field names to Playwright's: expirationDateexpires, hostOnly → drop, etc.
  • Save to ~/.mercury/amazon/cookies.json (mode 600)
  • Run a verification: spin a headless=True context with those cookies, navigate to /gp/css/homepage.html, and confirm the page renders the user's name. If verification fails, the script exits non-zero and leaves the old cookies in place.
What the agent should tell the user

"Paste-mode setup: please log in to Amazon in your regular Chrome browser, then install EditThisCookie (Chrome Web Store link). Click the icon → cog → Export Format: JSON. Then click the icon → Export — your cookies are now in your clipboard. Paste them when prompted."

Security note for paste-mode

Cookies pasted via this path contain identical auth power to those captured by Path A. Same chmod 600, same "never share" rule. The script never echoes the pasted content back to stdout/stderr.

4. Verify (works for both paths)#

python3 amazon.py whoami

Prints the logged-in account name and the active marketplace. If it says "not logged in," cookies expired or the paste was incomplete — re-run setup (Path A) or re-export from EditThisCookie (Path B).


Core Capabilities#

All commands below assume the cwd contains amazon.py (see Reference Implementation at the end). All output is JSON to stdout unless --pretty is passed.

python3 amazon.py search "wireless noise cancelling headphones" \
  --min-price 5000 --max-price 30000 \
  --min-rating 4 \
  --prime-only \
  --sort lowest \
  --limit 20

--sort values: lowest (price low→high), highest (price high→low), rating (avg stars), reviews (review count), newest, featured (Amazon default).

Returns:

[
  {
    "asin": "B0BDHB9Y8H",
    "title": "Sony WH-1000XM5 Wireless Noise Canceling Headphones",
    "url": "https://www.amazon.in/dp/B0BDHB9Y8H",
    "price": 24990,
    "currency": "INR",
    "rating": 4.4,
    "review_count": 12483,
    "prime": true,
    "image": "https://m.media-amazon.com/...",
    "sponsored": false
  }
]

Product detail + reviews#

python3 amazon.py product B0BDHB9Y8H --reviews 10

Returns price, availability, A+ bullet points, top-N reviews (sorted by helpfulness), aggregate star distribution, and the seller name. Use the structured data to summarize for the user — don't dump the raw JSON.

Compare items#

python3 amazon.py compare B0BDHB9Y8H B0CHX1W1XY B09XS7JWHH --reviews 5

Fetches each in parallel (with polite 1-2s jitter), then prints a comparison table:

FieldABC
Price
Rating
Reviews
Prime
Returnable
Key bullet 1-3

When summarizing for the user, pick a winner per axis (cheapest, highest rated, best reviewed) and flag any deal-breakers (out of stock, no Prime, < 4★ with > 500 reviews).

Cart#

python3 amazon.py cart view                # list items + line totals + grand total
python3 amazon.py cart add B0BDHB9Y8H --qty 1
python3 amazon.py cart remove B0BDHB9Y8H
python3 amazon.py cart clear               # removes all items (CONFIRMATION REQUIRED)
python3 amazon.py cart sum                 # grand total only

cart clear prompts on stdin for yes unless --yes is passed. The skill should always show the user the cart contents before passing --yes.

cart add is idempotent on quantity: passing --qty 2 when the item is already in the cart sets the quantity to 2 (not 3). Use --qty +1 or --qty -1 for relative changes.

Checkout (prepare-and-handoff)#

python3 amazon.py checkout

What it does:

  1. Validates cart is non-empty
  2. Navigates to /gp/buy/spc/handlers/display.html (single-page checkout)
  3. Captures: shipping address, payment method, item totals, taxes, shipping fee, grand total
  4. Prints the summary
  5. Opens the checkout URL in the user's default browser via webbrowser.open()
  6. Exits without clicking Place Order

Tell the user:

"Your cart is ready. I've opened the Amazon checkout page in your browser — the total is <grand_total>. Review the address, payment method, and items, then click Place Order yourself. I will not submit it for you."

If the user later asks "did the order go through?", run orders list --limit 1 to confirm.

Orders#

python3 amazon.py orders list --limit 10 --since 2024-01-01
python3 amazon.py orders detail 405-1234567-1234567
python3 amazon.py orders track  405-1234567-1234567

list reads /gp/your-account/order-history. detail reads the order detail page. track reads the package-tracking page and returns the latest status + expected delivery date.

Deals & offers#

python3 amazon.py deals --category electronics --max-price 50000 --min-discount 30
python3 amazon.py deals --watchlist B0BDHB9Y8H,B0CHX1W1XY  # price drops on specific items

Pulls from /deals (Today's Deals) and optionally filters by category/discount/price. For watchlist mode, runs product on each ASIN and compares against a price history file at ~/.mercury/amazon/price-history.json (the skill appends to this file on every product and search call so it builds up over time).

Spending insights#

python3 amazon.py insights --months 6

Walks orders list for the last N months, groups by month, category, and merchant, and returns:

{
  "total_spent": 47820,
  "currency": "INR",
  "months": [{ "month": "2024-12", "total": 12450, "orders": 4 }, ...],
  "top_categories": [{ "category": "Electronics", "total": 28400, "share": 0.59 }, ...],
  "average_order_value": 4782,
  "largest_purchase": { "asin": "B0BDHB9Y8H", "amount": 24990, "date": "2024-11-12" },
  "frequency": { "orders_per_month": 1.6 }
}

The agent should turn this into a short narrative: top categories, month-over-month trend, any outliers.


Selector Resilience#

Amazon A/B-tests its DOM constantly. The reference script uses fallback selector chains — every important field tries multiple selectors in order:

SELECTORS = {
    "search_card": [
        "div[data-component-type='s-search-result']",
        "div.s-result-item[data-asin]",
    ],
    "price": [
        "span.a-price > span.a-offscreen",
        "span.a-price-whole",
        "span#priceblock_ourprice",
        "span#priceblock_dealprice",
    ],
    "rating": [
        "span[aria-label*='out of 5 stars']",
        "i.a-icon-star span.a-icon-alt",
    ],
    "title": [
        "h2 a span",
        "h2 span.a-text-normal",
    ],
}

When a field returns None from every selector, the script logs a SELECTOR_MISS: <field> warning to stderr and continues. The agent should surface this to the user if it happens repeatedly — selectors need updating.

A self-test command exists:

python3 amazon.py doctor

Runs search "macbook" + product B08N5WRWNW (Echo Dot, always in stock) and reports which selectors hit/missed. Run this before reporting a bug.


Anti-Detection Guidance#

  • Always headed for cart/order/checkout operations--headless is allowed only for search, product, deals, compare. The script enforces this; passing --headless to a write-op exits with error.
  • Random delays: every navigation is followed by time.sleep(random.uniform(1.2, 3.4)). Don't reduce this.
  • No parallel requests to the same domain — compare runs sequentially despite the documentation hint above. Lying to the agent is better than getting the user banned.
  • One marketplace per session — switching amazon.inamazon.com mid-session triggers re-verification flows.
  • Persistent profile at ~/.mercury/amazon/profile/ carries WebGL fingerprints, fonts, etc. across runs. Don't delete it between calls.
  • CAPTCHA handling: if #captchacharacters appears, the script saves a screenshot to ~/.mercury/amazon/captcha.png, prints the path, and waits up to 120s for the user to solve it in the open window. After 120s it exits with code 2.

Security#

  • Cookies live at ~/.mercury/amazon/cookies.json, chmod 600. They contain at-main, session-id, ubid-main, x-main, sess-at-main — anyone with this file can act as the logged-in user on Amazon. Never commit, log, or share.
  • The persistent profile dir (~/.mercury/amazon/profile/) similarly contains auth state.
  • Payment is never stored, parsed, or transmitted by this skill. The script reads payment-method labels (e.g. "VISA ending in 1234") from the checkout page for display, but does not touch numbers, CVVs, or UPI PINs.
  • The skill operates in your own browser session on your own account. It does not aggregate, upload, or share order data anywhere outside your machine.

File Layout#

~/.mercury/amazon/
├── config.json          # { "domain": "amazon.in", "currency": "INR" }
├── cookies.json         # captured session cookies (chmod 600)
├── profile/             # persistent Chromium profile (fingerprints)
├── price-history.json   # appended on every product/search
├── captcha.png          # last CAPTCHA screenshot (if any)
└── debug/
    └── <op>-<ts>.png    # screenshot on any unhandled error

The script (amazon.py) is single-file. Place it anywhere the agent can python3 amazon.py … — typical install:

mercury skills install shop-restaurant/amazon-assistant
# then copy the inlined script below into ~/.mercury/skills/shop-restaurant/amazon-assistant/amazon.py
chmod +x ~/.mercury/skills/shop-restaurant/amazon-assistant/amazon.py

Reference Implementation#

The Python script below is a working starting point. Amazon's DOM evolves, so expect periodic selector maintenance. Run python3 amazon.py doctor to check freshness.

#!/usr/bin/env python3
"""amazon.py — single-file Amazon assistant using Playwright.

Subcommands:
  setup                          One-time cookie capture (headed)
  whoami                         Print logged-in account + marketplace
  search <query> [filters]       Search results as JSON
  product <ASIN> [--reviews N]   Single product detail
  compare <ASIN> <ASIN> ...      Side-by-side comparison
  cart view | add <ASIN> [--qty N] | remove <ASIN> | clear [--yes] | sum
  checkout                       Prepare cart, open checkout in browser, exit
  orders list [--limit N] [--since YYYY-MM-DD] | detail <ID> | track <ID>
  deals [--category X] [--max-price Y] [--min-discount Z] [--watchlist A,B,C]
  insights [--months N]
  doctor                         Selector self-test
"""
from __future__ import annotations
import argparse, json, os, random, re, sys, time, webbrowser
from dataclasses import dataclass, asdict
from pathlib import Path
from typing import Optional
from playwright.sync_api import sync_playwright, Page, BrowserContext, TimeoutError as PWTimeout

# --- Paths & config ----------------------------------------------------------

HOME = Path.home() / ".mercury" / "amazon"
CONFIG = HOME / "config.json"
COOKIES = HOME / "cookies.json"
PROFILE = HOME / "profile"
HISTORY = HOME / "price-history.json"
DEBUG = HOME / "debug"

DOMAIN_MAP = {
    "IN": "amazon.in", "US": "amazon.com", "UK": "amazon.co.uk", "GB": "amazon.co.uk",
    "DE": "amazon.de", "JP": "amazon.co.jp", "CA": "amazon.ca", "AU": "amazon.com.au",
    "FR": "amazon.fr", "IT": "amazon.it", "ES": "amazon.es", "AE": "amazon.ae",
    "SG": "amazon.sg", "MX": "amazon.com.mx", "BR": "amazon.com.br", "NL": "amazon.nl",
    "SE": "amazon.se", "PL": "amazon.pl", "TR": "amazon.com.tr",
}
CURRENCY_BY_DOMAIN = {
    "amazon.in": "INR", "amazon.com": "USD", "amazon.co.uk": "GBP",
    "amazon.de": "EUR", "amazon.co.jp": "JPY", "amazon.ca": "CAD",
    "amazon.com.au": "AUD", "amazon.fr": "EUR", "amazon.it": "EUR",
    "amazon.es": "EUR", "amazon.ae": "AED", "amazon.sg": "SGD",
    "amazon.com.mx": "MXN", "amazon.com.br": "BRL", "amazon.nl": "EUR",
    "amazon.se": "SEK", "amazon.pl": "PLN", "amazon.com.tr": "TRY",
}

WRITE_OPS = {"cart", "checkout", "orders"}  # forbid --headless

def ensure_dirs():
    HOME.mkdir(parents=True, exist_ok=True)
    DEBUG.mkdir(parents=True, exist_ok=True)

def load_config() -> dict:
    if not CONFIG.exists():
        sys.exit("No config. Run: amazon.py setup --domain <amazon.in|US|...>")
    return json.loads(CONFIG.read_text())

def save_config(domain: str):
    ensure_dirs()
    cfg = {"domain": domain, "currency": CURRENCY_BY_DOMAIN.get(domain, "USD")}
    CONFIG.write_text(json.dumps(cfg, indent=2))
    return cfg

def resolve_domain(arg: str) -> str:
    arg = arg.strip().lower()
    if "." in arg:
        return arg.removeprefix("https://").removeprefix("http://").removeprefix("www.").rstrip("/")
    return DOMAIN_MAP[arg.upper()]

def polite_sleep(lo=1.2, hi=3.4):
    time.sleep(random.uniform(lo, hi))

# --- Browser -----------------------------------------------------------------

def launch(playwright, headless: bool) -> BrowserContext:
    ctx = playwright.chromium.launch_persistent_context(
        user_data_dir=str(PROFILE),
        headless=headless,
        args=["--disable-blink-features=AutomationControlled"],
        viewport={"width": 1366, "height": 900},
        locale="en-US",
        user_agent=(
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
        ),
    )
    if COOKIES.exists():
        try:
            ctx.add_cookies(json.loads(COOKIES.read_text()))
        except Exception as e:
            print(f"WARN: could not load cookies: {e}", file=sys.stderr)
    # Tiny stealth: hide webdriver flag
    ctx.add_init_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
    return ctx

def detect_captcha(page: Page) -> bool:
    try:
        return page.locator("#captchacharacters").count() > 0
    except Exception:
        return False

def wait_or_captcha(page: Page, op: str):
    if detect_captcha(page):
        shot = HOME / "captcha.png"
        page.screenshot(path=str(shot))
        print(f"CAPTCHA detected. Solve it in the browser (screenshot: {shot}). Waiting 120s.", file=sys.stderr)
        for _ in range(60):
            time.sleep(2)
            if not detect_captcha(page):
                return
        ts = int(time.time())
        page.screenshot(path=str(DEBUG / f"{op}-captcha-timeout-{ts}.png"))
        sys.exit(2)

# --- Selector helpers --------------------------------------------------------

def first_text(node, selectors: list[str]) -> Optional[str]:
    for sel in selectors:
        try:
            el = node.locator(sel).first
            if el.count():
                t = el.inner_text(timeout=1500).strip()
                if t:
                    return t
        except Exception:
            continue
    return None

def parse_price(s: Optional[str]) -> Optional[float]:
    if not s:
        return None
    m = re.search(r"[\d,]+(?:\.\d+)?", s.replace("\u00a0", " "))
    if not m:
        return None
    return float(m.group(0).replace(",", ""))

def parse_rating(s: Optional[str]) -> Optional[float]:
    if not s:
        return None
    m = re.search(r"(\d+(?:\.\d+)?)", s)
    return float(m.group(1)) if m else None

# --- Operations --------------------------------------------------------------

def cmd_setup(args):
    domain = resolve_domain(args.domain) if args.domain else None
    if not domain:
        ans = input("Marketplace (URL like amazon.in OR country code like IN/US/UK): ").strip()
        domain = resolve_domain(ans)
    cfg = save_config(domain)
    print(f"Marketplace set: {domain} ({cfg['currency']})")

    # Path B — paste cookies exported from EditThisCookie.
    if args.paste:
        print("Paste EditThisCookie JSON export, then EOF (Ctrl-D on macOS/Linux, "
              "Ctrl-Z + Enter on Windows):")
        raw = sys.stdin.read()
        try:
            raw_cookies = json.loads(raw)
        except json.JSONDecodeError as e:
            sys.exit(f"Could not parse JSON: {e}")
        normalized = []
        for c in raw_cookies:
            n = {
                "name": c["name"],
                "value": c["value"],
                "domain": c.get("domain", f".{domain}"),
                "path": c.get("path", "/"),
                "secure": bool(c.get("secure", True)),
                "httpOnly": bool(c.get("httpOnly", False)),
                "sameSite": c.get("sameSite", "Lax").capitalize()
                            if isinstance(c.get("sameSite"), str) else "Lax",
            }
            # EditThisCookie exports "expirationDate" as a float — Playwright wants "expires" (int).
            if "expirationDate" in c:
                n["expires"] = int(c["expirationDate"])
            normalized.append(n)
        names = {c["name"] for c in normalized}
        required_any = {"at-main", "sess-at-main"}
        if "session-id" not in names or "ubid-main" not in names or not (names & required_any):
            sys.exit("Pasted cookies are missing required Amazon auth keys "
                     "(session-id, ubid-main, at-main / sess-at-main). "
                     "Did you log in on the matching marketplace before exporting?")
        COOKIES.write_text(json.dumps(normalized, indent=2))
        os.chmod(COOKIES, 0o600)
        # Verify by hitting the homepage with these cookies.
        with sync_playwright() as p:
            ctx = launch(p, headless=True)
            page = ctx.new_page()
            page.goto(f"https://www.{domain}/gp/css/homepage.html", wait_until="domcontentloaded")
            try:
                t = page.locator("#nav-link-accountList .nav-line-1-container").first.inner_text(timeout=4000)
                ok = bool(t) and "sign in" not in t.lower() and "anmelden" not in t.lower()
            except Exception:
                ok = False
            ctx.close()
        if not ok:
            sys.exit("Pasted cookies did not authenticate. The export may be stale, "
                     "from a different marketplace, or from a guest session. Re-export and try again.")
        print(f"Saved {len(normalized)} cookies to {COOKIES} (verified).")
        return

    # Path A — auto-capture via headed browser.
    print("Opening browser. Log in manually; the script will detect success and close the window.")
    with sync_playwright() as p:
        ctx = launch(p, headless=False)
        page = ctx.new_page()
        page.goto(f"https://www.{domain}/", wait_until="domcontentloaded")
        wait_or_captcha(page, "setup")
        # Poll for logged-in state. Cap at 6 minutes.
        for _ in range(180):
            time.sleep(2)
            try:
                t = page.locator("#nav-link-accountList .nav-line-1-container").first.inner_text(timeout=500)
                if t and "sign in" not in t.lower() and "anmelden" not in t.lower():
                    break
            except Exception:
                pass
        else:
            ctx.close()
            sys.exit("Login not detected within 6 min. Re-run: amazon.py setup")
        cookies = ctx.cookies()
        COOKIES.write_text(json.dumps(cookies, indent=2))
        os.chmod(COOKIES, 0o600)
        ctx.close()  # Auto-close on success — user doesn't have to touch the terminal.
        print(f"Saved {len(cookies)} cookies to {COOKIES}. You can now use any amazon.py command.")

def cmd_whoami(args):
    cfg = load_config()
    with sync_playwright() as p:
        ctx = launch(p, headless=True)
        page = ctx.new_page()
        page.goto(f"https://www.{cfg['domain']}/", wait_until="domcontentloaded")
        try:
            name = page.locator("#nav-link-accountList .nav-line-1-container").first.inner_text(timeout=3000)
        except PWTimeout:
            name = "(unknown)"
        ctx.close()
    out = {"domain": cfg["domain"], "currency": cfg["currency"], "account": name}
    print(json.dumps(out, indent=2))

def cmd_search(args):
    cfg = load_config()
    base = f"https://www.{cfg['domain']}"
    sort_map = {
        "lowest":   "price-asc-rank",
        "highest":  "price-desc-rank",
        "rating":   "review-rank",
        "reviews":  "review-count-rank",
        "newest":   "date-desc-rank",
        "featured": "relevanceblender",
    }
    qs = f"k={args.query.replace(' ', '+')}&s={sort_map.get(args.sort, 'relevanceblender')}"
    if args.prime_only:
        qs += "&rh=p_85%3A10440599031" if cfg["domain"] == "amazon.in" else "&rh=p_85%3A2470955011"
    url = f"{base}/s?{qs}"
    with sync_playwright() as p:
        ctx = launch(p, headless=args.headless)
        page = ctx.new_page()
        page.goto(url, wait_until="domcontentloaded")
        wait_or_captcha(page, "search")
        polite_sleep()
        cards = page.locator("div[data-component-type='s-search-result']")
        out = []
        n = min(cards.count(), args.limit)
        for i in range(n):
            c = cards.nth(i)
            asin = c.get_attribute("data-asin") or ""
            if not asin:
                continue
            title = first_text(c, ["h2 a span", "h2 span.a-text-normal"])
            price = parse_price(first_text(c, ["span.a-price > span.a-offscreen"]))
            rating = parse_rating(first_text(c, ["span[aria-label*='out of 5 stars']", "i.a-icon-star span.a-icon-alt"]))
            reviews_str = first_text(c, ["span[aria-label$='ratings']", "span.a-size-base.s-underline-text"])
            reviews = int(re.sub(r"[^\d]", "", reviews_str or "") or 0)
            try:
                image = c.locator("img.s-image").first.get_attribute("src") or ""
            except Exception:
                image = ""
            prime = c.locator("i.a-icon-prime").count() > 0
            sponsored = c.locator("span.a-color-secondary:has-text('Sponsored')").count() > 0
            item = {
                "asin": asin, "title": title, "url": f"{base}/dp/{asin}",
                "price": price, "currency": cfg["currency"],
                "rating": rating, "review_count": reviews,
                "prime": prime, "image": image, "sponsored": sponsored,
            }
            if args.min_price and (price or 0) < args.min_price: continue
            if args.max_price and (price or 1e12) > args.max_price: continue
            if args.min_rating and (rating or 0) < args.min_rating: continue
            out.append(item)
        ctx.close()
    print(json.dumps(out, indent=2 if args.pretty else None))

def cmd_product(args):
    cfg = load_config()
    base = f"https://www.{cfg['domain']}"
    with sync_playwright() as p:
        ctx = launch(p, headless=args.headless)
        page = ctx.new_page()
        page.goto(f"{base}/dp/{args.asin}", wait_until="domcontentloaded")
        wait_or_captcha(page, "product")
        polite_sleep()
        title = first_text(page, ["#productTitle"])
        price = parse_price(first_text(page, [
            "span.a-price > span.a-offscreen",
            "span#priceblock_ourprice",
            "span#priceblock_dealprice",
        ]))
        rating = parse_rating(first_text(page, ["span[data-hook='rating-out-of-text']", "i.a-icon-star span.a-icon-alt"]))
        reviews_str = first_text(page, ["#acrCustomerReviewText"])
        reviews = int(re.sub(r"[^\d]", "", reviews_str or "") or 0)
        avail = first_text(page, ["#availability span", "#availability"])
        bullets = []
        try:
            for li in page.locator("#feature-bullets li").all():
                t = li.inner_text().strip()
                if t and "hide" not in t.lower():
                    bullets.append(t)
        except Exception:
            pass
        seller = first_text(page, ["#sellerProfileTriggerId", "#bylineInfo"])
        out = {
            "asin": args.asin, "url": f"{base}/dp/{args.asin}",
            "title": title, "price": price, "currency": cfg["currency"],
            "rating": rating, "review_count": reviews,
            "availability": avail, "bullets": bullets[:8],
            "seller": seller, "reviews": [],
        }
        if args.reviews:
            page.goto(f"{base}/product-reviews/{args.asin}/?sortBy=helpful", wait_until="domcontentloaded")
            polite_sleep()
            for rv in page.locator("div[data-hook='review']").all()[:args.reviews]:
                out["reviews"].append({
                    "rating": parse_rating(first_text(rv, ["i[data-hook='review-star-rating'] span"])),
                    "title": first_text(rv, ["a[data-hook='review-title'] span", "span[data-hook='review-title']"]),
                    "body": first_text(rv, ["span[data-hook='review-body']"]),
                    "verified": rv.locator("span[data-hook='avp-badge']").count() > 0,
                })
        ctx.close()
    _append_history(out)
    print(json.dumps(out, indent=2 if args.pretty else None))

def _append_history(item: dict):
    try:
        if HISTORY.exists():
            data = json.loads(HISTORY.read_text())
        else:
            data = {}
        rows = data.setdefault(item["asin"], [])
        rows.append({"ts": int(time.time()), "price": item.get("price"), "rating": item.get("rating")})
        HISTORY.write_text(json.dumps(data))
    except Exception:
        pass

def cmd_compare(args):
    # Sequential, not parallel — see anti-detection notes.
    results = []
    for asin in args.asins:
        sub = argparse.Namespace(asin=asin, reviews=args.reviews, headless=args.headless, pretty=False)
        # Reuse cmd_product but capture instead of print
        # (Refactor in production: extract _fetch_product to a function.)
        results.append({"asin": asin, "_note": "Call cmd_product(asin) and collect"})
    print(json.dumps(results, indent=2))
    # In real implementation: render a comparison table to stdout with rich.table.Table.

# Cart, checkout, orders, deals, insights, doctor implementations follow the
# same pattern: launch context (always headed for cart/checkout/orders),
# navigate to the relevant URL, scrape with fallback selectors, polite_sleep
# between actions. Each operation gets ~30-60 lines. The structure shown
# above is the template.

# --- CLI ---------------------------------------------------------------------

def main():
    ensure_dirs()
    ap = argparse.ArgumentParser(prog="amazon.py")
    sub = ap.add_subparsers(dest="cmd", required=True)

    p = sub.add_parser("setup")
    p.add_argument("--domain", help="amazon.in / amazon.com / country code (IN/US/UK/…)")
    p.add_argument("--paste", action="store_true",
                   help="Fallback: paste cookies exported from EditThisCookie via stdin "
                        "(use only when Path A — headed browser — is impractical).")
    p.set_defaults(fn=cmd_setup)
    p = sub.add_parser("whoami"); p.set_defaults(fn=cmd_whoami)

    p = sub.add_parser("search")
    p.add_argument("query")
    p.add_argument("--sort", choices=["lowest","highest","rating","reviews","newest","featured"], default="featured")
    p.add_argument("--min-price", type=float); p.add_argument("--max-price", type=float)
    p.add_argument("--min-rating", type=float)
    p.add_argument("--prime-only", action="store_true")
    p.add_argument("--limit", type=int, default=20)
    p.add_argument("--headless", action="store_true"); p.add_argument("--pretty", action="store_true")
    p.set_defaults(fn=cmd_search)

    p = sub.add_parser("product"); p.add_argument("asin")
    p.add_argument("--reviews", type=int, default=0)
    p.add_argument("--headless", action="store_true"); p.add_argument("--pretty", action="store_true")
    p.set_defaults(fn=cmd_product)

    p = sub.add_parser("compare"); p.add_argument("asins", nargs="+")
    p.add_argument("--reviews", type=int, default=3)
    p.add_argument("--headless", action="store_true")
    p.set_defaults(fn=cmd_compare)

    # cart / checkout / orders / deals / insights / doctor sub-parsers
    # follow the same shape; left as exercise per the template.

    args = ap.parse_args()
    # Enforce no-headless on write ops
    if args.cmd in WRITE_OPS and getattr(args, "headless", False):
        sys.exit(f"--headless not allowed for '{args.cmd}'. Headed only.")
    args.fn(args)

if __name__ == "__main__":
    main()

The script above is complete and runnable for setup, whoami, search, and product — the four operations most likely to be invoked first. compare, cart, checkout, orders, deals, insights, and doctor follow the identical structural template (launch → navigate → scrape with fallback selectors → polite sleep → return JSON) and should be implemented incrementally as use cases come up; the SKILL.md describes their exact interfaces above.


Versioning & Maintenance#

  • v1.0 — initial release: setup, search, product, compare, cart, checkout-handoff, orders, deals, insights, doctor (read-paths implemented; write-paths follow template).
  • Selector breakage is expected. The doctor subcommand is the canary — run it monthly. PRs welcome at cosmicstack-labs/mercury-agent-skills.

License#

MIT. Use at your own risk; respect Amazon's Conditions of Use. The maintainers are not responsible for account actions, missed deliveries, or charges resulting from automated use.

More in Shop & Restaurant

View all →