nokap

Screenshots and PDFs from web pages. Powered by headless Chrome.

nokap captures screenshots and PDFs from web pages (or local HTML) using headless Chrome via the Chrome DevTools Protocol. It needs neither Selenium nor Playwright; its only dependency is the lightweight websockets package.

Installation

pip install nokap

Chrome or Chromium must be installed on the system. nokap locates the browser automatically; set the CHROME_PATH environment variable to point it at a specific binary.

Quick Start

import nokap

# Screenshot a URL
nokap.webshot("https://example.com", "example.png")

# Save as PDF
nokap.webshot("https://example.com", "example.pdf")

# Screenshot with a CSS selector (captures just that element)
nokap.webshot("https://example.com", "header.png", selector="h1")

# From an HTML string (great for table libraries)
nokap.from_html("<h1>Hello, world!</h1>", "hello.png")

API

nokap.webshot()

Take a screenshot or PDF of a web page.

nokap.webshot(
    url,                # URL or local file path
    file="webshot.png", # Output path (.png, .jpg, .webp, .pdf)
    *,
    vwidth=992,         # Viewport width (px)
    vheight=744,        # Viewport height (px)
    selector=None,      # CSS selector to capture
    cliprect=None,      # Clip rectangle (x, y, width, height)
    expand=0,           # Padding around selector (px)
    delay=0.2,          # Wait after page load (seconds)
    zoom=1,             # Scale factor (2 = retina)
    useragent=None,     # Custom User-Agent string
)
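
To make the relationship between expand and cliprect concrete, here is a minimal sketch of how padding around an element's bounding box could be turned into a clip rectangle. The helper name `expand_rect` and the clamping at the page origin are assumptions for illustration; nokap's internal handling may differ.

```python
def expand_rect(box, expand=0):
    """Pad an (x, y, width, height) box by `expand` px on every side.

    Hypothetical helper: the left and top edges are clamped at 0 so the
    rectangle never starts outside the page.
    """
    x, y, width, height = box
    left = max(x - expand, 0)
    top = max(y - expand, 0)
    right = x + width + expand
    bottom = y + height + expand
    return (left, top, right - left, bottom - top)


# A 100x50 element at (10, 20), padded by 5 px on each side.
padded = expand_rect((10, 20, 100, 50), expand=5)
```

The result has the same (x, y, width, height) shape that the cliprect option expects, so a rectangle computed this way could be passed as `cliprect=padded`.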

nokap.from_html()

Render an HTML string to an image or PDF. Designed for integration with table/report libraries.

nokap.from_html(
    html,               # HTML content
    file="webshot.png", # Output path
    *,
    selector="html",    # CSS selector to capture
    **kwargs,           # All webshot() options
)
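
As an illustration of the table/report use case, the sketch below builds an HTML table from plain Python rows and hands it to nokap.from_html(). The helpers `rows_to_html` and `save_table_png` are hypothetical and not part of nokap; only the from_html() call, with selector and the forwarded zoom option, comes from the signature above.

```python
from html import escape


def rows_to_html(headers, rows):
    """Build a minimal HTML table string, escaping every cell."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


def save_table_png(headers, rows, file="table.png"):
    """Render the table to an image, cropped to the <table> element."""
    import nokap  # imported lazily so rows_to_html() works without Chrome

    html = rows_to_html(headers, rows)
    # selector="table" crops to the table; zoom is forwarded to webshot().
    nokap.from_html(html, file, selector="table", zoom=2)
```

Calling `save_table_png(["name", "score"], [["a", 1], ["b", 2]])` would write table.png, assuming Chrome is available on the system.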

nokap.close()

Shut down the background Chrome process. Called automatically at exit, but available for explicit cleanup.

nokap.close()
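
For scripts that should release Chrome as soon as a batch of captures finishes, the explicit call can be wrapped in a context manager. This is a sketch only: `chrome_session` is a hypothetical helper, and the `backend` parameter exists so the pattern can be exercised without Chrome installed.

```python
from contextlib import contextmanager


@contextmanager
def chrome_session(backend=None):
    """Yield the capture backend and always call its close() afterwards."""
    if backend is None:
        import nokap as backend  # default to the real library
    try:
        yield backend
    finally:
        # Runs even if a capture raises, so Chrome is not left running.
        backend.close()
```

Inside `with chrome_session() as shots:` the usual calls such as `shots.webshot("https://example.com", "example.png")` apply, and Chrome is shut down when the block exits.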

How It Works

nokap communicates directly with Chrome over the Chrome DevTools Protocol via WebSockets. No browser driver binaries, no heavyweight automation frameworks.

The architecture:

  1. Launch: Finds and starts headless Chrome with a random debugging port
  2. Connect: Opens a WebSocket to Chrome's CDP endpoint
  3. Capture: Creates a tab, navigates, waits, then calls Page.captureScreenshot or Page.printToPDF
  4. Cleanup: Closes the tab; Chrome stays running for reuse until nokap.close() or process exit

Features

Feature            Details
Image formats      PNG, JPEG, WebP
PDF generation     Configurable page size, margins, orientation
CSS selectors      Capture specific elements (or union of multiple)
Zoom/scale         Produce retina-quality (2×, 3×) images
Expand/padding     Add whitespace around captured elements
Local HTML         Render HTML strings or local .html files
Viewport control   Set width/height for responsive layouts
Custom User-Agent  Override the browser UA string
Auto-cleanup       Chrome process managed via atexit
Jupyter-safe       Works in notebooks (no event loop conflicts)
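
The "union of multiple" behaviour for CSS selectors can be pictured as taking the smallest rectangle that contains every matched element. The sketch below shows that geometry under the assumption that each element is described by an (x, y, width, height) box; `union_boxes` is a hypothetical helper, not a nokap function.

```python
def union_boxes(boxes):
    """Return the smallest (x, y, width, height) box enclosing all inputs."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one box is required")
    left = min(x for x, _, _, _ in boxes)
    top = min(y for _, y, _, _ in boxes)
    right = max(x + w for x, _, w, _ in boxes)
    bottom = max(y + h for _, y, _, h in boxes)
    return (left, top, right - left, bottom - top)


# Two elements side by side with a gap between them.
combined = union_boxes([(0, 0, 10, 10), (20, 5, 10, 10)])
```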

Configuration

Environment Variable  Purpose
CHROME_PATH           Override Chrome binary location
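
A browser lookup that honours CHROME_PATH might follow the pattern below: check the override first, then fall back to searching PATH. This is an assumption about the general approach, not nokap's actual search logic, and the candidate executable names are illustrative only.

```python
import os
import shutil

# Hypothetical candidate names; real installations vary by platform.
CANDIDATES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_chrome(env=None):
    """Return a Chrome/Chromium path, preferring the CHROME_PATH override."""
    env = os.environ if env is None else env
    override = env.get("CHROME_PATH")
    if override:
        if os.path.isfile(override):
            return override
        raise FileNotFoundError(f"CHROME_PATH points to a missing file: {override}")
    for name in CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None  # nothing found on PATH
```

Failing loudly when the override is set but wrong is a deliberate choice in this sketch: silently falling back to another browser would hide a misconfiguration.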

Requirements

  • Python ≥ 3.10
  • Chrome or Chromium installed on the system

License

MIT
