diff --git a/.gitignore b/.gitignore index a4a9ce6..7b5c577 100644 --- a/.gitignore +++ b/.gitignore @@ -2,3 +2,4 @@ __pycache__ uv.lock .playwright-mcp/ +tests/.fixture_cache/ diff --git a/README.md b/README.md index 53612ee..6426ebb 100644 --- a/README.md +++ b/README.md @@ -47,14 +47,16 @@ All commands support these options: - `-o, --output DIRECTORY` - output directory (default: writes to temp dir and opens browser) - `-a, --output-auto` - auto-name output subdirectory based on session ID or filename -- `--repo OWNER/NAME` - GitHub repo for commit links (auto-detected from git push output if not specified) +- `--repo PATH|URL|OWNER/NAME` - Git repo for commit links and code viewer. Accepts a local path, GitHub URL, or owner/name format. - `--open` - open the generated `index.html` in your default browser (default if no `-o` specified) - `--gist` - upload the generated HTML files to a GitHub Gist and output a preview URL - `--json` - include the original session file in the output directory +- `--code-view` - generate an interactive code viewer showing all files modified during the session The generated output includes: - `index.html` - an index page with a timeline of prompts and commits - `page-001.html`, `page-002.html`, etc. - paginated transcript pages +- `code.html` - interactive code viewer (when `--code-view` is used) ### Local sessions @@ -106,7 +108,19 @@ Preview: https://gisthost.github.io/?abc123def456/index.html Files: /var/folders/.../session-id ``` -The preview URL uses [gisthost.github.io](https://gisthost.github.io/) to render your HTML gist. The tool automatically injects JavaScript to fix relative links when served through gisthost. +The preview URL uses [gisthost.github.io](https://gisthost.github.io/) to render your HTML gist. The tool automatically injects JavaScript to fix relative links when served through gisthost (also works with gistpreview.github.io for backward compatibility). + +**Large sessions:** GitHub's gist API has size limits (~1MB). For large sessions, the tool automatically handles this: + +- **Page content**: When total page content exceeds 500KB, the tool generates separate `page-data-NNN.json` files for each page. The HTML pages are stripped of their inline content when uploaded, and JavaScript fetches the content from the JSON files on demand. This keeps each file small while preserving full functionality. + +- **Code viewer**: When using `--code-view`, large sessions may have a `code-data.json` file that also needs separate handling. + +- **Two-gist strategy**: When data files exceed 1MB total, they're uploaded to a separate "data gist", and the main gist's HTML files reference it. + +- **Batched uploads**: If files are still too large, they're automatically batched into multiple gists. + +All of this happens transparently and requires no additional options. Search continues to work by fetching from the JSON files instead of HTML. Combine with `-o` to keep a local copy: @@ -116,6 +130,36 @@ claude-code-transcripts json session.json -o ./my-transcript --gist **Requirements:** The `--gist` option requires the [GitHub CLI](https://cli.github.com/) (`gh`) to be installed and authenticated (`gh auth login`). 
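For example, a large session is uploaded exactly like a small one; the size-limit handling described above kicks in automatically (the filename here is illustrative):

```bash
claude-code-transcripts json big-session.jsonl --gist
```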
+### Code viewer + +Use `--code-view` to generate an interactive three-pane code viewer that shows all files modified during the session: + +```bash +# Generate with code viewer from a local session +claude-code-transcripts --code-view + +# Point to the actual repo for full file content and blame +claude-code-transcripts --code-view --repo /path/to/repo + +# From a URL +claude-code-transcripts json https://example.com/session.jsonl --code-view +``` + +The code viewer (`code.html`) provides: +- **File tree**: Navigate all files that were written or edited during the session +- **File content**: View file contents with git blame-style annotations showing which prompt modified each line +- **Transcript pane**: Browse the full conversation with links to jump to specific file operations + +When you provide `--repo` pointing to the local git repository that was being modified, the code viewer can show the complete file content with accurate blame attribution. Without a repo path, it shows a diff-only view of the changes. + +Use `--exclude-deleted-files` to filter out files that no longer exist on disk: + +```bash +claude-code-transcripts --code-view --exclude-deleted-files +``` + +This is useful when files were deleted after the session (either manually or by commands not captured in the transcript). + ### Auto-naming output directories Use `-a/--output-auto` to automatically create a subdirectory named after the session: @@ -145,11 +189,14 @@ This is useful for archiving the source data alongside the HTML output. ### Converting from JSON/JSONL files -Convert a specific session file directly: +Convert a specific session file or URL directly: ```bash claude-code-transcripts json session.json -o output-directory/ claude-code-transcripts json session.jsonl --open + +# Fetch and convert from a URL +claude-code-transcripts json https://example.com/session.jsonl --open ``` This works with both JSONL files in the `~/.claude/projects/` folder and JSON session files extracted from Claude Code for web. 
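The options above compose. A combined sketch, with a hypothetical URL and repo path:

```bash
# Fetch a remote session, build the code viewer, and resolve blame
# against a local checkout of the repo that was being modified
claude-code-transcripts json https://example.com/session.jsonl \
  --code-view --repo /path/to/repo -o ./transcripts
```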
diff --git a/pyproject.toml b/pyproject.toml index e1eaed6..5de9054 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -11,9 +11,11 @@ requires-python = ">=3.10" dependencies = [ "click", "click-default-group", + "gitpython", "httpx", "jinja2", "markdown", + "nh3>=0.3.2", "questionary", ] diff --git a/src/claude_code_transcripts/__init__.py b/src/claude_code_transcripts/__init__.py index f2246a2..d27b57c 100644 --- a/src/claude_code_transcripts/__init__.py +++ b/src/claude_code_transcripts/__init__.py @@ -9,14 +9,19 @@ import subprocess import tempfile import webbrowser +from dataclasses import dataclass, field from datetime import datetime from pathlib import Path +from typing import Optional, List, Tuple, Dict, Any import click from click_default_group import DefaultGroup +from git import Repo +from git.exc import InvalidGitRepositoryError import httpx from jinja2 import Environment, PackageLoader import markdown +import nh3 import questionary # Set up Jinja2 environment @@ -49,6 +54,101 @@ def get_template(name): ) +# Import code viewer functionality from separate module +from claude_code_transcripts.code_view import ( + FileOperation, + FileState, + CodeViewData, + BlameRange, + OP_WRITE, + OP_EDIT, + OP_DELETE, + extract_file_operations, + filter_deleted_files, + normalize_file_paths, + find_git_repo_root, + find_commit_before_timestamp, + build_file_history_repo, + get_file_blame_ranges, + get_file_content_from_repo, + build_file_tree, + reconstruct_file_with_blame, + build_file_states, + render_file_tree_html, + file_state_to_dict, + generate_code_view_html, + build_msg_to_user_html, +) + + +def extract_github_repo_from_url(url: str) -> Optional[str]: + """Extract 'owner/name' from various GitHub URL formats. + + Handles: + - https://github.com/owner/repo + - https://github.com/owner/repo.git + - git@github.com:owner/repo.git + + Args: + url: GitHub URL or git remote URL. + + Returns: + Repository identifier as 'owner/name', or None if not found. + """ + match = re.search(r"github\.com[:/]([^/]+/[^/?#.]+)", url) + if match: + repo = match.group(1) + return repo[:-4] if repo.endswith(".git") else repo + return None + + +def parse_repo_value(repo: Optional[str]) -> Tuple[Optional[str], Optional[Path]]: + """Parse --repo value to extract GitHub repo name and/or local path. + + Args: + repo: The --repo value (could be path, URL, or owner/name). + + Returns: + Tuple of (github_repo, local_path): + - github_repo: "owner/name" string for commit links, or None + - local_path: Path to local git repo for file history, or None + """ + if not repo: + return None, None + + # Check if it's a local path that exists + repo_path = Path(repo) + if repo_path.exists() and (repo_path / ".git").exists(): + # Try to extract GitHub remote URL + github_repo = None + try: + result = subprocess.run( + ["git", "remote", "get-url", "origin"], + cwd=repo_path, + capture_output=True, + text=True, + ) + if result.returncode == 0: + github_repo = extract_github_repo_from_url(result.stdout.strip()) + except Exception: + pass + return github_repo, repo_path + + # Check if it's a GitHub URL + if is_url(repo): + github_repo = extract_github_repo_from_url(repo) + if github_repo: + return github_repo, None + # Not a GitHub URL, ignore + return None, None + + # Assume it's owner/name format + if "/" in repo and not repo.startswith("/"): + return repo, None + + return None, None + + def extract_text_from_content(content): """Extract plain text from message content. 
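A quick sketch of what `parse_repo_value` returns for each accepted `--repo` form (the local path is hypothetical and assumed to contain a `.git` directory):

```python
from claude_code_transcripts import parse_repo_value

parse_repo_value("owner/name")                     # -> ("owner/name", None)
parse_repo_value("https://github.com/owner/name")  # -> ("owner/name", None)
parse_repo_value(None)                             # -> (None, None)
# Local checkout: the GitHub name is read from the origin remote if one exists
parse_repo_value("/path/to/repo")                  # -> ("owner/name" or None, Path("/path/to/repo"))
```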
@@ -401,8 +501,6 @@ def _generate_project_index(project, output_dir): project_name=project["name"], sessions=sessions_data, session_count=len(sessions_data), - css=CSS, - js=JS, ) output_path = output_dir / "index.html" @@ -440,8 +538,6 @@ def _generate_master_index(projects, output_dir): projects=projects_data, total_projects=len(projects), total_sessions=total_sessions, - css=CSS, - js=JS, ) output_path = output_dir / "index.html" @@ -492,6 +588,14 @@ def _parse_jsonl_file(filepath): if obj.get("isCompactSummary"): entry["isCompactSummary"] = True + # Preserve isMeta if present (skill expansions, not real user prompts) + if obj.get("isMeta"): + entry["isMeta"] = True + + # Preserve toolUseResult if present (needed for originalFile content) + if "toolUseResult" in obj: + entry["toolUseResult"] = obj["toolUseResult"] + loglines.append(entry) except json.JSONDecodeError: continue @@ -629,10 +733,58 @@ def format_json(obj): return f"
<pre>{html.escape(str(obj))}</pre>
" +# Allowed HTML tags for markdown content - anything else gets escaped +ALLOWED_TAGS = { + # Block elements + "p", + "div", + "h1", + "h2", + "h3", + "h4", + "h5", + "h6", + "blockquote", + "pre", + "hr", + # Lists + "ul", + "ol", + "li", + # Inline elements + "a", + "strong", + "b", + "em", + "i", + "code", + "br", + "span", + # Tables + "table", + "thead", + "tbody", + "tr", + "th", + "td", +} + +ALLOWED_ATTRIBUTES = { + "a": {"href", "title"}, + "code": {"class"}, # For syntax highlighting + "pre": {"class"}, + "span": {"class"}, + "td": {"align"}, + "th": {"align"}, +} + + def render_markdown_text(text): if not text: return "" - return markdown.markdown(text, extensions=["fenced_code", "tables"]) + raw_html = markdown.markdown(text, extensions=["fenced_code", "tables"]) + # Sanitize HTML to only allow safe tags - escapes everything else + return nh3.clean(raw_html, tags=ALLOWED_TAGS, attributes=ALLOWED_ATTRIBUTES) def is_json_like(text): @@ -852,7 +1004,7 @@ def is_tool_result_message(message_data): ) -def render_message(log_type, message_json, timestamp): +def render_message(log_type, message_json, timestamp, prompt_num=None): if not message_json: return "" try: @@ -865,7 +1017,8 @@ def render_message(log_type, message_json, timestamp): if is_tool_result_message(message_data): role_class, role_label = "tool-reply", "Tool reply" else: - role_class, role_label = "user", "User" + role_class = "user" + role_label = f"User Prompt #{prompt_num}" if prompt_num else "User" elif log_type == "assistant": content_html = render_assistant_message(message_data) role_class, role_label = "assistant", "Assistant" @@ -1082,7 +1235,7 @@ def render_message(log_type, message_json, timestamp): } // Use MutationObserver to catch dynamically added content - // gistpreview.github.io may add content after initial load + // gisthost/gistpreview may add content after initial load var observer = new MutationObserver(function(mutations) { mutations.forEach(function(mutation) { mutation.addedNodes.forEach(function(node) { @@ -1104,6 +1257,70 @@ def render_message(log_type, message_json, timestamp): }); }); + // Load JS from gist (relative script srcs don't work on gistpreview) + document.querySelectorAll('script[src]').forEach(function(script) { + var src = script.getAttribute('src'); + if (src.startsWith('http')) return; // Already absolute + var jsUrl = 'https://gist.githubusercontent.com/raw/' + gistId + '/' + src; + fetch(jsUrl) + .then(function(r) { if (!r.ok) throw new Error('Failed'); return r.text(); }) + .then(function(js) { + var newScript = document.createElement('script'); + newScript.textContent = js; + document.body.appendChild(newScript); + }) + .catch(function(e) { console.error('Failed to load JS:', src, e); }); + }); + + // Rewrite relative links to work with gist preview URL format + function rewriteLinks(root) { + var scope = root || document; + var links = []; + + // Check if the root itself is a link (for MutationObserver calls) + if (scope.matches && scope.matches('a[href]')) { + links.push(scope); + } + + // Also get all descendant links + scope.querySelectorAll('a[href]').forEach(function(link) { + links.push(link); + }); + + links.forEach(function(link) { + var href = link.getAttribute('href'); + // Skip already-rewritten links (issue #26 fix) + if (href.startsWith('?')) return; + // Skip external links and anchors + if (href.startsWith('http') || href.startsWith('#') || href.startsWith('//')) return; + // Handle anchor in relative URL (e.g., page-001.html#msg-123) + var parts = 
href.split('#'); + var filename = parts[0]; + var anchor = parts.length > 1 ? '#' + parts[1] : ''; + link.setAttribute('href', '?' + gistId + '/' + filename + anchor); + }); + } + + // Run immediately + rewriteLinks(); + + // Also run on DOMContentLoaded in case DOM isn't ready yet + if (document.readyState === 'loading') { + document.addEventListener('DOMContentLoaded', function() { rewriteLinks(); }); + } + + // Use MutationObserver to catch dynamically added content + // gistpreview.github.io may add content after initial load + var observer = new MutationObserver(function(mutations) { + mutations.forEach(function(mutation) { + mutation.addedNodes.forEach(function(node) { + if (node.nodeType === 1) { // Element node + rewriteLinks(node); + } + }); + }); + }); + // Start observing once body exists function startObserving() { if (document.body) { @@ -1114,6 +1331,18 @@ def render_message(log_type, message_json, timestamp): } startObserving(); + // Execute module scripts that were injected via innerHTML + // (browsers don't execute scripts added via innerHTML for security) + document.querySelectorAll('script[type="module"]').forEach(function(script) { + if (script.src) return; // Already has src, skip + var blob = new Blob([script.textContent], { type: 'application/javascript' }); + var url = URL.createObjectURL(blob); + var newScript = document.createElement('script'); + newScript.type = 'module'; + newScript.src = url; + document.body.appendChild(newScript); + }); + // Handle fragment navigation after dynamic content loads // gisthost.github.io/gistpreview.github.io loads content dynamically, so the browser's // native fragment navigation fails because the element doesn't exist yet @@ -1142,10 +1371,29 @@ def render_message(log_type, message_json, timestamp): def inject_gist_preview_js(output_dir): - """Inject gist preview JavaScript into all HTML files in the output directory.""" + """Inject gist preview JavaScript into all HTML files in the output directory. + + Also removes inline CODE_DATA from code.html since gist version fetches it separately. + + Args: + output_dir: Path to the output directory containing HTML files. + """ output_dir = Path(output_dir) for html_file in output_dir.glob("*.html"): content = html_file.read_text(encoding="utf-8") + + # For code.html, remove the inline CODE_DATA script + # (gist version fetches code-data.json instead; the pattern below assumes the + # data is embedded as a window.CODE_DATA <script> block) + if html_file.name == "code.html": + import re + + content = re.sub( + r"<script[^>]*>\s*window\.CODE_DATA\s*=.*?</script>\s*", + "", + content, + flags=re.DOTALL, + ) + # Insert the gist preview JS before the closing </body> tag if "</body>" in content: content = content.replace( @@ -1154,22 +1402,49 @@ html_file.write_text(content, encoding="utf-8") -def create_gist(output_dir, public=False): +def create_gist(output_dir, public=False, description=None): """Create a GitHub gist from the HTML files in output_dir. - Returns the gist ID on success, or raises click.ClickException on failure. + Args: + output_dir: Directory containing the HTML files to upload. + public: Whether to create a public gist. + description: Optional description for the gist. + + Returns (gist_id, gist_url) tuple. + Raises click.ClickException on failure. + + Note: This function calls inject_gist_preview_js internally. Caller should NOT + call it separately.
""" output_dir = Path(output_dir) html_files = list(output_dir.glob("*.html")) if not html_files: raise click.ClickException("No HTML files found to upload to gist.") - # Build the gh gist create command - # gh gist create file1 file2 ... --public/--private + # Collect all files (HTML + CSS/JS + data) + css_js_files = [ + output_dir / f + for f in ["styles.css", "main.js", "search.js"] + if (output_dir / f).exists() + ] + data_files = [] + code_data = output_dir / "code-data.json" + if code_data.exists(): + data_files.append(code_data) + + all_files = sorted(html_files) + css_js_files + data_files + + # Inject gist preview JS into HTML files + inject_gist_preview_js(output_dir) + + # Create gist with all files + click.echo(f"Creating gist with {len(all_files)} files...") cmd = ["gh", "gist", "create"] - cmd.extend(str(f) for f in sorted(html_files)) + cmd.extend(str(f) for f in all_files) if public: cmd.append("--public") + if description: + cmd.extend(["--desc", description]) try: result = subprocess.run( @@ -1201,35 +1476,22 @@ def generate_index_pagination_html(total_pages): return _macros.index_pagination(total_pages) -def generate_html(json_path, output_dir, github_repo=None): - output_dir = Path(output_dir) - output_dir.mkdir(exist_ok=True) - - # Load session file (supports both JSON and JSONL) - data = parse_session_file(json_path) - - loglines = data.get("loglines", []) - - # Auto-detect GitHub repo if not provided - if github_repo is None: - github_repo = detect_github_repo(loglines) - if github_repo: - print(f"Auto-detected GitHub repo: {github_repo}") - else: - print( - "Warning: Could not auto-detect GitHub repo. Commit links will be disabled." - ) - - # Set module-level variable for render functions - global _github_repo - _github_repo = github_repo +def build_conversations(loglines): + """Build conversation dicts from loglines. 
+ Returns list of conversation dicts with keys: + - user_text: The user's prompt text + - timestamp: ISO timestamp + - messages: List of (log_type, message_json, timestamp) tuples + - is_continuation: Boolean indicating if this is a continuation + """ conversations = [] current_conv = None for entry in loglines: log_type = entry.get("type") timestamp = entry.get("timestamp", "") is_compact_summary = entry.get("isCompactSummary", False) + is_meta = entry.get("isMeta", False) message_data = entry.get("message", {}) if not message_data: continue @@ -1246,66 +1508,25 @@ def generate_html(json_path, output_dir, github_repo=None): if is_user_prompt: if current_conv: conversations.append(current_conv) + # isMeta entries (skill expansions) are continuations, not new prompts current_conv = { "user_text": user_text, "timestamp": timestamp, "messages": [(log_type, message_json, timestamp)], - "is_continuation": bool(is_compact_summary), + "is_continuation": bool(is_compact_summary or is_meta), } elif current_conv: current_conv["messages"].append((log_type, message_json, timestamp)) if current_conv: conversations.append(current_conv) + return conversations - total_convs = len(conversations) - total_pages = (total_convs + PROMPTS_PER_PAGE - 1) // PROMPTS_PER_PAGE - for page_num in range(1, total_pages + 1): - start_idx = (page_num - 1) * PROMPTS_PER_PAGE - end_idx = min(start_idx + PROMPTS_PER_PAGE, total_convs) - page_convs = conversations[start_idx:end_idx] - messages_html = [] - for conv in page_convs: - is_first = True - for log_type, message_json, timestamp in conv["messages"]: - msg_html = render_message(log_type, message_json, timestamp) - if msg_html: - # Wrap continuation summaries in collapsed details - if is_first and conv.get("is_continuation"): - msg_html = f'
<details><summary>Session continuation summary</summary>{msg_html}</details>
' - messages_html.append(msg_html) - is_first = False - pagination_html = generate_pagination_html(page_num, total_pages) - page_template = get_template("page.html") - page_content = page_template.render( - css=CSS, - js=JS, - page_num=page_num, - total_pages=total_pages, - pagination_html=pagination_html, - messages_html="".join(messages_html), - ) - (output_dir / f"page-{page_num:03d}.html").write_text( - page_content, encoding="utf-8" - ) - print(f"Generated page-{page_num:03d}.html") - - # Calculate overall stats and collect all commits for timeline - total_tool_counts = {} - total_messages = 0 - all_commits = [] # (timestamp, hash, message, page_num, conv_index) - for i, conv in enumerate(conversations): - total_messages += len(conv["messages"]) - stats = analyze_conversation(conv["messages"]) - for tool, count in stats["tool_counts"].items(): - total_tool_counts[tool] = total_tool_counts.get(tool, 0) + count - page_num = (i // PROMPTS_PER_PAGE) + 1 - for commit_hash, commit_msg, commit_ts in stats["commits"]: - all_commits.append((commit_ts, commit_hash, commit_msg, page_num, i)) - total_tool_calls = sum(total_tool_counts.values()) - total_commits = len(all_commits) +def build_timeline_items(conversations, all_commits, github_repo): + """Build sorted timeline HTML items from conversations and commits. - # Build timeline items: prompts and commits merged by timestamp + Returns list of HTML strings sorted by timestamp. + """ timeline_items = [] # Add prompts @@ -1348,33 +1569,252 @@ def generate_html(json_path, output_dir, github_repo=None): # Add commits as separate timeline items for commit_ts, commit_hash, commit_msg, page_num, conv_idx in all_commits: item_html = _macros.index_commit( - commit_hash, commit_msg, commit_ts, _github_repo + commit_hash, commit_msg, commit_ts, github_repo ) timeline_items.append((commit_ts, "commit", item_html)) - # Sort by timestamp + # Sort by timestamp and return just the HTML timeline_items.sort(key=lambda x: x[0]) - index_items = [item[2] for item in timeline_items] + return [item[2] for item in timeline_items], prompt_num + + +def _generate_html_impl( + loglines, + output_dir, + github_repo=None, + code_view=False, + exclude_deleted_files=False, + echo=None, +): + """Internal implementation of HTML generation from loglines. 
+ + Args: + loglines: List of log entries + output_dir: Path to output directory + github_repo: Optional GitHub repo string (e.g., "owner/repo") + code_view: Whether to generate code view + exclude_deleted_files: Whether to filter deleted files from code view + echo: Function to use for output (print or click.echo wrapper) + """ + if echo is None: + echo = print + + output_dir = Path(output_dir) + + # Set module-level variable for render functions + global _github_repo + _github_repo = github_repo + + conversations = build_conversations(loglines) + + total_convs = len(conversations) + total_pages = (total_convs + PROMPTS_PER_PAGE - 1) // PROMPTS_PER_PAGE + + # Determine if code view will be generated (for tab navigation) + has_code_view = False + file_operations = None + if code_view: + file_operations = extract_file_operations(loglines, conversations) + # Optionally filter out files that no longer exist on disk + if exclude_deleted_files and file_operations: + file_operations = filter_deleted_files(file_operations) + has_code_view = len(file_operations) > 0 + + # Collect all messages HTML for the code view transcript pane + all_messages_html = [] + # Collect messages per page for potential page-data.json + page_messages_dict = {} + + # Track prompt number across all pages + prompt_num = 0 + + for page_num in range(1, total_pages + 1): + start_idx = (page_num - 1) * PROMPTS_PER_PAGE + end_idx = min(start_idx + PROMPTS_PER_PAGE, total_convs) + page_convs = conversations[start_idx:end_idx] + messages_html = [] + # Count total messages for this page for progress display + total_page_messages = sum(len(c["messages"]) for c in page_convs) + msg_count = 0 + for conv in page_convs: + is_first = True + for log_type, message_json, timestamp in conv["messages"]: + msg_count += 1 + if total_page_messages > 50: + echo( + f"\rPage {page_num}/{total_pages}: rendering message {msg_count}/{total_page_messages}...", + end="", + ) + # Track prompt number for user messages (not tool results) + current_prompt_num = None + if log_type == "user" and message_json: + try: + message_data = json.loads(message_json) + if not is_tool_result_message(message_data): + prompt_num += 1 + current_prompt_num = prompt_num + except json.JSONDecodeError: + pass + msg_html = render_message( + log_type, message_json, timestamp, current_prompt_num + ) + if msg_html: + # Wrap continuation summaries in collapsed details + if is_first and conv.get("is_continuation"): + msg_html = f'
<details><summary>Session continuation summary</summary>{msg_html}</details>
' + messages_html.append(msg_html) + is_first = False + if total_page_messages > 50: + echo("\r" + " " * 60 + "\r", end="") # Clear the progress line + + # Store messages for this page + page_messages_dict[str(page_num)] = "".join(messages_html) + + # Collect all messages for code view transcript pane + all_messages_html.extend(messages_html) + + # Generate page HTML files + for page_num in range(1, total_pages + 1): + pagination_html = generate_pagination_html(page_num, total_pages) + page_template = get_template("page.html") + page_content = page_template.render( + page_num=page_num, + total_pages=total_pages, + pagination_html=pagination_html, + messages_html=page_messages_dict[str(page_num)], + has_code_view=has_code_view, + active_tab="transcript", + use_page_data_json=False, + use_external_assets=False, + ) + (output_dir / f"page-{page_num:03d}.html").write_text( + page_content, encoding="utf-8" + ) + echo(f"Generated page-{page_num:03d}.html") + + # Calculate overall stats and collect all commits for timeline + total_tool_counts = {} + total_messages = 0 + all_commits = [] # (timestamp, hash, message, page_num, conv_index) + for i, conv in enumerate(conversations): + total_messages += len(conv["messages"]) + stats = analyze_conversation(conv["messages"]) + for tool, count in stats["tool_counts"].items(): + total_tool_counts[tool] = total_tool_counts.get(tool, 0) + count + page_num = (i // PROMPTS_PER_PAGE) + 1 + for commit_hash, commit_msg, commit_ts in stats["commits"]: + all_commits.append((commit_ts, commit_hash, commit_msg, page_num, i)) + total_tool_calls = sum(total_tool_counts.values()) + total_commits = len(all_commits) + + # Build timeline items using helper + index_items, final_prompt_num = build_timeline_items( + conversations, all_commits, github_repo + ) + index_items_html = "".join(index_items) index_pagination = generate_index_pagination_html(total_pages) index_template = get_template("index.html") index_content = index_template.render( - css=CSS, - js=JS, pagination_html=index_pagination, - prompt_num=prompt_num, + prompt_num=final_prompt_num, total_messages=total_messages, total_tool_calls=total_tool_calls, total_commits=total_commits, total_pages=total_pages, - index_items_html="".join(index_items), + index_items_html=index_items_html, + has_code_view=has_code_view, + active_tab="transcript", + use_index_data_json=False, + use_external_assets=False, ) index_path = output_dir / "index.html" index_path.write_text(index_content, encoding="utf-8") - print( + echo( f"Generated {index_path.resolve()} ({total_convs} prompts, {total_pages} pages)" ) + # Generate code view if requested + if has_code_view: + num_ops = len(file_operations) + num_files = len(set(op.file_path for op in file_operations)) + + last_phase = [None] # Use list to allow mutation in nested function + + def code_view_progress(phase, current, total): + # Clear line when switching phases + if last_phase[0] and last_phase[0] != phase: + echo("\r" + " " * 60 + "\r", end="") + last_phase[0] = phase + + if phase == "operations" and num_ops > 20: + echo( + f"\rCode view: replaying operation {current}/{total}...", + end="", + ) + elif phase == "files" and num_files > 5: + echo( + f"\rCode view: processing file {current}/{total}...", + end="", + ) + + msg_to_user_html, msg_to_context_id, msg_to_prompt_num = build_msg_to_user_html( + conversations + ) + generate_code_view_html( + output_dir, + file_operations, + transcript_messages=all_messages_html, + msg_to_user_html=msg_to_user_html, + 
msg_to_context_id=msg_to_context_id, + msg_to_prompt_num=msg_to_prompt_num, + total_pages=total_pages, + progress_callback=code_view_progress, + ) + # Clear progress line + if num_ops > 20 or num_files > 5: + echo("\r" + " " * 60 + "\r", end="") + echo(f"Generated code.html ({num_files} files)") + + +def generate_html( + json_path, + output_dir, + github_repo=None, + code_view=False, + exclude_deleted_files=False, +): + """Generate HTML from a session file path.""" + output_dir = Path(output_dir) + output_dir.mkdir(exist_ok=True) + + # Load session file (supports both JSON and JSONL) + data = parse_session_file(json_path) + loglines = data.get("loglines", []) + + # Auto-detect GitHub repo if not provided + if github_repo is None: + github_repo = detect_github_repo(loglines) + if github_repo: + print(f"Auto-detected GitHub repo: {github_repo}") + else: + print( + "Warning: Could not auto-detect GitHub repo. Commit links will be disabled." + ) + + # Use print with flush for progress output + def echo(msg, end="\n"): + print(msg, end=end, flush=True) + + _generate_html_impl( + loglines, + output_dir, + github_repo=github_repo, + code_view=code_view, + exclude_deleted_files=exclude_deleted_files, + echo=echo, + ) + @click.group(cls=DefaultGroup, default="local", default_if_no_args=True) @click.version_option(None, "-v", "--version", package_name="claude-code-transcripts") @@ -1398,7 +1838,7 @@ def cli(): ) @click.option( "--repo", - help="GitHub repo (owner/name) for commit links. Auto-detected from git push output if not specified.", + help="Git repo: local path, GitHub URL, or owner/name. Used for commit links and code viewer file history.", ) @click.option( "--gist", @@ -1422,7 +1862,27 @@ def cli(): default=10, help="Maximum number of sessions to show (default: 10)", ) -def local_cmd(output, output_auto, repo, gist, include_json, open_browser, limit): +@click.option( + "--code-view", + is_flag=True, + help="Generate a code viewer tab showing files modified during the session.", +) +@click.option( + "--exclude-deleted-files", + is_flag=True, + help="Exclude files that no longer exist on disk from the code viewer.", +) +def local_cmd( + output, + output_auto, + repo, + gist, + include_json, + open_browser, + limit, + code_view, + exclude_deleted_files, +): """Select and convert a local Claude Code session to HTML.""" projects_folder = Path.home() / ".claude" / "projects" @@ -1473,7 +1933,15 @@ def local_cmd(output, output_auto, repo, gist, include_json, open_browser, limit output = Path(tempfile.gettempdir()) / f"claude-session-{session_file.stem}" output = Path(output) - generate_html(session_file, output, github_repo=repo) + # Parse --repo to get GitHub repo name + github_repo, _ = parse_repo_value(repo) + generate_html( + session_file, + output, + github_repo=github_repo, + code_view=code_view, + exclude_deleted_files=exclude_deleted_files, + ) # Show output directory click.echo(f"Output: {output.resolve()}") @@ -1487,10 +1955,10 @@ def local_cmd(output, output_auto, repo, gist, include_json, open_browser, limit click.echo(f"JSONL: {json_dest} ({json_size_kb:.1f} KB)") if gist: - # Inject gist preview JS and create gist - inject_gist_preview_js(output) + # Create gist (handles inject_gist_preview_js internally) click.echo("Creating GitHub gist...") - gist_id, gist_url = create_gist(output) + gist_desc = f"claude-code-transcripts local {session_file.stem}" + gist_id, gist_url = create_gist(output, description=gist_desc) preview_url = f"https://gisthost.github.io/?{gist_id}/index.html" 
click.echo(f"Gist: {gist_url}") click.echo(f"Preview: {preview_url}") @@ -1555,7 +2023,7 @@ def fetch_url_to_tempfile(url): ) @click.option( "--repo", - help="GitHub repo (owner/name) for commit links. Auto-detected from git push output if not specified.", + help="Git repo: local path, GitHub URL, or owner/name. Used for commit links and code viewer file history.", ) @click.option( "--gist", @@ -1574,9 +2042,30 @@ def fetch_url_to_tempfile(url): is_flag=True, help="Open the generated index.html in your default browser (default if no -o specified).", ) -def json_cmd(json_file, output, output_auto, repo, gist, include_json, open_browser): +@click.option( + "--code-view", + is_flag=True, + help="Generate a code viewer tab showing files modified during the session.", +) +@click.option( + "--exclude-deleted-files", + is_flag=True, + help="Exclude files that no longer exist on disk from the code viewer.", +) +def json_cmd( + json_file, + output, + output_auto, + repo, + gist, + include_json, + open_browser, + code_view, + exclude_deleted_files, +): """Convert a Claude Code session JSON/JSONL file or URL to HTML.""" # Handle URL input + original_input = json_file if is_url(json_file): click.echo(f"Fetching {json_file}...") temp_file = fetch_url_to_tempfile(json_file) @@ -1590,6 +2079,9 @@ def json_cmd(json_file, output, output_auto, repo, gist, include_json, open_brow raise click.ClickException(f"File not found: {json_file}") url_name = None + # Parse --repo to get GitHub repo name + github_repo, _ = parse_repo_value(repo) + # Determine output directory and whether to open browser # If no -o specified, use temp dir and open browser by default auto_open = output is None and not gist and not output_auto @@ -1604,24 +2096,43 @@ def json_cmd(json_file, output, output_auto, repo, gist, include_json, open_brow ) output = Path(output) - generate_html(json_file_path, output, github_repo=repo) + generate_html( + json_file_path, + output, + github_repo=github_repo, + code_view=code_view, + exclude_deleted_files=exclude_deleted_files, + ) # Show output directory click.echo(f"Output: {output.resolve()}") # Copy JSON file to output directory if requested - if include_json: + if include_json and not is_url(original_input): output.mkdir(exist_ok=True) json_dest = output / json_file_path.name shutil.copy(json_file_path, json_dest) json_size_kb = json_dest.stat().st_size / 1024 click.echo(f"JSON: {json_dest} ({json_size_kb:.1f} KB)") + elif include_json and is_url(original_input): + # For URLs, copy the temp file with a meaningful name + output.mkdir(exist_ok=True) + url_name = Path(original_input.split("?")[0]).name or "session.jsonl" + json_dest = output / url_name + shutil.copy(json_file, json_dest) + json_size_kb = json_dest.stat().st_size / 1024 + click.echo(f"JSON: {json_dest} ({json_size_kb:.1f} KB)") if gist: - # Inject gist preview JS and create gist - inject_gist_preview_js(output) + # Create gist (handles inject_gist_preview_js internally) click.echo("Creating GitHub gist...") - gist_id, gist_url = create_gist(output) + # Use filename/URL for description + if is_url(original_input): + input_name = Path(original_input.split("?")[0]).name or "session" + else: + input_name = Path(original_input).stem + gist_desc = f"claude-code-transcripts json {input_name}" + gist_id, gist_url = create_gist(output, description=gist_desc) preview_url = f"https://gisthost.github.io/?{gist_id}/index.html" click.echo(f"Gist: {gist_url}") click.echo(f"Preview: {preview_url}") @@ -1677,7 +2188,13 @@ def 
format_session_for_display(session_data): return f"{session_id} {created_at[:19] if created_at else 'N/A':19} {title}" -def generate_html_from_session_data(session_data, output_dir, github_repo=None): +def generate_html_from_session_data( + session_data, + output_dir, + github_repo=None, + code_view=False, + exclude_deleted_files=False, +): """Generate HTML from session data dict (instead of file path).""" output_dir = Path(output_dir) output_dir.mkdir(exist_ok=True, parents=True) @@ -1690,159 +2207,17 @@ def generate_html_from_session_data(session_data, output_dir, github_repo=None): if github_repo: click.echo(f"Auto-detected GitHub repo: {github_repo}") - # Set module-level variable for render functions - global _github_repo - _github_repo = github_repo - - conversations = [] - current_conv = None - for entry in loglines: - log_type = entry.get("type") - timestamp = entry.get("timestamp", "") - is_compact_summary = entry.get("isCompactSummary", False) - message_data = entry.get("message", {}) - if not message_data: - continue - # Convert message dict to JSON string for compatibility with existing render functions - message_json = json.dumps(message_data) - is_user_prompt = False - user_text = None - if log_type == "user": - content = message_data.get("content", "") - text = extract_text_from_content(content) - if text: - is_user_prompt = True - user_text = text - if is_user_prompt: - if current_conv: - conversations.append(current_conv) - current_conv = { - "user_text": user_text, - "timestamp": timestamp, - "messages": [(log_type, message_json, timestamp)], - "is_continuation": bool(is_compact_summary), - } - elif current_conv: - current_conv["messages"].append((log_type, message_json, timestamp)) - if current_conv: - conversations.append(current_conv) - - total_convs = len(conversations) - total_pages = (total_convs + PROMPTS_PER_PAGE - 1) // PROMPTS_PER_PAGE - - for page_num in range(1, total_pages + 1): - start_idx = (page_num - 1) * PROMPTS_PER_PAGE - end_idx = min(start_idx + PROMPTS_PER_PAGE, total_convs) - page_convs = conversations[start_idx:end_idx] - messages_html = [] - for conv in page_convs: - is_first = True - for log_type, message_json, timestamp in conv["messages"]: - msg_html = render_message(log_type, message_json, timestamp) - if msg_html: - # Wrap continuation summaries in collapsed details - if is_first and conv.get("is_continuation"): - msg_html = f'
<details><summary>Session continuation summary</summary>{msg_html}</details>
' - messages_html.append(msg_html) - is_first = False - pagination_html = generate_pagination_html(page_num, total_pages) - page_template = get_template("page.html") - page_content = page_template.render( - css=CSS, - js=JS, - page_num=page_num, - total_pages=total_pages, - pagination_html=pagination_html, - messages_html="".join(messages_html), - ) - (output_dir / f"page-{page_num:03d}.html").write_text( - page_content, encoding="utf-8" - ) - click.echo(f"Generated page-{page_num:03d}.html") - - # Calculate overall stats and collect all commits for timeline - total_tool_counts = {} - total_messages = 0 - all_commits = [] # (timestamp, hash, message, page_num, conv_index) - for i, conv in enumerate(conversations): - total_messages += len(conv["messages"]) - stats = analyze_conversation(conv["messages"]) - for tool, count in stats["tool_counts"].items(): - total_tool_counts[tool] = total_tool_counts.get(tool, 0) + count - page_num = (i // PROMPTS_PER_PAGE) + 1 - for commit_hash, commit_msg, commit_ts in stats["commits"]: - all_commits.append((commit_ts, commit_hash, commit_msg, page_num, i)) - total_tool_calls = sum(total_tool_counts.values()) - total_commits = len(all_commits) - - # Build timeline items: prompts and commits merged by timestamp - timeline_items = [] - - # Add prompts - prompt_num = 0 - for i, conv in enumerate(conversations): - if conv.get("is_continuation"): - continue - if conv["user_text"].startswith("Stop hook feedback:"): - continue - prompt_num += 1 - page_num = (i // PROMPTS_PER_PAGE) + 1 - msg_id = make_msg_id(conv["timestamp"]) - link = f"page-{page_num:03d}.html#{msg_id}" - rendered_content = render_markdown_text(conv["user_text"]) - - # Collect all messages including from subsequent continuation conversations - # This ensures long_texts from continuations appear with the original prompt - all_messages = list(conv["messages"]) - for j in range(i + 1, len(conversations)): - if not conversations[j].get("is_continuation"): - break - all_messages.extend(conversations[j]["messages"]) - - # Analyze conversation for stats (excluding commits from inline display now) - stats = analyze_conversation(all_messages) - tool_stats_str = format_tool_stats(stats["tool_counts"]) - - long_texts_html = "" - for lt in stats["long_texts"]: - rendered_lt = render_markdown_text(lt) - long_texts_html += _macros.index_long_text(rendered_lt) - - stats_html = _macros.index_stats(tool_stats_str, long_texts_html) - - item_html = _macros.index_item( - prompt_num, link, conv["timestamp"], rendered_content, stats_html - ) - timeline_items.append((conv["timestamp"], "prompt", item_html)) - - # Add commits as separate timeline items - for commit_ts, commit_hash, commit_msg, page_num, conv_idx in all_commits: - item_html = _macros.index_commit( - commit_hash, commit_msg, commit_ts, _github_repo - ) - timeline_items.append((commit_ts, "commit", item_html)) - - # Sort by timestamp - timeline_items.sort(key=lambda x: x[0]) - index_items = [item[2] for item in timeline_items] - - index_pagination = generate_index_pagination_html(total_pages) - index_template = get_template("index.html") - index_content = index_template.render( - css=CSS, - js=JS, - pagination_html=index_pagination, - prompt_num=prompt_num, - total_messages=total_messages, - total_tool_calls=total_tool_calls, - total_commits=total_commits, - total_pages=total_pages, - index_items_html="".join(index_items), - ) - index_path = output_dir / "index.html" - index_path.write_text(index_content, encoding="utf-8") - click.echo( - f"Generated 
{index_path.resolve()} ({total_convs} prompts, {total_pages} pages)" + # Use click.echo for progress output + def echo(msg, end="\n"): + click.echo(msg, nl=(end == "\n")) + + _generate_html_impl( + loglines, + output_dir, + github_repo=github_repo, + code_view=code_view, + exclude_deleted_files=exclude_deleted_files, + echo=echo, ) @@ -1866,7 +2241,7 @@ def generate_html_from_session_data(session_data, output_dir, github_repo=None): ) @click.option( "--repo", - help="GitHub repo (owner/name) for commit links. Auto-detected from git push output if not specified.", + help="Git repo: local path, GitHub URL, or owner/name. Used for commit links and code viewer file history.", ) @click.option( "--gist", @@ -1885,6 +2260,11 @@ def generate_html_from_session_data(session_data, output_dir, github_repo=None): is_flag=True, help="Open the generated index.html in your default browser (default if no -o specified).", ) +@click.option( + "--code-view", + is_flag=True, + help="Generate a code viewer tab showing files modified during the session.", +) def web_cmd( session_id, output, @@ -1895,6 +2275,7 @@ def web_cmd( gist, include_json, open_browser, + code_view, ): """Select and convert a web session from the Claude API to HTML. @@ -1966,7 +2347,14 @@ def web_cmd( output = Path(output) click.echo(f"Generating HTML in {output}/...") - generate_html_from_session_data(session_data, output, github_repo=repo) + # Parse --repo to get GitHub repo name + github_repo, _ = parse_repo_value(repo) + generate_html_from_session_data( + session_data, + output, + github_repo=github_repo, + code_view=code_view, + ) # Show output directory click.echo(f"Output: {output.resolve()}") @@ -1981,10 +2369,10 @@ def web_cmd( click.echo(f"JSON: {json_dest} ({json_size_kb:.1f} KB)") if gist: - # Inject gist preview JS and create gist - inject_gist_preview_js(output) + # Create gist (handles inject_gist_preview_js internally) click.echo("Creating GitHub gist...") - gist_id, gist_url = create_gist(output) + gist_desc = f"claude-code-transcripts web {session_id}" + gist_id, gist_url = create_gist(output, description=gist_desc) preview_url = f"https://gisthost.github.io/?{gist_id}/index.html" click.echo(f"Gist: {gist_url}") click.echo(f"Preview: {preview_url}") diff --git a/src/claude_code_transcripts/code_view.py b/src/claude_code_transcripts/code_view.py new file mode 100644 index 0000000..9de3052 --- /dev/null +++ b/src/claude_code_transcripts/code_view.py @@ -0,0 +1,1701 @@ +"""Code viewer functionality for Claude Code transcripts. + +This module handles the three-pane code viewer with git-based blame annotations. +""" + +import html +import json +import os +import re +import shutil +import tempfile +from dataclasses import dataclass, field +from datetime import datetime +from pathlib import Path +from typing import Optional, List, Tuple, Dict, Any, Set + +from git import Repo +from git.exc import InvalidGitRepositoryError + + +# ============================================================================ +# Helper Functions +# ============================================================================ + + +def group_operations_by_file( + operations: List["FileOperation"], +) -> Dict[str, List["FileOperation"]]: + """Group operations by file path and sort each group by timestamp. + + Args: + operations: List of FileOperation objects. + + Returns: + Dict mapping file paths to lists of FileOperation objects, sorted by timestamp. 
+ """ + file_ops: Dict[str, List["FileOperation"]] = {} + for op in operations: + if op.file_path not in file_ops: + file_ops[op.file_path] = [] + file_ops[op.file_path].append(op) + + # Sort each file's operations by timestamp + for ops in file_ops.values(): + ops.sort(key=lambda o: o.timestamp) + + return file_ops + + +def read_blob(tree, file_path: str, decode: bool = True) -> Optional[str | bytes]: + """Read file content from a git tree/commit. + + Args: + tree: Git tree object (e.g., commit.tree). + file_path: Relative path to the file within the repo. + decode: If True, decode as UTF-8 string; if False, return raw bytes. + + Returns: + File content as string (if decode=True) or bytes (if decode=False), + or None if not found. + """ + try: + blob = tree / file_path + data = blob.data_stream.read() + return data.decode("utf-8") if decode else data + except (KeyError, TypeError, ValueError): + return None + + +# Backwards-compatible aliases +def read_blob_content(tree, file_path: str) -> Optional[str]: + """Read file content from a git tree/commit as string.""" + return read_blob(tree, file_path, decode=True) + + +def read_blob_bytes(tree, file_path: str) -> Optional[bytes]: + """Read file content from a git tree/commit as bytes.""" + return read_blob(tree, file_path, decode=False) + + +def parse_iso_timestamp(timestamp: str) -> Optional[datetime]: + """Parse ISO timestamp string to datetime with UTC timezone. + + Handles 'Z' suffix by converting to '+00:00' format. + + Args: + timestamp: ISO format timestamp (e.g., "2025-12-27T16:12:36.904Z"). + + Returns: + datetime object, or None on parse failure. + """ + try: + ts = timestamp.replace("Z", "+00:00") + return datetime.fromisoformat(ts) + except ValueError: + return None + + +# ============================================================================ +# Constants +# ============================================================================ + +# Operation types for file operations +OP_WRITE = "write" +OP_EDIT = "edit" +OP_DELETE = "delete" + +# File status for tree display +STATUS_ADDED = "added" +STATUS_MODIFIED = "modified" + +# Regex patterns for rm commands +# Matches: rm, rm -f, rm -r, rm -rf, rm -fr, etc. 
+RM_COMMAND_PATTERN = re.compile(r"^\s*rm\s+(?:-[rfivI]+\s+)*(.+)$") + + +# ============================================================================ +# Data Structures +# ============================================================================ + + +@dataclass +class FileOperation: + """Represents a single Write or Edit operation on a file.""" + + file_path: str + operation_type: str # "write", "edit", or "delete" + tool_id: str # tool_use.id for linking + timestamp: str + page_num: int # which page this operation appears on + msg_id: str # anchor ID in the HTML page + + # For Write operations + content: Optional[str] = None + + # For Edit operations + old_string: Optional[str] = None + new_string: Optional[str] = None + replace_all: bool = False + + # For Delete operations + is_recursive: bool = False # True for directory deletes (rm -r) + + # Original file content from tool result (for Edit operations) + # This allows reconstruction without local file access + original_content: Optional[str] = None + + +@dataclass +class FileState: + """Represents the reconstructed state of a file with blame annotations.""" + + file_path: str + operations: List[FileOperation] = field(default_factory=list) + + # If we have a git repo, we can reconstruct full content + initial_content: Optional[str] = None # From git or first Write + final_content: Optional[str] = None # Reconstructed content + + # Blame data: list of (line_text, FileOperation or None) + # None means the line came from initial_content (pre-session) + blame_lines: List[Tuple[str, Optional[FileOperation]]] = field(default_factory=list) + + # For diff-only mode when no repo is available + diff_only: bool = False + + # File status: "added" (first op is Write), "modified" (first op is Edit) + status: str = "modified" + + +@dataclass +class CodeViewData: + """All data needed to render the code viewer.""" + + files: Dict[str, FileState] = field(default_factory=dict) # file_path -> FileState + file_tree: Dict[str, Any] = field(default_factory=dict) # Nested dict for file tree + mode: str = "diff_only" # "full" or "diff_only" + repo_path: Optional[str] = None + session_cwd: Optional[str] = None + + +@dataclass +class BlameRange: + """A range of consecutive lines from the same operation.""" + + start_line: int # 1-indexed + end_line: int # 1-indexed, inclusive + tool_id: Optional[str] + page_num: int + msg_id: str + operation_type: str # "write" or "edit" + timestamp: str + + +# ============================================================================ +# Code Viewer Functions +# ============================================================================ + + +def extract_deleted_paths_from_bash(command: str) -> List[str]: + """Extract file paths deleted by an rm command. + + Handles various rm forms: + - rm file.py + - rm -f file.py + - rm -rf /path/to/dir + - rm "file with spaces.py" + - rm 'file.py' + + Args: + command: The bash command string. + + Returns: + List of file paths that would be deleted by this command. 
+ """ + paths = [] + + # Check if this is an rm command + match = RM_COMMAND_PATTERN.match(command) + if not match: + return paths + + # Get the path arguments part + args_str = match.group(1).strip() + + # Parse paths - handle quoted and unquoted paths + # Simple approach: split on spaces but respect quotes + current_path = "" + in_quotes = None + i = 0 + + while i < len(args_str): + char = args_str[i] + + if in_quotes: + if char == in_quotes: + # End of quoted string + if current_path: + paths.append(current_path) + current_path = "" + in_quotes = None + else: + current_path += char + elif char in ('"', "'"): + # Start of quoted string + in_quotes = char + elif char == " ": + # Space outside quotes - end of path + if current_path: + paths.append(current_path) + current_path = "" + else: + current_path += char + + i += 1 + + # Don't forget the last path if not quoted + if current_path: + paths.append(current_path) + + return paths + + +def extract_file_operations( + loglines: List[Dict], + conversations: List[Dict], + prompts_per_page: int = 5, +) -> List[FileOperation]: + """Extract all Write, Edit, and Delete operations from session loglines. + + Delete operations are extracted from Bash rm commands. Files that are + ultimately deleted will be filtered out when the operations are replayed + in the git repo (deleted files won't exist in the final state). + + Args: + loglines: List of parsed logline entries from the session. + conversations: List of conversation dicts with page mapping info. + prompts_per_page: Number of prompts per page for pagination. + + Returns: + List of FileOperation objects sorted by timestamp. + """ + operations = [] + + # Build a mapping from message content to page number and message ID + # We need to track which page each operation appears on + msg_to_page = {} + for conv_idx, conv in enumerate(conversations): + page_num = (conv_idx // prompts_per_page) + 1 + for msg_idx, (log_type, message_json, timestamp) in enumerate( + conv.get("messages", []) + ): + # Generate a unique ID matching the HTML message IDs + msg_id = f"msg-{timestamp.replace(':', '-').replace('.', '-')}" + # Store timestamp -> (page_num, msg_id) mapping + msg_to_page[timestamp] = (page_num, msg_id) + + # First pass: collect originalFile content from tool results + # These are stored in the toolUseResult field of user messages + tool_id_to_original = {} + for entry in loglines: + tool_use_result = entry.get("toolUseResult", {}) + if tool_use_result and "originalFile" in tool_use_result: + # Find the matching tool_use_id from the message content + message = entry.get("message", {}) + content = message.get("content", []) + if isinstance(content, list): + for block in content: + if isinstance(block, dict) and block.get("type") == "tool_result": + tool_use_id = block.get("tool_use_id", "") + if tool_use_id: + tool_id_to_original[tool_use_id] = tool_use_result.get( + "originalFile" + ) + + for entry in loglines: + timestamp = entry.get("timestamp", "") + message = entry.get("message", {}) + content = message.get("content", []) + + if not isinstance(content, list): + continue + + for block in content: + if not isinstance(block, dict): + continue + + if block.get("type") != "tool_use": + continue + + tool_name = block.get("name", "") + tool_id = block.get("id", "") + tool_input = block.get("input", {}) + + # Get page and message ID from our mapping + fallback_msg_id = f"msg-{timestamp.replace(':', '-').replace('.', '-')}" + page_num, msg_id = msg_to_page.get(timestamp, (1, fallback_msg_id)) + + if 
tool_name == "Write": + file_path = tool_input.get("file_path", "") + file_content = tool_input.get("content", "") + + if file_path: + operations.append( + FileOperation( + file_path=file_path, + operation_type=OP_WRITE, + tool_id=tool_id, + timestamp=timestamp, + page_num=page_num, + msg_id=msg_id, + content=file_content, + ) + ) + + elif tool_name == "Edit": + file_path = tool_input.get("file_path", "") + old_string = tool_input.get("old_string", "") + new_string = tool_input.get("new_string", "") + replace_all = tool_input.get("replace_all", False) + + if file_path and old_string is not None and new_string is not None: + # Get original file content if available from tool result + original_content = tool_id_to_original.get(tool_id) + + operations.append( + FileOperation( + file_path=file_path, + operation_type=OP_EDIT, + tool_id=tool_id, + timestamp=timestamp, + page_num=page_num, + msg_id=msg_id, + old_string=old_string, + new_string=new_string, + replace_all=replace_all, + original_content=original_content, + ) + ) + + elif tool_name == "Bash": + # Extract delete operations from rm commands + command = tool_input.get("command", "") + deleted_paths = extract_deleted_paths_from_bash(command) + is_recursive = "-r" in command + + for path in deleted_paths: + operations.append( + FileOperation( + file_path=path, + operation_type=OP_DELETE, + tool_id=tool_id, + timestamp=timestamp, + page_num=page_num, + msg_id=msg_id, + is_recursive=is_recursive, + ) + ) + + # Sort by timestamp + operations.sort(key=lambda op: op.timestamp) + + return operations + + +def filter_deleted_files(operations: List[FileOperation]) -> List[FileOperation]: + """Filter out operations for files that no longer exist on disk. + + This is used with the --exclude-deleted-files flag to filter out files + that were modified during the session but have since been deleted + (outside of the session or by commands we didn't detect). + + Only checks absolute paths - relative paths are left as-is since we can't + reliably determine where they are. + + Args: + operations: List of FileOperation objects. + + Returns: + Filtered list excluding operations for files that don't exist. + """ + if not operations: + return operations + + # Get unique file paths from Write/Edit operations (not Delete) + file_paths = set( + op.file_path for op in operations if op.operation_type in (OP_WRITE, OP_EDIT) + ) + + # Check which files exist (only for absolute paths) + missing_files: Set[str] = set() + for file_path in file_paths: + if os.path.isabs(file_path) and not os.path.exists(file_path): + missing_files.add(file_path) + + if not missing_files: + return operations + + # Filter out operations for missing files + return [op for op in operations if op.file_path not in missing_files] + + +def normalize_file_paths(operations: List[FileOperation]) -> Tuple[str, Dict[str, str]]: + """Find common prefix in file paths and create normalized relative paths. + + Args: + operations: List of FileOperation objects. + + Returns: + Tuple of (common_prefix, path_mapping) where path_mapping maps + original absolute paths to normalized relative paths. 
+ """ + if not operations: + return "", {} + + # Get all unique file paths + file_paths = list(set(op.file_path for op in operations)) + + if len(file_paths) == 1: + # Single file - use its parent as prefix + path = Path(file_paths[0]) + prefix = str(path.parent) + return prefix, {file_paths[0]: path.name} + + # Find common prefix + common = os.path.commonpath(file_paths) + # Make sure we're at a directory boundary + if not os.path.isdir(common): + common = os.path.dirname(common) + + # Create mapping + path_mapping = {} + for fp in file_paths: + rel_path = os.path.relpath(fp, common) + path_mapping[fp] = rel_path + + return common, path_mapping + + +def find_git_repo_root(start_path: str) -> Optional[Path]: + """Walk up from start_path to find a git repository root. + + Args: + start_path: Directory path to start searching from. + + Returns: + Path to the git repo root, or None if not found. + """ + current = Path(start_path) + while current != current.parent: + if (current / ".git").exists(): + return current + current = current.parent + return None + + +def find_commit_before_timestamp(file_repo: Repo, timestamp: str) -> Optional[Any]: + """Find the most recent commit before the given ISO timestamp. + + Args: + file_repo: GitPython Repo object. + timestamp: ISO format timestamp (e.g., "2025-12-27T16:12:36.904Z"). + + Returns: + Git commit object, or None if not found. + """ + target_dt = parse_iso_timestamp(timestamp) + if target_dt is None: + return None + + # Search through commits to find one before the target time + try: + for commit in file_repo.iter_commits(): + commit_dt = datetime.fromtimestamp( + commit.committed_date, tz=target_dt.tzinfo + ) + if commit_dt < target_dt: + return commit + except Exception: + pass + + return None + + +def get_commits_during_session( + file_repo: Repo, start_timestamp: str, end_timestamp: str +) -> List[Any]: + """Get all commits that happened during the session timeframe. + + Args: + file_repo: GitPython Repo object. + start_timestamp: ISO format timestamp for session start. + end_timestamp: ISO format timestamp for session end. + + Returns: + List of commit objects in chronological order (oldest first). + """ + from datetime import timezone + + start_dt = parse_iso_timestamp(start_timestamp) + end_dt = parse_iso_timestamp(end_timestamp) + if start_dt is None or end_dt is None: + return [] + + commits = [] + + try: + for commit in file_repo.iter_commits(): + commit_dt = datetime.fromtimestamp(commit.committed_date, tz=timezone.utc) + + # Skip commits after session end + if commit_dt > end_dt: + continue + + # Stop when we reach commits before session start + if commit_dt < start_dt: + break + + commits.append(commit) + + except Exception: + pass + + # Return in chronological order (oldest first) + return list(reversed(commits)) + + +def find_file_content_at_timestamp( + file_repo: Repo, file_rel_path: str, timestamp: str, session_commits: List[Any] +) -> Optional[str]: + """Find the file content from the most recent commit at or before the timestamp. + + Args: + file_repo: GitPython Repo object. + file_rel_path: Relative path to the file within the repo. + timestamp: ISO format timestamp to search for. + session_commits: List of commits during the session (chronological order). + + Returns: + File content as string, or None if not found. 
+ """ + from datetime import timezone + + target_dt = parse_iso_timestamp(timestamp) + if target_dt is None: + return None + + try: + # Find the most recent commit at or before the target timestamp + best_commit = None + for commit in session_commits: + commit_dt = datetime.fromtimestamp(commit.committed_date, tz=timezone.utc) + if commit_dt <= target_dt: + best_commit = commit + else: + break # Commits are chronological, so we can stop + + if best_commit: + content = read_blob_content(best_commit.tree, file_rel_path) + if content is not None: + return content + + except Exception: + pass + + return None + + +def _init_temp_repo() -> Tuple[Repo, Path]: + """Create and configure a temporary git repository. + + Returns: + Tuple of (repo, temp_dir). + """ + temp_dir = Path(tempfile.mkdtemp(prefix="claude-session-")) + repo = Repo.init(temp_dir) + + with repo.config_writer() as config: + config.set_value("user", "name", "Claude") + config.set_value("user", "email", "claude@session") + + return repo, temp_dir + + +def _find_actual_repo_context( + sorted_ops: List[FileOperation], session_start: str, session_end: str +) -> Tuple[Optional[Repo], Optional[Path], List[Any]]: + """Find the actual git repo and session commits from operation file paths. + + Args: + sorted_ops: List of operations sorted by timestamp. + session_start: ISO timestamp of first operation. + session_end: ISO timestamp of last operation. + + Returns: + Tuple of (actual_repo, actual_repo_root, session_commits). + """ + for op in sorted_ops: + repo_root = find_git_repo_root(str(Path(op.file_path).parent)) + if repo_root: + try: + actual_repo = Repo(repo_root) + session_commits = get_commits_during_session( + actual_repo, session_start, session_end + ) + return actual_repo, repo_root, session_commits + except InvalidGitRepositoryError: + pass + return None, None, [] + + +def _fetch_initial_content( + op: FileOperation, + full_path: Path, + earliest_op_by_file: Dict[str, str], +) -> bool: + """Fetch initial file content using fallback chain. + + Priority: pre-session git commit > HEAD > disk > original_content + + Args: + op: The edit operation needing initial content. + full_path: Path where content should be written. + earliest_op_by_file: Map of file path to earliest operation timestamp. + + Returns: + True if content was fetched successfully. 
+ """ + # Try to find a git repo for this file + file_repo_root = find_git_repo_root(str(Path(op.file_path).parent)) + if file_repo_root: + try: + file_repo = Repo(file_repo_root) + file_rel_path = os.path.relpath(op.file_path, file_repo_root) + + # Find commit from before the session started for this file + earliest_ts = earliest_op_by_file.get(op.file_path, op.timestamp) + pre_session_commit = find_commit_before_timestamp(file_repo, earliest_ts) + + if pre_session_commit: + content = read_blob_bytes(pre_session_commit.tree, file_rel_path) + if content is not None: + full_path.write_bytes(content) + return True + + # Fallback to HEAD (file might be new) + content = read_blob_bytes(file_repo.head.commit.tree, file_rel_path) + if content is not None: + full_path.write_bytes(content) + return True + except InvalidGitRepositoryError: + pass + + # Fallback: read from disk if file exists + if Path(op.file_path).exists(): + try: + full_path.write_text(Path(op.file_path).read_text()) + return True + except Exception: + pass + + # Fallback: use original_content from tool result (for remote sessions) + if op.original_content: + full_path.write_text(op.original_content) + return True + + return False + + +def build_file_history_repo( + operations: List[FileOperation], + progress_callback=None, +) -> Tuple[Repo, Path, Dict[str, str]]: + """Create a temp git repo that replays all file operations as commits. + + For Edit operations, uses intermediate commits from the actual repo to + resync state when our reconstruction might have diverged from reality. + This handles cases where edits fail to match our reconstructed content + but succeeded on the actual file. + + Args: + operations: List of FileOperation objects in chronological order. + progress_callback: Optional callback for progress updates. 
+ + Returns: + Tuple of (repo, temp_dir, path_mapping) where: + - repo: GitPython Repo object + - temp_dir: Path to the temp directory + - path_mapping: Dict mapping original paths to relative paths + """ + repo, temp_dir = _init_temp_repo() + + # Get path mapping - exclude delete operations since they don't contribute files + # and may have relative paths that would break os.path.commonpath() + non_delete_ops = [op for op in operations if op.operation_type != OP_DELETE] + common_prefix, path_mapping = normalize_file_paths(non_delete_ops) + + # Sort operations by timestamp + sorted_ops = sorted(operations, key=lambda o: o.timestamp) + + if not sorted_ops: + return repo, temp_dir, path_mapping + + # Get session timeframe + session_start = sorted_ops[0].timestamp + session_end = sorted_ops[-1].timestamp + + # Build a map of file path -> earliest operation timestamp + earliest_op_by_file: Dict[str, str] = {} + for op in sorted_ops: + if op.file_path not in earliest_op_by_file: + earliest_op_by_file[op.file_path] = op.timestamp + + # Try to find the actual git repo and get commits during the session + actual_repo, actual_repo_root, session_commits = _find_actual_repo_context( + sorted_ops, session_start, session_end + ) + + total_ops = len(sorted_ops) + for op_idx, op in enumerate(sorted_ops): + if progress_callback: + progress_callback("operations", op_idx + 1, total_ops) + # Delete operations aren't in path_mapping - handle them specially + if op.operation_type == OP_DELETE: + rel_path = None # Will find matching files below + full_path = None + else: + rel_path = path_mapping.get(op.file_path, op.file_path) + full_path = temp_dir / rel_path + full_path.parent.mkdir(parents=True, exist_ok=True) + + # For edit operations, try to sync from commits when our reconstruction diverges + if op.operation_type == OP_EDIT and actual_repo and actual_repo_root: + file_rel_path = os.path.relpath(op.file_path, actual_repo_root) + old_str = op.old_string or "" + + if old_str and full_path.exists(): + our_content = full_path.read_text() + + # If old_string doesn't match our content, we may have diverged + if old_str not in our_content: + # Try to find content where old_string DOES exist + # First, check intermediate commits during the session + commit_content = find_file_content_at_timestamp( + actual_repo, file_rel_path, op.timestamp, session_commits + ) + + if commit_content and old_str in commit_content: + # Resync from this commit + full_path.write_text(commit_content) + repo.index.add([rel_path]) + repo.index.commit("{}") # Sync commit + else: + # Try HEAD - the final state should be correct + head_content = read_blob_content( + actual_repo.head.commit.tree, file_rel_path + ) + if head_content and old_str in head_content: + # Resync from HEAD + full_path.write_text(head_content) + repo.index.add([rel_path]) + repo.index.commit("{}") # Sync commit + + if op.operation_type == OP_WRITE: + full_path.write_text(op.content or "") + elif op.operation_type == OP_EDIT: + # If file doesn't exist, try to fetch initial content + if not full_path.exists(): + fetched = _fetch_initial_content(op, full_path, earliest_op_by_file) + + # Commit the initial content first (no metadata = pre-session) + # This allows git blame to correctly attribute unchanged lines + if fetched: + repo.index.add([rel_path]) + repo.index.commit("{}") # Empty metadata = pre-session content + + if full_path.exists(): + content = full_path.read_text() + old_str = op.old_string or "" + + # If old_string doesn't match, try to resync from 
original_content + # This handles remote sessions where we can't access the actual repo + if old_str and old_str not in content and op.original_content: + if old_str in op.original_content: + # Resync from original_content before applying this edit + content = op.original_content + full_path.write_text(content) + repo.index.add([rel_path]) + repo.index.commit("{}") # Sync commit + + if op.replace_all: + content = content.replace(old_str, op.new_string or "") + else: + content = content.replace(old_str, op.new_string or "", 1) + full_path.write_text(content) + else: + # Can't apply edit - file doesn't exist + continue + elif op.operation_type == OP_DELETE: + # Delete operation - remove file or directory contents + is_recursive = op.is_recursive + delete_path = op.file_path + + # Find files to delete by matching original paths against path_mapping + # Delete paths may be absolute or relative, and may not be in the mapping + files_to_remove = [] + + if is_recursive: + # Delete all files whose original path starts with delete_path + delete_prefix = delete_path.rstrip("/") + "/" + for orig_path, mapped_rel_path in path_mapping.items(): + # Check if original path starts with delete prefix or equals delete path + if orig_path.startswith(delete_prefix) or orig_path == delete_path: + file_abs = temp_dir / mapped_rel_path + if file_abs.exists(): + files_to_remove.append((file_abs, mapped_rel_path)) + else: + # Single file delete - find by exact original path match + if delete_path in path_mapping: + mapped_rel_path = path_mapping[delete_path] + file_abs = temp_dir / mapped_rel_path + if file_abs.exists(): + files_to_remove.append((file_abs, mapped_rel_path)) + + if files_to_remove: + for file_abs, file_rel in files_to_remove: + file_abs.unlink() + try: + repo.index.remove([file_rel]) + except Exception: + pass # File might not be tracked + + # Commit the deletion + try: + repo.index.commit("{}") # Delete commit + except Exception: + pass # Nothing to commit if no files were tracked + + continue # Skip the normal commit below + + # Stage and commit with metadata + repo.index.add([rel_path]) + metadata = json.dumps( + { + "tool_id": op.tool_id, + "page_num": op.page_num, + "msg_id": op.msg_id, + "timestamp": op.timestamp, + "operation_type": op.operation_type, + "file_path": op.file_path, + } + ) + repo.index.commit(metadata) + + # Note: We intentionally skip final sync here to preserve blame attribution. + # The displayed content may not exactly match HEAD, but blame tracking + # of which operations modified which lines is more important for the + # code viewer's purpose. + + return repo, temp_dir, path_mapping + + +def get_file_blame_ranges(repo: Repo, file_path: str) -> List[BlameRange]: + """Get blame data for a file, grouped into ranges of consecutive lines. + + Args: + repo: GitPython Repo object. + file_path: Relative path to the file within the repo. + + Returns: + List of BlameRange objects, each representing consecutive lines + from the same operation. 
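A minimal, self-contained sketch of the commit-message-as-metadata scheme this function reads back (assumes GitPython is installed; the path and metadata values are invented):

```python
import json
import tempfile
from pathlib import Path

from git import Repo

tmp = Path(tempfile.mkdtemp())
repo = Repo.init(tmp)
with repo.config_writer() as cfg:
    cfg.set_value("user", "name", "Claude")
    cfg.set_value("user", "email", "claude@session")

(tmp / "app.py").write_text("line 1\nline 2\n")
repo.index.add(["app.py"])
# The commit message *is* the metadata store; "{}" would mark pre-session content.
repo.index.commit(json.dumps({"tool_id": "toolu_01", "page_num": 1, "msg_id": "msg-x"}))

for commit, lines in repo.blame("HEAD", "app.py"):
    print(len(lines), json.loads(commit.message)["tool_id"])  # 2 toolu_01
```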
+ """ + try: + blame_data = repo.blame("HEAD", file_path) + except Exception: + return [] + + ranges = [] + current_line = 1 + + for commit, lines in blame_data: + if not lines: + continue + + # Parse metadata from commit message + try: + metadata = json.loads(commit.message) + except json.JSONDecodeError: + metadata = {} + + start_line = current_line + end_line = current_line + len(lines) - 1 + + ranges.append( + BlameRange( + start_line=start_line, + end_line=end_line, + tool_id=metadata.get("tool_id"), + page_num=metadata.get("page_num", 1), + msg_id=metadata.get("msg_id", ""), + operation_type=metadata.get("operation_type", "unknown"), + timestamp=metadata.get("timestamp", ""), + ) + ) + + current_line = end_line + 1 + + return ranges + + +def get_file_content_from_repo(repo: Repo, file_path: str) -> Optional[str]: + """Get the final content of a file from the repo. + + Args: + repo: GitPython Repo object. + file_path: Relative path to the file within the repo. + + Returns: + File content as string, or None if file doesn't exist. + """ + try: + return read_blob_content(repo.head.commit.tree, file_path) + except ValueError: + # ValueError occurs when repo has no commits yet + return None + + +def build_file_tree(file_states: Dict[str, FileState]) -> Dict[str, Any]: + """Build a nested dict structure for file tree UI. + + Common directory prefixes shared by all files are stripped to keep the + tree compact. + + Args: + file_states: Dict mapping file paths to FileState objects. + + Returns: + Nested dict where keys are path components and leaves are FileState objects. + """ + if not file_states: + return {} + + # Split all paths into parts + all_parts = [Path(fp).parts for fp in file_states.keys()] + + # Find the common prefix (directory components shared by all files) + # We want to strip directories, not filename components + common_prefix_len = 0 + if all_parts: + # Find minimum path depth (excluding filename) + min_dir_depth = min(len(parts) - 1 for parts in all_parts) + + for i in range(min_dir_depth): + # Check if all paths have the same component at position i + first_part = all_parts[0][i] + if all(parts[i] == first_part for parts in all_parts): + common_prefix_len = i + 1 + else: + break + + tree: Dict[str, Any] = {} + + for file_path, file_state in file_states.items(): + # Normalize path and split into components + parts = Path(file_path).parts + + # Strip common prefix + parts = parts[common_prefix_len:] + + # Navigate/create the nested structure + current = tree + for i, part in enumerate(parts[:-1]): # All but the last part (directories) + if part not in current: + current[part] = {} + current = current[part] + + # Add the file (last part) + if parts: + current[parts[-1]] = file_state + + return tree + + +def reconstruct_file_with_blame( + initial_content: Optional[str], + operations: List[FileOperation], +) -> Tuple[str, List[Tuple[str, Optional[FileOperation]]]]: + """Reconstruct a file's final state with blame attribution for each line. + + Applies all operations in order and tracks which operation wrote each line. + + Args: + initial_content: The initial file content (from git), or None if new file. + operations: List of FileOperation objects in chronological order. 
+ + Returns: + Tuple of (final_content, blame_lines): + - final_content: The reconstructed file content as a string + - blame_lines: List of (line_text, operation) tuples, where operation + is None for lines from initial_content (pre-session) + """ + # Initialize with initial content + if initial_content: + lines = initial_content.rstrip("\n").split("\n") + blame_lines: List[Tuple[str, Optional[FileOperation]]] = [ + (line, None) for line in lines + ] + else: + blame_lines = [] + + # Apply each operation + for op in operations: + if op.operation_type == OP_WRITE: + # Write replaces all content + if op.content: + new_lines = op.content.rstrip("\n").split("\n") + blame_lines = [(line, op) for line in new_lines] + + elif op.operation_type == OP_EDIT: + if op.old_string is None or op.new_string is None: + continue + + # Reconstruct current content for searching + current_content = "\n".join(line for line, _ in blame_lines) + + # Find where old_string occurs + pos = current_content.find(op.old_string) + if pos == -1: + # old_string not found, skip this operation + continue + + # Calculate line numbers for the replacement + prefix = current_content[:pos] + prefix_lines = prefix.count("\n") + old_lines_count = op.old_string.count("\n") + 1 + + # Build new blame_lines + new_blame_lines = [] + + # Add lines before the edit (keep their original blame) + for i, (line, attr) in enumerate(blame_lines): + if i < prefix_lines: + new_blame_lines.append((line, attr)) + + # Handle partial first line replacement + if prefix_lines < len(blame_lines): + first_affected_line = blame_lines[prefix_lines][0] + # Check if the prefix ends mid-line + last_newline = prefix.rfind("\n") + if last_newline == -1: + prefix_in_line = prefix + else: + prefix_in_line = prefix[last_newline + 1 :] + + # Build the new content by doing the actual replacement + new_content = ( + current_content[:pos] + + op.new_string + + current_content[pos + len(op.old_string) :] + ) + new_content_lines = new_content.rstrip("\n").split("\n") + + # All lines from the edit point onward get the new attribution + for i, line in enumerate(new_content_lines): + if i < prefix_lines: + continue + new_blame_lines.append((line, op)) + + blame_lines = new_blame_lines + + # Build final content + final_content = "\n".join(line for line, _ in blame_lines) + if final_content: + final_content += "\n" + + return final_content, blame_lines + + +def build_file_states( + operations: List[FileOperation], +) -> Dict[str, FileState]: + """Build FileState objects from a list of file operations. + + Args: + operations: List of FileOperation objects. + + Returns: + Dict mapping file paths to FileState objects. 
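To make the per-line attribution concrete, a toy version in which string labels stand in for FileOperation objects:

```python
# Toy version of blame tracking; labels stand in for FileOperation objects.
blame = [(line, "write-1") for line in ["alpha", "beta"]]  # Write owns every line
# An Edit replacing "beta" re-attributes only the affected line:
blame = [("gamma", "edit-2") if line == "beta" else (line, op)
         for line, op in blame]
print(blame)  # [('alpha', 'write-1'), ('gamma', 'edit-2')]
```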
+ """ + # Group operations by file (already sorted by timestamp) + file_ops = group_operations_by_file(operations) + + file_states = {} + for file_path, ops in file_ops.items(): + + # Determine status based on first operation + status = STATUS_ADDED if ops[0].operation_type == OP_WRITE else STATUS_MODIFIED + + file_state = FileState( + file_path=file_path, + operations=ops, + diff_only=True, # Default to diff-only + status=status, + ) + + # If first operation is a Write (file creation), we can show full content + if ops[0].operation_type == OP_WRITE: + final_content, blame_lines = reconstruct_file_with_blame(None, ops) + file_state.final_content = final_content + file_state.blame_lines = blame_lines + file_state.diff_only = False + + file_states[file_path] = file_state + + return file_states + + +def render_file_tree_html(file_tree: Dict[str, Any], prefix: str = "") -> str: + """Render file tree as HTML. + + Args: + file_tree: Nested dict structure from build_file_tree(). + prefix: Path prefix for building full paths. + + Returns: + HTML string for the file tree. + """ + html_parts = [] + + # Sort items: directories first, then files + items = sorted( + file_tree.items(), + key=lambda x: ( + not isinstance(x[1], dict) or isinstance(x[1], FileState), + x[0].lower(), + ), + ) + + for name, value in items: + full_path = f"{prefix}/{name}" if prefix else name + + if isinstance(value, FileState): + # It's a file - status shown via CSS color + status_class = f"status-{value.status}" + html_parts.append( + f'
<li class="tree-file {status_class}" data-path="{html.escape(full_path)}">' + f'{html.escape(name)}' + f"
  • " + ) + elif isinstance(value, dict): + # It's a directory + children_html = render_file_tree_html(value, full_path) + html_parts.append( + f'
<li class="tree-dir open">' + f'<span class="tree-toggle"></span>' + f'<span class="tree-dir-name">{html.escape(name)}</span>' + f'<ul>{children_html}</ul>' + f"
  • " + ) + + return "".join(html_parts) + + +def file_state_to_dict(file_state: FileState) -> Dict[str, Any]: + """Convert FileState to a JSON-serializable dict. + + Args: + file_state: The FileState object. + + Returns: + Dict suitable for JSON serialization. + """ + operations = [ + { + "operation_type": op.operation_type, + "tool_id": op.tool_id, + "timestamp": op.timestamp, + "page_num": op.page_num, + "msg_id": op.msg_id, + "content": op.content, + "old_string": op.old_string, + "new_string": op.new_string, + } + for op in file_state.operations + ] + + blame_lines = None + if file_state.blame_lines: + blame_lines = [ + [ + line, + ( + { + "operation_type": op.operation_type, + "page_num": op.page_num, + "msg_id": op.msg_id, + "timestamp": op.timestamp, + } + if op + else None + ), + ] + for line, op in file_state.blame_lines + ] + + return { + "file_path": file_state.file_path, + "diff_only": file_state.diff_only, + "final_content": file_state.final_content, + "blame_lines": blame_lines, + "operations": operations, + } + + +def generate_code_view_html( + output_dir: Path, + operations: List[FileOperation], + transcript_messages: List[str] = None, + msg_to_user_html: Dict[str, str] = None, + msg_to_context_id: Dict[str, str] = None, + msg_to_prompt_num: Dict[str, int] = None, + total_pages: int = 1, + progress_callback=None, +) -> None: + """Generate the code.html file with three-pane layout. + + Args: + output_dir: Output directory. + operations: List of FileOperation objects. + transcript_messages: List of individual message HTML strings. + msg_to_user_html: Mapping from msg_id to rendered user message HTML for tooltips. + msg_to_context_id: Mapping from msg_id to context_msg_id for blame coloring. + msg_to_prompt_num: Mapping from msg_id to prompt number (1-indexed). + total_pages: Total number of transcript pages (for search feature). + progress_callback: Optional callback for progress updates. Called with (phase, current, total). + """ + # Import here to avoid circular imports + from claude_code_transcripts import get_template + + if not operations: + return + + if transcript_messages is None: + transcript_messages = [] + + if msg_to_user_html is None: + msg_to_user_html = {} + + if msg_to_context_id is None: + msg_to_context_id = {} + + if msg_to_prompt_num is None: + msg_to_prompt_num = {} + + # Extract message IDs from HTML for chunked rendering + # Messages have format:
    + msg_id_pattern = re.compile(r'id="(msg-[^"]+)"') + messages_data = [] + current_prompt_num = None + for msg_html in transcript_messages: + match = msg_id_pattern.search(msg_html) + msg_id = match.group(1) if match else None + # Update current prompt number when we hit a user prompt + if msg_id and msg_id in msg_to_prompt_num: + current_prompt_num = msg_to_prompt_num[msg_id] + # Every message gets the current prompt number (not just user prompts) + messages_data.append( + {"id": msg_id, "html": msg_html, "prompt_num": current_prompt_num} + ) + + # Build temp git repo with file history + if progress_callback: + progress_callback("operations", 0, len(operations)) + repo, temp_dir, path_mapping = build_file_history_repo( + operations, progress_callback=progress_callback + ) + + try: + # Build file data for each file + file_data = {} + + # Group operations by file (already sorted by timestamp) + ops_by_file = group_operations_by_file(operations) + total_files = len(ops_by_file) + file_count = 0 + + for orig_path, file_ops in ops_by_file.items(): + file_count += 1 + if progress_callback: + progress_callback("files", file_count, total_files) + rel_path = path_mapping.get(orig_path, orig_path) + + # Get file content + content = get_file_content_from_repo(repo, rel_path) + if content is None: + continue + + # Get blame ranges + blame_ranges = get_file_blame_ranges(repo, rel_path) + + # Determine status + status = ( + STATUS_ADDED + if file_ops[0].operation_type == OP_WRITE + else STATUS_MODIFIED + ) + + # Pre-compute color indices for each unique context_msg_id + # Colors are assigned per-file, with each unique context getting a sequential index + context_to_color_index: Dict[str, int] = {} + color_index = 0 + + # Build blame range data with pre-computed values + blame_range_data = [] + for r in blame_ranges: + context_id = msg_to_context_id.get(r.msg_id, r.msg_id) + + # Assign color index for new context IDs + if r.msg_id and context_id not in context_to_color_index: + context_to_color_index[context_id] = color_index + color_index += 1 + + blame_range_data.append( + { + "start": r.start_line, + "end": r.end_line, + "tool_id": r.tool_id, + "page_num": r.page_num, + "msg_id": r.msg_id, + "context_msg_id": context_id, + "prompt_num": msg_to_prompt_num.get(r.msg_id), + "color_index": ( + context_to_color_index.get(context_id) if r.msg_id else None + ), + "operation_type": r.operation_type, + "timestamp": r.timestamp, + "user_html": msg_to_user_html.get(r.msg_id, ""), + } + ) + + # Build file data + file_data[orig_path] = { + "file_path": orig_path, + "rel_path": rel_path, + "content": content, + "status": status, + "blame_ranges": blame_range_data, + } + + # Build file states for tree (reusing existing structure) + file_states = {} + for orig_path, data in file_data.items(): + file_states[orig_path] = FileState( + file_path=orig_path, + status=data["status"], + ) + + # Build file tree + file_tree = build_file_tree(file_states) + file_tree_html = render_file_tree_html(file_tree) + + # Build code data object + code_data = { + "fileData": file_data, + "messagesData": messages_data, + } + + # Write data to separate JSON file for gistpreview lazy loading + # (gistpreview has size limits, so it fetches this file separately) + (output_dir / "code-data.json").write_text( + json.dumps(code_data), encoding="utf-8" + ) + + # Also embed data inline for local file:// use + # (fetch() doesn't work with file:// URLs due to CORS) + code_data_json = json.dumps(code_data) + # Escape sequences that would confuse 
the HTML parser inside script tags: + # - "</script>" (would break parsing) + # - "<!--" (can open an HTML comment)
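A hedged sketch of what such escaping typically looks like; the exact replacements in the original patch are not recoverable from this diff:

```python
import json

code_data_json = json.dumps({"fileData": {}, "messagesData": []})
# Breaking up "</" prevents "</script>" inside the JSON from ending the tag early.
safe_json = code_data_json.replace("</", "<\\/")
inline_data_script = f"<script>window.CODE_DATA = {safe_json};</script>"
```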
    + + + + + {{ inline_data_script|safe }} + + + +{%- endblock %} diff --git a/src/claude_code_transcripts/templates/code_view.js b/src/claude_code_transcripts/templates/code_view.js new file mode 100644 index 0000000..a9d690d --- /dev/null +++ b/src/claude_code_transcripts/templates/code_view.js @@ -0,0 +1,1307 @@ +// CodeMirror 6 imports from CDN +import {EditorView, lineNumbers, gutter, GutterMarker, Decoration, ViewPlugin, WidgetType} from 'https://esm.sh/@codemirror/view@6'; +import {EditorState, StateField, StateEffect} from 'https://esm.sh/@codemirror/state@6'; + +// Widget to show user message number at end of line +class MessageNumberWidget extends WidgetType { + constructor(msgNum) { + super(); + this.msgNum = msgNum; + } + toDOM() { + const span = document.createElement('span'); + span.className = 'blame-msg-num'; + span.textContent = `#${this.msgNum}`; + return span; + } + eq(other) { + return this.msgNum === other.msgNum; + } +} +import {syntaxHighlighting, defaultHighlightStyle} from 'https://esm.sh/@codemirror/language@6'; +import {javascript} from 'https://esm.sh/@codemirror/lang-javascript@6'; +import {python} from 'https://esm.sh/@codemirror/lang-python@6'; +import {html} from 'https://esm.sh/@codemirror/lang-html@6'; +import {css} from 'https://esm.sh/@codemirror/lang-css@6'; +import {json} from 'https://esm.sh/@codemirror/lang-json@6'; +import {markdown} from 'https://esm.sh/@codemirror/lang-markdown@6'; + +// Format timestamps in local timezone with nice format +function formatTimestamp(date) { + const now = new Date(); + const isToday = date.toDateString() === now.toDateString(); + const yesterday = new Date(now); + yesterday.setDate(yesterday.getDate() - 1); + const isYesterday = date.toDateString() === yesterday.toDateString(); + const isThisYear = date.getFullYear() === now.getFullYear(); + + const timeStr = date.toLocaleTimeString(undefined, { hour: 'numeric', minute: '2-digit' }); + + if (isToday) { + return timeStr; + } else if (isYesterday) { + return 'Yesterday ' + timeStr; + } else if (isThisYear) { + return date.toLocaleDateString(undefined, { month: 'short', day: 'numeric' }) + ' ' + timeStr; + } else { + return date.toLocaleDateString(undefined, { month: 'short', day: 'numeric', year: 'numeric' }) + ' ' + timeStr; + } +} + +function formatTimestamps(container) { + container.querySelectorAll('time[data-timestamp]').forEach(function(el) { + const timestamp = el.getAttribute('data-timestamp'); + const date = new Date(timestamp); + el.textContent = formatTimestamp(date); + el.title = date.toLocaleString(undefined, { dateStyle: 'full', timeStyle: 'long' }); + }); +} + +// Get the URL for fetching code-data.json on gisthost/gistpreview +function getGistDataUrl() { + // URL format: https://gisthost.github.io/?GIST_ID/code.html + const match = window.location.search.match(/^\?([^/]+)/); + if (match) { + const gistId = match[1]; + // Use raw gist URL (no API rate limits) + return `https://gist.githubusercontent.com/raw/${gistId}/code-data.json`; + } + return null; +} + +// Show loading state +function showLoading() { + const codeContent = document.getElementById('code-content'); + if (codeContent) { + codeContent.innerHTML = '
<div class="loading">Loading code data...</div>'; + } +} + + // Show error state +function showError(message) { + const codeContent = document.getElementById('code-content'); + if (codeContent) { + codeContent.innerHTML = `
<div class="error">Error: ${message}</div>
    `; + } +} + +// Palette of colors for blame ranges +const rangeColors = [ + 'rgba(66, 165, 245, 0.15)', // blue + 'rgba(102, 187, 106, 0.15)', // green + 'rgba(255, 167, 38, 0.15)', // orange + 'rgba(171, 71, 188, 0.15)', // purple + 'rgba(239, 83, 80, 0.15)', // red + 'rgba(38, 198, 218, 0.15)', // cyan +]; + +// State effect for updating active range +const setActiveRange = StateEffect.define(); + +// State field for active range highlighting +const activeRangeField = StateField.define({ + create() { return Decoration.none; }, + update(decorations, tr) { + for (let e of tr.effects) { + if (e.is(setActiveRange)) { + const {rangeIndex, blameRanges, doc} = e.value; + if (rangeIndex < 0 || rangeIndex >= blameRanges.length) { + return Decoration.none; + } + const range = blameRanges[rangeIndex]; + const decs = []; + for (let line = range.start; line <= range.end; line++) { + if (line <= doc.lines) { + const lineStart = doc.line(line).from; + decs.push( + Decoration.line({ + class: 'cm-active-range' + }).range(lineStart) + ); + } + } + return Decoration.set(decs, true); + } + } + return decorations; + }, + provide: f => EditorView.decorations.from(f) +}); + +// Main initialization - uses embedded data or fetches from gist +async function init() { + let data; + + // Always show loading on init - parsing large embedded JSON takes time + showLoading(); + + // Check for embedded data first (works with local file:// access) + if (window.CODE_DATA) { + // Use setTimeout to allow the loading message to render before heavy processing + await new Promise(resolve => setTimeout(resolve, 0)); + data = window.CODE_DATA; + } else { + // No embedded data - must be gist version, fetch from raw URL + showLoading(); + const dataUrl = getGistDataUrl(); + if (!dataUrl) { + showError('No data available. 
If viewing locally, the file may be corrupted.'); + return; + } + try { + const response = await fetch(dataUrl); + if (!response.ok) { + throw new Error(`Failed to fetch data: ${response.status} ${response.statusText}`); + } + data = await response.json(); + } catch (err) { + showError(err.message); + console.error('Failed to load code data:', err); + return; + } + } + + const fileData = data.fileData; + const messagesData = data.messagesData; + + // Expose for testing + window.codeViewData = { messagesData, fileData }; + + // Windowed rendering state + // We render a "window" of messages, not necessarily starting from 0 + const CHUNK_SIZE = 50; + let windowStart = 0; // First rendered message index + let windowEnd = -1; // Last rendered message index (-1 = none rendered) + + // For backwards compatibility + function getRenderedCount() { + return windowEnd - windowStart + 1; + } + + // Find the user prompt that contains a given message index + // Scans backwards to find a message with class "user" (non-continuation) + function findUserPromptIndex(targetIndex) { + for (let i = targetIndex; i >= 0; i--) { + const msg = messagesData[i]; + // Check if this is a user message (not a continuation) + if (msg.html && msg.html.includes('class="message user"') && + !msg.html.includes('class="continuation"')) { + return i; + } + } + return 0; // Fallback to start + } + + // Build ID-to-index map for fast lookup + const msgIdToIndex = new Map(); + messagesData.forEach((msg, index) => { + if (msg.id) { + msgIdToIndex.set(msg.id, index); + } + }); + + // Build msg_id to file/range map for navigating from transcript to code + const msgIdToBlame = new Map(); + Object.entries(fileData).forEach(([filePath, fileInfo]) => { + (fileInfo.blame_ranges || []).forEach((range, rangeIndex) => { + if (range.msg_id) { + if (!msgIdToBlame.has(range.msg_id)) { + msgIdToBlame.set(range.msg_id, { filePath, range, rangeIndex }); + } + } + }); + }); + + // Build sorted list of blame operations by message index + const sortedBlameOps = []; + msgIdToBlame.forEach((blameInfo, msgId) => { + const msgIndex = msgIdToIndex.get(msgId); + if (msgIndex !== undefined) { + sortedBlameOps.push({ msgId, msgIndex, ...blameInfo }); + } + }); + sortedBlameOps.sort((a, b) => a.msgIndex - b.msgIndex); + + // Find the first blame operation at or after a given message index + function findNextBlameOp(msgIndex) { + for (const op of sortedBlameOps) { + if (op.msgIndex >= msgIndex) { + return op; + } + } + return null; + } + + // Current state + let currentEditor = null; + let currentFilePath = null; + let currentBlameRanges = []; + let isInitializing = true; // Skip pinned message updates during initial load + let isScrollingToTarget = false; // Skip pinned updates during programmatic scrolls + let scrollTargetTimeout = null; + + // Tooltip element for blame hover + let blameTooltip = null; + + function createBlameTooltip() { + const tooltip = document.createElement('div'); + tooltip.className = 'blame-tooltip'; + tooltip.style.display = 'none'; + document.body.appendChild(tooltip); + return tooltip; + } + + function showBlameTooltip(event, html) { + if (!blameTooltip) { + blameTooltip = createBlameTooltip(); + } + if (!html) return; + + const codePanel = document.getElementById('code-panel'); + if (codePanel) { + const codePanelWidth = codePanel.offsetWidth; + const tooltipWidth = Math.min(Math.max(codePanelWidth * 0.75, 300), 800); + blameTooltip.style.maxWidth = tooltipWidth + 'px'; + } + + blameTooltip.innerHTML = html; + 
formatTimestamps(blameTooltip); + blameTooltip.style.display = 'block'; + + const padding = 10; + let x = event.clientX + padding; + let y = event.clientY + padding; + + const rect = blameTooltip.getBoundingClientRect(); + const maxX = window.innerWidth - rect.width - padding; + const maxY = window.innerHeight - rect.height - padding; + + if (x > maxX) x = event.clientX - rect.width - padding; + if (y > maxY) { + const yAbove = event.clientY - rect.height - padding; + if (yAbove >= 0) { + y = yAbove; + } + } + + blameTooltip.style.left = x + 'px'; + blameTooltip.style.top = y + 'px'; + } + + function hideBlameTooltip() { + if (blameTooltip) { + blameTooltip.style.display = 'none'; + } + } + + // Build maps for range colors and message numbers + // Uses pre-computed prompt_num and color_index from server + function buildRangeMaps(blameRanges) { + const colorMap = new Map(); + const msgNumMap = new Map(); + + blameRanges.forEach((range, index) => { + if (range.msg_id) { + // Use pre-computed prompt_num from server + if (range.prompt_num) { + msgNumMap.set(index, range.prompt_num); + } + + // Use pre-computed color_index from server + if (range.color_index !== null && range.color_index !== undefined) { + colorMap.set(index, rangeColors[range.color_index % rangeColors.length]); + } + } + }); + return { colorMap, msgNumMap }; + } + + // Language detection based on file extension + function getLanguageExtension(filePath) { + const ext = filePath.split('.').pop().toLowerCase(); + const langMap = { + 'js': javascript(), + 'jsx': javascript({jsx: true}), + 'ts': javascript({typescript: true}), + 'tsx': javascript({jsx: true, typescript: true}), + 'mjs': javascript(), + 'cjs': javascript(), + 'py': python(), + 'html': html(), + 'htm': html(), + 'css': css(), + 'json': json(), + 'md': markdown(), + 'markdown': markdown(), + }; + return langMap[ext] || []; + } + + // Create line decorations for blame ranges + function createRangeDecorations(blameRanges, doc, colorMap, msgNumMap) { + const decorations = []; + + blameRanges.forEach((range, index) => { + const color = colorMap.get(index); + if (!color) return; + + for (let line = range.start; line <= range.end; line++) { + if (line <= doc.lines) { + const lineInfo = doc.line(line); + const lineStart = lineInfo.from; + + decorations.push( + Decoration.line({ + attributes: { + style: `background-color: ${color}`, + 'data-range-index': index.toString(), + 'data-msg-id': range.msg_id, + } + }).range(lineStart) + ); + + if (line === range.start) { + const msgNum = msgNumMap.get(index); + if (msgNum) { + decorations.push( + Decoration.widget({ + widget: new MessageNumberWidget(msgNum), + side: 1, + }).range(lineInfo.to) + ); + } + } + } + } + }); + + return Decoration.set(decorations, true); + } + + // Create the scrollbar minimap + function createMinimap(container, blameRanges, totalLines, editor, colorMap) { + const existing = container.querySelector('.blame-minimap'); + if (existing) existing.remove(); + + if (colorMap.size === 0 || totalLines === 0) return null; + + // Check if scrolling is needed - if not, don't show minimap + const editorContainer = container.querySelector('.editor-container'); + const scrollElement = editorContainer?.querySelector('.cm-scroller'); + if (scrollElement) { + const needsScroll = scrollElement.scrollHeight > scrollElement.clientHeight; + if (!needsScroll) return null; + } + + const minimap = document.createElement('div'); + minimap.className = 'blame-minimap'; + + blameRanges.forEach((range, index) => { + const color = 
colorMap.get(index); + if (!color) return; + + const startPercent = ((range.start - 1) / totalLines) * 100; + const endPercent = (range.end / totalLines) * 100; + const height = Math.max(endPercent - startPercent, 0.5); + + const marker = document.createElement('div'); + marker.className = 'minimap-marker'; + marker.style.top = startPercent + '%'; + marker.style.height = height + '%'; + marker.style.backgroundColor = color.replace('0.15', '0.6'); + marker.dataset.rangeIndex = index; + marker.dataset.line = range.start; + marker.title = `Lines ${range.start}-${range.end}`; + + marker.addEventListener('click', () => { + const doc = editor.state.doc; + if (range.start <= doc.lines) { + const lineInfo = doc.line(range.start); + editor.dispatch({ + effects: EditorView.scrollIntoView(lineInfo.from, { y: 'center' }) + }); + highlightRange(index, blameRanges, editor); + if (range.msg_id) { + scrollToMessage(range.msg_id); + } + } + }); + + minimap.appendChild(marker); + }); + + container.appendChild(minimap); + return minimap; + } + + // Create editor for a file + function createEditor(container, content, blameRanges, filePath) { + container.innerHTML = ''; + + const wrapper = document.createElement('div'); + wrapper.className = 'editor-wrapper'; + container.appendChild(wrapper); + + const editorContainer = document.createElement('div'); + editorContainer.className = 'editor-container'; + wrapper.appendChild(editorContainer); + + const doc = EditorState.create({doc: content}).doc; + const { colorMap, msgNumMap } = buildRangeMaps(blameRanges); + const rangeDecorations = createRangeDecorations(blameRanges, doc, colorMap, msgNumMap); + + const rangeDecorationsField = StateField.define({ + create() { return rangeDecorations; }, + update(decorations) { return decorations; }, + provide: f => EditorView.decorations.from(f) + }); + + const clickHandler = EditorView.domEventHandlers({ + click: (event, view) => { + const target = event.target; + if (target.closest('.cm-line')) { + const line = target.closest('.cm-line'); + const rangeIndex = line.getAttribute('data-range-index'); + if (rangeIndex !== null) { + highlightRange(parseInt(rangeIndex), blameRanges, view); + const range = blameRanges[parseInt(rangeIndex)]; + if (range) { + updateLineHash(range.start); + // Scroll to the corresponding message in the transcript + if (range.msg_id) { + scrollToMessage(range.msg_id); + } + } + } + } + }, + mouseover: (event, view) => { + const target = event.target; + const line = target.closest('.cm-line'); + if (line) { + const rangeIndex = line.getAttribute('data-range-index'); + if (rangeIndex !== null) { + const range = blameRanges[parseInt(rangeIndex)]; + if (range && range.user_html) { + showBlameTooltip(event, range.user_html); + } + } + } + }, + mouseout: (event, view) => { + const target = event.target; + const line = target.closest('.cm-line'); + if (line) { + hideBlameTooltip(); + } + }, + mousemove: (event, view) => { + const target = event.target; + const line = target.closest('.cm-line'); + if (line && line.getAttribute('data-range-index') !== null) { + const rangeIndex = parseInt(line.getAttribute('data-range-index')); + const range = blameRanges[rangeIndex]; + if (range && range.user_html && blameTooltip && blameTooltip.style.display !== 'none') { + showBlameTooltip(event, range.user_html); + } + } + } + }); + + const extensions = [ + lineNumbers(), + EditorView.editable.of(false), + EditorView.lineWrapping, + syntaxHighlighting(defaultHighlightStyle), + getLanguageExtension(filePath), + 
rangeDecorationsField, + activeRangeField, + clickHandler, + ]; + + const state = EditorState.create({ + doc: content, + extensions: extensions, + }); + + currentEditor = new EditorView({ + state, + parent: editorContainer, + }); + + createMinimap(wrapper, blameRanges, doc.lines, currentEditor, colorMap); + + return currentEditor; + } + + // Highlight a specific range in the editor + function highlightRange(rangeIndex, blameRanges, view) { + view.dispatch({ + effects: setActiveRange.of({ + rangeIndex, + blameRanges, + doc: view.state.doc + }) + }); + } + + // Initialize truncation for elements within a container + function initTruncation(container) { + container.querySelectorAll('.truncatable:not(.truncation-initialized)').forEach(function(wrapper) { + wrapper.classList.add('truncation-initialized'); + const content = wrapper.querySelector('.truncatable-content'); + const btn = wrapper.querySelector('.expand-btn'); + if (content && content.scrollHeight > 250) { + wrapper.classList.add('truncated'); + if (btn) { + btn.addEventListener('click', function() { + if (wrapper.classList.contains('truncated')) { + wrapper.classList.remove('truncated'); + wrapper.classList.add('expanded'); + btn.textContent = 'Show less'; + } else { + wrapper.classList.remove('expanded'); + wrapper.classList.add('truncated'); + btn.textContent = 'Show more'; + } + }); + } + } + }); + } + + // Append messages to the end of the transcript + function appendMessages(startIdx, endIdx) { + const transcriptContent = document.getElementById('transcript-content'); + const sentinel = document.getElementById('transcript-sentinel'); + let added = false; + + for (let i = startIdx; i <= endIdx && i < messagesData.length; i++) { + if (i > windowEnd) { + const msg = messagesData[i]; + const div = document.createElement('div'); + div.innerHTML = msg.html; + while (div.firstChild) { + // Insert before the sentinel + transcriptContent.insertBefore(div.firstChild, sentinel); + } + windowEnd = i; + added = true; + } + } + + if (added) { + initTruncation(transcriptContent); + formatTimestamps(transcriptContent); + } + } + + // Prepend messages to the beginning of the transcript + function prependMessages(startIdx, endIdx) { + const transcriptContent = document.getElementById('transcript-content'); + const topSentinel = document.getElementById('transcript-sentinel-top'); + let added = false; + + // Prepend in reverse order so they appear in correct sequence + for (let i = endIdx; i >= startIdx && i >= 0; i--) { + if (i < windowStart) { + const msg = messagesData[i]; + const div = document.createElement('div'); + div.innerHTML = msg.html; + // Insert all children after the top sentinel + const children = Array.from(div.childNodes); + const insertPoint = topSentinel ? 
topSentinel.nextSibling : transcriptContent.firstChild; + children.forEach(child => { + transcriptContent.insertBefore(child, insertPoint); + }); + windowStart = i; + added = true; + } + } + + if (added) { + initTruncation(transcriptContent); + formatTimestamps(transcriptContent); + } + } + + // Clear and rebuild transcript starting from a specific index + function teleportToMessage(targetIndex) { + const transcriptContent = document.getElementById('transcript-content'); + const transcriptPanel = document.getElementById('transcript-panel'); + + // Find the user prompt containing this message + const promptStart = findUserPromptIndex(targetIndex); + + // Clear existing content (except sentinels - we'll recreate them) + transcriptContent.innerHTML = ''; + + // Add top sentinel for upward loading + const topSentinel = document.createElement('div'); + topSentinel.id = 'transcript-sentinel-top'; + topSentinel.style.height = '1px'; + transcriptContent.appendChild(topSentinel); + + // Add bottom sentinel + const bottomSentinel = document.createElement('div'); + bottomSentinel.id = 'transcript-sentinel'; + bottomSentinel.style.height = '1px'; + transcriptContent.appendChild(bottomSentinel); + + // Reset window state + windowStart = promptStart; + windowEnd = promptStart - 1; // Will be updated by appendMessages + + // Render from user prompt up to AND INCLUDING the target message + // This ensures the target is always in the DOM after teleporting + const initialEnd = Math.max( + Math.min(promptStart + CHUNK_SIZE - 1, messagesData.length - 1), + targetIndex + ); + appendMessages(promptStart, initialEnd); + + // Set up observers for the new sentinels + setupScrollObservers(); + + // Reset scroll position + transcriptPanel.scrollTop = 0; + } + + // Render messages down to targetIndex (extending window downward) + function renderMessagesDownTo(targetIndex) { + if (targetIndex <= windowEnd) return; + appendMessages(windowEnd + 1, targetIndex); + } + + // Render messages up to targetIndex (extending window upward) + function renderMessagesUpTo(targetIndex) { + if (targetIndex >= windowStart) return; + prependMessages(targetIndex, windowStart - 1); + } + + // Render next chunk downward (for lazy loading) + function renderNextChunk() { + const targetIndex = Math.min(windowEnd + CHUNK_SIZE, messagesData.length - 1); + appendMessages(windowEnd + 1, targetIndex); + } + + // Render previous chunk upward (for lazy loading) + function renderPrevChunk() { + if (windowStart <= 0) return; + const targetIndex = Math.max(windowStart - CHUNK_SIZE, 0); + prependMessages(targetIndex, windowStart - 1); + } + + // Check if target message is within or near the current window + function isNearCurrentWindow(msgIndex) { + if (windowEnd < 0) return false; // Nothing rendered yet + const NEAR_THRESHOLD = CHUNK_SIZE * 2; + return msgIndex >= windowStart - NEAR_THRESHOLD && + msgIndex <= windowEnd + NEAR_THRESHOLD; + } + + // Scroll observers for lazy loading + let topObserver = null; + let bottomObserver = null; + + function setupScrollObservers() { + // Clean up existing observers + if (topObserver) topObserver.disconnect(); + if (bottomObserver) bottomObserver.disconnect(); + + const transcriptPanel = document.getElementById('transcript-panel'); + + // Bottom sentinel observer (load more below) + const bottomSentinel = document.getElementById('transcript-sentinel'); + if (bottomSentinel) { + bottomObserver = new IntersectionObserver((entries) => { + if (entries[0].isIntersecting && windowEnd < messagesData.length - 1) { + 
renderNextChunk(); + } + }, { + root: transcriptPanel, + rootMargin: '200px', + }); + bottomObserver.observe(bottomSentinel); + } + + // Top sentinel observer (load more above) + const topSentinel = document.getElementById('transcript-sentinel-top'); + if (topSentinel) { + topObserver = new IntersectionObserver((entries) => { + if (entries[0].isIntersecting && windowStart > 0) { + // Save scroll position before prepending + const scrollTop = transcriptPanel.scrollTop; + const scrollHeight = transcriptPanel.scrollHeight; + + renderPrevChunk(); + + // Adjust scroll position to maintain visual position + const newScrollHeight = transcriptPanel.scrollHeight; + const heightDiff = newScrollHeight - scrollHeight; + transcriptPanel.scrollTop = scrollTop + heightDiff; + } + }, { + root: transcriptPanel, + rootMargin: '200px', + }); + topObserver.observe(topSentinel); + } + } + + // Calculate sticky header offset + function getStickyHeaderOffset() { + const panel = document.getElementById('transcript-panel'); + const h3 = panel?.querySelector('h3'); + const pinnedMsg = document.getElementById('pinned-user-message'); + + let offset = 0; + if (h3) offset += h3.offsetHeight; + if (pinnedMsg && pinnedMsg.style.display !== 'none') { + offset += pinnedMsg.offsetHeight; + } + return offset + 8; + } + + // Scroll to a message in the transcript + // Uses teleportation for distant messages to avoid rendering thousands of DOM nodes + // Always ensures the user prompt for the message is also loaded for context + function scrollToMessage(msgId) { + const transcriptContent = document.getElementById('transcript-content'); + const transcriptPanel = document.getElementById('transcript-panel'); + + const msgIndex = msgIdToIndex.get(msgId); + if (msgIndex === undefined) return; + + // Find the user prompt for this message - we always want it in the window + const userPromptIndex = findUserPromptIndex(msgIndex); + + // Check if both user prompt and target message are in/near the window + const targetNear = isNearCurrentWindow(msgIndex); + const promptNear = isNearCurrentWindow(userPromptIndex); + + // Track if we teleported (need longer delay for layout) + let didTeleport = false; + + // If either user prompt or target is far from window, teleport + if (!targetNear || !promptNear) { + teleportToMessage(msgIndex); + didTeleport = true; + } else { + // Both are near the window - extend as needed + // Ensure user prompt is loaded (extend upward if needed) + if (userPromptIndex < windowStart) { + renderMessagesUpTo(userPromptIndex); + } + // Ensure target message is loaded (extend downward if needed) + if (msgIndex > windowEnd) { + renderMessagesDownTo(msgIndex); + } + } + + // Helper to perform the scroll after DOM is ready + const performScroll = () => { + const message = transcriptContent.querySelector(`#${CSS.escape(msgId)}`); + if (message) { + transcriptContent.querySelectorAll('.message.highlighted').forEach(el => { + el.classList.remove('highlighted'); + }); + message.classList.add('highlighted'); + + const stickyOffset = getStickyHeaderOffset(); + const messageTop = message.offsetTop; + const targetScroll = messageTop - stickyOffset; + + // Suppress pinned message updates during scroll + isScrollingToTarget = true; + if (scrollTargetTimeout) clearTimeout(scrollTargetTimeout); + + // Use instant scroll after teleport (jumping anyway), smooth otherwise + transcriptPanel.scrollTo({ + top: targetScroll, + behavior: didTeleport ? 
'instant' : 'smooth' + }); + + // Re-enable pinned updates after scroll completes + scrollTargetTimeout = setTimeout(() => { + isScrollingToTarget = false; + updatePinnedUserMessage(); + }, didTeleport ? 100 : 500); + } + }; + + // After teleporting, wait for layout to complete before scrolling + // Teleport adds many DOM elements - need time for browser to lay them out + if (didTeleport) { + // Use setTimeout to wait for layout, then requestAnimationFrame for paint + setTimeout(() => { + requestAnimationFrame(performScroll); + }, 50); + } else { + requestAnimationFrame(performScroll); + } + } + + // Load file content + // skipInitialScroll: if true, don't scroll to first blame range (caller will handle scroll) + function loadFile(path, skipInitialScroll = false) { + currentFilePath = path; + + const codeContent = document.getElementById('code-content'); + const currentFilePathEl = document.getElementById('current-file-path'); + + currentFilePathEl.textContent = path; + + const fileInfo = fileData[path]; + if (!fileInfo) { + codeContent.innerHTML = '
<div class="error">File not found</div>'; + return; + } + + // Always show loading indicator - gives visual feedback during file switch + codeContent.innerHTML = '
<div class="loading">Loading file...</div>
+    ';
+
+    // Use setTimeout to ensure loading message renders before heavy work
+    setTimeout(() => {
+      const content = fileInfo.content || '';
+      currentBlameRanges = fileInfo.blame_ranges || [];
+      createEditor(codeContent, content, currentBlameRanges, path);
+
+      // Scroll to first blame range and align transcript (without highlighting)
+      // Skip if caller will handle scroll (e.g., hash navigation to specific line)
+      if (!skipInitialScroll) {
+        const firstOpIndex = currentBlameRanges.findIndex(r => r.msg_id);
+        if (firstOpIndex >= 0) {
+          const firstOpRange = currentBlameRanges[firstOpIndex];
+          scrollEditorToLine(firstOpRange.start);
+          // Scroll transcript to the corresponding message (no highlight on initial load)
+          if (firstOpRange.msg_id) {
+            scrollToMessage(firstOpRange.msg_id);
+          }
+        }
+      }
+    }, 10);
+  }
+
+  // Scroll editor to a line
+  function scrollEditorToLine(lineNumber) {
+    if (!currentEditor) return;
+    const doc = currentEditor.state.doc;
+    if (lineNumber < 1 || lineNumber > doc.lines) return;
+
+    const line = doc.line(lineNumber);
+    currentEditor.dispatch({
+      effects: EditorView.scrollIntoView(line.from, { y: 'center' })
+    });
+  }
+
+  // Update URL hash for deep-linking to a line
+  function updateLineHash(lineNumber) {
+    if (!currentFilePath) return;
+    // Use format: #path/to/file:L{number}
+    const hash = `${encodeURIComponent(currentFilePath)}:L${lineNumber}`;
+    history.replaceState(null, '', `#${hash}`);
+  }
+
+  // Parse URL hash and navigate to file/line
+  // Supports formats: #L5, #path/to/file:L5, #path%2Fto%2Ffile:L5
+  function navigateFromHash() {
+    const hash = window.location.hash.slice(1); // Remove leading #
+    if (!hash) return false;
+
+    let filePath = null;
+    let lineNumber = null;
+
+    // Check for file:L{number} format
+    const fileLineMatch = hash.match(/^(.+):L(\d+)$/);
+    if (fileLineMatch) {
+      filePath = decodeURIComponent(fileLineMatch[1]);
+      lineNumber = parseInt(fileLineMatch[2]);
+    } else {
+      // Check for just L{number} format (uses current file)
+      const lineMatch = hash.match(/^L(\d+)$/);
+      if (lineMatch) {
+        lineNumber = parseInt(lineMatch[1]);
+        filePath = currentFilePath; // Use current file
+      }
+    }
+
+    if (lineNumber) {
+      // Helper to scroll to line and select blame
+      const scrollAndSelect = () => {
+        scrollEditorToLine(lineNumber);
+        // Find and highlight the range at this line
+        if (currentBlameRanges.length > 0 && currentEditor) {
+          const rangeIndex = currentBlameRanges.findIndex(r =>
+            lineNumber >= r.start && lineNumber <= r.end
+          );
+          if (rangeIndex >= 0) {
+            const range = currentBlameRanges[rangeIndex];
+            highlightRange(rangeIndex, currentBlameRanges, currentEditor);
+            // Also scroll transcript to the corresponding message
+            if (range.msg_id) {
+              scrollToMessage(range.msg_id);
+            }
+          }
+        }
+      };
+
+      // If we have a file path and it's different from current, load it
+      if (filePath && filePath !== currentFilePath) {
+        // Find and click the file in the tree
+        const fileEl = document.querySelector(`.tree-file[data-path="${CSS.escape(filePath)}"]`);
+        if (fileEl) {
+          document.querySelectorAll('.tree-file.selected').forEach(el => el.classList.remove('selected'));
+          fileEl.classList.add('selected');
+          // Skip initial scroll - scrollAndSelect will handle it
+          loadFile(filePath, true);
+          // Wait for file to load (loadFile uses setTimeout 10ms + rendering time)
+          setTimeout(scrollAndSelect, 100);
+        }
+        return true;
+      } else if (filePath) {
+        // Same file already loaded, just scroll
+        requestAnimationFrame(scrollAndSelect);
+        return true;
+      } else if (lineNumber && !currentFilePath) {
+        // Line number but no file loaded yet - let caller load first file
+        // We'll handle the scroll after file loads
+        return false;
+      }
+      return true;
+    }
+    return false;
+  }
+
+  // Navigate from message to code
+  function navigateToBlame(msgId) {
+    const blameInfo = msgIdToBlame.get(msgId);
+    if (!blameInfo) return false;
+
+    const { filePath, range, rangeIndex } = blameInfo;
+
+    const fileEl = document.querySelector(`.tree-file[data-path="${CSS.escape(filePath)}"]`);
+    if (fileEl) {
+      let parent = fileEl.parentElement;
+      while (parent && parent.id !== 'file-tree') {
+        if (parent.classList.contains('tree-dir') && !parent.classList.contains('open')) {
+          parent.classList.add('open');
+        }
+        parent = parent.parentElement;
+      }
+
+      document.querySelectorAll('.tree-file.selected').forEach(el => el.classList.remove('selected'));
+      fileEl.classList.add('selected');
+    }
+
+    // Helper to scroll and highlight the range
+    const scrollAndHighlight = () => {
+      scrollEditorToLine(range.start);
+      if (currentEditor && currentBlameRanges.length > 0) {
+        const idx = currentBlameRanges.findIndex(r => r.msg_id === msgId && r.start === range.start);
+        if (idx >= 0) {
+          highlightRange(idx, currentBlameRanges, currentEditor);
+        }
+      }
+      // Don't auto-scroll transcript - user is already viewing it
+    };
+
+    if (currentFilePath !== filePath) {
+      // Skip initial scroll - scrollAndHighlight will handle it
+      loadFile(filePath, true);
+      // Wait for file to load (loadFile uses setTimeout 10ms + rendering time)
+      setTimeout(scrollAndHighlight, 100);
+    } else {
+      requestAnimationFrame(scrollAndHighlight);
+    }
+
+    return true;
+  }
+
+  // Set up file tree interaction
+  document.getElementById('file-tree').addEventListener('click', (e) => {
+    const dir = e.target.closest('.tree-dir');
+    if (dir && (e.target.classList.contains('tree-toggle') || e.target.classList.contains('tree-dir-name'))) {
+      dir.classList.toggle('open');
+      return;
+    }
+
+    const file = e.target.closest('.tree-file');
+    if (file) {
+      document.querySelectorAll('.tree-file.selected').forEach((el) => {
+        el.classList.remove('selected');
+      });
+      file.classList.add('selected');
+      loadFile(file.dataset.path);
+    }
+  });
+
+  // Check URL hash for deep-linking FIRST
+  // If hash specifies a file, we load that directly instead of the first file
+  // This avoids race conditions between loading the first file and then the hash file
+  const hashFileLoaded = navigateFromHash();
+
+  // If no hash or hash didn't specify a file, load the first file
+  if (!hashFileLoaded) {
+    const firstFile = document.querySelector('.tree-file');
+    if (firstFile) {
+      firstFile.click();
+      // If hash has just a line number (no file), apply it after first file loads
+      if (window.location.hash.match(/^#L\d+$/)) {
+        setTimeout(() => navigateFromHash(), 100);
+      }
+    }
+  }
+
+  // Mark initialization complete after a delay to let scrolling finish
+  setTimeout(() => {
+    isInitializing = false;
+    updatePinnedUserMessage();
+  }, 500);
+
+  // Handle hash changes (browser back/forward)
+  window.addEventListener('hashchange', () => {
+    navigateFromHash();
+  });
+
+  // Resizable panels
+  function initResize() {
+    const fileTreePanel = document.getElementById('file-tree-panel');
+    const transcriptPanel = document.getElementById('transcript-panel');
+    const resizeLeft = document.getElementById('resize-left');
+    const resizeRight = document.getElementById('resize-right');
+
+    let isResizing = false;
+    let currentHandle = null;
+    let startX = 0;
+    let startWidthLeft = 0;
+    let startWidthRight = 0;
+
+    function startResize(e, handle) {
+      isResizing = true;
+      currentHandle = handle;
+      startX = e.clientX;
+      handle.classList.add('dragging');
+      document.body.style.cursor = 'col-resize';
+      document.body.style.userSelect = 'none';
+
+      if (handle === resizeLeft) {
+        startWidthLeft = fileTreePanel.offsetWidth;
+      } else {
+        startWidthRight = transcriptPanel.offsetWidth;
+      }
+
+      e.preventDefault();
+    }
+
+    function doResize(e) {
+      if (!isResizing) return;
+
+      const dx = e.clientX - startX;
+
+      if (currentHandle === resizeLeft) {
+        const newWidth = Math.max(200, Math.min(500, startWidthLeft + dx));
+        fileTreePanel.style.width = newWidth + 'px';
+      } else {
+        const newWidth = Math.max(280, Math.min(700, startWidthRight - dx));
+        transcriptPanel.style.width = newWidth + 'px';
+      }
+    }
+
+    function stopResize() {
+      if (!isResizing) return;
+      isResizing = false;
+      if (currentHandle) {
+        currentHandle.classList.remove('dragging');
+      }
+      currentHandle = null;
+      document.body.style.cursor = '';
+      document.body.style.userSelect = '';
+    }
+
+    resizeLeft.addEventListener('mousedown', (e) => startResize(e, resizeLeft));
+    resizeRight.addEventListener('mousedown', (e) => startResize(e, resizeRight));
+    document.addEventListener('mousemove', doResize);
+    document.addEventListener('mouseup', stopResize);
+  }
+
+  initResize();
+
+  // File tree collapse/expand
+  const collapseBtn = document.getElementById('collapse-file-tree');
+  const fileTreePanel = document.getElementById('file-tree-panel');
+  const resizeLeftHandle = document.getElementById('resize-left');
+
+  if (collapseBtn && fileTreePanel) {
+    collapseBtn.addEventListener('click', () => {
+      fileTreePanel.classList.toggle('collapsed');
+      if (resizeLeftHandle) {
+        resizeLeftHandle.style.display = fileTreePanel.classList.contains('collapsed') ? 'none' : '';
+      }
+      collapseBtn.title = fileTreePanel.classList.contains('collapsed') ? 'Expand file tree' : 'Collapse file tree';
+    });
+  }
+
+  // Initialize transcript with windowed rendering
+  // Add top sentinel for upward lazy loading
+  const transcriptContentInit = document.getElementById('transcript-content');
+  const topSentinelInit = document.createElement('div');
+  topSentinelInit.id = 'transcript-sentinel-top';
+  topSentinelInit.style.height = '1px';
+  transcriptContentInit.insertBefore(topSentinelInit, transcriptContentInit.firstChild);
+
+  // Render initial chunk of messages (starting from 0)
+  windowStart = 0;
+  windowEnd = -1;
+  renderNextChunk();
+
+  // Set up scroll observers for bi-directional lazy loading
+  setupScrollObservers();
+
+  // Sticky user message header
+  const pinnedUserMessage = document.getElementById('pinned-user-message');
+  const pinnedUserContent = pinnedUserMessage?.querySelector('.pinned-user-content');
+  const pinnedUserLabel = pinnedUserMessage?.querySelector('.pinned-user-message-label');
+  const transcriptPanel = document.getElementById('transcript-panel');
+  const transcriptContent = document.getElementById('transcript-content');
+  let currentPinnedMessage = null;
+
+  function extractUserMessageText(messageEl) {
+    const contentEl = messageEl.querySelector('.message-content');
+    if (!contentEl) return '';
+
+    let text = contentEl.textContent.trim();
+    if (text.length > 150) {
+      text = text.substring(0, 150) + '...';
+    }
+    return text;
+  }
+
+  // Get the prompt number for any message from server-provided data
+  function getPromptNumber(messageEl) {
+    const msgId = messageEl.id;
+    if (!msgId) return null;
+
+    const msgIndex = msgIdToIndex.get(msgId);
+    if (msgIndex === undefined) return null;
+
+    // Every message has prompt_num set by the server
+    return messagesData[msgIndex]?.prompt_num || null;
+  }
+
+  // Cache the pinned message height to avoid flashing when it's hidden
+  let cachedPinnedHeight = 0;
+  // Store pinned message ID separately (element reference may become stale after teleportation)
+  let currentPinnedMsgId = null;
+
+  function updatePinnedUserMessage() {
+    if (!pinnedUserMessage || !transcriptContent || !transcriptPanel) return;
+    if (isInitializing || isScrollingToTarget) return; // Skip during scrolling to avoid repeated updates
+
+    const userMessages = transcriptContent.querySelectorAll('.message.user:not(.continuation)');
+    if (userMessages.length === 0) {
+      pinnedUserMessage.style.display = 'none';
+      currentPinnedMessage = null;
+      currentPinnedMsgId = null;
+      return;
+    }
+
+    const panelRect = transcriptPanel.getBoundingClientRect();
+    const headerHeight = transcriptPanel.querySelector('h3')?.offsetHeight || 0;
+
+    // Use cached height if pinned is hidden, otherwise update cache
+    if (pinnedUserMessage.style.display !== 'none') {
+      cachedPinnedHeight = pinnedUserMessage.offsetHeight || cachedPinnedHeight;
+    }
+    // Use a minimum height estimate if we've never measured it
+    const pinnedHeight = cachedPinnedHeight || 40;
+
+    // Threshold for when a message is considered "scrolled past"
+    const pinnedAreaBottom = panelRect.top + headerHeight + pinnedHeight;
+
+    let messageToPin = null;
+    let nextUserMessage = null;
+
+    for (const msg of userMessages) {
+      const msgRect = msg.getBoundingClientRect();
+      // A message should be pinned if its bottom is above the pinned area
+      if (msgRect.bottom < pinnedAreaBottom) {
+        messageToPin = msg;
+      } else {
+        // This is the first user message that's visible
+        nextUserMessage = msg;
+        break;
+      }
+    }
+
+    // Hide pinned if the next user message is entering the pinned area
+    // Use a small buffer to prevent flashing at the boundary
+    if (nextUserMessage) {
+      const nextRect = nextUserMessage.getBoundingClientRect();
+      if (nextRect.top < pinnedAreaBottom) {
+        // Next user message is in the pinned area - hide the pinned
+        messageToPin = null;
+      }
+    }
+
+    if (messageToPin && messageToPin !== currentPinnedMessage) {
+      currentPinnedMessage = messageToPin;
+      currentPinnedMsgId = messageToPin.id;
+      const promptNum = getPromptNumber(messageToPin);
+      // Update label with prompt number
+      if (pinnedUserLabel) {
+        pinnedUserLabel.textContent = promptNum ? `User Prompt #${promptNum}` : 'User Prompt';
+      }
+      pinnedUserContent.textContent = extractUserMessageText(messageToPin);
+      pinnedUserMessage.style.display = 'block';
+      // Use message ID to look up element on click (element may be stale after teleportation)
+      pinnedUserMessage.onclick = () => {
+        if (currentPinnedMsgId) {
+          const msgEl = transcriptContent.querySelector(`#${CSS.escape(currentPinnedMsgId)}`);
+          if (msgEl) {
+            msgEl.scrollIntoView({ behavior: 'smooth', block: 'start' });
+          } else {
+            // Element not in DOM (teleported away) - use scrollToMessage to bring it back
+            scrollToMessage(currentPinnedMsgId);
+          }
+        }
+      };
+    } else if (!messageToPin) {
+      pinnedUserMessage.style.display = 'none';
+      currentPinnedMessage = null;
+      currentPinnedMsgId = null;
+    }
+  }
+
+  // Throttle scroll handler
+  let scrollTimeout = null;
+  transcriptPanel?.addEventListener('scroll', () => {
+    if (scrollTimeout) return;
+    scrollTimeout = setTimeout(() => {
+      updatePinnedUserMessage();
+      scrollTimeout = null;
+    }, 16);
+  });
+
+  setTimeout(updatePinnedUserMessage, 100);
+
+  // Click handler for transcript messages
+  transcriptContent?.addEventListener('click', (e) => {
+    const messageEl = e.target.closest('.message');
+    if (!messageEl) return;
+
+    const msgId = messageEl.id;
+    if (!msgId) return;
+
+    const msgIndex = msgIdToIndex.get(msgId);
+    if (msgIndex === undefined) return;
+
+    const nextOp = findNextBlameOp(msgIndex);
+    if (nextOp) {
+      navigateToBlame(nextOp.msgId);
+    }
+  });
+}
+
+// Start initialization
+init();
diff --git a/src/claude_code_transcripts/templates/index.html b/src/claude_code_transcripts/templates/index.html
index 30ed6ea..f0d6296 100644
--- a/src/claude_code_transcripts/templates/index.html
+++ b/src/claude_code_transcripts/templates/index.html
@@ -3,34 +3,15 @@
 {% block title %}Claude Code transcript - Index{% endblock %}
 
 {% block content %}
-
-
-
-    Claude Code transcript
-
-