D2M: Mon, Jun 15, 2026 (v6.15.26) #74

Open: github-actions[bot] wants to merge 24 commits into main from develop

github-actions Bot commented Jun 7, 2026

Summary

This PR lands a large batch of features across iOS, macOS, and the Go runtime: on-device WebKit browsing (BrowseSkill), HTML story/infographic generation ported to macOS, Google Workspace integration (Drive, Gmail, Calendar), VM-backed cron agents, a music mini-player banner, iMessage-style swipe-to-reveal timestamps, Sparkle auto-update for LoopMac, and a background handoff mechanism that migrates an in-flight local inference turn to a Loop Runner when the app backgrounds. It also fixes several audio ducking/TTS cleanup regressions, Apple on-device fallback state leaks, Claude tool-call text bleeding into the streaming bubble, and playlist/album playback using library IDs.

Changes by area

Browse skill (iOS-only, new)

  • BrowseSession: drives a non-persistent WKWebView offscreen; full JS bridge (click, type, scroll, wait_for, querySelectorAll, eval_js, finish); per-step screenshot + DOM snapshot persisted to workspace://browse/<id>/ as a replay bundle.
  • BrowseGenerationService: coordinates sessions, streams per-step attachment updates to MessagingVC.
  • BrowsePlayerVC: full-screen viewer with live-mirror mode (read-only, gesture-swallowing overlay) and scrubbable replay mode (slider + play/pause + per-frame DOM/screenshot/share actions).
  • BrowseSkill: browse user-facing tool + internal browse_action tool for the nested model loop; registered in AgentHarness, SkillDispatcher, ToolRouter.

Stories / HTML infographic

  • StorySkill, StoryGenerator, StoryGenerationService, StoryAttachment, StoryBundledTemplates, StoryPlayerVC/View now compile under #if os(iOS) || os(macOS), so they build for both platforms.
  • StoryMacUI.swift: StoryBubbleView card + StoryPlayerWindowController WKWebView player for macOS.
  • ConversationWindowController wired as StorySkillHost on macOS.
  • story- prefix added to message-filter guards in AnthropicChat, OpenAIChat, and ConversationTitleService.

Google Workspace integration (new)

  • GoogleWorkspaceClient: shared networking layer with typed errors including token_expired on 401.
  • GoogleDriveSkill (list/get/read/create), GoogleGmailSkill (search/get/send), GoogleCalendarSkill (list/create events).
  • KeyStore: google_workspace_access_token (required), plus optional refresh_token, client_id, and client_secret.
  • IntegrationsVC: Google Workspace row with connected status and Revoke/Remove.
  • Registered in AgentHarness, SkillDispatcher, ToolRouter, VoiceLoopCoordinator.
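The typed-error behavior described for GoogleWorkspaceClient (a dedicated token_expired case on HTTP 401) could look roughly like the sketch below. The names WorkspaceError and classify(status:) are hypothetical and chosen for illustration; the actual client's types and signatures are not shown in this PR description.

```swift
// Hypothetical error type; only the "token_expired on 401" behavior comes from the PR text.
enum WorkspaceError: Error, Equatable {
    case tokenExpired          // HTTP 401: the access token must be refreshed or re-entered
    case http(status: Int)     // any other non-2xx response
}

/// Maps an HTTP status code to a typed error, or nil when the request succeeded.
func classify(status: Int) -> WorkspaceError? {
    switch status {
    case 200..<300:
        return nil
    case 401:
        return .tokenExpired
    default:
        return .http(status: status)
    }
}
```

Keeping 401 as its own case lets callers such as the skills or IntegrationsVC prompt for reconnection instead of showing a generic failure.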

VM cron agents (new)

  • VMCronSkill, VMCronManager, VMCronPoller, VMCronTasksVC: schedule/list/delete recurring agent jobs that run on the SSH VM and push results back.
  • VMAgentRuntime, RunnerTurnApplier, BackgroundTurnRunner, RunnerProvisioner: pipeline for running and applying runner turns, including deduplication.
  • AppDelegate: handles runner_turn push type — reconciles reply into conversation, suppresses redundant banner, cold-start tap routing via PendingConversationOpen.
  • Linux runner binaries (loop-runner-linux-amd64.gz, loop-runner-linux-arm64.gz) bundled as dataset assets.

Background handoff to Loop Runner

  • LocalInferenceController: tracks the active URLSessionDataTask for each provider (Anthropic, OpenAI, Fireworks) so it can be hard-cancelled.
  • MessagingVC.handoffInFlightTurnIfEligible(): on backgrounding, cancels local inference, marks turn abandoned, submits to runner via LoopRunnerPoller.submitHandoff; posts a local notification when the reply arrives; falls back to an error notice if handoff fails.
  • discardIfHandedOff guard in every Cloud.connection.chat completion block.
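As a rough illustration of the discardIfHandedOff guard, the sketch below drops a late local completion once its turn has been handed to the runner. HandoffLedger, applyCompletion, and the string turn IDs are assumptions made for the example; only the guard's name and purpose come from the PR.

```swift
// Hypothetical bookkeeping for turns that were handed off to the runner.
final class HandoffLedger {
    private var handedOff = Set<String>()

    /// Records that the runner now owns this turn.
    func markHandedOff(_ turnID: String) {
        handedOff.insert(turnID)
    }

    /// Returns true when a local completion for this turn must be ignored.
    func discardIfHandedOff(_ turnID: String) -> Bool {
        handedOff.contains(turnID)
    }
}

/// Stand-in for a chat completion block: appends the reply unless the turn was handed off.
func applyCompletion(turnID: String, reply: String,
                     ledger: HandoffLedger, transcript: inout [String]) {
    if ledger.discardIfHandedOff(turnID) { return }  // the runner's reply will arrive via push
    transcript.append(reply)
}
```

Without such a guard, a cancelled request that still delivers a completion could write a second reply next to the one the runner produces.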

Audio ducking / TTS cleanup

  • iOS MessagingVC: deactivateAudioSession() (with .notifyOthersOnDeactivation) called from stopSpeaking(), audioPlayerDidFinishPlaying, audioPlayerDecodeError, speechSynthesizer(_:didFinish:), and the DeepgramTTS onFinished/onError callbacks.
  • All TTS category options switched from []/.mixWithOthers to .duckOthers.
  • visionOS VisionVoiceCoordinator: matching deactivation calls after TTS and recording teardown.
  • macOS DeepgramTTS: mixer output tap removed in finishAfterDrain() and the error path.
  • MusicController: duck music synchronously before earcon in switchToRecordingState(); resume token guards against orphaned auto-resumes; new play_music/set_music_mood calls mid-session re-pause via reduckIfVoiceSessionActive().
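The resume-token idea in the MusicController bullet can be sketched as follows: each duck issues a fresh token, an explicit user stop clears it, and an auto-resume only proceeds if it still holds the current token. ResumeGate and its method names are hypothetical; the PR only states that a UUID resumeToken guards against orphaned auto-resumes and that status() exposes will_auto_resume.

```swift
import Foundation

// Hypothetical helper modelling the resume-token guard.
final class ResumeGate {
    private var resumeToken: UUID?

    /// Called when a voice session ducks music; returns the token a later resume must present.
    func beginDuck() -> UUID {
        let token = UUID()
        resumeToken = token
        return token
    }

    /// Called on an explicit user stop or pause; invalidates any pending auto-resume.
    func userDidStop() {
        resumeToken = nil
    }

    /// True only if the token is still current. The token is consumed on success.
    func shouldResume(with token: UUID) -> Bool {
        guard resumeToken == token else { return false }
        resumeToken = nil
        return true
    }

    /// Mirrors the will_auto_resume debugging field.
    var willAutoResume: Bool { resumeToken != nil }
}
```

Because a new duck replaces the stored token, a resume scheduled for an earlier track simply fails the comparison, which matches the "new track becomes the resume target" behavior described in the commit message.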

Music mini-player banner

  • MusicMiniPlayerView: compact pill and expanded card states with play/pause, skip, progress bar, album art, and deep-link.
  • TopBannerScrollView: horizontally scrollable container below the sub-agent status bar hosting the mini-player.
  • Auto-dismisses on stop, auto-minimizes on voice recording, swipe-down to collapse.

STT engine badge

  • VoiceLoopCoordinator gains STTEngine enum and activeSTTEngine property.
  • MessageBox publishes engine on start and on Deepgram→SFSpeech fallback.
  • iOS: monospaced pill beside transcribingLabel; macOS: appended to recorder bar placeholder.
  • sttEngine persisted to the model column for user rows and restored on load; shown as a dictation byline under user bubbles.

iMessage-style swipe-to-reveal timestamps

  • MessagingCell.timeLabel: pinned just past the right edge of contentView.
  • setTimeRevealOffset(_:) translates contentView; layoutSubviews re-applies the transform after UITableViewCell resets the frame each layout pass.
  • MessagingVC pan gesture (handleTimeRevealPan) drives all visible cells in lock-step with rubber-banding.
  • AgentLargeView.setChromeHidden(_:animated:) added; chrome is held back until the orb lands, fixing a glitch where labels appeared before the orb arrived.
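The PR does not say how the rubber-banding in handleTimeRevealPan is computed. The sketch below uses a commonly seen damping curve purely as an assumption: the drag maps 1:1 up to a maximum reveal distance, and any overshoot is compressed so it never exceeds one additional maxReveal. The function name, the 72-point limit, and the 0.55 resistance are all illustrative values, not taken from the code.

```swift
/// Maps a raw leftward drag distance to the offset applied to every visible cell.
/// All parameters are hypothetical defaults chosen for the example.
func revealOffset(forDrag drag: Double,
                  maxReveal: Double = 72,
                  resistance: Double = 0.55) -> Double {
    guard drag > 0 else { return 0 }          // ignore drags in the wrong direction
    if drag <= maxReveal { return drag }      // track the finger 1:1 inside the limit
    let overshoot = drag - maxReveal
    // Damped overshoot: grows with the drag but approaches maxReveal asymptotically.
    let damped = (1 - 1 / (overshoot * resistance / maxReveal + 1)) * maxReveal
    return maxReveal + damped
}
```

Applying the same returned value to all visible cells keeps them in lock-step, since the offset depends only on the gesture and not on per-cell state.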

Sparkle auto-update (macOS)

  • Sparkle 2.9.2 added as a Swift package dependency for LoopMac.
  • release-mac.yml overhauled from commented-out stub to a fully active workflow: build → sign → notarize app → notarize DMG → package Sparkle zip (post-staple) → generate_appcast with EdDSA signature → publish per-build release (DMG + zip) → upsert appcast release with appcast.xml.

Apple on-device fallback fixes

  • Both Apple Task{} fallback paths (primary send and tool-loop) now call ActiveRequestTracker.markIdle, clear streamingPartial, reset VoiceLoopCoordinator to .idle, and surface a user-facing error message with the error earcon.
  • Previously empty catch blocks filled in for both paths.

Claude tool-call streaming fix

  • AnthropicStreamReader: sawToolUse flag suppresses onDelta for text_delta events once any tool_use block starts; pre-tool reasoning text is still accumulated in contentBuffer for the disclosure but no longer leaks into the live streaming bubble.
  • AnthropicStreamReaderTests added covering text-only, tool-call assembly, delta suppression, connectivity error, and usage parsing.
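A minimal sketch of the suppression rule, assuming a simplified event model: every text delta is appended to contentBuffer, but onDelta only fires until the first tool_use block starts. StreamEvent and DeltaGate are hypothetical stand-ins; the real AnthropicStreamReader parses server-sent events and is not reproduced here.

```swift
// Hypothetical, simplified view of the stream events the reader reacts to.
enum StreamEvent {
    case textDelta(String)   // payload of a text_delta event
    case toolUseStart        // a content block of type tool_use begins
}

final class DeltaGate {
    private(set) var contentBuffer = ""
    private(set) var sawToolUse = false
    private let onDelta: (String) -> Void

    init(onDelta: @escaping (String) -> Void) {
        self.onDelta = onDelta
    }

    func handle(_ event: StreamEvent) {
        switch event {
        case .toolUseStart:
            sawToolUse = true
        case .textDelta(let text):
            contentBuffer += text               // always kept for the disclosure
            if !sawToolUse { onDelta(text) }    // live bubble only before any tool_use
        }
    }
}

// Usage: the bubble receives only the text that arrived before the tool call.
var bubble = ""
let gate = DeltaGate { bubble += $0 }
gate.handle(.textDelta("Let me check. "))
gate.handle(.toolUseStart)
gate.handle(.textDelta("Calling the tool now."))
```

The flag is never reset within a stream in this sketch, which mirrors the "once any tool_use block starts" wording above.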

Playlist/album playback fix

  • Library playlist IDs (p.), album IDs (l.), and song IDs (i.) now use MusicLibraryRequest instead of MusicCatalogResourceRequest.
  • queue_mode: append now works for albums and playlists, not just songs.
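The prefix rule can be sketched as a small classifier that decides which request family an ID belongs to. MusicIDSource and source(forID:) are hypothetical names, and the sample IDs in the usage note are made up; the prefixes themselves (p., l., i. for library items, pl. for catalog playlists) come from the commit message.

```swift
// Hypothetical routing decision: library IDs go to MusicLibraryRequest,
// everything else to MusicCatalogResourceRequest.
enum MusicIDSource {
    case library
    case catalog
}

func source(forID id: String) -> MusicIDSource {
    let libraryPrefixes = ["p.", "l.", "i."]
    let isLibrary = libraryPrefixes.contains { id.hasPrefix($0) }
    return isLibrary ? .library : .catalog
}
```

Note that a catalog playlist ID such as "pl.abc" does not match the "p." prefix, because its second character is "l" rather than ".", so it still routes to the catalog request.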

APNs / push registration

  • PushBridge: runtime-discovered seam (mirrors AppSignals) that passes the APNs token to a private sender when present.
  • APNs environment resolved from embedded provisioning profile; token registered immediately after permission grant.

Geocoding skill (new)

  • GeocodingSkill: converts an address or place name to lat/lon using CLGeocoder; registered in AgentHarness, SkillDispatcher, VoiceLoopCoordinator.

AgentMail skill (new)

  • AgentMailClient + AgentMailSkill: send/receive email via the AgentMail API; keys wired through KeyStore and Info.plist.

Managed-mode / AppFlags

  • AppFlags.isManaged reads LOOP_FLAG from Info.plist; when set, pins TTS provider to ElevenLabs Flash v2.5, shows "Managed" in the backend indicator (non-interactive), drops the provider picker from the speaker menu, and allows managed onboarding to speak its messages.
  • Secrets_managed.xcconfig added to .gitignore; Config.xcconfig switches include to Secrets_managed.xcconfig.

SSH / runner improvements

  • SSHConfigStore, SSHConnectionsVC, SSHSettingsVC: multiple SSH profiles, passphrase support, managed pre-seeding from Info.plist.
  • LoopRunnerSSHClient, LoopRunnerPoller, LoopRunnerClient: handoff submission, backstop catch-up polling for missed pushes.
  • RunnerEditVC, RunnerModels, RunnerProvisioner added.

Go runtime

  • agent.go, handlers.go, storage.go, main.go, config.go: managed-secrets support, improved cron handling, new integration test coverage.

Notable risks

  • Background handoff is destructive: local inference is hard-cancelled and the turn marked abandoned before the runner submit completes. If submitHandoff fails and the runner is unreachable, the user sees a notice but has permanently lost the local result for that turn.
  • BrowseSession parks a WKWebView offscreen in the key window: if keyWindow() returns nil (e.g. during early launch or in a multi-scene configuration), the web view is never attached and WebKit may not render or snapshot correctly, silently producing empty frames.
  • Secrets_managed.xcconfig replaces Secrets.xcconfig as the build config include: developers who have Secrets.xcconfig locally must rename or symlink; the old filename is no longer read by Config.xcconfig.
  • Sparkle EdDSA key (SPARKLE_ED_PRIVATE_KEY) is a new required CI secret: the release workflow will fail on main pushes until it is added.
  • sttEngine is piggybacked on the model DB column for user rows: any user-role row whose model value is non-empty for another reason (e.g. after a future schema change) would be misread as a dictation byline.
  • storyCardView and browseCardView constraints may accumulate with each prepareForReuse call if NSLayoutConstraint.deactivate does not fully remove them. The lazy-view superview == nil guard prevents duplicate additions, but constraint deactivation on recycled cells for the story/browse paths should be verified under heavy scrolling.

Auto-generated by .github/workflows/auto-develop-pr.yml

devin-ai-integration Bot and others added 6 commits June 7, 2026 05:41
…tool-call streaming

1. Apple on-device fallback leaves thread non-sendable:
   Both Apple Intelligence fallback paths in MessagingVC (post-tool-loop
   and primary send) now properly call ActiveRequestTracker.markIdle(),
   clear streamingPartial, and reset VoiceLoopCoordinator to .idle.
   Previously these were missing, leaving the conversation in a stuck
   'active' state after an Apple response.

2. Apple fallback error handling:
   The catch blocks in both Apple Task{} paths were empty. Now they
   surface a user-facing error message, persist it, play the error
   earcon, and clean up all state — matching the existing error path
   for when Apple Intelligence is unavailable.

3. Claude tool-call content leaks into streaming bubble:
   AnthropicStreamReader now tracks when the first tool_use content
   block starts (sawToolUse flag). Once set, text_delta events are
   still accumulated in contentBuffer (for the 'Used N tools'
   disclosure prose) but no longer fired via onDelta — preventing
   pre-tool-call 'thinking' text from appearing in the live streaming
   bubble as visible assistant prose.

4. Added AnthropicStreamReaderTests covering:
   - Text-only content assembly
   - Tool-call assembly from input_json_delta
   - onDelta suppression after tool_use block starts
   - Error propagation on mid-stream connectivity failure
   - Usage (token) parsing

Co-Authored-By: bot_apk <apk@cognition.ai>
Show which speech-to-text backend is active during live transcription:
- iOS: small monospaced pill (DG or APL) beside the transcribingLabel
- macOS: appended to the recorder bar placeholder text

VoiceLoopCoordinator gains an STTEngine enum and activeSTTEngine
property. MessageBox publishes the engine when recording starts and
on Deepgram-to-SFSpeech fallback.

Co-Authored-By: bot_apk <apk@cognition.ai>
Implements the Music Mini-Player Banner feature:
- MusicMiniPlayerView: compact pill (minimized) and expanded card states
  with play/pause, skip, progress, album art, and deep-link to Apple Music
- TopBannerScrollView: horizontally scrollable container below the sub-agent
  status bar, hosting the music mini-player
- Visibility logic: shows while music is playing or was paused within the last 5 min,
  auto-dismisses on stop, auto-minimizes on voice recording
- Gesture support: tap pill to expand, swipe down to collapse, swipe away to dismiss
- Wired to MusicController.shared for playback state and controls
- Xcode project updated: new files excluded from LoopMac and LoopVision targets

Co-Authored-By: bot_apk <apk@cognition.ai>
- Duck music synchronously before earcon in switchToRecordingState()
  so the Action Button flow doesn't clip audio against a playing track.
- Fix handleVoiceLoopState to only resume on .idle, not .thinking or
  .transcribing — prevents a brief music-plays gap mid-voice-turn.
- Add resumeToken (UUID) that tracks whether auto-resume is valid;
  cleared on user-explicit stop/pause so orphaned resumes never fire.
- New play_music/set_music_mood calls mid-voice-session immediately
  re-pause via reduckIfVoiceSessionActive(); the new track becomes
  the resume target.
- resumeAfterVoiceSession() fails silently with a log on any
  MusicKit/AVAudioSession error.
- status() now exposes will_auto_resume for debugging.

Co-Authored-By: bot_apk <apk@cognition.ai>
…ter-failures

Fix model router failure cases: Apple fallback state leak & Claude tool-call streaming
Add inline STT engine badge (DG / APL) to voice transcription UI
vercel Bot commented Jun 7, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project      | Deployment | Actions          | Updated (UTC)
loop-harness | Ready      | Preview, Comment | Jun 15, 2026 4:54pm

…ayer-banner

feat: Music Mini-Player Banner in chat view
Add media ducking and resumption for voice sessions
Co-authored-by: Ash Bhat <me@ashbhat.com>
ashbhat and others added 4 commits June 7, 2026 10:05
Replace the deferred transmitToVM stub with PushBridge, a runtime-discovered
seam (mirroring AppSignals) that hands the APNs device token to a private
sender when present and is a no-op in public clones. Resolve the APNs
environment from the embedded provisioning profile, and obtain/register the
token immediately after a notification-permission grant instead of waiting
for the next launch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
devin-ai-integration Bot and others added 9 commits June 7, 2026 23:10
- Playlist IDs starting with 'p.' (user library) now use
  MusicLibraryRequest instead of MusicCatalogResourceRequest, which
  only works for catalog playlists (pl.…).
- Same fix for album IDs starting with 'l.' (library albums).
- Same fix for song IDs starting with 'i.' (library songs).
- queue_mode 'append' now works for albums and playlists, not just
  songs — previously albums/playlists always replaced the queue.
- Updated tool description and system prompt to document library ID
  support and full-track-list queuing.

Co-Authored-By: bot_apk <apk@cognition.ai>
…tivation after TTS

Root cause: When TTS playback finishes on iOS/visionOS, the audio
session was never deactivated with .notifyOthersOnDeactivation. Without
this, the system never sends the 'interruption ended — you may resume'
signal to other audio apps (Apple Music, Spotify, podcasts), so they
stay paused indefinitely after Loop's TTS finishes.

The recording path already did this correctly (MessageBox.swift lines 693
and 1644). Only the TTS finish paths were missing it.

Changes:
- iOS (MessagingVC): Add deactivateAudioSession() helper; call it from
  stopSpeaking(), audioPlayerDidFinishPlaying, audioPlayerDecodeError,
  speechSynthesizer(_:didFinish:), and DeepgramTTS onFinished/onError.
- iOS (MessagingVC): Switch TTS category options from [] or
  [.mixWithOthers] to [.duckOthers] for all providers (Deepgram,
  ElevenLabs, OpenAI, offline, playMP3Data). This tells the system to
  duck other apps' volume during speech and send .shouldResume when we
  deactivate — the standard polite-audio-citizen pattern.
- visionOS (VisionVoiceCoordinator): Add matching deactivateAudioSession
  calls after TTS finishes (speechDidFinish, onFinished, onError) and
  after recording teardown.
- macOS/shared (DeepgramTTS): Remove the mixer output tap in
  finishAfterDrain() and the error path before stopping the engine.
  Ensures the engine is fully released so the audio device isn't held.

Loop-initiated MusicKit playback (MusicController) is unaffected — it
uses explicit ApplicationMusicPlayer.play()/pause() on state transitions
and doesn't depend on the audio session's activation state.

Co-Authored-By: bot_apk <apk@cognition.ai>
Add a full Google Workspace integration following the existing Notion/Slack
pattern:

- GoogleWorkspaceClient: shared networking layer with token injection,
  error parsing, and typed errors (including token_expired on 401)
- GoogleDriveSkill: list_files, get_file, read_file, create_file actions
- GoogleGmailSkill: search_messages, get_message, send_message actions
- GoogleCalendarSkill: list_events, create_event actions
- KeyStore: add google_workspace_access_token (required) plus optional
  refresh_token, client_id, client_secret
- IntegrationsVC: add Google Workspace row with connected status,
  service details, and Revoke/Remove button
- Register all three skills in SkillDispatcher, AgentHarness (catalog +
  system prompt fragments), Messaging.swift tools array, MessagingVC
  statusText, SubAgentRuntime statusText, and VoiceLoopCoordinator
  (dispatch + statusText)

Co-Authored-By: bot_apk <apk@cognition.ai>
End-to-end prototype for generating 1080×1920 portrait HTML infographics
(stories) from structured JSON data and rendering them in-app via a
chromeless WKWebView player.

Components:
- StoryAttachment: chat message attachment type (.generating → .ready)
- StoryGenerator: JSON → HTML renderer that injects data into templates
- StoryGenerationService: async pipeline (mirrors PDFGenerationService)
- StoryPlayerView: chromeless WKWebView, inline (scaled) + full-screen
- StoryPlayerVC: full-screen modal with tap-to-advance + progress bar
- StorySkill: tool definition for generate_story LLM calls
- StoryDemoVC: demo view controller with sample data
- Templates: DailyRecap, ActivitySummary (pure HTML/CSS/JS, no deps)

Wired into MessageStruct via storyAttachment field.

Co-Authored-By: bot_apk <apk@cognition.ai>
…b.com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
…m:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
….com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
My work in this change:
- TTS voice parity: add OpenAI gpt-4o-mini-tts provider + 11 voices and the
  5 missing ElevenLabs voices to LoopMac so Mac matches the iOS voice menu.
- Wire 7 already-shared skills into VoiceLoopCoordinator's dispatch on Mac
  (Maps, Geocoding, Navigation, MuniRealtime, Twitter, SSH, MCP) — they were
  advertised to the model but returned "not available on Mac".
- Port the Stories feature to Mac: make StorySkill/StoryGenerationService
  cross-platform (#if os(iOS)||os(macOS)) and add StoryMacUI.swift
  (StoryBubbleView card + StoryPlayerWindowController WKWebView player) with a
  StorySkillHost on ConversationWindowController; advertise generate_story on
  macOS via the shared tools/AgentHarness catalog.

Also includes concurrent in-flight work present in the tree (committed at the
user's request): the iOS Stories prototype rewrite (StorySkill/StoryGenerator/
StoryBundledTemplates/templates/MessagingCell/MessagingVC + story- prefix
handling), Sparkle auto-update wiring, Google Workspace integration tokens,
and the release-mac.yml workflow overhaul.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…infographic

feat: Stories / HTML Infographic prototype
github-actions Bot changed the title from "D2M: Sun, Jun 7, 2026 (v6.7.26)" to "D2M: Tue, Jun 9, 2026 (v6.9.26)" on Jun 9, 2026
Co-authored-by: Ash Bhat <me@ashbhat.com>
github-actions Bot changed the title from "D2M: Tue, Jun 9, 2026 (v6.9.26)" to "D2M: Thu, Jun 11, 2026 (v6.11.26)" on Jun 11, 2026
* added improved support for dragging cells to see times

* added support for managed secrets

---------

Co-authored-by: Ash Bhat <me@ashbhat.com>
github-actions Bot changed the title from "D2M: Thu, Jun 11, 2026 (v6.11.26)" to "D2M: Mon, Jun 15, 2026 (v6.15.26)" on Jun 15, 2026