D2M: Mon, Jun 15, 2026 (v6.15.26) #74
Open
github-actions[bot] wants to merge 24 commits into
…tool-call streaming
1. Apple on-device fallback leaves thread non-sendable:
Both Apple Intelligence fallback paths in MessagingVC (post-tool-loop
and primary send) now properly call ActiveRequestTracker.markIdle(),
clear streamingPartial, and reset VoiceLoopCoordinator to .idle.
Previously these were missing, leaving the conversation in a stuck
'active' state after an Apple response.
2. Apple fallback error handling:
The catch blocks in both Apple Task{} paths were empty. Now they
surface a user-facing error message, persist it, play the error
earcon, and clean up all state — matching the existing error path
for when Apple Intelligence is unavailable.
3. Claude tool-call content leaks into streaming bubble:
AnthropicStreamReader now tracks when the first tool_use content
block starts (sawToolUse flag). Once set, text_delta events are
still accumulated in contentBuffer (for the 'Used N tools'
disclosure prose) but no longer fired via onDelta — preventing
pre-tool-call 'thinking' text from appearing in the live streaming
bubble as visible assistant prose.
4. Added AnthropicStreamReaderTests covering:
- Text-only content assembly
- Tool-call assembly from input_json_delta
- onDelta suppression after tool_use block starts
- Error propagation on mid-stream connectivity failure
- Usage (token) parsing
Co-Authored-By: bot_apk <apk@cognition.ai>
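The suppression rule in item 3 can be sketched as follows. This is a minimal illustration, not the actual AnthropicStreamReader: the `StreamEvent` enum and `SketchStreamReader` class are hypothetical stand-ins, and only the `sawToolUse`, `contentBuffer`, and `onDelta` names come from the commit message.

```swift
import Foundation

/// Hypothetical, simplified stand-ins for the streaming events named above.
enum StreamEvent {
    case textDelta(String)   // a text_delta chunk
    case toolUseStart        // a tool_use content block begins
}

/// Illustrative reader only; the real AnthropicStreamReader is not shown in this PR text.
final class SketchStreamReader {
    private(set) var contentBuffer = ""
    private(set) var sawToolUse = false
    private let onDelta: (String) -> Void

    init(onDelta: @escaping (String) -> Void) {
        self.onDelta = onDelta
    }

    func handle(_ event: StreamEvent) {
        switch event {
        case .toolUseStart:
            // From here on, text is kept for the disclosure but not streamed live.
            sawToolUse = true
        case .textDelta(let text):
            contentBuffer += text          // always accumulated
            if !sawToolUse {
                onDelta(text)              // only fired before the first tool_use block
            }
        }
    }
}
```

With this shape, a test for "onDelta suppression after tool_use block starts" only needs to feed a text delta, a tool-use start, and another text delta, then compare what the callback received against `contentBuffer`.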
Show which speech-to-text backend is active during live transcription:
- iOS: small monospaced pill (DG or APL) beside the transcribingLabel
- macOS: appended to the recorder bar placeholder text
VoiceLoopCoordinator gains an STTEngine enum and activeSTTEngine
property. MessageBox publishes the engine when recording starts and on
Deepgram-to-SFSpeech fallback.
Co-Authored-By: bot_apk <apk@cognition.ai>
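One possible shape for the badge mapping, as a rough sketch. The case names `deepgram` and `appleSpeech` and the `recorderPlaceholder` helper are assumptions for illustration; only the enum name STTEngine and the "DG" / "APL" labels come from the commit message.

```swift
/// Hypothetical case names; the PR text only gives the enum name and the two labels.
enum STTEngine {
    case deepgram
    case appleSpeech

    /// Short label for the pill: "DG" or "APL".
    var badge: String {
        switch self {
        case .deepgram: return "DG"
        case .appleSpeech: return "APL"
        }
    }
}

/// Illustrative macOS-style placeholder text with the badge appended (assumed format).
func recorderPlaceholder(base: String, engine: STTEngine?) -> String {
    guard let engine = engine else { return base }
    return "\(base) (\(engine.badge))"
}
```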
Implements the Music Mini-Player Banner feature:
- MusicMiniPlayerView: compact pill (minimized) and expanded card states
  with play/pause, skip, progress, album art, and deep-link to Apple Music
- TopBannerScrollView: horizontally scrollable container below the
  sub-agent status bar, hosting the music mini-player
- Visibility logic: shows when music playing/paused within 5min,
  auto-dismisses on stop, auto-minimizes on voice recording
- Gesture support: tap pill to expand, swipe down to collapse, swipe away
  to dismiss
- Wired to MusicController.shared for playback state and controls
- Xcode project updated: new files excluded from LoopMac and LoopVision
  targets
Co-Authored-By: bot_apk <apk@cognition.ai>
- Duck music synchronously before earcon in switchToRecordingState() so
  the Action Button flow doesn't clip audio against a playing track.
- Fix handleVoiceLoopState to only resume on .idle, not .thinking or
  .transcribing; this prevents a brief music-plays gap mid-voice-turn.
- Add resumeToken (UUID) that tracks whether auto-resume is valid;
  cleared on user-explicit stop/pause so orphaned resumes never fire.
- New play_music/set_music_mood calls mid-voice-session immediately
  re-pause via reduckIfVoiceSessionActive(); the new track becomes the
  resume target.
- resumeAfterVoiceSession() fails silently with a log on any
  MusicKit/AVAudioSession error.
- status() now exposes will_auto_resume for debugging.
Co-Authored-By: bot_apk <apk@cognition.ai>
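The resume-token idea above can be sketched roughly like this. `ResumeGate` and its method names are hypothetical; the sketch only assumes what the commit states, namely that a UUID token marks a pending auto-resume and is cleared on an explicit user stop or pause.

```swift
import Foundation

/// Illustrative gate around auto-resume; not the actual MusicController code.
final class ResumeGate {
    private(set) var resumeToken: UUID?

    /// Mirrors the will_auto_resume debugging flag mentioned in the commit.
    var willAutoResume: Bool { resumeToken != nil }

    /// Called when a voice session ducks music; issues a fresh token.
    func duck() -> UUID {
        let token = UUID()
        resumeToken = token
        return token
    }

    /// A user-explicit stop/pause invalidates any pending auto-resume.
    func userDidStopOrPause() {
        resumeToken = nil
    }

    /// Returns true only if `token` is still the current resume target.
    func consume(_ token: UUID) -> Bool {
        guard resumeToken == token else { return false }
        resumeToken = nil
        return true
    }
}
```

Because a re-duck issues a new token, a resume scheduled for an earlier track fails the comparison and does nothing, which is one way to get the "new track becomes the resume target" behaviour.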
…ter-failures Fix model router failure cases: Apple fallback state leak & Claude tool-call streaming
Add inline STT engine badge (DG / APL) to voice transcription UI
…ayer-banner feat: Music Mini-Player Banner in chat view
Add media ducking and resumption for voice sessions
Co-authored-by: Ash Bhat <me@ashbhat.com>
Replace the deferred transmitToVM stub with PushBridge, a runtime-discovered seam (mirroring AppSignals) that hands the APNs device token to a private sender when present and is a no-op in public clones. Resolve the APNs environment from the embedded provisioning profile, and obtain/register the token immediately after a notification-permission grant instead of waiting for the next launch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Feat/stt improvements
- Playlist IDs starting with 'p.' (user library) now use
  MusicLibraryRequest instead of MusicCatalogResourceRequest, which only
  works for catalog playlists (pl.…).
- Same fix for album IDs starting with 'l.' (library albums).
- Same fix for song IDs starting with 'i.' (library songs).
- queue_mode 'append' now works for albums and playlists, not just
  songs; previously albums/playlists always replaced the queue.
- Updated tool description and system prompt to document library ID
  support and full-track-list queuing.
Co-Authored-By: bot_apk <apk@cognition.ai>
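The prefix routing described above could look something like the sketch below. `MusicIDSource` and `source(forID:)` are hypothetical names; the sketch only decides which kind of request an ID would need and does not call MusicKit.

```swift
/// Where an ID should be looked up (illustrative type, not from the PR).
enum MusicIDSource {
    case library   // would go through a library request
    case catalog   // would go through a catalog resource request
}

/// Classifies an ID by the prefixes listed in the commit: "p.", "l.", "i.".
func source(forID id: String) -> MusicIDSource {
    let libraryPrefixes = ["p.", "l.", "i."]
    let isLibrary = libraryPrefixes.contains { id.hasPrefix($0) }
    return isLibrary ? .library : .catalog
}
```

Note that a catalog playlist ID beginning with "pl." does not match the "p." prefix, because its second character is "l" rather than ".".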
…tivation after TTS
Root cause: When TTS playback finishes on iOS/visionOS, the audio
session was never deactivated with .notifyOthersOnDeactivation. Without
this, the system never sends the 'interruption ended, you may resume'
signal to other audio apps (Apple Music, Spotify, podcasts), so they
stay paused indefinitely after Loop's TTS finishes.
The recording path already did this correctly (MessageBox.swift lines
693 and 1644). Only the TTS finish paths were missing it.
Changes:
- iOS (MessagingVC): Add deactivateAudioSession() helper; call it from
  stopSpeaking(), audioPlayerDidFinishPlaying, audioPlayerDecodeError,
  speechSynthesizer(_:didFinish:), and DeepgramTTS onFinished/onError.
- iOS (MessagingVC): Switch TTS category options from [] or
  [.mixWithOthers] to [.duckOthers] for all providers (Deepgram,
  ElevenLabs, OpenAI, offline, playMP3Data). This tells the system to
  duck other apps' volume during speech and send .shouldResume when we
  deactivate, which is the standard polite-audio-citizen pattern.
- visionOS (VisionVoiceCoordinator): Add matching deactivateAudioSession
  calls after TTS finishes (speechDidFinish, onFinished, onError) and
  after recording teardown.
- macOS/shared (DeepgramTTS): Remove the mixer output tap in
  finishAfterDrain() and the error path before stopping the engine.
  Ensures the engine is fully released so the audio device isn't held.
Loop-initiated MusicKit playback (MusicController) is unaffected: it
uses explicit ApplicationMusicPlayer.play()/pause() on state transitions
and doesn't depend on the audio session's activation state.
Co-Authored-By: bot_apk <apk@cognition.ai>
Add a full Google Workspace integration following the existing
Notion/Slack pattern:
- GoogleWorkspaceClient: shared networking layer with token injection,
  error parsing, and typed errors (including token_expired on 401)
- GoogleDriveSkill: list_files, get_file, read_file, create_file actions
- GoogleGmailSkill: search_messages, get_message, send_message actions
- GoogleCalendarSkill: list_events, create_event actions
- KeyStore: add google_workspace_access_token (required) plus optional
  refresh_token, client_id, client_secret
- IntegrationsVC: add Google Workspace row with connected status,
  service details, and Revoke/Remove button
- Register all three skills in SkillDispatcher, AgentHarness (catalog +
  system prompt fragments), Messaging.swift tools array, MessagingVC
  statusText, SubAgentRuntime statusText, and VoiceLoopCoordinator
  (dispatch + statusText)
Co-Authored-By: bot_apk <apk@cognition.ai>
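A minimal sketch of how "typed errors (including token_expired on 401)" might be modelled. `WorkspaceError` and `classifyResponse` are hypothetical names, not the actual GoogleWorkspaceClient API; the JSON shape parsed here (`{"error": {"message": ...}}`) is the usual Google API error envelope and is assumed rather than taken from the PR.

```swift
import Foundation

/// Illustrative error type; the real client's cases are not shown in the PR text.
enum WorkspaceError: Error, Equatable {
    case tokenExpired
    case http(status: Int, message: String)
}

/// Maps an HTTP status and body to a typed error, or nil on success.
func classifyResponse(status: Int, body: Data) -> WorkspaceError? {
    if (200..<300).contains(status) { return nil }
    if status == 401 { return .tokenExpired }

    // Fall back to a generic message when the body is not the assumed JSON shape.
    var message = "HTTP \(status)"
    if let root = try? JSONSerialization.jsonObject(with: body) as? [String: Any],
       let error = root["error"] as? [String: Any],
       let text = error["message"] as? String {
        message = text
    }
    return .http(status: status, message: message)
}
```

Keeping the 401 case distinct lets callers decide separately whether to attempt a refresh with the optional refresh_token or to ask the user to reconnect.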
End-to-end prototype for generating 1080×1920 portrait HTML infographics
(stories) from structured JSON data and rendering them in-app via a
chromeless WKWebView player.
Components:
- StoryAttachment: chat message attachment type (.generating → .ready)
- StoryGenerator: JSON → HTML renderer that injects data into templates
- StoryGenerationService: async pipeline (mirrors PDFGenerationService)
- StoryPlayerView: chromeless WKWebView, inline (scaled) + full-screen
- StoryPlayerVC: full-screen modal with tap-to-advance + progress bar
- StorySkill: tool definition for generate_story LLM calls
- StoryDemoVC: demo view controller with sample data
- Templates: DailyRecap, ActivitySummary (pure HTML/CSS/JS, no deps)
Wired into MessageStruct via storyAttachment field.
Co-Authored-By: bot_apk <apk@cognition.ai>
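As a rough illustration of the "JSON → HTML renderer that injects data into templates" step, the sketch below serializes a dictionary and substitutes it into a template. The `renderStory` function and the `{{STORY_DATA}}` placeholder are assumptions for illustration; the actual StoryGenerator and its template format are not shown in this PR text.

```swift
import Foundation

/// Injects `data` as JSON into `template` at a hypothetical {{STORY_DATA}} placeholder.
func renderStory(template: String, data: [String: Any]) throws -> String {
    let jsonData = try JSONSerialization.data(withJSONObject: data, options: [.sortedKeys])
    var json = String(decoding: jsonData, as: UTF8.self)

    // Defensive: make sure a "</script>" inside the data cannot close the
    // surrounding <script> tag. JSONSerialization normally escapes "/" already,
    // in which case this replacement finds nothing to change.
    json = json.replacingOccurrences(of: "</", with: "<\\/")

    return template.replacingOccurrences(of: "{{STORY_DATA}}", with: json)
}
```

A template would then read the injected object from its own script, for example `<script>const story = {{STORY_DATA}};</script>`, which keeps the templates free of dependencies as described above.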
…b.com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
…m:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
….com:getathelas/LoopHarness into devin/1780880580-stories-html-infographic
My work this change:
- TTS voice parity: add OpenAI gpt-4o-mini-tts provider + 11 voices and
  the 5 missing ElevenLabs voices to LoopMac so Mac matches the iOS
  voice menu.
- Wire 7 already-shared skills into VoiceLoopCoordinator's dispatch on
  Mac (Maps, Geocoding, Navigation, MuniRealtime, Twitter, SSH, MCP);
  they were advertised to the model but returned "not available on Mac".
- Port the Stories feature to Mac: make StorySkill/StoryGenerationService
  cross-platform (#if os(iOS)||os(macOS)) and add StoryMacUI.swift
  (StoryBubbleView card + StoryPlayerWindowController WKWebView player)
  with a StorySkillHost on ConversationWindowController; advertise
  generate_story on macOS via the shared tools/AgentHarness catalog.
Also includes concurrent in-flight work present in the tree (committed
at the user's request): the iOS Stories prototype rewrite
(StorySkill/StoryGenerator/StoryBundledTemplates/templates/MessagingCell/
MessagingVC + story- prefix handling), Sparkle auto-update wiring,
Google Workspace integration tokens, and the release-mac.yml workflow
overhaul.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…infographic feat: Stories / HTML Infographic prototype
Co-authored-by: Ash Bhat <me@ashbhat.com>
* added improved support for dragging cells to see times
* added support for managed secrets
---------
Co-authored-by: Ash Bhat <me@ashbhat.com>
Summary
This PR lands a large batch of features across iOS, macOS, and the Go runtime: on-device WebKit browsing (BrowseSkill), HTML story/infographic generation ported to macOS, Google Workspace integration (Drive, Gmail, Calendar), VM-backed cron agents, a music mini-player banner, iMessage-style swipe-to-reveal timestamps, Sparkle auto-update for LoopMac, and a background handoff mechanism that migrates an in-flight local inference turn to a Loop Runner when the app backgrounds. It also fixes several audio ducking/TTS cleanup regressions, Apple on-device fallback state leaks, Claude tool-call text bleeding into the streaming bubble, and playlist/album playback using library IDs.
Changes by area
Browse skill (iOS-only, new)
- BrowseSession: drives a non-persistent WKWebView offscreen; full JS bridge (click, type, scroll, wait_for, querySelectorAll, eval_js, finish); per-step screenshot + DOM snapshot persisted to workspace://browse/<id>/ as a replay bundle.
- BrowseGenerationService: coordinates sessions, streams per-step attachment updates to MessagingVC.
- BrowsePlayerVC: full-screen viewer with live-mirror mode (read-only, gesture-swallowing overlay) and scrubbable replay mode (slider + play/pause + per-frame DOM/screenshot/share actions).
- BrowseSkill: browse user-facing tool + internal browse_action tool for the nested model loop; registered in AgentHarness, SkillDispatcher, ToolRouter.
Stories / HTML infographic
- StorySkill, StoryGenerator, StoryGenerationService, StoryAttachment, StoryBundledTemplates, StoryPlayerVC/View ported to #if os(iOS) || os(macOS).
- StoryMacUI.swift: StoryBubbleView card + StoryPlayerWindowController WKWebView player for macOS.
- ConversationWindowController wired as StorySkillHost on macOS.
- story- prefix added to message-filter guards in AnthropicChat, OpenAIChat, and ConversationTitleService.
Google Workspace integration (new)
- GoogleWorkspaceClient: shared networking layer with typed errors including token_expired on 401.
- GoogleDriveSkill (list/get/read/create), GoogleGmailSkill (search/get/send), GoogleCalendarSkill (list/create events).
- KeyStore: google_workspace_access_token, optional refresh/client tokens.
- IntegrationsVC: Google Workspace row with connected status and Revoke/Remove.
- Skills registered in AgentHarness, SkillDispatcher, ToolRouter, VoiceLoopCoordinator.
VM cron agents (new)
- VMCronSkill, VMCronManager, VMCronPoller, VMCronTasksVC: schedule/list/delete recurring agent jobs that run on the SSH VM and push results back.
- VMAgentRuntime, RunnerTurnApplier, BackgroundTurnRunner, RunnerProvisioner: pipeline for running and applying runner turns, including deduplication.
- AppDelegate: handles the runner_turn push type by reconciling the reply into the conversation, suppressing the redundant banner, and routing cold-start taps via PendingConversationOpen.
- Runner binaries (loop-runner-linux-amd64.gz, loop-runner-linux-arm64.gz) bundled as dataset assets.
Background handoff to Loop Runner
- LocalInferenceController: tracks the active URLSessionDataTask for each provider (Anthropic, OpenAI, Fireworks) so it can be hard-cancelled.
- MessagingVC.handoffInFlightTurnIfEligible(): on backgrounding, cancels local inference, marks turn abandoned, submits to runner via LoopRunnerPoller.submitHandoff; posts a local notification when the reply arrives; falls back to an error notice if handoff fails.
- discardIfHandedOff guard in every Cloud.connection.chat completion block.
Audio ducking / TTS cleanup
- MessagingVC: deactivateAudioSession() (with .notifyOthersOnDeactivation) called from stopSpeaking(), audioPlayerDidFinishPlaying, decode error, speechSynthesizer(_:didFinish:), and DeepgramTTS callbacks.
- TTS category options switched from [] / .mixWithOthers to .duckOthers.
- VisionVoiceCoordinator: matching deactivation calls after TTS and recording teardown.
- DeepgramTTS: mixer output tap removed in finishAfterDrain() and the error path.
- MusicController: duck music synchronously before earcon in switchToRecordingState(); resume token guards against orphaned auto-resumes; new play_music/set_music_mood calls mid-session re-pause via reduckIfVoiceSessionActive().
Music mini-player banner
- MusicMiniPlayerView: compact pill and expanded card states with play/pause, skip, progress bar, album art, and deep-link.
- TopBannerScrollView: horizontally scrollable container below the sub-agent status bar hosting the mini-player.
STT engine badge
- VoiceLoopCoordinator gains STTEngine enum and activeSTTEngine property.
- MessageBox publishes engine on start and on Deepgram→SFSpeech fallback.
- iOS: small monospaced pill (DG or APL) beside transcribingLabel; macOS: appended to recorder bar placeholder.
- sttEngine persisted to the model column for user rows and restored on load; shown as a dictation byline under user bubbles.
iMessage-style swipe-to-reveal timestamps
- MessagingCell.timeLabel: pinned just past the right edge of contentView.
- setTimeRevealOffset(_:) translates contentView; layoutSubviews re-applies the transform after UITableViewCell resets the frame each layout pass.
- MessagingVC pan gesture (handleTimeRevealPan) drives all visible cells in lock-step with rubber-banding.
- AgentLargeView.setChromeHidden(_:animated:) added; chrome held back until the orb lands to fix "labels appearing before orb arrives" visual.
Sparkle auto-update (macOS)
- release-mac.yml overhauled from commented-out stub to a fully active workflow: build → sign → notarize app → notarize DMG → package Sparkle zip (post-staple) → generate_appcast with EdDSA signature → publish per-build release (DMG + zip) → upsert appcast release with appcast.xml.
Apple on-device fallback fixes
- Both Apple Task{} fallback paths (primary send and tool-loop) now call ActiveRequestTracker.markIdle, clear streamingPartial, reset VoiceLoopCoordinator to .idle, and surface a user-facing error message with the error earcon.
- Empty catch blocks filled in for both paths.
Claude tool-call streaming fix
- AnthropicStreamReader: sawToolUse flag suppresses onDelta for text_delta events once any tool_use block starts; pre-tool reasoning text is still accumulated in contentBuffer for the disclosure but no longer leaks into the live streaming bubble.
- AnthropicStreamReaderTests added covering text-only, tool-call assembly, delta suppression, connectivity error, and usage parsing.
Playlist/album playback fix
- Library playlist IDs (p.), album IDs (l.), and song IDs (i.) now use MusicLibraryRequest instead of MusicCatalogResourceRequest.
- queue_mode: append now works for albums and playlists, not just songs.
APNs / push registration
- PushBridge: runtime-discovered seam (mirrors AppSignals) that passes the APNs token to a private sender when present.
Geocoding skill (new)
- GeocodingSkill: converts an address or place name to lat/lon using CLGeocoder; registered in AgentHarness, SkillDispatcher, VoiceLoopCoordinator.
AgentMail skill (new)
- AgentMailClient + AgentMailSkill: send/receive email via the AgentMail API; keys wired through KeyStore and Info.plist.
Managed-mode / AppFlags
- AppFlags.isManaged reads LOOP_FLAG from Info.plist; when set, pins TTS provider to ElevenLabs Flash v2.5, shows "Managed" in the backend indicator (non-interactive), drops the provider picker from the speaker menu, and allows managed onboarding to speak its messages.
- Secrets_managed.xcconfig added to .gitignore; Config.xcconfig switches include to Secrets_managed.xcconfig.
SSH / runner improvements
- SSHConfigStore, SSHConnectionsVC, SSHSettingsVC: multiple SSH profiles, passphrase support, managed pre-seeding from Info.plist.
- LoopRunnerSSHClient, LoopRunnerPoller, LoopRunnerClient: handoff submission, backstop catch-up polling for missed pushes.
- RunnerEditVC, RunnerModels, RunnerProvisioner added.
Go runtime
- agent.go, handlers.go, storage.go, main.go, config.go: managed-secrets support, improved cron handling, new integration test coverage.
Notable risks
- If submitHandoff fails and the runner is unreachable, the user sees a notice but has permanently lost the local result for that turn.
- BrowseSession parks a WKWebView offscreen in the key window: if keyWindow() returns nil (e.g. during early launch or on multi-scene), the web view is never attached and WebKit may not render/snapshot correctly, silently producing empty frames.
- Secrets_managed.xcconfig replaces Secrets.xcconfig as the build config include: developers who have Secrets.xcconfig locally must rename or symlink; the old filename is no longer read by Config.xcconfig.
- The Sparkle EdDSA private key (SPARKLE_ED_PRIVATE_KEY) is a new required CI secret: the release workflow will fail on main pushes until it is added.
- sttEngine piggybacked on the model DB column for user rows: any existing user-role rows with a non-empty model value (e.g. from a future schema change) would be misread as a dictation byline.
- storyCardView, browseCardView constraints grow with each prepareForReuse call if NSLayoutConstraint.deactivate does not fully remove them. The lazy-view superview == nil guard prevents duplicate additions, but constraint deactivation on recycled cells for the story/browse paths should be verified under heavy scrolling.
Auto-generated by .github/workflows/auto-develop-pr.yml
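For reviewers weighing the handoff risk listed above, the discardIfHandedOff guard can be pictured roughly as below. `HandoffLedger` and `completeLocally` are hypothetical names; the sketch assumes only that a completion arriving for a turn already handed to the runner should be dropped rather than applied.

```swift
import Foundation

/// Illustrative bookkeeping of which turns were handed off to a runner.
final class HandoffLedger {
    private var handedOff = Set<UUID>()

    func markHandedOff(_ turnID: UUID) {
        handedOff.insert(turnID)
    }

    /// Returns true when a local completion for `turnID` should be dropped.
    func discardIfHandedOff(_ turnID: UUID) -> Bool {
        handedOff.contains(turnID)
    }
}

/// Stand-in for a chat completion block: applies the reply unless the turn was handed off.
func completeLocally(turnID: UUID, reply: String, ledger: HandoffLedger, apply: (String) -> Void) {
    if ledger.discardIfHandedOff(turnID) { return }
    apply(reply)
}
```

Under this shape the local result is discarded as soon as the turn is marked, so a failed submission afterwards leaves nothing to fall back on, which matches the first risk in the list.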