feat(codex): add codex app server adapter + e2e tests#2975
Conversation
|
React Doctor found 1 issue in 1 file · 1 warning. 1 warning
Reviewed by React Doctor for commit |
00acd92 to
8899ff0
Compare
c62e8b0 to
492b304
Compare
|
Reviews (1): Last reviewed commit: "improve coverage of app server" | Re-trigger Greptile |
| const ADAPTERS: Adapter[] = ["claude", "codex"]; | ||
|
|
||
| const EDIT_PROMPT = | ||
| "Do exactly these steps and nothing else: 1) Read the file target.txt. " + | ||
| "2) Edit it so the second line reads FOO instead of line2. " + | ||
| "3) Run the shell command `cat target.txt`. " + | ||
| "4) In one sentence confirm what you changed, then stop."; |
There was a problem hiding this comment.
for-loop over adapters instead of parametrised test
The for (const adapter of ADAPTERS) loop is a hand-rolled substitute for describe.each. The team's explicit rule is to prefer parametrised tests. The same pattern appears in structured-output.e2e.test.ts. Both files generate one describe block per adapter through a loop; describe.each expresses that intent directly and is idiomatic in Vitest.
Context Used: Do not attempt to comment on incorrect alphabetica... (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| { outcome: "selected", optionId: "option_1" }, | ||
| ]); | ||
|
|
||
| const params = { | ||
| threadId: "t", | ||
| turnId: "turn", | ||
| itemId: "item-9", | ||
| autoResolutionMs: null, | ||
| questions: [ | ||
| { | ||
| id: "q1", | ||
| header: "Pick one", | ||
| question: "Which environment?", | ||
| isOther: false, | ||
| isSecret: false, | ||
| options: [ | ||
| { label: "staging", description: "" }, | ||
| { label: "production", description: "danger" }, | ||
| ], | ||
| }, | ||
| ], | ||
| }; | ||
|
|
||
| const result = await handleServerRequest( | ||
| APP_SERVER_REQUESTS.TOOL_USER_INPUT, | ||
| params, | ||
| client, | ||
| opts, | ||
| ); | ||
|
|
||
| expect(result.handled).toBe(true); | ||
| expect(result.response).toEqual({ | ||
| answers: { q1: { answers: ["production"] } }, | ||
| }); | ||
|
|
||
| // Prompt carried the question's options and the session id. | ||
| expect(calls).toHaveLength(1); | ||
| expect(calls[0].sessionId).toBe("sess-1"); | ||
| expect(calls[0].options.map((o) => o.name)).toEqual([ | ||
| "staging", | ||
| "production", | ||
| ]); | ||
| }); | ||
|
|
||
| it("defaults a cancelled question to an empty answer", async () => { | ||
| const { client } = fakeClient([{ outcome: "cancelled" }]); | ||
|
|
||
| const params = { | ||
| threadId: "t", | ||
| turnId: "turn", | ||
| itemId: "item-1", | ||
| autoResolutionMs: null, | ||
| questions: [ | ||
| { | ||
| id: "q1", | ||
| header: "h", | ||
| question: "q?", | ||
| isOther: false, | ||
| isSecret: false, | ||
| options: [{ label: "a", description: "" }], |
There was a problem hiding this comment.
Repetitive outcome tests could use
it.each
The allow / reject / cancel outcome tests for each of the three request types (TOOL_USER_INPUT, PERMISSIONS_APPROVAL, MCP_ELICITATION) share the same structure: queue an outcome, call handleServerRequest, assert the response. Grouping them as it.each rows would express the same coverage more concisely, consistent with the team's preference for parametrised tests.
Context Used: Do not attempt to comment on incorrect alphabetica... (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| if (block.type === "resource" && "text" in block.resource) { | ||
| const uri = block.resource.uri ?? ""; | ||
| if (uri.startsWith("file://")) { | ||
| input.push(textInput(resourceLinkText(uri))); | ||
| continue; | ||
| } | ||
| input.push(textInput(uri)); | ||
| context.push( | ||
| `<context ref="${uri}">\n${block.resource.text}\n</context>`, | ||
| ); | ||
| } |
There was a problem hiding this comment.
Empty text block emitted when resource has no URI
When block.resource.uri is undefined or null, block.resource.uri ?? "" yields "" and textInput("") is pushed into the input array. Codex may reject a turn/start payload containing an empty-string text block, and the empty string carries no useful information — the resource content is already in the trailing <context ref=""> block. Skip the bare-URI push when uri is empty. This path has no unit test coverage.
| if (block.type === "resource" && "text" in block.resource) { | |
| const uri = block.resource.uri ?? ""; | |
| if (uri.startsWith("file://")) { | |
| input.push(textInput(resourceLinkText(uri))); | |
| continue; | |
| } | |
| input.push(textInput(uri)); | |
| context.push( | |
| `<context ref="${uri}">\n${block.resource.text}\n</context>`, | |
| ); | |
| } | |
| if (block.type === "resource" && "text" in block.resource) { | |
| const uri = block.resource.uri ?? ""; | |
| if (uri.startsWith("file://")) { | |
| input.push(textInput(resourceLinkText(uri))); | |
| continue; | |
| } | |
| if (uri) { | |
| input.push(textInput(uri)); | |
| } | |
| context.push( | |
| `<context ref="${uri}">\n${block.resource.text}\n</context>`, | |
| ); | |
| } |
14877aa to
7d28bad
Compare
|
Reviews (2): Last reviewed commit: "remove some comments" | Re-trigger Greptile |
| if (response.outcome.optionId === "reject_with_feedback") { | ||
| // codex's response has no feedback field, so decline and inject the guidance | ||
| // into the running turn (as its TUI does: Denied + a follow-up message). | ||
| const feedback = (response as { _meta?: { customInput?: unknown } }) | ||
| ._meta?.customInput; | ||
| const activeTurnId = this.turns.activeTurnId; | ||
| if (typeof feedback === "string" && feedback.trim() && activeTurnId) { | ||
| void this.rpc | ||
| .request(APP_SERVER_METHODS.TURN_STEER, { | ||
| threadId: this.threadId, | ||
| input: toCodexInput([{ type: "text", text: feedback.trim() }]), | ||
| expectedTurnId: activeTurnId, | ||
| }) | ||
| .catch((err) => | ||
| this.logger.warn("turn/steer (reject feedback) failed", err), | ||
| ); | ||
| } | ||
| return { decision: "decline" }; |
There was a problem hiding this comment.
Stale
activeTurnId after feedback steer
The reject_with_feedback path fires turn/steer as a fire-and-forget void call and never calls this.turns.onSteered(newTurnId) with the rotated turn ID that codex returns. After this steer completes, activeTurnId still points to the old (now invalid) turn ID. Any subsequent operation that reads activeTurnId — a second rejection with feedback, an interrupt, or a concurrent prompt() call that hits the turns.isRunning steer branch — will supply a stale expectedTurnId and codex will reject the request with a turn-ID mismatch error.
The main prompt() steer path shows the correct pattern: await the steer response, then call this.turns.onSteered(steerRes?.turnId) to keep the turn controller current.
8899ff0 to
589557c
Compare
7d28bad to
78af4dc
Compare
Adds app server implementation for codex adapter and e2e test coverage for the agent package
This is part 2 of 2 in a stack made with GitButler: