Skip to content

Commit b171f29

Browse files
wilcorreaCopilot
andauthored
feat(acp): Session persistence, connection health and UI polish (#45)
* feat(acp): Add heartbeat, dead connection cleanup and connection logging - Add Rust-side heartbeat task with 15s ping interval and 3-strike disconnect - Clean up dead connections in `acp_connect` and `acp_check_health` - Emit structured connection log events via `acp:log` - Add `HeartbeatEvent`, `ConnectionLogEntry` and `ConnectionConfig` types - Add `chat_panel_size` field to session persistence * feat(acp): Persist session state across minimize/restore and reload history - Add in-memory session cache with global event listener independent of React lifecycle - Keep DirectoryWorkspace mounted via CSS hidden when minimized - Add `persistedWorkspaceId` to AppContext for mount/visibility separation - Always call `acp_load_session` to replay full conversation from ACP backend - Restore cached messages on "already loaded" fallback - Add periodic 30s heartbeat in `useAcpConnection` for proactive disconnect detection - Remove auto-disconnect on unmount; move disconnect to explicit close/forget - Add connection logs panel with `useAcpLogs` hook and `ConnectionLogs` component - Add i18n strings for connection logs and ACP status * test(acp): Add tests for connection hook and logs hook - Test connect/disconnect lifecycle, status events, workspace isolation - Test cached state restoration and health check on mount - Verify disconnect is not called on unmount - Test log capture, filtering and clear behavior * style(ui): Replace X icon with Trash2 on session delete button * fix(acp): Reset session init on disconnect and auto-reload on "not found" - Reset `initRef` when `isConnected` goes false so reconnect triggers `doInit` and reloads the session in the new copilot process - Auto-recover in `sendPrompt`: on "Session not found" (-32602), try `acp_load_session` then retry the prompt before surfacing the error * fix(acp): Add 60s streaming timeout to auto-clear stale Stop button If no streaming event arrives within 60s, `isStreaming` is reset to false automatically. Prevents the Stop button from staying red indefinitely when `end_turn` is never received. * fix: replace native button with shadcn Button and localize aria-label in ErrorConsole - Use Button (variant=ghost, size=icon) from shadcn/ui - Add useTranslation() and use t('common.dismiss') for aria-label - Add common.dismiss key to pt-BR.json ("Dispensar") and en.json ("Dismiss") Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(acp): Avoid awaiting async ops while holding connections mutex In `acp_connect` and `acp_check_health`, `is_alive().await` and `shutdown().await` were called while the connections mutex was held, potentially stalling other ACP commands behind long waits. Refactored both functions to remove the entry under the lock, drop the lock, perform async checks/shutdown outside it, then re-acquire the lock only to re-insert (if still alive) or finalize cleanup. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(acp): Ensure `onConnect` runs even if `onDisconnect` rejects In `handleReconnect`, if `onDisconnect()` rejected the promise chain would abort before calling `onConnect()`, leaving the UI stuck in an error state. Wrapped `onDisconnect` in try/catch so `onConnect` always executes regardless of disconnect errors. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(acp): Fix listener guard and add streaming timer in session-cache Flip listenerSetup flag only after the Tauri event listener resolves, allowing retries if setup fails. Also call resetStreamingTimer() in addUserMessage so the isStreaming flag is always auto-cleared if no backend events follow. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(ui): Replace hardcoded hex colors with HSL CSS variables in index.css Add --terminal-code CSS variable and replace all #3dd68c and #56d4dd hex values with hsl(var(--success)) and hsl(var(--terminal-code)) respectively, ensuring colors respect the app theme system. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(ui): Use semantic token and localize done label in TerminalMessage Replace hex color class with text-success/60 semantic token and localize the hardcoded "done" label via useTranslation(). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(acp): Full state reset on session load and clear idle timer on workspace change Reset all local state fields (currentMode, availableModes, activeAcpSessionId, agentPlanFilePath) when loading an existing session to prevent stale UI state. Also clear idleTimerRef in the workspaceId effect cleanup so timers don't fire on the wrong workspace. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(ui): Use @/ alias for ConnectionLogs and re-trigger init on session change Replace relative import with @/ path alias. Add a dedicated effect to reset initRef when session identity changes so doInit() reruns when a different session is mounted on the same connected workspace. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(acp): Avoid holding connections mutex during shutdown in acp_disconnect Drop the connections lock before calling conn.shutdown().await, consistent with the same fix applied to acp_connect and acp_check_health. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 25da3bf commit b171f29

27 files changed

Lines changed: 1561 additions & 495 deletions

apps/tauri/src-tauri/src/acp/commands.rs

Lines changed: 26 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -28,9 +28,17 @@ pub async fn acp_connect(
2828
app_handle: AppHandle,
2929
state: State<'_, AcpState>,
3030
) -> Result<(), String> {
31-
let mut connections = state.connections.lock().await;
32-
if connections.contains_key(&workspace_id) {
33-
return Ok(());
31+
let existing = {
32+
let mut connections = state.connections.lock().await;
33+
connections.remove(&workspace_id)
34+
};
35+
if let Some(existing) = existing {
36+
if existing.is_alive().await {
37+
state.connections.lock().await.insert(workspace_id.clone(), existing);
38+
return Ok(());
39+
}
40+
existing.shutdown().await;
41+
state.configs.lock().await.remove(&workspace_id);
3442
}
3543

3644
let binary = binary_path
@@ -72,9 +80,10 @@ pub async fn acp_connect(
7280

7381
conn.send_notification("initialized", None).await?;
7482
conn.emit_status("connected", None);
83+
conn.emit_log("info", "connect", &format!("Connected via {}", binary));
7584

7685
state.configs.lock().await.insert(workspace_id.clone(), config);
77-
connections.insert(workspace_id, conn);
86+
state.connections.lock().await.insert(workspace_id, conn);
7887
Ok(())
7988
}
8089

@@ -84,8 +93,9 @@ pub async fn acp_disconnect(
8493
state: State<'_, AcpState>,
8594
) -> Result<(), String> {
8695
state.configs.lock().await.remove(&workspace_id);
87-
let mut connections = state.connections.lock().await;
88-
if let Some(conn) = connections.remove(&workspace_id) {
96+
let conn = state.connections.lock().await.remove(&workspace_id);
97+
if let Some(conn) = conn {
98+
conn.emit_log("info", "disconnect", "Disconnected by user");
8999
conn.shutdown().await;
90100
}
91101
Ok(())
@@ -202,9 +212,10 @@ pub async fn acp_send_prompt(
202212
}],
203213
};
204214

205-
conn.send_request(
215+
conn.send_request_with_timeout(
206216
"session/prompt",
207217
Some(serde_json::to_value(&params).map_err(|e| e.to_string())?),
218+
std::time::Duration::from_secs(600),
208219
)
209220
.await?;
210221

@@ -265,13 +276,19 @@ pub async fn acp_check_health(
265276
app_handle: AppHandle,
266277
state: State<'_, AcpState>,
267278
) -> Result<String, String> {
268-
let connections = state.connections.lock().await;
269-
let status = if let Some(conn) = connections.get(&workspace_id) {
279+
let existing = {
280+
let mut connections = state.connections.lock().await;
281+
connections.remove(&workspace_id)
282+
};
283+
let status = if let Some(conn) = existing {
270284
if conn.is_alive().await {
271285
conn.emit_status("connected", None);
286+
state.connections.lock().await.insert(workspace_id.clone(), conn);
272287
"connected"
273288
} else {
274289
conn.emit_status("disconnected", None);
290+
conn.shutdown().await;
291+
state.configs.lock().await.remove(&workspace_id);
275292
"disconnected"
276293
}
277294
} else {

apps/tauri/src-tauri/src/acp/connection.rs

Lines changed: 131 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ pub struct AcpConnection {
1515
child: ChildRef,
1616
writer_tx: mpsc::Sender<String>,
1717
pending: PendingMap,
18-
next_id: AtomicU64,
18+
next_id: Arc<AtomicU64>,
1919
reader_handle: Mutex<Option<tokio::task::JoinHandle<()>>>,
2020
writer_handle: Mutex<Option<tokio::task::JoinHandle<()>>>,
2121
heartbeat_handle: Mutex<Option<tokio::task::JoinHandle<()>>>,
@@ -64,8 +64,12 @@ impl AcpConnection {
6464
));
6565

6666
let child_arc: ChildRef = Arc::new(Mutex::new(Some(child)));
67+
let next_id = Arc::new(AtomicU64::new(1));
6768
let heartbeat_handle = tokio::spawn(Self::heartbeat_task(
6869
child_arc.clone(),
70+
writer_tx.clone(),
71+
pending.clone(),
72+
next_id.clone(),
6973
workspace_id.clone(),
7074
app_handle.clone(),
7175
));
@@ -74,7 +78,7 @@ impl AcpConnection {
7478
child: child_arc,
7579
writer_tx,
7680
pending,
77-
next_id: AtomicU64::new(1),
81+
next_id,
7882
reader_handle: Mutex::new(Some(reader_handle)),
7983
writer_handle: Mutex::new(Some(writer_handle)),
8084
heartbeat_handle: Mutex::new(Some(heartbeat_handle)),
@@ -169,6 +173,7 @@ impl AcpConnection {
169173
}
170174
}
171175
eprintln!("[acp] Reader task ended for workspace {}", workspace_id);
176+
emit_log_raw(&app_handle, &workspace_id, "warn", "reader_exit", "Reader task ended — stdout closed");
172177
let event = ConnectionStatusEvent {
173178
workspace_id: workspace_id.clone(),
174179
status: "disconnected".to_string(),
@@ -177,17 +182,101 @@ impl AcpConnection {
177182
let _ = app_handle.emit("acp:connection-status", &event);
178183
}
179184

180-
async fn heartbeat_task(child: ChildRef, workspace_id: String, app_handle: AppHandle) {
185+
async fn heartbeat_task(
186+
child: ChildRef,
187+
writer_tx: mpsc::Sender<String>,
188+
pending: PendingMap,
189+
next_id: Arc<AtomicU64>,
190+
workspace_id: String,
191+
app_handle: AppHandle,
192+
) {
181193
let interval = std::time::Duration::from_secs(15);
194+
let ping_timeout = std::time::Duration::from_secs(5);
195+
let mut consecutive_failures: u32 = 0;
196+
182197
loop {
183198
tokio::time::sleep(interval).await;
184-
let mut guard = child.lock().await;
185-
if let Some(ref mut c) = *guard {
186-
match c.try_wait() {
187-
Ok(Some(status)) => {
188-
eprintln!("[acp] Heartbeat: process exited (status: {:?}) for workspace {}", status, workspace_id);
189-
*guard = None;
190-
drop(guard);
199+
200+
{
201+
let mut guard = child.lock().await;
202+
if let Some(ref mut c) = *guard {
203+
match c.try_wait() {
204+
Ok(Some(status)) => {
205+
eprintln!("[acp] Heartbeat: process exited (status: {:?}) for workspace {}", status, workspace_id);
206+
*guard = None;
207+
drop(guard);
208+
emit_log_raw(&app_handle, &workspace_id, "error", "process_exit", &format!("Process exited with status: {:?}", status));
209+
let event = ConnectionStatusEvent {
210+
workspace_id,
211+
status: "disconnected".to_string(),
212+
attempt: None,
213+
};
214+
let _ = app_handle.emit("acp:connection-status", &event);
215+
return;
216+
}
217+
Ok(None) => {}
218+
Err(e) => {
219+
eprintln!("[acp] Heartbeat: try_wait error for workspace {}: {}", workspace_id, e);
220+
}
221+
}
222+
} else {
223+
return;
224+
}
225+
}
226+
227+
let id = next_id.fetch_add(1, Ordering::SeqCst);
228+
let request = JsonRpcRequest::new(id, "ping", None);
229+
let line = match serde_json::to_string(&request) {
230+
Ok(l) => l + "\n",
231+
Err(_) => continue,
232+
};
233+
234+
let (tx, rx) = oneshot::channel();
235+
pending.lock().await.insert(id, tx);
236+
237+
let start = std::time::Instant::now();
238+
if writer_tx.send(line).await.is_err() {
239+
consecutive_failures += 1;
240+
eprintln!("[acp] Heartbeat: writer channel closed for workspace {}", workspace_id);
241+
pending.lock().await.remove(&id);
242+
if consecutive_failures >= 3 {
243+
let event = ConnectionStatusEvent {
244+
workspace_id,
245+
status: "disconnected".to_string(),
246+
attempt: None,
247+
};
248+
let _ = app_handle.emit("acp:connection-status", &event);
249+
return;
250+
}
251+
continue;
252+
}
253+
254+
let timestamp = chrono::Utc::now().to_rfc3339();
255+
match tokio::time::timeout(ping_timeout, rx).await {
256+
Ok(_) => {
257+
let latency = start.elapsed().as_millis() as u64;
258+
consecutive_failures = 0;
259+
let event = HeartbeatEvent {
260+
workspace_id: workspace_id.clone(),
261+
status: "healthy".to_string(),
262+
latency_ms: Some(latency),
263+
timestamp,
264+
};
265+
let _ = app_handle.emit("acp:heartbeat", &event);
266+
}
267+
Err(_) => {
268+
consecutive_failures += 1;
269+
pending.lock().await.remove(&id);
270+
let event = HeartbeatEvent {
271+
workspace_id: workspace_id.clone(),
272+
status: "degraded".to_string(),
273+
latency_ms: None,
274+
timestamp: timestamp.clone(),
275+
};
276+
let _ = app_handle.emit("acp:heartbeat", &event);
277+
emit_log_raw(&app_handle, &workspace_id, "warn", "ping_timeout", &format!("Ping timeout ({}/3)", consecutive_failures));
278+
if consecutive_failures >= 3 {
279+
emit_log_raw(&app_handle, &workspace_id, "error", "disconnect", "Disconnected after 3 consecutive ping timeouts");
191280
let event = ConnectionStatusEvent {
192281
workspace_id,
193282
status: "disconnected".to_string(),
@@ -196,14 +285,7 @@ impl AcpConnection {
196285
let _ = app_handle.emit("acp:connection-status", &event);
197286
return;
198287
}
199-
Ok(None) => {} // still running
200-
Err(e) => {
201-
eprintln!("[acp] Heartbeat: try_wait error for workspace {}: {}", workspace_id, e);
202-
}
203288
}
204-
} else {
205-
// child was already taken (shutdown called)
206-
return;
207289
}
208290
}
209291
}
@@ -244,6 +326,15 @@ impl AcpConnection {
244326
&self,
245327
method: &str,
246328
params: Option<serde_json::Value>,
329+
) -> Result<serde_json::Value, String> {
330+
self.send_request_with_timeout(method, params, std::time::Duration::from_secs(30)).await
331+
}
332+
333+
pub async fn send_request_with_timeout(
334+
&self,
335+
method: &str,
336+
params: Option<serde_json::Value>,
337+
timeout: std::time::Duration,
247338
) -> Result<serde_json::Value, String> {
248339
let id = self.next_id.fetch_add(1, Ordering::SeqCst);
249340
let request = JsonRpcRequest::new(id, method, params);
@@ -260,7 +351,7 @@ impl AcpConnection {
260351
.await
261352
.map_err(|_| "Writer channel closed".to_string())?;
262353

263-
let result = tokio::time::timeout(std::time::Duration::from_secs(30), rx)
354+
let result = tokio::time::timeout(timeout, rx)
264355
.await
265356
.map_err(|_| format!("Timeout waiting for response to {}", method))?
266357
.map_err(|_| "Response channel dropped".to_string())?;
@@ -329,4 +420,26 @@ impl AcpConnection {
329420
};
330421
let _ = self.app_handle.emit("acp:connection-status", &event);
331422
}
423+
424+
pub fn emit_log(&self, level: &str, event: &str, message: &str) {
425+
let entry = ConnectionLogEntry {
426+
timestamp: chrono::Utc::now().to_rfc3339(),
427+
level: level.to_string(),
428+
event: event.to_string(),
429+
message: message.to_string(),
430+
workspace_id: self.workspace_id.clone(),
431+
};
432+
let _ = self.app_handle.emit("acp:log", &entry);
433+
}
434+
}
435+
436+
pub fn emit_log_raw(app_handle: &AppHandle, workspace_id: &str, level: &str, event: &str, message: &str) {
437+
let entry = ConnectionLogEntry {
438+
timestamp: chrono::Utc::now().to_rfc3339(),
439+
level: level.to_string(),
440+
event: event.to_string(),
441+
message: message.to_string(),
442+
workspace_id: workspace_id.to_string(),
443+
};
444+
let _ = app_handle.emit("acp:log", &entry);
332445
}

apps/tauri/src-tauri/src/acp/types.rs

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -147,6 +147,25 @@ pub struct ConnectionStatusEvent {
147147
pub attempt: Option<u32>,
148148
}
149149

150+
#[derive(Debug, Serialize, Clone)]
151+
#[serde(rename_all = "camelCase")]
152+
pub struct HeartbeatEvent {
153+
pub workspace_id: String,
154+
pub status: String,
155+
pub latency_ms: Option<u64>,
156+
pub timestamp: String,
157+
}
158+
159+
#[derive(Debug, Serialize, Clone)]
160+
#[serde(rename_all = "camelCase")]
161+
pub struct ConnectionLogEntry {
162+
pub timestamp: String,
163+
pub level: String,
164+
pub event: String,
165+
pub message: String,
166+
pub workspace_id: String,
167+
}
168+
150169
#[derive(Debug, Clone)]
151170
#[allow(dead_code)] // stored for future reconnect support
152171
pub struct ConnectionConfig {

apps/tauri/src-tauri/src/lib.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -982,6 +982,7 @@ pub fn run() {
982982
sessions::session_update_plan,
983983
sessions::session_update_plan_file_path,
984984
sessions::session_update_phase,
985+
sessions::session_update_chat_panel_size,
985986
sessions::session_delete,
986987
sessions::forget_workspace_data,
987988
plan_file::plan_write,

0 commit comments

Comments
 (0)