feat: enhance file operations, multi-type previewer, and architecture documentation #322

Open
topabomb wants to merge 4 commits into NeuralNomadsAI:dev from topabomb:dev

@topabomb

Summary

This PR introduces comprehensive file operation enhancements, a multi-type file previewer system, and extensive architecture documentation updates.

Changes

📁 File Operations Enhancement (Server & UI)

  • Server-side:

    • Added file browser API endpoints with recursive directory support
    • Implemented git worktree operations integration in workspace manager
    • Extended HTTP server routes for workspace file management
    • Added multipart file upload type definitions
  • UI - Right Panel:

    • New useFileOperations hook for centralized file CRUD operations
    • Enhanced FilesTab with file browsing, upload, and management capabilities
    • Improved SplitFilePanel component for dual-pane file viewing
    • Added progress bar component for file operation feedback

📄 Multi-Type File Previewer

  • New viewers:
    • Image viewer with zoom and fit controls
    • PDF viewer with embedded rendering
    • Markdown viewer with formatted output
    • Audio player with playback controls
    • Video player with standard controls
  • Registry system: Centralized viewer registry with automatic file type detection
  • Type definitions: Unified FileViewerType interface for all viewer components
  • Styles: Dedicated file-viewers.css and updated right-panel.css
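
A registry with automatic file type detection could look roughly like the sketch below. This is not the PR's code: `ViewerKind`, `ViewerEntry`, `ViewerRegistry`, and the extension lists are hypothetical names and values chosen for illustration, and only a subset of the viewers is shown.

```typescript
// Hypothetical sketch of a priority-based viewer registry (not the PR's code).
type ViewerKind = "image" | "markdown" | "code";

interface ViewerEntry {
  kind: ViewerKind;
  priority: number; // higher priority wins when several entries match
  matches: (filePath: string) => boolean;
}

// Lower-cased extension of the last path segment; "" for dotfiles or no extension.
function extensionOf(filePath: string): string {
  const base = filePath.slice(filePath.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot + 1).toLowerCase();
}

function byExtension(...exts: string[]): (filePath: string) => boolean {
  const known = new Set(exts);
  return (filePath) => known.has(extensionOf(filePath));
}

class ViewerRegistry {
  private entries: ViewerEntry[] = [];

  register(entry: ViewerEntry): void {
    this.entries.push(entry);
    // Keep entries sorted by descending priority so resolve() returns the best match first.
    this.entries.sort((a, b) => b.priority - a.priority);
  }

  resolve(filePath: string): ViewerKind | undefined {
    return this.entries.find((entry) => entry.matches(filePath))?.kind;
  }
}

const registry = new ViewerRegistry();
// Assumed fallback: the plain code viewer matches everything at the lowest priority.
registry.register({ kind: "code", priority: 0, matches: () => true });
registry.register({ kind: "markdown", priority: 10, matches: byExtension("md", "markdown") });
registry.register({ kind: "image", priority: 10, matches: byExtension("png", "jpg", "jpeg", "gif", "webp") });
```

Adding a new viewer in this design is a single `register` call, which is the property the registry approach is meant to provide.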

📐 Architecture Documentation

  • Agent workflow: Complete 5-phase lifecycle (prompt → streaming → permission → complete → idle)
  • Session state machine: Detailed idle/working/compacting states with guard rules
  • Revision system: sessionRevisions reactive mechanism for UI rendering
  • Permission flow: 10-step closed loop for permission/issue handling
  • Crash handling: Design documentation (no auto-restart policy)
  • OpenCode ↔ Server bridge: Event mapping table
  • Updated TOC with new §25 section

🔧 Dependencies

  • Bumped @opencode-ai/plugin from 1.3.7 to 1.4.3

🌐 Internationalization

  • Added translations for file viewer and instance UI in all supported locales (en, es, fr, he, ja, ru, zh-Hans)

Stats

  • 51 files changed
  • 15,618 insertions / 6,368 deletions

root added 4 commits April 13, 2026 01:27
Implement complete file browser capabilities in the right panel:

Backend (Phase 0-1):
- Add worktree-aware file operations via ?worktree= parameter
- FileSystemBrowser.deleteFile() with directory protection
- WorkspaceManager uploadFile/deleteFile/resolveFilePath methods
- New routes: POST /files/upload, DELETE /files/content, GET /files/download
- Range request support for download (206 Partial Content)
- Migrate resolveWorktreeDirectory to git-worktrees.ts
- Patch existing GET/PUT content routes for worktree awareness
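
For readers unfamiliar with range requests, the 206 behavior can be sketched as a small pure function. This is an illustrative sketch under assumptions, not the PR's implementation: `resolveRange` and `RangeResult` are hypothetical names, and only single `bytes=` ranges are handled (multi-range and malformed headers fall back to a full 200 response).

```typescript
// Hypothetical sketch: map a Range header to a 200 / 206 / 416 decision.
interface ByteRange {
  start: number; // inclusive
  end: number;   // inclusive
}

type RangeResult =
  | { status: 200 }
  | { status: 206; range: ByteRange; contentRange: string; contentLength: number }
  | { status: 416; contentRange: string };

function unsatisfiable(size: number): RangeResult {
  return { status: 416, contentRange: `bytes */${size}` };
}

function resolveRange(header: string | undefined, size: number): RangeResult {
  if (!header) return { status: 200 };

  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  // Multi-range or malformed headers are ignored here and the full file is served.
  if (!match || (match[1] === "" && match[2] === "")) return { status: 200 };

  let start: number;
  let end: number;
  if (match[1] === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffixLength = Number(match[2]);
    if (suffixLength === 0 || size === 0) return unsatisfiable(size);
    start = Math.max(size - suffixLength, 0);
    end = size - 1;
  } else {
    start = Number(match[1]);
    const requestedEnd = match[2] === "" ? size - 1 : Number(match[2]);
    // An end before the start is an invalid spec; this sketch ignores the header.
    if (match[2] !== "" && requestedEnd < start) return { status: 200 };
    if (start >= size) return unsatisfiable(size);
    end = Math.min(requestedEnd, size - 1); // clamp to the last byte
  }

  return {
    status: 206,
    range: { start, end },
    contentRange: `bytes ${start}-${end}/${size}`,
    contentLength: end - start + 1,
  };
}
```

A route handler would then stream bytes `start` through `end` and set `Content-Range` and `Content-Length` from the result; this is what lets browsers seek inside large downloads or media files.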

Frontend (Phase 2-3):
- api-client: uploadWorkspaceFile, downloadWorkspaceFile, deleteWorkspaceFile
- File previewer registry with priority-based selection
- Markdown viewer with rendered/code modes, image inlining, internal links
- Image, audio, video, PDF viewers with Blob URL support
- Progress bar component for upload/download operations

UI integration (Phase 4-5):
- Fixed 48px action column with download/delete buttons (always visible)
- Header: Upload, Stats/path, Eye/Code toggle (md only), Save, Refresh
- Sticky header for scroll persistence
- useFileOperations hook with always-show delete confirmation
- Tab switch re-fetch for file content restoration
- CSS grid height constraint for viewer cell (fixes list scrollbar)

i18n:
- 9 new instance keys across 7 locales (en/es/fr/he/ja/ru/zh-Hans)
- 8 new fileViewer keys across 7 locales
- Replace hardcoded error messages with t() calls

Bug fixes:
- #1: Upload file input covering viewport (position: relative fix)
- #2: Delete without confirmation (always show confirm dialog)
- #3: Header path squeezing stats buttons (max-width constraint)
- #4: Markdown blob URL memory leak (cleanup on switch/unmount)
- #5: Action column stacking vertically (flex layout fix)
- #6: Tab switch loses file content (re-fetch effect)
- #7: Markdown mode loses file list scrollbar (viewer cell height constraint)
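
The blob URL leak fix in the list above follows a common pattern: every URL created with `URL.createObjectURL` is tracked and revoked when the viewer switches files or unmounts. A minimal sketch, assuming a hypothetical `BlobUrlScope` helper (the PR's actual cleanup code is not shown on this page):

```typescript
// Hypothetical sketch: track blob URLs so they can all be revoked together.
class BlobUrlScope {
  private urls = new Set<string>();
  private create: (blob: Blob) => string;
  private revoke: (url: string) => void;

  // The create/revoke functions are injectable so the class can be exercised
  // without a browser; by default they use the standard URL API.
  constructor(
    create: (blob: Blob) => string = (blob) => URL.createObjectURL(blob),
    revoke: (url: string) => void = (url) => URL.revokeObjectURL(url),
  ) {
    this.create = create;
    this.revoke = revoke;
  }

  // Create a blob URL and remember it for later cleanup.
  add(blob: Blob): string {
    const url = this.create(blob);
    this.urls.add(url);
    return url;
  }

  get size(): number {
    return this.urls.size;
  }

  // Revoke everything created in this scope; safe to call more than once.
  dispose(): void {
    for (const url of this.urls) this.revoke(url);
    this.urls.clear();
  }
}
```

A viewer component would call `dispose()` from its cleanup hook and again before rendering a different file, so inlined images from the previous document do not stay pinned in memory.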

Documentation:
- Move ARCHITECTURE_ANALYSIS.md to dev-docs/
- Create FILE_OPERATIONS_ENHANCEMENT.md v1.5

Update ARCHITECTURE_ANALYSIS.md to reflect implementation changes:
- Server: 63 files, ~10,916 lines (was 62 / ~10,465)
- UI: ~410 files, ~66,790 lines (was ~388 / ~64,714)
- Added file-viewer components, useFileOperations hook, file-types.ts
- Updated API routes table (content, upload, download)
- Updated all line numbers across §1-§8, §10, §23, §24
- Plugin dep v1.3.7 → v1.4.3
- CSS panels count 6→7, i18n message parts 17→18

…on system, and permission flow

- §3.5: Crash handling design (no auto-restart)
- §11.1: Complete 5-phase Agent workflow (prompt → streaming → permission → complete → idle)
- §15.2: Session state machine (idle/working/compacting with guard rules)
- §15.4: sessionRevisions reactive mechanism for UI rendering
- §25: Agent workflow awareness (lifecycle, state transitions, message completion,
  permission/issue 10-step closed loop, error recovery, file change tracking)
- Add OpenCode → Server bridge events table
- Update TOC with §25
@github-actions

PR builds are available as GitHub Actions artifacts:

https://github.com/NeuralNomadsAI/CodeNomad/actions/runs/24323890818

Artifacts expire in 7 days.
Artifacts: (none found on this run)

@shantur
Collaborator

shantur commented Apr 13, 2026

@topabomb - Thanks for working on CodeNomad

I understand people want to be able to edit files in CodeNomad and add more file, editing, and coding capabilities. The core focus of CodeNomad is agentic coding, and the reality is that CodeNomad can't match the features of a proper IDE.
Having said that, I do understand the need, which is why I recently built a feature called SideCars. I agree it's not properly documented.
SideCars allow users to access any locally hosted service remotely via CodeNomad, such as development web servers or other services they use. This can be used to access a full VSCode through OpenVSCode Server.

Check the screenshots below; I am running an OpenVSCode dev server on my machine using Docker.

  1. Run an OpenVSCode server:
    docker run -it --init -p 8000:3000 -v "${HOME}:${HOME}:cached" -e HOME=${HOME} gitpod/openvscode-server --server-base-path /sidecars/vscode

  2. Create a SideCar in CodeNomad settings like below:

Screenshot 2026-04-13 at 09 47 31

  3. Use VSCode in CodeNomad as normal and enjoy all VSCode features.

Screenshot 2026-04-13 at 09 47 08

Since we can already do this, unfortunately I won't be able to take on this PR and maintain it.
Let me know your thoughts.

@topabomb
Author

Thanks for taking the time to respond, and for introducing the SideCars feature — it's a genuinely clever architectural decision, and I completely understand your intent to keep CodeNomad focused on agentic coding rather than becoming yet another bloated IDE.

That said, I'd like to explain the thinking behind this PR and why I believe it complements rather than conflicts with that vision.


The web-first use case is bigger than it looks

CodeNomad already markets itself as a remote-accessible agent manager — "run as a server, access via browser, perfect for remote development." This positions it naturally for a much wider range of devices and contexts: a developer checking in on a long-running agent session from their phone, a team lead reviewing what the agent changed while sitting in a meeting on a tablet, or someone on a lightweight machine who simply can't run a full desktop client.

On mobile and tablet, the experience gap is significant. Users can follow the conversation, but they have no way to verify what the agent actually did to the filesystem. For an agent manager, that's a fundamental blind spot — the whole point is to stay in control of what the agent is doing.

Why SideCars doesn't fully close this gap

SideCars is a powerful escape hatch for power users, but it requires Docker, custom base-path configuration, and manual registration in settings. That's a high bar for the most common lightweight scenario: "the agent just modified some files, I want to quickly see what changed."

More importantly, SideCars opens a separate, context-free window. There's no connection between the agent conversation and the files being shown. What users actually need — especially on mobile — is a lightweight, inline way to see and manage files in the same interface where the agent is working.

Also, rich media preview (images, PDFs, audio, video) is something SideCars + VSCode doesn't handle elegantly. These are precisely the kinds of artifacts agents commonly produce, and reviewing them inside the conversation context is a meaningfully better experience.

What this PR actually adds

This isn't an attempt to build a competitor to VSCode inside CodeNomad. The file operations are scoped and intentional:

  • A lightweight file browser scoped strictly to the workspace root (no filesystem escape, path traversal protection in place, all routes reuse the existing auth middleware)
  • Inline multi-type previews for agent-generated artifacts
  • Upload support for passing context files to the agent
  • Full i18n coverage across 7 locales and a registry-based viewer system that makes future extensions trivial to add
  • Comprehensive architecture documentation (§25) that stands on its own value regardless of the feature code
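
To make the "no filesystem escape" claim concrete, a workspace-root check is typically a few lines of path arithmetic. The sketch below is an assumption about how such a guard can be written, not the PR's code; `resolveInsideRoot` is a hypothetical name, and symlinks are not handled (that would need a real-path check on top).

```typescript
import * as path from "node:path";

// Hypothetical sketch: resolve a client-supplied path and reject anything
// that lands outside the workspace root.
function resolveInsideRoot(workspaceRoot: string, requestedPath: string): string {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requestedPath);
  const relative = path.relative(root, target);

  // A relative path that climbs upward (or is absolute, e.g. another drive on
  // Windows) means the target is not inside the root.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes workspace root: ${requestedPath}`);
  }
  return target;
}
```

Comparing via `path.relative` rather than a string prefix check avoids the classic mistake where `/srv/ws-other` is accepted because it starts with `/srv/ws`.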

On the review burden

I fully understand that 51 files is a large diff to review at once. I'm happy to split this into incremental PRs ordered by risk:

  1. Architecture docs + i18n — zero risk, independent value
  2. Read-only file preview system — no new write surface
  3. File browser API — read operations only
  4. Upload + worktree operations — write operations, most scrutiny needed

We can go at whatever pace works for you.


I've invested a significant amount of time studying CodeNomad's internals to make sure this work fits the architecture properly. My genuine preference is to see it land in the main branch and benefit the whole community. If there are specific concerns or things you'd like me to change, I'm very willing to work through them together.

Looking forward to your thoughts.

@shantur
Collaborator

shantur commented Apr 13, 2026

@topabomb - Agreed that VSCode isn't good enough on mobile.

The PR definitely can't be merged in its current state. Before working on the next stages, can we discuss the bare minimum requirements here? I don't see the need for PDF, audio, and video file viewers in an agentic coding space; correct me if I am missing something.

  • From the feature list built in, what is the bare minimum needed?
  • All the development and documentation needs to happen in English.

Thanks

@topabomb
Author

Bare minimum I'd argue for:

Markdown preview — agents constantly produce or modify README files, changelogs, and documentation. The existing code viewer shows raw markdown text. Rendered preview is a meaningful quality-of-life improvement, especially on mobile where scanning raw markdown is painful.

File upload — lets users drop a spec, a reference doc, or a config file directly into the workspace from a phone or tablet, without needing terminal access. This completes the mobile workflow loop: you can already read what the agent did, but you can't give it new files without SSH.

File download — when the agent generates an artifact (a config file, a compiled output, a report), users on a remote/mobile session currently have no way to retrieve it through the UI. Download closes that gap.


What I'd drop from the PR:

  • PDF, audio, and video viewers — agreed, not core to an agentic coding workflow. I'll remove these entirely.
  • File delete — lower priority, can be deferred.

On the architecture documentation:

The current draft is written in Chinese — I'll rewrite it fully in English before any merge. That section documents existing internal behavior (agent lifecycle, session state machine, permission flow) rather than anything introduced by this PR, so it can also be submitted as a completely separate PR if that's cleaner.


In short, the bare minimum is: markdown preview + upload + download, on top of the file browser that already exists. Does that feel like a reasonable scope to move forward with?

One last thing: if you decide this PR isn't the right fit in its current form, I'd genuinely ask that you consider adding these three features yourself at some point. I believe they're important for the web/mobile use case, and none of them are particularly complex to implement on top of the existing file panel. I'd rather see them land in the official codebase than maintain a long-lived fork — but either way, I think users who rely on the server mode deserve them.
