Tensorlake Go SDK

Go Reference

A comprehensive Go SDK for the Tensorlake API, enabling intelligent document processing and cloud sandbox management. Parse documents, extract structured data, classify pages, and run code in isolated sandbox environments with interactive terminals and process control.

Features

  • Document Parsing: Convert PDFs, DOCX, images, and more to structured markdown
  • Data Extraction: Extract structured data using JSON schemas
  • Page Classification: Classify pages by content type
  • File Management: Upload and manage documents
  • Datasets: Reusable parsing configurations for consistent processing
  • Sandboxes: Create, manage, and interact with cloud sandboxes
  • PTY Sessions: Interactive terminal sessions via WebSocket
  • Process Management: Start, monitor, and control processes in sandboxes
  • SSE Support: Real-time progress updates via Server-Sent Events
  • Iterator Pattern: Easy pagination through results

Installation

go get github.com/sixt/tensorlake-go

Requirements: Go 1.25 or later

Quick Start

1. Initialize the Client

import "github.com/sixt/tensorlake-go"

c := tensorlake.NewClient(
    tensorlake.WithBaseURL("https://api.your-domain.com"),
    tensorlake.WithAPIKey("your-api-key"),
)

2. Upload a File

file, _ := os.Open("document.pdf")
defer file.Close()

uploadResp, _ := c.UploadFile(context.Background(), &tensorlake.UploadFileRequest{
    FileBytes: file,
    FileName:  "document.pdf",
    Labels:    map[string]string{"category": "invoice"},
})

fmt.Printf("File uploaded: %s\n", uploadResp.FileId)

3. Parse the Document

parseJob, _ := c.ParseDocument(context.Background(), &tensorlake.ParseDocumentRequest{
    FileSource: tensorlake.FileSource{
        FileId: uploadResp.FileId,
    },
})

// Get results with real-time updates
result, _ := c.GetParseResult(
    context.Background(),
    parseJob.ParseId,
    tensorlake.WithSSE(true),
    tensorlake.WithOnUpdate(func(name tensorlake.ParseEventName, r *tensorlake.ParseResult) {
        fmt.Printf("Status: %s - %d/%d pages\n", name, r.ParsedPagesCount, r.TotalPages)
    }),
)

// Access parsed content
for _, page := range result.Pages {
    fmt.Printf("Page %d:\n", page.PageNumber)
    // Process page content...
}

Documentation

Comprehensive Examples

Extract Structured Data

import "github.com/google/jsonschema-go/jsonschema"

// Define extraction schema
type InvoiceData struct {
    InvoiceNumber string     `json:"invoice_number"`
    VendorName    string     `json:"vendor_name"`
    TotalAmount   float64    `json:"total_amount"`
    LineItems     []LineItem `json:"line_items"`
}

type LineItem struct {
    Description string  `json:"description"`
    Amount      float64 `json:"amount"`
}

schema, _ := jsonschema.For[InvoiceData](nil)

// Parse with extraction
parseJob, _ := c.ParseDocument(context.Background(), &tensorlake.ParseDocumentRequest{
    FileSource: tensorlake.FileSource{FileId: fileId},
    StructuredExtractionOptions: []tensorlake.StructuredExtractionOptions{
        {
            SchemaName:        "invoice_data",
            JSONSchema:        schema,
            PartitionStrategy: tensorlake.PartitionStrategyNone,
            ProvideCitations:  true,
        },
    },
})

// Retrieve and unmarshal extracted data
result, _ := c.GetParseResult(context.Background(), parseJob.ParseId)
for _, data := range result.StructuredData {
    var extracted map[string]interface{}
    json.Unmarshal(data.Data, &extracted)
    fmt.Printf("Extracted: %+v\n", extracted)
}

Classify Pages

parseJob, _ := c.ClassifyDocument(context.Background(), &tensorlake.ClassifyDocumentRequest{
    FileSource: tensorlake.FileSource{FileId: fileId},
    PageClassifications: []tensorlake.PageClassConfig{
        {
            Name:        "signature_page",
            Description: "Pages containing signatures or signature blocks",
        },
        {
            Name:        "terms_and_conditions",
            Description: "Pages with legal terms and conditions",
        },
    },
})

result, _ := c.GetParseResult(context.Background(), parseJob.ParseId)
for _, pageClass := range result.PageClasses {
    fmt.Printf("Class '%s' found on pages: %v\n", pageClass.PageClass, pageClass.PageNumbers)
}

Use Datasets for Batch Processing

// Create a reusable dataset
dataset, _ := c.CreateDataset(context.Background(), &tensorlake.CreateDatasetRequest{
    Name:        "invoice-processing",
    Description: "Standard invoice parsing configuration",
    ParsingOptions: &tensorlake.ParsingOptions{
        TableOutputMode: tensorlake.TableOutputModeMarkdown,
    },
    StructuredExtractionOptions: []tensorlake.StructuredExtractionOptions{
        {
            SchemaName: "invoice",
            JSONSchema: schema,
        },
    },
})

// Process multiple files with the same configuration
fileIds := []string{"file_001", "file_002", "file_003"}
for _, fileId := range fileIds {
    parseJob, _ := c.ParseDataset(context.Background(), &tensorlake.ParseDatasetRequest{
        DatasetId:  dataset.DatasetId,
        FileSource: tensorlake.FileSource{FileId: fileId},
    })
    // Process results...
}

Sandbox APIs

Create and Use a Sandbox

// Create a sandbox
sb, _ := c.CreateSandbox(ctx, &tensorlake.CreateSandboxRequest{
    TimeoutSecs: ptr(int64(300)),
})

// Wait for it to be running (in real code, bound this poll with a context deadline)
for {
    info, _ := c.GetSandbox(ctx, sb.SandboxId)
    if info.Status == tensorlake.SandboxStatusRunning {
        break
    }
    time.Sleep(time.Second)
}

// Run a process
proc, _ := c.StartProcess(ctx, sb.SandboxId, &tensorlake.StartProcessRequest{
    Command:    "python",
    Args:       []string{"-c", "print('hello from sandbox')"},
    StdoutMode: tensorlake.OutputModeCapture,
})

// Get the output
time.Sleep(2 * time.Second)
stdout, _ := c.GetProcessStdout(ctx, sb.SandboxId, proc.PID)
fmt.Println(stdout.Lines) // ["hello from sandbox"]

// Clean up
c.DeleteSandbox(ctx, sb.SandboxId)
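
The `ptr` helper used for `TimeoutSecs` above is not part of the SDK; a one-line generic version (Go 1.18+) works for any optional pointer field:

```go
package main

import "fmt"

// ptr returns a pointer to v; handy for optional request fields
// such as TimeoutSecs that take *int64, *string, and so on.
func ptr[T any](v T) *T { return &v }

func main() {
	timeout := ptr(int64(300))
	fmt.Println(*timeout) // 300
}
```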

Interactive Terminal via PTY

// Create a PTY session
pty, _ := c.CreatePTY(ctx, sandboxID, &tensorlake.CreatePTYRequest{
    Command: "/bin/sh",
    Rows:    24,
    Cols:    80,
})

// Connect via WebSocket
conn, _ := c.ConnectPTY(ctx, sandboxID, pty.SessionId, pty.Token)
defer conn.Close()

conn.Ready(ctx)                           // Signal readiness
conn.Write(ctx, []byte("ls\n"))           // Send input
msg, _ := conn.Read(ctx)                  // Read output
conn.Resize(ctx, 120, 40)                 // Resize terminal

See the interactive terminal example for a complete implementation.

Follow Process Output via SSE

proc, _ := c.StartProcess(ctx, sandboxID, &tensorlake.StartProcessRequest{
    Command:    "/bin/sh",
    Args:       []string{"-c", "for i in 1 2 3; do echo line$i; sleep 1; done"},
    StdoutMode: tensorlake.OutputModeCapture,
})

for evt, err := range c.FollowProcessStdout(ctx, sandboxID, proc.PID) {
    if err != nil {
        break
    }
    fmt.Printf("[%d] %s\n", evt.Timestamp, evt.Line)
}

Sandbox Lifecycle

Sandboxes support the full lifecycle: create, suspend, resume, snapshot, and restore.

// Suspend a named sandbox
c.SuspendSandbox(ctx, sandboxID)

// Resume it later
c.ResumeSandbox(ctx, sandboxID)

// Snapshot for later restore
snap, _ := c.SnapshotSandbox(ctx, sandboxID, nil)

// Restore from snapshot
restored, _ := c.CreateSandbox(ctx, &tensorlake.CreateSandboxRequest{
    SnapshotId: snap.SnapshotId,
})

Custom Endpoints

All sandbox URLs are configurable for custom deployments:

c := tensorlake.NewClient(
    tensorlake.WithAPIKey("your-key"),
    tensorlake.WithSandboxAPIBaseURL("https://api-tensorlake.example.com/sandboxes"),
    tensorlake.WithSandboxProxyBaseURL("https://sandbox-tensorlake.example.com"),
)

Advanced Features

Server-Sent Events (SSE)

Get real-time progress updates for long-running parse jobs:

result, _ := c.GetParseResult(
    ctx,
    parseId,
    tensorlake.WithSSE(true),
    tensorlake.WithOnUpdate(func(name tensorlake.ParseEventName, r *tensorlake.ParseResult) {
        switch name {
        case tensorlake.SSEEventParseQueued:
            fmt.Println("Job queued")
        case tensorlake.SSEEventParseUpdate:
            fmt.Printf("Progress: %d/%d pages\n", r.ParsedPagesCount, r.TotalPages)
        case tensorlake.SSEEventParseDone:
            fmt.Println("Complete!")
        case tensorlake.SSEEventParseFailed:
            fmt.Printf("Failed: %s\n", r.Error)
        }
    }),
)

Iterator Pattern

Easily iterate through paginated results:

// Iterate all files
for file, err := range c.IterFiles(ctx, 50) {
    if err != nil {
        panic(err)
    }
    fmt.Printf("File: %s\n", file.FileName)
}

// Iterate all parse jobs
for job, err := range c.IterParseJobs(ctx, 50) {
    if err != nil {
        panic(err)
    }
    fmt.Printf("Job %s: Status: %s\n", job.ParseId, job.Status)
}

// Iterate all datasets
for dataset, err := range c.IterDatasets(ctx, 50) {
    if err != nil {
        panic(err)
    }
    fmt.Printf("Dataset %s: Name: %s, Status: %s\n", dataset.DatasetId, dataset.Name, dataset.Status)
}

Supported File Types

  • Documents: PDF, DOCX
  • Spreadsheets: XLS, XLSX, XLSM, CSV
  • Presentations: PPTX, Apple Keynote
  • Images: PNG, JPG, JPEG
  • Text: Plain text, HTML

Maximum file size: 1 GB
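
Checking these constraints client-side before calling UploadFile saves a failed round trip. A minimal sketch; the extension list and size limit mirror the table above, and the helper name is illustrative, not part of the SDK:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const maxFileSize = 1 << 30 // 1 GB

// Extensions accepted per the supported file types table.
var supportedExts = map[string]bool{
	".pdf": true, ".docx": true,
	".xls": true, ".xlsx": true, ".xlsm": true, ".csv": true,
	".pptx": true, ".key": true,
	".png": true, ".jpg": true, ".jpeg": true,
	".txt": true, ".html": true,
}

// checkUploadable reports whether a file passes the size and type limits.
func checkUploadable(name string, size int64) error {
	if size > maxFileSize {
		return fmt.Errorf("%s: %d bytes exceeds the 1 GB limit", name, size)
	}
	ext := strings.ToLower(filepath.Ext(name))
	if !supportedExts[ext] {
		return fmt.Errorf("%s: unsupported file type %q", name, ext)
	}
	return nil
}

func main() {
	fmt.Println(checkUploadable("invoice.pdf", 1024)) // <nil>
	fmt.Println(checkUploadable("archive.zip", 1024)) // non-nil error: unsupported type
}
```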

Error Handling

All API methods return structured errors:

result, err := c.ParseDocument(ctx, request)
if err != nil {
    var apiErr *tensorlake.ErrorResponse
    if errors.As(err, &apiErr) {
        fmt.Printf("API Error: %s (Code: %s)\n", apiErr.Message, apiErr.ErrorCode)
        // Handle specific error codes
    } else {
        fmt.Printf("Network/Client Error: %v\n", err)
    }
}

Best Practices

  1. Reuse Datasets - Create datasets for frequently processed document types
  2. Use SSE - Enable SSE for large documents to track progress
  3. Batch Processing - Process similar documents with the same dataset configuration
  4. Error Handling - Always check error responses and handle retries appropriately
  5. Labels - Use labels to organize and filter files and parse jobs
  6. Iterators - Use iterator methods for efficient pagination through large result sets

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

Copyright 2025 SIXT SE. Licensed under the Apache License, Version 2.0. See LICENSE for details.
