Apply npx prettier

barufa · barufa · commit ce6883ec6522 · 2025-12-07T20:29:28.000Z
diff --git a/_posts/2025-5-12-s3-vectors.md b/_posts/2025-5-12-s3-vectors.md
@@ -8,16 +8,16 @@ category:
 tags: machine-learning, mlops
 ---
 
-S3 Vectors adds **native vector storage + ANN search** to S3 via *vector buckets* and *vector indexes*. It’s cost‑oriented, elastic, and ideal for big volumes with **sub‑second** queries when throughput is moderate. It’s in **preview** and has sharp edges: hard **Top‑K=30**, only **float32** vectors. This post focuses on where it fits and the gotchas that actually matter in design.
+S3 Vectors adds **native vector storage + ANN search** to S3 via _vector buckets_ and _vector indexes_. It’s cost‑oriented, elastic, and ideal for big volumes with **sub‑second** queries when throughput is moderate. It’s in **preview** and has sharp edges: hard **Top‑K=30**, only **float32** vectors. This post focuses on where it fits and the gotchas that actually matter in design.
 
 ---
 
 ## Why S3 Vectors matters
 
 Embeddings are everywhere (RAG, recommendations, search). Most teams start with a hosted vector DB and quickly hit two things:
 
-1) **Cost at scale**: millions to billions of vectors add up.
-2) **Ops overhead**: clusters, replicas, upgrades, and capacity planning.
+1. **Cost at scale**: millions to billions of vectors add up.
+2. **Ops overhead**: clusters, replicas, upgrades, and capacity planning.
 
 **S3 Vectors** flips that default: store vectors in S3 with a **purpose‑built index** that you can **query directly**. You don’t manage nodes; you pay S3‑style pricing for storage + per‑request querying. If your workload tolerates low to mid QPS but needs large capacity and durability, this is a compelling baseline.
 
@@ -36,18 +36,20 @@ Key properties you’ll design around:
 - **Distance**: Cosine or Euclidean (immutable per index).
 - **Throughput**: write RPS per index is low; design for batching and sharding.
 - **Retrieve limit**: Top-K is capped at 30 per similarity query — and there is no pagination mechanism for ANN queries.
+
 ---
 
 ## Hard limits (and how to design around them)
 
 - **Top‑K ≤ 30** (no pagination). If you need more than 30 results per query, another vector database may be more suitable.
 
-- **Inserts**: up to **500 vectors** per `PutVectors`; **5 write RPS per index**. 
+- **Inserts**: up to **500 vectors** per `PutVectors`; **5 write RPS per index**.
+
   - **Design**: batch & backpressure; parallelize across **multiple indexes** (logical shards) if you need higher throughput.
 
-- **Data types**: **`float32` only** for vector values. If you pass other types, S3 Vectors converts to float32. 
+- **Data types**: **`float32` only** for vector values. If you pass other types, S3 Vectors converts to float32.
 
-- **Immutable index schema**: `dimension`, `distance`, and **non‑filterable metadata keys** can’t be changed. 
+- **Immutable index schema**: `dimension`, `distance`, and **non‑filterable metadata keys** can’t be changed.
 
 ---
 
@@ -67,36 +69,38 @@ S3 Vectors aims to **lower total cost** for large volumes with moderate QPS: you
 
 ## When to choose S3 Vectors vs alternatives
 
-| Need | S3 Vectors | OpenSearch (Serverless/Managed) | SaaS Vector DB |
-|---|---|---|---|
-| Massive volume, low cost, low‑to‑mid QPS | **Yes** | Maybe (if QPS climbs) | Maybe |
-| Ultra‑low latency, high QPS, complex filters | Can fall short | **Yes** | **Yes** |
-| Simple ops (no clusters) | **Yes** | Less | Less |
-| Bedrock tie‑in (embeddings/KB) | **Yes** | Indirect | Varies |
+| Need                                         | S3 Vectors     | OpenSearch (Serverless/Managed) | SaaS Vector DB |
+| -------------------------------------------- | -------------- | ------------------------------- | -------------- |
+| Massive volume, low cost, low‑to‑mid QPS     | **Yes**        | Maybe (if QPS climbs)           | Maybe          |
+| Ultra‑low latency, high QPS, complex filters | Can fall short | **Yes**                         | **Yes**        |
+| Simple ops (no clusters)                     | **Yes**        | Less                            | Less           |
+| Bedrock tie‑in (embeddings/KB)               | **Yes**        | Indirect                        | Varies         |
 
 ---
 
 ## S3 Vectors vs Upstash Vectors (serverless)
 
 **High‑level**
 
-| Aspect | **S3 Vectors** | **Upstash Vectors** |
-|---|---|---|
-| Nature | Object store with **native vector indexes** (preview) | Managed **serverless vector DB** |
-| Pricing model | Storage + per‑request query (see AWS announcement) | Per‑request pricing (e.g., $0.4/100K req in PAYG) + storage (e.g., $0.25/GB) |
-| Query limits | **Top‑K ≤ 30**, **no pagination** in `QueryVectors` | Client‑set `topK` **with pagination** via **Resumable Query** (cursor) |
-| Filtering | Metadata filters; declare **non‑filterable** keys at index create | **Metadata filtering**; **namespaces** for isolation |
-| Data type | **`float32` only** for vector values | Upstash accepts numeric vector payloads (not advertised as multiple dtypes; equivalently treated as float arrays). |
-| Multi‑tenancy | **One bucket, many indexes** (index per tenant) recommended | Multiple **indexes** + **namespaces**; multiple DBs by project |
+| Aspect        | **S3 Vectors**                                                    | **Upstash Vectors**                                                                                                |
+| ------------- | ----------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
+| Nature        | Object store with **native vector indexes** (preview)             | Managed **serverless vector DB**                                                                                   |
+| Pricing model | Storage + per‑request query (see AWS announcement)                | Per‑request pricing (e.g., $0.4/100K req in PAYG) + storage (e.g., $0.25/GB)                                       |
+| Query limits  | **Top‑K ≤ 30**, **no pagination** in `QueryVectors`               | Client‑set `topK` **with pagination** via **Resumable Query** (cursor)                                             |
+| Filtering     | Metadata filters; declare **non‑filterable** keys at index create | **Metadata filtering**; **namespaces** for isolation                                                               |
+| Data type     | **`float32` only** for vector values                              | Upstash accepts numeric vector payloads (not advertised as multiple dtypes; equivalently treated as float arrays). |
+| Multi‑tenancy | **One bucket, many indexes** (index per tenant) recommended       | Multiple **indexes** + **namespaces**; multiple DBs by project                                                     |
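
To make the pricing row concrete, here is a rough back-of-the-envelope sketch. It assumes the example Upstash prices from the table ($0.4 per 100K requests, $0.25 per GB, taken here as a monthly storage rate) and counts only the raw `float32` payload (4 bytes per dimension); keys, metadata, and index overhead are ignored. The workload numbers are hypothetical, and the function names are illustrative only.

```python
def vector_storage_gb(num_vectors: int, dimension: int) -> float:
    """Raw float32 payload in GB (10**9 bytes), excluding metadata and index overhead."""
    return num_vectors * dimension * 4 / 1e9


def upstash_monthly_estimate(
    num_vectors: int,
    dimension: int,
    requests_per_month: int,
    price_per_100k_requests: float = 0.4,  # example PAYG price from the table
    price_per_gb: float = 0.25,            # example storage price from the table
) -> float:
    """Very rough monthly cost: storage plus per-request charges."""
    storage = vector_storage_gb(num_vectors, dimension) * price_per_gb
    requests = requests_per_month / 100_000 * price_per_100k_requests
    return storage + requests


# Hypothetical workload: 10M vectors of 1024 dimensions, 5M requests per month.
# Raw payload is 40.96 GB, so the estimate comes to about 30.24 with these prices.
estimate = round(upstash_monthly_estimate(10_000_000, 1024, 5_000_000), 2)
```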
 
 **Links**
+
 - [AWS announcement](https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/)
 - [S3 Vectors Query limits](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-vectors-limitations.html)
 - [Upstash pricing](https://upstash.com/pricing/vector) and [FAQ](https://upstash.com/docs/vector/help/faq)
 - [Upstash filtering & namespaces](https://upstash.com/docs/vector/features/filtering) and [SDK query params](https://upstash.com/docs/vector/sdks/py/example_calls/query)
 - [Upstash Resumable Query](https://upstash.com/docs/vector/features/resumablequery)
 
 **Implications**
+
 - If you need **scroll/infinite‑results UIs**, Upstash’s **resumable query** is simpler. In S3 Vectors you’ll fan‑out across segments and re‑rank to bypass `Top‑K=30`. The fan-out could end in a clique, which means some vectors would never be retrieved.
 - If your profile is **huge corpuses + moderate QPS + lowest storage cost**, S3 Vectors is attractive. For **real‑time UX** with richer ranking/pagination, Upstash is frictionless.
 - Multi‑tenant: both solve it, but S3 Vectors centralizes observability/logging per bucket; Upstash leans on namespaces and per‑index isolation.
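
The fan‑out and re‑rank workaround mentioned in the first bullet could look roughly like the sketch below. It assumes the corpus is split across several indexes (shards) and that `query_shard` is a hypothetical callable wrapping one `QueryVectors` call, returning `(key, distance)` pairs where a smaller distance means a closer match.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

TOP_K_CAP = 30  # per-query cap quoted in this post

Hit = Tuple[str, float]  # (vector key, distance); smaller distance = closer


def fan_out_query(
    query: Sequence[float],
    shards: Iterable[str],
    query_shard: Callable[[str, Sequence[float], int], List[Hit]],  # hypothetical
    limit: int,
) -> List[Hit]:
    """Query every shard at the cap, dedupe by key, and return the closest hits."""
    best: Dict[str, float] = {}
    for shard in shards:
        for key, distance in query_shard(shard, query, TOP_K_CAP):
            # Keep the smallest distance seen for each key.
            if key not in best or distance < best[key]:
                best[key] = distance
    return heapq.nsmallest(limit, best.items(), key=lambda hit: hit[1])
```

Each shard still returns at most 30 hits, so the merged list holds at most 30 times the number of shards, and anything ranked below 30 inside a single shard remains unreachable. That is the retrieval gap the bullet above warns about.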
@@ -107,5 +111,4 @@ S3 Vectors aims to **lower total cost** for large volumes with moderate QPS: you
 
 S3 Vectors won’t replace every vector DB, but it **changes the starting point**. If your priority is **cost and simplicity** with large datasets and moderate QPS, start with S3 Vectors and scale out to OpenSearch or a dedicated vector DB only where the **latency/query profile** demands it.
 
-*Questions you want me to benchmark next?* Throughput vs. shards, recall vs. filters, or evaluator design for reranking under `Top‑K=30`.
-
+_Questions you want me to benchmark next?_ Throughput vs. shards, recall vs. filters, or evaluator design for reranking under `Top‑K=30`.