docs: verify data types + storage docs against source
- Add missing RAY_F32=6 to type table (was skipped between I64=5 and F64=7)
- Fix type-of examples to use meta builtin (type-of does not exist)
- Fix BOOL vector output: [true false true] not [1b 0b 1b]
- Fix ray_read_csv_opts signature: (path, delim, header, col_types, n_types)
- Fix CSV date inference docs: only YYYY-MM-DD supported, not YYYY.MM.DD
- Replace non-existent splay-save/splay-load/part-load Rayfall examples
with info boxes noting these are C API only
- Replace part-load Rayfall example in collections with C API equivalent
<p>All vector processing in Rayforce happens in <strong>morsels</strong> — fixed-size chunks of 1024 elements. The executor never processes an entire column at once. Instead, it iterates morsel by morsel, which keeps data in L1/L2 cache and enables pipeline parallelism.</p>
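The morsel-by-morsel iteration described above can be sketched in plain C. This is an illustration only, not Rayforce's executor: the helper name `sum_by_morsel` and the `int64_t` column type are assumptions made for the example; only the 1024-element chunk size comes from the docs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MORSEL_SIZE 1024 /* chunk size stated in the docs */

/* Hypothetical helper (not a Rayforce function): sums an int64 column
   one morsel at a time and reports how many morsels were visited. */
int64_t sum_by_morsel(const int64_t *col, size_t n, size_t *morsels_out) {
    assert(col != NULL || n == 0);
    int64_t total = 0;
    size_t morsels = 0;
    for (size_t start = 0; start < n; start += MORSEL_SIZE) {
        /* The last morsel may be shorter than MORSEL_SIZE. */
        size_t remaining = n - start;
        size_t len = remaining < MORSEL_SIZE ? remaining : MORSEL_SIZE;
        /* The inner loop touches at most 1024 contiguous elements
           (8 KiB of int64 data), which fits comfortably in cache. */
        for (size_t i = 0; i < len; i++) {
            total += col[start + i];
        }
        morsels++;
    }
    if (morsels_out != NULL) {
        *morsels_out = morsels;
    }
    return total;
}
```

A column of 2500 elements is visited as three morsels of 1024, 1024, and 452 elements. A real executor would presumably run a whole operator pipeline per morsel rather than a single sum.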
@@ -349,12 +349,11 @@ <h3>Type Encoding</h3>
<h3>MAPCOMMON</h3>
<p>When loading a date-partitioned table, Rayforce creates a virtual <code>RAY_MAPCOMMON</code> column. This column does not store actual data — it derives values from the partition directory names (e.g., <code>2024.01.15/</code>). Each row in a partition shares the same date value, so the MAPCOMMON column can represent millions of rows with zero per-row storage.</p>
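One way to picture a virtual column with zero per-row storage is a lookup over per-partition metadata. The sketch below is an assumption-laden illustration, not Rayforce's actual representation: `partition_t`, `mapcommon_date_at`, and the YYYYMMDD integer encoding of dates are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: one entry per partition, not one per row. */
typedef struct {
    int32_t date;      /* e.g. 20240115, parsed from "2024.01.15/" */
    size_t  row_count; /* number of rows stored in that partition */
} partition_t;

/* Looks up the date for a global row index by walking partition row
   counts. Returns 0 and writes *date_out on success, or -1 if the row
   index is past the end of the table. */
int mapcommon_date_at(const partition_t *parts, size_t n_parts,
                      size_t row, int32_t *date_out) {
    assert(date_out != NULL);
    for (size_t p = 0; p < n_parts; p++) {
        if (row < parts[p].row_count) {
            *date_out = parts[p].date;
            return 0;
        }
        row -= parts[p].row_count; /* skip past this partition's rows */
    }
    return -1;
}
```

Storage here grows with the number of partitions rather than the number of rows, which is the property the paragraph above describes.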
-
<pre><code><span class="hl-comment">; Load a date-partitioned table</span>
<span class="hl-comment">; The 'date' column is MAPCOMMON — derived from directory names</span>
-
<span class="hl-comment">; Queries that filter on date trigger partition pruning</span>
-
<span class="hl-comment">ray></span> (<span class="hl-kw">select</span> {<span class="hl-sym">from:</span>trades <span class="hl-sym">where:</span> (<span class="hl-fn">=</span> date <span class="hl-num">2024.01.15</span>)})</code></pre>
+
<span class="hl-comment">// The 'date' column is MAPCOMMON — derived from directory names</span>
+
<span class="hl-comment">// Queries that filter on date trigger partition pruning</span></code></pre>
<div class="info-box">
<strong>Partition pruning:</strong> The optimizer recognizes filters on MAPCOMMON columns and eliminates entire partitions from the scan — skipping their memory-mapped segments entirely. A query filtering on a single date in a year of data reads only 1/365th of the files.
<p>Every Rayforce object begins with a 32-byte <code>ray_t</code> header. This is the fundamental building block — atoms, vectors, lists, tables, and functions all start with this structure.</p>
<strong>Note:</strong> Splayed table I/O is currently available only through the C API. There are no <code>splay-save</code> / <code>splay-load</code> Rayfall builtins yet.
<p>For large time-series datasets, Rayforce supports date-partitioned storage. Data is split into directories named by date, each containing a splayed table for that day's data.</p>
<span class="hl-comment">; Filter on date — optimizer prunes partitions</span>
-
<span class="hl-comment">ray></span> (<span class="hl-kw">select</span> {<span class="hl-sym">from:</span>trades <span class="hl-sym">where:</span> (<span class="hl-fn">=</span> date <span class="hl-num">2024.01.15</span>)})
-
-
<span class="hl-comment">; Range filter — only relevant partitions are scanned</span>
-
<span class="hl-comment">ray></span> (<span class="hl-kw">select</span> {<span class="hl-sym">from:</span>trades <span class="hl-sym">where:</span> (<span class="hl-fn">and</span> (<span class="hl-fn">>=</span> date <span class="hl-num">2024.01.15</span>) (<span class="hl-fn"><=</span> date <span class="hl-num">2024.01.17</span>))})</code></pre>
+
<div class="info-box">
+
<strong>Note:</strong> Partitioned table loading is currently available only through the C API (<code>ray_part_load</code>). There is no <code>part-load</code> Rayfall builtin yet. Once loaded via the C API, the resulting table supports normal <code>select</code> queries with partition pruning.
+
</div>
<h3>Partition Pruning</h3>
<h3>Partition Pruning</h3>
<p>The query optimizer recognizes predicates on the <code>MAPCOMMON</code> column and eliminates entire partitions from the scan plan. This means a query filtering on a single date in a year of data only touches 1/365th of the files on disk — with zero per-row cost for the pruned partitions.</p>
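As a rough illustration of the pruning step, the sketch below selects which partitions survive a date-range predicate before any file is touched. It is not the Rayforce optimizer: the function name `prune_by_date_range` and the YYYYMMDD integer encoding of partition dates are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pruning pass (not a Rayforce function).
   part_dates holds one date per partition directory, encoded as
   YYYYMMDD. Indexes of partitions whose date lies in [lo, hi] are
   written to keep_out, which must have room for n_parts entries.
   Returns the number of partitions kept. */
size_t prune_by_date_range(const int32_t *part_dates, size_t n_parts,
                           int32_t lo, int32_t hi, size_t *keep_out) {
    assert(n_parts == 0 || (part_dates != NULL && keep_out != NULL));
    size_t kept = 0;
    for (size_t p = 0; p < n_parts; p++) {
        /* The decision uses only partition metadata; no row data is read. */
        if (part_dates[p] >= lo && part_dates[p] <= hi) {
            keep_out[kept++] = p;
        }
    }
    return kept;
}
```

An equality filter is the special case `lo == hi`. With one partition per day over a year, such a filter keeps a single partition out of 365, which matches the 1/365th figure above.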
@@ -313,11 +308,11 @@ <h3>C API</h3>
<tbody>
<tr>
<td><code>ray_read_csv(path)</code></td>
-
<td>Load a CSV file with default options: comma delimiter, first row as header, automatic type inference, <code>""</code> as null.</td>
+
<td>Load a CSV file with default options: comma delimiter, first row as header, and automatic type inference. Empty fields are treated as null.</td>
<td>Load with custom options: delimiter character, whether first row is a header, explicit column type array (<code>int8_t*</code>), and number of type entries. Pass <code>NULL, 0</code> for automatic type inference.</td>