Register VECTOR helper functions: VECTOR_DIM, VECTOR_TO_STRING, STRING_TO_VECTOR, TO_VECTOR #31

@kyleconroy

Summary

Marino's lexer/AST already knows the VECTOR data type, but it does not register the canonical helper functions added in MySQL 9.0. They need to appear in ast/functions.go so the generic non-keyword function path resolves them and downstream tools (sqlc) can recognise them.

MySQL version

Introduced in MySQL 9.0; whitespace tolerance and additional aliases added in 9.1.

Current state in marino

VECTOR keyword: present (parser/keywords.go:645, parser/parser.y:718).
Function constants: absent. grep -in 'vector_dim\|vector_to_string\|string_to_vector' ast/functions.go returns no matches.

Example SQL

SELECT STRING_TO_VECTOR('[1.0, 2.0, 3.0]');
SELECT TO_VECTOR('[1.0, 2.0, 3.0]');                  -- alias added in 9.0
SELECT VECTOR_DIM(STRING_TO_VECTOR('[1, 2, 3]'));     -- 3
SELECT VECTOR_TO_STRING(STRING_TO_VECTOR('[1, 2, 3]'));

DDL using the type with the helpers:

CREATE TABLE embeddings (id INT PRIMARY KEY, v VECTOR(3));

INSERT INTO embeddings (id, v) VALUES (1, STRING_TO_VECTOR('[0.1, 0.2, 0.3]'));

SELECT id, VECTOR_TO_STRING(v) FROM embeddings;

Validation

All four SELECT calls above and the DDL run successfully against MySQL 9.2.0 Community.

Notes for the implementer

  • Add the four function-name constants to ast/functions.go:
    • VectorDim = "vector_dim"
    • VectorToString = "vector_to_string"
    • StringToVector = "string_to_vector"
    • ToVector = "to_vector"
  • The non-keyword function path in parser.y will then accept them with no further changes.
  • Add round-trip tests (SELECT VECTOR_DIM(...), SELECT VECTOR_TO_STRING(...)).
  • Future work (out of scope for this issue): DISTANCE(...) for vector similarity, once it is available in MySQL Community Edition.
  • Reference: https://dev.mysql.com/doc/refman/9.2/en/vector-functions.html
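The constants listed above could be sketched as follows. This is a minimal, self-contained illustration rather than the actual patch: in marino the constants would sit alongside the existing entries in ast/functions.go, whereas here they are declared in a standalone package so the snippet compiles on its own. The vectorHelpers set and the isVectorHelper function are hypothetical names introduced only to show the case-insensitive match on function names; they are not part of marino.

```go
package main

import (
	"fmt"
	"strings"
)

// Function-name constants proposed in this issue. In marino they would be
// added to ast/functions.go; they are declared locally here so the sketch
// compiles on its own.
const (
	VectorDim      = "vector_dim"
	VectorToString = "vector_to_string"
	StringToVector = "string_to_vector"
	ToVector       = "to_vector"
)

// vectorHelpers is a hypothetical lookup set used only for this illustration.
var vectorHelpers = map[string]struct{}{
	VectorDim:      {},
	VectorToString: {},
	StringToVector: {},
	ToVector:       {},
}

// isVectorHelper (hypothetical) reports whether name is one of the four
// helpers. The name is lowercased first because MySQL built-in function
// names are case-insensitive.
func isVectorHelper(name string) bool {
	_, ok := vectorHelpers[strings.ToLower(name)]
	return ok
}

func main() {
	// DISTANCE is out of scope for this issue, so it should not match.
	for _, name := range []string{"VECTOR_DIM", "To_Vector", "DISTANCE"} {
		fmt.Printf("%s -> %v\n", name, isVectorHelper(name))
	}
}
```

Storing the names in lowercase keeps a single canonical spelling per function, so callers only need to normalise the incoming identifier once before comparing.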
