Replies: 69 comments
-
|
I also wondered how to store JSON objects without converting to string. SQL Alchemy supports storing these directly |
Beta Was this translation helpful? Give feedback.
-
|
@OXERY && @scuervo91 - I was able to get something that works Using this:
That said: this is a postgresql JSONB column in my database. But it works. For a nested Object you could use a pydantic model as the Type and do it the same way. Hope this helps as I was having a difficult time figuring out a solution as well :) |
Beta Was this translation helpful? Give feedback.
-
|
I also got it working, on SQLite and Postgresql: |
Beta Was this translation helpful? Give feedback.
-
|
@TheJedinator Could you help a bit more with the nested object? I tried to "use the pydantic model as the Type" but I can't get it to work :( Here is my snippet: from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import JSONB
from sqlmodel import Field
from sqlmodel import Session
from sqlmodel import SQLModel
from engine import get_sqlalchemy_engine
class J(SQLModel):
j: int
class A(SQLModel, table=True):
a: int = Field(primary_key=True)
b: J = Field(sa_column=Column(JSONB))
engine = get_sqlalchemy_engine()
SQLModel.metadata.create_all(engine)
with Session(engine) as session:
a = A(a=1, b=J(j=1))
session.add(a)
session.commit()
session.refresh(a)Throws an error |
Beta Was this translation helpful? Give feedback.
This comment has been hidden.
This comment has been hidden.
-
|
Thank you! Unfortunately I get the same error :( I found one workaround - registering a custom_serializer for the sqlalchemy engine, like so: def custom_serializer(d):
return json.dumps(d, default=lambda v: v.json())
def get_sqlalchemy_engine():
return create_engine("postgresql+psycopg2://", creator=get_conn, json_serializer=custom_serializer)But if there is a cleaner way, I would gladly use that instead. |
Beta Was this translation helpful? Give feedback.
-
|
Hey @psarka I just actually tried what I told and sorry have mislead... I did get a working solution though 😄 It was actually the opposite function that you need to use, here's the example you supplied with the amendments to make it work: with Session(engine) as session:
j = J(j=1)
j_dumped = J.json(j)
a = A(a=1, b=j_dumped)
session.add(a)
session.commit()
session.refresh(a) |
Beta Was this translation helpful? Give feedback.
-
|
Hmm, this doesn't (or at least shouldn't) typecheck :) But I see what you did there, essentially it's the same as registring a custom serializer, but manually. |
Beta Was this translation helpful? Give feedback.
-
|
It does type check when you create the This allows for the type checking of the object, the It is essentially the same as registering a custom serializer but allows you to be explicit about using it. |
Beta Was this translation helpful? Give feedback.
-
|
A hacky method with type checking that work with sqlite is from sqlalchemy import Column
from typing import List
# from sqlalchemy.dialects.postgresql import JSONB
from sqlmodel import Field
from sqlmodel import Session
from pydantic import validator
from sqlmodel import SQLModel, JSON,create_engine
# from engine import get_sqlalchemy_engine
sqlite_file_name = "test.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"
engine = create_engine(sqlite_url)
class J2(SQLModel):
test: List[int]
class J(SQLModel):
j: int
nested: J2
class A(SQLModel, table=True):
a: int = Field(primary_key=True)
b: J = Field(sa_column=Column(JSON))
@validator('b')
def val_b(cls, val):
return val.dict()
SQLModel.metadata.create_all(engine)
with Session(engine) as session:
a = A(a=1, b=J(j=1,nested=J2(test=[100,100,100])))
session.add(a)
session.commit()
session.refresh(a) |
Beta Was this translation helpful? Give feedback.
-
|
hi, from sqlmodel import SQLModel,Relationship,Field,JSON
from typing import Optional,List, Dict
from sqlalchemy import Column
from pydantic import validator
#
class J2(SQLModel):
id: int
title:str
#
class Companies(SQLModel, table=True):
id:Optional[int]=Field(default=None,primary_key=True)
name:str
adddresses: List['J2'] = Field(sa_column=Column(JSON))
@validator('adddresses')
def val_b(cls, val):
print(val)
return val.dict()Given error. TypeError: Type is not JSON serializable: J2when i print it, it returns [J2(id=1, title='address1'), J2(id=2, title='address2')]how can i handle that? Why is this J2 added, how can I get rid of it, i can't turn it to .dict(), i cannot serialise it... can you give an idea? |
Beta Was this translation helpful? Give feedback.
-
|
Does this work? |
Beta Was this translation helpful? Give feedback.
-
@HenningScheufler thank you for your help, it worked perfect. |
Beta Was this translation helpful? Give feedback.
-
|
Hey all, thanks for the great advice here. Creating a the object using the classes and writing them to the DB works as expected and writes the data as a dict into a JSON field. See this example: However, when reading the model from the DB using a Any hint how that could be achieved? Maybe via the custom-serialiser mentioned by @psarka ? Thanks already! |
Beta Was this translation helpful? Give feedback.
-
|
Something like this works, but obviously doesn't scale if we have mulitple nested models, instead of just the Instead, we would need more context in the deserialiser (i.e. access to the type-hint of the field we're trying to deserialise so that we can use Any hint where and how I could achieve that kind of access to the deserialisation process? Thanks :) |
Beta Was this translation helpful? Give feedback.
-
|
Here is a minimum reproducible example: import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB
import sqlmodel as sm
import uuid
import typing as t
class TestTable(sm.SQLModel, table=True): # type: ignore[call-arg]
uuid_id: uuid.UUID = sm.Field( # type: ignore[call-overload]
default_factory=uuid.uuid4,
sa_column=sa.Column("id", sa.UUID, unique=True, nullable=False, primary_key=True),
)
conversation_transcript: t.Optional[t.List[dict]] = sm.Field(sa_column=sa.Column(JSONB, nullable=True), default=None)
SYNC_SQLALCHEMY_URL = "postgresql://postgres:postgres@localhost:5432/postgres"
SYNC_DB_ENGINE = sa.create_engine(SYNC_SQLALCHEMY_URL, pool_pre_ping=True) # type: ignore
SYNC_SESSION_MAKER = sa.orm.sessionmaker(bind=SYNC_DB_ENGINE, class_=sa.orm.Session, expire_on_commit=False, autoflush=True)
sm.SQLModel.metadata.create_all(SYNC_DB_ENGINE)
with SYNC_SESSION_MAKER() as session:
session.add(TestTable(conversation_transcript=[{"text": "Hello"}]))
session.commit()
with SYNC_SESSION_MAKER() as session:
items = session.query(TestTable).all()
item = items[0]
print(f"Before update: {item.conversation_transcript} (ID: {item.uuid_id})")
item_id = item.uuid_id
item.conversation_transcript.append({"text": "World"})
print(f"After update before save: {item.conversation_transcript} (ID: {item.uuid_id})")
session.add(item)
session.commit()
with SYNC_SESSION_MAKER() as session:
item = session.get(TestTable, item_id)
print(f"After update after load: {item.conversation_transcript} (ID: {item.uuid_id})") |
Beta Was this translation helpful? Give feedback.
-
Great work, but seems to be broken. (1) It runs validations on nested types which slows down reads (2) somehow this is triggering other records to be downloaded. |
Beta Was this translation helpful? Give feedback.
-
|
@fny I know, it's certainly not great. If I get a better version I will be sure to post it here, and you can do the same as well, while we wait for an official implementation |
Beta Was this translation helpful? Give feedback.
-
|
Hi all, Below is a better implementation based on @iloveiltaly's ActiveModel project. My version is more robust than ActiveModel's. It also assumes all records coming out of the database are valid (i.e. uses This tremendously speeds up loading records since validations are skipped. I anticipate this should handle 80% of use cases. @tiangolo You might find this useful too. from typing import get_args, get_origin
from pydantic import BaseModel
def convert_field_value(annotation: Any, raw_value: Any) -> Any:
origin = get_origin(annotation)
args = get_args(annotation)
if is_union_type(origin):
if raw_value is None:
return None
# The above is optimistic: it's possible that None is not a valid
# value for the annotation. The code below would account for that.
#
# if type(None) in args:
# return None
# else:
# raise ValueError(f"None is not a valid value for the annotation {annotation}")
for arg in args:
converted = convert_field_value(arg, raw_value)
if converted is not None:
return converted
return None
if origin is list:
return [convert_field_value(args[0], item) for item in raw_value]
if origin is dict:
return {
key: convert_field_value(args[1], value) for key, value in raw_value.items()
}
if origin is tuple:
if len(args) != len(raw_value):
return raw_value
return tuple(
convert_field_value(arg, item) for arg, item in zip(args, raw_value)
)
try:
if issubclass(annotation, BaseModel):
attrs = {
field_name: convert_field_value(
field_info.annotation, raw_value.get(field_name)
)
for field_name, field_info in annotation.model_fields.items()
}
return annotation.model_construct(**attrs)
except TypeError as e:
if "issubclass()" not in str(e):
raise e
return raw_value
class SQLModelJSONMixin:
@reconstructor
def init_on_load(self):
for field_name, field_info in self.model_fields.items():
raw_value = getattr(self, field_name)
print(field_name, field_info, raw_value)
converted = convert_field_value(field_info.annotation, raw_value)
setattr(self, field_name, converted)Usage: Other note: you need to call I'm sure there's something I could add to the mixin, but I haven't had time to investigate. |
Beta Was this translation helpful? Give feedback.
-
|
^^ Very nice! Tested the above seems to work great so far! Also I had to add 2 lines of code to make sure it works with None default values being returned if the type is an array or list (this is technically possible). if not origin: # not a container type (e.g. int, untyped list, None, datetime)
return raw_value
if raw_value is None:
return None |
Beta Was this translation helpful? Give feedback.
-
|
@fny looks great ! Can you provide an example on usage and migrations (with alembic) ? |
Beta Was this translation helpful? Give feedback.
-
|
@DaanRademaker I just realized that error myself. I updated my version to make it more robust and also avoid an issue where setattr(...) was triggering validations. @Seluj78: I added an example. Migrations will work without any additional changes. Other note: you need to call I'm sure there's something I could add to the mixin, but I haven't had time to investigate. |
Beta Was this translation helpful? Give feedback.
-
|
@fny would love to merge these updates into the active model project if you're up for submitting a PR |
Beta Was this translation helpful? Give feedback.
-
|
I am here to bump this. I would love to see this added! |
Beta Was this translation helpful? Give feedback.
-
|
Here is how I do it using pydantic's TypeAdapter. from sqlalchemy import TypeDecorator
from sqlmodel import JSON
from pydantic import TypeAdapter
class PydanticJson(TypeDecorator):
impl = JSON()
cache_ok = True
def __init__(self, pt):
super().__init__()
self.pt = TypeAdapter(pt)
self.coerce_compared_value = self.impl.coerce_compared_value
def bind_processor(self, dialect):
return lambda value: self.pt.dump_json(value) if value is not None else None
def result_processor(self, dialect, coltype):
return lambda value: self.pt.validate_json(value) if value is not None else NoneAnd how to use it. from sqlalchemy import Column
from pydantic import BaseModel
from sqlmodel import SQLModel, Field
class Nested(BaseModel):
value: str
class Parent(SQLModel, table=True):
id: int = Field(primary_key=True, default=None)
nested: Nested | None = Field(sa_column=Column(PydanticJson(Nested)))
nested_list: list[Nested] = Field(sa_column=Column(PydanticJson(list[Nested]))) |
Beta Was this translation helpful? Give feedback.
-
Hey @iloveitaly! Sorry for the late response. I just saw this. I'll try to get around to it this week. |
Beta Was this translation helpful? Give feedback.
-
|
I tried using @pporcher solution and it works great to create the tables and get running. It generated the migration file with code like this: Which has two issues:
I guess the solution is to just tell Alembic to create the column as type JSON and go from there but im not sure how to do that. |
Beta Was this translation helpful? Give feedback.
-
|
For the specific use-case where one wants to store in a JSON column a list of Pydantic Model instances, I built on @pporcher's answer # sql.py
from typing import Any, Generic, Self, TypeVar
from sqlalchemy import Column, Dialect, TypeDecorator
from sqlalchemy.sql.operators import OperatorType
from sqlalchemy.sql.type_api import _BindProcessorType, _ResultProcessorType
from sqlmodel import JSON
from pydantic import BaseModel, TypeAdapter
T = TypeVar("T", bound=BaseModel)
class PydanticColumn(TypeDecorator, Generic[T]):
impl = JSON
cache_ok = True
def __init__(self, pt: type[T]):
super().__init__()
self.adapter: TypeAdapter[T] = TypeAdapter(pt)
def coerce_compared_value(self, op: OperatorType | None, value: Any) -> Any:
return self.impl.coerce_compared_value(self, op, value) # type: ignore
def bind_processor(self, dialect: Dialect) -> _BindProcessorType | None:
def processor(value: T | None) -> bytes | None:
if value is None:
return None
return self.adapter.dump_json(value)
return processor
def result_processor(self, dialect: Dialect, coltype: Any) -> _ResultProcessorType | None:
def processor(value: bytes | str | None):
if value is None:
return None
return self.adapter.validate_json(value)
return processor
@classmethod
def col(cls, model: type[T]) -> Column[Self]:
return Column(cls(model))from pydantic import RootModel, BaseModel
from sqlmodel import Field, SQLModel
from .sql import PydanticColumn
class Item(BaseModel):
item: str
class List(RootModel[list[Item]], Sequence[Item]):
root: list[Item]
def __len__(self) -> int:
return len(self.root)
@overload
def __getitem__(self, item: int) -> Item:
...
@overload
def __getitem__(self, item: slice[int | None, int | None, int | None]) -> Sequence[Item]:
...
def __getitem__(self, item: int | slice[int | None, int | None, int | None]) -> Item | Sequence[Item]:
return self.root[item]
class MyTable(SQLModel, table=True):
... # other columns here
custom_list: List = Field(default_factory=list, sa_column=PydanticColumn.col(List))I added the |
Beta Was this translation helpful? Give feedback.
-
|
I recently added JSONB field mutation tracking to activemodel. This works both for JSOB fields that render as Pydantic objects and plain old py objects. |
Beta Was this translation helpful? Give feedback.
-
|
Here is another variation for
from typing import TYPE_CHECKING, Any, cast, override
from pydantic import TypeAdapter
from sqlalchemy import Dialect, TypeDecorator
from sqlalchemy.dialects.postgresql import JSONB
if TYPE_CHECKING:
from sqlalchemy.sql.type_api import _BindProcessorType, _ResultProcessorType
class PydanticJSONB[T](TypeDecorator):
impl = JSONB()
cache_ok = True
def __init__(self, pydantic_type: type[T]) -> None:
super().__init__()
self.adapter = TypeAdapter(pydantic_type)
self.coerce_compared_value = cast(
"JSONB", PydanticJSONB.impl
).coerce_compared_value
@override
def bind_processor(self, dialect: Dialect) -> _BindProcessorType:
def processor(value: T | None) -> str | None:
if value is None:
return None
return self.adapter.dump_json(value).decode("utf-8")
return processor
@override
def result_processor(self, dialect: Dialect, coltype: Any) -> _ResultProcessorType:
def processor(value: Any) -> T:
if value is None:
return None
return self.adapter.validate_python(value)
return processorInspired by #1324 (comment), the from typing import TYPE_CHECKING, Any, Literal
if TYPE_CHECKING:
from alembic.autogenerate.api import AutogenContext
def render_item(
type_: str,
obj: Any, # noqa: ANN401
autogen_context: AutogenContext,
) -> str | Literal[False]:
if type_ == "type" and isinstance(obj, PydanticJSONB):
autogen_context.imports.add("import sqlalchemy as sa")
autogen_context.imports.add("from sqlalchemy.dialects import postgresql")
return "postgresql.JSONB(astext_type=sa.Text())"
return FalseInspired by #1324, here is another implementation which is less hacky and follows the
So effectively the process is
As a result, this implementation is less performant than overriding from typing import Any, cast, override
from pydantic import TypeAdapter
from sqlalchemy import Dialect, TypeDecorator
from sqlalchemy.dialects.postgresql import JSONB
class PydanticJSONB[T](TypeDecorator):
impl = JSONB()
cache_ok = True
def __init__(self, pydantic_type: type[T]) -> None:
super().__init__()
self.adapter = TypeAdapter(pydantic_type)
self.coerce_compared_value = cast(
"JSONB", PydanticJSONB.impl
).coerce_compared_value
@override
def process_bind_param(self, value: T | None, dialect: Dialect) -> Any:
if value is None:
return None
return self.adapter.dump_python(value)
@override
def process_result_value(self, value: Any, dialect: Dialect) -> T | None:
if value is None:
return None
return self.adapter.validate_python(value) |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
First Check
Commit to Help
Example Code
Description
I have already implemented an API using FastAPI to store Pydantic Models. These models are themselves nested Pydantic models so the way they interact with a Postgres DataBase is throught JsonField. I've been using Tortoise ORM as the example shows.
Is there an equivalent model in SQLModel?
Operating System
Linux
Operating System Details
WSL 2 Ubuntu 20.04
SQLModel Version
0.0.4
Python Version
3.8
Additional Context
No response
Beta Was this translation helpful? Give feedback.
All reactions