Scope change stops optimizer from pushing outer BGP into both sides of a UNION (performance regression 5.1.3 to 5.1.4) #5444

@reckart

Current Behavior

With 5.1.4 and 5.1.5, the following query never terminates:

PREFIX search: <http://www.openrdf.org/contrib/lucenesail#>
SELECT DISTINCT ?sc ?m ?l ?dc ?dp ?subj
WHERE { { { VALUES ( ?pMatch ) { (<http://www.w3.org/2004/02/skos/core#prefLabel>) (<http://www.w3.org/2004/02/skos/core#altLabel>) (<http://www.w3.org/20x00/01/rdf-schema#Label>) } 
{ ?subj search:matches [ search:query "progesterone receptor positive tumor" ;
    search:property ?pMatch ;
    search:score ?sc ] ;
    ?pMatch ?m .
FILTER ( ( REGEX( STR( ?m ), "^Progesterone\\s+receptor\\s+positive\\s+tumor$", "i" ) && ( LANGMATCHES( LANG( ?m ), "nl" ) || LANGMATCHES( LANG( ?m ), "en" ) || LANG( ?m ) = "" ) ) ) } UNION { ?subj search:matches [ search:query "macroscopisch" ;
    search:property ?pMatch ;
    search:score ?sc ] ;
    ?pMatch ?m .
FILTER ( ( REGEX( STR( ?m ), "^macroscopisch$", "i" ) && ( LANGMATCHES( LANG( ?m ), "nl" ) || LANGMATCHES( LANG( ?m ), "en" ) || LANG( ?m ) = "" ) ) ) } } }
OPTIONAL { VALUES ( ?pPrefLabel ) { (<http://www.w3.org/2004/02/skos/core#prefLabel>) }  }
OPTIONAL { { ?subj ?pPrefLabel ?l .
FILTER ( ( LANGMATCHES( LANG( ?l ), "nl" ) || LANGMATCHES( LANG( ?l ), "en" ) || LANG( ?l ) = "" ) ) } }
OPTIONAL { { ?subj <http://www.w3.org/2004/02/skos/core#definition> ?dc .
FILTER ( ( LANGMATCHES( LANG( ?dc ), "nl" ) || LANGMATCHES( LANG( ?dc ), "en" ) || LANG( ?dc ) = "" ) ) } }
OPTIONAL { { ?subj <http://www.w3.org/2002/07/owl#deprecated> ?dp .
FILTER ( ( LANGMATCHES( LANG( ?dp ), "nl" ) || LANGMATCHES( LANG( ?dp ), "en" ) || LANG( ?dp ) = "" ) ) } } }
LIMIT 200

Here is the query plan that RDF4J outputs for this query in trace mode with 5.1.4 and 5.1.5:

RDF4J 5.1.4

QueryRoot
   Slice (limit=200)
      Distinct
         Projection
            ProjectionElemList
               ProjectionElem "sc"
               ProjectionElem "m"
               ProjectionElem "l"
               ProjectionElem "dc"
               ProjectionElem "dp"
               ProjectionElem "subj"
            LeftJoin
               LeftJoin
                  LeftJoin
                     LeftJoin
                        Join
                           BindingSetAssignment ([[pMatch=http://www.w3.org/2004/02/skos/core#prefLabel], [pMatch=http://www.w3.org/2004/02/skos/core#altLabel], [pMatch=http://www.w3.org/20x00/01/rdf-schema#Label]]) (costEstimate=0, resultSizeEstimate=1,00)
                           Union (new scope)
                              Join
                                 BindingSetAssignment (org.eclipse.rdf4j.sail.lucene.BindingSetCollection@b5a416f1) (costEstimate=0, resultSizeEstimate=1,00)
                                 Filter (new scope)
                                    And
                                       Or
                                          LangMatches
                                             Lang
                                                Var (name=m)
                                             ValueConstant (value="nl")
                                          Or
                                             LangMatches
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="en")
                                             Compare (=)
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="")
                                       Regex
                                          Str
                                             Var (name=m)
                                          ValueConstant (value="^Progesterone\s+receptor\s+positive\s+tumor$")
                                          ValueConstant (value="i")
                                    StatementPattern (costEstimate=10, resultSizeEstimate=1000)
                                       Var (name=subj)
                                       Var (name=pMatch)
                                       Var (name=m)
                              Join
                                 BindingSetAssignment (org.eclipse.rdf4j.sail.lucene.BindingSetCollection@e576360a) (costEstimate=0, resultSizeEstimate=1,00)
                                 Filter (new scope)
                                    And
                                       Or
                                          LangMatches
                                             Lang
                                                Var (name=m)
                                             ValueConstant (value="nl")
                                          Or
                                             LangMatches
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="en")
                                             Compare (=)
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="")
                                       Regex
                                          Str
                                             Var (name=m)
                                          ValueConstant (value="^macroscopisch$")
                                          ValueConstant (value="i")
                                    StatementPattern (costEstimate=10, resultSizeEstimate=1000)
                                       Var (name=subj)
                                       Var (name=pMatch)
                                       Var (name=m)
                        BindingSetAssignment ([[pPrefLabel=http://www.w3.org/2004/02/skos/core#prefLabel]])
                     Filter (new scope)
                        Or
                           LangMatches
                              Lang
                                 Var (name=l)
                              ValueConstant (value="nl")
                           Or
                              LangMatches
                                 Lang
                                    Var (name=l)
                                 ValueConstant (value="en")
                              Compare (=)
                                 Lang
                                    Var (name=l)
                                 ValueConstant (value="")
                        StatementPattern (resultSizeEstimate=1000)
                           Var (name=subj)
                           Var (name=pPrefLabel)
                           Var (name=l)
                  Filter (new scope)
                     Or
                        LangMatches
                           Lang
                              Var (name=dc)
                           ValueConstant (value="nl")
                        Or
                           LangMatches
                              Lang
                                 Var (name=dc)
                              ValueConstant (value="en")
                           Compare (=)
                              Lang
                                 Var (name=dc)
                              ValueConstant (value="")
                     StatementPattern (resultSizeEstimate=100)
                        Var (name=subj)
                        Var (name=_const_baac53b8_uri, value=http://www.w3.org/2004/02/skos/core#definition, anonymous)
                        Var (name=dc)
               Filter (new scope)
                  Or
                     LangMatches
                        Lang
                           Var (name=dp)
                        ValueConstant (value="nl")
                     Or
                        LangMatches
                           Lang
                              Var (name=dp)
                           ValueConstant (value="en")
                        Compare (=)
                           Lang
                              Var (name=dp)
                           ValueConstant (value="")
                  StatementPattern (resultSizeEstimate=100)
                     Var (name=subj)
                     Var (name=_const_d9d629bf_uri, value=http://www.w3.org/2002/07/owl#deprecated, anonymous)
                     Var (name=dp)

Expected Behavior

In 5.1.3, this query returned immediately. Here is the 5.1.3 query plan:

RDF4J 5.1.3

QueryRoot
   Slice (limit=200)
      Distinct
         Projection
            ProjectionElemList
               ProjectionElem "sc"
               ProjectionElem "m"
               ProjectionElem "l"
               ProjectionElem "dc"
               ProjectionElem "dp"
               ProjectionElem "subj"
            LeftJoin
               LeftJoin
                  LeftJoin
                     LeftJoin
                        Join
                           BindingSetAssignment ([[pMatch=http://www.w3.org/2004/02/skos/core#prefLabel], [pMatch=http://www.w3.org/2004/02/skos/core#altLabel], [pMatch=http://www.w3.org/20x00/01/rdf-schema#Label]]) (costEstimate=0, resultSizeEstimate=1,00)
                           Union (new scope)
                              Join
                                 BindingSetAssignment (org.eclipse.rdf4j.sail.lucene.BindingSetCollection@b5a416f1) (costEstimate=0, resultSizeEstimate=1,00)
                                 Filter
                                    And
                                       Regex
                                          Str
                                             Var (name=m)
                                          ValueConstant (value="^Progesterone\s+receptor\s+positive\s+tumor$")
                                          ValueConstant (value="i")
                                       Or
                                          LangMatches
                                             Lang
                                                Var (name=m)
                                             ValueConstant (value="nl")
                                          Or
                                             LangMatches
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="en")
                                             Compare (=)
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="")
                                    StatementPattern (costEstimate=10, resultSizeEstimate=1000)
                                       Var (name=subj)
                                       Var (name=pMatch)
                                       Var (name=m)
                              Join
                                 BindingSetAssignment (org.eclipse.rdf4j.sail.lucene.BindingSetCollection@e576360a) (costEstimate=0, resultSizeEstimate=1,00)
                                 Filter
                                    And
                                       Regex
                                          Str
                                             Var (name=m)
                                          ValueConstant (value="^macroscopisch$")
                                          ValueConstant (value="i")
                                       Or
                                          LangMatches
                                             Lang
                                                Var (name=m)
                                             ValueConstant (value="nl")
                                          Or
                                             LangMatches
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="en")
                                             Compare (=)
                                                Lang
                                                   Var (name=m)
                                                ValueConstant (value="")
                                    StatementPattern (costEstimate=10, resultSizeEstimate=1000)
                                       Var (name=subj)
                                       Var (name=pMatch)
                                       Var (name=m)
                        BindingSetAssignment ([[pPrefLabel=http://www.w3.org/2004/02/skos/core#prefLabel]])
                     Filter (new scope)
                        Or
                           LangMatches
                              Lang
                                 Var (name=l)
                              ValueConstant (value="nl")
                           Or
                              LangMatches
                                 Lang
                                    Var (name=l)
                                 ValueConstant (value="en")
                              Compare (=)
                                 Lang
                                    Var (name=l)
                                 ValueConstant (value="")
                        StatementPattern (resultSizeEstimate=1000)
                           Var (name=subj)
                           Var (name=pPrefLabel)
                           Var (name=l)
                  Filter (new scope)
                     Or
                        LangMatches
                           Lang
                              Var (name=dc)
                           ValueConstant (value="nl")
                        Or
                           LangMatches
                              Lang
                                 Var (name=dc)
                              ValueConstant (value="en")
                           Compare (=)
                              Lang
                                 Var (name=dc)
                              ValueConstant (value="")
                     StatementPattern (resultSizeEstimate=100)
                        Var (name=subj)
                        Var (name=_const_baac53b8_uri, value=http://www.w3.org/2004/02/skos/core#definition, anonymous)
                        Var (name=dc)
               Filter (new scope)
                  Or
                     LangMatches
                        Lang
                           Var (name=dp)
                        ValueConstant (value="nl")
                     Or
                        LangMatches
                           Lang
                              Var (name=dp)
                           ValueConstant (value="en")
                        Compare (=)
                           Lang
                              Var (name=dp)
                           ValueConstant (value="")
                  StatementPattern (resultSizeEstimate=100)
                     Var (name=subj)
                     Var (name=_const_d9d629bf_uri, value=http://www.w3.org/2002/07/owl#deprecated, anonymous)
                     Var (name=dp)

Steps To Reproduce

I'm afraid I cannot provide a simple example that reproduces the problem. For small graphs, 5.1.4 and 5.1.5 work fine, and all my unit tests pass. However, with a proper knowledge base, the query becomes unbearably slow.

I hope, though, that comparing the 5.1.3 and 5.1.4 query plans gives you some insight. I can see the differences but, unfortunately, cannot interpret them.
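
To make the comparison easier, a minimal sketch like the following could be used to list only the lines that differ between two plan dumps. It is plain Python using the standard `difflib` module and is not part of RDF4J. The function name `diff_plans` is made up for illustration, and the two plan strings are heavily abbreviated excerpts of one UNION branch from the plans above, not the full output.

```python
import difflib

# Abbreviated excerpts of one UNION branch from the two plans above
# (node details such as cost estimates are left out for brevity).
PLAN_513 = """\
Union (new scope)
   Join
      BindingSetAssignment
      Filter
         StatementPattern
"""

PLAN_514 = """\
Union (new scope)
   Join
      BindingSetAssignment
      Filter (new scope)
         StatementPattern
"""


def diff_plans(old_plan, new_plan):
    """Return the lines removed ("- ") or added ("+ ") between two plan dumps."""
    old_lines = [line.rstrip() for line in old_plan.splitlines()]
    new_lines = [line.rstrip() for line in new_plan.splitlines()]
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [d for d in difflib.ndiff(old_lines, new_lines)
            if d.startswith(("- ", "+ "))]


if __name__ == "__main__":
    for change in diff_plans(PLAN_513, PLAN_514):
        print(change)
```

For the excerpts, this reports the `Filter` line in both forms, without and with `(new scope)`. Run on the full plans, it should additionally surface that the `Regex` and `Or` operands under `And` appear in a different order in the two versions.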

Version

5.1.4

Are you interested in contributing a solution yourself?

None

Anything else?

No response
