Query joining search index and foreign table is very slow #157

Closed
2 tasks done
pdpark opened this issue Aug 24, 2024 · 3 comments
Labels: bug, good first issue, priority-high

Comments

pdpark commented Aug 24, 2024

What happens?

Query times vary wildly on the first query after I create my tables and search index. About 50% of the time it runs in 300-500 ms. The other ~50% of the time it takes almost two minutes (1:45-1:50)! After the query runs slow once, the next query runs fast. Between runs I stop and remove the container, remove the images, and prune all volumes.

To Reproduce

Query:

with search_score as (
    select * from my_schema.search_idx.score_bm25(
    '
    (
             x:"green"
         OR  y:"green"
         OR  z:"green"
         OR  m:"green"
         OR  n:"green"
         OR  p:"green"
         OR  q:"green"
         OR  r:"green"
    )
    AND a:"456"
    '
    )
)
select
        x.cold,
        x.cole,
        x.colf,
        x.colg,
        x.colh,
        p.colx
from search_score as s
join my_schema.table1 as x
on s.id = x.id
join my_schema.table2 as p
on p.cola = x.cola
and p.colb = x.colb
and (
    p.colc = '123'
    or p.colc is null
)
order by score_bm25 desc
limit 250
;
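
One way to see where the time goes on a slow first run is to wrap the same query in EXPLAIN (ANALYZE, BUFFERS). This is only a diagnostic sketch and was not part of the original report. It assumes the tables and the search index already exist exactly as in the query above, and that this is the first statement run in a fresh container, since the slow case only shows up on the first query. Note that ANALYZE actually executes the query.

```sql
-- Diagnostic sketch (an assumption, not from the original report):
-- run this as the first query in a fresh container to capture the slow case.
-- ANALYZE executes the query and reports actual per-node timings;
-- BUFFERS reports shared-buffer hits vs. reads for each plan node.
explain (analyze, buffers)
with search_score as (
    select * from my_schema.search_idx.score_bm25(
    '
    (
             x:"green"
         OR  y:"green"
         OR  z:"green"
         OR  m:"green"
         OR  n:"green"
         OR  p:"green"
         OR  q:"green"
         OR  r:"green"
    )
    AND a:"456"
    '
    )
)
select
        x.cold,
        x.cole,
        x.colf,
        x.colg,
        x.colh,
        p.colx
from search_score as s
join my_schema.table1 as x
on s.id = x.id
join my_schema.table2 as p
on p.cola = x.cola
and p.colb = x.colb
and (
    p.colc = '123'
    or p.colc is null
)
order by score_bm25 desc
limit 250
;
```

Comparing the plan output of a slow run with that of a fast run should show whether the extra time is spent in the search index scan, in the joins, or in reading pages that are not yet cached.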

OS:

Amazon Linux 2023 / PostgreSQL 16.4 (Debian 16.4-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

ParadeDB Version:

pg_search 0.9.1, pg_analytics 0.1.0

Are you using ParadeDB Docker, Helm, or the extension(s) standalone?

ParadeDB Docker Image

Full Name:

Patrick Park

Affiliation:

Payzer

Did you include all relevant data sets for reproducing the issue?

No - I cannot share the data sets because they are confidential

Did you include the code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configurations (e.g., CPU architecture, PostgreSQL version, Linux distribution) to reproduce the issue?

  • Yes, I have
pdpark added the bug label Aug 24, 2024
philippemnoel added the good first issue and priority-high labels Aug 24, 2024
@philippemnoel
Collaborator

Potentially related to this issue: #65

philippemnoel transferred this issue from paradedb/paradedb Oct 15, 2024
@philippemnoel
Collaborator

This is now unblocked and a great first issue

@philippemnoel
Collaborator

[Screenshot, 2024-10-18 at 21:32:50]

Reported as fixed by @pdpark
