Conversation

@trinity-1686a
Contributor

Description

When scheduling leaf splits, allow up to N queries to run concurrently so that a single large query doesn't hog all resources on a searcher.
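A minimal sketch of the idea (not the PR's actual code), assuming a tokio `Semaphore`-style permit provider; `QueryPermitProvider` and `max_concurrent_queries` are illustrative names:

```rust
use std::sync::Arc;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

/// Hands out at most `max_concurrent_queries` permits at a time, so a single
/// large query cannot monopolize the searcher's leaf-split slots.
#[derive(Clone)]
struct QueryPermitProvider {
    semaphore: Arc<Semaphore>,
}

impl QueryPermitProvider {
    fn new(max_concurrent_queries: usize) -> Self {
        Self {
            semaphore: Arc::new(Semaphore::new(max_concurrent_queries)),
        }
    }

    /// Waits until one of the N slots is free, then returns a permit that is
    /// released when dropped (i.e. when the query's leaf work is done).
    /// tokio's `Semaphore` serves waiters in FIFO order.
    async fn acquire(&self) -> OwnedSemaphorePermit {
        self.semaphore
            .clone()
            .acquire_owned()
            .await
            .expect("semaphore should not be closed")
    }
}
```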

How was this PR tested?

Added a unit test to verify it emits permits in the expected order.
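A sketch of what such an ordering test could look like against the illustrative `QueryPermitProvider` above (not the PR's actual test):

```rust
#[tokio::test]
async fn test_permits_are_emitted_in_request_order() {
    // With a single slot, the second request must not get a permit until
    // the first permit is dropped.
    let provider = QueryPermitProvider::new(1);
    let first_permit = provider.acquire().await;

    let provider_clone = provider.clone();
    let second = tokio::spawn(async move { provider_clone.acquire().await });

    // The second acquisition is still pending while the first permit is held.
    tokio::task::yield_now().await;
    assert!(!second.is_finished());

    // Releasing the first permit unblocks the waiter, in FIFO order.
    drop(first_permit);
    let _second_permit = second.await.unwrap();
}
```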

@fulmicoton
Collaborator

fulmicoton commented Jan 27, 2026

What metric will be improved by this change and which metric will be worse?

My understanding is that:

  • the average latency will be worsened.
  • the average (real latency / expected latency given the query size) will be improved.

Is this a patch meant to address the current urgent PoC, or is it meant as a long-term solution?
I think we can be a tiny bit more ambitious than this.

I think @rdettai's approach using differentiated service was on the right track, but using a separate pool was too wasteful.
We could use a watered-down version of https://en.wikipedia.org/wiki/Shortest_remaining_time, for instance.
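As a rough illustration of what a watered-down shortest-remaining-time scheduler could look like (all names and the remaining-split cost proxy are assumptions, not the actual Quickwit scheduler): pending leaf tasks are popped for the query with the least estimated remaining work, with FIFO tie-breaking.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

struct PendingLeafTask {
    query_id: u64,
    split_id: String,
}

struct SrtQueue {
    // Reverse(remaining_splits) so the BinaryHeap (a max-heap) pops the query
    // with the *smallest* estimated remaining work first; Reverse(arrival
    // counter) breaks ties in arrival order.
    heap: BinaryHeap<(Reverse<u64>, Reverse<u64>, u64, String)>,
    arrival_counter: u64,
}

impl SrtQueue {
    fn new() -> Self {
        Self { heap: BinaryHeap::new(), arrival_counter: 0 }
    }

    /// `remaining_splits` is a crude "remaining time" proxy for the query.
    fn push(&mut self, task: PendingLeafTask, remaining_splits: u64) {
        self.arrival_counter += 1;
        self.heap.push((
            Reverse(remaining_splits),
            Reverse(self.arrival_counter),
            task.query_id,
            task.split_id,
        ));
    }

    fn pop(&mut self) -> Option<PendingLeafTask> {
        self.heap
            .pop()
            .map(|(_, _, query_id, split_id)| PendingLeafTask { query_id, split_id })
    }
}
```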

@trinity-1686a
Contributor Author

The average latency should be exactly the same; this just moves latency from one query to another.
I think the latency for pXX where XX is low isn't impacted much (queries run with no load at all are unaffected; queries run under low load start a bit earlier but compete with the next request, which overall should cancel out).
Latency for pXX where XX is high should be improved (queries stuck behind a big one are no longer entirely stuck, and may complete before that large query).
Latency for p100 should be worse: a single expensive query no longer has 100% of the searchers' compute dedicated to it, so it takes longer to run.
