[DOC-9267] Guidance for Search Service Sizing #4078
base: release/8.0
Conversation
| Based on these variables, the required vCPUs could be either: | ||
| * stem:[290], using a value of 300 in the vCPU calculation. |
We should have two cases, for 150 and 200 respectively, instead of 200 and 300, because 150 and 200 are the QPS/vCPU values that we documented earlier.
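To show how much the choice of QPS/vCPU value moves the result, here is a minimal sketch of the arithmetic. It assumes the vCPU estimate is the target QPS divided by the QPS-per-vCPU value, rounded up. That formula, the function name `required_vcpus`, and the 30,000 QPS target are illustrative assumptions and are not taken from the draft page.

```python
import math


def required_vcpus(target_qps: float, qps_per_vcpu: float) -> int:
    """Estimate vCPUs as target QPS / QPS per vCPU, rounded up (assumed formula)."""
    if target_qps < 0 or qps_per_vcpu <= 0:
        raise ValueError("target_qps must be >= 0 and qps_per_vcpu must be > 0")
    return math.ceil(target_qps / qps_per_vcpu)


# Hypothetical QPS target; 150 and 200 are the QPS/vCPU values named in this comment.
hypothetical_target_qps = 30_000
for rate in (150, 200):
    print(f"{rate} QPS/vCPU -> {required_vcpus(hypothetical_target_qps, rate)} vCPUs")
```

With the same target, the lower QPS/vCPU value always produces the larger vCPU estimate, so the two cases should bracket the recommendation.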
| |==== | ||
| Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of 300 in the calculation. |
Same here: the value should be 200.
@sarahlwelton I have some comments on Search index nomenclature and on the requirements for sizing correctly.
As for the estimation itself, I think @TusharMadaan04 has done nice work, and you've captured it quite well here.
| Search Service nodes manage Search indexes and serve your Search queries. | ||
| Basic Search indexes are lists of all the unique terms that appear in the documents on your cluster. | ||
| For each term, the Search index also contains a list of the documents where that term appears. |
It'd be good to use the term "inverted index" here. It is the primary data structure within the Search index, and it maps each term to the list of documents where that term appears.
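For readers unfamiliar with the term, here is a toy sketch of an inverted index. It assumes naive lowercase whitespace tokenization and plain Python sets; the function name `build_inverted_index` and the sample documents are hypothetical, and this is not how the Search Service stores its index on disk.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[str, str]) -> dict[str, list[str]]:
    """Map each unique term to the sorted list of document keys containing it."""
    postings: defaultdict[str, set[str]] = defaultdict(set)
    for doc_key, text in docs.items():
        # Naive tokenization (an assumption): lowercase, split on whitespace.
        for term in text.lower().split():
            postings[term].add(doc_key)
    return {term: sorted(keys) for term, keys in postings.items()}


# Hypothetical sample documents.
docs = {
    "doc1": "fast search index",
    "doc2": "search service sizing",
}
index = build_inverted_index(docs)
print(index["search"])
print(index["sizing"])
```

Each term is stored once no matter how many documents contain it, which is why the term lists alone do not determine whether the index ends up larger or smaller than the source data.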
| Basic Search indexes are lists of all the unique terms that appear in the documents on your cluster. | ||
| For each term, the Search index also contains a list of the documents where that term appears. | ||
| These lists inside a Search index can cause the Search index to be larger than your original dataset. |
This is a possibility, but it depends very much on the data. In many situations the index could be smaller, because we de-duplicate terms, prefixes, substrings, and suffixes in the index, and the inverted index (postings lists) is stored as compressed bitmaps alongside a map for document keys.
I'll defer to your judgement on how to frame all of this :)
| To size the Search Service nodes in your cluster, you need the following information: | ||
| * The number of documents you need to include in your Search index or indexes. | ||
| * The average size of the documents that need to be included in your Search index, in KB. |
Not accurate enough: it's not the average size of the documents that we need, but the number of fields and their sizes, which drive the index footprint.
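One way a reader could gather per-field numbers from a sample document is sketched below. It assumes the serialized JSON byte length of each leaf value is a reasonable proxy for field size; the helper `field_sizes` and the sample document are hypothetical, and the result is not an index footprint estimate.

```python
import json


def field_sizes(doc: dict, prefix: str = "") -> dict[str, int]:
    """Return the UTF-8 byte length of each leaf field's JSON encoding, keyed by dotted path."""
    sizes: dict[str, int] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects so every leaf field is reported separately.
            sizes.update(field_sizes(value, path))
        else:
            sizes[path] = len(json.dumps(value).encode("utf-8"))
    return sizes


# Hypothetical sample document.
sample = {"name": "Hotel Lux", "geo": {"lat": 1.5, "lon": 2.5}, "tags": ["spa", "pool"]}
sizes = field_sizes(sample)
for path, size in sizes.items():
    print(f"{path}: {size} bytes")
```

Listing sizes per field, rather than one average per document, makes it easier to count only the fields that will actually be included in the Search index.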
| * A sample document or documents that show the structure of your data. | ||
| * The specific queries per second (QPS) target you need from the Search Service. | ||
| You should also consider your replication and recovery needs. |
recovery/high-availability* needs
The long-awaited guidance for sizing the Search Service in a Couchbase Server deployment.
Preview URL:
https://preview.docs-test.couchbase.com/docs-server-DOC-9267-fts-sizing/server/current/install/sizing-general.html#sizing-search-service-nodes
You will need the Docs Team credentials on Confluence.