Document fair EIS usage limits in free trial #4206
Conversation
**shainaraskas** left a comment:
a few suggestions and questions for you
> * Scaling is limited for {{serverless-short}} projects in trials. Failures might occur if the workload requires memory or compute beyond what the above search power and search boost window setting limits can provide.
> * We monitor token usage per account for the Elastic Managed LLM. If an account uses over one million tokens in 24 hours, we will inform you and then disable access to the LLM. This is in accordance with our fair use policy for trials.

> **Inference tokens**
having this be a sibling to hosted/serverless is not ideal because they are "sibling" deployment types. we should add some applies tags to indicate that the inference tokens limitation applies to both deployment types
Suggested change:

```diff
- **Inference tokens**
+ **Inference tokens** {applies_to}`ess: ga` {applies_to}`serverless: ga`
```
> * You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
Suggested change:

```diff
- * You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
+ * You can use the following models hosted by the Elastic {{infer-cap}} Service, with the following limits:
```
> * You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
>   * **Elastic Managed LLM:** 100 million input tokens in a 24-hour period or 5 million output tokens in a 24-hour period
>   * **ELSER**: 1 billion tokens in a 24-hour period
>   * Access to some models may be paused temporarily if either of these limits are exceeded
this sentence only works in the context of the bullet above, so we should not make it its own bullet.

is access ALWAYS paused when these limits are exceeded? can we be specific? also, does the pausing vary by model? if it doesn't, we should skip "some"

if the answer to both of these questions is yes:

```diff
- * Access to some models may be paused temporarily if either of these limits are exceeded
+ Access to these models is paused temporarily if either of these limits are exceeded.
```

if no:

```diff
- * Access to some models may be paused temporarily if either of these limits are exceeded
+ Access to some models might be paused temporarily if either of these limits are exceeded.
```
Summary
https://github.com/elastic/search-team/issues/10695
This PR adds bullets about "fair usage" limits of the Elastic Inference Service for accounts that are currently in their free trial period. The Search Inference team monitors activity and is alerted if an account exceeds the limit in a 24-hour period. The follow-up action may involve pausing access for the account if it is deemed to be abusing the free trial.
We're listing the bullets under the general free trial limitations page.
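For illustration, the kind of rolling 24-hour usage check this PR describes (track per-account token usage, alert when a limit is exceeded within the window) could be sketched as follows. This is a hypothetical sketch only, not Elastic's actual implementation; the class name and the limit value are assumptions:

```python
import time
from collections import deque


class TokenUsageMonitor:
    """Hypothetical sketch of a rolling 24-hour token-usage check.

    Not Elastic's implementation -- just an illustration of the
    fair-use monitoring described in this PR.
    """

    def __init__(self, limit, window_seconds=24 * 3600):
        self.limit = limit            # e.g. 100_000_000 input tokens per 24h
        self.window = window_seconds
        self.events = deque()         # (timestamp, token_count) pairs in the window
        self.total = 0

    def record(self, tokens, now=None):
        """Record usage; return True if the 24-hour total now exceeds the limit."""
        now = time.time() if now is None else now
        # Evict usage events that have aged out of the 24-hour window.
        while self.events and now - self.events[0][0] >= self.window:
            _, old = self.events.popleft()
            self.total -= old
        self.events.append((now, tokens))
        self.total += tokens
        return self.total > self.limit  # True would trigger an alert / possible pause


monitor = TokenUsageMonitor(limit=100)
print(monitor.record(60, now=0))              # still within the limit
print(monitor.record(50, now=3600))           # 110 tokens in the window: over the limit
print(monitor.record(10, now=24 * 3600 + 1))  # first event aged out: back under the limit
```

A real system would track input and output tokens separately (the limits quoted above differ per direction and per model), but the sliding-window bookkeeping is the same.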