
Conversation

@demjened (Contributor) commented Dec 3, 2025

Summary

https://github.com/elastic/search-team/issues/10695

This PR adds bullets about the "fair usage" limits of the Elastic Inference Service for accounts that are currently in their free trial period. The Search Inference team monitors activity and is alerted if an account exceeds the limit in a 24-hour period. The follow-up action may involve pausing the account's access if it is deemed to be abusing the free trial.

We're listing the bullets under the general free trial limitations page.
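
For illustration only, here is a minimal sketch of how such a 24-hour fair-usage check could work. This is not part of the PR and not the Search Inference team's actual implementation; the class name, limit value, and alert hook are assumptions.

```python
# Hypothetical sketch of a per-account, 24-hour sliding-window token limit,
# as described in the summary above. All names and values are assumptions.
from collections import deque
from time import time

WINDOW_SECONDS = 24 * 60 * 60
TOKEN_LIMIT = 1_000_000_000  # e.g. the 1 billion token ELSER limit quoted below


class UsageWindow:
    """Tracks one account's token usage over a sliding 24-hour window."""

    def __init__(self) -> None:
        self.events: deque[tuple[float, int]] = deque()  # (timestamp, tokens)
        self.total = 0

    def record(self, tokens: int, now: float | None = None) -> bool:
        """Record usage and return True if the account is over the limit."""
        now = time() if now is None else now
        self.events.append((now, tokens))
        self.total += tokens
        # Evict usage older than 24 hours from the running total.
        while self.events and self.events[0][0] <= now - WINDOW_SECONDS:
            self.total -= self.events.popleft()[1]
        return self.total > TOKEN_LIMIT


# Usage: alert (and possibly pause access) when an account exceeds the limit.
window = UsageWindow()
if window.record(tokens=250_000):
    print("Fair-usage limit exceeded in the last 24h; alert and consider pausing")
```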

Generative AI disclosure

  1. Did you use a generative AI (GenAI) tool to assist in creating this contribution?
  • Yes
  • No
  2. If you answered "Yes" to the previous question, please specify the tool(s) and model(s) used (e.g., Google Gemini, OpenAI ChatGPT-4, etc.).

Tool(s) and model(s) used:

github-actions bot commented Dec 3, 2025

Warning

It looks like this PR modifies one or more .asciidoc files. These files are being migrated to Markdown, and any changes merged now will be lost. See the migration guide for details.

github-actions bot commented Dec 3, 2025

Vale Linting Results

Summary: 2 suggestions found

💡 Suggestions (2)
| File | Line | Rule | Message |
|------|------|------|---------|
| deploy-manage/deploy/elastic-cloud/create-an-organization.md | 83 | Elastic.Acronyms | 'ELSER' has no definition. |
| deploy-manage/deploy/elastic-cloud/create-an-organization.md | 84 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |


This reverts commit c919107.
demjened changed the title from "Demjened/fair eis usage limit" to "Document fair EIS usage limits in free trial" on Dec 3, 2025
github-actions bot commented Dec 3, 2025

🔍 Preview links for changed docs

demjened marked this pull request as ready for review on December 3, 2025 at 18:49
demjened requested a review from a team as a code owner on December 3, 2025 at 18:49
demjened requested a review from szabosteve on December 3, 2025 at 18:50
@shainaraskas (Collaborator) left a comment

a few suggestions and questions for you

* Scaling is limited for {{serverless-short}} projects in trials. Failures might occur if the workload requires memory or compute beyond what the above search power and search boost window setting limits can provide.
* We monitor token usage per account for the Elastic Managed LLM. If an account uses over one million tokens in 24 hours, we will inform you and then disable access to the LLM. This is in accordance with our fair use policy for trials.

**Inference tokens**
shainaraskas (Collaborator) commented:

Having this be a sibling to the hosted/serverless entries is not ideal, because those are "sibling" deployment types. We should add some applies tags to indicate that the inference tokens limitation applies to both deployment types.

Suggested change
**Inference tokens**
**Inference tokens** {applies_to}`ess: ga` {applies_to}`serverless: ga`


**Inference tokens**

* You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
shainaraskas (Collaborator) commented:

Suggested change
* You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
* You can use the following models hosted by the Elastic {{infer-cap}} Service, with the following limits:

* You can use these models hosted by the Elastic {{infer-cap}} Service with the following limits:
* **Elastic Managed LLM:** 100 million input tokens in a 24-hour period or 5 million output tokens in a 24-hour period
* **ELSER**: 1 billion tokens in a 24-hour period
* Access to some models may be paused temporarily if either of these limits are exceeded
shainaraskas (Collaborator) commented:

This sentence only works in the context of the bullet above, so we should not make it its own bullet.

Is access ALWAYS paused when these limits are exceeded? Can we be specific? Also, does the pausing vary by model? If it doesn't, we should drop "some".

If the answer to both of these questions is yes:

Suggested change
* Access to some models may be paused temporarily if either of these limits are exceeded
Access to these models is paused temporarily if either of these limits is exceeded.

If no:

Suggested change
* Access to some models may be paused temporarily if either of these limits are exceeded
Access to some models might be paused temporarily if either of these limits is exceeded.

