Version: v3.2.0

Known Issues and Limitations

This page lists the current known issues and limitations of our system, for transparency and to help users understand its constraints.

1. Model Input Token Limitations

Some models, such as meta-llama/Llama-2-13b-chat-hf, have maximum input token limits smaller than the chunk sizes used during certain iterations of our optimizations. When a chunk exceeds a model's limit, the request may fail with an error. This is a known limitation of models with lower token limits.
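One way to avoid such failures is to check a chunk's token count against the model's limit before sending it. The sketch below is a minimal illustration only: the whitespace split is a stand-in for the model's real tokenizer (in practice you would use the model's own tokenizer, e.g. the Hugging Face tokenizer for meta-llama/Llama-2-13b-chat-hf), and the 4096-token limit and helper names are assumptions.

```python
# Minimal sketch: pre-checking chunk size against a model's input limit.
# The whitespace split is a stand-in for the model's real tokenizer.
MAX_INPUT_TOKENS = 4096  # assumed limit, for illustration only


def count_tokens(text: str) -> int:
    """Naive token count; replace with the model's own tokenizer."""
    return len(text.split())


def fits_model(chunk: str, limit: int = MAX_INPUT_TOKENS) -> bool:
    """Return True if the chunk is within the model's input limit."""
    return count_tokens(chunk) <= limit


def truncate_to_limit(chunk: str, limit: int = MAX_INPUT_TOKENS) -> str:
    """Drop trailing tokens so the chunk fits within the limit."""
    tokens = chunk.split()
    return " ".join(tokens[:limit])
```

Oversized chunks can then be truncated, split, or routed to a model with a larger context window, depending on the pipeline.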


2. Hugging Face Models: Token Cost Limitation

For models hosted on Hugging Face (HF), prompt and response token costs are reported as Not Applicable (NA). This is because HF models are billed by infrastructure usage (e.g., per-hour charges) rather than per token, so token-based cost reporting does not apply.
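Because HF-hosted models bill by infrastructure time, a cost estimate can instead be derived from instance uptime. A minimal sketch, assuming a hypothetical hourly rate and measured runtime (the function name and rounding are illustrative, not part of the system's API):

```python
def infra_cost(hourly_rate_usd: float, runtime_seconds: float) -> float:
    """Estimate infrastructure cost from uptime rather than token counts."""
    hours = runtime_seconds / 3600.0
    return round(hourly_rate_usd * hours, 4)
```

For example, a half-hour run on a $1.30/hour instance costs `infra_cost(1.30, 1800)` = 0.65 USD.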


3. NVIDIA Models: Cost Information Limitation

Cost information for NVIDIA models is unavailable because NVIDIA does not publicly publish per-hour or per-token pricing. This may affect your ability to estimate operational costs for NVIDIA-hosted models.
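Downstream code should therefore treat cost for these models as explicitly unavailable rather than zero. A minimal sketch of one way to model this (the provider set and helper are assumptions for illustration, not the system's actual implementation):

```python
from typing import Optional

# Providers with no public pricing (assumed set, for illustration).
PROVIDERS_WITHOUT_PRICING = {"nvidia"}


def estimated_cost(provider: str, tokens: int, per_token_usd: float) -> Optional[float]:
    """Return a cost estimate, or None when pricing is unavailable."""
    if provider.lower() in PROVIDERS_WITHOUT_PRICING:
        return None  # distinct from 0.0: "unknown", not "free"
    return tokens * per_token_usd
```

Returning `None` (rather than `0.0`) keeps "pricing unknown" distinguishable from "free" in aggregated cost reports.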


4. Fine-Tuned User Models: Limited Support

We do not yet fully support fine-tuned user models, because tokenizers and specific cost information are unavailable for them. Users can still use their fine-tuned OpenAI models, but we may be unable to provide accurate cost estimates, as detailed pricing for such models is not published.
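A fine-tuned OpenAI model's identifier still encodes its base model (fine-tuned ids generally follow the pattern `ft:<base-model>:<org>:<suffix>:<id>`), which can be used to pick an approximate tokenizer or price. A minimal sketch, assuming that naming convention:

```python
def base_model_of(model_id: str) -> str:
    """Extract the base model from a fine-tuned OpenAI model id.

    Fine-tuned ids generally look like 'ft:<base>:<org>:<suffix>:<id>';
    plain model ids are returned unchanged.
    """
    if model_id.startswith("ft:"):
        return model_id.split(":")[1]
    return model_id
```

Any tokenizer or price derived this way is an approximation of the base model's, not the fine-tuned model's exact figures.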


5. OpenAI Model Registration Limitations

Not all OpenAI models can be registered in the system. The following models are not registered because they are neither chat nor language models:

  • gpt-4o-audio-preview
  • gpt-4o-audio-preview-2024-10-01
  • gpt-4o-audio-preview-2024-12-17
  • gpt-4o-mini-realtime-preview
  • gpt-4o-mini-audio-preview
  • gpt-4o-mini-audio-preview-2024-12-17
  • gpt-4o-mini-realtime-preview-2024-12-17
  • gpt-4o-realtime-preview
  • gpt-4o-realtime-preview-2024-10-01
  • gpt-4o-realtime-preview-2024-12-17

Additionally, the following models are not registered because they are deprecated or unavailable:

  • gpt-3.5-turbo-16k-0613
  • gpt-3.5-turbo-instruct-0914
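
A registration pipeline can screen these out by model id. A minimal sketch, assuming model ids are available as plain strings (the exclusion rules below simply mirror the two lists in this section):

```python
# Models are excluded when they are audio/realtime variants or explicitly
# deprecated; both rules mirror the lists in this section.
EXCLUDED_SUBSTRINGS = ("-audio-preview", "-realtime-preview")
DEPRECATED = {"gpt-3.5-turbo-16k-0613", "gpt-3.5-turbo-instruct-0914"}


def is_registrable(model_id: str) -> bool:
    """Return True if the model can be registered in the system."""
    if model_id in DEPRECATED:
        return False
    return not any(s in model_id for s in EXCLUDED_SUBSTRINGS)
```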

6. Query Generation: Dependency on Document and Model Quality

The quality of query generation is highly dependent on the quality of the uploaded document and the selected LLM. Documents with incomplete or unclear information may result in suboptimal queries. Similarly, the choice of LLM directly impacts the relevance and accuracy of the generated queries.


We are actively working to address these limitations where possible and will update this page as new information becomes available.

For any issues not covered here, please contact our support team at help@trustwise.ai.