NVIDIA ChatRTX: Your Customized AI Chatbot
Guide for NVIDIA ChatRTX Users
Overview
NVIDIA ChatRTX is a demo application that lets you customize a GPT large language model (LLM) connected to your own documents, notes, images, or other data. Using retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration, you can ask a custom chatbot questions and quickly get contextually relevant answers. Because everything runs locally on your Windows RTX PC or workstation, results are fast and secure.
ChatRTX supports several file types: text, PDF, and doc/docx (with LLMs), and jpeg, gif, and png (with CLIP). Point the application at the folder that contains your files and it will load them into the library in a few seconds. The ChatRTX tech demo is built on the TensorRT-LLM RAG developer reference project, which is available on GitHub. Developers can use that reference to build and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM.
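As a rough illustration of the file types listed above, the following sketch sorts the files in a folder into those an LLM would handle, those CLIP would handle, and everything else. It is not ChatRTX code. The function name, the group names, and the mapping of "text" to .txt and "jpeg" to .jpg/.jpeg are assumptions, and the sketch only looks at the top level of the folder because the guide does not say whether subfolders are scanned.

```python
from pathlib import Path

# Extension groups based on the file types named in this guide.
# Mapping "text" to .txt and "jpeg" to .jpg/.jpeg is an assumption.
TEXT_EXTENSIONS = {".txt", ".pdf", ".doc", ".docx"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def classify_files(folder):
    """Group the files directly inside `folder` by supported file type."""
    groups = {"text": [], "image": [], "unsupported": []}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # subfolders are ignored in this sketch
        suffix = path.suffix.lower()
        if suffix in TEXT_EXTENSIONS:
            groups["text"].append(path.name)
        elif suffix in IMAGE_EXTENSIONS:
            groups["image"].append(path.name)
        else:
            groups["unsupported"].append(path.name)
    return groups
```

Calling classify_files on a dataset folder before pointing ChatRTX at it gives a quick view of which files are likely to be picked up.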
Requirements
- ChatRTX currently supports only RTX 3xxx and RTX 4xxx series GPUs with at least 8GB of GPU memory (vGPU configurations are not supported at this time).
- A minimum of 100 gigabytes of free hard drive space
- The most recent NVIDIA GPU drivers for Windows 10/11
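The GPU memory and disk space requirements above can be checked with a short script before running the installer. This is a minimal sketch, not an NVIDIA tool. It assumes the nvidia-smi utility that ships with the NVIDIA driver is on the PATH, and the helper names and the 8000 MiB cutoff are assumptions. It does not check the GPU series, which you still need to confirm from the printed GPU name.

```python
import shutil
import subprocess

# Nominal minimum is 8 GB; the cutoff sits slightly below 8192 MiB because
# drivers can report a little less than the nominal size (assumption).
MIN_VRAM_MIB = 8000
MIN_FREE_DISK_GB = 100  # free disk space figure from the list above


def parse_gpu_line(line):
    """Parse one 'name, memory' row produced by the nvidia-smi query below."""
    name, memory_mib = line.rsplit(",", 1)
    return name.strip(), int(memory_mib.strip())


def has_enough_vram(line):
    """True if the GPU described by `line` meets the memory cutoff."""
    return parse_gpu_line(line)[1] >= MIN_VRAM_MIB


def query_gpu_lines():
    """Return one CSV row per GPU, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return [row for row in result.stdout.splitlines() if row.strip()]


def has_enough_disk(path):
    """Check free space on the drive that holds `path` (decimal gigabytes)."""
    return shutil.disk_usage(path).free / 1e9 >= MIN_FREE_DISK_GB
```

For example, looping over query_gpu_lines() and printing each row together with has_enough_vram(row) shows at a glance whether any installed GPU clears the memory requirement.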
Installation Advice
The installer downloads AI model weights, engine files, and other software libraries. The total download is around 11 GB, depending on the models you select. Download and installation should take ten to thirty minutes, depending on your internet connection and the load on the servers.
- During the installation process, please ensure that the sleep feature on your computer is turned off.
- If the installation fails with an error message, run the installer again. It will pick up where it left off and continue the installation.
- If the installation fails after some of the components have been installed, choose “do a clean install” on your next installation attempt.
- Although the installer bundles most of the large files it needs, it still has to download a few files from public sites. If these servers are unavailable, the installer may fail or stall temporarily.
- If you install the application somewhere other than the default install location, please ensure that the folder path and folder name contain no spaces. This is a known issue that will be resolved in a later version.
- If the installation still fails after several attempts, delete the following folder before trying again: C:\Users\<username>\AppData\Local\NVIDIA\RAG
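The no-spaces rule for custom install locations is easy to check before starting the installer. The sketch below is only an illustration; the function name is hypothetical and the example paths are made up.

```python
from pathlib import PureWindowsPath


def components_with_spaces(install_path):
    """Return the parts of a Windows path that contain a space."""
    return [part for part in PureWindowsPath(install_path).parts if " " in part]
```

An empty result means the path is safe to use. For instance, a path under "Program Files" would be reported because of the space in that folder name, while a path such as D:\Apps\ChatRTX would pass.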
Using ChatRTX without a dataset
The application uses a technique known as Retrieval Augmented Generation (RAG): it searches the local files you connect it to and supplies the relevant material as context when it submits your query to the LLM. If RAG is disabled, the LLM produces replies based only on the data it was originally trained on.
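The difference between the two modes can be sketched as a prompt-building step. This is a simplified illustration of the general RAG idea, not ChatRTX's actual prompt format; the function name and the prompt wording are assumptions.

```python
def build_prompt(question, retrieved_chunks=None):
    """Build the text sent to the LLM, with or without retrieved context."""
    if not retrieved_chunks:
        # RAG disabled (or nothing retrieved): the model sees only the
        # question and must rely on what it learned during training.
        return f"Question: {question}\nAnswer:"
    # RAG enabled: passages found in the local files are placed in front
    # of the question so the model can ground its answer in them.
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

With no chunks supplied, the prompt contains only the question, which mirrors what happens when no dataset is attached.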
Using the CLIP Vision and Language Model
In addition to the Mistral LLM that comes pre-installed, you can download and install the CLIP vision and language model from the “Add new models” option. Once the model is installed, point the application at a folder of JPEG images to interact with your photos. The images do not need to be tagged. You can ask things like “Show me pictures with cats in them,” “Show me pictures taken outside,” “Show me pictures with flowers,” and so on. How accurately your queries are answered depends on the CLIP model’s training and accuracy.
Using voice to input your questions
This version of NVIDIA ChatRTX also includes the Whisper model, which transcribes speech to text. Make sure your system’s microphone is turned on before using this feature. Then click the “mic” button and ask your question. When you have finished, click the “stop” symbol to end the recording. The application will transcribe your question and show it in the chat box. You can then click “Send” to send the text to the LLM for a reply. The Whisper model supports French, Spanish, Mandarin, and other languages.
Rules for query results
ChatRTX stores your data in the vector library in chunks and, to answer a question, selects chunks based on their relevance to it. You can think of these chunks as paragraphs in a document. Because of this method of data storage, NVIDIA ChatRTX works well for queries whose answers are covered in small pieces across the dataset, but it is not suited to queries that require reasoning over the complete dataset at once. For instance, asking for a few facts mentioned in a few documents is likely to give better results than asking for a summary of a document or a collection of documents.
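The chunk-and-retrieve behavior described above can be sketched with a toy retriever. This is not how ChatRTX is implemented: it swaps the real embedding model and vector library for simple word counts and cosine similarity, and every function name here is hypothetical. It does show why only a handful of chunks ever reach the LLM, which is the reason whole-dataset summaries work poorly.

```python
import math
import re
from collections import Counter


def split_into_chunks(text):
    """Treat each blank-line-separated paragraph as one chunk."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def bag_of_words(text):
    """Count lowercase alphanumeric words (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:k]
```

A question about one topic pulls back only the paragraphs that mention it, and unrelated paragraphs are never shown to the model. A request to summarize everything has no small set of matching chunks, so most of the dataset is left out of the answer.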
As with most AI use cases, more data tends to improve the quality of the response. Pointing NVIDIA ChatRTX at more content about a given topic typically yields better answers.
Terminating the Application
Click the Power button symbol located in the top right corner of the application to end it. The application will be closed as a result. To exit the program backend in the Command Prompt window, press any key on your keyboard.
Known Issues and Limitations
The current build has the following known issues:
- The app currently works with Google Chrome and Microsoft Edge. Due to a bug, it is incompatible with the Firefox browser. A later release will address this.
- Context is not remembered by the program. This implies that subsequent enquiries won’t be addressed in light of the earlier queries’ context. For instance, the app won’t recognise that you are enquiring about the RTX 4080 Super if you ask, “What is the price of the RTX 4080 Super?” and then, “What are its hardware specifications?”
- The source file attribution shown with a response isn’t always accurate. This will be improved in a later release.
- We have seen a few cases where the application gets stuck in an unusable state that restarting does not fix.
- This can usually be resolved by deleting the preferences.json file (by default found at C:\Users\<username>\AppData\Local\NVIDIA\ChatWithRTX\RAG\trt-llm-rag-windows-main\config\preferences.json). If a reinstallation doesn’t work, try deleting the install directory (by default found at C:\Users\<username>\AppData\Local\NVIDIA\ChatWithRTX).
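The preferences reset mentioned in the last item can be scripted. The sketch below is a cautious variant, not an NVIDIA-provided tool: instead of deleting preferences.json it renames it to a backup, so the old settings can be restored. It assumes the default location quoted above and assumes the application can start without the file; the function and constant names are hypothetical.

```python
from pathlib import Path

# Default location quoted in this guide, relative to the user's home folder.
PREFERENCES_RELATIVE = Path(
    "AppData", "Local", "NVIDIA", "ChatWithRTX", "RAG",
    "trt-llm-rag-windows-main", "config", "preferences.json",
)


def reset_preferences(home=None):
    """Rename preferences.json to preferences.json.bak.

    Returns the backup path, or None if no preferences file was found.
    """
    home = Path(home) if home is not None else Path.home()
    prefs = home / PREFERENCES_RELATIVE
    if not prefs.is_file():
        return None
    backup = prefs.with_name(prefs.name + ".bak")
    prefs.replace(backup)  # overwrites an older backup if one exists
    return backup
```

Close ChatRTX before running it, then call reset_preferences() with no arguments to act on the current user's profile.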
System Requirements
Component | Requirement |
---|---|
GPU | NVIDIA RTX Ampere or Ada Generation GPU or NVIDIA GeForce RTX 30/40 Series with minimum 8GB VRAM |
RAM | 16GB or more |
Operating System | Windows 11 |
Driver Version | 535.11 or higher |
Storage | 11 GB of available disk space |