Introduction
NVIDIA's Chat with RTX is a demo application that personalizes a GPT large language model (LLM) by connecting it to your own content—documents, notes, videos, or other data. Leveraging retrieval-augmented generation (RAG), TensorRT-LLM, and RTX acceleration, you can query a customized chatbot and quickly obtain contextually relevant answers. Because everything runs locally on a Windows RTX PC or workstation, results are both fast and private.
Official download page: https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/#sys-req
Device requirements
| Requirement | Details |
|---|---|
| Platform | Windows |
| GPU | NVIDIA GeForce™ RTX 30 or 40 Series GPU, or NVIDIA RTX™ Ampere or Ada Generation GPU, with at least 8 GB of VRAM |
| RAM | 16 GB or greater |
| OS | Windows 11 |
| Driver | 535.11 or later |
Application scenarios
NVIDIA's Chat with RTX provides two core application scenarios:
- Chat with your own files. Chat with RTX supports multiple file formats, including text, PDF, DOC/DOCX, and XML. Point the app at the folder containing your files, and it loads them into its library within seconds.
- Query YouTube content. Provide the URL of a YouTube playlist, and the app loads the transcripts of the videos in that playlist, letting you query the content they cover.
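To make the RAG idea above concrete, here is a minimal, purely illustrative sketch of the retrieval step: chunk local documents, score each chunk against the user's question, and hand the best match to an LLM as context. This is a hypothetical simplification for explanation only; Chat with RTX itself uses TensorRT-LLM and an embedding-based vector index, not this keyword overlap scoring.

```python
# Illustrative RAG retrieval sketch (NOT the Chat with RTX implementation):
# split documents into chunks, score chunks by word overlap with the query,
# and return the most relevant chunk as context for the LLM.

def chunk_text(text, size=50):
    """Split a document into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(chunk, query):
    """Count query words that also appear in the chunk (bag-of-words overlap)."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for w in query.lower().split() if w in chunk_words)

def retrieve(documents, query, top_k=1):
    """Return the top_k chunks most relevant to the query."""
    chunks = [c for doc in documents for c in chunk_text(doc)]
    return sorted(chunks, key=lambda c: score(c, query), reverse=True)[:top_k]

# Example: two tiny "documents" standing in for a local folder of files.
docs = [
    "Chat with RTX runs locally on an RTX PC or workstation.",
    "The driver requirement for the demo is version 535.11 or later.",
]
print(retrieve(docs, "which driver version is required?"))
```

In a real pipeline, the retrieved chunk would be prepended to the user's question in the LLM prompt, which is what lets the model answer from your own data rather than from its training set alone.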
The Chat with RTX technical demo is built on the TensorRT-LLM RAG developer reference project available on GitHub. Developers can use this reference to build and deploy their own RAG-based applications, accelerated for RTX by TensorRT-LLM.