NVIDIA has published a blog announcing optimizations for a new open language model, Google’s Gemma.

  • NVIDIA, in collaboration with Google, today launched optimizations across all NVIDIA AI platforms, including local RTX AI PCs, for Gemma — Google’s groundbreaking new 2 billion- and 7 billion-parameter open language models.
  • Chat With RTX, an NVIDIA tech demo that uses retrieval-augmented generation and NVIDIA TensorRT-LLM software to give users generative AI capabilities on their local, RTX-powered Windows PCs, will add support for Gemma soon.

Teams from Google and NVIDIA worked closely together to accelerate the performance of Gemma — Google’s groundbreaking new 2 billion- and 7 billion-parameter open language models, built from the same research and technology used to create the Gemini models — with TensorRT-LLM, an open-source library for optimizing large language model inference, when running on NVIDIA GPUs in the data center, in the cloud, and locally on RTX AI PCs with NVIDIA RTX GPUs.

We’re also announcing that Chat With RTX will add Gemma as a supported model soon. Stay tuned for an exact release date. If you’re interested in using Chat With RTX with Gemma, we anticipate having a press build as early as later today. Let us know and we’ll share it as soon as it’s ready.
