NVIDIA Jarvis GA
May 15, 2020
Jarvis is a GPU-accelerated application framework that allows
companies to use video and speech data to build state-of-the-art
conversational AI services customized for their own industry, products
and customers.
The shift toward working from home, telemedicine and remote learning has
created a surge in demand for custom, language-based AI services,
ranging from customer support to real-time transcriptions and
summarization of video calls to keep people productive and connected.
Among the first companies to take advantage of Jarvis-based
conversational AI products and services for their customers are Voca, an
AI agent for call center support; Kensho, for automatic speech
transcriptions for finance and business; and Square, with its virtual
assistant for appointment scheduling.
“Conversational AI is central to the future of many industries, as
applications gain the ability to understand and communicate with nuance
and contextual awareness,” said Jensen Huang, founder and CEO of NVIDIA.
“NVIDIA Jarvis can help the healthcare, financial services, education
and retail industries automate their overloaded customer support with
speed and accuracy.”
Applications built with Jarvis can take advantage of innovations in the
new NVIDIA A100 Tensor Core GPU for AI computing and the latest
optimizations in NVIDIA TensorRT™ for inference. For the first time,
it’s now possible to run an entire multimodal application, using the
most powerful vision and speech models, faster than the 300-millisecond
threshold for real-time interactions.
Jarvis provides a complete, GPU-accelerated software stack and tools
that make it easy for developers to create, deploy and run end-to-end,
real-time conversational AI applications that can understand terminology
unique to each company and its customers.
“IDC continues to see rapid growth within the conversational AI market
largely because organizations of all sizes are beginning to realize the
value of using well-trained virtual assistants and chatbots to help
service their customers and grow their businesses,” said David Schubmehl,
research director of AI Software Platforms at IDC. “IDC expects
worldwide spending on conversational AI use cases like automated
customer service agents and digital assistants to grow from $5.8 billion
in 2019 to $13.8 billion in 2023, a compound annual growth rate of 24
percent.”
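The growth rate quoted here can be checked with the standard compound annual growth rate formula. The following is a minimal sketch in Python, assuming the 2019 and 2023 figures are measured four years apart; the function name `cagr` is illustrative and does not come from IDC or NVIDIA.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.24 = 24 percent)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    # Growth factor over the whole period, converted to a per-year rate.
    return (end_value / start_value) ** (1 / years) - 1


# Spending figures quoted above, in billions of dollars.
rate = cagr(5.8, 13.8, 2023 - 2019)
print(f"CAGR: {rate:.1%}")
```

Run with the quoted figures, the result comes out at roughly 24 percent, consistent with the stated growth rate.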
To offer an interactive, personalized experience, companies need to
train their language-based applications on data that is specific to
their own product offerings and customer requirements. However, building
a service from scratch requires deep AI expertise, large amounts of data
and compute resources to train the models, and software to regularly
update models with new data.
Jarvis addresses these challenges by offering an end-to-end deep
learning pipeline for conversational AI. It includes state-of-the-art
deep learning models, such as NVIDIA’s Megatron BERT for natural
language understanding. Enterprises can further fine-tune these models
on their data using NVIDIA NeMo, optimize for inference using TensorRT,
and deploy in the cloud and at the edge using Helm charts available on
NGC, NVIDIA’s catalog of GPU-optimized software.
Early Adopters — Voca, Kensho, Square
Companies worldwide are using NVIDIA’s conversational AI platform to
improve their services.
Voca’s AI virtual agents — which use NVIDIA for faster, more
interactive, human-like engagements — are used by Toshiba, AT&T and
other world-leading companies. Voca uses AI to understand the full
intent of a customer’s spoken conversation. This makes it
possible for the agents to automatically identify different tones and
vocal cues to discern between what a customer says and what a customer
means. Additionally, using scalability features built into NVIDIA’s AI
platform, they can dramatically reduce customer wait time.
“Low latency is critical in call centers and with NVIDIA GPUs our agents
are able to listen, understand and respond in under a second with the
highest levels of accuracy,” said Alan Bekker, co-founder and CTO of
Voca. “Now our virtual agents are able to successfully handle 70-80
percent of all calls — ranging from general customer service requests to
payment transactions and technical support.”
Kensho, the innovation hub for S&P Global located in Cambridge, Mass.,
that deploys scalable machine learning and analytics systems, has used
NVIDIA’s conversational AI to develop Scribe, a speech recognition
solution for finance and business. With NVIDIA, Scribe outperforms other
commercial solutions on earnings calls and similar financial audio in
terms of accuracy by a margin of up to 20 percent.
“We’re working closely with NVIDIA on ways to push end-to-end automatic speech
recognition with deep learning even further,” said Georg Kucsko, head of
AI research at Kensho. “By training new models with NVIDIA, we’re able
to offer higher transcription accuracy for financial jargon compared to
traditional approaches that do not use AI, offering our customers timely
information in minutes versus days.”
Square has created an AI virtual assistant that allows Square sellers to
use AI to automatically confirm, cancel or change appointments with
their customers, and free themselves to conduct more strategic customer
engagement.
“Square Assistant can understand and provide help for 75 percent of
customer questions, along with ensuring that 10 percent more people are
showing up to their appointments,” said Gabor Angeli, head of
conversational AI at Square. “With GPUs, we’re able to train models 10x
faster versus CPUs to deliver more accurate, human-like interactions,
ultimately helping our customers grow their businesses.”
An early access program for NVIDIA
Jarvis is available to a limited number of applicants.