Project Rocket platform—designed for easy, customizable live video analytics—is open source

By Microsoft's Ganesh Ananthanarayanan, Principal Researcher; Yuanchao Shu, Senior Researcher; Landon Cox, Principal Researcher

January 27, 2020


Thanks to advances in computer vision and deep neural networks (DNNs) in what can arguably be described as the golden age of vision, AI, and machine learning, video analytics systems—systems performing analytics on live camera streams—are becoming more accurate. This accuracy offers opportunities to support individuals and society in exciting ways, like informing homeowners when a package has been delivered outside their door, allowing people to give their pets the attention they need when out for the day, and detecting high-traffic areas so cities can consider adding a stop light.

While DNN advancements and DNN inference are enablers, they alone are not enough when it comes to extracting valuable insights from live videos. Live video analytics requires keeping up with video frame rates, which can be as fast as 60 frames per second, making it crucial to effectively filter frames and avoid the costly processing of each frame. Project Rocket provides a framework to do exactly that.
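To make the frame-rate constraint concrete, here is a small back-of-the-envelope sketch in Python. It is not part of Rocket; the function names and the 5 ms and 50 ms stage latencies are hypothetical placeholders rather than measurements. It computes the time budget per frame at a given frame rate and, assuming a single worker that runs a cheap stage on every frame, the largest share of frames that can be forwarded to an expensive stage without falling behind.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, to keep up with a live stream."""
    return 1000.0 / fps


def max_heavy_fraction(fps: float, light_ms: float, heavy_ms: float) -> float:
    """Largest fraction of frames that can reach the heavy stage on one worker.

    Assumes every frame pays light_ms and forwarded frames also pay heavy_ms,
    so the average cost per frame is light_ms + fraction * heavy_ms.
    """
    budget = frame_budget_ms(fps)
    if light_ms >= budget:
        return 0.0  # the cheap stage alone already exceeds the budget
    return min(1.0, (budget - light_ms) / heavy_ms)


# Hypothetical latencies, chosen only for illustration.
budget = frame_budget_ms(60)                                  # about 16.7 ms per frame
fraction = max_heavy_fraction(60, light_ms=5.0, heavy_ms=50.0)  # about 0.23
```

Under these made-up numbers, fewer than a quarter of the frames can afford the expensive stage, which is why filtering matters.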


Rocket—which we’re glad to announce is now open source on GitHub—enables the easy construction of video pipelines for efficiently processing live video streams. You can build, for example, a video pipeline that includes a cascade of DNNs in which a decoded frame is first passed through a relatively inexpensive “light” DNN like ResNet-18 or Tiny YOLO and a “heavy” DNN such as ResNet-152 or YOLOv3 is invoked only when required. With Rocket, you can plug in any TensorFlow or Darknet DNN model. You can also augment the above pipeline with, let’s say, a simpler motion filter based on OpenCV background subtraction, as shown in the figure below.


The above figure represents one of several video pipelines that can be built for efficient, customizable live video analytics with the Project Rocket platform. In this pipeline, decoded video frames are filtered first using background subtraction detection and then low-resource DNN detection. Frames requiring further processing are passed through a heavy DNN detector.

Cascaded pipelines, like the one above, allow for efficient processing of live video streams by filtering out frames with limited relevant information and being judicious about invoking resource-intensive operations. Rocket also makes it easy to ship the outputs of the video analytics, such as the number of relevant objects in an object-counting application, to a database for after-the-fact review.
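The control flow of such a cascade can be sketched in a few lines. The Python below is an illustration only, not Rocket's code or API (Rocket itself is written in .NET Core). The frame records, the three stage functions, and the confidence threshold are hypothetical stand-ins for background subtraction, a light DNN, and a heavy DNN. In particular, the rule that the heavy stage runs only when the light stage is unsure is an assumption about what "only when required" could mean.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

Frame = dict  # hypothetical frame record; a real pipeline would carry decoded pixels


@dataclass
class CascadeStats:
    frames_seen: int = 0
    dropped_by_motion_filter: int = 0
    resolved_by_light: int = 0
    heavy_invocations: int = 0
    objects_counted: int = 0


def run_cascade(
    frames: Iterable[Frame],
    has_motion: Callable[[Frame], bool],
    light_detect: Callable[[Frame], Tuple[int, float]],  # returns (count, confidence)
    heavy_detect: Callable[[Frame], int],                # returns count
    min_confidence: float = 0.8,                         # assumed threshold
) -> CascadeStats:
    stats = CascadeStats()
    for frame in frames:
        stats.frames_seen += 1
        # Stage 1: cheap motion filter drops frames with nothing happening.
        if not has_motion(frame):
            stats.dropped_by_motion_filter += 1
            continue
        # Stage 2: the light detector handles frames it is confident about.
        count, confidence = light_detect(frame)
        if confidence >= min_confidence:
            stats.resolved_by_light += 1
        else:
            # Stage 3: only uncertain frames pay for the heavy detector.
            stats.heavy_invocations += 1
            count = heavy_detect(frame)
        stats.objects_counted += count
    return stats


# Synthetic frames with precomputed stage outputs, for demonstration only.
frames = [
    {"motion": 0.0, "light": (0, 0.90), "heavy": 0},
    {"motion": 0.6, "light": (2, 0.95), "heavy": 2},
    {"motion": 0.7, "light": (1, 0.40), "heavy": 3},
    {"motion": 0.0, "light": (0, 0.90), "heavy": 0},
]
stats = run_cascade(
    frames,
    has_motion=lambda f: f["motion"] > 0.1,
    light_detect=lambda f: f["light"],
    heavy_detect=lambda f: f["heavy"],
)
```

In this toy run, two of the four frames never reach a DNN and only one reaches the heavy stage. The resulting counters are the kind of output that could then be written to a database.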


The Project Rocket video analytics platform (above) is self-contained and allows people to plug in TensorFlow and Darknet DNN models to create pipelines for object detection, object counting, and the like to drive higher-level applications such as traffic prediction analysis and smart homes.

Making streets safer

Project Rocket has been focusing on smart cities as its driving application. In partnership with the city of Bellevue, Washington, we used the framework to help make the city’s street system safer for drivers, riders, and pedestrians as part of its Vision Zero initiative to reduce traffic-related fatalities. With aggregate car and bicycle counts provided by a system built on the framework, for example, the city was able to assess the value of adding a bike lane to its downtown area.

One exciting traffic safety–related application we recently used the framework for, separate from our work with Bellevue, is a smart crosswalk. Using a live camera feed, the smart crosswalk, which is in the prototype stage, is able to detect when a person in a wheelchair is in the middle of the crosswalk and extend the timer to allow the person to safely finish crossing.

Throughout our research and as we continue to develop Rocket, we’re devising privacy-protecting tools, including a “privacy protector” technique in which only those elements relevant to an application—for example, cars in a traffic-counting system—will be made available; background elements and other details, such as people, homes, businesses, and license plate numbers in the traffic-counting example, will be blacked out. Additionally, Rocket leverages the edge, and for all its benefits in enabling efficiency, we also see it as a means for keeping data in a trusted space—that is, on users’ premises.
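As a rough illustration of the blacking-out idea, the Python sketch below keeps only the pixels inside a set of bounding boxes and zeroes everything else. It is not the project's privacy-protector implementation. The function name, the box format, and the use of a nested list as a grayscale image are all assumptions made to keep the example self-contained.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def black_out_except(image: List[List[int]], keep_boxes: Sequence[Box]) -> List[List[int]]:
    """Return a copy of image in which every pixel outside keep_boxes is set to 0."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Start from an all-black frame and copy back only the relevant regions.
    masked = [[0] * width for _ in range(height)]
    for left, top, right, bottom in keep_boxes:
        for y in range(max(0, top), min(height, bottom)):
            for x in range(max(0, left), min(width, right)):
                masked[y][x] = image[y][x]
    return masked


# A tiny 4x4 synthetic "frame" and one hypothetical detection box around a car.
image = [[10 * y + x + 1 for x in range(4)] for y in range(4)]
masked = black_out_except(image, keep_boxes=[(1, 1, 3, 3)])
```

Starting from a black frame and copying in the relevant regions means that anything the detector did not explicitly flag stays hidden by default.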

Get the code and get to work!

The Rocket platform is Linux friendly. The code is written in .NET Core, which runs on Windows as well as Linux. The Rocket repository also has simple instructions to create Docker containers, allowing for easy deployment using orchestration frameworks like Kubernetes. Docker containers are also readily compatible with appliances that bring computing to the edge, such as Microsoft Azure Stack Edge. Additionally, Rocket has easy-to-use code for optionally invoking customized Azure Machine Learning models in the cloud.

Check out the code and give it a spin!

For more information, including a tutorial on how to get started building your own video analytics applications atop the platform, check out our Project Rocket webinar, available on demand now.
