GStreamer Nervous System for AI Brain: Introducing Python Analytics
With the growing success of machine learning (ML) language and speech models over the past four years, ML systems are behaving increasingly like human brains. These brains must be fed with data, and GStreamer is the perfect framework to do it. But how do we remove obstacles to rapid adoption?
ML research and commercial development take place almost exclusively in the Python world, and the current dominant ML toolkit is PyTorch. PyTorch has succeeded for a number of reasons, including its simplicity, strong community, rapid innovation, broad hardware support, and ease of integration with the vast Python ecosystem. Over the past year, PyTorch has introduced a compile feature, a Just-In-Time (JIT) compiler that dynamically optimizes code for the current target hardware. The performance improvements are astonishing: in some cases compiled PyTorch is faster than TensorRT on Nvidia hardware. But compile is not limited to just one hardware platform.
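As a minimal sketch of the compile feature described above (assuming PyTorch 2.x is installed), `torch.compile` wraps an ordinary function or module and JIT-compiles it for the active backend. The function below is illustrative, not from the talk; `backend="eager"` is used here so the sketch runs without a native codegen toolchain, whereas the default inductor backend generates hardware-optimized kernels:

```python
import torch

def pointwise(x):
    # A simple pointwise computation; sin^2 + cos^2 == 1 elementwise,
    # and ops like these are what the compiler can fuse into one kernel.
    return torch.sin(x) ** 2 + torch.cos(x) ** 2

# torch.compile returns an optimized callable with the same signature.
# backend="eager" skips code generation (useful for debugging); the
# default backend ("inductor") compiles kernels for the current hardware.
pointwise_opt = torch.compile(pointwise, backend="eager")

y = pointwise_opt(torch.randn(16))
```

The compiled callable is a drop-in replacement for the original, which is what makes it attractive inside a streaming element: the model code stays plain Python while the JIT targets whatever hardware the pipeline runs on.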
Collabora was the first to upstream neural network support into GStreamer via ONNX analytics elements. ONNX is a cross-platform inference engine whose C++ API has been integrated to enable new object detection and segmentation elements. We have also introduced the analytics metadata framework, which flexibly stores metadata generated by AI models along with the relationships between different pieces of metadata.
We now introduce a suite of GStreamer custom elements and classes written in Python that allow users to easily and rapidly support all the latest AI models, using native PyTorch support. We provide a package with base classes supporting models for audio, video, and Large Language Models (LLMs). The package works with the latest GStreamer version and interoperates with the new metadata framework. Performance enhancements such as batching and managing device memory buffers are available out of the box.
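To make the base-class idea concrete, here is a hypothetical sketch of the subclassing pattern such a package might offer; the class and method names below are illustrative assumptions, not the real package API. The base class accumulates frames and hands the model a full batch, which is one of the out-of-the-box optimizations mentioned above:

```python
class PyTorchVideoBase:
    """Hypothetical base class: collects incoming frames into batches
    so the model runs once per batch instead of once per frame."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self._pending = []

    def submit(self, frame):
        # Buffer frames until a full batch is ready, then run inference.
        self._pending.append(frame)
        if len(self._pending) >= self.batch_size:
            batch, self._pending = self._pending, []
            return self.forward(batch)
        return None  # batch not yet full

    def forward(self, batch):
        raise NotImplementedError("subclasses implement the model call")


class ObjectDetector(PyTorchVideoBase):
    """Hypothetical element: a real subclass would invoke a (compiled)
    PyTorch detection model here and attach analytics metadata."""

    def forward(self, batch):
        return [{"frame": f, "detections": []} for f in batch]
```

A user-facing element would only override `forward`, leaving batching, device placement, and metadata plumbing to the base class.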
In addition to the base classes, we also provide elements that perform object detection, tracking, speech-to-text, text-to-speech, and LLM chatbot features. There is also a Kafka element that can send metadata to a Kafka server.
The list of elements continues to grow rapidly, supporting any of the many Hugging Face models with ease. Our goal is no less than to provide a complete upstream solution for GStreamer Analytics via PyTorch, making upstream GStreamer the number one multimedia framework for machine learning.
Creation date:
Oct. 7, 2024
Speakers:
Aaron Boxer
License:
CC BY-SA 3.0