Real-Time Video Processing with WebCodecs and Streams: Processing Pipelines (Part 1)
WebRTC used to be about capturing some media and sending it from Point A to Point B. Machine Learning has changed this. Now it is common to use ML to analyze and manipulate media in real time for things like virtual backgrounds, augmented reality, noise suppression, intelligent cropping, and much more. To better accommodate this growing trend, the web platform has been exposing its underlying capabilities to give developers more access. The result isn't only more control within existing APIs, but also a set of new APIs like Insertable Streams, WebCodecs, Streams, WebGPU, and WebNN.
So how do all these new APIs work together? That's exactly what W3C specialists François Daoust and Dominique Hazaël-Massieux (Dom) decided to find out. In case you forgot, the W3C is the World Wide Web Consortium that standardizes the Web. François and Dom are long-time standards guys with a deep history of helping to make the web what it is today.
This is the first of a two-part series of articles that explores the future of real-time video processing with WebCodecs and Streams. This first part provides a review of the steps and pitfalls in a multi-step video processing pipeline using existing and the newest web APIs. Part two will explore the actual processing of video frames.
I'm thrilled about the depth and insights these guides provide on these cutting-edge approaches – enjoy!
{“editor”, “chad hart“}
In simple WebRTC video conferencing scenarios, audio and video streams captured on one device are sent to another device, possibly going through some intermediary server. The capture of raw audio and video streams from microphones and cameras relies on getUserMedia. Raw media streams then need to be encoded for transport and sent over to the receiving side. Received streams must be decoded before they can be rendered. The resulting video pipeline is illustrated below. Web applications don't see these separate encode/send and receive/decode steps in practice – they are entangled in the core WebRTC API and under the control of the browser.
If you want to add the ability to do something like remove users' backgrounds, the most scalable and privacy-respecting option is to do it client-side before the video stream is sent to the network. This operation needs access to the raw pixels of the video stream. Said differently, it needs to take place between the capture and encode steps. Similarly, on the receiving side, you may want to give users options like adjusting colors and contrast, which also require raw pixel access between the decode and render steps. As illustrated below, this adds extra processing steps to the resulting video pipeline.
This made Dominique Hazaël-Massieux and me wonder how web applications can build such media processing pipelines.
The main problem is that raw frames from a video stream cannot casually be exposed to web applications. Raw frames are:
- large – several MB per frame,
- plentiful – 25 frames per second or more,
- not easily exposable – GPU to CPU read-back is often needed, and
- browsers need to deal with a variety of pixel formats (RGBA, YUV, etc.) and color spaces under the hood.
As such, whenever possible, web technologies that manipulate video streams on the web (HTMLMediaElement, WebRTC, getUserMedia, Media Source Extensions) treat them as opaque objects and hide the underlying pixels from applications. This makes it difficult for web applications to create a media processing pipeline in practice.
Fortunately, the VideoFrame interface in WebCodecs may help, especially if you couple it with the MediaStreamTrackProcessor object defined in MediaStreamTrack Insertable Media Processing using Streams, which creates a bridge between WebRTC and WebCodecs. WebCodecs lets you access and process the raw pixels of media frames. The actual processing can use one of many technologies, starting with good ol' JavaScript and including WebAssembly, WebGPU, or the Web Neural Network API (WebNN).
After processing, you can get back to WebRTC land through the same bridge. That said, WebCodecs can also put you in charge of the encode/decode steps in the pipeline through its VideoEncoder and VideoDecoder interfaces. These can give you full control over all individual steps in the pipeline:
- For transporting the processed image somewhere while keeping latency low, you could consider WebTransport or WebRTC's RTCDataChannel.
- For rendering, you could render directly to a canvas through drawImage, use WebGPU, or go through a <video> element via VideoTrackGenerator (also defined in MediaStreamTrack Insertable Media Processing using Streams).
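To make these building blocks concrete, here is a minimal, non-authoritative sketch of how they could connect end to end in a browser that exposes MediaStreamTrackProcessor and VideoTrackGenerator (the identity transform and variable names are ours, not the demo's):

// Capture a camera track and bridge it into WebCodecs land
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = stream.getVideoTracks();

// MediaStreamTrack -> ReadableStream of VideoFrame objects
const processor = new MediaStreamTrackProcessor({ track });

// Placeholder processing step: pass frames through untouched
const transformer = new TransformStream({
  transform(frame, controller) {
    controller.enqueue(frame);
  }
});

// ReadableStream of VideoFrame objects -> MediaStreamTrack
// (Chrome still calls this MediaStreamTrackGenerator, see later in the article)
const generator = new VideoTrackGenerator();
processor.readable.pipeThrough(transformer).pipeTo(generator.writable);

// Render the resulting track in a <video> element
document.querySelector('video').srcObject = new MediaStream([generator.track]);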
Inspired by sample code created by Bernard Aboba – co-editor of the WebCodecs and WebTransport specs and co-chair of the WebRTC Working Group in W3C – Dominique and I decided to spend a bit of time exploring the creation of media processing pipelines. First, we wanted to better grasp media concepts such as video pixel formats and color spaces – we probably qualify as web experts, but we're not media experts and we tend to view media streams as opaque beasts as well. Second, we wanted to assess whether technical gaps remain. Finally, we wanted to understand where and when copies get made and gather some performance metrics along the way.
This article describes our approach, provides highlights of our resulting demo code, and shares our learnings. The code should not be seen as authoritative or even correct (though we hope it is); it is just the result of a short journey into the world of media processing. Also, note the technologies under discussion are still nascent and don't yet support interoperability across browsers. Hopefully, this will change soon!
Note: We didn't touch on audio for lack of time. Audio frames take less memory, but there are many more of them per second and they are more sensitive to timing hiccups. Audio frames are processed with the Web Audio API. It would be very interesting to add audio to the mix, if only to explore audio/video synchronization needs.
Our demo explores the creation of video processing pipelines, captures performance metrics, evaluates the impacts of choosing a particular technology to process frames, and provides insights about where operations get done and when copies are made. The processing operations loop through all pixels in the frame and "do something with them" (what they actually do is of little interest here). Different processing technologies are used for testing purposes, not because they would necessarily be a good choice for the problem at hand.
The demo lets the user:
- Choose a source of input to create an initial stream of VideoFrame: either a Nyan-cat-like animation created from scratch using OffscreenCanvas, or a live stream generated from a camera. The user may also choose the resolution and framerate of the video stream.
- Process video frames to replace green with blue using WebAssembly.
- Process video frames to turn them into black and white using pure JavaScript.
- Add an H.264 encoding/decoding transformation stage using WebCodecs.
- Introduce slight delays in the stream using regular JavaScript.
- Add an overlay to the bottom right part of the video that encodes the frame's timestamp. The overlay is added using WebGPU and WGSL.
- Add intermediary steps to force copies of the frame to CPU memory or GPU memory, to evaluate the impact of the frame's location in memory on transformations.
Once you hit the “Start” button, the pipeline runs and the resulting stream is displayed on the screen in a <video> element. And… that's it, really! What mattered to us was the code needed to achieve that and the insights we gained from gathering performance metrics and playing with parameters. Let's dive into that!
Note: these APIs are new and may not work in your browser
Technologies discussed in this article and used in the demo are still "emerging" (at least as of March 2023). The demo currently only runs in Google Chrome Canary with WebGPU enabled ("Unsafe WebGPU" flag set in chrome://flags/). Hopefully, the demo can soon run in other browsers too. Video processing with WebCodecs is available in the technical preview of Safari (16.4) and is under development in Firefox. WebGPU is also under development in Safari and Firefox. A larger unknown is support for MediaStreamTrack Insertable Media Processing using Streams in other browsers. For example, see this tracking bug in Firefox.
Timing Stats
Timing statistics are reported on the page at the end of the run, and as objects to the console (this requires opening the dev tools panel). Provided the overlay was present, display times for each frame are reported too.
We'll discuss this in more detail in the Measuring Performance section.
WebCodecs is the core of the demo and the key technology we're using to build a media pipeline. Before we dive deeper into this, it may be helpful to reflect on the value of using WebCodecs in this context. Other approaches could work just as well.
What about the Canvas? Do we need WebCodecs?
In fact, client-side processing of raw video frames has been possible on the web ever since the <video> and <canvas> elements were added to HTML, with the following recipe:
- Render the video onto a <video> element.
- Draw the contents of the <video> element onto a <canvas> with drawImage on a recurring basis, e.g. using requestAnimationFrame or the newer requestVideoFrameCallback that notifies applications when a video frame has been presented for composition and provides them with metadata about the frame.
- Process the contents of the <canvas> each time it gets updated.
We didn't integrate this approach in our demo. Among other things, the performance here would depend on having the processing happen off the main thread. We would need to use an OffscreenCanvas to process contents in a worker, possibly coupled with a call to grabFrame to send the video frame to the worker.
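For reference, a minimal sketch of that canvas-based recipe could look as follows (processFrame is a placeholder of ours for the actual pixel processing):

const video = document.querySelector('video');
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d', { willReadFrequently: true });

function onFrame(now, metadata) {
  canvas.width = metadata.width;
  canvas.height = metadata.height;
  // Copy the current video frame onto the canvas...
  ctx.drawImage(video, 0, 0);
  // ...then read the pixels back and "do something with them"
  const pixels = ctx.getImageData(0, 0, canvas.width, canvas.height);
  processFrame(pixels); // placeholder
  video.requestVideoFrameCallback(onFrame);
}
video.requestVideoFrameCallback(onFrame);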
WebCodecs advantages
One drawback of the Canvas approach is that there is no guarantee that all video frames get processed. Applications can tell how many frames they missed if they hook onto requestVideoFrameCallback by looking at the presentedFrames counter, but missed frames are, by definition, missed. Another drawback is that some of the code (drawImage or grabFrame) needs to run on the main thread to access the <video> element.
WebGL and WebGPU also provide mechanisms to import video frames as textures directly from a <video> element, e.g. through the importExternalTexture method in WebGPU. This approach works well if the processing logic can fully run on the GPU.
WebCodecs gives applications a direct handle to a video frame and mechanisms to encode/decode it. This allows applications to create frames from scratch, or from an incoming stream, provided that the stream is in non-containerized form.
Note on containerized media
One important note – media streams are usually encapsulated in a media container. The container may include other streams along with timing and other metadata. While media streams in WebRTC scenarios don't use containers, most stored media files and media streamed on the web use adaptive streaming technologies (e.g. DASH, HLS) that are in a containerized form (e.g. MP4, ISOBMFF). WebCodecs can only be used on non-containerized streams. Applications that want to use WebCodecs with containerized media need to ship additional logic on their own to extract the media streams from their container (and/or to add streams to a container). For more details about media container formats, we recommend The Definitive Guide to Container File Formats by Armin Trattnig.
So, having a direct handle on a video frame seems useful to create a media processing pipeline. It gives a handle on the atomic chunk of data that will be processed at each step.
Pipe chains
WHATWG Streams are specifically designed to create pipe chains that process such atomic chunks. This is illustrated on the Streams API concepts MDN page:
WHATWG Streams are also used as the underlying structure by some of the technologies under consideration, such as WebTransport, VideoTrackGenerator, and MediaStreamTrackProcessor.
Backpressure
Finally, Streams provide backpressure and queuing mechanisms out of the box. As defined in the WHATWG Streams standard, backpressure is the process of
normalizing flow from the original source according to how fast the chain can process chunks.
When a step in a chain is unable to accept more chunks in its queue, it sends a signal that propagates backward through the pipe chain and up to the source to tell it to adjust its rate of production of new chunks. With backpressure, there is no need to worry about overflowing queues; the flow naturally adapts to the maximum speed at which processing can run.
Creating a pipeline
Broadly speaking, creating a media processing pipeline using streams translates to:
- Create a stream of VideoFrame objects – somehow.
- Use TransformStream to create processing steps – compose them as needed.
- Send/Render the resulting stream of VideoFrame objects – somehow.
The devil is of course in the somehow. Some technologies can ingest or digest a stream of VideoFrame objects directly – not all of them can. Connectors are needed.
Pipelining is like a game of dominoes
We found it useful to visualize the possibilities as a game of dominoes:
The left side of each domino is a type of input. The right side is the type of output. There are three main types of dominoes:
- generators,
- transformers, and
- consumers.
As long as you match the input of a domino with the output of the preceding one, you may assemble them any way you want to create pipelines. Let's look at them in more detail:
Generating a stream
From scratch
You can create a VideoFrame from the contents of a canvas (or a buffer of bytes for that matter). Then, to generate a stream, just write the frame to a WritableStream at a given rate. In our code, this is implemented in the worker-getinputstream.js file. The logic creates a Nyan-cat-like animation with the W3C logo. As we'll describe later, we make use of the WHATWG Streams backpressure mechanism by waiting for the writer to be ready:
await writer.ready;
const frame = new VideoFrame(canvas, ...);
writer.write(frame);
From a camera or a WebRTC track
In WebRTC contexts, the source of a video stream is usually a MediaStreamTrack obtained from the camera through a call to getUserMedia, or received from a peer. The MediaStreamTrackProcessor object (MSTP) can be used to convert the MediaStreamTrack to a stream of VideoFrame objects.
Note: MediaStreamTrackProcessor is only supposed to be exposed in worker contexts… in theory. In practice, Chrome currently exposes it on the main thread, and only there.
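With those caveats in mind, converting a camera track into a stream of VideoFrame objects could look like the following sketch (error handling omitted, names are ours):

const stream = await navigator.mediaDevices.getUserMedia({
  video: { width: 1280, height: 720, frameRate: 25 }
});
const [track] = stream.getVideoTracks();

// Bridge the MediaStreamTrack to a ReadableStream of VideoFrame objects
const processor = new MediaStreamTrackProcessor({ track });
const reader = processor.readable.getReader();

while (true) {
  const { value: frame, done } = await reader.read();
  if (done) break;
  // ... process the frame ...
  frame.close(); // release the underlying media resource
}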
From a WebTransport stream
WebTransport creates WHATWG streams, so there is no need to run any stream conversion. That said, it is fairly inefficient to transport raw decoded frames given their size. That is why all media streams travel encoded through the cloud! As such, a WebTransportReceiveStream will typically contain encoded chunks, to be interpreted as EncodedVideoChunk. To get back to a stream of VideoFrame objects, each chunk needs to go through a VideoDecoder. Simple chunk encoding/decoding logic (without WebTransport) can be found in the worker-transform.js file.
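A decoding step along those lines can be expressed as a TransformStream wrapping a VideoDecoder, roughly as in the sketch below (the H.264 codec string is illustrative; the actual code in worker-transform.js may differ):

function createDecodeTransform() {
  let decoder;
  return new TransformStream({
    start(controller) {
      decoder = new VideoDecoder({
        // Decoded VideoFrame objects flow down the pipe
        output: frame => controller.enqueue(frame),
        error: err => console.error('decode error', err)
      });
      decoder.configure({ codec: 'avc1.42E01F' }); // illustrative
    },
    transform(encodedChunk) {
      // Each incoming EncodedVideoChunk is fed to the decoder
      decoder.decode(encodedChunk);
    },
    async flush() {
      await decoder.flush();
    }
  });
}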
What about WebTransport?
The demo doesn't integrate WebTransport yet. We encourage you to check Bernard Aboba's WebCodecs/WebTransport sample. Both that sample and the approach presented here are limited in that only one stream is used to send/receive encoded frames. Real-life applications would likely be more complex to avoid head-of-line blocking issues. They would likely use multiple transport streams in parallel, up to one per frame. On the receiving end, frames received on individual streams would then need to be reordered and merged to re-create a single stream of encoded frames. The IETF Media over QUIC (moq) Working Group is developing such a low-latency media delivery solution (over raw QUIC or WebTransport).
What about Data Channels?
RTCDataChannel may also be used to transport encoded frames, with the caveat that some adaptation logic would be needed to connect RTCDataChannel with Streams.
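As an illustration of what that adaptation logic could look like, here is a rough, non-authoritative sketch that wraps an RTCDataChannel into WHATWG streams (framing and error handling are glossed over):

// Expose incoming data channel messages as a ReadableStream
function readableFromDataChannel(dc) {
  return new ReadableStream({
    start(controller) {
      dc.onmessage = event => controller.enqueue(event.data);
      dc.onclose = () => controller.close();
    }
  });
}

// Expose the data channel as a WritableStream for encoded data
function writableFromDataChannel(dc, maxBuffered = 1 << 20) {
  return new WritableStream({
    async write(chunk) {
      // Naive backpressure: wait while the channel's buffer is full
      while (dc.bufferedAmount > maxBuffered) {
        await new Promise(resolve => setTimeout(resolve, 10));
      }
      dc.send(chunk);
    },
    close() {
      dc.close();
    }
  });
}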
Transforming a stream
Once you have a stream of VideoFrame objects, video processing can be structured as a TransformStream that takes a VideoFrame as input and produces an updated VideoFrame as output. Transform streams can be chained as needed, although it is always a good idea to keep the number of steps that need to access pixels to a minimum, since accessing pixels in a video frame usually means looping through millions of them (i.e. 1920 * 1080 = 2 073 600 pixels for a video frame in full HD).
Note: Part 2 explores technologies that can be used under the hood to process the pixels. We also review performance considerations there.
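As a concrete illustration, a pixel-touching step could be sketched as below: the incoming frame's pixels are copied to CPU memory, processed, wrapped into a new VideoFrame, and the original frame is closed (doSomethingWithPixels is a placeholder of ours):

const processingStep = new TransformStream({
  async transform(frame, controller) {
    // Copy the frame's pixels into an ArrayBuffer (CPU memory)
    const buffer = new ArrayBuffer(frame.allocationSize());
    await frame.copyTo(buffer);

    doSomethingWithPixels(buffer); // placeholder pixel-level processing

    // Wrap the processed pixels into a new VideoFrame...
    const processedFrame = new VideoFrame(buffer, {
      format: frame.format,
      codedWidth: frame.codedWidth,
      codedHeight: frame.codedHeight,
      timestamp: frame.timestamp
    });
    // ...and release the original frame
    frame.close();
    controller.enqueue(processedFrame);
  }
});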
Sending/Rendering a stream
Some apps only need to extract information from the stream – like in the case of gesture detection. Usually, however, the final stream needs to be rendered or sent somewhere.
To a <canvas> element
A VideoFrame can be drawn directly onto a canvas. Easy!
canvasContext.drawImage(frame, 0, 0);
Rendering frames to a canvas gives the application full control over when to display those frames. This seems particularly useful when a video stream needs to be synchronized with something else, e.g. overlays and/or audio. One drawback is that, if the goal is to end up with a media player, you'll have to re-implement that media player from scratch. That means adding controls, support for tracks, accessibility, etc. This is no easy task…
To a <video> element
A stream of VideoFrame objects cannot be injected directly into a <video> element. Fortunately, a VideoTrackGenerator (VTG) can be used to convert the stream into a MediaStreamTrack that can then be attached to a <video> element.
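In code, and keeping in mind the naming caveat in the notes below, that final step could look roughly like this sketch:

// Spec name; Chrome still exposes this as
// new MediaStreamTrackGenerator({ kind: 'video' }) for now
const generator = new VideoTrackGenerator();

// Pipe the processed stream of VideoFrame objects into the generator
processedStream.pipeTo(generator.writable);

// The generator exposes a regular MediaStreamTrack that a <video>
// element can render through a MediaStream
const video = document.querySelector('video');
video.srcObject = new MediaStream([generator.track]);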
Notes and Caveats
Only for workers
Note that VideoTrackGenerator is only supposed to be exposed in worker contexts… in theory. As with MediaStreamTrackProcessor, Chrome currently exposes it on the main thread, and only there.
VideoTrackGenerator is the new MediaStreamTrackGenerator
Also note: VideoTrackGenerator used to be called MediaStreamTrackGenerator. The implementation in Chrome has not yet caught up with the new name, so our code still uses the old name!
To the cloud with WebTransport
WebTransport can be used to send the resulting stream to the cloud. As noted before, it would require too much bandwidth to send unencoded video frames in a WebTransportSendStream. They need to be encoded first, using the VideoEncoder interface defined in WebCodecs. Simple frame encoding/decoding logic (without WebTransport) can be found in the worker-transform.js file.
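An encoding stage along those lines could be sketched as follows (the codec string, bitrate, and keyframe cadence are arbitrary choices of ours, not the demo's exact settings):

function createEncodeTransform(width, height) {
  let encoder;
  let frameCount = 0;
  return new TransformStream({
    start(controller) {
      encoder = new VideoEncoder({
        // EncodedVideoChunk objects flow down the pipe
        output: chunk => controller.enqueue(chunk),
        error: err => console.error('encode error', err)
      });
      encoder.configure({
        codec: 'avc1.42E01F',   // H.264 baseline, illustrative
        width, height,
        bitrate: 2_000_000,     // 2 Mbps, arbitrary
        latencyMode: 'realtime'
      });
    },
    transform(frame) {
      // Request a key frame periodically so a receiver can join mid-stream
      encoder.encode(frame, { keyFrame: frameCount++ % 150 === 0 });
      frame.close();
    },
    async flush() {
      await encoder.flush();
    }
  });
}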
Handling backpressure
Streams come equipped with a backpressure mechanism. Signals propagate through the pipe chain and up to the source when a queue is building up, to indicate that it might be time to slow down or drop a few frames. This mechanism is very convenient to avoid accumulating large decoded video frames in the pipeline that could exhaust memory. One second of full HD video at 25 frames per second easily takes 200MB of memory once decoded.
The API also makes it possible for web applications to implement their own buffering strategy. If you need to process a live feed in real time, you may want to drop frames that cannot be processed in time. Alternatively, if you need to transform recorded media, you can slow down and process all frames, no matter how long it takes.
One structural limitation is that backpressure signals only propagate through the parts of the pipeline where WHATWG streams are used. They stop whenever they bump into something else. For instance, MediaStreamTrack does not expose a WHATWG streams interface. Consequently, if a MediaStreamTrackProcessor is used in a pipeline, it receives backpressure signals but the signals don't propagate beyond it. The buffering strategy is imposed: the oldest frame will be removed from the queue when room is needed for a new frame.
In other words, if you ever end up with a VideoTrackGenerator followed by a MediaStreamTrackProcessor in a pipeline, backpressure signals will be handled by the MediaStreamTrackProcessor and will not propagate to the source before the VideoTrackGenerator. You shouldn't need to create such a pipeline, but we accidentally ended up with that configuration while writing the demo. Keep in mind that this is not equivalent to an identity transform.
Workers, TransformStream and VideoFrame
So far, we have assembled dominoes without being explicit about where the underlying code is going to run. With the notable exception of getUserMedia, all the components that we have discussed can run in workers. Running them outside of the main thread is either good practice or mandated, as in the case of VideoTrackGenerator and MediaStreamTrackProcessor – though note these interfaces are actually only available on the main thread in Chrome's current implementation.
Multiple workers?
Now if we're going to have threads, why restrict yourself to one worker when you can create more? Although a media pipeline describes a sequence of steps, it seems useful at first sight to try to run different steps in parallel.
To run a processing step in a worker, the worker needs to gain access to the initial stream of VideoFrame objects, which may have been created in another worker. Workers usually don't share memory, but the postMessage API may be used for cross-worker communication. A VideoFrame isn't a simple object, but it is defined as a transferable object, which essentially means that it can be sent from one worker to another efficiently, without requiring a copy of the underlying frame data.
Note: Transfer detaches the object being transferred, which means that the transferred object can no longer be used by the worker that issued the call to postMessage.
One approach to running processing steps in separate workers would be to issue a call to postMessage for every VideoFrame at the end of a processing step to pass it over to the next step. From a performance perspective, while postMessage is not necessarily slow, the API is event-based and events still introduce delays. A better approach would be to pass the stream of VideoFrame objects once and for all when the pipeline is created. This is possible because ReadableStream, WritableStream and TransformStream are all transferable objects as well. Code to connect an input and output stream to another worker could then become:
worker.postMessage({
  type: 'start',
  inputStream: readableStream,
  outputStream: writableStream
}, [readableStream, writableStream]);
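On the receiving side, the worker retrieves the transferred streams from the message and connects them to its own processing step, for instance along these lines (a sketch, not the demo's exact code):

// worker.js
self.onmessage = event => {
  if (event.data.type !== 'start') return;
  const { inputStream, outputStream } = event.data;

  inputStream
    .pipeThrough(new TransformStream({
      transform(frame, controller) {
        // ... process the frame ...
        controller.enqueue(frame);
      }
    }))
    .pipeTo(outputStream);
};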
Now, the fact that streams get transferred doesn't mean that the chunks read from or written to those streams are themselves transferred. Chunks are rather serialized. The nuance is thin (and should have a very minimal impact on performance) but particularly important for VideoFrame objects. Why? Because a VideoFrame needs to be explicitly closed through a call to its close method to free the underlying media resource that the VideoFrame points to.
When a VideoFrame is transferred, its close method is automatically called on the sender side. When a VideoFrame is serialized, even though the underlying media resource isn't cloned, the VideoFrame object itself is cloned, and the close method now needs to be called twice: once on the sender side and once on the receiver side. The receiver side isn't a problem: calling close() there is to be expected. However, there is a problem on the sender's side: a call like controller.enqueue(frame) in a TransformStream attached to a readable stream transferred to another worker will trigger the serialization process, but that process happens asynchronously and there is no way to tell when it is done. In other words, on the sender side, code cannot simply be:
controller.enqueue(frame);
frame.close(); // Too early!
If you do that, the browser will rightfully complain, when it effectively serializes the frame, that it cannot clone it because the frame has already been closed. And yet the sender needs to close the frame at some point. If you don't, one of two things may happen:
- the browser will either report a warning that it ran into dangling VideoFrame instances (which suggests a memory leak), or
- the pipeline simply freezes after a few frames are processed.
The pipeline freeze happens, for example, when the VideoFrame is tied to hardware-decoded data. Hardware decoders use a very limited memory buffer, so they pause until the memory of already decoded frames gets freed. This is a known issue. There are ongoing discussions to extend WHATWG streams with a new mechanism that would allow possession of the frame to be explicitly transferred, so that the sender side doesn't need to worry about the frame anymore. See for example the Transferring Ownership Streams Explained proposal.
Note: Closing the frame synchronously as in the code above sometimes works in practice in Chrome, depending on the underlying processing pipeline. We found it hard to reproduce the exact conditions that make the browser decide to clone the frame immediately or delay it. As far as we can tell, the code should not work in any case.
Learning: use a single worker for now
For now, it is probably best to stick to touching streams of VideoFrame objects from one and only one worker. The demo does use more than one worker. It keeps track of frame instances to close at the end of the processing pipeline. However, we did that simply because we didn't know initially that creating multiple workers would be problematic and require such a hack.
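For the record, one possible shape for such a hack is sketched below: the worker that owns the source frames keeps them in a Map keyed by timestamp and closes them once the end of the pipeline reports (e.g. via postMessage) that it is done with them. This is an illustration of the idea, not the demo's actual code:

// In the worker that owns the source frames
const pendingFrames = new Map();

function enqueueAndTrack(frame, controller) {
  pendingFrames.set(frame.timestamp, frame);
  controller.enqueue(frame); // the frame gets serialized, not transferred
}

// Called when the final step signals that it has closed its own
// copy of the frame with this timestamp
function releaseFrame(timestamp) {
  const frame = pendingFrames.get(timestamp);
  if (frame) {
    frame.close();
    pendingFrames.delete(timestamp);
  }
}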
The timestamp property of a VideoFrame instance provides a good identifier for individual frames, and allows applications to track them throughout the pipeline. The timestamp even survives encoding (and respectively decoding) with a VideoEncoder (and respectively a VideoDecoder).
In the suggested pipeline model, a transformation step is a TransformStream that operates on encoded or decoded frames. The time taken to run the transformation step is thus simply the time taken by the transform function, or more precisely the time taken until the function calls controller.enqueue(transformedChunk) to send the updated frame down the pipe. The demo includes a generic InstrumentedTransformStream class that extends TransformStream to record start and end times for each frame in a static cache. The class is a drop-in replacement for TransformStream:
const transformStream = new InstrumentedTransformStream({
  name: 'super-duper',
  transform(chunk, controller) {
    const transformedChunk = doSomethingWith(chunk);
    controller.enqueue(transformedChunk);
  }
});
Recorded times then get entered into an instance of a generic StepTimesDB class to compute statistics such as minimum, maximum, average, and median times taken by each step, as well as time spent waiting in queues.
This works well for the part of the pipeline that uses WHATWG Streams, but as soon as the pipeline uses opaque streams, such as when frames are fed into a VideoTrackGenerator, we lose the ability to track individual frames. In particular, there is no easy way to tell when a video frame is actually displayed by a <video> element. The requestVideoFrameCallback function reports many interesting timestamps, but not the timestamp of the frame that has been presented for composition.
The workaround implemented in the demo encodes the frame's timestamp in an overlay in the bottom-right corner of the frame, and then copies the relevant part of the frames rendered to the <video> element to a <canvas> element each time the requestVideoFrameCallback callback is called, in order to decode the timestamp. This doesn't work perfectly – frames can be missed in between calls to the callback function – but it's better than nothing.
Note: requestVideoFrameCallback is supported in Chrome and Safari, but not in Firefox for now.
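In outline, the workaround looks like the following sketch: on each requestVideoFrameCallback invocation, the bottom-right corner of the rendered video is copied to a small canvas and decoded back into a timestamp (decodeTimestampFromPixels and recordDisplayTime stand in for the demo's actual logic, and the overlay size is arbitrary):

const OVERLAY_WIDTH = 64, OVERLAY_HEIGHT = 16; // arbitrary overlay size
const video = document.querySelector('video');
const corner = new OffscreenCanvas(OVERLAY_WIDTH, OVERLAY_HEIGHT);
const ctx = corner.getContext('2d', { willReadFrequently: true });

function onFramePresented(now, metadata) {
  // Copy only the bottom-right overlay region of the displayed frame
  ctx.drawImage(video,
    metadata.width - OVERLAY_WIDTH, metadata.height - OVERLAY_HEIGHT,
    OVERLAY_WIDTH, OVERLAY_HEIGHT,
    0, 0, OVERLAY_WIDTH, OVERLAY_HEIGHT);
  const pixels = ctx.getImageData(0, 0, OVERLAY_WIDTH, OVERLAY_HEIGHT);
  const timestamp = decodeTimestampFromPixels(pixels);        // placeholder
  recordDisplayTime(timestamp, metadata.expectedDisplayTime); // placeholder
  video.requestVideoFrameCallback(onFramePresented);
}
video.requestVideoFrameCallback(onFramePresented);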
It is useful for statistical purposes to track the time when the frame is rendered. For example, one could evaluate jitter effects. Or you could use this data for synchronization purposes, e.g. if the video needs to be synchronized with an audio stream and/or other non-video overlays. Frames can of course be rendered to a canvas instead. The application can then keep control over when a frame gets displayed to the user (ignoring the challenges of reimplementing a media player).
A typical example of the statistics returned to the console at the end of a demo run is provided below:
The times are per processing step and per frame. The statistics include the average times taken by each step per frame. For this example: background removal took 22 ms, adding the overlay 1 ms, encoding 8 ms, decoding 1 ms, and frames stayed on display for 38 ms.
This article explored the creation of a real-time video processing pipeline using WebCodecs and Streams, including considerations on handling backpressure, managing the VideoFrame lifecycle, and measuring performance. The next step is to actually start processing the VideoFrame objects that such a pipeline would expose. Please stay tuned, that is the topic of part 2!
{“author”: “François Daoust“}
Attributions
– WHATWG Stream logo: https://resources.whatwg.org/l
Licensed under a Creative Commons Attribution 4.0 International License: https://streams.spec.whatwg.or
– Film strip: https://www.flaticon.com/free-
Film icons created by Freepik – Flaticon