How content moderation rescued $750k in unpaid invoices from soccer pirates

If you had asked me two years ago which sport a video startup should be most worried about, I’d have said American football or basketball. My US-centric mind would never have considered that soccer would be the darling sport of stream pirates. It wasn’t until I joined Mux that I found out how much people love soccer…and how much they love to watch soccer for free. Streaming video on Mux is easy, which is a good thing! Unfortunately, that also makes us a popular target for soccer pirates.
Enter the abuse detection system: our homegrown solution to identify and take action against soccer pirates and other delinquents who try to stream copyrighted content on Mux without permission from the rights holder.
Our journey begins at the edge. We deliver all our videos through two CDNs (Fastly and Cloudflare). For each request we serve, the CDNs give us a record of that request. Each of these records gets enriched with additional data in our CDN logs pipeline. At the end of that pipeline, the records are inserted into a ClickHouse cluster. Initially, the ClickHouse cluster was used only for debugging purposes. With minimal changes, we were able to use the same data and ClickHouse cluster for abuse detection as well.
The logs include a lot of useful information, but the abuse detection system only cares about three things: which asset the log corresponds to, when the asset was viewed, and how the system can access the asset.
Next, we have a small Go program that queries the CDN log data stored in ClickHouse every 10 minutes. These queries generate a list of assets and environments that had high viewership in the last 20 minutes. The program then runs follow-up queries to identify any custom domains associated with the environment, and also checks whether the asset is public or signed. This information determines how the video is accessed later in the system.
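To make the windowing step concrete, here is a minimal sketch in Python (not the actual Go program or its ClickHouse queries). It counts requests per asset within the last 20 minutes and keeps the assets above a cutoff. The `LogRecord` shape, the function name, the use of request count as a stand-in for viewership, and the `min_requests` value are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, NamedTuple


class LogRecord(NamedTuple):
    """One CDN log row, reduced to the fields this sketch needs (hypothetical shape)."""
    asset_id: str
    viewed_at: datetime


def high_viewership_assets(
    records: Iterable[LogRecord],
    now: datetime,
    window: timedelta = timedelta(minutes=20),
    min_requests: int = 1000,  # assumed cutoff, not a value from the article
) -> Dict[str, int]:
    """Return {asset_id: request_count} for assets with at least min_requests in the window."""
    cutoff = now - window
    # Count only the requests that fall inside [now - window, now].
    counts = Counter(r.asset_id for r in records if cutoff <= r.viewed_at <= now)
    return {asset: n for asset, n in counts.items() if n >= min_requests}
```

In a real deployment this aggregation would more likely be pushed down into the ClickHouse query itself, so that only the short list of busy assets leaves the database.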
For each video in the list, we then look up the customer. The Go program uses customer data to assign each video a risk score. Some of the obvious risk factors include:
- How old is the customer account (older is better)?
- Do they pay their bills on time?
- Is the email address a company domain or a consumer email like Gmail?
These factors, plus a few others, go into assigning a risk score before n8n takes over for filtering and notification.
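A risk score built from the three factors listed above could be sketched like this in Python. The weights, the age cutoffs, the `Customer` fields, and the list of consumer email domains are all hypothetical; the real scoring uses additional factors that are not described here.

```python
from dataclasses import dataclass

# Illustrative list only; a real system would use a much longer one.
CONSUMER_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}


@dataclass
class Customer:
    account_age_days: int
    pays_on_time: bool
    email: str


def risk_score(customer: Customer) -> int:
    """Higher means riskier. All weights and cutoffs are assumptions."""
    score = 0
    # Newer accounts are riskier (older is better).
    if customer.account_age_days < 30:
        score += 2
    elif customer.account_age_days < 180:
        score += 1
    # Customers who do not pay on time are riskier.
    if not customer.pays_on_time:
        score += 2
    # A consumer email address is a weaker signal than a company domain.
    domain = customer.email.rsplit("@", 1)[-1].lower()
    if domain in CONSUMER_EMAIL_DOMAINS:
        score += 1
    return score
```

An additive score like this is easy to explain to the person reviewing an alert, which matters when a human makes the final call.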
N8n is a node-based workflow automation tool. Workflows are made up of multiple building blocks (or nodes), each of which performs a specific function. N8n has a catalog of prebuilt nodes as well as support for creating your own custom nodes. We picked n8n because of the quick development time and its easy-to-debug nature. We could have built this part of the system in Go or a different language, but by using n8n’s premade nodes, we were able to build our workflows at an incredible speed. N8n also lets us visualize our workflows in a way a program built from scratch wouldn’t. We even have a workflow that gets triggered on errors and posts a link in Slack to the node that errored. By leveraging n8n, we’ve been able to create a set of abuse detection workflows.


We have two main workflows in n8n.
First is our soccer detection workflow. For each video sent to this workflow, we generate four thumbnails taken from somewhat random points in the video. The thumbnails are then sent to Google Vision, which produces a list of labels describing the thumbnails. These labels are then compared to the following list of words.
"sports uniform",
"football",
"soccer",
"jersey",
"player",
"ball",
"field house",
"ball game"
If a word in our list matches a word in the list Google Vision sends us, then we have a potential soccer stream that needs to be checked by a human. The word list we use is intentionally broad. This can lead to the system flagging assets that are clearly not soccer as soccer. But we’d rather generate false positives than potentially miss a real soccer pirate.
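The matching step can be sketched in a few lines of Python. This is an illustration only: the actual comparison runs inside an n8n workflow, and the case-insensitive exact match and the function names here are assumptions. The input stands in for the label descriptions returned for each thumbnail.

```python
from typing import FrozenSet, Iterable, Set

# The word list from the article, lowercased for comparison.
SOCCER_WORDS: FrozenSet[str] = frozenset({
    "sports uniform", "football", "soccer", "jersey",
    "player", "ball", "field house", "ball game",
})


def matching_labels(vision_labels: Iterable[str]) -> Set[str]:
    """Return the labels for one thumbnail that appear in the word list."""
    normalized = {label.strip().lower() for label in vision_labels}
    return normalized & SOCCER_WORDS


def needs_human_review(labels_per_thumbnail: Iterable[Iterable[str]]) -> bool:
    """Flag the video if any thumbnail has at least one matching label."""
    return any(matching_labels(labels) for labels in labels_per_thumbnail)
```

Because a single match on any thumbnail is enough, a generic label such as "ball" or "player" will flag plenty of non-soccer content, which is the deliberate trade-off described above.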
Once n8n has identified a stream that may be showing a soccer game, it creates a Slack message and an alert in Opsgenie.


Our second main n8n workflow is our high traffic workflow. This isn’t specific to soccer content. Instead, it’s designed to identify and show us videos that have a higher than average viewership count. The asset’s risk score, viewership count, and top five referrers are checked in the n8n workflow. If each meets a certain threshold, then a Slack message is created. If the asset is VoD, then the Slack message will include a storyboard link. If the asset is a live stream, the message will instead have four thumbnails.
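As a rough Python sketch of that decision (the real logic lives in n8n nodes), the function below applies thresholds to the risk score and viewership count and picks the preview type by asset kind. The threshold values, field names, and the shape of the returned alert are assumptions. How the referrers are scored is not described, so here they are only attached to the alert for the reviewer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Asset:
    asset_id: str
    is_live: bool
    risk_score: int
    viewers: int
    top_referrers: List[str]


def build_alert(asset: Asset, min_risk: int = 3, min_viewers: int = 500) -> Optional[dict]:
    """Return an alert payload, or None if the asset is below either (assumed) threshold."""
    if asset.risk_score < min_risk or asset.viewers < min_viewers:
        return None
    alert = {
        "asset_id": asset.asset_id,
        "risk_score": asset.risk_score,
        "viewers": asset.viewers,
        "top_referrers": asset.top_referrers[:5],
    }
    if asset.is_live:
        # Live streams get four thumbnails.
        alert["preview"] = "thumbnails"
        alert["thumbnail_count"] = 4
    else:
        # VoD assets get a storyboard link.
        alert["preview"] = "storyboard"
    return alert
```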

Once an alert is generated from either the high traffic or soccer detection workflow, a Slack message is created. These Slack messages are sent to a channel monitored by a team of contractors. The Slack message contains all the information we can provide for a person to determine if the video is legit or violates our terms of service. The most useful data is the storyboard and the top five referrers. The storyboard lets us see the content of the video without watching it. And the referrers can be powerful clues to help us determine whether a stream is legit. If the top five referrers look like this

then there’s a good chance this is a soccer pirate.
The contractor can escalate or silence the alert using the buttons on the Slack message. If it’s a false positive, they will press “Silence,” which triggers another n8n workflow that adds the asset to an allowlist, so it won’t alert again. The workflow also closes the Opsgenie alert. If the contractor believes the video is in violation of our TOS or needs help making a determination, they will press “Escalate.” After pressing “Escalate” or taking no action for five minutes, an alert is sent to a full-time Mux employee.
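The triage rules in that paragraph reduce to a small decision function. The Python sketch below is illustrative only: the `Action` enum, the step names, and the function signature are hypothetical, and the real behavior is implemented with Slack buttons, n8n workflows, and Opsgenie.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Action(Enum):
    SILENCE = "silence"
    ESCALATE = "escalate"


ESCALATION_TIMEOUT = timedelta(minutes=5)


def next_step(action: Optional[Action], alerted_at: datetime, now: datetime) -> str:
    """Decide what happens to an alert given the contractor's action (or lack of one)."""
    if action is Action.SILENCE:
        # False positive: allowlist the asset and close the Opsgenie alert.
        return "allowlist_and_close"
    if action is Action.ESCALATE or now - alerted_at >= ESCALATION_TIMEOUT:
        # Explicit escalation, or five minutes with no response.
        return "notify_employee"
    return "wait"
```

Treating silence from the contractor as an escalation means an unattended alert always reaches an employee instead of being dropped.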
From there, the Mux employee has a couple of options open to them. First, the employee will look at the Slack alert and evaluate the video for themselves. Most of the time, the Mux employee will have enough context about the customer to be able to tell if the stream is piracy just by looking at it. The employee can also reach out to the customer to see if they have the rights to show the video. If the customer does have the rights to stream the video, then we can add the video to an allowlist so it doesn’t alert again. If the customer doesn’t own the rights to the content, we can work with them to stop the stream. Finally, if the customer is a repeat offender that’s uncooperative, we also have the option of disabling their account.
By leveraging our abuse detection system, we’ve been able to cut down on the number of DMCA requests we receive. Before the system was created, it was not uncommon for us to receive dozens of DMCA requests in a month. Now we’re surprised when we receive a single DMCA request. On top of that, the system has saved us quite a bit of money. This may come as a surprise, but soccer pirates tend to not pay their bills. That, combined with the fact that these streams usually have large viewership, means we incur a not insignificant cost and have no one to bill. In 2021 alone, Mux had over $750,000 in unpaid invoices due to pirated streams. For an infrastructure company like Mux, this pirating comes with hard costs. Transcoding, storing, and delivering video is not cheap. If these pirated streams weren’t held in check, they could quickly spiral out of control and have a significant negative impact on our business. By identifying and shutting down these streams, we’re able to reduce our costs. The abuse detection system also provides other less tangible benefits, such as preserving our reputation. Mux doesn’t want to be known as a haven for soccer pirates, and our customers don’t want to be associated with soccer piracy either. By investing in this system, we show customers that we take content moderation seriously.