I'm surprised Facebook still uses software to do video encoding.
Most big companies with millions of hours of video uploaded each day have realised it's cheaper to stick a bunch of hardware video encoding chips onto an accelerator board and transcode 100 HD streams simultaneously into all the formats and resolutions you need to host.
The power savings over CPUs pay for the custom hardware in a matter of months.
It does reduce flexibility when new video formats get released though.
>It does reduce flexibility when new video formats get released though.
A new video codec is released and adopted roughly once every ten years, so I don't think that would be a problem. Hardware encoding also trades compression quality for speed, and that gets compensated for with a slightly higher bitrate. It would also reduce their incentive to improve open-source encoders.
Although I think even Netflix switched to BEAMR (can't really blame them, though).
I think the next frontier, for both Audio and Video will be Codec designed with LiveStreaming / Low Latency in mind.
( Or pretty much everything in computing. I wish we could focus on latency everywhere: hardware input, display, network, disk, etc. Apple is certainly moving in that direction without talking about it. )
I was talking with a Netflix product manager a few years ago and, IIUC, he said Netflix re-encodes their entire catalog monthly to take advantage of new encoder optimizations. Disk is cheap. They want to optimize network transfers to both save money and improve user experience.
Netflix has about 36,000 hours of content. On Youtube that much content is uploaded every 1 hour and 20 minutes! Or to put it another way, Netflix re-encodes every month what YouTube encodes in just over an hour.
I don't know the reliability of these sources, but according to one article[1], as of 1 year ago, US Netflix had ~5.8K titles, totaling 36K hours of content. Another more recent source[2] claims that worldwide, Netflix has ~15K titles, so assuming the same distribution of runtimes, it would be closer to 93K hours of content. So, while you're correct, it's still the same order of magnitude.
Back-of-the-envelope calculation: a slacker suffering from insomnia watching 5 streams of Netflix at once × 20 hours/day × 365 days non-stop would otherwise watch the whole catalog in less than a year.
If we unpack your calculation to slightly more typical human behaviour: Watching 1 stream × 10 hours/day × 365 days non-stop, one person would need 10 years to watch the whole catalog.
FWIW, that conversation about Netflix re-encoding their catalog monthly was from about seven years ago! Their process has surely changed since then. :)
Also, IIUC, they re-encoded about once a month, but the re-encoding didn't necessarily take one month of compute time.
Any links for this? My experience shows that Ampere encoding blocks still aren't close to x264 / slow preset when it comes to saving bandwidth and delivering quality.
Even with power savings it was usually more economically efficient to run encodes on a large (12+ core) machine than to deal with the limited number of NVENC slots on GPUs.
NVENC beats x264 medium. Not quite up to the level of the "slow" presets yet but you have to throw a huge amount of hardware at it to match them let alone beat them. Basically the number I came up with a few weeks ago from playing around with x264 settings on ffmpeg was between 6 and 12 cores to keep up with NVENC at 720p and 1080p, depending on framerate and the quality preset.
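If anyone wants to reproduce that kind of comparison, something along these lines works with ffmpeg (filenames and quality values are placeholders, preset names vary a bit between ffmpeg versions, and NVENC's -cq scale isn't directly comparable to x264's CRF, so judge with a metric or your eyes):

```
# Software x264 at the medium preset (CPU-bound)
ffmpeg -i input.mp4 -c:v libx264 -preset medium -crf 23 -c:a copy x264_medium.mp4

# NVENC (needs an NVIDIA GPU and an ffmpeg build with nvenc support)
ffmpeg -i input.mp4 -c:v h264_nvenc -preset slow -rc vbr -cq 23 -c:a copy nvenc.mp4
```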
What sort of bitrate are we talking about? I haven't tried using NVENC for years, and the last time I checked it was clearly missing many details that x264 tries to preserve.
NVENC is good at cleaning up noise and encoding fast (basically, game streaming), which isn't what you want for movie encoding.
> but you have to throw a huge amount of hardware at it to match them let alone beat them.
Sure, and I think hardware encoders are great when you need speed, certainly for real-time video. But in other cases, well, the medium preset sucks. I always encode videos at `veryslow`, and there's just no way to get close to that with a gpu.
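For concreteness, that kind of encode is just the following (input name and CRF value are hypothetical; lower CRF means higher quality and a larger file):

```
ffmpeg -i input.mp4 -c:v libx264 -preset veryslow -crf 18 -c:a copy out.mp4
```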
Video encoding isn't GPU-parallelizable. It's a good fit for either CPU+SIMD or custom ASICs. It's just a kind of compression, which means it's based on unpredictable if-statements, which is just what GPUs don't do.
You can massively parallelize it by encoding different clips on different CPUs, which is more efficient because it has less communication overhead.
No idea, maybe it will change some day. But right now, if you want the best quality at the smallest filesize, nothing seems to come close to the best software encoders. Maybe x264 and x265 are just really good.
They're unusually good because (1) the encoding research wasn't done on a business schedule, (2) it wasn't done purely against objective metrics (the main ones used, SSIM/PSNR, suck, so you have to use human raters), and (3) the test cases were more varied and weirder (some pirated movies, some video game screen recordings, more anime).
There are a lot of other free video encoding tools, like AviSynth plugins, that are just better than all the professional tools. I'm not sure why this is; maybe customers aren't sophisticated enough.
Hardware encoders for newer codecs tend to be simply less bit-efficient. They can be more limited in features. My desktop's GPU can handle quite a few 1080p H.264 streams in parallel. But it can't do average-bitrate-capped CRF, which I think is the preferred type of encoding for streaming services.
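For reference, the closest thing in software is CRF with a maxrate/bufsize cap, which x264 handles fine via ffmpeg; a rough sketch with made-up numbers:

```
# Quality-targeted CRF, but never exceeding ~4 Mbit/s averaged over the 8 Mbit buffer window
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 21 \
  -maxrate 4M -bufsize 8M -c:a copy capped_crf.mp4
```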
Like someone already pointed out, this whole announcement could be in response to Google announcing its new Argos chipset approach to transcoding this week.
The problem is that most big tech companies think they're smarter than the thousands of codec engineers who worked on libx264, so they try to reinvent the wheel. They realize too late that all the money they threw into building their own stuff (or variants) does far worse in the general case than baseline libx264. They just don't want to admit it.
I don’t think it’s that they don’t want to admit it, it’s that the cost savings from something disruptive may make it worth it at scale. Or that somebody wants to risk their career on it haha and big tech has tons of cash to throw around.
Absolutely. In fact, I interviewed a guy last week who comes from very big tech, and he admitted that they're basically wasting time and money chasing an unknown "ideal" codec while those projects almost always hit a dead end. He loves the money, I'm sure, but he wants out anyway. :)
I've been to a project where, because the development and "hardening" budgets were separate, we knowingly pushed out buggy code so that we could take advantage of the latter.
Had I known that half of the job would be to game the system, I wouldn't have joined.
I would be surprised if FB doesn't use hardware encoders. That said, hardware encoders are fast, but they don't match the quality of software (CPU) encoding. Slower CPU encoding still provides the best quality.
I can't speak for all of the encoding options, but I can tell you that Dolby Hybrik, one of the main competitors in the encoding space, uses relatively cheap EC2 instances.
Based on the speed I see from Telestream and Elemental, I think they're software-based as well.
YouTube and Facebook have a much different encoding problem than, say, HBO/Netflix/Hulu, etc.
YT & FB have a few popular videos and a ginormous long tail. That long tail needs to be encoded cheaply, as it might not get many views.
The TV/movie VOD-over-IP industry stands to benefit from optimizing the encoding for the smallest filesize at the highest quality, spending considerable CPU cycles to find that best-quality path.
For those like Netflix, why would you invest in ASICs? You likely don't have a huge infra bill (relatively) when it comes to encoding, and compared with CPUs, the ASICs mean you lose the flexibility to really fine-tune the quality and use the latest codecs.
For those like YT/FB, you just gotta encode that long tail. Getting something cheap that's 80% of the quality of CPU-based encoding is good enough for 99% of the content.
I could see lots of reasons to do software encoding at a company like Facebook which are probably not mentioned in the article, including e.g. extracting the optical flow which can be reused as part of various machine learning workflows.
For example, if you want to only annotate 1 in 100 frames and interpolate with optical flow, you can get the flow in the process of doing video compression.
I don't have inside info but there are various other things like this that I can think of for wanting to do it in software.
Also, hardware encoders suck if the company making the hardware decides to suddenly EOL the product. With software encoders you can easily scale the backend whenever you want, at any time in the future, and without succumbing to supply chain issues.
Counterpoint: if you're big enough to have your own bona fide cloud, then you might have more spare CPU capacity that can be flexibly allocated than available machines with dedicated GPUs.
What was mentioned in the article seem like very basic, obvious considerations. Meanwhile, Youtube has custom ASICs which efficiently produce multiple formats simultaneously.
And yet, despite YouTube's abundantly clear motivation for doing so it took them years and years to develop the hardware and it only recently hit production. Could it be that acceleration for video encoding isn't as easy as people are making it sound?
Of course I could;) It would just be hacky, ugly, unreliable, scale badly, use either GPUs or inadequate FPGAs, and fall over once it exceeded 10 users. My resume doesn't need to know that, though:)
If you break videos up into short chunks, then you could simply encode those chunks on demand into the perfect encoding for the requester (rough ffmpeg sketch after the lists below).
The advantages are:
- You don't waste CPU encoding video into formats that won't be used.
- You can use a standard caching solution to reuse those chunks.
- Everyone gets the perfect encoding, always.
- If most people watch the first 2 minutes and then give up on a 20 minute video, you don't waste CPU encoding the other 18 minutes.
- You can introduce new encodings instantly for all videos, without going back and re-encoding historical videos.
- You don't waste storage on video chunks in formats that will never be used.
- It's really simple.
The disadvantages would be:
- You have more unpredictable load if a lot of people start watching (different) videos at once (although the common case of a lot of people watching the same video is still fine) and you could "cap" the load by switching to a fallback format to avoid becoming overloaded.
- There might be an initial delay when playing or sweeping a video whilst the first chunk is encoded. On the other hand, it can't get much worse than it is already, and you could make sure these initial chunks are prioritised, or else serve the initial chunks via a fallback format.
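The per-chunk encode itself could be as dumb as a single ffmpeg invocation per request; a very rough sketch (timestamps, ladder rung and filenames are all hypothetical, and all the hard coordination/caching problems live outside this command):

```
# Encode only the requested ~4 s chunk at the requested rung, on demand
ffmpeg -ss 120 -t 4 -i source.mp4 \
  -c:v libx264 -preset veryfast -crf 23 -vf scale=-2:720 \
  -c:a aac chunk_0030.mp4
```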
Have you ever written a stateless transcoder like this? Of course it can be done, but saying “you could simply encode those chunks” and “It’s really simple” is pretty misleading especially if you are changing frame rates or sample rates or audio codecs during the encoding process.
That said, if there is someone that could do this at scale it would be Facebook.
Also this would mess up ABR streaming at least for the first people to watch the video which would not really guarantee “the perfect encoding always”.
I have written such a transcoder [0] and while it is definitely not "simple," it has definitely never been easier to achieve than today.
If the input video source has been prepared properly (i.e., constant framerate, truly compliant VBR/ABR, fixed-GOP), or if your input is a raw/Y4M, then segmenting each GOP into its own x264 bytestream is rather trivial.
If the input is not prepared for immediate segmentation, it is also somewhat easy now to fix this before segmenting for processing. Using hardware acceleration a transcoder could decode the non-conforming input to Y4M (yuv4mpegpipe) or FFV1, which can then have a proper GOP set.
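As a plain-ffmpeg illustration of that "fix the GOP, then split" flow (segment length, CRF and names are placeholders):

```
# Force a keyframe every 4 s, then cut segments on those boundaries
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 20 \
  -force_key_frames "expr:gte(t,n_forced*4)" \
  -f segment -segment_time 4 -reset_timestamps 1 -c:a aac chunk_%04d.mp4
```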
It's not that simple, especially if you deal with videos that have open GOPs and multiple B-frames per GOP; encoders don't do such a great job in those cases. Breaking videos up into short chunks is also easier said than done: you need to understand where it makes sense to split the video, make sure the I-frames are aligned, and generally try to keep a consistent segment size. With very dynamic videos encoded by different types of source encoders, that can result in very inconsistent encoder performance across the chunks. For that reason, it's always best to have some sort of two/three-pass encoding where the analysis step is integral, and the actual split & encode is performed based on it. Which, of course, does not work for low-latency live-streaming scenarios.
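For the I-frame alignment part specifically, when you control all the renditions the usual trick is a fixed GOP with scene-cut keyframes disabled, so every rung cuts in the same places; e.g. a 2 s GOP at 24 fps (values hypothetical):

```
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 20 \
  -g 48 -keyint_min 48 -sc_threshold 0 -c:a copy aligned.mp4
```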
That's not always reasonable, though. If I upload a video at 4K, there needs to be some "baseline" encoding so that when the video is published, there's something playable without streaming 4K to, say, cell phones with a resolution smaller than 4K.
Even then, chewing that 4K file down into a 1080p video "on-demand" for a desktop user on high-speed internet is no small task. First and foremost, you need to assume concurrency: if two people request the video at the same time, there's a complex problem of coordinating the encoding in a large distributed system so that the video is encoded once (or a very small number of times). You also need to do the encoding _faster than the video can be played_, or at least faster than the baseline/fallback version can be retrieved and sent. You also need to queue up the next chunk(s) of video, so you're not watching a chunk, buffering, watching a chunk, etc.
In a system at the scale of FB, it's not smooth sailing for compute jobs like this: you're subject to network latency, noisy neighbors, failures (disk/network/software/power/etc.). The case where you're able to stream the original file from storage, start encoding it and streaming the output back to storage and to everyone around the world who is requesting it _at that moment_, and coordinating the encoding of the next chunk is actually not very likely.
Want to talk about weird failure modes?
- I start watching your video and click all over the seek bar. Am I DOSing your compute cluster?
- Two users on opposite sides of the world request the same location of the same file with the same fidelity. Does one of those users get a dirt-slow experience, or do I double my compute costs?
- A thousand users start watching the same video at roughly the same time. A software bug causes the encoder(s) to crash. Do 1000 users suddenly have a broken experience, or does the video pause while your coordination software realizes there was a failure, releases the lock, and restarts that encoding job from the top while your users all get in line for the new job?
I'd argue that this is the _least simple_ approach. In the happy path, you get a nice outcome while reducing compute cost, and users get high-quality video. In the unhappy path, users get slow loading from slow encoding instead of reduced resolution, or you start to need to trade performance for compute (do you encode twice in two datacenters, or move compute further from the viewer?).
As other people have pointed out, video encoding isn't stateless. You can chunk and encode in parallel, but there are tradeoffs.
The biggest problem is that you've taken what is effectively a CDN problem (serve the first chunk of a video) into a CDN and a CPU scheduling problem
Serving video is cheap, apart from the bandwidth. So anything that reduces the number of bytes transferred yields savings. Real time encoders are not as efficient as "slow" encoders.
For low-volume videos (i.e. 99.5% of all video) the biggest cost is storage, so storing things in a high-quality or, worse still, the original codec makes storage expensive. Not only that, you still have to transcode on the way in, or support every codec ever made, in real time.
In short, yes, for some applications this approach might work, but for Facebook or YouTube it won't.
It seems that this process will be incredibly stateful in the encoder component.
Most codecs targeted at low bandwidth for mobile streaming track scene changes, and if you make chunks in a naive way (split at I-frame borders and encode them independently of each other), the reassembled final video will look choppy due to broken scene-change relations.
So after encoding each chunk you will have to carefully save relevant parts of encoder state and reuse it for the next chunk. Seems doable, but tricky to get right?
This is kind of simplistic. For instance, if ‘most people’ only watch the first 2 minutes of a 20 minute video, you still have to encode all of it for that minority that does watch the whole video. Also consider that very large groups of people use very similar hardware and connections.
Anyway, of course videos are already chopped into chunks that are stored separately. It’s much easier to distribute and cache these independent chunks. On demand encoding doesn’t change that.
Except Facebook web always shows me 1080p on a mobile data connection. I have reported it many times, but I guess they don't care about user feedback. If I select 480p on YouTube, it will never show me 1080p unless I switch back to 1080p. On Facebook, every video defaults to 1080p. Even if I don't watch the video, scrolling the news feed also preloads 1080p. This is a huge problem with a limited mobile data plan.
> An encoding family requires a minimum set of resolutions to be made available before we can deliver a video. [...] For example, having one video with all of its VP9 lanes adds more value than 10 videos with incomplete (and therefore, undeliverable) VP9 lanes.
I don't see why this constraint is in place; you can absolutely serve certain resolutions only in certain codecs (YouTube certainly does this).
I assume Facebook has this requirement for usability reasons. They don’t want a user to receive a video through one of its many share features and then not be able to view it.
This is (was?) a big problem with google photos. If you upload a video and immediately share it with someone they'll just see a black screen. A few minutes later they might be able to watch a super-low res encoded version, and only much later does a HD version become available. I started waiting 10 minutes between generating a "share" link before sending it to my family because of all the confusion it was causing.
You need baseline codecs for baseline visibility, sure. But the article is about selectively choosing which videos to encode with more advanced parameters, and there's no reason to hold back delivering your 1080p VP9 just because the 720p VP9 isn't ready yet - just serve something else.
What if you were streaming 1080 VP9 and then your player decides to downgrade to 720 because your connection became slower? At that point the missing 720 would make you unable to watch the video.
In MPEG-DASH codec is specified at the representation level so with a compliant packager and a compliant player the player should be able to switch down to the "fast h.264" rendition that exists before they start trying to create VP9 renditions.
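Roughly what that looks like with ffmpeg's DASH muxer, keeping the H.264 and VP9 renditions in separate adaptation sets so the player can fall back to whichever exists (stream layout and bitrates are made up, and a real packager would add a full ABR ladder):

```
ffmpeg -i input.mp4 \
  -map 0:v -c:v:0 libx264    -b:v:0 3M \
  -map 0:v -c:v:1 libvpx-vp9 -b:v:1 2M \
  -map 0:a -c:a aac \
  -f dash -adaptation_sets "id=0,streams=0 id=1,streams=1 id=2,streams=2" out.mpd
```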
It seems like Facebook does a thing that doesn't scale, in the interest of time: feed a minimally processed video until other streams are available. That makes sense, but I wonder if it also means that video quality on Facebook generally suffers when high-profile events are taking place.
Context on that wonder: I've always noticed that news outlets seem to carry downright horrible quality user-generated video clips of rallies, protests, and the like. Where everybody's carrying around stellar-quality video gear in their pockets these days, I've never figured out why that is.
1) Livestreaming can have very poor quality when bandwidth constrained (e.g. at an overloaded cell site at a rally)
2) Viral videos get reencoded many times in many formats. The cumulative encoding errors are not only limited by the lowest quality reencode, but also by the defects in all previous encodes with various codecs.
I bet you just answered my question - we aren't talking about a 5mbps video uploaded after the fact on home Wi-Fi, but a 384kbps video shoved down a 512kbps, TCP-unfriendly pipe as it's being shot. Thank you for that!
Related question: is there any decent open-source software that can intelligently pick encoding parameters that preserve quality while minimizing size?
I recently tried to implement video uploading for an open source project, but naively choosing ffmpeg parameters can often result in noticeable quality loss / large output file size / long encoding time. And easily all three of those at the same time.
This is a known and hard problem. A lot of companies are trying to internally do some sort of analysis of their own content libraries and intelligently tune their encoders. Netflix is definitely the most known for their tech related to video analysis and introduction of VMAF[1], but other metrics also exist which enable you to compare the original/master and the encoded variant (PSNR, SSIM, etc). Bottom line is that you need a lot of trial and error and fitting different curves on your bitrate/quality graphs and often times what metrics consider to be good quality human visual system doesn't agree 100% with. It's a very interesting problem nonetheless, and I recommend [2] if you want to learn more.
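If you want to play with that, ffmpeg can compute these metrics directly when built with libvmaf (filenames hypothetical; the first input is the encode, the second the reference):

```
# VMAF
ffmpeg -i encoded.mp4 -i original.mp4 -lavfi libvmaf -f null -

# SSIM and PSNR in one pass
ffmpeg -i encoded.mp4 -i original.mp4 -lavfi "ssim;[0:v][1:v]psnr" -f null -
```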
> but naively choosing ffmpeg parameters can often result in noticeable quality loss
Welcome to the world of video compression. If you're not able to take the time to learn the ins and outs of a codec, as well as take each incoming video's specifics into consideration, then you'll need to borrow someone's middle-of-the-road presets. Dedicated settings make decisions based on the frame size, bitrate restrictions, things like HLS vs. download/play, 1-pass vs. multi-pass, etc. All of that determines GOP size, reference frames, and so on.
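For the 1-pass/multi-pass part, a typical two-pass ABR run looks something like this (target bitrate and filenames are placeholders):

```
ffmpeg -y -i input.mp4 -c:v libx264 -preset slow -b:v 2M -pass 1 -an -f null /dev/null
ffmpeg    -i input.mp4 -c:v libx264 -preset slow -b:v 2M -pass 2 -c:a aac out.mp4
```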
The problem really is encoding from compressed video to compressed video again. It really isn't great whatever you do (Facebook/Twitter definitely haven't solved this problem either).
The quality from going from source -> 15mbit/sec h265 -> 2mbit/sec h264 (like on a classic social media or whatever site) is absolutely terrible compared to going from source -> 2mbit/sec h264.
Facebook/YouTube don't care how their compressed video looks. They just need it in a format they can get in front of their millions of users' eyeballs. I'd be willing to bet that >95% of users don't "care" about compression quality; they just want the content, hence the decisions. There's just no way to properly encode that much content with a "we care about compression" mindset.
Not really - "raw" 1080p video is ~3 Gbit/s, so going from 3000 to 15 is already a huge amount of compression.
The problem when you go from 15 to 2 in my example is that the encoder spends all its time trying to encode the artefacts. It really doesn't work well at all.
Hm, that's too bad. Is there a better way to compress already-compressed streams? I find having to do that a lot due to a use case that's not very relevant here.
Handbrake [0] is a pretty user-friendly GUI with some good presets. It's not quite the automatic optimizing tool you're describing but does a good job for most basic tasks.
The parameters for x264 present everything you need along the right dimensions (speed, compatibility, quality, type of content like animation/film/screen recording). The problem is solved; the issue is that people built frontends on top of it that present the options in the wrong way and ruin them.
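For the content-type dimension specifically, x264 exposes it directly as tunes, orthogonal to the speed presets; for example (input and CRF hypothetical):

```
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 20 -tune animation -c:a copy out.mp4
```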
Yes, the x264 params are all useful; it's selecting the optimal combination for a given use case that presents the challenge. Vimeo's video quality (and therefore filesize) is much higher than FB's, because their use cases are completely different.
I am not sure if this is still the case, but be aware that a specific CRF on x264 vs x265 might mean a different (perceived) quality. It was certainly the case a few years ago.
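Right, the scales aren't interchangeable. The usual rule of thumb from the ffmpeg encoding guides is that x265 CRF 28 lands visually near x264 CRF 23, so a fairer A/B is something like (values illustrative; verify on your own content):

```
ffmpeg -i input.mp4 -c:v libx264 -preset slow -crf 23 -c:a copy h264.mp4
ffmpeg -i input.mp4 -c:v libx265 -preset slow -crf 28 -c:a copy h265.mp4
```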
I have been interested in this and to my surprise I was able to find just a single project that does something similar: https://github.com/master-of-zen/Av1an
It has the ability to do trial compression (w/ scene splitting) and evaluate quality loss up to a desired factor.
> A relatively small percentage (roughly one-third) of all videos on Facebook generate the majority of overall watch time.
Surprising to me that it’s so large, I’d expect something like 5% or less account for half of watch time. I wonder how it compares across platforms (YouTube for instance would almost certainly be in the <5% bucket, I’d imagine)
Maybe you and the author are considering different periods, e.g.
A. what % of all videos generated the majority of overall watch time over the last week?
B. what % of all videos generated the majority of overall watch time over the last year?
If the watch time is growing by a high percentage month-over-month, perhaps the numbers would be similar. But, if overall watch time is flat, then you'd expect A to be much lower than B.
Facebook does a garbage job of encoding, I wouldn't be using them as a reference. Half the videos people send me on messenger are either terrible quality or just a frozen frame with audio.
Both when played on my phone or via the web page.
Videos sent by the same people on other services like Signal, Slack or Messages are fine.
Videos over messaging are a different problem space from the one outlined in this article. The number of people watching a messaged video is low; latency might be the topline metric the team optimizes for, since it has a big effect on whether people stay in the conversation or leave.
In fact, prioritizing latency might be the reason for the garbage quality you're seeing: reduce the bitrate, and people will receive the video more quickly and more reliably. Whether that's a smart decision is another question.
I feel the same way about their terrible UIs. Messenger is almost unusable on both web and Android, and new bugs are introduced frequently, but somehow their shithouse technology (React) has become the industry standard.
Is there an example of a video that can be encoded at 5MB with H.264 that can be encoded at an equivalent quality with VP9 at 3MB? There are a lot of things to say about this article, but that seems most glaring.
The first draft of H.264 is over 20 years old, and in most of these comparisons they could be using anything from Baseline to High Profile. Modern VP9 can compete with HEVC; a 40% bitrate reduction is very reasonable between generations, as long as it is not some insanely low bitrate that doesn't compress well.
That may be what's intended. But 5 -> 3 is a world apart from 5 -> 5.95. One is consumer-friendly (and groundbreaking information theory), and the other is just about reducing margins while pretending to be consumer-friendly.
Is it too early to think about using the AI inferencing chips provided on mobile phones for decoding video? There are lots of papers that use deep CNNs for compression/decompression, and Facebook / Google would be the best companies to productionize it first.
At least here, the video quality is not good on FB. I have a 120 Mbit/s connection and YouTube plays good-quality 4K just fine. I don't think I've ever seen 1080p served by Facebook; both resolution and bitrate are very low.
Nothing is as bad as Reddit video, which loads random segments instead of consecutive ones, or Twitter video, which gives you 3 seconds of 4K, 20 seconds of 144p, and 5 more seconds of buffering, or Vimeo, which buffers all the time on a 100 Mbps connection.
Moral of the story: video is hard. Apart from YouTube, Twitch, Netflix and Amazon Prime, I don't know any service that plays video flawlessly. OK, maybe some ad networks too.
All possible. But FB video (meaning also Instagram) is a popular target for cheap ISPs that don't rate a peer to constrict since it consumes such a large portion of users' total bandwidth, especially on mobile
I find it odd that this is a kind of mathematical article but doesn’t use the existing terms in codec literature - why does it say “cost/benefit” instead of “rate/distortion”?
A related tangent is how FB handles different photo requests by saving 4 copies at different resolutions at the time of upload. It's even worse, because according to [1]:
" The number of Haystack photos written is 12 times the number of photos uploaded since the application scales each image to 4 sizes and saves each size in 3 different locations. "
Keep in mind that the referenced paper that this page is based on is over a decade old now. Many things have changed since then, as you can imagine. Look for later papers and engineering blog posts for more about what’s changed since then.
Can anyone show me a decent-looking video on Facebook? Whenever I see anything from my friends it's just a horrible pixelated mess, even on the gaming channels I've seen.
Why would I read this? Every video I share on Messenger gets crushed beyond any usefulness, I don't even bother anymore. Whatever FB does is only useful as a case study in failure.
These are the kinds of articles that show why Facebook has different problems to almost every other tech company. The complexity of these kinds of solutions is mind-boggling.
Just imagine if all that ingenuity was focused on solving humanity’s problems, instead of sharing conspiracy theories and advertising.
Being able to efficiently serve video is one of humanity's problems. You may not like the host or video content, but that doesn't mean the problem is not a useful one to solve.
I don't think this is difficult to engineer. It's a good example of ML/AI being used for optimization, which I believe is the most legit use case and what everyone should be aiming for. Still, the problem is how we get to the scale where we desperately need stuff like this...
I wish people spent more time on Facebook. Is that what FB employees think? Or are they like the rest of us and think fuck this guy, but I'll take his money.
Other firms pay just as much, so there's plenty of choice in employment. Rather, working at Facebook just isn't an ethical problem for many of its employees.
My engineering colleagues that voraciously use and consume Messenger and Instagram don't see a problem either.
There are a lot of "bad" companies. Certain advertising, defense, pharma, law, etc. firms are shady or morally bankrupt. It doesn't stop them from finding people to do the work.
I have found Twitter videos have improved in quality recently, but there's a disconcerting effect where, when you upload your own video, you first see a very low-quality version, and then a much better version replaces it. So you are led to believe that your video has been horribly recompressed, especially if you don't wait a bit, when it seems there are actually several quality levels available.
So they are using feedback on how many people are watching a video to decide where to spend more CPU on advanced compression. There is nothing innovative here: we want to make sure we provide resources to something that's viral and not to all videos. What's innovative about that? In the cloud, and everywhere else, companies already use metrics to optimize their resources.
They collect feedback metrics to judge usage and throw resources at optimization, rather than blindly optimizing every single video.
These techniques are very simple, but they make sense when you have such high demand.
Still, I would expect more from Facebook.