As with a lot of software problems, you can create a poor solution very quickly and easily, and this fools people into thinking the problem must be pretty straightforward. As a Windows desktop developer I could quickly create an app that takes the output from a webcam and sends it over my local network to another machine for display.
A single frame might be 1024x768x3 bytes ~= 2.4MB. For video we need 24 frames per second, giving about 57MB per second to be sent and then shown. Over the internet I would need a connection of at least 57MB x 8 bits ~= 450Mbps, which is why it works between two machines on my local network but not over the internet. So you start adding compression. Then you notice that even when you can hit the performance metric the image is still juddering. Well, the internet does not deliver your packets at a nice consistent pace, so you need to buffer slightly to try and get around this. Plus the connection speed between your machines might vary over time, so you need to detect this and adjust the frame rate or resolution to adapt. Sometimes your target machine is behind a firewall and you cannot make a direct connection between you and them. So now you need a third-party server that both of you can connect to and that acts as a relay.
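To sanity-check the raw-bandwidth arithmetic for uncompressed video, here is a minimal Python sketch. The function name `raw_video_bandwidth` is made up for illustration, and it assumes 3 bytes per pixel with no compression or protocol overhead.

```python
def raw_video_bandwidth(width, height, bytes_per_pixel, fps):
    """Return (bytes per frame, bytes per second, bits per second)
    for an uncompressed video stream."""
    bytes_per_frame = width * height * bytes_per_pixel
    bytes_per_second = bytes_per_frame * fps
    bits_per_second = bytes_per_second * 8
    return bytes_per_frame, bytes_per_second, bits_per_second


frame, per_second, bits = raw_video_bandwidth(1024, 768, 3, 24)
print(f"{frame / 1e6:.2f} MB per frame")     # 2.36 MB per frame
print(f"{per_second / 1e6:.1f} MB per second")  # 56.6 MB per second
print(f"{bits / 1e6:.0f} Mbit per second")   # 453 Mbit per second
```

Whatever the exact resolution and frame rate, the point stands: raw video is hundreds of megabits per second, so compression is not optional over the internet.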
So the implementation starts getting pretty hard but if you manage to do a great job then people using it just think it is simple.
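The "buffer slightly" step can be sketched as a fixed playout delay: hold the first frame for a short time, then show every later frame on a steady schedule, and treat any frame that arrives after its slot as late. This is only an illustration. The function names, the 40 ms frame interval, and the arrival times are invented, and real systems usually adapt the delay instead of fixing it.

```python
def playout_schedule(arrivals, frame_interval_ms, buffer_delay_ms):
    """arrivals: list of (sequence number, arrival time in ms), in send order.
    Returns a list of (sequence number, playout time in ms, arrived in time)."""
    first_seq, first_arrival = arrivals[0]
    # Playback starts a little after the first frame arrives.
    start = first_arrival + buffer_delay_ms
    schedule = []
    for seq, arrival in arrivals:
        playout = start + (seq - first_seq) * frame_interval_ms
        schedule.append((seq, playout, arrival <= playout))
    return schedule


def late_frames(arrivals, frame_interval_ms, buffer_delay_ms):
    """Sequence numbers of frames that miss their playout slot."""
    schedule = playout_schedule(arrivals, frame_interval_ms, buffer_delay_ms)
    return [seq for seq, _, in_time in schedule if not in_time]


# Hypothetical arrival times: frames sent every 40 ms, delivered unevenly.
arrivals = [(0, 100), (1, 155), (2, 175), (3, 260), (4, 262)]

print(late_frames(arrivals, 40, 0))   # no buffering: [1, 3, 4] are late
print(late_frames(arrivals, 40, 50))  # 50 ms of buffering: []
```

The trade-off is that every millisecond of buffering is added directly to the delay between someone speaking and the other side hearing it, which is why the buffer has to stay small.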
Attention to network connection quality has dropped.
1. Many organisations have abandoned end-to-end internet connectivity and deployed NAT or restrictive firewall rules. Hence modern videoconferencing apps are designed to work without direct communication. This increases complexity on many layers, which leads to hard-to-diagnose failure modes, which in turn makes it hard to fix the crappy network when things work poorly.
2. Nearly all other communication is HTTP requests to services that are designed to tolerate mobile connections, so people don't really notice hiccups, and videoconferencing ends up being the only real-time networked app that businesses use.
3. Last-mile internet connections have stopped getting faster, or nearly so. If the 2000s trend had continued, we'd all have 10 gigabit internet connections by now. But instead people got excited about iPhones and choppy 3G/4G.
Some people do not have good cameras and microphones.
Some people have a picture of Lisa-Marie Scott in their background; other people have a bill from their cardiologist or shit fanfic that they write.
Some people have a slow internet connection; others have a computer slow enough that they can't use a more effective compression algorithm that would help with the slow connection. Or maybe they can't or won't license the patents.
Somebody joins the meeting on a phone while they are driving.
Network address translation.
The amazing thing is how often companies that have had a hit or dominant product in the space appear to snatch defeat from the jaws of victory. Look at AIM, ICQ, Lync and Skype.
Technically, neither of those is particularly hard. I mean, you have to keep latency low, etc., but this is hardly cutting edge tech and lots of companies do it. Do you mean some other aspect of it aside from technical?