KIRIKI DELANY AND JOHANNES RIETSCHEL ⋅ OCT 24, 2019
The authors are the founders, respectively, of StreamGuys and Barix AG.
As with most things in the broadcast universe, the transition from legacy to IP workflows has been gradual. In radio, this is perhaps best represented in the STL category.
For one thing, IP networks were uncharted, unproven territory for audio transport. The less reliable nature of IP as a transport medium versus tried-and-true T1/E1 lines was an immediate concern for broadcasters. From dropped packets to network outages, time spent off the air is money and listeners lost.
But there were other concerns as well. Working with IP meant learning an entirely new operation; configuration processes often required IT specialists to open firewalls and establish IP addresses on send (encode) and receive (decode) devices — a starting point that caused major frustration and confusion for many. This would grow even more complex for broadcasters seeking to adopt IP for point-to-multipoint architectures such as program syndication.
Once operational with live, local area connections, these send and receive devices, along with other boxes in the architecture that began to speak digital, required a great deal of local management and monitoring to ensure consistent reliability. That required being on-premise to manage all of these systems on the network.
Security was also a concern. It remains one, although protection continues to grow stronger thanks to more secure solutions and a better understanding of how broadcasters should protect their networks.
Early innovations like the Barix Reflector Service aimed to change these dynamics by providing a plug-and-play solution that simplified configuration, enhanced security and established a future foundation for cloud management. With these challenges addressed and broadcasters transitioning to IP more aggressively, the next logical question is how to optimize audio quality and support new media services over the network.
Radio has often been an industry of compromise; and with IP transport that compromise has been to the detriment of great-sounding audio. For radio studios and content owners in the adjacent audio production landscape, the focus is on creating high-quality, impactful audio. On the internet, the industry begrudgingly has accepted compressed formats — albeit for good reasons.
MP3 compression was widely accepted when the internet was slow; and in terms of compressed formats, it remains the most reliable when it comes to managing program-associated metadata. Nowadays, connections of 10 Mbps, 100 Mbps and even 5 Gbps are supporting 4K video to consumers, along with more efficient metadata management. It is now possible to send uncompressed streams over once-unthinkable 4G connections, for example, where T1 or better was traditionally necessary.
The question remains: With upload bandwidth no longer a concern, why compromise a radio station’s audio quality with compression?
Compressed formats still have a role in content networking and distribution, but when packaged for last-mile delivery to the consumer, the concept of “no compromise” in the signal chain is enormously important. This, along with a desire to support new media services and business models, makes an increasingly stronger case for broadcasters to move to an uncompressed IP transport service.
MOVING TOWARD GREATNESS
Just as broadcasters grew comfortable with IP, operating within the cloud is no longer a technical uncertainty. The transition has been similarly gradual, but the evidence shows that moving to the cloud is operationally sound and simplifies systems management. It also reduces exposure to security risks, as the devices within the architecture phone home to the CDN or service provider rather than living inside the broadcaster’s network.
For example, there is no longer a need to run encoders on-premise for an uncompressed service. In most cases, the in-studio overhead is reduced to a stable desktop solution — typically well under $1,000.
Today’s premium encoders no longer need to sit inside the studio environment, and instead will reliably take in an input signal and its associated metadata in the cloud. In addition to reducing equipment costs and maintenance, this cloud-based architecture unlocks the potential to mix and match digital signal processors, as well as codecs. The latter provides the flexibility to repackage program audio in HLS or other segmented formats required for the radio affiliate, tower and/or consumer.
The metadata component unlocks a lot of this potential and flexibility at the final production stage. In addition to simplifying encoding into several formats, the presence of metadata provides more information to the listener to visualize and enhance the user experience. That same information also simplifies royalty reporting for the artists.
Enabling the service comes down to a stable, dedicated connection on the WAN interface (the same configuration that an ISP would embrace) that can support a payload of 1.4 Mbps. A 1.4 Mbps payload will support uncompressed PCM audio at 44.1 kHz, the standard for compact disc audio, which reproduces frequencies up to roughly 20 kHz, the upper limit of human hearing. This follows from the Nyquist frequency: a sampled signal can represent frequencies up to half of the sampling rate, or 22.05 kHz in this case.
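The 1.4 Mbps figure can be checked with simple arithmetic. The following is a minimal sketch that assumes the compact disc parameters of 16-bit samples and two channels, neither of which is stated explicitly above, and that ignores packet and protocol overhead. The function and constant names are illustrative only.

```python
# Sketch: where the 1.4 Mbps payload figure comes from.
# Assumption: CD-standard PCM, i.e. 16-bit samples and two channels (stereo).

CD_SAMPLE_RATE_HZ = 44_100
CD_BIT_DEPTH = 16      # bits per sample (assumed CD standard)
CD_CHANNELS = 2        # stereo (assumed)


def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Raw PCM payload in bits per second, excluding any packet overhead."""
    return sample_rate_hz * bit_depth * channels


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a sampled signal can represent: half the sample rate."""
    return sample_rate_hz / 2


cd_bitrate = pcm_bitrate_bps(CD_SAMPLE_RATE_HZ, CD_BIT_DEPTH, CD_CHANNELS)
print(f"PCM payload: {cd_bitrate} bps ({cd_bitrate / 1_000_000:.4f} Mbps)")
print(f"Nyquist frequency: {nyquist_hz(CD_SAMPLE_RATE_HZ)} Hz")
```

Under these assumptions the payload works out to 1,411,200 bits per second, and the Nyquist frequency to 22,050 Hz, comfortably above the 20 kHz limit of human hearing.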
PCM audio, the starting point for uncompressed audio, remains a more reliable format for external IP distribution. While AES67 inside the studio has come to fruition, PCM is still better equipped to tolerate the latencies and variable network conditions of long-haul IP transport; our tests and real-world deployments show latency of under one second, with minimal packet loss.
With more data moving across the network in an uncompressed format, packet loss or slight bandwidth interruptions will have minimal impact on the resulting audio quality.
There will come a time when 192 kHz resolution will be more reliable to manage over long distances, but CD-resolution PCM will provide the high fidelity of an uncompressed audio service with optimal reliability on today’s networks.
While understanding the path to uncompressed transport is necessary, what matters most to broadcasters is solving problems and supporting new services. Let’s outline some of these scenarios and the value that an uncompressed transport platform delivers.
Operating within a cloud workflow requires that the broadcaster send the program audio data into the cloud. While this can be achieved with a compressed stream, that signal will require further compression from downstream transcoding or transrating, among other processes. The more the audio is encoded and compressed, the greater likelihood of stream latency, undesirable audio artifacts and other issues with quality of experience.
With uncompressed source audio, a single encoding stage will support a varied bouquet of codecs and bitrates required for many consumer formats. And, with one device accommodating all encoding, the outputs are more tightly aligned from a latency perspective. This remains true when outputting different protocols, such as RTMP and HLS, at the encoding stage.
Therefore, working with uncompressed source audio — in addition to enhancing sound quality for audiences — will deliver a wide array of tightly aligned outputs encoded once from the master quality source.
As referenced earlier, moving encoders to the cloud introduces several new operational efficiencies, both in terms of upgrade and network growth.
On-premise encoders are offered in two flavors: a hardware device with fixed, limited CPU and RAM resources, and a software solution that typically runs on a PC or Mac. Both have limitations that are amplified when working within an uncompressed environment.
The built-in capabilities of a hardware encoder are typically finite, and upgrades are often limited by what the vendor makes available. Any significant changes, such as adding a new codec or an increase in CPU processing, will likely require replacement of the encoder, with a potentially lengthy configuration process to bring the new system online.
While a software encoder is typically easier to replace, the supporting computer infrastructure hosting the software may require an upgrade. Over the long term, the management of that software, computer hardware and operating system will escalate costs and labor — and potentially put more stress on an already overburdened IT department.
Cloud encoders offer a simpler upgrade path. Most can be sized on the fly to amplify computing resources without wasting unnecessary resources and power, while also eliminating the need to replace the OS or software. An increase in available CPU, RAM and/or disk resources can be executed through a simple reboot process.
Scaling the infrastructure is also much easier in the cloud environment, with greater flexibility to increase the number of encoders efficiently without burdensome integration costs and labor.
The audio contribution and distribution pool continues to broaden, and broadcasters are finding themselves more limited by the locations of their on-premise encoders. For example, a remote contribution application may be limited by the resources and gear of the corresponding studio. Perhaps the content has been supplied to an affiliate that has no control of the master studio.
More specifically, an on-premise encoder makes it harder to encode at the right point in the signal chain. If the encoder is not located exactly where the broadcaster needs it, encoding at the point of distribution to the end user or application may not be possible, potentially introducing more than one encoding stage into the workflow.
Encoding in the cloud solves this problem by offering the option to insert the encoding output at any relevant place in the signal chain. If the broadcaster wants to condition and process a signal prior to sending to an affiliate, that affiliate could use an uncompressed master signal to feed their headend. From there, the uncompressed feed can be transported without any encoding required. Instead, a decoder can be supplied that can pass through the unmodified source at very low latency.
Using a cloud encoder also enables the broadcaster to send high- and low-bitrate signals in two formats, such as HE-AAC v2 and AAC-LC, and then output them as RTMP, HLS and Icecast audio sources. A single uncompressed signal at the studio, with a fixed bandwidth rate of 1.4 Mbps, is all that is required, which equates to much less than the combined total of sending high and low bitrates for each protocol.
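The fan-out described here can be sketched as a simple plan of outputs derived from one uncompressed source. This is only an illustration and not the configuration of any particular cloud encoder. The `Rendition` class, the `plan_renditions` helper and the bitrates assigned to each codec are hypothetical; only the codec and protocol names come from the text.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rendition:
    """One encoded output derived from the single uncompressed source."""
    codec: str
    bitrate_kbps: int
    protocol: str


# Hypothetical low- and high-bitrate profiles; the bitrates are assumptions.
CODEC_PROFILES = [("HE-AAC v2", 32), ("AAC-LC", 128)]
PROTOCOLS = ["RTMP", "HLS", "Icecast"]


def plan_renditions(profiles, protocols):
    """Pair every codec profile with every delivery protocol."""
    return [
        Rendition(codec, kbps, protocol)
        for (codec, kbps), protocol in product(profiles, protocols)
    ]


renditions = plan_renditions(CODEC_PROFILES, PROTOCOLS)
for r in renditions:
    print(f"{r.codec} @ {r.bitrate_kbps} kbps via {r.protocol}")
print(f"{len(renditions)} outputs planned from 1 studio feed")
```

However many outputs are planned in the cloud, the studio still sends just the one uncompressed feed.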
The overarching benefit here is that the management burden at the studio is reduced to one output to support a wide array of audio contribution and distribution requirements.
OUT IN THE REAL WORLD
Philadelphia-based WXPN, the public radio service of the University of Pennsylvania, is one example of a major broadcaster that has embraced the benefits of uncompressed audio over IP for program syndication. The broadcaster set out to develop a more sustainable distribution model for its XPoNential Radio channel, leveraging the Reflector Service from StreamGuys and Barix.
XPoNential Radio was originally distributed to affiliates via satellite and offered only for use on HD2 or HD3 channels. WXPN wanted to widen the usage of the channel to include primary broadcast, and while it continues to use satellite for national programming, the station sought an alternative, sustainable distribution model for the smaller-scale XPoNential Radio. Although cost-effectiveness was one of the station’s motivations, quality and reliability were also key criteria.
The WXPN architecture leverages uncompressed PCM audio, which is transported between the encoding and decoding endpoints across the CDN infrastructure, while link management is simplified through a cloud-based portal. The station has achieved lossless, CD-quality audio enabled by uncompressed delivery to affiliates as far away as Alaska. New affiliates plug in Ethernet, power and audio cables to receive XPoNential Radio programming. Affiliates connect using 1.4 to 1.5 Mbps of bandwidth, which is plenty to receive the uncompressed signal and deliver it to consumers.
Moving the service to the cloud simplifies management, with station personnel able to access the portal to confirm that all clients are connected and streaming. The portal also allows operators to start, stop and configure delivery to each affiliate. Service can also be terminated for any client directly through the management portal.
Affiliates also don’t need any “special” internet connectivity to use the service. A very modest 1.5 Mbps of bandwidth is enough to receive the uncompressed signal, and most consumer-level internet connections are sufficiently reliable and stable. Even WXPN does not require hefty bandwidth regardless of how many affiliates they serve, as the Barix Reflector service takes a single feed from the origin (a Barix codec), with StreamGuys’ delivery network scaling out the bandwidth for reaching recipients.
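The bandwidth point can be illustrated with a rough comparison of the origin's upload requirement under two models: sending a separate copy of the stream to every affiliate, versus sending one feed to a reflector that fans it out. This is a sketch under stated assumptions: the per-stream rate is the raw payload of 16-bit stereo PCM at 44.1 kHz with no packet overhead, the affiliate counts are hypothetical, and the function names are illustrative.

```python
# Assumed raw payload of 16-bit stereo PCM at 44.1 kHz, in Mbps (no overhead).
PCM_STREAM_MBPS = 1.4112


def origin_uplink_direct(affiliates: int, stream_mbps: float = PCM_STREAM_MBPS) -> float:
    """Origin upload if the station sent one copy of the stream per affiliate."""
    return affiliates * stream_mbps


def origin_uplink_reflector(affiliates: int, stream_mbps: float = PCM_STREAM_MBPS) -> float:
    """Origin upload with a reflector: one feed, regardless of affiliate count."""
    return stream_mbps if affiliates > 0 else 0.0


# Hypothetical affiliate counts, for illustration only.
for count in (1, 10, 50):
    direct = origin_uplink_direct(count)
    reflected = origin_uplink_reflector(count)
    print(f"{count:>3} affiliates: direct {direct:.1f} Mbps, via reflector {reflected:.1f} Mbps")
```

In the reflector model the delivery network absorbs the growth in bandwidth, so the origin's requirement stays flat as affiliates are added.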
BRINGING IT TOGETHER
As we look deeper into the future, the enhanced reliability and flexibility of an uncompressed IP service will provide a strong value proposition that will be hard to deny. Uncompressed STL will simply deliver T1-like audio quality over IP unhindered by downstream processes like transcoding, while syndicators will save a great deal of money and labor in the transition from satellite to IP for contribution and distribution.
Moving encoders into the cloud will support more formats and services while reducing the systems management burden, both at the studio and elsewhere in the audio contribution and distribution chain. The opportunity to better manage metadata alongside the uncompressed program audio stream will strengthen business opportunities and the consumer experience. And, the adaptability to accommodate even high-resolution formats as network conditions evolve will surely open new doors from both a service provision and listener experience perspective.
Kiriki Delany, a musician, computer geek and multimedia specialist, founded StreamGuys in 2000. Johannes Rietschel, a communications engineer at heart, founded Barix AG in 2000 and serves as CTO.