Challenges Faced in a Hybrid SD/HD Facility - a Teranex White Paper


Overview

In the past, broadcast facilities were oriented around an analog composite signal path. Although the facility may have had several different types of tape machines, cameras, etc, each of these could be interconnected without the use of conversion equipment.

In the transition from standard definition (SD) to DTV, many broadcasters are faced with a need for managing a hybrid SD/HD facility. In today’s economy, the concept of converting an entire facility and all of the associated workflows to HD is not a practical business plan. This makes implementing a highly integrated SD/HD processing infrastructure mandatory.

Building a hybrid facility will likely require the source material be converted to a new, internal video production format. This means any analog sources must be converted to serial digital (SDI) and all SDI signals must either be up-, down-, or cross-converted to the newly selected internal format.

In addition to the issues of the video and audio conversion, the hybrid infrastructure introduces an entirely new level of audio and ancillary data handling requirements. In the analog and SD environments, audio tended to be fairly basic, consisting of a stereo pair, and ancillary data tended to be in the vertical blanking interval or its digital equivalent.

In the new hybrid environment, audio has grown to include multi-channel audio, compressed audio and Dolby-E encoded audio, while ancillary data elements now go beyond the traditional VBI space to include things like EIA-708 closed caption, timecode, and video indexing, such as active format description (AFD), WSS and RP186 flags.

A number of challenges face broadcasters attempting to implement and operate an SD/HD processing infrastructure in which digital video, audio and metadata elements seamlessly interoperate. Addressing these challenges is now critical because of the analog shutoff and the public’s increased DTV awareness.

Format Conversion

A fundamental challenge in any facility transitioning to an SD/HD infrastructure is the requirement to transparently handle multiple signal formats. In addition to handling existing video signals, such as analog composite or SD-SDI, a broadcast facility will be required to handle at least one of the new HD formats, such as 720p or 1080i.

Most hybrid facilities will be based on a single, internal format. This means that all material that is not in the native format, whether it is received from outside the facility or ingested internally, will need to be format converted.

An important consideration in choosing a format converter is the de-interlacing technique used in converting between the various formats. In addition, if international program interchange is involved, the ability to perform frame-rate conversion is another important consideration.

De-interlacing

Most SD and HD video sources used in broadcast applications are interlaced images. In an interlaced format only half of the information in each frame is transmitted at a given time. A progressive format, on the other hand, transmits the entire frame at one time.

In order to perform format conversion on an interlaced video signal, for example from 480i or 1080i sources, it is required that the image first be de-interlaced to create a progressive image. Once the image is in the progressive domain, it is then possible to scale it to the desired output resolution without creating unwanted image artifacts. This interlaced-to-progressive conversion is the most important step in the format conversion process and determines the overall quality of the output video signal.

De-interlacing Techniques

If the objects in the video image are not moving, it is easy to de-interlace the image. The two fields can be ‘woven together’ to form a complete frame. However, if there is motion in the image, the two source fields that make up the complete frame will contain slightly different information because of the temporal offset of the interlaced fields. This means that the two fields cannot simply be woven together without causing combing artifacts. A more sophisticated process must be used.

The simplest approach to avoid these artifacts is to ignore the even fields. This is called a non-motion adaptive approach. In this method, when the two fields are processed, data from the even fields are completely ignored. The video-processing circuitry recreates, or “interpolates,” the missing lines by averaging pixels from above and below. While there are no combing artifacts, image quality is compromised because half of the resolution has been discarded.
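The line-averaging step described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the frame is a list of rows of integer luma values, the kept field occupies rows 0, 2, 4, and so on, and the function name is hypothetical rather than part of any product.

```python
def deinterlace_single_field(frame):
    """Rebuild a progressive frame from one field only.

    Rows at indices 0, 2, 4, ... are kept (assumed to be the retained field).
    Rows at indices 1, 3, 5, ... belong to the ignored field and are
    re-created by averaging the kept rows directly above and below.
    """
    height = len(frame)
    out = [list(row) for row in frame]
    for y in range(1, height, 2):
        above = frame[y - 1]
        # At the bottom edge there is no row below, so reuse the row above.
        below = frame[y + 1] if y + 1 < height else frame[y - 1]
        out[y] = [(a + b) // 2 for a, b in zip(above, below)]
    return out
```

Every second output line is synthesized from its neighbors, which is why the result is free of combing but carries only half of the original vertical detail.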

A more advanced de-interlacing technique is a frame-based, motion-adaptive algorithm. By default, these format converters use the same techniques described in the non-motion adaptive approach above. However, by using a simple motion detection system, the converter can determine when no movement has occurred in the image. If nothing in the image is moving, the converter combines the two fields directly. With this method, still images will have full vertical resolution, but as soon as there is any motion, half of the data is discarded and the resolution drops by half.

The most advanced de-interlacing technique available is a true pixel-based motion-adaptive approach. With this technique, motion is identified at the pixel level rather than the frame level. While it is mathematically impossible to avoid discarding pixels in motion during de-interlacing, the method is designed to discard only the pixels that would cause combing artifacts. Everything else is displayed with full resolution.

Pixel-based, motion-adaptive de-interlacing avoids artifacts in moving objects and preserves full resolution of non-moving portions of the screen even if neighboring pixels are in motion.
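A minimal sketch of the per-pixel decision follows. It assumes a deliberately simple motion detector (the difference between a pixel and the same pixel in the previous frame, compared with a fixed threshold); the detector, the threshold value and the function name are illustrative assumptions and do not describe the detector used in any particular converter.

```python
def deinterlace_motion_adaptive(frame, prev_frame, threshold=8):
    """Weave still pixels, interpolate moving pixels.

    `frame` and `prev_frame` are woven frames (lists of rows of luma values).
    Rows 0, 2, 4, ... are taken as the reference field. For each pixel in the
    other field (rows 1, 3, 5, ...), the pixel is kept if it has not changed
    since the previous frame; otherwise it is replaced by the average of the
    pixels directly above and below.
    """
    height = len(frame)
    out = [list(row) for row in frame]
    for y in range(1, height, 2):
        above = frame[y - 1]
        below = frame[y + 1] if y + 1 < height else frame[y - 1]
        for x, value in enumerate(frame[y]):
            moving = abs(value - prev_frame[y][x]) > threshold  # assumed detector
            if moving:
                out[y][x] = (above[x] + below[x]) // 2
    return out
```

Because the decision is made for each pixel, a moving object only costs resolution in the pixels it covers, while the rest of the picture keeps both fields.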

Diagonal Filtering

To recover some of the detail lost in the areas of motion, Pixel-Based Motion Adaptive processing implements a multi-directional diagonal filter that reconstructs some of the lost data at the edges of moving objects, filtering out any “jaggies.” This operation is called “second-stage” diagonal interpolation because it’s performed after the de-interlacing, which is the first stage of processing.

Frame Synchronization

In a hybrid facility there is a high probability that you will need to work with signals that come from outside of the facility or from internal workflows that are not integrated with the main system. In these cases, a frame synchronizer must be used to align video signals with the internal facility reference. The frame synchronizer has a source input for the incoming un-timed signal and a reference input to which the facility reference is applied. The device delays the source feed signal by storing the image information in memory. The reference signal determines the time at which the device begins to output the video. By synchronizing all signals to the same reference, the signals can now be switched together (i.e. mixed, dissolved, etc.) without timing errors.
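The store-and-release behavior can be modeled with a one-frame buffer. The sketch below is a simplified illustration, assuming whole frames arrive on the source clock and are read out on the reference clock; the class and method names are hypothetical.

```python
class FrameSynchronizer:
    """One-frame buffer: written on the source clock, read on the reference clock."""

    def __init__(self):
        self._stored = None  # most recent complete frame from the un-timed source

    def on_source_frame(self, frame):
        # The incoming frame overwrites whatever was stored before.
        self._stored = frame

    def on_reference_tick(self):
        # Output timing is set by the facility reference, not by the source.
        return self._stored


sync = FrameSynchronizer()
sync.on_source_frame("A")
first = sync.on_reference_tick()   # "A"
second = sync.on_reference_tick()  # "A" again: source is slow, frame repeated
sync.on_source_frame("B")
sync.on_source_frame("C")
third = sync.on_reference_tick()   # "C": source is fast, "B" was dropped
```

In this model a source running slower than the reference causes a frame to be repeated and a faster source causes a frame to be dropped, which is how the output stays locked to the reference.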

Aspect Ratio Conversion

Aspect ratio conversion is another factor that must be addressed in the hybrid facility. Standard definition material can exist in either 4:3 or 16:9 aspect ratio while high definition material is primarily in a 16:9 aspect ratio. It is important to remember that aspect ratio refers to the ratio of an image’s width to its height. It is not related to image size.

The video industry generally expresses aspect ratios as whole numbers such as 4:3 or 16:9. Thus an image with an aspect ratio of 4:3 means that the image is 4 units wide by 3 units high. The value of the units is completely arbitrary.

The technical issues of aspect ratio conversion are not very complex. The process does not require any information to be created; rather it is simply a process of cropping, stretching or squeezing the image.

This manipulation does, however, present a number of creative issues in how an image is resized and reshaped to change its aspect ratio.

 

Up-conversion: Common Top & Bottom
If a 4:3 image is up-converted without any aspect ratio changes it is referred to as Common Top and Bottom. The top and bottom edges of the image match the top and bottom edges of the display device. This creates a pillarbox, as shown in Figure 1, or an image with black curtains on either side.


Figure 1: Common Top & Bottom

 

Up-conversion: Anamorphic
If the 4:3 image is stretched to fill the 16:9 output display, we refer to the aspect ratio mode as Anamorphic, as shown in Figure 2. This mode is usually intended for material that was captured with an anamorphic lens, as it will horizontally stretch the compressed source material to compensate for the effects of the anamorphic lens. If the Anamorphic mode is used with standard 4:3 material, the result is a horizontal distortion of the geometry of the image (e.g. circles are stretched and become ovals).


Figure 2: Anamorphic

 

Up-conversion: Common Sides
As was seen above, if the 4:3 image is stretched horizontally to fill the 16:9 display, the result is a distortion of the geometry of the image. In order to correct this and continue to fill the output display, the image must be stretched vertically as well as horizontally. This yields the ‘correct’ 16:9 geometry, but the vertical stretch of approximately 33% causes about 25% of the original input information to be lost from the top and bottom of the image. We refer to this mode as “Common Sides” or “Common Left & Right”; it can be seen in Figure 3.

The loss of the information in the vertical domain, in addition to its creative issues, does present a technical one as well. By cropping the lines, less vertical information is made available to the interpolation process. This will effectively lower the overall resolution of the output image.


Figure 3: Common Sides

 

Up-conversion: 14:9
Another aspect ratio, 14:9, has become known as the compromise format. This format is generally shown with a correct geometry. It also requires both a horizontal and vertical stretch of the image. Because the horizontal stretch is to 14 units, as opposed to 16, small bars (pillarbox) will remain on either side of the image. The vertical stretch needed to maintain geometry is only about 17%, rather than 33%, and thus results in less information being cropped (about 14% of the picture height), as shown in Figure 4. Like the 16:9 format above, the 14:9 format will also have an effect on the overall vertical resolution.


Figure 4: 14:9 Common Sides
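The geometry of these up-conversion modes reduces to a few ratios. The sketch below works them out with exact fractions, assuming a 4:3 source and a 16:9 display and keeping the picture geometry correct; the function name and its parameters are illustrative only.

```python
from fractions import Fraction


def upconvert_geometry(source, shown, display=Fraction(16, 9)):
    """Return (width_used, zoom, cropped) for a geometry-preserving up-conversion.

    source  -- aspect ratio of the input picture (e.g. 4:3)
    shown   -- aspect ratio of the picture area on the display
               (4:3 = Common Top & Bottom, 14:9, or 16:9 = Common Sides)
    display -- aspect ratio of the output raster
    """
    width_used = shown / display   # fraction of display width carrying picture
    zoom = shown / source          # enlargement relative to fitting the height
    cropped = 1 - source / shown   # fraction of source height lost top and bottom
    return width_used, zoom, cropped


SD = Fraction(4, 3)
pillarbox = upconvert_geometry(SD, Fraction(4, 3))        # (3/4, 1, 0)
fourteen_nine = upconvert_geometry(SD, Fraction(14, 9))   # (7/8, 7/6, 1/7)
common_sides = upconvert_geometry(SD, Fraction(16, 9))    # (1, 4/3, 1/4)
```

Under this model, filling the 16:9 display is a 4/3 zoom that crops one quarter of the source lines, 14:9 is a 7/6 zoom that crops one seventh, and the pillarbox picture occupies three quarters of the display width (1440 of 1920 pixels on a 1920-wide raster).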

 

Up-conversion: Flexview
The creative problem associated with aspect ratio conversions is one that has plagued broadcasters since the start of the DTV transition. A compromise between the pillarbox method and the anamorphic or common sides methods was needed.

One compromise for this issue is the Flexview aspect ratio, introduced by Teranex. The Flexview aspect ratio utilizes a non-linear anamorphic function. It performs the aspect ratio conversion by leaving the picture information in the center of the image relatively undisturbed and by applying progressively more ‘stretch’ to the image as it gets closer to the left and right edges. This process, while largely done in the horizontal domain, also has a small vertical component to help maintain correct geometry.

The basic premise of Flexview is that most of the important content in a scene (the material that your eye is drawn to) is in the center of the image and the information on the edges is generally less important. Flexview leaves the center portion undisturbed and stretches the image most aggressively around the edges where there is less important material.

The following examples compare up-conversions of the same frame of video using the Common Top & Bottom aspect ratio (Figure 5), the Common Sides aspect ratio (Figure 6), the Anamorphic aspect ratio (Figure 7) and the Flexview aspect ratio (Figure 8).


Figure 5: Common Top & Bottom

 


Figure 6: Common Sides

 


Figure 7: Anamorphic

 


Figure 8: Flexview

 

Down-conversion: Common Sides
If a 16:9 image is down-converted without any aspect ratio changes, we refer to the aspect ratio conversion mode as Common Sides. The left and right edges of the image will match the left and right edges of the display device. This creates a letterbox, as shown in Figure 9, or an image with black bars on the top & bottom.


Figure 9: Common Sides

 

Down-conversion: Anamorphic
If the 16:9 image is squeezed to fill the 4:3 output display, we refer to the aspect ratio mode as Anamorphic, as shown in Figure 10. When the Anamorphic mode is used, the result is a horizontal distortion to the geometry of the image (circles are squeezed to become ovals).


Figure 10: Anamorphic

 

Down-conversion: Common Top & Bottom
As was seen above, if the 16:9 image is squeezed horizontally to fill the 4:3 display it results in a distortion of the geometry of the image. In order to eliminate the distortion, the image must be cropped horizontally. This will result in ‘correct’ geometry, but will cause information on the left and right side of the input image to be lost. This mode is referred to as "Common Top & Bottom," as shown in Figure 11.


Figure 11: Common Top & Bottom
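For down-conversion the same ratios give concrete line and column counts. The sketch below assumes a 16:9 source, a 4:3 output with 480 active lines and a 1920-pixel-wide HD source; the function names are illustrative, and the arithmetic ignores blanking and pixel aspect ratio details.

```python
from fractions import Fraction

HD = Fraction(16, 9)
SD = Fraction(4, 3)


def letterbox(output_lines, source=HD, display=SD):
    """Common Sides: return (active_lines, top_bar_lines, bottom_bar_lines)."""
    active = int(output_lines * display / source)
    bars = output_lines - active
    return active, bars // 2, bars - bars // 2


def center_cut(source_width, source=HD, display=SD):
    """Common Top & Bottom: return (first_kept_column, kept_columns)."""
    kept = int(source_width * display / source)
    return (source_width - kept) // 2, kept


letterboxed = letterbox(480)   # (360, 60, 60)
cropped = center_cut(1920)     # (240, 1440)
```

A letterboxed picture therefore uses 360 of the 480 output lines, while the cropped version discards 240 source columns from each side.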

 

Ancillary Data

Audio Workflow

There are many types of audio workflows employed in hybrid facilities. To better understand these different workflows, it is important to understand the differences between two-channel (stereo) and multiple-channel (surround sound) audio processing. A stereo signal that sounds like stereo may have inaudible surround sound information encoded in the stereo signal (e.g. Dolby-E). Surround sound mixes can have up to eight channels, depending on the format. Typically, stereo and 5.1 mixes are used.

In addition to the number of channels and the type of audio carried in them, audio content may be moved around a facility as discrete (analog or AES), embedded or compressed signals. It is often necessary to "break out" audio channels from incoming programming in order to isolate foreign language tracks or "natural" sounds. The ability to reassign audio channels during video format conversion is often critical to the workflow and can save time when processing incoming programming.

Dolby-E Audio

Dolby-E is an audio encoding and decoding technology developed by Dolby Laboratories that allows up to 8 channels of audio to be compressed into a digital stream that can be stored on a standard stereo pair of audio tracks.

In a hybrid facility it may be necessary to distribute multi-channel audio around a facility as either a Dolby-E encoded pair or in the 8-channel discrete format. Applying Dolby-E decoding during the format conversion process can save time and effort.

Closed Captioning

Another challenge in a hybrid facility is transcoding closed caption information when converting between SD and HD content. Standard definition signals use EIA-608 closed captioning, whereas high definition signals utilize EIA-708 closed captioning.

When an SD signal is up-converted to HD, the closed caption information in the SD signal must be extracted from the ancillary data space, transcoded from EIA-608 to EIA-708 and inserted in the ancillary data space of the HD output signal. In addition to the transcoding process, the original EIA-608 closed caption information is also inserted into the new EIA-708 closed caption payload. This allows the information to be extracted later if down-conversion of the HD signal to SD is required.

Active Format Description

Active Format Description (AFD) is a standard set of codes, carried in the baseband SDI video signal, that describe the aspect ratio and active picture characteristics of the video. It can be used by broadcasters to enable both 4:3 and 16:9 television sets to optimally present pictures transmitted in either format. It has also been used by broadcasters to dynamically control how format-conversion equipment formats the image during the conversion process.

By using AFD codes, broadcasters can more accurately control the timing of aspect ratio switches within their program stream. This is useful, for example, when commercials have different aspect ratios that must be maintained when they are inserted into the program stream.

Hybrid SD/HD Workflow Examples

The following scenarios will discuss a number of different hybrid SD/HD workflows, show how a format converter can be used to mix the SD and HD content and address the audio and metadata issues described above to form a cohesive workflow.

 

Scenario #1
In this scenario, a broadcaster has an existing standard definition facility but wants to transmit an HDTV signal in 1080i59.94 as an initial step in their DTV transition. This is one of the most common scenarios and one of the simplest to resolve.

Looking at the diagram below, you can see that there are no changes required to the existing baseband facility. All work is still done in the original standard definition format. The output of the station's master control switcher is fed to the up-converter, which will convert the standard definition, 480i59.94 signal to a 1080i59.94 signal. The output of the up-converter is then fed to the new ATSC encoder, which then feeds the new DTV transmitter.

For audio, the original workflow used an AES pair to distribute audio. In the new workflow, this AES pair will be fed to the up-converter and embedded in the output HD-SDI signal, which is then passed on to the ATSC encoder.

In addition, the EIA-608 closed caption data in the SD-SDI signal will be extracted by the up-converter, transcoded to EIA-708 and inserted in the ancillary data space of the output HD-SDI signal. The up-converter will also insert the original EIA-608 data into the newly created EIA-708 data per SMPTE-334M.

 

Scenario #2
In this scenario, we will look at a facility that has transitioned most of its workflows to HD but still needs to distribute an SD feed to its local cable company. In this case, the output of the master control switcher will feed a down-converter, which will convert the 1080i59.94 station HD feed to 480i59.94 to be sent to the cable company.

The HD feed contains 10 channels of audio (2 channels for the main stereo audio, 6 channels for the unencoded surround sound and 2 channels for the second audio language), but the SD feed needs only 4 channels (2 for the main stereo pair and 2 for the second audio language). The down-converter will therefore be configured to pass channels 1 & 2, remap channels 9 & 10, which contain the second audio language, to channels 3 & 4, and mute all other channels.
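The channel routing for this scenario can be written as a small lookup table. The sketch below is illustrative only: the channel labels, the use of None to represent a muted channel and the function name are assumptions, not settings of any particular converter.

```python
MUTED = None  # stand-in for a silent output channel

# Output channel -> input channel (1-based), as configured in this scenario.
ROUTING = {1: 1, 2: 2, 3: 9, 4: 10}


def remap_channels(inputs, routing):
    """Return the output channel list; unrouted output channels are muted."""
    outputs = []
    for out_ch in range(1, len(inputs) + 1):
        src = routing.get(out_ch)
        outputs.append(inputs[src - 1] if src is not None else MUTED)
    return outputs


hd_audio = [
    "main L", "main R",
    "surround 1", "surround 2", "surround 3",
    "surround 4", "surround 5", "surround 6",
    "second language L", "second language R",
]
sd_audio = remap_channels(hd_audio, ROUTING)
```

The first four output channels carry the main stereo pair and the second audio language, and the remaining channels are muted.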

The down-converter will also extract the EIA-608 closed caption data from the EIA-708 closed caption data in the HD signal and insert it in the output SD signal.

 

Scenario #3
In this scenario, a broadcaster has an existing standard definition facility. The DTV output for the facility will be 1080i59.94. The SD studio cameras have the ability to capture in 16:9 anamorphic, which will eliminate the issues involved with aspect ratio conversion. Unfortunately, the local advertiser requires that his commercial be shown without any aspect ratio changes and has included AFD flags in the content to identify the desired aspect ratio.

As we have seen in other workflows, the SD output of the master control switcher will be fed to an up-converter. The up-converter will be set to the Anamorphic aspect ratio so that when it receives the signal from the studio cameras, which will be squeezed horizontally, it will stretch the image horizontally during the up-conversion process so that the correct geometry is achieved when material is viewed in HD.

When the output of the master control switcher changes to one of the commercial spots, the up-converter will detect the AFD flag and automatically change the aspect ratio from Anamorphic to Common Top & Bottom. This will display the commercial as a 4:3 image with black bars on the left and right sides to fill the 16:9 display. At the end of the commercial break, when the master control switcher changes back to the studio cameras, the up-converter will detect the absence of the AFD flags and will revert to its user-defined default aspect ratio, in this case Anamorphic.
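The switching logic in this scenario amounts to a per-frame lookup with a fallback. The sketch below assumes the AFD value has already been extracted from each frame and is represented by a descriptive label rather than a real AFD code value; the label, the table and the function name are hypothetical.

```python
DEFAULT_MODE = "Anamorphic"  # user-defined default for the studio cameras

# Hypothetical mapping from a detected AFD label to an aspect ratio mode.
AFD_TO_MODE = {"4:3 full frame": "Common Top & Bottom"}


def select_mode(afd, default=DEFAULT_MODE):
    """Pick the aspect ratio mode for one frame.

    `afd` is None when no AFD flag is present in the incoming signal.
    Unknown labels fall back to the default mode.
    """
    if afd is None:
        return default
    return AFD_TO_MODE.get(afd, default)


# Studio camera, two frames of a flagged commercial, then back to the studio.
incoming_afd = [None, "4:3 full frame", "4:3 full frame", None]
modes = [select_mode(afd) for afd in incoming_afd]
```

Because the mode is chosen from the flag carried with each frame, the aspect ratio change lines up with the switch to and from the commercial without operator intervention.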

In addition, the EIA-608 closed caption data in the SD-SDI signal will be extracted by the up-converter, transcoded to EIA-708 and inserted in the ancillary data space of the output HD-SDI signal. The up-converter will also insert the original EIA-608 data into the newly created EIA-708 data per SMPTE-334M.

 

Scenario #4
In this scenario we will look at a special case in which a facility that is internally set up to work in 720p59.94 needs to distribute signals in 480i59.94, 1080i59.94, 720p50 and 1080i50 to cover a special event. To implement this workflow, the output of the HD master control switcher can be fed to a dual-channel format converter, where the first processing channel can be set to perform the cross-conversion from 720p59.94 to 1080i59.94 and the second processing channel can be set to perform a down-conversion from 720p59.94 to 480i59.94.

For the 720p50 and 1080i50 feeds, a second converter can be used to perform the standards conversion from 720p59.94 to 720p50 and 1080i50 for international distribution.

 

Conclusions

The format conversion process is a key element for broadcasters that are involved in the transition to DTV and a hybrid facility. The format converter allows broadcasters to work with both standard definition and high definition programming on their DTV channel. It also resolves the issues surrounding the processing of audio and ancillary data and the passing of those elements between the SD and HD environments.

The use of a high quality de-interlacing process is very important in the format conversion process. Since a format converter must create picture information during resampling, it is critical that the maximum amount of vertical detail be recovered from the input signal. The Teranex VC100, with its motion adaptive de-interlacing process and diagonal filtering, can preserve the full vertical detail of the input signal in static areas and recover much of the detail in areas of motion.

Aspect ratio conversion is another area that will impact both the engineering and the creative side of a broadcast facility. From an engineering standpoint, one must ensure that the format converter can address aspect ratio conversion. There may be a decision as to whether the facility should switch to 16:9 for all programming, new and old. This decision would lead to changes in equipment, such as cameras, and the redesign of sets. From a creative standpoint, there is the issue of how to fill the new widescreen displays and what degree of cropping or distortion will be acceptable.

 

 

 


© Copyright - Teranex - All Rights Reserved