Understanding HD Your comprehensive guide to High Definition on a budget
Part One
Understanding HD with Avid
Chapter 1
Video formats and sampling
Perhaps one of the most baffling areas of HD and SD is the shorthand jargon used to describe sampling and colour space, such as RGB 4:4:4 and Y,Cr,Cb 4:2:2. Video formats such as 1080/24P also sound strange until you get to know them. For a quick initiation, or reminder, about sampling ratios, please read the piece directly below.
[Figure: 4:2:2 sampling of luminance and colour difference signals]
4:2:2 etc (Chroma sub-sampling) The sampling rates used in digital television are described by shorthand that has, in some ways, only a tenuous connection to what it is used to describe. The numbers denote ratios of sampling rates, not absolute numbers, and they need a little interpretation to understand them all. Sometimes these ratios are referred to as ‘chrominance (chroma) sub-sampling’. In most instances the first number refers to luminance (Y) and the last two refer to chrominance; the exceptions are 4:4:4 and 4:4:4:4 (more later). The first number is nearly always a 4, which means that the luminance is sampled once for every pixel produced in the image. There are a very few instances where a lower sample rate is used for luminance. An example is HDCAM, which is generally considered to use 3:1:1 sampling. Sampling at a lower rate than the final pixel rate is known as sub-sampling. The last two numbers describe the sampling frequencies of the two pure colour digitised components of (Red-Y) and (Blue-Y), called Cr and Cb. In line with television’s practice of taking advantage of our eye’s response, which is more acute for luminance than for pure colour, cuts to reduce data tend to be made in the chrominance sampling rather than luminance. The most common studio sampling system is 4:2:2, where each of the two colour components is sampled coincidently with every second luminance sample along every line.
4:1:1, used in some DV formats and DVCAM, makes Cr and Cb samples at every fourth Y sample point on every line – but still carries more chrominance detail than PAL or NTSC.
[Figure: 4:1:1 sampling]
Then another argument says that if the chrominance is subsampled horizontally, as in 4:1:1, why not do the same vertically to give a more even distribution of colour information? So instead of sampling both Cr and Cb on every line, they are sampled on alternate lines, but more frequently on each line (at every other Y). This is 4:2:0 sampling (4:2:0 on one line and 4:0:2 on the next) and it is used in MPEG-2 and most common JPEG compression schemes.
[Figure: 4:2:0 sampling. 4:2:0 provides equal colour resolution vertically and horizontally if using square pixels]
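The sampling ratios described above can be turned into sample counts. The following is a minimal sketch, assuming each ratio is read over a block 4 pixels wide and 2 lines high, with the second number counting the chrominance samples on the first line and the third number those on the second line. That reading, and the function names, are illustrative assumptions rather than anything defined in this guide.

```python
def samples_per_block(ratio):
    """Count Y, Cr and Cb samples in a block 4 pixels wide and 2 lines high.

    Assumption: "a:b:c" means `a` luminance samples per line, `b` Cr/Cb
    sample pairs on the first line and `c` pairs on the second line.
    Key channels (a fourth number) are not handled here.
    """
    y, first_line, second_line = (int(n) for n in ratio.split(":"))
    luma = 2 * y                       # Y is sampled on both lines
    chroma = first_line + second_line  # each of Cr and Cb over the two lines
    return {"Y": luma, "Cr": chroma, "Cb": chroma}


def relative_data(ratio):
    """Total samples for `ratio` as a fraction of full 4:4:4 sampling."""
    counts = samples_per_block(ratio)
    full = samples_per_block("4:4:4")
    return sum(counts.values()) / sum(full.values())


for ratio in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
    print(ratio, samples_per_block(ratio), f"{relative_data(ratio):.2f}")
```

Under these assumptions 4:1:1 and 4:2:0 carry the same amount of data and differ only in how the colour samples are distributed, while 4:4:4 needs half as much again as 4:2:2.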
In many cases it is very useful to have a key (or alpha) signal associated with the pictures. A key is essentially a full image but in luminance only. So then it is logical to add a fourth number 4, as in 4:2:2:4.
Technically 4:4:4 can denote full sampling of RGB or Y, Cr, Cb component signals – but it is rarely used for the latter. RGB may have an associated key channel, making 4:4:4:4.
Occasionally people go off-menu and do something else – like over-sampling which, with good processing, can improve picture quality. In this case you might see something like 8:8:8 mentioned. That would be making two samples per pixel for RGB.
This sampling ratio system is used for both SD and HD. Even though the sampling is generally 5.5 times faster, 4:2:2 sampling is the standard for HD studios.
Why 4? Logic would dictate that the first number, representing a 1:1 relationship with the pixels, should be 1 but, for many good (and some not so good) reasons, television standards are steeped in legacy. Historically, in the early 1970s, the first television signals to be digitised were coded NTSC and PAL. In both cases it was necessary to lock the sampling frequency to that of the colour subcarrier (SC), which itself has a fixed relationship to line and frame frequencies. NTSC subcarrier is 3.579545MHz and PAL-I’s is 4.43361875MHz and the digital systems typically sampled at 4 x NTSC SC or 3 x PAL SC, making 14.3 and 13.3MHz respectively. Then came the step to use component video Y, B-Y and R-Y (luminance and two pure colour components – known as colour difference signals) that is much easier to process for re-sizing, smooth positioning, standards conversion, compression and all the other 1001 operations that can be applied to pictures today. When a standard was developed for sampling this component video it followed some of the same logic as before, but this time also sought commonality between the two SD scanning systems used around the world: 525/60I and 625/50I. Putting all that together led to what is now the ITU-R BT.601 standard for SD sampling. ‘601’ defines luminance sampling at 13.5MHz (giving 720 pixels per active line) and each of the colour difference signals at half that rate – 6.75MHz.
The final twist in this tale is that someone then noticed that 13.5MHz was nearly the same as 14.3MHz that was 4 x NTSC subcarrier. Had he looked a little further he might have seen a much nearer relationship to 3 x PAL SC and a whole swathe of today’s terminology would be that much different! But so it was that the number that might have been 3 and should have been 1, became 4.
As HD sampling rates are 5.5 times faster than those for SD, the commonly used studio 4:2:2 sampling actually represents 74.25MHz for Y and 37.125MHz for Cr and Cb.
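The step from a ratio to real sampling frequencies is simple arithmetic. This sketch assumes that the ‘4’ stands for 13.5MHz in SD, so one unit of the ratio is 3.375MHz, and that HD scales every figure by 5.5 as stated above. The function name is hypothetical, and the calculation only suits ratios in which each number is a per-line sampling rate (4:4:4, 4:2:2, 4:1:1), not 4:2:0.

```python
SD_UNIT_MHZ = 13.5 / 4   # one unit of the ratio in SD: 3.375 MHz
HD_FACTOR = 5.5          # HD sampling is 5.5 times faster than SD


def sampling_rates_mhz(ratio, hd=False):
    """Return (Y, Cr, Cb) sampling frequencies in MHz for a ratio string."""
    unit = SD_UNIT_MHZ * (HD_FACTOR if hd else 1.0)
    return tuple(int(n) * unit for n in ratio.split(":"))


print(sampling_rates_mhz("4:2:2"))           # SD studio sampling
print(sampling_rates_mhz("4:2:2", hd=True))  # HD studio sampling
```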
1080I Short for 1080 lines, interlace scan. This is the very widely used HD line format which is defined as 1080 lines, 1920 pixels per line, interlace scan. The 1080I statement alone does not specify the frame rate which, as defined by SMPTE and ITU, can be 25 or 30Hz. See also: Common Image Format, Interlace, ITU-R BT.709, Table 3
1080P TV image size of 1080 lines by 1920 pixels, progressively scanned. Frame rates can be as for 1080I (25 and 30Hz) as well as 24, 50 and 60Hz. See also: Common Image Format, Progressive, ITU-R BT.709, Table 3
13.5MHz Sampling frequency used in the 601 digital coding of SD video. The frequency was chosen to be a whole multiple of the 525 and 625-line television system frequencies to create some compatibility between the digital systems. The sampling is fast enough to faithfully portray the highest frequency, 5.5MHz, luminance detail information present in SD images. Digital sampling of most HD standards samples luminance at 74.25MHz, which is 5.5 times 13.5MHz. See also: 2.25MHz, ITU-R BT.601
2.25MHz This is the lowest common multiple of the 525/59.94 and 625/50 television line frequencies, being 15.734265kHz and 15.625kHz respectively. Although seldom mentioned, its importance is great as it is the basis for all digital component sampling frequencies both at SD and HD. See also: 13.5MHz
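The lowest common multiple claim can be checked with exact fractions. This is a sketch only: it assumes the 525-line frequency is exactly 4.5MHz divided by 286 (about 15.734265kHz), a value not spelled out in this entry, and the helper names are illustrative. It needs Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import gcd, lcm

LINE_525_HZ = Fraction(4_500_000, 286)  # assumed exact value, about 15734.265 Hz
LINE_625_HZ = Fraction(15_625)          # 15.625 kHz


def lcm_of_fractions(a, b):
    """Lowest common multiple of two positive fractions in lowest terms."""
    return Fraction(lcm(a.numerator, b.numerator),
                    gcd(a.denominator, b.denominator))


def lines_multiple(freq_hz, line_hz):
    """Return how many line periods fit in freq_hz, or None if not whole."""
    quotient = Fraction(freq_hz) / line_hz
    return int(quotient) if quotient.denominator == 1 else None


base = lcm_of_fractions(LINE_525_HZ, LINE_625_HZ)
print(base)                                   # 2250000 Hz, i.e. 2.25 MHz
print(lines_multiple(base, LINE_525_HZ),      # 143
      lines_multiple(base, LINE_625_HZ))      # 144
print(Fraction(13_500_000) / base,            # 6
      Fraction(74_250_000) / base)            # 33
```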
24P Short for 24 frames, progressive scan. In most cases this refers to the HD picture format with 1080 lines and 1920 pixels per line (1080 x 1920/24P). The frame rate is also used for SD at 480 and 576 lines with 720 pixels per line. This is often as an offline for an HD 24P edit, or to create a pan-and-scan version of an HD down-conversion. Displays working at 24P usually use the double shuttering technique – like film projectors – to show each image twice and reduce flicker when viewing this low rate of images.
24PsF 24P Segmented Frame. This blurs some of the boundaries between film and video, as video is captured in a film-like way, formatted for digital recording and can pass through existing HD video infrastructure. Like film, entire images are captured at one instant rather than by the usual line-by-line TV scans down the image, which mean the bottom can be scanned 1/24 of a second after the top. The images are then recorded to tape as two temporally coherent fields (segments), one with odd lines and the other with even lines, that are well suited to TV recorders. The images are a pure electronic equivalent of a film shoot and telecine transfer – except the video recorder operates at film rate (24 fps), not at television rates. The footage has more of a filmic look but, with the low frame rate, movement portrayal can be poor. 25PsF and 30PsF rates are also included in the ITU-R BT.709-4 recommendation. See also: ITU-R BT. 709
601 See ITU-R BT. 601
709 See ITU-R BT. 709
720P Short for 720 lines, progressive scan. Defined in SMPTE 296M and a part of both ATSC and DVB television standards, the full format is 1280 pixels per line, 720 lines and 60 progressively scanned pictures per second. It is mainly the particular broadcasters who transmit 720P that use it. Its 60 progressively scanned pictures per second offer the benefits of progressive scan at a high enough picture refresh rate to portray action well. It has advantages for sporting events, smoother slow motion replays etc.
74.25MHz The sampling frequency commonly used for luminance (Y) or RGB values of HD video. Being 33 x 2.25MHz, the frequency is a part of the hierarchical structure used for SD and HD. It is a part of SMPTE 274M and ITU-R BT.709. See also: 2.25MHz
Active picture The part of the picture that contains the image. With the analogue 625 and 525-line systems only 575 and 487 lines actually contain the picture. Similarly, the total time per line is 64 and 63.5µs but only around 52 and 53.3µs contain picture information. As the signal is continuous the extra time allows for picture scans to reset to the top of the frame and the beginning of the line. Digitally sampled SD formats contain 576 lines and 720 pixels per line (625-line system), and 480 lines and 720 pixels per line (525-line system) but only 702 contain picture information. The 720 pixels are equivalent to 53.3µs. The sampling process begins during line blanking of the analogue signal, just before the left edge of active picture, and ends after the active analogue picture returns to blanking level. Thus, the digitised image includes the left and right frame boundaries as part of the digital scan line. This allows a gentle roll-on and roll-off between the blanking (black) and active picture. HD systems are usually quoted just by their active line count, so a 1080-line system has 1080 lines of active video, each of 1920 samples. This may be mapped onto a larger frame, such as 1125 lines, to fit with analogue connections.
Aliasing Artefacts created as a result of inadequate or poor video sampling or processing. Spatial aliasing results from the pixel-based nature of digital images and leads to the classic ‘jagged edge’ (a.k.a. ‘jaggies’) appearance of curved and diagonal detail and twinkling on detail. This results from sampling rates or processing accuracy too low for the detail. Temporal aliasing occurs where the speed of the action is too fast for the frame rate, the classic example being wagon wheels that appear to rotate the wrong way. See also: Anti-aliasing
Anamorphic This generally describes cases where vertical and horizontal magnification is not equal. The mechanical anamorphic process uses an additional lens to compress the image by some added amount, often on the horizontal axis. In this way a 1.85:1 or a 2.35:1 aspect ratio can be squeezed horizontally into a 1.33:1 (4:3) aspect film frame. When the anamorphic film is projected it passes through another anamorphic lens to stretch the image back to the wider aspect ratio. This is often used with SD widescreen images which keep to the normal 720 pixel count but stretch them over a 33-percent wider display. It can also apply to camera lenses used to shoot 16:9 widescreen where the CCD chips are 4:3 aspect ratio. See also: Aspect ratio
Anti-aliasing Attempts to reduce the visible effects of aliasing. This is particularly the case with spatial anti-aliasing that typically uses filtering processes to smooth the effects of aliasing which may be noticeable as jaggedness on diagonal lines, or ‘twinkling’ on areas of fine detail. A better solution is to improve the original sampling and processing and avoid aliasing in the first place. See also: Aliasing
Aspect Ratio For pictures, this refers to the ratio of picture width to height. HD pictures use a 16:9 aspect ratio, which also may be noted as 1.77:1. This is a third wider than the traditional 4:3 television aspect ratio (1.33:1) and is claimed to enhance the viewing experience as it retains more of our concentration by offering a wider field of view. Pixel aspect ratio refers to the length versus height for a pixel in an image. HD always uses square pixels as do most computer applications. SD does not. The matter is further complicated by SD using 4:3 and 16:9 (widescreen) images which all use the same pixel and line counts. Care is needed to alter pixel aspect ratio when moving between systems using different pixel aspect ratios so that objects retain their correct shape. With both 4:3 and 16:9 images and displays in use, some thought is needed to ensure a shoot will suit its target displays. All HD, and an increasing proportion of SD, shoots are 16:9 but many SD displays are 4:3. As most HD productions will also be viewed on SD, clearly keeping the main action in the middle ‘4:3’ safe area would be a good idea – unless the display is letterboxed. See also: ARC
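The relationship between picture aspect ratio and pixel aspect ratio can be sketched as below. It assumes, for simplicity, that the whole stored line maps onto the full display width. Real SD practice uses slightly different figures because only about 702 of the 720 samples carry picture, so treat the SD results as approximate. The function name is illustrative.

```python
from fractions import Fraction


def pixel_aspect_ratio(pixels_per_line, lines, display_aspect):
    """Width-to-height ratio of one pixel for a given raster and display shape.

    Simplifying assumption: all `pixels_per_line` samples span the display width.
    """
    return Fraction(display_aspect) * Fraction(lines, pixels_per_line)


print(pixel_aspect_ratio(1920, 1080, Fraction(16, 9)))  # 1, square pixels
print(pixel_aspect_ratio(1280, 720, Fraction(16, 9)))   # 1, square pixels
print(pixel_aspect_ratio(720, 576, Fraction(4, 3)))     # 16/15, wider than tall
print(pixel_aspect_ratio(720, 576, Fraction(16, 9)))    # 64/45, same raster, widescreen
print(pixel_aspect_ratio(720, 480, Fraction(4, 3)))     # 8/9, taller than wide
```

The same 720-pixel SD raster gives two different pixel shapes depending on whether it is shown at 4:3 or 16:9, which is why conversions between systems have to rescale the picture to keep circles circular.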
Chrominance (or Chroma) sub-sampling See 4:2:2 etc.
CIF Common Image Format. An image format that is widely used and denoted ‘Common Image Format’ by the ITU. The idea is to promote the easy exchange of image information nationally and internationally. See HD-CIF
Colour space The space encompassed by a colour system. Examples are: RGB, YCrCb, HSL (hue, saturation and luminance) for video, CMYK for print and XYZ for film. Moving between media, platforms or applications can require a change of colour space. This involves complex image processing so care is needed to get the right result. Also, repeated changes of colour space can lead to colours drifting off. It is important to note that when converting from YCrCb to RGB more bits are required in the RGB colour space to maintain the dynamic range. For example, if the YCrCb colour space video is 8 bits per component then the RGB colour space video will need to be 10 bits.
Component video Most traditional digital television equipment handles video in the component form: as a combination of pure luminance Y, and the pure colour information carried in the two colour difference signals R-Y and B-Y (analogue) or Cr, Cb (digital). The components are derived from the RGB delivered by imaging devices, cameras, telecines, computers etc. Part of the reasoning for using components is that it allows colour pictures to be compressed. The human eye can see much more detail in luminance than in the colour information (chrominance). The simple task of converting RGB to Y, (R-Y) and (B-Y) allows exclusive access to the chrominance only, so its bandwidth can be reduced with negligible impact on the viewed pictures. This is used in PAL and NTSC colour coding systems and has been carried through to component digital signals both at SD and HD. For the professional digital video applications, the colour difference signals are usually sampled at half the frequency of the luminance – as in 4:2:2 sampling. There are also other types of component digital sampling such as 4:1:1 with less colour detail (used in DV), and 4:2:0 used in MPEG-2.
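The change between RGB and the component form Y, R-Y, B-Y can be sketched in a few lines. The luminance weights used here are the ITU-R BT.601 values for SD, which this guide does not quote, so treat them as an assumption (HD uses different weights). The outputs are the raw colour differences; the digital Cr and Cb are scaled and offset versions of them, which this sketch leaves out. Function names are illustrative.

```python
# Assumed BT.601 luminance weights for R, G and B (not stated in the text).
KR, KG, KB = 0.299, 0.587, 0.114


def rgb_to_components(r, g, b):
    """Convert RGB (each 0.0 to 1.0) to luminance and two colour differences."""
    y = KR * r + KG * g + KB * b
    return y, r - y, b - y


def components_to_rgb(y, r_y, b_y):
    """Recover RGB from Y, R-Y and B-Y; green comes from what is left of Y."""
    r = r_y + y
    b = b_y + y
    g = (y - KR * r - KB * b) / KG
    return r, g, b


# A neutral grey carries no colour difference at all.
print(rgb_to_components(0.5, 0.5, 0.5))
# A coloured pixel survives the round trip (within floating point error).
print(components_to_rgb(*rgb_to_components(0.2, 0.6, 0.9)))
```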
Co-sited sampling Where samples of luminance and chrominance are all taken at the same instant. This is designed so that the relative timing (phase) of all signal components is symmetrical and not skewed by the sampling system. Sampling is usually co-sited but there is a case of 4:2:0 sampling being interstitial – with chrominance samples made between the luminance samples. See also: 4:2:2 etc.
DTV Digital Television. This is a general term that covers both SD and HD digital formats.
Gamut (colour) The range of possible colours available in an imaging system. The red, blue and green phosphors on television screens and the RGB colour pick-up CCDs or CMOS chips in cameras, define the limits of the colours that can be displayed – the colour gamut. Between the camera and viewer’s screen there are many processes, many using component 4:2:2 video. However, not all component value combinations relate to valid RGB colours (for example, combinations where Y is zero). Equipment that generates images directly in component colour space, such as some graphics machines, can produce colours within the component range but that are invalid in RGB, which can also exceed the limits allowed for PAL and NTSC. There is potential for overloading equipment – especially transmitters, which may cut out to avoid damage! There is equipment that clearly shows many areas of out-of-gamut pictures, so that they can be adjusted before they cause problems.
HD High Definition Television. This has been defined in the USA by the ATSC and others as having a resolution of approximately twice that of conventional television (meaning analogue NTSC – implying 486 visible lines) both horizontally and vertically, a picture aspect ratio of 16:9 and a frame rate of 24fps and higher. This is not quite straightforward as the 720-line x 1280 pixels per line, progressive scan format is well accepted as HD. This is partly explained by the better vertical resolution of its progressive scanning. Apart from the video format, another HD variation on SD is a slightly different colorimetry where, for once, the world agrees on a common standard. As HD’s 1080 x 1920 image size is close to the 2K used for film, there is a crossover between film and television. This is even more the case if using a 16:9 window of 2K as here there is very little difference in size. It is generally agreed that any format containing at least twice the standard definition format on both H and V axes is high definition. After some initial debate about the formats available to prospective HD producers and television stations, the acceptance of 1080-HD video at various frame rates, as a common image format by the ITU, has made matters far more straightforward. While television stations may have some latitude in their choice of format, translating, if required, from the common image formats should be routine and give high quality results.
[Figure: 2K, HD and SD image sizes. 2K film 2048 x 1536; 1080-HD 1920 x 1080; 720-HD 1280 x 720; 576 and 480-SD 720 x 576 and 720 x 480]
See also: Common Image Format, Interlace Factor
PAL and NTSC PAL and NTSC do not exist in HD. They do not exist in modern SD digital television either – although they were digitised in early digital VTR formats. PAL means Phase Alternating Line and is an analogue system for coding colour that is still widely in use. Similarly NTSC (National Television Standards Committee) describes an analogue system. Confusingly PAL and NTSC are still used to describe frame rates and formats that relate in some way to their analogue world. So 1080 PAL might be 1080/50I.
Quantization Quantization refers to sampling: the number of bits used in making digital samples of a signal. For video, 8 bits is quite common in consumer and prosumer products such as DV. HDV also uses 8 bits. Note that the 8 bits can define 2^8, or 256, numbers or levels that, for converting analogue video into digits, are assigned to levels of image brightness. For more accuracy and to withstand multiple levels of complex post production processing, studio video applications often use 10-bit sampling – providing 1024 levels. Usually the distribution of the levels between brightest and darkest is linear (even) but in the case of scanning film negative for input to a digital intermediate chain, a logarithmic distribution is often used that progressively squashes the levels into the darker areas of picture. This is because film negative has to carry a very wide range of contrast information from the original scene, and the levels in the dark/shadow areas are more significant and visible than those in bright areas. The ‘log’ sampling suitably redistributes the available digital levels – hence 10-bit log. This is considered to be as useful as 13-bit linear quantization. NB: Quantization has another meaning. See section: Video Compression 1
RGB Red, Green and Blue. Cameras, telecines and most computer equipment originate images in this colour space. For digital sampling, all three colours are sampled in the same way at full bandwidth – hence 4:4:4. 4:4:4 images may offer better source material for the most critical chroma keying, but they occupy 50 percent more data space than 4:2:2 and, as no VTRs record 4:4:4, data recorders or disks must be used to store them. Also, there are no television means to connect them, so IT-based networking technology is used. Often 4:4:4 is only used in post production areas and is converted to 4:2:2 when material is more widely distributed. See also: 4:4:4, Gamut
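The level counts quoted under Quantization follow directly from the number of bits. The sketch below also shows a simple linear quantizer. It assumes the whole code range is used from black to white, which is a simplification made for illustration (broadcast video normally reserves some codes as headroom), and the function names are hypothetical.

```python
def levels(bits):
    """Number of distinct values that `bits` binary digits can describe."""
    return 2 ** bits


def quantize_linear(value, bits):
    """Map an analogue value in 0.0 to 1.0 onto the nearest integer code.

    Assumption: full-range linear coding, with 0.0 as code 0 and 1.0 as the top code.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie between 0.0 and 1.0")
    return round(value * (levels(bits) - 1))


print(levels(8), levels(10))        # 256 1024
print(quantize_linear(1.0, 8))      # top code of an 8-bit system
print(quantize_linear(1.0, 10))     # top code of a 10-bit system
print(quantize_linear(0.25, 10))    # a dark grey on a 10-bit scale
```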
Segmented Frame See: 24PsF
Square pixels Square pixels are the pixel aspect ratio where the pixels describe a square area of the displayed image. This is the case with HD broadcast standards, as the picture formats describe line length (number of pixels per line) and number of lines, in exact 16:9 ratios – which is also the display aspect ratio of the pictures. There are places in HD where pixels are not square. The very widely used HDCAM sub-samples the 1920-pixel HD line lengths with 1440 luminance samples. This is only an internal function of the recorder; the inputs and outputs use square pixels. In a similar way the 1080I HDV(2) format also uses 1440 samples per line. Generally, computers generate images with square pixels but digital SD television images are not square. This means that any applications or equipment used needs to take this into account when transferring between applications, or performing image manipulations to maintain correct image aspect ratios (so circles remain circular). See also: Anamorphic, Aspect ratio
Sub-sampling In a digital sampling system, taking fewer samples of an analogue signal than the number of pixels in the digital image is called sub-sampling. Generally sub-sampling is used to reduce the amount of data used for an image. In the widely used 4:2:2 sampling system for studio quality video, each luminance sample corresponds to one pixel – denoted by the ‘4’. The two chrominance signals are each sampled at half the rate, making one per two pixels. This is known as chrominance sub-sampling – a term that is sometimes more generally ascribed to the sampling ratios – such as 4:2:2, 4:1:1, etc. See also: 4:2:2 etc
System nomenclature A term used to describe television standards. The standards are mostly written in a self-explanatory form but there is room for confusion concerning vertical scanning rates. For example, 1080/60I implies there are 60 interlaced fields per second that make up 30 frames. Then 1080/30P describes 30 frames per second, progressively scanned. The general rule appears to be that the final figure always indicates the number of vertical refreshes per second. However, Table 3 (below) uses a different method. It defines frame rates (numbers of complete frames) and then defines whether they are interlaced or progressive. So here the ‘frame rate code 5’ is 30Hz which produces 30 vertical refreshes when progressive, and 60 when interlaced. Be careful! See also: Interlace, Progressive
Table 3 The video formats allowed for broadcast in the ATSC DTV standard are listed in Table 3 of document Doc. A/53A.
Table 3 Compression Format Constraints
vertical_size_value   horizontal_size_value   aspect_ratio_information   frame_rate_code   progressive_sequence
1080                  1920                    1,3                        1,2,4,5           1
1080                  1920                    1,3                        4,5               0
720                   1280                    1,3                        1,2,4,5,7,8       1
480                   704                     2,3                        1,2,4,5,7,8       1
480                   704                     2,3                        4,5               0
480                   640                     1,2                        1,2,4,5,7,8       1
480                   640                     1,2                        4,5               0
Legend for MPEG-2 coded values in Table 3
aspect_ratio_information: 1 = square samples, 2 = 4:3 display aspect ratio, 3 = 16:9 display aspect ratio
frame_rate_code: 1 = 23.976 Hz, 2 = 24 Hz, 4 = 29.97 Hz, 5 = 30 Hz, 7 = 59.94 Hz, 8 = 60 Hz
progressive_sequence: 0 = interlaced scan, 1 = progressive scan
This table lists no fewer than 18 DTV formats for SD and HD. Initially, this led to some confusion about which should be adopted for whatever circumstances. Now most HD production and operation is centred on the 1080-line formats either with 24P, 25P or 60I vertical scanning, and 720-line formats at 50P and 60P.
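The difference between the two naming methods described under System nomenclature can be made concrete. This sketch takes the frame rate codes from the Table 3 legend and converts a Table 3 style entry into the usual lines/refreshes form. The function names are illustrative and not part of any standard.

```python
# frame_rate_code values from the Table 3 legend, in complete frames per second.
FRAME_RATE_HZ = {1: 23.976, 2: 24.0, 4: 29.97, 5: 30.0, 7: 59.94, 8: 60.0}


def vertical_refreshes(frame_rate_code, progressive_sequence):
    """Vertical refreshes per second: one per frame if progressive, two if interlaced."""
    frames = FRAME_RATE_HZ[frame_rate_code]
    return frames if progressive_sequence == 1 else frames * 2


def label(lines, frame_rate_code, progressive_sequence):
    """Build the common shorthand, e.g. 1080/60I, from Table 3 style values."""
    scan = "P" if progressive_sequence == 1 else "I"
    refreshes = vertical_refreshes(frame_rate_code, progressive_sequence)
    return f"{lines}/{refreshes:g}{scan}"


print(label(1080, 5, 1))  # 1080/30P
print(label(1080, 5, 0))  # 1080/60I
print(label(720, 7, 1))   # 720/59.94P
```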
Truncation (a.k.a. Rounding) Reducing the number of bits used to describe a value. This is everyday practice; we may say 1,000 instead of 1024 in the same way we leave off the cents/pence when talking about money. There is also the need to truncate the digits used in digital video systems. With due care, this can be invisible; without it, degradation becomes visible.
Decimal: 186 x 203 = 37758
Binary: 10111010 x 11001011 = 1001001101111110
It is the nature of binary mathematics that multiplication, which is commonplace in video processing (e.g. mixing pictures), produces words of a length equal to the sum of the lengths of the two numbers. For instance, multiplying two 8-bit video values produces a 16-bit result – which will grow again if another process is applied. Although highways within equipment may carry this, ultimately the result will have to be truncated to fit the outside world which, for HD, may be a 10-bit HD-SDI interface or 8-bit MPEG-2 encoder. In the example, truncating by dropping the lower eight bits lowers its value by 01111110, or 126. Depending on video content, and any onward processing where the error is compounded, this may, or may not, be visible. Typically, flat (no detail) areas of low brightness are prone to showing this type of discrepancy as banding. This is, for example, sometimes visible from computer generated images. Inside equipment, it is a matter of design quality to truncate numbers in an intelligent way that will not produce visible errors – even after further processing. Outside, plugging 10-bit equipment into 8-bit needs care. Intelligent truncation is referred to as Rounding.
Universal Format 1080/24P is sometimes referred to as the Universal Format for television. The reason is its suitability for translation into all other formats to produce high quality results in all cases. See also: HD-CIF, Universal Master
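The multiplication example given under Truncation can be checked directly. The sketch below also contrasts plain truncation with one simple form of rounding (add half of the dropped range, then truncate). That rounding rule and the function names are assumptions for illustration; real equipment may use more elaborate schemes.

```python
def truncate(value, dropped_bits):
    """Discard the lowest `dropped_bits` bits of a non-negative integer."""
    return value >> dropped_bits


def round_off(value, dropped_bits):
    """Assumed rounding rule: add half of the dropped range, then truncate."""
    return (value + (1 << (dropped_bits - 1))) >> dropped_bits


product = 186 * 203                 # two 8-bit values give a 16-bit result
print(product, bin(product))        # 37758 0b1001001101111110
print(truncate(product, 8))         # upper eight bits: 147
print(product & 0xFF)               # value lost from the lower eight bits: 126

# The two methods differ when the dropped part is at least half of the range.
print(truncate(37800, 8), round_off(37800, 8))   # 147 148
```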
Universal Master The 1080/24P format has well defined and efficient paths to all major television formats and is capable of delivering high quality results to all. An edited master tape in this format is sometimes referred to as a Universal Master. See also: HD-CIF
Y, Cr, Cb This signifies video components in the digital form. Y, Cr, Cb is the digitised form of Y, R-Y, B-Y.
Y, R-Y, B-Y See component video
YUV De-facto shorthand for any standard involving component video. This has been frequently, and incorrectly, used as shorthand for SD analogue component video – Y, R-Y, B-Y. Y is correct, but U and V are axes of the PAL colour subcarrier which are modulated by scaled and filtered versions of B-Y and R-Y respectively. Strangely, the term is still used to describe component analogue HD. This is double folly. Although Y is still correct, all HD coding is digital and has nothing to do with subcarriers or their axes. So forget it!
Chapter 2
Video Compression: Concepts
Video compression reduces the amount of data or bandwidth used to describe moving pictures. Digital video needs vast amounts of data to describe it and there have long been various methods used to reduce this for SD. As HD needs up to six times more, about 1.2Gb/s or 560GB of storage per hour, the need for compression is even more pressing.
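Figures of this size can be reproduced with simple arithmetic. The sketch below is one set of assumptions that lands close to the numbers quoted: active picture only, 10-bit samples, 4:2:2 sampling (an average of two samples per pixel), 1080-line HD at 30 frames per second and 576-line SD at 25. The guide does not say how its figures were derived, so the parameters and function names here are illustrative.

```python
def video_data_rate(pixels_per_line, lines, frames_per_second,
                    bits_per_sample=10, samples_per_pixel=2):
    """Uncompressed bit rate in bits per second.

    samples_per_pixel=2 models 4:2:2: one Y per pixel plus Cr and Cb
    on every second pixel.
    """
    return (pixels_per_line * lines * frames_per_second
            * bits_per_sample * samples_per_pixel)


def storage_gb_per_hour(bits_per_second):
    """Storage for one hour in gigabytes (10^9 bytes)."""
    return bits_per_second * 3600 / 8 / 1e9


hd = video_data_rate(1920, 1080, 30)
sd = video_data_rate(720, 576, 25)

print(hd / 1e9)                 # about 1.24 Gb/s
print(storage_gb_per_hour(hd))  # about 560 GB per hour
print(hd / sd)                  # HD needs 6 times the SD rate
```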
Intro Compression – General Exactly which type and how much compression is used depends on the application. Consumer delivery (DVD, transmission, etc) typically uses very high compression (low data rates) as the bandwidth of the channels is quite small. For production and online editing much lighter compression (higher data rates) is used, as good picture quality needs to be maintained through all the stages leading to the final edited master.
Video compression methods are all based on the principle of removing information that we are least likely to miss – so-called ‘redundant’ picture detail. This applies to still images as well as video and cinema footage. This takes the form of several techniques that may be used together. Digital technology has allowed the use of very complex methods which have been built into low cost mass produced chips.
First, our perception of colour (chroma) is not as sharp as it is for black and white (luminance), so the colour resolution is reduced to half that of luminance (as in 4:2:2). This is used in colour television (NTSC, PAL and digital). Similarly, fine detail with little contrast is less noticeable than bigger objects with higher contrast. To access these a process called DCT resolves 8 x 8 pixel blocks of digital images into frequencies and amplitudes to make it possible to scale (down), or ‘quantize’, the DCT coefficients (frequencies and amplitudes) and so reduce the data. This applies to most of the digital video compression schemes in use today including AVR, DV, HDV, JPEG (but not JPEG2000) and the I frames of MPEG-1, 2 and 4, and Windows Media 9. A further reduction is made using Huffman coding, a purely mathematical process that reduces repeated data.
MPEG-2 and the more recent MPEG-4 add another layer of compression by analysing what changes from frame to frame, tracking the movement of 16 x 16-pixel macro blocks of the pictures. They can then send just the movement information, called motion vectors, that makes up predictive (B and P) frames and contains much less data than I frames, for much of the time. Whole pictures (I frames, more data) are sent only a few times a second. MPEG-2 compression is used in all forms of digital transmission and DVDs as well as for HDV. The more refined and efficient MPEG-4 is now being introduced for some HD services, and is set to become widely used for new television services.
Each of these techniques does a useful job but needs to be applied with some care when used in the production chain. Multiple compression (compress/de-compress) cycles may occur while moving along the chain, causing a build-up of compression errors. Also, as many compression schemes are designed around what looks good to us, they may not be so good in production, post production and editing. This particularly applies in processes, such as keying and colour correction, that depend on greater image fidelity than we can see, so disappointing results may ensue from otherwise good-looking compressed originals. See also: AVR, Component video, DV, DNxHD, Huffman coding, JPEG, JPEG2000, MPEG-2, MPEG-4
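The DCT and quantizing step described above can be sketched in plain Python. This is a minimal illustration, not how any particular codec is built: it uses the textbook orthonormal DCT-II on one 8 x 8 block and a single uniform step size, whereas real schemes use fast transforms and a different step for each frequency. The function names and the step value are assumptions.

```python
import math

N = 8  # block size used by the schemes named in the text


def dct_2d(block):
    """Forward 2D DCT-II (orthonormal) of an N x N block of sample values."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize_coeffs(coeffs, step):
    """Scale coefficients down by a uniform step and round to integers."""
    return [[round(value / step) for value in row] for row in coeffs]


# A flat block has no detail, so only the first (DC) coefficient survives.
flat = [[100] * N for _ in range(N)]
quantized = quantize_coeffs(dct_2d(flat), step=16)
print(quantized[0][0])                          # the DC term
print(sum(abs(v) for row in quantized for v in row) - abs(quantized[0][0]))
```

After quantizing, most coefficients of a low-detail block are zero, which is what makes the later lossless coding stage effective.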
Blocks See DCT
Codec Codec is short for coder/decoder – usually referring to a compression engine. Confusingly, the term is often misused to describe just a coder or decoder.
Compression-friendly Material that looks good after compression is sometimes referred to as ‘compression friendly’. This can become important in transmission where very limited data bandwidth is available and high compression ratios have to be used. Footage with large areas of flat colour, little detail and little movement compresses very well: for example, cartoons, head-and-shoulder close-ups and some dramas. MPEG-2 compression looks at spatial detail as well as movement in pictures, and an excess of both may show at the output as poor picture quality. This often applies to fast-moving sports – for instance football.
Poor technical quality can be compression unfriendly. Random noise will be interpreted as movement by an MPEG-2 or MPEG-4 encoder, so it wastes valuable data space conveying unwanted movement information. Movement portrayal can also be upset by poor quality frame-rate conversions that produce judder on movement, again increasing unwanted movement data to be transmitted at the expense of spatial detail. Such circumstances also increase the chance of movement estimation going wrong – producing ‘blocking’ in the pictures. Errors can be avoided by the use of good quality equipment throughout the production chain. Also, the choice of video format can help. For example, there is less movement to code in 25 progressively scanned images than in 50 interlaced fields, so the former compress more easily. The efficiency increase is typically 15-20 percent.
Compression ratio This is the ratio of the uncompressed (video or audio) data to the compressed data. It does not define the resulting picture or sound quality, as the effectiveness of the compression system needs to be taken into account. In studio applications, compression is usually between 2:1 and 7:1 for SD (uncompressed D1 and D5 VTRs are also available), whereas compression for HD is currently between about 6:1 and 14:1, as defined by the VTR formats, and is I-frame only. For transmission, the actual values depend on the broadcaster’s use of the available bandwidth but around 40:1 is common for SD and somewhat higher, 50 or 60:1, for HD (also depending on format). These use both I-frames and the predictive frames to give the greater compression. HDV records data to tape at 19-25 Mb/s – a rate comparable with HD transmission and a compression ratio of around 40:1, depending on the standard used. Transmission and video recorders in general work at a constant bit rate so, as the original pictures may include varying amounts of detail, the quality of the compressed images varies. DVDs usually work on a constant quality/variable bit rate principle, so the compression ratio slides up and down according to the demands of the material, to give consistent results. This is part of the reason why DVDs can look so good while only averaging quite low bit rates – around 4 Mb/s.
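As a rough check on the compression ratios quoted in the Compression ratio entry, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the uncompressed reference is the active picture only (no blanking, no audio) at 1920 x 1080, 25 frames per second, 10-bit 4:2:2. A different reference gives a different ratio, which is one reason quoted ratios vary.

```python
def uncompressed_rate_mbps(width, height, fps, bits, sampling="4:2:2"):
    """Active-picture video data rate in Mb/s (no blanking, no audio)."""
    # Average samples per pixel: one Y sample plus the shared Cr and Cb samples.
    samples_per_pixel = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5, "4:1:1": 1.5}[sampling]
    return width * height * samples_per_pixel * bits * fps / 1e6


def compression_ratio(uncompressed_mbps, compressed_mbps):
    """Ratio of uncompressed to compressed data rate."""
    return uncompressed_mbps / compressed_mbps


# Assumed reference: 1920 x 1080, 25 frames/s, 10-bit, 4:2:2
hd_rate = uncompressed_rate_mbps(1920, 1080, 25, 10, "4:2:2")
print(f"Uncompressed HD: {hd_rate:.0f} Mb/s")
print(f"Ratio when recorded at 25 Mb/s: {compression_ratio(hd_rate, 25):.1f}:1")
```

Under these assumptions the 25 Mb/s case comes out close to the 'around 40:1' quoted for HDV.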
DCT Discrete Cosine Transform is used as a first stage of many digital video compression schemes including JPEG and MPEG-2 and -4. It converts 8 x 8 pixel blocks of pictures to express them as frequencies and amplitudes. This may not reduce the data but it does arrange the image information so that it can be reduced. As the high frequency, low amplitude detail is least noticeable, its coefficients are progressively reduced, some often to zero, to fit the required file size per picture (constant bit rate) or to achieve a specified quality level. It is this reduction process, known as quantization, which actually reduces the data. For VTR applications the file size is fixed and the compression scheme’s efficiency is shown in its ability to use all the file space without overflowing it. This is one reason why a quoted compression ratio is not a complete measure of picture quality. DCT takes place within a single picture and so is intraframe (I-frame) compression. It is a part of the currently most widely used compression in television.
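The two steps described for DCT coding, transform then quantize, can be illustrated with a small pure-Python sketch. It is not any particular codec's implementation: the quantizing steps (a base step plus a slope that grows with frequency) and the test block are invented for this example, whereas real schemes use standardised or tuned quantization tables.

```python
import math

N = 8  # DCT block size used by JPEG and MPEG


def dct_2d(block):
    """Orthonormal 8 x 8 DCT-II of a block given as a list of rows."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, base_step=4, slope=4):
    """Divide each coefficient by a step that grows with frequency, then round.

    The step values are illustrative assumptions, not a standard table.
    """
    return [[round(coeffs[u][v] / (base_step + slope * (u + v)))
             for v in range(N)] for u in range(N)]


# A smooth horizontal ramp with a little fine texture added
block = [[16 * x + (3 if (x + y) % 2 else 0) for x in range(N)] for y in range(N)]
coeffs = dct_2d(block)          # same amount of data as the block: 64 numbers
quantized = quantize(coeffs)    # many high-frequency entries become zero
nonzero = sum(1 for row in quantized for q in row if q != 0)
print(f"{nonzero} of {N * N} coefficients survive quantization")
```

The transform itself still yields 64 numbers per block; the saving only appears once the small high-frequency coefficients have been rounded to zero.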
GOP Group Of Pictures – as in MPEG-2 and MPEG-4 video compression. This is the number of frames to each integral I-frame: the frames between being predictive (types B and P). ‘Long GOP’ usually refers to MPEG-2 and 4 coding. For transmission the GOP is often as long as half a second, 13 or 15 frames (25 or 30fps), which helps to achieve the required very high compression ratios.
[Figure: A typical group of pictures: I B B P B B P B B P B B I]
Cutting long GOP MPEG is not straightforward as its accuracy is limited to the GOP length unless further processing is applied – typically decoding. HDV uses long GOP MPEG-2 of 6 or 15 frames for HDV1 or HDV2 respectively, making it editable at 1/4 or 1/2 second intervals. A GOP of 1 indicates ‘I-frame only’ video, which can be cut at every frame without need of processing.
Studio applications of MPEG-2 have very short GOPs: Betacam SX has a GOP of 2 and IMX has 1 (i.e. I-frame only – no predictive frames), which means cutting at any frame is straightforward. Other formats such as DV, DVCPRO HD, HDCAM and D5-HD do not use MPEG but are also I-frame only.
See also: MPEG-2, MPEG-4
I-frame only (aka I-frame) Short for intra-frame only.
See also: AVR, Compression ratio, DV, JPEG, MPEG-2, MPEG-4
Inter-frame compression Video compression that uses information from several successive video frames to make up the data for its compressed ‘predictive’ frames. The most common example is MPEG-2 with a GOP greater than 1. Such an MPEG-2 stream contains a mix of both I-frames and predictive B and P (Bi-directional predictive and Predictive) frames. Predictive frames cannot be decoded in isolation from those in the rest of the GOP so the whole GOP must be decoded. This is an efficient coding system that is good for transmission but it does not offer the flexibility needed for accurate editing as it can only be cut at the GOP boundaries. It also requires estimation of the movement from picture to picture, which is complex and not always accurate – leading to ‘blockiness’. See also: GOP, MPEG-2, MPEG-4
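The effect of GOP length on where a long GOP stream can be cut without decoding can be sketched as follows. The helper names are hypothetical, and the sketch assumes a fixed GOP length with the first I-frame at frame 0, which real streams need not follow.

```python
def gop_duration_seconds(gop_length, fps):
    """Time covered by one GOP, i.e. the spacing of possible cut points."""
    return gop_length / fps


def nearest_cut_points(frame, gop_length):
    """GOP boundaries (frame numbers) at or either side of a requested cut.

    Assumes a fixed GOP length and an I-frame at frame 0.
    """
    before = (frame // gop_length) * gop_length
    after = before if before == frame else before + gop_length
    return before, after


print(gop_duration_seconds(6, 25))    # 6-frame GOP at 25 frames/s
print(gop_duration_seconds(15, 30))   # 15-frame GOP at 30 frames/s
print(nearest_cut_points(100, 15))    # long GOP: cut must move to a boundary
print(nearest_cut_points(100, 1))     # I-frame only: any frame is a boundary
```

With a GOP of 1 the requested frame is always its own boundary, which is the sense in which I-frame only video can be cut anywhere.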
Interlace A method of ordering the lines of scanned images as two (or more) interlaced fields per frame. Most television uses 2:1 interlacing; alternate fields of odd lines 1, 3, 5, etc., followed by a field of even lines 2, 4, 6, etc. This doubles the vertical refresh rate as there are twice as many interlaced fields as there would be whole frames. The result is better portrayal of movement and reduction of flicker without increasing the number of full frames or required signal bandwidth. There is an impact on vertical resolution and care is needed in image processing. See also: Interlace factor, Progressive
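The 2:1 line ordering described for interlace can be shown with a minimal sketch. The function names are invented for this example, and the frame is simply a list of lines with an even line count; timing between the fields is not modelled.

```python
def split_fields(frame_lines):
    """Split a frame (list of lines, line 1 first) into its two interlaced fields."""
    odd_field = frame_lines[0::2]    # lines 1, 3, 5, ...
    even_field = frame_lines[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Re-interleave two fields into one frame (assumes equal field lengths)."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.extend([odd_line, even_line])
    return frame


frame = [f"line {n}" for n in range(1, 9)]
odd, even = split_fields(frame)
print(odd)
print(even)
```

Weaving the two fields back together only reproduces the original frame exactly when nothing moved between the two field times.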
Interlace Factor Use of interlaced, rather than progressive, scans has no effect on the vertical resolution of still images. However, if anything in the image moves the resolution is reduced by the Interlace Factor, which may be 0.7 or less. This is due to the time displacement between the two fields of interlace which will produce detail that is jagged, line-by-line, during the movement and it appears as an overall slight softening of vertical resolution.
Intra-frame compression (a.k.a. I-frame compression) Video compression which takes information from one video frame only. This way, all the information to re-create the frame is contained within its own compressed data and is not dependent on other adjacent frames. This means that I-frame compressed video is easily edited as it can simply be cut at any picture boundary without the need for any decoding and recoding. I-frame only video can be edited and the result output as first generation material. Any other operations such as wipes, dissolves, mixes, DVE moves etc., can only be performed on the baseband signal, requiring that the video is first decompressed. See also: AVR, DV, JPEG, MPEG-2
Macroblock A 16 x 16 pixel block, comprising four adjacent DCT blocks – macroblocks are used to generate motion vectors in MPEG-2 coding. Most coders use a ‘block matching’ technique to establish where the block has moved and so generate motion vectors to describe the movement. This works most of the time but also has its well-known moments of failure. For example, slow fades to black tend to defeat the technique, making the resulting misplaced blocks quite visible. Better technologies are available for use in movement estimation, such as phase correlation.
Motion Vectors Used in MPEG-2 and MPEG-4 compression systems, motion vectors describe the direction and distance that macroblocks (16 x 16 pixels) move between frames. Sending this movement information requires much less data than sending an I frame, and so further reduces the video data.
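A minimal sketch of the 'block matching' idea behind motion vectors is given below. It is an exhaustive search using the sum of absolute differences as the match measure, which is one common choice rather than the method of any particular coder. For brevity it uses 4 x 4 blocks and tiny synthetic frames instead of 16 x 16 macroblocks, and all names are hypothetical.

```python
def sad(reference, current, ref_x, ref_y, cur_x, cur_y, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(reference[ref_y + j][ref_x + i] - current[cur_y + j][cur_x + i])
               for j in range(size) for i in range(size))


def find_motion_vector(reference, current, cur_x, cur_y, size=4, search=3):
    """Search the reference frame for the best match to one block of the current frame.

    Returns ((dx, dy), cost) where (dx, dy) is the offset from the block's
    position in the current frame to its best match in the reference frame.
    """
    height, width = len(reference), len(reference[0])
    best = (0, 0)
    best_cost = sad(reference, current, cur_x, cur_y, cur_x, cur_y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ref_x, ref_y = cur_x + dx, cur_y + dy
            if 0 <= ref_x <= width - size and 0 <= ref_y <= height - size:
                cost = sad(reference, current, ref_x, ref_y, cur_x, cur_y, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best, best_cost


def make_frame(patch_x, patch_y, width=12, height=12, size=4):
    """Black frame containing one textured patch at the given position."""
    frame = [[0] * width for _ in range(height)]
    for j in range(size):
        for i in range(size):
            frame[patch_y + j][patch_x + i] = 50 + 10 * j + 3 * i
    return frame


reference = make_frame(2, 3)   # patch position in the earlier frame
current = make_frame(5, 4)     # the patch has moved 3 right and 1 down
vector, cost = find_motion_vector(reference, current, 5, 4)
print(vector, cost)
```

When the picture content changes in brightness rather than position, as in a fade, no offset gives a low cost, which illustrates why such cases are troublesome for block matching.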
Progressive (scan) Sequence for scanning an image where the vertical scan progresses from line 1 to the end in one sweep. In HDTV there are a number of progressive vertical frame (refresh) rates allowed and used. 24Hz is popular for its compatibility with motion pictures and its ability to be easily translated into all of the world’s television formats. 25 and 30Hz correspond with existing SD frame rates (although they use interlaced scans). 50 and 60Hz are also allowed for, but, due to bandwidth restrictions, these are limited in picture size, e.g. 720/60P and 720/50P. Today, progressive scanning is most commonly found in computer displays and all the modern panel TV displays are progressive. Progressive images are rock steady, making the detail easy to see. For the equipment designer progressive images are easier to process as there is no difference between the two fields of a frame to contend with.
Progressive scans have the disadvantage of a slow vertical refresh rate. Thus, for the lower rates of 24, 25 and 30Hz, which can be used in HD television with the larger 1080-line formats, there would be considerable flicker on displays, unless there were some processing to show each picture twice (as in double shuttering in cinema projectors). Besides flicker, the other potential problem area is that of fast action or pans, as the lower refresh rate means that movement will tend to stutter. Interlace, with its two vertical refreshes per frame, has advantages here. See also: 24PsF, Interlace
Quantization Quantizing is the process used in DCT-based compression schemes, including AVC, JPEG, MPEG-2 and MPEG-4, to reduce the video data in an I frame. DCT allows quantizing to selectively reduce the DCT coefficients that represent the highest frequencies and lowest amplitudes that make up the least noticeable elements of the image. As many are reduced to zero, significant data reduction is realised. Using a fixed quantizing level will produce a constant quality of output with a data rate that varies according to the amount of detail in the images. Alternatively, quantizing can be varied to produce constant data rate, but variable quality, images. This is useful where the data must be fitted into a defined size of store or data channel – such as a VTR or a transmission channel. The success in nearly filling, but never overflowing, the storage is one measure of the efficiency of DCT compression schemes. NB: Quantization has a second meaning. See Video Formats section.
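The difference between fixed quantizing (constant quality, variable data) and varied quantizing (constant data, variable quality) can be sketched with toy numbers. Everything here is an assumption made for illustration: the coefficient lists are invented, the 'size' of a picture is taken to be its count of non-zero quantized coefficients, and the quantizer simply truncates towards zero.

```python
def quantize(coefficients, step):
    """Divide by the quantizing step and truncate towards zero."""
    return [int(c / step) for c in coefficients]


def size_of(quantized):
    """Toy size measure: the number of non-zero coefficients."""
    return sum(1 for q in quantized if q != 0)


def constant_quality(pictures, step):
    """Same step for every picture: the size follows the picture detail."""
    return [size_of(quantize(p, step)) for p in pictures]


def constant_rate(pictures, budget, max_step=256):
    """Raise the step per picture until its size fits the budget."""
    results = []
    for p in pictures:
        step = 1
        while step < max_step and size_of(quantize(p, step)) > budget:
            step += 1
        results.append((step, size_of(quantize(p, step))))
    return results


simple_picture = [200, 40, 10, 4, 2, 1, 0, 0]          # little detail
detailed_picture = [200, 120, 90, 60, 45, 30, 20, 12]  # lots of detail
pictures = [simple_picture, detailed_picture]

print(constant_quality(pictures, step=8))   # sizes differ, quantizing is the same
print(constant_rate(pictures, budget=4))    # sizes fit, quantizing differs
```

In the constant rate case the detailed picture needs a much coarser step to fit the same budget, which is where the variable quality comes from.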
3 Chapter 3
Video Compression: Formats
This is the practical side of compression showing the systems and formats that are used. Some are proprietary, in which case the company involved is mentioned.
AVC See MPEG-4
AVR AVR is a range of Motion-JPEG video compression schemes devised by Avid Technology for use in its ABVB hardware-based non-linear systems. An AVR is referred to as a constant quality M-JPEG resolution since the same quantization table (of coefficients) is applied to each frame of a video clip during digitization. For any given AVR, the actual compressed data rate will increase as the complexity of the imagery increases. For example, a head shot typically results in a low data rate while a crowd shot from a sporting event will yield a high data rate. To avoid system bandwidth problems, AVRs utilize a mode of rate control called rollback which prevents the compressed data rate from increasing beyond a preset limit for a sustained period. So, when the data rate exceeds the rollback limit on a given frame, high spatial frequency information is simply discarded from subsequent frames until the rate returns to a tolerable level. See also: DCT, JPEG
DVC DVC is the compression used in DV equipment that is standardised in IEC 61834. It is a DCT-based, intra-frame scheme achieving 5:1 compression so that 8-bit video sampling of 720 x 480 at 4:1:1 (NTSC) or 720 x 576 at 4:2:0 (PAL) produces a 25 Mb/s video data rate. The same is used for DV, DVCAM, Digital8 and DVCPRO (where PAL is PAL 4:1:1). It achieves good compression efficiency by applying several quantizers at the same time, selecting the nearest result below 25Mb/s for recording to tape.
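The 25 Mb/s figure for DVC follows from the sampling described in that entry, and can be checked with a short calculation. The function name is made up for this sketch, the frame rates (25 and 30000/1001 frames per second) are assumptions not stated in the entry, and only active picture samples are counted.

```python
def video_rate_mbps(width, height, fps, bits=8, samples_per_pixel=1.5):
    """Active-picture data rate in Mb/s.

    Both 4:1:1 and 4:2:0 average 1.5 samples per pixel:
    one Y sample plus a quarter-rate Cr and a quarter-rate Cb.
    """
    return width * height * samples_per_pixel * bits * fps / 1e6


pal = video_rate_mbps(720, 576, 25)             # 4:2:0, assumed 25 frames/s
ntsc = video_rate_mbps(720, 480, 30000 / 1001)  # 4:1:1, assumed 29.97 frames/s

for name, rate in (("720 x 576", pal), ("720 x 480", ntsc)):
    print(f"{name}: {rate:.1f} Mb/s before, {rate / 5:.1f} Mb/s after 5:1 compression")
```

Both line standards arrive at almost the same figure because the extra lines of one are offset by the higher frame rate of the other.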
DNxHD Avid DNxHD encoding is designed to offer quality at significantly reduced data rate and file size and it is supported by the family of Avid editing systems. Engineered for editing, it allows any HD material to be handled on SD-original Avid systems. Any HD format can be encoded, edited, effects added, colour corrected and the project finished. There is a choice of compression image formats to suit requirements. Some of the formats are:
Format        Bit depth   Frame rate   Data rate
DNxHD 220x    10 bit      29.97 fps    220 Mb/s
DNxHD 185x    10 bit      25 fps       184 Mb/s
DNxHD 185     8 bit       25 fps       184 Mb/s
DNxHD 145     8 bit       25 fps       135 Mb/s
DNxHD 120     8 bit       25 fps       120 Mb/s
Avid DNxHD maintains the full raster, is sampled at 4:2:2 and uses highly optimised coding and decoding techniques, so image quality is maintained over multiple generations and processes. When you’re ready, master to any format you need. DNxHD efficiency enables collaborative HD workflow using networks and storage designed to handle SD media. So, for example, Avid Unity shared media networks are HD-ready today! Cost-effective, real-time HD workflows can be designed with Media Composer Adrenaline HD and Avid DS Nitris systems. You can even edit HD on a laptop. For more information see www.avid.com/dnxhd/index.asp
H.264 See MPEG-4
Huffman coding A method of compressing data by recognizing repeated patterns and assigning short codes to those that occur frequently, and longer codes to those that are less frequent. The codes are assigned according to a Huffman Table. Sending the codes rather than all the original data can achieve as much as a 2:1 lossless compression and the method is often used as a part of video compression schemes such as JPEG and MPEG.
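How a Huffman Table gives short codes to frequent values can be sketched with Python's standard library. This is a generic construction, not the table format of JPEG or MPEG, which also combine Huffman coding with run-length coding of zeros. The sample data is invented to resemble quantized coefficients, where zero dominates.

```python
import heapq
from collections import Counter


def huffman_table(data):
    """Build a table {symbol: bit string} from the symbol frequencies."""
    counts = Counter(data)
    if len(counts) == 1:                     # degenerate case: a single symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code so far})
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)     # two least frequent groups
        w2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data, table):
    return "".join(table[symbol] for symbol in data)


def decode(bits, table):
    reverse = {code: symbol for symbol, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:               # codes are prefix-free
            out.append(reverse[current])
            current = ""
    return out


# Invented sample: mostly zeros, as after quantization
data = [0] * 40 + [1] * 10 + [-1] * 8 + [2] * 4 + [5] * 2
table = huffman_table(data)
bits = encode(data, table)
print(table)
print(f"{len(bits)} bits, against {3 * len(data)} bits at a fixed 3 bits per value")
```

Decoding recovers the data exactly, which is what makes this stage lossless.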
JFIF JPEG File Interchange Format – a compression scheme used by Avid Technology in its Meridien hardware-based non-linear systems. A JFIF M-JPEG resolution is termed constant rate since compressing clips of varying complexity results in a fixed data rate. Each JFIF resolution is defined by a target data rate and a base quantization table. When digitizing, the quantization table is linearly scaled (known as rolling Q) to conform the actual compressed data rate to the target rate. Due to the flexibility of this approach, imagery compressed by a JFIF resolution generally looks better than that compressed by an AVR of comparable average data rate.
JPEG Joint (ISO and ITU-T) Photographic Experts Group; JPEG is a standard for compressing still picture data. It offers compression ratios of between two and 100 times and there are three levels of processing available: baseline, extended and lossless encoding. JPEG baseline coding, which is the most common for television and computer applications, starts by applying DCT to 8 x 8 pixel blocks of the picture, transforming them into frequency and amplitude data. This itself may not reduce data but then the generally less visible high frequencies can be divided by a high quantizing factor (reducing many to zero), and the more visible low frequencies by a lower factor. The quantizing factor can be set according to data size (for constant bit rate) or picture quality (constant quality) requirements – effectively adjusting the compression ratio. The final stage is Huffman coding which is a lossless mathematical treatment that can further reduce data by 2:1 or more. Baseline JPEG coding creates .jpg files and is very similar to the I-frames of MPEG-1, -2 and -4, the main difference being that they use slightly different Huffman tables. See also: Compression, Compression ratio, DCT, DV, Huffman coding, JFIF, M-JPEG
www
http://www.jpeg.org
JPEG 2000 JPEG 2000 is an advanced image coding (compression) system from the Joint Photographic Experts Group. Like ‘normal’ JPEG, this is intra-frame compression and it is suitable for a wide range of uses from portable digital cameras, to scientific and industrial applications. Rather than using the established DCT, it employs state-of-the-art techniques based on wavelet technology. Requiring more processing than MPEG, JPEG 2000 has, until recently, been too costly for wide use in television applications. Now new chips have lowered the price barriers and JPEG 2000’s use in TV and D-cinema is expected to rapidly expand as it has distinct advantages for high quality large images. It is already recommended for D-cinema and Grass Valley have adopted it for HD compression in their new Infinity range of cameras. As it does not analyse images block-by-block but in a circular area-by-area pattern, there are no ‘blocky’ artefacts; instead problem areas tend to become a little softer – which is much less noticeable. JPEG 2000 continues to improve as more bits are used for the images. As a result, at high bit rates of 200-300Mb/s HD and D-cinema images are displayed with ‘visually lossless’ quality. It is also scalable, so image sizes different to the encoded size can be extracted directly without decoding.
www
http://www.jpeg.org
M-JPEG Motion JPEG refers to JPEG compression applied to moving pictures. As the detail contained within each frame varies, so some decision is required as to whether to use a constant bitrate scheme or constant quality. See also: AVR, JPEG
M-JPEG 2000 JPEG 2000 used for moving pictures.
MPEG Moving Picture Experts Group. A group of industry experts involved with setting standards for moving pictures and sound. These are not only those for the compression of video and audio (such as MPEG-2 and MP3) but also include standards for indexing, filing and labelling material.
www
http://www.mpeg.org
MPEG-2 ISO/IEC 13818-1. This is a video compression system primarily designed for use in the transmission of digital video and audio to viewers by use of very high compression ratios. Its importance is huge as it is currently used for nearly all DTV transmissions worldwide, SD and HD, as well as for DVDs and many other applications where high video compression ratios are needed. The Profiles and Levels table (below) shows that it is not a single standard but a whole family which uses similar tools in different combinations for various applications. Although all profile and level combinations use MPEG-2, moving from one part of the table to another may be impossible without decoding to baseband video and recoding.
Profile     Simple      Main        422P        SNR*        Spatial*    High
Sampling    4:2:0       4:2:0       4:2:2       4:2:0       4:2:0       4:2:0, 4:2:2
Frames      I, P        I, B, P     I, B, P     I, B, P     I, B, P     I, B, P
Level
High        -           1920x1152   -           -           -           1920x1152
                        80 Mb/s                                         100 Mb/s
High-1440   -           1440x1152   -           -           1440x1152   1440x1152
                        60 Mb/s                             60 Mb/s     80 Mb/s
Main        720x576     720x576     720x608     720x576     -           720x576
            15 Mb/s     15 Mb/s     50 Mb/s     15 Mb/s                 20 Mb/s
Low         -           352x288     -           352x288     -           -
                        4 Mb/s                  4 Mb/s
MPEG-2 profiles and levels. *SNR and Spatial are both scalable
Profiles outline the set of compression tools used. Levels describe the picture format/quality from High Definition to VHS. There is a bit rate defined for each allocated profile/level combination. In all cases, the levels and bit rates quoted are maximums so lower values may be used. Combinations applicable to modern HD are highlighted.
MPEG-2 is deliberately highly asymmetrical in that decoding is far simpler than the encoding – so millions of viewers enjoy reasonable prices while a few broadcasters incur the higher unit costs. Coding has two parts. The first uses DCT-based intra-frame (I-frame) compression and application of quantizing, to reduce the data – almost identically to JPEG. The second involves inter-frame compression – calculating the movement of macroblocks and then substituting just that information for the pictures between successive I-frames – making a GOP. The movement is conveyed as motion vectors, showing direction and distance, which amounts to far less data than is needed for I-frames. Motion vector calculation is not an exact science so there can be a huge difference in quality between different MPEG compressors. Decompression is deterministic so all decompressors (decoders) should be the same.
The encoding process necessarily needs to look at several frames at once and so introduces a considerable delay. Similarly, the decoder delays pictures. For transmissions this can add up to over a second. MPEG-2 is sometimes used on broadcast contribution circuits, where this delay becomes noticeable when news reporters appear to delay answering a question.
To fit HD video and audio down a transmission ‘data pipe’ requires very high compression. Uncompressed 10-bit HD requires up to 1244Mb/s. But this is 10-bit data and sampled at 4:2:2. MPEG-2 is 8-bit sampled at 4:2:0 – bringing the data down to 746Mb/s. However, the data pipes for ATSC (19.2Mb/s) or DVB (20Mb/s, depending on channel width, parameters etc.) imply the need for around 40:1 compression. See also: DCT, GOP, Intra-frame compression, Inter-frame compression, Macroblock
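The step from 1244Mb/s to 746Mb/s, and from there to 'around 40:1', is simple scaling and can be reproduced as below. The variable names are chosen for this sketch; the only inputs are the figures quoted in the MPEG-2 entry plus the sample counts implied by 4:2:2 and 4:2:0.

```python
full_rate = 1244.0            # Mb/s, the 10-bit 4:2:2 figure quoted in the text
bit_depth_factor = 8 / 10     # 10-bit samples reduced to 8-bit
chroma_factor = 1.5 / 2.0     # 4:2:0 averages 1.5 samples per pixel, 4:2:2 averages 2

mpeg_input_rate = full_rate * bit_depth_factor * chroma_factor
print(f"8-bit 4:2:0 input to the coder: {mpeg_input_rate:.0f} Mb/s")

for name, pipe in (("ATSC", 19.2), ("DVB", 20.0)):
    print(f"{name}: about {mpeg_input_rate / pipe:.0f}:1 to fit {pipe} Mb/s")
```

The results land a little under 40:1 for video alone; the pipe also has to carry audio and other data, so the video share is smaller and the ratio correspondingly higher.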
MPEG-4 MPEG-4 (ISO/IEC 14496) was developed by MPEG (Moving Picture Experts Group) and is a wide standard covering many subjects but its importance in television production is mostly related to its video compression scheme. MPEG-4 Part 10, AVC (Advanced Video Coding) and H.264 all refer to the same compression system. This is another DCT-based system that builds on MPEG-2 to produce a more efficient codec, again using intra and inter-frame techniques. Coding is more complex than MPEG-2 but it can produce extra data savings of around 30 percent – or more. Some of the latest television services are planned to use MPEG-4. This is especially true with HD where more bandwidth is required. It will enable the delivery of better image quality to viewers, or more channels to be delivered within a given bandwidth. It is said to be similar to, but not the same as, WM 9.
www
http://www.chiariglione.org/mpeg
VC-1 VC-1 is a video compression codec specification that is currently being standardised by SMPTE (SMPTE 421M) and implemented by Microsoft as Windows Media Video (WMV) 9 Advanced Profile. See: WMV 9
WMV 9 Windows Media Video 9 is a video and audio compression system (codec) developed by Microsoft. It is said to be similar to MPEG-4 AVC and to have as good or slightly better performance, giving lower data rates, and claims to be a less complex process. Its applications are seen as being for content delivery such as HD DVD.
4 Chapter 4
HD formats
Tape formats for high definition television now span a wide range of qualities and prices. These cater for the recording needs of digital cinematography, mainstream broadcast and programming and, most recently, the prosumer market. The latter is addressed by HDV and has enabled a huge expansion of HD use.
D5-HD This is an HD version of the D5 half-inch digital VTR format from Panasonic and has been widely used for HD mastering. It records on a standard D-5 cassette shell for over two hours with a wide selection of video formats: 1080/60I, 1035/60I, 1080/24P, 720/60P, 1080/50I, 1080/25P and 480/60I. It can slew a 24Hz recording to use the material directly in 25/50Hz applications – useful for European replay of movies. There are eight discrete channels of 24-bit 48kHz digital audio to allow for 5.1 and stereo mixes. Panasonic uses a proprietary compression scheme to reduce the raw HD-SDI 4:2:2 component digital video data rate of up to 1240Mb/s. The D5-HD compresses video 4:1 (8-bit mode) and 5:1 (10-bit mode).
www
http://www.panasonic.com
Also see HD VCR formats at:
www
http://videoexpert.home.att.net
D6 The D6 tape format uses a 19mm ‘D-1 like’ cassette to record 64 minutes of uncompressed HD material in most of the current HDTV standards. The recording rate is up to 1020 Mb/s and uses 10-bit luminance and 8-bit chrominance and records 12 channels of AES/EBU stereo digital audio. The only D6 VTR on the market is VooDoo from Thomson and it has been used in film-to-tape applications.
D7-HD See DVCPRO HD
DVCPRO HD (a.k.a. D7-HD and DVCPRO 100) This is the HD version of Panasonic’s DVCPRO VTR hierarchy. DV and DVCPRO record 25Mb/s; DVCPRO 50 records 50Mb/s; and DVCPRO HD records 100Mb/s. All use the DVC intra-frame DCT-based digital compression scheme and the 6.35mm (1/4-inch) DV tape cassette. In the recording format, video sampling is 8-bit, 4:2:2 and 1080I as well as 720P formats are supported. There are eight 16-bit 48kHz audio channels. The recording data rate means that considerable video compression must be used to reduce around 1Gb/s video and audio data. Video compression of 6.7:1 is quoted. A feature of the DVCPRO HD camcorder range is the VariCam that offers variable progressive frame rates for shooting from 4-60Hz in one-frame increments.
www
http://www.panasonic.com/pbds/index.html
HDCAM Sony’s HD camcorder version of the popular Digital Betacam. Introduced in 1997 at ‘near DigiBeta’ prices it was the first more affordable HD format. Now the expanded range includes still lower priced models. HDCAM defines a half-inch tape recording format. There is also a range of studio recorders and players as well as options for down conversion to SD. In the camcorder, the camera section includes 2/3-inch, 2.1 million pixel CCDs to capture 1080 x 1920 images. The lenses have compatibility with Digital Betacam products as well as accepting HD lenses for the highest picture quality. The recorder offers up to 40 minutes’ time on a small cassette making the package suitable for a wide range of programme origination, including on location. A series of steps, including 4.4:1 intra-frame compression, reduces the baseband video data rate to 140Mb/s. The format supports four channels of AES/EBU audio and the total recording rate to tape is 185Mb/s. HDCAM effectively samples video at 3:1:1 with the horizontal resolution sub-sampled to 1440 pixels. It fulfils many HD needs but is not an ideal medium for Blue Screen work. Video formats supported by HDCAM are: 1080 x 1920 pixels at 24, 25 and 30 progressive fps and at 50 and 60Hz interlace. Material shot at 24P can be directly played back into 50Hz or 60Hz environments. Also, the ability to play back at different frame rates can be used to speed up or slow down the action. See also: CineAlta
HDCAM SR HDCAM SR can record either 4:4:4 RGB or component 4:2:2 HD video at a net video rate of 440Mb/s. It uses mild MPEG-4 Studio Profile (ISO/IEC 14496-2:2001-1) ‘visually lossless’ compression and records onto 1/2-inch tape cassettes. The Studio Profile addresses the need for high resolution; it is I-frame only and so easy to edit, and scalable in its pixel count (SD and HD), bit depth (10- or 12-bit), and colour resolution (component or RGB). Its applications include high end HD recording, editing and as a mastering format. HDCAM SR is probably the highest quality HD tape recording system available. Practical recorders at any higher bit rate use hard discs or flash memory. Besides working at the 440Mb/s rate, called the SQ mode, HDCAM SR also offers an HQ mode with recording at 880Mb/s to offer lower compression 4:4:4 RGB or two 4:2:2 channels.
HDV HDV is a low cost system for shooting and recording HD. It defines video formats, a compression scheme and uses DV recording and familiar DV, or MiniDV, cassettes. HDV is available in two standards, HDV1 and HDV2, but, unlike DV, they use MPEG-2 long GOP compression to squeeze the HD video into DV-sized data. 4:2:0 colour 8-bit sampling is common to both standards. The two channels of 16-bit/48kHz audio are compressed (4:1) with MPEG-1 (Layer II) to 384 kb/s. HDV1 is a 1280x720 progressive scan format with frame rates of 60, 50, 30 and 25Hz. JVC’s ProHD adds a 24Hz frame rate. The luminance sampling rate is 74.25MHz. The video is compressed using MPEG-2 six-frame GOP compression to produce a recording data rate of just 19 Mb/s. In this standard a 63-minute MiniDV cassette records 63 minutes of HDV and, with critical data interleaved over all the recorded tracks, dropouts are minimised. HDV2 is a 1440x1080 interlaced scan format with frame rates of 60 or 50Hz. The data rate is 25Mb/s after applying MPEG-2 15-frame GOP compression. Note that the pixel count is not in the usual 16:9 pixel/line ratio, but the pictures themselves are. So here the luminance sampling rate is 55.7MHz and the pixels are not square but are stretched to an aspect ratio of 1.33:1. This is the same luminance sampling as is used in HDCAM.
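Several of the HDV figures follow from one another, and the relationships can be checked with a few lines of Python. The names are invented for this sketch, and it assumes that the 74.25MHz luminance rate corresponds to a full 1920-sample line, so that a 1440-sample line scales in proportion.

```python
from fractions import Fraction


def pixel_aspect_ratio(width, height, display_aspect=Fraction(16, 9)):
    """Shape of each pixel needed for a width x height raster to fill the display."""
    return display_aspect / Fraction(width, height)


# 1440 x 1080 shown as a 16:9 picture
par = pixel_aspect_ratio(1440, 1080)
print(f"Pixel aspect ratio: {par} (about {float(par):.2f}:1)")

# Luminance sampling scaled from an assumed 1920-sample line at 74.25 MHz
hdv2_sampling_mhz = 74.25 * 1440 / 1920
print(f"Luminance sampling: {hdv2_sampling_mhz:.1f} MHz")

# Two channels of 16-bit, 48 kHz audio, then 4:1 compression
audio_pcm_kbps = 2 * 16 * 48000 / 1000
audio_compressed_kbps = audio_pcm_kbps / 4
print(f"Audio: {audio_pcm_kbps:.0f} kb/s reduced to {audio_compressed_kbps:.0f} kb/s")
```

A 1280 x 720 raster passed to the same function gives a pixel aspect ratio of 1, i.e. square pixels.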
ProHD ProHD is JVC’s adaptation of the HDV 720P recording mode that adds 24-frame progressive scan (24P) – but not for the 1080-line format. This is useful for productions seeking a film look or wishing to output to film or D-cinema as it avoids the never-perfect process of deinterlacing. Apart from adding 24P, ProHD uses the same compression and bitstream format as HDV.
XDCAM HD Sony’s XDCAM HD records 1080I 4:2:0 HD at bit rates of 18, 25 and 35Mb/s onto Professional Disc media (Blu-ray). The 25Mb/s rate is a constant bit rate to give users a bridge to HDV, and the other two rates are variable. 18Mb/s allows for a recording time of two hours, and the other two allow for 90 and 60 minutes. Users can mix the different bit rates on the same disc. As with HDV, long GOP MPEG-2 compression is used.
5 Chapter 5
SD formats
Standard definition has a wide variety of digital tape formats to suit everyone from consumers to broadcast professionals. Recent trends include more compact formats and lower costs. Many of the HD tape formats have their roots in SD, including HDV that uses the widely used (SD) DV format.
D5 Introduced by Panasonic in 1994 this records uncompressed 625 and 525-line 4:2:2 10-bit component digital video onto the same half-inch cassettes as D3. Being component it uses in post production and, as it has lower costs than D1, is still in use today. The format also has provision for HDTV recording by use of about 4 or 5:1 compression (see HD-D5).
Digital Betacam D1 Digital tape format to record SD uncompressed 4:2:2 component digital 625 and 525-line video onto 19mm (3/4-inch) cassettes. Introduced by Sony in 1987 it was relatively expensive and used for high-end work where multi-generation quality needed to be maintained. It is not widely used today.
D2 Introduced in 1988 by Ampex, this records uncompressed digitised composite PAL or NTSC video onto 19mm (3/4-inch) cassettes. Although it used less data, and so less tape, than D1, and was good for analogue transmission replay, the signal suffered from all the original restrictions of PAL and NTSC. It was little use in modern post production and would have to be decoded for any digital transmission. The format is little used today.
Launched in 1993, ‘Digibeta’ superseded the analogue Betacam formats and costs much less D1. It provides good video and audio quality and run time up to 124 minutes. 720 x 576 or 720 x 480 4:2:2 component SD digital video is DCT-compressed to a bitrate of 90 Mb/s (about 2:1 compression) plus 4 channels of uncompressed 48 kHz PCM audio.
DV Launched in 1996, DV (IEC 61834) defines both the codec (video compression system) and the tape format for the first SD digital tape format for the consumer and prosumer markets. Features include intra-frame compression for straightforward editing, an IEEE 1394 interface for transfer to non-linear editing systems, and good video quality compared to consumer analogue formats. Variants include the DVCPRO series and DVCAM. Also, much of HDV has its roots in DV including the MiniDV tape, but not HDV’s use of MPEG-2 compression.
D3 Introduced by Panasonic, D3 is similar to D2 in that it records composite PAL or NTSC video onto cassettes, the D3 ones being 1/2-inch. It has the same benefits and drawbacks as D2 and is not widely used today.
DVCAM Introduced by Sony, DVCAM is a professional variant of the DV standard that uses the same cassettes as DV and MiniDV, the same compression scheme, but runs the tape through 50 percent faster making it more robust with fewer errors/dropouts.
DVCPRO (25 and 50) Panasonic created the DVCPRO range for professional applications of the root DV technology. Also known as DVCPRO 25, DVCPRO is identical to the DV format for recording, and uses a 25Mb/s recording stream. There are two tracks of 16-bit, 48kHz audio and video is sampled at 4:1:1 for both the 576/50I and 480/60I versions.
DVCPRO has a hierarchical structure that doubles the data rate. The next step up is DVCPRO 50 with 50Mb/s from the tape that allows reducing the video compression and the use of 4:2:2 sampling to give the better image quality required for studio production. Four 16-bit, 48kHz audio tracks are provided.
XDCAM Sony’s camcorder that uses Professional Disc media. It records Sony’s MPEG IMX (MXF compatible) format, 8-bit I-frame (only) MPEG-2 at 50, 40 or 30 Mb/s – claiming Digital Betacam quality with the highest bit rate. The rates give 45, 57, and 68 minutes recording time respectively. Some models can also record the 8-bit DVCAM format with 5:1 compression and 4:1:1 sampling for the 480/60I system (NTSC) and 4:2:0 for 576/50I (PAL) system. DVCAM recording time is 85 minutes.
HD-CIF See Common Image Format
P2 Solid-state recording system from Panasonic that records DV, DVCPRO and DVCPRO HD video onto flash memory to offer advantages of speed and reliability over tape, but at a high cost and with shorter run times. Currently available P2 cards offer up to 8GB storage – enough for about 40 minutes of DV, 20 minutes of DVCPRO 50, and 10 minutes of DVCPRO HD. But the random access and ‘loop’ recording possibilities mean this space is more useful than the equivalent length of tape. The workflow may include in-camera shot selection and very fast data dumping to hard disc storage for editing.
See also: MXF
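The P2 run times quoted above follow directly from card capacity and recording rate. As a rough sketch (the helper name is invented here, 1GB is taken as 10^9 bytes, and only the nominal video rates of 25, 50 and 100 Mb/s are counted, ignoring audio and file overheads):

```python
def recording_minutes(capacity_gb, rate_mbps):
    """Approximate run time in minutes for a card of the given capacity."""
    capacity_bits = capacity_gb * 1e9 * 8
    return capacity_bits / (rate_mbps * 1e6) / 60

# Nominal rates assumed for illustration.
for name, rate in [("DV", 25), ("DVCPRO 50", 50), ("DVCPRO HD", 100)]:
    print(f"{name}: about {recording_minutes(8, rate):.0f} minutes on 8GB")
# DV: about 43 minutes on 8GB
# DVCPRO 50: about 21 minutes on 8GB
# DVCPRO HD: about 11 minutes on 8GB
```

Real cards hold slightly less than this because audio and other data share the space, which is consistent with the rounded 40, 20 and 10 minute figures.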
Part Three
6 Chapter 6
Digital Film
Digital film In years gone by, many TV dramas, documentaries and ‘soaps’ were produced on film. Today, not only is that becoming increasingly rare, as HD and digital technology show many benefits in these areas, but the movies themselves are going digital. Shoots and cinemas may still use film but all the processes in between are increasingly digital. A number of movies have been shot on digital cameras, including blockbusters such as Sin City and the later Star Wars episodes, and the installation of digital cinemas is gathering pace. Digital film has many crossovers with television as well as its own standards and terminology.
10-bit log Widely used for digitising film material, this usually refers to 10-bit sampling of an image, which describes 2^10, or 1024, discrete numbers or brightness levels that have logarithmic scaling – rather than the linear scale that is always used in television. This highlights a major difference in the way that film and television material is shot. In film, the camera negative is designed to pick up as much detail as possible over a very wide brightness range of up to 11 stops – equivalent to a contrast ratio of over 2000:1 and capturing all detail from bright sunlit objects to down in the shadows. This gives latitude for later adjustments and grading before selecting the much more limited contrast range used for the release print that gives a punchy presentation at the cinema. In television it is always possible to see exactly what the images look like and so any adjustments and selections are made live while the camera is shooting. What you record is, essentially, what viewers see and this may be 8 stops – a contrast range of 256:1 – but it looks great at home.
So a television image and a film negative carry very different information. While 10 bits (linear) is usually plenty to smoothly resolve all the contrast levels in TV, a film negative needs about 13 bits (linear). However, as we can detect small brightness differences in darker areas and only larger ones in bright areas, assigning more digital levels to low light, and fewer to the highlights, is a more efficient way to use the available digital levels. This is what the ‘log’ sampling does. See also: Quantizing (Video formats, colour space and sampling)
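The relationship between stops, contrast ratio and digital levels can be put into numbers. The following is a simplified sketch, not the curve used by any particular film scanner: it assumes each stop is an exact doubling of light, that a ‘log’ scale shares its codes equally between stops, and that a linear scale maps peak brightness to the top code. The function names are made up for the example.

```python
def contrast_ratio(stops):
    """Each stop doubles the light, so n stops span a ratio of 2**n : 1."""
    return 2 ** stops

def codes_per_stop_log(bits, stops):
    """Idealised log scale: every stop gets an equal share of the codes."""
    return (2 ** bits - 1) / stops

def codes_in_darkest_stop_linear(bits, stops):
    """Linear scale: the darkest stop spans only 1/2**stops of the range."""
    return (2 ** bits - 1) / 2 ** stops

print(contrast_ratio(11))                     # 2048, i.e. 'over 2000:1'
print(contrast_ratio(8))                      # 256
print(codes_per_stop_log(10, 11))             # 93.0 codes for every stop
print(codes_in_darkest_stop_linear(10, 11))   # about 0.5 of a code
print(codes_in_darkest_stop_linear(13, 11))   # about 4 codes
```

On these assumptions, 10-bit linear coding leaves less than one code for the deepest stop of an 11-stop negative, whereas 10-bit log gives that stop the same 93 codes as every other stop; around 13 linear bits are needed before the shadows get even a handful of levels.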
2K This is a picture format generally used with images scanned from 35mm motion picture film, as well as a slightly different format for cinema exhibition. For the production side, it refers to 1536 lines each with 2048 pixels and describes a 4 x 3 aspect ratio picture. The sampling is 4:4:4 RGB with 10-bit log accuracy to carry the full sharpness and contrast detail of 35mm negatives. This is not a television format but 35mm film is commonly scanned to this resolution for use as ‘digital film’ for effects work and, increasingly, to input to DI for grading, cutting and mastering. For publishing in television, a 16:9 (1080 x 1920) or a 4 x 3 aspect ratio window can be selected from the 2K material for HD and SD distribution. The format is also suitable to support high quality transfers back to film or for direct D-cinema exhibition. Just as with film, not all the original image is shown on the screen. For digital projection, 2K refers to a size of 2048 x 1080 pixels, giving a wide aspect ratio display.
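The aspect ratios mentioned for 2K are easy to verify, along with the size of a full-width 16:9 window cut from the scanned frame. This is only an illustrative calculation; the helper names are invented for the example.

```python
from fractions import Fraction

def aspect(width, height):
    """Aspect ratio as an exact fraction, width : height."""
    return Fraction(width, height)

def window_lines(width, ratio):
    """Number of lines in a full-width window of the given aspect ratio."""
    return int(width / ratio)

print(aspect(2048, 1536))                    # 4/3, the 2K production frame
print(aspect(1920, 1080))                    # 16/9, the HD image
print(float(aspect(2048, 1080)))             # about 1.9, the 2K projection size
print(window_lines(2048, Fraction(16, 9)))   # 1152 lines of the 1536 scanned
```

A full-width 16:9 window therefore uses 1152 of the 1536 scanned lines, which is one way of seeing that not all of the original image reaches the screen.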
4K This is a digital film production image format of 3072 lines by 4096 pixels – four times the area of 2K. With each image producing about 32MB of data it requires a powerful workstation to play and process 4K footage in real time. Also the storage requirement is massive. Despite the current technical challenges, a small but increasing number prefer to work at 4K partly as it is seen as more future proof than 2K. Also some effects shots that have to be seamlessly re-inserted back into a 2K movie may be created at 4K. As the onward march of technology makes 4K easier and less costly to use, so it will become more widely used as a digital film mastering format alongside 2K.
CineAlta Sony’s name for its family of products that bridge cinematography and HDTV and includes HDCAM-based camcorders and studio VTRs as well as extending to whole production and post production systems. The more recent HDCAM SR series offers a more refined cine package with higher recording data rates and direct access to the original RGB images, rather than the ‘gamma corrected’ images used for television. See also: 24PsF
Dark chip See DLP Cinema
D-cinema and E-cinema D-cinema or Digital Cinema may involve the whole scene-to-screen production chain but it usually refers to the distribution and exhibition of cinema material, movies, by digital means. There are no hard-and-fast rules about what constitutes D or E-cinema but some say that, for D-cinema, images should be 2K size or bigger. Smaller HD or SD formats then fall into the E-cinema category. Nonetheless audiences have been generally impressed with the results from HD projections.
Digital presentations lack film weave, scratches, sparkles etc., to deliver a new standard of technical excellence to the cinema screen and, unlike film, quality is maintained regardless of the number of replays. Digital movies are distributed by disks or over networks rather than on 35mm film, which costs around $1000-2000 per copy, each of which lasts only about 200 passes through the projector. Copying and distribution of film prints costs studios an estimated $800 million per year. E-cinema is currently further developed than D-cinema and already has proven viable in a support role to the main features. It allows low cost production of local advertising and promotions as well as the flexibility to easily add any other TV-based content. Among the necessary technologies, the recent rapid development of high-resolution, large screen digital projectors has made digital cinema exhibition possible. These are based on three technologies: D-ILA, DLP and SXRD.
D-cinema standards have recently been recommended by Digital Cinema Initiatives. See also: DCI, DLP, D-ILA, SXRD
DCI Digital Cinema Initiatives was set up in 2002 by a group of major Hollywood studios to establish an open digital cinema set of standards that ensures a uniform high level of technical performance, reliability and quality control. The standard was completed in 2005 and is being implemented by various suppliers. Among a host of detail including security, its recommendations include 2K and 4K image formats and JPEG 2000 compression.
www.dcimovies.com
Digital Cinematography Digital Cinematography refers to the use of electronic cameras in shooting material for movies. A number of cameras have been designed specifically for this as alternatives to 35mm, including Viper (Thomson), CineAlta range (Sony) and DVCPRO HD (Panasonic). These produce HD formats, can run at 24P, capture a wider contrast range than TV cameras and do not use TV’s gamma correction curves. Origin (Dalsa) and D20 (ARRI) provide larger D-cinema sized images: Origin offers up to 4K and D20 3018 x 2200 active pixels. The D20 also offers frame rates from 1-60 fps. These cameras are designed as an alternative to 35mm movie cameras; however, any video camera could be used.
Digital Intermediate (DI) Digital Intermediate is a digital alternative to the traditional photochemical process that accepts original camera negative (OCN) and produces the internegatives that make the release prints of a movie. This has always included many stages of colour grading to match up all the shots seen in the final release print. DI is increasingly accepted as the preferable path as, depending on the system used, it can be instant, interactive, presented on a big screen, can have audio and allows any grading changes right up to outputting the internegative film from the graded and edited digital internegative. This way the grades are made on the edited material, complete with all effects shots, rather than looking at isolated individual shots. It is also possible to output fully graded whole reels, rather than applying further final grading when making the release prints.
DI starts with scanning the 35mm film. This is usually made at 2K size using 10-bit log RGB (4:4:4) sampling to carry all the sharpness and contrast detail, from highlights to deep shadows, from the film into the digits. The contrast latitude is needed to allow headroom for onward grading. If using footage from a digital cinematography camera, the scanning operation, which can be quite costly, is not needed.
D-ILA Direct-Drive Image Light Amplifier. A technology that uses a liquid crystal reflective CMOS chip for light modulation in a digital projector. In a drive for higher resolutions, the latest developments by JVC have produced a 2K (2,048 x 1,536) array, which is said to meet the SMPTE DC 28.8 recommendation for 2000 lines of resolution for digital cinema.
The 1.3-inch diagonal, 3.1 million-pixel chip is addressed digitally by the source signal. The tiny 13.5-micron pitch between pixels is intended to help eliminate stripe noise to produce bright, clear, high-contrast images. This is an efficient reflective structure, bouncing more than 93 percent (aperture) of the used light off the pixels. See also D-cinema
www.jvc.com/prof
DLP Digital Light Processing: Texas Instruments Inc digital projection technology that involves the application of digital micromirror devices (DMD) for television, including HD, as well as cinema (see DLP cinema below). DMD chips have an array of minute mirrors which can be angled by +/-10 degrees so as to reflect projection lamp light through the projection lens, or not. Since mirror response time is fast (~10 microseconds), rapidly varying the time of through-the-lens reflection allows greyscales to be perceived. For video, each video field is subdivided into time intervals, or bit times. So, for 8-bit video, 256 grey levels are produced and, with suitable pre-processing, digital images are directly projected.
The array, which is created by micromachining technology, is built up over conventional CMOS SRAM address circuitry. Array sizes for video started with 768 x 576 pixels – 442,368 mirrors, for SD. The later 1280 x 1024 DMD has been widely seen in HD and D-cinema presentations. Most agree it is at least as good as projected film. TI expect to offer an ‘over 2000-pixel wide’ chip in the near future. While much interest focuses on the DMD chips themselves, some processing is required to drive the chips. One aspect is ‘degamma’: the removal of gamma correction from the signal to suit the linear nature of the DMD-based display. Typically this involves a LUT (Look Up Table) to convert one given range of signal values to another. See also: Gamma
www.dlp.com
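The idea of dividing each field into bit times can be sketched in a few lines. This is a simplified model, not TI's actual drive scheme: it assumes binary-weighted bit times (each bit's interval twice as long as the one below it), a 20,000 microsecond field as in a 50Hz system, and function names invented for the example.

```python
def bit_times_us(field_time_us, bits=8):
    """Split one field into binary-weighted intervals, LSB first."""
    lsb = field_time_us / (2 ** bits - 1)
    return [lsb * (1 << b) for b in range(bits)]

def mirror_on_time_us(level, field_time_us, bits=8):
    """Total time a mirror reflects light through the lens for a grey level."""
    times = bit_times_us(field_time_us, bits)
    return sum(t for b, t in enumerate(times) if (level >> b) & 1)

FIELD_US = 20_000  # assumed: one field of a 50Hz system

print(round(bit_times_us(FIELD_US)[0], 1))          # 78.4, shortest bit time
print(mirror_on_time_us(0, FIELD_US))               # 0, mirror stays 'off'
print(round(mirror_on_time_us(255, FIELD_US)))      # 20000, 'on' all field
print(round(mirror_on_time_us(128, FIELD_US) / FIELD_US, 3))  # 0.502
```

In this model the shortest interval is around 78 microseconds, comfortably longer than the roughly 10 microsecond mirror response time, which is why 256 distinct on-times, and hence 256 perceived grey levels, are achievable within a field.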
DLP cinema This refers to the application of Texas Instruments’ DLP technology to the specific area of film exhibition. Here particular care is taken to achieve high contrast ratios and deliver high brightness to large screens. The development of ‘Dark chips’ has played an important part by very much reducing spurious reflected light from the digital micromirror devices. This has been achieved by making the chip’s substrate, and everything except the mirror faces, non-reflective. In addition, the use of normal projection lamp power produces up to 12 ft/l light level on a 60-foot screen. See also: D-cinema, DLP
OCN Original Camera Negative has very high value and is designed to hold a very wide contrast range. It is always handled with great care and, to avoid damage, as little as possible. The onward path toward making a programme involves either scanning the OCN and proceeding along the DI route, or copying to make an interpositive film, and so on into the photochemical intermediate chain.
HD RGB Television usually uses 4:2:2 component video (Y,Cr,Cb). Slightly higher quality can be achieved through using RGB sampled at 4:4:4. Many of the digital cinematography cameras offer this type of output that can use linear or log sample scaling. The 1080 x 1920 HDTV image format is very close to the 2K projected image size, so RGB HD can be considered as a TV/film crossover format, able to take advantage of many of the economies and speed of TV equipment to produce ‘film’ quality results.
ILA See D-ILA
SXRD Silicon X-tal Reflective Display (X-tal is short for crystal) is a digital projector display technology developed by Sony. Its first claim to fame was that it provided the first viable 4K (4096 x 2160 pixels) size, as incorporated in Sony SXRD projectors. The design of this reflective liquid crystal microdisplay also aims to provide enhanced contrast, speed allowing up to 200 fps with minimal image smear, and extended service life.
7 Chapter 7
Post production and editing
Post production and editing Shot selection and editing for film and video are now undertaken using nonlinear editing systems. Post production has grown immensely in importance with the advent of highly-capable online digital equipment and nonlinear editing. Now it is often cheaper to ‘fix it in post’ rather than spend extra time on another take on the set.
AAF Advanced Authoring Format. This is an industry-driven, open standard for the multimedia authoring and post production industries which is supported by many companies, including Avid. It is intended to enable content creators to easily exchange complete digital media – video, audio and metadata – across platforms and between applications. It simplifies project management, saves time and preserves valuable metadata that was often lost in the past during media transfers. It is in editing and post production areas that the metadata load is greatest and individual systems and applications have become isolated by incompatibilities, so limiting their interaction, interoperability and usefulness. Use of the AAF file format allows the passage of full information between AAF-enabled applications. Thus video, audio and metadata, with the decisions about how material has been manipulated (cuts, DVE, colour correction etc.) and assembled – a complete, modern-day EDL – can always be available and, where needed, accessed. The metadata also passes on existing, original information such as timecode or edgecode, ownership, previous editing etc. that helps with any later archive retrieval and versioning. See also: EDL, MXF, OMFI
http://www.panasonic.com
Blue screen Shooting items against a blue background or screen allows them to be cut out and keyed onto other backgrounds. The blue is normally chosen as being unique in the picture and not present in the foreground item to be keyed. This should enable easy and accurate derivation of a key signal used to cut out the object. Consideration may also be given to the colour spill onto the object’s edges. So, for example, if the object is set into a forest, maybe a green screen would be preferred. Modern colour correction and key processing allow a wider choice of colour and the possibility of correcting for less-than-perfect shoots. However, this will increase post production time and effort. The accuracy of the key signal derived from blue screen shots depends on the accuracy and resolution of colour information. In SD, the popular Digital Betacam or DVCPRO 50 records 4:2:2 sampled video using only 2:1 or 3:1 compression. Most HD recorders do not offer equivalent quality: with the 100-140Mb/s camcorders, restrictions in chrominance bandwidth can limit the effectiveness of HD keying. The notable exception is HDCAM SR, offering up to 440Mb/s with 10-bit 4:2:2 (4:4:4 possible too) sampling with ‘lossless’ compression.
Content Any material completed and ready for delivery to viewers. Content is the product of applying metadata to essence (for TV, video and audio). See also: Metadata
Chroma Keying The process of deriving and using a key signal formed from areas of a particular colour in a picture (often blue, sometimes green). See also: Keying
CSO Colour Separation Overlay. Another name for chroma keying. See also: Keying
Colour correction Historically this is the process of adjusting the colours in a picture so that they match those from other shots or create a particular look. Colour correction in HD and SD television has become highly sophisticated. This can include secondary colour correction that can be targeted at specific areas of pictures or ranges of colour. So, for example, a blue car in a commercial can be changed to red. Depending on equipment, operation can be real-time and interactive, enabling fine adjustments to achieve precise results in a short time.
Compositing (a.k.a. Vertical Editing) The process of adding layers of moving (or still) video to assemble a scene. This involves many tools such as DVE (sizing and positioning), colour correction and keying. As the operation frequently entails adding many layers, the work is best suited to nonlinear equipment using uncompressed video to avoid generation losses. Techniques are now highly developed and are a key part of modern production for both film and television – cutting production costs and bringing new possibilities and new effects.
DS Nitris DS Nitris is Avid Technology’s flagship effects and editing solution for HD and film resolutions. It was launched in September 2000 and based on the successful V4 release of DS (Digital Studio) code. The original version had no hardware acceleration and was entirely software based with the exception of input/output operations, but the Nitris DNA hardware offers powerful hardware acceleration, while still benefiting from the continuing development of faster CPUs. The system is well supported by nearly all plug-in manufacturers and is resolution-independent. It also supports the transparent import of multi-layered effect-based OMF files from products such as Avid Media Composer and Digidesign’s ProTools to provide an efficient link between off-line and on-line operations.
DTF/DTF2 Name for Sony’s half-inch Digital Tape Format which offers high data storage capacity (up to 200GB) on half-inch tape cartridges. Such stores are often used for storing digital video – such as HD – in post production areas, where they may be available to clients on a network.
EDL Edit Decision List. This is data that describes how material is to be edited, e.g. from offline to online, or a record of what happened in the editing process.
EDLs were devised before the days of nonlinear editing and were never updated to take on board any of the digital enhancements such as DVEs and advanced colour correction and keying. Even so, they remain in wide use as a well-recognised means of conveying the more basic editing decisions: cuts, dissolves, wipes, slo-mo, etc. Popular formats are CMX 3400 and 3600. More recently, new initiatives such as AAF and OMF offer the far wider capabilities needed for today’s production needs. OMF has become a de facto standard for transferring full decision data between offline and online operations. See also: AAF, OMF
Essence Term used to describe essential material which, for television, is what appears on the screen and comes out of the speakers – video, audio and text. Essence consists of those recorded elements that may be incorporated by means of editing, mixing or effects compositing into a finished programme (content). See also: Content, Metadata
Gamma (correction) Gamma describes the difference in the brightness transfer curve characteristics between video source devices, such as the CCDs in cameras, and the response of the display devices – usually considered to be cathode ray tubes. Gamma correction is normally applied early to the source video R, G, B signals as part of the processing in cameras. It is imposed here as it makes the video signal more impervious to atmospheric noise during ‘over-the-air’ analogue transmissions. However, the more recent use of other display devices – plasmas, LCDs and DLPs – with very different technologies and gammas means that they must again adjust gamma to match their transfer characteristics. For example, DLP technology uses Digital Micromirror Devices (DMDs) – millions of mirrors that are actually time-modulated. The amount of light they reflect onto the screen is a function of a duty cycle for time ‘on’. Thus, DLP systems program the display gamma for any given luminance level by adjusting the exposure time for that level through a Look Up Table (LUT).
Gamma corrected colours or components are annotated with a prime, e.g., R´, G´, B´, and Y´, Cr´, Cb´. As virtually all mentions in this document involve gamma corrected signals, the primes have not been included, for simplicity. See also: DLP
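A look up table of the kind used to adjust gamma for a linear display can be sketched as follows. This is a minimal illustration under stated assumptions, not the table used by any real projector: it assumes a simple power law with an exponent of 2.2, an 8-bit gamma corrected input and a 12-bit linear output, and the function name is made up for the example.

```python
def build_degamma_lut(gamma=2.2, in_bits=8, out_bits=12):
    """Table mapping gamma corrected codes to linear light codes.

    Assumes a plain power law; real systems use more elaborate curves.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round((code / in_max) ** gamma * out_max)
            for code in range(in_max + 1)]

lut = build_degamma_lut()

# Applying the LUT is a single indexed read per sample.
video_samples = [0, 64, 128, 255]
linear_samples = [lut[v] for v in video_samples]
print(linear_samples)
```

The output has more bits than the input because the power law squeezes the dark input codes very close together; with only 8 linear bits many of them would collapse onto the same output level.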
Grading Colour grading, also called colour correction, involves adjusting the colour of recorded footage. This is highly skilled work and depends on sensitive and very accurate adjustments. Traditionally, television has not had a use for grading as all cameras are matched to make a TV programme, but when shooting over several days, with isolated (iso) cameras, or simply using footage from a number of sources, grading becomes necessary so that all shots have the same colour look. Primary grading is applied to whole frames. Secondary grading involves adjusting the colour of a specific area of a picture. This could be to grade an object or to affect a specified range of colours – perhaps to change seasons by modifying the green leaves of spring to look like the hues of autumn. Defining the area to be changed may well involve using a key (see below).
Keying A general term for the process of placing an object or section of picture over another – as in keying text over video. This is a video version of matting in film but may use interactive tools and feature live operation. Operation splits into two areas, deriving the key signal and applying it to produce the keyed result. In HD’s high quality, big picture environment it is essential that keyed results are accurate and look convincing. Increasing use of compositing to add scenery, objects and actors to make footage that the camera never saw requires excellence in keying so that the keyed items look ‘photo-real’ – like a part of the original image. Keying tools have developed rapidly with the introduction of digital technology and online nonlinear editing. If working with electronically generated material, such as graphics or captions, the key signal is supplied along with the video. Otherwise sophisticated means are available to derive the key signal. Typically, objects are shot against a blue or green screen and that colour then defines the key signal. In reality the key colour spills onto the object so de-spill techniques are applied. The boundary between the object and background is often the subject of much effort. It is rarely a hard cut (hard key), which tends to look jagged and false, but a carefully set up dissolve to render a smooth, natural-looking edge (shaped or linear key). Further techniques are used to key semi-transparent material such as smoke, fog, and glass. Often this uses a non-additive mix technique which apportions foreground and background according to its luminance. The availability of highly developed digital keying techniques has been a large factor in swinging motion picture effects into the digital domain. Their excellence and efficiency has changed the way many are made, cutting costs by simplifying the shoot and avoiding some expensive location work. In digital systems, the key is a full-bandwidth signal (like Y, luminance), and is often associated with its foreground video when stored. Disk-based nonlinear systems can store and replay this video-with-key combination in one operation, but it would take two VTRs. See also: Blue Screen, 4:2:2:4, 4:4:4:4
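The two halves of keying, deriving the key and applying it, can be illustrated on a single pixel. This is a deliberately simple colour-difference sketch with hypothetical function names and an arbitrary gain value; professional keyers are far more refined. Colour values are assumed to run from 0.0 to 1.0.

```python
def derive_key(r, g, b, gain=2.0):
    """Key from a blue screen shot: 0.0 = background, 1.0 = foreground.

    'Blueness' is how far blue exceeds the larger of red and green.
    """
    blueness = b - max(r, g)
    return min(1.0, max(0.0, 1.0 - gain * blueness))

def despill(r, g, b):
    """Limit blue to the larger of red and green to remove blue spill."""
    return r, g, min(b, max(r, g))

def apply_key(foreground, background, key):
    """Linear key: a dissolve between the two pictures, set by the key."""
    return tuple(key * f + (1.0 - key) * bg
                 for f, bg in zip(foreground, background))

print(derive_key(0.0, 0.0, 1.0))   # 0.0, pure blue screen
print(derive_key(1.0, 0.0, 0.0))   # 1.0, red foreground object
print(despill(0.2, 0.3, 0.9))      # (0.2, 0.3, 0.3)
print(apply_key((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), 0.5))  # (0.5, 0.5, 0.0)
```

Key values between 0 and 1 occur along edges and in semi-transparent areas, and they are what give the soft, natural-looking boundary of a linear key rather than the jagged edge of a hard key.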
Media Composer This series of non-linear editing systems has formed the core part of Avid’s business over recent years. There are many permutations of hardware platforms, video cards and breakout boxes on both Apple Mac and PC formats. Seen as the de facto standard in editing for both on-line and off-line, Media Composer has tens of thousands of users worldwide and touches the vast majority of mainstream film and television production. See also: AVR
Metadata Metadata is data about data. Essence, or video and audio, is of little use without rights and editing details. This information also adds long-term value to archives. Metadata is any information about the essence, for instance how, when (timecode) and where it was shot, who owns the rights, what processes it has been, or should be, subjected to in post production and editing, and where it should be sent next. Uses with audio alone include AES/EBU with metadata to describe sample rate, also metadata in AC3 helps the management of low frequencies and creating stereo down-mixes. Typically the audio and video essence is preserved as it passes along a production chain, but the metadata is often lost. Avid with OMF and the AAF Association have both done much to rectify this for the area of editing and post production. See also: AAF, Essence, OMF
MXF Material eXchange Format is standardised in SMPTE 377M and supported by the Pro-MPEG Forum. It is aimed at the exchange of programme material between file servers, tape streamers and digital archives. It usually contains one complete sequence but this may comprise a sequence of clips and programme segments.
MXF is derived from the AAF data model, integrates closely with its files and so bridges the worlds of file-based and streaming transfers. It helps to move material between AAF file-based post production and streaming programme replay over standard networks. This set-up extends the reliable essence and metadata pathways so that both formats together reach from content creation to playout. The MXF body carries content, which can include MPEG, DV and uncompressed video, and contains an interleaved sequence of picture frames, each with audio and data essence, plus frame-based metadata.
www.pro-mpeg.org
Non-additive mix See Keying
OMFI Open Media Framework (OMF) or Open Media Framework Interchange (OMFI) is a platform-independent file format intended for transfer of digital media between different software applications and equipment. Besides sending the video and audio, the transfers can include metadata about the content and what editing and other processes it has been through. It is used by Avid products, Final Cut Pro, Pro Tools and others. It is the basis for the AAF.
Photo real Term to describe effects-generated material that looks as if it originated from a camera. This may apply to computer-generated objects or to items shot on camera and composed into the picture. Here, attention to detail such as shadows and reflections as well as keying are needed to maintain the illusion. Achieving such quality at HD and film resolutions is all the more demanding as their bigger, sharper displays make detail, including errors, easier to see. See also: Keying
Plug-ins A generic term for software applications that can be added to existing applications to enhance their functionality. Nonlinear video and audio systems are often expanded with new effects or functionality via plug-ins.
Symphony Avid’s Symphony is a pure editing and finishing tool with real-time effects processing which offers advanced primary and secondary colour correction, captioning and titles. Initially working only at SD, its universal mastering allows users to generate both 525/60 and 625/50 versions of an edit in real-time from a 24P master.
Symphony’s real-time uncompressed performance is extended to HD with Avid Nitris DNA hardware. Symphony Nitris systems combine Symphony’s full finishing toolset with Avid Nitris DNA hardware to provide real-time uncompressed HD and SD performance.
Timecode Timecode is a 24-hour frame-accurate reference of hours, minutes, seconds and frames (and fields) designed for television production use, for example 10:32:24:16. Typically it is recorded with the video and is the first reference when logging and editing. EDLs run on timecode. It is relatively straightforward in the 25/50Hz frame-rate world but gets a lot more complicated in the 30/60Hz world where, for historic reasons, the whole-number frame frequencies were offset by a factor of 1000/1001 – hence 29.97 and 59.94Hz. To keep the timecode in step with real time, roughly one frame number in every 1000 is skipped: at 29.97Hz, two frame numbers are dropped at the start of each minute, except every tenth minute. No pictures are discarded; only the numbering skips. This is accounted for in drop-frame timecode.
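The drop-frame numbering rule can be expressed as a short conversion from a running frame count to a timecode label. The function below is an illustrative sketch for 29.97Hz material only; its name is invented for the example, and the semicolon before the frames is the common convention for marking drop-frame timecode.

```python
def frames_to_dropframe(frame_count):
    """Label a 29.97Hz frame count with drop-frame timecode HH:MM:SS;FF.

    Two frame numbers are skipped at the start of every minute,
    except minutes 0, 10, 20, 30, 40 and 50.
    """
    frames_per_10min = 10 * 60 * 30 - 9 * 2   # 17982 real frames
    frames_per_min = 60 * 30 - 2              # 1798 in a 'dropping' minute

    tens, rem = divmod(frame_count, frames_per_10min)
    skipped = 18 * tens                       # 9 dropping minutes per block
    if rem >= 1800:                           # past the first, full minute
        skipped += 2 * ((rem - 1800) // frames_per_min + 1)

    n = frame_count + skipped                 # now count as if 30 per second
    ff = n % 30
    ss = n // 30 % 60
    mm = n // 1800 % 60
    hh = n // 108000 % 24
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"

print(frames_to_dropframe(1799))     # 00:00:59;29
print(frames_to_dropframe(1800))     # 00:01:00;02  (;00 and ;01 skipped)
print(frames_to_dropframe(17982))    # 00:10:00;00  (tenth minute, no skip)
print(frames_to_dropframe(107892))   # 01:00:00;00
```

The last line shows the point of the scheme: 107,892 frames at 30000/1001 frames per second last almost exactly one hour, and the timecode reads one hour, whereas counting 30 whole frames per second would show a time about 3.6 seconds short.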
The Lifecycle of a Project: Shooting Shooting HD on a budget – in practice
Shooting HD on a budget By Chris Jones
HD is here. In fact, it’s been here for some time, in various guises. But the big difference now is that it’s affordable, and the distribution technologies (such as digital cinema and HD screens in the home) are also appearing in the marketplace. So what does HD actually mean? What does it mean to you? How can you get the best balance between your needs and your cashflow?
Let’s look at formats first, as this is the first HD quagmire. First off let’s be clear. HD is a video format. It’s just higher resolution than what we have previously been exposed to. The cameras are now starting to carry onboard processing that produces aesthetically pleasing images that are closer to film, but it’s still different. At the higher end of the scale is HDCAM, which does not use a Firewire interface to get from the tape to an Avid Xpress Pro. So if you are on a budget, HDCAM is out of the equation. As we slide down the scale there is DVCProHD, an excellent format that, sadly, has failed to capture the consumer’s eye, partly because of its price tag. As we slide even lower, we reach HDV. If there was a prosumer HD format war (between DVCProHD and HDV), then HDV has won, mainly due to Sony and its aggressive pricing. I just bought myself an HDV Camcorder for £900, and now all my home movies are shot in HD! That is very cool. (In reality, DVCProHD is a much higher spec format that is more suited to higher end TV drama and low budget features). Given that HDV (HD) cameras can also shoot DV (SD, short for standard definition, which for us is PAL), in my view buying a DV camera now is a bad idea. This stance is strengthened when you consider that most cameras can shoot HDV and then internally down-convert to DV when playing out. So shooting everything in HDV makes sense.
So what are the pros and cons of HDV? Firstly, the cameras are newer technology, so it should be that little bit better. Everything from the audio encoding circuitry to the lens quality is improved over older versions, not to mention batteries being smaller and longer lasting. But what of the image technology? HDV has a resolution of 1920 by 1080 pixels, as opposed to the SD PAL resolution of 768 by 576 pixels. So you can see it produces roughly four times more image. That’s an awful lot more detail. And when you see it for the first time on an HD monitor, it’s quite staggering. So the advantages are clear. The disadvantages are less apparent though, and rumour and myth don’t help either. Looking at the facts, HDV uses the same data rate as DV; that’s 25 Mb per second. The internal circuitry of the cameras has very fast processors that can handle that kind of data compression on the fly, and it also reduces the amount of colour information in the signal. Most importantly it encodes using MPEG technology, which is something of a dark art as it doesn’t actually have defined fields and frames like standard DV. This means that it’s slightly more complicated in postproduction, though Avid now supports HDV even for frame-accurate edits. Such high compression can cause problems when shooting. It’s possible that when shooting material containing a lot of detail and movement, for instance panning around a stadium full of cheering people, you may see some image compression artefacts. It’s also possible that when panning around quickly, due to the way MPEG works, you may get some kicks or juddering in the image. It’s also been said that drop-out on the tape can be disastrous. This may be the case, but if you take care of your tapes, and use only new tapes, then this is unlikely to happen.
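To put numbers on the comparison, here is a small sketch using only the figures quoted in the text: the two picture sizes and the shared 25 Mb per second data rate. The 25 frames per second rate is an assumption (PAL-region working), and audio and tape overhead are ignored, so the bits-per-pixel values are only indicative of how much harder the HDV compression has to work.

```python
# Figures quoted in the text.
HD_WIDTH, HD_HEIGHT = 1920, 1080
SD_WIDTH, SD_HEIGHT = 768, 576
DATA_RATE_BITS_PER_SECOND = 25_000_000

# Assumption for illustration: 25 frames per second, video data only.
FRAMES_PER_SECOND = 25

hd_pixels = HD_WIDTH * HD_HEIGHT
sd_pixels = SD_WIDTH * SD_HEIGHT
pixel_ratio = hd_pixels / sd_pixels

bits_per_frame = DATA_RATE_BITS_PER_SECOND / FRAMES_PER_SECOND
hd_bits_per_pixel = bits_per_frame / hd_pixels
sd_bits_per_pixel = bits_per_frame / sd_pixels

print(f"HD frame: {hd_pixels} pixels, SD frame: {sd_pixels} pixels")
print(f"HD carries {pixel_ratio:.2f} times as many pixels")
print(f"Bits available per pixel: HD {hd_bits_per_pixel:.2f}, SD {sd_bits_per_pixel:.2f}")
```

The same data rate spread over several times as many pixels is why heavy detail and fast movement are the situations to watch.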
I come from the days of film when all sorts of problems could occur, so for every single shot we took, we would also do a second take, whether we needed it or not. We considered the job mission critical and the technology NEVER 100% reliable. In fact, the more I think about it, film and old TV cameras were extremely restrictive and limited, and the complaints I hear about
HDV are insignificant in comparison. All formats require care and attention when shooting, and HDV is no different.
All this brings me on to the primary problem with HDV. It’s nothing to do with Avid, Sony, Panasonic, the format... It’s the users. Years ago, film and TV was a very expensive and technical business. To be successful, film and programme makers needed to be diligent, professional and educated. Now, anyone can pick up an HDV camcorder and be shooting in moments, with quite good results. Switch everything to auto and point the camera. But this approach, for anything other than news coverage, can lead to disaster. In fact, the way the image is captured is less forgiving than film, and so even greater care and attention must be taken unless you want your footage to look like corporate video, or worse, home videos.
Not all cameras are the same, either. The newer HDV and DVCProHD cameras all have true 16:9 CCDs (the chips that capture the light from the lens) which, frankly, is about time! So shooting SD footage in 16:9 is now better than ever. However, there are a number of other issues that are less clear cut. First is progressive scan, one of the aspects that yields a ‘film look’. Some cameras can handle a true progressive scan, where others can’t. Ideally, if you want that single-frame (as opposed to dual-field) look, get a camera with true progressive scan. If not, you will need to shoot interlaced and create the look in postproduction.
The second issue is the camera and how it is set up to respond to light. Not all cameras have a great deal of control over how they interpret light coming in through the lens; in general, the cheaper the camera, the less control it has. The result is that, all too often, HDV tends to look a lot like hi-res news material, or at best a TV soap opera. Some cameras do offer control over the image, and with these you need to apply a set-up that captures quite flat, low-contrast images.
Detail in both the dark areas and the highlights is what will make your project look better once you get it into post. You can always add contrast, but it’s hard to remove it without seeing crushed blacks and burnt out whites.
Remember, what you shoot on set is not how it will look after you have edited and graded on Avid. The image can be graded and colour corrected hugely, as long as what you shoot contains the visual information in the first place. The next step is lighting and what you actually choose to shoot. If your frame contains very dark areas and very light areas, you may start to have problems - after all, you want to keep the detail in the light areas and dark areas as well. The upshot is that to shoot HDV properly takes more time and experience than it does to shoot film! Exposure is critical. It’s also best to avoid shots where the contrast is too high. The nightmare situation is where an actor is in shadow, but a light is pointing right into camera and burning out to 100% white. The actor in shadow is so underexposed that when it’s pulled up in post, the image has horrible milky blacks with image compression artefacts, and the whites look electronic and harsh. In short, it looks awful. The biggest giveaway to my eye is burnt out whites. So avoid bright skies, car headlights, film noir-like lighting (unless you can achieve the look with less contrast) etc. Of course, this isn’t always possible. As definition has increased to HD, so has the need for higher quality lenses. The lens that comes with any HDV camera is already pretty good, but you can make huge aesthetic improvements by using high quality prime lenses hooked up to a P + S Technic Mini 35 (of course you will need a camera that you can actually change lenses with). The Mini 35 works by mimicking the depth of field look you get on 35mm. Couple that with a prime lens that gives you a nice long shot, and you can shoot quite startling images where the subject is beautifully cut out from a soft background. Very filmy and not at all like video. This look is one of the main differences between an amateur video look and pro film look.
Video tends to hold everything in focus, and it can give a cluttered and unattractive image. The Mini 35 goes some way to reducing this. Another bizarre problem with HDV is that the cameras tend to be small, and some actors who see the ‘big camera’ as
their audience can be put off. Or worse, without noticing, they may play the scene at a lower energy level. The same is true of the crew. Of course, everyone will deny this, but I have seen it repeatedly. So get yourself some accessories such as a big tripod and head, a big matte box and follow focus, so that your small camera looks worthy of the effort of the cast and crew. A component of the project that is often forgotten is the sound. The capabilities of the digital audio tracks on an HDV camera are impressive. The problem isn’t the format, it’s the microphone (mic): how you choose to mic up for sound, and who is in charge of monitoring levels. You will need a camera with manual control over the recording level, then ideally (for drama) an external mixer like an SQN, calibrated to the levels on the camera, so that a separate sound recordist and boom swinger can take care of sound without interfering with the camera department. Of course this will mean your camera is permanently and umbilically connected to the sound recordist, but this is the most cost effective way of doing it. Your choice of mic is vital, as the mic on the camera should only really be used for grabbing guide tracks. It is not good enough to capture good dialogue unless the person is speaking directly to camera and projecting their voice. So while your camera can quite easily record excellent sound, getting that excellent sound to the camera in the first place is a much bigger job than you would imagine. Of all the things that are overlooked, sound is the biggest. Considering all the work that goes into a film project - the planning, the script writing, the casting etc - it’s astonishing to see filmmakers just throw around the tapes onto which EVERYTHING is committed! Tapes are not indestructible. Treat them with care and attention and label them clearly. Ideally make a backup of every tape on the day that you shoot (preferably in the cutting room every night).
It will only take one tape to get trodden on, dropped in coffee, get damp, left in the sun on a dashboard etc., to spell complete disaster. Make backups, and store those backups somewhere else, so that if your building burns down you still have your movie.
Don’t fall into the trap of thinking that shooting HDV is somehow as good as, or the same as, film. It is not. It is different and requires a different way of working. It has many advantages over film - less noise (grain), immediate playback, cheap running costs, sync sound, stable and sharp image (over Super 16mm) and so on - but on the downside, it has less latitude (that is, the amount of information between absolute black and absolute white), it does not have the same aesthetic as film (though this can be mimicked to some degree in post), and most subtly, it often does not command the same ‘presence’ on set. One last point about grading: To view HD properly, certainly for grading the image, you need an HD monitor capable of displaying exactly what you are getting in the image. Of course, most people cannot afford one, nor the hardware interface to drive it. Some people choose to hook up a 16:9 TFT monitor, such as the Apple Cinema display, but this doesn’t give a true and exact representation of the image in terms of colour, hues and contrast. You would be better advised to down-convert all your material to SD on the fly (for example using the Avid Mojo DNA box), hooking up a graded monitor and doing all your grading in that environment. Viewing SD while working in HD: It’s clear that ‘offline’ and ‘online’ is becoming a thing of the past!
Chris Jones is a director/producer and co-founder of Living Spirit Pictures Ltd, a UK-based film company which produces commercial feature films for the international market place, writes the Guerilla Film Makers series of books, and runs courses for film makers. www.livingspirit.com
The Lifecycle of a Project: The Camera Shooting HD on a budget
Shooting HD on a budget By Christina Fox and David Fox
HDV is to HDTV what DV is to Standard Definition: the cheapest way to produce reasonably high-quality pictures. If you use DV already, and want to move to high definition, HDV is the obvious upgrade path (there is an alternative, as we'll mention later).
Even if you don't expect to need HD soon or are using an older SD format, such as one of the Betacam variants, if you need a new camera then HDV is worth a look. The cameras are widescreen native (16:9) and can record very good pictures in SD, to DV or DVCAM (or downconvert HDV to SD), so there is no need for anamorphic adapters or aspect ratio converters. There are essentially three professional three-chip HDV camcorders: the JVC GY-HD100/101, the Sony HVR-Z1 and Canon's XLH1. There are also a few single-chip HDV models worth considering (especially as a back-up camera that can also be used as a low-cost play-in deck beside your editing system). There are also two varieties of HDV: interlaced or progressive, with 1080 or 720 lines. Interlaced is the traditional way of scanning video (it is how cathode ray tube TV sets work), which displays each TV frame as two fields made up of the odd-numbered and even-numbered lines. Repeating an image 50 times per second, or 50i, gives a very natural-looking motion blur, because that is close to how the eye works. PAL is 576 active lines at 50i, while NTSC is 480/60i. Progressive displays a full frame at a time, like film, and is how LCD screens work. If you want to make your video look like film, then cameras that can do 24p or 25p will have an advantage. The ideal, and what European broadcasters want for HD, is 50p, but this isn't part of the HDV standard. With HDV you currently get
either 1080 lines at 50i (Sony and Canon), or 720 lines at 24/25 or 30p (JVC). Both 1080i and 720p are being used for HD transmission by broadcasters, so all HD equipment should cope with either, although most HDTV-ready sets will initially be 720 native rather than 1080.
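The difference between interlaced and progressive scanning can be sketched in a few lines. The example below treats a frame as a simple list of picture lines and splits it into the two fields described above; the function names are hypothetical and a real frame would of course hold pixel data rather than labels.

```python
def split_fields(frame_lines):
    """Split a frame into its odd-numbered and even-numbered line fields."""
    odd_field = frame_lines[0::2]   # lines 1, 3, 5, ... (counting from 1)
    even_field = frame_lines[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into one full frame.

    Assumes both fields hold the same number of lines.
    """
    frame_lines = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame_lines.extend([odd_line, even_line])
    return frame_lines


frame = [f"line {n}" for n in range(1, 7)]  # a toy six-line frame
odd, even = split_fields(frame)
print(odd)   # one field, shown first
print(even)  # the other field, shown 1/50th of a second later at 50i
print(weave(odd, even) == frame)
```

A progressive camera captures all the lines at one instant, so there is nothing to weave; with interlaced material the two fields are captured at different moments, which is why a film look has to be created in post.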
HDV (and its problems) explained
HDV is an HD version of DV, but uses a different form of compression. In DV, each frame is individually compressed, making it easy to edit. To fit more picture information onto the same miniDV tape, HDV uses MPEG-2 long GoP encoding. This means that the camera does not record every frame of video as a full frame. It records occasional key frames and just enough other information to enable it to recreate the rest. The video between one key frame and the next is called a Group of Pictures (or GoP), which consists of three types of frames - the I-frame, P-frame and B-frame (Intra frame, predictive frame and bidirectionally predictive frame). A typical HDV GoP of 12 frames will be: I B B P B B P B B P B B, where only the I-frame holds the full picture information. In DV, a speck of dirt on the heads when recording could result in dropout problems on a single frame only. With HDV, dirt causing dropout on an I-frame could affect half a second of video, i.e. the whole GoP. To counter this, either use fresh, new tapes (particularly the higher quality tapes sold for HDV), or buy second-hand tapes from somewhere that cleans and assesses each one. We buy used tapes for about £1.50 for a 60-minute DV tape that has been erased, cleaned (removing any loose iron oxide) and evaluated for defects, including a printout of where they are. Defects are almost invariably within the first or last minute, as that is where tapes are fixed to the spool, so, whatever tape you buy, don't record anything valuable at the beginning or end. Long GoP was designed for transmission rather than editing, so it is not easy to edit it frame accurately. Fortunately the latest software, such as Avid Xpress Pro and Avid Liquid, can now edit natively in HDV with frame
accurate edits. HDV also has a further problem, in that it isn't great at dealing with fast-moving detail where every pixel can change from frame to frame. This can result in visible artefacts around the fine detail (typically blockiness). If the camera and/or the fine detail is moving, this can be lost in the motion blur, and might look OK. We haven’t noticed any nasty artefacts on anything we've shot on our Z1 or the other cameras we've used, and there are a few techniques to avoid them:
1. Keep your lens angle wide if going handheld, because tight angles accentuate any shakes, provoking artefacts. Otherwise use a tripod, monopod, Steadicam or other camera support.
2. Limit depth of field to throw the background out of focus (so there is less detail for the compression to deal with). To do this, you might need to invest in some external neutral density filters and/or graduated filters (alongside the camera's built-in filters), to reduce the light entering the lens so that you can widen the iris and reduce the depth of field. Cameras with small CCDs (these all have one-third inch chips) deliver more depth of field than those with large CCDs or 35mm film, so defocusing the background may require moving closer to the subject, where a wide-angle adapter may also prove useful.
3. Do any important camera moves at several speeds, to see which delivers the best results. Or do moves against less detailed backgrounds, such as blue skies or painted walls, so that the compression has fewer moving details changing between each frame.
4. Shoot more cutaways, so you have room to manoeuvre in the edit suite.
5. The various "film" modes (especially that on the Z1) seem to exacerbate any artefacts, so don't use them, particularly if you are not shooting true film style (all long, slow, fairly static shots).
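To see why dropout matters more with long GoP than with DV, the sketch below works through the 12-frame GoP pattern quoted earlier. It is a deliberately simplified model, not a decoder: it assumes the frames are in display order, that a damaged P-frame spoils the two B-frames before it and everything after it in the GoP, and that a damaged B-frame spoils only itself. The function name and the 25 frames per second rate are assumptions for illustration.

```python
GOP = "IBBPBBPBBPBB"  # the typical 12-frame HDV GoP quoted in the text
FRAMES_PER_SECOND = 25  # assumption: 25 frames per second


def frames_affected(damaged_index: int) -> int:
    """Estimate how many frames of the GoP are spoiled by one damaged frame."""
    frame_type = GOP[damaged_index]
    if frame_type == "I":
        # Every other frame in the GoP is rebuilt from the I-frame.
        return len(GOP)
    if frame_type == "P":
        # The two B-frames just before a P-frame refer forward to it,
        # and every later frame in the GoP builds on it.
        first_affected = damaged_index - 2
        return len(GOP) - first_affected
    # B-frames are not used as a reference by other frames in this model.
    return 1


for index, frame_type in enumerate(GOP):
    lost = frames_affected(index)
    print(f"frame {index:2d} ({frame_type}): {lost:2d} frames, "
          f"{lost / FRAMES_PER_SECOND:.2f}s")
```

In this model a hit on the I-frame costs all 12 frames, just under half a second at 25 frames per second, whereas the same speck of dirt on DV would cost a single frame.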
The two versions of HDV also differ: HDV1's 720 lines record at 19 Megabits per second, while HDV2's 1080 lines record at 25Mbps. Both are 4:2:0, which means that there isn't as much colour information as some other formats. For comparison, DV, DVCAM and DVCPRO 25 record at 25Mbps in 4:2:0 (or 4:1:1 for DVCPRO), DVCPRO 50 at 50Mbps 4:2:2, and DVCPRO HD at 100Mbps 4:2:2. The 1080-line HDV image is 1440x1080 pixels (the same as the CCDs used in many other HD cameras), but the pixels are rectangular with an aspect ratio of 1.333:1, which makes the 1440 equivalent to 1920 so you get 16:9 output.
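The pixel arithmetic above can be checked with a short sketch. The first part confirms that 1440 rectangular pixels at 1.333:1 fill a 1920-wide, 16:9 picture. The second part estimates how much data 4:2:0 sampling would produce before compression, assuming 8 bits per sample and 25 frames per second (both assumptions for illustration), to give a feel for how hard the 25Mbps compression is working.

```python
from fractions import Fraction

STORED_WIDTH, STORED_HEIGHT = 1440, 1080
PIXEL_ASPECT_RATIO = Fraction(4, 3)  # the 1.333:1 quoted in the text

display_width = STORED_WIDTH * PIXEL_ASPECT_RATIO
display_aspect = Fraction(display_width) / STORED_HEIGHT
print(f"Displayed width: {display_width}, aspect ratio: {display_aspect}")

# 4:2:0 keeps one Cb and one Cr sample for every 2x2 block of luminance samples.
luma_samples = STORED_WIDTH * STORED_HEIGHT
chroma_samples = 2 * (STORED_WIDTH // 2) * (STORED_HEIGHT // 2)
samples_per_frame = luma_samples + chroma_samples

# Assumptions for illustration: 8 bits per sample, 25 frames per second.
BITS_PER_SAMPLE = 8
FRAMES_PER_SECOND = 25
RECORDED_BITS_PER_SECOND = 25_000_000

raw_bits_per_second = samples_per_frame * BITS_PER_SAMPLE * FRAMES_PER_SECOND
compression_ratio = raw_bits_per_second / RECORDED_BITS_PER_SECOND
print(f"Uncompressed 4:2:0: {raw_bits_per_second / 1e6:.2f} Mbps")
print(f"Approximate compression ratio: {compression_ratio:.1f}:1")
```

The ratio ignores audio and other data on the tape, so it should be read as a ballpark figure only.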
The contenders
Let’s look at the three main cameras in the professional HDV range. They cost from approximately £3,000 to almost £6,000 each, although extras such as batteries will increase this.
SONY HVR-Z1
JVC was first to bring out an HDV camcorder, releasing its single-chip JY-HD10 in 2004. It found favour with some corporate producers wanting to display material on a large screen, but it wasn't until the end of 2004 that Sony brought out the first 3CCD model, the HDR-FX1. The FX1 delivers excellent pictures, but lacks the XLR sockets required of a professional camcorder, and is relatively light on features. If you shoot sync sound on an external recording device, it will save a few hundred pounds compared to Sony's XLR-equipped model, the HVR-Z1. Since shipping in Spring 2005, the Z1 has become the most popular HDV camcorder, and is now widely used by the BBC (although mainly as a widescreen DV camcorder).
Documentary evidence: Sony's HVR-Z1 camcorder
Anyone who has used a PD150 or PD170 will find the Z1 to be a natural progression. This is primarily a documentary maker's camera and records in HDV (1080i), DVCAM and DV formats.
Pros: Very nice lens, with good wide angle (4.5mm); ability to creatively manipulate white balance; easily switched from PAL to NTSC, useful if you have clients all over the world; and very good battery life.
Cons: No variable shutter, only stepped; can’t display peaking and zebra simultaneously; expanded focus only works in standby, not in record mode; not as good in low light as PD150/170 was; not designed for use on your shoulder; and no interchangeable lenses - although Italian producer/cinematographer, Matteo Ricchetti (www.eidomedia.com), bravely dismantled his FX1 to fit a universal lens mount, so it can be overcome. It's not worth using its CineFrame film-look mode as you can probably achieve better results in post.
JVC GY-HD100/101
JVC's 3CCD HDV camera, the GY-HD100, went on sale in August 2005. The European version, the GY-HD101, is slightly more expensive because it has full FireWire i/o which attracts an EU levy. However, it is worth paying the extra if you also want to use the camera as an editing deck. The HD100 and 101 take interchangeable lenses, a desirable addition - assuming there is money left in the budget to buy them. It is made from die cast aluminium and is pretty robust. Uniquely in this group, it has a proper adjustable shoulder mount, so, from a handling point of view, it should be more comfortable to operate. It can be a bit front-heavy, but if you opt for an IDX or Anton/Bauer battery option, it will help balance things out. If you are used to ENG-style cameras like the DSR500/570 then you'll feel at home with this camera and its lenses.
A Mini35 adapter can allow the HD100 (or the other low-budget HD cameras) to work with 35mm lenses.
Getting the cine look: HD100 with cine lens and matte box
Pros: Ergonomic; good for the film look (as the CCDs are progressive); lots of lens options; and you can store your menu settings on an SD card. For better quality you can access the camera's analogue HD component 720p output signal, but need to put this through an analogue HD to HD-SDI converter (which costs from around £750 to £2,000).
Cons: Standard (Fujinon 16x) lens (5.5mm) not as wide angle as Z1 and exhibits some colour aberrations at the edge of the frame; needs an upgraded battery as the standard battery lasts only about an hour.
Steady state: JVC's HD100 in use on a Sachtler Artemis stabilisation system
Canon XLH1
Canon's XLH1 would be an ideal choice for anyone equipping an HD studio or OB on a budget, as it is the lowest cost way of getting full 1080i HD-SDI 4:2:2 output. This bypasses the HDV compression, and can be plugged straight into an HD vision mixer. It also does SDI (Serial Digital Interface) for SD. It will deliver high quality pictures for live big screen presentations at concerts or conferences. The H1 can also be genlocked and have timecode synched with other cameras, so it should behave nicely in multicamera set ups. Its other big attraction is its interchangeable lens (although very few XL1/XL2 owners ever took advantage of that previously). It started shipping late 2005 in the US, and is due to arrive in Europe in early 2006.
Pros: PAL/NTSC switchable (via option, which also gives 60i, 24f and 30f); can shoot (and save on SD card) 1920x1080 still images; very good standard 20x lens; lots of other lenses available; reportedly good in low light; useful black stretch function compared to XL1/2 (although that can be done in post); creative white balance; four audio channels. It can also deliver 24 or 25 frames per second images, although it isn't true progressive - this is why it is called 24f or 25f, as it uses a Frame mode that, in the case of 24f, runs the CCDs at 48Hz, then de-interlaces using Canon's own method. From the few examples we've seen, it is certainly better than Sony's CineFrame, but may need further post-processing to achieve a film look.
Cons: Has a 2.4-inch LCD combining the job of viewfinder and LCD screen - not as easy to see as its two rivals when handholding the camera; doesn't sit particularly well on the shoulder (front heavy); standard lens only marginally wider (5.4mm) than standard HD101 lens; and the 24/25f images will only play back properly from the H1. A good deal more expensive than Z1.
One chip wonder
Back in black: the XLH1 is the most stylish HDV camera
Even lower budget HD: Sony's single-chip HVR-A1E
If you are on a more restricted budget, Sony has two smaller, single CMOS chipped HDV cameras, the "professional" HVR-A1E (with XLR inputs and costing as little as £1,300 plus VAT) and the HDR-HC1E (a consumer version, without XLRs). The A1 would also be ideal as both a back-up camera and to use as an edit deck as it costs less than dedicated HDV VTRs. Single CCD (Charge Coupled Device) cameras are not usually recommended for professional use because individual
CCDs don't deal well with multiple colours. They are designed to convert light into an electrical signal and you ideally need three – one each for red, green and blue light to give both good colour accuracy and high resolution. However, CMOS (Complementary Metal Oxide Semiconductor) sensors work differently. They allow more individual light sensors per square centimeter than CCD, and offer a wider dynamic light range, for better detail in both shadows and highlights, and are less susceptible to vertical smear. They offer higher resolution (and good multi-resolution handling) at lower costs, which is why they have become the mainstay of many high-resolution digital stills cameras, shooting at resolutions way above HDTV. In the A1, having a single chip removes the need for a bulky beam splitter (to split the light into red, green and blue), enabling it to be a lot smaller without compromising picture quality significantly. Certainly, if you have a limited budget and had been considering buying a second-hand DV camcorder, such as the Sony PD-150 or 170, the A1 is a better option, as it is widescreen, can record DV and DVCAM and produces surprisingly good results. However, even if you are on a tight budget, if you require an HDV camera for a specific project you can save money by buying a camera and then selling it upon completion; the depreciation of a Sony Z1, for instance, over two months will almost certainly be less than the cost of hiring it for the same length of time (especially if you buy it on a 0% credit card).
HD but not HDV
A budget alternative to HDV is Panasonic's new AG-HVX200, which is potentially better as it records using the DVCPRO HD format (as well as in DVCPRO 25 and 50). Its other interesting feature is that it records onto P2 cards. These are removable flash memory - which makes for a quicker workflow because there is no need to transfer from tape to computer hard drive for editing and they work very well with Avid. Unfortunately, P2 cards are very expensive (approx £1,000 for an 8GB card compared to a £2.75 miniDV tape). As P2 cards come down in price, and go up in capacity, this could be a very interesting camera. However, anyone using it seriously for HD should buy the upcoming Focus Enhancements' FireStore FS-100 hard disk recorder, which clips on the back and has native Pinnacle AVI and Avid OMF support, so you can plug it straight into your editor and start editing. There is also an FS-4 Pro HD for HDV camcorders. The HVX200 records 1080i and 720p (including, notably, 720/50p and 1080/25p). If it lives up to its promise, this will be a strong alternative to HDV. It will probably be March 2006, or later, before they go on sale - at least for the European version.
Card sharp: the HVX200 will record to P2 card, tape or add-on disk
Pros: DVCPRO HD is a more robust, less compressed format than HDV, and gives a full 4:2:2 signal, so it records more colour information, making it easier to use for effects work, such as bluescreen composites. The camera also promises the widest-angle standard lens (4.2mm) and four-channel 16-bit PCM audio. It also records DV to tape.
Cons: As expensive as the XLH1 if you buy a bundle with two 8GB P2 cards. Has only been seen as a prototype so far.
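The cost of P2 media quoted above is easier to judge as recording time. The sketch below is a back-of-envelope estimate, assuming a decimal 8GB card filled with nothing but 100Mbps DVCPRO HD video; audio and file overhead are ignored, so the real recording time will be shorter. The 60-minute length for the £2.75 miniDV tape is also an assumption, and the card, unlike a tape kept as a master, is meant to be reused.

```python
# Figures quoted in the text.
CARD_PRICE_GBP = 1000.0
TAPE_PRICE_GBP = 2.75
DVCPRO_HD_BITS_PER_SECOND = 100 * 10**6

# Assumptions for illustration.
CARD_BYTES = 8 * 10**9  # an "8GB" card taken as 8,000,000,000 bytes
TAPE_MINUTES = 60       # a standard-length miniDV tape

card_minutes = CARD_BYTES * 8 / DVCPRO_HD_BITS_PER_SECOND / 60
card_cost_per_minute = CARD_PRICE_GBP / card_minutes
tape_cost_per_minute = TAPE_PRICE_GBP / TAPE_MINUTES

print(f"8GB P2 card holds at most {card_minutes:.1f} minutes of DVCPRO HD")
print(f"Card: about £{card_cost_per_minute:.2f} per minute of capacity")
print(f"Tape: about £{tape_cost_per_minute:.3f} per minute")
```

Even as an upper bound, roughly ten minutes per card explains why a clip-on hard disk recorder is suggested for serious HD use.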
David Fox is a freelance journalist, scriptwriter, producer and director, and associate editor of TVB Europe magazine. Christina Fox is a broadcast consultant and trainer, and maintains the www.UrbanFox.tv website, which has more information on buying a low-budget broadcast camera, as well as a guide to all the other camera kit you might need to go with it.
The Lifecycle of a Project: Editing with HDV
Shooting HD on a budget
By Kevin Hilton
The fundamental form and principles of picture editing have remained pretty much the same since the technique was first developed during the early days of film. However material is cut, whether on film, linear video or a nonlinear digital system, the narrative and artistic principle is the same. Footage is assembled in order to tell the story, using specific cuts and devices for effect and to elicit an emotional response from the audience.
Like any other new acquisition technology, HDV, the high definition format that is recorded onto mini digital video tapes, does not change the basic philosophy of editing that was laid down in 1903 by Edwin Porter and was then built upon and turned into an art form by DW Griffith and Eisenstein. What today's filmmakers and editors have to bear in mind is that a format such as HDV will cause problems in the post-production process unless it is handled in a suitable form and domain. HDV does not lend itself well to the post-production process due to the technology that has made it attractive to low budget filmmakers, videographers, corporate productions and the education sector: the Long-GOP (Group of Pictures) MPEG2 compression that enables HD 24p pictures to get on the DV tape in the first place. Multiple copying and compositing contribute to image degradation, thus damaging a major selling point of HDV, the quality of its images. A long-term user of Avid editing systems is Swedish documentary maker Loui Bernal, who sums up HDV well: "It doesn't have the clarity of other formats, but it does have the detail." Avid has worked to make HDV a proper post-production tool. The general opinion is that the best way to edit HDV material is in its native form. Avid editing systems support native HDV editing at 1080i 50/59.94
(Sony/Canon) and at 720p 30 (JVC ProHD). That means real-time editing with no transcoding, real-time HD effects, and real-time Avid multicamera editing in HD. Working in native HDV avoids the need for lengthy transcoding to intermediate formats, which takes up large amounts of disk space. It also means no rendering is necessary, with editing, effects creation and compositing being done in real-time. Avid systems also allow HDV to be edited on the same timeline with other video formats, both HD and non-HD. Again, there is no need for rendering or transcoding to get material into the system. Clips are dragged and dropped onto the editing timeline in the usual way: it is possible to have two kinds of HD, for example native HDV and DVCPro HD, in the same file and edit the footage together. It is also possible to combine multiple HDV sources without first converting material to a common format. As ever, opinions differ as to the best way to deal with HDV, and which forms of the format give the best results for the different stages of the post-production process. A common view is that native HDV is fine for straightforward offline continuity editing, but when it comes to online functions such as colour correction, other formats are better suited to avoid generational loss. In such cases the commonly used argument is to convert HDV and any other acquisition format being used, be it P2 or HDCAM, to a common platform. Whatever the editing format chosen, there are many in the business who feel that it should be uncompressed. Despite modern broadcast and post-production being increasingly digital and moving towards high definition, standard definition SDI is still favoured as a solid platform on which to edit and finish material drawn from different sources. Among the useful little boxes available for this conversion process is Convergent Design's HD-Connect LE unit. This takes HDV and converts it to SDI for use on any NLE system.
Avid's position is that the HDV format is fine for simple cutting, but the format can hinder efficient working due to
the Long-GOP compression used to reduce bandwidth on the tape. Because of this the intervals between individual frames can be obscured, with the common problem of a decrease in image quality over a number of generations. Avid DNxHD encoding has been developed to keep the image quality as it was shot, and can be used in those sections of the programme where the post-production process involves multi-layer and multiple generation compositing, titles and graphics. Mark Dyson, founder and head of factual programmes at Creative Touch Films, is now shooting in HD and HDV to satisfy the demands of American broadcasters. Creative Touch Films produces documentary series in which travellers and explorers journey to remote locations, very often enduring extreme cold or heat. Dyson is a long-term user of Avid systems and now works on Avid Xpress Pro. "In the early days of HD and HDV I felt it was necessary to online edit using Avid Media Composer Adrenaline," he says. "Now the software for Xpress Pro HD has been finalised the whole process can be kept within the desktop domain." Problems continue to exist, even though HD and HDV give the impression of being mature technologies. HDV is barely three years old and is continuing to develop from its consumer/prosumer beginnings and offer an even more convincing high-end product. Monitoring of HD in general is a particular stumbling block, as Loui Bernal points out: "We've been able to work online with other formats but with HDV, nothing worked. When you're using high definition you can't watch it back, which is a problem when you're doing colour correction.” Wailing Banshee is a UK production company that has been working with HDV for approximately eight months, and is therefore familiar with all the arguments. Founded in 1997 by video producer and director David Baumber, Wailing Banshee specialises in digital production, with an animation division in New York. 
In the UK approximately 80 per cent of the company's work is in corporate production for blue chip clients such as Unilever, Lipton and BT. The remaining 20 per cent comprises commercials, largely for theatrical release.
Both the corporate and commercials sectors are demanding higher quality and resolution, with the latest effects and processing. Baumber comments that the requirement is to maintain as high a quality as possible; DigiBeta has been a favoured format and now Wailing Banshee is making a transition to HD and HDV. "Corporate clients are now beginning to hire cinemas for their presentations, it's not always in a hotel function room these days," Baumber observes. "DigiBeta is good but higher resolution formats are so much better, partly because it means we can use a lot of chromakeying."
In the relatively short time that Wailing Banshee has been using HDV, Baumber says clients have been "amazed" at the quality of the finished work. "When we put footage on the internet it looks as though we went out with a 35mm film camera," he comments. "The cameras are very flexible. We've been using the JVC, and for TV and presentations it looks like 16mm."
Baumber trained at the BBC as an Avid editor and so has long experience of the business of editing and getting material in a form that can be cut. For some time Baumber has worked with Avid Liquid Chrome, following it through from its early days as a Pinnacle product to its present position as part of the Avid editing family. Material is fed from the camcorder into Liquid Chrome through the TARGA 2000, and now the 3000, video capture card, using Motion JPEG encoding. In Baumber's experience, having the right interface to get material into the editing system is crucial. "Pinnacle had its own interface and it was nice but it was never quite right," he comments. "But TARGA is so powerful and when Liquid Chrome was packaged with TARGA everything went to a new level. We're now using Chrome HD and in terms of the graphical interface there's no difference between Chrome and Chrome HD. We're using it for both offline and online, sometimes three to four projects at the same time."
Material is fed into the editing systems over FireWire links, which Baumber sees as very easy to set up. "The nice thing
about the Liquid interface is that it is self-explanatory about how to get things in and out of the system," he says. "From the control panel you can set up the disks to receive whatever input you're working with." As for loading in HDV, Baumber says, "It's no different from any DVCAM - there's nothing mysterious about it, although a lot more data is captured at the same speed."
When it comes to the edit, Baumber also says there is very little difference between working with HDV and DV. "There are more megabits in the signal, so storage is something to bear in mind, but there is no real difference in the editing," he says.
Wailing Banshee uses either eight Ultra 320 SCSI drives in an EonStor cabinet, striped as a single drive, or eight SATA drives. The eight Ultra 320s give 2TB of storage and are used for commercials work, which demands uncompressed video, while the SATAs give 250GB of space and tend to be used more for corporate projects, where compressed DV is common and is less of an issue than in TV.
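Baumber's point that storage is "something to bear in mind" can be put into rough numbers. The sketch below is only an illustration under stated assumptions: the 720 x 576, 25 frames per second, 8-bit 4:2:2 raster and the nominal 25 Mbit/s compressed stream are assumed figures, not values quoted by Wailing Banshee, and the function names are hypothetical. Audio, file-system overhead and striping losses are ignored.

```python
def uncompressed_422_rate(width, height, fps, bits_per_sample=8):
    """Bytes per second for uncompressed 4:2:2 video.

    In 4:2:2 every pixel has a Y sample, and Cr and Cb are each sampled
    on every second pixel, so the average is 2 samples per pixel.
    """
    samples_per_pixel = 2
    bytes_per_frame = width * height * samples_per_pixel * bits_per_sample / 8
    return bytes_per_frame * fps


def hours_of_footage(capacity_bytes, bytes_per_second):
    """How many hours of material fit in the given capacity."""
    return capacity_bytes / bytes_per_second / 3600


# Assumed SD raster: 720 x 576 at 25 frames per second, 8 bits per sample.
sd_rate = uncompressed_422_rate(720, 576, 25)

# Assumed nominal compressed stream of 25 Mbit/s (video only).
compressed_rate = 25_000_000 / 8

# Capacities taken from the text, read as decimal terabytes/gigabytes.
print(f"Uncompressed SD: {sd_rate / 1e6:.1f} MB/s, "
      f"{hours_of_footage(2e12, sd_rate):.1f} hours in 2TB")
print(f"25 Mbit/s stream: {compressed_rate / 1e6:.1f} MB/s, "
      f"{hours_of_footage(250e9, compressed_rate):.1f} hours in 250GB")
```

On these assumptions the uncompressed stream needs roughly 20.7 MB/s against about 3.1 MB/s for the compressed one, which is why the larger array is reserved for uncompressed commercials work.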
Baumber says that once the footage is in the system, HDV gives "extra space to play with" if the final output is to be in SD. "That's largely why we're using HD at the moment," he explains. "It's not because we want to create HD projects but because the technology gives us more flexibility for what we do in SD. For example, a shot of a person might have been framed from the waist up but in the edit we may want just the face. With HDV we can zoom in and not lose definition."
Editing on HDV takes place in native format and Baumber is looking ahead to working with it on a Media Composer Adrenaline system. "As Media Composer Adrenaline is a 10-bit system we'll be able to get even better results, particularly for our TV work." As for editing HDV in general, Baumber observes, "It's a strange medium. The situation is similar to when we moved from Betacam to Digicam. It's better quality and because of that you have to cope with a higher data rate and use high specification software for editing and outputting."
Baumber agrees that the fundamental principles and forms of editing remain the same whether material is on film, DigiBeta, DV, HDCAM or HDV. "We let Avid decide what goes into the editing machine," he says. "What we and other users have to worry about is our own creativity." Where Baumber sees the new HD formats as coming into their own is for the processes involved in online editing. "When it comes to compositing, using HD technology for SD production is a godsend - it's really beautiful at the moment," he says. "Selfishly, I don't want anyone to buy HD televisions because we will eventually lose the opportunity and ability to use HD for SD, and we'll have to learn new tricks!"
As new, faster and higher resolution film stock came into use, followed by the transition to video tape and then the successive development of analogue and digital formats, it is likely that editors and directors over the years have had similar feelings - that their knowledge base will have to develop and change. HD transmission will bring an end to some things, but the fundamentals remain and editing is one of them, regardless of format.
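The reframing headroom Baumber describes above can be estimated with a simple ratio of picture heights. This is a rough sketch, not a description of how any Avid product scales images: the line counts (1080 and 720 for HDV, 576 and 480 for SD) are assumed from the common rasters, the function names are hypothetical, and interlace, pixel aspect ratio and compression artefacts are all ignored.

```python
def max_punch_in(source_lines, target_lines):
    """Largest zoom factor before the crop has fewer lines than the output."""
    return source_lines / target_lines


def crop_height(source_lines, zoom):
    """Number of source lines left in the picture after zooming in."""
    return source_lines / zoom


# Assumed active picture heights for the HD source and the SD deliverable.
for hd_lines in (1080, 720):
    for sd_lines in (576, 480):
        factor = max_punch_in(hd_lines, sd_lines)
        print(f"{hd_lines}-line source to {sd_lines}-line output: "
              f"up to {factor:.3f}x without upscaling")
```

A 1080-line source delivered at 576 lines, for example, allows a punch-in of up to 1.875x before the cropped area has to be scaled up, which is the "extra space to play with" when tightening a waist-up shot towards the face.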
Contact Kevin Hilton at [email protected]