Journal of Cancer Research and Therapeutics, Vol. 5, No. 2, April-June, 2009, pp. 121-123
The great debate: Stroboscopy vs high-speed imaging for assessment of alaryngeal phonation
Kazi R, Rhys-Evans P, Nutting CM, Harrington KJ
Head and Neck Unit, Royal Marsden Hospital, 203 Fulham Road, London SW3 6JJ
Code Number: cr09030
Abstract: Assessment of total laryngectomy patients has always been problematic and controversial. This is particularly so when one wants to assess and characterize the mucosal wave using the existing modalities of assessment. Videostroboscopy is generally regarded as an easily available and clinically relevant technique for adequately assessing this important parameter. However, it has its limitations, especially in laryngectomees. Recently, high-speed imaging has been recommended as the ideal tool for studying the physiology and mucosal parameters in this set of patients. In this brief article we examine this debate on the 'best' modality for assessment of the total laryngectomy patient.
Keywords: High-speed imaging, laryngectomy, stroboscopy
It is known that assessment of voice should be multidimensional and that no single parameter or modality of assessment can describe voice in its totality. However, a number of objective and subjective assessment methods can be of considerable help. After decades of 'silence,' during which vocal rehabilitation after total laryngectomy was mainly achieved with esophageal speech and/or artificial sound sources, Singer and Blom developed the first useful prosthetic device in the early 1980s, enabling restoration of pulmonary-driven speech. Since then many more voice prostheses have been developed, each with its own special characteristics and qualities. Tracheoesophageal (TO) puncture is now the most widely accepted solution for voice restoration in laryngectomy patients. It is fairly easy to apply and allows the production of acceptable prosthetic speech with little practice compared with other methods such as esophageal speech. However, the problems and complications associated with laryngectomy have not changed dramatically over the years. Post-laryngectomy prosthetic speech is still regarded as unnatural and perceptually poorer than laryngeal speech, and its assessment is considered rather difficult.
Since Hirano's original description in the 1970s of the layered microanatomy of the human vocal fold, increasingly sophisticated diagnostic and surgical techniques have evolved to more appropriately address this delicate and complex structure. Innovative diagnostic modalities have grown out of the improved understanding of the critical importance of vocal fold oscillation in voice production. Videostroboscopy has evolved as one of the most practical and useful techniques for the clinical evaluation of vocal fold vibration. At present, it is the most easily accessible and clinically relevant technique in the otolaryngologist's armamentarium. Videostroboscopy fulfils several important requirements of a modality suited for a complete voice examination. It provides useful information concerning the nature of vibration, an immediate image to detect the presence or absence of vocal pathology, and a permanent video record of the examination. Compared to other techniques, stroboscopy has significantly greater sensitivity in the diagnosis of subtle laryngeal disorders.
Strobolaryngoscopy takes advantage of Talbot's law (regarding persistence of vision) by producing intermittent light flashes in relation to the speed of the vocal fold vibration. The frequency of the examinee's sustained voice is picked up by a microphone and allowed to trigger the stroboscopic light source. When the vocal vibrations are periodic and the frequency of light flashes is equal to the vocal frequency, it will produce a clear still image of the vibratory cycle. When the frequency of the flashes is slightly less than the vocal fold vibration, causing a delay in the portion of each vibratory cycle illuminated, the illusion of slow motion is obtained. While videostroboscopy greatly enhances the diagnostic sensitivity of laryngoscopy, its interpretation is dependent upon the skill and experience of the performing clinician and, more specifically, that of the diagnostic interpreter. An admitted limitation of the stroboscopic image is that vocal fold vibration must be relatively periodic to allow visualization of a slow-motion representation of the phonatory cycle. Although several investigators over the last two decades have studied the anatomy and physiology of alaryngeal speech, only limited stroboscopic data are available on the vibratory characteristics of the pharyngo-esophageal segment in TO speakers using valved speech. Concern has been expressed that conventional stroboscopy is incapable of strobing TO speakers.
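The timing relationship behind the still image and the slow-motion illusion can be illustrated with a short sketch. It assumes a perfectly periodic voice, which is exactly the condition that alaryngeal phonation often fails to meet; the function names and the example frequencies (a 200 Hz voice, 198 Hz flashes) are illustrative assumptions and do not come from any particular stroboscope.

```python
def sampled_phases(vocal_hz, flash_hz, n_flashes):
    """Fraction (0..1) of the vibratory cycle lit by each successive flash.

    Assumes a perfectly periodic vibration at vocal_hz (an idealization).
    """
    return [(k * vocal_hz / flash_hz) % 1.0 for k in range(n_flashes)]


def apparent_hz(vocal_hz, flash_hz):
    """Rate of the slow-motion illusion when flash_hz is close to vocal_hz."""
    return vocal_hz - flash_hz


# Flashes at the vocal frequency always light the same phase: a still image.
still = sampled_phases(200.0, 200.0, 5)

# Flashes slightly slower than the voice light a slightly later phase each
# time, so the viewer sees the cycle advance slowly.
slow = sampled_phases(200.0, 198.0, 5)

print(still)
print([round(p, 4) for p in slow])
print(apparent_hz(200.0, 198.0), "apparent cycles per second")
```

If the cycle length varies from one vibration to the next, the phase lit by each flash no longer advances in an orderly way, and the reconstructed slow-motion sequence stops corresponding to any real cycle.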
This limitation can be greatly reduced by combining electroglottography (EGG), which uses the glottal waveform, with videostroboscopy. As noted earlier, the stroboscopic image is limited in its ability to provide a slow-motion representation of the phonatory cycle. This is a salient point: the laryngograph (Lx) signal derived from EGG is widely relied on as a rich source of voice quality information and can be used to attempt to strobe even irregular, aperiodic voices, such as those of total laryngectomees using voice prostheses. In recent years this has led to the combined use of EGG-derived Lx waveform information together with laryngeal stroboscopy so that both synchronization and wave shape are derived from an Lx signal source. However, EGG is an invasive procedure and can be uncomfortable. Moreover, the rigid scope technique interferes with normal alaryngeal voicing, especially in total laryngectomy patients. Correct placement of the electrodes, so as to avoid overloading the Lx waveform, can be difficult in laryngectomees and often needs to be individualized. With the rigid scope, it is only possible to have a bird's-eye view of the neoglottis; it is not possible to see the origin of the vibratory segment or its length. Indeed, if there is more than one vibratory segment, this cannot be known. Thus, no data can be obtained regarding the valve or the rest of the PE segment in laryngectomees, and this information is vital. Another important limitation of videostroboscopy is its slow frame rate, which can result in loss of information.
Efforts to extend the sensitivity of laryngoscopy, to incorporate variations of wave characteristic across the glottis as well as in aperiodic patterns of vibration, have yielded new techniques. Videokymography (VKG) was developed in 1994 in Groningen as a low-cost alternative to a high-speed imaging system, and is especially suited for examination of vocal fold vibration. The system uses a special CCD video camera which can work in two different modes: standard and high-speed. In the standard mode, the VKG camera works like a standard commercial video camera with an image rate of 50 (interlaced) fields per second in accordance with the CCIR TV standard. (The NTSC video standard is also available, which provides a rate of 60 interlaced fields per second.) In the high-speed mode, the camera delivers images from a single line (selected from the whole video field) at a rate of around 7800 line images per second. The consecutive line images are presented below each other on a monitor and create a new VKG image, which shows the vibratory pattern of the selected part of the vocal folds. However, this high line rate is achieved at the expense of greatly reduced spatial information, which is a major drawback of VKG.
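The way a VKG image is assembled from consecutive line images can be sketched as follows. This is a toy illustration only: the frames are synthetic (1 for tissue, 0 for the dark gap between the vibrating edges), the helper names are hypothetical, and the only figure taken from the text is the line rate of around 7800 lines per second.

```python
def kymogram(frames, row):
    """Stack the same scan line from consecutive frames, earliest on top."""
    return [list(frame[row]) for frame in frames]


def line_times_ms(n_lines, line_rate_hz=7800.0):
    """Time stamp in milliseconds of each stacked line at a constant rate."""
    return [1000.0 * i / line_rate_hz for i in range(n_lines)]


def toy_frame(gap_width, width=9, height=3):
    """Synthetic frame: 1 = tissue, 0 = dark gap centred in every row."""
    mid = width // 2
    half = gap_width // 2
    line = [0 if gap_width and abs(x - mid) <= half else 1
            for x in range(width)]
    return [line[:] for _ in range(height)]


# One opening-and-closing cycle of the gap (widths are made-up values).
frames = [toy_frame(w) for w in (0, 1, 3, 5, 3, 1, 0)]
kymo = kymogram(frames, row=1)
times = line_times_ms(len(kymo))

for t, line in zip(times, kymo):
    print(f"{t:6.3f} ms  " + "".join("#" if v else "." for v in line))
```

Each printed row is one instant in time, so the vertical axis of the result is time and the horizontal axis is position along the selected line. Everything outside that single line is discarded, which is the loss of spatial information noted above.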
This brings us to the 'reborn' high-speed videoendoscopy systems. In fact, there have been a number of articles recently on the value of high-speed digital videoendoscopy systems in laryngology. Much of this work emanates from the Karolinska Institute, Stockholm, Sweden, and is a reflection of their extensive experience in this fast emerging field. In high-speed systems, the voice source is sampled at very high rates, ranging from 1000 to over 8000 frames per second, as compared to the rate of 25-30 frames per second that is seen with conventional stroboscopic systems. With improvements in technology, its application has increased in recent years. Recently, it has been used with modest success in the quantification of normal and abnormal glottal vibratory patterns. It has also been used in linguistic studies for the examination of different artistic singing styles, in patients with irregular vocal fold vibrations, and in laryngectomees. High-speed digital systems have also been combined with laser-based measurement systems and EGG. All these studies highlight the potential of high-speed systems to serve as a tool for studying vocal physiology and glottal parameters, especially in dysphonic and aperiodic voices. Van et al. have looked into the possible use of digital high-speed systems in total laryngectomy patients and report interesting results. They concluded that digital high-speed systems could be used as a modality of assessment in this group as it was possible to evaluate most patients. However, in a later study, they advocated that it be used in combination with videofluoroscopy in order to get more detailed information regarding the anatomic and morphologic characteristics of the neoglottis.
However, once again, it must be remembered that on account of the large number of frames involved (in excess of a few hundred per second), interpretation is dependent upon the skill and experience of the performing clinician, and more specifically, that of the diagnostic interpreter. It is, naturally, a time-consuming exercise.
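A rough calculation shows both why high frame rates capture individual vibratory cycles and why reviewing the result takes time. The frame rates of 25 and 4000 frames per second fall within the ranges quoted above; the 100 Hz vibration, the 2 second recording, and the 25 frames per second playback rate are assumed values chosen only for illustration.

```python
def frames_per_cycle(frame_rate_hz, vocal_hz):
    """How many images are captured during one vibratory cycle."""
    return frame_rate_hz / vocal_hz


def review_seconds(frame_rate_hz, recording_s, playback_fps=25.0):
    """Time needed to watch every recorded frame once at playback_fps."""
    total_frames = frame_rate_hz * recording_s
    return total_frames / playback_fps


VOCAL_HZ = 100.0      # assumed vibration frequency
RECORDING_S = 2.0     # assumed length of the recorded phonation

conventional = frames_per_cycle(25.0, VOCAL_HZ)
high_speed = frames_per_cycle(4000.0, VOCAL_HZ)
review = review_seconds(4000.0, RECORDING_S)

print(conventional, "frames per cycle at 25 frames per second")
print(high_speed, "frames per cycle at 4000 frames per second")
print(review, "seconds to review the high-speed recording")
```

Under these assumptions a conventional camera captures one image every four cycles and so must rely on the periodicity of the vibration, whereas the high-speed recording resolves each cycle directly but produces thousands of frames for the interpreter to go through.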
We should view this technological advance with a degree of caution. It is necessary to have a balanced view of new technology and keep in mind the massive role that conventional stroboscopic systems have played in shaping laryngology. High-speed systems have now been around for some 60 years and during this period have been plagued with numerous problems. They are still very expensive and, as yet, have very limited availability at most otolaryngology and head and neck surgery departments around the world. Although newer mathematical algorithms and cutting-edge computer systems with greater memory and storage are being introduced, these are not yet widely available. In addition, the poor image resolution of high-speed systems (especially of VKG) as compared to conventional videostroboscopic systems, together with the dull black-and-white mode, has limited their potential clinical use. Another drawback is that only rigid scopes can be used in the currently commercially available high-speed systems. The limitations of a rigid scope are very evident when examining the PE segment in laryngectomees.
Until such teething problems are overcome, conventional videostroboscopy remains the benchmark; other modalities will have to prove themselves against this standard and can at best only be considered complementary. The high-speed digital videoendoscopy systems no doubt hold much promise. It is only now that we are beginning to realize their potential, but they are still far from being a finished product for mass clinical use. We will have to wait some more for that unless, of course, the potential of high-speed digital videoendoscopy is suddenly overshadowed by some new technological advance, as can happen ever so easily in this digital age.
Copyright 2009 - Journal of Cancer Research and Therapeutics