Voice problems usually involve pain or discomfort when a person speaks and/or difficulties with the pitch, loudness, and quality of the voice. When patients with voice problems induced by stroke, Parkinson’s disease, or systemic diseases involving the voice are examined, the Diagnosis of Hearing, one of the Four Diagnoses (四診), is generally used in current Korean medicine. The effects of acupuncture and herbal medicine on voice problems have been reported for over 20 years. However, the objective and subjective methods used to evaluate improvement still need to be explained.
Subjective methods for evaluating the voice were studied through a literature search of old medical books on Korean medicine diagnostics, and an objective evaluation method using Praat software is presented.
Korean medicine doctors unconsciously analyze their patients’ voices in clinical settings on a daily basis. However, most voice diagnoses depend on the doctor’s subjective evaluation. As subjective methods, voice qualities can be evaluated by using the Eight Principles (八綱), including Yin-Yang; the Five Elements (Phases); the Grade, Roughness, Breathy, Asthenic, Strained (GRBAS) score; and the Visual Analogue Scale (VAS). As an objective method, an acoustic analysis using the Praat program can be used.
A more complete voice examination can be achieved by using subjective and objective methods at the same time. Korean medicine already has many subjective diagnostic methods, so an objective method should be added for an objective explanation and management of patients’ voice problems or systemic disorders. More research needs to be conducted, and more clinical evidence needs to be collected in the future.
Sound is caused by turbulent changes in a medium, such as a gas like air, a fluid, or a solid, and the sounds of the human voice are created by vibrations of the vocal cords. The production of the voice in terms of Korean medicine is described in 「Dongeuibogam (東醫寶鑑)」, 「Naegyeongpyeon (內景篇) Voice Chapter (聲音門)」, in which the Heart is described as the host of the voice, the Lungs as the gate of the voice, and the Kidneys as the root of the voice. An invasion of the Heart and the Lungs by Evil Qi, such as Wind, Cold, Summer Heat, Dampness, Blood, Phlegm, or Heat, or a weakness of the Kidneys can cause a patient to become voiceless. As to the relationship between voice and general health, according to Dongeuibogam, men with low voices who sometimes scream with surprise might have disorders in the joints, men who hesitate to speak and equivocate might have disorders between the Heart and the chest wall or pericardium, and men with a thin and long voice might have disorders in the head. In old times, doctors who could distinguish diseased voices from a normal voice were regarded as being highly qualified (聞而知之謂之聖). When Korean medicine doctors use the voice in a clinical setting, they classify five sounds and five voices.
The book Dongeuibogam explains many kinds of acute and chronic dysphonia or aphonia (voiceless disease): for example, those occurring after drinking alcohol and being invaded by Wind, those occurring after being invaded by Wind and Cold, dysphonia induced by a cerebrovascular accident, and dysphonia induced by a shortage of Blood and Qi. The book also describes treatment methods for dysphonia. In addition, the character of the voice, hoarseness, voice changes during attacks of manic disorder, and repetitive speech due to mental disorders are described in traditional medicine.
Many doctors are interested in treating hoarseness and dysphonia or dysphasia (dysarthria) induced by strokes or Parkinson’s disease and in evaluating the character of the voice and the degree of dysphasia. Therefore, objective and subjective voice examinations need to be reviewed and used in clinics. In this study, a voice analysis program, with Praat as an example, is introduced as the objective evaluation method, and a classification using Yin-Yang and the Five Elements, a Grade, Roughness, Breathy, Asthenic, Strained (GRBAS) evaluation, and the visual analogue scale (VAS) are explained as subjective evaluation methods.
2. Formation of voice
The human voice originates from below the vocal cords and is modified by the vocal tract, the oral cavity, the nasal cavity, etc. This theory is called the source-filter theory and was introduced by Fant. To date, this theory has been regarded as one of the most important theories in the area of phonetics. The source is the pressure below the vocal cords, which passes through the vocal tract and then enters the oral cavity and the nasal cavity. The voice is then shaped by many filters, such as the tongue, lips, and teeth. Thus, the voice reflects all the characteristics of the pressure below the vocal cords, the vibrations of the vocal cords, the vocal tract, and the voice-making organs that act as filters.
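The source-filter idea described above can be illustrated with a minimal sketch. It assumes an idealized source (one unit pulse per vocal-cord cycle) and a single two-pole resonator standing in for one vocal-tract resonance; the function names, the 8000 Hz sampling rate, the 100 Hz source, and the 700 Hz resonance are illustrative assumptions, not values from the literature reviewed here.

```python
import math

def impulse_train(f0_hz, duration_s, fs):
    """Idealized source: one unit pulse per vocal-cord cycle (assumption)."""
    period = round(fs / f0_hz)
    n = int(duration_s * fs)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, center_hz, bandwidth_hz, fs):
    """Two-pole resonator used as a stand-in for one vocal-tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    theta = 2 * math.pi * center_hz / fs
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def magnitude_at(signal, freq_hz, fs):
    """Spectral magnitude of the signal at one frequency (direct DFT sum)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / fs) for i, s in enumerate(signal))
    return math.hypot(re, im)

FS = 8000                                  # sampling rate in Hz (assumption)
source = impulse_train(100, 0.5, FS)       # 100 Hz source, 0.5 s
voiced = resonator(source, 700, 100, FS)   # resonance near 700 Hz

# In the source, the harmonics at 700 Hz and 2000 Hz are equally strong;
# after filtering, the harmonic near the resonance dominates.
source_ratio = magnitude_at(source, 700, FS) / magnitude_at(source, 2000, FS)
voiced_ratio = magnitude_at(voiced, 700, FS) / magnitude_at(voiced, 2000, FS)
print(round(source_ratio, 3), voiced_ratio > source_ratio)
```

The same source yields a different output spectrum when the resonance is moved, which is the sense in which the articulators act as filters.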
- 2.1. Basic voice-analyzing variables
There are many voice-analyzing variables, but from a medical point of view, a few basic variables can be singled out: the fundamental frequency (F0), the standard deviation of the fundamental frequency (F0 SD), the formants, the jitter, the shimmer, and the harmonics-to-noise ratio (HNR). Pitch is the highness or lowness of a sound as a human hears and understands it. In terms of the voice-analyzing variables, pitch can be taken to correspond to the fundamental frequency. Therefore, pitch and F0 are treated here as representing the same idea. F0 differs according to age and sex. The range of F0 in boys and girls at the ages of 3 to 10 years is about 270 to 300 Hz. With increasing age, women tend to have a slightly lower pitch while men tend to have a much lower pitch because they have thicker vocal cords. Patients with voice problems tend to have pitches that are too low or too high for their age, sex, and body shape and tend to use narrower ranges of pitches. If such patients are asked to pronounce the same vowel, the standard deviation of the fundamental frequency tends to be too small or too large, which means they may have an organic lesion or a neurological problem.
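As a rough illustration of how F0 and F0 SD can be obtained from a sustained vowel, the sketch below estimates F0 for each frame from the highest autocorrelation peak and then summarizes the frames with a mean and a standard deviation. This is a simplified stand-in rather than the algorithm of any particular voice-analysis program; the 75 to 500 Hz search range, the frame length, and the synthetic sine-wave frames are assumptions made only for the example.

```python
import math
import statistics

def estimate_f0(frame, fs, fmin=75.0, fmax=500.0):
    """Estimate F0 of one frame from the highest autocorrelation peak."""
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

def f0_mean_and_sd(frames, fs):
    """Mean F0 and F0 SD (sample standard deviation) over all frames."""
    f0_values = [estimate_f0(frame, fs) for frame in frames]
    return statistics.mean(f0_values), statistics.stdev(f0_values)

FS = 8000                # sampling rate in Hz (assumption)
PERIODS = [40, 39, 41]   # cycle lengths in samples, i.e. F0 near 200 Hz
# Synthetic stand-in for three 50 ms frames of a sustained vowel.
frames = [[math.sin(2 * math.pi * i / p) for i in range(400)] for p in PERIODS]

f0_mean, f0_sd = f0_mean_and_sd(frames, FS)
print(f"F0 mean: {f0_mean:.1f} Hz, F0 SD: {f0_sd:.1f} Hz")
```

With real recordings, the frames would come from a patient's sustained vowel, and the resulting F0 SD could be compared before and after treatment.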
The formant frequency, or the resonance frequency, arises in the vocal tract and is the frequency, among the multiple harmonics of the voice, that is reinforced in accordance with the speaker’s characteristics. Depending on a person’s own articulators, a person generally has 3 or 4 formants. From the lowest to the highest frequency, those formants are named F1, F2, F3, and F4. F1 and F2 are especially important because they are affected by the position of the tongue and the size of the oral cavity. If a patient has paralyzed articulators so that the mouth cannot be opened wide or the tongue cannot be moved freely, the values of F1 and F2 in that patient differ from those in a healthy person. Therefore, as the patient’s condition improves, F1 and F2 become normal.
Jitter, the cycle-to-cycle variation in the period of the vocal-cord vibrations, and shimmer, the cycle-to-cycle variation in their amplitude, may also be good surrogate markers or indicators for comparing a normal voice to an abnormal one. Jitter evaluations are used to study stuttering and vocal pathology, especially problems in regulating the vocal cords. The HNR is the ratio of the harmonic (periodic) part of the voice to the noise, expressed in dB. It describes the relative amount of noise that is contained in the voice signal. If a person has a high HNR, he or she has a clear voice that can be heard and understood over the noise. The HNR can be used to evaluate a breathy voice, a hoarse voice, etc. A normal HNR is accepted to be in the range of 7 dB to 19.1 dB, so a value of about 15 dB can be assumed to be suitable for both men and women.
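The three perturbation measures can be written down compactly. The sketch below assumes the common "local" definitions (mean absolute difference between consecutive cycles divided by the mean over all cycles) and the usual relation between the HNR and the normalized autocorrelation peak r at the pitch period, HNR = 10 log10(r / (1 − r)). The function names and the sample values are illustrative assumptions, not measurements.

```python
import math

def local_perturbation(values):
    """Mean absolute difference of consecutive values divided by their mean."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def local_jitter(periods_s):
    """Local jitter: cycle-to-cycle perturbation of the period lengths."""
    return local_perturbation(periods_s)

def local_shimmer(peak_amplitudes):
    """Local shimmer: cycle-to-cycle perturbation of the peak amplitudes."""
    return local_perturbation(peak_amplitudes)

def hnr_db(r):
    """HNR in dB from the normalized autocorrelation peak r (0 < r < 1)."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return 10 * math.log10(r / (1 - r))

# Hypothetical values for four consecutive vocal-cord cycles.
periods = [0.0050, 0.0051, 0.0049, 0.0050]   # seconds
amplitudes = [0.80, 0.78, 0.82, 0.80]        # arbitrary units

print(f"jitter:  {100 * local_jitter(periods):.2f} %")
print(f"shimmer: {100 * local_shimmer(amplitudes):.2f} %")
print(f"HNR:     {hnr_db(0.97):.1f} dB")
```

An r of 0.5 means that the periodic part and the noise carry equal energy, which corresponds to an HNR of 0 dB; values of r closer to 1 give the higher HNR values of a clear voice.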
- 2.2. Praat software: one example of an objective evaluation method
The human voice can be evaluated in two ways: one is listening with the ears, and the other is using hardware or a software program. Generally speaking, using both ways to evaluate a patient is best. Sound analyzers have been developed since the late 1940s and have changed from analogue to digital. One of the most frequently used voice analyzers is the Computerized Speech Lab (CSL) made by Kay Elemetrics (USA). Furthermore, many free software programs are now available and can be used to analyze voice variables, including the basic variables mentioned above. Among those programs, Praat, which can analyze and transform the voice, is free and useful. Sample vowels or a sample text needs to be recorded for the analysis, for which /a/, /e/, /i/, /o/, and /u/ or simple sentences from a fairy tale are useful.
- 2.3. Subjective evaluation method
- 2.3.1. Classification by Yin-Yang
Yin-Yang is the fundamental icon that represents East Asia or China. Yin-Yang has become a general term in daily life and expresses opposing and complementary views at the same time. If a classification is to be made, the opposing view is needed. If a patient’s voice can be categorized as clear, high-pitched, strong, and fast, then that person’s voice can be said to have the property of Yang. On the contrary, if the voice is categorized as thick, low, weak, and slow, then the voice can be said to have the property of Yin. If the voice is divided more precisely according to the Eight Principles (八綱), a patient who has a high-pitched, strong, powerful voice has the Yang property, the Excessive property, and the Heat property. On the other hand, a patient who has a low, powerless, thin voice, who does not like to talk, or who has difficulty in talking and sometimes speaks with voice breaks has the Yin property, the Empty property, and the Cold property. In Sasang constitutional medicine, there are four distinctive pairs of voice properties: clear/thick, high-pitched/low-pitched, powerful/powerless, and fast/slow.
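To make the Yin-Yang rule above concrete, the following minimal sketch maps a list of perceived voice descriptors onto Yang or Yin. The descriptor sets are taken from the paragraph above; deciding by a simple majority of matching descriptors, and returning "Undetermined" on a tie, are assumptions made for illustration and not an established clinical rule.

```python
# Descriptor sets taken from the text; the voting rule is an assumption.
YANG_TRAITS = {"clear", "high-pitched", "strong", "fast"}
YIN_TRAITS = {"thick", "low", "weak", "slow"}

def classify_yin_yang(descriptors):
    """Return 'Yang', 'Yin', or 'Undetermined' by majority of descriptors."""
    observed = {d.strip().lower() for d in descriptors}
    yang_count = len(observed & YANG_TRAITS)
    yin_count = len(observed & YIN_TRAITS)
    if yang_count > yin_count:
        return "Yang"
    if yin_count > yang_count:
        return "Yin"
    return "Undetermined"

print(classify_yin_yang(["clear", "high-pitched", "slow"]))
print(classify_yin_yang(["thick", "weak"]))
```

Recording the descriptors explicitly in this way would also let two examiners compare their impressions of the same patient.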
- 2.3.2. Five voices
The five voices are sometimes regarded as the same notion as the five sounds, but strictly speaking, the five voices are more formal than the five sounds. In terms of the Five Elements theory, the Gung (宮) voice is allocated to the Earth property, the Sang (商) voice to the Metal property, the Gak (角) voice to the Wood property, the Chi (徵) voice to the Fire property, and the Wu (羽) voice to the Water property. The Gung voice is the longest and thickest voice. The Chi voice derives from the Gung voice and is slightly short, slightly high-pitched, and slightly clear. The Sang voice derives from the Chi voice and is slightly long, slightly low, and slightly thick. The Wu voice derives from the Sang voice and is the shortest and most high-pitched voice. The Gak voice derives from the Wu voice and lies in the middle between long and short, high-pitched and low-pitched, and clear and thick.
In recent Chinese five-voice therapeutic music, Gung music is in the C major scale and has bright and calm feelings. In contrast, Sang music is in D major, uses many kinds of metal musical instruments, and has heavy and slowly ascending feelings. Gak music is in E major and uses many kinds of wooden musical instruments, so it gives the most cheerful, lightest, and most vivid feelings. Chi music is in G major and gives a magnificent and soft feeling in a serial order. Wu music is in A minor, so it gives a dark, slow, tragic, and restrained feeling. In addition, a consensus on how the notion of the five voices should be carried over into modern music is thought to be needed.
- 2.3.3. Five sounds
The five sounds are regarded as more natural, original sounds than the five voices. It has been pointed out that the Gung sound is regarded as a male, broad, big, baritone sound that resembles a humming sound in the nasal cavity. The Chi sound is regarded as being similar to the scream of a surprised pig, the Sang sound to a lamb’s sound, the Wu sound to a horse’s sound, and the Gak sound to a hen’s cry. The Gung sound is thought to be similar to the sounds produced when /ŋ/ and /h/ are spoken; likewise, the Chi sound is similar to /s/ and /dʒ/ or to /z/ and /tʃ/, the Sang sound to /g/ and /k/, and the Gak sound to /n/, /d/, and /t/. However, evaluating the voice in a contemporary sense by using the five sounds is difficult. Other meanings of the five sounds, that is, Exhaling (呼), Laughing (笑), Singing (歌), Mourning (哭), and Groaning (呻), are described in the book Huangdi Internal Classic (黃帝內經), but much research is needed to establish a consensus on their meanings for clinical applications.
- 2.3.4. Grade, roughness, breathy, asthenic, strained (GRBAS) evaluation
GRBAS is an acronym for Grade, Rough, Breathy, Asthenic, and Strained. The GRBAS method has been used in Japan for a long time to evaluate patients with vocal pathologies. By listening to the patients’ voices, trained professionals evaluate the overall grade and the degrees of roughness, breathiness, asthenia, and strain. It is a 4-point system in which each item ranges from 0 (normal) to 3 (very severe).
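A GRBAS rating can be recorded as five integers, each restricted to the 0 to 3 range described above. The sketch below is one possible way to store and validate such a rating; the class name, the field names, and the compact "G2 R1 B2 A0 S1" style summary are illustrative assumptions rather than part of the GRBAS method itself.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class GRBASRating:
    """One perceptual rating; every item is scored 0 (normal) to 3 (very severe)."""
    grade: int
    roughness: int
    breathiness: int
    asthenia: int
    strain: int

    def __post_init__(self):
        # Reject any score outside the 4-point scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in (0, 1, 2, 3):
                raise ValueError(f"{f.name} must be 0, 1, 2, or 3, got {value!r}")

    def summary(self):
        """Compact summary built from the first letter of each item."""
        return " ".join(
            f"{f.name[0].upper()}{getattr(self, f.name)}" for f in fields(self)
        )

before = GRBASRating(grade=2, roughness=1, breathiness=2, asthenia=0, strain=1)
after = GRBASRating(grade=1, roughness=1, breathiness=1, asthenia=0, strain=0)
print(before.summary(), "->", after.summary())
```

Storing a rating before and after treatment in this form makes the change in each item easy to report alongside the acoustic measures.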
- 2.3.5. Visual analogue scale evaluation
The VAS is used to evaluate pain and general health condition. Generally, it ranges from 0 to 10 or from 0 to 100, where 10 or 100 means a patient has the most severe pain. A patient with a vocal pathology, such as dysarthria, dysphonia, aphonia, hoarseness, Parkinson’s disease, etc., can also be evaluated using the VAS.
Articulation of the voice must have been a matter of curiosity for the ancestors who treated patients, and descriptions of the articulators of the voice, such as the vocal cords, the vocal tract, the lips, and the mouth, have been found in old traditional medicine books. The five voices and the five sounds were connected with the Five Elements in physiological aspects; moreover, the feelings of voices were allocated to each of the Five Elements in pathological and diagnostic areas. Thus, each voice can give information about diseases of the five organs through the feelings of voices and about diseases of the six bowels through the properties of voices: clear/thick, high-pitched/low-pitched, powerful/powerless, and fast/slow. The five voices and the five sounds used to be useful for diagnosing a patient’s disease and health condition based on the Eight Principles, such as the Yin-Yang property, Empty/Excessive, Exterior/Interior, and Heat/Cold.
For example, 『Classic of Difficult Issues (難經)』 says that the five organs govern the five voices; the voice sounds sad when one has liver disorders, urgent when one has lung disorders, magnificent when one has heart disorders, slow when one has spleen and pancreas disorders, and low when one has kidney disorders. Also, it sounds long when one has large intestine disorders, short when one has small intestine disorders, clear when one has gall-bladder disorders, fast when one has stomach disorders, and weak when one has bladder disorders. When the five voices combine with the five sounds, the Metal sound sounds like a gong, the Earth sound sounds thick, the Wood sound sounds long, the Water sound sounds clear, and the Fire sound sounds dry. In particular, the old book says that the Earth sound resembles the sound made by a person speaking into a crock; a patient who has Dampness also sounds as if he or she were speaking into a crock. The cause of the disease was identified according to the characteristics of the voice. Although Dongeuibogam discusses loss of voice (aphonia), acute dysarthria, hoarseness, delirious speech, and their treatments, the voice cannot be used in diagnostics for as many kinds of disorders as the book suggests. It can be used only as a clue to identify the Eight Principles.
Although doctors in clinical settings may not be aware that they are performing voice examinations, they are always listening to what their patients say, so they are considering pattern identification during their conversations with the patient. They subconsciously sense the volume of the voice, the feeling of the voice, and the habits of speaking. One more thing should be considered here: not only is a subjective voice examination important, but so is an objective voice examination. Thus, what should be done to carry out an objective voice examination?
Doctors can identify patterns like the Yin pattern, the Deficiency pattern, and the Cold pattern when the patient’s voice is generally weak and patterns like the Yang pattern, the Excessive pattern, and the Heat pattern when the patient’s voice is strong or powerful. Thus, the Eight Principles can be used with ease in a subjective voice examination. However, when it comes to the Gung, Sang, Gak, Chi, and Wu voices and the five sounds, standardization and consensus among health professionals are thought to be needed for clinical applications. Reliability, objectivity, and standardization are important areas of diagnosis in Korean medicine. GRBAS and VAS use both subjective and objective criteria. Therefore, a scientific analysis, such as one using voice-analysis software, could be one option for measuring the treatment outcome after Korean medicine treatment for voice-related diseases or systemic diseases (see the figure).
Figure: Subjective and objective voice examination.
Nowadays, the voice is regarded as a necessary factor for communicating well, and voice-related problems, such as hoarseness, dysarthria, and dysphonia, could possibly be used to compare the effects of Korean medicine treatments. When it comes to hoarseness, for example, the points of view of Western medicine and Korean medicine differ. In Korean medicine, the voice is affected not only by the vocal cords and the muscles surrounding the vocal cords but also by many other factors, such as accommodations of the Lungs, Kidneys, and Liver for a high-pitched voice and of the Spleen for a soft voice, resonance through the Liver and the Spleen, and the emotional effect through the Heart and the Pericardium. Therefore, voice problems induced by stress and tension should respond well to Korean medicine treatments. As Oriental voice examination and Oriental speech treatment are listed under national health insurance, although as non-payment items, voice-related health problems may now be regarded as a specialized area of Korean medicine.
If effective treatment and management for patients with voice-related problems are to be achieved, an objective evaluation method using voice-analysis software and a grading method for severity are needed in addition to the current subjective evaluation methods using Yin-Yang and the Five Elements, including the five voices and the five sounds. If the Eight Principles, the five voices, and the five sounds are combined with GRBAS and the VAS scoring system, a subjective voice examination can be done. Moreover, performing an objective voice examination using software when treating patients with voice problems and patients with systemic diseases accompanied by voice problems should be helpful in monitoring and managing those patients.
This research was supported by the Sangji University Research Fund, 2013.
Conflicts of interest. The authors declare that there are no conflicts of interest.
[Dongeui science institute, treasured mirror of eastern medicine (東醫寶鑑)].
Use of prosodic features for speech recognition
[Acoustic analysis of speech]
[Acoustic phonetics in korean academy of speech-language pathology and audiology. audiology for speech care professionals]
Koon Ja Publishing Inc
[Theory and reality of voice analysis using praat]
[Traditional chinese medicine diagnostics (中醫診斷學)]
China press of traditional Chinese medicine
[A study on the sasang constitutional diagnosis by perceptual voice analysis]
J Sasang Constitut Med
[A study on o-eum (五音) and oseong (五聲) in hwangjenaegyeong (黃帝內經) and akhakgwebeom (樂學軌範)]
The Journal of Korean Medical Classics
[Accessory wing of classified classic (類經附翼)].
Hae-dong Medical Publishing
[Analysis on the therapeutic music of chinese five-sounds]
Korean J Oriental Physiology & Pathology
Japanese association of voice speech medicine
[Voice examination (clinical)]
Koon Ja Publishing Inc
[Classic of difficult issues]
Peking: People’s Medical Publishing House
[Manifestation of pediatrics (幼科發揮)]
China Press of Traditional Chinese Medicine
[Introduction to medicine (醫學入門)]
Dae Seong Publishing
[Treatise on the spleen and stomach (脾胃論)]
Oju Publishing Co