Sensorial Information Extraction and Mapping to Generate Temperature Sensory Effects
ETRI Journal. 2014. Feb, 36(2): 224-231
Copyright © 2014, Electronics and Telecommunications Research Institute(ETRI)
  • Received : September 03, 2013
  • Accepted : December 30, 2013
  • Published : February 01, 2014
About the Authors
Sang-Kyun Kim
Seung-Jun Yang
Chung Hyun Ahn
Yong Soo Joo

Abstract
In this paper, a method to extract temperature effect information using the color temperatures of video scenes with mapping to temperature effects is proposed to author temperature effects of multiple sensorial media content automatically. An authoring tool to apply the proposed method is also introduced. The temperature effects generated by the proposed method are evaluated by a subjective test to measure the level of satisfaction. The mean opinion score results show that most of the test video sequences receive an average of approximately four points (in a five-point scale), indicating that test video sequences (with the temperature effects generated by the proposed method) enhance levels of satisfaction.
I. Introduction
Along with the sensation of the 3D film industry, the development of MulSeMedia (Multiple Sensorial Media) [1] or 4D media has received a lot of attention from the public. 4D movies generally add sensory effects to 3D and/or IMAX movies, allowing audiences to immerse themselves more deeply into the movie viewing experience. Along with the two human senses of sight and hearing, such sensory effects as wind, vibration, and scent can stimulate other senses, such as the tactile and olfactory senses. MulSeMedia content indicates audiovisual content annotated with sensory effect metadata.
Since 2008, the MPEG International Standardization Group has been working to define interfaces and data formats between the virtual and real worlds, under the project name MPEG-V (ISO/IEC 23005) [2]-[5]. This standard supports the interoperability of data formats between sensory effects authored by content providers, rendering effects of diverse consumer and professional devices, and rendering environments. Due to the standardized format specified by MPEG-V, content providers can convey sensory effects as precisely as intended to consumers.
MulSeMedia content contains the metadata of sensory effects that are derived from physical properties of scenes, such as those containing wind, light, temperature, color, motion, and sound effects, as well as the emotional properties of the content. The emotional properties of the content include such feelings as happiness, love, fear, anger, surprise, or hatred. If emotions and feelings were to be expressed as appropriate sensory effects, viewers would be able to immerse themselves more deeply in the content as opposed to simply experiencing the sensory effects of the physical properties in the scenes. Experiences of such sensory effects can also help visually impaired or hearing-impaired people understand video content.
For the successful industrial deployment of MulSeMedia services, it is important to provide an easy and efficient means of producing MulSeMedia content. The existing methods of producing MulSeMedia content require much time and effort to author sensory effects. Sensory effect authoring tools have been proposed to remedy this problem [6]-[9]. The authoring tool known as SEVino [6] can verify that XML instances produced with JAXB comply with the XML schema specified in MPEG-V Part 3 [3]. Another authoring tool, known as SMURF [9], supports complex authoring functionalities such as GroupOfEffects, Declaration, and ReferenceEffect so that average users can easily create their desired sensory effect metadata. Ambient light devices are controlled via automatic color calculations to enable an immediate reaction to color changes in the content [10].
More convenient authoring of MulSeMedia content can be achieved by extracting sensorial information automatically from the content. In other words, sensory effects can be generated automatically by extracting sensorial (physical and emotional) properties from the content and by mapping the major attributes of the extracted properties to sensory effects; this can speed up the authoring process significantly. In addition, the influence of the generated sensory effects should be properly measured for quality assessment purposes [11].
The aim of this paper is to introduce a MulSeMedia authoring method using color temperature extraction and its mapping to temperature effects. A quality assessment of temperature effects generated by the proposed method is presented in this paper.
This paper is organized as follows. The general process of MulSeMedia content authoring is described in Section II. Section III details a method of sensorial information extraction using color temperatures and its mapping to temperature effects. Section IV presents the experimental settings and the results of human responses to temperature effects generated by the proposed method. Finally, the conclusion is presented in Section V.
II. Authoring of MulSeMedia Content
The overall process flow of MulSeMedia content authoring discussed here is depicted in Fig. 1. Sensorial information in video content is divided into two major categories: physical characteristics (such as lighting, vibration, motion, and sound) and emotional characteristics (determining mood or feelings such as happiness, love, fear, anger, or surprise). Both characteristics can be represented by such attributes as duration and intensity, as well as direction.
Fig. 1. Authoring process flow of MulSeMedia content.
Sensory information mapping transforms the physical and/or emotional characteristics of a scene into the attributes of appropriate sensory effects. The physical characteristics (duration, intensity, and direction information extracted from the scene) can be mapped directly to the attributes of sensory effects. For example, the magnitude and direction of motion information in a scene can be conveyed as the intensity and direction of a wind effect. The emotional characteristics, on the other hand, can be mapped indirectly to the properties of sensory effects. For example, fear can be converted to either vibration or motion effects. Happiness can be converted to either colored light or bubble effects. Research thus far has not attempted to determine the types of sensory effects for emotional characteristics or a means of transforming emotional characteristics to the attributes of the sensory effects.
Authors of MulSeMedia content can select for themselves the appropriate sensory effects of a scene and determine the attributes of the corresponding sensory effects with existing MulSeMedia content authoring tools [6]-[9]. In other words, authors perceive the sensorial information of a scene and determine (map) the corresponding sensory effects and their attributes. Relying purely on an author’s judgment is a subjective approach and has limitations when used to express subtle changes of sensorial information.
III. Sensorial Information Extraction and Mapping for Temperature Effects
In this section, we propose a method to generate temperature effects automatically by extracting color temperatures (that is, sensorial information) from a scene and by mapping their properties (that is, sensorial information mapping) to the attributes of temperature effects. In addition, an authoring tool for the proposed method is introduced briefly.
- 1. Generating Temperature Effects Using Color Temperature
To generate temperature effects automatically, the color temperatures of a scene are extracted from an instance of audiovisual content. The term “color temperature” refers to the color of a light source or to the white points of image-display devices, such as TVs or PC monitors. The correlated color temperature of a light source [12] is defined as the temperature of the Planckian radiator whose chromaticity is closest to that of the source in a suitable uniform chromaticity-scale diagram, such as the CIE 1960 UCS diagram. Henceforth, “color temperature” is used to indicate the correlated color temperature.
MPEG developed a low-level visual descriptor of the color temperature [13]-[15] to describe images and videos. The visual color temperature descriptor is specified in the MPEG-7 International Standard (ISO/IEC 15938-3). The semantics of the MPEG-7 color temperature descriptor are as follows: the ColorTemperatureValue parameter represents the color temperature of the given image or region. The range of the color temperature is 1,667 K to 25,000 K.
The color temperature can be an effective bridge between images’ color characteristics and a human’s perceptual temperature feeling regarding the images. An image frame can be classified into one of four categories using the color temperature: hot, warm, moderate, and cool. While the color temperature value can be a physical characteristic for sensorial information, its category classification can be used as an emotional characteristic. As specified in the MPEG-7 standard, the color temperature boundaries for each category are as follows: hot (1,667 K to 2,250 K), warm (2,251 K to 4,170 K), moderate (4,171 K to 8,060 K), and cool (8,061 K to 25,000 K).
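The category boundaries above can be sketched as a small lookup. This is only an illustration: the function name `classify_color_temperature` and the choice to raise an error outside the 1,667 K to 25,000 K range are assumptions, not part of the MPEG-7 descriptor.

```python
# Inclusive upper bounds (in kelvins) for each category, taken from the text.
CATEGORY_UPPER_BOUNDS = [
    (2250, "hot"),
    (4170, "warm"),
    (8060, "moderate"),
    (25000, "cool"),
]


def classify_color_temperature(ct_kelvin: float) -> str:
    """Return the color-temperature category for a value in kelvins.

    Hypothetical helper: values outside 1,667 K to 25,000 K are rejected
    (an assumption; the source does not say how to treat them).
    """
    if not 1667 <= ct_kelvin <= 25000:
        raise ValueError(f"color temperature out of range: {ct_kelvin} K")
    for upper_bound, category in CATEGORY_UPPER_BOUNDS:
        if ct_kelvin <= upper_bound:
            return category
    raise AssertionError("unreachable: the range check covers all bounds")


if __name__ == "__main__":
    for ct in (2000, 3000, 6500, 10000):
        print(ct, classify_color_temperature(ct))
```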
Fig. 2. Process of color temperature extraction and mapping to generate temperature effect.
Figure 2 shows the process of color temperature extraction and its mapping for the generation of the temperature effect. The process of sensorial information extraction using color temperatures is as follows.
  • ① Select a video frame section and a color temperature extraction area.
  • ② Define a frame rate interval to extract the color temperature.
  • ③ Extract image frames according to the frame interval from the selected video frame section.
  • ④ Calculate a color temperature from the area defined in step 1 for each image and then determine the color-temperature category.
  • ⑤ Merge consecutive and identical category frames.
  • ⑥ Extract major attributes (for example, the average color temperature, the color-temperature category, and the number of frames) for each merged subsection.
An area in a frame is selected to extract more precise sensorial information. For example, a rectangular region of interest covering a fire can be selected by dragging a mouse, so that the color temperature is obtained from the fire alone. The area selection also reduces the computational complexity.
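As a rough illustration of computing a color temperature from a selected rectangle, the sketch below averages the pixels of the region and applies McCamy's approximation of the correlated color temperature. This is a swapped-in technique chosen for brevity, not the MPEG-7 extraction procedure used in this paper. The frame representation (rows of 8-bit sRGB tuples) and the function names are assumptions.

```python
def srgb_to_linear(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def region_color_temperature(frame, x0, y0, x1, y1):
    """Approximate the CCT (in kelvins) of the rectangle [x0, x1) x [y0, y1).

    frame is a list of rows, each row a list of (r, g, b) 8-bit sRGB tuples
    (a hypothetical representation chosen for this sketch).
    """
    sums = [0.0, 0.0, 0.0]
    count = 0
    for row in frame[y0:y1]:
        for r, g, b in row[x0:x1]:
            sums[0] += srgb_to_linear(r)
            sums[1] += srgb_to_linear(g)
            sums[2] += srgb_to_linear(b)
            count += 1
    if count == 0:
        raise ValueError("empty region")
    r, g, b = (s / count for s in sums)

    # Linear sRGB (D65) to CIE XYZ.
    big_x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    big_y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    big_z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    total = big_x + big_y + big_z
    if total == 0:
        raise ValueError("region is completely black")
    x, y = big_x / total, big_y / total

    # McCamy's cubic approximation of the correlated color temperature.
    n = (x - 0.3320) / (0.1858 - y)
    return 449.0 * n**3 + 3525.0 * n**2 + 6823.3 * n + 5520.33


if __name__ == "__main__":
    # Left half orange (flame-like), right half white.
    frame = [[(255, 147, 41)] * 2 + [(255, 255, 255)] * 2 for _ in range(2)]
    print(region_color_temperature(frame, 0, 0, 2, 2))  # orange region only
    print(region_color_temperature(frame, 2, 0, 4, 2))  # white region only
```

Restricting the rectangle to the flame-like pixels yields a much lower color temperature than the white area, which is the effect the area selection is meant to achieve.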
Defining a frame rate interval in step 2 controls the number of extracted color temperature values and hence the computational complexity. If the frame rate interval is short, more color temperature values are extracted; thus, more detailed sensorial information can be gathered, but the computational complexity increases as well. The opposite holds if the frame rate interval is long. Empirical results show that extracting two frames per second is likely adequate.
From the calculated color temperature of each frame, it is possible to obtain the color-temperature category. Frames with consecutive and identical color-temperature categories are merged so as to form subsections with the same categories. Each subsection produces major attributes, such as the average color temperature, the color-temperature category, and the number of frames that belong to the subsection.
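The merging of consecutive frames with identical categories can be sketched with `itertools.groupby`. The input is assumed to be a time-ordered list of per-frame color temperatures in kelvins; the function names and the dictionary keys are hypothetical.

```python
from itertools import groupby


def category_of(ct_kelvin: float) -> str:
    """Category boundaries as given in the text (values in kelvins)."""
    if ct_kelvin <= 2250:
        return "hot"
    if ct_kelvin <= 4170:
        return "warm"
    if ct_kelvin <= 8060:
        return "moderate"
    return "cool"


def merge_subsections(frame_cts):
    """Merge consecutive frames that share a category into subsections.

    frame_cts: color temperatures of the sampled frames, in time order.
    Returns one dict per subsection with its major attributes.
    """
    subsections = []
    for category, group in groupby(frame_cts, key=category_of):
        values = list(group)
        subsections.append(
            {
                "category": category,
                "average_ct": sum(values) / len(values),
                "num_frames": len(values),
            }
        )
    return subsections


if __name__ == "__main__":
    for subsection in merge_subsections([2000, 2100, 3000, 3500, 3200, 9000]):
        print(subsection)
```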
The attributes extracted from the color temperature calculations can be mapped to the attributes of the temperature effect information, as follows.
  • ① The total number of frames for each merged subsection is converted to the duration of the temperature effect information. The total number of frames can be converted to the duration using the frame rate information.
  • ② The average color temperature is converted to the intensity of the temperature effect information.
  • ③ The color temperature category is mapped to the type of the temperature effect information. The categories warm and hot can be mapped to the heating type of the temperature effect information, whereas the cool category can be mapped to the cooling type of the temperature effect information.
Table 1 shows the mapping table from the attributes of the color temperature to the attributes of the temperature effect information.
Table 1. Temperature effect information mapping table.

| Color temperature: range of average CT value | Color temperature: category | Temperature effect info.: range of intensity (%) | Temperature effect info.: type |
|---|---|---|---|
| 1,667 K - 2,250 K | Hot | 51 - 100 | Heating |
| 2,251 K - 4,170 K | Warm | 0 - 50 | Heating |
| 4,171 K - 8,060 K | Moderate | - | - |
| 8,061 K - 25,000 K | Cool | 0 - 100 | Cooling |
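The type and duration mapping can be sketched as follows. The sketch assumes that the frame count of a subsection refers to sampled frames and that the sampling rate (here a hypothetical parameter `sampled_fps`, defaulting to two frames per second) is known. Returning `None` for the moderate category is also an assumption, reflecting that the table assigns it no effect.

```python
# Category-to-type mapping; the moderate category produces no effect.
TYPE_BY_CATEGORY = {"hot": "heating", "warm": "heating", "cool": "cooling"}


def to_effect_info(category: str, num_frames: int, sampled_fps: float = 2.0):
    """Map a merged subsection to the type and duration of a temperature effect.

    sampled_fps is the rate at which frames were extracted (assumed known).
    Returns None when the category has no associated temperature effect.
    """
    effect_type = TYPE_BY_CATEGORY.get(category)
    if effect_type is None:
        return None
    return {"type": effect_type, "duration_s": num_frames / sampled_fps}


if __name__ == "__main__":
    print(to_effect_info("hot", 6))       # six sampled frames at 2 fps
    print(to_effect_info("moderate", 4))  # no temperature effect
```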
The intensity of the temperature effect information is calculated using the color temperature boundary values of each category. The formula used to calculate the intensity in units of percent is shown in (1).

$$\mathrm{Intensity} = \frac{\left| RCT_{bound}^{i} - RCT_{in} \right|}{\left| RCT_{max}^{i} - RCT_{min}^{i} \right|} \times 50 \times I_{coef}^{i} + I_{base}^{i}, \quad i \in \{\mathrm{Hot}, \mathrm{Warm}, \mathrm{Cool}\} \tag{1}$$

The reciprocal color temperature (RCT) in (1) is the unit of measurement used to express the color temperature. It is given by the formula

$$RCT = \frac{1{,}000{,}000}{CT},$$

where $RCT$ is the reciprocal color temperature value and $CT$ is the color temperature in kelvins. In (1), $RCT_{in}$ is the reciprocal of the average color temperature, $RCT_{bound}^{i}$ is the boundary value of each color temperature category, $RCT_{max}^{i}$ and $RCT_{min}^{i}$ respectively represent the reciprocal values at the maximum and minimum color temperatures of each category, and $I_{base}^{i}$ and $I_{coef}^{i}$ are constants that keep the intensity between 0 and 100. The differences in (1) are taken as magnitudes, so that the constants in Table 2 yield intensities within the ranges listed in Table 1. Table 2 shows the constants included in (1).
Table 2. Constants in (1).

| Constant | i = Hot | i = Warm | i = Cool |
|---|---|---|---|
| RCTbound | 444 | 240 | 124 |
| RCTmax | 444 | 240 | 40 |
| RCTmin | 600 | 444 | 124 |
| Ibase | 50 | 0 | 0 |
| Icoef | 1 | 1 | 2 |
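A minimal sketch of the intensity calculation in (1) with the constants of Table 2 follows. It assumes that the differences in (1) are taken as magnitudes, which is the reading under which these constants reproduce the intensity ranges of Table 1. The final clamp to 0-100 is an added safeguard against rounding at the category boundaries, not part of (1). All names are hypothetical.

```python
# Constants of Table 2, keyed by category.
CONSTANTS = {
    "hot": {"bound": 444, "max": 444, "min": 600, "base": 50, "coef": 1},
    "warm": {"bound": 240, "max": 240, "min": 444, "base": 0, "coef": 1},
    "cool": {"bound": 124, "max": 40, "min": 124, "base": 0, "coef": 2},
}


def reciprocal_ct(ct_kelvin: float) -> float:
    """Reciprocal color temperature: RCT = 1,000,000 / CT."""
    return 1_000_000 / ct_kelvin


def intensity_percent(average_ct_kelvin: float, category: str) -> float:
    """Intensity (%) of a temperature effect, following (1).

    Assumes the numerator and denominator of (1) are magnitudes.
    """
    c = CONSTANTS[category]
    rct_in = reciprocal_ct(average_ct_kelvin)
    ratio = abs(c["bound"] - rct_in) / abs(c["max"] - c["min"])
    value = ratio * 50 * c["coef"] + c["base"]
    # Clamp to 0-100 (an added safeguard, not part of (1)).
    return min(100.0, max(0.0, value))


if __name__ == "__main__":
    for ct, category in [(1667, "hot"), (2250, "hot"), (3000, "warm"), (25000, "cool")]:
        print(ct, category, round(intensity_percent(ct, category), 1))
```

For example, an average color temperature of 3,000 K (warm) gives RCT = 333.3, so the intensity is |240 - 333.3| / |240 - 444| × 50, or roughly 23%.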
Table 3. Mapping between temperature effect information and temperature effect metadata as specified in MPEG-V.

| Temperature effect info.: type | Temperature effect info.: intensity-range (%) | Temperature effect metadata: intensity-value (°C) | Temperature effect metadata: intensity-range (°C) |
|---|---|---|---|
| Heating | 0 - 100 | 26 - 30 | 18 - 30 |
| Cooling | 0 - 100 | 18 - 22 | 18 - 30 |
With the generated temperature effect information, the temperature effect metadata specified by MPEG-V can be created. Table 3 shows how the temperature effect information is converted to the attributes of the temperature effect metadata specified in MPEG-V. Because the temperature effect metadata specified in MPEG-V cannot describe the type of temperature effect (heating or cooling), the type and the intensity from the temperature effect information must be mapped together to the intensity-value of the temperature effect metadata. The intensity-range attribute of the temperature effect metadata can be set to 18°C to 30°C, a range in which ordinary devices can operate normally. Because most people feel comfortable in a temperature range of 22°C to 24°C, we regard temperature effects below 22°C as cooling effects and temperature effects above 26°C as heating effects. The duration of the temperature effect information is used directly as the duration attribute of the temperature effect metadata.
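The conversion from type and intensity to an intensity-value in degrees Celsius can be sketched as a linear interpolation over the ranges of Table 3. Two points are assumptions rather than statements from the text: the interpolation is linear, and a stronger cooling intensity maps to a lower temperature (100% cooling gives 18°C). The function name is hypothetical.

```python
def intensity_value_celsius(effect_type: str, intensity_percent: float) -> float:
    """Map an effect type and a 0-100% intensity to a temperature in °C.

    Assumes linear interpolation over the ranges of Table 3:
    heating spans 26-30 °C, cooling spans 22-18 °C (stronger is colder).
    """
    if not 0 <= intensity_percent <= 100:
        raise ValueError("intensity must be between 0 and 100")
    fraction = intensity_percent / 100.0
    if effect_type == "heating":
        return 26.0 + 4.0 * fraction
    if effect_type == "cooling":
        return 22.0 - 4.0 * fraction
    raise ValueError(f"unknown effect type: {effect_type}")


if __name__ == "__main__":
    print(intensity_value_celsius("heating", 75))  # 29.0
    print(intensity_value_celsius("cooling", 50))  # 20.0
```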
- 2. Implementation of Authoring Tool for Temperature Effect
To author MulSeMedia content, an efficient authoring tool is required to support the sensorial information extraction and mapping. Figure 3 demonstrates the authoring GUI to serve such a purpose.
The layout of the tool is composed of three parts: a video control component, a temperature effect authoring component, and an effect-timeline component. Numbers ① through ③ featured in Fig. 3 are described below.
  • ① Video control: a video can be loaded, played, and stepped forward or backward frame by frame via this component.
  • ② Temperature effect authoring: this component provides functionalities to select the frame interval, to calculate color temperatures, to map color temperature properties, and to create temperature effect metadata.
  • ③ Effect-timeline: this component enables the selection of the video section to extract color temperatures, to show the calculated color temperature categories in the section, and to depict the created temperature effects.
Fig. 3. GUI composition of temperature effect authoring tool.
Figure 4 shows an XML instance of consecutive temperature effects, which is automatically generated using the authoring tool introduced in Fig. 3.
Fig. 4. Automatically generated XML instance of temperature effects.
IV. Experiments
- 1. Assessment of Quality of Experience
To study the impact on the quality of experience (QoE) when consuming MulSeMedia content annotated with temperature effects, we conduct a subjective quality assessment. We adopt methods defined by ITU-T Rec. P.910 [16] and ITU-T Rec. P.911 [17]. The five-level impairment scale of the degradation category rating (DCR) [16], [17] is turned into a new five-level enhancement scale (that is, Big Enhancement, Little Enhancement, Imperceptible, Annoying, and Very Annoying) [10]. In this paper, the DCR with the modified five-level enhancement scale [10] is used to assess the level of satisfaction with the temperature effects generated by the proposed method.
- 2. Experimental Setting and Procedure
Our subjective test involves 30 volunteers (17 males and 13 females) aged between 22 and 31. None of the participants have previously taken part in a similar subjective test. The room temperature is maintained between 22°C and 24°C without the operation of any heating or cooling devices. Participants wear earphones so as not to experience any interference from the noises generated by the fan devices.
Figure 5 shows a media player with temperature effect display devices. The cooling effect is displayed by a cooling fan, and the heating effect is displayed by a heating fan. Both devices can render two levels of blowing intensity.
Fig. 5. Experimental system setup: (left to right) cooling fan, media player, and heating fan.
Table 4. Test video sequences.

| ID | Title | Bit rate (kbit/s) | Resolution | Duration (s) | No. of effects |
|---|---|---|---|---|---|
| c1 | The Chronicles of Narnia: The Voyage of the Dawn Treader | 1,953 | 720×400 | 26.193 | 2 |
| c2 | The Last Airbender | 1,953 | 720×304 | 29.697 | 2 |
| c3 | Haeundae | 1,953 | 720×304 | 29.2 | 4 |
| h1 | Batman Begins | 1,953 | 1,280×528 | 24.859 | 4 |
| h2 | Sherlock Holmes | 1,953 | 704×384 | 30.30 | 3 |
| h3 | Harry Potter and the Order of the Phoenix | 1,953 | 1,280×528 | 25.192 | 3 |
Table 4 shows information about the test video sequences. We select video sequences that contain scenes with heat-related content (for example, a fire scene) or cool-related content (for example, a major tidal scene with wind). The ID of each video sequence is assigned in accordance with the type (heating or cooling) of temperature effect. The ID starts with “h” if a sequence contains heating effects and with “c” if a sequence contains cooling effects. The number of effects represents the number of temperature effects created automatically by the proposed method. The experimental procedure in terms of the DCR is detailed as follows.
  • ① The test sequences are shown to test subjects in a random order.
  • ② A reference video sequence without sensory effects is shown first, and then the same video sequence with sensory effects is shown two seconds later.
  • ③ Test subjects evaluate the sequence using a five-level enhancement scale approximately 10 seconds after watching both video sequences.
  • ④ The evaluation is repeated for every test video sequence.
- 3. Experiment Results
Figure 6 shows the level of satisfaction (that is, the voting results) with the temperature effects deployed to the test subjects. The votes on the five-level enhancement scale are described in the graph.
Fig. 6. Evaluation results.
Figure 7 shows the level of satisfaction according to %GOB (good or better), %POW (poor or worse), and %Rest. The %GOB attribute represents Big Enhancement and Little Enhancement, the %POW attribute represents Annoying and Very Annoying, and the %Rest attribute represents Imperceptible. The level of satisfaction with the temperature effects is clearly increased, as shown in Fig. 7.
Fig. 7. Evaluation using %GOB, %POW, and %Rest.
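The grouping of votes into %GOB, %POW, and %Rest can be sketched as below. The vote list in the usage example is made-up illustration data, not the experimental results, and the function name is hypothetical.

```python
GOB_LABELS = {"Big Enhancement", "Little Enhancement"}
POW_LABELS = {"Annoying", "Very Annoying"}
REST_LABELS = {"Imperceptible"}


def summarize_votes(votes):
    """Return the percentage of votes in each of %GOB, %POW, and %Rest."""
    if not votes:
        raise ValueError("no votes given")
    unknown = set(votes) - GOB_LABELS - POW_LABELS - REST_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = len(votes)
    gob = sum(vote in GOB_LABELS for vote in votes)
    pow_count = sum(vote in POW_LABELS for vote in votes)
    rest = total - gob - pow_count
    return {
        "%GOB": 100.0 * gob / total,
        "%POW": 100.0 * pow_count / total,
        "%Rest": 100.0 * rest / total,
    }


if __name__ == "__main__":
    # Hypothetical votes for one sequence (illustration only).
    votes = (
        ["Big Enhancement"] * 6
        + ["Little Enhancement"] * 2
        + ["Imperceptible"]
        + ["Annoying"]
    )
    print(summarize_votes(votes))
```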
Two participants among the test subjects feel annoyed by most of the deployed sensory effects. The first subject votes “Annoying” five times and “Imperceptible” once, while the second subject votes “Annoying” three times and “Imperceptible” three times. They vote negatively because they feel the deployed sensory effects interfere with their immersion into the movie content.
Figure 8 shows the mean opinion score (MOS) of each test video sequence with its 95% confidence interval. The continuous rating scale for the enhancement of the QoE ranges from zero to five, where five indicates a significant enhancement and zero indicates that the sensory effects are significantly annoying. Most of the video sequences receive an average MOS of approximately four points, indicating that sequences with temperature effects enhance the level of satisfaction.
Fig. 8. Mean opinion scores with confidence interval of five-level enhancement scale voting.
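A MOS with a 95% confidence interval can be computed along the following lines. The sketch uses the normal approximation (z = 1.96) with the sample standard deviation, which is an assumption; the source does not state which interval formula was used. The scores in the usage example are made-up illustration data.

```python
import math
import statistics


def mos_with_confidence_interval(scores, z: float = 1.96):
    """Return (mean, lower, upper) for a list of numeric opinion scores.

    Uses mean ± z * s / sqrt(n) with the sample standard deviation s.
    z = 1.96 corresponds to a 95% interval under a normal approximation.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Hypothetical scores for one sequence (illustration only).
    mean, lower, upper = mos_with_confidence_interval([5, 4, 4, 3, 5, 4])
    print(f"MOS = {mean:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```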
The sensorial information (that is, temperature effect information) extracted using color temperatures and its mapping to temperature effects enhance the level of satisfaction because the temperature effects generated by the proposed method coincide with the video content. Therefore, the method proposed in this paper is well suited for MulSeMedia content authoring.
V. Conclusion
In this paper, a method to extract sensorial information (that is, temperature effect information) using color temperatures with sensorial information mapping was introduced to author temperature effects of MulSeMedia content automatically. An authoring tool to apply the proposed method was also introduced. The temperature effects generated by the proposed method were evaluated by a subjective test to measure the level of satisfaction. The MOS results showed that most of the test video sequences received an average of approximately four points, meaning that the sequences with temperature effects, automatically generated by the proposed method, clearly enhanced the level of satisfaction.
In the future, we plan to extend our research to investigate other types of sensorial information that can be automatically extracted from audiovisual content, such as motion blur, texture, sound, and music. Mapping methods between the extracted sensorial information and sensory effects will be reported as well.
This research was funded by the MSIP (Ministry of Science, ICT & Future Planning), Korea in the ICT R&D Program 2013 (Development of Broadcasting System based on Personalized Emotional UI/UX).
BIO
goldmunt@gmail.com
Sang-Kyun Kim received his BS, MS, and PhD degrees in computer science from the University of Iowa in 1991, 1994, and 1997, respectively. In 1997, he joined the Samsung Advanced Institute of Technology as a researcher. He was a senior research staff member as well as a project leader on the Image and Video Content Search Team of the Computing Technology Lab until 2007. He is now an associate professor in the Department of Computer Engineering at Myongji University. His research interests include digital content (image, video, and music) analysis and management, fast image search and indexing, color adaptation, 4D media, sensors, VR, and multimedia standardization. He serves as a project editor of MPEG-V International Standards, that is, ISO/IEC 23005-2/3/4/5 and 23005-7.
sjyang@etri.re.kr
Seung-Jun Yang received his BS degree in computer science from Suncheon National University, Rep. of Korea, in 1999 and his MS degree in computer science from Chonnam National University, Rep. of Korea, in 2001. Since 2001, he has been a senior researcher in the Realistic Broadcasting Media Research Department of ETRI, where he has developed advanced digital television technology, including data broadcasting and personalized broadcasting. He participated in making the domestic transmission and reception standard for terrestrial personalized broadcasting as a member of the Telecommunications Technology Association. He is currently involved in the development of emotion-based broadcasting service and assistive broadcasting service for disabled people.
hyun@etri.re.kr
Chung Hyun Ahn received his PhD degree in GIS/RS from Chiba University, Japan, in 1995 and worked as a member of the research and teaching staff at Chiba University. He has been working at ETRI since 1996, leading many projects in GIS/RS (1996-2000) and digital broadcasting areas (2001-present). He was the leader of the 3DTV and DMB and Next Broadcasting Service Planning Team. Currently, the major focus of his research is emotion-based broadcasting service and new broadcasting service for disabled people.
dkjs112@gmail.com
Yong Soo Joo received his BS and MS degrees in the Department of Computer Engineering of Myongji University in 2008 and 2010, respectively. He is now a PhD Student in the Department of Computer Engineering of Myongji University. His research interests include digital content (image and video) analysis and management, color adaptation, 4D media, sensors, VR, and multimedia standardization.
References
[1] R. Kannan, S.R. Balasundaram, and F. Andres, “The Role of Mulsemedia in Digital Content Ecosystem Design,” Proc. Int. Conf. Manag. Emergent Digital EcoSyst., 2010, pp. 264-266. DOI: 10.1145/1936254.1936305
[2] ISO/IEC 23005-2:2011, Information Technology – Media Context and Control – Part 2: Control Information, 2011.
[3] ISO/IEC 23005-3:2011, Information Technology – Media Context and Control – Part 3: Sensory Information, 2011.
[4] ISO/IEC 23005-4:2011, Information Technology – Media Context and Control – Part 4: Virtual World Objects and Characteristics, 2011.
[5] ISO/IEC 23005-5:2011, Information Technology – Media Context and Control – Part 5: Data Formats for Interaction Devices, 2011.
[6] M. Waltl, C. Timmerer, and H. Hellwagner, “A Test-bed for Quality of Multimedia Experience Evaluation of Sensory Effects,” Proc. Int. Workshop Quality Multimedia Exp., San Diego, CA, USA, July 29-31, 2009, pp. 145-150. DOI: 10.1109/QOMEX.2009.5246962
[7] B. Choi, E.-S. Lee, and K. Yoon, “Streaming Media with Sensory Effect,” Proc. Int. Conf. Inf. Sci. Appl., Jeju Island, Rep. of Korea, Apr. 26-29, 2011, pp. 1-6. DOI: 10.1109/ICISA.2011.5772390
[8] Y.-S. Joo and S.-K. Kim, “Sensory Effect Authoring Tool for Sensible Media,” J. Broadcast Eng., vol. 16, no. 5, 2011, pp. 693-893. DOI: 10.5909/JEB.2011.16.5.773
[9] S.-K. Kim, “Authoring Multisensorial Content,” Signal Process., Image Commun., vol. 28, no. 2, 2013, pp. 162-167. DOI: 10.1016/j.image.2012.10.011
[10] C. Timmerer, “Assessing the Quality of Sensory Experience for Multimedia Presentations,” Signal Process., Image Commun., 2012, pp. 909-916. DOI: 10.1016/j.image.2012.01.016
[11] S.-K. Kim, Y.-S. Joo, and Y. Lee, “Sensible Media Simulation in an Automobile Application and Human Responses to Sensory Effects,” ETRI J., vol. 35, no. 6, 2013, pp. 1001-1010. DOI: 10.4218/etrij.13.2013.0038
[12] G. Wyszecki and W.S. Stiles, Color Science, Concepts and Methods, Quantitative Data and Formulae, 2nd ed., New York: John Wiley & Sons, Inc., 1982, pp. 224-225.
[13] ISO/IEC JTC1/SC29/WG11 M7265, “Report of VCE-6 on MPEG-7 Color Temperature Browsing Descriptors,” Sydney, Australia, 2001.
[14] ISO/IEC JTC1/SC29/WG11 M7712, “Report of VCE-6 on MPEG-7 Color Temperature Browsing Descriptors,” Pattaya, Thailand, 2001.
[15] ISO/IEC JTC1/SC29/WG11 M7993, “Report of VCE-6 on MPEG-7 Color Temperature Descriptor for Display Preference,” Jeju Island, Rep. of Korea, 2002.
[16] ITU-T Rec. P.910, Subjective Video Quality Assessment Methods for Multimedia Applications, 2008.
[17] ITU-T Rec. P.911, Subjective Audiovisual Quality Assessment Methods for Multimedia Applications, 1998.