A survey of content-based image retrieval with high-level semantics
To improve the retrieval accuracy of content-based image retrieval systems, research focus has shifted from designing sophisticated low-level feature extraction algorithms to reducing the ‘semantic gap’ between visual features and the richness of human semantics. This paper attempts to provide a comprehensive survey of recent technical achievements in high-level semantic-based image retrieval. The survey includes major recent publications covering different aspects of research in this area, including low-level image feature extraction, similarity measurement, and the derivation of high-level semantic features. We identify five major categories of state-of-the-art techniques for narrowing the ‘semantic gap’: (1) using object ontology to define high-level concepts; (2) using machine learning methods to associate low-level features with query concepts; (3) using relevance feedback to learn users’ intentions; (4) generating semantic templates to support high-level image retrieval; (5) fusing evidence from HTML text and the visual content of images for WWW image retrieval. In addition, related issues such as image test beds and retrieval performance evaluation are discussed. Finally, based on existing technology and the demands of real-world applications, a few promising future research directions are suggested.
- Content-based image retrieval
- Semantic gap
- High-level semantics
Copyright © 2006 Pattern Recognition Society. Published by Elsevier B.V. All rights reserved.
About the Author: Ms. YING LIU received her B.Sc. and M.Sc. degrees from the Dept. of Infor. Eng., Xidian University, China, in 1993 and 1996, respectively. She then served as an associate lecturer in the same Dept. for 2 years. She received her M.Eng. degree from the Dept. of E.E., National University of Singapore, in 2000. After this, she worked as a research engineer in the Center for Signal Processing, Nanyang Technological University, Singapore. Ms. Liu is now a Ph.D. candidate in the Gippsland School of Computing and Information Technology, Monash University, Australia.
About the Author: Dr. DENGSHENG ZHANG received a B.Sc. in Mathematics and a B.A. in English in 1985 and 1987, respectively, both from China. He spent 12 years teaching Mathematics and Computing before starting his Ph.D. program in 1999. He received his Ph.D. in Computer Technology from Monash University, Australia, in 2002. He is now a lecturer in the Gippsland School of Computing and Information Technology, Monash University. Dr. Zhang has over 10 years of research experience in the area of multimedia and has published over 20 refereed international journal and conference papers.