Hu, Junfeng is an associate professor in the School of Computer Science, Peking University.
He received his Bachelor's degree in computer science and engineering from the University of Science and Technology of PLA in 1988 and his Ph.D. in computer science from the Department of Computer Science, Peking University, in 2001. After two years of post-doctoral research at Peking University, he began his teaching career in the Department of Computer Science, Peking University.
One of Dr. Hu’s research fields is the computer-aided analysis of ancient Chinese poetry. He established a computer-aided system for ancient poems of the Tang and Song dynasties and has published several papers in this field. This research was supported by the Chinese National Social Science Foundation. Dr. Hu was invited to serve as an expert member of the Chinese National Ancient Archive Protection Committee (2002-2012).
Dr. Hu's other research interest is text mining and the construction of language knowledge bases. He has published more than 30 papers in this field and has been awarded three NSFC projects in recent years on language resource construction and ontology mining. The ontologies constructed by his research group were used as the core knowledge base of the proposal-reviewer assignment system of the Chinese National Science Foundation.
Dr. Hu’s main research contributions fall into the following two areas:
1) Established a computer-aided analysis system for ancient Chinese poetry based on an automatically extracted dictionary of multi-character words. Contrary to the traditional view that most multi-character words appear only in modern Chinese (apart from some named entities), Hu’s research found more than forty thousand multi-character words used in ancient poetry. These words denote events and activities in ancient culture, traditional manufacturing skills, and some well-accepted metaphorical expressions of that age. The rhythm rules of ancient poetry are used to help find multi-character words of comparatively low frequency.
2) Established a practical ontology mining system that can extract a high-quality word-concept hierarchy from a large-scale raw corpus. The resulting hierarchies can be used as supervision or evaluation knowledge to optimize important language models and systems, such as word embeddings, unsupervised event extraction, and machine translation.