Jie Tang (唐杰)

 

Assistant Professor, IEEE Member, ACM Professional Member.

Department of Computer Science and Technology, Tsinghua University

 

Work Phone Number: +8610-62788788-18

Office Address: 1-308, FIT Building, Tsinghua University, Beijing 100084, P.R. China.

E-Mail Address: abcd

My Blog: Jie Tang’s Blog

My FOAF: Jie Tang’s FOAF

My ArnetMiner Page: Jie Tang’s Arnet

 

 

 

 

I am an assistant professor in the Department of Computer Science and Technology (DCST) at Tsinghua University. I obtained my Ph.D. from DCST, Tsinghua University, in 2006. I became an ACM Professional Member in 2006 and an IEEE Member in 2007.

 

I was an intern in the NLC group at Microsoft Research Asia from 2004 to 2005. I also took part in the internship program at IBM China Research Lab in 2004.

 

I am interested in information extraction, the Semantic Web (especially semantic annotation and ontology mapping), text mining, and statistical learning (maximum-margin learning and sequential labeling).

 

New: ArnetMiner 2.0 is online!

 

RESEARCH PROJECTS

·      Research of Semantic Content Annotation. (2008-2010). Chinese Young Faculty Research Funding under Grant No. 20070003093 (PI).

·      Social Search in Web Community. (2008-2009). Joint Research Project funded by IBM China Research Lab (PI). In this project, we jointly study how to combine human intelligence with computer algorithms to improve search quality. We also consider how to model tagging and time-related information in the social search model.

·      Research of Unified Models for Semantic Content Annotation. (2008-2010). NSFC Funded Project under Grant No. 60703059 (PI). The project addresses semantic content annotation and semantic relationship extraction. Specifically, we will study different Markov random fields (e.g., Tree-structured Conditional Random Fields) for extracting and annotating semantic instances and semantic relationships. We plan to apply the proposed models to a real-world social network system, ArnetMiner.

·      Requirement Engineering Validation and Management. (2007-2012). National Foundational Science Research (973) under Grant No. 2007CB310803. (Major member).

·      Expertise Oriented Mining for Web Community. (2007-2008). Minnesota/China Collaborative Research Program jointly funded by University of Minnesota and Tsinghua University (PI and co-PI with Prof. Loren Terveen). In this research project, we will jointly study new expertise-oriented mining issues in Web-based social networks. Specifically, we will investigate three sub-topics: structured data extraction, information integration, and expertise search.

·      Semantic Web-based Social Network Mining. (2007-2009). Research project funded by Tsinghua University (PI). In this project, we study mining issues in Semantic Web-based social networks. Specifically, we will integrate different Web-based social networks into a single Semantic Web-based social network, investigate the name disambiguation problem, and study the trust problem in the resulting network.

·      Text Mining for Web 2.0. (2007-2008). Joint Research Project funded by IBM China Research Lab (PI). In this research project, we jointly study new techniques for text mining in Web 2.0. Specifically, we investigate new mining issues on structured and unstructured data (e.g., social network mining). We have developed a prototype system, ArnetMiner.

·      Research of Semantic Web Content Availability. (2007-2008). Research project funded by DCST, Tsinghua University (PI). In this project, we focus on investigating new sequential labeling models for semantic annotation and new supervised machine learning methods for ontology alignment.

·      Information Sharing and Recommendation in Web Community. (2006-2008). Project funded through international cooperation (PI). http://www.powazi.com/.

·      Toward Managing Semantic Web Content. (2007-). Joint Research Project funded by IBM China Research Lab (co-PI with Prof. Juanzi Li). In this project, we will apply advanced Semantic Web technologies and integrated development environments to manipulate semantic information and thereby simplify the management of Semantic Web content. We will use ontologies to represent, organize, and manage content from different sources, including databases, web pages, and plain text.

·      Research of Key Technologies and their Application in Domain-specific Semantic Web Content Management. (2006-2008). NSFC Funded Project under Grant No. 90604025 (Major member). The project addresses semantic annotation, ontology mapping, and semantic content management (e.g., semantic retrieval and semantic data visualization). We have developed a prototype system called SWARMS (Semantic Web Aided Rich Mining System), an ontology alignment tool called RiMOM (Risk Minimization based Ontology Mapping), and several semantic annotation tools.

·      Semantic Web, Ontology, Granularity and Distributed Ontology System. (2003-2004). A project funded by the National Natural Science Foundation of China under Grant No. 60443002 (Major member). In this project, I focused on investigating new information extraction methods for Semantic Web annotation and new matching methods for ontology mapping.

·      Advanced Semantic Web Technologies to Support Ontology based Enterprise Content Management. (2006-2008). A project funded by China-Greece Academic Research Funding (Major member). In this project, we aim to employ information extraction and information integration methods in enterprise content management. Specifically, we try to extract information from documents in different formats and to integrate information from different data sources.

·      TIPSI (The Intelligence Processor of Semi-structured Information). (2002-2005). A project funded by International Cooperation with ITF Frontier (Major member). TIPSI aims to extract complex information from semi-structured information. Enterprise annual reports from the Shanghai Stock Exchange are used as experimental data. The reports are first converted into a uniform XML format and then passed through a semi-automatic extraction process; the results are finally presented as a semantic view.

·      Email Data Cleaning. (2004-2005). This work was conducted while I was an intern at Microsoft Research Asia. Mentor: Hang Li. In this work, we investigate the issue of email data cleaning. Many text mining applications take emails as input. Email data is usually noisy, so it is necessary to clean it before mining. Email cleaning is formalized as a problem of non-text filtering and text normalization. A cascaded approach is proposed, which cleans up an email in four passes: non-text filtering, paragraph normalization, sentence normalization, and word normalization. Methods for performing these tasks on the basis of Support Vector Machines (SVM) have also been proposed in this work.

·      Indoor Location of Wireless Device. (2004). This work was conducted while I was an intern at IBM CRL (China Research Laboratory). Mentor: Zhe Xiang. This project aims to predict the indoor location of wireless devices from the signals collected by multiple access points. A supervised machine learning method is proposed to predict the location.

 

PUBLICATIONS

2008

·      Jing Zhang, Jie Tang, Liu Liu, and Juanzi Li. A Mixture Model for Expert Finding. In Proceedings of 2008 Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’2008). (to appear)

·      Jie Tang, Duo Zhang, Limin Yao, and Yi Li. Automatic Semantic Annotation using Machine Learning. In the book The Semantic Web for Knowledge and Data Management: Technologies and Practices. Zhongmin Ma (Ed.), Springer Inc. (to appear)

·      Jie Tang, Bangyong Liang, and Juanzi Li. SWARMS: A Platform for Domain Knowledge Management and Applications. In the book The Semantic Web for Knowledge and Data Management: Technologies and Practices. Zhongmin Ma (Ed.), Springer Inc. (to appear)

2007

·      Jie Tang, Jing Zhang, Duo Zhang, Limin Yao, and Chunlin Zhu. ArnetMiner: An Expertise Oriented Search System for Web Community. Semantic Web Challenge. In Proceedings of the 6th International Semantic Web Conference (ISWC’2007).

·      Jie Tang, Duo Zhang, and Limin Yao. Social Network Extraction of Academic Researchers. In Proceedings of 2007 IEEE International Conference on Data Mining (ICDM’2007). pp. 292-301

·      Limin Yao, Jie Tang, and Juanzi Li. A Unified Approach to Researcher Profiling. In Proceedings of 2007 IEEE/WIC/ACM International Conferences on Web Intelligence (WI’2007). pp. 359-366

·      Xin Xin, Jie Tang, and Juanzi Li. Enhancing Semantic Web by Semantic Annotation: Experiences in Building an Automatic Conference Calendar. In Proceedings of 2007 IEEE/WIC/ACM International Conferences on Web Intelligence (WI’2007). pp. 439-442

·      Duo Zhang, Jie Tang, Juanzi Li, and Kehong Wang. A Constraint-Based Probabilistic Framework for Name Disambiguation. In Proceedings of the Sixteenth Conference on Information and Knowledge Management (CIKM’2007). pp. 1019-1022

·      Jie Tang, Mingcai Hong, Duo Zhang, Bangyong Liang, and Juanzi Li. Information Extraction: Methodologies and Applications. In the book Emerging Technologies of Text Mining: Techniques and Applications, Hercules A. Prado and Edilson Ferneda (Eds.), Idea Group Inc., Hershey, USA, 2007. pp. 1-33 [PDF]

·      Jie Tang, Mingcai Hong, Jing Zhang, Bangyong Liang, Limin Yao, and Juanzi Li. ArnetMiner: Toward Building and Mining Social Networks. (Demo) In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD’2007).

·      Chonghui Zhu, Jie Tang, Hang Li, Hwee Tou Ng, and Tiejun Zhao. A Unified Tagging Approach to Text Normalization. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL’2007). pp. 688-695 [PDF] [PPT]

·      Yi Li, Jie Tang, Duo Zhang, and Juanzi Li. Toward Strategy Selection for Ontology Alignment. (Poster) In Proceedings of the 4th European Semantic Web Conference 2007 (ESWC’2007).

·      Juanzi Li, Jie Tang, Jing Zhang, Qiong Luo, Yunhao Liu, and Mingcai Hong. EOS: Expertise Oriented Search Using Social Networks. (Poster) In Proceedings of the 16th International World Wide Web Conference (WWW’2007). pp. 1271-1272 [PDF]

·      Duo Zhang, Mingcai Hong, and Jie Tang. Social SIM Networks. (Telcel Award). In the 8th Worldwide Mobile Communication and Java Card™ Developer Contest (SIMagine’2007 Finalist).