The ENP-China project is developing two major databases for research on modern China. They are designed to serve as fundamental resources for the study of people and places in China between 1830 and 1949. These databases are being developed as part of the research agenda of the ENP-China team, especially through the projects and case studies it explores. Their purpose and coverage, however, extend beyond the topics actively pursued in that research: they aim to provide essential data and tools.
The modern period in China presents a particular challenge that historians of earlier periods did not have to address. Until the nineteenth century, sources were by and large produced in China proper and in Chinese (though the use of Manchu and Mongol under the Qing offers some parallels). Two closely connected processes changed the conditions of knowledge production in the modern period: the rise of mass printing, including the formidable development of the press, and the involvement of a wide array of actors, not just Chinese but also foreigners, in knowledge production. The result is a massive amount of information about modern China. Over the century of Western presence through businessmen, diplomats, settlers, travelers, scholars, etc., “things Chinese” came to be named and described by people with different language backgrounds (English, French, Italian, German, etc.).
Thus, across both time and place of origin, all sorts of transliteration systems emerged, even within a single language domain such as English, and these also varied according to the Chinese dialect on which they were based. The wealth of information contained in the books, periodicals, archives, manuscripts, etc. produced about or in China before 1949 is simply unintelligible without the often delicate exercise of converting these terms (mainly people’s names, locations, and institutions) back into Chinese. To give an example, 宋子文 was known as T.V. Soong, not as Sung Tzu-wen (standard Wade-Giles) or Song Ziwen (current pinyin). Song Ziwen was famous, like his two brothers-in-law Chiang Kai-shek (蔣介石, a romanization based on Cantonese pronunciation) and H. H. Kung (Kong Xiangxi 孔祥熙). But who knows who Mr. Fong Sek was?
The same applies to place names, which raise issues of transliteration but also of plain Western terms used to designate locations in Western sources. In the latter category, Port-Arthur for 旅順 (Lüshun) in Manchuria (now Liaoning Province) is a prime example. But even for a well-known place such as 上海 (Shanghai), the range of names is bewildering: Shanghae, Shang-hai, Chang-haï, Schanghai, Sciangai, etc. For lesser-known places, it becomes virtually impossible to identify the actual place name in Chinese and to locate the place, or it can be done only at the great cost of painstakingly checking each place name and retrieving its Chinese form. Such difficulties stand in the way of any form of spatial approach.
Even Chinese sources pose specific difficulties: changes of name (places), variations of name (institutions), and courtesy names, aliases, alternative given names, pen names, etc. (individuals). The very same individual can be known under several names used concurrently during his or her lifetime, such as Zhang Gongquan 張公權 / Zhang Jia’ao 張嘉璈. Conversely, despite the richness of the Chinese language for names, distinct individuals sometimes bear the very same name, occasionally within the same source.
One of the major objectives of the ENP-China project is to establish the conditions for solving two problems: identifying the transliterated names of individuals, places, institutions, and other named entities in non-Chinese sources, and handling the multiplicity of names in Chinese. In both cases, each name is to be attached to the unique identifier of the entity it designates. This is the only way to collect biographical data through data mining and to attach it incrementally and accurately to the right entity.
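The principle of tying many name variants to one identifier can be illustrated with a minimal sketch. It is not the project's actual implementation: the `NameAuthority` class, the `normalize` function, and the identifiers `P-0001` and `L-0001` are all hypothetical, and the normalization (dropping accents, case, punctuation, and spaces) is only one simple way to compare spellings. The name variants themselves are those cited above.

```python
import unicodedata
from collections import defaultdict


def normalize(name: str) -> str:
    """Build a comparison key: drop accents, case, punctuation and spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in no_marks.casefold() if ch.isalnum())


class NameAuthority:
    """Hypothetical registry mapping name variants to entity identifiers."""

    def __init__(self) -> None:
        # One key may point to several identifiers, because distinct
        # individuals can bear the very same name.
        self._index: dict[str, set[str]] = defaultdict(set)

    def register(self, entity_id: str, *variants: str) -> None:
        """Attach every known variant of a name to one identifier."""
        for variant in variants:
            self._index[normalize(variant)].add(entity_id)

    def resolve(self, name: str) -> list[str]:
        """Return all candidate identifiers for a name (empty if unknown)."""
        return sorted(self._index.get(normalize(name), set()))


authority = NameAuthority()

# Identifiers below are invented for illustration only.
authority.register("P-0001", "宋子文", "T.V. Soong", "Sung Tzu-wen", "Song Ziwen")
authority.register(
    "L-0001", "上海", "Shanghai", "Shanghae", "Shang-hai",
    "Chang-haï", "Schanghai", "Sciangai",
)

print(authority.resolve("T. V. Soong"))  # spacing differences are ignored
print(authority.resolve("Chang-haï"))    # accents and hyphens are ignored
print(authority.resolve("Fong Sek"))     # not registered yet: no candidate
```

Because `resolve` returns a list rather than a single value, a name shared by several individuals yields several candidate identifiers, and the choice among them has to be made from context (source, date, affiliation). An unregistered name such as Fong Sek returns no candidate, which flags it for manual identification.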
The MCBD aims to collect biographical data on any individual active in China, both Chinese and non-Chinese, through systematic data mining in three kinds of material: source books such as directories, biographical dictionaries, and Who’s Who volumes; newspapers and periodicals; and the academic literature. The design and structure of the MCBD are described in detail in the MCBD User Manual.