The ENP-China project is developing two major databases for research on modern China. They are designed to serve as fundamental resources for the study of people and places in China between 1830 and 1949. These databases are being developed as part of the research agenda of the ENP-China team, especially through the projects and case studies it explores. Their purpose and coverage, however, extend beyond the topics actively pursued in that research, and they aim to provide essential data and tools.
The modern period in China presents a particular challenge that historians of earlier periods did not have to address. Until the nineteenth century, sources were by and large produced in China proper and in Chinese (though the use of Manchu and Mongolian under the Qing offers some parallels). In the modern period, two closely connected processes changed the conditions of knowledge production: the rise of mass printing, including the formidable development of the press, and the involvement of a wide array of actors, foreigners as well as Chinese, in knowledge production. Together these produced a massive amount of information about modern China. In the course of the century of Western presence through businessmen, diplomats, settlers, travellers, scholars, etc., "things Chinese" came to be named and described by people with different language backgrounds (English, French, Italian, German, etc.).
Thus, across both time and place of origin, all sorts of transliteration systems emerged, even within a single language domain (English), with further variations based on different Chinese dialects. The wealth of information contained in the books, periodicals, archives, manuscripts, etc. produced about or in China before 1949 is simply unintelligible without an often delicate exercise in transposing these terms (people’s names, locations, and institutions, to name the main categories) into Chinese. To give an example, 宋子文 was known as T.V. Soong, not as Sung Tzu-wen (standard Wade-Giles) or Song Ziwen (current pinyin). Song Ziwen was famous, like his two brothers-in-law Chiang Kai-shek (蔣介石, a romanization based on the Cantonese pronunciation) and H. H. Kung (for Kong Xiangxi 孔祥熙). But who knows who Mr. Fong Sek was?
The same applies to place names, which raise not only issues of transliteration but also the use of plain Western terms to designate locations in Western sources. In the latter category, Port Arthur for 旅順 (Lüshun) in Manchuria (nowadays Liaoning Province) is a prime example. But even for a well-known place such as 上海 (Shanghai), the range of names is bewildering: Shanghae, Shang-hai, Chang-haï, Schanghai, Sciangai, etc. For lesser-known places, it becomes virtually impossible to identify the actual place name in Chinese and to locate the place, or it can be done only at the great cost of painstakingly checking each place name and retrieving its identity in Chinese. Such difficulties stand in the way of any form of spatial approach.
Even Chinese sources present specific difficulties: changes of name (places), variations of name (institutions), and courtesy names, aliases, alternative given names, pen names, etc. (individuals). The very same individual can be known under several names used concurrently in his or her lifetime, such as Zhang Gongquan 張公權 / Zhang Jia’ao 張嘉璈. Yet, despite the richness of the Chinese language for names, distinct individuals sometimes bear the very same name, occasionally within the same source.
One of the major objectives of the ENP-China project is to establish the conditions for solving two problems: identifying the transliterated names of individuals, places, institutions, and other named entities in non-Chinese sources, and handling the multiplicity of names in Chinese. In both cases, each name is to be attached to a unique identifier. This is the only way to collect biographical data through data mining and to attach it incrementally and accurately to the right entity.
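The idea of attaching every name variant to a unique identifier can be illustrated with a minimal sketch. It is not the project's implementation: the identifiers (`P0001`, `P0002`), the `normalize` rule, and the function names are assumptions made for illustration, and only the name variants themselves come from the examples above. Each lookup returns a list, because distinct individuals may share the same name.

```python
from collections import defaultdict

# Hypothetical identifiers; the name variants are those cited in the text.
PERSON_NAMES = [
    ("P0001", "宋子文"),
    ("P0001", "T.V. Soong"),
    ("P0001", "Sung Tzu-wen"),
    ("P0001", "Song Ziwen"),
    ("P0002", "孔祥熙"),
    ("P0002", "H. H. Kung"),
    ("P0002", "Kong Xiangxi"),
]


def normalize(name):
    """Casefold and drop spaces, periods and hyphens so that
    trivial spelling differences compare equal (an assumed rule)."""
    return "".join(ch for ch in name.casefold() if ch not in " .-")


def build_index(records):
    """Map each normalized name to the set of identifiers that use it."""
    index = defaultdict(set)
    for entity_id, name in records:
        index[normalize(name)].add(entity_id)
    return index


INDEX = build_index(PERSON_NAMES)


def resolve(name):
    """Return every identifier attached to this name (possibly none)."""
    return sorted(INDEX.get(normalize(name), set()))


print(resolve("T. V. Soong"))
print(resolve("Kong Xiangxi"))
print(resolve("Fong Sek"))
```

An unknown name such as Fong Sek simply returns an empty list, which is the case where manual identification is still needed.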
Modern China Biographical Database
The MCBD aims to collect biographical data on any individual active in China, both Chinese and non-Chinese, through systematic data mining in three kinds of sources: source books such as directories, biographical dictionaries, and Who’s Who volumes; newspapers and periodicals; and the academic literature. The design and structure of the MCBD are described in detail in the MCBD User Manual.
Modern China Geospatial Database
The MCGD aims to collect place names in their various denominations and to develop a historical gazetteer for the identification of place names in both Chinese and non-Chinese sources, and to provide their coordinates (latitude/longitude). The MCGD has several components:
MCGD Online Maps: this is a collection of shapefiles that reconstitute the administrative geography of China from 1820 to 1949 at various levels (Province/省, Prefecture/府, County/縣). For the 1820-1911 period, we relied on the excellent work done by the China Historical GIS project at Harvard University, except for 1909, which we recreated entirely from a new set of maps. For the Republican period, we recreated the successive provincial boundaries for each sub-period and drew a full map of China at the county level for 1934. MCGD Online Maps also includes a set of geo-referenced historical maps (rasters). Finally, we also publish here the thematic maps produced in the course of research, in order to build an online historical atlas.
MCGD Location Finder: the Finder serves to identify locations based on their name, whatever the language and the system of transliteration, and to provide their coordinates.
MCGD Bulk Finder: this is an interface for uploading a CSV file containing place names and matching them against the historical gazetteer.
MCGD Datasets: the historical gazetteer is made up of two connected files, Names and Locations, stored in a PostgreSQL database. The database is not directly accessible to external users, but the datasets are saved and updated regularly on our Zenodo account.
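How the two connected files, the Location Finder, and the Bulk Finder relate to one another can be sketched as follows. This is only an illustration under stated assumptions, not the MCGD schema or code: the identifiers (`L001`, `L002`), field names, the `normalize` rule, the `place` column name, and the functions `find` and `bulk_find` are all hypothetical, and the coordinates are rounded approximations. The name variants are those cited earlier in the text.

```python
import csv
import io
import unicodedata

# Hypothetical Locations file: one record per place, with rounded coordinates.
LOCATIONS = {
    "L001": {"name_zh": "上海", "lat": 31.2, "lon": 121.5},
    "L002": {"name_zh": "旅順", "lat": 38.8, "lon": 121.2},
}

# Hypothetical Names file: many name variants point to one location id.
NAMES = [
    ("上海", "L001"), ("Shanghai", "L001"), ("Shanghae", "L001"),
    ("Shang-hai", "L001"), ("Chang-haï", "L001"),
    ("Schanghai", "L001"), ("Sciangai", "L001"),
    ("旅順", "L002"), ("Lüshun", "L002"), ("Port-Arthur", "L002"),
]


def normalize(name):
    """Casefold, strip diacritics, and drop spaces, hyphens, periods
    and apostrophes (an assumed matching rule)."""
    decomposed = unicodedata.normalize("NFKD", name.casefold())
    return "".join(
        ch for ch in decomposed
        if not unicodedata.combining(ch) and ch not in " -.'"
    )


# Index from normalized name to the set of location ids that carry it.
NAME_INDEX = {}
for variant, location_id in NAMES:
    NAME_INDEX.setdefault(normalize(variant), set()).add(location_id)


def find(name):
    """Single lookup: return every matching location with its coordinates."""
    ids = sorted(NAME_INDEX.get(normalize(name), set()))
    return [{"id": i, **LOCATIONS[i]} for i in ids]


def bulk_find(csv_text, column="place"):
    """Bulk lookup: run find() on each row of a CSV with a place-name column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"query": row[column], "matches": find(row[column])} for row in reader]


for result in bulk_find("place\nSciangai\nPort Arthur\nNowhere\n"):
    print(result["query"], [m["id"] for m in result["matches"]])
```

Keeping names and locations in separate structures means a new spelling can be added without touching the coordinates, and a query with no match comes back with an empty list rather than a guess.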