The Electronic Text Corpus of Sumerian Literature

Institute of Oriental Studies

The creation of text databases that can be shared over electronic networks opens up a wide range of new possibilities for research and teaching in ancient documentary studies (Newsletter Issue 4: Towards a Virtual Library of Ancient Documents). In this general context it was exciting for a seminar audience at the Centre to hear a report from Dr. Jeremy Black and Dr. Eleanor Robson of the Institute of Oriental Studies on the creation of an Electronic Text Database for Sumerian Literature.

Sumerian is the oldest written language in existence - reaching back into the Third Millennium BC - and preserves the oldest known literature in the world. The main body of Classical Sumerian literature consists of 50,000 lines of verse in a range of different genres, dating from approximately 2100 to about 1650 BC. In spite of its range, sophistication and influence on later literatures, however, Sumerian literature is little known to scholars in other fields and even less accessible to a wider audience.

Part of a Sumerian literary work, written on a clay tablet in cuneiform script. Photograph © Staatliche Museen zu Berlin

The reasons for this isolation are various. The recovery of Sumerian literature has its own complex archaeology. The majority of surviving compositions have had to be reconstructed during the past fifty years from thousands of often fragmentary clay tablets, inscribed in cuneiform writing, such as the Berlin tablet illustrated above.

The written survival of Sumerian literature is the product of a scribal tradition which has resulted in the preservation of individual compositions in multiple copies made by scribes over a range of several centuries. The construction of modern texts of Sumerian compositions from these individual and overlapping copies is slow and painstaking work. Relatively few compositions have yet been published in satisfactory or readily accessible editions. Several major compositions have not yet been edited at all. Moreover, continuing progress in knowledge of the language renders translations little more than twenty years old already unreliable or unusable.

Given these difficulties, a more dynamic and collaborative model of publication is required if the acute need for a coherently and systematically published, universally available textual corpus is to be met. The development of electronic text scholarship has now made it possible to aim at such a goal. A team based at Oxford University's Institute of Oriental Studies, working closely with the Humanities Computing Unit, has begun work on a project to produce a 'collected works' of over 400 poetic compositions of Classical Sumerian literature, equipped with translations, along the lines of the Perseus Project corpus available for Classical Greek and Latin literature.

A pilot project was undertaken in 1996/7, funded by a grant from the University of Oxford's Research and Equipment Committee and employing Dr Eleanor Robson as full-time postdoctoral researcher under the direction of Dr Jeremy Black. Its aims were to establish the extent of the Sumerian literary corpus and to collect complete source and publication information on each composition; to investigate and devise suitable technical procedures and a format for publication of the corpus; to establish a basis for international collaboration, and the sharing of material and expertise, with other electronic text corpora and Sumerian projects, in particular with the universities of Chicago and Philadelphia; and to produce specimen text and a publicly accessible pilot Web site.

The successful completion of these goals made it possible to attract further funding, from the Leverhulme Trust, to undertake the main project. Preparation of the Corpus began at Oxford University in November 1997 with a project team consisting of Dr Jeremy Black, Dr Graham Cunningham and Dr Gábor Zólyomi, with the continued collaboration of Dr Eleanor Robson.

The reconstructed texts are being encoded in Standard Generalised Markup Language (SGML), which will ensure the widest accessibility of the material into the foreseeable future. The principal form of delivery for the Corpus will be the World Wide Web. The Corpus will eventually comprise:

  1. An information database.
  2. Transliterations of 13 ancient literary catalogues.
  3. Composite texts of 409 literary compositions.
  4. New translations of all the composite texts.

The emphasis in the translations will be on providing coherent, readable English prose.
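
As a rough illustration of what encoding a composite text with its translation can look like, the following Python sketch builds a small tagged record. It is not the project's actual markup: the element names (`composition`, `l`, `translit`, `translation`), the line-by-line pairing of text and translation, and the sample phrase are all assumptions made for the example, and the output is XML (a restricted profile of SGML) because that is what the Python standard library can write.

```python
import xml.etree.ElementTree as ET


def encode_composition(title, lines):
    """Build tagged text for one composition.

    `lines` is a list of (transliteration, translation) pairs.
    All element and attribute names here are hypothetical.
    """
    comp = ET.Element("composition", {"title": title})
    for n, (translit, translation) in enumerate(lines, start=1):
        # One <l> element per line of the composite text, numbered from 1.
        line = ET.SubElement(comp, "l", {"n": str(n)})
        ET.SubElement(line, "translit").text = translit
        ET.SubElement(line, "translation").text = translation
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(comp, encoding="unicode")


# Illustrative textbook-style phrase, not taken from the Corpus.
markup = encode_composition(
    "Sample",
    [("lugal-e e2 mu-un-du3", "The king built the house.")],
)
print(markup)
```

Keeping the structure in named elements rather than in layout is what allows the same source to be delivered on the Web now and converted to other forms later.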

To enable users to check original sources and to explore variant traditions, it is essential to include transliterations of individual exemplars. Accordingly, for a representative 'core' sample of 42 compositions (roughly ten per cent of the whole Corpus), the Corpus will also include separate transliterations of all individual manuscripts. These will enable the construction of a lineated apparatus - the so-called 'musical score' format now generally favoured for the publication of Sumerian compositions - making it possible to get behind the harmonised text of the edition.
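
The 'musical score' layout can be sketched in a few lines of Python: each line of the composite text is followed by the reading of every manuscript that preserves it. This is only a minimal sketch under stated assumptions. The function name `build_score`, the manuscript sigla A and B, and the sample readings (including the `[...]` breaks) are invented for illustration and do not come from the Corpus.

```python
def build_score(composite, manuscripts):
    """Interleave composite lines with the manuscript readings for each line.

    composite:   list of (line_number, text) pairs for the edited text.
    manuscripts: dict mapping a siglum to {line_number: reading}.
    Returns a list of display lines in 'score' order.
    """
    out = []
    for line_no, text in composite:
        out.append(f"{line_no}. {text}")
        for siglum in sorted(manuscripts):
            reading = manuscripts[siglum].get(line_no)
            # A manuscript that does not preserve this line is simply omitted.
            if reading is not None:
                out.append(f"   {siglum}: {reading}")
    return out


# Invented sample data: two composite lines and two fragmentary manuscripts.
composite = [(1, "lugal-e e2 mu-un-du3"), (2, "e2-a-ni mu-na-du3")]
manuscripts = {
    "A": {1: "lugal-e e2 mu-un-du3", 2: "e2-a-ni [...]"},
    "B": {1: "[...] e2 mu-du3"},
}

score = build_score(composite, manuscripts)
print("\n".join(score))
```

Reading down the lines for a single composite line shows at a glance where the manuscripts agree, where they differ, and where a tablet is broken.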

Work on entering texts for the Corpus is now proceeding apace. Currently available texts and the latest progress reports can be consulted on the ETCSL WWW site.

Specific questions can also be addressed directly to the ETCSL team.

Created on Monday, 08 March, 1999: 21:44:06