The term STANDARD DATA (authority data) has its origins in librarianship. When cataloguing publications, it is indispensable, for example, to identify authors unambiguously so that works can be assigned to the correct author regardless of variant spellings or name changes. The same need arises in subject indexing, for example to identify geographical terms that are treated in different works and to relate them to one another. These needs led to the creation of corresponding onomasiological lists, which were initially maintained independently by the individual libraries.

The possibility of networking data collections technically, which dates back to the 1970s, soon made it necessary to standardise the indexes that each library had previously kept on its own. In the 1980s (initial plans had already been made at the end of the 1970s), libraries therefore began to align their individual directories and to create common indexes of persons and subject headings in order to achieve cross-library consistency. Initially, thematically separate directories were created: a person index (Name Authority File, PND), an index of corporate names (Authority File for Corporate Names, GKD) and a subject index (Subject Headings Authority File, SWD). It eventually became clear that this separation was not appropriate, especially since persons and corporate bodies may appear not only as authors or editors but also as the subject of publications, which is why they must also be taken into account in subject indexing with the Subject Headings Authority File. For this reason, the three separate authority files, together with the Uniform Title File of the German Music Archive, were combined between 2009 and 2012 in a joint undertaking of the German National Library (DNB) and the German-language library networks to form the Integrated Authority File (GND). The GND has been available to the public since 2012, now in a whole range of formats (MARC 21 Authority, MARC21-XML and RDF/XML), and is increasingly being used for indexing outside the library sector. For example, the DH projects BMLO (Bavarian Musicians' Encyclopaedia online) and Kaiserhof at the ITG use authority files to identify persons unambiguously.

A convenient search tool for research in the GND is available at http://ognd.bsz-bw.de/ (a service of the Library Service Centre Baden-Württemberg). Authority files comparable to the GND are maintained worldwide by institutions, primarily libraries. The VIAF project (Virtual International Authority File), jointly launched by the DNB and the Library of Congress, has existed since 2003. It aims to unite these databases in one system and make them accessible.

Although the system of authority files makes the unambiguous identification of persons and concepts possible in theory, its actual usability depends on the technical implementation within the electronic library catalogues. In the catalogues of the DNB and the Bayerische Staatsbibliothek (BSB), a search for "Homère" also returns matches whose bibliographic record contains only the German spelling "Homer". In the online catalogue of the BSB, on the other hand, clicking on a linked author's name currently (Nov. 2018) still lists titles not only by the corresponding individual but also by other authors of the same name.

Although the concept of authority files clearly originated in the library environment, their use has since become established in other areas as well. Examples are the projects Geonames (geographical entities), Pleiades (ancient geographical entities) and Glottolog (the world's languages).

Authority files are of great importance for the interoperability demanded by, among others, the FAIR initiative. Defining an authority record and assigning it an (alpha)numeric identifier makes it possible to link congruent data in separate databases both logically and technically, in addition to integrating them into the subject indexing of library catalogues.
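
The linking of congruent data through a shared identifier can be illustrated with a minimal Python sketch. All records, field names and identifiers in it (for example "ID-0001") are invented placeholders and do not correspond to real authority records; the function name link_by_authority_id is likewise hypothetical.

```python
from collections import defaultdict

# Invented placeholder records: two independent collections that spell the
# same person differently but carry the same (hypothetical) authority ID.
catalogue_a = [
    {"title": "Ilias", "name_as_printed": "Homer", "authority_id": "ID-0001"},
    {"title": "L'Odyssée", "name_as_printed": "Homère", "authority_id": "ID-0001"},
    {"title": "Selected Essays", "name_as_printed": "Homer", "authority_id": "ID-0002"},
]
database_b = [
    {"entry": "Biographical note", "name_as_printed": "Homerus", "authority_id": "ID-0001"},
]


def link_by_authority_id(left, right):
    """Pair records from two collections that share an authority identifier."""
    by_id = defaultdict(list)
    for record in right:
        by_id[record["authority_id"]].append(record)
    return [
        (l_rec, r_rec)
        for l_rec in left
        for r_rec in by_id.get(l_rec["authority_id"], [])
    ]


links = link_by_authority_id(catalogue_a, database_b)
for l_rec, r_rec in links:
    print(l_rec["title"], "<->", r_rec["entry"])
```

The third catalogue record shares the printed name "Homer" but carries a different identifier and is therefore not linked, which is exactly the disambiguation that a comparison of name strings alone could not provide.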

From the perspective of VerbaAlpina, establishing the authority file categories "morpho-lexical type" (⇒ Typification) and "concept" would be methodologically consistent and therefore highly desirable. Morpho-lexical types and concepts could then be given identifiers, so that lexical data could be unambiguously related to each other worldwide and, in the case of concepts, independently of any individual language. Approaches in this direction can be observed sporadically. For example, the structured database of the Wikidata project assigns so-called Q-IDs, which uniquely identify extra-linguistic concepts and thereby provide a common reference for the articles on the same topic in the different language versions of Wikipedia. For instance, the concept ALPINE HUT is uniquely identified in Wikidata by the Q-ID Q2649726. The corresponding Wikidata entry refers to the associated Wikipedia articles, presently (October 2018) in a total of seven languages. Of the 2629 concepts recorded by VerbaAlpina, exactly 400 have been assigned a Q-ID so far. The Q-IDs, where available, are recorded in the VerbaAlpina database. An analogous, systematic identification of morpho-lexical types does not yet exist within the Wikipedia or Wikidata projects; so far, L-IDs have been assigned to linguistic designations only in rudimentary form, and it is not clear whether they refer to precisely defined types.
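
How a Q-ID serves as a language-independent reference can be sketched briefly in Python. The sketch only builds the addresses under which Wikidata publishes an entity (the entity page and the Special:EntityData export) and makes no network request. The helper name wikidata_urls and the simple format check are assumptions made for illustration. The figures 400 and 2629 are those given in the text.

```python
import re

# A Q-ID is the letter Q followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def wikidata_urls(qid):
    """Return the entity page and the JSON export address for a Q-ID."""
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a well-formed Q-ID: {qid!r}")
    return {
        "page": f"https://www.wikidata.org/wiki/{qid}",
        "json": f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json",
    }


urls = wikidata_urls("Q2649726")  # ALPINE HUT, as cited in the text
coverage = 400 / 2629             # concepts with a Q-ID / all recorded concepts

print(urls["page"])
print(f"{coverage:.1%} of the recorded concepts carry a Q-ID")
```

On the figures quoted above, roughly 15 per cent of the recorded concepts have so far been linked to Wikidata.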

Following the model of authority file IDs, VerbaAlpina assigns its own identifiers for the data categories (entities) "concept", "morpho-lexical type" (see Typification) and "municipality". These can be related to other, already established authority file systems, such as the Q-IDs of the Wikidata project, by simple mapping. VerbaAlpina is also striving to integrate the data category "morpho-lexical type" in particular into the system of the Integrated Authority File (GND). The prospect of this is opening up as the GND is being expanded in content and structure and adapted to the needs of research as well as of cultural institutions and persons in general. The GNDCon 2018 conference scheduled for December 2018 is intended to facilitate a corresponding exchange. VerbaAlpina's interests will be represented there by members of the Munich University Library and the ITG.
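
The mapping between a project's own identifiers and an established authority system can be pictured as a plain lookup table. The following Python sketch is illustrative only: the internal identifiers ("concept-0001" and so on) and the function names are hypothetical and do not reproduce VerbaAlpina's actual ID scheme or database layout. Only the Q-ID for ALPINE HUT is taken from the text.

```python
from typing import Optional

# Hypothetical internal concept IDs mapped to Wikidata Q-IDs.
# None marks a concept for which no external counterpart is recorded yet.
CONCEPT_TO_QID = {
    "concept-0001": "Q2649726",  # ALPINE HUT (Q-ID as cited in the text)
    "concept-0002": None,
}


def to_qid(internal_id: str) -> Optional[str]:
    """Look up the Q-ID mapped to an internal identifier, if any."""
    return CONCEPT_TO_QID[internal_id]


def from_qid(qid: str) -> Optional[str]:
    """Reverse lookup: find the internal identifier mapped to a Q-ID."""
    for internal_id, mapped in CONCEPT_TO_QID.items():
        if mapped == qid:
            return internal_id
    return None


print(to_qid("concept-0001"))
print(from_qid("Q2649726"))
```

Keeping the project's own identifiers as the primary key means that every entity remains addressable even where no external counterpart exists, while any mapping that does exist can be followed in both directions.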

The GND currently distinguishes the following entities: corporation (abbreviation: b), conference (f), geographic (g), person (not individualised) (n), person (individualised) (p), subject term (s) and work (u) (https://wiki.dnb.de/download/attachments/90411323/entitaetenSatztypen.pdf). A DNB document from the category "Arbeitshilfen zur gemeinsamen Normdatei (GND)" also shows that the specific entity code "slz" is provided for the category "Buchstaben, Morpheme, Wörter als Gegenstand linguistischer Untersuchungen" (letters, morphemes, words as the subject of linguistic research) as a subcategory of the entity "Sachbegriff" (subject term). It would seem natural to link the VerbaAlpina authority data to this category.
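
For orientation, the entity abbreviations listed above can be written down as a small lookup structure. This Python sketch contains only the codes named in the text. The table and function names are chosen here for illustration, and the second table would need to be completed from the DNB documentation before any real use.

```python
# Entity types and their one-letter abbreviations as listed in the text.
GND_ENTITY_TYPES = {
    "b": "corporation",
    "f": "conference",
    "g": "geographic",
    "n": "person (not individualised)",
    "p": "person (individualised)",
    "s": "subject term",
    "u": "work",
}

# Specific entity codes and the entity type they belong to.
# Only "slz" is documented in the text; the table is deliberately incomplete.
GND_ENTITY_CODES = {
    "slz": "s",  # letters, morphemes, words as the subject of linguistic research
}


def describe_entity_code(code):
    """Describe a specific entity code by naming its parent entity type."""
    parent = GND_ENTITY_CODES[code]  # raises KeyError for unlisted codes
    return f"{code}: subcategory of {GND_ENTITY_TYPES[parent]} ({parent})"


print(describe_entity_code("slz"))
```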

Literature:
Capellaro 2003