Lexical resource

In digital lexicography, natural language processing, and digital humanities, a lexical resource is a language resource consisting of data about the lexemes of the lexicon of one or more languages, e.g., in the form of a database.


Several standards exist for the machine-readable encoding of lexical resources, e.g., the Lexical Markup Framework (LMF), an ISO standard comprising an abstract data model and an XML serialization, and OntoLex-Lemon, an RDF vocabulary for publishing lexical resources as knowledge graphs on the web, e.g., as Linguistic Linked Open Data.

Depending on the number of languages it covers, a lexical resource may be described as monolingual, bilingual or multilingual. In bilingual and multilingual lexical resources, the words of one language may or may not be linked to those of another. When they are linked, equivalence between languages is expressed either through a bilingual link (for bilingual lexical resources, e.g., using the relation vartrans:translatableAs in OntoLex-Lemon) or through multilingual notations (for multilingual lexical resources, e.g., by reference to the same ontolex:LexicalConcept in OntoLex-Lemon).
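The difference between the two linking strategies can be illustrated with a minimal Python sketch. It is not an OntoLex-Lemon implementation: the class names Entry, BilingualLexicon and MultilingualLexicon, the concept identifier and the sample words are all hypothetical, and each entry is assumed to carry at most one concept for simplicity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    """A lexical entry: a lemma tagged with its language (hypothetical model)."""
    lemma: str
    lang: str


@dataclass
class BilingualLexicon:
    """Connects entries of two languages with direct pairwise links."""
    links: set = field(default_factory=set)

    def link(self, a: Entry, b: Entry) -> None:
        # Store the pair in both directions so that lookup is symmetric.
        self.links.add((a, b))
        self.links.add((b, a))

    def translations(self, entry: Entry) -> set:
        return {target for source, target in self.links if source == entry}


@dataclass
class MultilingualLexicon:
    """Connects entries of any number of languages through a shared concept."""
    concept_of: dict = field(default_factory=dict)  # Entry -> concept id

    def attach(self, entry: Entry, concept: str) -> None:
        self.concept_of[entry] = concept

    def translations(self, entry: Entry, lang: str) -> set:
        concept = self.concept_of.get(entry)
        if concept is None:
            return set()
        # Every other entry in the target language that points to the same concept.
        return {
            other
            for other, other_concept in self.concept_of.items()
            if other_concept == concept and other.lang == lang and other != entry
        }


# Illustrative data only.
cat_en = Entry("cat", "en")
cat_fr = Entry("chat", "fr")
cat_de = Entry("Katze", "de")

bilingual = BilingualLexicon()
bilingual.link(cat_en, cat_fr)

multilingual = MultilingualLexicon()
for entry in (cat_en, cat_fr, cat_de):
    multilingual.attach(entry, "concept:cat")
```

With pairwise links, every language pair needs its own link; with a shared concept, adding a word in a new language requires only one attachment, after which it is reachable from all languages already attached to that concept.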

It is also possible to build and manage a lexical resource consisting of several lexicons of the same language, for instance one dictionary for general words and one or more dictionaries for specialized domains.
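As a rough sketch of such an arrangement, the Python snippet below keeps one general lexicon and two domain lexicons for the same language and prefers the domain lexicon when a domain is requested. The lexicon names, the glosses and the lookup function are hypothetical, and the fallback policy is only one of several plausible designs.

```python
from typing import Optional

# Hypothetical toy data: three English lexicons inside one lexical resource.
general = {"mouse": "small rodent", "cell": "small room"}
computing = {"mouse": "hand-held pointing device"}
biology = {"cell": "basic structural unit of an organism"}

resource = {"general": general, "computing": computing, "biology": biology}


def lookup(word: str, domain: Optional[str] = None) -> Optional[str]:
    """Prefer the requested domain lexicon, then fall back to the general one."""
    if domain is not None and word in resource.get(domain, {}):
        return resource[domain][word]
    return general.get(word)
```

Here lookup("mouse", "computing") returns the computing sense, while lookup("mouse") returns the general one.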

Machine-readable dictionary vs. NLP dictionary

Lexical resources in digital lexicography are often referred to as machine-readable dictionaries (MRDs). An MRD is a dictionary stored as machine (computer) data instead of being printed on paper; it is both an electronic dictionary and a lexical database. The term MRD is often contrasted with NLP dictionary: an MRD is the electronic form of a dictionary that was previously printed on paper, whereas the term NLP dictionary is preferred when the dictionary was built from scratch with NLP in mind. Both kinds are used by programs.

Lexical database

A lexical database is a lexical resource with an associated software environment that permits access to its contents. The database may be custom-designed for the lexical information, or it may be a general-purpose database into which lexical information has been entered.

Information typically stored in a lexical database includes spelling, lexical category and synonyms of words, as well as semantic and phonological relations between different words or sets of words.
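As an illustration of the general-purpose case, the following Python sketch stores such information in an in-memory SQLite database using the standard sqlite3 module. The table names, column names, relation kinds and sample words are assumptions made for this example and do not follow any particular lexical database.

```python
import sqlite3

# In-memory database; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE word (
    id       INTEGER PRIMARY KEY,
    spelling TEXT NOT NULL,
    category TEXT NOT NULL          -- lexical category (part of speech)
);
CREATE TABLE relation (
    source INTEGER NOT NULL REFERENCES word(id),
    target INTEGER NOT NULL REFERENCES word(id),
    kind   TEXT NOT NULL            -- e.g. synonym, antonym, rhyme
);
""")

conn.executemany(
    "INSERT INTO word (id, spelling, category) VALUES (?, ?, ?)",
    [(1, "big", "adjective"), (2, "large", "adjective"),
     (3, "small", "adjective"), (4, "pig", "noun")],
)
# Relations are stored in one direction only (source -> target).
conn.executemany(
    "INSERT INTO relation (source, target, kind) VALUES (?, ?, ?)",
    [(1, 2, "synonym"), (1, 3, "antonym"), (1, 4, "rhyme")],
)


def related(spelling, kind):
    """Return spellings linked to the given word by a relation of the given kind."""
    rows = conn.execute(
        """
        SELECT t.spelling
        FROM word AS s
        JOIN relation AS r ON r.source = s.id
        JOIN word AS t ON t.id = r.target
        WHERE s.spelling = ? AND r.kind = ?
        ORDER BY t.spelling
        """,
        (spelling, kind),
    )
    return [row[0] for row in rows]
```

Keeping relations in a separate table lets semantic relations (synonym, antonym) and phonological ones (rhyme) share one structure, distinguished only by the kind column.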
