Glossary
About the Glossary
Placeholder text about how attribution scholarship often uses highly technical language.
Entries
A
algorithm
A set of logical instructions or mathematical rules used to perform calculations and problem-solving operations, typically by a computer.
M
machine learning
A class of algorithms that analyse patterns and relationships in data to make determinations and predictions, using the outcomes of these operations to learn iteratively and improve future accuracy. Machine learning procedures may be supervised, requiring human intervention to provide pre-defined examples with which to train the algorithm, or unsupervised, where no human pre-processing of the data is required.
T
text encoding
A set of explicit instructions for the computational representation of text. Whereas a human reader of the text of Othello, for example, is familiar with the conventions distinguishing the word Othello as it functions as a title, a running header, a speech prefix, or as a reference to the character in stage directions and dialogue, a computer generally requires these distinctions to be made explicit. Textual encoding or "markup" involves annotating or "tagging" units of text (from individual letters to entire documents) to assign them to categories. Linguistic features (e.g. grammatical part of speech, syntactic function, and so on) are commonly tagged to support natural language processing, and structural features are tagged to distinguish elements of text and paratext (e.g. title, dialogue, speech prefix, stage direction, prologue). Other kinds of analysis may require additional categories, such as tagging gender or social status.
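As a rough illustration of structural tagging, the sketch below parses a small marked-up fragment and lists each tagged unit in which the word Othello appears. The tag names (play, title, speech, speaker, line, stage) and the fragment itself are assumptions made up for this example; they do not reproduce any particular encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names are illustrative only.
markup = """
<play>
  <title>Othello</title>
  <speech>
    <speaker>Othello</speaker>
    <line>Keep up your bright swords, for the dew will rust them.</line>
  </speech>
  <stage>Exit Othello</stage>
</play>
"""

root = ET.fromstring(markup)

# Collect (category, text) pairs for every tagged unit whose own text
# contains the word "Othello".
roles = [
    (element.tag, element.text)
    for element in root.iter()
    if element.text and "Othello" in element.text
]

for category, text in roles:
    print(f"{category}: {text}")
```

Because each occurrence carries its own tag, a program can separate the title from the speech prefix and the stage direction without having to infer the distinction from layout.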
token
A concrete instance of a type. "I came, I saw, I conquered", for example, contains six word-tokens (excluding punctuation): "came", "saw", "conquered", and three instances of "I".
type
A unique form; cf. token. "I came, I saw, I conquered", for example, contains four word-types (excluding punctuation): "I", "came", "saw", and "conquered".
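The distinction between tokens and types can be checked with a short sketch. It assumes a deliberately simple tokenizer (runs of letters and apostrophes, with case preserved), which is enough for this example but not a general-purpose method; the function name word_tokens is hypothetical.

```python
import re

def word_tokens(text):
    """Return the word-tokens in text, ignoring punctuation.

    Assumption: a word is any run of ASCII letters or apostrophes.
    """
    return re.findall(r"[A-Za-z']+", text)

sentence = "I came, I saw, I conquered"

tokens = word_tokens(sentence)  # every instance, in order of appearance
types = set(tokens)             # unique forms only

print(len(tokens))  # 6 word-tokens
print(len(types))   # 4 word-types
```

Whether forms that differ only in capitalisation count as one type or two depends on the study; this sketch treats them as distinct.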
W
word-token
See token.
word-type
See type.