Translation AI Glossary

Find key terms and jump to dedicated glossary pages.

A

Accountability in AI

Organisational responsibility for how AI systems function, make decisions, and impact users.

Algorithmic Bias

Systematic errors in AI outputs arising from skewed, imbalanced, or prejudiced training data.

API Key Security

Protection measures preventing unauthorised access or misuse of API credentials used to process translations.

Artificial Intelligence

Computational systems capable of performing tasks that traditionally require human intelligence.

Attention Mechanism

A neural network method that helps AI models focus on the most relevant parts of the input when generating output, improving context handling and translation quality.
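
As a rough illustration of the idea, the sketch below implements scaled dot-product attention for a single query in plain Python. The vectors and the function names `softmax` and `attend` are assumptions made for this example; production models apply the same arithmetic to learned, much higher-dimensional vectors.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical 2-dimensional vectors for three input tokens.
keys = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]  # points in the same direction as the first key

weights, output = attend(query, keys, values)
```

Because the query aligns with the first key, the first token receives the largest weight and dominates the output.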

B

Backpropagation

The algorithm neural networks use to learn: after comparing predictions with expected results, it propagates the error backwards through the layers to work out how each internal weight should be adjusted.

Baseline System

A reference translation system used for comparison when evaluating improvements or alternative models.

Beam Search

A decoding method used in AI language models and neural machine translation to evaluate several candidate word sequences before selecting the most likely output.
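
A minimal sketch of the procedure, assuming a toy model whose next-token probabilities depend only on the previous token. The `NEXT` table and the `beam_search` function are invented for this example; a real system would query a neural model at each step.

```python
import math

# Hypothetical next-token probabilities, conditioned on the previous token only.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, "</s>": 0.1},
    "a": {"cat": 0.9, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def beam_search(beam_width=2, max_len=4):
    """Keep the `beam_width` best partial sequences at every step."""
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == "</s>":
                candidates.append((logp, seq))  # finished: carry over unchanged
                continue
            for token, prob in NEXT[seq[-1]].items():
                candidates.append((logp + math.log(prob), seq + [token]))
        # Prune to the highest-scoring candidates.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams

best_logp, best_seq = beam_search()[0]
```

In this toy table, always taking the single most likely next token would start with "the" and end with a sequence of probability 0.30, whereas a beam of width 2 also keeps "a" alive and finds "a cat" with probability 0.36.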

BERTScore

A metric that scores a machine translation by comparing the contextual embeddings of its tokens with those of a human reference.

Bilingual Corpus

A collection of aligned texts in two languages used to train, evaluate, or improve translation systems and language technologies.

BLEU

A traditional translation metric measuring n-gram overlap between machine output and a reference.
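
The sketch below shows only the n-gram overlap component, computed as clipped n-gram precision. Full BLEU additionally combines several n-gram orders with a geometric mean and applies a brevity penalty. The function names and example sentences are assumptions made for illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(hypothesis, reference, n):
    """Share of hypothesis n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the reference.
    """
    hyp_counts = ngrams(hypothesis, n)
    ref_counts = ngrams(reference, n)
    overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return overlap / total if total else 0.0

hypothesis = "the cat is on the mat".split()
reference = "the cat sat on the mat".split()

p1 = clipped_precision(hypothesis, reference, 1)  # 5 of 6 unigrams match
p2 = clipped_precision(hypothesis, reference, 2)  # 3 of 5 bigrams match
```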

C

CAT Tools

Computer-assisted translation (CAT) tools: software environments that support professional translation through translation memories (TMs), termbases, and quality checks.

chrF++

A machine translation metric that computes an F-score over character n-grams, extended with word n-grams, to measure similarity between system output and a reference.

COMET

A neural evaluation metric that scores translation quality using a model trained to predict human quality judgments, typically from the source text, the machine output, and a reference.

Concordance Search

A tool function retrieving previous translations or term occurrences from a translation memory.

Context-aware Translation

Translation methods using surrounding sentences or document-wide information for better accuracy.

Context Window

The maximum amount of text, usually measured in tokens, that an AI model can process at once during translation or generation.

Continuous Localisation

A workflow where translation, QA, and updates occur continuously alongside product development.

D

Data Privacy

Protection of personal or sensitive information during processing, transfer, and storage.

Data Residency

A requirement for data to remain within a specific legal jurisdiction or region.

Deep Learning

A machine-learning approach using multi-layer neural networks to model complex patterns.

E

Encryption

Encoding data so that only holders of the correct key can read it, used to protect data during transfer and storage.

Ethical AI

AI developed and used in alignment with fairness, safety, and respect for rights.

EU AI Act

The European Union’s regulatory framework governing the development and use of AI.

F

Fairness and Bias

Principles ensuring AI systems behave without unjustified discrimination.

Federated Learning

A machine learning approach where models are trained across distributed devices or servers without centralising raw data.

File Parsing

Automated extraction of text and structure from formats such as DOCX, PDF, PPTX, or XLSX.

Fine-Tuning

The adaptation of a pre-trained model for a specific task through additional training on focused data.

G

Gender Bias in AI

Differences in how AI treats or represents genders due to training-data patterns.

Generative AI

AI systems that generate new text, images, audio, code, or other content from learned patterns.

Gradient Descent

An optimisation algorithm that iteratively updates model parameters to reduce prediction error.
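
A minimal sketch on a one-parameter problem, assuming the loss is (w - 3)^2, whose gradient is 2(w - 3). The function names, learning rate, and step count are illustrative choices, not values from any particular system.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly move the parameter a small step against the gradient."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

def grad_loss(w):
    """Gradient of the toy loss (w - 3)^2, which is minimised at w = 3."""
    return 2.0 * (w - 3.0)

w_final = gradient_descent(grad_loss, start=0.0)
```

Each update shrinks the distance to the minimum by a constant factor, so `w_final` ends up very close to 3. Real models apply the same update to millions of parameters at once.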

H

Hallucinations

Fluent, confident-sounding outputs generated by AI models that are factually incorrect or unsupported by the source.

Human Evaluation

Assessment of translation quality performed manually by linguists.

Human-in-the-Loop

Workflows in which human specialists supervise or correct AI outputs.

Hybrid Translation

A translation approach combining machine output with human expertise or rule-based controls for higher quality.

I

Inference

The process by which a trained model generates translations or other outputs.

Information Retrieval

The process of searching for and identifying relevant documents, data, or information in response to a user query.

Interoperability

The ability of different software systems, tools, or platforms to exchange and use information effectively.

J

Jaccard Similarity

A statistical measure used to evaluate the similarity between two sets, often applied in text analysis and information retrieval.
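
The measure is the size of the intersection divided by the size of the union. A short sketch on word sets follows; the function name, the example sentences, and the convention of returning 1.0 for two empty sets are assumptions for this example.

```python
def jaccard_similarity(a, b):
    """|A intersect B| / |A union B| for two sets."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)

first = set("the cat sat on the mat".split())
second = set("the cat lay on the rug".split())

# Shared words: the, cat, on (3). Distinct words overall: 7.
score = jaccard_similarity(first, second)
```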

Joint Training

A machine learning training approach where a model is trained simultaneously on multiple tasks or datasets.

K

Keyphrase Extraction

The automated identification of important phrases within a document to represent its main topics.

Knowledge Graph

A structured representation of entities and their relationships used by AI systems to organise and connect information.

L

Large Language Model (LLM)

A neural model trained on vast text corpora, capable of understanding and generating natural language.

Latency

The delay between sending a translation request and receiving a response.

Logging

Automatic recording of system events or requests used for monitoring and troubleshooting.

M

Machine Learning

A field of AI where systems learn patterns from data to make predictions or generate content.

Model Training

Adjusting model parameters through exposure to data so it can learn linguistic patterns.

Multimodal AI Models

AI systems that process and combine more than one type of input, such as text, images, audio, and video, to improve understanding and generation across tasks.

N

Named Entity Recognition (NER)

A natural language processing method used to identify and classify entities such as names, organisations, locations, and dates in text.

Neural Network

A computational model made up of interconnected layers that learns patterns from data and powers many modern AI systems.

O

OOV (Out-of-Vocabulary)

Words or tokens that do not appear in a model’s training vocabulary and therefore cannot be directly recognised or translated by the system.

Open-Source Model

An AI model whose architecture, code, or weights are publicly available, allowing researchers and developers to inspect, modify, and deploy it.

Overfitting

A machine learning problem where a model learns the training data too closely and performs poorly on new or unseen inputs.

P

Parallel Corpus

A collection of texts and their translations in two or more languages used to train machine translation systems.

Post-Editing (MTPE)

The process of reviewing and correcting machine translation output to achieve publishable quality.

Pretraining

The initial phase of training a machine learning model on large datasets before adapting it to specific tasks.

Prompt

A structured instruction or input guiding an AI model’s behaviour.

Prompt Engineering

The practice of designing and structuring prompts to obtain more accurate and useful outputs from AI models.

Q

Quantisation

A model optimisation technique that reduces the numerical precision of neural network parameters to decrease memory usage and improve inference speed.
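
One simple variant is symmetric 8-bit quantisation, sketched below under the assumption of a single scale factor for the whole list of weights. The function names and sample weights are hypothetical; production toolkits use more elaborate schemes.

```python
def quantise_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantised = [round(w / scale) for w in weights]
    return quantised, scale

def dequantise(quantised, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantised]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
```

Each restored value differs from the original by at most half a quantisation step, which is the precision traded away for the smaller storage size.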

Quality Assurance (QA)

Systematic checks ensuring accuracy, consistency, and compliance with project requirements.

R

Rate Limits

Restrictions on how many API requests can be processed within a time window.
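
A sliding-window limiter is one common way to enforce such a restriction. The class below is a hypothetical sketch, not any provider's actual implementation; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` interval."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now):
        # Forget requests that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False  # caller should wait and retry later

limiter = SlidingWindowLimiter(max_requests=2, window_seconds=1.0)
results = [limiter.allow(t) for t in (0.0, 0.1, 0.2, 1.05, 1.06)]
```

With a limit of two requests per second, the third request is rejected, the fourth succeeds once the first has aged out of the window, and the fifth is rejected again.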

Reinforcement Learning

A machine learning method in which an agent learns decision-making through rewards and penalties.

Responsible AI

An umbrella concept combining ethics, transparency, fairness, and safety in AI operations.

Revision vs Review

Two quality control stages: revision compares the translation against the source text, while review examines the target text on its own (monolingually).

Rule-Based Translation

A translation approach based on linguistic rules and dictionaries instead of statistical or neural methods.

S

Semantic Similarity

A measure used in natural language processing to determine how similar two texts are in meaning.
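
In practice this is often computed as the cosine similarity between embedding vectors of the two texts. The sketch below assumes tiny hand-written vectors in place of real embeddings, which would come from a trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, invented for this example.
emb_cat = [0.9, 0.1, 0.0]
emb_kitten = [0.8, 0.2, 0.1]
emb_invoice = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(emb_cat, emb_kitten)
sim_far = cosine_similarity(emb_cat, emb_invoice)
```

Vectors pointing in similar directions score close to 1, while unrelated ones score close to 0.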

Speech Recognition

Technology that converts spoken language into written text using machine learning models.

Style Guide

A document defining linguistic, stylistic, and formatting rules for translation.

Supervised Learning

A machine learning method in which models are trained using labelled data.

T

Token

A unit of text processed by a model, such as a word, subword, or punctuation mark.
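
As a simplified illustration, the snippet below splits text into word and punctuation tokens with a regular expression. This is an assumption made for the example: most current models use learned subword tokenisers, which may split a single word into several tokens.

```python
import re

def simple_tokenise(text):
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_tokenise("Hello, world!")
# tokens == ['Hello', ',', 'world', '!']
```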

U

Universal Language Models

Multilingual language models that use shared representations to support many languages and NLP tasks.

Unsupervised Learning

A type of machine learning in which models identify patterns in data without labelled training examples.

V

Validation Dataset

A dataset used during model training to evaluate performance and detect problems such as overfitting before final testing.

Vector Database

A specialised database designed to store and retrieve vector embeddings efficiently for similarity search and semantic retrieval.

Vendor Neutrality

A commitment to flexible, non-proprietary technologies that avoid vendor lock-in.

Vocabulary

The set of tokens or words that a language model can recognise and process when analysing or generating text.

W

Weight Parameters (Model Weights)

The numerical parameters inside a neural network that determine how input data is processed and how predictions or generated text are produced.

Wide-context Translation

Translation using extended context windows or whole-document information to improve coherence and accuracy.

X

XML (Extensible Markup Language)

A structured markup language used to encode documents and data in a format that can be read by both humans and machines.

Z

Zero Data Retention

A mode in which user data is deleted almost immediately after processing.

Zero-Shot Learning

A machine learning capability that allows models to perform tasks they were not explicitly trained on by leveraging generalised knowledge.

Zero-Shot Translation

A multilingual translation capability where models translate between language pairs not directly present in training data.

Z-Score

A statistical measure indicating how many standard deviations a data point lies from the mean of its dataset.
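
The calculation is z = (x - mean) / standard deviation. A short sketch using Python's standard library follows; the sample values and the choice of the population standard deviation are assumptions for this example.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the z-score of every value relative to the whole list."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mu) / sigma for v in values]

samples = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, standard deviation 2
scores = z_scores(samples)
```

Here the value 9 lies two standard deviations above the mean, and the value 2 lies one and a half below it.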

Topic Collections

AI Translation Terms

Explore core concepts used in modern AI translation workflows, from model architecture to linguistic quality controls.

Machine Learning Glossary

A practical glossary collection covering model training, optimisation, and evaluation terms used across applied AI.

NLP Terminology

Understand natural language processing terms that shape language understanding, generation, and translation quality.

AI Privacy and Security Terms

Review the privacy, legal, and infrastructure vocabulary needed to deploy AI responsibly in enterprise environments.

LLM Architecture Glossary

Discover architecture and runtime concepts behind large language model systems used for translation and content workflows.