scalar-labs · josh-wong · Jan 16, 2025 · Jan 16, 2025
diff --git a/docs/glossary.mdx b/docs/glossary.mdx
@@ -0,0 +1,119 @@
+---
+tags:
+  - Community
+  - Enterprise Standard
+  - Enterprise Premium
+displayed_sidebar: docsEnglish
+---
+
+# Glossary
+
+This glossary includes database and distributed-system terms that you will often encounter when using ScalarDB.
+
+## ACID
+
+Atomicity, consistency, isolation, and durability (ACID) is a set of properties that ensure database transactions are processed reliably, maintaining integrity even in cases of errors or system failures.
+
+## concurrency control
+
+Concurrency control in databases ensures that multiple transactions can occur simultaneously without causing data inconsistency, usually through mechanisms like locking or timestamp ordering.
+
+## consensus
+
+Consensus in distributed systems refers to the process of achieving agreement among multiple computers or nodes on a single data value or system state.
+
+## data federation
+
+Data federation is the process of integrating data from different sources without moving the data, creating a unified view for querying and analysis.
+
+## data mesh
+
+A data mesh is a decentralized data architecture that enables each business domain within a company to autonomously manage data and use it efficiently.
+
+## data virtualization
+
+Data virtualization is similar to data federation in many respects: it virtualizes multiple data sources into a unified view, simplifying queries without moving the data.
+
+## database anomalies
+
+Database anomalies are inconsistencies or errors in data that can occur when operations such as insertions, updates, or deletions are performed without proper transaction management.
+
+## federation engine
+
+A federation engine facilitates data integration and querying across multiple disparate data sources, often as part of a data federation architecture.
+
+## global transaction
+
+A global transaction spans multiple databases or distributed systems and ensures that all involved systems commit or roll back changes as a single unit.
+
+## heterogeneous databases
+
+Heterogeneous databases refer to systems composed of different database technologies that may have distinct data models, query languages, and transaction mechanisms.
+
+## HTAP
+
+Hybrid transactional/analytical processing (HTAP) refers to a system that can handle both transactional and analytical workloads concurrently on the same data set, removing the need for separate databases.
+
+## JDBC
+
+Java Database Connectivity (JDBC) is an API that allows Java applications to interact with databases, providing methods for querying and updating data in relational databases.
+
+## linearizability
+
+Linearizability is a strong consistency model in distributed systems where operations appear to occur atomically in some order, and each operation takes effect between its start and end.
+
+## NoSQL database
+
+A NoSQL database is a non-relational database designed for specific data models, such as document, key-value, wide-column, or graph stores, often used for handling large-scale, distributed data.
+
+## Paxos
+
+Paxos is a family of protocols used in distributed systems to achieve consensus, even in the presence of node failures.
+
+## PITR
+
+Point-in-time recovery (PITR) allows a database to be restored to a previous state at any specific time, usually after an unintended event like data corruption.
+
+## polystores
+
+Polystores are database architectures that allow users to interact with multiple, heterogeneous data stores, each optimized for a specific workload or data type, as if they were a single system.
+
+## read-committed isolation
+
+Read-committed isolation is an isolation level where each transaction sees only committed data, preventing dirty reads but allowing non-repeatable reads.
+
+## relational database
+
+A relational database stores data in tables with rows and columns, using a structured query language (SQL) to define, query, and manipulate the data.
+
+## replication
+
+Replication in databases involves copying and distributing data across multiple machines or locations to ensure reliability, availability, and fault tolerance.
+
+## Saga
+
+The Saga pattern is a method for managing long-running transactions in a distributed system, where the transaction is split into a sequence of operations and each operation is paired with a compensating action that undoes it if a later operation fails.
+
+## serializable isolation
+
+Serializable isolation (serializability) is the highest isolation level in transactional systems, ensuring that the outcome of concurrently executed transactions is the same as if they were executed one at a time in some serial order.
+
+## snapshot isolation
+
+Snapshot isolation is an isolation level in which each transaction reads from a consistent snapshot of the database taken when the transaction starts, so the transaction does not see changes made by other concurrent transactions.
+
+## TCC
+
+Try-Confirm/Cancel (TCC) is a pattern for distributed transactions that splits an operation into three steps (try, confirm, and cancel), allowing for coordination and recovery across multiple systems.
+
+## transaction
+
+A transaction in databases is a sequence of operations treated as a single logical unit of work, ensuring consistency and integrity, typically conforming to ACID properties.
+
+## transaction manager
+
+A transaction manager coordinates the execution of transactions across multiple systems or databases, ensuring that all steps of the transaction succeed or fail as a unit.
+
+## two-phase commit
+
+Two-phase commit is a protocol in which all participants in a distributed transaction either commit or roll back the transaction together, ensuring consistency across systems.
diff --git a/docs/scalardb-benchmarks/README.mdx b/docs/scalardb-benchmarks/README.mdx
@@ -97,9 +97,9 @@ After applying the schema and configuring the properties file, select a benchmar
 
 ### Prepare a benchmarking configuration file
 
-To run a benchmark, you must first prepare a benchmarking configuration file. The configuration file requires at least the locations of the workload modules to run and the database configuration. 
+To run a benchmark, you must first prepare a benchmarking configuration file. The configuration file requires at least the locations of the workload modules to run and the database configuration.
 
-The following is an example configuration for running the TPC-C benchmark. The ScalarDB properties file specified for `config_file` should be the properties file for the [benchmarking environment that you previously set up](#set-up-your-environment).
+The following is an example configuration for running the TPC-C benchmark. The ScalarDB properties file specified for `config_file` should be the properties file that you created as one of the steps in [Load the schema](#load-the-schema).
 
 :::note