Quick recommendation: For database primary keys, prefer
GUID / UUID v7 (time-ordered) over
GUID / UUID v4 (random) to minimize index fragmentation and improve insert performance in B-tree indexes.
Overview: Why database performance matters
The primary performance concern with GUIDs / UUIDs in databases is their impact on index behavior, particularly B-tree indexes used by most relational databases. Random identifiers can cause poor insert performance due to index fragmentation, while sequential identifiers maintain better locality.
Random UUIDs (v4) Issues
- Index fragmentation (B-tree page splits)
- Poor cache locality
- Higher write amplification
- Degraded insert performance at scale
- Increased I/O for maintenance operations
Time-Ordered UUIDs (v7) Benefits
- Sequential insertion pattern
- Better cache locality
- Reduced page splits
- Improved insert throughput
- Natural chronological ordering
B-tree indexes and GUID / UUID insertion patterns
Most relational databases (PostgreSQL, MySQL, SQL Server, Oracle, SQLite and many others) use B-tree indexes for primary keys and secondary indexes. B-tree structures maintain sorted data in leaf pages, with internal nodes pointing to child pages.
How B-tree indexes work
A B-tree index organizes data in a hierarchical tree structure where:
- Leaf pages contain the actual index entries (key + pointer to row)
- Internal nodes contain keys and pointers to child pages
- Pages have fixed size (typically 8-16 KB depending on database)
- Sorted order is maintained for efficient range queries and lookups
Sequential vs random insertion
Sequential insertion (UUID v7, v6, v1):
- Inserts at the end: New values append to the rightmost leaf page
- Minimal page splits: Only the last page needs updates until full
- Cache-friendly: Recent pages stay in buffer pool
- Write locality: Reduces random I/O operations
Random insertion (UUID v4):
- Inserts anywhere: New values scatter across all leaf pages
- Frequent page splits: Pages fill unevenly causing splits across the tree
- Cache pollution: Many different pages need to be loaded and modified
- Write amplification: More disk I/O due to page management overhead
Example: Inserting 1 million UUIDs into a B-tree index:
- UUID v7 (sequential): ~100-200 page splits (appending to end)
- UUID v4 (random): ~50,000-100,000 page splits (scattered insertions)
Index fragmentation explained
Index fragmentation occurs when the logical order of index pages does not match their physical order on disk. This happens with random UUIDs because insertions split pages throughout the index tree.
Types of fragmentation
- Logical fragmentation (internal): Pages are not fully packed, wasting space within each page
- Physical fragmentation (external): Logically consecutive pages are not physically adjacent on disk
- Page splits: When a page is full, it splits into two half-full pages, reducing density
Performance impact
- Range scans: Slower because consecutive pages may require random disk I/O
- Buffer pool efficiency: More pages needed in memory for same data volume
- Insert performance: Additional I/O to read, modify, and write split pages
- Index size: Fragmented indexes consume more storage space (30-50% larger)
Real-world impact: In high-volume systems, random GUID / UUID (v4) primary keys can reduce insert throughput by 30-50% compared to sequential GUID / UUIDs (v7) due to index fragmentation and cache pressure.
Mitigation strategies
- Use GUID / UUID v7: Time-ordered UUIDs reduce fragmentation significantly
- Rebuild indexes: Periodic index rebuilds to defragment (requires maintenance window)
- Fillfactor tuning: Leave space in pages to accommodate future inserts (database-specific)
- Partition tables: Split large tables to limit index size and fragmentation scope
Storage considerations
GUIDs / UUIDs are 128 bits (16 bytes), compared to typical integer primary keys like BIGINT (8 bytes) or INT (4 bytes). This affects storage, memory and I/O.
Storage overhead
| Type | Size | Storage for 10M rows | Index size impact |
|---|
INT | 4 bytes | ~40 MB | Baseline |
BIGINT | 8 bytes | ~80 MB | 2x INT |
UUID (GUID) | 16 bytes | ~160 MB | 4x INT, 2x BIGINT |
VARCHAR(36) | 36+ bytes | ~360 MB | 9x INT (if stored as string) |
Memory and cache impact
- Buffer pool: Larger keys mean fewer rows fit per page, reducing cache efficiency
- Index nodes: Internal B-tree nodes hold fewer keys, increasing tree height and lookup cost
- Foreign keys: Every foreign key reference also stores 16 bytes (multiplied across relationships)
- Network transfer: Larger identifiers increase query result payload size
Best practice: Always store GUIDs / UUIDs using native UUID or BINARY(16) types, never as VARCHAR(36) or CHAR(36). String storage wastes space and degrades performance.
Database-specific considerations
PostgreSQL
PostgreSQL has native UUID type and excellent UUID support.
- Native type: Use
UUID type (16 bytes storage) - Extensions:
uuid-ossp for v1/v4, pgcrypto for v4 - UUID v7: Use custom function or application-generated values
- Index type: B-tree by default, consider BRIN for time-ordered UUIDs on large tables
Example: Creating UUID primary key
-- Using UUID v4 (random)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
username VARCHAR(100) NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
-- Using UUID v7 (time-ordered, application-generated)
CREATE TABLE events (
id UUID PRIMARY KEY, -- Generated by application
event_type VARCHAR(50) NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
-- Index performance check
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public';
MySQL / MariaDB
MySQL 8.0+ and MariaDB 10.7+ have native UUID support. Earlier versions require BINARY(16).
- MySQL 8.0+:
UUID type and UUID_TO_BIN() / BIN_TO_UUID() functions - Older versions: Use
BINARY(16) for storage - InnoDB: Clustered index on primary key makes sequential inserts critical
- UUID v1 reordering: MySQL's
UUID_TO_BIN(uuid, 1) reorders v1 for time-sorting
Example: UUID with InnoDB
-- MySQL 8.0+ with native UUID
CREATE TABLE products (
id BINARY(16) PRIMARY KEY,
name VARCHAR(200) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_created (created_at)
);
-- Insert with UUID conversion
INSERT INTO products (id, name)
VALUES (UUID_TO_BIN(UUID()), 'Product Name');
-- Query with conversion back
SELECT BIN_TO_UUID(id) as id, name
FROM products
WHERE id = UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000');
-- Check fragmentation
SELECT TABLE_NAME, DATA_FREE, DATA_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database';
InnoDB warning: InnoDB uses a clustered index where table data is stored in primary key order. Random UUID v4 primary keys cause severe fragmentation and poor insert performance. Always prefer UUID v7 for InnoDB tables.
Microsoft SQL Server
SQL Server has UNIQUEIDENTIFIER type and NEWSEQUENTIALID() for sequential GUIDs.
- Native type:
UNIQUEIDENTIFIER (16 bytes) - Sequential GUIDs:
NEWSEQUENTIALID() generates time-based sequential values - Byte ordering: SQL Server reorders bytes differently than RFC 4122 standard
- Clustered indexes: Sequential GUIDs essential for clustered primary keys
Example: Sequential GUID in SQL Server
-- Using NEWSEQUENTIALID() for sequential GUIDs
CREATE TABLE Orders (
OrderId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWSEQUENTIALID(),
CustomerName NVARCHAR(200) NOT NULL,
OrderDate DATETIME2 DEFAULT GETDATE()
);
-- Index fragmentation check
SELECT
OBJECT_NAME(ips.object_id) AS TableName,
i.name AS IndexName,
ips.avg_fragmentation_in_percent,
ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') ips
JOIN sys.indexes i ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;
SQLite
SQLite has no native UUID type but can store as BLOB or TEXT.
- Storage: Use
BLOB (16 bytes) for efficiency or TEXT for readability - B-tree index: SQLite uses B-tree indexes; sequential UUIDs improve performance
- Generation: No built-in UUID functions, use application layer
- Page size: Default 4KB pages fill faster with UUIDs vs integers
Example: UUID in SQLite
-- Store as BLOB for efficiency
CREATE TABLE sessions (
id BLOB PRIMARY KEY, -- 16 bytes
user_id INTEGER NOT NULL,
expires_at INTEGER NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id)
);
-- Insert (application generates UUID and converts to bytes)
-- Example with Python: uuid.uuid4().bytes
INSERT INTO sessions (id, user_id, expires_at)
VALUES (X'550e8400e29b41d4a716446655440000', 1, 1735689600);
-- Query
SELECT hex(id) as id, user_id FROM sessions;
MongoDB
MongoDB uses ObjectId by default but supports UUID storage as BinData.
- ObjectId: 12-byte time-ordered identifier (similar concept to UUID v7)
- UUID support: Store as
BinData subtype 4 (UUID) - Indexes: B-tree style indexes benefit from time-ordered identifiers
- Sharding: Random UUIDs distribute well across shards (ObjectId also works)
Example: UUID in MongoDB
// Using ObjectId (default, recommended)
db.users.insertOne({
_id: ObjectId(), // 12 bytes, time-ordered
username: "john_doe",
createdAt: new Date()
});
// Using UUID (if required for interoperability)
db.users.insertOne({
_id: UUID("550e8400-e29b-41d4-a716-446655440000"),
username: "jane_doe",
createdAt: new Date()
});
// Index stats
db.users.stats().indexSizes;
Practical recommendations
Database optimization checklist
- Use native UUID types:
UUID (PostgreSQL), BINARY(16) (MySQL), UNIQUEIDENTIFIER (SQL Server) - Never store as VARCHAR: String storage wastes 2-3x space and degrades performance
- Prefer UUID v7 for primary keys: Minimizes fragmentation in B-tree indexes
- Monitor fragmentation: Check index health regularly and rebuild when needed
- Consider partitioning: For very large tables (100M+ rows), partition by time range
- Tune fillfactor: Leave space in index pages for future inserts (e.g., 90% fillfactor)
- Benchmark your workload: Read-heavy vs write-heavy patterns impact UUID choice
When NOT to use UUIDs
- Small tables: Integer IDs are simpler and more efficient for small datasets
- Extreme performance requirements: Auto-increment integers provide best performance
- Storage-constrained systems: UUIDs consume 2-4x more space than integers
- Simple single-database apps: No need for distributed ID generation
Performance benchmarks
Real-world performance varies by database, hardware, and workload. These approximate benchmarks illustrate typical differences:
Insert performance comparison
| Key Type | Inserts/sec (10M rows) | Index Size | Fragmentation |
|---|
BIGINT AUTO_INCREMENT | ~50,000 | ~120 MB | <5% |
| UUID v7 (time-ordered) | ~40,000 | ~220 MB | <10% |
| UUID v1 (timestamp-based) | ~35,000 | ~230 MB | 10-15% |
| UUID v4 (random) | ~25,000 | ~280 MB | 30-50% |
Approximate benchmarks on PostgreSQL 15 with 16GB RAM, SSD storage. Actual performance varies by hardware and configuration.
Query performance impact
- Point lookups: GUID / UUID indexes ~10-20% slower than integer indexes (larger keys)
- Range scans: Sequential GUIDs / UUIDs comparable to integers; random GUIDs / UUIDs 2-3x slower
- Join performance: GUID / UUID joins slightly slower due to larger key comparison
- Index-only scans: GUIDs / UUIDs reduce number of rows cached per page
Recommendation: Always benchmark with your specific workload. Read-heavy applications may not notice GUID / UUID overhead, while write-heavy systems benefit significantly from sequential GUIDs / UUIDs.
Conclusion
GUIDs / UUIDs offer significant advantages for distributed systems and avoiding ID coordination, but come with performance trade-offs in databases. The choice between random (GUID / UUID v4) and time-ordered (GUID / UUID v7) identifiers has substantial impact on index behavior and insert performance.
For most database use cases, GUID / UUID v7 provides the best balance: standard-compliant, time-ordered for index efficiency, and no coordination overhead. Random GUID / UUID v4 remains appropriate for privacy-sensitive contexts or public API identifiers where time leakage is unacceptable.
Always use native GUID / UUID storage types, monitor index fragmentation and benchmark with representative workloads to optimize for your specific requirements.