DBW.

Expert

Homomorphic Encryption for Query Processing

Article diagram
May 20, 2026·11 min read

Homomorphic encryption enables private query processing over ciphertexts, but extreme computational overhead currently restricts practical use to simple aggregation workloads.

Introduction

Database systems increasingly operate on data they should not be able to read.
Cloud-hosted databases, multi-party analytics, and regulatory requirements around data privacy all create scenarios where a database engine must execute queries over encrypted data without ever accessing plaintexts.
Homomorphic encryption (HE) offers a cryptographic path toward this goal: it allows computation directly on ciphertexts, producing encrypted results that, when decrypted, match the output of the equivalent plaintext computation.

The theoretical appeal is immediate.
A client encrypts their data, uploads it to an untrusted server, and issue queries.
The server processes those queries entirely in the encrypted domain and returns encrypted results.
At no point does the server learn anything about the data or the query results.
In practice, the engineering challenges are severe.
HE schemes impose orders-of-magnitude computational overhead, constrain the types of operations that can be efficiently supported, and require careful management of noise growth in ciphertexts.
This article examines how HE integrates with query processing, what tradeoffs current systems make, and where the boundaries of practicality lie.

Background on Homomorphic Encryption Schemes

Homomorphic encryption schemes are classified by the operations they support and how many of those operations can be composed before the ciphertext becomes undecryptable.

Partially Homomorphic Encryption (PHE)

PHE schemes support a single operation an unlimited number of times.
Textbook (unpadded) RSA is multiplicatively homomorphic: Enc(a) * Enc(b) = Enc(a * b).
Note that practical RSA deployments use randomized padding schemes (e.g., OAEP) that destroy this homomorphic property.
Paillier is additively homomorphic: Enc(a) * Enc(b) = Enc(a + b).
PHE schemes are efficient and well understood, but cannot express general computation because they support only one operation type.

Somewhat Homomorphic Encryption (SHE)

SHE schemes support both addition and multiplication, but only for a bounded number of operations.
Each operation introduces noise into the ciphertext.
After enough operations, the noise overwhelms the signal and decryption fails.
The noise budget constrains the depth of the arithmetic circuit that can be evaluated.

Fully Homomorphic Encryption (FHE)

FHE schemes, first constructed by Gentry in 2009, support arbitrary computation on ciphertexts.
They achieve this through a technique called bootstrapping, which homomorphically evaluates the decryption circuit to refresh the noise in a ciphertext.
Bootstrapping is extremely expensive (tens of milliseconds to several seconds per operation in modern implementations, though this continues to improve), but it removes the depth limitation.

The most practical FHE schemes today are based on the Ring Learning With Errors (RLWE) problem.
Notable constructions include BGV, BFV, and CKKS.
BGV and BFV both operate over integers and are suitable for exact arithmetic, though they differ in how they manage noise: BGV uses modulus switching to keep noise small relative to the ciphertext modulus, while BFV scales the message instead and uses a different rescaling strategy.
CKKS supports approximate arithmetic over real numbers and is better suited for machine learning workloads.
Its relaxed correctness guarantees — results are correct only up to a bounded approximation error — introduce important subtleties for database query processing, where SQL semantics typically require exact, deterministic results.

Mapping Query Operations to HE Primitives

The core challenge of HE-based query processing is expressing relational algebra in terms of the addition and multiplication operations that HE supports.

Selection (Filtering)

Selection predicates like WHERE salary > 50000 require comparison, which is not natively supported by most HE schemes.
Comparison must be decomposed into a Boolean circuit operating on bit-level encryptions, or approximated using polynomial evaluation in CKKS.
Both approaches are expensive.
A common optimization is to precompute encrypted indicator bits for frequently queried predicates at data-load time.
However, this requires the client to anticipate which predicates will be queried — a significant limitation discussed further in the walkthrough below.

Projection

Projection is straightforward when columns are encrypted independently.
The server simply returns the requested encrypted columns without needing to perform any homomorphic operations.

Aggregation

Summation maps directly to homomorphic addition and is the most efficient HE query operation. SUM, COUNT (as a sum of ones), and weighted averages are all expressible with addition and scalar multiplication. MIN and MAX require comparison circuits and are substantially more expensive.

Joins

Equality joins require testing whether two encrypted values are equal.
This can be done by computing Enc(a) - Enc(b) and testing whether the result encrypts zero, but the zero-test itself is a non-trivial circuit.
Join processing over encrypted data remains one of the most expensive operations, and most practical systems avoid it or use alternative techniques such as private set intersection (PSI) protocols.

Arithmetic Expressions

Expressions in SELECT clauses involving addition and multiplication translate directly.
Division is problematic because HE operates over finite rings or fields, and computing multiplicative inverses homomorphically requires deep circuits.

Walkthrough

The following walkthrough illustrates a simplified query execution pipeline for computing SELECT SUM(salary) FROM employees WHERE dept = 'engineering' using a BFV-style encryption scheme.

An important assumption underlies this walkthrough: the client encrypts the predicate indicator values (ct_dept_match) at data-load time.
This requires the client to know the dept column values upfront and to decide which predicates to precompute before any query is issued.
This is appropriate for static datasets but conflicts with the dynamic query scenario described in the introduction.
If the server were to evaluate the comparison dept = 'engineering' homomorphically at query time, it would require an expensive circuit for string equality over encrypted values.
This tension — between query flexibility (server-side predicate evaluation) and performance (client-side precomputation) — is a fundamental constraint in HE-based query processing.

Step-by-step Execution

diagram-1
Encrypted query execution pipeline for a filtered SUM
1. CLIENT SETUP
   - Generate public key (pk), secret key (sk), evaluation key (evk).
   - For each row i in employees:
       ct_salary[i] = Encrypt(pk, salary[i])
       ct_dept_match[i] = Encrypt(pk, 1 if dept[i] == 'engineering' else 0)
       // Note: client precomputes the predicate result at load time.
       // Dynamic predicate evaluation by the server requires expensive
       // encrypted comparison circuits.
   - Upload ct_salary[], ct_dept_match[], and evk to server.

2. SERVER-SIDE QUERY PROCESSING
   - For each row i:
       ct_filtered[i] = HomMultiply(evk, ct_salary[i], ct_dept_match[i])
       // ct_filtered[i] encrypts (salary[i] * indicator[i])
       // = salary[i] if dept matches, 0 otherwise
   
   - ct_result = ct_filtered[0]
   - For i = 1 to N-1:
       ct_result = HomAdd(ct_result, ct_filtered[i])
       // ct_result encrypts SUM of filtered salaries

3. CLIENT DECRYPTION
   - result = Decrypt(sk, ct_result)
   - result now contains the plaintext sum of engineering salaries.

Noise Budget Analysis

diagram-2
Noise budget consumption by HE operations and bootstrapping reset

Each HomMultiply consumes a significant portion of the noise budget.
Each HomAdd consumes comparatively little.
For a table of N rows processed in the non-batched manner shown above, this query requires N multiplications and N-1 additions.
Crucially, however, the multiplications are independent across rows — they do not chain into each other — giving a multiplicative circuit depth of 1.
This is comfortably within the capacity of a single-level BFV parameterization without bootstrapping.
The batched variant described in the next section reduces total operation count further.

Slot Packing for Vectorized Processing

Modern HE schemes support SIMD-style batching through polynomial slot packing.
In BFV and BGV, a single ciphertext can encode thousands of integer values in separate "slots," and a single homomorphic operation applies element-wise across all slots simultaneously.

// Pack N salary values into a single ciphertext
plaintext_vector = [salary_0, salary_1, ..., salary_{N-1}]
ct_salaries = Encrypt(pk, plaintext_vector)

// Pack N indicator values into a single ciphertext
indicator_vector = [ind_0, ind_1, ..., ind_{N-1}]
ct_indicators = Encrypt(pk, indicator_vector)

// Single HomMultiply filters all rows simultaneously
ct_filtered = HomMultiply(evk, ct_salaries, ct_indicators)

// Rotation-based reduction to compute sum across slots
for step in [1, 2, 4, 8, ..., N/2]:
    ct_rotated = HomRotate(evk, ct_filtered, step)
    ct_filtered = HomAdd(ct_filtered, ct_rotated)

// Slot 0 now contains the sum

This batching reduces N multiplications to 1 and N additions to log(N) rotations plus additions.
Rotation is itself a moderately expensive operation (comparable to multiplication in cost), but the asymptotic improvement is substantial.
For a table with 8192 rows fitting in a single ciphertext, this reduces thousands of operations to roughly 13 rotation-and-add steps.
CKKS supports an analogous slot-packing mechanism for approximate real-number arithmetic, though its use for exact SQL aggregation requires careful analysis of accumulated approximation error.

Practical Systems and Hybrid Approaches

Pure FHE query processing remains impractical for general-purpose workloads.
The overhead for complex queries (multi-way joins, nested subqueries, string operations) is typically 10^5 to 10^7 times slower than plaintext execution.
Practical systems adopt hybrid strategies.

CryptDB (Popa et al., 2011) does not use FHE.
Instead, it uses an onion model of layered encryption where different encryption layers (including PHE schemes like Paillier for addition and order-preserving encryption for comparisons) support different operations.
The server peels layers as needed, revealing weaker encryption for specific operations.
This leaks some information (e.g., order relationships when using order-preserving encryption) but achieves practical performance.
CryptDB represents a different point in the design space from FHE-based systems: it trades confidentiality guarantees for usability.

SEAL and HElib are open-source FHE libraries that provide the cryptographic primitives underlying BFV/BGV/CKKS.
Systems built on top of them typically restrict the query language to linear aggregations and low-degree polynomial functions to stay within practical noise budgets.

Hybrid approaches partition query execution between client and server.
The server handles what it can (additions, simple aggregations), and the client decrypts intermediate results to perform operations that are too expensive homomorphically (comparisons, joins).
This introduces round trips, and requires careful analysis to ensure that intermediate decrypted values do not leak sensitive information.

Performance Characteristics

Concrete performance numbers shift rapidly as implementations improve, but the relative costs remain stable:

  • Homomorphic addition: microseconds per operation (batched).
  • Homomorphic multiplication: tens of microseconds to milliseconds per operation (batched); consumes noise budget.
  • Homomorphic rotation: comparable to multiplication in cost; required for slot-packing reductions.
  • Bootstrapping: tens of milliseconds to several seconds depending on parameters and security level; an active area of optimization.
  • Key generation and encryption: seconds for large parameter sets.
  • Ciphertext size: a single BFV ciphertext at 128-bit security is typically 32KB to 256KB, compared to a few bytes for the corresponding plaintext. This expansion factor — often 1000x or more — transforms compute-bound queries into I/O-bound queries and has significant implications for storage layout and buffer management.

Key Points

  • Homomorphic encryption enables computation on ciphertexts, allowing query processing on untrusted servers without exposing plaintext data.
  • Additive operations (SUM, COUNT) map efficiently to HE primitives, while comparisons, joins, and string operations require expensive circuit decompositions.
  • SIMD-style slot packing in schemes like BFV and BGV allows batched processing of thousands of values in a single ciphertext, dramatically amortizing per-element costs.
  • Noise growth in ciphertexts limits the depth of computation; bootstrapping removes this limit but introduces substantial per-operation overhead.
  • Pure FHE query processing incurs 10^5 to 10^7 times overhead for complex queries, making hybrid client-server approaches necessary for practical systems.
  • Ciphertext expansion (often 1000x or more) transforms compute-bound queries into I/O-bound queries, requiring rethinking of storage and buffer management.
  • The choice between exact (BFV/BGV) and approximate (CKKS) arithmetic schemes has correctness implications that interact with SQL semantics around precision and determinism.
  • Practical systems like CryptDB use layered PHE schemes rather than FHE, trading some confidentiality for dramatically better performance.

References

Craig Gentry. Fully Homomorphic Encryption Using Ideal Lattices. Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC), 2009.

Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, and Hari Balakrishnan. CryptDB: Protecting Confidentiality with Encrypted Query Processing. Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP), 2011.

Junfeng Fan and Frederik Vercauteren. Somewhat Practical Fully Homomorphic Encryption. IACR Cryptology ePrint Archive, Report 2012/144, 2012.

Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. Homomorphic Encryption for Arithmetic of Approximate Numbers. Advances in Cryptology (ASIACRYPT), 2017.

Shai Halevi and Victor Shoup. Algorithms in HElib. Proceedings of the 34th Annual International Cryptology Conference (CRYPTO), 2014.

Newsletter

Signal
over noise.

Database deep-dives, delivered once a week. Storage engines, query optimization, and the data layer.

You will receive Databases Weekly.