Big data ecosystems collect a significant amount of data from different sources for different purposes, and some of the collected data contain sensitive information that requires protection. Homomorphic Encryption (HE) enables the processing of data in ciphertext form as if the processing were performed on the corresponding plaintext. This research report assesses the current capabilities of homomorphic encryption (HE) schemes to protect big data from a practicality perspective. The report sets four criteria for practicality and assesses the capabilities of homomorphic encryption (HE) against them. The report concludes that it is impractical to secure big data with the current capabilities of homomorphic encryption schemes, including fully homomorphic encryption (FHE) schemes.
Keywords: Homomorphic Encryption, FHE, SWHE, Big Data
The term "Big Data" refers to the ever-increasing large datasets that cannot be stored, managed, and processed by traditional information technology solutions within a tolerable time (Chen, Mao, Zhang, & Leung, 2014). Big data is gathered from different sources in various formats which can be mined to derive information and knowledge. Similar to traditional data, big data might contain sensitive or classified information which requires protection.
Cryptography is a proven security solution that can protect the confidentiality of data. Encrypting big data is impractical with traditional cryptography solutions because applications cannot process the data while it is encrypted, and it is computationally impractical to encrypt and decrypt a large amount of data promptly. Homomorphic encryption (HE) is a type of public-key encryption that allows data to be processed while it remains encrypted (Hu, 2016).
This research report analyses the practicality of using homomorphic encryption methods as a cryptography solution to protect the confidentiality of big data based on the current overall capabilities of homomorphic encryption schemes.
This section defines the terms “Big Data” and “Homomorphic Encryption”.
There are various definitions of the term “Big Data” across the industry, academia, and the media. However, the major definitions in the industry assert at least one of the following attributes as a critical factor when defining big data (Ward & Barker, 2013):
- Size: The volume of the dataset is large
- Complexity: The structure of the dataset is complex
- Technologies: The tools and techniques that are used to process the dataset leverage distributed computing, and produce an output within a tolerable time
Based on the factors mentioned above, “Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to NoSQL, MapReduce and machine learning” (Ward & Barker, 2013).
Homomorphic Encryption (HE)
Homomorphic encryption (HE) is a type of encryption that allows specific types of computation to be performed on ciphertexts and produces a ciphertext that, when decrypted, matches the result of performing the same computation on the corresponding plaintexts (Yi, Paulet, & Bertino, 2014).
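The definition above can be stated symbolically. The names Enc, Dec and Eval, and the key subscripts pk and sk, are illustrative notation assumed here rather than taken from the cited source: Eval applies a supported function f to ciphertexts without access to the secret key.

```latex
\mathrm{Dec}_{sk}\bigl(\mathrm{Eval}_{pk}\bigl(f,\ \mathrm{Enc}_{pk}(m_1), \ldots, \mathrm{Enc}_{pk}(m_k)\bigr)\bigr) = f(m_1, \ldots, m_k)
```

Which functions f are supported is what distinguishes one class of scheme from another.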
Homomorphic encryption (HE) schemes are classified into three categories in terms of the supported computational functions and limitations:
- Partially Homomorphic Encryption (PHE)
- Somewhat Homomorphic Encryption (SWHE)
- Fully Homomorphic Encryption (FHE)
In Partially Homomorphic Encryption (PHE), only one type of operation can be performed on the ciphertext, such as multiplication or addition, but not both. In Somewhat Homomorphic Encryption (SWHE), both addition and multiplication are possible, but only for a limited number of operations. Fully Homomorphic Encryption (FHE) supports an unlimited number of both additions and multiplications and hence enables the evaluation of any circuit, which amounts to computing any function on the ciphertext (Ogburn, Turner, & Dahal, 2013).
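As an illustration of the partially homomorphic case, the sketch below uses textbook RSA without padding, which is homomorphic under multiplication only. The tiny primes, the exponent and the function names are assumptions chosen for readability; the parameters are far too small to be secure, and unpadded RSA should not be used in practice.

```python
# Toy textbook RSA (no padding) with tiny primes: insecure, for illustration only.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < n with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c):
    """Decrypt a ciphertext with the private key (d, n)."""
    return pow(c, d, n)


m1, m2 = 7, 12

# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(m1) * encrypt(m2)) % n

# Decrypting the product of ciphertexts yields the product of the plaintexts.
print(decrypt(c_product))      # 84, i.e. (m1 * m2) % n
```

No comparable operation on two RSA ciphertexts yields an encryption of m1 + m2, which is why the scheme counts as partially rather than somewhat or fully homomorphic.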
The criteria of practicality
For a homomorphic encryption method to be considered practical it must fulfil the following requirements:
- It is computationally feasible to encrypt plaintext within a tolerable time
- It is computationally infeasible to decrypt the ciphertext by only knowing the ciphertext, the encryption algorithm, and the public key
- It is computationally feasible to perform arbitrary operations on the ciphertext
- It is computationally feasible to decrypt the processed ciphertext within a tolerable time and receive an output as if the computational operations were performed on the plaintext
This section analyses the current capabilities of homomorphic encryption schemes against the practicality criteria defined above.
The feasibility of encrypting plaintext
All the available homomorphic encryption schemes are based on asymmetric encryption methods. However, because of their computational overhead, applications of asymmetric encryption are normally restricted to small chunks of data, such as symmetric key management and digital signatures (Diffie, 1988); asymmetric encryption is impractical for encrypting a large amount of data. Therefore, homomorphic encryption (HE) does not fulfil the first criterion of practicality.
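One way to see where the overhead comes from is a toy version of the Paillier cryptosystem, an additively homomorphic public-key scheme that is not among the schemes discussed in this report and is used here only as an example. The sketch assumes tiny primes and illustrative function names, so it is insecure. It shows two costs: every plaintext value needs modular exponentiations to encrypt, and every ciphertext lives modulo n squared, so it is twice as long as the modulus regardless of how small the plaintext is.

```python
import math
import random

# Toy Paillier parameters: tiny primes, insecure, for illustration only.
p, q = 61, 53
n = p * q                      # public modulus; plaintexts are integers mod n
n_sq = n * n                   # ciphertexts are integers mod n^2
g = n + 1                      # standard simple choice of generator
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)           # modular inverse (Python 3.8+)


def encrypt(m):
    """Encrypt 0 <= m < n; costs two modular exponentiations mod n^2."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(c):
    """Recover the plaintext using the private values lam and mu."""
    return ((pow(c, lam, n_sq) - 1) // n * mu) % n


# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
c_sum = (encrypt(20) * encrypt(22)) % n_sq
print(decrypt(c_sum))          # 42

# Ciphertext expansion: a value mod n is stored as a value mod n^2.
print(n.bit_length())          # 12
print(n_sq.bit_length())       # 24
```

With a 2048-bit modulus, the same construction turns every encrypted value, even a single bit, into a ciphertext of up to 4096 bits, and this per-value cost is multiplied by the number of values in the dataset.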
The security of ciphertext
Homomorphic encryption schemes such as fully homomorphic encryption (FHE) from ring-LWE provide a security level matching AES-128 (Cachin, Ristenpart, & SIGSAC, 2011). As such, it is possible for homomorphic encryption (HE) to fulfil the second criterion of practicality.
Arbitrary computation on ciphertext
Partially and somewhat homomorphic encryption schemes allow only a limited set of computations on the ciphertext, whereas fully homomorphic encryption schemes enable arbitrary computation on the ciphertext (Ogburn, Turner, & Dahal, 2013).
State-of-the-art optimised implementations of fully homomorphic encryption that compromise on computational flexibility, such as the Microsoft Simple Encrypted Arithmetic Library (SEAL), are significantly slower than performing similar computations on plaintext (Dowlin et al., 2015). Moreover, software applications have to be developed in a specific way, such as using IBM HElib, to be capable of processing ciphertext (Armknecht et al., 2015).
The computational overhead increases with the size and complexity of the data; hence, it is infeasible to process big data in ciphertext form with the current implementations of homomorphic encryption. Therefore, homomorphic encryption (HE) does not fulfil the third criterion of practicality.
Decrypting the computed ciphertext
Somewhat homomorphic encryption schemes are noise-based, which means the plaintext is disguised by noise which can be removed by decryption. However, this noise increases with every computation operation on the ciphertext, and decryption will fail if it exceeds a certain threshold (Armknecht et al., 2015).
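The noise behaviour described above can be sketched with a toy integer scheme in which a bit is hidden in the parity of a small noise term added to a multiple of a secret odd number. This is a simplified, insecure illustration and not one of the implementations cited in this report; the modulus `P`, the noise range and all function names are hypothetical. The `encrypt` function also returns the noise so that its growth can be tracked, which a real scheme would never expose.

```python
import random

P = 1_000_003   # secret odd modulus; hypothetical toy value, far too small to be secure


def encrypt(bit, rng):
    """Return (ciphertext, noise) for one bit; noise is exposed only for illustration."""
    noise = 2 * rng.randrange(1, 11) + bit   # small noise whose parity carries the bit
    q = rng.randrange(10**6, 10**7)          # random multiple of P that hides the noise
    return P * q + noise, noise


def decrypt(c):
    """Strip the multiple of P, then read the bit from the parity of the remainder."""
    return (c % P) % 2


rng = random.Random(0)
acc, acc_noise = encrypt(1, rng)
trace = []   # (multiplications so far, noise still below P?, decrypted bit)

for step in range(1, 13):
    c, fresh = encrypt(1, rng)
    # Multiplying ciphertexts computes AND on the bits and multiplies the noise.
    acc, acc_noise = acc * c, acc_noise * fresh
    trace.append((step, acc_noise < P, decrypt(acc)))

for step, within_budget, bit in trace:
    status = "noise below threshold" if within_budget else "noise above threshold"
    print(step, status, "decrypts to", bit)
```

The correct result of every step is 1. Decryption is guaranteed to return 1 only while the accumulated noise stays below `P`; once the noise exceeds it, the remainder modulo `P` no longer equals the noise and the decrypted bit is unreliable.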
A somewhat homomorphic encryption (SWHE) scheme can nevertheless be converted into a fully homomorphic encryption (FHE) scheme by applying Gentry's bootstrapping technique (Gentry, 2009), provided that the scheme can homomorphically evaluate its own decryption function and perform at least one additional computation on the result.
To keep the ciphertext correctly decryptable, additional processing operations are required for every computation operation on the ciphertext. This introduces computational overhead in terms of processing and memory even for traditional data, and the overhead is more challenging for big data due to its sheer size. Thus, the current capabilities of homomorphic encryption do not fulfil the fourth criterion of practicality.
Homomorphic encryption (HE), and particularly fully homomorphic encryption (FHE), is a promising field of cryptography that can potentially change the current paradigms of information security. However, it is still a new field, and the current implementations have significant computational limitations that make them impractical for securing traditional data. This impracticality increases significantly with big data because of the computational challenges that big data already poses.
Armknecht, F., Boyd, C., Carr, C., Gjøsteen, K., Jäschke, A., Reuter, C. A., & Strand, M. (2015). A guide to fully homomorphic encryption. IACR Cryptology ePrint Archive (2015/1192).
Cachin, C., Ristenpart, T., & SIGSAC, A. (2011). Proceedings of the 3rd ACM workshop on cloud computing security workshop. New York, NY: ACM.
Chen, M., Mao, S., Zhang, Y., & Leung, V. C. M. (2014). Big data: Related technologies, challenges and future prospects. Switzerland: Springer International Publishing AG.
Diffie, W. (1988). The first ten years of public-key cryptography. Proceedings of the IEEE, 76(5), 560–577. doi:10.1109/5.4442
Dowlin, N., Gilad-Bachrach, R., Laine, K., Lauter, K., Naehrig, M., & Wernsing, J. (2015, November 13). Manual for using Homomorphic Encryption for Bioinformatics - Microsoft research. Retrieved from https://www.microsoft.com/en-us/research/publication/manual-for-using-homomorphic-encryption-for-bioinformatics/
Gentry, C. (2009). A fully homomorphic encryption scheme (Doctoral dissertation, Stanford University).
Hu, F. (Ed.). (2016). Big data: Storage, sharing, and security. United States: Productivity Press.
Ogburn, M., Turner, C., & Dahal, P. (2013). Homomorphic Encryption. Procedia Computer Science, 20, 502–509. doi:10.1016/j.procs.2013.09.310
Ward, J. S., & Barker, A. (2013). Undefined by data: A survey of big data definitions. arXiv:1309.5821.
Yi, X., Paulet, R., & Bertino, E. (2014). Homomorphic Encryption and applications. Switzerland: Springer International Publishing AG.