# HE-MAN white paper summary
The paper introduces HE-MAN (Homomorphically Encrypted MAchine learning with oNnx models), a toolset designed for privacy-preserving machine learning inference using homomorphically encrypted data. The primary focus of HE-MAN is to facilitate machine learning services that handle sensitive data while ensuring the privacy of both the model and the input data. Below are the key points and components of the paper:
1. **Problem and Solution Overview**:
- The paper acknowledges the increasing importance of machine learning (ML) algorithms and the vast amount of data being generated. It notes the hesitation of individuals to pass sensitive data to ML service providers and the providers' reluctance to share their models, which are their intellectual property.
- HE-MAN is introduced as an open-source toolset that performs privacy-preserving inference on ML models, particularly those in the ONNX format, using homomorphically encrypted data. The system ensures that neither the model nor the input data is disclosed to the other party, while abstracting cryptographic details away from the users.
2. **Background on Homomorphic Encryption**:
- Homomorphic Encryption (HE) allows computations to be performed on encrypted data without needing to decrypt it. Fully Homomorphic Encryption (FHE) supports both addition and multiplication operations on encrypted data.
- The security of modern FHE schemes is based on the ring learning with errors (RLWE) hardness assumption, where random noise is added to ciphertexts during encryption. The noise level increases with operations, especially multiplications, limiting the multiplicative depth of homomorphic computations.
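The noise mechanism described above can be illustrated with a toy symmetric LWE-style scheme (plain LWE rather than ring-LWE, with parameters far too small to be secure; all constants here are illustrative, not taken from the paper):

```python
import random

# Toy symmetric LWE-style scheme illustrating the noise mechanism: a message
# is scaled up by DELTA, small random noise is added, and decryption succeeds
# only while the accumulated noise stays below DELTA / 2. Parameters are
# insecure toys; real schemes use ring-LWE with far larger dimensions.
N = 64          # secret-key dimension
Q = 2 ** 15     # ciphertext modulus
T = 16          # plaintext modulus
DELTA = Q // T  # scaling factor separating message from noise

def keygen():
    return [random.randrange(Q) for _ in range(N)]

def encrypt(sk, m):
    a = [random.randrange(Q) for _ in range(N)]
    e = random.randrange(-4, 5)  # small fresh noise
    b = (sum(x * y for x, y in zip(a, sk)) + DELTA * m + e) % Q
    return (a, b)

def add(c1, c2):
    # Adding ciphertexts adds the messages -- and also adds the noise terms,
    # which is why the depth of homomorphic computation is limited.
    a = [(x + y) % Q for x, y in zip(c1[0], c2[0])]
    return (a, (c1[1] + c2[1]) % Q)

def decrypt(sk, c):
    a, b = c
    noisy = (b - sum(x * y for x, y in zip(a, sk))) % Q
    return round(noisy / DELTA) % T  # correct while |noise| < DELTA / 2

sk = keygen()
ciphertext = add(encrypt(sk, 3), encrypt(sk, 4))
print(decrypt(sk, ciphertext))  # 7
```

Multiplication grows the noise much faster than addition (roughly multiplying the noise terms rather than adding them), which is why multiplicative depth is the binding constraint mentioned above.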
3. **HE-MAN's Architecture and Workflow**:
- HE-MAN is designed as a two-party system comprising a model owner and a data owner. The model owner holds the neural network for inference, while the data owner has the input data.
- The workflow involves initializing and deriving encryption parameters, encrypting input data, performing inference homomorphically, and decrypting the results.
- HE-MAN supports a wide range of pretrained models in the ONNX format and provides a user-friendly command-line interface. It abstracts away the complexities of FHE, like encryption parameter selection and cryptographic operations.
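One of the FHE details HE-MAN hides is parameter selection: deeper circuits need larger parameters to leave headroom for noise growth. The idea can be sketched as a lookup from the model's multiplicative depth to a parameter set (the table below is invented for illustration and is not HE-MAN's actual derivation logic):

```python
# Illustrative sketch of automatic encryption-parameter selection: the tool
# inspects the model, estimates its multiplicative depth, and picks the
# smallest parameter set that tolerates the resulting noise growth.
# All numbers here are hypothetical placeholders.
PARAM_TABLE = [
    # (max multiplicative depth, polynomial degree, ciphertext modulus bits)
    (1, 4096, 109),
    (3, 8192, 218),
    (7, 16384, 438),
]

def choose_params(mult_depth):
    """Pick the smallest parameter set supporting the given circuit depth."""
    for max_depth, degree, mod_bits in PARAM_TABLE:
        if mult_depth <= max_depth:
            return {"poly_degree": degree, "coeff_mod_bits": mod_bits}
    raise ValueError("circuit too deep for this table")

print(choose_params(2))  # {'poly_degree': 8192, 'coeff_mod_bits': 218}
```

Larger polynomial degrees buy more noise budget but make every homomorphic operation slower, which is the security/performance trade-off the paper analyzes.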
4. **Privacy and Security Considerations**:
- The system assumes a semi-honest (honest-but-curious) threat model where both parties adhere to the protocol but may attempt to infer private information. HE-MAN's design ensures that neither the model nor the sensitive input data is exposed to the other party.
- The security of data privacy relies on the underlying FHE schemes, and the model's details are protected as the model is never transferred to the data owner.
5. **Evaluation and Results**:
- The paper evaluates HE-MAN's performance in terms of accuracy and latency on tasks like classifying handwritten digits and face recognition.
- The results show that the accuracy of HE-MAN is on par with models using plaintext input, although the inference latency is significantly higher. The system's performance and encryption parameter choices are thoroughly analyzed and discussed.
6. **Conclusion and Future Work**:
- The paper concludes that HE-MAN successfully enables privacy-preserving machine learning services, allowing service providers to offer inference services without compromising the privacy of their models or the clients' data.
- It identifies areas for future development, including supporting a broader range of operators from the ONNX specification, unifying tools under a common frontend, and exploring improvements in the efficiency of the toolset, potentially through parallelization support or integration with more homomorphic encryption libraries.
Overall, HE-MAN represents a significant step towards practical, privacy-preserving machine learning services, addressing the crucial need for protecting sensitive data in ML applications.
## Could you please let me know what an ONNX model is?
ONNX, which stands for Open Neural Network Exchange, is an open standard for representing machine learning models. It's designed to enable models to be portable across different frameworks and hardware, making it easier for developers to move models between various tools or platforms during the development and deployment phases. Here are some key points about ONNX:
1. **Framework Agnostic**: ONNX offers a common format for AI models from various frameworks. It's supported by many popular frameworks like PyTorch, TensorFlow, Microsoft Cognitive Toolkit (CNTK), and more. This means you can train a model in one framework and then transfer it to another for inference, without the need for significant modifications.
2. **Interoperability**: The primary goal of ONNX is to enable interoperability between different software tools. This allows developers to choose the best tools for each stage of their workflow (e.g., training in PyTorch, inference in TensorFlow).
3. **Optimization**: ONNX models can be optimized for efficient inference and execution on different hardware platforms, including CPUs, GPUs, and specialized accelerators like FPGAs or ASICs.
4. **Components of ONNX**:
- **Model Format**: ONNX defines a standard format for machine learning models. This includes the architecture of the neural network (layers, activation functions, etc.) and the parameters (weights and biases).
- **Operators**: ONNX specifies a set of basic operations (like add, multiply, convolution, etc.) which are used to build the neural network models. This ensures that the operations are interpreted the same way across different platforms.
- **Runtimes**: ONNX models can be run on various platforms using ONNX-compatible runtimes. These runtimes can optimize the execution of the models on specific hardware.
5. **Use Cases**: ONNX is useful in scenarios where development and deployment environments are different, or when there's a need to utilize specific hardware accelerators that are supported by one framework but not another. It's also beneficial for model sharing and collaboration, as models can be easily shared and used across different frameworks.
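The graph-of-operators idea behind ONNX can be illustrated with a minimal pure-Python evaluator (this mimics the concept only; real ONNX models are protobuf files built and executed with the `onnx` and `onnxruntime` packages):

```python
# Minimal illustration of ONNX's structure: a model is an ordered list of
# nodes, each naming an operator, its input tensors, and its output tensor.
# Real ONNX stores this as protobuf and defines many more operators.
OPS = {
    "Add":  lambda x, y: [a + b for a, b in zip(x, y)],
    "Mul":  lambda x, y: [a * b for a, b in zip(x, y)],
    "Relu": lambda x: [max(a, 0.0) for a in x],
}

def run_graph(nodes, tensors):
    """Execute nodes in order, reading and writing named tensors."""
    for op, inputs, output in nodes:
        tensors[output] = OPS[op](*(tensors[name] for name in inputs))
    return tensors

# y = Relu(x * w + b), expressed as a tiny operator graph
graph = [
    ("Mul",  ["x", "w"],  "xw"),
    ("Add",  ["xw", "b"], "z"),
    ("Relu", ["z"],       "y"),
]
out = run_graph(graph, {"x": [1.0, -2.0], "w": [3.0, 4.0], "b": [0.5, 0.5]})
print(out["y"])  # [3.5, 0.0]
```

Because every runtime agrees on what "Add", "Mul", and "Relu" mean, the same graph produces the same result regardless of which framework originally trained the model.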
In summary, ONNX is a key technology for promoting interoperability and flexibility in the AI and machine learning ecosystem, allowing developers and businesses to streamline their model development and deployment workflows.
## Could you please explain HE-MAN's components and workflow?
HE-MAN (Homomorphically Encrypted MAchine learning with oNnx models) is a framework designed to enable privacy-preserving machine learning inference. It does this by allowing machine learning computations on encrypted data using Homomorphic Encryption (HE). Here's an overview of how HE-MAN works:
### Overview of Components
1. **Data Owner**: The entity that owns the data and wants to make predictions using a machine learning model without revealing the actual data.
2. **Model Owner**: The entity that owns the machine learning model and wants to provide inference services without revealing the details of the model.
3. **Homomorphic Encryption**: A form of encryption that allows computation on encrypted data without needing to decrypt it first. The result of the computation remains encrypted and can only be decrypted with the appropriate key.
### Workflow of HE-MAN
1. **Model Preparation**:
- The model owner converts their machine learning model into the ONNX format, a standard format for representing machine learning models that ensures compatibility and interoperability across different platforms and tools.
2. **Encryption Parameter Setup**:
- Appropriate encryption parameters are chosen based on the complexity of the model and the level of security required. These parameters are crucial for the performance and security of the homomorphic encryption operations.
3. **Data Encryption**:
   - The data owner generates the key pair, keeps the secret key private, and encrypts the input data with it. Only the evaluation (public) key material needed for homomorphic computation is shared with the model owner, so the data can only ever be processed in encrypted form.
4. **Homomorphic Inference**:
   - The encrypted data is sent to the model owner, who evaluates the model on it. The model itself remains in plaintext on the model owner's side and is never transferred.
   - The computations are performed homomorphically, meaning the operations required by the model, such as matrix multiplications and activations, are applied directly to the ciphertexts without decrypting them.
5. **Result Transmission**:
   - The inference result is itself a ciphertext and is sent back to the data owner.
6. **Decryption of Results**:
- The data owner uses their private key to decrypt the result, obtaining the prediction or output of the machine learning model.
7. **Model and Data Privacy**:
- Throughout this process, the model owner never sees the actual data, and the data owner never has access to the raw model. The privacy of both the model and the data is maintained.
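The workflow above can be sketched end to end with a toy additively homomorphic scheme: textbook Paillier with fixed, far-too-small primes, evaluating a linear model on ciphertexts. This is for illustration only; HE-MAN itself builds on fully homomorphic schemes, which additionally support multiplication of ciphertexts.

```python
import math
import random

# Textbook Paillier (additively homomorphic): Enc(m1) * Enc(m2) mod n^2
# decrypts to m1 + m2. The primes are tiny and fixed -- insecure, toy only.
P, Q = 999_983, 1_000_003
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def encrypt(m):
    r = random.randrange(2, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(2, N)
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2  # g = n + 1

def decrypt(c):
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add_enc(c1, c2):   # homomorphic addition of the underlying plaintexts
    return (c1 * c2) % N2

def scale_enc(c, k):   # multiply the underlying plaintext by a public int
    return pow(c, k, N2)

# Data owner encrypts the input; model owner evaluates w . x + b on
# ciphertexts without ever seeing x; data owner alone can decrypt.
x = [3, 5]                    # data owner's private input
w, b = [2, 4], 7              # model owner's plaintext weights and bias
cx = [encrypt(v) for v in x]  # ciphertexts sent to the model owner

acc = encrypt(b)
for ci, wi in zip(cx, w):
    acc = add_enc(acc, scale_enc(ci, wi))

print(decrypt(acc))  # 2*3 + 4*5 + 7 = 33
```

Throughout, the model owner handles only ciphertexts of `x`, and the data owner receives only the final encrypted score, matching the privacy guarantees listed above.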
### Points to Note
- **Performance Considerations**: Homomorphic encryption is computationally intensive, and the performance (in terms of speed and resource usage) can be significantly slower than operations on plaintext data. The complexity of the model and the size of the input data can greatly affect the inference time.
- **Security vs. Performance Trade-off**: The choice of encryption parameters involves a trade-off between security and performance. More robust security parameters can lead to increased computational requirements.
- **Current Use Cases**: Due to performance considerations, HE-MAN and similar systems are currently more suited for scenarios where the privacy of data/model is of utmost importance, and the latency of inference can be tolerated.
HE-MAN represents a significant step forward in the field of privacy-preserving AI, enabling secure, confidential computation without compromising the utility of the models or the confidentiality of the data.