# Provable inferencing for Morpheus with Marlin Oyster

## What is Marlin Oyster?

Marlin Oyster leverages the power of secure enclaves (TEEs) to bring secure computation to Web3. This allows services that typically require central hosting (relays, gateways, APIs) to run on a decentralized network of untrusted nodes while being managed entirely through smart contracts.

###### Oyster docs: https://docs.marlin.org/learn/what-is-oyster

## How does Oyster enable the Morpheus network to offer verifiable inferencing?

For a quick demo of how Oyster provides verifiable computing, we use a modified version of the [Morpheus Lite Client](https://github.com/MorpheusAIs/Lite-Client) with Ollama running inside Oyster. The Ollama server running inside Oyster returns a signature in the header of the Ollama `/chat` API response. This signature is produced by signing a Keccak-256 hash of the model name, prompt, inference response, and timestamp with the Oyster enclave's private key. The Morpheus Lite Client recovers the address associated with the signature and compares it with the address recovered from the Oyster attestation. (Attestation is the process of verifying that an Oyster enclave is running a given enclave image. To learn more about remote attestation, please refer to the [remote attestation docs](https://docs.marlin.org/learn/oyster/isolated-instances/topics/attestation).) This allows the client to prove that the inference for the prompt was generated by a given node running a particular LLM model. The response for the prompt shown in the image below was generated using the Llama2 (7B) model running on Ollama inside Oyster.
![image](https://hackmd.io/_uploads/HJm2bXpy0.png)

###### Morpheus-oyster-lite-client repo: https://github.com/marlinprotocol/morpheus-oyster-lite-client
###### For a tutorial on how to run Llama2 inside Oyster, please refer to the official Marlin docs: https://docs.marlin.org/user-guides/oyster/instances/tutorials/llama2/intro

To enhance user confidence, the proposed [Morpheus Lumerin Model](https://github.com/MorpheusAIs/Docs/blob/main/!KEYDOCS%20README%20FIRST!/Morpheus%20Lumerin%20Model.md) can leverage a signature approach similar to the one employed in the Morpheus-Oyster-Lite-Client demo. This would empower user nodes to verify not only the provider node used for inference but also the specific LLM model used for inferencing.

#### Steps for inference verification:

1. The provider node within Oyster receives the Ollama chat request through an HTTP proxy running inside the enclave.
2. The HTTP proxy forwards the chat request to the Ollama server, also running within Oyster.
3. The Ollama server generates the inference using the LLM model specified in the request and returns the response to the HTTP proxy.
4. The HTTP proxy captures the generated response and creates a Keccak-256 hash from the model name, the prompt, the model inference response (returned by the Ollama server), and a timestamp generated within the proxy.
5. The hash is signed with the Oyster enclave's private key, and the resulting signature, along with the timestamp, is appended to the response headers.
6. Upon receiving the response, the user node uses the timestamp from the headers together with the model name, prompt, and inference response from the body to regenerate the hash and recover the signer's address from the signature in the response header.
7. The user node then compares the recovered address with the address derived from the public key returned by the Oyster attestation API.

This verification process ensures that the inference was generated by a specific node running a particular LLM model.