Accelerating Reservoir Modeling for the Energy Transition with Physics-Informed AI Techniques [S73019]
https://www.nvidia.com/en-us/on-demand/session/gtc25-s73019/

探索如何利用 NVIDIA Modulus 框架,開發出順序物理資訊神經算子(Sequential Physics-Informed Neural Operator, PINO),並與生成對抗網路(Generative Adversarial Network, GAN)及優化器整合,徹底改變傳統與新興能源應用中的地下建模方式。本場演講將深入介紹這些新技術如何增強油藏模擬(Reservoir Simulation)的能力,同時應對可持續資源管理中不斷演變的挑戰。我們的重點在於優化現有石油與天然氣生產流程,以最大程度減少對環境的影響,並針對枯竭油藏中的碳捕獲與儲存(Carbon Capture and Storage, CCS)進行建模,進而支持全球脫碳(Decarbonization)的努力。

Explore how the NVIDIA Modulus framework is used to develop the Sequential Physics-Informed Neural Operator (PINO), integrated with Generative Adversarial Networks (GANs) and optimizers, to revolutionize subsurface modeling for both traditional and emerging energy applications. This session will dive into how these new techniques enhance reservoir simulation capabilities while addressing the evolving challenges in sustainable resource management. Our focus is on optimizing existing oil and gas production processes to minimize environmental impact and modeling carbon capture and storage (CCS) in depleted reservoirs to support global decarbonization efforts.

了解如何利用 NVIDIA Modulus 的順序物理資訊神經算子(PINO),顯著提升石油儲層模擬(Reservoir Simulation)中的歷史匹配(History Matching)精度,實現比其他人工智慧模型更高的準確性。我們將展示這種方法如何超越傳統技術,提供更可靠的模擬結果,以支持能源決策。

Learn how the Sequential Physics-Informed Neural Operator (PINO) in NVIDIA Modulus significantly improves the accuracy of history matching in reservoir simulation, achieving greater precision than other artificial intelligence models. We will demonstrate how this approach outperforms traditional techniques, delivering more reliable simulation results to support energy decision-making.

進一步探索我們如何將此方法與 CUDA 技術相結合,優化計算工作流程(Computational Workflow),實現更快、更有效率的油藏模擬(Reservoir Simulation)。透過高效的並行計算技術,我們能夠大幅縮短模擬所需的時間,同時保持高精度的結果。

Discover how we integrate this approach with CUDA technology to optimize computational workflows, enabling faster and more efficient reservoir simulation. Through high-performance parallel computing techniques, we can significantly reduce the time required for simulations while maintaining high-accuracy results.

深入了解 NVIDIA H100 GPU 如何加速訓練與推理(Training and Inference)過程,顯著提升石油儲層建模(Reservoir Modeling)的運算速度。這種高效能硬體的應用,讓我們能夠快速處理複雜的模擬任務,為能源產業帶來前所未有的效率提升。

Dive into how the NVIDIA H100 GPU accelerates training and inference processes, dramatically improving the computational speed of reservoir modeling. The application of this high-performance hardware enables us to rapidly handle complex simulation tasks, bringing unprecedented efficiency to the energy industry.

本演講將分享如何利用人工智慧(Artificial Intelligence, AI)與 GPU 加速技術,應對能源資源管理中的實際挑戰。我們將提供實用的見解,展示這些技術如何應用於現實場景,幫助產業專業人士更有效地管理資源,並為可持續能源的未來做出貢獻。

This session will share how artificial intelligence (AI) and GPU-accelerated technologies are used to tackle real-world challenges in energy resource management. We will provide practical insights into how these technologies are applied in practice, helping industry professionals manage resources more effectively and contribute to a sustainable energy future.

首先,我們有場變量(Field Variables),例如孔隙度(Porosity)、儲層厚度(Reservoir Thickness)和射孔位置(Perforation Location)。此外,我們還有標量變量(Scalar Variables),包括注入速率(Injection Rates)、初始壓力(Initial Pressure)和溫度(Temperature)。這個區域模型代表一個典型的海上碳捕獲與儲存(Offshore CCS, Carbon Capture and Storage)項目的中央注入井(Central Injection Well)。您可以透過以下連結找到這些數據集。我要特別感謝來自倫敦帝國理工學院(Imperial College London)的Gigi提供這些開源數據集。

First, we have field variables such as porosity, reservoir thickness, and perforation location. Additionally, we have scalar variables, including injection rates, initial pressure, and temperature. This regional model represents a typical offshore carbon capture and storage (CCS) project with a central injection well. You can find this dataset via the provided link. I would like to thank Gigi from Imperial College London for making this dataset open source.

我們的輸出包含24個三維快照(3D Snapshots),用於描述自由相(Free Phase)和其他相關參數的空間分佈。傅立葉神經算子(Fourier Neural Operator, FNO)是一種專為高效解決參數化偏微分方程(Parametric Partial Differential Equations)設計的神經網路架構(Neural Network Architecture)。我們的目標是將物理空間中的輸入函數(Input Function)F轉換為頻譜域(Spectral Domain)中的關係。

Our outputs consist of 24 three-dimensional snapshots describing the spatial distribution of the free phase and other relevant parameters. The Fourier Neural Operator (FNO) is a neural network architecture designed to efficiently solve parametric partial differential equations. The idea is to take the input function F from physical space into the spectral domain and to learn the relationship between input and solution there.

具體來說,我們首先將輸入(例如孔隙度場)轉換到頻率域(Frequency Domain)。然後,我們應用一個濾波器R(Filter R),這在頻譜空間(Spectral Space)中相當於神經網路版本的格林函數(Green’s Function)。接著,通過逆傅立葉變換(Inverse Fourier Transform),我們重建物理空間中的解,例如壓力(Pressure)和飽和度(Saturation)。這裡的W層充當物理空間中的逐點線性投影層(Pointwise Linear Projection Layer),其輸出在每一層中與頻譜分支的輸出相加,然後才通過激活函數(Activation Function)。

Specifically, we first convert our inputs, such as the porosity field, into the frequency domain. Then, we apply a filter R, which is the neural network equivalent of a Green’s function in spectral space. Next, through the inverse Fourier transform, we reconstruct the physical solution, such as pressure and saturation. Here, the W layer acts as a pointwise linear projection in physical space; its output is added to the spectral branch in each layer before the sum passes through the activation function.
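
As a rough illustration of the layer described above, here is a minimal 1-D NumPy sketch of a single Fourier layer. It is not the Modulus implementation: the function name `fourier_layer`, the array shapes, the number of retained modes, and the ReLU activation are assumptions chosen for readability, and the weights are random rather than trained. `W` is applied as a pointwise linear map, as in the standard FNO formulation, so the nonlinearity comes only from the activation.

```python
import numpy as np

def fourier_layer(v, R, W, b, n_modes):
    """One 1-D Fourier layer. v has shape (n_points, channels)."""
    n_points = v.shape[0]
    v_hat = np.fft.rfft(v, axis=0)                     # physical -> frequency domain
    out_hat = np.zeros_like(v_hat)
    # Apply the spectral filter R to the lowest n_modes frequencies only;
    # higher frequencies are truncated (left at zero).
    out_hat[:n_modes] = np.einsum("kio,ki->ko", R, v_hat[:n_modes])
    spectral = np.fft.irfft(out_hat, n=n_points, axis=0)  # back to physical space
    local = v @ W + b                                  # pointwise linear path W
    return np.maximum(spectral + local, 0.0)           # activation applied after the sum

# Hypothetical sizes and random (untrained) weights, for illustration only.
rng = np.random.default_rng(0)
n_points, channels, n_modes = 64, 4, 8
v = rng.standard_normal((n_points, channels))
R = (rng.standard_normal((n_modes, channels, channels))
     + 1j * rng.standard_normal((n_modes, channels, channels)))
W = rng.standard_normal((channels, channels))
b = np.zeros(channels)
out = fourier_layer(v, R, W, b, n_modes)
```

Because the filter acts on a fixed number of Fourier modes rather than on grid points, the same weights can in principle be evaluated on a finer or coarser grid, which is one reason this architecture is attractive for parametric PDEs.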

傅立葉神經算子(FNO)的一個關鍵優勢是它能夠捕捉變量之間的交互作用(Interactions),而無需顯式求解方程。我們訓練了兩個不同的模型:一個用於儲層中的壓力(Pressure),其遵循的橢圓方程(Elliptic Equation)依賴於全局條件,例如邊界條件(Domain Boundaries)。為此,我們設計的損失函數(Loss Function)專注於均方誤差(Mean Squared Error),以直接比較預測值與真實值。

One of the key advantages of the Fourier Neural Operator (FNO) is its ability to capture interactions between variables without explicitly solving the equations. We trained two different models. The first is for pressure in the reservoir, which follows an elliptic equation and therefore depends on global conditions such as the domain boundaries. For this model, the loss function is a mean squared error, which directly compares predicted and true values.

另一個模型針對飽和度(Saturation),其遵循的雙曲型方程(Hyperbolic Equation)描述了類似波動的動態現象(Dynamic Phenomena),資訊沿特定路徑傳播。飽和度受到傳輸(Transport)和對流(Convection)過程的影響。為了考慮這些差異,我們在損失函數中加入了一項額外項,以限制波前(Wave Front)上的誤差,從而使飽和度波前的梯度(Gradient)更準確。

The other model addresses saturation, which follows a hyperbolic equation describing dynamic phenomena similar to waves, with information propagating along specific paths. Saturation is influenced by transport and convection processes. To account for these differences, we added an additional term to the loss function to limit errors at the wave front, thereby improving the accuracy of the gradient at the saturation front.
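
The talk does not give the exact form of the extra loss term. One plausible reading, sketched below under that assumption, is a mean squared error on the saturation plus a mean squared error on its spatial gradient, which penalizes a smeared front more heavily than plain MSE does. The name `saturation_loss` and the weight `grad_weight` are hypothetical.

```python
import numpy as np

def saturation_loss(pred, true, grad_weight=0.5):
    """MSE on saturation plus MSE on its spatial gradient (assumed form)."""
    mse = np.mean((pred - true) ** 2)
    grad_mse = 0.0
    for axis in range(pred.ndim):
        # Finite-difference gradient along each spatial axis.
        gp = np.gradient(pred, axis=axis)
        gt = np.gradient(true, axis=axis)
        grad_mse += np.mean((gp - gt) ** 2)
    return mse + grad_weight * grad_mse

# Toy 2-D example: a sharp front versus a diffused prediction of it.
x = np.linspace(0.0, 1.0, 50)
true = np.tile((x < 0.5).astype(float), (10, 1))
smeared = np.tile(1.0 / (1.0 + np.exp((x - 0.5) / 0.1)), (10, 1))
loss_smeared = saturation_loss(smeared, true)
plain_mse = np.mean((smeared - true) ** 2)
```

In this toy case `loss_smeared` exceeds `plain_mse`, because the gradient term adds a penalty exactly where the front is blurred.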

該模型是使用 NVIDIA Modulus 框架(NVIDIA Modulus Framework)開發的,訓練過程在2000個樣本上進行,總共100個迭代週期(Epochs),並使用 NVIDIA H100 GPU(GPU H100)進行加速。推理時間(Inference Time)比傳統模擬器(Traditional Simulator)快約1000倍。我們在測試集(Test Set)上取得了非常高的精度,左邊是壓力,右邊是飽和度的結果。

The model was developed using the NVIDIA Modulus framework. Training was conducted on 2000 samples over 100 epochs using the NVIDIA H100 GPU. The inference time is approximately 1000 times faster than the traditional simulator. We achieved very high accuracy on the test set, with results for pressure on the left and saturation on the right.

在這兩個圖表中,我們可以看到隨著時間變化的 R² 分數(R² Score)。我們的模型在24個時間步(Time Steps)中能夠保持非常高的精度。讓我們比較一下使用順序傅立葉神經算子(Sequential FNO)在 Modulus 中預測的壓力與中間的參考模擬(Reference Simulation)。底部顯示的誤差(Errors)是由注入井(Injection Well)引起的,壓力傳播是逐漸發生的。我們的模型能夠非常精確地重現傳統模擬器的結果。

In these two graphs, we can see the R² score over time. Our model maintains very high accuracy across 24 time steps. Let’s compare the pressure predicted by the sequential Fourier Neural Operator in Modulus with the reference simulation in the middle. The errors shown at the bottom are due to the injection well, and the pressure propagates gradually. Our model reproduces the results obtained with the traditional simulator quite accurately.
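
For readers who want to reproduce this kind of plot, the per-time-step R² score can be computed as in the sketch below. The array layout `(time, nx, ny, nz)` and the synthetic data are assumptions for illustration, not the talk's dataset.

```python
import numpy as np

def r2_per_step(pred, true):
    """R² for each time step; both arrays have shape (time, ...)."""
    pred = pred.reshape(pred.shape[0], -1)
    true = true.reshape(true.shape[0], -1)
    ss_res = np.sum((true - pred) ** 2, axis=1)                          # residual sum of squares
    ss_tot = np.sum((true - true.mean(axis=1, keepdims=True)) ** 2, axis=1)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for 24 snapshots of a 3-D field.
rng = np.random.default_rng(0)
true = rng.standard_normal((24, 8, 8, 4))
pred = true + 0.05 * rng.standard_normal(true.shape)
scores = r2_per_step(pred, true)   # one score per time step
```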

對於飽和度(Saturation),我們也有類似的比較。誤差主要來自飽和度波前(Saturation Front)。我們已通過前面提到的損失函數盡可能減少這些誤差。從這裡的尺度來看,誤差保持在非常低的水平。

For saturation, we have a similar comparison. The errors mainly come from the saturation front. We have tried to minimize them as much as possible using the loss function mentioned earlier. Looking at the scale here, you can see that the errors remain very low.

接下來,我們來談談歷史匹配(History Matching)的部分。

Let’s move on to the part about history matching.

現在,我們將使用順序物理資訊神經算子(Sequential Physics-Informed Neural Operator, PINO),以下簡稱為 PINO,以及一種變分卷積自編碼器(Variational Convolutional Autoencoder, VCAE),以下簡稱為 VCAE,來解決一個逆問題(Inverse Problem)。我們的目標是什麼?PINO 用於根據儲層參數(Reservoir Parameters),如滲透率(Permeability)和孔隙度(Porosity),預測儲層中的壓力(Pressure)和油飽和度(Oil Saturation)。通過獲得壓力和飽和度,我們能夠確定油井(Wells)的油水產量(Oil and Water Production)。這構成了我們的正向問題(Forward Problem)。

Now, we will use the Sequential Physics-Informed Neural Operator, henceforth referred to as PINO, and a Variational Convolutional Autoencoder, henceforth referred to as VCAE, to solve an inverse problem. What do we aim to achieve? Our PINO is designed to predict the pressure and oil saturation in the reservoir based on reservoir parameters such as permeability and porosity. Obtaining pressure and saturation allows us to determine the oil and water production at our wells. This constitutes our forward problem.

現在我們要解決的是逆問題(Inverse Problem),也就是根據歷史生產數據(Historical Production Data)推導儲層的地質參數(Geological Parameters)。在這部分,我們的目標是量化和評估不確定性(Uncertainties)。為此,我們將結合 PINO 和集成方法(Ensemble Methods)進行分析。

What we now aim to address is the inverse problem, meaning finding the geological parameters of a reservoir based on historical production data. In this part, we aim to quantify uncertainties. To do so, we will use PINO along with ensemble methods.

為了介紹 PINO 的應用,我將首先簡要說明問題的控制方程(Governing Equations)。我們使用黑油模型(Black Oil Model),這是一種簡化的儲層模擬模型(Reservoir Simulation Model)。為了模擬流體流動(Fluid Flow),我們對每個相(Phase)應用質量守恆原理(Principle of Mass Conservation)。在我們的案例中,涉及兩個相:水(Water)和油(Oil)。每個相在多孔介質(Porous Medium)中的流動由達西定律(Darcy’s Law)的簡化版本描述,這是一種穩態、近似線性的儲層模型。

To introduce the application of PINO, I will start by briefly outlining the governing equations of our problem. We use the black oil model, which is a simplified reservoir simulation model. To model this flow, the principle of mass conservation is applied to each phase. In our case, there are two phases: water and oil. The flow of each phase in the porous medium is described by a simplified version of Darcy’s law, representing a steady-state, near-linear reservoir model.
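
For reference, a standard textbook form of these equations for two immiscible phases is written out below. The talk describes a simplified version, so terms such as gravity or compressibility may be dropped in the actual model; the notation here is an assumption, not taken from the slides.

```latex
\begin{aligned}
\frac{\partial (\phi \rho_\alpha S_\alpha)}{\partial t}
  + \nabla \cdot (\rho_\alpha \mathbf{u}_\alpha) &= q_\alpha,
  \qquad \alpha \in \{w, o\}, \\
\mathbf{u}_\alpha &= -\frac{k \, k_{r\alpha}(S_\alpha)}{\mu_\alpha}
  \left( \nabla p_\alpha - \rho_\alpha g \nabla z \right), \\
S_w + S_o &= 1 .
\end{aligned}
```

Here $\phi$ is porosity, $S_\alpha$, $\rho_\alpha$, $\mu_\alpha$ and $p_\alpha$ are the saturation, density, viscosity and pressure of phase $\alpha$, $\mathbf{u}_\alpha$ is the Darcy velocity, $k$ is the absolute permeability, $k_{r\alpha}$ the relative permeability, $q_\alpha$ the well source or sink term, $g$ the gravitational acceleration and $z$ the depth. The first line is mass conservation per phase and the second is Darcy's law.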

我們的儲層被表示為一個由32,000個單元(Cells)組成的三維網格(Three-Dimensional Grid)。儲層包含兩種類型的岩石:一種是滲透率較低的泥岩相(Mud Facies),滲透率為20毫達西(Millidarcies);另一種是滲透率較高的砂岩相(Sand Facies),滲透率為2000毫達西。模擬包括六口油井(Oil Production Wells)和兩口注水井(Water Injection Wells)。每口井以固定的井底壓力(Bottomhole Pressure)運行,注水井為330巴(Bars),生產井為210巴。儲層採用無流邊界條件(No-Flow Boundary Conditions)進行建模。

Our reservoir is represented as a three-dimensional grid consisting of 32,000 cells. The reservoir comprises two types of rock: a mud facies with lower permeability of 20 millidarcies and a much more permeable sand facies of 2000 millidarcies. The simulation includes six oil production wells and two water injection wells. Each well operates at a constant bottomhole pressure, with 330 bars for the injection wells and 210 bars for the production wells. The reservoir is modeled with no-flow boundary conditions.

本研究的主要目標是預測儲層在給定時間段內的壓力和水飽和度(Water Saturation)的演變。模擬涵蓋總共51個時間步(Time Steps),每個時間步代表30天的間隔(30-Day Interval)。該模型使用相對較小的數據集進行訓練,僅包含600個案例(Cases),這對建模複雜儲層構成了一定的挑戰。然而,本研究還旨在測試物理資訊神經算子模型(Physics-Informed Neural Operator Model)在有限數據下捕捉參數間複雜交互(Complex Interactions)的能力。

The primary objective of this study is to predict the evolution of pressure and water saturation in the reservoir over a given time period. The simulation covers a total of 51 time steps, each representing a 30-day interval. The model uses a relatively small dataset of 600 cases for training, which poses a challenge for modeling complex reservoirs. However, this study also aims to test the capabilities of the physics-informed neural operator model in capturing complex interactions between parameters with a limited amount of data.

PINO 模型的順序性質(Sequential Nature)對於大規模儲層模擬(Large-Scale Reservoir Simulation)尤其有利,因為它通過將計算分解為可管理的步驟(Manageable Steps),顯著降低了記憶體需求(Memory Requirements)。這使得即使在計算資源有限的情況下,處理複雜儲層問題也變得可行。我們方法的主要目標是基於時間 T-1 的可用數據,預測時間 T 的條件。具體來說,我們的模型旨在使用時間 T-1 的壓力和飽和度數據,預測時間 T 的壓力和飽和度。

The sequential nature of the PINO model is particularly advantageous for large-scale reservoir simulation, as it significantly reduces memory requirements by breaking down computations into manageable steps. This makes it feasible to handle complex reservoir problems even with limited computational resources. The primary goal of our approach is to predict conditions at time T based on available data at time T-1. Specifically, our model aims to predict pressure and saturation at time T using pressure and saturation data from time T-1.

為了實現這一目標,我們首先計算油和水的通量(Oil and Water Flux),然後在時間層(Time Layer)中準備數據,模擬時間跨度為四個月(Four Months)。PINO 模型的輸入變量包括滲透率場(Permeability Field)、孔隙度(Porosity)、總流量(Total Flow Rates)、水流量(Water Flow Rates)、壓力(Pressure)、飽和度(Saturation)、時間步長(Time Step)等。為了預測時間 T 的壓力和飽和度,我們使用來自時間 T-1 的這些輸入變量。

To achieve this, we first compute the oil and water flux and then prepare our data in the time layer, covering a time span of four months. The input variables for our PINO model include the permeability field, porosity, total flow rates, water flow rates, pressure, saturation, and the time step. To predict pressure and saturation at time T, we use these input variables from time T-1.

我們的順序方法(Sequential Approach)的訓練基礎包含五個主要步驟。首先,模型根據輸入數據估計未來的壓力和飽和度值(Pressure and Saturation Values)。接著,我們計算物理損失(Physical Loss),利用黑油模型方程(Black Oil Model Equations)和達西定律(Darcy’s Law)施加物理約束(Physical Constraints)。這裡我們有來自達西定律的壓力方程,以及來自黑油模型的飽和度方程。然後,我們計算監督損失(Supervised Loss),這些損失比較模型的預測結果與真實數據(Real Data),以確保與觀測值的一致性。我們的總損失函數(Total Loss Function)通過整合物理損失和監督損失來構建,並給予數據更高的權重(Weights),因為從數據中學習提供了最有價值的信息。最後,我們更新模型參數(Model Parameters),使用梯度下降法(Gradient Descent)調整參數,以在訓練過程中最小化總損失。

The foundation of our sequential approach for training consists of five main steps. First, the model estimates future pressure and saturation values based on the inputs. Then, we compute the physical loss, imposing physical constraints using the black oil model equations and Darcy’s law. Here, we have the equation for pressure from Darcy’s law and the equation for saturation from the black oil model. Next, we calculate the supervised loss, which compares the model’s predictions to real data to ensure consistency with observations. Our total loss function is constructed by integrating physical and supervised losses, with more weight assigned to the data, as learning from data provides the most valuable information. Finally, we update the model parameters, adjusting them using gradient descent to minimize the total loss during training.
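
The loss construction in these steps can be sketched on a toy problem. The example below uses a 1-D, single-phase, source-free pressure equation as a stand-in for the physics residual; the function names, the finite-difference discretization, and the weights `w_data` and `w_phys` are all assumptions (the talk only states that the data term receives the larger weight).

```python
import numpy as np

def pressure_residual(p, k, dx):
    """Finite-difference residual of d/dx(k dp/dx) = 0 in the interior cells."""
    k_face = 0.5 * (k[1:] + k[:-1])           # face permeability (arithmetic mean, an assumption)
    flux = -k_face * (p[1:] - p[:-1]) / dx    # Darcy flux with unit viscosity
    return (flux[1:] - flux[:-1]) / dx        # flux divergence

def total_loss(p_pred, p_data, k, dx, w_data=1.0, w_phys=0.1):
    """Weighted sum of the supervised (data) loss and the physics loss."""
    data_loss = np.mean((p_pred - p_data) ** 2)
    phys_loss = np.mean(pressure_residual(p_pred, k, dx) ** 2)
    return w_data * data_loss + w_phys * phys_loss

# Toy check: a linear profile between 330 and 210 bar satisfies the equation
# for uniform permeability, so both loss terms vanish.
k = np.ones(21)
p_exact = np.linspace(330.0, 210.0, 21)
p_noisy = p_exact + np.sin(np.arange(21))
loss_exact = total_loss(p_exact, p_exact, k, dx=1.0)
loss_noisy = total_loss(p_noisy, p_exact, k, dx=1.0)
```

In an actual training loop, this scalar would be differentiated with respect to the network weights by the framework's automatic differentiation, and the weights would then be updated by a gradient-based optimizer.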

在測試階段(Testing Phase),僅需提供時間 T=0 的初始條件(Initial Conditions),即壓力(Pressure)和水飽和度(Water Saturation)。隨後,預測將按順序進行,每個時間 T 的預測使用時間 T-1 的預測值作為新輸入。我們還計算流率數據(Flow Rate Data)。時間 T 的新流率(Flow Rates)通過一個方程計算,該方程結合了預測的壓力和飽和度值。這確保流率與多孔介質(Porous Media)中流體流動的物理特性保持一致。整個過程應用於我們的51個時間步(Time Steps)。

During the testing phase, only the initial conditions at time T=0 for pressure and water saturation are needed to start the prediction. Predictions are then performed sequentially, with each prediction at time T using the predicted values at time T-1 as the new input. We also calculate flow rate data. The new flow rates at time T are calculated using an equation that incorporates the predicted pressure and saturation values. This ensures that the flow rates remain consistent with the physics of fluid flow in porous media. This entire process is applied over our 51 time steps.
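
The sequential prediction loop can be sketched as follows. `toy_step` is a hypothetical stand-in for the trained surrogate, and the scalar state is only there to keep the example short. Treating T=0 plus 50 predicted steps as the 51 time steps is also an assumption.

```python
def rollout(step_fn, pressure0, saturation0, static_inputs, n_steps):
    """Autoregressive rollout: each prediction becomes the next input."""
    pressure, saturation = pressure0, saturation0
    history = [(pressure, saturation)]        # state at T = 0
    for _ in range(n_steps):
        pressure, saturation = step_fn(pressure, saturation, static_inputs)
        history.append((pressure, saturation))
    return history

def toy_step(pressure, saturation, static_inputs):
    """Hypothetical surrogate: pressure relaxes to a target, saturation rises toward 1."""
    target = static_inputs["target_pressure"]
    return (pressure + 0.1 * (target - pressure),
            saturation + 0.05 * (1.0 - saturation))

history = rollout(toy_step, 330.0, 0.2, {"target_pressure": 210.0}, n_steps=50)
```

One consequence of this design is that prediction errors can accumulate from step to step, which is why the R² score is tracked over time rather than reported as a single number.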

我們使用 NVIDIA Modulus 框架(NVIDIA Modulus Framework)開發了物理資訊神經算子(Physics-Informed Neural Operator)。訓練過程使用 NVIDIA H100 GPU(H100 GPU)進行加速。我們的數據集包含600個訓練樣本(Training Samples)。比較結果顯示,物理資訊神經算子(PINO)在預測壓力和水飽和度方面優於傅立葉神經算子(Fourier Neural Operator, FNO)。在圖表中,紅色線條表示 PINO 模型的結果,綠色線條表示 FNO 模型的結果,兩者在水飽和度預測中也有類似表現。在這兩種情況下,PINO 模型均提供了更高的預測精度(Prediction Accuracy),展示了將物理知識(Physical Knowledge)融入神經網路(Neural Networks)的優勢。

We developed the physics-informed neural operator using the NVIDIA Modulus framework. Training was conducted using an NVIDIA H100 GPU. Our dataset consists of 600 training samples. Comparative R² scores show that the physics-informed neural operator outperforms the Fourier Neural Operator (FNO) in predicting pressure and water saturation. In the graphs, the red line represents the results for the PINO model, and the green line represents the FNO model. We observe similar performance for water saturation. In both cases, PINO provides superior prediction accuracy, demonstrating the advantage of integrating physical knowledge into neural networks.

這些結果證實,PINO 在捕捉模擬系統的複雜動態(Complex Dynamics)方面比 FNO 更有效,能夠為壓力與飽和度提供更可靠的預測。然而,我們注意到 PINO 的訓練時間(Training Time)是 FNO 的四倍。因此,在提高精度(Accuracy)和優化訓練時間之間找到平衡至關重要。

These results confirm that PINO is more effective than FNO in capturing the complex dynamics of the simulated system, offering more reliable predictions for both variables. However, we observe that the training time for PINO is four times longer than for FNO. Therefore, it is essential to find a balance between improving accuracy and optimizing training time.

這裡我們展示了一個動畫圖表,左邊是使用 NVIDIA Modulus 開發的模型預測,中間是真實的壓力(True Pressure),右邊是兩者之間的誤差(Error)。我們可以看到,壓力的預測結果非常理想。對於飽和度,我們也有類似的比較。最大的誤差出現在流動前沿(Flow Fronts)。為了最小化這些誤差,我們使用了前面描述的損失函數(Loss Function)。

Here, we present an animated plot showing the prediction of our model developed using NVIDIA Modulus on the left, the true pressure in the middle, and the error between the two on the right. We can see that the results for pressure are quite good. For saturation, we have a similar comparison. The largest errors are located at the flow fronts. To minimize these errors, we used the loss function described earlier.

對於我們的問題,我們的目標是根據油水產量數據(Oil and Water Production Data)重建滲透率圖(Permeability Map)。然而,這個問題屬於病態問題(Ill-Posed Problem),意思是系統是欠定的(Under-Determined)。儲層中的未知參數(Unknown Parameters)多於可用數據(Available Data),即歷史生產數據(Historical Production Data)。測量數據的數量不足,滲透率圖的約束條件(Constraints)無法排除所有可能的解(Possible Solutions)。此外,數值誤差(Numerical Errors)在前向模型(Forward Model)中可能被放大。

For our problem, we aim to reconstruct a permeability map from oil and water production data. However, this problem is ill-posed, meaning the system is under-determined. There are more unknown parameters in our reservoir than available data, which consists of historical production data. The number of measurements is insufficient, and the constraints on the permeability map are too weak to eliminate all possible solutions. Additionally, numerical errors in the forward model can be amplified.

變分卷積自編碼器(Variational Convolutional Autoencoder, VCAE)是解決逆問題(Inverse Problems)中一個強大的工具,用於降維(Dimensionality Reduction)和複雜數據的概率建模(Probabilistic Modeling)。VCAE 能夠將高維數據(High-Dimensional Data),例如滲透率圖,壓縮到低維潛在空間(Latent Space),從而促進高效的採樣(Sampling)和重建(Reconstruction)。

The Variational Convolutional Autoencoder (VCAE) is a powerful tool for inverse problems, enabling dimensionality reduction and probabilistic modeling of complex data. VCAE facilitates a compact representation of high-dimensional data, such as a permeability map, in a reduced latent space. This enables efficient sampling and reconstruction.

為什麼要將先驗知識(Prior Knowledge)融入學習潛在分佈(Latent Distribution)?透過變分卷積自編碼器(Variational Convolutional Autoencoder, VCAE),我們的目標是通過從潛在空間(Latent Space)解碼隨機向量(Random Vector),生成不在我們數據集中的新滲透率圖(Permeability Maps)。潛在空間的參數數量遠少於儲層單元(Reservoir Cells)的數量,這使我們能夠更好地管理不確定性(Uncertainties),並朝著良定問題(Well-Posed Problem)邁進。換句話說,描述儲層所需的未知參數(Unknown Parameters)數量顯著減少。

Why incorporate prior knowledge to learn the latent distribution? With the Variational Convolutional Autoencoder (VCAE), we aim to generate new permeability maps that are not present in our dataset by decoding a random vector from our latent space. The latent space has significantly fewer parameters than the number of cells in our reservoir, allowing us to better manage uncertainties and move toward a well-posed problem. In other words, the number of unknown parameters needed to describe the reservoir is significantly reduced.

VCAE 包含三個基本步驟。首先是編碼器(Encoder),它從輸入數據中提取潛在表示(Latent Representation)。接著是重參數化(Reparameterization),利用編碼器生成的均值(Mean)和方差(Variance)產生潛在變量 Z(Latent Variable Z)。最後是解碼器(Decoder),它從潛在空間重建原始數據(Original Data)。這種模型特別適用於數據生成任務(Data Generation Tasks)和降維(Dimensionality Reduction),同時融入了概率約束(Probabilistic Constraints)。因此,通過 VCAE,我們可以通過從潛在空間中取一個隨機向量並解碼,生成不在當前數據集中的新儲層(New Reservoirs)。

The VCAE consists of three fundamental steps. First, the encoder extracts a latent representation from the input data. Then, reparameterization generates a latent variable Z using the mean and variance produced by the encoder. Finally, the decoder reconstructs the original data from the latent space. This model is particularly well-suited for data generation tasks and dimensionality reduction while incorporating probabilistic constraints. Thus, with the VCAE, we can generate new reservoirs that are not present in our current dataset by taking a random vector in the latent space and decoding it.
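
The reparameterization step is the part that is easiest to get wrong, so a minimal NumPy sketch is given here. It assumes the encoder outputs the mean and the log-variance (a common convention, not confirmed by the talk); the encoder and decoder networks themselves are omitted.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, I), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])      # hypothetical encoder output
log_var = np.zeros(3)                # log-variance 0 means unit variance
z = reparameterize(mu, log_var, rng)
```

To generate a new permeability map, one would draw `z` from the standard normal prior and pass it through the trained decoder. The KL term is what keeps the learned latent distribution close to that prior, so that such random draws decode to plausible maps.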

為了解決這個逆問題(Inverse Problem),我使用了自適應正則化集成更新方法(Adaptive Regularized Ensemble Update Method),這是由 Kitanidis 等人開發的算法,我們稱之為 ERGM(Ensemble Regularized Gaussian Method)。如果您需要更多算法細節,我建議閱讀這裡提供的參考論文。但為了簡單起見,我們有順序物理資訊神經算子模型(Sequential PINO Model)和真實生產數據(Real Production Data)作為我們的目標(Target)。

To solve this inverse problem, I used the adaptive regularized ensemble update method, an algorithm developed by Kitanidis and others, which we call ERGM (Ensemble Regularized Gaussian Method). If you need more details on this algorithm, I recommend reading the reference paper provided here. But to keep it simple, we have our sequential PINO model and real production data, which serve as our target.

在初始化階段(Initialization),我們使用 VCAE 生成100個滲透率圖的集成(Ensemble of Permeability Maps)。然後,將這些儲層作為順序 PINO 模型的輸入,預測壓力場(Pressure Fields)和飽和度場(Saturation Fields)。接著,我們使用 Peaceman 井模型方程(Peaceman Well Equation)計算每個井隨時間變化的油水產量(Oil and Water Production Rates)。

In the initialization phase, we generate an ensemble of 100 permeability maps using the VCAE. Then, we use these reservoirs as input for our sequential PINO model to predict pressure and saturation fields. Finally, we use the Peaceman well equation to calculate the oil and water production rates over time for each well.

隨後,我們計算初始集成中每個成員的測量數據(Measured Data)與模擬數據(Simulated Data)之間的殘差(Residual),並計算它們的均值(Mean)和方差(Variance)。這個殘差同時也是我們的成本函數(Cost Function)。這裡,Γ(Gamma)表示與目標 Y 相關的誤差協方差矩陣(Covariance Matrix of Errors)。在殘差計算過程中,Γ 用於標準化測量數據與模擬數據之間的差異,確保誤差在一致的尺度上進行比較。

Next, we compute the residual between the measured and simulated data for each member of the initial ensemble, along with their mean and variance. This residual is also our cost function. Here, Γ represents the covariance matrix of errors associated with the target Y. During residual calculation, Γ is used to normalize the difference between measured and simulated data, ensuring that errors are compared on a consistent scale.
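
A minimal sketch of such a Γ-weighted residual is shown below. The factor 0.5 and the quadratic form are the usual least-squares convention and are assumed here; the numbers are made up for illustration.

```python
import numpy as np

def misfit(y_obs, y_sim, gamma):
    """0.5 * (y_obs - y_sim)^T Gamma^{-1} (y_obs - y_sim) for one ensemble member."""
    r = y_obs - y_sim
    return 0.5 * r @ np.linalg.solve(gamma, r)

y_obs = np.array([10.0, 5.0, 1.0])            # measured data
gamma = np.diag([4.0, 1.0, 1.0])              # error covariance of the measurements
ensemble = np.array([[12.0, 5.0, 1.0],        # simulated data, one row per member
                     [10.0, 5.0, 1.0],
                     [10.0, 6.0, 0.0]])
costs = np.array([misfit(y_obs, y_sim, gamma) for y_sim in ensemble])
mean_cost, var_cost = costs.mean(), costs.var()
```

Note how the first member is off by 2 on a measurement with variance 4 and is penalized less than the third member, which is off by 1 on two measurements with variance 1. This is the normalization role of Γ.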

接著,我們應用正則化(Regularization)來根據預測的均值和方差調整殘差。然後,我們計算兩個協方差矩陣(Covariance Matrices)。第一個協方差矩陣用於衡量集成成員預測的變異性(Variability of Predictions)。第二個協方差矩陣用於建立滲透率圖表示的權重空間(Weighting Space)與儲層參數之間的關聯。

Then, regularization is applied to adjust the residual based on the mean and variance of the predictions. Next, we calculate two covariance matrices. The first one measures the variability of predictions among the ensemble members. The second one establishes a link between the weighting space, where the permeability map is represented, and the reservoir parameters.

物理預測空間(Physical Prediction Space)用於更好地將潛在變量(Latent Variables)與觀測數據(Observed Data)對齊。滲透率場(Permeability Field)通過變分卷積自編碼器(Variational Convolutional Autoencoder, VCAE)編碼到潛在空間 Z(Latent Space Z)。然後,利用殘差(Residuals)、協方差(Covariance)和誤差噪聲(Noise)更新潛在變量。更新後的變量被解碼以生成優化的滲透率場(Refined Permeability Fields)。通過這種方式,我們更新初始集成(Initial Ensemble),並重複此過程30次(30 Iterations)。我們可以看到,成本函數(Cost Function)已經收斂(Converged),最終集成(Final Ensemble)成功趨向於最佳滲透率圖(Optimal Permeability Map)。

The physical prediction space is used to better align the latent variables with observed data. The permeability field is encoded into the latent space Z using the Variational Convolutional Autoencoder (VCAE). Then, the latent variables are updated using residuals, covariance, and error noise. The updated variables are decoded to generate refined permeability fields. By doing this, we update our initial ensemble, repeating this process for 30 iterations. We can see that our cost function has converged, and our final ensemble has successfully converged toward the optimal permeability map.
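
To make the update step concrete, here is a generic regularized ensemble Kalman update written in NumPy. It is a sketch of the general technique, not the speaker's algorithm: the regularization `alpha` is held fixed (the adaptive choice is not shown), and a random linear map `A` stands in for the whole chain of decoder, sequential PINO, and well model. All names are hypothetical.

```python
import numpy as np

def ensemble_update(Z, G, y, gamma, alpha, rng):
    """One regularized ensemble Kalman update.
    Z: (n_members, n_latent) latent vectors; G: (n_members, n_data) simulated data."""
    n = Z.shape[0]
    dZ = Z - Z.mean(axis=0)
    dG = G - G.mean(axis=0)
    C_gg = dG.T @ dG / (n - 1)    # variability of the predictions
    C_zg = dZ.T @ dG / (n - 1)    # cross-covariance between latent variables and predictions
    # Perturb the observations with noise inflated by the regularization alpha.
    noise = rng.multivariate_normal(np.zeros(len(y)), alpha * gamma, size=n)
    innovation = y + noise - G
    K = C_zg @ np.linalg.inv(C_gg + alpha * gamma)
    return Z + innovation @ K.T

rng = np.random.default_rng(0)
n_members, n_latent, n_data = 100, 4, 6
A = rng.standard_normal((n_data, n_latent))     # toy linear forward model (assumption)
z_true = rng.standard_normal(n_latent)
y = A @ z_true                                  # synthetic "production data"
gamma = 0.01 * np.eye(n_data)

Z = rng.standard_normal((n_members, n_latent))  # initial latent ensemble
initial_misfit = np.mean(np.sum((Z @ A.T - y) ** 2, axis=1))
for _ in range(30):
    Z = ensemble_update(Z, Z @ A.T, y, gamma, alpha=2.0, rng=rng)
final_misfit = np.mean(np.sum((Z @ A.T - y) ** 2, axis=1))
```

No gradients of the forward model are needed, only forward evaluations of the ensemble. That is why a fast surrogate pays off directly: the cost of each iteration is dominated by running the forward model once per ensemble member.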

在中間,我們展示了初始集成(Initial Ensemble)。集成方法(Ensemble Method)的目標是使這裡顯示為灰色的初始集成(Gray Ensemble)趨向於目標(Target),即以黑色表示的四口生產井(Four Production Wells)的油產量(Oil Production)。綠色曲線顯示了最終集成(Final Ensemble)在多次迭代後的結果。我們僅在模擬的前25個時間步(25 Time Steps)上運行集成更新算法(Ensemble Update Algorithm),其餘時間步用於預測(Forecasting),以檢驗是否仍能趨向目標。圖表顯示了四口井的油產量分佈。

In the middle, we show our initial ensemble. The goal of our ensemble method is to make the initial ensemble, shown here in gray, converge toward our target, represented in black, which corresponds to the oil production at our four production wells. The green curve shows our final ensemble after several iterations. We ran the ensemble update algorithm only on the first 25 time steps of our simulation, with the rest used for forecasting to see if we still converge toward our target. Here, we can see the results for each of the four wells.

同樣的分析也應用於水產量(Water Production),展示了四口生產井的水產量分佈。在這一頁中,我展示了從最終集成重建的可能滲透率圖(Permeability Map Reconstruction),並與用於生成真實生產數據(Real Production Data)的真實滲透率圖(True Permeability Map)進行比較。我們可以看到,儲層地質(Reservoir Geology)被相當準確地重建。然而,某些區域(例如這裡和這裡)存在缺失,這是因為這些區域沒有井(Wells),導致難以重建相關信息。我們的井位於這些通道(Channels)上,通過這一工作流程(Workflow),我們能夠成功重建這些通道。

The same analysis is applied to water production, showing the water production at our four production wells. On this slide, I present a possible permeability map reconstruction from our final ensemble and compare it with the true permeability map used to generate the real production data. We can see that we have managed to reconstruct the geology of our reservoir fairly accurately. However, some areas, like this one and this one, are missing. This is due to the fact that there are no wells in those areas, making it difficult to reconstruct that information. Our wells are located on these channels, and we are able to reconstruct them with this workflow.

通過結合順序物理資訊神經算子(Sequential PINO)、集成更新算法(ERGM Algorithm)和使用 VCAE 生成的初始集成(Initial Ensemble),我們顯著提高了流程的執行速度(Execution Speed)。以100個初始集成為例,在一塊 NVIDIA H100 GPU(H100 GPU)上,每次迭代需要40分鐘(40 Minutes)。若使用300個初始集成,則需要134分鐘(134 Minutes)。相比之下,使用傳統儲層模擬器(Conventional Reservoir Simulator)的工作流程,對於相同儲層規模(Reservoir Size),儲層工程師(Reservoir Engineer)需要大約8小時(8 Hours)完成100個初始集成的計算。

By combining the sequential PINO, the ERGM algorithm, and an initial ensemble generated using VCAE, we can significantly speed up the process. With an initial ensemble of 100, it takes 40 minutes per iteration on one NVIDIA H100 GPU. With an initial ensemble of 300, it takes 134 minutes. For comparison, using a traditional workflow on a conventional reservoir simulator, a reservoir engineer would take approximately 8 hours with an initial ensemble of 100 for this specific reservoir size.

總結來說,這一工作流程(Workflow)適用於需要採樣整個後驗密度(Posterior Density)以量化逆問題不確定性(Inverse Uncertainty)的場景。需要注意的是,人工智慧代理模型(AI Surrogate Models)旨在加速某些任務,例如我們今天看到的歷史匹配(History Matching),以及其他任務如油藏布局優化(Well Placement Optimization)。我們實現了更快的推理時間(Inference Times)並降低了推理過程中的計算成本(Computational Cost)。然而,訓練前向模型(Forward Models)仍然需要大量計算資源(Computational Resources),特別是當我們希望擴展到具有數百萬單元的真實油田(Real Fields)時。因此,通過跨多個 GPU(Multiple GPUs)並行化訓練(Parallelizing Training)並使用計算集群(Computing Cluster)至關重要。

To conclude, this workflow is designed for scenarios where it is necessary to sample the entire posterior density to quantify inverse uncertainty. It is important to keep in mind that AI surrogate models are meant to accelerate tasks such as history matching, as we saw today, but also others like well placement optimization. We achieve faster inference times and reduce computational costs during inference. However, training the forward models still requires significant computational resources, especially if we want to scale up to real fields with millions of cells. As a result, parallelizing training across multiple GPUs and using a computing cluster is essential.

傳統模擬器(Traditional Simulators)仍將是高保真建模(High-Fidelity Modeling)的參考標準。我們的目標不是取代儲層工程師(Reservoir Engineers),而是提供互補的解決方案(Complementary Solutions)與替代方法(Alternative Approaches)。感謝大家參加我的演講。我要特別感謝 TotalEnergies(TotalEnergies),尤其是我的經理 Daniel Busby(Daniel Busby),他在整個項目中提供了寶貴的指導和支持。我也要感謝 NVIDIA 團隊(NVIDIA Team)在準備這次演講中的幫助,特別是 Clément Hénon(Clément Hénon),他向我介紹了 NVIDIA Modulus(Modulus)並提供了技術指導,幫助我構建了從 PINO 到解決逆問題的完整工作流程。

Traditional simulators will remain the reference for high-fidelity modeling. The goal is not to replace reservoir engineers but rather to provide complementary solutions with alternative approaches. Thank you so much for attending my presentation. I would like to express my gratitude to TotalEnergies, and especially my manager Daniel Busby, for his invaluable guidance and support throughout this project. I also want to thank the NVIDIA team for their help in preparing this presentation, and in particular, Clément Hénon, who introduced me to Modulus and provided technical guidance and support in building the entire workflow from PINO to solving the inverse problem.

如果您有任何問題,請在聊天室中提問,我很樂意回答。您也可以通過電子郵件(Email)聯繫我,或在 LinkedIn(LinkedIn)上與我連接。再次感謝大家!

If you have any questions, feel free to ask in the chat. I would be happy to answer. You can also reach out to me by email or connect with me on LinkedIn. Thank you again!