Seeing Red: PPG Biometrics Using Smartphone Cameras (measuring blood pressure and more with a camera)

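---
tags: 論文-技術類
---

# Seeing Red: PPG Biometrics Using Smartphone Cameras (measuring blood pressure and more with a camera)

[Paper](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w48/Lovisotto_Seeing_Red_PPG_Biometrics_Using_Smartphone_Cameras_CVPRW_2020_paper.pdf)
[Paper GitHub](https://github.com/ssloxford/seeing-red)
[Paper dataset](https://ora.ox.ac.uk/objects/uuid:1a04e852-e7e1-4981-aa83-f2e729371484)

> Dataset contents:
> 15 subjects; each subject has 6-11 sessions per test.
> Each session: > 2 hours
> One session contains ![](https://i.imgur.com/ZmhCz3d.png)
> Resolution 1280x720 / length 30 s / 240 fps,
> some videos are resized to 360x240 and used as validation data

Scripts in the repository:

- signal_extractor.py -> video to signal
- signal_preprocessor.py -> preprocess signal
- signal_beat_separation.py -> separate beats
- signal_fiducial_points_detection.py -> find fiducial points
- fta_average_beat.py -> find average beat (needed for failure to acquire)
- signal_beat_fta.py -> filter bad beats
- feature_extractor.py -> extract features
- feature_selection1.py -> feature selection one
- feature_selection2.py -> feature selection two
- classify.py -> run experiments

## Measurement Pipeline

**Record video -> extract signal -> process signal (apply filters, etc.) -> separate beats -> record fiducial points**

![](https://i.imgur.com/jAgzjjm.png)

## Video Recording

The authors split each video into one image per frame. They also record the filename and timestamp of every image in a CSV file (to make plotting easier later).

### Video Preprocessing

==The authors take the mean luminance of each frame as the quantified value. The formula below merges the R, G and B values into a single value, which is then plotted as a waveform.==

> ![](https://i.imgur.com/X4dmZY0.png)
> The coefficients come from the ITU-R BT.601 standard [Official ITU-R BT.601 document](https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.601-7-201103-I!!PDF-E.pdf)
> (g) -> G, and likewise for the others
> i, j -> pixel indices in the image

==The first second is always discarded, because the camera is still auto-adjusting focus, aperture, shutter and so on.==

- Red Channel Mean -> compute the mean of the red channel of each frame

```python
# `frames` is assumed to be an iterable of BGR images (H x W x 3 arrays).
signal = []
for frame_bgr in frames:
    # OpenCV stores channels as B, G, R, so index 2 is the red channel.
    mean_of_r_ch = frame_bgr[..., 2].mean()
    signal.append(mean_of_r_ch)
```

- ICA decomposition (independent component analysis)

> This problem originates in psychology, where it is called the "cocktail party effect": why can people at a noisy cocktail party still focus on the conversation they want to hear, or pick out certain special sounds (for example, someone far away suddenly speaking in their native language)? It explains human "selective hearing" from the perspectives of psychology, hearing and brain science.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Per-frame mean of each color channel.
s_r, s_g, s_b = [], [], []
for frame_bgr in frames:
    b, g, r = frame_bgr.mean(axis=0).mean(axis=0)
    s_r.append(r)
    s_b.append(b)
    s_g.append(g)

s_r = np.array(s_r).reshape(1, -1)
s_b = np.array(s_b).reshape(1, -1)
s_g = np.array(s_g).reshape(1, -1)

# Stack to shape (n_frames, 3) and keep a single independent component.
fica = FastICA(n_components=1)
stackd = np.concatenate((s_r, s_b, s_g), axis=0).T
signal = fica.fit_transform(stackd).flatten()
```

- luma component mean

```python
import cv2

signal = []
for frame_bgr in frames:
    # Channel 0 of YCrCb is the luma (Y) component.
    img_ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    mean_of_luma = img_ycrcb[..., 0].mean()
    signal.append(mean_of_luma)
```

### Actual Output of Running the Code

![](https://i.imgur.com/JUyQne7.png)

==------------------------------------------ implementation marker line ------------------------------------------==

## Signal Extraction

:::info
Paper: To obtain the signal from the raw video, we compute the mean of the pixel-wise luma component from the pixels in each video frame, so that if F is a video composed by a sequence of frames.
:::

### Step 1. Remove the trend with a rolling average (green box)

> The authors' dataset is recorded at 240 fps (240 images per second)

==Using a **1-second** window, the authors compute the rolling average and subtract it from the original signal to obtain the detrended signal.==

### Step 2. The clean waveform after low-pass filtering (blue box)

> [Butterworth filter](/D7UXNLmvTRyK-J0tMFKgkA?view)
> The authors use a low-pass filter to remove high-frequency noise. The cutoff is 4 Hz, which corresponds to 240 bpm.

![](https://i.imgur.com/u0QLQJT.png)

## Beat Separation

![](https://i.imgur.com/VAIXZv9.png)

## Fiducial Points

==The authors focus on these three points==

1. Systolic peak (sp)
2. Dicrotic notch (dn)
3. Diastolic peak (dp)

![](https://i.imgur.com/lU9Oj5F.png)

> Paper, second half of Section 4.4: However we found that this procedure needs to account for noisy signals, so we craft a more robust algorithm to detect them which falls back to best guess points.
> 彥汝: I don't understand what the authors are trying to say in this passage.

## Signal Quality

- Too much noise lowers accuracy
- Noisy beats are filtered out with a few rules
    1. max bpm ![](https://i.imgur.com/l53JkRv.png)
    2. number of peaks ![](https://i.imgur.com/BZFnd94.png)
    3. distance from reference
        - The authors use Dynamic Time Warping to set the threshold
        > e.g. if threshold = 2.1, beats whose distance exceeds 2.1 are discarded
        > The reference wave is the **average beat** over the whole dataset

![](https://i.imgur.com/Xfdrzt3.png)

## Feature Extraction

1. statistical
2. curve widths
3. frequency domain
4. fiducial points

==The authors say: to handle bpm separately during feature extraction, once the physiological features have been computed, each beat is resampled to 1000 Hz and normalized so that its values lie in [0, 1].==

### statistical

==These are intuitive statistical features of the user's physiological values==

![](https://i.imgur.com/R5TcSno.png)

1. Maximum
2. Minimum
3. Delta (|maximum - minimum|)
4. Beat length

### curve widths

==The values are quantized so that they all lie within [0.05, 0.95]==

![](https://i.imgur.com/csIAq26.png)

### frequency domain

==The authors say they use the discrete Fourier transform; the details are covered in the **Feature Selection** section.==

### fiducial points

==The fiducial points are marked using the feature values described above.==

## Feature Selection

> The authors say: feature extraction yields 541 features,
> 4 physiological values / 18 curve widths / 500 frequency domain values / 19 fiducial points

==Feature selection pipeline==

![](https://i.imgur.com/wwY7IO7.png)

### Step 1

==Use [**principal component analysis (PCA)**](/Q7AYj8uVRrOblGfWtKakBQ) to reduce the number of features in the frequency domain and curve width groups.==

- **frequency domain**

:::info
Authors: give PCA 100 components and keep only the n components that account for 99% of the total variance.
:::

- **curve width**

:::info
Authors: same as above, giving the PCA model 15 components. We found that 5 is the lower bound and 9 is just enough (depending on the dataset).
:::

### Step 2

==The authors say two techniques are used so that feature selection still preserves the original features well==

1. Compute the correlation coefficient between features ("for a pair of feature distributions"; I do not understand this part)
    > If the correlation coefficient is > 0.95, one of the two features is randomly dropped from the dataset
2. To avoid outliers, the parts of the feature distribution below the 1st and above the 99th percentile are discarded. Minimum redundancy maximum relevance (mRMR) feature selection then picks the best-performing top 60% of features... (unfinished)

## Results

[SVM](https://hackmd.io/tGU6QwDrQuyOa3XobGiASg?view)

### One-class Case

### Multi-class Case

### One-class Cross-Session

## Conclusion

## Future Work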
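
## Supplementary sketch: detrending and low-pass filtering

The two Signal Extraction steps (subtracting a 1-second rolling average, then low-pass filtering at 4 Hz) can be sketched as follows. This is a minimal illustration and not the authors' code: the 240 fps rate, 1-second window and 4 Hz cutoff come from the notes, while the filter order (`FILTER_ORDER = 4`), the centered window with edge padding, the zero-phase `filtfilt` call and the synthetic test signal are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FPS = 240          # frame rate stated in the notes
CUTOFF_HZ = 4.0    # low-pass cutoff stated in the notes (4 Hz = 240 bpm)
FILTER_ORDER = 4   # assumption: the notes do not give the filter order


def detrend_rolling_mean(signal, fps=FPS):
    """Subtract a centered 1-second rolling mean from the signal."""
    signal = np.asarray(signal, dtype=float)
    window = int(fps)
    kernel = np.ones(window) / window
    # Pad with edge values so the rolling mean has the same length as the input.
    pad_left = window // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(signal, (pad_left, pad_right), mode="edge")
    trend = np.convolve(padded, kernel, mode="valid")
    return signal - trend


def lowpass(signal, fps=FPS, cutoff_hz=CUTOFF_HZ, order=FILTER_ORDER):
    """Butterworth low-pass filter, applied forward and backward (zero phase)."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fps)
    return filtfilt(b, a, signal)


# Synthetic stand-in for a luma signal: 72 bpm pulse + slow drift + 30 Hz noise.
t = np.arange(0, 10, 1 / FPS)
raw = (0.5 * np.sin(2 * np.pi * 1.2 * t)
       + 0.3 * t
       + 0.1 * np.sin(2 * np.pi * 30 * t))
clean = lowpass(detrend_rolling_mean(raw))
```

Running the detrending first means the drift is already gone by the time the Butterworth filter is applied, so the low-pass stage only has to deal with the high-frequency noise.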
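
## Supplementary sketch: DTW distance from a reference beat

The third Signal Quality rule (discard beats whose Dynamic Time Warping distance from the reference beat exceeds a threshold) can be sketched like this. It is only an illustration: the absolute-difference cost, the unconstrained warping path, the helper names `dtw_distance` and `filter_beats`, and the synthetic beats are assumptions. The value 2.1 is the example threshold from the notes, and in the paper's setting the reference would be the dataset's average beat, not a synthetic curve.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with an absolute-difference step cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j],       # repeat b[j-1]
                                    cost[i, j - 1],       # repeat a[i-1]
                                    cost[i - 1, j - 1])   # advance both
    return cost[n, m]


def filter_beats(beats, reference, threshold):
    """Keep only beats whose DTW distance to the reference is within the threshold."""
    return [beat for beat in beats if dtw_distance(beat, reference) <= threshold]


# Hypothetical data: a smooth reference, a stretched copy of it, and a flat beat.
reference = np.sin(np.linspace(0, np.pi, 50))
good_beat = np.sin(np.linspace(0, np.pi, 60))   # same shape, different length
bad_beat = np.zeros(50)                         # no pulse at all

kept = filter_beats([good_beat, bad_beat], reference, threshold=2.1)
```

Because DTW aligns the two sequences before summing the differences, a beat that is merely slower or faster than the reference still scores a small distance, while a beat with the wrong shape does not.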
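
## Supplementary sketch: PCA keeping 99% of the variance

Step 1 of Feature Selection (fit PCA with a fixed number of components, then keep the first n that cover 99%) might look like the sketch below. It uses a plain SVD in place of a PCA library, and it measures the 99% against the variance captured by the first `max_components` components; both choices, the function name `pca_reduce` and the synthetic data are assumptions, since the notes do not say how the authors compute the ratio.

```python
import numpy as np


def pca_reduce(features, max_components, variance_to_keep=0.99):
    """Project onto the fewest leading principal components reaching the target."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes; singular values come sorted descending.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    singular_values = singular_values[:max_components]
    vt = vt[:max_components]
    variance = singular_values ** 2
    ratio = np.cumsum(variance) / variance.sum()
    # First index at which the cumulative ratio reaches the target, plus one.
    n_keep = int(np.searchsorted(ratio, variance_to_keep) + 1)
    n_keep = min(n_keep, len(ratio))
    return centered @ vt[:n_keep].T


# Hypothetical feature group: 20 columns driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.01 * rng.normal(size=(300, 20))

reduced = pca_reduce(X, max_components=15)
```

With data like this, nearly all the variance sits in the few directions spanned by the latent factors, so the reduced matrix has far fewer columns than the input.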
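
## Supplementary sketch: dropping highly correlated features

The first technique in Step 2 of Feature Selection (when two features have a correlation coefficient above 0.95, randomly drop one of them) can be illustrated as below. The 0.95 threshold comes from the notes. The use of the Pearson coefficient, the pair-by-pair scan order, the function name `drop_correlated_features` and the toy data are assumptions for the example.

```python
import numpy as np


def drop_correlated_features(features, threshold=0.95, seed=0):
    """Return the sorted column indices kept after pruning correlated pairs."""
    rng = np.random.default_rng(seed)
    # Pearson correlation between columns (rowvar=False: columns are features).
    corr = np.corrcoef(features, rowvar=False)
    n_features = corr.shape[0]
    kept = set(range(n_features))
    for i in range(n_features):
        for j in range(i + 1, n_features):
            if i in kept and j in kept and corr[i, j] > threshold:
                # Randomly discard one member of the correlated pair.
                kept.discard(int(rng.choice([i, j])))
    return sorted(kept)


# Toy data: column 1 is almost a copy of column 0, column 2 is independent.
rng = np.random.default_rng(42)
base = rng.normal(size=200)
X = np.column_stack([
    base,
    base + 0.01 * rng.normal(size=200),
    rng.normal(size=200),
])

kept_columns = drop_correlated_features(X)
X_pruned = X[:, kept_columns]
```

Fixing the seed keeps the random choice reproducible, which makes it easier to compare runs of the later mRMR step on the same pruned feature set.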