# Table Structure Recognition using Top-Down and Bottom-Up Cues
###### tags: `Table Detection`
This paper tackles ==recognizing tables without ruling lines== (borderless tables)
[Paper](https://arxiv.org/pdf/2010.04565.pdf)
[Github](https://github.com/sachinraja13/TabStructNet)
[Chinese-language overview](https://www.gushiciku.cn/pl/gTrM/zh-tw)
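## TabStruct-Net Overview
TabStruct-Net is an <font color=blue>end-to-end</font> architecture for cell detection and table structure recognition
TabStruct-Net ==takes a table image as input== (<font color=red>not a document image that contains a table</font>) and tries to predict the table structure in two stages, followed by post-processing
* Top-Down: Cell Detection (<font color=blue>decomposition stage</font>): an R-CNN-based network (with a modified FPN) that detects cells, the basic table objects
* Bottom-Up: Structure Recognition (<font color=blue>composition stage</font>): takes the output of the cell detection network (the top-down stage), predicts row and column associations between cells as adjacency matrices, and rebuilds the whole table
* Post-processing: recognize the text with Tesseract
The overall pipeline is shown in the figure: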

## UNLV Dataset Annotation
* SR / ER: start row / end row
* SC / EC: start column / end column
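The four annotation fields can be read as the row and column span of a cell. Below is a minimal sketch of that reading; the `CellSpan` class and the helper names are hypothetical, and it assumes the indices are inclusive integers (the actual UNLV file format is not described here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellSpan:
    """Hypothetical container for one annotated cell (inclusive indices assumed)."""
    start_row: int  # SR
    end_row: int    # ER
    start_col: int  # SC
    end_col: int    # EC


def share_row(a: CellSpan, b: CellSpan) -> bool:
    # Two cells share a row when their [SR, ER] intervals overlap.
    return a.start_row <= b.end_row and b.start_row <= a.end_row


def share_col(a: CellSpan, b: CellSpan) -> bool:
    # Two cells share a column when their [SC, EC] intervals overlap.
    return a.start_col <= b.end_col and b.start_col <= a.end_col


# A header spanning two columns, with two ordinary cells below it.
header = CellSpan(start_row=0, end_row=0, start_col=0, end_col=1)
left = CellSpan(start_row=1, end_row=1, start_col=0, end_col=0)
right = CellSpan(start_row=1, end_row=1, start_col=1, end_col=1)

print(share_col(header, left), share_col(header, right))  # header covers both columns
print(share_row(left, right), share_col(left, right))     # same row, different columns
```

Pairwise same-row / same-column relations of this kind are what the structure recognition stage is trained to predict.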

## TabStruct-Net Architecture
Cell Detection Network + Structure Recognition Network

* Top-Down: Cell Detection Network
    * Uses the Mask R-CNN architecture
        * (a) we augment the Region Proposal Network (RPN) ==with dilated convolutions== to better <font color=blue>capture long-range row and column</font> visual features of the table.
        * (b) we ==append the feature pyramid network== with a top-down pathway, which propagates high-level semantic information to low-level feature maps. This allows the network to <font color=blue>work better for cells with varying scales</font>
        * (c) we ==append additional losses== during the training phase in order to model the inherent structural constraints.
    * Loss function: it appears to be simply the sum of four loss terms
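To see why dilated convolutions in point (a) help with long rows and columns, the sketch below computes how dilation widens the area a convolution covers without adding weights. The formulas are the standard ones for stride-1 convolutions; the kernel sizes and dilation rates are illustrative assumptions, not the values used in the paper.

```python
def effective_kernel_size(kernel: int, dilation: int) -> int:
    # A dilated kernel spans kernel + (kernel - 1) * (dilation - 1) input positions.
    return kernel + (kernel - 1) * (dilation - 1)


def stacked_receptive_field(layers: list[tuple[int, int]]) -> int:
    # Receptive field of a stack of stride-1 convolutions.
    # Each (kernel, dilation) layer adds (kernel - 1) * dilation positions.
    field = 1
    for kernel, dilation in layers:
        field += (kernel - 1) * dilation
    return field


# Illustrative comparison: three 3x3 layers, plain vs. dilated (rates are assumptions).
plain = stacked_receptive_field([(3, 1), (3, 1), (3, 1)])
dilated = stacked_receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)
```

With the same number of weights, the dilated stack sees roughly twice as far along each axis, which is the property needed to relate cells that sit far apart in the same row or column.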


* Bottom-Up: Structure Recognition Network
    * Visual Component
        * Takes the P2 layer of the FPN and the predicted cells, applies linear interpolation, then runs LSTMs along the centre horizontal and centre vertical lines of each cell to obtain the final visual features
    * Interaction Component
        * Uses the ==DGCNN== architecture, based on graph neural networks, to aggregate the visual features output by the LSTMs
    * Classification Component
        * This is fed as an input to the row/column classifiers <font color=blue>to predict row/column associations.</font>
        * The final output is presumably a pair of ==adjacency matrices== used to find each cell's corresponding row and column
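As an illustration of how adjacency matrices can be turned back into rows and columns, the sketch below groups cells by connected components. This is a simplification for tables without spanning cells and is not claimed to be the paper's reconstruction procedure; `group_cells` and the example matrices are hypothetical.

```python
def group_cells(adjacency: list[list[int]]) -> list[list[int]]:
    """Group cell indices into connected components of a symmetric 0/1 matrix."""
    n = len(adjacency)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search from each cell that has not been visited yet.
        stack = [start]
        seen[start] = True
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if adjacency[i][j] and not seen[j]:
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# A 2x2 table: cells 0 and 1 form the top row, cells 2 and 3 the bottom row.
same_row = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
same_col = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

rows = group_cells(same_row)  # [[0, 1], [2, 3]]
cols = group_cells(same_col)  # [[0, 2], [1, 3]]
print(rows, cols)
```

A cell that spans several rows is adjacent to cells in each of them, so connected components would merge those rows into one group; handling spanning cells needs a stricter grouping rule than the one sketched here.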


## Recognition Results from the Paper
Cell Detection Network:

Structure Recognition Network:
* First Row: prediction of cells which belong to the ==same row==
* Second Row: prediction of cells which belong to the ==same column==
