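Document Image Intelligence
Document image intelligence refers to analyzing an input document image to detect and recognize the text it contains. Because text is an important carrier of information, accurately detecting and recognizing text in images has broad application prospects. The technology is currently regarded as one of the three most successfully deployed applications of artificial intelligence, with a market worth hundreds of billions across finance, commerce, security, intelligent transportation, smart cities, office automation, and other fields. Scene text understanding extends traditional optical character recognition (OCR) to natural scene images and is one of the team's main research directions. The team has achieved outstanding results in this area, holds world-leading text detection and recognition technology in the OCR field, and has built several important text detection and recognition datasets, such as [MSRA-TD 500 Dataset], [HUST-TR400 Dataset], [music score recognition datasets], [RCTW-17 Dataset], and [WordArt Dataset], which have had a major impact in academia and industry. Related work has been published dozens of times in top journals and conferences, including TPAMI, TIP, CVPR, ECCV, and AAAI. OCR algorithms developed by the team are used by well-known companies such as Google, Qualcomm, Dropbox, Samsung, Alibaba, and Tencent; among these algorithms, DBNet is used for text detection in WeChat. The team has also established in-depth collaborations with well-known companies including Huawei, Alibaba, Baidu, TAL Education (好未来), Adobe, YITU, 唯你网, and 阿博茨科技.
Related Publications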
2022
1. Liao M, Zou Z, Wan Z, Yao C, Bai X. Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 [pdf] [code]
2. Wang H, Liao J, Cheng T, et al. Knowledge Mining with Scene Text for Fine-Grained Recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 [pdf] [code]
3. Xie X, Fu L, Zhang Z, Wang Z, Bai X. Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition. European Conference on Computer Vision (ECCV), 2022 [pdf] [code]
4. Li B, Yuan Y, Liang D, Liu X, Ji Z, Bai J, Liu W, Bai X. When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition. European Conference on Computer Vision (ECCV), 2022 [pdf] [code]
2021
1. Gao Y, Li X, Zhang J, Zhou Y, Jin D, Wang J, Zhu S, Bai X. Video Text Tracking With a Spatio-Temporal Complementary Model. IEEE Transactions on Image Processing (TIP), 2021 [pdf] [code]
2. Liao M, Lyu P, He M, Yao C, Wu W, Bai X. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021 [pdf] [code]
3. Wang H, Bai X, Yang M, Zhu S, Wang J, Liu W. Scene Text Retrieval via Joint Text Detection and Similarity Learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 [pdf] [code]
4. He M, Liao M, Yang Z, et al. MOST: A Multi-Oriented Scene Text Detector with Localization Refinement. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 [pdf]
2020
1. Zhang H, Yao Q, Yang M, Xu Y, Bai X. AutoSTR: Efficient Backbone Search for Scene Text Recognition. European Conference on Computer Vision (ECCV), 2020 [pdf] [code]
2. Liao M, Pang G, Huang J, Hassner T, Bai X. Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting. European Conference on Computer Vision (ECCV), 2020 [pdf] [code]
3. Liao M, Wan Z, Yao C, Chen K, Bai X. Real-time Scene Text Detection with Differentiable Binarization. AAAI Conference on Artificial Intelligence (AAAI), 2020 [pdf] [code]
4. Wang H, Lu P, Zhang H, et al. All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting. AAAI Conference on Artificial Intelligence (AAAI), 2020 [pdf]
2019
1. Wu, L., Zhang, C., Liu, J., Han, J., Liu, J., Ding, E., & Bai, X. (2019, October). Editing text in the wild. 27th ACM international conference on multimedia (ACM MM) (pp. 1500-1508). [pdf]
2. Yang, M., Guan, Y., Liao, M., He, X., Bian, K., Bai, S., ... & Bai, X. (2019). Symmetry-constrained rectification network for scene text recognition. IEEE/CVF international conference on computer vision (ICCV) (pp. 9147-9156). [pdf]
3. Tang, J., Yang, Z., Wang, Y., Zheng, Q., Xu, Y., & Bai, X. (2019). Seglink++: Detecting dense and arbitrary-shaped scene text by instance-aware component grouping. Pattern recognition (PR), 96, 106954. [pdf]
4. Xu Y, Wang Y, Zhou W, Wang Y, Yang Z, Bai X. TextField: Learning A Deep Direction Field for Irregular Scene Text Detection. IEEE Transactions on Image Processing (TIP), 2019 [pdf]
5. Liao M, Zhang J, Wan Z, et al. Scene Text Recognition from Two-Dimensional Perspective. AAAI Conference on Artificial Intelligence (AAAI), 2019 [pdf]
2018
1. P. Lyu, M. Liao, C. Yao, W. Wu, X. Bai. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. European Conference on Computer Vision (ECCV), 2018. [pdf]
2. B. Shi, M. Yang, X. Wang, P. Lyu, C. Yao, X. Bai. ASTER: An Attentional Scene Text Recognizer with Flexible Rectification. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), accepted, 2018. [code] [pdf]
3. P. Lyu, C. Yao, W. Wu, S. Yan, X. Bai. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. [pdf]
4. M. Liao, Z. Zhu, B. Shi, G. Xia, X. Bai. Rotation-Sensitive Regression for Oriented Scene Text Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. [pdf]
5. M. Liao, B. Shi, X. Bai. TextBoxes++: A Single-Shot Oriented Scene Text Detector. IEEE Transactions on Image Processing (TIP), 2018. [code][pdf]
2017
1. B. Shi, X. Bai, C. Yao. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017. [code][music score recognition datasets][pdf]
2. B. Shi, X. Bai, S. Belongie. Detecting Oriented Text in Natural Images by Linking Segments. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. [code][pdf]
3. M. Liao, B. Shi, X. Bai, X. Wang, W. Liu. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. The 31st AAAI Conference on Artificial Intelligence (AAAI), 2017. [code][pdf]
2016
1. C. Yao, X. Bai, N. Sang, X. Zhou, S. Zhou, Z. Cao. Scene text detection via holistic, multi-channel prediction. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. [pdf]
2. Z. Zhang, C. Zhang, W. Shen, C. Yao, W. Liu, X. Bai. Multi-oriented text detection with fully convolutional networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. [code][pdf]
3. B. Shi, X. Wang, P. Lyu, C. Yao, X. Bai. Robust scene text recognition with automatic rectification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. [pdf]
2015
1. Z. Zhang, W. Shen, C. Yao, X. Bai. Symmetry-based text line detection in natural scenes. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [code][pdf]
2014
1. C. Yao, X. Bai, W. Liu. A Unified Framework for Multi-Oriented Text Detection and Recognition. IEEE Transactions on Image Processing (TIP), 2014. [HUST-TR400 Dataset][pdf]
2. C. Yao, X. Bai, B. Shi, W. Liu. Strokelets: A learned multi-scale representation for scene text recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [pdf]
2013
1. C. Yao, X. Zhang, X. Bai, W. Liu, Y. Ma, Z. Tu. Rotation-invariant features for multi-oriented text detection in natural images. PLoS ONE 8(8), 2013. [pdf]
2012
1. C. Yao, X. Bai, W. Liu, Y. Ma, Z. Tu. Detecting texts of arbitrary orientations in natural images. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [MSRA-TD 500 Dataset][pdf]