American Institute of Mathematical Sciences

April 2017, 2(2): 97-106. doi: 10.3934/bdia.2017001

First steps in the investigation of automated text annotation with pictures

 York University, Dept. of Electrical Engineering and Computer Science, 4700 Keele Street, Toronto, Ontario, M3J 1P3, Canada

* Corresponding author: Kent Poots *

Published April 2017

We investigate the automatic annotation of text with pictures, using dependency parsing for knowledge extraction. Annotating text with pictures is a form of knowledge visualization that can assist understanding. The problem is stated as follows: given a corpus of images and a short passage of text, extract knowledge (or concepts) from the text, then display that knowledge as pictures alongside the text to aid understanding. The proposed solution framework has three components: one extracts document concepts, one matches document concepts with picture metadata, and one produces an amalgamated output of text and pictures. A proof-of-concept application based on this framework gives encouraging results.
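The three framework components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word filter stands in for real concept extraction, and the `Picture` class, the function names, and the file names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; a real system would use a parser or keyword extractor.
STOP_WORDS = {"a", "an", "the", "is", "are", "on", "in", "of", "and", "to"}


@dataclass
class Picture:
    filename: str      # hypothetical file name
    tags: frozenset    # metadata terms describing the picture


def extract_concepts(text):
    """Component 1 (stand-in): lower-cased content words, in order, without repeats."""
    concepts = []
    for token in text.lower().split():
        word = token.strip(".,;:!?\"'")
        if word and word not in STOP_WORDS and word not in concepts:
            concepts.append(word)
    return concepts


def match_pictures(concepts, corpus):
    """Component 2: pair each concept with the first picture whose metadata names it."""
    matches = {}
    for concept in concepts:
        for picture in corpus:
            if concept in picture.tags:
                matches[concept] = picture
                break
    return matches


def annotate(text, corpus):
    """Component 3: emit the text followed by one line per matched picture."""
    matches = match_pictures(extract_concepts(text), corpus)
    lines = [text]
    for concept, picture in matches.items():
        lines.append(f"[{concept}: {picture.filename}]")
    return "\n".join(lines)


corpus = [
    Picture("dog.png", frozenset({"dog", "puppy"})),
    Picture("ball.png", frozenset({"ball"})),
]
print(annotate("The dog chases the ball.", corpus))
```

In this toy corpus the verb "chases" has no picture, so only the two nouns are annotated; a fuller corpus would need pictures for verbs as well.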

Citation: J. Kent Poots, Nick Cercone. First steps in the investigation of automated text annotation with pictures. Big Data & Information Analytics, 2017, 2 (2) : 97-106. doi: 10.3934/bdia.2017001
References:
[1] B. Coyne and R. Sproat, WordsEye: An automatic text-to-scene conversion system, Proceedings of the 28th annual conference on Computer graphics and interactive techniques (2), 3 (2003), 487-496.
[2] D. Genzel, K. Macherey and J. Uszkoreit, Creating a high-quality machine translation system for a low-resource language: Yiddish, (2009), Available from: www.mt-archive.info/MTS-2009-Genzel.pdf
[3] A. Handler, An empirical study of semantic similarity in WordNet and Word2Vec, Columbia University (2014).
[4] D. Joshi, J. Z. Wang and J. Li, The Story Picturing Engine-a system for automatic text illustration, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2 (2006), 68-89.
[5] J. McCarthy, Programs with common sense, Defense Technical Information Center (1963).
[6] J. K. Poots and E. Bagheri, Automatic annotation of text with pictures, (in-press), IEEE IT Professional (2016).
[7] S. Rose, D. Engel, N. Cramer and W. Cowley, Automatic keyword extraction from individual documents, Text Mining, (2010), 1-20.
[8] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta and F. Ciravegna, Semantic annotation for knowledge management: Requirements and a survey of the state of the art, Web Semantics: science, services and agents on the World Wide Web, 4 (2006), 14-28.
[9] N. UzZaman, J. P. Bigham and J. F. Allen, Multimodal summarization of complex sentences, Proceedings of the 16th international conference on Intelligent user interfaces, 2 (2004), 43-52.
[10] T. Veale, A. Conway and B. Collins, The challenges of cross-modal translation: English-to-Sign-Language translation in the Zardoz system, Machine Translation, 13 (1998), 81-106.
[11] L. Zhao, K. Kipper, W. Schuler, C. Vogler, N. Badle and M. Palmer, A machine translation system from English to American Sign Language, Envisioning Machine Translation in the Information Future, (2000), 54-67.

Processing for Text to Picture System
The Hierarchy of Meaning
Proposed Annotation With Pictures Framework
Example of Text Automatically Annotated With Pictures From UzZaman et al. [10]
Core Components for Text Picturing Implementation
1 | Knowledge Representation | Subject / Verb / Object using Stanford form dependencies
2 | Knowledge Extraction | RAKE baseline, Stanford, TextRazor, Deppattern parsers
3 | Test Sentences | 4 short sentences including French translation
4 | Image Database | Google-Image pictures of nouns, verbs in Sign Language
5 | Text/Image Matching | Binary match
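To make the Subject / Verb / Object representation and the binary match concrete, here is a small Python sketch. It assumes the dependency parse is already available as (relation, head, dependent) triples with the Stanford relation names `nsubj` and `dobj`; the parse is hand-written, and the function names, image file names, and tags are hypothetical rather than taken from the paper.

```python
def extract_svo(dependencies):
    """Return (subject, verb, object) from (relation, head, dependent) triples.

    Assumes Stanford-style relation names: 'nsubj' for the subject and
    'dobj' for the direct object, both attached to the main verb.
    """
    subject = verb = obj = None
    for relation, head, dependent in dependencies:
        if relation == "nsubj":
            subject, verb = dependent, head
        elif relation == "dobj":
            obj = dependent
            verb = verb or head
    return subject, verb, obj


def binary_match(term, image_tags):
    """Binary match: True only when the term equals an image tag exactly."""
    return term is not None and term.lower() in image_tags


# Hand-written parse of "The boy kicks the ball"; a parser would normally supply this.
parse = [
    ("det", "boy", "The"),
    ("nsubj", "kicks", "boy"),
    ("det", "ball", "the"),
    ("dobj", "kicks", "ball"),
]

# Hypothetical image database: file name -> set of lower-case tags.
images = {"boy.jpg": {"boy"}, "kick_sign.jpg": {"kicks", "kick"}, "cat.jpg": {"cat"}}

svo = extract_svo(parse)
selected = [name for term in svo for name, tags in images.items() if binary_match(term, tags)]
print(svo)
print(selected)
```

Because the match is binary, "ball" selects nothing here: no image carries that exact tag, and no partial or similarity-based score is computed.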
Core Components -Test Results Summary
1 | Extract baseline terms using RAKE | RAKE extracted meaningful terms
2 | Constituency and dependency parse | Stanford and TextRazor gave same result; Depparse results varied
3 | Compare input text to SVO | SVO was adequate for basic sentences
4 | Create rendered scene | Scene matched SVO
5 | Knowledge Extract, Rendering Gaps? | SVO sometimes needs additional terms; consider verb valency.
6 | Compare to prior work | Renderings provided equivalent detail.
7 | Evaluate output pictures alone | Pictures could help understanding; pictures not a replacement.
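The RAKE baseline in step 1 can be sketched as follows. This is a simplified Python version of the general RAKE idea (candidate phrases are runs of words between stop words, and each word is scored by degree divided by frequency), not the configuration used in the tests; the stop-word list is a tiny illustrative assumption and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption); RAKE normally uses a much larger one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "in", "on", "to", "with", "for"}


def candidate_phrases(text):
    """Split text into runs of non-stop-words, breaking at stop words and punctuation."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text):
    """Score each candidate phrase as the sum of its word scores, deg(w) / freq(w)."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences within the phrase, including itself
    scores = {}
    for phrase in phrases:
        scores[" ".join(phrase)] = sum(degree[w] / freq[w] for w in phrase)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


sentence = "Automatic annotation of text with pictures is a form of knowledge visualization."
for phrase, score in rake_scores(sentence):
    print(f"{score:4.1f}  {phrase}")
```

Multi-word phrases such as "automatic annotation" outrank single words because each member word gains degree from its neighbours.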
Research Objectives vs. Actual Results
No. | Objective | Actual Result
1 | Evaluate feasibility | Feasibility was demonstrated
2 | Identify CL topic areas | Topics include IR, KE, parsing, matching, cognitive science (perception)
3 | Propose a framework | Proposed, demonstrated dependency parsing, binary match
4 | Provide Test Results | Demonstrated SVO model for short sentences; may need to consider term valence.