
First steps in the investigation of automated text annotation with pictures
York University, Dept. of Electrical Engineering and Computer Science, 4700 Keele Street, Toronto, Ontario, M3J 1P3, Canada
We describe a first investigation of automatic annotation of text with pictures, in which knowledge extraction is based on dependency parsing. Annotating text with pictures, a form of knowledge visualization, can aid understanding. The problem is stated as follows: given a corpus of images and a short passage of text, extract knowledge (or concepts) from the text, then display that knowledge as pictures alongside the text. The proposed solution framework has three components: one to extract document concepts, one to match document concepts with picture metadata, and one to produce a combined output of text and pictures. A proof-of-concept application based on the proposed framework provides encouraging results.
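The three components described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dependency triples are hand-written in the style of Stanford typed dependencies rather than produced by a parser, and the function names (`extract_svo`, `binary_match`, `annotate`), the example sentence, and the image file names are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input format: (relation, head, dependent) triples, in the
# style of Stanford typed dependencies. A real system would get these
# from a dependency parser; here they are written by hand.
Dependency = Tuple[str, str, str]


def extract_svo(deps: List[Dependency]) -> Optional[Tuple[str, str, str]]:
    """Return (subject, verb, object) for a verb that has both nsubj and dobj."""
    subjects = {head: dep for rel, head, dep in deps if rel == "nsubj"}
    objects = {head: dep for rel, head, dep in deps if rel == "dobj"}
    for verb, subj in subjects.items():
        if verb in objects:
            return subj, verb, objects[verb]
    return None


def binary_match(concept: str, image_db: Dict[str, str]) -> Optional[str]:
    """Binary match: a picture is returned only if its label equals the concept."""
    return image_db.get(concept.lower())


def annotate(text: str, deps: List[Dependency], image_db: Dict[str, str]) -> dict:
    """Combine the text, the extracted SVO triple, and any matching pictures."""
    svo = extract_svo(deps)
    pictures = [] if svo is None else [binary_match(term, image_db) for term in svo]
    return {"text": text, "svo": svo, "pictures": pictures}


# Illustrative data only (assumed, not taken from the paper's test sentences).
deps = [
    ("det", "dog", "The"),
    ("nsubj", "chases", "dog"),
    ("det", "ball", "the"),
    ("dobj", "chases", "ball"),
]
image_db = {"dog": "dog.png", "ball": "ball.png"}

result = annotate("The dog chases the ball.", deps, image_db)
print(result["svo"])       # ('dog', 'chases', 'ball')
print(result["pictures"])  # ['dog.png', None, 'ball.png']
```

In this sketch the verb finds no picture because the binary match compares exact labels and the image database has no entry for "chases". A fuller system would presumably lemmatize terms or use a softer similarity measure before matching.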
References:
[1] B. Coyne and R. Sproat, WordsEye: An automatic text-to-scene conversion system, Proceedings of the 28th annual conference on Computer graphics and interactive techniques (2), 3 (2003), 487-496.
[2] D. Genzel, K. Macherey and J. Uszkoreit, Creating a high-quality machine translation system for a low-resource language: Yiddish, (2009), Available from: www.mt-archive.info/MTS-2009-Genzel.pdf
[3] A. Handler, An empirical study of semantic similarity in WordNet and Word2Vec, Columbia University (2014).
[4] D. Joshi, J. Z. Wang and J. Li, The Story Picturing Engine-a system for automatic text illustration, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2 (2006), 68-89.
[5] J. McCarthy, Programs with common sense, Defense Technical Information Center (1963).
[6] J. K. Poots and E. Bagheri, Automatic annotation of text with pictures, (in-press), IEEE IT Professional (2016).
[7] S. Rose, D. Engel, N. Cramer and W. Cowley, Automatic keyword extraction from individual documents, Text Mining, (2010), 1-20.
[8] V. Uren, P. Cimiano, J. Iria, S. Handschuh, M. Vargas-Vera, E. Motta and F. Ciravegna, Semantic annotation for knowledge management: Requirements and a survey of the state of the art, Web Semantics: science, services and agents on the World Wide Web, 4 (2006), 14-28.
[9] N. UzZaman, J. P. Bigham and J. F. Allen, Multimodal summarization of complex sentences, Proceedings of the 16th international conference on Intelligent user interfaces, 2 (2004), 43-52.
[10] T. Veale, A. Conway and B. Collins, The challenges of cross-modal translation: English-to-Sign-Language translation in the Zardoz system, Machine Translation, 13 (1998), 81-106.
[11] L. Zhao, K. Kipper, W. Schuler, C. Vogler, N. Badle and M. Palmer, A machine translation system from English to American Sign Language, Envisioning Machine Translation in the Information Future, (2000), 54-67.




1 | Knowledge Representation | Subject / Verb / Object using Stanford form dependencies |
2 | Knowledge Extraction | RAKE baseline, Stanford, TextRazor, Deppattern parsers |
3 | Test Sentences | 4 short sentences including French translation |
4 | Image Database | Google-Image pictures of nouns, verbs in Sign Language |
5 | Text/Image Matching | Binary match |
1 | Extract baseline terms using RAKE | RAKE extracted meaningful terms |
2 | Constituency and dependency parse | Stanford and TextRazor gave same result; Depparse results varied |
3 | Compare input text to SVO | SVO was adequate for basic sentences |
4 | Create rendered scene | Scene matched SVO |
5 | Knowledge Extract, Rendering Gaps? | SVO sometimes needs additional terms; consider verb valency. |
6 | Compare to prior work | Renderings provided equivalent detail. |
7 | Evaluate output pictures alone | Pictures could help understanding; pictures not a replacement. |
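Step 1 of the table above uses RAKE as the keyword-extraction baseline. As a rough illustration of the idea behind RAKE (split text into candidate phrases at stop words and punctuation, then score each word by degree divided by frequency), here is a simplified Python sketch. The tiny stop list and the function names are assumptions made for illustration; this is neither the reference RAKE implementation nor the configuration used in the study.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Tiny illustrative stop list (an assumption); real RAKE uses a much larger one.
STOPWORDS = {"a", "an", "the", "of", "and", "in", "on", "to", "is", "with", "for"}


def candidate_phrases(text: str) -> List[List[str]]:
    """Split text into runs of content words, broken at stop words and punctuation."""
    phrases: List[List[str]] = []
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current: List[str] = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text: str) -> Dict[str, float]:
    """Score each candidate phrase as the sum of deg(w) / freq(w) over its words."""
    phrases = candidate_phrases(text)
    freq: Dict[str, int] = defaultdict(int)
    degree: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    return {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in phrases}


scores = rake_scores("Automatic annotation of text with pictures.")
for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.1f}  {phrase}")
```

Longer phrases score higher under this scheme, so "automatic annotation" ranks above the single words "text" and "pictures" in the example sentence.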
# | Objective | Actual Result |
1 | Evaluate feasibility | Feasibility was demonstrated |
2 | Identify CL topic areas | Topics include IR, KE, parsing, matching, cognitive science (perception) |
3 | Propose a framework | Proposed, demonstrated dependency parsing, binary match |
4 | Provide Test Results | Demonstrated SVO model for short sentences; may need to consider term valence. |