An overview of information extraction methods, techniques and tools for the contents in chemical document
Keywords:
Information Extraction, NLP, ChemEx, IR
Abstract
The number of electronic documents has increased rapidly, and the web continues to grow as new information is added every day. These large document collections contain substantial information, but it must be retrieved and managed in a constructive and useful way. Extracting information from these documents is useful for many applications, such as text categorization, summarization, clustering, and topic tracking. Information Extraction (IE) is the field concerned with extracting useful information using different methods and approaches. This paper discusses the concept of IE and presents an overview of the techniques used for information extraction from chemical documents.
License
Copyright (c) 2021 Agarkar VV, Ajmire PE
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license unless indicated otherwise in a credit line to the material. If the material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/