An overview of information extraction methods, techniques and tools for the contents in chemical document

Authors

  • Agarkar VV, Assistant Professor, Department of Computer Science, Shri D. M. Burungale Science & Arts College, Shegaon, MS, India
  • Ajmire PE, Professor & Head, Department of Computer Science, G. S. Science Arts & Commerce College, Khamgaon, MS, India

Keywords:

Information Extraction, NLP, ChemEx, IR

Abstract

The volume of electronic documents is growing rapidly, and the information available on the internet increases every day as new content is added to the web. These documents contain substantial information, but it must be retrieved and managed in a constructive and useful way. Extracting information from them supports many applications, such as text categorization, summarization, clustering, and topic tracking. Information Extraction (IE) is the field concerned with extracting useful information using different methods and approaches. This paper discusses the concept of IE and presents an overview of the techniques used for information extraction from chemical documents.

Published

2021-03-15

How to Cite

Agarkar VV, & Ajmire PE. (2021). An overview of information extraction methods, techniques and tools for the contents in chemical document. International Journal of Life Sciences, 25–30. Retrieved from https://ijlsci.in/ls/index.php/home/article/view/381

Section

Research Articles