Abstract
A significant number of scholarly articles in computer science and other disciplines contain algorithms that provide concise descriptions for solving a wide variety of computational problems. For example, Dijkstra's algorithm describes how to find the shortest path between two nodes in a graph. Automatic identification and extraction of these algorithms from scholarly digital documents would enable automatic algorithm indexing, searching, analysis, and discovery. An algorithm search engine, which identifies pseudocode in scholarly documents and makes it searchable, has been implemented as part of the CiteSeerX suite. Here, we illustrate the limitations of the state-of-the-art rule-based pseudocode detection approach and present a novel set of machine-learning-based techniques that extend previous methods.
Original language | English (US) |
---|---|
Article number | 6628716 |
Pages (from-to) | 738-742 |
Number of pages | 5 |
Journal | Proceedings of the International Conference on Document Analysis and Recognition, ICDAR |
DOIs | |
State | Published - Dec 11 2013 |
Event | 12th International Conference on Document Analysis and Recognition, ICDAR 2013 - Washington, DC, United States. Duration: Aug 25 2013 → Aug 28 2013 |
All Science Journal Classification (ASJC) codes
- Computer Vision and Pattern Recognition