This package aims to extract and parse equations from PDFs of technical and scientific documents. It uses techniques from both computer vision and natural language processing to identify equations and to ground their variables to scientific concepts.
More broadly, the package serves as a proof of concept for extracting information stored as structured objects (including equations, tables, and charts) in PDF documents.
This package is being developed as part of the AutoMATES project at the University of Arizona.