Category: pdf
-
A process to find and extract data-points from graphs in pdf files
Ever since I discovered that it’s sometimes possible to extract the x/y values of the points/circles/diamonds appearing in a graph within a pdf, I have been trying to automate the process. Within a pdf there are two ways of encoding an image, such as the one below. The information can be specified using a graphics […]
-
Working with PDF Highlight Annotations Programmatically
PDFs are the format of choice in academia, but extracting the information they contain is annoyingly hard. I’ve just started working on my degree’s final project. An academic project requires lots of research, which means reading lots of papers. Papers a…