My girlfriend approached me with a request: she has a lot of data tables in PDF files that she needs to convert to Excel. There are services that do this, such as pdftoexcelonline.com, though I am not sure how well they work. I decided to write my own little program instead, and ended up with a text-table-to-Excel converter that I like to call Table2Excel.
It converts text into Excel tables, using spaces as column separators and new lines as row separators.
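For the curious, the separator idea can be sketched in a few lines of Python. This is only a minimal illustration of splitting on new lines and spaces, not the actual Table2Excel source: the function names text_to_rows and write_rows are made up for this example, the sample data is invented, and the sketch writes a CSV file (which Excel opens directly) with the standard library rather than a real .xls file.

```python
import csv


def text_to_rows(text):
    """Split pasted table text into rows (one per line) and cells (by whitespace)."""
    rows = []
    for line in text.splitlines():
        cells = line.split()  # any run of spaces or tabs separates two cells
        if cells:             # skip blank lines
            rows.append(cells)
    return rows


def write_rows(rows, path):
    """Write the rows to a CSV file (an assumption of this sketch, not .xls)."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


if __name__ == "__main__":
    # Invented sample data, standing in for text pasted from a PDF table.
    sample = "Age Depth Value\n1.0 20 3.5\n2.0 40 4.1\n"
    print(text_to_rows(sample))
```

One consequence of splitting on spaces is that a cell containing a space, such as a two-word column header, ends up as two cells, so it is worth checking the header row after converting.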
Let's say you have the following table in a PDF file and you want to get all of its data into an Excel file.
(Table from: www.arrhenius.ucsd.edu/pub/precambrian.pdf)
Step1: Copy the table text and paste it into a text file. Name it anything you want; let's call it Input.txt for the moment. Your text file should look like this:
Step2: Open Table2Excel.exe and browse through your files till you find Input.txt (or whatever you named it).
Step3: Hit the process button.
Step4: Your data has now been converted to an Excel file. It is stored in output.xls.
That's it. Please let me know in the comments below whether Table2Excel works for you, or if you have suggestions for improving the table-to-Excel conversion.
Remark: If you convert several tables, make sure you rename output.xls after each run, since it will otherwise be overwritten.
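If you convert many tables, the renaming can be automated. The snippet below is a small sketch and not part of Table2Excel: the helper name keep_output is hypothetical, and it assumes output.xls sits in the folder you point it at. It moves the file to a timestamped name so that the next run cannot overwrite it.

```python
import os
import shutil
import time


def keep_output(src="output.xls", dest_dir="."):
    """Move src to a timestamped name in dest_dir and return the new path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base, ext = os.path.splitext(os.path.basename(src))
    dest = os.path.join(dest_dir, f"{base}-{stamp}{ext}")
    counter = 1
    # If a file with that name already exists, add a counter to stay unique.
    while os.path.exists(dest):
        dest = os.path.join(dest_dir, f"{base}-{stamp}-{counter}{ext}")
        counter += 1
    shutil.move(src, dest)
    return dest
```

Calling keep_output() after each conversion leaves you with one file per table instead of a single output.xls that keeps getting replaced.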
Python programmer details
Python modules required
License: GNU General Public License. Feel free to use and share it.