extraction - Automatic deletion of some PDF contents


I've got to extract specific data from a large number of .pdf files. The problem is that the first thing I have to do is convert each .pdf to .txt so that I can find the data I'm interested in. After conversion there's a large amount of artefacts in the .txt files (page numbers, hyperlinks from the contents page, footers, headers, etc.). These .pdf files are quite huge (every single file is a transcription of 7-12 hours of people talking), so I cannot afford to delete things manually (I've got ~60 .pdf files). My question: do you know of any tool that allows automatic deletion of such contents?

I'd be glad to hear any suggestion that would improve my work :) Thanks!
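
To make the kind of clean-up I mean concrete, here is a rough Python sketch, not a finished tool. It assumes the converter separates pages with a form feed character (`\f`), which pdftotext does by default as far as I know. The function name `strip_artefacts` and the parameters `edge_lines` and `min_share` are made up for illustration. It drops lines that are only a page number, lines that are only a URL, and lines that keep reappearing at the top or bottom of pages (likely headers and footers).

```python
import re
from collections import Counter

# Lines that are only a page number: "12", "Page 12", "12 / 340", "Page 3 of 3".
PAGE_NUMBER = re.compile(r"^(page\s+)?\d+(\s*(/|of)\s*\d+)?$", re.IGNORECASE)
# Lines that consist of nothing but a single URL.
URL_ONLY = re.compile(r"^(https?://\S+|www\.\S+)$", re.IGNORECASE)


def strip_artefacts(text, edge_lines=2, min_share=0.5):
    """Remove likely page artefacts from text converted from a PDF.

    Assumptions (hypothetical, tune for your files):
    - pages are separated by a form feed character
    - headers/footers sit within the first/last `edge_lines` non-empty
      lines of a page and repeat on at least `min_share` of all pages
    """
    pages = [p.splitlines() for p in text.split("\f") if p.strip()]

    # Count on how many pages each line shows up near the top or bottom.
    edge_counts = Counter()
    for lines in pages:
        non_empty = [line.strip() for line in lines if line.strip()]
        edges = set(non_empty[:edge_lines] + non_empty[-edge_lines:])
        edge_counts.update(edges)

    threshold = max(2, int(len(pages) * min_share))
    repeated = {line for line, n in edge_counts.items() if n >= threshold}

    cleaned = []
    for lines in pages:
        for line in lines:
            stripped = line.strip()
            if stripped in repeated:
                continue  # repeated header or footer
            if PAGE_NUMBER.match(stripped) or URL_ONLY.match(stripped):
                continue  # bare page number or bare hyperlink
            cleaned.append(line)
    return "\n".join(cleaned)


sample = (
    "Meeting transcript\nHello there.\n1\n\f"
    "Meeting transcript\nSecond page text.\nhttp://example.com/toc\n2\n\f"
    "Meeting transcript\nThird.\nPage 3 of 3\n"
)
print(strip_artefacts(sample))
```

One caveat with this approach: a genuine transcript line that is nothing but a number would also be dropped, so the patterns probably need adjusting per document set.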

