extraction - Automatic deletion of some PDF contents
I've got to extract specific data from a high number of .pdf files. The problem is that the first thing I have to do is convert each .pdf to .txt so I can find the data I'm interested in. After the conversion there's a high amount of artefacts in the .txt files (page numbers, hyperlinks from the contents page, footers, headers etc.). These .pdf files are quite huge (every single file is a transcription of 7-12 hours of people talking), so I cannot afford to delete things manually (I've got ~60 .pdf files). The question: do you know of a tool that allows automatic deletion of such contents?
I'd be glad to hear any suggestion that would improve my workflow :) Thanks!
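
To make the kind of cleanup being asked about concrete, here is a minimal Python sketch, not a finished tool. It assumes the converter separates pages with a form feed character (pdftotext does this by default), that headers and footers sit within the first or last two non-empty lines of a page, and that they repeat on at least half of the pages. The function name `clean_text` and the parameters `edge` and `min_share` are made up for this illustration.

```python
import re
from collections import Counter

# A line that is only a page number, e.g. "12", "Page 12" or "12 / 90".
PAGE_NUMBER = re.compile(r"(page\s+)?\d+(\s*(/|of)\s*\d+)?", re.IGNORECASE)
# A line that is only a hyperlink.
URL = re.compile(r"https?://\S+|www\.\S+")


def _key(line):
    # Mask digits so "Page 3 of 90" and "Page 4 of 90" compare equal.
    return re.sub(r"\d+", "#", line.strip())


def _edge_indices(lines, edge):
    # Indices of the first and last `edge` non-empty lines of a page.
    idx = [i for i, ln in enumerate(lines) if ln.strip()]
    return set(idx[:edge] + idx[-edge:])


def clean_text(text, edge=2, min_share=0.5):
    """Drop page numbers, bare URLs and repeated header/footer lines."""
    pages = [p.splitlines() for p in text.split("\f") if p.strip()]

    # Count on how many pages each edge line occurs.
    counts = Counter()
    for lines in pages:
        counts.update({_key(lines[i]) for i in _edge_indices(lines, edge)})
    threshold = max(2, int(len(pages) * min_share))

    kept = []
    for lines in pages:
        edges = _edge_indices(lines, edge)
        for i, ln in enumerate(lines):
            s = ln.strip()
            if PAGE_NUMBER.fullmatch(s) or URL.fullmatch(s):
                continue  # standalone page number or hyperlink
            if i in edges and counts[_key(ln)] >= threshold:
                continue  # header/footer repeated across pages
            kept.append(ln)
    return "\n".join(kept)
```

For ~60 files this could be run in a loop over the converted .txt files, writing each result to a new file so the originals stay untouched. The thresholds will likely need tuning per document, and a transcript line consisting only of a number would also be dropped, so it is worth spot-checking a few outputs.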