Search text in PDF files using Java (Apache Lucene and Apache PDFBox)



I recently came across a requirement to find out whether a specific word is present in a PDF file. Initially I thought this was a very simple requirement, so I created a simple Java application that first extracts the text from the PDF files and then does a linear character match, like mystring.contains(mysearchterm) == true. It did give me the expected output, but
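
The linear matching step described above can be sketched roughly as follows. This is only an illustration, not the original application: the class name NaivePdfTextSearch and the method containsTerm are hypothetical, the text is hard-coded as a stand-in for what a PDF text extractor (for example PDFBox's PDFTextStripper) would return, and ignoring case is an added assumption that the plain contains() call does not make.

```java
import java.util.Locale;

// Hypothetical helper illustrating the naive "extract, then contains()" approach.
class NaivePdfTextSearch {

    // Returns true if 'term' occurs anywhere in 'extractedText'.
    // Assumption: matching ignores case; drop the toLowerCase calls for exact matching.
    static boolean containsTerm(String extractedText, String term) {
        if (extractedText == null || term == null || term.isEmpty()) {
            return false;
        }
        String haystack = extractedText.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        // Linear scan over the whole text, repeated for every search.
        return haystack.contains(needle);
    }

    public static void main(String[] args) {
        // Stand-in for text already extracted from a PDF file.
        String extractedText = "Apache Lucene is a full-text search library.";
        System.out.println(containsTerm(extractedText, "lucene")); // prints "true"
        System.out.println(containsTerm(extractedText, "solr"));   // prints "false"
    }
}
```

Every call scans the full extracted text again, so the cost grows with both the size of the documents and the number of searches.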