Bug #3453
Exception extracting text from doc11.pdf
Start date: 07/16/2008
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 3453
Description
Calling PDFExtract.extract on doc11.pdf from the private/samples directory throws a NoSuchElementException with the following stack trace:
java.util.NoSuchElementException
at java.util.LinkedList$ListItr.next(Unknown Source)
at edu.msu.first.parser.extract.PDFExtract.stripHeaders(PDFExtract.java:931)
at edu.msu.first.parser.extract.PDFExtract.extractContent(PDFExtract.java:206)
at edu.msu.first.parser.extract.PDFExtract.extract(PDFExtract.java:136)
History
#1 Updated by Bridger Hamilton over 12 years ago
Fixed: PDFExtract.stripHeaders now verifies that another line of text exists before testing whether that line is a header.
#2 Updated by Redmine Admin almost 8 years ago
Original Bugzilla ID was 3453
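
For illustration, the guard described in update #1 can be sketched as below. This is not the actual PDFExtract code, which is not shown in this report: the class name HeaderStripSketch, the looksLikeHeader helper, and its "Page " rule are hypothetical stand-ins. The sketch only shows the general pattern of calling hasNext() before every next() on a LinkedList iterator, since an unguarded next() on an exhausted iterator is what raises java.util.NoSuchElementException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class HeaderStripSketch {

    // Hypothetical header test; the real criteria used by PDFExtract are unknown.
    static boolean looksLikeHeader(String line) {
        return line.startsWith("Page ");
    }

    // Removes leading header lines, checking hasNext() before each next().
    static List<String> stripHeaders(List<String> lines) {
        LinkedList<String> copy = new LinkedList<>(lines);
        ListIterator<String> it = copy.listIterator();
        // Without the hasNext() check, next() throws NoSuchElementException
        // when the list is empty or consists only of header lines.
        while (it.hasNext()) {
            String line = it.next();
            if (!looksLikeHeader(line)) {
                break; // first non-header line: stop stripping
            }
            it.remove(); // drop the header line just returned by next()
        }
        return copy;
    }

    public static void main(String[] args) {
        // prints [Body text]
        System.out.println(stripHeaders(Arrays.asList("Page 1", "Body text")));
        // prints [] and does not throw
        System.out.println(stripHeaders(new ArrayList<String>()));
    }
}
```

A document that yields no text lines, or only header lines, is the kind of input that would exhaust the iterator early; whether doc11.pdf falls into either case is not stated in this report.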