TextMining.org

Ryan Ackley, whose name appears in Word-related POI classes, has written a library to extract text from Word documents. According to him, textmining.org has several advantages compared to POI:

  • The textmining.org library is optimized for extracting text. POI is not.
  • The textmining.org library supports extracting text from Word 6/95. POI does not.
  • The textmining.org library does not extract deleted text that is still in the document for the purposes of revision marking. POI does not handle this.

The question now is: did he document textmining.org as well as he documented the POI classes?

Update: POI stands for Poor Obfuscation Implementation. I certainly didn’t find that on the site.

 