PDF Text OCR Xtractor uses Tesseract OCR technology. Tesseract is one of the most capable and widely used open-source OCR engines available, and here is why. First, a bit of history. It was developed at Hewlett-Packard between 1985 and 1994, and in 2005 HP released it as open source; it is now distributed under the Apache License 2.0. From 2006, Google sponsored developers to work on the project. Fast forward to today: since version 4, Tesseract has included a recognition engine based on LSTM neural networks, a form of deep learning, which it uses to extract text from images (BMP, PNG, JPEG, TIFF, etc.) and, once their pages are rendered as images, from PDF files.