I am using Python 3.x, and I am using the following code to convert an image to text:
    from PIL import Image
    from pytesseract import image_to_string

    image = Image.open('image.png', mode='r')
    print(image_to_string(image))
I am getting the following error:
    Traceback (most recent call last):
      File "c:/users/hp/desktop/gii/image_to_text.py", line 12, in <module>
        print(image_to_string(image))
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
        config=config)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
        stderr=subprocess.PIPE)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
        restore_signals, start_new_session)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
        startupinfo)
    FileNotFoundError: [WinError 2] The system cannot find the file specified
Please note that I have put the image in the same directory where Python is present. Also, it does not raise an error on image = Image.open('image.png', mode='r'); it raises it on the line print(image_to_string(image)).

Any idea what might be wrong here? Thanks.
You have to have tesseract installed and accessible in your PATH.

According to the source, pytesseract is merely a wrapper around subprocess.Popen, with the tesseract binary as the program to run. It does not perform any kind of OCR itself.
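A quick way to check the first point is to ask Python whether it can see the executable at all. This is only a minimal sketch: `find_tesseract` is a helper name made up for this example, and it assumes the binary is called `tesseract` (on Windows, `shutil.which` also tries the usual executable extensions).

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the executable `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # Same situation that leads to the FileNotFoundError in the question
        print("tesseract was not found on PATH; install it or add its folder to PATH")
    else:
        print("tesseract found at:", path)
```

If this reports that nothing was found, pytesseract will not be able to start the binary either.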
The relevant part of the sources:
    def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, config=None):
        '''
        runs the command:
            `tesseract_cmd` `input_filename` `output_filename_base`

        returns the exit status of tesseract, as well as tesseract's stderr output
        '''
        command = [tesseract_cmd, input_filename, output_filename_base]

        if lang is not None:
            command += ['-l', lang]

        if boxes:
            command += ['batch.nochop', 'makebox']

        if config:
            command += shlex.split(config)

        proc = subprocess.Popen(command, stderr=subprocess.PIPE)
        return (proc.wait(), proc.stderr.read())
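The error in the question can be reproduced without pytesseract, since it comes from subprocess.Popen being handed a program name it cannot find. The sketch below illustrates that; `can_launch` is a hypothetical helper written for this example and is not part of pytesseract.

```python
import subprocess


def can_launch(command):
    """Return True if the program in `command` could be started, False if it was not found."""
    try:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # Raised when the executable does not exist or is not on PATH
        # (reported as [WinError 2] on Windows)
        return False
    proc.wait()
    return True
```

Calling `can_launch(["tesseract", "--version"])` should return False on a machine where the binary is missing from PATH, which is the same condition that makes `image_to_string` fail.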
Quoting another part of the source:
    # CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
    tesseract_cmd = 'tesseract'
So a quick way of changing the tesseract path would be:
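    import pytesseract

    # tesseract_cmd is a global of the pytesseract.pytesseract module, so set it there;
    # this should be done only once
    pytesseract.pytesseract.tesseract_cmd = "/absolute/path/to/tesseract"
    pytesseract.image_to_string(img)  # img is a PIL image opened earlier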
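If you would rather not hard-code the path, one option is to look on PATH first and fall back to a known install location only when that fails. This is a rough sketch under stated assumptions: `resolve_tesseract_cmd` and `fallback_paths` are names invented here, and the Windows path in the comment is only an example of where an installer might put the binary, so check your own machine.

```python
import os
import shutil


def resolve_tesseract_cmd(fallback_paths=(), name="tesseract"):
    """Return a usable path to the tesseract executable, or None.

    PATH is searched first; `fallback_paths` (install locations supplied
    by the caller, hypothetical here) are tried next, in order.
    """
    found = shutil.which(name)
    if found:
        return found
    for candidate in fallback_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


# Possible usage (needs pytesseract installed; the path is an assumed example):
# import pytesseract
# cmd = resolve_tesseract_cmd([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
# if cmd is not None:
#     pytesseract.pytesseract.tesseract_cmd = cmd
```

This way the same script keeps working on machines where tesseract is already on PATH.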