Image to text in Python


I am using Python 3.x and the following code to convert an image into text:

    from PIL import Image
    from pytesseract import image_to_string

    image = Image.open('image.png', mode='r')
    print(image_to_string(image))

I am getting the following error:

    Traceback (most recent call last):
      File "c:/users/hp/desktop/gii/image_to_text.py", line 12, in <module>
        print(image_to_string(image))
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 161, in image_to_string
        config=config)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\site-packages\pytesseract\pytesseract.py", line 94, in run_tesseract
        stderr=subprocess.PIPE)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 950, in __init__
        restore_signals, start_new_session)
      File "c:\users\hp\downloads\winpython-64bit-3.5.1.2\python-3.5.1.amd64\lib\subprocess.py", line 1220, in _execute_child
        startupinfo)
    FileNotFoundError: [WinError 2] The system cannot find the file specified

Please note that I have put the image in the same directory where Python is present. Also, it does not raise an error on image = Image.open('image.png', mode='r'); it raises on the line print(image_to_string(image)).

Any idea what might be wrong here? Thanks.

You have to have tesseract installed and accessible in your PATH.
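A quick way to check the PATH requirement from Python itself is sketched below. It only uses the standard library; the helper name find_executable_on_path is made up for this illustration, and the executable is assumed to be called tesseract.

```python
import shutil


def find_executable_on_path(name="tesseract"):
    """Return the full path of `name` if it can be found on PATH, else None.

    The helper name is hypothetical; shutil.which is the standard-library
    lookup that mirrors what the shell does when resolving a command.
    """
    return shutil.which(name)


location = find_executable_on_path()
if location is None:
    print("tesseract is not on PATH: install it or add its folder to PATH")
else:
    print("tesseract found at", location)
```

If this prints the "not on PATH" message, pytesseract will most likely fail with the same FileNotFoundError as in the question.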

According to the source, pytesseract is merely a wrapper around subprocess.Popen, with the tesseract binary as the program to run. It does not perform any kind of OCR itself.
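The error in the question can be reproduced without pytesseract at all, which supports the point that the missing file is the executable and not the image. The sketch below is only an illustration: try_run is a hypothetical helper, and the command name in the first call is deliberately one that should not exist on any machine.

```python
import subprocess
import sys


def try_run(command):
    """Start `command` and report whether the executable could be found."""
    try:
        proc = subprocess.Popen(command,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except FileNotFoundError:
        # Raised by Popen itself when the program in command[0] is missing.
        return "executable not found"
    proc.communicate()  # wait for the process and drain its pipes
    return "exit status {}".format(proc.returncode)


# A command that does not exist fails the same way as a missing tesseract.
print(try_run(["no-such-binary-hypothetical-123", "--version"]))

# A command that does exist (the running Python interpreter) starts normally.
print(try_run([sys.executable, "-c", "pass"]))
```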

The relevant part of the sources:

    def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, config=None):
        '''
        runs the command:
            `tesseract_cmd` `input_filename` `output_filename_base`

        returns the exit status of tesseract, as well as tesseract's stderr output
        '''
        command = [tesseract_cmd, input_filename, output_filename_base]

        if lang is not None:
            command += ['-l', lang]

        if boxes:
            command += ['batch.nochop', 'makebox']

        if config:
            command += shlex.split(config)

        proc = subprocess.Popen(command,
                stderr=subprocess.PIPE)
        return (proc.wait(), proc.stderr.read())

Quoting another part of the source:

    # CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
    tesseract_cmd = 'tesseract'

So a quick way of changing the tesseract path would be:

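    import pytesseract

    # tesseract_cmd is defined in the inner pytesseract.pytesseract module,
    # so it has to be set there. This should be done only once.
    pytesseract.pytesseract.tesseract_cmd = "/absolute/path/to/tesseract"

    # img is the PIL image opened earlier
    pytesseract.image_to_string(img)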
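Since a wrong path produces the same FileNotFoundError as a missing PATH entry, it can help to verify the path before handing it to pytesseract. The following is a minimal sketch, not part of pytesseract: checked_tesseract_path and configure are hypothetical helpers, and the assignment assumes the setting lives on the inner pytesseract.pytesseract module, which is where the package's README places it.

```python
import os


def checked_tesseract_path(path):
    """Return `path` as an absolute path, or raise if no file exists there."""
    full = os.path.abspath(path)
    if not os.path.isfile(full):
        raise FileNotFoundError("no tesseract executable at {}".format(full))
    return full


def configure(path):
    """Point pytesseract at the executable, failing early on a bad path."""
    import pytesseract  # third-party; imported here so the check above works without it

    pytesseract.pytesseract.tesseract_cmd = checked_tesseract_path(path)
```

Calling configure("/absolute/path/to/tesseract") once at startup, before any image_to_string call, then either succeeds or fails immediately with a message that names the bad path.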
