
TECHNOLOGY
Patent Filed
"These prototypes make use of computer vision algorithms such as YOLO and VLMs, as well as a novel directional haptic feedback algorithm."
Read the Research Paper Here
Hardware
Version 3 ($20) uses an ESP32-CAM board to capture the user's surroundings with a wide 160° field of view. The board creates its own WiFi access point, and the user's device runs YOLOv11 to identify objects in under 300 ms, with GPT-4o providing more detailed analysis.
Version 2 runs on a Raspberry Pi and streams video from its dedicated camera module over the local network (LAN) using a Python Flask server. Processing is done in an app on the user's phone, connected to the same network, using the Apple Vision framework.
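One common way to stream camera video over HTTP is MJPEG, where each JPEG frame is sent as one part of a multipart response. The sketch below shows only the framing step and is an assumption about how such a server could work, not the project's actual code; `mjpeg_chunks` and the boundary name `frame` are hypothetical.

```python
from typing import Iterable, Iterator

BOUNDARY = b"frame"  # assumed boundary token; it must match the response header


def mjpeg_chunks(frames: Iterable[bytes], boundary: bytes = BOUNDARY) -> Iterator[bytes]:
    """Wrap each JPEG-encoded frame as one part of a multipart HTTP body."""
    for jpeg in frames:
        yield b"".join([
            b"--", boundary, b"\r\n",
            b"Content-Type: image/jpeg\r\n",
            b"Content-Length: ", str(len(jpeg)).encode("ascii"), b"\r\n\r\n",
            jpeg, b"\r\n",
        ])


# In a Flask view this generator would typically be returned as
#   Response(mjpeg_chunks(camera_frames()),
#            mimetype="multipart/x-mixed-replace; boundary=frame")
# where camera_frames() is a hypothetical generator yielding JPEG bytes
# read from the camera module.
```

Because the function is a generator, frames are sent as they arrive rather than buffered, which keeps memory use low on a small board.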


USER-CENTERED DESIGN
The prototypes are designed to be as ergonomic as possible while minimizing weight and remaining visually unobtrusive (i.e. reducing how far the device protrudes). This motivated the distributed computing approach, which moves processing off the device and greatly reduces its weight. Moreover, the model used has multilingual support, and the device communicates through an intuitive form of directional haptic feedback.
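The directional haptic algorithm itself is not described here, so the following is only an illustrative sketch of the general idea: turning the horizontal position of a detected object into a bearing and a left/right vibration balance. The `HapticCue` type, the `haptic_cue` function, and the linear mapping are all assumptions; the 160° figure is the field of view quoted for Version 3.

```python
from dataclasses import dataclass

FOV_DEGREES = 160.0  # horizontal field of view quoted for the Version 3 camera


@dataclass
class HapticCue:
    bearing_deg: float  # negative = left of centre, positive = right of centre
    left: float         # left motor intensity, 0.0 to 1.0
    right: float        # right motor intensity, 0.0 to 1.0


def haptic_cue(box_center_x: float, frame_width: float,
               fov_deg: float = FOV_DEGREES) -> HapticCue:
    """Map a detection's horizontal centre (pixels) to a directional cue."""
    # Normalise to [-1, 1], where 0 is the middle of the frame.
    offset = 2.0 * box_center_x / frame_width - 1.0
    offset = max(-1.0, min(1.0, offset))
    # Linear approximation of the bearing; ignores wide-angle lens distortion.
    bearing = offset * fov_deg / 2.0
    # Pan the vibration between the two motors (centre gives 0.5 on each side).
    right = (offset + 1.0) / 2.0
    return HapticCue(bearing_deg=bearing, left=1.0 - right, right=right)
```

For example, an object centred in a 640-pixel-wide frame gives equal intensity on both sides, while one at the far right edge drives only the right motor.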
OCR Software
OCR, VLMs, and YOLO - a range of Computer Vision techniques integrated for versatility

Version 3
Processing occurs on the user's phone using the new YOLOv11 model (about 300 ms processing time, with a model size of a few MB). Based on confidence levels, it can also dynamically choose between YOLO and VLMs, which allows it to provide feedback even with camera obstructions or aberrations, making it more robust.
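The confidence-based choice between YOLO and a VLM can be sketched roughly as follows. This is a minimal illustration under assumptions, not the app's implementation: the threshold value, the `Detection` tuple shape, and the `ask_vlm` callback are all hypothetical.

```python
from typing import Callable, List, Tuple

Detection = Tuple[str, float]  # (label, confidence in [0, 1])

CONFIDENCE_THRESHOLD = 0.5  # assumed cut-off; the real value is not stated


def describe_scene(detections: List[Detection],
                   ask_vlm: Callable[[], str],
                   threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Prefer fast detector output; fall back to a VLM when nothing is confident."""
    confident = [label for label, conf in detections if conf >= threshold]
    if confident:
        return ", ".join(confident)
    # No confident detection (e.g. an obstructed or blurred frame):
    # defer to the slower but more general vision-language model.
    return ask_vlm()
```

Passing the VLM as a callback means the slower model is only invoked when the detector's output falls below the threshold.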
Version 2
Processing occurs on the user's phone (currently runs on iOS) using Apple's Vision framework. Unlike traditional character-by-character OCR software, this uses neural networks to recognise text, which is then read aloud by text-to-speech software.
Version 1
Uses Tesseract, an open-source OCR engine that supports numerous languages, to detect text in a camera image. Alongside this, it uses an algorithm made specifically for formatting recognised text on a page, greatly improving accuracy. All of this runs in Python.
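The formatting algorithm is not detailed here. As a hypothetical stand-in, the sketch below shows one simple way to rebuild reading order from OCR word boxes: group words whose tops are close into a line, then sort each line left to right. The `Word` tuple and `words_to_lines` are illustrative names; word boxes of this kind can be obtained from Tesseract's per-word output (for example the `text`, `left`, `top`, and `height` fields).

```python
from typing import List, Tuple

Word = Tuple[str, int, int, int]  # (text, left, top, height) in pixels


def words_to_lines(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Group word boxes into text lines, top to bottom and left to right."""
    lines: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w[2]):  # sort by top edge
        text, _left, top, height = word
        if not text.strip():
            continue  # skip empty OCR tokens
        if lines:
            ref = lines[-1][0]  # first word of the current line
            # Same line if the tops differ by less than a fraction of the height.
            if abs(top - ref[2]) <= tolerance * max(height, ref[3]):
                lines[-1].append(word)
                continue
        lines.append([word])
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1]))
            for line in lines]
```

This handles slightly skewed photos where words on one line do not share an exact vertical position; it does not attempt multi-column layouts.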
WIFI AND DISTRIBUTED COMPUTING
In India, more than 80% of individuals, even those living in rural areas, have smartphones and internet connectivity, and this figure has continued to rise through the COVID era. This means accessibility technology can remain affordable by leveraging the computing power of modern smartphones, even lower-end models. Together with WiFi connectivity in most households, this allows the product to remain extremely low cost, at $20, while providing superior quality and comfort for the user. This form of distributed computing is a key part of what makes EyeSight so affordable while remaining practical.