QUIP-RS
A.I. For the Visually Impaired: Real-Time Object Detection with YOLOv8
Christopher Falcone ‘27 conducted this study to develop a prototype aimed at enhancing accessibility and mobility for individuals who are blind or visually impaired.
Overview
This project focuses on enhancing independence and mobility for individuals who are blind or visually impaired. Using a head-mounted Intel RealSense camera connected to a phone, the program provides a voice interface that lets users give commands and adjust detection zones to their preference. A priority algorithm relays the most relevant information first, announcing closer objects and approaching individuals before anything else.
Researcher
Christopher Falcone '27
Computer Science
School of Computing & Engineering
Problem Statement
- Individuals who are blind or visually impaired often face significant difficulties that can hinder their independence and restrict their ability to fully participate in routine activities and public life.
- While assistive tools currently exist, many fall short in effectively supporting users in dynamic, real-world environments where conditions are constantly changing.
- As a result, there is an increasing demand for more intelligent, adaptable, and responsive assistive technologies that can better address these real-life complexities.
Project Objectives
- The primary goal of this project was to improve the daily lives of visually impaired individuals by helping them better understand and navigate their surroundings.
- Our research focused on developing a prototype for a smart navigation tool that helps visually impaired users interpret and move through their environment.
- An object detection algorithm is used to scan the environment, and objects and their distance are relayed back to the user through auditory feedback.
- A depth camera is worn on the head with the help of a wearable camera mount and is plugged into the phone via a USB-C cable.
Methods Utilized
- Throughout this project, we investigated several methodologies to provide the most streamlined and accessible tool possible.
- The YOLOv8 object detection algorithm was used to detect common items and obstacles, providing the most accurate detections with the lowest latency. It was trained on roughly forty thousand images spanning thirty-one common object classes, including cars, people, dogs, and trees.
- The Intel RealSense D435i was used to accurately measure depth, and each detected item is assigned a distance in steps. The maximum distance is capped at ten steps so that only relevant objects are relayed.
- Voice Interaction was integrated to let the user talk to the system and hear what it detects:
- Speech-to-text → issue commands like "start detection"
- Text-to-speech → speak detected objects and distances
- The camera's view is divided into left, center, and right sections, allowing an item's location to be included in each callout. The center zone can be dynamically expanded with a voice command if the user desires a wider zone of detection.
- A priority algorithm determines the order in which detections are read aloud. For example, items within the center of the camera's view are read before items outside that range, and closer objects are read before those farther away.
- Approaching people take the highest priority, calculated from the person's speed, distance, and location.
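The distance-in-steps cap and the left/center/right zones described above can be sketched as small helper functions. The step length, zone boundaries, and function names here are illustrative assumptions, not values taken from the project:

```python
# Hypothetical helpers for the distance-in-steps and zone logic described
# above. STEP_LENGTH_M is an assumed average walking step, not a value
# from the project.

STEP_LENGTH_M = 0.75   # assumed step length, in meters
MAX_STEPS = 10         # objects farther than this are not announced

def distance_in_steps(depth_m: float):
    """Convert a depth reading (meters) to whole steps.
    Returns None when the object is beyond the announcement range."""
    steps = round(depth_m / STEP_LENGTH_M)
    return steps if steps <= MAX_STEPS else None

def zone(cx: float, frame_width: int, center_frac: float = 1 / 3) -> str:
    """Classify a detection's horizontal center into left/center/right.
    `center_frac` widens or narrows the center zone (voice-adjustable)."""
    half = center_frac / 2
    left_edge = frame_width * (0.5 - half)
    right_edge = frame_width * (0.5 + half)
    if cx < left_edge:
        return "left"
    if cx > right_edge:
        return "right"
    return "center"
```

A voice command to widen the detection zone could then simply increase `center_frac` at runtime.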
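The priority ordering can likewise be sketched as a scoring function. The weights and the `Detection` fields below are illustrative assumptions; the project's actual scoring may differ:

```python
# A minimal sketch of the priority ordering described above: approaching
# people first, then center-zone items, then closer objects. All weights
# are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    steps: int                   # distance in steps (closer = more urgent)
    zone: str                    # "left", "center", or "right"
    approach_speed: float = 0.0  # m/s toward the camera; > 0 means approaching

def priority(det: Detection) -> float:
    """Higher score = announced sooner."""
    score = 0.0
    # Approaching people take the highest priority, scaled by speed.
    if det.label == "person" and det.approach_speed > 0:
        score += 100 + 10 * det.approach_speed
    # Objects in the center zone outrank those to the sides.
    if det.zone == "center":
        score += 20
    # Closer objects are read before farther ones.
    score += max(0, 10 - det.steps)
    return score

def announcement_order(detections):
    """Sort detections so the highest-priority items are read first."""
    return sorted(detections, key=priority, reverse=True)
```

With this scheme an approaching person always outranks static objects, while static objects are ordered by zone and proximity.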
Comparison to Existing Solutions
Existing Smart Glasses:
- Google Glass, Ray-Ban Meta Glasses, Envision Glasses
Limitations:
- Primarily designed for Augmented Reality (AR), productivity, or entertainment.
- Lack accessibility settings for the visually impaired, or offer only limited ones.
- Depend on other people for navigation.
- Often come at a very high cost, with the Envision glasses costing over $2000.
YOLORealSense App:
- Focused fully on assisting navigation, providing detection and depth-sensing
- Runs independently, not dependent on others for navigation
- Fully built on voice interaction, enabling accessibility for visually impaired
Challenges Encountered/Reflections
- Model Training & Performance: The YOLOv8 model went through many iterations, either to correct inaccurate detections or to add additional classes. False positive detections were fairly common throughout the project's development.
- RealSense Integration: Integrating the RealSense camera with the YOLOv8 model presented its own set of challenges, requiring several modifications to properly scale the dimensions of the camera feed to match the model's input.
- User Experience: Designing intuitive voice commands, ensuring those commands were reliably heard, and creating a feedback experience that did not overwhelm the user proved difficult. Commands had to be summarized and spaced out to ensure clear understanding.
- Latency Optimization: Latency was a recurring issue throughout our project. Our initial plan included integrating Grounding DINO for specific object detection, but its high computational demands caused unacceptable latency, so it was phased out.
- Detections in Low-Light: Detection in low-light scenarios, such as dark rooms or nighttime, tended to produce little to no usable output. This could be addressed by expanding the YOLO dataset with images taken in the dark, improving the device's accessibility.
- Response Delay: Some detections, especially of approaching people, often occurred too late for the user to react. This could be addressed by extending the range at which the device detects approaching objects.
Professional Application
"This project was incredibly beneficial for my future career path, helping me to gain a better understanding of the expectations and work environment of a software developer. It gave me the opportunity to challenge myself, think creatively and grow more confident in my abilities. Working closely under the guidance of my professors not only pushed me to improve my technical skills, but also gave me valuable insight into how to approach problems logically. Their mentorship showed me the importance of asking questions, staying open to feedback, and learning from those with more experience. Overall, this experience has prepared me to continue developing both personally and professionally as I move forward in my career, and I'm truly fortunate to have undergone this project." - Christopher Falcone ’27
For Further Discussion
This serves as an overview of the project and does not include the complete work. To further discuss this project, please email Christopher Falcone.
Quinnipiac University Interdisciplinary Program for Research and Scholarship
Open to students of all majors, QUIP-RS provides up to $5,000 in funding for undergraduate students to conduct research or complete creative projects alongside faculty mentors. This intensive 8-week program enables students to develop scholarly skills while encouraging discussion about successes and shortcomings with fellows and mentors.