Chapter 10 Drishti: a generative AI-based application for gesture recognition and execution
Tathagata Bhattacharya, Harshavardhan Meka, Srikanth Ponaganti, Peddi Adithya Vardhan and Irshad Ali Mohammad
Abstract
This study explores the evolution of an inclusive educational tool, now named “Drishti,” the successor to the earlier “Dishari” project. Drishti integrates current technologies, specifically hand gesture detection and generative AI, to serve individuals with hearing and speech impairments. Traditional search engines such as Google frequently overlook the accessibility needs of these users, creating barriers to digital engagement. Drishti bridges this gap by using machine learning algorithms and computer vision to interpret hand gestures captured via web cameras, translating them into both sign language and keyword inputs for search engines such as Google and Yahoo. The updated version extends the functionality of Dishari by supporting not only alphabet inputs but also numerical inputs (0-9), a delete-button gesture, and a space-button gesture. Generative AI further complements the search process, permitting seamless query input through both text and gestures. Through an in-depth literature analysis, we survey advances in gesture recognition and the role of generative AI in improving accessibility, positioning Drishti as a significant step toward enabling people with hearing and speech impairments to engage with digital platforms more efficiently.
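As a rough illustration of the input model described in the abstract (letters, digits 0-9, a delete gesture, and a space gesture feeding a search query), the following minimal sketch folds a stream of recognized gesture labels into a query string and a search URL. The label names `DELETE` and `SPACE`, the functions `build_query` and `search_url`, and the choice of URL are assumptions made for illustration; the abstract does not describe how Drishti represents gestures or submits queries, and the gesture classifier itself is not shown.

```python
from urllib.parse import urlencode

# Hypothetical label names: the abstract does not say how gestures are labeled.
DELETE, SPACE = "DELETE", "SPACE"


def build_query(labels):
    """Fold a sequence of recognized gesture labels into a query string."""
    chars = []
    for label in labels:
        if label == DELETE:
            # The delete gesture removes the most recent character, if any.
            if chars:
                chars.pop()
        elif label == SPACE:
            chars.append(" ")
        elif len(label) == 1 and label.isalnum():
            # Single letters and digits are appended as typed characters.
            chars.append(label.lower())
        # Any other label is ignored in this sketch.
    return "".join(chars).strip()


def search_url(query, base="https://www.google.com/search"):
    """Build a search URL with the query in a `q` parameter (assumed endpoint)."""
    return f"{base}?{urlencode({'q': query})}"


# Example: the user signs S-I-G-M, deletes the M, then signs N, space, 1, 0.
labels = ["S", "I", "G", "M", DELETE, "N", SPACE, "1", "0"]
query = build_query(labels)
print(query)
print(search_url(query))
```

Keeping the query builder separate from the recognizer means the same logic could sit behind any classifier that emits one label per detected gesture.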
Chapters in this book
- Frontmatter I
- Preface V
- Contents VII
Section: Image processing
- Chapter 1 Magnetic resonance image re-parameterization on real data 1
- Chapter 2 Denoising and gradient fusion for effective edge detection for noisy color images 17
- Chapter 3 Understanding driver attention to objects for ADASs: what do drivers see? 39
- Chapter 4 Image clustering enhanced with refined image classification 59
- Chapter 5 AI-powered framework for objective scoring of product design innovation 89
Section: Computer vision
- Chapter 6 Image inpainting using GAN transformer-based model 111
- Chapter 7 Enhanced image watermarking through cross-attention and noise-invariant domain learning 127
- Chapter 8 Online melt pool monitoring using a deep transformer image processing solution 153
- Chapter 9 Implementation of deep learning techniques on thermal image classification 173
- Chapter 10 Drishti: a generative AI-based application for gesture recognition and execution 203
Section: Pattern recognition
- Chapter 11 Exploring muzzle biometrics: a deep learning framework for noninvasive cattle recognition 239
- Chapter 12 Utilizing real-world data to develop a user-independent sensor-based human activity recognition system 253
- Index 273