It is so disrespectful of me to have avoided giving Dave the ability to hear what is going on around him for so long. Dave is over 8 years old, and the ROS-enabled Dave will be 5 years old this month, yet until now I have not given him “Carl’s super power” (speech recognition).
It turns out that, thanks to my experience with Carl, it was relatively easy to install the VOSK speech recognition engine and get Dave listening for a few salutations. What is a bit harder is deciding what Dave’s responses should be, and keeping an overly simple algorithm from making him repeat himself.
20-second video demonstrating Lyrical-Dave’s new “Hello Dave” function:
(Hearing “Good Night Dave” he responds “Good Night. I’m going to sleep now” and exits the Hello Dave program for the time being.)
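The listening side can be sketched roughly as follows. This is a minimal sketch, not Dave's actual code: the phrase list, the `build_grammar` and `listen` helpers, the 16 kHz sample rate, and the model path argument are all assumptions for illustration. It relies on VOSK's `Model`, `KaldiRecognizer`, `AcceptWaveform`, and `Result` calls, and expects the caller to supply 16-bit mono PCM chunks from whatever audio capture library is in use.

```python
import json

# Hypothetical phrase list; the post does not show Dave's actual grammar.
SALUTATIONS = ["hello dave", "good morning dave", "good night dave"]


def build_grammar(phrases):
    """Return the JSON string VOSK accepts as a restricted grammar.

    "[unk]" lets the recognizer report out-of-grammar speech as unknown
    instead of forcing it onto one of the listed phrases.
    """
    return json.dumps(list(phrases) + ["[unk]"])


def listen(chunks, model_path, sample_rate=16000, phrases=SALUTATIONS):
    """Yield each recognized salutation from an iterable of PCM byte chunks."""
    # Imported here so build_grammar() can be used without VOSK installed.
    from vosk import Model, KaldiRecognizer

    recognizer = KaldiRecognizer(
        Model(model_path), sample_rate, build_grammar(phrases)
    )
    for chunk in chunks:
        # AcceptWaveform returns True once a complete utterance is decoded.
        if recognizer.AcceptWaveform(chunk):
            text = json.loads(recognizer.Result()).get("text", "")
            if text in phrases:
                yield text
```

Restricting the recognizer to a short grammar keeps it from guessing arbitrary words, which suits a robot that only needs to catch a handful of greetings.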
An interesting progression of complexity emerges the more I think and read about human-robot greetings:
Simple grammar, simple one to one response, grammar and responses in-line Python
Many to one grammar, simple responses, still all defined in-line Python
Many to one grammar, random non repetitive responses for generic greeting, single Python script incorporating phrase classification and non-repeat response generation method(s)
Make it a ROS node
Parameter file one to one phrase/simple response map, separate ROS speech recognition server and ROS parameter file based response processor, no helper methods (GitHub/voskros)
Switch to C++ and Behavior Trees
Parameter file grammar, parameter file behavior tree, ROS behavior tree server with helper methods
Above adding ROS Behavior Tree blackboard and ROS topic to blackboard methods and temporal blackboard validity methods
Above adding behaviors that call ROS actions, services, or topic publishers
Above adding “human has stopped near me” sensing and trigger/test (vision and/or LIDAR)
LLM with model-context-protocol agent with “tools” to execute robot functions
Distributed processing: too large for Raspberry Pi 5
Vision LLM to Evaluate Human In View Emotions and Intentions
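The "many to one grammar, random non repetitive responses" stage of the progression above might look something like the sketch below. Everything here is hypothetical (the `PHRASE_TO_INTENT` and `RESPONSES` tables, the `Responder` class, and the greeting replies); only the good-night reply is taken from the video description. The non-repeat rule shown is the simplest one I can think of: never give the same reply twice in a row for a given intent.

```python
import random

# Hypothetical many-to-one grammar: several heard phrases map to one intent.
PHRASE_TO_INTENT = {
    "hello dave": "greeting",
    "hi dave": "greeting",
    "good morning dave": "greeting",
    "good night dave": "good_night",
}

# Hypothetical responses; only the good-night reply comes from the post.
RESPONSES = {
    "greeting": ["Hello.", "Hi there.", "Nice to see you."],
    "good_night": ["Good Night. I'm going to sleep now"],
}


class Responder:
    """Classify a heard phrase and pick a reply that differs from the last one."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._last = {}  # intent -> most recent reply

    def classify(self, heard):
        """Return the intent for a heard phrase, or None if it is not in the grammar."""
        return PHRASE_TO_INTENT.get(heard.strip().lower())

    def respond(self, heard):
        intent = self.classify(heard)
        if intent is None:
            return None
        options = RESPONSES[intent]
        # Exclude the previous reply when an alternative exists.
        fresh = [r for r in options if r != self._last.get(intent)] or options
        choice = self._rng.choice(fresh)
        self._last[intent] = choice
        return choice
```

Moving the two tables out of the script and into a parameter file would be the natural next step toward the later stages in the list.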