Even a Pi5 will not be enough for vision

I have dabbled with giving my robots face and object recognition for nearly 5 years now.

First I tried OpenCV, processing still and video images from the PiCam on Carl’s Raspberry Pi 3B. Carl could manage about 10 frames per second doing lane recognition, and not much else.

Next I tried offloading the processing using the inexpensive HuskyLens camera/object processor. While this device was really quick at recognizing bananas, its functionality was too limited (and bug-ridden) to become an integral part of Carl’s capabilities.

When the GoPiGo OS came out, I discovered that MR had included the then-latest neural-net object recognition in the Python tutorials, which I reproduced on Carl under Raspbian For Robots. I had to upgrade Carl’s OpenCV from v3 to v4, but it worked. Again, it was great at recognizing bananas, but I really couldn’t figure out a use for it in Carl’s “life”.

When Luxonis announced their Oak-D-Lite Kickstarter campaign, I was quick to jump on the opportunity - a stereo pair of mono cameras, plus a color camera, plus a dedicated neural-net processor to run off-the-shelf and user-tuned models. I was excited.

After waiting many months for my Oak-D-Lite to arrive, I mounted it on Dave, reconfigured his power distribution system, fired up the YOLOv3 demos, and confirmed the device did everything it promised. Then it was up to me to decide what to do with it on my robots… well, that didn’t happen, because Dave was supposed to be a ROSbot, and I’ve been learning ROS ever since.

Today, while wandering YouTube, I spotted a Turtlebot4 claiming 30 frames per second of 1280x720 video, plus 3.7 ms (roughly 270 FPS) YOLOv7 object recognition on video resized to 640x384, with simultaneous 3D visual SLAM.
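Those two numbers are consistent with each other: a 3.7 ms per-frame inference time works out to about 270 frames per second, assuming the frames are processed back to back:

```python
# Per-frame YOLOv7 inference latency quoted in the demo.
latency_ms = 3.7

# Throughput in frames per second, assuming back-to-back inference.
fps = 1000.0 / latency_ms
print(f"{fps:.0f} FPS")  # -> 270 FPS
```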

Wow, now that has my attention… until, in the comments, someone asked “Can I run this on a Raspberry Pi?” and the response was “No, it takes two RTX 3090 video cards!”

(That sort of explains why the TurtleBot4/TB4 Lite, which carries an Oak-D-Pro/Oak-D-Lite device, points owners to the Luxonis GitHub examples but does not include a single example of integrating vision with the TurtleBot4 ROS 2 functionality. I asked a TB4 user what load he was seeing doing 2D SLAM: 100% and throttling on his RPi4, with no room left for vision.)