Carl Wanders Into Augmented Reality

THANK YOU @KeithW! This is exactly what Carl needed to recognize his dock from anywhere in the office. Carl can reliably spot a 3.5-inch ArUco (Augmented Reality University of Cordoba) 4x4 marker from 10 feet away in our minimal night-time illumination.

This shows Carl’s view 10 feet away from the ArUco marker at his dock:
(and the temperature jumping up almost to throttling threshold)

Feeding this video stream to OpenCV for detection “as fast as possible” really taxed Carl’s Pi3B - the load reached 5 to 6, meaning he was running at 125 to 150% of processing capacity. I added a short wait to cut the processing down to 10 frames per second, which brought the load down to 50% of processor capacity.
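
That “short wait” idea can be sketched as a little frame pacer (a hypothetical helper, not Carl’s actual code): instead of a fixed delay, sleep off whatever is left of the 1/10-second budget after each detection pass, so the loop holds close to 10 FPS even as detection time varies.

```python
import time

class FramePacer:
    """Cap a processing loop at a target frame rate by sleeping off
    whatever is left of each frame's time budget."""

    def __init__(self, target_fps=10):
        self.period = 1.0 / target_fps
        self._last = None

    def wait(self, now=None, sleep=time.sleep):
        # now/sleep parameters exist so the pacer can be tested without
        # real clocks; in normal use just call pacer.wait().
        now = time.monotonic() if now is None else now
        if self._last is not None:
            remaining = self.period - (now - self._last)
            if remaining > 0:
                sleep(remaining)
                now += remaining
        self._last = now
        return now
```

In a detect loop this is one `pacer.wait()` call per frame; a plain `time.sleep(0.1)` also works, but it stacks the sleep on top of the 20ms of detection time and lands below the target rate.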

Next I need to write the code to drive to the dock and get square to the wall.


Fantastic - glad it worked.

I wasn’t familiar with ArUco fiducial markers. Quick googling found an article suggesting they’re easier to set up with OpenCV, but a bit more computationally intensive.

I know there’s a ROS library for AprilTags that I don’t believe uses OpenCV, but if you’ve already got OpenCV running, it makes sense to use it. Sounds like you found a way to make the processing needs reasonable.


I noted that as well - but having done the PyImageSearch “Practical Python and OpenCV” course a while back, and with ArUco detection taking just three lines of code, I decided to see how it does.

I just tested on the Pi4. Detection with 2 markers in a 640x480 video frame takes 20ms, or about 50 FPS; I slow the loop down to 10 FPS to get the load down to 2.0 (50% of processor).

The fellow in the Netherlands who helped me get OpenCV 4.5.5 installed on Bullseye asked “Why Python and not C++?” I guess there would be even more performance available if I chased it.


Undoubtedly. But then you’d have to use C++. I do use it sometimes (the Arduino flavor), but don’t program enough to be proficient in it. Not that I’m proficient in Python, but it’s easier to know enough to get stuff done (and much faster to fix mistakes).


. . . For the same reason Mallory wanted to climb Mt. Everest - because it’s there.

Also because everything else is in Python.  Not to mention that any performance benefit would probably be less than the aggravation of getting two different code bases to talk to the same libraries would cause.

If it ain’t broke, don’t fix it!


There is just something about having to run make after every edit that is painful for me. I did it for years and years with Pascal, C, Ada, C++, and Java, plus “publishing to the server” in client-server JavaScript and dealing with CRUD for the database, all before I could retry “one more time and then I’ll go home”. Python just seems to flow naturally for me these days.

I think the C++, Java, and Python APIs are pretty symmetrical. It would be interesting to test the Python mutexes to see if the C++ API honors the same mutex.

My understanding is that some percentage of what we think of as Python functionality is actually a Python interface to C/C++ under the covers. I don’t know what that percentage is, but until I run into some hard limits that need optimal performance, Python is just more comfortable for me - which is sort of strange since I taught three semesters of C++ and coded in C++ for roughly ten years, but the language always seemed wordy and awkward.


I’ve sometimes chosen to build with numpy rather than use OpenCV, like my “Easy PiCam Sensor For GoPiGo3” (Light-Motion-Color sensor), and in other applications that use motion detection since the Raspberry Pi implements it in hardware but OpenCV implements it in software.

I saw some AprilTag tutorials, so it would be interesting to do a head-to-head comparison on the GoPiGo3 Pi3B+ and Pi4. The ArUco markers worked so well that I stopped investigating and am trying to add an “ArUco Sensor Behavior”, “ArUco Find Behavior”, and “ArUco Drive (to) Behavior” to my subsumption architecture demo (“cruise behavior”, “avoid behavior”, “escape behavior”, “scan (with distance sensor) behavior”).

Eventually it would be interesting to add “Line Follow Behavior” also.

In addition to “Hey Carl, Go To Sleep”, I want to be able to tell him “Hey Carl, Go To Your Dock”

(Of course if he was running ROS2 with SLAM, it would come for free…)


For terse, nothing beats assembler.

I think Python is wordy and convoluted, but that’s what most of this stuff runs on, so I deal with it.


Apparently it’s somewhat old news, but I just saw a story today that Python is About to Become 64% Faster . So performance should be even less of an issue.




Discovered that these ArUco markers can give distance and angle off-axis, but I have to create the camera calibration and distortion matrices.

Luckily someone already went down this path on the Raspberry Pi and created some great YouTube videos.

My first test with a very casual calibration reported Carl was 1075mm away for an actual distance of 1 meter, so I’m encouraged to try a more careful calibration and finish the videos.
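
For anyone following along: once the calibration and distortion matrices exist, `cv2.aruco.estimatePoseSingleMarkers(corners, marker_length, camera_matrix, dist_coeffs)` (OpenCV ≤ 4.6) returns a translation vector per marker, and the distance and angle fall out with simple trig. A sketch of that last step (the function name is mine, not from the tutorials):

```python
import math

def range_and_bearing(tvec):
    """Range and bearing from a marker translation vector [x, y, z]
    in the camera frame (x right, y down, z forward), e.g. a tvec from
    cv2.aruco.estimatePoseSingleMarkers (OpenCV <= 4.6).
    Range is in the same units as the marker length given to the pose
    estimator; bearing is in degrees, positive = marker to the right."""
    x, y, z = tvec
    rng = math.sqrt(x * x + y * y + z * z)
    bearing = math.degrees(math.atan2(x, z))
    return rng, bearing

# A marker 1 m dead ahead gives range 1.0, bearing 0.0;
# the 1075 mm reading above would come straight from this range value.
```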


Wow - sounds very promising


The angle is the most important value I need from this, and it looks to be pretty good. The DI Distance Sensor is more reliable for the distance to the dock. The two of them, combined with finding the ArUco marker, are going to allow Carl to line up quite well (I expect).

This will be when Carl first sees the marker, and needs to get aligned normal to the dock:

This will be when Carl is aligned to the dock:

And this is 30 degrees off to the left of the marker, 1 meter from the wall:


I need to convert to computing based on a snapped jpg instead of a continuous video stream. Carl doesn’t need real-time path planning. That will decrease the processor load nicely.

These are some shots from the calibration process:


Makes sense, especially as you get closer. I wonder if the camera is better at longer distances. And by better I mean it at least gets you in the ballpark vs. being out of range for the distance sensor.


Exactly. The sequence is:

  • wander till battery level suggests return to dock
  • spin around till see ArUco marker (vision)
  • turn to face marker (vision)
  • calc angle between norm from marker (dock)
  • calc right angle distance to the marker normal
  • turn to face the dock normal (90 - angle, encoders)
  • drive distance to dock normal (encoders)
  • turn 90 to face dock (vision)
  • drive till DI dist sensor shows at dock approach point
  • turn 180 to face away from dock (encoders)
  • back onto dock (encoders)
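
The geometry in the middle of that sequence (turn 90 minus the angle, drive the right-angle distance onto the dock normal, then turn 90 to face the dock) can be sketched as a tiny planner. The function and field names are hypothetical, not Carl’s code, and the turn-direction handling is a guess:

```python
import math

def dock_approach_plan(range_mm, marker_yaw_deg):
    """Given the range to the marker and the marker's yaw (the angle
    between the line of sight to the marker and the dock normal),
    compute the encoder moves that put the robot on the dock normal,
    facing the dock. Positive yaw = dock normal off to the right."""
    yaw = abs(marker_yaw_deg)
    # right-angle distance from the robot to the dock normal line
    lateral_mm = range_mm * math.sin(math.radians(yaw))
    return {
        "turn_to_normal_deg": 90.0 - yaw,  # turn toward the normal
        "drive_to_normal_mm": lateral_mm,  # drive onto the normal
        "turn_to_dock_deg": 90.0,          # turn to face the dock
        "direction": "right" if marker_yaw_deg >= 0 else "left",
    }
```

For example, seeing the marker 1 m away at 30 degrees off the normal gives a 60-degree turn, a 500 mm leg, then the 90-degree turn to face the dock.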

Video shows PiCamera calibration start to finish and pose estimation results:


Wow. Great tutorial video. Well done.