GoPiGo3 Vision: OpenCV, TensorFlow Lite, and Google Cloud

@KeithW commented about the Google AIY Vision Kit and about starting a course on OpenCV. It will be interesting to hear your thoughts after your course.

(I did the pyimagesearch.com “Practical Python and OpenCV” course and book. I learned a bunch about Python at the same time as gaining an intro to OpenCV.)

Google Cloud Vision (and their Speech Recognition and Text-to-Speech services) are so incredibly capable that it feels a little like “re-inventing the wheel” to be down in the guts of gray-scaling, blurring, thresholding, masking, etc. in OpenCV.
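
(For anyone curious: calling Cloud Vision from Python is only a handful of lines. This is a minimal sketch, assuming a Google Cloud project with the Vision API enabled and a service-account key exported in GOOGLE_APPLICATION_CREDENTIALS; the file name is just an illustration.)

```python
# Minimal Cloud Vision label-detection sketch (file name is illustrative)
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("snapshot.jpg", "rb") as f:      # any photo the robot took
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f"{label.description}: {label.score:.2f}")
```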

One thing to note for motion detection: the Raspberry Pi’s video-processing chip will beat OpenCV’s performance - the GPU’s H.264 encoder computes motion vectors as a side effect of encoding.
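
(picamera will hand those motion vectors to you directly. A minimal sketch - the thresholds are made-up, not tuned:)

```python
# GPU-side motion detection: read the H.264 encoder's motion vectors
# through picamera instead of differencing frames in OpenCV.
import numpy as np
import picamera
import picamera.array

class MotionDetector(picamera.array.PiMotionAnalysis):
    def analyse(self, a):
        # 'a' is a record array of per-macroblock vectors (fields x, y, sad)
        magnitude = np.sqrt(a['x'].astype(float) ** 2 +
                            a['y'].astype(float) ** 2)
        if (magnitude > 60).sum() > 10:    # >10 macroblocks moving fast
            print("Motion detected!")

with picamera.PiCamera(resolution=(640, 480), framerate=24) as camera:
    camera.start_recording('/dev/null', format='h264',
                           motion_output=MotionDetector(camera))
    camera.wait_recording(30)              # watch for 30 seconds
    camera.stop_recording()
```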

I was so impressed with the TensorFlow-Lite example’s object recognition performance that I am wondering how it would do for other “OpenCV features” like line detection. DI/ModRobotics included the TensorFlow-Lite example in their Jupyter lessons in GoPiGo OS, and I noticed @Cleoqc just finished a TensorFlow course. I am thinking that TensorFlow-Lite may be better suited to the GoPiGo3 than the very general-purpose (and processor-intensive) OpenCV.
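
(For reference, the inference loop for a TensorFlow-Lite detection model is short. A sketch assuming a MobileNet-SSD style detect.tflite like the one in the DI example - the file names are placeholders, and the output-tensor ordering can vary by model:)

```python
# TensorFlow-Lite object detection sketch using the small tflite_runtime
# interpreter (no full TensorFlow install needed on the Pi).
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize the camera frame to the model's input size (often 300x300)
_, height, width, _ = input_details[0]['shape']
frame = Image.open("frame.jpg").resize((width, height))
interpreter.set_tensor(input_details[0]['index'],
                       np.expand_dims(np.array(frame, dtype=np.uint8), 0))
interpreter.invoke()

# Typical SSD outputs: boxes, class ids, scores (ordering varies by model)
boxes = interpreter.get_tensor(output_details[0]['index'])[0]
classes = interpreter.get_tensor(output_details[1]['index'])[0]
scores = interpreter.get_tensor(output_details[2]['index'])[0]
for box, cls, score in zip(boxes, classes, scores):
    if score > 0.5:
        print(f"class {int(cls)} at {box} ({score:.0%})")
```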

1 Like

So, did you get the printed book(s), or did you do this entirely on-line?

Maybe you are, but you’ll know a lot more about it than if you just piggyback on Google.  Not to mention that Google has a hard dependency on an external-facing network and a Google developer account - which isn’t free.

Cool beanies!
(How did you find out?)

I feel like such a back-number now. . .  All I’m doing is trying to figure out how to dip-switch select operating systems to boot on the Pi, and get a joystick working.

Sigh. . . .

1 Like

I did just the basic e-book, which comes with downloadable code, online supplemental chapters, and a quiz to unlock the next chapter of supplemental material.

2 Likes

Exactly my point - There are lots of “Robot Architectures”:

  1. A “Cloud-Connected” GoPiGo (for speech and vision features), where you primarily code GoPiGo3 behaviors
  2. A “Distributed-Processing” GoPiGo (for sensor fusion, situation analysis, control, and behaviors), where you primarily code message interfaces for the GoPiGo3 and onboard sensors (à la @KeithW’s ROS GoPiGo3)
  3. A “Remote Control” GoPiGo, where you control GoPiGo3 effectors from keyboard or mouse
  4. A “Learn From Examples” GoPiGo, where you run the DI/ModRobotics examples
  5. A “Learn To Code” GoPiGo, where you learn coding or language skills by creating simple sensor and effector code
  6. A “Donkey Car” GoPiGo, for learning about machine learning applied to vision in an autonomous bot
  7. A “Robot Ontology and Knowledge Development” GoPiGo, for learning about RDF/OWL graph representation of a robot and its environment, with SPARQL as the query language
  8. An “Electronics Learning” GoPiGo, for learning about circuits and hardware
  9. And surely not finally, the “Autonomous Robot Life Form” I am trying to create!

It is probably hard to exceed the monthly free-trial usage, but I agree that there is an uncomfortable feeling knowing they might send you a warning that your usage is getting close to costing you real money.

For the “Cloud-Connected” GoPiGo, there is the concern of sending photos of the inside of your home, or audio of your voice, to “the cloud of questionable security” - that creeps me out.

Not making use of the cloud means learning how to perform those functions on the GoPiGo locally - which feels like “reinventing”.

In my “Dreams for Carl” is a self-learning GoPiGo3 that:

  • takes a photo of “interesting, unknown” objects in the robot’s environment
  • processes the photo with object segmentation to isolate the unknown object
  • starts an “unknown object 12345” entry in the robot’s RDF database (see the rdflib sketch after this list)
  • analyzes the object for “locally discernible features” such as shape, size, color, proximity to known objects, mobility, …?
  • asks me if “now is a good time” to help identify and classify some objects
  • puts the image on the robot’s desktop with a text entry window for dialog
  • dialogs to add to the “unknown object 123” a minimal set of RDF knowledge relations, such as identity, class, features, purpose, and utility to the bot
  • dialogs to add “how does it differ from xyz that I know about?” (just another instance, or some true difference)
  • if not just another instance, revisits the object to collect more photos
  • transfers all the photos of the object to a folder on my Mac for a “transfer learning update” to the robot’s TensorFlow-Lite object detection model
  • ADDITIONALLY, the big if:
    • search the Internet to find “potentially useful to the bot” information about the object
    • periodically review the knowledge gained from the Internet to see if it is ever used, and delete unused learning!
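
The RDF bookkeeping step might look something like this with rdflib - just a sketch; the carl# namespace and the property names are invented for illustration (a real design would reuse terms from CORA or another published ontology):

```python
# Sketch: record an "unknown object" sighting in Carl's RDF graph.
# Namespace and property names are illustrative only.
from rdflib import Graph, Literal, Namespace, RDF

CARL = Namespace("http://example.org/carl#")
g = Graph()

obj = CARL["unknown_object_12345"]               # new node for the sighting
g.add((obj, RDF.type, CARL.UnknownObject))
g.add((obj, CARL.color, Literal("red")))         # locally discernible features
g.add((obj, CARL.approxHeightCm, Literal(22)))
g.add((obj, CARL.nearKnownObject, CARL.charging_dock))

g.serialize(destination="carl_knowledge.ttl", format="turtle")
```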


I am reasonably certain all of this is possible on Carl’s Raspberry Pi 3B (excepting the TensorFlow-Lite transfer-learning model update, which would use the Radeon RX 580 graphics processor attached to my Mac Mini).

2 Likes

That’s an ambitious project.

I should look into TensorFlow Lite. You’re probably right about OpenCV. I really don’t intend to be down in the weeds, but want to learn enough to use existing models. I supported the OpenCV AI Kit (OAK) camera on Kickstarter. My thought there was that it runs the model on the camera hardware itself, thereby offloading that task from the robot’s CPU. I tend to overestimate how taxing things are on the robot (which is why I should start monitoring more closely like you do).
/K

1 Like

What’s that?

That, if nothing else, is what drives me away from any kind of cloud storage except those I explicitly pay for and have a reasonable privacy policy.

There’s nothing wrong with “reinventing”.

In the late 19th century and early to mid 20th century, American car makers were “reinventing” the automobile - or so it seemed - every thirty minutes or so.  Not to mention the Europeans.

Curtiss “reinvented” the Wright brothers’ basic biplane into something that actually looks like an airplane instead of a kite with engines.

Ships, chemical processes, machines - especially steam, gasoline, and diesel engines - you name it: all were, and are, being “reinvented” almost continuously.

Even programming languages have come a long way since Grace Hopper invented the first compiled language back in the ’50s; and like the Wright brothers, people thought she was absolutely hare-brained crazy!  Until, like the Wright brothers, they saw it work.

Maybe you can have Carl and your Mac Mini do some distributed computing where the Mac does the really heavy lifting and passes results back to Carl?
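
Something as simple as a little HTTP service might do it - the Mac runs the model, Carl just posts images. A sketch only; the host name, port, endpoint, and the run_big_model() call are all made up for illustration:

```python
# --- On the Mac (server): Flask service doing the heavy lifting ---
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/classify", methods=["POST"])
def classify():
    labels = run_big_model(request.data)   # hypothetical GPU-backed model
    return jsonify(labels=labels)

# app.run(host="0.0.0.0", port=5000)

# --- On Carl (client): post a JPEG, get labels back ---
import requests

def classify_remotely(jpeg_path):
    with open(jpeg_path, "rb") as f:
        r = requests.post("http://mac-mini.local:5000/classify", data=f.read())
    return r.json()["labels"]
```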

2 Likes

I tried the HuskyLens version of offloading vision tasks. The cost for the HuskyLens is 100 mA each and every hour - about 2.4 Ah per day - because I don’t have an I2C bus extender that can switch its power off.

The HuskyLens is pretty good, but not very customizable.

The OpenCV kit looks really good for “building a vision sensor customized to your needs”. That makes it much better than the HuskyLens.

If I could stand the OpenCV board’s power consumption, the performance boost would be very welcome.

2 Likes

Maybe it’s time to either revisit battery chemistry or invest in a larger battery pack for Carl.

You say MY 'bot is top-heavy? Carl’s center of gravity is at least 5" higher than Charlie’s!  :wink:

Seriously, you keep asking more and more of Carl, and expect it to come at a zero to minimal power cost.  Not gonna happen, my friend.

Even software comes with a power cost - it may seem de minimis at first, but, like grains of sand, pretty soon you have a really heavy weight.

IMHO, you need to revisit your power source for Carl.

1 Like

If you remember, years and years ago there was a computer guessing game:
C: “Is it a person, place, or thing?”
H: “Thing”
Knowledge Result: (unknown, IS_A, THING)

C: “Does it have wheels?”
H: “Yes”
Knowledge Result: (unknown, HAS, WHEEL)

C: “What Color Is It?”
H: “Multi-Colored”
Knowledge Result: (unknown, PROPERTY, COLOR)
Knowledge Result: (unknown, property(COLOR), MULTIPLE)

C: “Is it a multi-colored ROBOT?”
H: “Yes”

C: “I guessed ROBOT because CAR usually has only one COLOR”

An ontology is a vocabulary of concepts and their relationships; in Carl’s case:

  • Ontology is the branch of “robot philosophy” that studies concepts such as existence, being, becoming, and reality.
  • It includes the questions of how entities are grouped into basic categories and which of these entities exist on the most fundamental level.
  • IEEE has two proposed standard robot ontologies:
    • CORA: Core Ontology for Robotics and Automation (IEEE 1872)
    • AuR: Autonomous Robotics (IEEE 1872.2)

In order for robots to understand their environment, or to dialog with humans, they need a knowledge base and processing techniques to query the knowledge, use the knowledge, and communicate the knowledge.

After a robot has knowledge and these abilities, it needs the “meta-level” ability:

  • to assess the utility of each knowledge item,
  • to vary the trust level of each concept,
  • to see gaps in its knowledge and set goals to acquire new knowledge to fill those gaps
  • to use known learning techniques to meet learning goals
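
The “see gaps in its knowledge” step is where SPARQL earns its keep. A sketch with rdflib - the carl# namespace and the purpose property are invented for illustration:

```python
# Sketch: generate learning goals by querying for objects that still
# lack a "purpose" relation. Names are illustrative only.
from rdflib import Graph

g = Graph()
g.parse("carl_knowledge.ttl", format="turtle")

gaps = g.query("""
    PREFIX carl: <http://example.org/carl#>
    SELECT ?obj WHERE {
        ?obj a carl:UnknownObject .
        FILTER NOT EXISTS { ?obj carl:purpose ?p }
    }
""")
for row in gaps:
    print(f"learning goal: find the purpose of {row.obj}")
```
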
2 Likes

No, it must come efficiently. Having an “always on” sensor that is only needed a few seconds a day is not efficient.

Making the battery bigger so Carl can waste 100 mA all day is not my preferred solution.

What Carl needs most is software, not bigger battery, not faster CPU, and not smarter sensors.

I used to explain to co-workers “To be successful in business, you only need to be 10% efficient with a 10% technology solution”.

As a “tolerant, secret perfectionist” I do not demand of others what I seek for myself (and my robot).

2 Likes

In general, I entirely agree: efficiency is something that is designed in, not bolted on later.

However, when I am designing or modifying something, I follow a methodology something like this:

  1. What exactly is the problem I wish to solve?
    If the problem is not adequately defined, the solution will be sloppy, a waste of time and money, or a fuster-cluck from the beginning.

  2. What appears to be the best solution?
    This is understanding that any solution will be a trade-off between time and resources, money, power, software skills, and other things.
    a. Is the solution reasonable?  (i.e. Does it cost like a battleship?  Would it require a car battery on a little cart to power it?  Would I have to spend three years in college to understand it, or program it?)
    b. If the solution is reasonable, (not outright lunacy), is it practical to implement?  Are the costs and potential pain-points outweighed by the end-product benefits?  Or, would it require a twenty-armed monkey to run it?
     
    In a related case, I was investigating several possible solutions to the problem of multi-booting the Raspberry Pi.  After eliminating all the potential solutions that were clearly not reasonable, I was left with a choice of creating a layered operating system structure or using an automated process like PINN.
     
    * The layered approach works by creating multiple partitions and then rsync’ing the images to the partitions.  Though it sounds easy, it’s actually a non-trivial process.  Additionally, it’s not reasonably easy to make backups of individual operating systems.
     
    * PINN has a huge up-front cost in repackaging the individual operating systems in a PINN installable format.  Most of the difficulties I had were because, (according to PINN’s maintainer), I was skirting edge-cases and asking things of PINN that - theoretically - it is supposed to be able to do.  However, these edge-cases hadn’t been tested well, and the documentation for them was hideous.  The project maintainer admitted as much, thanked me for helping him, and offered me yet another aspect of PINN that needed to be beaten senseless.
     
    However, once the packaging was done, using, re-using, backing up, and restoring individual images becomes a trivial process.  This is something that would NOT be true for the layered approach.
     
    * The layered approach would have a trivially simple O/S switching mechanism in the primary boot partition, by adding conditional statements to the config.txt file.  (A sketch of what that might look like appears after this list.)
    * PINN requires more advanced scripting, but the maintainer is keen on my project and is providing a lot of help - especially with the more esoteric aspects like hardware boot-selection.  Since the boot-selection process is scripted, there is the potential for more advanced techniques.  For example, using three GPIO pins allows me to select one of four operating systems based on which, if any, of the pins are pulled low.  Scripting might provide the opportunity to treat the three pins as a binary number, allowing a one-of-eight selection.
     
    Though it was a close call, I have ultimately decided on PINN, since, (IMHO), the advantages appear to outweigh the disadvantages.

  3. If the solution is both reasonable and practical, how expensive will it be in battery power, time to implement, recurring costs, (time, money, software, hardware, etc.)
    a. This was a part of the trade-offs between PINN and the more traditional multi-boot approach.  PINN’s up-front cost is considerably higher, but the follow-on costs in time and effort are absolutely trivial.  Even adding, removing, re-arranging, or otherwise modifying the structure of the filesystems on the storage device is trivial compared to a “hard-coded” approach like the first one.
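
For what it’s worth, the config.txt flavor of dip-switch selection I mentioned above would look roughly like this - an illustrative fragment only; the GPIO numbers, directories, and cmdline files are assumptions, not a tested recipe:

```
# Switch 1 closed (GPIO17 pulled low): boot O/S #2
[gpio17=0]
os_prefix=os2/
cmdline=cmdline2.txt

# Switch 2 closed: boot O/S #3
[gpio27=0]
os_prefix=os3/
cmdline=cmdline3.txt

# No switch closed: fall through to the default kernel and cmdline
[all]
```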

By the time I get this far, there’s either a clear winner, or the two choices are so similar that it’s worthwhile to try both of them and see which one wins in the real world.

So, in my opinion, the “best” solution may involve a tradeoff: spending battery or CPU power for a better, cleaner, more “efficient” solution to the problem, taking into account not only present cost, but future cost in time and other precious resources.

I don’t want to say “efficiency is in the eyes of the beholder”, but it kind-of is.  You may consider power consumption to be the most precious and therefore the primary limiting consideration.  On the other hand, I may be willing to redesign - or even replace - the power source if it makes other important aspects that much simpler.

What say ye?

2 Likes

Coupling, cohesion, and don’t re-solve problems until they are the problem. That’s what say me.

2 Likes

Great thread. The battery issue is a real one. Just like I overestimate CPU use, I tend to forget about this. Finmark (my GoPiGo) is really training wheels for a larger robot, where I anticipate having a much larger battery, so proportionately something like the OAK camera won’t be using much of the juice. For Finmark I don’t mind offloading the more complicated processing to a networked laptop, so everything doesn’t have to run on the robot itself - one of the advantages of running ROS.
/K

2 Likes

“Don’t cross a bridge until you come to it.”

(Russian saying)
“Don’t trouble trouble, until trouble troubles you.”  (Say that three times fast!)

In my own case I know that at least some people think I’m “tilting at windmills” or, (more diplomatically said), investing time and resources into solving a problem that isn’t a problem yet.

Maybe, even possibly true.

In my case, the primary constraining resource is my own time.  With a wife, two granddaughters, several doctors, and God only knows what else vying for my time on a daily - often even hourly - basis, everything else is constrained by the small slivers of time I have to invest.

Because time is the primary constraint, anything I can do to make my time spent on Charlie more productive is a bonus, and I sacrifice other potential constraints in favor of time.

In addition I have a mental roadmap of future projects, each of which involves using a different operating system for a different project in a different way.  Likewise, I work most efficiently if I have several pots, (projects), on the stove in front of me.  This way I can work on something until I either finish it, get sick of it, or I reach a roadblock that stops me.  I then switch to the next project and continue until it’s done, stopped, or the angst meter is up against the pin.  Then I rotate to the third project, and so on.

This “rotation” may happen several times a day, depending on what I’m doing.  (Interesting!  I wonder what happens if I try that in R4R, or on the 64-bit Raspbian spin?)

The result is that I swap SD cards a LOT - which is not good for the 'bot, nor for the SD card.  Seeing as I’ve already broken several SD cards simply by trying to remove/insert them, my goal is to reduce and/or eliminate SD card swaps.  And it’s not just the mechanical risk; card swaps take time.  With my shaky hands, (and limited mobility post-COVID), card swaps are a non-trivial process, taking up an inordinate amount of my time - up to 15 or 20 minutes per swap - unless I get really lucky.

As a consequence, I went out and invested some non-trivial money in two small USB external 500-gig SSD drives.  The goal is to install an entire library of target operating systems on one of them; the other is so that I can try more than one method at a time, or “save the game” if I’m planning on doing something potentially stupid.  With these, I can freely move from one O/S to another in the time it takes to reboot, without risk of breaking, or losing, a tiny SD card.

Additionally, the SSDs, even in a USB-2 slot, are considerably faster than even a very good, expensive, high-end SD card, so reboot time, (and potential I/O bottlenecks while working), is reduced to its absolute minimum.  With this, the “cost” of switching context is minimized.

Since I am also researching the benefits of “local” development on the 'bot, by binding Visual Studio Code, (on my laptop), directly to the 'bot itself, I have to install a special client/server application on the 'bot for VSC to connect with.  (And we all know how quick, small, and efficient Microsoft code is.  :wink:)

All of these things consume storage, I/O, and processing time.  By using large partitions on very fast media, using the fastest Pi I can get, these “costs” are minimized, making the small slivers of time I have to invest in this more efficient and productive.

Is this really a “problem” NOW?  Is it absolutely mandatory that I solve this “problem” NOW?

No.  I could continue working even if I don’t do any of these things.

However, (IMHO), it’s like buying an air-conditioned car when it’s -20 outside.  I’m going to need it anyway, and it’s cheaper to buy the car now when dealers are trying to make their end-of-year sales numbers, (and my projects are still in relatively early stages), rather than waiting until my projects are in full bloom and mucking around with partitions and installations could cause irretrievable damage.  Or, at the very least, be a serious pain in the tush!

1 Like

Oooh I got my OAK camera today! Looks like a nice piece of hardware!

3 Likes

Pictures, or it didn’t happen!

It is - the metal case/heat sink really seals the deal.
/K

2 Likes

Carl has his eye on the OAK-D, which also appears to have a nice case/heatsink.

I’ll be interested in your experience.

  • Do you know how long your board takes to boot up yet?
  • Does it have a “low power sleep mode”?

The vision-with-depth capability seems like it could replace lidar.
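
From the published depthai examples, reading a depth frame looks roughly like the sketch below (untested by me; node and queue names follow the depthai 2.x Python examples):

```python
# Sketch: read one depth frame from an OAK-D and report the distance
# straight ahead (depth frames are uint16 millimeters).
import depthai as dai

pipeline = dai.Pipeline()
left = pipeline.create(dai.node.MonoCamera)
right = pipeline.create(dai.node.MonoCamera)
stereo = pipeline.create(dai.node.StereoDepth)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("depth")

left.setBoardSocket(dai.CameraBoardSocket.LEFT)
right.setBoardSocket(dai.CameraBoardSocket.RIGHT)
left.out.link(stereo.left)
right.out.link(stereo.right)
stereo.depth.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="depth", maxSize=4, blocking=False)
    frame = q.get().getFrame()
    center = frame[frame.shape[0] // 2, frame.shape[1] // 2]
    print(f"distance straight ahead: {center} mm")
```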

2 Likes

Nothing happened yet. I just received it :slight_smile:

3 Likes

I still want to see pictures of it - it sounds interesting!