Wanted: A good SPI mutex

jimrh · June 27, 2024, 9:15am

I have decided that the only apparent way to prevent SPI bus collisions is to create an SPI mutex.

I need to know:

Early on in the boot process various things start that affect the controller, like setting LEDs, etc. I need to know where these interactions come from.
Do ALL the GoPiGo SPI requests go through the GoPiGo libraries? Or is there other code that uses the GoPiGo’s SPI bus?
Where’s a good world-writable location for a mutex lock file?

cyclicalobsessive · June 27, 2024, 1:36pm

deleted - off track idea

jimrh · June 27, 2024, 2:34pm

Thanks!

Pardon my stupidity, but would you please explain this to me with examples on how to use it?

Does the Dexter software already use, and respect, this mutex? If not, how do I activate it?
I notice that it appears to be an I²C mutex, at least by name. Is this also used by the SPI code? Or are you suggesting that I could use this to create a modified class using a different mutex lock file for SPI communication? Is the lockfile directory world-writable, or do I have to become root?
What would be the best way to use this in the Waveshare code, (or in any other code I might wish to use?)
Since I am not as skilled with classes and instances as you are, (to me they are “magic”. I wave my hands and say the magic words and Shazam! It works!), could you explain what all this code means and how it works so I can understand it better?

Thanks and my apologies for my stupidity.

jimrh · June 27, 2024, 3:07pm

I was originally thinking that programmatically monitoring the other CS pin on the Raspberry Pi might be a good solution; but a mutex might be a better, more generalized, solution as that will work with any SPI configuration. Actually, it might not be effective in a 3-wire mode, but that’s another problem.

If I rename it to something like “SPI_mutex”, it might be easier to understand what’s happening and more portable to other people’s code.

Also, if I rename it, I might be able to offer it to the Waveshare people as a way to create a cooperative mutex that others can use.

cyclicalobsessive · June 27, 2024, 3:27pm

(post deleted not helpful to issue)

jimrh · June 27, 2024, 3:28pm

Then, can you tell me where it is used so I can copy how they use it?

cyclicalobsessive · June 27, 2024, 4:27pm

deleted - off track idea

jimrh · June 27, 2024, 4:31pm

cyclicalobsessive:

2024-06-27 12:15:46.040522 | 2nd_SPI_user : I got the mutex        <<< ??? how if spi_user.py has the mutex ???
2024-06-27 12:15:47.041749 | 2nd_SPI_user : I released the mutex

Look at the timestamps - the second user’s lock was taken at a different time than the first. In fact, the second user appears to have grabbed the lock first.

What should happen is the first user grabs and releases the lock and the second user spins on the lock until it’s released. Or vice versa.

jimrh · June 27, 2024, 4:34pm

Thank you for the examples!

Classes and their invocation are still a mystery to me.

P.S.

Why the extremely short wait time? Wouldn’t it be better to make it something like 0.1?

cyclicalobsessive · June 27, 2024, 4:43pm

deleted - off track ideas

jimrh · June 27, 2024, 4:49pm

Actually, it seems that the 2nd spi user got the lock first.

If you look at the timestamps, the 2nd spi user grabbed it before the 1st user and released it prior to the first user getting it.

Another refinement - there’s no watchdog in the mutex. If process “X” grabs the lock and then abends, the lock will never get released and any other process that depends on the mutex will spin forever. Implementing a watchdog is an interesting TODO.

Question about classes:
If I understand correctly, a class instance method doesn’t continue to run in the background, but rather simply does something, returns a result (if needed), and then ends, right?

Ideally then, a watchdog should be implemented outside the class by the calling function:
(i.e. If I’ve been waiting for more than “X” amount of time, abend with an exception, possibly sudoing and clearing the lock file too.)

Does that make sense to you?
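A minimal sketch of that caller-side watchdog, under some loud assumptions: it uses the standard library's `fcntl.flock` rather than whatever the mutex class discussed above does internally, and `LockTimeout`, `acquire_with_timeout` and the `SPI_lock_watchdog_demo` file name are all made up for illustration. The caller polls every 0.1 seconds and gives up with an exception after a deadline.

```python
import fcntl
import os
import tempfile
import time

# Hypothetical lock file name, placed in the system temp directory.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "SPI_lock_watchdog_demo")


class LockTimeout(Exception):
    """Raised when the lock could not be obtained before the deadline."""


def acquire_with_timeout(lock_file, timeout=5.0, poll=0.1):
    """Try to take an exclusive flock on lock_file, giving up after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except BlockingIOError:
            # Someone else holds the lock: check the watchdog, then retry.
            if time.monotonic() >= deadline:
                raise LockTimeout(f"no lock after {timeout} seconds")
            time.sleep(poll)


if __name__ == "__main__":
    with open(LOCK_PATH, "w") as lock_file:
        acquire_with_timeout(lock_file, timeout=2.0)
        print("I have the lock")
        fcntl.flock(lock_file, fcntl.LOCK_UN)
```

One property of `flock` worth knowing here: the kernel drops the lock when the holding process exits or closes the file, so a process that abends does not leave the lock held. A scheme where the mere existence of a lock file means "locked" does not have that property and would need the manual cleanup described above.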

jimrh · June 27, 2024, 5:04pm

One other question:

Implementing a spin-lock on the Waveshare side will be relatively easy as there are only one or two places where SPI calls are actually done. However, I am not sure where the SPI calls are done on the Dexter side. Maybe in the base gopigo class?
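If the SPI calls really are concentrated in one or two functions, one way to guard them might be a decorator. This is only a sketch: `with_spi_lock`, `spi_writebytes` and the lock file name are hypothetical stand-ins, not names taken from the Waveshare or Dexter code, and it uses `fcntl.flock` from the standard library as the locking mechanism.

```python
import fcntl
import functools
import os
import tempfile

# Hypothetical lock file shared by every cooperating program.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "SPI_lock_wrapper_demo")


def with_spi_lock(func):
    """Run func only while holding an exclusive lock on LOCK_PATH."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with open(LOCK_PATH, "w") as lock_file:
            fcntl.flock(lock_file, fcntl.LOCK_EX)   # waits until the lock is free
            try:
                return func(*args, **kwargs)
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)
    return wrapper


@with_spi_lock
def spi_writebytes(data):
    """Stand-in for the real low-level SPI write; just reports the byte count."""
    return len(data)


if __name__ == "__main__":
    print(spi_writebytes([0x12, 0x34, 0x56]))
```

Locking each individual write keeps the wrapper small, but a multi-step display update would take and release the lock many times. Wrapping the whole update instead is a design choice that depends on how long the other side can afford to wait.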

Also, I’ve noticed that many of these classes, like easygopigo, appear to be in several different places at the same time - and the question of which one is being used isn’t easy to determine (at least I don’t remember how). I think, vaguely recalling something from the past, there’s an obscure .egg file somewhere that contains the classes and methods being used. How would I go about modifying an egg file to include my changes? I know I can make my own personal copy and modify that, but it’s not very robust and gets lost every time I re-create a new image for a new project.
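On the "which copy is actually being used" part, Python itself can report where an import would be loaded from. A small sketch using the standard `importlib`; the helper name `module_location` is made up, and `easygopigo3` is only my assumption about the module name on this image.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level 'import name' would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # "easygopigo3" is an assumed module name; substitute whatever you import.
    for name in ("json", "easygopigo3"):
        print(name, "->", module_location(name))
```

Run it with the same interpreter and from the same directory as the robot code, since the search path (`sys.path`) decides which copy wins.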

What say ye?

P.S.

Which is convenient since I can modify the library files directly to include a third library, write wrappers, and it will automatically work for any Waveshare display.

jimrh · June 27, 2024, 5:20pm

In the case of the display and the robot, AFAIK, the user is always “pi” - or does this mean you can create virtual users that hold the mutex programmatically?

i.e. the Dexter side could be “dexter_user” and the Waveshare side could be “waveshare_user”, both accessing the same mutex, SPI_lock?

cyclicalobsessive · June 27, 2024, 5:21pm

deleted - off track questioning

jimrh · June 27, 2024, 5:25pm

God no! Or if I am, I don’t know about it.

As far as I know, there are only two SPI sources on my robot, (and hardware wise, that’s the maximum), the native GPG controller and the external display.

Also, as far as I know, neither runs more than one process and I suspect that the dexter code either serializes commands generated so they don’t collide, (i.e. user and system created communication with the GPG controller), or has a mutex it already uses.

In my case, the two devices, (GPG controller and the Waveshare display) are two asynchronous things running at the same time, hence the collisions and the need for a mutex to arbitrate between them.

cyclicalobsessive · June 27, 2024, 5:31pm

deleted - wrong info - didn’t understand

cyclicalobsessive · June 27, 2024, 5:39pm

(deleted by author - that cannot delete for 23 more hours)

jimrh · June 27, 2024, 5:44pm

“Now I’m REALLY confused!”
George Washington Slept Here

As far as I know, the mutex should work like this:

. . . .and the process repeats itself. User “B” should not be able to get the lock while “A” has it and vice versa.

Maybe the status messages should be “I grabbed the lock”, “I have the lock”, “I’ve released the lock” and “I’m waiting for the lock”

Your confusion, (or at least the apparent confusion in your mutex state list), might be caused by confusing messaging?
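Here is a rough sketch of the behavior I expect, using the status messages suggested above. Assumptions: it uses `fcntl.flock` from the standard library (not the mutex code discussed in this thread), the two "users" are threads inside one demo program rather than two separate programs, and the names `spi_user`, `run_demo` and the lock file name are invented for the example.

```python
import fcntl
import os
import tempfile
import threading
import time

# Hypothetical lock file name in the system temp directory.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "SPI_lock_demo")
events = []  # (user, message) tuples in the order they happened


def spi_user(name, hold_seconds):
    """Wait for the lock, hold it for a while, then release it."""
    with open(LOCK_PATH, "w") as lock_file:       # each user opens its own handle
        events.append((name, "I'm waiting for the lock"))
        fcntl.flock(lock_file, fcntl.LOCK_EX)     # blocks while the other user holds it
        events.append((name, "I grabbed the lock"))
        time.sleep(hold_seconds)                  # stand-in for the SPI transfer
        # Logged just before unlocking so the other user's "grabbed" cannot come first.
        events.append((name, "I've released the lock"))
        fcntl.flock(lock_file, fcntl.LOCK_UN)


def run_demo():
    """Start users A and B together and return the recorded event list."""
    events.clear()
    users = [threading.Thread(target=spi_user, args=(n, 0.2)) for n in ("A", "B")]
    for user in users:
        user.start()
    for user in users:
        user.join()
    return list(events)


if __name__ == "__main__":
    for name, message in run_demo():
        print(f"{name}: {message}")
```

Whichever user wins the race, the "grabbed" and "released" messages of one user should never be interleaved with those of the other.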

jimrh · June 27, 2024, 5:55pm

. . . or unsynchronized threads that don’t use the same I/O path.

In this case there are two, distinct, SPI I/O handlers:

The Dexter SPI handler on CS-1
The Waveshare SPI handler on CS-0

Neither handler cares, or even knows, if the other is using the bus, therefore they can collide. At least at the present time, this is the suspicion that I have and am investigating.

I am assuming this because it appears that the Waveshare and Dexter devices assert their CS lines while the other is working, (as far as I can tell with my el-cheapo 'scope.)

jimrh · June 28, 2024, 6:43am

There are two “things” (threads, processes, programs?) that can run at the same time:

The GoPiGo robot’s continuing processes that run in the background to allow the robot to function.
The Waveshare e-Paper display demo program which runs on demand, (manually via Thonny’s “RUN” button), for test purposes.

The robot’s SPI communication happens when either:

It is commanded to do something, like “turn on an eye” using the web interface at localhost. At that point an immediate SPI message is sent.
Every couple of seconds it sends a brief message by itself without an explicitly commanded action. I don’t know what the message is, but I am assuming that it is some kind of “status request” or “keep-alive” message.

The Waveshare’s SPI communication only happens when commanded to communicate with the e-Paper display. However, these communications can take several seconds to complete because the display needs to be woken up, initialized, cleared, sent a message, and then put back to sleep after the message is displayed.

Because of this, the messages can, and do, overlap.

Note that I do not have adequate instrumentation to determine if an actual collision is taking place.

However, because the messages appear to be happening at the same time and afterwards the robot goes brain-dead, I am assuming a collision.

I should do more research, but since:

There is no interprocess communication that I can see:
- There is no central, kernel-driven or system-controlled SPI communication routine that serializes and arbitrates communication requests, in the way that storage access is managed by the kernel.
- The Waveshare display does its own SPI communication by manually sending bytes to the SPI pins.
- There is no Waveshare mutex (or other method I can see) that detects if something else is happening.^[1]
- I vaguely remember that the GoPiGo robot also manually sends bytes to the SPI pins.

I am forced to conclude that there is a high probability of SPI bus contention and resultant collisions.

Does this make sense now?

I suspect that /run/lock or /var/lock requires root access. In fact, I generally assume that anything outside of the user’s home folder requires root access.
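Rather than assume, this can be checked from the account that will run the code. A small sketch using only the standard library; the helper name `writable_dirs` and the list of candidate directories are my own choices for illustration.

```python
import os
import tempfile


def writable_dirs(candidates=("/run/lock", "/var/lock", tempfile.gettempdir())):
    """Return the candidate directories the current user can create files in."""
    return [
        d for d in candidates
        # Creating a file needs both write and search (execute) permission.
        if os.path.isdir(d) and os.access(d, os.W_OK | os.X_OK)
    ]


if __name__ == "__main__":
    print(writable_dirs())
```

Run it as the "pi" user without sudo; any directory it prints is a possible home for the lock file without becoming root.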

====================

I had originally thought I could avoid collisions by having the Waveshare display monitor the other chip-select pin. However, I soon remembered that the robot can still transmit during a Waveshare communication interval, therefore I decided that a mutually respected mutex would be the best solution.