Stackable? Daisy chaining? dev tree compatible to Jessie? [C/C++]

aahm, yes, I re-edited, the forumā€™s editor does not show the code correctly formatted.

My question is about processing alternating GPIO reading and writing by Python vs. C.
You said, Python would be very fast and both C and Python are not real-time capable.

My test code always shows 120-130ns up to max. 180ns under strong cpu stress
(for the high priority thread and almost the same for the low priority thread,
reading and writing GPIOs one after another,
both for single and for multi-threading (2 parallel threads),
plus parallel loop in main(),
when simultaneously WiFi, internet browsing, plus simultaneous video streaming in Firefox up to 180ns,
but regardless if counts==25000, 100000, or 1000000.)

My question was, just out of curiosity, how python would perform by the same test.

To me the results for C seem to be very reliably real-time capable by those results always <= 180ns even while multithreading and even when simultaneously multi-tasking.

Communicating with hardware (through the kernel to toggle GPIO, or through an external interface e.g. with an Arduino) has driver layer overhead and bus speed limits, so itā€™s not a good way to determine processing speed.

Anyhow, this program toggles the GPIO at about 135kHz, and without the reads between the writes, it toggles the GPIO at about 255kHz:

import wiringpi

wiringpi.wiringPiSetup()

wiringpi.pinMode(15, 1)
wiringpi.pinMode(16, 0)

while True:
    wiringpi.digitalWrite(15, 1)
    wiringpi.digitalRead(16)
    wiringpi.digitalWrite(15, 0)
    wiringpi.digitalRead(16)

Basically unless youā€™re implementing bit-bang communication, itā€™s more than fast enough for using GPIO. And again, GPIO read/write is a very poor indication of performance, since it depends on so many factors.

Even though at runtime Python can never be as fast as C, I maintain that Python is fast enough for everything but the most demanding BrickPi3 applications. Even then, I canā€™t think of anything where Python would be too slow. The simplicity and fast development time for writing applications in Python is, in my opinion, typically worth the trade-off for slightly slower run-time performance. A lot of applications (e.g. the BalanceBot) donā€™t need to run loops at more than e.g. 120Hz, and they do well to actually limit the execution speed so as to not put too much processing load on the system (keep the CPU at maybe 25% instead of 100%) since running faster doesnā€™t offer an advantage. Most robots shouldnā€™t need to update faster than 120Hz because physically they canā€™t have moved/changed enough that higher loop speeds would be required.

On a multi-core system (like a quad-core RPi), by setting the process priority high, you might be able to get real-time performance, but itā€™s still not guaranteed.

ok, your test is a different one, I am running GPIO r/w under multithreading and multitasking conditions.
Single pin read/write by wiringPi or pigpio is even much faster by C
(pigpio: <10 ns ! thatā€™s >= 100 MHz ! almost 1000x quicker than Python!).

But my test is running GPIO r/w under high cpu load by multiple threads and tasks running simultaneously, and even then itā€™s at <= 180ns for r/w 4 different GPIOs simultaneously plus parallel tasks, which is 10-20MHz for 2 parallel GPIO r/w threads, so still 100x faster than Python by single-toggling.
Of course, ā€œreal-timeā€ always depends on the current application. But IMO 180ns even at high cpu load is extremely good for real-time capability.

edit:
can you port my C code 1:1 to Python?
I am really curious how quick Python would perform under multithreading and multitasking conditions.

edit2, just tested:
by 5 parallel tasks even read/write of 10 different GPIOs is possible at exactly the same time:

125ns for toggling all the 10 GPIOs, so still 100% real-time capable!

//=====================================================================================
/*     Program GPIO test
 *     Raspberry Pi 2
 *     ver 003       
 * 
 */




#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <stdint.h>
#include <sys/time.h>
#include <errno.h>
#include <pthread.h>

#include <wiringPi.h>
#include <wiringSerial.h>
#include <wiringPiI2C.h>
#include <softPwm.h>




//*************************************************************
// tasks
//*************************************************************

unsigned long itime0, xtime0, itime1, xtime1, itime2, xtime2, itime3, xtime3, itime4, xtime4; 
int s0=1, s1=1, s2=1, s3=1, s4=1;
int counts = 1000000;

void* thread0 (void* ) {       //  
	volatile int ival;
	
	itime0=millis();
    for (int i=0; i<counts; i++) {
       ival=digitalRead(5);
       digitalWrite(20, HIGH);
       ival=digitalRead(5);
       digitalWrite(20, LOW);
    }    
    xtime0=millis();
    s0=0;
    return NULL;
}


void* thread1 (void* ) {       //  
	volatile int ival;
	
	itime1=millis();
	 
    for (int i=0; i<counts; i++) {
       ival=digitalRead(6);
       digitalWrite(21, HIGH);
       ival=digitalRead(6);
       digitalWrite(21, LOW);
    }    
     
    xtime1=millis();
    s1=0;
    return NULL;
}

void* thread2 (void* ) {       //  
	volatile int ival;
	
	itime2=millis();
    for (int i=0; i<counts; i++) {
       ival=digitalRead(17);
       digitalWrite(22, HIGH);
       ival=digitalRead(17);
       digitalWrite(22, LOW);
    }    
    xtime2=millis();
    s2=0;
    return NULL;
}


void* thread3 (void* ) {       //  
	volatile int ival;
	
	itime3=millis();
    for (int i=0; i<counts; i++) {
       ival=digitalRead(18);
       digitalWrite(23, HIGH);
       ival=digitalRead(18);
       digitalWrite(23, LOW);
    }    
    xtime3=millis();
    s3=0;
    return NULL;
}


void* thread4 (void* ) {       //  
	volatile int ival;
	
	itime4=millis();
    for (int i=0; i<counts; i++) {
       ival=digitalRead(19);
       digitalWrite(24, HIGH);
       ival=digitalRead(19);
       digitalWrite(24, LOW);
    }    
    xtime4=millis();
    s4=0;
    return NULL;
}





//
void setupwiringPi() {
  
    int iores;
    putenv ("WIRINGPI_GPIOMEM=1");                 // no sudo for gpios required
    iores = wiringPiSetupGpio();                   // init by GPIO pin numbering
    if( iores == -1 ) exit(1);     
    
    pinMode(  5, INPUT); pullUpDnControl( 5, PUD_UP);
    pinMode(  6, INPUT); pullUpDnControl( 6, PUD_UP);
    pinMode( 17, INPUT); pullUpDnControl(17, PUD_UP);
    pinMode( 18, INPUT); pullUpDnControl(18, PUD_UP);
    pinMode( 19, INPUT); pullUpDnControl(19, PUD_UP);
     
    pinMode( 20, OUTPUT);  
    pinMode( 21, OUTPUT);  
    pinMode( 22, OUTPUT);  
    pinMode( 23, OUTPUT);  
    pinMode( 24, OUTPUT);  
    
    
	
}




//*************************************************************
// main
//*************************************************************

int main() {
    int key;
    unsigned long t=0, t0, t1, t2, t3, t4;
    double ns0, ns1, ns2, ns3, ns4;
    
    sleep(1);
    
    setupwiringPi();     
   
    pthread_t tid0, tid1, tid2, tid3, tid4;
    struct  sched_param  param;

    pthread_create(&tid0, NULL, thread0, NULL);    //  
    param.sched_priority = 80;
    pthread_setschedparam(tid0, SCHED_RR, &param);
    
    pthread_create(&tid1, NULL, thread1, NULL);    //  
    param.sched_priority = 60;
    pthread_setschedparam(tid1, SCHED_RR, &param);
    
    pthread_create(&tid2, NULL, thread2, NULL);    //  
    param.sched_priority = 40;
    pthread_setschedparam(tid2, SCHED_RR, &param);
    
    pthread_create(&tid3, NULL, thread3, NULL);    //  
    param.sched_priority = 20;
    pthread_setschedparam(tid3, SCHED_RR, &param);
    
    pthread_create(&tid4, NULL, thread4, NULL);    //  
    param.sched_priority = 10;
    pthread_setschedparam(tid4, SCHED_RR, &param);
    
    while( s0 || s1 || s2 || s3 || s4) { 
		printf("%ld\n", t);
		delay(100);
		t+=100;		
	};
   
     
    // wait for threads to join before exiting
    pthread_join( tid0, NULL);
    pthread_join( tid1, NULL);
    pthread_join( tid2, NULL);
    pthread_join( tid3, NULL);
    pthread_join( tid4, NULL);
    
    printf("\n\n");
    t0=xtime0-itime0;
    t1=xtime1-itime1;
    t2=xtime2-itime2;
    t3=xtime3-itime3;
    t4=xtime4-itime4;
    
    ns0=t0*1000000/(float)(4*counts); 
    ns1=t1*1000000/(float)(4*counts); 
    ns2=t2*1000000/(float)(4*counts); 
    ns3=t3*1000000/(float)(4*counts); 
    ns4=t4*1000000/(float)(4*counts); 
    
    printf("time thread0: %ld ms,   time thread1: %ld ms\n", t0, t1); 
    printf("time thread2: %ld ms,   time thread3: %ld ms\n", t2, t3);
    printf("time thread4: %ld ms \n\n", t4);
     
    printf("time delta0:  %.1f ns,  time delta1:  %.1f ns\n", ns0, ns1); 
    printf("time delta2:  %.1f ns,  time delta3:  %.1f ns\n", ns2, ns3); 
    printf("time delta4:  %.1f ns \n\n", ns4); 
    
    
    key=getchar();

    
    exit(0);
}

Sorry, but I donā€™t have time now to write a python program to match your C program (Iā€™d rather put my time into implementing some BrickPi3 improvements). Raspbian is not a real-time OS, so you canā€™t be guaranteed that the process (compiled C program or Python program) wonā€™t be interrupted for a different process to run. By setting the priority to a high level, it is less likely to be interrupted, but I donā€™t think thereā€™s a way to guarantee that it wonā€™t.

Iā€™ll re-iterate the fact that although C will always be faster than Python, the advantages to using Python typically outweigh the disadvantages for my applications, and I still canā€™t think of a situation with the BrickPi3 that Python would be too slow. Most high-level robotics applications donā€™t need the speed of C, and in fact that difference doesnā€™t even effect the behavior (it will affect bench-mark numbers, but not robot performance). I would encourage you to try Python and see (even though it will be slower) if itā€™s still fast enough to work well for the application.

In the end, for high level programming, C vs. Python mostly comes down to personal preference and libraries/support required for that application.

I agree that perhaps ā€œmostā€ robotic applications will run even with Python, but my point was that I proved that even under highest cpu stress I can achieve reproducable real-time capabilities even for multiple parallel running threads and tasks by my C-programs when using POSIX pthread features (I proved that already even for dozens of simultaneous parallel threads) .
So even in case Python failed, C still wonā€™t.
And even if not 100% real-time, but extremely close though.

But that was just about timing, nonetheless there are dozens of different draw-backs of Python, regardless of speed. So Iā€™ll gladly look forward both to your C API and the new FW providing stackable BrickPi3 HATs. As soon it will be available I will immediately order a couple of them, because I still have none yet of course. If you want some suggestions in advance about API lib implementation , installation to Raspbian, including and linking to gcc/g++ programs etc., just let me know!
regards,
Helmut

ps,
the forumā€™s editor for code tags corrupts the formatted code by indents, white spaces, tabs, and empty linesā€¦ :frowning:

The forums code tags are now much improved :slight_smile:

For an update regarding stacking multiple BrickPi3s, see here.

great news!
Now letā€™s do some obfuscated Cā€¦ :wink:

C++ drivers are written, and Iā€™m testing them right now :wink:

thatā€™s been quick :open_mouth:

Iā€™ve been very busy :slight_smile:

I never dared to doubt that :sunglasses:

Very early, but here are the BrickPi3 C++ drivers.

readable and comprehensible! :smiley:
great work! :sunglasses:

will the BP commands be thread-safe or would one need mutexes additionally?

Since most robots donā€™t need multiple threads, making it thread-safe is not a priority. You will need to add mutexes into the user program as necessary.

actually most serious robots urgently need multithreading, but as most of the motor control is supposed to be performed by the HATs and not on the SoC I think/hope it will be feasable though :wink:

e.g., just a couple of sensors yet, 20 more are still missingā€¦:


void* thread0Go (void* ) {          // low priority: display values               // low priority: display values
   
    //printf("init graph...: ");
    setupGraphics();
    // DEBUG
    //printf(" done. \n\n");
           
    while(_TASKS_ACTIVE_) {
      cls();
      displayvalues();
      delay(40);   
    }   
    End();

    FinishShapes(); 
   
    return NULL;   

}


//*************************************************************


void* thread1Go(void *)           // low priority:  keyboard monitoring          // low priority:  keyboard monitoring
{
   int   c, ch;  // keyboard scancode, letter char
   
   while(_TASKS_ACTIVE_) {   
            
   
      c=0; ch=0;     
      echoOff();     
     
      if (kbhit())    {
          cursxy_lxt(0,11); printf(cleol_lxt);
          ch=getchar();     
          if(ch>=32 && ch<127) {
              printf("%c", ch);
              fflush(stdout);
          }   
          else if (ch<32) {
              if (kbhit()) ch=getchar();
              ch=0;
          }
      }
     
      // DEBUG
      else  if(ch<32) {
        c = getkbscancode();
        if(c) {
        if( c==1 ) {     // ESC to quit program
               printf("\n\n ESC pressed - program terminated by user \n");
               _TASKS_ACTIVE_=0;  // semaphore to stop all tasks
               printf("\n wait for tasks to join main()... \n\n");
               return NULL;
          }

          else {   
             cursxy_lxt(0,11); printf(cleol_lxt); 
             cursxy_lxt(0,10); printf(cleol_lxt);
                  if(_keypress_==_BSP_)  printf("BSP ");
             else if(_keypress_==_RET_)  printf("RET ");
             else if(_keypress_==_F1_)   printf("F1  ");
             else if(_keypress_==_F2_)   printf("F2  ");
             else if(_keypress_==_F3_)   printf("F3  ");
             else if(_keypress_==_F4_)   printf("F4  ");
             else if(_keypress_==_F5_)   printf("F5  ");
             else if(_keypress_==_F6_)   printf("F6  ");
             else if(_keypress_==_F7_)   printf("F7  ");
             else if(_keypress_==_F8_)   printf("F8  ");
             else if(_keypress_==_F9_)   printf("F9  ");
             else if(_keypress_==_F10_)  printf("F10 ");
             else if(_keypress_==_F11_)  printf("F11 ");
             else if(_keypress_==_F12_)  printf("F12 ");
             else if(_keypress_==_CUp_)  printf("CUp ");
             else if(_keypress_==_CDn_)  printf("CDn ");
             else if(_keypress_==_CLe_)  printf("CLe ");
             else if(_keypress_==_CRi_)  printf("CRi ");
             else if(_keypress_==_PUp_)  printf("PUp ");
             else if(_keypress_==_PDn_)  printf("PDn ");
             else if(_keypress_==_Hom_)  printf("Hom ");
             else if(_keypress_==_End_)  printf("End ");
             else if(_keypress_==_DEL_)  printf("DEL ");
             else if(_keypress_==_INS_)  printf("INS ");

             if(_keyshift_)     printf(" +SHFT ");
             if(_keyctrl_)      printf(" +CTRL ");
             if(_keyalt_)       printf(" +ALT  ");
             printf("scancode=%6d \n", _kbscode_ ); 
             fflush(stdout);

         }         
       }
     }
     delay(50);
   }         
   return NULL;
}


//*************************************************************


void* thread2Go (void* ) {   // medium priority: motor remote control      // medium priority: motor remote control
   static int pwm0, pwm1, i;
   int deadzone = 10;
   
   printf(cls_lxt);
   while(_TASKS_ACTIVE_) {   
     
     
      if( ! _REMOTE_OK_ ) {
       for (i=0; i< MAXMOTORS; ++i ) {
          motorCoast(i);         
       }
     }
     else {
      if( _REMOTE_OK_ )  {  // _REMOTE_OK_ 
         pwm0 = recvanalog[0];
         pwm1 = recvanalog[1];

         if(abs(pwm0) < deadzone) pwm0=0;
         if(abs(pwm1) < deadzone) pwm1=0;
     
         
         motorOn(1, pwm1 );
         motorOn(0, pwm0 );
     
         // debug
         

         cursxy_lxt(0,6);
         printf(cleol_lxt);
         printf("pwm0=%7d pwm1=%7d \n", pwm0, pwm1);
         printf("enc0=%7d enc1=%7d \n",motor[0].motenc, motor[1].motenc);
       
       }     
       
       if (!_TASKS_ACTIVE_) return NULL;
       delay(10);
     }
   }
   return NULL;
}




//*************************************************************



void* thread3Go (void* ) {       // highest priority:  100Āµs encoder timer        // highest priority:  100Āµs encoder timer
   while(_TASKS_ACTIVE_) {   
      updateEncoders();
      usleep(100);
   }
   return NULL;
}


//*************************************************************



void* thread4Go (void* ) {       // medium priority:  UART RC                    // medium priority:  UART RC
    while(_TASKS_ACTIVE_) {   
      UARTcomm();
   }
    return NULL;
}


//*************************************************************


void* thread5Go(void *)          // medium priority:  IMU + odometry            // medium priority:  IMU + odometry
{
   uint8_t  high_byte, low_byte;
   int16_t  angle16;
   
   int32_t  encLeft=0, encRight=0,
             encLeftOld, encRightOld, dEncLeft=0, dEncRight=0;
   double   fxold=0, fyold=0, rollRW, rollLW, rollM;
   double   MotencHeading=0, dMencHdgRad, MencHdgRad=0;    // Heading by odometry
   double   IMUHdgRad=0, oldIMUHdg, dIMUHdgRad, dIMUHdg;   // Heading by IMU
   double   fHdgRad=0, oldfHdg, dfHdgRad;                  // Heading by sensor fusion

   
   while(_TASKS_ACTIVE_) {     
     
    //-----------------------------------------------------------------------
    // IMU Gyro Compass
   
    oldIMUHdg = IMUHeading;                     // save old IMUHeading degree
     
     high_byte = wiringPiI2CReadReg8 (fcmps11, 2) ;
     low_byte  = wiringPiI2CReadReg8 (fcmps11, 3) ;
     //pitch     = wiringPiI2CReadReg8 (fcmps11, 4) ;
     //roll      = wiringPiI2CReadReg8 (fcmps11, 5) ;
     
     angle16 =   high_byte << 8;                // Calculate 16 bit angle degree
     angle16 +=  low_byte;
         
     IMUHeading = (double)angle16/10.0;        // new IMUHeading degree
     dIMUHdg    = IMUHeading - oldIMUHdg;       // delta IMU Heading degree
     IMUHdgRad  = IMUHeading * deg2rad;         // degree Hdg => rad
     dIMUHdgRad = dIMUHdg  * deg2rad;           // delta degree Hdg => rad     
         
     //-----------------------------------------------------------------------
     // odometry
     
     encLeftOld =encLeft;                       // save old encoder values
     encRightOld=encRight;
     encLeft = motorLeft.motenc;                // get new encoder values
     encRight= motorRight.motenc;

     dEncLeft=encLeft-encLeftOld;               // delta encooder
     dEncRight=encRight-encRightOld;           
     rollLW= Wheelrollperencdeg*dEncLeft;               // left wheel roll way
     rollRW= Wheelrollperencdeg*dEncRight;              // right wheel roll way

     rollM=(rollRW + rollLW)/2;                         // center roll way

     dMencHdgRad=atan2((rollLW - rollRW), Wheelgauge);  // delta encoder heading !! GEO !!

     //-----------------------------------------------------------------------
     // fusioned Heading:  to do: IMU + odometry + GPS !!!!!         
     
     fHeading = IMUHeading;                  // fusioned Hdg deg: IMU only << to do !
     fHdgRad  = IMUHdgRad;                   // fusioned Hdg rad
     dfHdgRad = dIMUHdgRad;                  // fusioned delta Hdg rad
     
     //-----------------------------------------------------------------------
     analogIn[8] = round(fHeading);          // fusioned Hdg deg ->-> analog array[8]
                                             // ->-> send array back to Arduino dash board
     
     //-----------------------------------------------------------------------
     // calculate new position coordinates
     
     fxold=fxpos; fyold=fypos;               // save old pos (x,y)

     fxpos = fxold + rollM*(sin( (fHdgRad +(dfHdgRad /2))));  // new xpos !! GEO !!
     fypos = fyold + rollM*(cos(-(fHdgRad +(dfHdgRad /2))));  // new ypos !! GEO !!
     
     //-----------------------------------------------------------------------
     analogIn[6] = round(fxpos);             // new x-pos ->-> analog array[6]
     analogIn[7] = round(fypos);             // new y-pos ->-> analog array[7]
                                             // ->-> send array back to Arduino dash board
     
     //-----------------------------------------------------------------------


     MencHdgRad    = MencHdgRad + dMencHdgRad;  // new motor enc Heading
     MotencHeading = MencHdgRad * 180.0/M_PI;
                                                        //----------------------------//
     while (MotencHeading >=360) MotencHeading-=360;    // odometry heading           //
     while (MotencHeading < 0)   MotencHeading+=360;    //----------------------------//     
     
     
   
     //-----------------------------------------------------------------------
   
     delay(1);  // 1 ms delay before next iteration loop
   }         
   return NULL;
}

//*************************************************************

void* thread6Go (void* ) {              // medium priority:  RC heartbeat                // medium priority:  RC heartbeat
   while(_TASKS_ACTIVE_) {   
      _lastbeat_ = _heartbeat_;                      // store old value
      delay(10);
      _heartbeat_ = rcrtlrecvbuf[TERM];  // read increasing number from Arduino
      if(abs(_heartbeat_ - _lastbeat_) == 0)         // no change:
        { if(_missedbeats_ <998) _missedbeats_++; }  // transmission error count++
       
      else  if(abs(_heartbeat_ - _lastbeat_) <= 3)   // small increase:
        { _missedbeats_ = 0;  }                      // transmission ok
       
      else  if(abs(_heartbeat_ - _lastbeat_) > 3 )   // large increase:
        { if(_missedbeats_ <998) _missedbeats_++; }  // transmission error count++ 
       
      _REMOTE_OK_ = ( _missedbeats_ < 100 );         // missed beats tolerance
      delay(20);
   }
   return NULL;
} 
      

I could maybe add multi-threading support to the drivers, but I donā€™t think most applications would take advantage of it (it would just be extra complications). It seems that it would be much better to implement in user code according to project requirements.

for 1 single motor there would be needed just 1 thread each to control it,
i.e., not 2 different threads are supposed to control the same motor,
but all motors on all HATs (e.g., 16 different motors) must be able to be controlled by independent proprietary threads running simultaneously -
do you understand what I mean?

I understand what you mean, but most applications shouldnā€™t need multiple threads. For most applications there are alternative ways to write programs, so that they donā€™t need multiple threads.

if I needed not multiple threads then I wouldnā€™t use pre-emptive pthread threads (both joinable and detached).
multithreading by different pthread_ priorities is indispensible to achieve real-time behaviour by C programs, thatā€™s finally why I need to use C + POSIX.
In my final program at least 20 threads will be running simultaneously, permanently and/or optionally, temporarily.