Possible file-system issue with GoPiGo OS 3.0.n?

This is the text of a support e-mail sent to @mitch.kremm and @cleoqc:

============================================

Mitch/Nicole,

While working with the GoPiGo OS 3.0.1 image and trying to reproduce the “update causes robot functionality to fail” issue, I have been getting inconsistent results.

As a result I have been looking more closely at the images themselves.

There are certain inconsistencies with the as-shipped GoPiGo OS 3.0.1 image that I would like to bring to your attention.

  1. As is already known, the images released up to this point have been truncated – that is, some unknown portion of the end of the rootfs partition is missing.

  2. Using an external system (Mint 19.3 installed on my laptop), I have seen some very strange results when running e2fsck:

    • If I use a USB => microSD adapter dongle, e2fsck aborts with a corrupted directory inode.
      Viz.:
root@Mint-19:/home/jim/Downloads/Test_Images# e2fsck -fDv /dev/sdc2
e2fsck 1.44.1 (24-Mar-2018)
rootfs: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 411067, block #0, offset 0: directory corrupted
Salvage<y>? no
e2fsck: aborted

rootfs: ********** WARNING: Filesystem still has errors **********

root@Mint-19:/home/jim/Downloads/Test_Images#
    • If I use a “full size” multi-card reader, e2fsck reports eight inodes whose extent trees could be shorter.
      Viz.:
root@Mint-19:/home/jim/Downloads/Test_Images# e2fsck -fDv /dev/sdc2
e2fsck 1.44.1 (24-Mar-2018)
rootfs: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Inode 16322 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 32238 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 32477 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 48436 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 132283 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 141808 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 158505 extent tree (at level 1) could be shorter.  Fix<y>? no
Inode 389571 extent tree (at level 1) could be shorter.  Fix<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      318514 inodes used (49.47%, out of 643840)
        1266 non-contiguous files (0.4%)
         232 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 298299/376
     2318995 blocks used (88.81%, out of 2611200)
           0 bad blocks
           1 large file

      253229 regular files
       45247 directories
           7 character device files
           0 block device files
           0 fifos
        1214 links
       20021 symbolic links (19823 fast symbolic links)
           1 socket
------------
      319719 files

root@Mint-19:/home/jim/Downloads/Test_Images#

The results are the same before and after first boot.

No other downloaded Raspbian image (I tested Buster and Bullseye) exhibits this behavior.

I have not tried this test using a Raspberry Pi.

Certain interesting things jump out at me:

  1. The truncation issue already mentioned.
  2. The e2fsck output for the GoPiGo OS image, regardless of where it is tested, starts with the line “rootfs: recovering journal”. This tells me that the file system was not unmounted cleanly before imaging.
  3. Obviously – the corrupt directory inode. (It’s the same one every time.)
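
The unclean-journal observation in item 2 can be checked without running a full e2fsck. What follows is only a minimal sketch. It assumes that the needs_recovery flag, which dumpe2fs -h lists on its “Filesystem features:” line, marks a journal that has not yet been replayed. The function name journal_state and the device /dev/sdX2 are placeholders, not anything taken from the tests above.

```shell
#!/usr/bin/env bash
# Sketch: report whether an ext4 superblock still carries the
# "needs_recovery" feature flag. Reads `dumpe2fs -h` output on stdin,
# so it can be tried on saved text without touching a device.
journal_state() {
    local features
    # Keep only the line that lists the feature flags.
    features=$(grep '^Filesystem features:' || true)
    if [ -z "$features" ]; then
        echo "unknown: no feature line found"
        return 2
    fi
    # -w matches needs_recovery only as a whole word.
    if printf '%s\n' "$features" | grep -qw 'needs_recovery'; then
        echo "unclean: journal needs recovery"
    else
        echo "clean: no pending journal recovery"
    fi
}

# Hypothetical usage against the rootfs partition of a freshly flashed card:
#   sudo dumpe2fs -h /dev/sdX2 2>/dev/null | journal_state
```

If a freshly flashed, never-mounted card reports “unclean”, that would be consistent with the image having been captured from a file system that was still mounted or not cleanly unmounted.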

I do not know why the e2fsck fails on one kind of card reader and not the other. I have tried this with three different USB => microSD dongles and three different full-size multi-card readers. I do know that none of the images I’ve downloaded from other places exhibit this behavior.

And yes, I have downloaded the Dexter/MR images multiple times and verified them against each other using SHA-256 checksums. (They all match.)
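
For anyone who wants to repeat the download comparison, here is a small sketch. The function name same_sha256 and the file names in the usage comment are hypothetical. It only tells you whether all the given files hash to the same SHA-256 digest.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if every file given has the same SHA-256 digest.
same_sha256() {
    local distinct
    # Need at least two files to compare.
    [ "$#" -ge 2 ] || return 2
    # Count the distinct digests among the files.
    distinct=$(sha256sum -- "$@" | awk '{print $1}' | sort -u | wc -l)
    [ "$distinct" -eq 1 ]
}

# Hypothetical usage with two separate downloads of the same image:
#   same_sha256 download1.img download2.img && echo "identical"
```

The same comparison could in principle be pointed at the raw partition as read through each adapter (sha256sum /dev/sdX2 with the dongle, then again with the multi-card reader). Differing digests would suggest the adapters are not returning the same bytes. That is an untested suggestion, not something done in this thread.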

Absent a better answer, I am compelled to assume that there is something wrong with the file system on the Dexter/MR images.

Attached:
The log file I created while testing.
Flash_Using_MicroSD_Adapter.txt (20.8 KB)

P.S.

[Photo: full-sized multi-card reader]

[Photo: USB => microSD adapter “dongle”]


Update:

Performing the same process on a Raspberry Pi 4 (4 GB) returns the same results:

Viz:
Using a USB to microSD adapter dongle:

  • (copying Raspbian Bullseye using ddrescue)
root@rsync:/home/pi/Downloads/Test_Images# ddrescue -vv --force -c 4096 ./2021-10-30-raspios-bullseye-armhf.img /dev/sda
GNU ddrescue 1.23
About to copy 3972 MBytes from './2021-10-30-raspios-bullseye-armhf.img' (3972005888) to '/dev/sda' [UNKNOWN] (63864569856)
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 4096 sectors       Initial skip size: 128 sectors
Sector size: 512 Bytes
Direct in: no     Direct out: no     Sparse: no     Truncate: no     
Trim: yes         Scrape: yes        Max retry passes: 0

Press Ctrl-C to interrupt
     ipos:    3969 MB, non-trimmed:        0 B,  current rate:   2097 kB/s
     opos:    3969 MB, non-scraped:        0 B,  average rate:  11580 kB/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:    3972 MB,   bad areas:        0,        run time:      5m 42s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:         n/a
Finished
  • (checking Raspbian Bullseye using e2fsck)
root@rsync:/home/pi/Downloads/Test_Images# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      111295 inodes used (49.29%, out of 225792)
         145 non-contiguous files (0.1%)
         120 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 93246/2
      758303 blocks used (83.96%, out of 903168)
           0 bad blocks
           1 large file

       84051 regular files
        9002 directories
           8 character device files
           0 block device files
           0 fifos
        1363 links
       18225 symbolic links (18031 fast symbolic links)
           0 sockets
------------
      112649 files
  • (copying GoPiGo O/S 3.0.1 using ddrescue)
root@rsync:/home/pi/Downloads/Test_Images# ddrescue -vv --force -c 4096 ./GoPiGo_OS_3.0.1_.img /dev/sda
GNU ddrescue 1.23
About to copy 10968 MBytes from './GoPiGo_OS_3.0.1_.img' (10968104960) to '/dev/sda' [UNKNOWN] (63864569856)
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 4096 sectors       Initial skip size: 256 sectors
Sector size: 512 Bytes
Direct in: no     Direct out: no     Sparse: no     Truncate: no     
Trim: yes         Scrape: yes        Max retry passes: 0

Press Ctrl-C to interrupt
     ipos:   10966 MB, non-trimmed:        0 B,  current rate:   6291 kB/s
     opos:   10966 MB, non-scraped:        0 B,  average rate:  11330 kB/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:   10968 MB,   bad areas:        0,        run time:     16m  7s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:         n/a
Finished
  • (checking GoPiGo O/S 3.0.1 using e2fsck)
root@rsync:/home/pi/Downloads/Test_Images# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
rootfs: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 411067, block #0, offset 0: directory corrupted
Salvage<y>? no
e2fsck: aborted

rootfs: ********** WARNING: Filesystem still has errors **********

root@rsync:/home/pi/Downloads/Test_Images#

 

Using a multi-card reader:

  • (copying Raspbian Bullseye using ddrescue)
root@rsync:/home/pi/Downloads/Test_Images# ddrescue -vv --force -c 4096 ./2021-10-30-raspios-bullseye-armhf.img /dev/sda
GNU ddrescue 1.23
About to copy 3972 MBytes from './2021-10-30-raspios-bullseye-armhf.img' (3972005888) to '/dev/sda' [UNKNOWN] (63864569856)
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 4096 sectors       Initial skip size: 128 sectors
Sector size: 512 Bytes
Direct in: no     Direct out: no     Sparse: no     Truncate: no     
Trim: yes         Scrape: yes        Max retry passes: 0

Press Ctrl-C to interrupt
     ipos:    3969 MB, non-trimmed:        0 B,  current rate:  16777 kB/s
     opos:    3969 MB, non-scraped:        0 B,  average rate:  14035 kB/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:    3972 MB,   bad areas:        0,        run time:      4m 43s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:          0s
Finished

 

  • (checking Raspbian Bullseye using e2fsck)
root@rsync:/home/pi/Downloads/Test_Images# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      111295 inodes used (49.29%, out of 225792)
         145 non-contiguous files (0.1%)
         120 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 93246/2
      758303 blocks used (83.96%, out of 903168)
           0 bad blocks
           1 large file

       84051 regular files
        9002 directories
           8 character device files
           0 block device files
           0 fifos
        1363 links
       18225 symbolic links (18031 fast symbolic links)
           0 sockets
------------
      112649 files
root@rsync:/home/pi/Downloads/Test_Images#

 

  • (copying GoPiGo O/S 3.0.1 using ddrescue)
root@rsync:/home/pi/Downloads/Test_Images# ddrescue -vv --force -c 4096 ./GoPiGo_OS_3.0.1_.img /dev/sda
GNU ddrescue 1.23
About to copy 10968 MBytes from './GoPiGo_OS_3.0.1_.img' (10968104960) to '/dev/sda' [UNKNOWN] (63864569856)
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 4096 sectors       Initial skip size: 256 sectors
Sector size: 512 Bytes
Direct in: no     Direct out: no     Sparse: no     Truncate: no     
Trim: yes         Scrape: yes        Max retry passes: 0

Press Ctrl-C to interrupt
     ipos:   10966 MB, non-trimmed:        0 B,  current rate:   6291 kB/s
     opos:   10966 MB, non-scraped:        0 B,  average rate:  13557 kB/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:   10968 MB,   bad areas:        0,        run time:     13m 28s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:         n/a
Finished
root@rsync:/home/pi/Downloads/Test_Images#

 

  • (checking GoPiGo O/S 3.0.1 using e2fsck)
root@rsync:/home/pi/Downloads/Test_Images# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
rootfs: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Inode 16322 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32238 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32477 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 48436 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 132283 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 141808 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 158505 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 389571 extent tree (at level 1) could be shorter.  Optimize<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      318514 inodes used (49.47%, out of 643840)
        1266 non-contiguous files (0.4%)
         232 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 298299/376
     2318995 blocks used (88.81%, out of 2611200)
           0 bad blocks
           1 large file

      253229 regular files
       45247 directories
           7 character device files
           0 block device files
           0 fifos
        1214 links
       20021 symbolic links (19823 fast symbolic links)
           1 socket
------------
      319719 files
root@rsync:/home/pi/Downloads/Test_Images#

 

I have not run an exhaustive set of tests, since the results so far are exactly the same as on the laptop.

  • Final test:
    I am going to re-copy GoPiGo O/S 3.0.1 to the SD card and then use tune2fs to force a fsck on reboot (by setting a maximum mount count and then setting the current mount count to a larger value).  After rebooting and initializing, I will retest as above.

  • (Copying GoPiGo O/S 3.0.1 using ddrescue)

root@rsync:/home/pi/Downloads/Test_Images# ddrescue -vv --force -c 4096 ./GoPiGo_OS_3.0.1_.img /dev/sda
GNU ddrescue 1.23
About to copy 10968 MBytes from './GoPiGo_OS_3.0.1_.img' (10968104960) to '/dev/sda' [UNKNOWN] (63864569856)
    Starting positions: infile = 0 B,  outfile = 0 B
    Copy block size: 4096 sectors       Initial skip size: 256 sectors
Sector size: 512 Bytes
Direct in: no     Direct out: no     Sparse: no     Truncate: no     
Trim: yes         Scrape: yes        Max retry passes: 0

Press Ctrl-C to interrupt
     ipos:   10966 MB, non-trimmed:        0 B,  current rate:  12582 kB/s
     opos:   10966 MB, non-scraped:        0 B,  average rate:  13744 kB/s
non-tried:        0 B,  bad-sector:        0 B,    error rate:       0 B/s
  rescued:   10968 MB,   bad areas:        0,        run time:     13m 18s
pct rescued:  100.00%, read errors:        0,  remaining time:         n/a
                              time since last successful read:          0s
Finished                                     
root@rsync:/home/pi/Downloads/Test_Images#
  • (Forcing a fsck on reboot using tune2fs: “-c” sets the maximum number of mounts before a fsck is forced and “-C” sets the current mount count.)
root@rsync:/home/pi/Downloads/Test_Images# tune2fs -c 5 -C 6 /dev/sda2
tune2fs 1.44.5 (15-Dec-2018)
Recovering journal.
Setting maximal mount count to 5
Setting current mount count to 6
root@rsync:/home/pi/Downloads/Test_Images#

(Notice it “recovered the journal”.)
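
To confirm that the mount counts were actually written, the two fields can be read back from tune2fs -l. This is a sketch only. It assumes the “Mount count:” and “Maximum mount count:” lines that tune2fs -l prints, and it treats a maximum of 0 or -1 as “mount-count checking disabled”. The function name fsck_due is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: decide from `tune2fs -l` output (on stdin) whether the
# mount-count threshold for a forced check has been reached.
fsck_due() {
    awk -F': *' '
        /^Mount count:/         { count = $2 }
        /^Maximum mount count:/ { max = $2 }
        END {
            if (count == "" || max == "") { print "unknown"; exit 2 }
            # A maximum of 0 or -1 disables mount-count-based checking.
            if (max + 0 > 0 && count + 0 >= max + 0)
                print "check due"
            else
                print "check not due"
        }'
}

# Hypothetical usage after the tune2fs command above:
#   sudo tune2fs -l /dev/sdX2 | fsck_due
```

Whether the check then actually runs at boot also depends on how the booted system invokes fsck, which this sketch does not look at.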

  • (Rebooting in a Pi-4 and then re-testing with a USB to microSD dongle)
root@rsync:/home/pi# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 411067, block #0, offset 0: directory corrupted
Salvage<y>? no
e2fsck: aborted

rootfs: ********** WARNING: Filesystem still has errors **********

root@rsync:/home/pi#

 

  • (Retesting with a multi-card reader)
root@rsync:/home/pi# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Inode 16322 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32238 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32477 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 48436 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 132283 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 141808 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 158505 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 389571 extent tree (at level 1) could be shorter.  Optimize<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      318509 inodes used (8.35%, out of 3814752)
        1266 non-contiguous files (0.4%)
         232 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 298295/376
     2518949 blocks used (16.22%, out of 15525376)
           0 bad blocks
           1 large file

      253230 regular files
       45242 directories
           7 character device files
           0 block device files
           0 fifos
        1214 links
       20020 symbolic links (19822 fast symbolic links)
           1 socket
------------
      319714 files
root@rsync:/home/pi#

 

Short of disassembling my card readers and dongles to look up the chips used (and see what limitations they might have), I do not understand the significance of this difference.  I have also searched the various Fountains of Wisdom on the Web with no joy.  (Possibly because it appears to be limited to these particular operating systems.)

I also doubt I would find anything interesting because everything else has no difficulties whatsoever.

Obviously when the card is booted from the SD slot the system “passes” fsck, otherwise it would panic and halt.

I am at a loss here. . . .

Since I don’t know what this means, I have no choice but to assume a worst-case scenario and avoid the GoPiGo O/S releases until I know otherwise.

What say ye?

P.S.

Before anyone asks, I have tried this with several different microSD cards of varying sizes, and the results are always the same.
 

P.P.S.
Letting e2fsck shorten the extent trees does not change the result of subsequent retests on a USB dongle.

Attempting to “salvage” the corrupted directory inode results in the entire image being magically transformed into a quivering lump of GAGH!


Update:

I noticed that the Raspbian images are about 4 GB in size, whereas the GoPiGo OS images are considerably larger, at about 11 GB.

So, I tried an experiment:

I took a Raspbian image that I knew worked, expanded the file system, and then added about 11 gigs worth of stuff to it to bulk up the size of the filesystem.

I then fsck’d the root partition to see if a huge filesystem would cause the microSD dongles to fail.  Nope.

I am still puzzled.  I asked over on the Raspberry Pi forums and hopefully they’ll have an idea.

I just cannot imagine why one kind of adapter would fail and another work wonderfully, but only on one specific kind of image.

Unfortunately I do not know if this is a real failure or a phantom failure and I’d really like to know what’s going on.


Question: how do I check if my flashed (and booted) GoPiGo OS has this issue?


You may not be able to, as it appears to be “undetectable” when installed in a Raspberry Pi.

I discovered this by using a microSD USB dongle and running e2fsck on the SD card.
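
If you want to try the check yourself without risking the card, e2fsck can be run read-only with -n, which answers “no” to every question. The sketch below decodes the exit status, which the e2fsck man page documents as a bitmask. The function name e2fsck_status and the device /dev/sdX2 are placeholders. Note that I have left out -D here, on the assumption that e2fsck does not accept it together with -n.

```shell
#!/usr/bin/env bash
# Sketch: turn an e2fsck exit status (a bitmask) into readable text.
e2fsck_status() {
    local code=$1 msg="" pair bit text
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    # Each entry is "bit:meaning", following the e2fsck man page.
    for pair in "1:errors corrected" \
                "2:errors corrected, reboot advised" \
                "4:errors left uncorrected" \
                "8:operational error" \
                "16:usage or syntax error" \
                "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}
        text=${pair#*:}
        if [ $((code & bit)) -ne 0 ]; then
            msg="$msg; $text"
        fi
    done
    # Drop the leading separator before printing.
    echo "${msg#; }"
}

# Hypothetical usage, with the card in a USB adapter and NOT mounted:
#   sudo e2fsck -fn /dev/sdX2; e2fsck_status $?
```

Running it once per adapter type and comparing the two results would reproduce the dongle versus multi-card reader comparison described in this thread.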


 

What’s puzzling is that these devices ALWAYS fail hard on a Dexter Industries/Modular Robotics image, but a regular Raspbian image works wonderfully.

Being “undetectable” on a Pi, (or when using certain kinds of adapters), makes me wonder what is going on.

Just in case these dongle adapters have problems with large file images, I decided to create a very large Raspbian image - and it still worked.


I don’t understand what “fail hard” means.

I booted a Buster system and ran both fsck and e2fsck -fDv on my GoPiGo OS 3.0.1 card:

pi@raspberrypi:~ $ sudo fsck /dev/sda2
fsck from util-linux 2.33.1
e2fsck 1.44.5 (15-Dec-2018)
rootfs: clean, 318513/941616 files, 2339092/3822976 blocks
pi@raspberrypi:~ $ sudo e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Inode 16322 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32238 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 32477 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 48436 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 132283 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 141808 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 158505 extent tree (at level 1) could be shorter.  Optimize<y>? no
Inode 389571 extent tree (at level 1) could be shorter.  Optimize<y>? no
Pass 4: Checking reference counts
Pass 5: Checking group summary information

rootfs: ***** FILE SYSTEM WAS MODIFIED *****

      318513 inodes used (33.83%, out of 941616)
        1265 non-contiguous files (0.4%)
         179 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 298297/378
     2339092 blocks used (61.19%, out of 3822976)
           0 bad blocks
           1 large file

      253234 regular files
       45242 directories
           7 character device files
           0 block device files
           0 fifos
        1214 links
       20020 symbolic links (19822 fast symbolic links)
           1 socket
------------
      319718 files

The general Internet wisdom seems to be: “This happens from time to time, it’s not corruption, and it doesn’t indicate an impending failure.”


The inode stuff isn’t anything to worry about - that’s just e2fsck offering to optimize the layout of the extent trees, so that access is faster because there is less to traverse.

When I say “fail hard”, I’m talking about serious filesystem corruption, like this:

root@rsync:/home/pi# e2fsck -fDv /dev/sda2
e2fsck 1.44.5 (15-Dec-2018)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Directory inode 411067, block #0, offset 0: directory corrupted
Salvage<y>? no
e2fsck: aborted

rootfs: ********** WARNING: Filesystem still has errors **********

root@rsync:/home/pi#

I am still investigating. . . .

P.S.
It’s getting late (or early: almost 4:00 am Moscow time) and my wife’s agitating for me to go to bed!

I may not have an image ready for you to download until tomorrow sometime.


No problem - I’m going to continue with this card since it is not failing.


I have a clean image and I am uploading it to MediaFire.

It will take a while to upload so I will post a link tomorrow with details on what I did and what happened.

========================

Update:

The corrected image is available via the links shown below.

What I did to correct it:

  • Flashed image using Etcher and the JMCR SD Card interface built into my laptop.
  • Used rsync -aAxXv to copy the content of the second partition to my Linux partition on my laptop using Mint 19.3
  • Re-formatted the partition to clear any cruft and re-labeled it as “rootfs”
  • Used rsync (as above) to copy the partition content back to the root partition.
  • Verified that I could fsck the partition with each kind of adapter without significant errors.
  • Booted and then re-verified.
  • Used ddrescue to create a new image.
  • Ran the following image-utils:
    image-utils.zip.txt (14.7 KB)
    • image-auto-expand: (causes the image to auto-expand on next boot)
    • image-shrink: (reduces the image to its smallest size)
  • Re-flashed the new image to an SD card and verified it as above.
  • Uploaded to MediaFire.
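
The copy-out, reformat and copy-back part of the steps above can be sketched roughly as follows. Treat it as an illustration under assumptions, not the exact commands used. The device /dev/sdX2, the mount point and the backup directory are placeholders, and mkfs.ext4 is my assumption for the re-format step. By default the sketch only prints each command. It executes them only when DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: rebuild a rootfs partition by copying its content out,
# re-creating the file system, and copying the content back.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_rootfs() {
    local part=$1 mnt=$2 backup=$3
    run mount "$part" "$mnt"
    run rsync -aAxXv "$mnt/" "$backup/"   # copy the content out
    run umount "$mnt"
    run mkfs.ext4 -L rootfs "$part"       # assumption: ext4, label "rootfs"
    run mount "$part" "$mnt"
    run rsync -aAxXv "$backup/" "$mnt/"   # copy the content back
    run umount "$mnt"
    run e2fsck -fn "$part"                # read-only check of the result
}

# Hypothetical usage (prints the eight commands, changes nothing):
#   rebuild_rootfs /dev/sdX2 /mnt/card /tmp/rootfs-copy
```

Re-creating the file system gives it a new file-system UUID, so anything on the card that refers to the old UUID would need to be updated. Double-check the device name before setting DRY_RUN=0, because mkfs destroys whatever is on that partition.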

Oops!  Forgot to zip it up. . . . Done!

Checksums for image download:
https://www.mediafire.com/file/z3isa7b2sy032pf/GoPiGoOS_3.0.1-corrected.img_chechsums.txt/file

Image Download:
https://www.mediafire.com/file/skgc095kezj6nr9/GoPiGoOS_3.0.1-corrected.img.zip/file

========================

Update:

I shared this information and the files with @mitch.kremm and @cleoqc at the M/R support e-mail address.

I have invited them to use the image as they see fit, and if they wish, they can replace the existing image with it with my blessings.

I am going to continue the “update causes robot functionality to fail” research using this image.
