Posts: 1,308
BitJam
Joined: 31 Aug 2009
#1
Skidoo has suggested that we store the excludes for these programs in a central place and make them easy for users to edit. I think this is a terrific idea and it is time to do it.

Remaster-live and persist-save (now) store their exclude lists in variables in the file /usr/local/lib/antiX/antiX-excludes.sh. I believe the snapshot program holds its exclude info in a different format and in a different location. I like the idea of "common excludes" that are combined with specific excludes for persistence, remastering and snapshots. To make them easier to edit, the lists should be stored directly in four files:
  • common-excludes
  • remaster-excludes
  • persist-excludes
  • snapshot-excludes
The logic would be that we combine the common-excludes list with the specific excludes for the particular program. We would honor comments both before and after the file names. I already have some simple code that does this sort of thing as part of our build system.
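To make the combining step concrete, here is a minimal sketch in shell. It is not the build-system code mentioned above. The directory in EXCLUDES_DIR and the function names read_excludes and combined_excludes are assumptions made up for illustration; only the four file names come from the list above.

```shell
#!/bin/sh
# Hypothetical location of the four list files; the real path is not settled in this thread.
EXCLUDES_DIR=${EXCLUDES_DIR:-/usr/local/share/excludes}

# Print the active patterns from one list file:
# strip "#" comments (whole-line or trailing), trim whitespace, drop blank lines.
# Caveat: a pattern that itself contains "#" would be cut short.
read_excludes() {
    [ -r "$1" ] || return 0
    sed -e 's/[[:space:]]*#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Print common-excludes followed by the list for one program
# (remaster, persist or snapshot), keeping the first copy of any duplicate.
combined_excludes() {
    {
        read_excludes "$EXCLUDES_DIR/common-excludes"
        read_excludes "$EXCLUDES_DIR/$1-excludes"
    } | awk '!seen[$0]++'
}
```

Calling `combined_excludes snapshot` would then print one pattern per line, ready to be written to a temporary file and handed to whichever tool does the copying.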

We should try to make it as easy for the user as possible and then work around that. I don't know if combining common-excludes with a specific $PROGRAM-excludes will be easier for the user but I would hate to have four different lists that are almost the same. That sort of thing drives me bonkers. We can also have a .orig copy of each file that is read-only.
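The .orig idea could look roughly like this. It is a sketch only: the function names make_orig_copies and restore_orig are hypothetical, and the directory is passed in because no location has been agreed on.

```shell
#!/bin/sh
# Create a read-only NAME-excludes.orig next to each list, unless one already exists.
make_orig_copies() {
    dir=$1
    for name in common remaster persist snapshot; do
        list="$dir/$name-excludes"
        [ -f "$list" ] || continue          # list not installed, nothing to copy
        [ -e "$list.orig" ] && continue     # never overwrite an existing .orig
        cp -p "$list" "$list.orig" && chmod a-w "$list.orig"
    done
}

# Put the shipped defaults back in place of an edited list.
restore_orig() {
    cp "$1.orig" "$1" && chmod u+w "$1"
}
```

A user who has edited a list into a broken state could then run `restore_orig /path/to/snapshot-excludes` to start over from the defaults.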

Does anyone have ideas or suggestions?
Posts: 1,445
skidoo
Joined: 09 Feb 2012
#2
Sorry to hear that "different, but almost the same, lists" drive you bonkers. I'm convinced such is necessary.
For your default lists (the lists you distribute), an entry YOU deem desirable for the "common" list... I may disagree.
For that item, I may wish to exclude it from ONLY remaster operations... and I need to go hunt/kill it over in common.
A good example here, I think, is /var/cache/apt/archives/*.deb
-=-
Hmm, phrase it like this:
The separate lists are task-specific ~~ that's why they are (should be) separate, separately maintained.
I'm suggesting that the "common" list should contain only the MUST-exclude entries (the ones that will break things if they are not excluded).

If I never remaster, I probably don't (ever) care to look inside the remaster list.
When I perform a snapshot, I damn sure want to see every relevant entry... and referring back-n-forth between lists is a PITA.
It's reasonable to expect that a user will need to tweak his list(s) often and incrementally (which outweighs the infrequent dev PITA of "maintaining" the default lists).

A "default" remaster list cannot responsibly cover every consideration.
For that list, an inline comment should remind the would-be remasterer to consider what
(tweaked configs for installed apps) needs to be copied to etc/skel, in addition to what should/must be excluded.

Regarding "persist-excludes", exposing its content to the user would be enlightening, even if s/he never has need to edit it.

Ideally (eventually) a comment line would precede each entry, explaining why excluded (and whether or not user might change).
Some of the patterns can be commented in blocks ( proc / run / srv ) but many do merit individual explanation.
Posts: 1,445
skidoo
Joined: 09 Feb 2012
#3
reminder: an issue illustrated in the July 2014 "snapshots not working as expected" thread
snapshots-not-working-as-expected-t5211.html
In this example case, the botched snapshot resulted from the pattern media/* within /usr/lib/antixsnapshot/snapshot_exclude.list

related (usability issue):
because EVERY non-commented, non-blank line within the excludes file is an EXCLUDE pattern
{ - } usr/share/fuzzydice
{ - } home/*/thumbsuckers/large/*
Having the parser expect/require a BOL minus-sign character for each pattern line within the excludes file is $!#&@ ...um, is not ideal, compared to refractasnapshot.
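If the minus-sign prefix has to stay for the parser's sake, one workaround would be to let users keep a plain list and generate the prefixed form from it. A minimal sketch, assuming the parser wants each pattern line to begin with "- " (the same shape as an rsync filter rule); the function name to_minus_rules is made up for illustration.

```shell
#!/bin/sh
# Turn a plain list (one pattern per line) into "- pattern" lines.
# Comment lines, blank lines and lines that already carry the prefix pass through unchanged.
to_minus_rules() {
    sed -e '/^[[:space:]]*#/b' \
        -e '/^[[:space:]]*$/b' \
        -e '/^[[:space:]]*- /b' \
        -e 's/^[[:space:]]*/- /' "$@"
}
```

The user-facing file then contains nothing but patterns and comments, and the prefixed copy is produced just before the snapshot runs.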

======================

proposed conventions for content and format of excludes lists:

1)
Each item (aka path, rule, pattern, regex pattern) in an excludes file should begin with a forward slash character, which serves to "anchor" it.
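Convention 1 is easy to check mechanically. The sketch below flags every active line that lacks the leading slash, such as the media/* pattern from the thread above; if the list is consumed by rsync (my assumption), an unanchored pattern like that can match a "media" directory at any depth. The function name check_anchored is hypothetical.

```shell
#!/bin/sh
# Print "file:line: pattern" for every active pattern that does not start with "/".
# Exit status is non-zero if at least one unanchored pattern was found.
check_anchored() {
    awk '
        /^[ \t]*(#|$)/ { next }              # skip comment lines and blank lines
        { sub(/^[ \t]+/, "") }               # ignore leading whitespace
        !/^\// { printf "%s:%d: %s\n", FILENAME, FNR, $0; bad = 1 }
        END { exit bad }
    ' "$@"
}
```

Run against the default lists at build time, a check like this would catch an unanchored entry before it ships.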

2) A comment paragraph atop each list should describe, in a nutshell, the intended purpose of its associated script.
This header should also provide a reference link ( /path/to/local/doc or URL to online wiki / discussion thread ).
Ideally (for best results) the user should thoroughly review the external documentation.
Nonetheless, generous inclusion of inline comments throughout each excludes file is recommended.
#Bear in mind that users will
#typically edit the list using
#a text editor and that
#linewrapped comment
#text is $!#&@ hard to digest.



3) In the default list(s), 3 hash chars BOL to preface an "important" line item (or group of lines)
### these paths, if present, MUST be excluded ! (else bad things will happen)
### Yes, some of the top-level dirs listed here, e.g. "cdrom", might not currently exist in your system.
### If so, no problem... and it's advisable to list them anyway, just in case

/cdrom/*
/dev/*
/live
/media/*
/mnt/*
/proc/*
/swapfile

/srv/*    <--- this line is absent from the current antiX default excludes list?
/sys/*
/tmp/*
###
perhaps followed by a matching "end of section" marker, at least for large multiline groupings


4) In the default list(s), a comment line bearing a single hash char BOL precedes an "optional" line item (or group of lines)
#OPTIONAL: can safely exclude the following, if present; the system will automatically regenerate it during boot.
# filesize is typically 2 MB+ (its "squashed" size within the resulting snapshot will be considerably less, though)
# (its exclusion yields a smaller snapshot image. OTOH, regenerating it each live session adds 2-3 secs boot overhead)
# /var/lib/mlocate/mlocate.db

^--- just serves as a formatting example here.
Admittedly, for an optional item which would be commented out by default and is accompanied by several comment lines...
...mentioning it within the external docs instead of cluttering the excludes file may be preferable.
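With conventions 3) and 4) in place, the commented-out optional items become machine-readable too. A rough sketch, assuming an optional item is written exactly as a single hash, optional spaces, then a path starting with "/" (as in the mlocate.db example); the function names list_optional and enable_optional are made up.

```shell
#!/bin/sh
# Print the commented-out patterns in a list, i.e. lines of the form "# /some/path".
# "###" lines and ordinary comment text do not match.
list_optional() {
    sed -n 's|^#[[:space:]]*\(/[^[:space:]]*\)[[:space:]]*$|\1|p' "$1"
}

# Print the list with one optional pattern uncommented; the file itself is not modified.
enable_optional() {
    awk -v want="$2" '
        { line = $0 }
        sub(/^#[ \t]*/, "", line) && line == want { print line; next }
        { print }
    ' "$1"
}
```

For example, `enable_optional snapshot-excludes /var/lib/mlocate/mlocate.db > snapshot-excludes.new` would produce an edited copy for the user to review before replacing the original.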
Posts: 850
fatmac
Joined: 26 Jul 2012
#4
I think information headers ought to have multiple ###'s, as I have seen in other files. It makes it easier to find section headers when a file becomes more than one or two screens long.
Posts: 1,445
skidoo
Joined: 09 Feb 2012
#5
Thanks for suggesting ### section headers, fatmac. My hope for the "number of hashes" suggestion was to visually emphasize optional vs advisable vs necessary exclusions.
I agree that section headers are helpful, but only to a point -- after which, they begin to seem too restrictive.
In my use, I've found that a ---DO NOT EDIT ANYTHING ABOVE THIS LINE--- approach, with optional items grouped further down the list... became a hassle.
When grouping items based on {to save space} {privacy} {system will recreate these each boot} {system will recreate if/when needed} too often a given item seems to 'overlap', defying placement in just one group.
BitJam wrote: "We would honor comments both before and after the file names."
That's interesting. I don't think I've ever tried tacking on end-of-line comments. I've just placed my comment line(s) immediately above the associated pattern line.
Placing comments end-of-line would reduce my snapshot exclusion list quite a bit.

Reading "file names" in the post I've quoted reminds me to mention the confusion introduced by mix-n-match terminology.
In my post above, I was careful to write "Each item (aka path, rule, pattern, regex pattern) in an excludes file"
because a given item may result in the exclusion of
{one specific file}, or {a directory}, or {everything beneath the named directory, but not the directory itself} or {all files with matching names}, or...

Pick a card, any card. Throughout the docs (and discussions) we need a consistent name/term for "a non-comment, non-blank, line within an excludes file".
Separately, typing "snapshot excludes file" (or "snapshot exclusions list") seems awkwardly verbose, yet necessary, with multiple exclusion lists in the mix.
BitJam wrote: "We should try to make it as easy for the user as possible and then work around that."
Without inline commentary, the user probably won't realize that new, empty top-level dirs ('media', 'proc', 'sys', 'tmp', et al.) are created in the working directory...
and is left wondering "Why do the 'swapfile' and 'live' items lack the tail-end slash-asterisk that the others have?"

The default lists currently contain only a few dozen lines. These can be easily grouped, and commented.
I was suggesting expansion of the default lists, so that they contain additional, initially commented-out, items
in order to convey recommendations and to illustrate useful regex-based patterns (relevant, useful, patterns. Just uncomment to activate)
BitJam wrote: "I don't know if combining common-excludes with a specific $PROGRAM-excludes will be easier for the user but I would hate to have four different lists that are almost the same."
Back in 2012 (?) I posted "Consider the status quo, from a user perspective" and I spelled out (walked through) the scenario of "not being able to see the forest for the trees":
(I'm explaining, not ranting)
Find/open antixsnapshot... find/open the antix-common script which is "sourced" by antixsnapshot... figure out "who da hell defines 'du_excludes', and where"...
Ultimately, I reached the conclusion that the antixsnapshot mechanism wasn't usable for me (wasn't acceptable, didn't produce a desired result)
due to hard-coded exclusions stipulated in the "shared" (focused on remastering) exclusions list.
Under this status quo, the user expects, but cannot achieve: snapshot == faithful copy of the running system
BitJam wrote: "We can also have a .orig copy of each file that is read-only."
That sounds like a good idea, along with advising the user to back up his customized list(s).