User:Andreas Plank/BASH: Difference between revisions

From Open Source Ecology - Germany
(16 intermediate revisions by the same user are not shown)
Line 1: Line 1:
 +
{{ZITATFORMAT Kapitälchen}}
 +
Brief summaries of important recommendations:
 +
* {{Zitat|Muth - Better Bash Scripting - 2012|Muth (Better Bash Scripting, 2012)}}
 +
* {{Zitat | Vreckem - Bash best practices - 2020 |Vreckem (Bash best practices, 2020)}}
 +
 
== Finding file names ==
 
== Finding file names ==
  
Line 71: Line 76:
 
<pre style="font-size:smaller;margin-left:1.5em;">drei-beide
 
<pre style="font-size:smaller;margin-left:1.5em;">drei-beide
 
fünf-beide</pre>
 
fünf-beide</pre>
 +
 +
== Compare Two URL Lists ==
 +
 +
Suppose you have two lists of URLs, one old and one new, and you want only those URLs that are new compared to the old list. The following example assumes CSV (comma-separated values) or TSV (tab-separated values) input and tries to extract just the URL from each line, ignoring any text after it.
 +
 +
<div style="margin-left:1.5em">
 +
For this we use the command<br/><code>comm ‹-options› oldlist_sorted comparelist_sorted</code> or, more generally,<br/><code>comm ‹-options› file_1_sorted file_2_sorted</code>, which produces three output columns:
 +
column-1        column-2        column-3
 +
only-in-file_1
 +
                  only-in-file_2
 +
                                  in-file_1-and-2
 +
With <code>comm</code> you can suppress one or two of these three output columns using the following options:
 +
* <code>comm -1</code> suppresses output column 1 (columns 2 and 3 remain, i.e. lines only in file_2 plus lines in both files)
 +
* <code>comm -12</code> suppresses output columns 1 + 2 (column 3 remains, i.e. lines in both file_1 and file_2)
 +
* <code>comm -13</code> suppresses output columns 1 + 3 (column 2 remains, i.e. lines only in file_2)
 +
* <code>comm -23</code> suppresses output columns 2 + 3 (column 1 remains, i.e. lines only in file_1)
 +
* etc.
 +
</div>
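The column logic can be tried on two tiny lists. The following is only a minimal sketch: the file names and the <code>example.org</code> URLs are made up for illustration, and both lists are sorted under the same locale (<code>LC_ALL=C</code>) because <code>comm</code> expects consistently sorted input.

```shell
# Hypothetical sample data, not taken from the real URL lists
export LC_ALL=C   # sort and comm must agree on the collation order
workdir=$(mktemp -d)
printf '%s\n' 'http://example.org/a' 'http://example.org/b' | sort > "$workdir/old_sorted.txt"
printf '%s\n' 'http://example.org/b' 'http://example.org/c' | sort > "$workdir/new_sorted.txt"

# keep column 2 only: lines that occur only in the new list
only_new=$(comm -13 "$workdir/old_sorted.txt" "$workdir/new_sorted.txt")
# keep column 3 only: lines that occur in both lists
in_both=$(comm -12 "$workdir/old_sorted.txt" "$workdir/new_sorted.txt")

echo "only in new list: $only_new"
echo "in both lists:    $in_both"
rm -r "$workdir"
```

Swapping <code>-13</code> for <code>-23</code> would instead list the lines that occur only in the old list.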
  
 
<syntaxhighlight lang="bash" style="font-size:smaller;">
 
<syntaxhighlight lang="bash" style="font-size:smaller;">
 
# # # # # # # # # # # # # #
 
# # # # # # # # # # # # # #
# Check for URI-Differences (in general)
+
# Check for URI-Differences old list vs. new list (in general for CSV or TSV lists)
 
# ```bash
 
# ```bash
 +
# comm file1.txt file2.txt
 +
# LIST1-only-of-file1 LIST2-only-of-file2 LIST3-both-in-and-of-file1-file2
 +
#
 
# comm -13 donelistsorted comparelistsorted > todolistsorted
 
# comm -13 donelistsorted comparelistsorted > todolistsorted
 
# comm -13 donelistsorted comparelistsorted > todolistsorted
 
# comm -13 donelistsorted comparelistsorted > todolistsorted
Line 89: Line 115:
 
todolist_sorted_noprotocol=${comparelist_source%.*}_todo_noprotocol.tsv;
 
todolist_sorted_noprotocol=${comparelist_source%.*}_todo_noprotocol.tsv;
  
# assume to have URLs beginning at the line start and after it (word-boundary \b), anything other text gets ignored
+
# assume CSV (comma separated values) or TSV (tab separated values)
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*(https?://[^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
+
# assume each URL starts at the beginning of its line; everything after the URL (from the word boundary on) is ignored
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
 
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted"
 
comm -13 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
 
grep --count "/" "$todolist_sorted"; # 2447
 
  
# compare by removing any protocol part (http:// https:// ftp:// sftp:// aso.)
+
# compare after removing the protocol part (http://, https://, ftp://, sftp://, etc.) and any enclosing <…>
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*https?://([^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
+
# without protocol
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
+
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?[[:alpha:]]+://([^[:space:],]+)>?\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted_noprotocol"
+
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?[[:alpha:]]+://([^[:space:],]+)>?\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted_noprotocol"
comm -13 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
+
  # only in done-list
grep --count "/" "$todolist_sorted_noprotocol"; # 2447 
+
  comm -23 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
 +
  # only in compare-list
 +
  # comm -13 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
 +
  grep --count "/" "$todolist_sorted_noprotocol";
 +
# with protocol
 +
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?([[:alpha:]]+://[^[:space:],]+)>?\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
 +
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?([[:alpha:]]+://[^[:space:],]+)>?\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted"
 +
  # only in done-list
 +
  comm -23 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
 +
  # only in compare-list
 +
  # comm -13 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
 +
  grep --count "/" "$todolist_sorted";
 
# ```
 
# ```
 +
</syntaxhighlight>
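To see what the two <code>sed</code> variants produce, they can be run on a few hand-made lines. This is only a sketch: the sample lines and the <code>example.org</code> URLs are invented, and GNU <code>sed</code> is assumed (for <code>\b</code> and the long options).

```shell
export LC_ALL=C
# Hypothetical input: one TSV line, one CSV line, one line without any URL
sample=$(printf 'https://example.org/a/1\tfirst record\nhttp://example.org/a/1,second record\nno url in this line\n')

# keep the protocol: http and https stay two different entries
with_protocol=$(printf '%s\n' "$sample" \
  | sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?([[:alpha:]]+://[^[:space:],]+)>?\b.*$@\1@; p }' \
  | sort)

# drop the protocol: both lines collapse to the same entry
without_protocol=$(printf '%s\n' "$sample" \
  | sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?[[:alpha:]]+://([^[:space:],]+)>?\b.*$@\1@; p }' \
  | sort --unique)

printf 'with protocol:\n%s\n' "$with_protocol"
printf 'without protocol:\n%s\n' "$without_protocol"
```

The line without a URL is dropped in both cases because <code>--silent</code> prints only lines on which the address <code>/[[:alpha:]]+:\/\//</code> matched.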
 +
 +
== Summing numbers and lists ==
 +
 +
<syntaxhighlight lang="bash" style="font-size:smaller;">
 +
# we want to add up the column of file sizes: 1798891, 2804087, etc.
 +
# Dependency: awk (processes text fields and files)
 +
# Dependency: bc (an arbitrary-precision calculator language)
 +
ls -l *importsplit* | head -n 5
 +
# -rw-r--r-- 1 myusername myusername 1798891 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1108_importsplit_01.rdf._normalized.ttl.trig.gz
 +
# -rw-r--r-- 1 myusername myusername 2804087 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1546_importsplit_01.rdf._normalized.ttl.trig.gz
 +
# -rw-r--r-- 1 myusername myusername  862051 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1546_importsplit_02.rdf._normalized.ttl.trig.gz
 +
# -rw-r--r-- 1 myusername myusername 2276286 Jun 30 16:32 Thread-01_botanicalcollections.be_20220511-1106_importsplit_01.rdf._normalized.ttl.trig.gz
 +
# -rw-r--r-- 1 myusername myusername  692749 Jun 30 16:32 Thread-01_botanicalcollections.be_20220511-1106_importsplit_02.rdf._normalized.ttl.trig.gz
 +
ls -l *importsplit* | awk '{ print $5 }' | paste --serial --delimiters=+ - | bc
 +
# 362150562
 
</syntaxhighlight>
 
</syntaxhighlight>
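The same sum can also be computed by <code>awk</code> alone, which avoids the dependency on <code>bc</code>. A small sketch that feeds three of the file sizes from the listing in as a one-column list; note that <code>awk</code> calculates in floating point, so for very large totals the <code>paste</code> and <code>bc</code> variant stays exact.

```shell
# Three file sizes from the listing, one per line
sizes=$(printf '%s\n' 1798891 2804087 862051)
# add the first field of every line and print the total at the end
total=$(printf '%s\n' "$sizes" | awk '{ sum += $1 } END { print sum }')
echo "$total"
```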
  
Line 277: Line 326:
 
#!/bin/bash
 
#!/bin/bash
 
# IFS is normally " \t\n", but that is a problem in for loops because spaces split file names in the wrong places
 
# IFS is normally " \t\n", but that is a problem in for loops because spaces split file names in the wrong places
OLDIF=$IFS
+
OLDIFS=$IFS
 
IFS=$'\n'
 
IFS=$'\n'
 
for datei in *.{jpg,jpeg,JPG,JPEG};do
 
for datei in *.{jpg,jpeg,JPG,JPEG};do
Line 317: Line 366:
 
   #        ____________________________ for sort -t '/' -k3.1 --debug
 
   #        ____________________________ for sort -t '/' -k3.1 --debug
 
   #                        ____________ for sort -t '/' -k4.1 --debug
 
   #                        ____________ for sort -t '/' -k4.1 --debug
 +
 +
sort --field-separator=$'\t' --stable --key=1,4 --unique filename-tab-separated-data.tsv
 +
  # sort uniquely on fields 1 to 4 only (--key=1,4); the obsolete zero-based spelling of this key is "+0 -4"
 
</syntaxhighlight>
 
</syntaxhighlight>
  
Line 539: Line 591:
 
<syntaxhighlight lang="bash" style="font-size:smaller;">
 
<syntaxhighlight lang="bash" style="font-size:smaller;">
 
# table head (and adding an index column)
 
# table head (and adding an index column)
 +
cat institutionID_20220808.json  | jq --raw-output '.head.vars | @tsv' | sed --regexp-extended 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
 
cat institutionID_20220808.json  | jq -r '.head.vars | @tsv' | sed -r 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
 
cat institutionID_20220808.json  | jq -r '.head.vars | @tsv' | sed -r 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
 +
# sed: s@^@| # | @;  add index column-header | # | to line start
 +
# sed: s@$@ |@;      append closing table row at the end |
 +
# sed: s@[\t]@ | @g; the values are tab-separated, so replace each \t by | (the column or table data cell separator)
 +
# sed: h;            put this ready formatted header into hold space buffer
 +
# sed: s@[^|]@-@g;  replace all but “|” by “-” to make the markdown header separation
 +
# sed: x;            exchange hold space buffer (formatted 1st table row) with the markdown header separation
 +
# sed: G;            now we have only the first table row in place and append (G) the markdown header separation by a \n
 +
#                    and get a nice complete table header:
 
# | # | cspp_example | institutionID | publisher | graph |
 
# | # | cspp_example | institutionID | publisher | graph |
 
# |---|--------------|---------------|-----------|-------|
 
# |---|--------------|---------------|-----------|-------|
  
 
# table body
 
# table body
   # understand table and sort it (use debug)
+
   # understand table data but sort them by another column (use sort --debug to find out)
 
   cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv'
 
   cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv'
 +
    # show tab separator output
 +
 
 +
  # we use “|” both as the column separator and as the output separator of the column command
 +
  cat institutionID_20220808.json  | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
 +
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b --debug
 +
  # short options
 
   cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
 
   cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
  | sed -r 's@^@|~ @; s@$@ |~@; s@[\t]@ |~ @g;' | column -t -s '|' | sort -t '~' -k5.1b --debug
+
    | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b --debug  
    # cat institutionID_20220808.json | jq -r '.results.bindings[] | [ .[].value ] |@tsv' | sed -r 's@^@|~ @; s@$@ |~@; s@[\t]@ |~ @g;' | column -t -s '|' | sort -t '~' -k5.1b --debug  
+
     # sort -t '|' -k5.1b using | as table sort separator, and based on that sort the 5th field, use 1st character in the 5th field to line end,
     # sort -t '~' -k5.1b using ~ as table sort separator, and based on that sort the 5th field, use 1st character in the 5th field to line end, and ignore b(lanks)
+
    # -k5.1b ignore b(lanks)
 +
    # -k5.1Vb version sort, ignore b(lanks)
 +
    # -k5.1n numeric sort (etc.)
  
   # table body (and adding an index column)
+
   # table body (and adding an index column (sed -r "=;"))
 +
  # long options
 +
  cat institutionID_20220808.json | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
 +
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b \
 +
    | sed "=" | sed --regexp-extended "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
 
   cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
 
   cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
     | sed -r 's@^@|~ @; s@$@ |~@; s@[\t]@ |~ @g;' | column -t -s '|' | sort -t '~' -k5.1b \
+
     | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b \
     | sed -r "s@~@|@g;=;" | sed -r "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
+
     | sed "=" | sed -r "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
 +
  # | 1 | https://admont.jacq.org/ADMONT100680                          | http://viaf.org/viaf/128466393  | ADMONT | http://admont.jacq.org                    |
 +
  # | 2 | https://bak.jacq.org/BAK0-0000001                              | https://ror.org/006m4q736        | BAK    | http://bak.jacq.org                        |
 +
  # | 3 | https://www.botanicalcollections.be/specimen/BR0000005065868  | https://ror.org/01h1jbk91        |        | http://botanicalcollections.be/specimen/  |
 +
  # | … | …                                                              | …                                | …      | …                                          |
 +
</syntaxhighlight>
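The hold-space steps used for the table head (<code>h</code>, <code>x</code>, <code>G</code>) are easier to follow on a header with only two columns. A minimal sketch with an invented tab-separated header line, assuming GNU <code>sed</code> so that <code>\t</code> is understood as a tab:

```shell
# Hypothetical two-column header, tab-separated
header=$(printf 'name\tcity\n' \
  | sed --regexp-extended 's@^@| # | @; s@$@ |@; s@\t@ | @g; h; s@[^|]@-@g; x; G')
# h  saves the formatted header row in the hold space
# s@[^|]@-@g turns the pattern space into the separator row
# x  swaps both rows, G appends the separator row after a newline
printf '%s\n' "$header"
# | # | name | city |
# |---|------|------|
```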
 +
 
 +
== Sed (short for ''stream editor'')<span id="sed_stream_editor"></span> ==
 +
 
 +
Tutorials
 +
- https://snipcademy.com/shell-scripting-sed – good schematic diagrams of the command sequences
 +
 
 +
== Functions for BASH-Programming ==
 +
 
 +
<syntaxhighlight lang="bash" style="font-size:smaller;">
 +
comment_exit_code() {
 +
    # unused
 +
    # -------------------------------
 +
    # Usage:
 +
    #  comment_exit_code $exit_code
 +
    #  comment_exit_code $exit_code "Some more exact comment what was done"
 +
    # -------------------------------
 +
    local this_exit_code=$1
 +
    local this_comment=${2-}
 +
 
 +
    case $this_exit_code in [1-9]|[1-9][0-9]|[1-9][0-9][0-9])
 +
      if [[ "${#this_comment}" -lt 1 ]];then
 +
      echo -e "${ORANGE}Something unexpected happened. Exit Code: ${this_exit_code} $(kill -l $this_exit_code)${NOFORMAT}"
 +
      else
 +
      echo -e "${ORANGE}Something unexpected happened: ${this_comment}. Exit Code: ${this_exit_code} $(kill -l $this_exit_code)${NOFORMAT}"
 +
      fi
 +
      ;;
 +
    esac
 +
}
 +
 
 +
repeat_text() {
 +
    # -------------------------------
 +
    # Usage:
 +
    #  repeat_text n-times text
 +
    #  repeat_text 10 '.'
 +
    #    prints 10 dots: ..........
 +
    #  repeat_text 10 '.' storingvariablename
 +
    #    stores 10 dots to $storingvariablename
 +
    # -------------------------------
 +
    # $1=number of patterns to repeat
 +
    # $2=pattern
 +
    # $3=output variable name
 +
    local tmp
 +
    local local_1=$1
 +
    local local_2=$2
 +
    local local_3=${3-}
 +
    printf -v tmp '%*s' "$local_1"
 +
    if [[ "$local_3" ]];then
 +
      printf -v "$local_3" '%s' "${tmp// /$local_2}"
 +
    else
 +
      printf '%s' "${tmp// /$local_2}"
 +
    fi
 +
}
 +
 
 +
setup_colors() {
 +
  # 0 - Normal Style; 1 - Bold; 2 - Dim; 3 - Italic; 4 - Underlined; 5 - Blinking; 7 - Reverse; 8 - Invisible;
 +
  if [[ -t 2 ]] && [[ -z "${NO_COLOR-}" ]] && [[ "${TERM-}" != "dumb" ]]; then
 +
    NOFORMAT='\033[0m'
 +
    BOLD='\033[1m' ITALIC='\033[3m'
 +
    BLUE='\033[0;34m' BLUE_BOLD='\033[1;34m' BLUE_ITALIC='\033[3;34m'
 +
    CYAN='\033[0;36m' CYAN_BOLD='\033[1;36m' CYAN_ITALIC='\033[3;36m'
 +
    GREEN='\033[0;32m' GREEN_BOLD='\033[1;32m' GREEN_ITALIC='\033[3;32m'
 +
    ORANGE='\033[0;33m' ORANGE_BOLD='\033[1;33m' ORANGE_ITALIC='\033[3;33m'
 +
    PURPLE='\033[0;35m' PURPLE_BOLD='\033[1;35m' PURPLE_ITALIC='\033[3;35m'
 +
    RED='\033[0;31m' RED_BOLD='\033[1;31m' RED_ITALIC='\033[3;31m'
 +
    YELLOW='\033[1;33m' YELLOW_BOLD='\033[1;33m' YELLOW_ITALIC='\033[3;33m'
 +
  else
 +
    NOFORMAT=''
 +
    BOLD='' ITALIC=''
 +
    BLUE='' BLUE_BOLD='' BLUE_ITALIC=''
 +
    CYAN='' CYAN_BOLD='' CYAN_ITALIC=''
 +
    GREEN='' GREEN_BOLD='' GREEN_ITALIC=''
 +
    ORANGE='' ORANGE_BOLD='' ORANGE_ITALIC=''
 +
    PURPLE='' PURPLE_BOLD='' PURPLE_ITALIC=''
 +
    RED='' RED_BOLD='' RED_ITALIC=''
 +
    YELLOW='' YELLOW_BOLD='' YELLOW_ITALIC=''
 +
  fi
 +
}
 +
setup_colors
 +
 
 +
test_dependencies() {
 +
  local exit_level=0
 +
 
 +
  if ! [[ -e "${working_directory}/${file_input}" ]];then
 +
    echo -e "${ORANGE}# We can not find the data in ${NOFORMAT}${working_directory}/${file_input}${ORANGE} (stop)${NOFORMAT}";
 +
    exit_level=1;
 +
  fi
 +
  if ! [[ -x "$(command -v "$bin_dwcagent")" ]]; then
 +
    printf "${ORANGE}Command${NOFORMAT} $bin_dwcagent ${ORANGE} to parse names was not found. See https://libraries.io/rubygems/dwc_agent${NOFORMAT}\n"; exit_level=1;
 +
  fi
 +
  if ! [[ -x "$(command -v awk)" ]]; then
 +
    printf "${ORANGE}Command${NOFORMAT} awk ${ORANGE} to read the data was not found. Please install it in your software management system.${NOFORMAT}\n"; exit_level=1;
 +
  fi
 +
  case $exit_level in [1-9])
 +
    printf "${ORANGE}(stop)${NOFORMAT}\n";
 +
    exit 1;;
 +
  esac
 +
}
 +
test_dependencies
 +
 
 +
processinfo() {
 +
  echo -e "${GREEN}# ---------------------------- ${NOFORMAT}"
 +
  echo -e "${GREEN}# Description: We read ${NOFORMAT}${file_input}${GREEN} and search for name lists of multiple names (in this case: containing an ampersand &) …${NOFORMAT}"
 +
  echo -e "${GREEN}# We would parse it with ${NOFORMAT}${bin_dwcagent}${GREEN} and …${NOFORMAT}"
 +
  echo -e "${GREEN}# … would write all parsed names into …${NOFORMAT}"
 +
  echo -e "${GREEN}#  ${NOFORMAT}${file_output}"
 +
  echo -e "${GREEN}#  ${NOFORMAT}${file_output_unique}"
 +
  echo -e "${GREEN}# We would parse names from single text lines, which is slow but overall more accurate.${NOFORMAT}"
 +
  echo -e "${GREEN}#  ($N_parallel parallel executions of dwcagent)${NOFORMAT}"
 +
}
  
# | 1 | https://admont.jacq.org/ADMONT100680                          | http://viaf.org/viaf/128466393  | ADMONT | http://admont.jacq.org                    |
 
# | 2 | https://bak.jacq.org/BAK0-0000001                              | https://ror.org/006m4q736        | BAK    | http://bak.jacq.org                        |
 
# | 3 | https://www.botanicalcollections.be/specimen/BR0000005065868  | https://ror.org/01h1jbk91        |        | http://botanicalcollections.be/specimen/  |
 
# | … | …                                                              | …                                | …      | …                                          |
 
 
</syntaxhighlight>
 
</syntaxhighlight>
 +
 +
== Date and Time ==
 +
 +
<syntaxhighlight lang="bash" style="font-size:smaller;">
 +
# seconds to days hours min sec (→ https://unix.stackexchange.com/a/338844 “bash - Displaying seconds as days/hours/mins/seconds?”)
 +
seconds="755";date --utc --date="@$seconds" +"$(( $seconds/3600/24 )) days %H hours %Mmin %Ssec"
 +
</syntaxhighlight>
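<code>date</code> itself only formats the time of day, which is why the one-liner computes the number of days with shell arithmetic. The same conversion written out with integer arithmetic only, as a sketch (the variable names are chosen for this example):

```shell
seconds=93784                            # example value
days=$((    seconds / 86400 ))           # whole days
hours=$((   seconds % 86400 / 3600 ))    # remaining whole hours
minutes=$(( seconds % 3600 / 60 ))       # remaining whole minutes
secs=$((    seconds % 60 ))              # remaining seconds
duration="$days days $hours hours $minutes min $secs sec"
echo "$duration"
# 1 days 2 hours 3 min 4 sec
```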
 +
 +
Measuring how long a process takes
 +
<syntaxhighlight lang="bash" style="font-size:smaller;">
 +
#!/bin/bash
 +
if ! command -v datediff &> /dev/null &&  ! command -v dateutils.ddiff &> /dev/null
 +
then
 +
  echo -e "\e[31m# Error: Neither datediff nor dateutils.ddiff could be found. Please install package dateutils.\e[0m"
 +
  do_exit=1
 +
else
 +
  if ! command -v datediff &> /dev/null
 +
  then
 +
    # echo "Command dateutils.ddiff found"
 +
    exec_datediff="dateutils.ddiff"
 +
  else
 +
      # echo "Command datediff found"
 +
      exec_datediff="datediff"
 +
  fi
 +
fi
 +
 +
datetime_start=`date --rfc-3339 'ns'` ;
 +
 +
echo "Sleep for 5 seconds… or some other process is going on …"; sleep 5; echo "Completed";
 +
 +
datetime_end=`date --rfc-3339 'ns'`;
 +
 +
 +
echo $( date --date="$datetime_start" '+# Started: %Y-%m-%d %H:%M:%S%:z' )
 +
echo $( date --date="$datetime_end"  '+# Ended:   %Y-%m-%d %H:%M:%S%:z' )
 +
#  echo "# Started: $datetime_start"
 +
#  echo "# Ended:  $datetime_end" 
 +
 
 +
$exec_datediff "$datetime_start" "$datetime_end" -f "# Done. This took %dd  %0Hh:%0Mm:%0Ss to do something"
 +
</syntaxhighlight>
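Where the dateutils package is not available, whole-second durations can be measured with <code>date +%s</code> alone. A minimal sketch, assuming GNU <code>date</code> for the formatting step and a run time below 24 hours:

```shell
start_epoch=$(date +%s)
sleep 1                                  # placeholder for the real work
end_epoch=$(date +%s)
elapsed=$(( end_epoch - start_epoch ))   # whole seconds
# format the difference as HH:MM:SS (only valid below 24 hours)
elapsed_hms=$(date --utc --date="@$elapsed" +%H:%M:%S)
echo "# Done. This took $elapsed_hms"
```

This trades the nanosecond resolution of <code>date --rfc-3339 'ns'</code> for having no extra dependency.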
 +
 +
<syntaxhighlight lang="bash" style="font-size:smaller;">
 +
get_timediff_for_njobs_new () {
 +
  # Description: calculate estimated time to finish n jobs and the estimated total time
 +
  # ---------------------------------
 +
  # Dependency: package dateutils
 +
  # ---------------------------------
 +
  # Usage:
 +
  # get_timediff_for_njobs_new --test # to check for dependencies (datediff)
 +
  # get_timediff_for_njobs_new begintime nowtime ntotaljobs njobsnowdone
 +
  # get_timediff_for_njobs_new "2021-12-06 16:47:29" "2021-12-09 13:38:08" 696926 611613
 +
  # ---------------------------------
 +
  # echo '('`date +"%s.%N"` ' * 1000)/1' | bc # get milliseconds
 +
  # echo '('`date +"%s.%N"` ' * 1000000)/1' | bc # get nanoseconds
 +
  # echo $( date --rfc-3339 'ns' ) | ( read -rsd '' x; echo ${x@Q} ) # escaped
 +
  # ---------------------------------
 +
   
 +
  local this_command_timediff
 +
 
 +
  # read if test mode to check commands
 +
  while [[ "$#" -gt 0 ]]
 +
  do
 +
    case $1 in
 +
      -t|--test)
 +
        doexit=0
 +
        if ! command -v datediff &> /dev/null &&  ! command -v dateutils.ddiff &> /dev/null
 +
        then
 +
          echo -e "# \e[31mError: Neither datediff nor dateutils.ddiff could be found. Please install package dateutils.\e[0m"
 +
          doexit=1
 +
        fi
 +
        if ! command -v sed &> /dev/null
 +
        then
 +
          echo -e "# \e[31mError: command sed (stream editor) could not be found. Please install package sed.\e[0m"
 +
          doexit=1
 +
        fi
 +
        if ! command -v bc &> /dev/null
 +
        then
 +
          echo -e "# \e[31mError: command bc (arbitrary precision calculator) could not be found. Please install package bc.\e[0m"
 +
          doexit=1
 +
        fi
 +
        if [[ $doexit -gt 0 ]];then
 +
          exit 1;
 +
        else
 +
          return 0 # success: all dependencies were found, leave the function
 +
        fi
 +
      ;;
 +
      *)
 +
      break
 +
      ;;
 +
    esac
 +
  done
 +
 
 +
  if ! command -v datediff &> /dev/null
 +
  then
 +
    # echo "Command dateutils.ddiff found"
 +
    this_command_timediff="dateutils.ddiff"
 +
  else
 +
      # echo "Command datediff found"
 +
      this_command_timediff="datediff"
 +
  fi
 +
 +
  # START estimate time to do
 +
  # convert also "2022-06-30_14h56m10s" to "2022-06-30 14:56:10"
 +
  this_given_start_time=$( echo "$1" | sed -r 's@([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})[_[:space:]-]([[:digit:]]{2})h([[:digit:]]{2})m([[:digit:]]{2})s@\1 \2:\3:\4@' )
 +
  this_given_now_time=$(  echo "$2" | sed -r 's@([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})[_[:space:]-]([[:digit:]]{2})h([[:digit:]]{2})m([[:digit:]]{2})s@\1 \2:\3:\4@' )
 +
 
 +
  local this_unixnanoseconds_start_timestamp=$(date --date="$this_given_start_time" '+%s.%N')
 +
  local this_unixnanoseconds_now=$(date --date="$this_given_now_time" '+%s.%N')
 +
  local this_unixseconds_todo=0
 +
  local this_n_jobs_all=$(expr $3 + 0)
 +
  local this_i_job_counter=$(expr $4 + 0)
 +
  # echo "scale=10; 1642073008.587244684 - 1642028400.000000000" | bc -l
 +
  local this_timediff_unixnanoseconds=`echo "scale=10; $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp" | bc -l`
 +
  # $(( $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp ))
 +
  local this_n_jobs_todo=$(( $this_n_jobs_all - $this_i_job_counter ))
 +
  local this_msg_estimated_sofar=""
 +
 +
  # echo -e "\033[2m# DEBUG Test mode: all together $this_n_jobs_all ; counter $this_i_job_counter\033[0m"
 +
  if [[ $this_n_jobs_all -eq $this_i_job_counter ]];then # done
 +
    this_unixseconds_todo=0
 +
    # njobs_done_so_far=`$this_command_timediff "@$this_unixnanoseconds_start_timestamp" "@$this_unixnanoseconds_now" -f "all $this_i_job_counter done, duration %dd %0Hh:%0Mm:%0Ss"`
 +
    this_msg_estimated_sofar="nothing left to do"
 +
  else
 +
    # this_unixseconds_todo=$(( $this_timediff_unixnanoseconds * $this_n_jobs_todo / $this_i_job_counter ))
 +
    # this_unixseconds_todo=$(( $this_timediff_unixnanoseconds * $this_n_jobs_todo / $this_i_job_counter ))
 +
    this_unixseconds_todo=`echo "scale=0; $this_timediff_unixnanoseconds * $this_n_jobs_todo / $this_i_job_counter" | bc -l`
 +
   
 +
    job_singular_or_plural=$([ $this_n_jobs_todo -gt 1 ]  && echo jobs  || echo job )
 +
    if [[ $this_unixseconds_todo -ge $(( 60 * 60 * 24 * 2 )) ]];then
 +
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0ddays %0Hh:%0Mmin:%0Ssec"`
 +
    elif [[ $this_unixseconds_todo -ge $(( 60 * 60 * 24 )) ]];then
 +
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0dday %0Hh:%0Mmin:%0Ssec"`
 +
    elif [[ $this_unixseconds_todo -ge $(( 60 * 60 * 1 )) ]];then
 +
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0Hh:%0Mmin:%0Ssec"`
 +
    elif [[ $this_unixseconds_todo -lt $(( 60 * 60 * 1 )) ]];then
 +
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0Mmin:%0Ssec"`
 +
    fi
 +
  fi
 +
 
 +
  this_unixseconds_done=`printf "%.0f" $(echo "scale=0; $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp" | bc -l)`
 +
  this_unixseconds_total=`printf "%.0f" $(echo "scale=0; $this_unixseconds_done + $this_unixseconds_todo" | bc -l)` 
 +
  if [[ $this_unixseconds_total -ge $(( 60 * 60 * 24 * 2 )) ]];then
 +
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0ddays %0Hh:%0Mmin:%0Ssec"`
 +
  elif [[ $this_unixseconds_total -ge $(( 60 * 60 * 24 )) ]];then
 +
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0dday %0Hh:%0Mmin:%0Ssec"`
 +
  elif [[ $this_unixseconds_total -ge $(( 60 * 60 * 1 )) ]];then
 +
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0Hh:%0Mmin:%0Ssec"`
 +
  elif [[ $this_unixseconds_total -lt $(( 60 * 60 * 1 )) ]];then
 +
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0Mmin:%0Ssec"`
 +
  fi
 +
  if ! [[ $this_unixseconds_todo -eq 0 ]];then this_msg_time_total="estimated $this_msg_time_total"; fi
 +
 
 +
  #echo "from $this_n_jobs_all, $njobs_done_so_far; $this_msg_estimated_sofar"
 +
  echo "${this_msg_estimated_sofar} (${this_msg_time_total})"
 +
  # END estimate time to do
 +
}
 +
export -f get_timediff_for_njobs_new # export needed otherwise /usr/bin/bash: get_timediff_for_njobs_new: command not found
 +
get_timediff_for_njobs_new --test
 +
</syntaxhighlight>
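Stripped of the formatting, the estimate in <code>get_timediff_for_njobs_new</code> is a rule of three: the time spent so far, multiplied by the jobs still open, divided by the jobs already done. A whole-second sketch of just that step; <code>estimate_remaining_seconds</code> is a name made up for this example, and GNU <code>date</code> is assumed:

```shell
estimate_remaining_seconds() (
  # $1=start time  $2=current time  $3=total number of jobs  $4=jobs done so far
  start_epoch=$(date --utc --date="$1" +%s)
  now_epoch=$(date --utc --date="$2" +%s)
  jobs_todo=$(( $3 - $4 ))
  # elapsed seconds * remaining jobs / finished jobs (integer division)
  echo $(( (now_epoch - start_epoch) * jobs_todo / $4 ))
)

# 600 s elapsed, 25 of 100 jobs done, 75 still open
remaining=$(estimate_remaining_seconds "2022-01-01 00:00:00" "2022-01-01 00:10:00" 100 25)
echo "$remaining seconds left"
```

The sketch divides by the number of finished jobs, so it must not be called before at least one job is done.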
 +
 +
 +
{{Literaturverzeichnis}}

Revision as of 19 June 2023, 11:02

Brief summaries of important recommendations:

Finding file names

?      # Exactly one arbitrary character
*      # Any number (including 0) of arbitrary characters
[def]  # One of the listed characters
[^def] # None of the listed characters
[!def] # Same as above
[a-d]  # Any character from the range
ls -d /[a-d]* # Directories → /bin  /boot  /dev

Comparing sorted files

Show the contents of both files:

cat Datei_1.txt # should be sorted
drei-beide
eins-1
fünf  -1
fünf-beide
zwei-1
cat Datei_2.txt # should be sorted
acht-2
drei-beide
fünf-beide
sechs-2
zweilerlei-2
zweitens-2


Default output:

comm Datei_1.txt Datei_2.txt # three columns are printed
        acht-2
                drei-beide
eins-1
fünf  -1
                fünf-beide
        sechs-2
zwei-1
        zweilerlei-2
        zweitens-2

The columns mean:

  • Column 1: lines only in Datei_1.txt
  • Column 2: lines only in Datei_2.txt
  • Column 3: lines in both files

The comm command can suppress any of these three output columns by option:

  • comm -1 suppresses output column 1 (remaining: lines only in Datei_2 + lines in both)
  • comm -12 suppresses output columns 1 + 2 (remaining: lines in both)
  • comm -13 suppresses output columns 1 + 3 (remaining: lines only in Datei_2)
  • etc.


comm -23 Datei_1.txt Datei_2.txt 
# suppress output columns 2 + 3, keep column 1: lines only in Datei_1
eins-1
fünf  -1
zwei-1
comm -13 Datei_1.txt Datei_2.txt 
# suppress output columns 1 + 3, keep column 2: lines only in Datei_2
acht-2
sechs-2
zweilerlei-2
zweitens-2
comm -12 Datei_1.txt Datei_2.txt 
# suppress output columns 1 + 2, keep column 3: lines in both Datei_1 and Datei_2
drei-beide
fünf-beide

Compare Two URL Lists

Suppose you have two lists of URLs, one old and one new, and you want only those URLs that are new compared to the old list. The following example assumes CSV (comma-separated values) or TSV (tab-separated values) input and tries to extract just the URL from each line, ignoring any text after it.

For this we use the command
comm ‹-options› oldlist_sorted comparelist_sorted or, more generally,
comm ‹-options› file_1_sorted file_2_sorted, which produces three output columns:

column-1         column-2        column-3
only-in-file_1
                 only-in-file_2
                                 in-file_1-and-2

With comm you can suppress one or two of these three output columns using the following options:

  • comm -1 suppresses output column 1 (columns 2 and 3 remain, i.e. lines only in file_2 plus lines in both files)
  • comm -12 suppresses output columns 1 + 2 (column 3 remains, i.e. lines in both file_1 and file_2)
  • comm -13 suppresses output columns 1 + 3 (column 2 remains, i.e. lines only in file_2)
  • comm -23 suppresses output columns 2 + 3 (column 1 remains, i.e. lines only in file_1)
  • etc.
# # # # # # # # # # # # # #
# Check for URI-Differences old list vs. new list (in general for CSV or TSV lists)
# ```bash
# comm file1.txt file2.txt 
# LIST1-only-of-file1 LIST2-only-of-file2 LIST3-both-in-and-of-file1-file2
#
# comm -13 donelistsorted comparelistsorted > todolistsorted
donelist_source=urilist_Naturalis_20220516.csv;
donelist_sorted=${donelist_source%.*}_sorted.tsv;
donelist_sorted_noprotocol=${donelist_source%.*}_sorted_noprotocol.tsv;

comparelist_source=urilist_Naturalis_20220817.tsv;
comparelist_sorted=${comparelist_source%.*}_sorted.tsv;
comparelist_sorted_noprotocol=${comparelist_source%.*}_sorted_noprotocol.tsv;

todolist_sorted=${comparelist_source%.*}_todo.tsv;
todolist_sorted_noprotocol=${comparelist_source%.*}_todo_noprotocol.tsv;

# assume CSV (comma separated values) or TSV (tab separated values)
# assume each URL starts at the beginning of its line; everything after the URL (from the word boundary on) is ignored

# compare after removing the protocol part (http://, https://, ftp://, sftp://, etc.) and any enclosing <…>
# without protocol
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?[[:alpha:]]+://([^[:space:],]+)>?\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?[[:alpha:]]+://([^[:space:],]+)>?\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted_noprotocol"
  # only in done-list
  # comm -23 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
  # only in compare-list (the new URLs that are still to do)
  comm -13 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
  grep --count "/" "$todolist_sorted_noprotocol";
# with protocol
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?([[:alpha:]]+://[^[:space:],]+)>?\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
  sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*<?([[:alpha:]]+://[^[:space:],]+)>?\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted"
  # only in done-list
  # comm -23 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
  # only in compare-list (the new URLs that are still to do)
  comm -13 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
  grep --count "/" "$todolist_sorted";
# ```
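
The column logic of `comm` can be tried on two tiny lists first. This is only a minimal sketch: the URLs and the temporary files are hypothetical sample data, not taken from the lists above.

```bash
#!/bin/bash
# Hypothetical sample data; the real lists would come from files.
old_list=$(mktemp)
new_list=$(mktemp)
printf '%s\n' 'http://example.org/a' 'http://example.org/b' | sort > "$old_list"
printf '%s\n' 'http://example.org/b' 'http://example.org/c' | sort > "$new_list"

# -13 suppresses columns 1 and 3: only lines unique to the second file remain
only_new=$(comm -13 "$old_list" "$new_list")
# -12 suppresses columns 1 and 2: only lines found in both files remain
in_both=$(comm -12 "$old_list" "$new_list")

echo "new:  $only_new"   # → new:  http://example.org/c
echo "both: $in_both"    # → both: http://example.org/b
rm -f "$old_list" "$new_list"
```

Both inputs must be sorted with the same locale settings, otherwise `comm` complains that the input is not in sorted order.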

Summing Numbers, Lists

# we want to add up the column of file sizes: 1798891, 2804087 and so on
# dependency: awk (processes text fields and files)
# dependency: bc (an arbitrary precision calculator language)
ls -l *importsplit* | head -n 5 
# -rw-r--r-- 1 myusername myusername 1798891 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1108_importsplit_01.rdf._normalized.ttl.trig.gz
# -rw-r--r-- 1 myusername myusername 2804087 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1546_importsplit_01.rdf._normalized.ttl.trig.gz
# -rw-r--r-- 1 myusername myusername  862051 Jun 30 16:32 Thread-01_botanicalcollections.be_20220509-1546_importsplit_02.rdf._normalized.ttl.trig.gz
# -rw-r--r-- 1 myusername myusername 2276286 Jun 30 16:32 Thread-01_botanicalcollections.be_20220511-1106_importsplit_01.rdf._normalized.ttl.trig.gz
# -rw-r--r-- 1 myusername myusername  692749 Jun 30 16:32 Thread-01_botanicalcollections.be_20220511-1106_importsplit_02.rdf._normalized.ttl.trig.gz
ls -l *importsplit* | awk '{ print $5 }' | paste --serial --delimiters=+ - | bc
# 362150562
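
If `bc` is not installed, the same sum can probably be done in `awk` alone. A minimal sketch, assuming the sizes arrive one per line; the three numbers are the first sizes from the listing above, and the variable names are hypothetical:

```bash
#!/bin/bash
# Hypothetical input: one size per line, standing in for column 5 of `ls -l`
sizes=$'1798891\n2804087\n862051'

# awk adds every value to a running sum and prints it after the last line
total=$(printf '%s\n' "$sizes" | awk '{ sum += $1 } END { print sum }')
echo "$total"   # → 5465029
```

For very large totals `bc` remains the safer choice, because it calculates with arbitrary precision.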

Replacing BASH Short Options with Long Options (automated from the manual documentation (man pages))

# from the manual of `iptables`, pick out the options …
# [!] -s, --short-spelled-out-option as well as …
#     -s, --short-spelled-out-option …
# … turn them into a sed command and print it neatly in columns
  man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
     | sed --regexp-extended 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
     | sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'
# Usage sed --file=short-options2long-options4rules.v4.sed rules.v4 # print the changes to the screen
# Usage sed --in-place --file=short-options2long-options4rules.v4.sed rules.v4 # replace without backup
# Usage sed --in-place=.backup_20220802 --file=short-options2long-options4rules.v4.sed rules.v4 # replace with backup: rules.v4.backup_20220802
#   man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
#      | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; # &/g;' \
#      | sort --ignore-case
#   man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
#      | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
#      | sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'
s@ -4 @ --ipv4 @g;            #        -4, --ipv4
s@ -6 @ --ipv6 @g;            #        -6, --ipv6
s@^-A @--append @g;
s@ -A @ --append @g;          #        -A, --append chain rule-specification
s@ -C @ --check @g;           #        -C, --check chain rule-specification
s@ -c @ --set-counters @g;    #        -c, --set-counters packets bytes
s@ -D @ --delete @g;          #        -D, --delete chain rulenum ... -D, --delete chain rule-specification
s@ -d @ --destination @g;     #        [!] -d, --destination address[/mask][,...]
s@ -E @ --rename-chain @g;    #        -E, --rename-chain old-chain new-chain
s@ -F @ --flush @g;           #        -F, --flush [chain]
s@ -f @ --fragment @g;        #        [!] -f, --fragment
s@ -g @ --goto @g;            #        -g, --goto chain
s@ -i @ --in-interface @g;    #        [!] -i, --in-interface name
s@ -I @ --insert @g;          #        -I, --insert chain [rulenum] rule-specification
s@ -j @ --jump @g;            #        -j, --jump target
s@ -L @ --list @g;            #        -L, --list [chain]
s@ -m @ --match @g;           #        -m, --match match
s@ -N @ --new-chain @g;       #        -N, --new-chain chain
s@ -n @ --numeric @g;         #        -n, --numeric
s@ -o @ --out-interface @g;   #        [!] -o, --out-interface name
s@ -P @ --policy @g;          #        -P, --policy chain target
s@ -p @ --protocol @g;        #        [!] -p, --protocol protocol
s@ -R @ --replace @g;         #        -R, --replace chain rulenum rule-specification
s@ -S @ --list-rules @g;      #        -S, --list-rules [chain]
s@ -s @ --source @g;          #        [!] -s, --source address[/mask][,...]
s@ -t @ --table @g;           #        -t, --table table
s@ -v @ --verbose @g;         #        -v, --verbose
s@ -V @ --version @g;         #        -V, --version
s@ -w @ --wait @g;            #        -w, --wait [seconds]
s@ -X @ --delete-chain @g;    #        -X, --delete-chain [chain]
s@ -x @ --exact @g;           #        -x, --exact
s@ -Z @ --zero @g;            #        -Z, --zero [chain [rulenum]]
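
How such a generated sed script acts on a rule can be checked on a single line. A minimal sketch: the rule line and the temporary script file are hypothetical, and only three of the substitutions above are used.

```bash
#!/bin/bash
# Hypothetical script file holding three of the generated substitutions
sed_script=$(mktemp)
cat > "$sed_script" <<'EOF'
s@^-A @--append @g;
s@ -p @ --protocol @g;
s@ -j @ --jump @g;
EOF

# Hypothetical rule line as it might appear in a rules.v4 file
rule='-A INPUT -p tcp -j ACCEPT'
long_rule=$(printf '%s\n' "$rule" | sed --file="$sed_script")
echo "$long_rule"   # → --append INPUT --protocol tcp --jump ACCEPT
rm -f "$sed_script"
```

The substitutions only match short options that are surrounded by spaces (or, for `-A`, that start the line), so an option at the very end of a line would stay unchanged.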

Redirect Errors/Standard Output

See: https://www.thomas-krenn.com/de/wiki/Bash_stdout_und_stderr_umleiten

Function                                        Bash redirection
redirect stdout to a file                       program > file.txt
redirect stderr to a file                       program 2> file.txt
redirect stdout AND stderr to one file          program &> file.txt
redirect stdout and stderr to separate files    program > file_stdout.txt 2> file_stderr.txt
redirect stdout to stderr                       program 1>&2
redirect stderr to stdout                       program 2>&1
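
The redirections listed above can be observed with a small function that writes to both streams. A minimal sketch; the helper `emit` and the temporary files are hypothetical:

```bash
#!/bin/bash
# Hypothetical helper that writes one line to each stream
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

out_file=$(mktemp)
err_file=$(mktemp)

emit > "$out_file" 2> "$err_file"   # stdout and stderr go to separate files
only_out=$(emit 2> /dev/null)       # stderr is discarded, stdout is captured
both=$(emit 2>&1)                   # stderr is merged into stdout, both are captured

from_out_file=$(cat "$out_file")
from_err_file=$(cat "$err_file")
echo "$from_out_file"   # → to stdout
echo "$from_err_file"   # → to stderr
rm -f "$out_file" "$err_file"
```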

Parameter Substitution

echo {a,b}{1,2,3} # a1 a2 a3 b1 b2 b3
# compare the contents of archives
  diff <(tar tzf Buch1.tar.gz) <(tar tzf Buch.tar.gz)

d='message'
echo $d   # → message
echo ${d} # → message
# if d is unset, a default is substituted for this expansion only (d itself stays undefined)
  unset d
  echo ${d-default} # → default
  echo ${d-'*'}     # → *
  echo ${d-$1}      # → value of d or, if d is unset, the first parameter given
# if d is unset, the default is also assigned to d
  echo ${d=default} # → default
# if d is unset, the message is printed and the procedure is then abandoned:
  unset d
  echo ${d?message}
# A shell procedure that requires some parameters to be set might start as follows:
  : ${user?} ${acct?} ${bin?}
  # will print something like: "bash: user: parameter null or not set"

${string/substring/replacement} # replaces the first match
${string//substring/replacement} # replaces all matches
${string#substring} # Deletes shortest match of $substring from front of $string.
${string##substring} # Deletes longest match of $substring from front of $string.
${string%substring} # Deletes shortest match of $substring from back of $string.
${string%%substring} # Deletes longest match of $substring from back of $string.
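
Applied to a file path, the pattern operators above behave as follows. A minimal sketch; the path is a hypothetical example built from a file name used earlier:

```bash
#!/bin/bash
file='/data/urilist_Naturalis_20220817.tsv'   # hypothetical path

name=${file##*/}       # longest match of "*/" removed from the front
stem=${file%.*}        # shortest match of ".*" removed from the back
prefix=${file%%_*}     # longest match of "_*" removed from the back
rest=${file#*_}        # shortest match of "*_" removed from the front
first=${file/_/-}      # first "_" replaced
all=${file//_/-}       # every "_" replaced

echo "$name"     # → urilist_Naturalis_20220817.tsv
echo "$stem"     # → /data/urilist_Naturalis_20220817
echo "$prefix"   # → /data/urilist
echo "$rest"     # → Naturalis_20220817.tsv
echo "$first"    # → /data/urilist-Naturalis_20220817.tsv
echo "$all"      # → /data/urilist-Naturalis-20220817.tsv
```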

Command substitution

# commands in `...`
echo `pwd` # → /home/myusername → is the current working directory

ls `echo "$1"`
# is the same as
ls $1

set `date`; echo $6 $2 $3, $4 # → 2010 7. Dez, 17:28:44 (field order and names depend on the locale, here a German one)

for i in `ls -t`; do ... # list in time order (ls -t)
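
Bash also accepts the `$( … )` form of command substitution, which nests without escaping. The following is a minimal sketch of that form, together with a glob-based loop as one possible alternative to looping over `ls` output; the path and the temporary directory are hypothetical:

```bash
#!/bin/bash
# $( … ) can be nested directly, backticks would need escaping
parent_name=$(basename "$(dirname /home/myusername/file.txt)")
echo "$parent_name"   # → myusername

# Looping over a glob keeps file names with spaces intact
work_dir=$(mktemp -d)
touch "$work_dir/a b.txt" "$work_dir/c.txt"
count=0
for f in "$work_dir"/*.txt; do
  count=$(( count + 1 ))
done
echo "$count"   # → 2
rm -r "$work_dir"
```

A loop over `` `ls -t` `` splits names at whitespace, so "a b.txt" would arrive as two words; the glob does not sort by time, though.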

Extended mode

# the shell option extglob must be activated
help shopt # print help
shopt extglob
# extglob         off
shopt -s extglob; shopt extglob
# extglob         on
#shopt -u extglob; shopt extglob
## extglob         off

?(a|b|c) # zero or one occurrence of the enclosed patterns
*(a|b|c) # zero or more occurrences of the enclosed patterns
+(a|b|c) # one or more occurrences of the enclosed patterns
@(a|b|c) # exactly one of the enclosed patterns
!(a|b|c) # anything except the enclosed patterns

# list all directory names, beginning with "bi", "*+" or "us" 
ls -d /+(bi|*+|us)*
# /bin  /lost+found  /usr  

# list all directory names that neither begin with "b" nor have "o" as the 2nd character
ls -d /!(b*|?o*)
# /cdrom  /dev  /etc  /floppy  /lib  /mnt  /opt  /proc  /sbin  /tmp  /usr  /var
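
The extended patterns are not limited to file name expansion; they also work in `[[ … == pattern ]]` tests. A minimal sketch, where the function `classify` and the file names are hypothetical:

```bash
#!/bin/bash
shopt -s extglob

# Hypothetical helper: classify a file name by its extension
classify() {
  if   [[ $1 == *.@(jpg|jpeg|JPG|JPEG) ]]; then echo "jpeg"
  elif [[ $1 == *.+([0-9]) ]];             then echo "numbered"
  elif [[ $1 == !(*.*) ]];                 then echo "no extension"
  else                                          echo "other"
  fi
}

classify photo.JPG     # → jpeg
classify backup.2022   # → numbered
classify README        # → no extension
classify notes.txt     # → other
```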

Substrings

string="0123456789stop"
echo ${string:7}      # 789stop
echo ${string:0:7}    # 0123456
echo ${string:-10000} # 0123456789stop (":-" without a space is the default-value expansion, not a negative offset)
echo ${string: -5}    # 9stop
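
A few more combinations of offset and length on the same string, as a minimal sketch (the negative length assumes bash 4.2 or newer):

```bash
#!/bin/bash
string="0123456789stop"

length=${#string}        # number of characters
tail4=${string:10}       # from offset 10 to the end
two=${string: -4:2}      # 2 characters, starting 4 before the end
middle=${string:7:-4}    # from offset 7 up to 4 characters before the end

echo "$length"   # → 14
echo "$tail4"    # → stop
echo "$two"      # → st
echo "$middle"   # → 789
```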

For Loops

#!/bin/bash
# IFS is normally " \t\n"; it is set to newline here because splitting at spaces would break file names apart in the for loop
OLDIFS=$IFS
IFS=$'\n'
for datei in *.{jpg,jpeg,JPG,JPEG};do
  if [ -e "$datei" ];then
    echo "$datei (jpg > png) …";
    convert "$datei" "${datei%.*}.png" 
  fi
done
IFS=$OLDIFS
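
An alternative sketch of the same loop without touching IFS: the names produced by a glob are not split again, and `nullglob` makes the `-e` test unnecessary. The temporary directory and the file names are hypothetical, and the `convert` call is left out so that the example runs without ImageMagick.

```bash
#!/bin/bash
shopt -s nullglob   # patterns without a match expand to nothing
work_dir=$(mktemp -d)
touch "$work_dir/my holiday.jpg" "$work_dir/scan.JPEG" "$work_dir/notes.txt"

targets=()
for datei in "$work_dir"/*.{jpg,jpeg,JPG,JPEG}; do
  # quoting "$datei" keeps "my holiday.jpg" together as one word
  targets+=( "${datei%.*}.png" )
done
printf '%s\n' "${targets[@]##*/}"
# → my holiday.png
# → scan.png
rm -r "$work_dir"
shopt -u nullglob
```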

Useful Commands

# sort ls listing by domain Thread-…_wu.jacq.org_ rather than numeric by Thread-01…
file_pattern="Thread-*_gat.jacq.org*2022*-[0-9][0-9][0-9][0-9]_modified.rdf.gz"
ls $file_pattern | sed -r 's@(Thread-)[0-9]+_(.+)@& \2@' | sort -k 2 | sed -r 's@^([^[:space:]]+) .+$@\1@;'

stat --printf="# \e[32mfile name %n\e[0m (%s bytes)…" file
stat --printf="# \e[32mfile name %n\e[0m was modified %y" file
stat --format='%y' file | grep --only-matching --extended-regexp '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'

sort

Sorting URLs by domain, regardless of the protocol (http, https, ftp and so on):

sort -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv # split the fields at the separator character (-t) “/”
  # -k*3*. 1 b  → the sort (k)ey starts in field 3 …
  # -k 3 .*1*b  → … at the 1st character of that field and runs to the end of the line
  # -k 3 . 1*b* → ignore leading (b)lanks
  sort --debug -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv | head -n 6 # show which part of each line is actually used for sorting

  # https://admont.jacq.org/ADMONT100002 
  # ____________________________________ for sort -t '/' -k1.1 --debug
  #        _____________________________ for sort -t '/' -k2.1 --debug
  #         ____________________________ for sort -t '/' -k3.1 --debug
  #                         ____________ for sort -t '/' -k4.1 --debug

sort --field-separator=$'\t' --stable --key=1,4 --unique filename-tab-separated-data.tsv
  # sort uniquely on fields 1 to 4; in the obsolete syntax this was written +0 -4 (zero-indexed field counting)
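
The effect of the field separator “/” can be seen on three lines. A minimal sketch with hypothetical URLs; `LC_ALL=C` is set only to make the order reproducible:

```bash
#!/bin/bash
# Hypothetical URLs with different protocols
urls=$'https://b.example.org/2\nhttp://a.example.org/1\nftp://c.example.org/3'

# fields: 1 = "https:", 2 = "" (between the two slashes), 3 = domain
sorted=$(printf '%s\n' "$urls" | LC_ALL=C sort -t '/' -k3)
echo "$sorted"
# → http://a.example.org/1
# → https://b.example.org/2
# → ftp://c.example.org/3
```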

jq ~ Get Markdown Table from Dynamic Data Values

(the sample data:)

{ "head": {
    "vars": [ "cspp_example" , "institutionID" , "publisher" , "graph" ]
  } ,
  "results": {
    "bindings": [
      { 
        "cspp_example": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/p00039900" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03wkt5x30" } ,
        "publisher": { "type": "uri" , "value": "https://science.mnhn.fr/institution/mnhn/collection/p/item/search" } ,
        "graph": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/113251" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/0566bfb96" } ,
        "graph": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://id.herb.oulu.fi/0014586" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03yj89h83" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://id.snsb.info/snsb/collection/1000/1579/1000" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05th1v540" } ,
        "publisher": { "type": "literal" , "value": "http://www.snsb.info" } ,
        "graph": { "type": "uri" , "value": "http://id.snsb.info/snsb/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://lagu.jacq.org/object/AM-02278" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01j60ss54" } ,
        "publisher": { "type": "literal" , "value": "LAGU" } ,
        "graph": { "type": "uri" , "value": "http://lagu.jacq.org/object" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/K000989827" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00ynnr806" } ,
        "publisher": { "type": "uri" , "value": "https://www.kew.org" } ,
        "graph": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tbi.jacq.org/object/TBI1014287" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/051qn8h41" } ,
        "publisher": { "type": "literal" , "value": "TBI" } ,
        "graph": { "type": "uri" , "value": "http://tbi.jacq.org/object" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MHD.107807" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03tcx6c30" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.342315" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05vghhr25" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.863532" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/029pk6x14" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://admont.jacq.org/ADMONT100680" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128466393" } ,
        "publisher": { "type": "literal" , "value": "ADMONT" } ,
        "graph": { "type": "uri" , "value": "http://admont.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://bak.jacq.org/BAK0-0000001" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/006m4q736" } ,
        "publisher": { "type": "literal" , "value": "BAK" } ,
        "graph": { "type": "uri" , "value": "http://bak.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://boz.jacq.org/BOZ000001" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128699910" } ,
        "publisher": { "type": "literal" , "value": "BOZ" } ,
        "graph": { "type": "uri" , "value": "http://boz.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://brnu.jacq.org/BRNU000205" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/02j46qs45" } ,
        "publisher": { "type": "literal" , "value": "BRNU" } ,
        "graph": { "type": "uri" , "value": "http://brnu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://data.rbge.org.uk/herb/E00000001" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/0349vqz63" } ,
        "publisher": { "type": "uri" , "value": "http://www.rbge.org.uk" } ,
        "graph": { "type": "uri" , "value": "http://data.rbge.org.uk/herb/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://dr.jacq.org/DR000023" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/155418159" } ,
        "publisher": { "type": "literal" , "value": "DR" } ,
        "graph": { "type": "uri" , "value": "http://dr.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://ere.jacq.org/ERE0000012" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05mpgew40" } ,
        "publisher": { "type": "literal" , "value": "ERE" } ,
        "graph": { "type": "uri" , "value": "http://ere.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gat.jacq.org/GAT0000014" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/02skbsp27" } ,
        "publisher": { "type": "literal" , "value": "GAT" } ,
        "graph": { "type": "uri" , "value": "http://gat.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gjo.jacq.org/GJO0000012" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00nxtmb68" } ,
        "publisher": { "type": "literal" , "value": "GJO" } ,
        "graph": { "type": "uri" , "value": "http://gjo.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gzu.jacq.org/GZU000000208" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01faaaf77" } ,
        "publisher": { "type": "literal" , "value": "GZU" } ,
        "graph": { "type": "uri" , "value": "http://gzu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://hal.jacq.org/HAL0053120" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05gqaka33" } ,
        "publisher": { "type": "literal" , "value": "HAL" } ,
        "graph": { "type": "uri" , "value": "http://hal.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://herbarium.bgbm.org/object/B100000004" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00bv4cx53" } ,
        "publisher": { "type": "literal" , "value": "BGBM" } ,
        "graph": { "type": "uri" , "value": "http://herbarium.bgbm.org/object/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://id.smns-bw.org/smns/collection/275449/772800/279829" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05k35b119" } ,
        "graph": { "type": "uri" , "value": "http://id.smns-bw.org/smns/collection/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://je.jacq.org/JE00000020" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05qpz1x62" } ,
        "publisher": { "type": "literal" , "value": "JE" } ,
        "graph": { "type": "uri" , "value": "http://je.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://kiel.jacq.org/KIEL0007010" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/239180770" } ,
        "publisher": { "type": "literal" , "value": "KIEL" } ,
        "graph": { "type": "uri" , "value": "http://kiel.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://lz.jacq.org/LZ161177" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03s7gtk40" } ,
        "publisher": { "type": "literal" , "value": "LZ" } ,
        "graph": { "type": "uri" , "value": "http://lz.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://mjg.jacq.org/MJG000015" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/023b0x485" } ,
        "publisher": { "type": "literal" , "value": "MJG" } ,
        "graph": { "type": "uri" , "value": "http://mjg.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://pi.jacq.org/PI000648" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03ad39j10" } ,
        "publisher": { "type": "literal" , "value": "PI" } ,
        "graph": { "type": "uri" , "value": "http://pi.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://prc.jacq.org/PRC2535" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/024d6js02" } ,
        "publisher": { "type": "literal" , "value": "PRC" } ,
        "graph": { "type": "uri" , "value": "http://prc.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://tub.jacq.org/TUB002830" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03a1kwz48" } ,
        "publisher": { "type": "literal" , "value": "TUB" } ,
        "graph": { "type": "uri" , "value": "http://tub.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://ubt.jacq.org/UBT0010195" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/142509930" } ,
        "publisher": { "type": "literal" , "value": "UBT" } ,
        "graph": { "type": "uri" , "value": "http://ubt.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://w.jacq.org/W0000011a" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01tv5y993" } ,
        "publisher": { "type": "literal" , "value": "W" } ,
        "graph": { "type": "uri" , "value": "http://w.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://wu.jacq.org/WU0000004" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03prydq77" } ,
        "publisher": { "type": "literal" , "value": "WU" } ,
        "graph": { "type": "uri" , "value": "http://wu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://www.botanicalcollections.be/specimen/BR0000005065868" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01h1jbk91" } ,
        "graph": { "type": "uri" , "value": "http://botanicalcollections.be/specimen/" }
      }
    ]
  }
}

The result data also contain missing values. The point is to extract .head.vars and to use these values, held in the variable $fields, to query the .value of every field in each element of .results.bindings[]; the rest is formatting to get a nice-looking table:

# table head (and adding an index column)
cat institutionID_20220808.json  | jq --raw-output '.head.vars | @tsv' | sed --regexp-extended 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
cat institutionID_20220808.json  | jq -r '.head.vars | @tsv' | sed -r 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
# sed: s@^@| # | @;  add the index column header | # | at the line start
# sed: s@$@ |@;      append the closing | of the table row at the line end
# sed: s@[\t]@ | @g; the values are tab separated, so replace each \t by | (the column or table data cell boundaries)
# sed: h;            copy this formatted header row into the hold space
# sed: s@[^|]@-@g;   replace everything except “|” by “-” to make the markdown header separator
# sed: x;            exchange: the pattern space gets the header row back, the hold space gets the separator
# sed: G;            append (G) the separator from the hold space to the header row, joined by \n,
#                    which gives a complete table header:
# | # | cspp_example | institutionID | publisher | graph |
# |---|--------------|---------------|-----------|-------|

# table body
  # inspect the table data and sort them by another column (use sort --debug to see which part is used as the sort key)
  cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv'
    # shows the tab separated output
  
  # we use “|” as column separator, also for formatting with the column command
  cat institutionID_20220808.json  | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b --debug 
  # short options
  cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b --debug 
    # sort -t '|' -k5.1b: use | as the field separator and sort on the 5th field, from its 1st character to the end of the line
    # -k5.1b ignore b(lanks)
    # -k5.1Vb version sort, ignore b(lanks)
    # -k5.1n numeric sort (and so on)

  # table body (and adding an index column (sed "="))
  # long options
  cat institutionID_20220808.json | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b \
    | sed "=" | sed --regexp-extended "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
  # short options
  cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b \
    | sed "=" | sed -r "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
  # | 1 | https://admont.jacq.org/ADMONT100680                           | http://viaf.org/viaf/128466393   | ADMONT | http://admont.jacq.org                     |
  # | 2 | https://bak.jacq.org/BAK0-0000001                              | https://ror.org/006m4q736        | BAK    | http://bak.jacq.org                        |
  # | 3 | https://www.botanicalcollections.be/specimen/BR0000005065868   | https://ror.org/01h1jbk91        |        | http://botanicalcollections.be/specimen/   |
  # | … | …                                                              | …                                | …      | …                                          |
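
The sed step that builds the table header can be tried without jq by feeding it a tab separated line directly. A minimal sketch, assuming GNU sed; the two column names are taken from the data above:

```bash
#!/bin/bash
# Two tab separated column names stand in for the output of jq '.head.vars | @tsv'
header=$(printf 'cspp_example\tinstitutionID\n' \
  | sed --regexp-extended 's@^@| # | @; s@$@ |@; s@\t@ | @g; h; s@[^|]@-@g; x; G')
echo "$header"
# → | # | cspp_example | institutionID |
# → |---|--------------|---------------|
```

The hold space is what keeps the formatted row available while the pattern space is turned into the separator line.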

Sed (short for stream editor)

Tutorials: https://snipcademy.com/shell-scripting-sed (good schematic illustrations of the command sequences)

Functions for BASH-Programming

comment_exit_code() {
    # unused
    # -------------------------------
    # Usage:
    #   comment_exit_code $exit_code
    #   comment_exit_code $exit_code "Some more exact comment what was done"
    # -------------------------------
    local this_exit_code=$1
    local this_comment=${2-}

    case $this_exit_code in [1-9]|[1-9][0-9]|[1-9][0-9][0-9])
      if [[ "${#this_comment}" -lt 1 ]];then 
      echo -e "${ORANGE}Something unexpected happened. Exit Code: ${this_exit_code} $(kill -l $this_exit_code)${NOFORMAT}" 
      else
      echo -e "${ORANGE}Something unexpected happened: ${this_comment}. Exit Code: ${this_exit_code} $(kill -l $this_exit_code)${NOFORMAT}" 
      fi
      ;;
    esac
}

repeat_text() {
    # -------------------------------
    # Usage:
    #   repeat_text n-times text
    #   repeat_text 10 '.'
    #     prints 10 dots: ..........
    #   repeat_text 10 '.' storingvariablename
    #     stores 10 dots to $storingvariablename
    # -------------------------------
    # $1=number of patterns to repeat
    # $2=pattern
    # $3=output variable name
    local tmp
    local local_1=$1
    local local_2=$2
    local local_3=${3-}
    printf -v tmp '%*s' "$local_1"
    if [[ "$local_3" ]];then
      printf -v "$local_3" '%s' "${tmp// /$local_2}"
    else
      printf '%s' "${tmp// /$local_2}"
    fi
}

setup_colors() {
  # 0 - Normal Style; 1 - Bold; 2 - Dim; 3 - Italic; 4 - Underlined; 5 - Blinking; 7 - Reverse; 8 - Invisible;
  if [[ -t 2 ]] && [[ -z "${NO_COLOR-}" ]] && [[ "${TERM-}" != "dumb" ]]; then
    NOFORMAT='\033[0m' 
    BOLD='\033[1m' ITALIC='\033[3m'
    BLUE='\033[0;34m' BLUE_BOLD='\033[1;34m' BLUE_ITALIC='\033[3;34m' 
    CYAN='\033[0;36m' CYAN_BOLD='\033[1;36m' CYAN_ITALIC='\033[3;36m' 
    GREEN='\033[0;32m' GREEN_BOLD='\033[1;32m' GREEN_ITALIC='\033[3;32m' 
    ORANGE='\033[0;33m' ORANGE_BOLD='\033[1;33m' ORANGE_ITALIC='\033[3;33m' 
    PURPLE='\033[0;35m' PURPLE_BOLD='\033[1;35m' PURPLE_ITALIC='\033[3;35m' 
    RED='\033[0;31m' RED_BOLD='\033[1;31m' RED_ITALIC='\033[3;31m' 
    YELLOW='\033[1;33m' YELLOW_BOLD='\033[1;33m' YELLOW_ITALIC='\033[3;33m'
  else
    NOFORMAT='' 
    BOLD='' ITALIC=''
    BLUE='' BLUE_BOLD='' BLUE_ITALIC='' 
    CYAN='' CYAN_BOLD='' CYAN_ITALIC='' 
    GREEN='' GREEN_BOLD='' GREEN_ITALIC='' 
    ORANGE='' ORANGE_BOLD='' ORANGE_ITALIC='' 
    PURPLE='' PURPLE_BOLD='' PURPLE_ITALIC='' 
    RED='' RED_BOLD='' RED_ITALIC='' 
    YELLOW='' YELLOW_BOLD='' YELLOW_ITALIC=''
  fi
}
setup_colors

test_dependencies() {
  local exit_level=0

  if ! [[ -e "${working_directory}/${file_input}" ]];then
    echo -e "${ORANGE}# We can not find the data in ${NOFORMAT}${working_directory}/${file_input}${ORANGE} (stop)${NOFORMAT}";
    exit_level=1;
  fi
  if ! [[ -x "$(command -v $bin_dwcagent)" ]]; then
    printf "${ORANGE}Command${NOFORMAT} $bin_dwcagent ${ORANGE} to parse names was not found. See https://libraries.io/rubygems/dwc_agent${NOFORMAT}\n"; exit_level=1;
  fi
  if ! [[ -x "$(command -v awk)" ]]; then
    printf "${ORANGE}Command${NOFORMAT} awk ${ORANGE} to read the data was not found. Please install it using your package manager.${NOFORMAT}\n";
    exit_level=1;
  fi
  case $exit_level in [1-9]) 
    printf "${ORANGE}(stop)${NOFORMAT}\n"; 
    exit 1;; 
  esac
}
test_dependencies

processinfo() {
  echo -e "${GREEN}# ---------------------------- ${NOFORMAT}"
  echo -e "${GREEN}# Description: We read ${NOFORMAT}${file_input}${GREEN} and search for name lists of multiple names (in this case: containing an ampersand &) …${NOFORMAT}"
  echo -e "${GREEN}# We would parse it with ${NOFORMAT}${bin_dwcagent}${GREEN} and …${NOFORMAT}"
  echo -e "${GREEN}# … would write all parsed names into …${NOFORMAT}"
  echo -e "${GREEN}#   ${NOFORMAT}${file_output}"
  echo -e "${GREEN}#   ${NOFORMAT}${file_output_unique}"
  echo -e "${GREEN}# We would parse names from single text lines, which is slow but overall more accurate.${NOFORMAT}"
  echo -e "${GREEN}#   ($N_parallel parallel executions of dwcagent)${NOFORMAT}"
}
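
The repeated dependency checks in `test_dependencies` could be folded into one loop. A minimal sketch; the function name `require_commands` is hypothetical and not part of the scripts above:

```bash
#!/bin/bash
# Hypothetical helper: report every missing command, return 1 if any is missing
require_commands() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
      printf 'Command %s was not found.\n' "$cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

if require_commands awk sed; then
  echo "all dependencies found"
fi
```

Reporting all missing commands before stopping saves the user from installing them one failed run at a time.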

Date and Time

# seconds to days hours min sec (→ https://unix.stackexchange.com/a/338844 “bash - Displaying seconds as days/hours/mins/seconds?”)
seconds="755";date --utc --date="@$seconds" +"$(( $seconds/3600/24 )) days %H hours %Mmin %Ssec"
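
The `date` one-liner wraps the hours within a day, while the day count comes from shell arithmetic. The same split can be sketched in arithmetic alone; the function name `format_duration` is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: split a number of seconds into days, hours, minutes, seconds
format_duration() {
  local total=$1
  printf '%d days %02d hours %02dmin %02dsec\n' \
    $(( total / 86400 )) \
    $(( total % 86400 / 3600 )) \
    $(( total % 3600 / 60 )) \
    $(( total % 60 ))
}

format_duration 755     # → 0 days 00 hours 12min 35sec
format_duration 90061   # → 1 days 01 hours 01min 01sec
```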

Calculate the Duration of a Process

#!/bin/bash
# use datediff if available, otherwise dateutils.ddiff (the name depends on the distribution)
if command -v datediff &> /dev/null
then
  exec_datediff="datediff"
elif command -v dateutils.ddiff &> /dev/null
then
  exec_datediff="dateutils.ddiff"
else
  echo -e "\e[31m# Error: Neither command datediff nor dateutils.ddiff could be found. Please install package dateutils.\e[0m"
  exit 1
fi

datetime_start=`date --rfc-3339 'ns'` ;

echo "Sleep for 5 seconds… or some other process is going on …"; sleep 5; echo "Completed";

datetime_end=`date --rfc-3339 'ns'`;


echo $( date --date="$datetime_start" '+# Started: %Y-%m-%d %H:%M:%S%:z' )
echo $( date --date="$datetime_end"   '+# Ended:   %Y-%m-%d %H:%M:%S%:z' )
#   echo "# Started: $datetime_start" 
#   echo "# Ended:   $datetime_end"   
  
$exec_datediff "$datetime_start" "$datetime_end" -f "# Done. This took %dd  %0Hh:%0Mm:%0Ss to do something"

get_timediff_for_njobs_new () {
  # Description: calculate estimated time to finish n jobs and the estimated total time
  # ---------------------------------
  # Dependency: package dateutils
  # ---------------------------------
  # Usage:
  # get_timediff_for_njobs_new --test # to check for dependencies (datediff)
  # get_timediff_for_njobs_new begintime nowtime ntotaljobs njobsnowdone
  # get_timediff_for_njobs_new "2021-12-06 16:47:29" "2021-12-09 13:38:08" 696926 611613
  # ---------------------------------
  # echo '('`date +"%s.%N"` ' * 1000)/1' | bc # get milliseconds
  # echo '('`date +"%s.%N"` ' * 1000000)/1' | bc # get nanoseconds
  # echo $( date --rfc-3339 'ns' ) | ( read -rsd '' x; echo ${x@Q} ) # escaped
  # ---------------------------------
    
  local this_command_timediff
  
  # read if test mode to check commands
  while [[ "$#" -gt 0 ]]
  do
    case $1 in
      -t|--test)
        doexit=0
        if ! command -v datediff &> /dev/null &&  ! command -v dateutils.ddiff &> /dev/null
        then
          echo -e "# \e[31mError: Neither command datediff nor dateutils.ddiff could be found. Please install package dateutils.\e[0m"
          doexit=1
        fi
        if ! command -v sed &> /dev/null 
        then
          echo -e "# \e[31mError: command sed (stream editor) could not be found. Please install package sed.\e[0m"
          doexit=1
        fi
        if ! command -v bc &> /dev/null 
        then
          echo -e "# \e[31mError: command bc (arbitrary precision calculator) could not be found. Please install package bc.\e[0m"
          doexit=1
        fi
        if [[ $doexit -gt 0 ]];then
          exit 1; # at least one dependency is missing
        else
          return 0 # return 0 signals success and leaves the function
        fi
      ;;
      *)
      break
      ;;
    esac
  done
  
  # prefer datediff, otherwise fall back to dateutils.ddiff (also covers the case that both exist)
  if command -v datediff &> /dev/null
  then
    # echo "Command datediff found"
    this_command_timediff="datediff"
  else
    # echo "Command dateutils.ddiff found"
    this_command_timediff="dateutils.ddiff"
  fi

  # START estimate time to do 
  # convert also "2022-06-30_14h56m10s" to "2022-06-30 14:56:10"
  local this_given_start_time=$( echo "$1" | sed -r 's@([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})[_[:space:]-]([[:digit:]]{2})h([[:digit:]]{2})m([[:digit:]]{2})s@\1 \2:\3:\4@' )
  local this_given_now_time=$(   echo "$2" | sed -r 's@([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2})[_[:space:]-]([[:digit:]]{2})h([[:digit:]]{2})m([[:digit:]]{2})s@\1 \2:\3:\4@' )
  
  local this_unixnanoseconds_start_timestamp=$(date --date="$this_given_start_time" '+%s.%N')
  local this_unixnanoseconds_now=$(date --date="$this_given_now_time" '+%s.%N')
  local this_unixseconds_todo=0
  local this_n_jobs_all=$(expr $3 + 0)
  local this_i_job_counter=$(expr $4 + 0)
  # echo "scale=10; 1642073008.587244684 - 1642028400.000000000" | bc -l
  local this_timediff_unixnanoseconds=`echo "scale=10; $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp" | bc -l`
  # $(( $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp ))
  local this_n_jobs_todo=$(( $this_n_jobs_all - $this_i_job_counter ))
  local this_msg_estimated_sofar=""

  # echo -e "\033[2m# DEBUG Test mode: all together $this_n_jobs_all ; counter $this_i_job_counter\033[0m"
  if [[ $this_n_jobs_all -eq $this_i_job_counter ]];then # done
    this_unixseconds_todo=0
    # njobs_done_so_far=`$this_command_timediff "@$this_unixnanoseconds_start_timestamp" "@$this_unixnanoseconds_now" -f "all $this_i_job_counter done, duration %dd %0Hh:%0Mm:%0Ss"`
    this_msg_estimated_sofar="nothing left to do"
  else
    # this_unixseconds_todo=$(( $this_timediff_unixnanoseconds * $this_n_jobs_todo / $this_i_job_counter ))
    this_unixseconds_todo=`echo "scale=0; $this_timediff_unixnanoseconds * $this_n_jobs_todo / $this_i_job_counter" | bc -l`
    
    job_singular_or_plural=$([ $this_n_jobs_todo -gt 1 ]  && echo jobs  || echo job )
    if [[ $this_unixseconds_todo -ge $(( 60 * 60 * 24 * 2 )) ]];then
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0ddays %0Hh:%0Mmin:%0Ssec"`
    elif [[ $this_unixseconds_todo -ge $(( 60 * 60 * 24 )) ]];then
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0dday %0Hh:%0Mmin:%0Ssec"`
    elif [[ $this_unixseconds_todo -ge $(( 60 * 60 * 1 )) ]];then
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0Hh:%0Mmin:%0Ssec"`
    elif [[ $this_unixseconds_todo -lt $(( 60 * 60 * 1 )) ]];then
      this_msg_estimated_sofar=`$this_command_timediff "@0" "@$this_unixseconds_todo" -f "Still $this_n_jobs_todo $job_singular_or_plural to do, estimated end %0Mmin:%0Ssec"`
    fi
  fi
  
  this_unixseconds_done=`printf "%.0f" $(echo "scale=0; $this_unixnanoseconds_now - $this_unixnanoseconds_start_timestamp" | bc -l)`
  this_unixseconds_total=`printf "%.0f" $(echo "scale=0; $this_unixseconds_done + $this_unixseconds_todo" | bc -l)`  
  if [[ $this_unixseconds_total -ge $(( 60 * 60 * 24 * 2 )) ]];then
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0ddays %0Hh:%0Mmin:%0Ssec"`
  elif [[ $this_unixseconds_total -ge $(( 60 * 60 * 24 )) ]];then
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0dday %0Hh:%0Mmin:%0Ssec"`
  elif [[ $this_unixseconds_total -ge $(( 60 * 60 * 1 )) ]];then
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0Hh:%0Mmin:%0Ssec"`
  elif [[ $this_unixseconds_total -lt $(( 60 * 60 * 1 )) ]];then
    this_msg_time_total=`$this_command_timediff "@0" "@$this_unixseconds_total" -f "total time: %0Mmin:%0Ssec"`
  fi
  if ! [[ $this_unixseconds_todo -eq 0 ]];then this_msg_time_total="estimated $this_msg_time_total"; fi
  
  #echo "from $this_n_jobs_all, $njobs_done_so_far; $this_msg_estimated_sofar"
  echo "${this_msg_estimated_sofar} (${this_msg_time_total})"
  # END estimate time to do 
}
export -f get_timediff_for_njobs_new # export needed otherwise /usr/bin/bash: get_timediff_for_njobs_new: command not found
get_timediff_for_njobs_new --test
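
The estimate inside `get_timediff_for_njobs_new` is a linear extrapolation: remaining time = elapsed time × jobs still to do / jobs already done. A minimal sketch of that arithmetic with whole seconds follows. The name `estimate_remaining_seconds` and the guard against zero finished jobs are assumptions for illustration; the function above has no such guard:

```shell
#!/bin/bash
# Hypothetical helper (assumption, not from the original page):
# extrapolate the remaining seconds from the average time per finished job.
estimate_remaining_seconds () {
  # usage: estimate_remaining_seconds elapsed_seconds n_jobs_total n_jobs_done
  local elapsed=$1 total=$2 done_count=$3
  if (( done_count <= 0 )); then
    echo "unknown"   # no finished job yet, so there is no rate to extrapolate from
    return 1
  fi
  # integer division: the result is truncated to whole seconds
  echo $(( elapsed * (total - done_count) / done_count ))
}

estimate_remaining_seconds 600 100 25   # prints: 1800 (75 jobs left at 24 s per job)
```

The estimated total time is then the elapsed time plus this remainder, here 600 + 1800 = 2400 seconds.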


Further reading

Muth, R. 3 August 2012: Better Bash Scripting in 15 Minutes. New York (http://robertmuth.blogspot.com/2012/08/better-bash-scripting-in-15-minutes.html, retrieved 24 August 2022).
Vreckem, B. V. 6 November 2020: Bash best practices. In: cheat-sheets (Cheat sheets for various stuff). (https://bertvv.github.io/cheat-sheets/Bash.html, retrieved 14 November 2022).