Benutzer:Andreas Plank/BASH: Unterschied zwischen den Versionen

Version vom 18. August 2022, 09:12 Uhr

Dateinamen auffinden

?      # Genau ein beliebiges Zeichen
*      # Beliebig viele (auch 0) beliebige Zeichen
[def]  # Eines der Zeichen
[^def] # Keines der angegebenen Zeichen
[!def] # Wie oben
[a-d]  # Alle Zeichen aus dem Bereich
ls -d /[a-d]* # Verzeichnisse → /bin  /boot  /dev

Sortierte Dateien vergleichen

Inhalte beider Dateien anzeigen:

cat Datei_1.txt # sollte sortiert sein	cat Datei_2.txt # sollte sortiert sein
drei-beide eins-1 fünf -1 fünf-beide zwei-1	acht-2 drei-beide fünf-beide sechs-2 zweilerlei-2 zweitens-2

Standardmäßige Ausgabe:

comm Datei_1.txt Datei_2.txt # es werden 3 Spalten ausgegeben

        acht-2
                drei-beide
eins-1
fünf  -1
                fünf-beide
        sechs-2
zwei-1
        zweilerlei-2
        zweitens-2

Es bedeuten:

Spalte 1: Ergebnis einzig aus Datei_1.txt
Spalte 2: Ergebnis einzig aus Datei_2.txt
Spalte 3: Ergebnis aus beiden Dateien

Das Kommando comm kann nun diese drei Ausgabespalten vermittels Option unterdrücken:

comm -1 unterdrücke Ausgabespalte 1 (Ergebnis übrig: Datei_2 + beide)
comm -12 unterdrücke Ausgabespalten 1 + 2 (Ergebnis übrig: aus beiden)
comm -13 unterdrücke Ausgabespalten 1 + 3 (Ergebnis übrig: Einziges aus Datei_2)
usw.

comm -23 Datei_1.txt Datei_2.txt 
# unterdrücke Ausgabespalte 2 + 3, erübrige Spalte 1, ergibt Einziges aus Datei_1

eins-1
fünf  -1
zwei-1

comm -13 Datei_1.txt Datei_2.txt 
# unterdrücke Ausgabespalte 1 + 3, erübrige Spalte 2, ergibt Einziges aus Datei_2

acht-2
sechs-2
zweilerlei-2
zweitens-2

comm -12 Datei_1.txt Datei_2.txt 
# unterdrücke Ausgabespalte 1 + 2, erübrige Spalte 3, ergibt aus beiderlei: Datei_1 und Datei_2

drei-beide
fünf-beide

# # # # # # # # # # # # # #
# Check for URI-Differences (in general)
# ```bash
# comm -13 donelistsorted comparelistsorted > todolistsorted
# comm -13 donelistsorted comparelistsorted > todolistsorted
donelist_source=urilist_Naturalis_20220516.csv;
donelist_sorted=${donelist_source%.*}_sorted.tsv;
donelist_sorted_noprotocol=${donelist_source%.*}_sorted_noprotocol.tsv;

comparelist_source=urilist_Naturalis_20220817.tsv;
comparelist_sorted=${comparelist_source%.*}_sorted.tsv;
comparelist_sorted_noprotocol=${comparelist_source%.*}_sorted_noprotocol.tsv;

todolist_sorted=${comparelist_source%.*}_todo.tsv;
todolist_sorted_noprotocol=${comparelist_source%.*}_todo_noprotocol.tsv;

# assume to have URLs beginning at the line start and after it (word-boundary \b), anything other text gets ignored
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*(https?://[^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted"
comm -13 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
grep --count "/" "$todolist_sorted"; # 2447

# compare by removing any protocol part (http:// https:// ftp:// sftp:// aso.)
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*https?://([^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }'  "$donelist_source"    | sort > "$donelist_sorted_noprotocol"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }'  "$comparelist_source" | sort > "$comparelist_sorted_noprotocol"
comm -13 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
grep --count "/" "$todolist_sorted_noprotocol"; # 2447  
# ```

BASH kurz-Optionen zu lang-Optionen ersetzen (automatisiert aus Handbuch-Dokumentation (man pages))

# aus dem Handbuch von `iptables` die Optionen …
# [!] -k, --kurz-ausgeschriebeoption als auch …
#     -k, --kurz-ausgeschriebeoption …
# … herausgreifen und ein sed-Kommando daraus machen und hübsch in Spaltendarstellung
  man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
     | sed --regexp-extended 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
     | sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'

# Usage sed --file=short-options2long-options4rules.v4.sed rules.v4 # read changes on the screen
# Usage sed --in-place --file=short-options2long-options4rules.v4.sed rules.v4 # replace without backup
# Usage sed --in-place=.backup_20220802 --file=short-options2long-options4rules.v4.sed   # replace with backup: rules.v4.backup_20220802
#   man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
#      | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; # &/g;' \
#      | sort --ignore-case
#   man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
#      | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
#      | sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'
s@ -4 @ --ipv4 @g;            #        -4, --ipv4
s@ -6 @ --ipv6 @g;            #        -6, --ipv6
s@^-A @--append @g;
s@ -A @ --append @g;          #        -A, --append chain rule-specification
s@ -C @ --check @g;           #        -C, --check chain rule-specification
s@ -c @ --set-counters @g;    #        -c, --set-counters packets bytes
s@ -D @ --delete @g;          #        -D, --delete chain rulenum ... -D, --delete chain rule-specification
s@ -d @ --destination @g;     #        [!] -d, --destination address[/mask][,...]
s@ -E @ --rename-chain @g;    #        -E, --rename-chain old-chain new-chain
s@ -F @ --flush @g;           #        -F, --flush [chain]
s@ -f @ --fragment @g;        #        [!] -f, --fragment
s@ -g @ --goto @g;            #        -g, --goto chain
s@ -i @ --in-interface @g;    #        [!] -i, --in-interface name
s@ -I @ --insert @g;          #        -I, --insert chain [rulenum] rule-specification
s@ -j @ --jump @g;            #        -j, --jump target
s@ -L @ --list @g;            #        -L, --list [chain]
s@ -m @ --match @g;           #        -m, --match match
s@ -N @ --new-chain @g;       #        -N, --new-chain chain
s@ -n @ --numeric @g;         #        -n, --numeric
s@ -o @ --out-interface @g;   #        [!] -o, --out-interface name
s@ -P @ --policy @g;          #        -P, --policy chain target
s@ -p @ --protocol @g;        #        [!] -p, --protocol protocol
s@ -R @ --replace @g;         #        -R, --replace chain rulenum rule-specification
s@ -S @ --list-rules @g;      #        -S, --list-rules [chain]
s@ -s @ --source @g;          #        [!] -s, --source address[/mask][,...]
s@ -t @ --table @g;           #        -t, --table table
s@ -v @ --verbose @g;         #        -v, --verbose
s@ -V @ --version @g;         #        -V, --version
s@ -w @ --wait @g;            #        -w, --wait [seconds]
s@ -X @ --delete-chain @g;    #        -X, --delete-chain [chain]
s@ -x @ --exact @g;           #        -x, --exact
s@ -Z @ --zero @g;            #        -Z, --zero [chain [rulenum]]

Redirect Errors/Standard Output

Siehe: https://www.thomas-krenn.com/de/wiki/Bash_stdout_und_stderr_umleiten

Funktion	Bash redirection
stdout -> Datei umleiten	`programm > Datei.txt`
stderr -> Datei umleiten	`programm 2> Datei.txt`
stdout UND stderr -> Datei umleiten	`programm &> Datei.txt`
stdout -> Datei umleiten UND stderr -> Datei umleiten	`programm > Datei_stdout.txt 2> Datei_stderr.txt`
stdout -> stderr	`programm 1>&2`
stderr -> stdout	`programm 2>&1`

Parameter Substitution

echo {a,b}{1,2,3} # a1 a2 a3 b1 b2 b3
# Inhalt von Archiven vergleichen
  diff <(tar tzf Buch1.tar.gz) <(tar tzf Buch.tar.gz)

d='message'
echo $d   # → message
echo ${d} # → message
# d may be not set but a local default (no definition!)
  echo ${d-default} # → default
  echo ${d-'*'}     # → *
  echo ${d-$1}      # → output of d or the first parameter given 
# d may be not set but a default definition
  echo ${d=default}
# no d + default given, but a message and procedure is than abandoned:
  echo ${d?message}
# A shell procedure that requires some parameters to be set might start as follows:
  : ${user?} ${acct?} ${bin?}
  # will print something like: "bash: user: Parameter ist Null oder nicht gesetzt."

${string/substring/replacement} # replaces the first match
${string//substring/replacement} # replaces all matches
${string#substring} # Deletes shortest match of $substring from front of $string.
${string##substring} # Deletes longest match of $substring from front of $string.
${string%substring} # Deletes shortest match of $substring from back of $string.
${string%%substring} # Deletes longest match of $substring from back of $string.

Command substitution

# commands in `...`
echo `pwd` # → /home/myusername → is the current working directory

ls `echo "$1"`
# is the same as
ls $1

set `date`; echo $6 $2 $3, $4 # → 2010 7. Dez, 17:28:44

for i in `ls -t`; do ... # list in time order (ls -t)

Extended mode

# the shell option extglob must be activated
help shopt # print help
shopt extglob
# extglob         off
shopt -s extglob; shopt extglob
# extglob         on
#shopt -u extglob; shopt extglob
## extglob         off

?(a|b|c) # Keine oder eine der eingeschlossenen Zeichenketten
*(a|b|c) # Keine oder mehrere der eingeschlossenen Zeichenketten
+(a|b|c) # Eine oder mehrere der eingeschlossenen Zeichenketten
@(a|b|c) # Genau eine der eingeschlossenen Zeichenketten
!(a|b|c) # Alle außer den eingeschlossenen Zeichenketten

# list all directory names, beginning with "bi", "*+" or "us" 
ls -d /+(bi|*+|us)*
# /bin  /lost+found  /usr  

# list all directory names, beginning not with "b*" and the 2nd character has no "o"
ls -d /!(b*|?o*)
# /cdrom  /dev  /etc  /floppy  /lib  /mnt  /opt  /proc  /sbin  /tmp  /usr  /var

Substrings

string="0123456789stop"
echo ${string:7}      # 789stop
echo ${string:0:7}    # 0123456
echo ${string:-10000} # 0123456789stop
echo ${string: -5}    # 9stop

For Loops

#!/bin/bash
# normalerweise ist IFS=" \t\n" aber Problem in for, weil Leerzeichen falsche Trennung erzeugt
OLDIF=$IFS
IFS=$'\n'
for datei in *.{jpg,jpeg,JPG,JPEG};do
  if [ -e "$datei" ];then
    echo "$datei (jpg > png) …";
    convert "$datei" "${datei%.*}.png" 
  fi
done
IFS=$OLDIFS

Useful Commands

# sort ls listing by domain Thread-…_wu.jacq.org_ rather than numeric by Thread-01…
file_pattern="Thread-*_gat.jacq.org*2022*-[0-9][0-9][0-9][0-9]_modified.rdf.gz"
ls $file_pattern | sed -r 's@(Thread-)[0-9]+_(.+)@& \2@' | sort -k 2 | sed -r 's@^([^[:space:]]+) .+$@\1@;'

stat --printf="# \e[32mfile name %n\e[0m (%s bytes)…" file
stat --printf="# \e[32mfile name %n\e[0m was modified %y" file
stat --format='%y' file | grep --only-matching --extended-regexp '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'

sort

Sorting of URLs but by domain regardless of which protocol there is (http, https, ftp aso.):

sort -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv # sort by (t)able-field-character “/” 
  # -k*3*. 1 b  → set (k)ey field to sort after field 3, and from this position 
  # -k 3 .*1*b  → start sorting from the very 1st character to line end as being relevant for sorting
  # -k 3 . 1*b* → ignore (b)lanks
  sort --debug -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv | head -n 6 # show what it is sorting actually

  # https://admont.jacq.org/ADMONT100002 
  # ____________________________________ for sort -t '/' -k1.1 --debug
  #        _____________________________ for sort -t '/' -k2.1 --debug
  #         ____________________________ for sort -t '/' -k3.1 --debug
  #                         ____________ for sort -t '/' -k4.1 --debug

jq ~ Get Markdown Table from Dynamic Data Values

(see the data in the hidden box, click right)

{ "head": {
    "vars": [ "cspp_example" , "institutionID" , "publisher" , "graph" ]
  } ,
  "results": {
    "bindings": [
      { 
        "cspp_example": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/p00039900" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03wkt5x30" } ,
        "publisher": { "type": "uri" , "value": "https://science.mnhn.fr/institution/mnhn/collection/p/item/search" } ,
        "graph": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/113251" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/0566bfb96" } ,
        "graph": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://id.herb.oulu.fi/0014586" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03yj89h83" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://id.snsb.info/snsb/collection/1000/1579/1000" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05th1v540" } ,
        "publisher": { "type": "literal" , "value": "http://www.snsb.info" } ,
        "graph": { "type": "uri" , "value": "http://id.snsb.info/snsb/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://lagu.jacq.org/object/AM-02278" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01j60ss54" } ,
        "publisher": { "type": "literal" , "value": "LAGU" } ,
        "graph": { "type": "uri" , "value": "http://lagu.jacq.org/object" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/K000989827" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00ynnr806" } ,
        "publisher": { "type": "uri" , "value": "https://www.kew.org" } ,
        "graph": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tbi.jacq.org/object/TBI1014287" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/051qn8h41" } ,
        "publisher": { "type": "literal" , "value": "TBI" } ,
        "graph": { "type": "uri" , "value": "http://tbi.jacq.org/object" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MHD.107807" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03tcx6c30" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.342315" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05vghhr25" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.863532" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/029pk6x14" } ,
        "publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
        "graph": { "type": "uri" , "value": "http://tun.fi" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://admont.jacq.org/ADMONT100680" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128466393" } ,
        "publisher": { "type": "literal" , "value": "ADMONT" } ,
        "graph": { "type": "uri" , "value": "http://admont.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://bak.jacq.org/BAK0-0000001" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/006m4q736" } ,
        "publisher": { "type": "literal" , "value": "BAK" } ,
        "graph": { "type": "uri" , "value": "http://bak.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://boz.jacq.org/BOZ000001" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128699910" } ,
        "publisher": { "type": "literal" , "value": "BOZ" } ,
        "graph": { "type": "uri" , "value": "http://boz.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://brnu.jacq.org/BRNU000205" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/02j46qs45" } ,
        "publisher": { "type": "literal" , "value": "BRNU" } ,
        "graph": { "type": "uri" , "value": "http://brnu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://data.rbge.org.uk/herb/E00000001" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/0349vqz63" } ,
        "publisher": { "type": "uri" , "value": "http://www.rbge.org.uk" } ,
        "graph": { "type": "uri" , "value": "http://data.rbge.org.uk/herb/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://dr.jacq.org/DR000023" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/155418159" } ,
        "publisher": { "type": "literal" , "value": "DR" } ,
        "graph": { "type": "uri" , "value": "http://dr.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://ere.jacq.org/ERE0000012" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05mpgew40" } ,
        "publisher": { "type": "literal" , "value": "ERE" } ,
        "graph": { "type": "uri" , "value": "http://ere.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gat.jacq.org/GAT0000014" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/02skbsp27" } ,
        "publisher": { "type": "literal" , "value": "GAT" } ,
        "graph": { "type": "uri" , "value": "http://gat.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gjo.jacq.org/GJO0000012" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00nxtmb68" } ,
        "publisher": { "type": "literal" , "value": "GJO" } ,
        "graph": { "type": "uri" , "value": "http://gjo.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://gzu.jacq.org/GZU000000208" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01faaaf77" } ,
        "publisher": { "type": "literal" , "value": "GZU" } ,
        "graph": { "type": "uri" , "value": "http://gzu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://hal.jacq.org/HAL0053120" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05gqaka33" } ,
        "publisher": { "type": "literal" , "value": "HAL" } ,
        "graph": { "type": "uri" , "value": "http://hal.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://herbarium.bgbm.org/object/B100000004" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/00bv4cx53" } ,
        "publisher": { "type": "literal" , "value": "BGBM" } ,
        "graph": { "type": "uri" , "value": "http://herbarium.bgbm.org/object/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://id.smns-bw.org/smns/collection/275449/772800/279829" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05k35b119" } ,
        "graph": { "type": "uri" , "value": "http://id.smns-bw.org/smns/collection/" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://je.jacq.org/JE00000020" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/05qpz1x62" } ,
        "publisher": { "type": "literal" , "value": "JE" } ,
        "graph": { "type": "uri" , "value": "http://je.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://kiel.jacq.org/KIEL0007010" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/239180770" } ,
        "publisher": { "type": "literal" , "value": "KIEL" } ,
        "graph": { "type": "uri" , "value": "http://kiel.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://lz.jacq.org/LZ161177" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03s7gtk40" } ,
        "publisher": { "type": "literal" , "value": "LZ" } ,
        "graph": { "type": "uri" , "value": "http://lz.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://mjg.jacq.org/MJG000015" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/023b0x485" } ,
        "publisher": { "type": "literal" , "value": "MJG" } ,
        "graph": { "type": "uri" , "value": "http://mjg.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://pi.jacq.org/PI000648" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03ad39j10" } ,
        "publisher": { "type": "literal" , "value": "PI" } ,
        "graph": { "type": "uri" , "value": "http://pi.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://prc.jacq.org/PRC2535" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/024d6js02" } ,
        "publisher": { "type": "literal" , "value": "PRC" } ,
        "graph": { "type": "uri" , "value": "http://prc.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://tub.jacq.org/TUB002830" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03a1kwz48" } ,
        "publisher": { "type": "literal" , "value": "TUB" } ,
        "graph": { "type": "uri" , "value": "http://tub.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://ubt.jacq.org/UBT0010195" } ,
        "institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/142509930" } ,
        "publisher": { "type": "literal" , "value": "UBT" } ,
        "graph": { "type": "uri" , "value": "http://ubt.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://w.jacq.org/W0000011a" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01tv5y993" } ,
        "publisher": { "type": "literal" , "value": "W" } ,
        "graph": { "type": "uri" , "value": "http://w.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://wu.jacq.org/WU0000004" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/03prydq77" } ,
        "publisher": { "type": "literal" , "value": "WU" } ,
        "graph": { "type": "uri" , "value": "http://wu.jacq.org" }
      } ,
      { 
        "cspp_example": { "type": "uri" , "value": "https://www.botanicalcollections.be/specimen/BR0000005065868" } ,
        "institutionID": { "type": "uri" , "value": "https://ror.org/01h1jbk91" } ,
        "graph": { "type": "uri" , "value": "http://botanicalcollections.be/specimen/" }
      }
    ]
  }
}

These data results have also missing values, the point is to extract .head.vars and use those values in the variable $fields to query all values out of .results.bindings[].[].value, the rest is getting a nice looking:

# table head (and adding an index column)
cat institutionID_20220808.json  | jq -r '.head.vars | @tsv' | sed -r 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
# sed: s@^@| # | @;  add index column-header | # | to line start
# sed: s@$@ |@;      append closing table row at the end |
# sed: s@[\t]@ | @g; as it has tab separated values, replace \t by | (the colums or table data cells)
# sed: h;            put this ready formatted header into hold space buffer
# sed: s@[^|]@-@g;   replace all but “|” by “-” to make the markdown header separation
# sed: x;            exchange hold space buffer (formatted 1st table row) with the markdown header separation
# sed: G;            now we have only the first table row in place and append (G) the markdown header separation by a \n 
#                    and get a nice complete table header:
# | # | cspp_example | institutionID | publisher | graph |
# |---|--------------|---------------|-----------|-------|

# table body
  # understand table data but sort them by another column (use sort --debug to find out)
  cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv'
    # show tab separator output
  
  # we use “|” as colum separators and also to format using “|” for the column command
  cat institutionID_20220808.json  | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b --debug 
  # short options
  cat institutionID_20220808.json  | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b --debug 
    # sort -t '|' -k5.1b using | as table sort separator, and based on that sort the 5th field, use 1st character in the 5th field to line end,
    # -k5.1b ignore b(lanks)
    # -k5.1Vb version sort, ignore b(lanks)
    # -k5.1n natural sort (aso.)

  # table body (and adding an index column (sed -r "=;"))
  # short options
  cat institutionID_20220808.json | jq --raw-output '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b \
    | sed "=" | sed --regexp-extended "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
  cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] |  [.[($fields[])].value] |@tsv' \
    | sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b \
    | sed "=" | sed -r "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
  # | 1 | https://admont.jacq.org/ADMONT100680                           | http://viaf.org/viaf/128466393   | ADMONT | http://admont.jacq.org                     |
  # | 2 | https://bak.jacq.org/BAK0-0000001                              | https://ror.org/006m4q736        | BAK    | http://bak.jacq.org                        |
  # | 3 | https://www.botanicalcollections.be/specimen/BR0000005065868   | https://ror.org/01h1jbk91        |        | http://botanicalcollections.be/specimen/   |
  # | … | …                                                              | …                                | …      | …                                          |

Benutzer:Andreas Plank/BASH: Unterschied zwischen den Versionen

Version vom 18. August 2022, 09:12 Uhr

Inhaltsverzeichnis

Dateinamen auffinden

Sortierte Dateien vergleichen

BASH kurz-Optionen zu lang-Optionen ersetzen (automatisiert aus Handbuch-Dokumentation (man pages))

Redirect Errors/Standard Output

Parameter Substitution

Command substitution

Extended mode

Substrings

For Loops

Useful Commands

sort

jq ~ Get Markdown Table from Dynamic Data Values

Navigationsmenü

Suche

Benutzer:Andreas Plank/BASH: Unterschied zwischen den Versionen

Version vom 18. August 2022, 09:12 Uhr

Dateinamen auffinden

Sortierte Dateien vergleichen

BASH kurz-Optionen zu lang-Optionen ersetzen (automatisiert aus Handbuch-Dokumentation (man pages))

Redirect Errors/Standard Output

Parameter Substitution

Command substitution

Extended mode

Substrings

For Loops

Useful Commands

sort

jq ~ Get Markdown Table from Dynamic Data Values

Navigationsmenü

Suche

jq ~ Get Markdown Table from Dynamic Data Values