User:Andreas Plank/BASH: Difference between revisions
Revision as of 18 August 2022, 09:12
Finding file names
? # Exactly one arbitrary character
* # Any number (including 0) of arbitrary characters
[def] # One of the listed characters
[^def] # None of the listed characters
[!def] # Same as above
[a-d] # Any character in the range
ls -d /[a-d]* # Directories → /bin /boot /dev
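The patterns above can be tried safely in a scratch directory. This is only a minimal sketch: the file names and the variable names (`demo_dir`, `one_char` and so on) are made up for illustration.

```bash
#!/usr/bin/env bash
# Scratch directory with a few hypothetical file names.
demo_dir=$(mktemp -d)
touch "$demo_dir"/{a1,b2,c3,d4,x}.txt

# Each pattern is expanded inside the scratch directory (subshell, so the
# working directory of the calling shell is not changed).
one_char=$(cd "$demo_dir" && echo ?.txt)        # exactly one character before ".txt"
two_chars=$(cd "$demo_dir" && echo ??.txt)      # exactly two characters before ".txt"
range=$(cd "$demo_dir" && echo [a-c]?.txt)      # first character is a, b or c
negated=$(cd "$demo_dir" && echo [!a-c]?.txt)   # first character is none of a, b, c
listed=$(cd "$demo_dir" && echo [ad]*.txt)      # first character is a or d

printf '%s\n' "$one_char" "$two_chars" "$range" "$negated" "$listed"
rm -r "$demo_dir"
```

A pattern that matches nothing is passed on unchanged unless the shell option nullglob is set.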
Comparing sorted files
Show the contents of both files:
cat Datei_1.txt # should be sorted
drei-beide
eins-1
fünf -1
fünf-beide
zwei-1
cat Datei_2.txt # should be sorted
acht-2
drei-beide
fünf-beide
sechs-2
zweilerlei-2
zweitens-2
Default output:
comm Datei_1.txt Datei_2.txt # prints 3 columns
        acht-2
                drei-beide
eins-1
fünf -1
                fünf-beide
        sechs-2
zwei-1
        zweilerlei-2
        zweitens-2
The columns mean:
- Column 1: lines found only in Datei_1.txt
- Column 2: lines found only in Datei_2.txt
- Column 3: lines found in both files
The comm command can suppress any of these three output columns with an option:
- comm -1: suppress output column 1 (remaining: only in Datei_2 + both)
- comm -12: suppress output columns 1 + 2 (remaining: in both)
- comm -13: suppress output columns 1 + 3 (remaining: only in Datei_2)
- and so on.
comm -23 Datei_1.txt Datei_2.txt
# suppress output columns 2 + 3, keep column 1: lines only in Datei_1
eins-1
fünf -1
zwei-1
comm -13 Datei_1.txt Datei_2.txt
# suppress output columns 1 + 3, keep column 2: lines only in Datei_2
acht-2
sechs-2
zweilerlei-2
zweitens-2
comm -12 Datei_1.txt Datei_2.txt
# suppress output columns 1 + 2, keep column 3: lines in both Datei_1 and Datei_2
drei-beide
fünf-beide
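When the input files are not sorted yet, process substitution can sort them on the fly. The following is a small sketch with two made-up lists; the variable names and the list contents are assumptions, and `LC_ALL=C` is set only so that sort and comm agree on the ordering.

```bash
#!/usr/bin/env bash
export LC_ALL=C   # same collation for sort and comm

# Two hypothetical, unsorted lists.
list_a=$(mktemp); list_b=$(mktemp)
printf '%s\n' pear apple cherry  > "$list_a"
printf '%s\n' cherry banana apple > "$list_b"

# comm expects sorted input, so both lists are sorted via <(...).
only_a=$(comm -23 <(sort "$list_a") <(sort "$list_b"))   # only in list_a
only_b=$(comm -13 <(sort "$list_a") <(sort "$list_b"))   # only in list_b
both=$(comm -12 <(sort "$list_a") <(sort "$list_b"))     # in both lists

echo "only in A: $only_a"
echo "only in B: $only_b"
echo "in both:"; echo "$both"
rm -f "$list_a" "$list_b"
```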
# # # # # # # # # # # # # #
# Check for URI-Differences (in general)
# ```bash
# comm -13 donelistsorted comparelistsorted > todolistsorted
donelist_source=urilist_Naturalis_20220516.csv;
donelist_sorted=${donelist_source%.*}_sorted.tsv;
donelist_sorted_noprotocol=${donelist_source%.*}_sorted_noprotocol.tsv;
comparelist_source=urilist_Naturalis_20220817.tsv;
comparelist_sorted=${comparelist_source%.*}_sorted.tsv;
comparelist_sorted_noprotocol=${comparelist_source%.*}_sorted_noprotocol.tsv;
todolist_sorted=${comparelist_source%.*}_todo.tsv;
todolist_sorted_noprotocol=${comparelist_source%.*}_todo_noprotocol.tsv;
# assume each relevant line contains a URL, optionally preceded by white space; the URL ends at a word boundary (\b) and any other text on the line is ignored
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*(https?://[^[:space:]]+)\b.*$@\1@; p }' "$donelist_source" | sort > "$donelist_sorted"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }' "$donelist_source" | sort > "$donelist_sorted"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*([[:alpha:]]+://[^[:space:]]+)\b.*$@\1@; p }' "$comparelist_source" | sort > "$comparelist_sorted"
comm -13 "$donelist_sorted" "$comparelist_sorted" > "$todolist_sorted";
grep --count "/" "$todolist_sorted"; # 2447
# compare after removing any protocol part (http:// https:// ftp:// sftp:// and so on)
# sed --silent --regexp-extended '/http/{ s@[[:space:]]*https?://([^[:space:]]+)\b.*$@\1@; p }' "$donelist_source" | sort > "$donelist_sorted_noprotocol"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }' "$donelist_source" | sort > "$donelist_sorted_noprotocol"
sed --silent --regexp-extended '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }' "$comparelist_source" | sort > "$comparelist_sorted_noprotocol"
comm -13 "$donelist_sorted_noprotocol" "$comparelist_sorted_noprotocol" > "$todolist_sorted_noprotocol";
grep --count "/" "$todolist_sorted_noprotocol"; # 2447
# ```
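The protocol-independent comparison can be tried on a few in-memory lines before running it on real lists. This sketch wraps the sed expression from above in a helper; the function name `strip_protocol` and the example.org URIs are hypothetical, and GNU sed is assumed (long options and `\b`).

```bash
#!/usr/bin/env bash
export LC_ALL=C   # same collation for sort and comm

# Hypothetical helper: keep only lines with a URI, drop the protocol part
# and any trailing text, then sort the result.
strip_protocol() {
  sed --silent --regexp-extended \
    '/[[:alpha:]]+:\/\// { s@[[:space:]]*[[:alpha:]]+://([^[:space:]]+)\b.*$@\1@; p }' "$@" \
  | sort
}

done_list=$(printf '%s\n' 'http://example.org/obj/1 done' 'https://example.org/obj/2')
new_list=$(printf '%s\n' 'https://example.org/obj/1' 'https://example.org/obj/3' 'no uri here')

# Lines that occur only in the new list, regardless of http or https.
todo=$(comm -13 <(strip_protocol <<< "$done_list") <(strip_protocol <<< "$new_list"))
echo "$todo"
```

Here obj/1 counts as done although the two lists use different protocols, so only obj/3 remains.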
Replacing BASH short options with long options (generated automatically from the manual pages)
# from the `iptables` manual, pick out options of the form …
# [!] -s, --spelled-out-option as well as …
# -s, --spelled-out-option …
# … and turn them into sed commands, nicely aligned in columns
man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
| sed --regexp-extended 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
| sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'
# Usage: sed --file=short-options2long-options4rules.v4.sed rules.v4 # preview the changes on screen
# Usage: sed --in-place --file=short-options2long-options4rules.v4.sed rules.v4 # replace without backup
# Usage: sed --in-place=.backup_20220802 --file=short-options2long-options4rules.v4.sed rules.v4 # replace with backup: rules.v4.backup_20220802
# man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
# | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; # &/g;' \
# | sort --ignore-case
# man iptables | grep -i --extended-regexp -- '^([[:space:]]*|[[:space:]]*[\[\]!]*[[:space:]]*)-[[:digit:][:alpha:]],[[:space:]]*--' \
# | sed -r 's/.*(-[[:alnum:]]),[[:space:]](--[[:alnum:]]+-?[[:alnum:]]+\b).*/s@ \1 @ \2 @g; #§marker§ &/g;' \
# | sort --ignore-case | column -s '#' -t | sed 's@§marker§@#@'
s@ -4 @ --ipv4 @g; # -4, --ipv4
s@ -6 @ --ipv6 @g; # -6, --ipv6
s@^-A @--append @g;
s@ -A @ --append @g; # -A, --append chain rule-specification
s@ -C @ --check @g; # -C, --check chain rule-specification
s@ -c @ --set-counters @g; # -c, --set-counters packets bytes
s@ -D @ --delete @g; # -D, --delete chain rulenum ... -D, --delete chain rule-specification
s@ -d @ --destination @g; # [!] -d, --destination address[/mask][,...]
s@ -E @ --rename-chain @g; # -E, --rename-chain old-chain new-chain
s@ -F @ --flush @g; # -F, --flush [chain]
s@ -f @ --fragment @g; # [!] -f, --fragment
s@ -g @ --goto @g; # -g, --goto chain
s@ -i @ --in-interface @g; # [!] -i, --in-interface name
s@ -I @ --insert @g; # -I, --insert chain [rulenum] rule-specification
s@ -j @ --jump @g; # -j, --jump target
s@ -L @ --list @g; # -L, --list [chain]
s@ -m @ --match @g; # -m, --match match
s@ -N @ --new-chain @g; # -N, --new-chain chain
s@ -n @ --numeric @g; # -n, --numeric
s@ -o @ --out-interface @g; # [!] -o, --out-interface name
s@ -P @ --policy @g; # -P, --policy chain target
s@ -p @ --protocol @g; # [!] -p, --protocol protocol
s@ -R @ --replace @g; # -R, --replace chain rulenum rule-specification
s@ -S @ --list-rules @g; # -S, --list-rules [chain]
s@ -s @ --source @g; # [!] -s, --source address[/mask][,...]
s@ -t @ --table @g; # -t, --table table
s@ -v @ --verbose @g; # -v, --verbose
s@ -V @ --version @g; # -V, --version
s@ -w @ --wait @g; # -w, --wait [seconds]
s@ -X @ --delete-chain @g; # -X, --delete-chain [chain]
s@ -x @ --exact @g; # -x, --exact
s@ -Z @ --zero @g; # -Z, --zero [chain [rulenum]]
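To see the effect of such a sed script without touching a real rules file, a few of the rules above can be applied to one sample line. This is only a sketch: the sample rule and the temporary script file are assumptions, and only four of the generated substitutions are included.

```bash
#!/usr/bin/env bash
# Temporary sed script with a small subset of the generated rules.
sed_script=$(mktemp)
cat > "$sed_script" <<'EOF'
s@^-A @--append @g;
s@ -p @ --protocol @g;
s@ -m @ --match @g;
s@ -j @ --jump @g;
EOF

# Hypothetical line in the style of an iptables rules file.
rule='-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT'
long_rule=$(printf '%s\n' "$rule" | sed --file="$sed_script")

echo "$long_rule"
rm -f "$sed_script"
```

Because every pattern requires a blank on both sides, an existing long option such as `--dport` is left untouched.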
Redirect Errors/Standard Output
See: https://www.thomas-krenn.com/de/wiki/Bash_stdout_und_stderr_umleiten
Function | Bash redirection |
---|---|
Redirect stdout to a file | programm > Datei.txt |
Redirect stderr to a file | programm 2> Datei.txt |
Redirect stdout AND stderr to one file | programm &> Datei.txt |
Redirect stdout to one file AND stderr to another file | programm > Datei_stdout.txt 2> Datei_stderr.txt |
stdout -> stderr | programm 1>&2 |
stderr -> stdout | programm 2>&1 |
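The redirections in the table can be checked with a small function that writes to both streams. A minimal sketch; the function `noisy` and the temporary file names are made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: one line to stdout, one line to stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

out_file=$(mktemp); err_file=$(mktemp); all_file=$(mktemp)

noisy > "$out_file" 2> "$err_file"   # stdout and stderr into separate files
noisy > "$all_file" 2>&1             # both streams into one file (same effect as &>)

stdout_text=$(cat "$out_file")
stderr_text=$(cat "$err_file")
all_lines=$(grep -c '' "$all_file")  # number of lines in the combined file

echo "stdout file: $stdout_text"
echo "stderr file: $stderr_text"
echo "combined file has $all_lines lines"
rm -f "$out_file" "$err_file" "$all_file"
```

The order matters: with `2>&1 > file`, stderr is pointed at the terminal first and only stdout ends up in the file.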
Parameter Substitution
echo {a,b}{1,2,3} # a1 a2 a3 b1 b2 b3
# Compare the contents of archives
diff <(tar tzf Buch1.tar.gz) <(tar tzf Buch.tar.gz)
d='message'
echo $d # → message
echo ${d} # → message
# if d is unset, a default is substituted for this expansion only (d itself stays unset)
echo ${d-default} # → default (when d is unset; with d='message' as above → message)
echo ${d-'*'} # → * (when d is unset)
echo ${d-$1} # → value of d, or the first positional parameter if d is unset
# if d is unset, assign the default to d and substitute it
echo ${d=default}
# if d is unset, print the message and abort the procedure:
echo ${d?message}
# A shell procedure that requires some parameters to be set might start as follows:
: ${user?} ${acct?} ${bin?}
# will print something like: "bash: user: Parameter ist Null oder nicht gesetzt." (German locale; English: "parameter null or not set")
${string/substring/replacement} # replaces the first match
${string//substring/replacement} # replaces all matches
${string#substring} # Deletes shortest match of $substring from front of $string.
${string##substring} # Deletes longest match of $substring from front of $string.
${string%substring} # Deletes shortest match of $substring from back of $string.
${string%%substring} # Deletes longest match of $substring from back of $string.
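A typical use of these expansions is taking a path apart. In this sketch the directory `/srv/data` and the variable names are assumptions; note that the part after `#` or `%` is a glob pattern, not just a literal substring.

```bash
#!/usr/bin/env bash
path='/srv/data/urilist_Naturalis_20220817.tsv'   # hypothetical path

file=${path##*/}     # longest match of "*/" removed from the front → file name
dir=${path%/*}       # shortest match of "/*" removed from the back → directory
base=${file%.*}      # file name without the extension
ext=${file##*.}      # extension only
first=${file/_/-}    # replace the first "_"
all=${file//_/-}     # replace every "_"

unset maybe
fallback=${maybe-default}   # maybe is unset, so the default is used

printf '%s\n' "$file" "$dir" "$base" "$ext" "$first" "$all" "$fallback"
```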
Command substitution
# commands in `...`
echo `pwd` # → /home/myusername → is the current working directory
ls `echo "$1"`
# is the same as
ls $1
set `date`; echo $6 $2 $3, $4 # → 2010 7. Dez, 17:28:44
for i in `ls -t`; do ... # list in time order (ls -t)
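Besides backticks, Bash also accepts the `$(...)` form, which nests without escaping. A short sketch for comparison; the path is hypothetical and does not need to exist, since basename and dirname work on the string alone.

```bash
#!/usr/bin/env bash
# Nested command substitution: backticks need escaping, $(...) does not.
parent_old=`basename \`dirname /srv/data/file.txt\``
parent_new=$(basename "$(dirname /srv/data/file.txt)")

# Store the output of a command in a variable.
today=$(date +%Y-%m-%d)

echo "$parent_old $parent_new $today"
```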
Extended mode
# the shell option extglob must be activated
help shopt # print help
shopt extglob
# extglob off
shopt -s extglob; shopt extglob
# extglob on
#shopt -u extglob; shopt extglob
## extglob off
?(a|b|c) # Zero or one occurrence of the enclosed patterns
*(a|b|c) # Zero or more occurrences of the enclosed patterns
+(a|b|c) # One or more occurrences of the enclosed patterns
@(a|b|c) # Exactly one of the enclosed patterns
!(a|b|c) # Anything except the enclosed patterns
# list all directory names that begin with "bi", with any text followed by "+", or with "us"
ls -d /+(bi|*+|us)*
# /bin /lost+found /usr
# list all directory names that neither begin with "b" nor have "o" as their 2nd character
ls -d /!(b*|?o*)
# /cdrom /dev /etc /floppy /lib /mnt /opt /proc /sbin /tmp /usr /var
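The extended patterns can be tested on a scratch directory instead of the file system root. A minimal sketch with made-up file names; `shopt -s extglob` has to be executed before the lines that use the patterns are parsed.

```bash
#!/usr/bin/env bash
shopt -s extglob

# Scratch directory with hypothetical file names.
demo_dir=$(mktemp -d)
touch "$demo_dir"/{photo.jpg,photo.jpeg,scan.png,notes.txt}

cd "$demo_dir" || exit 1
images=( *.@(jpg|jpeg|png) )           # exactly one of the listed extensions
non_images=( !(*.jpg|*.jpeg|*.png) )   # everything except the image files
jpegs=( *.jp?(e)g )                    # "jpg" with an optional "e"
cd - > /dev/null

echo "images:     ${images[*]}"
echo "non-images: ${non_images[*]}"
echo "jpegs:      ${jpegs[*]}"
rm -r "$demo_dir"
```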
Substrings
string="0123456789stop"
echo ${string:7} # 789stop
echo ${string:0:7} # 0123456
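echo ${string:-10000} # 0123456789stop (":-" is the default-value expansion, not a negative offset; string is set, so its value is printed)
echo ${string: -5} # 9stop (the blank before -5 makes it a negative offset: the last 5 characters)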
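Offset and length are evaluated as arithmetic expressions, so they can be combined with the string length. A brief sketch using the same example string; the variable names are chosen only for illustration.

```bash
#!/usr/bin/env bash
string="0123456789stop"

length=${#string}                    # number of characters
last4=${string: -4}                  # last 4 characters (note the blank)
middle=${string:3:4}                 # 4 characters starting at offset 3
without_last4=${string:0:length-4}   # everything but the last 4 characters

printf '%s\n' "$length" "$last4" "$middle" "$without_last4"
```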
For Loops
#!/bin/bash
# normally IFS=" \t\n", but that is a problem in a for loop because blanks split the words in the wrong places
OLDIFS=$IFS
IFS=$'\n'
for datei in *.{jpg,jpeg,JPG,JPEG};do
  if [ -e "$datei" ];then
    echo "$datei (jpg > png) …";
    convert "$datei" "${datei%.*}.png"
  fi
done
IFS=$OLDIFS
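An alternative sketch of the same loop uses the shell option nullglob, so that patterns without a match simply disappear and the `-e` test is not needed. The scratch directory and file names are assumptions, and the call to convert is left out here; the loop only collects the target names.

```bash
#!/usr/bin/env bash
shopt -s nullglob   # unmatched patterns expand to nothing

# Scratch directory with hypothetical files, one with a blank in its name.
demo_dir=$(mktemp -d)
touch "$demo_dir/summer holiday.jpg" "$demo_dir/scan.JPG" "$demo_dir/notes.txt"

targets=()
for datei in "$demo_dir"/*.{jpg,jpeg,JPG,JPEG}; do
  # Glob results are not split at blanks, so quoting "$datei" is enough.
  targets+=( "${datei%.*}.png" )
done
shopt -u nullglob

printf '%s\n' "${targets[@]##*/}"
rm -r "$demo_dir"
```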
Useful Commands
# sort ls listing by domain Thread-…_wu.jacq.org_ rather than numeric by Thread-01…
file_pattern="Thread-*_gat.jacq.org*2022*-[0-9][0-9][0-9][0-9]_modified.rdf.gz"
ls $file_pattern | sed -r 's@(Thread-)[0-9]+_(.+)@& \2@' | sort -k 2 | sed -r 's@^([^[:space:]]+) .+$@\1@;'
stat --printf="# \e[32mfile name %n\e[0m (%s bytes)…" file
stat --printf="# \e[32mfile name %n\e[0m was modified %y" file
stat --format='%y' file | grep --only-matching --extended-regexp '^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}'
sort
Sorting URLs by domain, regardless of the protocol (http, https, ftp and so on):
sort -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv # -t '/': use “/” as the field separator
# -k*3*. 1 b → the sort (k)ey starts at field 3 …
# -k 3 .*1*b → … at its very 1st character, and runs to the line end
# -k 3 . 1*b* → ignore leading (b)lanks
sort --debug -t '/' -k3.1b urilist_JACQ_20220815_todo_sorted.tsv | head -n 6 # show what it is sorting actually
# https://admont.jacq.org/ADMONT100002
# ____________________________________ for sort -t '/' -k1.1 --debug
# _____________________________ for sort -t '/' -k2.1 --debug
# ____________________________ for sort -t '/' -k3.1 --debug
# ____________ for sort -t '/' -k4.1 --debug
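The effect of the key becomes visible with three URIs taken from the example data further down. A small sketch; `LC_ALL=C` is an assumption that only makes the ordering reproducible.

```bash
#!/usr/bin/env bash
# With "/" as separator, field 1 is "https:", field 2 is empty and
# field 3 starts with the domain, so the protocol no longer matters.
sorted=$(printf '%s\n' \
  'https://wu.jacq.org/WU0000004' \
  'http://tun.fi/MHD.107807' \
  'https://admont.jacq.org/ADMONT100680' \
  | LC_ALL=C sort -t '/' -k3.1b)

echo "$sorted"
```

A plain sort would list the http:// line first; with the key on field 3 the order is admont, tun, wu.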
jq ~ Get Markdown Table from Dynamic Data Values
Example data (used as institutionID_20220808.json in the commands below):
{ "head": {
"vars": [ "cspp_example" , "institutionID" , "publisher" , "graph" ]
} ,
"results": {
"bindings": [
{
"cspp_example": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/p00039900" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03wkt5x30" } ,
"publisher": { "type": "uri" , "value": "https://science.mnhn.fr/institution/mnhn/collection/p/item/search" } ,
"graph": { "type": "uri" , "value": "http://coldb.mnhn.fr/catalognumber/mnhn/p/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/113251" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/0566bfb96" } ,
"graph": { "type": "uri" , "value": "http://data.biodiversitydata.nl/naturalis/specimen/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://id.herb.oulu.fi/0014586" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03yj89h83" } ,
"publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
"graph": { "type": "uri" , "value": "http://tun.fi" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://id.snsb.info/snsb/collection/1000/1579/1000" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05th1v540" } ,
"publisher": { "type": "literal" , "value": "http://www.snsb.info" } ,
"graph": { "type": "uri" , "value": "http://id.snsb.info/snsb/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://lagu.jacq.org/object/AM-02278" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/01j60ss54" } ,
"publisher": { "type": "literal" , "value": "LAGU" } ,
"graph": { "type": "uri" , "value": "http://lagu.jacq.org/object" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/K000989827" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/00ynnr806" } ,
"publisher": { "type": "uri" , "value": "https://www.kew.org" } ,
"graph": { "type": "uri" , "value": "http://specimens.kew.org/herbarium/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://tbi.jacq.org/object/TBI1014287" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/051qn8h41" } ,
"publisher": { "type": "literal" , "value": "TBI" } ,
"graph": { "type": "uri" , "value": "http://tbi.jacq.org/object" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://tun.fi/MHD.107807" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03tcx6c30" } ,
"publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
"graph": { "type": "uri" , "value": "http://tun.fi" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.342315" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05vghhr25" } ,
"publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
"graph": { "type": "uri" , "value": "http://tun.fi" }
} ,
{
"cspp_example": { "type": "uri" , "value": "http://tun.fi/MKA.863532" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/029pk6x14" } ,
"publisher": { "type": "literal" , "value": "http://gbif.fi" } ,
"graph": { "type": "uri" , "value": "http://tun.fi" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://admont.jacq.org/ADMONT100680" } ,
"institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128466393" } ,
"publisher": { "type": "literal" , "value": "ADMONT" } ,
"graph": { "type": "uri" , "value": "http://admont.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://bak.jacq.org/BAK0-0000001" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/006m4q736" } ,
"publisher": { "type": "literal" , "value": "BAK" } ,
"graph": { "type": "uri" , "value": "http://bak.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://boz.jacq.org/BOZ000001" } ,
"institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/128699910" } ,
"publisher": { "type": "literal" , "value": "BOZ" } ,
"graph": { "type": "uri" , "value": "http://boz.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://brnu.jacq.org/BRNU000205" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/02j46qs45" } ,
"publisher": { "type": "literal" , "value": "BRNU" } ,
"graph": { "type": "uri" , "value": "http://brnu.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://data.rbge.org.uk/herb/E00000001" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/0349vqz63" } ,
"publisher": { "type": "uri" , "value": "http://www.rbge.org.uk" } ,
"graph": { "type": "uri" , "value": "http://data.rbge.org.uk/herb/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://dr.jacq.org/DR000023" } ,
"institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/155418159" } ,
"publisher": { "type": "literal" , "value": "DR" } ,
"graph": { "type": "uri" , "value": "http://dr.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://ere.jacq.org/ERE0000012" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05mpgew40" } ,
"publisher": { "type": "literal" , "value": "ERE" } ,
"graph": { "type": "uri" , "value": "http://ere.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://gat.jacq.org/GAT0000014" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/02skbsp27" } ,
"publisher": { "type": "literal" , "value": "GAT" } ,
"graph": { "type": "uri" , "value": "http://gat.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://gjo.jacq.org/GJO0000012" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/00nxtmb68" } ,
"publisher": { "type": "literal" , "value": "GJO" } ,
"graph": { "type": "uri" , "value": "http://gjo.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://gzu.jacq.org/GZU000000208" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/01faaaf77" } ,
"publisher": { "type": "literal" , "value": "GZU" } ,
"graph": { "type": "uri" , "value": "http://gzu.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://hal.jacq.org/HAL0053120" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05gqaka33" } ,
"publisher": { "type": "literal" , "value": "HAL" } ,
"graph": { "type": "uri" , "value": "http://hal.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://herbarium.bgbm.org/object/B100000004" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/00bv4cx53" } ,
"publisher": { "type": "literal" , "value": "BGBM" } ,
"graph": { "type": "uri" , "value": "http://herbarium.bgbm.org/object/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://id.smns-bw.org/smns/collection/275449/772800/279829" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05k35b119" } ,
"graph": { "type": "uri" , "value": "http://id.smns-bw.org/smns/collection/" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://je.jacq.org/JE00000020" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/05qpz1x62" } ,
"publisher": { "type": "literal" , "value": "JE" } ,
"graph": { "type": "uri" , "value": "http://je.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://kiel.jacq.org/KIEL0007010" } ,
"institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/239180770" } ,
"publisher": { "type": "literal" , "value": "KIEL" } ,
"graph": { "type": "uri" , "value": "http://kiel.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://lz.jacq.org/LZ161177" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03s7gtk40" } ,
"publisher": { "type": "literal" , "value": "LZ" } ,
"graph": { "type": "uri" , "value": "http://lz.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://mjg.jacq.org/MJG000015" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/023b0x485" } ,
"publisher": { "type": "literal" , "value": "MJG" } ,
"graph": { "type": "uri" , "value": "http://mjg.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://pi.jacq.org/PI000648" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03ad39j10" } ,
"publisher": { "type": "literal" , "value": "PI" } ,
"graph": { "type": "uri" , "value": "http://pi.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://prc.jacq.org/PRC2535" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/024d6js02" } ,
"publisher": { "type": "literal" , "value": "PRC" } ,
"graph": { "type": "uri" , "value": "http://prc.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://tub.jacq.org/TUB002830" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03a1kwz48" } ,
"publisher": { "type": "literal" , "value": "TUB" } ,
"graph": { "type": "uri" , "value": "http://tub.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://ubt.jacq.org/UBT0010195" } ,
"institutionID": { "type": "uri" , "value": "http://viaf.org/viaf/142509930" } ,
"publisher": { "type": "literal" , "value": "UBT" } ,
"graph": { "type": "uri" , "value": "http://ubt.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://w.jacq.org/W0000011a" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/01tv5y993" } ,
"publisher": { "type": "literal" , "value": "W" } ,
"graph": { "type": "uri" , "value": "http://w.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://wu.jacq.org/WU0000004" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/03prydq77" } ,
"publisher": { "type": "literal" , "value": "WU" } ,
"graph": { "type": "uri" , "value": "http://wu.jacq.org" }
} ,
{
"cspp_example": { "type": "uri" , "value": "https://www.botanicalcollections.be/specimen/BR0000005065868" } ,
"institutionID": { "type": "uri" , "value": "https://ror.org/01h1jbk91" } ,
"graph": { "type": "uri" , "value": "http://botanicalcollections.be/specimen/" }
}
]
}
}
Some of these results have missing values (for example, no publisher). The point is to extract .head.vars, bind it to the variable $fields, and use it to query every value from .results.bindings[]; a missing value becomes an empty cell. The rest is formatting the output as a nice-looking Markdown table:
# table head (and adding an index column)
cat institutionID_20220808.json | jq -r '.head.vars | @tsv' | sed -r 's@^@| # | @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g;x;G;'
# sed: s@^@| # | @; add the index column header “| # | ” at line start
# sed: s@$@ |@; append the closing “ |” of the table row at line end
# sed: s@[\t]@ | @g; the values are tab separated, so replace each \t by “ | ” (the borders of the columns or table data cells)
# sed: h; copy this formatted header row into the hold space
# sed: s@[^|]@-@g; replace everything but “|” by “-” to make the markdown header separation
# sed: x; exchange: the pattern space gets the formatted header row back, the hold space gets the header separation
# sed: G; append (G) the header separation from the hold space to the header row, joined by a \n
# and get a nice complete table header:
# | # | cspp_example | institutionID | publisher | graph |
# |---|--------------|---------------|-----------|-------|
# table body
# get the table data, but sorted by another column (use sort --debug to check which key is used)
cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] |@tsv'
# the command above shows the tab-separated output
# we use “|” as column separator, and also as the separator for the column command
# long options
cat institutionID_20220808.json | jq --raw-output '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] |@tsv' \
| sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b --debug
# short options
cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] |@tsv' \
| sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b --debug
# sort -t '|' -k5.1b use | as field separator; the sort key starts at the 1st character of the 5th field and runs to the line end
# -k5.1b ignore leading b(lanks)
# -k5.1Vb version sort, ignore leading b(lanks)
# -k5.1n numeric sort (and so on)
# table body (and adding an index column (sed -r "=;"))
# long options
cat institutionID_20220808.json | jq --raw-output '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] |@tsv' \
| sed --regexp-extended 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column --table --separator '|' --output-separator '|' | sort --field-separator='|' --key=5.1b \
| sed "=" | sed --regexp-extended "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
# short options
cat institutionID_20220808.json | jq -r '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] |@tsv' \
| sed -r 's@^@| @; s@$@ |@; s@[\t]@ | @g;' | column -t -s '|' -o '|' | sort -t '|' -k5.1b \
| sed "=" | sed -r "/^[[:digit:]]/{ N; s@(^[[:digit:]]+)\n@| \1 @; }"
# | 1 | https://admont.jacq.org/ADMONT100680 | http://viaf.org/viaf/128466393 | ADMONT | http://admont.jacq.org |
# | 2 | https://bak.jacq.org/BAK0-0000001 | https://ror.org/006m4q736 | BAK | http://bak.jacq.org |
# | 3 | https://www.botanicalcollections.be/specimen/BR0000005065868 | https://ror.org/01h1jbk91 | | http://botanicalcollections.be/specimen/ |
# | … | … | … | … | … |
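The header and body steps can be combined on a tiny in-memory document to see how a missing value turns into an empty cell. This is only a sketch: the two-column JSON is made up, the index column, alignment and sorting are left out, and jq plus GNU sed (for `[\t]`) are assumed to be installed.

```bash
#!/usr/bin/env bash
# Hypothetical miniature of the data above; the second binding lacks "b".
json='{"head":{"vars":["a","b"]},"results":{"bindings":[{"a":{"value":"1"},"b":{"value":"2"}},{"a":{"value":"3"}}]}}'

if ! command -v jq > /dev/null 2>&1; then
  echo "jq not found, skipping the example" >&2
else
  # Header row plus separation line (hold-space trick as described above).
  header=$(printf '%s' "$json" | jq -r '.head.vars | @tsv' \
    | sed -E 's@^@| @; s@$@ |@; s@[\t]@ | @g; h; s@[^|]@-@g; x; G')
  # Body rows: one line per binding, values in the order of .head.vars.
  body=$(printf '%s' "$json" \
    | jq -r '.head.vars as $fields | .results.bindings[] | [.[($fields[])].value] | @tsv' \
    | sed -E 's@^@| @; s@$@ |@; s@[\t]@ | @g')
  table=$(printf '%s\n%s\n' "$header" "$body")
  printf '%s\n' "$table"
fi
```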