Benutzer:Andreas Plank/Hilfe – Text conversion (HTML, Markdown, wiki, plain text, etc.)
Revision as of 11 December 2023, 10:08
The tool pandoc (https://pandoc.org) is very helpful for converting between text formats. It handles most common formats; the available input and output formats can be listed as follows:
pandoc --list-input-formats
# biblatex; bibtex; commonmark; commonmark_x; creole; csljson; csv; docbook; docx; dokuwiki; endnotexml; epub; fb2; gfm; haddock; html; ipynb; jats; jira; json; latex; man; markdown; markdown_github; markdown_mmd; markdown_phpextra; markdown_strict; mediawiki; muse; native; odt; opml; org; ris; rst; rtf; t2t; textile; tikiwiki; tsv; twiki; vimwiki;
pandoc --list-output-formats
# asciidoc; asciidoctor; beamer; biblatex; bibtex; chunkedhtml; commonmark; commonmark_x; context; csljson; docbook; docbook4; docbook5; docx; dokuwiki; dzslides; epub; epub2; epub3; fb2; gfm; haddock; html; html4; html5; icml; ipynb; jats; jats_archiving; jats_articleauthoring; jats_publishing; jira; json; latex; man; markdown; markdown_github; markdown_mmd; markdown_phpextra; markdown_strict; markua; mediawiki; ms; muse; native; odt; opendocument; opml; org; pdf; plain; pptx; revealjs; rst; rtf; s5; slideous; slidy; tei; texinfo; textile; typst; xwiki; zimwiki;
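Because pandoc prints one format name per line (the lists above are joined with semicolons only for brevity), a script can check for a format before attempting a conversion. A minimal sketch; the helper name `has_format` is hypothetical and not part of pandoc:

```bash
# has_format NAME: succeed if NAME appears as a whole line on standard input.
# -x matches the entire line, so "markdown" does not match "markdown_strict";
# -F treats NAME as a literal string rather than a regular expression.
has_format() {
  grep -qxF -- "$1"
}

# Usage (requires pandoc to be installed):
#   pandoc --list-input-formats | has_format odt && echo "odt can be read"
```

The list of formats depends on the installed pandoc version, so checking at run time is more reliable than relying on the lists shown here.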
Markdown documents
pandoc --to gfm 'LibreOffice-Text-Datei.odt' --output 'LibreOffice-Text-Datei.odt.md'
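The same call can be applied to every `.odt` file in a directory. The sketch below uses a hypothetical helper named `odt_to_gfm_all` and keeps the `.odt.md` naming from the command above. It passes `--from odt` explicitly, although pandoc can usually infer the input format from the file extension:

```bash
# odt_to_gfm_all [DIR]: convert every .odt file in DIR (default: current
# directory) to GitHub-Flavored Markdown, writing FILE.odt.md next to FILE.odt.
odt_to_gfm_all() {
  dir=${1:-.}
  for file in "$dir"/*.odt; do
    # Skip the unexpanded pattern when the directory has no .odt files.
    [ -e "$file" ] || continue
    pandoc --from odt --to gfm "$file" --output "$file.md"
  done
}

# Usage (requires pandoc to be installed):
#   odt_to_gfm_all ~/Documents
```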
Word documents
soffice --headless --convert-to txt:MediaWiki "Word-Dokument.doc"
Description of the option, quoted from soffice --help:
   --convert-to OutputFileExtension[:OutputFilterName] \
      [--outdir output_dir] [--convert-images-to]
         Batch convert files (implies --headless). If --outdir isn't
         specified, then current working directory is used as output_dir.
         If --convert-images-to is given, its parameter is taken as the
         target filter format for *all* images written to the output format.
         If --convert-to is used more than once, the last value of
         OutputFileExtension[:OutputFilterName] is effective. If --outdir
         is used more than once, only its last value is effective.
         For example:
            --convert-to pdf *.odt
            --convert-to epub *.doc
            --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
            --convert-to "html:XHTML Writer File:UTF8" \
                         --convert-images-to "jpg" *.doc
            --convert-to "txt:Text (encoded):UTF8" *.doc
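
Following the help text, `--outdir` can be combined with `--convert-to` to collect the results in a separate directory. In the sketch below, the wrapper `doc_to_mediawiki`, the helper `converted_path`, and the default directory name `wiki` are hypothetical. It also assumes that LibreOffice names the output after the input's base name plus the new extension, which is worth verifying on your installation:

```bash
# converted_path OUTDIR INPUT EXT: print the path the converted file is
# expected to have (assumption: base name of INPUT with extension EXT).
converted_path() {
  name=${2##*/}                       # strip any leading directories
  printf '%s/%s.%s\n' "$1" "${name%.*}" "$3"
}

# doc_to_mediawiki INPUT [OUTDIR]: convert INPUT to MediaWiki markup in
# OUTDIR (default: ./wiki) and print the expected output path on success.
doc_to_mediawiki() {
  outdir=${2:-wiki}
  mkdir -p "$outdir" &&
    soffice --headless --convert-to txt:MediaWiki --outdir "$outdir" "$1" &&
    converted_path "$outdir" "$1" txt
}

# Usage (requires LibreOffice to be installed):
#   doc_to_mediawiki "Word-Dokument.doc"
```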