lifthrasiir 11 hours ago [-]
Does anyone else feel that Pandoc is getting increasingly bloated? I used Lua filters a decade ago [1] and the current documentation is nothing like what I remember. I'm not even sure how many Lua scripts remain compatible across different Pandoc versions.
With a tagline of "a universal document converter" it is almost guaranteed to become a complicated program, but how much of it is actually used for any single conversion?
Two more examples:
Rclone is "bloated" but it needs to be in order to fulfill its purpose.
ZFS is "bloated" because it combines volumes and filesystems but breaking the Unix philosophy also enables a different kind of synergy and simplicity elsewhere.
Blackthorn 5 hours ago [-]
"Bloat" just means "any feature I am not personally using and therefore deem useless and pointless".
lifthrasiir 8 hours ago [-]
A universal document converter is expected to grow by adding support for more formats, and that's okay (same for your other examples). I'm much more worried about the widening scope of the project.
applicative 6 hours ago [-]
A universal document converter knows what document it is working with and what to do with it once it has it. 'What a document is' is an AST that has resulted from a few thousand years of literate civilization. You can detect the outline of this AST - or AAST as you might call it - by asking what must be preserved in a different printing of the same, or in a translation.
A universal document converter is 'expected' to admit transformations on the AST of a document. Lua filters do this more or less directly; operations on the JSON representation do it another way.
I never used Lua filters before, not knowing Lua, but these days I use them all the time for simple problems and am getting a clearer picture of the possibilities. This is because Claude and Codex write Lua filters at the drop of a hat.
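To make the JSON route concrete, here is a minimal sketch of a filter written against pandoc's JSON representation, using only the Python standard library. It turns every `Emph` element into `Strong`. The function names (`walk`, `emph_to_strong`, `apply_filter`) are made up for illustration and are not part of pandoc; the only thing taken from pandoc is the JSON shape, where each element is an object with a `t` (type) field and usually a `c` (contents) field.

```python
import json


def walk(node, action):
    """Rebuild a pandoc JSON AST, offering every element to `action`.

    `action` returns a replacement element, or None to keep the element
    and continue walking into its children.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node)
            if replaced is not None:
                return replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node  # strings, numbers, booleans, None


def emph_to_strong(element):
    """Illustrative transformation: Emph becomes Strong, children kept."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": walk(element["c"], emph_to_strong)}
    return None


def apply_filter(json_text):
    """Take pandoc JSON as text, return transformed pandoc JSON as text."""
    doc = json.loads(json_text)
    doc["blocks"] = walk(doc["blocks"], emph_to_strong)
    return json.dumps(doc)
```

To use it as a real filter you would still need a small wrapper that reads stdin and writes stdout, e.g. `sys.stdout.write(apply_filter(sys.stdin.read()))`, and then something like `pandoc -t json in.md | python filter.py | pandoc -f json -o out.html`. A Lua filter does the same job without the JSON round trip.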
One simple illustration I have found useful for academic writing published, inter alia, in HTML arises from the willful decision of the HTML bureaucracy never to include a footnote syntax, and thus to fall short of the ABCs of any document concept however narrow and curtailed: having said 'oh, we don't need footnotes, we have hypertext' back in Clinton times, they are too proud to change. In fact, of course, HTML is the format par excellence of footnotes, as a gander at Wikipedia will tell you. Pandoc can't parse them out of HTML, including its own HTML, since there is nothing to parse: the reader recognizes them by inspection in the browser. But you can ask Claude to write a Lua filter that, e.g., recognizes pandoc's own HTML footnotes, which are as arbitrary as everyone else's, and generates the structure intended by the author, in which they are footnotes.
redsocksfan45 7 hours ago [-]
[dead]
a1o 8 hours ago [-]
We have used it for seven years and it still runs fine whenever we update Pandoc, and we nearly always keep things updated. I don't remember anything about the docs, so I'm not sure what changed.
fwip 6 hours ago [-]
I might be worried if it wasn't pandoc. It's always been bulletproof for me.
kalcode 4 hours ago [-]
I liked Lua filters for solving DOCX issues when converting Markdown to DOCX.
For PDF output I haven't needed many Lua filters since switching to WeasyPrint as the PDF engine.
leephillips 6 hours ago [-]
Lua filters for Pandoc have been around for quite a while. What’s newer is Pandoc’s ability to be used in web browsers. There’s a bit more about this and a general rundown of Pandoc in my recent article for LWN:
I've always wondered if pandoc can be made reactive. Say, Markdown to Pandoc AST.
If one changes something in the source, the AST would be updated quickly and incrementally rather than rebuilt from scratch.
Now, with all these LLMs, I might actually see if it can be done.
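A rough sketch of what "incremental" could mean here: cache the parsed result of each top-level block, keyed by a hash of its source text, and re-parse only blocks whose text changed. Everything in it is hypothetical, including `IncrementalParser`, `split_blocks` and the toy per-block parser, which stands in for a real Markdown parser and does not call pandoc. Splitting on blank lines is also a deliberate oversimplification: fenced code, lists, footnotes and reference links make real Markdown blocks depend on their context.

```python
import hashlib


def split_blocks(markdown):
    """Naive split on blank lines; real Markdown blocks are not this simple."""
    return [chunk.strip() for chunk in markdown.split("\n\n") if chunk.strip()]


class IncrementalParser:
    """Re-parse only the blocks whose source text changed since last time."""

    def __init__(self, parse_block):
        self.parse_block = parse_block  # hypothetical per-block parser
        self.cache = {}                 # content hash -> parsed node
        self.parse_calls = 0            # counts real parses, for inspection

    def parse(self, markdown):
        fresh = {}
        ast = []
        for chunk in split_blocks(markdown):
            key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            node = self.cache.get(key)
            if node is None:
                node = fresh.get(key)
            if node is None:
                node = self.parse_block(chunk)
                self.parse_calls += 1
            fresh[key] = node
            ast.append(node)
        self.cache = fresh  # forget blocks that no longer appear
        return ast


def toy_parse_block(chunk):
    """Stand-in parser: wraps the text in a Para-like dict. Not pandoc."""
    return {"t": "Para", "text": chunk}
```

With three paragraphs, editing the middle one triggers a single new parse and the other two nodes are reused as-is. Whether this carries over to pandoc's actual readers is an open question.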
dapperdrake 2 hours ago [-]
Look at the --standalone flag. At least for HTML output, pandoc seems able to handle something that feels like partial pandoc input in practice and produce HTML output that behaves like a snippet.
The Pandoc AST, in the format called "native", parses faster than pandoc Markdown.
[1] https://github.com/mearie/mearie.github.io/blob/source/res/w...
https://lwn.net/Articles/1064692/