PIC(IMAGE, assets/generated-dark.png, IMAGE_ALT)
P()
Everyone writes their text with LLMs, right?
AHERE(not-ai, (Well, except me.))
And LLMs produce relatively good texts at times.
But what was the state of affairs before LLMs killed text generation?
P()
It was fun!
There were lots of algorithms and attempts to deconstruct language.
And to build texts back from the parts, be it letters, words, or grammatical structures.
P()
I wanted to try (and implement) all the methods I could find and understand.
This post is the result of that search and those attempts.
P()
From now on, all the text will be generated by the algorithms I describe.
Except section headers, code snippets, and the closing word, of course—you need to understand what I’m doing, after all.
Random Chars
P()
uibv6>O0nF|"a.,L}F|+wVLIZ n$C6!rDyYuuG(1XnlSIVa;{)"[_|C*TCoB!zc3leH}5-@#Dl|7
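PRE(lisp)
;; A minimal sketch of how a line like the one above could be produced.
;; The name RANDOM-CHARS-GENERATE and the printable-ASCII range are
;; assumptions: the post shows no code for this section.
(defun random-chars-generate (&optional (length 80))
  (loop repeat length
        ;; 95 printable ASCII codes: from space (32) up to tilde (126).
        collect (code-char (+ 32 (random 95))) into chars
        finally (return (coerce chars 'string))))
(random-chars-generate (+ 50 (random 300)))
PRECAP(A sketch of uniform random char generation (assuming printable ASCII only))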
Frequency Generation
P()
tpfo
A(https://en.wikipedia.org/wiki/Letter_frequency, oooPnrresoitoi)
hahthw bosaerrliwrn \u oesage pseanogigaekln
ierfhinelc .ne. e hrf d.mtlhbemidtltehtn nneerogsue lcrhndh rseh
ul iitwBr rigege o eaofs qeóok axe l
PRE(lisp)
(defvar *freqs* (make-hash-table))
(let ((str (uiop:read-file-string
HASH()p"~/web/public/assets/war-and-peace.txt")))
(clrhash *freqs*)
(loop for char across str
do (setf (gethash char *freqs*)
(1+ (gethash char *freqs* 0)))))
PRECAP(Calculating the occurrence frequency for chars in War and Peace)
P()
rgio rno \n, mtiargi. ar tteeanshoysfPdf sso. o soaatcs,reoi
o cesla rnr huhiut urohntcaa”owraawponiagctiatag,o
nPuals h snepktehhnwólgtesatenerioKoytaeearhslufe o rewtonet p. \n anu ddt egonmeynn spbie \r!oenIg,ttwl ianalr, aeMu Atnsehd es Aretioewouir rusditpdgeo deyhdannintanecgh
p ihuislu ea ggir
sfe dmi.et
PRE(lisp)
(defun random-hash-key (hash &optional (sum (loop for i being the hash-value of hash sum i)))
;; Weighted pick: draw a number below the total count and walk the table
;; until the running remainder drops below zero.
(loop with value = (random sum)
for char being the hash-key of hash
using (hash-value occurrences)
do (decf value occurrences)
when (minusp value)
do (return char)))
(defun freq-generate (&optional (length 1000))
(loop for i below length
collect (random-hash-key *freqs*)
into letters
finally (return (coerce letters 'string))))
(freq-generate (+ 50 (random 300)))
PRECAP(Getting the frequency-generated string)
P()
uonai.cei ePesofe—msletnelasfhn e \ts
oopiiatóa sstnrde’ ssk,tri nHttodog eeeute od ,asecladd esi
P()
\t oytgs co cgaeucodeantc edh tor seatne apr wphn. tvrg e
oaa
Markov Chains
P()
FFe y \t d dd e
P()
A(https://en.wikipedia.org/wiki/Markov_chain, plisaboy ouskhe)
be höwaprd,
bunthil ay bewaye Bespore he f
anofity aitind ackes core \t I is ro \f o, ougrshed, htese, ine Tormpl
imofertol!” hste songnofonden te ouftoned te atinletherimedathie Fro
oilfre lecocusondinters wat
PRE(lisp)
(defvar *markov* (make-hash-table))
(let ((str (uiop:read-file-string
HASH()p"~/web/public/assets/war-and-peace.txt")))
(clrhash *markov*)
(loop for char across str
and prev-char = char
unless (gethash prev-char *markov*)
do (setf (gethash prev-char *markov*)
(make-hash-table))
when prev-char
do (setf (gethash char (gethash prev-char *markov*))
(1+ (gethash char (gethash prev-char *markov*) 0)))
finally (remhash nil *markov*)))
PRECAP(Feeding War and Peace to Markov chains)
P()
APring
limof ld traverned lin as túteremaice ic in
PRE(lisp)
(defun markov-generate (&optional (length 1000))
(loop repeat length
for char
= (loop repeat (random (hash-table-count *markov*))
for char being the hash-key of *markov*
finally (return char))
then (random-hash-key (gethash char *markov*))
when char
collect char
into letters
finally (return (coerce letters 'string))))
(markov-generate (+ 50 (random 300)))
PRECAP(Markov chains generation)
P()
meverins me, a bun athed. te Thin rutouratanl d as icr. t
Ines smorpowedmindind intang de.” the hiondred he asmes
insanbat \f siokntove af I’veld
P()
incen ve Sed thenlond prsme tomedis (we—agn beress mnghacelin, An’s homof sstocendicouis wanthoutt hin
Dissociated Press
P()
tenacity which were valued
A(https://en.wikipedia.org/wiki/Dissociated_press, The cold stern motionless Att ention shouted)
Denísov was in the door The whole place If it will lie and more hugging him of a side and did not believe me what concern me pass Cursing and his whole meaning of forcing an air brightly lit up was carried out in the same before the Emperor went to the first column he is unthinkable otherwise 3 Morality and galloped in the room Magnítski s voice Ah what is the cause me a conflict that could not merely amused himself with him to business an immense forces of France badly I am afraid that wail tears in the will result of them toward him as is determined manly voice said he termed it forbidden earthly life
PRE(lisp)
(defvar *dissociated* (make-hash-table :test HASH()'equalp))
(let* ((str (uiop:read-file-string
HASH()p"~/web/public/assets/war-and-peace.txt"))
(words (uiop:split-string str :separator " ,.!?:—-$%=();/*#[]’”“
"))
(words (remove-if HASH()'uiop:emptyp words)))
(clrhash *dissociated*)
(loop for word in words
and prev-word = word
unless (gethash prev-word *dissociated*)
do (setf (gethash prev-word *dissociated*)
(make-hash-table :test HASH()'equalp))
when prev-word
do (setf (gethash word (gethash prev-word *dissociated*))
(1+ (gethash word (gethash prev-word *dissociated*) 0)))
finally (remhash nil *dissociated*)))
PRECAP(Filling the Dissociated Press word table)
P()
march is unarmed inhabitants of hussars who was at Drissa camp And so often been torn off to give a direct descent of generals But there s and cautious with the Lodge meetings with me \r s path that and under precisely two conceptions of the dust churned up to put that is very much astonished gaze under the nineteenth century with a field behind the eldest who got it
PRE(lisp)
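(defun join-spaced (words)
;; FORMAT joins the words with single spaces and returns "" for an empty list
;; (REDUCE with a two-argument lambda would signal an error there).
(format nil "~{~a~^ ~}" words))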
(defun dissociated-generate (&optional (length 100))
(loop repeat length
for word
= (loop repeat (random (hash-table-count *dissociated*))
for word being the hash-key of *dissociated*
finally (return word))
then (random-hash-key (gethash word *dissociated*))
when word
collect word
into words
finally (return (join-spaced words))))
(dissociated-generate (random 100))
PRECAP(Generating the text with Dissociated Press)
P()
muslin with whom he heard but yet reached the drawing room to that he
Generation With Grammar
P()
A(https://en.wikipedia.org/wiki/Generative_grammar, which repulsed breathless premium violently)
Komaróv attaching maggot smoking unmeaningly seers Finnish saves principal Ismáylov gendarmes symbol jesting unattractive nonintervention futility Gavríl farming ironically qu distinctive Manna acres reveal Seslávin talks swishing masterly relations kneaded superfluity spun imperative most girdles impatiently eager scanned experiment nondescript Makárovna devices exclusively craftsmanship exclusively participants humiliate
PRE(lisp)
(defvar *nouns* (make-hash-table :test HASH()'equalp))
(defvar *verbs* (make-hash-table :test HASH()'equalp))
(defvar *adjectives* (make-hash-table :test HASH()'equalp))
(let* ((str (uiop:read-file-string
HASH()p"~/web/public/assets/war-and-peace.txt"))
(words (uiop:split-string str :separator " ,.!?:—-$%=();/*#[]’”“
"))
(words (remove-if HASH()'uiop:emptyp words)))
(clrhash *nouns*)
(clrhash *verbs*)
(clrhash *adjectives*)
(flet ((suffix (word suffix)
(uiop:string-suffix-p word suffix)))
(loop for word in words
if (or (suffix word "fy")
(suffix word "ize")
(suffix word "en")
(suffix word "ate")
(suffix word "ed")
(suffix word "ing")
(suffix word "es")
(suffix word "ould"))
do (setf (gethash word *verbs*) 1)
else if (or (suffix word "al")
(suffix word "ble")
(suffix word "an") (suffix word "ian")
(suffix word "ary")
(suffix word "ful")
(suffix word "ic")
(suffix word "ive")
(suffix word "ish")
(suffix word "less")
(suffix word "y")
(suffix word "ous")
(suffix word "ose")
(suffix word "nt")
(suffix word "ile"))
do (setf (gethash word *adjectives*) 1)
else
do (setf (gethash word *nouns*) 1))))
PRECAP(Classifying words into parts of speech by suffixes)
P()
6 rinsed banteringly errors above disapproving antagonistic arcade astounded vient build standstill chewed positively prevents inconceivable straighter sacrificed philosophic intruder contemporaneously clowns pawing
PRE(lisp)
(defun grammar-generate (&optional (length 100))
(flet ((maybe ()
(zerop (random 2))))
(join-spaced
(loop for i below (/ length 3)
when (maybe)
collect (random-hash-key *adjectives*)
collect (random-hash-key *nouns*)
collect (random-hash-key *verbs*)
when (maybe)
collect (random-hash-key *adjectives*)
and collect (random-hash-key *nouns*)))))
(grammar-generate (random 100))
PRECAP(Generation with a (rather simplistic) (A)NV(AN) grammar)
P()
Andwew marrying noticeably conversation apologizing manly drop Povarskóy oeuvre flagging unjustly 178 denser treading clergy owns
Okay, Enough!
P()
This stuff was deranged and I doubt anyone read past the first two lines of every paragraph.
Because that’s the thing with LLM-less text generation—it’s primitive and nonsensical.
Except for grammars, maybe—the complex ones might encode relatively sensible sentences.
But I like Dissociated Press the most, and I want to feed all my blog posts to it to see what kind of language I end up with.
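P()
Feeding several texts to Dissociated Press mostly means accumulating word pairs into one table instead of clearing it for every file.
Here is a minimal sketch of that idea, not the code behind this post: DISSOCIATED-FEED and *WORD-SEPARATORS* are hypothetical names, the separator list is shortened, and UIOP is assumed to be available through ASDF.
PRE(lisp)
(require "asdf") ; UIOP ships with ASDF

;; Shortened separator list (an assumption): punctuation plus space and newline.
(defparameter *word-separators* " ,.!?:;()
")

(defun dissociated-feed (text table)
  "Add the word pairs of TEXT to TABLE without clearing it first."
  (let ((words (remove-if 'uiop:emptyp
                          (uiop:split-string text :separator *word-separators*))))
    ;; PREV and WORD walk over adjacent pairs. WORD is NIL at the last tail.
    (loop for (prev word) on words
          while word
          do (incf (gethash word
                            (or (gethash prev table)
                                (setf (gethash prev table)
                                      (make-hash-table :test 'equalp)))
                            0)))
    table))
PRECAP(A sketch of accumulating several texts in one word table)
Calling it once per post with the same table (say, the *dissociated* one from above) should leave something that dissociated-generate can walk as before.
Word pairs never span two posts this way.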
P()
I hope you got something out of these machine doodles.
Even if it is only the names of the approaches and disgust at what I’m doing 😅