text_generation

Text-based AI training and generation with the aitextgen package using OpenAI's GPT-2 architecture.


Project maintained by willcpope

AI Text Generation

The goal of this project was to train an artificial intelligence model to generate text.

AI Model: Lovecraftian Text Generation

This model was trained using the works of the American writer of weird fiction and horror fiction, H.P. Lovecraft. The training corpus consisted of 502,786 words.

The prompt used to generate new text is from his “Commonplace Book”, a listing of story ideas, concepts, and other elements that he might at some point work into his stories. The goal of this model was to generate text that builds on entry 111, from 1923:

“Ancient ruin in Alabama swamp—voodoo.”

Technology


This project uses the Python package aitextgen, which is built on OpenAI’s GPT-2 architecture.

aitextgen is a Python package that leverages PyTorch, Hugging Face Transformers and pytorch-lightning with specific optimizations for text generation using GPT-2, plus many added features.
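The finetune-and-generate workflow can be sketched in a few lines of aitextgen. This is a minimal sketch, not the project's exact training script: the corpus filename `lovecraft.txt` and the step count are illustrative assumptions.

```python
from aitextgen import aitextgen

# Load the base 124M-parameter GPT-2 model
# (weights are downloaded on first run).
ai = aitextgen()

# Finetune on the plain-text corpus; "lovecraft.txt" and
# num_steps are illustrative values, not the project's settings.
ai.train("lovecraft.txt", num_steps=3000)

# Generate samples seeded with the Commonplace Book prompt.
ai.generate(n=5,
            prompt="Ancient ruin in Alabama swamp—voodoo.",
            max_length=256)
```

Training on a GPU is strongly recommended; on CPU, finetuning even the smallest GPT-2 model is slow.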

Results

The model produced some interesting text samples, but it required curation to find usable results. Per the aitextgen documentation:

Not all AI generated text will be good, hence why human curation is currently a necessary strategy for many finetuned models. In testing, only 5% — 10% of generated text is viable. One of the design goals of aitextgen is to help provide tools to improve that signal-to-noise ratio.
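A first curation pass can be automated before reading samples by hand. The sketch below is a hypothetical filter, not part of aitextgen: the heuristics (a minimum word count and sentence-final punctuation) are illustrative stand-ins for whatever criteria a curator actually applies.

```python
def curate(samples, min_words=20):
    """Keep samples that are long enough and end on sentence punctuation.

    Both heuristics are illustrative; real curation of GPT-2 output
    is largely a matter of human judgment.
    """
    keep = []
    for text in samples:
        stripped = text.rstrip()
        if len(stripped.split()) >= min_words and stripped.endswith((".", "!", "?")):
            keep.append(text)
    return keep


samples = [
    "Too short.",
    "The black trees of the old swamp rose out of the mist, and the "
    "drums beat on through the night until the ruin itself seemed to answer.",
    "An unfinished fragment that trails off without",
]
print(curate(samples))  # only the second sample passes both checks
```

Even after filtering, the surviving samples still need a human read, since grammaticality says nothing about whether the plot points are usable.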

I generated 100 text samples using the Lovecraftian model. While none of the samples were 100% coherent or grammatically correct, some did include interesting plot points. The most interesting text to me was the following sample:

Ancient ruin in Alabama swamp - voodoo. The black trees of the old man seemed to tell them, but he would do not speak a thing it, but one would help no doubt a single thing could be, nor would find their father that might have left and other things.