The Ultimate Story Book Generator: How to Turn a Prompt into a Book with AI

Rajat Roy
6 min read · Aug 27, 2023
Photo by János Venczák on Unsplash

Introduction

Generative models have taken over the world, and their uses are limitless. They have empowered people to increase their productivity and efficiency, and even to gather information and understand new concepts within seconds.

I will be demonstrating how to develop a tool that enables users to submit a prompt, whether it’s a topic or a basic story idea, and then utilizes GPT and Stable Diffusion to automatically generate a complete storybook.

Let's get started

The application is built on the LangChain framework in Python. As mentioned earlier, the models are OpenAI's GPT-based text-davinci-003, which generates the story text along with a prompt for a supporting image for each part of the story, and Stable Diffusion, which generates the images.

Here is a diagram to help you understand it better.

My Story Book Workflow

Code

The following code has been tried and tested on Google Colab. Let's begin by setting up the environment there.

1. Install the dependencies
!apt-get update
!apt-get install -y libreoffice
!pip install openai tiktoken langchain replicate kor python-docx unoconv

2. Set environment variables

import os
os.environ["REPLICATE_API_TOKEN"] = "PASTE_TOKEN_HERE"
os.environ["OPENAI_API_KEY"] = "PASTE_TOKEN_HERE"
os.environ["OPENAI_ORGANIZATION"] = "PASTE_ORG_ID__HERE"

3. Import Libraries

from tqdm import tqdm
from PIL import Image
import requests
from io import BytesIO
import cv2
from google.colab.patches import cv2_imshow

from docx import Document
from docx.shared import Inches, Pt
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT

import subprocess

from langchain import PromptTemplate, LLMChain
from langchain.llms import Replicate, OpenAI

4. Create the prompt template for the LangChain input

This prompt is what actually goes to the LLM as input. Here we tell the model what the user's input is, ask it to generate prompts for our image generation model that are aligned with the story, and spell out the format the output should follow.

template = """Write a short story about {topic}.
First generate a title for the story and a cover picture. Make the story in a narrative format.
Divide it into 5 chapters. Limit each chapter to 200 words. Also generate a prompt for an image generation model at the end of each chapter.
Always generate the output in following format:

Title: Main title of the story
Cover Image: Prompt for image generation model
Chapter 1:
Title: Title of chapter 1
Text: Story text of chapter 1
Image Prompt: Prompt for image generation model
Chapter 2:
Title: Title of chapter 2
Text: Story text of chapter 2
Image Prompt: Prompt for image generation model
Chapter n:
Title: Title of chapter n
Text: Story text of chapter n
Image Prompt: Prompt for image generation model



"""

prompt = PromptTemplate(template=(template), input_variables=["topic"])

As you can see, we explain to the OpenAI model what to do and which format to generate the output in. This makes it straightforward to later parse the output into a structured form such as a list, tuple, or dictionary.

5. Set up the LLM chain

llm = OpenAI(temperature=0.7, model_name="text-davinci-003", max_tokens=2000)
llm_chain = LLMChain(prompt=prompt, llm=llm)

Since we want the model to generate the story, make things up, and have full creative liberty, we set the temperature to 0.7. Feel free to play around with this number.
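For instance, a lower temperature gives tighter, more predictable output; a minimal variant (not part of the original setup) would be:

# Lower temperature => less creative, more deterministic output
llm = OpenAI(temperature=0.2, model_name="text-davinci-003", max_tokens=2000)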

6. Run the LLM chain

## ENTER YOUR TOPIC HERE
topic_text = input()

result = llm_chain.run(topic_text)

On executing this cell the user will be prompted to enter the topic of the story. Here is an example of what you can provide to the tool.

An animal kingdom which used to live peacefully in the jungle until
one day humans start to capture their land. The animals plan to attack the
human base and regain control of the land, but this turns into an
endless battle, and in the end everyone - humans and animals - dies.
End with a moral of the story.

Please note: the richness of the story depends on the quality of the prompt written by the user.

7. Save the result in a text file

with open("story.txt", 'w+') as file:
file.write(result)

This is just to create a checkpoint so that you can resume from this point at any time.
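For instance, if you come back to the notebook later, you can reload the saved text instead of calling the LLM again:

# Reload the saved story text to resume from this checkpoint
with open("story.txt", "r") as file:
    result = file.read()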

8. Convert the text into a dictionary

with open('story.txt', 'r') as file:
    lines = file.readlines()

story = {}
is_chapter = False  # becomes True once we reach the first chapter

chapters = []
chapter_dict = {}

for line in lines:

    line_sm = line.lower().strip()

    # the first title line belongs to the story, later ones to chapters
    if 'title' in line_sm and not is_chapter:
        story['title'] = line.split(':')[-1].strip()

    if 'title' in line_sm and is_chapter:
        chapter_dict['title'] = line.split(':')[-1].strip()

    if 'cover image' in line_sm:
        story['cover_image'] = line.split(':')[-1].strip()

    if 'text' in line_sm:
        chapter_dict['text'] = line.split(':')[-1].strip()

    # the image prompt is the last field of a chapter, so append the chapter here
    if 'image prompt' in line_sm:
        chapter_dict['image_prompt'] = line.split(':')[-1].strip()
        chapters.append(chapter_dict)

    # a "Chapter n:" line starts a new chapter
    if 'chapter' in line_sm:
        is_chapter = True
        chapter_dict = {}
        continue

story['chapters'] = chapters
story

Because we told the model in the prompt to generate the output in a particular format, we can parse it line by line like this and turn it into a structured dictionary.
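To make this concrete, the parsed story dictionary ends up with a shape like the sketch below (the values here are just placeholders standing in for the generated text):

# Illustrative shape of the parsed story (placeholder values)
story = {
    'title': 'Main title of the story',
    'cover_image': 'Prompt for the cover image',
    'chapters': [
        {
            'title': 'Title of chapter 1',
            'text': 'Story text of chapter 1',
            'image_prompt': 'Prompt for the chapter 1 image',
        },
        # ... one entry per chapter
    ],
}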

9. Set up Image Generation Pipeline

text2image = Replicate(
    model="stability-ai/stable-diffusion:db21e45d3f7023abc2a46ee38a23973f6dce16bb082a930b0c49861f96d1e5bf",
    input={"image_dimensions": "512x512"},
)

10. Set the image style

You need to play around with this style suffix; explore more styles here. A couple of illustrative alternatives follow the snippet below.

style_prompts = " Graphic Novel, 4K, Global Illumination, Dreamy"
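These alternative suffixes are examples only and were not tested in the original run, but they show the kind of variation you can try:

# Swap in any one of these illustrative style suffixes
# style_prompts = " Watercolor Illustration, Soft Pastel Colors, Storybook"
# style_prompts = " Digital Painting, Cinematic Lighting, Highly Detailed"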

11. Generate images from prompts

img_path = './images/'

if not os.path.exists(img_path):
    os.makedirs(img_path)

# collect the image prompts from the parsed story
image_prompts_dict = {
    "cover_image": story["cover_image"],
    "chapter_image_prompts": [chapter["image_prompt"] for chapter in story["chapters"]],
}

generated_images_dict = {}

cover_image_prompt = image_prompts_dict["cover_image"]

# append the style of the image
cover_image_prompt += style_prompts

# execute the image generation pipeline
image_output = text2image(cover_image_prompt)

# download the generated cover image and save it locally
response = requests.get(image_output)
image_output = Image.open(BytesIO(response.content))

cover_path = os.path.join(img_path, 'cover_image.jpg')
image_output.save(cover_path)

generated_images_dict["cover_image"] = cover_path

# repeat the same steps for every chapter image prompt
chapter_images = []
for idx, chapter_image_prompt in tqdm(enumerate(image_prompts_dict["chapter_image_prompts"])):
    chapter_image_prompt += style_prompts
    image_output = text2image(chapter_image_prompt)
    response = requests.get(image_output)
    image_output = Image.open(BytesIO(response.content))
    chapter_path = os.path.join(img_path, f'chapter_{idx+1}.jpg')
    image_output.save(chapter_path)
    chapter_images.append(chapter_path)

generated_images_dict["chapter_images"] = chapter_images
generated_images_dict
{'cover_image': './images/cover_image.jpg',
'chapter_images': ['./images/chapter_1.jpg',
'./images/chapter_2.jpg',
'./images/chapter_3.jpg',
'./images/chapter_4.jpg',
'./images/chapter_5.jpg']}
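To quickly sanity-check the results inside Colab, you can preview any of the saved images using the cv2 and cv2_imshow imports from step 3:

# Preview the generated cover image in the notebook
img = cv2.imread(generated_images_dict["cover_image"])
cv2_imshow(img)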

12. Create a document


# Create a new Word document
doc = Document()

# Add the story title and the cover image on the first page
title = story['title']
title_paragraph = doc.add_paragraph(title)
title_paragraph.alignment = WD_PARAGRAPH_ALIGNMENT.CENTER
title_paragraph.runs[0].bold = True
title_paragraph.runs[0].font.size = Pt(18)
doc.add_picture(generated_images_dict['cover_image'], width=Inches(4), height=Inches(3))

# Pair each chapter image with its chapter text
image_text_combinations = list(zip(generated_images_dict['chapter_images'],
                                   [chapter['text'] for chapter in story['chapters']]))

# Loop through each combination and add it to the document
for image_filename, long_text in image_text_combinations:
    doc.add_page_break()  # Add a page break for each combination

    # Create a table with 1 row and 2 columns
    table = doc.add_table(rows=1, cols=2)
    table.autofit = False
    table.columns[0].width = Inches(4)  # Adjust the width of the first column
    table.columns[1].width = Inches(2)  # Adjust the width of the second column

    # Add the image to the first cell
    cell_1 = table.cell(0, 0)
    image = cell_1.add_paragraph().add_run()
    image.add_picture(image_filename, width=Inches(4), height=Inches(3))  # Adjust width as needed

    # Add the text to the second cell
    cell_2 = table.cell(0, 1)
    paragraph = cell_2.add_paragraph(long_text)
    paragraph.alignment = WD_PARAGRAPH_ALIGNMENT.LEFT

# Save the Word document
doc.save("output.docx")

# convert the document to pdf using unoconv (installed in step 1)
input_docx_path = "output.docx"  # Replace with your actual input path
output_pdf_path = "output.pdf"   # Replace with your desired output path

try:
    subprocess.run(["unoconv", "-f", "pdf", "-o", output_pdf_path, input_docx_path], check=True)
    print("Conversion successful!")
except subprocess.CalledProcessError as e:
    print("Error:", e)

Conclusion

This is how you can create your own tool that generates an entire storybook from a single prompt. To save costs, I have currently limited the LLM chain to 5 chapters with at most 200 words each, but you can definitely increase these limits according to your needs, as sketched below. Here is one storybook which I generated using this tool. The entire notebook for this tool can be found here.
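As a rough sketch, assuming you also raise the completion budget, you could edit the chapter and word limits in the template and bump max_tokens (text-davinci-003 allows roughly 4,000 tokens for prompt and completion combined):

# e.g. in the template: "Divide it into 7 chapters. Limit each chapter to 300 words."
llm = OpenAI(temperature=0.7, model_name="text-davinci-003", max_tokens=3500)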

This message is for you!!

🚀 Hello, fellow knowledge enthusiast! 🌟

Searching for hands-on wisdom in Data Science, AI, and Python? You’re in the right place! 👩‍💻💡

I’m on a mission to demystify complexity, unleashing real-time applications that fuel your success. Let’s embark on this thrilling voyage of discovery!

Come, be a part of this exciting journey. I’m striving to reach 150 followers by year-end. 📈 Your follow is the boost I crave.

Fuel your curiosity, surge ahead. 🚀📊

Follow now and unlock the world of practical tech!
