I have long struggled with the challenge of automating and operationalizing Machine Learning (ML) products, and I am also aware that many ML endeavours in the industry fail to deliver on their expectations.
Luckily, MLOps can address this situation, and I fully agree with D. Kreuzberger et al., who, in “Machine Learning Operations (MLOps): Overview, Definition, and Architecture”, fully describe the field and propose an innovative approach.
After conducting intensive research, D. Kreuzberger et al. identify nine principles required to realize successful MLOps. I won't describe those principles at length, as a quick look at the figure below is self-explanatory; they also provide the associated components.
Source: D. Kreuzberger, N. Kühl and S. Hirschl, “Machine Learning Operations (MLOps): Overview, Definition, and Architecture,” in IEEE Access, vol. 11, pp. 31866–31879, 2023, doi: 10.1109/ACCESS.2023.3262138.
Now for today’s blog post’s hot topic: buckle up, because I am about to dive into the world of scalable and distributed applications with REST API serving. Not to mention the game changers, Large Language Models from OpenAI like ChatGPT and DALL·E, which have completely transformed the field of machine learning and artificial intelligence as we know it!
The good news is that, since the introduction of these models, implementing the “C8 Model Serving Component (P1)” has become simpler and less expensive. So let’s get started! But before digging deep into the technicalities, have a quick look at the demo below. This app could be leveraged in innovative blogging strategies and SEO optimisation.
I will also showcase the use of this REST API in a CMS (WordPress) for automated AI-powered posts and in a Streamlit application.
Keywords: OpenAI, FastAPI, Streamlit, ChatGPT, DALL·E 2, WordPress
Flow logic: As a best practice, I always draw the flow logic when designing an application, and you can see an overview of what I plan to build below.
+--------------+
|    Start     |
+--------------+
        |
        v
+---------------------------------------+  Yes   +-----------------------+
| Check if text generation is disabled  | -----> | Return cached results |
+---------------------------------------+        +-----------------------+
        |
        | No
        v
+---------------------------------------+
| Validate input parameters             |  <-- through decorators
+---------------------------------------+
        |
        v
+---------------------------------------+
| Generate title using OpenAI API       |
+---------------------------------------+
        |
        v
+---------------------------------------+
| Generate post using OpenAI API        |
+---------------------------------------+
        |
        v
+---------------------------------------+
| Generate hashtags using OpenAI API    |
+---------------------------------------+
        |
        v
+---------------------------------------+
| Generate image using DALL-E API       |
+---------------------------------------+
        |
        v
+---------------------------------------+
| Build response                        |
+---------------------------------------+
        |
        v
+---------------------------------------+
| Return response                       |
+---------------------------------------+
        |
        v
+--------------+
|     End      |
+--------------+
For simplicity, we will make separate calls to OpenAI for the title, post, and hashtags. In a production-ready setup, they could be wrapped in a single call, as in the sketch below.
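For illustration, a single combined call could look like the following minimal sketch. This is not the approach used in the rest of this post, and it assumes the model respects the separator, which it may not always do; the theme and token budget are example values.

import openai  # assumes openai.api_key is already set

# Hypothetical combined prompt: ask for all three sections at once,
# separated by lines containing only '---', then split the result.
combined = openai.Completion.create(
    model="text-davinci-003",
    prompt=(
        "For the theme: MLOps best practices, write a title, an SEO-optimized "
        "post, and a list of hashtags, separated by lines containing only '---'."
    ),
    temperature=0.8,
    max_tokens=330,  # roughly title (30) + post (200) + hashtags (100)
)
parts = [p.strip() for p in combined.choices[0].text.split("---")]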
Let’s look at how to create a serverless REST API using FastAPI (Python) and AWS Lambda. I chose AWS to take advantage of serverless computing and low-cost REST APIs. Then, I’ll use OpenAI’s LLMs to create a sample content-generation booster.
To follow along, you just need an active AWS account, which you can create here: https://aws.amazon.com/free/, and an OpenAI API key, which you can request here. Optionally, you could also consider registering for the Streamlit share app here and testing it on your own WordPress website.
Installing FastAPI and Dependencies
As a prerequisite, we need to install FastAPI and some additional dependencies (note that logging ships with the Python standard library, so it does not need to be installed). You can run the following command to install them:
pip install "uvicorn[standard]" fastapi openai pydantic
Serverless REST API
First, let’s create an endpoint called /generate_content
that takes a text prompt as input and returns the generated post, with SEO-optimised details and an image, as the response.
from fastapi import FastAPI

app = FastAPI()

@app.get("/generate_content")
def generate_content(prompt: str, active: bool = False, model: str = "text-davinci-003",
                     language: str = "en", api_key: str = None,
                     temperature: float = 0.8, title_max_tokens: int = 30,
                     post_max_tokens: int = 200, hashtags_max_tokens: int = 100):
    # ...
    return {"message": "Success"}
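To sanity-check this skeleton once the server is running (see the Testing section later for the uvicorn command), you could hit it with a quick requests call; the URL and prompt below are just examples.

import requests

# Quick sanity check against the running skeleton (uvicorn on port 8000).
resp = requests.get(
    "http://127.0.0.1:8000/generate_content",
    params={"prompt": "MLOps best practices", "active": False},
)
print(resp.json())  # {'message': 'Success'}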
Step 1. Validate input parameters
My approach is to create a Pydantic model called ContentRequest to represent our API’s input parameters.
Based on the OpenAI API documentation and our objective, our application’s main class has the following fields:
- prompt: A string that represents the theme of the content.
- active: A boolean that specifies whether content generation is enabled. If active is set to False, the API returns a default response. I am adding this parameter because usage of the OpenAI API incurs costs; you can remove it or replace it with a tracker that helps monitor cost and usage.
- model: A string that specifies the OpenAI language model to use. By default, let’s use the text-davinci-003 model.
- language: A string that specifies the language of the response. By default, let’s use English. The API will also support French, German, and Spanish.
- api_key: A string that represents the OpenAI API key. In production, we will set the API key as an environment variable.
- temperature: A float that specifies the randomness of the response. The default is 0.8.
- title_max_tokens: An integer that specifies the maximum number of tokens in the generated title. The default is 30.
- post_max_tokens: An integer that specifies the maximum number of tokens in the generated post. The default is 200.
- hashtags_max_tokens: An integer that specifies the maximum number of tokens in the generated hashtags. The default is 100.
For instance, we can make sure the model belongs to a set of allowed values based on the OpenAI API documentation: “text-davinci-003”, “text-curie-001”, “text-babbage-001”, “gpt-3.5-turbo”. Otherwise, we raise an error message.
@validator('model')
def validate_model_name(cls, model_name):
    # Validate model name against the allowed list
    allowed_models = ["text-davinci-003", "text-curie-001", "text-babbage-001",
                      "gpt-3.5-turbo"]  # Add allowed model names here
    if model_name not in allowed_models:
        raise ValueError(f"Invalid model name. Allowed models: {', '.join(allowed_models)}")
    return model_name
Let’s then apply the same logic to the other attributes and complete the ContentRequest class:
from typing import Optional

from pydantic import BaseModel, validator

class ContentRequest(BaseModel):
    prompt: str
    active: bool = False
    model: str = "text-davinci-003"
    language: str = "en"
    api_key: Optional[str] = None
    temperature: float = 0.8
    title_max_tokens: int = 30
    post_max_tokens: int = 200
    hashtags_max_tokens: int = 100

    @validator('model')
    def validate_model_name(cls, model_name):
        # Validate model name against the allowed list
        allowed_models = ["text-davinci-003", "text-curie-001", "text-babbage-001",
                          "gpt-3.5-turbo"]  # Add allowed model names here
        if model_name not in allowed_models:
            raise ValueError(f"Invalid model name. Allowed models: {', '.join(allowed_models)}")
        return model_name

    @validator('temperature')
    def validate_temperature(cls, temperature):
        # Validate temperature value
        if temperature < 0 or temperature > 1:
            raise ValueError("Temperature value must be between 0 and 1")
        return temperature

    @validator('title_max_tokens', 'post_max_tokens', 'hashtags_max_tokens')
    def validate_max_tokens(cls, max_tokens):
        # Validate max tokens value
        if max_tokens < 1 or max_tokens > 2048:
            raise ValueError("Max tokens value must be between 1 and 2048")
        return max_tokens
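To see the validators in action, here is a throwaway snippet (not part of the API itself); the model names used are just examples:

from pydantic import ValidationError

# A valid request: the model name is in the allowed list.
ok = ContentRequest(prompt="MLOps best practices", model="gpt-3.5-turbo")
print(ok.model)  # gpt-3.5-turbo

# An unknown model name makes the @validator raise a ValidationError.
try:
    ContentRequest(prompt="MLOps best practices", model="text-ada-001")
except ValidationError as err:
    print(err)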
Notice we’re also adding decorators.
The @validator('attribute_name') decorator defines a validation function that is called when the attributes of a ContentRequest instance are set. Each validator takes the attribute name(s) as arguments and validates the corresponding value on assignment. Our endpoint can then access the validated attributes as described below:
@app.get("/generate_content")
async def generate_content(content_request: ContentRequest):
    prompt = content_request.prompt
    active = content_request.active
    model = content_request.model
    language = content_request.language
    temperature = content_request.temperature
    title_max_tokens = content_request.title_max_tokens
    post_max_tokens = content_request.post_max_tokens
    hashtags_max_tokens = content_request.hashtags_max_tokens
    # ...
    return {"message": "Success"}
Step 2. Generate title using OpenAI API
I will use OpenAI’s GPT-3 language models through the completion endpoint to generate the title, post, and hashtags throughout. First, I need the code to generate a title for a given prompt.
The title is generated using the `openai.Completion.create()` method.
The language model to use is specified by the `model` parameter of `create()`. It is obtained from the `model` attribute of the `request` object, which is passed to the function as a parameter.
The `prompt` parameter of `create()` specifies the prompt text used to generate the title. It is constructed by concatenating the `prompt` attribute of the `request` object with some additional text.
The `temperature` and `max_tokens` parameters of `create()` are set to the values of the corresponding `request` attributes; they control the randomness and length of the generated text, respectively.
The `choices` attribute of the `title_response` object contains a list of candidate responses to the given prompt, ranked by the model’s confidence. The generated title is taken from the `text` attribute of the first (and most likely) choice. Finally, the title is stripped using the `strip()` method to remove any leading or trailing whitespace. The resulting title is assigned to the `title` variable, which is then used to generate the post and hashtags.
So the code should look like this:
# Generate title
title_response = openai.Completion.create(
    model=request.model,
    prompt=f"Generate a title for the theme: {request.prompt} and, the response language is {request.language};",
    temperature=request.temperature,
    max_tokens=request.title_max_tokens
)
title = title_response.choices[0].text.strip()
I insist on completion to point out the intuition behind LLMs: we can think of them as super neural networks that excel at predicting the next token. The breakthrough in the AI field came when Google developed the Transformer architecture, introducing concepts like self-attention that made it possible to build and fine-tune LLMs.
More specifically, LLMs can be considered a type of artificial intelligence model that uses neural networks to predict the next token or word in a text sequence. These models learn to recognize patterns in language data and generate new text that is similar to the input data in style and content. LLMs are regarded as “super” neural networks due to their ability to handle large amounts of data while producing highly accurate and coherent text.
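To make the “predict the next token” intuition concrete, here is a toy sketch: the probability table below is invented for illustration only, standing in for distributions a real model would learn from data.

# Toy next-token table: probabilities are made up for illustration.
next_token_probs = {
    "machine": {"learning": 0.85, "translation": 0.10, "vision": 0.05},
    "learning": {"operations": 0.60, "rate": 0.30, "curve": 0.10},
}

def greedy_next(token: str) -> str:
    """Pick the most probable next token, mimicking greedy decoding."""
    candidates = next_token_probs.get(token, {})
    return max(candidates, key=candidates.get) if candidates else "<end>"

print(greedy_next("machine"))   # -> learning
print(greedy_next("learning"))  # -> operations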
One good example I really appreciate is the BLOOM model, with its 176 billion parameters. According to its creators, BLOOM is able to generate text in 46 natural languages and 13 programming languages. They also state that BLOOM is the first language model with over 100B parameters ever created. Additional details can be found here: https://huggingface.co/blog/bloom.
Step 3. Generate post using OpenAI API
Apart from designing the prompt to meet our needs, the logic remains the same.
# Generate post
post_response = openai.Completion.create(
    model=request.model,
    prompt=f"Generate a post SEO optimized for the theme: {title} and, the response language is {request.language};",
    temperature=request.temperature,
    max_tokens=request.post_max_tokens
)
post = post_response.choices[0].text.strip()
Step 4. Generate hashtags using OpenAI API
No comment needed !!! The very same logic.
# Generate hashtags
hashtags_response = openai.Completion.create(
    model=request.model,
    prompt=f"Generate a list of hashtags for the theme: {title} and, the response language is {request.language};",
    temperature=request.temperature,
    max_tokens=request.hashtags_max_tokens
)
hashtags = hashtags_response.choices[0].text.strip().split('\n')
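One caveat: depending on the model output, the newline split above can leave empty strings or space-separated tags in the list. A small optional cleanup (my own addition, not part of the original logic) makes the list more robust:

import re

# Extract '#tag' tokens regardless of whether the model separated them
# with newlines, spaces, or commas.
hashtags = re.findall(r"#\w+", hashtags_response.choices[0].text)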
Step 5. Generate image using DALL-E API
In this step, let's generate an image using DALL·E 2.
The image generation process starts with a call to the `openai.Image.create()` method, using a prompt built from the previously generated title to describe the image I want. I also include the `size` parameter for the image’s dimensions and the `response_format` parameter, which in our case is a URL.
# Generate image using DALL·E
image_response = openai.Image.create(
    prompt=f"Generate an image for the theme: {title}",
    size="512x512",
    response_format="url"
)
image_url = image_response.data[0].url
For additional customisation, feel free to visit the developer documentation here: https://platform.openai.com/docs/guides/images/image-generation-beta
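Note that the returned URL is temporary, so if you want to keep the image you could download it right away; a minimal sketch using requests (the filename is an arbitrary choice):

import requests

# Persist the generated image locally before the temporary URL expires.
img = requests.get(image_url, timeout=30)
with open("generated_image.png", "wb") as f:
    f.write(img.content)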
Step 6. Build response
In summary, the complete code should look like the block below, including the construction of the response data as a dictionary. We also wrap the logic in a try/except block to make sure we catch exceptions, and we set `openai.api_key` from the request when one is provided, so that a key passed through the Swagger UI is actually used.
import logging

import openai
from fastapi import FastAPI

app = FastAPI()

@app.get("/generate_content")
async def generate_content(request: ContentRequest):  # ContentRequest defined above
    try:
        if not request.active:
            # Return a default response when the active flag is set to False
            default_response = {"message": "Text generation is disabled."}
            return default_response
        # Use the API key from the request when provided, so a key passed
        # through the Swagger UI is actually used
        if request.api_key:
            openai.api_key = request.api_key
        # Use completion to generate title, post, and hashtags
        # Generate title
        title_response = openai.Completion.create(
            model=request.model,
            prompt=f"Generate a title for the theme: {request.prompt} and, the response language is {request.language};",
            temperature=request.temperature,
            max_tokens=request.title_max_tokens
        )
        title = title_response.choices[0].text.strip()
        # Generate post
        post_response = openai.Completion.create(
            model=request.model,
            prompt=f"Generate a post SEO optimized for the theme: {title} and, the response language is {request.language};",
            temperature=request.temperature,
            max_tokens=request.post_max_tokens
        )
        post = post_response.choices[0].text.strip()
        # Generate hashtags
        hashtags_response = openai.Completion.create(
            model=request.model,
            prompt=f"Generate a list of hashtags for the theme: {title} and, the response language is {request.language};",
            temperature=request.temperature,
            max_tokens=request.hashtags_max_tokens
        )
        hashtags = hashtags_response.choices[0].text.strip().split('\n')
        # Generate image using DALL·E
        image_response = openai.Image.create(
            prompt=f"Generate an image for the theme: {title}",
            size="512x512",
            response_format="url"
        )
        image_url = image_response.data[0].url
        # Construct the response data as a dictionary
        response_data = {
            "title": title,
            "post": post,
            "hashtags": hashtags,
            "image_url": image_url
        }
        return response_data
    except Exception as e:
        # Log the error
        logging.error(str(e))
        # Return an error message as a dictionary if any exception occurs
        return {"error": str(e)}
Testing the application
If you’re following along, you’re now ready to run the app. Assuming you’ve named your Python file call_openai.py:
- Make sure the required packages are successfully installed.
- Run the app using the command: uvicorn call_openai:app --reload --port 8000
- If no OpenAI API key is present in a .env file in the root directory of the project (a loading sketch follows this list), we will still be able to provide it in the Swagger UI.
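For the .env route, one common pattern is to load the key at startup with the python-dotenv package (an extra dependency, not in the install command above):

import os

import openai
from dotenv import load_dotenv  # pip install python-dotenv

# Read OPENAI_API_KEY from a .env file in the project root, if present.
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")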
If you see the result below when navigating to http://127.0.0.1:8000/generate_content, then everything is working perfectly !!!
We can now navigate to the Swagger UI and test the REST API.
Let’s play with various parameters:
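Outside the Swagger UI, you can also exercise the endpoint from Python. Note that FastAPI reads the ContentRequest model from the request body, so the JSON travels in the body even though the route is a GET (an unusual pattern; in production you may prefer POST). The values below are examples:

import requests

payload = {
    "prompt": "MLOps best practices",
    "active": True,
    "model": "text-davinci-003",
    "language": "en",
    "api_key": "YOUR_OPENAI_API_KEY",  # replace with your own key
    "temperature": 0.8,
    "title_max_tokens": 30,
    "post_max_tokens": 200,
    "hashtags_max_tokens": 100,
}
resp = requests.get("http://127.0.0.1:8000/generate_content", json=payload)
print(resp.json())  # title, post, hashtags and image_url on success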
I will skip the REST API deployment onto AWS, as there is extensive documentation and plenty of tutorials available online.
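For reference, the usual bridge between FastAPI and AWS Lambda is the Mangum adapter (pip install mangum); a minimal sketch, added to the same file as the FastAPI app:

from mangum import Mangum

# Translate API Gateway / Lambda events into ASGI requests for FastAPI;
# 'handler' is the name to configure as the Lambda entry point.
handler = Mangum(app)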
And tada:
Sample use cases
Model serving in a Streamlit application through API calls.
Feel free to test a live version here: https://postsgpt.streamlit.app/
To use this service in a Streamlit application, it is enough to make a call like the one below:
# Button to trigger the generation and save the JSON response
if st.button("Generate AI Post"):
    url = f"http://nyami-beta.us-west-2.elasticbeanstalk.com/?prompt={prompt}&active={active}&model=text-davinci-003&language={language.lower()}&temperature=0.8&title_max_tokens=30&post_max_tokens=200&hashtags_max_tokens=100&api_key={api_key}"
    save_json_response_to_file(url, prompt)
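save_json_response_to_file is a helper from my Streamlit app; as a hypothetical minimal version (the real one is not shown in this post), it could look like this:

import json

import requests
import streamlit as st

def save_json_response_to_file(url: str, prompt: str) -> None:
    """Hypothetical sketch: call the REST API, save and display the JSON."""
    data = requests.get(url, timeout=60).json()
    # Persist the result locally, keyed by the first words of the prompt.
    with open(f"{prompt[:30]}.json", "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    st.json(data)  # render the response in the Streamlit UI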
I will write another article explaining how to build such a Streamlit app from scratch.
Model serving in a WordPress application through API calls.
You can test a live version here: https://bazous.com/?p=423
Conclusion and future steps
In today's article, I provide a thorough explanation of how I use OpenAI’s GPT-3 for text generation and DALL·E for image generation. I leverage the concepts of language models and generative AI to serve a model through a REST API. The article emphasizes how MLOps can unleash creative content generation based on OpenAI’s Large Language Models, FastAPI’s Swagger UI, and AWS.
The potential of these models in a variety of fields, such as content creation, marketing, and even art, is amazing. They are clearly powerful tools with the potential to revolutionize the way we create content and communicate with machines. It is, however, critical to be aware of the ethical implications of such advanced AI systems and to use them responsibly.
As future steps, we could think about deploying CI/CD pipelines to serve such models. But for now, feel free to leave me feedback if you’ve found my approach interesting.
References
1. D. Kreuzberger, N. Kühl and S. Hirschl, “Machine Learning Operations (MLOps): Overview, Definition, and Architecture,” in IEEE Access, vol. 11, pp. 31866–31879, 2023, doi: 10.1109/ACCESS.2023.3262138.
2. 🌸 Introducing The World’s Largest Open Multilingual Language Model: BLOOM 🌸. https://huggingface.co/blog/bloom
3. Transformer: A Novel Neural Network Architecture for Language Understanding. https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
4. OpenAI API. https://platform.openai.com/docs/api-reference/