Spring AI: Generate Images Based on the Given Text with Open AI APIs

4 min read · Jan 6, 2024

AI applications are playing a growing role across industries, changing how tasks are performed, decisions are made, and problems are solved. Many engineers are building AI applications to sharpen their skills and contribute to their organization's success.

In this article, we explore how to generate images from user-provided text or a topic using the OpenAI APIs and Spring AI.

In the previous article, we explored integrating the OpenAI APIs with Java using Spring AI.

The previous article serves as the basis for this post. Please read it first if you have not already done so.

What is Spring AI?

Spring AI takes inspiration from Python projects like LangChain and LlamaIndex, but it is designed specifically for the Spring ecosystem. Java developers can therefore rely on familiar Spring concepts and patterns, which eases both learning and adoption.

Generate Image using Spring AI and Open AI APIs

To generate images with OpenAI, we use the DALL-E APIs. DALL-E generates images from natural language descriptions. Besides generating new images, the DALL-E APIs can also edit existing images and create variations of them. In this article we only use image generation.

Implementation Approach

We will build on the code base from the earlier post: Spring AI: Streamlining AI Application Development with Java and AI APIs.

Let us add the image generation URL to our YAML configuration. The updated configuration is below.

spring:
  ai:
    openai:
      api-key: sk-XXXXX
      image-generator-url: https://api.openai.com/v1/images/generations

Let us build a simple REST endpoint that takes an image description and image size as input and provides the image URL as output.

Let us define the request and response objects for our REST endpoint.

Request Class

package com.techmonks.openai.model;

import lombok.Getter;
import lombok.Setter;

@Getter @Setter
public class ImageGeneratorRequest {
    private String promptText;
    private String imageSize;
}
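
The imageSize value is passed to DALL-E as-is, so an unsupported value only fails once the remote call is made. A minimal sketch of an early check follows. The ImageSizes class is hypothetical and not part of the original project, and the list of sizes assumes the request goes to DALL-E 2 (which accepts 256x256, 512x512, and 1024x1024); other models accept different sizes.

```java
import java.util.Set;

// Hypothetical helper, not part of the original project.
final class ImageSizes {
    // Assumption: sizes accepted by DALL-E 2; adjust if a different model is selected.
    private static final Set<String> SUPPORTED = Set.of("256x256", "512x512", "1024x1024");

    private ImageSizes() {
    }

    // Returns the size unchanged when it is supported, otherwise fails fast.
    static String requireSupported(String imageSize) {
        if (imageSize == null || !SUPPORTED.contains(imageSize)) {
            throw new IllegalArgumentException("Unsupported image size: " + imageSize);
        }
        return imageSize;
    }
}
```

The controller could call ImageSizes.requireSupported(imageGeneratorRequest.getImageSize()) before building the DALL-E request and map the exception to a 400 response.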

Response Classes

package com.techmonks.openai.model;

import lombok.Getter;

@Getter
public class GeneratedImage {
    private String url;
}

package com.techmonks.openai.model;

import lombok.Getter;
import java.util.List;

@Getter
public class GeneratedImageResponse {
    private List<GeneratedImage> data;
}
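
The endpoint in the next section also populates a GenerateImageRequest object whose source is not listed in this article. The sketch below is a reconstruction, not the author's class: the field names are inferred from the setters the endpoint calls (setPrompt, setSize, setN), and plain accessors stand in for Lombok so the snippet compiles on its own.

```java
// Hypothetical reconstruction of the DALL-E request body used by the endpoint.
final class GenerateImageRequest {
    private String prompt; // text description of the image to generate
    private String size;   // image dimensions, for example "512x512"
    private int n;         // number of images to generate

    public String getPrompt() {
        return prompt;
    }

    public void setPrompt(String prompt) {
        this.prompt = prompt;
    }

    public String getSize() {
        return size;
    }

    public void setSize(String size) {
        this.size = size;
    }

    public int getN() {
        return n;
    }

    public void setN(int n) {
        this.n = n;
    }
}
```

With these getters, Jackson serializes the object to a JSON body with the keys prompt, size, and n, which are the field names the image generation endpoint expects.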

REST Endpoint


    @PostMapping("images")
    public ResponseEntity<GeneratedImageResponse> generateImages(@RequestBody ImageGeneratorRequest imageGeneratorRequest) throws JsonProcessingException {
        PromptTemplate promptTemplate = new PromptTemplate("""
                Please create a prompt that generates the image about {promptText}. Please enhance the prompt
                to make it more creative and fancy. Prompt length should not exceed 300 characters very strictly.""");
        // The image API caps the prompt at 1000 characters, so we ask for a much shorter prompt to stay within the limit
        promptTemplate.add("promptText", imageGeneratorRequest.getPromptText());
        ChatResponse chatResponse = this.chatClient.generate(promptTemplate.create());
        String prompt = chatResponse.getGeneration().getContent();
        GenerateImageRequest generateImageRequest = new GenerateImageRequest();
        generateImageRequest.setPrompt(prompt);
        generateImageRequest.setSize(imageGeneratorRequest.getImageSize());
        generateImageRequest.setN(1);
        ObjectMapper objectMapper = new ObjectMapper();
        String imageJsonRequest = objectMapper.writeValueAsString(generateImageRequest);
        GeneratedImageResponse generatedImageResponse = getGeneratedImageResponse(imageJsonRequest);
        return ResponseEntity.ok(generatedImageResponse);

    }

    private GeneratedImageResponse getGeneratedImageResponse(String imageJsonRequest) {
        return webClient.post().uri(imageGeneratorUrl)
                .header("Authorization", "Bearer " + apiKey)
                .contentType(MediaType.APPLICATION_JSON)
                .body(BodyInserters.fromValue(imageJsonRequest))
                .retrieve()
                // For any 4xx or 5xx response, surface the error body returned by the API
                .onStatus(HttpStatusCode::isError, res -> res.bodyToMono(String.class)
                        .flatMap(body -> Mono.error(new RuntimeException(body))))
                .bodyToMono(GeneratedImageResponse.class)
                // Wait for the response, since the endpoint is written in a synchronous style
                .block();
    }

In the above example, we first send the user-provided text to OpenAI and ask it to turn that text into a richer, more descriptive image prompt of at most 300 characters.
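
The chat model is only asked to respect the length limit, so it can still return a longer prompt. A defensive sketch that enforces the 1000-character cap mentioned in the code comment is shown below. PromptLimits and clamp are hypothetical names, and the sketch assumes the limit is counted in Java characters.

```java
// Hypothetical helper, not part of the original project.
final class PromptLimits {
    // Limit taken from the comment in the endpoint code.
    static final int MAX_PROMPT_LENGTH = 1000;

    private PromptLimits() {
    }

    // Trims surrounding whitespace and cuts the prompt at the limit if needed.
    static String clamp(String prompt) {
        String trimmed = prompt == null ? "" : prompt.strip();
        if (trimmed.length() <= MAX_PROMPT_LENGTH) {
            return trimmed;
        }
        int end = MAX_PROMPT_LENGTH;
        // Avoid cutting a surrogate pair in half.
        if (Character.isHighSurrogate(trimmed.charAt(end - 1))) {
            end--;
        }
        return trimmed.substring(0, end);
    }
}
```

In the endpoint, the refined prompt could be passed through PromptLimits.clamp before it is set on the DALL-E request.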

After receiving the refined prompt, we send the prompt, the image size, and the number of images to the DALL-E API. The DALL-E API responds with the URL of each generated image.
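
To make the wire format visible, here is a JDK-only sketch that builds the same three-field JSON body by hand. In the article's code, Jackson's ObjectMapper produces this body; DalleJsonSketch and its methods are hypothetical names used for illustration only.

```java
// Hypothetical illustration of the JSON body sent to the image generation endpoint.
final class DalleJsonSketch {
    private DalleJsonSketch() {
    }

    // Builds {"prompt": ..., "size": ..., "n": ...} without a JSON library.
    static String toJson(String prompt, String size, int n) {
        return "{\"prompt\":\"" + escape(prompt) + "\","
                + "\"size\":\"" + escape(size) + "\","
                + "\"n\":" + n + "}";
    }

    // Escapes the characters that JSON does not allow inside a string literal.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c == '\n') {
                sb.append("\\n");
            } else if (c == '\r') {
                sb.append("\\r");
            } else if (c == '\t') {
                sb.append("\\t");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

For real applications a JSON library such as Jackson remains the safer choice; the hand-written version is only meant to show what travels over the wire.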

Below is the Postman collection to verify it.

{
 "info": {
  "_postman_id": "b3f73dc4-7eaf-4065-8b4c-5b28b469d744",
  "name": "open-ai",
  "schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
  "_exporter_id": "3152420",
  "_collection_link": "https://releasedashboard.postman.co/workspace/Anji~01fb9b44-0970-46dd-9195-56044a7ead66/collection/3152420-b3f73dc4-7eaf-4065-8b4c-5b28b469d744?action=share&source=collection_link&creator=3152420"
 },
 "item": [
  {
   "name": "translate-text",
   "request": {
    "method": "POST",
    "header": [],
    "body": {
     "mode": "raw",
     "raw": "{\r\n    \"translateFrom\":\"English\",\r\n    \"translateTo\":\"Hindi\",\r\n    \"textToTranslate\":\"Hello, How are you doing today?\"\r\n}",
     "options": {
      "raw": {
       "language": "json"
      }
     }
    },
    "url": {
     "raw": "http://localhost:8080/v1/translations",
     "protocol": "http",
     "host": [
      "localhost"
     ],
     "port": "8080",
     "path": [
      "v1",
      "translations"
     ]
    }
   },
   "response": []
  },
  {
   "name": "dalle-e-images",
   "request": {
    "method": "POST",
    "header": [],
    "body": {
     "mode": "raw",
     "raw": "{\r\n    \"promptText\": \"cat and dog\",\r\n    \"imageSize\":\"512x512\"\r\n}",
     "options": {
      "raw": {
       "language": "json"
      }
     }
    },
    "url": {
     "raw": "http://localhost:8080/v1/images",
     "protocol": "http",
     "host": [
      "localhost"
     ],
     "port": "8080",
     "path": [
      "v1",
      "images"
     ]
    }
   },
   "response": []
  }
 ]
}
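
If Postman is not at hand, the same check can be made from Java. The sketch below builds the request from the "dalle-e-images" entry above using the JDK HTTP client. LocalImageClientSketch is a hypothetical name, and the code assumes the application is running on localhost:8080 and that the arguments contain no characters that need JSON escaping.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client for the local endpoint, mirroring the Postman request.
final class LocalImageClientSketch {
    private LocalImageClientSketch() {
    }

    // Builds a POST request for /v1/images with the same JSON shape as the collection.
    static HttpRequest imageRequest(String baseUrl, String promptText, String imageSize) {
        // Assumption: neither argument contains quotes or other characters needing escaping.
        String body = "{\"promptText\":\"" + promptText + "\",\"imageSize\":\"" + imageSize + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/images"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }
}
```

Once the application is up, the request can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the response body should contain the data array with the image URL.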

As always, you can find the entire source code here.

That’s all for today!

Thank you for taking the time to read this article. I hope you have enjoyed it. If you enjoyed it and would like to stay updated on various technology topics, please consider following and subscribing for more insightful content.

References:

Spring AI (docs.spring.io): The Spring AI project aims to streamline the development of applications that incorporate artificial intelligence…


Written by Anji…
