
Spring AI: Generate Images Based on the Given Text with OpenAI APIs
AI applications are set to play a vital role across industries, transforming the way tasks are performed, decisions are made, and problems are solved. Many engineers are building AI applications to sharpen their skills and contribute to their organization's success.
In this article, we explore how to generate images from user-provided text or a topic using the OpenAI APIs and Spring AI.
In the previous article, we explored integrating the OpenAI APIs with Java using Spring AI.
The previous article serves as the basis for this post. Please read it first if you have not already done so.
What is Spring AI?
Spring AI takes inspiration from Python projects like LangChain and LlamaIndex, but it’s specifically designed for the Spring ecosystem. It builds on Spring concepts and patterns that Java developers already know, which eases learning and adoption.
Generate Images Using Spring AI and OpenAI APIs
To generate images using OpenAI, we need to use the DALL-E APIs. DALL-E generates images from natural language descriptions. Besides generating new images, the DALL-E APIs can also edit existing images and create variations of them.
Implementation Approach
We will be leveraging the code base that we have implemented as part of the blog: Spring AI: Streamlining AI Application Development with Java and AI APIs.
Let us add the image generation URL to our YAML configuration. Below is the updated configuration.
spring:
  ai:
    openai:
      api-key: sk-XXXXX
      image-generator-url: https://api.openai.com/v1/images/generations
Let us build a simple REST endpoint that takes an image description and image size as input and provides the image URL as output.
Let us define the request and response objects for our REST endpoint.
Request Class
package com.techmonks.openai.model;

import lombok.Getter;
import lombok.Setter;

@Getter @Setter
public class ImageGeneratorRequest {
    private String promptText;
    private String imageSize;
}
Response Class
package com.techmonks.openai.model;

import lombok.Getter;

@Getter
public class GeneratedImage {
    private String url;
}

package com.techmonks.openai.model;

import lombok.Getter;
import java.util.List;

@Getter
public class GeneratedImageResponse {
    private List<GeneratedImage> data;
}
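The endpoint in the next section also uses a GenerateImageRequest class that this article does not list. Below is a minimal sketch, assuming it simply mirrors the prompt, size, and n fields of the OpenAI image generation request body. It uses explicit accessors instead of Lombok so that it compiles on its own, and the actual class in the source repository may differ.

```java
// Hypothetical sketch of the request body sent to the image generation API.
// Assumption: the field names match the JSON keys "prompt", "size" and "n".
// Declare it public in its own file (package com.techmonks.openai.model)
// when adding it to the project.
class GenerateImageRequest {
    private String prompt; // text description of the desired image
    private String size;   // image dimensions, for example "512x512"
    private int n;         // number of images to generate

    public String getPrompt() { return prompt; }
    public void setPrompt(String prompt) { this.prompt = prompt; }

    public String getSize() { return size; }
    public void setSize(String size) { this.size = size; }

    public int getN() { return n; }
    public void setN(int n) { this.n = n; }
}
```

Because the class exposes standard getters, Jackson's ObjectMapper can serialize it to JSON without extra annotations.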
REST Endpoint
// chatClient, webClient, apiKey, and imageGeneratorUrl are fields of the controller (declarations not shown).
@PostMapping("images")
public ResponseEntity<GeneratedImageResponse> generateImages(@RequestBody ImageGeneratorRequest imageGeneratorRequest) throws JsonProcessingException {
    // Step 1: ask the chat model to turn the user's text into a richer image prompt.
    PromptTemplate promptTemplate = new PromptTemplate("""
            Please create a prompt that generates the image about {promptText}. Please enhance the prompt
            to make it more creative and fancy. Prompt length should not exceed 300 characters very strictly.""");
    // Please note that the max prompt length is 1000 characters. Limiting the prompt length to avoid issues.
    promptTemplate.add("promptText", imageGeneratorRequest.getPromptText());
    ChatResponse chatResponse = this.chatClient.generate(promptTemplate.create());
    String prompt = chatResponse.getGeneration().getContent();

    // Step 2: build the DALL-E request from the refined prompt and the requested size.
    GenerateImageRequest generateImageRequest = new GenerateImageRequest();
    generateImageRequest.setPrompt(prompt);
    generateImageRequest.setSize(imageGeneratorRequest.getImageSize());
    generateImageRequest.setN(1);

    // Step 3: serialize the request to JSON and call the image generation API.
    ObjectMapper objectMapper = new ObjectMapper();
    String imageJsonRequest = objectMapper.writeValueAsString(generateImageRequest);
    GeneratedImageResponse generatedImageResponse = getGeneratedImageResponse(imageJsonRequest);
    return ResponseEntity.ok(generatedImageResponse);
}

private GeneratedImageResponse getGeneratedImageResponse(String imageJsonRequest) {
    GeneratedImageResponse generatedImageResponse = webClient.post().uri(imageGeneratorUrl)
            .header("Authorization", "Bearer " + apiKey)
            .contentType(MediaType.APPLICATION_JSON)
            .body(BodyInserters.fromValue(imageJsonRequest))
            .retrieve()
            // Surface the error body returned by the API for 4xx responses.
            .onStatus(HttpStatusCode::is4xxClientError, res -> {
                System.out.println(res);
                return res.bodyToMono(String.class)
                        .flatMap(body -> Mono.error(new RuntimeException(body)));
            })
            // Do the same for 5xx responses.
            .onStatus(HttpStatusCode::is5xxServerError, res -> {
                System.out.println(res);
                return res.bodyToMono(String.class)
                        .flatMap(body -> Mono.error(new RuntimeException(body)));
            })
            // Block until the response arrives, since the endpoint itself is synchronous.
            .bodyToMono(GeneratedImageResponse.class).block();
    return generatedImageResponse;
}
In the above example, we first send the user-provided text to OpenAI to refine it into a richer, more descriptive prompt, so that the generated images are more accurate.
After receiving the refined prompt, we send it together with the image size and the number of images to the DALL-E API, which returns the URL of the generated image.
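The endpoint forwards the image size to DALL-E unchecked and relies on the chat model to respect the prompt length limit. Below is a minimal sketch of validating both before the call. ImageRequestValidator is a hypothetical helper that is not part of the original code base, the size list assumes DALL-E 2 (256x256, 512x512, 1024x1024), and the 1000-character limit is the one mentioned in the endpoint's code comment.

```java
import java.util.Set;

// Hypothetical helper, not part of the original code base.
final class ImageRequestValidator {
    // Assumption: sizes accepted by DALL-E 2; other models may accept a different set.
    private static final Set<String> SUPPORTED_SIZES = Set.of("256x256", "512x512", "1024x1024");
    // Maximum prompt length mentioned in the endpoint's code comment.
    private static final int MAX_PROMPT_LENGTH = 1000;

    private ImageRequestValidator() {
    }

    // Returns the size unchanged if it is supported, otherwise throws.
    static String requireSupportedSize(String size) {
        if (size == null || !SUPPORTED_SIZES.contains(size)) {
            throw new IllegalArgumentException("Unsupported image size: " + size);
        }
        return size;
    }

    // Strips surrounding whitespace and cuts the prompt at the maximum length.
    static String limitPrompt(String prompt) {
        if (prompt == null || prompt.isBlank()) {
            throw new IllegalArgumentException("Prompt must not be empty");
        }
        String stripped = prompt.strip();
        return stripped.length() <= MAX_PROMPT_LENGTH
                ? stripped
                : stripped.substring(0, MAX_PROMPT_LENGTH);
    }
}
```

In the endpoint, these checks could wrap the values passed to setSize and setPrompt, so that an invalid size is rejected before any API call is made.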
Below is the Postman collection to verify the endpoint.
{
  "info": {
    "_postman_id": "b3f73dc4-7eaf-4065-8b4c-5b28b469d744",
    "name": "open-ai",
    "schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json",
    "_exporter_id": "3152420",
    "_collection_link": "https://releasedashboard.postman.co/workspace/Anji~01fb9b44-0970-46dd-9195-56044a7ead66/collection/3152420-b3f73dc4-7eaf-4065-8b4c-5b28b469d744?action=share&source=collection_link&creator=3152420"
  },
  "item": [
    {
      "name": "translate-text",
      "request": {
        "method": "POST",
        "header": [],
        "body": {
          "mode": "raw",
          "raw": "{\r\n \"translateFrom\":\"English\",\r\n \"translateTo\":\"Hindi\",\r\n \"textToTranslate\":\"Hello, How are you doing today?\"\r\n}",
          "options": {
            "raw": {
              "language": "json"
            }
          }
        },
        "url": {
          "raw": "http://localhost:8080/v1/translations",
          "protocol": "http",
          "host": [
            "localhost"
          ],
          "port": "8080",
          "path": [
            "v1",
            "translations"
          ]
        }
      },
      "response": []
    },
    {
      "name": "dalle-e-images",
      "request": {
        "method": "POST",
        "header": [],
        "body": {
          "mode": "raw",
          "raw": "{\r\n \"promptText\": \"cat and dog\",\r\n \"imageSize\":\"512x512\"\r\n}",
          "options": {
            "raw": {
              "language": "json"
            }
          }
        },
        "url": {
          "raw": "http://localhost:8080/v1/images",
          "protocol": "http",
          "host": [
            "localhost"
          ],
          "port": "8080",
          "path": [
            "v1",
            "images"
          ]
        }
      },
      "response": []
    }
  ]
}
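For readers who prefer code over Postman, the same request can be sketched with the JDK's built-in java.net.http client. The URL and JSON body are taken from the dalle-e-images item of the collection. The ImageEndpointClient class and its method names are hypothetical, and the sketch assumes the application is running locally on port 8080.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client for the /v1/images endpoint built in this article.
class ImageEndpointClient {
    // URL taken from the Postman collection; assumes a local instance on port 8080.
    static final String ENDPOINT = "http://localhost:8080/v1/images";

    // Escapes backslashes and double quotes only; control characters are not handled.
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the POST request with the same JSON body as the Postman item.
    static HttpRequest buildRequest(String promptText, String imageSize) {
        String body = "{\"promptText\":\"" + escape(promptText)
                + "\",\"imageSize\":\"" + escape(imageSize) + "\"}";
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Sends the request and returns the raw JSON response body.
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the application running, ImageEndpointClient.send(ImageEndpointClient.buildRequest("cat and dog", "512x512")) should return the JSON response that contains the generated image URL.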
As always, you can find the entire source code here.
That’s all for today!
Thank you for taking the time to read this article. If you enjoyed it and would like to stay updated on various technology topics, please consider following and subscribing for more content.