
In this tutorial, learn how to integrate the ChatGPT engine into your Java applications in a scalable way by sending prompts to the engine only when necessary.

One of the more notable aspects of ChatGPT is its engine, which not only powers web-based chatbots but can also be integrated into your Java applications.

Whether you prefer to read or watch, let's review how you can get started with the OpenAI GPT engine in your Java projects in a scalable way, by sending prompts to the engine only when necessary:

Budget trip app

Imagine that you want to visit a city and have a specific budget. How should you spend your money and make your trip memorable? This is a good question to delegate to the OpenAI engine.

Let's help users make the most of their trip by building a simple Java application called BudgetJourney. The app can suggest multiple points of interest within the city to accommodate specific budget constraints.

The architecture of the BudgetJourney application is as follows:

[Figure: BudgetJourney app architecture]

1. The user opens the BudgetJourney web UI running on Vaadin.

2. When the user wants recommendations for a specific city and budget, Vaadin connects to the Spring Boot backend.

3. Spring Boot connects to a YugabyteDB database instance to check whether recommendations already exist for the requested city and budget. If the data is already in the database, the response is sent back to the user.

4. Otherwise, Spring Boot connects to the OpenAI API to get recommendations from the neural network. The response is stored in YugabyteDB for future reference and sent back to the user.

Now, let's look at how the application communicates with the OpenAI engine (step 4) and how the database is used (step 3) to make the solution scalable and cost-effective.

OpenAI Java library

The OpenAI engine can be queried through an HTTP API. You need to create an account, get your token (i.e., API key), and use it when sending requests to one of the OpenAI models.
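To show what such a request looks like on the wire, here is a minimal sketch that builds (but does not send) a call to the chat completions endpoint with the JDK's built-in HTTP client types (Java 11+). The class name ChatRequestSketch, the hand-written JSON body, and the placeholder key are illustrative assumptions, not part of the BudgetJourney code.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

class ChatRequestSketch {

    // Chat completions endpoint of the OpenAI HTTP API.
    static final String ENDPOINT = "https://api.openai.com/v1/chat/completions";

    // Builds (but does not send) a POST request that carries the API key as a Bearer token.
    static HttpRequest build(String apiKey, String jsonBody, Duration timeout) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(timeout)
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Hand-written body for illustration; a real application would use a JSON library.
        String body = "{\"model\":\"gpt-3.5-turbo\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}";
        HttpRequest request = build("YOUR_API_KEY", body, Duration.ofSeconds(30));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a matter of passing it to an HttpClient, which is exactly the kind of plumbing a client library hides.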

A model in the context of OpenAI is a computational structure trained on a large dataset to recognize patterns, make predictions, or perform specific tasks based on input data. Currently, the service supports a variety of models that can understand and generate natural language, code, images, or convert audio to text.

Our BudgetJourney application uses a GPT-3.5 model that understands and generates natural language or code. The application asks the model to suggest several points of interest within the city while taking the budget constraints into account. The model then returns its recommendations in JSON format.

The open-source OpenAI Java library wraps the GPT-3.5 HTTP API so that you can communicate with the service through well-defined Java abstractions. Here's how you can get started with the library:

Add the latest OpenAI Java artifact to your pom.xml file:

<dependency>
    <groupId>com.theokanning.openai-gpt3-java</groupId>
    <artifactId>service</artifactId>
    <version>${version}</version>
</dependency>

Create an instance of the OpenAiService class by providing your token and a timeout for requests between the application and the OpenAI engine:

OpenAiService openAiService = new OpenAiService(
    apiKey, Duration.ofSeconds(apiTimeout));
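The article does not say where apiKey and apiTimeout come from. One common option, sketched below under that assumption, is to read them from environment variables instead of hardcoding them. The class OpenAiSettings, both variable names, and the 30-second default are hypothetical.

```java
import java.time.Duration;
import java.util.Map;

class OpenAiSettings {
    final String apiKey;
    final Duration timeout;

    OpenAiSettings(String apiKey, Duration timeout) {
        this.apiKey = apiKey;
        this.timeout = timeout;
    }

    // Reads the settings from an environment-like map; both variable names are hypothetical.
    static OpenAiSettings from(Map<String, String> env) {
        String key = env.get("OPENAI_API_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("OPENAI_API_KEY is not set");
        }
        // Fall back to 30 seconds when no timeout is configured (an arbitrary default).
        long seconds = Long.parseLong(env.getOrDefault("OPENAI_TIMEOUT_SECONDS", "30"));
        return new OpenAiSettings(key, Duration.ofSeconds(seconds));
    }

    public static void main(String[] args) {
        // A sample map keeps the demo independent of the machine's real environment.
        OpenAiSettings settings = from(Map.of("OPENAI_API_KEY", "placeholder-key"));
        System.out.println("Timeout: " + settings.timeout.getSeconds() + " seconds");
    }
}
```

In a real application you would pass System.getenv() to from(...) and hand settings.apiKey and settings.timeout to the OpenAiService constructor.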

Simple! Next, let's look at how to use the GPT-3.5 model through OpenAiService with an example.

Send prompts to GPT-3.5 models

You communicate with an OpenAI model by sending text prompts that tell the model what you expect it to do. Models perform best when your instructions are clear and include examples.

To build a prompt for the GPT-3.5 model, you can use the ChatCompletionRequest API of the OpenAI Java library:

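import com.theokanning.openai.completion.chat.ChatCompletionRequest;
import com.theokanning.openai.completion.chat.ChatMessage;
import java.util.List;

// SYSTEM_TASK_MESSAGE, city, and budget are defined elsewhere in the application
ChatCompletionRequest chatCompletionRequest = ChatCompletionRequest
    .builder()
    .model("gpt-3.5-turbo")
    .temperature(0.8)
    .messages(
        List.of(
            new ChatMessage("system", SYSTEM_TASK_MESSAGE),
            new ChatMessage("user", String.format("I want to visit %s and have a budget of %d dollars", city, budget))))
    .build();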

model("gpt-3.5-turbo") selects an optimized version of the GPT-3.5 model.

temperature(...) controls the randomness and creativity expected in the model response. For example, a higher value, such as 0.8, will make the output more random, while a lower value, such as 0.2, will make the output more deterministic.

messages(...) carries the actual instructions, or prompt, for the model. There are "system" messages that instruct the model to behave in a certain way, "assistant" messages that store previous responses, and "user" messages that carry the user's requests and questions.
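To make the effect of temperature more concrete, here is a toy sketch of temperature-scaled softmax, the general technique for turning scores into sampling probabilities. It is an illustration only: the scores are made up, the class name TemperatureSketch is hypothetical, and the article does not describe how the OpenAI service applies temperature internally.

```java
import java.util.Arrays;

class TemperatureSketch {

    // Converts raw scores (logits) into probabilities after dividing them by the temperature.
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit / temperature);
        }
        double sum = 0.0;
        double[] probabilities = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the maximum keeps Math.exp numerically stable.
            probabilities[i] = Math.exp(logits[i] / temperature - max);
            sum += probabilities[i];
        }
        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.5}; // made-up scores for three candidate tokens
        System.out.println("T=0.2: " + Arrays.toString(softmax(logits, 0.2)));
        System.out.println("T=0.8: " + Arrays.toString(softmax(logits, 0.8)));
    }
}
```

Running it shows that the highest-scoring candidate receives a larger share of the probability at 0.2 than at 0.8, which is why lower temperatures feel more deterministic.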

The SYSTEM_TASK_MESSAGE of the BudgetJourney application looks as follows:

You are an API server that responds in JSON format. Don't say anything else. Respond with JSON only.

The user will provide you with a city name and an available budget. While considering that budget, you must suggest a list of places to visit.

Allocate 30% of the budget to restaurants and bars. Allocate another 30% to shows, amusement parks, and other sightseeing activities. Use the rest of the budget for shopping. Keep in mind that the user must spend 90-100% of their budget.

Respond in JSON format, including an array named "places". Each item of the array is another JSON object that includes place_name as text, place_short_info as text, and place_visit_cost as a number.

Don't add anything else after the JSON response.

Although verbose and in need of optimization, this system message conveys the required action: suggest multiple points of interest with maximum budget utilization and provide the response in JSON format, which is critical for the rest of the application.
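The budget rules in the system message are simple arithmetic, and a short sketch makes them easy to check. The class BudgetSplit and its method names are hypothetical. The article does not show any such validation, so treat this as one way the application could sanity-check a response.

```java
class BudgetSplit {
    final int restaurants;
    final int sightseeing;
    final int shopping;

    BudgetSplit(int budget) {
        // 30% each for restaurants/bars and for sightseeing, the remainder for shopping.
        this.restaurants = budget * 30 / 100;
        this.sightseeing = budget * 30 / 100;
        this.shopping = budget - restaurants - sightseeing;
    }

    // True when the total cost uses 90-100% of the budget, as the system message demands.
    static boolean withinTarget(int budget, int totalCost) {
        return totalCost * 100 >= budget * 90 && totalCost <= budget;
    }

    public static void main(String[] args) {
        BudgetSplit split = new BudgetSplit(900);
        System.out.println(split.restaurants + " / " + split.sightseeing + " / " + split.shopping);
    }
}
```

For a $900 budget this gives 270, 270, and 360 dollars, and a plan that follows the instructions must cost between 810 and 900 dollars in total.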

After you create the prompt (ChatCompletionRequest) that provides the system and user messages and other parameters, you can send it via an OpenAiService instance:

OpenAiService openAiService = … // created earlier

StringBuilder builder = new StringBuilder();

// Concatenate the content of every choice the model returns
openAiService.createChatCompletion(chatCompletionRequest)
    .getChoices().forEach(choice -> {
        builder.append(choice.getMessage().getContent());
    });

String jsonResponse = builder.toString();

The jsonResponse string is then processed by the rest of the application logic, which prepares a list of points of interest and displays them with the help of Vaadin.

For example, suppose a user is visiting Tokyo and wants to spend up to $900 in the city. The model strictly follows the instructions from our system message and responds with the following JSON:

{
  "places": [
    {
      "place_name": "Tsukiji Fish Market",
      "place_short_info": "Famous fish market where you can eat fresh sushi",
      "place_visit_cost": 50
    },
    {
      "place_name": "Meiji Shrine",
      "place_short_info": "Beautiful Shinto shrine in the heart of Tokyo",
      "place_visit_cost": 0
    },
    {
      "place_name": "Shibuya Crossing",
      "place_short_info": "Iconic pedestrian crossing with bright lights and giant video screens",
      "place_visit_cost": 0
    },
    {
      "place_name": "Tokyo Skytree",
      "place_short_info": "Tallest tower in the world, offering stunning views of Tokyo",
      "place_visit_cost": 30
    },
    {
      "place_name": "Robot Restaurant",
      "place_short_info": "Unique blend of futuristic robots, dancers, and neon lights",
      "place_visit_cost": 80
    }
    // More places
  ]
}

This JSON is then converted into a list of points of interest and displayed to the user:

[Figure: Different points of interest]
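The article does not show the conversion code. As a rough sketch that stays within the standard library, the place objects can be pulled out with a regular expression. This shortcut assumes the keys arrive in the order requested by the system message and that values contain no escaped quotes; a real application would more likely use a JSON library. PlacesParser and Place are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlacesParser {

    static class Place {
        final String name;
        final String shortInfo;
        final double visitCost;

        Place(String name, String shortInfo, double visitCost) {
            this.name = name;
            this.shortInfo = shortInfo;
            this.visitCost = visitCost;
        }
    }

    // Matches one place object; assumes the key order from the system message
    // and no escaped quotes inside the values.
    private static final Pattern PLACE = Pattern.compile(
            "\"place_name\"\\s*:\\s*\"([^\"]*)\"\\s*,\\s*"
            + "\"place_short_info\"\\s*:\\s*\"([^\"]*)\"\\s*,\\s*"
            + "\"place_visit_cost\"\\s*:\\s*(\\d+(?:\\.\\d+)?)");

    // Extracts every place found in the model's JSON response.
    static List<Place> parse(String jsonResponse) {
        List<Place> places = new ArrayList<>();
        Matcher matcher = PLACE.matcher(jsonResponse);
        while (matcher.find()) {
            places.add(new Place(matcher.group(1), matcher.group(2),
                    Double.parseDouble(matcher.group(3))));
        }
        return places;
    }

    // Sums the visit costs, e.g. to compare the plan against the budget.
    static double totalCost(List<Place> places) {
        double total = 0;
        for (Place place : places) {
            total += place.visitCost;
        }
        return total;
    }

    public static void main(String[] args) {
        String json = "{\"places\": [{\"place_name\": \"Meiji Shrine\", "
                + "\"place_short_info\": \"Beautiful Shinto shrine in the heart of Tokyo\", "
                + "\"place_visit_cost\": 0}]}";
        for (Place place : parse(json)) {
            System.out.println(place.name + ": " + place.visitCost);
        }
    }
}
```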

Note: The GPT-3.5 model was trained on data up to September 2021, so it cannot provide 100% accurate and relevant travel recommendations. However, this limitation can be reduced with the help of OpenAI plugins that give models access to real-time data. For example, once OpenAI's Expedia plugin becomes publicly available as an API, you can use it to improve the BudgetJourney application further.

Scale with a database

As you can see, integrating the neural network into your Java application and communicating with it in the same way as with other 3rd-party APIs is straightforward. You can also adjust the API behavior, for example by requiring a specific output format.

However, this is still a 3rd-party API that charges you for each request. The more prompts you send and the longer they are, the more you pay. Nothing is free.

Also, the model takes time to process your prompts. For example, it may take 10-30 seconds for the BudgetJourney app to receive a full list of recommendations from OpenAI. This can be excessive, especially when different users send similar prompts.

To make OpenAI GPT applications scalable, it is worth storing model responses in a database. The database allows you to:

Reduce the volume of requests to OpenAI APIs, reducing the associated costs.

Serve user requests with low latency by returning previously processed (or preloaded) recommendations from the database.

The BudgetJourney application uses the YugabyteDB database because of its ability to scale globally and store model responses close to the user's location. With the geo-partitioned deployment model, you can have a single database cluster with data automatically pinned to different geographic locations and served with low latency.

[Figure: Geographically partitioned YugabyteDB cluster]

A custom geo-partitioning column (the "region" column in the diagram above) lets the database decide where each row is placed. For example, suppose a database node in Europe already stores recommendations for a trip to Miami with a budget of $1500. If another user from Europe then wants to go to Miami and spend that amount, the application can respond in milliseconds by reading the recommendations directly from a database node in the same geographic location.

The BudgetJourney application uses the following JPA repository to get recommendations from the YugabyteDB cluster:

@Repository
public interface CityTripRepository extends JpaRepository<CityTrip, Integer> {

    @Query("SELECT pointsOfInterest FROM CityTrip WHERE cityName=?1 and budget=?2 and region=?3")
    String findPointsOfInterest(String cityName, Integer budget, String region);
}

The entity class is as follows:

@Entity
public class CityTrip {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "landmark_generator")
    @SequenceGenerator(name = "landmark_generator", sequenceName = "landmark_sequence", allocationSize = 5)
    int id;

    @NotEmpty
    String cityName;

    @NotNull
    Integer budget;

    @NotEmpty
    @Column(columnDefinition = "text")
    String pointsOfInterest;

    @NotEmpty
    String region;

    // The rest of the logic
}

So, all you need to do is query the database first and fall back to the OpenAI API only if the relevant recommendations are not yet in the database. As your application grows in popularity, more and more recommendations will be available locally, and this approach will become more cost-effective over time.
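This database-first flow can be sketched without Spring or a real database. In the minimal example below, an in-memory map stands in for YugabyteDB and a function stands in for the OpenAI call; the class RecommendationService and its wiring are assumptions made for illustration, not the application's actual service code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

class RecommendationService {

    // Stand-in for the database table; the real application queries YugabyteDB.
    private final Map<String, String> store = new ConcurrentHashMap<>();

    // Stand-in for the OpenAI call: takes a city and a budget, returns the JSON response.
    private final BiFunction<String, Integer, String> model;

    RecommendationService(BiFunction<String, Integer, String> model) {
        this.model = model;
    }

    String findPointsOfInterest(String city, int budget, String region) {
        String key = region + "|" + city + "|" + budget;
        String stored = store.get(key);
        if (stored != null) {
            return stored; // served from the store, no model call
        }
        String fresh = model.apply(city, budget);
        store.put(key, fresh); // keep the response for future requests
        return fresh;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        RecommendationService service = new RecommendationService((city, budget) -> {
            calls[0]++;
            return "{\"places\": []}";
        });
        service.findPointsOfInterest("Miami", 1500, "Europe");
        service.findPointsOfInterest("Miami", 1500, "Europe");
        System.out.println("Model calls: " + calls[0]); // prints "Model calls: 1"
    }
}
```

The second request for the same city, budget, and region never reaches the model, which is where both the cost and the latency savings come from.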

Summary

ChatGPT's web-based chatbot is a great way to showcase the capabilities of the OpenAI engine. Explore the engine's powerful models and start building new Java applications. Just make sure you do it in a scalable way!
