Building an App With the ChatGPT API: Lessons Learned

Engin Arslan
Front-end Engineer

Since its release, ChatGPT has taken the world by storm. Seemingly out of nowhere, we had a computational system so effective at generating human language and dialogue that it could pass expert evaluations like the bar and medical licensing exams.

One of the reasons ChatGPT became so popular is the simplicity of the user experience. It allows users to access the power of artificial intelligence with a mere input field. Some experts even think that the technology is showing glimpses of artificial general intelligence, which means that these tools could learn from experience and apply knowledge and skills to new and unfamiliar situations. 

Previously, most machine learning advancements required programming expertise, specialized hardware such as powerful GPUs, and complex installation steps. ChatGPT, conversely, offered a powerful alternative: it combined the power of a Large Language Model (LLM) with the simplicity of a conversation. It stands as one of the fastest-growing apps ever.

However, it remains to be seen if a chat-based interface is the best way to interact with an LLM-based AI. Typing out repetitive questions can be time-consuming. And the linear nature of a chat could make it difficult to understand the big picture while creating different branches of conversation.


I think there are alternative ways one can engage with an LLM-based AI that would address some of these challenges. I built Heuristi.ca to explore one such alternative. Heuristica is an AI-powered knowledge exploration app. Instead of the prompt-reply method used in ChatGPT, a user simply clicks buttons to generate mind maps on any topic they want to explore. This method allows for the categorization and visualization of information in a more comprehensive way than linear, chat-based conversations. 

In this article, I will share the lessons I learned building an AI-powered app with the OpenAI APIs, including how to decide which model to use and how I leveraged AI itself as a programmer throughout development.

Gif showing the usage of the app developed with OpenAI APIs

The OpenAI APIs – The GPT-3.5-turbo Model

To build Heuristica, I leveraged the OpenAI APIs. Using the APIs, you can perform a range of operations, such as text completion, chat completion, and image generation, using various kinds of available models. 

To use these APIs, the developer/user (in this case, me) inputs information in the form of prompts. These act as natural language instructions for the AI model, which are then processed to return the generated output. All interaction with the model happens through API calls, which are designed to be simple and intuitive to use. For this app, I specifically used the GPT-3.5-turbo model, as it was less expensive and had sufficient capabilities. I’ll touch on that more below.

Diagram of how the GPT-3.5-turbo model generates outputs from text-based inputs

Overview of Large Language Models (LLMs)

Let’s start with some definitions. ChatGPT is a Large Language Model (LLM) based on neural networks, which are machine-learning models designed to simulate the structure and function of the human brain. LLMs are a specific type of neural network, known as a transformer, trained on immense amounts of written data.

While OpenAI has not revealed the source data for its LLM, we do know that it contains a large portion of the textual information on the internet up until September 2021. Having trained on this vast amount of human-written content, ChatGPT can mimic human language, which gives it the ability to probabilistically predict text outputs with very high accuracy.

Table showing the main limitations of ChatGPT's large language model

Weaknesses of Large Language Models & ChatGPT

As Arthur C. Clarke once said, “Any sufficiently advanced technology is indistinguishable from magic.” This was the impression I had when I first started to use ChatGPT. For the first time, I was interacting with a computational technology that understood what I was asking without requiring me to be overly prescriptive or specific. In fact, it seems to comprehend better than some humans I’ve come across.

However, just like people, it is not without flaws. Here are a few limitations of LLMs and ChatGPT.

2021 ChatGPT Knowledge Cutoff

As of today, training a Large Language Model is a time-consuming and expensive process. One needs to collect the necessary data, clean it, and then feed it into the system for training. The training itself requires immense computational power, costing millions of dollars in energy consumption. This is not something that can be performed frequently or repeatedly.

For this reason, ChatGPT’s knowledge was cut off in September 2021; its model contains no data created after this point. While there are ways to update an LLM with new data, this is done only in limited ways and is nowhere near as comprehensive as the original training.

This means that ChatGPT’s knowledge and understanding of the world, its advancements, and its trends are based only on information from before that date. There are distinct bounds to its knowledge, sometimes leading to inaccuracies or outputs that don’t reflect the current state of things.

Hallucinations

ChatGPT generates content using a statistical model that predicts the most probable continuation based on its training data. This is important: it means ChatGPT has no inner understanding of truth. For this reason, ChatGPT will sometimes generate outputs that are false. This is called a hallucination. You should always be skeptical of ChatGPT’s output and never take it as the truth.

Token Limitations

When you input text into ChatGPT, it is converted into tokens before being fed into the system. You can think of tokens as portions of a word, typically a few characters long.

Practically, you have likely observed this if you’ve used ChatGPT: you input a long text and get an error message asking you to shorten it. This is the “token length limitation,” in which only a certain amount of text – 2,048 tokens for GPT-3, to be exact – can be passed to the model.

Currently, you can’t copy-paste an entire book into ChatGPT and ask for a summary. This is unfortunate, but the token limit continues to increase (GPT-4 supports 8,192 tokens in its base version and 32,768 tokens in its larger version). Some recent models are touting incredible advances in token limits, so this problem might disappear sooner than expected.
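If you want to check how close an input is to the limit before sending a request, you can count tokens yourself. Here is a minimal sketch, assuming the js-tiktoken npm package (a JavaScript port of OpenAI’s tokenizer); the 4,096 figure is GPT-3.5-turbo’s context window at the time of writing:

```typescript
// A minimal sketch of checking input length before sending a request.
// Assumes the js-tiktoken npm package, a JavaScript port of OpenAI's tokenizer.
import { encodingForModel } from "js-tiktoken";

const CONTEXT_WINDOW = 4096; // GPT-3.5-turbo's token limit at the time of writing

function fitsInContext(text: string): boolean {
  const enc = encodingForModel("gpt-3.5-turbo");
  const tokenCount = enc.encode(text).length;
  return tokenCount <= CONTEXT_WINDOW;
}
```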

Tech Stack Used to Build an AI-Powered App

One of the big questions to answer is which tech stack to use when building an MVP application. I wanted to devote my time and energy to iterating on the product instead of solving common technical challenges, so I chose a stack that allowed me to focus on the product rather than technicalities and implementation details. For this reason, I selected Next.js, Clerk, and Supabase.

An overview of the tech stack for building an AI-powered application

Next.js

Next.js has been my go-to framework for the last couple of years for building all kinds of front-end applications. Built on React, it takes care of many decisions you have to make when building an application, like how to structure your files, which router to use, and ways to implement SSR (Server-Side Rendering). It supports implementing a basic back-end and API endpoints alongside your application, so it takes care of both the back-end and front-end of the stack.

Clerk

I decided to give Clerk a go for implementing authentication. I can’t praise their solution enough. Authentication has always been a bit of a nightmare for me to deal with on personal projects, and it is ridiculous how easy Clerk made it to deal with this problem.

Supabase

Supabase is like Firebase but built on SQL instead of NoSQL databases. Just like Next.js and Clerk, it combines an awesome development experience with top-notch documentation, convenience, and sensible pricing. It was an easy choice for my back-end storage needs. You could, in fact, handle authentication with Supabase as well, but I found Clerk to be much easier to use in this regard.

Tailwind

I have dabbled in all kinds of CSS solutions in the past, including styled-components, Theme UI, and Chakra UI. Truthfully, I was slow to adopt Tailwind because it required learning a new set of classes and writing long strings of class names inside HTML/JSX files. However, the simplicity of Tailwind eventually won me over. It is easy to install, implement, and reason about. These are great benefits when working on projects with quick turnaround times.

Is LangChain suitable for interacting with the ChatGPT API in this project?

As you start building AI-powered applications, one tool you will likely hear about is LangChain. LangChain is a Python and JavaScript framework for developing applications powered by language models. 

It has many interesting features that make common operations like document summarization or retrieval easier by providing useful abstractions and utility methods. 

Like any framework, however, it requires you to become familiar with another layer of abstraction. It can be incredibly useful depending on the complexity of the task at hand. 

Ultimately, though, I decided not to use LangChain on this project. In my view, it’s better to start simple and see what kinds of problems come up before introducing more complex solutions like LangChain.

Starting to work with the ChatGPT API is actually as simple as sending an HTTP request to OpenAI’s servers with a textual payload. So far, I’m happy not to have used LangChain, though I am considering introducing it as my requirements evolve. Notably, its Output Parsing features could be useful for my purposes, as they help coerce the Large Language Model’s answers into a desired format.
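To illustrate just how little plumbing is involved, here is a minimal sketch of a raw chat completion call using fetch. The endpoint and request shape come from OpenAI’s public API; the helper name and the assumption of an OPENAI_API_KEY environment variable are mine:

```typescript
// A minimal sketch of calling the chat completions endpoint directly.
// Assumes OPENAI_API_KEY is available as an environment variable.
async function askChatGPT(prompt: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```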

Lessons Learned Building an AI App with the ChatGPT API

Here are some lessons I have learned from building an AI-powered app that uses the currently available ChatGPT APIs.

Choosing the Right Model

When using the OpenAI APIs, there are several models to choose from. The decision boils down to what you would like to achieve, as well as your estimated usage volume.

At the time of this writing, GPT-4 is the strongest model on offer, but it costs roughly 10 times more to use than GPT-3.5-turbo. Given the order-of-magnitude difference in price, estimating usage becomes an important factor in your decision.

Given the pace of innovation in the field, I’m not overly concerned about the price. Large Language Models are becoming ubiquitous. There is a new technological development almost every day, so the prices are bound to trend down. Ultimately, it’s important to choose the model that works best for your business and product purposes.

Another factor to consider is the level of “intelligence” you require. GPT-4 is superior in tasks of reasoning and comprehension, which could be non-negotiable aspects of the application you’re building. 

In my experience, GPT-3.5-turbo is not always great at following instructions. For example, I use prompts that request results in a certain format (such as bullet points or comma-separated lists). But every once in a while, I’ll get a result that doesn’t follow the request. This becomes problematic when you need the output in a specific format for computational processing, so it pays to parse responses defensively, as in the sketch below.
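Here is a minimal sketch of what I mean by defensive parsing (the function name and heuristics are illustrative, not my exact code):

```typescript
// A sketch of defensively parsing a reply that should be comma-separated.
// Returns null when the model ignored the formatting instructions,
// so the caller can retry the request or fall back.
function parseCommaSeparated(reply: string): string[] | null {
  const cleaned = reply.trim();
  // Bullet points or multi-line prose signal a formatting failure.
  if (cleaned.includes("\n") || cleaned.startsWith("-")) {
    return null;
  }
  return cleaned.split(",").map((item) => item.trim()).filter(Boolean);
}
```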

Comparison of GPT-3.5-turbo and GPT-4 for building an AI-powered app

Temperature Parameter

At its core, ChatGPT is able to have conversations by identifying the statistically most likely ways to continue those conversations based on the vast amounts of data it is trained on. 

Temperature is a parameter available when using ChatGPT through the API. A low temperature value means the model will always return the statistically most likely completion. Working with the highest-probability results sounds like a good thing, but it turns out that keeping this value too low can produce text that is boring and dull. You might want to increase this value to get more creative and interesting responses from the model.

The right number might require some testing and will again depend on the purpose of your application. I recommend keeping the temperature low for predictability and high for originality.
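In the API, temperature is just one more field on the request body. A sketch, reusing the request shape from the earlier example:

```typescript
// Temperature is a single field on the chat completion request body.
// Values near 0 favor predictability; higher values favor originality.
const body = JSON.stringify({
  model: "gpt-3.5-turbo",
  temperature: 0.2,
  messages: [{ role: "user", content: "Give me 3 sentences about mind maps." }],
});
```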

Prompting techniques

Today, prompt engineering is a high-demand skill for systems like ChatGPT. But I’m unsure if prompt engineering will be the most in-demand job in the future. Ideally, AIs will develop to a place where they can understand the intent behind our questions without us relying on obscure prompting tricks. That said, we aren’t in that future right now, and knowing at least a bit about prompting helps in our interactions with a language model.

The best way I’ve found to generate the desired outputs is to be clear and specific, and to offer references. Here are a couple of tricks I found useful when tailoring my prompts for ChatGPT.

Tricks and examples of advanced prompts for ChatGPT

Be Specific

Instead of saying, “Give me a few sentences,” write, “Give me 3 sentences.” The second is a clearer instruction to follow.

Use Positive Language

Tell the AI what to do instead of what not to do. Saying exactly what to do is a clearer, more explicit instruction to follow.

Provide Examples

Provide an example to the AI as part of your request so it understands exactly what you are looking for. Here is an example:

“Return a comma-separated list of similar concepts or entities to those that are mentioned in the given input below. Make sure only to include names in your response and no explanations. Here is an example of how the output should be structured:

I: This is a text that talks about Bitcoin.

O: Ethereum, Solana, Blockchain”

This helps the AI to return results that are closer to what you are looking for.
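For completeness, here is a sketch of how a prompt like that might be packaged into the API’s messages array (the helper name is hypothetical):

```typescript
// A sketch of packaging the example-laden prompt above into a request body.
// buildRelatedConceptsRequest is a hypothetical helper, not library code.
function buildRelatedConceptsRequest(inputText: string) {
  return {
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content:
          "Return a comma-separated list of similar concepts or entities " +
          "to those mentioned in the given input. Only include names in " +
          "your response and no explanations.\n" +
          "I: This is a text that talks about Bitcoin.\n" +
          "O: Ethereum, Solana, Blockchain",
      },
      { role: "user", content: inputText },
    ],
  };
}
```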

Prompting is not an exact science. I found myself tinkering with my prompts a lot till they started to feel right. You don’t need to be an expert at prompting to get good results from AI. Knowing even a little bit of the basics goes a long way.

Guarding against prompt injections

A prompt injection is very similar to SQL injection, where the user inputs a prompt that tricks the AI into sharing information that wasn’t intended to be shared. For example, a given prompt might trick the AI into leaking what’s known as a “system prompt” – a prompt that configures the AI. 

A system prompt might start with, “You are a helpful assistant. Help the user figure out a solution to their problem.” A successful prompt injection can expose the details of this prompt to the user, even though it was only intended to be used in the background. For example, users were able to leak the internal name of the AI-powered Bing Chat using prompt injection techniques.

Unfortunately, guarding against prompt injections is hard. The best defense is to exclude sensitive information from your prompts. Instead, treat your prompts as publicly available data.
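One habit that follows from this mindset: keep user input in its own message instead of splicing it into your instructions, and never put secrets in either. A sketch:

```typescript
// A sketch of the "prompts are public" mindset. The system prompt contains
// nothing sensitive, and user input stays in its own message rather than
// being concatenated into the instructions.
function buildMessages(userInput: string) {
  return [
    // Assume anything here can be leaked through a prompt injection.
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userInput },
  ];
}
```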

Four Ways Developers Can Use AI in Their Daily Workflows

The ChatGPT API is only one of the AI tools I used to build Heuristica. In fact, even though LLMs are relatively new, they are increasingly becoming part of my toolset. I believe all developers should try to find ways to integrate LLMs into their daily workflows. Here are the primary ways I leveraged AI while developing this app.

1. Content Creation: Generating Lists of Topics

Content creation is one of the most obvious areas where LLMs are incredibly useful. In my case, I needed a list of contentious topics that could be good candidates for knowledge exploration, to present to the user as preset options. I had some ideas of my own, like Universal Basic Income and Nuclear Power, but I needed a long list to build my app. Luckily, ChatGPT was able to generate hundreds of similar topics within seconds.

Outside of ChatGPT, I have also tried Midjourney, a generative AI program that can create images from natural language descriptions. I’ve used this tool to generate abstract graphics representing node-based interfaces. Unfortunately, I wasn’t able to get the results I was looking for. Image generation tools seem to be great for creating illustrations that are photorealistic, detailed, and stylized. But they aren’t as adept at producing images that are simple, symbolic, or abstract, likely because it’s harder to provide a suitable description for those.

2. Answering ambiguous questions

One of the strongest features of LLMs is their ability to deal with ambiguity. For example, imagine you have a vague understanding of a cultural or social event. Trying to formulate a question for a search engine can be difficult and may take several tries to generate the answer you’re looking for. LLMs are much better at handling ambiguity.

In my case, I had a base64-encoded string that wasn’t working correctly as a URL parameter. I pasted the encoded string into ChatGPT and provided a loose description of the problem. It correctly identified that I needed to further encode the base64 string and even provided the code to do it.
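The fix amounted to something like this (a sketch of the idea, not the exact code ChatGPT gave me):

```typescript
// Base64 output can contain characters like "+" and "/" that have special
// meaning in URLs, so the string needs a pass through encodeURIComponent.
const payload = btoa("some app state to share via URL"); // base64-encode
const safeParam = encodeURIComponent(payload); // escape URL-unsafe characters
const url = `https://example.com/share?state=${safeParam}`;
```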

3. Getting explanations for new concepts

As a Front-end developer, there are aspects of Back-end systems that I am not familiar with. Getting accurate answers to questions that I might have is beneficial, but having those answers tailored to my level of understanding really makes a difference in my learning process.

4. Generating examples when documentation is lacking

While building Heuristica, I found that a lot of documentation was lacking. I asked ChatGPT several questions about how to implement certain functions using a given library. While the answers were not always usable in code, they were almost always useful. Even an incorrect answer can include hints about methods or techniques you might have overlooked previously. 

One thing to keep in mind here is the 2021 cut-off date. If you are looking for information about a recently released library, ChatGPT might not be able to help you there without something like a browser plugin.

Final Thoughts

Building Heuristica has been a great exercise in familiarizing myself with the recent developments in artificial intelligence. It feels revolutionary to have access to technology at your fingertips that can generate mostly sensible answers to most of your queries. It is able to do so without even running a search on the internet (although you could have it run a search as well). 

We are still in the early innings of this technology and yet to understand the true impacts it will have on jobs, society, and people. Fortunately, as a developer, it doesn’t take much to leverage this power and see what kind of impact you can have using this resource. It is a great time to start building the future.

Originally published on Jun 7, 2023. Last updated on Jul 25, 2024.
