Supercharge Your ChatGPT Conversations with JavaScript
Have you ever wished you could bring more interactivity and functionality to your conversations with ChatGPT? Well, you can, using JavaScript.
Read on to learn how to do just that ;)
Where did this idea come from?
Personal Observations:
During a casual chat, my wife expressed skepticism:
ChatGPT can’t solve basic math. I’m skeptical.
However, I argued that:
But computers excel at math; it’s just that large language models (LLMs) like ChatGPT aren’t necessarily the right tool for that job!
Inspirations
My insights grew with some observations:
- OpenAI’s introduction of plugins for ChatGPT, and ones like Wolfram Alpha that are tailored for mathematical challenges.
- OpenAI’s reveal of the Code Interpreter plugin, allowing the generation of Python-based data science pipelines and potentially allowing ChatGPT to do anything that Python can.
- A captivating paper that shows a Minecraft bot powered by a GPT4-like system. This bot, intriguingly, writes code to address challenges, verifies solutions in the game world, and, if effective, archives these tactics in a reusable skillset. Ingenious!
- Another enlightening paper that demonstrated GPT4’s ability to craft functions which GPT3 can then use to solve problems more efficiently, hinting that high-end models could produce tools to augment the capabilities of their more economical counterparts.
Realization and Motivation:
ChatGPT might not be the best tool for solving a wide range of problems, but it can write code that does!
And with OpenAI’s intent to refine GPT models to harness external functions, the horizon of possibilities seemed endless. I was really eager to delve into experimenting with that horizon. But the challenge lay in getting access to Code Interpreter (in alpha back then) or making my own.
Challenges and Solution
My enthusiasm to explore and play with this was met with several daunting tasks:
- obtaining access to OpenAI APIs
- crafting my own ChatGPT
- and my own Code Interpreter
- making it accessible to others in ways that would not cost me a fortune
They appeared time-consuming and cumbersome, especially when I aimed to share my experiments with a wider audience.
But then a solution struck:
Why not utilise the browser’s native JavaScript interpreter available right in ChatGPT already?
So many advantages:
- No need to write my own ChatGPT
- or Code Interpreter
- Easy to publish and share
- no need to pay for OpenAI APIs, ChatGPT is free for everyone and spits out neat JavaScript blocks
All I need is a Chrome Extension that integrates with ChatGPT!
Development of the JavaScript Interpreter
Instead of a Chrome Extension I went with a Bookmarklet. It was a deliberate choice because:
- It is simpler to start with compared to a Chrome Extension.
- Despite my subsequent efforts to turn it into a Chrome Extension, Chrome’s ongoing shift to Manifest V3 restricts arbitrary code execution, which makes it impossible at the moment (see the Tampermonkey conversation).
- The Bookmarklet works on Android smartphones (though not on iOS since version 15), while extensions do not. And it’s a lot of fun “developing” on a phone on a beach :D
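For anyone who has not met bookmarklets before: a bookmarklet is just a bookmark whose URL starts with `javascript:`, so clicking it runs the script on the page you are looking at. Here is a minimal sketch of packaging a script that way. The helper name `toBookmarklet` and the sample payload are made up for illustration; this is not the actual tool’s code.

```javascript
// Hypothetical helper: wraps a script in an IIFE and turns it into a
// `javascript:` URL that can be saved as a bookmark.
function toBookmarklet(source) {
  // The IIFE keeps the script's variables out of the page's global scope.
  const wrapped = `(function(){${source}})();`;
  // Encoding keeps spaces, quotes and newlines valid inside a URL.
  return 'javascript:' + encodeURIComponent(wrapped);
}

// Illustrative payload only; the real Bookmarklet does much more.
const url = toBookmarklet('alert("Hello from a bookmarklet");');
console.log(url);
```

Paste the printed URL into the address field of a new bookmark and clicking it will run the payload on the current page.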
So, what does this tool offer?
- It identifies code sections in the chat marked with ‘JavaScript’.
- A ‘Run’ button appears at the bottom of each of these blocks.
- Clicking this button prompts the Bookmarklet to fetch the code and execute it using ‘eval’.
- It tracks logs and errors, presenting them beneath the ‘Run’ button and monitors chat updates to spot new code segments, subsequently adding ‘Run’ buttons to them.
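The steps above can be sketched roughly like this. It is a simplified illustration, not the Bookmarklet’s actual source: the names `runSnippet` and `addRunButtons` are invented here, and the CSS selector is my assumption about how ChatGPT marks JavaScript blocks, so it may need adjusting.

```javascript
// Hypothetical sketch of the "Run" step: evaluate a snippet and collect
// whatever it logs or throws, so the result can be shown under the button.
function runSnippet(code) {
  const logs = [];
  const originalLog = console.log;
  // Temporarily redirect console.log into our own buffer.
  console.log = (...args) => logs.push(args.map(String).join(' '));
  let error = null;
  try {
    // Indirect eval (an assumption of this sketch) runs the code in the
    // global scope, much like a regular script would.
    (0, eval)(code);
  } catch (e) {
    error = String(e);
  } finally {
    console.log = originalLog; // always restore the real console
  }
  return { logs, error };
}

// Assumed markup: <pre><code class="language-javascript">...</code></pre>.
function addRunButtons(root) {
  const blocks = root.querySelectorAll('pre code.language-javascript');
  for (const block of blocks) {
    const pre = block.closest('pre');
    if (pre.dataset.hasRunButton) continue; // skip blocks already handled
    pre.dataset.hasRunButton = 'true';
    const button = document.createElement('button');
    button.textContent = 'Run';
    const output = document.createElement('pre');
    button.addEventListener('click', () => {
      const { logs, error } = runSnippet(block.textContent);
      // Show logs first, then the error message if there was one.
      output.textContent = [...logs, ...(error ? [error] : [])].join('\n');
    });
    pre.after(button, output);
  }
}

// Only wire things up when a DOM is available (i.e. in the browser).
if (typeof document !== 'undefined') {
  addRunButtons(document);
  // Re-scan whenever the chat changes so new code blocks get a button too.
  new MutationObserver(() => addRunButtons(document))
    .observe(document.body, { childList: true, subtree: true });
}
```

One limitation of this sketch: it only captures logs produced while the snippet runs synchronously. Anything logged later from a timer or a promise would go to the normal console.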
Want to try it yourself?
Go to this page and follow the instructions. There are examples that show you how to:
- Visualise audio by loading and playing music from your computer right in ChatGPT.
- Draw random trees on the screen in ChatGPT.
- Load a charting library and make some animated bar charts.
- Load a text recognition library and add a button to ChatGPT that lets you load an image, recognise the text in it, and input it back into the chat.
Here is video for random trees in a
Comparison with Code Interpreter
While I was immersed in this project, OpenAI released the Code Interpreter for everyone. My initial impressions were mixed. The Interpreter’s restricted environment, although restricted for understandable performance and security reasons, limited certain capabilities like fetching online resources, using AI libraries, or calling external APIs. For instance, I endeavored to:
- Recognize text from an uploaded receipt.
- Leverage the LLM to categorize items and rectify misrecognition-induced typos.
- Incorporate this into a downloadable CSV.
- Conduct an analysis on the CSV.
My endeavors hit a roadblock immediately when the Code Interpreter couldn’t access the text recognition library. Yet I did it straight away with my JavaScript interpreter, right in the ChatGPT UI.
Code Interpreter falls short of using everything that Python can do
On the other hand, with the JavaScript interpreter you can:
- Modify the ChatGPT UI
- Load any libraries from CDNs
- Use any browser APIs (text to speech and speech to text, or load, play, analyse and visualise audio)
- Make the results interactive
- You can call some REST APIs that allow calls from chat.openai.com domain
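As an illustration of the “load any libraries from CDNs” point, here is a minimal sketch of pulling a script into the page. The helper name `loadScript` and the URL in the usage comment are placeholders I made up, not something taken from the tool itself.

```javascript
// Hypothetical helper: inject a <script> tag and resolve once it has loaded.
// `doc` defaults to the page's document; it is a parameter only so the
// helper can be exercised outside a browser.
function loadScript(url, doc = document) {
  return new Promise((resolve, reject) => {
    const script = doc.createElement('script');
    script.src = url;
    script.onload = () => resolve(url);
    script.onerror = () => reject(new Error(`Failed to load ${url}`));
    doc.head.appendChild(script);
  });
}

// Example usage in the browser (placeholder URL, not a real library):
// loadScript('https://cdn.example.com/some-library.min.js')
//   .then(() => console.log('library ready'));
```

Once the promise resolves, whatever globals the library defines are available to the next code block you run in the chat.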
Feedback and Future Directions
I eagerly invite you to test this tool and share your experience. Your insights will help in shaping my future explorations.
On the horizon, things I want to try are:
- Facilitating the auto-loading of any NPM browser library into ChatGPT and embedding its documentation too. This would allow ChatGPT to understand its usage, enabling swift prototyping. A startup already does something similar for any REST API by ingesting its documentation.
- Integrating solutions like Firebase into the chat, complemented by an API to store, fetch, and reuse data across chats. Coupled with text search, this could amplify ChatGPT’s memory capabilities.
- Creating an Electron app that registers as a local ChatGPT plugin with APIs to execute terminal commands or manipulate and run JavaScript code on any webpage. Chat with and manipulate your local machine or any website. This is inspired by a project for local file editing, which is targeted at developers though.
- Enabling ChatGPT to access Hugging Face APIs, expanding its interactions with other AI models. Imagine that you can chat with whole repositories of AIs. It seems Microsoft is already trying this.
- Making something like a Shader Toy in ChatGPT that would allow doing shaders interactively using plain language
Thank You
For joining me on this technological exploration! Whether it’s simple doodles or intricate interactive plots, this feels like merely the beginning. Stay connected as I intend to share more experiments in the near future.
P.S.
It’s surreal penning down thoughts on AI. My exploration started a mere two months ago, and the pace of innovation is frenetic. Every innovative idea seemed to materialise within weeks of its inception. This article? It underwent frequent updates to keep pace! None of the links in the future ideas existed when I first drafted it; they all appeared over the last month 0_o