Fork of 👉🏼 https://github.com/mayooear/langchain-supabase-website-chatbot
Create a ChatGPT chatbot for your website using LangChain, Supabase, TypeScript, OpenAI, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps. Supabase is an open-source Postgres database that can store embeddings using the pgvector extension.
Get in touch via Twitter if you need help.

The visual guide for this repo and tutorial is in the visual guide folder.
Clone the repo:

```bash
git clone [github https url]
```

Install packages:

```bash
pnpm install
```
Set up your `.env` file:

Copy `.env.local.example` into `.env`. Your `.env` file should look like this:

```
OPENAI_API_KEY=

NEXT_PUBLIC_SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_ANON_KEY=
SUPABASE_SERVICE_ROLE_KEY=
```

Visit OpenAI to retrieve API keys and insert them into your `.env` file.
Visit Supabase to create a database and retrieve your keys from the user dashboard, as per the docs instructions.
In the `config` folder, replace the urls in the array with your own website urls (the script requires more than one url), as in the sketch below.
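For reference, a minimal sketch of what that config might look like; the file name and the `urls` export here are assumptions, so match whatever the array in the `config` folder is actually called:

```typescript
// config/index.ts — hypothetical shape; mirror the existing array in this folder.
export const urls: string[] = [
  "https://example.com/guide-one",
  "https://example.com/guide-two", // the scrape script expects more than one url
];
```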
In `utils/custom_web_loader.ts`, inside the `load` function, replace the values of `title`, `date`, and `content` with the CSS selectors for the text you'd like to extract from a given webpage. You can learn more about how to use Cheerio here.

You can add your own custom elements to the metadata to meet your needs. Note, however, that the default loader format, shown below, expects at least a string for `pageContent` and a `metadata` object containing a `source` property as the returned value:
```typescript
async load(): Promise<Document[]> {
  // Fetch the page and load it into Cheerio.
  const $ = await this.scrape();
  // By default, grab all text from the body.
  const text = $("body").text();
  // At minimum, metadata must carry a `source` property.
  const metadata = { source: this.webPath };
  return [new Document({ pageContent: text, metadata })];
}
```
The `pageContent` and `metadata` will later be stored in your Supabase database table.
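As a concrete illustration, a customized `load` might pull those fields with site-specific selectors. This is a hedged sketch; the selectors below are hypothetical and need to match your site's markup:

```typescript
async load(): Promise<Document[]> {
  const $ = await this.scrape();
  const title = $("h1.post-title").text();      // hypothetical selector
  const date = $("time.published").text();      // hypothetical selector
  const content = $("div.article-body").text(); // hypothetical selector
  // Keep `source`; extra fields like title and date are optional metadata.
  const metadata = { source: this.webPath, title, date };
  return [new Document({ pageContent: content, metadata })];
}
```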
Run `schema.sql` in your Supabase SQL editor so that the `documents` table exists in the database, as well as the `match_documents` function.

To run the scraping and embedding script in `scripts/scrape-embed.ts`, simply run:
```bash
npm run scrape-embed
```
This script will visit all the urls listed in the `config` folder and extract the data you specified in the `custom_web_loader.ts` file. Then it will use OpenAI's embeddings (`text-embedding-ada-002`) to convert your scraped data into vectors.
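Under the hood, the storing step looks roughly like the sketch below, assuming the LangChain JS Supabase vector store (import paths vary across LangChain versions, and `embedDocs` is a hypothetical helper name):

```typescript
import { createClient } from "@supabase/supabase-js";
import { Document } from "langchain/document";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { SupabaseVectorStore } from "langchain/vectorstores/supabase";

// Embed scraped documents with ada-002 and insert them into Supabase.
async function embedDocs(docs: Document[]): Promise<void> {
  const client = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.SUPABASE_SERVICE_ROLE_KEY!
  );
  await SupabaseVectorStore.fromDocuments(docs, new OpenAIEmbeddings(), {
    client,
    tableName: "documents",       // created by schema.sql
    queryName: "match_documents", // created by schema.sql
  });
}
```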
Once you've verified that the embeddings and content have been successfully added to your Supabase table, you can run the app with `npm run dev` and type a question to ask your website.
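When you ask a question, the app presumably retrieves the most similar chunks via the `match_documents` function. A minimal sketch of that retrieval step, under the same assumptions as the embedding sketch above (`ask` is a hypothetical helper name):

```typescript
import { createClient } from "@supabase/supabase-js";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { SupabaseVectorStore } from "langchain/vectorstores/supabase";

// Find the four stored chunks most similar to the question.
async function ask(question: string) {
  const client = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
  );
  const store = new SupabaseVectorStore(new OpenAIEmbeddings(), {
    client,
    tableName: "documents",
    queryName: "match_documents",
  });
  return store.similaritySearch(question, 4);
}
```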
The frontend of this repo is inspired by langchain-chat-nextjs.
This repo uses in-depth Notion guides from the website of productivity expert Thomas Frank.