Fork of 👉🏼 https://github.com/mayooear/langchain-supabase-website-chatbot
Create a ChatGPT chatbot for your website using LangChain, Supabase, TypeScript, OpenAI, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps. Supabase is an open-source Postgres database that can store embeddings using the pgvector extension.
Get in touch via Twitter if you need help.

The visual guide for this repo and tutorial is in the visual guide folder.
Clone the repo:

```bash
git clone [github https url]
```

Install packages:

```bash
pnpm install
```
Set up your `.env` file:

Copy `.env.local.example` into `.env`. Your `.env` file should look like this:

```
OPENAI_API_KEY=

NEXT_PUBLIC_SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_ANON_KEY=
SUPABASE_SERVICE_ROLE_KEY=
```

Visit OpenAI to retrieve API keys and insert them into your `.env` file.
Visit Supabase to create a database and retrieve your keys from the user dashboard, as per the docs instructions.
In the `config` folder, replace the urls in the array with your own website urls (the script requires more than one url), as in the sketch below.
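For reference, a minimal sketch of what that config might look like; the file name and the `urls` export here are assumptions, so match whatever the array in the `config` folder is actually called:

```typescript
// config/index.ts — hypothetical shape; mirror the existing array in this folder.
export const urls: string[] = [
  "https://example.com/guide-one",
  "https://example.com/guide-two", // the scrape script expects more than one url
];
```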
In `utils/custom_web_loader.ts`, inside the `load` function, replace the values of `title`, `date`, and `content` with the CSS selectors for the text you'd like to extract from a given webpage. You can learn more about how to use Cheerio here.

You can add your own custom elements to the metadata to meet your needs. Note, however, that the default loader format, shown below, expects at least a string for `pageContent` and a `metadata` object containing a `source` property as the returned value:
```typescript
async load(): Promise<Document[]> {
  // Fetch the page and load it into Cheerio.
  const $ = await this.scrape();
  // By default, grab all text from the body.
  const text = $("body").text();
  // At minimum, metadata must carry a `source` property.
  const metadata = { source: this.webPath };
  return [new Document({ pageContent: text, metadata })];
}
```
The `pageContent` and `metadata` will later be stored in your Supabase database table.
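As a concrete illustration, a customized `load` might pull those fields with site-specific selectors. This is a hedged sketch; the selectors below are hypothetical and need to match your site's markup:

```typescript
async load(): Promise<Document[]> {
  const $ = await this.scrape();
  const title = $("h1.post-title").text();      // hypothetical selector
  const date = $("time.published").text();      // hypothetical selector
  const content = $("div.article-body").text(); // hypothetical selector
  // Keep `source`; extra fields like title and date are optional metadata.
  const metadata = { source: this.webPath, title, date };
  return [new Document({ pageContent: content, metadata })];
}
```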
Run `schema.sql` in your Supabase SQL editor so that the `documents` table exists in the database, as well as the `match_documents` function.

To run the scraping and embedding script in `scripts/scrape-embed.ts`, simply run:
```bash
npm run scrape-embed
```
This script will visit all the urls listed in the `config` folder and extract the data you specified in the `custom_web_loader.ts` file. Then it will use OpenAI's embeddings (`text-embedding-ada-002`) to convert your scraped data into vectors.
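Under the hood, the storing step looks roughly like the sketch below, assuming the LangChain JS Supabase vector store (import paths vary across LangChain versions, and `embedDocs` is a hypothetical helper name):

```typescript
import { createClient } from "@supabase/supabase-js";
import { Document } from "langchain/document";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { SupabaseVectorStore } from "langchain/vectorstores/supabase";

// Embed scraped documents with ada-002 and insert them into Supabase.
async function embedDocs(docs: Document[]): Promise<void> {
  const client = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.SUPABASE_SERVICE_ROLE_KEY!
  );
  await SupabaseVectorStore.fromDocuments(docs, new OpenAIEmbeddings(), {
    client,
    tableName: "documents",       // created by schema.sql
    queryName: "match_documents", // created by schema.sql
  });
}
```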
Once you've verified that the embeddings and content have been successfully added to your Supabase table, you can run the app with `npm run dev` and type a question to ask your website.
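When you ask a question, the app presumably retrieves the most similar chunks via the `match_documents` function. A minimal sketch of that retrieval step, under the same assumptions as the embedding sketch above (`ask` is a hypothetical helper name):

```typescript
import { createClient } from "@supabase/supabase-js";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { SupabaseVectorStore } from "langchain/vectorstores/supabase";

// Find the four stored chunks most similar to the question.
async function ask(question: string) {
  const client = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!
  );
  const store = new SupabaseVectorStore(new OpenAIEmbeddings(), {
    client,
    tableName: "documents",
    queryName: "match_documents",
  });
  return store.similaritySearch(question, 4);
}
```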
The frontend of this repo is inspired by langchain-chat-nextjs.
This repo uses in-depth Notion guides from the website of productivity expert Thomas Frank.