Building a chatbot for your data and pipelines is challenging because they are often too large (e.g., 1,000+ tables) to fit within the LLM context window. Cocoon addresses this by creating a RAG layer for your data and pipelines. On top of this RAG layer, Cocoon offers a Cursor-style chatbot for your data tasks.
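To make the idea concrete, here is a minimal sketch of retrieval over table descriptions. It is not Cocoon's implementation: the catalog, the `top_k_tables` and `build_prompt` helpers, and the token-overlap scoring are all hypothetical stand-ins, chosen only to show how a question can be matched to a few relevant tables instead of sending the whole schema to the LLM.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def top_k_tables(question: str, catalog: dict[str, str], k: int = 2) -> list[str]:
    """Rank tables by token overlap with the question and keep the best k.

    Token overlap is a simple stand-in for a real retriever (assumption).
    """
    q = tokenize(question)
    scores = {
        name: sum((q & tokenize(name + " " + desc)).values())
        for name, desc in catalog.items()
    }
    # Sort by descending score, then by name so ties are deterministic.
    ranked = sorted(catalog, key=lambda name: (-scores[name], name))
    return [name for name in ranked[:k] if scores[name] > 0]


def build_prompt(question: str, catalog: dict[str, str], k: int = 2) -> str:
    """Put only the retrieved table descriptions into the prompt."""
    lines = [f"- {name}: {catalog[name]}" for name in top_k_tables(question, catalog, k)]
    return "Relevant tables:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Hypothetical catalog; a real warehouse may have 1,000+ entries.
catalog = {
    "orders": "customer orders with order date, total amount and customer id",
    "customers": "customer profile with name, email and signup date",
    "web_events": "clickstream page views and sessions",
    "inventory": "warehouse stock levels for each product",
}
question = "What is the total amount of orders per customer?"

print(build_prompt(question, catalog))
```

Only the retrieved descriptions reach the prompt, so its size depends on `k` rather than on the number of tables in the warehouse.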
Get Started
- 👉 Online Service to clean your uploaded CSV
- 👉 Try this Google Colab Notebook for Data Warehouse RAG
- 👉 Try this Google Colab Notebook for Data Pipeline RAG
Cocoon is available on PyPI. Create a virtual environment, then run:
pip install cocoon_data -U
To get started, you need to connect to:
- LLMs (e.g., GPT-4, Claude-3, Gemini-Ultra, or your local LLMs)
- Data Warehouses (e.g., Snowflake, BigQuery, DuckDB...)
from cocoon_data import *

# if you use Open AI GPT-4
openai.api_key = 'xycabc'

# if you use Snowflake
con = snowflake.connector.connect(...)

query_widget, cocoon_workflow = create_cocoon_workflow(con)

# a helper widget to query your data warehouse
query_widget.display()

# the main panel to interact with Cocoon
cocoon_workflow.start()
🎉 You should see the following in a notebook:
We also offer a browser UI, which supports only the chat-over-RAG feature. Simply run:
pip install cocoon_data -U
cocoon_data
You should then see the chat UI in your browser.