paramount
Paramount lets your expert agents evaluate AI chats, enabling:
- quality assurance
- ground truth capturing
- automated regression testing
Usage
Getting Started
- Install the package:
- Decorate your AI function:
@paramount.record() def my_ai_function(message_history, new_question): # Inputs # <LLM invocations happen here> new_message = {'role': 'user', 'content': new_question} updated_history = message_history + [new_message] return updated_history # Outputs.
- After
my_ai_function(...)
has run several times, launch the Paramount UI to evaluate results:
Your SMEs can now evaluate recordings and track accuracy improvements over time.
Paramount runs completely offline in your private environment.
Usage
After installation, run python example.py
for a minimal working example.
Configuration
In order to set up successfully, define which input and output parameters represent the chat list used in the LLM.
This is done via the paramount.toml
configuration file that you add in your project root dir.
It will be autogenerated for you with defaults if it doesn't already exist on first run.
[record] enabled = true function_url = "http://localhost:9000" # The url to your LLM API flask app, for replay [db] type = "csv" # postgres also available [db.postgres] connection_string = "" [api] endpoint = "http://localhost" # url and port for paramount UI/API port = 9001 split_by_id = false # In case you have several bots and want to split them by ID identifier_colname = "" [ui] # These are display elements for the UI # For the table display - define which columns should be shown meta_cols = ['recorded_at'] input_cols = ['args__message_history', 'args__new_question'] # Matches my_ai_function() example output_cols = ['1', '2'] # 1 and 2 are indexes for llm_answer and llm_references in example above # For the chat display - describe how your chat structure is set up. This example uses OpenAI format. chat_list = "output__1" # Matches output updated_history. Must be a list of dicts to display chat format chat_list_role_param = "role" # Key in list of dicts describing the role in the chat chat_list_content_param = "content" # Key in list of dicts describing the content
It is also possible to describe references via config but is not shown here for simplicity.
See paramount.toml.example
for more info.
For Developers
The deeper configuration instructions about the client
& server
can be seen here.
Docker
By using Dockerfile.server
, you can containerize and deploy the whole package (including the client).
With Docker, you will need to mount the paramount.toml
file dynamically into the container for it to work.
docker build -t paramount-server -f Dockerfile.server . # or make docker-build-server docker run -dp 9001:9001 paramount-server # or make docker-run-server
License
This project is under GPL License.