What is the Datahub?
The Datahub is your central repository for storing and managing all datasets in Arato. It supports file uploads, previews, and metadata tracking. For each dataset you can view event counts, variables, notebook usage, upload details, and custom tags for easy organization.
Why Use the Datahub?
The Datahub streamlines your dataset management workflow with several key benefits:
Efficient Organization: Quickly sort and find datasets across multiple notebooks
Version Control: Track dataset changes, including when each dataset was modified and by whom
Tag Management: Create and manage custom tags for better dataset organization
File Preview: Examine dataset contents before use
Usage Tracking: See which notebooks are using each dataset
How to Use the Datahub
Adding New Datasets
Access the Datahub
Click the three dots (⋮) in your dataset header
Select "Open Datahub" from the menu
Upload a Dataset
Click the "Create new dataset" button
Select your CSV file when the upload modal appears
Preview your data before confirming
Give your dataset a meaningful name (required)
Once uploaded, the dataset appears in both the Datahub and your current notebook
File Requirements
When uploading files to the Datahub, ensure your file meets these specifications:
Include a header row with column names
Include a 'template' column with prompt templates
Use {{variable}} syntax for variables in templates
Include matching columns for all template variables
Use only alphanumeric characters and underscores in variable names
Include a 'response' column for prompt responses
Keep file size under 1MB
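To catch problems before uploading, you can check a file against the requirements above with a short local script. The sketch below is not part of Arato. The function name check_dataset_csv is hypothetical, "1MB" is assumed to mean 1,000,000 bytes, and variable names are assumed to sit directly between the braces with no surrounding spaces. The Datahub's own validation may differ.

```python
import csv
import io
import re

# Assumption: "1MB" is read as 1,000,000 bytes; the Datahub's exact limit may differ.
MAX_BYTES = 1_000_000
# Assumption: a variable is whatever sits directly between {{ and }}.
PLACEHOLDER = re.compile(r"\{\{(.*?)\}\}")
VALID_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def check_dataset_csv(text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if len(text.encode("utf-8")) >= MAX_BYTES:
        problems.append("file is not under 1MB")

    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    if not columns:
        return problems + ["missing header row"]

    for required in ("template", "response"):
        if required not in columns:
            problems.append(f"missing '{required}' column")
    if "template" not in columns:
        return problems  # nothing further to check without templates

    # Every {{variable}} must have a valid name and a matching column.
    for row_number, row in enumerate(reader, start=1):
        for name in PLACEHOLDER.findall(row["template"] or ""):
            if not VALID_NAME.match(name):
                problems.append(f"data row {row_number}: invalid variable name '{name}'")
            elif name not in columns:
                problems.append(f"data row {row_number}: no column for variable '{name}'")
    return problems


# A file that meets the listed requirements.
good = "template,response,city\nWeather in {{city}}?,Sunny,Paris\n"
# A file with no 'response' column, a hyphenated variable, and an unmatched variable.
bad = "template,city\nHello {{user-name}} from {{country}},Paris\n"

print(check_dataset_csv(good))
print(check_dataset_csv(bad))
```

An empty list means the file passed these local checks. It does not guarantee the upload will succeed, since the script only mirrors the requirements listed here.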
Managing Datasets
Preview Files: Check dataset contents before use
Move Files: Transfer datasets between notebooks
Remove Files:
Remove from notebook while keeping in Datahub
Delete permanently from Datahub
Track Usage: View which notebooks use each dataset
Add Tags: Create custom tags for better organization