
Managing Datasets in Datahub

Organizing, storing, and versioning datasets in Arato’s Datahub.

Updated this week

What is the Datahub?

The Datahub is the central repository for storing and managing all of your datasets in Arato. It supports file uploads, previews, and metadata tracking. For each dataset you can view event counts, variables, notebook usage, upload details, and custom tags.

Why Use the Datahub?

The Datahub streamlines your dataset management workflow with several key benefits:

  • Efficient Organization: Quickly sort and find datasets across multiple notebooks

  • Version Control: Track when each dataset was modified and by whom

  • Tag Management: Create and manage custom tags for better dataset organization

  • File Preview: Examine dataset contents before use

  • Usage Tracking: See which notebooks are using each dataset


How to Use the Datahub

Adding New Datasets

  • Access the Datahub

    • Click the three dots (⋮) in your dataset header

    • Select "Open Datahub" from the menu

  • Upload a Dataset

    • Click the "Create new dataset" button

    • Select your CSV file when the upload modal appears

    • Preview your data before confirming

    • Give your dataset a meaningful name (required)

    • Once uploaded, the dataset appears in both Datahub and your current notebook

File Requirements

When uploading files to the Datahub, ensure your file meets these specifications:

  • Include a header row with column names

  • Include a 'template' column containing prompt templates

  • Use {{variable}} syntax for variables in templates

  • Include matching columns for all template variables

  • Use only alphanumeric characters and underscores in variable names

  • Include a 'response' column for prompt responses

  • Keep the file size under 1 MB
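
The requirements above can be checked locally before uploading. The following is a minimal sketch in Python, not part of Arato: the function name check_dataset_csv is hypothetical, "under 1MB" is assumed to mean fewer than 1,000,000 bytes, the file is assumed to be UTF-8 encoded, and whitespace just inside the {{ }} braces is assumed to be ignored. The Datahub's own validation may differ on any of these points.

```python
import csv
import io
import re

# Assumption: "under 1MB" is read as fewer than 1,000,000 bytes.
MAX_BYTES = 1_000_000
# Captures whatever sits between {{ and }}, ignoring surrounding whitespace (assumption).
PLACEHOLDER = re.compile(r"\{\{\s*(.*?)\s*\}\}")
# Variable names: alphanumeric characters and underscores only.
VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_dataset_csv(raw: bytes) -> list[str]:
    """Hypothetical helper: return a list of problems; an empty list means all checks passed."""
    problems = []
    if len(raw) >= MAX_BYTES:
        problems.append("file is not under 1MB")

    # utf-8-sig drops a leading byte-order mark if the file has one.
    text = raw.decode("utf-8-sig")
    reader = csv.DictReader(io.StringIO(text, newline=""))
    columns = reader.fieldnames or []
    if not columns:
        return problems + ["missing header row"]

    for required in ("template", "response"):
        if required not in columns:
            problems.append(f"missing '{required}' column")

    # Every {{variable}} used in a template needs a valid name and a matching column.
    for row_no, row in enumerate(reader, start=1):
        for name in PLACEHOLDER.findall(row.get("template") or ""):
            if not VALID_NAME.fullmatch(name):
                problems.append(f"row {row_no}: invalid variable name '{name}'")
            elif name not in columns:
                problems.append(f"row {row_no}: no column for variable '{name}'")
    return problems
```

For example, a file whose content is the two lines `template,response,name` and `"Hello {{name}}",Hi there,Ada` passes every check, so check_dataset_csv returns an empty list. A file that lacks the 'response' column, or that uses {{city}} without a 'city' column, produces one message per problem.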

Managing Datasets

  • Preview Files: Check dataset contents before use

  • Move Files: Transfer datasets between notebooks

  • Remove Files:

    • Remove from notebook while keeping in Datahub

    • Delete permanently from Datahub

  • Track Usage: View which notebooks use each dataset

  • Add Tags: Create custom tags for better organization
