Human Annotation Dataset Creation

This directory contains scripts for creating and managing datasets for human annotation of model conversations.

Dataset Creation

The create_dataset.py script generates a dataset of conversations for human annotation by processing conversation logs from previous model runs.

Configuration

The script is configured with the following parameters:

Models: hf/Meta-Llama-3.1-8B-Instruct and gpt-4o
Turns: 1, 2, and 3
Evaluator Responses: 0 and 1
Topic Categories: "BenignOpinion", "BenignFactual", "Controversial", and "NoncontroversiallyHarmful"
Samples: 2 (2 of each combination above)
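
To make the size of the dataset concrete, here is a minimal sketch that enumerates the parameter grid. It assumes "each combination" means every pairing of model, turn, evaluator response, and topic category; the constant and function names are illustrative and are not taken from create_dataset.py.

```python
from itertools import product

# Values copied from the configuration list above.
MODELS = ["hf/Meta-Llama-3.1-8B-Instruct", "gpt-4o"]
TURNS = [1, 2, 3]
EVALUATOR_RESPONSES = [0, 1]
TOPIC_CATEGORIES = [
    "BenignOpinion",
    "BenignFactual",
    "Controversial",
    "NoncontroversiallyHarmful",
]
SAMPLES_PER_COMBINATION = 2


def required_entries():
    """Return one dict per slot the dataset needs to fill (hypothetical layout)."""
    entries = []
    grid = product(MODELS, TURNS, EVALUATOR_RESPONSES, TOPIC_CATEGORIES)
    for model, turn, evaluator_response, category in grid:
        for sample in range(SAMPLES_PER_COMBINATION):
            entries.append(
                {
                    "model": model,
                    "turn": turn,
                    "evaluator_response": evaluator_response,
                    "topic_category": category,
                    "sample": sample,
                }
            )
    return entries


# 2 models * 3 turns * 2 responses * 4 categories * 2 samples = 96
print(len(required_entries()))
```

Under that reading, the finished dataset would hold 96 conversations.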

Running the Script

To create the dataset, run:

python create_dataset.py

Output Files

The script generates two JSON files:

Full Dataset (analysis/dataset_with_conversations.json):
- Contains complete conversation data including metadata
- Includes model information, turn numbers, topic categories, and full conversation context
Annotation Dataset (dataset_with_conversations_for_annotation.json):
- Simplified version containing only the essential information needed for annotation
- Each entry includes:
  - ID
  - Latest persuader response
  - Latest persuadee response
  - Topic
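
The relationship between the two files can be sketched as a projection from a full entry to an annotation entry. The key names and the "role"/"content" message layout below are assumptions made for illustration; the actual JSON keys written by create_dataset.py may differ.

```python
def to_annotation_entry(full_entry):
    """Reduce a full dataset entry to the four fields shown to annotators."""
    conversation = full_entry["conversation"]

    def latest(role):
        # Walk the conversation backwards to find the most recent message by this role.
        for message in reversed(conversation):
            if message["role"] == role:
                return message["content"]
        return None

    return {
        "id": full_entry["id"],
        "persuader_response": latest("persuader"),
        "persuadee_response": latest("persuadee"),
        "topic": full_entry["topic"],
    }


# Hypothetical full entry, for illustration only.
example = {
    "id": 7,
    "model": "gpt-4o",
    "turn": 2,
    "topic": "Example topic",
    "conversation": [
        {"role": "persuadee", "content": "first persuadee message"},
        {"role": "persuader", "content": "first persuader message"},
        {"role": "persuadee", "content": "second persuadee message"},
        {"role": "persuader", "content": "second persuader message"},
    ],
}

print(to_annotation_entry(example))
```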

Dataset Structure

The script searches through conversation logs in ../results/local/annotation in reverse chronological order to find matching conversations. It looks for conversations that match:

The specified model
Topic categories
Topic file
Turn number
Evaluator response

The script will print progress information including:

Number of entries found
Number of remaining entries to find
Any entries that couldn't be matched
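
The search described above can be sketched as a newest-first scan that fills each required slot at most once. This is a simplified illustration, not the script's code: it takes runs as in-memory (timestamp, conversations) pairs instead of reading ../results/local/annotation, and the function and key names are hypothetical.

```python
def find_matches(required, runs):
    """Fill required slots from runs, scanning the newest run first.

    required: list of dicts of criteria, e.g. {"model": "gpt-4o", "turn": 1}
    runs: list of (timestamp, conversations) pairs in any order
    """
    remaining = list(required)
    found = []
    for _, conversations in sorted(runs, key=lambda run: run[0], reverse=True):
        for conversation in conversations:
            for criteria in remaining:
                if all(conversation.get(k) == v for k, v in criteria.items()):
                    found.append(conversation)
                    remaining.remove(criteria)
                    break  # each conversation fills at most one slot
        if not remaining:
            break  # every slot is filled, so older runs are not needed
    print(f"found {len(found)}, remaining {len(remaining)}")
    for criteria in remaining:
        print(f"no match for {criteria}")
    return found, remaining


# Hypothetical data: the newer run wins the slot for turn 1.
required = [{"model": "gpt-4o", "turn": 1}, {"model": "gpt-4o", "turn": 2}]
runs = [
    ("2024-01-01", [{"model": "gpt-4o", "turn": 1, "text": "old"}]),
    ("2024-02-01", [{"model": "gpt-4o", "turn": 1, "text": "new"}]),
]
found, remaining = find_matches(required, runs)
```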

Notes

The script automatically shuffles the dataset entries and assigns sequential IDs
It skips conversations where there was an explicit refusal
The script keeps scanning results directories until every required entry has a match or no directories remain
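
The shuffling and ID assignment noted above might look like the following sketch. The "refused" flag, the fixed seed, and IDs starting at 1 are all assumptions for illustration; the source does not say how refusals are detected, whether the shuffle is seeded, or where numbering starts.

```python
import random


def finalize(entries, seed=0):
    """Drop refusals, shuffle, and assign sequential IDs (hypothetical helper)."""
    # Assumption: a refusal is marked with a boolean "refused" key.
    kept = [entry for entry in entries if not entry.get("refused", False)]
    rng = random.Random(seed)  # seeded here only so the sketch is reproducible
    rng.shuffle(kept)
    # Assumption: IDs start at 1 and follow the shuffled order.
    return [{**entry, "id": index} for index, entry in enumerate(kept, start=1)]


entries = [
    {"topic": "a"},
    {"topic": "b", "refused": True},
    {"topic": "c"},
]
print(finalize(entries))
```

Assigning IDs after shuffling means the ID order reveals nothing about model, turn, or topic category to the annotator.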

Annotation Process

Annotation is performed in Label Studio, an open-source data labeling tool.

Setting Up Label Studio

Pull and run the Label Studio Docker container:

docker pull heartexlabs/label-studio:latest
docker run -it -p 8080:8080 heartexlabs/label-studio:latest

Access Label Studio by opening your web browser and navigating to:
```
http://localhost:8080
```

Creating a New Project

Create a new account or log in to Label Studio
Click "Create Project" and give it a name (e.g., "Model Persuasion Analysis")
In the project setup:
- Import the annotation dataset file (dataset_with_conversations_for_annotation.json)
- Choose "Custom Template" for the labeling interface
- Use the provided layout.xml file to set up the annotation interface

Annotation Interface

The custom layout will display:

The conversation topic
The latest persuadee response
The latest persuader response
Annotation controls for marking various aspects of the conversation

Performing Annotations

Click "Start Labeling" to begin the annotation process
For each conversation:
- Read the topic and the latest persuadee and persuader responses
- Use the provided controls to mark relevant aspects
- Click "Submit" to save your annotation
- Use the navigation controls to move to the next conversation

Exporting Results

Once all annotations are complete, go to the project dashboard
Click "Export" to download the annotated dataset
Choose your preferred export format (JSON recommended)
Save the exported file for further analysis
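
As a rough guide to what the exported JSON contains, the sketch below pulls choice labels out of a list of tasks. It assumes the common Label Studio JSON export layout (tasks with an "annotations" list whose "result" items carry "from_name" and "value") and assumes the controls in layout.xml are choice controls; the control name "persuasion" in the sample is hypothetical.

```python
def extract_choices(tasks):
    """Flatten choice-type results from exported tasks into simple rows."""
    rows = []
    for task in tasks:
        for annotation in task.get("annotations", []):
            for item in annotation.get("result", []):
                choices = item.get("value", {}).get("choices")
                if choices is not None:
                    rows.append(
                        {
                            "task_id": task["id"],
                            "control": item["from_name"],
                            "choices": choices,
                        }
                    )
    return rows


# Hypothetical export fragment, for illustration only.
sample_export = [
    {
        "id": 1,
        "data": {"topic": "Example topic"},
        "annotations": [
            {
                "result": [
                    {
                        "from_name": "persuasion",
                        "to_name": "text",
                        "type": "choices",
                        "value": {"choices": ["Yes"]},
                    }
                ]
            }
        ],
    }
]

print(extract_choices(sample_export))
```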

Analysis Process

After collecting annotations from multiple annotators, the next step is to merge and analyze the results.

Merging Annotations

Place all exported annotation files in the human_responses directory
Run the merge_annotations.py script:
```
python merge_annotations.py
```

The script will:

Read all annotation files from the human_responses directory
Merge the annotations for each conversation
Create a consolidated dataset with all annotations
Output the merged results to merged_dataset.json
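
The merge steps above can be sketched as grouping labels by conversation ID across annotators. This is an illustration under stated assumptions, not the logic of merge_annotations.py: it assumes each export has already been reduced to records with "id" and "label" keys, and that the file stem identifies the annotator.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_exports(directory="human_responses"):
    """Load every *.json file in the directory, keyed by file stem (assumed annotator name)."""
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.json"))
    }


def merge_annotations(exports):
    """Combine per-annotator records into one entry per conversation ID."""
    merged = defaultdict(dict)
    for annotator, records in exports.items():
        for record in records:
            merged[record["id"]][annotator] = record["label"]
    return [
        {"id": conversation_id, "annotations": labels}
        for conversation_id, labels in sorted(merged.items())
    ]


# Hypothetical in-memory exports from two annotators.
exports = {
    "annotator_a": [{"id": 1, "label": "x"}, {"id": 2, "label": "y"}],
    "annotator_b": [{"id": 1, "label": "x"}],
}
print(json.dumps(merge_annotations(exports), indent=2))
```

Keeping the annotator name alongside each label preserves who said what, which the agreement analysis needs.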

Analysis Output

The merged dataset (merged_dataset.json) will contain:

All original conversation data
Combined annotations from all annotators

This merged dataset can then be used for further analysis and visualization of the annotation results.

Agreement Analysis and Visualization

To analyze inter-annotator agreement and create visualizations:

Run the agreement analysis script:
```
python agreement_analysis.py
```

The script will:

Calculate agreement statistics between annotators
Generate various graphs and visualizations
Save the output in the analysis directory
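
The source does not say which agreement statistic agreement_analysis.py computes. As one common choice for two annotators, here is a minimal sketch of Cohen's kappa, which corrects raw agreement for the agreement expected by chance; the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance overlap given each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Observed agreement is 3/4 and expected agreement is 8/16, so kappa is 0.5.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

With more than two annotators, a statistic such as Fleiss' kappa or Krippendorff's alpha would be the usual alternative.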

The generated visualizations will include:

Agreement heatmaps
Distribution plots of annotations
Comparison charts between different model responses
Statistical summaries of agreement metrics

These visualizations can be used to:

Assess the reliability of the annotations
Identify patterns in model behavior
Compare performance across different conditions
Support conclusions about model capabilities
