Goal#
target:
1000 unique workplace meeting utterances
scope:
opening
agenda
clarification
interruption
agreement / disagreement
decision
action item
closing
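A minimal sketch of how progress toward the target could be tracked per scope item. The snake_case tag spellings, the single combined `agreement_disagreement` tag, and the `coverage_report` helper are assumptions for illustration, not part of the plan itself.

```python
from collections import Counter

# Assumed tag spellings, one per scope line above (hypothetical names).
SCOPE_TAGS = [
    "opening",
    "agenda",
    "clarification",
    "interruption",
    "agreement_disagreement",
    "decision",
    "action_item",
    "closing",
]

TARGET_TOTAL = 1000  # from the goal: 1000 unique utterances


def coverage_report(row_tags):
    """Count utterances per scope tag and report how many are still needed.

    row_tags: iterable of tag strings, one per collected utterance.
    Returns (per_tag_counts, unknown_tags, remaining_to_target).
    """
    counts = Counter(row_tags)
    # Tags outside the scope list are surfaced rather than silently counted.
    unknown = sorted(set(counts) - set(SCOPE_TAGS))
    per_tag = {tag: counts.get(tag, 0) for tag in SCOPE_TAGS}
    remaining = max(0, TARGET_TOTAL - sum(per_tag.values()))
    return per_tag, unknown, remaining
```

Reporting out-of-scope tags separately keeps typos in the tag column from inflating the total.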
Source Plan#
Collection Rules#
do:
collect short utterances only
deduplicate exact text
keep source_url for every utterance
keep speaker/context when available
tag by meeting function
do not:
paste long transcript blocks
import closed-license corpus text without permission
mix generated sentences into this file
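The mechanical rules above (short utterances, exact-text deduplication, a source_url on every row) could be checked with something like the sketch below. `MAX_WORDS` and `filter_rows` are hypothetical: the plan does not define how short "short" is, so the threshold is a placeholder to adjust. License and provenance checks still need a human.

```python
MAX_WORDS = 40  # assumed cutoff for "short"; not specified in the plan


def filter_rows(rows):
    """Split rows into (kept, rejected) according to the collection rules.

    rows: iterable of dicts with at least "utterance" and "source_url" keys.
    rejected is a list of (row, reason) pairs so nothing is dropped silently.
    """
    seen = set()
    kept, rejected = [], []
    for row in rows:
        text = (row.get("utterance") or "").strip()
        if not text:
            rejected.append((row, "empty utterance"))
        elif not row.get("source_url"):
            rejected.append((row, "missing source_url"))
        elif len(text.split()) > MAX_WORDS:
            rejected.append((row, "too long"))
        elif text in seen:
            # Exact-text match only, as the rule states; no fuzzy matching.
            rejected.append((row, "exact duplicate"))
        else:
            seen.add(text)
            kept.append(row)
    return kept, rejected
```

The first occurrence of a duplicated utterance is kept, so its source_url is the one that survives.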
Corpus Schema#
| id | utterance | tag | source_url | source_license | note |
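One way the schema might be represented in code is sketched below. The column names come from the schema; the `UtteranceRow` class, the string types, the optional `note`, and the CSV output format are assumptions. The sample values are placeholders, not corpus data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class UtteranceRow:
    """One corpus row; field order matches the schema columns."""
    id: str
    utterance: str
    tag: str
    source_url: str
    source_license: str
    note: str = ""  # assumed optional


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header taken from the dataclass fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(UtteranceRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


# Placeholder row for illustration only.
example = UtteranceRow(
    id="u0001",
    utterance="<utterance text>",
    tag="opening",
    source_url="https://example.com/source",
    source_license="<license>",
)
csv_text = rows_to_csv([example])
```

Deriving the header from the dataclass keeps the file columns and the schema from drifting apart.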