Automated Detection of Confounds and Inappropriate Context to Promote Prosocial Learning and Cognition
NSF SBIR Phase I (#2304423)
July 15, 2023 – August 31, 2024 · Completed
Grant Details
| Field | Details |
|---|---|
| Award Number | #2304423 |
| Program | America’s Seed Fund (SBIR/STTR) |
| Amount | $274,990 |
| Period | July 15, 2023 – August 31, 2024 |
| PI | Michael Douma |
| Lead | OtherWordly LLC |
| Sub-awardee | University of Wisconsin-Madison |
| Status | Completed |
Objective 1: Content Generation
The core problem is generating word content affordably while keeping it appropriate for diverse audiences. Curse words are easy to filter: there are roughly 6,000 vulgar terms in English. The hard part is benign terms that carry an offensive sense only in certain contexts. For example, “monkey” can name an animal or be used as a racial slur.
We developed a three-part offensiveness scoring framework:
- Universal Offensiveness (UO) — General offensiveness in typical usage
- Interpersonal Offensiveness (IO) — Offensiveness in social contexts
- Primary Meaning Overlap (PMO) — How much the offensive sense dominates the word’s meaning
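As a rough illustration of how three such scores might be combined into a filtering decision, here is a minimal sketch. It is not the project's algorithm: the 0 to 1 scale, the `OffensivenessScores` class, the `classify` function, the thresholds, and the combination rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffensivenessScores:
    """Hypothetical container; each score is assumed to lie in [0, 1]."""
    uo: float   # Universal Offensiveness
    io: float   # Interpersonal Offensiveness
    pmo: float  # Primary Meaning Overlap


def classify(scores, block_at=0.7, review_at=0.3):
    """Return 'block', 'context-dependent', or 'allow' (illustrative rule only)."""
    for value in (scores.uo, scores.io, scores.pmo):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must be in [0, 1]")
    # The offensive sense is both strong and dominant: filter the word outright.
    if scores.uo >= block_at and scores.pmo >= block_at:
        return "block"
    # Offensive only in some senses or social settings: needs a context check.
    if max(scores.uo, scores.io) >= review_at:
        return "context-dependent"
    return "allow"


# Made-up numbers for a benign word with a context-dependent offensive sense.
print(classify(OffensivenessScores(uo=0.2, io=0.8, pmo=0.1)))
```

Under these assumed thresholds, a word whose offensive sense is rare (low PMO) but hurtful between people (high IO) is routed to a context check instead of being blocked outright, which is the case a plain curse-word list misses.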
We built validation benchmarks to test our algorithms: pairwise detection (is this word pair offensive?), group detection (does this group contain offensive combinations?), and group identification (which specific words are problematic?).
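The three benchmark tasks can be pictured as three functions over the same underlying pair judgment. The sketch below is only an assumption about their shape: it uses a hypothetical lookup set, `FLAGGED_PAIRS`, with placeholder tokens, where a real system would presumably use learned scores, and all function names are invented for illustration.

```python
from itertools import combinations

# Hypothetical flagged pairs; placeholder tokens, not real benchmark data.
FLAGGED_PAIRS = {frozenset({"word_a", "word_b"})}


def pair_is_offensive(w1, w2, flagged=FLAGGED_PAIRS):
    """Pairwise detection: is this word pair offensive?"""
    return frozenset({w1, w2}) in flagged


def group_has_offensive_combination(words, flagged=FLAGGED_PAIRS):
    """Group detection: does the group contain any offensive combination?"""
    return any(pair_is_offensive(a, b, flagged) for a, b in combinations(words, 2))


def problematic_words(words, flagged=FLAGGED_PAIRS):
    """Group identification: which specific words take part in a flagged pair?"""
    found = set()
    for a, b in combinations(words, 2):
        if pair_is_offensive(a, b, flagged):
            found.update((a, b))
    return sorted(found)


group = ["word_a", "word_c", "word_b"]
print(group_has_offensive_combination(group))
print(problematic_words(group))
```

Each task is stricter than the last: pairwise detection judges two words, group detection needs only one flagged pair anywhere in the group, and group identification must name the words involved.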
The database expanded significantly during the grant period:
- 3.1 million sets of meanings (up from 1.1 million initial terms)
- 60 million weighted word pairs
- 5,000 noun topics mapped to Library of Congress classification
- A 72-category social issues taxonomy
Objective 2: Prosocial Evaluation
With the University of Wisconsin-Madison, we designed an 8-week gameplay study to measure cognitive and social impacts. The study received IRB approval (2024-0210) and measures critical thinking, systems thinking, openness to diverse perspectives, and willingness to engage with difficult topics.
Results
- Puzzle construction efficiency — Reduced from 20 hours to 5 hours per puzzle (a 75% reduction in time) through automated suggestions and problem detection
- Contextual definitions — Started with 20,000 manually crafted definitions, then scaled to hundreds of thousands using LLMs
- 15 difficulty levels — Ranging from young children to advanced wordsmiths
- Neurodiverse accessibility — Playtesting revealed that word rotation aligned with how some dyslexic players experience text
Key Insight
Bigger language models aren’t always better for specialized tasks. Custom embeddings trained on our datasets outperformed general-purpose LLMs on culturally sensitive edge cases—the kind of nuance that matters most for content appropriateness.
Related Funding
- NSF I-Corps ($20,000, 2023) — A supplement for customer discovery. The 250+ interviews directly shaped In Other Words.
- NSF XSEDE Supercomputer Grant (#IRI130011, 2013–2014) — 2.3 million supercomputer hours for the foundational NLP work that became the early Linguabase.
Outcomes
The grant supported development of:
- OtherWordly — 100 levels, 30,000+ unique words, 15 adaptive difficulty levels. Coming to iOS.
- In Other Words — Daily puzzles with over 1 million valid solutions each. Available on iOS.
- Open-source NLP benchmarks for semantic unrelatedness research
- Content filtering algorithms for prosocial game content
