URTEXT

We capture how developers think — not just what they code

URTEXT is building the world's first dataset of real programming behavior. We track keystrokes, edits, and reasoning to help AI systems and researchers better understand how programmers think.

From human thought to structured code behavior

Our pipeline transforms raw programming sessions into clean, structured datasets

1. Capture

Developers record their screen, keyboard, and voice as they solve problems.

2. Redact

Sensitive data is automatically removed or masked.

3. Structure

The screen, keyboard, and voice streams of each recording are synchronized and labeled for context.

4. Validate

Experts check each recording for quality, consistency, and privacy before it is used.
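As a rough illustration, the Redact, Structure, and Validate steps could be sketched as below. This is a minimal sketch under stated assumptions, not URTEXT's actual implementation: the `Event` record, the `redact`, `structure`, and `validate` functions, and the two regex patterns are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


# Hypothetical event record; the real URTEXT schema is not described here.
@dataclass(frozen=True)
class Event:
    t: float      # seconds since the session started
    stream: str   # "keyboard", "screen", or "voice"
    text: str     # typed text, screen caption, or transcript fragment


# Illustrative patterns only; real redaction would need a far broader set.
SECRET_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # email addresses
    re.compile(r"\b(?:api[_-]?key|token|password)\s*=\s*\S+", re.IGNORECASE),
]


def redact(event: Event) -> Event:
    """Return a copy of the event with every pattern match masked."""
    text = event.text
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return Event(event.t, event.stream, text)


def structure(*streams: list) -> list:
    """Redact every event, then merge all streams onto one timeline."""
    merged = [redact(e) for stream in streams for e in stream]
    return sorted(merged, key=lambda e: e.t)


def validate(timeline: list) -> bool:
    """Check that events are time-ordered and no pattern still matches."""
    ordered = all(a.t <= b.t for a, b in zip(timeline, timeline[1:]))
    clean = not any(p.search(e.text) for e in timeline for p in SECRET_PATTERNS)
    return ordered and clean


if __name__ == "__main__":
    keyboard = [
        Event(2.0, "keyboard", "password = hunter2"),
        Event(0.5, "keyboard", "def add(a, b):"),
    ]
    voice = [Event(1.0, "voice", "mail me at dev@example.com")]
    for event in structure(keyboard, voice):
        print(event)
```

Redacting before merging means unmasked text never reaches the shared timeline. The automated `validate` check in this sketch would only complement the expert review described in step 4, not replace it.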

Built for researchers, educators, and AI builders

For Researchers

Study real debugging, refactoring, and problem-solving processes

For Educators

Show students how experts think while coding, not just the final result

For AI Developers

Train systems that reason like real programmers — grounded in process, not static code

Code is more than text — it's a thought process

Most programming datasets capture finished code. URTEXT captures how that code was created, including mistakes, reasoning, and iteration.

A record of the process, not just the result, provides new insight into learning, productivity, and intelligence in software creation.

Your data. Your control.

We design for privacy from the start:

You decide what's recorded and what's redacted.

Raw recordings are never shared publicly.

All data goes through multi-stage anonymization before use.

Contributors can withdraw participation anytime.

Who's building URTEXT

URTEXT is a student-led initiative exploring AI, programming, and data-driven projects

Technical development, including coding and data structuring, is carried out by a network of trusted software engineers and researchers

Research-Driven

Grounded in academic rigor and real-world developer insights

Developer-First

Built by developers who understand the craft of programming

Privacy-Focused

Your data stays yours with full transparency and control

We're launching our first pilot soon

We're currently preparing small pilot collaborations with select research and industry partners

If you're interested in early access, sign up below — we'll contact you when applications open

Help us build the first dataset that captures how humans truly code

Join our pilot phase as a partner or contributor

URTEXT

Capturing real programming behavior for AI and research

hello@urtextdata.com