Day 02 / 04 Agentic Coding Tools for Researchers

Working with Data

Justin Frake

University of Michigan, Ross School of Business

Yesterday we got your agent up and running

1
Installed your agent
Claude Code (or Codex) running in a terminal you set up yourself.
2
Plan / Execute / Clear
The three-phase rhythm. Your default for non-trivial work.
3
Permissions you control
You decide what the agent can do and access.
4
Explored the CPD data
Downloaded complaints-complaints.csv and learned where the data came from.
5
Your global memory file
CLAUDE.md or AGENTS.md loaded automatically in every session.
6
R, Python, Quarto ready
Installed end-of-day yesterday, with the packages we need.

Agenda

§ 1 · Day 1 recap + framing today · Discussion
§ 2-3 · Prompts and plan mode in practice · Lecture
§ 4 · Join the data and run the first regression · Workshop
§ 5 · Memory · Lecture
§ 6 · Project memory file · Workshop
§ 7 · Skills · Lecture
§ 8 · Author a table-formatting skill · Workshop
§ 9 · What goes in a figure skill · Lecture
§ 10 · Author a figure-formatting skill · Workshop
§ 11 · Drafting prose with the agent · Lecture
§ 12 · Write and compile your draft · Workshop
§ 13 · Debrief + Day 3 preview · Discussion
§ 2

Prompting

You don't need magic phrasing. Just talk to it.

Natural language works. The agents are good enough now that you do not need ceremony.
Brief it like an RA. Be specific. Say what you want.
Clarity still matters. Vague brief in, vague work out.
Use dictation. I use Wispr Flow. Speaking is faster than typing and you will be more natural about it.
Show, don't just tell. Paste a screenshot straight into the prompt: a figure that looks wrong, an error message, a table, a journal layout you want to match.

The agent reads right through misspellings, bad grammar, duplicates, rambling, mid-prompt contradictions, even a “never mind.”


Four prompts I reach for most

These are additions to "interview me and make a plan."

01 Research first
please thoroughly research [topic]

Use it before touching an unfamiliar method or any package whose syntax shifted in the last year. Training cutoffs make these the highest-hallucination zones.

Grounds the rest of the conversation in what's actually true, not the agent's best guess from training.

02 Surface gaps
is there anything i'm missing? any assumptions you're making that i should clarify first?

Use it after plan mode when you want to push the agent to name what it is still assuming.

Surfaces gaps before they become mistakes. Cheaper to clarify now than to roll back later.

03 Scrap and rebuild
knowing what you know now, set this script aside and write the clean version from scratch

Use it after the first messy attempt.

Same logic as rewriting a script after the first exploratory pass. The agent knows the data shape now; let it design the clean version.

04 Walk it through
walk me through what you did

Use it after the agent says a task is done.

Asks for files touched, commands run, and decisions made, not the cheerful recap. Catches silent edits to scripts you did not ask about.

§ 3

Regressions

Today's data: the Invisible Institute's Chicago police records

A Chicago non-profit newsroom publishes the largest open dataset of municipal police misconduct in the country: every complaint filed against a Chicago Police Department officer, with disciplinary outcomes attached, going back to 1988.

Who runs it
The Invisible Institute, a Chicago investigative non-profit, with the University of Chicago Law School's Mandel Legal Aid Clinic. The Citizens Police Data Project (CPDP) launched in 2015.
Why we can use it
Kalven v. City of Chicago (2014) made police misconduct records public in Illinois. Released via FOIA. Public records; no IRB protocol needed for the workshop.
Figure: heatmap of Chicago Police Department complaint density across the city, with the highest concentrations along the Loop, the South Side, and the West Side.

Caption: Complaint density across Chicago beats.

Source: invisible.institute/police-data · cpdp.co · github.com/invinst/chicago-police-data


Today's regression

Research question
Are complaints filed by CPD officers sustained more often than complaints filed by civilians?
complaints-complaints.csv · one row per complaint
  • cr_id · key
  • complainant_type
  • beat
  • incident_date
complaints-accused.csv · one row per accused officer
  • cr_id · key
  • UID · officer ID
  • final_finding
  • complaint_category
Join on cr_id. Result: one row per accused officer per complaint.
The regression
Whether the complaint against an accused officer was sustained, regressed on whether the complaint was filed by a CPD officer (vs. a civilian), with beat fixed effects and standard errors clustered at the beat level.
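A rough sketch of the join and the within-beat comparison, in plain Python on made-up rows. The column names come from the slide; the codes "SU" and "CPD EMPLOYEE" follow the project memory file shown later in the deck; the sample rows, the "NS" placeholder code, and both function names are assumptions for illustration. Standard errors (clustered or otherwise) are deliberately left out, so treat this as a way to check your understanding, not as the analysis itself.

```python
# Toy rows standing in for the two CSVs (made up for illustration).
complaints = [
    {"cr_id": "1", "complainant_type": "CPD EMPLOYEE", "beat": "0111"},
    {"cr_id": "2", "complainant_type": "CIVILIAN", "beat": "0111"},
    {"cr_id": "3", "complainant_type": "CIVILIAN", "beat": "0222"},
]
accused = [
    {"cr_id": "1", "UID": "A", "final_finding": "SU"},
    {"cr_id": "1", "UID": "B", "final_finding": "NS"},  # "NS": placeholder non-sustained code
    {"cr_id": "2", "UID": "C", "final_finding": "NS"},
]


def join_on_cr_id(complaints, accused):
    """Inner join, one-to-many: one output row per accused officer per complaint."""
    by_id = {}
    for row in complaints:
        if row["cr_id"] in by_id:
            raise ValueError(f"duplicate cr_id in complaints: {row['cr_id']}")
        by_id[row["cr_id"]] = row
    joined, matched = [], set()
    for acc in accused:
        parent = by_id.get(acc["cr_id"])
        if parent is None:
            continue  # accused row with no matching complaint
        matched.add(acc["cr_id"])
        row = {**parent, **acc}
        row["sustained"] = 1 if acc["final_finding"] == "SU" else 0
        row["officer_filed"] = 1 if parent["complainant_type"] == "CPD EMPLOYEE" else 0
        joined.append(row)
    # Share of complaints that never appear in the joined data (no accused rows).
    share_dropped = 1 - len(matched) / len(by_id)
    return joined, share_dropped


def within_beat_slope(rows):
    """Slope of sustained on officer_filed after demeaning both within beat.

    With beat fixed effects this equals the linear-probability coefficient.
    No standard errors are computed here.
    """
    groups = {}
    for r in rows:
        groups.setdefault(r["beat"], []).append(r)
    num = den = 0.0
    for members in groups.values():
        x_bar = sum(m["officer_filed"] for m in members) / len(members)
        y_bar = sum(m["sustained"] for m in members) / len(members)
        for m in members:
            num += (m["officer_filed"] - x_bar) * (m["sustained"] - y_bar)
            den += (m["officer_filed"] - x_bar) ** 2
    if den == 0:
        raise ValueError("officer_filed does not vary within any beat")
    return num / den


joined, share_dropped = join_on_cr_id(complaints, accused)
slope = within_beat_slope(joined)
print(len(joined), round(share_dropped, 3), slope)
```

In the workshop itself a regression package would handle the fixed effects and the clustering; the point of the sketch is that you can state in a few lines what "one row per accused officer per complaint" and "within-beat" mean before the agent writes anything.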

How to start the conversation

Don't
please estimate the effect of complainant type on sustained complaints

The agent jumps in and picks variables, an estimator, an FE structure, a sample, and a clustering level. You spend the next ten prompts unwinding choices you never made.

Do
let's plan a regression of whether a complaint was sustained on whether it was filed by a CPD officer. interview me before writing code. confirm variable names against the data. use beat fixed effects and cluster standard errors at the beat. list every assumption you are making that i did not specify.

The agent asks. You answer. The spec is more yours than the agent's.

The interview narrows your choices only to the ones the agent thought to ask about. Notice what it didn't ask.


What the agent should know before running a regression

Four things to pin down before any code runs. Hand them to the agent however fits: say it each time, keep the stable ones in a memory file, or bake them into a skill (we cover skills later).

1
Language
R or Python? Which one are we writing this in?
2
Package
Within the language, which package? fixest, statsmodels, linearmodels, etc.
3
Output
Which statistics to report? In what file format and where saved?
4
Specification
Dependent variable
Independent variable
Controls and fixed effects
Standard errors. Robust or clustered, and at what level.
Sample. Which rows enter the regression.
Model. OLS, logit, probit, etc.
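One way to make the checklist concrete is to write it down as a structure and see what is still blank. The sketch below is hypothetical: the class name `RegressionBrief`, its field names, and the example values are assumptions made for illustration, not part of any agent's API.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RegressionBrief:
    """The four things to pin down; None means 'not decided yet'."""
    language: Optional[str] = None
    package: Optional[str] = None
    output: Optional[str] = None
    dependent_variable: Optional[str] = None
    independent_variable: Optional[str] = None
    controls_and_fixed_effects: Optional[str] = None
    standard_errors: Optional[str] = None
    sample: Optional[str] = None
    model: Optional[str] = None

    def undecided(self):
        """Names of the choices the agent would otherwise make for you."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Example values only; fill in your own decisions.
brief = RegressionBrief(
    language="R",
    package="fixest",
    dependent_variable="sustained",
    independent_variable="officer_filed",
    controls_and_fixed_effects="beat fixed effects",
    standard_errors="clustered at beat",
    model="OLS (linear probability model)",
)
print(brief.undecided())
```

Whatever shows up in `undecided()` is a choice to settle in the interview before any code runs.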

Build the plan with the agent

Three steps, in order. Once you and the agent agree on the plan, you have something concrete to check execution against.

01 Talk it through

Tell the agent what you want. Let it ask. Answer. Don't let it skip ahead to code.

02 Have it write the plan

Numbered steps, one screenful. What it loads, what it computes, what it saves. Concrete enough to disagree with.

03 Review for completeness

Read every step. Push back. Fix it. Once you approve, the agent works inside the plan.

The plan is a contract
It is the agent's way of telling you what it thinks you said. Read it. Tell it exactly what to change. Your job is to make sure it is complete before you sign off.
Without it, the agent negotiates the spec with itself while writing code.
§ 4

Workshop · First regression

Join the data and run the first regression

Workshop instructions
socialscienceai.com/workshop/day-2#first-analysis

Debrief & Questions

  • What worked?
  • Where did you get stuck?
  • What surprised you?
§ 5

Memory

Global rules apply to every Claude Code session. Local rules only apply to the project.

~/.claude/CLAUDE.md · global
project-1/CLAUDE.md · local
Project 1
project-2/CLAUDE.md · local
Project 2
project-3/CLAUDE.md · local
Project 3

When you open the agent inside a project, it reads both files: the global one, then the local one.


The relationship between global and local memory files

~/.claude/CLAUDE.md · global
  • Cluster SE at the unit of treatment assignment
  • For staggered DiD: do not use TWFE, instead use CSDID
  • Round coefficients to 3 decimals
  • *p<.05, **p<.01, ***p<.001
chicago-pd-study/
CLAUDE.md
  • cr_id is the join key
  • cluster at beat
  • sustained is 0/1
  • codebook at data/codebook.md
data/ · code/ · results/
wage-study/
CLAUDE.md
  • person-year is the unit
  • cluster at firm-year
  • balance the panel before estimating
  • Round coefficients to 4 decimals
data/ · code/ · results/
patent-study/
CLAUDE.md
  • dedup self-citations
  • cluster at assignee
  • restrict to utility patents
  • *p<.10, **p<.05, ***p<.01
data/ · code/ · results/

When rules conflict, the closer rule wins

The agent reads global first, then project, then your chat. Each later layer overrides the ones read before it.

in wage-study/ · agent needs to round
~/.claude/CLAUDE.md global
  • Round coefficients to 3 decimals
overridden by
wage-study/CLAUDE.md project
  • Round coefficients to 4 decimals
overridden by
> your message chat
round coefficients to 2 decimals
agent rounds to 2 decimals
in patent-study/ · agent needs sig stars
~/.claude/CLAUDE.md global
  • *p<.05, **p<.01, ***p<.001
overridden by
patent-study/CLAUDE.md project
  • *p<.10, **p<.05, ***p<.01
chat doesn't override stars
> your message chat
print a publication-quality regression table
agent uses *p<.10, **p<.05, ***p<.01 from project
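The two cases above behave like a layered lookup in which the closest layer that mentions a rule wins. The sketch below is only an analogy: an agent reads these files as text rather than resolving keys, and the key names `coef_decimals` and `stars` are made up for illustration.

```python
from collections import ChainMap

# Rules from the slides, keyed by hypothetical names.
global_rules = {"coef_decimals": 3, "stars": "*p<.05, **p<.01, ***p<.001"}
wage_project = {"coef_decimals": 4}
patent_project = {"stars": "*p<.10, **p<.05, ***p<.01"}


def effective_rules(chat, project, global_):
    """Closest layer wins: chat first, then project, then global."""
    return ChainMap(chat, project, global_)


# wage-study/: the chat message asks for 2 decimals.
wage = effective_rules({"coef_decimals": 2}, wage_project, global_rules)
# patent-study/: the chat message says nothing about stars or rounding.
patent = effective_rules({}, patent_project, global_rules)

print(wage["coef_decimals"])    # chat overrides project and global
print(patent["stars"])          # project overrides global
print(patent["coef_decimals"])  # falls through to global
```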

Memory is rented attention, not free storage

Every line in your memory file loads into the agent's context on every session. The longer the file, the less room the agent has to do the work you asked for.

Lean memory file
global 40 · project 25 · the rest is reasoning room for the actual task
Kitchen-sink memory file
global 40 · project 380 · the rest is reasoning room for the actual task

Rule of thumb: keep each file under about 200 lines, and shorter is better. If it grows past that, move the long parts into a linked file or a skill.
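A quick way to apply the rule of thumb is to count lines. The sketch below treats the 40 and 380 in the comparison above as line counts, which is an assumption, and works on in-memory text so it runs anywhere; the function name `memory_budget` is hypothetical. In practice you would read your actual memory files from disk instead.

```python
MAX_LINES = 200  # rule of thumb from the slide, not a hard limit


def memory_budget(files):
    """Map each memory file name to (line_count, over_budget)."""
    report = {}
    for name, text in files.items():
        n = len(text.splitlines())
        report[name] = (n, n > MAX_LINES)
    return report


# Stand-ins for a lean global file and a kitchen-sink project file.
files = {
    "global": "\n".join(f"- rule {i}" for i in range(40)),
    "project": "\n".join(f"- rule {i}" for i in range(380)),
}
report = memory_budget(files)
for name, (n, over) in report.items():
    print(name, n, "move long parts to a linked file or skill" if over else "ok")
```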


A lean project memory file, annotated

# project memory file: CPD complaints
## Research question
- Are CPD-officer-filed complaints sustained more often than civilian-filed?
## Background and references (read as needed; don't memorize)
- background/cpd-history.md: short history of the Chicago Police Department
- background/invisible-institute.md: about the data source and how it was assembled
- codebook.md: variable definitions, value codes, units
- notes/meetings/: advisor meeting notes (read most recent first)
## Data
- complaints-complaints.csv: 1 row per complaint. Key: cr_id.
- complaints-accused.csv: 1 row per accused officer per complaint. Key: (cr_id, UID).
- Inner join on cr_id is one-to-many; report the share dropped (no accused).
- Unit of analysis: one row per accused-officer per complaint.
## Construct definitions
- sustained = 1 if final_finding == "SU", else 0.
- officer_filed = 1 if complainant_type == "CPD EMPLOYEE", else 0.
- Drop rows with blank complainant_type (not classifiable); report N before and after.
## Statistical conventions
- Specification: within-beat comparison of officer-filed vs civilian-filed (descriptive; identification deferred to Day 3).
- Fixed effects: beat. Cluster SE: beat.
- LPM is fine on the 0/1 outcome; don't silently switch to logit.
01
Lean
~20 lines.
02
Link out, don't dump
The "Background and references" section names four sources the agent can pull in on demand. Nothing big is embedded here.
03
Project-specific only
Nothing here would apply to a different paper.
04
Yours will differ
In a few minutes you'll interview the agent to write your own version of this.
§ 6

Workshop · Project memory file

Have the agent interview you to create your project memory file

Workshop instructions
socialscienceai.com/workshop/day-2#project-memory

Debrief & Questions

  • What worked?
  • Where did you get stuck?
  • What surprised you?
§ 7

Skills

A skill is a recipe for a task you do over and over.

~/.claude/skills/format-figure/SKILL.md
---
name: format-figure
description: Use when the user asks for a figure, plot, or coefficient plot.
---
# Format a figure
## Theme
- Use Michigan maize (#FFCB05) and blue (#00274C) as the default colors
- Sans-serif, base font size 11
## Save
- Always save both .pdf and .png
- Save to results/figures/
## Axes
- Words, not column names
- Round large axis values: instead of $1,000,000 use $1M
Skills are just Markdown files.
A local CLAUDE.md applies to every session inside one project.

A SKILL.md applies to one specific task type across all projects.
For example
  • Format a regression table
  • Format a figure
  • Write prose in your voice
  • R or Python syntax preferences

A SKILL.md lives in a folder. It can reference helper files.

Inside the folder
format-regression-table/
├─ SKILL.md
└─ example-table.tex
SKILL.md top of file
---
name: format-regression-table
description: Use when the user asks for a publication-style regression table.
---
# Format a regression table
- Round coefficients to 3 decimal places
- Round standard errors to 4 decimal places
- Use example-table.tex as a reference for all tables
1
Name is the identifier
Used as the folder name and how the agent refers to the skill internally. Short, kebab-case.
2
Description is the trigger
The agent reads this on every session. When a request matches, it loads the full body. Lead with "Use when...", keep under ~150 characters.
3
Body is the instructions
Free-form Markdown: how to do the task, what conventions to follow, what to avoid.
4
Helper files travel with it
Reference tables, example output, templates: drop them in the folder and reference them from SKILL.md.
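The conventions above are easy to check mechanically. The sketch below is a minimal, assumption-laden linter: it parses the `---` header with a naive `key: value` split rather than a real YAML parser, and the function names `parse_front_matter` and `lint_skill` are hypothetical, not part of Claude Code or Codex.

```python
import re

SKILL_MD = """---
name: format-regression-table
description: Use when the user asks for a publication-style regression table.
---
# Format a regression table
- Round coefficients to 3 decimal places
"""


def parse_front_matter(text):
    """Return (metadata dict, body) from a '---' delimited header of 'key: value' lines."""
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("missing --- front matter block")
    header, body = match.groups()
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")  # split on the first colon only
        meta[key.strip()] = value.strip()
    return meta, body


def lint_skill(meta):
    """Check the slide's conventions; returns a list of problems (empty if none)."""
    problems = []
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", meta.get("name", "")):
        problems.append("name should be short kebab-case")
    description = meta.get("description", "")
    if not description.startswith("Use when"):
        problems.append('description should lead with "Use when"')
    if len(description) > 150:
        problems.append("description should stay under about 150 characters")
    return problems


meta, body = parse_front_matter(SKILL_MD)
print(meta["name"], lint_skill(meta))
```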

A skill is a recipe you reuse across projects.

Table formatting
table-formatting/
Figure formatting
figure-formatting/
Prose writing voice
prose-voice/
~/.claude/CLAUDE.md · global
project-1/CLAUDE.md · local
Project 1
project-2/CLAUDE.md · local
Project 2
project-3/CLAUDE.md · local
Project 3
“make a figure”
obeys global + project-2 + figure-formatting skill
and every other project and session

How skills affect your context window

Skills load in three tiers. Only the name and description automatically load into the context window.

Tier 1 · always in context
Just the skill's name and description. A line or two. The agent reads this every session to decide whether to reach for the skill.
Tier 2 · loaded when the description matches
The body of SKILL.md: the recipe steps, conventions, package calls.
Tier 3 · loaded only when the body references it
Helper files inside the skill folder: example outputs, templates, reference scripts.

Each skill costs roughly 75 to 150 tokens at Tier 1. Fifty skills therefore cost roughly 3,750 to 7,500 tokens (about 5K if the descriptions stay short), small relative to a 1M context window.
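The Tier 1 arithmetic is simple enough to check directly. The per-skill range and the 1M window are the figures quoted above; the function name `tier1_cost` is hypothetical.

```python
def tier1_cost(n_skills, low=75, high=150):
    """Range of always-loaded tokens for n_skills names and descriptions."""
    return n_skills * low, n_skills * high


low, high = tier1_cost(50)
window = 1_000_000
# Share of the context window used at the top of the range.
print(low, high, f"{high / window:.2%}")
```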


Where skills live on your computer

User
Your home directory
Lives at ~/.claude/skills/ (Claude Code) or ~/.agents/skills/ (Codex CLI). Loads in every project.
today: format-regression-table, format-figure
Project
Inside the project root
Lives at <project>/.claude/skills/ or <project>/.agents/skills/. Loads only in this folder.
e.g. cpd-coding-conventions

You don't have to write all your skills from scratch

Other people have already written skills for many common tasks. Browse and install instead of building.

Anthropic
Official Claude Code skills
A growing collection of first-party skills maintained by Anthropic.
OpenAI
Official Codex skills
Curated skills maintained by OpenAI for the Codex CLI.
Community
Third-party marketplaces
Community-built skills. Vet the source before installing; these are not vendor-curated.
All three sources are just folders of SKILL.md files. The same skill works in either Claude Code or Codex. Drop it in ~/.claude/skills/ or ~/.agents/skills/.
§ 8

Workshop · Making your first skill

Author a table-formatting skill

Workshop instructions
socialscienceai.com/workshop/day-2#table-skill

Debrief & Questions

  • What worked?
  • Where did you get stuck?
  • What surprised you?

Patterns that make a SKILL.md more reliable

A skill body can be more than freeform prose. Three patterns from well-authored skills keep the agent on the rails.

Hard gates

Tell the agent it MUST do X before continuing. Useful for prerequisites the agent likes to skip.

<HARD-GATE>
Do not write code until the user has approved the spec.
</HARD-GATE>
Anti-patterns

A table of thoughts that signal the agent is rationalizing, paired with the reality. Catches common drift.

| Thought | Reality |
|---|---|
| "Just a quick fix" | Use a plan. |
| "Tests too slow" | Run them anyway. |
Process flow

A Graphviz/DOT diagram embedded in the skill. The agent reads the graph and follows the path step by step.

```dot
digraph {
  explore -> design;
  design -> approve;
  approve -> build;
}
```

Browse the Superpowers and Anthropic skills repos to see these patterns in real skills (we install them on Day 3).

§ 9

Figure skill

Figure-skill standing defaults

A figure skill answers the question: what about a figure should be the same every time, and what changes plot to plot?

Per figure · you keep deciding
  • Which variables go on which axis
  • Faceting or grouping for this comparison
  • Geom choice (points, lines, bars, ranges)
  • Palette for this specific comparison
spec it per plot
Standing defaults · encode in the skill
  • Save both .pdf and .png to results/figures/
  • Custom ggplot2 / matplotlib theme
  • Axis labels in words, not variable names
  • Source line baked into the figure (the caption stays in the manuscript)
the skill applies these every time

The figure skill knows your style. You only have to say what's different about this plot.

When a plot still looks wrong, paste a screenshot of it and ask the agent to fix the theme. It sees what you see.
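Standing defaults like these can also live in a small helper that the skill points at. The sketch below is one hypothetical shape for such a helper, using only the standard library: the function names, the label dictionary, and the example column names are assumptions, and the "$1M" style follows the format-figure example earlier in the deck. A real version would hand these paths and labels to ggplot2 or matplotlib.

```python
from pathlib import PurePosixPath

FIGURE_DIR = PurePosixPath("results/figures")

# Hypothetical mapping from column names to reader-facing axis labels.
AXIS_LABELS = {
    "officer_filed": "Filed by a CPD officer",
    "sustained": "Complaint sustained",
}


def output_paths(stem):
    """Both files the skill asks for, in the standing output folder."""
    return [str(FIGURE_DIR / f"{stem}.{ext}") for ext in ("pdf", "png")]


def axis_label(column):
    """Words, not column names; fall back to a tidied column name."""
    return AXIS_LABELS.get(column, column.replace("_", " ").capitalize())


def short_dollars(value):
    """Compact tick text for non-negative amounts, e.g. 1_000_000 -> '$1M'."""
    for size, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if value >= size:
            return f"${value / size:g}{suffix}"
    return f"${value:g}"


print(output_paths("coefplot"))
print(axis_label("officer_filed"), "|", axis_label("incident_date"))
print(short_dollars(1_000_000), short_dollars(2_500_000), short_dollars(500))
```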

§ 10

Workshop · Figure-formatting skill

Author a figure-formatting skill

Workshop instructions
socialscienceai.com/workshop/day-2#figure-skill

Debrief & Questions

  • What worked?
  • Where did you get stuck?
  • What surprised you?
§ 11

Drafting + compile

Drafting prose with the agent

Ghostwriting · don't
  • Ask for 500 words on the topic
  • Accept what comes back
  • Edit at the sentence level
  • Get hedged prose and invented citations you then have to fact-check
Collaboration · do
  • Bring your notes, the table, the figure, and your argument
  • Spec the structure: section headers, what each paragraph claims
  • Ask for paragraphs that fill in around your structure
  • Edit at the paragraph and argument level
§ 12

Workshop · Write and compile

Write and compile your draft

Workshop instructions
socialscienceai.com/workshop/day-2#draft-and-compile

Debrief & Questions

  • What worked?
  • Where did you get stuck?
  • What surprised you?
§ 13

Debrief + Day 3 preview

What's on your machine now

1
A regression you trust
Joined the raw data, set the spec yourself, ran it with the agent inside the lines.
2
Plan mode for analysis
The interview pattern made the spec yours, not the agent's.
3
A project memory file
Standing rules for this dataset. Loads automatically every session in this folder.
4
A table skill
format-regression-table. The next table starts from your format.
5
A figure skill
format-figure. The next plot starts in your theme.
6
A compiled draft
Quarto memo rendered to .docx with the table and figure embedded.

Day 3 · Customize and extend

Subagents and the hostile reviewer
Dispatch a fresh-context agent to critique your own analysis. Cross-check what the main session might have missed.
Plugins and Superpowers
Plugins are bundles of skills, slash commands, and hooks. Install one and get dozens of skills at once. Superpowers is the canonical one.
Spec-driven development
Brainstorm a research idea, write a plan, then let the agent execute against it. The Superpowers workflow.
MCP servers
Give the agent new tools it can reach for. We'll install Playwright (browser automation) and a docs-search MCP.
Author your own plugin
Use the plugin builder to generate a letter-of-recommendation plugin you take home as part of your toolkit.