
Day 1 · Monday, 1 June 2026

Foundations & first workflow

Deliverable
A working agent environment (R, Python, Quarto, packages), the Chicago Police Department (CPD) complaints data downloaded and profiled, two exploratory plots, and your own personalized global memory file.

Slides

Open deck in presentation mode ↗

Agenda

  1. § 1
    Foundations
    Lecture
  2. § 2
    Setup & first conversation

    Phase 1 · Set up your project folder and open the terminal

    1. Open Finder. Click Documents in the sidebar. Right-click in the empty area, choose New Folder, and name it workshop.
    2. Press Cmd+Space, type Terminal, press Enter.
    3. Type cd and a space. Drag the workshop folder onto the Terminal window. Press Enter.

    Phase 2 · Start the agent and run the prompt

    4. Type claude (or codex) and press Enter.
    5. Paste this prompt and approve permissions as the agent works:
      Download complaints-complaints.csv from the GitHub repo invinst/chicago-police-data into data/raw/ in this folder. Confirm it loaded as an actual CSV by reporting the row count and the column list.
    6. Verify that the agent reports 234,971 rows and 19 columns.
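
    If you want to double-check the agent's report yourself, the row and column check can be sketched in a few lines of Python. This is a minimal sketch using only the standard library; the function name profile_csv and the sample column names are made up for illustration and are not taken from the dataset.

```python
import csv
import io


def profile_csv(handle):
    """Return (row_count, column_names) for an open CSV file handle."""
    reader = csv.reader(handle)
    header = next(reader)  # the first record holds the column names
    row_count = sum(1 for _ in reader)  # data rows only, header excluded
    return row_count, header


# Tiny in-memory stand-in for the real file (hypothetical columns).
sample = io.StringIO("cr_id,incident_date\n1,2001-05-03\n2,2002-07-19\n")
rows, columns = profile_csv(sample)
print(rows, len(columns), columns)  # 2 2 ['cr_id', 'incident_date']
```

    On the real file you would pass open("data/raw/complaints-complaints.csv", newline="") instead of the sample. A count far from 234,971 rows or 19 columns suggests the download saved something other than the CSV itself.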

    Phase 3 · Ask the agent about the data

    7. What is the Invisible Institute, and where did this dataset come from?
    8. Is there a data dictionary?
    9. What are the values in the variables, and what do they mean?
    10. Where are there missing values?
    11. What other data is available in this repository?
    12. Ask your own questions about the data.

    Ask in your own words. Follow up if the answer is thin.
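
    To sanity-check the agent's answer to the missing-values question, a per-column count is easy to reproduce. The sketch below is one way to do it with the standard library. Which tokens count as "missing" is an assumption here (blank, NA, NaN, NULL), and the sample columns are illustrative; ask the agent what the real file uses.

```python
import csv
import io

# Assumption: blanks and these tokens mean "missing"; the real file may differ.
MISSING_TOKENS = {"", "NA", "NaN", "NULL"}


def count_missing(handle):
    """Return a dict mapping each column name to its number of missing values."""
    reader = csv.DictReader(handle)
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row.get(name)  # None when a row has too few fields
            if value is None or value.strip() in MISSING_TOKENS:
                missing[name] += 1
    return missing


# Illustrative sample, not real data.
sample = io.StringIO("cr_id,latitude,longitude\n1,41.88,-87.63\n2,,\n3,NA,-87.70\n")
missing = count_missing(sample)
print(missing)  # {'cr_id': 0, 'latitude': 2, 'longitude': 1}
```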

    Done early? Try one of these.
    1. Brainstorm with the agent: what research questions could you answer with these data? Ask it to propose 5 to 10, pick the one that interests you most, and refine it together.
    2. Brainstorm with the agent: what other datasets could you join to these? Have it suggest sources, the likely join key, and one obvious risk with each (coverage, time alignment, identifier mismatch).
    Workshop
  3. § 3
    Tooling
    Lecture
  4. § 4
    Install R, Python, and Quarto

    Phase 1 · Install R, Python, and Quarto

    1. Paste this prompt:
      Check whether R (version 4.4 or later), Python (version 3.12 or later), and Quarto (version 1.4 or later) are already installed on this machine. If any are missing or out of date, install them.
    2. Work with the agent until R, Python, and Quarto are installed. Approve permission prompts as it works. If it says something you don't understand, just ask it.
    3. Paste this prompt to verify:
      Tell me what versions of R, Python, and Quarto are installed. R should be 4.4 or later, Python should be 3.12 or later, and Quarto should be 1.4 or later.
    4. Optional but recommended: install LaTeX so you can use the LaTeX (PDF) output option in later exercises. It's a sizable download (a few hundred MB to a few GB depending on the distribution). Skip if your connection is slow or your laptop is short on disk; you can install it later when an exercise needs it. Paste this prompt to install:
      Install TinyTeX (a lightweight LaTeX distribution) so I can render documents to PDF.
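
    The version check in the verification prompt amounts to running each tool with --version and comparing the result to a minimum. The sketch below shows roughly what that looks like. The command names are assumptions (on some systems the interpreter is python rather than python3), and the helper names are made up for illustration.

```python
import re
import shutil
import subprocess

# Minimum versions from the prompt above; command names are assumptions.
MINIMUMS = {"R": (4, 4), "python3": (3, 12), "quarto": (1, 4)}


def parse_version(text):
    """Extract the first 'major.minor' pair from a version banner, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def check_tool(command, minimum):
    """Return 'ok', 'outdated', or 'missing' for one command-line tool."""
    if shutil.which(command) is None:
        return "missing"
    result = subprocess.run([command, "--version"], capture_output=True, text=True)
    # Some tools print their banner to stderr, so search both streams.
    found = parse_version(result.stdout + result.stderr)
    if found is None or found < minimum:
        return "outdated"
    return "ok"


# Parsing demo on typical banner shapes (illustrative strings, not real output).
print(parse_version("R version 4.4.1 (2024-06-14)"))  # (4, 4)
print(parse_version("Python 3.12.3"))  # (3, 12)
```

    Calling check_tool(name, minimum) for each entry in MINIMUMS gives a quick local report to compare against what the agent tells you.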

    Phase 2 · Install packages

    5. Paste this prompt:
      Check whether the following packages are installed. If any are missing, install them. For R: ggplot2, fixest, modelsummary, data.table. For Python: pandas, matplotlib.
    6. Work with the agent until the packages are installed. Approve permission prompts as it works. If it says something you don't understand, just ask it.
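
    For the Python half of the package list, you can confirm the result without importing anything heavy. This is a small sketch using the standard library's importlib; the function name missing_packages is made up for illustration, and the R packages would need a separate check from within R.

```python
import importlib.util

# The two Python packages named in the prompt above.
REQUIRED = ["pandas", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found by this Python interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_packages(REQUIRED) or "none")
```

    Run it with the same Python the agent installed into. A package can be present for one interpreter and absent for another.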
    Done early? Try this.
    1. Tell the agent about the kind of research you actually do. Walk through your typical workflow: do you write papers in LaTeX, do qualitative data analysis, work with geospatial data, run survey experiments? Ask it to research what other tools and packages would help, with a short recommendation and rationale for each. Pick one and install it.
    Workshop
  5. § 5
    Context management
    Lecture
  6. § 6
    Plan mode
    Lecture
  7. § 7
    Permissions
    Lecture
  8. § 8
    Visualize the data
    1. Paste this prompt:
      Let's make a plan: from complaints-complaints.csv, make two plots and save both as PNG under results/figures/: (1) annual count of complaints using incident_date; (2) a density map of complaints using latitude and longitude. Then briefly tell me whether the patterns are what you'd expect given what you know about the dataset's history and about Chicago.
    2. Work with the agent until both plots are saved. Approve the plan and permission prompts as it works. If it says something you don't understand, just ask it.
    3. Ask the agent to open both PNGs so you can see them.
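
    The first plot in the prompt above reduces to counting rows per year of incident_date. The sketch below shows that step plus a basic bar chart, so you can read whatever code the agent writes with some idea of what to expect. It assumes dates begin with a four-digit year (for example 2001-05-03), which may not match the real file. The function names and sample rows are illustrative, and the plotting function needs matplotlib installed.

```python
import csv
import io
from collections import Counter


def complaints_per_year(handle, date_column="incident_date"):
    """Count rows per calendar year, skipping rows without a usable date."""
    counts = Counter()
    for row in csv.DictReader(handle):
        value = (row.get(date_column) or "").strip()
        # Assumption: dates start with a four-digit year, e.g. 2001-05-03.
        if len(value) >= 4 and value[:4].isdigit():
            counts[int(value[:4])] += 1
    return dict(sorted(counts.items()))


def save_bar_chart(counts, path):
    """Save a bar chart of yearly counts as a PNG (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel("Year of incident")
    ax.set_ylabel("Complaints")
    fig.savefig(path, dpi=150)
    plt.close(fig)


# Illustrative sample, not real data; the last row has no date.
sample = io.StringIO("cr_id,incident_date\n1,2001-05-03\n2,2001-11-20\n3,2002-07-19\n4,\n")
yearly = complaints_per_year(sample)
print(yearly)  # {2001: 2, 2002: 1}
```

    Rows with no parseable date are dropped silently here, so it is worth asking the agent how many rows its version excluded. The density map follows the same pattern with latitude and longitude; matplotlib's hexbin is one option for it.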
    Done early? Try one of these.
    1. If there's anything about the plots you don't like, tell the agent to fix it. Or just try: Make it prettier.
    2. Brainstorm with the agent: what other graphs could you make from this dataset? Get a short list of ideas, pick one or two that interest you, and make them.
    Workshop
  9. § 9
    Slash commands
    Lecture
  10. § 10
    Memory
    Lecture
  11. § 11
    Write your global memory file
    1. Paste this prompt:
      Interview me about a few facts so you can update my Claude Code global memory file at ~/.claude/CLAUDE.md:
      - My name
      - Who I work for (university, government, think tank, company, etc.)
      - My research field
      - The kinds of research methods I use
      
      Ask one short, focused question at a time. One topic per question, no list of possible answers.
      
      When I'm done answering, save the result to ~/.claude/CLAUDE.md:
      - If ~/.claude/CLAUDE.md already exists: read it first, then build an updated version that combines what's already there with my answers. Keep my existing rules where they overlap. Slot the Identity and Field-and-methods information into the right sections.
      - If ~/.claude/CLAUDE.md does not exist: fetch https://socialscienceai.com/materials/CLAUDE.md as a starting template, fill in the Identity section and the Field-and-methods section from my answers, and leave the rest as starting defaults.
      
      Show me the proposed final file content, then save it to ~/.claude/CLAUDE.md. Format everything as durable instructions for future Claude Code sessions across every project.
    2. Answer the agent's questions one at a time.
    3. When the agent shows you the proposed file, read it. Approve the file write when prompted.
    4. Verify the save: ask the agent to show you ~/.claude/CLAUDE.md.

    This file loads automatically in every session you start, in every project.
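
    The merge behavior the prompt asks for (keep what is already in the file, add only what is new) can be pictured with a small sketch. This is one simple reading of that behavior, not how the agent actually edits the file. The function merge_rules and the sample rule text are hypothetical.

```python
def merge_rules(existing_text, new_rules):
    """Append rules that are not already present; never drop existing lines."""
    existing_lines = existing_text.splitlines()
    # Compare case-insensitively and ignore surrounding whitespace.
    seen = {line.strip().lower() for line in existing_lines}
    merged = list(existing_lines)
    for rule in new_rules:
        key = rule.strip().lower()
        if key and key not in seen:
            merged.append(rule)
            seen.add(key)
    return "\n".join(merged) + "\n"


# Hypothetical file content and candidate rules.
current = "# Rules\n- Keep answers short.\n"
proposed = ["- Keep answers short.", "- Ask before writing code."]
print(merge_rules(current, proposed), end="")
```

    The existing rule appears once in the result and the new rule is appended after it. An agent merging real files would also slot content into the right section, which this sketch does not attempt.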

    Done early? Try one of these.
    1. Read through Andrej Karpathy's four rules below. Ask the agent to add any that resonate to ~/.claude/CLAUDE.md (or ~/.codex/AGENTS.md) using the same merge-don't-overwrite rule from above.
      A. Ask, don't assume. If something's unclear, ask before writing a line. No silent guesses about intent, architecture, or requirements.
      B. Simplest solution first. Implement the minimum thing that works. No abstractions you didn't request.
      C. Don't touch unrelated code. If a file isn't part of the current task, leave it.
      D. Flag uncertainty explicitly. If you're not confident, say so before proceeding. Confidence without certainty causes more damage than admitting a gap.
    2. Ask the agent to look back at your conversations today and propose 3 to 5 more rules for your memory file, drawn from things you had to correct or clarify more than once. If you agree with one, ask the agent to add it to ~/.claude/CLAUDE.md (or ~/.codex/AGENTS.md) with the same merge-don't-overwrite rule from above.
    3. Customize how the agent talks to you. Tell it how you want responses: length, whether to ask before writing code, signoff format, how much hedging, whether to push back when it disagrees. Ask it to draft the rule wording and add it to your memory file.
    Workshop
  12. § 12
    Privacy & copyright
    Lecture
  13. § 13
    Failure modes
    Lecture
  14. § 14
    Debrief & Day 2 preview
    Discussion
