WZ — Case · 03
Filed · 2026 / Q2
SF Bay Area · 37.77°N
v.01 · current
Pillar 03 — AI workflow leverage

Engineering work, turned into evidence.

Engineering work often gets reduced to commit counts, visible activity, or who speaks up in meetings. I built an internal system that turns code changes, reviews, and issue discussions into contribution profiles, with a methodology page that makes every judgment traceable and open to challenge.

01 CONTEXT

As the company grew, engineering contribution became harder to see.

GitHub can hand you 16 repositories and tens of thousands of commits. What it will not hand you is shape. The company was growing quickly; work was spread across repos, time zones, and release cycles, and the old memory-based way of knowing who shipped what stopped working.

The issue was not a lack of data. The issue was that the data had not been translated into language leadership could use. Code changes, reviews, and issue discussions were all there, but they were scattered records. They did not explain a person's actual contribution on their own.

Without a more reliable interpretation layer, contribution discussions fell back on surface proxies: who committed more, who seemed more active, who was closer to leadership. The contribution patterns that mattered most stayed off the table.

02 PROBLEM

Raw records are not contribution judgment.

A code platform can show plenty of numbers: repositories, commits, reviews, issue discussions. None of those numbers directly equals contribution. One small fix can matter more than ten simple commits; someone who reviews, designs, or unblocks others can be central without being the loudest author.

The hard problem was translating scattered records into a contribution explanation that could survive questions. Every conclusion needed a route back to evidence. Every label needed an explanation.

So the system could not just be a dashboard. It had to align identities, collect the right records, interpret the work, and expose the judgment process. If any layer was missing, the output would become another table nobody trusted.

03 APPROACH

I split the system into four layers, then made the methodology visible.

1 / Align identities first.

The first step was not reading code. It was making sure each record belonged to the right person. Engineers may use multiple accounts, bot accounts appear in the data, and co-authors can disappear during merges. If identity is wrong, everything downstream is biased.
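A minimal sketch of what identity alignment could look like, not the system's actual code. The roster, the bot list, and every function name here are hypothetical. The only real convention relied on is the `Co-authored-by: Name <email>` commit trailer, which is one way co-authors survive a squash merge.

```python
import re

# Hypothetical roster: canonical person -> every account or email they commit under.
ROSTER = {
    "alice": {"alice-dev", "alice-personal", "alice@example.com"},
    "bob": {"bob", "bob@example.com"},
}

# Hypothetical automation accounts that should never be credited to a person.
BOT_ACCOUNTS = {"dependabot[bot]", "ci-runner"}

# Matches the email in a "Co-authored-by: Name <email>" trailer line.
CO_AUTHOR_RE = re.compile(
    r"^Co-authored-by:\s*.*?<([^>]+)>\s*$", re.IGNORECASE | re.MULTILINE
)


def build_alias_index(roster):
    """Invert the roster so each alias maps to exactly one canonical person."""
    index = {}
    for person, aliases in roster.items():
        for alias in aliases:
            key = alias.lower()
            if key in index and index[key] != person:
                raise ValueError(f"alias {alias!r} is claimed by two people")
            index[key] = person
    return index


def resolve_contributors(author, commit_message, index, bots=BOT_ACCOUNTS):
    """Return (resolved people, unresolved accounts) for one commit."""
    candidates = [author] + CO_AUTHOR_RE.findall(commit_message)
    people, unresolved = set(), set()
    for account in candidates:
        key = account.strip().lower()
        if key in bots:
            continue  # automation is dropped, not attributed
        if key in index:
            people.add(index[key])
        else:
            unresolved.add(account)  # surfaced for manual review, never guessed
    return people, unresolved


index = build_alias_index(ROSTER)
message = (
    "Fix cache eviction\n\n"
    "Co-authored-by: Bob <bob@example.com>\n"
    "Co-authored-by: Eve <eve@example.com>"
)
people, unresolved = resolve_contributors("alice-personal", message, index)
```

Unresolved accounts are returned rather than matched by similarity. Under this sketch's assumptions, a wrong match is worse than a visible gap, because it silently moves credit from one person to another.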

2 / Collect the full record.

The system collects code changes, merged requests, reviews, and issue discussions. That keeps the analysis from becoming "who wrote code" only, and makes review, coordination, repair, and issue-closing work visible too.
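One way to picture "the full record" is a single record type that covers all four sources, so that review and discussion work sits in the same table as authored code. The sketch below assumes a hypothetical `ContributionRecord` shape and made-up sample data. It is an illustration, not the system's schema.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from enum import Enum


class RecordKind(Enum):
    COMMIT = "commit"
    MERGED_REQUEST = "merged_request"
    REVIEW = "review"
    ISSUE_COMMENT = "issue_comment"


@dataclass(frozen=True)
class ContributionRecord:
    record_id: str    # hypothetical stable identifier for the source record
    kind: RecordKind
    person: str       # canonical person, after identity alignment
    repo: str


def coverage_by_person(records):
    """Count records per person and kind so non-authoring work stays visible."""
    coverage = defaultdict(Counter)
    for record in records:
        coverage[record.person][record.kind] += 1
    return {person: dict(counts) for person, counts in coverage.items()}


# Made-up sample: bob never authors a commit but still appears in the data.
records = [
    ContributionRecord("c1", RecordKind.COMMIT, "alice", "api"),
    ContributionRecord("r1", RecordKind.REVIEW, "bob", "api"),
    ContributionRecord("r2", RecordKind.REVIEW, "bob", "web"),
    ContributionRecord("i1", RecordKind.ISSUE_COMMENT, "bob", "web"),
]
coverage = coverage_by_person(records)
```

In a commit-only view, bob would not exist. Here he shows up with two reviews and one issue comment.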

3 / Use the model to interpret the work.

The model does not judge people directly. It reads each change first: whether the work is substantive, what kind of work it is, and how much may be generated or mechanical. Raw content is cached and truncated so reruns do not repeatedly fetch the same records or let one oversized change dominate the pipeline.
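The caching and truncation step can be sketched as follows. This is an assumption-laden illustration: the `RecordCache` class, the on-disk JSON layout, the `MAX_CHARS` value, and the `fetch` callback are all hypothetical, and the real system may cap records by tokens rather than characters.

```python
import hashlib
import json
from pathlib import Path

MAX_CHARS = 12_000  # hypothetical cap; the real limit is not stated in this write-up
TRUNCATION_MARKER = "\n[... truncated ...]"


def truncate(text, max_chars=MAX_CHARS):
    """Cap one record so a single oversized change cannot dominate a run."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + TRUNCATION_MARKER


class RecordCache:
    """File-backed cache keyed by record id, so reruns skip the remote fetch."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, record_id):
        # Hash the id so any characters in it are safe as a file name.
        digest = hashlib.sha256(record_id.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get_or_fetch(self, record_id, fetch):
        """Return cached raw content, calling fetch(record_id) only on a miss."""
        path = self._path(record_id)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))["content"]
        content = fetch(record_id)
        payload = {"id": record_id, "content": content}
        path.write_text(json.dumps(payload), encoding="utf-8")
        return content


def content_for_model(cache, record_id, fetch, max_chars=MAX_CHARS):
    """Raw content is cached in full; truncation applies only to the model input."""
    return truncate(cache.get_or_fetch(record_id, fetch), max_chars)
```

Caching the raw content and truncating afterwards is a design choice in this sketch: the cap can be changed later without refetching anything.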

4 / Generate contribution profiles.

The system then aggregates the evidence by person and produces a contribution profile: what role the person mainly plays, where their work concentrates, and which deliverables best represent their contribution. The model sees organized evidence, not a blank prompt asking it to evaluate a person.
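"Organized evidence" could mean something like the packet below: per-change readings grouped by person and summarized before any profile is written. The `ChangeReading` fields and the work-type labels are hypothetical stand-ins for whatever the interpretation layer actually emits.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeReading:
    record_id: str
    person: str
    work_type: str            # hypothetical labels, e.g. "feature", "fix", "review"
    substantive: bool         # did the change carry real engineering judgment?
    mechanical_share: float   # 0.0 to 1.0, estimated generated or mechanical content


def build_evidence_packets(readings):
    """Group readings by person and summarize only the substantive ones."""
    by_person = defaultdict(list)
    for reading in readings:
        by_person[reading.person].append(reading)

    packets = {}
    for person, items in by_person.items():
        substantive = [r for r in items if r.substantive]
        packets[person] = {
            "person": person,
            "total_changes": len(items),
            "substantive_changes": len(substantive),
            "work_type_counts": dict(Counter(r.work_type for r in substantive)),
            # Ids stay attached so every later claim can point back to records.
            "evidence_ids": sorted(r.record_id for r in substantive),
        }
    return packets


# Made-up sample readings.
readings = [
    ChangeReading("c1", "alice", "feature", True, 0.1),
    ChangeReading("c2", "alice", "fix", True, 0.0),
    ChangeReading("c3", "alice", "chore", False, 0.9),
    ChangeReading("r1", "bob", "review", True, 0.0),
]
packets = build_evidence_packets(readings)
```

The packet, not the raw history, is what a profile prompt would receive under this sketch, which keeps the model's description tied to a bounded, inspectable set of records.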

5 / Make the methodology visible.

I did not want a black box. The system includes a methodology page explaining what each layer does, why it exists, and where errors can enter. If someone disagrees with the result, they can point to the specific step or evidence, not just reject the whole system.
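Traceability can also be checked mechanically. The sketch below assumes each profile claim is stored as a dictionary with hypothetical `text` and `evidence_ids` keys, and flags any claim that cites nothing or cites a record that does not exist.

```python
def untraceable_claims(claims, known_record_ids):
    """Return claims that cite no evidence or cite records that do not exist."""
    problems = []
    for claim in claims:
        cited = claim.get("evidence_ids", [])
        missing = [rid for rid in cited if rid not in known_record_ids]
        if not cited or missing:
            problems.append({"claim": claim["text"], "missing": missing})
    return problems


# Made-up example: the second claim cites nothing, the third cites an unknown record.
known = {"c1", "c2", "r1"}
claims = [
    {"text": "Led the cache rewrite", "evidence_ids": ["c1", "c2"]},
    {"text": "Primary reviewer for the API repo", "evidence_ids": []},
    {"text": "Unblocked the release", "evidence_ids": ["r1", "r9"]},
]
problems = untraceable_claims(claims, known)
```

A check like this would not make a judgment correct, but it does guarantee that anyone disputing a claim has specific records to argue about.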

04 WHAT I BUILT

A record-to-judgment system, not just a dashboard.

[Figure: pipeline diagram. 01 Identity (account ↔ roster) → 02 Delivery records (requests, commits; supplemented by reviews and issues) → 03 Work interpretation (change by change) → 04 Contribution profile (person-level synthesis) → Output: internal portal with methodology page. A cache layer keeps records for reruns; truncation caps long records to control model cost.]
Fig. 03 — From raw engineering records to contribution profiles.
  • Identity map — reconciles code accounts, automation accounts, and the company roster.
  • Contribution record pipeline — collects code changes, merged requests, reviews, and issue discussions instead of only counting commits.
  • Work interpretation layer — classifies the substance and type of each change, including possible generated or mechanical content.
  • Contribution profile layer — aggregates evidence by person into a role description, core contribution, and representative deliverables.
  • Methodology page — explains where the evidence comes from, how judgments are made, and where the system can be wrong.
  • Internal review portal — gives leadership and contributors the same evidence base instead of separate narratives.

05 OUTCOME

The conversation moved from "how much did they commit?" to "what did they contribute?"

16 · Internal repositories analyzed
4 · Layers: identity / records / interpretation / profile, plus methodology
60% · Model-reading cost reduction
1 · Traceable methodology page

The most important outcome was not the existence of a dashboard. It was the change in how people talked about contribution. Leadership could stop asking only how much someone committed and start seeing what problems they solved, where they invested consistently, and which deliverables best represented their work.

The methodology page mattered just as much. Contributors could see why the system described them a certain way and which evidence supported that description. If they disagreed, they could argue with a specific step or source, not dismiss the whole system.

Engineering contribution became shared material: something leadership, the engineering team, and the person being reviewed could discuss from the same evidence base.

06 WHAT IT TAUGHT ME

The hard part is not the pipeline. It is making the judgment withstand questions.

I started by thinking the hard part was implementation: how to collect records, cache them, control model cost, and produce profiles. That was only the first half.

The harder part was methodology. If the system says what role someone plays, where their contribution sits, and why it matters, the judgment needs evidence and it needs to allow disagreement. If the judgment is too hedged, the conclusion has no force. If it is too absolute, people will not trust it.

Measuring engineering contribution is not just pulling the data. It is making every judgment trace back to evidence and survive the first serious question from the person being judged.