PRODUCT THINKING

How I Think Through Product Problems

This section covers how I approach a product problem end to end, from figuring out whether it's worth solving to deciding what changes after it ships. A few of the examples point back to the builds covered above.

Validating demand before writing code

Every product decision starts with a belief about what someone needs, and the first thing I do with that belief is test it against reality before building around it. I think about it in terms of the job someone is hiring a product to do, since what people ask for and what they actually need to get done are often two different things, and the gap between them is usually where the real opportunity sits. This is discovery work, and it happens before a single line of code gets written. What I'm listening for is a specific kind of reaction: whether people recognize the problem immediately, describe it back in their own words without prompting, and respond with something close to relief that someone is finally addressing it.

This is exactly how Down For started. Before touching Supabase or Expo, I asked friends directly about something specific: when someone mentions a place or idea in a group chat, does it actually stick, or does it just disappear once the conversation moves on. Every person I talked to said it disappears, and described that exact frustration in their own words within seconds, often before I'd even finished asking. That kind of immediate, unprompted overlap across different friend groups was the signal I needed. People wanted a way to keep a shared, lasting record of what the group has talked about and wants to do, instead of letting it disappear into chat history, and that alignment told me the problem was real and worth the months of work that followed.

Choosing what to build first, and why

Once a problem feels real, the next question is what the smallest version of a solution looks like that still proves the core idea works, and what gets deliberately left out of that first version. I think of this as the MVP question, but the more useful framing for me is sequencing: what has to exist before anything else can, and what depends on something earlier being proven first. Sorting features this way usually comes down to feasibility and risk. Some pieces are genuinely hard to build well and represent a real investment of time, and building those before confirming the underlying idea works means risking that investment on something that could change entirely once real usage comes in.

Roadmap Signals is a good example of this in practice. Native integrations with tools like Zendesk or Intercom were an obvious feature to want, since they would remove the manual step of exporting and uploading feedback. Building those first, though, would have meant putting significant engineering time into a layer that only matters once the underlying scoring and routing logic actually works. So v1 shipped with a CSV upload step instead, which let the core pipeline get built, tested, and proven quickly, with native integrations sitting clearly defined as the highest-value next step once that foundation held up.

Prioritizing with a weighted scoring model

When everything on a list seems important, the only way I can actually rank it is by scoring each item against the same weighted criteria every time, rather than going with whichever one feels most urgent in the moment. It's the same idea behind frameworks like RICE, which scores every item on the same fixed factors (reach, impact, confidence, and effort) so that items can be compared on equal terms. In my version, each factor gets weighted by how much it should matter to the final decision. I'll usually look at things like how often an issue comes up, how severe it is, which group of users it affects most, how recent the pattern is, and how strong the evidence behind it is. Each of those gets its own score, and the combination gives me a ranking I can explain and revisit later, rather than one based on a feeling that's hard to walk back if it turns out to be wrong.
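As a rough sketch of what this kind of scoring can look like in code: the five criteria follow the list above, but the weights, the 0 to 5 rating scale, the `Item` type, and the sample backlog are all hypothetical, chosen only to illustrate the mechanics.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights for illustration only. They sum to 1.0, so a
# weighted score stays on the same 0-5 scale as the individual ratings.
WEIGHTS: Dict[str, float] = {
    "frequency": 0.30,  # how often the issue comes up
    "severity": 0.25,   # how bad it is when it happens
    "segment": 0.20,    # how much the affected user group matters
    "recency": 0.10,    # how recent the pattern is
    "evidence": 0.15,   # how strong the supporting evidence is
}


@dataclass
class Item:
    name: str
    ratings: Dict[str, float]  # one 0-5 rating per criterion


def weighted_score(item: Item, weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum of rating * weight across every criterion."""
    missing = sorted(set(weights) - set(item.ratings))
    if missing:
        # Refuse to score partially rated items so rankings stay comparable.
        raise ValueError(f"{item.name} is missing ratings for: {', '.join(missing)}")
    return sum(weights[c] * item.ratings[c] for c in weights)


def rank(items: List[Item]) -> List[Item]:
    """Highest weighted score first."""
    return sorted(items, key=weighted_score, reverse=True)


# Made-up backlog entries, not real data from any of the projects.
backlog = [
    Item("Dark mode request",
         {"frequency": 3, "severity": 1, "segment": 2, "recency": 2, "evidence": 3}),
    Item("Export fails on large files",
         {"frequency": 4, "severity": 5, "segment": 3, "recency": 4, "evidence": 4}),
]

for item in rank(backlog):
    print(f"{weighted_score(item):.2f}  {item.name}")
```

Keeping the weights in one place is the point of the design: when a ranking looks wrong, the conversation becomes about whether a specific weight or rating is off, which is easier to revisit than a gut call.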

Job Agent runs on a version of this same principle. Instead of forwarding every job posting it finds straight to an inbox, it scores each one for relevance against a weighted set of factors, so the listings at the top of the daily email digest earned that position for reasons that can be traced and explained. The same evidence-first thinking also applies in reverse, when the evidence points away from something. Down For had a feature called Surprise Me, which randomly suggested a saved idea when a group couldn't decide where to go, and it was genuinely one of the more polished pieces of UI in the app. During testing, five separate users had no idea what the feature was for or why it was there. That reaction, repeated across multiple people, was strong enough evidence on its own, and the feature came out before the app moved toward launch.

Setting the metric before the product ships

Part of scoping any product, for me, is deciding what success actually looks like before getting deep into the build, and being specific enough about it that the answer could genuinely change a roadmap decision later. That usually starts with a single North Star metric, one number that captures whether the product is doing the thing it exists to do, and then a small set of supporting metrics underneath it that explain movement in that number. Those supporting metrics tend to fall into a few familiar buckets: whether people get to the core value quickly (activation), whether they keep using the product once they're in (engagement), and whether they come back (retention). Defining all of this upfront means every feature decision has to answer a follow-up question: what would this actually move, and is that the thing that matters most right now?

For Roadmap Signals, the North Star is the percentage of detected signals that turn into an approved artifact within two weeks, since that's the moment the entire point of the tool, turning feedback into action, actually happens. For PackSmart, it's whether someone who generates a packing list goes on to customize it and then save or export it, because that sequence of actions is what separates a generic baseline from a list someone trusts enough to actually use on a trip.
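To make the Roadmap Signals definition concrete, here is a minimal sketch of how a metric like that could be computed. The `Signal` type and its `detected_at` and `approved_at` fields are hypothetical, not the tool's actual schema, and "within two weeks" is assumed to mean 14 days inclusive.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumption: "within two weeks" means approved no more than 14 days after detection.
WINDOW = timedelta(days=14)


@dataclass
class Signal:
    detected_at: datetime
    approved_at: Optional[datetime]  # None if no artifact has been approved


def north_star(signals: List[Signal], window: timedelta = WINDOW) -> float:
    """Percentage of signals with an artifact approved inside the window."""
    if not signals:
        return 0.0
    converted = sum(
        1
        for s in signals
        if s.approved_at is not None
        and timedelta(0) <= s.approved_at - s.detected_at <= window
    )
    return 100.0 * converted / len(signals)


# Made-up sample data for illustration.
start = datetime(2024, 1, 1)
sample = [
    Signal(start, start + timedelta(days=3)),   # approved quickly: counts
    Signal(start, start + timedelta(days=14)),  # approved on day 14: counts
    Signal(start, start + timedelta(days=20)),  # approved too late: does not count
    Signal(start, None),                        # never approved: does not count
]

print(f"{north_star(sample):.1f}% of signals converted within the window")
```

One choice worth deciding explicitly in a real version is how to treat signals detected less than two weeks ago, since they haven't had the full window yet. This sketch counts them in the denominator, which will understate the rate for recent data.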

Letting real usage decide what changes next

Shipping something is where the real learning starts. This is the build-measure-learn loop in practice: everything leading up to a release is a hypothesis, and the release is how that hypothesis gets tested. Once people are actually using something, what they do tells you a lot more than what they say they'll do, and those two things diverge constantly. Usability sessions are where this shows up most clearly: someone might describe a flow as making sense, then get stuck on the very first step of it a moment later. The space between those two reactions is where the real product work lives, and closing it, one specific adjustment at a time, is what the next iteration is for.

Down For has been through fifteen user testing sessions so far, and the changes that came out of them were rarely big strategic pivots. They were small, specific, and numerous: how occasion tags get introduced during onboarding, how long a crew name is allowed to be, how a "mark as visited" action gets communicated back to the rest of the group. Individually, each of these felt minor. Together, they're a big part of what separates an app that technically works from one that feels like it was actually built for the people using it.