Quote of the week
“The man who moves a mountain begins by carrying away small stones.”
- Confucius
Edition 32 - August 10, 2025
OpenAI has officially rolled out its next flagship model, GPT-5, which early testers describe as a competent and reliable upgrade rather than a dramatic leap forward. GPT-5 in ChatGPT uses a hybrid approach — switching between a fast default model, a deeper reasoning model, and a real-time router that decides which one to use based on the type and complexity of the conversation. The API version comes in three sizes — regular, mini, and nano — each with four reasoning levels, including a new “minimal” mode for faster responses. It supports large input and output limits (272k and 128k tokens respectively), handles both text and image inputs, and has competitive pricing compared to other top-tier models.
OpenAI positions GPT-5 as a direct replacement for most of its existing lineup, from GPT-4o to o3, with only audio and image generation left to other models in its catalog. Pricing undercuts most competitors — $1.25/million tokens for input and $10/million for output for the main model — with significant discounts for cached tokens. This pricing structure makes GPT-5 a strong choice for applications with frequent context replay, like chat interfaces. The company also highlights improvements in hallucination reduction, sycophancy avoidance, and a new “safe-completions” approach that aims to offer moderated answers rather than outright refusals.
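To make the pricing concrete, here is a rough cost sketch using only the two prices quoted above for the main model. The function name and the `cached_discount` parameter are my own, for illustration: no cached-token rate is given here, so the discount is left as a placeholder you would fill in from OpenAI's current price list.

```python
# Prices quoted above for the main GPT-5 model, in USD per million tokens.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def estimate_cost(input_tokens, output_tokens, cached_tokens=0, cached_discount=0.0):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens served from cache.
    cached_discount is a fraction between 0 and 1. It is an ASSUMPTION:
    the real cached-token rate is not stated in this newsletter.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_discount must be between 0 and 1")

    fresh_tokens = input_tokens - cached_tokens
    cached_price = INPUT_PRICE_PER_M * (1.0 - cached_discount)
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_tokens * cached_price
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


# A chat turn that replays 100k tokens of context and writes a 2k-token reply.
no_cache = estimate_cost(100_000, 2_000)
# Same turn with 80k tokens cached, at a purely hypothetical 50% discount.
with_cache = estimate_cost(100_000, 2_000, cached_tokens=80_000, cached_discount=0.5)
print(f"without cache: ${no_cache:.3f}, with cache: ${with_cache:.3f}")
```

The point of the comparison is that in a long chat most of each request is replayed context, so whatever the cached rate turns out to be, it applies to the bulk of the input bill.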
Security-wise, GPT-5 shows progress but not perfection. A red-team assessment found a 56.8% success rate for prompt injection attacks after ten attempts — better than other models tested, but still a reminder that this remains an unsolved problem. On the developer side, GPT-5 introduces new API capabilities, including access to summarized “thinking traces” and adjustable reasoning effort to balance speed and depth. It’s clear OpenAI is aiming to make the model versatile for both technical and non-technical audiences.
From my perspective, GPT-5 feels smarter and noticeably faster, though the performance gains over top competitors like Grok 4 Heavy are slimmer than I’d hoped. I haven’t fully explored the API yet, but the new features are intriguing. The automatic model selection in ChatGPT is great when it works, though not flawless. What’s clear is that the big, jaw-dropping leaps in LLM capability might be behind us — future improvements could be more incremental than revolutionary.
Alongside GPT-5, OpenAI also released open-weight models, continuing a trend toward making powerful LLMs available outside their own hosted platforms. While these open-weight releases don’t match GPT-5 in capability, they give developers and researchers more control and flexibility, enabling custom deployments, fine-tuning, and use cases where hosting on OpenAI’s infrastructure isn’t possible or desirable. It’s a small but important move toward bridging the gap between closed, state-of-the-art systems and the growing open-source AI ecosystem.
In this excellent piece, the author takes aim at two types of people that make startup life harder than it needs to be: the “get rich quick” AI influencer who claims you can just prompt your way to a profitable startup, and the academic who’s never built anything but insists their overly complex frameworks are the key to finding ideas. Both approaches skip the messy, uncomfortable reality of discovering something people truly want.
Instead, the article breaks it down to three core questions: What are we looking for? Where do we find it? How do we know we’re right? Most of the usual advice (look for “pain points,” “market need,” or “underserved niches”) is vague at best. The truth is, most people have other priorities, and even when they have a problem your product could solve, they’re often fine with “good enough.” A good idea comes from identifying someone with an unavoidable priority and no workable options. If that’s the case, it would feel almost weird for them not to buy your product.
Which raises an interesting question: are there unlimited problems left to solve in the world, or is there a finite set of unavoidable priorities worth building for? If new problems emerge only with shifts in technology, culture, and environment, then perhaps the hunt for startup ideas is less about chasing “every problem” and more about catching the rare ones where timing, need, and lack of solutions all align perfectly. In a future world with unlimited software, what are the boundary conditions of problem solving for humans?
In this interview, Benchmark co-founder Andy Rachleff explains the one idea he carried from venture capital into building Wealthfront: focus on slugging percentage, not batting average. In baseball terms, batting average measures how often you get a hit, while slugging percentage measures how much impact each hit has. For Andy, the goal wasn’t to succeed most of the time — it was to take enough big swings that, when one landed, the payoff was huge. At Wealthfront, he encouraged his team to try many products, accept that most would fail, and aim for the rare ideas that could succeed in a big way.
If you’re not into baseball, imagine a school quiz where each question is worth a different number of points. Batting average is the percentage of questions you get right. Slugging percentage is your average score per question asked, so one high-value correct answer can outweigh several low-value ones. In business, batting average is about playing it safe and racking up small wins, while slugging percentage is about aiming for bigger wins that carry more reward.
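For anyone who prefers numbers, here is a small sketch of the two statistics using their standard baseball definitions: batting average is hits divided by at-bats, and slugging percentage is total bases divided by at-bats. The two players and their results are made up for illustration.

```python
def batting_average(outcomes):
    """Share of at-bats that produced a hit of any size."""
    hits = sum(1 for bases in outcomes if bases > 0)
    return hits / len(outcomes)


def slugging_percentage(outcomes):
    """Average number of bases earned per at-bat."""
    return sum(outcomes) / len(outcomes)


# Each entry is the bases earned in one at-bat:
# 0 = out, 1 = single, 2 = double, 3 = triple, 4 = home run.
steady_hitter = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # four singles in ten at-bats
big_swinger = [4, 4, 0, 0, 0, 0, 0, 0, 0, 0]    # two home runs in ten at-bats

for name, outcomes in [("steady", steady_hitter), ("swinger", big_swinger)]:
    avg = batting_average(outcomes)
    slg = slugging_percentage(outcomes)
    print(f"{name}: AVG {avg:.3f}, SLG {slg:.3f}")
```

The steady hitter gets on base twice as often, yet the big swinger produces twice the total bases. That gap is the trade-off Rachleff is describing.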
While this mindset works well for VCs with large amounts of money to invest, I think it’s just as useful in everyday business. Too often, big ideas never make it to decision makers because someone assumes they’re “too risky” (I’ve been guilty of this). Other times, leadership turns them down to avoid personal risk if something fails. But in today’s AI-driven world, testing ideas has never been cheaper or easier. If you’ve been sitting on a bold idea, take the swing. The worst that can happen is nothing — and the best could change everything.
My Theory of Misaligned Incentives proposes that the root cause of nearly all government dysfunction — whether in policymaking, administration, or service delivery — stems from a structural gap between the objectives of government actors and the interests of the citizens they serve. In this view, failures are not merely the result of poor competence, corruption, or lack of resources, but are predictable outcomes of incentive structures that reward behavior disconnected from desired public outcomes.
In democratic systems, elected officials often operate on short-term cycles dictated by re-election pressures. This naturally favors policies that produce visible benefits within those cycles, even if they create long-term costs. Bureaucratic agencies, meanwhile, are often rewarded for budget growth, complexity, or procedural compliance — metrics that are tangential, and sometimes contrary, to actual problem-solving efficiency. In both cases, the incentive is to optimize for personal or institutional security and growth, rather than to maximize public benefit.
The theory further posits that once misalignment becomes entrenched, it creates self-reinforcing feedback loops. Special interest groups, lobbyists, and entrenched bureaucrats adapt their strategies to exploit the incentive structures as they exist, making reform increasingly difficult. Over time, public trust erodes, not solely because of individual bad actors, but because the system reliably produces outcomes that feel irrational to the people it serves. The implication is that solving government dysfunction requires not just better leaders or more funding, but the deliberate redesign of incentives to ensure alignment between the success of public servants and the well-being of the public.
I hope to one day address this.
AI-generated video ads are quickly becoming a competitive advantage for brands, thanks to tools like Google’s Veo3. Marketing expert Dan Birdwhistell — who’s managed over $800M in ad spend for companies like Shopify, Coinbase, and Lyft — has developed a playbook for using these tools effectively. His approach, shared in detail in this article, covers not just prompts and scripts but also strategy, workflow, and lessons learned from hundreds of hours of testing. Veo3 can generate high-definition, photorealistic video with synchronized dialogue and sound effects — all from text or image prompts — and Birdwhistell has been using it to double conversion rates compared to human-produced ads on Meta and Reddit.
One of his core principles is leaning into specificity. AI tools can render settings, characters, and props with shocking accuracy if you clearly describe them, but brevity matters — too many words can confuse the model. Birdwhistell’s SCOA framework (Setting, Characters, Objects, Action cues) ensures every detail adds believability without cluttering the prompt. He also recommends starting with a UGC-style vibe instead of polished studio production. Viewers on social feeds respond better to casual, authentic-looking content — think handheld cameras, direct-to-audience dialogue, and minimal editing.
For storytelling, Birdwhistell advises structuring ad dialogue like sketch comedy: open with a grounded setting, set up tension or misdirection, then deliver a quick, surprising punchline. He also uses AI’s strength in visual effects for “magic realism” transformations — turning an ordinary setting into something aspirational mid-scene — to grab attention and illustrate product value. Both techniques work because they create memorable, shareable moments that feel like entertainment, not ads.
Execution matters as much as creativity. Birdwhistell runs a looped production workflow: brainstorming and scripting first, prototyping in Veo2 (unlimited use, no audio) to refine visuals, then moving to Veo3 for polished output. Since Veo3 limits daily generations, iteration is key — expect to run 3–6 variations before getting one great result. Exporting in multiple aspect ratios (16:9 and 1:1 work best right now) ensures reach across platforms.
Looking ahead, AI video will only get better at realism, product integration, and real-time data updates. For now, the big takeaway is this: specificity + authenticity + fast iteration wins. With tools like Veo3, even small teams can produce creative, high-performing ads that once required big budgets and specialized crews.
Australian and U.S. researchers have made a breakthrough in understanding stuttering, identifying 48 genes linked to the condition through a genetic analysis of one million people. Affecting 400 million individuals worldwide, stuttering often develops soon after children begin speaking, and early intervention can be key to preventing it from becoming lifelong. The findings open the door to pre-verbal diagnosis, allowing at-risk children to receive treatment before speech patterns are firmly established. Researchers also discovered 57 additional genomic “hot spots,” offering new paths for future study and potential treatments.
In Pittsburgh, a quick-thinking man and a determined pit bull became unlikely heroes after saving two unconscious people. While playing with his own dog in a park, Gary Thynes encountered the leashed but ownerless pit bull, which persistently led him down a secluded path to a tent encampment. There, Thynes found a man and woman unresponsive and immediately called 911, getting them to the hospital in time. Thynes, 16 months sober from heroin addiction, credited his recovery with sharpening the instincts that told him the dog needed help. After learning one of the rescued individuals wanted to meet him, Thynes agreed, and he is fostering their loyal dog until they can be reunited.
Russia has warmly welcomed a U.S.-brokered draft peace agreement between Armenia and Azerbaijan, which aims to finally put an end to decades of conflict in the region. The deal includes the creation of a strategic “Trump Route” transit corridor designed to improve Azerbaijan’s access to European markets. Russian approval of the plan injects hope for increased stability in the South Caucasus.
A Japanese company, Lib Work, has unveiled its first fully 3D-printed home made primarily from local soil, marking a major leap in sustainable architecture. Completed in Kumamoto on July 22, the innovative “Lib Earth House Model B” uses no cement, instead relying on natural materials that cut CO₂ emissions while improving structural strength fivefold over earlier designs. Equipped with smart sensors to monitor wall health, remote-controlled climate and lighting systems, solar power, and battery storage for off-grid living, the home blends tradition with cutting-edge technology. The project showcases a future of zero-waste, recyclable housing that is both environmentally friendly and self-sufficient.
Enjoying The Hillsberg Report? Share it with friends who might find it valuable!