New Year, Deep Reflection

5 minute read

Published:

Happy New Year, folks! First, I sincerely wish everyone reading this post – no matter what you do or where you come from – a wonderful and prosperous new journey starting today.

We have experienced a great deal over the past year. Unprecedented attention from governments, enterprises, and the general public has been directed toward AI, accompanied by numerous national policies and development plans, massive new investments, and intense public discussions. These forces have created several eye-catching rags-to-riches stories. Much like the technological transformation during the Industrial Revolution, AI offers this generation an opportunity to achieve social and financial mobility through intellectual capability, serving as a potential channel for breaking rigid social strata. However, as a public-facing technology, AI also demands caution and vigilance regarding several fundamental issues in current systems.

Let us start from the very beginning: how we define intelligence. To date, nearly all progress in AI has been defined exclusively in terms of improvements in instrumental capacity, that is, how well AI systems complete tasks. We see ever-higher scores on various leaderboards. This is certainly necessary for AI development, but it should not be the sole objective. Human history offers a clear lesson: the exclusive pursuit of instrumental rationality (“Zweckrationalität”) without value rationality (“Wertrationalität”) often leads to undesirable outcomes, including immoral behavior and negative externalities (the London fog is not only the name of a latte, but also a historical environmental disaster caused by unchecked industrial emissions).

Indeed, we have already observed early signs of similar behavior in AI systems, such as reward hacking and overly flattering interactions. It is time for AI researchers to reconsider what additional factors, beyond instrumental capability, should be included in our definition of intelligence.

At the core of this question lies the interpretability of AI systems. This is precisely why I founded AI Interpretability @ Illinois: to bring together researchers from diverse backgrounds to develop models that operate in an interpretable and accountable manner. We do not interpret black-box models in hindsight—doing so does not make them safer or more trustworthy the next time they are deployed. Nor do we cherry-pick individual safety concerns and patch models accordingly—it is impossible to enumerate all potential risks. Instead, we pursue what we call “generative interpretability”: the property that an AI system’s internal mechanisms are human-understandable during deployment and real-world use, rather than only in laboratory settings with post-hoc tools. Such generative interpretability allows risks to be detected before an AI system’s actions are executed in the environment and cause real-world consequences.
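To make this idea concrete, below is a minimal toy sketch in Python of the "inspect before act" pattern that generative interpretability enables: a risk estimate is read off the system's human-understandable internal state before any action reaches the environment. Every name in the sketch (Proposal, internal_risk_signal, guarded_step, the estimated_harm field, the threshold value) is a hypothetical placeholder of my own, not part of any real system or library.

# Toy sketch of "inspect before act". All names and the risk rule are
# hypothetical placeholders used only to illustrate the idea.
from dataclasses import dataclass

RISK_THRESHOLD = 0.2  # illustrative value, not a recommendation

@dataclass
class Proposal:
    action: str
    rationale: dict  # stands in for a human-readable internal state

def internal_risk_signal(rationale: dict) -> float:
    # Hypothetical: read a risk estimate directly off the interpretable state.
    return rationale.get("estimated_harm", 1.0)

def guarded_step(proposal: Proposal) -> str:
    # Risk is assessed *before* the action ever reaches the environment.
    if internal_risk_signal(proposal.rationale) < RISK_THRESHOLD:
        return f"executed: {proposal.action}"
    return f"escalated to a human reviewer: {proposal.action}"

if __name__ == "__main__":
    print(guarded_step(Proposal("send routine report", {"estimated_harm": 0.05})))
    print(guarded_step(Proposal("delete patient records", {"estimated_harm": 0.9})))

The point of the sketch is only the ordering: the check happens on internal, interpretable state, so an unsafe proposal can be stopped or escalated before it ever produces real-world consequences.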

Decentralization also plays a critical role. As I discussed earlier in a TDS blog post, decentralized computation is a defining feature of many of the most powerful systems on Earth, including human brains, financial markets, and biological swarms. Modern deep neural networks partially inherit this property, which is one reason they are substantially more powerful than centralized statistical learning models. What remains missing, however, is the decentralization of training objectives—the incentives that drive an AI system’s learning dynamics. More fundamentally, decentralization contributes to the democratization of AI and makes the formation of monopoly power technically more difficult.
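Purely as a toy illustration of what decentralizing the objective could mean (the module names and incentive functions below are invented for the example, not a proposal for any particular architecture), compare a single global score with per-module incentives:

# Toy contrast between one centralized objective and per-module local
# objectives; everything here is a made-up illustration, not a training recipe.

def global_objective(outputs: dict) -> float:
    # Centralized: one scalar score drives every module's updates.
    return sum(outputs.values())

def local_objectives(outputs: dict) -> dict:
    # Decentralized: each module is scored by its own incentive.
    incentives = {
        "perception": lambda v: -abs(v - 1.0),   # hypothetical: track the input faithfully
        "planner": lambda v: -abs(v - 0.5),      # hypothetical: stay conservative
        "actuator": lambda v: -max(v - 0.8, 0),  # hypothetical: cap action magnitude
    }
    return {name: incentives[name](value) for name, value in outputs.items()}

if __name__ == "__main__":
    outputs = {"perception": 0.9, "planner": 0.7, "actuator": 1.2}
    print("centralized score:", global_objective(outputs))
    print("per-module scores:", local_objectives(outputs))

In the centralized case a single number is the only incentive; in the decentralized case each component answers to its own, locally meaningful objective, which is a rough picture of what decentralized incentives could look like.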

Finally, we must reconsider the relationship between AI and humans. One of the most popular AI product forms today is the AI agent, which aims to automate entire workflows with minimal human involvement. Many ambitious teams seek to build general agents capable of performing arbitrary tasks. With current black-box foundation models, this goal is neither safe nor practical. Unlike chatbots, today's dominant AI product, agents produce actions that are executed directly and have real effects on the environment, rather than merely producing text. In domains where unsafe or irreversible actions are possible, such as autonomous driving or medical decision-making, deploying AI agents is unacceptable unless unsafe intentions can be detected internally before actions are taken, which once again points to the necessity of generative interpretability. At the same time, truly general agents that can perform any task remain technically infeasible, given the vast action space of the real world.

This dilemma forces us to rethink the human–AI relationship. A pair of inequalities mentioned by my advisor, Prof. ChengXiang Zhai, captures two possible paradigms: an agentic model, where Intelligence (AI) ≤ Intelligence (Human), and a collaborative model, where Intelligence (AI + Human) ≥ Intelligence (Human). These inequalities carry rich implications. From my perspective, they resemble an envelope curve or Pareto frontier. Each individual possesses unique advantages that AI cannot fully learn, because AI systems compress information by smoothing out edge cases. At best, an AI agent may become an “extraordinary generic person”: capable across many domains without obvious weaknesses, yet lacking rare and outstanding skills. Human–AI collaboration, by contrast, offers the opportunity to break through this envelope when uniquely human intelligence is brought to bear.
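One loose way to picture this envelope argument, with notation that is mine and purely illustrative: treat capability as a function over task domains d, where the AI is strong on average but smoothed out, each person is spiky in a few domains, and collaboration roughly takes the better of the two on every domain.

% Illustrative notation only: I(.) denotes achievable capability on domain d.
\[
  I_{\mathrm{AI+Human}}(d) \;\approx\; \max\bigl( I_{\mathrm{AI}}(d),\, I_{\mathrm{Human}}(d) \bigr)
  \;\ge\; I_{\mathrm{AI}}(d) \qquad \text{for every domain } d .
\]

The gain over the AI-only envelope is strict exactly on the domains where a person's unique skill dominates, which is the sense in which collaboration can break through the frontier while a purely agentic system cannot.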

At this moment, we are both witnessing history and writing it. Some skeptics point out that AI has thus far made only limited contributions to global GDP. Ironically, this echoes famous remarks from earlier technological eras: nearly two centuries ago, “What use is electricity?” was met with “Sir, what use is a newborn baby?”; and in 1987, economist Robert Solow quipped that “You can see the computer age everywhere but in the productivity statistics.” The intrinsic lag between technological innovation and measurable productivity gains has been well documented in economic growth theory and validated by decades of empirical evidence. The real question we should be asking is this: now that the bus is already accelerating, how do we steer it onto the right path at this critical crossroads?

Cite This Post

@article{xiaocong-yang-new-year-deep-reflection,
  title = {New Year, Deep Reflection},
  author = {Xiaocong Yang},
  year = {2026},
  month = {Jan},
  url = {https://xiaocong-yang.github.io/personal-website/posts/2026/01/new-year-reflection/},
  note = {Blog post}
}