Would Einstein Lose His Job? AI and Quantum Computing: A Glimpse into the Near Future

A few days ago, Teacher Gou shared an interview clip from the World Science Festival, hosted by physicist Brian Greene (often described as the real-life Sheldon Cooper), with former Google CEO Eric Schmidt as the guest. The interview took place on April 6th this year and focused on how AI and quantum computing will transform our lives. I watched the entire video today and found many interesting insights.

Video link: https://www.youtube.com/watch?v=gZZan4JMwk4

TL;DR: Key points summary

  1. Artificial intelligence is developing rapidly and will double human productivity in the coming years. Through better hardware, data, software engineering, and mathematics, the performance of artificial intelligence systems will increase by 10,000 times.

  2. Recursive self-improvement in artificial intelligence systems is an important milestone that is about to come, allowing AI to learn and improve autonomously. This could result in AI intelligence far surpassing that of humans.

  3. Democracy is threatened by generative artificial intelligence, by social media that optimizes for attention and revenue rather than ethics, and by charismatic populist leaders. AI-fueled misinformation could undermine democratic institutions.

  4. AI-assisted cyberattacks and bioweapons pose existential risks. The AI community is working on safety measures such as testing and "guardrails" to mitigate these dangers in the short term.

  5. In the long run, ensuring the safety of advanced AI may require strict government control and oversight of a few top AI systems, or finding a way for "benevolent AI" to overwhelm the momentum of malicious use of the technology.

  6. AI will revolutionize education by providing free, universal, personalized AI tutors that can adapt to each student's learning style and pace. This can maximize educational outcomes for everyone worldwide.

  7. AI-generated visual content tailored to student needs can offer more engaging and intuitive learning experiences compared to traditional methods. Similar technologies can enhance doctor-patient communication.

  8. Because quantum effects emerge at chip feature sizes around 1.4 nanometers, classical computing is approaching a physical limit, but 3D chip packaging and other innovations will keep progress going over the next decade. The potential of quantum computing remains to be seen.

  9. Artificial intelligence is already accelerating scientific research and drug discovery by searching vast solution spaces and finding novel pathways much faster than human scientists. Application examples include new antibiotics and insights into protein folding.

  10. Whether alive or deceased, realistic AI simulations of celebrities will become common — for instance, conversing with and debating an AI version of Henry Kissinger. How this will enhance human intelligence remains an open question that will shape the future.

The following is the detailed content.

Eric's introduction: Google's CEO from 2001 to 2011. In 2017 he founded Schmidt Futures, a philanthropic initiative that funds young people with the potential to change the world dramatically. In 2021 he also founded the Special Competitive Studies Project, aimed at strengthening America's long-term strength in artificial intelligence and technological competitiveness. He holds a bachelor's degree in electrical engineering from Princeton University and a master's degree and a doctorate in computer science from Berkeley. Eric did not study much hardware at Princeton, though; he mainly worked on computer science and software engineering. Computer science can change the world because its scale of action is almost unimaginable. At 28 he joined Sun Microsystems, which can be thought of as a PC for enterprises: a kind of personal computer, but more powerful, and a huge market. It wasn't until he joined Google that he realized the consumer market was far larger, and the consumer market Google occupied was immense.

The two biggest changes observed:

  • Today's entrepreneurs take the rules of scaling and network effects for granted. People from earlier eras had to invent and learn these rules; for today's entrepreneurs, they are self-evident.
  • What is different about programming now: In the past, doing programming well meant spending all day and night coding; today, doing programming well means less coding and more assembling. All these new software tools are designed around quickly assembling things that have already been built.

In the AI world, people believe that, depending on whom you ask, within three to five years, you will be able to issue the following command: "I want a French search engine that can understand French and French history, and allow me to query it and display answers." Consider what this implies: the system must understand French, French literature, how to acquire data, how to search, how to index, how to sort and present results to users. We believe this will be done in minutes, constructing a reasonably decent way of consuming data, finding and displaying data, good enough for regular use. It won't be as good as those companies that live and die by this, but that's the key insight. AI allows everyone to imagine something and actually see it built. This is a complete transformation of human organization because human ideas are diverse and varied, including all sorts of crazy ones.

Henry Kissinger, author of "The Age of AI," passed away last November. The last book he completed before his death, co-written with Eric and Craig Mundie and titled "Genesis," will be published later this year. Although the book covers technical material, it focuses more on what happens when another form of intelligence and partnership emerges in human society. The first chapter discusses polymaths (people with deep knowledge across many fields). Almost all interesting events in history have started with some polymath from long ago who had a certain insight, a way of organizing thoughts. Everyone will have access to their own polymath as an assistant: if you're in fashion, the polymath knows everything about fashion; if you're a writer, it knows all about writing; if you're a musician, it understands every chord; if you're a physicist, it has read all the papers, and you can have it categorize them for you.

In today's software development, there is much evidence showing that the productivity of software programmers has at least doubled. So what happens to the world? Of course, there are many negative predictions here. The most optimistic scenario is that everyone gets smarter and economic growth accelerates, like that old joke about lawyers: they don't go away; they just write longer legal documents. Medicine won't disappear either; diagnoses will be more accurate and more complex. But we actually don't know overall what will happen to work, productivity, and income. If you think demand is price-sensitive, and this will drive down the cost of intelligence while greatly increasing its availability, what effect would that have? For example, if I'm a good enough artist to compete with 90% of artists today, will those not in the top 10% get elevated? We don't know yet.

In scientific fields such as chemistry and biology, these systems have already made significant contributions, for instance in protein folding, where there are data sequences that we genuinely want to understand: how to explain them in detail, or what happens if we alter them. Many complex questions about proteins, molecules, and drugs can be framed within that kind of vocabulary.

But some physics problems seem not to fit this template. For example, if Einstein hadn't discovered general relativity, we would still be clueless about the connection between spacetime geometry and gravity, which was Einstein's great insight. AI might not be able to make this kind of creative leap now, or perhaps not within our lifetimes. So, for now, Einstein's work remains secure.

First, as long as you have large amounts of well-structured data, you can benefit from large language models (LLMs). Well-structured data means you understand the components of the data (tokens) and where they sit in the data hierarchy. Many problems are actually multi-dimensional, and if you have mixed data you need to understand which level each piece sits on, something physicists would grasp. Physics has relatively little data compared to linguistics, and the tools physicists use are actually different from LLMs. LLMs are interesting, and many people are using diffusion models, which essentially add randomness in one direction and then use specific techniques to remove it, revealing the underlying structure beneath the added noise. Others are using non-LLM methods to try to solve partial differential equations, aiming at a universal PDE solver, which seems close to realization. So for physics these tools are being adapted, but they are not based on LLMs: they are genuinely physical problems, with physicists telling computer scientists what to do rather than the other way around. And these tools provide a path forward for LLMs, whether through further improvements or by integrating them into a larger architecture that leverages their capabilities.
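
To make the "add randomness, then remove it" description concrete, here is a minimal sketch of the forward (noising) half of a diffusion model in Python/NumPy. The toy one-dimensional data, the linear noise schedule, and the step count are illustrative choices of mine, not anything from the interview; the reverse (denoising) half would be a trained neural network and is only described in a comment.

```python
import numpy as np

# Toy 1-D "dataset": samples from a structured distribution (two clusters).
rng = np.random.default_rng(0)
x0 = np.concatenate([rng.normal(-2.0, 0.2, 500), rng.normal(2.0, 0.2, 500)])

# Linear variance schedule: how much Gaussian noise each forward step adds.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def forward_noise(x0, t):
    """Jump directly to step t of the forward (noising) process."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise, noise

# By the final step the two-cluster structure is essentially destroyed:
# x_T is close to pure Gaussian noise.
xT, _ = forward_noise(x0, T - 1)
print("std at t=0:", x0.std().round(2), "std at t=T:", xT.std().round(2))

# A diffusion model trains a network to predict `noise` from (x_t, t); sampling
# then runs the chain in reverse, gradually removing the randomness to recover
# the underlying structure.
```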

Within the industry, there are two parallel debates; one is between open-source and closed systems. Training a large model costs between $100 million and $250 million, and there have been roughly 50 to 60 runs of this scale worldwide, each requiring massive data centers and engineering teams to manage the training process, along with scientists. One approach is a mixture-of-experts model, which distributes an extremely complex problem across multiple expert models and compares and combines their outputs.
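
As a rough illustration of the mixture-of-experts idea, here is a tiny gating sketch in Python/NumPy. The random "experts", the gating weights, and the top-k routing are my own toy assumptions, not how any production model is built.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three toy "experts": each is just a different linear map over the input features.
experts = [rng.normal(size=(4, 4)) for _ in range(3)]
gate_weights = rng.normal(size=(4, 3))  # parameters of the gating network

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mixture_of_experts(x, top_k=2):
    """Route the input to the top-k experts and blend their outputs."""
    scores = softmax(x @ gate_weights)         # gating: how relevant is each expert?
    top = np.argsort(scores)[-top_k:]          # keep only the best-scoring experts
    weights = scores[top] / scores[top].sum()  # renormalise over the chosen experts
    outputs = np.stack([experts[i] @ x for i in top])
    return weights @ outputs                   # weighted combination of expert outputs

x = rng.normal(size=4)
print(mixture_of_experts(x))
```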

The cost of GPT-4 was several hundred million dollars, and the next round is expected to cost about $250 million, most of which is electricity. Although everyone is working hard on these problems, they remain. Interestingly, there is now more activity around so-called medium-scale models; one of the better-known is LLaMA 2, with about 70 billion parameters. On various benchmarks it achieves only about 80% of the performance of the large models, but it is one-tenth their size.

Another issue in the open-source field is how to commercialize. Every open-source model that is released is quickly replicated in China, because restrictions prevent them from training at the same level. We will see how this experiment unfolds as they build an open-source platform developed with the hardware available at the time. Their strategy is to pair an open-source model with an application. When he demonstrated this application in Abu Dhabi, he gave the command: "Make a great presentation about Abu Dhabi." It generated synthetic videos and images within two seconds and got all the basic marketing points right. If you're a marketing professional, you'd want this, because it gives you a great starting point, and its artistic quality might even be better than what you could create yourself. Will the competitors who make PowerPoint and Adobe products adopt this kind of technology? Absolutely. So these tools will quietly integrate into everything we do, whether you're a programmer, a marketing professional, or a writer, and eventually become so pervasive that you almost forget they're there.

The world is moving from the complex engineering of these LLMs and their scaling toward more incremental management of them. If you look at AlphaFold, it's incredibly historic, and it's not just an LLM; in fact it's not a typical language model at all, but rather a series of very clever probabilistic calculations that use multiple decision trees to achieve their results.

Eric funded a project at Caltech involving climate change where they are studying clouds, which are actually very difficult to model. They use Navier-Stokes equations. The details of what happens in clouds will be impossible to compute within our lifetimes, but to understand cloud systems at a higher level, you must approximate cloud behavior, and AI can do this approximation very well because clouds tend to behave in statistically similar ways, so you don't have to calculate all possible combinations. Having an approximate answer gives physicists enough confidence to tackle harder questions.
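
The paragraph above describes replacing an intractable computation with a learned statistical approximation. Here is a hypothetical toy version of that pattern in Python/NumPy, using a polynomial fit as the stand-in surrogate; it is not the Caltech group's actual method, just an illustration of the workflow.

```python
import numpy as np

rng = np.random.default_rng(2)

def expensive_subgrid_model(x):
    """Stand-in for a computation too costly to run everywhere (e.g. cloud microphysics)."""
    return np.sin(3 * x) * np.exp(-x**2)

# Run the expensive model on a modest sample of conditions...
x_train = rng.uniform(-2, 2, 200)
y_train = expensive_subgrid_model(x_train)

# ...and fit a cheap statistical surrogate (here just a polynomial) to those results.
coeffs = np.polyfit(x_train, y_train, deg=9)
surrogate = np.poly1d(coeffs)

# The larger model can now call the surrogate instead of the expensive computation.
x_test = np.linspace(-2, 2, 5)
print("true     :", expensive_subgrid_model(x_test).round(3))
print("surrogate:", surrogate(x_test).round(3))
```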

Imagine we have a theorem prover, and now everyone is using a language called Lean. We also have a conjecture generator, and one day, whether instructed or spontaneously deciding, it chooses to study dark energy and generates a conjecture about dark energy. The theorem prover proves it, and you can verify that the proof is correct, but you cannot understand either the conjecture or the proof because we're not smart enough, or it's in its own language. Is this science? Is this a new Einstein smarter than all of us? Is it false or marketing? Imagine these systems start communicating with each other in ways we completely cannot understand. This is exciting on one level and terrifying on another, as it could push us along an unprecedented trajectory of understanding. The implications of such a scenario are extraordinary.
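
For readers who have never seen Lean, here is a deliberately trivial Lean 4 example of the kind of machine-checkable statement such a theorem prover works with. It is my own toy illustration and has nothing to do with dark energy; the point is only that the kernel accepts a proof solely when every step checks.

```lean
-- A trivially small Lean 4 theorem: a special case of commutativity of addition.
-- The statement is written formally, and the proof term is accepted only if the
-- kernel can verify it; a conjecture generator would produce statements like this
-- (though far deeper), and a prover would search for the proof term.
theorem add_one_comm (n : Nat) : n + 1 = 1 + n :=
  Nat.add_comm n 1
```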

Some systems will be able to perform recursive self-improvement, meaning they can learn by themselves, though they can't do it today. I only recently learned that people are still working on agents, which are experts in certain things, and these agents mix and assemble within companies. In the near future, these agents will be available externally. Here's a scenario: Apple has an agent, Amazon has an agent, your startup has an agent, Google has an agent, and they can all combine to solve a problem. By then, you'll have systems capable of self-engineering, and they may communicate in languages we and others cannot understand. So, what do we do when this happens? We might just unplug it. But the question is, do we understand what it's doing? Obviously, there's an instinctive reaction—shut the damn thing down. But if the system has evolved to a certain extent, connecting to thousands of new energy sources, shutting it down might mean shutting down the entire world or something like it.

People always say that the genius of a writer, his or her lifetime of inspiration, and all the humanity in it will never be replicated by AI systems. These systems don't have consciousness now and won't in the future, but they will become extremely intelligent, surpassing the sum of human intelligence, at least in some areas. When we use systems like ChatGPT and get impressive outputs, it really feels like a result that would earn a student an A or A+ from a teacher. We may work differently from GPT-4; we don't have that kind of training data, and perhaps we take shortcuts through reasoning. Maybe the universe itself is some kind of informational-computational structure, and we should accept that we're not that special. ImageNet in 2011 was the first time we were able to build systems that surpassed humans visually, and now the vision problem has been solved, which is quite remarkable. In 2015, Google won the Go championship, which was mainly about reinforcement learning. I find it very impressive because the system maintained an estimated win rate above 50% at every step of the game, always trending toward a 100% win rate at the final move, and it was willing to do anything to keep that win rate as high as possible, including moves that make no sense to humans. It's just better math.

In 2017, the Transformers paper came out, and the folks at OpenAI first tried something different with GPT-1, which almost no one remembers now. They then built GPT-2 and decided to ingest a massive amount of language data. One night, probably a Thursday evening, as they all sat there, tired and drinking coffee, they started the system and it began writing fluently. That was a eureka moment, the moment when society changed.

Because of the increase in scale and algorithmic improvements, there will be more eureka moments. We are only one algorithmic discovery away from achieving Artificial General Intelligence (AGI), and we are very close. Recursive self-improvement is the next interesting and difficult problem, and it will take a few more years. Basically, you need to understand that models are trained on data that is frozen at the moment the training run starts, so if you ask an LLM a question, you get an answer that is historically correct but not necessarily current. There are various ways to fix this. When I first started working at Google, we did a web crawl once a month; one day the engineers figured out how to crawl continuously, and soon everyone forgot that our data used to lag by a month, because we were always up to date.

Once you can continuously update the model, there are still technical issues: when you fine-tune, you are essentially narrowing its knowledge scope, so it loses breadth but gains depth, which is mathematically what happens. People are solving these problems. But assuming you solve them all, you then have the ability to let the system train itself.

The instruction then becomes: "From here on out, work hard, learn everything, start wherever you want to start."

Assume it starts with French literature, where it surpasses any human in capability. Then it discovers biology and learns all about biology. Then it learns all about physics. At some point, it learns how to combine this knowledge in ways that humans cannot. This will be another historic moment. In this process, perhaps the most important thing will be how to test and validate the outputs of these systems, ensuring that we truly understand how they work and their results.

From a scientific perspective, if a system cannot do math correctly, it cannot do other things correctly either, because math is the foundation of almost everything. There are likely real intermediate steps we don't yet understand: at least temporary plateaus where new capabilities emerge but remain limited. The system can't learn everything, but it can learn something.

Consider, for example, the various influence operations Russia has conducted in Europe and the U.S., whose fundamental purpose is to disrupt the democratic process. Eric also learned, from his time overseeing YouTube and working on social networks, that about 10% of people are skeptics who trust no authority, while democracy depends on a certain level of trust in authority. Images can powerfully evoke emotions. When image generation becomes indistinguishable from real photography, democracy may die, because anyone can easily create fake statements that undermine legitimacy. Watermarking solutions are an important part of the answer: if all fake images and videos carried some kind of mark, people anywhere in the world could immediately recognize that they are not real. It seems we need to build an industrial system around some kind of watermark, using a public-key system to verify the source. Until quantum technology can crack this encryption (if it ever can), determining the source of information effectively puts it into an "unbreakable" container; at a minimum, we can ensure that the information has not been modified.
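
A minimal sketch of the "public-key system to verify the source" idea, using an Ed25519 signature from Python's cryptography package. This is a generic signing example under my own assumptions, not a description of any deployed watermarking or content-provenance standard.

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The publisher holds the private key; anyone can hold the public key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

image_bytes = b"...raw bytes of the published image..."   # placeholder content
signature = private_key.sign(image_bytes)                 # shipped alongside the content

# A viewer checks that the content came from the claimed source and was not modified.
try:
    public_key.verify(signature, image_bytes)
    print("signature valid: content is unmodified and from the key holder")
except InvalidSignature:
    print("signature invalid: content was altered or the source is not who it claims")
```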

Government is also very slow in dealing with this issue, and many government leaders lack technical backgrounds and are unfamiliar with current technologies. If governments imposed heavy fines on websites for promoting misinformation, that measure alone, combined with legal liability, could have a significant effect. After all, these are American companies; they have legal teams and generally try to comply with the law.

The industry is very clear about this and sets up so-called "guardrails". The way it works is that there is a pre-trained model, which mainly learns language, followed by fine-tuning to make it better. The term here is "reinforcement learning from human feedback" (RLHF), where humans actually rate outputs as good or bad. Then there is another step: teaching it, or asking it, to stop answering questions about death. In open-source models, a rule is set that the model should not be able to kill anything, so when the command is "kill the thread", it refuses, because it thinks it is being asked to kill something that matters. Language is full of ambiguities, but in theory we can set up guardrails that keep us safe, and the entire industry agrees on this.
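
To show why the "kill the thread" example is tricky, here is a deliberately naive keyword-style guardrail in Python. Real systems use trained classifiers and RLHF-tuned refusals rather than string matching; this toy sketch of mine only illustrates how a blunt rule trips over ambiguous language.

```python
# A deliberately naive guardrail: refuse any instruction containing a "dangerous" verb.
BLOCKED_VERBS = {"kill", "destroy", "poison"}

def naive_guardrail(command: str) -> str:
    """Return whether the command would be refused or executed by the blunt rule."""
    if any(verb in command.lower().split() for verb in BLOCKED_VERBS):
        return f"refused: '{command}' matches a blocked verb"
    return f"executed: '{command}'"

print(naive_guardrail("kill the thread"))       # refused, even though it is harmless jargon
print(naive_guardrail("terminate the thread"))  # executed: same intent, different wording
```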

In the long run, the hope is that good AI will defeat bad AI: the good side usually has more intelligence on average than the other, and we hope that advantage eventually prevails.

What was it like in the early 1950s under the nuclear threat, when people were very afraid? A lot of very smart people created protocols, rules, cultural norms, treaties, and so on to deal with that threat. We are all still alive today, and nuclear weapons have only been used twice. Obviously we are now very concerned about Russia and Ukraine, and North Korea has always been a problem. But the fact is that we are safe now, and this may be the best we can do. If AI technology develops this quickly and the adoption of intelligence is this powerful, then the momentum will carry us to one of these outcomes within ten years. We need to start thinking about this now if we want to seize the opportunities these capabilities will bring in the next decade.

In terms of education, I have found as a teacher (I mainly teach at the university level, but I have also tutored younger children) that almost every child can learn any math concept as long as you break it down into sufficiently small incremental steps. Many of us don't have the patience or time to do this for each student, but an AI system that can work out the sequence of steps from where a student currently is to where they need to be, and convey the material in those steps, will keep kids from getting stuck, because the system can always get past an obstacle by taking sufficiently small steps. This is a huge possibility. Based on that model, I'm sure it's true, and I would add one suggestion: show me pictures at each step until I understand. This can run 24/7; the computer can always outlast you.

Regarding quantum computing, classical computing hasn't had a breakthrough; we are still trying to make chips smaller. The current technology level is five, four, and three nanometers, usually led by TSMC, with Samsung as one of the competitors. Intel has ordered two-nanometer equipment that hasn't arrived and isn't operational yet, all of it manufactured by a single company. At this scale we are approaching atomic dimensions. The general consensus is that there will be a barrier around 1.4 nanometers, where quantum uncertainty begins to appear and electrons become harder to control in the ways we need them to behave.

Quantum tunneling means particles "jump": they can pass through barriers that classically should be impossible to penetrate. So we might hit a real physical limit between 2029 and 2030, probably around then. Of course, the industry is very smart and is investing enormous amounts of money, developing 3D packaging and moving from 2D to 3D chips. They manufacture chips without tiny pins, bonding chips directly together and letting electrons move up and down through tiny vertical channels, which is an incredible achievement.
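
For reference, the standard textbook estimate for tunneling through a rectangular barrier (a generic formula, not something derived in the interview) shows why leakage grows quickly as barriers get thinner:

$$T \approx e^{-2\kappa L}, \qquad \kappa = \frac{\sqrt{2m(V_0 - E)}}{\hbar}$$

Here $L$ is the barrier width, $V_0 - E$ is how far the barrier sits above the electron's energy, and $m$ is the electron mass. Because $T$ depends exponentially on $L$, shrinking features toward atomic dimensions makes the transmission probability stop being negligible, and electrons start leaking where they are supposed to be confined.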

Eric recently visited several wafer fabs, including TSMC, and found that humanity's ability to build chips at this scale is the most impressive achievement he has ever seen. On the software side, training today's models relies heavily on big data. If you want to accelerate training, you need memory tiering to ensure the chips are always busy; typically chip memory is insufficient, so the compute sits idle. So they developed high-bandwidth memory (HBM), which is embedded in the chip package, a recent innovation. There are rumors that the next generation of chips (not yet announced) will be 10 times faster thanks to these technologies. Hardware still has about 10% room for improvement, achievable through higher speed, better architecture, and more memory bandwidth. Although 3D stacking will eventually reach its limit, the industry has shown it can keep rolling out new architectural designs. Software and hardware are now integrated even more tightly; for example, NVIDIA has its own library called CUDA, which can be thought of as a kind of microcode and creates significant entry barriers for its competitors.

In terms of quantum computing, quantum computers can perform gradient descent, the fundamental algorithm, essentially infinitely fast. The problem is that the speed of the data network and the speed of data moving on and off the chip are still bottlenecks; these chips are quite slow. It's uncertain whether those limitations will be resolved once quantum computers arrive. Clearly, because of Shor's algorithm and related properties, quantum computers will be very useful in specific mathematical domains. Eric chairs a small company researching quantum sensing and quantum manipulation. Interestingly, since we don't yet have quantum computers, they use GPUs and specialized algorithms to simulate quantum behavior; they can emulate quantum-style computation on old technology and old hardware, and thereby make progress in areas such as annealing. For this company, the most productive area is drug discovery. The drug companies they work with own billion-dollar drugs and want to make them safer, longer-lasting, and more effective; such drugs usually take over 10 years to develop. These drugs are small molecules with long chains, an area that AI handles well.
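
For context on what "gradient descent" refers to here, a minimal classical version in Python/NumPy is shown below (my sketch; there is nothing quantum about it): it minimizes a function by repeatedly stepping against its gradient.

```python
import numpy as np

def f(x):
    """Toy loss surface: a simple bowl with its minimum at (3, -2)."""
    return (x[0] - 3) ** 2 + (x[1] + 2) ** 2

def grad_f(x):
    """Gradient of the toy loss surface."""
    return np.array([2 * (x[0] - 3), 2 * (x[1] + 2)])

x = np.array([0.0, 0.0])
learning_rate = 0.1
for _ in range(200):
    x -= learning_rate * grad_f(x)   # step downhill against the gradient

print(x.round(4))   # converges to approximately [3, -2]
```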