The Evolution of AI: From Stagnation to Integration

Futuristic city landscape symbolizing the advancement of humanity with AI, as LLMs enter another phase of development

Artificial Intelligence (AI) has been at the forefront of technological advancement for years, promising to revolutionize various aspects of our lives. However, the journey of AI development has been far from linear.

Despite my immersion in the topic, I believe we are still far from developing a true self-conscious Artificial General Intelligence (AGI). I also acknowledge that I cannot say for certain whether a breakthrough in AGI might be happening right now.

Moreover, the predictions by numerous high-profile figures, such as the CEO of AWS, that AI will soon replace human jobs have yet to materialize. In my previous article, I explored the basics of AI, large language models (LLMs), and their potential impact on employment.

In this blog post, we will take a broader perspective and provide an update on the current state of AI. This will help you better understand the evolving landscape of AI/LLMs and its implications for the world.

Stage 1: Stagnation and Theoretical Limits

The Plateau in Performance

Climber reaching the peak, symbolizing a plateau in AI progress, with most of the growth already behind us

Recent years have witnessed a surge in hype surrounding AI developments, but was this narrative truly reflective of real exponential growth? The reality is that Google first proposed the transformer architecture in mid-2017, which sparked the development of the first Generative Pre-trained Transformer (GPT) model by OpenAI almost a year later. GPT-3, the model often credited with fueling the AI hype train, was released in mid-2020, but ChatGPT, the chatbot built on that model family, wasn’t made publicly available until the end of November 2022.

Advancements in computational hardware, particularly from NVIDIA, also played a significant role. NVIDIA’s proprietary CUDA software and its 2019 acquisition of Mellanox enabled ultra-fast communication between chips, allowing models to be scaled beyond previous limits. While progress hasn’t entirely stopped, these factors have undoubtedly contributed to the rapid development we’ve seen.

The exponential growth that people witnessed actually began years earlier, taking five to six years to go from the original transformer to ChatGPT. However, we’ve now hit a plateau in AI development, and several factors contribute to this apparent stagnation:

  1. Generalization Challenges: Large generative AI models, trained on vast datasets, often struggle with difficult or underrepresented tasks. While they excel at common scenarios, they falter when faced with niche or complex problems.

  2. Logarithmic Performance Gains: Research has shown a logarithmic relationship between dataset size and model performance. This suggests that simply adding more data yields diminishing returns, leading to a plateau in performance gains (see the sketch after this list).

  3. Uneven Performance Across Classes: AI models tend to perform exceptionally well on overrepresented classes (like cats and dogs) but struggle with underrepresented classes (such as specific plant/animal species or complex medical diagnoses).

  4. Hallucination and Poor Performance: Existing models may generate false information or perform poorly on tasks not well-represented in their training data, highlighting the limitations of current approaches.

  5. Need for New Strategies: To achieve truly general, high-performing AI systems, we may need to move beyond just collecting more data. New machine learning strategies and novel model architectures could be the key to breaking through current limitations.
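
To make the diminishing-returns point concrete, here is a minimal sketch of the power-law shape reported in scaling-law research. The coefficients are invented for illustration only; the point is the shape of the curve, not any real model's numbers.

```python
# A toy illustration (made-up coefficients, not a fitted model) of the
# power-law shape reported in scaling-law research:
#   loss(N) = a * N**(-b) + c, where c is an irreducible loss floor.
# Each tenfold increase in training data buys a smaller loss reduction.

a, b, c = 10.0, 0.1, 1.7  # hypothetical constants for illustration only

def loss(n_tokens: float) -> float:
    """Predicted test loss after training on n_tokens tokens."""
    return a * n_tokens ** (-b) + c

for n in (1e9, 1e10, 1e11, 1e12):
    gain = loss(n) - loss(n * 10)  # improvement from 10x more data
    print(f"{n:.0e} tokens -> loss {loss(n):.3f}, gain from 10x more data: {gain:.3f}")
```

Running this shows each additional order of magnitude of data buying a smaller improvement, which is exactly the flattening curve behind the "plateau" narrative.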

The Efficient Compute Frontier Problem

Symbolic representation of diminishing returns in AI and LLM scaling due to the efficient compute frontier problem

Another aspect of the stagnation stage is what’s known as the “efficient compute frontier problem.” While larger models trained on more data and with more computing power tend to reduce test loss, there seems to be a limit that cannot be crossed simply by scaling up. This presents a challenge: how can we continue to improve AI performance when simply making things bigger no longer yields significant gains?

As models become larger and more complex, computational demands increase exponentially, leading to significant costs in time, energy, and hardware resources. This problem is particularly acute in deep learning, especially for models like LLMs. For the more ambitious reader, this paper explores the topic in depth.
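
As a rough illustration of the frontier idea, the sketch below picks, for each hypothetical compute budget, the best loss achievable across a handful of model sizes. All constants are invented; the only point is that the envelope flattens toward an irreducible floor no matter how much compute is spent.

```python
# Hypothetical sketch of an "efficient compute frontier": for each compute
# budget, take the best test loss achievable across several model sizes.
# Every curve shares the same irreducible loss floor, so the frontier itself
# flattens out as budgets grow. All numbers are invented for illustration.

IRREDUCIBLE = 1.7  # the loss floor that scaling alone cannot cross

def loss(params: float, flops: float) -> float:
    """Toy loss: penalises both too-small models and too little data per parameter."""
    tokens = flops / (6 * params)  # rough rule of thumb: FLOPs ~ 6 * params * tokens
    return IRREDUCIBLE + 5.0 * tokens ** -0.1 + 3.0 * params ** -0.05

model_sizes = [1e8, 1e9, 1e10, 1e11]  # parameter counts (hypothetical)

for flops in (10.0 ** e for e in range(18, 25)):
    frontier = min(loss(p, flops) for p in model_sizes)  # best loss at this budget
    print(f"{flops:.0e} FLOPs -> frontier loss {frontier:.3f}")
```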

Stage 2: Mitigation Strategies

In response to these challenges, the AI community has employed several strategies to push beyond the apparent plateau. To name a few:

  1. Continued Scaling: Despite diminishing returns, some researchers continue to train ever-larger models, hoping to squeeze out marginal improvements.

  2. Mixture of Experts (MoE) Approach: This technique involves training multiple specialized sub-models within a larger model, allowing for more efficient use of parameters and potentially improved performance on diverse tasks.

  3. Novel Architectures: Researchers are exploring alternatives to the dominant transformer architecture that underpins models such as ChatGPT and Llama. Models like Mamba and Jamba attempt to overcome the limitations of traditional approaches by borrowing ideas from recurrent neural networks and state-space models.

  4. RAGs and Agentic Systems: Retrieval-Augmented Generation (RAG) systems and agentic approaches aim to combine the strengths of large language models with external knowledge sources and goal-directed behaviors, potentially overcoming some limitations of standalone models. Agentic systems, in particular, orchestrate a selection of models that cooperate like a team to solve a problem together (a minimal RAG sketch follows this list).
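
To make the RAG pattern concrete, here is a minimal sketch of the retrieve-then-generate loop. The word-overlap retriever and the call_llm stub are placeholders I've introduced for illustration; a real system would use an embedding-based vector store and an actual LLM client.

```python
# Minimal sketch of the Retrieval-Augmented Generation (RAG) pattern.
# The retriever is a toy word-overlap scorer and `call_llm` is a stub;
# in production you would plug in a vector store and a real model client.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (OpenAI, Ollama, etc.)."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"

def rag_answer(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm CET.",
    "Premium customers get a dedicated account manager.",
]
print(rag_answer("How long do I have to return a product?", docs))
```

The design choice is the key point: the model never has to memorize the knowledge base, it only has to reason over whatever the retriever hands it at query time.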

Stage 3: Narrative Change and Silent Revolution

A smartphone with a glowing AI brain icon emerging from the screen, representing the democratization of AI: powerful models becoming accessible on everyday devices and specialized AI agents working together

As the AI landscape evolves, we’re witnessing a shift in both the narrative surrounding AI and the practical approaches to its development:

  1. Resource Concentration: The leaderboard for top-performing AI models is increasingly dominated by organizations with vast resources. The ability to afford massive GPU clusters and pay for the enormous power requirements of training large models has become a significant factor in AI advancement.

  2. Specialization Trend: There’s a growing interest in smaller, specialized models that can match or even outperform the largest models on specific tasks. This trend aligns with the concept of AI agents, where multiple specialized models work together to solve complex problems.

  3. Efficiency Focus: Companies are investing in the development of smaller, highly efficient models like Claude 3 Haiku, OpenAI’s GPT-4o mini, or Llama 3.1 8B. These models aim to optimize for cost while only slightly compromising top-end performance, making AI more accessible and deployable in a wider range of scenarios.

  4. Hardware Integration: Microsoft’s announcement of Copilot+ PCs, which integrate AI capabilities into consumer computers via dedicated Neural Processing Units (NPUs), signals a new era. This move suggests that AI is no longer just a cutting-edge technology but is becoming an integral part of our everyday computing experience.

  5. Democratization of Local LLM Usage: Running LLMs locally is now possible thanks to tools like GPT4All, Ollama, llamafile, and LM Studio. Combined with advanced quantization techniques and optimizations, these tools make it increasingly feasible to run powerful language models on personal computers. If you are curious, follow the link to see how clever optimization can lead to performance improvements of up to 10x when using open-source LLMs without specialized hardware (a minimal example of querying a local model follows this list).
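
As a small illustration of local usage, the sketch below queries a model through Ollama's local HTTP API. It assumes Ollama is installed and running on its default port, and that a model such as llama3.1 has already been pulled; adjust the model name and endpoint to your own setup and check them against the version you have installed.

```python
# Minimal sketch of querying a locally running LLM through Ollama's HTTP API.
# Assumes Ollama is running on its default port (11434) and a model has been
# pulled beforehand, e.g. with `ollama pull llama3.1`. Requires `requests`.
import requests

def ask_local_llm(prompt: str, model: str = "llama3.1") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local_llm("Summarise the idea of running LLMs locally in one sentence."))
```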

Stage 4: Professional Experience and Practical Implementation

A data scientist balancing scales, with a classic computer chip on one side and a glowing AI brain icon on the other, illustrating the need to weigh traditional machine learning approaches against newer AI techniques like LLMs based on project needs

As a data scientist, I have been working with both classical machine learning and more advanced AI techniques, depending on the requirements at hand. These technologies are here to stay, as not all problems can be addressed with LLMs alone.

As someone who leads some LLM initiatives within an organization, I have observed firsthand how these trends are playing out in real-world applications:

  1. Learning Curve: Implementing AI solutions can be challenging. Models often require careful fine-tuning and can produce unexpected results. However, as tools improve and best practices emerge, this process is becoming more streamlined.

  2. Growing Demand: There’s significant interest in leveraging AI across various business functions. From customer service chatbots to intelligent document processing engines, organizations are exploring diverse applications of LLM technology, particularly in areas where free text is utilized.

  3. Cautious Optimism: While enthusiasm for AI is high, most organizations are proceeding with caution. Concerns about data privacy, algorithmic bias, and the reliability of AI-generated content necessitate a measured approach to implementation, and careful governance practices.

  4. Commoditization of AI Services: As AI capabilities, both open-source and proprietary, become more accessible and affordable, we’re seeing a shift towards AI as a commodity. This trend is democratizing access to powerful tools but also intensifying competition in the AI services market.

  5. Practical Challenges: Working with AI systems can be frustrating at times. They require careful “taming” to produce reliable and useful outputs. However, the potential benefits often outweigh these challenges, driving continued adoption and refinement of AI technologies.

Conclusion: The Future of AI

The journey of AI development has been marked by periods of rapid advancement, apparent stagnation, and innovative breakthroughs. While we may have hit certain theoretical limits with current approaches, the field continues to evolve through new strategies, architectures, and applications.

The shift towards more specialized, efficient models and the integration of AI into everyday hardware suggests that we’re entering a new phase of AI development. Rather than chasing ever-larger models, the focus is increasingly on making AI more practical, accessible, and integrated into our daily lives and work processes.

As we move forward, the key to leveraging AI effectively will be to maintain a balanced perspective. We must recognize its current limitations while remaining open to its evolving capabilities. By doing so, we can harness the power of AI to augment human intelligence, streamline processes, and tackle complex challenges in innovative ways.

The AI revolution may not be unfolding exactly as some predicted, but its impact is undeniable and far-reaching. As practitioners and users of this technology, our task is to navigate the hype, embrace the reality, and shape the future of AI in a way that benefits society as a whole. The journey of AI is far from over – in many ways, we’re just beginning to scratch the surface of its true potential.

Are you using any AI tools? If so, how have they impacted your work or daily life? If not, what’s holding you back?
