The Emerging LLM Stack In Investment Funds

An analysis of how businesses are utilizing AI to improve operations and enhance customer experiences.

This article was originally published on Forbes

As a data scientist, I have been working with language model technologies since 2019. In 2021, I founded Coinfeeds, a company that collaborates with investment funds to provide data science and AI solutions. In this role, I have had numerous discussions with technology leaders at these funds, gaining insights into how they leverage and implement AI.

What I’ve learned is that ChatGPT can increase workplace productivity, assisting with drafting emails, summarizing research papers and writing code and documentation. Portfolio managers and analysts, in particular, see significant productivity gains when AI can interact with internal research reports and meeting notes. These use cases, when set up properly, can significantly boost a firm’s overall efficiency.

Adoption still faces barriers and risks, and the greatest benefits for investment firms are unlocked only when AI can access internal documents. A large language model (LLM) stack is emerging to accommodate these new use cases and address the attendant concerns.

Foundation For AI Success

Before examining why to build an LLM stack and the emerging strategies for managing it, let’s look at some of the key factors that determine success:

Data Foundations

A successful AI strategy rests on solid data foundations. This can require transitioning from legacy technology infrastructure developed over decades to modern data warehousing, because pre-trained AI models still need integration with internal data sources to be useful for real business applications.

Legacy systems often trap data in silos, store it in incompatible formats and make it difficult to access and analyze holistically. With modern data warehousing solutions, companies can bring together data from heterogeneous sources, enforce common data standards and create an integrated data platform. This provides a unified view of data for AI algorithms to work with.

Technical Talent

Implementing AI in investment funds is strongly linked to the scale of their technical teams. Engineers, quantitative analysts and data scientists typically spearhead the adoption of AI, identifying innovative applications for the broader organization.

When developing or implementing new AI products, these teams should collaborate with other stakeholders to discover unique uses, such as news summary alerts and automated invoice processing, that serve internal product managers.

Leadership Support

AI adoption is a complex process that requires strong leadership support. For investment firms to see success with AI adoption, they must first manage diverse stakeholder interests and collect potential applications from internal users.

Likewise, it is important to assess a range of technologies and orchestrate pilot projects while listening to and addressing concerns raised by legal, risk and compliance teams.

Key Considerations For AI

As organizations explore the adoption of LLMs like ChatGPT, there are several key considerations to keep in mind regarding their limitations, optimization approaches and provider choices. This section outlines some of the critical factors companies should evaluate.

Current Limitations Of LLMs

Some professionals have pushed back on the recent hype surrounding LLMs. Quantitative researchers, who have used advanced machine learning and data mining for decades, are often the most skeptical of chatbots like ChatGPT; they find that ChatGPT is weak at mathematical reasoning, which is crucial for tasks like bond valuation, option pricing and risk management.
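
To make that limitation concrete, consider the kind of exact arithmetic quants rely on. The short Python sketch below prices a simple annual-coupon bond; the figures are hypothetical, but it is precisely this sort of calculation where deterministic code is trusted and conversational models are not.

```python
# Present value of a fixed-coupon bond: the kind of exact arithmetic
# quants rely on, and where conversational LLMs often make mistakes.

def bond_price(face: float, coupon_rate: float, ytm: float, years: int) -> float:
    """Price a bond paying annual coupons, discounted at yield-to-maturity."""
    coupons = sum(
        face * coupon_rate / (1 + ytm) ** t for t in range(1, years + 1)
    )
    principal = face / (1 + ytm) ** years
    return coupons + principal

# Hypothetical example: $1,000 face, 5% coupon, 4% yield, 10 years.
print(f"{bond_price(1000, 0.05, 0.04, 10):.2f}")  # ≈ 1081.11
```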

When it comes to coding, some programmers find ChatGPT more effective than others do. Concerns also arise over the risk of errors in unverified machine-generated code and over ChatGPT’s limited ability to refactor large legacy codebases.

RAG Vs. Fine-Tuning

Retrieval-augmented generation (RAG) and fine-tuning are two methods for improving language model output. RAG retrieves external information at generation time, while fine-tuning trains a model on a specific dataset for better task performance.

In finance, RAG is often preferable for its immediate benefits, as fine-tuning is typically unnecessary unless a model must handle entirely new kinds of data. Given RAG’s efficiency, it is likely the more appealing option for most funds.
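
To make the RAG pattern concrete, here is a minimal Python sketch. The toy keyword "retriever" stands in for a real vector search, and the model name is illustrative; treat this as a sketch of the pattern, not a production implementation.

```python
# Minimal RAG sketch: retrieve relevant internal passages, then let the
# model answer grounded in them.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Toy document store; in practice these passages come from a vector search.
DOCUMENTS = [
    "Q3 research note: we expect duration risk to rise into year-end.",
    "Meeting notes: the PM favors short-dated credit over equities.",
]

def search_documents(query: str, top_k: int = 2) -> list[str]:
    # Placeholder keyword match; a real system would use embeddings.
    words = query.lower().split()
    hits = [d for d in DOCUMENTS if any(w in d.lower() for w in words)]
    return (hits or DOCUMENTS)[:top_k]

def answer_with_rag(question: str) -> str:
    context = "\n\n".join(search_documents(question))
    response = client.chat.completions.create(
        model="gpt-4o",  # any chat-capable model works here
        messages=[
            {"role": "system",
             "content": "Answer using only this context:\n\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer_with_rag("What is the view on duration risk?"))
```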

OpenAI Vs. Open Source

Sophisticated investment managers can explore open-source and alternative LLM providers, weighing them carefully against OpenAI. It’s important to run experiments in which internal users can choose between multiple LLMs.

These experiments can be set up as blind A/B tests, where internal users ask the same question of two or more models and select the response they prefer. Public examples of this format, such as the LMSYS Chatbot Arena, can be found online.
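
As a rough illustration, a blind A/B harness can be as simple as the Python sketch below, where the stand-in models would be replaced by calls to two real LLM APIs.

```python
# Sketch of a blind A/B comparison: the same prompt goes to two models,
# responses are shown in random order, and the user's pick is recorded.
import random

def blind_ab_test(prompt: str, model_a, model_b) -> str:
    """model_a and model_b are callables mapping a prompt to response text."""
    responses = [("A", model_a(prompt)), ("B", model_b(prompt))]
    random.shuffle(responses)  # hide which model produced which answer
    for i, (_, text) in enumerate(responses, 1):
        print(f"--- Response {i} ---\n{text}\n")
    choice = int(input("Which response do you prefer (1 or 2)? "))
    return responses[choice - 1][0]  # log this vote for later aggregation

# Stand-in models for illustration; in practice these call two LLM APIs.
model_a = lambda p: f"[model A's answer to: {p}]"
model_b = lambda p: f"[model B's answer to: {p}]"
print("Preferred:", blind_ab_test("Summarize today's Fed minutes.", model_a, model_b))
```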

The LLM Stack

Once your team has an understanding of the concepts and considerations, it’s time to start building the stack:

The Basic LLM Tech Stack

The basic LLM setup features an internal ChatGPT-like system with data and IP protection, including a user interface for document uploads.

It’s important to establish a guardrail layer to ensure confidentiality and proper use, along with logging for compliance. Using OpenAI services inside an Azure virtual network is one way to keep data secure, as Microsoft says it will not use customer data for training.
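
As a rough sketch of what such a guardrail-and-logging layer might look like, the Python below wraps an Azure OpenAI deployment. The endpoint, deployment name and blocklist are placeholders, and a real guardrail layer would be considerably more sophisticated.

```python
# Sketch of a guardrail-and-logging layer in front of an Azure OpenAI
# deployment. Endpoint, deployment name and blocklist are placeholders.
import logging
from openai import AzureOpenAI

logging.basicConfig(filename="llm_audit.log", level=logging.INFO)

client = AzureOpenAI(
    azure_endpoint="https://your-resource.openai.azure.com",  # placeholder
    api_version="2024-02-01",
    api_key="...",  # pulled from a secrets manager in practice
)

BLOCKED_TERMS = ["client ssn", "account password"]  # illustrative guardrail

def guarded_chat(user: str, prompt: str) -> str:
    # Refuse and log anything the firm's usage policy forbids.
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        logging.warning("user=%s blocked_prompt=%r", user, prompt)
        return "This request was blocked by the firm's usage policy."
    response = client.chat.completions.create(
        model="gpt-4o-deployment",  # your Azure deployment name
        messages=[{"role": "user", "content": prompt}],
    )
    logging.info("user=%s prompt=%r", user, prompt)  # compliance audit trail
    return response.choices[0].message.content
```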

Connecting LLMs To Internal Data Sources

For investment funds, the true potential of LLMs is unlocked through their integration with internal data sources. This process entails converting unstructured data, like Word documents and PDF files, into a format suitable for a vector database.

LLMs then interact with these vector databases using RAG. While conceptually straightforward, building a robust infrastructure to support this can be challenging.
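
As an illustration of the ingestion side, the sketch below chunks documents and indexes them in a vector database, using Chroma purely as an example; the documents are stand-ins for text extracted from Word and PDF files.

```python
# Sketch of an ingestion pipeline: chunk internal documents and index
# them in a vector database so a RAG system can query them.
import chromadb

client = chromadb.Client()  # in-memory instance; use a persistent client in production
collection = client.create_collection("internal_research")

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; real pipelines split on document structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

# Stand-ins for text extracted from Word documents and PDFs.
documents = {
    "meeting_notes": "The PM flagged rising duration risk and favors short-dated credit.",
    "research_note": "Q3 outlook: we expect spreads to widen modestly into year-end.",
}

for name, text in documents.items():
    pieces = chunk(text)
    collection.add(documents=pieces, ids=[f"{name}-{i}" for i in range(len(pieces))])

# At query time, the top matches become the context handed to the LLM.
results = collection.query(query_texts=["What was said about duration risk?"], n_results=2)
print(results["documents"])
```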

The ideal setup gives the LLM seamless access to diverse, interconnected datasets. However, data is often fragmented across legacy systems that don’t easily link together. Assembling a useful knowledge base from disparate sources requires overcoming considerable technical hurdles, but the payoff for solving these systems-integration problems is a more capable, better-informed LLM.

Application Development

It’s important to go beyond merely integrating existing AI tools and to actively develop downstream applications. These applications are tailored to particular workflows, such as completing forms or executing specific tasks that streamline the investment process.
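
As one illustrative example of such an application, the sketch below uses an LLM to turn a free-text trade note into a structured form; the schema and model name are assumptions, not a prescription.

```python
# Sketch of a downstream application: extracting fields from a free-text
# trade note into a structured form. Schema and model name are illustrative.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def fill_trade_form(note: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # request JSON output
        messages=[
            {"role": "system",
             "content": "Extract ticker, side, quantity and limit_price "
                        "from the note as a JSON object. Use null if absent."},
            {"role": "user", "content": note},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(fill_trade_form("Buy 500 shares of ACME with a limit at 42.50"))
```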

Conclusion

AI undoubtedly has the potential to significantly enhance productivity within organizations. However, it’s crucial to remain aware of its limitations and associated risks. Solid data foundations and investment in technical talent will be vital in getting organizations ready for AI.