Unlocking the value of your data with self-organising AI

Every organisation has data. The biggest challenge organisations face is how to use that data: turning noise into signal. Rather than endlessly debating when an organisation will “truly understand” its data, teams should recognise that there are, and always will be, many ways to infer over that data and extract greater value from it.

One of the key ways we help our customers do this is by applying AI, and especially Gen AI, to extract that greater value from their data.

Self-organising data for financial services

Data exists everywhere, inside and outside of organisations. One of the greatest tasks any data practitioner faces is labelling that data correctly so it can actually be put to use.

At Firemind we’ve worked with several customers to help them on this journey, using Gen AI to build self-organising pipelines for internal and external data, with the outcome of providing directed signals that help the organisation’s employees make better decisions.

Combined with a modernised data pipeline, this gives organisations a continuous flywheel of clustered information, grouped to extract the maximum value from it.

More than embeddings

One of the dangers with any form of AI-driven modernisation is the assumption that “AI will figure it out by itself”.

Let’s run through a scenario. Imagine you have three pieces of data:

1. Cakes under £10
2. Cakes under £50
3. Cakes under £100

Looked at through a purely vector-embeddings lens, the three labels appear highly similar: they’re all talking about cakes, and only a single token differs between them.

We can visualise this using cosine similarity, like below:

Note: I’ve added an additional, unrelated label (“Best season’s potatoes”) as a baseline, because our three cake labels are very, very similar.

We can run this test using Amazon Bedrock on AWS. I used Amazon’s excellent Titan embedding model (amazon.titan-embed-text-v1):

import json

import boto3

# Bedrock runtime client for invoking the embedding model
bedrock = boto3.client("bedrock-runtime")

texts = ["Cakes under £10", "Cakes under £50", "Cakes under £100", "Best season's potatoes"]

vectors = []

for text in texts:
    response = bedrock.invoke_model(
        body=json.dumps({"inputText": text}),
        modelId="amazon.titan-embed-text-v1",
        accept="*/*",
        contentType="application/json",
    )

    # Each response body carries the embedding vector for one label
    vectors.append(json.loads(response.get("body").read().decode("utf-8"))["embedding"])
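To reproduce the similarity figures below, here’s a minimal cosine-similarity sketch over those vectors (it assumes NumPy is available; the pairing shown, every label against the first, is illustrative):

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors over the product of their norms
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare every other label against "Cakes under £10"
for text, vector in zip(texts[1:], vectors[1:]):
    print(f"{text}: {cosine_similarity(vectors[0], vector):.1%}")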

The cosine similarities between labels 1, 2 and 3 come out at over 90% (92.9% and 94.5% respectively). From an analytical point of view we can see that, even with cosine similarity, these categories are super close to a computer brain.

However, contextually we know that although data piece 3 also covers data pieces 1 and 2, if this were a recommendation tool and the user was in the market for “cakes under £10”, showing them cakes over £10 would be a bad experience.

Relying purely on the technology may not be enough, and this is why it’s so important for an organisation to have its approach to its data defined.

Note: “defined” doesn’t mean it can’t change.

The magic of system prompting

Large language models are incredibly powerful in their ability to brute-force a response from whatever input is provided to them, but at that moment all they’re doing is using brute force to compute an answer.

This means that while LLMs are an incredible tool in any organisation’s arsenal, they require investment in the organisation’s people, helping them truly understand how to help this machine reason better over their input.

A key approach is to provide as much grounding as possible, including system prompting. This lets us set the machine up with the best possible information for inferring over the data, including how it should “think” about organising that data into clusters the organisation can use. A minimal sketch of this follows.
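As a sketch of what this can look like on Amazon Bedrock’s Converse API (the model ID, prompt wording, and price-band rule here are illustrative assumptions, not a fixed Firemind recipe):

import boto3

bedrock = boto3.client("bedrock-runtime")

# The system prompt grounds the model before it ever sees the data
system_prompt = (
    "You are a data organiser for a retail catalogue. "
    "Assign every product label to a price band, and never place an item "
    "in a band whose upper limit is below the item's stated price."
)

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    system=[{"text": system_prompt}],
    messages=[{"role": "user", "content": [{"text": "Cakes under £10"}]}],
)

print(response["output"]["message"]["content"][0]["text"])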

And of course there’s more. Before we even have to think about fine-tuning, we should be using “multi-shot” prompting (providing the LLM with a good number of examples of how we want it to reason over and return the data) and supplying documents as context (through attachments or RAG approaches). A sketch of the multi-shot approach is below.
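A minimal multi-shot sketch in the same vein, seeding the conversation with worked examples before the real query (the example labels and the “price_band” output format are again illustrative assumptions):

import boto3

bedrock = boto3.client("bedrock-runtime")

# Worked examples showing the model how we want labels reasoned over and returned
examples = [
    ("Cakes under £10", "price_band: 0-10"),
    ("Cakes under £50", "price_band: 0-50"),
    ("Best season's potatoes", "price_band: unknown"),
]

messages = []
for label, answer in examples:
    messages.append({"role": "user", "content": [{"text": label}]})
    messages.append({"role": "assistant", "content": [{"text": answer}]})

# The real query always goes last, after the worked examples
messages.append({"role": "user", "content": [{"text": "Cakes under £100"}]})

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=messages,
)

print(response["output"]["message"]["content"][0]["text"])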

This grounding is super important in helping that brain think, self-describe, and self-organise data so that it becomes informative signals for the organisation.

Modernising your approach to modernisation

One of Firemind’s greatest advantages is the set of tools, sandboxes, and ramps that we make available to our customers. We help our customers’ teams break down the possibilities of their modernisation strategy into smaller, digestible, and testable sprints, following our approach of “think big, start small, scale fast”.

One such tool is our PULSE suite, which allows organisations to instantly deploy Gen AI tooling to their teams. PULSE works seamlessly to help customer teams lay down their human approach to reasoning over their data, which is then used to update, modernise, and organise their approach to data modernisation.

Partner with Firemind

Firemind’s team of AI and data specialists can help you leverage the transformative power of AI. We work with you to identify the correct AI solutions and AWS Services for your organisational requirements, ensuring ethical implementation and maximising your return on investment.

Get in touch
