Product enrichment, data training and smart automation for Simfoni

At a glance

Simfoni provides spend analytics and automation solutions, using AI to streamline procurement for global enterprises.

Challenge

Simfoni still relied on manual, labour-intensive processes, run on virtual machines, to implement and automate cloud workflows.

Solution

Firemind developed two PoCs to help Simfoni automate tasks and modernise cloud-based data ingestion and categorisation.

Services used
  • Amazon S3
  • AWS Lambda
  • Amazon DynamoDB
  • PostgreSQL
Outcomes
  • Automated, scalable processes replaced slow, manual processes.
  • Substantial cost savings and more time to focus on product development.
Business challenges

Streamlining Simfoni's cloud infrastructure automation

The Simfoni team are no strangers to the cutting-edge world of machine learning and artificial intelligence: processes that help to calculate spend, automate purchases and modernise their customers' procurement journey.

However, Simfoni found themselves still using several manual and laborious internal processes, run on virtual machines, to implement and automate their cloud-enabled workflows. The team wanted to lessen the burden of manual cloud infrastructure changes and modernisation attempts, so they could shift their focus from maintaining the underlying infrastructure to their customers' products and needs.

Solution

The path to innovation

To help Simfoni shift from manual to automated tasks, and to introduce modern, cloud-optimised methods for data ingestion and categorisation, Firemind worked on two proofs of concept (PoCs).

The first was to build a product enrichment machine learning (ML) solution. This ML process would improve not only how Simfoni pulled their data, but also how it was grouped, stored, retrieved and displayed to users in a Business Intelligence (BI) platform.

Simfoni’s customer data would be ingested by SFTP or uploaded directly into an Amazon S3 bucket. From here, the ingested data would be grouped by folder prefix, so that files are batched accurately. These data files then needed to be converted to a file format that was easier to handle and more flexible for future use. Parquet was chosen for the ML enrichment phases of the process.
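As a rough illustration of prefix-based grouping, the sketch below batches S3 object keys by their top-level folder. The key names, the one-folder-per-customer layout and the function name are assumptions made for the example, not details of Simfoni's actual bucket structure.

```python
from collections import defaultdict


def group_keys_by_prefix(keys):
    """Group S3 object keys into batches by their top-level folder prefix."""
    batches = defaultdict(list)
    for key in keys:
        prefix, separator, _rest = key.partition("/")
        if not separator:
            # Keys with no folder at all go into a catch-all batch.
            prefix = "(no prefix)"
        batches[prefix].append(key)
    return dict(batches)


# Hypothetical keys: one top-level folder per customer (an assumption).
keys = [
    "customer-a/2023-01/invoices.xlsx",
    "customer-a/2023-01/orders.xlsx",
    "customer-b/2023-01/invoices.xlsx",
]
batches = group_keys_by_prefix(keys)
```

Each batch can then be handed to the conversion and enrichment steps as a unit, which keeps one customer's files from being mixed with another's.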

Parquet is optimised for working with complex data in bulk, and supports several efficient compression and encoding schemes. This approach helps with queries that need to read only certain columns from a large table of data. Because Parquet stores data by column, a BI platform can read just the columns it needs, greatly reducing the input/output (I/O) required.

Once the ML phase was complete, the new output was stored in S3 again, triggering an AWS Lambda function. This trigger sets off a classification scoring process, with metadata from the ML model being pushed to Amazon DynamoDB. DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale, which made it a good fit for this use case.
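The S3-to-Lambda-to-DynamoDB step might look something like the sketch below. It assumes the standard S3 event notification shape (a "Records" list with bucket name and URL-encoded object key); the item attributes (ingestion_id, customer, model_version) and the injected table argument are hypothetical, chosen only to keep the example self-contained.

```python
from urllib.parse import unquote_plus


def build_metadata_items(event, model_version="unknown"):
    """Turn an S3 event notification into DynamoDB-style items (plain dicts)."""
    items = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        items.append({
            "ingestion_id": f"{bucket}/{key}",   # hypothetical partition key
            "customer": key.split("/", 1)[0],    # assumes prefix == customer
            "model_version": model_version,      # hypothetical attribute
            "size_bytes": record["s3"]["object"].get("size", 0),
        })
    return items


def handler(event, context, table=None):
    """Lambda-style entry point; `table` is injected so the sketch runs offline.

    In a deployed function the table would typically come from
    boto3.resource("dynamodb").Table(...), with a table name of your choosing.
    """
    items = build_metadata_items(event)
    if table is not None:
        for item in items:
            table.put_item(Item=item)
    return {"written": len(items)}


# A trimmed-down, hypothetical S3 event for local experimentation.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "enriched-output"},
                "object": {"key": "customer-a/batch+1/result.parquet",
                           "size": 2048}}}
    ]
}
result = handler(sample_event, None)
```

Passing the table in as an argument is a design choice for the example: it lets the parsing logic be exercised without AWS credentials.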

The trigger also pushes ML-enriched data into PostgreSQL. PostgreSQL is an advanced, enterprise-class, open-source relational database that supports both SQL (relational) and JSON (non-relational) querying. Firemind uses it because it is a highly stable database management system, backed by more than 20 years of community development, which has contributed to its high levels of resilience, integrity and correctness. From this process, the UI for Simfoni’s customers can list all the ingestions, along with their classification accuracy and metadata, via their selected BI platform.

Part 2

The second proof of concept shifted the focus towards a new data machine learning solution. This solution would allow Simfoni users to upload raw files into S3 using AWS Transfer. After a predetermined time, these files are moved to a different storage tier for archiving.
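One common way to implement this kind of time-based archiving is an S3 lifecycle rule. The sketch below builds a rule in the dictionary shape accepted by boto3's put_bucket_lifecycle_configuration; the "raw/" prefix, the 30-day threshold and the GLACIER storage class are illustrative assumptions, since the case study does not state which values or mechanism were used.

```python
def archive_lifecycle_rule(prefix, days, storage_class="GLACIER"):
    """Build an S3 lifecycle configuration that archives objects after `days`."""
    rule_name = prefix.strip("/") or "all"
    return {
        "Rules": [
            {
                "ID": f"archive-{rule_name}-after-{days}d",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                ],
            }
        ]
    }


# Hypothetical values: archive raw uploads 30 days after creation.
config = archive_lifecycle_rule("raw/", 30)
```

The resulting dictionary would be passed as the LifecycleConfiguration argument when applying the rule to a bucket.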

A Lambda function converts all ingested .xlsx files to Parquet format, making the data easier to process and manage. When the Lambda function completes its work, it writes to a new S3 bucket, where the converted files are stored for the next stages.
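The sketch below covers the bookkeeping side of such a conversion: deciding which uploaded keys are spreadsheets and what the converted object should be called in the output bucket. The function name and key layout are hypothetical, and the conversion itself is left as a comment because the case study does not say which library was used.

```python
from pathlib import PurePosixPath


def parquet_key_for(source_key):
    """Return the output key for an .xlsx upload, or None for other files."""
    path = PurePosixPath(source_key)
    if path.suffix.lower() != ".xlsx":
        return None  # not a spreadsheet, so nothing to convert
    # Keep the folder prefix so converted files stay grouped by batch.
    return str(path.with_suffix(".parquet"))


# The conversion itself is not shown. One possible approach (an assumption)
# is pandas.read_excel(...) followed by DataFrame.to_parquet(...), which
# needs the openpyxl and pyarrow packages available to the function.
converted = parquet_key_for("customer-a/2023-01/invoices.xlsx")
skipped = parquet_key_for("customer-a/2023-01/readme.txt")
```

Preserving the original prefix in the output key means the folder-based grouping survives the move to the new bucket.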

The next stage (and arguably the most important) is the use of either AWS Glue DataBrew (for cleaning and normalising datasets) or Amazon SageMaker. Both offer graphical user interfaces (GUIs) while also providing the tools and technical functions needed for advanced data modelling.

Finally, once the customer data has undergone ingestion, cleaning, modelling and deployment, it is stored in a new S3 bucket, ready for use.

Goodbye manual processes

Our PoCs replaced the slow, manual processes the Simfoni team had experienced with automated, scalable ones. The right combination of AWS tools and services means that Simfoni can worry less about day-to-day data organisation, and more about how they build new products and smart services for their customers.

Long standing infrastructure

Before the project had even kicked off, our team of Solution Architects and Data Engineers were able to map all necessary cloud processes using a well-architected framework. This meant that when we were given the green light, we could begin the build immediately, working quickly to leave adequate testing time before the cutover that replaced their existing virtual machine system.

Smart process, smart costs

Now that the Simfoni team has a modern, scalable machine learning process in place, they are no longer constrained by daily manual tasks and laborious administration. The freed-up resource translates into substantial cost savings from a human resources perspective, and gives the team more time to focus on developing and extending their own offerings (increasing revenue, simplifying their customer services and providing a data UI that informs).
