Firemind co-hosts the Bring Your Own Data workshop in London

On Thursday 29th June, the Firemind team collaborated again with Amazon Web Services (AWS) to deliver our second Bring Your Own Data event. We guided a selection of startups and scale-ups through data lakes, ETL and Amazon QuickSight.

Bring Your Own Data

Attendees had the opportunity to learn from industry experts here at Firemind, as well as experienced AWS professionals. They learnt how to use AWS analytics services effectively, including Amazon Redshift, Amazon EMR and Amazon Kinesis, gaining a better understanding of their data so they can make more informed decisions for their business.

Attendees finished the day with a future-proof, cloud-native, serverless, end-to-end data lake solution, built according to AWS best practices and tailored to their unique use cases.

The workshop provided a great opportunity to:

• Get hands-on experience with AWS analytics technologies with guided workshops

• Get expert advice and learn about best practices for data analysis and visualisation

• Explore future analytics and ML use-cases for data lake creation

• Take home a POC solution, ready to use and build on for future innovation

• Connect with other professionals in their fields/industries and share experiences and knowledge

 

The Workshop

We kicked off with the morning meet and greet, a chance for all attendees to meet both Firemind’s team of specialists and the AWS technical team. Sinan Erdem, Solutions Architect Manager at AWS, welcomed all attendees and then moved on to discuss the main topics and agenda for the day.

Ahmed Nuaman, Managing Director at Firemind, followed with a brief introduction to Firemind’s practices and the guidance available throughout the day.

As the workshop revolved around the attendees’ data, we began by looking at data ingestion and the pre-prepared data the attendees had placed into Amazon S3 buckets. Once data is ingested into an S3 bucket, it is immediately available for retrieval and further processing. S3 provides high durability, scalability, and availability for data, making it suitable for a wide range of data ingestion scenarios, including data backups, data lakes, log storage, and data archiving.
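As a rough illustration of this ingestion step, the sketch below uploads one local file under a dataset prefix. It is not the workshop's actual tooling: the helper names, bucket, prefix and file name are hypothetical, and the client passed in is assumed to be a boto3 S3 client (only its `upload_file` call is used).

```python
from pathlib import Path


def object_key(prefix: str, local_path: str) -> str:
    """Build the S3 object key for a local file under a dataset prefix."""
    parts = [prefix.strip("/"), Path(local_path).name]
    return "/".join(part for part in parts if part)


def ingest_file(s3_client, bucket: str, prefix: str, local_path: str) -> str:
    """Upload one local file to S3 and return its s3:// URI.

    s3_client is assumed to be boto3.client("s3"); bucket and prefix
    are placeholders chosen for illustration.
    """
    key = object_key(prefix, local_path)
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

With boto3 installed and AWS credentials configured, calling `ingest_file(boto3.client("s3"), "example-byod-bucket", "raw/sales", "sales_2023.csv")` would place the file at `raw/sales/sales_2023.csv` in that bucket. Grouping files under one prefix per dataset keeps each dataset in its own folder-like path, which makes later cataloguing simpler.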

Once the data was prepped, attendees began the AWS Glue portion of the workshop. AWS Glue is a fully managed extract, transform, and load (ETL) service provided by AWS. It helps automate the process of preparing and transforming data for analysis, making it easier to discover insights and derive value from ingested data stored in Amazon S3 buckets.

When working with ingested data in an S3 bucket, AWS Glue offers the following functionalities:

Data Catalog: AWS Glue creates and maintains a centralised metadata repository called the Data Catalog. It catalogs the structure, schema, and location of your data in S3. The Data Catalog acts as a single source of truth for metadata, making it easier to discover and understand the ingested data.

Crawlers: AWS Glue Crawlers automatically scan the S3 bucket and infer the schema and structure of the data. Crawlers can identify file formats, partitioning schemes, and extract metadata, which are stored in the Data Catalog. This automated discovery process saves time and effort by eliminating the need to manually define schemas.

ETL Jobs: AWS Glue lets you create ETL jobs to transform the ingested data into a format suitable for analysis. You can use a visual interface or code (Python or Scala) to define the transformations. ETL jobs can perform operations such as filtering, aggregating, joining, and cleaning the data. The transformed data can be stored back in S3 or loaded into other data stores or analytics services.

Data Lake Formation: AWS Glue integrates with AWS Lake Formation, which provides additional capabilities for managing and securing data lakes. Its main role is to define and enforce fine-grained access control policies on the ingested data in S3, for example at the database, table, or column level.
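The four capabilities above can be sketched as a short sequence of boto3 calls. This is a minimal sketch under stated assumptions, not the code used on the day: the function names are hypothetical, `glue` and `lakeformation` are assumed to be `boto3.client("glue")` and `boto3.client("lakeformation")`, and the role, database, job and principal values are placeholders the caller would supply.

```python
def crawl_prefix(glue, crawler_name, role_arn, database, s3_path):
    """Register a crawler over one S3 prefix and start it.

    The crawler runs asynchronously; tables appear in the Data Catalog
    only once the run has finished.
    """
    glue.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue.start_crawler(Name=crawler_name)


def catalog_schemas(glue, database):
    """Return {table_name: [(column, type), ...]} from the Data Catalog."""
    schemas = {}
    kwargs = {"DatabaseName": database}
    while True:
        page = glue.get_tables(**kwargs)
        for table in page.get("TableList", []):
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            schemas[table["Name"]] = [(c["Name"], c.get("Type", "")) for c in columns]
        token = page.get("NextToken")
        if not token:
            return schemas
        # Follow pagination until the catalog has no more pages.
        kwargs["NextToken"] = token


def run_etl_job(glue, job_name):
    """Start a run of an already-defined Glue job and return its run ID."""
    return glue.start_job_run(JobName=job_name)["JobRunId"]


def grant_select(lakeformation, principal_arn, database, table):
    """Grant read-only (SELECT) access on one catalog table."""
    lakeformation.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": principal_arn},
        Resource={"Table": {"DatabaseName": database, "Name": table}},
        Permissions=["SELECT"],
    )
```

Passing the clients in as arguments, rather than creating them inside each function, keeps the sketch easy to try against a stub before pointing it at a real AWS account.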

The final stage of the workshop was visualising and exploring the data using Amazon QuickSight. With QuickSight, all users can meet varying analytic needs from the same source of truth through modern interactive dashboards, paginated reports, embedded analytics, and natural language queries.

 

Looking to work on your data?

If you’re a business that’s looking to further harness the power of your data, and would like specialist care, reach out to our team.

As data specialists, we’ll help you consolidate and clean your datasets, build data lakes that make data management a breeze, and bring you the value of visualised data through dashboards that help you make strategic decisions for your business.
