The customer identified the need for a GDPR-compliant application that gives company users access to all kinds of data stored in a distributed Data Lake. Users must justify each access request by creating a data processing procedure, and a domain owner is responsible for granting access. The data processing procedure is then registered in a company-wide directory, where auditors can review all data needs.
To make this as convenient as possible for the end user, the application features a shopping cart mechanism: items (data objects) can be put into a cart. Each item carries tags that categorize its data types, and users "pay" with a data processing procedure that covers exactly those data types.
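The cart check described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the names `DataObject`, `ProcessingProcedure`, `CartCheckout` and `canPayWith` are hypothetical and not taken from the actual application, and "covers exactly" is read here as set equality between the tags in the cart and the data types of the procedure.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: record and field names are illustrative assumptions.
record DataObject(String name, Set<String> tags) {}

record ProcessingProcedure(String id, Set<String> coveredDataTypes) {}

class CartCheckout {

    // Collects the data types (tags) of all items currently in the cart.
    static Set<String> requiredDataTypes(List<DataObject> cart) {
        Set<String> required = new TreeSet<>();
        for (DataObject item : cart) {
            required.addAll(item.tags());
        }
        return required;
    }

    // Assumption: "exactly" means the procedure covers the same set of
    // data types as the cart requires, no fewer and no more.
    static boolean canPayWith(List<DataObject> cart, ProcessingProcedure procedure) {
        return !cart.isEmpty()
                && requiredDataTypes(cart).equals(procedure.coveredDataTypes());
    }

    public static void main(String[] args) {
        List<DataObject> cart = List.of(
                new DataObject("orders", Set.of("customer", "payment")),
                new DataObject("clickstream", Set.of("behavioral")));
        ProcessingProcedure procedure = new ProcessingProcedure(
                "proc-1", Set.of("customer", "payment", "behavioral"));
        System.out.println("Checkout allowed: " + canPayWith(cart, procedure));
    }
}
```

If a procedure were allowed to cover more data types than the cart needs, the equality check could be replaced by a `containsAll` call; which of the two rules applies is a policy decision the text does not spell out.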
Brief market research made it clear that off-the-shelf Data Governance tools could cover the company's requirements only partially, so Holon decided on a "build" solution rather than a "buy" approach. An additional driver was that, although the backbone of the Data Lake was built in AWS and relies on Kafka and S3, other on-prem and cloud systems needed to be integrated, not only for access but also for collecting object metadata in a centralized Data Catalog.
All systems, cloud-native as well as on-prem, were integrated with our application through REST interfaces, which are responsible for provisioning user access. To deliver metadata asynchronously, we decided to use an AWS SQS queue, consolidating system-specific queues into one application queue. The application itself runs in containers on EKS, which gives us full flexibility in scaling, because we planned for additional auxiliary applications for maintenance and administration. The frontend was implemented in React with its wide variety of libraries and visualization components. The backend was pure Java, with an Oracle database on RDS for persistence.
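The idea of consolidating system-specific queues into one application queue can be illustrated with a small in-memory Java sketch. It deliberately uses plain `java.util` queues as a stand-in for SQS and makes no AWS SDK calls. The types `RawMetadata`, `MetadataMessage` and `MetadataFanIn`, as well as the system names, are hypothetical and only serve the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical message shapes; the real metadata format is not described in the text.
record RawMetadata(String objectId, String body) {}

record MetadataMessage(String sourceSystem, String objectId, String body) {}

class MetadataFanIn {
    // One queue per source system (stand-in for the system-specific queues).
    private final Map<String, Queue<RawMetadata>> sourceQueues = new LinkedHashMap<>();
    // The single queue the application reads from.
    private final Queue<MetadataMessage> applicationQueue = new ConcurrentLinkedQueue<>();

    // Registers a source system and returns the queue it writes to.
    Queue<RawMetadata> register(String systemName) {
        return sourceQueues.computeIfAbsent(systemName, name -> new ConcurrentLinkedQueue<>());
    }

    // Drains every source queue into the application queue, tagging each
    // message with the system it came from. Returns the number of messages moved.
    int consolidate() {
        int moved = 0;
        for (Map.Entry<String, Queue<RawMetadata>> entry : sourceQueues.entrySet()) {
            RawMetadata raw;
            while ((raw = entry.getValue().poll()) != null) {
                applicationQueue.add(
                        new MetadataMessage(entry.getKey(), raw.objectId(), raw.body()));
                moved++;
            }
        }
        return moved;
    }

    Queue<MetadataMessage> applicationQueue() {
        return applicationQueue;
    }

    public static void main(String[] args) {
        MetadataFanIn fanIn = new MetadataFanIn();
        fanIn.register("s3").add(new RawMetadata("bucket/orders.parquet", "{\"tags\":[\"customer\"]}"));
        fanIn.register("on-prem-dwh").add(new RawMetadata("SALES.ORDERS", "{\"tags\":[\"payment\"]}"));
        System.out.println("Moved " + fanIn.consolidate() + " messages");
        fanIn.applicationQueue().forEach(System.out::println);
    }
}
```

Tagging each message with its source system keeps the origin visible after consolidation, so a single consumer can still route or filter by system when updating the Data Catalog.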
Both RDS, a fully featured relational database service, and EKS, the AWS service for managed Kubernetes, proved to be very reliable, resilient choices that required minimal administration effort.
For Single Sign-On (SSO) and Identity Management we took advantage of the customer's centralized IT services: an on-prem Active Directory, a module for authentication and identity federation, and a governance service that takes care of creating, deleting and updating user information.
While deployments to the three environments for development, test and production are done automatically through GitLab and GitLab Runner, Terraform scripts were deployed separately to build a reliable, reproducible infrastructure for all environments.
Holon was chosen by the customer to drive the project in an agile way and to provide the resources needed for a detailed solution concept as well as for implementation. During the project it was crucial to react to changing requirements, and close stakeholder management was necessary. AWS services helped considerably to enable quick prototyping of software components and to reduce the overall time to delivery, especially for security topics and unpredictable infrastructure needs.
Although the solution concepts had to be reiterated and refined from the ground up as the customer verified assumptions and data needs, the project was a huge success. AWS services and the customer's central services were seamlessly integrated. They enable the company's users to satisfy their data needs and to take advantage of modern approaches to near-real-time data streaming and Big Data clusters, as well as reliable and proven Business Intelligence applications and other database services.
By building a modular base system, we allowed the customer not only to scale up the environment but also to bring new functions to the end users:
We care about our customers by driving complex, hybrid projects and by finding the right approach to deliver insights quickly. We focus on Business Intelligence and Big Data and specialize in enabling migrations to the cloud.