The Evolution of the Sausage Machine: Supporting an Effective Data Strategy

By Carl Lockwood – Liqueo Senior Consultant

The problem

In today’s world, organisations increasingly demand efficient processing of, and access to, growing volumes of data. This can no longer be effectively supported by isolated IT teams trying to negotiate and interface with each other and the business. There is plenty of technology available to support more holistic ways of thinking about data. But understanding the technology is not enough to modernise an organisation's data strategy. Staying focused on the business needs, not the tech, is often the most effective way to make progress.

The early days

Several aeons ago, I started the first of many IT data-focused development roles. The role required me to align to specific technologies, and typically involved:

  • Creating or maintaining applications within siloed data systems 
  • Providing access to, and views of, the data so that business users can observe and maintain it
  • Building often complex, ‘sausage machine’ data pipelines using the technology relevant to the application to import, validate, enrich, match, master, export and archive.

Until recently, those data systems required considerable time just to support the infrastructure. They also required significant remediation effort to make the individual sausage machines churn faster when data volumes increased.

Data growth

For a long time, it seemed like a sensible idea to have independent systems, each interfacing with the other, reconciling and validating what they receive. However, demands for more and new data are increasing dramatically, and the sausage machines are struggling to keep up. 

Party data, for example, used to be limited to fewer than a hundred data points. The necessity of assessing Party ESG metrics now involves thousands of data points that may need to be collected and reconciled from an increasing number of vendors. Traditional approaches of staging large sets of structured data through many transformations often push those systems past breaking point. Yet overcoming that problem alone is rarely given priority.

One solution has been to expand processing power at lower costs by adopting cloud technologies. While this can remove previous hardware resource constraints, it is not an “out-of-the-box” solution to the data modernisation objectives of increasing quality, efficiency and flexibility of business processes.

Data diversity challenges

There has also been an evolution in data transfer and storage formats, many of them semi-structured, such as JSON, XML, Avro, Apache ORC and Parquet.

New forms of semi-structured or unstructured data require alternative methods of storage which applications based on hierarchical data models can’t support. Data such as PDFs, images or videos must be tagged and categorised in new ways.

Throwing unstructured data in a data lake was offered as one solution. But that can pose its own challenges. It can be difficult to ensure that the consumers who need the data can efficiently navigate the storage to access what they need when they need it. It can also be challenging to limit users’ access to only what they are permitted to see.

The focus on process evolution rather than data modernisation strategies

The early Ways Of Working models were aligned to support teams of people within the microcosms of their application, limiting their concerns to the boundaries of their system interfaces and the performance of the pipelines within.

Process evolution has brought more focus on how we do things. For example, we have heard much about how Agile self-management philosophies maximise the delivery pipeline within those application silos. Likewise, scaled Agile approaches focus on collaboration across teams.

But it is much harder for organisations to step back, take a holistic view and consider a data evolution, then visualise all the data systems together. Maybe there has been too little focus on how team efforts can support a cohesive data strategy aiming to maximise long-term business efficiency and benefit? 

There is an increasing need for a corporate-level holistic data strategy, which requires companies to know the value of their data by cataloguing it and showing its lineage. Users also now demand more timely data. Systems must validate ever-larger volumes of data quickly and make it available on demand, rather than relying on the completion of sizeable daily processing batches within a pressured schedule.

New ways to leverage data

Effective data strategies must comprise a combination of initiatives based on a collaboration of people. A strategy must be supported by appropriate technology which provides consumers efficient interfaces to data they can understand - without necessarily having it delivered from a siloed application or by an IT team. 

Technology solutions may involve new approaches such as providing a Data Fabric. This unifies the underlying systems and structures to provide a single point of interface, as well as new organisation-wide implementations of comprehensive data intelligence tool-sets.

This technology has evolved into a new domain of “data intelligence”. 

Data intelligence concepts 

Data intelligence software has orientated businesses to consider organisation-level rather than application-level implementations. These encompass concept areas such as:

  • Data cataloguing and metadata management, providing users with a cohesive understanding of data from all sources, and tracking data lineage to identify data provenance or the downstream impact of a change on any metadata element
  • Automated metadata-driven ETL mechanisms for importing or capturing streams of data, with tools to actively monitor the data by validating, profiling and benchmarking it to establish trust
  • Facilitating data cleansing and stewardship, delegating responsibilities with workflow-driven self-service applications
  • Mastering data into repositories, interfacing simply to scalable, cloud-based technologies such as Snowflake, and automating pipeline delivery so that validated data reaches consumers when data change events occur
  • Security and privacy through rules and policy management
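
To make the lineage idea above more concrete, here is a minimal sketch of how provenance and downstream impact might be traced. It assumes lineage is held as a simple directed graph; the dataset names and both function names are hypothetical and do not come from any particular data intelligence product.

```python
from collections import deque

# Hypothetical lineage graph: each key feeds the elements listed against it.
LINEAGE = {
    "vendor_esg_feed": ["party_staging"],
    "party_staging": ["party_master"],
    "party_master": ["risk_report", "client_portal"],
    "risk_report": [],
    "client_portal": [],
}


def downstream_impact(graph, changed):
    """List every element reachable downstream of `changed`, breadth-first."""
    seen = []
    queue = deque(graph.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


def upstream_provenance(graph, target):
    """List every element that feeds `target`, by walking the edges in reverse."""
    parents = {}
    for source, targets in graph.items():
        for name in targets:
            parents.setdefault(name, []).append(source)
    return downstream_impact(parents, target)


if __name__ == "__main__":
    print(downstream_impact(LINEAGE, "party_staging"))
    print(upstream_provenance(LINEAGE, "risk_report"))
```

In practice a data intelligence tool would be expected to capture this graph automatically from metadata rather than have it maintained by hand; the sketch only illustrates the kind of question the lineage answers.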

Which technologies should be used?

With so many approaches and products available, and vendor sales packs framing data questions in terms that best emphasise their offerings, it can take time to evaluate vendor and product-specific strengths and weaknesses.

Fortunately, some companies provide analysis in this area. Gartner, for example, has devised a concept for evaluating the marketplace. The Gartner “Magic Quadrant” makes helpful practical assessments across the industry, based on the ability to execute and completeness of vision. It also identifies the niche players, visionaries, challengers and leaders.

However, white papers and an understanding of the industry players are likely not enough to answer all the questions. Vendors may dazzle and mesmerise and try to change the perspective from what is needed to what they have, without genuinely focussing on the whole problem at hand.

To be effective, a data strategy should not be driven by the technology. It should be based on an understanding of the business data value chains and the need for collaborative data philosophies.

The message

Modern data strategies require new thinking. We should think about how people work together and use technology. For example, organisations are abandoning the traditional, inefficient reliance on negotiating with IT teams to deliver data from their isolated “sausage machines”, via interfaces they control. A lot of information is available in the data intelligence space, via a vast range of vendors, tools and technological approaches.

At Liqueo, we understand that effecting change should be an evolving process. It starts with engaging the right people, building relationships, and gaining an understanding of existing business limitations. Fulfilling data modernisation objectives must be based on the understanding that technology should support the journey rather than be the objective in itself.

Liqueo has strong experience in both directly delivering implementations and prioritising business objectives while a technology-orientated vendor is delivering the change. 

Sharing expertise from our Ways Of Working practice, we can support small initiatives or large projects on a data strategy plan. We can align with your business to help implement technology or processes supporting a data evolution that adapts and evolves as the journey progresses.

Whether you need support with your data needs or you’re doing something interesting in the data space, we’d love to hear from you. 
