The prize: Cloud and the hyperscalers promise so much for data and analytics, yet cloud is not right for everything
The fundamental mismatch. The world’s global IT landscape is fast becoming data-centric. In contrast, traditional enterprises were designed around business units and applications, with data a by-product as opposed to a highly valued resource.
The result is that traditional enterprises are out of step with the dynamic data-centric environment in which they operate, unable to metabolize and act on data at the required speed.[1]
You will always be behind in some areas if you are not in the cloud. The hyperscalers — Amazon, Google and Microsoft — and the venture capital industry are channeling huge investment into an ecosystem of data and analytics solutions, with cloud as the “go-to” deployment option. In augmented data and decision-making, for example, you cannot keep up if you are not in the cloud.
But cloud is not right for everything. We should not view every decision through the lens of cloud. Considerations of security, performance, cost and compliance can all make cloud the wrong choice for certain systems and data. After the cost of transformation has been taken into account, the business case may no longer stand, and the capacity of an enterprise to move to cloud will be limited by organizational inertia, skills availability and change bandwidth. So, what is the point of a strategy that cannot be implemented?
Sizing the prize: A cloud journey must start with an assessment of the business value that cloud-based solutions will bring to your enterprise.
Innovation: putting data at the heart of the enterprise. Becoming data driven of course includes making better decisions faster, based on more and better-quality data; but being a truly data-driven enterprise also entails a transformation that places data at the heart of the enterprise to drive revenue and other business outcomes. Cloud can play a key role in this transformation by bringing innovation at multiple levels. Cloud enables new business models, for example, constructing data ecosystems. Cloud also empowers machine learning (ML) and artificial intelligence (AI), through out-of-the-box models and services that support the entire ML/AI life cycle.
Cloud can open up the inner sanctum of data science — formerly held close by its high priests, professional data scientists — and put data and tools into the hands of citizen data scientists. Finally, cloud can help unlock the value in data, for example, via open data sets.
Agility: where cloud comes into its own. The COVID-19 crisis has demonstrated the concrete value of agility. Whereas some enterprises were able to pivot seamlessly in reconfiguring operations, others struggled. Similarly, the faster pace of change across the entire economy places a premium on agility: variable costs vs. fixed costs; on demand vs. on order; fluid vs. static; modular vs. monolithic; test and learn vs. plan and specify. Agility is where cloud comes into its own, bringing scalability and enabling rapid experimentation. Cloud also increases data flow and fluidity, making data composable as building blocks for rapid assembly of new analytical and business capabilities. This can allow managers to ask critical questions that had not previously been considered.
Data management: integration, automation and control. Cloud brings many opportunities to transform and optimize data management, without which data centricity remains a dream. Cloud data management greatly facilitates integration of data from across the enterprise and beyond through data hubs and data meshes. Similarly, cloud brings the opportunity to reduce data proliferation arising from the “spaghetti and meatballs”[2] effect. Furthermore, through automation, cloud can impose extra discipline in data management. Finally, cloud increases the range of options for storing data, so that it can be optimized for distinct types of data and processing.
Cost: run the numbers early. Cloud will reduce CAPEX and, if deployed correctly, bring greater control over costs. Yet, as highlighted in a recent paper by Andreessen Horowitz, it is by no means certain that cloud will reduce costs overall,[3] especially once an enterprise matures. Whether compute and storage are cheaper on premises or in the cloud will depend on your cost alternatives and on how workloads and data are used. In fact, larger savings are typically to be had from rationalizing applications and data prior to moving to cloud. Run the numbers early, since there are major implications for architecture, and make sure to account fully for networking, data transfer and transformation costs, which are typically underestimated.
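To make "run the numbers" concrete, the following is a minimal sketch of a cost comparison that keeps networking, data transfer and one-off transformation costs visible instead of folding them away. The class name, function names and every figure are hypothetical illustrations, not benchmarks or recommended values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CloudCosts:
    """Hypothetical monthly cloud cost inputs, all in the same currency."""
    compute: float
    storage: float
    networking: float
    data_transfer: float  # for example, egress charges


def monthly_total(costs: CloudCosts) -> float:
    """Sum every recurring line item, including the ones often left out."""
    return (costs.compute + costs.storage
            + costs.networking + costs.data_transfer)


def total_cost(costs: CloudCosts, months: int, transformation: float) -> float:
    """Recurring cost over the period plus the one-off transformation spend."""
    return monthly_total(costs) * months + transformation


def break_even_months(costs: CloudCosts, on_prem_monthly: float,
                      transformation: float) -> Optional[float]:
    """Months until the monthly saving repays the transformation cost.

    Returns None when cloud is not cheaper per month (no break-even).
    """
    saving = on_prem_monthly - monthly_total(costs)
    if saving <= 0:
        return None
    return transformation / saving


if __name__ == "__main__":
    # Illustrative figures only (assumptions, not data from the paper).
    full = CloudCosts(compute=40_000, storage=10_000,
                      networking=5_000, data_transfer=5_000)
    optimistic = CloudCosts(compute=40_000, storage=10_000,
                            networking=0, data_transfer=0)
    print(break_even_months(full, 80_000, 480_000))        # full accounting
    print(break_even_months(optimistic, 80_000, 480_000))  # costs left out
```

With these made-up inputs, leaving out networking and data transfer shortens the apparent payback period considerably, which is the kind of distortion an early, complete calculation is meant to expose.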
Big questions: Before an enterprise can decide how and where to adopt cloud for data and analytics, answers to some big questions are required to act as guardrails for your journey.
Strategy: What business and IT outcomes do we need to deliver through analytics and data? The most fundamental question is how you intend to compete and create value through data and analytics. For example, will your priority be analysis and exploration, or operational analytics? What are the most valuable data types? Since few enterprises have the luxury of focusing solely on the long term, your strategy should also resolve problems in the here and now — such as forthcoming regulation or performance problems that stakeholders want fixed.
Compliance: What rules must data and analytics meet and how do we ensure compliance? Too often cloud programs are expensively stalled by the need to retrofit regulatory rules. An essential early step is the identification of all applicable regulations. Many sectors have specific regulations governing cloud, while data governance rules such as the General Data Protection Regulation (GDPR) apply across all sectors. Years of gradual change may have led to some aspects of compliance being fudged, but cloud’s software-defined approach will demand the removal of all ambiguity. Since regulation now changes frequently, compliance should be future-proofed by employing a policy-driven software approach instead of hard-wired controls.
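One way to picture a policy-driven approach is to hold the rules as data that a generic checker evaluates, so that a regulatory change means editing a policy entry and not rewriting controls. The sketch below is a simplified assumption of how that could look. The classifications, region names and rules are hypothetical and do not represent the requirements of GDPR or any other regulation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Dataset:
    name: str
    classification: str  # hypothetical labels such as "personal" or "public"
    region: str          # where the data is stored


# Policies are data, not code. None means "no regional restriction".
# These entries are illustrative assumptions only.
POLICIES: Dict[str, Optional[Set[str]]] = {
    "personal": {"eu-west", "eu-central"},
    "public": None,
}


def violations(datasets: List[Dataset],
               policies: Dict[str, Optional[Set[str]]]) -> List[str]:
    """Return a message for every dataset that breaks, or lacks, a policy."""
    found = []
    for ds in datasets:
        if ds.classification not in policies:
            # An unclassified case is flagged, removing ambiguity.
            found.append(f"{ds.name}: no policy for '{ds.classification}'")
            continue
        allowed = policies[ds.classification]
        if allowed is not None and ds.region not in allowed:
            found.append(f"{ds.name}: region '{ds.region}' not allowed")
    return found


if __name__ == "__main__":
    estate = [
        Dataset("crm", "personal", "us-east"),
        Dataset("brochure", "public", "us-east"),
        Dataset("logs", "telemetry", "eu-west"),
    ]
    for message in violations(estate, POLICIES):
        print(message)
```

The checker never changes when a rule does. Only the `POLICIES` table is updated, which is the property that makes this style easier to keep current than hard-wired controls.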
Security: How do we keep data and analytics applications secure in the cloud? Although the hyperscalers deliver exceptional security for their cloud infrastructure, cloud inevitably increases the attack surface and you, not the hyperscaler, are responsible for protection of data in the cloud. You are likely to want to explore new security models to strengthen security and mitigate these risks — in particular zero trust, data-centricity, and modern identity and access management.
Architecture: What is the overarching architecture for data and analytics in the cloud? Choices are required around architecture. First, different primary business use cases will need different patterns for ingesting, storing and managing data. Second, there is a variety of architectural models to choose from. The lakehouse, the data integration hub-data mesh and the cloud data warehouse all have their advocates. Although new ventures will almost certainly be born in the cloud, most legacy firms will have some analytical applications and data stores that are not suited to cloud, making a hybrid of solutions inevitable.
Partners: Who will be our strategic partners on our cloud journey? The hyperscalers are not all the same when it comes to data and analytics. For example, they bring different capabilities in compute speed and scale, open data sets, artificial intelligence models (say, for speech and text) and services for the model development lifecycle. Moreover, specialist vendors fill important niches in the landscape. Furthermore, the journey involves many novel technologies, and skills are scarce. As a result, making the right choice of IT service providers is a critical success factor in reducing risk, cost and timescales.
Strategy to action: Once you have addressed the big questions, you can move from strategy to action.
Data. In resolving how and where data should be stored and processed, a whole range of factors has to be weighed: business needs, security, compliance, cost and technical performance. Moreover, these considerations are frequently opposed, making tradeoffs essential. Significantly, egress fees and data gravity (the pull that data exerts to attract other data) mean that your decisions will have long-term implications. Because data will most likely end up being stored in more than one location, a common metadata layer is vital to prevent the cementation of new silos.
Data management. Cloud presents an opportunity to enhance data and information management, but you have a narrow window to get things right. You will need to map out the principal data flows, with separate patterns defined for each — for example, operational decision support, self-service analytics and data science. Notably, these data flows may operate completely differently in cloud to on premises, with Extract, Load and Transform (ELT), for instance, potentially replacing Extract, Transform and Load (ETL). Likewise, you need to define the tools, processes and governance that you will use to manage information across the end-to-end lifecycle. Key areas include data augmentation, metadata, data lineage, data catalogs and archiving.
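The ELT pattern mentioned above can be sketched in a few lines: raw records are loaded unchanged into the target engine, and the transformation then runs inside that engine as SQL. The example uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse. The table names, columns and sample rows are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def run_elt(rows: List[Tuple[str, str, str]]) -> List[Tuple[str, float]]:
    """Load raw rows as-is, then transform them inside the database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Extract and Load: land the source records untouched, as text.
        conn.execute(
            "CREATE TABLE raw_orders (order_id TEXT, amount TEXT, country TEXT)"
        )
        conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)

        # Transform: cleansing and aggregation happen in the target engine.
        conn.execute(
            """
            CREATE TABLE orders_by_country AS
            SELECT UPPER(TRIM(country)) AS country,
                   SUM(CAST(amount AS REAL)) AS total_amount
            FROM raw_orders
            WHERE TRIM(amount) <> ''
            GROUP BY UPPER(TRIM(country))
            """
        )
        return conn.execute(
            "SELECT country, total_amount FROM orders_by_country ORDER BY country"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("1", "10.5", " de"),
        ("2", "4.5", "DE "),
        ("3", "", "fr"),   # missing amount, filtered out in the transform
        ("4", "7", "FR"),
    ]
    print(run_elt(sample))
```

In an ETL flow the trimming, casting and filtering would happen in a separate tool before loading. Here the raw table is preserved, so the transformation can be revised and re-run later without re-extracting from the source.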
Rich data and analytics landing zone. A vital prerequisite for a cloud data and analytics program is a richly functional enterprise cloud platform that includes common components — for example, single sign-on, networking, zero trust security, monitoring and DevOps. Even though such a platform will typically take 3 to 6 months to build, this is a matter of “going slower to go faster,” ensuring each later project does not duplicate effort and delivers a more standardized solution. Similarly, a rich data and analytics landing zone is required that should contain information management and governance tools and processes such as data cataloging, data protection and data lineage. Without this, you will miss the opportunity to standardize and to bake in compliance by design and security by design. You will spend more, as successive projects reinvent the wheel, each with their own vision of roundness.
Migration and transformation. In order to build out from this first landing zone, you will need a roadmap for scaling, as well as a robust and repeatable approach to migration, transformation and archiving, which is often best achieved by a factory approach that reduces costs and enforces standardization. DXC’s approach to choosing the right platforms, modernizing and optimizing infrastructure for managing data is called Cloud Right™. As when moving operational systems, for each application or data store, you must decide if business goals are best met by simpler rehosting and replatforming, or more complex refactoring and rearchitecting. Even when the goal is transformation, you will have to choose whether to transform and migrate or to migrate and refactor/reengineer, since the cloud offers many tools to facilitate transformation and data cleansing. Finally, a robust approach is needed to archiving systems and data so that the full benefits of moving to the cloud are realized.
Making the matrix work. Since a successful cloud journey necessitates balancing numerous factors — cost, risk, compliance, security, technical performance and business outcomes — you need organizational structures and governance to make the matrix work. At the working level, compliance and security SMEs should be embedded within teams to ensure compliance by design and security by design, with architects likewise dispersed among project teams to ensure adherence with architecture and design principles. At the next level up, a cloud business office (CBO) applies integrated decision making where individual scrums cannot resolve issues or where an issue spans many teams. Above the CBO, an executive forum can act as a point of escalation but the buck needs to stop with a single accountable executive.
Conclusion: Cloud will play a vital role in reorientating the enterprise around data, a defining feature of the 21st-century enterprise.
It is hard to see how for most enterprises cloud would not form a key strategy in the transformation to data centricity. Yet you should avoid starry-eyed thinking that fails to accept that migrating to cloud is difficult and that for large enterprises with complex legacy systems a hybrid solution is all but inevitable. Success will depend on clear top-down thinking to answer the big questions and to develop a strategy to scale adoption. The top-down approach, however, has to be tempered with bottom-up planning and action, since complex issues can only be addressed by working through the details and learning by doing.
About the authors
James Coleman, Michael Conlin, Mamoun Hirzalla, Sebastian Kloeser, Andriy Sas, Chris Swan and Dave Whitehead
[1] See Supercharging your data metabolism, DXC Research, September 2021.
[2] Chris Swan, Spaghetti and meatballs, Chris Swan’s Weblog, July 7, 2019.
[3] Sarah Wang and Martin Casado, The cost of cloud, a trillion dollar paradox, Future from a16z, May 27, 2021.