# Data Strategy Bootcamp
###### tags: `Work` `Data` `Training` `PDI`

# Day 1
## Notes
### Introduction
#### What is a Data Strategy?
- It can't be 'worn', should be implementable
- Should be built into the day-to-day function of the org

#### Case Study A - Success story
- Retail brokerage
- Legacy applications were making business difficult to do
  - made them a takeover candidate
- bud was brought in to assess and evaluate, and eventually developed a strategy
- the organization had an ongoing data strategy initiative, but they started too early and drove forward
- 'Integrator support', approach for working with consultants
- Solutions prioritized efficiency, low overhead
- What does it look like when we are _part_ of the organization
  - have you ever put together a bicycle for your children?

#### Case Study B - Failure story
:::info
*Questions*
- Issues with regulation? Will be addressed later
- How long did Case A take to see success? 3 months!
:::
- Hospital
- Driven by ACA, operational efficiency, DGC
- Challenged by data quality, data regulations, mandated data collection, hostile executive environment

#### Case Study C - Another Success
- Financial org, another
- started program within legal
- progress of data strategy integrated into performance management
- starting with corporate counsel offered inroad into authorizing environment, powerful mandate

#### Intro close
:::info
_Questions_
- Can data governance and data strat be done in parallel? Yes, in fact it almost always needs to be. Most CEOs don't understand how data or its attendant systems work, are governed, whatever. Execs are 'in need of enlightenment'
- (to be covered later) What do you do if too many people want to be involved in data governance? Tell them what it is and they will flee. If everyone wants to be in DG, why? What's the implied or real data strategy that is being implemented? Sometimes everyone wants to get on board with DS because leadership is on board.
If that's your problem, you need to get a champ / sponsor to sit everyone down and simplify the internal contributor team.
:::
- We have to be practical
- Lots of organizations would be better off with a simple data warehouse
- Next, we're going to get into establishing buy in and engagement

### Engagement and Buy in
#### Motivation and Drivers
- People say they are struggling with:
  - too much data
  - responsiveness
  - decision-making
  - monetization
  - insecurity / peer pressure

> These things will get you buy-in but not engagement

- Things that are often going on, to which the data strat is responding:
  - Data quality issues undermine all reporting outputs
  - There is a burning platform
  - Previous partial attempts

:::info
_Questions_
- Aren't we all data stewards? And therefore involved in Data Strategy?
:::

#### Business alignment
- The whole key is connecting things together
- you want to start with a mission, move to business capabilities
- If you can talk business talk, you can endear yourself and your ideas to the executives
- Then move to data management capability, from which follows data governance capability
- Ultimately, alignment means linking data initiatives to irl business problems

#### Focus on Needs
- A true strategy addresses what the organization needs, which maintains engagement and drives change management
- What people want 'changes daily', is transient and ephemeral. People want dashboards because that's what they think they should want
- 6 ways to add value:
  - processes
  - competitive position: develop and act on competitive intelligence
  - results
  - asset / intellectual capital (knowledge management)
  - enabler (competency, growth, empowerment)
  - Risk mitigation

#### Deriving Needs
- E.g.
  - Objective: Improve retention
  - Business Action: respond to client queries faster
  - Information Use: accurate understanding of accounts and holdings
  - Business Information Requirements and Metrics: household characteristics, time to response, types of response
  - Data Candidates: Households, Accounts, Clients

#### Connecting the line of sight
- Every biz cap demands data cap, essentially

#### Showing the value
- Mapping caps to needs allows the data strategist to clearly demonstrate value to the exec environment
- Often get asked "What should we do first?" and the answer is always "What does your organization want to do?"
- Leaders have been proselytized ad nauseam about the wonders of data, the mysterious and miraculous force, and you use this to build a meaningful and tractable plan, rather than a shopping list

:::info
_Question_
- Can this be done prior to a 'maturity study'? This exercise can be done in parallel or any order.
- If there is no data dictionary or glossary, can this work? The need should precede the glossary, which is difficult to get up and running, can be painful and slow (1 year long process). What do you do in the interim? You're going to use some old Excel spreadsheet. Once you know what the spreadsheet is lacking, then you know what you need to address, and you can buy things that do that, rather than things that just sound good.
:::

#### Communicating Data Strategy
- John advocates 1 pager for Leadership on data strategy
- Link goals to deliverables, accountable objectives
- We want to have a nice, clear line of sight between where the org needs to go, what they need to do, how data can support it, and what needs to get bought or built to make it happen

### Breakout Exercise
- Connecting data words with the organization and an initiative
- Be prepared to talk about your current org and a data issue that you're having

#### Stuff that we're doing
- reflecting on the case studies
- Framing a need, initiative, mgmt and gov caps
- We focused on resource allocation and planning

### Components of Data Strategy
- Not big on it being a 'technology deployment plan'
- Business elements
- Technology elements
- Enterprise architectural elements

#### The Big Things
- Literacy
- Governance
- Strategy
- These three 'components' or buckets of capabilities enter every data strategy development process

#### Overall Approach
- There are lots of data strategy methodologies, and John doesn't really express a strong preference. Instead, he advocates a high level approach (which aligns well with the PDI structure).
  - Understand
  - Prepare
  - Change

#### Understand
- Needs -> Assessments -> Definitions -> Vision

#### Prepare
- Plan -> Define -> Build Awareness
- Building awareness, for John, involves socializing the data strategy and engaging with staff across the organization

#### Change
- Sustain -> Integrate -> Start
- The methodology is not the issue, can be applied in Agile or Waterfall contexts

#### Philosophy
- The anchor of org philosophy is the organization's principles
- If you can get an organization to commit to a vision (e.g.
data is an asset in our organization), you can arrive at decisions that follow from that by interrogating their interaction with that vision
- If you start your company with a principle related to data, your issues will become trivial over time
- 'Shaping factors'
  - Literacy
  - Culture
  - Compliance
  - Regulation
- Selling the idea of starting with principles
  - This approach creates a foundation of excellence
  - Starting with principles creates a toolkit that simplifies decision-making, reduces friction, and streamlines the meetings required to get shit done
- You can start with 5-6 principles, can build per data-related interaction / engagement
- Principles beget policies beget processes

#### Capabilities
- Caps are stable, an organization would need a significant event to alter them
- It's not abstract, so it's less controversial
- The term caps comes from the field of enterprise architecture
- Tends to be a bit taxonomic (caps within caps, turtles all the way down, levels to this shit)
- Processes are volatile, caps and data are stable
- Caps are things that help you get things done
- Capability analysis enables you to identify gaps
- I want to put the capabilities into a vision that gives them meaning

:::info
_Question_
- What is an example of the difference between a capability and a process? Capability is like "Marketing", Sub-capability is like "running a campaign", process is like "identifying target population", "developing materials", "disseminating materials", etc. Processes have verbs.
:::

- Capabilities, specifically data capabilities, are positioned at the end of the alignment process chain.
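
The capability / sub-capability / process distinction from the question above can be sketched as a small tree. This is only an illustration and not something shown in the session: the `Capability` class, its field names, and the `outline` helper are hypothetical, and the example values are the "Marketing" ones from the answer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Capability:
    """A stable 'what the org does' node (hypothetical structure)."""
    name: str
    # Caps within caps: a capability can contain sub-capabilities.
    sub_capabilities: List["Capability"] = field(default_factory=list)
    # Processes are the volatile 'how' steps; per the notes, they have verbs.
    processes: List[str] = field(default_factory=list)


def outline(cap: Capability, depth: int = 0) -> List[Tuple[int, str, str]]:
    """Flatten the tree into (depth, kind, name) rows, parents first."""
    rows = [(depth, "capability", cap.name)]
    for process in cap.processes:
        rows.append((depth + 1, "process", process))
    for sub in cap.sub_capabilities:
        rows.extend(outline(sub, depth + 1))
    return rows


# Example values taken from the Q&A: Marketing > running a campaign > processes.
marketing = Capability(
    name="Marketing",
    sub_capabilities=[
        Capability(
            name="Running a campaign",
            processes=[
                "identifying target population",
                "developing materials",
                "disseminating materials",
            ],
        )
    ],
)

for depth, kind, name in outline(marketing):
    print("  " * depth + f"{kind}: {name}")
```

Keeping processes as plain strings attached to a capability reflects the point that processes are volatile while caps are stable: the list can be rewritten without touching the tree around it.
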
#### People
- Research organizations are my 'favourite'
  - They believe that their data are the only data that are important
- You do have areas where data is not visible or legible to anyone in the organization because of a distributed set of hyperfocuses on local data
- External partners complicate data governance and management

#### Tone and pace
- Figure out the scale and scope of the changes described in your data strat
- Is this at a level of cultural mandate?
- The key selling point is that misaligned expectations can create huge issues within the planning and implementation phases, adding unpredictable time and cost to project requirements

#### Readiness
- How mature is the organization?
  - what are they ready to change, and what are they not
- Big Bang approaches are the least likely way to succeed at implementing a data strategy; the strategy needs to meet the organization where it's at
- Don't want to disparage anyone, but one of the big companies that sells cap assessments didn't have any BI capability in their 'maturity model'

#### Literacy
- If you're truly a literate organization, data becomes an integral part of all of your operations
- We're asking people to adopt new behaviours and change old ones, so we need to educate them as to why if we expect them to make the change
- Lots of orgs will say that they don't have time, but they're wrong
  - in Lean or Six Sigma, or in classic Compliance training, everyone in the organization is expected to receive and participate in training
- Putting literacy as part of the principles and culture is a slam dunk
- You can have a lot of fun with literacy
- You will have to do it eventually, so you might as well bake it into the start

#### Metadata management
- every vendor has the ultimate solution, according to them, but nobody has that
- metadata management does not stand on its own, and benefits significantly from universally understood principles of data and information stewardship / documentation, etc.
#### Data Architecture
- Structure of an organization's logical and physical data assets (TOGAF)
- Everyone has an architecture. It might be undocumented, it might be fucked up, but it exists nonetheless.
- Consider existing standards, where they meet your needs, and where they fall short, potentially

#### Data and other models
- They exist, there are best practices, etc.

### Data Capabilities
- Data quality, intentional ETL
- Business Intelligence
- Analytics / Big Data
- Master / Reference Data Management
  - Large monolithic projects
  - High failure rate
  - Reference data are data that are used programmatically across the organization
- Data integration is the holy grail of data strategy efforts across time, more intensive alternative to MDM
- Bimodal considerations refer to the different implications, data models, and functions that arise when different parts of the organization use data very differently.
  - John accepts the potential need for different data management and governance approaches for different parts of the organization
  - Make sure that, whatever you choose, your data strategy does not ignore one of them
- Workflow and operations are often overlooked, but need to be included
- The Roadmap is the most commonly requested deliverable.
  - the 'Capacity Building Plan', essentially
  - Level of detail is critical
  - You need to lay out what happens in the first 6 weeks, first 6 months in excruciating detail
- Data Quality: McGilvray

### Creating a climate of success
- Educating leaders
- Engaging the client team
- Aligning on the mission, the needs
- Understanding maturity
- Demonstrating accountability
- Communicating the value proposition

### Maturity
- Refer to the slide deck for a nice diagram
- Blueprint is not very mature

### Business Data Requirements
#### Overview
- derived, not generally stated
- includes a wide variety of potential vehicles: reports, KPIs, facts, events, codes, identifiers, lists, etc.
- dependent on contexts.
Need to gather all of the context at once.
- Other techniques can be helpful, but should be used with caution:
  - Interviews can give you a lot more information than documents, and they can go in different directions as they unfold, but they are slow, prone to bias, and demanding for interviewers
  - Data collection efforts (surveys?) can miss the details
- John's book includes a set of templates for BDR determination.
  - I don't quite get his process, but he really hits it hard. Worth pulling it from his book.
- Data strategy should happen in 3 months because things happen too fast
- BDR process overview
  - create core BDRs to start with
  - refine, consolidate BDRs, and reflect on them to characterize completely
  - complete artifacts by analyzing and applying what has been created
- Taxonomy of Business Information requirements in slides
- Rationalizing, clarifying, and discussing helps to prioritize BDRs
- A fixed architecture simplifies the process, but can act as a constraint on the possibility space

#### A Note on Requirements
- 'I need a data lake' describes a requirement, rather than an objective
- Test this by reflecting back with the question, 'maybe you don't want a data lake'
- If it's an objective, it won't make sense

#### Technical Architecture
- Uses the same catalog of BDRs
- Dimensions, considerations, etc.

#### Wrap up
- Alignment sets you up for efficiency and success
- Business Data Requirements structure the translation of needs into solutions

## Thinking
### Highlights
- Spreadsheets are both chill and necessary, and so the practices that govern spreadsheet creation, documentation, and usage should be included in the data strategy
- Data literacy is critical at every level of the organization
- Our idea of 'readiness' is reflected well in the concept of data 'Maturity'
- John's emphasis on the importance of customization, experimentation, tailoring tools to your approach is driven by his expertise, rather than persisting in spite of it.
## Glossary
- DMBOK 2
- TOGAF (The Open Group Architecture Framework): an enterprise architecture framework; came up as a pre-defined organizational data architecture 'ontology'
- Financial Industry Business Ontology (FIBO): a metadata management framework for the financial industry
- CMS: same thing for medical domain
- Lazy table
- Integrator
- Master Data Management (MDM):
  - Seems to generally refer to an implementation of an [MDM Platform](https://www.gartner.com/reviews/market/master-data-management-solutions), an enterprise software product that supports integration and management of persistent, unified records.
  - There are 4 MDM hub implementation styles
- CISO
- Reference Data Strategy
- Metadata
- Chief Information Officer (CIO)
- Business Alignment
- Integrator Hostility
- Steward
- Data Governance
- DGC?
- Big Bang / Big 5
- Operating Model
- Data Quality
- Burning Platform
- Business Information Requirements
- Capability: What an organization does to accomplish a goal. If I want to sell stuff, I have capabilities in *Sales* and *Marketing*. No verb, no action in it.
- Maturity (Study / Model / Assessment)
- Agile
- Waterfall
- Maturity model
- Scufworks

## Resources Mentioned
- [Infonomics](https://drive.google.com/file/d/1b_2B7KDogWKEiWXXpF3o0OU9j3hLCAdX/view?usp=sharing)
- [Strategy Maps](https://drive.google.com/file/d/1eF4N77DJ9xqcB0guhs0xIZrw1iyaRpGY/view?usp=sharing)
- [Ladley, Data Governance](https://drive.google.com/file/d/1eF4N77DJ9xqcB0guhs0xIZrw1iyaRpGY/view?usp=sharing)
- [McGilvray, Executing Data Quality Projects](https://drive.google.com/file/d/1Md9QmXutw2V17Ji-xLW6nxiE-3qO_lNN/view?usp=sharing)
- [TOGAF 9](https://drive.google.com/file/d/1m8ssAQ5Ibo35AvWjoPsPkQDH-sAKe0pM/view?usp=sharing)

# Day 2
## Notes
### What is Data Governance?
- It's a business capability
- Part of a broader change management competency. If everything was already great, you wouldn't need this class / data strategy.
You can use the language of organizational change management to talk about data governance, but it's also a service, competency, etc.
- The value of DG is only apparent when the data are being used to drive irl outcomes
- A focus on standards, processes, decision-making, conflict resolution, and culture is positioned as an objective, when it's really a requirement
- DG doesn't do anything on its own. It won't guarantee success. It's kind of tedious and mundane. It's much more like the audit aspect of accounting than it is like the technical / logical aspects of data management
  - For instance, it cannot actually _do_ data quality, it can only provide structure for the data quality to be improved. It's not a technical function.
- Governance is the definition and oversight, while management is the implementation of that oversight
- DG is important because data is used in our organizations, and without intention, it gets used poorly and inconsistently, leading to bad decision-making
- The classic order, which is ok, starts with Business Strategy, moves to Data Strategy, and then ends with People Strategy

:::info
_Question_
- Where can I find a methodology? West Monroe Partners, Doug Laney's firm, has some nice stuff, some intimations of their methodologies on their website, as do other big integrators like Accenture.
- Data Governance is a compliance function, how can that be justified in small startups? If you're regulated, then yeah, it's a compliance function, but if not, then it doesn't have to be. DG helps you use your data to fulfill your mission better, it's not strictly limited to compliance.
- I've seen Data Strat done without DG again and again, how to assert the need for DG? The fact that it's been done again and again should be evidence enough.
:::

### Finding Alignment on Governance?
- Start by identifying your main objectives, maintaining a clear distinction between needs vs wants
- You need to have the organization lined up, with engagement, authority, etc.
- Need detail, in the form of an operating framework, an engagement model, and a road map, which itself needs to be prescriptive and imminent
- The cost of your data governance program should be front-loaded, and eventually converge to zero or negative (savings)
- Eventually, data governance should disappear and be functionally replaced by Data Culture

### Operating Model
- There's levels to this shit
- Don't want to create a new committee, want to co-opt and affect the agenda of existing structures that already have a place in the authorizing environment
- There needs to be accountability for failure, which needs to be felt at a higher level of the organization
- Forget the three layer model. Typically it's four layers, but you need to figure them out, and the operating framework needs to engage with them.
- The DAMA wheel is a nice, comprehensive starting point for getting from needs to solution areas
- Ladley breaks his version of the DAMA wheel down along organizational lines. His breakdown is included in his book (linked in day 1)
- You've got to figure out how the work is going to happen.
- Data Governance Council chair interfaces between DGC and leadership, DG coordinator interfaces between DGC and operations
- Start with minimum sustainable operating model (MSOM), and shift to long term operating model (LTOM)
- You don't appoint stewards at the beginning because you don't know what they're going to do.

:::info
_Question_
- What are the differences between MSOM and LTOM? MSOM is crude, targeted at the biggest pain points, simply surviving. They differ in the number of layers and capabilities supported. MSOM is a field hospital, LTOM is an institution.
- What is a data ethics strategy? Operationalized commitment to ethical behaviour, like actually opting people out of contact when they ask to be.
:::

### Data Governance Teams
- Steward can be useful, but often means nothing
- Custodian is useful, but demeaning
- Coordinator is often used, but downplays the role.
Prefer 'Data Lead' when the person is an engine of the initiative
- Broadly, create the titles that you need
- Job titles may change, but downplay the consequences in terms of comp and org structure
- Embed a data lead in planning at the project and strategy level
- When you are starting to plan, put a Data Governance Person at the table to identify Questions, BDRs, and Governance Requirements

:::info
_Question_
- If Gov should be led internally, but literacy is a barrier, where do you start? Start by bringing in an expert to mentor the candidate, 'train em up'!
:::

### Break for IC Conference
:::danger
Fill with SK notes
:::

### Tactics
- Define terms, establish shared language and common understanding.
- DG project checklist for the PMO to use when taking on new projects
- Show value by demonstrating both effectiveness and progress
- Progress

## Thoughts
### Highlights
- Data Governance starts prescriptive and intensive, but should create Data Culture that functionally replaces it in that respect. I really like this idea, that Governance creates Culture, which changes how Governance shows up in Management. Data Governance is the parent teaching the child how to ride a bike, first pushing them and balancing them, showing them what it feels like, and later giving them strategic guidance, like "don't ride on busy roads".
- The language of requirements is helpful for the PDI, as connective tissue between gaps and solutions, and also more broadly at Blueprint, as a tool to interrogate strategic goal setting more generally. Is this an objective, or a requirement?
- Creating alignment between goals, objectives, and requirements depends first on an alignment of understanding, so one of the first steps should be sharing a glossary of data terms. For PDI, we can probably just find one to steal and give it to existing and new partners as homework.
## Glossary
- Minimum Sustainable Operating Model (MSOM)
- Organizational Change Management (OCM)

## Resources Mentioned
- [O'Keefe and O'Brien, Ethical Data and Information Management](https://drive.google.com/file/d/1Js2zVdwFRz745k70E2jX4wz3ERZ46azN/view?usp=sharing)