Data Governance Council 2020-21 Annual Report 

September 9, 2021

Overview 


Stony Brook’s data governance system was established in fall 2016 in order to improve Stony Brook’s data infrastructure. The Data Governance Council (DGC) oversees the data governance system, and began meeting in spring 2017. This is the fourth annual report of the DGC.


Major accomplishments

  • Reviewed effectiveness of data governance functions and approved a restructured system

  • Extended University data asset inventory to include information about data acquisition, linkages, and access

  • Continued implementation of Data Cookbook as metadata management tool

  • Demonstrated improvement in the average maturity of data governance, from a mean of 2.38 in spring 2017 to a mean of 3.39 in spring 2021 on a 5-point scale

Members 2020-21


Braden Hosch, Chair
Kim Berlin, Co-Chair
Andrei Antonenko
Ahmed Belazi
Diane Bello
David Bousquet
David Cyrille
Robert Davidson
Paula Di Pasquale-Alvarez
Lyle Gomes
Jim Gonzales
Tracey McEachern
Nicholas Prewett
Theresa Diemer, ex officio


Challenges

The COVID-19 pandemic prompted a restructuring (see below) that acknowledges the administrative function behind data governance, yet this function remains under-resourced with no positions exclusively devoted to data governance activities. Without at least one position in place, such as a Data Governance Specialist, the data governance initiative will be slow to deliver value to the University. Projects that will be slower to implement are: extension of scope to all university administrative data, comprehensive implementation of Data Cookbook, development and management of the data asset inventory, the pilot to begin staged data purging from PeopleSoft, and development of strategy for further data integration.


Restructuring of Data Governance Council Organization and Scope

A review of operations and challenges from 2019-20 indicated a need for clearer lines of reporting and revealed dwindling engagement of the Functional Data Governance Committees. Based on this review, the DGC proposed a restructuring of the organization of Stony Brook's data governance system. This restructuring moves away from the David Plotkin model of data governance adopted by Stanford in the early 2010s, which relies on significant activity in shared working groups of data stewards, to a model more centered on administrative support, as adopted by the University of Wisconsin-Madison. The revised structure features:

  • Identification of Executive Sponsors for the DGC
  • Division of data stewards into domain data stewards and area data stewards
  • Elimination of the Functional Data Governance Committees
  • Explicit acknowledgment of an administrative data governance function
  • Addition of DGC members to reflect splitting of SVP of Finance and SVP of Administration into separate roles and addition of a designee from VP of Advancement
  • Expansion of the scope of data governance from just PeopleSoft and the Data Warehouse to encompass all university administrative data.


Expanded data asset inventory

The university data asset inventory was refreshed in spring 2021 to update initial information about basic contents, storage location, and sensitivity level, and was then extended to collect additional information about data acquisition, data integration and linkage, data access, and reporting. Inactive, consolidated, and new data assets were identified. In all, the University maintained 74 major data assets, as identified in the inventory at the end of the 2020-21 academic year. Sensitivity levels and storage location distributions remained relatively unchanged from the initial inventory. Data connection protocols generally represented a hub-and-spoke model, with PeopleSoft providing one-way connections to ancillary data assets and limited connections back to PeopleSoft. Almost half (46%) of data assets had no automated process to push data out, although most (78%) featured some sort of on-board reporting, while 7% were accessed primarily through direct query or ad hoc reports, and 4% were accessed through Tableau.

High-level summary of data asset inventory as of June 2021 (n=74)

Sensitivity                  Data Source               Frequency            Data in   Data out
High                64%      User input        62%     Live                 30%       12%
Moderate            23%      PeopleSoft        54%     Multiple times/day   5%        3%
Low                 14%      External Org      31%     Daily                43%       27%
                             Other             16%     Weekly               3%        0%
                                                       Monthly              3%        1%
Locations                    Authentication            Periodic             9%        5%
SaaS/Cloud          59%      Stony Brook SSO   62%     Annually             7%        5%
On Premise Server   54%      Asset-specific    32%     None                 0%        46%
File share          9%       Open access       4%
Other               9%       No info           1%

Review and Revision of Definitions in Data Cookbook metadata management tool

The Data Cookbook was acquired at the end of 2017-18. This metadata management tool provides a repository for data definitions and other metadata that will be integrated with existing data tools.

The data governance administrative team (Berlin, Diemer and Hosch) reviewed content in the tool and standardized it according to established naming conventions and style. This review and revision was two-thirds complete as of the end of 2020-21. The content reviewed is small relative to the number of definitions that still need to be created in the tool.


Demonstrated improvement in maturity of data governance

In June 2021, the DGC evaluated the university’s maturity in data governance practices using a maturity model previously deployed in spring 2017 and spring 2019. This maturity model measures practices in communication, culture, organizational structures, data quality, and roles and responsibilities across five data domains: financial, student/academic, human resources, research, and facilities. Respondents also had the opportunity to rate additional domains, but only two such responses were received (rankings data and scholarship metrics data). Maturity was rated on a 5-point scale (1-informal, 2-developing, 3-adopted & implemented, 4-managed & repeatable, 5-integrated & optimized). The survey population comprised members of the DGC, data stewards of assets in the university data asset inventory, and selected university leaders with titles such as Assistant/Associate Vice President/Provost. Thirty-six responses were received. Overall, maturity improved from the high end of the “developing” range to solidly within the “adopted & implemented” range. The change in survey layout and survey population may have contributed to some of the increase in 2021; average response values from DGC members in 2021 were about 0.2 below those of other respondents, suggesting that the overall rating could have been closer to 3.2.

Bar chart showing data governance maturity scores by category for 2017, 2019, and 2021. “Overall” has scores of 2.4, 2.8, and 3.4. “Roles and Responsibilities” shows 2.0, 3.0, and 3.6. “Culture” has 2.8, 2.9, and 3.4. “Data Quality” shows 2.8, 2.9, and 3.4. “Organizational Structures” has 2.3, 3.1, and 3.3. “Communication” shows 2.1, 2.5, and 3.2. Each category displays three colored bars, one per survey year.

Bar chart comparing data governance maturity scores across data domains for 2017, 2019, and 2021. Student scores increase from 2.4 to 3.2 to 3.9. Research dips from 2.5 to 2.4 before rising to 3.5. Human Resources improves from 2.4 to 2.8 to 3.3. Financial increases from 2.8 to 3.0 to 3.2. Facilities rises from 1.8 to 2.8 to 2.9. Each group shows three colored bars representing the different years.
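
The gains by category can be tabulated directly from the chart values. The following is a minimal sketch, assuming the rounded scores transcribed from the first chart description above; the names SCORES and change_since_2017 are illustrative and are not part of the DGC's survey tooling.

```python
# Rounded mean maturity scores (5-point scale) for spring 2017, 2019, and 2021,
# transcribed from the category chart description in this report.
SCORES = {
    "Overall": (2.4, 2.8, 3.4),
    "Roles and Responsibilities": (2.0, 3.0, 3.6),
    "Culture": (2.8, 2.9, 3.4),
    "Data Quality": (2.8, 2.9, 3.4),
    "Organizational Structures": (2.3, 3.1, 3.3),
    "Communication": (2.1, 2.5, 3.2),
}


def change_since_2017(scores):
    """Return the 2017-to-2021 change per category, rounded to one decimal."""
    return {name: round(vals[2] - vals[0], 1) for name, vals in scores.items()}


if __name__ == "__main__":
    # List categories from largest to smallest gain.
    ranked = sorted(change_since_2017(SCORES).items(), key=lambda kv: -kv[1])
    for name, delta in ranked:
        print(f"{name:<28} +{delta:.1f}")
```

Because the inputs are rounded to one decimal, the differences are approximate; for example, the unrounded overall means reported above (2.38 and 3.39) give a gain of 1.01.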





 
