Data Governance Council 2021-22 Annual Report

September 8, 2022

Overview

Stony Brook’s data governance system was established in fall 2016 in order to improve Stony Brook’s data infrastructure. The Data Governance Council (DGC) oversees the data governance system, and began meeting in spring 2017. This is the fifth annual report of the DGC.


Major accomplishments

  • Extended University data asset inventory to add 29 assets and retire 6 assets for a total of 95 assets

  • Continued implementation of the Data Cookbook metadata management tool, reaching 139 approved definitions, and acquired its data profiling module

  • Launched a pilot project to dispose of sensitive data after they are no longer needed

  • Launched a review of the University's data access policy

  • Acquired a data tool to clean address data

  • Received approval to hire a data governance specialist 


Expanded data asset inventory

The University's data asset inventory was extended in 2021-22 to add 29 new data assets and retire 6 assets that are no longer in use, for a total of 95 major data assets maintained by the University. For each asset, the inventory includes information about basic contents, storage location, sensitivity level, data acquisition, data integration, linkage, data access, and reporting.

Members 2021-22

Braden Hosch, Chair

Kim Berlin, Co-Chair

Andrei Antonenko

Ahmed Belazi

Diane Bello

David Cyrille

Robert Davidson

Paula Di Pasquale-Alvarez

Lyle Gomes

Jim Gonzales

Kathleen LeViness

Tracey McEachern

Dawn Medley

Nicholas Prewett

Theresa Diemer, ex officio


High-level summary of data asset inventory (2021 n=74, 2022 n=95)

Sensitivity            2021    2022
High                   64%     58%
Moderate               23%     27%
Low                    14%     15%

Data Source            2021    2022
User input             62%     63%
PeopleSoft             54%     51%
External Org           31%     28%
Other                  16%     32%

Locations              2021    2022
SAAS/Cloud             59%     63%
On Premise Server      54%     38%
File share             9%      8%
Other                  9%      19%

Authentication         2021    2022
Stony Brook SSO        62%     61%
Asset-specific         32%     27%
Open access            4%      4%
No info                1%      2%
Other                  -       5%

Frequency (2022)       Data in    Data out
Live                   26%        12%
Multiple x/day         5%         2%
Daily                  41%        25%
Weekly                 1%         0%
Monthly                2%         1%
Periodic               13%        7%
Annually               5%         4%
None                   0%         35%
 

Are analytics available?
Yes                                               57%
No                                                42%
Unknown                                           1%

If analytics are set up, where are they housed?
In application itself                             76%
In a separate application                         24%
Other                                             13%

If analytics are set up, what features are available?
Underlying data can be downloaded                 80%
Visualizations                                    76%
Users can make selections or change parameters    57%
Data definitions are available to users           35%
Maps or other geo-spatial features                26%

If analytics are set up, which types of analytics are available?
Descriptive                                       80%
Exploratory                                       74%
Predictive                                        11%
Prescriptive                                      11%

   


Information about analytics was also collected in 2022. Just over half (57%) of data assets had analytics available; about three quarters of those housed analytics in the application itself, while about a quarter housed them in a separate system. Most analytics (76%) offered data visualizations, and about a quarter had maps or other geo-spatial features. Four out of five analytics systems allowed users to download underlying data, but only about a third provided users with data definitions. Analytics were principally descriptive (80%) and exploratory (74%), while only about one in nine (11%) featured predictive or prescriptive analytics.

Perhaps most importantly, the data asset inventory has been useful in advancing various projects, including various security reviews and the response to the Governor’s executive order about preferred name and gender.

Review and Revision of Definitions in Data Cookbook metadata management tool 

The Data Cookbook was acquired at the end of 2017-18. This metadata management tool provides a repository for data definitions and other metadata that will be integrated with existing data tools. Following established naming conventions and standard style, the data governance administrative team (Berlin, Diemer and Hosch) added 36 more definitions for a total of 139. The addition of a data governance specialist will advance this work more quickly. Also of note, Stony Brook acquired and began implementation of the data profiling features available in Data Cookbook and began discussions about integration of definitions into existing dashboards.


Data Disposition

Nick Prewett led an ad hoc group to explore purging sensitive financial aid data that was no longer needed. This group identified about 76,000 ISIR data records from 2003-2015 that could be removed with minimal impact. Next steps include deleting these records, which should be removed before the end of calendar year 2022, and creating a schedule for ongoing destruction. Subsequent steps will be to examine processes for removing application data of prospective students who did not enroll, as well as application data of prospective employees who were not hired.


Review of Data Access Policy

The DGC empaneled an ad hoc committee to review Stony Brook’s existing data access policy and propose revisions. Following a review of policies at other institutions, the ad hoc committee will consider the appropriateness of identifying data trustees for all data assets, differentiating between routine and non-routine access, data steward/custodian responsibilities for developing written policies for access, a basis for evaluating data access requests (legitimate university business, advancing mission), and re-release of data. The committee will also consider elevating this policy to be a University policy rather than a DoIT policy. Recommendations are due to the Data Governance Council by October 28, 2022.


Tool to Improve Postal Address Data Quality 

At the recommendation of the DGC, Stony Brook acquired the Runner EDQ Clean_Address application to promote data quality of postal addresses. This application will provide a real-time check on addresses entered by students, faculty, and staff and store data in a consistent format. It will allow the University to implement a regularized process to ask individuals to update their current local address without requiring manual review. The product was acquired in late fall 2021, but implementation was delayed due to higher-priority information technology upgrades and projects. The contract was renegotiated to extend Stony Brook’s use of the product to account for the period in which it has not yet been installed.


Data Governance Specialist

The Senior Executive Team (SET) approved a request to hire a data governance specialist (SL-3) to be housed in the Office of Institutional Research, Planning & Effectiveness. The position will serve as a business analyst supporting data governance activities to advance policy development and coordination, data stewardship, data definitions and standards, and communications. The position will collaborate with various stakeholders across the organization to achieve the goals defined in the enterprise data governance and enterprise data management strategies.




 

 
