Data Science

Advances in technology have led to exponential growth in the amount and complexity of data. We are at the threshold of an era in which hypothesis-driven science is being complemented with data-driven discovery. The data collected are complex in size, dimension and heterogeneity, and they provide unprecedented opportunities for new discoveries in theoretical and applied research. Broadly, data scientists extract meaning from this wealth of data to generate critical insights that drive decision making and innovation. They combine computing, statistics, mathematics, visualization, software development and domain knowledge to make inferences from various forms of data (including but not limited to numerical, text, audio, and visual data). Thus, data science involves engineering, reproducibility and provenance, and often includes software development and design.

About the Data Science Initiative

The Data Science Initiative (DSI) at UC Davis leverages existing knowledge to tackle complex problems, develops new tools to do so more effectively, and teaches their implementation and development. As a cross-university activity, the DSI fosters, promotes and facilitates data science to accelerate discovery at the frontiers of scientific, engineering and social disciplines. The DSI partners with researchers across UC Davis to push the envelope both within and across disciplines to perform qualitatively novel, interdisciplinary research. By combining techniques across disciplines, we solve problems in ways that further both the application and the theory of data science.

Data scientists at the DSI engage in all activities involved in working with data (i.e., identifying, acquiring/accessing, processing, transforming, exploring, modeling, summarizing, visualizing and interpreting data). In addition to this data pipeline, we focus on generating and developing novel questions and approaches. Our training and education in data science focus on the theory, methods, process and tools for working with, interpreting and applying data. To that end, we cultivate curiosity, creativity, communication and critical thinking skills in students and researchers across the University.

The mission of the DSI is to:

  • Promote academic and research excellence through quality programs, engaged researchers, and an innovative research and learning environment.
  • Meet the growing industry and academic need for graduates with data science skills.

The primary goals of the DSI are to:

  • Make qualitatively new research possible,
  • Accelerate data-driven exploration, and
  • Train researchers throughout their careers to be skilled at working with data at all stages of the analysis pipeline.

Our core values are excellence in learning, discovery and engagement to promote curiosity and innovation in the development and application of data science. To that end, the DSI complements existing and developing educational programs to facilitate research in data science and provide a niche for students, faculty and professional researchers seeking to move beyond their individual programs to identify and exploit new opportunities.

What we do

The DSI is a bottom-up organization adapting to UC Davis' needs and opportunities, rising to fill identified gaps and support institutional excellence. As a hub of emerging activity on campus related to statistical and machine learning, technology developments, and interdisciplinary data-related research and education, we:

  • Hold informal seminars to discuss new technologies and methods, and brainstorm about problems in different domains.
  • Run interdisciplinary seminar series on cutting-edge aspects of data science.
  • Offer practical workshops on data science, HPC and Big Data topics, methods and technologies.
  • House researchers from different disciplines in a shared space to foster interactions, ideas and solutions.
  • Foster interdisciplinary research using data science as a common pillar.
  • Work with faculty from different departments and develop new courses covering data science skills, methods, and practice.
  • Facilitate new programs (minors, majors, designated emphases, degrees) in data science and link courses and resources across campus.
  • Connect students with internship opportunities.
  • Assist in other campus efforts to strengthen computing infrastructure, training and education for UC Davis and the UC system.

DSI Services

The Data Science Initiative provides education, advice and collaboration services to UC Davis researchers (faculty, postdocs, graduate students, staff) on all aspects of the data science process, including:

  • Developing research questions and proposals
  • Using data science tools (see languages
  • Obtaining and accessing data (e.g., from Web pages, APIs, databases)
  • Structuring data for analysis (e.g., relational databases, NoSQL, text search engines, Hive, Pig)
  • Data management, curation, security and privacy
  • Approaches for cleaning and transforming data
  • Data analysis via statistical and machine learning and modeling
  • Visualization for exploratory data analysis and presentation of results (web based visualization, dashboards, etc.)
  • Computational issues (e.g., parallel computing, algorithmic and software development)
  • Data sharing and availability (via Web APIs, bulk download, etc.)
  • Reproducibility and provenance
  • Other data science education and implementation resources at UC Davis