Ariyo Sanmi, PhD

DATA SCIENTIST | MACHINE LEARNING ENGINEER

UC Berkeley Course Catalog Analysis

Data Visualization Capstone Project — Udacity

Introduction

The goal of this project is to improve on the published visualization of the UC Berkeley Course Catalog from 1900–2011, which was analyzed in the CSHE Research and Occasional Papers Series. The data used for this analysis can be found on the MakeoverMonday page.

Challenges with the original visualization:

  1. The relationship between the California universities was well presented graphically, but the pie charts used to display the courses make it difficult to compare the share of each school closely. Pie charts are generally considered to work best when the data has no more than five categories.
  2. There is no dashboard that lists the names of the available courses and can be filtered interactively by year and college.

My Recommendation – Create a data story visualization:

This post gives a complete breakdown of both the schools and the courses available at UC Berkeley from 1900–2011. As of 2011, there were a total of 6 schools, 90 academic departments and 306 field programs at the University of California, Berkeley (UC Berkeley). The three schools and colleges with the most students are Humanities, Professional and Social Sciences.

Since 1967, when this data was recorded, the school has continued to grow academically, incorporating the latest advances in all spheres of study. This growth can be seen in our animated dashboard, which shows how the courses offered have changed over time.
Changes in course availability in the engineering faculty over time

 

Each faculty consists of several departments, which offer different fields of study. In the engineering faculty, for example, there are several academic departments with varying numbers of students. These departments and their relative shares are shown below.

Departments in the engineering faculty
Furthermore, each department offers an even larger number of programs, each with its own research focus and specialization.

Limitation – Data Processing:

The major limitation of the data is that some features have missing values, potentially of the Missing Completely at Random (MCAR) type. This was handled either by pairwise deletion, where rows are dropped only when they lack a feature that the analysis at hand actually needs, or by imputing the missing values with the mean or the mode of the feature.
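
The two strategies can be sketched roughly as follows. This is a minimal illustration in plain Python, not the processing actually used for the dashboard: the rows, the column names (`year`, `college`, `courses`) and the helper functions `pairwise_complete` and `impute` are all hypothetical.

```python
from statistics import mean, mode

# Hypothetical rows; column names and values are illustrative only.
rows = [
    {"year": 1900, "college": "Engineering", "courses": 12},
    {"year": 1950, "college": None, "courses": 40},
    {"year": 2011, "college": "Engineering", "courses": None},
    {"year": 2011, "college": "Humanities", "courses": 20},
]

def pairwise_complete(rows, columns):
    """Keep the rows that have values for the columns this analysis needs."""
    return [r for r in rows if all(r[c] is not None for c in columns)]

def impute(rows, column, numeric):
    """Fill gaps with the column mean (numeric) or mode (categorical)."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed) if numeric else mode(observed)
    # Build new dicts so the original rows are left untouched.
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

# Pairwise deletion: a missing "college" does not matter for a per-year count.
by_year = pairwise_complete(rows, ["year", "courses"])

# Imputation: mean for the numeric column, mode for the categorical one.
filled = impute(impute(rows, "courses", numeric=True), "college", numeric=False)
```

Pairwise deletion keeps more rows than dropping every incomplete record, while mean or mode imputation keeps all rows at the cost of flattening the variation in the imputed feature.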

Conclusion

In this write-up, we have presented an improved visual analysis of the UC Berkeley Course Catalog from 1900–2011.

1. We observed that other chart types can be more suitable than pie charts for the inter-faculty analysis, since they display the features more clearly.

2. We presented the available departments and how their shares are distributed.

3. Lastly, we studied the available programs for each college and department.

The analysis presented here could help interested readers use the dashboard to find a suitable course of study for themselves. With that in mind:

Which field of study in the catalogue do you find most interesting?

For an unobstructed view of this analysis, check the visualization available on my Tableau Public account here. The Medium post of this article can also be found here.
