Jul 2013
Business Intelligence @EDU
I am here in Jupiter, Florida, where the temperatures are much better than in the Northeast. Every night since 7/18/2013, we have been going for a walk on the beach in search of baby sea turtles, which hatch at this time of year. We saw several of them. We also saw some birds waiting to feast on these turtles, but they failed because of our presence. Though these turtles can live up to 80 years, they obviously have to escape these birds and other predators first. As a bonus, we also got to see a lot of green turtles come ashore to lay eggs. We waited silently and patiently until they began digging a hole, after which we went closer to watch the 45-minute process of laying the eggs, masterfully navigating out of the hole, and covering it up with sand. One morning, we spoke to volunteers who mark the new nesting spots (of course, they also report them using GPS) and dig up the holes from which the babies have already hatched. We saw them dig up a hole and pick up the remains. They do an approximate count of how many eggs hatched and how many did not. We even saw one hole where a baby was still alive but struggling to get out. They take such babies to the Loggerhead Marinelife Center in Juno Beach. I was very skeptical about the whole thing because of our recent disappointment with the cherry blossoms in Washington, DC. It is all about timing, and I was thrilled by this experience. Since we are strongly advised against the use of flashlights, we have very few videos and photos (the quality is poor, as you can imagine).
During my recent presentation at the NERCOMP workshop on data governance and business intelligence, we talked less about the technologies than about the need to use the data we have intelligently. The use of data in business has progressed at a much faster pace than in Higher Ed. As you know, the use of data in business goes beyond technologies. There are serious privacy and ethical issues. We won’t go there. But, in case you are not aware of it, I am fascinated by this story on the use of data by Target. By analyzing the buying patterns of women, they can predict with high confidence which women are in their second trimester of pregnancy and send them marketing materials appropriate for a new baby. They looked at the buying patterns of one young woman and started sending such marketing materials to her. Her father was puzzled by all of these mailings and denied his daughter’s pregnancy but, after a few months, realized that his daughter was indeed pregnant!
So why is it that Higher Ed is behind on Business Intelligence (BI)?
First off, I know that many dislike the notion that Higher Ed is a business, while many others claim that this is what Higher Ed has become. Let us leave that debate aside and refer instead to the principles of BI. In an article I wrote a while ago, we preferred the term “Academic Analytics”.
Unlike in business, in Higher Ed the lack of agreement on common data definitions is the main cause of the problems. In simple terms, who counts as a full-time student is defined differently by the Registrar than by Financial Aid or Student Life. Of course, this is an oversimplification, but it is generally true. These definitions come from legitimate business needs, so what typically happens is that each business unit extracts data from institutional databases and maintains its own mini database with whatever definitions are required for it to conduct its business. You can immediately see the issues with this model. First, reconciliation of data becomes next to impossible. Second, resources are spent on making sure that the data is maintained well in the departmental databases rather than centrally.
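To make the reconciliation problem concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the two definitions, the 12-credit threshold, the aid_eligible_credits field, and the sample students are my assumptions, not the actual rules of any office.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrollment:
    student_id: str
    credits: float               # total credits registered this term
    aid_eligible_credits: float  # hypothetical: credits that count toward aid


# Hypothetical definition used by the Registrar: 12 or more registered credits.
def registrar_full_time(e: Enrollment) -> bool:
    return e.credits >= 12


# Hypothetical definition used by Financial Aid: 12 or more aid-eligible credits.
def financial_aid_full_time(e: Enrollment) -> bool:
    return e.aid_eligible_credits >= 12


# Made-up sample data.
students = [
    Enrollment("S1", credits=16, aid_eligible_credits=16),
    Enrollment("S2", credits=12, aid_eligible_credits=9),
    Enrollment("S3", credits=8, aid_eligible_credits=8),
]

registrar = {e.student_id for e in students if registrar_full_time(e)}
fin_aid = {e.student_id for e in students if financial_aid_full_time(e)}

# Students that one office counts as full time and the other does not.
disagreements = sorted(registrar ^ fin_aid)

print("Registrar full-time count:", len(registrar))
print("Financial Aid full-time count:", len(fin_aid))
print("Students the offices disagree on:", disagreements)
```

Both offices start from the same three records yet report different full-time counts. Once each office keeps its own extract, tracing a mismatch like this back to a definition becomes very hard.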
What is the best way to solve this problem? This is where governance, not technology, becomes the most important piece of the puzzle. Before data can be used intelligently, one needs to be able to trust the data, both its quality and its clarity. Technology, as always, is the easy part. In simple terms, systems like Banner that are optimized for collecting data are not the right systems for reporting or pulling data out. They are constructed in ways that make the process of pulling data out very slow. Ralph Kimball developed the theoretical framework for solving this issue: dimensional modeling. Again, in simple terms, one can use these principles to extract the data and create a data warehouse that conforms to the star schema, which is core to dimensional modeling. Before I get carried away, you get the idea. Technology is pretty advanced and is not the problem. If your data is unclean, no technology can fix it!
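For readers who have not seen a star schema, here is a small sketch using Python's built-in sqlite3 module. The table names (dim_student, dim_term, fact_enrollment), the columns, and the rows are all made up for illustration; this is not the design of our warehouse or of Banner. The idea is one central fact table holding the measures, surrounded by small dimension tables that describe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table (enrollment credits) keyed to two dimension tables.
# All names and values here are hypothetical.
conn.executescript("""
CREATE TABLE dim_student (
    student_key INTEGER PRIMARY KEY,
    class_year  INTEGER
);
CREATE TABLE dim_term (
    term_key  INTEGER PRIMARY KEY,
    term_name TEXT
);
CREATE TABLE fact_enrollment (
    student_key INTEGER REFERENCES dim_student(student_key),
    term_key    INTEGER REFERENCES dim_term(term_key),
    credits     REAL
);
INSERT INTO dim_student VALUES (1, 2015), (2, 2015), (3, 2016);
INSERT INTO dim_term VALUES (1, 'Fall 2013'), (2, 'Spring 2014');
INSERT INTO fact_enrollment VALUES
    (1, 1, 16), (2, 1, 12), (3, 1, 14), (1, 2, 15), (3, 2, 16);
""")

# A typical reporting query: join the fact table to its dimensions
# and aggregate the measure by dimension attributes.
rows = conn.execute("""
    SELECT t.term_name, s.class_year, SUM(f.credits) AS total_credits
    FROM fact_enrollment AS f
    JOIN dim_student AS s ON s.student_key = f.student_key
    JOIN dim_term    AS t ON t.term_key = f.term_key
    GROUP BY t.term_key, t.term_name, s.class_year
    ORDER BY t.term_key, s.class_year
""").fetchall()

for term_name, class_year, total_credits in rows:
    print(term_name, class_year, total_credits)

conn.close()
```

Reports become simple joins from the fact table out to the dimensions, which is why this layout suits pulling data out better than a system designed for data entry.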
Governance is especially important in Higher Ed. In businesses, whose existence is based on profits, all decisions are driven by the need to maximize profits. Decision making in Higher Ed typically follows a very different set of rules and dynamics.
In Higher Ed, this requires someone to rally all the users of data, get them to agree to common data definitions, and get them to commit to cleaning the data in central repositories and keeping it clean. This is a lengthy process. We have done this successfully at Wellesley for student data and are pretty close to opening up the student data warehouse to some power users. For student data, the governance committee had representatives from Admissions, Registrar, Student Life, OIR and LTS. We have also completed the Finance data warehouse, where the governance is much simpler in that only the finance and budget offices were involved and the definitional complexities are far fewer.
At the NERCOMP meeting I was wondering aloud about the governance. How do we make sure that these important institutional decisions stick? It is obvious that the turnover of administrative staff in Higher Ed is much greater than it used to be. How do we make sure that the new administrators remain as committed to these collective decisions as the previous ones? I don’t necessarily know the answer, but this remains an important, yet unanswered, piece of the puzzle.