Industry is experiencing a dramatic increase in the use of data warehousing techniques, but higher education has been slow to follow suit. This paper illustrates some common uses of data warehousing in industry, provides higher education analogues and suggests the potential value of data warehousing techniques to colleges and universities.
The use and value of data warehousing has traditionally been defined in terms of manufacturing and/or sales entities. As higher education has grown through the years, educational institutions have developed into large businesses in and of themselves (Desareseax, 2000). This change has resulted in more business-like management of these institutions as well (Lazerson, Wagener, & Moneta, 2000). To gain an appreciation of how the data warehousing process can benefit institutions of higher education, we can take the many examples of the benefits manufacturing and sales organizations have enjoyed from this process and apply them to the educational setting. This application will help to provide a more clear understanding of the value data warehousing can provide to higher education institutions.
Data warehousing has emerged as a solution to the needs of many end users of data today. Many of these users are frustrated by what they feel is a lack of acceptable access to corporate-wide data from which to obtain information which can help them make better decisions. Data warehousing is not a single solution, but rather a collection of products, processes, and technologies which, when used together in a unified approach, can provide end users with an answer to the problem of sufficient data access. To understand data warehousing it will help to contrast traditional data systems with data warehousing. Traditional data systems concentrate on daily operational systems such as transaction processing and recording of daily transactions, whereas data warehousing concentrates on analytic processing, examining historical data previously recorded to discover value. A traditional organizational approach to data would include numerous independent data systems, each having one of many possible formats, and including unstandardized and inconsistent data.
The term data warehousing includes the creation and maintenance of the data warehouse and the process of obtaining useable information from the stored data. In simple terms, data warehousing is a methodical approach to the process of collecting data of various types internal and external to the organization, recording it in one common format, and providing end users within the organization the ability to easily query the data in multiple ways to discover value within the data.
There are 3 primary benefits immediately apparent to the end users of a data warehouse process.
1. Standardized and consistent data
2. User-friendly data
3. Ease of access to data
In order to understand these benefits, a discussion of the creation of the data warehouse itself and the value that comes from this process is useful. As stated earlier, the data warehouse is built from data of various types both internal and external to the organization. This is where the first measure of value is to be gained from data warehousing. Gaining value from data is a difficult task and is made much more difficult when data is stored in various technical formats across multiple systems, which do not communicate with each other and cannot work with other, non native data formats.
Standardization and consistency are essential for a successful data warehousing effort. Standardization is achieved when designing, creating and loading the data warehouse by examining the various incoming data sources for variations in definition or data type. Typically, older data systems placed a premium on reduced data size, as disk storage was very expensive. This encouraged the use of codes as substitutes for many pieces of data. Recording an individual's sex is typical example of this. Some systems might record a code for sex in a field defined as sex, or sex type, or gender, and each of these might have a different meaning as they source from various systems.
Codes for sex might be a numeric type, representing males with a 0 and females with a 1 or perhaps representing males with a 1 and females with a 2. A code might also be an alphabetic type, representing males with an M, and females an F. Standardization involves recoding all of these different formats into a single code.
Inconsistency of data values recorded in each of these 'codes' can be also illustrated. For example in the alpha sex code, the letter values might be capitalized or not capitalized, upper and lower case having different values at the machine level. In one system, the value of 'y' was stored in the sex code to represent 'womyn'. One of the best examples of inconsistency in data is illustrated in Planning and Designing the Data Warehouse (1997), which states that 'AT&T, ATandT, and ATT are just three of the more than 60 ways this company's name has been spelled".
data has been examined and efforts taken to address standardization
and consistency, end users of the data will find it easier to
work with. They will have fewer doubts as to its accuracy and
intended meaning, and will be able to discuss the data with other
end users in greater confidence and understanding. Extracting
and combining data from various dissimilar systems into one comprehensive
standardized and consistent data store and technical format, makes
the discovery and reporting of important information a much easier
task, and thus provides value.
A user-friendly format for the data is also an essential benefit of the data warehousing process. When extracting data from its native system and loading it into the data warehouse, a premium is placed on translating the data into a very user-friendly format. Such a format would be readily consumable by both end users and decision-makers. Consumption refers to how the data is organized in the data warehouse and presented to the users for viewing. In a traditional data system there may be data stored in systems such as the accounts payable system, the accounts receivable system and the member records system. In these systems sales and revenue normally may mean different things to different individuals, depending on their view. Once extracted from those native systems, data would be formulated into something easier for a user without specialized accounting knowledge to comprehend, and analyze. Such a structure might take the form of a single member transaction file within the data warehouse, encompassing payments received, refunds, and changes to the member's information.
Logical user-friendly data structures representing data the way users are accustomed to conceptualizing it allow a more logical approach to querying the data. Users do not need to learn or understand complicated relationships between dissimilar files or database tables. They can view the data the same way they think about it or work with it on the transactional side.
Once the data warehouse has been constructed, ease of access to the data contained within the data warehouse is a third benefit of data warehousing, and ease is provided by the reporting tools selected. Rather than requiring the specific knowledge of a multitude of various data gathering, processing, storage and reporting systems that an information systems department must possess to support daily transactional systems, users must only learn reporting tools which are designed specifically to access the data warehouse and are designed to be user oriented and user-friendly. Learning these data warehousing reporting tools provides the users with the skills necessary to derive valuable information from the data as they desire.
Reporting tools for data warehousing provide several types of access to the data, the most important of which is the ability to quickly query the data on an ad hoc basis. This frees the user from the constraints and delays involved in requesting the organizations Information systems department and professionals to use more complicated reporting tools and processes to provide the information the users are interested in. Rather than making such a request, users themselves can use the query tool and access the data directly, creating a query, analyzing the results, reformulating the query and reexamining the results on an ongoing basis to find the information they seek. Once end-users have found the information they seek within the data of the data warehouse, they can reformulate the quick queries they have created into standard reports. These reports can then be run on a scheduled basis, or triggered by exception processing, and delivered to the appropriate user groups for decision making.
As with all types of commerce, competition among educational institutions is increasing and undergoing rapid change. For educational institutions, the time available to set up educational programs to satisfy the needs of businesses and students is becoming shorter. The Internet has provided a radical change in approach to education, and educational institutions everywhere are beginning to use this medium to reach potential students around the world (Tomcheck, 1998). Those institutions that can identify trends faster than competitors will have an advantage in developing new programs, attracting students and forming beneficial relationships with businesses in their communities. As with any new opportunity, the early adopters are likely to become the market leaders and set the standard for others to follow. Consumers of these services are likely to increase as the realization of their value becomes more widespread. Data Warehousing can provide assistance in dealing with competition in an Educational setting as well as in other types of commerce.
The prior discussion establishes a basic understanding of what data warehousing is, and some of the immediate benefits visible to end users that can be derived from the data warehousing effort. The following specific examples of the use of data warehousing show how educational institutions can also use data warehousing, and some of the value that can be obtained. The business examples used will come from different companies and situations, but for the educational equivalent example, let us consider a set of small fictional universities in hypothetical situations paralleling the company examples given.
Technical University (Tech) is a technology campus offering 4-year degrees in many technical fields. Waldun University (Waldun) is a prestigious, expensive private business college located 30 miles west of Tech. In between the two lies State University (State), a public university primarily offering business, education and engineering degrees. State University was the first of the three to initiate a data warehousing process, and has recently completed the first stage of this project. Several inputs into the data warehouse database were student information, held by the admissions department in an Oracle database, course enrollment data, from relational databases and indexed files maintained by each individual college, and graduate placement data in spreadsheet form from the placement office. Also included were the results of a region wide employer survey commissioned by the university to project employment needs through the next 5, 10, 15 and 20 years, as well as state-wide demographics (obtainable in print form only).
All of these data sources were successfully extracted, cleansed and, standardized into a DB2 relational database. The information was denormalized and stored in a user-friendly format that links each student's individual data to their college, all courses they have enrolled in, and the demographics for each student's family's residence. This format can be easily viewed and understood by end users in various departments.
Identifying Market Trends
Retail establishments have several main goals. They seek profit and to provide the goods and services their customers need. They also need to set themselves apart from competing retailers (Data Warehousing for the Retail Industry, 1998). To accomplish these goals, they must first be aware of market trends.
Fingerhut, headquartered in Minnetonka, Minnesota, is a two billion dollar business with a data warehouse built from data accumulated over the past 50 years. This huge database is the key to Fingerhut's success in mail order sales. "We've never done business without some form of database marketing." says Andy Johnson, senior vice president of market development. "It's the heart and soul of this company, and we wouldn't be in business without it." (Pearson, 1998). Fingerhut's marketing department researched purchasing trends and discovered that after a change in residence, customers were likely to triple their purchases over the next 12 weeks. Furthermore, they found that most of these purchases were made in the first 4 weeks after the change, and followed a set pattern. Furniture was first, followed by telecommunications equipment and decorations. They also found that certain items like home electronics and jewelry did not sell well to these customers during this time. This was a very important trend for Fingerhut to clearly identify. With this information Fingerhut created a specialized "mover's catalog" for those who had changed residences, and saved money by not sending other catalogs during this time frame.
What Fingerhut learned is that marketing needs to search for and identify all types of customer subsets in order to maximize sales and minimize costs. Their data warehouse is a vital key to that process.
The president of State University became aware of declining revenue from student enrollments. He asked the finance department to justify the recent investment in the new data warehouse by using it to uncover the reason for declining enrollment.
The finance department was already aware that enrollment in the Junior and Senior had declined from sophomore year enrollment figures. The finance department head belonged to the same professional organization as the comptroller of Tech, and through informal conversations on the subject had been aware that Tech had experienced a large increase in their enrollment over the prior two years. The finance head suspected that the lure of technology was pulling students away from State and into Tech, and thought this was compelling information which explained the falling enrollment revenue. The finance head saw no value to be gained from the data warehouse, but because State had recently entered into a agreement with both Tech and Waldun University to share student, enrollment, and placement data, decided to have an associate run a few queries just to confirm his suspicions.
The results surprised him. Queries against the data warehouse showed both the decline in enrollment from State in the Junior and Senior years and the large increase in enrollment at Tech. But they also showed an increase in enrollment in the junior year at Waldun University. Additionally, the inclusion of demographic data allowed additional queries to show that while some students leaving State were enrolling at Tech, almost half of those leaving were enrolling at Waldun University, and came from middle and lower income families. Most of those transferring from Waldun University were enrolling at Tech and came from mixed economic backgrounds.
At that point the finance head reconsidered his earlier opinion about the value of the data warehouse, as it had just prevented him from voicing a mistaken assumption to the President of the University. Soft analysis then took over and confirmed a long-held theory that middle and lower income students who desired a degree from a more prestigious university often enrolled at a more financially affordable public university and transfer after they have completed their basic educational requirements. The lure of technology explained half of the lost enrollment, as well as the shift in enrollment from Waldun University to Tech. The finance department then had the information necessary to explain the falling enrollment and corresponding revenue to the President, and they identified those market trends in a very short amount of time.
Database/Precision Marketing is a business activity whose goal is to leverage the data that the organization has about Consumers in order to develop outbound programs and offerings targeting specific Consumer segments...eventually at a market of one. Database/Precision Marketing necessitates the integration of demographics, Consumer data, geographic data and marketing experience (Makos & Schmarzo, 1999).
At Fingerhut, marketing discovered a correlation between customers who exhibited a greater response to catalogs in Spanish and those who purchased larger amounts of Jewelry, regardless of geographic location. This resulted in sending these customers catalogs printed in Spanish and an increased focus on fine jewelry. "This may seem simple, but it took a lot of querying and a lot of weeding out to arrive at the final mailing list for that catalog," says Andy Johnson, senior VP of market development. (Pearson, 1998). These specialized catalogs resulted in a 40 percent increase in sales over normal from customers who received them.
President of State U. received the results from the finance head,
reviewed the data, and agreed that they had uncovered some interesting
market trends. He then asked the Marketing arm of the university
to provide more detail on these trends, and to develop an approach
to reverse the declining enrollment. The Marketing department
accessed the queries created and saved by the Finance department
and modified them to show which colleges the students were enrolled
in before they transferred out. The results showed that the students
transferring out of State and into Waldun University were primarily
from the business college, while those who transferred to Tech
came from the college of education, which lost half its students.
However, of the technology majors in the college of business,
none had transferred out. Questionnaires were sent out to students
in the colleges of business and social sciences to determine why
they would or would not consider transferring to either of the
other institutions. The responses were loaded into the data warehouse
and additional queries were run. The results showed that over
80% of those transferring to Waldun University were doing so simply
to acquire the degree from a more prestigious university. The
majority of those transferring to Tech out of social sciences
were doing so in order to obtain a technical degree. The technology
majors from the college of business stressed that they did not
consider transferring because they were aware of the high quality
of their technical education at State. The students stated that
this was evidenced by beginning salaries from local employers,
which ranged 10 to 20 percent higher than those offered to graduates
of Tech. The marketing department at State University then had
all the information it needed to propose a precision marketing
campaign targeted to its current students in the college of education.
This campaign would stress two issues. First, that State itself
provided degrees in technology through its college of business,
and second, that these degrees were more highly valued in the
business world as evidenced by the starting salaries. Now State
had a plan of action to begin stemming the flow of transfers to
Data warehousing is used in many phases of new product development and evaluation. In some cases data warehousing is used to predict markets for new products, new products for existing markets, or to modify products to meet consumer demand (Krauter, 1998).
Land O'Lakes used its data warehouse to justify the benefits to grocers of adopting new Land O'Lakes product lines. Queries run on the warehouse immediately show the results of competitors' decisions to carry Land O'Lakes products, and dramatically increased the ease of fighting for shelf space with other suppliers (Krauter, 1998).
State's president was pleased. It seemed the data warehouse was performing as advertised and provided various departments within the university with timely access to vital information with which to make management decisions. As he reviewed the results of the queries against the data warehouse and the marketing proposal, he wondered if there wasn't something more that could be done. The data warehouse had provided great detail to this point, so he directed his deans to work together to determine additional steps. Once again focus returned to getting more detail out of the data warehouse. Various departments ran queries and held meetings, finally returning to the president's office with the following information.
Of those students transferring to Tech, 75 percent were enrolling in at least one telecommunications class at Tech. And although neither Tech nor State offered a degree in Telecommunications, the employer needs survey predicted a 50 percent increase in the need for telecommunications specialist over the next 10 years. Also noted was that State currently offered several telecommunications classes. As for those students transferring to Waldun University, the perceived Prestige of Waldun seemed to be the primary factor, as additional research showed that starting salaries for Waldun graduates were only five percent higher than State's.
The deans recommended the following courses of action. First, that State's college of business add more telecommunications classes and offer a new telecommunications technology degree. This information would be included in the precision marketing to current students as well as stressed in the recruitment of new students. These actions would capitalize on the increasing demand for telecommunications graduates and exploit the lack of equivalent degree programs at the other universities. Second that the University should begin a public relations/marketing campaign to persuade students of the economic benefits of attending State in tandem with an effort to improve the actual and perceived quality of the education provided through the business college. This would allow State to compete more equally with Waldun University while exploiting its cost efficiency.
The use of a data warehouse presents several potential advantages for higher education institutions, including timely access to evaluate data. Institutions can quickly identify troublesome trends in its enrollment and evaluate why they are occurring. They can determine strengths and weaknesses within themselves as well as their competitors. They can associate this information with trends in the market place and desired products, and are able to identify precise targets for marketing efforts. The use of a data warehouse allows an educational institution to use this information in making appropriate decisions, and that is the desired end goal of the data warehousing process.
Data Warehousing for the Retail Industry. (1998). Informix Corporation. http://www.dmreview.com/whitepaper/dwe.pdf.
Desruisseaux, P. (2000). Universities Venture into Venture Capitalism. The Chronicle of Higher Education, A44. May 26, 2000.
Krauter, A. (1998). Land O'Lakes Speeds Strategies to Market. Data Warehousing: What Works? http://www.dw-institute.com/cases/2land.htm
Lazerson, M, Wagener, U. & Moneta, L. (2000). Like the Cities They Increasingly Resemble, Colleges Must Train and Retain Competent Managers. The Chronicle of Higher Education, A72. July 28, 2000.
Makos, R. & Schmarzo, S. (1999). On the Path to One to One Marketing. http://www.dmreview.com/master.cfm?avID=61&WhitePaperID=123.
Pearson, D. (1998). Marketing for Survival: Good Uses of a Data Warehouse Can Mean the Difference Between Staying in Business or Chapter 11. CIO Magazine. http://www.cio.com/archive/041598_finger.html.
Planning and Designing the Data Warehouse, (1997). The Data Warehousing Institute. Upper Saddle River, NJ: Prentice Hall PTR.
Tomcheck, D. (1998). Java-Enabled Web Warehousing and UC Irvine: Excellence in Education. Data Warehousing: What Works? Vol. 5. http://www.dw-institute.com/cases/5uci.htm.