data warehousing in higher education



Data warehousing and mining usage in the higher education innovation and improvements for various stakeholders

Uma Pavan Kumar#, Narayana Swamy*

#-Research Scholar (PEC), Associate Professor, AIMS Institutions, Bangalore [email protected]

*-Research scholar (BU), Associate Professor & HOD, Presidency College, Bangalore, [email protected]

Abstract

Quality in higher education and the integration of technology with higher education are two different concerns. Quality is concerned with maintaining the standards and policies of universities, colleges and the regulatory bodies of higher education, whereas the integration of technology establishes new products and services that strengthen education for the stakeholders of the education system. Research is associated with quality and efficiency in higher education; without research in science and technology we cannot expect quality and innovation in higher education. This article deals mainly with the aims and goals of higher education, the current situation in the sector and where we stand in it, and the problem areas in the field, and it proposes some solutions that benefit quality improvement in higher education and carry it to the stakeholders. In particular, we present data warehousing and data mining scenarios to explain the integration of technology in higher education. The biggest advantage of a warehousing environment is that data ranging from flat files up to complex formats, such as web-based uncertain data, can be migrated to the format required by the target audience. Acquiring the data, pre-processing it according to the requirements and loading it into the repositories are the major concerns of data warehousing. Once the data has been populated into the warehouse, strategic decisions can be generated for various levels of data and users. For simple analysis we can use online analytical processing (OLAP), through which reports and graphical representations of the data can be generated. If the data in the warehouse contains hidden knowledge and hidden patterns, however, we need data mining, which performs knowledge discovery in databases and data warehousing environments through pattern recognition, classification, clustering and association; in some cases, such as the handling of uncertain data, we can make use of Bayesian formalisms. We additionally describe the handling of the various contexts of the data, the estimation of the amount of data to be handled, and the verification of certain and uncertain data: pattern recognition can handle certain data, but data from web-based environments is in uncertain formats, for which soft-set computing is one solution. Text mining approaches such as keyword searching and summary estimation of the given content are also described in this paper.

Keywords: Flat file, Higher education, Technology, data population, patterns.

Introduction

Post-independence, India has witnessed enormous growth in its higher education. Still, in the higher education sector the country is far behind China and the United States in GER (Gross Enrolment Ratio). In 1950, there were around 700 colleges and 16 universities in India. According to the UGC (University Grants Commission) publication "Higher Education at a Glance-2012", in 2011 there were around 33,000 colleges and about 700 universities in India. Therefore, in order to achieve the target of 30% GER (i.e. enrolment in undergraduate courses of around 30% of the students who have finished 12 years of education), about 1,500 more universities are required. The question is whether India needs just more universities, or accredited universities that reinstate the quality of education. According to the UGC, the following are the enrolment percentages in various fields of study.

The current discussion in this paper is how improvements and innovations in higher education are possible through technology. To demonstrate that, we chose data warehousing and data mining methods to process the data for the stakeholders of higher education. The organization of this discussion is as follows. Section I reviews the literature on higher education; the top higher education institutes in India and the student-faculty ratio across the world are explained. Section II presents some learning centres for higher education improvement. Section III explains the drawbacks in the higher education sector. Section IV explains the contribution and methodology of data warehousing and data mining in the innovation and improvement of quality in higher education. Section V gives the conclusion and the future research scope for improving higher education in the context of data warehousing and data mining technologies.

I. Literature Review

There is much literature on the involvement of technology in improving the quality of higher education. The first source we observed was a conference report and the views of intellectuals on higher education. Prof. Yashpal Sharma committee’s report stated that the existing number of universities (around 400) and colleges (around 20,000) is definitely not able to cater to the increasing number of school pass-outs. Even if the number of universities is increased to 1,500 and the number of colleges doubled, we will be able to cater to just 15 per cent of the population. There is a lot to be done in the field of higher education in India to meet the global demands in terms of quality, reach and access.

There is a wide gap between industry expectations and university standards, on account of which millions of people are unemployed or unemployable, while thousands of jobs are lying vacant for want of the right personnel. The role of private operators in the field of higher education has increased, and a meaningful private-public partnership is a must to cater to the increasing number of users. The importance of all aspects of our lives should be understood by the students so that, besides getting good jobs, they also turn out to be useful citizens for the country and useful human beings for the world at large. Source: Times of India Article

The following list gives the ranking of higher educational institutions in India.

Integrating ICT into teaching and learning is not a new concept. For Wang and Woo (2007), it may be as old as other technologies such as radios or televisions. Citing Earle (2002), Wang and Woo describe integration as having a sense of completeness or wholeness, by which all essential elements of a system are seamlessly combined to make a whole.

Whilst acknowledging that defining both terms (technology and integration) may drive the problem, Earle (2002) supports the position of Wang and Woo when he argues that integration does not only mean the placement of hardware in classrooms. He further contends that technologies must be pedagogically sound and go beyond information retrieval to problem solving. “The whole purpose of using technology in teaching is to give better value to students”. The following graph visualizes the student-teacher ratio across the world.

Source: The Indian Higher Education, images from Google

As more and more technologies, such as netbooks, interactive whiteboards, smartphones and digital video recorders, have become more available and affordable, coupled with the rapid expansion of computer networking capability in the educational system, there have been continued research efforts investigating how teachers can use ICT to facilitate student learning. The points of view on this are generally taken in two groups. The first is the technological point of view, which supports the integration of technological infrastructures and systems into the educational environment; the second is the pedagogical point of view, which supports the integration of ICT materials and programs in terms of social constructivist learning principles (Richards, 2006).

Hepp, Hinostroza, Laval, and Rehbein (2004) have been cautious to emphasize that there is no universal truth when it comes to applying ICTs in education, and that there is no advice that can be directly applied without considering each country’s reality, priorities, and long-term budgetary prospects and commitment. In developing countries, ICT should be combined with traditional technologies, such as print and broadcast radio, to achieve better effectiveness (Pelgrum & Law, 2003).

In many countries, ICT has helped in improving the quality of education. It has the ability to address illiteracy and improve the quality of education in all sectors through multimedia capabilities such as simulations and models. ICT can give learners access to concepts that they previously could not grasp. The acquisition of ICT skills in educational institutions helps knowledge sharing, thereby multiplying educational opportunities. A constructive atmosphere is needed to give all stakeholders an occasion to form a part of the information society. Instead of focussing on cost, efforts should be made to promote broadband, computers, and Internet access.

Progress and planning are still needed in providing attractive learning content and learning technologies.

E-Governance refers to the process of using Information Technology (I.T.) for automating the internal operations of the government and its external interaction with citizens and businesses. In the Indian Higher Education System (IHS), there has been a tremendous increase in the number of colleges and universities. A number of aspects are related to quality education, such as the progression of courses, the quality of faculty members, the research facilities given to teachers and students, the number of students, the examination system and administration.

Ideally, a single window should exist for the approval process of performance measurement. The need is to deliver services at the doorstep, making them hassle-free and transparent, and to enable decision makers to obtain all the analysis and decide. E-Governance is not doing all the activities online or some other sort of computerization; it is a way to rethink and re-engineer the existing structure of the system of higher education in India, with its functions, processes, etc.

The governing bodies can very easily develop a mechanism to analyse which course is in heavy demand in a particular area or region. The information obtained can help the government analyse the ratio of employment to passing graduates, and plan the generation of employment according to the number of graduates passing in the country. E-governance will remove the need for transfer or migration certificates and reduce unnecessary administrative work and paperwork. Interlinking of universities will enable regular updating of the curriculum. A common curriculum can also be improved. A centralized database of students will provide better opportunities to bright students.

The focus of the paper is on the benefits that ICT integration in education can provide, from breaking time and distance barriers to facilitating collaboration and knowledge sharing among geographically distributed students. The findings reveal that it also facilitates sharing of best practices and knowledge across the world. The enhanced group collaboration made possible via ICT strengthens the international dimension of educational services and eliminates time barriers in education for learners as well as teachers. The components include e-portfolios, cyber infrastructures, digital libraries and online learning object repositories.

II. Official E-learning centres

o Jadavpur University is using a mobile-learning centre.

o IIT-Bombay has started the CDEEP (Centre for Distance Engineering Education Program) program, which emulates classroom interaction through the use of real-time interactive satellite technology.

o Indira Gandhi National Open University (IGNOU) uses radio, television, and Internet technologies.

o IIT-Kanpur has developed Brihaspati, an open source e-learning platform.

III. Drawbacks

It may create a digital divide within a class, as students who are more familiar with ICT will reap more benefits and learn faster than those who are not as technology savvy.

Since not all teachers are experts with ICT, they may be lax in updating the course content online, which can slow down the learning among students.

The potential for plagiarism is high, as students can copy information rather than learning and developing their own skills.

The cost of hardware and software can be very high.

The following graph shows the world total enrolments compared with India.

Source: Statistics of Higher Education site

IV. Our contribution and Methodology

From the literature review, the first requirement for the use of ICT in higher education is the maintenance of the required information for the various stakeholders in the form of digital libraries. Further requirements are global exposure for teachers and developers, so that they can handle content up to the level of various learners, and the avoidance of data duplication when populating the digital data. The data warehousing tool Informatica can be used to collect data in various formats, such as flat files, XML sources, relational models and ERP, for the population of the digital data. For the collection of exact data, Informatica provides the Rank transformation, which gives the relevancy among the generated data. Suppose the data generated from the source system contains Arts, Science and Technology categories; it is better to redirect the information correspondingly. Informatica provides the Router transformation to route the various categories of the required data. To process a particular topic or content from the source, Informatica provides a source transformation and a mapping of the selected contents to the target audience. Finally, in some cases there may be additions and advancements in the contents or technology. In that case only the updated information from the source needs to be added to the digital storage. To handle this typical scenario, Informatica provides three categories of data updating strategy: Type-1, Type-2 and Type-3. In Type-1 no historical data is kept; the old data is completely overwritten with the new content. Type-2 keeps both historical and current data by adding one more record every time, to keep track of the current changes. Type-3 manages one level of historical data just by adding one specific column to the content.

Classification of the data by the context of the user can be achieved with the classification technique in mining, using the Weka tool. The related data items among the same community of users can be found through association mining analysis, and grouping of the same category of contents and users is possible with the clustering mechanism, through which the concept of parallelism can be achieved. For certain data, pattern recognition is possible. For uncertain data there are two possibilities: initially Bayesian formalisms can be used, and for more complex uncertain data the use of soft-set computing is better.

To gather the data from various sources, a data warehousing environment with the Extraction, Transformation and Loading (ETL) process is used; the ETL process populates the data warehouse and data marts that store the data. A data warehouse can be used to handle centralized data that is common to all users and reporting environments, whereas data marts contain a portion of the data from the data warehouse repository, which is helpful for handling specific data. For example, if core papers are to be shared between universities and colleges as digital content, we can place that content in the data warehouse, since almost all colleges and universities have common core papers. In the case of electives, however, the selection made by students differs, and sharing only that specific content is enough; in this case it is better to manage the data in data marts.

To process the stored data according to the end user's requirements and to search for directly available data, online analytical processing tools such as Business Objects and Cognos are helpful, but they cannot find hidden patterns or discover knowledge in databases. That more powerful kind of searching is done by data mining techniques. Data mining is a powerful search process over complex data sources that allows users to discover knowledge in hidden formats and patterns. The data mining tool Weka helps to process the data in an effective manner. In the context of higher education there is a requirement for keyword searching; some data items might require retrieving the data based on words, grammatical rules and properties. The processing methods can be built around document relevancy, and another kind of data access might be the identification of content based on key terms. The following description introduces the data mining techniques that are well suited to handling the hidden patterns of the data.

Text mining methods can be viewed as an extension of data mining to text data. Some of the typical aspects of text mining research involve the development of models for reasoning about new text documents based on words, phrases, and linguistic and grammatical properties of text, and the extraction of information and knowledge from large amounts of text documents.
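Keyword-based identification of content and the assignment of a document to a category can be illustrated with a minimal sketch. It assumes a plain word-frequency count and a small hand-written taxonomy built on the Arts, Science and Technology categories mentioned earlier. The stop-word list, the taxonomy terms and all function names are hypothetical; this is not the authors' tool and it does not use Weka.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"a", "an", "and", "are", "as", "for", "from",
              "in", "is", "of", "on", "the", "to", "with"}

# Hypothetical domain taxonomy: category -> terms that indicate it.
TAXONOMY = {
    "Science": {"physics", "chemistry", "biology", "experiment"},
    "Technology": {"database", "warehouse", "mining", "index", "software"},
    "Arts": {"history", "literature", "painting", "music"},
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def key_terms(text, top_n=3):
    """Return the top_n most frequent tokens that are not stop words."""
    counts = Counter(t for t in tokenize(text) if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def classify(text):
    """Return the category whose taxonomy terms occur most often, or None."""
    tokens = tokenize(text)
    scores = {cat: sum(t in terms for t in tokens)
              for cat, terms in TAXONOMY.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    doc = ("Data mining in the data warehouse: mining finds hidden "
           "patterns in warehouse data.")
    print(key_terms(doc))
    print(classify(doc))
```

A real system would replace the frequency count with a weighting scheme and the fixed taxonomy with one maintained by the institution; the sketch only shows the shape of the two operations.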
Different application areas have been identified as having potential for text mining. With the help of Information and Communication Technology (ICT) and advanced methods and tools, different kinds of published materials are available in electronic form. In this work, our main goal has been to design and develop a text mining tool to be used in educational applications.

Two text mining techniques, which help in the categorization and summarization of text documents, are used in our work:

1. Automatic classification of documents
Documents are indexed according to a domain-specific taxonomy, allowing users to categorize the documents related to particular taxonomy terms.

2. Identification of key terms
Extraction of keywords from the documents helps to categorize or summarize the content of the documents.

Effective handling of the digital data population can thus be achieved with the Informatica tool. With statistics such as the data access rate by community of users, the most frequently accessed data and the updated content, together with the log data, the data quality can be improved. The following diagram shows the data mining functionalities to handle the data.

Source: Data Mining in Higher Education in Google Search
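The three update strategies described earlier in this section (Type-1, Type-2 and Type-3, commonly known as slowly changing dimension types) can be illustrated in plain Python. This is a sketch under stated assumptions: rows are dictionaries, and the field names `key`, `value`, `previous_value`, `start_date` and `end_date` are hypothetical. It does not use or reproduce Informatica's Update Strategy transformation; it only shows what each strategy keeps.

```python
from datetime import date


def apply_type1(rows, key, new_value):
    """Type-1: overwrite the old value in place; no history is kept."""
    for row in rows:
        if row["key"] == key:
            row["value"] = new_value
    return rows


def apply_type2(rows, key, new_value, today):
    """Type-2: close the current record and append a new one (full history)."""
    for row in rows:
        if row["key"] == key and row["end_date"] is None:
            row["end_date"] = today
    rows.append({"key": key, "value": new_value,
                 "start_date": today, "end_date": None})
    return rows


def apply_type3(rows, key, new_value):
    """Type-3: keep exactly one previous value in an extra column."""
    for row in rows:
        if row["key"] == key:
            row["previous_value"] = row["value"]
            row["value"] = new_value
    return rows


if __name__ == "__main__":
    # Hypothetical course record whose syllabus is revised.
    t2 = [{"key": "CS101", "value": "Syllabus 2011",
           "start_date": date(2011, 6, 1), "end_date": None}]
    for row in apply_type2(t2, "CS101", "Syllabus 2012", date(2012, 6, 1)):
        print(row)
```

Type-1 is the cheapest but loses the old syllabus, Type-2 grows by one row per change, and Type-3 remembers only the immediately preceding version.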

V. Conclusion and Future work

The overall goal of the article is to enhance the quality of higher education with research innovations. To explain the possibilities, we used the Informatica tool for the extraction, transformation and loading of the digital content of the system. The use of transformations such as Rank, Router and Update Strategy gives better exposure to data quality estimation. With this tool we can easily bring various categories of sources into a common repository, which allows the content to be processed in the easiest possible way. The use of data mining tools such as Weka also helps to handle the hidden patterns in an efficient way. Mining finds the digital contents in the context of data and keyword-based searching options. With data warehousing, the population of data from various sources is possible, and with data mining, the searching and discovery of knowledge in the data content is possible. As future scope, there are the distributed and parallelised aspects, and there is a chance of developing mobile apps for the use of digital content according to the requirements of the stakeholders who are involved in higher education quality improvement. Especially in data warehousing, implementing query processing of the required contents through indexing will give the benefit of faster processing. Bitmap indexing mechanisms allow the data to be processed in the fastest manner, and the implementation of NoSQL strategies for unstructured data is also an emerging trend in data warehousing.
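To make the bitmap indexing idea concrete, the following minimal sketch builds one bitmap per distinct value of a low-cardinality column and answers a combined query with bitwise operations. Python integers stand in for bit vectors, and the column contents and function names are hypothetical; a production warehouse would use compressed bitmaps rather than this plain representation.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i has it."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Decode a bitmap into the ascending list of row ids whose bit is set."""
    rows, row_id = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


if __name__ == "__main__":
    # Hypothetical columns of a digital-content table, one entry per row.
    category = ["Arts", "Science", "Technology", "Science", "Arts"]
    level = ["UG", "PG", "UG", "UG", "PG"]

    cat_idx = build_bitmap_index(category)
    lvl_idx = build_bitmap_index(level)

    # category = 'Science' AND level = 'UG' becomes a single bitwise AND.
    print(matching_rows(cat_idx["Science"] & lvl_idx["UG"]))
```

Because each predicate reduces to a bitwise AND or OR over precomputed bitmaps, equality queries on columns with few distinct values avoid scanning the rows themselves.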


Authors Profiles

K. Uma Pavan Kumar received his M.Tech (CSE) from a JNTUK affiliated college and is working as an assistant professor at AIMS Institutions, Bangalore. He is a research scholar at Pondicherry Engineering College, under the guidance of Dr. S. Saraswathi, Professor and HOD, IT Dept., PEC. His research interests are databases, data warehousing, parallel processing and distributed aspects. Currently he is working on bitmap indexing mechanisms to achieve better query processing.

M. Narayana Swami is working as Associate Professor and HOD, Computer Science, Presidency College, Bangalore. He is a research scholar at Bharathiar University, under the guidance of Dr. M. Hanumantappa, Associate Professor and HOD, CSA Dept., Bangalore University. His research interests are databases, data mining, data structures and principles of programming. His current research is on the use of data mining techniques in text processing.