Li XL, Ma L, He XD et al. You are how you behave – Spatiotemporal representation learning for college student aca-
demic achievement. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 35(2): 353–367 Mar. 2020. DOI
10.1007/s11390-020-9971-x
You Are How You Behave – Spatiotemporal Representation Learning
for College Student Academic Achievement
Xiao-Lin Li1, Li Ma1, Xiang-Dong He2,∗, and Hui Xiong3, Fellow, IEEE
1School of Business, Nanjing University, Nanjing 210093, China
2Information Technology Services Center, Nanjing University, Nanjing 210093, China
3School of Business, Rutgers University, Newark, NJ 07102, U.S.A.
E-mail: [email protected]; [email protected]; [email protected]; [email protected]
Received August 20, 2019; revised January 22, 2020.
Abstract Scholarships are a reflection of academic achievement for college students. The traditional scholarship assign-
ment is strictly based on final grades and cannot recognize students whose performance trend improves or declines during
the semester. This paper develops the Trajectory Mining on Clustering for Scholarship Assignment and Academic Warning
(TMS) approach to identify the factors that affect the academic achievement of college students and to provide decision
support to help low-performing students attain better performance. Specifically, we first conduct feature engineering to
generate a set of features to characterize the lifestyle patterns, learning patterns, and Internet usage patterns of students.
We then apply the objective and subjective combined weighted k-means (Wosk-means) algorithm to perform clustering
analysis to identify the characteristics of different student groups. Considering the difficulty in obtaining the real global
positioning system (GPS) records of students, we apply manually generated spatiotemporal trajectories data to quantify
the direction of trajectory deviation with the assistance of the PrefixSpan algorithm to identify low-performing students.
The experimental results show that the silhouette coefficient and Calinski-Harabasz index of the Wosk-means algorithm are
both approximately 1.5 times those of the best baseline algorithm, and the sum of the squared error of the Wosk-means
algorithm is only half that of the best baseline algorithm.
Keywords academic achievement, spatiotemporal trajectory, feature engineering, student segmentation
1 Introduction
Academic achievement not only is the ultimate out-
come of learning activities for college students but also
reflects the quality of higher education. In many con-
texts, it is the overwhelming assessment criterion for ap-
plicants’ learning and problem-solving abilities. Hence,
it is necessary for colleges to identify the factors that
affect the academic achievement of college students
to incentivize high-performing students and help low-
performing students.
Numerous studies have examined which external
factors affect academic achievement and how they do so.
We need to consider cognitive factors such as personal
factors, long-term memory, short-term memory,
creativity, and other intellectual factors [1, 2], as well
as non-intellectual factors,
such as self-discipline, time management, and lifestyle
habits [3–6]. Non-personal factors, such as social re-
lations, information and communication technologies,
and school environments, are increasingly difficult to
disregard [7–10]. These factors from diverse studies serve
as a reflection of individual differences, and provide as-
sistance for the subsequent research in students’ aca-
demic achievement.
Unfortunately, research data in prior studies mainly
Regular Paper
Special Section on Learning and Mining in Dynamic Environments
This work was supported by the National Natural Science Foundation of China under Grant Nos. 61773199 and 71732002, and the National Key Research and Development Program of China under Grant No. 2018YFB1004300.
∗Corresponding Author
©Institute of Computing Technology, Chinese Academy of Sciences 2020
354 J. Comput. Sci. & Technol., Mar. 2020, Vol.35, No.2
covered students’ in-classroom activities and demographic
factors, and lacked large-scale behavioural data generated
by students’ daily activities. Moreover, traditional
measurement and statistical methods cannot analyze massive
amounts of data as effectively as emerging machine learning
approaches. This paper therefore analyzes large-scale
behavioural data generated by students’ daily activities
with machine learning approaches instead of traditional
approaches. In traditional empirical research, researchers
must propose hypotheses before experiments, which may
overlook some potentially useful features. By contrast,
emerging feature engineering techniques can help to
discover influencing factors that researchers have not
noticed. We also note that when a single study focuses on
only one or a few influencing factors, the correlations and
connections among factors are ignored, which violates the
logic of the inherent development of such factors.
Thus, this paper develops the TMS (Trajectory
Mining on Clustering for Scholarship Assignment and
Academic Warning) approach to identify the factors
that affect the academic achievement of college students
and provide decision support to help low-performing
students attain better performance. Specifically, we
first conduct feature engineering to generate a set of
features to characterize the lifestyle patterns, learning
patterns, and Internet usage patterns of students. We
then apply the proposed Wosk-means algorithm to per-
form clustering analysis to identify the characteristics of
different student groups. Subsequently, considering the
difficulty in obtaining the real GPS records of students,
we apply manually generated spatiotemporal trajecto-
ries data to quantify the trajectory deviation direction
with the assistance of the PrefixSpan algorithm [11] to
identify low-performing students. Finally, we compare
the conclusions with prior studies. We present new in-
sights into how student behaviour factors can influence
academic achievement and identify low-performing stu-
dents from the perspective of trajectory deviation.
2 Problem Statement and Framework
2.1 Problem Definition
Each college student owns an authorized smart
card with a unique student ID. When a student uses
the card to make a payment, logs on to the Internet
with the student ID, or enters a building with the
card, new digital records are added to the student’s
profile. Hence, these constantly accumulating data
records reflect college students’ daily behaviour.
Self-discipline and time management are closely
associated with the academic achievement of students [12],
because high-performing students show different behaviour
patterns from common students [13]. This paper assumes
that the academic achievement of each student is
reflected in his/her daily life, which involves
smart card usage behaviour, Internet usage behaviour,
and trajectories within campus. Notably, there are
two major tasks: 1) uncovering novel factors of academic
achievement, and 2) inferring different student
groups based on daily trajectory.
2.2 Preliminaries
Many universities have several campuses, and on each
campus a functional area may be spread over multiple
locations. For example, student dormitories may be
distributed in different locations and named differently.
We denote the campus of the college as Cγ and divide all
locations into twelve functional areas (denoted as Fη),
including dormitory, canteen, classroom, library,
courtyard, bathroom, school hospital, supermarket, office,
water room, multimedia, and other areas. Therefore, for
the campus Cγ, the λ-th location p(γ,λ) can be defined by
(p.loc, p.fun). Here, p.loc represents the address of the
location p in the actual physical space, and p.fun
represents the functional area to which it belongs. c(·) is
the function that transforms p.loc into p.fun. In the
following, p refers to p.fun. Time and location information
are mainly extracted from the consumption data,
network services data, and access control data. Following
the techniques applied in [14, 15], the related definitions
are given as follows.
Definition 1 (Activity). The activity a of a stu-
dent u is represented by the timestamp ta, functional
area p ∈ Fη, the activity behaviour b ∈ Bϕ and the
other related attributes a.attr.
Here, a.attr is a set of vectors that depend on the
activity behaviour b. In detail, the consumption be-
haviour in consumption activity should consist of the
consumption amount, card balance, and the other at-
tributes. The recharge behaviour should include at-
tributes such as recharge amount. After extracting the
time and location information of each student in each
activity from smart card usage, Internet usage, and ac-
cess control data, the information is reordered accord-
ing to the time to obtain students’ space-time sequence
in the time period.
Xiao-Lin Li et al.: You Are How You Behave – Spatiotemporal Representation Learning 355
Definition 2 (Activity Sequence). Given a student
u, the activity sequence Aseq is a space-time sequence
for u, Aseq = {(t1, p1), · · · , (tk, pk)}, where ti is the
timestamp and ti < tj (i < j), pi ∈ Pφ represents
the i-th place in the sequence, and pi can be the same
as pj.
Definition 3 (Stay Point). Given an activity se-
quence Aseq = {(t1, p1), · · · , (tk, pk)} and the parame-
ter ξ, if (ti, pi) and (ti+1, pi+1) satisfy pi = pi+1 and
|ti − ti+1| < ξ, then (ti, pi) and (ti+1, pi+1) belong to
the same stay point.
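As a rough illustration of Definition 3, the following sketch groups consecutive records of an activity sequence into stay points. The function name `merge_stay_points`, the use of hours as the time unit, and the sample records are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

Record = Tuple[float, str]  # (timestamp in hours, functional area); assumed encoding


def merge_stay_points(aseq: List[Record], xi: float) -> List[List[Record]]:
    """Group consecutive records that share a place and are less than xi apart."""
    groups: List[List[Record]] = []
    for t, p in aseq:
        if groups:
            last_t, last_p = groups[-1][-1]
            # Definition 3: same place and |t_i - t_{i+1}| < xi
            if p == last_p and abs(t - last_t) < xi:
                groups[-1].append((t, p))
                continue
        groups.append([(t, p)])
    return groups


# Hypothetical activity sequence, already sorted by time
aseq = [(8.0, "canteen"), (8.1, "canteen"), (9.0, "classroom"), (12.5, "classroom")]
stay_points = merge_stay_points(aseq, xi=0.5)
```

In this example the two canteen records fall into one stay point, while the two classroom records stay separate because they are 3.5 hours apart.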
Definition 4 (Semantic Trajectory). Given a student
u and its activity sequence Aseq, the trajectory
Tra is a subset of Aseq, Tra ⊆ Aseq.
Researchers can exploit the node information em-
bedded in behavioural data to transform original data
into semantic trajectories [16]. In this paper, the tra-
jectory segmentation method based on time is applied
to divide the original trajectory into a daily trajectory
according to the student’s activities in a day [17]. It
is worth noting that a day does not refer to the sim-
ple natural day (00:00 am–00:00 am) but the student’s
complete day of activities. For example, a trajectory
may consist of {(December 1st, 10:00 pm, classroom)
→ (December 1st, 11:00 pm, bedroom) → (December
2nd, 00:10 am, bedroom)}. Here, the activities of the
student clearly belong to the trajectory of the same
day, although they span two natural days. Hence, the
daily trajectory is defined as follows.
Definition 5 (Daily Trajectory). Given a trajectory
Tra = {(t1, p1), · · · , (tk, pk)}, the daily trajectory is
DTra = {(t1, p1), · · · , (ts, ps)}, where DTra ⊆ Tra.
There must exist a unique (tj, pj) in Tra such that
(t1, p1) = (tj, pj) holds, and for ∀n ∈ {1, 2, · · · , s − 1},
(t1+n, p1+n) = (tj+n, pj+n).
The same activity recorded by the campus information
system may produce multiple records that need to be
merged. For instance, a student may order three dishes
within ten minutes, all of which belong to the same
activity. To simplify the subsequent similarity
calculation, timestamps are converted to the time
intervals taken to move between two consecutive
locations. This paper also sets the unit of the time
interval to the hour level.
Definition 6 (Trajectory Pattern). Given a trajectory
Tra = {(t1, p1), · · · , (tk, pk)}, the trajectory
pattern has the form
\[ TraP = p'_1 \xrightarrow{\Delta t'_1} p'_2 \xrightarrow{\Delta t'_2} \cdots \xrightarrow{\Delta t'_{u-1}} p'_u, \]
where p′i ∈ {pj}. If p′i corresponds to (ti, pi) and p′i+1
corresponds to (tj, pj), then ∆t′i = tj − ti.
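The conversion from a daily trajectory to a trajectory pattern can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `to_trajectory_pattern`, the choice to collapse consecutive records at the same place into one visit, and the truncation of intervals to whole hours are all assumptions.

```python
from datetime import datetime
from typing import List, Tuple


def to_trajectory_pattern(
    tra: List[Tuple[datetime, str]]
) -> Tuple[List[str], List[int]]:
    """Return (places, intervals); intervals[i] is the whole-hour gap
    between the kept records for places[i] and places[i + 1]."""
    places: List[str] = []
    times: List[datetime] = []
    for t, p in tra:
        if places and places[-1] == p:
            continue  # assumption: repeated records at one place form a single visit
        places.append(p)
        times.append(t)
    intervals = [
        int((later - earlier).total_seconds() // 3600)
        for earlier, later in zip(times, times[1:])
    ]
    return places, intervals


# Hypothetical daily trajectory
tra = [
    (datetime(2014, 12, 1, 7, 0), "canteen"),
    (datetime(2014, 12, 1, 8, 0), "classroom"),
    (datetime(2014, 12, 1, 10, 0), "classroom"),
    (datetime(2014, 12, 1, 12, 10), "canteen"),
    (datetime(2014, 12, 1, 14, 30), "library"),
]
places, intervals = to_trajectory_pattern(tra)
```

Here the pattern reads canteen, classroom, canteen, library, with hour-level intervals 1, 4, and 2 between consecutive places.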
2.3 Framework Overview
This paper proposes the TMS approach (Fig.1),
which includes two components: 1) the objective and
subjective combined weighted k-means (Wosk-means)
algorithm proposed in this paper, which is developed
to segment students into five disjoint groups,
and 2) the PrefixSpan algorithm [11], which is used to
calculate the similarity and direction of trajectory
deviation among different groups. With these two
components, the approach can uncover the influencing
factors of college students’ academic achievement and
segment students into different groups.
3 Feature Extraction
The lifestyle habits of students are reflected in
their behavioural data and are important influencing
factors in academic achievement [18, 19]. To generate
different features, this paper performs basic
statistical operations on the original data, mainly
summation and counting. For instance, the total
consumption amount of a student is the sum of the
expenses incurred by that student. In addition, this
paper generates sequences based on students’ access
control records to form students’ daily trajectories,
as described in Subsection 2.2. In this section, we
briefly present the feature extraction procedure.
3.1 Consumption Feature
Consumption features can show part of lifestyle
habits. The consumption behaviour characteristics of
students mainly include three aspects.
1) General Feature. These stem from simple ma-
nipulations on data such as summation and count.
Since the student’s learning activities are mainly con-
centrated on the weekday, the average daily consump-
tion on weekdays and on the weekend are calculated
separately.
2) Classification Feature. This paper divides the
locations into 23 categories, including canteen,
supermarket, bathroom and so on. For each location
category, we calculate its proportion of total
consumption, compare the differences between weekdays
and weekends, and then calculate the average daily
consumption amount. Due to the prominent role of
breakfast, this paper also divides the breakfast time
into three time periods: before 08:00, 08:00–09:00, and
09:00–10:00. Furthermore, we count the days on which
students have breakfast in each time period (i.e.,
total, weekday, weekend), and calculate the ratio of
weekday days to weekend days.
Fig.1. Framework of the TMS approach. (Panel labels: In-Campus Consumption, Internet Usage, Access Control, Scholarship, Student Information; Feature Engineering; Step 1: Clustering Analysis, the Wosk-means Algorithm, with Objective Weights and Subjective Weights; Step 2: Trajectory Analysis, with Semantic Trajectory, PrefixSpan Frequent Trajectory Pattern, Trajectory Similarity, Trajectory Deviation, and Trajectory Deviation Direction; Student Groups 0 to 4; Academic Warnings; Behavior Factors.)
3) Recharge Feature. First, the total amount and
frequency of students’ recharges are counted to obtain
the average recharge amount. Owing to the particular
nature of the recharge amount, statistics on the
extrema and quartiles are also computed. In addition,
different recharge habits reflect differences in
financial management ability and self-regulation. We
therefore consider students’ recharge habits for the
smart campus card, given that some students recharge at
regular intervals and others recharge when the card
balance is close to 0. Hence, we evaluate two statistics of
recharge time: the card balance when recharging, and the
time interval between two recharges.
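The general consumption feature described in item 1) can be sketched as follows. The record layout `(date, amount)`, the function name `average_daily_consumption`, and the decision to average over days that have at least one record are assumptions for illustration; the paper does not specify these details.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def average_daily_consumption(
    records: Iterable[Tuple[date, float]]
) -> Tuple[float, float]:
    """Return (weekday average, weekend average) of daily spending,
    averaged over days that have at least one record (assumption)."""
    daily: Dict[date, float] = defaultdict(float)
    for day, amount in records:
        daily[day] += amount

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    weekday = [v for d, v in daily.items() if d.weekday() < 5]   # Mon to Fri
    weekend = [v for d, v in daily.items() if d.weekday() >= 5]  # Sat, Sun
    return mean(weekday), mean(weekend)


# Hypothetical smart card records
records = [
    (date(2014, 12, 1), 8.0),   # Monday
    (date(2014, 12, 1), 12.0),  # Monday
    (date(2014, 12, 2), 10.0),  # Tuesday
    (date(2014, 12, 6), 30.0),  # Saturday
]
weekday_avg, weekend_avg = average_daily_consumption(records)
```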
3.2 Internet Usage Feature
The impact of Internet usage on academic achievement
is increasingly important. Some researchers found
a negative impact of Internet use on academic
achievement: students with restricted access to
YouTube and other sites have higher academic
achievement [20, 21]. However, the application of the
Internet in education is also likely to bring positive
results. In addition, Internet usage has a higher impact
on men than on women. The extraction of Internet usage
features mainly includes three aspects.
1) General Feature. First, some basic information
is counted, which mainly includes the network fee,
connection time, the number of connections, and uplink
and downlink dataflow. From the total cost, duration,
connections and dataflow, the daily averages for
each student can be calculated using the actual number
of days of Internet usage. In addition, the paper
also computes the extrema, quartiles, kurtosis,
skewness, and standard deviation to fully describe
the students’ Internet usage.
2) Time Distribution Feature. The habits of Inter-
net usage may reflect the lifestyle and study habits
of the students. For instance, students who often
use the Internet in the early hours of the morning do
not sleep early. This paper divides the day into four
time periods, including morning (06:00–12:00), after-
noon (12:00–18:00), evening (18:00–24:00), and night
(00:00–06:00). Hence, we also conduct the same calcu-
lation on the general features of Internet usage for each
time period.
3) Uplink and Downlink Dataflow Features. Belo
et al. [21] found that broadband has a negative im-
pact on academic achievement. Different types of on-
line behaviours result in different uplink and downlink
dataflows, and thus the student’s online behaviour can
be inferred to some extent. For instance, when search-
ing for a paper, the webpages involved are mostly text
contents; thus, the amount of data and the uplink and
downlink dataflows are small. When watching videos,
the amount of dataflow involved will be correspondingly
large. In addition, the multi-variate data including the
time, location, and Internet connections, can be em-
ployed to judge the type of students’ online behaviour
more accurately.
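The four time periods used for the time distribution feature can be sketched as a simple bucketing step. Assigning a connection to a period by its start time, and the names `time_period` and `connections_per_period`, are assumptions for this example.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable


def time_period(ts: datetime) -> str:
    """Map a timestamp to one of the four periods listed in the text."""
    hour = ts.hour
    if hour < 6:
        return "night"      # 00:00-06:00
    if hour < 12:
        return "morning"    # 06:00-12:00
    if hour < 18:
        return "afternoon"  # 12:00-18:00
    return "evening"        # 18:00-24:00


def connections_per_period(start_times: Iterable[datetime]) -> Counter:
    """Count connections per period, using the start time (assumption)."""
    return Counter(time_period(t) for t in start_times)


# Hypothetical connection start times
starts = [
    datetime(2014, 12, 1, 1, 30),
    datetime(2014, 12, 1, 2, 0),
    datetime(2014, 12, 1, 9, 0),
    datetime(2014, 12, 1, 13, 0),
    datetime(2014, 12, 1, 23, 59),
]
counts = connections_per_period(starts)
```

The same bucketing could be applied to fees, durations, or dataflow before computing the per-period statistics.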
3.3 Trajectory Feature
Trajectory features can also reflect the lifestyle
habits and self-discipline of college students, both
of which are important factors that affect academic
achievement [6, 18, 19]. Because it is difficult to obtain the
GPS records of college students, this paper generates
the trajectory sequence data for each student from
large-scale original behavioural data based on the idea
of representation learning. The trajectory of a student
can be represented as an ordered sequence because
each digital record contains the location and the time
information. The generation of trajectory features is
described in Section 2.
4 Methodology
4.1 Cluster Analysis
Given X = {X1, X2, · · · , XN} and C = {C1, C2, · · · , CK},
Xn is the n-th instance in X, and Ck is the k-th cluster
in C. Xn = {xn1, xn2, · · · , xnM}, where xnm is the value of
the m-th feature of the n-th instance. C1 ∪ C2 ∪ · · · ∪ CK = X,
and C1 ∩ C2 ∩ · · · ∩ CK = ∅. The centre of each cluster is
Ck = {ck1, ck2, · · · , ckM}, k = 1, 2, · · · , K. The idea of
the k-means algorithm is to maximize the intra-cluster
similarity and minimize the between-cluster similarity,
and thus the objective function is the sum of the distances
between the intra-cluster instances and the cluster
centre:
\[ P(U, C) = \sum_{k=1}^{K} \sum_{n=1}^{N} u_{nk} \sum_{m=1}^{M} d(x_{nm}, c_{km}). \tag{1} \]
Here, U is an N × K matrix and unk ∈ {0, 1}.
unk = 1 only when the n-th instance belongs to the
k-th cluster. Hence, \(\sum_{k=1}^{K} u_{nk} = 1\), n = 1, 2, · · · , N.
C consists of K clusters, that is, C = {C1, C2, · · · , CK}.
d(xnm, ckm) represents the distance between the
intra-cluster instance and the cluster centre. Rewriting (1)
with the Euclidean metric:
\[ P(U, C) = \sum_{k=1}^{K} \sum_{n=1}^{N} u_{nk} \sum_{m=1}^{M} (x_{nm} - c_{km})^2. \tag{2} \]
It is obvious that each feature in (2) is treated
equally. In reality, only some features are truly
valuable [22], and thus we apply a feature selection tech-
nique for removing redundant and noisy features [23].
Furthermore, feature weighting is a generalization tech-
nique for feature selection that assigns each feature
a weight value ([0, 1]) rather than simply removes a
feature [24]. Therefore, some researchers considered as-
signing weights to features when applying the k-means
algorithm [25, 26]. Huang et al. [27] once proposed a W -
k-means algorithm, and set the weight of M features
as W = (w1, w2, · · · , wM ). (2) can be rewritten as
(3). Here, wm ∈ [0, 1], and \(\sum_{m=1}^{M} w_m = 1\). β is
a self-defined parameter. Under the constraint that
the summation of weights equals 1, the feasible
solution minimizing (3) is shown in (4). In addition,
\(D_m = \sum_{k=1}^{K} \sum_{n=1}^{N} u_{nk} (x_{nm} - c_{km})^2\) is the summation
of the variance of the m-th feature in all clusters.
\[ P(U, C) = \sum_{k=1}^{K} \sum_{n=1}^{N} u_{nk} \sum_{m=1}^{M} w_m^{\beta} (x_{nm} - c_{km})^2, \tag{3} \]
\[ w_m = \frac{1}{\sum_{t \in F} [D_m / D_t]^{1/(\beta - 1)}}. \tag{4} \]
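As a small numerical check of (4), the sketch below computes the feature weights from the per-feature dispersions. It assumes that every Dm is strictly positive, that β ≠ 1, and that the sum in (4) runs over all features; the function name `wkmeans_weights` and the sample values are hypothetical.

```python
from typing import List


def wkmeans_weights(D: List[float], beta: float) -> List[float]:
    """Evaluate w_m = 1 / sum_t (D_m / D_t)^(1 / (beta - 1)) for each feature.

    Assumes all entries of D are > 0 and beta != 1.
    """
    exponent = 1.0 / (beta - 1.0)
    return [1.0 / sum((d_m / d_t) ** exponent for d_t in D) for d_m in D]


# Hypothetical within-cluster dispersions of three features
D = [1.0, 2.0, 4.0]
w = wkmeans_weights(D, beta=2.0)
```

With β = 2 the weights are proportional to 1/Dm, so a feature with smaller within-cluster dispersion receives a larger weight, and the weights sum to 1.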
However, manually defining feature weights is hard
to achieve on high-dimensional data, and it is difficult to
ensure that the optimal weights will exist in the defined
feature weights set. In this paper, the proposed Wosk-
means algorithm (Algorithm 1) revises the basic k-
means algorithm by addressing the weight difference of
features. Specifically, this paper considers two weighting
methods, including objective weight W and subjective
weight V. The comprehensive weights are given by
γ = (γ1, γ2, · · · , γM), where
\[ \gamma_m = \frac{w_m v_m}{\sum_{m'=1}^{M} w_{m'} v_{m'}}. \]
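The combination of objective and subjective weights can be sketched in a few lines. The function name `combine_weights` and the sample weight values are hypothetical; only the normalised product itself comes from the formula above.

```python
from typing import List


def combine_weights(w: List[float], v: List[float]) -> List[float]:
    """Return gamma_m = w_m * v_m / sum over all features of w * v."""
    if len(w) != len(v):
        raise ValueError("w and v must have the same length")
    products = [wm * vm for wm, vm in zip(w, v)]
    total = sum(products)
    if total == 0:
        raise ValueError("the weighted products must not all be zero")
    return [p / total for p in products]


# Hypothetical objective weights w and subjective weights v
gamma = combine_weights([0.5, 0.3, 0.2], [1.0, 2.0, 1.0])
```

In this example the second feature ends up with the largest comprehensive weight because its subjective weight doubles its objective weight.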
Algorithm 1. The Wosk-means Algorithm
Input: dataset X = {X1, X2, · · · , XN}; the number of clusters K; weight vector γ
Process:
1. Standardize the dataset X to X∗.
2. Randomly select K instances from X∗ as the initial mean vectors (µ1, µ2, · · · , µK).
3. repeat
4.   Set Ck = ∅ (1 ≤ k ≤ K)
5.   for n = 1, 2, · · · , N do
6.     Calculate the distance between the instance Xn and each mean vector µk:
         \( d_{nk} = \sum_{m=1}^{M} \gamma_m \| x_{nm} - \mu_{km} \|^2 \).
7.     Determine the cluster of Xn based on the nearest mean vector: \( \lambda_n = \arg\min_{k \in \{1, 2, \cdots, K\}} d_{nk} \);
8.     Assign the instance Xn to the corresponding cluster: \( C_{\lambda_n} = C_{\lambda_n} \cup \{X_n\} \);
9.   end for
10.  for k = 1, 2, · · · , K do
11.    Calculate the new mean vector: \( \mu'_k = \frac{1}{|C_k|} \sum_{x \in C_k} x \).
12.    if µ′k ≠ µk then
13.      Update the mean vector to µ′k
14.    else
15.      Do not change the mean vector
16.    end if
17.  end for
18. until no mean vector is updated
Output: clusters C = {C1, C2, · · · , CK}.
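The steps of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation: z-score standardization, the `seed` and `max_iter` parameters, and keeping the old mean for an empty cluster are assumptions added to make the sketch runnable.

```python
import random
from typing import List, Sequence, Tuple


def standardize(X: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each feature column; constant columns become zeros (assumption)."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    stds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 for j in range(m)]
    return [
        [(row[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(m)]
        for row in X
    ]


def weighted_sq_dist(x: Sequence[float], mu: Sequence[float], gamma: Sequence[float]) -> float:
    """d_nk = sum_m gamma_m * (x_nm - mu_km)^2, as in step 6."""
    return sum(g * (a - b) ** 2 for g, a, b in zip(gamma, x, mu))


def wosk_means(
    X: Sequence[Sequence[float]],
    K: int,
    gamma: Sequence[float],
    seed: int = 0,
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    Xs = standardize(X)                                  # step 1
    rng = random.Random(seed)
    means = [list(x) for x in rng.sample(Xs, K)]         # step 2
    labels = [0] * len(Xs)
    for _ in range(max_iter):                            # steps 3 to 18
        labels = [
            min(range(K), key=lambda k: weighted_sq_dist(x, means[k], gamma))
            for x in Xs
        ]                                                # steps 5 to 9
        new_means = []
        for k in range(K):                               # steps 10 to 17
            members = [x for x, lab in zip(Xs, labels) if lab == k]
            if members:
                new_means.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_means.append(means[k])               # assumption: keep old mean
        if new_means == means:
            break                                        # no mean vector was updated
        means = new_means
    return labels, means


# Hypothetical data: two well-separated groups of two instances each
X = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
labels, means = wosk_means(X, K=2, gamma=[0.5, 0.5])
```

The weight vector `gamma` enters only through the distance in step 6, which is the single difference from the basic k-means loop in this sketch.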
4.2 Frequent Pattern Analysis of Trajectory
The clustering analysis segments students into
disjoint groups; we then need to calculate the
deviation of trajectory direction among different groups.
First, frequent patterns are analyzed by applying the
PrefixSpan algorithm [11]. This algorithm incorporates the
idea of the Apriori algorithm and the tree algorithm to
reduce the cost of trajectory mining. For the PrefixSpan
algorithm, the sequence is ordered. The children of a
sequence are item sets, and the children of an item set are
items. Let us take an example of an ordered sequence
< a(ab)c >, where < > is the identifier of the sequence
and ( ) is the identifier of the item set. “a” and “b” in
( ) represent items, and “a” and “c” not in ( )
are single-item item sets. A sequence contains one or
more ordered item sets, so the sequence
< a(ab)c > and the sequence < ac(ab) > are different
sequences. An item set contains one or more unordered
items, so the sequence < a(ab)c > and
the sequence < a(ba)c > are the same sequence. Some
necessary definitions for analyzing frequent patterns
are given below.
Algorithm 2. The PrefixSpan Algorithm [11]
Input: sequence database S, minimum support minSup
Parameter: sequence pattern α (length(α) = L), and the database S|α after the projection of α
PrefixSpan(α, L, S|α) [11]
1. Scan S|α to find frequent items a satisfying one of the following conditions: a can be deemed as an item of the last item set of α; < a > can be deemed as the last item set in α.
2. Insert a into the end of α to construct the new frequent sequence pattern α′ and output α′.
3. Construct the database S|α′ after the projection of α′ and call PrefixSpan(α′, L + 1, S|α′).
Output: sets of frequent sequence patterns
Definition 7 (Subsequence, Supersequence).
Given the sequence A = (a1, a2, · · · , an) and sequence
B = (b1, b2, · · · , bm), n ≤ m, if there exists a number
sequence 1 ≤ j1 < j2 < · · · < jn ≤ m such that
a1 ⊆ bj1, a2 ⊆ bj2, · · · , an ⊆ bjn, then A
is a subsequence of B and B is a supersequence of
A. A subsequence meets the following conditions: 1)
all item sets in the subsequence can be found in the
supersequence; and 2) the order of item sets in the
subsequence is the same as that in the supersequence.
Taking the sequence S = < a(abc)(ac)d(cf) > as an
example, < a(abc) > is a subsequence of S.
Definition 8 (Prefix). For sequence A =
(a1, a2, · · · , an) and sequence B = (b1, b2, · · · , bm), n ≤ m,
if a1 = b1, a2 = b2, · · · , an−1 = bn−1, and an ⊆ bn,
then A is called the prefix of B.
A prefix is a subsequence that can only start at the
beginning of a supersequence. For example, < a(abc) >
is a prefix of S. Although < d(cf) > is also a
subsequence of S, it is not a prefix, because it does not
start at the beginning of sequence S.
Definition 9 (Projection). Subsequence A′ is the
projection of the sequence A on B if it satisfies:
1) B is the prefix of A′;
2) A′ is the largest subsequence of A satisfying con-
dition 1).
Informally, A′ begins with B and is the longest sub-
sequence found in A. Taking S = < a(abc)(ac)d(cf) >
as an example, < (abc)(ac)d(cf) > is the projection of
S on < (abc) >.
Definition 10 (Suffix). If subsequence A′ is the projection
of sequence A on B, then the suffix C of sequence
A on B is C = A′ − B.
In other words, the suffix is a projection with
the prefix removed. Taking S as an example,
< (ac)d(cf) > is the suffix of S on < (abc) >. To obtain
the frequent sequence patterns, the support threshold
minSup should be set manually. A checked pattern
is considered a frequent sequence pattern
when its support exceeds the threshold.
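The prefix-projection idea can be sketched for the simplest case, in which every item set holds a single item, as is the case when a trajectory is a sequence of functional areas. This is a simplified illustration rather than the full algorithm of [11]: it does not handle multi-item item sets, it treats `min_sup` as an absolute count, and it accepts a pattern when its support is at least `min_sup`. The sequence data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def prefixspan(sequences: Sequence[Sequence[str]], min_sup: int) -> Dict[Tuple[str, ...], int]:
    """Mine frequent sequential patterns from single-item sequences.

    Returns a mapping from pattern (tuple of items) to its support count.
    """
    results: Dict[Tuple[str, ...], int] = {}

    def mine(prefix: Tuple[str, ...], projected: List[Tuple[int, int]]) -> None:
        # projected holds (sequence index, start of the suffix) pairs
        support = defaultdict(set)
        for sid, start in projected:
            for item in sequences[sid][start:]:
                support[item].add(sid)
        for item in sorted(support):
            if len(support[item]) < min_sup:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(support[item])
            # Project each sequence on the extended prefix: keep the suffix
            # that follows the first occurrence of the item.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                for pos in range(start, len(seq)):
                    if seq[pos] == item:
                        new_projected.append((sid, pos + 1))
                        break
            mine(pattern, new_projected)

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return results


# Hypothetical daily trajectories of three students
sequences = [
    ["dormitory", "canteen", "classroom", "library"],
    ["dormitory", "classroom", "library"],
    ["canteen", "classroom", "dormitory"],
]
patterns = prefixspan(sequences, min_sup=2)
```

In this example the pattern dormitory, classroom, library is supported by the first two sequences, while classroom followed by dormitory occurs in only one sequence and is therefore not reported.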
4.3 Trajectory Deviation Analysis
The premise of calculating the trajectory deviation
is to clarify the distance between two trajectories. The
relevant definitions are as follows.
Definition 11 (Trajectory Matching). Given the
parameter ρ ∈ [0, 1], and two trajectory patterns
\[ TraP_1 = p'_{11} \xrightarrow{\Delta t'_{11}} \cdots \xrightarrow{\Delta t'_{1[u-1]}} p'_{1u} \]
and
\[ TraP_2 = p'_{21} \xrightarrow{\Delta t'_{21}} \cdots \xrightarrow{\Delta t'_{2[u-1]}} p'_{2u}, \]
there exists a trajectory matching between these two
trajectories, i.e., TraM = {p1, p2, · · · , pk}, when the
following conditions are satisfied:
1) ∀i, j ∈ [1, k], pi = p′1m = p′2n, pj = p′1r = p′2s, and
m < r, n < s when i < j;
2) ∀i ∈ [1, k − 1], \( \frac{|\Delta t'_{1m} - \Delta t'_{2n}|}{\max(\Delta t'_{1m}, \Delta t'_{2n})} \leq \rho \);
3) when k = 1, if p1 = p′1m = p′2n, the trajectory
matching of length 1 is considered to exist, that
is, TraM = {p1}.
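Condition 2) of Definition 11 can be sketched as a small helper that tests whether two time intervals are close enough relative to the longer one. Only this single condition is covered; the function name `intervals_compatible` and the handling of two zero-length intervals are assumptions.

```python
def intervals_compatible(dt1: float, dt2: float, rho: float) -> bool:
    """Check |dt1 - dt2| / max(dt1, dt2) <= rho for two non-negative intervals."""
    longest = max(dt1, dt2)
    if longest == 0:
        return True  # assumption: two zero-length intervals count as identical
    return abs(dt1 - dt2) / longest <= rho


# Hypothetical hour-level intervals
close_enough = intervals_compatible(2.0, 3.0, rho=0.5)   # relative gap 1/3
too_far = intervals_compatible(1.0, 4.0, rho=0.5)        # relative gap 3/4
```

A smaller ρ makes the matching stricter; ρ = 0 requires the two intervals to be equal.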
Definition 12 (Frequent Matching Pattern). Given
the frequent pattern
\[ TraP = p'_1 \xrightarrow{\Delta t'_1} \cdots \xrightarrow{\Delta t'_{u-1}} p'_u, \]
the pattern can be partitioned into D non-coincident
child trajectory patterns TraP = (DTraP1, · · · , DTraPD).
The trajectory matching sets among the D child trajectory
patterns can be calculated, where TraMk = {TraMk,di,dj},
i, j ∈ [1, D], i ≠ j, and k is the length of the trajectory
matching, k ∈ [1, K]. For each TraMk, choose TraMk,di,dj
satisfying
\[ \#TraM_{k,d_i,d_j} > \alpha \times \frac{D(D-1)}{2} \]
as the frequent trajectory pattern under the k-length
matching, where #TraMk,di,dj represents the number of
child trajectory pattern pairs satisfying the matching,
D(D−1)/2 is the total number of pairwise matching
calculations, and α is the limit parameter.
Definition 13 (Similarity of Trajectory Pattern).
Given two trajectory patterns
\[ TraP_1 = p'_{11} \xrightarrow{\Delta t'_{11}} \cdots \xrightarrow{\Delta t'_{1[u-1]}} p'_{1u} \]
and
\[ TraP_2 = p'_{21} \xrightarrow{\Delta t'_{21}} \cdots \xrightarrow{\Delta t'_{2[u-1]}} p'_{2u}, \]
the atomic sets of their trajectory pattern matching
are TMS1 = {tm1,r,k} and TMS2 = {tm2,r,k},
where r is the matching number, and k is the length
of trajectory matching, k ∈ [1, K]. The similarity S of
these two trajectory patterns is:
\[ S(TraP_1, TraP_2) = \sum_{k=1}^{K} f_w(k)\, S_l(FT_1^k, FT_2^k), \]
\[ S_l(tm_{1,k}, tm_{2,k}) = \frac{\sum_{i=1}^{r} f_{tw}(i, j) \times S_{tm}(tm_{1,r,k}, tm_{2,r,k})}{C_K^k}, \]
\[ f_{tw}(i, j) = \prod_{l=1}^{k} \frac{\#p_{il} / \#p_{i1}}{\#p_{jl} / \#p_{j1}}, \]
where ftw(i, j) is the self-defined location coefficient,
and #pij means the number of pij.
5 Experimental Results
5.1 Experimental Data
In this paper, we utilized the behavioural data gene-
rated by undergraduate students in a university in east-
ern China. The datasets include smart card usage data,
Internet usage data, and access control data, as well as
scholarships and student information data. The time
period of data is from December 1, 2014 to December
31, 2014. The scholarship assignment data are mainly
based on the data of the 2014-2015 academic year, and
the target population is the undergraduate students of
grade 2011 and grade 2012. It should be noted that
the trajectory data are manually generated based on
the temporal and spatial records of student behavioural
data. Table 1 shows the data statistics. After being
processed, the datasets cover 6 701 students and 23 lo-
cations.
Table 1. Statistics of the Experimental Data
Data Source              Properties                    Statistics
General                  # Students                    6 701
                         # Types of location           23
                         # Types of functional area    12
                         Time period                   12/2014
Smart card usage         # Records                     185 293
Internet usage           # Records                     187 564
Access control           # Records                     168 916
Trajectory constructed   # Max Length                  175
Notes: Smart card usage data consist of student No., date, time, location, amount, etc. The form of Internet data is {student No., location, start time, end time, data flow}. Access control data include student No., entry and exit time, entry and exit status, and direction. #: number of.
5.2 Evaluation Metrics
In this paper, four metrics will be used to evaluate
the performances of the above clustering algorithms.
Silhouette Coefficient. The silhouette coefficient
value is a metric to assess the clustering effect [28]. The
closer the value is to 1, the better the model’s cluster-
ing effect tends to be. The closer the value is to −1,
the worse the model’s clustering effect tends to be. The
formulas for the silhouette coefficient are as follows:
\[ s_1 = \frac{1}{|C_{k'}|} \sum_{X_n \in C_k,\, X_j \in C_{k'}} \sum_{m=1}^{M} \gamma_m \|x_{nm} - x_{jm}\|^2, \]
\[ s_2 = \frac{1}{|C_k| - 1} \sum_{X_n \in C_k,\, X_l \in C_k,\, l \neq n} \sum_{m=1}^{M} \gamma_m \|x_{nm} - x_{lm}\|^2, \]
\[ SC_{score}(X) = \sum_{n=1}^{N} \frac{s_1 - s_2}{\max(s_1, s_2)}. \]
Here, Ck′ denotes the cluster nearest to the cluster
Ck that contains the instance Xn.
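The per-instance terms of the weighted silhouette coefficient can be sketched as follows. This is an illustrative reading of the formulas: taking the nearest cluster as the one with the smallest mean weighted distance to the instance, returning per-instance scores rather than their sum, and scoring an instance in a singleton cluster as 0 are assumptions. The names `wdist` and `silhouette_scores` and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence


def wdist(x: Sequence[float], y: Sequence[float], gamma: Sequence[float]) -> float:
    """Weighted squared distance sum_m gamma_m * (x_m - y_m)^2."""
    return sum(g * (a - b) ** 2 for g, a, b in zip(gamma, x, y))


def silhouette_scores(
    X: Sequence[Sequence[float]], labels: Sequence[int], gamma: Sequence[float]
) -> List[float]:
    """Return (s1 - s2) / max(s1, s2) for every instance."""
    clusters: Dict[int, List[Sequence[float]]] = {}
    for x, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(x)
    scores: List[float] = []
    for x, lab in zip(X, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)  # assumption: undefined cases score 0
            continue
        # s2: mean distance to the other members of the own cluster
        # (the distance from x to itself is 0, so it does not affect the sum)
        s2 = sum(wdist(x, y, gamma) for y in own) / (len(own) - 1)
        # s1: mean distance to the members of the nearest other cluster
        s1 = min(
            sum(wdist(x, y, gamma) for y in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(s1, s2)
        scores.append((s1 - s2) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical data: two compact, well-separated clusters
X = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
scores = silhouette_scores(X, labels=[0, 0, 1, 1], gamma=[0.5, 0.5])
```

For well-separated clusters such as these, every per-instance score is close to 1.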
Calinski-Harabasz Index. The Calinski-Harabasz in-
dex is a metric to examine the covariance of intra-
cluster data [28]. The larger the value is, the better the
model’s clustering effect tends to be. The formula for
the Calinski-Harabasz index is as follows:
\[ CH_{score}(k) = \frac{tr(B_k)}{tr(W_k)} \times \frac{N - k}{k - 1}. \]
Here, N denotes the instance size, and k is the num-
ber of clusters. Bk refers to the covariance matrix be-
tween clusters, and Wk refers to the covariance matrix
within a cluster. tr() is used to compute the trace of
the matrix.
Sum of the Squared Error (SSE). SSE is a metric to
compute the intra-cluster similarity [25]. The smaller
the value is, the better the model’s clustering effect
tends to be. The formula for SSE is as follows:
\[ SSE = \sum_{k=1}^{K} \sum_{x \in C_k} \sum_{m=1}^{M} \gamma_m \|x_m - \mu_{km}\|^2. \]
Here, µk represents the mean vector of features within
cluster Ck.
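The weighted SSE can be sketched directly from the formula. The function name `weighted_sse`, the representation of clusters as lists of feature vectors, and the sample values are assumptions for illustration.

```python
from typing import List, Sequence


def weighted_sse(clusters: Sequence[Sequence[Sequence[float]]], gamma: Sequence[float]) -> float:
    """Sum over clusters and members of sum_m gamma_m * (x_m - mu_km)^2."""
    total = 0.0
    for members in clusters:
        if not members:
            continue  # an empty cluster contributes nothing
        mu: List[float] = [sum(col) / len(members) for col in zip(*members)]
        for x in members:
            total += sum(g * (a - b) ** 2 for g, a, b in zip(gamma, x, mu))
    return total


# Hypothetical clusters of two-dimensional instances
clusters = [
    [[0.0, 0.0], [2.0, 0.0]],  # mean (1, 0); each member contributes 0.5
    [[5.0, 5.0]],              # single member; contributes 0
]
sse = weighted_sse(clusters, gamma=[0.5, 0.5])
```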
Running Time. The running time is a metric to
assess the algorithm’s efficiency [29]. The smaller the
value is, the faster the algorithm tends to run.
The formula for the running time is as follows:
\[ RT(t) = time_{end} - time_{start}. \]
Here, t denotes the number of times the same algorithm
is run. In this paper, all experiments are run 10 times.
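A minimal sketch of such a timing measurement is shown below. Averaging the elapsed time over the runs and the helper name `average_running_time` are assumptions; the paper states only that every experiment is run 10 times.

```python
import time
from typing import Callable


def average_running_time(func: Callable[[], object], runs: int = 10) -> float:
    """Run func the given number of times and return the mean elapsed seconds."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        end = time.perf_counter()
        elapsed.append(end - start)  # end minus start, so the value is non-negative
    return sum(elapsed) / len(elapsed)


# Hypothetical workload standing in for one clustering run
avg_seconds = average_running_time(lambda: sum(range(1000)), runs=10)
```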
5.3 Clustering for Student Group
Twenty-one Internet-related features and 17
consumption-related features are assigned
subjective weights when running the Wosk-means
algorithm. Notably, the effect of the Wosk-means
algorithm depends on k to some extent. When k < 5,
the sum of the squared error (SSE) decreases markedly
as k increases. When k > 5, the decrease is much
smaller (Fig.2). Furthermore, the difference in
scholarship distribution among clusters is largest at
k = 5 (Fig.3). Specifically, the students within cluster 3
and cluster 4 are mainly those who are granted
scholarships, while the students within cluster 0 and
cluster 1 are mainly those who are not. Therefore, this
paper sets the k value as 5.
Fig.2. SSE under each k value. [Plot omitted: horizontal axis, number of clusters (k); vertical axis, sum of the squared error (SSE), ×10^3.]
Fig.3. Scholarship distribution under each student group. [Plot omitted: horizontal axis, student groups (cluster 0 to cluster 4); vertical axis, number of students; series, granted and non-granted.]
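The SSE-based part of the choice of k can be sketched as a simple elbow heuristic: pick the k at which the SSE curve bends most sharply, measured by the second difference. This is only one possible heuristic and is not stated in the paper; the function name is hypothetical, the k values are assumed to be evenly spaced, and the SSE values in the usage example are purely illustrative, not the values plotted in Fig.2. The scholarship-distribution criterion of Fig.3 is not covered by this sketch.

```python
def elbow_k(sse_by_k):
    """Return the k with the largest second difference of the SSE curve.

    sse_by_k maps each candidate k to its SSE; k values are assumed
    to be evenly spaced (for example 1, 2, ..., 10).
    """
    ks = sorted(sse_by_k)
    if len(ks) < 3:
        raise ValueError("need at least three k values")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        # Large positive bend: steep decrease before k, flat decrease after k
        bend = sse_by_k[prev_k] - 2 * sse_by_k[k] + sse_by_k[next_k]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Illustrative values only (hypothetical, not taken from the paper)
example_sse = {1: 100.0, 2: 80.0, 3: 62.0, 4: 46.0, 5: 32.0,
               6: 30.0, 7: 28.5, 8: 27.5, 9: 27.0, 10: 26.8}
chosen_k = elbow_k(example_sse)
```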
This paper further compares the proposed Wosk-means
algorithm with the basic k-means algorithm, the
bisecting k-means algorithm, and the k-means++
algorithm (Fig.4, Fig.5, and Fig.6). The better
algorithm is the one with higher values of the
silhouette coefficient and the Calinski-Harabasz index,
and lower values of SSE and running time. The proposed
algorithm outperforms the other three algorithms on
three of the four metrics (silhouette coefficient,
Calinski-Harabasz index, and SSE). These three metrics
reflect the algorithm's intra-cluster similarity. That
is, by considering both objective and subjective
weights, the proposed algorithm is better at
highlighting the importance of some features. However,
the proposed algorithm is not optimal in running
efficiency. As shown in Fig.6, the running time of the
Wosk-means algorithm ranks second among the four
algorithms.
Fig.4. Comparison of silhouette coefficient with models using the Wosk-means algorithm and the three baseline algorithms. [Plot omitted: bars for Wosk-means, k-means++, bisecting k-means, and k-means; vertical axis, silhouette coefficient.]
Fig.5. Comparison of SSE with models using the Wosk-means algorithm and the three baseline algorithms. [Plot omitted: bars for Wosk-means, k-means++, bisecting k-means, and k-means; vertical axis, SSE, ×10^3.]
Fig.6. Comparisons of Calinski-Harabasz index and running time with models using the Wosk-means algorithm and the three baseline algorithms. [Plot omitted: bars for Wosk-means, k-means++, bisecting k-means, and k-means; series, Calinski-Harabasz index and running time (s).]
Fig.7. Percentage of dormitory access within each cluster. [Charts omitted: one per cluster (cluster 0 to cluster 4), split into dormitory and others; labelled percentages in extraction order: 40.5%, 43.6%, 53.2%, 44.7%, 44.7%.]
5.4 Comparative Analysis After Clustering
Considering the above clustering analysis results, this paper further analyzes the students' lifestyle within each cluster (Table 2). In terms of smart card usage, students within cluster 4 ate breakfast most frequently, while students within cluster 0 seldom ate breakfast. The pattern remains the same when considering the detailed breakfast time slot, which is consistent with the finding that good lifestyle habits can improve students' academic achievement [30]. On the other hand, students within cluster 1 and cluster 2 bathed less frequently than the other students. In regard to canteen consumption, the frequency of going to the canteen did not differ much among the clusters. In terms of sports, students within cluster 0 and cluster 1 did less sports, while students within cluster 2 did more. Surprisingly, students within cluster 4 went to the school hospital most frequently, and students within cluster 0 and cluster 4 preferred to go to the printing shop. In terms of Internet usage, the usage amount and usage frequency of students within cluster 0 were both the highest, while those of students within cluster 2 were both the lowest. Hence, students within cluster 3 and cluster 4 had the best living habits. Although students within cluster 0 had better lifestyle habits, their breakfast habits were relatively worse and their Internet usage amount and frequency were higher. In addition, students within cluster 1 and cluster 2 both had bad living habits, but the latter had a higher sports frequency. The above analysis confirms that lifestyle and Internet usage habits can serve as a reflection of students' academic achievement.

Table 2. Students' Lifestyle Within Each Cluster

| Cluster | Grant: Proportion | Grant: Amount | Smart Card: Breakfast | Smart Card: Printing | Smart Card: Canteen | Smart Card: Bathing | Smart Card: Sport | Smart Card: Hospital | Internet: Downlink | Internet: Duration | Internet: Connections |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.97 | −97.80 | −15.09 | 58.55 | 4.90 | 22.29 | −28.71 | −1.17 | 56.21 | 43.34 | 36.04 |
| 1 | 0.43 | −99.42 | −6.27 | −28.28 | −2.27 | −6.05 | −13.11 | 1.01 | −9.34 | −2.94 | −1.25 |
| 2 | 8.02 | −89.12 | 3.45 | −21.71 | −4.68 | −37.86 | 18.41 | −11.42 | −29.17 | −42.22 | −45.11 |
| 3 | 95.41 | 296.45 | 2.90 | 25.15 | 7.90 | 19.90 | 6.94 | −27.83 | 7.73 | 7.58 | 9.83 |
| 4 | 97.66 | 404.46 | 42.65 | 40.91 | 0.46 | 21.09 | 5.18 | 41.30 | 19.66 | 20.37 | 21.08 |
This paper also studies students' daily habits (see Table A1 in the Appendix). First, students' habits are analyzed from the perspective of gender. For students within cluster 0, cluster 1 and cluster 2, the breakfast frequency of females was higher than that of males. Students within cluster 3 showed the opposite: the female students ate breakfast less frequently. Except for students within cluster 2, females were more likely to go to the school hospital than males. Within all clusters, females went to the printing shop and bathed more frequently, but did sports and went to the canteen less frequently than males. In terms of Internet usage, the female students within cluster 0 had more downlink dataflow than the male students, while the opposite held for the other clusters. The numbers of network connections and the durations of the female students within cluster 0, cluster 1 and cluster 2 were higher than those of the male students, and the opposite was true for students within cluster 3 and cluster 4. This finding seems to be in contrast to the conclusion in [21] that the effect of bandwidth on males is higher than that on females. Therefore, further discussion should be conducted on the relationship among gender, broadband and academic achievement. Similar comparative analyses have been carried out from the perspectives of discipline and grade, which are not detailed here. It should be noted that the values in Table A1 are differences from the mean.
In addition, this paper compares its conclusions with those of prior studies, as shown in Table A2 (in the Appendix). The claim that good lifestyle habits improve academic achievement still holds. Although females were more self-disciplined than males, this was not sufficient to explain academic achievement. Notably, Internet usage had a higher impact on the female students than on the male students, which is not in line with [21].
5.5 Experiments for Trajectory Deviation
After clustering analysis, this paper conducts the deviation analysis of the trajectories. First, the students' access frequency for each location within each cluster is identified. This paper first considers the case in which the length of trajectory matching is equal to 1. For each cluster, the dormitory is the location that students access most frequently (Fig.7). Furthermore, the matching pattern (dormitory → dormitory) far exceeds other matching patterns within each cluster. Hence, this paper conducts further study with the length of the matching pattern equal to 2.
The daily trajectory is modelled to obtain the frequent matching patterns of a student. To avoid information loss, the threshold is set to 30%. That is, if the frequency of a matching pattern exceeds 30%, it is considered a frequent matching pattern. The frequent matching patterns of all students within each cluster are then counted. To reduce the impact of uneven access frequency across locations, the similarity between each student's trajectory and the centroid trajectory of the cluster to which the student belongs is further calculated based on Definition 13. Table 3 shows the percentage of students with less than 30% similarity between the individual trajectory and the centroid trajectory within each cluster.
Table 3. Percentage of Students with Less Than 30% Similarity Between Individual Trajectory and Centroid Trajectory Within Each Cluster

| Cluster | [0,30%) | [0,20%) | [0,10%) |
|---|---|---|---|
| 0 | 6.84 | 3.90 | 1.17 |
| 1 | 6.78 | 3.85 | 1.18 |
| 2 | 5.10 | 2.40 | 0.66 |
| 3 | 3.89 | 2.84 | 0.41 |
| 4 | 5.22 | 4.10 | 1.46 |
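The 30% threshold on matching patterns can be illustrated with a simplified sketch. It counts contiguous location pairs in a single daily trajectory, which is a simplification and not the PrefixSpan algorithm used in the paper. The function name, the location labels and the definition of frequency as the share of all windows are assumptions made for illustration, and the similarity of Definition 13 is not reproduced here.

```python
from collections import Counter


def frequent_patterns(trajectory, length=2, threshold=0.30):
    """Return {pattern: frequency} for contiguous patterns above the threshold.

    trajectory is an ordered list of visited locations; frequency is the
    share of all sliding windows of the given length (an assumption).
    """
    windows = [
        tuple(trajectory[i:i + length])
        for i in range(len(trajectory) - length + 1)
    ]
    if not windows:
        return {}
    counts = Counter(windows)
    total = len(windows)
    return {p: c / total for p, c in counts.items() if c / total > threshold}


# Hypothetical daily trajectory (location names are illustrative only)
day = ["dormitory", "canteen", "classroom", "dormitory",
       "canteen", "library", "dormitory", "canteen"]
patterns = frequent_patterns(day)
```

In this toy trajectory only (dormitory → canteen) occurs in more than 30% of the length-2 windows, so it is the single frequent matching pattern.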
The value of trajectory deviation is equal to 1 minus the trajectory similarity. The threshold can be adjusted to suit the situation of different universities. To determine the direction of the deviation, this paper calculates the distance between these students and each cluster. Table 4 shows the percentage of students whose trajectory deviation is greater than the threshold within each cluster (the threshold is set to 90%). Here, the first column indicates the cluster that the student currently belongs to, and the header row indicates the cluster that the student tends to belong to, so each row sums to 100%. The trend in which students within cluster 0, cluster 1 and cluster 2 move toward cluster 3 and cluster 4 is considered worth encouraging. Conversely, the opposite trend could be used as a basis for academic early warning.
Table 4. Percentage of Students Whose Trajectory Deviation Is Greater Than 90% Within Each Cluster

| Cluster | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Cluster 0 | – | 41.67 | 33.33 | 8.33 | 16.67 |
| Cluster 1 | 25.00 | – | 58.33 | 13.89 | 2.78 |
| Cluster 2 | 12.50 | 37.50 | – | 50.00 | 0.00 |
| Cluster 3 | 0.00 | 33.33 | 66.67 | – | 0.00 |
| Cluster 4 | 10.00 | 20.00 | 40.00 | 30.00 | – |

Note: The column indicates the cluster that the student currently belongs to, and the row indicates the cluster that the student tends to belong to.
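The deviation rule and its direction can be sketched as follows. The deviation is 1 minus the similarity, as stated in the text; everything else is an assumption made for illustration: the function names, the use of a squared Euclidean distance between a hypothetical student feature vector and cluster centroids, and the choice of the nearest other cluster as the direction.

```python
def trajectory_deviation(similarity):
    """Deviation is 1 minus the trajectory similarity (both in [0, 1])."""
    return 1.0 - similarity


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def deviation_direction(student_vec, centroids, current, similarity, threshold=0.90):
    """Return the cluster the student tends toward, or None.

    None means the deviation does not exceed the threshold. The direction
    is taken as the nearest centroid other than the current cluster.
    """
    if trajectory_deviation(similarity) <= threshold:
        return None
    others = {c: v for c, v in centroids.items() if c != current}
    return min(others, key=lambda c: squared_distance(student_vec, others[c]))


def classify_trend(current, target, high_clusters=(3, 4)):
    """Label a move toward clusters 3/4 as 'encourage', away as 'warning'."""
    if target is None:
        return "none"
    was_high = current in high_clusters
    to_high = target in high_clusters
    if not was_high and to_high:
        return "encourage"
    if was_high and not to_high:
        return "warning"
    return "neutral"
```

A student in cluster 0 whose similarity to the cluster centroid is 0.05 has a deviation of 0.95, exceeds the 90% threshold, and is then assigned the direction of the closest other centroid.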
Ultimately, the analysis results confirm that colleges can both predict which students may receive scholarships from their daily life habits data and apply these data to provide decision support that helps low-performing students get back on track. In terms of scholarship assignment, students within cluster 3 and cluster 4 had higher academic achievement. In detail, students within cluster 4 could be regarded as the main targets of large-amount scholarships, while students within cluster 3 could be regarded as the main targets of small-amount scholarships. The trajectory deviation analysis indicates that some students with trajectory deviations greater than 90% show a trend of moving toward cluster 3 and cluster 4. In other words, the trajectories of these students are highly similar to those of students with higher academic achievement. Therefore, these students are the key targets that colleges and universities should encourage, and they can be considered potential recipients of small-amount scholarships to further enhance their enthusiasm for learning. Conversely, students within cluster 0 and cluster 1, and students strongly tending toward cluster 0 and cluster 1, could be deemed the targets that should be given more attention by colleges and universities.
6 Related Work
Educational data mining is the main area for exploring
students' behavioural data. With the assistance of
a multi-instance multi-label algorithm, some scholars
utilized the pre-course information of each student to
predict their performance on each subsequent course [31]
and graduation failure [32]. In terms of utilization of
in-campus data, some studies have provided substantial
inspiration for this paper. For instance, Buniyamin
et al. [33] utilized campus information system data to
classify and predict student achievement. Wu et al. [13]
visualized spatial temporal features of student perfor-
mance from the activity and consumption data on cam-
pus. Hang et al. [34] explored students’ check-in be-
haviour data to predict the point-of-interest informa-
tion of college students. Guan et al. [14] and Ye et al. [15]
leveraged campus information system data to study
the assignment issues of school scholarships and grants.
Hence, this paper adopts some research ideas and meth-
ods from the above studies in exploring the factors that
influence the academic achievement of college students.
Trajectory data not only records the physical information of a user but also reflects the personal habits and preferences of a user to a certain extent [35]. Some scholars have evaluated the similarity among users based on GPS trajectory mining [35, 36]. It is quite difficult to obtain the GPS records of college students, and thus, this paper generates the trajectory sequential data for each student from in-campus behavioural data, which is in line with prior studies [14, 15]. To the best of our knowledge, we are the first to study the problem of identifying influencing factors of academic achievement by utilizing students' behavioural data.
Another related area is clustering analysis, which has been extensively applied in prior studies. In clustering analysis, the assignment of feature weights is quite important because only some features are truly valuable [23]. Huang et al. [27] proposed the W-k-means algorithm without sacrificing the efficiency of the k-means algorithm, which stems from the fuzzy C-means (FCM) algorithm. Furthermore, Hung et al. [37] proposed selecting the initial weights based on the coefficient of variation, thus improving the performance of the W-k-means algorithm. Abductive learning is a framework bridging machine learning and logical reasoning [38], which can selectively infer certain facts and hypotheses that explain phenomena and observations based on known background knowledge. In this paper, domain knowledge is taken into account when determining the weights of features. Hence, this paper determines the weights of features for the clustering algorithm from both objective and subjective perspectives.
7 Conclusions
This paper developed the TMS approach to explore the factors that affect the academic achievement of college students and to provide decision-making support for early warnings. First, we segmented students into five disjoint groups through the Wosk-means algorithm. The behavioural factors in the clustering analysis exhibited some differences from prior studies, which provides new insights for research in educational contexts. Meanwhile, we analyzed manually generated trajectory data to quantify the direction of trajectory deviation of students through the PrefixSpan algorithm [11]. The results could help colleges identify students who are in need of academic warnings. We also noticed that bad habits and cognitive ability can reduce academic achievement, which is consistent with prior studies [19, 30]. Notably, female students with high academic achievement do not show better lifestyle habits than male students with high academic achievement, which is not completely consistent with prior studies [12, 39]. In addition, we observed that the impact of Internet usage on women is stronger than on men, which is opposite to the prior study [21]. In the future, we will consider other criteria (such as course scores, honors, and so on) to assess students' academic achievement based on large-scale long-term in-campus student data.
References
[1] Petrides K V, Frederickson N, Furnham A. The role of trait
emotional intelligence in academic performance and deviant
behavior at school. Personality and Individual Differences,
2004, 36(2): 277-293.
[2] Alloway T P, Alloway R G. Investigating the predictive roles
of working memory and IQ in academic attainment. Journal
of Experimental Child Psychology, 2010, 106(1): 20-29.
[3] Rampersaud G C, Pereira M A, Girard B L et al. Breakfast
habits, nutritional status, body weight, and academic per-
formance in children and adolescents. Journal of the Amer-
ican Dietetic Association, 2005, 105(5): 743-760.
[4] Pilcher J J, Morris D M, Donnelly J et al. Interactions
between sleep habits and self-control. Frontiers in Human
Neuroscience, 2015, 9: 284.
[5] Macan T H, Shahani C, Dipboye R L et al. College stu-
dents’ time management: Correlations with academic per-
formance and stress. Journal of Educational Psychology,
1990, 82(4): 760-768.
[6] Stadler M, Aust M, Becker N et al. Choosing between what
you want now and what you want most: Self-control ex-
plains academic achievement beyond cognitive ability. Per-
sonality and Individual Differences, 2016, 94: 168-172.
[7] Lundstrom S. The impact of family income on child achieve-
ment: Evidence from the earned income tax credit: Com-
ment. American Economic Review, 2017, 107(2): 623-628.
[8] Figlio D, Karbownik K, Roth J et al. School quality and
the gender gap in educational achievement. American Eco-
nomic Review, 2016, 106(5): 289-295.
[9] Jia J, Li D, Li X et al. Psychological security and deviant
peer affiliation as mediators between teacher-student rela-
tionship and adolescent Internet addiction. Computers in
Human Behavior, 2017, 73: 345-352.
[10] Leung K C. Preliminary empirical model of crucial determi-
nants of best practice for peer tutoring on academic achieve-
ment. Journal of Educational Psychology, 2015, 107(2):
558-579.
[11] Pei J, Han J, Mortazavi-Asl B et al. Prefixspan: Mining
sequential patterns efficiently by prefix-projected pattern
growth. In Proc. the 17th International Conference on Data
Engineering, April 2001, pp.215-224.
[12] Duckworth A L, Seligman M E P. Self-discipline outdoes
IQ in predicting academic performance of adolescents. Psy-
chological Science, 2005, 16(12): 939-944.
[13] Wu Y, Gong R, Cao Y et al. EduCircle: Visualizing spa-
tial temporal features of student performance from cam-
pus activity and consumption data. In Proc. International
Conference on Cooperative Design, Visualization and En-
gineering, October 2016, pp.313-321.
[14] Guan C, Lu X, Li X et al. Discovery of college students in
financial hardship. In Proc. IEEE International Conference
on Data Mining, November 2015, pp.141-150.
[15] Ye H J, Zhan D C, Li X et al. College student scholar-
ships and subsidies granting: A multi-modal multi-label ap-
proach. In Proc. the 16th IEEE International Conference
on Data Mining, December 2016, pp.559-568.
[16] Liu J, Wang D, Feng S et al. Learning distributed represen-
tations for community search using node embedding. Fron-
tiers of Computer Science, 2019, 13(2): 437-439.
[17] Zheng Y. Trajectory data mining: An overview. ACM
Trans. Intelligent Systems and Technology, 2015, 6(3): 1-
41.
[18] Singh A, Uijtdewilligen L, Twisk J W R et al. Physical
activity and performance at school: A systematic review
of the literature including a methodological quality assess-
ment. Archives of Pediatrics & Adolescent Medicine, 2012,
166(1): 49-55.
[19] Forrest C B, Bevans K B, Riley A W et al. Health and
school outcomes during children’s transition into adoles-
cence. Journal of Adolescent Health, 2013, 52(2): 186-194.
[20] Skryabin M, Zhang J J, Liu L et al. How the ICT develop-
ment level and usage influence student achievement in read-
ing, mathematics, and science. Computers & Education,
2015, 85: 49-58.
[21] Belo R, Ferreira P, Telang R. Broadband in school: Impact
on student performance. Management Science, 2013, 60(2):
265-282.
[22] Han J, Pei J, Kamber M. Data Mining: Concepts and Tech-
niques (3rd edition). Morgan Kaufmann Publishers, Mas-
sachusetts, USA, 2011.
[23] Liu H, Yu L. Toward integrating feature selection algo-
rithms for classification and clustering. IEEE Trans. Know-
ledge and Data Engineering, 2005, 17(4): 491-502.
[24] Wettschereck D, Aha D W, Mohri T. A review and em-
pirical evaluation of feature weighting methods for a class
of lazy learning algorithms. Artificial Intelligence Review,
1997, 11(1/2/3/4/5): 273-314.
[25] Tsai C Y, Chiu C C. Developing a feature weight self-
adjustment mechanism for a K-means clustering algorithm.
Computational Statistics and Data Analysis, 2008, 52(10):
4658-4672.
[26] Modha D S, Spangler W S. Feature weighting in k-means
clustering. Machine Learning, 2003, 52(3): 217-237.
[27] Huang J Z, Ng M K, Rong H et al. Automated variable
weighting in k-means type clustering. IEEE Trans. Pattern
Analysis & Machine Intelligence, 2005, 27(5): 657-668.
[28] Lord E, Willems M, Lapointe F J et al. Using the stability
of objects to determine the number of clusters in datasets.
Information Sciences, 2017, 393: 29-46.
[29] Kushnir D, Jalali S, Saniee I. Towards clustering high-
dimensional Gaussian mixture clouds in linear running
time. In Proc. the 22nd International Conference on Arti-
ficial Intelligence and Statistics, April 2019, pp.1379-1387.
[30] Basch C E. Healthier students are better learners: A miss-
ing link in school reforms to close the achievement gap.
Journal of School Health, 2011, 81(10): 593-598.
[31] Ma Y, Cui C, Nie X et al. Pre-course student performance
prediction with multi-instance multi-label learning. Science
China Information Sciences, 2019, 62(29101): 1-3.
[32] Lakkaraju H, Aguiar E, Shan C et al. A machine learn-
ing framework to identify students at risk of adverse aca-
demic outcomes. In Proc. the 21st ACM SIGKDD Int.
Conf. Knowledge Discovery and Data Mining, August 2015,
pp.1909-1918.
[33] Buniyamin N, bin Mat U, Arshad P M. Educational data
mining for prediction and classification of engineering stu-
dents’ achievement. In Proc. the 7th IEEE Int. Conf. En-
gineering Education, November 2015, pp.49-53.
[34] Hang M, Pytlarz I, Neville J. Exploring student check-in
behavior for improved point-of-interest prediction. In Proc.
the 24th ACM SIGKDD Int. Conf. Knowledge Discovery
& Data Mining, July 2018, pp.321-330.
[35] Li Q, Zheng Y, Xie X et al. Mining user similarity based
on location history. In Proc. the 16th ACM SIGSPATIAL
International Conference on Advances in Geographic In-
formation Systems, November 2008, pp.1-10.
[36] Xiao X, Zheng Y, Luo Q et al. Finding similar users using
category-based location history. In Proc. the 18th SIGSPA-
TIAL International Conference on Advances In Geographic
Information Systems, November 2010, pp.442-445.
[37] Hung W L, Chang Y C, Lee E S. Weight selection in W-K-means
algorithm with an application in color image seg-
mentation. Computers & Mathematics with Applications,
2011, 62(2): 668-676.
[38] Zhou Z H. Abductive learning: Towards bridging machine
learning and logical reasoning. Science China Information
Sciences, 2019, 62(7): 76101.
[39] Duckworth A L, Seligman M E P. Self-discipline gives girls
the edge: Gender in self-discipline, grades, and achievement
test scores. Journal of Educational Psychology, 2006, 98(1):
198-208.
Xiao-Lin Li received her Ph.D. de-
gree in computer science from the School
of Computer Science and Technology,
Jilin University, Changchun, in 2005.
She was a postdoctoral researcher of
the Department of Computer Science
and Technology of Nanjing University,
Nanjing, from 2005 to 2007. Currently
she is an associate professor in the School of Management,
Nanjing University, Nanjing. Her research interests include
data mining, business intelligence, and decision making.
She has published in refereed journals and conference
proceedings, such as TKDE, DSS, INS, KDD, AAAI. She
was on programme committees of conferences including
KDD, AAAI, IJCAI.
Li Ma currently is a Master student
of the School of Management, Nanjing
University, Nanjing. Her major re-
search interests include online customer
behaviours and data mining.
Xiang-Dong He is currently a
senior engineer of the Information
Technology Services Center, Nanjing
University, Nanjing. His current re-
search interests include smart campus,
information security and IT project
management.
Hui Xiong is currently a full
professor at Rutgers, the State
University of New Jersey, where
he received the 2018 Ram Charan
Management Practice Award as the
Grand Prix winner from the Har-
vard Business Review, RBS Dean’s
Research Professorship (2016), the Rutgers University
Board of Trustees Research Fellowship for Scholarly
Excellence (2009), the ICDM Best Research Paper Award
(2011), and the IEEE ICDM Outstanding Service Award
(2017). He received his Ph.D. degree from the University
of Minnesota (UMN), Minnesota. He is a co-Editor-
in-Chief of Encyclopedia of GIS, an associate editor of
IEEE Transactions on Big Data (TBD), ACM Transac-
tions on Knowledge Discovery from Data (TKDD), and
ACM Transactions on Management Information Sys-
tems (TMIS). He has served regularly on the organiza-
tion and program committees of numerous conferences,
including as a program co-chair of the Industrial and
Government Track for the 18th ACM SIGKDD Inter-
national Conference on Knowledge Discovery and Data
Mining (KDD), a program co-chair for the IEEE 2013
International Conference on Data Mining (ICDM), a
general co-chair for the IEEE 2015 International Confe-
rence on Data Mining (ICDM), and a program co-chair
of the Research Track for the 2018 ACM SIGKDD In-
ternational Conference on Knowledge Discovery and
Data Mining. He is an IEEE Fellow and an ACM Dis-
tinguished Scientist.
Appendix
Table A1. Student Characteristics Considering Differences of Gender, Discipline and Grade

| Cluster | Classification | Internet: Downlink Dataflow | Internet: Duration | Internet: Connections | Smart Card: Breakfast | Smart Card: Printing | Smart Card: Canteen | Smart Card: Bathing | Smart Card: Sport | Smart Card: Hospital |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Female | 3.05 | 14.76 | 13.95 | −10.42 | 51.94 | −5.15 | 39.72 | −12.51 | 8.19 |
| 0 | Male | 0.75 | 3.75 | 6.70 | −34.60 | 39.70 | 21.05 | −1.80 | 8.95 | −42.63 |
| 0 | Liberal arts | 2.00 | 15.87 | 15.18 | −15.41 | 46.79 | −1.05 | 31.88 | −17.36 | 4.32 |
| 0 | Science | 2.49 | 4.74 | 6.76 | −23.59 | 48.40 | 10.68 | 16.23 | 9.78 | −27.38 |
| 0 | 2011 | 2.88 | 6.24 | 4.66 | −7.54 | 35.73 | 5.26 | 22.25 | 15.75 | 26.01 |
| 0 | 2012 | 2.34 | 14.53 | 16.62 | −26.93 | 57.45 | 3.81 | 26.66 | −25.41 | −36.18 |
| 1 | Female | −12.16 | 7.48 | 6.96 | 6.12 | −19.04 | −16.17 | 29.85 | −0.57 | 28.28 |
| 1 | Male | −7.44 | −5.96 | −3.78 | −15.30 | −49.10 | 1.83 | −22.06 | 28.85 | −15.91 |
| 1 | Liberal arts | −15.38 | 4.00 | 4.27 | 2.06 | −11.71 | −9.02 | 13.50 | −38.43 | 8.65 |
| 1 | Science | −6.38 | −3.85 | −2.13 | −12.54 | −50.47 | −1.99 | −12.74 | 43.48 | −5.67 |
| 1 | 2011 | −9.95 | −2.76 | −1.91 | 3.23 | −49.04 | −7.22 | −5.29 | 93.95 | 1.49 |
| 1 | 2012 | −6.96 | 0.80 | 3.07 | −32.14 | −19.08 | 2.41 | −4.83 | 13.85 | −7.33 |
| 2 | Female | −34.52 | −42.64 | −45.53 | 20.65 | −24.50 | −13.45 | −29.94 | −39.99 | −16.29 |
| 2 | Male | −31.65 | −45.43 | −48.45 | 2.00 | −29.03 | 7.55 | −57.83 | 41.68 | −11.32 |
| 2 | Liberal arts | −38.95 | −43.91 | −47.62 | −1.45 | −12.26 | −13.00 | −39.04 | −20.43 | −0.80 |
| 2 | Science | −27.78 | −43.48 | −45.63 | 29.00 | −40.52 | 2.21 | −42.01 | 1.31 | −28.36 |
| 2 | 2011 | −43.07 | −51.11 | −52.98 | 13.95 | −44.67 | −9.61 | −45.73 | 23.65 | −2.54 |
| 2 | 2012 | −5.44 | −21.82 | −27.98 | 13.23 | 27.05 | 7.39 | −24.75 | −34.99 | −48.86 |
| 3 | Female | 4.31 | 7.74 | 9.39 | −21.80 | 43.90 | −2.96 | 37.64 | −40.79 | −37.79 |
| 3 | Male | 10.41 | 7.81 | 10.77 | −10.69 | 1.91 | 14.06 | 7.24 | −5.14 | −48.16 |
| 3 | Liberal arts | −0.29 | 6.45 | 9.00 | −17.79 | 106.40 | 3.98 | 40.70 | −25.53 | −66.23 |
| 3 | Science | 11.87 | 8.42 | 10.79 | −13.99 | −23.04 | 8.62 | 9.51 | −29.07 | −33.17 |
| 3 | 2011 | −1.99 | 20.52 | 19.24 | −41.90 | −86.05 | −11.18 | −2.60 | −14.89 | −7.26 |
| 3 | 2012 | 8.14 | 6.87 | 9.60 | −13.79 | 25.58 | 7.92 | 20.31 | −63.43 | −46.59 |
| 4 | Female | 47.11 | 39.28 | 31.68 | 40.34 | 38.53 | −13.47 | 32.19 | −20.61 | 46.95 |
| 4 | Male | 84.99 | 54.09 | 46.39 | 40.44 | 15.35 | 28.12 | −4.67 | 77.18 | 29.07 |
| 4 | Liberal arts | 52.07 | 55.70 | 46.02 | 19.73 | 71.68 | −5.08 | 34.76 | −29.92 | 81.43 |
| 4 | Science | 63.60 | 34.31 | 28.30 | 56.66 | −0.12 | 2.33 | 10.30 | 39.43 | 10.11 |
| 4 | 2011 | −7.15 | −20.10 | −21.53 | 149.31 | −100.00 | −24.34 | −29.87 | 40.60 | 109.47 |
| 4 | 2012 | 62.24 | 47.65 | 39.60 | 33.17 | 40.84 | 0.88 | 24.34 | 10.07 | 38.77 |

Note: The variance of different clusters can be revealed from these numerical characteristics to some extent.
Table A2. Comparison About Research Results

| Perspective | Our Conclusions | Behaviours in This Paper | Prior Studies | Comparison |
|---|---|---|---|---|
| Lifestyle Habits | Students with higher academic achievement have better lifestyle habits | The frequencies of eating breakfast, going to the canteen, sports and bathing are higher than the average | Bad habits and cognitive ability can reduce academic achievement, while good lifestyle habits can improve academic achievement [19, 30] | Consistent |
| | Students with lower academic achievement have poorer lifestyle habits | The frequency of eating breakfast is relatively low, and the frequencies of exercise and bathing are relatively low | | |
| | For students with lower academic achievement, women are more self-disciplined than men | Women with lower academic achievement eat more frequently than men with lower academic achievement | Women are more self-disciplined than men, which explains the female dominance in academic achievement [12, 39] | Not completely consistent opposite |
| | For students with higher academic achievement, women are not more self-disciplined than men | Women with higher academic achievement do not eat more frequently than men with higher academic achievement | | |
| | For students with lower academic achievement, liberal arts students are more self-disciplined than science students | Liberal arts students with lower academic achievement have higher breakfast frequency than science students with lower academic achievement | | May be related to the gender structure of the department |
| Learning Habits | Students with lower academic achievement are not necessarily bad at learning habits | Some students with lower academic achievement go to the print shop more frequently than students with higher academic achievement | - | - |
| Internet Usage | There is no direct correlation between Internet usage and academic achievement | Among students who use high dataflow, duration, and connections, there are some students with high academic achievement and some students with low academic achievement | The use of Internet communication technology in schools has not affected student performance [21] | Consistent |
| | Internet communication technology has a higher impact on the academic achievement of women than on that of men | Women with low academic achievement have higher dataflow usage, duration and connections than men with low academic achievement; women with high academic achievement have lower dataflow usage, duration and connections than men with high academic achievement | Internet communication technology has a higher impact on men than on women [21] | Opposite |