Click Fraud Prevention via multimodal evidence fusion by Dempster-Shafer theory

Abstract—We address the problem of combining information from diversified sources in a coherent fashion. A generalized evidence processing theory and an architecture for data fusion that accommodates diversified sources of information are presented. Different levels at which data fusion may take place such as the level of dynamics, the level of attributes, and the level of evidence are discussed. A multi-level fusion architecture based Collaborative Click Fraud Detection and Prevention (CCFDP) system for real time click fraud detection and prevention is proposed and its performance is compared with a traditional rule based click fraud detection system. Both systems are tested with real world data from an actual ad campaign. Results show that use of multi-level data fusion improves the quality of click fraud analysis.

I. INTRODUCTION EB search is a fundamental technology for

navigating the Internet and it provides access to information for millions of users per day. Internet search engine companies, such as Google, Yahoo, and Bing have revolutionized not only the use of the Internet by individuals but also the way businesses advertise to consumers [9]. Typical search engine queries are short and reveal a great deal of information about user preferences. This gives search engine companies a unique opportunity to display highly targeted ads to the user. These search services are expensive to maintain and depend upon advertisement revenue to remain free for the end user [8]. Many search service companies generate advertisement revenue by selling clicks. This business model is known as Pay-Per-Click (PPC) model.

In the PPC model, Internet content providers are paid for each time an advertisement link on their website is clicked leading to the sponsoring company’s content. There is an incentive for dishonest service providers to inflate the number of clicks their sites generate. In addition, dishonest advertisers tend to simulate clicks on the advertisements of their competitors to deplete their advertising budgets [10]. Generation of such invalid clicks either by humans or software with the intension to make money or deplete competitor’s budget is known as click fraud (CF).

The diversity of CF attack types makes it hard for a single counter measure to prevent click fraud. Therefore, it is

Mehmed Kantardzic, Chamila Walgampaya, Roman Yampolskiy, and

Ryu Joung Woo are with the Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA. fmmkant01,ckwalg01,[email protected],[email protected].

important to be able to combine multiple measures capable of effective protection from CF. A real time CF prevention system based on multi-model and multi-level data fusion is described in this paper. Each independent component can be considered as an invisible data mining module, in which “smart” software incorporates data mining into its functional components, often unbeknownst to the user [5]. Evidence for CF from multiple models are “fused” in this system, using Dempster-Shafer evidence theory [13], so that it achieves improved accuracy for detecting fraudulent traffic. Conversely it increases the quality of clicks reaching advertisers’ websites.

This paper is organized as follows. In section II data collection process in the CCFDP system is briefly introduced while in section III an introduction to the multi-sensor data fusion is given. Section IV describes the fusion architecture of the CCFDP system. Experimental results and discussion are given in section V. Conclusions and future works are given in section VI.

II. ARCHITECTURE AND DATA COLLECTION PROCESS OF THE CCFDP SYSTEM

The CCFDP system was developed to collect data about each click, involving the data fusion between client side log and server side log. The data collection is a five step process (see Figure 1). A user sends a request for a web site (for example by clicking on an ad)(message 1). The web server sends server side data and query the Blocking database in the Global Fraudulent Database (GFD) to see if its IP, Referrer, Country etc. are blocked due to fraudulent behaviors in the past(message 2). If found the click is marked invalid (message 3). If not the requested web site is sent to the client with a javaScript attached to it(message 4), that will continuously send user activities to the GFD(message 5).

Fig. 1. Data Collection Process in the CCFDP System.

Click Fraud Prevention via Multimodal Evidence Fusion by Dempster-Shafer Theory

Mehmed Kantardzic, Chamila Walgampaya, Roman Yampolskiy, Ryu Joung Woo

W

2010 IEEE International Conference on Multisensor Fusion andIntegration for Intelligent SystemsUniversity of Utah, Salt Lake City, UT, USA, Sept. 5-7, 2010

978-1-4244-5426-6/10/$26.00 ©2010 IEEE 26

https://www.researchgate.net/publication/200110650_Detectives_Detecting_coalition_hit_inflation_attacks_in_advertising_networks_streams?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/243433534_Click_Fraud_Resistant_Methods_for_Learning_ClickThrough_Rates?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/225070185_A_Mathematical_Theory_of_Evidence?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3


Then the client side and the server side data are fused (shown as “Data Fusion” in Figure 2). The click is validated based on scores given by three independent modules in the GFD. They are rule based module (shown as “online click scoring” in Figure 2), click map module (shown as “interactive analysis” in Figure 2), and outlier detection module (shown as “extended online monitoring” in Figure 2). A detail explanation of this process can be found at [6]. In each of these modules, output is a probabilistic measure of evidence for the click being fraudulent. Authors have discussed the functionality of each of these modules in detail before [6],[7]. In this paper we discuss in detail the fusion process of evidence collected from these separate modules.

Fig. 2. Architecture of the CCFDP System

III. MULTI-SENSOR DATA FUSION Data fusion is a process dealing with the association,

correlation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates, and complete and timely assessments of situations and threats, and their significance [4].

Data fusion encompasses theory, techniques and tools conceived and employed for exploiting the synergy in information acquired from multiple sources. The objective is that the resulting decision or action is in some sense better (qualitatively or quantitatively, in terms of accuracy, robustness etc.) than it would be if any of these sources were used individually, i.e., without exploiting synergy.

Waltz and Llinas have described advantages related to the incorporation of data fusion [16]:

(i) Robustness and reliability: The system is operational even if one or several sources are missing or malfunctioning.

(ii) Extended coverage in space and time. (iii) Increased dimensionality of the data space: It

increases the quality of the deduced information while

reducing vulnerability of the system. (iv) Reduced ambiguity: More complete information

provides better discrimination between available hypotheses. (v) Solution to information explosion. Data fusion techniques are widely used for target

identification and tracking, situation awareness, threat assessment, and military and public security applications [1],[3],[14],[17],[18]. In most of these applications the fusion model has been selected mainly considering the practical application since until today there is no universally

�accepted fusion model available. In this paper, speci c fusion process architecture has been introduced to our application based on the Joint Directors of Laboratories (JDL) model [15]. In the JDL model we differentiate between the following levels of abstraction:

• Data/Observation level fusion: Measurements which can be univariate, multivariate, and/ or multidimensional, measurements may also exhibit temporal, spatial properties etc. are fused at this level.

• Variable level fusion: A variable is derived from data using a data analysis algorithm. Transformed domain variables are fused at this level.

• Decision Level: When the results of the high level fusion are available, variables can be interpreted for decision making. The final result is obtained at the decision module by fusing the local decisions of the system to get a more precise and comprehensive understanding of the system’s situation.

Different fusion algorithms are used in different fusion levels, such as statistical estimation, Kalman filter, fuzzy integration, neutral networks, Dempster Shafer(D-S) evidence theory and so on. Of these fusion methods, we have chosen D-S evidence theory. It is widely known for better handling uncertainties. Moreover, it provides flexible information processing and can deal with asynchronous information [12]. In the following section we give a brief description of the information fusion process in D-S evidence theory. Refer to [13] for more implementation details of the D-S evidence Theory.

Suppose 1m and 2m are two mass functions formed based on information obtained from two different information sources (about the same click C) in the same frame of discernment; Dempster’s orthogonal rule to combine these evidences is depicted as

))(()( 21 CmmCm ⊕= (1) In our system, evidence are used to either support a click

being genuine or fraudulent. Therefore it becomes a two class problem. Accordingly we have modified the calculation of )(Cm for the CCFDP system. For a two class problem, we can simplify the equation for combination of

evidence to:

∏∏

∏

==

=

−+= n

ii

n

ii

n

ii

rr

rS

11

1

)1(

(2)

27

https://www.researchgate.net/publication/224381152_Improving_Click_Fraud_Detection_by_Real_Time_Data_Fusion?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/224381152_Improving_Click_Fraud_Detection_by_Real_Time_Data_Fusion?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/237285899_Data_Fusion_for_Traffic_Incident_Detection_Using_DS_Evidence_Theory_with_Probabilistic_SVMs?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/224368835_A_method_of_Distributed_Decision_Fusion_based_on_SVM_and_D-S_evidence_theory?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/253117110_Automatic_map_updating_by_fusion_of_multispectral_images_in_the_Dempster-Shafer_framework?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/4215838_D-S_Evidence_Theory_and_its_Data_Fusion_Application_in_Intrusion_Detection?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/224460807_A_Data_Fusion_Based_Intrusion_Detection_Model?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3


https://www.researchgate.net/publication/233806994_Multisensor_Data_Fusion?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3

https://www.researchgate.net/publication/224327295_Application_of_D-S_evidence_theory_in_flow_regime_identification_of_two-phase_horizontal_pipe_flow?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3


where ir is the output from each model discussed in the sections IV, and n is the number of models.

IV. FUSION OF EVIDENCE OF CLICK FRAUD IN THE CCFDP SYSTEM

Fusion of evidence from different sources can be done only with the assumption that they are independent. We have calculated the Pearson’s correlation coefficient between scores from rule based module and the outlier module. It is 0.012 and we believe this provide enough evidence to treat the modules as independent.

Click i

D

A BClick i

Click i 1

DC

Fig. 3. Event Vs. Evidence

A. Event vs. evidence Incoming click is the event we consider in the CCFDP

system. The evidence provides support information for this event’s past and present activities. For example consider the following four pieces of evidence of CF. Evidence A: IP associated with the click generates software clicks (A software generated click such as by a click Bot). Evidence B: Search engine crawler. Evidence C: Past referrals from the associated Referrer are counterfeit. Evidence D: Click does not have user activities.

A piece of evidence can be associated with multiple possible events unlike traditional probability theory where evidence is associated with only one event. For example the click event i in Figure 3 is associated with evidence A, C, and D. Click event )1( −i associated with evidence D. The two events share event D. Evidence can overlap (A and C), one evidence can be a specific case of more general evidence (A and B) etc. In CCFDP we try to maximize the detection of evidence associated with an event. We achieve this by both using extended context of the click and heterogeneous models that are designed to detect specific fraudulent activities.

B. Evidence in the rule based module: rules. Each rule has a value between 0 and 1. This module

extensively analyzes the context of the incoming click. It also uses the information in the GFD (Global Fraudulent Database) and BDB (Blocking Data Base). Final score of this module is obtained by fusing these individual scores using the D-S evidence theory.

C. Evidence in the outlier detection module: The outlier detection module defines baselines for each

click parameter (IP, referrer, country, etc.) considering the data collected in the past. We maintain two sets of outliers. Local outliers are based on one set of click parameters and Global outliers are defined using another set of click parameters. Baselines are compared with the variations in the current context of the click to detect suspicious patterns in the data. The variation (evidence) in each parameter is represented as a probabilistic score. A variable level fusion is performed to combine the individual pieces of evidence using D-S evidence theory.

D. Evidence in the click map module: The “click map” tracks the current activities being

performed by the user while the user’s session is active. “Click map” assigns a score for each click based on click’s location relative to the positions of the advertisements in the webpage the user is viewing. Currently, the click map module works as a classifier which filters out clicks which are off position with respect to the advertisement area.

CCFDP combines the past and present evidence for each click to better understand its current and future behavior of incoming clicks to a website. If the click is found fraudulent (where the combined score is greater than a threshold), the associated sources of the click will be moved to a suspicious database, which will be immediately used by modules discussed above, to score the next new incoming click. The model-driven fusion process of CCFDP is depicted in Figure 4.

Fig. 4. Model-driven fusion process of CCFDP

Real-time data feeds from three sources: server side,

client side, and extended context of the click ( 321 ,,, SandSS ). This is represented by sensors in Figure 4. In the data preprocessing stage we standardize (align) the input data. The concept of alignment is an integral part of the fusion process, and assumes “common language” between the inputs and includes the standardization of measurement units. The scores from the rules based module (DM model 1), outlier detection module (DM model 2), and click map module (DM model 3) are then combined using D-S evidence theory at the decision level. The combination of scores will be used to dynamically adjust advertising profiles in such a way that low quality sources of traffic will no longer be shown advertisements.

28

V. EXPERIMENTAL RESULTS AND DISCUSSION All of our experiments used click data from

www.hosting.com and www.thebestmusicsites.org websites. The process was started on January 7th, 2007 and it is still collecting data. As of June 30th, 2008 we have collected around 1,400,000 natural and 25,000 paid click data points. Even though the CCFDP system is running in real time now, during 2007-2008 period we did not have the online version of CCFDP. Therefore, experiments were conducted by taking the click data that we had and simulating the progression of days. Significant findings from these experiments led us to build the real time version of CCFDP, which is now available at www.netmosaics.com.

Initial version of CCFDP was designed using a rule based system, which closely matches most of the click fraud detection systems available today. The new CCFDP has outlier module and the click map module in addition to an improved rule based system with additional click context information. Experiments are performed on both old and new versions of CCFDP.

1

1.2

Rulebase module score outlier module score final score

ClickScore

0.4

0.6

0.8

Number of Clicks

0

0.2

1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73

Fig. 5. Variation of score for a blacklisted IP

We have mentioned in previous work [6] that highly

suspicious members of IP, referrer, country, etc. are blacklisted once they are detected by the CCFDP system. Figure 5 shows the variation of scores for such a blacklisted IP. In this graph 0 corresponds to a valid click and 1 is assigned for an invalid click. Even a valid IP can have many 1s recorded as its final score due to situations like double clicks. We do not penalize users for double clicking on an advertisement since the user may be accustomed to Microsoft’s default way of choosing something on the screen. This can be clearly observed in Figure 5 where there are clicks that hit the line 1 even before it gets blacklisted. The situation changes if the clicks are repeated more than 2 times during a given time interval. Once enough suspicious activities are detected the IP is finally moved to the blacklisted database and the search providers are notified. For example in Figure 5 the IP is moved to a black list after 64 clicks.

Figure 6 shows the distribution of final scores for all the clicks with two versions of CCFDP. The lighter graph (L) corresponds to the first version of CCFDP where only rule

based module was used.

Fig. 6. Variation of Final scores

The darker one (D) is the new version with multiple modules. Area I represents most of the valid clicks. This corresponds to the records with attributes which do not have presence in the fraudulent database and all key attributes satisfies the requirements defined in the algorithm to be a legitimate click. The percentage of traffic present in Area I with system L is much higher than that of system D. With the inclusion of multiple modals the suspiciousness of clicks has increased and the graph is shifted to the Area II with the system D, which is still in the safer region. Area III shows the suspected clicks. These are records with the attributes present in the fraudulent database or attributes that exceed certain threshold values. It can be clearly seen in the graph how the scores have increased after fusing multiple pieces of evidence from different modules. Area IV includes invalid clicks. Blocked traffic is identified as clicks with highly suspicious scores.

Additionally, we looked at the number of suspicious clicks in the Hosting.com advertising traffic during two different periods, where we had considerably more clicks (Figure 7(a) and (b)).

In these time periods outlier detection module has found increasing number of suspicious alarms. A closer inspection of the data shows that a number of factors contributed to this higher rate of outlier detection. One such factor was that there was a major shift in the keywords used in the Google advertising campaign. During these periods of time we analyzed the volume of total clicks from Google and its partner network.

Figure 8 shows the total and invalid traffics from Google and Google partner networks for the second month and the eighth month. First thing to observe is there is much higher volume of traffic in the eighth month compared to the second month. Second thing to observe is that traffic from Google partner networks in the eighth month is almost negligible. In the second month out of 307 direct Google referrals 71 are observed invalid, while 138 of 583 Google partner network referrals are detected invalid. In the eighth month Google only traffic is 2444 and nearly 50% (1036) of that traffic is found to be invalid. We also looked at the top

29

500

600

valid invalid

100

200

300

400

ClickCo

unt

0

Fig. 7(a). Click traffic in second month of 2007.

2500

3000

valid invalid

500

1000

1500

2000

ClickCo

unt

0

8/9/20

07

8/11

/2007

8/13

/2007

8/15

/2007

8/17

/2007

8/19

/2007

8/21

/2007

8/23

/2007

8/25

/2007

8/27

/2007

8/29

/2007

8/31

/2007

9/2/20

07

9/4/20

07

9/6/20

07

9/8/20

07

9/10

/2007

9/12

/2007

9/14

/2007

Fig. 7(b). Click traffic in the eighth month of 2007.

2444

1500

2000

2500

3000

Coun

t

30771

1036

583

22138 50

500

1000

2/14/07 2/29/07 8/24/07 9/6/07

Click

Google direct traffic Google partner network traffic

valid traffic for Google direct invalid traffic for Google direct

valid traffic for Google partner invalid traffic for Google partner Fig. 8. Traffic analysis for Google

7

10 12 93040506070

ckCo

unt

Total Invalid

5137 35 35

24 21 18 17 17 17

4 7 5 4 3 4

0102030

Clic

Google Publisher ID Fig. 9. Referrer analysis with Google publisher ID.

referrers in the Google partner network during the second month. Figure 9 shows these referrers identified by their Google publisher ID. These statistics are generated using the extended context of click data. We looked at the top five

countries, other than the US, present in the invalid Google only traffic. They are depicted in Figure 10. Both India (IN) and Turkey (TR) remained as a common candidate in both lists. It is the outlier detection module that detected traffic from these countries becoming suspicious. These evidence is added to the overall system for the purpose of detecting fraud.

Finally we looked at the changes in quality of traffic after implementing the multi-modal based CCFDP system (see Figure 11). The dataset was used in the rule based module alone and found that on average 53% of traffic is suspicious [6]. When running the outlier detection module alone on the dataset we discovered that about 34.6% of all clicks had one or more attributes that were found to have an outlying attribute-value pair count [7]. Clicks found to have an outlier will contribute evidence effecting partial scores. The CCFDP system computes the final score measuring suspicion for each click, and with the multi-modal system classified about 64% of paid traffic as fraudulent. We define the quality of traffic by 1-score. Using only the rule based module and the outlier module we have about 47% and 65% quality scores respectively. With the combined model we were able to get much better traffic with about 36% of quality clicks. With these results we can see that the multi-modal based CCFDP system is capable of improving the traffic quality at least by 10% compared to same models working alone.

14

16

18

20

Percen

tage

4

6

8

10

12

0

2

4

1 2IN MA MY PH TR CA TR GB IN AU

Country Fig. 10. Top 5 country lists for invalid Google only trffic

80%90%100%

Avg. Score of traffic Avg. Quality of traffic

20%30%40%50%60%70%

0%10%20%

Rule based modulealone

Outlier detectionmodule alone

Combined system

Fig. 11. Variation of quality of traffic before and after fusion

30

VI. CONCLUSIONS AND FUTURE WORKS

A. Conclusions In this paper we proposed a multi-modal real time

detection and prevention system for click fraud. The CCFDP system uses multi-level data to enhance the description of

�each click, and to obtain better estimation of the click traf c quality. Our system analyzes the extended click record using three independent data mining modules: rule based, outlier and click map. A score is assigned to each click based on the individual scores estimated by independent modules. Scores are combined using the Dempster-Shafer evidence theory. We have tested version 1.0 and version 2.0 of CCFDP systems with data from actual ad campaign. Results show that high percentage of click fraud is present even with most popular search engines such as Google. The multimodal based CCFDP estimated the average score as 64% where the 53% is the highest average score recorded by any individual module that ran the data alone. By these additional refinements we were able to increase the quality of the click traffic reaching a website by an average of 10%.

B. Future Works The CCFDP system is now available for public to try at

www.netmosaics.com. Its modules are continuously updated and tuned with new detection techniques. The emphasis in the next phase of CCFDP will be on proactively combating robot clicks (click bots).

In this study, fusion of information about a click is done using the D-S evidence theory. Our next goal is to compare the current results with that of other fusion techniques, especially with the probabilistic methods associated with the Bayesian networks.

VII. ACKNOWLEDGMENTS This research has been partially funded by National

Science Foundation (NSF) under grant #0637563 and Kentucky Science and Technology Corp. (KSTC) under grant #KSTC-144-401-07-018.

REFERENCES [1] Fusheng, Z., Feng, D., ”Application of D-S Evidence Theory in Flow

�Regime Identi cation of Two-Phase Horizontal Pipe Flow,” Proceedings of the 27th Chinese Control Conference, China, 2008.

[2] Ge L., Kantardzic M., King D., ”CCFDP: Collaborative Click Fraud Detection and Prevention System,” 18th International Conference on Computer Application in Industry and Engineering - CAINE 2005, Honolulu, November, 2005.

[3] Janez, F., Goretta, O., and Michel, A., ”Automatic map updating by fusion of multispectral images in the Dempster-Shafer framework,” Proceedings of SPIE, the International Society for Optical Engineering, vol. 4115, pp. 245-255, 2000.

[4] Lambert, D., ”A blueprint for higher-level fusion systems,” Special Issue on High-level Information Fusion and Situation Awareness, Volume 10, Issue 1,Pages 6-24,January 2009.

[5] Han, J., Kamber, M., Pei, J., Data Mining: Concepts and Techniques, 2 edition, Morgan Kaufmann, 2005

[6] Kantardzic, M., Walgampaya, C., ”Improving Click Fraud Detection by Real Time Data Fusion,” Proceedings of the Signal Processing and Information Technology, Sarajeveo, 2009.

[7] Kantardzic, M., Wenerstrom, B., ”Time and Space Contextual Information Improves Click Quality Estimation,” Proceedings of the Multi Conference on Computer Science and Information Systems, Portugal, 2009.

[8] Mahdian M., ”Theoretical challenges in the design of advertisement auctions,” The Capital Area Theory Symposia, University of Maryland, Spring 2006.

[9] Inmorlica, N., Jain, K., Mahdian, M., Talwar, K.,”Click fraud resistant methods for learning click-through rates,” Proceedings of the First International Workshop on Internet and Network Economics, Lecture Notes in Computer Science Vol. 3828, pp. 34-45, 2005.

[10] Metwally A., Agrawal, D., El Abbadi, A., ”Detectives: detecting �coalition hit in ation attacks in advertising networks streams,”

Proceedings of the 16th international Conference on WWW, Alberta, Canada, 2007.

[11] Metwally A. et al., ”Duplicate detection in click streams,” Proceedings of the 14th international conference on World Wide Web, Japan, May 2005.

[12] Ouyang, N., Liu, Z., Kang, H., ”A method of Distributed Decision Fusion Based On SVM and D-S Evidence Theory,” 5th International Conference on Visual Information Engineering, China, 2008.

[13] Shafer, G., ”A mathematical theory of evidence,” Princeton University Press, Princeton, N.J., 1976.

[14] Tian, J., Zhao, W., Du, R., and Zhang, Z. ”D-S Evidence Theory and its Data Fusion Application in Intrusion Detection”, In Proceedings of the Sixth international Conference on Parallel and Distributed Computing Applications and Technologies, China, 2005.

[15] U.S. Department of Defense, Data Fusion Subpanel of the Joint Directors of Laboratories, Technical Panel for C3, ”Data fusion lexicon,” 1991.

[16] Waltz, E. L. and Llinas, J. ”Multisensor Data Fusion”. Artech House, Inc., 1990

[17] �Zeng,D., Xu, J., ”Data Fusion for Traf c Incident Detection Using DS Evidence Theory with Probabilistic SVMs,” Journal of Computers, Vol.3, No.10, October 2008.

[18] Zhao, X., Jiang, H., Jiao, L., ”A Data Fusion Based Intrusion Detection Model,” First International Workshop on Education Technology and Computer Science, vol. 1, pp.1017-1021, 2009.

31

https://www.researchgate.net/publication/220922912_CCFDP_Collaborative_Click_Fraud_Detection_and_Prevention_System?el=1_x_8&enrichId=rgreq-ad76266eab77d877952b86a3b2f16e25-XXX&enrichSource=Y292ZXJQYWdlOzIyNDE4MzM5ODtBUzo5OTA1MTM2NjI1NjY0OEAxNDAwNjI3MDk0NzA3




































Click Fraud Prevention via multimodal evidence fusion by Dempster-Shafer theory

Documents

Transcript of Click Fraud Prevention via multimodal evidence fusion by Dempster-Shafer theory