Advanced Notification System - IS MUNI

Masaryk University Faculty of Informatics Advanced Notification System Diploma Thesis Bc. Filip Nguyen Brno, 2012

Masaryk University

Faculty of Informatics

Advanced Notification System

Diploma Thesis

Bc. Filip Nguyen

Brno, 2012

Statement

I declare that this thesis is my original copyrighted work, which I developed alone. All resources, sources, and literature that I used or drew on in preparing it are properly cited in the thesis, with a full reference to the source.

Adviser: RNDr. Jaroslav Škrabálek

Abstract

This thesis walks through the development of the notification system NotX, with the aim of building a highly reusable, flexible, platform and protocol independent solution. The service oriented system NotX is capable of notifying users of external information systems via various engines; currently an SMS engine, a Mail engine and a proposed Voice engine. Its adaptable design makes it possible to easily extend NotX with further engines, such as a Facebook engine, a Twitter engine or a Content Management System engine. The design of NotX also allows users to be notified in their own language with full localization support, which is necessary to bring value in today’s market. Most importantly, the core design of NotX allows it to run under heavy load comprising thousands of notification requests per second via various protocols (currently Thrift, Web Services and a Java client). NotX is thus designed to be used by state of the art Enterprise Applications that by default require certain properties of their external systems, such as scalability, reliability and fail-over.

Keywords

information system, soa, notx, sms, voice, phone, mail, notification system, enterprise, java, jms, sqs, jee, j2ee

Acknowledgment

I would like to thank my supervisor RNDr. Jaroslav Škrabálek, MBA for supporting me with his business knowledge and never ending flow of inspiration. I also want to thank Ing. Pavol Grešša, whose technical acumen gave clear boundaries to the development and even to my technical expertise.

Contents

1 Introduction
1.1 Structure of The Thesis
1.2 Notifications
1.3 Goals
2 Towards service oriented notification
2.1 Service Oriented Architecture
2.2 State of the Art Notification Systems
2.3 Service Oriented Notifications
3 Development methodology
3.1 State of the Art Methodologies
3.1.1 Waterfall model
3.1.2 Unified process
3.1.3 Agile methodologies
3.2 Scrum
3.2.1 Product backlog
3.2.2 Sprint backlog
3.2.3 Scrum meeting
3.2.4 Sprint review
3.3 Analytical extension to Scrum
3.3.1 Unified Modeling Language
3.3.2 HIT model
4 Analysis
4.1 Case Study
4.2 Key Architectural Points
4.3 Product Backlog
4.4 Domain Models
4.4.1 HIT model
4.4.2 Use Cases
5 Design and tools
5.1 NotX design
5.2 Programming Platform
5.2.1 Build system
5.2.2 Presentation Framework
5.2.3 DI Framework
5.2.4 IDE
5.2.5 SCM
5.2.6 Application Server
5.3 Persistence
5.3.1 NoSQL databases vs Relation databases
5.3.2 Cassandra NoSQL database
5.4 Messaging
5.4.1 JMS
5.4.2 SQS
5.4.3 Conclusion
5.5 Programming practices
5.5.1 Clean Code and Refactoring
5.5.2 General Responsibility Assignment Software Patterns
5.5.3 Design Patterns
6 Implementation
6.1 Engines
6.1.1 Deployment
6.1.2 Correlation
6.1.3 Email
6.1.4 SMS
6.1.5 Voice
6.2 Kernel
6.2.1 Application in Web Container
6.2.2 Routing
6.2.3 Supertemplates
6.3 Messaging
6.3.1 Class diagram
6.3.2 Multi-threading
6.4 Persistence
6.4.1 Diagram
6.5 Web interface
6.6 Communication module
6.6.1 API overview
6.6.2 API
6.7 Performance
6.7.1 Environment
6.7.2 Results
6.7.3 Discussion
7 Conclusion
7.1 Future work
Appendix
A Contents of the CD

List of Figures

2.1 NotX contact handling
3.1 Waterfall Model
3.2 Unified Process
3.3 Scrum
4.1 Conceptual ERD diagram for NotX data
4.2 Use case diagram of NotX system
5.1 NotX design
6.1 Component diagram
6.2 Correlation class diagram
6.3 SMSBrana decomposition
6.4 RoutingRecord class
6.5 SuperTemplate class
6.6 Amazon SQS
6.7 Cassandra schema
6.8 Tag structure
6.9 Tagging
6.10 Notification of IDs
6.11 Notification of tags

Chapter 1

Introduction

The development of Enterprise Information Systems (a viable definition of EIS can be found in [29]) is a very important part of the software development industry because EIS are the most basic kind of application software that covers activities within organizations ([34]). There are also many standards, frameworks, tools and development processes that support such effort. From a technological point of view, there are also some competing platforms (Java EE, Spring, .NET), and major software industry companies have their commercial offerings in the area (Microsoft, Oracle, Red Hat).

The process of developing EIS is highly unstable. The Standish Group gathered statistics on the success rate of IT projects. The results were disappointing: 30% of projects were unsuccessful in 1995, and the situation still isn’t getting much better. This alone makes it clear that more focus should be put on developing IS from reusable services (the Service Oriented Architecture paradigm), and that such components should be used more often than rolling out one's own solutions.

Notification through e-mail and SMS is a well known way of reaching EIS users. This thesis examines state of the art practices, platforms and methodologies for EIS component development, and uses them to build an enterprise level reusable service called NotX, which will encapsulate notification functionality.

The work presented in this thesis is going to be extended in further publications. A report on the implementation of NotX itself was also presented at the FedCSIS 2011 conference [41].

1.1 Structure of The Thesis

This chapter will lay out some high level definitions, motivation and concrete goals.

The chapter Towards service oriented notification focuses in detail on state of the art notification systems (NS) and discusses possible outputs of this thesis as well as possible future work that will be enabled by meeting the goals of this thesis.

Development methodology for EIS is a much disputed topic, with ongoing research among academics and turbulent changes in commercial companies. There are many process oriented methodologies (RUP, UP, etc.) and a new wave of agile practices (Scrum, Lean, TRIZ, XP, etc.). They are not mutually exclusive, but an agile process was chosen for the implementation of NotX. The reasons for that are given in the chapter Development methodology, which contains a more detailed discussion of the consequences.

The next chapter, Analysis, pinpoints the main requirements put on NotX and documents them in a semi-formal way. It is important to note that agile practices don’t generally use a formal capture of user requirements in the form of diagrams and up front modeling. The formal techniques are used only as whiteboard sketches which are not archived in any way. There are many reasons for this, which are discussed in the chapter Development methodology, and this thesis embraces those ideas. However, since readers of this thesis are not part of the development, it is necessary to use some of the techniques from Object Oriented Analysis and Design (OOAD).

The chapter Design and tools is more technical than any of the preceding chapters, and by its nature captures the current technological constraints that developers of EIS must face. Tools are described and their usage is discussed.

After the preceding chapters, everything is in place to present the implementation of NotX in the chapter Implementation, which is fairly technical. The given level of detail is chosen because many book authors and community members believe that having clean source code is a major quality for any software. This thesis agrees and tries to use the best practices.

The last chapter, Conclusion, summarizes what has been achieved, and presents the high level state of NotX together with plans for future work.

1.2 Notifications

Notification is a very direct way for an IS to interact quickly with the user. While almost every information system has standard e-mail notifications, SMS notifications and voice notifications (telephone calls via cellular networks) are not very common, but there are many instances where they could be useful:

• Urgent appeal through voice calls about the rescheduling of some event

• SMS for dual authentication processes where a confirmation SMS is sent

• SMS with any information that the user should have with him, for example a registration code for an event

There is of course a downside to using SMS and voice notifications: the price. An NS must be configurable so that the user can choose which events are so important to him that he is willing to pay the price for such notifications.

1.3 Goals

There are two main goals of the thesis. The first is to understand and choose the best practices and tools for implementation of the NS. These practices are in fact aligned with the implementation of a small EIS because NotX should be part of it. The second goal is to build a NotX prototype. The following minor goals follow as consequences.

To create a more realistic NS, NotX is integrated into an IS as a case study. Choosing a good case study for such an NS is essential because concrete requirements given by domain experts will guide the implementation to bring business value.

The next very important goal is platform independence, so that NotX has the potential to be hosted on as many platforms as possible.

To make NotX really extensible, the architecture should make it easy to add new notification channels instead of just containing a few out of the box.

Lastly, NotX itself should be developed with the best practices of software development in mind. There are many publications on this topic written mainly by advocates of agile development practices. This way it will be possible to publish NotX as an open source project allowing the community to contribute, which also helps the spreading of the software to both industry and community.

Chapter 2

Towards service oriented notification

This chapter introduces the main concepts of service oriented notifications as well as the key interest areas that will be examined and implemented in this thesis.

2.1 Service Oriented Architecture

Service Oriented Architecture (SOA) is the key concept upon which the NotX NS is built. NotX itself is a service. From [36]: "SOA is all about designing programs as a set of cooperating services that can interact with each other through the Internet and Web. The term service is essentially the business service. An enterprise application, designed using SOA, is composed of several services. These services are, usually, loosely coupled in nature so that new services can be added or existing services can be modified (or even retired) quickly, based on the dynamic nature of the business situation. Such organizations can be more competitive and are therefore more likely to survive and thrive in the business world."

Today, SOA is more than just a business term. It is understood as a combination of a business understanding of problems and their solutions with technology. A service can be defined as a self-contained business function, not necessarily tied to any technological solution.

2.2 State of the Art Notification Systems

The development of NS with its implications is described for example in [37], [45]. There are also a few ready made solutions for notifications ([1],[2],[3]). The main drawback of these solutions is a lack of openness. They are commercial products that are not extensible by a community. Openness is the key factor that will bring added value to an NS, because new ways of notification are entering the market every day (e.g. [4], push notifications or social networks). With an open solution, a community can implement additional channels and doesn’t have to wait for a vendor.

Notification systems also bring up an interesting topic of user identity. While there are a few attempts to consolidate a user’s identity ([5] for authentication), it seems that there is no existing system that would collect users’ contact information to be used across different EIS. That’s why users usually have to enter contact information for every service/EIS they sign into, and have to set up their notification preferences in each such service (for example an SMS notification in a reservation system of a fitness center). This is a big gap in state of the art EIS. The rest of this chapter discusses a way to solve this problem.

2.3 Service Oriented Notifications

As stated in the previous section, there is no central database of user contacts. NotX chooses to centralize all user contact information in one place. It neither reveals this information to client information systems (systems developed by companies that want to leverage NotX as a service) nor accepts it from them. This way only abstract IDs are used to address the recipient of the notification. Figure 2.1 shows the ideal state of service oriented notifications.

As seen in the figure, users don’t reveal their contact information to client information systems. This way their privacy is better secured, which is a high priority as leaks of such information do occur ([6]). With this architecture, a user’s information can be secured by a vendor who will be solely dedicated to protecting the contact information.

Also, because the user doesn’t enter his contact information repeatedly in many EIS, it is much easier for him to keep his contacts up to date in one place for all notifications.

Figure 2.1: NotX contact handling. The user (group: x342) manages contacts through NotX; the client IS sends notifications through NotX to the abstract group x342.

Another big advantage is that the configuration of the notification routes is centralized. A user can easily set up whether he wants to receive notifications via e-mail only, or he can selectively configure through which channels he wants to be notified.

The main obstacle is achieving wide adoption of NotX. For users to really take advantage of the stated benefits, all client information systems that are used by the user would have to use NotX as their NS. On the other hand, for the user, there is just the simple task of filling in his NotX ID in the client information system.
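The contact handling described in this section can be illustrated with a minimal sketch. It is not the NotX API: the class names, the `notify` method, the in-memory `outbox` and the sample addresses are all hypothetical, chosen only to show that a client IS passes nothing but an abstract ID, while contacts and routing preferences stay inside the service.

```python
from dataclasses import dataclass, field


@dataclass
class Recipient:
    """Contact data and routing preferences, stored only inside the service."""
    contacts: dict[str, str]  # channel name -> address
    enabled_channels: set[str] = field(default_factory=set)


class NotificationService:
    """Hypothetical stand-in for NotX, used for illustration only."""

    def __init__(self) -> None:
        self._recipients: dict[str, Recipient] = {}
        # (channel, address, text) tuples that an engine would deliver
        self.outbox: list[tuple[str, str, str]] = []

    def register(self, abstract_id: str, recipient: Recipient) -> None:
        # The user manages contacts and routes directly in the service.
        self._recipients[abstract_id] = recipient

    def notify(self, abstract_id: str, text: str) -> int:
        """Called by a client IS, which only ever sees the abstract ID."""
        recipient = self._recipients[abstract_id]
        sent = 0
        for channel in sorted(recipient.enabled_channels):
            address = recipient.contacts.get(channel)
            if address is None:
                continue  # channel enabled, but no contact on file
            self.outbox.append((channel, address, text))
            sent += 1
        return sent


service = NotificationService()
service.register(
    "x342",
    Recipient(
        contacts={"mail": "user@example.org", "sms": "+420000000000"},
        enabled_channels={"mail"},  # the user opted for e-mail only
    ),
)
delivered = service.notify("x342", "Your event was rescheduled.")
```

In this sketch the client IS never learns the e-mail address or phone number, and switching the user to SMS would require a change only in the service, not in any client system.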

Chapter 3

Development methodology

The development process is a very important aspect of creating any non-trivial software package. This chapter will discuss state of the art development practices and describe the chosen practice for NotX development.

3.1 State of the Art Methodologies

There are many software development methodologies, some of which are present only because of historical adoption. However, new projects are getting much more complex and require developers to be more productive, quicker and more flexible.

3.1.1 Waterfall model

The Waterfall Model is a well known software development process (Figure 3.1), which was presented by Winston Royce in 1970 ([44]).

The main idea of this process is to create artifacts at each stage and hand them off to the next stage. In the first phase, requirements, the requirements are collected and codified using industry standards such as UML.

In the design phase, a more detailed description of the EIS is created with more specific diagrams and technical details of implementation. Then, design documents are handed off to the implementation phase.

In the implementation phase, the system is built using selected technology according to the design documents.

Figure 3.1: Waterfall Model (phases in order: Requirements, Design, Implementation, Verification, Maintenance)

System quality is measured and verified in the verification phase. Then the IS goes into the maintenance phase and is routinely maintained in a production environment.

The main problems (discussed in [43, p.24]) with this model are:

1. The model is not tolerant to a change in the requirements

2. The model doesn’t support learning from new facts in the project

The reason for the widespread usage of the Waterfall Model can be attributed to the fact that there is a perception in classical project management that this process allows good control and predictability. Traditional project management approaches fear feedback loops because they could modify the predetermined plan and therefore threaten the scope, cost or schedule.

3.1.2 Unified process

Unified Process [32] (UP) has been widely adopted with some modifications (for example Rational Unified Process [35]). This process is iterative: development happens in small cycles of requirements, analysis, design and implementation, which contrasts with the Waterfall Model. It amplifies learning from previous iterations.

Figure 3.2: Unified Process

Figure 3.2 ([7]) shows the phases of Unified Process. Each phase has clearly defined roles and competences as well as the artifacts that should be produced or consumed. UP is very flexible and focuses on many artifacts, e.g. Requirements Artifact Set, Analysis & Design Artifact Set, Implementation Artifact Set, etc., and roles, e.g. Business Process Analyst, System Analyst, Integrator, Test Manager, Project Manager, Implementer, etc. UP then defines various interactions between roles and their responsibilities for artifacts.

3.1.3 Agile methodologies

There are many agile methodologies and they all have one common characteristic: they aim to promote the values stated in the Agile Manifesto [8]: "We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:"

• Individuals and interactions over processes and tools

• Working software over comprehensive documentation

• Customer collaboration over contract negotiation

• Responding to change over following a plan

"That is, while there is value in the items on the right, we value the items on the left more."¹

The development of a new NS is an ill-defined process. As noted in [43, p. 19]: "The right the first time approach may work for well-structured problems, but the try-it, fix-it approach is usually the better approach for ill-structured problems." Because of this, it seems that the best way to approach its development is to choose a development process that amplifies learning and prefers a lot of feedback. Some variations of UP could be considered, but given that NotX is going to be implemented by one developer, it is more reasonable to choose an agile process that doesn’t require specific roles in a project, and focuses more on the task at hand than on interactions between roles and the artifacts which they should exchange. For these reasons, the Scrum methodology was selected as the development process of NotX. The methodology is described in the rest of this chapter, which shows that it really is a practice with the mentioned qualities.

3.2 Scrum

The development process of NotX was driven by the Scrum methodology. This agile process introduced by Schwaber and Sutherland [33] is best suited for the development of NotX because its requirements from the start were more about searching for possibilities rather than launching repeatable processes.

Figure 3.3 ([9]) shows the timeline of a typical project driven by Scrum. Scrum is an iterative, incremental and agile methodology that consists of iterations called Sprints (the length of a sprint is not restricted to 30 days; it can range anywhere from 7 to 30 days).

Scrum consists of 3 main artifacts: product backlog, sprint backlog and a burndown chart.

Product backlog is a high level list of all the requirements that should make up the product or EIS. Scrum itself doesn’t give many hints on how to specify backlog items, but there are publications which address the issue, e.g. [27], which introduces user stories into Scrum.

Figure 3.3: Scrum

¹ The left and the right refer to the parts of the sentences before and after the word "over".

Sprint backlog is a list of actions that will contribute to the completion of one or more of the product backlog items. An item in the sprint backlog has an estimate in hours, made by the person who will be responsible for completing the task. This estimate of how much time is left for the task is updated every day.

The burndown chart is a chart with days of the sprint represented on the X axis and hours on the Y axis. It shows how much work is left to be done on a particular day of the iteration. The goal is to reach zero at the end of the iteration. The burndown chart is usually drawn on a whiteboard so that each developer and team member can see it.
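As a minimal sketch of how the points of such a chart could be derived, assume the sprint backlog is recorded as one mapping per day from task name to remaining hours. The task names and numbers below are invented for illustration and are not taken from the NotX backlog.

```python
def burndown(daily_remaining: list[dict[str, float]]) -> list[float]:
    """Return the total remaining hours for each sprint day (the Y values)."""
    return [sum(day.values()) for day in daily_remaining]


# One dict per sprint day: task name -> hours still estimated as remaining.
sprint_log = [
    {"spike SMS gateway": 8.0, "mail engine test": 6.0},
    {"spike SMS gateway": 5.0, "mail engine test": 6.0},
    {"spike SMS gateway": 0.0, "mail engine test": 3.5},
    {"spike SMS gateway": 0.0, "mail engine test": 0.0},
]

points = burndown(sprint_log)

# Crude text rendering: one row per day, one '#' per full remaining hour.
for day, hours in enumerate(points, start=1):
    print(f"day {day}: {hours:5.1f} h " + "#" * int(hours))
```

The last value being zero corresponds to the goal of reaching zero at the end of the iteration.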

In the NotX setting, Scrum is used with 2 week sprints. Each sprint starts with sprint planning, where a sprint goal is presented (the major function, or tangible goal that is to be produced by this sprint) and the product backlog items for this sprint are presented.

Development then proceeds and the sprint review, which presents the output of the sprint, takes place at the end. The sprint review and the sprint planning take place on the same day, usually Wednesday.

The product backlog as well as the sprint backlog are kept in an Open Office spreadsheet. This low tech approach yields less administration and more focus on actual work. It is possible to derive how much work was invested into a specific product backlog item each day and how well the task was estimated.

To the author’s knowledge, there are no major modifications made to the Scrum methodology for web development (likewise, [47] found no consistent difference in decision making for web projects). There are, however, some subtle differences when developing systems such as NotX (not just with agile practices):

• External tools spike first

• The Sprint review should contain technical details

• The Sprints to refactor code have to be more explicitly specified

• More focus on automated testing

It’s essential that each external tool that is used to carry out notifications, such as Text-to-Speech synthesis (TTS) or an SMS gateway, is spiked first, before any product backlog items that depend on it are added. A good practice is to have a sprint in which external tools are examined. Such sprints help with the planning of subsequent sprints because the developers can help the product owner to prioritize and estimate product backlog items that will include external tool usage.

The sprint review should include technical details because the product owner represents technically experienced users (developers of EIS).

Sprints to refactor code have to be specified very explicitly with a carefully formulated sprint goal. Sprint goals shouldn’t be vague or unmeasurable, like "create more readable code". The goals should be measurable, e.g.:

• write an automated test that will fire up an in-memory database and perform create, update, read and delete operations

• rewrite the logic of configuration loading and present this new design using a class diagram and a sequence diagram in the sprint review

The focus on integration testing comes from the fact that NotX itself uses several external systems and a lot of its logic is an orchestration between the JMS provider and external engines. This makes the unit testing of the core logic less effective.

All of the points above can be addressed with Scrum by managing the content of the product backlog and sprint reviews.

3.2.1 Product backlog

Product backlog in Scrum is the central document for communication between the product owner and the Scrum team. It is an evolving, prioritized queue of business and technical functionality that needs to be developed into a system ([33]).

It is as formal as the hosting organization. In the case of this thesis, the main customer is the EIS from the case study. Based on meetings that take place every 14 days and on e-mail communication, the product backlog grows and changes as more is learned about the product and its customers ([27]).

Product backlog items are specified as user stories [27] and sprint backlog items are specified ad hoc, informally.

3.2.2 Sprint backlog

Sprint backlog is a breakdown of product backlog items in terms which are understood by the developers. This backlog is created at a sprint planning session by selecting a few product backlog items and breaking them down. Developers are responsible for committing to as many product backlog items as they think they are able to implement. Each sprint backlog item gets an estimate that is assigned by the developers. Then the product owner prioritizes the list while keeping the estimates in mind. In an ongoing sprint, the sprint backlog cannot be changed. This aids developers in focusing on actual work and not on communicating about the consequences of requirement changes.

When a Sprint begins, the developers work with the sprint backlog every day. A developer selects which task he wants to work on and assigns it to himself. This can be done in a spreadsheet or using more tangible tools such as a board with sticky notes. After that, typically at the end of the day, the developers fill out information into the sprint backlog spreadsheet specifying:

• A time estimate to finish the backlog items they were working on

• Time which was consumed on assigned backlog items

By filling out these two simple numbers, the management of the Scrum project can tell how quickly they are approaching the end of the Sprint. This is visualized by the decreasing XY graph (burndown chart) of the "time to finish" estimate. Hence, at the end of the Sprint, the graph should reach zero. Sometimes it does not reach zero because it is more important to finish the sprint and to hold a review and a retrospective than to postpone the sprint ending.
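The two daily numbers can also be used to check how well a task was estimated. The following sketch assumes a hypothetical record format (`DailyEntry`) and a hypothetical helper `estimation_report`; neither is part of Scrum or of the NotX spreadsheet, and the hours are invented.

```python
from dataclasses import dataclass


@dataclass
class DailyEntry:
    consumed: float   # hours spent on the backlog item that day
    remaining: float  # end-of-day estimate of hours still needed


def estimation_report(initial_estimate: float,
                      entries: list[DailyEntry]) -> dict[str, float]:
    """Compare the initial estimate with the time invested so far."""
    invested = sum(entry.consumed for entry in entries)
    # With no entries yet, the whole initial estimate is still remaining.
    remaining = entries[-1].remaining if entries else initial_estimate
    projected_total = invested + remaining
    return {
        "invested": invested,
        "remaining": remaining,
        # Positive overrun means the item was underestimated.
        "overrun": projected_total - initial_estimate,
    }


report = estimation_report(
    initial_estimate=8.0,
    entries=[DailyEntry(3.0, 6.0), DailyEntry(4.0, 2.0), DailyEntry(3.0, 0.0)],
)
```

Here the item was estimated at 8 hours but took 10, so the report shows an overrun of 2 hours, which is the kind of feedback that can inform the estimates in the next sprint planning.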

3.2.3 Scrum meeting

Each day (usually in the morning), developers gather for a so-called scrum meeting, which lasts about 10-15 minutes. During this meeting, each developer should answer the following questions:

• What did I accomplish yesterday?

• What are my goals for today?

• What kind of problems did I encounter yesterday?

Neither interruptions nor discussion should be allowed during this meeting, so that it doesn’t exceed the 15 minute time slot.

3.2.4 Sprint review

When a sprint ends, a sprint review takes place. The goal is to present a tangible result that is understandable to the product owner. The features should be ready to ship. It is very important to get valuable feedback because product owners and customers cannot give complete feedback when working with prototypes. By providing a potentially shippable product, it is possible to assess the quality and give feedback, and discuss feature content of following sprints.

3.3 Analytical extension to Scrum

In Scrum methodology, the emphasis is always put on communication with a product owner. In the case of NotX, the product owner was represented by the stakeholders of Takeplace EIS: RNDr. Jaroslav Škrabálek, MBA, and Ing. Pavol Grešša. These persons will be referred to as analysis participants. The analysis was conducted by brainstorming features of the desired system and then prioritizing the features into specific sprints until the desired amount of features was delivered.

To further enhance the quality of NotX, there was an interview with Mgr. Ing. Lukáš Rychnovský, Ph.D., who has experience in the field of event driven distributed applications. The output of this interview was used to make better architectural decisions, which are described in the following chapters.

The interview is a well known analytical method in software engineering. The approach outlined in [34] was used to conduct interviews for this thesis. Because the analysis participants were very open, the following interview structure was used:

• Good preparation ([34, p. 89]) in the form of a written agenda and scenarios

• Catching and recording important facts

• Consolidation of the facts after an interview

To capture the analysis of NotX in written form, some additional approaches are combined:

1. The use case diagram [30] to group stories from the product backlog and show a clearer picture of NotX

2. A conceptual domain model analyzed by a simplified HIT model ([28]). This model helps with understanding basic parts of NotX and also with establishing a vocabulary - communication means. Note that this may be viewed as a different approach to domain modeling as opposed to a conceptual class diagram (CCD).

3. Key architectural points (KAP) are identified after interviewing Takeplace stakeholders and the expert in distributed systems.

This chapter will further describe the requirements analysis using two modeling languages. In a standard Agile project this would be out of scope (a good explanation of such agile mentality is in [43] - Eliminate Waste), but the reader of this thesis has not been involved in the project and needs the requirements captured more formally.


3.3.1 Unified Modeling Language

[30] The Unified Modeling Language (UML) is a successor to a wave of object-oriented analysis and design (OOA&D) methods that appeared in the late ’80s and early ’90s. Most directly, it unifies the methods of Booch, Rumbaugh (OMT), and Jacobson. However, its reach is wider than that. The UML went through a standardization process with the OMG (Object Management Group) and is now an OMG standard.

As G. Booch, I. Jacobson and J. Rumbaugh said, the UML was created mainly to demystify the process of software modeling through standardized notation. In this thesis, the main parts of NotX requirements are depicted as UML diagrams to better clarify the problem and to communicate its solution.

Explaining the UML syntax and semantics is beyond the scope of this thesis. A reader who is not familiar with it should briefly review [30] or a similar publication.

3.3.2 HIT model

HIT modeling was introduced in the Czech Republic in the ’80s on the basis of mathematical logic, lambda calculus and transparent intensional logic [28][46]. HIT modeling is a less known methodology that was developed to describe domain semantics of a specific problem. As far as the author of this thesis knows, HIT is the best way to clearly capture the semantics of entities and their relationships in the domain. Thus, a subset of the HIT methodology rigor is described and used to model the NotX domain.

HIT also introduces a way of capturing cardinalities in text instead of a pictographical representation (as in ERD).

This thesis uses the first two tools of HIT modeling: sortalization and functional dependency.

Sortalization is the definition of basic types (e.g. a name represented as a string of characters) and entities (e.g. a person) that are present in the domain. The main benefits of this method are:

1. Adding unambiguous semantics to entities

2. Adding unambiguous semantics to relations between entities


Describing HIT in detail is not in the scope of this thesis and only the practical outputs will be presented. More information about HIT is in [28][46].


Chapter 4

Analysis

The goal of this chapter is to clearly introduce the problem and to capture requirements from the analysis participants.

To create a usable service, the development was guided as a module for an existing EIS. This approach adds more feedback and amplifies learning, as noted in [43],[33].

4.1 Case Study

Takeplace is a system for steering and running symposia [10]. Its main features include:

• No installation, world-wide access via a web browser

• Reviewing contributions

• Executive control (nominating committees, effective communication support)

• Administering participants of conference

• A scheduling program (inviting speakers, selecting/rejecting contributions)

This brief list of features is by far not complete. To achieve this vast amount of functionality, the Takeplace concept is a distributed architecture with loosely coupled submodules. This architecture makes it possible to create a highly scalable and maintainable system. Because it was beneficial to create parts of the system from legacy or third party products, Takeplace itself resulted in a very heterogeneous software suite from a technical point of view.


From the NS point of view, it is easy to see that such a system is ideal for a case study. Many possible usages of an NS in Takeplace will be covered later in this chapter. However, the most important problems that are visible just from the description of Takeplace are:

• The demand for high scalability, as Takeplace itself is intended to be used by large numbers of users all over the world

• Heterogeneity of the system, from a technical point of view, making it hard to connect the NS to every part (submodule) of Takeplace

4.2 Key Architectural Points

The following KAP were established after creating the product backlog and gathering input from interview participants. These would guide day to day decisions when developing NotX.

1. The data model of NotX is relatively simple and therefore it should be possible to use a NoSQL database to leverage horizontal scalability and the speed of writes/reads

2. The engines may get overloaded. It is imperative to use a message queue enterprise design pattern

3. It may be beneficial to spike (test up front) various voice synthesizers

4. In a distributed system which consumes events, it is important to work with an event as it was generated to enable retransmission of the event in the case of failure. This is beneficial after bug fixes because usually when a bug occurs, events are not handled. After the bug fix, they should be resent to the system and handled again.

5. When possible, use transactional processing of events. When an event occurs, it should be persisted as soon as possible, and only when it is processed should it be deleted or marked as processed.

6. Correlation is an important aspect of event handling, and one should be aware that users might request this functionality at any time.
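KAP 4 and 5 can be illustrated with a small sketch: an event is stored as it was generated before any handling, it is marked as processed only after the handler succeeds, and everything still pending can be replayed after a bug fix. The Java code below is only an illustration under these assumptions. The class EventStore and its in-memory maps are hypothetical stand-ins for a durable store and do not describe the NotX implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a durable event store.
class EventStore {
    private final Map<String, String> pending = new LinkedHashMap<>(); // event ID -> original payload
    private final List<String> processed = new ArrayList<>();

    // KAP 5: persist the event as it was generated, before any handling.
    void persist(String id, String payload) {
        pending.put(id, payload);
    }

    // Handles every pending event. An event is marked as processed only on success;
    // events whose handler throws stay pending and can be replayed later (KAP 4).
    int replay(Consumer<String> handler) {
        int handled = 0;
        for (String id : new ArrayList<>(pending.keySet())) {
            try {
                handler.accept(pending.get(id));
                pending.remove(id);
                processed.add(id);
                handled++;
            } catch (RuntimeException e) {
                // Leave the event pending so that it can be resent after a bug fix.
            }
        }
        return handled;
    }

    int pendingCount() { return pending.size(); }

    int processedCount() { return processed.size(); }
}
```

Calling replay a second time with a corrected handler processes exactly the events that failed before, which is the behavior KAP 4 asks for.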


4.3 Product Backlog

This section presents the product backlog of NotX.

ID  Backlog item

1   Thrift interface
2   Java interface
3   Web services interface
4   Extensibility of the interfaces
5   Extensibility of the engines
6   SMS engine
7   Email engine, multi part emails
8   Proposal of phone engine
9   Web view of failed messages and recovery use case
10  Web view for setting the contacts
11  Notifications statistics
12  Fault tolerance
13  Abstract user groups for notifications
14  Encapsulation of notification channel
15  I18N
16  Notification templating
17  Open Source and Multi platform

4.4 Domain Models

Domain models are used to capture parts of reality in an unambiguous way. HIT modeling is a methodology based on formal systems, and brings an interesting tool set of both text based and graphics based approaches.


4.4.1 HIT model

The first step of HIT methodology is to identify entities and their relationships. The Entity-Relationship Model (ERD), introduced by Chen [26], is a well accepted way of modeling a view of data, and will be used as a tool to visualize entities and relationships.

[Figure: ERD with the entities Contact, Channel, Engine, Interface, Tag, Super notification, Template, Super template, Concrete Notification, Notification type, User, NotX and Language, connected by relationships numbered 01 to 12]

Figure 4.1: Conceptual ERD diagram for NotX data


User

An object of a type (#User) is every person or entity registered in NotX that will receive notifications.

Contact

An object of a type (#Contact) is every identifier in some system usually used for addressing, e.g. e-mail address, phone number, Facebook ID.

Tag

An object of a type (#Tag) is every identifier of a group of entities that is used to categorize them.

Super notification

An object of a type (#Super notification) is every message that is addressed to an abstract entity. Super notification abstracts from the language in which the message is sent and abstracts from the recipient.

Concrete notification

An object of a type (#Concrete notification) is every message that abstracts from a message channel. Note that concrete notification doesn’t abstract from a recipient. Concrete notification is addressed to a specific person or entity.

Notification type

An object of a type (#Notification type) is every type that is used to give more semantics to a message. For example, the type "Important" gives the means to distinguish this notification/message from others.


Super template

An object of a type (#Super template) is every template of a notification that abstracts from concrete notification means. For example, the super template for the notification "Welcome to the conference" is an entity that doesn’t rely on a specific language or notification means (e-mail, SMS, voice, social network).

Template

An object of a type (#Template) is every standardized description with placeholders of a message to be sent via a specific channel and in a specific language, e.g. "Welcome to the conference" in English for SMS delivery.

Interface

An object of a type (#Interface) is every computer protocol to be used as an extensible part of NotX that will allow basic NotX operations. Examples of these are Web Services, Thrift API, Java Interface. The interface itself will be designed as a JAR (Java Archive).

Channel

An object of a type (#Channel) is every logical way of communication. Channels are logical means through which notifications can travel. For example "Cellular network, E-mail, SMS".

Engine

An object of a type (#Engine) is every concrete extension to NotX used as a notification provider. The engine is used to deliver a Notification via a specific Channel and will be designed as a JAR plug-in. Examples are SMS engine, Voice engine and Mail engine.


NotX

An object of a type (#NotX) is every deployment of the notification engine NotX (installation on some hardware).
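The Super template and Template entities defined above can be sketched in Java as follows. This is only an illustration: the class name SuperTemplate, the "language/channel" lookup key and the ${name} placeholder syntax are assumptions made for the example and are not taken from the NotX source code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a super template groups the templates for each language and channel.
class SuperTemplate {
    private final Map<String, String> templates = new HashMap<>(); // "language/channel" -> text

    void add(String language, String channel, String text) {
        templates.put(language + "/" + channel, text);
    }

    // Picks the template for the given language and channel and fills in the placeholders.
    String render(String language, String channel, Map<String, String> values) {
        String text = templates.get(language + "/" + channel);
        if (text == null) {
            throw new IllegalArgumentException("No template for " + language + "/" + channel);
        }
        for (Map.Entry<String, String> e : values.entrySet()) {
            // Assumed placeholder syntax: ${key}
            text = text.replace("${" + e.getKey() + "}", e.getValue());
        }
        return text;
    }
}
```

With an English SMS template "Welcome to the conference, ${name}!" and the value name = Alice, render("en", "SMS", values) yields "Welcome to the conference, Alice!".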

Semantics of relationships:

01 User (#User) that has given contacts (#Contact). /1,1:0,M

02 User (#User) that belongs to given tags (#Tag). There is always one automatically generated tag: a character string starting with the ":" character and followed by the user ID. /1,1:1,M

03 Tag (#Tag) that is a destination for the given Super notifications (#Super notification). /1,1:0,M

04 Super notification (#Super notification) that is dissected into Concrete notifications (#Concrete notification). /1,1:0,M

05 Super template (#Super template) that is used for the given Super notification (#Super notification). /1,1:0,M

06 Super template (#Super template) that consists of given Templates (#Template). /1,1:0,M

07 Template (#Template) that is used to send the given Concrete notifications (#Concrete notification). /1,1:0,M

08 Notification type (#Notification type) of the given Concrete notification (#Concrete notification). /1,1:0,M

09 Engine (#Engine) used to send the given Concrete notification (#Concrete notification). /1,1:0,M

10 Channel (#Channel) that is managed, used or implemented by a given Engine (#Engine). /1,1:0,M

11 NotX instance (#NotX) that has given concrete Engines plugged in (#Engine). /1,1:0,M

12 NotX instance (#NotX) that uses given interfaces (#Interface) as a way of receiving signals to operate. /1,1:0,M
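Relationships 01 and 02 can be sketched as a small Java class. The sketch reads the cardinality 0,M as "zero or more" and 1,M as "one or more", which is an interpretation, and the class and method names are hypothetical. The automatically generated tag guarantees that every user belongs to at least one tag.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the #User entity with relationships 01 and 02.
class User {
    private final String id;
    private final List<String> contacts = new ArrayList<>(); // relationship 01: zero or more contacts
    private final Set<String> tags = new LinkedHashSet<>();  // relationship 02: one or more tags

    User(String id) {
        this.id = id;
        tags.add(":" + id); // the automatically generated tag of this user
    }

    void addContact(String contact) { contacts.add(contact); }

    void addTag(String tag) { tags.add(tag); }

    String id() { return id; }

    List<String> contacts() { return Collections.unmodifiableList(contacts); }

    Set<String> tags() { return Collections.unmodifiableSet(tags); }
}
```

A user created with the ID "42" therefore starts with no contacts and with the single tag ":42".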


4.4.2 Use Cases

There are three main roles in NotX.

1. Client programmer

2. User of client IS

3. Administrator of NotX

(1) is the Takeplace developer in the case study. This developer usually uses NotX to send notifications.

The user of the IS (2) has contact information stored in NotX and receives various notifications.

Lastly, (3) is the technical person who has to charge for uses of the paid services of NotX (voice and SMS).

UC01 Send notification to abstract group

Main Flow

1. The client programmer wants to send a notification to an abstract group

2. The client programmer chooses a super template and an abstract group to send a notification to

3. NotX sends out concrete notifications in the language of the recipient and through channels configured by the user of client IS and the administrator of NotX

UC02 Send notification to user

Main Flow

1. The client programmer wants to send a notification to a specific user. He implicitly specifies a defined user ID

2. NotX sends the notification in the same fashion as for UC01


UC03 Designate tag for user

Main Flow

1. The client programmer wants to add a user to a certain abstract group. He sends out this request.

2. NotX persists the information that this user was added to the abstract group.

UC04 Configure contact information

Main Flow

1. The user of client IS wants to configure his contact information

2. The user uses a web browser to view and edit his details

3. NotX persists the setting

UC05 Modify personal routing policies

Main Flow

1. The user of client IS wants to be notified via a different channel for certain notifications.

2. The user uses a web browser to edit personal routing records.

3. Routing records are persisted by NotX.

UC06 Modify global routing policies

Main Flow

1. The administrator of NotX wants to set up global routing policies to route notifications.

2. The administrator uses a web browser to edit global routing records.

3. Routing records are persisted by NotX.
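The interplay of personal (UC05) and global (UC06) routing records can be sketched as a simple lookup. The rule that a personal record takes precedence over the global one is an assumption made for this example, because the use cases do not state the precedence. All names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of routing record lookup.
class RoutingPolicies {
    private final Map<String, String> global = new HashMap<>();                // notification type -> channel
    private final Map<String, Map<String, String>> personal = new HashMap<>(); // user ID -> (type -> channel)

    void setGlobal(String type, String channel) {
        global.put(type, channel);
    }

    void setPersonal(String userId, String type, String channel) {
        personal.computeIfAbsent(userId, k -> new HashMap<>()).put(type, channel);
    }

    // Assumed precedence: a personal record wins over the global one.
    String resolve(String userId, String type) {
        Map<String, String> own = personal.get(userId);
        if (own != null && own.containsKey(type)) {
            return own.get(type);
        }
        return global.get(type); // null when no policy is configured at all
    }
}
```

If the administrator routes the type "Important" to SMS and one user redirects it to E-mail, resolve returns E-mail for that user and SMS for everybody else.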


UC07 Add new engine

Main Flow

1. The administrator of NotX wants to add a new engine

2. The administrator creates executable code in the form of a JAR for deployment along with configuration

3. The administrator restarts NotX and the new engine is ready to operate
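From the code point of view, UC07 means that every engine implements a common contract and is registered for its channel when NotX starts. The sketch below shows one possible shape of such a contract in Java. The interface, the registry and the recording engine are hypothetical and only illustrate the idea. How the JAR is discovered at startup is left out (the standard java.util.ServiceLoader would be one option, but the text does not say that NotX uses it).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical contract that every engine plug-in would implement.
interface Engine {
    String channel();                            // e.g. "SMS"
    void deliver(String contact, String text);   // sends one concrete notification
}

// Hypothetical registry filled at startup with the engines found in the deployed JARs.
class EngineRegistry {
    private final Map<String, Engine> byChannel = new HashMap<>();

    void register(Engine engine) {
        byChannel.put(engine.channel(), engine);
    }

    Engine forChannel(String channel) {
        Engine engine = byChannel.get(channel);
        if (engine == null) {
            throw new IllegalStateException("No engine for channel " + channel);
        }
        return engine;
    }
}

// Example engine that only records what it would send; useful in unit tests.
class RecordingSmsEngine implements Engine {
    final List<String> sent = new ArrayList<>();

    public String channel() { return "SMS"; }

    public void deliver(String contact, String text) {
        sent.add(contact + ": " + text);
    }
}
```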

UC08 Add new interface

Main Flow

1. The administrator of NotX needs a new interface to communicate with NotX

2. The administrator of NotX implements specific interfaces and wrappers

3. The administrator adds these communication classes to the NotX kernel and restarts NotX

4. NotX is accessible via the new Interface

The use case diagram of NotX is depicted in Figure 4.2.

[Figure: use case diagram with the actors Client programmer, User of client IS and NotX administrator, and the use cases Send notification to user, Send notification to abstract group, Configure contact information, Modify global routing policies, Modify personal routing policies, Add new engine and Add new interface]

Figure 4.2: Use case diagram of NotX system

Chapter 5

Design and tools

This chapter reveals the final design of NotX and describes the tools which were used for implementing the requirements. The next chapter, Implementation, finalizes the description by giving insight into the technical details of how these tools were used in the context of NotX.

5.1 NotX design

The design of NotX is a combined product of KAP and the requirements gathered during the interviews. It became clear very soon that a message queue centric design would produce the most robust solution. As soon as requests for notification are received from the client IS, they are put into a queue. The communication module steps in and is responsible for transforming the Notification into canonical form. This is very important because the communication module is an extension point and new interfaces may be added to NotX through it.

An important part of the design is the interaction labeled with numbers 1, 2, 3, which is depicted in Figure 5.1. When the NotX kernel picks up a supernotification (1), it breaks it up into concrete notifications and puts them into a concrete notification queue. This last step was not in the initial design, but was added for the following reasons:

• A supernotification is for an abstract recipient (tag), which may effectively be hundreds of users. If a notification were to fail for any reason, it would be very hard to debug.

• To enable KAP 4. Without splitting the supernotification, it would be necessary to use some application log to reconstruct which notifications had already been sent. This way each concrete notification has its own life-cycle.

• Every user who belongs to the abstract recipient list of a supernotification can have a different language and setting for his notification preferences. This complicates the matter further.

• To promote future scalability. This way, it’s easier to achieve enterprise level capabilities of the system, e.g. load balancing on engines for the same channel.

[Figure: diagram with the components Client IS developer, Communication module, Supernotification queue, Kernel, Concrete notification queue, Persistence, SMS Engine, Voice Engine, Mail Engine and User; the interactions between the Kernel and the queues are numbered 1 to 3]

Figure 5.1: NotX design

After breaking up a supernotification in (2), the Kernel subsequently consumes concrete notifications (3) and runs the lookup logic necessary to resolve global and user specific policies for notifications and internationalization.
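The splitting step (2) can be sketched as a fan-out from a tag to its users. The Java code below is a simplified illustration in which an in-memory queue stands in for the concrete notification queue. The class names, fields and the way users are registered are assumptions made for the example and do not reproduce the NotX kernel.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical concrete notification: one recipient, one language, one super template.
class ConcreteNotification {
    final String userId;
    final String language;
    final String superTemplate;

    ConcreteNotification(String userId, String language, String superTemplate) {
        this.userId = userId;
        this.language = language;
        this.superTemplate = superTemplate;
    }
}

// Hypothetical kernel fragment that performs step (2) of the design.
class Kernel {
    private final Map<String, List<String>> usersByTag = new HashMap<>();
    private final Map<String, String> languageByUser = new HashMap<>();
    final Queue<ConcreteNotification> concreteQueue = new ArrayDeque<>(); // stand-in for the real queue

    void addUser(String userId, String language, String tag) {
        languageByUser.put(userId, language);
        usersByTag.computeIfAbsent(tag, k -> new ArrayList<>()).add(userId);
    }

    // Breaks a supernotification for a tag into one concrete notification per recipient.
    int split(String tag, String superTemplate) {
        List<String> recipients = usersByTag.getOrDefault(tag, Collections.emptyList());
        for (String userId : recipients) {
            concreteQueue.add(new ConcreteNotification(userId, languageByUser.get(userId), superTemplate));
        }
        return recipients.size();
    }
}
```

Because every recipient gets its own queue entry, a failure affects a single concrete notification, which can then be retried or inspected on its own.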

The persistence component is a very important part of a stateful service. The component will be described in more detail in chapter 6. The following is a brief list of the information stored in the persistence layer:

• User information

• Routing information

• Fail-over information about failed concrete notifications

• Sent Concrete notifications statistics

5.2 Programming Platform

Item 17 of the product backlog requires the software to be Open Source and multi-platform. While this requirement is viable today with some scripting languages (Python, Ruby) and even with native code (C programming language), it is clear that the Java programming language is the leading multi-platform language. It runs on a large number of operating systems and there are many application platforms and tools which facilitate development. This section discusses a selection of specific pieces of the programming platform.

5.2.1 Build system

There are 2 major build systems for the Java programming language: Apache Maven and Apache Ant.

Apache Ant uses an XML based language whose semantics are very similar to the well known Linux Make. It consists of targets and tasks that are sequentially run inside of them. An example of an Ant script that compiles Java source code in an src directory is depicted in Listing 5.1.

Note that this is just a compilation of the .java files to .class files. To further create a JAR archive, it is necessary to explicitly use the jar task.


<project>
    <target name="compile">
        <mkdir dir="build/classes"/>
        <javac srcdir="src" destdir="build/classes"/>
    </target>
</project>

Listing 5.1: Ant

With regard to dependency resolution, there is a dependency resolution tool for Apache Ant called Apache Ivy.

Ant itself is a very widespread build system for Java because it has a long history. It was even used internally in the NetBeans IDE.

Apache Maven is a build system that gives less freedom to the developer and is more declarative. A user is expected to use the standard Maven directory structure (Listing 5.2 shows it for a standard Java SE project; taken from [11]).

my-app
|-- pom.xml
`-- src
    |-- main
    |   `-- java
    |       `-- com
    |           `-- mycompany
    |               `-- app
    |                   `-- App.java
    `-- test
        `-- java
            `-- com
                `-- mycompany
                    `-- app
                        `-- AppTest.java

Listing 5.2: Maven directory structure

When a developer uses the structure, he immediately gets 4 basic features: automatic dependency resolution, unit testing support, compilation support and packaging support.

Dependency resolution gives easy access to Java artifacts that were published in the Central Maven Repository (can be browsed at [12]). Thanks to dependency resolution, it is trivial to change a version of a dependency or exchange it for a completely different implementation. After adding a dependency to the Maven descriptor, it immediately appears on the classpath for files in the src directory. Maven also has advanced features such as scoping dependencies to unit tests only.

Unit testing support is available through the simple command "mvn test", which runs all unit tests in the test directory.

Packaging support enables the developer to create standard archives such as JAR, WAR, EAR, etc.

To conclude this section, Apache Maven clearly includes more out of the box features. Similar functionality can be achieved with Ant, but no clear advantages can be identified with this approach. It just introduces more boilerplate code and confusing directory structures. Because of this, Maven was chosen as the build system for NotX.

5.2.2 Presentation Framework

The Java Enterprise Edition stack contains the Servlet API, which is used for low level development of dynamic web pages. Using only this specification to implement the create, read, update and delete operations of a typical EIS leads to unreadable source code which is hard to maintain. To remedy this, the Java community developed a number of frameworks for web development, mainly implementations of the Model-View-Controller (MVC) pattern. Usage of this pattern results in cleaner separation of concerns ([42]) and the code is easier to read. Because the web interface of NotX is relatively simple, it is sufficient to select a lightweight framework such as Stripes [13], which is very compact, configurable and contains nice features, e.g. validation of user input. The main drawback is that Stripes doesn’t provide any built-in model functionality. However, NotX is built with a NoSQL database so this is not a problem.


5.2.3 DI Framework

Dependency Injection (DI) is a well known design pattern in Object Oriented programming ([29]).

In the author’s opinion, the greatest benefit is not the better testability of the code base, but the fact that DI simplifies code and makes it more readable. It takes a well defined concept of object creation out of the source code and puts it into a separate file.

The DI framework of choice for NotX is the Spring Framework ([14]). Apart from DI, this framework is an ever growing development platform which contains many utility libraries that are not present in the JDK; some of them will be mentioned in the Implementation chapter.
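The core idea of DI can be shown in plain Java without any framework: a class receives its collaborators through the constructor instead of creating them. The sketch below uses hypothetical names and deliberately leaves Spring out. In a Spring application the wiring done here by hand would be described in the container configuration.

```java
// Hypothetical collaborator that hides how a message is actually sent.
interface MessageSender {
    void send(String contact, String text);
}

// The service never creates its sender; it only uses the one it was given.
class NotificationService {
    private final MessageSender sender;

    NotificationService(MessageSender sender) { // constructor injection
        this.sender = sender;
    }

    void welcome(String contact) {
        sender.send(contact, "Welcome to the conference");
    }
}

// Hand-written wiring: the only place that decides which implementation is used.
class Wiring {
    static NotificationService create(MessageSender sender) {
        return new NotificationService(sender);
    }
}
```

A unit test can pass a recording implementation of MessageSender, while production wiring can pass a real one, and NotificationService stays unchanged in both cases.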

5.2.4 IDE

An Integrated Development Environment (IDE) is an important tool for a developer. It helps with debugging, syntax highlighting, navigation through source code and refactoring. Eclipse IDE is the de facto standard among free IDEs in the Java world. There are a number of plug-ins for it (Maven plug-in, SVN plug-in, Git plug-in). It is not only widespread in the community, but also has a very unobtrusive mentality. It doesn’t create a lot of binary project information files (in comparison with NetBeans or IntelliJ IDEA). Only 2 files (.project, .classpath) are generated in a project’s folder and both are human readable. This keeps Eclipse projects very compact and doesn’t impose vendor lock-in.

Specialized IDEs such as JBoss Developer Studio (for Red Hat’s offerings) are also built on top of the Eclipse platform, which validates the quality of Eclipse.

5.2.5 SCM

Any serious Open Source project needs to be versioned in a state of the art version control system (Subversion, Git, Mercurial) to enable collaboration. Because Open Source is already required by backlog item 17, it was necessary to choose one of the mentioned systems. The system selected to do the job was Subversion. Instead of self-hosting it, it was decided to use SourceForge [15], which provides Open Source software projects with a basic project web page, a version control system and bug tracking.

5.2.6 Application Server

Many application servers are on the market to support the deployment of complex EIS, e.g. JBoss Application Server, Microsoft IIS, IBM WebSphere Application Server. These application servers offer basic services needed to develop EIS such as:

• Web application deployment

• Security

• Scalability

• Database integration

• Message queue broker

Those for the Java platform implement the Java EE specification.

5.3 Persistence

An important decision for NotX was the selection of a scalable database system. Technological advisers indicated in KAP that the database schema of NotX will be simple and that a large load of writes/reads will be necessary. Also, a scalable persistence mechanism seems to be important in the case of extending NotX as a publicly available service.

5.3.1 NoSQL databases vs Relational databases

Relational databases (RDBMS) today are supported by application frameworks (Object Relational Mappings), and there is a large body of knowledge about them. Also, they are built on the strong mathematical basis of relational algebra. The main advantages of using them are: the possibility of constructing complex queries, normalization resulting in no data duplication, and tool support in the Java community.


Services such as Google Maps or Facebook don’t leverage RDBMS to store their data. Instead, Google uses its own BigTable [16] and Facebook also uses its own solution called Cassandra [38]. The main reason is the possibility of storing a lot of data in a decentralized manner. It is possible to use cheap hardware and still achieve great responsiveness for writes and reads.

A big disadvantage is data denormalization. Although it is possible to store data in a NoSQL database in a normalized manner, it soon becomes cumbersome to query the database.

Because there is the possibility of having a large amount of user data in the future, and also because of the simplistic database schema, it was decided to use the Cassandra NoSQL database.

5.3.2 Cassandra NoSQL database

[38] Cassandra is a distributed storage system for managing very large amounts of structured data which are spread out across many commodity servers, while also providing highly available service with no single point of failure. Cassandra is aimed to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers).

Cassandra is an Apache project [17], entirely written in Java. The first problem with this cutting edge technology is that there is no standard like JDBC for Java to access it. However, the architecture of Cassandra is quite multi-platform. It uses the Thrift API as an interface. Thrift is a multi-language interoperability framework which enables the generation of client/server communication code for different languages and the exchange of data in a common format. This way, Cassandra is usable from a number of languages ([18]). The next problem is the documentation, which is not entirely readable. The most important documents on the Cassandra wiki are:

• Getting started guide http://wiki.apache.org/cassandra/GettingStarted

• Thrift API documentation http://wiki.apache.org/cassandra/API

• A description of Cassandra data model http://wiki.apache.org/cassandra/DataModel

To access Cassandra from Java, there are a few options that encapsulate the verbose Thrift API.

Hector API

There were 2 possible options at the time of developing NotX: Hector API [19] and Kundera [20]. Kundera promised a JPA implementation that maps to the NoSQL database, but was still in a beta version and therefore the only viable option was to use Hector API.

Hector API wrapper

Hector API is only a very thin wrapper around Thrift calls. Therefore it is necessary to create a Hector wrapper which would encapsulate the persistence logic and support the initial implementation of data persistence with unit tests.

All persistence implementation is located in the Java project notx-cassandra-access and will be described in detail in the Implementation chapter. For now, just a few comments on design should explain important decisions that were made.

The Cassandra database schema should be automatically loaded via source code. This will enable further unit testing.

To force compile time checking as often as possible, Data Access Objects that encapsulate queries on specific data objects are used.

The Facade design pattern should be used to access the database via source code; the facade should be the first point of entry into querying the database.
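The combination of Data Access Objects and a facade can be sketched as follows. The code is a hedged illustration only: the names ContactDao, InMemoryContactDao and PersistenceFacade are hypothetical, no Hector or Cassandra calls are used, and an in-memory map stands in for the database so that the example stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Data Access Object: typed queries for one kind of data object.
interface ContactDao {
    void save(String userId, String contact);
    Optional<String> findByUser(String userId);
}

// In-memory implementation for unit tests; a Cassandra-backed class would implement the same interface.
class InMemoryContactDao implements ContactDao {
    private final Map<String, String> rows = new HashMap<>();

    public void save(String userId, String contact) {
        rows.put(userId, contact);
    }

    public Optional<String> findByUser(String userId) {
        return Optional.ofNullable(rows.get(userId));
    }
}

// Facade: the single entry point through which the rest of the code reaches the DAOs.
class PersistenceFacade {
    private final ContactDao contacts;

    PersistenceFacade(ContactDao contacts) {
        this.contacts = contacts;
    }

    ContactDao contacts() {
        return contacts;
    }
}
```

Because callers only see typed methods such as findByUser, a misspelled column or a wrong value type is caught by the compiler instead of surfacing at run time.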

5.4 Messaging

The design contains message queues as a decoupling tool between the interface and the kernel. In the Java world, the JMS standard is a well accepted solution to this problem and several implementations are available.

The first version of NotX used a JMS implementation, which was however soon changed to Amazon SQS (described below). There is a good business argument for this because Takeplace stakeholders don’t have any message brokers deployed and it wouldn’t be cost efficient to deploy a new solution when there is already a cloud service (SQS by Amazon) that offers a message queue.
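A switch from one queue technology to another is cheap when the kernel depends only on a small queue abstraction. The following Java sketch illustrates this idea under stated assumptions: the interface NotificationQueue and both classes are hypothetical, and an in-memory queue from java.util.concurrent stands in for a JMS or SQS backed implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical abstraction that hides the concrete queue technology.
interface NotificationQueue {
    void enqueue(String message);
    String dequeue(); // returns null when the queue is empty
}

// In-memory stand-in; a JMS or SQS backed class would implement the same interface.
class InMemoryNotificationQueue implements NotificationQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public void enqueue(String message) {
        queue.add(message);
    }

    public String dequeue() {
        return queue.poll();
    }
}

// The consumer side knows only the abstraction, not the queue technology behind it.
class KernelConsumer {
    private final NotificationQueue queue;

    KernelConsumer(NotificationQueue queue) {
        this.queue = queue;
    }

    // Consumes everything currently waiting and reports how many messages were handled.
    int drain() {
        int handled = 0;
        while (queue.dequeue() != null) {
            handled++;
        }
        return handled;
    }
}
```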

5.4.1 JMS

Specified as JSR-914, JMS provides a standardized API for accessing, manipulating and browsing queues and topics. This API is fairly low level and, in the author’s opinion, lacks many enterprise level features like scalability (the reception of messages in multiple threads) and fail-over. To remedy this, it is possible to use Spring wrappers around the JMS API that are available as Spring beans.

5.4.2 SQS

Development of JMS4SQS [21] (a bridge between JMS and SQS) is already underway and this will be a very good opportunity for integrating SQS into existing Java applications. For now though, it is necessary to use a native solution for NotX. Because this is again a very important part of the system, unit testing is necessary and the dedicated Maven module notx-amazon-sqs was created to support the functionality. Implementation details are in the next chapter of this thesis.

5.4.3 Conclusion

This chapter identified the options for implementing NotX in order to fulfill requirements, and also clarified that some problems are not yet sufficiently solved in the community (JMS4SQS, accessing the NoSQL database). These problems are interesting on their own, but to fulfill the goals of this thesis it was necessary to solve them.

5.5 Programming practices

Programming practices that are applied throughout this thesis are from the area of Object Oriented Programming (OOP). The described practices are embraced by the community and it is possible to see their influence in many Open Source projects, for example JUnit, Log4J and Hibernate from the Java community. These libraries also have counterparts in the .NET framework, where the practices are embraced as well. To further support the claim that they are well established in the community, Microsoft ASP.NET MVC was released as Open Source and the most influential book about it ([42]), written by the framework authors, contains direct references to these practices.

This section describes some of these practices that recur when developing any Object Oriented software.

5.5.1 Clean Code and Refactoring

Clean Code and Refactoring of existing code are concepts that are common to any language and paradigm. The most important ideas behind readable code ([40]) are

1. Meaningful names

2. Short functions, a small number of function parameters

3. Functions written on one level of abstraction

These three points are probably the most important ones. Getting them all right is not easy, because even to create a good name (1), it is necessary to understand the domain well and to think about the technology used in the function. A general rule of thumb is that a name should cause no confusion and that its function should be reusable without the need to rewrite it in a different context. This usually results in short names, because long function names indicate that a function comes with too many concerns.

The rule of short functions is disputable. It is pursued aggressively in [40] and many clean code enthusiasts will say that this is the first aspect of function design. It is easy, however, to argue that any piece of code can be mechanically split up into many one-line functions while the quality of the code doesn't increase very much (it just gets more commented by function names). On the other hand, long functions certainly smell of wrong design decisions because it is usually impossible to come up with a short name for a long function.

The most important rule for clean code in general is keeping abstractions (3) where they belong. This is also by far the most complex task even for an intermediate programmer. It requires an understanding of the problem and the use of judgment to uncover possible abstractions. A nice example of such an abstraction (from OOP) is the String class, which contains methods strictly on one abstraction level - the abstraction of a string of characters.

This rule can be further emphasized to the point of saying that programming is essentially looking for abstractions upon which the programmer will operate.
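As a hedged illustration of rule 3, the following sketch keeps the top-level method on one level of abstraction and pushes the details into small, named helpers. The class and method names are invented for this example and are not part of NotX.

```java
import java.util.Locale;

// Hypothetical example, not NotX code: each method stays on one level of abstraction.
final class GreetingFormatter {

    // High level: reads as a list of steps, without string manipulation details.
    static String formatGreeting(String rawName) {
        String name = normalize(rawName);
        return greet(name);
    }

    // Low level: trimming and capitalization details live here.
    private static String normalize(String rawName) {
        String trimmed = rawName.trim();
        if (trimmed.isEmpty()) {
            return "guest";
        }
        return trimmed.substring(0, 1).toUpperCase(Locale.ROOT)
                + trimmed.substring(1).toLowerCase(Locale.ROOT);
    }

    // Low level: the wording of the message lives here.
    private static String greet(String name) {
        return "Hello, " + name + "!";
    }
}
```

Each helper has a short name because it has a single concern, which is the connection between rules 1, 2 and 3 discussed above.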

5.5.2 General Responsibility Assignment Software Patterns

General Responsibility Assignment Software Patterns (GRASP) are object oriented practices to help with software design. A good explanation with a list of them is in the now classical book [39]. These practices are represented as patterns of OOP.

1. Creator

2. Information Expert

3. Low Coupling

4. Controller

5. High Cohesion

6. Polymorphism

7. Indirection

8. Pure Fabrication

9. Protected Variations

Some of these patterns are so common that many frameworks have built-in tools or code to promote their usage. Indirection (7) and Creator (1) are solved automatically by using DI. Solving Controller (4) is greatly aided by the use of an MVC framework for the presentation layer.

One of the most useful patterns is Information Expert (2), which deals directly with the assignment of responsibilities. The decision of which class should contain a certain method is always left to the programmer, so it is an activity he must inevitably master. The pattern says "Assign a responsibility to the information expert class that has the information necessary to fulfill the responsibility", meaning that the responsibility belongs to the data owner.
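A minimal sketch of Information Expert follows. The classes Contact and Recipient are hypothetical and only illustrate the idea that the class owning the data also answers questions about it; they do not mirror the NotX domain model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes used only to illustrate Information Expert.
final class Contact {
    final String engine;   // e.g. "mail" or "sms"
    final String address;

    Contact(String engine, String address) {
        this.engine = engine;
        this.address = address;
    }
}

final class Recipient {
    private final List<Contact> contacts = new ArrayList<>();

    void addContact(Contact contact) {
        contacts.add(contact);
    }

    // Recipient owns the contact list, so it is the information expert for
    // the question "where can this recipient be reached via a given engine?".
    String addressFor(String engine) {
        for (Contact c : contacts) {
            if (c.engine.equals(engine)) {
                return c.address;
            }
        }
        return null; // no contact registered for this engine
    }
}
```

Callers ask the Recipient instead of iterating over its contacts themselves, so the lookup logic exists in exactly one place.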


A second very useful GRASP pattern is High Cohesion (5). This pattern is easy to evaluate by looking at the implementation of a class. The absolute highest cohesion is achieved when every method of the class uses all of the fields of the class (all of the data). This way it is easy to see that the inner workings of the class are interrelated and highly cohesive. When cohesion decreases, the class is much harder to comprehend and it becomes too large, with too many responsibilities.

5.5.3 Design Patterns

The Gang of Four Design Patterns [31] are very popular and well known practices in OOP. Patterns are essentially templates for object oriented design that have proven to be useful in particular situations. Thanks to common naming conventions and design, they make it much easier to communicate solutions to problems. There are many patterns and a few of the most used ones are:

1. Factory method

2. Facade

3. Decorator

4. Observer

5. Template method

6. Strategy

7. Proxy

The Singleton pattern was intentionally left out because it is usually misused.

Patterns 5 and 6 (Template method and Strategy) are basic patterns that are used in almost every system.
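The Strategy pattern can be sketched as follows. The interface DeliveryStrategy and the class Notifier are hypothetical names chosen for this illustration; they are not the NotX Engine interface.

```java
import java.util.HashMap;
import java.util.Map;

// Strategy: one interface, many interchangeable delivery algorithms.
// All names are hypothetical and do not mirror the NotX Engine interface.
interface DeliveryStrategy {
    String deliver(String contact, String text);
}

// Context: selects a strategy by name at runtime and delegates to it.
final class Notifier {
    private final Map<String, DeliveryStrategy> strategies = new HashMap<>();

    void register(String name, DeliveryStrategy strategy) {
        strategies.put(name, strategy);
    }

    String send(String name, String contact, String text) {
        DeliveryStrategy strategy = strategies.get(name);
        if (strategy == null) {
            throw new IllegalArgumentException("Unknown strategy: " + name);
        }
        return strategy.deliver(contact, text);
    }
}
```

New delivery channels are added by registering another strategy, without changing the Notifier class.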


Chapter 6

Implementation

This chapter breaks NotX down into components and shows important parts of the implementation on a class level.

The following components are the most important ones. Figure 6.1 shows the dependencies among the NotX components.

• Core

• Kernel (Business logic, External interfaces)

• Messaging

• Persistence

• Engines

• Web

The core component is represented as a Maven module notx-core and contains common interfaces along with some utility classes. Important parts of this package are used by other components and will be discussed in the respective sections.

6.1 Engines

The engines component represents a set of interfaces used to implement an engine in the form of a JAR file with attached configuration (Java Properties). This section will give an example of the 2 implemented engines in NotX.

Figure 6.1: Component diagram (components: Core, Engines, Kernel, Messaging, Persistence)

6.1.1 Deployment

Engines are implemented as JAR files that should contain a class which implements the Engine interface. Along with this JAR file, the developer of the engine also supplies the configuration, which is changeable at the time of deployment. All engines are configured in the engines.properties file. An example of such a file is presented in Listing 6.1.

The keys engineClass and correlationClass are global configuration keys that are used for every engine. The mail prefix on each line just assigns the configuration key to a specific engine. The developer of the engine can create any key he wants in this configuration, e.g., mail.asdf, and at runtime this key will be passed to the MailEngine class instances.

Each engine gets its configuration at runtime: NotX provides its engines with key/value pairs where each key is prefixed with the engine name.
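The prefix convention can be sketched with plain java.util.Properties. The class EnginePropertiesSketch and its methods are hypothetical helpers written for this illustration; the actual NotX loading code is not shown in this thesis.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not the NotX implementation.
final class EnginePropertiesSketch {

    // Parses properties from text; a real deployment would read engines.properties.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return properties;
    }

    // Selects the key/value pairs of one engine. The keys keep their
    // "<engineName>." prefix, as described in the text above.
    static Map<String, String> forEngine(Properties all, String engineName) {
        String prefix = engineName + ".";
        Map<String, String> result = new TreeMap<>();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key, all.getProperty(key));
            }
        }
        return result;
    }
}
```

With a file containing mail.* and sms.* keys, forEngine(properties, "mail") returns only the mail.* entries, including custom ones such as mail.asdf.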

mail.engineClass=net.notx.engine.mail.MailEngine
mail.correlationClass=net.notx.engine.mail.MailEngineCorrelationProcessor
mail.smtp.host=
mail.password=
mail.user=
mail.from=
mail.return-receipt-email=notxwatchdog@gmail.com

sms.engineClass=net.notx.engine.sms.SmsEngineBrana
sms.correlationClass=net.notx.engine.sms.SmsEngineBranaCorrelator
sms.username=
sms.password=

Listing 6.1: Engine configuration

6.1.2 Correlation

One very important backlog item is correlation. This feature enables NotX to work with the information whether the recipient has received the notification or not. The implementation of this function is always engine-specific and therefore needs to be implemented in a more abstract way than other functions.

In reality, the correlation is asynchronous and there are a few common ways to do this:

1. An email sent by the notification provider

2. A push correlation by the provider. In this case, the provider uses a GET request on a specific URL (this is the case for voice correlations by the OptimSYS company)

3. A pull correlation where NotX needs to periodically test the status of a message sent under some unique ID. This can be a web service or some REST API.

Correlation logic has to be a part of the engine itself because each engine has a unique correlation mechanism. NotX provides classes and processes to implement such correlation. The implementation resides in the net.notx.comodule.correlation package in the notx-core module. The class diagram (Figure 6.2) shows the most important classes.

Figure 6.2: Correlation class diagram
CorrelationInfo: +correlationID: String, +receivedTime: Date, +engine: String
ProcessedCorrelationMessage<T>: +unprocessedMessage: T
ProcessedEmailMessage<javax.mail.Message>: +markAsRead(), +markAsUnread()
ProcessedHttpServletRequest<HttpServletRequest>
CorrelationProcessor<T>: +canHandle(correlationMessage:T): boolean, +handle(correlationMessage:T): ProcessedCorrelationMessage<T>, +getName(): String
ActiveCorrelationProcessor: +tryCorrelate(): List<CorrelationInfo>, +configure(engineSettings:EngineSettings)
EmailCorrelationProcessor<javax.mail.Message>: +handle(correlationMessage:Message): ProcessedEmailMessage
HttpCorrelationProcessor<HttpServletRequest>: +handle(request:HttpServletRequest): ProcessedHttpServletRequest

The developer should choose which sort of correlation his engine uses. When 1 is the case, he should extend the EmailCorrelationProcessor and override the method ProcessedEmailMessage handle(javax.mail.Message msg). Notice that this method receives a class from javax.mail, thus the developer only focuses on the logic needed to parse out a unique correlation ID. After that, the developer sets the correlation class in the engines.properties file under the key engineName.correlationClass. NotX will periodically check email and push it into the correlation classes.

Correlation method 2 requires a lot of boilerplate code. The developer has to set up a Java Servlet that waits for correlation messages. To avoid setting up such infrastructure, there is a class HttpCorrelationProcessor with an overridable method ProcessedHttpServletRequest handle(HttpServletRequest request). Again, the developer gets the whole request and just extracts the unique ID.

Lastly, to have flexibility in correlation, there is the "do it yourself" possibility in the form of an ActiveCorrelationProcessor. When correlation is done via the pull method by checking a web service or REST API, this is the way to go. This processor is used in NotX for correlating sent SMS messages.
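The pull approach can be sketched as follows. The fields of CorrelationInfo follow Figure 6.2, but its constructor, the DeliveryStatusSource interface and the PollingCorrelator class are assumptions made for this illustration and are not the NotX classes.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;

// Fields as in Figure 6.2; the constructor is an assumption.
final class CorrelationInfo {
    final String correlationID;
    final Date receivedTime;
    final String engine;

    CorrelationInfo(String correlationID, Date receivedTime, String engine) {
        this.correlationID = correlationID;
        this.receivedTime = receivedTime;
        this.engine = engine;
    }
}

// Hypothetical stand-in for a provider's status API (web service or REST).
interface DeliveryStatusSource {
    Map<String, Boolean> deliveredById(); // message ID -> delivered flag
}

// Hypothetical pull-style correlator in the spirit of ActiveCorrelationProcessor.
final class PollingCorrelator {
    private final String engineName;
    private final DeliveryStatusSource source;

    PollingCorrelator(String engineName, DeliveryStatusSource source) {
        this.engineName = engineName;
        this.source = source;
    }

    // Asks the provider for statuses and reports every delivered message.
    List<CorrelationInfo> tryCorrelate() {
        List<CorrelationInfo> result = new ArrayList<>();
        Date now = new Date();
        for (Map.Entry<String, Boolean> entry : source.deliveredById().entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                result.add(new CorrelationInfo(entry.getKey(), now, engineName));
            }
        }
        return result;
    }
}
```

A scheduler would call tryCorrelate() periodically; undelivered messages simply do not appear in the result and are checked again on the next call.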

6.1.3 Email

This engine is part of the NotX source code and corresponds to the Maven module named notx-mailengine under the notx-engines module. It is written in Java and uses the SMTP protocol to send e-mails. An example of notx-mailengine.jar configuration is in Listing 6.2.

mail.engineClass=net.notx.engine.mail.MailEngine
mail.correlationClass=net.notx.engine.mail.MailEngineCorrelationProcessor
mail.smtp.host=xxxx
mail.password=xxxxx
mail.user=xxxx
mail.from=xxxxx
mail.return-receipt-email=notxwatchdog@gmail.com

Listing 6.2: Engine configuration

There are two interesting points about this implementation.

The first one is correlation. The return-receipt-email is an address used to receive receipts that are sent back by the receivers of the notification. This is implemented by adding a Disposition-Notification-To email header and a NotX-specific Notx-Correlation-ID header (see Listing 6.3).

headers.put("Disposition-Notification-To", config.getProperty("return-receipt-email"));
headers.put("Notx-Correlation-ID", messageParts.get("cid"));

Listing 6.3: Email correlation

The MailEngineCorrelationProcessor is used here. It extends the EmailCorrelationProcessor to correlate by e-mail. The processor checks the mailbox for e-mails containing a Notx-Correlation-ID header and, after identifying a correlation, constructs a ProcessedEmailMessage which contains information about the successful correlation.
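The header check can be sketched without the javax.mail dependency by modelling the e-mail headers as a plain map. The class HeaderCorrelationSketch is hypothetical; the real processor works with javax.mail.Message and returns a ProcessedEmailMessage.

```java
import java.util.Map;

// Hypothetical sketch: e-mail headers are modelled as a simple map here,
// whereas the real MailEngineCorrelationProcessor receives a javax.mail.Message.
final class HeaderCorrelationSketch {
    static final String HEADER = "Notx-Correlation-ID";

    // A message can be handled when it carries a non-empty correlation header.
    static boolean canHandle(Map<String, String> headers) {
        String id = headers.get(HEADER);
        return id != null && !id.trim().isEmpty();
    }

    // Returns the correlation ID, or null when the message is not a NotX receipt.
    static String extractCorrelationId(Map<String, String> headers) {
        return canHandle(headers) ? headers.get(HEADER).trim() : null;
    }
}
```

Note that real e-mail header names are case-insensitive, which this simplified map lookup does not take into account.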

6.1.4 SMS

Several options for sending text messages were identified:

1. Self-implemented SMS sending via cell phone

2. Hardware 2N gate

3. Konzulta CZ

4. SMSBrana.cz

Implementing option 1 is a low-cost solution that is viable for non-scalable scenarios. A prototype of such SMS notification was implemented; any mobile phone which can be connected to a PC and supports specific AT commands can be used. AT commands can be sent to a mobile phone using Java SE, and implementations are already available in the community ([22]). The problem with this solution is that it cannot be easily moved to the cloud or any external server, and there would invariably be scalability problems when the volume of SMS messages increases.

Konzulta CZ (3), a privately held company, offers a solution for high-volume SMS sending. Konzulta and similar providers (collectively called SMS Aggregators) have contracts directly with mobile operators which allow them to send SMS via the respective SMS center of the operator. Technologically, this is done via a VPN connection and a Java API. Konzulta publishes its API as a web service, which is a very convenient solution for a Java developer. The speed of SMS delivery (tested with a couple of SMS messages) was the fastest of all of the tested solutions. However, it was not incorporated, because Konzulta did not yet have this service usable out of the box. There was a need to contact the business department and negotiate the price and SLAs. Also, basic subscriptions were priced quite high compared to other solutions.

Another self-implemented option is to use the specialized hardware gate (2), which can contain multiple SIM cards. This solution requires high investments and so it was neither used nor tested.

NotX uses SMSBrana.cz (4) because it is consumable out of the box and is the cheapest of all of the solutions. The website offers a couple of products. The one of interest here is named "SMS Connect" and features an API to send messages and receive receipts. There is a PDF manual on the SMSBrana website which contains all of the technical information the developer needs [23]. From a business point of view, the NotX deployer just registers on SMSBrana's website and sends some credit via a bank transfer, which enables his developers to send SMS messages.

SMS brana engine details

The most important functionalities that NotX needs are SMS sending and the correlation of sent messages. Every correlation mechanism requires the pairing up of a unique ID and a sent message.

Sending in SMSBrana is done via a rather complicated REST API and thus it was necessary to split up this part of the code and encapsulate it in logical classes, which resulted in the decomposition depicted in Figure 6.3. The SMSBrana class has a constructor which takes a name and a password. Everything else is done via performAction, which encapsulates the logic (so-called actions) described in the SMSBrana manual. The developer is freed from the rather complex authentication mechanism, which requires computing hashes from the time, a given salt and a password.
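The hashing step can be illustrated with the standard MessageDigest class. The method md5Hex produces the same lower-case hexadecimal format as PHP's md5() function. The method authToken and the order in which it concatenates password, time and salt are assumptions invented for this sketch; the real scheme is defined in the SMSBrana manual [23].

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the NotX SMSBrana class.
final class Md5Sketch {

    // Lower-case hexadecimal MD5 of a UTF-8 string (the format PHP's md5() returns).
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 algorithm is not available", e);
        }
    }

    // ASSUMPTION: the concatenation order is invented for illustration only.
    static String authToken(String password, String time, String salt) {
        return md5Hex(password + time + salt);
    }
}
```

Hiding this computation behind performAction means that engine code never deals with time stamps, salts or hashes directly.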

Figure 6.3: SMSBrana decomposition
SMSEngineBrana: +sendNotification(contact:String, messageParts:Templates)
SMSBrana: +SMSBrana(username:string, password:string), +performAction(actionName:string, actionParameters:Map<String, String>): String, -createHttpGetToSMSBrana(qparams:List<NameValuePair>): HttpGet, -getDateInSpecialFormat(): String, -computePHPlikeMD5Hash(string:string): String
SMSEngineBranaCorrelator: +tryCorrelate(): List<CorrelationInfo>
Other elements shown: ActiveCorrelationProcessor, Engine

6.1.5 Voice

Requirements asked for the proposal of a voice engine. The following options were identified on the market in the Czech Republic:

1. Skype API

2. TELFA (DotazovaTEL)

3. OptimSys

It is already possible to use Skype as an automated tool for making phone calls using the unofficial Skype API for Java [24], in combination with a virtual cable that redirects audio output from a sound file into Skype. Thus it is possible to play a recording back into a voice call.

The only missing thing is to synthesize text in an arbitrary language, which is a major problem. The only viable (economically affordable) solution is to use an unofficial Google translator API. The developer can send text via a GET request (Listing 6.4).

http://translate.google.com/translate_tts?tl=lang&q=text

Listing 6.4: Unofficial google translator API

Text will be synthesized into lang and sent back as an MP3 file. There is a limit to how many words can be synthesized and thus it is necessary to break the text up into sentences and rejoin them after synthesis.
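Splitting the text and building one request URL per sentence can be sketched as follows. The class TtsRequestSketch is hypothetical, and the sentence-splitting rule (split after '.', '!' or '?') is a simplifying assumption; the sketch only constructs URLs in the shape of Listing 6.4 and performs no network calls.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: builds request URLs only, no HTTP communication.
final class TtsRequestSketch {

    // Naive sentence splitting: breaks after '.', '!' or '?' followed by whitespace.
    static List<String> splitIntoSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String sentence : text.trim().split("(?<=[.!?])\\s+")) {
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Builds a URL in the shape of Listing 6.4 with URL-encoded parameters.
    static String buildUrl(String lang, String sentence) {
        try {
            return "http://translate.google.com/translate_tts?tl="
                    + URLEncoder.encode(lang, "UTF-8")
                    + "&q=" + URLEncoder.encode(sentence, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }
}
```

The resulting MP3 fragments would then be concatenated in the original sentence order before playback.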

To get back to the Skype API, it is a very unpromising option in contrast to the other two because it is not designed to be used as a massively parallel means of notification. Each call would have to be made separately.

Another option for voice notifications is the service DotazovaTEL, which is prepared for massive automated phone campaigns. This service features TTS for the Czech language, but that is all. A similar service is offered by OptimSys without the Text-to-Speech synthesis.

In conclusion, there is no service on the Czech market which offers multi-language TTS in combination with a phone call. Even a simple API for sending a file to be played in a call to a specific phone number is not offered as a service, and needs to be negotiated with the provider.

To implement Voice notification in NotX it will be necessary to combine the above approaches. This may very well be done by creating such a "phone call service", which would have great added value by itself.

6.2 Kernel

This component of NotX contains the most important logic and integrates all of the components together. Hence it directly depends on most of the others. There is also an indirect (runtime) dependency on created engines that are loaded by the Java URLClassLoader. The kernel contains most of the important ideas behind NotX, such as routing and supertemplates, which are described in this section.

6.2.1 Application in Web Container

NotX is deployed as a WAR in a web container but at the same time it is not a typical web application. It does deploy a web interface for testing and administration, but the main logic may be thought of as a standalone application. There are several advantages of deploying NotX as a WAR:

1. Easier packaging by Maven WAR packaging. Standalone applications need to be packaged by the Assembly plugin and the resulting directory structure may be more complicated than necessary.

2. Easy deployment into a platform as a service solution

3. Unified configuration in web.xml

4. It is possible to immediately expose the web interface and Java Servlets

A technique to run a standalone application packaged as a WAR is to implement the ServletContextListener interface and register the implementation class in web.xml as a listener-class.

6.2.2 Routing

Routing is a way of configuring global and personal notification delivery options. It should be possible to create a flexible abstraction allowing global and personal settings. NotX solves this by introducing a routing record, which is represented by the class RoutingRecord (Figure 6.4).

Each routing record is a statement of how a notification should be routed. When userid is filled, the record is applicable only for that user, and when source (a tag) is filled, the record is applicable only for that tag.

Among the applicable routing records (canRouteMessage method), the one with the highest priority is chosen and its destination (multiple engines separated by ',') is used.
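The selection rule can be sketched as follows. The class RouteSketch is a simplified, hypothetical stand-in for RoutingRecord: it matches on plain strings instead of a NotificationMessage, it treats an empty field as "applies to everything", and it assumes that a larger number means a higher priority.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for RoutingRecord (see Figure 6.4).
final class RouteSketch implements Comparable<RouteSketch> {
    final String userid;      // null or empty: applies to every user
    final String source;      // null or empty: applies to every tag
    final int priority;       // ASSUMPTION: larger number means higher priority
    final String destination; // engine names separated by ','

    RouteSketch(String userid, String source, int priority, String destination) {
        this.userid = userid;
        this.source = source;
        this.priority = priority;
        this.destination = destination;
    }

    boolean canRoute(String messageUser, String messageTag) {
        return matches(userid, messageUser) && matches(source, messageTag);
    }

    private static boolean matches(String filter, String value) {
        return filter == null || filter.isEmpty() || filter.equals(value);
    }

    List<String> destinationEngines() {
        List<String> engines = new ArrayList<>();
        for (String engine : destination.split(",")) {
            if (!engine.trim().isEmpty()) {
                engines.add(engine.trim());
            }
        }
        return engines;
    }

    @Override
    public int compareTo(RouteSketch other) {
        return Integer.compare(priority, other.priority);
    }

    // Picks the applicable record with the highest priority; empty list if none applies.
    static List<String> route(List<RouteSketch> records, String user, String tag) {
        RouteSketch best = null;
        for (RouteSketch record : records) {
            if (record.canRoute(user, tag) && (best == null || record.compareTo(best) > 0)) {
                best = record;
            }
        }
        return best == null ? new ArrayList<String>() : best.destinationEngines();
    }
}
```

A global low-priority record can thus provide a default channel, while a personal record with a higher priority overrides it for a single user.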

Figure 6.4: RoutingRecord class
RoutingRecord: +userid: string, +priority: int, +source: string, +destination: string, +msgtype: string, +getDestinationEngines(): List<string>, +canRouteMessage(jmsMessage:NotificationMessage): boolean, +compareTo(o:RoutingRecord): int

6.2.3 Supertemplates

As was noted already, the supertemplates abstract from a language and a destination channel. It is necessary that for each supertemplate there be a number of versions, one for each combination of language and engine destination. This is necessary because engines differ greatly in their needs. SMS messages require a short, understandable message, whereas an e-mail can be composed in HTML and contain images or clickable links.

Figure 6.5: SuperTemplate class
Supertemplate: +name: string, +description: string, +getTemplate(descriptor:TemplateDescriptor): Template
Template: +parts: Map<string, string>, +inject(placeholders:Map<string, string>): void
Association: Supertemplate 1 to 0..n Template
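The relation between a supertemplate and its versions can be sketched as follows. TemplateSketch and SupertemplateSketch are hypothetical simplifications of the classes in Figure 6.5: the descriptor is reduced to a language and an engine name, and the ${name} placeholder syntax is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of the Template class from Figure 6.5.
final class TemplateSketch {
    final Map<String, String> parts = new HashMap<>();

    // Replaces every ${name} occurrence in every part.
    // ASSUMPTION: the ${...} syntax is invented for this sketch.
    void inject(Map<String, String> placeholders) {
        for (Map.Entry<String, String> part : parts.entrySet()) {
            String text = part.getValue();
            for (Map.Entry<String, String> p : placeholders.entrySet()) {
                text = text.replace("${" + p.getKey() + "}", p.getValue());
            }
            part.setValue(text);
        }
    }
}

// Hypothetical simplification of Supertemplate: one version per language and engine.
final class SupertemplateSketch {
    private final Map<String, TemplateSketch> versions = new HashMap<>();

    void addVersion(String language, String engine, TemplateSketch template) {
        versions.put(language + "/" + engine, template);
    }

    // Returns the version for the given combination, or null when none exists.
    TemplateSketch getTemplate(String language, String engine) {
        return versions.get(language + "/" + engine);
    }
}
```

The caller of the notification API only names the supertemplate; the language and the engine are resolved later from the user's settings and the routing decision.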

6.3 Messaging

While developing NotX, there was a transition from using the standard JMS API to access an ActiveMQ broker to using Amazon SQS. Thankfully, messaging was identified as a logical component from the start and was encapsulated in the module notx-amazon-sqs.


JMS is not a very friendly API. It is designed without enterprise functions in mind. There is, however, a community effort to remedy this in the form of a Spring component called DefaultMessageListenerContainer. It makes it possible to construct simple POJO (plain old Java object) beans that consume messages in a configurable number of threads, and it supports neat fail-over and confirmation of messages. Working with this abstraction was so convenient that it inspired the author to build a similar abstraction for Amazon SQS. Note that SQS offers only a single-threaded API that is hard to work with. Also, by abstracting and putting all message handling logic into simple POJOs, it is far easier to migrate to a different messaging provider, thus avoiding vendor lock-in.

6.3.1 Class diagram

The diagram shows the most important classes which participate in the sending and receiving of messages. It is important to note that the API is not symmetric, which comes from the fact that the sending of a message is not multi-threaded while receiving is. The main class for sending is SQSFacade. For multi-threaded reception, the SQSMessageContainer should be configured, which is usually done via Spring because all of the classes are built as Spring Beans.

6.3.2 Multi-threading

The whole design of notx-amazon-sqs is aimed at allowing multi-threaded reception of messages with fail-over capability and acknowledgment. When the developer gets an object through the SQSFacade using the method receiveObjects, the object has to be acknowledged by the acknowledge method after the processing is complete.

The implemented API allows the number of threads to be set easily via setMaxConcurrentThreads. After starting the container, it will spawn up to that many threads. When any exception is thrown while processing a message, the message is not acknowledged, which leaves it in the queue.
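The acknowledge-on-success rule can be sketched with a standard thread pool and an in-memory list of messages. The class AckingContainerSketch is hypothetical and does not talk to Amazon SQS; it only shows that a message is acknowledged when its listener finishes without an exception.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical in-memory sketch, not the SQSMessageContainer implementation.
final class AckingContainerSketch {

    // Processes the messages on at most maxThreads threads and returns
    // the messages that were acknowledged (handled without an exception).
    static Set<String> process(List<String> messages, int maxThreads, Consumer<String> listener) {
        Set<String> acknowledged = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        for (String message : messages) {
            pool.execute(() -> {
                try {
                    listener.accept(message);
                    acknowledged.add(message); // acknowledge only after success
                } catch (RuntimeException e) {
                    // Not acknowledged: a real queue would keep the message for redelivery.
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return acknowledged;
    }
}
```

Because failed messages are never acknowledged, a crash of a worker thread cannot silently lose a notification request.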

Figure 6.6: Amazon SQS
SQSFacade: +SQSFacade(configPath:String), +createQueue(name:string): string, +deleteQueue(name:string), +acknowledge(msg:SQSMessage), +sendObject(queue:string, object:Serializable), +receiveObjects(maxNumberOfObjects:int, queue:string): List<SQSMessage>
SQSHandleThread: +SQSHandleThread(msg:SQSMessage, messageListener:object)
SQSMessage: +receiptHandle: string, +fromQueue: string, +theObject: object
SQSMessageContainer: +queueName: string, +maxConcurrentThreads: int, +messageListener: object, +listenInterval: int, +maxThreadWorkTimeSeconds: int, +listeningThreads: SQSThredCollection, +start(), +stop()
SQSThreadCollection: +getSuccessfullFinished(): SQSThredCollection, +getFailedFinished(): SQSThredCollection, +deleteFinished()
Other elements shown: ArrayList<SQSHandleThread>, Thread, <<interface>> Runnable

6.4 Persistence

The persistence layer is a very important part of NotX because it is used to store user information, templates for messages, statistics and notifications that couldn't be delivered. It is located in the notx-cassandra-access module and contains abstract wrappers around the Hector API. The reason for the wrappers is that the Hector API is rather low level. Concrete implementations of the NotX Data Access Objects are in the package net.notx.cassandra.dal.

The abstract wrappers make it possible to construct the Cassandra keyspace as well as to bootstrap initial/test data. This is more convenient than maintaining scripts for a database schema and it also seems to be the best way to do it with regard to unit testing. ORM tools support this argument because today it is more common to build a domain model (OOP classes) and to let an ORM tool generate the database layer.


Each entity has its own Data Access Object. For example, the entity User has the UsersDAO object that inherits from the DAO class, and the corresponding unit test is called UserDAOTest, which inherits from DAOTest. Note that the developer doesn't instantiate the UserDAO directly but should call the getUsersDAO() method on the CassandraFacade. As it turns out, this still works very well with Spring because the Spring DI container allows the construction of objects by a factory method.

The last important note concerns unit testing. From experience, the Data Access Layer (DAL) should expose as high-level an interface as possible to the business layer so that the developer achieves a clean separation and there is no data-related logic outside the DAL. When such a DAL is heavily unit tested (or, better said, functionally tested), many data-dependent problems can be easily debugged and simulated (provided that test data creation is also automated and easy to do). Cassandra has the great advantage that it is relatively easy to run in-memory with some workarounds. This way it is not necessary to configure the source code against a testing database instance, and the unit tests are completely independent.

6.4.1 Diagram

There is no standard for capturing the design of a NoSQL database, but because NotX only uses standard column families (no Super Column families), it is easy to represent it as an ER diagram. The primary key designates the key that is used to address rows in the Column Family.

When reading this diagram, the reader should realize that columns are added dynamically during runtime. Only the Column Families themselves are deployed as a database schema.

In a NoSQL database, it is necessary to denormalize the schema according to query requirements, which is the reason for the existence of CF_IDS_BY_TAGS and CF_TAGS_BY_IDS. Whenever the developer needs to query data, the schema needs to be tailored to reflect that need. This has an obvious drawback in the stability of the schema. The next issue relates to the complexity of writing into the NoSQL database. It is easier to just store objects serialized than to marshal their parts into primitive types. This approach, however, can result in compatibility problems and makes it harder to update data in such a schema.

Figure 6.7: Cassandra schema
CF_ROUTING: CF_ROUTE serialized, RECORD_ID int
CF_FAILS: FAIL serialized
CF_SENT_NOTIFICATIONS_BY_DOMAIN: CORRELATION_ID int, DOMAIN string
CF_SENT_NOTIFICATIONS: CORRELATION_ID, NOTIFICATION serialized
CF_TEMPLATES: TEMPLATE serialized
CF_TAGS_BY_IDS: USER_ID string, TAG string
CF_IDS_BY_TAGS: TAG string, USER_ID string
CF_USERS: USER_ID, USERS_NAME string, USERS_LANG string, USERS_CONTACTS contacts

In exchange for accepting these problems, a developer gets very quick writes and reads and free horizontal scalability.
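The denormalization behind CF_TAGS_BY_IDS and CF_IDS_BY_TAGS can be sketched with two in-memory maps that are always written together. The class TagIndexSketch is hypothetical and uses no Cassandra or Hector API; it only illustrates that every query direction gets its own structure.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory sketch of a denormalized tag index.
final class TagIndexSketch {
    private final Map<String, Set<String>> tagsById = new HashMap<>(); // role of CF_TAGS_BY_IDS
    private final Map<String, Set<String>> idsByTag = new HashMap<>(); // role of CF_IDS_BY_TAGS

    // Every write goes to both structures so that both queries stay cheap.
    void tag(String userId, String tag) {
        tagsById.computeIfAbsent(userId, k -> new TreeSet<>()).add(tag);
        idsByTag.computeIfAbsent(tag, k -> new TreeSet<>()).add(userId);
    }

    void unTag(String userId, String tag) {
        Set<String> tags = tagsById.get(userId);
        if (tags != null) {
            tags.remove(tag);
        }
        Set<String> ids = idsByTag.get(tag);
        if (ids != null) {
            ids.remove(userId);
        }
    }

    Set<String> usersWithTag(String tag) {
        return idsByTag.getOrDefault(tag, Collections.<String>emptySet());
    }

    Set<String> tagsOfUser(String userId) {
        return tagsById.getOrDefault(userId, Collections.<String>emptySet());
    }
}
```

The price of the fast lookups is that each write touches two structures, which mirrors the write complexity mentioned above.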

6.5 Web interface

The web interface is implemented as a Stripes application that uses the NotX persistence and messaging components. The web interface provides the following functions:

1. Recreating the Cassandra keyspace (mainly for testing purposes)

2. User Management - creating, viewing, updating and deleting users

3. Domain statistics of sent notifications and correlation information

4. Error view of notifications that weren't sent because of some error

5. The sending of notifications via the web interface (again for testing purposes)

6. Supertemplates management

7. Routing

The functions won't be described here as they are self-explanatory. However, there are 2 interesting implementation details.

The NotX web interface provides statistics of sent notifications. It is possible to trace an arbitrary notification and find out whether it was correlated and when. Statistics also provide graphs with the number of notifications per domain and engine.

The second important note is about the integration of Spring with Stripes. NotX uses Spring DI to pass service objects around, e.g., CassandraFacade. To integrate Spring DI with Stripes, there is a very elegant way via the Spring class ContextLoaderListener. This is a standard servlet context listener that makes the Spring context available in the ServletContext once the configuration from Listing 6.5 is added to web.xml.

<listener>
    <listener-class>
        org.springframework.web.context.ContextLoaderListener
    </listener-class>
</listener>

Listing 6.5: Spring context classloader

In Stripes, it is then possible to override the ActionBeanContext class and use a Stripes filter to force usage of this ActionBeanContext as shown in Listing 6.6.

<init-param>
    <param-name>ActionBeanContext.Class</param-name>
    <param-value>net.notx.web.NotxActionBeanContext</param-value>
</init-param>

Listing 6.6: Stripes customization

In the class NotxActionBeanContext, it is then possible to use WebApplicationContextUtils, which will return the required Spring WebApplicationContext.

The described technique allows a seamless integration of Spring into Stripes, and the developer can easily access his Spring Beans.

6.6 Communication module

The communication module is used to publish the external NotX API. The main class that can be thought of as a representation of the communication module is Comodule. This class starts up the respective worker threads, each of which serves one kind of interface. This shows how NotX is not an actual web application but a standalone application deployed in a web container.

6.6.1 API overview

There are 3 operations for an external developer:

Tag

This operation is used to label a single user by his ID. Figure 6.8 shows a typical tag that consists of two parts: the part before the '.' character is the domain and the part after it is the tag. The domain is used for statistical purposes.

Figure 6.8: Tag structure (example: OpenMobility.attendees, where OpenMobility is the domain and attendees is the tag)

UnTag

This operation reverses the tag operation.

SendNotification

Sends a notification. The syntax is

sendNotification(tag, msgType, templateName, placeholders)

Listing 6.7: Send notification syntax

The tag parameter is there to address an abstract recipient. It can take 3 different forms:

1. a unique ID preceded by ':', which addresses a specific user

2. a variation of 1, multiple unique IDs delimited by ':'

3. any other string that represents a tag name

The msgType decorates a notification with a custom string, giving it more semantics for routing.

The templateName selects a supertemplate that will be used for sending a notification, e.g., Registration_confirmation.

Finally, placeholders are used to add variable information (time, name, place) to the notification itself.
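The three forms of the tag parameter can be sketched with a small parser. The class RecipientAddressSketch is hypothetical, and the exact shape of form 2 is an assumption: the sketch reads it as ":id1:id2:id3". The domain() method follows the tag structure of Figure 6.8.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the tag parameter of sendNotification.
final class RecipientAddressSketch {
    final List<String> userIds = new ArrayList<>(); // filled for forms 1 and 2
    String tagName;                                  // filled for form 3

    static RecipientAddressSketch parse(String tag) {
        RecipientAddressSketch result = new RecipientAddressSketch();
        if (tag.startsWith(":")) {
            // ASSUMPTION: several IDs are written as ":id1:id2:id3".
            for (String id : tag.substring(1).split(":")) {
                if (!id.isEmpty()) {
                    result.userIds.add(id);
                }
            }
        } else {
            result.tagName = tag;
        }
        return result;
    }

    // Part of the tag name before the first '.', or null when there is none.
    String domain() {
        if (tagName == null) {
            return null;
        }
        int dot = tagName.indexOf('.');
        return dot < 0 ? null : tagName.substring(0, dot);
    }
}
```

Keeping this parsing in one place lets all three external APIs accept the same addressing syntax.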

6.6.2 API

NotX provides 3 APIs: WebServices, Java native client and Thrift API.

Web services are a widely used communication style, mainly because of the SOAP binding over HTTP, which inherits the advantages of HTTP (no firewall problems and SSL support).

The implementation in NotX is based on the Apache CXF framework, which provides a container for JAX-WS annotated services.

The Java client is just a wrapper around the Web service API.

The Thrift API is best described by the Thrift definition file itself.

namespace java net.notx.com.thrift

service NotxServer {
    oneway void tag(string userId, string tag)
    oneway void unTag(string userId, string tag)
    oneway void sendNotification(string tag, string msgType,
        string templateName,
        map<string, string> placeHolderVals)
}

Listing 6.8: Thrift API

It is very concise and understandable (compared with WSDL), yet it is possible to use Thrift to generate client/server stubs for many programming languages. A big problem of Thrift is that it wasn't developed with security in mind, so it is only usable in private networks.

6.7 Performance

Performance in terms of throughput, and fail-over, are important things to watch for in every system. State of the art tools and scalable persistence should help to promote them as much as possible.

6.7.1 Environment

Performance was tested on an AMD Phenom II 3.2 GHz with 8GB of memory.

The test was conducted by sending notification requests to a testing environment. Theresponse time and tag time of the kernel were measured.

Both the client and the server were deployed on the same machine, and Amazon SQS was used as an external service available over an Internet connection.


6.7.2 Results

Tagging

Users   Response time   Tag time
10      2709            11894
20      5147            17087
30      6918            22132
40      10170           32829
50      11197           35305
60      14327           40289
70      17028           50700
80      19022           59725
90      20263           62179
100     23105           68083

Figure 6.9: Tagging. (a) Response time, (b) Total time

The tagging results show that both the response time and the tag time grow roughly linearly with the number of users, which offers a good basis for scalability.
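One way to check how linear the trend is would be an ordinary least-squares fit of the measured response times against the number of users. The Java sketch below does this for the response-time column of the tagging measurements; the class and method names are illustrative, and the times are in whatever unit the measurements were recorded in.

```java
public class LinearFitSketch {
    /** Returns {slope, intercept, rSquared} of the least-squares line y = slope * x + intercept. */
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxx = 0.0;
        double sxy = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        // Coefficient of determination for a simple linear regression
        double rSquared = (sxy * sxy) / (sxx * syy);
        return new double[] {slope, intercept, rSquared};
    }

    public static void main(String[] args) {
        // Users and response times from the tagging measurements
        double[] users = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        double[] responseTime = {2709, 5147, 6918, 10170, 11197, 14327, 17028, 19022, 20263, 23105};
        double[] result = fit(users, responseTime);
        System.out.printf("slope=%.2f intercept=%.2f R^2=%.4f%n", result[0], result[1], result[2]);
    }
}
```

An R² value close to 1 would support the reading that the response time grows linearly with the number of users.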


Notification of IDs

Users   Response time   Notification of ID
10      3071            13893
20      5280            20654
30      7608            29591
40      10246           38427
50      11669           45572
60      16705           59410
70      18448           67483
80      23584           76487
90      22428           79066
100     25795           87522

Figure 6.10: Notification of IDs. (a) Response time, (b) Total time

The notification of concrete IDs also showed an almost linear growth in response time. The total time was somewhat shaky, but this can be attributed to the external SQS service.


Notification of Tags

Users   Response time   Notification of tag
10      393             15800
20      478             16063
30      766             26502
40      970             45414
50      1312            61833
60      1475            69705
70      1960            54246
80      1964            54914
90      3020            68360
100     2756            80472

Figure 6.11: Notification of tags. (a) Response time, (b) Total time

Pleasing results also occurred for the notification of concrete tags. Linear growth of the response time is a great opportunity for scaling the application horizontally.


6.7.3 Discussion

The results of the performance tests show that time consumption grows linearly with the number of users.

Amazon SQS seems to generate a large overhead, mainly for notifications, because they have to pass through the queue twice, which results in a longer round trip. Nevertheless, the architecture of NotX is queue-centric, and that makes it possible to scale horizontally by adding more NotX Kernels.
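The queue-centric idea can be sketched in Java with one shared queue and several consumers. In this sketch an in-memory `BlockingQueue` stands in for Amazon SQS and each worker thread stands in for a NotX Kernel; both substitutions, and all names used, are assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueScalingSketch {
    private static final String STOP = "__STOP__"; // poison pill that ends one worker

    /** Feeds the requests to the given number of workers and returns how many were handled. */
    public static int process(int requests, int kernels) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger handled = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(kernels);

        for (int k = 0; k < kernels; k++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String msg = queue.take();
                        if (STOP.equals(msg)) {
                            return;
                        }
                        // A real kernel would resolve recipients and call engines here
                        handled.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (int i = 0; i < requests; i++) {
                queue.put("notification-" + i);
            }
            for (int k = 0; k < kernels; k++) {
                queue.put(STOP); // one pill per worker, queued after all requests
            }
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("handled: " + process(1000, 4));
    }
}
```

Because the workers share nothing except the queue, adding capacity only means starting more consumers; the producers do not need to change.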

Thanks to the distributed nature of the Cassandra database, it is possible to get better performance by also distributing the persistence layer.

Another possibility for performance improvement is to implement asynchronous requests on the interface. This is somewhat risky: if the SQS connection fails, the interface is not able to persist the message and the caller is not informed. However, this is an acceptable risk. This asynchronous option is implemented in the Thrift interface.
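The trade-off can be illustrated with a small Java sketch. The synchronous variant returns only after the message has been handed to the queue, so the caller sees a failure directly; the asynchronous variant returns immediately and the outcome becomes known only later in the background. The `MessageQueue` interface and all other names are hypothetical and do not reflect the NotX or SQS APIs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncInterfaceSketch {
    /** Hypothetical stand-in for the queue connection. */
    public interface MessageQueue {
        void enqueue(String message); // throws RuntimeException if the connection fails
    }

    /** Synchronous: the caller waits and learns whether the message was enqueued. */
    public static boolean sendSync(MessageQueue queue, String message) {
        try {
            queue.enqueue(message);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    /** Asynchronous: returns at once; the future completes later with the outcome. */
    public static CompletableFuture<Boolean> sendAsync(MessageQueue queue, String message,
                                                       ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> sendSync(queue, message), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        MessageQueue broken = message -> {
            throw new IllegalStateException("connection lost");
        };

        System.out.println("sync result:  " + sendSync(broken, "hello"));

        // A fire-and-forget caller would ignore this future and never see the failure;
        // it is joined here only to make the lost message visible.
        CompletableFuture<Boolean> pending = sendAsync(broken, "hello", pool);
        System.out.println("async result: " + pending.join());
        pool.shutdown();
    }
}
```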


Chapter 7

Conclusion

This thesis was motivated by the fact that reusable services in EIS development should be developed with an emphasis on extensible architecture, high code quality, multi-platform support and scalability.

Requirements for NotX were gathered using the SCRUM methodology, followed by analysis. SCRUM has proven to be a very flexible methodology that did not put unnecessary pressure on either the developer or the case study stakeholders. The process was easy to explain and effective.

The next goal was to create a communication interface and extensible engines for NotX. Communication interfaces can be added to NotX through Java interfaces, as mentioned in the Implementation chapter. Engines are added as JAR files.

A big part of the thesis was dedicated to showing and describing the prototype implementation of NotX. The implementation shows deployable SMS and e-mail notifications. It was implemented following best practices of code quality and unit testing, and also with respect to tooling.

There is a web page for the NotX project, http://www.notx.net/, and the source code of the prototype is hosted on SourceForge [25].

There is also a published paper about NotX by Filip Nguyen and Jaroslav Škrabálek [41] that was presented at the international conference FedCSIS 2011 and is included in the proceedings of the conference.

During the implementation of NotX, many possible directions for future work were identified. These are described in the following section.


7.1 Future work

There are many possible ways to extend NotX. The most important extension would be to finish the implementation of the Voice engine. This will not be an easy task because there are no out-of-the-box multi-language solutions.

Engines are a real extension point of NotX, and other useful extensions could be developed:

• Engines for social networks (Facebook engine, Twitter engine)

• Engines for writing into databases and calling web services (to integrate with databases)

• Engines for Content Management Systems

Better support for distributed architecture can be added to NotX by adding a load balancer and an intelligent switch. An intelligent switch could route notification requests to different queues because certain engines can be in different places.
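A minimal Java sketch of such a switch follows, assuming that each notification request names the engine it targets and that every engine location is served by its own queue. The engine names, the queue names and the fallback to a default queue are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SwitchSketch {
    private final Map<String, String> queueByEngine = new HashMap<>();
    private final String defaultQueue;

    public SwitchSketch(String defaultQueue) {
        this.defaultQueue = defaultQueue;
    }

    /** Declares which queue serves the location where the given engine runs. */
    public void register(String engine, String queueName) {
        queueByEngine.put(engine, queueName);
    }

    /** Picks the queue for a request; unknown engines fall back to the default queue. */
    public String route(String engine) {
        return queueByEngine.getOrDefault(engine, defaultQueue);
    }

    public static void main(String[] args) {
        SwitchSketch router = new SwitchSketch("notx-default");
        router.register("sms", "notx-sms-site-a");   // hypothetical queue names
        router.register("mail", "notx-mail-site-b");
        System.out.println(router.route("sms"));
        System.out.println(router.route("voice"));
    }
}
```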

To publish NotX as a publicly available service, and not only as a case study, it is necessary to extend the user management capabilities of NotX with registration and authentication of users.

An interesting extension to NotX would be to govern more of the logic related to generating notifications. Complex Event Processing networks can be added between the EIS and NotX, which would generate notifications based on simple notifications generated by the system.


Appendix A

Contents of the CD

The Compact Disc supplied with the thesis, as well as the ZIP archive notx.zip attached to the electronic submission of this thesis, contains a source code snapshot of the NotX prototype, which is buildable by Maven and deployable into a Java Servlet container. It was tested on the Tomcat web server. Please read the supplied README file.


Bibliography

[1] http://www.cooper-safety.com/products/mass-notification-systems.

[2] http://www.blackboard.com/Platforms/Connect/Overview.aspx.

[3] http://www.alert-software.com/customers/case_studies/.

[4] http://code.google.com/intl/cs/android/c2dm/.

[5] http://openid.net/.

[6] http://breakingnewsworld.net/2011/08/mass-panic-over-alleged-facebook-leak-of-mobile-contacts/.

[7] http://en.wikipedia.org/wiki/Unified_Process.

[8] http://agilemanifesto.org/.

[9] http://en.wikipedia.org/wiki/Scrum_%28development%29.

[10] http://www.takeplace.eu.

[11] http://maven.apache.org/guides/getting-started/index.html.

[12] http://mvnrepository.com/.

[13] http://www.stripesframework.org/display/stripes/Home.

[14] http://www.springsource.org.

[15] http://sourceforge.net/.

[16] http://andrewhitchcock.org/?post=214.


[17] http://cassandra.apache.org/.

[18] http://thrift.apache.org/.

[19] http://rantav.github.com/hector/build/html/index.html.

[20] https://github.com/impetus-opensource/Kundera.

[21] http://www.jsig.com/confluence/display/HJMS/JMS4SQS.

[22] http://smslib.org/.

[23] http://www.smsbrana.cz/dokumenty/smsconnect_http.pdf.

[24] http://skype.sourceforge.jp/index.php?Skype%20API%20For%20Java%20%28English%29.

[25] http://sourceforge.net/projects/notx/.

[26] Peter Chen. The entity-relationship model - toward a unified view of data. ACM Transactions on Database Systems, 1976.

[27] Mike Cohn. User Stories Applied: For Agile Software Development. Addison-Wesley Professional, 2010.

[28] Marie Duží. Konceptuální modelování: datový model HIT. Slezská univerzita v Opavě, 2000.

[29] Martin Fowler. Patterns of Enterprise Application Architecture. 2002.

[30] Martin Fowler and Kendall Scott. UML Distilled, Second Edition: A Brief Guide to the Standard Object Modeling Language. Addison-Wesley, 1999.

[31] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. 1995.

[32] I. Jacobson, G. Booch, and J. Rumbaugh. The Unified Software Development Process. Addison-Wesley Professional, 1999.


[33] Ken Schwaber and Mike Beedle. Agile Software Development with Scrum. 2000.

[34] Jaroslav Král. Informační Systémy. 1998.

[35] P. Kruchten. The Rational Unified Process. 2000.

[36] B. V. Kumar, Prakash Narayan, and Tony Ng. Implementing SOA Using Java EE.

[37] Kyuchang Kang, Jeunwoo Lee, and Hoon Choi. Instant notification service for ubiquitous personal care in healthcare application. International Conference on Convergence Information Technology, 2007.

[38] Avinash Lakshman and Prashant Malik. Cassandra - a decentralized structured storage system. 2009.

[39] Craig Larman. Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. 2004.

[40] Robert C. Martin. Clean Code. 2008.

[41] Filip Nguyen and Jaroslav Škrabálek. NotX service oriented multi-platform notification system. FedCSIS 2011 proceedings, 2011.

[42] Jeffrey Palermo, Ben Scheirman, and Jimmy Bogard. ASP.NET MVC in Action. 2009.

[43] Mary Poppendieck and Tom Poppendieck. Lean Software Development: An Agile Toolkit. 2010.

[44] Winston Royce. Managing the development of large software systems. Proceedings of IEEE WESCON 26, 1970.

[45] C. Schmandt, N. Marmasse, S. Marti, N. Sawhney, and S. Wheeler. Everywhere messaging. 2000.

[46] Zdenko Staníček. Datové modelovaní metodou HIT. 1999.

[47] Carmen Zannier and Frank Maurer. Foundations of agile decision making from agile mentors and developers.
