BIG DATA Stories of big data success

Stories of big data success:

• Visa recently advised that it has greatly improved its ability to detect fraudulent transactions (estimated to be 6 cents out of every $100) by increasing the amount of data it analyzes and looking at a broader range of attributes for each transaction. – “Visa says Big Data identifies billions of dollars in fraud” Wall Street Journal, July 2013

• Citibank has improved the quality of its consumer loan portfolio by hiring IBM's Watson supercomputer as a "financial advisor." By using information on market conditions as well as the applicant's life events, interactions on social media and past decisions, the company is able to get a far better prediction of potential loan defaults and fraud. – “Crunching the numbers” The Economist, May 2012

• Walmart applied big data techniques and technologies to allow it to understand how to better serve its online customers. The retailer generated product and category popularity scores by mining social media, which it combined with a self-teaching semantic search capability honed by the clickstream data of 45 million online shoppers each month. As a result, it was able to provide each individual customer with choices that would be attractive to them, increasing the number of online shoppers completing transactions by between 10% and 15%. – “Walmart builds its own online shopping search engine” Gigaom 2013

• The University of Pittsburgh Medical Center applied big data approaches to integrate electronic health record information with gene sequences, which it then compared against age, tumor size and nodal status in 140 cancer patients. They were able to identify molecular differences in the makeup of pre-menopausal and post-menopausal breast cancer. These findings are scientifically significant by themselves, but they also pave the way to providing doctors with the tools and knowledge they need to prescribe a completely customized treatment regimen for each patient based on their unique genetic makeup. – “Big Data project at UPMC reveals patterns in breast cancer tumor” eWeek June 2013

The business demand for information is not going to decrease — rather it will continue to increase. New data sources will continue to appear and enterprises that master the use of information of all kinds for business insights will enjoy significant competitive advantages. Although each of the technologies taken individually might not be transformative, their combined use may very well be.

It is also important to remember that many organizations lack the skills required to exploit big data. The emerging discipline of data science encompasses hard skills like process knowledge, statistics, data visualization, data mining, machine learning and database and computer programming. Most of these skills are in short supply in many organizations, and rare in the market at large. Big data is also about using creative combinations of various types of data and human knowledge.

A Big Plus and A Big Minus:

Necessary Steps to take:

• Work closely with business counterparts to discover what types of information would improve business outcomes and where that information might come from. Leverage existing big data use cases, like advanced fraud detection, claims analytics or telematics, to understand how big data might be affecting your industry.

• Build multidisciplinary teams of business and technology experts that include data scientists. You will need familiarity with the information, the ability to perform complex analyses using some of the technologies mentioned here (for example, In-Memory Data Grids and Predictive Analytics) and the ability to visualize the results of the analysis in a meaningful way.

• Begin to explore potential new sources of information (for example, activity streams, open government data, underutilized data held within the enterprise ("dark" data) or machine instrumentation and operational technology feeds) to understand how that information might contribute to your business insight.

• Experiment with new ways of capturing information, like Complex-Event Processing, Video Search or Text Analytics.

Hype Cycle

Priority Matrix

Relevant Technologies on the Hype Cycle

Data Science

Definition: Data science is the business capability and associated discipline to model, create and apply sophisticated data analysis against disparate, complex, voluminous and/or high-velocity information assets as a means to improve decision making, operational performance, business innovation or marketplace insights. Data science is a discipline that spans data preparation, business modeling and analytic modeling. Hard skills include statistics, data visualization, data mining, machine learning and database and computer programming. Soft skills that organizations frequently desire in their data scientists include communication, collaboration, leadership and a passion for data.

The data scientist role is critical for organizations looking to extract insight from information assets for big data initiatives, and requires a broad combination of skills that may be fulfilled better as a team, for example:

• Collaboration and teamwork are required for working with business stakeholders to understand business issues.

• Analytical and decision modeling skills are required for discovering relationships within data and detecting patterns.

• Data management skills are required to build the relevant dataset used for the analysis.

The data management side of data science is also giving rise to a role that is becoming more prevalent, that of the chief data officer (CDO, see "CEO Advisory: Chief Data Officers Are Foresight, Not Fad"). As information becomes an acknowledged asset, rather than just talked about as one, CDOs will emerge as the ultimate stewards of these assets. The role of the CDO is to maximize value and use of data across the enterprise and to manage the associated risk. They will often focus on the places and ways in which certain information assets will have more impact on the organization. And, just as other key corporate resources have independent executive oversight and organizations (such as material assets, financial assets, human capital), information assets are also beginning to do so. As such, CIOs, CDOs and COOs (or line-of-business leaders) are starting to form a new and exciting management triumvirate.

Justification: Data science is still an emerging discipline where practices and ROI benefits are not yet established. Information-centric companies such as Google, Amazon and Facebook base far more of their decisions on complex ad hoc analysis of data. (More than 10 years for mainstream adoption)

Key Applications:

User Advice: Catalog and consider the range of data sources available within the organization and the greater ecosystem of information assets available. Hypothesize and experiment, looking to other industries for astounding ideas to adopt and adapt. Create sandboxes for data scientists to "play" in, and don't conflate your data warehouse or BI competency center with the data science function. Then, confirm the relative economic value of findings and the organization's ability to leverage results (technically and culturally). Recognize that data scientists are different from statisticians or BI analysts in terms of both skill set and goals. But also recognize that they are in short supply, so incubating skills internally or paying handsomely for top talent are the only options. Data science teaming arrangements that have the requisite skills in aggregate can work, but are not the same as individuals with end-to-end abilities.

Business Impact: Businesses that are open to leveraging new data sources and analytic techniques can achieve considerable competitive leaps in operational or strategic performance over those of traditional query and reporting environments. Advances in data science have yielded significant innovations in sales and marketing, operational and financial performance, compliance and risk management and new product and service innovation, and have even spawned capabilities for directly or indirectly productizing data itself.

Benefit: High
Maturity: Emerging
Sample Vendors:

Cloud Computing

Definition: Cloud computing is a style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service using Internet technologies.

Justification: Cloud computing is still a visible and hyped term, but, at this point, it has clearly passed the Peak of Inflated Expectations. There are many signs of fatigue, rampant cloudwashing and disillusionment (for example, highly visible failures). Although cloud computing is approaching the Trough of Disillusionment, it remains a major force in IT. (Mainstream adoption 2-5 years)

Key Applications:

User Advice: As service provisioning (a critical aspect of cloud computing) grows, vendors must become providers, or partners with service providers, to deliver technologies indirectly to users. User organizations will watch portfolios of owned technologies decline as service portfolios grow. The key activity will be to determine which cloud services will be viable, and when.

Business Impact: Potential benefits of cloud include cost savings and capabilities (including concepts that go by names like agility, time to market and innovation). Organizations should formulate cloud strategies that align business needs with those potential benefits.

Benefit: Transformational
Maturity: Early mainstream
Sample Vendors: Amazon; Google; Microsoft; salesforce.com; VMware

Complex Event Processing

Definition: A kind of computing in which incoming data about events is distilled into more useful, higher-level and more complex event data that provides insights into what is happening.

Justification: CEP has progressed slightly on the Hype Cycle, putting it just past the Peak of Inflated Expectations. However, companies are adopting CEP at a relatively slow rate because its architecture is so different from conventional system designs. It may take up to 10 years for CEP to reach the Plateau of Productivity and be in use in the majority of applications for which it is appropriate. (5 - 10 years for mainstream adoption)

Key Applications: Any process or application that needs to capture and evaluate “data in motion”, such as near-real-time precision marketing (cross-sell and upsell), fraud detection, factory floor and website monitoring, customer contact center management, trading systems for capital markets and transportation operation management (for airlines, trains, shipping and trucking).

User Advice: A very well-structured information architecture and a processing platform need to be created.

Business Impact: Improves the quality of decision making, enables faster response to threats and opportunities, reduces cost.

Benefit: Transformational
Maturity: Adolescent
Sample Vendors: Apache; IBM; Informatica; LG CNS; Microsoft; Oracle; Red Hat; SAP; SAS (DataFlux)
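As a rough illustration of the Complex Event Processing definition above (low-level events distilled into a higher-level event), here is a minimal Python sketch tied to the fraud detection use case. It is not how any of the listed vendors implement CEP. The `Event` and `ComplexEvent` types, the `DeclineBurstDetector` class, the threshold of 3 declines and the 60-second window are all hypothetical choices made for this example, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """A low-level event (hypothetical shape): one card transaction outcome."""
    timestamp: float  # seconds since some reference point
    card_id: str
    kind: str         # "declined" or "approved"


@dataclass(frozen=True)
class ComplexEvent:
    """A higher-level event derived from several low-level events."""
    timestamp: float
    card_id: str
    description: str


class DeclineBurstDetector:
    """Emits one complex event when a card accumulates `threshold`
    declines within `window` seconds (both values are assumptions)."""

    def __init__(self, threshold: int = 3, window: float = 60.0) -> None:
        self.threshold = threshold
        self.window = window
        self._recent = defaultdict(deque)  # card_id -> recent decline times

    def process(self, event: Event) -> Optional[ComplexEvent]:
        if event.kind != "declined":
            return None
        recent = self._recent[event.card_id]
        recent.append(event.timestamp)
        # Drop declines that have fallen out of the sliding window.
        while recent and event.timestamp - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            recent.clear()  # reset so one burst yields one alert
            return ComplexEvent(
                event.timestamp,
                event.card_id,
                f"{self.threshold} declines within {self.window:.0f}s",
            )
        return None


def detect(events: Iterable[Event],
           detector: Optional[DeclineBurstDetector] = None) -> List[ComplexEvent]:
    """Runs a stream of events through the detector and collects alerts."""
    if detector is None:
        detector = DeclineBurstDetector()
    alerts = []
    for event in events:
        alert = detector.process(event)
        if alert is not None:
            alerts.append(alert)
    return alerts
```

The detector keeps only a small per-card window in memory and evaluates each event as it arrives, which is the sense in which CEP works on “data in motion” rather than on data already stored for later querying.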

Context Enriched Services

Definition: Context-enriched services are those that combine situational and environmental information with other information to proactively offer enriched, situation-aware and usable content, functions and experiences.

Justification: The majority of current implementations are consumer facing, in mobile computing, social computing, identity controls, search and e-commerce, areas in which context is emerging as an element of competitive differentiation. Enterprise-facing implementations, which use context information to improve productivity and decision making by associates and business partners, have slowly begun to emerge. The focus on big data has created a favorable environment for the development of context-enriched services. (Mainstream adoption 2 - 5 years)

Key Applications:

• Walmart: Its Polaris search engine utilizes social media and semantic search of clickstream data to provide online customers with more-targeted offers (leading to a 10% reduction in shopping cart abandonment).

• VinTank: Analyzes over 1 million wine-related conversations each day to predict which customers will be interested in specific wines at specific price points, combines that with location information, and alerts wineries when a customer who is likely to be interested in their wines is nearby.

• Orbitz: Has utilized behavioral information from user history and search to develop predictive patterns that would increase hotel bookings by presenting users with hotels that more closely match their preferences. This project resulted in an addition of 50,000 hotel bookings per day, a 2.6% increase.

User Advice: IT leaders in charge of information strategy and big data projects should leverage contextual elements sourced both internally and externally for their customer-facing projects. In addition, investigate how you can leverage contextual services from providers such as Google and Facebook to augment your existing information.

Business Impact: New kinds of business applications, especially those driven by consumer opportunities, will emerge, because the function of full context awareness may end up being revolutionary and disruptive to established practices.

Benefit: Transformational
Maturity: Adolescent
Sample Vendors: Apple; Facebook; Google; Microsoft; Sense Networks
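The VinTank example above combines a predicted interest with location context. A minimal Python sketch of that combination might look like the following. It is purely illustrative and says nothing about VinTank's actual system: the `Customer` type, the 0-to-1 `interest` score, the 5 km radius, the 0.7 interest cutoff and the coordinates in the usage note are all assumptions made for the example. Distance is computed with the standard haversine formula.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Customer:
    """Hypothetical customer record: current location plus a predicted
    interest score in [0, 1] produced by some upstream model."""
    name: str
    lat: float
    lon: float
    interest: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearby_prospects(customers: Iterable[Customer],
                     venue_lat: float, venue_lon: float,
                     radius_km: float = 5.0,
                     min_interest: float = 0.7) -> List[str]:
    """Names of customers who are both likely to be interested and
    currently within `radius_km` of the venue (thresholds are assumed)."""
    return [
        c.name for c in customers
        if c.interest >= min_interest
        and haversine_km(c.lat, c.lon, venue_lat, venue_lon) <= radius_km
    ]
```

For instance, calling `nearby_prospects(customers, 38.2975, -122.2869)` with a list of `Customer` records would return only those who pass both the interest filter and the distance filter, which is the point of a context-enriched service: neither signal alone triggers the alert.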

Internet of Things

Definition: Network of physical objects that contain embedded technology to communicate and sense or interact with their internal states or the external environment.

Justification: On the technology side, there continues to be slow progress toward standardization. Internet of Things wireless protocols continue to vie for dominance, but no clear leader stands out universally. There are some exceptions. Bluetooth LE is getting strong adoption as the wireless protocol to connect things to smartphones, tablets and computers. (10+ years for mainstream adoption)

Key Applications: Connected assets, building and facilities management

User Advice: Increase your knowledge and capabilities with big data. The Internet of Things will produce two challenges with information: volume and velocity.

Business Impact: Improvement of enterprise processes (Manage, Charge, Operate, Extend)

Benefit: Transformational
Maturity: Emerging
Sample Vendors: Atos; Axeda; Bosch; Cisco; Eurotech; GE; Honeywell; IBM; Microsoft; QNX; Schneider Electric; Siemens

Predictive Analytics

Definition: Data mining with four attributes: an emphasis on prediction (rather than description, classification or clustering); rapid time-to-insight (measured in hours or days); an emphasis on the business relevance of the resulting insights; and an increasing emphasis on ease of use.

Justification: The algorithms underpinning predictive analytic applications are reasonably mature. Model management capabilities, with more enhancements to aid ease of use, are required before the technology is fully mature. (Less than 2 years for mainstream adoption)

Key Applications: Understanding the future behavior of customers and the future state of customers; predicting the likely performance of equipment

User Advice: Focus on mechanisms to fine-tune the model performance that a traditional data mining workbench might deliver

Business Impact: Better allocation of investments and maximization of returns

Benefit: High
Maturity: Early mainstream
Sample Vendors: Angoss; FICO; IBM (SPSS); KXEN; SAS; StatSoft
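To make the "emphasis on prediction" in the Predictive Analytics definition concrete, the sketch below fits a straight line to past observations by ordinary least squares and uses it to estimate a future value, for example a hypothetical equipment wear reading against hours of operation. This is a deliberately simple stand-in chosen for illustration, not the algorithms used by the vendors listed above; the function names `fit_line` and `predict` and any data passed to them are assumptions.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model: Tuple[float, float], x: float) -> float:
    """Applies a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept
```

A call such as `predict(fit_line(hours, wear), 5000)` would estimate the reading at 5,000 hours from historical pairs. The fine-tuning mentioned in the User Advice above would correspond, in this toy setting, to checking the prediction error on held-out observations before trusting the model.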