Building Effective Information Governance with Data ... - Netwrix
Table of Contents
Executive Summary
1. Introduction
1.1 Why automation is critical
2. Data Classification and Risk Management
2.1 Making more accurate risk assessments
2.2 Reducing risk by minimizing access permissions
2.3 Mitigating risk by automatically quarantining or redacting sensitive data
3. Using Data Classification to Protect Data
3.1 Enhancing other data security products (DLP, IRM, etc.)
4. How Data Classification Aids Compliance Activities
4.1 Monitoring and reporting on user permissions and activity involving sensitive data
5. Using Data Classification to Support Other Functions
5.1 E-discovery and litigation support
5.2 Eliminating duplicate, unnecessary and obsolete data
6. Choosing a Data Classification Solution
6.1 Advanced data discovery techniques
6.2 Agent-based vs. agentless architecture
6.3 Supported platforms and data types
7. Building Effective Information Governance with Netwrix Data Classification
7.1 Discover relevant data across multiple repositories faster
7.2 Prioritize your data security efforts
7.3 Enforce information governance policies
7.4 Revoke excessive permissions
7.5 Stay compliant with data retention mandates
7.6 Reduce the total cost of storage by cleaning up low-value information
7.7 Improve the efficiency of data management technologies
About the Author
About Netwrix
Executive Summary
Information governance is essential for every organization today. Whether you are concerned about the security of your critical business data or need to comply with increasingly complex regulations, a solid data discovery and classification (DDC) solution is a wise investment. In fact, as the volume and variety of data have exploded, DDC has become a business necessity.
DDC solutions scan data across your IT infrastructure and tag it according to its value and sensitivity. That way, IT teams and data managers can ensure it is handled and protected appropriately, for example, by putting proper data access controls in place, keeping permissions up to date, and implementing the right backup and restore capabilities.
This eBook explores the capabilities and benefits of DDC solutions, as well as what to look for when selecting a DDC solution. You’ll discover how a DDC solution can help you save time and money while minimizing the risk of security incidents and compliance failures.
1. Introduction
With the amount and variety of data exploding around the world, organizations are facing unprecedented data sprawl: data is stored in every nook and cranny of the enterprise IT infrastructure, and no one knows which files contain personally identifiable information (PII), information about particular projects, intellectual property (IP) or other valuable or regulated content. As a result, organizations struggle to protect information adequately, comply with legal mandates, weed out duplicate and redundant data, and empower employees to find the content they need to do their jobs.
The value of data classification
• Protect sensitive data against misuse or loss as required by compliance regulations
• Guard proprietary data to preserve its business value
• Save time and money by tailoring data protection methods to each type of data
• Retain data as long as required by compliance or business requirements and dispose of it responsibly
• Reduce the risk of fines and other expenses associated with security incidents and compliance failures
• Improve the efficacy of other data management tools, which can use the tags that the DDC solution embeds in content
1.1 Why automation is critical
Data discovery and classification overcomes all of these challenges. But today, both the discovery and classification processes simply must be automated.
First, the discovery process has to be able to find data no matter where it’s stored: in databases, on SANs or NAS in company data centers and wiring closets, in the cloud, on server-attached storage, on standalone devices, and more. Note that we are talking about both structured data and unstructured information like documents, PowerPoint presentations, pictures and emails, and about both physical and virtualized platforms. Manual processes are simply not a scalable approach to data discovery in today’s complex IT ecosystems.
Similarly, classifying data manually is simply too labor-intensive, time-consuming and error-prone to be a practical approach for all but the smallest of companies. In particular, manual data classification suffers from the following issues:
Inconsistency. Different people will classify similar documents in different ways.
Inaccuracy. Employees with their own jobs to attend to often fail to classify data at all or simply pick the first tag in the list to expedite the process.
Inflexibility. As company requirements and regulations change, no one has the time or inclination to update the tags on gigabytes or terabytes of existing data.
Failure. As users realize that data is not classified correctly, they will quit trusting the process and the whole project will fail.
Automating data discovery and classification overcomes these limitations by making the process reliable, accurate and continuous. The discovery process finds data across the IT environment, and the classification process checks it for the various types of data it’s set to classify. For example, it can spot PII by looking for data patterns that indicate names, dates of birth, addresses, phone numbers, financial information, health information, and Social Security numbers. And it can re-classify the data later as needed due to organizational or regulatory changes.
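To make the idea of spotting PII by data patterns more concrete, here is a minimal sketch in Python. It is not how any particular DDC product works: the tag names, the `PII_PATTERNS` table and the `classify_text` function are hypothetical, and the regular expressions are deliberately simplified (U.S.-style formats only, no validation).

```python
import re

# Hypothetical, simplified patterns for illustration only.
# Real classifiers combine many more rules, validation and context.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # e.g. 123-45-6789
    "us_phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),  # e.g. 555-867-5309
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),  # e.g. 4/12/1980
}


def classify_text(text):
    """Return the sorted list of tags whose pattern occurs in the text."""
    return sorted(tag for tag, pattern in PII_PATTERNS.items() if pattern.search(text))
```

Because the rules live in one table, re-classifying existing content after a policy change amounts to editing the table and running the same function again.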
2. Data Classification and Risk Management
2.1 Making more accurate risk assessments
Accurate risk management is critical for organizations today. Overestimate your risk and you might fail to take advantage of opportunities that could increase your market share or profitability. Underestimate your risk and you can face schedule or budget overruns, legal complications, and compliance violations.
DDC solutions can help you make more accurate risk analyses, which in turn lead to more effective risk mitigation and contingency planning. However, two capabilities are crucial: flexibility and accuracy.
Flexibility. Most DDC solutions come with a number of predefined criteria for discovering and classifying data. But they also need to empower users to easily customize the rules and create new categories, so all their data will be properly classified according to their unique requirements.
Accuracy. It’s difficult to properly assess and mitigate risks when data is misclassified. The data discovery process needs to be thorough and the classification process needs to be highly accurate.
2.2 Reducing risk by minimizing access permissions
The single most important best practice for reducing risk is to implement the principle of least privilege. By granting each user and system account the minimum permissions necessary for it to operate successfully, you reduce the ability of each account to cause damage, whether it is being used by the legitimate owner or has been taken over by an attacker or malware.
DDC solutions help you know which data merits strong restrictions and oversight, and which data can be made freely available to employees, customers, partners and other groups. This information is critical for assigning privileges properly and conducting effective privilege attestations.
2.3 Mitigating risk by automatically quarantining or redacting sensitive data
Some data classification solutions offer additional risk mitigation features. Some can automatically move sensitive data that is discovered in improper locations to a secure quarantine area where it is protected while a data manager determines where it should reside. Some tools can also automatically redact sensitive information from documents that need to be disseminated for compliance or business reasons.
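The automatic redaction mentioned in section 2.3 can be illustrated with a short Python sketch. This is an assumption-laden example rather than a description of any vendor's feature: the `redact_ssns` function, its `keep_last4` option and the simplified Social Security number pattern are all hypothetical.

```python
import re

# Simplified U.S. Social Security number pattern (illustrative assumption).
# The last four digits are captured so they can optionally be kept.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def redact_ssns(text, keep_last4=False):
    """Mask every SSN-like value in the text before the document is shared."""

    def _mask(match):
        tail = match.group(1) if keep_last4 else "XXXX"
        return "XXX-XX-" + tail

    return SSN_PATTERN.sub(_mask, text)
```

Keeping the last four digits is a common compromise when recipients still need to tell records apart, which is why the sketch exposes it as an option.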
3. Using Data Classification to Protect Data
Data classification can also help organizations ensure the availability, integrity and confidentiality of their data. A DDC solution can help you prevent accidental and deliberate data modification or loss by enabling you to evaluate your data management strategies and strengthen your security posture. For example, you can protect sensitive data from being sent outside the organization or copied to a thumb drive, and ensure you have appropriate backup and recovery processes in place based on the value and sensitivity of the data.
The best DDC solutions also offer integration with monitoring and alerting tools that can notify you about suspicious activity that could put sensitive data at risk, as well as streamline regular review of user permissions and attempts to access certain types of data.
3.1 Enhancing other data security products (DLP, IRM, etc.)
The tags that DDC solutions assign to data can improve the effectiveness of other data management tools, such as data-loss prevention (DLP), information rights management (IRM), systems management, and backup and recovery solutions. For example, without data classification, companies must back up everything to make sure they don’t miss anything of value. But data classification empowers them to tune their backup criteria so that mission-critical data gets the highest level of redundancy and lower priority data gets a lower tier of protection.
Indeed, any software tool that needs to make decisions about how to treat files based on their content can benefit from the automatic tagging of the data classification process.
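As a rough illustration of a downstream tool acting on classification tags, the Python sketch below picks a backup tier from a file's tags. The tag names, tier settings and the `backup_policy` function are hypothetical and do not correspond to any real backup product's configuration.

```python
# Hypothetical tiers keyed by classification tag (illustrative values only).
BACKUP_TIERS = {
    "mission-critical": {"copies": 3, "frequency_hours": 1},
    "internal": {"copies": 2, "frequency_hours": 24},
    "low-value": {"copies": 1, "frequency_hours": 168},
}

# Tags are checked from most to least protective, so the strictest match wins.
TIER_PRIORITY = ["mission-critical", "internal", "low-value"]


def backup_policy(tags):
    """Return the backup settings for a file given its classification tags."""
    for tier in TIER_PRIORITY:
        if tier in tags:
            return BACKUP_TIERS[tier]
    # Unclassified data falls back to the strictest tier, which mirrors the
    # "back up everything" behavior of an environment without classification.
    return BACKUP_TIERS["mission-critical"]
```

The fallback is the design choice worth noting: treating untagged data as high value is the safe default, and the savings come only once data has actually been classified.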
4. How Data Classification Aids Compliance Activities
Compliance regulations are increasing in number and scope, and packing greater penalties for failures. Some, like Sarbanes-Oxley (SOX), even include civil and criminal penalties for corporate leaders who fail to ensure the accuracy of certain activities and filings.
DDC solutions help you ensure and prove compliance by automatically identifying data that is subject to specific compliance regulations so you can put proper controls in place. You can also make more accurate decisions about data retention because you know which regulations and policies apply to which data. For instance, the U.S. Internal Revenue Service has very specific guidelines on which types of corporate data must be retained and for what length of time.
As noted earlier, some solutions can also quarantine, redact and move data. These features facilitate response to requests from data subjects under GDPR and other data privacy mandates. For example, instead of facing the nightmare of having to manually discover and delete all sensitive data related to a particular individual, with DDC, you can automate this process to ensure that your company is always in compliance. Similarly, some solutions can automate the redaction of sensitive data in documents that must be exposed to users who do not have the rights to view it. Some DDC solutions even offer workflows that can automatically revoke access rights for users who should not be able to access sensitive data.
4.1 Monitoring and reporting on user permissions and activity involving sensitive data
Discovering and classifying your data is a critical first step, but many compliance standards also require you to maintain tight control over activity around sensitive data. Therefore, it’s essential to choose a DDC solution that integrates easily with a monitoring and alerting tool.
5. Using Data Classification to Support Other Functions
Finally, data discovery and classification can also help you automate e-discovery and storage optimization.
5.1 E-discovery and litigation support
Lawsuits and other legal actions are very common for companies of all sizes these days. They often come with inflexible deadlines and long lists of criteria for content that could be evidence and therefore must quickly be retrieved or put on hold. With data classifications, metadata tags, time/date stamps and other pertinent information from your DDC solution, performing e-discovery as part of legal requests is much more manageable.
In fact, considering the high stakes of modern legal warfare, data classification solutions offer a significant competitive advantage to corporations with a high volume of legal inquiries, e-discovery and litigation support. Indeed, these use cases alone can easily justify the expenditure for a DDC solution.
5.2 Eliminating duplicate, unnecessary and obsolete data
Data classification can also help companies find and eliminate data that doesn’t need to be kept, such as:
• Duplicate documents
• Files that serve no business purpose, such as routine corporate correspondence for closed and abandoned projects and other outdated files
• Data that has no business, legal or regulatory value for the company
Eliminating this data reduces risk (since it cannot be lost or stolen), reduces data management and storage costs, and improves user productivity by reducing clutter.
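Finding exact duplicate documents, the first item in the list above, is commonly done by comparing content hashes. The Python sketch below shows that general technique under simple assumptions: it reads each file fully into memory and only catches byte-for-byte copies. The function name `find_exact_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Files with the same SHA-256 digest are treated as duplicates.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

A hash comparison will not catch a draft that differs from the final version by a few words; that requires a similarity measure rather than an equality check.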
6. Choosing a Data Classification Solution
Because data discovery and classification has such profound effects on an organization’s security, compliance and operations, it’s essential to evaluate candidate solutions carefully. Here are some specific features to look for.
6.1 Advanced data discovery techniques
Compound terms
Solutions that can use compound terms to search for and classify sensitive data deliver more accurate results. Look for a tool that can handle all three variations:
• Hyphenated compound words: Two or more words separated by hyphens, such as father-in-law or agent-based
• Open compound words: Words that are commonly written together but are separated by a space instead of a hyphen, like truck stop and luxury hotel
• Closed compound words: Words that are typically written by combining two words, such as rooftop and beachfront
Keyword stemming
Keyword stemming looks at the root of a word and automatically includes all its variations in the search process. For instance, the word discovery yields the stem discover, and the solution will also find instances of discovering, discovered, etc.
Though online search engines commonly use this technique to find the greatest number of matching documents and sites, do not take it for granted that every data classification solution offers keyword stemming out of the box. Some tools require data managers to “teach” the search engine about the variations of a word or even require them to manually define all possible variants.
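The two techniques in this section, compound terms and keyword stemming, can be sketched in a few lines of Python. This is a toy under loud assumptions: the suffix list in `stem` is a naive stand-in for a real stemming algorithm, and `compound_pattern` expects the caller to supply the parts of the compound. All names here are hypothetical.

```python
import re

# Naive suffix stripping, checked in this order (illustration only).
SUFFIXES = ("ing", "ed", "es", "y", "s")


def stem(word):
    """Strip one common suffix, keeping at least four characters of the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def contains_stem(keyword, text):
    """True if any word in the text shares its stem with the keyword."""
    target = stem(keyword)
    return any(stem(token) == target for token in re.findall(r"[A-Za-z]+", text))


def compound_pattern(parts):
    """Build a regex matching the hyphenated, open and closed forms of a compound."""
    joined = r"[-\s]?".join(re.escape(part) for part in parts)
    return re.compile(r"\b" + joined + r"\b", re.IGNORECASE)
```

With this sketch, `contains_stem("discovery", ...)` matches text containing discovering or discovered, and `compound_pattern(["truck", "stop"])` matches truck stop, truck-stop and truckstop alike.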
6.2 Agent-based vs. agentless architecture
Like many types of management software, DDC solutions come in both agent-based and agentless forms, as illustrated in Figure 2. Which is better depends on your needs and priorities.
FIGURE 2: Agent-based vs. agentless architecture
[Figure: two panels. Agentless architecture: DDC server, servers, storage, network devices, cloud resources. Agent-based architecture: DDC server, data collector/agent proxy, servers with local agent, storage with local agent, cloud resources with local agent, network devices without local agent.]
Advantages of an agentless architecture include:
• Faster, easier deployment because no software has to be installed on end nodes
• Less overhead for IT staff because they do not have to maintain agents
• No risk of agents dragging down system performance
• Better security, since there are no agents to be hacked
However, the pro-agent crowd argues that agentless solutions fail to deliver the features, scalability and robustness that agents make possible. An agent-based approach can also reduce network traffic. For devices where an agent cannot run, a proxy agent process might be able to provide data classification functionality.
6.3 Supported platforms and data types
Last but not least, you need a DDC solution that supports as much of your data as possible. Be sure to consider whether you use Oracle, SQL Server, file storage, cloud storage and so on. Be sure to include both your structured and unstructured data. Not all solutions support structured data, and unstructured data is notoriously difficult to discover and parse properly. Therefore, pay careful attention to not just the platforms listed as “supported” but the quality of the results that each solution delivers.
7. Building Effective Information Governance with Netwrix Data Classification
7.1 Discover relevant data across multiple repositories faster
During e-discovery and legal proceedings, you need to be able to collect all relevant files across your on-premises and cloud-based storage, such as Windows file servers, SharePoint and SQL Server. With Netwrix Data Classification, you can quickly find everything you need from a single platform.
[Screenshot: search results in Netwrix Data Classification for project 15245 (codename Barrel), with a filter by URL, listing matching documents such as a financial dashboard, a road map guide and a release policy together with their relevance scores.]
7.2 Prioritize your data security efforts
To prioritize your security and governance efforts, you need to know where various types of information are located. Netwrix Data Classification empowers you to accurately identify the data that matters most to your business so you can manage and protect it properly.
[Screenshot: the “Content Distribution” report in Netwrix Data Classification, which shows the distribution of content grouped by source, by taxonomy or by item. The example is grouped by source and shows GDPR, PII and IP content across file shares, SharePoint and a SQL Server database.]
7.3 Enforce information governance policies
It is important for organizations to not only formulate strong information governance policies, but also make sure they are being obeyed company-wide. Since relying solely on each employee’s judgement is risky, it is essential to automate policy enforcement. With Netwrix Data Classification, you can automate many critical information governance processes, such as spotting sensitive files that surface in unsecured locations, moving them to a secure quarantine area and alerting the appropriate staff about the event.
[Screenshot: a “Quarantine Workflow” being configured in Netwrix Data Classification. Source type: SharePoint; source: http://sp.enterprise.com/sites/Sales; run against: documents classified as PII (All Terms). Action: Migrate document to File System; destination: \\fs\internal\quarantine\customer data; maintain folder structure: No; move/copy: Move; if file already exists: Append Migration Date; redact document: No.]
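The quarantine process described in section 7.3 (spot sensitive files, move them to a secure area, report the event) can be sketched generically in Python. This is not the Netwrix workflow engine or its API: the `quarantine` function, its parameters and the default `"PII"` tag are hypothetical, and the sketch does not handle name collisions in the quarantine folder.

```python
import shutil
from pathlib import Path


def quarantine(files_with_tags, quarantine_dir, sensitive_tags=frozenset({"PII"})):
    """Move files carrying a sensitive tag into quarantine_dir.

    files_with_tags maps a file path to its classification tags.
    Returns (source, destination) pairs that an alerting step could report.
    """
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path, tags in files_with_tags.items():
        if sensitive_tags & set(tags):
            destination = quarantine_dir / Path(path).name
            shutil.move(str(path), str(destination))
            moved.append((Path(path), destination))
    return moved
```

Returning the list of moves, rather than printing or emailing inside the function, keeps the alerting decision separate from the file handling.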
7.4 Revoke excessive permissions
Many organizations don’t know how much of their sensitive or business-critical data is accessible by large groups of users and don’t have a quick way to find out. This gap in data access governance often leads to data leaks and compliance problems. With Netwrix Data Classification, you can easily create a workflow that will automatically remove permissions to sensitive data, including inherited permissions, from groups like Everyone.
[Screenshot: a workflow for \\fs1\Accounting with the rule condition “GDPR > UK passport” (classified) and an “Update Permissions” action. The action editor shows the fields Remove Access From: Everyone; Grant Access To: J.Smith; Grant Access Permission Level: Full Control; and a Remove Inherited Permissions option.]
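The permission clean-up described in section 7.4 follows a simple rule: if content is classified as sensitive, broad groups lose access and a named owner keeps control. The Python sketch below models that rule on a plain dictionary instead of a real file system ACL. The `revoke_broad_access` function, the `BROAD_GROUPS` set and the tag names are hypothetical and are not product parameters.

```python
# Groups considered too broad for sensitive data (an assumption for this sketch).
BROAD_GROUPS = {"Everyone", "Authenticated Users"}


def revoke_broad_access(acl, tags, sensitive_tags, owner=None, owner_level="Full Control"):
    """Return a new ACL (principal -> permission level) for one file or folder.

    If the item carries any sensitive tag, entries for broad groups are
    dropped and the optional owner is granted owner_level.
    """
    if not (set(tags) & set(sensitive_tags)):
        return dict(acl)  # not sensitive: leave permissions unchanged
    updated = {who: level for who, level in acl.items() if who not in BROAD_GROUPS}
    if owner is not None:
        updated[owner] = owner_level
    return updated
```

The function returns a new mapping and leaves its input untouched, so a caller can compare the before and after states for an audit trail.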
7.5 Stay compliant with data retention mandates
Meet data retention requirements and improve your records management by automatically finding specific types of records, such as tax returns, across your IT environment and enforcing the required retention policies around them.
[Screenshot: a “Retention Workflow” being configured in Netwrix Data Classification. Source type: File; source: \\fs1\Finance\Tax Records; run against: documents classified as PII (All Terms). Action: Migrate document to File System; destination: \\fs2\Archive\Tax Records; maintain folder structure: No; move/copy: Move; if file already exists: Append Migration Date; redact document: No.]
7.6 Reduce the total cost of storage by cleaning up low-value information
How much money and effort are you wasting on storing and maintaining duplicate, obsolete and trivial data? Netwrix Data Classification will automatically find and get rid of all low-quality and low-value files, such as duplicate or old versions of documents, so users won’t have to slog through piles of clutter and you won’t have to constantly purchase more storage.
[Screenshot: the “Near Duplicate Detection” report, which details near-duplicate documents across the index. Near duplicates are detected as a background process; to enable it, turn on the “Near Duplicate Detection” option in the Indexer Settings and rebuild the desired sources. Filters shown: Match Precision (%): 95; Minimum Text Length: 100; Maximum Text Length Difference: 20. Report columns: PageUrl, PageId, Duplicate PageUrl, Duplicate PageId, Relevancy, Text Length Difference. The example lists versions of “Release 5.4.docx” and “Release Notes.docx” stored in several folders under \\fs1\shared.]
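Near-duplicate detection of the kind covered in section 7.6 can be approximated with Python's standard library. The sketch below is only an illustration of the general idea: the three parameters borrow the names of the report filters (match precision, minimum text length, maximum text length difference), but treating them as percentages and character counts is an assumption made here, and `difflib.SequenceMatcher` is a swapped-in similarity measure, not the product's algorithm.

```python
from difflib import SequenceMatcher


def is_near_duplicate(text_a, text_b, match_precision=95.0,
                      min_text_length=100, max_length_difference=20.0):
    """Decide whether two texts are near duplicates of each other.

    match_precision and max_length_difference are percentages;
    min_text_length is a character count (all assumed interpretations).
    """
    if min(len(text_a), len(text_b)) < min_text_length:
        return False  # too short to compare meaningfully
    longer = max(len(text_a), len(text_b))
    length_difference = 100.0 * abs(len(text_a) - len(text_b)) / longer
    if length_difference > max_length_difference:
        return False
    # autojunk=False disables difflib's heuristic that ignores frequent
    # characters in long sequences, which would distort the score for prose.
    matcher = SequenceMatcher(None, text_a, text_b, autojunk=False)
    return 100.0 * matcher.ratio() >= match_precision
```

Character-level comparison is quadratic in the worst case, so a sketch like this suits small document sets; large indexes typically rely on fingerprinting techniques instead.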
7.7 Improve the efficiency of data management technologies
Forcing your expensive data management and protection solutions to process all data, regardless of its sensitivity or business value, bogs them down and drives up costs. Using the highly accurate classification tags written by Netwrix Data Classification, you can increase the effectiveness of your endpoint security, data loss prevention and data management solutions.
[Screenshot: the “Recent Tagging” report, charting documents tagged over the past week by taxonomy term (AMEX, Diners Club, Discover, JCB, Mastercard, UnionPay, VISA). The graph requires the “Auto-Classification Change Log” feature to be enabled (Config -> Classifier).]
About the Author
Earl Follis
Earl is a 30-year veteran of the computer industry. His experience in IT training, marketing, technical evangelism and market analysis covers many areas, including networking, systems management, disaster recovery and business continuity, and application performance monitoring. Along the way, he has authored many eBooks, white papers and articles.
About Netwrix
Netwrix is a software company that enables information security and governance professionals to reclaim control over sensitive, regulated and business-critical data, regardless of where it resides. Over 10,000 organizations worldwide rely on Netwrix solutions to secure sensitive data, realize the full business value of enterprise content, pass compliance audits with less effort and expense, and increase the productivity of IT teams and knowledge workers.
Founded in 2006, Netwrix has earned more than 150 industry awards and been named to both the Inc. 5000 and Deloitte Technology Fast 500 lists of the fastest growing companies in the U.S.
For more information, visit www.netwrix.com.
Corporate headquarters: 300 Spectrum Center Drive, Suite 200, Irvine, CA 92618
Phones: 1-949-407-5125; toll-free (USA): 888-638-9749
Other locations:
565 Metro Place S, Suite 400, Dublin, OH 43017; phone: 1-201-490-8840
5 New Street Square, London EC4A 3TW; phone: +44 (0) 203 588 3023
France: +33 9 75 18 11 19
Spain: +34 911 982608
Germany: +49 711 899 89 187
Netherlands: +31 858 887 804
Hong Kong: +852 5808 1306
Sweden: +46 8 525 03487
Italy: +39 02 947 53539
Switzerland: +41 43 508 3472
Social: netwrix.com/social