International Metadata Standards and Enterprise Data Quality Metadata Systems

Ted Habermann
Director of Earth Science, The HDF Group
[email protected]

This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C



The Big Picture

ISO 19157 is a conceptual model of data quality metadata that was recently approved as an international standard. It combines three older standards into a unified model for describing data quality.

Many of the principal elements of this conceptual model are abstract and can be implemented in several ways.


When only the abstract concepts are considered, the model is very simple.

[Diagram: Data Quality Element and Standalone Report]


Enterprise Systems?


Data Quality Scope

“The quality of my data varies in time and space, and different parameters have different quality measures and results.”

ISO quality reports all include descriptions of the temporal and spatial extents and the elements of the dataset that they pertain to. You can say things like:

Between 2001 and 2002 the quality of the data in the northern hemisphere…
The data collected by this sensor degraded during June 2011 because…
Quality information for this parameter is in this variable…

<<DataType>> DQ_Scope
+level: MD_ScopeCode
+extent [0..1]: EX_Extent
+levelDescription [0..*]: MD_ScopeDescription
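The scope model above could be sketched in Python roughly as follows. This is a minimal illustration, not the ISO schema: the field names mirror the UML, but EX_Extent is reduced to an optional date range plus a free-text place label, and the `applies_on` helper is a hypothetical convenience that does not appear in the standard.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Extent:
    """Simplified stand-in for EX_Extent (assumption: dates plus a place label)."""
    begin: Optional[date] = None
    end: Optional[date] = None
    description: str = ""


@dataclass
class Scope:
    """Simplified stand-in for DQ_Scope."""
    level: str                                   # an MD_ScopeCode value, e.g. "dataset"
    extent: Optional[Extent] = None              # multiplicity [0..1]
    level_description: List[str] = field(default_factory=list)  # multiplicity [0..*]

    def applies_on(self, day: date) -> bool:
        """Hypothetical helper: True if the scope is unbounded in time or covers `day`."""
        if self.extent is None:
            return True
        after_begin = self.extent.begin is None or self.extent.begin <= day
        before_end = self.extent.end is None or day <= self.extent.end
        return after_begin and before_end


# "Between 2001 and 2002 the quality of the data in the northern hemisphere ..."
scope = Scope(
    level="dataset",
    extent=Extent(date(2001, 1, 1), date(2002, 12, 31), "northern hemisphere"),
)
print(scope.applies_on(date(2001, 6, 15)))  # True
print(scope.applies_on(date(2005, 1, 1)))   # False
```

A quality report carrying such a scope can then be matched to the part of the dataset a user is actually working with.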


Stand Alone Quality Reports

“There are papers and web pages that describe the quality of my data.”

Papers and reports that describe data quality are Stand Alone Reports. Metadata can include brief descriptions of the results (abstracts) and references to any number of these (citations).

Abstract: The fire training-set may also have been biased against savanna and savanna woodland fires since their detection is more difficult than in humid, forest environments with cool background temperatures [Malingreau, 1990]. There may, therefore, be an under-sampling of fires in these warmer background environments.

Citation: Malingreau, J.P., 1990, The contribution of remote sensing to the global monitoring of fires in tropical and subtropical ecosystems. In: Fire in Tropical Biota (J.G. Goldammer, editor), Springer Verlag, Berlin: 337-370.

DQ_StandaloneQualityReportInformation
+abstract: CharacterString
+reportReference: CI_Citation

DOI
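As a rough sketch, the abstract and citation pair above could be held in a structure like the one below. The class names are simplified stand-ins for DQ_StandaloneQualityReportInformation and CI_Citation, the flat citation fields are an assumption, and the `doi` field is left empty because the slide gives no DOI value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    """Simplified stand-in for CI_Citation (assumption: a few flat text fields)."""
    title: str
    authors: str
    year: int
    doi: Optional[str] = None  # left empty here; the slide gives no DOI value


@dataclass
class StandaloneQualityReport:
    """Simplified stand-in for DQ_StandaloneQualityReportInformation."""
    abstract: str
    report_reference: Citation


report = StandaloneQualityReport(
    abstract="There may, therefore, be an under-sampling of fires in these "
             "warmer background environments.",
    report_reference=Citation(
        title="The contribution of remote sensing to the global monitoring of "
              "fires in tropical and subtropical ecosystems",
        authors="Malingreau, J.P.",
        year=1990,
    ),
)
print(report.report_reference.year)  # 1990
```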


Data Usage (19115-1)

“Users increase our understanding of data quality. We need to keep them in the loop.”

MD_Usage
+specificUsage: CharacterString
+usageDateTime [0..1]: DateTime
+userDeterminedLimitations [0..1]: CharacterString
+userContactInfo [1..*]: CI_ResponsibleParty
+response [0..*]: CharacterString
+additionalDocumentation [0..*]: CI_Citation
+identifiedIssues [0..1]: CI_Citation

DOI


What is a Data Quality Element?

DataQualityElement

Measure: QA_PercentMissingData

Method: Number of Pixels with Missing Flags / Total Number of Pixels

Result: 15%
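The measure, method and result triple above can be illustrated with a short calculation. This is a sketch under stated assumptions: missing pixels are represented by `None`, the `percent_missing` function name is hypothetical, and the sample granule is made up so that it reproduces the 15% on the slide.

```python
from typing import Optional, Sequence


def percent_missing(pixels: Sequence[Optional[float]]) -> float:
    """Method: pixels with a missing flag divided by total pixels, as a percentage.

    Assumption: a missing pixel is represented by None.
    """
    if not pixels:
        raise ValueError("cannot evaluate an empty granule")
    missing = sum(1 for value in pixels if value is None)
    return 100.0 * missing / len(pixels)


# 3 of 20 pixels are missing (made-up values chosen to reproduce the slide's 15%).
granule = [1.0] * 17 + [None] * 3
element = {
    "measure": "QA_PercentMissingData",
    "method": "pixels with missing flags / total pixels",
    "result": percent_missing(granule),
}
print(element["result"])  # 15.0
```

Keeping the three parts separate is what lets the same measure and method be reused while only the result changes from granule to granule.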


What Are Quality Measures?

“My metadata already include data quality measures.”

NASA EOSDIS metadata includes two types of quality measures.

QA_Stats

QA_Flags


What Are Quality Measures?

“I use consistent Quality Measures across many products.”

QAStats - Standard measures for all products

QAPercentMissingData - Granule level % missing data. This attribute can be repeated for individual parameters within a granule.

QAPercentOutOfBoundsData - Granule level % out of bounds data. This attribute can be repeated for individual parameters within a granule.

QAPercentInterpolatedData - Granule level % interpolated data. This attribute can be repeated for individual parameters within a granule.

QAPercentCloudCover - This attribute is used to characterize the cloud cover amount of a granule. This attribute may be repeated for individual parameters within a granule. (Note: there may be more than one way to define a cloud or its effects within a product containing several parameters; i.e. this attribute may be parameter specific.)

QA_Stats


What Are Quality Measures?

“I use consistent types of Quality Measure across many products.”

QAFlags - Classes of quality measures with product specific implementations

AutomaticQualityFlag - The granule level flag applying generally to the granule and specifically to parameters at the granule level. When applied to a parameter, the flag refers to the quality of that parameter for the granule (as applicable). The parameters determining whether the flag is set are defined by the developer and documented in the QualityFlagExplanation.

AutomaticQualityFlagExplanation - A text explanation of the criteria used to set the automatic quality flag, including thresholds or other criteria.

OperationalQualityFlag - The granule level flag applying both generally to a granule and specifically to parameters at the granule level. When applied to a parameter, the flag refers to the quality of that parameter for the granule (as applicable). The parameters determining whether the flag is set are defined by the developers and documented in the OperationalQualityFlagExplanation.

OperationalQualityFlagExplanation - A text explanation of the criteria used to set the operational quality flag, including thresholds or other criteria.

ScienceQualityFlag - Granule level flag applying to a granule, and specifically to parameters. When applied to a parameter, the flag refers to the quality of that parameter for the granule (as applicable). The parameters determining whether the flag is set are defined by the developers and documented in the ScienceQualityFlagExplanation.

ScienceQualityFlagExplanation - A text explanation of the criteria used to set the science quality flag, including thresholds or other criteria.

QA_Flags
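The pairing of a flag with its explanation can be sketched as below. Everything specific here is an assumption made for illustration: the "Passed"/"Failed" values, the 10% default threshold, and the `automatic_quality_flag` function are not taken from the slides, which only say that the criteria are defined by the developer and documented in the explanation attribute.

```python
from dataclasses import dataclass


@dataclass
class QualityFlag:
    """A flag value paired with the text explaining how it was set."""
    name: str
    value: str
    explanation: str


def automatic_quality_flag(percent_missing: float, threshold: float = 10.0) -> QualityFlag:
    """Set a granule-level flag from a developer-defined criterion.

    Assumptions: the "Passed"/"Failed" values and the 10% default threshold
    are illustrative only.
    """
    value = "Passed" if percent_missing <= threshold else "Failed"
    return QualityFlag(
        name="AutomaticQualityFlag",
        value=value,
        explanation=f"Passed when missing data is at most {threshold}% of the granule.",
    )


flag = automatic_quality_flag(15.0)
print(flag.value)        # Failed
print(flag.explanation)  # Passed when missing data is at most 10.0% of the granule.
```

The flag class is shared across products while the criterion inside the function is product specific, which is the distinction the slide draws between QAFlags and QAStats.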


Data Quality Measures

“My data quality measures are consistently described in a database.”

ISO 19157 includes a DQ_MeasureReference designed to provide a connection to a detailed description of the quality measure.

<<Abstract>> DQ_Element
+measure [0..1] (association to DQ_MeasureReference)

DQ_MeasureReference
+measureIdentification [0..1]: MD_Identifier
+nameOfMeasure [0..*]: CharacterString
+measureDescription [0..1]: CharacterString
Constraint: if measureIdentification is not provided, then nameOfMeasure shall be provided

DQM_Measure
+measureIdentifier: MD_Identifier
+name: CharacterString
+alias [0..*]: CharacterString
+sourceReference [0..*]: CI_Citation
+elementName [1..*]: TypeName
+definition: CharacterString
+description [0..1]: DQM_Description
+valueType: TypeName
+valueStructure [0..1]: DQM_ValueStructure
+example [0..*]: DQM_Description

DQM_BasicMeasure
+name: CharacterString
+definition: CharacterString
+example [0..1]: DQM_Description
+valueType: TypeName

DQM_Parameter
+name: CharacterString
+definition: CharacterString
+description [0..1]: DQM_Description
+valueType: TypeName
+valueStructure [0..1]: DQM_ValueStructure

DQM_Description
+textDescription: CharacterString
+extendedDescription [0..1]: MD_BrowseGraphic
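The constraint on DQ_MeasureReference (a name is required whenever no identifier is given) is easy to check in code. The sketch below is an assumption-laden simplification: MD_Identifier is reduced to a plain string, the class name is shortened, and the identifier value `measure-001` is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MeasureReference:
    """Simplified stand-in for DQ_MeasureReference (identifier reduced to a string)."""
    measure_identification: Optional[str] = None              # [0..1]
    name_of_measure: List[str] = field(default_factory=list)  # [0..*]
    measure_description: Optional[str] = None                 # [0..1]

    def __post_init__(self) -> None:
        # Constraint from the model: without an identifier, a name is required.
        if self.measure_identification is None and not self.name_of_measure:
            raise ValueError(
                "nameOfMeasure shall be provided if measureIdentification is not"
            )


by_name = MeasureReference(name_of_measure=["QAPercentMissingData"])
by_id = MeasureReference(measure_identification="measure-001")  # hypothetical identifier

try:
    MeasureReference()  # neither identifier nor name: violates the constraint
except ValueError as error:
    print(error)
```

Either form gives a metadata record a handle that can be resolved against the detailed measure description stored elsewhere.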


Data Quality Measures

“I need to clearly and consistently explain how I measure quality.”

The ISO model for quality measures includes identifiers, definitions, descriptions, references and illustrations.


Modular DQ Information

“My data quality information exists in databases or web services.”

Major elements of the 19157 conceptual model are separate components that can be independently connected to the metadata and reused in multiple records.

Results

Measures

Methods
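One way to picture this modularity is a shared store of measure descriptions that individual metadata records point to by name. The sketch below is hypothetical throughout: the `MEASURES` dictionary stands in for a database or web service, and the granule names and result values are made up.

```python
# Hypothetical measure store; in practice this could be a database or web service.
MEASURES = {
    "QAPercentMissingData": {
        "definition": "Granule level % missing data.",
        "valueType": "Real",
    },
}

# Two metadata records reuse the same measure by identifier and keep only
# their own results.
records = [
    {"granule": "granule-A", "measure": "QAPercentMissingData", "result": 15.0},
    {"granule": "granule-B", "measure": "QAPercentMissingData", "result": 2.5},
]


def resolve(record: dict) -> dict:
    """Return a copy of the record with its measure reference expanded."""
    return {**record, "measure": {"name": record["measure"], **MEASURES[record["measure"]]}}


for record in records:
    expanded = resolve(record)
    print(expanded["granule"], expanded["measure"]["definition"], expanded["result"])
```

Because the definition lives in one place, correcting it there updates every record that references it.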


Enterprise Systems?

Reports/Documents

Measures

Methods


Data Quality Results

“My metadata currently includes descriptions of the quality of my data.”

These descriptions can be included in 19157 metadata as descriptive reports.

<Quality>Due to the lack of high resolution data available over the region for 1993-94, it has been hard to validate the product. However the maps of burnt areas correspond well with active fire maps for the region. Where large [>3km] scars are found, the detection is more reliable. In areas of small scars more problems are involved. It is hoped that the 1994-95 data set will cover the whole of the study area and be calibrated by high resolution data.

</Quality>

DQ_DescriptiveResult

+statement: CharacterString

<gco:CharacterString>Due to the lack of high resolution data available over the region for 1993-94, it has been hard to validate the product. However the maps of burnt areas correspond well with active fire maps for the region. Where large [>3km] scars are found, the detection is more reliable. In areas of small scars more problems are involved. It is hoped that the 1994-95 data set will cover the whole of the study area and be calibrated by high resolution data.

</gco:CharacterString>
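Moving the text of a legacy `<Quality>` element into a `gco:CharacterString` could be done along the following lines with Python's standard library. This is a sketch, not a complete ISO serialization: the namespace URI bound to `gco` is an assumption, the un-namespaced `statement` wrapper simply borrows the UML attribute name, and the `quality_to_statement` function is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the gco prefix is bound to this namespace URI.
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gco", GCO)


def quality_to_statement(legacy_xml: str) -> ET.Element:
    """Copy the text of a legacy <Quality> element into a statement element."""
    legacy = ET.fromstring(legacy_xml)
    # Collapse line breaks and repeated spaces left over from the source record.
    text = " ".join((legacy.text or "").split())
    statement = ET.Element("statement")  # element name taken from the UML attribute
    character_string = ET.SubElement(statement, f"{{{GCO}}}CharacterString")
    character_string.text = text
    return statement


legacy_xml = (
    "<Quality>Where large [>3km] scars are found, the detection is more\n"
    "reliable. In areas of small scars more problems are involved.\n"
    "</Quality>"
)
statement = quality_to_statement(legacy_xml)
print(ET.tostring(statement, encoding="unicode"))
```

The free-text description is carried over unchanged apart from whitespace cleanup, so existing quality statements need no rewriting to appear as descriptive results.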


Summary

“There are papers and web pages that describe the quality of my data.”

“My data quality information exists in databases or web services.”

“I use consistent Quality Measures across many products.”

“I use consistent types of Quality Measure across many products.”

“I need to clearly and consistently explain how I measure quality.”

“My metadata currently includes descriptions of the quality of my data.”

“Users increase our understanding of data quality. We need to keep them in the loop.”

“The quality of my data varies in time and space, and different parameters have different quality measures and results.”


Documentation Resources on the ESIP Wiki

Ted Habermann, Sean Gordon, John Kozimor
The HDF Group
[email protected]

This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C


Documentation Connections

Documentation concepts, recommendations and implementations in multiple dialects

http://wiki.esipfed.org/index.php/Category:Documentation_Connections


Concept Glossary


Dialects

Basic information about the dialect and who created it.


Recommendations

Many recommendations include multiple levels (mandatory, recommended, optional). For each recommendation level, a recommendation page gives:

1. Concept Names
2. Concept Definitions and
3. Concept Implementations (multiple dialects)


Resource Title Concept

Concept Description Crosswalk


ISO Explorer

http://wiki.esipfed.org/index.php/Category:ISO_Explorer


ISO Explorer Pages

Class Name

UML

Element Names, Definitions and Examples

Parents

Guidance

All Explorer Pages


Acknowledgements

This work was partially supported by contract number NNG15HZ39C from NASA.

Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author and do not necessarily reflect the views of NASA or The HDF Group.