NASA GES DISC Efforts for Preserving Nimbus Atmospheric and Meteorological Data Sets

James Johnson, Ed Esfandiari, Emily Zamkoff, Irina Gerasimov, Atheer Al-Jazrawi and Gary Alcott

AGU Fall 2014 Meeting
Session: IN43E-07

http://disc.gsfc.nasa.gov


Nimbus: 50 Year Data Record

[Timeline figure, 1960 to 2010: Nimbus 1 through 7, UARS, TOMS, Meteor-3, EP, SeaWiFS, TRMM, Terra, Aqua, Aura, Suomi-NPP]

• Why save the old data?
• Lifetime of tapes exceeded
• Useful for long-term climate studies

Image: Nimbus 1 prior to launch

Introduction

• At end of mission, Nimbus archival datasets were transferred from the investigator facilities

• NASA conducted a study in 2006 on preserving the old Nimbus and other 1960s and 1970s era data.

• John Bordynuik Inc. (JBI) selected as the contractor to recover and restore the old tape media to digital files (TAP).

• HOV Services Inc. selected as the contractor to scan 70-mm film strip and images to JPEG2000 format.

• GES DISC enlisted in 2009 to coordinate the recovery of the Nimbus-II HRIR data as the pilot data set, extract orbit files, and make data publicly available.

• Any hard copy documentation is electronically scanned.

Nimbus Data Recovery Process

Data Recovery

1) NASA Requests Tapes
2) NASA Retrieves Tapes
3) JBI Recovers Tapes to Digital Files
4) NASA Validates Digital Copies of Tapes and Evaluates Data Quality

Data Ingest / Archive / Services

5) NASA Ingests & Archives Files; Makes Data Public
6) NASA Follows Backup & Recovery Procedures
7) NASA Asks for Tapes to be Destroyed

GES DISC Nimbus Datasets

Sensor/Experiment    Nimbus 1 2 3 4 5 6 7

HRIR High Resolution Infrared Radiometer (Met) x x x

MRIR Medium Resolution Infrared Radiometer (Met) x x x

IRIS Infrared Interferometer Spectrometer (Met/AC) ? x

SIRS Satellite Infrared Spectrometer (Met) - -

THIR Temperature and Humidity Infrared Radiometer (Met) x x x -

BUV Backscatter Ultraviolet Spectrometer (AC) x

SCR Selective Chopper Radiometer (Met) - -

ESMR Electrically Scanning Microwave Radiometer (Met/Surf) - -

ITPR Infrared Temperature Profile Radiometer (Met) -

NEMS Nimbus-E Microwave Spectrometer (Met) -

SCMR Surface Composition Mapping Radiometer (Surf) -

HIRS High Resolution Infrared Radiation Sounder (Met) x

LRIR Limb Radiance Inversion Radiometer (AC) -

SCAMS Scanning Microwave Spectrometer (Met) x

LIMS Limb Infrared Monitor of the Stratosphere (AC) x

SAMS Stratospheric and Mesospheric Sounder (AC) -

SBUV Solar Backscatter Ultraviolet Spectrometer (AC) x

TOMS Total Ozone Mapping Spectrometer (AC) x

Various Issues Encountered

• Nimbus data do not have a common file format; data collections are unique and proprietary
o However, HRIR, MRIR and THIR (N4-N6) share a format
o Limits software reuse

• No metadata available from the old heritage missions
o Metadata must be obtained by reading the data files
o Header and geolocation info are sometimes corrupted

• Documentation exists in Nimbus User’s Guides and Catalogs, though information is sometimes incomplete

• Time and orbit number stamps on film strips and digital data often do not agree (orbit number in data files is the retrieval orbit, not the orbit of measurements)

The TAP File Format

TAP Format Header: 32-bit integer
• bit 0-30: length of record in bytes
• bit 31: 0 = good record, 1 = bad record
• zero record length indicates file mark
• two consecutive headers with zero value indicate end of tape (EOT)

[Diagram: TAP file layout. Each record is bracketed by its length value at Record Begin and at Record End. Sequence shown: 0, 84, Tape Label (EBCDIC), 84, 0 (file mark); File 1: Record 1 (128 bytes), Records 2 to X (11028 bytes each), 0 (file mark); ...; File N: Record 1 (128 bytes), Records 2 to Y (11028 bytes each); 0, 0 (EOT).]
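The header rules above can be sketched as a small reader. This is a minimal illustration, not the GES DISC or JBI software: the function name `read_tap` is hypothetical, the byte order of the 32-bit header is an assumption (little-endian by default, switchable), and the repeated length value at the end of each record is taken from the layout diagram.

```python
import io
import struct


def read_tap(stream, byteorder="<"):
    """Yield (kind, data, is_bad) tuples from a TAP image.

    kind is "record", "filemark" or "eot". byteorder is an assumption:
    "<" for little-endian or ">" for big-endian 32-bit headers.
    """
    pending_mark = False  # a zero header was seen and not yet reported
    while True:
        raw = stream.read(4)
        if len(raw) < 4:
            if pending_mark:
                yield ("filemark", b"", False)
            return  # ran out of bytes without seeing EOT
        (header,) = struct.unpack(byteorder + "I", raw)
        length = header & 0x7FFFFFFF  # bits 0-30: record length in bytes
        is_bad = bool(header >> 31)   # bit 31: 1 = bad record
        if length == 0:
            if pending_mark:
                yield ("eot", b"", False)  # two zero headers in a row
                return
            pending_mark = True
            continue
        if pending_mark:
            yield ("filemark", b"", False)
            pending_mark = False
        data = stream.read(length)
        trailer = stream.read(4)  # assumed: length repeated at record end
        if len(data) < length or len(trailer) < 4:
            raise ValueError("truncated record")
        yield ("record", data, is_bad)
```

A tape label record returned this way could then be decoded with Python's built-in `cp037` codec, assuming that EBCDIC code page matches the one used on the original tapes.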

Archiving the TAP file

• Each digital retrieved tape file (TAP) is archived to the GES DISC
o This is archived as {product}_TAPE collection
• GES DISC evaluates effort to extract individual data files and metadata
o This is archived as {product} collection (for public)
• Can always go back to {product}_TAPE to restore {product} again
• Original tapes are sent to JBI first, backups are sent separately
• Inventory log is kept on all items shipped and returned

Extracting the Data Files

• In the Nimbus era, each experiment team designed their own unique file format.
• No concept of granule level metadata, this has to be pulled from each data file.
• Data were written on IBM-360 machines: use 36-bit or 32-bit words, with IBM floats.
• Files have no names on tape, we provide names with mission, begin date, tape number and orbit
• Time and orbit on film strips and digital data do not agree (orbit in data files is the retrieval orbit)
• Backup tapes must be reviewed individually and compared with primary for any missing data files
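As an illustration of the "IBM floats" point, here is a minimal sketch of decoding a 32-bit IBM System/360 hexadecimal floating-point value (1 sign bit, 7-bit excess-64 base-16 exponent, 24-bit fraction). The function names are hypothetical, and whether a given Nimbus product stores its values in exactly this 32-bit form is an assumption to check against that product's User's Guide.

```python
def ibm32_to_float(word):
    """Convert a 32-bit IBM hexadecimal float, given as an int, to a Python float."""
    sign = -1.0 if (word >> 31) & 1 else 1.0
    exponent = (word >> 24) & 0x7F   # excess-64, power of 16
    fraction = word & 0x00FFFFFF     # 24-bit fraction, value in [0, 1)
    return sign * (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)


def ibm32_bytes_to_float(raw):
    """Same conversion for 4 raw bytes, assumed to be in big-endian order."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return ibm32_to_float(int.from_bytes(raw, "big"))
```

For example, the word `0x41100000` decodes to 1.0 and `0xC276A000` decodes to -118.625.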

HRIR (MRIR and THIR) Data Files

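[Diagram: HRIR data file layout within the TAP file: 0 (file mark), Documentation Record (128 bytes), then Data Records 1 to X (11028 bytes each), each bracketed by its length value. The accompanying image marks anchor points P1, P13 and P31.]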

Documentation Record
• Start Date and Time
• End Date and Time
• Orbit Retrieval Number
• Number of location anchor points (31, though typically 29 used)
• Swath size (in words)
• Number of swaths (6 per record)

Data Record

Header
• Date and time
• Pitch, yaw, roll errors
• Hardware status
• Nadir angles for anchor points (31)

Swath (repeats 6 times)
• Start time (seconds)
• Channel (for MRIR)
• Number of data points in swath
• Sub-satellite lat and lon
• Anchor point lat and lon (31)
• Instrument status flag
• Brightness Temperatures (~430)

• Data originally created on IBM-360 using 36-bit words
• Data are packed in either 6 byte or 4.5 byte words
• The original file structure is preserved
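To illustrate the 4.5-byte packing, the sketch below splits a byte string into 36-bit words, two words per 9 bytes. It assumes the bits are stored consecutively in big-endian order, which the slides do not state; the function name is hypothetical, and the 6-byte form is not covered because its bit layout is not described here.

```python
def unpack_36bit_words(data):
    """Split bytes into 36-bit unsigned integers (two words per 9 bytes).

    Assumes consecutive big-endian bit packing; this is an assumption,
    not a documented property of the Nimbus files.
    """
    if len(data) % 9 != 0:
        raise ValueError("length must be a multiple of 9 bytes")
    words = []
    for i in range(0, len(data), 9):
        pair = int.from_bytes(data[i:i + 9], "big")  # 72 bits = 2 words
        words.append(pair >> 36)                     # upper 36 bits
        words.append(pair & 0xFFFFFFFFF)             # lower 36 bits
    return words
```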

[Image: Nimbus-2 HRIR, October 6, 1966, 05:50:03 to 06:15:30 UTC, Orbit 1917, showing Hurricane Inez. Swaths are 428-432 pixels wide; the file holds 184 records (1104 swaths). Annotations mark the direction of travel and anchor points P14 to P17.]

The File-Level Metadata

1) Interpolate Data to the Lat/Lon Anchor Points

2) Assign to 10°x10° grid, and create spatial polygons; this is adequate for searching

3) Begin Date and End Date from header

4) Extract Orbit (actually retrieval orbit) from header

5) Add the JBI producer QA metadata
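The 10°x10° gridding step can be sketched as follows. This is only an illustration of the idea: the function names are hypothetical, longitudes are assumed to lie in the range -180 to 180, and the actual GES DISC polygon construction may differ.

```python
import math

CELL = 10  # grid cell size in degrees


def cell_for(lat, lon):
    """Return the (lat, lon) of the lower-left corner of the cell containing a point."""
    lat0 = min(math.floor(lat / CELL) * CELL, 90 - CELL)    # keep lat = 90 in the top row
    lon0 = min(math.floor(lon / CELL) * CELL, 180 - CELL)   # keep lon = 180 in the last column
    return (lat0, lon0)


def cell_polygon(lat0, lon0):
    """Closed ring of (lat, lon) corners for one grid cell."""
    return [(lat0, lon0), (lat0, lon0 + CELL), (lat0 + CELL, lon0 + CELL),
            (lat0 + CELL, lon0), (lat0, lon0)]


def covered_cells(points):
    """Sorted list of the distinct cells touched by a list of (lat, lon) points."""
    return sorted({cell_for(lat, lon) for lat, lon in points})
```

Feeding the anchor-point coordinates of one data file to `covered_cells` gives a coarse footprint that can be stored as search polygons.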

HRIR Film Strip Data Problems

• Problems noted on the film strips: orbit off by 1, unreadable, digits wrong

[Annotated film strip image: time ticks in 2-second increments; 2° lat/lon marks spaced every 10°; nadir lat/lon; Earth view; space view; counter; label; file name]

• About 1 week of images were tarred together and archived

The SCAMS File Format Problems

a) Ideal situation: each block holds 3 records (see the offset table)

b) Block does not always contain 3 records. Typically only 1 or 2 records in block

c) Block size is not a multiple of the record size. Blocks contain extra block marker and/or record markers

d) Misleading block sizes. Sometimes a block is given as 4200, but contains extra block marker and/or record markers and a short last record

Offset                Length  Object
0                     4       Begin Block1 Marker (value = 4200)
4                     4200    Block1
                              Record1 (size = 1400 bytes)
                              Record2 (size = 1400 bytes)
                              Record3 (size = 1400 bytes)
4204                  4       End Block1 Marker (value = 4200)
4208                  4       Begin Block2 Marker (value = 4200)
4212                  4200    Block2
                              Record1 (size = 1400 bytes)
                              Record2 (size = 1400 bytes)
                              Record3 (size = 1400 bytes)
8412                  4       End Block2 Marker (value = 4200)
...
(N-1) x 4208          4       Begin BlockN Marker (value = 4200)
(N-1) x 4208 + 4      4200    BlockN
                              Record1 (size = 1400 bytes)
                              Record2 (size = 1400 bytes)
                              Record3 (size = 1400 bytes)
(N-1) x 4208 + 4204   4       End BlockN Marker (value = 4200)

SCAMS records are written in reverse chronological order. However, misplaced data records are sometimes included.
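A reader for the block layout in the offset table might look like the sketch below. It is an illustration under assumptions, not the extraction code used at GES DISC: the function name is hypothetical, the byte order of the 4-byte markers is assumed (little-endian by default), and it handles only blocks whose begin and end markers agree. Short blocks of 1 or 2 records are accepted; cases (c) and (d) are only flagged, not repaired.

```python
import struct

RECORD_SIZE = 1400  # bytes per SCAMS record, per the offset table


def read_scams_blocks(data, byteorder="<"):
    """Return (records, problems) parsed from marker-delimited blocks.

    problems is a list of (offset, message) tuples for blocks that do
    not match the ideal layout.
    """
    records, problems = [], []
    offset = 0
    while offset + 4 <= len(data):
        (size,) = struct.unpack_from(byteorder + "I", data, offset)
        end = offset + 4 + size
        if end + 4 > len(data):
            problems.append((offset, "truncated block"))
            break
        (trailer,) = struct.unpack_from(byteorder + "I", data, end)
        if trailer != size:
            problems.append((offset, "marker mismatch"))
            break  # cannot resynchronise without product-specific rules
        if size % RECORD_SIZE:
            problems.append((offset, "block size not a multiple of record size"))
        block = data[offset + 4:end]
        whole = size - size % RECORD_SIZE  # bytes covered by complete records
        for i in range(0, whole, RECORD_SIZE):
            records.append(block[i:i + RECORD_SIZE])
        offset = end + 4
    return records, problems
```

Stopping at the first marker mismatch is a deliberately conservative choice: a real recovery tool would need rules derived from the data themselves to resynchronise.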

The Collection-Level Metadata

• GES DISC uses the GCMD DIF for storing collection level metadata (can be used by other discovery tools)

• Allows for common looking landing pages across the GES DISC site.

• DIF information is populated from Nimbus User’s Guides and information available from NSSDC web pages

Documentation

• GES DISC web site contains a directory of Nimbus data products and supporting documentation (User’s Guides, Data Catalogs, and READMEs). We also keep the inventory of all tapes and files ingested.

• Some hard copies must be scanned.

Conclusion

• This is tedious work! But important in preserving data, otherwise lost forever!!!

• No common format makes each product unique, limits software reuse

• File formats sometimes deviate from file descriptors

• Corrupted records and data make extraction hard

• See related document and data preservation poster Friday afternoon IN53C-3816

See http://disc.gsfc.nasa.gov for more information. Nimbus recovered data are available to download via anonymous FTP and through Reverb/ECHO.