DpropR for System i and LUW - Little known secrets - IDUG

Senthil Chandramohan (IBM) and Jenny Pang (Nationwide Insurance)

Session E06, Room 110 | May 16, 2012 09:45 AM - 10:45 AM | Platform: Cross Platform

Agenda Topics

• SQL Replication Basics
• System i
• Replication Setup and Tuning
• Administering the Replication Environment
• Recovery

SQL Replication Basics: Replication Architecture

Source server – Archive logging needs to be turned on

Source table – DATA CAPTURE CHANGES has to be set

Capture program – Reads the transactions from the logs and stores the changes in CD tables

Apply program – Fetches the changes from the CD tables and updates the target tables

Common Replication Scenarios

● Data distribution
● Data consolidation
● Update-anywhere
● Data transformation
● Workload distribution
● History and auditing
● In a HADR environment
● Non-DB2 sources
● Non-DB2 targets

[Diagrams: Data Distribution, Update Anywhere, Data Consolidation]

System i

System i objects
● Library, Collection, Schema (similar to a schema in LUW)
● Physical file, Table (similar to tables in LUW)
● Logical file, Index, View (similar to indexes and views in LUW)

Journalling
● The source table has to be journalled with *BOTH images (similar to DATA CAPTURE CHANGES in LUW)
● The delete journal exit program prevents deletion of journal receivers until Capture has read from them.

Commitment control
● Tables can be updated even with no commitment control

Native Operations
● CLRPFM
● CPYF *REPLACE
● RSTOBJ

PTFs
● DSPPTF 5761DP4

Logs
● Joblog (similar to logs in LUW)
● Spooled files

Commands
● STRDPRCAP
● STRDPRAPY
● WRKSBSJOB QZSNDPR
● GO CMDDSP
● DSPMSGD

Replication setup and tuning

Registration – Identifies the source table and the changed data table (CD table)

Subscription set – Identifies the source server, target server and how often to replicate

Subscription member – Identifies the source table, target table and the target structure

Capture – Captures data from the transaction logs. Uniquely identified by the capture server and schema name

Apply – Applies data to the target. Uniquely identified by the apply control server and apply qualifier name

Registration

● It is created in the Capture control server.
● It tells replication what the source table is and which change data table (CD table) will hold the changes
● There will be one registration per source table
● When you register a table, DATA CAPTURE CHANGES will be turned on for that table
● A subset of the source table columns could be registered if desired
● Before images can also be captured

Subscription set

● It is created in the Apply control server.
● It identifies the source server, the target server and how often the tables in it are processed.
● It identifies the apply qualifier the set belongs to and any SQL before or after statements to be run.
● The subscription set is a grouping of replicated tables and is treated as a single unit, so either everything replicates or nothing does. Even if only one table has an error, the entire subscription set is affected.
● Only one subscription set will be active in an apply qualifier at any point in time.

Subscription Member

● It is created in the Apply control server.
● It maps the source table/view to the target table
● One or more members could be in a subscription set
● It defines the target structure and any row filtering
● It defines the key columns and any column expressions
● Load options could be specified at a table level

Capture

● Capture reads the DB2 transaction logs and populates the CD table with the changed rows and the IBMSNAP_UOW table with additional details on the transaction
● Usually one Capture per database is enough, but you could run more than one if needed
● On System i there could be a DST impact based on how the system is configured to make the time change
● The Capture process is multi-threaded.
● The prune thread deletes rows from the CD and UOW tables when all the targets have applied the data.
● The Capture log will be in the capture_path or in the directory where Capture was started (if no capture_path was specified)
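The pruning rule above (rows leave the CD table only after every target has applied them) can be pictured as a minimum over the targets' synchpoints. The following is a minimal sketch of that idea only, not Capture's actual implementation: plain integers stand in for log sequence numbers, and the names `prune_point` and `prune_cd_rows` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def prune_point(target_synchpoints: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the highest sequence number that every target has applied.

    None for a target means it has not applied anything yet, so this
    sketch treats nothing as safe to prune.
    """
    if not target_synchpoints:
        return None
    values = list(target_synchpoints.values())
    if any(v is None for v in values):
        return None
    return min(v for v in values if v is not None)


def prune_cd_rows(cd_rows: List[Tuple[int, str]],
                  target_synchpoints: Dict[str, Optional[int]]) -> List[Tuple[int, str]]:
    """Keep only the CD rows that at least one target still needs."""
    limit = prune_point(target_synchpoints)
    if limit is None:
        return list(cd_rows)
    return [row for row in cd_rows if row[0] > limit]
```

With rows at sequence 10, 20 and 30 and two targets at 20 and 30, only the row at 30 survives, because the slower target has not applied it yet. A single lagging or abandoned target therefore holds back pruning for everyone.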

Apply

● Apply will connect to the source server, fetch the data to be replicated from the CD tables, store it in a spill file and then connect to the target and apply the data to the target tables
● Apply will process one subscription set at a time. A set could contain one or more tables. Other sets in the same apply qualifier will wait their turn based on how often they are configured to replicate
● Each apply qualifier will correspond to a separate process on the system and will use resources accordingly.

Setup tools

● Replication Center
Part of the DB2 General Administration tools. It is a GUI that allows you to set up and administer replication.

● Asnclp
Also part of DB2; allows replication setup to be scripted. This is very helpful when setting up replication on several systems or if you will have to recreate the environment multiple times.

Note: Although the tools can run from anywhere, it is important that the database alias names where you run them match the database alias names on the system running Apply.


Replication Center

Asnclp

SET RUN SCRIPT NOW STOP ON SQL ERROR ON
SET SERVER CAPTURE TO DB RCHASK60 AS400 HOSTNAME "9.2.9.299" ID MYUSERID PASSWORD "PASS4NOW"
SET SERVER CONTROL TO DB SQLTGT
SET SERVER TARGET TO DB SQLTGT
CREATE REGISTRATION (SENTHIL.TABLE01) DIFFERENTIAL REFRESH STAGE CDTABLE01
CREATE SUBSCRIPTION SET SETNAME SET1 APPLYQUAL APP1 ACTIVATE YES TIMING INTERVAL 3600 BLOCKING 0 COMMIT COUNT 0
CREATE MEMBER IN SETNAME SET1 APPLYQUAL APP1 SOURCE SENTHIL.TABLE01 TARGET NAME SENTHIL.TABLE01 DEFINITION IN USERSPACE1 TYPE USERCOPY COLS ALL REGISTERED

asnclp <crt.in >crt.out

The script above registers a table, creates a subscription set and adds a member to it.
Note: Each statement is on one line.

Capture-Apply Initial handshake

Capture reads the register table at startup and looks out for signals from Apply for the tables being replicated.

Apply updates the SYNCHPOINT column in the IBMSNAP_PRUNCNTL table to hex 20 zeroes for the tables in the subscription set.

Apply inserts CAPSTART signals for each table in the subscription set

Capture sees the signal in the db2 transaction log. Uses that LSN to update the Pruncntl SYNCHPOINT and starts capturing data for that source table from that point.

Apply starts the fullrefresh (It directly fetches data from the source and loads the target)

Apply updates IBMSNAP_SUBS_SET when the load is complete to set the SYNCHTIME to current timestamp and SYNCHPOINT to NULL

Information on success or failure is inserted to the IBMSNAP_APPLYTRAIL table

Subsequent cycles fetch data from the CD table.

In the first differential cycle, reworks may be seen where a row was captured by Capture and also picked up by the fullrefresh. This prevents data loss when changes arrive at the source while the table is being loaded.
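The rework behaviour in that first differential cycle can be sketched as follows. This is an illustration under stated assumptions, not Apply's code: it assumes a rework means that an insert for a row that already exists is applied as an update, and an update for a missing row is applied as an insert. The operation codes "I", "U" and "D" and the function name `apply_change` are hypothetical.

```python
from typing import Dict

Row = Dict[str, object]


def apply_change(target: Dict[int, Row], op: str, key: int, row: Row) -> bool:
    """Apply one captured change to the target; return True if it was reworked."""
    if op == "I":
        reworked = key in target      # row was already loaded by the fullrefresh
        target[key] = dict(row)       # plain insert, or insert reworked as update
        return reworked
    if op == "U":
        reworked = key not in target  # row is missing on the target
        target[key] = dict(row)       # plain update, or update reworked as insert
        return reworked
    if op == "D":
        target.pop(key, None)         # deleting a missing row is a no-op here
        return False
    raise ValueError(f"unknown operation {op!r}")
```

A row that arrives once through the load and once through the CD table ends up on the target exactly once, which is why the overlap is safe.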

Administering the replication environment

● Application knowledge
● asnmon
● Capture and Apply logs
● Control tables
● DB2 logs
● Home-grown monitoring

Application Knowledge

● Tables being replicated and their importance
● Relationship between the tables
● Transaction characteristics
● Target application requirements and how they will use the replicated data

Asnmon

● Allows you to monitor several important aspects of replication such as errors, warnings, status and latency
● Has the ability to send e-mail
● It is a separate program and needs to be maintained

Capture and Apply Logs

● Have information on the startup parameters passed
● Have information on errors and warnings that may have occurred
● Have some indication of the progress of the replication programs

Control tables

● What is being replicated
● From where and to where
● Type of replication
● Any transformation
● Current status

Is Replication up

● ps -ef |grep asn
● asnccmd capture_server=... capture_schema=... status
● asnacmd apply_qual=... control_server=... status
● db2 list applications show detail |grep asn

How many captures (IBMSNAP_CAPSCHEMAS)

● select cap_schema_name from asn.ibmsnap_capschemas

Each capture schema will have its own set of capture control tables

Tables at the Capture control server

● IBMSNAP_REGISTER
● IBMSNAP_PRUNCNTL
● IBMSNAP_SIGNAL
● IBMSNAP_PRUNE_SET

What is the memory_limit, startmode, prune_interval, retention_limit and lag_limit (IBMSNAP_CAPPARMS)

● select memory_limit, startmode, prune_interval, retention_limit, lag_limit from <schema>.ibmsnap_capparms with ur

The Capture startup parameters are in this table. If a value is passed as a startup parameter, it takes precedence over the value in the CAPPARMS table.

RETENTION_LIMIT is how long Capture retains changes if the targets have not yet applied them. It is seven days by default.
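The precedence rule can be sketched as a three-layer merge: defaults, then the CAPPARMS row, then whatever was passed at startup. This is only an illustration; the function name `effective_parms` is hypothetical, the two defaults are the ones quoted in this presentation, and expressing the seven-day retention limit in minutes is an assumption of the sketch.

```python
from typing import Dict, Mapping, Optional, Union

Value = Union[int, str]

# Defaults quoted in the presentation; the units are an assumption of this sketch.
DEFAULTS: Dict[str, Value] = {
    "memory_limit": 32,                # MB
    "retention_limit": 7 * 24 * 60,    # seven days, expressed in minutes
}


def effective_parms(capparms_row: Mapping[str, Optional[Value]],
                    startup_args: Mapping[str, Value]) -> Dict[str, Value]:
    """Merge defaults, the CAPPARMS row and startup parameters, in that order."""
    merged: Dict[str, Value] = dict(DEFAULTS)
    # Values stored in the table override the defaults (NULL columns are skipped).
    merged.update({k: v for k, v in capparms_row.items() if v is not None})
    # Values passed at startup override the table.
    merged.update(startup_args)
    return merged
```

For example, a table row with memory_limit 64 and startmode WARMSI combined with a startup argument startmode=COLD yields memory_limit 64, startmode COLD and the default retention limit.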

What tables are being replicated (IBMSNAP_REGISTER)

● select source_owner, source_table, source_view_qual, hex(cd_old_synchpoint) as cd_old_synchpoint, state, state_info, disable_refresh from <schema>.ibmsnap_register where global_record = 'N' with ur

A CD_OLD_SYNCHPOINT that is not null implies that Capture was at some point capturing changes for that source table.
A STATE of I means inactive: the Capture-Apply handshake has not yet been completed for that table. A means active.
A DISABLE_REFRESH of 1 will prevent Apply from doing a fullrefresh. This is used as a precaution after the initial handshake, to avoid any surprise fullrefreshes during production hours.
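The three column readings above can be folded into a small helper for a monitoring script. A minimal sketch, assuming the row has already been fetched and `cd_old_synchpoint` arrives as a hex string or None; the function name and the result keys are hypothetical.

```python
from typing import Dict, Optional


def describe_registration(cd_old_synchpoint: Optional[str],
                          state: str,
                          disable_refresh: int) -> Dict[str, bool]:
    """Translate one IBMSNAP_REGISTER row into plain yes/no answers."""
    return {
        # Not null: Capture captured changes for this table at some point.
        "captured_at_some_point": cd_old_synchpoint is not None,
        # 'A' means active, 'I' means the handshake has not completed yet.
        "active": state == "A",
        "handshake_pending": state == "I",
        # 1 stops Apply from doing a fullrefresh.
        "full_refresh_blocked": disable_refresh == 1,
    }
```

A row with a null CD_OLD_SYNCHPOINT, STATE I and DISABLE_REFRESH 0 is a registration that is still waiting for its first handshake.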

How current is capture (IBMSNAP_RESTART, IBMSNAP_REGISTER)

● select hex(min_inflightseq) as min_inflightseq, max_commit_time, curr_commit_time from <schema>.ibmsnap_restart with ur

Running db2flsn on the last 16 digits of MIN_INFLIGHTSEQ gives the name of the oldest DB2 transaction log that Capture will need at startup.
The lower of the timestamps MAX_COMMIT_TIME and CURR_COMMIT_TIME tells approximately how caught up Capture is.

● select hex(synchpoint) as synchpoint, synchtime from <schema>.ibmsnap_register where global_record ='Y' with ur

SYNCHTIME tells approximately how caught up Capture is.
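Both rules of thumb are easy to script once the row has been fetched. The sketch below is an illustration only: the function names are hypothetical, timestamps are assumed to arrive as Python datetime values, and the helper merely prepares the 16-digit string to hand to db2flsn rather than calling it.

```python
from datetime import datetime, timedelta


def capture_lag(max_commit_time: datetime,
                curr_commit_time: datetime,
                now: datetime) -> timedelta:
    """Approximate how far behind Capture is, using the lower of the two timestamps."""
    return now - min(max_commit_time, curr_commit_time)


def db2flsn_argument(min_inflightseq_hex: str) -> str:
    """Return the last 16 hex digits of MIN_INFLIGHTSEQ."""
    return min_inflightseq_hex[-16:]
```

With MAX_COMMIT_TIME at 09:40, CURR_COMMIT_TIME at 09:44 and a current time of 09:45, the estimate is five minutes, because the older timestamp is the conservative one.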

Capture activity in the last 10 minutes (IBMSNAP_CAPMON)

● select current_memory, cd_rows_inserted, trans_processed, trans_spilled, max_trans_size, synchtime from <schema>.ibmsnap_capmon where monitor_time > current timestamp - 10 minutes order by monitor_time desc with ur

A non-zero TRANS_SPILLED is usually not good. It means that Capture had to spill to disk because it could not fit the transaction in memory.
MAX_TRANS_SIZE provides some indication of the size of the transactions seen by Capture.
The Capture startup parameter MEMORY_LIMIT should be increased to prevent spilling. The default is 32 MB. Applications should also be encouraged to commit more often if possible.
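A home-grown monitor could turn the query result into warnings along these lines. This is a minimal sketch, assuming each CAPMON row has been fetched into a mapping keyed by lower-case column name; the function name `spill_warnings` and the message wording are hypothetical.

```python
from typing import Iterable, List, Mapping


def spill_warnings(capmon_rows: Iterable[Mapping[str, int]]) -> List[str]:
    """Return one message per monitor interval in which Capture spilled to disk."""
    warnings: List[str] = []
    for row in capmon_rows:
        if row["trans_spilled"] > 0:
            warnings.append(
                f"{row['trans_spilled']} transaction(s) spilled; "
                f"largest transaction seen: {row['max_trans_size']}; "
                "consider raising MEMORY_LIMIT or committing more often"
            )
    return warnings
```

An empty result means no interval in the window reported a spill.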

How caught up are the subscriptions (IBMSNAP_PRUNE_SET)

● select set_name, synchtime, hex(synchpoint) as synchpoint, current timestamp as currenttime from <schema>.ibmsnap_prune_set with ur

If subscriptions have not yet applied any data, then the SYNCHTIME will be null and SYNCHPOINT will be hex 20 zeroes.
If there are old subscriptions (SYNCHTIME is much older than the current timestamp) that are no longer needed, they should be cleaned up. If not, they could cause retention limit pruning.
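Spotting such old subscriptions can be automated. The sketch below assumes the PRUNE_SET rows have been fetched as (set_name, synchtime) pairs; the function name `stale_sets` is hypothetical, and the seven-day threshold is only a suggestion chosen to match the default retention limit mentioned earlier.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple


def stale_sets(prune_set_rows: Iterable[Tuple[str, Optional[datetime]]],
               now: datetime,
               max_age: timedelta = timedelta(days=7)) -> List[str]:
    """Return the names of sets whose SYNCHTIME is older than max_age."""
    stale: List[str] = []
    for set_name, synchtime in prune_set_rows:
        if synchtime is None:
            continue  # the set has not applied any data yet
        if now - synchtime > max_age:
            stale.append(set_name)
    return stale
```

Whether a flagged set is abandoned or merely replicating rarely still needs a human decision; the list is a starting point for cleanup, not a verdict.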

How many targets are there and have they completed a fullrefresh (IBMSNAP_PRUNCNTL)

● select target_server, apply_qual, set_name, source_table, target_table, hex(synchpoint) as synchpoint, synchtime, target_structure from <schema>.ibmsnap_pruncntl order by apply_qual, set_name with ur

A SYNCHPOINT that is non-zero means that the Capture-Apply handshake is complete. If it is hex 20 zeroes, it means the fullrefresh has completed but Capture has still not seen the CAPSTART signal. NULL means a fullrefresh is yet to be done.

SYNCHTIME usually corresponds to when Apply started the fullrefresh. A value of NULL means a fullrefresh is yet to be done.
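The three SYNCHPOINT cases map directly onto a small classifier. A minimal sketch, assuming the value comes from `hex(synchpoint)` as a string or None; the function name and the returned labels are hypothetical.

```python
from typing import Optional


def handshake_state(synchpoint_hex: Optional[str]) -> str:
    """Classify one IBMSNAP_PRUNCNTL row by its SYNCHPOINT value."""
    if synchpoint_hex is None:
        return "fullrefresh not yet done"
    if synchpoint_hex and set(synchpoint_hex) == {"0"}:
        # All hex digits are zero: Capture has not seen the CAPSTART signal.
        return "waiting for Capture to see CAPSTART"
    return "handshake complete"
```

Rows stuck in the waiting state for a long time usually deserve a look at whether Capture is running at all.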

Tables at the Apply control server

● IBMSNAP_SUBS_SET
● IBMSNAP_SUBS_MEMBR
● IBMSNAP_SUBS_COLS
● IBMSNAP_APPLYTRAIL

Are the subscriptions running ok (IBMSNAP_SUBS_SET)

● select lastrun, set_name, status, hex(synchpoint) as synchpoint, synchtime from asn.ibmsnap_subs_set with ur

A STATUS of 1 means the set is currently being processed. A STATUS of -1 means the cycle failed. A STATUS of 0 or 2 means the cycle was successful.

SYNCHTIME is a source server timestamp and could be used to tell how current the target is with respect to the source.
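The status codes lend themselves to a lookup table in a monitoring script. This sketch uses only the four values described above and files anything else under "unknown"; the function name `summarize` and the (set_name, status) row shape are assumptions.

```python
from typing import Dict, Iterable, List, Tuple

# Status values as described in the presentation.
STATUS_MEANING: Dict[int, str] = {
    1: "in progress",
    -1: "failed",
    0: "successful",
    2: "successful",
}


def summarize(rows: Iterable[Tuple[str, int]]) -> Dict[str, List[str]]:
    """Group subscription set names by the meaning of their STATUS value."""
    summary: Dict[str, List[str]] = {
        "in progress": [], "failed": [], "successful": [], "unknown": [],
    }
    for set_name, status in rows:
        summary[STATUS_MEANING.get(status, "unknown")].append(set_name)
    return summary
```

An alert on a non-empty "failed" list is usually the first check worth wiring up.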

Did the previous cycles complete ok (IBMSNAP_APPLYTRAIL)

● select lastrun, set_name, status, set_inserted, set_updated, set_deleted, set_reworked, full_refresh, sqlstate, sqlcode, sqlerrm, apperrm from asn.ibmsnap_applytrail where lastrun > current timestamp - 10 minutes order by lastrun desc with ur

A STATUS of -1 means the cycle failed.

What is apply doing (IBMSNAP_APPLYMON)

● select monitor_time, current_setname, current_tabname, state from asn.ibmsnap_applymon where monitor_time > current timestamp - 10 minutes order by monitor_time desc with ur

● This is a new table populated by a separate monitor thread, available in v9.7 FP3a and above
● It requires Apply to be started with the monitor_enabled=y parameter
● It provides insight at a member level into what Apply is currently doing (fetching, applying, working with control tables or sleeping)

Analyzer

● asnanalyze -db <SOURCEDB> <TARGETDB> -la detailed

The analyzer report provides a summary of all the control tables in an easy to read HTML format.
It also provides some basic recommendations.
It requires a password file to connect to remote databases. It could use the same encrypted password file as Apply.

Apply trace

● asnapply apply_qual=.. control_server=.. trcflow trcfile copyonce

A .TRC file is created in the apply path.
The trace provides the information in the control tables in the order Apply sees them.
The trace provides information on the number of rows fetched for each member.
The trace provides some additional information on errors.

DB2 support tools

● DB2 snapshots
● db2support.zip
● Deadlock event monitor
● db2look
● db2expln
● runstats
● reorgchk

Recovery

● Setup Errors
  ● These are usually errors such as using the wrong key column or an incorrect predicate.
  ● The Analyzer report and Apply trace are usually enough to spot such errors.
  ● Using supported tools like the Replication Center and asnclp helps avoid such errors.

● SQL Errors
  ● The Capture or Apply log usually has the SQL code/SQLSTATE.
  ● db2 ? SQL0911N or db2 ? 02000 provides information and recommendations to recover.
  ● For Apply, the Apply trace could provide additional information on errors.
  ● The errors may be transient, like connection errors and deadlocks, or may require intervention, for example if the target table's tablespace is full.

● Disasters
  ● If the database had to be restored from an old backup, or if the DB2 transaction log required by Capture was corrupted, then a cold start (Capture started with startmode=cold) will be recommended. This will initiate a fullrefresh and replication will begin as if it were a new setup.
  ● Other options may be possible based on the conditions after the disaster.

● ASN Errors
  ● The Capture or Apply log usually has additional information. The messages usually include SQL codes/SQLSTATEs. If they do, we can follow the SQL error recommendations.
  ● Other messages, such as "ASN1016I Refresh copying has been disabled", could relate to some replication setting. db2 ? ASN1016I could provide clues on what is happening.

Fullrefresh - target load options

● Normal fullrefresh – Apply selects all rows from the source, stores them in a spill file, then connects to the target, deletes all rows from the target, inserts the data from the source and commits. One way to speed this up is to empty the target beforehand, for example by loading an empty file into it before starting the Apply load process. Options like refresh_commit_cnt=n could also help if the target transaction logs are not sufficiently large.

● Asnload – ASNLOAD is an exit program that Apply calls when a fullrefresh needs to be done and the loadxit=y startup parameter is used. Asnload can use utilities like export and load, load from cursor, export and import, or even no load. The asnload.ini file and the LOADX_TYPE column value in IBMSNAP_SUBS_MEMBR can be used to influence what type of load is used. A LOADX_TYPE of 6 means no load.

● Manual refresh – This is used when the user wants to control all the steps done during the load. The user does the handshake by updating the control tables, synchronizes the source and target and then starts Apply. The benefit of this method is that if anything goes wrong, the user can redo just that part instead of redoing the fullrefresh for the entire set.

Initiating a Fullrefresh

In some cases it may be necessary to initiate a fullrefresh.
● Starting Capture with STARTMODE=COLD will initiate a refresh of all targets
● A fullrefresh could be initiated at a set level by setting the SYNCHPOINT and SYNCHTIME columns to NULL in the IBMSNAP_SUBS_SET or IBMSNAP_PRUNCNTL tables for that subscription set, then starting Capture and then starting Apply.
● A fullrefresh could be initiated at a member level by setting the MEMBER_STATE to N for that member.

Asntdiff

● Asntdiff is a differencing utility that can compare the source and target tables and provide a table with the rows that are different. The user can then synchronize the tables by using asntrep, or by exporting and loading the missing rows and deleting the extra rows.
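The idea behind a differencing pass can be shown with two in-memory tables keyed by the replication key. This is a conceptual sketch only and says nothing about how asntdiff itself is implemented; the function name `diff_tables` and the dictionary representation are assumptions.

```python
from typing import Dict, Hashable, Tuple

Row = Dict[str, object]
Table = Dict[Hashable, Row]


def diff_tables(source: Table, target: Table) -> Tuple[Table, Table, Dict[Hashable, Tuple[Row, Row]]]:
    """Compare two keyed tables.

    Returns (missing, extra, changed):
      missing - rows in the source but not in the target (to be loaded)
      extra   - rows in the target but not in the source (to be deleted)
      changed - rows present in both whose contents differ
    """
    missing = {k: source[k] for k in source.keys() - target.keys()}
    extra = {k: target[k] for k in target.keys() - source.keys()}
    changed = {
        k: (source[k], target[k])
        for k in source.keys() & target.keys()
        if source[k] != target[k]
    }
    return missing, extra, changed
```

The three result sets correspond to the repair actions named above: load the missing rows, delete the extra ones, and update the ones that differ.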

Recommendations

● Plan
● Develop and test
● Implement
● Monitor, maintain and fine tune
● Re-visit as the business needs evolve

References

● SQL Replication roadmap
● DB2 infocenter
● Replication Server support site and technotes
● Redbook
● ChannelDB2
● DB2 UDB support site (technotes and fixpacks)
● Connectivity cheat sheet
● System i infocenter


DpropR for System i and LUW - Little known secrets


Are you interested in SQL replication on LUW or System i? If so, in this presentation you will learn tips and tricks to set up, tune and administer such an environment effectively.

The presentation is based on real User experience and IBM Support experience.

New users will benefit by understanding the Replication architecture and how it can address their replication needs.

Existing users will benefit by understanding how the product functions, which in turn will help them better Monitor, Tune and Recover.

Agenda Topics

This presentation is based on IBM SQL Replication.

● In the 1990s, SQL Replication started off as Data Propagator, or DpropR. The R stood for Relational. There was another product called DpropNR for non-relational databases.
● In 1994 we had DPropR v1.
● In 2002 we had DB2 Replication v8, and its architecture resembles what we have now. Since then a lot has changed in terms of features, performance and supporting tools.
● The current version is v10 on LUW and z/OS.

SQL Replication Basics: Replication Architecture

● SQL Replication replicates data from one table to another.
● The source transactions are fully logged in the DB2 transaction log or in journals.
● Capture reads the logs and populates changed data tables (CD tables).
● Apply fetches the data from the CD tables and applies it to the target tables.
● It is asynchronous and there is no two-phase commit.

Common Replication Scenarios

● Data distribution: You can replicate data from a source to one or more targets. The targets could be on the same system or on remote systems, DB2 or non-DB2.
● Data consolidation: You can replicate data from many sources to a central repository.
● Update-anywhere replication: You can replicate in both directions between a master and a replica.
● Data transformation: Data can be transformed by using SQL expressions when replicating from the source to the target. Additional SQL statements or procedures could be called before or after replication.
● Workload distribution: Data could be replicated from source to target, and the applications that read the data could run against the target. With update-anywhere replication you could have applications update the tables on both the source and the replica.
● History and auditing: Consistent CD (CCD) target tables contain columns that provide details on the changes that occurred. They can provide details about the type of operation performed, when it was performed and by whom, in addition to the actual data that was changed.

System i

● The replication architecture is similar on all platforms, but understanding the similarities and differences in the operating system and the database itself will help.

• Some useful technotes: http://www.ibm.com/search/csass/search?q=repl400&filter=%2Bcollection:stgsysx,dblue,ic,pubs,devrel1%20%2Blanguage:en&prod=A696063M22780F88~0&sn=spe


Registration

Recommendations
● Use a convenient naming convention for CD tables so you can identify them without having to go to the control tables.
● It is usually recommended that the CD tables are placed in their own tablespace.
● CD tables need to be reorganized regularly since they will have a very high deleted record count.
● CD tables are defined as volatile, and runstats, if done, should be done only when they have a lot of rows in them.
● The before-image prefix tells Capture whether a column is a before image or an after image, so it should not be the starting letter of any column in the source table.
● Before images for key columns should be captured if the application will update the key column itself.
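The before-image prefix rule is easy to check before registering a table. A minimal sketch, assuming the prefix and the source column names are available as strings; the function name `conflicting_columns` and the sample prefix are hypothetical.

```python
from typing import Iterable, List


def conflicting_columns(prefix: str, source_columns: Iterable[str]) -> List[str]:
    """Return the source columns that start with the before-image prefix.

    Any column returned here could be mistaken for a before-image column,
    so a different prefix should be chosen.
    """
    wanted = prefix.upper()
    return [c for c in source_columns if c.upper().startswith(wanted)]
```

For a prefix of X and columns ID, XREF and NAME, the check flags XREF, which suggests picking another prefix for that table.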

Subscription set

Recommendations
● Deciding which tables go under a subscription set is important since it will influence performance, latency, and even data consistency.
● If tables are related via referential integrity, you have to place them in the same subscription set so that they are replicated together.
● If tables have a strong relationship, such as the information in one depending on the other, it is also best to group them together.
● Large or important tables could be placed in separate subscription sets, even under separate apply qualifiers. This way, in case they have to be loaded, they will not affect the other subscriptions. They could also be processed more frequently this way.
● Smaller tables could be grouped together so that you minimize the connections to the source when fetching data.
● Problematic tables could be kept separate too. For example, if you have LOBs in a high-volume table, keeping it separate will allow other subscription sets to get their turn more quickly.
● Too many subscription sets and apply qualifiers also means too much to manage, so you should strike a balance.
● How often to replicate, whether to have a blocking factor or commit more frequently, whether to apply the data in transactional order, and so on, should all be planned based on the need.
● Source applications should also commit often, since neither Capture nor Apply can break up a unit of work.
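The referential-integrity rule amounts to finding connected groups of tables. The following sketch uses a small union-find for that; it only suggests the minimum grouping and is not part of any replication tool. The function name `group_by_ri`, the sample table names and the (child, parent) pair format are assumptions, and every table named in a foreign key is assumed to appear in the table list.

```python
from typing import Dict, Iterable, List, Set, Tuple


def group_by_ri(tables: Iterable[str],
                foreign_keys: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group tables so that tables linked by foreign keys share a group."""
    parent: Dict[str, str] = {t: t for t in tables}

    def find(t: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for child, referenced in foreign_keys:
        parent[find(child)] = find(referenced)

    groups: Dict[str, Set[str]] = {}
    for t in list(parent):
        groups.setdefault(find(t), set()).add(t)
    # Sort for a stable, readable result.
    return sorted(groups.values(), key=lambda g: sorted(g))
```

Each resulting group is a candidate subscription set; the other recommendations (size, importance, LOBs) then decide whether unrelated groups are merged or kept apart.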

Subscription Member

Recommendations
● Key columns should match the source and target unique index. IS_KEY in the IBMSNAP_SUBS_COLS table will be Y for columns that are part of the replication key.
● If key columns will be updated, then the before image of the key columns should be used to update the target.

Capture

Recommendations
● Memory_limit should be high enough to hold your biggest transaction. Never let Capture spill.
● If Capture is behind and transaction logs have been archived, their retrieval could be slow. Retrieving them beforehand could speed up Capture. Usually Capture is caught up.
● Pruning performance will be better if the CD and UOW tables are reorganized and indexes are used. So make sure the tables are defined as volatile and RUNSTATS is done when the table has a lot of data.
● Remove unused registrations and subscriptions to improve Capture performance by avoiding capturing and retaining data that may no longer be needed.
● Capture has to run with the same code page as the database. Set the DB2CODEPAGE environment variable at the session level if needed to avoid unnecessary conversions.
● Use the TMPDIR environment variable to allow Capture to create its temporary files in a specific directory instead of /tmp.
● Default parameters are usually good, but it is worth looking at the different options available to see if you could benefit from them.

Click to edit Master title style

12

● Apply will connect to the source server, fetch the data to be replicated from the CD tables, store it in a spill file and then connect to the target and apply the data to the target tables

● Apply will process one subscription set at a time. A set could contain one or more tables. Other sets in the same apply qualifier will wait their turn based on how often they are configured to replicate

● Each Apply qualifier will correspond to a separate process on the system and will use resources accordingly.

Apply

Recommendations ● Apply is usually run on the target server which is usually the apply

control server also. This allows apply to connect locally and get information on what to replicate and it also allows it to block the fetches from the source database when it pulls data from the CD tables.

● Apply creates its log files and spill files in the apply_path, or in the directory from which it is started (if apply_path is not specified). This directory should therefore have sufficient space.

● Apply supports a user exit called asnload for doing a fullrefresh. The user exit provides a lot of options and should be used whenever possible when refreshing the target tables.

● Using several apply qualifiers means lower latency, since they can all replicate data in parallel, but it also means that more system resources are used and there are several processes to maintain. Use them as needed.

● Set the DB2CODEPAGE environment variable at the session level if needed, to avoid unnecessary or incorrect conversions.

● Use the TMPDIR environment variable to have apply create its temporary files in a specific directory.

● The default parameters are usually good, but it is worth reviewing the available options to see whether you could benefit from them.


Setup tools

● Replication Center: Part of the DB2 General Administration tools. It is a GUI that allows you to set up and administer replication.

● Asnclp: Also part of DB2; it allows replication setup to be scripted. This is very helpful when setting up replication on several systems or if you will have to recreate the environment multiple times.

Note: Although the tools can run from anywhere, it is important that the database alias names where you run them match the database alias names on the system running apply.


Replication Center


Asnclp

SET RUN SCRIPT NOW STOP ON SQL ERROR ON
SET SERVER CAPTURE TO DB RCHASK60 AS400 HOSTNAME "9.2.9.299" ID MYUSERID PASSWORD "PASS4NOW"
SET SERVER CONTROL TO DB SQLTGT
SET SERVER TARGET TO DB SQLTGT
CREATE REGISTRATION (SENTHIL.TABLE01) DIFFERENTIAL REFRESH STAGE CDTABLE01
CREATE SUBSCRIPTION SET SETNAME SET1 APPLYQUAL APP1 ACTIVATE YES TIMING INTERVAL 3600 BLOCKING 0 COMMIT COUNT 0
CREATE MEMBER IN SETNAME SET1 APPLYQUAL APP1 SOURCE SENTHIL.TABLE01 TARGET NAME SENTHIL.TABLE01 DEFINITION IN USERSPACE1 TYPE USERCOPY COLS ALL REGISTERED

asnclp <crt.in >crt.out

The script above registers a table, creates a subscription set and adds a member to it.
Note: Each statement is on one line.
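Because asnclp input is plain text, the same script can be generated for many tables or systems. The following is a minimal Python sketch, not part of the presentation: the function name, its parameters and the "CD" + table name convention for the CD table are assumptions for illustration, and the statement shapes are copied from the script above rather than from the full asnclp syntax.

```python
def build_asnclp_script(capture_server_stmt, control_db, target_db, tables,
                        set_name, apply_qual, interval=3600,
                        tablespace="USERSPACE1"):
    """Return asnclp statements, one per list entry (one statement per line).

    capture_server_stmt is passed in verbatim so that host names, user IDs
    and passwords are not handled here. tables is a list of (schema, table).
    """
    lines = [
        "SET RUN SCRIPT NOW STOP ON SQL ERROR ON",
        capture_server_stmt,
        f"SET SERVER CONTROL TO DB {control_db}",
        f"SET SERVER TARGET TO DB {target_db}",
    ]
    # Register every source table; the CD table name is a hypothetical
    # convention that mirrors CDTABLE01 for TABLE01 in the slide.
    for schema, table in tables:
        lines.append(
            f"CREATE REGISTRATION ({schema}.{table}) "
            f"DIFFERENTIAL REFRESH STAGE CD{table}"
        )
    # One subscription set holds all the members.
    lines.append(
        f"CREATE SUBSCRIPTION SET SETNAME {set_name} APPLYQUAL {apply_qual} "
        f"ACTIVATE YES TIMING INTERVAL {interval} BLOCKING 0 COMMIT COUNT 0"
    )
    for schema, table in tables:
        lines.append(
            f"CREATE MEMBER IN SETNAME {set_name} APPLYQUAL {apply_qual} "
            f"SOURCE {schema}.{table} TARGET NAME {schema}.{table} "
            f"DEFINITION IN {tablespace} TYPE USERCOPY COLS ALL REGISTERED"
        )
    return lines


if __name__ == "__main__":
    script = build_asnclp_script(
        "SET SERVER CAPTURE TO DB SRCDB",  # hypothetical source alias
        "SQLTGT", "SQLTGT",
        [("SENTHIL", "TABLE01"), ("SENTHIL", "TABLE02")],
        set_name="SET1", apply_qual="APP1",
    )
    print("\n".join(script))
```

The output can be saved to a file such as crt.in and run the same way as the script above.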


Capture-Apply Initial handshake

Capture reads the register table at startup and watches for signals from apply for the tables being replicated.

Apply updates the SYNCHPOINT column in the IBMSNAP_PRUNCNTL table to hex 20 zeroes for the tables in the subscription set.

Apply inserts CAPSTART signals for each table in the subscription set.

Capture sees the signal in the DB2 transaction log, uses that LSN to update the IBMSNAP_PRUNCNTL SYNCHPOINT, and starts capturing data for that source table from that point.

Apply starts the fullrefresh (it fetches data directly from the source and loads the target).

Apply updates IBMSNAP_SUBS_SET when the load is complete, setting SYNCHTIME to the current timestamp and SYNCHPOINT to NULL.

Information on success or failure is inserted into the IBMSNAP_APPLYTRAIL table.

Subsequent cycles fetch data from the CD table.

In the first differential cycle, reworks may be seen where a row was captured by capture and also picked up by the fullrefresh. This prevents data loss when changes arrive at the source while the table is being loaded.

The steps above can also be used to manually refresh subscription sets.

The following technote describes the manual refresh steps: http://www-01.ibm.com/support/docview.wss?uid=swg21221699

Step 6 confirms whether capture has acknowledged the CAPSTART signal. Only after that should the source table be unloaded, to avoid data loss (if the source table was not quiesced).

select hex(synchpoint) as synchpoint, synchtime, source_table, set_name from <schema>.ibmsnap_pruncntl where set_name='....' and apply_qual='....' with ur

It should show a non-zero value for synchpoint.


Administering the replication environment

Application knowledge, asnmon, Capture and apply logs, Control tables, Db2 logs, Home grown monitoring

● When administering, we usually have questions like: What is being replicated? Are the processes running? Are there errors?
● It may also be required to make changes to the existing setup.
● Apply Maintenance


Application Knowledge

● Tables being replicated and their importance
● Relationship between the tables
● Transaction characteristics
● Target application requirements and how they will use the replicated data


Asnmon

● Allows you to monitor several important aspects of replication like errors, warnings, status and latency
● Has the ability to send e-mail
● It is a separate program and needs to be maintained

Capture and Apply Logs

● Have information on the startup parameters passed
● Have information on errors and warnings that may have occurred
● Have some indication of the progress of the replication programs


Control tables

● What is being replicated
● From where and to where
● Type of replication
● Any transformation
● Current status


Is Replication up

● ps -ef |grep asn
● asnccmd capture_server=... capture_schema=... status
● asnacmd apply_qual=... control_server=... status
● db2 list applications show detail |grep asn

How many captures (IBMSNAP_CAPSCHEMAS)

● select cap_schema_name from asn.ibmsnap_capschemas

Each capture schema will have its own set of capture control tables.


Tables at the Capture control server

● IBMSNAP_REGISTER
● IBMSNAP_PRUNCNTL
● IBMSNAP_SIGNAL
● IBMSNAP_PRUNE_SET


What is the memory_limit, startmode, prune_interval, retention_limit and lag_limit (IBMSNAP_CAPPARMS)

● select memory_limit, startmode, prune_interval, retention_limit, lag_limit from <schema>.ibmsnap_capparms with ur

The capture startup parameters are in this table. If a value is passed as a startup parameter, it takes precedence over the value in the capparms table.

RETENTION_LIMIT is how long capture retains changes if the targets have not yet applied them. It is seven days by default.
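The precedence rule (a startup parameter overrides the value stored in IBMSNAP_CAPPARMS) can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the row is modeled as a plain dictionary, and the sample values are made up apart from the 32 MB memory limit and seven-day retention limit mentioned in this presentation.

```python
def effective_capture_parms(capparms_row, startup_parms):
    """Merge the CAPPARMS row with command-line overrides.

    capparms_row:  dict of column name -> value read from IBMSNAP_CAPPARMS
    startup_parms: dict of parameter name -> value passed at startup
    A startup value, when given, wins over the table value.
    """
    effective = {name.lower(): value for name, value in capparms_row.items()}
    for name, value in startup_parms.items():
        if value is not None:
            effective[name.lower()] = value
    return effective


if __name__ == "__main__":
    # Illustrative values: 10080 minutes is seven days.
    table_row = {"MEMORY_LIMIT": 32, "RETENTION_LIMIT": 10080}
    overrides = {"memory_limit": 64}
    print(effective_capture_parms(table_row, overrides))
```

When checking why capture behaves differently from what the table shows, the capture log is the place to confirm which startup parameters were actually passed.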


What tables are being replicated (IBMSNAP_REGISTER)

● select source_owner, source_table, source_view_qual, hex(cd_old_synchpoint) as cd_old_synchpoint, state, state_info, disable_refresh from <schema>.ibmsnap_register where global_record = 'N' with ur

A CD_OLD_SYNCHPOINT that is not null implies that capture was, at some point, capturing changes for that source table.
A STATE of I means inactive: the capture apply handshake has not yet been completed for that table. A means active.
A DISABLE_REFRESH of 1 prevents apply from doing a fullrefresh. This is used as a precaution after the initial handshake, to avoid any surprise fullrefreshes during production hours.


How current is capture (IBMSNAP_RESTART, IBMSNAP_REGISTER)

● select hex(min_inflightseq) as min_inflightseq, max_commit_time, curr_commit_time from <schema>.ibmsnap_restart with ur

Running db2flsn on the last 16 digits of MIN_INFLIGHTSEQ gives the name of the oldest Db2 transaction log that capture will need at startup. The lower of the timestamps MAX_COMMIT_TIME and CURR_COMMIT_TIME tells approximately how caught up capture is.

● select hex(synchpoint) as synchpoint, synchtime from <schema>.ibmsnap_register where global_record ='Y' with ur

SYNCHTIME tells approximately how caught up capture is.
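The two rules of thumb for IBMSNAP_RESTART can be expressed as small helpers. This is a hedged Python sketch: the function names are hypothetical, and it assumes the values have already been fetched with the query shown for this table (the hex string for MIN_INFLIGHTSEQ and the two timestamps as datetime objects).

```python
from datetime import datetime


def lsn_for_db2flsn(hex_min_inflightseq):
    """Return the last 16 hex digits of MIN_INFLIGHTSEQ, the value
    that is handed to db2flsn to find the oldest log capture needs."""
    return hex_min_inflightseq.strip()[-16:]


def capture_caught_up_to(max_commit_time, curr_commit_time):
    """The lower of the two timestamps approximates how far capture has read."""
    return min(max_commit_time, curr_commit_time)


def capture_lag(max_commit_time, curr_commit_time, now):
    """Approximate capture latency as a timedelta relative to 'now'."""
    return now - capture_caught_up_to(max_commit_time, curr_commit_time)


if __name__ == "__main__":
    print(lsn_for_db2flsn("00000000123456789ABC"))
    print(capture_lag(datetime(2012, 5, 16, 9, 40),
                      datetime(2012, 5, 16, 9, 44),
                      now=datetime(2012, 5, 16, 9, 45)))
```

The comparison is only meaningful when "now" is taken from the same server clock as the timestamps in the table.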


Capture activity in the last 10 minutes (IBMSNAP_CAPMON)

● select current_memory, cd_rows_inserted, trans_processed, trans_spilled, max_trans_size, synchtime from <schema>.ibmsnap_capmon where monitor_time > current timestamp - 10 minutes order by monitor_time desc with ur

A non-zero TRANS_SPILLED is usually not good. It means that capture had to spill to disk because it could not fit the transaction in memory.
MAX_TRANS_SIZE provides some indication of the size of the transactions seen by capture.
The capture startup parameter MEMORY_LIMIT should be increased to prevent spilling. The default is 32 MB. Applications should also be encouraged to commit more often if possible.


How caught up are the subscriptions (IBMSNAP_PRUNE_SET)

● select set_name, synchtime, hex(synchpoint) as synchpoint, current timestamp as currenttime from <schema>.ibmsnap_prune_set with ur

If subscriptions have not yet applied any data, the SYNCHTIME will be null and the SYNCHPOINT will be hex 20 zeroes.
If there are old subscriptions (SYNCHTIME is much older than the current timestamp) that are no longer needed, they should be cleaned up. If not, they could cause retention limit pruning.
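A home-grown monitor could flag the old subscriptions described for IBMSNAP_PRUNE_SET. The Python sketch below is an assumption-laden illustration: the function name is hypothetical, rows are modeled as (set_name, synchtime) tuples from the query for this table, and the retention limit is taken to be expressed in minutes with the seven-day default (10080 minutes).

```python
from datetime import datetime, timedelta


def sets_older_than_retention(rows, now, retention_limit_minutes=10080):
    """Return the names of sets whose SYNCHTIME is older than the
    retention limit; these are candidates for cleanup or investigation.

    Sets with a NULL (None) SYNCHTIME have not applied any data yet and
    are skipped here.
    """
    cutoff = now - timedelta(minutes=retention_limit_minutes)
    return [name for name, synchtime in rows
            if synchtime is not None and synchtime < cutoff]


if __name__ == "__main__":
    rows = [
        ("OLDSET", datetime(2012, 5, 1, 8, 0)),   # hypothetical set names
        ("SET1", datetime(2012, 5, 16, 9, 0)),
        ("NEWSET", None),
    ]
    print(sets_older_than_retention(rows, now=datetime(2012, 5, 16, 9, 45)))
```

Whether a flagged set should be removed or simply restarted is an application decision; the check only points at sets worth a look.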


How many targets are there and have they completed a fullrefresh (IBMSNAP_PRUNCNTL)

● select target_server, apply_qual, set_name, source_table, target_table, hex(synchpoint) as synchpoint, synchtime, target_structure from <schema>.ibmsnap_pruncntl order by apply_qual, set_name with ur

A SYNCHPOINT that is non-zero means that the capture apply handshake is complete. If it is hex 20 zeroes, the fullrefresh has completed but capture has still not seen the CAPSTART signal. NULL means a fullrefresh is yet to be done.

SYNCHTIME usually corresponds to when apply started the fullrefresh. A value of NULL means a fullrefresh is yet to be done.
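The three SYNCHPOINT cases for IBMSNAP_PRUNCNTL can be turned into a simple classifier for scripts that read the hex(synchpoint) column. This Python sketch is illustrative only: the function name and the returned labels are assumptions, and the interpretation of each case is taken directly from the slide text.

```python
def pruncntl_state(hex_synchpoint):
    """Classify a member from hex(synchpoint) in IBMSNAP_PRUNCNTL.

    None (SQL NULL)  -> a fullrefresh is yet to be done
    all zero digits  -> capture has not yet seen the CAPSTART signal
    anything else    -> the capture apply handshake is complete
    """
    if hex_synchpoint is None:
        return "fullrefresh pending"
    digits = hex_synchpoint.strip()
    if not digits:
        raise ValueError("empty synchpoint string")
    if set(digits) == {"0"}:
        return "waiting for capture to see CAPSTART"
    return "handshake complete"


if __name__ == "__main__":
    for value in (None, "0" * 20, "00000000123456789ABC"):
        print(value, "->", pruncntl_state(value))
```

The same check applies to step 6 of a manual refresh, where a non-zero value confirms that capture has acknowledged the signal.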


Tables at the Apply control server

● IBMSNAP_SUBS_SET
● IBMSNAP_SUBS_MEMBR
● IBMSNAP_SUBS_COLS
● IBMSNAP_APPLYTRAIL


Are the subscriptions running ok (IBMSNAP_SUBS_SET)

● select lastrun, set_name, status, hex(synchpoint) as synchpoint, synchtime from asn.ibmsnap_subs_set with ur

A STATUS of 1 means the set is currently being processed. A STATUS of -1 means the cycle failed. A STATUS of 0 or 2 means the cycle was successful.

SYNCHTIME is a source server timestamp and can be used to tell how current the target is with respect to the source.
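The STATUS values and the SYNCHTIME rule for IBMSNAP_SUBS_SET lend themselves to a small reporting helper. The following Python sketch is an illustration under assumptions: the function name and labels are hypothetical, and the caller is expected to supply the current time of the source server, since SYNCHTIME is a source server timestamp.

```python
from datetime import datetime, timedelta

# Meanings as described for the STATUS column in IBMSNAP_SUBS_SET.
STATUS_MEANING = {
    1: "in progress",
    -1: "failed",
    0: "successful",
    2: "successful",
}


def describe_set(status, synchtime, source_now):
    """Return (meaning, lag) for one subscription set row.

    lag is source_now - synchtime, or None when SYNCHTIME is NULL.
    """
    meaning = STATUS_MEANING.get(status, "unknown")
    lag = None if synchtime is None else source_now - synchtime
    return meaning, lag


if __name__ == "__main__":
    print(describe_set(0, datetime(2012, 5, 16, 9, 40),
                       source_now=datetime(2012, 5, 16, 9, 45)))
    print(describe_set(-1, None, source_now=datetime(2012, 5, 16, 9, 45)))
```

A failed set is best followed up in IBMSNAP_APPLYTRAIL, which carries the sqlcode, sqlstate and error text for the cycle.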


Did the previous cycles complete ok (IBMSNAP_APPLYTRAIL)

● select lastrun, set_name, status, set_inserted, set_updated, set_deleted, set_reworked, full_refresh, sqlstate, sqlcode, sqlerrm, apperrm from asn.ibmsnap_applytrail where lastrun > current timestamp - 10 minutes order by lastrun desc with ur

A STATUS of -1 means the cycle failed.


What is apply doing (IBMSNAP_APPLYMON)

● select monitor_time, current_setname, current_tabname, state from asn.ibmsnap_applymon where monitor_time > current timestamp - 10 minutes order by monitor_time desc with ur

● This is a new table populated by a separate monitor thread, available in v9.7 FP3a and above
● It requires apply to be started with the monitor_enabled=y parameter
● It provides insight at a member level on what apply is currently doing (fetching, applying, working with control tables or sleeping)


Analyzer

● asnanalyze -db <SOURCEDB> <TARGETDB> -la detailed

The analyzer report provides a summary of all the control tables in an easy to read html format.
It also provides some basic recommendations.
It requires a password file to connect to remote databases. It could use the same encrypted password file as apply.

Apply trace

● asnapply apply_qual=.. control_server=.. trcflow trcfile copyonce

A .TRC file is created in the apply path.
Trace provides the information in the control tables in the order apply sees them.
Trace will provide information on the number of rows fetched for each member.
Trace will provide some additional information on errors.


Db2 support tools

● DB2 snapshots
● db2support.zip
● Deadlock event monitor
● Db2look
● Db2expln
● Runstats
● reorgchk


Recovery

● Setup Errors
  ● These are usually errors like the wrong key column or an incorrect predicate being used.
  ● The Analyzer report and Apply trace are usually enough to spot such errors.
  ● Using supported tools like the replication center and asnclp helps avoid such errors.

● SQL Errors
  ● The capture or apply log usually has the sqlcode/sqlstate.
  ● db2 ? SQL0911N or db2 ? 02000 provides information and recommendations to recover.
  ● For apply, the apply trace could provide additional information on errors.
  ● The errors may be transient, like connection errors and deadlocks, or may require intervention, for example if the target table's tablespace is full.


● Disasters
  ● If the database had to be restored from an old backup, or if the DB2 transaction log required by capture was corrupted, then a coldstart (capture started with startmode=cold) will be recommended. This will initiate a fullrefresh and replication will begin as if it were a new setup.
  ● Other options may be possible based on the conditions after the disaster.

● ASN Errors
  ● The capture or apply log usually has additional information. The messages usually include sqlcodes/sqlstates. If they do, we could follow the SQL error recommendations.
  ● Other messages, like "ASN1016I Refresh copying has been disabled", could relate to some replication setting. db2 ? ASN1016I could provide clues on what is happening.


Fullrefresh - target load options

● Normal fullrefresh: Apply selects all rows from the source, stores them in a spill file, then connects to the target, deletes all rows from the target, inserts the data from the source and commits. One way to speed this process up is to delete the target data beforehand, for example by loading an empty file into the target before starting the apply load process. Options like refresh_commit_cnt=n could also help if the target transaction logs are not sufficiently large.

● Asnload: ASNLOAD is an exit program that apply calls when a fullrefresh needs to be done and the loadxit=y startup parameter is used. Asnload can use utilities like export and load, load from cursor, export and import, or even no load. The asnload.ini file and the LOADX_TYPE column value in the IBMSNAP_SUBS_MEMBR table can be used to influence what type of load is used. A LOADX_TYPE of 6 means no load.

● Manual refresh: This is used when the user wants to control all the steps done during the load. The user does the handshake by updating the control tables, synchs the source and target, and then starts apply. The benefit of this method is that if anything goes wrong, the user can redo just that part instead of redoing the fullrefresh for the entire set.


Initiating a Fullrefresh

In some cases it may be necessary to initiate a fullrefresh.

● Starting capture with STARTMODE=COLD will initiate a refresh of all targets.

● A fullrefresh could be initiated at a set level by setting the SYNCHPOINT and SYNCHTIME columns to NULL in the IBMSNAP_SUBS_SET or IBMSNAP_PRUNCNTL tables for that subscription set, then starting capture and then starting apply.

● A fullrefresh could be initiated at a member level by setting the MEMBER_STATE to N for that member.

Asntdiff

● Asntdiff is a differencing utility that can compare the source and target tables and provide a table with the rows that are different. The user could then synch the tables by using asntrep, or by exporting and loading the rows that are missing and deleting the rows that are extra.


Recommendations

● Plan: Know your application, database, data and replication needs
● Develop and Test: Pick the best replication architecture; test for functionality, performance, stress and changes (like adding columns to replicated tables); simulate problems and recovery
● Implement
● Monitor, maintain and fine tune (Replication and DB2)
● Re-visit as the business needs evolve
● Scripts and steps: Automation, monitoring, maintenance, recovery
● Document: Replication training, usage of support tools and scripts, details from planning, roles and authority, recovery actions and times from test, RCA and remedial actions, etc.


References

● Sql replication Roadmap: http://www.ibm.com/developerworks/data/roadmaps/sqlrepl-roadmap.html
● Db2 infocenter: http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp
● Replication server support site and Technotes: http://www-947.ibm.com/support/entry/portal/Overview/Software/Information_Management/InfoSphere_Replication_Server
● Redbook: http://www.redbooks.ibm.com/abstracts/sg246828.html?Open
● Channeldb2: http://www.channeldb2.com/profiles/blog/list?user=SideshowDave
● DB2 UDB support site (Technotes and fixpacks): http://www-947.ibm.com/support/entry/portal/Overview/Software/Information_Management/DB2_for_Linux,_UNIX_and_Windows
● Connectivity cheat sheet: http://www.ibm.com/developerworks/data/library/techarticle/0310chong/0310chong.html
● System i infocenter: http://publib.boulder.ibm.com/eserver/ibmi.html


THANK YOU

Senthil Chandramohan (IBM), Jenny Pang (Nationwide)

Session E06: DpropR for System i and LUW - Little known secrets